This directory contains all agent and network collaboration functionality for yaze and yaze-server.
Overview
The Agent Editor module provides AI-powered assistance and collaborative editing features for ROM hacking projects. It integrates conversational AI agents, local and network-based collaboration, and multimodal (vision) capabilities.
Architecture
Core Components
AgentEditor (<tt>agent_editor.h/cc</tt>)
The main manager class that coordinates all agent-related functionality:
- Manages the chat widget lifecycle
- Coordinates local and network collaboration modes
- Provides high-level API for session management
- Handles multimodal callbacks (screenshot capture, Gemini integration)
Key Features:
- Unified interface for all agent functionality
- Mode switching between local and network collaboration
- ROM context management for agent queries
- Integration with toast notifications and proposal drawer
AgentChatWidget (<tt>agent_chat_widget.h/cc</tt>)
ImGui-based chat interface for interacting with AI agents:
- Real-time conversation with AI assistant
- Message history with persistence
- Proposal preview and quick actions
- Collaboration panel with session controls
- Multimodal panel for screenshot capture and Gemini queries
Features:
- Split-panel layout (session details + chat history)
- Auto-scrolling chat with timestamps
- JSON response formatting
- Table data visualization
- Proposal metadata display
AgentChatHistoryCodec (<tt>agent_chat_history_codec.h/cc</tt>)
Serialization/deserialization for chat history:
- JSON-based persistence (when built with YAZE_WITH_JSON)
- Graceful degradation when JSON support unavailable
- Saves collaboration state, multimodal state, and full chat history
- Shared history support for collaborative sessions
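The graceful-degradation behavior can be sketched as a compile-time switch. This is a minimal illustration, not the codec's actual interface: `SaveResult`, `SaveHistory`, and the `"messages"` key are hypothetical names, and only the `YAZE_WITH_JSON` flag and the nlohmann/json dependency come from this README.

```cpp
#include <string>
#include <vector>

#ifdef YAZE_WITH_JSON
#include <fstream>
#include <nlohmann/json.hpp>
#endif

// Hypothetical result type; the real codec may report errors differently.
enum class SaveResult { kSaved, kIoError, kUnsupported };

// Hypothetical helper: writes messages as JSON when JSON support is compiled
// in, and reports "unsupported" instead of failing hard when it is not.
SaveResult SaveHistory(const std::string& path,
                       const std::vector<std::string>& messages) {
#ifdef YAZE_WITH_JSON
  nlohmann::json doc;
  doc["messages"] = messages;  // Assumed key name, for illustration only.
  std::ofstream out(path);
  if (!out) return SaveResult::kIoError;
  out << doc.dump(2);
  return out.good() ? SaveResult::kSaved : SaveResult::kIoError;
#else
  (void)path;
  (void)messages;
  return SaveResult::kUnsupported;
#endif
}
```

A caller can treat `kUnsupported` as a non-error and simply skip persistence, which keeps the chat usable in builds without JSON.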
Collaboration Coordinators
AgentCollaborationCoordinator (<tt>agent_collaboration_coordinator.h/cc</tt>)
Local filesystem-based collaboration:
- Creates session files in ~/.yaze/agent/sessions/
- Generates shareable session codes
- Participant tracking via file system
- Polling-based synchronization
Use Case: Same-machine collaboration or cloud-folder syncing (Dropbox, iCloud)
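As a rough sketch of how a shareable code could map to a session file, the helpers below generate a six-character code and build the matching path under `~/.yaze/agent/sessions/`. The function names, the code length, and the alphabet are assumptions for illustration; the coordinator's real scheme may differ.

```cpp
#include <cstddef>
#include <filesystem>
#include <random>
#include <string>

// Hypothetical helper: builds a random uppercase alphanumeric code such as
// "ABC123". The length and alphabet are assumptions.
std::string GenerateSessionCode(std::size_t length = 6) {
  static constexpr char kAlphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
  std::random_device seed;
  std::mt19937 rng(seed());
  // sizeof counts the trailing '\0', so the last valid index is sizeof - 2.
  std::uniform_int_distribution<std::size_t> pick(0, sizeof(kAlphabet) - 2);
  std::string code;
  code.reserve(length);
  for (std::size_t i = 0; i < length; ++i) {
    code.push_back(kAlphabet[pick(rng)]);
  }
  return code;
}

// Hypothetical helper: resolves <home>/.yaze/agent/sessions/<code>.session.
std::filesystem::path SessionFilePath(const std::filesystem::path& home,
                                      const std::string& code) {
  return home / ".yaze" / "agent" / "sessions" / (code + ".session");
}
```

Because participants only need the code and a shared folder, this layout also works when the sessions directory is synced by a cloud-folder service.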
NetworkCollaborationCoordinator (<tt>network_collaboration_coordinator.h/cc</tt>)
WebSocket-based network collaboration (requires YAZE_WITH_GRPC and YAZE_WITH_JSON):
- Real-time connection to collaboration server
- Message broadcasting to all session participants
- Live participant updates
- Session management (host/join/leave)
Advanced Features (v2.0):
- ROM Synchronization - Share ROM edits and diffs across all participants
- Multimodal Snapshot Sharing - Share screenshots and images with session members
- Proposal Management - Share and track AI-generated proposals with status updates
- AI Agent Integration - Route queries to AI agents for ROM analysis
Use Case: Remote collaboration across networks
Server: See the yaze-server repository for the Node.js WebSocket server v2.0
Usage
Initialization
agent_editor_.Initialize(&toast_manager_, &proposal_drawer_);
agent_editor_.SetRomContext(current_rom_);
AgentChatWidget::MultimodalCallbacks callbacks;
callbacks.capture_snapshot = [](std::filesystem::path* out) { };
callbacks.send_to_gemini = [](const std::filesystem::path& img, const std::string& prompt) { };
agent_editor_.GetChatWidget()->SetMultimodalCallbacks(callbacks);
Drawing
Session Management
// Host a local session
auto hosted = agent_editor_.HostSession("My ROM Hack",
                                        AgentEditor::CollaborationMode::kLocal);
// Join an existing local session by its code
auto joined = agent_editor_.JoinSession("ABC123",
                                        AgentEditor::CollaborationMode::kLocal);
// Leave the current session
agent_editor_.LeaveSession();
Network Mode (requires YAZE_WITH_GRPC and YAZE_WITH_JSON)
agent_editor_.ConnectToServer("ws://localhost:8765");
auto session = agent_editor_.HostSession("Network Session",
AgentEditor::CollaborationMode::kNetwork);
network_coordinator->SendRomSync(username, base64_diff_data, rom_hash);
network_coordinator->SendSnapshot(username, base64_image_data, "overworld_editor");
network_coordinator->SendProposal(username, proposal_json);
network_coordinator->SendAIQuery(username, "What enemies are in room 5?");
File Structure
agent/
├── README.md (this file)
├── agent_editor.h Main manager class
├── agent_editor.cc
├── agent_chat_widget.h ImGui chat interface
├── agent_chat_widget.cc
├── agent_chat_history_codec.h History serialization
├── agent_chat_history_codec.cc
├── agent_collaboration_coordinator.h Local file-based collaboration
├── agent_collaboration_coordinator.cc
├── network_collaboration_coordinator.h WebSocket collaboration
└── network_collaboration_coordinator.cc
Build Configuration
Required
- YAZE_WITH_JSON - Enables chat history persistence (via nlohmann/json)
Optional
- YAZE_WITH_GRPC - Enables all agent features, including network collaboration
- Without this flag, agent functionality is completely disabled
Data Files
Local Storage
- Chat History: ~/.yaze/agent/chat_history.json
- Shared Sessions: ~/.yaze/agent/sessions/<session_id>_history.json
- Session Metadata: ~/.yaze/agent/sessions/<code>.session
Session File Format
{
"session_name": "My ROM Hack",
"session_code": "ABC123",
"host": "username",
"participants": ["username", "friend1", "friend2"]
}
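To make the field layout concrete, here is a dependency-free sketch that serializes the four fields shown above into a compact JSON string. `SessionInfo`, `Quote`, and `SerializeSession` are hypothetical names; in the actual code base nlohmann/json is the listed serialization dependency, and this sketch only escapes quotes and backslashes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mirror of the session file fields documented above.
struct SessionInfo {
  std::string session_name;
  std::string session_code;
  std::string host;
  std::vector<std::string> participants;
};

// Wraps a string in quotes, escaping only '"' and '\\' (control characters
// are not handled in this sketch).
std::string Quote(const std::string& text) {
  std::string out = "\"";
  for (char c : text) {
    if (c == '"' || c == '\\') out.push_back('\\');
    out.push_back(c);
  }
  out.push_back('"');
  return out;
}

// Emits the session as a single-line JSON object.
std::string SerializeSession(const SessionInfo& session) {
  std::string json = "{\"session_name\":" + Quote(session.session_name);
  json += ",\"session_code\":" + Quote(session.session_code);
  json += ",\"host\":" + Quote(session.host);
  json += ",\"participants\":[";
  for (std::size_t i = 0; i < session.participants.size(); ++i) {
    if (i > 0) json += ",";
    json += Quote(session.participants[i]);
  }
  json += "]}";
  return json;
}
```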
Integration with EditorManager
The AgentEditor is instantiated as a member of EditorManager and integrated into the main UI:
class EditorManager {
#ifdef YAZE_WITH_GRPC
AgentEditor agent_editor_;
#endif
};
Menu integration:
[this]() { agent_editor_.ToggleChat(); },
[this]() { return agent_editor_.IsChatActive(); }}
Dependencies
Internal
- cli::agent::ConversationalAgentService - AI agent backend
- cli::GeminiAIService - Gemini API for multimodal queries
- yaze::test::* - Screenshot capture utilities
- ProposalDrawer - Displays agent proposals
- ToastManager - User notifications
External (when enabled)
- nlohmann/json - Chat history serialization
- httplib - WebSocket client implementation
- Abseil - Status handling, time utilities
Advanced Features (v2.0)
The network collaboration coordinator now supports:
ROM Synchronization
Share ROM edits in real-time:
- Send diff data (base64 encoded) to all participants
- Automatic ROM hash tracking
- Size limits enforced by server (5MB max)
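This README does not specify the diff format, so the following is only one plausible shape: a list of changed byte runs, each with an offset into the ROM. `RomPatch` and `ComputeRomDiff` are hypothetical names, and the sketch compares only the overlapping prefix of the two buffers.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical patch record: a run of replacement bytes at a ROM offset.
struct RomPatch {
  std::size_t offset;
  std::vector<std::uint8_t> bytes;
};

// Collects each contiguous run of differing bytes into one patch. Bytes
// beyond the shorter buffer are ignored in this sketch.
std::vector<RomPatch> ComputeRomDiff(const std::vector<std::uint8_t>& original,
                                     const std::vector<std::uint8_t>& modified) {
  std::vector<RomPatch> patches;
  const std::size_t size = std::min(original.size(), modified.size());
  std::size_t i = 0;
  while (i < size) {
    if (original[i] == modified[i]) {
      ++i;
      continue;
    }
    RomPatch patch{i, {}};
    while (i < size && original[i] != modified[i]) {
      patch.bytes.push_back(modified[i]);
      ++i;
    }
    patches.push_back(std::move(patch));
  }
  return patches;
}
```

A diff like this would still need to be serialized and base64 encoded before being passed to something like `SendRomSync`, and checked against the server's 5MB limit.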
Multimodal Snapshot Sharing
Share screenshots and images:
- Capture and share specific editor views
- Support for multiple snapshot types (overworld, dungeon, sprite, etc.)
- Base64 encoding so binary image data can travel inside JSON messages (adds roughly 33% size overhead)
- Size limits enforced by server (10MB max)
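For reference, a standard Base64 encoder (RFC 4648 alphabet with `=` padding) and a pre-flight size check might look like the sketch below. The helper names are hypothetical, and whether the server applies the 10MB limit to the raw or the encoded payload is an assumption here (the sketch checks the encoded size).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// 10MB server limit from this README; applying it to the encoded size is an
// assumption.
constexpr std::size_t kMaxSnapshotBytes = 10 * 1024 * 1024;

// Base64 emits 4 output characters per 3 input bytes, rounded up.
constexpr std::size_t Base64EncodedSize(std::size_t raw_size) {
  return 4 * ((raw_size + 2) / 3);
}

constexpr bool FitsSnapshotLimit(std::size_t raw_size) {
  return Base64EncodedSize(raw_size) <= kMaxSnapshotBytes;
}

std::string Base64Encode(const std::vector<std::uint8_t>& data) {
  static constexpr char kTable[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve(Base64EncodedSize(data.size()));
  std::size_t i = 0;
  // Full 3-byte groups become 4 characters each.
  for (; i + 2 < data.size(); i += 3) {
    const std::uint32_t n = (static_cast<std::uint32_t>(data[i]) << 16) |
                            (static_cast<std::uint32_t>(data[i + 1]) << 8) |
                            static_cast<std::uint32_t>(data[i + 2]);
    out.push_back(kTable[(n >> 18) & 63]);
    out.push_back(kTable[(n >> 12) & 63]);
    out.push_back(kTable[(n >> 6) & 63]);
    out.push_back(kTable[n & 63]);
  }
  // A trailing 1 or 2 bytes is padded with '=' to a full 4-character group.
  const std::size_t rest = data.size() - i;
  if (rest == 1) {
    const std::uint32_t n = static_cast<std::uint32_t>(data[i]) << 16;
    out.push_back(kTable[(n >> 18) & 63]);
    out.push_back(kTable[(n >> 12) & 63]);
    out += "==";
  } else if (rest == 2) {
    const std::uint32_t n = (static_cast<std::uint32_t>(data[i]) << 16) |
                            (static_cast<std::uint32_t>(data[i + 1]) << 8);
    out.push_back(kTable[(n >> 18) & 63]);
    out.push_back(kTable[(n >> 12) & 63]);
    out.push_back(kTable[(n >> 6) & 63]);
    out.push_back('=');
  }
  return out;
}
```

Checking `FitsSnapshotLimit` before encoding avoids building a large string that the server would reject anyway.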
Proposal Management
Collaborative proposal workflow:
- Share AI-generated proposals with all participants
- Track proposal status (pending, accepted, rejected)
- Real-time status updates broadcast to all users
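The three statuses above suggest a small state machine. The sketch below assumes that accepted and rejected are final states and that the status names double as wire strings; neither is confirmed by this README, and `ProposalStatus`, `ParseProposalStatus`, and `CanTransition` are hypothetical names.

```cpp
#include <optional>
#include <string_view>

enum class ProposalStatus { kPending, kAccepted, kRejected };

// Maps a status string to the enum; unknown strings yield std::nullopt.
std::optional<ProposalStatus> ParseProposalStatus(std::string_view text) {
  if (text == "pending") return ProposalStatus::kPending;
  if (text == "accepted") return ProposalStatus::kAccepted;
  if (text == "rejected") return ProposalStatus::kRejected;
  return std::nullopt;
}

// Assumption: only a pending proposal may change status, and only to a
// final state.
constexpr bool CanTransition(ProposalStatus from, ProposalStatus to) {
  return from == ProposalStatus::kPending && to != ProposalStatus::kPending;
}
```

Validating transitions on the client before sending an update keeps obviously stale updates (for example, rejecting an already accepted proposal) off the wire.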
AI Agent Integration
Server-side AI routing:
- Send queries through the collaboration server
- Shared AI responses visible to all participants
- Query history tracked in server database
Health Monitoring
Server health and metrics:
- /health endpoint for server status
- /metrics endpoint for usage statistics
- Graceful shutdown notifications
Future Enhancements
- Voice chat integration - Audio channels for remote collaboration
- Shared cursor/viewport - See what collaborators are editing
- Conflict resolution UI - Handle concurrent edits gracefully
- Session replay - Record and playback editing sessions
- Agent memory - Persistent context across sessions
- Real-time cursor tracking - See where collaborators are working
Server Protocol
The server uses JSON WebSocket messages. Key message types:
Client → Server
- host_session - Create new session (v2.0: supports rom_hash, ai_enabled)
- join_session - Join existing session
- leave_session - Leave current session
- chat_message - Send message (v2.0: supports message_type, metadata)
- rom_sync - New in v2.0 - Share ROM diff
- snapshot_share - New in v2.0 - Share screenshot/image
- proposal_share - New in v2.0 - Share proposal
- proposal_update - New in v2.0 - Update proposal status
- ai_query - New in v2.0 - Query AI agent
Server → Client
- session_hosted - Session created confirmation
- session_joined - Joined session confirmation
- chat_message - Broadcast message
- participant_joined / participant_left - Participant changes
- rom_sync - New in v2.0 - ROM diff broadcast
- snapshot_shared - New in v2.0 - Snapshot broadcast
- proposal_shared - New in v2.0 - Proposal broadcast
- proposal_updated - New in v2.0 - Proposal status update
- ai_response - New in v2.0 - AI agent response
- server_shutdown - New in v2.0 - Server shutting down
- error - Error message
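On the client side, incoming messages need to be dispatched by type. The sketch below maps the Server → Client type names listed above to an enum; the enum and function names are hypothetical, and how the type string is carried in the JSON envelope is not specified in this README, so extracting it is left out.

```cpp
#include <optional>
#include <string_view>

// One enumerator per Server -> Client message type listed above.
enum class ServerMessage {
  kSessionHosted, kSessionJoined, kChatMessage, kParticipantJoined,
  kParticipantLeft, kRomSync, kSnapshotShared, kProposalShared,
  kProposalUpdated, kAiResponse, kServerShutdown, kError
};

// Looks up a message type by name; unknown names yield std::nullopt so the
// caller can log and ignore them instead of crashing.
std::optional<ServerMessage> ParseServerMessageType(std::string_view type) {
  struct Entry {
    std::string_view name;
    ServerMessage value;
  };
  static constexpr Entry kTypes[] = {
      {"session_hosted", ServerMessage::kSessionHosted},
      {"session_joined", ServerMessage::kSessionJoined},
      {"chat_message", ServerMessage::kChatMessage},
      {"participant_joined", ServerMessage::kParticipantJoined},
      {"participant_left", ServerMessage::kParticipantLeft},
      {"rom_sync", ServerMessage::kRomSync},
      {"snapshot_shared", ServerMessage::kSnapshotShared},
      {"proposal_shared", ServerMessage::kProposalShared},
      {"proposal_updated", ServerMessage::kProposalUpdated},
      {"ai_response", ServerMessage::kAiResponse},
      {"server_shutdown", ServerMessage::kServerShutdown},
      {"error", ServerMessage::kError},
  };
  for (const Entry& entry : kTypes) {
    if (entry.name == type) return entry.value;
  }
  return std::nullopt;
}
```

Note that the client-side names `snapshot_share`, `proposal_share`, and `proposal_update` differ from their broadcast counterparts, so they are deliberately not recognized here.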