yaze 0.3.2
Link to the Past ROM Editor
 
yaze::cli::ai::AIGUIController Class Reference

High-level controller for AI-driven GUI automation with vision feedback.

#include <ai_gui_controller.h>

Public Member Functions

 AIGUIController (GeminiAIService *gemini_service, GuiAutomationClient *gui_client)
 Construct controller with required services.
 
 ~AIGUIController ()=default
 
absl::Status Initialize (const ControlLoopConfig &config)
 Initialize the controller with configuration.
 
absl::StatusOr< ControlResult > ExecuteCommand (const std::string &command)
 Execute a natural language command with AI vision guidance.
 
absl::StatusOr< ControlResult > ExecuteActions (const std::vector< ai::AIAction > &actions)
 Execute a sequence of pre-parsed actions.
 
absl::StatusOr< VisionAnalysisResult > ExecuteSingleAction (const AIAction &action, bool verify_with_vision=true)
 Execute a single action with optional vision verification.
 
absl::StatusOr< VisionAnalysisResult > AnalyzeCurrentGUIState (const std::string &context="")
 Analyze the current GUI state without executing actions.
 
const ControlLoopConfig & config () const
 Get the current configuration.
 
void SetConfig (const ControlLoopConfig &config)
 Update configuration.
 

Private Member Functions

absl::StatusOr< std::filesystem::path > CaptureCurrentState (const std::string &description)
 
absl::Status ExecuteGRPCAction (const AIAction &action)
 
absl::StatusOr< VisionAnalysisResult > VerifyActionSuccess (const AIAction &action, const std::filesystem::path &before_screenshot, const std::filesystem::path &after_screenshot)
 
absl::StatusOr< AIAction > RefineActionWithVision (const AIAction &original_action, const VisionAnalysisResult &analysis)
 
void EnsureScreenshotsDirectory ()
 
std::filesystem::path GenerateScreenshotPath (const std::string &suffix)
 

Private Attributes

GeminiAIService * gemini_service_
 
GuiAutomationClient * gui_client_
 
std::unique_ptr< VisionActionRefiner > vision_refiner_
 
gui::GuiActionGenerator action_generator_
 
ControlLoopConfig config_
 
std::filesystem::path screenshots_dir_
 

Detailed Description

High-level controller for AI-driven GUI automation with vision feedback.

This class implements the complete vision-guided control loop:

  1. Parse Command → Natural language → AIActions
  2. Take Screenshot → Capture current GUI state
  3. Analyze Vision → Gemini analyzes screenshot
  4. Execute Action → Send gRPC command to GUI
  5. Verify Success → Compare before/after screenshots
  6. Refine & Retry → Adjust parameters if action failed
  7. Repeat → Until goal achieved or max iterations reached
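
The loop above can be sketched roughly as follows. This is an illustration only, not the code in ai_gui_controller.cc: `Action`, `LoopHooks`, `LoopResult`, `RunControlLoop`, and the `max_iterations` parameter are hypothetical names, and the screenshot, gRPC, and vision steps are injected as callbacks so the sketch compiles without Gemini or a running GUI.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for AIAction; the real type lives in yaze::cli::ai.
struct Action {
  std::string name;
  int x = 0;
  int y = 0;
};

// Hypothetical hooks standing in for the screenshot, gRPC, and vision steps.
struct LoopHooks {
  std::function<std::string()> capture;        // steps 2-3: screenshot/analyze
  std::function<void(const Action&)> execute;  // step 4: send the action
  std::function<bool(const Action&, const std::string& before,
                     const std::string& after)>
      verify;                                  // step 5: compare before/after
  std::function<Action(const Action&)> refine; // step 6: adjust parameters
};

struct LoopResult {
  bool success = false;
  int iterations_performed = 0;
};

// Runs each action, retrying with refined parameters until it verifies or
// the (assumed) iteration budget is exhausted.
LoopResult RunControlLoop(std::vector<Action> actions, const LoopHooks& hooks,
                          int max_iterations) {
  LoopResult result;
  for (Action& action : actions) {
    bool done = false;
    while (!done && result.iterations_performed < max_iterations) {
      ++result.iterations_performed;
      const std::string before = hooks.capture();
      hooks.execute(action);
      const std::string after = hooks.capture();
      done = hooks.verify(action, before, after);
      if (!done) action = hooks.refine(action);
    }
    if (!done) return result;  // budget exhausted; success stays false
  }
  result.success = true;
  return result;
}
```

Passing the steps in as callbacks is a choice made for this sketch; the class itself holds its services as members (see Private Attributes).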

Example usage:

AIGUIController controller(gemini_service, gui_client);
controller.Initialize(config);
auto result = controller.ExecuteCommand(
    "Place tile 0x42 at overworld position (5, 7)");
// ExecuteCommand returns absl::StatusOr; check ok() before dereferencing.
if (result.ok() && result->success) {
  std::cout << "Success! Took " << result->iterations_performed
            << " iterations\n";
}

Definition at line 80 of file ai_gui_controller.h.

Constructor & Destructor Documentation

◆ AIGUIController()

yaze::cli::ai::AIGUIController::AIGUIController ( GeminiAIService * gemini_service,
GuiAutomationClient * gui_client 
)

Construct controller with required services.

Parameters
gemini_service  Gemini AI service for vision analysis
gui_client      gRPC client for GUI automation

Definition at line 21 of file ai_gui_controller.cc.

References gemini_service_, and gui_client_.

◆ ~AIGUIController()

yaze::cli::ai::AIGUIController::~AIGUIController ( )
default

Member Function Documentation

◆ Initialize()

absl::Status yaze::cli::ai::AIGUIController::Initialize ( const ControlLoopConfig & config)

Initialize the controller with configuration.

Definition at line 36 of file ai_gui_controller.cc.

References config(), config_, EnsureScreenshotsDirectory(), yaze::cli::ai::ControlLoopConfig::screenshots_dir, and screenshots_dir_.
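
Judging only from the reference list above, initialization stores the configuration and makes sure the screenshot directory exists. A minimal sketch under that assumption follows. `Config` and `ControllerSketch` are hypothetical stand-ins, and a plain `std::string` error message replaces `absl::Status` to keep the example dependency-free.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical reduced config; only screenshots_dir is named on this page.
struct Config {
  std::filesystem::path screenshots_dir;
};

class ControllerSketch {
 public:
  // Returns an empty string on success, or an error message on failure.
  // (The documented method returns absl::Status instead.)
  std::string Initialize(const Config& config) {
    config_ = config;
    screenshots_dir_ = config.screenshots_dir;
    return EnsureScreenshotsDirectory();
  }

  const std::filesystem::path& screenshots_dir() const {
    return screenshots_dir_;
  }

 private:
  // Creates the directory (and any missing parents) if it does not exist.
  // Unlike the documented void method, this sketch reports failures.
  std::string EnsureScreenshotsDirectory() {
    std::error_code ec;
    std::filesystem::create_directories(screenshots_dir_, ec);
    return ec ? ec.message() : std::string();
  }

  Config config_;
  std::filesystem::path screenshots_dir_;
};
```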

◆ ExecuteCommand()

absl::StatusOr< ControlResult > yaze::cli::ai::AIGUIController::ExecuteCommand ( const std::string &  command)

Execute a natural language command with AI vision guidance.

Parameters
command  Natural language command (e.g., "Place tile 0x42 at (5, 7)")
Returns
Result including success status and execution details

Definition at line 45 of file ai_gui_controller.cc.

References ExecuteActions(), and yaze::cli::ai::AIActionParser::ParseCommand().
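
The reference list suggests a two-stage shape: parse the command into actions, then hand them to ExecuteActions(). The sketch below illustrates that delegation with stand-ins. `Action`, `Result`, its `actions_executed` field, and the toy parser are hypothetical, and `std::optional` replaces `absl::StatusOr` so the example needs only the standard library.

```cpp
#include <optional>
#include <string>
#include <vector>

struct Action {  // stand-in for AIAction
  std::string type;
};

struct Result {  // stand-in for ControlResult
  bool success = false;
  int actions_executed = 0;  // hypothetical field, for illustration only
};

// Toy parser: treats any non-empty command as a single action. The real
// parsing is done by AIActionParser::ParseCommand().
std::optional<std::vector<Action>> ParseCommand(const std::string& command) {
  if (command.empty()) return std::nullopt;
  return std::vector<Action>{Action{command}};
}

// Placeholder executor that reports every action as executed.
Result ExecuteActions(const std::vector<Action>& actions) {
  Result r;
  r.actions_executed = static_cast<int>(actions.size());
  r.success = true;
  return r;
}

// Parse first; only a successfully parsed command reaches the executor.
std::optional<Result> ExecuteCommand(const std::string& command) {
  auto actions = ParseCommand(command);
  if (!actions) return std::nullopt;  // propagate the parse failure
  return ExecuteActions(*actions);
}
```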

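◆ ExecuteActions()

absl::StatusOr< ControlResult > yaze::cli::ai::AIGUIController::ExecuteActions ( const std::vector< ai::AIAction > & actions)

Execute a sequence of pre-parsed actions.

References CaptureCurrentState(), and ExecuteSingleAction().

Referenced by ExecuteCommand().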

◆ ExecuteSingleAction()

absl::StatusOr< VisionAnalysisResult > yaze::cli::ai::AIGUIController::ExecuteSingleAction ( const AIAction & action,
bool  verify_with_vision = true 
)

Execute a single action with optional vision verification.

Parameters
action              The action to execute
verify_with_vision  Whether to use vision to verify success
Returns
Success status and vision analysis

Definition at line 158 of file ai_gui_controller.cc.

References yaze::cli::ai::VisionAnalysisResult::action_successful, CaptureCurrentState(), config_, yaze::cli::ai::VisionAnalysisResult::description, yaze::cli::ai::VisionAnalysisResult::error_message, ExecuteGRPCAction(), yaze::cli::ai::ControlLoopConfig::screenshot_delay_ms, and VerifyActionSuccess().

Referenced by ExecuteActions().
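
Based on the members referenced above, a single action plausibly follows a capture, execute, wait, capture, verify sequence. The sketch below shows one way that could look. `Analysis`, `Steps`, and `ExecuteSingleActionSketch` are hypothetical, the three `Analysis` fields mirror the `VisionAnalysisResult` members named on this page, and the exact ordering is an assumption rather than a description of ai_gui_controller.cc.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Reduced stand-in for VisionAnalysisResult.
struct Analysis {
  bool action_successful = false;
  std::string description;
  std::string error_message;
};

// Hypothetical callbacks replacing the private helpers.
struct Steps {
  std::function<std::string(const std::string&)> capture;  // returns a path
  std::function<bool()> execute;  // true if the GUI accepted the action
  std::function<Analysis(const std::string& before, const std::string& after)>
      verify;
};

Analysis ExecuteSingleActionSketch(const Steps& steps, bool verify_with_vision,
                                   int screenshot_delay_ms) {
  Analysis out;
  std::string before;
  if (verify_with_vision) before = steps.capture("before");

  if (!steps.execute()) {
    out.error_message = "action execution failed";
    return out;
  }
  if (!verify_with_vision) {
    // Without vision, a successful send is the only evidence available.
    out.action_successful = true;
    out.description = "executed without vision verification";
    return out;
  }

  // Give the GUI time to redraw before taking the "after" screenshot.
  std::this_thread::sleep_for(std::chrono::milliseconds(screenshot_delay_ms));
  return steps.verify(before, steps.capture("after"));
}
```

Skipping both screenshots when `verify_with_vision` is false keeps the unverified path cheap, at the cost of trusting the transport-level result.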

◆ AnalyzeCurrentGUIState()

absl::StatusOr< VisionAnalysisResult > yaze::cli::ai::AIGUIController::AnalyzeCurrentGUIState ( const std::string &  context = "")

Analyze the current GUI state without executing actions.

Parameters
context  What to look for in the GUI
Returns
Vision analysis of current state

Definition at line 210 of file ai_gui_controller.cc.

References CaptureCurrentState(), and vision_refiner_.

◆ config()

const ControlLoopConfig & yaze::cli::ai::AIGUIController::config ( ) const
inline

Get the current configuration.

Definition at line 133 of file ai_gui_controller.h.

References config_.

Referenced by Initialize(), and SetConfig().

◆ SetConfig()

void yaze::cli::ai::AIGUIController::SetConfig ( const ControlLoopConfig & config)
inline

Update configuration.

Definition at line 138 of file ai_gui_controller.h.

References config(), and config_.

◆ CaptureCurrentState()

absl::StatusOr< std::filesystem::path > yaze::cli::ai::AIGUIController::CaptureCurrentState ( const std::string &  description)
private

Definition at line 223 of file ai_gui_controller.cc.

References GenerateScreenshotPath().

Referenced by AnalyzeCurrentGUIState(), ExecuteActions(), and ExecuteSingleAction().

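◆ ExecuteGRPCAction()

absl::Status yaze::cli::ai::AIGUIController::ExecuteGRPCAction ( const AIAction & action)
private

References action_generator_, and gui_client_.

Referenced by ExecuteSingleAction().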

◆ VerifyActionSuccess()

absl::StatusOr< VisionAnalysisResult > yaze::cli::ai::AIGUIController::VerifyActionSuccess ( const AIAction & action,
const std::filesystem::path &  before_screenshot,
const std::filesystem::path &  after_screenshot 
)
private

Definition at line 419 of file ai_gui_controller.cc.

References vision_refiner_.

Referenced by ExecuteSingleAction().

◆ RefineActionWithVision()

absl::StatusOr< AIAction > yaze::cli::ai::AIGUIController::RefineActionWithVision ( const AIAction & original_action,
const VisionAnalysisResult & analysis 
)
private

Definition at line 427 of file ai_gui_controller.cc.

References yaze::cli::ai::AIAction::parameters, and vision_refiner_.

◆ EnsureScreenshotsDirectory()

void yaze::cli::ai::AIGUIController::EnsureScreenshotsDirectory ( )
private

Definition at line 446 of file ai_gui_controller.cc.

References screenshots_dir_.

Referenced by Initialize().

◆ GenerateScreenshotPath()

std::filesystem::path yaze::cli::ai::AIGUIController::GenerateScreenshotPath ( const std::string &  suffix)
private

Definition at line 456 of file ai_gui_controller.cc.

References screenshots_dir_.

Referenced by CaptureCurrentState().
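
This page does not document the file naming scheme. As a purely illustrative sketch, a helper of this kind might combine the screenshot directory, a timestamp, and the suffix. The `screenshot_<ms>_<suffix>.png` pattern and the function name `GenerateScreenshotPathSketch` are assumptions, not the naming used by yaze.

```cpp
#include <chrono>
#include <filesystem>
#include <string>

// Builds <dir>/screenshot_<milliseconds since epoch>_<suffix>.png.
// The naming pattern is hypothetical and chosen only for illustration.
std::filesystem::path GenerateScreenshotPathSketch(
    const std::filesystem::path& screenshots_dir, const std::string& suffix) {
  const auto since_epoch = std::chrono::system_clock::now().time_since_epoch();
  const auto ms =
      std::chrono::duration_cast<std::chrono::milliseconds>(since_epoch)
          .count();
  return screenshots_dir /
         ("screenshot_" + std::to_string(ms) + "_" + suffix + ".png");
}
```

A timestamp keeps before/after captures ordered on disk, though two captures within the same millisecond would collide; a counter would avoid that.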

Member Data Documentation

◆ gemini_service_

GeminiAIService* yaze::cli::ai::AIGUIController::gemini_service_
private

Definition at line 141 of file ai_gui_controller.h.

Referenced by AIGUIController().

◆ gui_client_

GuiAutomationClient* yaze::cli::ai::AIGUIController::gui_client_
private

Definition at line 142 of file ai_gui_controller.h.

Referenced by AIGUIController(), and ExecuteGRPCAction().

◆ vision_refiner_

std::unique_ptr<VisionActionRefiner> yaze::cli::ai::AIGUIController::vision_refiner_
private

◆ action_generator_

gui::GuiActionGenerator yaze::cli::ai::AIGUIController::action_generator_
private

Definition at line 144 of file ai_gui_controller.h.

Referenced by ExecuteGRPCAction().

◆ config_

ControlLoopConfig yaze::cli::ai::AIGUIController::config_
private

◆ screenshots_dir_

std::filesystem::path yaze::cli::ai::AIGUIController::screenshots_dir_
private

The documentation for this class was generated from the following files: