High-level controller for AI-driven GUI automation with vision feedback.
#include <ai_gui_controller.h>
Public Member Functions
AIGUIController (GeminiAIService *gemini_service, GuiAutomationClient *gui_client)
    Construct controller with required services.
~AIGUIController ()=default
absl::Status | Initialize (const ControlLoopConfig &config)
    Initialize the controller with configuration.
absl::StatusOr< ControlResult > | ExecuteCommand (const std::string &command)
    Execute a natural language command with AI vision guidance.
absl::StatusOr< ControlResult > | ExecuteActions (const std::vector< ai::AIAction > &actions)
    Execute a sequence of pre-parsed actions.
absl::StatusOr< VisionAnalysisResult > | ExecuteSingleAction (const AIAction &action, bool verify_with_vision=true)
    Execute a single action with optional vision verification.
absl::StatusOr< VisionAnalysisResult > | AnalyzeCurrentGUIState (const std::string &context="")
    Analyze the current GUI state without executing actions.
const ControlLoopConfig & | config () const
    Get the current configuration.
void | SetConfig (const ControlLoopConfig &config)
    Update configuration.
Private Member Functions
absl::StatusOr< std::filesystem::path > | CaptureCurrentState (const std::string &description)
absl::Status | ExecuteGRPCAction (const AIAction &action)
absl::StatusOr< VisionAnalysisResult > | VerifyActionSuccess (const AIAction &action, const std::filesystem::path &before_screenshot, const std::filesystem::path &after_screenshot)
absl::StatusOr< AIAction > | RefineActionWithVision (const AIAction &original_action, const VisionAnalysisResult &analysis)
void | EnsureScreenshotsDirectory ()
std::filesystem::path | GenerateScreenshotPath (const std::string &suffix)
Private Attributes
GeminiAIService * | gemini_service_
GuiAutomationClient * | gui_client_
std::unique_ptr< VisionActionRefiner > | vision_refiner_
gui::GuiActionGenerator | action_generator_
ControlLoopConfig | config_
std::filesystem::path | screenshots_dir_
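Detailed Description
This class implements the complete vision-guided control loop: ExecuteCommand() parses a natural language command with AIActionParser::ParseCommand() and passes the resulting actions to ExecuteActions(), which runs them through ExecuteSingleAction(). That function captures screenshots with CaptureCurrentState(), dispatches the action via ExecuteGRPCAction(), and, when verify_with_vision is set, checks the outcome with VerifyActionSuccess().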
Example usage:
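A minimal sketch of the call sequence this class exposes: construct the controller with the two service pointers, call Initialize(), then ExecuteCommand(). Everything in `namespace sketch` is a stand-in written so the snippet compiles on its own. The real methods return absl::Status and absl::StatusOr< ControlResult >, the default config values shown are assumed, and the stub bodies say nothing about the actual implementation.

```cpp
#include <string>

namespace sketch {

// Stand-ins for the real service classes (assumption: empty placeholders).
struct GeminiAIService {};
struct GuiAutomationClient {};

// Field names follow the reference on this page; default values are assumed.
struct ControlLoopConfig {
  int max_iterations = 10;
  bool enable_vision_verification = true;
};

struct ControlResult {
  bool success = false;
  int actions_executed = 0;
  std::string error_message;
};

// Stub with the same public shape as AIGUIController. Return types are
// simplified to bool / ControlResult instead of absl::Status / absl::StatusOr.
class AIGUIController {
 public:
  AIGUIController(GeminiAIService* gemini_service,
                  GuiAutomationClient* gui_client)
      : gemini_service_(gemini_service), gui_client_(gui_client) {}

  bool Initialize(const ControlLoopConfig& config) {
    config_ = config;
    initialized_ = gemini_service_ != nullptr && gui_client_ != nullptr;
    return initialized_;
  }

  ControlResult ExecuteCommand(const std::string& command) {
    ControlResult result;
    if (!initialized_) {
      result.error_message = "Initialize() was not called";
      return result;
    }
    if (command.empty()) {
      result.error_message = "empty command";
      return result;
    }
    // The real controller would parse the command and drive the GUI here.
    result.success = true;
    result.actions_executed = 1;
    return result;
  }

 private:
  GeminiAIService* gemini_service_;
  GuiAutomationClient* gui_client_;
  ControlLoopConfig config_;
  bool initialized_ = false;
};

// Typical call order: construct, initialize, then execute a command.
inline ControlResult RunExample() {
  GeminiAIService gemini;
  GuiAutomationClient gui;
  AIGUIController controller(&gemini, &gui);

  ControlLoopConfig config;
  config.max_iterations = 5;
  if (!controller.Initialize(config)) {
    ControlResult failed;
    failed.error_message = "initialization failed";
    return failed;
  }
  return controller.ExecuteCommand("Place tile 0x42 at (5, 7)");
}

}  // namespace sketch
```

In real code the caller would check the returned status object (for example with `ok()`) rather than a bool.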
Definition at line 80 of file ai_gui_controller.h.
Constructor & Destructor Documentation
yaze::cli::ai::AIGUIController::AIGUIController (GeminiAIService *gemini_service, GuiAutomationClient *gui_client)
Construct controller with required services.
Parameters
gemini_service | Gemini AI service for vision analysis
gui_client | gRPC client for GUI automation
Definition at line 21 of file ai_gui_controller.cc.
References gemini_service_ and gui_client_.
yaze::cli::ai::AIGUIController::~AIGUIController () [default]
Member Function Documentation
absl::Status yaze::cli::ai::AIGUIController::Initialize (const ControlLoopConfig &config)
Initialize the controller with configuration.
Definition at line 36 of file ai_gui_controller.cc.
References config(), config_, EnsureScreenshotsDirectory(), yaze::cli::ai::ControlLoopConfig::screenshots_dir, and screenshots_dir_.
absl::StatusOr< ControlResult > yaze::cli::ai::AIGUIController::ExecuteCommand (const std::string &command)
Execute a natural language command with AI vision guidance.
Parameters
command | Natural language command (e.g., "Place tile 0x42 at (5, 7)")
Definition at line 45 of file ai_gui_controller.cc.
References ExecuteActions() and yaze::cli::ai::AIActionParser::ParseCommand().
absl::StatusOr< ControlResult > yaze::cli::ai::AIGUIController::ExecuteActions (const std::vector< ai::AIAction > &actions)
Execute a sequence of pre-parsed actions.
Parameters
actions | Vector of AI actions to execute
Definition at line 57 of file ai_gui_controller.cc.
References yaze::cli::ai::ControlResult::actions_executed, CaptureCurrentState(), config_, yaze::cli::ai::ControlLoopConfig::enable_iterative_refinement, yaze::cli::ai::ControlLoopConfig::enable_vision_verification, yaze::cli::ai::ControlResult::error_message, ExecuteSingleAction(), yaze::cli::ai::ControlResult::final_state_description, yaze::cli::ai::ControlResult::iterations_performed, yaze::cli::ai::ControlLoopConfig::max_iterations, yaze::cli::ai::ControlLoopConfig::max_retries_per_action, yaze::cli::ai::AIAction::parameters, yaze::cli::ai::ControlResult::screenshots_taken, yaze::cli::ai::ControlResult::success, yaze::cli::ai::ControlResult::vision_analyses, and vision_refiner_.
Referenced by ExecuteCommand().
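The configuration fields referenced here (max_retries_per_action among them) suggest a bounded per-action retry loop. The following is a minimal sketch of that pattern only. `Action`, `LoopResult`, and `ExecuteWithRetries` are hypothetical names, and the real ExecuteActions() may count attempts or interleave vision refinement differently.

```cpp
#include <functional>
#include <string>
#include <vector>

namespace sketch {

struct Action {  // hypothetical stand-in for AIAction
  std::string name;
};

struct LoopResult {  // borrows a few ControlResult field names
  bool success = false;
  int actions_executed = 0;
  int iterations_performed = 0;
  std::string error_message;
};

// Runs each action in order. A failed action is retried up to
// max_retries_per_action additional times; if it still fails, the loop
// stops and reports the failing action.
inline LoopResult ExecuteWithRetries(
    const std::vector<Action>& actions, int max_retries_per_action,
    const std::function<bool(const Action&)>& execute_once) {
  LoopResult result;
  for (const Action& action : actions) {
    bool done = false;
    for (int attempt = 0; attempt <= max_retries_per_action && !done;
         ++attempt) {
      ++result.iterations_performed;
      done = execute_once(action);
    }
    if (!done) {
      result.error_message = "action failed after retries: " + action.name;
      return result;  // success stays false
    }
    ++result.actions_executed;
  }
  result.success = true;
  return result;
}

}  // namespace sketch
```

Passing the executor as a callback keeps the loop testable without a running GUI; in the real class that role would presumably be played by ExecuteSingleAction().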
absl::StatusOr< VisionAnalysisResult > yaze::cli::ai::AIGUIController::ExecuteSingleAction (const AIAction &action, bool verify_with_vision = true)
Execute a single action with optional vision verification.
Parameters
action | The action to execute
verify_with_vision | Whether to use vision to verify success
Definition at line 158 of file ai_gui_controller.cc.
References yaze::cli::ai::VisionAnalysisResult::action_successful, CaptureCurrentState(), config_, yaze::cli::ai::VisionAnalysisResult::description, yaze::cli::ai::VisionAnalysisResult::error_message, ExecuteGRPCAction(), yaze::cli::ai::ControlLoopConfig::screenshot_delay_ms, and VerifyActionSuccess().
Referenced by ExecuteActions().
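Going by the references listed for this function (CaptureCurrentState(), ExecuteGRPCAction(), screenshot_delay_ms, VerifyActionSuccess()), one plausible shape is: screenshot, act, wait, screenshot again, compare. The sketch below illustrates that ordering under stated assumptions. `Hooks`, `Analysis`, and `RunSingleAction` are hypothetical, and the actual order of steps in ai_gui_controller.cc is not shown on this page.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <thread>

namespace sketch {

struct Analysis {  // borrows a few VisionAnalysisResult field names
  bool action_successful = false;
  std::string description;
  std::string error_message;
};

// Hypothetical callbacks standing in for the controller's private helpers.
struct Hooks {
  std::function<std::string(const std::string&)> capture;  // returns a path
  std::function<bool()> execute;                           // performs the action
  std::function<bool(const std::string&, const std::string&)> verify;
};

inline Analysis RunSingleAction(const Hooks& hooks, bool verify_with_vision,
                                int screenshot_delay_ms) {
  Analysis out;
  std::string before;
  if (verify_with_vision) before = hooks.capture("before");

  if (!hooks.execute()) {
    out.error_message = "action execution failed";
    return out;
  }
  if (!verify_with_vision) {
    out.action_successful = true;
    out.description = "executed without vision verification";
    return out;
  }

  // Give the GUI time to repaint before taking the second screenshot.
  std::this_thread::sleep_for(std::chrono::milliseconds(screenshot_delay_ms));
  const std::string after = hooks.capture("after");

  out.action_successful = hooks.verify(before, after);
  if (out.action_successful) {
    out.description = "verified from before/after screenshots";
  } else {
    out.error_message = "vision verification failed";
  }
  return out;
}

}  // namespace sketch
```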
absl::StatusOr< VisionAnalysisResult > yaze::cli::ai::AIGUIController::AnalyzeCurrentGUIState (const std::string &context = "")
Analyze the current GUI state without executing actions.
Parameters
context | What to look for in the GUI
Definition at line 210 of file ai_gui_controller.cc.
References CaptureCurrentState() and vision_refiner_.
const ControlLoopConfig & yaze::cli::ai::AIGUIController::config () const [inline]
Get the current configuration.
Definition at line 133 of file ai_gui_controller.h.
References config_.
Referenced by Initialize() and SetConfig().
void yaze::cli::ai::AIGUIController::SetConfig (const ControlLoopConfig &config) [inline]
Update configuration.
Definition at line 138 of file ai_gui_controller.h.
References config() and config_.
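absl::StatusOr< std::filesystem::path > yaze::cli::ai::AIGUIController::CaptureCurrentState (const std::string &description) [private]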
Definition at line 223 of file ai_gui_controller.cc.
References GenerateScreenshotPath().
Referenced by AnalyzeCurrentGUIState(), ExecuteActions(), and ExecuteSingleAction().
absl::Status yaze::cli::ai::AIGUIController::ExecuteGRPCAction (const AIAction &action) [private]
Definition at line 240 of file ai_gui_controller.cc.
References action_generator_, yaze::cli::GuiAutomationClient::Assert(), yaze::cli::GuiAutomationClient::Click(), yaze::cli::gui::GuiActionGenerator::GenerateTestScript(), gui_client_, yaze::cli::ai::kClickButton, yaze::cli::kDouble, yaze::cli::kLeft, yaze::cli::kMiddle, yaze::cli::ai::kPlaceTile, yaze::cli::kRight, yaze::cli::ai::kSelectTile, yaze::cli::ai::kVerifyTile, yaze::cli::ai::kWait, yaze::cli::ai::AIAction::parameters, yaze::cli::ai::AIAction::type, yaze::cli::GuiAutomationClient::Type(), and yaze::cli::GuiAutomationClient::Wait().
Referenced by ExecuteSingleAction().
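absl::StatusOr< VisionAnalysisResult > yaze::cli::ai::AIGUIController::VerifyActionSuccess (const AIAction &action, const std::filesystem::path &before_screenshot, const std::filesystem::path &after_screenshot) [private]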
Definition at line 419 of file ai_gui_controller.cc.
References vision_refiner_.
Referenced by ExecuteSingleAction().
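absl::StatusOr< AIAction > yaze::cli::ai::AIGUIController::RefineActionWithVision (const AIAction &original_action, const VisionAnalysisResult &analysis) [private]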
Definition at line 427 of file ai_gui_controller.cc.
References yaze::cli::ai::AIAction::parameters and vision_refiner_.
void yaze::cli::ai::AIGUIController::EnsureScreenshotsDirectory () [private]
Definition at line 446 of file ai_gui_controller.cc.
References screenshots_dir_.
Referenced by Initialize().
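A helper with this name and a reference to screenshots_dir_ most likely makes sure the directory exists before any capture. A minimal sketch of that idea with std::filesystem follows. `EnsureDirectory` is a hypothetical free function, and the real method's error handling is not documented on this page.

```cpp
#include <filesystem>
#include <system_error>

namespace sketch {

// Creates dir and any missing parents. Returns true if dir is a directory
// afterwards, false if creation failed or the path is not a directory.
inline bool EnsureDirectory(const std::filesystem::path& dir) {
  std::error_code ec;
  // create_directories reports no error when the directory already exists.
  std::filesystem::create_directories(dir, ec);
  if (ec) return false;
  return std::filesystem::is_directory(dir, ec) && !ec;
}

}  // namespace sketch
```

Using the `std::error_code` overloads avoids exceptions, which fits a codebase that reports failures through status objects.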
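std::filesystem::path yaze::cli::ai::AIGUIController::GenerateScreenshotPath (const std::string &suffix) [private]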
Definition at line 456 of file ai_gui_controller.cc.
References screenshots_dir_.
Referenced by CaptureCurrentState().
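The signature takes a suffix and returns a path, and the function references screenshots_dir_, so it presumably combines the two into a file name. The sketch below shows one way to do that. The `screenshot_<millis>_<suffix>.png` naming scheme and the `MakeScreenshotPath` name are assumptions for illustration, not the project's actual format.

```cpp
#include <chrono>
#include <filesystem>
#include <string>

namespace sketch {

// Builds "<dir>/screenshot_<millis>_<suffix>.png", where <millis> is the
// current system_clock time in milliseconds since the epoch.
// The naming scheme is assumed, not taken from the real implementation.
inline std::filesystem::path MakeScreenshotPath(
    const std::filesystem::path& dir, const std::string& suffix) {
  const auto since_epoch =
      std::chrono::system_clock::now().time_since_epoch();
  const auto millis =
      std::chrono::duration_cast<std::chrono::milliseconds>(since_epoch)
          .count();
  return dir /
         ("screenshot_" + std::to_string(millis) + "_" + suffix + ".png");
}

}  // namespace sketch
```

A timestamp in the name keeps successive captures from overwriting each other, though two calls within the same millisecond would still collide.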
Member Data Documentation
GeminiAIService * yaze::cli::ai::AIGUIController::gemini_service_ [private]
Definition at line 141 of file ai_gui_controller.h.
Referenced by AIGUIController().
GuiAutomationClient * yaze::cli::ai::AIGUIController::gui_client_ [private]
Definition at line 142 of file ai_gui_controller.h.
Referenced by AIGUIController() and ExecuteGRPCAction().
std::unique_ptr< VisionActionRefiner > yaze::cli::ai::AIGUIController::vision_refiner_ [private]
Definition at line 143 of file ai_gui_controller.h.
Referenced by AnalyzeCurrentGUIState(), ExecuteActions(), RefineActionWithVision(), and VerifyActionSuccess().
gui::GuiActionGenerator yaze::cli::ai::AIGUIController::action_generator_ [private]
Definition at line 144 of file ai_gui_controller.h.
Referenced by ExecuteGRPCAction().
ControlLoopConfig yaze::cli::ai::AIGUIController::config_ [private]
Definition at line 145 of file ai_gui_controller.h.
Referenced by config(), ExecuteActions(), ExecuteSingleAction(), Initialize(), and SetConfig().
std::filesystem::path yaze::cli::ai::AIGUIController::screenshots_dir_ [private]
Definition at line 146 of file ai_gui_controller.h.
Referenced by EnsureScreenshotsDirectory(), GenerateScreenshotPath(), and Initialize().