Algolia-RoboCluster is a prototype robot control system that uses large language models (Google Gemini) to plan and execute complex missions autonomously. A key feature of the project is its dynamic data architecture: the AI not only follows instructions but also independently designs, creates, and uses the data structures it needs in Algolia for each mission.
- Three-tiered AI Architecture: The system uses a hierarchical decision-making model inspired by a command structure:
  - Strategist: A high-level AI that decomposes the overall mission goal into major tactical tasks.
  - Tacticians: Two specialized AI agents (one for robot control and one for data operations) that transform tasks from the Strategist into specific actions.
  - Executors: Software clients that directly interact with the robot and the database.
- Interactive Web Interface: A modern UI for setting tasks, monitoring mission execution in real time, and managing the mission lifecycle (pause, resume, interrupt).
- Data Visualization: The mission log automatically formats and displays complex JSON objects as user-friendly tables.
- Image Display: Images from the robot's camera are displayed directly in the mission log, providing visual confirmation of events.
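To illustrate the Data Visualization feature, here is a minimal Python sketch of turning a list of JSON objects into a text table. The real interface does this in the browser; the function name `records_to_table` and the sample records are hypothetical and not taken from the project code.

```python
from typing import Any


def records_to_table(records: list[dict[str, Any]]) -> str:
    """Render a list of JSON-like objects as a plain-text table.

    Hypothetical helper: columns are the union of all keys, in first-seen order.
    """
    columns: list[str] = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    # Missing fields become empty cells so every row has the same width.
    rows = [[str(record.get(col, "")) for col in columns] for record in records]
    widths = [
        max([len(col)] + [len(row[i]) for row in rows])
        for i, col in enumerate(columns)
    ]

    def fmt(cells: list[str]) -> str:
        return " | ".join(cell.ljust(width) for cell, width in zip(cells, widths))

    lines = [fmt(columns), "-+-".join("-" * width for width in widths)]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)


# Sample records are made up for the illustration.
sample_records = [
    {"objectID": "1", "color": "red"},
    {"objectID": "2", "size": 3},
]
table = records_to_table(sample_records)
```

Objects with different keys still line up, because every row is padded to the full set of columns.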
The project is built on a clear separation of responsibilities between components:
- User sets a high-level goal through the web interface.
- Orchestrator-Strategist (LLM 1) receives the goal and breaks it down into a sequence of tasks, determining which specialist (robot or data) should perform the next one.
- Specialist-Tactician (LLM 2 or 3) receives a specific task and transforms it into a precise tool call (e.g., `play_motion` for the robot or `saveObject` for the database).
- Client-Executor (`webots_client` or `algolia_client`) executes the command.
- The result of the execution is returned up the chain, and the cycle repeats until the mission is complete.
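The cycle described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's implementation: the `Task` and `ToolCall` types, the `run_mission` function, and the argument shapes passed to `play_motion` and `saveObject` are all hypothetical, and the LLM calls are replaced by plain stub functions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Task:
    specialist: str   # "robot" or "data" (which tactician should handle it)
    description: str


@dataclass
class ToolCall:
    tool: str
    args: dict[str, Any]


def run_mission(
    goal: str,
    strategist: Callable[[str, list], Optional[Task]],
    tacticians: dict[str, Callable[[Task], ToolCall]],
    executors: dict[str, Callable[[ToolCall], Any]],
    max_steps: int = 20,
) -> list:
    """Run the strategist -> tactician -> executor cycle until no task is left."""
    history: list = []
    for _ in range(max_steps):
        task = strategist(goal, history)           # LLM 1 in the real system
        if task is None:                           # strategist signals completion
            break
        call = tacticians[task.specialist](task)   # LLM 2 or 3 in the real system
        result = executors[task.specialist](call)  # webots_client / algolia_client
        history.append((task, call, result))       # result flows back up the chain
    return history


# Stubs standing in for the LLMs and clients (hypothetical behavior).
def demo_strategist(goal: str, history: list) -> Optional[Task]:
    plan = [Task("robot", "wave hand"), Task("data", "log the wave")]
    return plan[len(history)] if len(history) < len(plan) else None


def robot_tactician(task: Task) -> ToolCall:
    return ToolCall("play_motion", {"motion": task.description})


def data_tactician(task: Task) -> ToolCall:
    return ToolCall("saveObject", {"body": {"note": task.description}})


def fake_executor(call: ToolCall) -> str:
    return f"ok:{call.tool}"


demo_history = run_mission(
    "wave and record it",
    demo_strategist,
    {"robot": robot_tactician, "data": data_tactician},
    {"robot": fake_executor, "data": fake_executor},
)
```

Passing the history back to the strategist on every step is one simple way to let it decide the next task from earlier results; the `max_steps` cap is an added safety limit, not something the source describes.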
A more detailed description of the architecture can be found in the ARCHITECTURE.md file.
- Python 3.10+
- Webots R2023b or newer
- Algolia account
- Google Gemini API key
- Clone the repository:

      git clone https://github.com/premananda108/Algolia-RoboCluster.git
      cd Algolia-RoboCluster

- Install Python dependencies:

      pip install -r requirements.txt

- Set up the Algolia MCP Server:

  This project requires a running instance of the Algolia MCP Server. This server acts as a secure proxy that executes commands for the Algolia API.

  - If you don't have it, clone the official repository:

        git clone https://github.com/algolia/mcp-node.git

    It is recommended to clone it into a separate directory from this project (e.g., `C:/dev/mcp-node` or `~/dev/mcp-node`).
  - Navigate to the MCP server directory:

        cd path/to/your/mcp-node

  - Install Node.js dependencies:

        npm install

  - Ensure the path to this directory is set in the `MCP_NODE_PATH` variable in our project's `.env` file.

- Set up environment variables:
  - Copy the `orchestrator/.env.example` file to `orchestrator/.env`.
  - Open `orchestrator/.env` and fill in your details:

        GOOGLE_API_KEY="YOUR_GOOGLE_API_KEY"
        MCP_NODE_PATH="/path/to/your/mcp-node"
        ALGOLIA_APPLICATION_ID="YOUR_ALGOLIA_APP_ID"

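Before starting the server, it can help to confirm that the `.env` file is filled in. The following is a minimal sketch that checks only the three variables listed above; the helpers `parse_env_file` and `missing_keys` are hypothetical, and treating values that still start with `YOUR_` as unset is an assumption based on the example file's placeholders.

```python
from pathlib import Path

# Variable names taken from the .env example above.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "MCP_NODE_PATH", "ALGOLIA_APPLICATION_ID")


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY="value" lines; skip blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are absent, empty, or still a YOUR_ placeholder."""
    return [
        key
        for key in required
        if not values.get(key) or values[key].startswith("YOUR_")
    ]


def check_env(path: str = "orchestrator/.env") -> list[str]:
    """Read the file at `path` and report which required keys still need values."""
    return missing_keys(parse_env_file(Path(path).read_text(encoding="utf-8")))
```

Running `check_env()` from the repository root returns an empty list once every value has been replaced.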
- Start the simulation in Webots:
  - Open the Webots simulator.
  - Open the project world: `webots_project/worlds/MCP-test.wbt`.
  - Start the simulation (press the "Play" button).
- Start the orchestrator server:

      python orchestrator/main.py

  This script will launch the main FastAPI server and automatically connect to the necessary services.
- Open the web interface:
  - Go to http://127.0.0.1:8000 in your browser.
- Start a mission:
  - Enter the mission goal in the text field (e.g., "take three steps forward with the NAO robot and wave your hand").
  - Click the "Start Mission" button and watch the execution in the log.

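If the page does not load right after launching the orchestrator, the server may still be starting. As a small sketch, the helper below polls the address from the steps above until it accepts TCP connections. The function name `wait_for_port` and the timeout values are assumptions; the check only confirms that something is listening on the port, not that it is the orchestrator.

```python
import socket
import time


def wait_for_port(
    host: str = "127.0.0.1",
    port: int = 8000,
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Opening and immediately closing a connection is enough for the check.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Calling `wait_for_port()` with the defaults waits up to 30 seconds for `127.0.0.1:8000`.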
Algolia-RoboCluster/
├── orchestrator/ # "Brain" of the project: FastAPI server, clients, prompts
│ ├── main.py # Main server file
│ ├── algolia_client.py # Client for interacting with Algolia
│ ├── webots_client.py # Client for interacting with Webots
│ ├── prompt_*.txt # System prompts for LLMs
│ └── .env.example # Example environment variables file
├── frontend/ # Web interface
│ └── index.html
├── webots_project/ # Project for the Webots simulator
│ ├── controllers/ # Robot controllers
│ ├── worlds/ # Simulation worlds
│ └── ...
├── ARCHITECTURE.md # Description of the architecture
├── README.md # This file
└── requirements.txt # Python dependencies
