ESP32 Voice Assistant With Gemini AI
by circuitsmiles
This project pairs an ESP32 microcontroller with a Python server that uses Google's Gemini AI for responses and gTTS for speech. The result is a device that talks to you without ever listening: there is no microphone, and you pick from predefined phrases with two buttons. It's a great way to learn about microcontrollers, AI APIs, and text-to-speech, all while keeping your AI token usage low!
Supplies
Hardware Components:
- Microcontroller: ESP32 Dev Kit C
- Display: 0.96" OLED Display (SSD1306, I2C interface)
- Audio Output: MAX98357A I2S Class-D Amplifier + Small 8-ohm Speaker
- User Input: 2x Tactile Buttons
- Visual Cues: 1x Red LED, 1x Green LED
- Miscellaneous: Breadboard, Jumper Wires (male-to-male), USB Power Supply (at least 1A)
Software & Accounts:
- Arduino IDE (for ESP32 firmware)
- Python 3 (for the server)
- A Google API Key (for Gemini API access)
The Wiring - Connecting Everything Up
This is where the physical build comes together. Take your time, double-check connections, and ensure your ESP32 is powered off while wiring. All GND pins from components should connect to a common ground rail on your breadboard.
Firmware Flash - Programming the ESP32
Now that the hardware is connected, let's load the brain into the ESP32. The firmware code is in the project's GitHub repo.
Important: Update the Wi-Fi credentials in the firmware before you upload it.
- Install Arduino IDE: If you don't have it, download and install the Arduino IDE.
- Add ESP32 Board: Go to File > Preferences and add this URL to "Additional Boards Manager URLs": https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json
- Install Board: Navigate to Tools > Board > Boards Manager, search for "esp32", and install the package.
- Install Libraries: Go to Sketch > Include Library > Manage Libraries, then search for and install:
  - Adafruit GFX Library
  - Adafruit SSD1306 Library
- Open Code: Open the provided ESP32 firmware .ino file.
- Upload: Select your ESP32 board and port (Tools > Board and Tools > Port), then click the "Upload" arrow.
The AI Server - Python & Gemini
This Python server runs on your computer (or a Raspberry Pi) and acts as the intelligence hub. The server code is in the project's GitHub repo.
- Install Python: Ensure you have Python 3 installed.
- Virtual Environment (Recommended):
  - Create it: python3 -m venv venv
  - Activate it: source venv/bin/activate (macOS/Linux) or venv\Scripts\activate (Windows)
- Install Dependencies: Run pip install -r requirements.txt
- Get Gemini API Key: Go to Google AI Studio and create an API key. This is the value you will store as GEMINI_API_KEY.
- Create .env file: In the same directory as your server.py file, create a new file named .env and add: GEMINI_API_KEY="YOUR_API_KEY_HERE"
- Run the Server: Open a terminal in your server's directory and run: python server.py. The server will now be running, waiting for requests from your ESP32!
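If the server fails to start, a missing or placeholder API key is a likely culprit. The sketch below is a minimal, standard-library-only way to read the .env file and check the key before launching. It is not the project's server.py, and the helper names load_env_file and require_api_key are hypothetical; the real server may well use a package such as python-dotenv instead.

```python
import tempfile
from pathlib import Path


def load_env_file(path):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an '=' sign.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one layer of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def require_api_key(env):
    """Return the Gemini key, or raise if it is missing or still the placeholder."""
    key = env.get("GEMINI_API_KEY", "")
    if not key or key == "YOUR_API_KEY_HERE":
        raise RuntimeError("Set GEMINI_API_KEY in the .env file first.")
    return key


if __name__ == "__main__":
    # Demo against a throwaway file so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        env_path = Path(tmp) / ".env"
        env_path.write_text('GEMINI_API_KEY="demo-key"\n', encoding="utf-8")
        print(require_api_key(load_env_file(env_path)))
```

To check your own setup, point load_env_file at the .env file that sits next to server.py.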
Putting It All Together & How to Use
Operation: Your AI Assistant Is Ready!
- Power Up: Connect power to your ESP32. It should connect to Wi-Fi, and the OLED will display "Ready" with the green LED solid.
- "Next" Button: Press this button to cycle through the predefined phrases on the OLED display.
- "Speak" Button: When you've found the phrase you want, press "Speak."
  - The OLED will show "Thinking..." (red LED solid) as the ESP32 contacts the server.
  - Once the server responds, it will switch to "Speaking..." (green LED solid, red LED blinks) as the audio plays.
  - After playback, it returns to "Ready."
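The button flow above is a small state machine. The sketch below models it in Python purely for illustration; the actual firmware is an Arduino sketch and may be structured differently. The class and method names are hypothetical. Two details are assumptions: that an LED not mentioned for a state is off, and that button presses are ignored while the device is busy.

```python
from enum import Enum


class State(Enum):
    READY = "Ready"
    THINKING = "Thinking..."
    SPEAKING = "Speaking..."


# LED pattern per state as (green, red). "off" entries are assumed,
# since the steps only say which LED is solid or blinking.
LEDS = {
    State.READY: ("solid", "off"),
    State.THINKING: ("off", "solid"),
    State.SPEAKING: ("solid", "blink"),
}


class Assistant:
    """Toy model of the device: two buttons, a phrase list, three states."""

    def __init__(self, phrases):
        self.phrases = list(phrases)
        self.index = 0
        self.state = State.READY

    @property
    def current_phrase(self):
        return self.phrases[self.index]

    def press_next(self):
        # Cycle to the next phrase, wrapping around; ignored while busy.
        if self.state is State.READY:
            self.index = (self.index + 1) % len(self.phrases)

    def press_speak(self):
        # Start a request and return the phrase that would go to the server.
        if self.state is State.READY:
            self.state = State.THINKING
            return self.current_phrase
        return None

    def server_responded(self):
        if self.state is State.THINKING:
            self.state = State.SPEAKING

    def playback_finished(self):
        if self.state is State.SPEAKING:
            self.state = State.READY
```

Walking one request through the model: press_speak() moves Ready to Thinking, server_responded() moves Thinking to Speaking, and playback_finished() returns to Ready.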
The Token-Saving Trick: The Python server deliberately limits the length of each Gemini response, which keeps your API token usage (and potential costs) down. Shorter replies also mean less audio to generate and play back. It's an efficient little system!
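How server.py enforces the limit is not shown here, so treat the following as one plausible approach rather than the project's method. It builds a request body in the shape used by the Gemini REST generateContent endpoint, capping output with generationConfig.maxOutputTokens, and trims the reply again before it goes to text-to-speech. The function names, the prompt wording, and the limits of 60 tokens and 40 words are all illustrative assumptions.

```python
import json


def build_gemini_request(phrase, max_output_tokens=60):
    """Build a generateContent request body that caps the reply length."""
    return {
        "contents": [
            {"parts": [{"text": f"Answer in one or two short sentences: {phrase}"}]}
        ],
        # Hard cap on generated tokens; 60 is an arbitrary example value.
        "generationConfig": {"maxOutputTokens": max_output_tokens},
    }


def trim_for_speech(text, max_words=40):
    """Collapse whitespace and cut the reply to at most max_words words."""
    words = text.split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + "..."


if __name__ == "__main__":
    body = build_gemini_request("Tell me a fun fact")
    print(json.dumps(body, indent=2))
    print(trim_for_speech("one two three four five", max_words=3))
```

Asking for brevity in the prompt and capping tokens in the config work together: the prompt steers the model toward a complete short answer, while the cap bounds the cost if it rambles anyway.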
Conclusion & What's Next?
Congratulations! You've built a functional, privacy-conscious AI voice assistant. This project demonstrates how versatile the ESP32 is when combined with powerful APIs.
Ideas for improvement:
- Add a local web interface for custom prompt configuration.
- Integrate other sensors or actuators.
- Explore different Text-to-Speech engines or even local voice models.
I hope you enjoyed this build! If you have any questions or run into issues, leave a comment!