🏠 GPT Home 🤖💬 an Open-source Raspberry Pi Home Assistant!

by Judah Paul in Circuits > Raspberry Pi



[Image: my_build.jpg]

Inspiration

The inspiration for GPT Home came from my interest in automation, combined with a Raspberry Pi 4B I had lying around. The course I took on IoT further motivated me to implement this project, though I probably would have explored the idea regardless.

What it does

GPT Home transforms a standard Raspberry Pi into a smart home hub similar to a Google Nest Hub or Amazon Alexa. It leverages the OpenAI API to provide an interactive voice assistant capable of understanding and generating human-like responses. Users can ask general questions, control home devices, or get updates about the weather, all through voice commands.

How I built it

The project was built using a Raspberry Pi 4B running Ubuntu Server, Python for the software, and various hardware components: a mini auxiliary speaker that simply plugs into the headphone jack, an OLED display, a USB microphone for voice input, and a battery pack to make it portable (something not possible with an Amazon Alexa or Google Nest Hub). Integration with the OpenAI API allows it to perform sophisticated natural language processing tasks.

Challenges I ran into

One of the main challenges was ensuring seamless integration of various components like the OLED display and the USB microphone with the Raspberry Pi. Configuring the audio input and output on the Ubuntu Server also required meticulous adjustments to avoid latency and feedback issues. Implementing asynchronous operations was particularly tricky, especially when trying to manage concurrent tasks like speaking, updating the OLED display, and handling queries simultaneously. Additionally, setting up Spotify's OAuth for music streaming involved navigating complex authentication flows, which proved to be quite challenging.

Accomplishments that I'm proud of

I'm particularly proud of how seamlessly the components work together to create a responsive and interactive user experience. Converting text to speech and speech to text efficiently, despite the hardware limitations of the Raspberry Pi, stands out to me as a significant achievement. The project has even gained the attention of RaspberryPi.com!

What I learned

This project deepened my understanding of integrating hardware with software for IoT applications. I gained practical experience in working with the OpenAI API and improved my skills in troubleshooting hardware compatibility issues on Linux-based systems. I've also greatly enhanced my understanding of Docker during this process.

Supplies

This is the list of parts I used to build my first GPT Home. You can use it as a reference for building your own. I've also included optional parts that you can add to enhance your setup. To be clear, you can use any ARM64 system that runs Linux; it doesn't necessarily have to be a Raspberry Pi, but there may be compatibility issues when using other hardware or operating systems. See the compatibility chart for more details.


Core Components

- Raspberry Pi 4B: [Link] - $50-$70

- Mini Speaker: [Link] - $18

- 128 GB MicroSD card: [Link] - $13

- USB 2.0 Mini Microphone: [Link] - $8

Optional Components

- 128x32 OLED Display: [Link] - $13-$14

- Standoff Spacer Column M3x40mm: [Link] - $14

- M1.4 M1.7 M2 M2.5 M3 Screw Kit: [Link] - $15

- Raspberry Pi UPS Power Supply with Battery: [Link] - $30

- Cool Case for Raspberry Pi 4B: [Link] - $16

Total Price Range

- Core Components: $89-$109

- Optional Components: $88-$89

- Total (Without Optional): $89-$109

- Total (With Optional): $177-$198

Plug in Microphone and Speaker

Assuming you already have an operating system loaded onto your device and a connection to the internet, all you need to do is plug in your speaker and microphone. You can use any speaker and microphone, whether USB or auxiliary, as long as they are recognized devices in ALSA. After plugging them in, you can verify they are available with the `aplay -l` and `arecord -l` commands (a quick playback and recording test follows the listing below). You should see output similar to this:

# arecord -l
**** List of CAPTURE Hardware Devices ****
card 3: Device [USB PnP Sound Device], device 0: USB Audio [USB Audio]
  Subdevices: 0/1
  Subdevice #0: subdevice #0

# aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: vc4hdmi0 [vc4-hdmi-0], device 0: MAI PCM i2s-hifi-0 [MAI PCM i2s-hifi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: vc4hdmi1 [vc4-hdmi-1], device 0: MAI PCM i2s-hifi-0 [MAI PCM i2s-hifi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 2: Headphones [bcm2835 Headphones], device 0: bcm2835 Headphones [bcm2835 Headphones]
  Subdevices: 8/8
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
  Subdevice #7: subdevice #7
card 4: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
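
If you want to confirm audio is actually flowing before moving on, ALSA's own tools make for a quick sanity check. The card numbers below are taken from the example listing above (card 3 for the USB microphone, card 2 for the headphone jack the mini speaker plugs into); substitute the card numbers from your own `arecord -l` and `aplay -l` output.

# Play a short test tone through the headphone jack (card 2 in the listing above)
speaker-test -D plughw:2,0 -c 2 -t wav -l 1

# Record five seconds from the USB microphone (card 3), then play it back
arecord -D plughw:3,0 -f cd -d 5 test.wav
aplay -D plughw:2,0 test.wav

If you hear the tone and your own recording played back, both devices are working and correctly recognized by ALSA.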

Install the Docker Container

It only takes two* commands to get the container up and running on your Raspberry Pi. *Three commands if you want to use one of the models provided by LiteLLM.

1. Required for Semantic Routing: Make sure to export your OpenAI API Key to an environment variable.

echo "export OPENAI_API_KEY='your_api_key_here'" >> ~/.bashrc && source ~/.bashrc

2. Optional: If you want to use a model not provided by OpenAI, make sure your API key for the provider you want to use is exported to an environment variable called `LITELLM_API_KEY`. See the LiteLLM docs for a list of all supported providers.

echo "export LITELLM_API_KEY='your_api_key_here'" >> ~/.bashrc && source ~/.bashrc

3. Run the setup script with the `--no-build` flag to pull the latest image from DockerHub:

curl -s https://raw.githubusercontent.com/judahpaul16/gpt-home/main/contrib/setup.sh | \
  bash -s -- --no-build
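
Once the script finishes, it's worth checking that the container actually came up. `docker ps` lists running containers; the container name used below (`gpt-home`) is just my assumption based on the project name, so substitute whatever name appears in your `docker ps` output.

# Confirm the GPT Home container is running
docker ps

# Follow the container's logs (replace gpt-home with the name shown by docker ps)
docker logs -f gpt-home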

Configure Settings in the Web Interface

[Animation: preview.gif]

There are a number of things you can customize from the web interface: choosing the LLM that responds to you, changing the wake keyword (the default is 'computer'), adjusting max tokens, setting languages (coming soon), and connecting your favorite services like Spotify, Philips Hue, OpenWeatherMap, and more to come!