Harry Potter AI Chatbot + Google Gemini AI Chatbot + an Emailing Machine. All of Them Listen to Your Voice and Even Talk to You. All on a Pi 4!

by thesigmacoderboy in Circuits > Raspberry Pi



20250817_112954.jpg
diagram.png

I always wanted a personal chatbot in real life. On top of that, everybody in my house is a wild Harry Potter fan: my mom has read all of the books three times and watched all of the movies at least seven times, I bet. So I decided to put these ideas together and make a perfect gift for my parents' anniversary in July. I published this Instructable pretty late, but my parents' anniversary really is in July!


This project works as a personal chatbot, as a Harry Potter simulator, and as your email sender. The email-sending part is the most complex one; the diagram above explains it.
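To give a feel for the email part before the full script, here is a minimal sketch of how the recipient address can be put together from three spoken pieces (username, domain part one, domain part two), the way main.py does further down. The helper names clean_part, build_address and build_message are made up for this illustration and are not part of main.py; only the Python standard library is used, and nothing is actually sent.

```python
from email.message import EmailMessage

# Same characters that main.py strips from the spoken pieces
PUNCTUATION = ",.;?` !~"


def clean_part(spoken: str) -> str:
    """Lower-case a spoken fragment and drop spaces and punctuation."""
    return "".join(ch for ch in spoken.lower() if ch not in PUNCTUATION)


def build_address(username: str, domain_one: str, domain_two: str) -> str:
    """Join the three spoken pieces into username@domain_one.domain_two."""
    parts = [clean_part(p) for p in (username, domain_one, domain_two)]
    if not all(parts):
        raise ValueError("username and both domain parts must be non-empty")
    return f"{parts[0]}@{parts[1]}.{parts[2]}"


def build_message(sender: str, receiver: str, subject: str, body: str) -> EmailMessage:
    """Build the message object; sending it is a separate step."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = receiver
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


# Example with made-up values: speech-to-text tends to add capitals and dots
receiver = build_address("J. Doe", "Example.", "Com.")
message = build_message("me@example.com", receiver, "Hello", "Sent by voice!")
print(message["To"])
```

Cleaning each piece separately keeps a stray full stop from the speech recognizer out of the address.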

Supplies

20250812_175311.jpg
20250812_180839.jpg

A Raspberry Pi - I used a Pi 4 B (and something to power it)

A microphone - for the voice input

A speaker - I used a Bluetooth one

Air dry clay or a 3D printer - OPTIONAL (I used air dry clay) - to make the case for the Pi and speaker

Paint - OPTIONAL - to paint the case for the Pi and speaker

Another computer to SSH into the Pi - OPTIONAL BUT RECOMMENDED

Create Credentials

Screenshot 2025-08-03 110154.png
Screenshot 2025-08-03 110336.png
Screenshot 2025-08-03 110832.png
Screenshot 2025-08-03 111724.png

Let's create Azure Credentials -

Go to Azure Portal: https://portal.azure.com

In the top search bar, type "Speech" and click "Speech" (Cognitive Services).

Click + Create.

Fill in the required fields:

  1. Subscription: Select your Azure subscription.
  2. Resource Group: Create a new one or use an existing.
  3. Region: Choose a region close to you (e.g., eastus, westus2, etc.).
  4. Name: Give your resource a name (e.g., MySpeechResource).
  5. Pricing Tier: Choose Free (F0) for testing or Standard (S0) if you need more capacity.

Click Review + create > Create.

Once the deployment is complete, click Go to Resource.

In the left menu, click Keys and Endpoint.

You’ll see:

  1. Key 1 and Key 2 – both work interchangeably.
  2. Endpoint – e.g., https://<your-region>.api.cognitive.microsoft.com/

Copy and save:

  1. Your key
  2. Your region
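One way to "save" the key and region is a small JSON file that only your user can read. The sketch below is just an illustration: the file name azure_credentials.json and the function names are assumptions, while the two JSON keys (azure_api_key and azure_localty) are the ones main.py reads later. The demo writes into a temporary folder so it does not touch any real file.

```python
import json
import os
import tempfile
from pathlib import Path


def save_credentials(path, azure_api_key, azure_region):
    """Write the key and region as JSON and restrict the file to its owner."""
    path = Path(path)
    data = {"azure_api_key": azure_api_key, "azure_localty": azure_region}
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    os.chmod(path, 0o600)  # owner read/write only (effective on Linux, e.g. on the Pi)
    return path


def load_credentials(path):
    """Read the JSON file back into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Demo with placeholder values in a throwaway folder
with tempfile.TemporaryDirectory() as folder:
    target = save_credentials(Path(folder) / "azure_credentials.json", "your-key-here", "eastus")
    print(load_credentials(target)["azure_localty"])
```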

Now let's create Gmail Credentials (SKIP IF YOU DON'T WANT THIS FEATURE) -

Open your Google Account page, click Security, scroll down to 2-Step Verification and turn it on. You may be asked to sign in again. Follow the directions on the screen.

Once 2-Step Verification is on, open the App passwords page and scroll down to the app name field. Type in an app name (it can be anything), create the password, and copy the 16-character code. KEEP THIS CODE CONFIDENTIAL.
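Google usually displays the app password in groups separated by spaces, so it is easy to copy a space too many or lose a character. Here is a small, optional sanity check, assuming the code is exactly 16 characters once the whitespace is removed; normalize_app_password is a made-up helper name and the example value is a placeholder, not a real password.

```python
def normalize_app_password(raw: str) -> str:
    """Remove all whitespace and check that 16 characters remain."""
    compact = "".join(raw.split())
    if len(compact) != 16:
        raise ValueError(f"expected 16 characters, got {len(compact)}")
    return compact


# Placeholder value for illustration only
print(normalize_app_password("abcd efgh ijkl mnop"))
```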




Make Case for the Setup (Optional)

As I said before, I used air dry clay to make the case for the setup (it looks less ugly that way).

I placed the Pi, the microphone, and the speaker inside.

Turn the Pi on.

SSH (Skip If You Do Not Have a Headless Pi Setup - or If You Want This Project to Be Easier)

First we have to enable SSH if you have not turned it on before (this is for using SSH instead of the Pi's built-in terminal with a monitor). On the Pi, run -

sudo raspi-config

Go to Interface Options, then SSH, and then Enable.

Next we have to SSH into the Pi from the other computer. To do that we need the IP address of the Pi, so we ping it. Replace raspberrypi with the Pi's hostname if you changed it to something else -

ping -4 raspberrypi.local

The first line of the output contains the IP address. Example -

PING raspberrypi.local (192.168.1.105): 56 data bytes
64 bytes from 192.168.1.105: icmp_seq=0 ttl=64 time=0.523 ms

Here, 192.168.1.105 is the IP address. Now SSH in (replace pi with the Pi's username if it isn't pi!) -

ssh pi@192.168.1.105

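It will ask for the password. On older Raspberry Pi OS images the default is raspberry; on newer images it is the password you chose when you set up the Pi.

Now you are in the Pi.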
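If you would rather not read the address off the screen, the first line of the ping output can be parsed with a few lines of Python on the other computer. This is only a sketch: ip_from_ping is a made-up helper, and it assumes the address appears in round brackets (as in the example output above) or in square brackets (as Windows prints it).

```python
import ipaddress
import re


def ip_from_ping(first_line: str) -> str:
    """Return the IPv4 address found in brackets on the first line of ping output."""
    match = re.search(r"[(\[](\d{1,3}(?:\.\d{1,3}){3})[)\]]", first_line)
    if match is None:
        raise ValueError("no bracketed IPv4 address found")
    # IPv4Address rejects impossible values such as 999.1.1.1
    return str(ipaddress.IPv4Address(match.group(1)))


print(ip_from_ping("PING raspberrypi.local (192.168.1.105): 56 data bytes"))
```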

VENV Creation, Library Installation and Running the Code

Now let's create a venv called venv.

virtualenv venv
source venv/bin/activate

Create a text file (no sudo needed, otherwise the file ends up owned by root) -

nano requirements.txt

And paste these lines -

google-generativeai
pyttsx3
SpeechRecognition
PyAudio

Then install the libraries -

pip install -r requirements.txt
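To check that the install worked before going on, a short script can look up each module without importing it. This is an optional sketch; missing_modules is a made-up helper, and the import names in REQUIRED are my assumption of what the four packages above install (they match the imports used in main.py, plus pyaudio for the microphone).

```python
import importlib.util

# Import names assumed for the packages listed in requirements.txt
REQUIRED = ["google.generativeai", "pyttsx3", "speech_recognition", "pyaudio"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is absent
            found = False
        if not found:
            missing.append(name)
    return missing


print("Missing:", missing_modules(REQUIRED) or "nothing")
```

Run it inside the activated venv; anything it lists still needs to be installed.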

Next, create a directory called python_ai -

mkdir python_ai
cd python_ai

Create a Python file -

nano main.py

and paste this code -

import google.generativeai as genai
import pyttsx3
import speech_recognition as sr
import ssl
import smtplib
import json
from email.message import EmailMessage
import os

# Text-to-speech engine
engine = pyttsx3.init()
engine.setProperty('rate', 150)  # Slower = clearer
engine.setProperty('volume', 1.0)
context = ssl.create_default_context()

def talk(text):
    engine.say(text)
    engine.runAndWait()

def remove_punctuation_manual(text):
    punctuation_chars = ',.;?` !~'
    no_punct_text = ""
    for char in text:
        if char not in punctuation_chars:
            no_punct_text += char
    return no_punct_text

# First run only: ask for the credentials and save them to aidata.json
if not os.path.exists("aidata.json"):
    inputone = input("Gmail sender email - ")
    inputtwo = input("Gmail app password - ")
    inputthree = input("Azure API key - ")
    inputfour = input("Azure region (e.g. eastus) - ")
    inputfive = input("Gemini API key - ")
    inputsix = input("Gemini model name (e.g. gemini-2.0-flash) - ")
    filedata = {
        "email_sender": inputone,
        "email_apppassword": inputtwo,
        "azure_api_key": inputthree,
        "azure_localty": inputfour,
        "gemini_app_password": inputfive,
        "gemini_version": inputsix,
    }
    with open('aidata.json', "w", encoding="utf-8") as q:
        json.dump(filedata, q, indent=4)

# Every run: load the saved credentials
with open('aidata.json', "r", encoding="utf-8") as q:
    filecontents = json.load(q)
email_sender = filecontents["email_sender"]
email_password = filecontents["email_apppassword"]
AZURESPEECHKEY = filecontents["azure_api_key"]
location1 = filecontents["azure_localty"]
genai.configure(api_key=filecontents["gemini_app_password"])
model = genai.GenerativeModel(filecontents["gemini_version"])

# obtain audio from the microphone
microphone = sr.Microphone()
r = sr.Recognizer()
r.pause_threshold = 1.0
while True:
    with microphone as source:
        r.adjust_for_ambient_noise(source, duration=2)  # Increase duration if needed
        try:
            print("Please say something!")
            audio = r.listen(source)
        except sr.WaitTimeoutError:
            print("No speech detected. Listening again.")
            talk("I didn't hear anything. Please speak when ready.")
            continue
        except Exception as e:
            print(f"An unexpected error occurred while listening: {e}")
            talk(f"There was an unexpected issue with my listening device. {e}. " + "I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
            continue


    # recognize speech using Azure Speech Recognition
    try:
        voicedata = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
        audiodata = voicedata[0]  # element 0 of the result is the recognized text
        print("Azure Speech Recognition thinks you said " + audiodata)

        if audiodata.strip() != "":
            # the transcript can end with punctuation ("Bye."), so strip it first
            if audiodata.strip(" .!").lower() == "bye":
                talk("farewell, inquisitive magician!")
                break
            else:
                if audiodata.startswith("Harry"):
                    harrydata = audiodata.replace("Harry", "")
                    realharrydata = remove_punctuation_manual(harrydata)
                    if realharrydata:
                        response = model.generate_content("answer this question like you are harry potter and I SINCERELY REQUEST YOU PLEASE DO NOT USE ASTERISKS. here is the question - " + harrydata)
                        print(response.text)
                        talk(response.text)
                        continue
                    else:
                        print("your input other than harry is blank.")
                        talk("your input other than harry is blank. please say harry, gemini, or mailman.")
                        continue
                elif audiodata.startswith("Gemini"):
                    geminidata = audiodata.replace("Gemini", "")
                    realgeminidata = remove_punctuation_manual(geminidata)
                    if realgeminidata:
                        response = model.generate_content(geminidata + " no asterisks in your response, please.")
                        print(response.text)
                        talk(response.text)
                        continue
                    else:
                        continue
                elif audiodata.startswith("Mailman"):
                    # the email feature is only active when both Gmail credentials were entered
                    if email_sender and email_password:
                        context = ssl.create_default_context()
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2)  # Increase duration if needed
                            try:
                                print("Say username!")
                                talk("say username in alphabets.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                        usrnametuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        a = usrnametuple[0].lower()
                        username = remove_punctuation_manual(a)
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2)  # Increase duration if needed
                            try:
                                print("Say domain part one!")
                                talk("say domain part one in alphabets.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                        domainonetuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        b = domainonetuple[0].lower()
                        domainone = remove_punctuation_manual(b)
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2)  # Increase duration if needed
                            try:
                                print("Say domain part two!")
                                talk("say domain part two in alphabets.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        domaintwotuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        c = domaintwotuple[0].lower()
                        domaintwo = remove_punctuation_manual(c)
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2)  # Increase duration if needed
                            try:
                                print("Say subject!")
                                talk("say subject of the email.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        subjecttuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        subject = subjecttuple[0]
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2)  # Increase duration if needed
                            try:
                                print("Say body!")
                                talk("say body of the email clearly and slowly.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        bodytuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        body = bodytuple[0]
                        # put the address together: username@domainone.domaintwo
                        if username:
                            if domainone:
                                if domaintwo:
                                    email_receiver = username + "@" + domainone + "." + domaintwo
                                else:
                                    print("domaintwo blank.")
                                    talk("domain two blank. " + "Please speak gemini, mailman, or harry when ready.")
                                    continue
                            else:
                                print("domain one blank.")
                                talk("domain one blank. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        else:
                            print("username blank")
                            talk("username blank. " + "Please speak gemini, mailman, or harry when ready.")
                            continue
                        em = EmailMessage()
                        em['From'] = email_sender
                        em['To'] = email_receiver
                        em['Subject'] = subject
                        em.set_content(body)
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2)  # Increase duration if needed
                            try:
                                print("please confirm.")
                                talk("say yes for confirmation.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        confirmtuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        confirm = confirmtuple[0].strip(". ")
                        confirmation = confirm.lower()
                        if confirmation == "yes":
                            with smtplib.SMTP_SSL('smtp.gmail.com', 465, context=context) as smtp:
                                smtp.login(email_sender, email_password)
                                smtp.sendmail(email_sender, email_receiver, em.as_string())
                            print('sent email')
                            talk('email successfully sent.')
                            continue
                        else:
                            print("email not sent")
                            talk("email not sent.")
                            continue
                    else:
                        talk("Email feature has been deactivated by the user. Speak gemini or harry when ready.")
                else:
                    continue
        else:
            continue
    except sr.UnknownValueError:
        print("Pardon me, dear person. Please speak again.")
        talk("Pardon me, Please speak again.")
        continue
    except sr.RequestError as e:
        print("Could not request results from Azure Speech Recognition service; {0}".format(e))
        talk("Could not request results from Azure Speech Recognition service; {0}".format(e))
        break
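A side note, not part of main.py: the script picks its role from the first word you say (Harry, Gemini or Mailman) and stops on "bye". The same idea can be written as one small function, which also makes it easy to try out without a microphone. This is a sketch with a made-up function name, route; unlike main.py it ignores capitalisation and punctuation around the keyword.

```python
MODES = ("harry", "gemini", "mailman")


def route(transcript: str):
    """Split a transcript into (mode, rest of the sentence).

    mode is "bye", one of MODES, or None when no keyword was spoken.
    """
    text = transcript.strip()
    if text.strip(" .!?").lower() == "bye":
        return ("bye", "")
    first, _, rest = text.partition(" ")
    keyword = first.strip(",.!?:;").lower()
    if keyword in MODES:
        return (keyword, rest.strip())
    return (None, text)


for heard in ("Harry, what is your favourite spell?", "Mailman.", "Bye.", "Hello there"):
    print(heard, "->", route(heard))
```

Keeping the routing in one place like this means a new keyword only needs one extra entry in MODES plus the code that handles it.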

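Save the file and exit nano. Run the script once in the foreground so that it can ask for your credentials, then stop it with Ctrl+C -

python3 main.py

After that, start it in the background so that it keeps running after you close the SSH session -

nohup python3 main.py &

and then type

exit

You are all done! WARNING - The script stops if you disconnect power from the Pi! After powering it back on, log in again, activate the venv, go to the python_ai folder and start main.py again.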