Harry Potter AI Chatbot + Google Gemini AI Chatbot + an Emailing Machine. All of Them Listen to Your Voice and Even Talk to You. All on a Pi 4!

by thesigmacoderboy in Circuits > Raspberry Pi

I always wanted to have a personal chatbot in real life. On top of that, everybody in the house is a wild Harry Potter fan: my mom has read all of the books three times and watched all of the movies at least seven times, I bet. So I decided to put these ideas together and make a perfect gift for my parents' anniversary in July. Even though I published this Instructable pretty late, my parents' anniversary really is in July!

This project works as a personal chatbot, as a Harry Potter simulator, and as your email sender. The email-sending part is the most complex one; the image above explains it.
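The script further down picks one of the three modes from the first word you say: "Harry", "Gemini", or "Mailman". As a rough sketch of that idea only (the function name `route` and its return shape are made up for this example and are not part of the script):

```python
import string

# Keywords taken from the script in this Instructable.
# route() itself is a hypothetical helper for illustration.
KEYWORDS = ("Harry", "Gemini", "Mailman")

def route(transcript):
    """Return (mode, rest) for a recognized sentence, or (None, "") if no keyword leads it."""
    text = transcript.strip()
    for keyword in KEYWORDS:
        if text.startswith(keyword):
            # Drop the keyword, then any punctuation or spaces the recognizer put after it.
            rest = text[len(keyword):].lstrip(string.punctuation + " ")
            return keyword.lower(), rest
    return None, ""

print(route("Harry, what is your favourite spell?"))
print(route("Mailman."))
print(route("What time is it?"))
```

Anything that does not start with one of the three keywords is simply ignored, which is why you have to say the keyword first every time.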

Supplies

A Raspberry Pi - I used a Pi 4 B (and something to power it)

A speaker - I used a Bluetooth one

A microphone that the Pi can record from

Air-dry clay or a 3D printer - OPTIONAL (I used air-dry clay) - to make the body for the Pi and speaker

Paint - OPTIONAL - to paint the body for the Pi and speaker

Another computer to SSH into the Pi - OPTIONAL BUT RECOMMENDED

Create Credentials

Let's create the Azure credentials:

Go to Azure Portal: https://portal.azure.com

In the top search bar, type "Speech" and click "Speech" (Cognitive Services).

Click + Create.

Fill in the required fields:

Subscription: Select your Azure subscription.
Resource Group: Create a new one or use an existing.
Region: Choose a region close to you (e.g., eastus, westus2, etc.).
Name: Give your resource a name (e.g., MySpeechResource).
Pricing Tier: Choose Free (F0) for testing or Standard (S0) if you need more capacity.

Click Review + create > Create.

Once the deployment is complete, click Go to Resource.

In the left menu, click Keys and Endpoint.

You’ll see:

Key 1 and Key 2 – both work interchangeably.
Endpoint – e.g., https://<your-region>.api.cognitive.microsoft.com/

Copy and save:

Your key
Your region
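The script asks for the region name (it calls it "azure localty"), not the full endpoint URL. If you only copied the endpoint, here is a small sketch for reading the region back out of it. It assumes the endpoint looks like the example above; `region_from_endpoint` is a made-up helper name.

```python
from urllib.parse import urlparse

def region_from_endpoint(endpoint):
    """Return the region label from an endpoint such as https://eastus.api.cognitive.microsoft.com/."""
    host = urlparse(endpoint).hostname or ""
    # Assumption: the region is the first dot-separated label of the host name.
    return host.split(".")[0]

print(region_from_endpoint("https://eastus.api.cognitive.microsoft.com/"))
```

If your endpoint has a different shape, use the region you picked when creating the resource instead.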

Now let's create the Gmail credentials (SKIP IF YOU DON'T WANT THIS FEATURE):

Click here, click Security, scroll down to 2-Step Verification and turn it on. You may be asked to sign in again. Follow the directions on the screen.

Once 2-Step Verification is on, click here and scroll down to the app name part. Type in an app name (it can be anything) and then copy the 16-character code. KEEP THIS CODE CONFIDENTIAL.

The script also asks for a Gemini API key the first time it runs. You can create one in Google AI Studio.
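To show what the 16-character code is for, here is a minimal sketch of sending mail through Gmail with it. It mirrors what the script does later with `smtplib` and `EmailMessage`; the helper names and the example addresses are placeholders, and the actual send is commented out because it needs real credentials.

```python
import smtplib
import ssl
from email.message import EmailMessage

def build_message(sender, receiver, subject, body):
    """Assemble a plain-text email."""
    em = EmailMessage()
    em["From"] = sender
    em["To"] = receiver
    em["Subject"] = subject
    em.set_content(body)
    return em

def send_message(em, sender, app_password):
    """Log in to Gmail over SSL with the app password and send the message."""
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL("smtp.gmail.com", 465, context=context) as smtp:
        smtp.login(sender, app_password)
        smtp.send_message(em)

msg = build_message("me@example.com", "you@example.com", "Hello", "Sent from my Pi.")
print(msg["To"], "|", msg["Subject"])
# send_message(msg, "me@example.com", "your-app-password")  # needs real credentials
```

The app password replaces your normal Google password here, so your real password never has to be stored on the Pi.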

Make a Case for the Setup (Optional)

As I said before, I used air-dry clay to make the case for the setup (it looks less ugly that way).

I placed the Pi, the microphone, and the speaker inside.

Turn the Pi on.

SSH (Skip If You Do Not Have a Headless Pi Setup - or If You Want This Project to Be Easier)

First, enable SSH on the Pi if you have not done so before (only needed if you want to use SSH instead of the Pi's built-in terminal with a monitor attached):

sudo raspi-config

Go to Interface Options, then SSH, then Enable.

Next, SSH into the Pi from the other computer. To do that we need the Pi's IP address, so we ping it. Replace raspberrypi with the Pi's hostname if you changed it to something else:

ping -4 raspberrypi.local

The first line of the output contains the IP address. Example:

PING raspberrypi.local (192.168.1.105): 56 data bytes

64 bytes from 192.168.1.105: icmp_seq=0 ttl=64 time=0.523 ms

Here, 192.168.1.105 is the IP address. Now SSH in (replace pi with the Pi's username if it isn't pi!):

ssh pi@192.168.1.105

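It will ask for the password. On older Raspberry Pi OS images the default is raspberry; otherwise use the password you chose when you set up the Pi.

Now you are in the Pi.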
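If you want to pull the address out of the ping output with a script instead of reading it by eye, a sketch like this works on the example line shown above. `extract_ipv4` is a hypothetical helper; the regular expression just looks for the first dotted address in parentheses.

```python
import re

def extract_ipv4(ping_line):
    """Return the first IPv4 address found in parentheses, or None."""
    match = re.search(r"\((\d{1,3}(?:\.\d{1,3}){3})\)", ping_line)
    return match.group(1) if match else None

line = "PING raspberrypi.local (192.168.1.105): 56 data bytes"
print(extract_ipv4(line))
```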

VENV Creation, Library Installation and Running the Code

Now let's create a venv called venv. If the virtualenv command is not installed, python3 -m venv venv does the same job.

virtualenv venv

source venv/bin/activate
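To double-check that the venv is really active, you can ask Python itself. This is only a quick sketch (the function name is made up); inside an activated environment `sys.prefix` points at the venv folder while `sys.base_prefix` still points at the system Python.

```python
import sys

def in_virtualenv():
    """True when the running interpreter belongs to a venv or virtualenv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)

print("venv active:", in_virtualenv())
print("interpreter:", sys.executable)
```

Run it with python3 right after the source command. The interpreter path should then point into the venv folder.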

Create a text file (sudo is not needed, and it would make the file belong to root):

nano requirements.txt

And paste these lines:

google-generativeai

pyttsx3

SpeechRecognition

PyAudio

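Then install the libraries. PyAudio builds against PortAudio and pyttsx3 speaks through eSpeak on Linux, so if the install or the script fails, you may first need sudo apt install portaudio19-dev espeak.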

pip install -r requirements.txt
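A quick way to see whether everything installed is to check that each module can be found. The sketch below does that without importing the libraries fully. The mapping from pip names to import names is spelled out because they differ, and `missing_packages` is a made-up helper name.

```python
import importlib.util

# pip package name -> name used in the import statement
REQUIRED = {
    "google-generativeai": "google.generativeai",
    "pyttsx3": "pyttsx3",
    "SpeechRecognition": "speech_recognition",
    "PyAudio": "pyaudio",
}

def missing_packages(required):
    """Return the pip names whose modules cannot be found."""
    missing = []
    for package, module in required.items():
        try:
            found = importlib.util.find_spec(module) is not None
        except ImportError:
            # Raised when a parent package (for example "google") is absent.
            found = False
        if not found:
            missing.append(package)
    return missing

print("Missing:", missing_packages(REQUIRED) or "nothing")
```

If a name shows up as missing, run the pip install command again with the venv activated and read the error it prints.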

Next, create a directory called python_ai:

mkdir python_ai

cd python_ai

Create a Python file:

nano main.py

and paste this code:

import google.generativeai as genai
import pyttsx3
import speech_recognition as sr
import ssl
import smtplib
import json
from email.message import EmailMessage
import os

engine = pyttsx3.init()
engine.setProperty('rate', 150) # Slower = clearer
engine.setProperty('volume', 1.0)
context = ssl.create_default_context()

def talk(text):
    engine.say(text)
    engine.runAndWait()

def remove_punctuation_manual(text):
    # Also removes spaces, so spoken email parts join into one word
    punctuation_chars = ',.;?` !~'
    no_punct_text = ""
    for char in text:
        if char not in punctuation_chars:
            no_punct_text += char
    return no_punct_text

# First run: ask for the credentials and save them as JSON
if not os.path.exists("aidata.json"):
    inputone = input("Gmail sender email (leave blank to disable email) - ")
    inputtwo = input("Gmail app password (leave blank to disable email) - ")
    inputthree = input("Azure API key - ")
    inputfour = input("Azure region, e.g. eastus - ")
    inputfive = input("Gemini API key - ")
    inputsix = input("Gemini model name, e.g. gemini-2.0-flash - ")
    filedata = {
        "email_sender": inputone,
        "email_apppassword": inputtwo,
        "azure_api_key": inputthree,
        "azure_localty": inputfour,
        "gemini_app_password": inputfive,
        "gemini_version": inputsix,
    }
    with open('aidata.json', "w", encoding="utf-8") as q:
        json.dump(filedata, q, indent=4)

# Every run: load the saved credentials
with open('aidata.json', "r", encoding="utf-8") as q:
    filecontents = json.load(q)

email_sender1 = filecontents["email_sender"]
email_password1 = filecontents["email_apppassword"]
gemini_app_password1 = filecontents["gemini_app_password"]
AZURESPEECHKEY1 = filecontents["azure_api_key"]
location1 = filecontents["azure_localty"]
gemini_version = filecontents["gemini_version"]

genai.configure(api_key=gemini_app_password1)
model = genai.GenerativeModel(gemini_version)
AZURESPEECHKEY = AZURESPEECHKEY1
email_sender = email_sender1
email_password = email_password1

# obtain audio from the microphone
microphone = sr.Microphone()
r = sr.Recognizer()
r.pause_threshold = 1.0

while True:
    with microphone as source:
        r.adjust_for_ambient_noise(source, duration=2) # Increase duration if the room is noisy
        try:
            print("Please say something!")
            audio = r.listen(source)
        except sr.WaitTimeoutError:
            print("No speech detected. Listening again.")
            talk("I didn't hear anything. Please speak when ready.")
            continue
        except Exception as e:
            print(f"An unexpected error occurred while listening: {e}")
            talk(f"There was an unexpected issue with my listening device. {e}. " + "I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
            continue

    # recognize speech using Azure Speech Recognition
    try:
        voicedata = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
        audiodata = voicedata[0]
        print("Azure Speech Recognition thinks you said " + audiodata)
        if audiodata.strip() != "":
            # Azure adds punctuation ("Bye."), so strip it before comparing
            if audiodata.lower().strip(".!? ") == "bye":
                talk("farewell, inquisitive magician!")
                break
            else:
                if audiodata.startswith("Harry"):
                    # Remove only the leading keyword, not later mentions of Harry
                    harrydata = audiodata.replace("Harry", "", 1)
                    realharrydata = remove_punctuation_manual(harrydata)
                    if realharrydata:
                        response = model.generate_content("answer this question like you are harry potter and I SINCERELY REQUEST YOU PLEASE DO NOT USE ASTERISKS. here is the question - " + harrydata)
                        print(response.text)
                        talk(response.text)
                        continue
                    else:
                        print("your input other than harry is blank.")
                        talk("your input other than harry is blank. please say harry, gemini, or mailman.")
                        continue
                elif audiodata.startswith("Gemini"):
                    geminidata = audiodata.replace("Gemini", "", 1)
                    realgeminidata = remove_punctuation_manual(geminidata)
                    if realgeminidata:
                        response = model.generate_content(geminidata + " no asterisks in your response, please.")
                        print(response.text)
                        talk(response.text)
                        continue
                    else:
                        continue

                elif audiodata.startswith("Mailman"):
                    # Email is only available when both Gmail credentials were entered
                    if email_sender and email_password:
                        context = ssl.create_default_context()
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2) # Increase duration if the room is noisy
                            try:
                                print("Say username!")
                                talk("say username in alphabets.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        usrnametuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        a = usrnametuple[0].lower()
                        username = remove_punctuation_manual(a)
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2) # Increase duration if the room is noisy
                            try:
                                print("Say domain part one!")
                                talk("say domain part one in alphabets.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        domainonetuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        b = domainonetuple[0].lower()
                        domainone = remove_punctuation_manual(b)
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2) # Increase duration if the room is noisy
                            try:
                                print("Say domain part two!")
                                talk("say domain part two in alphabets.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        domaintwotuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        c = domaintwotuple[0].lower()
                        domaintwo = remove_punctuation_manual(c)
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2) # Increase duration if the room is noisy
                            try:
                                print("Say subject!")
                                talk("say subject of the email.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        subjecttuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        subject = subjecttuple[0]
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2) # Increase duration if the room is noisy
                            try:
                                print("Say body!")
                                talk("say body of the email clearly and slowly.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        bodytuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        body = bodytuple[0]
                        # Build the recipient address from the three spoken parts
                        if username:
                            if domainone:
                                if domaintwo:
                                    email_receiver = username+"@"+domainone+"."+domaintwo
                                else:
                                    print("domaintwo blank.")
                                    talk("domain two blank. " + "Please speak gemini, mailman, or harry when ready.")
                                    continue
                            else:
                                print("domain one blank.")
                                talk("domain one blank. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        else:
                            print("username blank")
                            talk("username blank. " + "Please speak gemini, mailman, or harry when ready.")
                            continue
                        em = EmailMessage()
                        em['From'] = email_sender
                        em['To'] = email_receiver
                        em['Subject'] = subject
                        em.set_content(body)
                        with microphone as source:
                            r.adjust_for_ambient_noise(source, duration=2) # Increase duration if the room is noisy
                            try:
                                print("please confirm.")
                                talk("say yes for confirmation.")
                                audio = r.listen(source)
                            except sr.WaitTimeoutError:
                                print("No speech detected. Listening again.")
                                talk("I didn't hear anything. Please speak gemini, mailman, or harry when ready.")
                                continue
                            except Exception as e:
                                print(f"An unexpected error occurred while listening: {e}")
                                talk(f"An unexpected error occurred while listening: {e}. " + "Please speak gemini, mailman, or harry when ready.")
                                continue
                        confirmtuple = r.recognize_azure(audio, key=AZURESPEECHKEY, location=location1)
                        confirm = confirmtuple[0].strip(". ")
                        confirmation = confirm.lower()
                        if confirmation == "yes":
                            with smtplib.SMTP_SSL('smtp.gmail.com', 465, context=context) as smtp:
                                smtp.login(email_sender, email_password)
                                smtp.sendmail(email_sender, email_receiver, em.as_string())
                            print('sent email')
                            talk('email successfully sent.')
                            continue
                        else:
                            print("email not sent")
                            talk("email not sent.")
                            continue
                    else:
                        talk("Email feature has been deactivated by the user. Speak gemini or harry when ready.")

                else:
                    continue
        else:
            continue
    except sr.UnknownValueError:
        print("Pardon me, dear person. Please speak again.")
        talk("Pardon me, Please speak again.")
        continue
    except sr.RequestError as e:
        print("Could not request results from Azure Speech Recognition service; {0}".format(e))
        talk("Could not request results from Azure Speech Recognition service; {0}".format(e))
        break
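For reference, the Mailman branch above builds the recipient address from three spoken parts: username, domain part one, and domain part two. Here is a stand-alone sketch of that step. Note that it is a stricter variant than the script: it keeps only letters and digits, and both helper names are made up for this example.

```python
def clean_part(spoken):
    """Lowercase a spoken fragment and drop spaces and punctuation the recognizer adds."""
    return "".join(ch for ch in spoken.lower() if ch.isalnum())

def build_address(username, domain_one, domain_two):
    """Join the three parts as username@domain_one.domain_two, or return None if any part is blank."""
    parts = [clean_part(p) for p in (username, domain_one, domain_two)]
    if not all(parts):
        return None
    user, first, second = parts
    return f"{user}@{first}.{second}"

print(build_address("John Doe.", "Gmail.", "Com."))
print(build_address("", "gmail", "com"))
```

Because everything except letters and digits is dropped, addresses with dots, dashes, or underscores in the username cannot be dictated this way.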

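Save the file and exit nano (Ctrl+X, then Y, then Enter). Run the script once in the foreground so that it can ask for your credentials and save them in aidata.json:

python3 main.py

Press Ctrl+C to stop it. Then start it in the background so it keeps running after you log out:

nohup python3 main.py &

and then type

exit

You are all done! WARNING - The script does not start on its own. If you disconnect power from the Pi, you have to log in again, activate the venv, and start main.py again!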