Sunday, Your Vocal Assistant!

by forstark in Circuits > Microcontrollers

This is an IoT project: a voice assistant named Sunday that is used to control electrical equipment in your home, for instance lights and heating, and to handle motion detection.

This Instructable can be divided into three parts:

  • Sunday: the core of the system

A single box that embeds a microphone module, a microcontroller, and a WiFi connection to communicate with all the other modules. This box detects the keyword "Sunday", then records the voice command and sends it to the Calculation Unit.

  • Calculation Unit: the voice recognition core

This unit can be almost anything: a personal computer, a single-board computer, etc. It runs a program that receives the voice command recorded by the Sunday Core, transcribes it into a sentence, and then turns it into automation commands, which are sent back to the Sunday Core for processing.

  • The automation modules: the devices that receive commands from Sunday and perform actions such as turning on the light

Each of these modules has a unique IP address known by the Sunday Core and can carry out a set of actions.

Supplies

Hardware components

  • Breadboard (x2)
  • Module NodeMCU 1.0 ESP8266 (x2)
  • Resistors (330 Ω x2, 220 Ω x3, 150 Ω x1, 1 MΩ x1, 1 kΩ x1, 10 kΩ x2, 47 kΩ x1)
  • Capacitors (470 nF x2, 10 µF x1, 100 nF x1, 22 µF x1)
  • Pushbutton (x1)
  • HC-SR501 PIR Motion Sensor (x1)
  • RGB LEDs (x2)
  • DHT11 Sensor – Temperature & Humidity (x1)
  • TL084CN operational amplifier (x2)
  • KY-038 microphone module (x1)

Software apps and online services

  • Arduino IDE
  • WiFi (a local network is required)
  • MQTT as the data communication protocol
  • Scaleway as the cloud provider
  • Edge Impulse to acquire the data used to train the keyword-detection model

Automation Module: Light, Heater, Humidity, Temperature and Motion Sensor

This part is not the main goal of our project, and there are already so many Instructables on the subject that we are not going to build a real connected dimmable light bulb. The light is simulated by a simple LED, and so is the heater when it is on.

We connect an ESP8266 to our WiFi network and wire all the components to this board, including:

  • An LED for the light
  • An LED for the heater
  • A DHT11 sensor for temperature and humidity
  • A PIR sensor for motion detection
  • A pushbutton to turn the light on/off

All the wiring follows the schematic.

Each LED is connected to a digital output through a 330 Ω resistor to limit the current (with a 3.3 V supply and a typical red LED, that is roughly 4 mA).

The DHT11 and the PIR sensor are each connected to a digital input of the ESP8266 board, as well as to VCC (3.3 V) and GND.

The pushbutton is connected to the analog pin, which in our case gave a more reliable reading, but you can also wire it to a digital pin. In either case, it has to be connected through a pull-down (or pull-up) resistor, whether internal or external.

Finally, to keep things simple, everything is wired to a single ESP8266, but you can add any sensors you want on as many boards as you like to create a real home-automation network.
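
To make the wiring concrete, here is a minimal sketch of what the module firmware can look like. The pin assignments and the button threshold are placeholders rather than the exact values from our schematic, and it assumes the Adafruit "DHT sensor library" is installed.

#include <DHT.h>

const int LIGHT_LED_PIN  = D1;   // LED simulating the light
const int HEATER_LED_PIN = D2;   // LED simulating the heater (driven later by remote commands)
const int PIR_PIN        = D5;   // HC-SR501 output
const int DHT_PIN        = D6;   // DHT11 data line

DHT dht(DHT_PIN, DHT11);
bool lightOn = false;

void setup() {
  Serial.begin(115200);
  pinMode(LIGHT_LED_PIN, OUTPUT);
  pinMode(HEATER_LED_PIN, OUTPUT);
  pinMode(PIR_PIN, INPUT);
  dht.begin();
}

void loop() {
  // The pushbutton is read on A0: a value above the threshold means "pressed"
  if (analogRead(A0) > 512) {
    lightOn = !lightOn;
    digitalWrite(LIGHT_LED_PIN, lightOn ? HIGH : LOW);
    delay(300);                                  // crude debounce
  }

  // Read the DHT11 and the PIR every two seconds (the DHT11 is slow)
  static unsigned long lastRead = 0;
  if (millis() - lastRead > 2000) {
    lastRead = millis();
    float temperature = dht.readTemperature();   // °C
    float humidity    = dht.readHumidity();      // %
    bool  motion      = (digitalRead(PIR_PIN) == HIGH);
    Serial.print("T=");        Serial.print(temperature);
    Serial.print("  H=");      Serial.print(humidity);
    Serial.print("  motion="); Serial.println(motion);
  }
}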

Sunday Core 1/2: Mechanical Part

As mentioned previously, the Sunday Core is in charge of recording the voice, detecting the "Sunday" keyword, and sending the recording to the Calculation Unit.


To do that, we connect the microphone module to an amplification circuit, then to a low-pass filter, and finally to the analog pin A0 of the ESP8266. For aesthetic reasons this time, we made a PCB that carries all the components. The PCB was designed in Altium Designer and milled on a CNC machine. We also designed a box in SolidWorks that houses an 18650 lithium cell, a small charging circuit for it, and a voltage regulator providing a stable 3.3 V.

There is also an RGB LED that displays Sunday's state (a small sketch of this logic follows the list):

  • Blue when Sunday is idle (waiting to detect the "Sunday" keyword)
  • Blinking blue when the keyword is detected
  • Red when Sunday is recording the voice
  • Blinking red when Sunday has sent the data and is waiting for the answer
  • Green when the command is received and valid
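
As an illustration, this indication can be implemented with a small helper like the one below. The pin numbers and state names are placeholders, not the exact ones from the Sunday_Master code.

const int RED_PIN   = D7;
const int GREEN_PIN = D8;
const int BLUE_PIN  = D0;

enum SundayState { IDLE, KEYWORD_DETECTED, RECORDING, WAITING_ANSWER, COMMAND_OK };
SundayState currentState = IDLE;

void setColor(bool r, bool g, bool b) {
  digitalWrite(RED_PIN,   r ? HIGH : LOW);
  digitalWrite(GREEN_PIN, g ? HIGH : LOW);
  digitalWrite(BLUE_PIN,  b ? HIGH : LOW);
}

void showState(SundayState state) {
  bool blinkOn = (millis() / 250) % 2;                 // blink phase, toggles every 250 ms
  switch (state) {
    case IDLE:             setColor(false, false, true);    break;  // solid blue
    case KEYWORD_DETECTED: setColor(false, false, blinkOn); break;  // blinking blue
    case RECORDING:        setColor(true,  false, false);   break;  // solid red
    case WAITING_ANSWER:   setColor(blinkOn, false, false); break;  // blinking red
    case COMMAND_OK:       setColor(false, true,  false);   break;  // solid green
  }
}

void setup() {
  pinMode(RED_PIN, OUTPUT);
  pinMode(GREEN_PIN, OUTPUT);
  pinMode(BLUE_PIN, OUTPUT);
}

void loop() {
  showState(currentState);   // currentState is updated by the rest of the firmware
}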

The box was 3D-printed on a Prusa i3 MK3 and laser-cut.

However, we noticed some mistakes on this board! The ESP footprint is not the right one (the real ESP module is bigger than the footprint) and it is upside down! The USB connector also ends up in the wrong place relative to the box.

All the PCB design files and SolidWorks files can be found in the project's GitHub repository, under Sunday_PCB and Sunday_box:

https://github.com/forstark/Sunday.git

Sunday Core 2/2: Code Part

You can find the Sunday Arduino program in the project's GitHub repository, under Sunday_Master:

https://github.com/forstark/Sunday.git

Of course, there is a problem: we cannot sample our audio at 16 kHz, the rate required for good audio quality. Moreover, the sampling rate is not stable: sometimes we measure 11115 Hz, other times 11126 Hz. That may not seem like a problem... but when we use Edge Impulse to train a neural network able to detect the "Sunday" keyword, all the recordings must share the same sampling rate so that the features can be generated, and that was not our case. So we could not train our model, and consequently the keyword detection does not work. A future update will try to solve this problem.

In my opinion, this happens because the ESP8266 is too slow to handle the WiFi management and the data acquisition at the same time.
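
To give an idea of what paced sampling looks like, here is an illustrative snippet that times each analogRead() with micros(). It is not the current Sunday_Master code, and on our hardware the achievable rate stayed around 11 kHz anyway.

const uint32_t SAMPLE_RATE_HZ   = 16000;
const uint32_t SAMPLE_PERIOD_US = 1000000UL / SAMPLE_RATE_HZ;   // 62 µs between samples
const size_t   N_SAMPLES        = 4000;                         // 250 ms of audio (8 KB of RAM)

static uint16_t samples[N_SAMPLES];

void recordChunk() {
  uint32_t nextTick = micros();
  for (size_t i = 0; i < N_SAMPLES; i++) {
    while ((int32_t)(micros() - nextTick) < 0) {
      // busy-wait until the next sample instant
    }
    samples[i] = analogRead(A0);
    nextTick += SAMPLE_PERIOD_US;
  }
  // During this loop the WiFi stack is starved (no yield() call), which is exactly
  // the kind of conflict suspected above; longer recordings would need a timer
  // interrupt or a more powerful microcontroller.
}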

The code starts by configuring the WiFi network and the MQTT connection (which we will cover later).

When the button is pressed, we detect it, and this is how we simulate the keyword detection for now. After that, the ESP connects to the Calculation Unit's server. To work around the sampling problem described above, the Calculation Unit performs the voice recognition itself. Once the recognition is done, we receive the result on the web server embedded in the ESP.

Finally, we process the command, publish the data to the MQTT server, and connect to the module concerned to send it the information directly over the local network, without going through the Internet.
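
As a rough outline of that last step, the direct connection to a module can look like the snippet below (the MQTT publishing side is sketched in the Scaleway step further down). The SSID, the module IP address, and the URL path are placeholders, not values taken from Sunday_Master.

#include <ESP8266WiFi.h>
#include <ESP8266HTTPClient.h>

void setup() {
  Serial.begin(115200);
  WiFi.begin("MY_SSID", "MY_PASSWORD");            // join the local network
  while (WiFi.status() != WL_CONNECTED) delay(500);
}

void sendToModule(const String &action) {
  WiFiClient client;
  HTTPClient http;
  // Each automation module has a fixed IP address known by the Sunday Core
  String url = String("http://192.168.1.50/") + action;
  http.begin(client, url);
  int httpCode = http.GET();                       // e.g. GET /light/on
  Serial.printf("Module answered with HTTP code %d\n", httpCode);
  http.end();
}

void loop() {
  // sendToModule("light/on");   // called once a valid command has been decoded
  delay(1000);
}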

Calculation Unit

To perform the voice recognition, we start by creating a web server in a Python program. Then we detect the command sent by Sunday and, if it is a valid command, we start the voice recognition.

At line 52 of the main code, the number 14 refers to the index of my microphone device. To find yours, you can run the FindMic.py program, which lists all the microphones with their device IDs.

Finally, we split the recognized sentence into a command, which is then sent to the Sunday Core.

We run the script with this command (under Linux):

python main.py

MQTT Server and Scaleway

MQTT & Scaleway

MQTT is a publish/subscribe messaging protocol that lets remote devices communicate via messages. To use this protocol, we rely on Scaleway, a cloud provider, and its managed MQTT message broker. For this project, the messages are exchanged between the two NodeMCUs and the interface of our application.

After creating an account on Scaleway and setting up an IoT Hub, you have to add the devices. There are as many devices as there are clients sending messages to each other via MQTT.

From here you can connect to the MQTT broker, using your Device ID as the username. The data can now be published over MQTT.
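
For illustration, a minimal connection to the Scaleway broker with the PubSubClient library can look like this. The broker address, Device ID, credentials, and topic name below are examples; take the real values from your Hub's page in the Scaleway console.

#include <ESP8266WiFi.h>
#include <PubSubClient.h>

WiFiClient espClient;
PubSubClient mqtt(espClient);

void setup() {
  WiFi.begin("MY_SSID", "MY_PASSWORD");              // local network (placeholder)
  while (WiFi.status() != WL_CONNECTED) delay(500);

  // The Device ID is used as the MQTT username (and here as the client ID too)
  mqtt.setServer("iot.fr-par.scw.cloud", 1883);      // check your Hub's endpoint
  while (!mqtt.connected()) {
    mqtt.connect("my-device-id", "my-device-id", "");
    delay(1000);
  }
}

void loop() {
  mqtt.loop();

  // Publish a temperature reading every 10 seconds (topic name is an example)
  static unsigned long last = 0;
  if (millis() - last > 10000) {
    last = millis();
    float temperature = 21.5;                        // replace with dht.readTemperature()
    mqtt.publish("sunday/temperature", String(temperature, 1).c_str());
  }
}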

Node-RED

To display the sensor data and control the heating remotely, we used Node-RED, deployed as a Scaleway Kickstart instance. First of all, you have to create a Kickstart on your IoT Hub, and once your application is ready, open its dashboard (see the images). For more functionality we installed "node-red-dashboard". You can organize your dashboard as you want; for ours we used:

  • MQTT in nodes to receive the temperature, humidity and motion data
  • an MQTT out node to send the heating commands
  • gauge and graph nodes to display the received data

Finally, to access the application page, you have to go to: http://yourIP:1880/ui

Edge Impulse: Easy Neural Network!

To detect the keyword "Sunday", just as you would say "OK Google" or "Alexa", we use embedded machine learning, and more precisely a deep learning model.

To do that, we connect our ESP board to the platform and start recording a lot of sound. We repeated the word "Sunday" for 10 minutes and recorded 20 minutes of background noise labelled "unknown". We also recorded around 10 minutes of finger snaps, and we collected the same proportions of sound for the test dataset.


Indeed, you need a training dataset to train your model and a test dataset to evaluate it on data the model has never seen, which gives a more reliable estimate of its accuracy.

The pictures show 10-second recordings, which we then split so that each segment corresponds to a moment when we said "Sunday". This gives between 3 and 6 one-second samples per recording.

As I said before, the sampling frequency differs between samples, so we could not go any further. The next step would have been to generate the MFCC features that serve as the input of a neural network, and then to click on Train to start training. On another project, I was able to detect a keyword with 66% accuracy in 36 ms, but with a really weak dataset.
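
For reference, once a model does train, Edge Impulse can export it as an Arduino library, and running it on the device looks roughly like the snippet below. The header name depends on your Edge Impulse project, the feature buffer handling is simplified, and a board more capable than the ESP8266 would likely be needed.

#include <sunday_inferencing.h>   // header generated by the Edge Impulse Arduino export

static float features[EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE];   // one window of audio features

// Callback used by the SDK to read slices of the feature buffer
static int get_feature_data(size_t offset, size_t length, float *out_ptr) {
  memcpy(out_ptr, features + offset, length * sizeof(float));
  return 0;
}

void classifyWindow() {
  signal_t signal;
  signal.total_length = EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE;
  signal.get_data = &get_feature_data;

  ei_impulse_result_t result = { 0 };
  if (run_classifier(&signal, &result, false) == EI_IMPULSE_OK) {
    // Each class ("sunday", "unknown", ...) gets a score between 0 and 1
    for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; i++) {
      Serial.print(result.classification[i].label);
      Serial.print(": ");
      Serial.println(result.classification[i].value);
    }
  }
}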

Conclusion

The project worked well overall, despite the fact that the keyword detection was not functional. To solve this problem, we are going to choose another, more powerful microcontroller able to sample sound from multiple microphone sources; that will be the subject of another, more complete Instructable!

To improve this project, we could add a module system that lets the user register new modules very simply, without having to add lines to the source code. Moreover, we currently use Google's speech API, which is very simple to use but not ideal from a privacy point of view. So for the next version, I could implement a full speech-recognition deep learning model on the Calculation Unit to perform the recognition locally.

We certainly did not think of every conceivable solution, so do not hesitate to tell us what you would have done!

Thank you for reading