Snore-O-Meter: Using AI to Detect Snores

by FabriceA6 in Circuits > Arduino

7824 Views, 28 Favorites, 0 Comments

Snore-O-Meter: Using AI to Detect Snores

Oreiller.png
Ronfleur1.jpg
Ronfleur2.jpg
dBscale.png
snoringcauses.jpg

"The advantage of the snorer is that he sleeps well"... That's what I tried to explain to my wife when she told me that I snore... It didn't make her laugh.

So I decided to gather some facts to help me understand the problem, and maybe find a solution...

Some interesting facts about snoring

  • 25% of people snore on a regular basis, and almost 50% of adults will snore at some point in their lives.
  • Snoring can happen during any stage of sleep.
  • Snoring occurs when your soft palate and uvula vibrate while you’re breathing in (see picture).
  • Snoring is hereditary: 70% of snorers have a familial link.

And finally:

Snoring is the third leading cause of divorce in the US. Fortunately, I don't live in the US...

How disturbing is it?

Whereas the average volume of a snore is around 38 dB, scientific studies have classified snoring sounds as mild (40-50 dB), moderate (50-60 dB), or severe (> 60 dB). Others have measured snoring sounds up to 90 dB. The dB sound scale is logarithmic: adding 3 dB roughly doubles the sound's power, so going from 60 to 90 dB is the same as multiplying the sound power by 1000.
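To make the logarithmic scale concrete, here is a tiny standalone snippet (not part of the project code) converting a dB difference into a power ratio:

```cpp
#include <cassert>
#include <cmath>

// Convert a difference in decibels to a power (intensity) ratio.
// The dB scale is logarithmic: ratio = 10^(dB/10),
// so +3 dB is roughly x2 and +30 dB is exactly x1000.
double dbToPowerRatio(double dbDiff) {
    return std::pow(10.0, dbDiff / 10.0);
}
```

So the jump from a 60 dB snore to a 90 dB one really is a thousandfold increase in sound power.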

Modern dishwashers run around 62 dB, and a hair dryer or a gas-powered lawn mower can reach up to 90 dB. The Boeing 777 has also been measured at this level at take-off...

It is said that the loudest snore ever recorded was 120 decibels, the sound of a jackhammer! At that level, the dB scale says the sound can be harmful after 15 seconds of exposure. The Guinness Book of Records only lists a 93 dB snore.

Proudly introducing the Snore-O-Meter

Imagine you sleep next to a hair dryer! Or imagine me sleeping under an anvil... I had to do something: how could Arduino help me?

People are more likely to snore when sleeping on their back. With gravity, the tongue and soft palate more easily fall to the back of the throat which partially closes the airway. To prevent snoring, it’s best to sleep on the side.

This gave me a first idea: how could I make myself sleep on my side? Maybe some device that forces me to turn over when I snore? But when do I snore?

That was when I thought of a snore sound detector: the Snore-O-Meter (c)! Easier said than done... See below.

The 2 funny pictures are from a French comic book, edited by Glénat.

What Do You Need?

20200404_152628.jpg
ESP32.jpg
max4466.jpg
OLED.jpg
Button.jpg

Obviously, this project deals with real-time sound analysis. An Arduino board such as a Nano or Uno is a bit too small for this task, which requires more memory and higher computing speed. So I decided to use an ESP32, which I'm familiar with (see my other Instructables). It is a dual-core System on Chip (SoC) running at 240 MHz, equipped with BT + WiFi and 4 MB of flash memory.

Bill of material:

  • ESP32 board,
  • 128 x 64 I2C OLED display (make sure it's I2C, only 4 pins: VCC, GND, SDA, SCL),
  • a push button,
  • a MAX4466 microphone with adjustable gain,
  • one small breadboard, some wires.

The overall cost is under 15 USD.

The electrical diagram will remain the same throughout this Instructable:

  • The ESP32 module powers all the components, using the 3V3 pin,
  • Connect the OLED to I2C pins: SDA to GPIO21, and SCL to GPIO22
  • Connect the analog input of the microphone to the GPIO35
  • Connect the push button between GPIO19 and GND

That's all folks!

Sound analysis is done using the Fourier transform. There is a good library available for Arduino and ESP32 boards, made by Enrique Condes. But I wasn't satisfied with it, as it seems to be limited at high frequencies. So I decided to look for another one and adapt it to the ESP32.

First Trials

Beethoven 5eme spectre
Wemos.jpg
Beethoven.jpg
Audacity.png
wemos-d1-mini-pro.jpg
Spectrogram.JPG
Freq.JPG
SpectroZoom.png

First test

My very first idea was to analyse the snoring sounds. The main problem being that I'm usually asleep when I snore, which prevents me from using the ESP32...

So I first decided to experiment with fake snoring sounds, which I could produce while awake. I thus wrote a program to analyze the frequency content of sounds, inspired by this project, written for an ESP8266. I built it with the same display, button and microphone, but programmed it on a Wemos D1 Mini board.

I will present this project in more detail in another Instructable later (I've improved it quite a lot since). You can see in the video that Ludwig van Beethoven ( ^-^ thanks! ^-^ ) gave me a hand with it... Remember that 2020 is the 250th anniversary of his birth.

This first trial wasn't very conclusive: I couldn't find any obvious pattern in the frequency spectrum that would be some kind of signature of a snoring sound.

However, the Journal of Sleep Research has published a paper called "How to measure snoring? A comparison of the microphone, cannula and piezoelectric sensor" in 2015 (ref. J Sleep Res. (2016) 25, 158–168) that indicates that snoring sounds of people from Iceland have a fundamental frequency range mainly in the 70 - 160 Hz range.

This was encouraging: I am not the only one who addresses this problem...

Second test

So I decided to use more appropriate tools.

Audacity is free software that can record and analyze sounds. I installed it on a laptop that would sit beside my bed during the night, and set it up to record all sounds in the bedroom and save them in WAV format.

Audacity can then compute a spectrogram, which displays the sound spectrum as a function of time. The following picture shows one of the recorded spectrograms, centered on some snoring events I spotted (during the night of 22-23 March; what wouldn't we do to keep ourselves busy during lockdown?):

The so-called snoring events are the regularly spaced purple (rectangu-lish) spots on the upper part of the diagram. Here is a zoom on this part of the image:

The period of the snores is around 5 seconds, which must be the duration of a breath when I am in deep sleep.

The good news was that it seemed possible to identify a frequency signature of a snore... The bad news was that it was fundamentally different from the results of the paper cited above: here, the frequency range of interest is between 4000 and 5000 Hz! Other similar measurements I did showed the same kind of result: I have to search for snores in this frequency range. Maybe that's because I don't live in Iceland???
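To see where that 4000-5000 Hz range lands among the analyzer's frequency bands, here is a small illustrative calculation. It assumes the 16 bands evenly cover the 0-20 kHz range set by MAX_FREQ; that even split is my assumption, not something taken from the project code:

```cpp
#include <cassert>

// Assumed layout: 16 bands evenly covering 0 - 20 kHz,
// i.e. each band spans 20000 / 16 = 1250 Hz.
const int BANDS = 16;
const float MAX_FREQ_HZ = 20000.0f;

// Which band (0-based) a given frequency falls into.
int bandOfFrequency(float hz) {
    int band = (int)(hz / (MAX_FREQ_HZ / BANDS));
    return band < BANDS ? band : BANDS - 1;
}
```

Under that assumption, the whole 4000-5000 Hz snore signature sits inside a single band, which suggests the 16-band resolution is just fine for detection.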

Artificial intelligence?

But writing code that does real-time Fourier transforms and analyzes the frequency content over running 5-second windows, without getting confused by other sounds that also happen at night (mainly body movements or tongue clicks) and sometimes share similar frequencies (see the vertical red line at 48:10?), seemed a little too complex to me. And how could I be sure that this frequency range would be the same for someone else?

So I decided to turn my attention to the hot topic at hand: AI. Artificial Intelligence! Everyone's talking about it, everyone's doing it, why wouldn't I?...

How a Neural Network Works

neural-network.png
TINN.png
Tinn_IDE.JPG

So I figured, let's move on. The buzz-word of the moment: Artificial Intelligence! It's exactly the right tool for this kind of need. An AI program learns to recognize a pattern, or better yet, a family of similar patterns. Once the learning phase is done, a second and much simpler program tries to identify the things it learned among the data it receives.

A neural network tries to reproduce the way our brain learns. In short, a NN is made of several layers of so-called 'neurons', each of which processes a small part of the information it receives. The processing results of the first layer are transmitted to the second one for another processing phase, and incrementally the network identifies patterns or similarities among the data it was fed. Roughly speaking, the network is a kind of very big and complex interpolation machine: it uses its training data to find the function that best maps inputs to outputs, and then applies that function to unknown data.
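To make the 'layers of neurons' idea concrete, here is a minimal, self-contained forward pass for a network with one hidden layer. This is an illustration only; TINN's actual code is organized differently:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Classic sigmoid activation: squashes any value into (0, 1).
float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// One-hidden-layer forward pass: each hidden neuron computes a weighted
// sum of the inputs and applies the activation; the output neuron does
// the same over the hidden activations.
float forward(const std::vector<float>& in,
              const std::vector<std::vector<float>>& wHidden, // [nHid][nIn]
              const std::vector<float>& wOut) {               // [nHid]
    float out = 0.0f;
    for (size_t h = 0; h < wHidden.size(); h++) {
        float sum = 0.0f;
        for (size_t i = 0; i < in.size(); i++) sum += wHidden[h][i] * in[i];
        out += wOut[h] * sigmoid(sum);
    }
    return sigmoid(out);
}
```

Training consists of nudging the weights (wHidden, wOut) until this output matches the labels in the dataset; inference is just one call to forward().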

There are many specialized sites and tutorials about neural networks on the web, but really not many examples of this kind of application on the ESP32. Yet the ESP32 is very well suited for it: it is fast and has lots of RAM.

On a PC, the most popular AI framework today is Google's TensorFlow, which can easily be used through the Keras library. Keras provides high-level functions that enable seamless use of TensorFlow and fast experimentation with neural networks. It is written in Python... not suitable for me, as I wanted to code in C on the ESP32.

Only a few projects exist on the ESP32-CAM for face recognition, using MicroPython, but that's about it.

For the sound analyzer I couldn't find an FFT library I liked, so I decided to adapt another one, written in C, which I found on the Internet. I might as well do the same for the AI! So I searched for AI libraries and code written in C, with little or no dependencies, and if possible not too heavy.

There are certainly many candidates, and I haven't tested them all, of course. In the end, I opted for TINN, by Glouw.

TINN (Tiny Neural Network) is a 200 line dependency free neural network library written in C99.

I chose it because it's lightweight, has no dependencies, and the code is easy to understand, so adapting it was easy and the porting was fast. Only one or two functions needed adapting: mainly the random number generator (this is true for all NN libraries) and the functions for reading and writing files.
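For the random number generator, one portable option (my suggestion, not necessarily the exact change made in this port) is a tiny seedable PRNG such as xorshift32, which behaves identically on a PC and on the ESP32, making weight initialization reproducible:

```cpp
#include <cassert>
#include <cstdint>

// xorshift32: a tiny, fast, seedable PRNG one might substitute for rand()
// when porting an NN library across platforms. State must be non-zero.
uint32_t xorshift32(uint32_t& state) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

// Uniform float in [0, 1), e.g. for initializing network weights.
float frand01(uint32_t& state) {
    return (xorshift32(state) >> 8) / 16777216.0f;  // keep 24 bits for the mantissa
}
```

Seeding with a fixed value gives the same initial weights on every run, which is handy when comparing training parameters.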

Create the Dataset

IMG_20200513_145639114.jpg
IMG_20200513_145641982.jpg
IMG_20200513_145644611.jpg
IMG_20200513_145651563.jpg
IMG_20200513_145657456.jpg
IMG_20200513_145703163.jpg
IMG_20200513_145628573.jpg

One drawback of neural networks is that they need a lot of input data to work and learn accurately.

First step: record snore sounds

The NN must practice on the input data, to create its internal image of the data organization and extract or identify the underlying relationships. So I had to create this input dataset.

A recording program was built on the basis of the sound analyzer code (see Step 2); I just needed to make a nice user interface and add the ability to save the recorded data to a file. The ESP32 has an internal file system, called SPIFFS, described here, with an associated API and a powerful library that lets you create, read, write, and delete files. These files are stored in the SPI flash memory, which retains its content when the ESP32 is unpowered.

It is also possible to upload files from your computer to the SPIFFS of your ESP32, for example via the Arduino IDE by using this add-on. Designed by me-no-dev, a great Bulgarian developer of ESP8266 and ESP32 tools, it can simply be used as follows:

  1. Create a folder called data in the current folder of your ESP32 sketch,
  2. Put the files you want to upload to SPIFFS in this folder,
  3. Close the Serial Monitor (very important, otherwise the add-on will not work),
  4. In the Arduino IDE: Tools > ESP32 Sketch Data Upload,
  5. If necessary, push the button(s) on your ESP32 module, just as you do when uploading a sketch.

Unfortunately, there is no easy way to extract a file from the ESP32's SPIFFS. So I wrote a simple sketch that prints the content of all the files in the SPIFFS to the Serial Monitor (they must be ASCII-encoded, not binary files). I then just select the parts I want to keep in the monitor and copy/paste them into a file using a text editor (Notepad++ is great for that).

This sketch is called Dump_SPIFFS.ino (see below). It's useful if you want to save your dataset on your computer once you have created it.

How to record yourself?

Create a folder called Acquisition_ESP32, and put the following files in it: Acquisition_ESP32.ino, functions.h, params.h

Load the sketch called Acquisition_ESP32.ino on your ESP32 and let it run. The sketch first looks for an existing data file in the ESP32's SPIFFS. If you do not want to keep that file, push the button within 3 seconds to erase it; otherwise, the new recordings will be appended to it.

Then the sketch runs looping acquisition phases. To start such a phase, just push the button.

An acquisition phase is made of 2 parts: record snoring sounds, then record silence. The second part is as important as the first one, as it will enable the ESP32 to learn to tell snoring sounds from the background sound. If you intend to use the detector at night, make sure to record a very quiet background for this part.

The 2 parts last 3 seconds each, with a 3-second pause in between. Each part saves 10 spectra of 16 frequency bands plus one sound volume value, so a full acquisition phase yields 20 * (16 + 1) = 340 values.
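Those figures can be checked with a one-liner; this little helper is hypothetical (not part of the sketches) and just spells out the arithmetic:

```cpp
#include <cassert>

// Size of one acquisition phase: two 3-second parts (snoring + silence),
// 10 spectra each, every spectrum holding BANDS magnitudes plus one
// sound volume value.
const int BANDS = 16;
const int SPECTRA_PER_PART = 10;
const int PARTS = 2;

int valuesPerAcquisition() {
    return PARTS * SPECTRA_PER_PART * (BANDS + 1);
}
```

Note that if you raise BANDS to 32 in params.h, each phase grows to 2 * 10 * 33 = 660 values, which is why the dataset and the training time grow with it.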

The OLED screen displays all instructions: push button to record (with the number of recording phases so far), make snoring sound, remain silent, etc.

When you decide to stop recording, just unplug the ESP32.

Change the parameters

During the recording phases, the OLED screen will display the variation of the sound spectrum. As I said above, the sound spectrum is divided into 16 sub-bands. This can be changed in the params.h file:

// Frequency bands
#define BANDS 16

However, this number must be a power of 2, as required by the FFT functions. So you can increase it to 32 for more accurate sampling, at the cost of a larger dataset and a longer learning phase.
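A guard like the following (my addition, not present in params.h) could catch an invalid BANDS value at a glance, using the classic bit trick:

```cpp
#include <cassert>

// A positive integer n is a power of two exactly when it has a single
// bit set, i.e. when n & (n - 1) clears to zero.
bool isPowerOfTwo(int n) {
    return n > 0 && (n & (n - 1)) == 0;
}
```

You could wrap this in a static check or an early setup() test so a bad BANDS value fails loudly instead of producing garbled spectra.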

I recommend running at least 20 acquisition phases, to provide a large dataset to the learning program.

Run the Network

IMG_20200513_170130136.jpg
IMG_20200513_170030863.jpg
IMG_20200513_170033046.jpg
IMG_20200513_170035335.jpg
IMG_20200513_170041772.jpg
IMG_20200513_170049195.jpg
IMG_20200513_170052050.jpg
IMG_20200513_170057990.jpg
IMG_20200513_170104361.jpg

Once the dataset is in the SPIFFS, the ESP32 needs to train on it, in order to learn to distinguish snoring sounds from other sounds. This is the objective of this second program.

Now create a second folder in your Arduino folder, named Learning_ESP32, and put the following files inside: Learning_ESP32.ino, params.h, Tinn.h, init.h, train_test.h, and sound_functions.h. Also create a data subfolder, which will be used if you want to upload files to your ESP32.

Make sure that the FFT parameters are the same as those used in the recording phase. Both params.h files must have the same following lines:

// FFT parameters
#define SAMPLES 256
#define MAX_FREQ 20 // kHz

Train, baby, train...

The training phase is when the neural network uses the dataset you provided to optimize its parameters, fitting the complex function that interpolates among all the samples in the dataset. This is an iterative process, initialized at random, so you may get different results each time you run it. If you are not satisfied with the current results, just run it again and you may get better ones...
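The learning rate is also annealed during this iterative process: the ANNEAL parameter in params.h shrinks the rate a little after each epoch, so the network takes smaller steps as it converges. The schedule itself is easy to see in a standalone sketch (illustrative; the actual update lives inside the training code):

```cpp
#include <cassert>
#include <cmath>

// Annealed learning-rate schedule: multiply the rate by 'anneal'
// once per epoch. With LR = 1.0 and ANNEAL = 0.9999, 6000 epochs
// leave the rate at about 0.55.
double annealedRate(double initial, double anneal, int epochs) {
    double lr = initial;
    for (int e = 0; e < epochs; e++) lr *= anneal;
    return lr;
}
```

So by the end of the default 6000 epochs, the network is adjusting its weights at roughly half its initial step size, which helps the error settle instead of oscillating.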

Upload the Learning_ESP32.ino sketch on your ESP32 and let it run. The OLED screen displays all the instructions.

It first looks for a saved network file. Of course, there is no such file the first time you run this sketch. Otherwise, push the button (within 3 seconds) to load the saved network and skip the learning phase.

If you didn't push the button, the code launches the learning phase. This is an automatic process, made of the following steps:

  • Read the dataset: the screen displays the summary of the content of the dataset file,
  • Create a network and train it on the dataset: the screen displays a progress bar,
  • When the training results are satisfactory (depending on the parameters), the screen displays the learning results (number of errors, error rates),
  • The network is saved in the SPIFFS, for later use.
  • Then the program goes into the inference phase.

The first three parts are done in the setup; the loop takes care of the last one.

Changing the learning parameters

The neural network's parameters are in the params.h file:

#define RATIO 0.8f         // ratio of training data vs. testing
#define EPOCHS 6000        // number of training epochs
#define NHID 40            // number of hidden neurons
#define ACTIVATION RELU    // chosen activation function of hidden layer
#define BATCH 50           // number of data used for training in each epoch
#define LR 1.0f            // initial learning rate
#define ANNEAL 0.9999f     // rate of change of learning rate
#define MAXERR 0.0005f     // stop training if error is less than this
#define DETECT 0.9f        // detection threshold

You can play with them to get better results. Keep RATIO greater than or equal to 0.7.
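For instance, RATIO splits the dataset into a training part and a testing part; the split can be sketched like this (a hypothetical helper, not the library's code):

```cpp
#include <cassert>

// With RATIO = 0.8 and 100 recorded samples, 80 train the network and
// the remaining 20 are held back to measure how well it generalizes.
// Rounded to the nearest integer to avoid float truncation surprises.
int trainingCount(int total, float ratio) {
    return (int)(total * ratio + 0.5f);
}
```

The held-back testing samples are what produce the error rates displayed at the end of training; a RATIO below 0.7 leaves too little data for the network to learn from.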

The DETECT parameter is used for the "inference phase".

Inference?

Inference is the phase where the capabilities learned during training are put to work. This is presented in the figure below.

In short, during the inference phase, the ESP32 listens and tries to detect a snore from the surrounding sounds. It records and analyses the sound, displays the spectrum on the OLED screen, and whenever the sound looks like the image of a snore it built during the learning phase, it displays an alert message.
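The decision itself is simple: the network's output for the current spectrum is compared against the DETECT threshold. Something along these lines (illustrative, not the actual sketch code):

```cpp
#include <cassert>

// The network outputs a value between 0 and 1; DETECT = 0.9 (from
// params.h) keeps false alarms low at the cost of possibly missing
// quieter or atypical snores.
const float DETECT = 0.9f;

bool isSnore(float networkOutput) {
    return networkOutput >= DETECT;
}
```

Lowering DETECT makes the Snore-O-Meter more sensitive but also more likely to flag tongue clicks and body movements as snores.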

The file Data.txt contains the dataset I recorded; I've included it so you can see what the data looks like.

What Else?

Terminasnore.jpg

The rest depends on what you want to do.

Note that the Snore-O-Meter is an "autonomous" device: you don't need your PC to make it work. All the instructions and information are displayed on the OLED screen. You can just plug the ESP32 into a USB phone charger and let it run...

Monitor with Thingspeak

One great possibility is to monitor your sleep and detect snoring phases. With the help of ThingSpeak, it's quite simple to store snoring data (such as detection times) and visualize your snoring activity on a bar graph. ThingSpeak automatically adds a timestamp to the data you upload to your account.

So just install the ThingSpeak library for Arduino from here (there are a few examples for ESP32), then create an account and a dedicated channel. Each time the inference detects a snoring sound, just send a '1' to your channel, and you're done! Some basic instructions are available here. Below is an example:

// Write to ThingSpeak. There are up to 8 fields in a channel, allowing you to store up to 
// 8 different pieces of information in a channel. 
// Here, we write to field 1 the value 1 : writeField(Channel, Field, Value, APIKey).
int x = ThingSpeak.writeField(myChannelNumber, 1, 1, myWriteAPIKey); 
if(x == 200) Serial.println("Channel update successful.");
else Serial.println("Problem updating channel. HTTP error code " + String(x));

As you cannot send data more often than once every 15 seconds, remember to wait that long before doing another inference.
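A simple way to enforce that delay is a millis()-style rate limiter. This version is my own (hypothetical) helper; it takes the clock as a parameter so the logic can be tested off-device, whereas on the ESP32 you would pass millis():

```cpp
#include <cassert>
#include <cstdint>

// ThingSpeak rejects updates sent less than 15 s apart, so gate each
// write on the elapsed time. Timestamps are in milliseconds, as
// millis() would return on the ESP32.
const uint32_t MIN_INTERVAL_MS = 15000;

// Returns true (and records the timestamp) only when enough time has
// passed since the last successful upload.
bool canUpload(uint32_t nowMs, uint32_t& lastUploadMs) {
    if (nowMs - lastUploadMs < MIN_INTERVAL_MS) return false;
    lastUploadMs = nowMs;
    return true;
}
```

In the loop, you would call `if (canUpload(millis(), lastUpload)) ThingSpeak.writeField(...);` so detections arriving too quickly are simply skipped instead of triggering HTTP errors.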

Then plot your data using the bar graph template.

Introducing 'Termina-Snore'

Back to the original idea: how to stop snoring? Well... your imagination is the limit. You can add a simple piezo buzzer to your breadboard to play annoying sounds when a snore is detected. Hopefully, this will disturb you enough to make you roll onto your side and stop snoring. But it may as well disturb whoever shares your bedroom...

To do this, you need a piezo buzzer such as the one above and add a few lines of code in the Learning_ESP32 file...

First insert these lines before the setup, in the definition block:

const int buzzerPin = 18;  // put here the GPIO number for the buzzer
int channel = 0;
int resolution = 8;

The ESP32 handles PWM a bit differently from other Arduino boards: here we define the GPIO number connected to the buzzer's input, and choose a channel number and its resolution (in bits) for the PWM.

Then, place these lines at the beginning of the setup:

  ledcAttachPin(buzzerPin, channel);
  ledcSetup(channel, 255, resolution);
  int volume = 10;  // 0 < volume < 255
  ledcWrite(channel, volume);

This links the pin number to the channel, and defines the channel's properties.

Finally, create a function that will play a sound:

void playSound () {
  for (int freq = 300; freq < 6000; freq = freq + 50) {
    ledcWriteTone(channel, freq);
    delay(5);
  }
}

This function can be placed before the setup; modify it if you like. Then, all you have to do is add the following line in the loop:

playSound ();

just in between these 2 lines:

display.display();
delay(400);

Another possibility is to use a DFPlayer MP3 module that plays a specific tune, which you will unconsciously learn as a cue to move onto your side... Less disturbing, but maybe less effective.

Why not make a robotic arm, triggered by the ESP32, to slap your face when the snore is detected?

You'll call that TERMINA-SNORE... OK, maybe that's a little too complicated.

Disclaimer: This was done only for fun. I only did this to show that it is possible to run artificial intelligence applications on an ESP32. It worked quite nicely for me, but I can't guarantee it will work for anybody else.

I am currently working on a more user friendly way of running AI on the ESP32, namely a library. This will enable everyone to create their own AI applications for their own purpose.

Have fun anyway... and share your results.