Make Lights React to Audio
The goal of this project is to build an LED matrix that reacts to sound, not only by detecting the amplitude, but also by controlling animations based on the frequency spectrum and the beats of a song.
Some of the finished animations are straightforward, and you can see a direct connection between audio and visuals, like the audio spectrum monitor in the video. Other animations are more ambiguous: for example, on each beat some LEDs light up and/or move and change their color. The presented system gives you a lot of freedom in what should happen.
For this, an audio signal is pre-processed by a filter and then processed by a microcontroller. The controller calculates the intensity of the frequency spectrum with the Fast Fourier Transform (FFT) and performs a simple beat detection. This data gives us insight into the amplitude and frequency spectrum of the audio signal and the beats of the song. Since a controller does the calculations, this project is self-sufficient and independent of e.g. a PC.
In the following steps I'll explain the theory and thought process behind this project.
Concept and Theory
Before starting a project you should make a rough concept of what it should be able to do. The basic idea is simple: a stereo audio signal is read by a controller, which then does some math and outputs a pattern on an LED matrix. Additionally, there should be a display for feedback and some way to control the system.
Hardware concept
The controller for this project needs to be able to read 2 analog values at once, which means that it needs 2 ADC units. It should also support DMA to make the data acquisition independent of the CPU. The DMA also helps to output the data to the LED drivers. While a Cortex-M4 controller would give more computing power, I decided to use an ATxmega128A3U controller, because they are cheaper and easier to obtain (they are sold with a pre-programmed bootloader on the German eBay page for people who don't have a programmer) while still providing all required features.
The audio signal needs to be amplified and brought to a level that the controller can read properly. Additionally, the signal should be filtered to reduce calculation errors in the FFT. While not absolutely necessary, an AGC (automatic gain control) was added. This serves as protection for the audio input stage and smooths extreme changes in the signal level.
For the LEDs I chose WS2812 LEDs, mainly because I still had some of them, but also because they are convenient to use. The LEDs were placed behind ping-pong balls, which serve as diffusers for the light. The LEDs were arranged as a 7x6 matrix.
Lastly I added an I2C-based OLED display and a rotary encoder with button as input-output interface for the user. This allows us to easily debug the program and implement a simple menu to control the finished project.
This setup allows us to sample the audio data, process it and make a nice animation on the LED matrix.
In the picture you can also see an audio amplifier with a speaker. It is used to listen to the signal that is sampled by the controller. It won't be used in the final version of the project.
The basic idea of functionality
Real-time audio analysis uses a lot of resources, which the chosen controller doesn't have. With the current system it's possible to sample data and do all the calculations 20 times per second. While this is enough to make certain reactions look quite good, more complex animations would look choppy if they were presented at 20 frames per second.
The controller refreshes the display at a rate of 60 frames per second. This makes animations look very smooth, especially if you fade the color. The animation system gets fed with new data 20 times per second and uses this data as a template for the animation. While this approach is far from being a real-time reaction to the music, it's enough to trick human perception into seeing an immediate reaction.
Hardware
You can find the schematics as a KiCad project on GitHub.
While the schematics look overwhelming at first glance, they are actually quite simple. There are 2 audio input stages, one for the left channel and one for the right channel. Each input stage consists of a voltage divider, an AGC and a filter. The voltage divider serves as protection, because it has a high resistance and lowers the voltage level of the signal. The AGC is a second form of protection, but its main purpose is to even out extreme changes in the amplitude of the signal. The filter is a bandpass that lets frequencies between 16Hz and 16kHz through. This is roughly the frequency range an adult human can hear. While not strictly necessary, a filter should always be used to reduce aliasing and the resulting errors in the FFT.
The rest of the schematics are the default circuits needed to get the controller to work. The power supply has several inductors to filter and smooth the ripple of the supply voltage. This increases the quality of the analog signals. This input stage was taken from this project.
The amplifier between filter and controller (U5) was added in this prototype just in case the signal was too weak. It turned out that this amp stage isn't needed, although it still stabilizes the signal. The connector for the stereo potentiometer can be shorted between pins 1-3 and 2-4.
Software
You can find the software as an Atmel Studio project on GitHub.
There are 2 projects in this Git repository. The Basic project includes all basic functions to measure and calculate the data. It also includes a class for WS2812 LEDs. The Matrixproject includes all data for my project based on a 7x6 LED matrix with WS2812 LEDs. As this is very specific, this code is more of an inspiration and example for what you can do.
This part is the most complex of this project as the software needs to do a lot of things.
Sampling the Data
First you need to get the data into the controller. This is done by sampling the audio signal with an ADC. The sampling frequency depends on the frequency range you want to work with. A sampled signal can only represent frequencies up to half the sampling rate, so if you sample the signal at 32kHz you can recreate signals up to 16kHz with the Fourier transform (FT). This is the frequency range we want to work with.
To make this work, a timer is set to trigger with a frequency of 32kHz, and a value is read from the ADC each time this happens. Since the ATxmega controller has a DMA controller, this is used to automate the process. DMA (Direct Memory Access) allows you to move data from one place in memory to another completely in hardware. The DMA is set up so that it moves the result from the ADC to a specific place in RAM whenever it's triggered by the timer. We sample 128 values per channel.
Fast Fourier Transform (FFT)
We use the elmchan FFT library for this project.
Calculating a Fourier transform directly takes a lot of calculation time. The FFT is an algorithm that produces the same result much faster. Because we only feed it a limited number of samples, it doesn't give you individual frequencies but frequency bands. For the purpose of this project this data is good enough.
Since we sample 128 values, the FFT gives us 64 results. Because of the sampling rate (32kHz) we can resolve frequencies up to 16kHz, which the FFT divides into 64 bands: 16000 / 64 = 250. This means each FFT band describes the level of a range roughly 250Hz wide. FFT band 1 gives a value for roughly 1Hz to 250Hz, band 2 for 251Hz to 500Hz, band 3 for 501Hz to 750Hz and so on. By increasing the number of input samples you could increase this resolution, but this would also increase the calculation time.
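The band arithmetic above can be written down as a small sketch. The function names are made up for this example and are not part of the FFT library. Note that the code counts bands from 0, while the text counts from 1.

```c
#include <stdint.h>

/* Width of one FFT band in Hz: sample rate divided by the number of samples.
 * For 32000 Hz and 128 samples this is 250 Hz. */
static uint32_t fft_band_width_hz(uint32_t sample_rate_hz, uint32_t num_samples)
{
    return sample_rate_hz / num_samples;
}

/* 0-based index of the band a frequency falls into, using the simplified
 * "each band is one 250 Hz slice" view from the text.
 * Frequencies at or above the upper limit are clamped to the last band. */
static uint32_t fft_band_for_freq(uint32_t freq_hz, uint32_t sample_rate_hz,
                                  uint32_t num_samples)
{
    uint32_t band = freq_hz / fft_band_width_hz(sample_rate_hz, num_samples);
    uint32_t last = num_samples / 2 - 1;
    return band > last ? last : band;
}
```

With these helpers, a 440Hz tone sampled at 32kHz with 128 samples ends up in band index 1, which is the second band (251Hz to 500Hz) in the numbering used above.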
Those 64 bands are consolidated into 7 bands for the matrix, which has a width of 7 LEDs. Some effects are still based on the original 64 bands.
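One possible way to do this consolidation is sketched below. The band edges are an assumption chosen for this example (each group is twice as wide as the previous one), and averaging is only one option. The grouping used in the actual project may differ. FFT_BANDS is a hypothetical name, while ANIM_BAND_NUM is taken from the project.

```c
#include <stdint.h>

#define FFT_BANDS     64  /* hypothetical name: number of FFT results */
#define ANIM_BAND_NUM 7   /* number of columns of the matrix */

/* Assumed band edges: group b covers FFT bands band_edges[b] .. band_edges[b+1]-1. */
static const uint8_t band_edges[ANIM_BAND_NUM + 1] = {0, 1, 2, 4, 8, 16, 32, 64};

/* Average the FFT bands of each group into one consolidated band. */
static void consolidate_bands(const uint16_t spectrum[FFT_BANDS],
                              uint16_t out[ANIM_BAND_NUM])
{
    for (uint8_t b = 0; b < ANIM_BAND_NUM; b++) {
        uint32_t sum = 0;
        for (uint8_t i = band_edges[b]; i < band_edges[b + 1]; i++) {
            sum += spectrum[i];
        }
        out[b] = (uint16_t)(sum / (uint32_t)(band_edges[b + 1] - band_edges[b]));
    }
}
```

Widening groups like this give the low frequencies more columns than a plain split into equal parts would, but you should tune the edges to what looks good on your matrix.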
The algorithm to calculate the FFT is coded so that it doesn't do all the calculation steps at once, but splits them into smaller pieces. This is done to give the controller time to calculate other things in between those steps. If you don't do this, the animations will sometimes stutter, which doesn't look good.
The human ear reacts differently to different frequencies. It's very sensitive to frequencies around 4kHz and much less sensitive to very low and very high frequencies. In an audio signal the low frequencies tend to have a high amplitude, which leads to high FFT results, while the middle frequencies have low amplitudes, which leads to low FFT results. This makes the values hard to do calculations with. To counter this, a very simple reverse A-weighting system is used. The coefficients are pre-calculated for each of the 64 FFT bands, and each band is multiplied by its coefficient.
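The multiplication step could look roughly like this. It is only a sketch: the fixed-point format (256 means a factor of 1.0) and the names are assumptions for this example, and the real coefficient values from the project are not shown here.

```c
#include <stdint.h>

#define FFT_BANDS 64  /* hypothetical name: number of FFT results */

/* Multiply every FFT band by its pre-calculated coefficient.
 * Coefficients are assumed to be 8.8 fixed point: 256 == 1.0, 128 == 0.5.
 * The result saturates at the maximum of uint16_t instead of overflowing. */
static void apply_weighting(uint16_t spectrum[FFT_BANDS],
                            const uint16_t coeff_q8[FFT_BANDS])
{
    for (uint8_t i = 0; i < FFT_BANDS; i++) {
        uint32_t scaled = ((uint32_t)spectrum[i] * coeff_q8[i]) >> 8;
        spectrum[i] = scaled > UINT16_MAX ? UINT16_MAX : (uint16_t)scaled;
    }
}
```

Fixed-point coefficients avoid floating-point math, which is slow on a small 8-bit controller.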
Beat Detection
The basic idea behind this is simple: average the power of the sound level. If the current power level is way higher than this average, you have a beat. To do this properly you would need to sample the audio signal continuously, which we don't do. But we'll still use this idea.
The FFT results are split into 3 bands: low, medium and high. Those values are averaged over time as a moving average. If the current level is higher than 50% to 100% (depending on the band) of this average, it's evaluated as a beat. This works very well with a metronome, which produces a simple sound, but doesn't work too well with complex signals like songs. The beat detection in the lower band works the best and the middle band the worst when there are vocals in the song.
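A minimal sketch of this idea for one band is shown below. It is not the code from the project. It assumes that a beat means "the level is more than threshold_pct percent above the average", and it uses a weighted moving average over roughly 16 values. Both are assumptions for this example, as are all names.

```c
#include <stdint.h>
#include <stdbool.h>

/* State of the beat detection for one band (low, medium or high). */
typedef struct {
    uint16_t average;        /* running average of the band level */
    uint8_t  threshold_pct;  /* assumed meaning: percent above average, e.g. 50..100 */
} beat_band_t;

/* Feed the current level of the band. Returns true if it counts as a beat.
 * The average is updated after the comparison, so a beat is always judged
 * against the history before it. */
static bool beat_update(beat_band_t *band, uint16_t level)
{
    uint32_t limit = (uint32_t)band->average * (100u + band->threshold_pct) / 100u;
    bool beat = level > limit;

    /* weighted moving average: new values count for 1/16 */
    band->average = (uint16_t)(((uint32_t)band->average * 15u + level) / 16u);
    return beat;
}
```

You would keep one beat_band_t per band and call beat_update() once per FFT run, so about 20 times per second with the timing described earlier.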
What to do with that data?
As an example I'll explain how the mono spectrum display works. You can find the code in the complete project in animations.cpp in the function void anim_monospektrum_step(). This animation is used in the intro video.
The values of the 7 consolidated bands are set as target values for the height of the columns. They are scaled so that each value ranges between 0 and 9000. The target color offset is set by the amplitude. The actual values, which are used to calculate the output, slowly change to match the target values. This makes the animation look very fluid and prevents sudden jumps.
The matrix has a height of 6 LEDs. This means each LED covers 9000 / 6 = 1500 counts of the actual value. If the actual value of a band is 1500, only the bottom LED lights up. If the value is 3000, the bottom two LEDs light up. As long as the actual value is greater than 1500, we light up one more LED and subtract 1500 from the value. If the value is not 0 after this process, the LED above the last fully lit LED is lit as well, but its brightness is scaled with the remaining value. The lower the remaining value, the lower the brightness.
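The column logic can be sketched like this. It is a simplified stand-in for what anim_monospektrum_step() does, not a copy of it. The names and the 0 to 255 brightness range are assumptions for this example.

```c
#include <stdint.h>

#define MATRIX_HEIGHT  6
#define COUNTS_PER_LED 1500  /* 9000 / 6 */

/* Turn the actual value of one band (0..9000) into a brightness per LED.
 * brightness[0] is the bottom LED of the column, 255 means fully lit. */
static void column_brightness(uint16_t value, uint8_t brightness[MATRIX_HEIGHT])
{
    for (uint8_t row = 0; row < MATRIX_HEIGHT; row++) {
        if (value >= COUNTS_PER_LED) {
            /* enough counts left for a fully lit LED */
            brightness[row] = 255;
            value -= COUNTS_PER_LED;
        } else {
            /* remainder: scale the brightness, everything above stays dark */
            brightness[row] = (uint8_t)((uint32_t)value * 255u / COUNTS_PER_LED);
            value = 0;
        }
    }
}
```

For example, a value of 2250 lights the bottom LED fully and the second LED at about half brightness.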
As an additional detail the bands will flash when a beat within this band is detected.
Simplified Version and Getting Started
The most complex part of the hardware is the audio input stage. If you don't want to use a stereo signal, but a mono signal, you can simply leave out one of the input stages. If you want it even simpler, you could just use a very simple input stage, which has a high resistance and only a filter. You may need to add another amplifier stage to this.
If you are using an Arduino, it's not possible to copy and paste the provided source code, since the Arduino board isn't based on an ATxmega controller.
If you want to use this project with another controller, you need to adapt the code yourself. The following steps should give you an idea on what to do:
The first thing you need to do is sample the audio signal. We need 128 data points and want a sample frequency of about 32kHz. The simplest way to do this is to make a loop, which reads the ADC and then pauses for about 30µs. One sample period at 32kHz is 1 / 32000 s, which is about 31µs, so the 30µs delay in combination with the time needed to read the data should give a roughly accurate sample frequency.
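A sketch of such a sampling loop is shown below. The two helper functions are placeholders so the example can run on a PC: read_adc() just returns a counter and delay_us() only adds up the requested time. On your controller you would replace them with the real calls (on an Arduino that would be analogRead() and delayMicroseconds()). All names here are assumptions for this example.

```c
#include <stdint.h>

#define NUM_SAMPLES 128

/* Placeholder hardware access, replace with the calls of your platform. */
static uint16_t fake_adc_value;  /* stand-in for the ADC result */
static uint32_t elapsed_us;      /* total delay requested so far */

static uint16_t read_adc(void)
{
    return fake_adc_value++;
}

static void delay_us(uint16_t us)
{
    elapsed_us += us;
}

/* Read NUM_SAMPLES values and pause after each one.
 * With pause_us = 30 plus the time the ADC read itself takes, one pass
 * through the loop should take roughly one 32 kHz sample period. */
static void sample_block(uint16_t buf[NUM_SAMPLES], uint16_t pause_us)
{
    for (uint8_t i = 0; i < NUM_SAMPLES; i++) {
        buf[i] = read_adc();
        delay_us(pause_us);
    }
}
```

If the ADC read on your board takes a significant part of the sample period, shorten the pause accordingly or accept a lower sample rate and a smaller frequency range.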
The next step is the FFT. Some nice guy put the FFT library used in this project in a convenient library for Arduino. This comes with an example, which explains how you use it. The example uses the free-run mode of the ADC, which unfortunately doesn't run at 30kHz. This isn't necessarily a bad thing, but your frequency range will be affected and your FFT buckets will have a different resolution. You can of course also check the official project of the FFT library.
If you want to implement a beat detection, just take a look at the explanation in the software part or the code of my project. This is basic math and can be copy/pasted.
What happens after that is up to your imagination. The most used methods to make sweet animations, color fading or whatever you want to do are these:
- target/actual value: The target value is derived from the FFT data. The actual value slowly changes until it reaches the target value.
- moving average: You remember the last X values. Add those values up and divide them by X. This gives you the moving average.
- weighted moving average: This is similar to the moving average, but newer values have a higher influence on the result than older values (strictly speaking it's an exponential moving average). value = ((value * (NUM - 1)) + new_value) / NUM. NUM can be freely chosen; a higher NUM means the value adapts slower/smoother.
Those methods allow you to make fluid looking animations. If you take the data directly from the FFT, you may see sudden jumps in e.g. your color fading. I use the actual/target value system and weighted moving average most of the time.
A very simple example of how to use these formulas is this line of code: color = ((color * 15) + fft_bucket_h_l) / 16;
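The three methods from the list can be sketched in a few lines each. The function and type names are invented for this example and do not come from the project code.

```c
#include <stdint.h>

/* target/actual value: move the actual value towards the target,
 * but by at most `step` per call. */
static uint16_t approach(uint16_t actual, uint16_t target, uint16_t step)
{
    if (actual < target) {
        return (uint16_t)((target - actual > step) ? actual + step : target);
    }
    return (uint16_t)((actual - target > step) ? actual - step : target);
}

/* moving average over the last AVG_LEN values */
#define AVG_LEN 4

typedef struct {
    uint16_t values[AVG_LEN];  /* ring buffer with the last values */
    uint8_t  next;             /* position of the oldest value */
} moving_avg_t;

static uint16_t moving_avg_add(moving_avg_t *m, uint16_t new_value)
{
    uint32_t sum = 0;
    m->values[m->next] = new_value;  /* overwrite the oldest value */
    m->next = (uint8_t)((m->next + 1) % AVG_LEN);
    for (uint8_t i = 0; i < AVG_LEN; i++) {
        sum += m->values[i];
    }
    return (uint16_t)(sum / AVG_LEN);
}

/* weighted moving average: value = ((value * (NUM - 1)) + new_value) / NUM */
static uint16_t weighted_avg(uint16_t value, uint16_t new_value, uint16_t num)
{
    return (uint16_t)(((uint32_t)value * (num - 1u) + new_value) / num);
}
```

The weighted moving average only needs to remember one value, which is why it is handy on a controller with little RAM. The plain moving average reads too low until its buffer has been filled once.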
It calculates a weighted moving average of the FFT bucket with the highest value on the left channel. This value can then be used to set the color of an RGB LED. Congratulations! You just made an LED, which changes its color depending on the most dominant frequency.
At first glance all this math might look overwhelming, but luckily all the hard work is already done. If you spend some time to work your way into this project you'll see that you can make awesome effects with some simple math you've learned at school.
Important data for you to use
The FFT data can be accessed via the fft class like this:
fft_result_t *fft_left = fft.getLeft();    // channel 1
fft_result_t *fft_right = fft.getRight();  // channel 2
The result structure looks like this:
typedef struct {
    uint16_t spectrum[FFT_N / 2];
    uint16_t adc_min, adc_max;
} fft_result_t;
spectrum is an array of 64 elements, which hold the result from the FFT. adc_min and adc_max are the minimum and maximum values of the signal.
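As a small example of working with this structure, the following sketch searches the spectrum for the bucket with the highest value. The project already provides this number in the fft_bucket_h_l and fft_bucket_h_r variables, so the function here (with a made-up name) only illustrates how the data can be read. The structure is repeated so the example is self-contained, with FFT_N set to 128 to match the 128 samples.

```c
#include <stdint.h>

#define FFT_N 128  /* number of samples, gives 64 spectrum entries */

typedef struct {
    uint16_t spectrum[FFT_N / 2];
    uint16_t adc_min, adc_max;
} fft_result_t;

/* Return the index (0 ... 63) of the bucket with the highest value.
 * If several buckets share the highest value, the lowest index wins. */
static uint8_t highest_bucket(const fft_result_t *result)
{
    uint8_t best = 0;
    for (uint8_t i = 1; i < FFT_N / 2; i++) {
        if (result->spectrum[i] > result->spectrum[best]) {
            best = i;
        }
    }
    return best;
}
```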
Within animation.cpp, where you should place your animation code, you have access to the following global variables:
uint16_t bands_l[ANIM_BAND_NUM], bands_r[ANIM_BAND_NUM]; arrays which hold the data of the 7 condensed bands
uint16_t amplitude_l, amplitude_r; amplitude, derived from the adc_min and adc_max values
uint8_t beats, bpm_h, bpm_m, bpm_l, bpm_all; beats contains a bitmask, which lets you check if there was a beat in a specific band. The other variables give you the beats per minute for the high, mid, low and all (any) bands
uint8_t fft_bucket_h_l, fft_bucket_h_r, fft_bucket_l_l, fft_bucket_l_r; number (0 ... 63) of the FFT bucket with the highest and lowest value
For all variables _l refers to the left channel and _r to the right.
Conclusion
While this system, especially if you make your own version with an Arduino, doesn't seem very powerful, it allows you to make neat looking lights and animations, which react to sound. The "advanced" version based on the ATxmega controller offers a lot of power, thanks to the DMA options. In contrast to most other projects of this kind, you have many options for how the lights should react. It's also completely independent of other devices. It only needs a power supply and the audio signal as an input and can control different devices.
This project is focused on the output on an LED matrix, but with the help of an FFT you can also realize way different projects. A different idea would be to make a lock, which opens when you play a certain melody. For this the melody needs to be very simple and the frequencies of the tones should not be too close together, so you can make a better distinction between them. Think of something like the ocarina songs in Zelda. You should also increase the number of samples for the FFT, to increase the frequency resolution. Then you need to look for a certain pattern and when this is right you could move a servo to open/close a lock.
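To give a rough idea of the pattern matching for such a lock, here is a sketch of a tiny state machine. It assumes you feed it the number of the strongest FFT bucket once per FFT run. Everything in it (names, structure, the way repeated values are ignored) is an assumption for this example and not part of the project. As written it cannot handle melodies where the same tone appears twice in a row.

```c
#include <stdint.h>
#include <stdbool.h>

/* State of the melody lock. */
typedef struct {
    const uint8_t *pattern;  /* expected sequence of FFT bucket numbers */
    uint8_t length;          /* number of tones in the pattern */
    uint8_t pos;             /* how many tones have matched so far */
} melody_lock_t;

/* Feed the strongest bucket of the current FFT run.
 * Returns true once the complete pattern has been played. */
static bool melody_feed(melody_lock_t *lock, uint8_t bucket)
{
    /* the previous tone is still sounding: nothing to do */
    if (lock->pos > 0 && bucket == lock->pattern[lock->pos - 1]) {
        return false;
    }
    /* the next expected tone was played */
    if (bucket == lock->pattern[lock->pos]) {
        lock->pos++;
        if (lock->pos == lock->length) {
            lock->pos = 0;
            return true;
        }
        return false;
    }
    /* wrong tone: start over, but accept it as a new first tone */
    lock->pos = (bucket == lock->pattern[0]) ? 1 : 0;
    return false;
}
```

When melody_feed() returns true, you would move the servo. In practice you would also want to ignore quiet frames and allow neighboring buckets, since a real tone rarely lands in exactly one bucket.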
I hope this guide gave you some inspiration and you learned something new for your own projects.