AI Air Drums With Computer Vision and Raspberry Pi
by Deepali Gaur in Circuits > Raspberry Pi
These AI drums are played by wiggling your fingers in the air: bending each fingertip plays a unique drum beat.
This project works by tracking hands with computer vision and recognising hand gestures with a machine learning model. I've used the MediaPipe machine learning model for hand landmark recognition and OpenCV for computer vision. With this model, we track the position of each fingertip relative to the other fingers and, in our code, sync a drum beat to each fingertip. Every finger is linked to a unique drum beat, so we can play a symphony just by wiggling our fingers in the air.
I have synced it to tabla-drum .WAV files, but you can sync it to any musical instrument of your choice.
I had a lot of fun demonstrating this project at a recent Maker Faire; this instrument is sure to amuse family and friends.
Prerequisites: familiarity with Raspberry Pi and intermediate Python coding skills.
Supplies
- Raspberry Pi 4B with the latest version of Raspberry Pi OS installed: link
- Monitor
- A wired keyboard and mouse for the Raspberry Pi are optional; if you are familiar with VNC/remote access of the Pi, you won't need them.
- Raspberry Pi camera (link)
- Raspberry Pi camera tripod (optional): link
- You'll need audio output: either external speakers connected to the Raspberry Pi's audio port or the monitor's built-in speakers. Here's the link to the basic PC speakers that I used: link
- Optional add-on for a light show: if you'd like a light show synced to your musical notes, you'll also need a breadboard, LED lights, jumper wires, and 220-ohm resistors. The LED buttons that I used can be purchased from this link
Install the Latest Raspberry Pi OS 64-bit
Make sure you are using the latest 64-bit version of Raspberry Pi OS. If you are new to Raspberry Pi, you can follow the detailed setup steps in the official guide: https://projects.raspberrypi.org/en/projects/raspberry-pi-setting-up/0
Connect the Pi Camera and Enable It
Next, you need to connect the camera to the Raspberry Pi and make sure it is enabled. On the Raspberry Pi desktop, go to Main menu -> Preferences -> Raspberry Pi Configuration -> Interfaces -> Camera -> select Enabled -> OK. Then reboot your Raspberry Pi.
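You can also sanity-check the camera from a terminal before moving on; on recent Raspberry Pi OS releases the libcamera tools are preinstalled (depending on your OS version, the command name may differ):

libcamera-hello

A preview window should open for a few seconds if the camera is working.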
Enable Audio
Make sure you have selected the right audio channel. You can check it by clicking on the volume icon at the top right. I used external speakers connected to the Raspberry Pi audio jack, so I selected the AV jack option from the list.
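To confirm that sound comes out of the selected output, you can run ALSA's built-in test from a terminal (speaker-test ships with Raspberry Pi OS):

speaker-test -t wav -c 2

Press Ctrl+C to stop it.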
Install Pygame
Install pygame with this command:
python3 -m pip install -U pygame --user
Further info on installation on Pygame.org : https://www.pygame.org/wiki/GettingStarted#Raspberry%20Pi
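You can quickly verify the install from a terminal:

python3 -c "import pygame; print(pygame.ver)"

It should print the installed pygame version.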
Install Mediapipe Packages
1. Now we need to install the mediapipe package. Use the command below to install it:
python3 -m pip install mediapipe
You need to have a monitor connected to see the video stream.
2. Next, install the dependencies from the MediaPipe GitHub for the hand landmark trained model.
Clone the contents of this GitHub folder to your Raspberry Pi.
https://github.com/googlesamples/mediapipe/tree/main/examples/hand_landmarker/raspberry_pi
3. After copying the files, run these commands in a terminal on your Raspberry Pi:
cd mediapipe/examples/hand_landmarker/raspberry_pi
sh setup.sh
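The setup script installs the example's Python dependencies and downloads the pre-trained model; it should leave a file named hand_landmarker.task in the same folder, which you can confirm with:

ls hand_landmarker.task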
Code: How It Works
Now that we are done installing all the packages, it's time to code. You can download my code from my GitHub repository (link below). In this step, we'll walk through how the code works.
How it works:
We’ll use the hand_landmarker task from MediaPipe; this is a bundle of two pre-trained models: a hand landmark detection model and a palm detection model. It’s trained on thousands of hand images to detect the 21 hand-knuckle coordinates, as shown in the image.
For our project, we are interested in the four fingertips, so we’ll store the landmark ids of these tips in a Python list to use in our code.
This is the Python list we have created in our code:
tip = [8, 12, 16, 20]
tip[0] = 8, the hand landmark id for the index_finger_tip
tip[1] = 12, the hand landmark id for the middle_finger_tip
tip[2] = 16, the hand landmark id for the ring_finger_tip
tip[3] = 20, the hand landmark id for the pinky_tip
(Note that Python lists are zero-indexed, so the first entry is tip[0].)
The model divides each hand into 21 landmarks, and each landmark has (x, y, z) coordinates normalized to the frame: the upper left of the frame is (0, 0) and the bottom right is (1, 1).
The z value is the landmark depth, with the depth at the wrist being the origin.
For our project, we need to track whether the tip of a finger is lower than the tips of the other three fingers; if it is, we play the musical note linked to that finger.
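As a minimal sketch of how these coordinates are read in code (assuming the mp.solutions Hands API that appears later in this write-up, and a test photo named hand.jpg):

import cv2
import mediapipe as mp

handsModule = mp.solutions.hands
tip = [8, 12, 16, 20]  # landmark ids for the index, middle, ring and pinky tips

with handsModule.Hands(static_image_mode=True, max_num_hands=1) as hands:
    frame = cv2.imread("hand.jpg")  # any test photo of a hand
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        hand = results.multi_hand_landmarks[0]
        for t in tip:
            lm = hand.landmark[t]  # normalized (x, y, z) coordinates
            print(t, round(lm.x, 2), round(lm.y, 2))  # y grows downwards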
Code: Recognizing Patterns for Each Finger
The code works by recognising the bend patterns for each finger.
For instance, for the index finger (hand landmark 8): we consider that a person intends to play a note with the index finger when the tip of her index finger is below the tips of the other fingers. (Note that the value of the y coordinate increases downwards, with the bottom right of the frame being the maximum.) If this condition is true, we play the drum beat linked to the index finger, and we stop playing when the condition stops being true.
Similar logic is used to play notes for the other fingers too.
Now, to play a note with the pinky: the pinky is the shortest of the four fingers, which means its tip (landmark 20) sits lower than the tips of the other three fingers even when it's straight, so the comparison used for the other fingers would misfire. Instead, we compare the tip with landmarks 19 and 18 on the pinky itself. If the tip is lower than landmark 19 or 18, we consider the pinky bent, take it that the user intends to play a note with this finger, and play the drum beat linked to it.
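Putting that together, here's a minimal sketch of the per-finger check (a sketch only: finger_played is a name I've made up for illustration, and the exact comparisons in the actual code may differ):

# lm is the list of 21 landmarks for the detected hand (normalized coords);
# y increases downwards, so "below" means a larger y value
def finger_played(lm):
    """Return the name of the bent finger, or None."""
    tips = {"index": 8, "middle": 12, "ring": 16}
    for name, t in tips.items():
        others = [i for i in (8, 12, 16) if i != t]
        if all(lm[t].y > lm[o].y for o in others):  # tip below the other tips
            return name
    # pinky special case: compare its tip with its own joints 19 and 18
    if lm[20].y > lm[19].y or lm[20].y > lm[18].y:
        return "pinky"
    return None

In the main loop, you'd also remember the previous state of each finger so that a beat fires once when the finger bends, rather than on every frame.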
Code: Setting ML Model Parameters
We can set parameters like the number of hands to be tracked, the detection confidence, and so on for our hand_landmarker ML task. Let's take a look at these parameters in the code below:
with handsModule.Hands(static_image_mode=False, min_detection_confidence=0.7, min_tracking_confidence=0.7, max_num_hands=1) as hands:
For the purpose of this project, we have limited the number of hands being tracked to one. The parameter max_num_hands sets this value in the line above.
static_image_mode=False, since we are using a live stream.
min_tracking_confidence: the minimum confidence score, between 0.0 and 1.0, for the hand landmarks to be considered tracked successfully; if tracking drops below this, detection is run again on the next frame.
min_detection_confidence: the minimum confidence score, between 0.0 and 1.0, from the palm detection model for hand detection to be considered successful.
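Here is that line in the context of a minimal capture loop (a sketch, assuming OpenCV can open your camera as device 0; some Pi camera setups need a different index or the picamera2 library instead):

import cv2
import mediapipe as mp

handsModule = mp.solutions.hands
cap = cv2.VideoCapture(0)  # Pi camera or USB webcam

with handsModule.Hands(static_image_mode=False, min_detection_confidence=0.7,
                       min_tracking_confidence=0.7, max_num_hands=1) as hands:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        # ...check the fingertip positions and trigger drum beats here...
        cv2.imshow("AI air drums", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
            break

cap.release()
cv2.destroyAllWindows()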
Code Directory
I have uploaded the code to my GitHub; you can download it from my repository (link below) and run it after completing all the installations above. The code has been tested several times and works fine.
Substitute Your Sound Files Here in the Code
You'll need to substitute the names and paths of your own .WAV sound files here in the code.
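For example, with pygame's mixer (the file names and paths below are placeholders; point them at your own .WAV files):

import pygame

pygame.mixer.init()
# one sound per finger -- replace these paths with your own .WAV files
sounds = {
    "index":  pygame.mixer.Sound("/home/pi/drums/tabla_1.wav"),
    "middle": pygame.mixer.Sound("/home/pi/drums/tabla_2.wav"),
    "ring":   pygame.mixer.Sound("/home/pi/drums/tabla_3.wav"),
    "pinky":  pygame.mixer.Sound("/home/pi/drums/tabla_4.wav"),
}
sounds["index"].play()  # quick test: one beat should play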
Add LEDs to Your Project
Optional upgrades to your musical AI project:
You can add a set of LEDs that glow with each drum beat. They connect to the Raspberry Pi's GPIO pins.
I have connected four light buttons to the GPIO pins.
The schematic for the connections is in the picture above.
The code already has the logic to turn these LEDs on and off along with the musical notes.
(The code works well with or without the add-on LEDs.)
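If you'd like to see the idea in code form, here's a minimal sketch using the gpiozero library (the BCM pin numbers are placeholders to match to your own wiring, and the sounds dict comes from the earlier snippet):

from gpiozero import LED

# one LED per finger -- adjust the BCM pin numbers to your wiring
leds = {
    "index":  LED(17),
    "middle": LED(27),
    "ring":   LED(22),
    "pinky":  LED(23),
}

def play_beat(finger):
    sounds[finger].play()  # sounds dict from the sound-file snippet above
    leds[finger].on()      # light the matching LED with the beat

def stop_beat(finger):
    leds[finger].off()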
Now, You Can Play the Drums With These Hand Gestures
Step 1. Place your hand in front of the camera (near the black line) until it's seen clearly in the frame.
Step 2. Wiggle your fingers to play the drums and control the light show! Each finger plays a unique beat.
To STOP the Drums - Place Your Hand Upright in Front of the Camera
To STOP: place your hand upright.
Variations and Add Ons
You can add your own variations to this musical project.
For example: you could play any musical notes with it. I have chosen tabla drums, but you could use piano or any other sound .WAV files.
Or you could add another note for the thumb.
Or code one of the fingers for volume control, and so on!
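As a starting point for the volume-control idea: pygame Sound objects accept a volume between 0.0 and 1.0, so you could map a fingertip's height in the frame straight onto it (a sketch, using the lm landmark list and sounds dict from the earlier snippets; landmark 4 is the thumb tip):

# y is 0.0 at the top of the frame and 1.0 at the bottom,
# so a raised hand gives a higher volume
volume = 1.0 - lm[4].y
for s in sounds.values():
    s.set_volume(max(0.0, min(1.0, volume)))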