Gestures Controlling Lights... and Much More.

by RakaAmburo



isntructIntro.png
Control Your Home with Gestures: Smart Lights and Devices Made Easy!

Welcome once again to the Amburo place. In this tutorial we will explore a new way of controlling things. On previous occasions we experimented with voice commands, sending instructions to a server to be processed and trigger an action. Today I bring you the MediaPipe platform, a set of frameworks and libraries built around ready-to-use machine-learning features. Among them you will find interesting things like face recognition and gesture recognition for different environments. We will pick gesture recognition from the web toolkit and assemble it with our voice server: we will add a new page containing the libraries required for gesture recognition, plus the functions that send and control the instructions associated with certain gestures. Let's get into it.

Supplies

gesturesSupplies.png

For this endeavor we will revisit our previous voice-controlling tutorial, more precisely the NodeJS server component, which is in charge of rendering the HTML files and processing the instructions coming from the page. We will also use the four-switch device we built. Bear in mind you can test this with something simpler, like an effect on the web page itself, for example changing some HTML element. That would be the simplest option.

But you can trigger whatever you want: any kind of request can be sent to the server, and you could even implement a WebSocket to forward the gesture stream to it. Imagination is your limit.
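For instance, a minimal sketch of that idea in the browser could look like this (the element id and the /action endpoint are placeholders, not part of the original project):

// Minimal sketch: react to a recognized gesture name in the browser.
function handleGesture(gestureName) {
  // Simplest option: change something on the page itself.
  document.getElementById("status").textContent = "Gesture: " + gestureName;

  // Or forward it to the server with a plain HTTP request.
  fetch("/action?gesture=" + encodeURIComponent(gestureName))
    .catch((err) => console.error("Request failed", err));

  // A WebSocket would work just as well if you want to push every frame's result.
}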

Diagram

diagram.png

As mentioned, we will add a new page to our original voice controller project, rendered by the same server and re-using the same NodeJS resources. The MediaPipe library continuously analyzes the images coming from the webcam and produces a stream of recognized gestures. A function called doAction() filters and processes that stream in order to detect a specified pattern (e.g. open palm, closed palm) and send a single, unified instruction.
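To make the flow concrete, the per-frame loop looks roughly like this (a sketch following the MediaPipe web example; variable names are illustrative):

// Sketch of the continuous recognition loop.
function predictWebcam() {
  // Ask the recognizer for the gestures visible in the current webcam frame.
  const results = gestureRecognizer.recognizeForVideo(video, Date.now());

  if (results.gestures.length > 0) {
    // Take the top category of the first detected hand, e.g. "Open_Palm".
    doAction(results.gestures[0][0].categoryName);
  }

  // Keep analyzing frames for as long as the webcam is enabled.
  window.requestAnimationFrame(predictWebcam);
}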

New Components

robotProgramming.png

So we will add three new components/files to the original project (gesturesScript.js, gestureStyle.css, and gestures.html) and slightly modify server.js. There is not much to say about the CSS file: it is the original style sheet downloaded from the MediaPipe example resources. Let's dive a little deeper into the rest.

Server

codeServer.png

A simple addition has been made to server.js in order to render the new HTML page. Super simple.
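Assuming an Express-style server like the one from the voice tutorial (that part is an assumption on my side), the addition is roughly this:

// Sketch of the extra route in server.js (assumes an Express-style app object).
const path = require("path");

app.get("/gestures", (req, res) => {
  // Serve the new page next to the original voice-control page.
  res.sendFile(path.join(__dirname, "gestures.html"));
});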

Script (JS)

codeDoAction.png

Not much to say about this component (gesturesScript.js); it is pretty much the same as the one you will find in the examples, with the redundant parts removed. In it you will find the configuration of the gesture recognizer, plus the click handlers and page changes that control some minimal behavior, like activating the webcam. Notice that the createGestureRecognizer function loads a resource called gesture_recognizer.task. This is where the recognizable gestures are defined; it is a trained model file, and you can customize it to recognize your own gestures. We will dive deeper into this in the update to this tutorial, to increase the number of gestures we work with.
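For reference, the recognizer setup in the MediaPipe web example boils down to something like this (the WASM and model URLs are the public ones from the example; swap in your own .task file if you customize it):

// Trimmed sketch of the recognizer setup from the MediaPipe web example.
// With a bundler you can import from the npm package; the official demo loads
// the same classes from the tasks-vision CDN bundle instead.
import { GestureRecognizer, FilesetResolver } from "@mediapipe/tasks-vision";

let gestureRecognizer;

async function createGestureRecognizer() {
  // Load the WASM files the vision tasks need.
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@0.10.3/wasm"
  );
  // The .task file is the model that defines which gestures can be recognized.
  gestureRecognizer = await GestureRecognizer.createFromOptions(vision, {
    baseOptions: {
      modelAssetPath:
        "https://storage.googleapis.com/mediapipe-models/gesture_recognizer/gesture_recognizer/float16/1/gesture_recognizer.task"
    },
    runningMode: "VIDEO"
  });
}

createGestureRecognizer();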

A function has been added (doAction) to filter the stream and send the instruction. As you can see, a set of if/else decisions has been used. Raka does not like that! It is not the Amburo way. We will correct that shortly by adding some polymorphism.
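As a rough sketch of the idea (the gesture names are MediaPipe's built-in categories; the endpoint is a placeholder for whatever your voice server expects):

// Sketch of doAction(): turn the noisy per-frame stream into single instructions.
let lastGesture = "None";

function doAction(gestureName) {
  // Skip frames with no known gesture, and repeated frames of the same gesture,
  // so one pose sends one instruction.
  if (gestureName === "None" || gestureName === lastGesture) return;

  // The if/else chain Raka is not proud of (polymorphism coming in the update).
  if (lastGesture === "Pointing_Up" && gestureName === "Thumb_Up") {
    sendInstruction("light-1", "on");
  } else if (lastGesture === "Pointing_Up" && gestureName === "Thumb_Down") {
    sendInstruction("light-1", "off");
  } else if (lastGesture === "Closed_Fist" && gestureName === "Open_Palm") {
    sendInstruction("all", "on");
  } // ...and so on for the Victory, Open_Palm and Closed_Fist combinations.

  lastGesture = gestureName;
}

function sendInstruction(target, state) {
  // Placeholder endpoint; the voice-control server defines the real one.
  fetch(`/gesture?target=${target}&state=${state}`).catch(console.error);
}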

Html

codeHtml.png

Very simple HTML: it contains the 'includes' for the required styles and JavaScript components, and the video tag used to show the webcam stream. Notice we added our JS script at the end.
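Stripped to its essentials, the page looks something like this (element ids are illustrative; the real file also carries whatever the MediaPipe example styles expect):

<!DOCTYPE html>
<html>
  <head>
    <!-- Style sheet taken from the MediaPipe example resources -->
    <link rel="stylesheet" href="gestureStyle.css" />
  </head>
  <body>
    <!-- The webcam stream is rendered into this video tag -->
    <video id="webcam" autoplay playsinline></video>

    <!-- Our script goes last, so the elements above already exist when it runs -->
    <script type="module" src="gesturesScript.js"></script>
  </body>
</html>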

Coming Next

gesetureNext.png

So, what next?

Well, my idea is to dive deeper into the task file to enhance the variety and complexity of gestures: recognizing left and right hands, recognizing more gestures, and maybe combining hands to create complex gestures. That will give us a wider spectrum to control more things. I would also like to improve the setting and processing of the instructions by adding polymorphism, as sketched below.
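To give you an idea of the direction, the refactor could look something like this, reusing the sendInstruction helper sketched earlier (just an illustration, not the final code):

// Illustration of the planned refactor: one small handler per gesture combination,
// looked up by key instead of walked through an if/else chain.
const handlers = {
  "Pointing_Up>Thumb_Up": () => sendInstruction("light-1", "on"),
  "Pointing_Up>Thumb_Down": () => sendInstruction("light-1", "off"),
  "Closed_Fist>Open_Palm": () => sendInstruction("all", "on")
  // ...one entry per supported combination.
};

function doAction(previousGesture, currentGesture) {
  const handler = handlers[`${previousGesture}>${currentGesture}`];
  if (handler) handler();
}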

See you in the next one and have a good one.

Raka.

Bonus Feature

screenshot2.png

As I mentioned, the task file used in this demo is somewhat limited, and I will be adding one with more gestures. So that you can test it right away, I built a switch emulator composed of a simple table and HTML circle elements. Besides sending the request to the server, the script changes the color attribute of the corresponding circle. Feel free to adapt it to drive other mechanisms; I will be enhancing this in the future as I stated. Bear in mind I used Thumb_Up / Thumb_Down in the first four combinations as the on/off indicator. So, for example, I point up and then thumb up to turn on switch 1, and point up and then thumb down to turn it off, and so on.

So basically I used these combinations to control the switches (a small sketch of the mapping follows the list):

  1. Pointing_Up and Thumb_Up / Thumb_Down controls light-1
  2. Victory and Thumb_Up / Thumb_Down controls light-2
  3. Open_Palm and Thumb_Up / Thumb_Down controls light-3
  4. Closed_Fist and Thumb_Up / Thumb_Down controls light-4
  5. Closed_Fist then Open_Palm (or Open_Palm then Closed_Fist) switches all of them on/off
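Here is a small sketch of that mapping together with the emulator update (element ids and colors are placeholders; adjust them to your own page):

// Sketch: map the first gesture to a switch, and Thumb_Up / Thumb_Down to on/off.
const SWITCH_BY_GESTURE = {
  Pointing_Up: "light-1",
  Victory: "light-2",
  Open_Palm: "light-3",
  Closed_Fist: "light-4"
};

function setSwitch(lightId, on) {
  // The emulator circles are plain HTML elements; changing their color shows the state.
  const circle = document.getElementById(lightId);
  if (circle) circle.style.backgroundColor = on ? "limegreen" : "gray";
}

function applyCombination(firstGesture, thumbGesture) {
  const lightId = SWITCH_BY_GESTURE[firstGesture];
  if (lightId) setSwitch(lightId, thumbGesture === "Thumb_Up");
}

// The fifth combination flips everything at once, e.g. Closed_Fist then Open_Palm = all on.
function setAllSwitches(on) {
  Object.values(SWITCH_BY_GESTURE).forEach((id) => setSwitch(id, on));
}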

Adiós!

Closed fist to open palm transition turns all switches on.