Laser Beam Targeting Via Facial Landmark Detection

by thomas9363 in Circuits > Raspberry Pi



Dlib.gif

This project uses eye blinking and mouth opening as control inputs to aim and fire a laser mounted on a panning servo. The hardware setup includes a Raspberry Pi, which serves as the interface for the camera, servo, and laser. A computer vision-based Python program detects and tracks the eyes and mouth in the video frames in real time. The servo angle is adjusted in response to changes in the shape of the eyes, while the laser is switched on and off in response to changes in the shape of the mouth.

Background

Changes in the eyes and mouth are detected by measuring their landmarks. Various facial landmark detection methods are available, such as dlib, OpenCV, FaceAlignment, and mediapipe. For example, mediapipe can detect 468 points, while dlib detects 68 points. Given the limited resources of the Raspberry Pi 4 edge device, dlib's 68 landmarks are sufficient for my project. Adrian Rosebrock has posted an article titled 'Eye blink detection with OpenCV, Python, and dlib.' It provides helpful information to start writing the program.

Hardware Setup

The hardware setup is straightforward, utilizing a Raspberry Pi 4 with 4GB of RAM. A laser is mounted on a panning servo, facing forward, while a camera is connected to the Pi's CSI port, facing the opposite direction and not mounted on the servo. The servo is controlled by either a PCA9685 or a UGEEK Stepper Motor Hat V0.2, with the only difference between them being the I2C address.

Dlib Landmarks

face_landmark.png

Dlib's facial landmark detector provides 68 (x, y)-coordinates. In this project, blinking the left or right eye moves the panning servo left or right, opening the mouth activates the laser, and closing the mouth deactivates it. The index ranges below are zero-based with the end index excluded, as in Python slice notation: the mouth is accessed using points 48 to 68 (indices 48 through 67), the right eye using points 36 to 42 (indices 36 through 41), and the left eye using points 42 to 48 (indices 42 through 47).

Having said that, I only use 4 points each from the mouth and both eyes, as shown above. The closing or opening status of the eyes and mouth is detected through the calculation of their respective aspect ratios. The aspect ratio is defined as shown in the picture above.
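
As a minimal sketch of the idea above, the code below picks out the landmark ranges and computes an aspect ratio from four points. The exact definition I use is the one in the picture; the height-over-width form here is an assumption for illustration, and the names REGIONS, region_points, and aspect_ratio are hypothetical.

```python
import math

# Zero-based, end-exclusive index ranges into the 68-point landmark list.
REGIONS = {"mouth": (48, 68), "right_eye": (36, 42), "left_eye": (42, 48)}


def region_points(landmarks, name):
    """Return the landmarks that belong to one facial region."""
    start, end = REGIONS[name]
    return landmarks[start:end]


def aspect_ratio(left, right, top, bottom):
    """Height divided by width of a feature described by four (x, y) points.

    Assumed definition: 'left' and 'right' are the two corners,
    'top' and 'bottom' are one point on each lid (or lip).
    """
    width = math.dist(left, right)
    height = math.dist(top, bottom)
    # Avoid dividing by zero if the two corner points coincide.
    return height / width if width > 0 else 0.0
```

With this definition, an eye 30 pixels wide and 10 pixels high has an aspect ratio of about 0.33, and the value falls toward zero as the eye closes.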

Software Implementation

EyeBlinkComparison.png

As I began implementing my algorithm, I noticed that the aspect ratios of my eyes were not very distinct. At the age of 70, the aspect ratio of my eyes differs less between open and closed than it used to. In particular, when I blink one eye, the other eye tends to follow suit. I plotted the area of each eye while opening and closing, as shown in the picture above. You can see that when I close one eye, the area of the other eye drops significantly. Having said that, the eye that is closed is still smaller than the one that is open. The area of the quadrilateral formed by the four eye points is the sum of the areas of two triangles, and the area of each triangle is given by Heron's formula:

Area (A) = √[s * (s - a) * (s - b) * (s - c)]
Where s = (a + b + c) / 2 and a, b, c are the three sides
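
The area calculation can be sketched as follows. This is an illustration rather than the code from my repository: it assumes the four points are passed in order around the outline of the eye, and the function names are hypothetical.

```python
import math


def triangle_area(p, q, r):
    """Area of a triangle from its three corner points, using Heron's formula."""
    a = math.dist(p, q)
    b = math.dist(q, r)
    c = math.dist(r, p)
    s = (a + b + c) / 2
    # Rounding can make the product slightly negative for very thin triangles.
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))


def quad_area(p0, p1, p2, p3):
    """Area of a convex quadrilateral, split into two triangles along p0-p2.

    Assumes the points are given in order around the outline.
    """
    return triangle_area(p0, p1, p2) + triangle_area(p0, p2, p3)
```

For example, quad_area((0, 0), (2, 0), (2, 1), (0, 1)) gives the area of a 2 by 1 rectangle, which is 2 up to floating-point rounding.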

Furthermore, the size of the area changes when I move closer to or further away from the camera, so I cannot use the area directly as an indicator. Instead, I normalize it by dividing the area of the left eye by the area of the right eye. The criterion for a closed eye is that its aspect ratio is below a threshold and its area is smaller than that of the other eye:

if leftAR < EYE_AR_THRESH and area_ratio < 1:
where leftAR: left eye aspect ratio
EYE_AR_THRESH: threshold of the eye aspect ratio
area_ratio: left eye area / right eye area
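
A small sketch of this test for both eyes is shown below. The left-eye condition is the one given above; the right-eye condition is assumed to be its mirror image, and the threshold value and the function name closed_eye are hypothetical.

```python
EYE_AR_THRESH = 0.2  # hypothetical value; tune it for your own face and camera


def closed_eye(left_ar, right_ar, left_area, right_area, thresh=EYE_AR_THRESH):
    """Return "left", "right", or None depending on which eye looks closed."""
    if right_area <= 0:
        return None  # no usable right-eye area, so the ratio is undefined
    area_ratio = left_area / right_area
    # Left eye: low aspect ratio and smaller than the right eye.
    if left_ar < thresh and area_ratio < 1:
        return "left"
    # Right eye: assumed mirror condition of the left-eye test.
    if right_ar < thresh and area_ratio > 1:
        return "right"
    return None
```

Because the area ratio compares one eye with the other, the test still picks the intended eye when both aspect ratios drop below the threshold at the same time, which is what happens when the other eye follows suit.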

Fortunately, the aspect ratio of my mouth appears to be more consistent, and my program can reliably detect when my mouth is open or closed. You can download the programs from my GitHub repository.
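
To show how the blink and mouth results could drive the hardware, here is a rough per-frame sketch. It is not taken from the repository: the threshold, step size, angle limits, and pan direction are all assumptions, and sending the angle to the servo driver and switching the laser pin are left out.

```python
MOUTH_AR_THRESH = 0.5      # hypothetical mouth-open threshold
PAN_STEP = 5               # hypothetical degrees moved per detected blink
PAN_MIN, PAN_MAX = 0, 180  # assumed travel limits of the pan servo


def update_controls(angle, blink, mouth_ar):
    """Return (new_angle, laser_on) for one video frame.

    blink is "left", "right", or None; mouth_ar is the mouth aspect ratio.
    """
    if blink == "left":
        angle -= PAN_STEP   # direction is an assumption; swap if mounted reversed
    elif blink == "right":
        angle += PAN_STEP
    # Keep the servo inside its assumed travel range.
    angle = max(PAN_MIN, min(PAN_MAX, angle))
    # Laser is on only while the mouth is open.
    laser_on = mouth_ar > MOUTH_AR_THRESH
    return angle, laser_on
```

Starting at 90 degrees, a left blink with an open mouth (aspect ratio 0.7) returns (85, True) with these values.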


Conclusions

Facial Landmark Control of Laser and Crossbow

The video below shows my granddaughter controlling the laser. The second part of the video shows facial landmarks being used to control a repeating crossbow. If you are interested, you can find the building instructions for the crossbow in my other article. As I only have two eyes, I can currently control only the pan servo. However, considering that mediapipe can detect 468 points, including the iris, I may explore using the rotation of my eyeballs to control the tilt servo in the future.