Mediapipe and Haarcascade Face Tracking With PID Control

by thomas9363 in Circuits > Raspberry Pi

Object tracking involves using a motorized mechanism to move a camera in order to keep a detected object in view. Initially, a camera is employed to detect an object of interest within its field of view. Once the object is identified, its position is determined within the frame. A motorized mechanism (usually servos) is then utilized to adjust the camera's orientation (pan and tilt) to ensure the detected object remains at the center of the frame. The camera continuously adjusts its position to follow the movement of the object, providing a practical means of automating the process of keeping an object in view without manual intervention.

In an earlier post, I used this technique to track circular balloons by moving the camera a fraction of a degree at a time until the balloon was centered in the frame. That fixed-step method was not optimized. In this article, I implement a PID control algorithm and use two machine learning face detectors: Mediapipe face detection and the Haar cascade frontal face classifier.

What Is PID Control

PID control, also known as proportional-integral-derivative control, is a control algorithm commonly used in engineering systems to reach and maintain a desired target value. The PID controller adjusts the system's control variable based on the error between the desired setpoint and the actual value of the system. In continuous time, the control law is:

y(t) = Kp * e(t) + Ki * ∫ e(t) dt + Kd * de(t)/dt

where:

- y(t) is the control output (e.g., the manipulated variable).

- e(t) is the error at time t, which is the difference between the desired setpoint and the current process value: e(t) = setpoint - process_value(t). The proportional term Kp * e(t) lets the controller respond to the immediate deviation from the setpoint, but it does not account for past or future errors.

- ∫ e(t) dt represents the integral term, which is the accumulation of past errors over time. It helps to eliminate any steady-state error. It amplifies the control signal when the error persists over time.

- de(t)/dt is the derivative of the error with respect to time, representing the rate of change of the error.

- Kp, Ki, and Kd are the PID controller gains for the proportional, integral, and derivative terms, respectively.

Converting the above equation to discrete form, the control output at each time step is expressed as:

integral = integral_prior + error * delta_time
output = kp * error + ki * integral + kd * (error - error_prior) / delta_time

In this numerical representation, the integral term is approximated by summing the error over time, and the derivative term is approximated using the difference in error between two consecutive time steps. These approximations are commonly used in digital or discrete-time implementations of PID control for real-time control systems.
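The discrete form described above can be sketched as a small Python class. This is a minimal illustration, not the code from the repository: the class name `PID`, the method name `update`, and the gain values are assumptions chosen for the example.

```python
class PID:
    """Discrete PID controller that keeps a running sum for the integral term."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0      # accumulated error * delta_time
        self.error_prior = 0.0   # error from the previous time step

    def update(self, error, delta_time):
        """Return the control output for the current error and time step."""
        if delta_time <= 0:
            raise ValueError("delta_time must be positive")
        self.integral += error * delta_time                   # integral term
        derivative = (error - self.error_prior) / delta_time  # derivative term
        self.error_prior = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Illustrative gains only; these are not tuned values from the article.
pid = PID(kp=2.0, ki=0.5, kd=0.1)
first = pid.update(error=10.0, delta_time=0.1)   # derivative sees a jump from 0 to 10
second = pid.update(error=10.0, delta_time=0.1)  # error unchanged, so derivative is 0
```

With a constant error, the derivative contribution disappears after the first step while the integral contribution keeps growing, which is the behavior the two approximations are meant to capture.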

Testing Setup

The hardware consists of a Raspberry Pi 4 with 4GB of RAM running Raspberry Pi OS Buster. The programs are written in Python 3 and use OpenCV 4.1.0.25 for image processing. Two Python programs, one with PID control and one without, track faces using the Haar cascade frontal face classifier.

The tracking mechanism is constructed with a pan servo and a tilt servo, both connected to a UGEEK Stepper Motor HAT v0.2. This HAT is chosen for its ability to directly sit on top of the Pi. Essentially, it functions similarly to the PCA9685 servo control board, with the only difference being the I2C address. The servo control module is implemented using the Adafruit ServoKit library. If utilizing the PCA9685 servo controller, the address needs to be changed from 0x6F to 0x40, as shown below:

from adafruit_servokit import ServoKit
kit = ServoKit(channels=16, address=0x40)

The angle property in the Adafruit ServoKit library accepts float values, which allows fine angle adjustments. However, not all servos can resolve fractions of a degree, so the hardware may settle at the nearest position it supports.

For testing, a frame size of 352x288 pixels is used because of the limited computational power of the Pi 4. This keeps performance smooth at 24 frames per second (FPS), the standard rate for 2D animated films. A doll is placed to the right of the camera, with a fixed offset of 1/40 of the frame width defining the offset zone. The test measures how quickly the camera can bring the face into that zone.

Without PID Control

The initial test is performed without PID control. Tracking is achieved by adjusting the camera's position in fixed fractions of a degree until the center of the face aligns with the offset zone. Two scenarios are compared: one with a 0.3-degree increment and the other with a 0.4-degree increment. The graph shows that with a 0.4-degree increment the camera moves quickly, but it oscillates around the offset box without settling into a stable position.
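The fixed-step approach can be sketched as a single function that is called once per frame. This is an assumed reconstruction for illustration: the function name `fixed_step_pan`, the reading of the 1/40 offset as the half-width of the zone, and the sign convention for the pan direction are not taken from the original programs.

```python
def fixed_step_pan(angle, face_x, frame_width, step=0.3, zone_fraction=1 / 40):
    """Move the pan angle one fixed step toward the face, or hold if centered.

    zone_fraction is treated as the half-width of the offset zone, expressed
    as a fraction of the frame width (an assumed interpretation).
    """
    center_x = frame_width / 2
    tolerance = frame_width * zone_fraction
    offset = face_x - center_x
    if abs(offset) <= tolerance:
        return angle                          # face is inside the offset zone
    # Assumed sign convention: decreasing the angle pans toward the right.
    new_angle = angle - step if offset > 0 else angle + step
    return max(0.0, min(180.0, new_angle))    # stay within the servo range


# Face far to the right of a 352-pixel-wide frame: step the angle down by 0.3.
moved = fixed_step_pan(90.0, face_x=300, frame_width=352)
# Face 4 pixels from center, inside the zone: hold the current angle.
held = fixed_step_pan(90.0, face_x=180, frame_width=352)
```

Because the step is the same size no matter how far the face is from the zone, a larger step gets there faster but can also carry the face across the zone in a single frame, which is one plausible reason for the oscillation seen with the 0.4-degree increment.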

With PID Control

With PID control, tuning the values of Kp, Ki, and Kd is crucial. In this particular case, Ki and Kd appear to have less impact on the movement than Kp. Compared with the test without PID control, the graph and image show that the face reaches the offset box (the setpoint in PID terminology) quickly.

During experimentation with haarcascade and PID, the camera occasionally failed to detect the face. This could be caused by variations in lighting or by the face being turned slightly at an angle.
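A minimal sketch of how the face center might be obtained from a Haar cascade is shown here. The helper names `largest_face_center` and `detect_face_center` are hypothetical, the `scaleFactor` and `minNeighbors` values are illustrative, and the cascade file name in the comment is an assumption about which frontal face model is used.

```python
def largest_face_center(faces):
    """Return the (x, y) center of the largest box in faces, or None if empty."""
    if len(faces) == 0:
        return None
    # Each box is (x, y, w, h); pick the one with the largest area.
    x, y, w, h = max(faces, key=lambda box: box[2] * box[3])
    return (x + w // 2, y + h // 2)


def detect_face_center(frame, cascade):
    """Run a Haar cascade on a BGR frame and return the face center or None."""
    import cv2  # imported here so the helper above works without OpenCV
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return largest_face_center(faces)


# Assumed setup, run once before the capture loop:
#   cascade = cv2.CascadeClassifier(
#       cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
```

Returning None for a frame with no detection lets the tracking loop hold the servos in place instead of reacting to a stale position when the cascade misses the face.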

Mediapipe Face

To utilize the mediapipe face detection, you must install the 'mediapipe-rpi4' library in your virtual environment using the following command:

pip install mediapipe-rpi4

Two programs are available on my GitHub, one without PID control and one with it. Their structure is similar to the programs that use haarcascade frontalface, but they use 'mediapipe.solutions.face_detection' instead. Mediapipe has a different syntax and different data structures, so the face center has to be extracted from its detection results before it can be used for tracking. You can download all programs from my GitHub repository and examine them line by line for more details.
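As a rough illustration of that extraction step, the sketch below converts Mediapipe's normalized bounding box into a pixel-space face center. The helper names `bbox_center_px` and `mediapipe_face_center` are hypothetical, the choice of the first detection is an assumption, and the code is not taken from the repository.

```python
def bbox_center_px(xmin, ymin, width, height, frame_w, frame_h):
    """Convert a normalized (0 to 1) bounding box to a pixel-space center."""
    cx = int((xmin + width / 2) * frame_w)
    cy = int((ymin + height / 2) * frame_h)
    return cx, cy


def mediapipe_face_center(frame_bgr, detector):
    """Return the pixel center of the first detected face, or None."""
    import cv2  # imported here so the helper above works without OpenCV
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)  # Mediapipe expects RGB
    results = detector.process(rgb)
    if not results.detections:
        return None
    box = results.detections[0].location_data.relative_bounding_box
    frame_h, frame_w = frame_bgr.shape[:2]
    return bbox_center_px(box.xmin, box.ymin, box.width, box.height,
                          frame_w, frame_h)


# Assumed setup, run once before the capture loop:
#   import mediapipe as mp
#   detector = mp.solutions.face_detection.FaceDetection(
#       min_detection_confidence=0.5)
```

The main difference from the Haar cascade output is that Mediapipe reports the box as fractions of the frame size, so the values must be scaled by the frame width and height before they can be compared with the setpoint box.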

Mediapipe appears to be more robust and accurate under various lighting conditions, even when the head is turned. Additionally, a 5 mW laser is co-located with the camera. When the face stays within the setpoint box for a short duration, the laser automatically activates. Always remember to wear laser protection glasses for safety.
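The "stays within the setpoint box for a short duration" condition can be sketched as a small dwell timer. The class name `DwellTrigger` and the 0.5-second hold time are assumptions for illustration; the article does not state the actual duration or how the laser is switched.

```python
class DwellTrigger:
    """Report True once a condition has held continuously for hold_time seconds."""

    def __init__(self, hold_time):
        self.hold_time = hold_time
        self.entered_at = None   # time the face entered the box, or None

    def update(self, inside, now):
        """Feed one observation; return True when the laser should be on."""
        if not inside:
            self.entered_at = None        # leaving the box resets the timer
            return False
        if self.entered_at is None:
            self.entered_at = now         # first frame inside the box
        return now - self.entered_at >= self.hold_time


# Hypothetical hold time of 0.5 s; timestamps are passed in explicitly.
trigger = DwellTrigger(hold_time=0.5)
states = [
    trigger.update(True, 0.0),   # just entered the box
    trigger.update(True, 0.3),   # still inside, not long enough
    trigger.update(True, 0.6),   # inside for 0.6 s
    trigger.update(False, 0.7),  # left the box
    trigger.update(True, 0.8),   # re-entered, timer restarts
]
```

In a tracking loop, `now` would typically come from `time.monotonic()`, and the returned value would drive whatever output pin controls the laser.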

The video above demonstrates efficient face tracking using mediapipe with PID control. Remarkably, even with a frame size of 640x480, the system detects and tracks the face swiftly.