Human Position Recognition With Camera and Raspberry Pi 4 With the Use of OpenCV

by WilliamLudeke in Circuits > Cameras

4580 Views, 17 Favorites, 0 Comments


LeftLEDTest.jpg
CenterLEDTest.jpg
RightLEDTest.jpg

Have you ever wanted to make a painting where the eyes literally track where someone is, or a doll whose head follows you around? Well, I have, but I needed a method of detecting the position of a person through a webcam on small hardware such as a Raspberry Pi. There are a few problems to overcome, however:

  1. The Raspberry Pi isn't a very powerful computer, so I needed a fast but accurate way of tracking someone. It turns out that OpenCV has a feature called HOG which, with some fine tuning, can do just this.
  2. How do I get the position of the person that is detected? HOG outputs a bounding box, and finding the center of that box gives the person's position.
  3. How do I know where the eyes/head are currently looking? Instead of keeping track of where the eyes/head are looking, I can put the camera in the eyes/head so it rotates with them; if the person is left of center, I rotate the camera (and the eyes/head with it) to slowly re-center the person on camera.
  4. What happens if there is more than one person on camera? Who do I track? Simple: whoever is closest to the center.
  5. How do I make use of the limited outputs of the Raspberry Pi to make things happen? Simple: I don't. Instead, I send data to an Arduino and let it handle the output.
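The "whoever is closest to the center" rule can be sketched in a few lines of Python (a standalone illustration with made-up numbers; the (x, y, w, h) box format matches what OpenCV's HOG detector outputs):

```python
# Pick the person closest to the horizontal center of the frame.
# Each box is (x, y, w, h), the format OpenCV's HOG detector returns.

def closest_to_center(boxes, frame_width):
    # center x coordinate of a bounding box
    def center_x(box):
        x, y, w, h = box
        return x + w / 2
    # the smallest horizontal distance to the frame's midline wins
    return min(boxes, key=lambda b: abs(center_x(b) - frame_width / 2))

# three hypothetical detections in a 140-pixel-wide frame
boxes = [(10, 20, 30, 60), (60, 15, 35, 70), (100, 25, 25, 55)]
print(closest_to_center(boxes, 140))  # (60, 15, 35, 70), the middle box
```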

After a bit of time I finally got the test setup running, and I am stunned at how well it works. It's accurate (with one person at least; I haven't really been able to test with more than one, but I think it should work), it's fast relative to the hardware running it, and most importantly, it works.

In this Instructable I'm going to explain how I did it and how it works, to the best of my ability.

Supplies

Parts.jpg

What is needed:

  • 1x Raspberry Pi 4 B (I used the 4 GB model, but a 2 GB model is the minimum)
  • peripherals to run the Raspberry Pi
  • 1x webcam (I used the Logitech C615)
  • 1x Arduino Uno
  • 1x USB type A to USB type B cable
  • 1x breadboard
  • 3x LEDs
  • jumper wires for the breadboard

Wiring

image_2022-08-02_163121526.png
Assembled.jpg
CircuitCloseUp.jpg

We'll start by connecting up all of the hardware. Here is a list of the connections needed for the Raspberry Pi:

  • Raspberry Pi to AC outlet using the USB-C power adapter
  • Raspberry Pi HDMI 0 port to the monitor's HDMI port
  • Raspberry Pi to Arduino using the USB A to USB B cable

Next you will need to plug in the LEDs according to the diagram above. First, place the three LEDs on the breadboard so that none of the anodes or cathodes share a row. Next, run a jumper from the left LED's anode (the long pin) to digital pin 4 on the Arduino, a jumper from the center LED's anode to digital pin 8, and a jumper from the right LED's anode to digital pin 12. Finally, everything needs to be connected to the Arduino's ground: run a jumper from the pin labeled GND on the Arduino to the rail on the side of the breadboard marked "-", and connect each LED's cathode to that rail with a jumper wire.

Software Prep: Installing Python


First things first: update your Raspberry Pi by running the following commands in the terminal:

sudo apt update
sudo apt upgrade

You will first need to check whether Python is installed on the Raspberry Pi and whether it is the right version. To do this, run one of the following commands in the terminal:

python --version

or

python3 --version

If a version number shows up, make sure it is the latest version. I will be using 3.10.6 for this project.

If you do not have the latest version of Python, go to the following website and select the most recent version: https://www.python.org/downloads/

you might also need to install the following dependencies:

sudo apt-get install libreadline-dev libncursesw5-dev libssl-dev libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev

From there, click Download, then scroll down to "XZ compressed source tarball" and click it to download. Next, navigate to your Downloads folder in the file explorer. Right-click the ".tar.xz" archive and click "Extract Here" to extract the files. Finally, right-click the folder that was created, select "Open in Terminal", and type the following commands:

./configure
make
make test
sudo make install

Finally, Python is installed! It is installed as 'python3', so whenever you run a command with it, make sure to type 'python3' instead of just 'python'. For example, to check that it was installed, run:

python3 --version

Software Prep: Installing Necessary Libraries

You're going to need to start by installing pip for Python 3. To do this, simply run the following command in the terminal:

sudo apt-get install python3-pip

Next you will need to install the following libraries if they are not already installed:

  • NumPy:
pip3 install numpy
  • OpenCV:
pip3 install opencv-python
  • PySerial:
pip3 install pyserial

Once all of these commands have been run, all the Python libraries you need for this project will be installed!

Software Prep: Installing Arduino IDE

download.png

To install the Arduino IDE, which we'll use to program the Arduino, first type the following command to determine which architecture your Raspberry Pi is running (ARM64 or ARM32):

uname -m

If the result is aarch64, you are running ARM64; if it says armv7l (or armv6l), you are running ARM32. Now go to the following website and download the Arduino IDE, selecting the Linux version for either ARM32 or ARM64 based on your uname -m result.

https://www.arduino.cc/en/software

Once downloaded, go to your Downloads folder, unzip the file, and open the resulting folder. From there, run "install.sh" to install the IDE.

Code: Arduino

ArduinoIDECapture.png

For the Arduino code, we start by declaring which pins the Arduino is using, along with the variable that will hold incoming serial data.

//variable for incoming data to be stored in
int x;
//declare pin numbers
int LED_Left = 4;
int LED_Center = 8;
int LED_Right = 12;

Next comes the setup code, which only runs once. It sets all of the pin modes to OUTPUT, then starts the serial communication and sets its timeout.

void setup() {
  //set pins to output
  pinMode(LED_Left, OUTPUT);
  pinMode(LED_Center, OUTPUT);
  pinMode(LED_Right, OUTPUT);
  //start serial com
  Serial.begin(115200);
  Serial.setTimeout(1);
}

Finally, there is a loop that runs forever. It starts by sitting in a while statement until serial data arrives. Once a communication is received, it exits the while loop and reads the data that just came in, which, if you recall, is a string. We convert it to an integer and then use if statements to set each LED's output to either on or off.

void loop() {
  while (!Serial.available()) {
    //loop while nothing is being sent; stop when data is received
  }
  //read serial data and convert to int
  x = Serial.readString().toInt();
  //nothing detected
  if (x == 0) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, LOW);
  }
  //left of center
  if (x == 1) {
    digitalWrite(LED_Left, HIGH);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, LOW);
  }
  //centered
  if (x == 2) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, HIGH);
    digitalWrite(LED_Right, LOW);
  }
  //right of center
  if (x == 3) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, HIGH);
  }
  //echo the value back so the Pi's readline() gets a reply
  Serial.print(x);
}

Here is the final code:

//variable for incoming data to be stored in
int x;
//declare pin numbers
int LED_Left = 4;
int LED_Center = 8;
int LED_Right = 12;

void setup() {
  //set pins to output
  pinMode(LED_Left, OUTPUT);
  pinMode(LED_Center, OUTPUT);
  pinMode(LED_Right, OUTPUT);
  //start serial com
  Serial.begin(115200);
  Serial.setTimeout(1);
}

void loop() {
  while (!Serial.available()) {
    //loop while nothing is being sent; stop when data is received
  }
  //read serial data and convert to int
  x = Serial.readString().toInt();
  //nothing detected
  if (x == 0) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, LOW);
  }
  //left of center
  if (x == 1) {
    digitalWrite(LED_Left, HIGH);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, LOW);
  }
  //centered
  if (x == 2) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, HIGH);
    digitalWrite(LED_Right, LOW);
  }
  //right of center
  if (x == 3) {
    digitalWrite(LED_Left, LOW);
    digitalWrite(LED_Center, LOW);
    digitalWrite(LED_Right, HIGH);
  }
  //echo the value back so the Pi's readline() gets a reply
  Serial.print(x);
}

Also note the port that the Arduino is connected to, shown in the bottom-right corner of the IDE window, as it will be needed later for the Python code.

Don't forget to upload the code to the Arduino!


Code: Python

You're going to start by importing the necessary libraries, like so:

#import the necessary packages
import numpy as np
import cv2
import serial

From here we need to set up three things: the port the Arduino is connected to, the Arduino serial connection itself, and how far (in pixels) someone can be from the center in either direction and still count as centered. You will need to replace <Your Arduino Port Here> with the port name you noted earlier.

#port that the arduino is connected to, can be found in arduino IDE
arduino_port = '<Your Arduino Port Here>'
arduino = serial.Serial(port=arduino_port, baudrate=115200, timeout=0.01)

#sets how many pixels away from the center a person needs to be before the head stops
center_tolerance = 5

Next we initialize HOG, the part of OpenCV that we use to detect people (it's explained in a later step). We set the HOG descriptor to detect humans, which is one of the default options.

# initialize the HOG descriptor/person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

Then we start the video capture:

# open webcam video stream
cap = cv2.VideoCapture(0)

The last thing we do before starting the loop that detects people frame by frame is define the function that actually communicates with the Arduino. It writes a string, then reads back the reply.

def write_read(x):
    arduino.write(bytes(x, 'utf-8'))
    data = arduino.readline()
    return data

We then start a while loop and put the following inside it. The code:

  • captures a single frame from the camera, then resizes it so it processes faster
  • runs the frame through HOG, which outputs two x/y coordinates per detection: the opposite corners of a rectangle that encompasses the person
  • converts the detections into an array called boxes and initializes a list called centers
  • loops through each box, figuring out its distance from the center of the frame and its x coordinate relative to the center, and adds this data to the list
  • makes sure there is actually a box detected (so the code doesn't try to work with an empty list) before sorting the list by distance from the center
  • draws the rectangles by iterating through the sorted list; the first one (closest to the center) is drawn green and the others red
  • checks whether the person is left of the tolerance zone, inside it, or right of it (or whether nothing is detected) and sends the string "1" for left, "2" for center, "3" for right, and "0" for nothing
  • scales the image back up for better viewing and writes it to the window; finally, if the q key is pressed, the program stops

while(True):
    # Capture frame-by-frame
    ret, frame = cap.read()
    # resizing for faster detection
    frame = cv2.resize(frame, (140, 140))
    # detect people in the image
    # returns the bounding boxes for the detected objects
    boxes, weights = hog.detectMultiScale(frame, winStride=(1,1), scale = 1.05)
    boxes = np.array([[x, y, x + w, y + h] for (x, y, w, h) in boxes])
    centers = []
    for box in boxes:
        #get the distance from each box's center x coordinate to the center of the screen and add it to a list
        center_x = ((box[2]-box[0])/2)+box[0]
        x_pos_rel_center = (center_x-70)
        dist_to_center_x = abs(x_pos_rel_center)
        centers.append({'box': box, 'x_pos_rel_center': x_pos_rel_center, 'dist_to_center_x':dist_to_center_x})    
    if len(centers) > 0:
        #sorts the list by dist_to_center_x
        sorted_boxes = sorted(centers, key=lambda i: i['dist_to_center_x'])
        #draws the box
        center_box = sorted_boxes[0]['box']
        for box in range(len(sorted_boxes)):
        # display the detected boxes in the colour picture
            if box == 0:
                cv2.rectangle(frame, (sorted_boxes[box]['box'][0],sorted_boxes[box]['box'][1]), (sorted_boxes[box]['box'][2],sorted_boxes[box]['box'][3]), (0,255, 0), 2)
            else:
                cv2.rectangle(frame, (sorted_boxes[box]['box'][0],sorted_boxes[box]['box'][1]), (sorted_boxes[box]['box'][2],sorted_boxes[box]['box'][3]),(0,0,255),2)
        #retrieves the distance from center from the list, determines whether the head should turn left or right or stay put, and turns the lights on
        Center_box_pos_x = sorted_boxes[0]['x_pos_rel_center']  
        if -center_tolerance <= Center_box_pos_x <= center_tolerance:
            #turn on eye light
            print("center")
            result = write_read("2")
        elif Center_box_pos_x >= center_tolerance:
            #turn head to the right
            print("right")
            result = write_read("3")
        elif Center_box_pos_x <= -center_tolerance:
            #turn head to the left
            print("left")
            result = write_read("1")
        print(str(Center_box_pos_x))
    else:
        #prints out that no person has been detected
        result = write_read("0")
        print("nothing detected")
    #resizes the video so its easier to see on the screen
    frame = cv2.resize(frame,(720,720))
    # Display the resulting frame
    cv2.imshow("frame",frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

The last bit we need to add stops the camera from capturing video and then closes the window.

# When everything done, release the capture
cap.release()
# finally, close the window
cv2.destroyAllWindows()
cv2.waitKey(1)
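The left/center/right decision above can also be factored into a small helper (a hypothetical `position_to_command`, not part of the original script) so the serial protocol can be tested without a camera or an Arduino attached:

```python
# Map an x position (relative to the frame center, in pixels) to the
# single-character command string the Arduino sketch expects:
# "0" = nothing detected, "1" = left, "2" = centered, "3" = right
def position_to_command(x_pos_rel_center, center_tolerance=5):
    if x_pos_rel_center is None:
        # no person detected this frame
        return "0"
    if -center_tolerance <= x_pos_rel_center <= center_tolerance:
        # inside the tolerance zone: hold still
        return "2"
    return "3" if x_pos_rel_center > center_tolerance else "1"

print(position_to_command(-30))   # "1" (person is left of center)
print(position_to_command(2))     # "2" (centered)
print(position_to_command(None))  # "0" (nothing detected)
```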

Here is the final code:

#import the necessary packages
import numpy as np
import cv2
import serial

#port that the arduino is connected to, can be found in arduino IDE
arduino_port = '<Your_Port_Here>'


arduino = serial.Serial(port=arduino_port, baudrate=115200, timeout=0.01)
#sets how many pixels away from the center a person needs to be before the head stops
center_tolerance = 5
 
# initialize the HOG descriptor/person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())


cv2.startWindowThread()


# open webcam video stream
cap = cv2.VideoCapture(0)


def write_read(x):
    arduino.write(bytes(x, 'utf-8'))
    data = arduino.readline()
    return data


while(True):
    # Capture frame-by-frame
    ret, frame = cap.read()
    # resizing for faster detection
    frame = cv2.resize(frame, (140, 140))
    # detect people in the image
    # returns the bounding boxes for the detected objects
    boxes, weights = hog.detectMultiScale(frame, winStride=(1,1), scale = 1.05)
    boxes = np.array([[x, y, x + w, y + h] for (x, y, w, h) in boxes])
    centers = []
    for box in boxes:
        #get the distance from each box's center x coordinate to the center of the screen and add it to a list
        center_x = ((box[2]-box[0])/2)+box[0]
        x_pos_rel_center = (center_x-70)
        dist_to_center_x = abs(x_pos_rel_center)
        centers.append({'box': box, 'x_pos_rel_center': x_pos_rel_center, 'dist_to_center_x':dist_to_center_x})    
    if len(centers) > 0:
        #sorts the list by dist_to_center_x
        sorted_boxes = sorted(centers, key=lambda i: i['dist_to_center_x'])
        #draws the box
        center_box = sorted_boxes[0]['box']
        for box in range(len(sorted_boxes)):
        # display the detected boxes in the colour picture
            if box == 0:
                cv2.rectangle(frame, (sorted_boxes[box]['box'][0],sorted_boxes[box]['box'][1]), (sorted_boxes[box]['box'][2],sorted_boxes[box]['box'][3]), (0,255, 0), 2)
            else:
                cv2.rectangle(frame, (sorted_boxes[box]['box'][0],sorted_boxes[box]['box'][1]), (sorted_boxes[box]['box'][2],sorted_boxes[box]['box'][3]),(0,0,255),2)
        #retrieves the distance from center from the list, determines whether the head should turn left or right or stay put, and turns the lights on
        Center_box_pos_x = sorted_boxes[0]['x_pos_rel_center']  
        if -center_tolerance <= Center_box_pos_x <= center_tolerance:
            #turn on eye light
            print("center")
            result = write_read("2")
        elif Center_box_pos_x >= center_tolerance:
            #turn head to the right
            print("right")
            result = write_read("3")
        elif Center_box_pos_x <= -center_tolerance:
            #turn head to the left
            print("left")
            result = write_read("1")
        print(str(Center_box_pos_x))
    else:
        #prints out that no person has been detected
        result = write_read("0")
        print("nothing detected")
    #resizes the video so its easier to see on the screen
    frame = cv2.resize(frame,(720,720))
    # Display the resulting frame
    cv2.imshow("frame",frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break


# When everything done, release the capture
cap.release()
# finally, close the window
cv2.destroyAllWindows()
cv2.waitKey(1)

What Is HOG?

image_2022-08-08_145516854.png

Histogram of Oriented Gradients, or HOG for short, is an image-processing method that converts a given image into a vector representation, which is then fed into a machine-learning algorithm to classify the desired object and find its bounding box.
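As a very simplified illustration of that idea (one HOG "cell" only; real HOG also normalizes histograms over blocks of cells and slides a detection window across the image), here is how an orientation histogram can be computed with NumPy:

```python
import numpy as np

def hog_cell_histogram(cell, bins=9):
    """Orientation histogram for one cell of a grayscale image:
    per-pixel gradients -> angle and magnitude -> magnitude-weighted histogram."""
    gy, gx = np.gradient(cell.astype(float))
    magnitude = np.hypot(gx, gy)
    # unsigned orientation in [0, 180) degrees, as standard HOG uses
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180
    hist, _ = np.histogram(angle, bins=bins, range=(0, 180), weights=magnitude)
    return hist

# a cell containing a purely vertical edge: all gradient energy points
# horizontally, so every weighted vote lands in the 0-degree bin
cell = np.zeros((8, 8))
cell[:, 4:] = 255
hist = hog_cell_histogram(cell)
print(hist.argmax())  # 0
```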

Credits