The Restorer of Equilibrium
by SpinyGiraffe in Circuits > Arduino
Science fiction novels that weave tales of dystopia and control are typically centered around a core mechanism utilized to control the population. The two most iconic examples are perhaps the devices found within 1984 and Brave New World. 1984 has its constant war and perpetual surveillance. Brave New World has the drug of never-ending happiness: Soma. For this project, we shall be examining a foundational text that precedes both entries: We, by Yevgeny Zamyatin. The mechanism of control utilized in We is centered around a state of perfect equality. Individuals cannot be differentiated from one another. Their emotions are muted and non-existent, their physical appearance is of the same form, and vice is washed away. The human is merely reduced to an alphanumeric, slogging through the day alongside its compatriots in a state of perfect equilibrium. Society functions in a neutral state.
This project seeks to create a device that induces a rudimentary form of that state of mind. Termed “The Restorer of Equilibrium”, it consists of a mannequin head and a wooden hand controlled by an Arduino microcontroller. Its core function is to utilize conversation and a game of rock-paper-scissors to return the user to a neutral state. In the conversation state, the device asks the user “How are you doing today?”. Upon receiving a return statement, the device applies emotional analysis with a fine-tuned BERT model, detects the emotion, and replies with a voice-line that opposes the detected emotion and a facial expression that similarly parallels the voice-line. Facial expressions are generated using an array of LEDs and a shift-register. For instance, if the user states “I am doing great today!”, the device recognizes the emotion to be happy, and replies with a sad voice-line and a frowning expression.
The device then forces the user into the rock-paper-scissors phase. It asks the user to play and uses a convolutional neural network to detect which move the user has played. It then plays that same move with its wooden hand, which is driven by two servo motors. Because every round ends in a tie, the user experiences neither the joy of winning nor the sadness of losing.
In both instances, the device attempts to force the user into a neutral state. Furthermore, it does so through activities and interactions that are typically familiar to the average person. It is a small representation of a possible device utilized to attain the societal state found in Zamyatin’s We.
Supplies
Head (Conversation Component)
· Foam mannequin head
· 74HC595 shift-register
· Red LED x8
· 220 Ohm Resistors x8
· Miniature Bluetooth Speaker
· Male-to-Female Jumper Wires
· Female-to-Female Jumper Wires
· Cardboard
Hand (Rock-Paper-Scissors)
· Wooden hand with moveable joints
· Fishing line
· Straws
· MG996R Servo motor x2
· 6V 2.5A Power source
· A small box (12cm x 7cm in my case)
· Reel device x2
· Wooden Base
The central control unit of these electronics will be a computer. Speed of response is dependent on the processing power of the computer and the inclusion of a GPU.
Procedure: Python Software
All code mentioned can be found here (https://github.com/tyxiang0530/RestorerOfEquilibrium/tree/main/FullCode)
Let us examine the main loop of the program (pictured above):
At each decision point, the software actively delves into the branch of the decision tree that places the user in a neutral response. Furthermore, at multiple decision points the user is forced into a specified state. To accomplish this, we need to write multiple software components prior to beginning work with hardware. Required components include multiple pre-processing functions, a speech-to-text function, an Arduino connection function, a neural network for emotional analysis, and a neural network for rock-paper-scissors recognition. The speech-to-text function converts user speech to text that can be processed by the emotional analysis model. The Arduino connection function sends bytes to the Arduino based on user actions so that the hardware can perform the appropriate responses.
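The decision logic described above can be sketched roughly as follows. This is a minimal illustration rather than the code in the repository: `run_round` and its four parameters (`listen`, `classify_emotion`, `see_hand`, `send`) are hypothetical stand-ins for the speech-to-text function, the emotional analysis model, the rock-paper-scissors recognizer, and the Arduino connection function. Falling back to a neutral emotion when speech processing fails is also an assumption.

```python
# Minimal sketch of one pass through the main loop. All names here are
# hypothetical stand-ins, not the identifiers used in the repository.

# Expression shown for each detected emotion: always the opposing one.
OPPOSING_EXPRESSION = {
    "joy": "frown",
    "sad": "smile",
    "fear": "smile",
    "anger": "smile",
    "neutral": "neutral",
}


def run_round(listen, classify_emotion, see_hand, send):
    """Run one conversation phase followed by one rock-paper-scissors phase.

    listen()              -> text of what the user said
    classify_emotion(txt) -> one of the emotion labels above
    see_hand()            -> "rock", "paper" or "scissors"
    send(command)         -> forwards a command to the hardware
    """
    # Conversation phase: answer with the expression that opposes the emotion.
    try:
        emotion = classify_emotion(listen())
    except Exception:
        # Assumed fallback: treat any speech or model failure as neutral.
        emotion = "neutral"
    expression = OPPOSING_EXPRESSION.get(emotion, "neutral")
    send(("face", expression))

    # Game phase: mirror the user's move so every round ends in a tie.
    move = see_hand()
    send(("hand", move))
    return expression, move


if __name__ == "__main__":
    sent = []
    print(run_round(lambda: "I am doing great today!",
                    lambda text: "joy",
                    lambda: "rock",
                    sent.append))
    print(sent)
```

Passing the four components in as callables keeps the control flow testable without a microphone, webcam, or Arduino attached.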
Let us first examine the neural networks. I train the emotional analysis model by fine-tuning BERT on a concatenated dataset that consists of the High Valence Fairy Tales Dataset, Daily Dialog Dataset, ISEAR Dataset, and EmoStim Dataset. All datasets used for training can be found at the following location: (https://github.com/tyxiang0530/RestorerOfEquilibrium/tree/main/datasets).
The trained model has an F1 score of 84%, which is quite high for multi-class emotional analysis models. The complete training guide with extensive comments and explanations can be found here: https://github.com/tyxiang0530/RestorerOfEquilibrium/blob/main/RestorerEmotionTrain.ipynb
I train a convolutional neural network for the rock-paper-scissors recognizer because this architecture achieves high accuracy on image processing tasks. The rock-paper-scissors recognizer is trained on an image dataset created by Laurence Moroney that can be conveniently loaded from TensorFlow. This model achieves an F1 score of 90%. The complete training guide with extensive comments and explanations can be found here: https://github.com/tyxiang0530/RestorerOfEquilibrium/blob/main/RestorerRPSTrain.ipynb
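Both models are summarized with an F1 score, so it may help to see how such a score is computed for a multi-class problem. The sketch below computes a per-class F1 and its unweighted (macro) average. Whether the notebooks report a macro, micro, or weighted average is not stated here, so treat the averaging choice as an assumption. The labels in the example are made up purely for illustration and are not results from this project.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1} where F1 = 2*TP / (2*TP + FP + FN) for each label."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Toy labels, invented for this example only.
    truth = ["joy", "joy", "neutral", "neutral", "sad", "sad"]
    guess = ["joy", "neutral", "neutral", "neutral", "sad", "sad"]
    print(f1_per_class(truth, guess))
    print(round(macro_f1(truth, guess), 3))
```

In the toy data, one "joy" utterance is misread as "neutral", which lowers the F1 of both classes. This is the same kind of confusion that pulls down a multi-class score in practice.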
The remainder of the Python software is split into two classes: ProcessSpeech handles all speech-related aspects and the emotional analysis evaluation, while DetectPlay handles all rock-paper-scissors image pre-processing and the rock-paper-scissors evaluation.
ProcessSpeech utilizes PyAudio to open the microphone to user input and uses the Google Cloud API to convert the user’s speech to text. It then runs the emotional analysis model on this text and returns the emotion. DetectPlay opens the webcam to capture the shape of the user’s hand and converts the image to a valid input size for the rock-paper-scissors recognizer. It then runs the recognizer on this input and returns what the user has played.
Procedure – Hardware and Arduino Software (HEAD)
To build the head, first wire all eight LEDs to the shift register with jumper wires of adequate length and solder those components to a circuit board. Connect the clock pin, latch pin, and data pin to the Arduino. Take your mannequin head and hollow out a cavity that provides the jumper wires and LEDs a path to the mannequin’s mouth and then cut out the mannequin’s mouth. The cross-sectional view of the electronics in the head is displayed above.
Upon running the LEDs and jumper wires through the cavity and into the mouth, remove the LEDs from the jumper wires. Embed the LEDs into a piece of cardboard so they are fixed in place before reattaching the LEDs. Arrange the LEDs in the configuration outlined in the second figure above.
In your Arduino IDE of choice, define three byte patterns: one that activates LEDs 1, 2, 3, 4, 5, and 6 (smile), one that activates LEDs 3, 4, 5, and 6 (neutral), and one that activates LEDs 3, 4, 5, 6, 7, and 8 (frown). The three possible expressions are displayed above.
(From left to right: Happy, Sad, Neutral)
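To work out the three byte patterns before typing them into the Arduino sketch, the LED numbers can be turned into bits. The helper below is a small illustration written in Python (the language of the rest of the software) and is not part of the repository. It assumes LED n is wired to bit n-1 of the shift register, so LED 1 is the least significant bit. If your wiring or shift order is reversed, the bit positions will be mirrored.

```python
def led_pattern(leds):
    """Return the byte whose set bits correspond to the given LED numbers (1-8).

    Assumption: LED n is driven by bit n-1 of the byte shifted into the 74HC595.
    """
    pattern = 0
    for n in leds:
        if not 1 <= n <= 8:
            raise ValueError(f"LED number out of range: {n}")
        pattern |= 1 << (n - 1)
    return pattern


# The three expressions, using the LED numbers given in the text.
SMILE = led_pattern([1, 2, 3, 4, 5, 6])
NEUTRAL = led_pattern([3, 4, 5, 6])
FROWN = led_pattern([3, 4, 5, 6, 7, 8])

if __name__ == "__main__":
    for name, value in [("smile", SMILE), ("neutral", NEUTRAL), ("frown", FROWN)]:
        # Print each pattern in binary so it can be copied as a B-style constant.
        print(f"{name:8s} {value:3d}  0b{value:08b}")
```

Under this bit ordering, LEDs 3 to 6 are shared by all three expressions, and only the outer pairs (1, 2 or 7, 8) change between smile and frown.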
Procedure – Hardware and Arduino Software: Hand and Arduino Code
To build the hand, first mount the wooden hand onto your baseplate of choice. Hot-glue two-centimeter straws to the tip of each finger and attach a loop of fishing line to the straw. Then, attach the two servo motors approximately seven centimeters in front of the hand. Each servo should be horizontally orientated between two fingers, with the first servo motor sitting between the pinky and ring finger and the second servo sitting between the middle and index finger. The two servo motors need to be connected to the 6V 2.5A power source as MG996R servos draw large amounts of current to achieve maximum torque. Glue your box to the baseplate so that it sits between the hand and the servos. Your box should be the same height as your servo motors.
An overhead view of the orientation is provided above.
Hot glue a two-centimeter piece of straw to the top of the box so that each straw aligns with a finger and run the fishing line through the straw so that the straw acts as a guide. Then, wrap the line from the pinky and ring finger onto its corresponding reel, and do the same for the line running from the index and middle finger. Wrap a few pieces of tape on the outer edges of the reel to prevent the fishing line from slipping out of the reel.
An overhead view of the reel system is provided above.
A close-up of the hand is provided above.
Each possible hand configuration is shown above (From left to right: Scissors configuration, rock configuration, paper configuration)
The rock-paper-scissors model recognizes the handshape of the user and sends this information to the Arduino. A visual display of what the neural network sees and interprets is shown above:
(CV2 frame of rock-paper-scissors recognition. This display frame has been removed in the final iteration to increase processing speed.)
In your Arduino IDE of choice, designate a function that turns Servo1 180 degrees and another that turns Servo2 180 degrees. The hand should be set to a starting position of all fingers pointing upwards in a flat plane. To play paper, simply do not alter the position of the hand. To play scissors, rotate Servo1, curling in the pinky and the ring finger. To play rock, rotate Servo1 and Servo2, curling in the pinky, ring, index, and middle finger.
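The mapping from a move to servo positions can be written down as a small lookup table. The sketch below is in Python for consistency with the rest of the software and only models the logic. The names `SERVO_ANGLES` and `curled_fingers` are hypothetical. The 0-degree rest angle is an assumption, and the real angles depend on how the reels are mounted.

```python
# (Servo1 angle, Servo2 angle) in degrees for each move.
# Assumption: 0 is the resting position with all fingers extended.
SERVO_ANGLES = {
    "paper": (0, 0),        # leave the hand flat
    "scissors": (180, 0),   # Servo1 curls pinky and ring finger
    "rock": (180, 180),     # both servos curl all four fingers
}

# Fingers pulled by each servo's reel, as described in the build steps.
SERVO_FINGERS = (("pinky", "ring"), ("middle", "index"))


def curled_fingers(move):
    """Return the set of fingers that end up curled for a given move."""
    curled = set()
    for angle, fingers in zip(SERVO_ANGLES[move], SERVO_FINGERS):
        if angle > 0:
            curled.update(fingers)
    return curled


if __name__ == "__main__":
    for move in ("paper", "scissors", "rock"):
        print(move, SERVO_ANGLES[move], sorted(curled_fingers(move)))
```

Keeping the table separate from the motor code makes it easy to adjust the angles if the fishing line stretches or the reels are repositioned.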
The Arduino software reads the bytes sent over the serial connection by the Python code and controls the hardware. Based on the outcome of emotional analysis and the outcome of the rock-paper-scissors recognizer, the Arduino activates the corresponding hardware responses. The frown byte pattern is activated by a positive emotion (joy), the smile byte pattern is activated by a negative emotion (fear, sad, anger), and the neutral byte pattern is activated by a neutral emotion. The rock-paper-scissors hand plays rock when the user plays rock, scissors when the user plays scissors, and paper when the user plays paper.
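One way the Python side might package these decisions into bytes is shown below. The one-byte command codes are hypothetical, and the values actually expected by ArduinoResponse.ino may differ. `FakePort` is a stand-in used only so that the example runs without hardware. With pySerial, a `serial.Serial` object offers a `write` method that accepts bytes in the same way.

```python
# Hypothetical one-byte command codes; check ArduinoResponse.ino for the
# values the real sketch expects.
FACE_CODES = {"smile": b"s", "neutral": b"n", "frown": b"f"}
HAND_CODES = {"rock": b"R", "paper": b"P", "scissors": b"S"}


def encode_commands(expression, move):
    """Return the two bytes to send: one for the face, one for the hand."""
    return FACE_CODES[expression] + HAND_CODES[move]


class FakePort:
    """Stand-in for a serial port that simply records what was written."""

    def __init__(self):
        self.buffer = b""

    def write(self, data):
        self.buffer += data
        return len(data)


if __name__ == "__main__":
    port = FakePort()
    # A joyful user who played rock gets a frown and a mirrored rock.
    port.write(encode_commands("frown", "rock"))
    print(port.buffer)
```

Using distinct letters for the face and hand commands lets the Arduino tell the two apart even if a byte is dropped.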
Code for Arduino components can be found here: (https://github.com/tyxiang0530/RestorerOfEquilibrium/blob/main/Arduino/ArduinoResponse.ino)
A circuit diagram of the completed Restorer of Equilibrium is shown above.
Results
Individual rock-paper-scissors configurations can be found here:
https://drive.google.com/drive/folders/1ciwg9Ok4i3...
The main loop functions as expected and without error. The user converses with the device and is then forced into a game of rock-paper-scissors. All possible deviations from the above decision tree are caught through try-except blocks. Overall, the device is successful. However, there are some issues in both software and hardware. Software issues arise from the two deep learning classifiers, which are not error-free. Notably, the emotional analysis classifier often confuses neutral and joy and has a lower F1 score than the rock-paper-scissors recognizer (84% vs 90%). However, these scores are limited by the availability of training data and by the confusion between categories that comes with multi-class classification problems.
Hardware issues stem from the hand mechanism. Servo motors retain their initial angle of rotation, which makes it difficult to write a loop that resets the hand to its original position. The user must therefore reset the hand manually, leaving one portion of the main loop that is not automatic. Other, less noticeable issues included connection problems with the LEDs. These were especially bothersome to remedy because the jumper wires ran deep into the cavity of the head, making re-wiring difficult.
Conclusions
In this report I demonstrate a rudimentary device that restores the user to a state of neutral emotion. The rapid evolution of machine learning and neural network technology, as well as the accessibility of hobby electronics, has enabled individuals to replicate mechanisms outlined in the science-fiction literature of the early 1900s. This device is intended to represent one possible manner of control for achieving a society like that of We, by Yevgeny Zamyatin.
There is also a multitude of possible improvements. Due to the imprecision of the servo motors, the user is required to reset the hand mechanism to its original state. Possible fixes include replacing the servo motors with stepper motors, or using electromagnets that move the fingers by activating and de-activating a magnetic field.
Furthermore, we are currently in a period of data explosion, in which new datasets are being released at a staggering pace. With an increase in tagged training sets for emotional analysis, the accuracy of our software for emotional detection can also be increased.
I think this device also displays the current accessibility of artificial intelligence and electronics. Although my laptop struggled to process at a high speed, I was able to train my models using Google Colab's free cloud-based GPUs. Furthermore, the public availability of pre-trained models such as BERT saves a great deal of training time and puts previously unachievable benchmarks within reach.
Acknowledgements
Thanks to Professor Scott Tan and Kylie Thompson for their help on the project and general electronics work.
Code Resources and Libraries
Google Cloud API Speech to Text: https://cloud.google.com/speech-to-text/
HuggingFace Fine-tuning of a BERT Model: https://huggingface.co/transformers/training.html
TensorFlow Training a Convolutional Neural Network: https://www.tensorflow.org/tutorials/images/cnn
Image Processing with CV2: opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_table_of_contents_imgproc/py_table_of_contents_imgproc.html