N3XTSet: Optimizing Athlete Biomechanics With EMG + ML
by krishnamalhotra150 in Circuits > Wearables
Hi! My name is Krishna Malhotra and I'm a 16-year-old inventor from California! I'm passionate about biosensing and machine learning, and I decided to fuse these two interests into N3XTSet. For some background, I am currently working with UC Davis and a private neurology company, leveraging neuroimaging and neural networks/machine learning to decode the brain and diagnose neurodegenerative diseases!
Much of the methodology from those experiences inspired N3XTSet, along with my background as a competitive AAU basketball player. In this project I aimed to develop a wearable device that reads muscle activity from athletes while they jump. One thing I've learned from five years of competitive basketball is that athletes with poor athleticism/"bounce" may not go very far in their endeavors. As an athlete, I personally struggled with this, which inspired me to build this device. Many people blame their genetics, but in fact, according to many studies, their biomechanics are at fault. N3XTSet monitors the timing and coordinated activation of the different stages across many jump repetitions by collecting electromyography (EMG) data (see below), using ML to segment out these "stages," and providing personalized feedback with an LLM.
****
FULL CODE: https://drive.google.com/file/d/1L5e3OV3dN8c8g2kzPGZd-rp2OLtRBlv2/view?usp=sharing
****
Some terminology:
- LLM: Large Language Models - ChatGPT, Gemini, etc.
- Electromyography (EMG): Measures electrical activity of nerves controlling muscles. In this case, for simplicity, this project records data from the quadriceps, with activity derived from the femoral nerve. To be more specific, EMG detects action potentials from muscle cells when activated by this nerve.
Supplies
Hardware:
- MyoWare 2.0 (now MyoWare 2) Muscle Sensors
- Gel Electrode Pads (x3)
- MyoWare 2.0 Wireless Shield
- ESP32 WROVER (or any Bluetooth-enabled ESP32)
Software:
- Python (VS Code) + ML Tools like Matplotlib, PyTorch, NumPy
- Arduino IDE
Hardware Configuration



N3XTSet is built on the MyoWare 2.0 platform: the MyoWare 2.0 (now MyoWare 2) muscle sensor paired with the MyoWare 2.0 Wireless Shield. The shield communicates wirelessly over Bluetooth, allowing athletes to focus on performance instead of hundreds of wires!
All you need to do is snap the sensor onto the shield and attach the three gel electrodes with snap connectors. The gel is safe for the skin. The modules only snap together one way, so there's no risk of misassembly. The wireless shield already contains a rechargeable LiPo battery, activated via an on/off switch. And that's it! The electronics are done!
Case Design



The case was custom designed for this setup, making it easy to take apart and put back together. This required taking very precise measurements and designing the case to fit the parts: leaving a bottom opening for the triangular MyoWare 2.0 sensor, and choosing a depth that leaves the electrodes unimpeded. The case was printed in sections so the top cover can easily slide off, allowing electronic faults to be debugged without reaching for a screwdriver every time. In addition, a small module was printed to encase the switch and slide along a slot axis.
While the case could have been designed with minimal dimensions, in practical use I envisioned N3XTSet having a distinct, "high-tech" look. Its dome-like top accomplishes this and also improves the device's center of gravity by creating a more balanced shape, improving grip against the skin. The screws on the ends hold the removable top dome into the bulges on the sides, giving the case a complete look that athletes wouldn't be afraid to wear!
(Bonus) Cool Lighting!


The PLA filament used was layered thin enough to allow the lights to shine through, leveraging the fact that polylactic acid (PLA) is somewhat translucent.
Software Configuration
Device Setup:
The MyoWare wireless shield has an onboard ESP32 that handles Bluetooth communication. Upload the "device" code provided in the linked code to the shield. The one modification you need to make is renaming the device to whatever you want, as shown in the code segment below.
You will need to install the ArduinoBLE and MyoWare libraries from within the Arduino IDE. The MyoWare 2.0 is the "peripheral" device in the BLE system because it transmits the data.
Receiver Setup
Using an ESP32, upload the "central" code; this is our "receiver" or "central" device in the BLE system. The code segment below is especially important because it is where the Arduino writes the sensor values to Serial. You can format this text for any specific needs. A delay is included to limit the transmission rate; this is optional and can be tuned.
Data Collection/Usage
Next, we designed a simple interface to allow the recording of trials:
- Trials of 30 second recordings
- Sampling at 20 samples per second
- A responsive GUI to show data being recorded
- Marking the start and end of reps
The interface works by periodically reading the Serial port, defined using the pyserial library. You must set the correct port and baud rate for this to work (a common source of errors). The Tkinter interface runs for 30 seconds; during this time, when the "start" or "stop" button is clicked, the timestamp is recorded in a CSV file for later use in segmenting individual reps. The signal intensity at each timestep is saved as a 2D array and downloaded as a CSV file.
The code segment below shows how the data is obtained and plotted. It works by reading the latest line from the Serial stream (assuming it's UTF-8 encoded), parsing it, and plotting it. The rest of the code builds on this method, saving the data as it arrives.
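As a minimal sketch of that read loop (the full version is in the linked code; the function names here are my own, and the port object is an opened pyserial `serial.Serial` instance):

```python
def parse_emg_line(raw: bytes):
    """Decode one Serial line (assumed UTF-8) into a float sensor value."""
    try:
        return float(raw.decode("utf-8").strip())
    except (UnicodeDecodeError, ValueError):
        return None  # skip malformed or partial lines

def read_latest_sample(port):
    """Drain the buffer of an open pyserial `serial.Serial` object and
    return the newest valid EMG value, or None if nothing valid arrived."""
    value = None
    while port.in_waiting:          # consume everything queued so far
        v = parse_emg_line(port.readline())
        if v is not None:
            value = v               # keep only the most recent good sample
    return value
```

The port would be opened with something like `serial.Serial("/dev/ttyUSB0", 115200, timeout=1)` (port name and baud rate must match your receiver sketch), and `read_latest_sample` polled every 50 ms from Tkinter's `after()` callback to hit the 20 samples-per-second rate.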
Now, when the shield and ESP32 receiver are both on, they transfer data through the serial port and the GUI can easily work. Please see the entire code in the zip file.
In machine learning, it is important to overcome bias and data sparsity. Therefore, the experiments were designed to fulfill two tasks: segmenting individual reps, and identifying the different stages of a jump for analysis.
The experiments were designed as follows (TN denoting Trial N):
- T1-2: Good, Full-Force Jumps
- T3: Good Full-Force Jumps with Bad Landing
- T4: Good Full-Force Jumps with Too Long a Squat
- T5: Good Full-Force Jumps with Bad Squat
- T6: Low-Force Jumps with Good Form
- T7: Very Good
Currently, all data was collected from myself, with short breaks between trials.
Jump Segmentation



Deep Learning?
Now, many athletes train their jumps, and this would be the primary use of the product: monitoring exercises with many reps to understand an athlete's current jump biomechanics and how to improve them. To do this, individual "jump reps" from a workout must be segmented, a challenge that a heuristic solution couldn't handle accurately, since people jump very differently! Simply extracting reps based on minimums, maximums, and graph behaviors isn't sufficient because, in practice, a multitude of different behaviors come up. Hence, I implemented a deep learning approach leveraging a recurrent neural network (RNN).
RNN
The RNN is suited for temporal data such as an EMG signal because its "neurons" retain information from the past. The LSTM, a type of RNN, is especially good at this, using gates to manage the flow of information across a complex network. The image above is a visual representation of this.
The goal of the model was to predict a binary mask of the same sequence length as the signal to determine at which times the signal was a part of a rep. Here is a workflow of the model:
Pipeline
- Normalize the EMG signal between 0-1
- Encode the signal with an LSTM, producing a latent representation for each timestep that captures dependencies on the previous timesteps
- Use a linear projection to map each timestep's latent vector to a binary output
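The pipeline above can be sketched in PyTorch. Layer sizes, the 0.5 threshold, and the class name are illustrative assumptions; the actual model is in the linked code:

```python
import torch
import torch.nn as nn

class RepSegmenter(nn.Module):
    """Sketch of the rep-segmentation model: LSTM encoder + linear head
    producing one logit per timestep (hidden size is an assumption)."""
    def __init__(self, hidden_size=128):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # x: (batch, time, 1) EMG signal, normalized to [0, 1]
        h, _ = self.lstm(x)                 # (batch, time, hidden)
        return self.head(h).squeeze(-1)     # (batch, time) per-timestep logits

signal = torch.rand(1, 600, 1)              # 30 s at 20 Hz -> 600 timesteps
mask = torch.sigmoid(RepSegmenter()(signal)) > 0.5   # binary rep mask
```

Training such a model would use a per-timestep binary cross-entropy loss (e.g. `nn.BCEWithLogitsLoss`) against the hand-marked start/stop masks from the recording GUI.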
The images above show the outputs of the model (where orange is not a rep and blue is)
RNN Prediction of Jump Biomechanics
.png)


Now, a more difficult part: segmenting the different stages of the jump within an individual rep. This information can feed further algorithms, alongside an LLM, which I'll cover next.
I decided to break a jump into different stages that are important for optimal technique (thank you to Coach Christian for teaching me this ;)
- "Preload" phase - The "squat" before a jump where you quickly engage your muscles before the jump
- "On-ground explosion" - The quick explosion that is before you actually get in the air
- "Airtime" - This is where you are in the air with your muscles not engaged
- "Landing" - Arguably the most important part. A dynamic, squatted, landing to absorb the impact. Whatever goes up must come down
Then, using the reps segmented by the model above, I made a simple GUI to manually label these stages by analyzing the signals, annotating ~40 reps.
Following this, I developed another model which is diagrammed above:
- Encode the sequence using an LSTM
- Use an MLP (multilayer perceptron with linear and ReLU layers) to map the encoded representation at each timestep to 5 classes (the fifth being "nothing")
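A sketch of this second model (hidden sizes and the class ordering are assumptions; it would be trained with a per-timestep cross-entropy loss against the ~40 hand-labeled reps):

```python
import torch
import torch.nn as nn

class StageClassifier(nn.Module):
    """Sketch of the jump-stage model: LSTM encoder + MLP head mapping each
    timestep to 5 classes (preload, explosion, airtime, landing, nothing)."""
    def __init__(self, hidden=128, n_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(1, hidden, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden, 64), nn.ReLU(), nn.Linear(64, n_classes))

    def forward(self, x):
        # x: (batch, time, 1) normalized EMG for one segmented rep
        h, _ = self.lstm(x)     # (batch, time, hidden)
        return self.mlp(h)      # (batch, time, 5) per-timestep class logits

rep = torch.rand(1, 80, 1)                   # one segmented rep
stages = StageClassifier()(rep).argmax(-1)   # per-timestep stage labels
```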
The graphs show some sample outputs.
(Bonus) Interpret RNN Output

Using Principal Component Analysis (PCA), the series of vectors output from the LSTM (t = num_time_steps of them) can be visualized. Each 128-dimensional vector is reduced to two latent components and plotted timestep by timestep. A clear pattern emerges as the timesteps trace out what we call a "latent trajectory."
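As a sketch, this projection can be done with a plain NumPy SVD-based PCA (scikit-learn's `PCA` works equally well); the 128-dimensional size follows the hidden size above:

```python
import numpy as np

def latent_trajectory(hidden_states: np.ndarray) -> np.ndarray:
    """Project a (time, 128) sequence of LSTM hidden states onto its first
    two principal components, giving a 2-D "latent trajectory" to plot."""
    centered = hidden_states - hidden_states.mean(axis=0)
    # SVD-based PCA: rows of vt are the principal directions
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T      # (time, 2)

h = np.random.randn(80, 128)        # stand-in for LSTM outputs for one rep
traj = latent_trajectory(h)         # plot traj[:, 0] vs traj[:, 1]
```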
Gemini API
Extracting Features
Next, I used some simple heuristic logic to analyze the trends in the data, extracting meaningful features from each stage produced by the jump-stage segmentation model and organizing them into an ordered dictionary.
Rates
The plot is transformed into an average rate-of-change graph (an approximate derivative) to analyze the rates of all the stages:
- Ensuring the absolute maximum (on-ground explosion) is greater than a threshold
- Ensuring the greatest relative maximum (landing) is greater than a threshold
Times
This extracts the relative time of each stage as an approximation, assuming each timestep is 0.1 seconds apart.
General Feedback
- Detecting dips in activation during preload or explosion
- Max strength of on-ground-explosion
- Landing power is ~60% of max power
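A minimal sketch of this feature extraction (the stage-id mapping, key names, and helper function are illustrative assumptions, not the exact code):

```python
import numpy as np

DT = 0.1  # assumed seconds per timestep, as stated above

def extract_features(signal, labels):
    """Heuristic feature sketch. `signal` is the per-rep EMG envelope and
    `labels` the per-timestep stage ids (assumed: 0=preload, 1=explosion,
    2=airtime, 3=landing, 4=nothing). Returns a feature dictionary."""
    signal = np.asarray(signal, dtype=float)
    labels = np.asarray(labels)
    rate = np.gradient(signal, DT)      # approximate derivative of the plot
    feats = {}
    for stage, sid in [("preload", 0), ("explosion", 1),
                       ("airtime", 2), ("landing", 3)]:
        mask = labels == sid
        feats[f"{stage}_duration_s"] = round(float(mask.sum()) * DT, 2)
        if mask.any():
            feats[f"{stage}_peak"] = float(signal[mask].max())
    # landing power relative to the on-ground-explosion maximum (~60% check)
    if feats.get("explosion_peak"):
        feats["landing_to_explosion_ratio"] = round(
            feats.get("landing_peak", 0.0) / feats["explosion_peak"], 2)
    feats["max_rate"] = float(rate.max())
    return feats
```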
Here is an example output dictionary
Gemini API
Now, I'm no sports therapist, so the data above is uninterpretable for me. However, RAG-powered LLMs like Gemini may be able to interpret this data. Therefore, with some crafty prompt engineering and relevant data, your very own sports therapist can be integrated into this wearable monitoring device!
To get a free API key, create a new project in Google AI Studio and press "Get API key." It's just as simple as that, and completely free. Follow this video if you run into difficulties: https://www.youtube.com/watch?v=6BRyynZkvf0
The code below shows how you can get prompting right away in Python!
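As a sketch using the `google-generativeai` Python SDK (the model name and prompt wording here are my assumptions; the actual prompt used is shown later in this section):

```python
def build_prompt(features: dict) -> str:
    """Assemble a role / input / output-format / constraints prompt from the
    feature dictionary. Wording is illustrative, not the exact prompt used."""
    return (
        "You are an experienced sports therapist analyzing jump biomechanics.\n"
        f"EMG-derived features for one jump rep: {features}\n"
        "Respond with: (1) a one-line assessment, (2) two concrete drills.\n"
        "Do not give medical diagnoses."
    )

def ask_gemini(features: dict, api_key: str) -> str:
    """Send the prompt to Gemini (requires `pip install google-generativeai`
    and an API key from Google AI Studio)."""
    import google.generativeai as genai
    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-1.5-flash")  # model name is an assumption
    return model.generate_content(build_prompt(features)).text
```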
Prompt engineering is a very important topic in today's LLM-saturated world. Learning to use these powerful tools is integral to success. For a very specific task like analyzing jump biomechanics, you can't just say "look at this data and tell me if the athlete is good or not." There is a reason prompt engineering is actually a job now! A good prompt (there are far better ones out there) follows this structure:
- Role/Context
- Input
- How to respond (output format)
- What not to do
In my task, this was my prompt:
Interface Design




Next, I combined the output of the Gemini model with the labeled graph. Using Streamlit, I developed a web interface that currently, for simplicity, analyzes all the samples in the dataset (dataloader). Above are some of the different outputs, highlighting the advanced capabilities of the LLM and the RNN-MLP model.
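A minimal Streamlit sketch of this kind of interface (the data values, layout, and helper are placeholders, not the real app; run it with `streamlit run app.py`):

```python
def summarize(features: dict) -> str:
    """One-line caption shown above each rep's chart (illustrative)."""
    return ", ".join(f"{k}: {v}" for k, v in features.items())

def main():
    import streamlit as st               # pip install streamlit
    st.title("N3XTSet Jump Analysis")
    # Placeholder per-rep data; the real app iterates over the dataloader
    feats = {"explosion_peak": 0.82, "landing_to_explosion_ratio": 0.61}
    st.caption(summarize(feats))
    st.line_chart([0.1, 0.6, 0.82, 0.3, 0.61])   # per-timestep EMG envelope
    st.markdown("**Gemini feedback:** ...")       # LLM output goes here
```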
Testing!
