Project Page | arXiv | Video
Runsheng Wang1, Xinyue Zhu1, Ava Chen, Jingxi Xu, Lauren Winterbottom, Dawn M. Nilsen2, Joel Stein2, Matei Ciocarlie2
1Equal contribution, 2Co-Principal Investigators
Columbia University
ReactEMG is a zero-shot, low-latency EMG framework that segments forearm signals in real time to predict hand gestures at every timestep, delivering calibration-free, high-accuracy intent detection ideal for controlling prosthetic and robotic devices.
Clone the repo with --recurse-submodules and install our conda (mamba) environment on an Ubuntu machine with an NVIDIA GPU. We use Ubuntu 24.04 LTS and Python 3.11.
mamba env create -f environment.yml
Install PyTorch in the conda environment, then install wandb via pip:
pip install wandb
Lastly, install minLoRA via:
cd minLoRA && pip install -e .
minLoRA was built for editable installs via setup.py develop, which is deprecated. Consider enabling --use-pep517 and using setuptools ≥ 64 when working with pip ≥ 25.3.
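After installation, a quick sanity check (assuming PyTorch was installed with CUDA support) confirms the GPU is visible to PyTorch:
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"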
We are open-sourcing our sEMG dataset, ROAM-EMG.
- Scope: Using the Thalmic Myo armband, we recorded eight-channel sEMG signals from 28 participants as they performed hand gestures in four arm postures, followed by two grasping tasks and three types of arm movement. Full details of the dataset are provided in our paper and its supplementary materials.
- Download: Dropbox Link
For full reproducibility, we also provide pre-processed versions of all public EMG datasets used in the paper. The file structures and data formats have been aligned with ROAM-EMG. We recommend organizing all datasets under the data/ folder (automatically created with the command below) in the root directory of the repo. To download all datasets (including ROAM-EMG):
curl -L -o data.zip "https://www.dropbox.com/scl/fi/isj4450alriqjfstkna2s/data.zip?rlkey=n5sf910lopskewzyae0vgn6j7&st=vt89hfpj&dl=1" && unzip data.zip && rm data.zip
We use W&B to track experiments. Decide whether you want metrics online or offline:
# online (default) - set once in your shell
export WANDB_PROJECT=my-emg-project
export WANDB_ENTITY=<your-wandb-username>
# or completely disable
export WANDB_MODE=disabled
Use the following command to pre-train our model on EMG-EPN-612 and other public datasets:
python3 main.py \
--embedding_method linear_projection \
--use_input_layernorm \
--task_selection 0 1 2 \
--offset 30 \
--share_pe \
--num_classes 3 \
--use_warmup_and_decay \
--dataset_selection pub_with_epn \
--window_size 600 \
--val_patient_ids s1 \
--epn_subset_percentage 1.0 \
--model_choice any2any \
--inner_window_size 600 \
--exp_name <RUN_ID>
Replace <RUN_ID> with your desired name, and the script will save checkpoints to model_checkpoints/<RUN_ID>_<timestamp>_<machine_name>/epoch_<N>.pth, where <timestamp> records the run's start time and <machine_name> identifies the host. Ensure you have write permission in the directory where you launch the job.
You may also initialize weights from a saved checkpoint by adding --saved_checkpoint_pth path/to/epoch_X.pth to the training command. If you wish to fine-tune a model via LoRA, provide the flag --use_lora 1, in addition to the locally saved checkpoint path.
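For example, a LoRA fine-tuning launch might look like the sketch below; the checkpoint path and run name are placeholders following the naming scheme described above, and the remaining flags mirror the pre-training command:
python3 main.py \
--use_lora 1 \
--saved_checkpoint_pth model_checkpoints/<RUN_ID>_<timestamp>_<machine_name>/epoch_<N>.pth \
--exp_name <FINETUNE_RUN_ID> \
[other training args]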
To train EPN-only models for evaluation purposes, set --dataset_selection epn_only.
If this is your first time using W&B on your machine, you will be prompted to provide credentials:
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice:
Enter 2 to use your W&B account, and follow the prompts to provide your API key.
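If you have already created an account, you can also authenticate ahead of time from the command line with the standard W&B CLI:
wandb login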
Fine-tuning follows a leave-one-subject-out (LOSO) protocol. The helper script finetune_runner.sh trains a separate model for every subject in the ROAM-EMG dataset. Open finetune_runner.sh, set saved_checkpoint_pth to your pre-trained checkpoint path, and start LOSO fine-tuning via:
source finetune_runner.sh
We evaluate model performance on two metrics:
- Raw accuracy: Per-timestep correctness across the entire EMG recording
- Transition accuracy: Event-level score that captures accuracy and stability
During evaluation, we run the model exactly as it would run online: windows slide forward in real time and predictions are aggregated live. This gives a realistic view of online performance instead of an offline, hindsight-only score.
python3 event_classification.py \
--eval_task predict_action \
--files_or_dirs ../data/ROAM_EMG \
--allow_relax 0 \
--buffer_range 200 \
--stride 1 \
--lookahead 50 \
--weight_max_factor 1.0 \
--likelihood_format logits \
--samples_between_prediction 20 \
--maj_vote_range future \
--saved_checkpoint_pth <path_to_your_pth_checkpoint> \
--epn_eval 0 \
--verbose 1 \
--model_choice any2any
To remove all smoothing, set --stride 20, --lookahead 0, --samples_between_prediction 1, and --maj_vote_range single.
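For reference, the full ROAM-EMG evaluation command with all smoothing removed looks like this (the checkpoint path is still a placeholder):
python3 event_classification.py \
--eval_task predict_action \
--files_or_dirs ../data/ROAM_EMG \
--allow_relax 0 \
--buffer_range 200 \
--stride 20 \
--lookahead 0 \
--weight_max_factor 1.0 \
--likelihood_format logits \
--samples_between_prediction 1 \
--maj_vote_range single \
--saved_checkpoint_pth <path_to_your_pth_checkpoint> \
--epn_eval 0 \
--verbose 1 \
--model_choice any2any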
To evaluate EPN-only models, set --files_or_dirs ../data/EMG-EPN-612 and --epn_eval 1.
The evaluation code produces three outputs under output/:
- Summary txt: Overall raw & transition accuracy (mean Β± std), event counts, and a tally of failure reasons.
- Per-file JSON: Metrics plus full ground-truth & prediction sequences for each file.
- PNG plots: A 3-panel figure showing 8-channel EMG, ground-truth labels, and model predictions over time.
This repository includes experimental infrastructure for studying healthy-to-stroke few-shot adaptation for sEMG intent detection. The experiments evaluate different fine-tuning strategies when adapting a healthy-pretrained model to stroke participants.
Participants: p15, p20 (both left-hand stroke survivors)
Data Organization:
- Calibration pool: 4 baseline sets (open_1-4, close_1-4)
- Test conditions: mid-session baseline, end-session fatigue, unseen posture, sensor shift, orthosis-actuated
Fine-tuning Strategies:
- Zero-shot (frozen pretrained model)
- Stroke-only (train from scratch)
- Head-only (freeze backbone, train classification head only)
- LoRA (low-rank adaptation)
- Full fine-tuning (update all parameters)
First, verify all implementations work correctly:
cd reactemg/
python3 test_implementations.py
This tests repetition extraction and nested sampling, and verifies that all scripts exist.
Run the full main experiment for all participants and variants:
cd reactemg/
python3 run_main_experiment.py
This will:
- Evaluate pretrained model zero-shot on stroke data
- Run 4-fold CV hyperparameter search for each variant (36 configs × 4 folds)
- Train final models with best hyperparameters on full calibration pool
- Evaluate all models on 5 test conditions
- Save results to results/main_experiment/
Expected runtime: ~40-100 GPU hours (1,152 training runs, consistent with 2 participants × 4 trainable variants × 36 configs × 4 CV folds)
To run CV search for a specific variant only:
cd reactemg/
python3 cv_hyperparameter_search.py \
--participant p15 \
--participant_folder ~/Workspace/myhand/src/collected_data/2025_12_04 \
--variant lora \
--pretrained_checkpoint /path/to/pretrained.pth
Variants: stroke_only, head_only, lora, full_finetune
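To sweep every variant for one participant without the full orchestration script, a small shell loop over the same command works; the paths below mirror the example above and should be adjusted to your setup:
for variant in stroke_only head_only lora full_finetune; do
    python3 cv_hyperparameter_search.py \
        --participant p15 \
        --participant_folder ~/Workspace/myhand/src/collected_data/2025_12_04 \
        --variant "$variant" \
        --pretrained_checkpoint /path/to/pretrained.pth
done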
Results saved to: temp_cv_checkpoints/p15_lora_cv_results.json
After identifying the best variant from main experiment:
cd reactemg/
python3 run_data_efficiency.py \
--participant p15 \
--variant lora \
--config_file temp_cv_checkpoints/p15_lora_cv_results.json
This evaluates performance with K=1,4,8 paired repetitions (12 trials per K).
Results saved to: results/data_efficiency/p15/
Track convergence and potential catastrophic forgetting:
cd reactemg/
python3 run_convergence.py \
--participant p15 \
--variant lora \
--config_file temp_cv_checkpoints/p15_lora_cv_results.json
This trains for 10× the optimal epochs and evaluates each epoch on both stroke test sets and healthy s15 data.
Results saved to: results/convergence/p15/
Freeze all parameters except the action prediction head:
python3 main.py \
--freeze_backbone 1 \
--saved_checkpoint_pth /path/to/pretrained.pth \
[other training args]
Low-rank adaptation with fixed hyperparameters (rank=16, alpha=8):
python3 main.py \
--use_lora 1 \
--lora_rank 16 \
--lora_alpha 8 \
--lora_dropout_p 0.05 \
--saved_checkpoint_pth /path/to/pretrained.pth \
[other training args]
Update all parameters:
python3 main.py \
--saved_checkpoint_pth /path/to/pretrained.pth \
[other training args]
Omit the pretrained checkpoint:
python3 main.py \
[training args without --saved_checkpoint_pth]
Searched parameters (36 combinations):
- Learning rate: [1e-5, 5e-5, 1e-4, 5e-4]
- Epochs: [5, 10, 15]
- Dropout: [0.1, 0.3, 0.5]
Fixed parameters:
- Batch size: 128
- Window size: 600
- LoRA rank: 16
- LoRA alpha: 8
- LoRA dropout: 0.05
Selection criteria:
- Primary: Highest average transition accuracy across 4 CV folds
- Tiebreaker: Fewer epochs
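To make the search concrete, here is a minimal Python sketch of how the 36-point grid and the selection rule fit together; run_cv is a hypothetical placeholder, not the actual interface of cv_hyperparameter_search.py:
from itertools import product

learning_rates = [1e-5, 5e-5, 1e-4, 5e-4]
epoch_options = [5, 10, 15]
dropouts = [0.1, 0.3, 0.5]

# 4 x 3 x 3 = 36 candidate configurations
grid = list(product(learning_rates, epoch_options, dropouts))
assert len(grid) == 36

def run_cv(lr, epochs, dropout):
    # Hypothetical stand-in: would launch 4-fold CV training for this config
    # and return the mean transition accuracy across folds.
    return 0.0

# Primary criterion: highest mean transition accuracy across folds;
# tiebreaker: fewer epochs (hence the negated epoch count).
scored = [(run_cv(lr, ep, dr), -ep, (lr, ep, dr)) for lr, ep, dr in grid]
best_score, _, best_config = max(scored)
print(best_config)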
All evaluations use consistent parameters:
--buffer_range 800
--lookahead 100
--samples_between_prediction 100
--allow_relax 1
--stride 1
--likelihood_format logits
--maj_vote_range future
results/
├── main_experiment/
│   ├── p15/
│   │   ├── zero_shot/
│   │   │   ├── mid_session_baseline/
│   │   │   ├── end_session_baseline/
│   │   │   ├── unseen_posture/
│   │   │   ├── sensor_shift/
│   │   │   └── orthosis_actuated/
│   │   ├── stroke_only/
│   │   ├── head_only/
│   │   ├── lora/
│   │   └── full_finetune/
│   └── p20/ [same structure]
│
├── data_efficiency/
│   ├── p15/
│   │   ├── K1/ [12 trials + aggregated_metrics.json]
│   │   ├── K4/
│   │   └── K8/
│   └── p20/
│
└── convergence/
    ├── p15/
    │   ├── frozen_baseline/
    │   ├── epoch_1/ ... epoch_N/
    │   └── convergence_curves.json
    └── p20/
model_checkpoints/
├── main_experiment/
│   ├── p15_stroke_only_final.pth
│   ├── p15_head_only_final.pth
│   ├── p15_lora_final.pth
│   └── p15_full_finetune_final.pth
├── data_efficiency/
│   └── p15/ [K1_trial0.pth ... K8_trial11.pth]
└── convergence/
    └── p15/ [epoch_1.pth ... epoch_N.pth]
For custom evaluation scripts, use the programmatic API:
from event_classification import evaluate_checkpoint_programmatic
metrics = evaluate_checkpoint_programmatic(
    checkpoint_path='model.pth',
    csv_files=['p15_open_1.csv', 'p15_close_1.csv'],
    buffer_range=800,
    lookahead=100,
    samples_between_prediction=100,
    allow_relax=1,
    stride=1,
    model_choice="any2any",
    verbose=0,
)
print(f"Transition Accuracy: {metrics['transition_accuracy']:.4f}")
print(f"Raw Accuracy: {metrics['raw_accuracy']:.4f}")
Core modifications:
- main.py: Added --freeze_backbone flag for head-only fine-tuning
- dataset.py: Added sampled_segments parameter for data efficiency experiments
- event_classification.py: Added evaluate_checkpoint_programmatic() function
New utilities:
- dataset_utils.py: Repetition extraction and nested sampling functions
- cv_hyperparameter_search.py: 4-fold CV with automated hyperparameter selection
- run_main_experiment.py: Complete main experiment orchestration
- run_data_efficiency.py: Data efficiency experiment (K=1,4,8)
- run_convergence.py: Convergence and catastrophic forgetting tracking
- test_implementations.py: Validation test suite
"No checkpoint found":
- Verify training completed successfully
- Check model_checkpoints/ directory exists
- Ensure epoch number matches configuration
"Test data not found":
- Verify participant data paths in script configuration
- Check file naming patterns match (e.g., p15_open_1.csv)
- Ensure all baseline and test files exist
CUDA out of memory:
- Reduce batch size (default: 128)
- Run experiments sequentially instead of parallel
- Use gradient accumulation (requires code modification; see the sketch below)
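As a rough illustration of the gradient-accumulation modification mentioned above, here is a generic PyTorch sketch with toy stand-ins for the model, optimizer, and data loader; it is not code from this repo:
import torch
import torch.nn as nn

# Toy stand-ins; in practice these come from the repo's training setup.
model = nn.Linear(8, 3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
dataloader = [(torch.randn(16, 8), torch.randint(0, 3, (16,))) for _ in range(8)]

accumulation_steps = 4  # effective batch size = 16 * 4 = 64

optimizer.zero_grad()
for step, (inputs, labels) in enumerate(dataloader):
    logits = model(inputs)
    # Scale the loss so gradients are averaged over the accumulated mini-batches
    loss = loss_fn(logits, labels) / accumulation_steps
    loss.backward()
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()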
If you find this codebase useful, consider citing:
@misc{https://doi.org/10.48550/arxiv.2506.19815,
doi = {10.48550/ARXIV.2506.19815},
url = {https://arxiv.org/abs/2506.19815},
author = {Wang, Runsheng and Zhu, Xinyue and Chen, Ava and Xu, Jingxi and Winterbottom, Lauren and Nilsen, Dawn M. and Stein, Joel and Ciocarlie, Matei},
keywords = {Robotics (cs.RO), FOS: Computer and information sciences},
title = {ReactEMG: Zero-Shot, Low-Latency Intent Detection via sEMG},
publisher = {arXiv},
year = {2025},
copyright = {Creative Commons Attribution 4.0 International}
}
For questions or support, please email Runsheng at runsheng.w@columbia.edu
This project is released under the MIT License; see the License file for full details.
This work was supported in part by an Amazon Research Award and the Columbia University Data Science Institute Seed Program. Ava Chen was supported by NIH grant 1F31HD111301-01A1. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the sponsors. We would like to thank Katelyn Lee, Eugene Sohn, Do-Gon Kim, and Dilara Baysal for their assistance with the hand orthosis hardware. We thank Zhanpeng He and Gagan Khandate for their helpful feedback and insightful discussions.