Minimal loop + DTW: update data_loader; add smoke test #17
Conversation
AmitMY left a comment
looks ok, left a few comments.
So what I understand is that you are unable to overfit to a small dataset. |
In this update I focused on stabilizing the training/inference pipeline and diagnosing the long-standing visualization issues. Over the past month I repeatedly encountered inconsistencies between the 586-joint source format and the 178-joint reduced skeleton used for training: GT visualizations were sometimes shifted, flipped, or distorted depending on whether reduce_holistic() was applied, and several mapping mismatches (e.g., wrong component order, inconsistent mean/std shapes, and mismatched header joints) caused predicted skeletons to appear stretched or collapsed. At this stage I still have several open questions about the remaining prediction distortions.
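For future debugging, mismatches like these can often be caught early with a shape sanity check. The sketch below is illustrative only: the joint counts are from this PR, but the index map and the slice standing in for reduce_holistic(), as well as the mean/std computation, are assumptions rather than the repo's actual code.

```python
import numpy as np

# Illustrative sanity check for the 586 -> 178 joint reduction described above.
# The `keep` index map and the slice are stand-ins for the real reduce_holistic().
FULL_JOINTS, REDUCED_JOINTS, C = 586, 178, 3

pose = np.random.randn(10, FULL_JOINTS, C)   # [T, J, C] full skeleton
keep = np.arange(REDUCED_JOINTS)             # placeholder joint index map
reduced = pose[:, keep, :]                   # stand-in for reduce_holistic()

# Normalization stats must match the *reduced* skeleton, not the source format,
# otherwise predictions come out stretched or collapsed.
mean = reduced.mean(axis=0)                  # [J, C]
std = reduced.std(axis=0)                    # [J, C]
assert reduced.shape[1:] == (REDUCED_JOINTS, C)
assert mean.shape == std.shape == (REDUCED_JOINTS, C)
```

A check like this, run once at data-loading time, turns a silently distorted skeleton into an immediate assertion failure.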
Great! Now it is time to run a full training on the entire dataset.
Lint warnings are about "too many arguments" (R0913/R0917) in |
Fix the lint issues, and I'd merge. You may suppress
Fixed lint, suppressed R0913 and R0917 in pyproject.toml, and added ignore-paths for translation.




This PR adds a minimal training/validation loop that uses masked MSE and a validation-time DTW sanity metric.
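As a rough sketch of the two ingredients, assuming numpy and function names/signatures of my own choosing (not the PR's actual helpers): masked MSE averages squared error only over valid frames, and the DTW sanity metric aligns predicted and reference trajectories with the classic dynamic-programming recursion.

```python
import numpy as np

def masked_mse(pred, gt, mask):
    # pred, gt: [B, T, J, C]; mask: [B, T] with 1 for valid (non-padded) frames.
    diff2 = (pred - gt) ** 2
    m = mask[:, :, None, None]                       # broadcast over J, C
    denom = m.sum() * pred.shape[2] * pred.shape[3]  # valid frames * J * C
    return (diff2 * m).sum() / (denom + 1e-8)

def dtw_distance(a, b):
    # a: [Ta, D], b: [Tb, D] flattened pose sequences; O(Ta*Tb) DP table.
    Ta, Tb = len(a), len(b)
    D = np.full((Ta + 1, Tb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[Ta, Tb]
```

Both return 0 for a perfect prediction, which makes them convenient for the smoke test: overfitting a tiny batch should drive masked MSE toward zero while DTW on the validation clip shrinks accordingly.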
I ran into a few problems while implementing this:
In the loop I convert BTJC → BJCT before calling the model, and then convert the model output BJCT → BTJC after the forward pass. The model was implemented around BJCT (time last), while our masks, zero_pad_collator, and loss/metrics (masked MSE + DTW) are written for BTJC (time at dim=1). So inputs/past are permuted to match the model, and predictions are permuted back so losses operate over [B, T, J, C].
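The round trip amounts to two transposes. A minimal sketch of the axis orders involved (with an identity function standing in for the model, and numpy in place of the actual tensor library):

```python
import numpy as np

B, T, J, C = 2, 16, 178, 3
x_btjc = np.random.randn(B, T, J, C)

# BTJC -> BJCT before the forward pass (time moves to the last axis).
x_bjct = np.transpose(x_btjc, (0, 2, 3, 1))

y_bjct = x_bjct  # identity stand-in for the model's forward pass

# BJCT -> BTJC afterwards, so masked MSE / DTW see [B, T, J, C] again.
y_btjc = np.transpose(y_bjct, (0, 3, 1, 2))
assert y_btjc.shape == (B, T, J, C)
assert np.array_equal(y_btjc, x_btjc)
```

Since transposes are views (no copy), the cost is negligible; the concern is really about where the convention boundary should live, not performance.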
I’m not sure if this is the best place to keep the conversions. Right now it works, but the double permute feels a bit clunky. Can you give some suggestions on this? Thanks!