diff --git a/GSoC-2026/movement-Human Pose Estimation Format Support (Peter Nelson Subrata).md b/GSoC-2026/movement-Human Pose Estimation Format Support (Peter Nelson Subrata).md
new file mode 100644
index 00000000..3d379b24
--- /dev/null
+++ b/GSoC-2026/movement-Human Pose Estimation Format Support (Peter Nelson Subrata).md
@@ -0,0 +1,110 @@
+# movement: Human Pose Estimation Format Support (Peter Nelson Subrata)
+
+## Personal details
+
+- **Full name:** Peter Nelson Subrata
+- **Email:** peterns1609@gmail.com
+- **GitHub username:** PewterZz
+- **Zulip username:** Peter Nelson Subrata
+- **Location & time-zone:** Jakarta, Indonesia (GMT+7)
+- **Code contribution:** https://github.com/neuroinformatics-unit/movement/pull/914
+- **Proposal discussion link:** https://github.com/neuroinformatics-unit/gsoc/pull/97
+
+## Project proposal
+
+### Synopsis
+
+movement currently supports animal pose estimation formats (DeepLabCut, SLEAP,
+LightningPose, Anipose) but has no support for human-focused formats such as MMPose,
+COCO keypoints, FreeMocap, motionBIDS, BVH, and C3D. Researchers studying
+human movement in rehabilitation, sports science, or clinical gait analysis
+cannot load their data into movement without writing custom preprocessing
+scripts. This project implements loaders for at least three of these formats,
+following the existing from_sleap_file/from_dlc_file pattern and producing
+the standard xarray.Dataset output that works with all of movement's
+kinematics and visualisation tools. The intended result is that movement becomes
+a unified interface across pose estimation frameworks that extends
+to human data as well.
+
+### Implementation timeline
+
+**Minimal deliverables:**
+- MMPose JSON loader (COCO-17, Halpe-26, custom schemas, multi-individual)
+- COCO keypoint annotation loader (visibility-to-confidence mapping)
+- FreeMocap 3D loader (3D keypoint data)
+- motionBIDS loader (BIDS directory traversal, TSV + JSON sidecar)
+- Unit and integration tests for each loader
+- Gallery examples for each format
+
+**Stretch goals:**
+- BVH and C3D motion capture loaders
+- from_file() format detection utility
+
+**Weekly timeline** (12 weeks, ~30 hours/week):
+
+| Weeks | Work |
+|-------|------|
+| Bonding | Read the codebase, finalise the MMPose loader from the pre-proposal prototype, discuss API decisions with the mentor |
+| 1–2 | MMPose loader: complete implementation, COCO-17/Halpe-26/COCO-133 schemas, multi-individual grouping, unit tests |
+| 3–4 | COCO keypoint loader, visibility-to-confidence mapping, integration tests, MMPose gallery example |
+| 5–6 | FreeMocap 3D loader, verify that the kinematics tools handle 3D space coordinates, gallery example |
+| 7–8 | motionBIDS loader, BIDS directory traversal, sidecar metadata extraction |
+| 9–10 | BVH loader (stretch), forward kinematics via the bvhio library |
+| 11 | C3D loader (stretch), from_file() format detection utility |
+| 12 | Documentation, gallery examples, cleanup, final PR review |
+
+### Communication plan
+
+Weekly async updates on movement's GitHub Discussions or Zulip, covering
+what was completed and any blockers. All work will be submitted as focused PRs, one
+loader per PR. I am available for video calls during European morning hours.
+
+## Personal statement
+
+### Past experience
+
+I have a background in computer vision engineering. One area I worked on was
+video capture models for collecting gameplay data: writing pipelines that
+process video frames, track objects and joints across frames, and feed that
+data into downstream systems.
+That work gave me hands-on experience with pose estimation tools including
+MMPose and MediaPipe, which are the backbone of two of the formats this
+project targets (MMPose JSON and FreeMocap).
+
+On the open source side, I have merged PRs to pytorch/ignite and sktime, an approved
+PR to kornia, and an open fix in movement itself (PR #914, fixing a silent crash in
+compute_time_derivative on single-frame data). I also built a proof-of-concept MMPose
+loader on a fork branch that correctly parses MMPose JSON predictions into the movement xarray.Dataset schema.
+
+### Motivation: why this project?
+
+I want to contribute to serious open source scientific software, and this
+project is a strong fit. I am interested in building a common interface so that researchers
+don't have to reinvent the loading logic for every tool they use.
+Extending that interface to human pose formats is a natural next step, and I believe I am well positioned
+to do it given my background with the relevant tools.
+
+### Match: why me?
+
+The MMPose loader proof of concept is already working on my fork. I
+understand the from_numpy builder pattern from reading the existing loaders
+and have already submitted a fix to the kinematics module. My computer vision background means
+I won't be learning the formats from scratch: I have used MMPose, the COCO
+keypoint format, and MediaPipe in production work and understand their
+quirks (schema mismatches, multi-individual grouping, 3D coordinate handling).
+
+### Availability
+
+I am essentially fully available during the GSoC period, with no competing
+commitments.
+
+## GSoC
+
+### GSoC experience
+
+I expect structured mentorship, high-quality code reviews to learn from, and the experience of working on a frontier computer vision library.
+
+### Other applications
+
+Yes, I am also applying to Gemini CLI (Google) and MesaLLM.
+My preference is Gemini CLI if there is a tie.