open-mmlab/mmpose

Roadmap of MMPose

hellock opened this issue · 76 comments

We keep this issue open to collect feature requests from users and hear your voice. Our monthly release plan is also available here.

You can either:

  1. Suggest a new feature by leaving a comment.
  2. Vote for a feature request with 👍 or against it with 👎. (Remember that developers are busy and cannot respond to all feature requests, so vote for your favorite one!)
  3. Tell us that you would like to help implement one of the features in the list or review the PRs. (This is the greatest thing to hear about!)

I hope that MMPose can support 21-point hand landmark detection, thanks.

Good suggestions! We will add this feature in our TODO list. Thank you.

TODO List (continuously updated... [last edit: 2023.1.14]):
Here is a collection of feature requests.
Items that have already been implemented in MMPose will be removed from the list.

  1. More popular backbones
  2. Add more popular datasets
  3. More 2D human pose estimation methods
  4. More 2D face alignment algorithms
  5. More 3D human pose algorithms
  6. Support 2D video pose estimation and tracking
  7. Support vehicle pose estimation
  8. Add 3D Pose Consistency Benchmark #828
  9. MANO-based hand keypoint detection
  10. Depth-based 3D hand pose estimation
  11. Multi-view 3D pose estimation
  12. Support memonger
  13. Support PyTorch AMP training #339
  14. Hyperparameter tuner Optuna
  15. Support Unity plugin
  16. Print loss during evaluation #333
  17. Quantization Aware Training #359
  18. Easier usage (API)
  19. Export to TorchScript #576

Would you mind adding #31 (comment) to the TODO list?

Sure.

Speed up inference #40

Support video pose estimation #67

Would be great to add support for whole body pose estimation dataset (body+face+hands) via COCO-WholeBody

Also add support for MPII in mmdetection.

It would be great to add support for pose tracking datasets, e.g. PoseTrack2017/2018.

By the way, please support converting PyTorch models to ONNX. Thanks!

@OasisYang could you elaborate? Is loading and processing the data on a per-frame basis enough, or do you also want the tracking part?

@flynnamy sounds like a request for a general tool. Maybe we can provide such tools for the whole mm-series (just saying, not a confirmation).

@innerlee If possible, adding both the data loading and the tracking part would be great. However, the tracking part seems a little complicated and always comes with some extra modules. Maybe the first step is simply to support data loading and processing. Thanks

Support ShuffleNet V2 & MobileNet V3 backbones. #94

Add YOLOv4 and OpenPose

Please make it possible to obtain estimated heatmaps from methods

@hamedcan could you explain more about the usage? do you want a visualizer of heatmaps during training, or a visualization tool for demo, or anything else?

Bottom up for MPII dataset?

@innerlee First, I really want to thank you for MMPose. It really helped me. I want to compare different models' performance on hard poses, so I need to be able to inspect the generated heatmaps. I want a visualization tool for the demo.

Support multi-head networks #219

Please support mpii_trb demo and mpi_inf_3dhp datasets!

Support 3d hand keypoint estimation!!!!!!

Support log info when the dataset is tiny, #333

Support PyTorch AMP training, thanks. #339

Would be great to see the integration of a hyperparameter tuner like Optuna

A Unity plugin would be amazing to have, using json input data and/or real-time pose estimation with a webcam and seeing it reflected on a 3D model.

@MaxGodTier do you have experience in developing Unity plugins? Contributions are welcome :D

I don't, but a quick-and-dirty implementation may be possible using an existing repo. It reads pose data from simple text files, each representing a single frame. I see two solutions: (1) if pose_results from mmpose were translated into the format that repo expects, it would work out of the box without changing a single line of code, or (2) edit the repo's code (C#) to use mmpose's conventions instead of its own.

Quantization Aware Training for models, to get int8 models. Int8 models would greatly improve inference speed. #359 Thanks

Albumentations augmentations similar to mmclassification

I hope that MMPose can support 3D hand landmark detection, thanks.

Does MMPose support Single Person Pose Estimation?
Currently I found only multi-person versions are supported.

@rhiver single is a case of multi

Sort of. But the multi-person version has two stages, person detection and pose estimation, which require inference on two models.
So this method doesn't work for real-time pose estimation on mobile devices, since inference takes too long.
MobileNetV2 is good enough for simple pose estimation. But for the best FPS, it's better to let it do both single-person detection and pose estimation.

I see that the heatmap method for face datasets is now supported. Please also support the regression method for face datasets!

Do you have any recommended papers/codes ?

Yes. Wing loss: https://arxiv.org/pdf/1711.06753.pdf, and GCN + soft wing loss: https://arxiv.org/pdf/2006.11697.pdf
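
For readers unfamiliar with the first reference, the wing loss is logarithmic for small errors and linear for large ones. Below is a minimal, framework-free sketch of that piecewise form. The function name `wing_loss` and the default values `w=10.0` and `epsilon=2.0` are illustrative assumptions, not MMPose code.

```python
import math


def wing_loss(pred, target, w=10.0, epsilon=2.0):
    """Mean wing loss over paired coordinate values (illustrative sketch)."""
    # The constant c joins the log part and the linear part at |x| = w.
    c = w - w * math.log(1.0 + w / epsilon)
    total = 0.0
    for p, t in zip(pred, target):
        x = abs(p - t)
        if x < w:
            # Small errors: logarithmic, so gradients stay large near zero.
            total += w * math.log(1.0 + x / epsilon)
        else:
            # Large errors: behaves like L1, shifted by c for continuity.
            total += x - c
    return total / len(pred)
```

In a real regression head this would be written with tensor operations, but the piecewise form stays the same.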

I would like to use mmpose through the PyPI package much more easily than is possible now, for example:

from mmpose import top_down

# Analyze hoge.mp4 with COCO wholebody on darkpose and output the result as
# hoge/hoge000000000000.json, hoge/hoge000000000001.json, hoge/hoge000000000002.json, ....
top_down("darkpose", "COCO_wholebody", video_path="hoge.mp4", output_json_title="hoge")

MPII multi-person dataset for bottom-up methods is really needed!

Hi, can you add a hand landmark filtering algorithm to eliminate hand landmark jitter in videos? Thanks

Adding vehicle pose estimation to the pipeline using the CarFusion dataset. Similar to Occlusion-Net and ApolloCar3D.
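
One simple way to reduce this kind of jitter is temporal smoothing of the landmark coordinates. The sketch below uses an exponential moving average. The class name `KeypointEMA` and the `alpha` parameter are hypothetical names chosen for illustration, this is not an MMPose API, and other filters may suit fast hand motion better.

```python
class KeypointEMA:
    """Exponential moving average over per-frame 2D keypoints (sketch)."""

    def __init__(self, alpha=0.5):
        # alpha close to 1 follows the raw detections, close to 0 smooths more.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.state = None

    def update(self, keypoints):
        """keypoints: list of (x, y) for one frame. Returns smoothed list."""
        if self.state is None:
            # First frame: nothing to blend with yet.
            self.state = [(float(x), float(y)) for x, y in keypoints]
        else:
            a = self.alpha
            self.state = [
                (a * x + (1.0 - a) * sx, a * y + (1.0 - a) * sy)
                for (x, y), (sx, sy) in zip(keypoints, self.state)
            ]
        return list(self.state)
```

Usage would be one filter instance per tracked hand, calling `update` once per video frame.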

Export to Torchscript !

Lite-HRNet. It's already built with mmpose, so including it in the main repo should be super simple. It would be amazing if it could work with the pytorch2onnx tool for deployment.

Please support the Halpe dataset: https://github.com/Fang-Haoshu/Halpe-FullBody

It has 3 useful keypoints in addition to those of COCO-WholeBody.

Hi everyone,

I intend to create my own keypoints dataset with 3 points of interest (two endpoints and one center point). Can anyone kindly help me on how I can create annotations to be loaded into mmpose? Because I believe that the repo is based on mmcv, how can I get my own dataloader?
Any help in this regard will be highly appreciated.
Thank you
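
One possible starting point, assuming a COCO-style keypoint JSON is acceptable for your setup, is to write the annotations in the COCO keypoint layout: each annotation stores a flat `[x, y, v, ...]` list, where `v` is 0 (not labeled), 1 (labeled but not visible) or 2 (visible). The keypoint names, file name, and coordinates below are made up for illustration.

```python
import json

# Hypothetical names for the three points of interest.
KEYPOINT_NAMES = ["end_a", "center", "end_b"]


def make_annotation(ann_id, image_id, points, bbox):
    """Build one COCO-style keypoint annotation.

    points: list of (x, y, v) triples, one per keypoint.
    bbox:   (x, y, width, height) of the object.
    """
    flat = []
    for x, y, v in points:
        flat.extend([x, y, v])
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": 1,
        "keypoints": flat,
        "num_keypoints": sum(1 for _, _, v in points if v > 0),
        "bbox": list(bbox),
        "area": bbox[2] * bbox[3],
        "iscrowd": 0,
    }


dataset = {
    "images": [
        {"id": 1, "file_name": "0001.jpg", "width": 640, "height": 480}
    ],
    "annotations": [
        make_annotation(
            1, 1,
            [(100, 200, 2), (150, 210, 2), (200, 220, 2)],
            (90, 190, 120, 40),
        )
    ],
    "categories": [
        {
            "id": 1,
            "name": "object",
            "keypoints": KEYPOINT_NAMES,
            # Skeleton edges use 1-based keypoint indices in COCO.
            "skeleton": [[1, 2], [2, 3]],
        }
    ],
}

annotation_json = json.dumps(dataset, indent=2)
```

The dataset class and config that read this file are MMPose-specific and are not shown here.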

Support 3dpw dataset #682

aqsc commented

Do you have any plans for MANO-based hand keypoint detection? Also optimization with the IK loss.

Add 3D Pose Consistency Benchmark - #828

It would be nice to add "PoseFormer". It is based on VideoPose3D, which is already supported.

It would be nice to add "CenterNet". It is a bottom-up 2D human pose estimation method that groups the keypoints of one person by combining regression with keypoint heatmaps, which is quite different from associative embedding and part affinity fields.
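
The grouping idea described above can be sketched roughly as follows: joint locations regressed from each person center are snapped to the nearest heatmap peak of the same joint type. This is a simplified illustration, not the official CenterNet code. The function name `group_keypoints` and the `max_dist` threshold are assumptions.

```python
import math


def group_keypoints(regressed, peaks, max_dist=20.0):
    """Assign heatmap peaks to people using center-regressed joints.

    regressed: one list per person of (x, y) per joint type, obtained
               by regression from that person's center.
    peaks:     one list per joint type of (x, y) heatmap peaks.
    Returns one list per person of refined (x, y) joints. A joint keeps
    its regressed location when no peak lies within max_dist.
    """
    people = []
    for person in regressed:
        joints = []
        for j, (rx, ry) in enumerate(person):
            best, best_d = (rx, ry), max_dist
            for px, py in peaks[j]:
                d = math.hypot(px - rx, py - ry)
                if d < best_d:
                    best, best_d = (px, py), d
            joints.append(best)
        people.append(joints)
    return people
```

Because the regression already ties every joint to a person center, no embedding or affinity field is needed to decide which peak belongs to whom.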

Add RLE into MMPose

Background: 3D pose estimation (with video generation) with a large number of people (e.g. the official video at 00:19, but with many more people).

Result video: the original video is placed at the top left, with the 3D poses of the people following on the right. If there are a lot of people, the final video has a strange resolution (e.g. 6000x400) because every detected person is placed on the same row.

What could be improved: split the 3D pose visualization of the people into multiple rows.
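
The row splitting suggested here amounts to placing the per-person panels on a grid instead of a single strip. A minimal sketch of the layout arithmetic follows. The function names and the `max_cols` parameter are hypothetical, and this is not the MMPose visualization code.

```python
import math


def grid_layout(num_panels, panel_w, panel_h, max_cols=4):
    """Return (rows, cols, canvas_w, canvas_h) for a grid of panels."""
    if num_panels <= 0:
        return 0, 0, 0, 0
    cols = min(num_panels, max_cols)
    rows = math.ceil(num_panels / cols)
    return rows, cols, cols * panel_w, rows * panel_h


def panel_origin(index, cols, panel_w, panel_h):
    """Top-left pixel (x, y) of the panel with the given index."""
    row, col = divmod(index, cols)
    return col * panel_w, row * panel_h
```

For example, 15 panels of 400x400 pixels in a single row give a 6000x400 canvas, while a grid with at most 4 columns gives 1600x1600.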

It would be great to have a 'score_per_joint' option in test_cfg in order to output one score per joint, instead of having only a global score for the pose. My use case is related to associative embedding.

Update the InterHand2.6M dataset, which contains MANO hand mesh parameters.

It would be nice to have Depth-Based 3D Hand Pose Estimation methods like A2J.

vra commented

It would be great to have SmoothNet trained on 3DPW and AIST++ :)

qinb commented

It would be nice to add SmoothNet training code for pose estimation, so that it could easily be retrained on my own dataset.

3D Human Mesh
frankmocap

ly015 commented

Thanks for your feedback. 3D human mesh recovery is no longer supported in MMPose. We have MMHuman3D for this task and you are welcome to submit an issue there about your request.

It would be very helpful for analysis if the AP for each type of body joint were printed, for example 17 AP values for the 17 kinds of body joints when evaluating a model on the MS COCO body keypoint dataset.

It would be really helpful to implement MIPNet in mmpose:

  • It is particularly useful to tackle data where there are crowded/highly occluded humans. Was previously the SOTA on OCHuman before ViTPose came along. Within the realms of convnets, it should still be the SOTA, and it seems like the idea is general enough to be applied to different types of backbones.

Also, similar to the request in #1389, it would be nice to integrate ViTPose into mmpose. ViTPose is already implemented with mmpose, so I expect integration to be much easier 😄

@yshMars
Already supported in #1170

It would be nice to support ConvNeXt backbones. It is a very simple model that is purely convolutional. They can serve as a drop-in replacement for ResNet or Swin Transformer architectures. ImageNet-22k pretrained ConvNeXt variants are considered state-of-the-art in this regime.

Official code: https://github.com/facebookresearch/ConvNeXt
ConvNeXt was also implemented in the mmsegmentation and mmdetection libraries.

Thanks!

It would be nice to have the PoseAug augmentation pipeline for 3D pose estimation.
Official code: https://github.com/jfzhang95/PoseAug

Thanks!

I would like to have tracing support for mmPose models.
I have been able to successfully use 'torch2torchscript' under mmdeploy.apis to trace mmSegmentation models. However, using the same on mmPose (tried configs under dekr and associative_embedding) with the mmdeploy config '\mmdeploy\configs\mmpose\pose-detection_torchscript.py' throws the following error:

File "mmdetection\mmpose\mmpose\datasets\pipelines\shared_transform.py", line 176, in __call__
meta[key_tgt] = results[key_src]

KeyError: 'flip_index'

Would really appreciate having this feature! Thanks

Can SCRFD be added to mmpose? https://github.com/deepinsight/insightface/tree/master/detection/scrfd

It seems that SCRFD is for face detection. MMPose will focus on pose estimation/keypoint detection. Maybe it is more appropriate to support it in mmdet.

BobDLA commented

Are you still working on OpenPose, which has been listed on the roadmap for years?

Thanks