wxyz-lang/awesome-3dbody-papers

😎Awesome list of papers about 3D body

Awesome 3D Body Papers

An awesome, curated list of papers about the 3D human body.

Table of Contents

Body Model
Body Pose
Naked Body Mesh
Clothed Body Mesh
Human Depth Estimation
Human Motion
Human-Object Interaction
Animation
Cloth/Try-On
Neural Rendering
Dataset

Body Model

SCAPE: Shape Completion and Animation of People. SIGGRAPH, 2005. [Page]

SMPL: A Skinned Multi-Person Linear Model. SIGGRAPH Asia, 2015. [Page] [Code]

Expressive Body Capture: 3D Hands, Face, and Body from a Single Image. CVPR, 2019. [Page] [Code]

SoftSMPL: Data-driven Modeling of Nonlinear Soft-tissue Dynamics for Parametric Humans. Eurographics, 2020. [Page]

STAR: Sparse Trained Articulated Human Body Regressor. ECCV, 2020. [Page] [Code]

BLSM: A Bone-Level Skinned Model of the Human Mesh. ECCV, 2020. [Page]

Joint Optimization for Multi-Person Shape Models from Markerless 3D-Scans. ECCV, 2020. [Code]

GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models. CVPR (Oral), 2020. [Code]

BASH: Biomechanical Animated Skinned Human for Visualization of Kinematics and Muscle Activity. GRAPP, 2021. [Code]

Body Pose

MotioNet: 3D Human Motion Reconstruction from Monocular Video with Skeleton Consistency. ToG, 2020. [Page] [Code]

VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera. SIGGRAPH Asia, 2017. [Page] [Code]

XNect: Real-time Multi-person 3D Human Pose Estimation with a Single RGB Camera. SIGGRAPH, 2020. [Page] [Code]

PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time. SIGGRAPH Asia, 2020. [Page]

Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement. ECCV, 2020. [Code]

Cascaded Deep Monocular 3D Human Pose Estimation with Evolutionary Training Data. CVPR, 2020. [Code]

End-to-End Estimation of Multi-Person 3D Poses from Multiple Cameras. ECCV (Oral), 2020.

Learnable Triangulation of Human Pose. ICCV (Oral), 2019. [Code]

Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation. CVPR, 2020. [Code]

Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry. ArXiv, 2020. [Code]

PI-Net: Pose Interacting Network for Multi-Person Monocular 3D Pose Estimation. WACV, 2021.

PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation. ArXiv, 2021.

SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation. ECCV, 2020. [Page] [Code]

Temporal Smoothing for 3D Human Pose Estimation and Localization for Occluded People. ArXiv, 2020. [Code]

Attention Mechanism Exploits Temporal Contexts: Real-time 3D Human Pose Reconstruction. CVPR (Oral), 2020. [Code]

MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation. T-BIOM, 2020. [Page] [Code]

PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers. ArXiv, 2020.

CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild. ArXiv, 2020.

DOPE: Distillation Of Part Experts for whole-body 3D pose estimation in the wild. ECCV, 2020. [Code]

MocapNET: Ensemble of SNN Encoders for 3D Human Pose Estimation in RGB Images. BMVC, 2019. [Code]

Invariant Teacher and Equivariant Student for Unsupervised 3D Human Pose Estimation. AAAI, 2021. [Code]

A Graph Attention Spatio-temporal Convolutional Network for 3D Human Pose Estimation in Video. ArXiv, 2020. [Page] [Code]

Real-time Lower-body Pose Prediction from Sparse Upper-body Tracking Signals. ArXiv, 2021.

Residual Pose: A Decoupled Approach for Depth-based 3D Human Pose Estimation. IROS, 2020. [Code]

PoP-Net: Pose over Parts Network for Multi-Person 3D Pose Estimation from a Depth Image. ArXiv, 2020. [Code]

Naked Body Mesh

Keep it SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image. ECCV, 2016. [Page] [Code]

Learning to Estimate 3D Human Pose and Shape from a Single Color Image. CVPR, 2018. [Page]

Neural Body Fitting: Unifying Deep Learning and Model Based Human Pose and Shape Estimation. 3DV (Oral), 2018. [Code]

Appearance Consensus Driven Self-Supervised Human Mesh Recovery. ECCV (Oral), 2020. [Page] [Code]

Delving Deep Into Hybrid Annotations for 3D Human Recovery in the Wild. ICCV, 2019. [Page] [Code]

Full-Body Awareness from Partial Observations. ECCV, 2020. [Page] [Code]

3D Multi-bodies: Fitting Sets of Plausible 3D Human Models to Ambiguous Image Data. NeurIPS, 2020.

Parametric Shape Estimation of Human Body under Wide Clothing. ACM MM, 2020. [Code]

3D Human Pose, Shape and Texture from Low-Resolution Images and Videos. ArXiv, 2021.

Human Body Model Fitting by Learned Gradient Descent. ECCV, 2020. [Page]

End-to-end Recovery of Human Shape and Pose. CVPR, 2018. [Page] [Code]

Learning to Reconstruct 3D Human Pose and Shape via Model-fitting in the Loop. ICCV, 2019. [Page] [Code]

3D Human Mesh Regression with Dense Correspondence. CVPR, 2020. [Code]

Hierarchical Kinematic Human Mesh Recovery. ECCV, 2020. [Page]

I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image. ECCV, 2020. [Code]

PoseLifter: Absolute 3D Human Pose Lifting Network from a Single Noisy 2D Human Pose. ArXiv, 2020. [Code]

Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose. ECCV, 2020. [Code]

PoseNet3D: Learning Temporally Consistent 3D Human Pose via Knowledge Distillation. 3DV, 2020.

Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation. ICCV, 2019. [Code]

Learning 3D Human Shape and Pose from Dense Body Parts. TPAMI, 2020. [Page] [Code]

Exemplar Fine-Tuning for 3D Human Pose Fitting Towards In-the-Wild 3D Human Pose Estimation. ArXiv, 2020. [Code]

HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation. ArXiv, 2020. [Code]

Chasing the Tail in Monocular 3D Human Reconstruction with Prototype Memory. ArXiv, 2020.

Beyond Weak Perspective for Monocular 3D Human Pose Estimation. ArXiv, 2020.

CenterHMR: a Bottom-up Single-shot Method for Multi-person 3D Mesh Recovery from a Single Image. ArXiv, 2020. [Code]

Full-body motion capture for multiple closely interacting persons. CVM, 2020.

Coherent Reconstruction of Multiple Humans from a Single Image. CVPR, 2020. [Page] [Code]

Learning 3D Human Dynamics from Video. CVPR, 2019. [Page] [Code]

VIBE: Video Inference for Human Body Pose and Shape Estimation. CVPR, 2020. [Code]

3D Human Motion Estimation via Motion Compression and Refinement. ACCV (Oral), 2020. [Page] [Code]

Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video. ArXiv, 2020.

End-to-End Human Pose and Mesh Reconstruction with Transformers. ArXiv, 2020.

Human Mesh Recovery from Multiple Shots. ArXiv, 2020. [Page]

PC-HMR: Pose Calibration for 3D Human Mesh Recovery from 2D Images/Videos. ArXiv, 2020.

Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies. CVPR (Oral), 2018. [Page]

Monocular Total Capture: Posing Face, Body and Hands in the Wild. CVPR (Oral), 2019. [Page] [Code]

Expressive Body Capture: 3D Hands, Face, and Body from a Single Image. CVPR, 2019. [Page] [Code]

FrankMocap: A Fast Monocular 3D Hand and Body Motion Capture by Regression and Integration. ArXiv, 2020. [Page] [Code]

Monocular Expressive Body Regression through Body-Driven Attention. ECCV, 2020. [Page] [Code]

NeuralAnnot: Neural Annotator for in-the-wild Expressive 3D Human Pose and Mesh Training Sets. ArXiv, 2020. [Page]

Pose2Pose: 3D Positional Pose-Guided 3D Rotational Pose Prediction for Expressive 3D Human Pose and Mesh Estimation. ArXiv, 2020. [Page]

Monocular Real-time Full Body Capture with Inter-part Correlations. ArXiv, 2020.

Real-time RGBD-based Extended Body Pose Estimation. WACV, 2021. [Code]

Clothed Body Mesh

LiveCap: Real-time Human Performance Capture from Monocular Video. SIGGRAPH, 2019. [Page]

DeepCap: Monocular Human Performance Capture Using Weak Supervision. CVPR (Oral), 2020. [Page]

MonoClothCap: Towards Temporally Coherent Clothing Capture from Monocular RGB Video. 3DV, 2020.

MulayCap: Multi-layer Human Performance Capture Using A Monocular Video Camera. TVCG, 2020. [Page]

ChallenCap: Monocular 3D Capture of Challenging Human Performances using Multi-Modal References. ArXiv, 2021.

Deep Physics-aware Inference of Cloth Deformation for Monocular Human Performance Capture. ArXiv, 2020.

DoubleFusion: Real-time Capture of Human Performance with Inner Body Shape from a Depth Sensor. CVPR (Oral), 2018. [Page] [Code]

SimulCap: Single-View Human Performance Capture with Cloth Simulation. CVPR, 2019. [Page]

Robust 3D Self-portraits in Seconds. CVPR (Oral), 2020. [Page]

RobustFusion: Human Volumetric Capture with Data-driven Visual Cues using a RGBD Camera. ECCV, 2020.

NormalGAN: Learning Detailed 3D Human from a Single RGB-D Image. ECCV, 2020. [Page]

TexMesh: Reconstructing Detailed Human Texture and Geometry from RGB-D Video. ECCV, 2020. [Page]

Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction. CVPR (Oral), 2021. [Page]

Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors. CVPR (Oral), 2021.

POSEFusion: Pose-guided Selective Fusion for Single-view Human Volumetric Capture. CVPR (Oral), 2021.

Fast Generation of Realistic Virtual Humans. VRST, 2017. [Page]

Realistic Virtual Humans from Smartphone Videos. VRST, 2020. [Page]

Video Based Reconstruction of 3D People Models. CVPR, 2018. [Page]

Learning to Reconstruct People in Clothing from a Single RGB Camera. CVPR, 2019. [Page] [Code]

SiCloPe: Silhouette-Based Clothed People. CVPR, 2019.

Tex2Shape: Detailed Full Human Body Geometry from a Single Image. ICCV, 2019. [Page] [Code]

Multi-Garment Net: Learning to Dress 3D People from Images. ICCV, 2019. [Page]

3DPeople: Modeling the Geometry of Dressed Humans. ICCV, 2019. [Page] [Code]

SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing. ECCV (Oral), 2020. [Page] [Code]

PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization. ICCV, 2019. [Page] [Code]

PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization. CVPR (Oral), 2020. [Page] [Code]

Geo-PIFu: Geometry and Pixel Aligned Implicit Functions for Single-view Human Reconstruction. NeurIPS, 2020. [Code]

StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision. CVPR, 2021.

ARCH: Animatable Reconstruction of Clothed Humans. CVPR, 2020.

S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling. ArXiv, 2021.

Detailed Human Avatars from Monocular Video. 3DV, 2018. [Code]

Monocular Real-Time Volumetric Performance Capture. ECCV, 2020. [Page] [Code]

Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion. CVPR, 2020. [Page] [Code]

Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction. ECCV (Oral), 2020. [Page] [Code]

PaMIR: Parametric Model-Conditioned Implicit Representation for Image-based Human Reconstruction. TPAMI, 2020. [Page]

RIN: Textured Human Model Recovery and Imitation with a Single Image. ArXiv, 2020.

3D Human Avatar Digitization from a Single Image. VRCAI, 2019.

SMPLicit: Topology-aware Generative Model for Clothed People. CVPR, 2021. [Page]

Reconstructing NBA Players. ECCV, 2020. [Page] [Code]

Capturing Detailed Deformations of Moving Human Bodies. ArXiv, 2021.

Human Depth Estimation

Learning the Depths of Moving People by Watching Frozen People. CVPR, 2019. [Page] [Code]

A Neural Network for Detailed Human Depth Estimation from a Single Image. ICCV, 2019. [Code]

Self-Supervised Human Depth Estimation from Monocular Videos. CVPR, 2020. [Code]

DressNet: High Fidelity Depth Estimation of Dressed Humans from a Single View Image. ArXiv, 2021.

Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos. CVPR (Oral), 2021. [Page] [Code]

Human Motion

3D Semantic Trajectory Reconstruction from 3D Pixel Continuum. CVPR, 2018. [Page]

Convolutional Autoencoders for Human Motion Infilling. 3DV, 2020.

Robust Motion In-betweening. SIGGRAPH, 2020. [Page]

Single-Shot Motion Completion with Transformer. ArXiv, 2021. [Code]

Learning Compositional Representation for 4D Captures with Neural ODE. CVPR (Oral), 2021.

Predicting 3D Human Dynamics from Video. ICCV, 2019. [Page] [Code]

Long-term Human Motion Prediction with Scene Context. ECCV (Oral), 2020. [Page] [Code]

Adversarial Refinement Network for Human Motion Prediction. ACCV, 2020.

Aggregated Multi-GANs for Controlled 3D Human Motion Prediction. AAAI, 2021. [Code]

Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes. ArXiv, 2020. [Page]

GlocalNet: Class-aware Long-term Human Motion Synthesis. WACV, 2021.

A Causal Convolutional Neural Network for Motion Modeling and Synthesis. ArXiv, 2021.

Learn to Dance with AIST++: Music Conditioned 3D Dance Generation. ArXiv, 2021. [Page]

Learning Speech-driven 3D Conversational Gestures from Video. ArXiv, 2021.

DanceNet3D: Music Based Dance Generation with Parametric Motion Transformer. ArXiv, 2021.

Human-Object Interaction

Perceiving 3D Human-Object Spatial Arrangements from a Single Image in the Wild. ECCV, 2020. [Page] [Code]

Resolving 3D Human Pose Ambiguities with 3D Scene Constraints. ICCV, 2019. [Page] [Code]

GRAB: A Dataset of Whole-Body Human Grasping of Objects. ECCV, 2020. [Page] [Code]

Populating 3D Scenes by Learning Human-Scene Interaction. ArXiv, 2020. [Page]

Holistic 3D Human and Scene Mesh Estimation from Single View Images. ArXiv, 2020.

Animation

Predicting Animation Skeletons for 3D Articulated Models via Volumetric Nets. 3DV (Oral), 2019. [Page] [Code]

RigNet: Neural Rigging for Articulated Characters. SIGGRAPH, 2020. [Page] [Code]

Skeleton-Aware Networks for Deep Motion Retargeting. SIGGRAPH, 2020. [Page] [Code]

Motion Retargetting based on Dilated Convolutions and Skeleton-specific Loss Functions. Eurographics, 2020. [Page] [Code]

A Deep Emulator for Secondary Motion of 3D Characters. ArXiv, 2021.

DeePSD: Automatic Deep Skinning And Pose Space Deformation For 3D Garment Animation. ArXiv, 2020.

UniCon: Universal Neural Controller For Physics-based Character Motion. ArXiv, 2020. [Page]

Cloth/Try-On

DeepWrinkles: Accurate and Realistic Clothing Modeling. ECCV (Oral), 2018.

Wallpaper Pattern Alignment along Garment Seams. SIGGRAPH, 2019. [Page]

Reflection Symmetry in Textured Sewing Patterns. VMV, 2019. [Page]

Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single-view Images. ECCV (Oral), 2020. [Page]

TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style. CVPR (Oral), 2020. [Page] [Code]

Learning-Based Animation of Clothing for Virtual Try-On. Eurographics, 2019. [Page]

Physically Based Neural Simulator for Garment Animation. ArXiv, 2020.

Deep Deformation Detail Synthesis for Thin Shell Models. ArXiv, 2021.

DeepCloth: Neural Garment Representation for Shape and Style Editing. ArXiv, 2020. [Page]

3D Custom Fit Garment Design with Body Movement. ArXiv, 2021.

Dynamic Neural Garments. ArXiv, 2021.

Example-based Real-time Clothing Synthesis for Virtual Agents. ArXiv, 2021.

BCNet: Learning Body and Cloth Shape from a Single Image. ECCV, 2020. [Code]

Fully Convolutional Graph Neural Networks for Parametric Virtual Try-On. SCA, 2020. [Page]

Neural 3D Clothes Retargeting from a Single Image. ArXiv, 2021.

Neural Rendering

Neural3D: Light-weight Neural Portrait Scanning via Context-aware Correspondence Learning. ACM MM, 2020.

Multi-view Neural Human Rendering. CVPR, 2020. [Page] [Code]

NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras. ArXiv, 2021.

ANR: Articulated Neural Rendering for Virtual Avatars. ArXiv, 2020. [Page]

SMPLpix: Neural Avatars from 3D Human Models. WACV, 2021. [Page] [Code]

Vid2Actor: Free-viewpoint Animatable Person Synthesis from Video in the Wild. ArXiv, 2020. [Page]

Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans. ArXiv, 2020. [Page]

A-NeRF: Surface-free Human 3D Pose Refinement via Neural Rendering. ArXiv, 2021. [Page]

D-NeRF: Neural Radiance Fields for Dynamic Scenes. ArXiv, 2021. [Page]

Dataset

3DPW: Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera. ECCV, 2018. [Page]

AMASS: Archive of Motion Capture as Surface Shapes. ICCV, 2019. [Page] [Code]

3DBodyTex: Textured 3D Body Dataset. 3DV, 2018. [Page]

Motion Capture from Internet Videos. ECCV, 2020. [Page] [Code]

3DPeople: Modeling the Geometry of Dressed Humans. ICCV, 2019. [Page] [Code]

Full-Body Awareness from Partial Observations. ECCV, 2020. [Page] [Code]

HUMBI: A Large Multiview Dataset of Human Body Expressions. CVPR, 2020. [Page] [Code]

SMPLy Benchmarking 3D Human Pose Estimation in the Wild. 3DV (Oral), 2020. [Page]
