SherlockSunset
Research Areas: Computer Vision, Object Detection, 3D Defect Inspection.
Nanyang Technological University, Singapore
SherlockSunset's Stars
HVision-NKU/StoryDiffusion
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
sicxu/Deep3DFaceRecon_pytorch
Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019). A PyTorch implementation.
Brian417-cup/AnimatedDrawings
Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
facebookresearch/AnimatedDrawings
Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
memoavatar/memo
Memory-Guided Diffusion for Expressive Talking Video Generation
antgroup/animate-x
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
sh-lee-prml/HierSpeechpp
The official implementation of HierSpeech++
hustvl/4DGaussians
[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
GhostCai/PortraitRelighting
Official PyTorch implementation of the CVPR 2024 Highlight Paper "Real-time 3D-aware Portrait Video Relighting"
Francis-Rings/StableAnimator
We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference image and a sequence of poses.
arthurhero/Long-LRM
Self-reimplemented version of Long-LRM.
zhaofuq/LOD-3DGS
LetsGo: Large-Scale Garage Modeling and Rendering via LiDAR-Assisted Gaussian (Published in SIGGRAPH Asia 2024)
xg-chu/GAGAvatar
[NeurIPS 2024] Generalizable and Animatable Gaussian Head Avatar
jdh-algo/JoyVASA
metame-ai/awesome-audio-plaza
Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
genmoai/mochi
The best OSS video generation models
arthur-qiu/ReliTalk
[IJCV 2024] Code for ReliTalk
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
sen-mao/StyleDiffusion
Official implementation of "StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing" (CVMJ 2024)
williamyang1991/DualStyleGAN
[CVPR 2022] Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer
Huanshere/VideoLingo
Netflix-level subtitle cutting, translation, alignment, and even dubbing: a one-click, fully automated AI video subtitle team
yliu-cs/PiTe
PiTe: Pixel-Temporal Alignment for Large Video-Language Model
jefftan969/dressrecon
DressRecon: Freeform 4D Human Reconstruction from Monocular Video
anliyuan/Ultralight-Digital-Human
An ultra-lightweight digital human model that can run in real time on mobile devices
MLEveryday/100-Days-Of-ML-Code
Chinese version of 100-Days-Of-ML-Code
ArcticHare105/S3Diff
Official implementation of S3Diff
apple/ml-depth-pro
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.