Pinned Repositories
act-plus-plus
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
alibabacloud-bailian-speech-demo
Sample Repository for the AlibabaCloud Bailian Speech SDK
Arc2Face
Arc2Face: A Foundation Model of Human Faces
champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
CosyVoice
LLM-based TTS model providing full-stack inference, training, and deployment capabilities.
distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
DrEureka
Official Repository for "DrEureka: Language Model Guided Sim-To-Real Transfer" (RSS 2024)
DynamiCrafter
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
EDA-AI
Implementation of NeurIPS 2021 paper "On Joint Learning for Solving Placement and Routing in Chip Design" & NeurIPS 2022 paper "The Policy-gradient Placement and Generative Routing Neural Networks for Chip Design".
ffmpeg-webrtc
ffmpeg-webrtc for the WHIP and WHEP protocols
carcloudfly's Repositories
carcloudfly/act-plus-plus
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
carcloudfly/alibabacloud-bailian-speech-demo
Sample Repository for the AlibabaCloud Bailian Speech SDK
carcloudfly/Arc2Face
Arc2Face: A Foundation Model of Human Faces
carcloudfly/champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
carcloudfly/CosyVoice
LLM-based TTS model providing full-stack inference, training, and deployment capabilities.
carcloudfly/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
carcloudfly/DrEureka
Official Repository for "DrEureka: Language Model Guided Sim-To-Real Transfer" (RSS 2024)
carcloudfly/DynamiCrafter
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
carcloudfly/EDA-AI
Implementation of NeurIPS 2021 paper "On Joint Learning for Solving Placement and Routing in Chip Design" & NeurIPS 2022 paper "The Policy-gradient Placement and Generative Routing Neural Networks for Chip Design".
carcloudfly/ffmpeg-webrtc
ffmpeg-webrtc for the WHIP and WHEP protocols
carcloudfly/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models.
carcloudfly/LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
carcloudfly/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
carcloudfly/MinerU
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
carcloudfly/MusePose
MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation
carcloudfly/noise-reduction
noise reduction
carcloudfly/yay_robot
PyTorch implementation of YAY Robot