zengbohan0217's Stars
opendatalab/MinerU
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
HeliosZhao/GenXD
GenXD: Generating Any 3D and 4D Scenes
beccabai/multi-agent-data-selection
This is the repo for the paper Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining.
Ahren09/AgentReview
Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent."
black-forest-labs/flux
Official inference repo for FLUX.1 models
KovenYu/WonderJourney
genmoai/models
The best OSS video generation models
dreamscene4d/dreamscene4d
[NeurIPS 2024] DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos
baaivision/Emu3
Next-Token Prediction is All You Need
YangLing0818/SuperCorrect-llm
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights
YangLing0818/IterComp
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
YangLing0818/SemanticSDS-3D
Semantic Score Distillation Sampling for Compositional Text-to-3D Generation
G-U-N/Rectified-Diffusion
Rectified Diffusion: Straightness Is Not Your Need
jy0205/Pyramid-Flow
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
YangLing0818/VideoTetris
[NeurIPS 2024] VideoTetris: Towards Compositional Text-To-Video Generation
YangLing0818/Trans4D
Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis
liruiw/HPT
Heterogeneous Pre-trained Transformer (HPT) as Scalable Policy Learner.
VAST-AI-Research/TriplaneGaussian
TriplaneGaussian: A new hybrid representation for single-view 3D reconstruction.
Huage001/LinFusion
Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image"
Q-Future/Q-Align
③[ICML2024] [IQA, IAA, VQA] All-in-one Foundation Model for visual scoring. Can be efficiently fine-tuned on downstream datasets.
nicolaus-huang/ProcessPainter
[SIGGRAPH Asia 2024] Painting process generation using diffusion models
zeng-yifei/STAG4D
Official Implementation for STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians
TencentARC/SEED-Story
SEED-Story: Multimodal Long Story Generation with Large Language Model
cilinyan/VISA
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
lllyasviel/Paints-UNDO
Understand Human Behavior to Align True Needs
AiuniAI/Unique3D
[NeurIPS 2024] Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
Luh1124/UV-IDM
Official implementation of UV-IDM.
YangLing0818/buffer-of-thought-llm
[NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.