Pine-sha's Stars
rinongal/textual_inversion
zju-vipa/awesome-neural-trees
Introduction, selected papers and possible corresponding codes in our review paper "A Survey of Neural Trees"
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
avisingh599/imitation-dagger
[Reimplementation Ross et al 2011] An implementation of DAGGER using ConvNets for driving from pixels.
montrealrobotics/active-domainrand
Code repository for Active Domain Randomization (CoRL 2019, https://arxiv.org/abs/1904.04762)
lukasHoel/text2room
Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models (ICCV2023).
FrozenBurning/SceneDreamer
[TPAMI 2023] SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections
google-deepmind/dm_control
Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
lucidsim/lucidsim
Official Repo for the paper "Learning Visual Parkour from Generated Images" (CoRL 2024).
AILab-CVC/YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
UX-Decoder/DINOv
[CVPR 2024] Official implementation of the paper "Visual In-context Learning"
Oswald522/ams-thesis
LaTeX thesis template used by the institute
IDEA-Research/T-Rex
[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
autonomousvision/gaussian-opacity-fields
[SIGGRAPH Asia'24 & TOG] Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes
hugobl1/ray_gauss
RayGauss: Volumetric Gaussian-Based Ray Casting for Photorealistic Novel View Synthesis
THU-luvision/OmniSeg3D
Segment Everything All at Once
VAST-AI-Research/TriplaneGaussian
TriplaneGaussian: A new hybrid representation for single-view 3D reconstruction.
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance
hrz2000/CustomNeRF
[CVPR 2024] Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training
ashawkey/stable-dreamfusion
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
adobe-research/custom-diffusion
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
google/dreambooth
zhengli97/Awesome-Prompt-Adapter-Learning-for-VLMs
A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
wkentaro/labelme
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
Megvii-BaseDetection/YOLOX
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
OpenRobotLab/PointLLM
[ECCV 2024 Best Paper Candidate] PointLLM: Empowering Large Language Models to Understand Point Clouds
lzhnb/Analytic-Splatting
[ECCV 2024 - Oral] Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration
nv-tlabs/lift-splat-shoot
Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D (ECCV 2020)
tobiasfshr/map4d
Photo-realistic mapping of dynamic urban areas