chelbos's Stars
guanwei49/LogLLM
PKU-YuanGroup/LLaVA-CoT
ArcherFMY/SD-T2I-360PanoImage
Repository for 360 panorama image generation based on Stable Diffusion
EnVision-Research/Lotus
Official Implementation of LOTUS: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Ucas-HaoranWei/GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
nerfstudio-project/gsplat
CUDA-accelerated rasterization of Gaussian splatting
fishaudio/fish-speech
Brand new TTS solution
GANWANSHUI/GaussianOcc
GaussianOcc: Fully Self-supervised and Efficient 3D Occupancy Estimation with Gaussian Splatting
modelscope/modelscope-agent
ModelScope-Agent: An agent framework connecting models in ModelScope with the world
GengzeZhou/NavGPT-2
[ECCV 2024] Official implementation of NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models
NVlabs/VILA
VILA - a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
huggingface/parler-tts
Inference and training library for high-quality TTS models.
kodermax/pdfjs-viewer
jehna/humanify
Deobfuscate JavaScript code using ChatGPT
jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
h4r5h1t/webcopilot
An automation tool that enumerates subdomains, filters out XSS, SQLi, open redirect, LFI, SSRF, and RCE parameters, and then scans for vulnerabilities.
albertan017/LLM4Decompile
Reverse Engineering: Decompiling Binary Code with Large Language Models
microsoft/garnet
Garnet is a remote cache-store from Microsoft Research that offers strong performance (throughput and latency), scalability, storage, recovery, cluster sharding, key migration, and replication features. Garnet can work with existing Redis clients.
heheyas/V3D
V3D: Video Diffusion Models are Effective 3D Generators
deepseek-ai/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
google/magika
Detect file content types with deep learning
Nutlope/roomGPT
Upload a photo of your room to generate your dream room with AI.
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
0nutation/SpeechGPT
SpeechGPT Series: Speech Large Language Models
TencentARC/PhotoMaker
PhotoMaker [CVPR 2024]
louislam/uptime-kuma
A fancy self-hosted monitoring tool
ayushjain1144/odin
Code for the paper: "ODIN: A Single Model for 2D and 3D Segmentation" (CVPR 2024)
huggingface/amused
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
FoundationVision/GLEE
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale