longcw

Computer Vision

Tsinghua University, Beijing, China

longcw's Stars

chenfei-wu/TaskMatrix
Language: Python 34.5k 305 350 3.3k
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
Language: TypeScript 32.1k 264 2.1k 4.2k
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python 17.2k 154 1.3k 1.8k
Mintplex-Labs/anything-llm
The all-in-one Desktop & Docker AI application with full RAG and AI Agent capabilities.
Language: JavaScript 15.4k 125 1k 1.6k
Rudrabha/Wav2Lip
This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For HD commercial model, please try out Sync Labs
Language: Python 9.5k 161 634 2.1k
mlfoundations/open_clip
An open source implementation of CLIP.
Language: Jupyter Notebook 8.8k 76 439 881
leptonai/search_with_lepton
Building a quick conversation-based search demo with Lepton AI.
Language: TypeScript 7.2k 49 56914
Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Language: HTML 7k 49 954 533
OpenTalker/video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Language: Python 5.9k 70 216 865
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
Language: Python 5.4k 33 763 376
IDEA-Research/GroundingDINO
Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Language: Python 5.3k 35 274 558
OpenBMB/MiniCPM-V
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
Language: Python 4.5k 60 160 319
tencent-ailab/IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.
Language: Jupyter Notebook 4.2k 57 325 274
OpenBMB/MiniCPM
MiniCPM-2B: An end-side LLM outperforming Llama2-13B.
Language: Jupyter Notebook 4.1k 52 116 291
ali-vilab/AnyDoor
Official implementations for paper: Anydoor: zero-shot object-level image customization
Language: Python 3.8k 87 86350
rom1504/img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
Language: Python 3.4k 30 249 316
TencentARC/T2I-Adapter
T2I-Adapter
Language: Python 3.2k 40 105 193
sail-sg/EditAnything
Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)
Language: Python 3.2k 39 57174
promptfoo/promptfoo
Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.
Language: TypeScript 3.2k 17 397 207
omerbt/TokenFlow
Official Pytorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" presenting "TokenFlow" (ICLR 2024)
Language: Python 1.5k 78 40134
Computer-Vision-in-the-Wild/CVinW_Readings
A collection of papers on the topic of ``Computer Vision in the Wild (CVinW)''
1k 37 653
CircleRadon/Osprey
[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
Language: Python 701 13 3240
bytedance/MVDream
Multi-view Diffusion for 3D Generation
Language: Python 682 20 3147
LLaVA-VL/LLaVA-Plus-Codebase
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
Language: Python 642 11 2449
lkeab/gaussian-grouping
Gaussian Grouping for open-world Anything reconstruction, segmentation and editing.
Language: Jupyter Notebook 471 21 3835
bytedance/MVDream-threestudio
3D generation code for MVDream
Language: Python 449 19 2730
aimagelab/multimodal-garment-designer
This is the official repository for the paper "Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing". ICCV 2023
Language: Python 377 28 2944
LAION-AI/laion-datasets
Description and pointers of laion datasets
Language: HTML 214 6 89
xuanandsix/GFPGAN-onnxruntime-demo
This is the onnxruntime inference code for GFP-GAN: Towards Real-World Blind Face Restoration with Generative Facial Prior (CVPR 2021). Official code: https://github.com/TencentARC/GFPGAN
Language: Python 104 3 1116
bychen7/Face-Restoration-TensorRT
A simple face restoration TensorRT deployment solution.
Language: C++ 69 1 2110
