Chuanqi-Zang's Stars
cruiseresearchgroup/MAPLE
jianchang512/pyvideotrans
Translate a video from one language to another and add dubbing, with support for speech-recognition transcription, speech synthesis, and subtitle translation.
PaddlePaddle/PaddleSeg
Easy-to-use image segmentation library with an awesome pre-trained model zoo, supporting a wide range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.
ultralytics/ultralytics
Ultralytics YOLO11 🚀
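A minimal inference sketch with the Ultralytics Python API, for reference; the weight name "yolo11n.pt" and the sample image URL are illustrative choices, not the only options.

```python
# Minimal YOLO11 inference sketch (pip install ultralytics).
from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # downloads the nano YOLO11 detection weights on first use
results = model("https://ultralytics.com/images/bus.jpg")  # predict on an image path or URL

for r in results:
    print(r.boxes.xyxy)  # bounding boxes as (x1, y1, x2, y2) tensors
    print(r.boxes.cls)   # predicted class indices
```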
THU-MIG/yolov10
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
CASIA-IVA-Lab/FastSAM
Fast Segment Anything
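A sketch of "segment everything" inference following the FastSAM README; the weight file, image path, and device string are placeholders you supply yourself.

```python
# FastSAM "everything" segmentation sketch; paths below are placeholders.
from fastsam import FastSAM, FastSAMPrompt

model = FastSAM("./weights/FastSAM-x.pt")
image_path = "./images/example.jpg"

everything_results = model(image_path, device="cuda", retina_masks=True,
                           imgsz=1024, conf=0.4, iou=0.9)
prompt_process = FastSAMPrompt(image_path, everything_results, device="cuda")

masks = prompt_process.everything_prompt()  # all masks; box/point/text prompts also exist
prompt_process.plot(annotations=masks, output_path="./output/example.jpg")
```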
liliu-avril/Awesome-Segment-Anything
This repository is for the first comprehensive survey on Meta AI's Segment Anything Model (SAM).
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
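A minimal transcription example with the whisper package; "base" is one of several model sizes and "audio.mp3" is a placeholder path.

```python
# Transcribe an audio file with openai/whisper (pip install -U openai-whisper).
import whisper

model = whisper.load_model("base")
result = model.transcribe("audio.mp3")  # language is auto-detected unless specified
print(result["text"])                   # full transcript as a single string
```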
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
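An image-segmentation sketch with SAM 2, assuming the Hugging Face checkpoint alias "facebook/sam2-hiera-large"; the image path and the click coordinates are placeholders.

```python
# Prompted image segmentation with SAM 2; image and click point are placeholders.
import numpy as np
from PIL import Image
from sam2.sam2_image_predictor import SAM2ImagePredictor

predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")

image = np.array(Image.open("example.jpg").convert("RGB"))
predictor.set_image(image)

# One foreground click (label 1) at pixel (x=500, y=375).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
)
print(masks.shape, scores)  # candidate masks and their predicted quality scores
```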
SY-Xuan/Pink
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs
microsoft/LLaVA-Med
Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.
tencent-ailab/IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
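A sketch that loads the IP-Adapter weights through the Hugging Face diffusers integration (the repo also ships its own inference code); the model IDs, scale value, and file names are illustrative.

```python
# IP-Adapter via the diffusers integration; model IDs and paths are illustrative.
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.6)  # how strongly the image prompt steers generation

ref = load_image("reference.png")  # the image prompt
out = pipe(prompt="a photo in the same style", ip_adapter_image=ref).images[0]
out.save("result.png")
```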
JunjieHu/ReCo-RL
Codes of AAAI 2020 paper "What Makes A Good Story? Designing Composite Rewards for Visual Storytelling"
NExT-GPT/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
eric-ai-lab/MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
Luodian/Otter
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
eric-xw/AREL
Code for the ACL paper "No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling"
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
OpenGVLab/InternGPT
InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM).
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
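A feature-extraction sketch that loads DINOv2 ViT-S/14 through torch.hub as described in the repo README; the preprocessing values are the standard ImageNet normalization and the image path is a placeholder.

```python
# Extract a global DINOv2 image embedding via torch.hub.
import torch
from PIL import Image
from torchvision import transforms

model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),  # 224 is divisible by the 14-pixel patch size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    features = model(img)  # global image embedding (384-dim for ViT-S/14)
print(features.shape)
```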
skypilot-org/skypilot
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
wilson1yan/VideoGPT
THUDM/CogVideo
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
showlab/Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
kohjingyu/fromage
🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
BIMK/PlatEMO
Evolutionary multi-objective optimization platform
BIT-thesis/LaTeX-template
LaTeX template for BIT thesis
BITNP/BIThesis
📖 An unofficial collection of LaTeX templates for Beijing Institute of Technology, including undergraduate and graduate thesis templates and more. 🎉 (For more documentation, see the wiki and the manuals in the releases.)
lllyasviel/ControlNet
Let us control diffusion models!
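A sketch of Canny-conditioned generation; this uses the diffusers port of ControlNet rather than the repo's own gradio scripts, and the model IDs, prompt, and edge-map path are illustrative.

```python
# Canny-edge ControlNet generation via the diffusers port; inputs are placeholders.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

edges = load_image("canny_edges.png")  # a precomputed Canny edge map as conditioning
image = pipe("a futuristic living room", image=edges, num_inference_steps=30).images[0]
image.save("controlled.png")
```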