TerryH-dog

My name is Jimmy Uccio. I am a college student, and most of the time I have to study. I enjoy rap and coding, as well as playing football.

Beijing

TerryH-dog's Stars

amirbar/visual_prompting
Official implementation and data release of the paper "Visual Prompting via Image Inpainting".
Language:Jupyter Notebook30621
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Language: Python, 7.2k stars, 735 forks
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Language: Shell, 11.9k stars, 711 forks
OpenGVLab/VisionLLM
VisionLLM Series
Language:Python98133
rom1504/img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
Language: Python, 3.9k stars, 349 forks
SkyworkAI/Vitron
NeurIPS 2024 Paper: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Language:Python46627
bytedance/1d-tokenizer
This repo contains the code for 1D tokenizer and generator
Language:Jupyter Notebook65835
PixArt-alpha/PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Language: Python, 2.9k stars, 183 forks
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
Language: Jupyter Notebook, 12.2k stars, 1.6k forks
AFeng-x/PixWizard
Language:Python128
mit-han-lab/vila-u
[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
Language:Python2073
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Language: Python, 11.9k stars, 1k forks
PKU-YuanGroup/Video-LLaVA
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Language: Python, 3.1k stars, 222 forks
TencentARC/SEED-Story
SEED-Story: Multimodal Long Story Generation with Large Language Model
Language:Python78459
AILab-CVC/SEED-X
Multimodal Models in Real World
Language:Jupyter Notebook42919
rongyaofang/PUMA
Empowering Unified MLLM with Multi-granular Visual Generation
Language:Python1151
deepseek-ai/Janus
Janus-Series: Unified Multimodal Understanding and Generation Models
Language: Python, 1.3k stars, 74 forks
baaivision/Emu3
Next-Token Prediction is All You Need
Language: Python, 2k stars, 78 forks
VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Language: Jupyter Notebook, 3.4k stars, 281 forks
showlab/Show-o
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
Language: Python, 1.1k stars, 48 forks
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
Language: Python, 30.6k stars, 3k forks
ShieldMnt/invisible-watermark
Python library for invisible image watermarking (blind image watermarking)
Language: Python, 1.7k stars, 153 forks
githubvpn007/proxy
proxy, proxy server, proxy protocols, proxy servers, proxy protocol documentation, ss, ssr, v2ray, trojan, Clash, V2rayN, Qv2ray, V2rayW, V2RayS, Mellow, V2rayX, V2rayU, ClashX, Kitsunebi, BifrostV, i2Ray, Quantumult, Surge 4, winXray, Qv2ray, Kitsunebi, Trojan-Qt5
25833
githubvpn007/v2rayNvpn
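firewall circumvention, free firewall circumvention, free "scientific internet access", free nodes, free "ladders", free ss/ssr/v2ray/trojan nodes, Lantern, Google store, circumvention "ladders", games on overseas networks, foreign games, vpn, VPN recommendations, updated daily, accessing the overseas internet, overseas internet, V2rayN, Qv2ray, V2rayW, V2RayS, Mellow, V2rayX, V2rayU, ClashX, Kitsunebi, BifrostV, i2Ray, Quantumult, Surge 4, winXray, Qv2ray, Kitsunebi, Trojan-Qt5, proxy servers, "airports" (proxy subscription services), Mario, World of Warcraft, poshMark, Amazon, Shopee, Mercari, foreign trade
4.4k stars, 358 forks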
teacherpeterpan/self-correction-llm-papers
This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.
48728
OpenInterpreter/open-interpreter
A natural language interface for computers
Language: Python, 57.9k stars, 5k forks
wilicc/gpu-burn
Multi-GPU CUDA stress test
Language: C++, 1.5k stars, 304 forks
kourgeorge/arxiv-style
A LaTeX style and template for paper preprints (based on NIPS style)
Language: TeX, 1.2k stars, 323 forks
tatarchm/tangent_conv
Tangent Convolutions for Dense Prediction in 3D
Language:Python12226
drprojects/superpoint_transformer
Official PyTorch implementation of Superpoint Transformer introduced in [ICCV'23] "Efficient 3D Semantic Segmentation with Superpoint Transformer" and SuperCluster introduced in [3DV'24 Oral] "Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering"
Language:Python70186
