rexnxiaobai

rexliu90@gmail.com

hangzhou

rexnxiaobai's Stars

meta-llama/llama
Inference code for Llama models
Language: Python 53.5k 507 9219.2k
xai-org/grok-1
Grok open release
Language: Python 48.5k 540 1948.2k
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook 44.5k 294 6385.2k
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Language: Python 18.6k 293 1.3k 2.4k
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language: Python 17k 154 2571.6k
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python 16.8k 153 1.3k 1.8k
HqWu-HITCS/Awesome-Chinese-LLM
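A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.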
11.6k 171 211k
state-spaces/mamba
Language: Python 10k 100 295780
BradyFU/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles:Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
9.4k 219 91626
QwenLM/Qwen1.5
Qwen1.5 is the improved version of Qwen, the large language model series developed by Qwen team, Alibaba Cloud.
Language: Shell 3.1k 29 379172
PKU-YuanGroup/Video-LLaVA
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Language: Python 2.5k 27 147185
DAMO-NLP-SG/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Language: Python 2.5k 30 146228
X-PLUG/MobileAgent
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
Language: Python 1.9k 35 17156
alibaba/EasyCV
An all-in-one toolkit for computer vision
Language: Python 1.7k 31 75188
invictus717/MetaTransformer
Meta-Transformer for Unified Multimodal Learning
Language: Python 1.4k 22 63111
lichao-sun/Mora
Mora: More like Sora for Generalist Video Generation
Language: Jupyter Notebook 1.3k 64 670
GaParmar/img2img-turbo
One-step image-to-image with Stable Diffusion turbo: sketch2image, day2night, and more
Language: Python 1.2k 16 36121
microsoft/LLaVA-Med
Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.
Language: HTML 1.2k 24 67121
X-PLUG/mPLUG-DocOwl
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Language: Python 986 26 6857
PKU-YuanGroup/LanguageBind
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Language: Python 549 13 4844
Yuliang-Liu/MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
Language: Python 306 11 2019
HongguLiu/Deepfake-Detection
The PyTorch implementation of Deepfake Detection based on FaceForensics++
Language: Python 284 7 2855
hendrycks/imagenet-r
ImageNet-R(endition) and DeepAugment (ICCV 2021)
Language: Python 243 8 815
Haiyang-W/GiT
Official Implementation of "GiT: Towards Generalist Vision Transformer through Universal Language Interface"
Language: Python 215 6 710
fdbtrs/ElasticFace
Official repository for ElasticFace: Elastic Margin Loss for Deep Face Recognition
Language: Python 155 4 1020
large-ocr-model/large-ocr-model.github.io
Language: Python 114 5 123
layumi/U_turn
IJCV22 :see_no_evil: Attack your retrieval model via Query! They are not as robust as you expected! :hear_no_evil:
Language: Python 47 5 02
MonsterZhZh/HRN
Implementation for Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification
Language: Python 47 2 23
JaySon-Huang/pyjpegtbx
A Python library that extracts raw data from JPEG images to make image data manipulation easier
Language: Python 18 5 205
juan-csv/eye_blink_detection
eye blink detection
Language: Python 15 1 012
