JasonZhang156's Stars
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
azl397985856/leetcode
LeetCode Solutions: A Record of My Problem Solving Journey.
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Chanzhaoyu/chatgpt-web
A ChatGPT demo web page built with Express and Vue3
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
HqWu-HITCS/Awesome-Chinese-LLM
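A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are inexpensive to train, including base models, domain-specific fine-tuned models and applications, datasets, and tutorials.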
kaldi-asr/kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
BradyFU/Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
google-research/vision_transformer
google/sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
TheR1D/shell_gpt
A command-line productivity tool powered by AI large language models like GPT-4 that helps you accomplish your tasks faster and more efficiently.
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
espnet/espnet
End-to-End Speech Processing Toolkit
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
ChanChiChoi/awesome-Face_Recognition
papers about Face Detection; Face Alignment; Face Recognition && Face Identification && Face Verification && Face Representation; Face Reconstruction; Face Tracking; Face Super-Resolution && Face Deblurring; Face Generation && Face Synthesis; Face Transfer; Face Anti-Spoofing; Face Retrieval;
OFA-Sys/Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
baichuan-inc/Baichuan2
A series of large language models developed by Baichuan Intelligent Technology
rom1504/img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
onnx/onnx-tensorrt
ONNX-TensorRT: TensorRT backend for ONNX
LLaVA-VL/LLaVA-NeXT
wolfcw/libfaketime
libfaketime modifies the system time for a single application
kpu/kenlm
KenLM: Faster and Smaller Language Model Queries
jingyi0000/VLM_survey
Collection of AWESOME vision-language models for vision tasks
gengyanlei/fire-smoke-detect-yolov4
fire-smoke-detect-yolov4-yolov5 and fire-smoke-detection-dataset (fire detection, smoke detection)
christophschuhmann/improved-aesthetic-predictor
CLIP+MLP Aesthetic Score Predictor
LAION-AI/CLIP_benchmark
CLIP-like model evaluation
Beckschen/ViTamin
[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"