songge25's Stars
lucasjinreal/Namo-R1
A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from scratch with ease.
PKU-Alignment/align-anything
Align Anything: Training All-modality Model with Feedback
hiyouga/EasyR1
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
dyh/unbox_yolov5_deepsort_counting
YOLOv5 and DeepSORT pedestrian and vehicle tracking, detection, and counting
codelion/optillm
Optimizing inference proxy for LLMs
Ucas-HaoranWei/GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
mendableai/firecrawl
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
LLaVA-VL/LLaVA-NeXT
open-compass/VLMEvalKit
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
OpenBMB/MiniCPM-o
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
wdndev/llm_interview_note
Notes covering knowledge and interview questions for large language model (LLM) algorithm (application) engineers
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
sparkfish/augraphy
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
OleehyO/TexTeller
TexTeller converts images to LaTeX formulas (image2latex, LaTeX OCR) with higher accuracy and superior generalization ability, enabling it to cover most usage scenarios.
opendatalab/UniMERNet
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
liguodongiot/llm-action
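This project aims to share the technical principles and hands-on experience behind large models (large-model engineering and deploying large-model applications)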
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
mlabonne/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
kovzol/Java-Geometry-Expert
Java Geometry Expert
NVIDIA/NeMo-Curator
Scalable data preprocessing and curation toolkit for LLMs
megvii-research/NAFNet
The state-of-the-art image restoration model without nonlinear activation functions.
SupritYoung/RLHF-Label-Tool
A tool for manually annotating and ranking response data in the RLHF stage of large-model training.
mlfoundations/open_clip
An open source implementation of CLIP.
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
wgwang/awesome-LLMs-In-China
Large models
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
TencentARC/PhotoMaker
PhotoMaker [CVPR 2024]