songyinghao

songyinghao's Stars

hoppscotch/hoppscotch
Open source API development ecosystem - https://hoppscotch.io (open-source alternative to Postman, Insomnia)
Language: TypeScript 65.8k 485 1.7k4.6k
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook 47.8k 308 6745.7k
TabbyML/tabby
Self-hosted AI coding assistant
Language: Rust 22k 107 7591k
continuedev/continue
⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
Language: TypeScript 19.6k 96 1.7k1.7k
labring/FastGPT
FastGPT is a knowledge-based platform built on LLMs. It offers a comprehensive suite of out-of-the-box capabilities, such as data processing, RAG retrieval, and visual AI workflow orchestration, that let you develop and deploy complex question-answering systems without extensive setup or configuration.
Language: TypeScript 18.6k 118 2.1k4.9k
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Language: Python 12.7k 106 592893
pwxcoo/chinese-xinhua
Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.
Language: Python 11k 308 602.6k
HKUDS/LightRAG
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Language: Python 9.8k 125 2511.2k
CASIA-IVA-Lab/FastSAM
Fast Segment Anything
Language: Python 7.5k 56 209711
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Language: Python 7.5k 71 336933
OpenTalker/video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Language: Python 6.7k 74 246982
FunAudioLLM/CosyVoice
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Language: Python 6.5k 65 548702
Ucas-HaoranWei/GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Language: Python 6.1k 53 157530
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
Language: Python 5.4k 34 569441
fudan-generative-vision/champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
Language: Python 4.8k 313 127598
fudan-generative-vision/hallo2
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
Language: Python 4.4k 430 48623
wenge-research/YAYI2
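YAYI 2 is a new-generation open-source large language model developed by Wenge Research (中科闻歌), pretrained on more than 2 trillion tokens of high-quality, multilingual corpus data. (Repo for YaYi 2 Chinese LLMs)
Language: Python 3.6k 7 917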
chanind/hanzi-writer
Chinese character stroke order animations and practice quizzes
Language: TypeScript 3.6k 62 186557
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
Language: Python 3.5k 38 132318
VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Language: Jupyter Notebook 2.9k 84 110232
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Language: Python 2.5k 43 391156
NUS-HPC-AI-Lab/VideoSys
VideoSys: An easy and efficient system for video generation
Language: Python 1.8k 27 88123
gusye1234/nano-graphrag
A simple, easy-to-hack GraphRAG implementation
Language: Python 1.7k 10 65161
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
Language: Python 1.4k 11 227195
qhjqhj00/MemoRAG
Empowering RAG with a memory-based data interface for all-purpose applications!
Language: Python 1.3k 11 3280
xdit-project/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Language: Python 733 7 9156
NousResearch/Hermes-Function-Calling
Language: Jupyter Notebook 728 13 2590
AIDC-AI/Ovis
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
Language: Python 545 7 3532
smthemex/ComfyUI_Hallo2
ComfyUI_Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
Language: Python 444
adigunturu/AugmentedPhysics
Creating Interactive and Embedded Physics Simulations from Static Textbook Diagrams
Language: HTML 13