songyinghao's Stars
hoppscotch/hoppscotch
Open source API development ecosystem - https://hoppscotch.io (open-source alternative to Postman, Insomnia)
facebookresearch/segment-anything
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
TabbyML/tabby
Self-hosted AI coding assistant
continuedev/continue
⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
labring/FastGPT
FastGPT is a knowledge-based platform built on the LLMs, offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you easily develop and deploy complex question-answering systems without the need for extensive setup or configuration.
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
pwxcoo/chinese-xinhua
:orange_book: 中华新华字典数据库。包括歇后语,成语,词语,汉字。
HKUDS/LightRAG
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
CASIA-IVA-Lab/FastSAM
Fast Segment Anything
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
OpenTalker/video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Ucas-HaoranWei/GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs | 开源多语言多模态对话模型
fudan-generative-vision/champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
fudan-generative-vision/hallo2
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
wenge-research/YAYI2
YAYI 2 是中科闻歌研发的新一代开源大语言模型,采用了超过 2 万亿 Tokens 的高质量、多语言语料进行预训练。(Repo for YaYi 2 Chinese LLMs)
chanind/hanzi-writer
Chinese character stroke order animations and practice quizzes
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
NUS-HPC-AI-Lab/VideoSys
VideoSys: An easy and efficient system for video generation
gusye1234/nano-graphrag
A simple, easy-to-hack GraphRAG implementation
open-compass/VLMEvalKit
Open-source evaluation toolkit of large vision-language models (LVLMs), support 160+ VLMs, 50+ benchmarks
qhjqhj00/MemoRAG
Empowering RAG with a memory-based data interface for all-purpose applications!
xdit-project/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
NousResearch/Hermes-Function-Calling
AIDC-AI/Ovis
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
smthemex/ComfyUI_Hallo2
ComfyUI_Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
adigunturu/AugmentedPhysics
Creating Interactive and Embedded Physics Simulations from Static Textbook Diagrams