lucasjinreal
Play with Neural Magic, AIGC, 3D Computer Vision, Virtual Human, 3D Artist. Work at @tencent
Google, San Francisco
lucasjinreal's Stars
2noise/ChatTTS
ChatTTS is a generative speech model for daily dialogue.
warpdotdev/Warp
Warp is a modern, Rust-based terminal with AI built in so you and your team can build great software, faster.
joaomdmoura/crewAI
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
ratatui-org/ratatui
Rust library that's all about cooking up terminal user interfaces (TUIs) 👨‍🍳🐀
THU-MIG/yolov10
YOLOv10: Real-Time End-to-End Object Detection
OpenBMB/MiniCPM-V
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
microsoft/TaskWeaver
A code-first agent framework for seamlessly planning and executing data analytics tasks.
xorbitsai/inference
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
tencent-ailab/V-Express
V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.
THUDM/CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
kingwrcy/moments
A minimalist Moments feed (in the style of WeChat Moments)
jianchang512/vocal-separate
An extremely simple tool for separating vocals and background music, operated locally through a web interface with no internet connection required, using 2stems/4stems/5stems models.
flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
multimodal-art-projection/MAP-NEO
magic-research/PLLaVA
Official repository for the paper PLLaVA
CasualGANPapers/Make-A-Scene
PyTorch implementation of Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors
ramsrigouthamg/Supertranslate.ai
Subtitle Videos and add text motion graphics - https://www.supertranslate.ai/
NVlabs/VILA
VILA - A multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
LLM-Red-Team/step-free-api
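🚀 Reverse-engineered API for the YueWen multimodal large model, for free-use testing (specialty: very strong multimodality). Supports high-speed streaming output, web search, long-document interpretation, image parsing, and multi-turn conversation, with zero-configuration deployment, multi-token support, and automatic cleanup of session traces.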
CoffeeStraw/PyonFX
An easy way to create KFX (Karaoke Effects) and complex typesetting using the ASS format (Advanced Substation Alpha).
wangyuchi369/InstructAvatar
Official implementation of the paper 'InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation'
TIGER-AI-Lab/Mantis
Official code for Paper "Mantis: Multi-Image Instruction Tuning"
wzzheng/OccSora
OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving
xiaoachen98/Open-LLaVA-NeXT
An open-source implementation of LLaVA-NeXT.
bravekingzhang/search-engine-tool
Possibly the best free search engine API; supports Google, Bing, DuckDuckGo, and Yahoo
zhuojg/chinese-calligraphy-dataset
coreweave/ml-containers
zjunlp/WKM
Agent Planning with World Knowledge Model