lonngxiang's Stars
mudler/LocalAI
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference
searxng/searxng
SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
THU-MIG/yolov10
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
BloopAI/bloop
bloop is a fast code search engine written in Rust.
xiph/rnnoise
Recurrent neural network for audio noise reduction
LLM-Red-Team/kimi-free-api
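🚀 KIMI AI long-context LLM reverse-engineered API for free testing [specialty: reading and summarizing long texts]. Supports high-speed streaming output, agent conversations, web search, long-document interpretation, image OCR, and multi-turn dialogue, with zero-configuration deployment, multi-token support, and automatic cleanup of session traces.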
Tencent/HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
tencent-ailab/V-Express
V-Express aims to generate a talking head video under the control of a reference image, an audio clip, and a sequence of V-Kps images.
THUDM/CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
Alpha-VLLM/Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
alexpinel/Dot
Text-To-Speech, RAG, and LLMs. All local!
timsainb/noisereduce
Noise reduction in python using spectral gating (speech, bioacoustics, audio, time-domain signals)
deedy5/duckduckgo_search
Search for words, documents, images, videos, news, maps and text translation using the DuckDuckGo.com search engine. Supports downloading files and images to a local hard drive.
pix2pixzero/pix2pix-zero
Zero-shot Image-to-Image Translation [SIGGRAPH 2023]
fatwang2/search2ai
Help your LLMs online
IDEA-Research/Grounding-DINO-1.5-API
API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
HITsz-TMG/UMOE-Scaling-Unified-Multimodal-LLMs
Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"
vb000/LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
maitrix-org/Pandora
Pandora: Towards General World Model with Natural Language Actions and Video States
zhangliwei7758/unity-AI-Chat-Toolkit
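Implements AI chat features in Unity. The library currently includes code for calling the APIs of large language models such as ChatGPT and ChatGLM, as well as speech services from Microsoft Azure and Baidu AI. All speech services are implemented via web APIs and support Windows, WebGL, Android, and other platforms.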
sozercan/aikit
🏗️ Fine-tune, build, and deploy open-source LLMs easily!
jjihwan/FIFO-Diffusion_public
Official implementation of FIFO-Diffusion: Generating Infinite Videos from Text without Training (NeurIPS 2024)
FaceAdapter/Face-Adapter
bytedance/Make-An-Audio-2
a text-conditional diffusion probabilistic model capable of generating high fidelity audio.
mingukkang/Diffusion2GAN
Website source files for Diffusion2GAN Project.
LuMen-ze/Semantic-Gesticulator-Official
jjmlovesgit/D-id_Streaming_Chatgpt
jjmlovesgit/d-id_streams