Pinned Repositories
a16z-xHuman-DIY
A JavaScript AI getting-started stack for weekend projects, including image/text models, vector stores, auth, and deployment configs
AGI-Samantha
AGI has been achieved externally
AI-RealChat-DIY
🎙️🤖 Create, customize, and talk to your AI character/companion in real time (all in one codebase!). Have a natural, seamless conversation with AI everywhere (mobile, web, and terminal) using LLMs (OpenAI GPT3.5/4, Anthropic Claude2), Chroma Vector DB, Whisper Speech2Text, and ElevenLabs Text2Speech 🎙️🤖
ai-video-search-engine
AnyDoor
Official implementation of the paper "AnyDoor: Zero-shot Object-level Image Customization"
build-your-own-x
Master programming by recreating your favorite technologies from scratch.
BunnyVisionPro
Bimanual Dexterous Teleoperation with Real-Time Retargeting using VisionPro
ChatLaw
Chinese legal large language model
NotionWebsite
A static blog built with NextJS + Notion API
Pic-2-3D
Official PyTorch Implementation of Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors
QAGITECH's Repositories
QAGITECH/NotionWebsite
A static blog built with NextJS + Notion API
QAGITECH/build-your-own-x
Master programming by recreating your favorite technologies from scratch.
QAGITECH/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
QAGITECH/audio-preprocess
Preprocess audio for training
QAGITECH/Bert-VITS2
VITS2 backbone with multilingual BERT
QAGITECH/DiffSynth-Studio
Enjoy the magic of Diffusion models!
QAGITECH/fish-diffusion
An easy-to-understand TTS / SVS / SVC framework
QAGITECH/fish-speech
Brand new TTS solution
QAGITECH/flux
Official inference repo for FLUX.1 models
QAGITECH/free-programming-books
📚 Freely available programming books
QAGITECH/FreeStyleRet
Precision Search through Multi-Style Inputs
QAGITECH/GPT-SoVITS
Even 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
QAGITECH/gpt4all
GPT4All: Chat with Local LLMs on Any Device
QAGITECH/gptpdf
Use GPT to parse PDFs
QAGITECH/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
QAGITECH/groqbook
Groqbook: Generate entire books in seconds using Groq and Llama3
QAGITECH/localGPT
Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.
QAGITECH/metahuman-stream
Real time interactive streaming digital human
QAGITECH/mini-omni
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
QAGITECH/moshi
QAGITECH/mPLUG-Owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
QAGITECH/NotebookLlama
NotebookLlama: An Open Source version of NotebookLM
QAGITECH/OpenGlass
Turn any glasses into AI-powered smart glasses
QAGITECH/openpilot
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system in 275+ supported cars.
QAGITECH/Perplexica
Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
QAGITECH/PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
QAGITECH/sapiens
High-resolution models for human tasks.
QAGITECH/so-vits-svc
A singing voice timbre conversion model based on VITS and SoftVC
QAGITECH/Streamer-Sales
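Streamer-Sales (Sales Champion): a live-stream sales host LLM 🛒🎁 that presents products based on their given features, with the aim of sparking viewers' desire to buy. 🚀⭐ Includes a detailed data generation pipeline❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, and ASR (speech-to-text) 🎙️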
QAGITECH/SyncTalk
[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"