kaka1909

kaka1909's Stars

PlexPt/awesome-chatgpt-prompts-zh
A Chinese-language guide to prompting ChatGPT. Usage guides for a variety of scenarios. Learn how to make it do what you tell it.
54.7k 346 9613.6k
ageitgey/face_recognition
The world's simplest facial recognition api for Python and the command line
Language: Python 54.5k 1.6k 1.4k 13.6k
ultralytics/yolov5
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Language: Python 53.3k 372 9.4k 16.8k
LAION-AI/Open-Assistant
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
Language: Python 37.3k 437 1.6k 3.3k
upscayl/upscayl
🆙 Upscayl - #1 Free and Open Source AI Image Upscaler for Linux, MacOS and Windows.
Language: TypeScript 36.1k 169 8891.7k
facebookresearch/detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Language: Python 31.7k 391 3.5k 7.6k
iperov/DeepFaceLive
Real-time face swap for PC streaming or video calls
Language: Python 28k 397 144398
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python 22.1k 158 1.6k 2.4k
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
Language: Python 20.3k 180 1.5k 1.7k
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
Language: Python 15.2k 134 8191.3k
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Language: Python 14.8k 144 7921.6k
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python 13.5k 218 2.5k 2.8k
jina-ai/clip-as-service
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
Language: Python 12.6k 220 6122.1k
kkroening/ffmpeg-python
Python bindings for FFmpeg - with complex filtering support
Language: Python 10.4k 112 716907
AIGC-Audio/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Language: Python 10.1k 134 52862
BloopAI/bloop
bloop is a fast code search engine written in Rust.
Language: Rust 9.5k 64 144583
PeterL1n/RobustVideoMatting
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
Language: Python 8.9k 135 2511.2k
deep-floyd/IF
Language: Python 7.8k 83 102509
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language: Jupyter Notebook 7.2k 78 1k 860
MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Language: Jupyter Notebook 4.3k 46 242397
showlab/Tune-A-Video
[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Language: Python 4.3k 50 97388
xtekky/chatgpt-clone
ChatGPT interface with better UI
Language: Python 3.5k 48 841k
SCUTlihaoyu/open-chat-video-editor
Open source short video automatic generation tool
Language: Python 2.8k 41 34361
blmoistawinde/HarvestText
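Text mining and preprocessing toolkit (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.), using unsupervised or weakly supervised methods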
Language: Python 2.5k 53 46336
saffsd/langid.py
Stand-alone language identification system
Language: Python 2.4k 64 72324
joonson/syncnet_python
Out of time: automated lip sync in the wild
Language: Python 743 15 64166
TaoRuijie/TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
Language: Python 362 11 7379
keplerlab/katna
Tool for automating common video key-frame extraction, video compression and Image Auto-crop/Image-resize tasks
Language: Python 353 25 2766
Junhua-Liao/Light-ASD
The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)
Language: Python 118 2 2215
awslabs/aws-media-replay-engine
Media Replay Engine (MRE) is a framework to build automated video clipping and replay (highlight) generation pipelines for live and video-on-demand content.
Language: Python 97 6 2023