Pinned Repositories
autogen
A programming framework for agentic AI 🤖
fk-visual-search
Flipkart's visual search and recommendation system
Flash3D
InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal dialogue model approaching GPT-4V performance.
llama.cpp
LLM inference in C/C++
LLMFarm
Run llama and other large language models offline on iOS and macOS using the GGML library.
mlc-llm
Universal LLM Deployment Engine with ML Compilation
point_based_clothing
Official PyTorch code for the paper: "Point-Based Modeling of Human Clothing" (ICCV 2021)
Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
VINS-AR
AR project based on "Monocular Visual-Inertial State Estimator on Mobile Phones"
GPTAlgoPro's Repositories
GPTAlgoPro/autogen
A programming framework for agentic AI 🤖
GPTAlgoPro/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
GPTAlgoPro/VideoChat
Real-time voice-interactive digital human, supporting an end-to-end voice solution (GLM-4-Voice - THG) and a cascaded solution (ASR-LLM-TTS-THG). Appearance and voice are customizable without training, voice cloning is supported, and first-packet latency is as low as 3 s.
GPTAlgoPro/whisper.cpp
Port of OpenAI's Whisper model in C/C++
GPTAlgoPro/CogVideo
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
GPTAlgoPro/cogvideox-factory
Memory optimized finetuning scripts for CogVideoX using TorchAO and DeepSpeed
GPTAlgoPro/DocLayout-YOLO
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
GPTAlgoPro/docling
Get your documents ready for gen AI
GPTAlgoPro/Fay
Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guides, broadcasters, assistants, waiters, teachers, and voice or text-based mobile assistants.
GPTAlgoPro/GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
GPTAlgoPro/GPT-SoVITS
Just 1 minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
GPTAlgoPro/ILLIXR
ILLIXR: Illinois Extended Reality Testbed
GPTAlgoPro/labelU
Data annotation toolbox that supports image, audio, and video data.
GPTAlgoPro/LGU-SLAM
LGU-SLAM: Learnable Gaussian Uncertainty Matching with Deformable Correlation Sampling for Deep Visual SLAM
GPTAlgoPro/LiveTalking
Real time interactive streaming digital human
GPTAlgoPro/MinerU
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
GPTAlgoPro/MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
GPTAlgoPro/moonshine
Fast and accurate automatic speech recognition (ASR) for edge devices
GPTAlgoPro/oxford_spires_dataset
GPTAlgoPro/ppsspp
A PSP emulator for Android, Windows, Mac and Linux, written in C++. Want to contribute? Join us on Discord at https://discord.gg/5NJB6dD or just send pull requests / issues. For discussion use the forums at forums.ppsspp.org.
GPTAlgoPro/pyvideotrans
Translate a video from one language to another and add dubbing, with support for API calls.
GPTAlgoPro/rlt
Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers".
GPTAlgoPro/Steel-LLM
Train a Chinese LLM from scratch as an individual
GPTAlgoPro/Super-Scanner-distribution-privacy
Super Scanner distribution privacy page
GPTAlgoPro/SuperVINS
A robust real-time visual-inertial SLAM framework for challenging imaging conditions (integrated deep learning features)
GPTAlgoPro/SwiftUIX
An exhaustive expansion of the standard SwiftUI library.
GPTAlgoPro/TANGO
Official implementation of the paper "TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation"
GPTAlgoPro/tiled
Flexible level editor
GPTAlgoPro/Video-XL
🔥🔥First-ever hour-scale video understanding models
GPTAlgoPro/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs