pywangyu

pywangyu's Stars

kby-ai/FaceRecognition-Docker
This is the docker project for face recognition
Language:Python5536
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Language: Python, 9.1k stars, 1.2k forks
OpenBMB/MiniCPM-o
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Language: Python, 17.1k stars, 1.2k forks
bilibili/Index-1.9B
A SOTA lightweight multilingual LLM
Language:Python93847
KwaiVGI/LivePortrait
Bring portraits to life!
Language: Python, 13.7k stars, 1.5k forks
Kwai-Kolors/Kolors
Kolors Team
Language: Python, 4.1k stars, 304 forks
ali-vilab/videocomposer
Official repo for VideoComposer: Compositional Video Synthesis with Motion Controllability
Language:Python91784
QwenLM/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
Language: Python, 1.6k stars, 112 forks
cosmicman-cvpr2024/CosmicMan
CosmicMan: A Text-to-Image Foundation Model for Humans (CVPR 2024)
Language:Python3248
PKU-YuanGroup/MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
Language: Python, 2.1k stars, 128 forks
awslabs/aws-ai-solution-kit
Machine Learning APIs for common use cases, including: General OCR (Simplified/Traditional Chinese), Custom OCR, Image Similarity, Object Recognition, Face Detection, Face Comparison, Human Image Segmentation, Human Attribute Recognition, Pornography Detection, Image Super Resolution, Text Similarity, Car License Plate, etc.
Language:Python16824
PixArt-alpha/PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
Language: Python, 1.7k stars, 85 forks
facefusion/facefusion
Industry leading face manipulation platform
Language: Python, 21.1k stars, 3.2k forks
THUDM/CogVLM
A state-of-the-art open visual language model | multimodal pretrained model
Language: Python, 6.3k stars, 426 forks
OpenGVLab/InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Language: Python, 1.6k stars, 97 forks
Qengineering/Install-OpenCV-Jetson-Nano
OpenCV installation script with CUDA and cuDNN support
Language:Shell15655
datawhalechina/self-llm
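"A Guide to Using Open-Source Large Models": a tutorial tailored for ** beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment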
Language: Jupyter Notebook, 11.6k stars, 1.3k forks
pytorch/serve
Serve, optimize and scale PyTorch models in production
Language: Java, 4.3k stars, 868 forks
Haxxnet/Compose-Examples
Various Docker Compose examples of selfhosted FOSS and proprietary projects.
5.9k stars, 275 forks
ultralytics/ultralytics
Ultralytics YOLO11 🚀
Language: Python, 35.6k stars, 6.9k forks
PKU-YuanGroup/Chat-UniVi
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Language:Python89943
xyy15926/proxy
51
Alvin9999/new-pac
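Firewall circumvention and unrestricted internet access: free internet access, free circumvention, fanqiang, YouTube/video downloads, software, VPN, one-click circumvention browsers, one-click VPS scripts/tutorials for setting up circumvention servers, free shadowsocks/ss/ssr/v2ray/goflyway accounts/nodes, circumvention proxies ("ladders"), circumvention on computers, phones, iOS, Android, Windows, Mac, Linux, and routers, YouTube video downloads, YouTube mirrors/sites accessible without circumvention, shared US-region Apple ID accounts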
Language: Python, 57.9k stars, 9.7k forks
open-mmlab/mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
Language: Python, 4.4k stars, 755 forks
mxin262/SwinTextSpotter
PyTorch re-implementation of the paper "SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition" (CVPR 2022)
Language:Python27742
pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Language: Python, 86k stars, 23.2k forks
Lawouach/WebSocket-for-Python
WebSocket client and server library for Python 2 and 3 as well as PyPy (ws4py 0.5.1)
Language: Python, 1.1k stars, 288 forks
WenmuZhou/DBNet.pytorch
A PyTorch re-implementation of "Real-time Scene Text Detection with Differentiable Binarization"
Language: Python, 971 stars, 250 forks
intel/handwritten-chinese-ocr-samples
End-to-end model training and deployment reference for handwritten Chinese text recognition, and can also be extended to other languages.
Language:Python15132
kingyiusuen/image-to-latex
Convert images of LaTeX math equations into LaTeX code.
Language: Python, 2.1k stars, 313 forks