pywangyu's Stars
kby-ai/FaceRecognition-Docker
This is the docker project for face recognition
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
OpenBMB/MiniCPM-o
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
bilibili/Index-1.9B
A SOTA lightweight multilingual LLM
KwaiVGI/LivePortrait
Bring portraits to life!
Kwai-Kolors/Kolors
Kolors Team
ali-vilab/videocomposer
Official repo for VideoComposer: Compositional Video Synthesis with Motion Controllability
QwenLM/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
cosmicman-cvpr2024/CosmicMan
CosmicMan: A Text-to-Image Foundation Model for Humans (CVPR 2024)
PKU-YuanGroup/MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
awslabs/aws-ai-solution-kit
Machine Learning APIs for common use cases, including: General OCR (Simplified/Traditional Chinese), Custom OCR, Image Similarity, Object Recognition, Face Detection, Face Comparison, Human Image Segmentation, Human Attribute Recognition, Pornography Detection, Image Super Resolution, Text Similarity, Car License Plate Recognition, etc.
PixArt-alpha/PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
facefusion/facefusion
Industry leading face manipulation platform
THUDM/CogVLM
A state-of-the-art open visual language model | Multimodal pretrained model
OpenGVLab/InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Qengineering/Install-OpenCV-Jetson-Nano
OpenCV installation script with CUDA and cuDNN support
datawhalechina/self-llm
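"Open-Source Large Model Usage Guide": a tutorial tailored for ** beginners on rapid fine-tuning (full-parameter/LoRA) and deployment of Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment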
pytorch/serve
Serve, optimize and scale PyTorch models in production
Haxxnet/Compose-Examples
Various Docker Compose examples of self-hosted FOSS and proprietary projects.
ultralytics/ultralytics
Ultralytics YOLO11 🚀
PKU-YuanGroup/Chat-UniVi
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
xyy15926/proxy
Alvin9999/new-pac
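Firewall circumvention and open internet access: free circumvention tools, fanqiang, YouTube video download, software, VPN, one-click circumvention browsers, one-click VPS scripts and tutorials for setting up a circumvention server, free shadowsocks/ss/ssr/v2ray/goflyway accounts and nodes, circumvention on PC, mobile, iOS, Android, Windows, Mac, Linux, and routers, YouTube mirrors and sites reachable without circumvention, and shared US-region Apple ID accounts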
open-mmlab/mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
mxin262/SwinTextSpotter
PyTorch re-implementation of the paper "SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition" (CVPR 2022)
pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Lawouach/WebSocket-for-Python
WebSocket client and server library for Python 2 and 3 as well as PyPy (ws4py 0.5.1)
WenmuZhou/DBNet.pytorch
A PyTorch re-implementation of "Real-time Scene Text Detection with Differentiable Binarization"
intel/handwritten-chinese-ocr-samples
End-to-end model training and deployment reference for handwritten Chinese text recognition, which can also be extended to other languages.
kingyiusuen/image-to-latex
Convert images of LaTeX math equations into LaTeX code.