GuanRainy's Stars
mindee/doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
echo1118/Live_Detection
Liveness detection: blink detection, mouth-opening detection, head-shake detection, and nod detection
codeniko/shape_predictor_81_face_landmarks
Custom shape predictor model trained to find 81 facial feature landmarks given any image
davisking/dlib-models
Trained model files for dlib example programs.
abnercloud/Facial_106_Landmarks
Facial_106_Landmarks
hpc203/yoloface-landmark106
Pure YOLO-based face detection plus 106-keypoint detection
biubug6/Pytorch_Retinaface
RetinaFace reaches 80.99% on the WIDER FACE hard validation set using MobileNet-0.25.
jhb86253817/PIPNet
Efficient facial landmark detector
midasklr/facelandmarks
Ultra-lightweight 98-point face landmark detection model
HumanSignal/awesome-data-labeling
A curated list of awesome data labeling tools
Evezerest/PPOCRLabel
PPOCRLabel is a semi-automatic graphical annotation tool for the OCR field, with a built-in PP-OCR model that automatically detects and re-recognizes data. It is written in Python 3 and PyQt5 and supports rectangular-box and four-point annotation modes. Annotations can be used directly to train PP-OCR detection and recognition models.
itmorn/robot-mouse-track
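As internet technology develops, demand for mouse-trajectory recognition algorithms is growing in many human-computer interaction products. For example, some websites add slider CAPTCHAs to prevent scraping, but some software can already simulate human behavior to defeat them. This project analyzes features of mouse trajectories to determine whether the behavior comes from a human or a machine. Common use cases: website anti-scraping, and detecting scripted answering in online exam systems. Documentation: https://robot-mouse-track.readthedocs.io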
kwuking/TimeMixer
[ICLR 2024] Official implementation of "TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting"
mitchellh/mapstructure
Go library for decoding generic map values into native Go structures and vice versa.
google-research/timesfm
TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
fundamentalvision/Deformable-DETR
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
minio/minio
The Object Store for AI Data Infrastructure
google-ai-edge/mediapipe
Cross-platform, customizable ML solutions for live and streaming media.
DefTruth/torchlm
💎 A high-level pipeline for face landmark detection. It supports training, evaluation, export, inference (Python/C++), and 100+ data augmentations, and can be installed easily via pip.
reqable/reqable-app
Reqable issue track repo
leon-thomm/Ryven
Flow-based visual scripting for Python
haochenheheda/segment-anything-annotator
We developed a Python UI based on labelme and segment-anything for pixel-level annotation. It supports multiple-mask generation by SAM (box/point prompts), efficient polygon modification, and category recording. We will add more features (such as incorporating CLIP-based methods for category proposal and VOS methods for video datasets).
LLaVA-VL/LLaVA-NeXT
yformer/EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
dvlab-research/LISA
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
facebookresearch/mae
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
OpenBMB/MiniCPM-V
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone