drock577

Intel
Arizona, USA

drock577's Stars

HumanSignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Language: JavaScript 21.3k 181 2.5k 2.6k
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook 14.6k 82 4891.5k
cvat-ai/cvat
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
Language: Python 13.4k 182 4.4k 3.2k
voxel51/fiftyone
Refine high-quality datasets and visual AI models
Language: Python 9.3k 65 1.6k 607
TadasBaltrusaitis/OpenFace
OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
Language: MATLAB 7.2k 280 1k 1.9k
MrNeRF/awesome-3D-gaussian-splatting
Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.
Language: HTML 6.9k 223 60424
autodistill/autodistill
Images to inference with no labeling (use foundation models to train supervised models).
Language: Python 2.2k 20 107178
nvkelso/natural-earth-vector
A global, public domain map dataset available at three scales and featuring tightly integrated vector and raster data.
Language: HTML 1.9k 82 874382
Purfview/whisper-standalone-win
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
1.8k 43 23185
xiaobai1217/Awesome-Video-Datasets
Video datasets
1.4k 29 12100
NVlabs/InstantSplat
InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds
Language: Python 1.3k 19 6392
juanmc2005/diart
A python package to build AI-powered real-time audio applications
Language: Python 1.2k 18 16395
Rudrabha/Lip2Wav
This is the repository containing the code for our CVPR 2020 paper titled "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis"
Language: Python 702 25 39154
zenml-io/awesome-open-data-annotation
Open Source Data Annotation & Labeling Tools
563 14 045
pliang279/MultiBench
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
Language: HTML 526 15 3676
microsoft/XPretrain
Multi-modality pre-training
Language: Python 487 12 4037
supervisely/supervisely
Supervisely SDK for Python - convenient way to automate, customize and extend Supervisely Platform for your computer vision task
Language: Python 474 22 7072
CMU-MultiComp-Lab/CMU-MultimodalSDK
Language: Python 224 5 1825
jimmy646/violin
Data and code for CVPR 2020 paper: "VIOLIN: A Large-Scale Dataset for Video-and-Language Inference"
Language: Python 160 9 1115
cvqluu/simple_diarizer
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
Language: Python 145 7 1629
jssprz/video_captioning_datasets
Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*
Language: Jupyter Notebook 121 2 112
DmZhukov/CrossTask
Language: Python 89 5 1210
Chris10M/Lip2Speech
A pipeline to read lips and generate speech for the read content, i.e., Lip to Speech Synthesis.
Language: Python 82 4 320
Zhao-Tian-yi/RSDet
Language: Jupyter Notebook 61 1 156
yochaiye/LipVoicer
Official Code implementation for the ICLR paper "LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading"
Language: Python 59 2 68
chahuja/pats
PATS Dataset. Aligned Pose-Audio-Transcripts and Style for co-speech gesture research
Language: Python 57 1 136
Jerrry-Li/YOLO-FIRI
Language: Jupyter Notebook 13 1 21
JeongHun0716/Personalized-Lip-Reading
Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language (AAAI 2025)
Language: Python 12 1 03
3loi/MSP_Face
Language: Python 10 2 44
jinchiniao/LSHUC
BMVC'23 Learning Separable Hidden Unit Contributions for Speaker-Adaptive Lip-Reading
Language: Python 6 1 00
