Metokarski

Metokarski's Stars

upscayl/upscayl
🆙 Upscayl - #1 Free and Open Source AI Image Upscaler for Linux, macOS and Windows.
Language: TypeScript 30.2k 153 7501.4k
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Language: Python 25.4k 196 4.1k5.2k
roboflow/supervision
We write your reusable computer vision tools. 💜
Language: Python 22.7k 156 4151.7k
facebookresearch/segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook 11.1k 64 259944
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language: Jupyter Notebook 6k 71 989757
kyutai-labs/moshi
Language: Python 5.9k 68 54438
deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
3.5k 29 84143
DepthAnything/Depth-Anything-V2
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Language: Python 3.4k 29 154279
gpt-omni/mini-omni
Open-source multimodal large language model that can hear and talk while thinking. It features real-time, end-to-end speech input and streaming audio output for conversation.
Language: Python 2.7k 94 89252
isl-org/ZoeDepth
Metric depth estimation from a single image
Language: Jupyter Notebook 2.3k 32 114210
bghira/SimpleTuner
A general fine-tuning kit geared toward diffusion models.
Language: Python 1.6k 19 348143
aiola-lab/whisper-medusa
Whisper with Medusa heads
Language: Python 784 13 1348
IDEA-Research/Motion-X
[NeurIPS 2023] Official implementation of the paper "Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset"
Language: Python 541 26 9815
ai-forever/Real-ESRGAN
PyTorch implementation of Real-ESRGAN model
Language: Python 472 12 23122
apple/ml-sigmoid-attention
Language: Python 2149
VIPL-Audio-Visual-Speech-Understanding/LipNet-PyTorch
The state-of-the-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxiv.org/abs/1611.01599)
Language: Python 208 6 3649
Lornatang/ESRGAN-PyTorch
A simple implementation of ESRGAN, which uses the PyTorch framework.
Language: Python 137 4 3929
chenzhuo1011/libri_css
Libri-CSS: dataset and evaluation pipeline
Language: Python 131 8 722
facebookresearch/ava-256
Train universal codec avatars
Language: Jupyter Notebook 77 9 84
sailordiary/LipNet-PyTorch
"LipNet: End-to-End Sentence-level Lipreading" in PyTorch
Language: Python 64 4 620
KoljaB/WhoSpeaks
Efficient approach to speaker diarization using voice characteristics extraction
Language: Python 58 2 36
huggingface/fineVideo
Language: Python 313
wizenheimer/cyyrus
Transform Unstructured Data into Synthetic Datasets
Language: Python 202
dimtzionas/HandObjectInteractionIJCV16_HandMotionViewer
Hand MoCap 3d viewer for the IJCV'16 paper "Capturing Hands in Action using Discriminative Salient Points and Physics Simulation"
Language: HTML 11 4 03
jack-tol/youtube-to-audio
A lightweight Python package and command-line interface (CLI) tool that extracts audio from YouTube videos and playlists in multiple formats, such as MP3, WAV, OGG, AAC, and FLAC.
Language: Python 71
ffeew/LipCoordNet
A multi-modal neural network built upon LipNet that achieves SOTA performance on the GRID corpus
Language: Python 4 2 22
dimtzionas/HandObjectInteractionIJCV16_GroundTruthViewer
Ground-truth viewer for the IJCV'16 paper "Capturing Hands in Action using Discriminative Salient Points and Physics Simulation"
Language: C++ 2 2 01
cvlabbonn/hand_2d_gt_viewer
A tool to view the data set distributed freely by Dimitris Tzionas
Language: C++ 13
cvlabbonn/hands_3d_motion_viewer
A tool to view the data set distributed freely by Dimitris Tzionas
Language: C++ 13
PingYufeng/LipNet-PyTorch-1
PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxiv.org/abs/1611.01599)
Language: Python 1 1 00