a2819z's Stars
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
gyoogle/tech-interview-for-developer
👶🏻 Encyclopedia of CS major knowledge & technical interviews for new developers 📖
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
khangich/machine-learning-interview
Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix etc. Blog: mlengineer.io.
zhanymkanov/fastapi-best-practices
FastAPI Best Practices and Conventions we used at our startup
Plachtaa/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/
WooVictory/Ready-For-Tech-Interview
💻 A space for studying to build up knowledge as a new developer 👨‍💻
andrewekhalel/MLQuestions
Machine Learning and Computer Vision Engineer - Technical Interview Questions
resemble-ai/Resemblyzer
A python package to analyze and compare voices with deep learning
haoheliu/AudioLDM2
Text-to-Audio/Music Generation
google-research/big_vision
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
SeanNaren/deepspeech.pytorch
Speech Recognition using DeepSpeech2.
boost-devs/ai-tech-interview
👩‍💻👨‍💻 AI engineer technical interview study group (⭐️ 1k+)
ELS-RD/kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
Jamie-Stirling/RetNet
An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
gnobitab/InstaFlow
:zap: InstaFlow! One-Step Stable Diffusion with Rectified Flow (ICLR 2024)
zsyOAOA/ResShift
ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting (NeurIPS@2023 Spotlight, TPAMI@2024)
naver-ai/DenseDiffusion
Official Pytorch Implementation of DenseDiffusion (ICCV 2023)
subinium/Misc-Cheatsheet
System and coding tips used during graduate school (linux commands, etc.)
qkraudghgh/coding-interview
A repo organizing what I studied while preparing for job hunting
hutaiHang/Faster-Diffusion
[NeurIPS 2024] Official implementation of "Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models"
YuanGongND/cav-mae
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
TiankaiHang/Min-SNR-Diffusion-Training
[ICCV 2023] Efficient Diffusion Training via Min-SNR Weighting Strategy
pcb9382/StereoAlgorithms
Stereo Algorithms (Include:CREStereo,RAFT-Stereo,Hitnet,FastACVNet_plus,Stereo Transformers,RealtimeStereo,DistDepth) with TensorRT,ORT,OpenVINO
OscarXZQ/weight-selection
winddori2002/TriAAN-VC
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
curryjung/InjectFusion_official
ibaiGorordo/ONNX-FastACVNet-Depth-Estimation
Python scripts performing stereo depth estimation using the Fast-ACVNet model in ONNX.
vinceecws/Monodepth
PyTorch implementation of Unsupervised Monocular Depth Estimation with Left-Right Consistency