edwin-19

Machine Learning Engineer with a focus on deep learning and deploying deep learning models. I specialise in computer vision, NLP, speech, and multimodal AI.

Malaysia

edwin-19's Stars

microsoft/vscode
Visual Studio Code
Language: TypeScript 163k 3.3k 184k28.8k
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
Language: TypeScript 46.7k 348 3.9k6.6k
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Language: Python 33.1k 202 1.2k3.8k
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 27.5k 225 4.6k4k
danielmiessler/fabric
fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
Language: Go 23.2k 310 4482.5k
unslothai/unsloth
Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
Language: Python 16.1k 110 8261.1k
fishaudio/fish-speech
Brand new TTS solution
Language: Python 12.6k 89 360948
neuml/txtai
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Language: Python 8.8k 85 750580
pydantic/FastUI
Build better UIs faster.
Language: Python 8.1k 65 212311
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Language: Python 7.3k 63 150621
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
Language: Python 6.8k 49 211522
google/gemma.cpp
lightweight, standalone C++ inference engine for Google's Gemma models.
Language: C++ 5.9k 40 85501
pytorch-labs/gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
Language: Python 5.5k 63 98504
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
Language: Python 5.2k 39 37502
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python 4.5k 58 152383
collabora/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
Language: Jupyter Notebook 3.8k 76 105206
argmaxinc/WhisperKit
On-device Speech Recognition for Apple Silicon
Language: Swift 3.2k 28 115267
sh-lee-prml/HierSpeechpp
The official implementation of HierSpeech++
Language: Python 1.2k 56 52134
gemelo-ai/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Language: Python 769 33 4688
JusperLee/Conv-TasNet
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation Pytorch's Implement
Language: Python 417 6 5476
fh2019ustc/Awesome-Document-Image-Rectification
A comprehensive list of awesome document image rectification papers.
353 14 429
X-LANCE/VoiceFlow-TTS
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
Language: Python 301 15 1520
huggingface/dataspeech
Language: Python 277 13 1537
NVIDIA-AI-IOT/nanoowl
A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT.
Language: Python 233 4 2842
isaaccorley/torchseg
Segmentation models with pretrained backbones. PyTorch.
Language: Python 98 6 218
interactiveaudiolab/ppgs
High-Fidelity Neural Phonetic Posteriorgrams
Language: Python 85 8 134
NeuralVox/OpenPhonemizer
An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GPL phonemizer.
Language: Python 75 4 65
hitachi-nlp/appjsonify
A handy PDF-to-JSON conversion tool for academic papers implemented in Python.
Language: Python 52 3 33
WangHelin1997/Aty-TTS
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
Language: Python 10 2 11
ETZET/SpeechEmotionAVLearning
Language: HTML 9 1 12
