luyizhou4

End2End ASR

Shanghai Jiao Tong University, Shanghai

luyizhou4's Stars

google-research/bert
TensorFlow code and pre-trained models for BERT
Language: Python 38.4k 1k 1.1k 9.6k
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Language: Python 21.2k 212 3942.2k
Hannibal046/Awesome-LLM
Awesome-LLM: a curated list of Large Language Model
19.6k 382 271.6k
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
13.2k 257 128838
AIGC-Audio/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Language: Python 10.1k 136 51868
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Language: Python 7.5k 69 1.3k 798
mli/autocut
Edit videos with a text editor
Language: Python 6.8k 50 82692
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
Language: C++ 5.9k 63 625896
pytorch-labs/gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
Language: Python 5.7k 60 106521
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
Language: Python 5.3k 39 41514
archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
1.9k 170 470
ChenyangQiQi/FateZero
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
Language: Jupyter Notebook 1.1k 14 35107
declare-lab/tango
A family of diffusion models for text-to-audio generation.
Language: Python 1.1k 28 4993
HillZhang1999/llm-hallucination-survey
Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
959 12 352
Jonathan-LeRoux/IguanaTex
A PowerPoint add-in to insert LaTeX equations into PowerPoint presentations on Windows and Mac
Language: VBA 924 14 7664
mit-han-lab/distrifuser
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Language: Python 620 9 2424
X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
Language: Python 620 22 4856
abacusai/Long-Context
This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and benchmark tasks that evaluate a model’s information retrieval capabilities with context expansion. We also include key experimental results and instructions for reproducing and building on them.
Language: Python 583 13 637
huawei-noah/Speech-Backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Language: Jupyter Notebook 566 23 31122
facebookresearch/AudioMAE
This repo hosts the code and models of "Masked Autoencoders that Listen".
Language: Python 556 32 2947
liusongxiang/Large-Audio-Models
Keep track of big models in audio domain, including speech, singing, music etc.
463 46 228
mit-han-lab/tiny-training
On-Device Training Under 256KB Memory [NeurIPS'22]
Language: Python 446 18 860
X-PLUG/Youku-mPLUG
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
Language: Python 288 6 3211
princeton-nlp/CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
Language: Python 192 9 5231
WoosukKwon/retraining-free-pruning
[NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers
Language: Python 176 6 1927
KeSpeech/KeSpeech
The repo provides information about KeSpeech dataset.
129 5 147
microsoft/ReinMax
Beyond Straight-Through
Language: Python 91 4 14
mit-han-lab/patch_conv
Patch convolution to avoid large GPU memory usage of Conv2D
Language: Python 80 8 15
XiaoMi/dasheng
Official PyTorch code for Deep Audio-Signal Holistic Embeddings
Language: Python 61 6 47
X-LANCE/public_talks
Materials of public talks given by SJTU X-LANCE members
14 5 00
