luyizhou4's Stars
google-research/bert
TensorFlow code and pre-trained models for BERT
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Hannibal046/Awesome-LLM
Awesome-LLM: a curated list of Large Language Models
BradyFU/Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
AIGC-Audio/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
mli/autocut
Edit videos with a text editor
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
pytorch-labs/gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
ChenyangQiQi/FateZero
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
declare-lab/tango
A family of diffusion models for text-to-audio generation.
HillZhang1999/llm-hallucination-survey
Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
Jonathan-LeRoux/IguanaTex
A PowerPoint add-in to insert LaTeX equations into PowerPoint presentations on Windows and Mac
mit-han-lab/distrifuser
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Models
abacusai/Long-Context
This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and benchmark tasks that evaluate a model’s information retrieval capabilities with context expansion. We also include key experimental results and instructions for reproducing and building on them.
huawei-noah/Speech-Backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
facebookresearch/AudioMAE
This repo hosts the code and models of "Masked Autoencoders that Listen".
liusongxiang/Large-Audio-Models
Keep track of large models in the audio domain, including speech, singing, music, etc.
mit-han-lab/tiny-training
On-Device Training Under 256KB Memory [NeurIPS'22]
X-PLUG/Youku-mPLUG
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
princeton-nlp/CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
WoosukKwon/retraining-free-pruning
[NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers
KeSpeech/KeSpeech
The repo provides information about the KeSpeech dataset.
microsoft/ReinMax
Beyond Straight-Through
mit-han-lab/patch_conv
Patch convolution to avoid large GPU memory usage of Conv2D
XiaoMi/dasheng
Official PyTorch code for Deep Audio-Signal Holistic Embeddings
X-LANCE/public_talks
Materials from public talks given by SJTU X-LANCE members