JinhuaLiang

A Ph.D. student at the Centre for Digital Music (C4DM), Queen Mary University of London.

London

JinhuaLiang's Stars

Labbeti/conette-audio-captioning
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
Language:Python13
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
Language: Python · 38.7k · 4.3k
Aria-K-Alethia/BigCodec
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
Language:Python794
kyutai-labs/moshi
Language: Python · 6.4k · 489
anusfoil/LLaQo
LLaQo, a Large Language Query-based Coach in the domain of expressive performance
Language: Python · 6
npurson/fid-metrics
A toolkit for computing Fréchet Inception Distance (FID) & Fréchet Video Distance (FVD) metrics.
Language:Python115
haoheliu/fid-metrics
A toolkit for computing Fréchet Inception Distance (FID) & Fréchet Video Distance (FVD) metrics.
1
hadeel253/Music-Generation-with-WaveGAN
WaveGAN on GTZAN Music genre classification dataset
Language: Jupyter Notebook · 1
JinhuaLiang/bigvgan
Official PyTorch implementation of BigVGAN (ICLR 2023)
1
FoundationVision/OmniTokenizer
OmniTokenizer: one model and one weight for image-video joint tokenization.
Language:Python2456
researchmm/MM-Diffusion
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Language:Python39222
google-research/magvit
Official JAX implementation of MAGVIT: Masked Generative Video Transformer
Language:Python94742
phizaz/diffae
Official implementation of Diffusion Autoencoders
Language:Jupyter Notebook856127
ashleve/lightning-hydra-template
PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡
Language: Python · 4.2k · 650
ccfddl/ccf-deadlines
⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python Cli, Wechat Applet) / If you find it useful, please star this project, thanks~
Language: Vue · 6.1k · 432
declare-lab/tango
A family of diffusion models for text-to-audio generation.
Language: Python · 1k · 81
habla-liaa/encodecmae
Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'
Language:Python874
FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Language: Python · 1.3k · 52
ali-vilab/MimicBrush
Official implementations for paper: Zero-shot Image Editing with Reference Imitation
Language: Python · 1.1k · 77
IDEA-Research/DINO
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
Language: Python · 2.2k · 246
gle-bellier/flow-matching
Annotated Flow Matching paper
Language:Jupyter Notebook1134
AILab-CVC/CV-VAE
[NeurIPS 24] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
Language:Jupyter Notebook2207
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language: Python · 21.9k · 2.1k
anusfoil/PianoJudges
From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano
Language: Jupyter Notebook · 8
facebookresearch/audioseal
Localized watermarking for AI-generated speech audios, with SOTA on robustness and very fast detector
Language:Python43354
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python · 11.9k · 2.5k
bytedance/1d-tokenizer
This repo contains the code for our paper An Image is Worth 32 Tokens for Reconstruction and Generation
Language:Jupyter Notebook43716
python-poetry/poetry
Python packaging and dependency management made easy
Language: Python · 31.5k · 2.3k
r9y9/tacotron_pytorch
PyTorch implementation of Tacotron speech synthesis model.
Language:Jupyter Notebook30679
lucidrains/rotary-embedding-torch
Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch
Language:Python55045
