youngsheen

USTC

youngsheen's Stars

youngsheen/GPST
[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer
Language:Python372
souzatharsis/podcastfy
An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI
Language: Python, 1k stars, 112 forks
facebookresearch/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Language: Python, 4.2k stars, 216 forks
youngsheen/SimVQ
SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
Language:Python1104
facebookresearch/spiritlm
Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
Language:Python77247
DAMO-NLP-SG/DiGIT
[NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
Language:Python402
kyutai-labs/moshi
Language: Python, 6.7k stars, 522 forks
gpt-omni/mini-omni
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
Language: Python, 3.1k stars, 277 forks
homebrewltd/ichigo
Local realtime voice AI
Language: Python, 1.9k stars, 86 forks
triton-lang/triton
Development repository for the Triton language and compiler
Language: C++, 13.4k stars, 1.6k forks
pytorch/torchtitan
A native PyTorch Library for large model training
Language: Python, 2.6k stars, 205 forks
facebookresearch/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
Language: Python, 1.8k stars, 112 forks
Haoqiu-Yan/PerceptiveAgent
Code for "Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction" (ACL 2024)
Language:Python321
TencentARC/Open-MAGVIT2
Open-MAGVIT2: Democratizing Autoregressive Visual Generation
Language:Python69628
TinyLLaVA/TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
Language:Python64766
minyoungg/platonic-rep
Language:Python45929
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Language: Python, 12.4k stars, 844 forks
pytorch/torchtune
PyTorch native finetuning library
Language: Python, 4.3k stars, 431 forks
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python, 20.2k stars, 2.2k forks
facebookresearch/jepa
PyTorch code and models for V-JEPA self-supervised learning from video.
Language: Python, 2.7k stars, 255 forks
dropreg/efficient_alpaca
This repository aims to use LLaMA to reproduce and enhance Stanford Alpaca.
Language:Python969
google-deepmind/alphageometry
Language: Python, 4.2k stars, 466 forks
haoliuhl/language-quantized-autoencoders
Language Quantized AutoEncoders
Language:Python945
ytongbai/LVM
Language: Python, 1.8k stars, 54 forks
ml-explore/mlx
MLX: An array framework for Apple silicon
Language: C++, 17.2k stars, 996 forks
atong01/conditional-flow-matching
TorchCFM: a Conditional Flow Matching library
Language: Python, 1.2k stars, 98 forks
The-Run-Philosophy-Organization/run
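The globally designated official GitHub of "Run Philosophy" (润学): it compiles the movement's aims, program, theory, and assorted real-world examples of "running" (emigrating); addresses the three questions of why to run, where to run, and how to run; and aims to become the core religion and core belief of the new ** people.
31.7k stars, 2.6k forks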
kakaobrain/rq-vae-transformer
The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)
Language:Jupyter Notebook78887
tickstep/aliyunpan
Command-line client for Aliyun Drive (阿里云盘), with support for JavaScript plugins and sync/backup.
Language: Go, 4.2k stars, 354 forks
facebookresearch/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Language: Python, 3.5k stars, 304 forks
