Zth9730

University of Science and Technology Beijing

Computer of Science and Technology Beijing

Zth9730's Stars

jasonppy/PromptingWhisper
Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
Language:Python13711
DmitryRyumin/INTERSPEECH-2023-24-Papers
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
65142
facebookresearch/fairseq2
FAIR Sequence Modeling Toolkit 2
Language:Python75990
google-research/perch
Language:Python19543
ivy-llc/ivy
Convert Machine Learning Code Between Frameworks
Language:Python14k5.7k
microsoft/CLAP
Learning audio concepts from natural language supervision
Language:Python51539
bojone/rerope
Rectified Rotary Position Embeddings
Language:Python34730
YuanGongND/whisper-at
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
Language:Python34628
MontaEllis/Pytorch-Medical-Segmentation
This repository is an unofficial PyTorch implementation of medical segmentation in 2D and 3D.
Language:Python872195
MLNLP-World/MyArxiv
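A personalized, customizable arXiv template for effectively following relevant content, authors, and academic conferences in specific fields.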
Language:CSS27024
deskflow/deskflow
Deskflow lets you share one mouse and keyboard between multiple computers on Windows, macOS and Linux. It's like a software KVM (but without video).
Language:C++15.1k3.9k
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Language:Python4.3k1.1k
microsoft/torchscale
Foundation Architecture for (M)LLMs
Language:Python3k211
Jamie-Stirling/RetNet
An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
Language:Python1.2k101
InternLM/InternLM
Official release of InternLM2.5 base and chat models. 1M context support
Language:Python6.6k466
OFA-Sys/OFA
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Language:Python2.4k248
alinlab/ifseg
IFSeg: Image-free Semantic Segmentation via Vision-Language Model (CVPR 2023)
Language:Python849
Long-Kai/ADV_CE
Source code for paper "Improving Task-Specific Generalization in Few-Shot Learning via Adaptive Vicinal Risk Minimization"
Language:Python4
bigscience-workshop/bigscience
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
Language:Shell982101
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Language:Python36.2k4.2k
yeyupiaoling/Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
Language:C931152
google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
27.8k2.3k
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Language:Python20.6k2.6k
fabawi/ImageBind-LoRA
Fine-tuning "ImageBind One Embedding Space to Bind Them All" with LoRA
Language:Python17616
chenkui164/FastASR
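This project implements ASR inference in C++. It has few dependencies, is simple to install, and runs inference fast; it also runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the training data is the open-source wenetspeech dataset (10,000+ hours) or Alibaba's private dataset (60,000+ hours), so recognition quality is good and comparable to many commercial ASR products.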
Language:C49277
Mddct/WeUSM
Language:Python131
milely/SRN.Pytorch
Unofficial implementation of Towards Accurate Scene Text Recognition with Semantic Reasoning Networks
Language:Python285
tomekkorbak/pretraining-with-human-feedback
Code accompanying the paper Pretraining Language Models with Human Preferences
Language:Python17914
jiaaro/pydub
Manipulate audio with a simple and easy high level interface
Language:Python9.1k1.1k
deezer/spleeter
Deezer source separation library including pretrained models.
Language:Python26.2k2.9k
