baixf-xyz's Stars
public-apis/public-apis
A collective list of free APIs
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
waydabber/BetterDisplay
Unlock your displays on your Mac! Flexible HiDPI scaling, XDR/HDR extra brightness, virtual screens, DDC control, extra dimming, PIP/streaming, EDID override and lots more!
exo-explore/exo
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
Zeyi-Lin/HivisionIDPhotos
⚡️HivisionIDPhotos: a lightweight and efficient AI ID photo tool and generation algorithm.
SubtitleEdit/subtitleedit
the subtitle editor :)
HerbertHe/iptv-sources
Auto-updated IPTV sources
RubyMetric/chsrc
chsrc: a cross-platform, general-purpose source-switching tool and framework. Change Source everywhere for every software
alienator88/Pearcleaner
A free, source-available and fair-code licensed mac app cleaner
facebookresearch/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
gpt-omni/mini-omni
open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
BadToBest/EchoMimic
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
lucidrains/soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
jxzhangjhu/Awesome-LLM-RAG
Awesome-LLM-RAG: a curated list of advanced retrieval augmented generation (RAG) in Large Language Models
jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
maoserr/epublifier
Converts some webnovels to epub format
ZhangXInFD/SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on
checkToke/yangtai
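Qinglong Panel, join in: automatic check-in for Banu Hotpot (巴奴火锅), FAW-Volkswagen, Haidilao, Watsons, CHAGEE (霸王茶姬), Huazhu Club (华住会), Dossen Club (东呈会), Wanda Hotels, and Yihetang (益禾堂)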
YuanGongND/whisper-at
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
DangJin/awesome-readme-generator-tools
A collection of tools for quickly creating polished readme.md files
yangdongchao/SoundStorm
The reproduced code for Google's SoundStorm
jishengpeng/Languagecodec
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models
xingchensong/S3Tokenizer
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
qq332374857/BlueSkyClouds-My-Actions
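iQIYI member check-in and prize draw, Tencent Video member check-in, Bilibili check-in, **Telecom check-in, V2ex check-in, PicACG (哔咔漫画) check-in, Baidu Tieba automatic check-in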
mct10/RepCodec
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
0nutation/USLM
Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)
CLUEbenchmark/SuperCLUE-Safety
SC-Safety: a multi-round adversarial safety benchmark for Chinese large language models
ZhangXInFD/soundstorm-speechtokenizer
Implementation of SoundStorm built upon SpeechTokenizer.
JasonJarvan/interview-helper
An open-source AI interview assistant that uses the OpenAI Whisper model for STT (speech-to-text) transcription, then passes the questions to ChatGPT to answer.