myaxxxxx's Stars
k2-fsa/libriheavy
Libriheavy: a 50,000-hour ASR corpus with punctuation, casing, and context
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
haoxiangsnr/llm-tse
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)
myaxxxxx/onebit-st
myaxxxxx/transfer-st
zhayujie/chatgpt-on-wechat
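A chatbot built on large language models that can be connected to WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more. It supports GPT3.5/GPT-4o/GPT-o1/Claude/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI, handles text, voice, and images, can access the operating system and the internet, and supports building a customized enterprise customer-service assistant on your own knowledge base.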
idootop/mi-gpt
🏠 Connect the XiaoAI speaker to ChatGPT and Doubao and turn it into your own personal voice assistant.
songquanpeng/one-api
OpenAI API key management & redistribution system that exposes a single API for all LLMs. Supports Azure, Anthropic Claude, Google PaLM 2 & Gemini, Zhipu ChatGLM, Baidu ERNIE Bot, iFlytek Spark, Alibaba Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. Ships as a single executable with a prebuilt Docker image for one-click deployment, works out of the box, and features an English UI.
IcarusRyy/NewJob
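See at a glance when a job posting was last modified: green means within 2 weeks, dark orange within 1.5 months, and red more than 1.5 months ago.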
huggingface/lerobot
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
BigBrotherTrade/trader
Trading module
meta-llama/llama3
The official Meta Llama 3 GitHub site
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
doocs/source-code-hunter
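😱 Dissects, at the source-code level, the underlying implementation of mainstream technologies in the internet industry, to help developers deepen their technical knowledge. Currently covers the Spring family, the MyBatis, Netty, and Dubbo frameworks, and middleware such as Redis and Tomcat.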
myaxxxxx/pruning
myaxxxxx/Hyperbolic-Prompt-Learning
xai-org/grok-1
Grok open release
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
DmitryRyumin/INTERSPEECH-2023-24-Papers
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
modelscope/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications such as text-to-speech synthesis and music generation.
krahets/hello-algo
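Hello Algo: a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; the English version is in progress.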
facebookresearch/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
ictnlp/DASpeech
Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".
facebookresearch/ToMe
A method to increase the speed and lower the memory footprint of existing vision transformers.
xuchennlp/S2T
The project for speech translation
Rongjiehuang/awesome-speech-to-speech-translation
List of direct speech-to-speech translation papers.