myaxxxxx

China

myaxxxxx's Stars

k2-fsa/libriheavy
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
Language:Python16910
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
Language: Python · 2.9k stars · 417 forks
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language: Python · 33.4k stars · 4.1k forks
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Language: Python · 6.7k stars · 1.2k forks
haoxiangsnr/llm-tse
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)
Language:JavaScript321
myaxxxxx/onebit-st
Language:Python1
myaxxxxx/transfer-st
Language:Python1
zhayujie/chatgpt-on-wechat
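A chatbot built on large language models. It supports access through WeChat Official Accounts, WeCom apps, Feishu, DingTalk, and more, with a choice of GPT3.5/GPT-4o/GPT-o1/Claude/Wenxin Yiyan/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI. It can process text, voice, and images, access the operating system and the internet, and supports building a customized enterprise customer-service assistant on top of your own knowledge base.
Language: Python · 29.9k stars · 7.9k forks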
idootop/mi-gpt
🏠 Connect the Xiaoai speaker to ChatGPT and Doubao and turn it into your own personal voice assistant.
Language: TypeScript · 7.1k stars · 658 forks
songquanpeng/one-api
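OpenAI API management and distribution system. It supports Azure, Anthropic Claude, Google PaLM 2 & Gemini, Zhipu ChatGLM, Baidu Wenxin Yiyan, iFlytek Spark, Alibaba Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan, and can be used to manage and redistribute keys. It ships as a single executable with a prebuilt Docker image for one-click deployment, works out of the box, uses a single API for all LLMs, and features an English UI.
Language: JavaScript · 17.9k stars · 4k forks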
IcarusRyy/NewJob
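See at a glance when a job posting was last modified: green means within 2 weeks, dark orange within 1.5 months, and red more than 1.5 months ago.
Language: JavaScript · 1k stars · 53 forks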
huggingface/lerobot
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Language: Python · 6.3k stars · 548 forks
BigBrotherTrade/trader
Trading module
Language: Python · 3.5k stars · 790 forks
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language: Python · 26.1k stars · 2.9k forks
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
Language: TypeScript · 45.3k stars · 6.4k forks
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open source community will contribute to this project.
Language: Python · 11.2k stars · 1k forks
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language: Python · 21.6k stars · 2.1k forks
doocs/source-code-hunter
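😱 Analyzes, at the source-code level, the underlying implementation principles of mainstream technologies in the internet industry, to help developers deepen their technical expertise. Currently covers the Spring family, the MyBatis, Netty, and Dubbo frameworks, and middleware such as Redis and Tomcat.
Language: Java · 22k stars · 4.1k forks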
myaxxxxx/pruning
Language:Python2
myaxxxxx/Hyperbolic-Prompt-Learning
Language:Python21
xai-org/grok-1
Grok open release
Language: Python · 49.4k stars · 8.3k forks
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python · 19.3k stars · 2.1k forks
DmitryRyumin/INTERSPEECH-2023-24-Papers
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
62942
modelscope/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis and music generation.
Language:Python34430
krahets/hello-algo
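Hello Algo: a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; the English version is ongoing.
Language: Java · 95.1k stars · 12.1k forks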
facebookresearch/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
Language: Jupyter Notebook · 10.8k stars · 1k forks
ictnlp/DASpeech
Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".
Language:Python555
facebookresearch/ToMe
A method to increase the speed and lower the memory footprint of existing vision transformers.
Language:Python93467
xuchennlp/S2T
The project for speech translation
Language:Python112
Rongjiehuang/awesome-speech-to-speech-translation
List of direct speech-to-speech translation papers.
282