lorashen's Stars
xai-org/grok-1
Grok open release
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Tencent/secguide
A code security guide compiled for developers
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
huggingface/text-generation-inference
Large Language Model Text Generation Inference
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
togethercomputer/RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
ztxz16/fastllm
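A pure C++ cross-platform LLM acceleration library with Python bindings; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models; runs smoothly on mobile devices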
Zjh-819/LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
mengjian-github/copilot-analysis
allenai/natural-instructions
Expanding natural instructions
alphacep/vosk-server
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
NVIDIA/nccl-tests
NCCL Tests
jcpeterson/openwebtext
Open clone of OpenAI's unreleased WebText dataset scraper. This version uses pushshift.io files instead of the API for speed.
volcengine/veScale
A PyTorch Native LLM Training Framework
espressif/esp-skainet
Espressif intelligent voice assistant
wenet-e2e/wekws
Production First and Production Ready End-to-End Keyword Spotting Toolkit
ranchlai/mandarin-tts
Chinese Mandarin text-to-speech (TTS) based on FastSpeech 2, implemented in PyTorch, using WaveGlow as the vocoder, with the Biaobei and AISHELL-3 datasets
mattilyra/LSH
Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents
FudanNLPLAB/CBook-150K
MD5 links for a Chinese book corpus
ZhuiyiTechnology/TableQA
NL2SQL competition dataset
zhenlohuang/awesome-chinese-llm
Awesome Chinese LLM: a curated list of datasets and model resources for Chinese large language models
glee4810/EHRSQL
[NeurIPS'22] EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records
pkucoli/PKU-Paraphrase-Bank
quarkslab/aosp_dataset
Large Commit Precise Vulnerability Dataset based on AOSP CVE
jitsi/jitsi-webrtc-vad-wrapper
A java wrapper around the WebRTC Voice Activity Detection library
chatc/TriageSQL
The dataset and source code for our paper: "Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Text-to-SQL"
jpqiang/Chinese-Idiom-Paraphrasing
X-LANCE/medical-dataset
[ACL 2023 Findings] CSS: A Large-scale Cross-schema Chinese Text-to-SQL Medical Dataset