Pinned Repositories
alpa
Training and serving large-scale neural networks with auto parallelization.
MInference
[NeurIPS'24 Spotlight] Speeds up long-context LLM inference by computing attention with approximate, dynamic sparsity, reducing pre-filling latency by up to 10x on an A100 while maintaining accuracy.
gateway
A blazing-fast AI gateway with integrated guardrails. Route to 200+ LLMs and 50+ AI guardrails through one fast, friendly API.
stylellm_models
StyleLLM writing-style models: text style transfer based on large language models. #text-embellishment #polishing #style-imitation
alpa
Auto parallelization for large-scale neural networks
fastT5
⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
MedicalGPT
MedicalGPT: Training Your Own Medical GPT Model with a ChatGPT-style Training Pipeline. Trains medical LLMs, implementing continued pretraining, supervised fine-tuning, RLHF (reward modeling and reinforcement-learning training), and DPO (direct preference optimization).
wgimperial's Repositories
wgimperial/alpa
Auto parallelization for large-scale neural networks
wgimperial/fastT5
⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
wgimperial/MedicalGPT
MedicalGPT: Training Your Own Medical GPT Model with a ChatGPT-style Training Pipeline. Trains medical LLMs, implementing continued pretraining, supervised fine-tuning, RLHF (reward modeling and reinforcement-learning training), and DPO (direct preference optimization).