Adaxry
Fast learner, eager for new knowledge and deeper understanding
WeChat AI, Tencent Inc., Beijing, China
Adaxry's Stars
ollama/ollama
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
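For context, a minimal sketch of talking to a locally running Ollama server from Python, assuming the separate `ollama` Python client package is installed and the model has already been pulled:

```python
# Minimal sketch: chat with a local Ollama server via the community
# Python client (pip install ollama). Assumes `ollama pull llama3.2`
# has been run and the server is listening on its default port.
import ollama

response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```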
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
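As a rough illustration of how the library is typically wired in, a toy ZeRO stage-2 setup (the model, batch size, and learning rate are placeholders; real jobs are usually started with the `deepspeed` launcher):

```python
# Toy DeepSpeed setup: wrap a plain torch module with a ZeRO stage-2 engine.
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for a real model
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 2},
}
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(8, 1024).to(engine.device)
loss = engine(x).pow(2).mean()  # dummy loss for illustration
engine.backward(loss)           # engine handles loss scaling/partitioning
engine.step()
```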
LC044/WeChatMsg
Extract WeChat chat history and export it to HTML, Word, or Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant.
immersive-translate/immersive-translate
Immersive bilingual web page translation extension; supports input-box translation, mouse-hover translation, and translation of PDF, EPUB, subtitle, and TXT files (Immersive Dual Web Page Translation Extension)
state-spaces/mamba
Mamba SSM architecture
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
google/sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
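A small end-to-end sketch of the usual train-then-encode flow (the corpus path and vocabulary size are illustrative):

```python
import sentencepiece as spm

# Train a small unigram model on a plain-text corpus (one sentence per line);
# "corpus.txt" and vocab_size are placeholders.
spm.SentencePieceTrainer.train(
    input="corpus.txt", model_prefix="toy", vocab_size=8000, model_type="unigram"
)

sp = spm.SentencePieceProcessor(model_file="toy.model")
print(sp.encode("This is a test.", out_type=str))  # subword pieces
ids = sp.encode("This is a test.", out_type=int)   # integer ids
print(sp.decode(ids))                              # round-trips to the input
```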
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
nlpxucan/WizardLM
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
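A hedged sketch of what that Python API looks like in recent releases (the high-level `LLM` entry point mirrors vLLM's; the checkpoint name is only an example, and a supported NVIDIA GPU is required):

```python
# Sketch of TensorRT-LLM's high-level Python API as of recent releases;
# the engine is built/loaded on first use.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
params = SamplingParams(max_tokens=64, temperature=0.8)
for output in llm.generate(["Explain KV caching in one sentence."], params):
    print(output.outputs[0].text)
```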
jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
OpenBMB/MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
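The core idea is a KV-cache policy that keeps a few initial "attention sink" tokens plus a sliding window of recent tokens; a conceptual sketch (sizes are illustrative, not the paper's code):

```python
def streaming_keep_indices(seq_len: int, num_sinks: int = 4, window: int = 1020):
    """Conceptual sketch of the attention-sink cache policy: retain the
    first num_sinks positions plus the most recent `window` positions,
    evicting everything in between."""
    if seq_len <= num_sinks + window:
        return list(range(seq_len))
    return list(range(num_sinks)) + list(range(seq_len - window, seq_len))

print(streaming_keep_indices(2048))  # positions 0..3 plus 1028..2047
```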
togethercomputer/RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
albertan017/LLM4Decompile
Reverse Engineering: Decompiling Binary Code with Large Language Models
LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
List of Dirty, Naughty, Obscene, and Otherwise Bad Words
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
sahil280114/codealpaca
SkyworkAI/Skywork
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model weights, training data, evaluation data, and evaluation methods.
OpenBMB/InfiniteBench
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
hemingkx/Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
huggingface/optimum-habana
Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)
dilab-zju/self-speculative-decoding
Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding"
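A toy illustration of the draft-and-verify loop the title describes: a cheap draft pass (in the paper, the same model with layers skipped) proposes a few tokens, and the full model accepts the longest agreeing prefix, so the output matches full greedy decoding. Both "models" below are stand-in functions, not the repository's code:

```python
def full_model(prefix):   # stand-in for the full LLM's greedy next token
    return (sum(prefix) + len(prefix)) % 50

def draft_model(prefix):  # stand-in for the cheap layer-skipped draft pass
    return (sum(prefix) + len(prefix)) % 50 if len(prefix) % 3 else 0

def generate(prompt, steps=8, k=4):
    seq = list(prompt)
    while steps > 0:
        # 1) draft k tokens autoregressively with the cheap model
        draft = []
        for _ in range(k):
            draft.append(draft_model(seq + draft))
        # 2) verify: accept the longest prefix the full model agrees with
        accepted = 0
        for i in range(k):
            if full_model(seq + draft[:i]) == draft[i]:
                accepted += 1
            else:
                break
        if accepted < k:
            # on mismatch, take the full model's token, so output is lossless
            seq += draft[:accepted] + [full_model(seq + draft[:accepted])]
            steps -= accepted + 1
        else:
            seq += draft
            steps -= k
    return seq

print(generate([1, 2, 3]))
```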
artwalker/EasyTranslator
Your Companion for Multilingual Reading
DAMO-NLP-MT/PolyLM
leogao2/lm_dataformat
raymin0223/fast_robust_early_exit
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)
Adaxry/Unified_Layer_Skipping
pppa2019/Mango
Code for "Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective"