Adaxry
Fast learner, eager for new knowledge and deeper understanding
WeChat AI, Tencent Inc., Beijing, China
Adaxry's Stars
ollama/ollama
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
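For context, a minimal sketch of talking to a locally running Ollama server from Python, assuming the separate `ollama` Python client package is installed and the model has already been pulled:

```python
# Minimal sketch: chat with a local Ollama server via the community
# Python client (pip install ollama). Assumes `ollama pull llama3.2`
# has been run and the server is listening on its default port.
import ollama

response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```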
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
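As a rough illustration of how the library is typically wired in, a toy ZeRO stage-2 setup (the model, batch size, and learning rate are placeholders; real jobs are usually started with the `deepspeed` launcher):

```python
# Toy DeepSpeed setup: wrap a plain torch module with a ZeRO stage-2 engine.
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for a real model
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 2},
}
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(8, 1024).to(engine.device)
loss = engine(x).pow(2).mean()  # dummy loss for illustration
engine.backward(loss)           # engine handles loss scaling/partitioning
engine.step()
```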
LC044/WeChatMsg
Extract WeChat chat history and export it to HTML, Word, or Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant.
immersive-translate/immersive-translate
Immersive bilingual web page translation extension; supports input-box translation, mouse-hover translation, and translation of PDF, EPUB, subtitle, and TXT files (Immersive Dual Web Page Translation Extension)
state-spaces/mamba
Mamba SSM architecture
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
google/sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
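A small end-to-end sketch of the usual train-then-encode flow (the corpus path and vocabulary size are illustrative):

```python
import sentencepiece as spm

# Train a small unigram model on a plain-text corpus (one sentence per line);
# "corpus.txt" and vocab_size are placeholders.
spm.SentencePieceTrainer.train(
    input="corpus.txt", model_prefix="toy", vocab_size=8000, model_type="unigram"
)

sp = spm.SentencePieceProcessor(model_file="toy.model")
print(sp.encode("This is a test.", out_type=str))  # subword pieces
ids = sp.encode("This is a test.", out_type=int)   # integer ids
print(sp.decode(ids))                              # round-trips to the input
```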
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
nlpxucan/WizardLM
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
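A hedged sketch of what that Python API looks like in recent releases (the high-level `LLM` entry point mirrors vLLM's; the checkpoint name is only an example, and a supported NVIDIA GPU is required):

```python
# Sketch of TensorRT-LLM's high-level Python API as of recent releases;
# the engine is built/loaded on first use.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
params = SamplingParams(max_tokens=64, temperature=0.8)
for output in llm.generate(["Explain KV caching in one sentence."], params):
    print(output.outputs[0].text)
```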
jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
OpenBMB/MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
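The core idea is a KV-cache policy that keeps a few initial "attention sink" tokens plus a sliding window of recent tokens; a conceptual sketch (sizes are illustrative, not the paper's code):

```python
def streaming_keep_indices(seq_len: int, num_sinks: int = 4, window: int = 1020):
    """Conceptual sketch of the attention-sink cache policy: retain the
    first num_sinks positions plus the most recent `window` positions,
    evicting everything in between."""
    if seq_len <= num_sinks + window:
        return list(range(seq_len))
    return list(range(num_sinks)) + list(range(seq_len - window, seq_len))

print(streaming_keep_indices(2048))  # positions 0..3 plus 1028..2047
```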
togethercomputer/RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
albertan017/LLM4Decompile
Reverse Engineering: Decompiling Binary Code with Large Language Models
LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
List of Dirty, Naughty, Obscene, and Otherwise Bad Words
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
sahil280114/codealpaca
SkyworkAI/Skywork
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model weights, training data, evaluation data, and evaluation methods.
OpenBMB/InfiniteBench
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
hemingkx/Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
huggingface/optimum-habana
Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)
dilab-zju/self-speculative-decoding
Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding"
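A toy illustration of the draft-and-verify loop the title describes: a cheap draft pass (in the paper, the same model with layers skipped) proposes a few tokens, and the full model accepts the longest agreeing prefix, so the output matches full greedy decoding. Both "models" below are stand-in functions, not the repository's code:

```python
def full_model(prefix):   # stand-in for the full LLM's greedy next token
    return (sum(prefix) + len(prefix)) % 50

def draft_model(prefix):  # stand-in for the cheap layer-skipped draft pass
    return (sum(prefix) + len(prefix)) % 50 if len(prefix) % 3 else 0

def generate(prompt, steps=8, k=4):
    seq = list(prompt)
    while steps > 0:
        # 1) draft k tokens autoregressively with the cheap model
        draft = []
        for _ in range(k):
            draft.append(draft_model(seq + draft))
        # 2) verify: accept the longest prefix the full model agrees with
        accepted = 0
        for i in range(k):
            if full_model(seq + draft[:i]) == draft[i]:
                accepted += 1
            else:
                break
        if accepted < k:
            # on mismatch, take the full model's token, so output is lossless
            seq += draft[:accepted] + [full_model(seq + draft[:accepted])]
            steps -= accepted + 1
        else:
            seq += draft
            steps -= k
    return seq

print(generate([1, 2, 3]))
```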
artwalker/EasyTranslator
Your Companion for Multilingual Reading
DAMO-NLP-MT/PolyLM
leogao2/lm_dataformat
raymin0223/fast_robust_early_exit
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)
Adaxry/Unified_Layer_Skipping
pppa2019/Mango
Code for "Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective"