rolsheng's Stars
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
meta-llama/llama
Inference code for Llama models
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
meta-llama/llama3
The official Meta Llama 3 GitHub site
KaTeX/KaTeX
Fast math typesetting for the web.
xiaolincoder/CS-Base
Illustrated guides to computer networking, operating systems, computer organization, and databases: 1,000+ diagrams and 500,000+ words that demystify obscure computer science fundamentals, so no interview question stays hard to understand! 🚀 Read online: https://xiaolincoding.com
naklecha/llama3-from-scratch
llama3 implementation, one matrix multiplication at a time
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
liguodongiot/llm-action
This project shares the technical principles behind large language models along with hands-on experience (LLM engineering and deploying LLM applications in production).
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
google/sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
NVIDIA/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
InternLM/MindSearch
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
jquesnelle/yarn
YaRN: Efficient Context Window Extension of Large Language Models
zonechen1994/CV_Interview
I hope this repo can help you a lot!
alibaba/Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
Wang-ML-Lab/llm-continual-learning-survey
Continual Learning of Large Language Models: A Comprehensive Survey
arcee-ai/PruneMe
Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models
astramind-ai/Mixture-of-depths
Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
SysCV/shift-dev
SHIFT Dataset DevKit - CVPR 2022
kyegomez/Mixture-of-Depths
Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
BeyonderXX/TRACE
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models
rolsheng/MM-VUFM4DS
A systematic survey of multi-modal and multi-task visual understanding foundation models for driving scenarios
sramshetty/mixture-of-depths
An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
Mixture-AI/Mixture-of-Depths
Unofficial implementation of Google DeepMind's Mixture-of-Depths.
rolsheng/OpenDet-D
OpenDet-D: Open World Object Detection via Cooperative Foundation Models for Driving Scenes