GHGmc2's Stars
binary-husky/gpt_academic
Provides a practical interactive interface for LLMs such as GPT/GLM, with special optimization for paper reading, polishing, and writing. Modular design with support for custom shortcut buttons & function plugins, project analysis & self-translation for Python, C++, and other codebases, PDF/LaTeX paper translation & summarization, and parallel queries to multiple LLMs, including local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and more.
karpathy/LLM101n
LLM101n: Let's build a Storyteller
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
fchollet/ARC-AGI
The Abstraction and Reasoning Corpus
kvcache-ai/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
dabochen/spreadsheet-is-all-you-need
A nanoGPT pipeline packed in a spreadsheet
fengbintu/Neural-Networks-on-Silicon
This was originally a collection of papers on neural network accelerators. It has since become more of my selection of research on deep learning and computer architecture.
cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
gkamradt/LLMTest_NeedleInAHaystack
Doing simple retrieval from LLMs at various context lengths to measure accuracy
laekov/fastmoe
A fast MoE implementation for PyTorch
HuangOwen/Awesome-LLM-Compression
Awesome LLM compression research papers and tools.
datamllab/LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
hemingkx/SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
minyoungg/platonic-rep
swc-17/SparseDrive
SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation
feifeibear/long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long-Context Transformer Training and Inference
Tim-Salzmann/l4casadi
Use PyTorch models with CasADi for data-driven optimization or learning-based optimal control. Supports Acados.
FlagOpen/FlagGems
FlagGems is an operator library for large language models implemented in the Triton language.
HKUNLP/ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
HorizonRobotics/Sparse4D
imbue-ai/cluster-health
HuaiyuanXu/3D-Occupancy-Perception
[Information Fusion 2024] A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective
google/aqt
bytedance/flux
A fast communication-overlapping library for tensor parallelism on GPUs.
Mellanox/nccl-rdma-sharp-plugins
RDMA and SHARP plugins for the NCCL library
google-deepmind/language_modeling_is_compression
fanshiqing/grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
madsys-dev/deepseekv2-profile
feifeibear/Odysseus-Transformer
Odysseus: Playground of LLM Sequence Parallelism
SC-SGS/hardware_sampling
The Hardware Sampling (hws) library can be used to track hardware metrics such as clock frequency, memory usage, temperature, and power draw.