GHGmc2's Stars
HuaiyuanXu/3D-Occupancy-Perception
[Information Fusion 2024] A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective
madsys-dev/deepseekv2-profile
kvcache-ai/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
fchollet/ARC-AGI
The Abstraction and Reasoning Corpus
minyoungg/platonic-rep
swc-17/SparseDrive
SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation
google/aqt
hemingkx/SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
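As a quick orientation for the papers in that list, the core idea of speculative decoding can be sketched in a few lines: a small draft model proposes several tokens and the large target model only keeps the prefix it agrees with. The sketch below uses greedy verification and hypothetical `draft_next`/`target_next` callables; real implementations verify all drafted tokens in one target forward pass and use a probabilistic accept/reject rule.

```python
def speculative_decode(prompt, draft_next, target_next, k=4, max_new=32):
    """Greedy speculative decoding sketch: the draft model proposes k tokens,
    the target model keeps the longest prefix it agrees with.
    (A real implementation scores all k positions in a single target forward pass.)"""
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new:
        draft = []
        for _ in range(k):                      # 1. cheap draft proposals
            draft.append(draft_next(tokens + draft))
        for i in range(k):                      # 2. verify with the target model
            expected = target_next(tokens + draft[:i])
            if draft[i] != expected:
                # first disagreement: keep the agreed prefix plus the target's token
                tokens.extend(draft[:i] + [expected])
                break
        else:
            tokens.extend(draft)                # all k proposals accepted
    return tokens
```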
Tim-Salzmann/l4casadi
Use PyTorch Models with CasADi for data-driven optimization or learning-based optimal control. Supports Acados.
fengbintu/Neural-Networks-on-Silicon
This was originally a collection of papers on neural network accelerators; now it is more a selection of research on deep learning and computer architecture.
google-deepmind/language_modeling_is_compression
binary-husky/gpt_academic
Provides a practical interactive interface for LLMs such as GPT and GLM, with special optimizations for reading, polishing, and writing academic papers. Modular design with custom quick-access buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFLYTEK Spark, ERNIE Bot (Wenxin Yiyan), llama2, rwkv, claude2, moss, and more.
feifeibear/Odysseus-Transformer
Odysseus: Playground of LLM Sequence Parallelism
dabochen/spreadsheet-is-all-you-need
A nanoGPT pipeline packed in a spreadsheet
bytedance/flux
A fast communication-overlapping library for tensor parallelism on GPUs.
HKUNLP/ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
gkamradt/LLMTest_NeedleInAHaystack
Doing simple retrieval from LLMs at various context lengths to measure accuracy
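The benchmark's core recipe is simple enough to sketch: bury one "needle" fact at a chosen depth in filler text and ask the model to retrieve it. The snippet below is a hedged illustration with a hypothetical `ask_model` callable, not the repository's actual harness.

```python
def needle_test(ask_model, context_words=2000, depth=0.5):
    """Insert a needle fact at `depth` (0.0 = start, 1.0 = end) of a filler
    context and check whether the model retrieves it."""
    needle = "The secret passphrase is 'blue-giraffe-42'"
    filler = ("The quick brown fox jumps over the lazy dog. "
              * (context_words // 9)).split(". ")
    insert_at = int(len(filler) * depth)
    haystack = ". ".join(filler[:insert_at] + [needle] + filler[insert_at:])
    prompt = f"{haystack}\n\nWhat is the secret passphrase?"
    return "blue-giraffe-42" in ask_model(prompt)
```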
HuangOwen/Awesome-LLM-Compression
Awesome LLM compression research papers and tools.
FlagOpen/FlagGems
FlagGems is an operator library for large language models implemented in Triton Language.
laekov/fastmoe
A fast MoE implementation for PyTorch
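For context, the layer type fastmoe accelerates can be sketched conceptually: a gate scores experts per token, and each token's output is the probability-weighted sum of its top-k experts. The module below is a plain PyTorch illustration of that routing, not the fastmoe API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Conceptual top-k mixture-of-experts layer (illustration only)."""
    def __init__(self, d_model=64, n_experts=4, k=2):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(n_experts)])
        self.k = k

    def forward(self, x):                        # x: (tokens, d_model)
        scores = self.gate(x)                    # (tokens, n_experts)
        topv, topi = scores.topk(self.k, dim=-1)
        probs = F.softmax(topv, dim=-1)          # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            tok, slot = (topi == e).nonzero(as_tuple=True)   # tokens routed to expert e
            if tok.numel():
                out[tok] += probs[tok, slot].unsqueeze(-1) * expert(x[tok])
        return out
```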
datamllab/LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
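The paper's title hints at the mechanism: nearby tokens keep their exact relative positions, while distant tokens reuse grouped (floor-divided) positions so they stay within the pretrained position range. The snippet below is a rough sketch of that position mapping based on my reading of the idea, not the repository's code.

```python
import torch

def self_extend_rel_pos(seq_len, neighbor_window=64, group_size=8):
    """Sketch of the Self-Extend position remapping (assumed, not the repo's code):
    exact relative positions inside the neighbor window, grouped positions beyond it,
    shifted so the two regions meet at the window boundary."""
    pos = torch.arange(seq_len)
    rel = pos[:, None] - pos[None, :]                      # standard relative positions
    shift = neighbor_window - neighbor_window // group_size
    grouped = rel // group_size + shift                    # compressed distant positions
    return torch.where(rel <= neighbor_window, rel, grouped)
```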
fanshiqing/grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
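A grouped GEMM fuses many independent matrix multiplies with different row counts (e.g. per-expert token counts in an MoE layer) into one kernel launch. The loop below only states the reference semantics such a kernel computes; it is not the binding's actual API.

```python
import torch

def grouped_gemm_reference(a_list, b_list):
    """Reference semantics of a grouped GEMM: a batch of independent matmuls,
    each with its own M dimension, that a fused CUTLASS kernel would execute at once."""
    return [a @ b for a, b in zip(a_list, b_list)]

# e.g. per-expert activations with different token counts but a shared hidden size
a_list = [torch.randn(m, 128) for m in (5, 17, 3)]
b_list = [torch.randn(128, 256) for _ in a_list]
outs = grouped_gemm_reference(a_list, b_list)   # shapes (5,256), (17,256), (3,256)
```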
naklecha/llama3-from-scratch
A llama3 implementation, one matrix multiplication at a time
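In the same spirit, a single attention head really is just a handful of matrix multiplies; the generic sketch below (not the repository's exact code) writes them out explicitly.

```python
import torch

def attention_one_head(x, wq, wk, wv):
    """Single-head causal self-attention as explicit matrix multiplies."""
    q, k, v = x @ wq, x @ wk, x @ wv                     # project tokens to q/k/v
    scores = (q @ k.T) / (k.shape[-1] ** 0.5)            # scaled dot products
    mask = torch.triu(torch.ones_like(scores), diagonal=1).bool()
    scores = scores.masked_fill(mask, float("-inf"))     # causal mask
    return torch.softmax(scores, dim=-1) @ v             # weighted sum of values

x = torch.randn(6, 32)                                   # 6 tokens, d_model = 32
wq, wk, wv = (torch.randn(32, 32) for _ in range(3))
out = attention_one_head(x, wq, wk, wv)                  # (6, 32)
```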
feifeibear/long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
Mellanox/nccl-rdma-sharp-plugins
RDMA and SHARP plugins for the NCCL library
HazyResearch/ThunderKittens
Tile primitives for speedy kernels
GigaAI-research/General-World-Models-Survey
adam-maj/tiny-gpu
A minimal GPU design in Verilog to learn how GPUs work from the ground up
pytorch/torchtitan
A native PyTorch Library for large model training