MarshtompCS's Stars
OpenGVLab/OmniCorpus
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
FranxYao/Retrieval-Head-with-Flash-Attention
Efficient retrieval-head analysis with a Triton flash-attention kernel that supports top-k probability
ivnle/synth-icl
mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
jzhang38/EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
pmichel31415/are-16-heads-really-better-than-1
Code for the paper "Are Sixteen Heads Really Better than One?"
apple/corenet
CoreNet: A library for training deep neural networks
crabml/crabml
A fast cross-platform AI inference engine 🤖 using Rust 🦀 and WebGPU 🎮
ankurtaly/Integrated-Gradients
Attributing predictions made by the Inception network using the Integrated Gradients method
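Integrated Gradients attributes a model's prediction to its input features by averaging gradients along a straight-line path from a baseline to the input. A minimal NumPy sketch of the method (the toy quadratic model and function names are illustrative, not taken from the repo):

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=50):
    """Riemann-sum (midpoint rule) approximation of Integrated Gradients:
    IG_i(x) = (x_i - x'_i) * integral_0^1 dF/dx_i(x' + a(x - x')) da
    """
    alphas = (np.arange(steps) + 0.5) / steps
    total_grad = np.zeros_like(x)
    for a in alphas:
        # Gradient of the model at a point on the baseline-to-input path.
        total_grad += grad_f(baseline + a * (x - baseline))
    avg_grad = total_grad / steps
    return (x - baseline) * avg_grad

# Toy model: f(x) = sum(x^2), so grad f(x) = 2x.
f = lambda x: np.sum(x ** 2)
grad_f = lambda x: 2.0 * x
x = np.array([1.0, 2.0])
baseline = np.zeros_like(x)
attr = integrated_gradients(grad_f, x, baseline)
# Completeness axiom: attributions sum to f(x) - f(baseline).
print(attr, attr.sum(), f(x) - f(baseline))
```

For this quadratic toy model the attributions come out as `x_i^2`, and their sum matches `f(x) - f(baseline)` as the completeness axiom requires.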
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
WecoAI/aideml
AIDE: the Machine Learning CodeGen Agent
TIGER-AI-Lab/LongICLBench
Code and Data for "Long-context LLMs Struggle with Long In-context Learning"
Shawn-Guo-CN/Lossless_Text_Compression_with_Transformer
A demo of lossless text compression using Transformers as the encoder and decoder.
inseq-team/inseq
Interpretability for sequence generation models 🐛 🔍
OpenInterpreter/open-interpreter
A natural language interface for computers
EvolvingLMMs-Lab/lmms-eval
Accelerating the development of large multimodal models (LMMs) with lmms-eval
zepingyu0512/awesome-llm-understanding-mechanism
Awesome papers on LLM interpretability
apple/ml-sigma-reparam
MiuLab/Taiwan-LLM
Traditional Mandarin LLMs for Taiwan
HazyResearch/based
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
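The recall-throughput tradeoff in Based stems from the linear-attention mechanism: replacing softmax attention with a feature map lets the model keep a fixed-size running state instead of the full key-value cache. A generic sketch of causal linear attention (Based itself uses a Taylor-expansion feature map; the ReLU-based `phi` here is a placeholder to illustrate the recurrence):

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    """Causal linear attention: softmax is replaced by a positive feature
    map phi, so attention becomes two running sums updated per step,
    giving O(T) time and O(1) state instead of O(T^2) / O(T)."""
    T, d = Q.shape
    out = np.zeros_like(V)
    S = np.zeros((d, V.shape[1]))   # running sum of outer(phi(k_s), v_s)
    z = np.zeros(d)                 # running sum of phi(k_s)
    for t in range(T):
        q, k, v = phi(Q[t]), phi(K[t]), V[t]
        S += np.outer(k, v)
        z += k
        # out_t = sum_{s<=t} (q.k_s) v_s / sum_{s<=t} q.k_s
        out[t] = (q @ S) / (q @ z)
    return out
```

Because the state `(S, z)` has fixed size regardless of sequence length, throughput stays high, but recall of distant tokens is limited by how much the compressed state can store.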
jzhang38/LongMamba
Some preliminary explorations of Mamba's context scaling.
hf-lin/ChatMusician
bartwojcik/adaptive_computation_modules
uclnlp/EMAT
Efficient Memory-Augmented Transformers
OpenCodeInterpreter/OpenCodeInterpreter
OpenCodeInterpreter is a suite of open-source code-generation systems aimed at bridging the gap between large language models and sophisticated proprietary systems like the GPT-4 Code Interpreter. It significantly enhances code generation by integrating execution and iterative refinement.
yuzhaouoe/pretraining-data-packing
LargeWorldModel/LWM
nushu-script/Nyushu
𛆁𛈬𛈤𛋒 | Nüshu fonts
AntreasAntoniou/kubejobs
FranxYao/Long-Context-Data-Engineering
Implementation of paper Data Engineering for Scaling Language Models to 128K Context