mcognetta's Stars
tunib-ai/parallelformers
Parallelformers: An Efficient Model Parallelization Toolkit for Deployment
mlcommons/algorithmic-efficiency
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.
Liuhong99/Sophia
The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
cp-algorithms/cp-algorithms
Algorithm and data structure articles for https://cp-algorithms.com (based on http://e-maxx.ru)
libprima/prima
PRIMA is a package for solving general nonlinear optimization problems without using derivatives. It provides the reference implementation of Powell's derivative-free optimization methods: COBYLA, UOBYQA, NEWUOA, BOBYQA, and LINCOA. The name stands for "Reference Implementation for Powell's Methods with Modernization and Amelioration", with the "P" for Powell.
IntelLabs/academic-budget-bert
Repository containing code for "How to Train BERT with an Academic Budget" paper
e9t/nsmc
Naver sentiment movie corpus
Tiiiger/QPyTorch
Low Precision Arithmetic Simulation in PyTorch
bitsandbytes-foundation/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
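A minimal sketch of the idea behind k-bit quantization: absmax int8 quantization, the basic scheme bitsandbytes builds on. This is an illustration in pure Python, not the library's actual API.

```python
# Hedged sketch: absmax 8-bit quantization (not bitsandbytes' API).
# Scale values so the largest magnitude maps to 127, round to integers,
# and keep the scale so the floats can be approximately recovered.

def quantize_absmax(xs):
    scale = max(abs(x) for x in xs) / 127.0 or 1.0  # avoid div-by-zero on all-zero input
    q = [round(x / scale) for x in xs]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

xs = [0.5, -1.0, 0.25]
q, scale = quantize_absmax(xs)  # int8-range codes plus one float scale
```

Rounding error is bounded by half the scale per element, which is why the scheme works well when values share a similar magnitude.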
google-research/jaxpruner
triton-lang/triton
Development repository for the Triton language and compiler
quantumaikr/KoreanLM
Open-source Korean language model
JuliaSIMD/VectorizedRNG.jl
Vectorized uniform and normal random samplers.
AshwinDeshpande96/Hierarchical-Softmax
A scalable hierarchical softmax layer for neural networks with large numbers of output classes.
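A minimal sketch of the idea behind hierarchical softmax (not this repo's implementation): factor P(class) as P(cluster) · P(class | cluster), so a distribution over V classes costs roughly O(√V) per probability instead of O(V).

```python
import math

# Hedged sketch of two-level hierarchical softmax: classes are grouped into
# clusters, and each class probability is the product of a cluster probability
# and a within-cluster probability.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def hierarchical_prob(cluster_scores, within_scores, cluster_id, class_id):
    """P(class) = P(cluster) * P(class | cluster)."""
    return softmax(cluster_scores)[cluster_id] * softmax(within_scores)[class_id]

# Toy example: 4 classes split into 2 clusters of 2.
cluster_scores = [1.0, 0.5]           # one score per cluster
within = [[0.2, 0.8], [0.1, 0.3]]     # scores for the classes inside each cluster

probs = [hierarchical_prob(cluster_scores, within[c], c, k)
         for c in range(2) for k in range(2)]
assert abs(sum(probs) - 1.0) < 1e-9   # still a valid distribution over all 4 classes
```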
harrisonvanderbyl/rwkv-cpp-accelerated
A torch-less C++ RWKV implementation using 8-bit quantization, written in CUDA/HIP/Vulkan for maximum compatibility and minimum dependencies
omlins/julia-gpu-course
GPU Programming with Julia - course at the Swiss National Supercomputing Centre (CSCS), ETH Zurich
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
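A minimal sketch of byte-pair encoding, the algorithm tiktoken implements (this is not tiktoken's API, just an illustration of BPE): given a learned, ordered merge list, encoding repeatedly applies the highest-priority adjacent merge until none applies.

```python
# Hedged sketch of BPE encoding with a hypothetical merge list.
# tiktoken works on bytes and is heavily optimized; this starts from
# characters to keep the idea visible.

def bpe_encode(text, merges):
    """merges: list of (a, b) pairs; earlier entries have higher priority."""
    rank = {pair: i for i, pair in enumerate(merges)}
    tokens = list(text)
    while True:
        # find the adjacent pair with the best (lowest) merge rank
        best = None
        for i in range(len(tokens) - 1):
            r = rank.get((tokens[i], tokens[i + 1]))
            if r is not None and (best is None or r < best[0]):
                best = (r, i)
        if best is None:
            return tokens
        _, i = best
        tokens = tokens[:i] + [tokens[i] + tokens[i + 1]] + tokens[i + 2:]

merges = [("l", "o"), ("lo", "w")]   # hypothetical learned merges
print(bpe_encode("lowlow", merges))  # ['low', 'low']
```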
SymbolicML/DynamicExpressions.jl
Ridiculously fast symbolic expressions
ggerganov/llama.cpp
LLM inference in C/C++
kakaobrain/jejueo
Jejueo Datasets for Machine Translation and Speech Synthesis
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
KanjiVG/kanjivg
Kanji vector graphics
sagemath/sage
Main repository of SageMath
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
aojunzz/NM-sparsity
run-llama/llama_index
LlamaIndex is a data framework for your LLM applications
tysam-code/hlb-CIFAR10
Train to 94% on CIFAR-10 in under 6.3 seconds on a single A100, or ~95.79% in ~110 seconds.
scandum/quadsort
Quadsort is a branchless stable adaptive mergesort faster than quicksort.
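A minimal sketch of the "adaptive" part of a mergesort like quadsort, in Python (quadsort itself is branchless C; this only shows the idea): split the input into already-sorted runs, then stably merge runs pairwise, so nearly-sorted input needs few merges.

```python
# Hedged sketch of run-based adaptive mergesort (not quadsort's algorithm).

def find_runs(a):
    """Split a into maximal non-decreasing runs."""
    runs, start = [], 0
    for i in range(1, len(a)):
        if a[i] < a[i - 1]:
            runs.append(a[start:i])
            start = i
    runs.append(a[start:])
    return runs

def merge(x, y):
    """Stable merge: ties are taken from x first."""
    out, i, j = [], 0, 0
    while i < len(x) and j < len(y):
        if y[j] < x[i]:
            out.append(y[j]); j += 1
        else:
            out.append(x[i]); i += 1
    return out + x[i:] + y[j:]

def run_mergesort(a):
    runs = find_runs(a)
    while len(runs) > 1:  # merge runs pairwise until one remains
        runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs[0] if runs else []

print(run_mergesort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

An already-sorted input is a single run and needs zero merges, which is where the adaptivity comes from.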
BYVoid/OpenCC
Conversion between Traditional and Simplified Chinese
FluxML/FastAI.jl
Repository of best practices for deep learning in Julia, inspired by fastai