Pinned Repositories
agner
Reworking of Agner Fog's performance test programs for Linux
algo-lib
Concise, performant data structures and algorithms in C++ (mostly authored by saketh-are).
ArchBenchSuite
Low-level kernels to benchmark peak compute, cache bandwidth at various levels, memory bandwidth, and some basic compute routines
arrayfire
ArrayFire: a general purpose GPU library.
arrow
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.
asmjit
Machine code generation for C++
ATen
ATen: A TENsor library for C++11
audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
avx-turbo
Test the non-AVX, AVX2 and AVX-512 speeds across various active core counts
pt-shard-experiments
imaginary-person's Repositories
imaginary-person/pt-shard-experiments
imaginary-person/ArchBenchSuite
Low-level kernels to benchmark peak compute, cache bandwidth at various levels, memory bandwidth, and some basic compute routines
imaginary-person/arrow
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.
imaginary-person/charm
The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.
imaginary-person/FasterTransformer
Transformer related optimization, including BERT, GPT
imaginary-person/FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
imaginary-person/gdrcopy
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
imaginary-person/gemm
imaginary-person/leveldb
LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
imaginary-person/likwid
Performance monitoring and benchmarking suite
imaginary-person/llama2.c
Andrej Karpathy's Llama 2 inference in C
imaginary-person/loop_tool
A thin, highly portable C++ intermediate representation for dense loop-based computation.
imaginary-person/madrona
imaginary-person/MonetDB
This is the official mirror of the MonetDB Mercurial repository. Please note that we do not accept pull requests on GitHub. The regression test results can be found on the MonetDB Testweb: http://monetdb.cwi.nl/testweb/web/status.php. For contributions, please see: https://www.monetdb.org/Developers
imaginary-person/nanoGPT
Andrej Karpathy's nanoGPT
imaginary-person/obs-studio
OBS Studio - Free and open source software for live streaming and screen recording
imaginary-person/pytorch-1
Tensors and Dynamic neural networks in Python with strong GPU acceleration
imaginary-person/qBittorrent
qBittorrent BitTorrent client
imaginary-person/rocksdb
A library that provides an embeddable, persistent key-value store for fast storage.
imaginary-person/Stanford_CS348K_readings
This is a list of readings for Stanford CS348K.
imaginary-person/stdgpu
stdgpu: Efficient STL-like Data Structures on the GPU
imaginary-person/stylegan3
Official PyTorch implementation of StyleGAN3
imaginary-person/TensorRT
TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.
imaginary-person/torch-mlir
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
imaginary-person/torcharrow
A torch.Tensor-like DataFrame library supporting multiple execution runtimes and Arrow as a common memory format
imaginary-person/torchdistx
Torch Distributed Experimental
imaginary-person/torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
imaginary-person/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch, TensorFlow, and JAX.
imaginary-person/tuplex
Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tuplex has similar Python APIs to Apache Spark or Dask, but rather than invoking the Python interpreter, Tuplex generates optimized LLVM bytecode for the given pipeline and input data set.
imaginary-person/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators