lcy-seso

MSR Asia. Previously worked at Baidu IDL (Institute of Deep Learning) and contributed as a member of the Paddle team.

MSRA, system research group, China

Pinned Repositories

DLFrameworkTest
My tests and experiments with some popular deep learning frameworks.
Language:Python12 4 00
EfficientAttention-Notes
3 1 00
FractalTensor
FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of lists of statically-shaped tensors, referred to as a FractalTensor.
Language:Python0 0 00
lcy-seso.github.io
Ying's blog posts.
Language:SCSS0 1 00
LearnHaskell
So I decided to learn a functional programming language.
2 2 00
LearningNotes
Ying's notes
Language:TeX7 4 00
models
Model configurations
Language:Python3 2 02
paddle_confs_v1
Paddle configuration files written with the old API.
Language:Python2 3 01
TileFusion
TileFusion is a highly efficient kernel template library designed to elevate the level of abstraction in CUDA C for processing tiles.
Language:Cuda0 0 00
VPTQ
VPTQ, a flexible and extreme low-bit quantization algorithm
Language:Cuda0 0 00

lcy-seso's Repositories

lcy-seso/LearnHaskell
So I decided to learn a functional programming language.
2 2 00
lcy-seso/JuliaMachineLearning
Small exercises implementing some machine learning algorithms in the Julia programming language.
Language:Jupyter Notebook1 2 00
lcy-seso/pypoly
Extract polyhedral representation from PyTorch programs.
Language:C++1
lcy-seso/JuliaLearningNotes
My learning notes on the Julia programming language.
Language:Jupyter Notebook0 1 00
lcy-seso/awesome-fast-attention
List of efficient attention modules
Language:Python1 0
lcy-seso/batched_gemm
Language:C1 0
lcy-seso/coding-interview-university
A complete computer science study plan to become a software engineer.
1 0
lcy-seso/CUDAMemoryPool
Language:C++2 0
lcy-seso/DeepBench
Benchmarking Deep Learning operations on different hardware
lcy-seso/experiments
Language:Python3 0
lcy-seso/instaparse
Language:Clojure1 0
lcy-seso/isl
Integer Set Library (source repository: http://repo.or.cz/w/isl.git)
Language:C1 0
lcy-seso/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Language:Jupyter Notebook1 0
lcy-seso/myia
Myia prototyping
Language:Python1 0
lcy-seso/onnx-simplifier
Simplify your onnx model
Language:Python1 0
lcy-seso/Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.
Language:Cuda1 0
lcy-seso/pbatch
Language:C++1 0
lcy-seso/pet
Polyhedral Extraction Tool (source repository: http://repo.or.cz/w/pet.git)
Language:C1 0
lcy-seso/play-with-torch-script
Play with torch script.
Language:Python2 0
lcy-seso/ppcg
Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git)
Language:C1 0
lcy-seso/reformer-pytorch
Reformer, the efficient Transformer, in Pytorch
Language:Python1 0
lcy-seso/rmm
RAPIDS Memory Manager
Language:C++2 0
lcy-seso/sofp
A free book: "The Science of Functional Programming"
Language:PostScript2 0
lcy-seso/tensor_ops
3 0
lcy-seso/tensorflow
Computation using data flow graphs for scalable machine learning
Language:C++3 0
lcy-seso/tiramisu
A polyhedral compiler for expressing fast and portable data parallel algorithms
lcy-seso/torchscript-to-tvm
Language:Cuda1 0
lcy-seso/tvm-cuda-int8-benchmark
Benchmark of TVM quantized model on CUDA
Language:Python1 0
lcy-seso/tvm_examples
Language:Python1 0
lcy-seso/utvm_staticrt_codegen
This project contains a code generator that produces static C NN inference deployment code targeting tiny micro-controllers (TinyML) as a replacement for other µTVM runtimes. This tool generates a runtime, which statically executes the compiled model. This reduces the overhead in terms of code size and execution time compared to having a dynamic on-device runtime.
Language:C1 0