This repo is a memory footprint predictor (emulator) for CUDA applications.
It is based on pytorch-malloc, a repo that shows how to replace the PyTorch memory allocator with a customized one.
- install pybind11
- run build.sh
- run LD_PRELOAD=./fake_libcudart.so python torch_example.py
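
The last step preloads fake_libcudart.so, so allocation calls from the application presumably reach the fake runtime instead of the real one. The following is a minimal sketch of the kind of bookkeeping a footprint emulator could do with those calls. It is not this repo's code: the class `FootprintTracker` and its methods are hypothetical names chosen for illustration, and no real device memory is involved.

```python
# Hypothetical sketch, not the repo's implementation.
# It records fake allocations and reports current and peak usage.
class FootprintTracker:
    def __init__(self):
        self.live = {}            # fake device pointer -> size in bytes
        self.current_bytes = 0    # bytes currently "allocated"
        self.peak_bytes = 0       # highest value current_bytes has reached
        self._next_ptr = 0x1000   # fake addresses; nothing is reserved

    def malloc(self, size):
        """Pretend to allocate `size` bytes and return a fake pointer."""
        ptr = self._next_ptr
        self._next_ptr += max(size, 1)  # keep fake pointers unique
        self.live[ptr] = size
        self.current_bytes += size
        self.peak_bytes = max(self.peak_bytes, self.current_bytes)
        return ptr

    def free(self, ptr):
        """Release a fake allocation; raises KeyError for unknown pointers."""
        size = self.live.pop(ptr)
        self.current_bytes -= size


tracker = FootprintTracker()
a = tracker.malloc(4 * 1024)
b = tracker.malloc(8 * 1024)
tracker.free(a)
c = tracker.malloc(2 * 1024)
print(tracker.current_bytes, tracker.peak_bytes)  # prints "10240 12288"
```

The peak is the number of interest for footprint prediction: it stays at 12288 bytes even though usage drops after the first buffer is freed.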