hpcgarage/spatter

Memory Utilization

Closed this issue · 7 comments

There is an issue with the refactor using significantly more memory than the current version.

Using the uniform stride test (cpu-ustride.json) as an example, the Jupyter notebook mentions:

> This test will require 2GB of memory. The plots in the paper used 16 GB but we want this notebook to run quickly on laptops with less memory.

I ran the uniform stride test and confirmed the current version uses 1.9 GB of memory. When I ran the same test with the refactor, I found that it used 7.4 GB of memory.

However, the uniform stride test specifies UNIFORM patterns with NR, and the refactor does not handle these patterns correctly. There are two branches that resolve the handling of these patterns:

https://github.com/radelja/spatter/tree/refactor-nr-fix
https://github.com/plavin/spatter/tree/refactor-new

Running the uniform stride test with either of these branches revealed that the refactor uses 29.8 GB of memory, compared to the 2 GB used by the current version.

Discussion noted that the refactor allocates separate buffers for each configuration, which makes both the overall memory usage and the runtime grow dramatically.

There are two fixes we discussed:

  1. Reuse the same buffers across configs, which should also make resize() operations much faster.
  2. Incorporate the old Spatter function that calculates the maximum buffer size for vectors, and remove the vector resize() operations.

I measured the memory usage of the current (main branch) and refactor (refactor-new branch) versions of Spatter running cpu-ustride.json with the serial backend, using Valgrind's Massif. The following graphs were generated with massif-visualizer:

Current: [massif-visualizer graph: main]

Refactor: [massif-visualizer graph: refactor-new]

Rerunning the above test with the changes from the refactor-mem-fix branch produces the following graph:

Refactor: [massif-visualizer graph: refactor-changes]

It now uses 1.9 GB of memory, just like the current version.

While this issue has been resolved for cpu-ustride.json and other input files, there is still a memory usage issue with the Branson trace. I ran the Branson trace with the serial backend under Massif on both the refactor and the original versions of Spatter. Here are the massif-visualizer graphs from these runs:

Original: [massif-visualizer graph: original_branson]

Refactor: [massif-visualizer graph: branson]

The refactor-fix-args branch resolves this issue for the Branson trace:

Refactor: [massif-visualizer graph: refactor-branson-new]

I ran pennant_gpu.json with the CUDA backend on both the original and current versions of Spatter. The memory usage was measured with Massif and visualized with massif-visualizer:

Original: [massif-visualizer graph: pennant_gpu_v1]

Current (previously referred to as Refactor): [massif-visualizer graph: pennant_gpu_v2]
Note: this memory usage graph is incomplete, as an issue with Massif caused Spatter to terminate early. This occurred after the buffers had already been allocated, so the graph should still show most of the memory usage.

The current version uses slightly more memory, but only for the CUDA backend. This issue seems to be resolved.

Thanks for the detailed output, @radelja! I agree that this looks much improved with the refactor. I'll close this for now, and we can revisit if we see issues with these patterns or new ones we collect.