Feature Request: Get loaded cache lines
stettberger opened this issue · 2 comments
As a user, I would like to know which memory addresses are loaded from the MainMemory. I came up with a "solution" that parses cachesim's log output, but that is not a real solution. Here it is for reference:
import sys
import io
from cachesim import CacheSimulator, Cache, MainMemory

mem = MainMemory()
l1 = Cache("L1", 128, 4, 32, "LRU")
mem.load_to(l1)
mem.store_from(l1)
cs = CacheSimulator(l1, mem)
l1.backend.verbosity = 4

def cache_op(cache, op, addr, width):
    """This is wild"""
    old = cache.stats()['MISS_count']
    old_sys = sys.stdout
    sys.stdout = io.StringIO()
    op(addr, width)
    X = sys.stdout
    sys.stdout = old_sys
    delta = cache.stats()['MISS_count'] - old
    lines = [x for x in X.getvalue().split("\n") if 'MISS' in x and 'LOAD' in x]
    cls = []
    for line in lines:
        line = line[line.index("cl_id=")+len("cl_id="):]
        line = line[:line.index(" ")]
        line = int(line)
        if line not in cls:
            cls.append(line)
    assert len(cls) == delta, X.getvalue()
    return cls

print('Loaded CL_ids:', cache_op(l1, cs.load, 16, 4))
print('Loaded CL_ids:', cache_op(l1, cs.load, 20, 4))
print('Loaded CL_ids:', cache_op(l1, cs.load, 32, 4))
print('Loaded CL_ids:', cache_op(l1, cs.load, 127, 4))
Result:
Loaded CL_ids: [0]
Loaded CL_ids: []
Loaded CL_ids: [1]
Loaded CL_ids: [3, 4]
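The capture-and-parse steps of the workaround above could probably be written more compactly with `contextlib.redirect_stdout` and a regular expression. This is only a sketch: `missed_cl_ids` is a hypothetical helper, and `fake_load` with its log lines is a made-up stand-in for cachesim's verbose output. The only property assumed is that a missed load prints a line containing `MISS`, `LOAD` and `cl_id=<number>`.

```python
import contextlib
import io
import re

CL_ID_RE = re.compile(r"cl_id=(\d+)")

def missed_cl_ids(op, *args):
    """Run op(*args), capture what it prints, and return missed cl_ids in order."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        op(*args)
    ids = []
    for line in buf.getvalue().splitlines():
        if "MISS" in line and "LOAD" in line:
            match = CL_ID_RE.search(line)
            if match and int(match.group(1)) not in ids:
                ids.append(int(match.group(1)))
    return ids

def fake_load(addr, width):
    # Hypothetical log format, NOT taken from cachesim.
    print("L1 LOAD=1 MISS cl_id=3 addr=127")
    print("L1 LOAD=1 MISS cl_id=4 addr=128")
    print("L1 LOAD=1 HIT cl_id=4 addr=129")

print(missed_cl_ids(fake_load, 127, 4))  # [3, 4]
```

Compared with swapping `sys.stdout` by hand, the context manager restores the original stream even if the operation raises.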
Hi @stettberger,
sorry for the delayed reply!
I am not sure if my suggested solution fully meets your requirements, but maybe it is a first step in the right direction.
The cleanest way would be to check cache.backend.cached for a specific cache to see whether a cache line ID is in the cache. However, because this is a set and checking it is quite expensive in terms of computation, I would recommend introducing another dictionary data structure like this:
from cachesim import CacheSimulator, Cache, MainMemory
import collections
import itertools

def load_and_track_cl(cs: CacheSimulator, addr, length):
    cl_ids = set()
    # make sure to have one address per CL in case length > cl_size
    addresses = itertools.chain(range(addr, addr+length, cs.last_level.backend.cl_size), [addr+length-1])
    for ad in addresses:
        cl_id = ad >> cs.last_level.backend.cl_bits
        if cl_id not in cs.cache_dict:
            cl_ids.add(cl_id)
            cs.cache_dict[cl_id] = True
    # do actual load
    cs.load(addr, length)
    return cl_ids

mem = MainMemory()
l1 = Cache("L1", 128, 4, 32, "LRU")
mem.load_to(l1)
mem.store_from(l1)
cs = CacheSimulator(l1, mem)
# create additional data structure in CacheSimulator
cs.cache_dict = collections.OrderedDict()

print("Loaded CL_ids: {}".format(load_and_track_cl(cs, 16, 4)))
print("Loaded CL_ids: {}".format(load_and_track_cl(cs, 20, 4)))
print("Loaded CL_ids: {}".format(load_and_track_cl(cs, 32, 4)))
print("Loaded CL_ids: {}".format(load_and_track_cl(cs, 127, 4)))
print("Loaded CL_ids: {}".format(load_and_track_cl(cs, 192, 80)))
Result:
Loaded CL_ids: {0}
Loaded CL_ids: set()
Loaded CL_ids: {1}
Loaded CL_ids: {3, 4}
Loaded CL_ids: {8, 6, 7}
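The address arithmetic inside `load_and_track_cl` can be checked on its own, without cachesim. The sketch below assumes a power-of-two cache line size; `touched_cl_ids` is a hypothetical helper name, and `cl_bits` is derived from `cl_size` here instead of being read from the backend.

```python
def touched_cl_ids(addr, length, cl_size=32):
    """Return the IDs of all cache lines covered by [addr, addr + length)."""
    if length <= 0:
        return set()
    assert cl_size > 0 and cl_size & (cl_size - 1) == 0, "cl_size must be a power of two"
    cl_bits = cl_size.bit_length() - 1       # e.g. 32 bytes -> 5 bits
    first = addr >> cl_bits                  # line holding the first byte
    last = (addr + length - 1) >> cl_bits    # line holding the last byte
    return set(range(first, last + 1))

print(sorted(touched_cl_ids(16, 4)))    # [0]
print(sorted(touched_cl_ids(127, 4)))   # [3, 4]
print(sorted(touched_cl_ids(192, 80)))  # [6, 7, 8]
```

Taking the first and last line ID and filling in the range between them should give the same IDs as stepping through the access in `cl_size` increments and appending the last byte.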
I checked this against your code snippet, validated it with a few hundred thousand randomly generated memory addresses, and timed it, which gave me an approximate speedup of 3x.
Please keep in mind that there are cases in which this doesn't work, for example, if you have a victim cache.
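Another case follows directly from the snippet above: `cs.cache_dict` only ever grows, so a line that was evicted and is later loaded again is not reported a second time. A rough way to account for this is to mirror the replacement in Python. The sketch below assumes a single set-associative LRU level with the set index taken as `cl_id` modulo the number of sets. `LRUTracker` is a hypothetical helper, not part of cachesim, and it will not match the simulator for other replacement policies or multi-level hierarchies.

```python
from collections import OrderedDict

class LRUTracker:
    """Hypothetical pure-Python model of one set-associative LRU cache level."""

    def __init__(self, sets, ways, cl_size):
        self.sets, self.ways, self.cl_size = sets, ways, cl_size
        # One OrderedDict per set; the oldest entry is the least recently used line.
        self.lines = [OrderedDict() for _ in range(sets)]

    def load(self, addr, length):
        """Touch [addr, addr + length) and return the cl_ids that missed."""
        missed = []
        first = addr // self.cl_size
        last = (addr + length - 1) // self.cl_size
        for cl_id in range(first, last + 1):
            cache_set = self.lines[cl_id % self.sets]
            if cl_id in cache_set:
                cache_set.move_to_end(cl_id)       # hit: mark as most recently used
            else:
                missed.append(cl_id)
                cache_set[cl_id] = True
                if len(cache_set) > self.ways:
                    cache_set.popitem(last=False)  # evict the least recently used line
        return missed

tracker = LRUTracker(sets=1, ways=2, cl_size=32)
print(tracker.load(0, 4))    # [0]
print(tracker.load(32, 4))   # [1]
print(tracker.load(64, 4))   # [2], and line 0 is evicted
print(tracker.load(0, 4))    # [0] again, because line 0 was evicted
```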
I hope this helps nonetheless!
Closing this issue for now, but I am happy to discuss the topic further, so feel free to reopen.