The reuse distance is the uniqe PC counts between two appearence of one same PC.
Assume a PC sequence a,b,c,c, d, a, a,b,c,e.......
.
Between the first a
and the second a
, there is b, c, c, d. So the reuse distance of the first a
is unique([b,c,c,d]) = 3
.
Reuse latency is also used to refer to reuse distance in our repo.
We use two for
loops to do this work:
PC = [1,2,3,4,1,1,2,3,5]
RL = []
L = len(PC)
for i in range(L):
for j in range(i+1, L):
if PC[i]!=PC[j]:
continue
RL.append(RL_uniqe(i, j))
print(RL)
Use a hashmap to record all the PCs and its indexes. Then we just need to traverse the log once. We will also maintain some states to speedup RL calculation for basic bolcks. For details please see the code.
PC = [1,2,3,4,1,1,2,3,5]
RL = [0xffff]*len(PC)
L = len(PC)
hashmap = {}
for i in range(L):
pc = PC[i]
if pc in hashmap.keys():
last_i = hashmap[pc]
RL[last_i] = RL_uniqe(last_i + 1, i)
hashmap[PC[i]] = i
print(RL) # 0xffff in RL means that pc is not reused
The file pc.txt is a dummy file for tests
83910 ffffffffa4e96106 2 66 90
83910 ffffffffa4e96108 1 c3
83910 ffffffffa4e1b854 2 66 90
83910 ffffffffa4e1b856 7 4c 89 a3 60 01 00 00
83910 ffffffffa4e1b85d 1 5b
83910 ffffffffa4e1b85e 2 41 5c
83910 ffffffffa4e1b860 1 5d
83910 ffffffffa4e1b861 1 c3
83910 ffffffffa4e1c68d 4 48 83 c4 20
Each pc is a little-Endian, 8-byte, unsigined integer.
Each rl(reuse latency) is a little-Endian, 8-byte, unsigined integer.
It keeps the counts for every rl. Each rl(reuse latency) and each count is a little-Endian, 8-byte, unsigined integer, like this [rl][count][rl][count].......
./rl_tool pc.txt