Consider renaming FLUSH to EVICT
jhauser-us opened this issue · 24 comments
Multiple people have said in the past that they consider the word flush potentially to be a synonym for what the proposal calls clean. The word evict has an existing defined meaning for caches that cannot be confused with clean. Please consider renaming the FLUSH operation to the less ambiguous EVICT.
Seems like a good idea. Terminology is inconsistent at many companies.
I assume that EVICT writes dirty data out, and leaves the line behind in invalid state? I.e. the operation Intel calls WBINVD (WriteBack INValiDate).
Whereas CLEAN writes the dirty data out, but leaves the line behind in a clean state - e.g. E or S or O in MOESI. (Probably not O if intended to invalidate for acces by others, e.g. I/O, but that's another issue.)
While we are at it, do you have a preference between INVALIDATE (Intel) and DISCARD (IBM) - both of which throw any dirty data away without writing it back, and leave the line in invalid state.
In CHI/ACE, the ARM "clean" operation leaves the cache state in the equivalent of either E or S, not O (since O is technically modified with respect to memory).
EVICT could be confused with the CHI/ACE "Evict" transaction, which is initiated to update an external snoop filter when a cache replaces a clean line (E or S) without writing data out.
In the past, I've used "flush" as shorthand for the ARM "clean & invalidate" operation since it connotes (to me) that the copy has been invalidated, rather than has been retained as clean.
My vote would be CLEAN, FLUSH, and INVAL, but certainly if any of those cause confusion, we should pick something else. Despite the similarity to the CHI/ACE transaction, EVICT seems a good enough substitute for FLUSH, so I'd be OK with that.
Sorry, I mean Clean+Owned, not Dirty+Owned - as in Clean, but this cache must respond to requests. (Don't ask me to provide a reference, I'm lazy).
I assume that EVICT writes dirty data out, and leaves the line behind in invalid state? I.e. the operation Intel calls WBINVD (WriteBack INValiDate).
Exactly.
Note also, instead of INVALIDATE or DISCARD for the instruction cache, we should call this operation EVICT as well. Why? As you discuss in your document, DISCARD often has security risks to worry about, but EVICT does not. For a read-only cache like the instruction cache, DISCARD = EVICT, and we know the operation (by either name) is always safe, so choose the safe word for it, EVICT. We should train programmers to be wary of DISCARD operations.
While we are at it, do you have a preference between INVALIDATE (Intel) and DISCARD (IBM)
Either seems fine to me.
David Kruckemyer raised a good point:
EVICT could be confused with the CHI/ACE "Evict" transaction, which is initiated to update an external snoop filter when a cache replaces a clean line (E or S) without writing data out.
But then...
Despite the similarity to the CHI/ACE transaction, EVICT seems a good enough substitute for FLUSH, so I'd be OK with that.
See, it's alright; he's okay with EVICT. You heard him say it. :^)
A possible confusion with CHI/ACE over the word evict is a concern. But as I like to point out, there are a thousand programmers for every person constructing the hardware, and very few of those programmers will know a thing about the physical buses, CHI/ACE or otherwise. What programmers will see is the instruction name.
EVICT is somewhat ambiguous in the context of a strictly exclusive cache, e.g. L1 cache that treats its L2 as a victim cache ( in AMD's case, where the victim back several times larger than the L1).
Normally, in such a strictly exclusive cache, evicting from the L1 moves the line to the L2 whether clean or not.
While that might be sufficient for our purposes, e.g. if combined with the subsequent EVICT from the L2 [*], at the very least it is power inefficient: (1) why waste power sending a clean victim to the L2 when you can simply invalidate it? and (2) why waste power sending a dirty victim to the L2 instead of straight to ... whatever is further out, L3 or memory or… ?
The answer to these "why waste power?" questions might be "because the normal cache protocol does it that way, and we didn't want to make a special case for CMOs".
I just want it to be clear that if the implementer wants to avoid wasting power, he's allowed. I.e. that the CMO EVICT does not need to be the same as the normal exclusive cache eviction.
Sorry, I mean Clean+Owned, not Dirty+Owned - as in Clean, but this cache must respond to requests. (Don't ask me to provide a reference, I'm lazy).
Do you mean F (for forward) state? QPI/UPI calls it that anyway (i.e. MESIF).
Note also, instead of INVALIDATE or DISCARD for the instruction cache, we should call this operation EVICT as well. Why? As you discuss in your document, DISCARD often has security risks to worry about, but EVICT does not. For a read-only cache like the instruction cache, DISCARD = EVICT, and we know the operation (by either name) is always safe, so choose the safe word for it, EVICT. We should train programmers to be wary of DISCARD operations.
Are you suggesting eliminating INVALIDATE/DISCARD as an operation? I'm actually fine with that to avoid some of the virtualization issues that introduces, but I just want to clarify.
I just want it to be clear that if the implementer wants to avoid wasting power, he's allowed. I.e. that the CMO EVICT does not need to be the same as the normal exclusive cache eviction.
I think there's a stronger requirement, which is that an EVICT must not write clean data to memory. Imagine you have non-coherent I/O that has written memory. If a hart then wants to read the new data, it performs an EVICT, and if its cache writes clean data back, the old data clobbers the new data.
I think that we can probably approve this issue - and that old fogies and people used to other systems like Derek W and I will just have to train ourselves to say EVICT for RISC-V, rather than flush as in Intel CLFLUSH and POWER DCBF.
We do need to figure out a way of finalizing and closing issues for our task group. A verbal in our first meeting may suffice, as might an email discussion that winds to an end with no further objections. However, past experience shows that positive confirmation of agreement is a good thing. Issue trackers often have that - I don't know if this does. If it does not, I may try creating another Doodle poll. Not because this issue is particularly important, but just to establish the mechanism or more important issues in the future. Assuming, of course, I obtain a non-temporary Doodle account
David Kruckemyer wrote:
I think there's a stronger requirement, which is that an EVICT must not write clean data to memory. Imagine you have non-coherent I/O that has written memory. If a hart then wants to read the new data, it performs an EVICT, and if its cache writes clean data back, the old data clobbers the new data.
In my experience, the way this needs to work in practice is for software to perform EVICT or DISCARD before initiating the non-coherent I/O, and then be careful not to do anything that would accidentally bring in those cache lines until the I/O is complete. If you don't do it that way, you are seriously risking corruption of your I/O data if your core decides on its own (perhaps because it needs space in the cache) to evict a cache line that overwrites the I/O data. If you follow this procedure of EVICT/DISCARD before starting the I/O, there is no problem with EVICT writing back clean data.
Now, if you know that EVICT never writes out clean data, software could use EVICT also after the I/O is done as a way to avoid needing to know exactly the cache line size. But it's debatable whether it's better just to tell software the maximal cache line size, or hide this from software and make it execute additional EVICT instructions with a potential performance cost. The latter option requires that EVICT never write clean data., while the first option (tell software the maximal cache line size) doesn't care.
In my experience, the way this needs to work in practice is for software to perform EVICT or DISCARD before initiating the non-coherent I/O, and then be careful not to do anything that would accidentally bring in those cache lines until the I/O is complete. If you don't do it that way, you are seriously risking corruption of your I/O data if your core decides on its own (perhaps because it needs space in the cache) to evict a cache line that overwrites the I/O data. If you follow this procedure of EVICT/DISCARD before starting the I/O, there is no problem with EVICT writing back clean data.
The challenge is that, with speculation and prefetchers, there's no way to guarantee an address hasn't been brought back into the cache (especially one you might have touched previously). And because of that, the hart needs to perform cache maintenance after the I/O transfer as well, in order to observe the data in memory. One could use DISCARD, but because that's a "destructive" operation, there are security ramifications in a virtualized environment. So the safe operation is EVICT, assuming that it doesn't evict lines to memory, and the "safe" sequence looks like:
- EVICT the I/O region to force any dirty copies back to memory
- Perform the I/O transfer
- EVICT the I/O region again to invalidate any clean copies that the HW brought in
(Note that you want this property, that clean lines don't write back to memory, to be true independent of cache operations, since a replacement is asynchronous with respect to steps in the above sequence.)
Now, if you know that EVICT never writes out clean data, software could use EVICT also after the I/O is done as a way to avoid needing to know exactly the cache line size. But it's debatable whether it's better just to tell software the maximal cache line size, or hide this from software and make it execute additional EVICT instructions with a potential performance cost. The latter option requires that EVICT never write clean data., while the first option (tell software the maximal cache line size) doesn't care.
The more I think about this in the context of software-managed coherence, the more I think knowing the cache line size is unavoidable because of the "false sharing" scenario, i.e. you really can't have I/O writing one part of the line while the hart simultaneously writes the other. There's really no convenient way for the hardware to keep track of what the hart updated, short of adding byte valid bits in the cache (ick!). So I think that argues for your first option. Though considering my comments above, I think you still care about the behavior of clean lines with respect to memory, independent of the behavior of EVICT (i.e. clean lines never update memory regardless of replacement policy or cache operation).
The challenge is that, with speculation and prefetchers, there's no way to guarantee an address hasn't been brought back into the cache (especially one you might have touched previously)
Good point. That was never a risk with the simpler cores I'd been using.
Somehow it feels like the first EVICT/DISCARD ideally ought to suppress further speculative loading for the affected address range, but that's a whole lot easier said than implemented, I do believe.
Oh, where to begin? :-). Amazing how the most inconsequential questions about names generate the most ink.... :-)
-
I think IBM uses INVALIDATE, not DISCARD (at least Power does). I think the Power terminology and the ARM CHI seem to be fairly in sync other than they say clean+invalidate vs. just saying flush.
-
I think FLUSH, CLEAN, INVALIDATE are just fine. and I don't think we should change the names. Though, no mater what names you pick, you will annoy some group of people -- choose wisely. I don't find the arguments for renaming FLUSH to EVICT to be all that persuasive personally. I personally wouldn't change any of those names. It's also probably a good thing I don't get to decide completely on my own, but that would be my vote for this. (Kruckemyer saying he's happy with EVICT does not mean I am :-) -- or that he should have been :-)).
Whatever words are picked, I'm sure the ISA manual will clearly lay out the intent. People will get used to it.
-
I loathe the idea of calling INVALIDTE by the name of EVICT (suggested above for I-caches). Two things, that's hiding the ball on what it's doing and let's not go there. Second, we aren't talking about the I-cache instructions here (modulo the shape of the instructions --- and it's going to turn out to be a bit of a trick). The i-cache instructions a different kettle of fish that's over in the J-group. Programmers will know that INVALIDATE and ALLOCATE are dangerous all on their own.
-
If we're going to have DCBI (INVALIDATE) or DCBA (ALLOCATE), I'm going to strongly suggest that the architecture explicitly state that the implementations are always allowed to implement those as DCBF (FLUSH) and DCBZ (ZERO) respectively. That allows people who need those dangerous operations and can live with their consequences to have them, while at the same time allowing general purpose machines that don't want the semantic risks/securty holes they induce to gently tip-top around them without requiring those dangerous cache ops be in a separate optional extension.
-
I'll sign up for being used to other systems... Andy, you can have the old fogie :-). I don't think either of us should be using any word other than flush.
-
All of the cache ops need to be atomic in the sense that they get the set of caches to the state they promise (in other words if there is a competing load instruction fighting with, say, a flush, the flush has to managed to get the lines out of all the caches at some point. The load can be before or after, but shouldn't be in the middle.
-
FLUSH/INVALIDATE should not try to suppress speculative loading (it can't from other processors). FLUSH/INVALIDATE need to compete in the bus protocol and finish as an atomic op (see 5).
Derek
I'll add my vote to the seeming consensus on terminology: "EVICT" instead of "FLUSH".
One other counterpoint to naming it EVICT.
I (and I'm sure others) think of an "eviction" as what happens at a local cache when a line is evicted due to a capacity problem.
FLUSH is what happens when the line for a given address is removed from every cache in a coherent system due to a flush instruction.
So, if we go with EVICT the terminology is overloaded and confusing in a different direction.
Derek
Agree that EVICT will also need to operate on all the copies in a coherent system that a thread could be migrated to, otherwise place an additional requirement on the Operating System to do so if it migrates threads in the middle of a CMO sequence.
But IMHO we can still call it "evict", and describe the operation as "evicting" even if it's done to other caches.
Sorry @jhauser-us, I'm going to express a stronger preference for FLUSH. I'm still "OK" with EVICT, but I do think FLUSH is more appropriate since we are concerned about other caches and not just one's own cache.
FWIW, it's not obvious to me why the word evict implies one cache but flush may impact multiple caches.
It seems to me like we're trying to come up with a short-hand word to describe "clean & invalidate" and we can't find an un-ambiguous term. How about calling FLUSH/EVICT it what it is: Data Cache Clean & Invalidate? And if that's too much to type, then how about:
- concatenate: CLEANINVAL
- an acronym:
dcbci
or DCBCI
Even if those options sound like other ISAs (Arm, Power), I don't think we should invent a new term just for the sake of being different.
Consensus: from email discussion and voice vote in two of the meetings:
This proposal is rejected.
The POR is that the operation that writes back dirty data and then sets all clean and dirty data to invalid will continue to be called FLUSH.
--
To put things all in one place as of this date:
FLUSH: writes back dirty data and then sets all clean and dirty data to invalid
CLEAN: writes back dirty data, but keeps all lines in the cache in clean state
DISCARD: converts lines to invalid, without writing back dirty data. This operation is insecure.
There should be other CMO operations, if not in phase 1 that in phase 2.
But this is a good starting set for phase 1, in combination with the prefetches PREFETCH.R and PREFETCH.W
To evict from the instruction cache (or from any other read-only cache), will the operation be called FLUSH (even though there's no write-back) or DISCARD (even though the operation is not insecure)?