There are indications that it is possible to cause bit flips in memory by row hammering without using CLFLUSH, using normal cached memory accesses.
This makes me wonder: Is it possible to do double-sided row hammering using cached memory accesses, or only single-sided row hammering? The former is more likely to cause bit flips, and might be the only way to cause bit flips on some machines, such as those using a 2x refresh rate -- i.e. those configured to refresh DRAM every 32ms instead of every 64ms. (See the rowhammer blog post for more background.)
The answer appears to be "no" -- at least on my test machine.
For this machine (which has a Sandy Bridge CPU), I figured out how physical addresses map to cache sets and to banks and rows in DRAM. We can use these mappings to answer questions about what kinds of row hammering are possible using cached memory accesses.
More specifically, my question is this: For a machine with an N-way L3 cache, is it possible to pick N+1 addresses that map to the same cache set, where at least two of these addresses map to rows R-1 and R+1 in one bank (for some neighbouring row R)? If so, repeatedly accessing these addresses would cause cache misses that cause rows R-1 and R+1 to be repeatedly activated. That puts more stress on row R (the victim row) than repeatedly activating only row R-1 or row R+1.
The answer to this is "no": It's not possible to pick two such physical addresses.
Suppose we have two such addresses, A and B. Then:
The addresses map to the same bank, so:
(1): A[14:17] ^ A[18:21] = B[14:17] ^ B[18:21]
(using the bank/row XOR scheme I described previously)
The addresses are 2 rows apart, so:
(2): A[18:32] + 2 = B[18:32]
The addresses map to the same cache set, so:
(3): A[6:17] = B[6:17]
(also, SliceHash(A[17:32]) = SliceHash(B[17:32]), but we don't need this property)
(2) implies that A = ~B.
(3) implies that A[14:17] = B[14:17]. Combining that with (1) gives A[18:21] = B[18:21]. That implies A = B, which contradicts (2), so the constraints can't be satisfied.
Basically, the bank/row XOR scheme used by the memory controller's address mapping gets in the way.
- Do single-sided row hammering only.
- Attempt double-sided hammering by picking two sets of 13 addresses (given a 12-way L3 cache), mapping to two different cache sets. This would likely halve the rate of memory accesses (relative to accessing one set of 13 addresses).
Some further notes
I might be getting ahead of myself by doing this analysis, because the feasibility of causing rowhammer-induced bit flips via cached accesses has not yet been demonstrated on a wide selection of machines.
Also, I am asking whether double-sided hammering is possible, not practical, given the constraint that the attacker's memory accesses must go through the cache. While my analysis shows that the risk of causing double-sided hammering is zero with this constraint, the risk might only be negligible without this constraint. Even if a program can use CLFLUSH to bypass the cache, it doesn't necessarily have a way to determine which pairs of addresses map to rows that are spaced apart by two (see the rowhammer blog post for further discussion). Such a program might only be able to get such pairs by chance, by selecting addresses randomly, which would take a long time.
Lastly, my analysis makes assumptions about a machine's DRAM and L3 cache mappings. It would need to be redone for other machines where these assumptions aren't true.