During my ProgPoW review last year, the 64-bit seed looked to me like it provided only a low safety margin. I was going to revisit this and share some thoughts with the community, but in the end I ran out of time, and I felt that ProgPoW was non-final anyway (to my liking, at least) while further tweaks were not encouraged. Now, reminded and inspired by @kik's #51 and by this community's prompt response to it and willingness to tweak ProgPoW to fix it, I present another related yet very different attack:
While mining ProgPoW with a large cluster, maintain a cluster-wide cache of mappings from 64-bit seed to 256-bit mix digest (immediate result of the memory-hard computation). This cache can be emptied on every new period (10 blocks) and filled during the period, maybe for up to a pre-defined maximum size (as memory permits) such as 2^32 entries (128 GiB).
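As a quick check of the sizing above, here is a minimal sketch of the arithmetic. The 128 GiB figure counts only the 256-bit digests; the `store_seed` option is my own assumption, showing what happens if each entry also keeps its 64-bit seed as a key.

```python
DIGEST_BITS = 256  # mix digest size, from the text
SEED_BITS = 64     # seed size, from the text


def cache_size_gib(entries: int, store_seed: bool = False) -> float:
    """Return the cache size in GiB for the given number of entries.

    store_seed is a hypothetical option: it adds the 64-bit seed to each
    entry, which the 128 GiB figure in the text does not include.
    """
    bits_per_entry = DIGEST_BITS + (SEED_BITS if store_seed else 0)
    return entries * bits_per_entry / 8 / 2**30


print(cache_size_gib(2**32))                   # 128.0
print(cache_size_gib(2**32, store_seed=True))  # 160.0
```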
Once the cache fill is above a threshold, each cluster node can reasonably start to utilize its attached Keccak ASICs to search the nonce space until a previously cached seed is seen. For example, with a cache fill of 2^32 entries, it'd take around 2^32 Keccak computations until a cache hit. With a large enough cache and with enough Keccak ASICs working in parallel, this might be cheaper than doing a mix digest computation for a previously unseen seed (although the node's GPUs would also continue working on new seeds in parallel).
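To make the flow concrete, here is a toy sketch of the two phases, scaled down to a 16-bit seed so that it runs instantly. Everything in it is a stand-in of my own: `toy_seed` uses SHA3-256 in place of ProgPoW's initial Keccak pass, `toy_mix_digest` replaces the memory-hard computation with a plain hash, and the function names are hypothetical.

```python
import hashlib

SEED_BITS = 16  # toy size so the search finishes quickly; the real seed is 64 bits


def toy_seed(header: bytes, nonce: int) -> int:
    """Stand-in for the initial Keccak pass, truncated to SEED_BITS."""
    d = hashlib.sha3_256(header + nonce.to_bytes(8, "little")).digest()
    return int.from_bytes(d[:8], "little") >> (64 - SEED_BITS)


def toy_mix_digest(seed: int) -> bytes:
    """Stand-in for the memory-hard seed -> 256-bit mix digest step."""
    return hashlib.sha3_256(b"mix" + seed.to_bytes(8, "little")).digest()


def fill_cache(header: bytes, nonces) -> dict:
    """Phase 1: the 'GPUs' compute mix digests and record them by seed."""
    cache = {}
    for nonce in nonces:
        seed = toy_seed(header, nonce)
        cache[seed] = toy_mix_digest(seed)
    return cache


def search_for_hit(header: bytes, cache: dict, start_nonce: int):
    """Phase 2: the 'Keccak ASICs' scan nonces until a cached seed is seen."""
    nonce = start_nonce
    tries = 0
    while True:
        tries += 1
        seed = toy_seed(header, nonce)
        if seed in cache:
            # The mix digest is reused without redoing the memory-hard work.
            return nonce, cache[seed], tries
        nonce += 1


header = b"toy-header"
cache = fill_cache(header, range(256))
nonce, digest, tries = search_for_hit(header, cache, start_nonce=256)
print(f"cache entries: {len(cache)}, hit at nonce {nonce} after {tries} tries")
```

With about 2^8 cached seeds out of 2^16 possible, a hit is expected after roughly 2^8 tries, mirroring the 2^32 out of 2^64 case at a much smaller scale.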
Now, what cache size would make this attack worthwhile? We'd need to match (and then exceed) a GPU's hash rate with our rate of finding previously cached seeds. With a cache of 2^32 entries, and thus around 2^32 Keccak computations needed per cache hit, matching a GPU's e.g. 2^24 (16.8M) hashes/s requires 2^56 Keccak computations per second. Can an ASIC with enough Keccak cores (perhaps across many chips) to accomplish this potentially consume less power than a GPU does? Probably not.
Can a cache much larger than 2^32 reasonably be maintained? Probably yes, distributed across the cluster nodes' RAM. Then fewer Keccak cores would be needed, and their energy efficiency vs. GPUs would be better.
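The break-even arithmetic above can be sketched as follows. The 2^24 hashes/s GPU rate is the example figure from the text; the larger cache sizes in the loop are my own illustrative picks, not measurements or proposals.

```python
import math

SEED_BITS = 64  # seed size, from the text


def expected_keccak_per_hit(cache_entries: int) -> float:
    """Expected Keccak computations until a nonce maps to a cached seed."""
    return 2**SEED_BITS / cache_entries


def required_keccak_rate(cache_entries: int, gpu_hashrate: float) -> float:
    """Keccak computations per second needed to match one GPU's hash rate."""
    return expected_keccak_per_hit(cache_entries) * gpu_hashrate


GPU_HASHRATE = 2**24  # example figure from the text (about 16.8M hashes/s)

for log2_cache in (32, 36, 40):  # 36 and 40 are illustrative assumptions
    rate = required_keccak_rate(2**log2_cache, GPU_HASHRATE)
    print(f"cache 2^{log2_cache}: 2^{math.log2(rate):.0f} Keccak/s")
# cache 2^32: 2^56 Keccak/s
# cache 2^36: 2^52 Keccak/s
# cache 2^40: 2^48 Keccak/s
```

Each doubling of the cache halves the Keccak rate needed to match one GPU, which is why the larger distributed cache matters.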
Can a cluster node's Keccak ASICs quickly determine if a seed is (likely) cached? Probably yes, with a probabilistic data structure such as a Bloom filter in RAM closely attached to the ASICs. (They would not need to wait for this check result, but would proceed to test more nonces. There would need to also be locally stored queues of seed values to check.) This RAM could be many times smaller than the cache itself (perhaps 10 to 20 bits per seed, not 256), but it would nevertheless be the limiting factor on the total size of the cache.
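For readers unfamiliar with the data structure, here is a minimal software sketch of a Bloom filter over 64-bit seeds. It only illustrates the membership check and is not a hardware design: the class, its parameters, and the use of BLAKE2b with double hashing to derive bit positions are all my own assumptions.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter for 64-bit seeds (illustrative, not a hardware model)."""

    def __init__(self, capacity: int, bits_per_item: int = 16):
        self.num_bits = capacity * bits_per_item
        # Standard choice of hash count: k = (bits per item) * ln 2.
        self.num_hashes = max(1, round(bits_per_item * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, seed: int):
        # Derive k bit positions from one digest via double hashing.
        d = hashlib.blake2b(seed.to_bytes(8, "little"), digest_size=16).digest()
        h1 = int.from_bytes(d[:8], "little")
        h2 = int.from_bytes(d[8:], "little") | 1  # keep the stride odd
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, seed: int) -> None:
        for p in self._positions(seed):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, seed: int) -> bool:
        # False means definitely not cached; True means probably cached.
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(seed))


# Usage: 1000 "cached" seeds at 16 bits each, within the 10 to 20 bit range above.
MULT = 0x9E3779B97F4A7C15  # arbitrary odd constant to spread the test seeds
bf = BloomFilter(capacity=1000, bits_per_item=16)
cached_seeds = [(i * MULT) % 2**64 for i in range(1000)]
for s in cached_seeds:
    bf.add(s)

print(all(bf.might_contain(s) for s in cached_seeds))  # True: no false negatives
```

A positive answer would still need to be confirmed against the full cluster-wide cache, since a Bloom filter can report false positives but never false negatives.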
Can the Bloom filter RAM have enough throughput to accommodate the many candidate seeds coming out of Keccak every second? That's probably the worst bottleneck. I guess it'd take a tricky ASIC design with Keccak+SRAM cores (distributed RAM), a NoC, and an inter-chip mesh to implement that.
Overall, this doesn't look practical yet. But if we want to have a better safety margin and greater confidence with respect to attacks like this, we need to move to larger seeds.
My bigger concern isn't this attack per se, but rather that this line of thought could become the missing piece of the puzzle in making some other yet-undiscovered attack practical.