Description
If we send input data to a node that we know will execute a thunk in the future, before that thunk is ready to execute, and cache the data on that node, then the thunk can begin executing much more quickly (assuming network transfers are fully asynchronous and don't impede other thunk executions). We should allow the scheduler to do a small amount of this "prefetching" when a thunk has a large amount of input data associated with it.
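For illustration, a minimal sketch of what the asynchronous push might look like. All names here (`Worker`, `receive`, `prefetch`, `PREFETCH_THRESHOLD`) are hypothetical, not an existing API, and the background thread just stands in for a fully asynchronous network transfer:

```python
import threading
from dataclasses import dataclass, field

PREFETCH_THRESHOLD = 1 << 20  # assumed cutoff in bytes; tune per workload

@dataclass
class Worker:
    cache: dict = field(default_factory=dict)  # worker-local input cache

    def receive(self, key, data):
        # Stand-in for an asynchronous network transfer into the cache.
        self.cache[key] = data

def prefetch(worker, inputs):
    """Push large inputs to `worker` before its thunk is ready to run."""
    for key, data in inputs.items():
        # Only ship inputs big enough to be worth transferring early
        # (`data` assumed bytes-like here for sizing purposes).
        if len(data) >= PREFETCH_THRESHOLD:
            threading.Thread(
                target=worker.receive, args=(key, data), daemon=True
            ).start()
```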
We'll need to be able to pre-allocate a processor for each thunk, implement a worker-local cache of received inputs, and make thunks check this cache for their inputs before moving them. We should also start modeling memory availability on each worker, along with the memory cost of each input and the estimated maximum memory allocation of each thunk, so that prefetching doesn't exhaust a worker's memory.
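A sketch of the execution-side cache check and a simple per-worker memory model, again with hypothetical names (`WorkerMemoryModel`, `fetch_input`, `move_fn`) rather than anything that exists today:

```python
from dataclasses import dataclass

@dataclass
class WorkerMemoryModel:
    capacity: int       # total memory available on this worker, in bytes
    reserved: int = 0   # bytes held by cached inputs and running thunks

    def can_prefetch(self, input_bytes: int, est_peak_alloc: int) -> bool:
        # Admit a prefetch only if the input plus the thunk's estimated
        # peak allocation fits within the remaining memory budget.
        return self.reserved + input_bytes + est_peak_alloc <= self.capacity

def fetch_input(cache: dict, key, move_fn):
    """Check the worker-local cache before moving an input over the network."""
    if key in cache:
        return cache[key]   # already prefetched: skip the transfer
    value = move_fn(key)    # fall back to the usual synchronous move
    cache[key] = value
    return value
```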