A map-reduce architecture is a distributed divide-and-conquer approach to data processing. The map is the transformation, and the reduce is the combination of the transformed data. The classic merge sort is an example: divide, sub-process parts, and merge together. In the real world, the transformation might be to divide based on the presence of a feature, e.g. a search term, in the input, and build separate indexes in the map stage. Reduce may do no more than detect termination.
In order to guarantee termination, algorithms of this class are always described as a tree of computation such that an inductive proof of termination is possible. For example, sort(N) = sort(N/2) + sort(N/2) + merge(N), and sort(1) and merge(N) are defined. This structure makes scheduling computation and detecting termination quite easy.
So what happens when computation does not form a tree or DAG? What happens if the structure of the computation contains loops for which an inductive proof is not possible? What happens if insufficient information about the structural decomposition of the algorithm is available to the compiler? A map (e.g. in a cycle) must terminate when it is no longer reachable, but this reachability can no longer be proved by passing an end-of-stream signal, or returning from a function. Without a more advanced static analysis of the program structure to know when a map sub-process must be terminated, we cannot schedule a reduce.
There is an algorithm which can still detect termination if it occurs, furthermore schedule the reduce operations at the earliest provably correct point without waiting for a global condition, and so forth. Intuitively, we felt that the answer existed, and we used a fairly heavyweight logic solver to analyse specifications of the problem until we found a correct formulation.
While we could generate an optimal scheduling of any program using the logic solver, I then spent a little while converting the logical specification into an imperative algorithm, which we now use to schedule programs.
I suspect my next exercise will be in applied calculus, something I have considered little since my first year. I suspect further that this post does not make sense, so I will restrict the readership so as to reduce my own embarassment.
So, did you see the game last night?