
I was thinking that as I was typing my comment. Another solution, which S3 uses, is to ship hard disks by courier. I guess the real metric here is cost per GB transferred per unit time, say $/GB-hr.
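A back-of-envelope sketch of that $/GB-hr metric, comparing shipping a disk against a network transfer. All figures here (courier fee, link speed, egress price) are illustrative assumptions, not real carrier or cloud pricing:

```python
# Rough comparison of the $/GB-hr metric for moving 4 TB of data.
# Every number below is an illustrative assumption, not real pricing.

def cost_per_gb_hr(total_cost_usd, gigabytes, hours):
    """Cost per GB moved, amortized over the transfer time."""
    return total_cost_usd / (gigabytes * hours)

# Shipping a 4 TB drive overnight: assume a $50 courier fee, 24 h in transit.
ship = cost_per_gb_hr(50.0, 4000, 24)

# Same 4 TB over a 100 Mbit/s link: assume $0.05/GB egress.
net_hours = 4000 * 8000 / 100 / 3600  # Mbit at 100 Mbit/s, converted to hours
net = cost_per_gb_hr(0.05 * 4000, 4000, net_hours)

print(f"ship: {ship:.2e} $/GB-hr  network: {net:.2e} $/GB-hr")
print(f"network transfer takes {net_hours:.0f} hours")
```

Under these assumed numbers the two come out surprisingly close on $/GB-hr, but the disk wins outright on raw hours in transit, which is the usual "station wagon full of tapes" point.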


At what point does it become ridiculous to move the data, which may be measured in TB or PB, when the algorithm itself would be measured in KB or MB?


Hush. Not in front of the VCs.


In clusters working on large amounts of in-memory data, the approach is often to load the data once, then move the code (e.g. a Java class implementing some data-processing interface) to the data as required, rather than move the data to the code.
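A minimal sketch of that "move the code to the data" pattern: each worker pins a data partition in memory and accepts small task objects to run against it. The `Worker`/`submit` names are made up for illustration, not any real framework's API:

```python
# Sketch of moving code to data: the partition stays resident on its
# worker, and only the (tiny) task travels over the wire.

class Worker:
    def __init__(self, partition):
        self.partition = partition  # data stays in memory on this node

    def submit(self, task):
        # The task (a few KB of code) comes to the data, not vice versa.
        return task(self.partition)

# Four workers, each holding a 1000-element partition of 0..3999.
workers = [Worker(list(range(i * 1000, (i + 1) * 1000))) for i in range(4)]

# Ship a small aggregation function to every worker, combine the results.
total = sum(w.submit(lambda part: sum(part)) for w in workers)
print(total)  # sum of 0..3999 = 7998000
```

In a real cluster the task would be serialized (e.g. a Java class sent to the JVM holding the partition), but the asymmetry is the same: kilobytes of code versus gigabytes of data.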


There is always stuff that goes the other way, though, like how SETI@home does FFTs, which are computationally expensive and benefit from a distributed system even though the file size is quite small.


Yes, BOINC projects are cases where it is not ridiculous to move the data, because computing power is the scarce resource and the work units are typically only hundreds of kilobytes to single-digit megabytes.
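A small illustration of why a tiny work unit can still be compute-heavy: a naive discrete Fourier transform is O(n²), so the arithmetic grows much faster than the input. The numbers are illustrative; real BOINC clients use far faster FFT libraries, not this textbook DFT:

```python
# Naive O(n^2) DFT, to show compute cost dwarfing data size.
import cmath

def naive_dft(samples):
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

# A "work unit" of 256 float samples is only ~2 KB on the wire,
# but the naive transform performs 256^2 = 65,536 complex multiply-adds.
signal = [float(i % 7) for i in range(256)]
spectrum = naive_dft(signal)
print(len(spectrum))  # 256 output bins
```

So shipping the work unit costs almost nothing compared with the cycles spent crunching it, which is exactly the regime where moving data to compute makes sense.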

