I also just came across the sentence “Moving Computation is Cheaper than Moving Data” (from the Apache Hadoop documentation) and after some reading I think this refers to the principle of data locality.
Data locality is a strategy for task scheduling aimed at optimizing performance based on the observation that moving data across a network is costly, so when choosing which task to prioritize whenever a computing/data node is free, preference will be given to the task that's going to operate on the data in the free node or in its proximity.
This (from Delay Scheduling: A Simple Technique for Achieving
Locality and Fairness in Cluster Scheduling, Zaharia et al., 2010) explains it clearly:
Hadoop’s default scheduler runs jobs in FIFO order, with five priority levels. When the scheduler receives a heartbeat indicating that a map
or reduce slot is free, it scans through jobs in order of priority and submit time to find one with a task of the required type. For maps,
Hadoop uses a locality optimization as in Google’s MapReduce [18]: after selecting a job, the scheduler greedily picks the map task in
the job with data closest to the slave (on the same node if possible, otherwise on the same rack, or finally on a remote rack).
Note that the fact Hadoop replicates data across nodes increases fair scheduling of tasks (the higher the replication, the higher the probability of a task to have data on the next free node and hence get picked to run next).