I've been using Clojure for some fairly large-scale data processing tasks (definitely gigabytes of data, typically lots of largish Java arrays stored inside various Clojure constructs/STM refs).
As long as everything fits in available memory, you shouldn't have a problem with extremely large amounts of data in a single ref. The ref itself applies only a small fixed amount of STM overhead that is independent of the size of whatever is contained within it.
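As a rough illustration of the point about refs, here is a minimal sketch. The names `dataset`, `:samples` and `:label` are made up for the example, and the array is kept small so it runs quickly. It shows that a transaction replaces the value the ref points to without copying the large array stored inside it.

```clojure
;; Hypothetical example: a ref holding a map that contains a large Java array.
(def dataset (ref {:samples (double-array 1000000)}))

;; Keep a handle on the array so we can compare it after the transaction.
(def samples-before (:samples @dataset))

;; The transaction installs a new map in the ref; the array itself is untouched.
(dosync
  (alter dataset assoc :label "run-1"))

;; Both the old and the new map point at the very same array object.
(identical? samples-before (:samples @dataset))
;; => true
```

The cost of the transaction here depends on the small map being updated, not on the size of the array it refers to.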
A nice extra bonus comes from the structural sharing that is built into Clojure's persistent data structures (maps, vectors etc.) - you can derive a modified version of a 10GB data structure by changing one element anywhere in it, and the old and new versions together will only require a fraction more than 10GB, because they share everything that did not change. This is very helpful, particularly if you consider that due to STM/concurrency you will potentially have several different versions of the data alive at any one time. Note that the sharing applies to the Clojure collections themselves: a Java array stored inside one is a single mutable object that every version points to, so it is neither copied nor versioned.
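Structural sharing can be observed directly at the REPL. The following is a small sketch with made-up names (`big` and `big2`) and a far smaller structure than 10GB. It updates one element of a nested vector and checks which parts the old and new versions have in common.

```clojure
;; A vector of 1000 rows, each row a vector of 100 numbers.
(def big (mapv (fn [_] (vec (range 100))) (range 1000)))

;; "Change" one element; this returns a new version and leaves `big` as it was.
(def big2 (assoc-in big [500 3] :changed))

;; Untouched rows are the same objects in both versions, not copies.
(identical? (big 0) (big2 0))
;; => true

;; Only the row on the path to the change was rebuilt.
(identical? (big 500) (big2 500))
;; => false

;; The original still sees the old value.
(get-in big [500 3])
;; => 3
```

Only the nodes on the path from the root to the changed element are rebuilt, which is why the two versions together take little more memory than one.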