What is scratch space / scratch filesystem in HPC?

Asked 21/1, 2015 at 11:32 Answered 22/1, 2015 at 4:8

Solved filesystems nfs hpc supercomputers lustre

I am studying HPC applications and parallel filesystems, and I came across the terms scratch space and scratch filesystem.

I cannot visualize where this scratch space exists. Is it on the compute node as a mounted filesystem such as /scratch, or on the main storage system?

What are its contents?

Is scratch space independent on each compute node, or can two or more nodes share a single scratch space?

Say I have a file 123.txt that I want to process in parallel. Will the scratch space contain parts of this file, or will the whole file be copied?

I am confused, and I could not find a clear description anywhere on Google. Please point me to one.

Thanks a lot.

Belia answered 21/1, 2015 at 11:32 Comment(0)

It all depends on how the cluster was set up and what the users need. When you are given access to a cluster, you should also be given some information about how it is meant to be used, which should answer most of your questions.

On one of the clusters I work with, NFS is used for long-term storage and some Lustre space is available as job scratch space. Both the NFS and Lustre filesystems are visible to all of the nodes. Each node also has some local scratch space that only that node can see.

If you want your job to work on 123.txt in parallel, you can copy 123.txt to the shared scratch space (Lustre), or you can copy it to the local scratch space of each of your nodes from within your job file:

for node in $(sort -u "$PBS_NODEFILE"); do scp 123.txt "$node:/scratch/"; done

Once each node has a copy, you can run your job. When the job is done, you need to copy your results to persistent storage, since clusters often run scripts that clean up scratch space.
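The stage-in, compute, stage-out pattern described above might be sketched roughly as follows. This is only an illustration under stated assumptions: the function name stage_job, the line-count "computation", and every path are hypothetical placeholders, and a real job would use whatever scratch location and scheduler variables the site documents.

```shell
#!/bin/sh
# Sketch only: stage_job and all paths are hypothetical, not a site standard.
# usage: stage_job INPUT_FILE RESULT_DIR [SCRATCH_BASE]
stage_job() {
    input=$1
    result_dir=$2
    scratch_base=${3:-${TMPDIR:-/tmp}}   # assumed default if no scratch given
    name=$(basename "$input")

    # Private per-job directory so concurrent jobs do not collide
    workdir=$(mktemp -d "$scratch_base/job.XXXXXX") || return 1

    # Stage in: copy the input from persistent storage to scratch
    cp "$input" "$workdir/" || return 1

    # Placeholder for the real computation: count the lines of the input
    wc -l < "$workdir/$name" | tr -d ' ' > "$workdir/$name.count"

    # Stage out: copy the result back to persistent storage
    mkdir -p "$result_dir" &&
        cp "$workdir/$name.count" "$result_dir/"
    status=$?

    # Clean up scratch ourselves rather than waiting for a purge script
    rm -rf "$workdir"
    return "$status"
}
```

A call such as stage_job 123.txt "$HOME/results" /scratch would then leave 123.txt.count in the results directory, assuming /scratch exists and is writable on the node.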

Peekaboo answered 21/1, 2015 at 14:36 Comment(5)

No, I have been told to do my own study of this terminology, hence the question. So, as you said, the job scratch space is available as a Lustre file system, i.e. object-based storage. This scratch space can be on anything, such as magnetic tape, HDD media, etc. Similarly, the local scratch space can be a disk drive or a PCI-based SSD. Is my understanding correct? – Belia 21/1, 2015 at 17:38

I think the problem, as you found with your Google search, is that these terms are not well defined. A general definition would be that scratch filesystems, space, or partitions are used for short-term storage for a single job or a set of computational jobs. They often have the benefit of being faster than regular storage, of offering a larger pool of space than you would normally have access to, or both. The people who use, and most likely pay for, the cluster will determine whether they need fast or large scratch space, and whether it needs to be shared between nodes, based on the applications they run. – Peekaboo 21/1, 2015 at 19:43

Is Lustre a completely new file system, or is it a modified version of ext3 or ext4? – Belia 12/2, 2015 at 11:17

While Lustre can use ext4 or ZFS on the back end, they are different technologies. Ext4 stores files on block devices (hard, flash, or floppy drives), while Lustre is a parallel network file system that lets clients read and write files across a network connection. Because it is parallel, reads and writes are striped (split) across the object storage targets (OSTs) in a Lustre system. This should give you better performance than something like NFS, where a single server is normally the bottleneck. – Peekaboo 12/2, 2015 at 13:37
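To illustrate the striping idea in the comment above, here is a small sketch of round-robin (RAID-0 style) striping arithmetic. The function name ost_for_offset is hypothetical, and the result is only the stripe index within a single file's layout, not an actual OST number on any real system; the 1 MiB stripe size and stripe count of 4 used in the note after it are assumed example values.

```shell
#!/bin/sh
# Sketch only: which stripe (0 .. STRIPE_COUNT-1) holds a given byte offset
# when a file is split into fixed-size chunks handed out round-robin.
# usage: ost_for_offset OFFSET_BYTES STRIPE_SIZE_BYTES STRIPE_COUNT
ost_for_offset() {
    offset=$1
    stripe_size=$2
    stripe_count=$3
    # Chunk number is offset / stripe_size; chunks wrap around the stripes
    echo $(( (offset / stripe_size) % stripe_count ))
}
```

With a 1 MiB stripe size and 4 stripes, the first MiB of a file lands on stripe 0, the second on stripe 1, and the fifth wraps back to stripe 0, so a large sequential read can be served by several storage targets at once.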

OK, so Lustre is the equivalent of the Linux VFS, and the underlying file systems can be ext4 or ZFS? – Belia 13/2, 2015 at 4:13

There are a lot of different ways to think about or deploy scratch space or a scratch file system.

Let's say you have a cluster of Linux nodes, and these nodes all have a hard disk. You could imagine a /scratch space local to each node. Since the OS image is going to be relatively small, and one cannot procure anything smaller than a terabyte drive nowadays, you end up with close to a terabyte of storage for the node to use.

What would you do with this node-local storage? Oh, lots of things. Scalable Checkpoint-Restart. Local out-of-core operations.

When I first started playing with clusters, it seemed like a good idea to gang all this unused space into a parallel file system. PVFS worked really well for that purpose.

That lets me segue to a /scratch parallel file system available to all nodes. There is a technology component to this (which parallel file system will a site deploy?), but there is also a policy component: how long will data on this file system be retained? Is it backed up? /scratch often implies that files are not backed up and are in fact purged after some period of not being accessed (typically two weeks).
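A purge policy like the one described can be previewed with find. This is a minimal sketch, assuming access times are tracked on the file system; the function name purge_candidates and the 14-day default are illustrative, and actual purge rules vary from site to site.

```shell
#!/bin/sh
# Sketch only: list regular files under DIR whose last access is more than
# DAYS days old (find rounds ages down to whole days).
# usage: purge_candidates DIR [DAYS]
purge_candidates() {
    dir=$1
    days=${2:-14}    # assumed default, matching the "two weeks" above
    find "$dir" -type f -atime +"$days" -print
}
```

Running purge_candidates on your own scratch directory before a long break shows which files would be at risk under such a policy, so you can copy them to persistent storage first.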

Iquique answered 22/1, 2015 at 4:8 Comment(0)
