What is lazy space allocation in Google File system
Asked Answered
K

4

18

I was going through google file system (GFS) paper, It mentions that GFS uses Lazy space allocation to reduce internal fragmentation.
Can someone explain, how lazy space reduces internal fragmetation?

Source: http://research.google.com/archive/gfs-sosp2003.pdf

Ken answered 7/8, 2013 at 17:11 Comment(2)
possible duplicate of what is lazy allocation?Paraformaldehyde
If you find one of replies useful, please accept it by ticking the mark left to the reply text.Argybargy
B
12

With lazy space allocation, the physical allocation of space is delayed as long as possible, until data at the size of the chunk size (in GFS's case, 64 MB according the 2003 paper) is accumulated. In other words, the decision process that precedes the allocation of a new chunk on disk, is heavily influenced by the size of the data that is to be written. This preference of waiting instead of allocating more chunks based on some other characteristic, minimizes the chance of internal fragmentation (i.e. unused portions of the 64 MB chunk).

In the Google paper, it also says: "Most chunks are full because most files contain many chunks, only the last of which may be partially filled." So, the same approach is applied to file creation.

It is analogous to this: http://duartes.org/gustavo/blog/post/how-the-kernel-manages-your-memory

Ballottement answered 12/2, 2014 at 19:53 Comment(0)
R
1

I have not read the entire paper..but I am hoping that the following fragment should help you in a small way.

The first question I would ask is: what is the effect of having large block sizes in a file system? Let us say that FS block size is 64MB. Good news is that we write in good contiguous chunks to hard disks (more data written per seek), less metadata to keep in indirect blocks, etc. Bad news is internal fragmentation..if the file is 1MB, but minimum block size is 64MB, there is Internal fragmentation of 63MB. So, how to get the good news and avoid the bad news?

One way is to do lazy space allocation OR delayed space allocation. Here, we keep the block size small (say 1MB), but we write a big chumk of data i.e. many 1MB chunks together when we write to disk. This way, we get the goodness of large block writes. Note that this means that we write to an incore buffer but tell the write() sys call that it is done writing to disk...just like writing to the buffer cache.

NOTE: When the "time" has come to do the real block allocation, we need to be guaranteed space on disk. So, delayed block allocation => space reservation is done at the time of write, but space allocation is done at a later time when enough data blocks have accumulated in-core.

Rexferd answered 7/8, 2013 at 18:15 Comment(2)
but the block size in GFS is not small (1 MB). Actually it says "We have chosen 64 MB, which is much larger than typical file system blocksizes. Lazy space allocation avoids wasting space due to internal fragmentation". So here despite using greater block size, internal fragmentaion is reduced, how?. Maybe i am missing any concept here.Ken
hello, does this means that for a 1kb file the space made unavailable on disk is 64MB?Overawe
B
0

Data is first written into a buffer. So, instead of allocating memory the moment the file is created, they are waiting till the actual write occurs. As in XFS http://en.wikipedia.org/wiki/XFS#Delayed_allocation

Baalman answered 3/1, 2014 at 22:20 Comment(0)
I
0

You don't have to fix the file size on creating. And you can append it to a larger file. You can reference this.

Inexplicable answered 30/11, 2014 at 8:28 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.