Inserting data into a RandomAccessFile and updating the index

I've got a RandomAccessFile in Java where I manage some data. Simplified: at the start of the file there is an index (an 8-byte long value per dataset, representing the offset where the real data can be found).

So if I want to know where to find the data of dataset no. 3, for example, I read 8 bytes at offset (2*8), since indexing starts at 0.

A dataset itself consists of 4 bytes representing the size of the dataset, followed by all the bytes belonging to the dataset.
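For illustration, the read path looks roughly like this (DatasetFileSketch, MAX_DATASETS and readDataset are just placeholder names for the sake of the sketch, not my real code):

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Illustrative sketch of the layout described above; all names here are made up.
class DatasetFileSketch {
    static final int MAX_DATASETS = 100;               // fixed index capacity (assumption)
    static final long DATA_START = MAX_DATASETS * 8L;  // data region begins right after the index

    // Read the dataset at index n (0-based): look up its offset in the index,
    // then read the 4-byte size followed by the payload bytes.
    static byte[] readDataset(RandomAccessFile raf, int n) throws IOException {
        raf.seek(n * 8L);            // index entry for dataset n
        long offset = raf.readLong();
        if (offset == 0) {
            return null;             // 0 means no data has been stored for this slot yet
        }
        raf.seek(offset);
        int length = raf.readInt();  // 4-byte size prefix
        byte[] data = new byte[length];
        raf.readFully(data);
        return data;
    }
}
```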

That works fine as long as I always rewrite the whole file.

It's important to note that dataset no. 3 could have been written as the first entry in the file, so the index is ordered but the data itself is not.

When I insert a new dataset, I always append it to the end of the file. The number of datasets that can be in one file is limited: if I can store 100 datasets in the file, there will always be 100 entries in the index. If the offset read from the index for a dataset is 0, the dataset is new and will be appended to the file.
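Appending a brand-new dataset is then roughly this (again just a sketch with made-up names, assuming the 100-entry index region is zero-filled when the file is created):

```java
// Sketch (same illustrative 100-entry index layout as above): store the dataset
// at index n for the first time by appending it to the end of the file and
// recording its offset in the index.
static void appendDataset(RandomAccessFile raf, int n, byte[] data) throws IOException {
    long dataStart = 100 * 8L;                        // data begins after the 100 index entries
    long offset = Math.max(raf.length(), dataStart);  // append behind everything already there
    raf.seek(offset);
    raf.writeInt(data.length);                        // 4-byte size prefix
    raf.write(data);
    raf.seek(n * 8L);                                 // point index entry n at the new record
    raf.writeLong(offset);
}
```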

But there's one case that isn't working for me yet. If I read dataset no. 3 from the file, add some data to it in my application, and then want to update it in the file, I have no idea how to do this.

If it has the same length as before, I can simply overwrite the old data. But if the new dataset has more bytes than the old one, I'll have to move all the data in the file that comes after this dataset and update the index entries for those datasets.

Any idea how to do that? Or is there maybe a better way to store these datasets in a file?

PS: Yes, of course I thought of using a database, but that isn't applicable for my project. I really do need simple files.

Xhosa answered 10/10, 2011 at 8:43

You can't easily insert data into the middle of a file. You'd basically have to read all the remaining data, write the "new" data, and then rewrite the "old" data. Alternatively, you could invalidate the old "slot" (potentially allowing it to be reused later) and just write the whole new record to the end of the file. Your file format isn't entirely clear to me, to be honest, but fundamentally you need to be aware that you can't insert (or delete) in the middle of a file.
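For what it's worth, a rough sketch of that second option against the index-plus-records layout from the question (updateDataset is an invented name, and how the dead space gets tracked is left open):

```java
// Sketch of "invalidate the old record and append the new one": the index entry
// is simply repointed at a fresh copy written at the end of the file; the old
// bytes become dead space until a compaction pass rewrites the file.
static void updateDataset(RandomAccessFile raf, int n, byte[] newData) throws IOException {
    long newOffset = raf.length();    // append the new version at the end
    raf.seek(newOffset);
    raf.writeInt(newData.length);
    raf.write(newData);
    raf.seek(n * 8L);                 // index entry n now points at the new copy
    raf.writeLong(newOffset);
    // The old record is still physically in the file; reclaiming (or reusing)
    // that space would need a separate free list or compaction step.
}
```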

Congratulant answered 10/10, 2011 at 8:47
Okay, seems like I will keep the data in memory as long as possible and then always rewrite the whole file. - Xhosa
@Chris: Well, you can always do it without using much memory by copying the old file to a new file chunk by chunk. - Congratulant
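A possible sketch of that rewrite, copying record by record into a fresh file so only one dataset is in memory at a time (the method name and the 100-entry index size are assumptions carried over from the question):

```java
// Sketch: copy every live dataset from src into dst, building a fresh index as
// we go. Dead space left behind by earlier updates is dropped in the process.
static void compact(RandomAccessFile src, RandomAccessFile dst) throws IOException {
    final int maxDatasets = 100;              // index capacity (assumption)
    long writeOffset = maxDatasets * 8L;      // data region of the new file starts after its index
    for (int n = 0; n < maxDatasets; n++) {
        src.seek(n * 8L);
        long offset = src.readLong();
        long newIndexValue = 0;               // 0 = slot stays empty
        if (offset != 0) {
            src.seek(offset);
            int length = src.readInt();
            byte[] data = new byte[length];   // one record at a time, not the whole file
            src.readFully(data);
            dst.seek(writeOffset);
            dst.writeInt(length);
            dst.write(data);
            newIndexValue = writeOffset;
            writeOffset += 4 + length;
        }
        dst.seek(n * 8L);
        dst.writeLong(newIndexValue);
    }
}
```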

I've got a RandomAccessFile in Java where I manage some data.

Stop right there. You have a file. You are presently accessing it via RandomAccessFile in Java. However, your entire question relates to the file itself, not to RandomAccessFile or Java. You have a major file-design problem, as you are assuming facilities, like inserting into the middle of a file, that don't exist in any filesystem I have used since about 1979.

Lennalennard answered 10/10, 2011 at 9:24
Actually, I was not assuming that the filesystem can insert data in the middle of a file. I was just searching for a suitable way to do this myself (reading the rest of the file and rewriting it again). I was hoping to find a way to maintain the index of the file when I do that. - Xhosa

As the others have answered, there's no real way to grow or shrink a dataset in the middle of the file without rewriting everything that follows it. There are some workarounds, though, and maybe one of them will work for you:

  • Limit all datasets to a fixed length, so an update can always overwrite in place.
  • Delete by clearing or removing the index entry, and add by always appending to the end of the file. Update by removing the old dataset and appending the new one to the end if it is longer. Compact the file from time to time by actually deleting the "ignored" datasets and moving all valid datasets together (rewriting everything).
  • If you can't limit datasets to a fixed length and you intend to make a dataset longer when updating it, you can also leave a pointer at the end of the first part of the dataset and continue it later in the file. That gives you a structure like a linked list (see the sketch after this list). If a lot of editing takes place, it also makes sense here to rearrange and compact the file from time to time.
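As a rough illustration of that linked-list idea, assume a block layout of [4-byte length][payload][8-byte offset of next block, 0 = none]; the method names and this exact layout are mine, not the asker's format:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

class ChainedDatasetSketch {
    // Read a chained dataset by following the "next" pointers and concatenating the parts.
    static byte[] readChained(RandomAccessFile raf, long firstBlockOffset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long offset = firstBlockOffset;
        while (offset != 0) {
            raf.seek(offset);
            int length = raf.readInt();
            byte[] part = new byte[length];
            raf.readFully(part);
            out.write(part);
            raf.seek(offset + 4 + length);   // the 8-byte "next" pointer sits after the payload
            offset = raf.readLong();
        }
        return out.toByteArray();
    }

    // Grow a dataset by appending a continuation block at the end of the file and
    // patching the "next" pointer of the current last block to reach it.
    static void appendToChain(RandomAccessFile raf, long lastBlockOffset, byte[] extra) throws IOException {
        long newBlock = raf.length();
        raf.seek(newBlock);
        raf.writeInt(extra.length);
        raf.write(extra);
        raf.writeLong(0L);                   // the new block has no successor yet
        raf.seek(lastBlockOffset);
        int lastLength = raf.readInt();      // re-read the old block's length to locate its pointer
        raf.seek(lastBlockOffset + 4 + lastLength);
        raf.writeLong(newBlock);             // link the old tail to the new block
    }
}
```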

Most of these solutions have some data overhead, but file size is usually not the problem, and as mentioned you can have some method "clean it up" periodically.

PS: I hope it's ok to answer such old questions - I couldn't find anything about it in the help center and I'm relatively new here.

Diba answered 15/5, 2014 at 12:49
