How to prepend to a file (add at the top)
L

8

12

Imagine you have a file

sink("example.txt")
data.frame(a = runif(10), b = runif(10), c = runif(10))
sink()

and you want to add some header information, like

/* created on 31.3.2011 */
/* author */
/* other redundant information */

How would I add this "header"? Doing it manually seems trivial: hit Enter a few times, copy/paste or type the information, and you're done. Of course, in R, I could read in example.txt, create example2.txt, write the header information to it and then append the contents of example.txt.

I was wondering if there's another way of adding to a file from the "top". Other solutions (from C++ or Java...) are also welcome (I'm curious how other languages approach this problem).

Langston answered 31/3, 2011 at 13:21 Comment(2)
You cannot append to a file from the top; you must rewrite the entire file with the header included. - Keffer
I think this has nothing to do with the language (C++, Java, etc.) because it is limited by the file system, and I don't know of any that allows doing this natively. All filesystems allow changing data in the middle of a file or adding data at the end. I believe that your R approach is applicable to all programming languages. - Hurtful
S
7

In R there is no need to work with an extra file. You can just do:

writeLines(c(header,readLines(File)),File)

Still, the Linux shell is probably the better tool here, since R is not known for fast file reading and writing, especially as you have to read in the complete file first.

Example:

Lines <- c(
"First line",
"Second line",
"Third line")
File <- "test.txt"
header <- "A line \nAnother line \nMore line \n\n"

writeLines(Lines,File)
readLines(File)    

writeLines(c(header,readLines(File)),File)
readLines(File)
unlink(File)
Syck answered 31/3, 2011 at 13:42 Comment(1)
The solution provided overwrites the content instead of prepending the header to the lines. Am I missing something? I have posted a working version of your code in my answer. - Overstretch
S
6

This is easy in the Linux shell:

echo 'your additional header here' > tempfile
cat example.txt >> tempfile
mv tempfile example.txt
rm -f tempfile  # nothing is left to delete after mv; -f keeps this from failing
Secretin answered 31/3, 2011 at 13:26 Comment(6)
Which is the same as what Gareth said. - Secretin
@tgmath: no, it is not. Gareth suggests reading all the data (header + contents), while your solution is more efficient: you (or the OS) read only the contents. +1 - Hurtful
@Igor - what do you think cat does? Magic? It's a simple C program that reads the source file line by line and outputs it to stdout. In fact, Gareth's answer is going to be faster (provided a file fits in memory) since it can be optimized to read and write larger chunks of data rather than a line at a time. - Edieedification
@Brian: I didn't take tgmath's approach literally, and I know what the cat utility does and how it works. I just wanted to say that if you already had a header file and didn't need it after the merge, then appending (yes, cat contents >> header) would be faster. - Hurtful
For a one-line header: sed -i '1i your additional header' example.txt - Mauricio
Wouldn't the rm tempfile fail with "File not found" or equivalent? - Nereen
B
5

In any language there is ultimately only one solution, and that is to overwrite the whole file:

contents = readAllOf("example.txt")

overwrite("example.txt", header + contents )
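The pseudocode above could look roughly like this in C++. This is only a minimal sketch: `prepend_in_memory` is a hypothetical helper name, and it assumes the whole file fits comfortably in memory.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the answer): prepend `header` to the file at
// `path` by reading everything into memory and rewriting the whole file.
void prepend_in_memory(const std::string& path, const std::string& header) {
    // contents = readAllOf(path)
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path + " for reading");
    const std::string contents((std::istreambuf_iterator<char>(in)),
                               std::istreambuf_iterator<char>());
    in.close();

    // overwrite(path, header + contents): trunc discards the old bytes,
    // so every byte of the file is written again.
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot open " + path + " for writing");
    out << header << contents;
    out.flush();
    if (!out) throw std::runtime_error("write to " + path + " failed");
}
```

Calling `prepend_in_memory("example.txt", "/* author */\n")` would rewrite example.txt with the header line first. If the program dies between the truncation and the final write, the original contents are lost, which is one reason the temporary-file variants in the other answers are often preferred.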
Barilla answered 31/3, 2011 at 13:24 Comment(3)
Imagine a "header" of 10 gigabytes in size and "contents" of 100 bytes. Would it not be faster to append the contents to the existing header, in the manner described by tgmath? - Hurtful
tgmath's solution is this same solution, only implemented as a shell script. Note that the contents of the header still have to be copied to a tempfile and then appended to. My pseudocode does imply this is done in memory, but it could just as easily be done on disk. - Barilla
Generally speaking, yes. But when you don't need the header file after the merge is done, you can append the contents to the header and rename the header afterwards. Yes, I understand that in the case of a power failure in the middle of the operation the header will be lost, but sometimes performance is more important than reliability. - Hurtful
D
2

Either (A) read the file in, add your header before it and write it back out (as Gareth suggested), or (B) cache what you want to write to the file somewhere and only write it all out once you've generated your header.
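Option (B) might be sketched in C++ as follows. The helper name `write_with_header` and the "rows" header format are assumptions made up for illustration; the point is only that the body is buffered in memory so the header, which here depends on the body, can still be written first.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: buffer the body in memory, then write the header
// (which is only known once the body is complete) followed by the body.
void write_with_header(const std::string& path,
                       const std::vector<std::string>& rows) {
    std::ostringstream body;                 // cache instead of writing to disk
    for (const std::string& row : rows) body << row << '\n';

    std::ofstream out(path, std::ios::trunc);
    if (!out) throw std::runtime_error("cannot open " + path + " for writing");
    out << "/* rows: " << rows.size() << " */\n";  // header is written first
    out << body.str();                             // cached body follows
}
```

With this approach the file is written exactly once, so nothing ever has to be prepended afterwards.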

Discreditable answered 31/3, 2011 at 13:28 Comment(0)
U
2

In C++, if you're willing to get your hands dirty, you can take the following steps.

  1. Stream the new content into a temporary buffer (so you know the exact size of the new content)
  2. Resize the file (truncate(), ftruncate()) to the current size plus the size of the new content
  3. Map the whole file in
  4. memmove() the original contents (original file size bytes) to their new position, which is an offset equal to the new content size
  5. Copy the new data to position 0.

It's probably less effort to:

  1. Construct a new file and push the new content in
  2. Read the old file and push that in too
  3. Use an operating system call to move (rename) the new file over the old file
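The simpler three-step variant might look like this in C++17. It is a sketch with assumptions: `prepend_via_temp` is a hypothetical name, and the temporary file is simply the original name with ".tmp" appended, which is assumed to be unused and on the same filesystem as the original.

```cpp
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: build a new file (header + old contents), then rename
// it over the original.
void prepend_via_temp(const std::filesystem::path& path, const std::string& header) {
    std::filesystem::path tmp = path;
    tmp += ".tmp";  // assumed-free sibling name, e.g. example.txt.tmp

    {
        std::ifstream in(path, std::ios::binary);
        if (!in) throw std::runtime_error("cannot open input file");
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out) throw std::runtime_error("cannot create temporary file");

        out << header;  // step 1: push the new content in
        // step 2: stream the old file after it; streaming an empty rdbuf
        // would set failbit, so skip the copy when the input is empty
        if (in.peek() != std::ifstream::traits_type::eof()) out << in.rdbuf();
        out.flush();
        if (!out) throw std::runtime_error("writing temporary file failed");
    }  // both streams are closed here, before the rename

    // step 3: move the new file over the old one
    std::filesystem::rename(tmp, path);
}
```

The old contents are streamed rather than loaded into memory at once, and the original file stays untouched until the final rename.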
Unconscionable answered 31/3, 2011 at 13:35 Comment(0)
R
1

You generally can't expand a file backwards with most filesystems.

Normally, when you save a file, the existing data is completely overwritten. Even if you only change the first two lines of a 1,000,000 line file, the application will usually re-write the unchanged lines to disk when you hit save.

For most file formats, any headers are fixed size, so it's not a problem to change them.

There are also formats that are stream based; since the data is parsed from the stream and used to construct the document, it's possible for the stream to contain an instruction to insert some data at the beginning of the resulting document. These stream-based file formats are fairly complicated, though.

Riplex answered 31/3, 2011 at 13:26 Comment(2)
I am just curious. You say: "can't expand a file backwards with most filesystems", but do you know of any that allows this? - Hurtful
I don't know of any, but there must be someone somewhere who has developed one. Even a slightly modified file system similar to FAT* could achieve this by inserting a cluster at the head of a file. - Riplex
M
0

Using bash:

$ cat > license << EOF
> /* created on 31.3.2011 */
> /* author */
> /* other redundant information */
> EOF
$ sed -i '1i \\' example.txt
$ sed -i '1 {
>    r license
>    d }' example.txt

I don't know how to do it with a single sed command (sed -i -e '1i \\' -e '1 { ... inserts after the first line).

Mauricio answered 31/3, 2011 at 14:2 Comment(0)
O
0

In R, following the original question:

df <- data.frame(a = runif(10), b = runif(10), c = runif(10))
txt <- "# created on 31.3.2011 \n# author \n# other redundant information"
write(txt, file = "example.txt") 
write.table(df, file = "example.txt", row.names = FALSE, append = TRUE)

The solution provided by Joris Meys does not work properly: it overwrites the content instead of prepending the header to the lines. A working version would be:

Lines <- c("First line", "Second line", "Third line")
File <- "test.txt"
header <- "A line \nAnother line \nMore line \n\n"
writeLines(Lines, File)
txt <- c(header, readLines(File)) 
writeLines(txt, File) # Option 1
readLines(File) 

Instead of writeLines we could also use:

write(txt, File) # Option 2
cat(txt, sep = "\n", file = File) # Option 3 (file must be named, or cat() prints it as text)
Overstretch answered 9/4, 2015 at 20:42 Comment(0)
