Combine files into one

3

6

I am currently in this directory:

/data/real/test

When I run ls -lt at the command prompt, I get something like this:

REALTIME_235000.dat.gz
REALTIME_234800.dat.gz
REALTIME_234600.dat.gz
REALTIME_234400.dat.gz
REALTIME_234200.dat.gz

How can I consolidate the five .dat.gz files above into one .dat.gz file on Unix without any data loss? I am new to Unix and not sure how to do this. Can anyone help?

Update:

I am not sure which approach is best: should I unzip each of the five files and then combine them into one, or combine the five .dat.gz files directly into one .dat.gz?

Nasion answered 2/8, 2012 at 20:52 Comment(0)
12

If it's OK to concatenate the files' contents in arbitrary order, then the following command will do the trick:

zcat REALTIME*.dat.gz | gzip > out.dat.gz

Update

This should solve the ordering problem:

zcat $(ls -t REALTIME*.dat.gz) | gzip > out.dat.gz
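
Note that ls -t lists the newest file first. Because the file names in the question embed a zero-padded HHMMSS timestamp, the shell's glob expansion already sorts them oldest first, so one possible alternative avoids parsing ls output altogether. The following is a minimal sketch, not a definitive recipe: the scratch directory and the sample file contents are made up for illustration, and gzip -dc is used in place of zcat because it behaves the same on systems where zcat expects .Z files.

```shell
#!/bin/sh
# Sketch only: sample files and contents are made up for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create three small gzip files named like the ones in the question.
for t in 234200 234400 234600; do
    printf 'record from %s\n' "$t" | gzip > "REALTIME_$t.dat.gz"
done

# The glob expands in sorted order, so zero-padded timestamps come out oldest first.
# out.dat.gz does not match the REALTIME_* pattern, so it is never read as input.
gzip -dc REALTIME_*.dat.gz | gzip > out.dat.gz

# Show the combined contents.
gzip -dc out.dat.gz
```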
Marchand answered 2/8, 2012 at 20:57 Comment(7)
I tried zcat *.gz | gzip > out.dat.gz and got this error for all five files: REALTIME_EXPORT_v1x0_20120801_9_T_234000_234200.dat.gz.Z: No such file or directory. Why is that? - Nasion
@Nevzz03 I can't reproduce this problem. I'm using bash on Linux with the same file names. - Marchand
@Nevzz03 Are you on Solaris instead of Linux? If so, use gzcat *.gz | gzip > out.dat.gz instead. The zcat utility on Solaris works with a different compression suite (compress and uncompress) that uses .Z as a suffix instead of .gz. This might also be the case on other non-Linux Unixen (AIX, etc.)... - Fathomless
If you look at my comment above, an extra .Z gets appended to the file name: dat.gz.Z. Why is that? - Nasion
Please see the answer by Mark Adler. ~1000 times faster and more correct. - Perorate
@Perorate Mark is using cat, so it won't work with the compressed files the OP is asking about. - Marchand
@IvanNevostruev Yes it will; that is the beauty of the gzip format. Whether you cat a.txt and b.txt and then gzip the result, or gzip them both and then cat the results, you get two archives that decompress to exactly the same content. To verify, unzip the two archives and compare them with md5sum (I just retried it to confirm). That is why Mark Adler pointed out that it is unnecessary to decompress and then recompress them. - Perorate
5

What do you want to happen when you gunzip the result? If you want the five files to reappear, then you need to use something other than the gzip (.gz) format. You would need to either use tar (.tar.gz) or zip (.zip).
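
For the first case, where the separate files should reappear, a tar archive is one way to bundle them. This is a minimal sketch under stated assumptions: the sample files, their contents, and the name bundle.tar are made up for illustration. Since the inputs are already compressed, the sketch uses plain tar without a second gzip pass.

```shell
#!/bin/sh
# Sketch only: sample files, contents, and the name bundle.tar are made up.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

for t in 234200 234400; do
    printf 'record from %s\n' "$t" | gzip > "REALTIME_$t.dat.gz"
done

# Bundle the already-compressed files; no second compression pass is needed.
tar -cf bundle.tar REALTIME_*.dat.gz

# Extract into a separate directory: the original files reappear one by one.
mkdir restored
(cd restored && tar -xf ../bundle.tar)
ls restored
```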

If you want the result of the gunzip to be the concatenation of the gunzip of the original files, then you can simply cat (not zcat or gzcat) the files together. gunzip will then decompress them to a single file.

cat [files in whatever order you like] > combined.gz

Then:

gunzip combined.gz

will produce an output that is the concatenation of the gunzip of the original files.

The suggestion to decompress them all and then recompress them as one stream is completely unnecessary.
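
The claim can be checked on a small scale. The following is a minimal sketch, with made-up file names and contents, that concatenates two gzip files with plain cat and compares the decompressed result against the concatenation of the original inputs.

```shell
#!/bin/sh
# Sketch only: file names and contents are made up for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'first\n'  > a.txt
printf 'second\n' > b.txt

# Compress each file separately; -c writes to stdout and keeps the originals.
gzip -c a.txt > a.gz
gzip -c b.txt > b.gz

# Plain cat of the compressed files, no decompression involved.
cat a.gz b.gz > combined.gz

# Decompressing the result should match the concatenation of the originals.
cat a.txt b.txt > expected.txt
gzip -dc combined.gz > actual.txt
cmp expected.txt actual.txt && echo "contents match"
```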

Obrian answered 2/8, 2012 at 23:14 Comment(0)
-1

It seems almost like black magic, but you can actually concatenate .gz files directly!

The format allows this by design (as does MP3). Internally, a gzip file is a series of independent members, each with its own header, independently compressed data, and checksum.

So when you concatenate several .gz files, the uncompressed stream is exactly the concatenation of the original files.
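
One practical consequence is that a newly compressed chunk can be appended to an existing .gz file with >>. This is a minimal sketch with a made-up file name and contents; gzip -t is used to check that the result is still a valid gzip file.

```shell
#!/bin/sh
# Sketch only: the file name log.gz and its contents are made up.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Start with one gzip member.
printf 'batch 1\n' | gzip > log.gz

# Append a second, independently compressed member to the same file.
printf 'batch 2\n' | gzip >> log.gz

# Test the integrity of the whole file, then print the decompressed stream.
gzip -t log.gz && gzip -dc log.gz
```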

Scrummage answered 23/11, 2023 at 14:37 Comment(0)
