I have a very large file, over 100GB (many billions of lines), and I would like to run a two-level sort on it as quickly as possible on a Unix system with limited memory. This will be one step in a large Perl script, so I'd like to use Perl if possible.
So, how can I do this?
My data looks like this:
A 129
B 192
A 388
D 148
D 911
A 117
... but for billions of lines. I need to sort first by letter, and then numerically by the number within each letter. Would it be easier to use the Unix sort command, like
sort -k1,1 -k2,2n myfile
or can I do this all in Perl somehow? My system will have something like 16GB RAM, but the file is about 100GB.
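To make the Unix option concrete, here is a rough sketch of what I have in mind, assuming a sort that supports -S (buffer size) and -T (temporary directory), as GNU sort does. The function name sort_two_level, the 8G default buffer, and the temp directory fallback are placeholders of mine, not tested settings for a 100GB file. As far as I understand, sort spills sorted runs to temporary files and merges them, so the whole input should not need to fit in RAM.

```sh
# Sketch only: assumes GNU-style sort options; sizes and paths are placeholders.
# Usage: sort_two_level INPUT OUTPUT [BUFSIZE] [TMPDIR]
sort_two_level() {
    input=$1
    output=$2
    bufsize=${3:-8G}                 # in-memory buffer, kept well under 16GB RAM
    tmpdir=${4:-${TMPDIR:-/tmp}}     # needs enough free space for the spilled runs
    # LC_ALL=C   compare bytes instead of using locale collation
    # -k1,1      first key: field 1 only (the letter)
    # -k2,2n     second key: field 2 only, compared numerically
    # -S / -T    buffer size and directory for temporary files
    # -o         write the result to OUTPUT
    LC_ALL=C sort -k1,1 -k2,2n -S "$bufsize" -T "$tmpdir" -o "$output" "$input"
}
```

From the Perl script I would presumably run the same command through system, so the question is really whether that is the sensible route or whether a pure-Perl approach could compete.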