I want to use jq on a 50GB file. Needless to say, the machine's memory can't handle it: jq runs out of memory.
I tried several options, including --stream, but it didn't help. Can someone tell me what I'm doing wrong and how to fix it?
jq -cn --stream 'fromstream(1|truncate_stream(inputs))' file.json | jq -cr .data[] >> out.json
The file contains data like this:
{"data":[{"id":"id1","value":"value1"},{"id":"id2","value":"value2"},{"id":"id3","value":"value3"}...]}
I want to read each value of the array in the data field and write it line by line to another file, like this:
{"id":"id1","value":"value1"}
{"id":"id2","value":"value2"}
{"id":"id3","value":"value3"}
Right now the command runs out of memory and gets killed.
With fromstream you're asking jq to read the stream and build a data structure in memory from it. Using --stream isn't useful unless you somehow filter the stream down to something more manageable before calling fromstream (if you ever do so at all). As for advice on how to do that... it would help if you described the actual problem you're trying to solve in more detail. – Callidata
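For the sample shape shown in the question, one way to filter the stream down before calling fromstream is to truncate two path levels (the data key and the array index), so that each array element is reassembled and emitted on its own; a minimal sketch, assuming the structure is exactly {"data":[...]}:

# Strip the first two path components so each element of .data is rebuilt
# and printed as a compact object on its own line, one at a time.
jq -cn --stream 'fromstream(2|truncate_stream(inputs))' file.json > out.json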
Are you trying to extract just the data key from the top-level object in the original? Is the value found there small enough to fit in RAM? Are you trying to do something else? – Callidata
Use the mongoimport command for loading the file into mongo. Once it's in, everything else will be trivial. – Bandage
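If the array has already been flattened to one object per line (for example with the jq sketch above), an import along these lines might work; the database and collection names here are placeholders:

# mongoimport reads newline-delimited JSON documents by default.
mongoimport --db mydb --collection items --file out.json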