Why does the jq --raw-output argument fail to remove quotes from @csv output?

2

9

I am trying to use jq to reformat some elements of the JSON output generated by ffprobe as csv. I'm close (it seems to me), but struggling with a detail:

My ffprobe output is shown in the jq 1.6 playground

I'm running a recently downloaded binary of jq (jq --version => jq-1.6) on macOS Mojave (10.14.6).

From the terminal on my Mac, my results are:

$ fn_ffprobeall | jq -r '[.format.filename,.format.format_name,.format.tags.album_artist] | @csv'
"01 Jubilee.flac","flac","Bill Charlap Trio"

# where fn_ffprobeall is a function defined as: 
fn_ffprobeall () { ffprobe -i "01 Jubilee.flac" -hide_banner -v quiet -print_format json -show_format -show_streams; }

But this jq output (shown above) is not what I need... I need values without the surrounding quotes "". According to the documentation for --raw-output / -r:

With this option, if the filter’s result is a string then it will be written directly to standard output rather than being formatted as a JSON string with quotes. This can be useful for making jq filters talk to non-JSON-based systems.

Also, it seems odd that using @tsv instead of @csv "does the right thing": the quotes are stripped. I suppose one could do some additional work to replace the tab characters with commas, but I'd like to know if I'm missing something before falling back to that approach.

Favata answered 22/2, 2020 at 8:23 Comment(0)
C
10

The --raw-output option has no control over the strings passed to @csv because, at that point, they are not the final result of the filter. They are quoted because @csv quoted them.

jq treats the result of @csv as a single string output value. The --raw-output option works as the documentation says: it does not encode string results as JSON in the output.

If you try without that option, you will see the output as "\"01 Jubilee.flac\",\"flac\",\"Bill Charlap Trio\"", which is a properly encoded JSON string: it is wrapped in quotes and its disallowed characters are escaped. You can see the difference by checking and unchecking the Raw Output option at https://jqplay.org/s/OerK1MlARS.
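
To reproduce the comparison from a terminal, here is a minimal sketch. The `sample` variable is a hypothetical, trimmed-down stand-in for the ffprobe output (only the three fields used in the question), not the real output:

```shell
# Hypothetical sample standing in for the relevant part of the ffprobe JSON
sample='{"format":{"filename":"01 Jubilee.flac","format_name":"flac","tags":{"album_artist":"Bill Charlap Trio"}}}'
filter='[.format.filename, .format.format_name, .format.tags.album_artist] | @csv'

# Without -r: the CSV row (one string) is printed as an encoded JSON string
as_json=$(printf '%s' "$sample" | jq "$filter")

# With -r: the same string is written as-is, so only @csv's own quotes remain
raw=$(printf '%s' "$sample" | jq -r "$filter")

printf '%s\n' "$as_json" "$raw"
# "\"01 Jubilee.flac\",\"flac\",\"Bill Charlap Trio\""
# "01 Jubilee.flac","flac","Bill Charlap Trio"
```

The quotes that survive -r are part of the CSV text itself, which is why the option cannot remove them.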

If you want unquoted strings in the CSV, you can use join(",") in place of @csv, but it will not work well when a string itself contains a comma.
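
A small sketch of that caveat, using a hypothetical two-field row whose first value contains a comma (the values are made up for illustration):

```shell
# Hypothetical row whose first field contains a comma
row='["Charlap, Bill","flac"]'

# join(",") emits the values bare, so the embedded comma looks like a separator
joined=$(printf '%s' "$row" | jq -r 'join(",")')

# @csv quotes each string, so the embedded comma stays inside its field
quoted=$(printf '%s' "$row" | jq -r '@csv')

printf '%s\n' "$joined" "$quoted"
# Charlap, Bill,flac
# "Charlap, Bill","flac"
```

A CSV reader would most likely split the first line into three fields, while the second line still parses as two.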

Cole answered 14/7, 2020 at 20:9 Comment(1)
I understand your explanation - it makes sense. I wish the jq documentation was as clear :) And fwiw, I was trying to add the string as a row in an existing CSV file using Excel. - Favata
T
4

The @csv filter produces CSV in general accordance with the prevalent standards, which require strings to be quoted under certain circumstances (e.g. if they contain commas), and which allow fields to be quoted.

jq's -r option is much misunderstood. It only affects "top-level" JSON string outputs. It should be used with the @csv filter to produce CSV output, but it does not strip quotation marks from string-valued fields.
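
The "top-level" distinction can be seen with a tiny sketch. The one-field `input` object is hypothetical and exists only for this illustration:

```shell
# Hypothetical one-field input, used only for illustration
input='{"a":"x"}'

# Top-level string result: -r writes it without JSON quotes
top=$(printf '%s' "$input" | jq -r '.a')

# String nested inside an array: -r leaves the inner quotes alone
# (-c only keeps the array on a single line)
nested=$(printf '%s' "$input" | jq -rc '[.a]')

printf '%s\n' "$top" "$nested"
# x
# ["x"]
```

In the second command the filter's result is an array, not a string, so -r has nothing to unwrap.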

If you want to have fine-grained control over where quotation marks appear, you have numerous options (one of the simplest being @tsv | gsub("\\t";",")), but you then run the risk of producing invalid CSV.
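
As a rough sketch of the @tsv route applied to the question's filter: the `sample` variable below is a hypothetical stand-in for the relevant part of the ffprobe output, and the result is only safe when no value contains a comma, quote, or line break:

```shell
# Hypothetical sample standing in for the relevant part of the ffprobe JSON
sample='{"format":{"filename":"01 Jubilee.flac","format_name":"flac","tags":{"album_artist":"Bill Charlap Trio"}}}'

# Build a tab-separated row, then swap each tab for a comma
unquoted=$(printf '%s' "$sample" |
  jq -r '[.format.filename, .format.format_name, .format.tags.album_artist] | @tsv | gsub("\\t"; ",")')

printf '%s\n' "$unquoted"
# 01 Jubilee.flac,flac,Bill Charlap Trio
```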

Tellus answered 22/2, 2020 at 8:40 Comment(3)
OK, fair enough that the "standard" is a bit squishy. After reading your answer, I discovered RFC 4180, which adds some insights. Your answer stated, "jq's -r option is much misunderstood. It should be..." I may not understand what you're trying to say... I did use it with the @csv option, and the docs do state clearly (as I read them) that the quotes will be stripped. Can you clarify what it is that has been misunderstood? - Favata
How do you square that with the docs?: With this option, if the filter’s result is a string then it will be written directly to standard output rather than being formatted as a JSON string with quotes. - Favata
Maybe that's what we've got in common: the mental block? - Favata
