Output semicolon-separated string
Asked Answered
A

3

6

Lets say we have this file:

{
  "persons": [
    {
      "friends": 4,
      "phoneNumber": 123456,
      "personID": 11111
    },
    {
      "friends": 2057,
      "phoneNumber": 432100,
      "personID": 22222
    },
    {
      "friends": 50,
      "phoneNumber": 147258,
      "personID": 55555
    }
  ]
}

I now want to extract the phone numbers of the persons 11111, 22222, 33333, 44444 and 55555 as a semicolon-separated string:

123456;432100;;;147258

While running

cat persons.txt | jq ".persons[] | select(.personID==<ID>) | .phoneNumber"

once for each <ID> and glueing the results together with the ; afterwards works, this is terribly slow, because it has to reload the file for each of the IDs (and other fields I want to extract).

Concatenating it in a single query:

 cat persons.txt | jq "(.persons[] | select(.personID==11111) | .phoneNumber), (.persons[] | select(.personID==22222) | .phoneNumber), (.persons[] | select(.personID==33333) | .phoneNumber), (.persons[] | select(.personID==44444) | .phoneNumber), (.persons[] | select(.personID==55555) | .phoneNumber)"

This also works, but it gives

123456
432100
147258

so I do not know which of the fields are missing and how many ; I have to insert.

Accouchement answered 5/11, 2018 at 14:52 Comment(6)
Do you want a single semicolon between the numbers? I see in your question you wrote ´123456;432100;;;147258´ why does this last number have 3 semicolons before it?Mistiemistime
Yes, because I look for 11111, 22222, 33333, 44444, and 55555, but 33333 and 44444 do not show up in the file and thus are null-ish. If I had only a single ;, I could not guarantee that the correct number is in the correct column.Accouchement
This is disgusting and not quite what you're asking, but it might inspire some : . as $input | [11111, 22222, 33333, 44444, 55555][] | (. as $id | $input.persons | map(select(.personID == $id))) (jqplay)Kentiga
Now this is what you're asking, but even more disgusting : [. as $input | [11111, 22222, 33333, 44444, 55555][] | . as $id | $input.persons | map(select(.personID == $id)) | map(.phoneNumber) | join("")] | join(";") (jqplay). You should seriously avoid using that, wait for someone that knows jq better than I do and will cleanup this mess.Kentiga
does the solution have to use jq?Mistiemistime
@MikeKatz45, no, not at all.Accouchement
A
3

With your sample input in input.json, and using jq 1.6 (or a jq with INDEX/2), the following invocation of jq produces the desired output:

jq -r --argjson ids '[11111, 22222, 33333, 44444, 55555]' -f tossv.jq input.json 

assuming tossv.jq contains the program:

INDEX(.persons[]; .personID) as $dict
| $ids
| map( $dict[tostring] | .phoneNumber)
| join(";")

Program notes

  1. INDEX/2 produces a JSON object that serves as a dictionary. Since JSON keys must be strings, tostring must be used in line 3 above.

  2. When using join(";"), null values effectively become empty strings.

  3. If your jq does not have INDEX/2, then now might be a good time to upgrade. Otherwise you can snarf its definition by googling: jq "def INDEX" builtin.jq

Arlinda answered 5/11, 2018 at 19:6 Comment(4)
If that worked it would be very elegant! Unfortunately I have no chance of upgrading my jq 1.5 to 1.6...Accouchement
@Accouchement - So why don’t you copy-and-paste the def?Arlinda
You mean def index($i): indices($i) | .[0]; from github.com/stedolan/jq/blob/master/src/builtin.jq ?Accouchement
@Accouchement - No, of course not. For one thing, jq names are case-sensitive. For another, INDEX/2 has arity 2.Arlinda
A
0

Unfortunately I couldn't test if peak's answer works since I only have jq 1.5. Here's what I came up with yesterday evening:

  • For each semicolon, add the following query

    (\";\" as \$a | \$a)
    
  • Resulting command (abstract):

     cat persons.txt | jq "(<1's phone number>), (\";\" as \$a | \$a), 
     (<2's phone number>), (\";\" as \$a | \$a), ..."
    
  • Resulting command (concrete):

    cat persons.txt | jq "(.persons[] | select(.personID==11111) | .phoneNumber), (\";\" as \$a | \$a), 
    (.persons[] | select(.personID==22222) | .phoneNumber), (\";\" as \$a | \$a), 
    (.persons[] | select(.personID==33333) | .phoneNumber), (\";\" as \$a | \$a), 
    (.persons[] | select(.personID==44444) | .phoneNumber), (\";\" as \$a | \$a), 
    (.persons[] | select(.personID==55555) | .phoneNumber)"
    
  • Result:

    123456
    ";"
    432100
    ";"
    ";"
    ";"
    147258
    
  • Delete the newlines and ":

    <commandAsAbove> | tr --delete "\n\""
    
  • Result:

    123456;432100;;;147258
    

Do not get me wrong, this is far uglier than peak's answer, but it worked for me yesterday.

Accouchement answered 6/11, 2018 at 8:22 Comment(3)
The last two parts can easily be done by jq : to delete the newlines, first put your results in an array then join them with the empty string ([ command ] | join("")). This would leave you with a "1;2;;;3" result, in which you can remove the quotes by using the -r "raw-output" flag. See this jqplay.Kentiga
Oh and bash-wise : 1) make your life easier by enclosing your jq command in single quotes so you don't have to escape the double-quotes and dollars inside and 2) prefer jq command file to cat file | jq commandKentiga
Also, E as $a | $a can here be simplified to just E in each case.Arlinda
M
0

Without jq solution:

for i in $(seq 11111 11111 55555)
do
  string=$(grep -B1 "$i" persons.txt | head -1 | sed 's/.* \(.*\),/\1/g')
  echo "$string;" >> output
done
cat output | tr -d '\n' | rev | cut -d';' -f2- | rev > tmp && mv tmp output

This little script will yield the result you want and you can adapt it quickly if the input data varies

cat output
123456;432100;;;147258
Mistiemistime answered 6/11, 2018 at 15:22 Comment(2)
Do I see this right -- this relies heavily on the JSON structure being the same every time, correct? Thus, when an entry gets mixed (eg. friends after phone number), it stops working?Accouchement
yep this makes the assumption that each JSON "person" will have friends, phone and ID in that order of lines.Mistiemistime

© 2022 - 2024 — McMap. All rights reserved.