Parsing pipe delimited input in awk
I have seen many posts asking similar questions, but I can't get it working.

Input looks like:

<field one with spaces>|<field two with spaces>

I am trying to parse it with awk.

I have tried many variants from excellent posts:

FS = "^[\x00- ]*|[\x00- ]*[|][\x00- ]*|[\x00- ]*$";
FS = "^[\x00- ]*|[\x00- ]*\|[\x00- ]*|[\x00- ]*$";
FS = "^[\x00- ]*|[\x00- ]*\\|[\x00- ]*|[\x00- ]*$";

Still can't get the pipe delimiter to work.

Using CentOS.

Any help?

Ilion answered 2/8, 2011 at 19:56 Comment(0)
 echo "field one has spaces | field two has spaces" \
 | awk '
   BEGIN {
     FS = "|"
   }
   {
     print $2
     print $1
     # or whatever you want
   }'

 #output

  field two has spaces
  field one has spaces

You can also reduce this to

awk -F'|' '{
    print $2
    print $1
}'

Edit: Also, not all awk implementations accept a multi-character regex as the FS value.
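Where the awk in use does treat a multi-character FS as a regex (gawk does, and POSIX specifies it), a rough sketch of trimming the blanks around the pipe might look like the following. The function name split_pipe is made up for illustration, and the pattern only handles spaces and tabs next to the delimiter, not at the very start or end of the line.

```shell
# Hypothetical helper (not from the original post): split on a pipe and
# swallow any spaces/tabs immediately around it.
# Assumes an awk that interprets a multi-character FS as a regex.
split_pipe() {
    awk 'BEGIN { FS = "[ \t]*[|][ \t]*" }   # [|] keeps the pipe literal
         { print $2; print $1 }'
}

echo "field one has spaces | field two has spaces" | split_pipe
```

With such an awk, the two fields should come out without the spaces that surrounded the pipe. Writing the pipe as [|] rather than \| sidesteps the question of how many backslashes survive awk's string parsing.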

Edit2: Somehow I missed this originally, but I see you are trying to include \x00 in the character classes before and after the | char. I assume you mean \x00 to be the null char? I don't think you're going to be able to have awk parse a file with embedded null chars. You could preprocess your input like

 tr '\000' ' ' < file.txt > spacesForNulls.txt

or delete them altogether with

 tr -d '\000' < file.txt > deletedNulls.txt

and eliminate that part of your regex. But as noted above, some awks don't support a regex for the FS value. Also, I don't use the tr trick very often; you may find that it requires a slightly different notation for the null char, depending on your version of tr (the octal form '\000' is the one POSIX specifies).
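Put together, the preprocessing step and the simple single-character FS could be chained in one pipeline, roughly as below. This is only a sketch: the function name strip_nuls_and_split and the sample input are invented for illustration, and it assumes a tr that accepts the octal escape '\000'.

```shell
# Hypothetical helper (not from the original post): delete NUL bytes with tr,
# then let awk split on the plain pipe character.
strip_nuls_and_split() {
    tr -d '\000' | awk -F'|' '{ print $2; print $1 }'
}

# Sample input with a NUL byte embedded in each field (made up for this sketch).
printf 'field\000 one|field\000 two\n' | strip_nuls_and_split
```

Because the nulls are gone before awk sees the data, the FS no longer needs the character-class part at all.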

I hope this helps.

Tyrothricin answered 2/8, 2011 at 20:0 Comment(3)
Great point with \x00. Or the OP should use a more specialized tool like Perl or Ruby. ++ Preengage
"I don't think you're going to be able to have awk parse a file with null chars embedded." On second thought, awk '{gsub("\x00","")}1' is possible. Preengage
If you have GNU userland you get gawk, and it will support FS with a multi-character regex: Linux, Hurd and BSD surely ... not sure about Mac OS X. Advisory
