Counting the number of delimiters in each row of a file in Unix
W

3

5

I have a file 'records.txt' which contains over 200,000 records.

Each record is on a separate line and has multiple fields separated by a delimiter '|'.

Each row should have 35 fields, but one of the rows has a field count other than 35, i.e. its number of '|' characters is not 35.

Can someone please suggest a way in Unix to identify that row? (For example, by getting the count of '|' characters in each row of the file.)

Wilks answered 14/1, 2009 at 9:57 Comment(0)
K
13

Try this:

awk -F '|' 'NF != 35 {print NR, $0}' your_file
Kinson answered 14/1, 2009 at 10:7 Comment(1)
This prints every row, both those equal to 35 and those not equal to 35... it's not working. - Dysthymia
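One possible explanation for the behaviour reported in the comment above: awk's NF is the number of fields, which is always one more than the number of delimiters on a non-empty line. If every good row contains exactly 35 '|' characters, then NF is 36 and the test NF != 35 matches every row. The sketch below only illustrates that off-by-one; fields_and_delims is a hypothetical helper name, not something from the answer.

```shell
# Print "delimiter_count field_count" for each line read from stdin.
# fields_and_delims is a hypothetical name used only for this sketch.
fields_and_delims() {
    awk -F '|' '{ print NF - 1, NF }'
}

# A row with 3 delimiters has 4 fields.
printf 'a|b|c|d\n' | fields_and_delims
```

If the rows are meant to contain 35 delimiters rather than 35 fields, comparing NF against 36 (or NF - 1 against 35) would be the matching test.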
T
2

This small perl script should do it:

cat records.txt | perl -ne '$t = $_; $t =~ s/[^\|]//g; print unless length($t) == 35;'

This works by removing all the characters except the |, then counting what is left.
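The same idea (delete the delimiters and count how many were removed) can also be sketched in awk, which makes it easy to show the line number and the actual count for each offending row. This is a minimal sketch, not part of the original answer; report_bad_rows and its two arguments are hypothetical names chosen for illustration.

```shell
# Print "line_number delimiter_count" for every row whose number of '|'
# characters differs from the expected count.
# report_bad_rows is a hypothetical helper: $1 = expected count, $2 = file.
report_bad_rows() {
    expected="$1"
    file="$2"
    awk -v want="$expected" '{
        line = $0                      # work on a copy so $0 stays intact
        n = gsub(/\|/, "", line)       # gsub returns the number of replacements
        if (n != want) print NR, n
    }' "$file"
}
```

Under these assumptions, running report_bad_rows 35 records.txt would list only the rows that do not contain exactly 35 delimiters.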

Telpherage answered 14/1, 2009 at 10:3 Comment(0)
B
1

Greg's approach done in bash, for the bash friends out there :)

while IFS= read -r n; do [ $(printf '%s' "$n" | tr -cd '|' | wc -c) -ne 35 ] && printf '%s\n' "$n"; done < records.txt
Bushranger answered 14/1, 2009 at 11:5 Comment(1)
I just wanted to find a row which has more than N (35 here) separators. Both Greg's code and yours work. Thanks :) - Wilks
