Bash output the line with highest value

Asked 27/11, 2012 at 9:36 Answered 26/3, 2018 at 0:0

My question is much like this one, but with one difference: I want to output the line that has the highest score in the 3rd column. My data looks like this:

1.gui  Qxx  16
2.gui  Qxy  23
3.guT  QWS  11

and I want to get this:

1.gui  Qxy  23
3.guT  QWS  11

I used:

cat file.f | uniq | cut -d" " -f3 | sort | uniq -d >>out.f

but I did not get what I wanted.

Myers answered 27/11, 2012 at 9:36 Comment(3)

Can you recheck the input and output? Should the 1.gui... in the output be 2.gui...? – Taphouse 27/11, 2012 at 9:42

@Raze2dust the numbers in the first column are not important; they are just there to represent line numbers. – Myers 27/11, 2012 at 9:47

k.. then you should change the 3.guT to 2.guT in the output. It is confusing otherwise. – Taphouse 27/11, 2012 at 10:14

With sort:

$ sort -rk3 file             # Sort on column 3, display all results

2.gui  Qxy  23
1.gui  Qxx  16
3.guT  QWS  11

$ sort -rk3 file | head -2   # Sort on column 3, filter number of results

2.gui  Qxy  23
1.gui  Qxx  16

$ sort -rk3 file | uniq      # Sort on column 3, only display unique results

2.gui  Qxy  23
1.gui  Qxx  16
3.guT  QWS  11

-r reverses the sort, highest first.

-k3 sorts on the key that starts at the 3rd column.

If you only want to display lines whose 3rd column is greater than some value (e.g. 15), then try this using awk:

awk '$3>15' file | sort -rk3  # Display line where column 3 > 15 and sort

2.gui  Qxy  23
1.gui  Qxx  16
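When the threshold is not known in advance and every line tied for the highest value is wanted, one possible approach is to let awk read the file twice: once to find the maximum, once to print the matching lines. This is only a sketch, assuming whitespace-separated fields and a numeric 3rd column; the sample data (including the tied line `4.gui Qzz 23`) is made up for illustration.

```shell
# Hypothetical sample file with a tie for the highest score.
sample=$(mktemp)
printf '%s\n' '1.gui Qxx 16' '2.gui Qxy 23' '3.guT QWS 11' '4.gui Qzz 23' > "$sample"

# Pass 1 (NR == FNR): remember the largest value seen in column 3.
# Pass 2: print every line whose column 3 equals that maximum.
top_lines=$(awk 'NR == FNR { if (FNR == 1 || $3 + 0 > max) max = $3 + 0; next }
                 $3 + 0 == max' "$sample" "$sample")

echo "$top_lines"
rm -f "$sample"
```

This avoids sorting the whole file, which may matter for inputs with millions of rows, at the cost of reading the file twice.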

Nombril answered 27/11, 2012 at 9:42 Comment(3)

Thanks @sudo_O. After sorting you got only the first line, but I want to get all occurrences. – Myers 27/11, 2012 at 9:46

The thing is, my file has ~10 million rows and I do not know how many occurrences there are for each, meaning I do not know how many lines to take with head. – Myers 27/11, 2012 at 9:49

If you want all the results just do sort -rk3 file – Nombril 27/11, 2012 at 9:53

For future users with the same question:

Do not forget to add the -n switch to the sort command; otherwise the values are compared as text, so the 9999s come first, followed by the 999s, and so on. So use:

sort -rnk3 file
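To see the difference the -n switch makes, here is a small sketch on a made-up sample file (the names and scores are hypothetical, chosen so that the values have different numbers of digits):

```shell
# Hypothetical sample data with 1-, 2- and 3-digit scores.
sample=$(mktemp)
printf '%s\n' 'a.gui Qxx 9' 'b.gui Qxy 100' 'c.gui QWS 23' > "$sample"

# Without -n the 3rd field is compared as text, so "9" ranks above "23" and "100".
lexical_top=$(sort -rk3 "$sample" | head -n 1)

# With -n the 3rd field is compared as a number, so 100 ranks first.
numeric_top=$(sort -rnk3 "$sample" | head -n 1)

echo "lexical: $lexical_top"
echo "numeric: $numeric_top"
rm -f "$sample"
```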

If you want only one line with the highest value for each name (removing duplicates), use this:

sort -rnk3 file | awk '!x[$2]++'

If you have an unusual delimiter, you can tell awk about it with -F:

sort -rnk3 file | awk -F"[. ]" '!x[$2]++'
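The `!x[$2]++` idiom used above keeps only the first line seen for each distinct value of field 2. A small sketch of how that plays out, assuming the default whitespace delimiter and a made-up sample in which the names in column 2 repeat:

```shell
# Hypothetical sample in which the name in column 2 repeats.
sample=$(mktemp)
printf '%s\n' '1.gui Qxx 16' '2.gui Qxy 23' '3.gui Qxx 41' '4.gui Qxy 7' > "$sample"

# x[$2]++ evaluates to 0 (false) the first time a name is seen, so !x[$2]++
# is true only for the first line per name. After the reverse numeric sort,
# that first line is the one with the highest score for the name.
best_per_name=$(sort -rnk3 "$sample" | awk '!x[$2]++')

echo "$best_per_name"
rm -f "$sample"
```

The sort does the ranking and awk only filters, so the order of the surviving lines is still highest score first.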

Myers answered 27/11, 2012 at 17:2 Comment(1)

Great - I was looking for that specific awk construct which prevents duplicates on a specific field - I do not know if that can be achieved with just bash sort - it seems that -u works on the complete line. – Webbed 17/7, 2013 at 8:55

This should give you the highest value for those lines where the name is repeated, and keep those lines whose names are not repeated.

sort -rk3 file | awk '!seen[$1]++' > file_filtered.txt

Cheesecloth answered 26/3, 2018 at 0:0 Comment(0)
