Bash output the line with highest value
M

3

6

My question is much like this one, with one difference: I want to output the line that has the highest score in the 3rd (tab-separated) column. My data looks like this:

1.gui  Qxx  16
2.gui  Qxy  23
3.guT  QWS  11

and I want to get this:

1.gui  Qxy  23
3.guT  QWS  11

I used:

cat file.f | uniq | cut -d" " -f3 | sort | uniq -d >>out.f

but I did not get what I want.

Myers answered 27/11, 2012 at 9:36 Comment(3)
Can you recheck the input and output? Should the 1.gui... in the output be 2.gui...? - Taphouse
@Raze2dust The numbers in the first column are not important; they just represent line numbers. - Myers
OK, then you should change the 3.guT to 2.guT in the output. It is confusing otherwise. - Taphouse
N
10

With sort:

$ sort -rk3 file             # Sort on column 3, display all results

2.gui  Qxy  23
1.gui  Qxx  16
3.guT  QWS  11

$ sort -rk3 file | head -2   # Sort on column 3, filter number of results

2.gui  Qxy  23
1.gui  Qxx  16

$ sort -rk3 file | uniq      # Sort on column 3, only display unique results

2.gui  Qxy  23
1.gui  Qxx  16
3.guT  QWS  11

-r reverse sort, highest first.

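-k3 sort on a key that starts at the 3rd column and runs to the end of the line (use -k3,3 to limit the key to the 3rd column only).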


If you only want to display lines whose 3rd column is greater than some value (e.g. 15), try this using awk:

awk '$3>15' file | sort -rk3  # Display lines where column 3 > 15 and sort

2.gui  Qxy  23
1.gui  Qxx  16
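
If only the line or lines holding the maximum are needed, the sort can be skipped entirely, which may matter for a file with millions of rows. The following is a minimal sketch, not part of the original answer: it reads the file twice with awk, and the sample data (including the extra 4.gui row that ties for the maximum) and the temporary file path are assumptions made for illustration.

```shell
# Hypothetical sample file; the path comes from mktemp, not from the question.
file=$(mktemp)
printf '%s\n' '1.gui  Qxx  16' '2.gui  Qxy  23' '3.guT  QWS  11' '4.gui  Qzz  23' > "$file"

# Pass 1 (NR==FNR is true only for the first file argument):
#   remember the largest numeric value seen in column 3.
# Pass 2: print every line whose column 3 equals that maximum.
max_lines=$(awk 'NR==FNR { if (NR==1 || $3+0 > max) max = $3+0; next }
                 $3+0 == max' "$file" "$file")

printf '%s\n' "$max_lines"
rm -f "$file"
```

Because ties are kept, there is no need to know in advance how many lines share the highest value.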
Nombril answered 27/11, 2012 at 9:42 Comment(3)
Thanks @sudo_O. After sorting you got only the first line, but I want to get all occurrences. - Myers
The thing is, my file has ~10 million rows and I do not know how many occurrences there are of each, meaning I do not know how many lines to take with head. - Myers
If you want all the results, just do sort -rk3 file - Nombril
M
5

For future users with the same question:

Do not forget to add the -n switch to the sort command. Without it the values are compared as text, so in reverse order the 9999s come first, followed by the 999s and so on, ahead of larger numbers such as 10000. So use

sort -rnk3 file
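
The effect of the -n switch described above can be checked on a small example. This is only an illustrative sketch: the three-row sample file is made up, and LC_ALL=C is set so the text comparison is byte-based and reproducible.

```shell
# Hypothetical sample data, written to a temporary file.
file=$(mktemp)
printf '%s\n' 'a x 9' 'b y 100' 'c z 25' > "$file"

# Without -n the key is compared as text, so "9" outranks "25" and "100".
lexical=$(LC_ALL=C sort -rk3 "$file")

# -k3,3 limits the key to column 3; n = numeric comparison, r = reverse.
numeric=$(LC_ALL=C sort -k3,3nr "$file")

printf 'text order:\n%s\n\nnumeric order:\n%s\n' "$lexical" "$numeric"
rm -f "$file"
```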

If you want only one line, the one with the highest value, for each distinct entry in the 2nd column (removing duplicates), use this:

sort -rnk3 file | awk '!x[$2]++'
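
The awk part works because an array element that has never been set evaluates to 0. A short sketch of the idea follows; the sample rows with repeated names in column 2 are invented for illustration, and the array is called seen here instead of x.

```shell
# Hypothetical sample data with repeated names in column 2.
file=$(mktemp)
printf '%s\n' '1.gui  Qxx  16' '2.gui  Qxy  23' '3.guT  Qxx  31' '4.gui  Qxy  7' > "$file"

# sort puts the highest value of each name first.
# seen[$2]++ yields the count *before* incrementing: 0 on the first
# encounter of a name, so !0 is true and awk prints the line; on later
# encounters the count is non-zero and the line is skipped.
best=$(sort -k3,3nr "$file" | awk '!seen[$2]++')

printf '%s\n' "$best"
rm -f "$file"
```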

If you have an unusual delimiter, you can tell awk about it with -F:

sort -rnk3 file | awk -F"[. ]" '!x[$2]++'
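
Note that sort has its own delimiter option, -t, so when the columns themselves are separated by something other than whitespace both commands need to be told. A small sketch, assuming a comma-separated variant of the data (the file contents are hypothetical):

```shell
# Hypothetical comma-separated sample data.
file=$(mktemp)
printf '%s\n' '1.gui,Qxx,16' '2.gui,Qxy,23' '3.guT,Qxx,31' > "$file"

# -t, makes sort split fields on commas; -F, does the same for awk.
best_csv=$(sort -t, -k3,3nr "$file" | awk -F, '!seen[$2]++')

printf '%s\n' "$best_csv"
rm -f "$file"
```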
Myers answered 27/11, 2012 at 17:2 Comment(1)
Great, I was looking for that specific awk construct, which prevents duplicates on a specific field. I do not know if that can be achieved with sort alone; it seems that -u works on the complete line. - Webbed
C
0

This should give you the line with the highest value for each name that is repeated, and keep the lines whose names are not repeated.

sort -rnk3 file | awk '!seen[$1]++' > file_filtered.txt
Cheesecloth answered 26/3, 2018 at 0:0 Comment(0)
