awk solution for comparing current line to next line and printing one of the lines based on a condition
I have an input file that looks like this (first column is a location number and the second is a count that should increase over time):

1       0
1       2
1       6
1       7
1       7
1       8
1       7
1       7
1       9
1       9
1       10
1       10
1       9
1       10
1       10
1       10
1       10
1       10
1       10
1       9
1       10
1       10
1       10
1       10
1       10
1       10

and I'd like to fix it to look like this (replacing any count that decreased with the previous count):

1       0
1       2
1       6
1       7
1       7
1       8
1       8
1       8
1       9
1       9
1       10
1       10
1       10
1       10
1       10
1       10
1       10
1       10
1       10
1       10
1       10
1       10
1       10
1       10
1       10
1       10

I've been trying to use awk for this, but I'm stumbling with getline: I can't figure out how to reset the line number (NR?) so that it reads each line and its next line, rather than two lines at a time. This is the code I have so far. Any ideas?

awk '{a=$1; b=$2; getline; c=$1; d=$2; if (a==c && b<=d) print a"\t"b; else print c"\t"d}' original.txt > fixed.txt

Also, this is the output I'm currently getting:

1       0
1       6
1       7
1       7
1       9
1       10
1       9
1       10
1       10
1       9
1       10
1       10
1       10
Cozza answered 28/7, 2012 at 21:37 Comment(3)
Ok, just to clarify, are you trying to skip the lines where the count decreases? That's a lot of lines, I wonder if you could give a shorter example that would be just as clear? - Lorylose
Sorry if my explanation wasn't clear. I want to print the previous line when the count decreases, so I end up with the same number of lines, but in a file where the count stays put or increases and never decreases. - Cozza
I got it .. check out the answers provided below, I think you'll find what you were looking for. - Lorylose

Perhaps all you want is:

awk '$2 < p { $2 = p } { p = $2 } 1' input-file

This will fail on the first line if the value in the second column is negative (p is still unset there and compares as 0), so do:

awk 'NR > 1 && $2 < p ...'

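This simply sets the second column to the previous value if the current value is smaller, then stores the (possibly corrected) value in the variable p, then prints the line. Because p is stored after the correction, a run of several decreased counts in a row is handled as well.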
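As a quick check of that logic, here is a small sketch run on a shortened, made-up input. The sample values and the function name fix_counts are assumptions for illustration only; they are not from the question's file.

```shell
# fix_counts is a hypothetical wrapper name around the one-liner above.
fix_counts() {
  # If the count dropped, overwrite it with the previous count (p),
  # then remember the current (possibly corrected) count and print the line.
  awk 'NR > 1 && $2 < p { $2 = p } { p = $2 } 1'
}

# Made-up sample: the count drops from 8 to 7 twice, then rises to 9.
printf '1 0\n1 8\n1 7\n1 7\n1 9\n' | fix_counts
```

The two lines with a count of 7 should come out with a count of 8, and the other lines should pass through unchanged.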

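Note that this also slightly modifies the spacing of the output on lines that change: assigning to a field makes awk rebuild the line using the output field separator (OFS), which is a single space by default. If your input is tab-separated, you might want to do: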

awk 'NR > 1 && $2 < p { $2 = p } { p = $2 } 1' OFS=\\t input-file
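To see the spacing difference for yourself, the sketch below runs the same program twice on a tiny tab-separated sample, once with the default OFS and once with OFS set to a tab. The sample data and the name fix_counts_tab are made up for illustration, and the sketch uses -v OFS='\t' rather than the trailing OFS=\\t operand; both forms should give awk a tab here.

```shell
# fix_counts_tab is a hypothetical helper name, not from the original answer.
fix_counts_tab() {
  # -v sets OFS before any input is read; awk expands \t to a tab.
  awk -v OFS='\t' 'NR > 1 && $2 < p { $2 = p } { p = $2 } 1'
}

# Default OFS: the corrected second line is rejoined with a single space.
printf '1\t8\n1\t7\n' | awk 'NR > 1 && $2 < p { $2 = p } { p = $2 } 1'

# OFS set to a tab: the corrected second line keeps its tab separator.
printf '1\t8\n1\t7\n' | fix_counts_tab
```

Only lines whose field was reassigned are rebuilt, so untouched lines keep whatever spacing they had in the input.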
Quarrel answered 28/7, 2012 at 21:43 Comment(2)
Wow .. so much more concise .. I think I have the verbose version of your first solution - Lorylose
Fantastic, I was just trying to figure the spacing out, thanks! - Cozza

This script will do what you like:

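{
  # Replace a count that dropped below the previous one. Skip the
  # first line, where prev_count is still unset.
  if (NR > 1 && $2 < prev_count)
    $2 = prev_count
  else
    prev_count = $2    # otherwise remember this count for the next line

  # Print location and count with fixed spacing.
  printf("%d   %d\n", $1, $2)
}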

This is a verbose version, written to be easy to read :)
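The script above keeps a single prev_count for the whole file, while the attempt in the question only compares counts when the location number matches (a==c). If the real file holds more than one location, one possible variation is to remember the last count per location. This is only a sketch: the array name prev, the function name fix_counts_by_location, and the sample data are all assumptions, and it has not been run against the asker's actual file.

```shell
# fix_counts_by_location is a hypothetical name for this variation.
fix_counts_by_location() {
  awk -v OFS='\t' '
    # prev[] is an assumed array holding the last count seen per location ($1).
    ($1 in prev) && $2 < prev[$1] { $2 = prev[$1] }
    # Remember the (possibly corrected) count for this location.
    { prev[$1] = $2 }
    # Print every line.
    1
  '
}

# Made-up sample with two locations; each one is corrected independently.
printf '1\t5\n1\t3\n2\t1\n2\t0\n' | fix_counts_by_location
```

Here the drop from 5 to 3 at location 1 and the drop from 1 to 0 at location 2 are each replaced by the previous count of the same location, so the low first count at location 2 is not raised to 5.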

Lorylose answered 28/7, 2012 at 21:47 Comment(2)
Thanks, I appreciate the verbose version as well! - Cozza
@Cozza Happy to help .. I adjusted the output spacing with printf, which may give you a bit finer control over the format/spacing if you need it. - Lorylose
