Explain awk command

Today I was searching online for a command to print the next two lines after a pattern, and I came across an awk command which I'm unable to understand.

$ /usr/xpg4/bin/awk '_&&_--;/PATTERN/{_=2}' input

Can someone explain it?

Anamorphism answered 23/8, 2013 at 16:43 Comment(0)
B
38

See https://mcmap.net/q/129249/-printing-with-sed-or-awk-a-line-following-a-matching-pattern for the answer that was duplicated here.

Baumgardner answered 23/8, 2013 at 18:25 Comment(0)
O
16

_ is being used as a variable name here (valid but obviously confusing). If you rewrite it as:

awk 'x && x--; /PATTERN/ { x=2 }' input

then it's a little easier to parse. Whenever /PATTERN/ is matched, the variable gets set to 2 (the matching line itself is not output unless x was already non-zero) - that's the second half. The first part fires when x is not zero; it decrements x and prints the current line (the default action, since that clause does not specify an action).

The end result is to print the two lines immediately following any match of the pattern. If one of those two lines also matches the pattern, it is still printed and the counter is reset to 2.
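To see what happens when one of the following lines also matches, here is a minimal sketch. The sample input is made up for illustration; the second PATTERN line sits inside the two-line window opened by the first one.

```shell
# Hypothetical sample input: the second PATTERN line falls inside the
# two-line window opened by the first match.
overlap=$(printf 'PATTERN\na\nPATTERN\nb\nc\nd\n' |
    awk 'x && x--; /PATTERN/ { x = 2 }')

# Prints: a, PATTERN, b, c (one per line); d is outside every window.
printf '%s\n' "$overlap"
```

The second PATTERN line is printed because x is still 1 when it is read, and it then resets x to 2, so b and c follow.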

Offload answered 23/8, 2013 at 16:56 Comment(0)
C
8

Simply put, the command prints a number of lines after a given regular expression match, not including the matched line.

The number of lines is specified in the block {_=2}: the variable _ is set to 2 whenever a line matches PATTERN. Every line read after a matching line causes _ to be decremented. You can read _&&_-- as "if _ is greater than zero, subtract one from it"; this happens for every line after a match until _ hits zero. It's quite simple when you replace the variable _ with a more sensible name like n.

A simple demo should make it clear (print the 2 lines that follow any line matching foo):

$ cat file
foo
1
2
3
foo
a
b
c

$ awk 'n && n--;/foo/{n=2}' file
1
2
a
b

So n is only True after it gets set to 2 by a line matching foo; for each following line awk then prints the current line and decrements n. Because awk uses short-circuit evaluation, n is only decremented when n is True (n>0), so the only possible values for n in this case are 2, 1 or 0.

Awk has the structure condition{block}: when a condition evaluates to True, the block is executed for the current record. If you don't provide a block, awk uses the default block {print $0}, so n && n--; is a condition without a block that only evaluates to True for the n lines after the regular expression match. The semicolon just separates the condition n&&n-- from the condition /foo/ and makes it explicit that the first condition has no block.
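As a quick check of the default-block rule, this sketch runs the demo program twice, once with the block left out and once with {print $0} written out. The variable names input, implicit and explicit are only for illustration.

```shell
# Sample input copied from the demo file above.
input=$(printf 'foo\n1\n2\n3\nfoo\na\nb\nc\n')

# Condition with no block: awk falls back to the default block { print $0 }.
implicit=$(printf '%s\n' "$input" | awk 'n && n--; /foo/ { n = 2 }')

# The same program with the default block spelled out.
explicit=$(printf '%s\n' "$input" | awk '
    n && n-- { print $0 }
    /foo/    { n = 2 }
')

printf '%s\n' "$implicit"
[ "$implicit" = "$explicit" ] && echo "same output"
```

Both variants should print 1, 2, a, b, followed by the "same output" line.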

To print the two lines following the match including the match you would do:

$ awk '/foo/{n=3} n && n--' file
foo
1
2
foo
a
b
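If you don't want the count hard-coded in the block, one option is to pass it in with -v. This is only a sketch: after_match is a made-up helper name, and count and pattern are made-up variable names, not anything awk provides.

```shell
# after_match is a hypothetical helper, not a standard command.
# $1 = number of lines to print after each match
# $2 = regular expression to look for (input is read from stdin)
after_match() {
    awk -v count="$1" -v pattern="$2" 'n && n--; $0 ~ pattern { n = count }'
}

# Prints the single line following each foo: 1, then a.
printf 'foo\n1\n2\n3\nfoo\na\nb\nc\n' | after_match 1 foo
```

Note that values passed with -v go through escape-sequence processing, so a pattern containing backslashes would need them doubled.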

Extra extra: the fact that the full path /usr/xpg4/bin/awk is used tells me this code is intended for a Solaris machine, where /usr/bin/awk is the old, pre-POSIX awk, which is totally broken and should be avoided at all costs.

Commerce answered 23/8, 2013 at 16:56 Comment(0)
C
3

Explanation

awk expressions have the following form:

condition action; NEXT_EXPRESSION

If condition is true, the action(s) will be executed. Note also that if condition is true but the action has been omitted, awk will execute print (the default action).

You have two expressions in your code that will get executed on every line of input:

_&&_--          ;
/PATTERN/{_=2}

The two are separated by a ;. Since the default action print runs whenever the action is omitted, this is the same as:

_&&_--    {print};
/PATTERN/ {_=2}

In your example _ is a variable name. awk initializes it automatically, so it already has the value 0 before its first use on the first line of input.

The first condition is therefore (0) && (0--), which evaluates to false, so awk does not print. Because && short-circuits, the decrement is not even executed while _ is 0.

If the pattern is found, _ is set to 2, which makes the first condition (2) && (2) on the next line and (1) && (1) on the line after that, because _ is decremented only after its value has been used in the condition. Both evaluate to true, so awk prints those two lines.

However, nice puzzle ;)
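To watch the counter move, this sketch records its value as seen before the condition runs on each line and prints the list at the end. It uses n instead of _ for readability; seen and sep are made-up helper variables, and the four-line input is made up as well.

```shell
# Record the counter before the condition is tested on each line.
# (n + 0) forces the still-unset variable to display as 0.
trace=$(printf 'foo\n1\n2\n3\n' | awk '
    { seen = seen sep (n + 0); sep = "," }
    n && n--
    /foo/ { n = 2 }
    END { print "n before each line: " seen }
')
printf '%s\n' "$trace"
```

This should print the two lines 1 and 2, then "n before each line: 0,2,1,0": the counter is 2 and then 1 when the two printed lines are tested, and it only drops afterwards.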

Carnify answered 23/8, 2013 at 17:5 Comment(5)
Slight mistake: post-decrement has higher precedence than logical AND, so explicitly it is (0) && (0--), as (0&&0)-- evaluates to -1. - Commerce
Nice to know. Let me update the post accordingly. - Carnify
@sudo_O awk '_&&_--; 1{printf "%i\n", _};/Pattern/{_=2}' input.txt I don't understand the output of the line above. If the decrement takes place first (which conforms to the awk docs, you are right), then the right side of && should be 1 in the first test and 0 in the second test after the pattern has matched. So I would expect just one line to be printed. What am I missing here? - Carnify
It's subtle, but maybe this helps: ideone.com/JlT8BG. Notice the pattern of n before the condition, 0,2,1,0,2,1,0: that is because the post-decrement happens post (after) the condition has been evaluated. Whether to skip or to echo is decided first, then n is decremented. - Commerce
@sudo_O Yeah, I thought about this while having a walk. Operator precedence has nothing to do with whether the decrement happens before or after the condition has been evaluated. Sorry, shame on me, I should know that! ;) ... Will now update the post so that it is completely correct... - Carnify
C
2

Wonderfully obscure. Will update when time allows.

_ is being used as a variable name. && is the logical AND operator, so the condition is true only when both of its operands are true. Once the value of _ is reduced to zero, the left-hand side of the && is false, the right-hand side is never evaluated, and no output is generated.
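The && behaviour can be checked in isolation with a BEGIN block. This is just a sketch; n and r are illustrative names, with r holding the result of the test.

```shell
# Counter already at zero: the right-hand side n-- is skipped, so n stays 0.
zero=$(awk 'BEGIN { n = 0; r = (n && n--); print r, n }')

# Counter at 2: the test sees 2 && 2 (true), and only then does n drop to 1.
two=$(awk 'BEGIN { n = 2; r = (n && n--); print r, n }')

# Prints "0 0" and then "1 1".
printf '%s\n%s\n' "$zero" "$two"
```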

print -- "
xxxxx
yyyy
PATTERN
zzz
aa
bbb
ccc
ddd" | awk '_&&_--;/PATTERN/{_=2}'

output

zzz
aa

debug version

print -- "
xxxxx
yyyy
PATTERN
zzz
aa
bbb
ccc
ddd" | awk '_&&_--;{print "_="_;print _&&_};/PATTERN/{_=2;print "_="_ }'

output

_=
0
_=
0
_=
0
_=
0
_=2
zzz
_=1
1
aa
_=0
0
_=0
0
_=0
0
_=0
0
Colossae answered 23/8, 2013 at 16:53 Comment(0)
