I want Perl to parse some source code text and identify certain things in it. Example code:
use strict;
use warnings;
$/ = undef;
while (<DATA>) {
s/(\w+)(\s*<=.*?;)/$1_yes$2/gs;
print;
}
__DATA__
always @(posedge clk or negedge rst_n)
if(!rst_n)begin
d1 <= 0; //perl_comment_4
//perl_comment_5
d2 <= 1 //perl_comment_6
+ 2;
end
else if( d3 <= d4 && ( d5 <= 3 ) ) begin
d6 <= d7 +
(d8 <= d9 ? 1 : 0);
//perl_comment_7
d10 <= d11 <=
d12
+ d13
<= d14 ? 1 : 0;
end
The match target is something that meets all of the following:
(1) It begins with the combination word\s*<=. Here \s* may be 0 or more spaces, newlines, or tabs.
(2) The aforementioned "combination" must be outside any pair of ( and ).
(3) If multiple "combinations" appear consecutively, take the first one as the beginning. (Something like "greedy" matching at the left boundary.)
(4) It ends with the first ; after the "combination" mentioned in (1).
There may be word\s*<= and ; in code comments (there may be anything in comments), which makes things more complicated. To make life easier, I have already pre-processed the text, scanning for comments and replacing them with placeholders like //perl_comment_6. (This solution seems rather cumbersome and stupid. Are there any smarter, more elegant solutions?)
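For reference, the pre-processing step described above could look roughly like the sketch below. The helper names mask_comments and restore_comments are made up for illustration, and the sketch assumes the source only uses // and /* */ comments and has no string literals containing those markers.

```perl
use strict;
use warnings;

# Hypothetical helpers sketching the masking step; the names are illustrative.
# Assumes only // and /* */ comments, and no string literals containing them.

# Replace every comment with //perl_comment_N, where N indexes the saved list.
sub mask_comments {
    my ($text) = @_;
    my @saved;
    $text =~ s{(//[^\n]*|/\*.*?\*/)}{
        push @saved, $1;
        "//perl_comment_" . $#saved;
    }ges;
    return ($text, \@saved);
}

# Put the original comments back in place of their placeholders.
sub restore_comments {
    my ($text, $saved) = @_;
    $text =~ s{//perl_comment_(\d+)}{$saved->[$1]}g;
    return $text;
}
```

The renaming substitution would then run on the masked text, between the two calls, so that nothing inside a comment can be mistaken for word\s*<= or ;.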
What I want to do:
For every matched word\s*<=, replace word with word_yes. For the example code, d1, d2, d6 and d10 should be replaced by d1_yes, d2_yes, d6_yes and d10_yes, respectively, and all other parts of the text should remain unchanged.
In my current code I use s/(\w+)(\s*<=.*?;)/$1_yes$2/gs, which correctly recognizes d1, d2 and d10, but fails to recognize d6 and mistakenly recognizes d3.
Any suggestions? Thanks in advance~
[...] if, so your regex is a little bit limited in its scope, but I really think it's tidy and neat. Is it possible to make it applicable in more scenarios? – Swastika
[...] if\s* and it will be a rather generic pattern. – Stephan
[...] (\((?>[^()]|(?1))*\))(*SKIP)(*F)|(\w+)(\s*<=[^;]*) tomorrow ;-) – Swastika
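As a rough sketch, the pattern quoted in the comments above can be dropped into a substitution like the one below. The helper name mark_assignments is made up for illustration. The sketch assumes comments have already been replaced by placeholders as described in the question, that parentheses are balanced, and that Perl 5.10 or later is available (for the (?1) recursion and the (*SKIP)(*F) verbs).

```perl
use strict;
use warnings;

# mark_assignments is a hypothetical helper name, not from the original post.
# Assumes comments were already replaced by placeholders and parentheses balance.
sub mark_assignments {
    my ($text) = @_;
    $text =~ s{
        ( \( (?> [^()] | (?1) )* \) )   # a balanced parenthesized group ...
        (*SKIP)(*F)                     # ... is consumed, then discarded as a non-match
        |
        (\w+) ( \s* <= [^;]* )          # word, optional whitespace, "<=", up to the next ";"
    }{${2}_yes$3}gx;
    return $text;
}

# Small excerpt from the example in the question.
my $sample = "else if( d3 <= d4 && ( d5 <= 3 ) ) begin\n"
           . "d6 <= d7 +\n"
           . "(d8 <= d9 ? 1 : 0);\n"
           . "end\n";
print mark_assignments($sample);
```

The first alternative swallows each parenthesized group whole, so d3 and d5 are never offered to the second alternative, while [^;]* carries a real match through to the next ; and so also passes over d8 and the chained <= after d10. Note the ${2}_yes form in the replacement, which keeps the capture number clearly separated from the suffix.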