grep -P no longer works. How can I rewrite my searches?
C

13

153

It looks like the new version of OS X no longer supports grep -P and as such has made some of my scripts stop working, for example:

var1=`grep -o -P '(?<=<st:italic>).*(?=</italic>)' file.txt`

I need to capture grep's result to a variable and I need to use zero-width assertions, as well as \K:

var2=`grep -P -o '(property:)\K.*\d+(?=end)' file.txt`

Any alternatives would be greatly appreciated.

Counteract answered 20/5, 2013 at 20:59 Comment(10)
how about installing gnu grep?Goldfinch
Are you sure it's the -P? Mine has it.Surplice
@Surplice It was removed in 10.8.Downswing
@LauriRanta I have 10.8... Interestingly, it's still in the usage but actually using it doesn't workSurplice
Cannot install anything on these machines unfortunately.Counteract
@Goldfinch care to elaborate on how one might do that?Ashok
It really seems to have been removed, what a dick move of Apple if this happened intentionally.Gensmer
@AdrianFrühwirth OS X's grep actually changed from grep (GNU grep) 2.5.1 in 10.7 to grep (BSD grep) 2.5.1-FreeBSD in 10.8. I guess it was because of GPL. The FreeBSD grep is also based on GNU grep and both versions of grep are from 2002. --label and -u / --unix-byte-offsets were also removed in 10.8. -z / --decompress, -J / --bz2decompress, --exclude-dir, --include-dir, -S, -O, and -p were added in 10.8. -Z changed from --null to --decompress.Downswing
@LauriRanta Thanks for the info, that explains it...much appreciated. I don't have an OS X/*BSD installation handy but read that BSD grep is way slower than GNU grep, can you confirm if this is still the case on 10.8 (compared to GNU grep installed via homebrew, for example)? I'm just curious.Gensmer
The FreeBSD grep that comes with OS X is from 2002, and wiki.freebsd.org/BSDgrep still says that "the only TODO item is improving performance", so yeah. time grep aa /usr/share/dict/words>/dev/null takes about 0.09 seconds with OS X's grep and about 0.01 seconds with a new GNU grep on repeated runs on my iMac.Downswing
R
91

If you want to do the minimal amount of work, change

grep -P 'PATTERN' file.txt

to

perl -nle'print if m{PATTERN}' file.txt

and change

grep -o -P 'PATTERN' file.txt

to

perl -nle'print $& while m{PATTERN}g' file.txt

So you get:

var1=`perl -nle'print $& while m{(?<=<st:italic>).*(?=</italic>)}g' file.txt`
var2=`perl -nle'print $& while m{(property:)\K.*\d+(?=end)}g' file.txt`

In your specific case, you can achieve simpler code with extra work.

var1=`perl -nle'print for m{<st:italic>(.*)</italic>}g' file.txt`
var2=`perl -nle'print for /property:(.*\d+)end/g' file.txt`
Rubel answered 20/5, 2013 at 21:27 Comment(7)
This works great, but it returns all matches, whereas the grep I used only returned the first match. Any idea how to return just the first match?Counteract
@ironintention: add | head -1 to the end of the pipeline.Hornswoggle
grep always returns all matching lines (unless you use one of the options where it prints none at all). Anyway, if (/.../) { print $1; last; } will cause it to only print the first match.Rubel
I used this to get out the urls of a sitemap - thanks mate, would not have made it without your post! perl -nle'print $1 if m{<loc>(.*)</loc>}' sitemap.xmlPhooey
@Christian, Would only take 3 lines to do it with a proper XML parser such as XML::LibXML. (Key line: say $_->textContent for $doc->findnodes('//loc');)Rubel
@Ikegami I needed this only one time for a specific use case. The result will be thrown away anyway. I am happy with it right now. Anyway, thanks for letting me know about libxml. There are times I regret my lack of perl-fu.Phooey
Adjusted to handle multiple matches per line like grep -oRubel
A
150

If your scripts are for your use only, you can install grep from homebrew-core using brew:

brew install grep 

Then it's available as ggrep (GNU grep). It doesn't replace the system grep (for that, you need to put the installed grep before the system one on the PATH).

The version installed by brew includes the -P option, so you don't need to change your scripts.

If you need to use these commands with their normal names, you can add a "gnubin" directory to your PATH from your bashrc like:

PATH="/usr/local/opt/grep/libexec/gnubin:$PATH"

You can add this line to your ~/.bashrc or ~/.zshrc to keep it for new sessions.

Please see here for a discussion of the pros and cons of the old --with-default-names option and its (recent) removal.

Ashok answered 28/3, 2014 at 4:44 Comment(8)
@Assr what didn't work? Likely the path isn't set properly - what's the output of which grep? Should be /usr/local/bin/grep. It's a bit mean to downvote before you've checked carefully that there is a problem!Ashok
indeed that is it! But it did not put the installed grep before the system one on the PATH as you said. I'm happy to upvote, I really do not understand the fuss about the point system on this website (can't you just hack around that?). Glad you pointed this out though, I'm setting up an alias for grep ASAP!Assr
probably better to add /usr/local/bin to the front of your PATH. Brew is supposed to set that up I believe? Did you use --default-names? Anyway, glad it works (: Not sure about hacking around it, but I think the point system is one of the reasons this site is such a good resource.Ashok
yes I did use --default-names and brew. Not sure if putting /usr/local/bin in the front of your path is better than an alias, just an alternativeAssr
Great answer, the only approach that worked for me. It would be great if you could add how to set up an alias to your answer.Micrometry
an alternative to --with-default-names is to add alias grep='ggrep' to your bash profile and let brew dupes keep their prefixIldaile
--with-default-names is removed from brew. I had to brew install grep to get ggrep and then do as @Ildaile says and do alias grep='ggrep' .Namedropper
An alias may not work if your script runs in a sub-shell. In that case, brew install grep and then add it to your path: export PATH="/usr/local/opt/grep/libexec/gnubin:$PATH"Thoron
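Since bash does not expand aliases in non-interactive scripts by default, one option is a small wrapper function defined in the script itself. The following is only a sketch: grep_o_pcre and GREP_PATTERN are hypothetical names, and the function assumes that ggrep, if present, is Homebrew's GNU grep.

```shell
# grep_o_pcre is a hypothetical helper name, not part of any tool.
# Usage: grep_o_pcre PATTERN [FILE...]
grep_o_pcre() {
  pattern=$1
  shift
  if command -v ggrep >/dev/null 2>&1; then
    ggrep -oP -- "$pattern" "$@"          # Homebrew's GNU grep
  elif echo x | grep -P x >/dev/null 2>&1; then
    grep -oP -- "$pattern" "$@"           # system grep that still has -P
  else
    # Fallback: pass the pattern through the environment so that no
    # shell quoting ends up inside the Perl code.
    GREP_PATTERN=$pattern perl -nle 'print $& while /$ENV{GREP_PATTERN}/g' "$@"
  fi
}

var2=$(printf 'property:abc 123end\n' | grep_o_pcre '(property:)\K.*\d+(?=end)')
echo "$var2"
```

The Perl branch is not a perfect stand-in: it does not prefix file names when several files are given, and its exit status does not reflect whether anything matched.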
E
13

Install ack and use it instead. Ack is a grep replacement written in Perl. It has full support for Perl regular expressions.

Evacuee answered 20/5, 2013 at 21:27 Comment(5)
I'd like to check this out but this is for work computers so we cannot install anythingCounteract
@ironintention: If you can install Perl modules, you're good. Even if you can't add to the local Perl installation you can always use local::lib.Evacuee
ack is designed to be self-contained; you don't need to actually install it. If you can save a file, mark it as executable, and update your PATH if necessary, you are good to go.Keneth
Can you please show the ack syntax that replaces the above?Pessimist
@FullDecent: It's almost identical: ack -o '(property:)\K.*\d+(?=end)' file.txt (-o means the same thing, but you don't need the -P with ack)Evacuee
F
11

OS X tends to provide BSD rather than GNU tools. It does come with egrep however, which is probably all you need to perform regex searches.

example: egrep 'fo+b?r' foobarbaz.txt

A snippet from the OSX grep man page:

grep is used for simple patterns and basic regular expressions (BREs); egrep can handle extended regular expressions (EREs).

Flyover answered 30/3, 2016 at 13:36 Comment(2)
Direct invocation as egrep is deprecated. The same ability is also available as grep -E. It's... a sad shadow of Perl, lacking lookaround assertions, most of the backslash escapes, options, conditionals, etc :( Power users will hate it, but it does at least do the job.Peonage
Thanks. grep -E instead of grep -P was exactly what I needed.Fuzee
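Where only grep -E is available, lookarounds can sometimes be emulated by matching the surrounding context too and stripping it afterwards. This is a sketch under the assumption that the context is fixed text, as in the question's second pattern; the input line is hypothetical.

```shell
# Hypothetical input line.
line='x property:abc 123end y'

# grep -oE isolates the whole match, context included;
# sed -E then strips the parts the lookarounds would have excluded.
var2=$(printf '%s\n' "$line" \
  | grep -oE 'property:.*[0-9]+end' \
  | sed -E 's/^property:(.*)end$/\1/')

echo "$var2"
```

Here [0-9] replaces \d, which is not part of ERE.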
K
8

Use perl:

perl -ne 'print if /regex/' files ...

If you need more grep options (I see you would like -o at least) there are various pgrep implementations floating around the net, many of them in Perl.

If "almost Perl" is good enough, PCRE ships with pcregrep.

Keneth answered 20/5, 2013 at 21:3 Comment(0)
T
7

There is another alternative: pcregrep.

Pcregrep is a grep with Perl-compatible regular expressions. Its usage is the same as grep's, except that Perl-compatible patterns are the default, so the -P flag is dropped. Your scripts need only that small change.

It can be installed with homebrew:

brew install pcre

Transpicuous answered 27/7, 2014 at 11:37 Comment(3)
Error: No available formula for pcregrepMelodymeloid
GaborMarton, I edited your answer to include @Martin 's correcting comment, and had to move the formatting around a bit to get over the minimum changes.Bynum
To search through text files that are larger than 20.4 KB, for the equivalent of grep -o -P 'PATTERN' file.txt, you must use pcregrep -o --buffer-size=100K 'PATTERN' file.txt. Note that there is no -P option for pcregrep. Note: pcregrep is also available for Linux: command-not-found.com/pcregrepNautilus
B
4

How about using the '-E' option? It works fine for me. For example, to check for the php_zip, php_xml, and php_gd2 extensions in the output of php -m, I use:

php -m | grep -E '(zip|xml|gd2)'
Baseboard answered 8/12, 2016 at 1:52 Comment(1)
this works. Mac uses FreeBSD grep and Linux uses GNU grep...so this fix worked on my macOS sierraBathetic
N
3

Equivalent of the accepted answer, but without the requirement of the -P switch, which was not present on both machines I had available.

find . -type f -exec perl -nle 'print $& if m{\r\n}' {} ';' -exec perl -pi -e 's/\r\n/\n/g' {} '+'
Nucleoprotein answered 18/5, 2016 at 8:17 Comment(0)
E
2

This one worked for me:

    awk  -F":" '/PATTERN/' file.txt
Encincture answered 15/8, 2016 at 19:57 Comment(0)
R
0

Another Perl solution for -P

var1=$( perl -ne 'print $1 if m#<st:italic>([^<]+)</st:italic># ' file.txt)
Rahm answered 20/5, 2013 at 21:9 Comment(0)
W
0

Use a Perl one-liner regex and pass it the input (the output of find, for example) through a pipe. I used a lookbehind (to get the src links in the HTML) and a lookahead for ", and piped the output of curl (the HTML) to it.

bash-3.2# curl stackoverflow.com | perl -0777 -ne '$a=1;while(m/(?<=src\=\")(.*)(?=\")/g){print "Match #".$a." "."$&\n";$a+=1;}'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  239k  100  239k    0     0  1911k      0 --:--:-- --:--:-- --:--:-- 1919k
Match #1 //ajax.googleapis.com/ajax/libs/jquery/1.12.4/jquery.min.js
Match #2 //cdn.sstatic.net/Js/stub.en.js?v=fb6157e02696
Match #3 https://ssum-sec.casalemedia.com/usermatch?s=183712&amp;cb=https%3A%2F%2Fengine.adzerk.net%2Fudb%2F22%2Fsync%2Fi.gif%3FpartnerId%3D1%26userId%3D
Match #4 //i.stack.imgur.com/817gJ.png" height="16" width="18" alt="" class="sponsor-tag-img">elasticsearch</a> <a href="/questions/tagged/elasticsearch-2.0" class="post-tag" title="show questions tagged &#39;elasticsearch-2.0&#39;" rel="tag">elasticsearch-2.0</a> <a href="/questions/tagged/elasticsearch-dsl" class="post-tag" title="show questions tagged &#39;elasticsearch-dsl&#39;" rel="tag
Match #5 //i.stack.imgur.com/817gJ.png" height="16" width="18" alt="" class="sponsor-tag-img">elasticsearch</a> <a href="/questions/tagged/sharding" class="post-tag" title="show questions tagged &#39;sharding&#39;" rel="tag">sharding</a> <a href="/questions/tagged/master" class="post-tag" title="show questions tagged &#39;master&#39;" rel="tag
Match #6 //i.stack.imgur.com/tKsDb.png" height="16" width="18" alt="" class="sponsor-tag-img">android</a> <a href="/questions/tagged/linux" class="post-tag" title="show questions tagged &#39;linux&#39;" rel="tag">linux</a> <a href="/questions/tagged/camera" class="post-tag" title="show questions tagged &#39;camera&#39;" rel="tag
Match #7 //i.stack.imgur.com/tKsDb.png" height="16" width="18" alt="" class="sponsor-tag-img">android</a> <a href="/questions/tagged/firebase" class="post-tag" title="show questions tagged &#39;firebase&#39;" rel="tag"><img src="//i.stack.imgur.com/5d55j.png" height="16" width="18" alt="" class="sponsor-tag-img">firebase</a> <a href="/questions/tagged/firebase-authentication" class="post-tag" title="show questions tagged &#39;firebase-authentication&#39;" rel="tag
Match #8 //i.stack.imgur.com/tKsDb.png" height="16" width="18" alt="" class="sponsor-tag-img">android</a> <a href="/questions/tagged/ios" class="post-tag" title="show questions tagged &#39;ios&#39;" rel="tag">ios</a> <a href="/questions/tagged/in-app-purchase" class="post-tag" title="show questions tagged &#39;in-app-purchase&#39;" rel="tag">in-app-purchase</a> <a href="/questions/tagged/piracy-protection" class="post-tag" title="show questions tagged &#39;piracy-protection&#39;" rel="tag
Match #9 //i.stack.imgur.com/tKsDb.png" height="16" width="18" alt="" class="sponsor-tag-img">android</a> <a href="/questions/tagged/unity3d" class="post-tag" title="show questions tagged &#39;unity3d&#39;" rel="tag">unity3d</a> <a href="/questions/tagged/vr" class="post-tag" title="show questions tagged &#39;vr&#39;" rel="tag
Match #10 http://pixel.quantserve.com/pixel/p-c1rF4kxgLUzNc.gif" alt="" class="dno
bash-3.2# date
Mon Oct 24 20:57:11 EDT 2016
Woodham answered 25/10, 2016 at 1:13 Comment(0)
P
0

I suddenly had this same problem with grep after a Docker rebuild. I found the solution here: https://github.com/firehol/firehol/issues/325

I just replaced -oP with -oE:

echo $some_var | grep -oE '\b[0-9a-f]{5,40}\b' | head -1
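Swapping -P for -E only works when the pattern uses no Perl-only features. \b itself is not part of POSIX ERE; GNU grep accepts it as an extension, and whether another grep does is worth checking. A sketch of a variant that avoids \b by using the -w option, with a hypothetical value for some_var:

```shell
# Hypothetical value; any string containing a hex ID would do.
some_var='commit 3f2a9bc done'

# -w asks grep to accept only matches that form whole words,
# which stands in for the \b anchors on both sides.
id=$(printf '%s\n' "$some_var" | grep -owE '[0-9a-f]{5,40}' | head -n 1)

echo "$id"
```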

Proper answered 17/9, 2021 at 9:19 Comment(0)
P
-1

Some more options, these also set correct exit status:

  • equivalent to grep -P PATTERN FILE :

    perl -e'while(<>){if( (m!PATTERN!) ){$ok++;print}};if(!($ok)){exit 1}' FILE

  • equivalent to grep -P -i PATTERN FILE :

    perl -e'while(<>){if( (m!PATTERN!i) ){$ok++;print}};if(!($ok)){exit 1}' FILE

  • equivalent to grep -v -P PATTERN FILE :

    perl -e'while(<>){if( !(m!PATTERN!) ){$ok++;print}};if(!($ok)){exit 1}' FILE

For a cleaner solution use this gist - implemented switches are: -A , -B , -v , -P , -i : https://gist.github.com/torson/bd6931bda0035c4884b2a8c4c64a33b2

Pounds answered 10/2, 2022 at 20:35 Comment(1)
Probably lose the useless uses of catKeneth
