What's the difference between vim regex and normal regex?
Asked Answered
V

5

40

I noticed that vim's substitute regex is a bit different from other regexp. What's the difference between them?

Vagabondage answered 5/10, 2010 at 14:12 Comment(2)
Define "normal regex". Every regex engine is at least subtly different from almost every other.Ignatzia
Perhaps a comparison to Perl regex (PCRE) would be most expected but I was looking for a summary of major differences to grep regexes especially which of grep basic, extended ... or whether choosing PCRE with grep would most closely resemble Vim regex of which level of magic, @IgnatziaMiddle
N
38

If by "normal regex" you mean Perl-Compatible Regular Expressions (PCRE), then the Vim help provides a good summary of the differences between Vim's regexes and Perl's:

:help perl-patterns

Here's what it says as of Vim 7.2:

9. Compare with Perl patterns                           *perl-patterns*

Vim's regexes are most similar to Perl's, in terms of what you can do.  The
difference between them is mostly just notation;  here's a summary of where
they differ:

Capability                      in Vimspeak     in Perlspeak ~
----------------------------------------------------------------
force case insensitivity        \c              (?i)
force case sensitivity          \C              (?-i)
backref-less grouping           \%(atom\)       (?:atom)
conservative quantifiers        \{-n,m}         *?, +?, ??, {}?
0-width match                   atom\@=         (?=atom)
0-width non-match               atom\@!         (?!atom)
0-width preceding match         atom\@<=        (?<=atom)
0-width preceding non-match     atom\@<!        (?!atom)
match without retry             atom\@>         (?>atom)

Vim and Perl handle newline characters inside a string a bit differently:

In Perl, ^ and $ only match at the very beginning and end of the text,
by default, but you can set the 'm' flag, which lets them match at
embedded newlines as well.  You can also set the 's' flag, which causes
a . to match newlines as well.  (Both these flags can be changed inside
a pattern using the same syntax used for the i flag above, BTW.)

On the other hand, Vim's ^ and $ always match at embedded newlines, and
you get two separate atoms, \%^ and \%$, which only match at the very
start and end of the text, respectively.  Vim solves the second problem
by giving you the \_ "modifier":  put it in front of a . or a character
class, and they will match newlines as well.

Finally, these constructs are unique to Perl:
- execution of arbitrary code in the regex:  (?{perl code})
- conditional expressions:  (?(condition)true-expr|false-expr)

...and these are unique to Vim:
- changing the magic-ness of a pattern:  \v \V \m \M
   (very useful for avoiding backslashitis)
- sequence of optionally matching atoms:  \%[atoms]
- \& (which is to \| what "and" is to "or";  it forces several branches
   to match at one spot)
- matching lines/columns by number:  \%5l \%5c \%5v
- setting the start and end of the match:  \zs \ze
Nihi answered 5/10, 2010 at 20:1 Comment(2)
What's the meaning of embedded newlines in the doc?Currier
Not only does it not provide a good summary, its summary is almost entirely worthless. Have a downvote. Why is this selected as an answer?Dendrochronology
W
44

"Regular expression" really defines algorithms, not a syntax. What that means is that different flavours of regular expressions will use different characters to mean the same thing; or they'll prefix some special characters with backslashes where others don't. They'll typically still work in the same way.

Once upon a time, POSIX defined the Basic Regular Expression syntax (BRE), which Vim largely follows. Very soon afterwards, an Extended Regular Expression (ERE) syntax proposal was also released. The main difference between the two is that BRE tends to treat more characters as literals - an "a" is an "a", but also a "(" is a "(", not a special character - and so involves more backslashes to give them "special" meaning.

The discussion of complex differences between Vim and Perl on a separate comment here is useful, but it's also worth mentioning a few of the simpler ways in which Vim regexes differ from the "accepted" norm (by which you probably mean Perl.) As mentioned above, they mostly differ in their use of a preceding backslash.

Here are some obvious examples:

Perl    Vim     Explanation
---------------------------
x?      x\=     Match 0 or 1 of x
x+      x\+     Match 1 or more of x
(xyz)   \(xyz\) Use brackets to group matches
x{n,m}  x\{n,m} Match n to m of x
x*?     x\{-}   Match 0 or 1 of x, non-greedy
x+?     x\{-1,} Match 1 or more of x, non-greedy
\b      \< \>   Word boundaries
$n      \n      Backreferences for previously grouped matches

That gives you a flavour of the most important differences. But if you're doing anything more complicated than the basics, I suggest you always assume that Vim-regex is going to be different from Perl-regex or Javascript-regex and consult something like the Vim Regex website.

Widera answered 13/2, 2013 at 10:38 Comment(5)
this was the best answer and helped me figure out exactly what I was looking for thanks.Quiz
Great link to the vimregex site... though it may need changes for "newer" versions as it was last updated in 2002. :|Stephan
I think you mean: x*? x\{-} Match 0 or MORE of x, non-greedyLifeordeath
The vimregex website link is not valid anymore :(Heptateuch
This answer really needs to clarify the level of magic in use with the Vim expressions. For those unaware try :h magic.Middle
N
38

If by "normal regex" you mean Perl-Compatible Regular Expressions (PCRE), then the Vim help provides a good summary of the differences between Vim's regexes and Perl's:

:help perl-patterns

Here's what it says as of Vim 7.2:

9. Compare with Perl patterns                           *perl-patterns*

Vim's regexes are most similar to Perl's, in terms of what you can do.  The
difference between them is mostly just notation;  here's a summary of where
they differ:

Capability                      in Vimspeak     in Perlspeak ~
----------------------------------------------------------------
force case insensitivity        \c              (?i)
force case sensitivity          \C              (?-i)
backref-less grouping           \%(atom\)       (?:atom)
conservative quantifiers        \{-n,m}         *?, +?, ??, {}?
0-width match                   atom\@=         (?=atom)
0-width non-match               atom\@!         (?!atom)
0-width preceding match         atom\@<=        (?<=atom)
0-width preceding non-match     atom\@<!        (?!atom)
match without retry             atom\@>         (?>atom)

Vim and Perl handle newline characters inside a string a bit differently:

In Perl, ^ and $ only match at the very beginning and end of the text,
by default, but you can set the 'm' flag, which lets them match at
embedded newlines as well.  You can also set the 's' flag, which causes
a . to match newlines as well.  (Both these flags can be changed inside
a pattern using the same syntax used for the i flag above, BTW.)

On the other hand, Vim's ^ and $ always match at embedded newlines, and
you get two separate atoms, \%^ and \%$, which only match at the very
start and end of the text, respectively.  Vim solves the second problem
by giving you the \_ "modifier":  put it in front of a . or a character
class, and they will match newlines as well.

Finally, these constructs are unique to Perl:
- execution of arbitrary code in the regex:  (?{perl code})
- conditional expressions:  (?(condition)true-expr|false-expr)

...and these are unique to Vim:
- changing the magic-ness of a pattern:  \v \V \m \M
   (very useful for avoiding backslashitis)
- sequence of optionally matching atoms:  \%[atoms]
- \& (which is to \| what "and" is to "or";  it forces several branches
   to match at one spot)
- matching lines/columns by number:  \%5l \%5c \%5v
- setting the start and end of the match:  \zs \ze
Nihi answered 5/10, 2010 at 20:1 Comment(2)
What's the meaning of embedded newlines in the doc?Currier
Not only does it not provide a good summary, its summary is almost entirely worthless. Have a downvote. Why is this selected as an answer?Dendrochronology
F
12

Try Vim's very magic regex mode. It behaves more like traditional regex, just prepend your pattern with \v. See :help /\v for more info. I love it.

Fordham answered 28/10, 2014 at 4:43 Comment(4)
by traditional regex do you mean Basic Regular expressions (BRE) or PCRE?Depalma
pcre :) more texttextFordham
Why used :help /\v not :help \v? What's the slash / for?Currier
This is awesome in fact! I love it too ! Too bad this cannot become the default? Refer to #3760944, but with drawbacks.Wheedle
M
5

There is a plugin called eregex.vim which translates from PCRE (Perl-compatible regular expressions) to Vim's syntax. It takes over a thousand lines of vim to achieve that translation! I guess it also serves as precise documentation of the differences.

Mae answered 11/1, 2014 at 20:10 Comment(0)
L
1

Too broad question. Run vim and type :help pattern.

Leery answered 5/10, 2010 at 14:15 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.