Split Ruby regex over multiple lines
M

4

95

This might not be quite the question you're expecting! I don't want a regex that will match over line-breaks; instead, I want to write a long regex that, for readability, I'd like to split onto multiple lines of code.

Something like:

"bar" =~ /(foo|
           bar)/  # Doesn't work!
# => nil. Would like => 0

Can it be done?

Magel answered 21/9, 2010 at 16:3 Comment(0)
M
61

You need to use the /x modifier, which enables free-spacing mode.

In your case:

"bar" =~ /(foo|
           bar)/x
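
As a rough illustration of what free-spacing mode changes, the sketch below compares the question's pattern with and without /x. The constant names and the greeting pattern are made up for this example. It also shows one way to match a literal space, which /x would otherwise ignore.

```ruby
# Without /x, the line break and indentation are literal parts of the pattern.
PLAIN = /(foo|
          bar)/

# With /x, unescaped whitespace (including the line break) is ignored.
SPACED = /(foo|
           bar)/x

p "bar" =~ PLAIN   # nil: the second alternative expects a newline and spaces first
p "bar" =~ SPACED  # 0

# In free-spacing mode a literal space has to be written explicitly.
GREETING = /
  hello
  [ ]    # a literal space, written as a character class
  world
/x

p GREETING.match?("hello world")  # true
p GREETING.match?("helloworld")   # false
```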
Marya answered 21/9, 2010 at 16:16 Comment(2)
This answer could be improved by replacing the link with a more detailed explanation.Interrupter
Like this: regexp = /(\d+)(\d+)/xQuotation
M
145

Using %r with the x option is the preferred way to do this.

See this example from the GitHub Ruby style guide:

regexp = %r{
  start         # some text
  \s            # white space char
  (group)       # first group
  (?:alt1|alt2) # some alternation
  end
}x

regexp.match? "start groupalt2end"

https://github.com/github/rubocop-github/blob/master/STYLEGUIDE.md#regular-expressions
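
For comparison, here is a sketch of the same pattern written between slashes, with the capture group read back from the match. The constant name and the sample strings are only for illustration.

```ruby
SLASH_FORM = /
  start         # some text
  \s            # white space char
  (group)       # first group
  (?:alt1|alt2) # some alternation
  end
/x

match = SLASH_FORM.match("start groupalt2end")
p match[0]  # "start groupalt2end"
p match[1]  # "group"

# The \s is still required: without the space in the input there is no match.
p SLASH_FORM.match?("startgroupalt2end")  # false
```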

Mcnully answered 11/12, 2013 at 16:39 Comment(3)
The example to follow. Comments inside the regex do wonders for maintainability.Cashandcarry
Or with / instead of %r, because rubocop complains if a regex isn't between slashes. Their style guide also recommends it like that: github.com/bbatsov/ruby-style-guide#regular-expressionsHizar
I can't seem to get this to work, ruby 3.1.2p20.Gerontology
T
5

Rather than cutting the regex mid-expression, I suggest breaking it into parts:

full_rgx = /This is a message\. A phone number: \d{10}\. A timestamp: \d*?/

msg = /This is a message\./
phone = /A phone number: \d{10}\./
tstamp = /A timestamp: \d*?/

/#{msg} #{phone} #{tstamp}/

I do the same for long strings.
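
The two approaches can also be combined. The sketch below assumes the same three parts and joins them inside a /x regex. An interpolated Regexp is inserted as a `(?-mix:...)` group, so each part should keep its own literal spaces even though the outer pattern is in free-spacing mode. The constant names and sample strings are invented for the example.

```ruby
MSG    = /This is a message\./
PHONE  = /A phone number: \d{10}\./
TSTAMP = /A timestamp: \d*?/

# Each interpolated part carries its own flags, so /x here only affects
# the glue between the parts, not the spaces inside them.
FULL = /
  #{MSG}    [ ]   # message, then a literal space
  #{PHONE}  [ ]   # phone number, then a literal space
  #{TSTAMP}
/x

p MSG.to_s  # "(?-mix:This is a message\\.)"
p FULL.match?("This is a message. A phone number: 1234567890. A timestamp: 42")
```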

Timpani answered 4/6, 2020 at 6:36 Comment(1)
I went with this answer over the others recommending the /x modifier because I would have had to sprinkle \s everywhere. Breaking up the regex was much faster and arguably easier to read and maintain.Coniferous
B
3

You can also turn on free-spacing mode from inside the pattern with (?x):

"bar" =~ /(?x)foo|
         bar/
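
A small sketch of the two inline forms, as I understand them: a bare `(?x)` applies to the rest of the enclosing group or pattern, while `(?x: ... )` limits free-spacing mode to that group. The constant names and the `id:` pattern are made up for illustration.

```ruby
# (?x) at the start switches on free-spacing mode for the rest of the pattern.
INLINE = /(?x)foo|
          bar/

p "bar" =~ INLINE  # 0

# (?x: ... ) limits free-spacing mode to the group, so the space after
# "id:" stays literal while the spaces inside the group are ignored.
SCOPED = /id: (?x: \d+ - \d+ )/

p SCOPED.match?("id: 12-34")  # true
p SCOPED.match?("id:12-34")   # false
```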
Brocade answered 11/12, 2013 at 16:28 Comment(3)
This answer was helpful to my situation, but only after I searched for what (?x) meant and was able to add more context. It would be nice if this answer was updated to be more explicit about what it's illustrating. For others interested, I found the notes about the (?on-off) construct here helpful: ruby-doc.org/core-1.9.3/Regexp.html#class-Regexp-label-OptionsTomlin
@BenParizek Perhaps you could add a short explanation in here as a comment?Acklin
I'm no expert on this topic but as I understand it, most of the answers here are saying different versions of the same thing. The problem is complex regexes are hard to read. The generic answer is: you can enable free-spacing mode to help make regexes more readable. There are various ways you can enable free-spacing mode. 1) You can add the modifier after the end delimiter /myregex/x, 2) you can toggle free-spacing mode along the way using the (?on-off) construct /myregex(?x) with free spacing/, 3) you can use the %r{myregex}x syntax.Tomlin
