Your assumption that regex matches a longer alternation is incorrect.
If you have a bit of time, let's look at how your regex works...
Quick refresher: How regex works: The state machine always reads from left to right, backtracking where necessary.
There are two pointers, one on the Pattern:
(cdefghijkl|bcd)
The other on your String:
abcdefghijklmnopqrstuvw
The pointer on the String moves from the left. As soon as it can return, it will:
(source: gyazo.com)
Let's turn that into a more "sequential" sequence for understanding:
(source: gyazo.com)
Your foobar
example is a different topic. As I mentioned in this post:
How regex works: The state machine always reads from left to right. ,|,, == ,
, as it always will only be matched to the first alternation.
That's good, Unihedron, but how do I force it to the first alternation?
Look!*
^(?:.*?\Kcdefghijkl|.*?\Kbcd)
Here have a regex demo.
This regex first attempts to match the entire string with the first alternation. Only if it fails completely will it then attempt to match the second alternation. \K
is used here to keep the match with the contents behind the construct \K
.
*
: \K
was supported in Ruby since 2.0.0.
Read more:
Ah, I was bored, so I optimized the regex:
^(?:(?:(?!cdefghijkl)c?[^c]*)++\Kcdefghijkl|(?:(?!bcd)b?[^b]*)++\Kbcd)
You can see a demo here.