multi-byte function to replace preg_match_all?

2

12

I'm looking for a multi-byte function to replace preg_match_all(). I need one that will give me an array of matched strings, like the $matches argument from preg_match(). The function mb_ereg_match() doesn't seem to do it -- it only gives me a boolean indicating if there were any matches.

Looking at the mb_* functions page, I don't offhand see anything that replaces the functionality of preg_match(). What do I use?

Edit: I'm an idiot. I originally posted this question asking for a replacement for preg_match, which of course is ereg_match. However, both of those only return the first result. What I wanted was a replacement for preg_match_all, which returns all match texts. But anyway, the u modifier works in my case for preg_match_all, as hakre pointed out.

Sitology answered 6/10, 2011 at 14:18 Comment(2)
#1766985Riana
I note you say that ereg_match() is a replacement for preg_match(). Be aware that PHP's ereg_ functions are deprecated, and should be avoided.Harden
17

Have you taken a look into mb_ereg?

Additionally, you can pass a UTF-8 encoded string into preg_match using the u modifier, which might be the kind of multi-byte support you need. If your string is in another encoding, the other option is to convert it to UTF-8 first and then convert the results back to the original encoding.
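
As a minimal sketch of the u modifier with preg_match_all (the variable names are only illustrative, and the subject is assumed to be valid UTF-8):

```php
<?php
// Assumption: the subject string is valid UTF-8. "\u{...}" escapes (PHP 7+)
// are used so the example does not depend on this file's own encoding.
$subject = "\u{E4}pfel \u{F6}l \u{FC}ber"; // "äpfel öl über"

// With /u, \p{L}+ matches runs of Unicode letters. preg_match_all collects
// every full match in $matches[0], not just the first one.
$count = preg_match_all('/\p{L}+/u', $subject, $matches);
$words = $matches[0];

// Without /u, "." consumes one byte; with /u it consumes one code point.
$byteCount = preg_match_all('/./', "\u{E4}\u{F6}", $ignoredBytes);
$charCount = preg_match_all('/./u', "\u{E4}\u{F6}", $ignoredChars);

echo implode(', ', $words), "\n";
echo "bytes: $byteCount, characters: $charCount\n";
```

The second half shows why the modifier matters: the same two-character string yields four matches when treated as bytes and two when treated as UTF-8.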

See also an answer to a related question: Are the PHP preg_functions multibyte safe?

Tinworks answered 6/10, 2011 at 14:33 Comment(10)
Can you point me to some documentation on the u modifier? That's part of the regex?Sitology
Actually it looks like the 4th answer down on that related question has some info about the u modifier.Sitology
So I tried it out, and it only seems to return the first match :P Unless I'm doing it wrong.Sitology
You should add your code to your question, so it's actually clear what you tried so far. Take care that the input string is UTF-8 encoded if you're using preg_match with the u modifier. Then I might be able to spot your error.Tinworks
Sorry, what I meant was that mb_ereg returns only the first match string (apparently).Sitology
I'm an idiot. I'm looking for a replacement for preg_match_all! :PSitology
LOL ;), okay. What is the encoding/charset of your string? I ask, because if you have this in UTF-8, you don't need any replacement. If not, you need to create a replacement function on your own that consists of mb_ereg... functions, doing one match after the other.Tinworks
The u modifier is the correct answer. Avoid the ereg_ functions because they have been deprecated; the mb_ereg_ functions are not deprecated, but you don't need them for UTF-8 input.Harden
@Tinworks I'm not looking to do replacement, but to pull multiple matches out of a large string.Sitology
Find the next match after the offset of the last match + the length of the last match (both 0 at start). Loop until nothing is found any longer. Store matches inside an array.Tinworks
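
The loop described in the last comment could be sketched roughly as follows. This is only one possible approach: instead of tracking the offset by hand, it relies on the mb_ereg_search_* functions, which keep the search position internally. The function name mb_ereg_match_all_sketch is hypothetical, and the mbstring extension with regex support is assumed to be available.

```php
<?php
// Hypothetical helper: collect every match of a multibyte regex.
// Assumes ext/mbstring is loaded with its regex (mb_ereg) support.
function mb_ereg_match_all_sketch(string $pattern, string $subject, string $encoding = 'UTF-8'): array
{
    $previous = mb_regex_encoding();   // remember the current regex encoding
    mb_regex_encoding($encoding);

    $matches = [];
    // mb_ereg_search_init stores the subject and pattern; each call to
    // mb_ereg_search_regs then continues after the previous match.
    if (mb_ereg_search_init($subject, $pattern)) {
        while (($regs = mb_ereg_search_regs()) !== false) {
            // Stop on an empty match so the loop cannot spin in place.
            if ((string) $regs[0] === '') {
                break;
            }
            $matches[] = $regs[0];     // store the full match text
        }
    }

    mb_regex_encoding($previous);      // restore the caller's setting
    return $matches;
}

$found = mb_ereg_match_all_sketch('[0-9]+', "a\u{F1} 12 \u{FC} 345 \u{E9} 6");
echo implode(', ', $found), "\n";
```

Note that mb_ereg patterns are written without delimiters, unlike the PCRE patterns used by the preg_ functions.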
3

PHP: preg_grep manual

$matches = preg_grep('/(needles|to|find)/u', $inputArray);

Returns the elements of the input array that match the pattern, indexed using the keys from the input array.

Note the /u modifier, which makes the pattern and the subject strings be treated as UTF-8.
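
A small sketch of what that call does, with a made-up $inputArray for illustration. Note that preg_grep filters whole array elements; it does not extract the matched substrings the way preg_match_all does.

```php
<?php
// Illustrative input only; the strings are assumed to be valid UTF-8.
$inputArray = [
    "na\u{EF}ve needles",   // contains "needles"
    'nothing here',         // contains none of the alternatives
    "o\u{F9} to go",        // contains "to"
    'find me',              // contains "find"
];

// Keeps each element that matches, under its original key.
$matches = preg_grep('/(needles|to|find)/u', $inputArray);

print_r($matches);
```

Because the original keys are preserved, the result can have gaps in its indexes; array_values() can be applied if a re-indexed list is wanted.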

Hope it helps others.

Macaluso answered 14/2, 2014 at 15:36 Comment(0)
