Regular expression as delimiter in explode()

So I have a string which I'm turning into an array but I want to separate each word using a regex. I'm matching a whole word using the below function.

function substr_count_array($haystack, $needle)
{
     $initial = 0;
     $bits = explode(' ', $haystack);

     foreach ($needle as $substring) 
     {
        if (!in_array($substring, $bits))
        {
            continue;
        }

        $initial += substr_count($haystack, $substring);
     }

     return $initial;
}

The problem is that it matches the string animal, for example, but not animals. And if I do a partial match like this:

function substr_count_array2($haystack, $needle)
{
     $initial = 0;

     foreach ($needle as $substring) 
     {
          $initial += substr_count($haystack, $substring);
     }

     return $initial;
}

It also matches, let's say, a, since it's contained within the word animals, and returns 2. How do I explode() using a regular expression as a delimiter so that I can, for example, match every string that is 5-7 characters long?

Explained simpler:

$animals = array('cat','dog','bird');
$toString = implode(' ', $animals);
$data = array('a');

echo substr_count_array($toString, $data);

If I search for a character such as a, it gets through the check and validates as a legit value because a is contained within the first element. But if I match whole words exploded by a space, it omits them if they are not separated by a space. Thus, I need to separate with a regular expression that matches anything AFTER the string that is to be matched.

Bedclothes answered 18/7, 2014 at 16:36 Comment(0)

Simply put, you need to use preg_split instead of explode.

While explode will split on constant values, preg_split will split based on a regular expression.
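To make the difference concrete, here is a minimal sketch comparing the two calls on the same input. The sample string and the delimiter pattern are made up for illustration and are not taken from the question.

```php
<?php
// Hypothetical sample input with mixed delimiters (comma, semicolon, spaces).
$text = "cat, dog;bird  fish";

// explode() splits only on the exact literal string it is given,
// so punctuation stays attached and the double space yields an empty element.
$byLiteral = explode(' ', $text);
// ["cat,", "dog;bird", "", "fish"]

// preg_split() splits wherever the pattern matches; here, on any run of
// whitespace, commas, or semicolons.
$byPattern = preg_split('/[\s,;]+/', $text);
// ["cat", "dog", "bird", "fish"]

print_r($byLiteral);
print_r($byPattern);
```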

In your case, it would probably be best to split on non-word characters \W+, then manually filter the results for length.
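A rough sketch of that approach might look like the following. The function name words_by_length and the sample sentence are assumptions made for illustration, and the length check uses strlen, which counts bytes, so it assumes plain ASCII words.

```php
<?php
// Hypothetical helper: returns the words in $haystack whose length
// falls between $min and $max characters, inclusive.
function words_by_length(string $haystack, int $min, int $max): array
{
    // Split on runs of non-word characters. PREG_SPLIT_NO_EMPTY drops the
    // empty strings produced by leading or trailing delimiters.
    $words = preg_split('/\W+/', $haystack, -1, PREG_SPLIT_NO_EMPTY);

    // Keep only the words within the requested length range.
    $matches = array_filter($words, function ($word) use ($min, $max) {
        $length = strlen($word);
        return $length >= $min && $length <= $max;
    });

    // array_filter() preserves the original keys, so reindex from 0.
    return array_values($matches);
}

$result = words_by_length('The animals, a zebra and an elephant!', 5, 7);
print_r($result); // animals, zebra
```

Because the split pattern already treats punctuation as a delimiter, words such as "animals," come back without the trailing comma before the length filter runs.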

Microscopium answered 18/7, 2014 at 16:38 Comment(3)
Something like this? preg_split('(.+?)', $haystack); - Bedclothes
@JessieStalk - Not quite. The regular expression you pass to preg_split is the pattern the string is split on, not the strings you want to keep. If you're trying to keep the words in your input, you should split on non-word characters: preg_split('/\W+/', $haystack) - Microscopium
Thanks for your time and effort :) - Bedclothes
