Deep (infinite) NESTED split words using regex
IMPORTANT EDIT: Since many people said that regex should be avoided here and that this is almost impossible to do with it, I'm opening the question to other approaches. From now on, any working solution can be posted as an answer and may be accepted. Thanks!

Let's say I have:

$line = "{ It is { raining { and streets are wet } | snowing { and streets are { slippy | white }}}. Tomorrow will be nice { weather | walk }. }";

Desired output:

It is raining and streets are wet. Tomorrow will be nice weather.
It is raining and streets are wet. Tomorrow will be nice walk.
It is snowing and streets are slippy. Tomorrow will be nice weather.
It is snowing and streets are slippy. Tomorrow will be nice walk.
It is snowing and streets are white. Tomorrow will be nice weather.
It is snowing and streets are white. Tomorrow will be nice walk. 

With the code from this answer to my previous question, I can currently split the words, but I can't figure out how to handle nested values. Could someone help me out with what I have below? I'm pretty sure I need a for loop somewhere to make it work, but I can't see where.

$line = "{This is my {sentence|statement} I {wrote|typed} on a {hot|cold} {day|night}.}";
$matches = getMatches($line);
printWords([], $matches, $line);


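function getMatches(&$line) {
    // Strip the outer braces that wrap the whole template.
    $line = trim($line, '{}');
    $matches = null;
    // Matches a single non-nested {a|b} group; nested groups are not handled.
    $pattern = '/\{[^}]+\}/';

    preg_match_all($pattern, $line, $matches);

    $matches = $matches[0];

    // Replace each group with a %s placeholder for vsprintf().
    $line = preg_replace($pattern, '%s', $line);

    // Turn each "{a|b}" match into its list of alternatives.
    foreach ($matches as $index => $match) {
        $matches[$index] = explode('|', trim($match, '{}'));
    }

    return $matches;
}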


function printWords(array $args, array $matches, $line) {
    $current = array_shift($matches);
    $currentArgIndex = count($args);

    foreach ($current as $word) {
        $args[$currentArgIndex] = $word;

        if (!empty($matches)) {
            printWords($args, $matches, $line);
        } else {
            echo vsprintf($line, $args) . '<br />';
        }
    }
}

One approach that comes to mind is a lexer technique: read the input character by character, build a suitable intermediate representation (for example bytecodes), and then loop over it. It's not regex, but it should work.
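The character-by-character idea could be sketched roughly as follows. This is a minimal recursive-descent sketch, not a tested production parser: it assumes the braces are balanced, that `{`, `|` and `}` never need escaping, and it skips the bytecode step by expanding directly while reading. The function names (`expandTemplate`, `parseAlternatives`, `parseSequence`, `normalizeSpacing`) are made up for illustration.

```php
<?php
// Hypothetical helper names; assumes balanced braces and no escaping.

// Returns every variant of the template as a list of strings.
function expandTemplate(string $template): array
{
    $pos = 0;
    return parseAlternatives($template, $pos);
}

// alternatives := sequence ('|' sequence)*  -> union of the variants
function parseAlternatives(string $s, int &$pos): array
{
    $all = [];
    while (true) {
        $all = array_merge($all, parseSequence($s, $pos));
        if ($pos < strlen($s) && $s[$pos] === '|') {
            $pos++; // skip '|', then read the next alternative
            continue;
        }
        return $all;
    }
}

// sequence := (text | '{' alternatives '}')*  -> every concatenation
function parseSequence(string $s, int &$pos): array
{
    $variants = [''];
    $len = strlen($s);
    while ($pos < $len && $s[$pos] !== '|' && $s[$pos] !== '}') {
        if ($s[$pos] === '{') {
            $pos++; // skip '{'
            $choices = parseAlternatives($s, $pos);
            if ($pos < $len && $s[$pos] === '}') {
                $pos++; // skip the matching '}'
            }
        } else {
            // Plain text up to the next special character.
            $start = $pos;
            $pos += strcspn($s, '{|}', $pos);
            $choices = [substr($s, $start, $pos - $start)];
        }
        // Append every choice to every variant built so far.
        $next = [];
        foreach ($variants as $prefix) {
            foreach ($choices as $choice) {
                $next[] = $prefix . $choice;
            }
        }
        $variants = $next;
    }
    return $variants;
}

// Collapses the extra spaces left behind by "{ a | b }" style input.
function normalizeSpacing(string $text): string
{
    $text = preg_replace('/\s+/', ' ', $text);
    $text = preg_replace('/\s+([.,;:!?])/', '$1', $text);
    return trim($text);
}

$line = "{ It is { raining { and streets are wet } | snowing { and streets are { slippy | white }}}. Tomorrow will be nice { weather | walk }. }";
foreach (expandTemplate($line) as $variant) {
    echo normalizeSpacing($variant), PHP_EOL;
}
```

Each group is parsed once and returns a list of strings, so nesting depth is limited only by PHP's recursion limits. The whitespace clean-up is kept separate because the parser itself preserves the spaces around the braces.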

Commerce answered 4/4, 2016 at 21:31 Comment(9)
Is your desired output the output you posted above, or is that what you currently produce? - Enallage
The output above is what I want to achieve; the current code only works for a non-nested line. Check the $line variable inside the function. - Commerce
I don't think the expected output is correct. The ' | snowing ' part is an OR with the previous '{...}' block, and the output does not seem to follow that rule. Put differently: there does not seem to be an algorithmic way to go from your input to the desired output. - Stenotype
Also, I think regex is the complicated way to go here. A streaming parser would be easier to understand. - Stenotype
Not a PHP or regex person, but it sounds like what you are really after is permutations, or even just basic nested loops. Regex seems to be making this way too complicated. Assuming you have control of the data source, I would personally use a simple string with placeholders and nested loops to fill it. Perhaps arrays of pairs, something like [[raining],[wet]], [[snow],[slippy, white]],[weather, walk]] inserted into "It is {0} and streets are {1}. Tomorrow will be nice {2}." - Reddick
@SergiuParaschiv The expected output is correct. The current function does work for non-nested values; check it yourself. - Commerce
@Reddick Interesting, I've only just heard of permutations. The problem is that the data (the $line variable) can be just about anything, so I don't have control of the data source. - Commerce
By definition, regex is not the solution to what you're asking for, because the expressions you want to process are not regular: they nest to arbitrary depth. That means a parser would be a more appropriate tool than a regex. The parser may contain some simple regex calls to help extract individual parts of the expression, but a single regex string isn't going to be able to do what you want. - Migratory
@Migratory Well, I'm running out of options with regex, so I'm starting to look for other solutions. Any advice? - Commerce

This class does the work, although I'm not sure how efficient it is:

class Randomizer {

    public function process($text) {
        return preg_replace_callback('/\{(((?>[^\{\}]+)|(?R))*)\}/x', array($this, 'replace'), $text);
    }

    public function replace($text) {
        $text = $this->process($text[1]);
        $parts = explode('|', $text);
        $part = $parts[array_rand($parts)];
        return $part;
    }
}

To use it you can simply do:

$line = "{This is my {sentence|statement} I {wrote|typed} on a {hot|cold} {day|night}.}";
$randomizer = new Randomizer();
echo $randomizer->process($line);

I am not the best when it comes to regular expressions so I can't really explain why that particular regex works, sorry for that.
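For readers who want to see what the pattern does, here is the same idea spread over several lines with the x flag and inline comments. It is a sketch for explanation only: the inner group is written as non-capturing here, which differs slightly from the pattern in the class but leaves group 1 unchanged. The key piece is (?R), which re-applies the whole pattern and so lets a brace pair contain further balanced brace pairs.

```php
<?php
// Annotated variant of the pattern used in Randomizer::process().
$pattern = '/
    \{                    # a literal opening brace
    (                     # group 1: everything between the outer braces
        (?:
            (?>[^{}]+)    # an atomic run of characters that are not braces
          |
            (?R)          # or the whole pattern again, i.e. a nested group
        )*                # repeated any number of times
    )
    \}                    # the closing brace that balances the first one
/x';

preg_match($pattern, 'x {a {b|c} d} y', $m);
echo $m[1], PHP_EOL; // a {b|c} d
```

Because group 1 holds the raw inner text, nested groups included, the callback in the class has to call process() again on it before splitting on the pipe character.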

By the way, it returns one random string, not all possible strings. Let me know if you need all the strings instead of a random one and I will update the answer.
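One way the random pick could be turned into "every combination" is sketched below. This is not an update from the answer's author, just an illustration that assumes balanced braces. It reuses the recursive pattern to find the first outermost group, but splits its content only on pipes at nesting depth 0, because a plain explode('|', ...) would also cut through nested groups. The names `expandAll` and `splitTopLevel` are hypothetical.

```php
<?php
// Hypothetical helpers; assumes balanced braces and no escaping.

// Returns every string the template can produce.
function expandAll(string $text): array
{
    // First outermost {...} group; (?R) keeps nested braces balanced.
    $pattern = '/\{((?:[^{}]++|(?R))*)\}/';
    if (!preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE)) {
        return [$text]; // no groups left: the text is final
    }
    [$group, $offset] = $m[0];
    $before = substr($text, 0, $offset);
    $after = substr($text, $offset + strlen($group));

    $results = [];
    foreach (splitTopLevel($m[1][0]) as $alternative) {
        // Substitute one alternative, then expand whatever groups remain.
        foreach (expandAll($before . $alternative . $after) as $variant) {
            $results[] = $variant;
        }
    }
    return $results;
}

// Splits on '|' only where the brace depth is zero.
function splitTopLevel(string $inner): array
{
    $parts = [''];
    $depth = 0;
    foreach (str_split($inner) as $char) {
        if ($char === '|' && $depth === 0) {
            $parts[] = ''; // start the next alternative
            continue;
        }
        if ($char === '{') {
            $depth++;
        } elseif ($char === '}') {
            $depth--;
        }
        $parts[count($parts) - 1] .= $char;
    }
    return $parts;
}

$line = "{This is my {sentence|statement} I {wrote|typed} on a {hot|cold} {day|night}.}";
foreach (expandAll($line) as $variant) {
    echo $variant, PHP_EOL;
}
```

Expanding the outermost group first matters: resolving inner groups first and then splitting the outer one would repeat alternatives such as " yours" in {my {words|sentence}| yours} once per inner choice.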

Ivett answered 6/4, 2016 at 12:49 Comment(6)
Thank you for the contribution, but this is not what I'm looking for. The script I already posted outputs all possible variations of the $line variable (which I need), but it doesn't work for nested values. Read the post again: {my {words|sentence}| yours} should be possible. - Commerce
@J.Doe I ran {my {words|sentence}| yours} and the possible outputs were: my words, yours, my sentence. Isn't that the correct output? - Ivett
My mistake, I'm sorry, I only glanced through the code. I just checked it and it works. It picks a random value, though; it should be possible to adapt it to output every possible variation, right? - Commerce
@J.Doe It should, but the code needs to be updated: the array_rand($parts) part has to change. I'm sorry, I'm currently at the office, so I can't refactor it right now. Give me a few hours and I will update the answer with exactly what you need. Stay tuned :) - Ivett
Appreciated. Meanwhile, I'll try it out myself and try to figure it out. Cheers. - Commerce
Have you tried to display all combinations? Please let me know so I can accept your answer. - Commerce
