Regex extract variables from [shortcode]
After migrating some content from WordPress to Drupal, I've got some shortcodes that I need to convert:

String content:

Irrelevant tekst... [sublimevideo class="sublime" poster="http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png" src1="http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v" src2="(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v" width="560" height="315"] ..more irrelevant text.

I need to find all variables within the shortcode [sublimevideo ...] and turn them into an array:

Array (
    class => "sublime"
    poster => "http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png"
    src1 => "http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v"
    src2 => "(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v"
    width => "560"
    height => "315"
)

And preferably handle multiple instances of the shortcode.

I guess it can be done with preg_match_all() but I've had no luck.

Gilmer answered 12/8, 2013 at 20:31 Comment(1)
You should show what you have tried; we're not a free coding service. This task may become quite "hard", but one of the best solutions would be to use a recursive pattern to match nested brackets. What I mean is that if the content of that shortcode contains [], then the regexes below would all fail. Show what you have tried and I may help you. - Dustidustie
10

This will give you what you want.

$data = 'Irrelevant tekst... [sublimevideo class="sublime" poster="http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png" src1="http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v" src2="(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v" width="560" height="315"] ..more irrelevant text.';

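$dat = array();
// Capture everything between "[sublimevideo " and the closing bracket.
preg_match("/\[sublimevideo (.+?)\]/", $data, $dat);

// The last element holds the captured attribute string.
$dat = array_pop($dat);
// Split on spaces; this assumes no attribute value contains a space.
$dat = explode(" ", $dat);
$params = array();
foreach ($dat as $d){
    // Limit to 2 parts so a value containing "=" is not truncated.
    list($opt, $val) = explode("=", $d, 2);
    $params[$opt] = trim($val, '"');
}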

print_r($params);
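Splitting the attribute string on spaces works for the sample input, but it would break as soon as a quoted value contains a space. A minimal sketch of one way around that is to match each name="value" pair with a regex instead. The helper name parse_shortcode_attributes and the sample string below are made up for illustration.

```php
<?php
// Hypothetical helper (the name is illustrative): parse name="value" pairs
// from the attribute string of a single shortcode.
function parse_shortcode_attributes(string $attributeString): array
{
    $params = array();
    // (\w+) is the attribute name; "([^"]*)" captures everything up to the
    // next double quote, so spaces and "=" inside a value are preserved.
    if (preg_match_all('/(\w+)\s*=\s*"([^"]*)"/', $attributeString, $pairs, PREG_SET_ORDER)) {
        foreach ($pairs as $pair) {
            $params[$pair[1]] = $pair[2];
        }
    }
    return $params;
}

// Made-up sample input with a space and a "=" inside quoted values.
$data = 'Text [sublimevideo class="sublime big" src1="http://video.host.com/play?id=1&q=hd" width="560"] more text.';

$params = array();
if (preg_match('/\[sublimevideo\s+(.+?)\]/', $data, $match)) {
    $params = parse_shortcode_attributes($match[1]);
}
print_r($params);
```

This still assumes values are always wrapped in double quotes and never contain a closing bracket.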

In anticipation of the next challenge you will face when processing shortcodes, you can use preg_replace_callback to replace the shortcode with its resultant markup.

$data = 'Irrelevant tekst... [sublimevideo class="sublime" poster="http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png" src1="http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v" src2="(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v" width="560" height="315"] ..more irrelevant text.';

function processShortCode($matches){
    // parse out the arguments
    $dat = explode(" ", $matches[2]);
    $params = array();
    foreach ($dat as $d){
        // limit to 2 parts so a value containing "=" is not truncated
        list($opt, $val) = explode("=", $d, 2);
        $params[$opt] = trim($val, '"');
    }
    switch($matches[1]){
        case "sublimevideo":
            // here is where you would want to return the resultant markup from the shortcode call.
            return print_r($params, true);
        default:
            // unknown shortcode: return the original text unchanged
            return $matches[0];
    }
}
$data = preg_replace_callback("/\[(\w+) (.+?)\]/", "processShortCode", $data);
echo $data;
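To make the "resultant markup" step more concrete, here is a rough sketch of a callback that returns HTML instead of a print_r dump. The function name render_sublimevideo, the shortened sample URLs, and the video markup itself are assumptions made for illustration; the markup your Drupal site actually needs will differ.

```php
<?php
// Hypothetical renderer: the <video> markup and the function name are
// illustrative assumptions, not the original SublimeVideo output.
function render_sublimevideo(array $matches): string
{
    // Collect name="value" pairs from the captured attribute string.
    preg_match_all('/(\w+)="([^"]*)"/', $matches[1], $pairs, PREG_SET_ORDER);
    $atts = array();
    foreach ($pairs as $pair) {
        $atts[$pair[1]] = $pair[2];
    }

    // Leave the shortcode untouched if the required attribute is missing.
    if (empty($atts['src1'])) {
        return $matches[0];
    }

    // Escape every value before placing it inside an HTML attribute.
    $esc = function ($value) {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    };

    return sprintf(
        '<video class="%s" poster="%s" width="%s" height="%s" controls><source src="%s"></video>',
        $esc($atts['class'] ?? ''),
        $esc($atts['poster'] ?? ''),
        $esc($atts['width'] ?? ''),
        $esc($atts['height'] ?? ''),
        $esc($atts['src1'])
    );
}

// Made-up input: one complete shortcode and one without a src1 attribute.
$data = 'Before [sublimevideo class="sublime" poster="http://video.host.com/p.png" src1="http://video.host.com/v.m4v" width="560" height="315"] after [sublimevideo class="broken"] end.';

$html = preg_replace_callback('/\[sublimevideo\s+(.+?)\]/', 'render_sublimevideo', $data);
echo $html;
```

Returning $matches[0] for incomplete shortcodes keeps them visible in the migrated content, which makes them easy to find and fix by hand.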
Childbirth answered 12/8, 2013 at 20:59 Comment(3)
Use an ungreedy pattern .+?, otherwise it will match until the last ] bracket (in the input) :) - Dustidustie
@Dustidustie Again you cement your position as regex god. :) - Childbirth
Hooo~ I'm not that good, there are several great regex gurus on SO who are way better than me, but we're learning! :-) - Dustidustie
7

You could use the following RegEx to match the variables:

$regex = '/(\w+)\s*=\s*"(.*?)"/';

I would suggest first matching the sublimevideo shortcode and getting its contents into a string with the following RegEx:

$pattern = '/\[sublimevideo(.*?)\]/';

To get the correct array keys I used this code:

// $string is string content you specified
preg_match_all($regex, $string, $matches);

$sublimevideo = array();
for ($i = 0; $i < count($matches[1]); $i++)
    $sublimevideo[$matches[1][$i]] = $matches[2][$i];

This produces the following array (the one that you've requested):

Array
(
    [class] => sublime
    [poster] => http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png
    [src1] => http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v
    [src2] => (hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v
    [width] => 560
    [height] => 315
)
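Since the question also asks about multiple instances, the two patterns can be combined: first isolate each shortcode, then parse its attributes separately so the values of different shortcodes do not get mixed into one array. The following is only a sketch under that assumption; the sample string and the $videos variable name are made up.

```php
<?php
// Made-up sample content with two shortcodes.
$string = 'A [sublimevideo class="one" width="560"] B [sublimevideo class="two" width="320" height="180"] C';

$pattern = '/\[sublimevideo(.*?)\]/';
$regex   = '/(\w+)\s*=\s*"(.*?)"/';

$videos = array();
// Step 1: find every [sublimevideo ...] and capture its attribute string.
preg_match_all($pattern, $string, $shortcodes);

foreach ($shortcodes[1] as $attributeString) {
    // Step 2: extract the name/value pairs of this one shortcode.
    preg_match_all($regex, $attributeString, $matches);
    // array_combine() pairs the names (group 1) with the values (group 2).
    $videos[] = $matches[1] ? array_combine($matches[1], $matches[2]) : array();
}

print_r($videos);
```

Each element of $videos is then an array of the same shape as the one shown above, one per shortcode found.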
Vindicable answered 12/8, 2013 at 20:57 Comment(1)
Is it possible for you to provide the same in C#? - Sharronsharyl
0

This is my interpretation. I come from a WordPress background and tried to recreate its shortcode setup for a custom PHP project.

It'll handle things like [PHONE], [PHONE abc="123"], etc.

The only thing it falls flat on is the WordPress-style enclosing form, [HERE] to [HERE], where an opening and a closing tag wrap some content.

Function to build a list of available shortcodes:


// Set up the default global variable
$shortcodes = [];

function create_shortcode($tag, $function)
{
    global $shortcodes;
    $shortcodes[$tag] = $function;
}

Define shortcodes individually, e.g. [IFRAME url="https://www.bbc.co.uk"]:


/**
 * iframe, allows the user to add an iframe to a page with responsive div wrapper
 */
create_shortcode('IFRAME', function($atts) {

    // ... some validation goes here

    // The parameters that can be set in the shortcode
    if (empty($atts['url'])) {
        return false;
    }

    return '
    <div class="embed-responsive embed-responsive-4by3">
      <iframe class="embed-responsive-item" src="' . $atts['url'] . '">
      </iframe>
    </div>';
});

Then, when you want to run a block of HTML through the shortcode handling, call handle_shortcodes($some_html_with_shortcodes);

function handle_shortcodes($content)
{

    global $shortcodes;

    // Loop through all shortcodes
    foreach($shortcodes as $key => $function){

        $matches = [];

        // Look for shortcodes, returns an array of ALL matches
        preg_match_all("/\[$key([^_^\]].+?)?\]/", $content, $matches, PREG_UNMATCHED_AS_NULL);

        if (!empty($matches))
        {
            $i = 0;
            $full_shortcode = $matches[0];
            $attributes = $matches[1];

            if (!empty($attributes))
            {
                foreach($attributes as $attribute_string) {

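                    // Decode the values (e.g. &quot; to "); the cast guards against the
                    // null that PREG_UNMATCHED_AS_NULL yields for attribute-less shortcodes
                    $attribute_string = htmlspecialchars_decode((string) $attribute_string);

                    // Find all the query args, looking for `arg="anything"`
                    // ([^"]* also accepts empty and single-character values)
                    preg_match_all('/\w+="([^"]*)"/', $attribute_string, $query_args);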

                    $params = [];
                    foreach ($query_args[0] as $d) {

                        // Split the pair into attribute name and value
                        list($att, $val) = explode('=', $d, 2);

                        $params[$att] = trim($val, '"');
                    }

                    $content = str_replace($full_shortcode[$i], $function($params), $content);
                    $i++;
                }
            }
        }
    }
    return $content;
}

I've plucked these examples from working code, so hopefully they're readable and don't depend on any extra functions exclusive to our setup.
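The one case this answer says it does not cover, content wrapped between an opening and a closing tag, might be handled along these lines. This is a rough sketch and not part of the code above: the function name handle_enclosing_shortcode, the BOX tag, the [/BOX] closing syntax (borrowed from WordPress), and the two-argument callback are all assumptions made for illustration.

```php
<?php
// Hypothetical handler for the enclosing form [TAG attrs]content[/TAG].
// Nested shortcodes of the same tag are not supported by this pattern.
function handle_enclosing_shortcode(string $content, string $tag, callable $function): string
{
    $quoted = preg_quote($tag, '/');
    // Group 1: optional attribute string; group 2: the enclosed content.
    // The "s" modifier lets "." span line breaks inside the content.
    $pattern = '/\[' . $quoted . '(\s[^\]]*)?\](.*?)\[\/' . $quoted . '\]/s';

    return preg_replace_callback($pattern, function ($m) use ($function) {
        // Parse name="value" pairs from the attribute string (may be empty).
        preg_match_all('/(\w+)="([^"]*)"/', $m[1], $pairs, PREG_SET_ORDER);
        $params = array();
        foreach ($pairs as $pair) {
            $params[$pair[1]] = $pair[2];
        }
        // Pass both the attributes and the enclosed content to the callback.
        return $function($params, $m[2]);
    }, $content);
}

$html = handle_enclosing_shortcode(
    'Call [BOX color="red"]us today[/BOX] or [BOX]tomorrow[/BOX].',
    'BOX',
    function (array $atts, string $inner) {
        $color = htmlspecialchars($atts['color'] ?? 'grey', ENT_QUOTES, 'UTF-8');
        return '<span class="box-' . $color . '">' . htmlspecialchars($inner, ENT_QUOTES, 'UTF-8') . '</span>';
    }
);
echo $html;
```

Using preg_replace_callback here, rather than collecting matches and calling str_replace, means each occurrence is replaced exactly where it was found.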

Hypnology answered 17/5, 2019 at 14:33 Comment(0)
-1

As described in this answer, I'd suggest letting WordPress do the work for you using the get_shortcode_regex() function.

 $pattern = get_shortcode_regex();
 preg_match_all("/$pattern/",$wp_content,$matches);

This will give you an array that is easy to work with and shows the various shortcodes and affiliated attributes in your content. It isn't the most obvious array format, so print it and take a look so you know how to manipulate the data you need.

Sideling answered 25/9, 2014 at 15:41 Comment(1)
Outstanding, bearing in mind the OP wanted Drupal (not WordPress). - Yazbak
