Parse string containing dots in php
I would like to parse the following string:

$str = 'ProceduresCustomer.tipi_id=10&ProceduresCustomer.id=1';                 
parse_str($str,$f);

I want $f to be parsed into:

array(
    'ProceduresCustomer.tipi_id' => '10',
    'ProceduresCustomer.id' => '1'
)

Instead, parse_str() returns:

array(
        'ProceduresCustomer_tipi_id' => '10',
        'ProceduresCustomer_id' => '1'
    )

Besides writing my own function, does anybody know whether there is a PHP function for that?

Intendment answered 20/3, 2014 at 16:42 Comment(11)
us1.php.net/explode , in your case <?php $array = explode('&',$input); ?> - Deitz
Explode on the '=' symbol and use every odd numbered index as a value, and every even numbered index as a key? - Hosmer
Explode? $f = explode("&",$str); - Ulceration
What's wrong with parse_str()? - Plainsong
parse_str() would work but it's unnecessary - Deitz
@zjd parse_str() does it all at once, you'd need to explode twice and use a third function or a loop to set the keys to the strings. - Intertwine
@Wesly It is not a duplicate, because in my question I emphasize that the string contains dots. - Intendment
@giuseppe After your edit, it's clear this is an exception and not a dupe. - Gerdi
Well, parse_str() actually won't do what he wants in one line. Personally I think the explode method is much clearer than running a regular expression, but I can understand if that's the way people like to do things :) - Deitz
Possible duplicate of "Get PHP to stop replacing '.' characters in $_GET or $_POST arrays?" - Lackadaisical
Related: https://mcmap.net/q/1172800/-how-to-convert-a-string-to-a-multidimensional-recursive-array-in-php-duplicate/2943403 - Sate
L
13

From the PHP Manual:

Dots and spaces in variable names are converted to underscores. For example <input name="a.b" /> becomes $_REQUEST["a_b"].

So parse_str() cannot do this on its own: it converts all periods in variable names to underscores. If you really can't avoid using periods in your query variable names, you will have to write a custom function to achieve this.
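The Manual's rule can be checked directly. The following is a minimal sketch with made-up keys (a.b, "c d" and e.f[g.h] are illustrative only). It shows that the conversion applies to the top-level name, while a dot inside square brackets is left alone:

```php
<?php
// parse_str() mangles dots and spaces in the top-level part of a name only.
parse_str('a.b=1&c%20d=2&e.f[g.h]=3', $parsed);

// Top-level names: "a.b" -> "a_b", "c d" -> "c_d", "e.f" -> "e_f"
var_dump(array_keys($parsed));

// The index inside the brackets keeps its dot: "g.h"
var_dump(array_keys($parsed['e_f']));
```

So the problem only affects the part of the name before the first opening bracket.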

The following function (taken from this answer) hex-encodes the name of each key-value pair in the query string, runs parse_str() on the result, and then decodes the keys back to their original form. Because a hex string contains no periods, the names pass through parse_str() untouched:

function parse_qs($data)
{
    $data = preg_replace_callback('/(?:^|(?<=&))[^=[]+/', function($match) {
        return bin2hex(urldecode($match[0]));
    }, $data);

    parse_str($data, $values);

    return array_combine(array_map('hex2bin', array_keys($values)), $values);
}

Example usage:

$data = parse_qs($_SERVER['QUERY_STRING']);
Liegnitz answered 20/3, 2014 at 16:45 Comment(2)
Nice catch Amal. Hopefully there are never any brackets, e.g. &var[]=1&var[]=2 - Gerdi
@giuseppe: Updated the answer to include an alternative solution.Liegnitz
D
3

Quick 'n' dirty.

$str = "ProceduresCustomer.tipi_id=10&ProceduresCustomer.id=1";    

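function my_func($str){
    $out = array(); // initialise the result before the loop
    $expl = explode("&", $str);
    foreach($expl as $r){
        // split on the first "=" only, so values may contain "="
        $tmp = explode("=", $r, 2);
        // a pair without "=" gets an empty string as its value
        $out[$tmp[0]] = isset($tmp[1]) ? $tmp[1] : '';
    }
    return $out;
}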

var_dump(my_func($str));

array(2) {
    ["ProceduresCustomer.tipi_id"]=> string(2) "10"
    ["ProceduresCustomer.id"]=>string(1) "1"
}
Dinnie answered 20/3, 2014 at 17:9 Comment(1)
Warning: this solution does not urldecode the query, as parse_str does. You probably need to urldecode $tmp[0] and $tmp[1] before appending to $out. - Prescriptible
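A minimal sketch of the variant suggested in the comment above, assuming a flat query string with no array syntax. The name parse_flat_query is hypothetical and does not come from the answer:

```php
<?php
// Hypothetical helper: flat query-string parser that keeps dots in keys.
// It does NOT build nested arrays from "a[b]=c" syntax.
function parse_flat_query(string $query): array
{
    $out = [];
    foreach (explode('&', $query) as $pair) {
        if ($pair === '') {
            continue; // skip empty segments such as in "a=1&&b=2"
        }
        // Split on the first "=" only; a missing value becomes "".
        [$key, $value] = array_pad(explode('=', $pair, 2), 2, '');
        // urldecode() turns %XX sequences and "+" back into characters.
        $out[urldecode($key)] = urldecode($value);
    }
    return $out;
}

var_dump(parse_flat_query('ProceduresCustomer.tipi_id=10&full%20name=John+Doe&flag'));
```

Decoding happens after the split, so an encoded & or = inside a key or value is not mistaken for a separator.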
S
1

This quickly written function attempts to properly parse the query string and returns an array.

The second (optional) parameter $break_dots tells the parser to create a sub-array when encountering a dot (this goes beyond the question, but I included it anyway).

/**
 * parse_name -- Parses a string and returns an array of the key path.
 * If the string is malformed, only the original string is returned as a key.
 *
 * $str The string to parse
 * $break_dot Whether or not to break on dots (default: false)
 *
 * Examples :
 *   + parse_name("var[hello][world]") = array("var", "hello", "world")
 *   + parse_name("var[hello[world]]") = array("var[hello[world]]") // Malformed
 *   + parse_name("var.hello.world", true) = array("var", "hello", "world")
 *   + parse_name("var.hello.world") = array("var.hello.world")
 *   + parse_name("var[hello][world") = array("var[hello][world") // Malformed
 */
function parse_name ($str, $break_dot = false) {
    // Output array
    $out = array();
    // Name buffer
    $buf = '';
    // Array counter
    $acount = 0;
    // Whether the previous char was a closing bracket, in order to avoid empty indexes
    $lastbroke = false;

    // Loop on chars
    foreach (str_split($str) as $c) {
        switch ($c) {
            // Encountering '[' flushes the buffer to $out and increments the
            // array counter
            case '[':
                if ($acount == 0) {
                    if (!$lastbroke) $out[] = $buf;
                    $buf = "";
                    $acount++;
                    $lastbroke = false;
                // In this case, the name is malformed. Return it as-is
                } else return array($str);
                break;

            // Encountering ']' flushes the buffer to $out and decrements the
            // array counter
            case ']':
                if ($acount == 1) {
                    if (!$lastbroke) $out[] = $buf;
                    $buf = '';
                    $acount--;
                    $lastbroke = true;
                // In this case, the name is malformed. Return it as-is
                } else return array($str);
                break;

            // If $break_dot is set to true, flush the buffer to $out.
            // Otherwise, treat it as a normal char.
            case '.':
                if ($break_dot) {
                    if (!$lastbroke) $out[] = $buf;
                    $buf = '';
                    $lastbroke = false;
                    break;
                }

            // Add every other char to the buffer
            default:
                $buf .= $c;
                $lastbroke = false;
        }
    }

    // If the counter isn't back to 0 then the string is malformed. Return it as-is
    if ($acount > 0) return array($str);

    // Otherwise, flush the buffer to $out and return it.
    if (!$lastbroke) $out[] = $buf;
    return $out;
}

/**
 * decode_qstr -- Take a query string and decode it to an array
 *
 * $str The query string
 * $break_dots Whether or not to break field names on dots (default: false)
 */
function decode_qstr ($str, $break_dots = false) {
    $out = array();

    // '&' is the field separator 
    $a = explode('&', $str);

    // For each field=value pair:
    foreach ($a as $param) {
        // Break on the first equal sign.
        $param = explode('=', $param, 2);

        // Parse the field name
        $key = parse_name($param[0], $break_dots);

        // This piece of code creates the array structure according to the
        // decomposition given by parse_name()
        $array = &$out; // Reference to the last object. Starts at $out
        $append = false; // If an empty key is given, treat it like $array[] = 'value'

        foreach ($key as $k) {
            // If the current ref isn't an array, make it one
            if (!is_array($array)) $array = array();
            // If the current key is the empty string, break the loop and append to current ref
            // (strict comparison: empty("0") is true and would misroute a key such as a[0])
            if ($k === '') {
                $append = true;
                break;
            }
            // If the key isn't set, set it :)
            if (!isset($array[$k])) $array[$k] = NULL;

            // In order to walk down the array, we need to first save the ref in
            // $array to $tmp
            $tmp = &$array;
            // Deletes the ref from $array
            unset($array);
            // Create a new ref to the next item
            $array =& $tmp[$k];
            // Delete the save
            unset($tmp);
        }

        // If instructed to append, do that
        if ($append) $array[] = $param[1];
        // Otherwise, just set the value
        else $array = $param[1];

        // Destroy the ref for good
        unset($array);
    }

    // Return the result
    return $out;
}

I tried to handle multi-level keys correctly. The code is a bit hacky, but it should work. I have commented the code; leave a comment if you have any questions.

Test case:

var_dump(decode_qstr("ProceduresCustomer.tipi_id=10&ProceduresCustomer.id=1"));
// array(2) {
//   ["ProceduresCustomer.tipi_id"]=>
//   string(2) "10"
//   ["ProceduresCustomer.id"]=>
//   string(1) "1"
// }


var_dump(decode_qstr("ProceduresCustomer.tipi_id=10&ProceduresCustomer.id=1", true));
// array(1) {
//   ["ProceduresCustomer"]=>
//   array(2) {
//     ["tipi_id"]=>
//     string(2) "10"
//     ["id"]=>
//     string(1) "1"
//   }
// }
Stabler answered 20/3, 2014 at 18:0 Comment(0)
P
0

I would like to add my solution as well, because I had trouble finding one that did all I needed and would handle all circumstances. I tested it quite thoroughly. It keeps dots and spaces and unmatched square brackets (normally changed to underscores), plus it handles arrays in the input well. Tested in PHP 8.0.0 and 8.0.14.

const periodPlaceholder = 'QQleQPunT';
const spacePlaceholder = 'QQleQSpaTIE';


function parse_str_clean($querystr): array {
    // Like parse_str(), but without converting spaces, dots, etc. to underscores.
    $qquerystr = str_ireplace(['.','%2E','+',' ','%20'], [periodPlaceholder,periodPlaceholder,spacePlaceholder,spacePlaceholder,spacePlaceholder], $querystr);
    $arr = null ; parse_str($qquerystr, $arr);

    sanitizeArr($arr, $querystr);
    return $arr;
}


function sanitizeArr(&$arr, $querystr) {
    foreach($arr as $key=>$val) {
        // restore values to original
        if ( is_string($val)) {
            $newval = str_replace([periodPlaceholder,spacePlaceholder], ["."," "], $val);
            if ( $val != $newval) $arr[$key]=$newval;
        }
    }
    unset($val);
    foreach($arr as $key=>$val) {
        $newkey = str_replace([periodPlaceholder,spacePlaceholder], ["."," "], $key);
        
        if ( str_contains($newkey, '_') ) { 

            // A period, space, [ or ] was converted to _. Restore it from the original query string.
            $regex = '/&('.str_replace('_', '[ \.\[\]]', preg_quote($newkey, '/')).')=/';
            $matches = null ;
            if ( preg_match_all($regex, "&".urldecode($querystr), $matches) ) {

                if ( count(array_unique($matches[1])) === 1 && $key != $matches[1][0] ) {
                    $newkey = $matches[1][0] ;
                }
            }
        }
        if ( $newkey != $key ) $arr = array_replace_key($arr,$key, $newkey);

        if ( is_array($val)) {
            sanitizeArr($arr[$newkey], $querystr);
        }
    }
}


function array_replace_key($array, $oldKey, $newKey): array {
    // preserves order of the array
    if( ! array_key_exists( $oldKey, $array ) )   return $array;
    $keys = array_keys( $array );
    $keys[ array_search( $oldKey, $keys ) ] = $newKey;
    return array_combine( $keys, $array );
}
  • First, spaces and dots in the query string are replaced by placeholders before parsing; afterwards the placeholders are reverted in the array keys and values. This way we can use the normal parse_str.
  • Unmatched [ and ] are also replaced by underscores by parse_str, but these cannot be reliably replaced by a placeholder, and we definitely don't want to replace matched []. Hence we leave [ and ] alone and let parse_str replace them by underscores. Then we restore each _ in the resulting keys by checking in the original query string whether there was a [ or ] at that position.
  • Known bug: the key 'something]something' and the almost identical 'something[something' may be confused. In practice this should almost never occur, so I left it.

Test:

var_dump(parse_str_clean("code.1=printr%28hahaha&code 1=448044&test.mijn%5B%5D%5B2%5D=test%20Roemer&test%20mijn%5B=test%202e%20Roemer"));

correctly yields:

array(4) {
  ["code.1"]=>
  string(13) "printr(hahaha"
  ["code 1"]=>
  string(6) "448044"
  ["test.mijn"]=>
  array(1) {
    [0]=>
    array(1) {
      [2]=>
      string(11) "test Roemer"
    }
  }
  ["test[mijn"]=>
  string(14) "test 2e Roemer"
}

whereas the original parse_str only yields, with the same string:

array(2) {
  ["code_1"]=>
  string(6) "448044"
  ["test_mijn"]=>
  string(14) "test 2e Roemer"
}
Pointenoire answered 21/1, 2022 at 20:56 Comment(0)
I
0

I've found this Symfony function useful:

// Parses a query string but maintains dots (PHP parse_str() replaces '.' by '_')
HeaderUtils::parseQuery('foo[bar.baz]=qux');
// => ['foo' => ['bar.baz' => 'qux']]

https://symfony.com/doc/current/components/http_foundation.html#processing-http-headers

Isleana answered 14/8, 2024 at 10:30 Comment(0)
