Convert backslash-delimited string into an associative array

I have a string like this:

key1\value1\key2\value2\key3\value3\key4\value4\key5\value5

And I'd like it to be an associative array so that I can do:

echo $myArray['key1']; // prints value1
echo $myArray['key3']; // prints value3
//etc...

I know I can explode on the backslash, but not sure how to go from there.

Allfired answered 13/3, 2011 at 15:32 Comment(0)

Using a simple regex via preg_match_all and array_combine is often the shortest and quickest option:

 preg_match_all("/([^\\\\]+)\\\\([^\\\\]+)/", $string, $p);
 $array = array_combine($p[1], $p[2]);
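
For reference, a minimal runnable sketch of these two lines applied to the string from the question. The single-quoted literal is an assumption about how the input is defined; it keeps every backslash literal.

```php
<?php
// Sample input taken from the question; single quotes keep the backslashes literal.
$string = 'key1\value1\key2\value2\key3\value3\key4\value4\key5\value5';

// Group 1 captures each key, group 2 the value that follows it.
preg_match_all("/([^\\\\]+)\\\\([^\\\\]+)/", $string, $p);
$array = array_combine($p[1], $p[2]);

echo $array['key1'], "\n"; // value1
echo $array['key3'], "\n"; // value3
```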

Now this is of course a special case. Both keys and values are separated by a \ backslash, as are all pairs of them. The regex is also a bit lengthier due to the necessary double escaping.

However this scheme can be generalized to other key:value,-style strings.

Distinct key:value, separators

Common variations include : and = as key/value separators, and , or & and others as pair delimiters. The regex becomes rather obvious in such cases (with the /x flag for readability):

 # the key class also excludes the pair delimiter, so later keys don't start with a comma
 #                    ↓↓    ↓    ↓
 preg_match_all("/ ([^:,]+) : ([^,]+) /x", $string, $p);
 $array = array_combine($p[1], $p[2]);

Which makes it super easy to exchange : and , for other delimiters.

  • Equal signs = instead of : colons.
  • For example \\t as pair delimiter (tab-separated key:value lists)
  • Classic & or ; as separator between key=value pairs.
  • Or just \\s spaces or \\n newlines even.
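
As an illustration of swapping delimiters, here is a sketch for a tab-separated key=value list. The input string and its field names are made up for the example. The key class excludes the tab as well as the equal sign, so that keys after the first pair don't pick up the preceding tab.

```php
<?php
// Hypothetical input: "=" between key and value, a tab between pairs.
$string = "postid=42\ttitle=Hello\tstatus=draft";

// Single-quoted pattern, so \t reaches the regex engine as a tab escape.
preg_match_all('/ ([^=\t]+) = ([^\t]+) /x', $string, $p);
$array = array_combine($p[1], $p[2]);

print_r($array);
```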

Allow varying delimiters

You can make it more flexible/forgiving by allowing different delimiters between keys/values/pairs:

 # the key class also excludes the pair delimiters
 #                    ↓         ↓       ↓
 preg_match_all("/ ([^:=,+&]+) [:=]+ ([^,+&]+) /x", $string, $p);

Here key=value,key2:value2++key3==value3 would work. This can make sense for more human-friendliness (i.e. for non-technical users).
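
A quick sketch of the forgiving variant run against that mixed-delimiter string. It assumes the pair delimiters are also excluded from the key class; otherwise the second and third keys would begin with "," or "+".

```php
<?php
// Mixed separators: "=", ":" and "==" between key and value; "," and "++" between pairs.
$string = "key=value,key2:value2++key3==value3";

// Keys may not contain any delimiter; values may not contain a pair delimiter.
preg_match_all("/ ([^:=,+&]+) [:=]+ ([^,+&]+) /x", $string, $p);
$array = array_combine($p[1], $p[2]);

print_r($array);
```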

Constrain alphanumeric keys

Oftentimes you may want to prohibit anything but classic key identifiers. Just use a \w+ word string pattern to make the regex skip over unwanted occurrences:

 #                   ↓   ↓    ↓
 preg_match_all("/ (\w+) = ([^,]+) /x", $string, $p);

This is the most trivial whitelisting approach. If OTOH you want to assert/constrain the whole key/value string beforehand, then craft a separate preg_match("/^(\w+=[^,]+(,|$))+/", …

Strip spaces or quoting

You can skip a few post-processing steps (such as trim on keys and values) with a small addition:

 preg_match_all("/ \s*([^=,]+?) \s*=\s* ([^,]+) (?<!\s) /x", $string, $p);

Or for instance optional quotes:

 preg_match_all("/ \s*([^=,]+?) \s*=\s* '? ([^,]+) (?<![\s']) /x", $string, $p);
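
To see the trimming at work, here is a sketch with a deliberately untidy, made-up input. It assumes a lazy key class that also excludes the comma, so that neither the pair delimiter nor trailing spaces end up in the keys.

```php
<?php
// Hypothetical input with stray spaces and one quoted value.
$string = "name = 'Alice' , city= Paris ,  zip =75001";

// \s* swallows padding around keys and "="; the lookbehind makes the value
// stop before any trailing space or closing quote.
preg_match_all("/ \s*([^=,]+?) \s*=\s* '? ([^,]+) (?<![\s']) /x", $string, $p);
$array = array_combine($p[1], $p[2]);

print_r($array);
```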

INI-style extraction

And you can craft a baseline INI-file extraction method:

 preg_match_all("/^ \s*(\w+) \s*=\s* ['\"]?(.+?)['\"]? \s* $/xm", $string, $p);

Please note that this is just a crude subset of common INI schemes.
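
A small sketch of that INI pattern on a made-up settings block. The section header and the comment line are simply skipped because they don't start with a \w+ key; nothing here handles sections, escapes or multi-line values.

```php
<?php
// Hypothetical INI-like input (nowdoc, so nothing is interpolated).
$string = <<<'INI'
; sample settings
[database]
host = localhost
port = 3306
name = "shop"
user = 'admin'
INI;

// /m lets ^ and $ match at every line; optional quotes are dropped from values.
preg_match_all("/^ \s*(\w+) \s*=\s* ['\"]?(.+?)['\"]? \s* $/xm", $string, $p);
$config = array_combine($p[1], $p[2]);

print_r($config);
```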

Alternative: parse_str()

If you already have a key=value&key2=value2 string, then parse_str works like a charm. Combined with strtr, it can even process other delimiters:

 #                         ↓↓    ↑↑
 parse_str(strtr($string, ":,", "=&"), $pairs);

Which has a couple of pros and cons of its own:

  • Even shorter than the two-line regex approach.
  • Predefines a well-known escaping mechanism (such as %2F for special characters).
  • Does not permit varying delimiters, or unescaped delimiters within.
  • Automatically converts keys[]= to arrays, which you may or may not want though.
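
A short sketch showing two of these points in practice, using a made-up input: percent-escapes are decoded, and keys ending in [] are collected into a nested array.

```php
<?php
// Hypothetical input using ":" and "," as delimiters.
$string = "name:Alice,path:%2Ftmp,tags[]:php,tags[]:regex";

// strtr() maps ":" to "=" and "," to "&" character by character,
// producing a query string that parse_str() understands.
parse_str(strtr($string, ":,", "=&"), $pairs);

print_r($pairs);
```

Here %2F comes back as "/" and both tags[] entries land in one array, which is handy or surprising depending on the data.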

Alternative: explode + foreach

You'll find many examples of manual key/value string expansion, though this is often more code. explode is somewhat overused in PHP due to optimization assumptions. After profiling, however, it often turns out to be slower because of the manual foreach and array collection.

Gaiter answered 13/3, 2011 at 15:38 Comment(3)
How would the regex look if you have a string that includes tabs, like this: key1=value1\tkey2=value2\tkey3=value3? - Aultman
@EchtEinfachTV More like that: #168671 - Albeit there are simpler approaches (replacing \t with & and then calling parse_str(), or a regex such as \t(\w+)=(\w+) or something). Please post a new question, however, if you have a different source. - Gaiter
@Gaiter Thanks, I stumbled over parse_str() and use that now:

 // convert tabs to & to be able to use the query function
 $toURL = str_replace("\t", "&", $keysValues);
 // parse as URL
 parse_str($toURL, $data);
 // access keys in array
 $postid = $data['postid'];

- Aultman

What about something like this :

$str = 'key1\value1\key2\value2\key3\value3\key4\value4\key5\value5';
$list = explode('\\', $str);

$result = array();
for ($i=0 ; $i<count($list) ; $i+=2) {
    $result[ $list[$i] ] = $list[$i+1];
}

var_dump($result);

Which would get you :

array
  'key1' => string 'value1' (length=6)
  'key2' => string 'value2' (length=6)
  'key3' => string 'value3' (length=6)
  'key4' => string 'value4' (length=6)
  'key5' => string 'value5' (length=6)


Basically, here, the idea is to :

  • split the string
  • which will give you an array such as 'key1', 'value1', 'key2', 'value2', ...
  • and, then, iterate over this list, with a jump of 2, using each time :
    • one element as the key -- the one pointed by $i
    • the one just after it as the value -- the one pointed by $i+1
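
If the input might end with a key that has no value, the same loop can be wrapped in a small helper. This is only a sketch: the function name pairs_to_array and the choice of null for a missing value are assumptions, not part of the code above.

```php
<?php
// Hypothetical helper: pairs up alternating keys and values, and assigns
// null when the last key has no value (odd number of elements).
function pairs_to_array(string $str, string $delimiter = '\\'): array
{
    $list = explode($delimiter, $str);
    $result = [];
    for ($i = 0, $n = count($list); $i < $n; $i += 2) {
        // $list[$i] is the key, $list[$i + 1] the value right after it.
        $result[$list[$i]] = $list[$i + 1] ?? null;
    }
    return $result;
}

var_dump(pairs_to_array('key1\value1\key2'));
```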
Silassilastic answered 13/3, 2011 at 15:36 Comment(0)

I am not that good with regex, but how about this one-liner:

parse_str(preg_replace("/key(.*?)\\value(.*?)(\\|$)/", "key$1=value$2&", $input_lines), $output);
Book answered 3/12, 2016 at 19:57 Comment(0)

@Wasim's answer

  1. does not work as posted, and
  2. comes with the caveats that @mario lists.

It can be improved and made more flexible by using a mix of negated character classes and escaped literal backslashes to build a query string.

Code:

parse_str(preg_replace('/([^\\\\]+)\\\\([^\\\\]+)\\\\?/', '$1=$2&', $string), $output);
var_export($output);

Output:

array (
  'key1' => 'value1',
  'key2' => 'value2',
  'key3' => 'value3',
  'key4' => 'value4',
  'key5' => 'value5',
)

Otherwise, you can explode, chunk, and manually assign the elements.

Code:

$result = [];
foreach (array_chunk(explode('\\', $string), 2) as [$key, $val]) {
    $result[$key] = $val;
}
var_export($result);  // same output as snippet above

Or a body-less foreach() with array destructuring:

$result = [];
foreach (
    array_chunk(explode('\\', $string), 2)
    as
    [$key, $result[$key]]
);
var_export($result);

Or use looped strtok() calls:

$result = [];
$key = strtok($string, "\\");
while ($key !== false) {
    $result[$key] = strtok("\\");
    $key = strtok("\\");
}
var_export($result);
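
One difference between these variants shows up with empty values. The sketch below uses a made-up input in which key1 has an empty value: explode() keeps the empty element, while strtok() skips empty tokens, so the strtok() loop misaligns the pairs.

```php
<?php
// Hypothetical input in which key1 has an empty value (two backslashes in a row).
// In a single-quoted string, each \\ is one literal backslash.
$string = 'key1\\\\key2\value2';

// explode() keeps the empty element, so the pairs stay aligned.
$viaExplode = [];
foreach (array_chunk(explode('\\', $string), 2) as [$key, $val]) {
    $viaExplode[$key] = $val;
}

// strtok() skips the empty token, so every later key/value shifts by one.
$viaStrtok = [];
$key = strtok($string, "\\");
while ($key !== false) {
    $viaStrtok[$key] = strtok("\\");
    $key = strtok("\\");
}

var_export($viaExplode); // key1 => '', key2 => 'value2'
var_export($viaStrtok);  // key1 => 'key2', value2 => false
```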
Carothers answered 7/12, 2020 at 9:18 Comment(0)
