Preg_match_all returning array within array?

I am trying to get the information out of this array, but for some reason it is nesting everything into $matches[0].

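<?php

$file = shell_exec('pdf2txt.py docs/April.pdf');

// $matches is filled by reference; the call site must not use &
// (call-time pass-by-reference is an error in PHP 5.4+).
preg_match_all('/.../', $file, $matches);
print_r($matches);

?>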

Is this working as intended? Is there a way to put this in an array of depth 1?

EDIT:

This is the RegEx:

([A-Z][a-z]+\s){1,5}\s?[^a-zA-Z\d\s:,.\'\"]\s?[A-Za-z+\W]+\s[\d]{1,2}\s[A-Z][a-z]+\s[\d]{4}
Exophthalmos answered 19/5, 2011 at 5:59 Comment(1)
Is this the regex you are using? Show the real one. Ivanovo

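preg_match_all() itself returns the number of full matches (or false on error); the results are written to $matches, which is always a nested array. With the default ordering, index 0 contains an array with one element for each entire match, and each following index corresponds to a capturing group, holding an internal array with one element per match. If nothing matches, those inner arrays are simply empty.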

This might be easier to understand...

array(2) {
  [0]=>
  array(2) {
    [0]=>
    string(12) "entire match"
    [1]=>
    string(32) "entire match matched second time"
  }
  [1]=>
  array(2) {
    [0]=>
    string(15) "capturing group"
    [1]=>
    string(35) "capturing group matched second time"
  }
}
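
As for getting an array of depth 1: with the default ordering, $matches[0] already is one. Here is a minimal sketch; the input string and pattern are made up for illustration and are not the asker's PDF text or regex.

```php
<?php
// Hypothetical input and pattern, for illustration only.
$text = 'Paid 12 April 2011 and 3 May 2011';

// The return value is the number of full matches; results land in $matches.
$count = preg_match_all('/(\d{1,2})\s([A-Z][a-z]+)\s\d{4}/', $text, $matches);

// Default ordering (PREG_PATTERN_ORDER):
$fullMatches = $matches[0]; // flat list of entire matches
$days        = $matches[1]; // flat list of what group 1 captured
$months      = $matches[2]; // flat list of what group 2 captured

print_r($fullMatches);
```

So if only the full matches are needed, working with $matches[0] directly avoids the nesting.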
Distillate answered 19/5, 2011 at 6:5 Comment(3)
Wow. It's like Inception. Let me try to wrap my head around this. Exophthalmos
Whoa... I hit that red button when Family Guy was blasting pretty loud! Win! Exophthalmos
@RVWard No worries, glad it was appropriate :) Distillate

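If your objective is to obtain only the captured characters (what was captured by your "([A-Z][a-z]+\s){1,5}"), look inside $matches[1]; $matches[1][0] contains the text that group captured in the first match. Note that because the group is repeated ({1,5}), it keeps only its last repetition, not every word it passed over.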

Per the preg_match_all docs, if no order flag is specified (as in your example), PREG_PATTERN_ORDER is assumed. With this ordering, $matches[0] is an array containing every string that matched your full pattern, and $matches[1] is an array containing the strings captured by your first capturing group.
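
If grouping by match is more convenient than grouping by capturing group, the PREG_SET_ORDER flag flips the nesting. A small sketch comparing the two orderings, again with a made-up input and pattern rather than the regex from the question:

```php
<?php
// Hypothetical input and pattern, for illustration only.
$text    = 'a1 b2';
$pattern = '/([a-z])(\d)/';

// Default ordering: one inner array per group (index 0 = full matches).
preg_match_all($pattern, $text, $byPattern);

// PREG_SET_ORDER: one inner array per match (index 0 = full match).
preg_match_all($pattern, $text, $bySet, PREG_SET_ORDER);

foreach ($bySet as $m) {
    // $m[0] is the full match, $m[1] and $m[2] are its captures.
    echo $m[0], ' => ', $m[1], ', ', $m[2], PHP_EOL;
}
```

Either way the result is nested; the flag only decides whether the outer index means "which group" or "which match".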

Capsulize answered 19/5, 2011 at 6:13 Comment(1)
Alex illustrates it perfectly. =) Capsulize
