Why is this call to strstr not returning false?

I've stumbled upon an issue with strstr in an old legacy codebase. There's a lot of code, but the test case boils down to this:

$value = 2660;
$link = 'affiliateid=1449&zoneid=6011&placement_id=11736&publisher_id=1449&period_preset=yesterday&period_start=2017-03-27&period_end=2017-03-27';

var_dump(strstr($link, $value));

I would expect this to return false, since "2660" is not in the string; however, it returns d=1449&zoneid=6011&placement_id=11736&publisher_id=1449&period_preset=yesterday&period_start=2017-03-27&period_end=2017-03-27.

I realise that $value should be a string, but I still don't understand why PHP doesn't cast it to a string and why it finds this number in the link.

Actually, if I try with $value = '2660'; it returns false as expected.

Any idea what's happening?

Branchia answered 28/3, 2017 at 10:10 Comment(5)
Just stringify it: var_dump(strstr($link, (string)$value)); PHP is not a strict language and tends to give strange results when different types are compared. Just like (1 == "1a") evaluates to true in PHP 7.Speedwell
Excuse my ignorance, but... what does "strstr" mean? I checked the documentation and I honestly don't have a clue why such a weird name was picked for this function! Is it "Stray String"? "Search This Range for a String"? "String String"?Cheryle
@Tsar, yeah it's neither a good name nor a good function. Maybe it comes from strpos which returns the position of the first occurrence, while strstr returns the string at the first occurrence.Branchia
@TSar: strstr finds a string inside another string. Don't blame PHP for this; the name comes from C.Overeat
@Overeat Eh, that's not an excuse :P But fair point. C is known to be really... uh, let's say, exotic regarding some naming decisions here and there, so I'm not surprised at all!Cheryle

Short answer

When you run strstr($str, 2660) the $needle is resolved to the character "d" by calling chr(2660) and therefore it stops at the first "d" found in the given $str, in this case right at the 11th character.
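A minimal sketch of what that lookup amounts to, using the haystack from the question. The integer-needle behaviour described here depends on the PHP version, so the sketch passes an explicit one-character string instead of the integer, and writes the reduction to a single byte out as 2660 % 256:

```php
<?php
// Same haystack as in the question, split only for readability.
$link = 'affiliateid=1449&zoneid=6011&placement_id=11736&publisher_id=1449'
      . '&period_preset=yesterday&period_start=2017-03-27&period_end=2017-03-27';

// The single character that the integer needle 2660 is reduced to.
$needle = chr(2660 % 256);

var_dump($needle);                 // string(1) "d"
var_dump(strpos($link, $needle));  // int(10), zero-based, i.e. the 11th character
var_dump(strstr($link, $needle));  // everything from the first "d" onwards
```

The last call returns the same "d=1449&zoneid=..." string the question reports.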


Why are we calling chr(2660)?

Because when the $needle is not a string, strstr casts that argument to an integer and searches for the character with that ordinal value, which for 2660 is "d". The manual says:

If needle is not a string, it is converted to an integer and applied as the ordinal value of a character.


But why does chr(2660) return "d" when ord("d") is 100?

Because values outside the valid range [0-255] are bitwise AND'ed with 255, which is equivalent to the following algorithm [source]:

while ($ascii < 0) {
    $ascii += 256;
}
$ascii %= 256;

So, 2660 becomes 100, and then when passed to strstr it's used as the ordinal value of the character and looks for character "d".
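The arithmetic can be checked directly. This is only a sketch: wrapToByte is a hypothetical helper name wrapping the algorithm quoted above, not a PHP built-in.

```php
<?php
// Hypothetical helper: the wrap-around algorithm quoted above, as a function.
function wrapToByte(int $ascii): int
{
    // Lift negative values into the non-negative range first,
    // because PHP's % keeps the sign of the left operand.
    while ($ascii < 0) {
        $ascii += 256;
    }
    return $ascii % 256;
}

var_dump(wrapToByte(2660));       // int(100), since 2660 = 10 * 256 + 100
var_dump(2660 & 255);             // int(100), the bitwise AND gives the same result
var_dump(chr(wrapToByte(2660)));  // string(1) "d"
var_dump(ord('d'));               // int(100)
var_dump(wrapToByte(-156));       // int(100), negative input wraps around too
```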

Confusing? Yes. I also expected it to be cast to a string, but that's what we get for assuming things in programming. This, at least, is documented; you'd be surprised at the number of times something weird happens and there's no official explanation to be found. At least not as easily as following the same link you provided.


Why is it named strstr?

I did a little research and found this gem (Rationale for American National Standard for Information Systems - Programming Language - C) from all the way back in 1989, where they named all the functions relating to strings with the prefix str, which is logical. Since PHP's source is written in C, that explains why the name carried over. My best guess is that the name reflects searching for a string inside another string. They do say:

The strstr function is an invention of the Committee. It is included as a hook for efficient substring algorithms, or for built-in substring instructions.

Useful docs

  • Documentation for strstr
  • Documentation for chr
  • PHPwtf a good resource for weirdness

Breton answered 28/3, 2017 at 10:18 Comment(3)
I looked into chr; apparently it takes the value modulo 256 ($ascii %= 256;), so 2660 turns into 100, which is d. That is actually a pretty strange design.Speedwell
Yeah, it is kind-of documented in the PHP doc, "Example #2 Overflow behavior". Also, the first comment there truly sheds light on it: "Note that if the number is higher than 256, it will return the number mod 256.". Weird that this isn't noted more clearly in the Return Values or Parameters section though.Saffron
Well, PHP usually comes with a large set of confusing weird thingsThereupon

I think this answers your question:

needle
If needle is not a string, it is converted to an integer and applied as the ordinal value of a character.

http://php.net/manual/en/function.strstr.php

Edit because of the comments:

chr(2660) returns character d, which is indeed in the haystack and that's why it won't return false as you expected.
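One way to avoid the surprise, as suggested in the comments on the question, is to cast the needle to a string before searching. A minimal sketch, assuming a hypothetical wrapper named findFrom (not part of PHP):

```php
<?php
// Hypothetical wrapper: always search for the needle's string form,
// so an integer such as 2660 is looked up as the text "2660".
function findFrom(string $haystack, $needle)
{
    return strstr($haystack, (string) $needle);
}

$link = 'affiliateid=1449&zoneid=6011&placement_id=11736&publisher_id=1449'
      . '&period_preset=yesterday&period_start=2017-03-27&period_end=2017-03-27';

var_dump(findFrom($link, 2660));  // bool(false): "2660" is not in the link
var_dump(findFrom($link, 6011));  // string starting at "6011&placement_id=..."
```

With the cast in place the result no longer depends on how a given PHP version treats a non-string needle.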

Balmacaan answered 28/3, 2017 at 10:13 Comment(5)
...so 2660 is which character?! :)Threescore
Which... is not in the haystack.Saffron
Yes, I googled "spade", but @Juan Cortés, with a (slightly cheeky) later answer, says chr(2660) is a d. Interesting.Threescore
Hm, ord("♠") gives 226, chr(2660) gives d!Saffron
@domdom, precisely, that's what I was just writing as a response. And d is indeed in the haystack.Balmacaan
