Check to see if a string is serialized?
H

13

153

What's the best way to determine whether or not a string is the result of the serialize() function?

https://www.php.net/manual/en/function.serialize

Haslam answered 2/9, 2009 at 20:28 Comment(0)
H
65

From WordPress core functions:

<?php
function is_serialized( $data, $strict = true ) {
    // If it isn't a string, it isn't serialized.
    if ( ! is_string( $data ) ) {
        return false;
    }
    $data = trim( $data );
    if ( 'N;' === $data ) {
        return true;
    }
    if ( strlen( $data ) < 4 ) {
        return false;
    }
    if ( ':' !== $data[1] ) {
        return false;
    }
    if ( $strict ) {
        $lastc = substr( $data, -1 );
        if ( ';' !== $lastc && '}' !== $lastc ) {
            return false;
        }
    } else {
        $semicolon = strpos( $data, ';' );
        $brace     = strpos( $data, '}' );
        // Either ; or } must exist.
        if ( false === $semicolon && false === $brace ) {
            return false;
        }
        // But neither must be in the first X characters.
        if ( false !== $semicolon && $semicolon < 3 ) {
            return false;
        }
        if ( false !== $brace && $brace < 4 ) {
            return false;
        }
    }
    $token = $data[0];
    switch ( $token ) {
        case 's':
            if ( $strict ) {
                if ( '"' !== substr( $data, -2, 1 ) ) {
                    return false;
                }
            } elseif ( false === strpos( $data, '"' ) ) {
                return false;
            }
            // Or else fall through.
        case 'a':
        case 'O':
            return (bool) preg_match( "/^{$token}:[0-9]+:/s", $data );
        case 'b':
        case 'i':
        case 'd':
            $end = $strict ? '$' : '';
            return (bool) preg_match( "/^{$token}:[0-9.E+-]+;$end/", $data );
    }
    return false;
} 
Hereby answered 14/2, 2011 at 16:32 Comment(6)
I basically needed a regex to do a basic detect, I ended up using: ^([adObis]:|N;)Osmose
Current WordPress version is somewhat more sophisticated: codex.wordpress.org/Function_Reference/…Birck
+1 for giving credits. I didn't know WordPress had this built-in. Thanks for the idea -- I'll now go ahead and create an archive of useful functions from the WordPress Core.Nellenelli
This is a good function. Unserialize by default throws an error if the target isn't valid...yet this will not only detect if it is serialized but correctly formatted which is huge.Geez
This function does not really validate arrays. Also be careful: strings with added slashes before '"' are falsely detected as serialized, although unserialize fails on them.Buddhi
@CédricFrançoys I've included updated url in answer, so you can delete comment now.Lockett
A
212

I'd say, try to unserialize it ;-)

Quoting the manual:

In case the passed string is not unserializeable, FALSE is returned and E_NOTICE is issued.

So you have to check whether the return value is false (using === or !==, so that values such as 0, null, or an empty string, which are only loosely equal to false, are not mistaken for a failure).

Just beware of the notice: you may want or need to use the @ operator to suppress it.

For instance:

$str = 'hjkl';
$data = @unserialize($str);
if ($data !== false) {
    echo "ok";
} else {
    echo "not ok";
}

This will output:

not ok


EDIT: As @Peter pointed out (thanks!), you will run into trouble if you try to unserialize the representation of a boolean false, because unserialize() legitimately returns false for it :-(

So it helps to also check whether your serialized string is equal to "b:0;". Something like this should do the trick:

$data = @unserialize($str);
if ($str === 'b:0;' || $data !== false) {
    echo "ok";
} else {
    echo "not ok";
}

Testing that special case before trying to unserialize would be an optimization, but probably not a very useful one if you rarely have a serialized false value.

Accouterment answered 2/9, 2009 at 20:31 Comment(10)
But what if the unserialized value is a boolean with a value of FALSE?Kato
@Kato : excellent remark ; I've edited my answer with a proposition to deal with that case ; thanks !Accouterment
Thanks. :) I assumed this was probably going to be the answer.. Just seems to me that there should be a way to find out if it's serialized before actually forcing the parser to attempt to process it.Haslam
I added another answer below that addresses the "false as a valid value" problem. Let me know what you think.Sartorial
Still, logs are flooded with notices if serialization fails :(Wingover
Does this method have any reasonable impact on performance with bigger pieces of data?Ethnomusicology
Despite the '@', I am still getting a warning with this. Is there any way to truly suppress the warning when attempting to unserialize a non-serialized variable?Tailspin
IMPORTANT: Never ever unserialize raw user data since it can be used as an attack vector. OWASP:PHP_Object_InjectionDermott
How about $data = @unserialize($str); if (serialize($data) === $str) echo "ok";Germanophobe
How about a solution that doesn't bow down to the evil @?Dulcimer
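Following up on the object-injection comment above: a minimal sketch of the same check, assuming PHP 7.0+ where unserialize() accepts an options array. The helper name can_unserialize_safely is hypothetical. Passing allowed_classes => false reduces the risk of unserializing untrusted data but does not make it a safe habit.

```php
<?php
// Hypothetical helper name; not taken from any answer in this thread.
function can_unserialize_safely(string $str): bool
{
    // serialize(false) is valid input, but unserialize() returns false for it too.
    if ($str === 'b:0;') {
        return true;
    }
    // allowed_classes => false (PHP 7.0+) turns every object into
    // __PHP_Incomplete_Class, so magic methods such as __wakeup() of real classes do not run.
    return @unserialize($str, ['allowed_classes' => false]) !== false;
}

var_dump(can_unserialize_safely(serialize(['a' => 1]))); // bool(true)
var_dump(can_unserialize_safely('hjkl'));                // bool(false)
```

This still relies on the @ operator; it only changes what happens to objects inside the payload.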
H
27

Optimizing Pascal MARTIN's response

/**
 * Check if a string is serialized
 * @param string $string
 */
public static function is_serial($string) {
    return (@unserialize($string) !== false);
}
Housing answered 14/2, 2011 at 16:22 Comment(0)
D
18

If $string is a serialized false value, i.e. $string = 'b:0;', SoN9ne's function returns false, which is wrong.

So the function should be:

/**
 * Check if a string is serialized
 *
 * @param string $string
 *
 * @return bool
 */
function is_serialized_string($string)
{
    return ($string === 'b:0;' || @unserialize($string) !== false);
}
Depilate answered 16/9, 2013 at 17:2 Comment(6)
Swapping the order of these tests would be more efficient.Savonarola
The @ ( at operator ) should be discouraged. Use try catch block instead.Liesa
@FranciscoLuz from the manual php.net/manual/en/function.unserialize.php In case the passed string is not unserializeable, FALSE is returned and E_NOTICE is issued. We can't catch E_NOTICE error as it isn't a thrown exception.Depilate
@HazemNoor I tested it with PHP 7 and it does get caught. Also, in PHP 7, there is catch(\Throwable $e) which catches everything that goes wrong under the hood.Liesa
@FranciscoLuz how did you catch an E_NOTICE in PHP 7?Gladdie
@Gladdie Take a read on this php.net/manual/en/language.errors.php7.phpLiesa
S
13

Despite Pascal MARTIN's excellent answer, I was curious whether you could approach this another way, so I did this just as a mental exercise:

<?php

ini_set( 'display_errors', 1 );
ini_set( 'track_errors', 1 );
error_reporting( E_ALL );

$valueToUnserialize = serialize( false );
//$valueToUnserialize = "a"; # uncomment this for another test

$unserialized = @unserialize( $valueToUnserialize );

if ( FALSE === $unserialized && isset( $php_errormsg ) && strpos( $php_errormsg, 'unserialize' ) !== FALSE )
{
  echo 'Value could not be unserialized<br>';
  echo $valueToUnserialize;
} else {
  echo 'Value was unserialized!<br>';
  var_dump( $unserialized );
}

And it actually works. The only caveat is that it will likely break if you have a registered error handler because of how $php_errormsg works.

Sartorial answered 2/9, 2009 at 21:26 Comment(5)
+1: This one is fun, I have to admit. I wouldn't have thought of it! And I can't find a way to make it fail, either ^^ Nice work! And thanks for the comment on my answer: without it, I would probably not have seen this answer.Accouterment
$a = 'bla'; $b = 'b:0;'; Try to unserialize $a then $b with this, both will fail while $b shouldn't.Allard
Not if there was a failure right before. Because $php_errormsg will still contain the serialization error from before and once you deserialize false then it will fail.Allard
Yeah, but only if you don't error-check in-between deserializing $a and deserializing $b, which is not practical application design.Sartorial
Just an FYI - This feature has been DEPRECATED as of PHP 7.2.0. Relying on this feature is highly discouraged. - php.net/manual/en/reserved.variables.phperrormsg.phpAdjust
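Since $php_errormsg is deprecated, here is a rough sketch of the same idea using error_get_last() and error_clear_last() (PHP 7.0+). The function name try_unserialize and its by-reference $result parameter are made up for illustration. Clearing the last error first is meant to address the stale-error scenario raised in the comments; I believe the caveat about a registered error handler still applies, because an error that a custom handler has handled does not show up in error_get_last().

```php
<?php
// Hypothetical helper; uses error_get_last() instead of $php_errormsg.
function try_unserialize(string $value, &$result = null): bool
{
    // unserialize('') returns false, apparently without raising an error,
    // so reject the empty string up front.
    if ($value === '') {
        return false;
    }
    error_clear_last(); // forget any error left over from earlier code
    $result = @unserialize($value, ['allowed_classes' => false]);
    $error  = error_get_last();
    // A failure is "returned false AND an error was recorded";
    // serialize(false) returns false with no error recorded.
    return !($result === false && $error !== null);
}

var_dump(try_unserialize('bla'));  // bool(false)
var_dump(try_unserialize('b:0;')); // bool(true)
```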
L
11
$data = @unserialize($str);
if($data !== false || $str === 'b:0;')
    echo 'ok';
else
    echo "not ok";

Correctly handles the case of serialize(false). :)

Lupien answered 2/9, 2009 at 20:38 Comment(0)
A
2

Built into a function:

function isSerialized($value)
{
   return preg_match('^([adObis]:|N;)^', $value);
}
Abase answered 22/8, 2017 at 10:43 Comment(1)
This regex is dangerous, it's returning positive when a: (or b: etc) is present somewhere inside $value, not in the beginning. And ^ here doesn't mean beginning of a string. It's totally misleading.Anisole
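To illustrate the comment above: in the answer's pattern the two ^ characters act as the regex delimiters, so nothing anchors the match to the start of the string. A sketch of an anchored variant, using / as the delimiter, could look like this. The name looks_serialized is hypothetical, and this is still only a prefix check: it says nothing about whether the rest of the string is valid.

```php
<?php
// Hypothetical name; a prefix check only, not a validity check.
function looks_serialized(string $value): bool
{
    // With "/" as the delimiter, "^" is a real start-of-string anchor.
    return preg_match('/^(?:[adObis]:|N;)/', $value) === 1;
}

var_dump(looks_serialized('a:0:{}'));        // bool(true)
var_dump(looks_serialized('see a: and b:')); // bool(false)
```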
B
2

There is a WordPress solution (details are here):

    function is_serialized($data, $strict = true)
    {
        // if it isn't a string, it isn't serialized.
        if (!is_string($data)) {
            return false;
        }
        $data = trim($data);
        if ('N;' == $data) {
            return true;
        }
        if (strlen($data) < 4) {
            return false;
        }
        if (':' !== $data[1]) {
            return false;
        }
        if ($strict) {
            $lastc = substr($data, -1);
            if (';' !== $lastc && '}' !== $lastc) {
                return false;
            }
        } else {
            $semicolon = strpos($data, ';');
            $brace = strpos($data, '}');
            // Either ; or } must exist.
            if (false === $semicolon && false === $brace)
                return false;
            // But neither must be in the first X characters.
            if (false !== $semicolon && $semicolon < 3)
                return false;
            if (false !== $brace && $brace < 4)
                return false;
        }
        $token = $data[0];
        switch ($token) {
            case 's' :
                if ($strict) {
                    if ('"' !== substr($data, -2, 1)) {
                        return false;
                    }
                } elseif (false === strpos($data, '"')) {
                    return false;
                }
            // or else fall through
            case 'a' :
            case 'O' :
                return (bool)preg_match("/^{$token}:[0-9]+:/s", $data);
            case 'b' :
            case 'i' :
            case 'd' :
                $end = $strict ? '$' : '';
                return (bool)preg_match("/^{$token}:[0-9.E+-]+;$end/", $data);
        }
        return false;
    }
Bighead answered 17/2, 2018 at 9:9 Comment(0)
C
1
/**
 * some people will look down on this little puppy
 */
function isSerialized($s)
{
    return stristr($s, '{') != false
        && stristr($s, '}') != false
        && stristr($s, ';') != false
        && stristr($s, ':') != false;
}
Chalcocite answered 27/8, 2012 at 23:11 Comment(4)
Well, this would give true for many JSON strings as well, wouldn't it? So it's not a reliable way to determine whether the string can be unserialized.Nutwood
Might be true, but if the alternative is serialized, or just plain text, as it was for me, it works like a charm.Familial
@Chalcocite "Well it works for me in this specific case" is a really bad mentality to have when coding. There are a lot of developers who are lazy or not forward-thinking like this and it makes for a nightmare later on down the line when other developers have to work with their code or try to change something and suddenly nothing works properly anymore.Leo
Making completely solid code (if that is even possible) is not always the goal or the best practice, not when it comes at the expense of time. That is only true from the programmer's perspective. In real life there are a lot of circumstances where quick and dirty is the preferred way.Familial
S
1

This works fine for me

<?php

function is_serialized($data){
    return (is_string($data) && preg_match("#^((N;)|((a|O|s):[0-9]+:.*[;}])|((b|i|d):[0-9.E+-]+;))$#um", $data));
}

?>
Subversive answered 5/11, 2015 at 20:34 Comment(1)
Please bear in mind this checks if given string is serialize-looking string - it won't actually check the validity of that string.Menstruation
E
1

I would just try to unserialize it. This is how I would solve it:

public static function is_serialized($string)
{
    try {
        unserialize($string);
    } catch (\Exception $e) {
        return false;
    }

    return true;
}

Or, as a standalone helper function:

function is_serialized($string) {
    try {
        unserialize($string);
    } catch (\Exception $e) {
        return false;
    }

    return true;
}
Expurgate answered 1/12, 2020 at 15:54 Comment(2)
As of PHP 8.1.0, it seems a PHP Notice is shown, which is not catchable.Symphonia
ditto to what Max13 said, this looked like a great solve until I tried it. Oh well.Collator
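As the comments note, unserialize() reports bad input through a notice (or warning), not an exception, so the catch block in the answer never fires. One way to make the try/catch idea work, sketched here under the assumption that temporarily swapping the error handler is acceptable, is to convert the error into an ErrorException. The name unserializes_cleanly is hypothetical.

```php
<?php
// Hypothetical helper; converts unserialize() errors into exceptions.
function unserializes_cleanly(string $value): bool
{
    // unserialize('') returns false, apparently without raising an error.
    if ($value === '') {
        return false;
    }
    // Temporarily turn any PHP error into an ErrorException.
    set_error_handler(static function (int $errno, string $errstr): bool {
        throw new ErrorException($errstr, 0, $errno);
    });
    try {
        unserialize($value, ['allowed_classes' => false]);
        return true; // no error raised, so 'b:0;' counts as valid
    } catch (Throwable $e) {
        return false;
    } finally {
        restore_error_handler(); // always put the previous handler back
    }
}

var_dump(unserializes_cleanly('b:0;'));           // bool(true)
var_dump(unserializes_cleanly('not serialized')); // bool(false)
```

Because success is decided by whether an error was raised and not by the return value, no special case for serialize(false) is needed.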
B
1
  • The WordPress function mentioned above does not really validate arrays (a:1:{42} is considered to be serialized) and falsely returns true for escaped strings like a:1:{s:3:\"foo\";s:3:\"bar\";} (although unserialize fails on them).

  • If you use the @unserialize approach instead, WordPress, for example, adds an ugly margin at the top of the backend when define('WP_DEBUG', true); is set.

  • A working solution that solves both problems and avoids the stfu-operator (@) is:
function __is_serialized($var)
{
    if (!is_string($var) || $var == '') {
        return false;
    }
    set_error_handler(function ($errno, $errstr) {});
    $unserialized = unserialize($var);
    restore_error_handler();
    if ($var !== 'b:0;' && $unserialized === false) {
        return false;
    }
    return true;
}
Buddhi answered 1/6, 2021 at 12:45 Comment(0)
B
0

See the WordPress function is_serialized:

function is_serialized( $data, $strict = true ) {
    // If it isn't a string, it isn't serialized.
    if ( ! is_string( $data ) ) {
        return false;
    }
    $data = trim( $data );
    if ( 'N;' === $data ) {
        return true;
    }
    if ( strlen( $data ) < 4 ) {
        return false;
    }
    if ( ':' !== $data[1] ) {
        return false;
    }
    if ( $strict ) {
        $lastc = substr( $data, -1 );
        if ( ';' !== $lastc && '}' !== $lastc ) {
            return false;
        }
    } else {
        $semicolon = strpos( $data, ';' );
        $brace     = strpos( $data, '}' );
        // Either ; or } must exist.
        if ( false === $semicolon && false === $brace ) {
            return false;
        }
        // But neither must be in the first X characters.
        if ( false !== $semicolon && $semicolon < 3 ) {
            return false;
        }
        if ( false !== $brace && $brace < 4 ) {
            return false;
        }
    }
    $token = $data[0];
    switch ( $token ) {
        case 's':
            if ( $strict ) {
                if ( '"' !== substr( $data, -2, 1 ) ) {
                    return false;
                }
            } elseif ( false === strpos( $data, '"' ) ) {
                return false;
            }
            // Or else fall through.
        case 'a':
        case 'O':
            return (bool) preg_match( "/^{$token}:[0-9]+:/s", $data );
        case 'b':
        case 'i':
        case 'd':
            $end = $strict ? '$' : '';
            return (bool) preg_match( "/^{$token}:[0-9.E+-]+;$end/", $data );
    }
    return false;
}

Barajas answered 20/10, 2020 at 10:16 Comment(0)
