"Transient" properties in a PHP class?
I've worked with PHP for a few years now, but until now I never needed to deal with serialisation explicitly; I only used $_SESSION. Now I have a project that requires me to implement a serialisation mechanism manually for certain data - and I realise that the issue applies to $_SESSION as well.

I have a class that contains a number of properties. Most of these properties are small (in terms of memory consumption): numbers, relatively short strings, etc. However, the class also contains some properties that may hold HUGE arrays (e.g. an entire dump of a database table: 100,000 rows with 100 fields each). As it happens, this is one of the classes that needs to be serialised/deserialised - and, luckily, the properties containing large arrays don't need to be serialised, as they are essentially temporary working data and are rebuilt as necessary anyway.

In such circumstances in Java, I would simply declare the property as transient - and it would be omitted from serialisation. Unfortunately, PHP doesn't support such qualifiers.

One way to deal with it is to have something like this:

class A implements Serializable
{
    private $var_small = 1234;
    private $var_big = array( ... );  //huge array, of course, not init in this way

    public function serialize()
    {
        $vars = get_object_vars($this);
        unset($vars['var_big']);
        return serialize($vars);
    }

    public function unserialize($data)
    {
        $vars = unserialize($data);
        foreach ($vars as $var => $value) {
            $this->$var = $value;
        }
    }
}

However, this is rather cumbersome, as I would need to update the serialize method every time I add another transient property. Also, once inheritance comes into play, this becomes even more complicated to deal with, as transient properties may be in both the subclass and the parent. I know it's still doable, but I would prefer to delegate as much as possible to the language rather than reinvent the wheel.

So, what's the best way to deal with transient properties? Or am I missing something and PHP supports this out of the box?

Virilism answered 2/2, 2012 at 14:51 Comment(0)

PHP provides the __sleep() magic method, which lets you choose which properties are serialized.
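As a minimal sketch of the idea for the simple case without inheritance: the class below returns only the names of the properties it wants to keep from __sleep(). The class name Report and its members are hypothetical and exist only for illustration.

```php
<?php

// Hypothetical class for illustration; names are not from the question.
class Report
{
    private $title = 'Monthly report';
    private $rows = [];   // "transient": large, rebuilt on demand

    public function loadRows(): void
    {
        // Stand-in for an expensive database dump.
        $this->rows = range(1, 1000);
    }

    public function rowCount(): int
    {
        return count($this->rows);
    }

    public function title(): string
    {
        return $this->title;
    }

    public function __sleep()
    {
        // Only the listed properties are written to the serialized string.
        return ['title'];
    }
}

$report = new Report();
$report->loadRows();

$copy = unserialize(serialize($report));

var_dump($copy->title());    // string(14) "Monthly report"
var_dump($copy->rowCount()); // int(0): $rows fell back to its declared default
```

Properties left out of the list come back with their declared default values, so the class has to be able to rebuild them when needed.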

EDIT: I've tested how __sleep() behaves when inheritance is involved:

<?php

class A {
    private $a = 'String a';
    private $b = 'String b';

    public function __sleep() {
        echo "Sleep A\n";
        return array( 'a');
    }
}

class B extends A {
    private $c = 'String c';
    private $d = 'String d';

    public function __sleep() {
        echo "Sleep B\n";
        return array( 'c');
    }
}

class C extends A {
    private $e = 'String e';
    private $f = 'String f';

    public function __sleep() {
        echo "Sleep C\n";
        return array_merge( parent::__sleep(), array( 'e'));
    }
}

$a = new A();
$b = new B();
$c = new C();

echo serialize( $a) ."\n";  // Result: O:1:"A":1:{s:4:"Aa";s:8:"String a";}
// called "Sleep A" (correct)

echo serialize( $b) ."\n"; // Result: O:1:"B":1:{s:4:"Bc";s:8:"String c";}
// called just "Sleep B": A::__sleep() is not invoked automatically, so A's data is missing

echo serialize( $c) ."\n"; // Caused: PHP Notice:  serialize(): "a" returned as member variable from __sleep() but does not exist ...

// When you declare `private $a` as `protected $a`, class C returns:
// O:1:"C":2:{s:4:"*a";s:8:"String a";s:4:"Ce";s:8:"String e";}
// which is correct; both "Sleep C" and "Sleep A" are called

So it seems that, at least when __sleep() returns plain property names, you can serialize parent data only if it's declared as protected :-/
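One possible workaround, sketched here under the assumption that the mangled name format "\0ClassName\0property" described in a later answer on this page is accepted by __sleep(): the parent returns its private property under its mangled name, so the name still resolves when the method runs for a subclass instance. This is a reduced variant of the classes above, not a tested replacement for them.

```php
<?php

class A {
    private $a = 'String a';
    private $b = 'String b';

    public function __sleep() {
        // Mangled form "\0ClassName\0property" identifies A's private $a,
        // even when __sleep() runs for a subclass instance.
        return ["\0A\0a"];
    }
}

class C extends A {
    private $e = 'String e';
    private $f = 'String f';

    public function __sleep() {
        // 'e' is private to C itself, so the plain name is enough here.
        return array_merge(parent::__sleep(), ['e']);
    }
}

$serialized = serialize(new C());

// Show the NUL bytes as "\0" so the keys are readable.
echo str_replace("\0", '\0', $serialized), PHP_EOL;
```

The drawback is that class names end up hardcoded in strings, which is easy to get wrong during refactoring.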

EDIT 2: I've tried it with the Serializable interface, using the following code:

<?php

class A implements Serializable {
    private $a = '';
    private $b = '';

    // Just initialize strings outside default values
    public function __construct(){
        $this->a = 'String a';
        $this->b = 'String b';
    }

    public function serialize() {
        return serialize( array( 'a' => $this->a));
    }

    public function unserialize( $data){
        $array = unserialize( $data);
        $this->a = $array['a'];
    }
}

class B extends A {
    private $c = '';
    private $d = '';

    // Just initialize strings outside default values
    public function __construct(){
        $this->c = 'String c';
        $this->d = 'String d';
        parent::__construct();
    }

    public function serialize() {
        return serialize( array( 'c' => $this->c, '__parent' => parent::serialize()));
    }

    public function unserialize( $data){
        $array = unserialize( $data);
        $this->c = $array['c'];
        parent::unserialize( $array['__parent']);
    }
}

$a = new A();
$b = new B();

echo serialize( $a) ."\n";
echo serialize( $b) ."\n";

$a = unserialize( serialize( $a)); // C:1:"A":29:{a:1:{s:1:"a";s:8:"String a";}}
$b = unserialize( serialize( $b)); // C:1:"B":81:{a:2:{s:1:"c";s:8:"String c";s:8:"__parent";s:29:"a:1:{s:1:"a";s:8:"String a";}";}}


print_r( $a);
print_r( $b);

/** Results:
A Object
(
    [a:A:private] => String a
    [b:A:private] => 
)
B Object
(
    [c:B:private] => String c
    [d:B:private] => 
    [a:A:private] => String a
    [b:A:private] => 
)
*/

So, to sum up: you can serialize classes via __sleep() with plain property names only if the superclass has no private members that need to be serialized as well. You can serialize complex objects by implementing the Serializable interface, but that brings some programming overhead.

Keeley answered 2/2, 2012 at 14:55 Comment(3)
__sleep will not work with private properties in the parent class, so it's only helpful as long as inheritance is not involved. - Virilism
Thanks, it does look like a viable approach. I'll experiment a bit more with this to see how it works. - Virilism
@AleksG if you find any better way to do this I'll be glad to edit my answer (at least comment here so that I see it), but I cannot imagine a better way. You have to let the parent serialize first and store its data in a format that the parent's unserialize() can consume again without extra overhead, and I cannot think of a better approach than creating array('__parent' => parent::serialize(), ...). - Keeley

You can use __sleep and __wakeup. For the former, you provide an array of the names of object properties you want serialized. Omit "transient" members from this list.

__wakeup is called immediately after an instance is unserialized. You could use it, for example, to rebuild the transient properties under certain conditions.
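A minimal sketch of the two methods working together, assuming the transient data can be derived from the serialized properties. The class Catalog and its members are hypothetical.

```php
<?php

// Hypothetical class; the lookup table stands in for expensive derived data.
class Catalog
{
    private $ids;
    private $index = null; // transient: derived from $ids

    public function __construct(array $ids)
    {
        $this->ids = $ids;
        $this->buildIndex();
    }

    private function buildIndex(): void
    {
        // Map each id to its position in $ids.
        $this->index = array_flip($this->ids);
    }

    public function positionOf(int $id): ?int
    {
        return $this->index[$id] ?? null;
    }

    public function __sleep()
    {
        return ['ids']; // $index is deliberately left out
    }

    public function __wakeup()
    {
        // Runs right after the serialized properties have been restored.
        $this->buildIndex();
    }
}

$catalog = unserialize(serialize(new Catalog([10, 20, 30])));
var_dump($catalog->positionOf(20)); // int(1)
```

If rebuilding is expensive, __wakeup() could instead leave the property empty and the class could rebuild it lazily on first access.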

Rodina answered 2/2, 2012 at 15:0 Comment(0)

Disclaimer: If no inheritance is involved, or all properties are public or protected, you can use one of the solutions provided before. The solution discussed here is designed to work with inheritance and private properties. It's especially useful for removing injected dependencies.

Basically, use __sleep() to exclude properties from serialization; this requires a way to collect the names of all properties of $this. Use __wakeup() to re-establish the lost connections or data.

PHP Serialization of Objects with Inheritance and Private Properties

If __sleep() is present in your class and __serialize() is not, serialize() uses __sleep() to grab a list of properties which should be serialized. The list is one-dimensional but has to be a specific format to determine which private property belongs to which class. This is the format, where \0 are null chars:

[
    "publicProperty",
    "\0*\0protectedProperty",
    "\0ClassName\0privateProperty",
]
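The format above can be observed directly by casting an object to an array and inspecting the keys. A small sketch, with a hypothetical class Sample that exists only for this demonstration:

```php
<?php

// Hypothetical class used only to show the key format.
class Sample
{
    public $publicProperty = 1;
    protected $protectedProperty = 2;
    private $privateProperty = 3;
}

// Casting to an array exposes the internal (mangled) property names.
$keys = array_keys((array) new Sample());

foreach ($keys as $key) {
    // Make the NUL bytes visible as the two characters "\0".
    echo str_replace("\0", '\0', $key), PHP_EOL;
}
// publicProperty
// \0*\0protectedProperty
// \0Sample\0privateProperty
```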

To bring the property list into the correct format, we found two solutions.

Note: __sleep() only needs to be implemented on the parent class.

Use (array) cast

The array cast results in an array with all properties of an object and with the correct keys.

Note: This solution is still quite error-prone especially during refactoring, since excluded property names are hardcoded as strings.

class ParentClass {
    private $parentProperty1;
    private $parentProperty2;

    public function __construct() {
        $this->parentProperty1 = 'Parent Property 1';
        $this->parentProperty2 = 'Parent Property 2';
    }

    public function __sleep() {
        $excludedProperties = [
            "\0ParentClass\0parentProperty1",
        ];

        $properties = (array)$this;
        return array_filter(array_keys($properties), function ($propertyName) use ($excludedProperties) {
            return !in_array($propertyName, $excludedProperties);
        });
    }
}

class ChildClass extends ParentClass {
    private $childProperty3;

    public function __construct()
    {
        parent::__construct();
        $this->childProperty3 = 'Child Property 3';
    }
}

$child = new ChildClass();
var_dump($child);
$serialized = serialize($child);
var_dump($serialized);
$child = unserialize($serialized);
var_dump($child);

Result

object(ChildClass)#1 (3) {
  ["parentProperty1":"ParentClass":private] => string(17) "Parent Property 1"
  ["parentProperty2":"ParentClass":private] => string(17) "Parent Property 2"
  ["childProperty3":"ChildClass":private] => string(16) "Child Property 3"
}

string(141) "O:10:"ChildClass":2:{s:28:" ParentClass parentProperty2";s:17:"Parent Property 2";s:26:" ChildClass childProperty3";s:16:"Child Property 3";}"

object(ChildClass)#2 (3) {
  ["parentProperty1":"ParentClass":private] => NULL
  ["parentProperty2":"ParentClass":private] => string(17) "Parent Property 2"
  ["childProperty3":"ChildClass":private] => string(16) "Child Property 3"
}

Use Reflection and Attributes (Recommended)

Trade speed for elegance and readability. Using reflection is about 2x slower according to the simple, non-representative benchmarks listed below. For a baseline, serializing a complex object in our production code takes about 60 microseconds (which isn't representative either).

The reflection loops over all properties of this class and all parent classes. It checks if the property is private and builds the property names accordingly.

#[Attribute]
class DoNotSerialize {}

class ParentClass {
    private $parentProperty1;
    
    #[DoNotSerialize]
    private $parentProperty2;

    public function __construct(
            #[DoNotSerialize] private $parentProperty3,
    ) {
        $this->parentProperty1 = 'Parent Property 1';
        $this->parentProperty2 = 'Parent Property 2';
    }

    public function __sleep() {
        $props = [];
        $reflectionClass = new ReflectionClass($this);

        do {
            $reflectionProps = $reflectionClass->getProperties();

            foreach ($reflectionProps as $reflectionProp) {
                // $reflectionProp->setAccessible(true); // not needed after PHP 8.1
                if (empty($reflectionProp->getAttributes(DoNotSerialize::class)) && !$reflectionProp->isStatic()) {
                    $propertyName = $reflectionProp->getName();
                    // PHP uses NUL-byte prefixes to represent visibility in property names
                    if ($reflectionProp->isPrivate()) {
                        $propertyName = "\0" . $reflectionProp->getDeclaringClass()->getName() . "\0" . $propertyName;
                    } elseif ($reflectionProp->isProtected()) {
                        $propertyName = "\0*\0" . $propertyName;
                    }
                    $props[] = $propertyName;
                }
            }
            $reflectionClass = $reflectionClass->getParentClass();
        } while ($reflectionClass);

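        // getProperties() also reports inherited public/protected properties,
        // so they show up once per class level; drop the duplicates.
        return array_values(array_unique($props));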
    }
}

class ChildClass extends ParentClass {
    private $childProperty4;
    
    public function __construct()
    {
        parent::__construct('Parent Property 3');
        $this->childProperty4 = 'Child Property 4';
    }
}

$child = new ChildClass();
var_dump($child);
$serialized = serialize($child);
var_dump($serialized);
$child = unserialize($serialized);
var_dump($child);

Result

object(ChildClass)#1 (4) {
  ["parentProperty1":"ParentClass":private] => string(17) "Parent Property 1"
  ["parentProperty2":"ParentClass":private] => string(17) "Parent Property 2"
  ["parentProperty3":"ParentClass":private] => string(17) "Parent Property 3"
  ["childProperty4":"ChildClass":private] => string(16) "Child Property 4"
}

string(141) "O:10:"ChildClass":2:{s:26:" ChildClass childProperty4";s:16:"Child Property 4";s:28:" ParentClass parentProperty1";s:17:"Parent Property 1";}"

object(ChildClass)#6 (4) {
  ["parentProperty1":"ParentClass":private] => string(17) "Parent Property 1"
  ["parentProperty2":"ParentClass":private] => NULL
  ["parentProperty3":"ParentClass":private] => NULL
  ["childProperty4":"ChildClass":private] => string(16) "Child Property 4"
}

Simple Benchmark

(array) cast

$start = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    $child = new ChildClass();
    $serializedArray = serialize($child);
}
$end = microtime(true);
$timeArray = ($end - $start) * 1000;
echo "Time: $timeArray ms" . PHP_EOL;

Result: Time: 587.77904510498 ms

Reflection

$start = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    $child = new ChildClass();
    $serializedReflection = serialize($child);
}
$end = microtime(true);
$timeReflection = ($end - $start) * 1000;
echo "Time: $timeReflection ms" . PHP_EOL;

Result: Time: 1218.0068492889 ms

Other Approaches

Problem with get_object_vars()

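get_object_vars() returns an associative array of the non-static properties of an object that are accessible in the calling scope. This is a problem because private properties declared in other classes of the hierarchy are left out, which breaks serialization of classes with inherited private properties.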
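A short sketch of that scope behaviour, using the hypothetical classes Base and Derived: when get_object_vars() is called from a method of the child class, the parent's private property is missing, while an (array) cast still contains it.

```php
<?php

// Hypothetical classes for illustration.
class Base
{
    private $basePrivate = 'base';
    protected $shared = 'shared';
}

class Derived extends Base
{
    private $derivedPrivate = 'derived';

    public function visibleVars(): array
    {
        // Called in the scope of Derived, so Base's private property is not visible.
        return array_keys(get_object_vars($this));
    }
}

$names = (new Derived())->visibleVars();
sort($names); // sorted only to make the printed order predictable
print_r($names);

// The array cast is not scope-dependent and still sees all three properties.
$castCount = count((array) new Derived());
var_dump($castCount); // int(3)
```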

Problem with __serialize() and __unserialize()

This seems to be the recommended approach. However, it comes with considerable programming overhead: you would have to implement this behaviour in every class of your hierarchy. Besides, we want PHP to do the serialization so that we don't have to implement __unserialize() manually.

Example grabbed from PHP RFC: New custom object serialization mechanism

class A {
    private $prop_a;
    public function __serialize(): array {
        return ['prop_a' => $this->prop_a];
    }
    public function __unserialize(array $data) {
        $this->prop_a = $data['prop_a'];
    }
}
class B extends A {
    private $prop_b;
    public function __serialize(): array {
        return [
            'prop_b' => $this->prop_b,
            'parent_data' => parent::__serialize(),
        ];
    }
    public function __unserialize(array $data) {
        parent::__unserialize($data['parent_data']);
        $this->prop_b = $data['prop_b'];
    }
}
Dortheydorthy answered 1/12, 2023 at 14:42 Comment(0)
