Which is a faster approach to typechecking in PHP? gettype() or multiple is_*()

In PHP, which is dynamically typed, we can create functions that may accept multiple data types as parameters. We can then operate on the data depending on the type of the variable. There are two ways to do this:

Approach One:

function doSomething1($param) {
    $type = gettype($param);
    if ($type === 'string') {
        // do something
    }
    else if ($type === 'integer') {
        // do something
    }
    else if ($type === 'array') {
        // do something
    }
}

Approach Two:

function doSomething2($param) {
    if (is_string($param)) {
        // do something
    }
    else if (is_int($param)) {
        // do something
    }
    else if (is_array($param)) {
        // do something
    }
}
  1. As far as I know, these two approaches are functionally equivalent from a testing perspective, but since PHP has so many gotchas, I gotta ask if there is anything I could miss if I favour one approach over the other?

  2. From a performance perspective, is it right to say approach one is faster than two because PHP function calls are expensive? Or is gettype() a much more expensive operation than the individual is_*() functions?

  3. Are there any coding idioms or style guides regarding this?

Update: In my benchmark on PHP 7.0.4, a million iterations of doSomething2() took 159 ms, about half the 315 ms taken by doSomething1(). This held regardless of whether a string (first check) or an array (last check) was passed in. This suggests that gettype() is indeed an expensive operation, more expensive than multiple function calls using is_*().

Anyone with more insight into why this might be, your help is appreciated.

Garbanzo answered 8/4, 2016 at 7:35 Comment(6)
As with any benchmarking question: test it. Likely you'll find the difference to be so minimal as to be negligible. – Deportation
I did. I don't know if the difference is considered "minimal", but the results are a little surprising to me. I think it would be good to be able to understand why. I know some may say this is premature optimization, but it's really not. It's about gaining a better understanding of the language: evaluating the pros and cons of two functionally equivalent approaches. – Garbanzo
Err... your results are exactly opposite your conclusion, or you have typoed something. 315ms for is_* is expectedly slower than 159ms for gettype. – Deportation
Either way though, even the slower version is still just a fraction of a second for a million calls; in practice this makes absolutely no difference and you should base your decision on readability and clarity of the code rather than performance. I'll grant you that it's alright to want to get to know the language; but then you should probably want to dig into the C implementation of those functions... – Deportation
Proofread that paragraph again, still not making sense. :P – Deportation
I agree that fussing over this performance difference is probably not the most effective way to do performance tuning, but this is really just an attempt to understand the language better. PHP does seem to have many ways of doing the same thing. I'm actually trying to find out if there are any idioms in this language regarding this sort of thing, and hopefully the rationale behind preferring certain functions / approaches to others. – Garbanzo

Let's compare the C code of the gettype and is_string functions.

gettype:

PHP_FUNCTION(gettype)
{
    zval *arg;
    zend_string *type;

    ZEND_PARSE_PARAMETERS_START(1, 1)
        Z_PARAM_ZVAL(arg)
    ZEND_PARSE_PARAMETERS_END();

    type = zend_zval_get_type(arg);
    if (EXPECTED(type)) {
        RETURN_INTERNED_STR(type);
    } else {
        RETURN_STRING("unknown type");
    }
}

So, it declares a zend_string pointer and assigns it the result of calling zend_zval_get_type, which is:

ZEND_API zend_string *zend_zval_get_type(const zval *arg) /* {{{ */
{
    switch (Z_TYPE_P(arg)) {
        case IS_NULL:
            return ZSTR_KNOWN(ZEND_STR_NULL);
        case IS_FALSE:
        case IS_TRUE:
            return ZSTR_KNOWN(ZEND_STR_BOOLEAN);
        case IS_LONG:
            return ZSTR_KNOWN(ZEND_STR_INTEGER);
        case IS_DOUBLE:
            return ZSTR_KNOWN(ZEND_STR_DOUBLE);
        case IS_STRING:
            return ZSTR_KNOWN(ZEND_STR_STRING);
        case IS_ARRAY:
            return ZSTR_KNOWN(ZEND_STR_ARRAY);
        case IS_OBJECT:
            return ZSTR_KNOWN(ZEND_STR_OBJECT);
        case IS_RESOURCE:
            if (zend_rsrc_list_get_rsrc_type(Z_RES_P(arg))) {
                return ZSTR_KNOWN(ZEND_STR_RESOURCE);
            } else {
                return ZSTR_KNOWN(ZEND_STR_CLOSED_RESOURCE);
            }
        default:
            return NULL;
    }
}
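Seen from PHP code, those return values look like the sketch below. The typeNames() helper is hypothetical and only there for illustration. Note that the names do not line up with the is_* function names: a float reports as 'double' and null as 'NULL'.

```php
<?php
// Hypothetical helper: map each sample value to the string gettype() reports.
function typeNames(array $values): array
{
    return array_map('gettype', $values);
}

$names = typeNames([null, true, 42, 1.5, 'abc', [], new stdClass()]);
print_r($names);
// The reported names are, in order:
// NULL, boolean, integer, double, string, array, object
```

So a gettype() branch has to compare against 'integer' and 'double', not 'int' and 'float'.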

Let's compare with is_string, for example:

PHP_FUNCTION(is_string)
{
    php_is_type(INTERNAL_FUNCTION_PARAM_PASSTHRU, IS_STRING);
}

This delegates to php_is_type:

static inline void php_is_type(INTERNAL_FUNCTION_PARAMETERS, int type)
{
    zval *arg;

    ZEND_PARSE_PARAMETERS_START(1, 1)
        Z_PARAM_ZVAL(arg)
    ZEND_PARSE_PARAMETERS_END();

    if (Z_TYPE_P(arg) == type) {
        if (type == IS_RESOURCE) {
            const char *type_name = zend_rsrc_list_get_rsrc_type(Z_RES_P(arg));
            if (!type_name) {
                RETURN_FALSE;
            }
        }
        RETURN_TRUE;
    } else {
        RETURN_FALSE;
    }
}
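A small sketch of what this looks like from the PHP side, including the resource branch. The stream used here (php://memory) is just a convenient assumption for getting a resource to test with.

```php
<?php
// Each is_* call answers one yes/no question about the value's type.
var_dump(is_string('42')); // bool(true)
var_dump(is_int('42'));    // bool(false): a numeric string is still a string
var_dump(is_float(1.5));   // bool(true)

// A resource that has been closed no longer counts as a resource.
$handle = fopen('php://memory', 'r+');
$openCheck = is_resource($handle);   // checked while the stream is open
fclose($handle);
$closedCheck = is_resource($handle); // checked after fclose()
var_dump($openCheck, $closedCheck);  // bool(true) bool(false)
```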

So, the core logic of these functions is the same: PHP uses Z_TYPE_P to detect the type of the variable.

But gettype has to hand back a string holding the type name, which the calling PHP code must then compare against a literal such as 'string', whereas the is_* functions just return boolean TRUE or FALSE. So each is_* check does less work, which is consistent with the is_* functions being faster :)
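To check this on your own PHP version, a rough timing sketch along these lines could be used. The function names viaGettype(), viaIsChecks() and timeCalls() are hypothetical, and the absolute numbers will vary with the PHP version, opcache settings and machine.

```php
<?php
// Approach one from the question: branch on the string from gettype().
function viaGettype($param): string
{
    $type = gettype($param);
    if ($type === 'string') {
        return 'string';
    } elseif ($type === 'integer') {
        return 'int';
    } elseif ($type === 'array') {
        return 'array';
    }
    return 'other';
}

// Approach two from the question: one is_* call per candidate type.
function viaIsChecks($param): string
{
    if (is_string($param)) {
        return 'string';
    } elseif (is_int($param)) {
        return 'int';
    } elseif (is_array($param)) {
        return 'array';
    }
    return 'other';
}

// Hypothetical timing helper: seconds taken by $iterations calls of $fn.
function timeCalls(callable $fn, $param, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($param);
    }
    return microtime(true) - $start;
}

$iterations = 1000000;
foreach (['abc', 42, [1, 2]] as $sample) {
    printf(
        "%-8s gettype: %.1f ms, is_*: %.1f ms\n",
        gettype($sample),
        timeCalls('viaGettype', $sample, $iterations) * 1000,
        timeCalls('viaIsChecks', $sample, $iterations) * 1000
    );
}
```

Both functions classify the samples identically, so any timing gap comes from how the type is checked rather than from what the branches do.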

Limburger answered 6/4, 2020 at 18:27 Comment(0)
