How to hide only sensitive arguments in PHP's debug_backtrace?
Asked Answered
M

5

10

Consider the following code. In the event that an exception occurs, the trace (which will be logged and stored in a database) will include the sensitive password data. How can sensitive data in cases like this, while allowing other non-sensitive arguments, be hidden?

<?php
$user = 'john';
$pass = 'secret';

function auth($user, $pass) {
    // authentication logic
}

function login($user, $pass) {
    throw new Exception('Unexpected error');

    // various logic
    auth($user, $pass);
    // various logic
}

try {
    login($user, $pass);
} catch (Throwable $e) {
    send_to_log($e->getTrace()); // This reveals the password "secret"
}
Mccay answered 18/6, 2021 at 22:24 Comment(3)
Write data to log file.Adaminah
WHY would you EVER show the results of a backtrace to your end users? Don't do it.Paradisiacal
@Paradisiacal WHY do you assume that a code example reflects what is shown to end users? I'm talking about the data that gets inserted into the database or log files, which end users never see.Mccay
P
18

Starting from the PHP version 8.2 (Dec 2022) there is a feature named "Redacting parameters in back traces". This will hide the parameter from any stack trace in your PHP application.

Here is an example from that RFC:

<?php
 
function test(
    $foo,
    #[\SensitiveParameter] $bar,
    $baz
) {
    throw new \Exception('Error');
}
 
test('foo', 'bar', 'baz');
 
/*
Fatal error: Uncaught Exception: Error in test.php:8
Stack trace:
#0 test.php(11): test('foo', Object(SensitiveParameterValue), 'baz')
#1 {main}
  thrown in test.php on line 8
*/

Note that for some built-in functions (such as PDO and mysqli database password parameter for example), this annotation is already in effect.

Pompano answered 8/9, 2022 at 19:25 Comment(0)
H
6

Disclaimer: I (kind of) assume that you never really pipe the result of var_dump back to your user. It's (once again, kind of) obvious that end users rarely care about innards of your engine, so showing them traceroutes is almost never a good way to handle server errors. But you're right; even logging this info might actually not be a really good idea - for various reasons.

So, answering the original question: well, you can force your exception logging to either discard the params completely - or limit their length:

Note that PHP 7.4 introduced the setting zend.exception_ignore_args, which allowed removing argument information from exceptions completely (in getTrace(), getTraceAsString(), etc.).

Setting zend.exception_string_param_max_len=0 still provides more information than completely disabling tracking args (you still know the argument is a string, and types of non-strings).

Still that might complicate the debugging for other cases. This was somewhat mitigated in PHP 8.0 by introducing zend.exception_string_param_max_len config param:

zend.exception_string_param_max_len is a new INI directive to set the maximum string length in an argument of a stringified stack strace.

The idea behind this (quoted above) is, among other things, to limit the number of potentially exposed sensitive data when logging exceptions without actually compromising the data needed to debug the issue.

Note though that this setting only affects getTraceAsString() results (which you should consider using instead of var_dumping result of getTrace anyway).

Hydrolyte answered 18/6, 2021 at 22:28 Comment(3)
I downvoted when your answer was only talking about string lengths. Turning off the storage of arguments altogether is a better solution, but OP's code suggests they're going to do something foolish with it. (And it's looking for an opinion.)Paradisiacal
I see, thank you for explanation. Yes, sometimes I do assume people won't do something that doesn't make sense to me - and sometimes this assumption fails. Hence the added disclaimer to the top of the post.Hydrolyte
Well, I do understand miken32's reasoning: cutting corners in explanation (and assuming it's clearly visible that corners ARE cut) sometimes misfires here at SO, when others come and see the code as is. But yes, for me the question was essentially more about limiting the amount of data that gets passed to exception tracing than about particular ways of storing those traces.Hydrolyte
X
1

My goal is to not redact everything, but only

  • what can be a password (type string) or

  • is defined as such:

    • functions in overall (see $sMatchFunction) or
    • partial matching of array keys (see $sMatchIndex).

    Both are one regular expression as you'd use it in f.e. preg_match(). As a personal preference I used # instead of / as delimiters.

As per hakre I also defined limits on how much to recurse while processing. The property array $aMax can be adjusted. Its current values won't affect the demo - you'd have to use much lower values to then see arrays cut with a final element like:

[...] => 3 more

The class has 3 entry points to register all different culprits of errors:

  • ErrorHandler:: err_handler() for runtime errors/warnings/hints/etc...
  • ErrorHandler:: exc_handler() for exceptions
  • ErrorHandler:: die_handler() for script abortions

They have to be registered (see below) and then the class works the same for all 3 occasions/sources. If you inspect each handler you will notice only minor differences. The class never needs an instance.

<?php

class ErrorHandler {

    /*** The core: how to redact only sensitive parts ***
     ****************************************************/

    // Redacting values happens either per key in an array element or name in an object property...
    public static $sMatchIndex= '#pass|time#i';
    // ...or per function name. Both are PCRE patterns to match just any text.
    public static $sMatchFunction= '#connect$|login#i';

    // In case you think all this exhausts memory you might want to limit the verbosity.
    // If you don't care, set it to a high value, like PHP_INT_MAX.
    public static $aMax= array
    ( 'recursion_count' => 30  // Overall in sum.
    , 'recursion_depth' => 5   // How many levels down.
    , 'array_elements'  => 20  // Avoiding to list whole content of huge arrays.
    , 'class_properties'=> 15  // Class instances may have way too many fields.
    , 'parameters'      => 20  // Function arguments can be excessive, too.
    );


    // Should every STRING data type be hidden? This is set anew per iterated stack trace function.
    private static $bRedactAllStrings= FALSE;
    // Count limit for aMax anew each time.
    private static $aMaxCount= array();


    // Handle a variable as per its data type, so an array or an object is recursively checked against 
    // STRINGs, too. Side effect: make STRINGs look like literals. This method should be used on:
    // - every array element value,
    // - every object property value,
    // - optionally on every array element key (to later distinguish numeric indexes from textual).
    public static function as_per_type
    ( $vValue  // Variable of any data type.
    , &$sType  // Recognized data type; is needed later, too.
    , $vKey= ''  // Potential array index to test for password-like name.
    ) {
        $sType= gettype( $vValue );
        switch( $sType ) {
            // Each key and value can have different data types.
            case 'array':
                return self:: recurse_array( $vValue );

            // Each property can have different data types.
            case 'object':
                return self:: recurse_object( $vValue );

            // Either all STRING values should be redacted, or the key has a name hinting for a password.
            case 'string':
                if( self:: $bRedactAllStrings ) {
                    return '**REDACTED_PER_FUNCTION**';
                } else
                if( $vKey&& preg_match( self:: $sMatchIndex, $vKey ) ) {
                    return '**REDACTED_PER_INDEX**'; 
                } else
                return "'$vValue'";  // Original text, but as literal.

            // BOOLEAN, INTEGER, DOUBLE, RESOURCE, NULL and others: won't have passwords.
            default:
                return $vValue;
        }
    }


    // Handle a class instance's properties as per their data types, which can be arrays or objects again.
    public static function recurse_object
    ( $oInput  // Object with any properties.
    ) {
        // Respect recursion depth and overall count.
        if( self:: $aMaxCount['recursion_count']> self:: $aMax['recursion_count']
         || self:: $aMaxCount['recursion_depth']> self:: $aMax['recursion_depth']
          ) {
            return 'O('. count( get_object_vars( $oInput ) ). ')';
        } else self:: $aMaxCount['recursion_count']++;

        self:: $aMaxCount['recursion_depth']++;

        // Inspect each property.
        $aObj= get_object_vars( $oInput );  // Get all property names as array.
        $aOutput= array();
        $iProperty= 1;
        foreach( $aObj as $iObj=> $vObj ) {
            // Respect maximum element count of array.
            if( $iProperty> self:: $aMax['class_properties'] ) {
                $aOutput['...']= (count( $aObj )- $iProperty+ 1). ' more';
                break;
            } else $iProperty++;

            $vValue= self:: as_per_type( $oInput-> $iObj, $sType, $iObj );
            $aOutput["$iObj ($sType)"]= $vValue;  // Array key hints at value data type.
        }

        self:: $aMaxCount['recursion_depth']--;
        return $aOutput;
    }


    // Handle all array elements as per their data types, which can be objects or arrays again.
    public static function recurse_array
    ( $aInput  // Array with any elements.
    ) {
        // Respect recursion depth and overall count.
        if( self:: $aMaxCount['recursion_count']> self:: $aMax['recursion_count']
         || self:: $aMaxCount['recursion_depth']> self:: $aMax['recursion_depth']
          ) {
            return 'A('. count( $aInput ). ')';
        } else self:: $aMaxCount['recursion_count']++;

        self:: $aMaxCount['recursion_depth']++;

        // Inspect each element.
        $aOutput= array();
        $iElement= 1;
        foreach( $aInput as $iKey=> $vValue ) {
            // Respect maximum element count of array.
            if( $iElement> self:: $aMax['array_elements'] ) {
                $aOutput['...']= (count( $aInput )- $iElement+ 1). ' more';
                break;
            } else $iElement++;

            $sKey= self:: as_per_type( $iKey, $sTypeKey );  // Element keys need no redaction...
            $sValue= self:: as_per_type( $vValue, $sTypeValue, $iKey );  // ...but values do.

            // Objects are converted to arrays by us, loosing the information of which class they were.     
            // So we append the class name to the type hint in the array element key.
            if( $sTypeValue== 'object' ) $sTypeValue.= ' '. get_class( $vValue );

            $aOutput["$sKey ($sTypeValue)"]= $sValue;  // Array key hints at value data type.
        }

        self:: $aMaxCount['recursion_depth']--;
        return $aOutput;
    }


    // Parse the stack trace to redact potentially sensitive texts.
    public static function redact_backtrace
    ( $aTrace  // Stack trace array to be parsed.
    ) {
        // Reset on each new error handling, as this is the entry of every further processing.
        self:: $aMaxCount= array
        ( 'recursion_count'=> 0
        , 'recursion_depth'=> 1
        );

        foreach( $aTrace as $iFunc=> $aFunc ) {
            // Yet this is no sensitive function being called.
            self:: $bRedactAllStrings= FALSE;

            // If this is a class instance we only need to redact by property name.
            if( isset( $aFunc['object'] ) ) {
                $aTrace[$iFunc]['object']= self:: recurse_object( $aTrace[$iFunc]['object'] );
            }

            // Should the function's name match we'll recursively redact ANY string.
            if( isset( $aFunc['function'] ) ) {
                self:: $bRedactAllStrings= preg_match( self:: $sMatchFunction, $aFunc['function'] );
            }

            // Now parse all parameters to potentially redact chosen ones.
            if( isset( $aFunc['args'] ) ) {
                // Respect amount of parameters.
                $iRemoved= 0;
                while( count( $aTrace[$iFunc]['args'] )> self:: $aMax['parameters'] ) {
                    array_pop( $aTrace[$iFunc]['args'] );
                    $iRemoved++;
                }

                $aTrace[$iFunc]['args']= self:: recurse_array( $aTrace[$iFunc]['args'] );

                // Inform about too many parameters.
                if( $iRemoved ) $aTrace[$iFunc]['args']['...']= $iRemoved. ' more';
            }
        }

        return $aTrace;
    }



    /*** Functional example: seeing the redacted data ***
     ****************************************************/

    // Write log messages to wherever we want to.
    private static $bHeadersSent= FALSE;
    public static function err_log
    ( $aLog  // Array to be saved.
    ) {
        if( !self:: $bHeadersSent ) {
            header( 'content-type: text/plain' );  // Don't let browser interpret output as HTML, preserve spaces.
            self:: $bHeadersSent= TRUE;  // Only send header once.
        }

        print_r( $aLog );  // Imagine this being our log file.
    }



    /*** Demo: actually handling errors to get stack traces ***
     **********************************************************/

    // Handler for uncaught errors.
    public static function err_handler
    ( $iError  // See https://www.php.net/manual/en/errorfunc.constants.php
    , $sText  // Error message.
    , $sFile  // PHP file which was parsed.
    , $iLine  // Line of error in PHP file.
    ) {
        // First one is this function, and we won't need this ever
        $aTrace= debug_backtrace();
        unset( $aTrace[0] );

        self:: err_log
        ( array
            ( 'where'   => 'Error handler'
            , 'file'    => $sFile
            , 'line'    => $iLine
            , 'code'    => $iError
            , 'msg'     => $sText
            , 'trace'   => self:: redact_backtrace( $aTrace )
            )
        );
    }

    // Handler for uncaught exceptions.
    public static function exc_handler
    ( $e  // Exception
    ) {
        self:: err_log
        ( array
            ( 'where'   => 'Exception handler'
            , 'file'    => $e-> getFile()
            , 'line'    => $e-> getLine()
            , 'code'    => $e-> getCode()
            , 'msg'     => $e-> getMessage()
            , 'trace'   => self:: redact_backtrace( $e-> getTrace() )
            , 'class'   => get_class( $e )
            )
        );
    }

    // Handler for potentially fatal errors.
    public static function die_handler() {
        // No error occurred? Nothing to inspect.
        $aErr= error_get_last();
        if( !count( $aErr ) ) return;

        // First one is this function, and we won't need this ever
        $aTrace= debug_backtrace();
        unset( $aTrace[0] );

        self:: err_log
        ( array
            ( 'where'   => 'Shutdown handler'
            , 'file'    => $aErr['file']
            , 'line'    => $aErr['line']
            , 'code'    => $aErr['type']
            , 'msg'     => $aErr['message']
            , 'trace'   => self:: redact_backtrace( $aTrace )
            )
        );
    }
}

// For register_shutdown_function() a stack trace is not available.
set_error_handler           ( array( 'ErrorHandler', 'err_handler' ), E_ALL );
set_exception_handler       ( array( 'ErrorHandler', 'exc_handler' ) );
register_shutdown_function  ( array( 'ErrorHandler', 'die_handler' ) );



/*** Demo: creating errors ***
 *****************************/

class Example {
    public $iNumber     = 12345;  // Integers won't be redacted.
    public $sPassword   = 'secret';  // The property name should hint at a password.
    public $sInfo       = 'a password?';  // No chance to identify this as password.

    public function login( $sUser, $sPass ) {
        echo( array() );  // Notice: Array to string conversion.
    }

    public function test( $a, $b ) {
        $this-> login( 'username', 'password' );  // Deeper nesting, recognition by function name.

        unset( $a['obj'] );  // Seeing the object once is enough for demonstration purposes.

        1/ 0;  // Error: Division by zero.
        1+ $x;  // Error: Undefined variable.
        throw new Exception( 'TestException' );  // Unhandled exception.
    }
}


// Building a rather complex parameter, using as many data types as possible.
$aFirst= array
( 'string'  => 'Text'
, 'int'     => 42
, 'float'   => 3.1415
, 'bool'    => TRUE
, 'array'   => array
    ( 'key'         => 'value'
    , 'db_password' => 'adminadmin'  // Array in array: should be redacted as per key text.
    )
, 'obj'     => new DateTime  // So we get an actual class instance.
, 'pass'    => '12345'  // Should be redacted as per key text.
, 110       => 'ordinal index'
);

// Simple parameter: array with ordinal indexes only.
$aSecond= array
( 'no index identifying a password'
, 'username'
);


// Outcome:
// WHERE                           | REDACTION
// --------------------------------|----------
// Example-> login()               |
// - $oTest-> sPassword            | Index
// - $sUser  (1st function param)  | Function
// - $sPass  (2nd function param)  | Function
// Example-> test()                |
// - $oTest-> sPassword            | Index
// - $a['array']['db_password']    | Index
// - $a['obj']-> timezone          | Index
// - $a['pass']                    | Index
$oTest= new Example;
$oTest-> test( $aFirst, $aSecond );

This is much more code/complex than reformed's answer and the target audience are people being confident with PHP or wanting to invest time to understand it. I tried to comment as much as possible with explanations.

Xanthochroid answered 22/10, 2023 at 2:22 Comment(3)
This is way overkill for something that is accomplished with a single line in PHP 8.2+ as Josh's answer demonstrates. The accepted answer is more than sufficient for the time being, until one's codebase can be upgraded to PHP 8.2. KISSMccay
Yes, I upvoted his answer. But apparently you didn't even look at what I linked in my comment, and you don't seem to honor what else this solution provides. And don't forget: I achieved all this with 5.6. The added complexity of avoiding memory limits was hakre's idea, and it's a good one. This is no quick hack and no oversimplified haiku - there's no need to have ever expected that.Xanthochroid
5.6 wasn't outdated when I wrote that - I wrote that already and you ignore it again. It also doesn't automatically mean it has problems running with recent PHP versions. Having no time "to read" but still always having enough time to comment indicates egoism. This A is valid as per your Q requirements for both before 8.2 and newer - not wanting something complex is a yet another addition through comments.Xanthochroid
H
1

For php prior 8.2 this should work ...

<?php

class Obfuscated
{
    protected string $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }

    public function __toString(): string
    {
        return $this->value;
    }
}

class Test
{
    public function login($user, $pass)
    {
        $user = new Obfuscated($user);
        $pass = new Obfuscated($pass);

        return $this->out($user, $pass);
    }

    # be careful with type-hints ... 
    protected function out($user, string $pass)
    {
        throw new \Exception('Error');
        # return sprintf('%s & %s', $user, $pass);
    }
}



$test = new Test();
var_dump($test->login('foo', 'bar'));


/**
Stack trace:
#0 /var/www/html/test.php(32): Test->out(Object(Obfuscated), 'bar')
#1 /var/www/html/test.php(45): Test->login(Object(Obfuscated), Object(Obfuscated))
#2 {main}
  thrown in /var/www/html/test.php on line 37
*/
Hoiden answered 2/8, 2024 at 19:48 Comment(0)
M
0

I ended up adding logic in the code that handles logging to file/database to clear the arguments of specific functions showing up in the trace. Here is a suitable <PHP 8.2 solution:

<?php
function send_to_log(Throwable $e) {
    $noArgs = [
        'login' => true,
        'auth' => true,
        // ...
    ];

    $trace = $e->getTrace();
    foreach ($trace as &$err) {
        if (isset($noArgs[$err['function'] ?? ''])) {
            $cnt = count($err['args'] ?? []);
            if ($cnt > 0) {
                $err['args'] = array_fill(0, $cnt, 'REDACTED');
            }
        }
    }
    unset($err);

    var_dump($trace); /* This now shows "REDACTED" for all arguments
    to functions specified in the $noArgs array */

    // logging logic
}
Mccay answered 22/6, 2021 at 14:32 Comment(13)
Your question did not imply to redact all parameters at once - only sensitive ones. The other answers understood it the latter way, too. The code should be smart enough to only redact parameters of is_string() (recursively doing the same for array and object types).Xanthochroid
@Xanthochroid That is not what I want. I want to hide only sensitive arguments, not all arguments and not all arguments of is_string().Mccay
Then both your code and accepting that answer is wrong, because it does redact all parameters. Also I did not wrote you should redact "all arguments" of type string.Xanthochroid
@AmigoJAck No, it's not, and no, it doesn't. The code only redacts the parameters of specific functions specified in $noArgs.Mccay
Yes, of specific functions. No, all parameters of those, not selected of those. Your formulation "allowing other non-sensitive arguments" does not mean either none of a function's parameters is redacted or all of them - it means that only selected parameters of selected functions should be redacted. However, your code is an all-or-nothing approach.Xanthochroid
Neither the PHP version nor being only somewhat close was any requirement in the Q - those come in now and by your A and your comments. Why don't you unaccept your own A and maybe accept the one that really fits your Q already? Just because it came in 1 year later doesn't mean it's an unacceptable/incorrect A. It's unfair to other posters to shift your requirements at will - that's why others are upvoted and you never got one.Xanthochroid
@AmigoJack: Why not thank reformed for Q&A this fully back then? You wouldn't have had the opportunity to comment more than a year later, but also the Q&A already collected quite some shared wisdom of the different options that are available in past, present and future (not seeing you're contributing much to it, sadly). And while you're at it, why not also thank the PHP developers that filed the changes for this new feature. And even if you fully disagree, it's the asking user who accepts an answer, don't question the question nor the green check mark. That's plain bad style.Surfacetosurface
@Surfacetosurface I could contribute the PHP 5.4 compatible code I wrote in 2012 similar to this, which only redacts string parameters to chosen functions (because passwords and usernames are always strings) and with less strict function name matching, still working very well as of today. But I wonder who'd be interested in that (anymore) - my comments were largely rejected, too. I should report reformed's comment(s) for being subjective as contribution...Xanthochroid
Sounds to me that instead of running with heads against each other you two should put heads and hands together. But that would not play well if you still think that you need to report comments (which are subjective, this is what that are in part for, just don't clash). Why not put your example on 3v4l.org and drop the link here, then everyone can take a look and consider things?! @XanthochroidSurfacetosurface
@Surfacetosurface 3v4l.org/fKQAc#v5.6.40 sums up how I did it, redacting only per data type, but also recursively. It you guys think it's worth an answer I'll post it. Sadly it looks much more complex, but that's the price for precise redaction and not killing too much details for debugging purposes.Xanthochroid
Looks at least solid, perhaps more standard coding style could benefit a public presentation (but that is clearly subjective, just my commentary). support for both debug_backtrace() and Exception::getTrace(), error handler w/o args, and as it traverses structures, it's an addition. Maybe one or two feature flags, can imagine those handlers may bring memory to limits with the one or other code-base and E_ALL. Why not post it and add some infos on top so the context is clear. @Mccay you're still with us? Maybe add your 2 cents?Surfacetosurface
@Surfacetosurface my question was asked back in 2021 and when no one came up with a suitable solution, I settled for what I posted in my answer. This has been done and over with since then, and I don't have time to continue hashing this out. When I move to PHP8.2 I may revisit this, but there's nothing stopping anyone from asking a PHP8.2 specific question.Mccay
@reformed: Sure, but my intend was not to ask for an update, but if you're fine with an additional answer of that kind for your original question. If you don't have any objections I'd say very fine to go.Surfacetosurface

© 2022 - 2025 — McMap. All rights reserved.