PDO MySQL: Use PDO::ATTR_EMULATE_PREPARES or not?
Asked Answered
C

7

139

This is what I've read so far about PDO::ATTR_EMULATE_PREPARES:

  1. PDO's prepare emulation is better for performance since MySQL's native prepare bypasses the query cache.
  2. MySQL's native prepare is better for security (preventing SQL Injection).
  3. MySQL's native prepare is better for error reporting.

I don't know how true any of these statements are anymore. My greatest concern in choosing a MySQL interface is preventing SQL Injection. The second concern is performance.

My application currently uses procedural MySQLi (without prepared statements), and utilizes the query cache quite a bit. It will rarely re-use prepared statements in a single request. I started the move to PDO for the named parameters and security of prepared statements.

I'm using MySQL 5.1.61 and PHP 5.3.2

Should I leave PDO::ATTR_EMULATE_PREPARES enabled or not? Is there a way to have both the performance of the query cache and the security of prepared statements?

Caparison answered 11/4, 2012 at 20:37 Comment(3)
Honestly? Just keep using MySQLi. If it's already working using prepared statements under that, PDO is basically a pointless layer of abstraction. EDIT: PDO is really useful for green field applications where you're not sure what database is going into the back-end.Uphemia
Sorry, my question was unclear before. I've edited it. The application does not use prepared statements in MySQLi at the moment; just mysqli_run_query(). From what I've read, MySQLi prepared statements also bypass the query cache.Caparison
@Uphemia mysqli (still) not offers named parameters. a good reason to give it a wide berth.Turk
P
122

To answer your concerns:

  1. MySQL >= 5.1.17 (or >= 5.1.21 for the PREPARE and EXECUTE statements) can use prepared statements in the query cache. So your version of MySQL+PHP can use prepared statements with the query cache. However, make careful note of the caveats for caching query results in the MySQL documentation. There are many kinds of queries which cannot be cached or which are useless even though they are cached. In my experience the query cache isn't often a very big win anyway. Queries and schemas need special construction to make maximum use of the cache. Often application-level caching ends up being necessary anyway in the long run.

  2. Native prepares doesn't make any difference for security. The pseudo-prepared statements will still escape query parameter values, it will just be done in the PDO library with strings instead of on the MySQL server using the binary protocol. In other words, the same PDO code will be equally vulnerable (or not-vulnerable) to injection attacks regardless of your EMULATE_PREPARES setting. The only difference is where the parameter replacement occurs--with EMULATE_PREPARES, it occurs in the PDO library; without EMULATE_PREPARES, it occurs on the MySQL server.

  3. Without EMULATE_PREPARES you may get syntax errors at prepare-time rather than at execute-time; with EMULATE_PREPARES you will only get syntax errors at execution time because PDO doesn't have a query to give to MySQL until execution time. Note that this affects the code you will write! Especially if you are using PDO::ERRMODE_EXCEPTION!

An additional consideration:

  • There is a fixed cost for a prepare() (using native prepared statements), so a prepare();execute() with native prepared statements may be a little slower than issuing a plain textual query using emulated prepared statements. On many database systems the query plan for a prepare() is cached as well and may be shared with multiple connections, but I don't think MySQL does this. So if you do not reuse your prepared statement object for multiple queries your overall execution may be slower.

As a final recommendation, I think with older versions of MySQL+PHP, you should emulate prepared statements, but with your very recent versions you should turn emulation off.

After writing a few apps that use PDO, I've made a PDO connection function which has what I think are the best settings. You should probably use something like this or tweak to your preferred settings:

/**
 * Return PDO handle for a MySQL connection using supplied settings
 *
 * Tries to do the right thing with different php and mysql versions.
 *
 * @param array $settings with keys: host, port, unix_socket, dbname, charset, user, pass. Some may be omitted or NULL.
 * @return PDO
 * @author Francis Avila
 */
function connect_PDO($settings)
{
    $emulate_prepares_below_version = '5.1.17';

    $dsndefaults = array_fill_keys(array('host', 'port', 'unix_socket', 'dbname', 'charset'), null);
    $dsnarr = array_intersect_key($settings, $dsndefaults);
    $dsnarr += $dsndefaults;

    // connection options I like
    $options = array(
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
        PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC
    );

    // connection charset handling for old php versions
    if ($dsnarr['charset'] and version_compare(PHP_VERSION, '5.3.6', '<')) {
        $options[PDO::MYSQL_ATTR_INIT_COMMAND] = 'SET NAMES '.$dsnarr['charset'];
    }
    $dsnpairs = array();
    foreach ($dsnarr as $k => $v) {
        if ($v===null) continue;
        $dsnpairs[] = "{$k}={$v}";
    }

    $dsn = 'mysql:'.implode(';', $dsnpairs);
    $dbh = new PDO($dsn, $settings['user'], $settings['pass'], $options);

    // Set prepared statement emulation depending on server version
    $serverversion = $dbh->getAttribute(PDO::ATTR_SERVER_VERSION);
    $emulate_prepares = (version_compare($serverversion, $emulate_prepares_below_version, '<'));
    $dbh->setAttribute(PDO::ATTR_EMULATE_PREPARES, $emulate_prepares);

    return $dbh;
}
Pushbike answered 4/5, 2012 at 19:48 Comment(12)
Re #2: surely values that MySQL receives as parameters (to native prepared statements) do not get parsed for SQL at all? So the risk of injection must be lower than using PDO's prepare emulation, where any flaw in escaping (e.g. the historical issues mysql_real_escape_string had with multi-byte characters) would still leave one open to injection attacks?Ruhnke
#1: Awesome! This is completely new information to me, and I'm very happy about it. The app does utilize the query cache quite a bit (over 50% of all queries are retrieved straight from the cache). #2: I'd have to agree with eggyal. That's the whole point of moving to prepared statements is for the security of MySQL escaping the content itself rather than PHP guessing how the content should be escaped. #3: I actually prefer knowing there's an issue before the query is executed, so that's fine. Great answer. Thanks for your help!Caparison
@eggyal, you are making assumptions about how prepared statements are implemented. PDO may have a bug in its emulated prepares escaping, but MySQL might have bugs too. AFAIK, no problems have been discovered with emulated prepares which could cause parameter literals to pass through unescaped.Pushbike
Awesome answer, but I have a question: If you turn EMULATION off, won't the execution be slower ? PHP would have to send the prepared statement to MySQL for validation and only then send the parameters. So, if you use the prepared statement 5 times, PHP will talk to MySQL 6 times (instead of 5). Won't this make it slower ? Besides, I think there's a greater chance that PDO could have bugs in the validation process, rather than MySQL...Merrell
The only way to know is to test but in my experience it's usually faster because 1) The total amount of data sent back-and-forth is smaller (5 whole queries vs 1 query and 5 sets of data) and 2) the query plan is calculated only once instead of five times. As for PDO vs MySQL code quality, on what basis do you prefer one over the other? Neither of them are exactly known as stellar, bulletproof systems!Pushbike
Note the points made in this answer re prepared statement emulation using mysql_real_escape_string under the hood and the consequent vulnerabilities that can arise (in very particular edge cases).Ruhnke
@RaduMurzea The difference is normally negligible. When performing repeated queries, like multiple updates, or inserts, turning off emulation is faster but only by a small margin. The other case would be for external RDBMS's where there would be a high latency or large amounts of traffic (1000+ connections). At that point though you lean more toward clustering and stored procedures+triggers rather than multiple application based calls to the server.Fiance
+1 Good answer! But for the record, if you use native prepare, parameters are never escaped or combined into the SQL query even on the MySQL server side. By the time you execute and supply parameters, the query has been parsed and transformed into internal data structures in MySQL. Read this blog by a MySQL optimizer engineer that explains this process: guilhembichot.blogspot.com/2014/05/… I'm not saying this means native prepare is better, insofar as we trust the PDO code to do escaping correctly (which I do).Goggles
This kind of posts is actually the best answer to the question "Why SO is that valuable learning resource for all developers/programmers". Classic definition how good answers should look alike. Simply brilliant. Thanks to Francis.Compressed
So despite acknowledging that emulating prepares can radically change the application behaviour, you provide a "function" that arbitrarily switches mode depending on MySQL version. So anyone foolish enough to use your "function", then upgrading MySQL, suddenly breaks their application in many mysterious and hard-to-trace ways. Brilliant.Orientate
"Native prepares doesn't make any difference for security. The pseudo-prepared statements will still escape query parameter values, it will just be done in the PDO library with strings instead of on the MySQL server using the binary protocol." This is either a complete and dangerous misrepresentation or MySQL is very poorly implemented. That is not how parameterized queries are supposed to be implemented; the parameters are supposed to bypass the parsing phase entirely, preventing them from ever being interpreted as part of SQL code. So which is it: dangerously wrong or incorrect?Iridaceous
Someone should check if ATTR_EMULATE_PREPARES when off is able to avoid round trip times perhaps by buffering up the query, binds, execute to send all at once but I don't think it does. See if prepare with bad SQL syntax fails on prepare or execute.Theodora
C
29

I'm surprised no one has mentioned one of the biggest reasons to turn off emulation when using older PHP versions. With emulation on and PHP prior to 8.1 PDO returns all integers and floats as strings. When you turn off emulation or you are on 8.1 or later, integers and floats in MySQL become integers and floats in PHP.

For more information, see the accepted answer for this question: How do I return integer and numeric columns from MySQL as integers and numerics in PHP?.

Charlottcharlotta answered 13/11, 2019 at 3:47 Comment(1)
This is true in PHP versions < 8.1. However, beginning in 8.1, emulated prepares will be compatible with native prepares with regards to properly returning integers and floats. See the PHP 8.1 Upgrade Guide for more information.Narwhal
E
11

Beware on disabling PDO::ATTR_EMULATE_PREPARES (turning native prepares on) when your PHP pdo_mysql is not compiled against mysqlnd.

Because old libmysql is not fully compatible with some functions, it can lead to strange bugs, for example:

  1. Losing most significant bits for 64bit integers when binding as PDO::PARAM_INT ( 0x12345678AB will be cropped to 0x345678AB on 64bit machine )
  2. Inability to make simple queries like LOCK TABLES ( it throws SQLSTATE[HY000]: General error: 2030 This command is not supported in the prepared statement protocol yet exception )
  3. Need to fetch all rows from result or close cursor before next query ( with mysqlnd or emulated prepares it automatically does this work for you and doesn't go out of sync with mysql server )

These bugs I figured out in my simple project when migrated to other server which used libmysql for pdo_mysql module. Maybe there are much more bugs, I don't know. Also I tested on fresh 64bit debian jessie, all listed bugs occur when I apt-get install php5-mysql, and disappear when I apt-get install php5-mysqlnd.

When PDO::ATTR_EMULATE_PREPARES is set to true (as default) - these bugs don't happen anyway, because PDO doesn't use prepared statements at all in this mode. So, if you use pdo_mysql based on libmysql ("mysqlnd" substring does't appear in "Client API version" field of pdo_mysql section in phpinfo) - you should not turn PDO::ATTR_EMULATE_PREPARES off.

Erick answered 29/9, 2016 at 19:36 Comment(2)
is this concern still valid in 2019?!Revels
@Revels no, it's hard to find libmysql in the wildSuetonius
R
9

I would turn off emulate prepares as you're running 5.1 which means PDO will take advantage of the native prepared statement functionality.

PDO_MYSQL will take advantage of native prepared statement support present in MySQL 4.1 and higher. If you're using an older version of the mysql client libraries, PDO will emulate them for you.

http://php.net/manual/en/ref.pdo-mysql.php

I ditched MySQLi for PDO for the prepared named statements and the better API.

However, to be balanced, PDO performs negligibly slower than MySQLi, but it's something to bear in mind. I knew this when I made the choice, and decided that a better API and using the industry standard was more important than using a negligibly faster library that ties you to a particular engine. FWIW I think the PHP team is also looking favourably at PDO over MySQLi for the future too.

Reger answered 4/5, 2012 at 9:6 Comment(5)
Thank you for that information. How has not being able to use the query cache impacted your performance or were you even using it before?Caparison
I can't say as the framework I'm using caches on multiple levels anyway. You can always explicitly use SELECT SQL_CACHE <rest of statement> though.Reger
Did not even know there was a SELECT SQL_CACHE option. However, it appears that that still wouldn't work. From the docs: "The query result is cached if it is cacheable..." dev.mysql.com/doc/refman/5.1/en/query-cache-in-select.htmlCaparison
Yes. That depends on the nature of the query, rather than the platform specifics.Reger
I read that to mean "The query result is cached unless something else prevents it from being cacheable," which - from what I had read until then - included prepared statements. However, thanks to Francis Avila's answer, I know that that is no longer true for my version of MySQL.Caparison
P
8

I'd recommend enabling real database PREPARE calls as the emulation doesn't catch everything.., for example, it will prepare INSERT;!

var_dump($dbh->prepare('INSERT;'));
$dbh->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
var_dump($dbh->prepare('INSERT;'));

The output

object(PDOStatement)#2 (1) {
  ["queryString"]=>
  string(7) "INSERT;"
}
bool(false)

I'll gladly take a performance hit for code that actually works.

FWIW

PHP Version: PHP 5.4.9-4ubuntu2.4 (cli)

MySQL Version: 5.5.34-0ubuntu0

Pyrosis answered 15/7, 2014 at 5:27 Comment(3)
It's an interesting point. I guess emulation postpones server-side parsing to the execution phase. While it isn't a big deal (wrong SQL will eventually fail) it's cleaner to let prepare do the job it's supposed to. (Additionally, I've always assumed that the client-side parameter parser will necessarily have bugs of its own.)Fraternity
IDK if you're interested, but here's a little writeup on some other spurious behavior I noticed with PDO that lead me down this rabbit hole to begin with. Seems the handling of multiple queries is lacking.Pyrosis
I just looked at some migrations libraries on GitHub... What do you know, this one pretty much does the exact same thing as my blog post.Pyrosis
B
6

Why switch emulation to ‘false’?

The main reason for this is that having the database engine do the prepare instead of PDO is that the query and the actual data are sent separately, which increases security. This means when the parameters are passed to the query, attempts to inject SQL into them are blocked, since MySQL prepared statements are limited to a single query. That means that a true prepared statement would fail when passed a second query in a parameter.

The main argument against using the database engine for the prepare vs PDO is the two trips to the server – one for the prepare, and another for the parameters to get passed – but I think the added security is worth it. Also, at least in the case of MySQL, query caching has not been an issue since version 5.1.

https://tech.michaelseiler.net/2016/07/04/dont-emulate-prepared-statements-pdo-mysql/

Bedstead answered 15/6, 2018 at 1:10 Comment(1)
Query caching is gone anyway: The query cache is deprecated as of MySQL 5.7.20, and is removed in MySQL 8.0.Fraternity
U
5

For the record

PDO::ATTR_EMULATE_PREPARES=true

It could generate a nasty side effect. It could return int values as string.

PHP 7.4, pdo with mysqlnd.

Running a query with PDO::ATTR_EMULATE_PREPARES=true

Column : id
Type :integer
Value : 1

Running a query with PDO::ATTR_EMULATE_PREPARES=false

Column : id
Type :string
Value : "1"

In any case, decimal values are always returned a string, regardless of the configuration :-(

Upthrust answered 27/7, 2020 at 14:55 Comment(8)
decimal values are always returned a string is the only correct waySuetonius
Yes from the viewpoint of MySQL but it's wrong at the PHP side. Both Java and C# consider Decimal as a numeric value.Upthrust
Nope, it is not. It as all correct for the entire computer science. If you think it is wrong, then you need another type, of arbitrary precisionSuetonius
@YourCommonSense Care to dig a little deeper into why you think that?Ghee
@RobertMcKee you need to think why there is a decimal type when there is already a float type.Suetonius
@YourCommonSense a float can not accurately represent 0.2, while a decimal can. But, that's a PHP problem, not a entire computer science problem. Many languages (and databases) have intrinsic data types that can represent numbers such as 0.2 accurately. PHP can, but there is no intrinsic data type in the base language. But to say decimal values are always returned as a string is the only correct way is meh. That is assuming you'd prefer 0.2 to be accurately represented over "12" being sorted before "2". Also "2" not being equal to "2.0".Ghee
All right, it's "a PHP problem". In PHP, decimal values are always returned a string is the only correct waySuetonius
The edit queue for this post is full, so I'll add it here: You switched the values of PDO::ATTR_EMULATE_PREPARES - true will return everything as strings, false will allow returning ints and floats.Benniebenning

© 2022 - 2024 — McMap. All rights reserved.