Symfony2 / Doctrine make $statement->execute() not "buffer" all values
Asked Answered
L

3

10

I've got a basic codeset like this (inside a controller):

$sql = 'select * from someLargeTable limit 1000';
$em = $this->getDoctrine()->getManager();
$conn = $em->getConnection();
$statement = $conn->prepare($sql);
$statement->execute();

My difficulty is that when the resultset is only a few records, the memory usage is not that bad. I echoed some debugging information before and after running the $statement->execute(); part of the code, and found for my implementation that I have the following:

pre-execute... rowCount :: 0 memory: 49.614 MB
post-execute... rowCount :: 1000 memory: 50.917 MB

When moving this up from 1000 records, to 10k the difference in MB usage grows to 13 MB

pre-execute... rowCount :: 0 memory: 49.614 MB
post-execute... rowCount :: 10000 memory: 62.521 MB

Eventually, retrieving around 50k records I get close to my maximum memory allocation:

pre-execute... rowCount :: 0 memory: 49.614 MB
post-execute... rowCount :: 50000 memory: 114.096 MB

With this implementation, there is no way I could write a controller (or even command for that matter) that will allow me to retrieve a CSV of data. Sure, 50k+ entries sounds a lot and the question begs why, but that's not the issue.

My ultimate question is: Is it possible to tell the DBAL/Connection or DBAL/Statement to, when executing, buffer the data inside SQL rather than in PHP in it's entirety. For instance, if I have 10 million rows, to only send the first say 10k rows to PHP... let me look through them by way of @statement->fetch(); and when the cursor gets to the end of the 10k, truncate the array and fetch the next 10k from the DB?

Laveta answered 4/9, 2014 at 8:42 Comment(0)
V
14

I just ran into the same problem and wanted to share a possible solution. Chances are your DBAL uses PDO library and its PDO::MYSQL_ATTR_USE_BUFFERED_QUERY set to true which means all the results in your query are cached on mysql side and buffered into memory by PDO even though you never call $statement->fetchAll(). To fix this, we just need to set PDO::MYSQL_ATTR_USE_BUFFERED_QUERY to false but DBAL does not give us a way to do it - its PDO connection class is protected without a public method to retrieve it and it does not give us a way to use setAttribute on the PDO connection.

So, in such situations, I just use my own PDO connection to save memory and speed things up. You can easily instantiate one with your doctrine db parameters like this:

$dbal_conn = $this->getDoctrine()->getManager()->getConnection();
$params = $dbal_conn->getParams();
$pdo_conn = new \PDO(
  'mysql:dbname='.$dbal_conn->getDatabase().';unix_socket='.$params['unix_socket'],
  $dbal_conn->getUsername(),
  $dbal_conn->getPassword()
);
$pdo_conn->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

I am using unix sockets but IP host addresses can also be easily used.

Vicissitude answered 28/11, 2014 at 9:58 Comment(2)
You can get the PDO connection from DBAL by calling getWrappedConnection on the DBAL Connection object ($this->getDoctrine()->getManager()->getConnection()->getWrappedConnection()).Tarlatan
@Tarlatan you are right, I am not sure why I didn't use this method but I was using Symfony 2.2 at the time of the writing and didn't really find an easy way to get the pdo connection.Vicissitude
J
12

The selected answer is wrong and @kroky's answer should be selected as the correct one.

The problem is Buffer vs Unbuffered Queries.

Now it won't be a good idea to change the behaviour for all queries, because:

Unless the full result set was fetched from the server no further queries can be sent over the same connection.

Hence, it should only be used when necessary. Here is a full working example with >200k objects:

    $qb = ...->createQueryBuilder('p');

    $this
        ->em
        ->getConnection()
        ->getWrappedConnection()
        ->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

    $query = $qb->getQuery();
    $result = $query->iterate();
    $batchSize = 20;
    $i = 0;
    foreach ($result as $product)
    {
        $i++;

        var_dump($product[0]->getSku());

        if (($i % $batchSize) === 0) {
            $this->em->flush();
            $this->em->clear(); // Detaches all objects from Doctrine!
        }
    }

It most likely needs some refinement.

Jessy answered 24/3, 2016 at 13:43 Comment(7)
@kroky's post is not an answer to the OP's question. Why should it be the accepted one?Jeromyjerreed
And, your post is just an extension of the already posted solution, in that it handles the edge case of unbuffered queries. You could have posted this as a comment to the accepted answer.Jeromyjerreed
Thank you for the good answer with a practical example. Wouldn't it be a good idea to store the current state of this parameter and restore it after performing a query though?Steffie
@Jeromyjerreed The OP asked "...buffer the data inside SQL rather than in PHP in it's entirety...". I've taken that he meant keeping the data on the MySQL server and not pumping everything in memory of the PHP process. Which makes perfectly sense if he is talking about 50k records and maximum memory allocation. Just using the method "iterate" doesn't do much, it's still all pumped into memory (see the mentioned article, e.g. "query results are immediately transferred from the MySQL Server to PHP and is then kept in the memory of the PHP process).Jessy
@YourCommonSense It was just a quick example and yes, it would make sense to restore the state after you finished. I believe PDOStatement supports setting attributes on query level, but i assume Doctrine doesn't expose the PDOStatement interface.Jessy
@EnricoStahn Fair enough, I didn't remember that part. In that case, I agree that switching to an unbuffered query makes sense.Jeromyjerreed
This is an interesting solution, which we tried. However you can not make additional queries to retrieve related objects while this unbuffered query is not finished.Gustin
R
0

You can disable query buffer by doctrine config param options

doctrine:
    dbal:
        # configure these for your database server
        driver: 'pdo_mysql'
        ...
        options:
            1000: false
Ribaldry answered 3/12, 2018 at 13:0 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.