How to export millions of rows from MySQL to CSV via PHP without exhausting memory?

So I have this table:

mysql> DESCRIBE table;
+-------+------------------+------+-----+---------+----------------+
| Field | Type             | Null | Key | Default | Extra          |
+-------+------------------+------+-----+---------+----------------+
| id    | int(15) unsigned | NO   | PRI | NULL    | auto_increment |
| unid  | char(9)          | NO   | UNI | NULL    |                |
| rs    | varchar(255)     | NO   |     | NULL    |                |
+-------+------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

Which contains millions of rows:

mysql> SELECT COUNT(1) FROM table;
+----------+
| COUNT(1) |
+----------+
|  9435361 |
+----------+
1 row in set (0.00 sec)

I want to export all rows into a .csv file (I'm using Symfony 2.6). The file is meant to be stored on the server (not downloaded) and later read back by PHP.

1st attempt

I tried to make one huge request selecting everything at once (as per this blog post), but despite the use of ->iterate(), this led to Allowed memory size of 1073741824 bytes exhausted after running for ~9s.

    ini_set('memory_limit', '1024M');
    ini_set('max_execution_time', -1);

    $results = $em
        ->getRepository('MyBundle:Entity')
        ->createQueryBuilder('e')
        ->getQuery()
        ->iterate();
    $handle = fopen('/path/to/csv/file', 'w');
    while (false !== ($row = $results->next())) {
        fputcsv($handle, $row[0]->toArray());
        $em->detach($row[0]);
    }
    fclose($handle);
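
For what it's worth, a commonly cited culprit with ->iterate() is Doctrine's SQL logger, which records every executed query in memory for the profiler. A minimal sketch of the same loop with the logger disabled and with batched $em->clear() calls instead of per-entity detaches (same entity and path as above; the batch size of 1000 is an arbitrary choice):

    // Sketch: same iteration as above, but with Doctrine's SQL logger
    // turned off (it accumulates every executed query) and the unit of
    // work cleared in batches instead of detaching entities one by one.
    $em->getConnection()->getConfiguration()->setSQLLogger(null);

    $results = $em
        ->getRepository('MyBundle:Entity')
        ->createQueryBuilder('e')
        ->getQuery()
        ->iterate();
    $handle = fopen('/path/to/csv/file', 'w');
    $i = 0;
    while (false !== ($row = $results->next())) {
        fputcsv($handle, $row[0]->toArray());
        if (++$i % 1000 === 0) {
            $em->clear(); // detaches ALL managed entities at once
        }
    }
    $em->clear();
    fclose($handle);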

2nd attempt

I retrieved the total number of rows and then looped, making that many queries to retrieve the rows one by one. But after writing ~260K rows into the .csv file, PHP ran out of memory with the same error as above: Allowed memory size of 1073741824 bytes exhausted.

    ini_set('memory_limit', '1024M');
    ini_set('max_execution_time', -1);

    $total = (int) $em
        ->getRepository('MyBundle:Entity')
        ->countAll();
    $csv = '/path/to/csv/file';
    $handle = fopen($csv, 'w');
    for ($i = 1; $i <= $total; $i++) // assumes ids are contiguous from 1
    {
        $entity = $em->getRepository('MyBundle:Entity')->findOneById($i);
        fputcsv($handle, $entity->toArray());
        $em->detach($entity);
    }
    fclose($handle);

3rd attempt

I have thought of using the exec() function to run the mysql command line that would export the table. However, my manager doesn't seem to like this option.
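
For illustration only, such a shell-out might look like the sketch below. Credentials, database and table names, and the output path are all placeholders, and note that mysql --batch emits tab-separated output, so strictly speaking this produces a .tsv rather than a .csv:

    // Hypothetical sketch of the exec() idea: let the mysql client
    // stream the table to disk, so PHP memory is never involved.
    // All names and credentials below are placeholders.
    $cmd = "mysql -u user -p'secret' mydb --batch --skip-column-names "
         . "-e 'SELECT id, unid, rs FROM theTable' > /path/to/file.tsv";
    exec($cmd, $output, $exitCode);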


So, am I making a fool of myself in thinking that dumping ~9.5M rows into a .csv file using PHP is even possible? Is there any other way I'm not yet aware of?

Thanks for your help on this one.

Afghanistan asked 17/7, 2015 at 8:12 · Comments (9)
You should be able to iterate through the records without holding all of them in memory at once. Try to bypass your framework and use the database connection directly. – Cumulative
Shouldn't the 2nd attempt do exactly that job? E.g. not loading all of them into memory at once by querying rows one by one? – Afghanistan
It should, but apparently it isn't for whatever reason. I suspect it's the framework's fault. – Cumulative
@Afghanistan I think detaching the entity does not necessarily free its memory. I'm not familiar with EMs in PHP, but in Java the object still exists; it just becomes detached from the persistence context. Maybe you can try an explicit unset() after detaching the entity? – Enrich
I'll describe my approach with PDO (I don't know Doctrine): create a statement, fetch it row by row with while ($row = $stmt->fetch()) {}, and save to file via fputs, fputcsv or whatever stream function. This should bypass the memory exhaustion. (See the sketch after these comments.) – Declaratory
@Enrich unlink() deletes files... – Cumulative
@Cumulative Yeah, meant unset() of course. – Enrich
@Afghanistan: I know you've got an answer, but have you tried increasing the limit, e.g. ini_set('memory_limit', '4096M')? Since you set memory_limit to 1024M, you got Allowed memory size of 1073741824 bytes exhausted. – Departmentalism
@Departmentalism I didn't mention it, but increasing PHP's memory limit was, of course, not an option to consider, as it would at best have postponed the problem (if it solved it at all). – Afghanistan
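
A minimal sketch of that PDO suggestion (DSN, credentials, and table name are placeholders; the key point is disabling buffered queries so MySQL streams rows to PHP one at a time):

    // Sketch of the row-by-row PDO approach from the comment above.
    // With MYSQL_ATTR_USE_BUFFERED_QUERY disabled, rows are streamed
    // from the server one at a time, so memory use stays flat.
    $pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
    $pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

    $stmt = $pdo->query('SELECT id, unid, rs FROM theTable');
    $handle = fopen('/path/to/csv/file', 'w');
    while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
        fputcsv($handle, $row);
    }
    fclose($handle);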

Rather than attempting to build the object tree, you could try selecting the result directly into a file: http://dev.mysql.com/doc/refman/5.7/en/select.html

Something like

SELECT * INTO OUTFILE "c:/temp/mycsv.csv"
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY "\n"
FROM theTable;

This should leave the job up to MySQL and bypass any PHP memory limitations.


As venca noted: in this case, the user the mysql service runs as needs write permission to the directory in question.
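
If the statement should be triggered from the Symfony side, a hedged sketch using the question's Doctrine connection might look like this (the output path is a placeholder and must be writable by mysqld):

    // Sketch: run the INTO OUTFILE statement through Doctrine's DBAL
    // connection. The file is written by the MySQL *server* process,
    // so PHP's memory limit never comes into play.
    $sql = 'SELECT * INTO OUTFILE "/path/writable/by/mysqld/export.csv" '
         . 'FIELDS TERMINATED BY \',\' OPTIONALLY ENCLOSED BY \'"\' '
         . 'LINES TERMINATED BY "\n" '
         . 'FROM theTable';
    $em->getConnection()->executeQuery($sql);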

Enrich answered 17/7, 2015 at 8:18 · Comments (3)
This is probably the fastest option, but it will create the file on the database server; you might have difficulties retrieving it. – Cumulative
The user the mysql server runs as needs write permission to the directory where you want to save the csv. – Declaratory
I was wondering the same thing about why to involve PHP in this scenario at all. Thanks for the answer :) – Departmentalism
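
One caveat worth checking before relying on INTO OUTFILE: the server's secure_file_priv variable restricts where mysqld may write files. A quick way to inspect it from PHP (a sketch, reusing the question's entity manager):

    // Check where the server allows INTO OUTFILE to write:
    // '' means anywhere, NULL means the feature is disabled, and a
    // path means files may only be written under that directory.
    $dir = $em->getConnection()->fetchColumn('SELECT @@secure_file_priv');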
