Efficient method for large file uploads (0 - 5 GB) through PHP
I have been searching for a good method, and banging my head against the wall.

In a file sharing service project, I have been assigned to determine the best method available for uploading large files.

After searching a lot of questions here on Stack Overflow and other forums, here's what I found:

  1. Increase the script maximum execution time, along with maximum file size allowed

    This really doesn't fit well. The upload will almost always time out when the file is sent over a normal broadband connection (1-2 Mbps). Even if the PHP script only runs after the upload has finished, there is still no guarantee that the upload itself will not time out.

  2. Chunked upload.

    I roughly understand what I'm supposed to do here, but what confuses me is this: say a 1 GB file is being uploaded and I'm reading it in chunks of 2 MB; if the upload is slow, the PHP script execution will still time out and return an error.

  3. Use other languages like Java and Perl?

    Is it really more efficient to use Java or Perl for handling file uploads?
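To make option 2 concrete, here is a minimal server-side sketch of a chunked-upload receiver, assuming the client sends each chunk as a separate POST; the field names (`chunk_index`, `total_chunks`, `upload_id`) and helper functions are illustrative, not from any particular library:

```php
<?php
// Hypothetical chunked-upload helpers. Each chunk is stored in its own
// part file, so requests may arrive in any order and can be retried.

function save_chunk(string $tmpPath, string $dir, int $index): void
{
    if (!is_dir($dir)) {
        mkdir($dir, 0700, true);
    }
    // One file per chunk; in a real endpoint $tmpPath would come from
    // $_FILES['chunk']['tmp_name'] via move_uploaded_file().
    rename($tmpPath, $dir . '/part_' . $index);
}

function assemble_chunks(string $dir, int $total, string $dest): bool
{
    // Only assemble once every part is present.
    for ($i = 0; $i < $total; $i++) {
        if (!file_exists($dir . '/part_' . $i)) {
            return false;
        }
    }
    $out = fopen($dest, 'wb');
    for ($i = 0; $i < $total; $i++) {
        $in = fopen($dir . '/part_' . $i, 'rb');
        // stream_copy_to_stream() copies in fixed-size buffers,
        // so memory use stays constant regardless of file size.
        stream_copy_to_stream($in, $out);
        fclose($in);
    }
    fclose($out);
    return true;
}
```

Because each chunk is a short, independent request, no single PHP execution has to outlive the whole transfer, which sidesteps the timeout concern from option 1.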

The method used by the client is not the problem here, as we'll be issuing a client SDK and can implement the method of our choice in it. Both the client- and server-side implementations will be decided by us.

Which method, in your opinion, would be the best, considering that memory usage should be efficient and that there may be many concurrent uploads going on?

How do Dropbox, and similar cloud storage services handle big file uploads, and still stay fast at it?

Explain answered 17/8, 2014 at 14:25 Comment(8)
What is the actual problem you're having? There are a couple of PHP settings that limit the upload file size (upload_max_filesize and post_max_size), and you may also need to change max_input_time if it takes longer than 5 minutes to upload the file... but it won't store the actual file in memory unless you explicitly load it into memory. – Caves
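The settings mentioned in this comment map to a php.ini fragment roughly like the following; the values are illustrative only:

```ini
; Illustrative values - tune to your expected file sizes and link speeds
upload_max_filesize = 5G
post_max_size = 5G
max_input_time = 600
max_execution_time = 600
```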
@MarkBaker Oops, I messed up. I'm not having a problem, I'm just asking for suggestions, as there are more experienced people here. I just want to know the methods that may be best for this case. – Explain
Chunked upload doesn't mean that one script/process is tasked with assembling the file or accepting every piece. It's a task distribution problem, and it is best solved by splitting the file into chunks. If you write the code in such a way that it doesn't care which chunk it receives, you've done the task properly. Only when all chunks are there do you assemble them into the original file. – Laski
@Laski But that means there will be a lot of HTTP requests in transferring the file, right? If the file is 1 GB and I'm sending chunks of 10 MB, that would mean 100 requests for a single file. – Explain
“But that means, there will be a lot of HTTP requests in transferring the file, right?” – a) yes, b) so what? – Anticipative
As @CBroe said - yes, and so what? – Laski
@Laski Well yeah, the resources used by the requests don't matter; I see it now. – Explain
Use FTP, or build an FTP-like server, since you have control over both ends. – Quyenr
I suggest you use PHP I/O streams with AJAX. This keeps the memory footprint low on the server, and you can easily build an asynchronous file upload. Note that this uses the HTML5 File API, which is available only in modern browsers.

Check out this post: https://web.archive.org/web/20170803172549/http://www.webiny.com/blog/2012/05/07/webiny-file-upload-with-html5-and-ajax-using-php-streams/

Pasting the code from the article here:

HTML

<input type="file" name="upload_files" id="upload_files" multiple="multiple">

JS

function upload(fileInputId, fileIndex)
{
    // take the file from the input
    var file = document.getElementById(fileInputId).files[fileIndex];
    var reader = new FileReader();
    reader.readAsBinaryString(file); // alternatively you can use readAsDataURL
    reader.onloadend = function (evt)
    {
        // create XHR instance
        var xhr = new XMLHttpRequest();

        // send the file through POST
        xhr.open("POST", 'upload.php', true);

        // make sure we have the sendAsBinary method on all browsers
        XMLHttpRequest.prototype.mySendAsBinary = function (text) {
            var data = new ArrayBuffer(text.length);
            var ui8a = new Uint8Array(data, 0);
            for (var i = 0; i < text.length; i++) ui8a[i] = (text.charCodeAt(i) & 0xff);

            var blob;
            if (typeof window.Blob == "function") {
                blob = new Blob([data]);
            } else {
                var bb = new (window.MozBlobBuilder || window.WebKitBlobBuilder || window.BlobBuilder)();
                bb.append(data);
                blob = bb.getBlob();
            }

            this.send(blob);
        };

        // let's track upload progress
        var eventSource = xhr.upload || xhr;
        eventSource.addEventListener("progress", function (e) {
            // get the percentage of the current file that has been sent
            var position = e.position || e.loaded;
            var total = e.totalSize || e.total;
            var percentage = Math.round((position / total) * 100);

            // here you should write your own code for how you wish to process this
        });

        // state change observer - we need to know when and if the file was successfully uploaded
        xhr.onreadystatechange = function () {
            if (xhr.readyState == 4) {
                if (xhr.status == 200) {
                    // process success
                } else {
                    // process error
                }
            }
        };

        // start sending
        xhr.mySendAsBinary(evt.target.result);
    };
}

PHP

// read contents from the input stream
$inputHandler = fopen('php://input', "r");
// create a temp file where to save data from the input stream
$fileHandler = fopen('/tmp/myfile.tmp', "w+");

// save data from the input stream; fread() is binary-safe,
// unlike fgets(), which scans for line endings
while (!feof($inputHandler)) {
    $buffer = fread($inputHandler, 4096);
    if ($buffer === false || $buffer === '') {
        break;
    }
    fwrite($fileHandler, $buffer);
}

fclose($inputHandler);
fclose($fileHandler);
Kadiyevka answered 17/8, 2014 at 16:6 Comment(6)
Then there's the same problem: if the file is large and the PHP script is taking only 4096-byte chunks from a 1 GB file, there is a chance of a timeout, and the script may not complete. – Explain
Try setting set_time_limit(0) at the beginning of the PHP script. It would also be a good idea to set ignore_user_abort(true), though I'm not sure how well this works with file uploads. – Kadiyevka
Still, it's possible that the HTTP connection may time out too if the file is very large. Wouldn't you say? – Explain
That's why you send information about which chunk out of 1000 chunks is being uploaded. If it fails, you retry. That's the point of chunked upload. Even large systems that depend on data transfer across the network implement this method in one way or another. It doesn't matter if the connection crashes or times out if you can simply restart the transfer of that chunk. – Laski
@Laski Yes, I see that now. It may make resumable uploads easier to implement too. – Explain
I can't see any chunk size in this stream. It looks like the file is sent in one chunk - the full size. – Witkin
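As the comments note, the payoff of chunking is that a failed chunk can simply be resent. A client-side sketch of per-chunk retry in PHP, where the `$send` callback stands in for whatever HTTP call actually posts the chunk (all names here are hypothetical):

```php
<?php
// Hypothetical per-chunk retry: re-send a single chunk up to $maxTries
// times before giving up, with a small, growing delay between attempts.
function send_chunk_with_retry(callable $send, string $chunk, int $maxTries = 3): bool
{
    for ($try = 1; $try <= $maxTries; $try++) {
        if ($send($chunk)) {
            return true; // chunk accepted; the caller moves on to the next one
        }
        // Back off briefly before retrying the same chunk.
        usleep(100000 * $try);
    }
    return false; // caller can pause here and resume the upload later
}
```

Because only the failed chunk is resent, a dropped connection costs at most one chunk's worth of transfer rather than the whole file.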
Maybe the tus HTTP-based resumable file upload protocol and its implementations?

https://tus.io/

https://github.com/tus

https://github.com/ankitpokhrel/tus-php
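The core of the tus protocol is simple: the client creates an upload with a POST carrying an `Upload-Length` header, then sends the data in one or more PATCH requests that declare the current `Upload-Offset`. A rough PHP/cURL sketch of the PATCH side (this illustrates the wire protocol itself, not the tus-php library API; `$uploadUrl` is whatever the creation request returned in its `Location` header):

```php
<?php
// Headers required by tus 1.0.0 for an upload PATCH request.
function tus_patch_headers(int $offset): array
{
    return [
        'Tus-Resumable: 1.0.0',
        'Upload-Offset: ' . $offset,
        'Content-Type: application/offset+octet-stream',
    ];
}

// Send one chunk of data starting at $offset to an existing tus upload URL.
function tus_send_chunk(string $uploadUrl, string $chunk, int $offset)
{
    $ch = curl_init($uploadUrl);
    curl_setopt_array($ch, [
        CURLOPT_CUSTOMREQUEST  => 'PATCH',
        CURLOPT_HTTPHEADER     => tus_patch_headers($offset),
        CURLOPT_POSTFIELDS     => $chunk,
        CURLOPT_RETURNTRANSFER => true,
    ]);
    $response = curl_exec($ch);
    curl_close($ch);
    return $response;
}
```

On reconnect, the client issues a HEAD request to the same URL, reads the server's `Upload-Offset`, and resumes from there, which is exactly the resumability the question is after.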

Foxe answered 4/7, 2020 at 1:24 Comment(1)
What about upload efficiency, or uploading a file quickly without it stopping? Is there any trick or solution for that? – Porcia
