Fastest Way to Serve a File Using PHP
O

8

110

I'm trying to put together a function that receives a file path, identifies what it is, sets the appropriate headers, and serves it just like Apache would.

The reason I am doing this is because I need to use PHP to process some information about the request before serving the file.

Speed is critical

virtual() isn't an option

Must work in a shared hosting environment where the user has no control of the web server (Apache/nginx, etc)

Here's what I've got so far:

File::output($path);

<?php
class File {
static function output($path) {
    // Check if the file exists
    if(!File::exists($path)) {
        header('HTTP/1.0 404 Not Found');
        exit();
    }

    // Set the content-type header
    header('Content-Type: '.File::mimeType($path));

    // Handle caching
    $fileModificationTime = gmdate('D, d M Y H:i:s', File::modificationTime($path)).' GMT';
    $headers = getallheaders();
    if(isset($headers['If-Modified-Since']) && $headers['If-Modified-Since'] == $fileModificationTime) {
        header('HTTP/1.1 304 Not Modified');
        exit();
    }
    header('Last-Modified: '.$fileModificationTime);

    // Read the file
    readfile($path);

    exit();
}

static function mimeType($path) {
    preg_match("|\.([a-z0-9]{2,4})$|i", $path, $fileSuffix);

    switch(strtolower($fileSuffix[1])) {
        case 'js' :
            return 'application/x-javascript';
        case 'json' :
            return 'application/json';
        case 'jpg' :
        case 'jpeg' :
        case 'jpe' :
            return 'image/jpeg';
        case 'png' :
        case 'gif' :
        case 'bmp' :
        case 'tiff' :
            return 'image/'.strtolower($fileSuffix[1]);
        case 'css' :
            return 'text/css';
        case 'xml' :
            return 'application/xml';
        case 'doc' :
            return 'application/msword';
        case 'docx' :
            return 'application/vnd.openxmlformats-officedocument.wordprocessingml.document';
        case 'xls' :
        case 'xlt' :
        case 'xlm' :
        case 'xld' :
        case 'xla' :
        case 'xlc' :
        case 'xlw' :
        case 'xll' :
            return 'application/vnd.ms-excel';
        case 'ppt' :
        case 'pps' :
            return 'application/vnd.ms-powerpoint';
        case 'rtf' :
            return 'application/rtf';
        case 'pdf' :
            return 'application/pdf';
        case 'html' :
        case 'htm' :
        case 'php' :
            return 'text/html';
        case 'txt' :
            return 'text/plain';
        case 'mpeg' :
        case 'mpg' :
        case 'mpe' :
            return 'video/mpeg';
        case 'mp3' :
            return 'audio/mpeg';
        case 'wav' :
            return 'audio/wav';
        case 'aiff' :
        case 'aif' :
            return 'audio/aiff';
        case 'avi' :
            return 'video/x-msvideo';
        case 'wmv' :
            return 'video/x-ms-wmv';
        case 'mov' :
            return 'video/quicktime';
        case 'zip' :
            return 'application/zip';
        case 'tar' :
            return 'application/x-tar';
        case 'swf' :
            return 'application/x-shockwave-flash';
        default :
            // Unknown extension: fall back to content sniffing if available
            if(function_exists('mime_content_type')) {
                $mimeType = mime_content_type($path);
                if($mimeType !== false) {
                    return $mimeType;
                }
            }
            return 'application/octet-stream';
    }
}
}
?>
Oxpecker answered 13/9, 2010 at 3:47 Comment(4)
Why aren't you letting Apache do this? It's always going to be considerably faster than starting up the PHP interpreter... - Furiya
I need to process the request and store some information in the database before outputting the file. - Oxpecker
May I suggest a way to get the extension without the more expensive regular expressions: $parts = explode(".", $pathToFile); $extension = end($parts); (end() takes its argument by reference, so it needs a variable), or you can do it with substr and strrpos: $extension = substr($pathToFile, strrpos($pathToFile, '.') + 1). Also, as a fallback to mime_content_type(), you can try a system call: $mimetype = exec('file -bi ' . escapeshellarg($pathToFile), $output); - Electrothermal
What do you mean by fastest? Fastest download time? - Principality
R
151

My previous answer was partial and not well documented. Here is an update with a summary of the solutions from it and from others in the discussion.

The solutions are ordered from best to worst, but also from the one needing the most control over the web server to the one needing the least. There doesn't seem to be an easy way to have one solution that is both fast and works everywhere.


Using the X-SendFile header

As documented by others, this is actually the best way. The idea is that you do your access control in PHP and then, instead of sending the file yourself, you tell the web server to do it.

The basic PHP code is:

header("X-Sendfile: $file_name");
header("Content-type: application/octet-stream");
header('Content-Disposition: attachment; filename="' . basename($file_name) . '"');

Where $file_name is the full path on the file system.

The main problem with this solution is that it needs to be allowed by the web server, and it either isn't installed by default (Apache), isn't active by default (lighttpd), or needs a specific configuration (nginx).

Apache

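Under Apache, if you use mod_php you need to install a module called mod_xsendfile and then configure it. XSendFile can go either in the Apache config or in .htaccess if you allow it; XSendFilePath can only go in the Apache config, as noted in the comments below.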

XSendFile on
XSendFilePath /home/www/example.com/htdocs/files/

With this module the file path could either be absolute or relative to the specified XSendFilePath.

Lighttpd

mod_fastcgi supports this when configured with

"allow-x-send-file" => "enable" 

The feature is documented on the lighttpd wiki. It documents the X-LIGHTTPD-send-file header, but the X-Sendfile name also works.

Nginx

On nginx you can't use the X-Sendfile header; you must use nginx's own header, named X-Accel-Redirect. It is enabled by default, and the only real difference is that its argument should be a URI, not a file system path. The consequence is that you must define a location marked as internal in your configuration, so that clients cannot find the real file URL and go to it directly. The nginx wiki contains a good explanation of this.
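
A minimal sketch of the PHP side under nginx. It assumes an internal location called /protected/ that maps to the directory holding the files; that location name and the helper accelRedirectUri() are made up for illustration and are not part of nginx or PHP.

```php
<?php
// Hypothetical helper: maps a file name to a URI under an nginx location
// that is ASSUMED to be declared "internal" in the server configuration.
function accelRedirectUri(string $fileName, string $internalPrefix = '/protected/'): string
{
    // basename() drops any directory part, so "../" cannot escape the prefix.
    $safeName = basename($fileName);

    // X-Accel-Redirect takes a URI rather than a path, so the name is
    // percent-encoded here. Whether your nginx version expects an escaped
    // URI is an assumption worth checking against its documentation.
    return rtrim($internalPrefix, '/') . '/' . rawurlencode($safeName);
}

// Usage once the access checks have passed (commented out so that the
// sketch has no side effects when it is included):
// header('Content-Type: application/octet-stream');
// header('Content-Disposition: attachment; filename="report.pdf"');
// header('X-Accel-Redirect: ' . accelRedirectUri('report.pdf'));
// exit;
```

On the nginx side, the matching location would be marked internal and pointed at the storage directory, so that a direct request for /protected/report.pdf from a client is refused.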

Symlinks and Location header

You could use symlinks and redirect to them: when a user is authorized to access a file, create a symlink to it with a random name and redirect the user to it using:

header("Location: " . $url_of_symlink);

Obviously you'll need a way to prune them, either when the script that creates them is called or via cron (on the machine if you have access, or via some webcron service otherwise).

Under Apache you need to be able to enable FollowSymLinks in a .htaccess file or in the Apache config.
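
The symlink approach could be sketched roughly as follows. The helper names createDownloadLink() and pruneDownloadLinks() are hypothetical, and the sketch assumes a web-accessible directory that PHP may write to and that the server follows symlinks from.

```php
<?php
// Hypothetical helper: creates a randomly named symlink to $target inside
// $linkDir and returns the file name of the link.
function createDownloadLink(string $target, string $linkDir): string
{
    // 32 hex characters from a CSPRNG make the name impractical to guess.
    $name = bin2hex(random_bytes(16));
    $ext  = pathinfo($target, PATHINFO_EXTENSION);
    if ($ext !== '') {
        $name .= '.' . $ext; // keep the extension so the server can pick a MIME type
    }

    if (!symlink($target, $linkDir . '/' . $name)) {
        throw new RuntimeException('Could not create symlink');
    }
    return $name;
}

// Hypothetical helper: removes symlinks older than $maxAge seconds and
// returns how many were removed.
function pruneDownloadLinks(string $linkDir, int $maxAge = 3600): int
{
    $removed = 0;
    foreach (scandir($linkDir) as $entry) {
        $path = $linkDir . '/' . $entry;
        // lstat() reports on the link itself, not on the file it points to.
        if (is_link($path) && (time() - lstat($path)['mtime']) > $maxAge) {
            unlink($path);
            $removed++;
        }
    }
    return $removed;
}
```

After the access check, the script would send a Location header built from the public URL of the link directory plus the returned name, and call the pruning function either on each request or from cron.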

Access control by IP and Location header

Another hack is to generate Apache access files from PHP that explicitly allow the user's IP. Under Apache this means using mod_authz_host (mod_access) Allow from directives.

The problem is that locking access to the access file (as multiple users may want to do this at the same time) is nontrivial and could lead to some users waiting a long time. And you still need to prune the file anyway.

Obviously, another problem is that multiple people behind the same IP could potentially access the file.

When everything else fails

If you really don't have any way to get your web server to help you, the only remaining solution is readfile(). It is available in all PHP versions currently in use and works pretty well (but isn't really efficient).
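
A minimal sketch of that fallback, assuming the access checks have already been done. The function name sendWithReadfile() is hypothetical; readfile(), header() and filesize() are standard PHP functions.

```php
<?php
// Hypothetical fallback: streams $path to the client with readfile().
// Returns the number of bytes sent, or false if the file is not readable.
function sendWithReadfile(string $path, string $mimeType = 'application/octet-stream')
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }

    if (!headers_sent()) {
        header('Content-Type: ' . $mimeType);
        header('Content-Length: ' . filesize($path));
    }

    // readfile() copies the file to the output in chunks, but an active
    // output buffer would still collect all of it in memory, so callers
    // serving large files may want to call ob_end_clean() first.
    return readfile($path);
}
```

The output buffering remark is the usual reason this approach runs into the memory limit on large files; with buffering off, the cost is mostly that a PHP process stays busy for the whole transfer.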


Combining solutions

In the end, the best way to send a file really fast, if you want your PHP code to be usable everywhere, is to have a configurable option somewhere, with instructions on how to activate it depending on the web server, and maybe auto-detection in your install script.

This is pretty similar to what a lot of software does for:

  • Clean urls (mod_rewrite on apache)
  • Crypto functions (mcrypt php module)
  • Multibyte string support (mbstring php module)
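
Such a configurable option could look roughly like this. The function deliveryHeaders(), the method names and the /protected/ prefix are all assumptions made for the sake of the sketch.

```php
<?php
// Hypothetical dispatcher: returns the headers to send for a configured
// delivery method, or null when PHP has to stream the file itself.
function deliveryHeaders(string $method, string $path, string $internalPrefix = '/protected/'): ?array
{
    switch ($method) {
        case 'x-sendfile':        // Apache with mod_xsendfile, or lighttpd
            return array('X-Sendfile: ' . $path);
        case 'x-accel-redirect':  // nginx; the prefix is an assumed internal location
            return array('X-Accel-Redirect: ' . $internalPrefix . rawurlencode(basename($path)));
        default:                  // 'readfile' or anything unknown
            return null;
    }
}

// Usage sketch; $config['delivery'] stands in for the configurable option:
// $headers = deliveryHeaders($config['delivery'], $file);
// if ($headers === null) {
//     readfile($file);
// } else {
//     foreach ($headers as $h) { header($h); }
// }
```

Defaulting to the readfile() path for unknown values keeps the script working on hosts where nothing has been configured, at the cost of speed.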
Revisionism answered 16/9, 2010 at 23:29 Comment(7)
Is there any problem with doing some PHP work (checking cookies or other GET/POST params against the database) before doing header("Location: " . $path);? - Giamo
No problem with that. The things you need to be careful with are sending content (print, echo), as the header must come before any content, and doing things after sending this header: it is not an immediate redirection, and code after it will be executed most of the time, but you have no guarantee that the browser won't cut the connection. - Revisionism
Jords: I didn't know that Apache also supported this; I'll add it to my answer when I have time. The only problem is that it isn't unified (X-Accel-Redirect on nginx, for example), so a second solution is needed if the server doesn't support it. But I should add it to my answer. - Revisionism
Where can I allow .htaccess to control the XSendFilePath? - Preconcert
@Keyne I don't think you can. tn123.org/mod_xsendfile does not list .htaccess in the context for the XSendFilePath option. - Assistant
@Assistant - that's right. XSendFilePath can't be put in .htaccess. If you're the only one who has control over the server, it works just fine to omit it. - Liveried
Can I do any of this when the file is hosted on a different server? My issue is that I need to serve files from a third-party server, in the most efficient way, without giving away the URL. - Hilliary
H
34

The fastest way: Don't. Look into the X-Sendfile header (on nginx the equivalent is X-Accel-Redirect); there are similar things for other web servers as well. This means that you can still do access control etc. in PHP but delegate the actual sending of the file to a web server designed for that.

P.S.: I get chills just thinking about how much more efficient using this with nginx is, compared to reading and sending the file in PHP. Just think if 100 people are downloading a file: with PHP + Apache, being generous, that's probably 100 * 15 MB = 1.5 GB (approx., shoot me) of RAM right there. Nginx will just hand off sending the file to the kernel, and then it's loaded directly from the disk into the network buffers. Speedy!

P.P.S.: And with this method you can still do all the access control and database stuff you want.

Hardner answered 13/9, 2010 at 3:52 Comment(5)
Let me just add that this also exists for Apache: jasny.net/articles/how-i-php-x-sendfile. You can make the script sniff out the server and send the appropriate headers. If none exist (and the user has no control over the server as per the question), fall back to a normal readfile(). - Electrothermal
Now this is just awesome - I always hated bumping up the memory limit in my virtual hosts just so that PHP would serve up a file, and with this I shouldn't have to. I'll be trying it out very soon. - Advanced
And for credit where credit is due, Lighttpd was the first web server to implement this (and the rest copied it, which is fine since it's a great idea. But give credit where credit is due)... - Rotarian
This answer keeps getting upvoted, but it won't work in an environment where the web server and its settings are out of the user's control. - Oxpecker
You actually added that to your question after I posted this answer. And if performance is an issue, then the web server has to be within your control. - Hardner
P
27

Here goes a pure PHP solution. I've adapted the following function from my personal framework:

function Download($path, $speed = null, $multipart = true)
{
    while (ob_get_level() > 0)
    {
        ob_end_clean();
    }

    if (is_file($path = realpath($path)) === true)
    {
        $file = @fopen($path, 'rb');
        $size = sprintf('%u', filesize($path));
        $speed = (empty($speed) === true) ? 1024 : floatval($speed);

        if (is_resource($file) === true)
        {
            set_time_limit(0);

            if (strlen(session_id()) > 0)
            {
                session_write_close();
            }

            if ($multipart === true)
            {
                $range = array(0, $size - 1);

                if (array_key_exists('HTTP_RANGE', $_SERVER) === true)
                {
                    $range = array_map('intval', explode('-', preg_replace('~.*=([^,]*).*~', '$1', $_SERVER['HTTP_RANGE'])));

                    if (empty($range[1]) === true)
                    {
                        $range[1] = $size - 1;
                    }

                    foreach ($range as $key => $value)
                    {
                        $range[$key] = max(0, min($value, $size - 1));
                    }

                    if (($range[0] > 0) || ($range[1] < ($size - 1)))
                    {
                        header(sprintf('%s %03u %s', 'HTTP/1.1', 206, 'Partial Content'), true, 206);

                        // Content-Range only belongs on a partial (206) response
                        header('Content-Range: bytes ' . sprintf('%u-%u/%u', $range[0], $range[1], $size));
                    }
                }

                header('Accept-Ranges: bytes');
            }

            else
            {
                $range = array(0, $size - 1);
            }

            header('Pragma: public');
            header('Cache-Control: public, no-cache');
            header('Content-Type: application/octet-stream');
            header('Content-Length: ' . sprintf('%u', $range[1] - $range[0] + 1));
            header('Content-Disposition: attachment; filename="' . basename($path) . '"');
            header('Content-Transfer-Encoding: binary');

            if ($range[0] > 0)
            {
                fseek($file, $range[0]);
            }

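            // Never send more bytes than the requested range contains
            $remaining = $range[1] - $range[0] + 1;

            while (($remaining > 0) && (feof($file) !== true) && (connection_status() === CONNECTION_NORMAL))
            {
                $chunk = fread($file, (int) min($remaining, max(1, round($speed * 1024))));

                if (($chunk === false) || ($chunk === ''))
                {
                    break;
                }

                $remaining -= strlen($chunk);
                echo $chunk; flush(); sleep(1);
            }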

            fclose($file);
        }

        exit();
    }

    else
    {
        header(sprintf('%s %03u %s', 'HTTP/1.1', 404, 'Not Found'), true, 404);
    }

    return false;
}

The code is as efficient as it can be: it closes the session handler so that other PHP scripts can run concurrently for the same user / session. It also supports serving downloads in ranges (which I suspect is also what Apache does by default), so that people can pause/resume downloads and also benefit from higher download speeds with download accelerators. It also allows you to specify the maximum speed (in KB/s) at which the download (part) should be served via the $speed argument.
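
To make the range handling easier to follow, here is a standalone sketch of parsing a Range header. It is not the parsing used in Download() above; parseByteRange() is a hypothetical helper that only looks at the first range and also handles the suffix form (bytes=-N).

```php
<?php
// Hypothetical helper: parses the first range of a "Range: bytes=..." header.
// Returns array($start, $end) with inclusive offsets, or null when the header
// is malformed or cannot be satisfied for a file of $size bytes.
function parseByteRange(string $header, int $size): ?array
{
    if (!preg_match('/^bytes=(\d*)-(\d*)/', trim($header), $m)) {
        return null;
    }
    list(, $first, $last) = $m;

    if ($first === '' && $last === '') {
        return null; // "bytes=-" carries no information
    }
    if ($first === '') {
        // Suffix form "bytes=-N": the final N bytes of the file.
        $start = max(0, $size - (int) $last);
        $end   = $size - 1;
    } else {
        $start = (int) $first;
        // An open-ended or overlong range stops at the last byte.
        $end   = ($last === '') ? $size - 1 : min((int) $last, $size - 1);
    }

    return ($start <= $end && $start < $size) ? array($start, $end) : null;
}
```

With a valid result the response would be a 206 with Content-Range: bytes start-end/size and a Content-Length of end - start + 1; with null and a Range header present, a 416 with Content-Range: bytes */size is the usual answer.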

Principality answered 29/9, 2011 at 0:18 Comment(5)
Obviously this is only a good idea if you can't use X-Sendfile or one of its variants to have the kernel send the file. You should be able to replace the feof()/fread() loop above with a call to PHP's eio_sendfile() (php.net/manual/en/function.eio-sendfile.php), which accomplishes the same thing in PHP. This isn't as fast as doing it directly in the kernel, as any output generated in PHP still has to go back out through the web server process, but it's going to be a helluva lot faster than doing it in PHP code. - Chromatogram
@BrianC: Sure, but you can't limit the speed or the multipart ability with X-Sendfile (which may not be available) and eio is also not always available. Still, +1, didn't know about that PECL extension. =) - Principality
Would it be useful to support Transfer-Encoding: chunked and Content-Encoding: gzip? - Moustache
Why $size = sprintf('%u', filesize($path))? - Epigraphic
@Alix Axel Thank you ;) - Quartas
R
14
header('Location: ' . $path);
exit(0);

Let Apache do the work for you.

Reich answered 13/9, 2010 at 4:18 Comment(6)
That's simpler than the X-Sendfile method, but it will not work to restrict access to a file to, say, only logged-in people. If you don't need to do that then it's great! - Hardner
Also add a referrer check with mod_rewrite. - Origen
You could auth before passing the header. That way you're also not pumping tons of stuff through PHP's memory. - Flavin
@UltimateBrent The location still has to be accessible to all. And a referrer check is no security at all, since it comes from the client. - Stines
@Jimbo A user token that you're going to check how? With PHP? Suddenly your solution is recursing. - Fitful
This is the simplest solution to the streaming problem. When streaming files in PHP, none of the common methods are as efficient as just loading the file into memory with file_get_contents($file_path) and then echoing it with the headers (obviously this doesn't work for files bigger than the allowed memory, let's say 300-400GB). I tried all the methods, and none of them got even close to this one. At this point I would say it is easier and better to handle the permissions problems than to try to stream with PHP (which is disgustingly horrible). - Semitone
M
1

A better implementation, with cache support and customizable HTTP headers.

serveStaticFile($fn, array(
        'headers'=>array(
            'Content-Type' => 'image/x-icon',
            'Cache-Control' =>  'public, max-age=604800',
            'Expires' => gmdate("D, d M Y H:i:s", time() + 30 * 86400) . " GMT",
        )
    ));

function serveStaticFile($path, $options = array()) {
    $path = realpath($path);
    if (is_file($path)) {
        if(session_id())
            session_write_close();

        header_remove();
        set_time_limit(0);
        $size = filesize($path);
        $lastModifiedTime = filemtime($path);
        $fp = @fopen($path, 'rb');
        $range = array(0, $size - 1);

        header('Last-Modified: ' . gmdate("D, d M Y H:i:s", $lastModifiedTime)." GMT");
        if (( ! empty($_SERVER['HTTP_IF_MODIFIED_SINCE']) && strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) == $lastModifiedTime ) ) {
            header("HTTP/1.1 304 Not Modified", true, 304);
            return true;
        }

        if (isset($_SERVER['HTTP_RANGE'])) {
            //$valid = preg_match('^bytes=\d*-\d*(,\d*-\d*)*$', $_SERVER['HTTP_RANGE']);
            if(substr($_SERVER['HTTP_RANGE'], 0, 6) != 'bytes=') {
                header('HTTP/1.1 416 Requested Range Not Satisfiable', true, 416);
                header('Content-Range: bytes */' . $size); // Required in 416.
                return false;
            }

            $ranges = explode(',', substr($_SERVER['HTTP_RANGE'], 6));
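            $range = array_pad(explode('-', trim($ranges[0]), 2), 2, ''); // to do: only the first range is supported for now.

            if ($range[0] === '' && $range[1] !== '') {
                // Suffix range "bytes=-N" means the last N bytes of the file.
                $range = array(max(0, $size - (int) $range[1]), $size - 1);
            }
            if ($range[0] === '') $range[0] = 0;
            if ($range[1] === '') $range[1] = $size - 1;
            $range = array((int) $range[0], (int) $range[1]);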

            if (($range[0] >= 0) && ($range[1] <= $size - 1) && ($range[0] <= $range[1])) {
                header('HTTP/1.1 206 Partial Content', true, 206);
                header('Content-Range: bytes ' . sprintf('%u-%u/%u', $range[0], $range[1], $size));
            }
            else {
                header('HTTP/1.1 416 Requested Range Not Satisfiable', true, 416);
                header('Content-Range: bytes */' . $size);
                return false;
            }
        }

        $contentLength = $range[1] - $range[0] + 1;

        //header('Content-Disposition: attachment; filename="xxxxx"');
        $headers = array(
            'Accept-Ranges' => 'bytes',
            'Content-Length' => $contentLength,
            'Content-Type' => 'application/octet-stream',
        );

        if(!empty($options['headers'])) {
            $headers = array_merge($headers, $options['headers']);
        }
        foreach($headers as $k=>$v) {
            header("$k: $v", true);
        }

        if ($range[0] > 0) {
            fseek($fp, $range[0]);
        }
        $sentSize = 0;
        while (!feof($fp) && (connection_status() === CONNECTION_NORMAL)) {
            $readingSize = $contentLength - $sentSize;
            $readingSize = min($readingSize, 512 * 1024);
            if($readingSize <= 0) break;

            $data = fread($fp, $readingSize);
            if(!$data) break;
            $sentSize += strlen($data);
            echo $data;
            flush();
        }

        fclose($fp);
        return true;
    }
    else {
        header('HTTP/1.1 404 Not Found', true, 404);
        return false;
    }
}
Mellow answered 6/4, 2015 at 9:56 Comment(0)
C
0

If you are able to use the Fileinfo extension (a PECL extension before PHP 5.3, bundled with PHP since then), you can simply use its functions to determine the content type and then send the proper headers.
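
A minimal sketch of what that could look like. The wrapper detectMimeType() and its fallback value are assumptions; finfo_open(), finfo_file() and FILEINFO_MIME_TYPE come from the Fileinfo extension.

```php
<?php
// Hypothetical helper: detects the MIME type from the file's contents with
// the Fileinfo extension, falling back to a generic type.
function detectMimeType(string $path): string
{
    $fallback = 'application/octet-stream';

    if (!function_exists('finfo_open') || !is_file($path)) {
        return $fallback;
    }

    $finfo = finfo_open(FILEINFO_MIME_TYPE); // MIME type only, without the charset
    if ($finfo === false) {
        return $fallback;
    }

    $type = finfo_file($finfo, $path);
    // The handle is released automatically once $finfo goes out of scope.

    return ($type !== false && $type !== '') ? $type : $fallback;
}

// Usage (commented out to keep the sketch free of side effects):
// header('Content-Type: ' . detectMimeType($path));
```

Because detection reads the file contents, it is slower than an extension lookup, so it probably fits best as the default branch of a lookup table rather than as a replacement for it.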

Chyle answered 23/9, 2010 at 18:30 Comment(1)
/bump, have you mentioned this possibility? :) - Chyle
D
0

The PHP Download function mentioned here was causing some delay before the file actually started to download. I don't know if this was caused by using Varnish cache or what, but for me it helped to remove the sleep(1); completely and set $speed to 1024. Now it works without any problem and is fast as hell. Maybe you could modify that function too, because I saw it used all over the internet.

Daysidayspring answered 12/12, 2013 at 10:3 Comment(0)
S
0

I coded a very simple function to serve files with PHP and automatic MIME type detection:

function serve_file($filepath, $new_filename=null) {
    $filename = basename($filepath);
    if (!$new_filename) {
        $new_filename = $filename;
    }
    $mime_type = mime_content_type($filepath);
    header('Content-type: '.$mime_type);
    header('Content-Disposition: attachment; filename="'.$new_filename.'"');
    readfile($filepath);
}

Usage

serve_file("/no_apache/invoice243.pdf");
Sunbathe answered 28/2, 2020 at 16:33 Comment(0)
