Faster alternative to file_get_contents()

Currently I'm using file_get_contents() to submit GET data to an array of sites, but upon execution of the page I get this error:

Fatal error: Maximum execution time of 30 seconds exceeded

All I really want the script to do is start loading the webpage, and then leave. Each webpage may take up to 5 minutes to load fully, and I don't need it to load fully.

Here is what I currently have:

        foreach($sites as $s) //Create one line to read from a wide array
        {
                file_get_contents($s['url']); // Send to the shells
        }

EDIT: To clear up any confusion, this script is being used to start scripts on other servers that return no data.

EDIT: I'm now attempting to use cURL to do the trick, by setting a timeout of one second to make it send the data and then stop. Here is my code:

        $ch = curl_init($s['url']); //load the urls
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 1); //Only send the data, don't wait.
        curl_exec($ch); //Execute
        curl_close($ch); //Close it off.

Perhaps I've set the option wrong. I'm looking through some manuals as we speak. Just giving you an update. Thank you, all of you who are helping me so far.

EDIT: Ah, found the problem. I was using CURLOPT_CONNECTTIMEOUT instead of CURLOPT_TIMEOUT. Whoops.

However, now the scripts aren't triggering. They each use ignore_user_abort(TRUE), so I can't understand the problem.

Hah, scratch that. Works now. Thanks a lot, everyone.
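
For reference, the working loop presumably ends up looking something like this (CURLOPT_TIMEOUT in place of CURLOPT_CONNECTTIMEOUT; CURLOPT_RETURNTRANSFER is an addition here so any response isn't echoed):

        foreach($sites as $s)
        {
                $ch = curl_init($s['url']);
                curl_setopt($ch, CURLOPT_TIMEOUT, 1); // give up after 1 second; the remote script keeps running
                curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // addition: swallow any output instead of echoing it
                curl_exec($ch);
                curl_close($ch);
        }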

Humeral answered 18/4, 2010 at 16:17 Comment(4)
No, I have no experience with cURL. I wanted to do it with something I have at least a little experience with. Do you think I should scrap this with PHP and go with cURL? – Humeral
What exactly does the webpage do? Do you just want it to start a script that runs by itself and returns no data? – Ziska
@Humeral It still needs to run, though, correct? Meaning that it needs to wait until the remote site is done sending data? Couldn't you just call the URL to trigger a remote scripting action and then discard any data it sends? – Ganef
@Pekka: It doesn't need to wait for it to finish. All it has to do is send the GET data and then it can close the connection. – Humeral

There are many ways to solve this.

You could use cURL with its curl_multi_* functions to execute the requests asynchronously. Or use cURL the usual way but with a 1-second timeout, so the call will return with a timeout error while the remote request still gets executed.
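
A rough, untested sketch of the curl_multi approach (assuming the $sites array from the question):

    $mh = curl_multi_init();
    $handles = array();
    foreach ($sites as $s) {
        $ch = curl_init($s['url']);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // collect (and discard) any response
        curl_setopt($ch, CURLOPT_TIMEOUT, 1);           // don't wait on slow pages
        curl_multi_add_handle($mh, $ch);
        $handles[] = $ch;
    }

    // Drive all requests in parallel until they finish or time out.
    $running = null;
    do {
        curl_multi_exec($mh, $running);
        curl_multi_select($mh); // sleep until there is activity, to avoid busy-waiting
    } while ($running > 0);

    foreach ($handles as $ch) {
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);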

If you don't have cURL installed, you could keep using file_get_contents but fork processes (not so cool, but it works) with something like ZendX_Console_Process_Unix, so you avoid the wait between requests.
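
For the forking route without an extra library, plain pcntl_fork can do the same job; a rough sketch (assumes the pcntl extension and a CLI context):

    foreach ($sites as $s) {
        $pid = pcntl_fork();
        if ($pid == -1) {
            die('Could not fork');
        } elseif ($pid == 0) {
            // Child process: fire the request, then exit.
            file_get_contents($s['url']);
            exit(0);
        }
        // Parent process: move straight on to the next site.
    }

    // Wait for the children; they run in parallel, so the total time
    // is roughly that of the slowest request.
    while (pcntl_wait($status) > 0);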

Leander answered 18/4, 2010 at 18:02 Comment(3)
Yep, let me look into this and play with it for a few minutes. – Humeral
Tried this: $ch = curl_init($s['url']); curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 1); curl_exec($ch); curl_close($ch); It's still loading all of them. – Humeral
Sorry, I don't have time to test it. You may want to try the other methods. – Leander

Re your update that you only need to trigger the operation:

You could try using file_get_contents with a timeout. This would lead to the remote script being called, but the connection being terminated after n seconds (e.g. 1).

If the remote script is configured so it continues to run even if the connection is aborted (in PHP that would be ignore_user_abort), it should work.
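
A minimal sketch of the remote end (ignore_user_abort comes from the question; set_time_limit(0) is an added assumption so the work isn't killed after 30 seconds):

    <?php
    ignore_user_abort(true); // keep running even after the caller disconnects
    set_time_limit(0);       // assumption: allow the long-running work to finish

    // ... the long-running work goes here ...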

Try it out. If it doesn't work, you won't get around increasing your time_limit or using an external executable. But from what you're saying (you just need to make the request), this should work. You could even try setting the timeout to 0, but I wouldn't trust that.

From here:

<?php
$ctx = stream_context_create(array(
    'http' => array(
        'timeout' => 1 // give up on the connection after 1 second
    )
));
file_get_contents("http://example.com/", false, $ctx);
?>

To be fair, Chris's answer already includes this possibility: curl also has a timeout switch.

Ganef answered 18/4, 2010 at 16:20 Comment(1)
Well, I know why the download takes so long: it's the pages I'm loading; they take between 30 seconds and 5 minutes to load fully. – Humeral

As Franco mentioned (and I'm not sure it was picked up on), you specifically want to use the curl_multi functions, not the regular curl ones. They pack multiple curl handles into a curl_multi handle and execute them simultaneously, returning (or not, in your case) the responses as they arrive.

Example at http://php.net/curl_multi_init

Delphadelphi answered 20/3, 2011 at 12:56 Comment(0)

It is not file_get_contents() that consumes all that time but the network connection itself.
Consider not submitting GET data to an array of sites; instead, create an RSS feed and let them fetch the RSS data.

Duiker answered 18/4, 2010 at 16:23 Comment(1)
+1, sanest approach if a feed is available. But that still leaves the feed-less sites where he continues to block, which curl would fix. – Handiness

I don't fully understand the purpose of your script. But here is what you can do:

  1. To quickly avoid the fatal error, you can just add set_time_limit(120) at the beginning of the file. This will allow the script to run for 2 minutes. Of course, you can use any number you want, and 0 for no limit.
  2. If you just need to call the URL and don't "care" about the result, you should use cURL in asynchronous mode (the curl_multi functions). That way no call to the URL will wait until it finishes, and you can fire them all off very quickly.

BR.

Corrinecorrinne answered 18/4, 2010 at 19:13 Comment(0)

If the remote pages take up to 5 minutes to load, your file_get_contents will sit and wait for those 5 minutes. Is there any way you could modify the remote scripts to fork into a background process and do the heavy processing there? That way your initial hit will return almost immediately and not have to wait for the startup period.
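
One common way to do that hand-off (a sketch; heavy_job.php is a hypothetical worker script, and the redirection assumes a Unix-like host):

    <?php
    // trigger.php: starts the worker in the background and returns immediately.
    exec('php heavy_job.php > /dev/null 2>&1 &');
    echo 'started';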

Another possibility is to investigate whether a HEAD request would do the trick. HEAD does not return any data, just headers, so it may be enough to trigger the remote jobs without waiting for the full output.
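
With cURL, a HEAD request is just a matter of setting CURLOPT_NOBODY (a sketch, reusing $s['url'] from the question):

    $ch = curl_init($s['url']);
    curl_setopt($ch, CURLOPT_NOBODY, true); // send HEAD instead of GET
    curl_setopt($ch, CURLOPT_TIMEOUT, 1);   // still cap the wait, just in case
    curl_exec($ch);
    curl_close($ch);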

Lampe answered 18/4, 2010 at 20:19 Comment(1)
I like the HEAD idea; might be worth a try. – Angle
