Grab, cache and parse remote XML feed, validation checks in PHP

Currently, I'm grabbing a remote site's XML feed and saving a local copy on my server to be parsed in PHP.

The problem is: how do I add checks in PHP to see whether the feed.xml file is valid and, if so, use it?

And if it is invalid (the remote XML feed sometimes returns a blank feed.xml), how do I serve a valid backup copy of feed.xml from a previous grab/save?

Code grabbing feed.xml:

<?php
/**
* Initialize the cURL session
*/
$ch = curl_init();
/**
* Set the URL of the page or file to download.
*/
curl_setopt($ch, CURLOPT_URL,
'http://domain.com/feed.xml');
/**
* Create a new file
*/
$fp = fopen('feed.xml', 'w');
/**
* Ask cURL to write the contents to a file
*/
curl_setopt($ch, CURLOPT_FILE, $fp);
/**
* Execute the cURL session
*/
curl_exec ($ch);
/**
* Close cURL session and file
*/
curl_close ($ch);
fclose($fp);
?>

So far I only have this to load it:

$xml = @simplexml_load_file('feed.xml') or die("feed not loading");

Thanks

England answered 14/2, 2010 at 17:49 Comment(0)

If it isn't essential that cURL writes directly to the file, you can check the XML before overwriting your local feed.xml:

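<?php
/**
* Initialize the cURL session
*/
$ch = curl_init();
/**
* Set the URL of the page or file to download and
* return the response as a string instead of printing it
*/
curl_setopt($ch, CURLOPT_URL, 'http://domain.com/feed.xml');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$xml = curl_exec($ch);
curl_close($ch);
/**
* curl_exec() returns false on failure; compare the parse result with
* false as well, because an empty but valid root element is falsy
*/
if ($xml !== false && @simplexml_load_string($xml) !== false) {
    /**
    * Overwrite the local copy only when the new feed parses
    */
    $fp = fopen('feed.xml', 'w');
    fwrite($fp, $xml);
    fclose($fp);
}

?>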
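
As an alternative to the @ operator, the check can be pulled into a small helper that lets libxml collect parse errors internally. This is only a sketch: the function name is_well_formed_xml is made up for illustration, and it tests well-formedness only, not validity against any schema.

```php
<?php
/**
 * Return true when $xml parses as well-formed XML.
 * Hypothetical helper; the name is an assumption, not part of the answer.
 */
function is_well_formed_xml(string $xml): bool
{
    // A blank response can never be a usable feed
    if (trim($xml) === '') {
        return false;
    }

    // Collect parser errors internally instead of emitting warnings
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Compare with false: an empty but valid root element is falsy
    return $doc !== false;
}
```

With this in place, the condition above could become if ($xml !== false && is_well_formed_xml($xml)).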
Paapanen answered 14/2, 2010 at 18:27 Comment(1)
Hi, revisiting this code again: it seems I'm not able to pull the remote XML and save it locally, whereas the code I posted in the question works but the saved XML file is cut off abruptly. Any ideas? - England

How about this? No need to use curl if you just need to retrieve a document.

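// Suppress warnings; failure is reported through the false return value
$feed = @simplexml_load_file('http://domain.com/feed.xml');

// Compare with false: an empty but valid root element is falsy
if ($feed !== false)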
{
    // $feed is valid, save it
    $feed->asXML('feed.xml');
}
elseif (file_exists('feed.xml'))
{
    // $feed is not valid, grab the last backup
    $feed = simplexml_load_file('feed.xml');
}
else
{
    die('No available feed');
}
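
The same save-or-fall-back logic can be sketched as a function that takes the already-fetched string, so it works with either cURL or any other download method. The name feed_or_cached and the '.tmp' suffix are assumptions made for this example. Writing to a temporary file and renaming it is one way to avoid leaving a half-written feed.xml behind if a write is interrupted.

```php
<?php
/**
 * Save $fetched to $cachePath when it parses; otherwise load the cached copy.
 * Returns the parsed feed, or null when neither source is usable.
 * Hypothetical helper for illustration only.
 */
function feed_or_cached($fetched, string $cachePath): ?SimpleXMLElement
{
    // Collect parser errors internally instead of emitting warnings
    $previous = libxml_use_internal_errors(true);

    // Treat a failed download (false) or a blank body as unusable
    $feed = (is_string($fetched) && trim($fetched) !== '')
        ? simplexml_load_string($fetched)
        : false;

    if ($feed !== false) {
        // Write to a temporary file first, then rename it over the cache
        $tmp = $cachePath . '.tmp';
        if (file_put_contents($tmp, $fetched, LOCK_EX) !== false) {
            rename($tmp, $cachePath);
        }
    } elseif (is_file($cachePath)) {
        // Fresh copy is unusable: fall back to the last good copy
        $feed = simplexml_load_file($cachePath);
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $feed !== false ? $feed : null;
}
```

A caller would pass the result of curl_exec() (with CURLOPT_RETURNTRANSFER set) or file_get_contents() as $fetched and handle a null return as "no available feed".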
Dovap answered 14/2, 2010 at 19:11 Comment(1)
Thanks Josh, definitely learning. Glad I signed up on this site :) - England

In a class I put together, I have a method that checks whether the remote file exists and responds in a timely manner:

/**
* Check to see if remote feed exists and responding in a timely manner
*/
private function remote_file_exists($url) {
  $ret = false;
  $ch = curl_init($url);

  curl_setopt($ch, CURLOPT_NOBODY, true); // check the connection; return no content
  curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 1); // timeout after 1 second
  curl_setopt($ch, CURLOPT_TIMEOUT, 2); // The maximum number of seconds to allow cURL functions to execute.
  curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.0; da; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11');

  // do request
  $result = curl_exec($ch);

  // if request is successful
  if ($result === true) {
    $statusCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($statusCode === 200) {
      $ret = true;
    }
  }
  curl_close($ch);

  return $ret;
}

The full class contains fall-back code to make sure we always have something to work with.

Blog post explaining the full class is here: http://weedygarden.net/2012/04/simple-feed-caching-with-php/

Code is here: https://github.com/erunyon/FeedCache

Inhalant answered 26/4, 2012 at 23:43 Comment(0)