Get href attribute of <a> tag in HTML table cells
I am trying to extract the href of a link from some HTML using PHP's DOMDocument.

The following returns the anchor text of the link, but I want the URL itself:

$events[$i]['race_1'] = trim($cols->item(1)->nodeValue); 

Here is more of the code if it helps.

   // initialize loop
   $i = 0;
   // new dom object
   $dom = new DOMDocument();

   // load the html
   $html = @$dom->loadHTMLFile($url);
   // whitespace setting (note: true keeps whitespace, it does not discard it)
   $dom->preserveWhiteSpace = true;

   // the table by its tag name
   $information = $dom->getElementsByTagName('table');
   $rows = $information->item(4)->getElementsByTagName('tr');

   foreach ($rows as $row)
   {
       $cols = $row->getElementsByTagName('td');
       $events[$i]['title'] = trim($cols->item(0)->nodeValue);
       $events[$i]['race_1'] = trim($cols->item(1)->nodeValue);
       $events[$i]['race_2'] = trim($cols->item(2)->nodeValue);
       $events[$i]['race_3'] = trim($cols->item(3)->nodeValue);
       // date cell is split on "/" into month and day
       $date = explode('/', trim($cols->item(4)->nodeValue));
       $events[$i]['month'] = $date[0];
       $events[$i]['day'] = $date[1];
       // location cell is split on "," into city and state
       $citystate = explode(',', trim($cols->item(5)->nodeValue));
       $events[$i]['city'] = $citystate[0];
       $events[$i]['state'] = $citystate[1];
       $i++;
   }
   print_r($events);

Here are the contents of the TD tag:

<td width="12%" align="center" height="13"><!--mstheme--><font face="Arial"><span lang="en-us"><b>
          <font style="font-size: 9pt;" face="Verdana">
          <a linkindex="18" target="_blank" href="results2010/brmc5k10.htm">Overall</a>    

Pyrrhonism answered 12/7, 2011 at 15:53 Comment(0)

Update: I see the issue. You need to get the list of <a> elements inside the td.

$cols = $row->getElementsByTagName('td');
// $cols->item(1) is a td DOMElement, so have to find anchors in the td element
// then get the first (only) anchor's href attribute
// (chaining looks long, might want to refactor/check for nulls)
$events[$i]['race_1'] = trim($cols->item(1)->getElementsByTagName('a')->item(0)->getAttribute('href'));
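The chained call above fails with a fatal error when a cell contains no link, because item(0) then returns null. A minimal sketch of a null-safe variant follows. The helper name `firstHrefInCell` and the sample markup are hypothetical, made up for illustration, and the nullable type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not part of the original code): returns the href of
// the first <a> inside a cell, or null if the cell or the link is missing.
function firstHrefInCell(?DOMNode $cell): ?string
{
    if (!$cell instanceof DOMElement) {
        return null; // item() returned null or a non-element node
    }
    $anchor = $cell->getElementsByTagName('a')->item(0);
    if (!$anchor instanceof DOMElement || !$anchor->hasAttribute('href')) {
        return null; // no usable link in this cell
    }
    return trim($anchor->getAttribute('href'));
}

// Sample markup, assumed for illustration only.
$html = '<table><tr><td>Race</td>'
      . '<td><b><a target="_blank" href="results2010/brmc5k10.htm">Overall</a></b></td>'
      . '<td>no link here</td></tr></table>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$cols = $dom->getElementsByTagName('td');

$withLink    = firstHrefInCell($cols->item(1)); // cell containing a link
$withoutLink = firstHrefInCell($cols->item(2)); // cell without a link
$missingCell = firstHrefInCell($cols->item(9)); // index past the last cell
```

Inside the loop from the question, this could be used as `$events[$i]['race_1'] = firstHrefInCell($cols->item(1));`, leaving null for rows that have no link in that column.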

You should be able to call getAttribute() on the item, provided it is a DOMElement; you can verify that its nodeType is XML_ELEMENT_NODE. Note that getAttribute() returns an empty string when the element has no attribute with that name.

<?php
// ...
$events[$i]['race_1'] = trim($cols->item(1)->getAttribute('href'));
// ...   
?>
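A small sketch of what this snippet does with the cell from the question, using a trimmed-down, assumed copy of that markup: the td is an element node, but the href sits on the nested <a>, so asking the td for it gives an empty string.

```php
<?php
// Trimmed-down version of the cell from the question (assumed markup).
$html = '<table><tr><td width="12%" align="center"><font face="Arial"><b>'
      . '<a target="_blank" href="results2010/brmc5k10.htm">Overall</a>'
      . '</b></font></td></tr></table>';

$dom = new DOMDocument();
@$dom->loadHTML($html); // suppress parser warnings from legacy markup, if any

$td = $dom->getElementsByTagName('td')->item(0);

// The cell itself is an element node...
$isElement = ($td->nodeType === XML_ELEMENT_NODE);
// ...but it has no href attribute of its own, so this is an empty string.
$tdHref = $td->getAttribute('href');
// nodeValue is the text content of the cell, i.e. the anchor text.
$text = trim($td->nodeValue);
// The href lives on the nested <a> element.
$aHref = $td->getElementsByTagName('a')->item(0)->getAttribute('href');

var_dump($isElement, $tdHref, $text, $aHref);
```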

See related: DOMNode to DOMElement in php

Hasid answered 12/7, 2011 at 16:4 Comment(2)
This returns an empty string, but I don't understand why. I have added the contents of the TD tag above in case it helps. - Pyrrhonism
It's probably because $cols->item(1) isn't a DOMElement. Do a var_dump($cols->item(1)->nodeType == XML_ELEMENT_NODE); to see if it is. - Hasid
