How to parse an HTML table into an array with Symfony DomCrawler
I have an HTML table and I want to build an array from it:

$html = '<table>
<tr>
    <td>satu</td>
    <td>dua</td>
</tr>
<tr>
    <td>tiga</td>
    <td>empat</td>
</tr>
</table>

The array must look like this:

array(
   array(
      "satu",
      "dua",
   ),
   array(
     "tiga",
     "empat",
   )
)

I tried the code below, but could not get the array I need:

$crawler = new Crawler();
$crawler->addHTMLContent($html);
$row = array();
$tr_elements = $crawler->filterXPath('//table/tr');
foreach ($tr_elements as $tr) {
 // ???????
}
Dandelion answered 28/6, 2016 at 1:25 Comment(4)
Have you checked this link, which has the complete details? symfony.com/doc/current/components/dom_crawler.html - Androsterone
Yes, I have. I just can't understand how the crawler works inside foreach. - Dandelion
The HTML in your first code block is missing a closing single quote. Typo? - Intertidal
No, it's just an example. - Dandelion
$table = $crawler->filter('table')->filter('tr')->each(function ($tr, $i) {
    return $tr->filter('td')->each(function ($td, $i) {
        return trim($td->text());
    });
});

print_r($table);

The example above returns a two-dimensional array: the outer level holds the table rows ("tr") and the inner level holds the cells ("td") of each row.
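For reference, a self-contained version of that snippet, run against the table from the question, might look like the sketch below. It assumes symfony/dom-crawler was installed with Composer and that vendor/autoload.php sits next to the script; the function name tableToArray is hypothetical. It uses filterXPath() instead of filter() so that symfony/css-selector is not needed; filter('tr') and filter('td') should behave the same way when that package is installed.

```php
<?php
// Assumption: symfony/dom-crawler was installed with Composer.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

/**
 * Hypothetical helper: returns one array per <tr>,
 * each holding the trimmed text of that row's <td> cells.
 */
function tableToArray(string $html): array
{
    $crawler = new Crawler($html);

    // '//table//tr' also matches rows that end up wrapped in a <tbody>.
    return $crawler->filterXPath('//table//tr')->each(function (Crawler $tr) {
        // each() hands every node over wrapped in its own Crawler,
        // so this XPath is evaluated relative to the current row.
        return $tr->filterXPath('//td')->each(function (Crawler $td) {
            return trim($td->text());
        });
    });
}

$html = '<table>
<tr>
    <td>satu</td>
    <td>dua</td>
</tr>
<tr>
    <td>tiga</td>
    <td>empat</td>
</tr>
</table>';

print_r(tableToArray($html));
```

For the table above, this should give two rows, the first holding "satu" and "dua" and the second holding "tiga" and "empat".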

EDIT

If you have nested tables, the code below flattens them into a one-dimensional array keyed by each cell's node path. Cells that contain a nested table, and empty cells, are skipped.

$html = 'MY HTML HERE';
$crawler = new Crawler($html);

$flat = function(string $selector) use ($crawler) {
    $result = [];
    $crawler->filter($selector)->each(function ($table, $i) use (&$result) {
        $table->filter('tr')->each(function ($tr, $i) use (&$result) {
            $tr->filter('td')->each(function ($td, $i) use (&$result) {
                $html = trim($td->html());
                if (strpos($html, '<table') !== FALSE) return;

                $iterator = $td->getIterator()->getArrayCopy()[0];
                $address = $iterator->getNodePath();

                if (!empty($html)) $result[$address] = $html;
            });
        });
    });
    return $result;
};

// The selector must point to the outermost table.
print_r($flat('#Prod fieldset div table'));
Bobodioulasso answered 23/4, 2017 at 3:37 Comment(1)
This is much better (y) - Datary
$html = '<table>
    <tr>
        <td>satu</td>
        <td>dua</td>
    </tr>
    <tr>
        <td>tiga</td>
        <td>empat</td>
    </tr>
</table>';

$crawler = new Crawler();
$crawler->addHTMLContent($html);
$rows = array();
$tr_elements = $crawler->filterXPath('//table/tr');
// iterate over the filter results; each $tr is a plain DOM node, not a Crawler
foreach ($tr_elements as $tr) {
    $tds = array();
    // wrap the row in its own Crawler so that it can be filtered
    $rowCrawler = new Crawler($tr);
    // iterate over the cells of this row
    foreach ($rowCrawler->filter('td') as $td) {
        // extract the cell text
        $tds[] = $td->nodeValue;
    }
    $rows[] = $tds;
}
var_dump($rows);
exit;

This will display:

array 
  0 => 
    array 
      0 => string 'satu' 
      1 => string 'dua' 
  1 => 
    array (size=2)
      0 => string 'tiga' 
      1 => string 'empat'
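
The loop above creates a new Crawler for every row because a plain foreach over a Crawler yields raw DOM nodes (DOMElement objects) rather than Crawler instances. As a sketch of an alternative, the same result can be built with the DOM API directly, without the extra wrapper. This assumes symfony/dom-crawler is installed with Composer and that vendor/autoload.php sits next to the script; the function name rowsViaForeach is hypothetical.

```php
<?php
// Assumption: symfony/dom-crawler was installed with Composer.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

/**
 * Hypothetical helper: builds the rows with plain foreach loops.
 */
function rowsViaForeach(string $html): array
{
    $crawler = new Crawler($html);
    $rows = [];

    // Iterating a Crawler with foreach yields DOM elements, not Crawlers.
    foreach ($crawler->filterXPath('//table//tr') as $tr) {
        $cells = [];
        // getElementsByTagName() is part of PHP's DOM extension.
        foreach ($tr->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }

    return $rows;
}

var_dump(rowsViaForeach('<table><tr><td>satu</td><td>dua</td></tr><tr><td>tiga</td><td>empat</td></tr></table>'));
```

Note that getElementsByTagName() returns all descendant cells, so with nested tables a row would also pick up the cells of any inner table.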
Nazareth answered 28/6, 2016 at 10:38 Comment(2)
@wpacoder No problem, do you need additional explanations? FYI, it's not really good practice to say thanks on Stack Overflow (comments are not for that); an upvote is enough. - Nazareth
It's OK, I know where my mistake is. - Dandelion
