I have a set of HTML pages. I want to extract all table nodes whose "border" attribute is 1. Here is an example:
<table border="1" cellspacing="0" cellpadding="5">
<tbody><tr><td>
<table border="0" cellpadding="2" cellspacing="0">
<tbody><tr>
<td bgcolor="#ff9999"><strong><font size="+1">CASEID</font></strong></td>
</tr></tbody>
</table>
</td></tr></tbody>
</table>
In the example, I want to select the table node where border = 1 but not the tables where border = 0. I am using html_nodes() from rvest, but I can't figure out how to filter on attributes:
html_nodes(x, "table")
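One direction that may work, assuming html_nodes() accepts standard CSS attribute selectors (and XPath through its xpath argument), is to put the attribute test into the selector itself. This is only a sketch: `page`, `outer_tables`, and `outer_tables_xpath` are names made up for illustration, and `page` is just the example above pasted in as a string with its closing tags balanced.

```r
library(rvest)

# Hypothetical stand-in for one of the pages: the example table as a string.
page <- '<table border="1" cellspacing="0" cellpadding="5">
<tbody><tr><td>
<table border="0" cellpadding="2" cellspacing="0">
<tbody><tr>
<td bgcolor="#ff9999"><strong><font size="+1">CASEID</font></strong></td>
</tr></tbody>
</table>
</td></tr></tbody>
</table>'

x <- read_html(page)

# CSS attribute selector: <table> elements whose border attribute equals "1".
outer_tables <- html_nodes(x, "table[border='1']")

# The same condition written as an XPath expression.
outer_tables_xpath <- html_nodes(x, xpath = "//table[@border='1']")

# Inspect which border values were matched.
html_attr(outer_tables, "border")
```

The selected node still contains the nested border="0" table as a descendant; the selector only controls which table elements are returned at the top level of the result.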