How do I get the input value from a Nokogiri::XML::NodeSet?

I am looking for my input element using Nokogiri's xpath method. It's returning an object of class Nokogiri::XML::NodeSet:

[#<Nokogiri::XML::Element:0x3fcc0e07de14 name="input" attributes=[#<Nokogiri::XML::Attr:0x3fcc0e07dba8 name="type" value="text">, #<Nokogiri::XML::Attr:0x3fcc0e07db94 name="name" value="creditInstallmentAmount">, #<Nokogiri::XML::Attr:0x3fcc0e07db44 name="style" value="width:240px">, #<Nokogiri::XML::Attr:0x3fcc0e07dae0 name="value" value="94.8">, #<Nokogiri::XML::Attr:0x3fcc0e07da18 name="readonly" value="true">]>

Is there a faster and cleaner way to get the value of the input than converting the node to a string with to_s:

"<input type=\"text\" name=\"creditInstallmentAmount\" style=\"width:240px\" value=\"94.8\" readonly>"

and then matching it with a regular expression?

Krimmer answered 15/6, 2012 at 13:31 Comment(1)
If you add the xpath expression and a bit more of the XML/HTML, we may be able to help.Hollinger
V
19

A couple things will help:

Nokogiri has the at method, which is the equivalent of search(...).first, and, instead of returning a NodeSet, it returns the Node itself, making it easy to grab values from it:

require 'nokogiri'

doc = Nokogiri::HTML('<input type="text" name="creditInstallmentAmount" style="width:240px" value="94.8" readonly>')
doc.at('input')['value'] # => "94.8"
doc.at('input')['value'].to_f # => 94.8

Also, notice I'm using CSS notation instead of XPath. Nokogiri supports both, and CSS is often more obvious and easier to read. The at method accepts either kind of selector; at_css and at_xpath are the explicit CSS-only and XPath-only variants.

Note that Nokogiri uses a little test in search and at to try to determine whether the selector is CSS or XPath, and then branches accordingly to the specific method. The test can be fooled, at which point you should use the specific CSS or XPath variant, or always use them if you're paranoid. In years of using Nokogiri I've only once encountered the situation where the code was confused.

If you want to be more explicit about which input you want, you can match on the tag's attributes:

doc.at('input[name="creditInstallmentAmount"]')['value'] # => "94.8"

Get familiar with the difference between search and at and their variants, and Nokogiri will really become useful to you. Learn how to access a node's attributes and text() nodes and you'll know 99% of what you need to know for parsing HTML and XML.

Vise answered 16/6, 2012 at 3:13 Comment(2)
at and at_css are actually different methods. You can pass xpath to at but not at_css for example.Guidance
Correct. Nokogiri makes a little test in search and at to try to determine whether the selector is CSS or XPath, and then branches accordingly. The CSS and XPath specific methods don't do that, they assume we're aware of what we're doing.Vise
K
0

Ok, I found the answer:

.map{|node| node["value"]}.first
Krimmer answered 15/6, 2012 at 13:36 Comment(2)
Why extract the value attribute of all elements if you only need the first? Use .first["value"] instead.Barthelemy
If you found 10,000 matches, your code would process all of them just to toss all that work away. That's not very system friendly and in a production environment would cost money.Vise
L
0

Ok, this works for me

require 'nokogiri'
require 'open-uri'

html = URI.open(ARGV[0]) # on Ruby 3.0+, plain open no longer fetches URLs via open-uri

doc = Nokogiri::HTML(html)
inputs = doc.search 'input'
inputs.map{|node| node['name']}

or all in one

inputs = Nokogiri::HTML(html).search('input').map{|node| node['name']}
Lezlielg answered 5/3, 2018 at 17:35 Comment(0)
