I just want the text out of there without those tags. Does Hpricot.XML have any methods for this?
How does one remove <![CDATA[ ]]> tags from around text in XML using Hpricot?
Use element.inner_text instead of #inner_html; it returns the text with the CDATA markers removed.
You probably will want a #inner_text.strip to get rid of the (almost guaranteed) extraneous whitespace. – Wheels
doc.search("*") do |element|
  element.swap element.content if element.kind_of? Hpricot::CData
end
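require 'rubygems'
require 'hpricot'
require 'open-uri'  # lets open() fetch a URL

doc = Hpricot::XML(open('http://www.cnn.com/.element/ssi/www/auto/2.0/video/xml/most_popular.xml'))
(doc/:cnn_video/:video).each do |status|
  ['tease_txt'].each do |el|
    # inner_text returns the CDATA content without the <![CDATA[ ]]> wrapper
    puts status.at(el).inner_text
  end
end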
Example output (looks spammy but this is not spam!):
New Reno air crash video shows impact
Teen catches 800-pound gator
Resuming careers post 'don't ask' repeal
Creepy skirt peepers
Bus-sized satellite to hit Earth thi ...
'DWTS' cast hits ballroom for first time
What caused trainer's death at SeaWorld?
What led to Troy Davis clemency denial?