How do I avoid pretty-printing HTML in Nokogiri while using to_html?

I am using Nokogiri with Ruby on Rails v2.3.8.

Is there a way in which I can avoid pretty-printing in Nokogiri while using to_html?

I read that to_xml allows this to be done using to_xml(:indent => 0), but this doesn't work with to_html.

Right now I am using gsub to strip away new-line characters. Does Nokogiri provide any option to do it?
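For reference, the gsub workaround described above might look roughly like the sketch below. The helper name strip_newlines is hypothetical (it is not part of Nokogiri), and the input string is simply an example of the kind of output to_html can produce.

```ruby
# Hypothetical helper (not a Nokogiri method): removes every newline from
# serialized HTML. Caveat: it also removes newlines that belong to the
# content itself, for example inside <pre> blocks.
def strip_newlines(html)
  html.gsub(/\r?\n/, '')
end

# Example of serializer output containing an inserted newline.
pretty = "<ul><li>\n<span>hello</span> boom!</li></ul>"
compact = strip_newlines(pretty)
puts compact
```

This works on the string after serialization, so it cannot tell inserted newlines apart from meaningful ones, which is why a serializer-level option would be preferable.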

Simarouba answered 22/1, 2013 at 13:49 Comment(2)
Maybe you want to use HTML.fragment()? See [this question][1]. [1]: #4723844 - Vital
See my proper answer. Just load your HTML into an HTML fragment instead of an HTML document and to_html will not add formatting. - Vital

I solved this using .to_html(save_with: 0):

2.1.0 :001 > require 'nokogiri'
 => true
2.1.0 :002 >  doc = Nokogiri::HTML.fragment('<ul><li><span>hello</span> boom!</li></ul>')
 => #<Nokogiri::HTML::DocumentFragment:0x4e4cbd2 name="#document-fragment" children=[#<Nokogiri::XML::Element:0x4e4c97a name="ul" children=[#<Nokogiri::XML::Element:0x4e4c47a name="li" children=[#<Nokogiri::XML::Element:0x4e4c240 name="span" children=[#<Nokogiri::XML::Text:0x4e4c0a6 "hello">]>, #<Nokogiri::XML::Text:0x4e4c86c " boom!">]>]>]>
2.1.0 :003 > doc.to_html
 => "<ul><li>\n<span>hello</span> boom!</li></ul>"
2.1.0 :004 > doc.to_html(save_with: 0)
 => "<ul><li><span>hello</span> boom!</li></ul>"

tested on: nokogiri (1.6.5) + libxml2 2.7.6.dfsg-1ubuntu1 + ruby 2.1.0p0 (2013-12-25 revision 44422) [i686-linux]

Todo answered 6/3, 2015 at 21:3 Comment(0)

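You can use Nokogiri::HTML.fragment() instead of just Nokogiri::HTML(). Calling to_html on a fragment won't add a DOCTYPE header, and it shouldn't add newlines or make the output 'pretty' in any way, although the comments below report that some libxml2 versions still insert newlines.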

Vital answered 22/1, 2013 at 14:9 Comment(4)
Hi Hank, I tried to fragment the following HTML: "<ul><li><a href=\"en.wikipedia.org/wiki/Concept_inventory\" title=\"Concept inventory\">Concept inventory</a>, an assessment to reveal student thinking on a topic</li></ul>" This is what I got using to_html: "<ul><li>\n<a href=\"en.wikipedia.org/wiki/Concept_inventory\" title=\"Concept inventory\">Concept inventory</a>, an assessment to reveal student thinking on a topic</li></ul>" Am I doing something wrong here? - Simarouba
What version of libxml2 are you using? The \n might be a result of having an outdated libxml2 version. - Vital
My libxml2 version is 2.8.0+dfsg1-5ubuntu2.1. fragment() doesn't work for me. Are there any other ways to solve the problem? Newlines are still added. - Barbel
This doesn't work on nokogiri (1.6.5) + libxml2 2.7.6.dfsg-1ubuntu1 + ruby 2.1.0p0 (2013-12-25 revision 44422) [i686-linux]. See my answer below. - Todo
