Strip html from string Ruby on Rails
G

9

146

I'm working with Ruby on Rails. Is there a way to strip HTML from a string using sanitize or an equivalent method, and keep only the text inside the value attribute of an input tag?

Grizzle answered 14/9, 2011 at 9:45 Comment(0)
C
151

There's a strip_tags method in ActionView::Helpers::SanitizeHelper:

http://api.rubyonrails.org/classes/ActionView/Helpers/SanitizeHelper.html#method-i-strip_tags

Edit: to get the text inside the value attribute, you could use something like Nokogiri with an XPath expression to pull it out of the string.

Corissa answered 14/9, 2011 at 9:49 Comment(0)
S
209

If you want to use this in a model:

ActionView::Base.full_sanitizer.sanitize(html_string)

This is the same call the strip_tags helper makes internally.

Snipe answered 17/10, 2012 at 17:28 Comment(3)
This works, but referring to ActionView from the model is awkward. More cleanly, you can require 'html/sanitizer' and instantiate your own sanitizer with HTML::FullSanitizer.new. - Aluminate
@nhaldimann, require 'html/sanitizer' raises an error, so I have to use Rails::Html::FullSanitizer.new (edgeapi.rubyonrails.org/classes/HTML/…) - Khalsa
I'm using Rails::Html::FullSanitizer.new.sanitize(string) with Rails 7. - Lily
B
36
ActionView::Base.full_sanitizer.sanitize(html_string)

The full sanitizer removes every tag. To keep an allowed list of tags and attributes, use the safe-list sanitizer (called white_list_sanitizer in older Rails versions) as below:

ActionView::Base.safe_list_sanitizer.sanitize(html_string, tags: %w(img br p), attributes: %w(src style))

The statement above allows the tags img, br and p and the attributes src and style.

Blowsy answered 2/7, 2015 at 9:7 Comment(0)
P
35

Yes, call this: sanitize(html_string, tags:[])

Pollinosis answered 15/3, 2013 at 14:40 Comment(1)
Working well in 2024 with Rails 7, no extra requires/includes needed here. The tags: [] argument is important: it is what excludes all HTML tags. Otherwise you'll actually get raw HTML, and that will render in-page. - Harrietharriett
T
10

I've used the Loofah library, as it handles both HTML and XML (documents as well as string fragments). It is the engine behind the rails-html-sanitizer gem. I'm simply pasting its code example to show how simple it is to use.

Loofah Gem

require "loofah"

unsafe_html = "ohai! <div>div is safe</div> <script>but script is not</script>"

# :prune removes unsafe tags together with their contents
doc = Loofah.fragment(unsafe_html).scrub!(:prune)
doc.to_s    # => "ohai! <div>div is safe</div> "
doc.text    # => "ohai! div is safe "
Tudela answered 2/10, 2017 at 7:15 Comment(0)
I
2

How about this?

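# Sanitize every String attribute of every record, keeping only the tags in WHITELIST.
# Newer rails-html-sanitizer releases name this class Rails::Html::SafeListSanitizer.
white_list_sanitizer = Rails::Html::WhiteListSanitizer.new
WHITELIST = %w[p b h1 h2 h3 h4 h5 h6 li ul ol small i u].freeze

[Your, Models, Here].each do |klass|
  klass.find_each do |record|
    klass.attribute_names.each do |attr|
      value = record.send(attr)
      next unless value.is_a?(String)

      cleaned = white_list_sanitizer.sanitize(value, tags: WHITELIST, attributes: %w[id style])
      # Drop the empty paragraphs left behind after sanitizing.
      record.send("#{attr}=", cleaned.gsub(/<p>\s*<\/p>\r\n/im, ''))
    end
    record.save
  end
end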
Irishman answered 8/9, 2015 at 19:14 Comment(1)
There is also Rails::Html::FullSanitizer.new if you don't want to specify a whitelist. - Maddux
M
2

If you want to remove all HTML tags, you can use:

   html.gsub(/<[^>]*>/, '')
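
To show what the regular expression does and where it falls short, here is a small stdlib-only sketch. The method name strip_tags_naive is made up for this example.

```ruby
# Naive tag stripping with a regular expression (no gems required).
def strip_tags_naive(html)
  html.gsub(/<[^>]*>/, '')
end

simple   = strip_tags_naive('<p>Hello <b>world</b></p>')
entities = strip_tags_naive('<p>Tom &amp; Jerry</p>')
tricky   = strip_tags_naive('<a title="a > b">link</a>')

puts simple     # prints Hello world
puts entities   # prints Tom &amp; Jerry  (entities are left encoded)
puts tricky     # prints  b">link  (a ">" inside an attribute ends the match early)
```

For well-formed, simple markup this is fine. For untrusted or messy input, a parser-based sanitizer is the more robust choice.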
Murdoch answered 10/9, 2022 at 20:35 Comment(0)
M
0

This works for me in Rails 6.1.3:

.errors-description
  = sanitize(message, tags: %w[div span strong], attributes: %w[class])
Mirellamirelle answered 8/4, 2021 at 18:30 Comment(0)
G
0

If your HTML is coming from Action Text, you can call .to_plain_text on the ActionText::Content object:

content = ActionText::Content.new("<p>My HTML String</p>")
content.to_plain_text
# => "My HTML String"

https://www.rubydoc.info/github/rails/rails/ActionText%2FContent:to_plain_text

Greedy answered 4/12, 2022 at 13:58 Comment(2)
What is this? .to_plain_text isn't a thing in the core. - Kaslik
@JoshuaPinter You are correct. Just updated my answer. Thanks! - Greedy
