How to remove HTML markup from string

7

27

Let's say I have:

@string = 'it is a <a href="#">string</a>'

I want to use it in different parts of my application in two ways:

  • With a clickable link
  • Without the clickable link (but not showing any HTML markup)

The first one can be done using html_safe:

@string.html_safe

It is a string

How can I achieve the second one?

It is a string.

Theona answered 6/3, 2013 at 15:45 Comment(1)
Possible duplicate: #7414767 - Nonu
L
54

You can try this:

ActionView::Base.full_sanitizer.sanitize(@string)

See strip_tags(html).

Lenten answered 6/3, 2013 at 15:53 Comment(1)
Please note that this also adds HTML markup of its own: it escapes entities, converting & to &amp;. - Cocke
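Following up on the escaping mentioned in the comment above: if the stripped text is headed somewhere that is not HTML (a plain-text email, a log line), one option is to unescape the entities afterwards. This is only a minimal sketch using Ruby's standard library. The helper name `unescape_entities` is hypothetical, and the sample string is a made-up stand-in for sanitizer output, not something captured from Rails.

```ruby
require "cgi"

# Hypothetical helper: turn escaped entities such as &amp; and &lt;
# back into the characters they stand for.
def unescape_entities(text)
  CGI.unescapeHTML(text)
end

# Made-up sample standing in for text that has already been sanitized.
puts unescape_entities("Tom &amp; Jerry &lt;3")
```

Only do this for non-HTML destinations. Text that has been unescaped this way should not be marked html_safe and rendered in a view.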
D
11

You can try this:

strip_tags(@string)
Doubtful answered 18/7, 2016 at 9:12 Comment(0)
W
7

For general-purpose use (e.g. web scraper):

require "rails-html-sanitizer" # needed outside a Rails app

puts Rails::Html::FullSanitizer.new.sanitize("<div>Hello</div><br>")
# Hello
Wardwarde answered 22/3, 2017 at 11:5 Comment(0)
O
3

You can use Nokogiri to do the same.

This SO post tells the story.

In short, the approach relies on XPath's starts-with function. First, parse the string:

require 'nokogiri'

item = Nokogiri::HTML('<a href="#">string</a>')
puts item.to_html

The above prints the parsed HTML. Then you can use XPath to replace every link whose href does not start with "http://" with its text content:

item.search('//a[not(starts-with(@href, "http://"))]').each do |a|
  a.replace(a.content)
end
puts item.to_html
Ordinand answered 6/3, 2013 at 16:0 Comment(0)
N
1

In Rails, see also the strip_tags method. http://api.rubyonrails.org/classes/ActionView/Helpers/SanitizeHelper.html#method-i-strip_tags

Negrito answered 19/3, 2015 at 19:17 Comment(0)
D
0

Rails provides a helper called strip_links which, judging by its name, seems to do what you want.

According to its APIDock page it is a bit limited. To make it available on any string, you could extend the String class:

class String
  def strip_links
    ActionController::Base.helpers.strip_links(self)
  end
end

So you can use:

@string.strip_links
Drouin answered 6/3, 2013 at 15:54 Comment(2)
strip_links gives an error if the string has no HTML markup. Extending String as shown avoids the error, but it does not work for some tags, such as <em>. Thanks anyway. - Theona
Oh... I assumed you always have a link in your string. I guess the sanitize method removes all HTML (it is in the same helper module). - Drouin
K
0

Inspired by the answers above, I defined this function in my project:

def delete_html_markup(data)
  return data if data.blank?

  if data.is_a?(Array)
    # Clean each element and return a new array
    data.map { |s| delete_html_markup(s) }
  elsif data.is_a?(Hash)
    # Clean each value in place; returns the same (modified) hash
    data.each do |k, v|
      data[k] = delete_html_markup(v)
    end
  else
    ActionView::Base.full_sanitizer.sanitize(data)
  end
end
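
Note that the Hash branch above writes the cleaned values back into the hash it was given. If that is not wanted, a non-mutating variant might look like the sketch below. The names `strip_markup_deep`, `NAIVE_STRIPPER` and the `stripper` parameter are hypothetical. The default stripper is a naive regex stand-in, not a real sanitizer, and is there only so the sketch runs without Rails. In a Rails app you would probably pass something like `ActionView::Base.full_sanitizer.method(:sanitize)` instead.

```ruby
# Naive stand-in for a real sanitizer: removes anything that looks like a tag.
# Assumption for illustration only; it does not deal with comments, scripts,
# or malformed markup the way a real sanitizer does.
NAIVE_STRIPPER = ->(text) { text.gsub(/<[^>]*>/, "") }

# Hypothetical helper: returns a cleaned copy and leaves the input untouched.
def strip_markup_deep(data, stripper = NAIVE_STRIPPER)
  case data
  when Array  then data.map { |item| strip_markup_deep(item, stripper) }
  when Hash   then data.transform_values { |value| strip_markup_deep(value, stripper) }
  when String then stripper.call(data)
  else data # nil, numbers, symbols and so on pass through unchanged
  end
end

original = { title: "<b>Hi</b>", tags: ["<em>a</em>", "b"], count: 3 }
cleaned  = strip_markup_deep(original)

p cleaned           # cleaned copy
p original[:title]  # the input still contains its markup
```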
Kandrakandy answered 17/12, 2021 at 2:23 Comment(1)
This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From Review - Sagitta
