Deleting all special characters from a string - ruby
L

7

54

I was working through the challenges from pythonchallenge in Ruby, specifically this one. The page source contains a really long string full of special characters. I was trying to find a way to delete them, or to pick out only the alphabetical characters.

I tried the scan method, but I don't think I used it properly. I also tried delete! like this:

    a = "PAGE SOURCE CODE PASTED HERE"
    a.delete! "!", "@"  #and so on with special chars, does not work(?) 
    a

How can I do that?

Thanks

Locality answered 30/1, 2014 at 1:47 Comment(0)
J
154

You can do this

a.gsub!(/[^0-9A-Za-z]/, '')
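
One caveat, sketched here with made-up sample strings: `gsub!` edits the string in place and returns `nil` when nothing was replaced, whereas `gsub` always returns a new string and leaves the original alone.

```ruby
# Sample strings are made up for illustration.
a = "He!!o, W0rld?"

# gsub returns a cleaned copy; `a` itself is untouched.
cleaned = a.gsub(/[^0-9A-Za-z]/, '')

# gsub! modifies the receiver and returns nil if there was nothing to remove.
already_clean = "abc123"
result = already_clean.gsub!(/[^0-9A-Za-z]/, '')

puts cleaned          # HeoW0rld
puts result.inspect   # nil
```

So if you use the return value (rather than `a` itself), the non-bang `gsub` is usually the safer choice.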
Joeyjoffre answered 30/1, 2014 at 5:50 Comment(0)
U
21

Try gsub:

a.gsub!(/[!@%&"]/,'')

You can test the regexp on rubular.com.

If you want something more general, you can list the valid characters and remove everything that is not among them:

a.gsub!(/[^abcdefghijklmnopqrstuvwxyz ]/,'')
Unbelievable answered 30/1, 2014 at 1:58 Comment(1)
I think this [^A-Za-z ] works better, in this case. Otherwise, if you have a sentence, which typically should start with a capital letter, you will lose your capital letters. You would also lose any 1337 speak, or other possible crypts within the text. Case in point: phrase = "Joe can't tell between 'large' and large." => "Joe can't tell between 'large' and large." - Kerenkeresan
S
9

When you give multiple arguments to String#delete, only the intersection of those arguments is deleted. a.delete! "!", "@" deletes the intersection of the sets ! and @, which is empty, so nothing is deleted and the method returns nil.

What you wanted to do is a.delete! "!@" with the characters to delete passed as a single string.

Since the challenge is asking to clean up the mess and find a message in it, I would go with a whitelist instead of deleting special characters. The delete method accepts ranges with - and negations with ^ (similar to a regex) so you can do something like this: a.delete! "^A-Za-z ".
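
A small sketch of those rules, using made-up sample strings (the non-bang `delete` is used for most lines so the return values are easy to compare):

```ruby
a = "a!b@c"   # made-up sample

# Two arguments: only characters present in BOTH sets are deleted.
two_args = a.delete("!", "@")            # sets share nothing, so no change
overlap  = "hello".delete("l", "lo")     # only "l" is in both sets

# One argument listing every character to drop.
one_arg = a.delete("!@")

# Whitelist via negation and ranges: drop everything except letters and spaces.
letters = "Hi! 123 there?".delete("^A-Za-z ")

# The bang version returns nil when nothing was deleted.
bang = a.dup.delete!("!", "@")

p two_args  # "a!b@c"
p overlap   # "heo"
p one_arg   # "abc"
p letters   # "Hi  there"
p bang      # nil
```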

You could also use regular expressions as shown by @arieljuod.

Singer answered 30/1, 2014 at 2:21 Comment(0)
P
6

gsub is one of the most used Ruby methods in the wild.

specialname = 'Hello!#$@'  # single quotes: inside double quotes, #$@ would interpolate the global $@
cleanedname = specialname.gsub(/[^a-zA-Z0-9\-]/, '')
Pornocracy answered 30/1, 2014 at 5:58 Comment(1)
This still keeps the "-", btw. - Koah
K
5

I think a.gsub(/[^A-Za-z0-9 ]/, '') works better in this case. With a lowercase-only class such as [^a-z ], a sentence that starts with a capital letter loses that letter. You would also lose any 1337 speak or other encoded text hidden in the string.

Case in point:

phrase = "Joe can't tell between 'large' and large." => "Joe can't tell between 'large' and large."

phrase.gsub(/[^a-z ]/, '') => "oe cant tell between large and large"

phrase.gsub(/[^A-Za-z0-9 ]/, '') => "Joe cant tell between large and large"

phrase2 = "W3 a11 f10a7 d0wn h3r3!"

phrase2.gsub(/[^a-z ]/, '') => " a fa dwn hr"

phrase2.gsub(/[^A-Za-z0-9 ]/, '') => "W3 a11 f10a7 d0wn h3r3"

Kerenkeresan answered 11/5, 2017 at 18:10 Comment(0)
B
2

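If you don't want to change the original string (for example, you only need to read the letters to solve the challenge), you can iterate over it instead:

str.each_char do |letter|
  # print only lowercase letters; str itself is left unchanged
  if letter =~ /[a-z]/
    p letter
  end
end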
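
If you would rather collect the letters into one string than print them one by one, here is a minimal sketch of two non-mutating alternatives (the sample string is made up; `scan` is the method the question mentions):

```ruby
str = "a%B!c d@e"   # made-up sample

# scan returns an array of every match; join glues them back together.
via_scan = str.scan(/[a-z]/).join

# delete with a negated range returns a copy containing only a-z.
via_delete = str.delete("^a-z")

puts via_scan    # acde
puts via_delete  # acde
puts str         # a%B!c d@e (unchanged)
```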
Broach answered 30/1, 2014 at 2:32 Comment(0)
M
0

You will have to write your own string-sanitizing function; a regex with the gsub method does this easily.

Atomic sample:

your_text.gsub!(/[!@\[;\]^%*\(\);\-_\/&\\|$\{#\}<>:`~"]/,'')

API sample:

Route: post 'api/sanitize_text', to: 'api#sanitize_text'

Controller:

  def sanitize_text
    return render_bad_request unless params[:text].present?
    # gsub (not gsub!) so that text with nothing to strip comes back as-is instead of nil
    sanitized_text = params[:text].gsub(/[!@\[;\]^%*\(\);\-_\/&\\|$\{#\}<>:`~"]/,'')
    render_response( {safe_text: sanitized_text})
  end

Then you call it

POST /api/sanitize_text?text=abcdefghijklmnopqrstuvwxyz123456<>$!@%23^%26*[]:;{}()`,.~'"\|/
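
Outside Rails, the same idea fits in a plain method. This is only a sketch: the constant name and the whitelist are assumptions (it keeps letters, digits and spaces rather than listing every special character as above), and it uses the non-bang `gsub` so the result is never `nil`.

```ruby
# Hypothetical whitelist: anything that is NOT a letter, digit or space is stripped.
SPECIAL_CHARS = /[^A-Za-z0-9 ]/

# Hypothetical helper; to_s lets it accept nil without raising.
def sanitize_text(text)
  text.to_s.gsub(SPECIAL_CHARS, '')
end

puts sanitize_text("abc<>$!123")   # abc123
puts sanitize_text("clean")        # clean
```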
Masurium answered 20/6, 2022 at 16:26 Comment(0)
