Check if string contains any substring in an array in Ruby

I am using the TMail library, and for each attachment in an email, attachment.content_type sometimes returns not just the content type but also the name. Examples:

image/jpeg; name=example3.jpg

image/jpeg; name=example.jpg

image/jpeg; name=photo.JPG

image/png

I have an array of valid content types like this:

VALID_CONTENT_TYPES = ['image/jpeg']

I would like to check whether the content type string contains any of the elements of the valid content types array.

What would be the best way of doing so in Ruby?

Apace answered 18/4, 2012 at 18:26 Comment(0)
S
0

I think we can divide this question into two:

  1. How to clean undesired data
  2. How to check if cleaned data is valid

The first is well covered in the other answers. For the second, I would do the following:

(cleaned_content_types - VALID_CONTENT_TYPES).empty?

A nice property of this approach is that you can store the undesired types in a variable and list them later, as in this example:

VALID_CONTENT_TYPES = ['image/jpeg']
cleaned_content_types = ['image/png', 'image/jpeg', 'image/gif', 'image/jpeg']

undesired_types = cleaned_content_types - VALID_CONTENT_TYPES
if undesired_types.size > 0
  error_message = "The types #{undesired_types.join(', ')} are not allowed"
else
  # The happy path here
end
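
Putting the two steps together, a minimal sketch. The raw_content_types sample data and the split-on-semicolon cleaning step are assumptions made for illustration, and valid_content_types stands in for VALID_CONTENT_TYPES:

```ruby
valid_content_types = ['image/jpeg']

# Hypothetical raw values, shaped like the ones in the question
raw_content_types = [
  'image/jpeg; name=example3.jpg',
  'image/png',
  'image/jpeg; name=photo.JPG'
]

# Step 1: keep only the part before the first ';'
cleaned_content_types = raw_content_types.map { |t| t.split(';').first.strip }

# Step 2: whatever survives the difference is not allowed
undesired_types = cleaned_content_types - valid_content_types

puts undesired_types.inspect
```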
Sulfathiazole answered 3/6, 2019 at 21:30 Comment(0)
S
124

There are multiple ways to accomplish that. You could check each string until a match is found using Enumerable#any?:

str = "alo eh tu"
['alo','hola','test'].any? { |word| str.include?(word) }

Though it might be faster to convert the array of strings into a Regexp:

words = ['alo','hola','test']
r = /#{words.join("|")}/ # assuming there are no special chars
r === "alo eh tu"
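
If the words may contain regex special characters, Regexp.union can build the pattern instead of a manual join, since it escapes plain strings. A small sketch; the word 'a.b' is an assumed example chosen to show the escaping, and match? requires Ruby 2.4 or later:

```ruby
words = ['alo', 'hola', 'a.b']

# Regexp.union escapes each string, so '.' only matches a literal dot
pattern = Regexp.union(words)

found = pattern.match?('alo eh tu')
literal_dot_only = !pattern.match?('aXb')

puts found
puts literal_dot_only
```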
Spina answered 18/4, 2012 at 18:42 Comment(6)
To be safe, you should escape the words in the regex (in case there are any regex special characters present): r = /#{words.map{|w|Regexp.escape(w)}.join('|')}/ - Eliseoelish
@steenslag Thanks! I had never seen that method (present since at least 1.8.6!). - Eliseoelish
@steenslag So it's not necessary to do the join? I can just do union and it does the escaping? Awesome... - Apace
I tried both and benchmarked them over 1_000_000 iterations: .any? # => (0.877526); r = Regexp.union(*words); r === string # => (17.374344). Just for reference. - Modicum
Your Regexp was exactly what I needed. Thank you! - Preamplifier
A few years late, but @Modicum's benchmark still works and still holds. Machines are just faster now: .any? # => (0.160000); union # => (6.410000) - Mousey
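
The comments do not show the benchmark code, so it is unclear whether the Regexp was rebuilt on every iteration. A rough sketch for measuring this yourself with the standard Benchmark module; the iteration count and the three variants are assumptions, and timings will differ by machine and Ruby version:

```ruby
require 'benchmark'

words = ['alo', 'hola', 'test']
str = 'alo eh tu'
iterations = 100_000

# Built once, outside the timed loops
pattern = Regexp.union(words)

any_time = Benchmark.realtime do
  iterations.times { words.any? { |word| str.include?(word) } }
end

precompiled_time = Benchmark.realtime do
  iterations.times { pattern.match?(str) }
end

# Rebuilding the pattern each time adds the cost of constructing the Regexp
rebuilt_time = Benchmark.realtime do
  iterations.times { Regexp.union(words).match?(str) }
end

puts format('any?: %.4f  precompiled: %.4f  rebuilt: %.4f',
            any_time, precompiled_time, rebuilt_time)
```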
E
3

If image/jpeg; name=example3.jpg is a String:

("image/jpeg; name=example3.jpg".split("; ") & VALID_CONTENT_TYPES).length > 0

i.e. the intersection (the elements common to both arrays) of VALID_CONTENT_TYPES and the array produced by splitting attachment.content_type should contain at least one element.

That's at least one of many ways.
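
The same idea wrapped in a method, as a sketch. The helper name valid_content_type? and the whitespace-tolerant split pattern are assumptions, added so that a separator without a trailing space is also handled:

```ruby
# Hypothetical helper; name and signature are illustrative only
def valid_content_type?(content_type, valid_types)
  # Split on ';' with optional surrounding whitespace
  parts = content_type.split(/\s*;\s*/)
  # A non-empty intersection means at least one valid type is present
  (parts & valid_types).any?
end

valid_types = ['image/jpeg']
puts valid_content_type?('image/jpeg; name=example3.jpg', valid_types)
puts valid_content_type?('image/png', valid_types)
```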

Encyclical answered 18/4, 2012 at 18:43 Comment(0)
F
3

If we just want to know whether a match exists:

VALID_CONTENT_TYPES.any? { |type| attachment.content_type.start_with?(type) }

If we want the matches themselves, this returns the list of matching strings in the array:

VALID_CONTENT_TYPES.select { |type| attachment.content_type.start_with? type }
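
String#start_with? also accepts several prefixes at once, so the existence check can be written without a block. A sketch with assumed sample values; note that a prefix test also accepts longer types that merely begin with a valid one:

```ruby
valid_content_types = ['image/jpeg', 'image/gif']

# Splat the array so each valid type is passed as a separate prefix
jpeg_ok = 'image/jpeg; name=example3.jpg'.start_with?(*valid_content_types)
png_ok  = 'image/png'.start_with?(*valid_content_types)

puts jpeg_ok
puts png_ok
```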
Frowst answered 18/4, 2012 at 18:45 Comment(0)
S
2
# true if the cleaned content type is one of the valid types
# (sub never returns nil, unlike gsub!; single quotes keep '\1' a backreference)
VALID_CONTENT_TYPES.include?(attachment.content_type.sub(/\A(image\/[a-z]+).*\z/, '\1'))
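
A non-mutating way to pull out the type is String#[] with a regex, which returns the matched portion or nil. A sketch; the media_type helper name is hypothetical, and the pattern simply takes everything up to the first semicolon or whitespace:

```ruby
# Hypothetical helper: returns the part before the first ';' or space, or nil
def media_type(content_type)
  content_type[/\A[^;\s]+/]
end

valid_content_types = ['image/jpeg']

puts valid_content_types.include?(media_type('image/jpeg; name=photo.JPG'))
puts valid_content_types.include?(media_type('image/png'))
```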
Sowder answered 18/4, 2012 at 18:51 Comment(0)
A
0

I use the following helper:

class String

    # line.includes_any? ['keyword_1', 'keyword_2']
    # line.includes_any? 'keyword_1', 'keyword_2'
    def includes_any?(*arr)
        arr.flatten.any? { self.include? _1 }
    end

end
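
A short usage sketch of the helper, repeated here so it runs on its own. The explicit block parameter is an assumption made so that it also works on Ruby versions before 2.7, where the numbered parameter _1 is not available:

```ruby
class String
  # Accepts either an array or a list of keywords
  def includes_any?(*arr)
    arr.flatten.any? { |keyword| include?(keyword) }
  end
end

content_type = 'image/jpeg; name=example3.jpg'

puts content_type.includes_any?(['image/jpeg', 'image/gif'])
puts content_type.includes_any?('image/png', 'image/gif')
```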
Amalea answered 2/4 at 8:37 Comment(0)
