How to select unique elements
A

6

10

I would like to extend the Array class with a uniq_elements method that returns the elements with a multiplicity of one. I would also like the new method to accept a block, as uniq does. For example:

t=[1,2,2,3,4,4,5,6,7,7,8,9,9,9]
t.uniq_elements # => [1,3,5,6,8]

Example with a block:

t=[1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
t.uniq_elements{|z| z.round} # => [2.0, 5.1]

Neither t-t.uniq nor t.to_set-t.uniq.to_set works. I don't care about speed; I call it only once in my program, so it can be slow.
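A short check of why both attempts come back empty, sketched on the first example array (the variable names array_attempt and set_attempt are only for illustration):

```ruby
require 'set'

t = [1, 2, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 9, 9]

# Array#- drops ALL occurrences of any element present in the argument,
# and t.uniq holds every distinct value of t, so nothing survives.
array_attempt = t - t.uniq
array_attempt #=> []

# Converting to sets first collapses duplicates on both sides,
# so the two sets are equal and their difference is empty as well.
set_attempt = t.to_set - t.uniq.to_set
set_attempt.empty? #=> true
```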

Amine answered 28/7, 2014 at 0:32 Comment(2)
Not clear. Why is 5.7 included in the result of the second example? – Volume
Because I missed it; I have now excluded it. – Amine
S
14

Helper method

My solution uses the following helper:

class Array
  def difference(other)
    h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
    reject { |e| h[e] > 0 && h[e] -= 1 }
  end
end

This method is similar to Array#-. The difference is illustrated in the following example:

a = [3,1,2,3,4,3,2,2,4]
b = [2,3,4,4,3,4]

a - b              #=> [1]
c = a.difference b #=> [1, 3, 2, 2] 

As you see, a contains three 3's and b contains two, so the first two 3's in a are removed in constructing c (a is not mutated). When b contains at least as many instances of an element as a does, c contains no instances of that element. To remove elements beginning at the end of a:

a.reverse.difference(b).reverse #=> [3, 1, 2, 2]

Array#difference! could be defined in the obvious way.
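One possible reading of "the obvious way" is sketched below. The answer does not show this method, so its name and its return value (self, via delete_if) are assumptions:

```ruby
class Array
  # Hypothetical in-place counterpart of Array#difference; not taken
  # from the answer, only a sketch of what it might look like.
  def difference!(other)
    # Count how many copies of each element still have to be removed.
    remaining = other.each_with_object(Hash.new(0)) { |e, counts| counts[e] += 1 }
    # delete_if mutates the receiver and always returns self.
    delete_if do |e|
      next false unless remaining[e] > 0
      remaining[e] -= 1
      true
    end
  end
end

a = [3, 1, 2, 3, 4, 3, 2, 2, 4]
b = [2, 3, 4, 4, 3, 4]
a.difference!(b)
a #=> [1, 3, 2, 2]
```

delete_if is used rather than reject! because reject! returns nil when nothing is removed, which would make the return value harder to chain.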

I have found many uses for this method.

I have proposed that this method be added to the Ruby core.

When used with Array#-, this method makes it easy to extract the unique elements from an array a:

a = [1,3,2,4,3,4]
u = a.uniq          #=> [1, 3, 2, 4]
u - a.difference(u) #=> [1, 2]

This works because

a.difference(u)     #=> [3,4]    

contains all the non-unique elements of a (each possibly more than once).

Problem at Hand

Code

class Array
  def uniq_elements(&prc)
    prc ||= ->(e) { e }
    a = map { |e| prc[e] }
    u = a.uniq
    uniques = u - a.difference(u)
    select { |e| uniques.include?(prc[e]) }
  end
end

Examples

t = [1,2,2,3,4,4,5,6,7,7,8,9,9,9]
t.uniq_elements
  #=> [1,3,5,6,8]

t = [1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
t.uniq_elements { |z| z.round }
  # => [2.0, 5.1]
Succor answered 28/7, 2014 at 6:33 Comment(0)
S
3

Here's another way.

Code

require 'set'

class Array
  def uniq_elements(&prc)
    prc ||= ->(e) { e }
    uniques, dups = {}, Set.new
    each do |e|
      k = prc[e]
      ((uniques.key?(k)) ? (dups << k; uniques.delete(k)) :
          uniques[k] = e) unless dups.include?(k)
    end
    uniques.values
  end
end

Examples

t = [1,2,2,3,4,4,5,6,7,7,8,9,9,9]
t.uniq_elements #=> [1,3,5,6,8]

t = [1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
t.uniq_elements { |z| z.round } # => [2.0, 5.1]

Explanation

  • if uniq_elements is called with a block, it is received as the proc prc.
  • if uniq_elements is called without a block, prc is nil, so the first statement of the method sets prc equal to the default proc (lambda).
  • an initially-empty hash, uniques, holds the candidate unique values. Its values are elements of the array self; its keys are what the proc prc returns when called with the element: k = prc[e].
  • the set dups contains the keys that have been found not to be unique. It is a set (rather than an array) to speed lookups. Alternatively, it could be a hash with the non-unique keys as keys and arbitrary values.
  • the following steps are performed for each element e of the array self:
    • k = prc[e] is computed.
    • if dups contains k, e is a dup, so nothing more needs to be done; else
    • if uniques has a key k, e is a dup, so k is added to the set dups and the element with key k is removed from uniques; else
    • the element k=>e is added to uniques as a candidate for a unique element.
  • the values of uniques are returned.
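The steps above can be followed on a small input with a tracing variant. This is only a sketch: trace_uniq_elements is a hypothetical standalone helper, not part of the answer, that applies the same logic and records the state of uniques and dups after each element.

```ruby
require 'set'

# Hypothetical tracing helper (an assumption for illustration): same
# single-pass logic as uniq_elements, plus a snapshot after each element.
def trace_uniq_elements(array)
  uniques = {}
  dups = Set.new
  steps = []
  array.each do |e|
    k = block_given? ? yield(e) : e
    unless dups.include?(k)
      if uniques.key?(k)
        dups << k            # second sighting: k is a duplicate
        uniques.delete(k)    # so it is no longer a candidate
      else
        uniques[k] = e       # first sighting: candidate unique element
      end
    end
    steps << [e, uniques.dup, dups.to_a]
  end
  [uniques.values, steps]
end

result, steps = trace_uniq_elements([1, 2, 2, 3, 2])
result #=> [1, 3]
# After the last element, uniques holds keys 1 and 3 and dups holds 2;
# the final 2 is skipped because its key is already in dups.
```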
Succor answered 28/7, 2014 at 3:57 Comment(1)
Thanks. Until now I used this method, but it doesn't accept a block:

def uelements(a)
  t = a.sort
  u = []
  u.push t[0] if t[0] != t[1]
  for i in 1..t.size-2 do
    u.push t[i] if t[i] != t[i+1] && t[i] != t[i-1]
  end
  u.push t[-1] if t[-2] != t[-1]
  return u
end
– Amine
T
1
class Array
  def uniq_elements
    counts = Hash.new(0)

    arr = map do |orig_val|
      converted_val = block_given? ? (yield orig_val) : orig_val
      counts[converted_val] += 1
      [converted_val, orig_val]
    end

    uniques = []

    arr.each do |(converted_val, orig_val)|
      uniques << orig_val if counts[converted_val] == 1
    end

    uniques
  end
end

t=[1,2,2,3,4,4,5,6,7,7,8,9,9,9]
p t.uniq_elements

t=[1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
p  t.uniq_elements { |elmt| elmt.round }

--output:--
[1, 3, 5, 6, 8]
[2.0, 5.1]

Array#uniq does not find the non-duplicated elements; rather, it removes duplicates.
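A small illustration of that distinction on the first example array (the variable names are only for illustration):

```ruby
t = [1, 2, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 9, 9]

# uniq keeps one copy of every distinct value, duplicated or not.
deduplicated = t.uniq
deduplicated #=> [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Keeping only the values that occur exactly once is a different operation.
# (Array#count inside select is quadratic; fine for a small illustration.)
singletons = t.select { |e| t.count(e) == 1 }
singletons #=> [1, 3, 5, 6, 8]
```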

Tass answered 28/7, 2014 at 2:51 Comment(1)
VII, after the map block, consider arr.each_with_object([]) do |(converted_val, orig_val),uniques|...end. – Succor
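A sketch of how the suggestion in the comment might look when applied to the answer's code. The elided body of the block is filled in here as an assumption, reusing the answer's own condition:

```ruby
class Array
  def uniq_elements
    counts = Hash.new(0)

    # First pass: count each converted value and remember its original.
    arr = map do |orig_val|
      converted_val = block_given? ? (yield orig_val) : orig_val
      counts[converted_val] += 1
      [converted_val, orig_val]
    end

    # Second pass: each_with_object builds and returns the result array,
    # so the separate "uniques = []" and trailing "uniques" lines go away.
    arr.each_with_object([]) do |(converted_val, orig_val), uniques|
      uniques << orig_val if counts[converted_val] == 1
    end
  end
end

t = [1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
t.uniq_elements { |elmt| elmt.round } #=> [2.0, 5.1]
```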
L
1

Use Enumerable#tally:

class Array
  def uniq_elements
    tally.select { |_obj, nb| nb == 1 }.keys
  end
end

t=[1,2,2,3,4,4,5,6,7,7,8,9,9,9]
t.uniq_elements # => [1,3,5,6,8]

If you are using Ruby < 2.7, you can get tally with the backports gem:

require 'backports/2.7.0/enumerable/tally'
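The tally version above does not take a block, which the question also asks for. A minimal sketch of one way to support it, swapping in Enumerable#group_by for tally so that the original elements are kept alongside the converted keys (this is not the answer's own code):

```ruby
class Array
  # Sketch: group elements by the block's result (or by the element
  # itself when no block is given) and keep only groups of size one.
  def uniq_elements(&block)
    block ||= :itself.to_proc
    group_by(&block).values.select { |group| group.size == 1 }.map(&:first)
  end
end

t = [1, 2, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 9, 9]
t.uniq_elements #=> [1, 3, 5, 6, 8]

t = [1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
t.uniq_elements { |z| z.round } #=> [2.0, 5.1]
```

group_by preserves the order in which keys first appear, so the result follows the order of the original array.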
Landgrabber answered 20/7, 2020 at 23:14 Comment(0)
G
0
class Array
  def uniq_elements
    zip( block_given? ? map { |e| yield e } : self )
      .each_with_object Hash.new do |(e, v), h| h[v] = h[v].nil? ? [e] : false end
      .values.reject( &:! ).map &:first
  end
end

[1,2,2,3,4,4,5,6,7,7,8,9,9,9].uniq_elements #=> [1, 3, 5, 6, 8]
[1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2].uniq_elements &:round #=> [2.0, 5.1]
Gospel answered 28/7, 2014 at 3:22 Comment(4)
each_with_index() is not necessary. You can just insert 1 every time. Also note: you traverse the array twice (the minimal number of times) but then you additionally have to call keys(). – Tass
Now get rid of your Hash and use Hash.new(0) instead; there's no need to create all those arrays. – Tass
I'd done that before your comment was written, but thanks for your attentiveness. – Gospel
Yeah, but I was thinking it the first time I read your post! I still think mine's more efficient... well, you nicely sidestep the map() call if there's no block given. Ahh... but yours won't work with the rounding example because you do not preserve the mapping between the original val and the converted val. – Tass
T
0
  1. Creating and calling a default proc is a waste of time, and
  2. Cramming everything into one line using tortured constructs doesn't make the code more efficient--it just makes the code harder to understand.
  3. In require statements, rubyists don't capitalize file names.

....

require 'set'

class Array
  def uniq_elements
    uniques = {}
    dups = Set.new

    each do |orig_val|
      converted_val = block_given? ? (yield orig_val) : orig_val
      next if dups.include? converted_val

      if uniques.include?(converted_val)
        uniques.delete(converted_val)
        dups << converted_val
      else
        uniques[converted_val] = orig_val
      end
    end

    uniques.values
  end
end


t=[1,2,2,3,4,4,5,6,7,7,8,9,9,9]
p t.uniq_elements

t=[1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]

p  t.uniq_elements {|elmt|
  elmt.round
}

--output:--
[1, 3, 5, 6, 8]
[2.0, 5.1]
Tass answered 28/7, 2014 at 17:45 Comment(2)
Thanks, 7. I initially had next if..., as you suggest, but switched to unless because the code was short, though next may read better. I prefer using the proc, in part because unique_elements(&prc) tells the reader immediately that a parameter may be being passed, and the first line clarifies that; a naked unique_elements suggests the opposite, until the reader sees yield. I downcased set. – Succor
"I prefer using the proc, because [it] tells the reader immediately that a parameter may be being passed": Fair enough, but Ruby's syntax was created to allow you to pass a method into another method without specifying an argument. Also, frivolous method calls aren't free. "switched to unless": I follow 'Perl Best Practices' in regards to unless; it's an abomination. In any case, nice solution using only a single traversal of the array. – Tass
