Slicing params hash for specific values

5

14

Summary

Given a Hash, what is the most efficient way to create a subset Hash based on a list of keys to use?

h1 = { a:1, b:2, c:3 }        # Given a hash...
p foo( h1, :a, :c, :d )       # ...create a method that...
#=> { :a=>1, :c=>3, :d=>nil } # ...returns specified keys...
#=> { :a=>1, :c=>3 }          # ...or perhaps only keys that exist

Details

The Sequel database toolkit allows one to create or update a model instance by passing in a Hash:

foo = Product.create( hash_of_column_values )
foo.update( another_hash )

The Sinatra web framework makes available a Hash named params that includes form variables, querystring parameters and also route matches.

If I create a form holding only fields named the same as the database columns and post it to this route, everything works very conveniently:

post "/create_product" do
  new_product = Product.create params
  redirect "/product/#{new_product.id}"
end

However, this is both fragile and dangerous. It's dangerous because a malicious hacker could post a form with columns not intended to be changed and have them updated. It's fragile because using the same form with this route will not work:

post "/update_product/:foo" do |prod_id|
  if product = Product[prod_id]
    product.update(params)
    #=> <Sequel::Error: method foo= doesn't exist or access is restricted to it>
  end
end

So, for robustness and security I want to be able to write this:

post "/update_product/:foo" do |prod_id|
  if product = Product[prod_id]
    # Only update two specific fields
    product.update(params.slice(:name,:description))
    # The above assumes a Hash (or Sinatra params) monkeypatch
    # I will also accept standalone helper methods that perform the same
  end
end

...instead of the more verbose and non-DRY option:

post "/update_product/:foo" do |prod_id|
  if product = Product[prod_id]
    # Only update two specific fields
    product.update({
      name:params[:name],
      description:params[:description]
    })
  end
end

Update: Benchmarks

Here are the results of benchmarking the (current) implementations:

                    user     system      total        real
sawa2           0.250000   0.000000   0.250000 (  0.269027)
phrogz2         0.280000   0.000000   0.280000 (  0.275027)
sawa1           0.297000   0.000000   0.297000 (  0.293029)
phrogz3         0.296000   0.000000   0.296000 (  0.307031)
phrogz1         0.328000   0.000000   0.328000 (  0.319032)
activesupport   0.639000   0.000000   0.639000 (  0.657066)
mladen          1.716000   0.000000   1.716000 (  1.725172)

The second answer by @sawa is the fastest of all, a hair in front of my tap-based implementation (based on his first answer). Choosing to add the check for has_key? adds very little time, and is still more than twice as fast as ActiveSupport.

Here is the benchmark code:

h1 = Hash[ ('a'..'z').zip(1..26) ]
keys = %w[a z c d g A x]
n = 60000

require 'benchmark'
Benchmark.bmbm do |x|
  %w[ sawa2 phrogz2 sawa1 phrogz3 phrogz1 activesupport mladen ].each do |m|
    x.report(m){ n.times{ h1.send(m,*keys) } }
  end
end
Theme answered 13/4, 2011 at 17:10 Comment(6)
Your example at the top doesn't seem to agree with the details? In the example you show that if you select a key that doesn't exist in the original hash you should get a nil value in the new hash. In your Sequel example it doesn't seem you need to create a new hash but really just a subset. What is the real requirement? - Rigorism
@Rigorism The two are not incompatible, I think. The real situation will be that I will never (knowingly) ask for a key that does not exist in the original. I included :d in the summary to clearly specify how the edge case should be handled. However, I am also amenable to solutions which do not include any missing-but-requested keys. (Indeed, Mladen's answer and ActiveSupport both do not include any keys not present in the original.) - Theme
Wow, good to hear the result. A possible lesson here: is a naive implementation faster than leaning too heavily on the Rubyish way and its built-in functions? I hope the Ruby implementation gets faster. - Otherness
The differences between the first 5 benchmarks are not statistically significant. That is, they are all essentially the same speed. - Ine
Great question and followup! Exactly what I was looking for :) - Middlebreaker
Just an FYI for anyone reading this 9 years later: Ruby has this built in now ^^ ruby-doc.org/core-2.5.0/Hash.html#method-i-slice - Huggins
O
5

I changed my mind. The previous one doesn't seem to be any good.

class Hash
  def slice1(*keys)
    keys.each_with_object({}){|k, h| h[k] = self[k]}
  end
  def slice2(*keys)
    h = {}
    keys.each{|k| h[k] = self[k]}
    h
  end
end
Otherness answered 13/4, 2011 at 17:35 Comment(0)
O
19

I would just use the slice method provided by active_support

require 'active_support/core_ext/hash/slice'
{a: 1, b: 2, c: 3}.slice(:a, :c)                  # => {a: 1, c: 3}

Of course, make sure to update your Gemfile (note that the gem name has no underscore, even though the require path does):

gem 'activesupport'
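
If you are on Ruby 2.5 or later, Hash#slice is part of core and the gem is not needed for this. A minimal sketch, reusing h1 from the question; note that the built-in version leaves out requested keys that are missing from the hash rather than filling them with nil:

```ruby
# Requires Ruby 2.5 or later; nothing to require or install.
h1 = { a: 1, b: 2, c: 3 }

# :d is not a key of h1, so it is simply left out of the result.
subset = h1.slice(:a, :c, :d)

p subset.keys  #=> [:a, :c]
```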
Outwards answered 13/4, 2011 at 19:27 Comment(1)
+1 I didn't know that you could cherry pick individual methods from ActiveSupport. See the updated question above for the results of benchmarking this method. - Theme
K
3

Sequel has built-in support for only picking specific columns when updating:

product.update_fields(params, [:name, :description])

That doesn't do exactly the same thing if :name or :description is not present in params, though. But assuming you are expecting the user to use your form, that shouldn't be an issue.

I could always expand update_fields to take an option hash with an option that will skip the value if not present in the hash. I just haven't received a request to do that yet.

Kronstadt answered 14/4, 2011 at 19:41 Comment(3)
I had no idea. Very nice. This still does not address the needs of Product.create(), correct? - Theme
Ah, good point. Note that I just ran into the case last night where I was processing checkboxes and I did explicitly want to include a nil value when asking for a field not present in the hash. I will definitely not be making a request for functionality to skip non-present values. :) - Theme
FWIW, another case where I just needed slice with Sequel: model.add_associateditem( existing_item.slice( hash_of_fields_without_id ) ) - Theme
D
2

Perhaps

class Hash
  def slice *keys
    select{|k| keys.member?(k)}
  end
end

Or you could just copy ActiveSupport's Hash#slice, it looks a bit more robust.
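
A note on how a select-based slice differs from a key-driven one, as a rough sketch rather than a measurement. The select form visits every entry of the hash and scans the key list for each one, so its cost grows with the size of the hash as well as the number of keys, and the result follows the hash's own order. The key-driven form below (my own illustration, not code from this thread) does one lookup per requested key and follows the order of the key list:

```ruby
h = { a: 1, b: 2, c: 3 }
wanted = [:c, :a, :d]

# select-based: walks every entry of h and scans `wanted` for each key
by_select = h.select { |k, _v| wanted.include?(k) }

# key-driven: walks only `wanted`, one hash lookup per requested key
by_keys = wanted.each_with_object({}) { |k, out| out[k] = h[k] if h.key?(k) }

p by_select.keys  #=> [:a, :c]   (order of the original hash)
p by_keys.keys    #=> [:c, :a]   (order of the requested keys)
```

Both hashes compare equal with ==, since Hash equality ignores insertion order.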

Drab answered 13/4, 2011 at 17:50 Comment(0)
T
0

Here are my implementations; I will benchmark and accept faster (or sufficiently more elegant) solutions:

# Implementation 1
class Hash
  def slice(*keys)
    Hash[keys.zip(values_at(*keys))]
  end
end

# Implementation 2
class Hash
  def slice(*keys)
    {}.tap{ |h| keys.each{ |k| h[k]=self[k] } }
  end
end

# Implementation 3 - silently ignore keys not in the original
class Hash
  def slice(*keys)
    {}.tap{ |h| keys.each{ |k| h[k]=self[k] if has_key?(k) } }
  end
end
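
For anyone who would rather not reopen Hash, here is a sketch of the same two behaviours as standalone helpers. The names slice_with_nils and slice_existing are made up for this example and are not part of Ruby, Sinatra, or Sequel; the form hash is a stand-in for submitted params where a "featured" checkbox was left unchecked:

```ruby
# Hypothetical helper: returns every requested key, with nil for any
# key that is missing from the hash (same behaviour as Implementation 2).
def slice_with_nils(hash, *keys)
  keys.each_with_object({}) { |k, out| out[k] = hash[k] }
end

# Hypothetical helper: returns only the requested keys that the hash
# actually contains (same behaviour as Implementation 3).
def slice_existing(hash, *keys)
  keys.each_with_object({}) { |k, out| out[k] = hash[k] if hash.key?(k) }
end

# Stand-in for form data; :admin is a field the route should never accept.
form = { name: "Widget", description: "A small widget", admin: true }

with_nils = slice_with_nils(form, :name, :featured)
existing  = slice_existing(form, :name, :featured)

p with_nils.keys  #=> [:name, :featured]
p existing.keys   #=> [:name]
```

Either way :admin never reaches the model; the choice is only whether a missing field should become an explicit nil (useful for unchecked checkboxes) or be left alone.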
Theme answered 13/4, 2011 at 17:11 Comment(5)
Why not Hash#only from ActiveSupport? - Ine
@ReinHeinrichs Because I'm using Sinatra, not Rails, and I don't have the bloat of ActiveSupport included in my app. Also, because I didn't know about it. :) Thanks, I'll look into that. - Theme
Wouldn't including keys not existing in the hash cause setting some table columns to NULL? I believe you would have to check for has_key? in your app. - Modestia
@Mladen Yes, it would. I'm torn on whether or not this is desirable. For example, an HTML checkbox that is unchecked will not send a key-value pair. This is one case where I might ask for a column that is reasonably not present in the hash, and desire the nil. As you can see, I edited my answer above with a version that uses has_key?, for when this is desirable. - Theme
A database table can provide default values (not necessarily NULLs) for its columns, so you're good if the parameter is not there. - Modestia
