Why is it taking so long to retrieve a file URL from S3 using CarrierWave and Fog?
Asked Answered
O

2

8

I'm using CarrierWave (0.9.0), Fog (1.14.0) and S3 to store user avatars. It seems to be taking a long time to determine the avatar URL for a given user. Subsequent calls have a greatly reduced time.

config/initializers/fog.rb

CarrierWave.configure do |config|
  config.fog_credentials = {
    provider: 'AWS',
    aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
  }
  config.fog_directory = ENV['AWS_S3_BUCKET_AVATAR']
  config.fog_public = false
end

user.rb

mount_uploader :avatar, ImageUploader

Console

irb(main):002:0> Benchmark.measure { user.avatar_url }
=>   0.500000   0.020000   0.520000 (  0.537884)

irb(main):003:0> Benchmark.measure { user.avatar_url }
=>   0.000000   0.000000   0.000000 (  0.001830)

irb(main):004:0> Benchmark.measure { user.avatar_url }
=>   0.000000   0.000000   0.000000 (  0.001454)

irb(main):005:0> Benchmark.measure { user.avatar_url }
=>   0.000000   0.000000   0.000000 (  0.000998)

New Relic reports that sometimes user.avatar_url takes up to 1 second. What could be causing this to be so slow? There's a discussion on slow URL generation for public files, but my avatars are not public.


Update 1:

Explicitly requiring Fog and CarrierWave before the first call makes no difference, as false is returned, indicating that they're already loaded.

irb(main):002:0> require 'carrierwave'
=> false
irb(main):003:0> require 'fog'
=> false
irb(main):004:0> Benchmark.measure { user.avatar_url }
=>   0.510000   0.030000   0.540000 (  1.627774)

Update 2:

Here's the uploader:

# encoding: utf-8

class ImageUploader < CarrierWave::Uploader::Base

  include CarrierWave::RMagick

  storage :fog

  def store_dir
    "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
  end

  version :s_32 do
   process resize_to_fill: [32, 32]
  end

  version :s_40 do
   process resize_to_fill: [40, 40]
  end

  version :s_50 do
   process resize_to_fill: [50, 50]
  end

  version :s_115_120 do
   process resize_to_fill: [115, 120]
  end

  version :s_128 do
   process resize_to_fill: [128, 128]
  end

  def extension_white_list
    %w(jpg jpeg gif png)
  end

end

Update 3:

user.avatar.url doesn't seem to make a difference:

irb(main):003:0> Benchmark.measure { user.avatar.url }
=>   0.500000   0.030000   0.530000 (  0.926975)
Oshea answered 30/7, 2013 at 15:7 Comment(3)
The the first you call user.avatar_url you'll query a row in the database, while subsequent calls may be caching the query. While you probably shouldn't expect such a slow down with such a simple query, this may be your problem. What does your ImageUploader#store_dir look like?Iraidairan
Agreed, I think I need a bit more detail around the uploader to have any further guesses.Secede
Added the uploader to the question.Oshea
S
4

I think the fog require may still be at issue (though it is less obvious now). Since fog has so very many different things within it, we made a choice long ago to defer loading many of the dependencies until they were needed. This has the benefit of speeding up 'fog', but can have the detriment of slowing down the first time certain things happen. Not sure how I forgot about this part, but in doing some small benchmarking on my machine I can certainly see a slow down when taking this into consideration.

To get around this, you can change the related requiring benchmark above to something like:

require 'benchmark'
require 'fog'
Fog::Storage.new(
  provider: 'AWS',
  aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
  aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
)
Benchmark.measure { ... }

It may seem a bit odd in that you don't use that connection, but I set it up to defer loading S3 specifics until you do that (so that you don't, for instance, have to load S3 specifics in order to use EC2). By initializing a connection at some earlier time, however, you avoid that overhead. Hopefully that will at least get you closer to where you want to be.

Secede answered 31/7, 2013 at 15:6 Comment(6)
Tried it, made no difference. Updated the question.Oshea
In looking over the docs and stuff, one thing stuck out that I thought I'd ask about. Can you do user.avatar.url? It looked like that might work directly, whereas I wonder if avatar_url hits method_missing the first time through (or similar). Wish I could help more directly but I haven't touched carrierwave in so long that it is a bit fuzzy/uncertain for me.Secede
Good idea, but it didn't help. I've updated the question with the result. This one is killing me!Oshea
Killing me too, I have one more idea but it takes a little effort to explain. I'll just update my answer to better reflect above.Secede
That fixed it! Benchmark time for the first attempt is now 0.003877. Is it OK to add the S3 connection to an initializer, or does it have to be refreshed periodically?Oshea
Initializer should be fine. It's kind of weird in that I wouldn't expect the connection you create in this way to ever actually be used (I don't think it will even make a connection to the server). It's just that Fog::Storage::AWS.new includes a number of requires which this will hit (so no need to refresh). Perhaps the requires should be moved out to some bootstrapping method or something, but for now I'm glad to hear this works for you.Secede
M
2

I spent some time on this. While I haven't figured out the root cause, here are some findings. You can start from what I've done and hopefully you don't need to worry about this too much.

  • Based on what you said and my experiments. This slowness only happens when you call user.avatar_url the first. It's fast if you call user.avatar_url or another_user.avatar_url after the first call. Basically only one request is affected by this 'issue', so I think this is not a big problem.
  • What are the difference between the first and subsequent calls? I suggest you use ruby-prof to check the difference. I am not an expert but from the report you will see the first run loads more stuff than the second run. ruby-prof provides some other reports which might be helpful to you, especially the tree report.

  • I thought CarrierWave/Fog might need to access S3 to make sure the url does exist. This was ruled out by using an incorrect S3 bucket.

The above report is generated by below code.

user = User.first
RubyProf.start
user.avatar_url
results = RubyProf.stop

File.open "/tmp/profile-graph1.txt", 'w' do |file|
  RubyProf::FlatPrinter.new(results).print(file)
end

RubyProf.start
user.avatar_url
results = RubyProf.stop

File.open "/tmp/profile-graph2.txt", 'w' do |file|
  RubyProf::FlatPrinter.new(results).print(file)
end
Mcchesney answered 10/8, 2013 at 8:5 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.