Need to change the storage "directory" of files in an S3 Bucket (Carrierwave / Fog)

I am using CarrierWave with 3 separate models to upload photos to S3. I kept the uploader's default settings, which stored photos in the root of the S3 bucket. I then decided to store them in sub-directories named after the model they were uploaded from, such as avatars/, items/, etc.

Then, I noticed that files of the same name were being overwritten and when I deleted a model record, the photo wasn't being deleted.

I've since changed the store_dir from an uploader-specific setup like this:

  def store_dir
    "items"
  end

to a generic one that stores each photo under the model ID (I use Mongo, FYI):

  def store_dir
    "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
  end

Here comes the problem. I am trying to move all the photos already in S3 into the proper "directory" within S3. From what I've read, S3 doesn't have directories per se. I'm having trouble with the rake task. Since I changed the store_dir, CarrierWave is looking in the wrong directory for all the photos that were previously uploaded.

namespace :pics do
  desc "Fix directory location of pictures on s3"
  task :item_update => :environment do
    connection = Fog::Storage.new({
      :provider              => 'AWS',
      :aws_access_key_id     => 'XXXX',
      :aws_secret_access_key => 'XXX'
    })
    directory = connection.directories.get("myapp-uploads-dev")

    Recipe.all.each do |l|
      if l.images.count > 0
        l.items.each do |i|
          if i.picture.path.to_s != ""
            new_full_path = i.picture.path.to_s
            filename = new_full_path.split('/')[-1].split('?')[0]
            thumb_filename = "thumb_#{filename}"
            original_file_path = "items/#{filename}"
            puts "attempting to retrieve: #{original_file_path}"
            original_thumb_file_path = "items/#{thumb_filename}"
            photo = directory.files.get(original_file_path) rescue nil
            if photo
              puts "we found: #{original_file_path}"
              photo.expires = 2.years.from_now.httpdate
              photo.key = new_full_path
              photo.save
              thumb_photo = directory.files.get(original_thumb_file_path) rescue nil
              if thumb_photo
                puts "we found: #{original_thumb_file_path}"
                thumb_photo.expires = 2.years.from_now.httpdate
                thumb_photo.key = "/uploads/item/picture/#{i.id}/#{thumb_filename}"
                thumb_photo.save
              end
            end
          end
        end
      end
    end
  end
end

So I'm looping through all the Recipes, looking for items with photos, determining the old CarrierWave path, and trying to update it to the new one based on the store_dir change. I thought that simply updating photo.key with the new path would work, but it doesn't.

What am I doing wrong? Is there a better way to accomplish this?

Here's what I did to get this working...

namespace :pics do
  desc "Fix directory location of pictures"
  task :item_update => :environment do
    connection = Fog::Storage.new({
      :provider              => 'AWS',
      :aws_access_key_id     => 'XXX',
      :aws_secret_access_key => 'XXX'
    })
    bucket = "myapp-uploads-dev"
    puts "Using bucket: #{bucket}"
    Recipe.all.each do |l|
      if l.images.count > 0
        l.items.each do |i|
          if i.picture.path.to_s != ""
            new_full_path = i.picture.path.to_s
            filename = new_full_path.split('/')[-1].split('?')[0]
            thumb_filename = "thumb_#{filename}"
            original_file_path = "items/#{filename}"
            original_thumb_file_path = "items/#{thumb_filename}"
            puts "attempting to retrieve: #{original_file_path}"
            # copy original item
            begin
              connection.copy_object(bucket, original_file_path, bucket, new_full_path, 'x-amz-acl' => 'public-read')
              puts "we just copied: #{original_file_path}"
            rescue
              puts "couldn't find: #{original_file_path}"
            end
            # copy thumb
            begin
              connection.copy_object(bucket, original_thumb_file_path, bucket, "uploads/item/picture/#{i.id}/#{thumb_filename}", 'x-amz-acl' => 'public-read')
              puts "we just copied: #{original_thumb_file_path}"
            rescue
              puts "couldn't find thumb: #{original_thumb_file_path}"
            end

          end
        end
      end
    end
  end
end

Perhaps not the prettiest thing in the world, but it worked.
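
The path handling in the task above could be pulled into a small pure function, which makes the old-to-new key mapping easy to check before touching S3. This is only a sketch: `legacy_key_map` is a hypothetical helper name, and it assumes a non-empty path, the old `items/<filename>` layout, and a `thumb_` prefix for the thumbnail version, as in the task.

```ruby
# Hypothetical helper (not part of CarrierWave or Fog): maps old S3 keys to new ones.
# Assumes the legacy layout "items/<filename>" and a "thumb_" version prefix.
def legacy_key_map(new_full_path, old_dir = "items")
  # Drop any query string, then split into directory and file name.
  clean_path = new_full_path.split("?").first
  new_dir    = File.dirname(clean_path)
  filename   = File.basename(clean_path)

  {
    "#{old_dir}/#{filename}"       => "#{new_dir}/#{filename}",
    "#{old_dir}/thumb_#{filename}" => "#{new_dir}/thumb_#{filename}"
  }
end

# Illustrative path only; the id and file name are made up.
legacy_key_map("uploads/item/picture/42/cake.jpg").each do |old_key, new_key|
  puts "#{old_key} -> #{new_key}"
end
```

Each pair from the returned hash could then be passed to copy_object as the source and target keys.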

Sayyid answered 26/9, 2013 at 21:8 Comment(2)
I would kind of expect that to work as well. Are there errors, or do the files just not exist where you expect? If there are only a small number of files this may work fine, but with a larger number you may want to use copy_object as jeremy mentions below (it is much quicker and you don't have to download anything). – Perturbation
Thanks for this, it's been helpful for me going through the same thing. One thing you might want to know in the future is that you can get the filename directly without having to parse the path: i.picture.file.filename would do it in your case. – Chalkstone

You need to interact with the S3 objects directly to move them. You'll probably want to look at copy_object and delete_object in the Fog gem, which is what CarrierWave uses to interact with S3.

https://github.com/fog/fog/blob/8ca8a059b2f5dd2abc232dd2d2104fe6d8c41919/lib/fog/aws/requests/storage/copy_object.rb

https://github.com/fog/fog/blob/8ca8a059b2f5dd2abc232dd2d2104fe6d8c41919/lib/fog/aws/requests/storage/delete_object.rb
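
Since S3 has no rename operation, a "move" is a copy followed by a delete. Here is a minimal sketch of that, assuming `connection` is a Fog::Storage instance for AWS (or anything that responds to copy_object and delete_object with the same arguments). `move_object` is a hypothetical helper name, not part of Fog.

```ruby
# Hypothetical helper: "moves" an object within a bucket by copying, then deleting.
# `connection` is assumed to respond to Fog's AWS storage requests:
#   copy_object(source_bucket, source_key, target_bucket, target_key, options)
#   delete_object(bucket, key)
def move_object(connection, bucket, old_key, new_key, options = {})
  # Nothing to do, and deleting afterwards would lose the file.
  return false if old_key == new_key

  connection.copy_object(bucket, old_key, bucket, new_key, options)
  # Only reached if the copy did not raise, so the original is never deleted first.
  connection.delete_object(bucket, old_key)
  true
end
```

Called with a bucket name, the old key, the new key, and `'x-amz-acl' => 'public-read'` as options, this would relocate one file. Any error raised by the copy (for example, for a missing source key) propagates to the caller, so the delete is skipped.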

Hagiography answered 27/9, 2013 at 2:57 Comment(1)
Thanks for the help Jeremy, this set me on the right path. Posting my solution as an edit now for anyone else who comes along here. – Sayyid
