How to Upload Large Files on Heroku (Particularly Videos)

I'm using Heroku to host a web application whose primary focus is hosting videos. The videos are hosted through Vimeo Pro, and I'm using the vimeo gem by matthooks to help handle the upload process. Upload works for small files, but not for larger ones (~50 MB, for example).

A look at the Heroku logs shows that I am getting HTTP error 413, which stands for "Request Entity Too Large." I believe this might have to do with a limit that Heroku places on file uploads (anything greater than 30 MB, according to this webpage). The problem, though, is that any information I can find on the subject seems to be outdated and conflicting (like this page, which claims there is no size limit). I also couldn't find anything on Heroku's site about this.

I've searched Google and found a few somewhat relevant pages (one and two), but no solutions that worked for me. Most of the pages I found deal with uploading large files to Amazon S3, which is different from what I'm trying to do.

Here's the relevant output of the logs:

2012-07-18T05:13:31+00:00 heroku[nginx]: 152.3.68.6 - - [18/Jul/2012:05:13:31 +0000]
  "POST /videos HTTP/1.1" 413 192 "http://neoteach.com/components/19" "Mozilla/5.0 
  (Macintosh; Intel Mac OS X 10.7; rv:13.0) Gecko/20100101 Firefox/13.0.1" neoteach.com

There are no other errors in the logs. This is the only output that appears when I try to upload a video that is too large, which means this is not a timeout error or a problem with exceeding the allotted memory per dyno.

Does Heroku really place a limit on upload sizes? If so, is there any way to change this limit? Note that the files themselves are not being stored on Heroku's servers at all; they are merely being passed on to Vimeo's servers.

If the problem is not a limit on upload sizes, does anyone have an idea of what else might be going wrong?

Much thanks!

Chloromycetin answered 18/7, 2012 at 5:39 Comment(2)
As far as I know, there's no way around it. I had to upload directly to S3. You might be able to find some way to pass the videos directly to Vimeo, but the only result I found for that wasn't very encouraging: vimeo.com/forums/topic:28113 – Insecurity
Worth noting: I just tested uploading an 8.5 MB file to my Heroku app, which took 3 minutes and 15 seconds (yes, I have DSL). I have web: gunicorn -t 60 -k "eventlet" -w 3 myapp.wsgi:application in my Procfile. In other words, I've increased my timeout to 60 seconds, and yet my app allowed an upload to take more than 3 minutes. I'm not sure of the reason for this, but it has something to do with my dyno allowing concurrent connections. – Slavism

Update:

OP here. I'm still not exactly sure why I was getting this particular 413 error, but I was able to come up with a working solution using the s3_swf_upload gem. The implementation involves Flash, which is less than ideal, but it was the only solution (out of the 3 or 4 I tried) that I could get working.

As Neil pointed out (thanks Neil!), the error I should have been getting is "H12 - Request timeout," and I did end up running into it after repeated trials. The problem occurs when you try to upload large files to the Heroku server from your controller (using a web dyno): it takes too long for the server to respond to the POST request.

The proper approach is to send the file directly to S3 without passing it through Heroku.

Here's a high-level overview of my approach (see the sketch after the list):

  1. Use the s3_swf_upload gem to supply a direct upload form to S3.
  2. Detect when the file is done uploading with the JavaScript callback function provided by the gem.
  3. Using JavaScript, send Rails a POST request to let your server know the file is done uploading.
  4. The controller that responds to the JavaScript POST does two things: (a) assigns an s3_key attribute to the video object (served up as a param in the form), and (b) initiates a background task using the delayed_job gem.
  5. The background task retrieves the file from S3. I used the aws-sdk gem to accomplish this, because it was already included in s3_swf_upload. Note that this is distinctly different from the aws-s3 gem (in fact, the two conflict with one another).
  6. After the file has been retrieved from S3, I used the vimeo gem to upload it to Vimeo (still in the background).
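
To make steps 4 through 6 concrete, here's a rough sketch, not my exact code; the action name, bucket name, and s3_key param are placeholders, and it assumes aws-sdk v1 and delayed_job's .delay proxy:

    class VideosController < ApplicationController
      # Step 4: JavaScript POSTs here once the direct-to-S3 upload finishes.
      def uploaded
        video = Video.find(params[:id])
        video.update_attribute(:s3_key, params[:s3_key])  # (a) remember where the file lives on S3
        video.delay.transfer_to_vimeo                     # (b) queue steps 5-6 via delayed_job
        head :ok
      end
    end

    class Video < ActiveRecord::Base
      # Steps 5 and 6, run in a worker dyno by delayed_job.
      def transfer_to_vimeo
        s3 = AWS::S3.new  # aws-sdk reads AWS credentials from the environment
        data = s3.buckets['my-bucket'].objects[s3_key].read  # loads the whole file into memory; see the R14 caveat below
        # ... write `data` to a tempfile and hand it to the vimeo gem's upload call
      end
    end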

The implementation above works, but it isn't perfect. For files close to 500 MB in size, you'll still run into R14 errors in your worker dynos. This occurs because Heroku only allots 512 MB of memory per dyno, so you can't load the entire file into memory at once. The way around this is to implement some sort of chunking in the final step, retrieving the file from S3 and uploading it to Vimeo piece by piece. I'm still working on this part, and I'd love to hear any suggestions you might have.
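
For the download side of that chunking, something along these lines should keep memory flat (an untested sketch, relying on aws-sdk v1's block form of read; the bucket name is a placeholder):

    require 'tempfile'

    # Stream the S3 object to disk in chunks instead of one giant read,
    # so the worker dyno never holds the whole video in memory.
    def download_in_chunks(s3_key)
      object = AWS::S3.new.buckets['my-bucket'].objects[s3_key]
      file = Tempfile.new(['video', '.mp4'])
      file.binmode
      object.read { |chunk| file.write(chunk) }  # aws-sdk yields the body piece by piece
      file.rewind
      file  # hand this File-like object to the vimeo gem's upload call
    end

The upload-to-Vimeo side would need the same treatment to be fully safe, but streaming the download alone already avoids holding the full 500 MB in memory at once.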

Hopefully this helps someone. Feel free to ask me any questions. Like I said, my solution isn't perfect, so feel free to add your own answer if you think it could be better.

Chloromycetin answered 26/7, 2012 at 19:10 Comment(1)
Look at "Carrierwave" with "Fog" and "CarrierwaveDirect" – Pitchman

I think the best option here is indeed to upload directly to S3. It's much cheaper and much more secure than allowing users to upload files to your own server (or to Heroku, in this case). It's also a well-proven pattern used by lots of video hosting platforms (I know vzaar does this).

Check out the jQuery upload plugin, which allows direct uploads to S3: https://github.com/blueimp/jQuery-File-Upload
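
If you go this route, the server side only has to sign the upload. With the aws-sdk gem (v1) that can be as small as the following sketch; the bucket name and key prefix are made up, and presigned_post is the relevant v1 call as I understand it:

    require 'aws-sdk'
    require 'securerandom'

    # Generate the signed policy the browser-side uploader needs in order to
    # POST straight to S3, so the file never touches your Heroku dynos.
    def upload_form_fields
      bucket = AWS::S3.new.buckets['my-upload-bucket']
      post = bucket.presigned_post(key: "uploads/#{SecureRandom.uuid}")
      { url: post.url.to_s, fields: post.fields }  # feed these to the jQuery plugin's form
    end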

Also check out the Railscasts around this topic: #381 and #383.

Diameter answered 30/10, 2012 at 15:53 Comment(0)

Your biggest problem here is not the size of the files, but the fact that you are expecting the user to upload large files to Heroku and have it pass them on. The issue is that all requests on the Heroku platform must return their first byte within 30 seconds, which in your case is very unlikely to happen.

Therefore, you need to look at having users upload directly to S3/Vimeo/wherever and then connecting your application data to these uploaded assets.

If you're using Ruby, then the carrierwave_direct gem might be worth a look for how it's done. Failing that, there are third-party services out there that let you do this via some code you can drop into the page, but these come with an attached cost.
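
To give a flavour of the carrierwave_direct approach, the setup is roughly this (a sketch based on that gem's documented usage; the uploader and model names are made up):

    # An uploader whose form posts straight to S3, bypassing your dynos.
    class VideoUploader < CarrierWave::Uploader::Base
      include CarrierWaveDirect::Uploader   # adds direct-upload form helpers such as direct_fog_url
    end

    class Video < ActiveRecord::Base
      mount_uploader :file, VideoUploader
    end

The gem signs the S3 POST policy for you, so your controller only ever sees the resulting key, never the file bytes.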

Links answered 18/7, 2012 at 10:1 Comment(5)
Thanks Neil. The problem I described above does not appear to be a timeout; there is no "H12 - Request timeout" in the logs. I understand, however, that timing out is a big risk with big file uploads, so I'll look into direct uploads. Perhaps that will solve the problem above as well. – Chloromycetin
It should. I would guess you've been testing on a decent connection so far. – Links
It is only when a connection is idle for 30 seconds that it is timed out. – Saxe
@James - 30 seconds until first byte: devcenter.heroku.com/articles/http-routing#timeouts – Links
Thanks Neil for clarifying. It also looks like the timeout is 55 seconds between bytes after the first one. – Saxe
