Uploading big files over HTTP
I need to upload potentially big files (tens to hundreds of megabytes) from a desktop application to a server. The server code is written in PHP, the desktop application in C++/MFC. I want to be able to resume file uploads when the upload fails halfway through, because this software will be used over unreliable connections. What are my options?

I've found a number of HTTP upload components for C++, such as http://www.chilkatsoft.com/refdoc/vcCkUploadRef.html which looks excellent, but it doesn't seem to handle resuming half-done uploads (I assume this is because HTTP 1.1 doesn't support it). I've also looked at the BITS service, but for uploads it requires an IIS server.

So far my only option seems to be to cut the file I want to upload into smaller pieces (say 1 MB each), upload them all to the server, reassemble them with PHP, and run a checksum to see if everything went OK. To resume, I'd need some form of 'handshake' at the beginning of the upload to find out which pieces are already on the server.

Will I have to code this by hand, or does anyone know of a library that does all this for me, or maybe even a completely different solution? I'd rather not switch to another protocol that supports resume natively, for maintenance reasons (potential problems with firewalls, etc.)
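A minimal sketch of the 'handshake' idea described above, assuming the server can report which chunk indices it already has (the function name and types here are hypothetical, not from any particular library):

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical resume handshake: given the total number of chunks and the
// set of chunk indices the server reports as already stored, work out which
// chunks still need to be uploaded.
std::vector<std::uint32_t> missing_chunks(std::uint32_t total_chunks,
                                          const std::set<std::uint32_t>& on_server) {
    std::vector<std::uint32_t> missing;
    for (std::uint32_t i = 0; i < total_chunks; ++i) {
        if (on_server.count(i) == 0) {
            missing.push_back(i);
        }
    }
    return missing;
}
```

The client would request the server's chunk list once at startup, then upload only the missing indices.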

Heavyarmed answered 29/1, 2009 at 16:17 Comment(0)
I'm eight months late, but I just stumbled upon this question and was surprised that WebDAV wasn't mentioned. You could use the HTTP PUT method to upload, and include a Content-Range header to handle resuming. A HEAD request would tell you whether the file already exists and how big it is. So perhaps something like this:

1) HEAD the remote file

2) If it exists and size == local size, upload is already done

3) If size < local size, add a Content-Range header to request and seek to the appropriate location in local file.

4) Make PUT request to upload the file (or portion of the file, if resuming)

5) If connection fails during PUT request, start over with step 1
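The decision in steps 2 and 3 can be sketched as a small helper that builds the Content-Range header from the size reported by HEAD (the function name is hypothetical; the header format follows HTTP's inclusive byte ranges):

```cpp
#include <cstdint>
#include <string>

// Given the remote size from the HEAD request and the local file size,
// build the Content-Range header for a resuming PUT. Returns an empty
// string when the upload is already complete (step 2 above).
// Content-Range uses inclusive byte positions: "bytes first-last/total".
std::string resume_range_header(std::uint64_t remote_size, std::uint64_t local_size) {
    if (remote_size >= local_size) return "";  // nothing left to send
    return "Content-Range: bytes " + std::to_string(remote_size) + "-" +
           std::to_string(local_size - 1) + "/" + std::to_string(local_size);
}
```

The client would then seek to `remote_size` in the local file and PUT the remainder with this header attached.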

You can also list (PROPFIND) and rename (MOVE) files, and create directories (MKCOL) with DAV.

I believe both Apache and lighttpd have DAV extensions.

Erena answered 30/9, 2009 at 18:18 Comment(2)
Excellent suggestion, don't know why I didn't think of that. Problem is that in the meantime I have, much to my regret, implemented my own solution :) It's turning out to be quite a kludge, so I may be replacing it with a WebDAV solution after all. Thanks.Heavyarmed
That depends on the server. Apache (with mod_dav) and other mature servers tend to support content-range on PUT. S3, however, does not (though it seems it's not out of the question: forums.aws.amazon.com/message.jspa?messageID=93019)Erena
You need a standard chunk size (say 256 KiB). If the file "abc.txt", uploaded by user x, is 78.3 MB, that works out to 313 full chunks plus one smaller chunk.

  1. You send a request to start the upload, stating the filename and size, as well as the number of initial threads.
  2. Your PHP code creates a temp folder named after the IP address and filename.
  3. Your app can then use MULTIPLE connections to send the data in different threads, so you could be sending chunks 1, 111, 212 and 313 at the same time (each with its own checksum).
  4. Your PHP code saves them to different files and confirms reception after validating the checksum, giving the number of a new chunk to send, or telling the thread to stop.
  5. After all threads are finished, you ask the PHP side to join all the files; if something is missing, go back to step 3.
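The chunk arithmetic behind this scheme can be sketched as follows, assuming the 256 KiB chunk size from the example (a 78.3 MB file gives 313 full chunks plus one partial one):

```cpp
#include <cstdint>
#include <utility>

// Assumed chunk size from the example above: 256 KiB.
constexpr std::uint64_t kChunkSize = 256 * 1024;

// Returns {number of full chunks, size of the final partial chunk in bytes}.
// A partial-chunk size of 0 means the file divides evenly.
std::pair<std::uint64_t, std::uint64_t> chunk_layout(std::uint64_t file_size) {
    return { file_size / kChunkSize, file_size % kChunkSize };
}
```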

You could increase or decrease the number of threads at will, since the app is controlling the sending.

You can easily show a progress indicator, either a simple progress bar or something close to DownThemAll's detailed view of chunks.

Oleum answered 29/1, 2009 at 23:37 Comment(0)
libcurl (a C API) could be a viable option:

-C/--continue-at Continue/Resume a previous file transfer at the given offset. The given offset is the exact number of bytes that will be skipped, counting from the beginning of the source file before it is transferred to the destination. If used with uploads, the FTP server command SIZE will not be used by curl. Use "-C -" to tell curl to automatically find out where/how to resume the transfer. It then uses the given output/input files to figure that out. If this option is used several times, the last one will be used

Gabby answered 9/10, 2009 at 9:31 Comment(0)
Google has created a resumable HTTP upload protocol. See https://developers.google.com/gdata/docs/resumable_upload

Downcome answered 4/8, 2012 at 14:28 Comment(0)
Is reversing the whole process an option? I mean, instead of pushing the file to the server, have the server pull the file using a standard HTTP GET with all the bells and whistles (like Accept-Ranges, etc.).

Saluki answered 29/1, 2009 at 16:20 Comment(2)
Good thinking but no, because the client may be behind a corporate firewall that doesn't allow incoming web traffic or does NAT and can't figure out which client machine to forward to.Heavyarmed
And other possibilities? FTP, BitTorrent?Roundup
Maybe the easiest method would be to create an upload page that accepts the filename and a range as parameters, such as http://yourpage/.../upload.php?file=myfile&from=123456, and handle resumes in the client (maybe you could add a function to ask the server which ranges it has already received).
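On the client side this amounts to rebuilding the URL from the byte count the server reports; a minimal sketch, where `upload.php` and the parameter names follow the example above and the function itself is hypothetical:

```cpp
#include <cstdint>
#include <string>

// Hypothetical client-side helper: build the resume URL from the number of
// bytes the server says it has already received for this file.
std::string build_upload_url(const std::string& base, const std::string& file,
                             std::uint64_t bytes_received) {
    return base + "/upload.php?file=" + file +
           "&from=" + std::to_string(bytes_received);
}
```

A real client would also URL-encode the filename before inserting it into the query string.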

Roundup answered 29/1, 2009 at 18:32 Comment(0)
@ Anton Gogolev Lol, I was just thinking about the same thing: reversing the whole thing, making the server a client and the client a server. Thanks to Roel, it's now clearer to me why it wouldn't work.

@ Roel I would suggest implementing a Java uploader [JumpLoader is good, with its JScript interface and even sample PHP server-side code]. Flash uploaders suffer badly when it comes to really big files, on the gigabyte scale that is.

Coniferous answered 12/11, 2010 at 22:31 Comment(0)
F*EX can upload files up to the TB range via HTTP and is able to resume after link failures. It does not exactly meet your needs, because it is written in Perl and needs a UNIX-based server, but the clients can be on any operating system. Maybe it is helpful for you nevertheless: http://fex.rus.uni-stuttgart.de/

Yare answered 17/7, 2013 at 23:14 Comment(0)
There is a protocol called tus for resumable uploads, with implementations in PHP and C++.
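For a flavour of the protocol: a tus 1.0 client resumes by asking the server for the current offset (HEAD) and then sending a PATCH request with headers like these, per the tus specification:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Headers a tus 1.0 client sends on a resuming PATCH request. The offset
// would come from the Upload-Offset header of a prior HEAD response.
std::vector<std::string> tus_patch_headers(std::uint64_t offset) {
    return {
        "Tus-Resumable: 1.0.0",
        "Content-Type: application/offset+octet-stream",
        "Upload-Offset: " + std::to_string(offset),
    };
}
```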

Intelligibility answered 28/1, 2021 at 13:39 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.