How to optimize upload routine using Delphi 2010?

My yet-to-be-released Delphi 2010 application allows users to upload their files to my servers. Right now I'm using HTTPS POST to send the files. The (simplified) algorithm is basically:

  1. Split File into "slices" (256KB each)
  2. For each slice, POST it to server

i.e. for a 1MB file:

--> Get Slice #1 (256KB)
--> Upload Slice #1 using TIdHTTP.Post()

--> Get Slice #2 (256KB)
--> Upload Slice #2 using TIdHTTP.Post()

--> Get Slice #3 (256KB)
--> Upload Slice #3 using TIdHTTP.Post()

--> Get Slice #4 (256KB)
--> Upload Slice #4 using TIdHTTP.Post()
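
As a rough illustration of the loop above, here is a minimal sketch written in Python rather than Delphi, only to keep it short and runnable. The names `iter_slices`, `upload_in_slices` and the `post_slice` callback are hypothetical; in the real application `post_slice` would wrap the `TIdHTTP.Post()` call.

```python
import io

SLICE_SIZE = 256 * 1024  # 256 KB per slice, as described in the question


def iter_slices(stream, slice_size=SLICE_SIZE):
    """Yield (index, data) pairs, 1-based, until the stream is exhausted."""
    index = 1
    while True:
        chunk = stream.read(slice_size)
        if not chunk:
            break
        yield index, chunk
        index += 1


def upload_in_slices(stream, post_slice, slice_size=SLICE_SIZE):
    """Call post_slice(index, data) once per slice and return the slice count.

    post_slice is a hypothetical callback standing in for one HTTP POST,
    so every slice costs one full request/response round trip.
    """
    count = 0
    for index, chunk in iter_slices(stream, slice_size):
        post_slice(index, chunk)
        count += 1
    return count
```

Because each slice is a separate request, the per-request overhead is paid once per 256KB, which is consistent with the speed-up reported below when slices are grouped into fewer POSTs.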

I'm using Indy 10. I (ab)used my profiler over and over, and there is not much left to optimize except the upload routine itself.

I'm also using multi-threading, and even though I did my best to optimize my code, my benchmarks still tell me I can do better (other well-optimized software achieves much better timing, almost twice as fast as my upload routine!)

I know it's not my server's fault...here are the ideas that I still need to explore:

  1. I tried grouping slices in a single POST. Naturally this resulted in a performance boost (20-35%), but resuming capability is now reduced.

  2. I also thought about using SFTP / SSH, but I'm not sure if it's fast.

  3. Use web sockets to implement resumable upload (like this component), I'm not sure about speed either.

Now my question is: is there something I can do to speed up my upload? I'm open to any suggestion that I can implement, including command-line tools (if the license allows me to ship them with my application), provided that:

  1. Resumable upload is supported
  2. Fast!
  3. Reasonable memory usage
  4. Secure & allow login/user authentication

Also, because of major security concerns, FTP is not something I'd want to implement.

Thanks a lot!

Mixup answered 28/2, 2012 at 5:13 Comment(3)
Does the transfer use data compression/decompression? - Aqualung
@mjn: yes (slices are already zipped before being uploaded + I use Indy's TIdCompressorZLib) - Mixup
@kobik: fairly straightforward PHP code (move_uploaded_file() + md5 checking + simple SQL insert). I measured the PHP timing, it's definitely not the bottleneck. - Mixup

I would suggest doing a single TIdHTTP.Post() for the entire file without chunking it at all. You can use the TIdHTTP.OnWork... events to keep track of how many bytes were sent to the server so you know where to resume from if needed. When resuming, you can use the TIdHTTP.Request.CustomHeaders property to include a custom header that tells the server where you are resuming from, so it can roll back its previous file to the specified offset before accepting the new data.
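
The roll-back-and-append idea can be sketched as follows. This is a minimal illustration in Python, not Delphi or PHP, and it only models the file handling on each side, not the HTTP transfer. The function names `accept_upload` and `remaining_bytes` are hypothetical, and `offset` stands for whatever value the client would send in its custom resume header.

```python
import os


def accept_upload(path, offset, data):
    """Server side (sketch): roll the partial file back to `offset`,
    then append `data`. Returns the resulting file size."""
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        f.seek(0, os.SEEK_END)
        have = f.tell()
        if offset > have:
            # The client claims more bytes than the server actually stored.
            raise ValueError("resume offset is past the end of the stored file")
        f.truncate(offset)  # discard anything written after the resume point
        f.seek(offset)
        f.write(data)
        return f.tell()


def remaining_bytes(local_path, offset):
    """Client side (sketch): the part of the local file still to be sent.

    Read fully into memory here for brevity; a real client would pass a
    stream positioned at `offset` so memory use stays constant.
    """
    with open(local_path, "rb") as f:
        f.seek(offset)
        return f.read()
```

Truncating before appending matters because the server may have stored more bytes than the client saw acknowledged, so the two sides have to agree on a single offset before new data is written.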

Bracey answered 29/2, 2012 at 1:0 Comment(8)
That's great, I didn't know I could resume a POST. Let me see if I got this right: in the PHP code, I add this --> header('Accept-Ranges: bytes'); and in Delphi if I add this (just an example): IdHTTP.Request.CustomHeaders.Add('Range: bytes=5000-'); the HTTP POST will automatically discard extra bytes (roll back) & pick up from the 5000th byte, is that correct? - Mixup
To resume a previous POST, you can pass in a TStream that has just the remaining data in it. But the server has to support the resume and append the new data to the existing file, not overwrite the file fresh. Upload resuming is not part of the standard HTTP protocol. The Accept-Ranges response header and the Range request header are only for downloads, not uploads. When I mentioned a custom header, I was referring to a custom X-... header of your own design that your PHP code can look for, e.g.: X-Resuming-From: .... - Bracey
Or the Content-Range header, though RFC 2616 suggests that it is usually only used in responses, not in requests. - Bracey
You could alternatively write separate scripts, one to POST to when sending data for a new upload, and another one to POST remaining data to when resuming a previous upload. Then you don't have to use custom request headers. You could send a HEAD request to determine how many bytes the server actually has available before starting a resume. - Bracey
Thank you Remy, but I'm a bit confused: in order to send the 'HEAD' request I need to know the file path/reference, but if I'm not mistaken the PHP script only starts its execution after the file upload has been completed, or am I missing something here? Can you elaborate a little bit? Thanks!! - Mixup
You are probably thinking of PUT, or maybe $_FILES. Both of those are meant for working with complete files only. The response to PUT tells the client the URL of the file that was created so it can be accessed later. A generic POST, on the other hand, is just arbitrary data, and the receiving script decides what to do with that data. You can use $_POST, $HTTP_RAW_POST_DATA, or fopen("php://input") to access the raw data and do whatever you want with it. Just be careful because $_POST and $HTTP_RAW_POST_DATA are limited by php.ini directives, but php://input is not. - Bracey
Thank you Remy, just to make sure I understood you correctly, let's say I have this Delphi code and this old PHP code, you mean I'd have to change the PHP code to something like this? - Mixup
Thank you Remy and sorry if I had to torture you with my stupid questions over and over! For those interested in the solution, please take a look at the linked question, and the chat messages here. Delphi code is here and PHP code is here - Mixup
