Calculating Time Remaining on File Copy

I have an app that copies a large number of files across the network to a file server (not web). I am trying to display a half-decent estimate of the time remaining.

I have looked at a number of articles on SO, and while the problem is addressed, none of the solutions I have tried really does what I want. I want the estimated time remaining to be relatively stable, i.e. not jump around all over the place depending on fluctuating transfer speeds.

So the first solution I looked at was to calculate the transfer speed in bytes per second

double bytePerSec = totalBytesCopied / TimeTaken.TotalSeconds;

And then divide the total bytes remaining by the transfer rate:

double secRemain = (totalFileSizeToCopy - totalBytesCopied) / bytePerSec;

I figured that the time remaining would become more stable once a few MB had been copied (although I still expected it to change). It doesn't; it's erratic and jumps around all over the place.

Then I tried one of the solutions on SO:

double secRemain = (TimeTaken.TotalSeconds / totalBytesCopied) * (totalFileSizeToCopy - totalBytesCopied);

It is essentially the same calculation, but I hoped it might make a difference!

So now I am thinking I need to approach this from a different angle, e.g. use averages? Use some kind of countdown timer and reset the time to go every so often? I'm just looking for opinions, or preferably advice from anyone who has already had this problem.

Roque answered 19/2, 2014 at 12:22 Comment(11)
You might also want to consider that each file may come with a per-file lag time for opening a stream and creating a new file somewhere. Will it take longer to copy a million 1 kB files than to copy one 1 GB file? - Voluntarism
Is it possible for you to show a progress bar with "Copying 1 of XXX"? - Gudrun
This link may be what you want. - Jaguarundi
@ShellShock Yes, absolutely, but there has to be a way to calculate this and stabilize the result to give a rough estimate of how much longer the total transfer will take. - Roque
Suppose that your network hardware was already under maximum load when you start the transfer, and the formula indicates it will take 2 hours to complete the file transfer. Now after the first 10 minutes, the process that was hogging your resources exits and it's looking like the rest of your file transfer will only take 5 more minutes. What do you want the timer to indicate? Repeat the same thought experiment for 20, 30, 40, ... minutes into the file transfer. - Missilery
My conclusion is that you should base your formula on only the most recent transfer speeds. For example, you could measure the transfer speed every second, then use the average of only the last 10 seconds to calculate the remaining time. - Missilery
If this doesn't work, the problem is most likely that most of the time isn't spent sending the file contents, but rather on the overhead around it. - Getraer
@StevenLiekens OK, I will be honest, I was about to say I get all that! I understand I cannot rely on network congestion, server load, etc., but my attempts so far were being recalculated each time the file stream wrote a block to the destination path. This means it's recalculating every 0.2 seconds or so, which is why it's so erratic. So using your suggestion and bringing in the file count to somehow compensate for overheads might smooth things out a bit? - Roque
Okay, so what if you only displayed the estimate once every 5 seconds? Or what if you rounded the resulting value? - Getraer
@Roque I used 10 seconds as an example, but it's up to you. The longer the timespan, the smoother your timer will transition, but the longer it will take for your timer to respond to dramatic changes. Suppose that you pull the network cable. It would take 10 seconds before your timer would finally indicate that it will take an infinite amount of time to complete the transfer (at an average of 0 bytes/s). - Missilery
@Roque I'm getting a good balance between stability and accuracy when I test with an average calculated from the 1 to 30 most recent measurements, 1 measurement per second. - Missilery

Here's a working example of how you would asynchronously copy a file D:\dummy.bin to D:\dummy.bin.copy, with a timer taking snapshots of the transfer rate every second.

From that data, I simply take the average transfer rate over the most recent snapshots (up to 30). From that I can calculate a rough estimate of how long it will take to transfer the rest of the file.

This example is provided as-is and does not support copying multiple files in one operation, but it should give you some ideas.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;

public class Program
{
    public static void Main(string[] args)
    {
        var sourcePath = @"D:\dummy.bin";
        var destinationPath = @"D:\dummy.bin.copy";
        var sourceFile = new FileInfo(sourcePath);
        var fileSize = sourceFile.Length;
        var currentBytesTransferred = 0L;
        var totalBytesTransferred = 0L;
        var snapshots = new Queue<long>(30);
        var timer = new System.Timers.Timer(1000D);
        timer.Elapsed += (sender, e) =>
        {
            // Remember only the last 30 snapshots; discard older snapshots
            if (snapshots.Count == 30)
            {
                snapshots.Dequeue();
            }

            snapshots.Enqueue(Interlocked.Exchange(ref currentBytesTransferred, 0L));
            var averageSpeed = snapshots.Average();
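            // Read the 64-bit counter atomically; it is updated from the I/O callbacks.
            var bytesLeft = fileSize - Interlocked.Read(ref totalBytesTransferred);
            Console.WriteLine("Average speed: {0:0.0} MBytes / second", averageSpeed / (1024 * 1024));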
            if (averageSpeed > 0)
            {
                var timeLeft = TimeSpan.FromSeconds(bytesLeft / averageSpeed);
                var timeLeftRounded = TimeSpan.FromSeconds(Math.Round(timeLeft.TotalSeconds));
                Console.WriteLine("Time left: {0}", timeLeftRounded);
            }
            else
            {
                Console.WriteLine("Time left: Infinite");
            }
        };

        using (var inputStream = sourceFile.OpenRead())
        using (var outputStream = File.OpenWrite(destinationPath))
        {
            timer.Start();
            var buffer = new byte[4096];
            var numBytes = default(int);
            var numBytesMax = buffer.Length;
            var timeout = TimeSpan.FromMinutes(10D);
            do
            {
                var mre = new ManualResetEvent(false);
                inputStream.BeginRead(buffer, 0, numBytesMax, asyncReadResult =>
                {
                    numBytes = inputStream.EndRead(asyncReadResult);
                    outputStream.BeginWrite(buffer, 0, numBytes, asyncWriteResult =>
                    {
                        outputStream.EndWrite(asyncWriteResult);
                        // Interlocked.Add already updates the variable atomically; assigning
                        // its result back would race with the timer's Interlocked.Exchange.
                        Interlocked.Add(ref currentBytesTransferred, numBytes);
                        Interlocked.Add(ref totalBytesTransferred, numBytes);
                        mre.Set();
                    }, null);
                }, null);
                mre.WaitOne(timeout);
            } while (numBytes != 0);
            timer.Stop();
        }
    }
}
Missilery answered 19/2, 2014 at 14:45 Comment(0)

You want to calculate the estimate from the average transfer rate, but it should be a moving average, since network speed varies over the life of the file transfer (especially for very large files). Here's the JavaScript method I came up with that seems to work well (it should be easy to convert to C#).

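var rates = [];
var ratesLength = 1000;
for (var i = 0; i < ratesLength; i++)
    rates[i] = 0;

// State carried between calls (not declared in the original snippet).
var rateIndex = 0;
var oldBytesTransfered = 0;
var oldTime = Date.now();

/**
* Estimates the remaining download time of the file.
*
* @param {number} bytesTransferred Number of bytes transferred so far.
* @param {number} totalBytes Total number of bytes to be transferred.
* @return {string} The estimated time remaining in the upload/download.
*/
function getSpeed(bytesTransferred, totalBytes) {
    var currentTime = Date.now();
    var bytes = bytesTransferred - oldBytesTransfered;
    var time = (currentTime - oldTime) / 1000; // seconds since the last chunk

    if ((time != 0) && (bytes != 0)) {
        // Remember this chunk so the next call measures from here.
        oldBytesTransfered = bytesTransferred;
        oldTime = currentTime;

        rates[rateIndex] = bytes / time; // bytes per second
        rateIndex = (rateIndex + 1) % rates.length;
        var avgSpeed = 0;
        var count = 0;
        // Average only the slots that have been filled so far.
        for (var i = 0; i < rates.length; i++) {
            if (rates[i] != 0) {
                avgSpeed += rates[i];
                count++;
            }
        }
        if (count == 0)
            return " ";

        avgSpeed /= count;
        // Assumption: humanReadableTime is defined elsewhere and takes seconds.
        return (humanReadableTime((totalBytes - bytesTransferred) / avgSpeed) + " remaining");
    } else {
        return " ";
    }
}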

You can tweak ratesLength to get the smoothness you want. oldTime is the time at which the last chunk of bytes was received, and oldBytesTransfered is the total number of bytes transferred as of the last chunk. bytesTransferred is the total amount transferred, including the current chunk.

I've found that a moving average like this has good accuracy but also reacts well to changes in speed.
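
The loop above gives every stored rate the same weight. If recent samples should count for more, one option is an exponentially weighted moving average. The following is only a minimal sketch under that assumption: createEtaEstimator, addSample, secondsRemaining and the alpha parameter are hypothetical names, not part of the code above.

```javascript
// Hypothetical helper: exponentially weighted moving average of the transfer rate.
// alpha near 1 reacts quickly to speed changes; alpha near 0 gives a steadier estimate.
function createEtaEstimator(alpha) {
    var smoothedRate = null; // bytes per second; null until the first sample arrives

    return {
        // bytes: bytes moved since the previous sample; seconds: how long that took.
        addSample: function (bytes, seconds) {
            if (seconds <= 0) return; // ignore unusable samples
            var rate = bytes / seconds;
            smoothedRate = (smoothedRate === null)
                ? rate
                : alpha * rate + (1 - alpha) * smoothedRate;
        },
        // Estimated seconds remaining, or Infinity when there is no measured progress.
        secondsRemaining: function (bytesLeft) {
            if (smoothedRate === null || smoothedRate <= 0) return Infinity;
            return bytesLeft / smoothedRate;
        }
    };
}

var eta = createEtaEstimator(0.5);
eta.addSample(1000, 1); // first sample: 1000 B/s
eta.addSample(3000, 1); // smoothed rate becomes 0.5 * 3000 + 0.5 * 1000 = 2000 B/s
console.log(eta.secondsRemaining(10000)); // 5
```

Unlike the fixed-size array, this keeps a single number of state, so alpha takes over the role that ratesLength plays above.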

Venn answered 19/2, 2014 at 12:52 Comment(0)

You'll likely want to calculate the estimate against the average transfer rate. That should help stabilize the estimate. There's another SO question which asks the same question.

Pru answered 19/2, 2014 at 12:29 Comment(0)

This could be one approach:

Calculate the bytes/second once 25% of the first file has been copied.

Use that figure to estimate the remaining 75% of the first file operation.

Every time a file has been copied, store its average speed in bytes/second in a collection.

Then use the average of the items in the collection to estimate the copy time for the files that have not yet been copied.
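
The steps above could be sketched roughly as follows, in JavaScript to match the other answer. All names here (createPerFileEstimator, recordFile, currentFileSeconds, pendingFilesSeconds) are hypothetical. The sketch also assumes the 25% rule is applied to whichever file is currently being copied, not only the first one.

```javascript
// Hypothetical tracker for the per-file approach described above.
function createPerFileEstimator() {
    var fileSpeeds = []; // average bytes/second of each completed file

    function average(values) {
        var sum = 0;
        for (var i = 0; i < values.length; i++) sum += values[i];
        return sum / values.length;
    }

    return {
        // Call once a file has finished copying.
        recordFile: function (fileBytes, secondsTaken) {
            if (secondsTaken > 0) fileSpeeds.push(fileBytes / secondsTaken);
        },
        // Seconds left for the file in progress; null until 25% of it has been copied.
        currentFileSeconds: function (bytesCopied, fileBytes, secondsSoFar) {
            if (secondsSoFar <= 0 || bytesCopied < 0.25 * fileBytes) return null;
            var speed = bytesCopied / secondsSoFar;
            return (fileBytes - bytesCopied) / speed;
        },
        // Seconds needed for files not yet started; null until one file has completed.
        pendingFilesSeconds: function (pendingBytes) {
            if (fileSpeeds.length === 0) return null;
            return pendingBytes / average(fileSpeeds);
        }
    };
}

var est = createPerFileEstimator();
console.log(est.currentFileSeconds(250, 1000, 5)); // 15 (50 B/s, 750 bytes left)
est.recordFile(1000, 20);                          // 50 B/s
est.recordFile(3000, 20);                          // 150 B/s
console.log(est.pendingFilesSeconds(5000));        // 50 (average of 100 B/s)
```

Averaging per-file speeds rather than raw byte rates means the per-file overhead mentioned in the comments on the question is included in each measurement.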

Let me know if you need more details or approaches.

Gudrun answered 19/2, 2014 at 12:30 Comment(0)
