Is Stream.Copy piped?

Suppose I am writing a TCP proxy. I am reading from the incoming stream and writing to the output stream. I know that Stream.Copy uses a buffer, but my question is: does the Stream.Copy method write to the output stream while fetching the next chunk from the input stream, or is it a loop like "read chunk from input, write chunk to output, read chunk from input, etc."?

Mesocarp answered 15/9, 2012 at 15:42 Comment(3)
Interesting question; I don't actually know without checking. Of course, it should be noted that doing this would require 2 separate buffers (or two separate portions of a single buffer).Japha
Yes, it's pretty obvious that a double buffer is needed, but I'm not sure that the Stream.Copy function is that smart.Mesocarp
It should be noted that if it used a pipe it would then be doing two stream copies, which would then involve a further two pipes, which ...Tubercle

Here's the implementation of CopyTo in .NET 4.5:

private void InternalCopyTo(Stream destination, int bufferSize)
{
    int num;
    byte[] buffer = new byte[bufferSize];
    while ((num = this.Read(buffer, 0, buffer.Length)) != 0)
    {
        destination.Write(buffer, 0, num);
    }
}

So as you can see, it reads from the source, then writes to the destination. This could probably be improved ;)


EDIT: here's a possible implementation of a piped version:

public static void CopyToPiped(this Stream source, Stream destination, int bufferSize = 0x14000)
{
    byte[] readBuffer = new byte[bufferSize];
    byte[] writeBuffer = new byte[bufferSize];

    // Prime the pipeline with the first chunk.
    int bytesRead = source.Read(readBuffer, 0, bufferSize);
    while (bytesRead > 0)
    {
        // The chunk just read becomes the write buffer; the other buffer is reused for the next read.
        Swap(ref readBuffer, ref writeBuffer);
        var iar = destination.BeginWrite(writeBuffer, 0, bytesRead, null, null);
        // Read the next chunk while the previous one is still being written.
        bytesRead = source.Read(readBuffer, 0, bufferSize);
        destination.EndWrite(iar);
    }
}

static void Swap<T>(ref T x, ref T y)
{
    T tmp = x;
    x = y;
    y = tmp;
}

Basically, it reads a chunk synchronously, starts copying it to the destination asynchronously, then reads the next chunk and waits for the write to complete.
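
For reference, calling the extension method is straightforward. Here's a sketch of a call site, assuming two files on different drives (the paths are just placeholders, not the actual files used in the tests below):

using (var source = File.OpenRead(@"D:\source.bin"))          // hypothetical source file
using (var destination = File.Create(@"E:\destination.bin"))  // hypothetical destination on another drive
{
    source.CopyToPiped(destination);  // the extension method defined above
}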

I ran a few performance tests:

  • using MemoryStreams, I didn't expect a significant improvement, since it doesn't use IO completion ports (AFAIK); and indeed, the performance is almost identical
  • using files on different drives, I expected the piped version to perform better, but it doesn't... it's actually slightly slower (by 5 to 10%)

So it apparently doesn't bring any benefit, which is probably the reason why it isn't implemented this way...

Abidjan answered 15/9, 2012 at 22:17 Comment(6)
I believe the buffer size plays an important role when using files.Mesocarp
@IsraelLot, yes, probably. The default buffer size I used is the same as in the default Stream.Copy implementation.Abidjan
The ideal solution would be to alternate between two buffers: https://mcmap.net/q/52965/-net-asynchronous-stream-read-writeJinnah
@ivan_pozdeev, your edit to my code was incorrect; it caused an empty buffer to be written to the output stream. I rolled it back. Please don't make significant edits to other people's code, unless it's just to fix an obvious syntax error or typo.Abidjan
@ThomasLevesque: Writing 0 bytes should be a no-op - what's wrong with that?Jinnah
@ThomasLevesque: I just didn't like the reference juggling. The C-style array index technique looks more streamlined.Jinnah

According to Reflector, it does not. Such behavior would have to be documented, because it would introduce concurrency, and that is never safe to assume in general. So the API design decision not to "pipe" is sound.

So this is not just a question of Stream.Copy being more or less smart: copying concurrently is not a mere implementation detail.
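
For example (a hypothetical wrapper stream, purely to illustrate the point): a Stream implementation that updates unsynchronized state in both Read and Write is perfectly fine with today's sequential copy loop, but would race if CopyTo ever overlapped the two calls on different threads.

using System.IO;

// Hypothetical wrapper stream, only to illustrate the concern; not framework code.
class CountingStream : Stream
{
    private readonly Stream _inner;
    private long _bytesTransferred; // deliberately unsynchronized

    public CountingStream(Stream inner) { _inner = inner; }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int n = _inner.Read(buffer, offset, count);
        _bytesTransferred += n; // safe while Read and Write are called sequentially,
                                // a data race if they ever ran concurrently
        return n;
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _inner.Write(buffer, offset, count);
        _bytesTransferred += count;
    }

    public override bool CanRead { get { return _inner.CanRead; } }
    public override bool CanSeek { get { return _inner.CanSeek; } }
    public override bool CanWrite { get { return _inner.CanWrite; } }
    public override long Length { get { return _inner.Length; } }
    public override long Position
    {
        get { return _inner.Position; }
        set { _inner.Position = value; }
    }
    public override void Flush() { _inner.Flush(); }
    public override long Seek(long offset, SeekOrigin origin) { return _inner.Seek(offset, origin); }
    public override void SetLength(long value) { _inner.SetLength(value); }
}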

Mccollum answered 15/9, 2012 at 15:51 Comment(0)

Stream.Copy is a synchronous operation. I don't think it is reasonable to expect it to use asynchronous reads/writes to read and write simultaneously.

I would expect an asynchronous version (like RandomAccessStream.CopyAsync) to use simultaneous reads and writes.

Note: using multiple threads during the copy would be unwelcome behavior, but using asynchronous reads and writes so that they run at the same time is OK.
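
To make that concrete, here is a minimal sketch of what such an overlapped copy could look like with the Task-based APIs from .NET 4.5 (the name CopyToOverlappedAsync is mine, it assumes System.IO and System.Threading.Tasks are imported, and it only illustrates the idea, not the actual implementation of RandomAccessStream.CopyAsync):

public static async Task CopyToOverlappedAsync(this Stream source, Stream destination, int bufferSize = 0x14000)
{
    byte[] readBuffer = new byte[bufferSize];
    byte[] writeBuffer = new byte[bufferSize];

    int bytesRead = await source.ReadAsync(readBuffer, 0, bufferSize);
    while (bytesRead > 0)
    {
        // Swap buffers: the chunk just read is written out while the next chunk
        // is read into the other buffer.
        byte[] tmp = readBuffer; readBuffer = writeBuffer; writeBuffer = tmp;

        Task writeTask = destination.WriteAsync(writeBuffer, 0, bytesRead);
        bytesRead = await source.ReadAsync(readBuffer, 0, bufferSize);
        await writeTask;
    }
}

As in the CopyToPiped sketch above, no extra thread is created: the pending write and the next read are simply in flight at the same time, and both are awaited before the next iteration.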

Pouter answered 15/9, 2012 at 17:54 Comment(6)
Asynchronous does not imply concurrent. Going concurrent with proper documentation and being very explicit about it would be very dangerous. Almost nothing is thread-safe by default.Mccollum
@usr, not sure what your comment is about... maybe confusing usage of "parallel"? See edit (parallel -> simultaneous).Pouter
Sorry, my comment was unclear. Referring to this: "I would expect an asynchronous version ... to ... simultaneous read and write": RandomAccessStream.CopyAsync can't do this simultaneously, because that would potentially introduce race conditions in user/framework code. Both read and write could access a shared variable or some such.Mccollum
@Mccollum What race conditions? I would be very surprised if a stream that supports async read/write operations (the BeginXXX/EndXXX pair, or XXXAsync in 4.5) would explicitly implement them to have race conditions. As for user code: asynchronous operations call back on the same thread they were started on, so there are no concurrency issues there. The only immediate problem I can see is if the same buffer is used as the destination of a Read and the source of a Write, but that is only an issue if whoever calls the asynchronous methods explicitly shares it...Pouter
The read and write calls might run on a custom stream implementation. They might, for example, increment a "static int readWriteCount" variable. If called concurrently, a race exists. Again: asynchrony does not imply parallelism. For that reason the async read and write calls need not be thread-safe internally; they may expect to run on one thread at a time (although on different threads over time).Mccollum
@usr, I guess we are both arguing for exactly the same thing: there is no word that describes "two operations running at the same time on the same thread" (as in the case of starting BeginRead and BeginWrite operations one after the other in the code). Or am I missing something again?Pouter

Writing to the output stream while fetching the next chunk is impossible (when using one buffer), because fetching the next chunk can overwrite the buffer while it's being used for output.
You could use double buffering, but it's pretty much the same as using a double-sized buffer.

Pullen answered 15/9, 2012 at 15:46 Comment(1)
Since (as you yourself note) we can use a separate buffer (or fragment of buffer), it is clearly not "impossible".Japha
