Faster (unsafe) BinaryReader in .NET

I came across a situation where I have a pretty big file that I need to read binary data from.

In the process, I realized that the default BinaryReader implementation in .NET is pretty slow. Upon looking at it with .NET Reflector, I came across this:

public virtual int ReadInt32()
{
    if (this.m_isMemoryStream)
    {
        MemoryStream stream = this.m_stream as MemoryStream;
        return stream.InternalReadInt32();
    }
    this.FillBuffer(4);
    return (((this.m_buffer[0] | (this.m_buffer[1] << 8)) | (this.m_buffer[2] << 0x10)) | (this.m_buffer[3] << 0x18));
}

Which strikes me as extremely inefficient, considering that computers have been designed to work with 32-bit values ever since the 32-bit CPU was invented.

So I made my own (unsafe) FastBinaryReader class with code such as this instead:

public unsafe class FastBinaryReader : IDisposable
{
    // Shared scratch buffer for reads (note: static, so not safe for concurrent readers).
    private static byte[] buffer = new byte[50];

    public Stream BaseStream { get; private set; }

    public FastBinaryReader(Stream input)
    {
        BaseStream = input;
    }

    public int ReadInt32()
    {
        BaseStream.Read(buffer, 0, 4);

        // Reinterpret the first 4 bytes as an int in native byte order (little-endian on x86).
        fixed (byte* numRef = &(buffer[0]))
        {
            return *((int*)numRef);
        }
    }
...
}

Which is much faster: I managed to shave 5-7 seconds off the time it took to read a 500 MB file, but it's still pretty slow overall (29 seconds initially, ~22 seconds now with my FastBinaryReader).

It still kind of baffles me why it takes so long to read such a relatively small file. If I copy the file from one disk to another it takes only a couple of seconds, so disk throughput is not an issue.

I further inlined the ReadInt32, etc. calls, and I ended up with this code:

using (var br = new FastBinaryReader(new FileStream(cacheFilePath, FileMode.Open, FileAccess.Read, FileShare.Read, 0x10000, FileOptions.SequentialScan)))
{
    while (br.BaseStream.Position < br.BaseStream.Length)
    {
        var doc = DocumentData.Deserialize(br);
        docData[doc.InternalId] = doc;
    }
}

public static unsafe DocumentData Deserialize(FastBinaryReader reader)
{
    // Fixed-size record: int, int, long, float, float, byte, int = 29 bytes.
    byte[] buffer = new byte[4 + 4 + 8 + 4 + 4 + 1 + 4];
    reader.BaseStream.Read(buffer, 0, buffer.Length);

    DocumentData data = new DocumentData();
    fixed (byte* numRef = &(buffer[0]))
    {
        data.InternalId = *((int*)&(numRef[0]));
        data.b = *((int*)&(numRef[4]));
        data.c = *((long*)&(numRef[8]));
        data.d = *((float*)&(numRef[16]));
        data.e = *((float*)&(numRef[20]));
        data.f = numRef[24];
        data.g = *((int*)&(numRef[25]));
    }
    return data;
}

Any further ideas on how to make this even faster? I was thinking maybe I could use marshalling to map the entire file straight into memory on top of some custom structure, since the data is linear, fixed-size, and sequential.
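
For what it's worth, the memory-mapping idea mentioned above can be sketched roughly as follows, assuming .NET 4+ and the same 29-byte record layout as Deserialize. DocRecord, MappedReader, and onRecord are illustrative names, not part of the original code:

using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;

// Mirrors the 29-byte on-disk record read by Deserialize above (Pack = 1 avoids padding).
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct DocRecord
{
    public int InternalId;
    public int B;
    public long C;
    public float D;
    public float E;
    public byte F;
    public int G;
}

static class MappedReader
{
    public static void ReadAll(string cacheFilePath, Action<DocRecord> onRecord)
    {
        long fileLength = new FileInfo(cacheFilePath).Length;
        int recordSize = Marshal.SizeOf(typeof(DocRecord)); // 29 bytes with Pack = 1

        using (var mmf = MemoryMappedFile.CreateFromFile(
                   cacheFilePath, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (var accessor = mmf.CreateViewAccessor(0, fileLength, MemoryMappedFileAccess.Read))
        {
            for (long offset = 0; offset + recordSize <= fileLength; offset += recordSize)
            {
                DocRecord record;
                accessor.Read(offset, out record); // copies one record out of the mapped view
                onRecord(record);
            }
        }
    }
}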

SOLVED: I came to the conclusion that FileStream's buffering/BufferedStream are flawed. Please see the accepted answer and my own answer (with the solution) below.

Halfway answered 6/8, 2009 at 11:47 Comment(1)
It may be helpful: #19558935 – Heinrich

When you do a file copy, large chunks of data are read and written to disk.

You are reading the entire file four bytes at a time. This is bound to be slower. Even if the stream implementation is smart enough to buffer, you still have at least 500 MB / 4 bytes = 131,072,000 API calls.

Wouldn't it be wiser to just read a large chunk of data, then go through it sequentially, and repeat until the file has been processed?
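
A minimal sketch of that approach (the 29-byte record size, path, and the field parsing are placeholders for whatever layout the file actually uses; BitConverter stands in for the unsafe casts):

const int recordSize = 29;              // size of one fixed-size record on disk
const int recordsPerChunk = 32 * 1024;  // read roughly 1 MB worth of records at a time

using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                   FileShare.Read, 0x10000, FileOptions.SequentialScan))
{
    byte[] chunk = new byte[recordSize * recordsPerChunk];
    int bytesRead;
    while ((bytesRead = stream.Read(chunk, 0, chunk.Length)) > 0)
    {
        // Walk the chunk record by record. For simplicity this assumes each Read
        // returns a whole number of records, which a local FileStream delivers in practice.
        for (int offset = 0; offset + recordSize <= bytesRead; offset += recordSize)
        {
            int internalId = BitConverter.ToInt32(chunk, offset);
            // ... read the remaining fields at their offsets and build the object ...
        }
    }
}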

Dyadic answered 6/8, 2009 at 11:59 Comment(5)
There's a parameter in the FileStream constructor which specifies the buffer size, so the read is indeed done in chunks. I tried various values for the buffer size, but there were no major improvements. Extra-large buffer sizes actually hurt performance in my measurements. – Halfway
Still, you are making an immense number of calls to ReadInt32. Just reading the values yourself from a consecutive piece of memory will be much quicker. – Dyadic
Please re-read the question: I am not using ReadInt32 in the actual implementation; there is only one read per object, and all the conversions are inlined. See the last two blocks of code. – Halfway
Right... sorry about that. I guess the immense number of small memory allocations might be the problem, then. – Dyadic
I will mark yours as the accepted answer because you suggested reading large chunks of data from the file. That would have been redundant if the FileStream's buffering implementation weren't flawed, but apparently it is. – Halfway

I ran into a similar performance issue with BinaryReader/FileStream, and after profiling, I discovered that the problem isn't with FileStream buffering, but instead with this line:

while (br.BaseStream.Position < br.BaseStream.Length) {

Specifically, the property br.BaseStream.Length on a FileStream makes a (relatively) slow system call to get the file size on each loop iteration. After changing the code to this:

long length = br.BaseStream.Length;
while (br.BaseStream.Position < length) {

and using an appropriate buffer size for the FileStream, I achieved similar performance to the MemoryStream example.

Soda answered 26/4, 2012 at 23:39 Comment(0)

Interestingly, reading the whole file into a buffer and going through it in memory made a huge difference. This comes at the cost of memory, but we have plenty.

This makes me think that the FileStream's (or BufferedStream's for that matter) buffer implementation is flawed, because no matter what size buffer I tried, performance still sucked.

  using (var br = new FileStream(cacheFilePath, FileMode.Open, FileAccess.Read, FileShare.Read, 0x10000, FileOptions.SequentialScan))
  {
      byte[] buffer = new byte[br.Length];
      br.Read(buffer, 0, buffer.Length);
      using (var memoryStream = new MemoryStream(buffer))
      {
          while (memoryStream.Position < memoryStream.Length)
          {
              var doc = DocumentData.Deserialize(memoryStream);
              docData[doc.InternalId] = doc;
          }
      }
  }

Down from 22 seconds to 2-5 seconds (depending on the disk cache, I'm guessing), which is good enough for now.

Halfway answered 6/8, 2009 at 12:21 Comment(3)
So my answer wasn't that flawed ;^) – Dyadic
Thanks. But there's actually a problem with .NET's buffer implementation, because I tried a buffer size exactly as big as the file (which should have been equivalent to the intermediary MemoryStream), and that still sucked performance-wise. In theory your suggestion should have been redundant, but in practice: jackpot. – Halfway
You can just say var buffer = File.ReadAllBytes(cacheFilePath); it saves some code and it's much faster. – Surreptitious
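
For reference, the File.ReadAllBytes variant from the comment above boils down to something like this (a sketch, assuming the whole file fits comfortably in memory):

byte[] buffer = File.ReadAllBytes(cacheFilePath);  // one big read instead of FileStream + manual buffer
using (var memoryStream = new MemoryStream(buffer))
{
    while (memoryStream.Position < memoryStream.Length)
    {
        var doc = DocumentData.Deserialize(memoryStream);
        docData[doc.InternalId] = doc;
    }
}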

One caveat: you might want to double-check your CPU's endianness... assuming little-endian is not quite safe (think: Itanium, etc.).

You might also want to see if BufferedStream makes any difference (I'm not sure it will).
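
If portability ever becomes a concern, one cheap guard is to check BitConverter.IsLittleEndian once at startup before taking the raw-cast path; a sketch:

// Fail fast on big-endian CPUs rather than silently reading garbled values.
if (!BitConverter.IsLittleEndian)
{
    throw new PlatformNotSupportedException(
        "FastBinaryReader assumes a little-endian CPU; add byte-swapping before using it here.");
}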

Cheese answered 6/8, 2009 at 11:50 Comment(2)
Yup, I'm aware of endianness issues, but this is a proprietary application where I have full control over deployment. Regarding BufferedStream, from my understanding the FileStream is already buffered, so it would just add an unnecessary intermediary buffer. P.S.: I'm also using your protobuf library in this project, so many thanks for that :) – Halfway
I just made a new measurement with a wrapping BufferedStream, and as anticipated, there is no difference. – Halfway

I used to write, into the first bytes of the binary file, the total number of rows of data in the file, or the number of bytes that one row of data requires.
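
That header-based layout might look roughly like the sketch below. TickRecord, the field set, and the helper names are invented for illustration; the original code was not posted:

using System.Collections.Generic;
using System.IO;

// A fixed-size record; the row count goes into the file header, rows follow back to back.
struct TickRecord { public long Timestamp; public double Price; public int Volume; }

static class TickFile
{
    public static void WriteAll(string path, IReadOnlyList<TickRecord> records)
    {
        using (var writer = new BinaryWriter(File.Create(path)))
        {
            writer.Write(records.Count);          // header: total number of rows
            foreach (var r in records)            // fixed-size rows
            {
                writer.Write(r.Timestamp);
                writer.Write(r.Price);
                writer.Write(r.Volume);
            }
        }
    }

    public static TickRecord[] ReadAll(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            int count = reader.ReadInt32();       // header written above
            var result = new TickRecord[count];
            for (int i = 0; i < count; i++)
            {
                result[i].Timestamp = reader.ReadInt64();
                result[i].Price = reader.ReadDouble();
                result[i].Volume = reader.ReadInt32();
            }
            return result;
        }
    }
}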

However, I later discovered a solution called TeaFiles, which performs twice as fast as even the raw binary file solution I developed. Interestingly, it looks like the amount of disk space required is exactly the same as what the binary file requires, so what this library does under the hood probably has a lot in common with it.


On 2 million+ time-series records, I get the following read performance for the different solutions:

  • SQLite: 11287 ms
  • JSON: 3842 ms
  • BIN (gzip compressed): 35308 ms
  • BIN (non-compressed): 7058 ms
  • TEA: 595 ms
  • CSV: 3074 ms
  • BIN (struct instead of class, non-compressed): 11042 ms
  • BIN (custom logic, for writing Pure binary using the BinaryReader/Writer): 930 ms

In my tests, nothing beats TeaFiles. Sorry for not posting the complete code for all the different options. You can run some tests and see for yourself whether the proposed option is any good.

One thing to keep in mind is that there is no way to remove a row from the file. So with most of the solutions above, excluding the fully fledged SQL-based solution (SQLite), you essentially have to read the file, append a new row, and either re-write the existing file or create a new version of it. So the use cases for non-SQL solutions are situational, as with most things in life :)

P.S. If I am not lazy and have some time in the future, I will update this post with a link to a reproduction code repo.

Magellan answered 5/1, 2023 at 12:48 Comment(0)
