Fast reading of little endian integers from file
M

4

11

I need to read a binary file consisting of 4 byte integers (little endian) into a 2D array for my Android application. My current solution is the following:

DataInputStream inp = null;
try {
    inp = new DataInputStream(new BufferedInputStream(new FileInputStream(procData), 32768));
}
catch (FileNotFoundException e) {
    Log.e(TAG, "File not found");
}

int[][] test_data = new int[SIZE_X][SIZE_Y];
byte[] buffer = new byte[4];
ByteBuffer byteBuffer = ByteBuffer.allocate(4);
for (int i=0; i < SIZE_Y; i++) {
    for (int j=0; j < SIZE_X; j++) {
        inp.read(buffer);
        byteBuffer = ByteBuffer.wrap(buffer);
        test_data[j][SIZE_Y - i - 1] = byteBuffer.order(ByteOrder.LITTLE_ENDIAN).getInt();
    }
}

This is pretty slow: for a 2k*2k array it takes about 25 seconds. I can see in DDMS that the garbage collector is working overtime, which is probably one reason for the slowness.

There has to be a more efficient way of using the ByteBuffer to read that file into the array, but I'm not seeing it at the moment. Any idea on how to speed this up?

Mufinella answered 22/2, 2011 at 12:29 Comment(2)
Do you really need to read all the data at the same time? And do you access many entries often? If not, you can avoid parsing the whole array as integers. Just read or wrap the whole file, and provide only the needed entry by calculating its offset from the x and y coordinates.Ostap
@Luzifer I need all of the data at least once in the beginning.Mufinella
R
12

Why not read into a 4-byte buffer and then rearrange the bytes manually? It will look like this:

for (int i=0; i < SIZE_Y; i++) {
    for (int j=0; j < SIZE_X; j++) {
        inp.read(buffer);
        int nextInt = (buffer[0] & 0xFF) | (buffer[1] & 0xFF) << 8 | (buffer[2] & 0xFF) << 16 | (buffer[3] & 0xFF) << 24;
        test_data[j][SIZE_Y - i - 1] = nextInt;
    }
}

Of course, this assumes that read fills all four bytes. That is not guaranteed, so you should check its return value and handle a short read. This way you don't create any objects during reading (so no strain on the garbage collector) and you don't call any conversion methods; you just use bitwise operations.
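One way to cover the short-read case is sketched below. It swaps read for DataInputStream.readFully, which either fills the requested bytes or throws EOFException. The class name LittleEndianRead and the helpers toIntLE and readIntLE are made up for illustration, and the input is an in-memory array rather than the real file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class LittleEndianRead {
    // Assemble a little-endian int from the four bytes starting at off.
    // The & 0xFF masks stop sign extension when each byte is widened to int.
    static int toIntLE(byte[] b, int off) {
        return (b[off] & 0xFF)
             | (b[off + 1] & 0xFF) << 8
             | (b[off + 2] & 0xFF) << 16
             | (b[off + 3] & 0xFF) << 24;
    }

    // Read exactly four bytes into the caller's scratch array and convert them.
    // readFully throws EOFException if the stream ends before four bytes arrive.
    static int readIntLE(DataInputStream in, byte[] scratch) throws IOException {
        in.readFully(scratch, 0, 4);
        return toIntLE(scratch, 0);
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = {0x78, 0x56, 0x34, 0x12,
                      (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        byte[] scratch = new byte[4]; // reused for every value, so no per-int garbage
        System.out.println(Integer.toHexString(readIntLE(in, scratch))); // prints 12345678
        System.out.println(readIntLE(in, scratch));                      // prints -1
    }
}
```

In the nested loop from the question, readIntLE(inp, buffer) would take the place of the read call and the bit-shifting line.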

Rayburn answered 22/2, 2011 at 12:39 Comment(4)
Thanks, this version is about 5x as fast as my original one; it only takes 5 seconds now. I'm not used to fiddling around with the bits directly.Mufinella
This is the only working method I've found to convert raw bytes to unsigned int. Thanks!Cockleshell
Why do you make bitwise & with FF? Isn't byte written on 8 bits? If so, that operation wouldn't do anything... What am I missing?Kolomna
@Kolomna Because if you just cast a byte to int, you may get negative numbers as bytes are signed. For example, if a byte contains 0b11111111, it will become -1 and not 255.Rayburn
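The effect of the mask can be checked with a tiny, self-contained sketch. The class name ByteMaskDemo and both helper methods are illustrative only.

```java
public class ByteMaskDemo {
    // Plain widening conversion: the sign bit of the byte is extended.
    static int widenSigned(byte b) {
        return b;
    }

    // Masking with 0xFF keeps only the low 8 bits, giving a value in 0..255.
    static int widenUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;                 // bit pattern 11111111
        System.out.println(widenSigned(b));   // prints -1
        System.out.println(widenUnsigned(b)); // prints 255
    }
}
```

Without the mask, the sign-extended high bits of a negative byte would overwrite the higher bytes when the pieces are OR-ed together.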
R
5

If you are on a platform that supports memory-mapped files, consider MappedByteBuffer and friends from java.nio:

FileChannel channel = new RandomAccessFile(procData, "r").getChannel();
MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, 4 * SIZE_X * SIZE_Y);
map.order(ByteOrder.LITTLE_ENDIAN);
IntBuffer buffer = map.asIntBuffer();

int[][] test_data = new int[SIZE_X][SIZE_Y];
for (int i=0; i < SIZE_Y; i++) {
    for (int j=0; j < SIZE_X; j++) {
        test_data[j][SIZE_Y - i - 1] = buffer.get();
    }
}

If you need cross-platform support or your platform lacks memory-mapped buffers, you can still avoid performing the conversions yourself by using an IntBuffer. Consider dropping the BufferedInputStream, allocating a larger ByteBuffer yourself, and obtaining a little-endian IntBuffer view on the data. Then, in a loop, reset the buffer positions to 0, use DataInputStream.readFully to read a large region at once into the ByteBuffer's backing array, and pull the int values out of the IntBuffer.
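The chunked approach described above might look roughly like the following. This is only a sketch under a few assumptions: the class name ChunkedIntReader and its read method are invented for illustration, one file row (sizeX ints) is used as the chunk size, and absolute get(j) calls on the IntBuffer replace the position resets, since absolute reads never move the position.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

public class ChunkedIntReader {
    // Reads sizeX * sizeY little-endian ints, one file row per readFully call,
    // storing them with the same index mapping as the question's loop.
    static int[][] read(InputStream source, int sizeX, int sizeY) throws IOException {
        DataInputStream in = new DataInputStream(source);
        byte[] rowBytes = new byte[4 * sizeX];
        // The IntBuffer is a view over rowBytes, so refilling the array
        // changes the values the view returns; nothing is copied.
        IntBuffer ints = ByteBuffer.wrap(rowBytes)
                                   .order(ByteOrder.LITTLE_ENDIAN)
                                   .asIntBuffer();
        int[][] data = new int[sizeX][sizeY];
        for (int i = 0; i < sizeY; i++) {
            in.readFully(rowBytes); // throws EOFException if the input is too short
            for (int j = 0; j < sizeX; j++) {
                data[j][sizeY - i - 1] = ints.get(j); // absolute read
            }
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // Build a 2x2 test "file" holding 1, 2, 3, 4 as little-endian ints.
        byte[] file = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN)
                                .putInt(1).putInt(2).putInt(3).putInt(4).array();
        int[][] data = read(new ByteArrayInputStream(file), 2, 2);
        System.out.println(data[0][1] + " " + data[1][1] + " " + data[0][0] + " " + data[1][0]);
    }
}
```

A larger chunk (several rows per call) would reduce the number of read calls further, at the cost of a bigger byte array.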

Riegel answered 5/4, 2012 at 19:13 Comment(0)
M
3

First of all, your 'inp.read(buffer)' is unsafe, as the read contract does not guarantee that it will read all 4 bytes.

That aside, for a quick transformation use the algorithm from DataInputStream.readInt.

I've adapted it for your case of a 4-byte array:

int little2big(byte[] b) {
    return ((b[3]&0xff)<<24)+((b[2]&0xff)<<16)+((b[1]&0xff)<<8)+(b[0]&0xff);
}
Metamerism answered 22/2, 2011 at 12:37 Comment(0)
B
1

I don't think it is necessary to reinvent the wheel and perform the byte reordering for endianness again. This is error-prone, and there is a reason a class like ByteBuffer exists.

Your code can be optimized in the sense that it wastes objects. When a byte[] is wrapped by a ByteBuffer, the buffer adds a view, but the original array remains the same. It does not matter whether the original array is modified or read directly or the ByteBuffer instance is used.

Therefore, you only need to create one ByteBuffer instance and set the ByteOrder once.

Before each getInt(), call rewind() to reset the position to the beginning of the buffer.

I have taken your code and modified it as described. Be aware that it does not check for errors if there are not enough bytes left in the input. I would suggest using inp.readFully, as this will throw EOFException if there are not enough bytes to fill the buffer.

int[][] test_data = new int[SIZE_X][SIZE_Y];
ByteBuffer byteBuffer = ByteBuffer.wrap(new byte[4]).order(ByteOrder.LITTLE_ENDIAN);
for (int i=0; i < SIZE_Y; i++) {
    for (int j=0; j < SIZE_X; j++) {
        inp.read(byteBuffer.array());
        byteBuffer.rewind();
        test_data[j][SIZE_Y - i - 1] = byteBuffer.getInt();
    }
}
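For reference, a readFully variant of the loop above could look like this. It is a sketch rather than a drop-in replacement: the class name ReusedBufferReader and the read method are hypothetical, and the input is passed in as a generic InputStream so that the example is self-contained.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ReusedBufferReader {
    static int[][] read(InputStream source, int sizeX, int sizeY) throws IOException {
        DataInputStream inp = new DataInputStream(new BufferedInputStream(source, 32768));
        // One ByteBuffer for the whole file; the byte order is set once.
        ByteBuffer byteBuffer = ByteBuffer.wrap(new byte[4]).order(ByteOrder.LITTLE_ENDIAN);
        int[][] data = new int[sizeX][sizeY];
        for (int i = 0; i < sizeY; i++) {
            for (int j = 0; j < sizeX; j++) {
                inp.readFully(byteBuffer.array()); // fills all 4 bytes or throws EOFException
                byteBuffer.rewind();               // position back to 0 before getInt()
                data[j][sizeY - i - 1] = byteBuffer.getInt();
            }
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        byte[] truncated = {1, 0, 0}; // only three bytes: one short of a full int
        try {
            read(new ByteArrayInputStream(truncated), 1, 1);
        } catch (EOFException e) {
            System.out.println("truncated input detected");
        }
    }
}
```

With plain read, the same truncated input would silently produce a value built from stale or zero bytes instead of failing.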
Bailment answered 22/8, 2019 at 12:28 Comment(0)
