Exceeding byte[] array length (over int upper limit) - java.lang.ArrayIndexOutOfBoundsException

I have a ByteArrayOutputStream object that I'm getting the following error for:

java.lang.ArrayIndexOutOfBoundsException
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:113)

I am trying to load a file that is several gigabytes into it by writing byte[] chunks of 250 MB one at a time.

I can watch the byte array grow in size, and as soon as it hits length 2147483647, the upper limit of int, it blows up on the following line:

stream.write(buf); 

stream is the ByteArrayOutputStream, and buf is what I'm writing to the stream in 250 MB chunks.

At the end, I was planning to call:

byte result[] = stream.toByteArray();

Is there some other approach I can try that will support byte array sizes greater than the int upper limit?

Gurney answered 22/2, 2012 at 15:15 Comment(8)
Just a word of advice: don't store an array that large in memory. Do you really need to have all those gigs in the memory at once? - Saffier
Do you really need more than 640K? - Osiris
Note: a proposal for large arrays was considered but did not make it into Java 7. Perhaps we'll see it in Java 8? - Osiris
If you have a 64G memory machine that you're using to run scientific experiments on, where your data is 30G, it makes a lot of sense to load everything into memory -- it will almost always result in a large time savings. - Silva
possible duplicate of Java array with more than 4gb elements - Dorice
@Silva If you're needing that kind of time savings, Java is not the language you should be using. - Always
@Always not really, the JVM is incredibly fast. It is an absolute memory hog and requires some small, constant-ish amount of time to warm up and JIT things. As a GC language & runtime, one obviously expects it to be a bit slower than a non-GC platform. But please, chime in with a non-constructive, snarky, immature, off-topic comment that is inaccurate. - Silva
@Silva Your choice to insult me speaks volumes on its own. I didn't say Java was a terrible language. It just isn't a good language to choose when you need every bit of performance you can get. It is several times slower than any native language even at its best. If you feel that you get other benefits from it like faster dev time that outweigh that, great! - Always
C
8

Arrays in Java simply can't exceed the bounds of int.

From the JLS section 15.10:

The type of each dimension expression within a DimExpr must be a type that is convertible (§5.1.8) to an integral type, or a compile-time error occurs. Each expression undergoes unary numeric promotion (§). The promoted type must be int, or a compile-time error occurs; this means, specifically, that the type of a dimension expression must not be long.

Likewise in the JVM spec for arraylength:

The arrayref must be of type reference and must refer to an array. It is popped from the operand stack. The length of the array it references is determined. That length is pushed onto the operand stack as an int.

That basically enforces the maximum size of arrays.
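
To make that concrete, here is a minimal sketch. The class name ArrayLimit and both helper methods are made up for illustration; only Integer.MAX_VALUE and Math.toIntExact (Java 8+) are standard API.

```java
// Hypothetical helper names; only the java.lang calls are real API.
class ArrayLimit {
    // True if an array with this many elements is expressible in Java at all.
    // Some JVMs also reject lengths slightly below Integer.MAX_VALUE.
    static boolean fitsInArray(long length) {
        return length >= 0 && length <= Integer.MAX_VALUE;
    }

    // Converts a long size to an array length, failing loudly instead of
    // silently truncating the way a plain (int) cast would.
    static byte[] allocate(long length) {
        // toIntExact throws ArithmeticException if length does not fit in an int
        return new byte[Math.toIntExact(length)];
    }

    public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024;
        // byte[] b = new byte[threeGiB];  // does not compile: the dimension must be int
        System.out.println(fitsInArray(threeGiB)); // false
        System.out.println((int) threeGiB);        // the cast wraps around to a negative number
    }
}
```

The point of allocate is only that the conversion from long has to happen somewhere, and an unchecked cast hides the overflow.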

It's not really clear what you were going to do with the data after loading it, but I'd try to avoid loading it all into memory in the first place.

Chung answered 22/2, 2012 at 15:19 Comment(0)
M
2

Use more than one array. When you reach the limit, take the data with ByteArrayOutputStream.toByteArray() and reset the stream with ByteArrayOutputStream.reset().
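
A rough sketch of that idea, assuming you pick a per-array limit yourself. The class ChunkedCollector and its method names are hypothetical; size(), toByteArray() and reset() are the real ByteArrayOutputStream calls.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: collects data as a list of arrays, each at most chunkLimit bytes.
class ChunkedCollector {
    private final ByteArrayOutputStream stream = new ByteArrayOutputStream();
    private final List<byte[]> chunks = new ArrayList<>();
    private final int chunkLimit;

    ChunkedCollector(int chunkLimit) {
        if (chunkLimit <= 0) throw new IllegalArgumentException("chunkLimit must be positive");
        this.chunkLimit = chunkLimit;
    }

    void write(byte[] buf) {
        int offset = 0;
        while (offset < buf.length) {
            // Copy only as much as still fits into the current chunk.
            int n = Math.min(chunkLimit - stream.size(), buf.length - offset);
            stream.write(buf, offset, n);
            offset += n;
            if (stream.size() == chunkLimit) {
                rollOver();
            }
        }
    }

    // Snapshot the current contents and start over with an empty stream.
    private void rollOver() {
        chunks.add(stream.toByteArray());
        stream.reset();
    }

    // Returns all chunks, including a final partial one.
    List<byte[]> finish() {
        if (stream.size() > 0) rollOver();
        return chunks;
    }

    // Total bytes collected so far; a long, so it can exceed the int limit.
    long totalLength() {
        long total = stream.size();
        for (byte[] chunk : chunks) total += chunk.length;
        return total;
    }
}
```

Note that toByteArray() copies the buffer, so each rollover briefly needs room for two copies of the chunk, and all chunks together still have to fit in the heap.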

Melioration answered 22/2, 2012 at 15:20 Comment(0)
R
2

Using a ByteArrayOutputStream for writing several GiB of data is not a good idea, as everything has to be held in the computer's memory. As you have noticed, a byte array is limited to 2^31 - 1 bytes (just under 2 GiB).

Additionally, the buffer used for storing that data cannot grow in place. When it fills up, a new one has to be created (usually of double the size) and all data has to be copied from the old buffer into the new one.
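
To get a feel for that copying cost, here is a simplified model. It assumes the capacity exactly doubles on every growth step, which is not necessarily what a given JDK does; the class GrowthCost and its method are hypothetical.

```java
// Hypothetical model of a doubling buffer; not the JDK's actual code.
class GrowthCost {
    // Upper bound on the bytes copied while growing from initialCapacity until
    // the buffer can hold targetSize bytes, assuming capacity doubles each time.
    static long bytesCopied(long initialCapacity, long targetSize) {
        if (initialCapacity <= 0) throw new IllegalArgumentException("initialCapacity must be positive");
        long capacity = initialCapacity;
        long copied = 0;
        while (capacity < targetSize) {
            copied += capacity; // worst case: the old buffer was full when it was copied
            capacity *= 2;
        }
        return copied;
    }

    public static void main(String[] args) {
        // 32 bytes is the documented initial capacity of new ByteArrayOutputStream().
        System.out.println(bytesCopied(32, 2_000_000_000L));
    }
}
```

During each copy the old and the new buffer are alive at the same time. If the final size is known up front and is below the array limit, passing it to the ByteArrayOutputStream(int size) constructor avoids the regrowth.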

My advice would be to use RandomAccessFile and save the data you get to a file. Via RandomAccessFile you can operate on data files larger than 2GiB.
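
Something along these lines, as a sketch only. FileSpool, spool and readAt are names I made up; the RandomAccessFile calls (setLength, write, seek, readFully) are the real API.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

// Hypothetical helper: spools a stream to a file instead of one giant byte[].
class FileSpool {
    // Copies everything from in to target in pieces of at most bufferSize bytes.
    // Returns the number of bytes written, as a long.
    static long spool(InputStream in, File target, int bufferSize) throws IOException {
        if (bufferSize <= 0) throw new IllegalArgumentException("bufferSize must be positive");
        byte[] buf = new byte[bufferSize];
        long total = 0;
        try (RandomAccessFile raf = new RandomAccessFile(target, "rw")) {
            raf.setLength(0); // discard any previous contents
            int n;
            while ((n = in.read(buf)) != -1) {
                raf.write(buf, 0, n);
                total += n;
            }
        }
        return total;
    }

    // Fills dest completely, starting at the given file offset.
    // Throws EOFException if the file ends before dest is full.
    static void readAt(File source, long position, byte[] dest) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(source, "r")) {
            raf.seek(position); // offsets are long, so positions past 2 GiB work
            raf.readFully(dest);
        }
    }
}
```

Only one buffer of bufferSize bytes is in memory at a time, and later code can pull out whichever window of the file it needs via readAt.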

Reply answered 22/2, 2012 at 15:30 Comment(0)
