I was profiling code that loads a binary file. The load took around 15 seconds, and most of that time was spent in the methods that read the binary data.
I had the following code to create my DataInputStream:
    is = new DataInputStream(
            new GZIPInputStream(
                new FileInputStream("file.bin")));
And I changed it to this:
    is = new DataInputStream(
            new BufferedInputStream(
                new GZIPInputStream(
                    new FileInputStream("file.bin"))));
After this small modification, the load time dropped from 15 seconds to 4.
But then I found that BufferedInputStream has two constructors. The other constructor lets you explicitly define the buffer size.
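For reference, this is roughly how I would use the constructor that takes an explicit size. It is only a minimal sketch: the class name `BufferSizeExample`, the helper `open`, and the idea of passing the size as a parameter are my own placeholders, not anything from my real code.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

public class BufferSizeExample {

    // Hypothetical helper: same stream chain as above, but the buffer size
    // of the BufferedInputStream is passed in explicitly (in bytes).
    static DataInputStream open(File file, int bufferSize) throws IOException {
        return new DataInputStream(
                new BufferedInputStream(
                        new GZIPInputStream(new FileInputStream(file)),
                        bufferSize));
    }
}
```

With this, `open(new File("file.bin"), 64 * 1024)` would give a 64 KiB buffer of decompressed data between the DataInputStream and the GZIPInputStream.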
I've got two questions:
- What buffer size does BufferedInputStream choose by default, and is it ideal? If not, how can I find the optimum size for the buffer? Should I write a quick bit of code that does a binary search?
- Is this the best way I can use BufferedInputStream? I originally had it inside the GZIPInputStream (wrapping the FileInputStream directly), but there was negligible benefit. I'm assuming that with the current code, every time the buffer needs to be refilled, the GZIPInputStream decodes x bytes (where x is the size of the buffer). Would it be worth omitting the GZIPInputStream entirely? Compression isn't strictly required, but it reduces my file size dramatically.
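
To make the first question concrete, here is the kind of quick measurement code I have in mind. It is a rough sketch under my own assumptions: `BufferSizeSweep`, `readAll`, and `sweep` are hypothetical names, and it reads single bytes rather than reproducing my real loading methods. It tries a fixed list of candidate sizes instead of a binary search, since I don't know whether load time changes monotonically with buffer size.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;

public class BufferSizeSweep {

    // Reads the whole decompressed stream one byte at a time through the
    // same stream chain as above and returns the number of bytes read.
    static long readAll(File file, int bufferSize) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(
                        new GZIPInputStream(new FileInputStream(file)),
                        bufferSize))) {
            long count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        }
    }

    // Times readAll once per candidate size (after one untimed warm-up pass)
    // and returns a map of buffer size -> elapsed nanoseconds.
    static Map<Integer, Long> sweep(File file, int[] sizes) throws IOException {
        Map<Integer, Long> results = new LinkedHashMap<>();
        for (int size : sizes) {
            readAll(file, size); // warm-up: JIT compilation, OS file cache
            long start = System.nanoTime();
            readAll(file, size);
            results.put(size, System.nanoTime() - start);
        }
        return results;
    }
}
```

I would call it as `BufferSizeSweep.sweep(new File("file.bin"), new int[] {1024, 8192, 65536, 262144})` and compare the timings, ideally over several runs, since a single timed pass is noisy.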