Usage of BufferedInputStream

6

41

Let me preface this post with a single caution. I am a total beginner when it comes to Java. I have been programming PHP on and off for a while, but I was ready to make a desktop application, so I decided to go with Java for various reasons.

The application I am working on is in the beginning stages (less than 5 classes) and I need to read bytes from a local file. Typically, the files are currently less than 512kB (but may get larger in the future). Currently, I am using a FileInputStream to read the file into three byte arrays, which perfectly satisfies my requirements. However, I have seen a BufferedInputStream mentioned, and was wondering if the way I am currently doing this is best, or if I should use a BufferedInputStream as well.

I have done some research and have read a few questions here on Stack Overflow, but I am still having trouble understanding when to use and when not to use a BufferedInputStream. In my situation, the first array I read bytes into is only a few bytes (less than 20). If the data I receive in these bytes is good, then I read the rest of the file into two more byte arrays of varying size.

I have also heard many people mention profiling to see which is more efficient in each specific case. However, I have no profiling experience and I'm not really sure where to start. I would love some suggestions on this as well.

I'm sorry for such a long post, but I really want to learn and understand the best way to do these things. I always have a bad habit of second guessing my decisions, so I would love some feedback. Thanks!

Freeland answered 26/6, 2010 at 2:8 Comment(0)
K
89

If you are consistently doing small reads, then a BufferedInputStream will give you significantly better performance. Each read request on an unbuffered stream typically results in a system call to the operating system to read the requested number of bytes. The overhead of a system call may be thousands of machine instructions. A buffered stream reduces this by doing one large read of (say) up to 8k bytes into an internal buffer, and then handing out bytes from that buffer. This can drastically reduce the number of system calls.
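
A minimal sketch of that effect, under stated assumptions: `SmallReadDemo` and its `CountingInputStream` are hypothetical names, and an in-memory `ByteArrayInputStream` stands in for the file, so the counter only approximates what would be system calls on a real `FileInputStream`. It counts how many read requests actually reach the source when the data is consumed one byte at a time, with and without a `BufferedInputStream` in between.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of the original question or answers.
class SmallReadDemo {

    // Stand-in for an unbuffered source such as FileInputStream: every call
    // counted here would be a request to the operating system on a real file.
    static class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Consumes the stream one byte at a time until end of stream (-1).
    static void drainByteByByte(InputStream in) throws IOException {
        while (in.read() != -1) {
            // The bytes are discarded; only the call pattern matters here.
        }
    }

    // Read requests that reach the source when it is read directly.
    static int callsUnbuffered(int size) throws IOException {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        drainByteByByte(source);
        return source.calls;
    }

    // Read requests that reach the source through a BufferedInputStream.
    static int callsBuffered(int size) throws IOException {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        drainByteByByte(new BufferedInputStream(source));
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered: " + callsUnbuffered(100_000));
        System.out.println("buffered:   " + callsBuffered(100_000));
    }
}
```

With the default buffer size, the buffered count should be smaller by several orders of magnitude, because each request to the source refills the whole internal buffer.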

However, if you are consistently doing large reads (e.g. 8k or more) then a BufferedInputStream slows things a bit. You typically don't reduce the number of syscalls, and the buffering introduces an extra data copying step.
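
The large-read case can be checked with a similar sketch. The names (`LargeReadDemo`, `CountingInputStream`) are hypothetical and the source is an in-memory array rather than a file. It relies on the standard JDK `BufferedInputStream` passing a request that is at least as large as its internal buffer straight through to the source when no mark is set, so the number of requests reaching the source should not drop.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of the original question or answers.
class LargeReadDemo {

    // Counts bulk read requests that reach the wrapped source.
    static class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads `size` bytes in chunks of `chunkSize` and returns how many
    // read requests reached the source (including the final end-of-stream one).
    static int sourceCalls(int size, int chunkSize, boolean buffered) throws IOException {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        InputStream in = buffered ? new BufferedInputStream(source) : source;
        byte[] chunk = new byte[chunkSize];
        while (in.read(chunk) != -1) {
            // Discard the data; only the number of requests matters here.
        }
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        int size = 1024 * 1024;   // 1 MiB of data
        int chunk = 16 * 1024;    // 16 KiB per read, larger than the default buffer
        System.out.println("unbuffered: " + sourceCalls(size, chunk, false));
        System.out.println("buffered:   " + sourceCalls(size, chunk, true));
    }
}
```

Both counts should come out the same, which is the point made above: for reads of this size the wrapper saves no requests and only adds its own bookkeeping.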

In your use-case (where you read a 20 byte chunk first then lots of large chunks) I'd say that using a BufferedInputStream is more likely to reduce performance than increase it. But ultimately, it depends on the actual read patterns.

Kantian answered 26/6, 2010 at 2:57 Comment(2)
However, if you are consistently doing large reads (e.g. 8k or more) then a BufferedInputStream slows things. How?Gavingavini
Look at the code! There is an extra level of indirection in the calls, extra work checking to see if there is anything in the buffers, etc. Fortunately, the code is smart enough to avoid an unnecessary copy, as far as is possible with the InputStream API. Thus, the relative slowdown is small, but it would be measurable.Kantian
B
5

If you are using relatively large arrays to read the data a chunk at a time, then BufferedInputStream will just introduce a wasteful copy. (Remember, read does not necessarily fill the whole array; you might want DataInputStream.readFully.) Where BufferedInputStream wins is when making lots of small reads.
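
As a rough sketch of the pattern in the question (a short header, then two larger arrays), assuming a 20-byte header and a made-up validity check: `HeaderThenBodyDemo`, `HEADER_SIZE`, `looksValid` and the `firstSize` parameter are all hypothetical. `DataInputStream.readFully` keeps reading until the array is full, or throws `EOFException` if the file ends first.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original question or answers.
class HeaderThenBodyDemo {

    // Assumed header length; the question only says "less than 20" bytes.
    static final int HEADER_SIZE = 20;

    // Placeholder check; a real application would validate its own format.
    static boolean looksValid(byte[] header) {
        return header[0] == 'M';
    }

    // Returns { header, first, second }, or null if the header is rejected.
    // The second array takes whatever remains after the header and `firstSize` bytes.
    static byte[][] load(Path file, int firstSize) throws IOException {
        long total = Files.size(file);
        try (DataInputStream in = new DataInputStream(new FileInputStream(file.toFile()))) {
            byte[] header = new byte[HEADER_SIZE];
            in.readFully(header);            // fills all 20 bytes or throws EOFException
            if (!looksValid(header)) {
                return null;                 // do not bother reading the rest
            }
            byte[] first = new byte[firstSize];
            byte[] second = new byte[(int) (total - HEADER_SIZE - firstSize)];
            in.readFully(first);             // one large read request per array,
            in.readFully(second);            // repeated only if the OS returns less
            return new byte[][] { header, first, second };
        }
    }

    public static void main(String[] args) throws IOException {
        byte[][] parts = load(Path.of(args[0]), Integer.parseInt(args[1]));
        System.out.println(parts == null ? "bad header" : "read " + parts.length + " arrays");
    }
}
```

No BufferedInputStream is involved: with only three bulk reads, there are no small reads for a buffer to coalesce.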

Bareheaded answered 26/6, 2010 at 2:24 Comment(2)
I think I understand what you are saying. Let me ask you another question. I see a constructor for FileInputStream that takes a byte[] as a parameter. Currently, I am using a for loop to read the desired bytes; however, I assume using this parameter instead would be more efficient? I also assume that using a for loop to constantly call read from the FileInputStream is what you mean by lots of small reads? I'm sorry to sound so noobish, but I am having a hard time completely grasping this for some reason. Thanks for your answer!Freeland
@mastermosaj You might be seeing the constructor for ByteArrayInputStream, which is an InputStream that reads through a byte[] and so does no actual I/O. If you are reading through your byte[] byte by byte, then you will probably find that using a BufferedInputStream or ByteArrayInputStream simplifies your code at some performance cost. (Note: don't mix use of a BufferedInputStream with direct use of the underlying stream, because the former buffers.)Bareheaded
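
A small sketch of why mixing the two is risky, using hypothetical names (`MixedReadsDemo`, `remainingAfterOneBufferedRead`) and an in-memory source in place of a file. After a single byte is read through the wrapper, the underlying stream has already been drained into the wrapper's buffer, so a later direct read from the underlying stream would skip data.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of the original question or answers.
class MixedReadsDemo {

    // Reads ONE byte through a BufferedInputStream and reports how many bytes
    // the underlying stream still has to offer afterwards.
    static int remainingAfterOneBufferedRead(int size) throws IOException {
        InputStream underlying = new ByteArrayInputStream(new byte[size]);
        BufferedInputStream buffered = new BufferedInputStream(underlying);
        buffered.read();                 // triggers a fill of the internal buffer
        return underlying.available();   // what a direct read would see next
    }

    public static void main(String[] args) throws IOException {
        // One might expect 99 here; the buffer has in fact taken everything.
        System.out.println(remainingAfterOneBufferedRead(100));
    }
}
```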
D
1

BufferedInputStream reads more of the file than you need in advance. As I understand it, it does more work up front: one big continuous disk read instead of many small ones in a tight loop.

As far as profiling goes, I like the profiler that's built into NetBeans. It's really easy to get started with. :-)

Deming answered 26/6, 2010 at 2:20 Comment(3)
Thanks for the suggestions. I heard someone mention the profiler in NetBeans. I started out using NetBeans; however, I have switched to using just a plain text editor for the time being. I feel that I learn more about the language that way. Do you have any other suggestions?Freeland
Text editors are great, but it's kind of like pedaling a dump truck if you're billing clients. You might try hprof if you want to avoid doing the profiling in an IDE: java.sun.com/developer/technicalArticles/Programming/HPROF.html Deming
Thanks @jskaggz. I will check out hprof. BTW, I am making this application for myself, so I am not really on a timetable, but I agree that if it were for a client, I would definitely use an IDE to speed it along.Freeland
B
1

I can't speak to the profiling, but in my experience developing Java applications, using any of the buffer classes (BufferedInputStream, StringBuffer) makes my applications considerably faster. Because of that, I use them even for the smallest files or string operations.

Billington answered 26/6, 2010 at 2:21 Comment(3)
When you use the BufferedInputStream, do you usually specify a particular size chunk for it to buffer, or do you let it automatically decide?Freeland
This depends. As Stephen C said above, if this number doesn't coincide well with the data page size used in the syscalls (say 4k) then you just shot yourself in the foot by creating a bottleneck. Think of it like filling a sandbag with a shovel. If you scoop too much or too little sand onto the shovel, you've just decreased efficiency/performance. Just a side note that I am an advocate of writing good code. But if you are just starting out, there is nothing wrong with getting it to work and then optimizing later. These things can be rabbit holes.Billington
@JasonMcCreary When should I use read() byte by byte, and when should I use read(byte[]) with a byte array? I think reading into an array is always better. Can you give me an example of where to use read() byte by byte, read(byte[]), or a BufferedInputStream?Gavingavini
W
0

The following works for me pretty well to pre-populate buffered input stream.

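// Assumes BUFFER_MAX_SIZE is a positive int constant defined elsewhere in the class.
private BufferedInputStream readBeforehand(S3Object object) throws IOException {
    // getContentLength() returns a long, so clamp it before narrowing to int;
    // the lower bound of 1 avoids an IllegalArgumentException for empty objects.
    int length = (int) Math.max(1, Math.min(object.getObjectMetadata().getContentLength(), BUFFER_MAX_SIZE));
    BufferedInputStream bis = new BufferedInputStream(object.getObjectContent(), length);
    bis.mark(length);
    // Pull up to `length` bytes into the internal buffer, stopping at end of stream (-1).
    for (int i = 0; i < length; i++)
        if (bis.read() == -1) break;
    // Rewind so the caller reads the pre-loaded bytes from the start.
    bis.reset();
    return bis;
}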
Wade answered 4/7, 2023 at 11:10 Comment(0)
G
-4
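    import java.io.*;

    // Renamed so the class no longer shadows java.io.BufferedInputStream.
    class BufferedInputStreamDemo
    {
            public static void main(String arg[]) throws IOException
            {
                // try-with-resources closes the stream and the underlying file on exit
                try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream("abc.txt")))
                {
                        // available() is only an estimate; adequate here for a small local file
                        int size = bis.available();
                        // mark the start; the +1 keeps the mark valid when the
                        // read that detects end of stream is issued
                        bis.mark(size + 1);
                        int x;
                        // first pass: print each byte as a char until end of stream (-1)
                        while ((x = bis.read()) != -1)
                        {
                                System.out.println((char) x);
                        }
                        // rewind to the mark and read the same bytes again
                        bis.reset();
                        while ((x = bis.read()) != -1)
                        {
                                System.out.println((char) x);
                        }
                }
            }
    }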
Godber answered 26/6, 2013 at 6:26 Comment(1)
Excuse me, what is this?Bod
