Is there a class that exposes an unbuffered readLine method in Java?
T

3

6

I'm cleaning up some chunks of our codebase at work, and one of the older classes is used to read and write data. This data is a mixture of US-ASCII encoded Strings and binary encoded primitives.

The current implementation uses DataInputStream, but as you can see in the documentation the readLine() method is deprecated because of an issue related to converting bytes to characters. While this encoding issue hasn't really popped up for us, the deprecation is an issue since it already doesn't work on some version of OpenJDK 7 and deprecation means that it could be removed entirely in the future. The "official" alternative is to use readLine from BufferedReader, but we can't do a complete swap-out with DataInputStream since BufferedReader can't really handle the binary encoded primitives.

The problem with "mixing" these two classes is that when the BufferedReader buffers off the stream, it advances the stream marker. This means that subsequent calls to methods like readDouble() from DataInputStream will fail with IOExceptions or EOFExceptions since the real location of the stream marker isn't where it "should" be in the context of the application logic.
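A minimal sketch of the read-ahead problem described above, assuming an in-memory stream that holds one text line followed by one binary double. The class name `ReadAheadDemo` and the sample data are hypothetical and only serve the illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original code base.
class ReadAheadDemo {
    /** Returns how many bytes are left in the raw stream after BufferedReader reads one line. */
    static int bytesLeftAfterReadLine() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeBytes("joint1\n"); // US-ASCII text line (7 bytes)
        out.writeDouble(3.5);       // binary-encoded double (8 bytes)
        out.flush();

        InputStream raw = new ByteArrayInputStream(bytes.toByteArray());
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(raw, StandardCharsets.US_ASCII));
        reader.readLine();          // yields "joint1", but the reader pulls in the double's bytes too
        return raw.available();     // what a DataInputStream on the same stream could still read
    }
}
```

With this small input the reader drains the whole stream on the first `readLine()` call, so a later `readDouble()` on a `DataInputStream` sharing `raw` would hit end of stream.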

I looked into some sort of hacky mark()/reset() strategy, but sometimes the stream is backed by a FileInputStream, which doesn't support mark()/reset().

Outside of changing our data protocol to write out the primitives as characters or writing my own implementation of readLine() (which is surprisingly non-trivial), is there any way to achieve this? I'd even be willing to consider an external library at this point.

Tempi answered 4/7, 2012 at 15:14 Comment(6)
Have you heard of Apache Commons IO? commons.apache.org/io I have pretty much given up with the built in Java SDK IO classes; they are complicated to use and often don't work as expected.Cabal
You could wrap your underlying input stream into a buffered stream. Then put the BufferedReader or DataInputStream on top of that. This setup guarantees that you can use mark()/reset()Discernment
When you say you can change the data protocol, does that mean you aren't stuck with a load of files encoded in the old method?Dollie
@JesseWebb looks like the Apache libraries may suffer from the same restrictions given my specific situation but I haven't had long enough to look at it. Thanks for the suggestion!Tempi
@Discernment I'll look into this. Might work.Tempi
@Dollie Reverse that. I said I'd like to avoid changing our protocol. The files represent the physical definitions of robots (joints, linkages, masses, etc). And they're huge files. And we have a lot of them. So gutting our entire protocol is nontrivial.Tempi
U
1

I think that you should create a custom subclass of DataInputStream that adds a readLine-like method that behaves how you need it to. (You cannot override the existing readLine() method, because it is declared final, so the new method needs a different name.)

Yes, an efficient implementation is non-trivial, but you could probably get away with a naive implementation if you stack your custom class on top of a BufferedInputStream.
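A naive version of this idea might look like the sketch below. The class name `LineDataInputStream` and the method name `readAsciiLine` are hypothetical. It assumes lines end in `\n` or `\r\n` and that the text is US-ASCII; unlike the deprecated method, it does not treat a lone `\r` as a line terminator.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical subclass; names are illustrative only.
class LineDataInputStream extends DataInputStream {
    LineDataInputStream(InputStream in) {
        // Buffer underneath this class so the byte-at-a-time reads below stay cheap.
        super(new BufferedInputStream(in));
    }

    /** Reads bytes up to the next '\n'; returns null if the stream is already at its end. */
    String readAsciiLine() throws IOException {
        int b = read();
        if (b == -1) {
            return null;
        }
        StringBuilder line = new StringBuilder();
        while (b != -1 && b != '\n') {
            line.append((char) b); // each byte maps directly to one char (fine for US-ASCII)
            b = read();
        }
        // Drop the '\r' of a "\r\n" terminator.
        int last = line.length() - 1;
        if (last >= 0 && line.charAt(last) == '\r') {
            line.setLength(last);
        }
        return line.toString();
    }
}
```

Because the line reader and the inherited `readDouble()`, `readInt()` and so on all pull from the same buffered stream, no bytes are consumed beyond the line terminator, and the two kinds of reads can be interleaved.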

Unshapen answered 4/7, 2012 at 15:33 Comment(1)
This is basically what I did. readLine is a final method so I just replicated the source and renamed it as a new method. Clever trick, I really like it.Tempi
G
4

If the current codebase works well and if your only issue is the deprecation tag, I'd personally recommend copying the code from the readLine method of the DataInputStream class and moving it to a helper/utility class. The readLine method of the DataInputStream doesn't use a lot of instance variables so with a bit of work you should be able to work with it fine. A sample invocation will look like: Utils.readLine(dataInStream). This will ensure that even if the method is removed, your codebase isn't affected.

Yes, it's hacky, and yes, it looks a bit ugly, but it is the quickest and probably the safest alternative (minimal changes to the remaining code base).
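As a rough sketch of such a helper (written from scratch here, not the JDK source), the method below only calls `read()` on the `DataInputStream` it is given. The `Utils` name comes from the invocation above; everything else is an assumption. It peeks past a `\r` only when the stream supports mark()/reset(), so on an unbuffered `FileInputStream` a `\r\n` pair would leave the `\n` behind for the next call.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Illustrative helper class; not copied from DataInputStream.
final class Utils {
    private Utils() {}

    /** Reads one line terminated by '\n', '\r' or "\r\n"; returns null at end of stream. */
    static String readLine(DataInputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                return line.toString();
            }
            if (b == '\r') {
                // Swallow the '\n' of a "\r\n" pair, but only if one byte can be peeked.
                if (in.markSupported()) {
                    in.mark(1);
                    int next = in.read();
                    if (next != '\n' && next != -1) {
                        in.reset();
                    }
                }
                return line.toString();
            }
            line.append((char) b); // one byte per char, which is fine for US-ASCII
        }
        // End of stream: return any unterminated text, or null if there was none.
        return line.length() == 0 ? null : line.toString();
    }
}
```

Wrapping the source in a `BufferedInputStream` before constructing the `DataInputStream` gives the helper mark()/reset() support and keeps the single-byte reads inexpensive.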

Gander answered 4/7, 2012 at 15:44 Comment(3)
This kind of works, with the exception that the one field it does use is important: the InputStream is an instance field. What I ended up doing was a combination of what you and Stephen suggested; I subclassed DataInputStream and copied/pasted the original source over as a new method (couldn't override because readLine is final).Tempi
Well, instead of just calling in.read() (which refers to the member in) you could have just called read() on the DataInputStream instance passed to the method which would have forwarded the call to the underlying InputStream instance. But anyways, good luck. :)Gander
Yeah, I suppose I could have done that, but subclassing led to a much, much cleaner refactor with the ability to just drop the new class in and refactor across the board.Tempi
C
0

I had a similar problem and I managed to solve it by using a BufferedReader with a buffer size of 1.

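As a result, BufferedReader.readLine() does no read-ahead of its own. Note that InputStreamReader may still read ahead from the underlying stream while decoding bytes to characters, so this may not be safe when binary data follows the text on the same stream.

            // Decode the bytes as US-ASCII; a buffer size of 1 keeps BufferedReader from buffering ahead.
            InputStreamReader inr = new InputStreamReader(mInputStream(), "US-ASCII");
            BufferedReader mReader = new BufferedReader(inr, 1);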
Clementeclementi answered 22/8, 2014 at 11:53 Comment(0)