Readline is too slow - Anything Faster?
I am reading in from a stream using a BufferedReader wrapped around an InputStreamReader to build one long string from the reader. It gets up to over 100,000 lines and then throws a 500 error (call failed on the server). I am not sure what the problem is; is there anything faster than this method? It works when the lines are in the thousands, but I am working with large data sets.

BufferedReader in = new BufferedReader(new InputStreamReader(newConnect.getInputStream()));
String inputLine;               
String xmlObject = "";
StringBuffer str = new StringBuffer();

while ((inputLine = in.readLine()) != null) {
    str.append(inputLine);
    str.toString();
}       
in.close();

Thanks in advance

Theoretician answered 13/10, 2011 at 15:19 Comment(11)
If you are reading this into RAM then perhaps you have run out of memory, which caused the exception(?) Also, can you give some more information as to why you want to create "one long string"? Not saying you shouldn't, but please enlighten.Meniscus
What's on the other side of the socket? Sounds like there's some sort of timeout on the server process.Thermoelectric
I am doing this on the server side; I am creating a GWT application that pulls in XML data off of a servlet. I have one long XML file that needs to be read in and turned into one long string to parse throughTheoretician
@user971337 - Have you tried increasing the buffer size for BufferedReader?Phonogram
Can you post the code for your conditional loop(s), the while(..) part etc.?Cumulus
BufferedReader in = new BufferedReader( new InputStreamReader( newConnect.getInputStream())); String inputLine; String xmlObject = ""; int count = 0; StringBuffer str = new StringBuffer(); while ((inputLine = in.readLine()) != null) { System.out.println(count); count++; str.append(inputLine); str.toString(); } in.close();Theoretician
@user971337: Please edit your question and append the formatted code.Hydrops
Alright, I just did; thanks for your helpTheoretician
@user971337: Did you try to download the file to your local machine and then parse this local file, any differences in behaviour?Hydrops
I don't want to download it to my local machine; I am doing this from a remote machineTheoretician
The last call (str.toString()) is most likely what is killing performance, because it needs to copy the entire StringBuffer. And you call it in the loop, so you end up with 100,000 copies of a StringBuffer if you have 100,000 lines in the file. And you don't even use the result of toString(), so why is it there at all? Other optimization hint: initialize your StringBuffer with the size of the file you are reading.Hali
to create one long string that gets created from the readers.

Are you by any chance doing this to create your "long string"?

String string;
while(...) 
 string+=whateverComesFromTheSocket;

If yes, then change it to

StringBuilder str = new StringBuilder(); //Edit:Just changed StringBuffer to StringBuilder
while(...)
 str.append(whateverComesFromTheSocket);
String string = str.toString(); 

String objects are immutable, so when you do str += "something", new memory is allocated and the contents of str + "something" are copied into that newly allocated area. This is a costly operation, and running it 51,000 times is an extremely bad thing to do.

StringBuffer and StringBuilder are String's mutable brothers; StringBuilder, being unsynchronized, is more efficient than StringBuffer.
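To see why the difference matters, here is a minimal sketch (class and method names are illustrative) contrasting the quadratic += pattern with a single StringBuilder that is converted to a String once, after the loop:

```java
// Sketch: repeated String += is O(n^2) because each += copies the whole
// accumulated string; StringBuilder.append is amortized O(n).
public class ConcatDemo {

    // The anti-pattern: every += reallocates and copies everything so far.
    static String concatSlow(String[] lines) {
        String out = "";
        for (String line : lines) {
            out += line; // copies out.length() chars on every iteration
        }
        return out;
    }

    // The fix: append into a growable buffer, one toString() at the end.
    static String concatFast(String[] lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line);
        }
        return sb.toString(); // single copy, outside the loop
    }

    public static void main(String[] args) {
        String[] lines = { "a", "b", "c" };
        // Same result, very different cost on 100,000 lines.
        System.out.println(concatSlow(lines).equals(concatFast(lines))); // prints "true"
    }
}
```

Both produce identical output; only the allocation behaviour differs, which is exactly what the asker is running into at 100,000 lines.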

Punctilio answered 13/10, 2011 at 15:23 Comment(9)
BufferedReader in = new BufferedReader( new InputStreamReader( newConnect.getInputStream())); String inputLine; String xmlObject = ""; int count = 0; StringBuffer str = new StringBuffer(); while ((inputLine = in.readLine()) != null) { System.out.println(count); count++; str.append(inputLine); xmlObject = str.toString(); }Theoretician
Premature EOF is what I am still gettingTheoretician
Why is xmlObject = str.toString(); inside the loop?Punctilio
That should be there, but it is still going slow. I figured reading 50,000 lines shouldn't take longer than 25 seconds; there has got to be a way to read that in faster, especially from the server sideTheoretician
The append is now giving me all of the data that I need but it's still taking forever, so thanks for the help getting me out of that error; now if we can solve the speed situationTheoretician
500 The call failed on the server; see server log for detailsTheoretician
I got that after a while of reading the lines, but a lot more data is being read than beforeTheoretician
now try increasing the buffer size of BufferedReader. Check the online APIs for that.Punctilio
It's so nice to find exactly what I was doing wrong ☺️ thanks!Webbed
readLine() can read at about 90 MB/s; it's what you are doing with the data you read that is slow. BTW, readLine() strips newlines, so the approach you are using is flawed as it will turn everything into one line.

Rather than re-inventing the wheel, I would suggest you try FileUtils.readFileToString() from Apache Commons IO. This will read a file into a String without discarding newlines, efficiently.
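If you can't add the Commons IO dependency (or your input is a stream, not a file), a plain-JDK loop over a char buffer does the same job; this is a minimal sketch, with the class and method names being my own:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadAll {

    // Read an entire Reader into a String, preserving newlines,
    // by copying fixed-size chunks instead of calling readLine().
    static String readFully(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            sb.append(buf, 0, n); // append exactly the chars read this pass
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // StringReader stands in for new InputStreamReader(connection.getInputStream()).
        String xml = readFully(new StringReader("<a>\n<b/>\n</a>"));
        System.out.println(xml.contains("\n")); // prints "true": newlines survive
    }
}
```

Unlike the readLine() loop, nothing is lost and no per-line String objects are created.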

Steamroller answered 13/10, 2011 at 16:23 Comment(4)
Wow, I just took out two lines of my code and it flew!!! I just need to save all of that to a string; that is how the XML parser reads the data, from one long string. Any suggestions besides FileUtils.readFileToString, or do you think that will solve it?Theoretician
I can't imagine what XML parser you are using if it requires you to supply the input as a string. Every XML parser I know of will accept the input as a File or an InputStream or a Reader.Yatzeck
@user971337 The XML parsers that come with the JDK all accept InputStreams, Readers, Files, URLs, ... If yours doesn't, it has been misdesigned: stop using it or get it fixed, or fix it yourself if you did it. Reading the entire input and then constructing a string and then passing that to a parser introduces latency and memory costs that are simply unnecessary. Just connect the parser to the stream.Correggio
Great suggestion to use FileUtils.Nubbly
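The advice in the comments above, connecting the parser straight to the stream, can be sketched with the JDK's built-in DOM parser. The ByteArrayInputStream here is just a stand-in for the servlet connection's newConnect.getInputStream():

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class StreamParse {

    // Parse XML straight off an InputStream and return the root tag name;
    // no intermediate String is ever built.
    static String rootTag(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
        return doc.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the remote stream in the question.
        InputStream in = new ByteArrayInputStream(
                "<root><item>42</item></root>".getBytes(StandardCharsets.UTF_8));
        System.out.println(rootTag(in)); // prints "root"
    }
}
```

This avoids both the readLine() newline problem and the memory cost of holding the whole document as one String before parsing even starts.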

© 2022 - 2024 — McMap. All rights reserved.