Why is System.out.println so slow?

Asked 14/12, 2010 at 9:37 Answered 29/10, 2018 at 14:14

Is this something common to all programming languages? Doing multiple print followed by a println seems faster but moving everything to a string and just printing that seems fastest. Why?

EDIT: For example, Java can find all the prime numbers up to 1 million in less than a second - but printing then all out each on their own println can take minutes! Up to a 10 billion can hours to print!

EX:

package sieveoferatosthenes;
public class Main {
    public static void main(String[] args) {
        int upTo = 10000000;
        boolean primes[] = new boolean[upTo];
        for( int b = 0; b < upTo; b++ ){
            primes[b] = true;
        }
        primes[0] = false;
        primes[1] = false;

        int testing = 1;

        while( testing <= Math.sqrt(upTo)){
            testing ++;
            int testingWith = testing;
            if( primes[testing] ){
                while( testingWith < upTo ){
                    testingWith = testingWith + testing;
                    if ( testingWith >= upTo){
                    }
                    else{
                        primes[testingWith] = false;
                    }

                }
            }
        }
        for( int b = 2; b < upTo; b++){
            if( primes[b] ){
                System.out.println( b );
            }
        }
    }
}

Embark answered 14/12, 2010 at 9:37 Comment(6)

Care to explain? I've always found println to be quite fast... – Ljubljana 14/12, 2010 at 9:40

@DasWood "Seems"? Please present some benchmarks (code + timings). – Sucrase 14/12, 2010 at 9:40

It tends to be fast on *nix, slow on windows. In other words, it's the console implementations of those OSes that's the factor here. – Newt 14/12, 2010 at 9:42

Bench the example I paste with printing commented out. Bench again with it not. HUGE difference and it is only running through the number line once, not 9999998 times like the actual sieve. – Embark 14/12, 2010 at 9:45

Do you mean println vs nothing, or println vs print? They are very different questions. – Mcbrayer 14/12, 2010 at 9:45

Instead of println repeatedly, put it all into a string then println it. It's faster. – Embark 14/12, 2010 at 9:51

println is not slow, it's the underlying PrintStream that is connected with the console, provided by the hosting operating system.

You can check it yourself: compare dumping a large text file to the console with piping the same textfile into another file:

cat largeTextFile.txt
cat largeTextFile.txt > temp.txt

Reading and writing are similiar and proportional to the size of the file (O(n)), the only difference is, that the destination is different (console compared to file). And that's basically the same with System.out.

The underlying OS operation (displaying chars on a console window) is slow because

The bytes have to be sent to the console application (should be quite fast)
Each char has to be rendered using (usually) a true type font (that's pretty slow, switching off anti aliasing could improve performance, btw)
The displayed area may have to be scrolled in order to append a new line to the visible area (best case: bit block transfer operation, worst case: re-rendering of the complete text area)

Decasyllabic answered 14/12, 2010 at 9:40 Comment(2)

Ok, then why is the underlying io operations so slow? – Embark 14/12, 2010 at 9:46

is it theoretically possible to optimize this in case of console output? E.g. to buffer output to be displayed up to say "one page", so scrolling only has to be performed once? I know that the responsibility of output buffering actually lies within the console implementation, just curious. – Spellbind 4/12, 2017 at 14:36

System.out is a static PrintStream class. PrintStream has, among other things, those methods you're probably quite familiar with, like print() and println() and such.

It's not unique to Java that input and output operations take a long time. "long." printing or writing to a PrintStream takes a fraction of a second, but over 10 billion instances of this print can add up to quite a lot!

This is why your "moving everything to a String" is the fastest. Your huge String is built, but you only print it once. Sure, it's a huge print, but you spend time on actually printing, not on the overhead associated with the print() or println().

As Dvd Prd has mentioned, Strings are immutable. That means whenever you assign a new String to an old one but reusing references, you actually destroy the reference to the old String and create a reference to the new one. So you can make this whole operation go even faster by using the StringBuilder class, which is mutable. This will decrease the overhead associated with building that string you'll eventually print.

Scholarship answered 14/12, 2010 at 9:51 Comment(4)

Between your answer and Alfred's the answer given is very complete. Thanks. – Embark 14/12, 2010 at 9:58

Creating the Strings is not the problem. If you just create the Strings by don't print them it will run close to the same speed. – Eggers 14/12, 2010 at 10:24

See my reply which shows it takes very little time to toString or write to a file, many orders of magnitude longer to write to the console (even when just displaying a file with the values) – Eggers 14/12, 2010 at 12:2

I mentioned that this will make it go "even faster", which it will for something of the order of 10 billion string concatenations, as the OP mentioned. I in no way stated that it's the creation of the String that's taking the most time. – Scholarship 15/12, 2010 at 2:52

I believe this is because of buffering. A quote from the article:

Another aspect of buffering concerns text output to a terminal window. By default, System.out (a PrintStream) is line buffered, meaning that the output buffer is flushed when a newline character is encountered. This is important for interactivity, where you'd like to have an input prompt displayed before actually entering any input.

A quote explaining buffers from wikipedia:

In computer science, a buffer is a region of memory used to temporarily hold data while it is being moved from one place to another. Typically, the data is stored in a buffer as it is retrieved from an input device (such as a Mouse) or just before it is sent to an output device (such as Speakers)

public void println()

Terminate the current line by writing the line separator string. The line separator string is defined by the system property line.separator, and is not necessarily a single newline character ('\n').

So the buffer get's flushed when you do println which means new memory has to be allocated etc which makes printing slower. The other methods you specified require lesser flushing of buffers thus are faster.

Cassiopeia answered 14/12, 2010 at 9:41 Comment(2)

Buffering to a file or console is the same (they are both file descriptors underneath) however a file is as much as 300x faster, meaning the buffering cost is a fairly small piece. – Eggers 14/12, 2010 at 12:4

@Peter I think you are right about that. The disc has much higher throughput compared to the screen? – Cassiopeia 14/12, 2010 at 12:10

By default, System.out.print() is only line-buffered and does a lot work related to Unicode handling. Because of its small buffer size, System.out.println() is not well suited to handle many repetitive outputs in a batch mode. Each line is flushed right away. If your output is mainly ASCII-based then by removing the Unicode-related activities, the overall execution time will be better.

Dazzle answered 14/12, 2010 at 11:59 Comment(0)

The problem you have is that displaying to the screen is very espensive, especially if you have a graphical windows/X-windows environment (rather than a pure text terminal) Just to render one digit in a font is far more expensive than the calculations you are doing. When you send data to the screen faster than it can display it, it buffered the data and quickly blocks. Even writing to a file is significant compare to the calculations, but its 10x - 100x faster than displaying on the screen.

BTW: math.sqrt() is very expensive, and using a loop is much slower than using modulus i.e. % to determine if a number is a multiple. BitSet can be 8x more space efficient than boolean[], and faster for operations on multiple bits e.g. counting or searching bits.

If I dump the output to a file, it is quick, but writing to the console is slow, and if I write to the console the data which was written to a file it takes about the same amount of time.

Took 289 ms to examine 10,000,000 numbers.
Took 149 ms to toString primes up to 10,000,000.
Took 306 ms to write to a file primes up to 10,000,000.
Took 61,082 ms to write to a System.out primes up to 10,000,000.

time cat primes.txt

real    1m24.916s
user    0m3.619s
sys     0m12.058s

The code

int upTo = 10*1000*1000;
long start = System.nanoTime();
BitSet nonprimes = new BitSet(upTo);
for (int t = 2; t * t < upTo; t++) {
    if (nonprimes.get(t)) continue;
    for (int i = 2 * t; i <= upTo; i += t)
        nonprimes.set(i);
}
PrintWriter report = new PrintWriter("report.txt");
long time = System.nanoTime() - start;
report.printf("Took %,d ms to examine %,d numbers.%n", time / 1000 / 1000, upTo);

long start2 = System.nanoTime();
for (int i = 2; i < upTo; i++) {
    if (!nonprimes.get(i))
        Integer.toString(i);
}
long time2 = System.nanoTime() - start2;
report.printf("Took %,d ms to toString primes up to %,d.%n", time2 / 1000 / 1000, upTo);

long start3 = System.nanoTime();
PrintWriter pw = new PrintWriter(new BufferedOutputStream(new FileOutputStream("primes.txt"), 64*1024));
for (int i = 2; i < upTo; i++) {
    if (!nonprimes.get(i))
        pw.println(i);
}
pw.close();
long time3 = System.nanoTime() - start3;
report.printf("Took %,d ms to write to a file primes up to %,d.%n", time3 / 1000 / 1000, upTo);

long start4 = System.nanoTime();
for (int i = 2; i < upTo; i++) {
    if (!nonprimes.get(i))
        System.out.println(i);
}
long time4 = System.nanoTime() - start4;
report.printf("Took %,d ms to write to a System.out primes up to %,d.%n", time4 / 1000 / 1000, upTo);
report.close();

Eggers answered 14/12, 2010 at 10:43 Comment(2)

"BitSet can be 8x more efficient than boolean[]" (To be clear) I'm fairly sure you meant space-efficient here. You tacked it at the end of a statement talking about the speed efficiency of sqrt() and the modulus operator, so it seemed as if you were saying BitSet was faster as well. The only time BitSet is faster is when there is a page fault involved in an enormous boolean array that doesn't occur when bits within a long array are used. – Shearwater 24/1, 2023 at 4:7

@Shearwater good point, if you are get/set-ting individual bits, as this example does, it's no faster. However, if you are using the search or masking operations ie touching many bits with a single operation, then it can be more than 8x faster. – Eggers 25/1, 2023 at 9:49

If you're printing to the console window, not to a file, that will be the killer.

Every character has to be painted, and on every line the whole window has to be scrolled. If the window is partly overlaid with other windows, it also has to do clipping.

That's going to take far more cycles than what your program is doing.

Usually that's not a bad price to pay, since console output is supposed to be for your reading pleasure :)

Cimino answered 14/12, 2010 at 11:32 Comment(0)

Most of the answers here are right, but they don't cover the most important point: system calls. This is the operation that induces the more overhead.

When your software needs to access some hardware resource (your screen for example), it needs to ask the OS (or hypervisor) if it can access the hardware. This costs a lot:

Here are interesting blogs about syscalls, the last one being dedicated to syscall and Java

http://arkanis.de/weblog/2017-01-05-measurements-of-system-call-performance-and-overhead http://www.brendangregg.com/blog/2014-05-11/strace-wow-much-syscall.html https://blog.packagecloud.io/eng/2017/03/14/using-strace-to-understand-java-performance-improvement/

Predestination answered 29/10, 2018 at 14:14 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags