Why is accessing a volatile variable about 100 times slower than accessing a member?

I wrote a test comparing the access speed of a local variable, a member, and a volatile member:

public class VolatileTest {

public int member = -100;

public volatile int volatileMember = -100;

public static void main(String[] args) {
    int testloop = 10;
    for (int i = 1; i <= testloop; i++) {
        System.out.println("Round:" + i);
        VolatileTest vt = new VolatileTest();
        vt.runTest();
        System.out.println();
    }
}

public void runTest() {
    int local = -100;

    int loop = 1;
    int loop2 = Integer.MAX_VALUE;
    long startTime;

    startTime = System.currentTimeMillis();
    for (int i = 0; i < loop; i++) {
        for (int j = 0; j < loop2; j++) {
        }
        for (int j = 0; j < loop2; j++) {
        }
    }
    System.out.println("Empty:" + (System.currentTimeMillis() - startTime));

    startTime = System.currentTimeMillis();
    for (int i = 0; i < loop; i++) {
        for (int j = 0; j < loop2; j++) {
            local++;
        }
        for (int j = 0; j < loop2; j++) {
            local--;
        }
    }
    System.out.println("Local:" + (System.currentTimeMillis() - startTime));

    startTime = System.currentTimeMillis();
    for (int i = 0; i < loop; i++) {
        for (int j = 0; j < loop2; j++) {
            member++;
        }
        for (int j = 0; j < loop2; j++) {
            member--;
        }
    }
    System.out.println("Member:" + (System.currentTimeMillis() - startTime));

    startTime = System.currentTimeMillis();
    for (int i = 0; i < loop; i++) {
        for (int j = 0; j < loop2; j++) {
            volatileMember++;
        }
        for (int j = 0; j < loop2; j++) {
            volatileMember--;
        }
    }
    System.out.println("VMember:" + (System.currentTimeMillis() - startTime));

}
}

Here are the results on my X220 (i5 CPU); times are in milliseconds:

Round:1 Empty:5 Local:10 Member:312 VMember:33378

Round:2 Empty:31 Local:0 Member:294 VMember:33180

Round:3 Empty:0 Local:0 Member:306 VMember:33085

Round:4 Empty:0 Local:0 Member:300 VMember:33066

Round:5 Empty:0 Local:0 Member:303 VMember:33078

Round:6 Empty:0 Local:0 Member:299 VMember:33398

Round:7 Empty:0 Local:0 Member:305 VMember:33139

Round:8 Empty:0 Local:0 Member:307 VMember:33490

Round:9 Empty:0 Local:0 Member:350 VMember:35291

Round:10 Empty:0 Local:0 Member:332 VMember:33838

It surprised me that access to a volatile member is 100 times slower than access to a normal member. I know volatile members have some special properties: a modification is immediately visible to all threads, and each access to a volatile variable acts as a "memory barrier". But can these side effects really be the main cause of a 100-times slowdown?

PS: I also ran the test on a Core 2 CPU machine. There the ratio is about 9:50, roughly 5 times slower, so this seems to depend on the CPU architecture as well. 5 times is still a lot, right?

Namtar answered 21/6, 2012 at 5:35 Comment(1)
Possible duplicate of "Is volatile expensive?" (Dust)
W
4

Access to a volatile field prevents some JIT optimisations. This is especially important if you have a loop which doesn't really do anything, as the JIT can optimise such loops away (unless they touch a volatile field). If you run the loops for longer, the discrepancy should increase further.
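As a rough illustration of the kind of optimisation meant here, the sketch below shows a plain-field loop next to the single addition a JIT would be allowed (not required) to reduce it to. The class and method names are hypothetical and not from the question, and whether HotSpot actually performs this exact rewrite depends on the JVM version and flags.

```java
// Hypothetical illustration; names are not taken from the question's code.
class LoopCollapse {
    int member;                  // plain field: the JIT may keep it in a register
    volatile int volatileMember; // volatile field: each ++ is a volatile read plus a volatile write

    // What the source says: n separate increments of a plain field.
    void incrementPlain(int n) {
        for (int j = 0; j < n; j++) {
            member++;
        }
    }

    // What an optimising JIT is allowed (not required) to turn the loop above into.
    void incrementCollapsed(int n) {
        if (n > 0) {
            member += n;
        }
    }

    // In practice the JIT keeps every iteration of this loop, because each
    // iteration performs a volatile read followed by a volatile write.
    void incrementVolatile(int n) {
        for (int j = 0; j < n; j++) {
            volatileMember++;
        }
    }

    public static void main(String[] args) {
        LoopCollapse a = new LoopCollapse();
        LoopCollapse b = new LoopCollapse();
        a.incrementPlain(1_000_000);
        b.incrementCollapsed(1_000_000);
        // Both versions leave the field in the same state.
        System.out.println("same result: " + (a.member == b.member));
    }
}
```

Both plain versions are indistinguishable to a single thread, which is why the rewrite is permitted; the volatile version gives the JIT far less room.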

In a more realistic test, you might expect volatile to be between 30% and 10x slower for critical code. In most real programs it makes very little difference, because the CPU is smart enough to "realise" that only one core is using the volatile field and to cache it rather than going to main memory.
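To put the question's numbers in perspective, here is a small sketch that converts the Round 1 timings (Member: 312 ms, VMember: 33378 ms) into a cost per increment. The operation count follows from the question's code (two inner loops of Integer.MAX_VALUE iterations each); the 2.5 GHz clock is an assumption about the X220's i5, not something stated in the question.

```java
// Hypothetical helper for checking the question's timings; not part of the original test.
class PerOpCost {
    // Each timed section in runTest() runs two inner loops of Integer.MAX_VALUE iterations.
    static final long OPS = 2L * Integer.MAX_VALUE;

    // Average nanoseconds per increment/decrement for a section that took elapsedMillis.
    static double nanosPerOp(long elapsedMillis) {
        return elapsedMillis * 1_000_000.0 / OPS;
    }

    public static void main(String[] args) {
        long memberMs = 312;      // Round 1, "Member" from the question
        long volatileMs = 33378;  // Round 1, "VMember" from the question
        double assumedGhz = 2.5;  // ASSUMPTION: clock speed is not given in the question

        double memberNs = nanosPerOp(memberMs);
        double volatileNs = nanosPerOp(volatileMs);

        System.out.printf("Member:  %.3f ns/op (%.2f cycles at the assumed clock)%n",
                memberNs, memberNs * assumedGhz);
        System.out.printf("VMember: %.3f ns/op (%.2f cycles at the assumed clock)%n",
                volatileNs, volatileNs * assumedGhz);
    }
}
```

This works out to roughly 0.07 ns per operation for the plain member and roughly 7.8 ns for the volatile one. At the assumed clock, the plain loop is completing several increments per cycle, which suggests the JIT has optimised it rather than touching memory on every iteration, so the 100x ratio says as much about how cheap the plain loop became as about how expensive volatile is.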

Widthwise answered 21/6, 2012 at 7:8 Comment(0)
M
9

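Volatile members are never cached in a register or a thread-local copy, so every access must go out to the memory system (on mainstream CPUs that is usually the coherent cache rather than DRAM itself).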

Mantle answered 21/6, 2012 at 6:52 Comment(0)
T
4

Access to a volatile variable prevents the CPU from re-ordering the instructions before and after the access, and this generally slows down execution.
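A minimal sketch of what this ordering guarantee buys you, with hypothetical names: a plain field is written before a volatile flag, and a reader that sees the flag set is then guaranteed by the Java memory model to see the plain write as well.

```java
// Hypothetical example of safe publication through a volatile flag.
class VolatilePublish {
    private int payload;            // plain field, written before the flag
    private volatile boolean ready; // volatile flag

    void writer() {
        payload = 42;  // cannot be reordered after the volatile write below
        ready = true;  // volatile write: publishes everything written before it
    }

    int reader() {
        while (!ready) {
            // spin until the volatile read observes the writer's store
        }
        return payload; // guaranteed to be 42 once ready was seen as true
    }

    // Runs one reader thread against a writer on the calling thread.
    static int demo() {
        VolatilePublish p = new VolatilePublish();
        int[] seen = new int[1];
        Thread r = new Thread(() -> seen[0] = p.reader());
        r.start();
        p.writer();
        try {
            r.join(); // join() makes the reader's result visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If `ready` were not volatile, the reader could legally spin forever or return a stale `payload`; the restrictions that rule this out are the same ones that cost time in the benchmark.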

Tabaret answered 21/6, 2012 at 5:42 Comment(0)
T
0

Using volatile means reading from memory directly, so that every CPU core sees a change the next time it reads the variable. No caching is used: not registers, and not the L1 to L3 caches. Approximate read latencies:

  1. Register: 1 clock cycle
  2. L1 cache: 4 clock cycles
  3. L2 cache: 11 clock cycles
  4. L3 cache: 30~40 clock cycles
  5. Memory: 100+ clock cycles

That's why your result is about 100 times slower when using volatile.

Toul answered 2/4, 2020 at 7:21 Comment(0)