I'm trying to implement Fast Inverse Square Root in Java to speed up vector normalization. However, when I implement the single-precision version, it runs at about the same speed as 1F / (float)Math.sqrt() at first, then quickly drops to half that speed. This is interesting because, while Math.sqrt (I presume) uses a native method, the control also involves a floating-point division, which I've heard is really slow. My code for computing the numbers is as follows:
public static float fastInverseSquareRoot(float x) {
    float xHalf = 0.5F * x;
    int temp = Float.floatToRawIntBits(x);
    temp = 0x5F3759DF - (temp >> 1);
    float newX = Float.intBitsToFloat(temp);
    newX = newX * (1.5F - xHalf * newX * newX);
    return newX;
}
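As a sanity check on the method above, here is a minimal sketch that measures how far the approximation strays from 1F / (float)Math.sqrt(). The class name FisrAccuracyCheck and the helper maxRelativeError are hypothetical names introduced for illustration, and the range and step are arbitrary choices. It only checks accuracy and says nothing about speed.

```java
class FisrAccuracyCheck {
    // Same method as in the question: bit-level initial guess plus one Newton step.
    static float fastInverseSquareRoot(float x) {
        float xHalf = 0.5F * x;
        int temp = Float.floatToRawIntBits(x);
        temp = 0x5F3759DF - (temp >> 1);
        float newX = Float.intBitsToFloat(temp);
        newX = newX * (1.5F - xHalf * newX * newX);
        return newX;
    }

    // Hypothetical helper: largest relative error against 1F / sqrt over [from, to).
    static float maxRelativeError(float from, float to, float step) {
        float worst = 0F;
        for (float x = from; x < to; x += step) {
            float exact = 1F / (float) Math.sqrt(x);
            float approx = fastInverseSquareRoot(x);
            worst = Math.max(worst, Math.abs(approx - exact) / exact);
        }
        return worst;
    }

    public static void main(String[] args) {
        // Range and step are arbitrary; widen them to cover the inputs you care about.
        System.out.println("max relative error: " + maxRelativeError(1F, 1024F, 0.25F));
    }
}
```

With a single Newton step the error should stay well under one percent, which is usually acceptable for normalizing vectors but worth confirming for your own inputs.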
Using a short program I've written to iterate each 16 million times, then aggregate results, and repeat, I get results like this:
1F / Math.sqrt() took 65209490 nanoseconds.
Fast Inverse Square Root took 65456128 nanoseconds.
Fast Inverse Square Root was 0.378224 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 64131293 nanoseconds.
Fast Inverse Square Root took 26214534 nanoseconds.
Fast Inverse Square Root was 59.123647 percent faster than 1F / Math.sqrt()
1F / Math.sqrt() took 27312205 nanoseconds.
Fast Inverse Square Root took 56234714 nanoseconds.
Fast Inverse Square Root was 105.895914 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 26493281 nanoseconds.
Fast Inverse Square Root took 56004783 nanoseconds.
Fast Inverse Square Root was 111.392402 percent slower than 1F / Math.sqrt()
I consistently get roughly equal times for both on the first pass, followed by a pass where Fast Inverse Square Root takes about 60 percent less time than 1F / Math.sqrt(), followed by several passes where Fast Inverse Square Root takes about twice as long as the control. I'm confused about why FISR would go from same -> 60 percent faster -> 100 percent slower, and it happens every time I run my program.
EDIT: The above data is from running it in Eclipse. When I compile and run the program with javac/java, I get completely different data:
1F / Math.sqrt() took 57870498 nanoseconds.
Fast Inverse Square Root took 88206794 nanoseconds.
Fast Inverse Square Root was 52.421004 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 54982400 nanoseconds.
Fast Inverse Square Root took 83777562 nanoseconds.
Fast Inverse Square Root was 52.371599 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 21115822 nanoseconds.
Fast Inverse Square Root took 76705152 nanoseconds.
Fast Inverse Square Root was 263.259133 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 20159210 nanoseconds.
Fast Inverse Square Root took 80745616 nanoseconds.
Fast Inverse Square Root was 300.539585 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 21814675 nanoseconds.
Fast Inverse Square Root took 85261648 nanoseconds.
Fast Inverse Square Root was 290.845374 percent slower than 1F / Math.sqrt()
EDIT2: After a few responses, it seems the speed stabilizes after several iterations, but the value it stabilizes at varies widely from run to run. Does anyone have any idea why?
Here's my code (not exactly concise, but here's the whole thing):
public class FastInverseSquareRootTest {
    public static FastInverseSquareRootTest conductTest() {
        float result = 0F;
        long startTime, endTime, midTime;
        startTime = System.nanoTime();
        for (float x = 1F; x < 4_000_000F; x += 0.25F) {
            result = 1F / (float) Math.sqrt(x);
        }
        midTime = System.nanoTime();
        for (float x = 1F; x < 4_000_000F; x += 0.25F) {
            result = fastInverseSquareRoot(x);
        }
        endTime = System.nanoTime();
        return new FastInverseSquareRootTest(midTime - startTime,
                endTime - midTime);
    }

    public static float fastInverseSquareRoot(float x) {
        float xHalf = 0.5F * x;
        int temp = Float.floatToRawIntBits(x);
        temp = 0x5F3759DF - (temp >> 1);
        float newX = Float.intBitsToFloat(temp);
        newX = newX * (1.5F - xHalf * newX * newX);
        return newX;
    }

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 7; i++) {
            System.out.println(conductTest().toString());
        }
    }

    private long controlDiff;
    private long experimentalDiff;
    private double percentError;

    public FastInverseSquareRootTest(long controlDiff, long experimentalDiff) {
        this.experimentalDiff = experimentalDiff;
        this.controlDiff = controlDiff;
        this.percentError = 100D * (experimentalDiff - controlDiff)
                / controlDiff;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("1F / Math.sqrt() took %d nanoseconds.%n",
                controlDiff));
        sb.append(String.format(
                "Fast Inverse Square Root took %d nanoseconds.%n",
                experimentalDiff));
        sb.append(String.format(
                "Fast Inverse Square Root was %f percent %s than 1F / Math.sqrt()%n",
                Math.abs(percentError),
                percentError > 0D ? "slower" : "faster"));
        return sb.toString();
    }
}
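One thing that may matter in the test above is that result is never read after either loop, so the JIT compiler is free to discard some or all of the work once it optimizes the method, and it may do so at different points for each loop. Below is a minimal sketch of a variant that accumulates every result and runs a few untimed warm-up passes first. The class name FisrBenchmarkSketch, the methods sumControl and sumFast, and the warm-up count of 5 are all assumptions made for illustration, and it is not confirmed that this accounts for the volatile numbers.

```java
class FisrBenchmarkSketch {
    // Same method as in the question.
    static float fastInverseSquareRoot(float x) {
        float xHalf = 0.5F * x;
        int temp = Float.floatToRawIntBits(x);
        temp = 0x5F3759DF - (temp >> 1);
        float newX = Float.intBitsToFloat(temp);
        newX = newX * (1.5F - xHalf * newX * newX);
        return newX;
    }

    // Control loop; summing into a double keeps every result live.
    static double sumControl() {
        double sum = 0D;
        for (float x = 1F; x < 4_000_000F; x += 0.25F) {
            sum += 1F / (float) Math.sqrt(x);
        }
        return sum;
    }

    // Experimental loop over the same inputs.
    static double sumFast() {
        double sum = 0D;
        for (float x = 1F; x < 4_000_000F; x += 0.25F) {
            sum += fastInverseSquareRoot(x);
        }
        return sum;
    }

    public static void main(String[] args) {
        double sink = 0D;
        // Untimed warm-up passes so both loops are likely compiled before timing starts.
        for (int i = 0; i < 5; i++) {
            sink += sumControl();
            sink += sumFast();
        }
        for (int i = 0; i < 7; i++) {
            long start = System.nanoTime();
            sink += sumControl();
            long mid = System.nanoTime();
            sink += sumFast();
            long end = System.nanoTime();
            System.out.printf("control: %d ns, fast: %d ns%n", mid - start, end - mid);
        }
        // Printing the sink makes the accumulated results observable.
        System.out.println("sink = " + sink);
    }
}
```

Putting each loop in its own method also lets the JIT compile them separately rather than replacing a single long-running method mid-loop. A dedicated harness such as JMH handles warm-up and dead-code elimination more rigorously than a hand-rolled loop like this one.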