Benchmark: Sine vs. Curves performance

The other day in this thread we were talking about the performance of sin() calculations vs. using curves, so I made a quick & dirty benchmark to compare them.

Simple calculations

Let's call these very simple calculations 1000 times.

Sine


var h := sin(0.5)
var v := sin(0.5)

Simple curve

var h := simple_curve.sample(0.2)
var v := simple_curve.sample(0.2)

Results

sin(): 76 μs
Curves: 111 μs
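
The timing code itself isn't shown in the post. As a rough sketch of how such a measurement could be taken, assuming Godot 4 and a script attached to a Node, the loop can be wrapped in two Time.get_ticks_usec() calls. The function name bench_sin is made up for this illustration, and the real benchmark may well be structured differently.

```gdscript
extends Node

# Hypothetical helper (not from the original benchmark): times the
# "simple sine" loop and returns the elapsed microseconds.
func bench_sin(iterations: int) -> int:
    var start := Time.get_ticks_usec()
    for i in iterations:
        var h := sin(0.5)
        var v := sin(0.5)
    return Time.get_ticks_usec() - start


func _ready() -> void:
    # 1000 iterations, as in the post.
    print("sin(): %d μs" % bench_sin(1000))
```

Keeping the measured statements inline in the loop, rather than passing them in as a Callable, avoids adding call overhead to every iteration.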

More complex calculation

This is a more real-life example. Actually, at its core it's the same calculation I made in the other thread.

Sine

var time := Time.get_unix_time_from_system()

for i in iterations:
    var t_hori := wrapf(time, 0.0, 1.0)
    var t_vert := wrapf(time + 0.25, 0.0, 1.0) * 2.0
    var r := t_vert * PI
    var v := Vector2(sin(t_hori * 2.0 * PI), 1.0 - sin(r if r < PI else r - PI))

Curves

For the curves I am using these ones from the other thread:

var time := Time.get_unix_time_from_system()

for i in iterations:
    var t := wrapf(time, 0.0, 1.0)
    var v := Vector2(horizontal_sway.sample(t), vertical_sway.sample(t))

Results

sin(): 294 μs
Curves: 120 μs

Conclusion

As long as the calculations are simple, calling something like sin() is faster. But the moment it gets more complicated and you need to write more code, using curves is faster because of the GDScript overhead.

Kelley answered 25/11, 2023 at 13:40

Kelley Not a fair benchmark. You needlessly burdened sine's loop with branching and allocation.

Debacle answered 25/11, 2023 at 13:45

Debacle No I think it's totally fair. I am comparing two calculations that produce (roughly) the same results.

Kelley answered 25/11, 2023 at 13:47

If I change the second sine benchmark to this:

var time := Time.get_unix_time_from_system()
var t_hori := wrapf(time, 0.0, 1.0)
var t_vert := wrapf(time + 0.25, 0.0, 1.0) * 2.0

for i in iterations:
    var r := t_vert * PI
    var v := Vector2(sin(t_hori * 2.0 * PI), 1.0 - sin(r if r < PI else r - PI))

We get: 190 μs instead of 294 μs.

But I still calculate r inside the loop because that is an integral part of the calculation.

Kelley answered 25/11, 2023 at 13:51

Kelley wrote: "No I think it's totally fair. I am comparing two calculations that produce (roughly) the same results."

Nope, you need to get that branching out of the loop for a fair comparison. Otherwise it's apples and oranges. Am I allowed to supply sine's loop body? 🙂

Debacle answered 25/11, 2023 at 13:56

Debacle No, I need that to calculate the same values. If I take it out we have two calculations that produce completely different values.

Kelley answered 25/11, 2023 at 14:00

Kelley It's doable without branching. You don't actually need it.

Debacle answered 25/11, 2023 at 14:01

Debacle That was the easiest for me to do. If you can give me a version without a branch I'll happily check it.

Kelley answered 25/11, 2023 at 14:23

Never mind, I forgot that I had already done that earlier but then changed it back.

var time := Time.get_unix_time_from_system()
var t_hori := wrapf(time, 0.0, 1.0)
var t_vert := wrapf(time + 0.25, 0.0, 1.0) * 2.0

for i in iterations:
    var v := Vector2(sin(t_hori * 2.0 * PI), 1.0 - sin(fmod(t_vert * PI, PI)))

And we are down from 190 μs to 168 μs.

But explain one thing to me: How is fmod(t_vert * PI, PI) not zero? That's what I don't get, and it's the reason I removed it.
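
For reference, a small sketch of what that expression evaluates to, assuming t_vert lies in [0, 2) as it does in the code above. fmod(t_vert * PI, PI) is only zero when t_vert is a whole number; for any other value it returns the remainder, which is the same result the earlier branch produced. The sample values below are arbitrary.

```gdscript
extends Node

func _ready() -> void:
    # Arbitrary sample values; in the post t_vert lies in [0, 2).
    for t_vert in [0.25, 0.5, 1.0, 1.5]:
        var r: float = t_vert * PI
        # The earlier, branching version of the wrap.
        var branched: float = r if r < PI else r - PI
        # The branch-free version using fmod.
        var wrapped: float = fmod(r, PI)
        print(t_vert, " -> fmod: ", wrapped, ", branch: ", branched)
```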

Kelley answered 25/11, 2023 at 14:34

Kelley You're really trying to make that sine look bad 🙂 What's the point of wrapping sine's argument if the function is already periodic? And then doing it twice? Why do you need two separate phase arguments, only to multiply them inside the loop body by values that are the same for all iterations?

var time := Time.get_unix_time_from_system()
for i in iterations:
	var v := Vector2(sin(time), abs(cos(time)))

Debacle answered 25/11, 2023 at 14:53

Debacle wrote: "You're really trying to make that sine look bad"

That's not what I am trying to do. 👼

But what is the point of the above? We basically already had that in our very first measurement in the OP.

Kelley answered 25/11, 2023 at 15:02

Kelley I don't know what's the point. What was the point of making code "more complex"? I just optimized that code to do the same thing without superfluous calculations.

Debacle answered 25/11, 2023 at 15:10

Debacle The whole point was to compare the performance of sin() vs. curves in one simple and one more complex case.

  1. As a baseline, calculating a simple value with sin() and then getting a value out of a simple curve. (Or two values actually, because that is what we do in the next benchmark.)
  2. Taking two more complex curves that produce very specific values (as used in that other thread), then rewriting that code to use sines instead of curves while calculating (roughly) the same values, and comparing the performance of the two. Because in the other thread we speculated whether calculating sines would be faster.

Basically: Here is function f1 that takes an argument (time) and uses curves to return a value. Now let's rewrite it into function f2 that takes the same argument (time) and returns the same values but without using curves. And compare the performance of f1 vs. f2. That was the more complex benchmark comparison.
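
As a sketch, the f1/f2 idea could look like the following. The function names sway_curves and sway_sines are made up here, the two curves are assumed to be Curve resources assigned in the inspector, and the sine formula is the simplified one the thread arrives at further down, so treat this as an illustration rather than the code that was actually measured.

```gdscript
extends Node

# Assumed to be the two Curve resources from the other thread.
@export var horizontal_sway: Curve
@export var vertical_sway: Curve

# f1: takes a time value and samples both curves.
func sway_curves(time: float) -> Vector2:
    # Keep only the fractional part so the sample offset stays in [0, 1).
    var t := time - floorf(time)
    return Vector2(horizontal_sway.sample(t), vertical_sway.sample(t))

# f2: takes the same time value and computes (roughly) the same
# values with trig functions instead of curves.
func sway_sines(time: float) -> Vector2:
    var f := time * TAU
    return Vector2(sin(f), 1.0 - absf(cos(f)))
```

Both functions take the same argument and return a Vector2, so the benchmark loop can call either one with identical input.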

And IMHO there is value in doing this comparison. I don't know about you, but I had no idea whether using curves would be somewhat fast or a lot slower than calculating the values by hand. At least in GDScript. And, well, now I know.

Kelley answered 25/11, 2023 at 15:32

Kelley Trig functions are almost atomic on modern hardware. They'll always be faster than sampling a Bézier curve.
I get your intent, but you generalized the sine approach in an unfair way. It can be fully generalized like this:

const frequency = 5
const amplitude = 10
var phase := Time.get_unix_time_from_system() * frequency
for i in iterations:
	var v := Vector2(sin(phase), abs(cos(phase))) * amplitude

So the above should be measured against curves with specific amplitude values.

Debacle answered 25/11, 2023 at 15:42

Debacle Okay, I finally got what you were trying to say. That, combined with my mathematical heyday now being probably 25 years in the past, led to us talking past each other.

I changed both benchmarks:

Curves

var time := Time.get_unix_time_from_system()

for i in iterations:
    var t := time - int(time)
    var v := Vector2(horizontal_sway.sample(t), vertical_sway.sample(t))

Sine

const _2pi := 2.0 * PI
var time := Time.get_unix_time_from_system()

for i in iterations:
    var f := time * _2pi
    var v := Vector2(sin(f), 1.0 - absf(cos(f)))

Now we get 190 μs and 114 μs.

I tried a couple of things but that was the fastest I could get in both cases.

Oh, and before you say "but we could totally move the calculations of t and f out of the loop...!": Yes, we could. 😉 But the thought here is that this is supposed to be the inner part of a function that takes a changing time and returns a result. Therefore these calculations are integral to the whole function, and that's why I include them in the measurement.

Hrmm... Okay, fine. For completeness' sake let me just do that and remeasure it one more time... 160 μs and 82 μs. 😆

Kelley answered 25/11, 2023 at 20:48

Kelley I won't accept that multiplication by _2pi and that subtraction from 1 as necessary 🙂 abs(cos()) is always in the 0-1 range, so it's enough to just invert the sign. f can be removed from the loop because the scaled phase can be passed into the loop/function, but t actually can't, because it's needed to ensure sample range periodicity. You're still not playing fair 😉

Can you paste the whole benchmark code? I'd like to run it.

Debacle answered 25/11, 2023 at 21:17

Debacle Sure. For some quick tests I always butcher this project of mine. Let me... alright, this should be all of it:

benchmark.zip
28kB

The changed code is in main.gd, in the first two functions. After you start the project, press the first two buttons.

# ---- return class benchmark -----------------------------------------------
func bench_class(iterations: int):
    var time := Time.get_unix_time_from_system()
    var t := time - int(time)

    for i in iterations:
        var v := Vector2(horizontal_sway.sample(t), vertical_sway.sample(t))


# ---- return array benchmark -----------------------------------------------
func bench_array(iterations: int):
    const _2pi := 2.0 * PI
    var time := Time.get_unix_time_from_system()
    var f := time * _2pi

    for i in iterations:
        var v := Vector2(sin(f), 1.0 - absf(cos(f)))

You wrote: "You're still not playing fair"

This is outrageous! I am the objectivity in person! 😉

Kelley answered 25/11, 2023 at 21:56