Many algorithms (e.g. Graham scan) require points or vectors to be sorted by their angle (perhaps as seen from some other point, i.e. using difference vectors). This order is inherently cyclic, and where this cycle is broken to compute linear values often doesn't matter that much. But the real angle value doesn't matter much either, as long as cyclic order is maintained. So doing an atan2
call for every point might be wasteful. What faster methods are there to compute a value which is strictly monotonic in the angle, the way atan2
is? Such functions apparently have been called “pseudoangle” by some.
I started to play around with this and realised that the spec is kind of incomplete. atan2
has a discontinuity, because as dx and dy are varied, there's a point where atan2
will jump between -pi and +pi. The graph below shows the two formulas suggested by @MvG, and in fact they both have the discontinuity in a different place compared to atan2
. (NB: I added 3 to the first formula and 4 to the alternative so that the lines don't overlap on the graph). If I added atan2
to that graph then it would be the straight line y=x. So it seems to me that there could be various answers, depending on where one wants to put the discontinuity. If one really wants to replicate atan2
, the answer (in this genre) would be
# Input: dx, dy: coordinates of a (difference) vector.
# Output: a number from the range [-2 .. 2] which is monotonic
# in the angle this vector makes against the x axis.
# and with the same discontinuity as atan2
def pseudoangle(dx, dy):
p = dx/(abs(dx)+abs(dy)) # -1 .. 1 increasing with x
if dy < 0: return p - 1 # -2 .. 0 increasing with x
else: return 1 - p # 0 .. 2 decreasing with x
This means that if the language that you're using has a sign function, you could avoid branching by returning sign(dy)(1-p), which has the effect of putting an answer of 0 at the discontinuity between returning -2 and +2. And the same trick would work with @MvG's original methodology, one could return sign(dx)(p-1).
Update In a comment below, @MvG suggests a one-line C implementation of this, namely
pseudoangle = copysign(1. - dx/(fabs(dx)+fabs(dy)),dy)
@MvG says it works well, and it looks good to me :-).
copysign(1.-x/(fabs(x)+fabs(y)),y)
with which I could observe a speedup of at least a factor of 10 compared to atan2
, in contrast to the observation by @george that this might be slower than atan2
. Feel free to include this C snipped into your answer if you think it fits. –
Kerby I know one possible such function, which I will describe here.
# Input: dx, dy: coordinates of a (difference) vector.
# Output: a number from the range [-1 .. 3] (or [0 .. 4] with the comment enabled)
# which is monotonic in the angle this vector makes against the x axis.
def pseudoangle(dx, dy):
ax = abs(dx)
ay = abs(dy)
p = dy/(ax+ay)
if dx < 0: p = 2 - p
# elif dy < 0: p = 4 + p
return p
So why does this work? One thing to note is that scaling all input lengths will not affect the ouput. So the length of the vector (dx, dy)
is irrelevant, only its direction matters. Concentrating on the first quadrant, we may for the moment assume dx == 1
. Then dy/(1+dy)
grows monotonically from zero for dy == 0
to one for infinite dy
(i.e. for dx == 0
). Now the other quadrants have to be handled as well. If dy
is negative, then so is the initial p
. So for positive dx
we already have a range -1 <= p <= 1
monotonic in the angle. For dx < 0
we change the sign and add two. That gives a range 1 <= p <= 3
for dx < 0
, and a range of -1 <= p <= 3
on the whole. If negative numbers are for some reason undesirable, the elif
comment line can be included, which will shift the 4th quadrant from -1…0
to 3…4
.
I don't know if the above function has an established name, and who might have published it first. I've gotten it quite a while ago and copied it from one project to the next. I have however found occurrences of this on the web, so I'd consider this snipped public enough for re-use.
There is a way to obtain the range [0 … 4] (for real angles [0 … 2π]) without introducing a further case distinction:
# Input: dx, dy: coordinates of a (difference) vector.
# Output: a number from the range [0 .. 4] which is monotonic
# in the angle this vector makes against the x axis.
def pseudoangle(dx, dy):
p = dx/(abs(dx)+abs(dy)) # -1 .. 1 increasing with x
if dy < 0: return 3 + p # 2 .. 4 increasing with x
else: return 1 - p # 0 .. 2 decreasing with x
abs
and the if dx<0
). When I was an assembly language programmer checking the sign of something was just a single fast opcode, so I don't understand why the number of branch points matters. –
Oraorabel -O3
, and it turns out that fabs
will compiled to andpd
, which should cost almost nothing. Leaving only one branch. –
Kerby I kinda like trigonometry, so I know the best way of mapping an angle to some values we usually have is a tangent. Of course, if we want a finite number in order to not have the hassle of comparing {sign(x),y/x}, it gets a bit more confusing.
But there is a function that maps [1,+inf[ to [1,0[ known as inverse, that will allow us to have a finite range to which we will map angles. The inverse of the tangent is the well known cotangent, thus x/y (yes, it's as simple as that).
A little illustration, showing the values of tangent and cotangent on a unit circle :
You see the values are the same when |x| = |y|, and you see also that if we color the parts that output a value between [-1,1] on both circles, we manage to color a full circle. To have this mapping of values be continuous and monotonous, we can do two this :
- use the opposite of the cotangent to have the same monotony as tangent
- add 2 to -cotan, to have the values coincide where tan=1
- add 4 to one half of the circle (say, below the x=-y diagonal) to have values fit on the one of the discontinuities.
That gives the following piecewise function, which is a continuous and monotonous function of the angles, with only one discontinuity (which is the minimum) :
double pseudoangle(double dx, double dy)
{
// 1 for above, 0 for below the diagonal/anti-diagonal
int diag = dx > dy;
int adiag = dx > -dy;
double r = !adiag ? 4 : 0;
if (dy == 0)
return r;
if (diag ^ adiag)
r += 2 - dx / dy;
else
r += dy / dx;
return r;
}
Note that this is very close to Fowler angles, with the same properties. Formally, pseudoangle(dx,dy) + 1 % 8 == Fowler(dx,dy)
To talk performance, it's much less branchy than Fowler's code (and generally less complicated imo). Compiled with -O3
on gcc 6.1.1, the above function generates an assembly code with 4 branches, where two of them come from dy == 0
(one checking if the both operands are "unordered", thus if dy was NaN
, and the other checking if they are equal).
I would argue this version is more precise than others, since it only uses mantissa preserving operations, until shifting the result to the right interval. This should be especially visible when |x| << |y| or |y| >> |x|, then the operation |x| + |y| looses quite some precision.
As you can see on the graph the angle-pseudoangle relation is also nicely close to linear.
Looking where branches come from, we can make the following remarks:
My code doesn't rely on
abs
norcopysign
, which makes it look more self-contained. However playing with sign bits on floating point values is actually rather trivial, since it's just flipping a separate bit (no branch!), so this is more of a disadvantage.Furthermore other solutions proposed here do not check whether
abs(dx) + abs(dy) == 0
before dividing by it, but this version would fail as soon as only one component (dy) is 0 -- so that throws in a branch (or 2 in my case).
If we choose to get roughly the same result (up to rounding errors) but without branches, we could abuse copsign and write:
double pseudoangle(double dx, double dy)
{
double s = dx + dy;
double d = dx - dy;
double r = 2 * (1.0 - copysign(1.0, s));
double xor_sign = copysign(1.0, d) * copysign(1.0, s);
r += (1.0 - xor_sign);
r += (s - xor_sign * d) / (d + xor_sign * s);
return r;
}
Bigger errors may happen than with the previous implementation, due to cancellation in either d or s if dx and dy are close in absolute value. There is no check for division by zero to be comparable with the other implementations presented, and because this only happens when both dx and dy are 0.
If you can feed the original vectors instead of angles into a comparison function when sorting, you can make it work with:
- Just a single branch.
- Only floating point comparisons and multiplications.
Avoiding addition and subtraction makes it numerically much more robust. A double can actually always exactly represent the product of two floats, but not necessarily their sum. This means for single precision input you can guarantee a perfect flawless result with little effort.
This is basically Cimbali's solution repeated for both vectors, with branches eliminated and divisions multiplied away. It returns an integer, with sign matching the comparison result (positive, negative or zero):
signed int compare(double x1, double y1, double x2, double y2) {
unsigned int d1 = x1 > y1;
unsigned int d2 = x2 > y2;
unsigned int a1 = x1 > -y1;
unsigned int a2 = x2 > -y2;
// Quotients of both angles.
unsigned int qa = d1 * 2 + a1;
unsigned int qb = d2 * 2 + a2;
if(qa != qb) return((0x6c >> qa * 2 & 6) - (0x6c >> qb * 2 & 6));
d1 ^= a1;
double p = x1 * y2;
double q = x2 * y1;
// Numerator of each remainder, multiplied by denominator of the other.
double na = q * (1 - d1) - p * d1;
double nb = p * (1 - d1) - q * d1;
// Return signum(na - nb)
return((na > nb) - (na < nb));
}
divss
usually isn't worth it; an FP divide is still only 1 uop and has good enough throughput not to be a major bottleneck. It does have worse latency and is worth avoiding if you can cheaply (e.g. multiply by a reciprocal in a loop). Floating point division vs floating point multiplication –
Conservatism The simpliest thing I came up with is making normalized copies of the points and splitting the circle around them in half along the x or y axis. Then use the opposite axis as a linear value between the beginning and end of the top or bottom buffer (one buffer will need to be in reverse linear order when putting it in.) Then you can read the first then second buffer linearly and it will be clockwise, or second and first in reverse for counter clockwise.
That might not be a good explanation so I put some code up on GitHub that uses this method to sort points with an epsilion value to size the arrays.
https://github.com/Phobos001/SpatialSort2D
This might not be good for your use case because it's built for performance in graphics effects rendering, but it's fast and simple (O(N) Complexity). If your working with really small changes in points or very large (hundreds of thousands) data sets then this won't work because the memory usage might outweigh the performance benefits.
nice.. here is a varient that returns -Pi , Pi like many arctan2 functions.
edit note: changed my pseudoscode to proper python.. arg order changed for compatibility with pythons math module atan2(). Edit2 bother more code to catch the case dx=0.
def pseudoangle( dy , dx ):
""" returns approximation to math.atan2(dy,dx)*2/pi"""
if dx == 0 :
s = cmp(dy,0)
else::
s = cmp(dx*dy,0) # cmp == "sign" in many other languages.
if s == 0 : return 0 # doesnt hurt performance much.but can omit if 0,0 never happens
p = dy/(dx+s*dy)
if dx < 0: return p-2*s
return p
In this form the max error is only ~0.07 radian for all angles. (of course leave out the Pi/2 if you don't care about the magnitude.)
Now for the bad news -- on my system using python math.atan2 is about 25% faster Obviously replacing a simple interpreted code doesnt beat a compiled intrisic.
Sign[dx dy]
, do you mean the sign of the product? In either case, I'm somewhat worried about a signed value in the denominator, since that could easily make the denominator zero. E.g. if dx = dy = -1
. Or did I read your syntax incorrectly? –
Kerby atan2
is a single native function call but the pseudoangle implementation requires a number of interpreted arithmetic operations, probably with type checks, then using pseudoangles might indeed be bad for performance. –
Kerby If angles are not needed by themselves, but only for sorting, then @jjrv approach is the best one. Here is a comparison in Julia
using StableRNGs
using BenchmarkTools
# Definitions
struct V{T}
x::T
y::T
end
function pseudoangle(v)
copysign(1. - v.x/(abs(v.x)+abs(v.y)), v.y)
end
function isangleless(v1, v2)
a1 = abs(v1.x) + abs(v1.y)
a2 = abs(v2.x) + abs(v2.y)
a2*copysign(a1 - v1.x, v1.y) < a1*copysign(a2 - v2.x, v2.y)
end
# Data
rng = StableRNG(2021)
vectors = map(x -> V(x...), zip(rand(rng, 1000), rand(rng, 1000)))
# Comparison
res1 = sort(vectors, by = x -> pseudoangle(x));
res2 = sort(vectors, lt = (x, y) -> isangleless(x, y));
@assert res1 == res2
@btime sort($vectors, by = x -> pseudoangle(x));
# 110.437 μs (3 allocations: 23.70 KiB)
@btime sort($vectors, lt = (x, y) -> isangleless(x, y));
# 65.703 μs (3 allocations: 23.70 KiB)
So, by avoiding division, time is almost halved without losing result quality. Of course, for more precise calculations, isangleless
should be equipped with bigfloat
from time to time, but the same can be told about pseudoangle
.
Just use a cross-product function. The direction you rotate one segment relative to the other will give either a positive or negative number. No trig functions and no division. Fast and simple. Just Google it.
© 2022 - 2024 — McMap. All rights reserved.