I have the following loop that I am running on an ARM processor.
    // pin here is pointer to some part of an array
    for (i = 0; i < v->numelements; i++)
    {
        pe = pptr[i];
        peParent = pe->parent;
        SPHERE *ps = (SPHERE *)(pe->data);
        pin[0] = FLOAT2FIX(ps->rad2);
        pin[1] = *peParent->procs->pe_intersect == &SphPeIntersect;
        fixifyVector( &pin[2], ps->center ); // Is an inline function
        pin = pin + 5;
    }
Judging by the slow performance of the loop, the compiler was unable to unroll it: when I unroll it manually, it becomes quite fast. I think the compiler is getting confused by the pin pointer. Can we use the restrict keyword to help the compiler here, or is restrict reserved for function parameters only? In general, how can we tell the compiler to unroll the loop and not worry about the pin pointer?
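To make the restrict part of the question concrete: in C99, restrict may qualify any pointer-to-object declaration, including a block-scope local, so it is not limited to function parameters. Below is a minimal sketch of that idea, not the original code. The names Sphere, fixed_t, float_to_fixed and pack_spheres are hypothetical stand-ins, and the 16.16 fixed-point format is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the types in the question. */
typedef int32_t fixed_t;

typedef struct {
    float rad2;
    int   is_sphere;   /* stands in for the pe_intersect comparison */
    float center[3];
} Sphere;

/* Assumed 16.16 float-to-fixed conversion (hypothetical helper). */
static inline fixed_t float_to_fixed(float x)
{
    return (fixed_t)(x * 65536.0f);
}

/* Packs 5 fixed-point words per sphere into out. */
void pack_spheres(fixed_t *out, const Sphere *const *spheres, size_t count)
{
    /* restrict on a local pointer: a promise that, inside this block,
       the words written here are accessed only through dst. Breaking
       that promise is undefined behavior. */
    fixed_t *restrict dst = out;

    for (size_t i = 0; i < count; i++) {
        const Sphere *s = spheres[i];
        dst[0] = float_to_fixed(s->rad2);
        dst[1] = s->is_sphere;
        dst[2] = float_to_fixed(s->center[0]);
        dst[3] = float_to_fixed(s->center[1]);
        dst[4] = float_to_fixed(s->center[2]);
        dst += 5;
    }
}
```

Whether this actually leads to unrolling depends on the compiler and its options. If the compiler is GCC, for example, -funroll-loops or (from GCC 8) #pragma GCC unroll N placed directly before the loop requests unrolling explicitly.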
Have you tried copying v->numelements to a local and using that in the for loop? It could be that the compiler cannot unroll the loop because it has to assume the value of v->numelements will be changed in fixifyVector. – Wellmeaning
How large is numelements? If it is in the millions, unrolling avoids many jumps and therefore many comparisons. Or are there other benefits to loop unrolling that cannot be gained in this segment? – Dispend
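The suggestion in the first comment can be sketched as follows. This is only an illustration under assumed types: Vec and scale_all are hypothetical, and the loop body is deliberately simpler than the one in the question.

```c
#include <stddef.h>

/* Hypothetical container with a count field like v->numelements. */
typedef struct {
    size_t       numelements;
    const float *values;
} Vec;

void scale_all(float *out, const Vec *v, float k)
{
    /* Read the trip count once. Stores through out can no longer force
       the compiler to reload v->numelements on every iteration. */
    const size_t n = v->numelements;
    const float *src = v->values;

    for (size_t i = 0; i < n; i++)
        out[i] = src[i] * k;
}
```

With the count held in a local, the loop bound is visibly invariant, which is usually a precondition for the compiler to unroll or otherwise restructure the loop.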