I am writing a Linux Kernel driver (for ARM) and in an irq handler I need to check the interrupt bits.
bit
0/16 End point 0 In/Out interrupt
(very likely, while In is more likely)
1/17 End point 1 In/Out interrupt
...
15/31 End point 15 In/Out interrupt
Note that more than a bit can be set at a time.
So this is the code:
int i;
u32 intr = read_interrupt_register();
/* ep0 IN */
if(likely(intr & (1 << 0))){
handle_ep0_in();
}
/* ep0 OUT */
if(likely(intr & (1 << 16))){
handle_ep0_out();
}
for(i=1;i<16;++i){
if(unlikely(intr & (1 << i))){
handle_ep_in(i);
}
if(unlikely(intr & (1 << (i + 16)))){
handle_ep_out(i);
}
}
(1 << 0)
and (1 << 16)
would be calculated in compile time, however (1 << i)
and (1 << (i + 16))
wouldn't. Also there would be integral comparison and addition in the loop.
Because it is an irq handler, work should be done within the shortest time. This let me think whether I need to optimize it a bit.
Possible ways?
1. Split the loop, seems to make no difference...
/* ep0 IN */
if(likely(intr & (1 << 0))){
handle_ep0_in();
}
/* ep0 OUT */
if(likely(intr & (1 << 16))){
handle_ep0_out();
}
for(i=1;i<16;++i){
if(unlikely(intr & (1 << i))){
handle_ep_in(i);
}
}
for(i=17;i<32;++i){
if(unlikely(intr & (1 << i))){
handle_ep_out(i - 16);
}
}
2. Shift intr
instead of the value to be compared to?
/* ep0 IN */
if(likely(intr & (1 << 0))){
handle_ep0_in();
}
/* ep0 OUT */
if(likely(intr & (1 << 16))){
handle_ep0_out();
}
for(i=1;i<16;++i){
intr >>= 1;
if(unlikely(intr & 1)){
handle_ep_in(i);
}
}
intr >>= 1;
for(i=1;i<16;++i){
intr >>= 1;
if(unlikely(intr & 1)){
handle_ep_out(i);
}
}
3. Fully unroll the loop (not shown). That would make the code a bit messy.
4. Any other better ways?
5. Or it's that the compiler will actually generate the most optimized way?
Edit: I was looking for a way to tell the gcc compiler to unroll that particular loop, but it seems that it isn't possible according to my search...