How to properly count timer overflows to convert a 32-bit high-resolution timer into a 64-bit high-resolution timer
Asked Answered
B

2

2

I have a 32-bit core timer on a PIC32MZ microcontroller (datasheet) counting at 100 MHz.

The system is running FreeRTOS with multiple tasks (threads) and this code should be able to get called from any task at any time.

It rolls over every 2^32 ticks / 100000000 ticks/sec = 42.9 seconds. I'd like to safely convert it into a 64-bit counter.

There is an ISR which handles counting overflows. It looks something like this:

static volatile uint32_t rollover_count = 0;

void isr_core_timer_overflow()
{
    rollover_count++;
}

This concept would also work on 8-bit microcontrollers to convert from 8-bit counters to 16-bit or 32-bit counters.

Here is my proposed solution:

// For 32-bit mcus such as PIC32
uint64_t get_counts()
{
    uint32_t count1 = ReadCoreTimer();

    // No critical section needed because we are reading a 32-bit value, and
    // turning off interrupts doesn't stop the core timer from counting
    // anyway.
    uint32_t rollover_count_copy = rollover_count;

    uint32_t count2 = ReadCoreTimer();

    if (count2 < count1)
    {
        // rollover just occurred between reading count1 and count2, so manually
        // account for it; note: rollover_count will be incremented
        // by the rollover ISR, so I don't need to update it
        rollover_count_copy++;
    }

    uint64_t total_count = ((uint64_t)rollover_count_copy << 32) | count2;

    return total_count;
}

Part of me thinks I do need a critical section, but I can't wrap my head around why. It has something to do with the fact that when the core timer overflows, the overflow ISR would run, and maybe I would need to momentarily stop that.

If I had interrupt guards to protect a critical section, it might look like this:

// For 32-bit mcus such as PIC32
uint64_t get_counts()
{
    uint32_t count1 = ReadCoreTimer();

    // back up the interrupt state
    // - Note: the `__builtin_*` functions are part of the XC32 compiler, and 
    //   are documented in DS50002799C (MPLAB® XC32 C/C++ Compiler User's Guide
    //   for PIC32M MCUs; XC32 Compiler for PIC32M) p195, p265 (the assembly 
    //   code output for the functions is around here), etc.
    unsigned int isr_state = __builtin_get_isr_state();
    __builtin_disable_interrupts();  // critical section start

    uint32_t rollover_count_copy = rollover_count;
    uint32_t count2 = ReadCoreTimer();

    // restore the interrupt state
    __builtin_set_isr_state(isr_state); // critical section end

    if (count2 < count1)
    {
        // rollover just occurred between reading count1 and count2, so manually
        // account for it
        rollover_count_copy++;
    }

    uint64_t total_count = ((uint64_t)rollover_count_copy << 32) | count2;

    return total_count;
}

Or maybe move the count1 read into the critical section too to stop the rollover_count from incrementing in its ISR during that time too:

// For 32-bit mcus such as PIC32
uint64_t get_counts()
{

    unsigned int isr_state = __builtin_get_isr_state();
    __builtin_disable_interrupts();  // critical section start

    uint32_t count1 = ReadCoreTimer();
    uint32_t rollover_count_copy = rollover_count;
    uint32_t count2 = ReadCoreTimer();

    __builtin_set_isr_state(isr_state); // critical section end

    if (count2 < count1)
    {
        rollover_count_copy++;
    }

    uint64_t total_count = ((uint64_t)rollover_count_copy << 32) | count2;

    return total_count;
}

Which is correct? What solution would you propose? Are there any problems with my first approach above?

References

  1. My 8-bit AVR solution that has been working flawlessly for a decade is in the top of this issue here, but I feel like a rewrite following the design above would run faster and be simpler, and now I need to write this for a PIC32 anyway, which I've never done before. The if (count2 < count1) check is to try to counter this race condition identified here.

  2. The ReadCoreTimer() function is provided here inside the document titled PIC32 Peripheral Libraries for MPLAB C32 Compiler - also contains UpdateCoreTimer(), WriteCoreTimer(), etc.

  3. PIC32MZ datasheet

  4. https://www.microchip.com/en-us/tools-resources/develop/mplab-xc-compilers/xc32 --> scroll down to "MPLAB XC32 Compiler Documentation" and download "MPLAB XC32 C/C++ Compiler User's Guide for PIC32M MCUs".

    1. Direct link: MPLAB® XC32 C/C++ Compiler User's Guide for PIC32M MCUs--XC32 Compiler for PIC32M--DS50002799C.pdf - contains the documentation for the __builtin_* functions. From section 18.8 p195:
      unsigned int __builtin_get_isr_state(void)
      void __builtin_set_isr_state(unsigned int)
      unsigned int __builtin_disable_interrupts(void)
      void __builtin_enable_interrupts(void)
      

Related

  1. Timer rollover handling - related, but not a duplicate

Notes to self:

  1. Add a previous_count variable. Update it to the last returned value after each get_counts() call, and update it to 0 after each overflow ISR run. Use it to check for overflows.

  2. For AVR, enable nested interrupts inside the overflow ISR. Ensure that the interrupt that interrupts that interrupt does not also allow nested interrupts.

    Nevermind: the interrupting interrupts could be pin change interrupts and they need to able to read timestamps to measure incoming PWM signals.

Bisulfate answered 18/1 at 23:32 Comment(9)
Your proposed solution works on CPUs that don't do instruction reordering, otherwise it needs barriers.Kaltman
@stark, what do barriers look like? / how are they done?Bisulfate
If you are calling this timer function all the time (more frequently than the rollover ticks), then you could dispense with the ISR entirely, and simply wrap the timer method with your own one that stores the previous timer value and a rollover count as statics. If the the newly-requested timer value is less than the previous one, increment rollover count. That's obviously not thread-safe, but it's not clear whether your code is multi-threaded.Brophy
@paddy, that's true. In the current implementation I cannot guarantee that the code will be called often enough to remove the overflow ISR. Also, the code is multi-threaded using FreeRTOS as the operating system. There are multiple tasks and any task should be able to safely get a timestamp. Disabling interrupts can guarantee this safety because the scheduler cannot switch contexts with interrupts off.Bisulfate
"No critical section needed because we are reading a 32-bit value" This reasoning is flawed - C does not guarantee atomic access without _Atomic, even if your CPU happens to be 32 bit. Because a single C read/write can result in multiple assembler instructions. You can only tell if this is actually "thread safe" by watching the disassembly. If your compiler supports _Atomic, you could use that. Or otherwise you can write in inline asm. Or otherwise disable the specific interrupt during reads.Knowitall
@Lundin, that is true. But in this particular case, I'm not relying on the C standard to guarantee the atomicity of 32-bit reads. Instead, I'm relying on this particular PIC32 mcu with this particular compiler. Richard Barry states in the FreeRTOS tasks.c: /* A critical section is not required because the variables are of type BaseType_t. */. See that quote and its source in my answer here: Which variable types/sizes are atomic on STM32 microcontrollers?. It applies to the PIC32 mcu I'm using too. I agree that there is no guarantee in C...Bisulfate
...except for using atomic types, as shown in the cppreference community wiki here (ex: atomic_uint_least32_t) and here (_Atomic).Bisulfate
It's an old myth that if you read word size or less you can't be interrupted. And therefore said myth is a common cause for race condition bugs. The problem is that someone reads in the CPU manual about it, like in your link, not realizing that what's said there if for the ISA. But what works in assembler doesn't necessarily work in C. In fact next to no instruction in C is a single instruction, because on almost every target a read/write in C involves manipulating a register as well as moving a value to/from the stack.Knowitall
You can cockily write C code that works on word size and less and check that the generated assembler is OK. But such code is both brittle and non-portable.Knowitall
C
4

Simply repeat the count generation if rollover changes:

uint64_t get_counts()
{
    uint32_t count_lo = 0u ;
    uint32_t count_hi = 0u ;

    do
    {
        count_hi = rollover_count ;
        count_lo = ReadCoreTimer() ;

    } while( count_hi != rollover_count ) ;
   
    return ((uint64_t)count_hi << 32) | count_lo ;
}

The loop will execute no more than twice, and generally once. The race condition exists as you say, but here it is detected and the result recalculated. Critically, with no disabling of interrupts.

Cuneal answered 19/1 at 4:22 Comment(10)
Thank you. The count calculation should be moved to after the while loop I think to not have to ever calculate it twice.Bisulfate
I think I need to think about this some more. If this is correct, with no bugs, then I think my "proposed solution" is also correct, and has the same bug-free outcome in a slightly different way.Bisulfate
@GabrielStaples I don't think this works. When/where does rollover_count get changed?Duodenum
@CraigEstey rollover_count gets changed in the ISR for the counter. The interrupt may occur and rollover_count may change at any time during the execution of get_counts. The conditional while (count_hi != rollover_count) ensures that rollover_count did not change between the time count_hi and count_lo are set.Hitt
There's a downside to this solution, making (I think) my "proposed solution" more optimal: it reads until the same rollover_count is read twice. But, that's not strictly necessary. It would be okay for the rollover count to change so long as you know which rollover count value corresponds to your count_lo value.Bisulfate
I think I need to come up with a test plan on real hardware and start testing various solutions, which is very tricky to catch race condition bugs, if they exist, in this type of problem.Bisulfate
Regarding the location calculation of count, true. Changed. I was editing the code on a phone, and that was a hangover from a slightly different solutionCuneal
@GabrielStaples I agree that your first solution will work. The important take-home is not to disable interrupts. Always best avoided, and in this case unnecessary. Comparing at godbolt.org/z/4ff3Ph3Tc, my solution has one fewer instruction (-O3), and normally calls ReadCoreTimer() just once, whereas yours calls it twice unconditionally, then conditionally adjusts the rollover. It that sense yours is potentially more deterministic, but likely generally slower.Cuneal
@Clifford, actually, my solution still might be flawed. I read the core timer and then the rollover_count. You read the rollover_count and then the core timer. I think that order matters, and if you swapped your order your solution would be wrong too. Correct? Because after the core timer is read, the overflow ISR could fire, incrementing the rollover_count. Well, on second thought...then you'd see that the rollover count changed, and read again. So...yeah your solution is fine with either order, but perhaps mine is not.Bisulfate
@GabrielStaples indeed, you cannot determine when a preemption occured in a sequence, so you need an iteration. There is a reason this solution is idiomatic.Cuneal
D
1

I've done exactly this for a similar system in real production code.

A few issues ...

  1. You increment rollover_count_copy but never advance rollover_count
  2. The change to rollover_count must be done in the critical section (i.e. interrupts disabled)
  3. You don't want to read the timer twice. You want to have a global that keeps track of the "old" timer value.
  4. For PIC*, it doesn't do much instruction reordering, so using volatile on globals under the critical section is sufficient (vs. using barriers).
  5. Assuming that once you enter an ISR, the system is "single level" and can't be stacked (i.e. a second interrupt can not interrupt an ISR) ...
  6. ... If you add a global that is set when within an ISR, you can speed up get_counts by not changing the interrupt flag(s) when called from an ISR. (i.e.) In an ISR, interrupts are already disabled.
  7. The save/restore of interrupt context needs to be done only if called from base/task level.

Here is a slight rewrite of your function. It works when called from within an ISR and at task level. It is annotated:

#include <stdint.h>

// fake definitions
#ifdef __linux__
uint32_t ReadCoreTimer(void);
uint32_t __builtin_get_isr_state(void);
void __builtin_set_isr_state(uint32_t);
void __builtin_disable_interrupts(void);
#endif

volatile int in_isr = 0;                // 1=within ISR
volatile uint32_t old_timer = 0;        // previous timer value
volatile uint32_t rollover_count = 0;   // rollover count

// set this according to the system's interrupt architecture
#define CONFIG_INTERRUPTS_CANT_BE_STACKED   1

// For 32-bit mcus such as PIC32
uint64_t
get_counts()
{
    unsigned int isr_state;
    int isrnow = CONFIG_INTERRUPTS_CANT_BE_STACKED && in_isr;

    // back up the interrupt state
    // - Note: the `__builtin_*` functions are part of the XC32 compiler, and
    // are documented in DS50002799C (MPLAB® XC32 C/C++ Compiler User's Guide
    // for PIC32M MCUs; XC32 Compiler for PIC32M) p195, p265 (the assembly
    // code output for the functions is around here), etc.
    if (! isrnow) {
        isr_state = __builtin_get_isr_state();
        __builtin_disable_interrupts();     // critical section start
    }

    uint32_t rollover_count_copy = rollover_count;
    uint32_t curcount = ReadCoreTimer();

    if (curcount < old_timer) {
        // rollover just occurred between reading count1 and count2, so manually
        // account for it
        rollover_count = ++rollover_count_copy;
    }

    // save the current timer value for rollover detection on the _next_ call
    old_timer = curcount;

    // restore the interrupt state
    if (! isrnow)
        __builtin_set_isr_state(isr_state); // critical section end

    uint64_t total_count = ((uint64_t) rollover_count_copy << 32) | curcount;

    return total_count;
}

// some_ISR -- example ISR routine
void
some_ISR(void)
{

    in_isr = CONFIG_INTERRUPTS_CANT_BE_STACKED;

    get_counts();

    in_isr = 0;
}

// main -- example of task level function
int
main(void)
{

    get_counts();

    return 0;
}
Duodenum answered 19/1 at 2:52 Comment(4)
Thanks for the answer. Regarding bullet 1, rollover_count is incremented in the isr_core_timer_overflow, which is the ISR which runs automatically every time the core timer overflows. Bullet 2 doesn't apply since it is only updated in the ISR, & I think bullet 5 does apply. Bullet 3 is a great idea! I hadn't considered that, but I think it might result in the optimal solution. Bullet 4 is good to know. I need to study your code some more still. Upvoted simply because bullet 3 is so useful, about using a global variable to keep track of the old timer value, rather than doing a double read.Bisulfate
There's a lot to consider in this problem. It's a very hard problem to get perfectly right / optimal, and most importantly: bug-free.Bisulfate
For the "old core timer value" approach to work though, I need to come up with a way to guarantee that the get_counts() function is called at least once per overflow. This would probably best be done via an interrupt, which will require then a critical section (disabling interrupts) when updating the overflow count & the old core timer value. I could continually postpone this interrupt, pushing it back, however, in the get_counts() function, so that we don't have to let the ISR run, if the get_counts() function is called by the user frequently enough to count the overflows by itself.Bisulfate
An unnecessarily complicated solution to a simple problem I think. Also this answer is predicated on the misapprehension that the original code did not update rollover_count which it does.Cuneal

© 2022 - 2024 — McMap. All rights reserved.