Boost.Thread wakes up too late in 1.58

I have an application that needs to do work within certain windows (in this case, the windows are all 30 seconds apart). When the time is not within a window, the time until the middle of the next window is calculated, and the thread sleeps for that amount of time (in milliseconds, using boost::this_thread::sleep_for).
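As a minimal sketch of the window arithmetic described above (not the code from my application): it assumes the windows are centred on exact multiples of the period, measured from an arbitrary epoch. The names until_next_window_centre and within_window are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: assumes window centres fall on exact multiples of
// `period` since some epoch. Returns the time to sleep from `since_epoch`
// until the centre of the next window.
std::chrono::milliseconds until_next_window_centre(
    std::chrono::milliseconds since_epoch,
    std::chrono::milliseconds period)
{
    assert(period.count() > 0);
    assert(since_epoch.count() >= 0);
    const std::chrono::milliseconds into_period = since_epoch % period;
    return period - into_period;
}

// Hypothetical helper: true when `since_epoch` lies within `tolerance`
// of the nearest window centre (before or after it).
bool within_window(std::chrono::milliseconds since_epoch,
                   std::chrono::milliseconds period,
                   std::chrono::milliseconds tolerance)
{
    assert(period.count() > 0);
    const std::chrono::milliseconds into_period = since_epoch % period;
    return into_period <= tolerance || period - into_period <= tolerance;
}
```

With a 30000 ms period and a 100 ms tolerance, waking 50 ms after a centre counts as a hit, while waking 690 ms after it (the overshoot in the first Boost measurement below) does not.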

Using Boost 1.55, I was able to hit the windows within my tolerance (+/-100ms) with extreme reliability. Upon migration to Boost 1.58, I am never able to hit these windows. Replacing the boost::this_thread::sleep_for with std::this_thread::sleep_for fixes the issue; however, I need the interruptible feature of boost::thread and the interruption point that boost::this_thread::sleep_for provides.

Here is some sample code illustrating the issue:

#include <boost/thread.hpp>
#include <boost/chrono.hpp>

#include <chrono>
#include <iostream>
#include <thread>

void boostThreadFunction ()
{
   std::cout << "Starting Boost thread" << std::endl;
   for (int i = 0; i < 10; ++i)
   {
      auto sleep_time = boost::chrono::milliseconds {29000 + 100 * i};
      auto mark = std::chrono::steady_clock::now ();
      boost::this_thread::sleep_for (sleep_time);
      auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(
         std::chrono::steady_clock::now () - mark);
      std::cout << "Boost thread:" << std::endl;
      std::cout << "\tSupposed to sleep for:\t" << sleep_time.count () 
                << " ms" << std::endl;
      std::cout << "\tActually slept for:\t" << duration.count () 
                << " ms" << std::endl << std::endl;
   }
}

void stdThreadFunction ()
{
   std::cout << "Starting Std thread" << std::endl;
   for (int i = 0; i < 10; ++i)
   {
      auto sleep_time = std::chrono::milliseconds {29000 + 100 * i};
      auto mark = std::chrono::steady_clock::now ();
      std::this_thread::sleep_for (sleep_time);
      auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(
         std::chrono::steady_clock::now () - mark);
      std::cout << "Std thread:" << std::endl;
      std::cout << "\tSupposed to sleep for:\t" << sleep_time.count () 
                << " ms" << std::endl;
      std::cout << "\tActually slept for:\t" << duration.count () 
                << " ms" << std::endl << std::endl;
   }
}

int main ()
{
   boost::thread boost_thread (&boostThreadFunction);
   std::this_thread::sleep_for (std::chrono::seconds (10));
   std::thread std_thread (&stdThreadFunction);
   boost_thread.join ();
   std_thread.join ();
   return 0;
}

Here is the output when referencing Boost 1.58 as an include directory and running on my workstation (Windows 7 64-bit):

Starting Boost thread
Starting Std thread
Boost thread:
        Supposed to sleep for:  29000 ms
        Actually slept for:     29690 ms

Std thread:
        Supposed to sleep for:  29000 ms
        Actually slept for:     29009 ms

Boost thread:
        Supposed to sleep for:  29100 ms
        Actually slept for:     29999 ms

Std thread:
        Supposed to sleep for:  29100 ms
        Actually slept for:     29111 ms

Boost thread:
        Supposed to sleep for:  29200 ms
        Actually slept for:     29990 ms

Std thread:
        Supposed to sleep for:  29200 ms
        Actually slept for:     29172 ms

Boost thread:
        Supposed to sleep for:  29300 ms
        Actually slept for:     30005 ms

Std thread:
        Supposed to sleep for:  29300 ms
        Actually slept for:     29339 ms

Boost thread:
        Supposed to sleep for:  29400 ms
        Actually slept for:     30003 ms

Std thread:
        Supposed to sleep for:  29400 ms
        Actually slept for:     29405 ms

Boost thread:
        Supposed to sleep for:  29500 ms
        Actually slept for:     29999 ms

Std thread:
        Supposed to sleep for:  29500 ms
        Actually slept for:     29472 ms

Boost thread:
        Supposed to sleep for:  29600 ms
        Actually slept for:     29999 ms

Std thread:
        Supposed to sleep for:  29600 ms
        Actually slept for:     29645 ms

Boost thread:
        Supposed to sleep for:  29700 ms
        Actually slept for:     29998 ms

Std thread:
        Supposed to sleep for:  29700 ms
        Actually slept for:     29706 ms

Boost thread:
        Supposed to sleep for:  29800 ms
        Actually slept for:     29998 ms

Std thread:
        Supposed to sleep for:  29800 ms
        Actually slept for:     29807 ms

Boost thread:
        Supposed to sleep for:  29900 ms
        Actually slept for:     30014 ms

Std thread:
        Supposed to sleep for:  29900 ms
        Actually slept for:     29915 ms

I would expect the std::thread and the boost::thread to sleep for the same amount of time; however, the boost::thread seems to want to sleep for ~30 seconds when asked to sleep for 29.1 - 29.9 seconds. Am I misusing the boost::thread interface, or is this a bug that was introduced since 1.55?

Felodese answered 21/5, 2015 at 18:41 Comment(3)
On most platforms any kind of thread sleep feature is a "best effort" deal; but if it worked for you previously it should still work now... - Earring
I agree, and it's evident that std::thread is providing a "best effort" service because it is only accurate to +/- 30ms (not counting time it takes for the measurement calculations). However, boost::this_thread::sleep_for is providing much worse than "best effort" - it seems to be rounding my value up to 30 seconds. - Felodese
BTW on Windows 8 you would see much different results again to Windows 7. I would assume Windows 10 will be different again. - Disport

I am the person who committed the above change to Boost.Thread. This change in 1.58 is by design after a period of consultation with the Boost community and Microsoft, and results in potentially enormous battery life improvements on mobile devices. The C++ standard makes no guarantees whatsoever that any timed wait actually waits, or waits the correct period, or anything close to the correct period. Any code written to assume that timed waits work or are accurate is therefore buggy. A future Microsoft STL may make a change similar to the one in Boost.Thread, in which case the STL behaviour would be the same as Boost.Thread's. I might add that on any non-realtime OS any timed wait is inherently unpredictable and may fire very considerably later than requested. This change was therefore considered by the community to be helpful in exposing buggy usage of the STL.

The change allows Windows to optionally fire timers late by a certain amount. It may not actually do so, and in fact simply tries to delay regular interrupts as part of a tickless kernel design on very recent editions of Windows. Even if you specify a tolerance of weeks, the correct deadline is always sent to Windows, so the next system interrupt to occur after the timer expiry will always fire the timer. No timer will therefore ever be late by more than a few seconds at most.

One bug fixed by this change was the problem of system sleep. The previous implementation could get confused by the system sleeping, whereby timed waits would never wake (well, in 29 days they would). This implementation correctly deals with system sleeps, and random hangs of code using Boost.Thread caused by system sleeps are hopefully now a thing of the past.

Finally, I personally think that timed waits need a hardness/softness guarantee in the STL. That's a pretty big change, however. And even if implemented, except on hard realtime OSs, hardness of timed waits can only ever be best effort. That is why such guarantees were excluded from the C++ standard in the first place, as C++11 was finalised well before mobile device power consumption was considered important enough to modify APIs.

Niall

Disport answered 23/5, 2015 at 16:13 Comment(0)

Starting in Boost 1.58 on Windows, sleep_for() leverages SetWaitableTimerEx() (instead of SetWaitableTimer()) passing in a tolerance time to take advantage of coalescing timers.

In libs/thread/src/win32/thread.cpp, the tolerance is 5% of the sleep time or 32 ms, whichever is larger:

// Preferentially use coalescing timers for better power consumption and timer accuracy
    if(!target_time.is_sentinel())
    {
        detail::timeout::remaining_time const time_left=target_time.remaining_milliseconds();
        timer_handle=CreateWaitableTimer(NULL,false,NULL);
        if(timer_handle!=0)
        {
            ULONG tolerable=32; // Empirical testing shows Windows ignores this when <= 26
            if(time_left.milliseconds/20>tolerable)  // 5%
                tolerable=time_left.milliseconds/20;
            LARGE_INTEGER due_time=get_due_time(target_time);
            bool const set_time_succeeded=detail_::SetWaitableTimerEx()(timer_handle,&due_time,0,0,0,&detail_::default_reason_context,tolerable)!=0;
            if(set_time_succeeded)
            {
                timeout_index=handle_count;
                handles[handle_count++]=timer_handle;
            }
        }
    }

Since 5% of 29.1 seconds is 1.455 seconds, Windows was permitted to fire the timer up to that much late, which explains why the sleep times using boost::this_thread::sleep_for were so inaccurate.
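To make the arithmetic concrete, here is a small sketch that mirrors only the tolerance calculation from the snippet above (5% of the remaining time, with a floor of 32 ms). The function names coalescing_tolerance_ms and wake_within_tolerance are mine, not Boost's, and nothing here calls into Boost or Windows.

```cpp
#include <cassert>

// Same arithmetic as the quoted snippet: integer division by 20 gives 5%,
// and the result is never smaller than 32 ms.
constexpr unsigned long coalescing_tolerance_ms(unsigned long time_left_ms)
{
    return (time_left_ms / 20 > 32UL) ? time_left_ms / 20 : 32UL;
}

static_assert(coalescing_tolerance_ms(29100) == 1455, "5% of 29.1 s");
static_assert(coalescing_tolerance_ms(100) == 32, "floor of 32 ms applies");

// Illustrative check: true when an observed wake-up is no later than the
// requested sleep plus the tolerance handed to the timer.
bool wake_within_tolerance(unsigned long requested_ms, unsigned long actual_ms)
{
    assert(requested_ms > 0);
    return actual_ms <= requested_ms + coalescing_tolerance_ms(requested_ms);
}
```

Every Boost measurement in the question passes this check. For example, the 29000 ms request that slept 29690 ms overshot by 690 ms against a permitted 1450 ms.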

Felodese answered 21/5, 2015 at 22:34 Comment(6)
That seems like an odd choice. - Perennial
@DavidSchwartz: Do you consider coalescing similar timers odd, or the 5%? - Refill
The 5%. If I have a meeting at 3PM, it may be fine for me to be a few minutes late, but to make an assumption that if a meeting is a few weeks away, it's fine to be a few hours late ... that's just crazy. This is supposed to be a general-purpose function. (5% of the sleep, but no less than 32ms nor more than 100ms might have been a reasonable design choice.) - Perennial
@DavidSchwartz: You have faulty logic. The 5% tolerable delay is immaterial even if it's weeks long because the true timer deadline is correctly sent to Windows, and Windows is merely given permission to delay the timer fire by up to many weeks. It never actually will, because the next system interrupt will fire all timers due, and no Windows system is going to go interrupt free for more than a few seconds at a time, at absolute most a minute. - Disport
@NiallDouglas You have faulty logic. That a parameter happens to be immaterial because an implementation happens to do the right thing even if you ask it for the wrong thing doesn't make it acceptable to ask for the wrong thing. This is how you wind up with code that breaks with new versions of an operating system or library. - Perennial
@NiallDouglas "It never actually will, because the next system interrupt will fire all timers due, and no Windows system is going to go interrupt free for more than a few seconds at a time, at absolute most a minute." This is false; my hour-long sleeps are lasting exactly 63 minutes, consistently (Win10 32-bit, Boost 1.70.0, GCC 9.2.0). It sounds like you would consider a three-minute delay to be unreasonable, so... why do you allow it? What's the rationale for not setting a maximum value for the delay? - Afterguard

I use this code as a workaround if I need the interruptibility of sleep_for:

        ::Sleep(20);
        boost::this_thread::interruption_point();
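
For a longer wait, the same idea can be put in a loop that sleeps in short slices until a deadline. The following is only a sketch under assumptions: sliced_sleep_until is a hypothetical helper, it uses std::this_thread::sleep_for for the slices, and the check callback stands in for whatever should run between slices. In a boost::thread, that callback could call boost::this_thread::interruption_point().

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: sleeps until `deadline` in slices of at most `slice`,
// invoking `check` before each slice and once more after the deadline passes.
// If `check` throws (as an interruption point would), the wait ends early.
void sliced_sleep_until(std::chrono::steady_clock::time_point deadline,
                        std::chrono::milliseconds slice,
                        const std::function<void()>& check)
{
    assert(slice.count() > 0);
    const auto step =
        std::chrono::duration_cast<std::chrono::steady_clock::duration>(slice);
    for (;;)
    {
        check();
        const auto now = std::chrono::steady_clock::now();
        if (now >= deadline)
            return;
        // Never sleep past the deadline: the last slice is shortened.
        std::this_thread::sleep_for(std::min(deadline - now, step));
    }
}
```

The overshoot is then bounded by the accuracy of one short slice instead of scaling with the total sleep time, at the cost of waking the thread more often.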
Polyphemus answered 15/4, 2016 at 9:49 Comment(0)
