Why does C++11 sleep_for with microseconds actually sleep for a millisecond?

L

3

7

I call this function on my CentOS 7 server.

I find that std::this_thread::sleep_for(chrono::nanoseconds(1)) actually sleeps for one ms. Is there any explanation? I think it may be caused by an OS setting.

Lamkin answered 8/3, 2020 at 6:7 Comment(4)

Are you using a clock to measure the sleep duration? Is your clock's accuracy maybe 1 ms? – Curiel 8/3, 2020 at 6:10

@Curiel I run a quick loop, like {for (i=0;i<100000;i++) a++;sleep_for()}. Without sleep_for it finishes quickly, but when I add sleep_for(one nanosecond), it runs only 1000 times per second, which confuses me – Lamkin 8/3, 2020 at 6:12

Most probably the resolution of your clock is not better than 1 ns. – Milzie 8/3, 2020 at 6:18

The specification of sleep_for is that it blocks execution of the current thread for at least the specified duration. Practically, it will be longer than that due to scheduling delays (the scheduler in any operating system has some granularity) and resource contention. Also, the only operating systems that allow precise control of timing (e.g. will enforce both upper and lower bounds on a sleep duration, rather than only a lower bound) are hard realtime systems. Most general purpose unix operating systems are not hard realtime. – Beechnut 8/3, 2020 at 6:20

S

6

You've got the question you asked covered by the other answers, but you also asked a question in the comments:

Is there any simple method that can ensure I sleep for 1 µs?

Instead of calling sleep_for, yielding the thread's execution slot, you could busy-sleep. That is, loop until a certain amount of time has passed. It'll often get more accurate results at the cost of making that CPU thread unusable for doing anything else.

Here's one example with a function called busy_sleep():

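#include <algorithm>
#include <chrono>
#include <cstddef>
#include <iterator>

// get a rough estimate of how much overhead there is in calling busy_sleep()
std::chrono::nanoseconds calc_overhead() {
    using namespace std::chrono; // also makes the 200us literal visible (C++14)
    constexpr std::size_t tests = 1001;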
    constexpr auto timer = 200us;

    auto init = [&timer]() {
        auto end = steady_clock::now() + timer;
        while(steady_clock::now() < end);
    };

    time_point<steady_clock> start;
    nanoseconds dur[tests];

    for(auto& d : dur) {
        start = steady_clock::now();
        init();
        d = steady_clock::now() - start - timer;
    }
    std::sort(std::begin(dur), std::end(dur));
    // get the median value or something a little less as in this example:
    return dur[tests / 3];
}

// initialize the overhead constant that will be used in busy_sleep()
static const std::chrono::nanoseconds overhead = calc_overhead();

inline void busy_sleep(std::chrono::nanoseconds t) {
    auto end = std::chrono::steady_clock::now() + t - overhead;
    while(std::chrono::steady_clock::now() < end);
}

Demo

Note: This was updated after it was accepted, since I noticed that the overhead calculation could sometimes go terribly wrong. The updated example should be less fragile.

Serif answered 8/3, 2020 at 8:36 Comment(2)

Note: This is toy code. Do not, under any circumstances, actually use code like this. It has several serious flaws. For example, on a hyper-threaded system, the waiting thread could starve another thread sharing the physical core or even slow more of the system by saturating inter-core communication resources. And when the time finally expires, the code will take the mother of all mispredicted branches blowing out caches. – Bhang 23/3, 2023 at 17:17

@DavidSchwartz "the mother of all mispredicted branches" indeed. :-) Yes, using it as is would be wasting a lot for sure. It's more a building block to show that it's possible to get very close to target, which was to sleep 1 µs. Running it constantly would be ... bad. I've used this technique to sleep longer too but then in combination with sleeping too short, waking up the thread so that it can busy wait a very very short time. If the "very very" short time is carefully tuned it doesn't waste a lot and still gets very close to target. – Serif 23/3, 2023 at 18:20
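The combination described in the last comment (sleep a little too short, then busy-wait the remainder) could look roughly like the sketch below. The name hybrid_sleep and the spin_margin parameter are hypothetical and do not come from the answer; spin_margin is assumed to be tuned per machine so that it is larger than the typical sleep_for overshoot.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper (not from the original answer): sleep for most of the
// interval, then busy-wait the rest. spin_margin is an assumed tuning knob.
inline void hybrid_sleep(std::chrono::nanoseconds t,
                         std::chrono::nanoseconds spin_margin) {
    using clock = std::chrono::steady_clock;
    assert(spin_margin >= std::chrono::nanoseconds::zero());

    // Fix the deadline first so both phases aim at the same point in time.
    const auto end = clock::now() + t;

    // Coarse phase: deliberately sleep too short and yield the core meanwhile.
    if (t > spin_margin) {
        std::this_thread::sleep_for(t - spin_margin);
    }

    // Fine phase: spin until the deadline. If the coarse phase overslept,
    // this loop exits immediately and the call is simply late.
    while (clock::now() < end) {
    }
}
```

A call such as hybrid_sleep(std::chrono::milliseconds(5), std::chrono::milliseconds(2)) would sleep for about 3 ms and spin for the rest. The caveats from the comment above still apply to the spinning phase, so the margin should be kept as small as the machine allows.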

C

7

From the sleep_for documentation, you can see that:

Blocks the execution of the current thread for at least the specified sleep_duration.

This function may block for longer than sleep_duration due to scheduling or resource contention delays.

The most likely cause is that your process scheduler kicks out the sleeping thread and doesn't reschedule it for a millisecond.

Curiel answered 8/3, 2020 at 6:16 Comment(1)

Thanks, Nick. So, is there any simple method that can ensure I sleep for 1 µs? – Lamkin 8/3, 2020 at 6:23

A

5

Let's first check what guarantees the specification gives you (quotes from the latest draft of the C++ standard):

[thread.req.timing]

Implementations necessarily have some delay in returning from a timeout. Any overhead in interrupt response, function return, and scheduling induces a “quality of implementation” delay, expressed as duration D_i. Ideally, this delay would be zero. Further, any contention for processor and memory resources induces a “quality of management” delay, expressed as duration D_m. The delay durations may vary from timeout to timeout, but in all cases shorter is better.

The functions whose names end in _for take an argument that specifies a duration. ... Given a duration argument D_t, the real-time duration of the timeout is D_t+D_i+D_m.

The resolution of timing provided by an implementation depends on both operating system and hardware. ...

So the actual sleep time is expected to be longer than the D_t given as the argument.

Assuming your test was correct, we can use it to calculate that D_i+D_m was about a millisecond on your system with your hardware in that particular execution.
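As a rough way to repeat that calculation on your own machine, the sketch below times sleep_for with steady_clock and subtracts the requested duration D_t, leaving an estimate of D_i+D_m. The function name average_overshoot and the rounds parameter are made up for this illustration, and the result also includes the small cost of the two clock reads.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: estimates the average of D_i + D_m by measuring how
// much longer sleep_for(requested) takes than the requested duration D_t.
std::chrono::nanoseconds average_overshoot(std::chrono::nanoseconds requested,
                                           int rounds) {
    using clock = std::chrono::steady_clock;
    assert(rounds > 0);

    std::chrono::nanoseconds total{0};
    for (int i = 0; i < rounds; ++i) {
        const auto start = clock::now();
        std::this_thread::sleep_for(requested);
        const auto elapsed = clock::now() - start;
        // elapsed is roughly D_t + D_i + D_m, so subtracting D_t leaves the delay.
        total += std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed) - requested;
    }
    return total / rounds;
}
```

For the case in the question you might call average_overshoot(std::chrono::nanoseconds(1), 1000) and print the result with .count(). The value depends on the OS, the hardware, and the current load, so it should be read as an estimate for that particular run only.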

Is there any simple method that can ensure I sleep for 1 µs?

No, not in standard C++ on all systems.

It may potentially be possible on a real-time capable system. See the documentation of the system that you are targeting.

Arguable answered 8/3, 2020 at 6:18 Comment(0)
