Why does a condition variable fix our power consumption?

Asked 27/3, 2015 at 20:58 Answered 16/4, 2015 at 14:55

c++ thread-sleep condition-variable energy

We are working on an audio player project on the Mac and noticed that its power usage was very high (about 7x that of Google Chrome doing the same workload).

I used Xcode's energy profiling tool; one of the problems it reported was too much CPU-wake overhead.

According to Xcode:

Each time the CPU wakes from idle, there is an incurred energy penalty. If the wakes are high, and the CPU utilization per wake is low, then you should consider batching work.

We narrowed the problem down to a usleep function call.

In our code, the audio decoder is a producer that produces audio data and passes it to the consumer, the audio player. Our audio player is based on OpenAL, which has a buffer for the audio data.

Because the audio player can be slower than the producer, we always check buffer availability before handing new audio data to the audio player. If no buffer is available, we usleep for a while and try again. So the code looks like:

void playAudioBuffer(Data *data)
{
    while (!checkBufferAvailable())  // no buffer is available
    {
        usleep(3000);  // sleep 3 ms, then poll again
    }
    // process data
}

Knowing that usleep was a problem, the first thing we did was simply remove the usleep() call. (OpenAL doesn't seem to provide a callback or any other notification mechanism, so polling seems to be the only option.) This cut the power usage in half.

Then, yesterday, we tried

for (int i = 0; i < attempts; ++i)
{
    std::unique_lock<std::mutex> lk(m);
    // 3 means 3 ms; compilable code would pass std::chrono::milliseconds(3)
    cv.wait_for(lk, 3, [] {
        available = checkBufferAvailable();
        return available;
    });

    if (available)
    {
        // process buf
    }
}

We tried this experiment by accident. It doesn't really make sense to us, because logically it performs the same wait. The use of the condition variable isn't even correct, because the variable "available" is only accessed by one thread. Yet it reduced our energy consumption by 90%, and the CPU usage of the thread dropped a lot. Now we do better than Chrome. But how is a condition variable implemented differently from the following code? Why does it save us power?

std::unique_lock<std::mutex> lk(m);  // mutex lock
while (!checkBufferAvailable())      // condition is false
{
    lk.unlock();
    usleep(3000);
    lk.lock();
}
...
lk.unlock();
...

(We used the Mac's Activity Monitor (the Energy number) and the CPU usage profiling tool to measure energy consumption.)

Credent answered 27/3, 2015 at 20:58 Comment(12)

Maybe the cv spins instead of sleeps in some cases? And maybe it does a "slow" or "low power" spin lock ... somehow? (btw, 3 what units of time? Are we talking std::condition_variable here? How does 3 work there?) Oh, you aren't posting actual code, you are transcribing manually and have introduced an unknown number of errors and omissions. Please don't do that: please post code that actually reproduces the thing you are interested in. This may involve simplifying your existing code: if you understood what caused your interesting thing, you wouldn't be asking here! – Skycap 27/3, 2015 at 21:4

Or maybe you just wait longer with the condition variable than with usleep? – Reservist 27/3, 2015 at 21:6

In both cases we wait for 3 milliseconds and retry. We tried 4 milliseconds, for example, and noticed hiccups. – Credent 27/3, 2015 at 21:10

I don't suppose the implementation is open source? (Microsoft's CRT source code, for example, ships with Visual Studio and is quite helpful in tracking down implementation-specific oddities like this.) – Dacoit 27/3, 2015 at 21:14

I suspect that the usleep in your code is too short, and that the "heavy" mutex operations give the processor some gaps. – Elda 27/3, 2015 at 21:14

Perhaps the condition variable's waiting uses some kind of incremental backoff strategy? – Dacoit 27/3, 2015 at 21:18

@BillYan one more note: is that 3-4 milliseconds or microseconds? With 4 microseconds, usleep(4), the processor could be loaded up to 100%. – Elda 27/3, 2015 at 21:26

@BillYan I think you may have been penalized by the CPU going into and coming out of sleep. You would have this transition every time you poll for new data. With the condition variable this does not happen; you only poll the buffer for a limited amount of time. – Bernt 31/3, 2015 at 12:30

You can also try std::this_thread::sleep_for(std::chrono::microseconds{3}); to avoid the mutex lock/unlock and maybe it is not implemented by calling usleep(). – Elicia 31/3, 2015 at 23:35

Maybe you should consider using a Pull Model just like the way CoreAudio is built. Then whenever the consumer feels the need to have more data supplied, it can pull it from the producer and you should end up with energy savings by avoiding useless work. developer.apple.com/library/mac/documentation/MusicAudio/… – Mountie 4/4, 2015 at 5:35

Maybe try: std::this_thread::yield(); std::this_thread::sleep_for(std::chrono::microseconds(4)); – Karney 15/4, 2015 at 15:22

Checking out the standard library's source code should tell you why your implementation differs from theirs... In any case, why sleep for such short amounts of time instead of using larger buffers? The penalty of going in and out of sleep is huge. Spend more time processing larger buffers and more time waiting. – Wolpert 16/4, 2015 at 13:14

I may be wrong, but as far as I understand, when you use a condition variable to wait for incoming buffer data, the main thing it does is put the thread that waits on the condition variable to sleep until a signal associated with it wakes the thread up. That is why you get less wake-up overhead and use resources more efficiently.
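A minimal sketch of that signalled-wait pattern, assuming a producer that can notify the consumer (which, as a comment on this answer points out, is not the case in the question, where every wait ends by timeout). BufferQueue and runDemo are hypothetical names used only for illustration; this is not OpenAL code.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for the player's buffer queue; not OpenAL code.
struct BufferQueue {
    std::mutex m;
    std::condition_variable cv;
    std::queue<int> items;
    bool done = false;

    void push(int v) {
        {
            std::lock_guard<std::mutex> lk(m);
            items.push(v);
        }
        cv.notify_one();  // wake the waiting consumer
    }

    void close() {
        {
            std::lock_guard<std::mutex> lk(m);
            done = true;
        }
        cv.notify_all();
    }

    // Blocks without a timeout or polling loop.
    // Returns false once the queue is closed and drained.
    bool pop(int& out) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return !items.empty() || done; });
        if (items.empty()) return false;
        out = items.front();
        items.pop();
        return true;
    }
};

// Runs one producer and one consumer; returns the sum the consumer saw.
int runDemo(int count) {
    assert(count >= 0);
    BufferQueue q;
    int sum = 0;
    std::thread consumer([&] {
        int v;
        while (q.pop(v)) sum += v;
    });
    for (int i = 1; i <= count; ++i) q.push(i);
    q.close();
    consumer.join();
    return sum;
}
```

Here the consumer thread is woken only when push or close notifies it, so it has no periodic timer wake-ups while the queue is empty.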

Here are some links about working with threads in Linux, where I read about it:

Maybe this will give you some understanding of why and how it happens.

Again, I'm not entirely sure that I'm right, but it seems to me like the right direction.

Sorry for my poor English.

G answered 16/4, 2015 at 14:40 Comment(1)

In the OP's case, nothing ever wakes the thread - it always sleeps till timeout. – Parallelepiped 16/4, 2015 at 14:52

If you want to get energy consumption down as much as possible on a Mac or iOS, at the very least you could use a dispatch_semaphore_t to wait exactly until the buffer is full, or pass some block to the buffer filling code.
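A rough portable sketch of the semaphore idea follows. The real GCD calls are dispatch_semaphore_create, dispatch_semaphore_wait and dispatch_semaphore_signal; to keep the example self-contained it uses a small hand-written Semaphore class as a stand-in, and playAll is a hypothetical demo function. It also assumes that something can call signal() when a buffer is freed, which in the question's setup would still have to be arranged, since OpenAL reportedly offers no callback.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Portable counting semaphore; a stand-in for dispatch_semaphore_t.
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}

    void signal() {
        {
            std::lock_guard<std::mutex> lk(m_);
            ++count_;
        }
        cv_.notify_one();
    }

    void wait() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return count_ > 0; });
        --count_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

// Hypothetical demo: a decoder hands `total` chunks to a player
// that owns `slots` buffers. Returns the number of chunks played.
int playAll(int total, int slots) {
    assert(total >= 0 && slots > 0);
    Semaphore freeSlots(slots);  // buffers the decoder may fill
    Semaphore filled(0);         // buffers ready to play
    int played = 0;

    std::thread player([&] {
        for (int i = 0; i < total; ++i) {
            filled.wait();       // sleep until a chunk is queued
            ++played;            // "play" the chunk
            freeSlots.signal();  // the buffer is free again
        }
    });

    for (int i = 0; i < total; ++i) {
        freeSlots.wait();        // blocks, with no polling, until a buffer is free
        filled.signal();         // hand the chunk to the player
    }
    player.join();
    return played;
}
```

The decoder thread in this sketch never wakes on a timer; it runs only when the player has released a buffer.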

Olmsted answered 16/4, 2015 at 14:55 Comment(0)
