Understand DispatchTime on M1 machines
In my iOS project we were able to replicate Combine's Schedulers implementation, and we have an extensive suite of tests. Everything was fine on Intel machines and all the tests were passing. Now we got some M1 machines to see if there is a showstopper in our workflow.

Suddenly some of our library code started failing. The weird thing is that even if we use Combine's implementation, the tests still fail.

Our assumption is that we are misusing DispatchTime(uptimeNanoseconds:), as you can see in the following test from Combine's implementation (the code from the screenshot is included in the edit below).

We know by now that initialising a DispatchTime with an uptimeNanoseconds value doesn't mean it stores those exact nanoseconds on M1 machines. According to the docs:

Creates a DispatchTime relative to the system clock that ticks since boot.

 - Parameters:
   - uptimeNanoseconds: The number of nanoseconds since boot, excluding
                        time the system spent asleep
 - Returns: A new `DispatchTime`
 - Discussion: This clock is the same as the value returned by
               `mach_absolute_time` when converted into nanoseconds.
               On some platforms, the nanosecond value is rounded up to a
               multiple of the Mach timebase, using the conversion factors
               returned by `mach_timebase_info()`. The nanosecond equivalent
               of the rounded result can be obtained by reading the
               `uptimeNanoseconds` property.
               Note that `DispatchTime(uptimeNanoseconds: 0)` is
               equivalent to `DispatchTime.now()`, that is, its value
               represents the number of nanoseconds since boot (excluding
               system sleep time), not zero nanoseconds since boot.
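
This rounding is easy to observe directly. A minimal check (the printed values assume the 125/3 Mach timebase our M1 machines report; Intel reports 1/1, so nothing is rounded there):

import Dispatch
import Darwin

var timebase = mach_timebase_info_data_t()
mach_timebase_info(&timebase)
print("timebase: \(timebase.numer)/\(timebase.denom)")
// Intel: "timebase: 1/1"; Apple Silicon: "timebase: 125/3"

let t = DispatchTime(uptimeNanoseconds: 10431)
print(t.uptimeNanoseconds)
// Intel: 10431; Apple Silicon: 10458 (rounded up to a whole number of
// 125/3 ns ticks, then converted back to nanoseconds)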

So, is the test wrong, or should we not use DispatchTime like this?

We tried to follow Apple's suggestion and use this:

#include <mach/mach_time.h>

uint64_t MachTimeToNanoseconds(uint64_t machTime)
{
    uint64_t nanoseconds = 0;
    static mach_timebase_info_data_t sTimebase;
    // Query the timebase (numer/denom) once and cache it.
    if (sTimebase.denom == 0)
        (void)mach_timebase_info(&sTimebase);

    // Multiply first, then divide, to avoid truncating the conversion factor.
    nanoseconds = ((machTime * sTimebase.numer) / sTimebase.denom);

    return nanoseconds;
}

It didn't help much.

Edit: the code from the screenshot:

func testSchedulerTimeTypeDistance() {
  let time1 = DispatchQueue.SchedulerTimeType(.init(uptimeNanoseconds: 10000))
  let time2 = DispatchQueue.SchedulerTimeType(.init(uptimeNanoseconds: 10431))
  let distantFuture = DispatchQueue.SchedulerTimeType(.distantFuture)
  let notSoDistantFuture = DispatchQueue.SchedulerTimeType(
    DispatchTime(
      uptimeNanoseconds: DispatchTime.distantFuture.uptimeNanoseconds - 1024
    )
  )

  XCTAssertEqual(time1.distance(to: time2), .nanoseconds(431))
  XCTAssertEqual(time2.distance(to: time1), .nanoseconds(-431))

  XCTAssertEqual(time1.distance(to: distantFuture), .nanoseconds(-10001))
  XCTAssertEqual(distantFuture.distance(to: time1), .nanoseconds(10001))
  XCTAssertEqual(time2.distance(to: distantFuture), .nanoseconds(-10432))
  XCTAssertEqual(distantFuture.distance(to: time2), .nanoseconds(10432))

  XCTAssertEqual(time1.distance(to: notSoDistantFuture), .nanoseconds(-11025))
  XCTAssertEqual(notSoDistantFuture.distance(to: time1), .nanoseconds(11025))
  XCTAssertEqual(time2.distance(to: notSoDistantFuture), .nanoseconds(-11456))
  XCTAssertEqual(notSoDistantFuture.distance(to: time2), .nanoseconds(11456))

  XCTAssertEqual(distantFuture.distance(to: distantFuture), .nanoseconds(0))
  XCTAssertEqual(notSoDistantFuture.distance(to: notSoDistantFuture),
                 .nanoseconds(0))
}
Gynaecomastia answered 30/11, 2021 at 15:13 Comment(2)
I've previously worked on htop, which uses this function. Might be useful to compare: github.com/htop-dev/htop/blob/… – Prig
It's best not to include vital information as an image, can you copy/paste the relevant text instead? – Antiphon

The difference between Intel and ARM code is precision.

With Intel code, DispatchTime internally works with nanoseconds. With ARM code, it works with nanoseconds * 3 / 125 (plus some integer rounding). The same applies to DispatchQueue.SchedulerTimeType.

DispatchTimeInterval and DispatchQueue.SchedulerTimeType.Stride internally use nanoseconds on both platforms.

So the ARM code uses lower precision for calculations but full precision when comparing distances. In addition, precision is lost when converting from nanoseconds to the internal unit.

The exact formulas for the DispatchTime conversions are (executed as integer operations):

rawValue = (nanoseconds * 3 + 124) / 125

nanoseconds = rawValue * 125 / 3

As an example, let's take this code:

let time1 = DispatchQueue.SchedulerTimeType(.init(uptimeNanoseconds: 10000))
let time2 = DispatchQueue.SchedulerTimeType(.init(uptimeNanoseconds: 10431))
XCTAssertEqual(time1.distance(to: time2), .nanoseconds(431))

It results in the calculation:

(10000 * 3 + 124) / 125 -> 240
(10431 * 3 + 124) / 125 -> 251
251 - 240 -> 11
11 * 125 / 3 -> 458

The resulting comparison between 458 and 431 then fails.
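
The same arithmetic can be written out as a small sketch (the helper names are hypothetical, not dispatch API; they just mirror the formulas above):

// Hypothetical helpers mirroring the ARM-side conversions described above.
func toRawValue(_ nanoseconds: UInt64) -> UInt64 {
    (nanoseconds * 3 + 124) / 125  // round up to a whole number of ticks
}

func toNanoseconds(_ rawValue: UInt64) -> UInt64 {
    rawValue * 125 / 3             // truncating integer division
}

let raw1 = toRawValue(10000)       // 240
let raw2 = toRawValue(10431)       // 251
print(toNanoseconds(raw2 - raw1))  // 458, not the expected 431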

So the main fix would be to allow for small differences (I haven't verified if 42 is the maximum difference):

XCTAssertEqual(time1.distance(to: time2), .nanoseconds(431), accuracy: .nanoseconds(42))
XCTAssertEqual(time2.distance(to: time1), .nanoseconds(-431), accuracy: .nanoseconds(42))
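
If this pattern repeats across many tests, a small helper keeps the tolerance in one place (the helper name is mine, not XCTest API):

import XCTest
import Combine

func assertStrideEqual(
    _ lhs: DispatchQueue.SchedulerTimeType.Stride,
    _ rhs: DispatchQueue.SchedulerTimeType.Stride,
    file: StaticString = #file,
    line: UInt = #line
) {
    // 42 ns is meant to absorb the rounding of one 125/3 ns tick per operand;
    // as noted above, the exact maximum difference hasn't been verified.
    XCTAssertEqual(lhs, rhs, accuracy: .nanoseconds(42), file: file, line: line)
}

// usage: assertStrideEqual(time1.distance(to: time2), .nanoseconds(431))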

And there are more surprises: unlike with Intel code, distantFuture and notSoDistantFuture are equal with ARM code. It has probably been implemented like this to protect against an overflow when multiplying by 3 (the actual calculation would be 0xFFFFFFFFFFFFFFFF * 3). And the conversion from the internal unit to nanoseconds would result in 0xFFFFFFFFFFFFFFFF * 125 / 3, a value too big to be represented in 64 bits.
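
This is easy to check (assuming the behavior described above holds; prints true on ARM and false on Intel):

import Dispatch
import Combine

let distantFuture = DispatchQueue.SchedulerTimeType(.distantFuture)
let notSoDistantFuture = DispatchQueue.SchedulerTimeType(
    DispatchTime(uptimeNanoseconds: DispatchTime.distantFuture.uptimeNanoseconds - 1024)
)
print(distantFuture == notSoDistantFuture)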

Furthermore, I think you are relying on implementation-specific behavior when calculating the distance between timestamps at or close to 0 and timestamps at or close to distant future. The tests rely on the fact that distant future internally uses 0xFFFFFFFFFFFFFFFF and that the unsigned subtraction wraps around, producing a result as if the internal value were -1.
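
The wrap-around can be sketched in isolation (10000 stands in for time1's internal value, UInt64.max for distant future):

let future = UInt64.max             // internal raw value of .distantFuture
let t1: UInt64 = 10000              // internal raw value of time1
let distance = t1 &- future         // wrapping (overflow) subtraction
print(Int64(bitPattern: distance))  // 10001, as if future were -1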

Cercaria answered 17/1, 2022 at 20:14 Comment(2)
Wow, great answer! Do you have any links to sources or other further reading? – Prig
Unfortunately not. I've mainly inspected the variables in the debugger and looked at print output. Additionally, I've stepped through assembler code. – Cercaria

I think your issue lies in this line:

nanoseconds = ((machTime * sTimebase.numer) / sTimebase.denom)

... which is doing integer operations.

The actual ratio here for M1 is 125/3 (41.666...), so your conversion factor is truncating to 41. This is a ~1.6% error, which might explain the differences you're seeing.
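
A quick illustration of the difference between the two orders of operations (assuming numer = 125 and denom = 3, as reported on M1):

let machTime: UInt64 = 10431
let numer: UInt64 = 125
let denom: UInt64 = 3

print(machTime * (numer / denom))  // 427671: the factor truncates to 41 first
print(machTime * numer / denom)    // 434625: full-precision product, divided last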

Prig answered 30/11, 2021 at 15:29 Comment(2)
That would be the case if the code were machTime * (sTimeBase.numer / sTimeBase.denom) but it isn't. What's written should be the correct way to get the true quotient to the nearest integer, assuming the multiplication doesn't overflow. – Antiphon
Hmmm you're right. – Prig
