Why Swift using subscript syntax in a for-in loops is faster than using direct access to the element?
Asked Answered
B

1

5

I read the famous Why is it faster to process a sorted array than an unsorted array? and I decided to play around and experiment with other languages such as Swift. I was surprised by the run time differences between 2 very similar snippets of code.

In Swift one can access elements in an array either in a direct way or with a subscript while in a for-in loop. For instance this code:

for i in 0..<size {
    sum += data[i]
}

Could be written:

for element in data {
    sum += element
}

With size the data length and data an array of summable elements.

So, I just implemented in Swift (code bellow) the same algorithm as in the question I mentioned in the first paragraph and what surprised me is that the first method is roughly 5 times faster than the second method.

I don't really know the backstage subscript implementation but I thought that accessing directly the elements in a Swift for-in loop was just syntactic sugar.


Question

My question is what is the difference between the two for-in syntaxes and why it is faster to use subscript?

here is the detail of timers. I'm using Xcode 9.4.1 with Swift 4.1 on an early 2015 MacBook Air with a Commande Line Project.

// Using Direct Element Access
Elapsed Time: 8.506288427
Sum: 1051901000

vs

// Using Subscript
Elapsed Time: 1.483967902
Sum: 1070388000

Bonus question: why the execution is 100 times slower in Swift than in C++ (both executed on the same Mac in a n Xcode project)? For instance 100,000 repetitions in C++ take nearly the same time as 1,000 repetitions in Swift. My first guess is that Swift is a higher level language than C++ and that Swift operates more safety checks for instance.


Here is the Swift code I used, I only modified the second nested loop:

import Foundation
import GameplayKit

let size = 32_768
var data = [Int]()
var sum  = 0
var rand = GKRandomDistribution(lowestValue: 0, highestValue: 255)

for _ in 0..<size {
    data.append(rand.nextInt())
}

// data.sort()

let start = DispatchTime.now()

for _ in 0..<1_000 {
    // Only the following for-in loop changes
    for i in 0..<size {
        if data[i] <= 128 {
            sum += data[i]
        }
    }
}

let stop     = DispatchTime.now()
let nanoTime = stop.uptimeNanoseconds - start.uptimeNanoseconds
let elapsed  = Double(nanoTime) / 1_000_000_000

print("Elapsed Time: \(elapsed)")
print("Sum: \(sum)")
Baalbeer answered 29/6, 2018 at 21:15 Comment(7)
Were you testing in a Swift playground or a compiled app?Village
I'm using a compiled app (commande line project).Baalbeer
I suspect you're not compiling with optimizations. With -O, I'm seeing about a 10% cost at most, not 10x. Also you'd need to compare to -Ounchecked if you're comparing to C++.Buckinghamshire
Unless you use -0unchecked, every basic arithmetic operation does a branch (if checks for overflow, and crashes rather than allowing overflown results to be used)Damiondamita
Ok @RobNapier, I got the same result as you with -O. So to sum up when unoptimized the second loop method is far slower than using subscript.Baalbeer
@LouisLac Performance tests are pointless unless you're making optimized builds. The default settings are there for developer convenience (fast compile times, debug symbols) not run-time performance. Iteration in a for loop involves multiple function calls (Sequence.makeIterator(), IteratorProtocol.next()), that would slow things down if they're not optimized out (which they are, in -O)Damiondamita
Ok great I better understand how all this works together, thanks.Baalbeer
T
8

The overall performance output highly depends on the optimizations done by the compiler. If you compile your code with optimizations enabled, you will see the difference between both solutions is minimal.

To demonstrate this, I updated your code adding two methods, one with subscripting and the other using for-in.

import Foundation
import GameplayKit

let size = 32_768
var data = [Int]()
var sum  = 0
var rand = GKRandomDistribution(lowestValue: 0, highestValue: 255)

for _ in 0..<size {
    data.append(rand.nextInt())
}

// data.sort()

func withSubscript() {
  let start = DispatchTime.now()

  for _ in 0..<1_000 {
      for i in 0..<size {
          if data[i] <= 128 {
              sum += data[i]
          }
      }
  }

  let stop    = DispatchTime.now()
  let elapsed = Double(stop.uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000_000

  print("With subscript:")
  print("- Elapsed Time: \(elapsed)")
  print("- Sum: \(sum)")
}

func withForIn() {
  let start = DispatchTime.now()

  for _ in 0..<1_000 {
      for element in data {
          if element <= 128 {
              sum += element
          }
      }
  }

  let stop    = DispatchTime.now()
  let elapsed = Double(stop.uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000_000

  print("With for-in:")
  print("- Elapsed Time: \(elapsed)")
  print("- Sum: \(sum)")
}

withSubscript()
withForIn()

I saved that code into a file called array-subscripting.swift.

Then, from the command line, we can run it without any optimizations, like this:

$ swift array-subscripting.swift 
With subscript:
- Elapsed Time: 0.924554249
- Sum: 1057062000
With for-in:
- Elapsed Time: 5.796038213
- Sum: 2114124000

As you mentioned in the post, there is a big difference in performance.

This difference is pretty negligible when the code is compiled with optimizations:

$ swiftc array-subscripting.swift -O
$ ./array-subscripting 
With subscript:
- Elapsed Time: 0.110622556
- Sum: 1054578000
With for-in:
- Elapsed Time: 0.11670454
- Sum: 2109156000

As you can see, both solutions are way faster than before, and very similar on time execution.

Back to your original question, subscripting provides direct access to memory, which is pretty efficient in the case of contiguous arrays, where elements are stored next to each other in memory.

for-in loops, in the other hand, create an immutable copy of each element from the array, which incurs in a performance hit.

Thermotaxis answered 29/6, 2018 at 22:7 Comment(4)
ehm, you should set sum to 0 before running every method. It took me same time to understand why your two sums are not the same.Brigade
Haha, yeah. I doubt that would have any impact on the result, but you are correct, I missed that.Thermotaxis
Ok it wasn't clear for me that for-in loops create immutable copy while subscripting is just a way to access memory. That the answer I wanted, thanks!Baalbeer
You an also do for var if you want to create a mutable copy.Thermotaxis

© 2022 - 2024 — McMap. All rights reserved.