In my experience with Racket so far, I've not given much thought to vectors, because I gathered that their main benefit — constant-time access to elements — was not significant until you're working with lot of elements.
This doesn't seem quite accurate, however. Even with a small number of elements, vectors have a performance advantage. For example, allocating a list is slower than allocating a vector:
#lang racket
(time (for ([i (in-range 1000000)]) (make-list 50 #t)))
(time (for ([i (in-range 1000000)]) (make-vector 50 #t)))
>cpu time: 1337 real time: 1346 gc time: 987
>cpu time: 123 real time: 124 gc time: 39
And retrieving an element is slower too:
#lang racket
(define l (range 50))
(define v (make-vector 50 0))
(time (for ([i (in-range 1000000)]) (list-ref l 49)))
(time (for ([i (in-range 1000000)]) (vector-ref v 49)))
>cpu time: 77 real time: 76 gc time: 0
>cpu time: 15 real time: 15 gc time: 0
BTW this performance ratio holds if we increase to 10 million:
#lang racket
(define l (range 50))
(define v (make-vector 50 0))
(time (for ([i (in-range 10000000)]) (list-ref l 49)))
(time (for ([i (in-range 10000000)]) (vector-ref v 49)))
>cpu time: 710 real time: 709 gc time: 0
>cpu time: 116 real time: 116 gc time: 0
Sure, these are synthetic examples, to the extent that most programs don't allocate structures or use list-ref
a million times in a loop. (And yes, I'm deliberately grabbing the 50th element to illustrate the performance difference.)
But they're also not, because across a whole program that relies on lists, you're going to be incurring a little extra overhead every time you touch those lists, and all those little inefficiencies will add up into slower running time for the overall program.
Thus my question: why not just use vectors all the time? In what situations should we expect better performance from lists?
My best guess is that because it's just as fast to get an item off the front of a list, e.g.:
#lang racket
(define l (range 50))
(define v (make-vector 50 0))
(time (for ([i (in-range 1000000)]) (list-ref l 0)))
(time (for ([i (in-range 1000000)]) (vector-ref v 0)))
>cpu time: 15 real time: 16 gc time: 0
>cpu time: 12 real time: 11 gc time: 0
... that lists are preferred in recursion siutations, because you're mostly working with cons
and car
and cdr
, and it saves space to work with a list (vectors cannot be broken and put back together without copying the whole vector, right?)
But in situations where you're storing and retrieving data elements, it seems that vectors have the advantage, no matter the length.
list-ref
. The lookup is not linear. – Norrielength
, but the relative performance difference oflist-ref
andvector-ref
remains. Moreover,length
is an order of magnitude slower thanvector-length
. – Testaceouslength
for lists is probably calculated anew on each call. length of vector is surely maintained internally for each vector. – Ingratiating