Arrays vs Linked Lists in terms of locality
Say we have an unsorted array and a linked list. The worst case when searching for an element is O(n) for both data structures, but my question is:

Would the array still be much faster because of spatial locality within the cache, or can the cache make use of branch locality, allowing a linked list to be just as fast as an array?

My understanding for an array is that if an element is accessed, that block of memory and many of the surrounding blocks are brought into the cache, allowing much faster access to neighboring elements.

My understanding for a linked list is that since the path taken to traverse the list is predictable, the cache will exploit that and still prefetch the appropriate blocks of memory, even though the nodes can be far apart on the heap.
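To make the comparison concrete, here is a minimal sketch in C of the two searches being compared (the names are my own; both are O(n), and the only difference is memory layout):

```c
#include <stddef.h>

/* Contiguous layout: elements share cache lines, and the fixed
 * stride lets the hardware prefetcher run ahead of the loop. */
int array_search(const int *a, size_t n, int key) {
    for (size_t i = 0; i < n; i++)
        if (a[i] == key)
            return (int)i;
    return -1;
}

/* Linked layout: every node may sit on a different cache line,
 * and the address of the next node is only known after the
 * current node has been loaded. */
struct node { int value; struct node *next; };

int list_search(const struct node *head, int key) {
    int i = 0;
    for (const struct node *p = head; p != NULL; p = p->next, i++)
        if (p->value == key)
            return i;
    return -1;
}
```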

Gurge answered 28/9, 2013 at 7:9 Comment(6)
What is this "branch locality" you talk about? How could the cache get around the latency of main memory? Please describe what you think the cache does with the linked list nodes. - Divisionism
@delnan You can read about branch locality here: en.wikipedia.org/wiki/Locality_of_reference I don't know the answer to your second question. I explained what I think the cache might do with the nodes at the end of my question. - Gurge
The description of branch locality there seems to be about a fixed small set of alternatives, like the target of a branch instruction (hence the name). In contrast, the next node of a linked list might be anywhere. As for my second question: the problem caches fix is not primarily limited bandwidth but limited latency. Loading a large contiguous block into the cache at once (as happens for spatial locality/arrays) is an improvement because the cache only has to pay the latency cost once. So my question is, how can the cache load n nodes of a linked list without paying the latency cost n times? - Divisionism
@delnan Oh I see what you're saying. Branch locality only applies to predictable loops or function calls (basically anything with a branch instruction in the assembly code). To answer your question, I was thinking the cache loads in multiple nodes simultaneously using branch locality, rather than one at a time. But the problem with that is there is no way for the system to know which node to load into the cache until the node before it has been loaded, so it would pay the latency cost n times. Therefore traversing an array is much faster. Is my understanding correct now? - Gurge
It's consistent with my understanding at least ;-) - Divisionism
This "stop using linked lists" blog post and the "should you ever use linked lists" page are probably relevant. - Wittenberg

Your understanding of the array case is mostly correct. If an array is accessed sequentially, many processors will not only fetch the block containing the element but will also prefetch subsequent blocks to minimize cycles spent waiting on cache misses. If you are using an Intel x86 processor, you can find details about this in the Intel x86 optimization manual. Also, if the array elements are small enough, loading the block containing one element means the next element is likely in the same block.
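The "small enough elements" point is just arithmetic on the cache line size; a quick sketch (assuming a 64-byte line, which is typical on x86 but not universal):

```c
#include <stddef.h>

/* Assumption: 64-byte cache lines, as on most current x86 parts. */
enum { CACHE_LINE = 64 };

/* One cache miss fills a whole line, so a single miss brings in
 * this many consecutive elements of the given size "for free". */
size_t elements_per_line(size_t element_size) {
    return CACHE_LINE / element_size;
}
```

With 4-byte ints that is 16 elements per miss even before any hardware prefetch kicks in.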

Unfortunately, for linked lists the pattern of loads is unpredictable from the processor's point of view. It doesn't know that, after loading a node at address X, the next address to load is the contents of (X + 8).
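The "(X + 8)" remark corresponds to a node layout like the following (a sketch assuming a 64-bit system, where pointers are 8 bytes and the `next` field therefore lands at offset 8):

```c
#include <stddef.h>

/* On a 64-bit system the 8-byte-aligned pointer puts next at
 * offset 8, matching the "contents of (X + 8)" description. */
struct node {
    long value;
    struct node *next;
};

/* Each iteration's load address comes from the previous load:
 * the CPU cannot start fetching node i+1 until node i arrives,
 * so a chain of cache misses pays the full latency n times. */
long sum_list(const struct node *p) {
    long total = 0;
    while (p != NULL) {
        total += p->value;
        p = p->next;   /* the data-dependent "pointer chase" */
    }
    return total;
}
```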

As a concrete example, the sequence of load addresses for a sequential array access is nice and predictable: for example 1000, 1016, 1032, 1048, and so on (a fixed 16-byte stride).

For a linked list it will look more like 1000, 3048, 5040, 7888, and so on. It is very hard to predict the next address.
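The arithmetic behind the array's address sequence can be checked directly; a small sketch (using a 16-byte element type, since the strides in the example above imply 16-byte elements):

```c
#include <stddef.h>
#include <stdint.h>

/* A 16-byte element: two 8-byte doubles, no padding. */
struct pair { double x, y; };

/* The address of element i is base + i * sizeof(element), so
 * consecutive load addresses differ by a constant stride that
 * hardware prefetchers can extrapolate ahead of the loop. */
size_t element_stride(const struct pair *a, size_t i) {
    return (size_t)((uintptr_t)&a[i + 1] - (uintptr_t)&a[i]);
}
```

Starting such an array at address 1000 reproduces exactly the 1000, 1016, 1032, ... sequence; no such formula exists for heap-allocated list nodes.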

Fayum answered 28/9, 2013 at 7:42 Comment(0)
