What is a Memory-Efficient Doubly Linked List in C?

Asked 7/3, 2016 at 10:41 Answered 19/3, 2016 at 0:15

Solved c data-structures memory-efficient xor-linkedlist mem.-efficient-linkedlist

I had come across the term "Memory-Efficient Doubly Linked List" while reading a book on C Data structures. It just had one line saying that a memory-efficient doubly linked list uses less memory than a normal doubly linked list, but does the same job. Nothing more was explained, and no example was given. Just it was given that this has been taken from a journal, and "Sinha" in Brackets.

After searching on Google, the closest I came to was this. But, I couldn't understand anything.

Can someone explain me what is a Memory-Efficient Doubly Linked List in C? How is it different from a normal Doubly Linked List?

EDIT: Okay, I made a serious mistake. See the link I had posted above, was the second page of the article. I didn't see that there was a first page, and thought the link given was the first page. The first page of the article actually gives an explanation, but I don't think it is perfect. It only talks of the basics concepts of Memory-Efficient Linked List or XOR Linked List.

Godhood answered 7/3, 2016 at 10:41 Comment(9)

I don't really see the reason why you would ever want to use this - what is the real world application? If you have a vast linked list of thousands of nodes, the problem is not so much the memory use, as the slow iterations to find a given element. Which means you picked the wrong container class to begin with! A binary search tree or hash table etc would have made more sense. – Alben 7/3, 2016 at 11:4

@Lundin, I agree. But I never mentioned that this is much better than a doubly-linked-list. I just could not find an answer to this question, and when I found one, I thought of posting a question and answer explaining a XOR Linked List. This is because, when I did not know that this is called XOR Linked List, it was really hard searching, and it took me upto three hours to find out what it exactly is. – Godhood 10/3, 2016 at 3:50

I'm not the downvoter, but you link to the 2nd page of the article where the first page explains the structure and gives example code – Nazarite 14/3, 2016 at 17:17

link [link] (geeksforgeeks.org/…) These provide some information – Dominga 17/3, 2016 at 17:35

About the bounty text: 1) What do you mean with "all the variations"? 3) The very fact that it's impossible to find in the wild should already tell you that this is as impractical as can be, or worse. The Linux and BSD kernels have their own implementations of various lists and queues, which have to be realistic and efficient; they do NOT include XOR lists, wonder why? 4) What kind of references would you like or expect? – Matsu 26/3, 2016 at 15:11

@hmijail: 1) "all the variations" means XOR Linked List made into different types and linked lists by slight tweaks. 3) I know that it is impractical. But however impractical it is, there will be some use. I'm not expecting an answer to this as I couldn't find it myself, but still put it as I would like to see an example. I never even mentioned that they are being used in Linux kernels, so why are you assuming that I said that? 4) I meant links. Basically, when you write a huge answer, like mine, you refer a lot of links. To know the complete knowledge, I would like to get the links. – Godhood 26/3, 2016 at 16:54

My point about list usage in kernels is that those people worry about efficiency, they have a handful of different types of lists and queues, and yet they do NOT use the XOR list. Meaning that, if you worry about efficiency, you would probably be better served looking at what they do. – Matsu 26/3, 2016 at 19:41

@hmijail: well that is what I mentioned. I know that he XOR Linked List is a very ugly data structure. Did I even once say that it is good? But I want to know about it. What is wrong in curiosity? – Godhood 27/3, 2016 at 2:0

@SurajJain chat.stackoverflow.com/rooms/134363/… (let's clean our comments now) – Godhood 30/1, 2017 at 11:8

I know that this is my second answer, but I think the explanation I provide here might be better, than the last answer. But note that even that answer is correct.

A Memory-Efficient Linked List is more often called a XOR Linked List as this is totally dependent on the XOR Logic Gate and its properties.

Is it different from a Doubly Linked List?

Yes, it is. It is actually doing nearly the same job as a Doubly Linked List, but it is different.

A Doubly-Linked List is storing two pointers, which point at the next and the previous node. Basically, if you want to go back, you go to the address pointed by the back pointer. If you want to go forward, you go to the address pointed by the next pointer. It's like:

A Memory-Efficient Linked List, or namely the XOR Linked List is having only one pointer instead of two. This stores the previous address (addr (prev)) XOR (^) the next address (addr (next)). When you want to move to the next node, you do certain calculations, and find the address of the next node. This is the same for going to the previous node.It is like:

How does it work?

The XOR Linked List, as you can make out from its name, is highly dependent on the logic gate XOR (^) and it's properties.

It's properties are:

|-------------|------------|------------|
|    Name     |   Formula  |    Result  |
|-------------|------------|------------|
| Commutative |    A ^ B   |    B ^ A   |
|-------------|------------|------------|
| Associative | A ^ (B ^ C)| (A ^ B) ^ C|
|-------------|------------|------------|
| None (1)    |    A ^ 0   |     A      |
|-------------|------------|------------|
| None (2)    |    A ^ A   |     0      |
|-------------|------------|------------|
| None (3)    | (A ^ B) ^ A|     B      |
|-------------|------------|------------|

Now let's leave this aside, and see what each node stores:

The first node, or the head, stores 0 ^ addr (next) as there is no previous node or address. It looks like:

Then the second node stores addr (prev) ^ addr (next). It looks like:

The picture above shows node B, or the second node. A and C are address's of the third and first node. All the node, except the head and the tail, are like the above one.

The tail of the list, does not have any next node, so it stores addr (prev) ^ 0. It looks like:

Before seeing how we move, let's see the representation of a XOR Linked List again:

When you see

it clearly means there is one link field, using which you move back and front.

Also, to note that when using a XOR Linked List, you need to have a temporary variable (not in the node), which stores the address of the node you were in before. When you move to the next node, you discard the old value, and store the address of the node you were in earlier.

Moving from Head to the next node

Let's say you are now at the first node, or at node A. Now you want to move at node B. This is the formula for doing so:

Address of Next Node = Address of Previous Node ^ pointer in the current Node

So this would be:

addr (next) = addr (prev) ^ (0 ^ addr (next))

As this is the head, the previous address would simply be 0, so:

addr (next) = 0 ^ (0 ^ addr (next))

We can remove the parentheses:

addr (next) = 0 ^ 0 addr (next)

Using the none (2) property, we can say that 0 ^ 0 will always be 0:

addr (next) = 0 ^ addr (next)

Using the none (1) property, we can simplify it to:

addr (next) = addr (next)

You got the address of the next node!

Moving from a node to the next node

Now let's say we are in a middle node, which has a previous and next node.

Let's apply the formula:

Address of Next Node = Address of Previous Node ^ pointer in the current Node

Now substitute the values:

addr (next) = addr (prev) ^ (addr (prev) ^ addr (next))

Remove Parentheses:

addr (next) = addr (prev) ^ addr (prev) ^ addr (next)

Using the none (2) property, we can simplify:

addr (next) = 0 ^ addr (next)

Using the none (1) property, we can simplify:

addr (next) = addr (next)

And you get it!

Moving from a node to the node you were in earlier

If you aren't understanding the title, it basically means that if you were at node X, and have now moved to node Y, you want to go back to the node visited earlier, or basically node X.

This isn't a cumbersome task. Remember that I had mentioned above, that you store the address you were at in a temporary variable. So the address of the node you want to visit, is lying in a variable:

addr (prev) = temp_addr

Moving from a node to the previous node

This isn't the same as mentioned above. I mean to say that, you were at node Z, now you are at node Y, and want to go to node X.

This is nearly same as moving from a node to the next node. Just that this is it's vice-versa. When you write a program, you will use the same steps as I had mentioned in the moving from one node to the next node, just that you are finding the earlier element in the list than finding the next element.

I don't think I need to explain this.

Advantages of XOR Linked List

This uses less memory than a Doubly Linked List. Approximately 33% less.
It uses only one pointer. This simplifies the structure of the node.
As doynax said, a sub-section of a XOR can be reversed in constant time.

Disadvantages of XOR Linked List

This is a bit tricky to implement. It has higher chances of a failure, and debugging it is quite difficult.
All conversions (in case of an int) have to take place to / from uintptr_t
You cannot just acquire the address of a node, and start traversing (or whatever) from there. You have to always start from the head or tail.
~~You cannot go jumping, or skipping nodes. You have to go one by one.~~
Moving requires more operations.
It is difficult to debug a program which is using a XOR Linked List. It is much easier to debug a Doubly Linked List.

References

Godhood answered 11/3, 2016 at 15:12 Comment(0)

This is an old programming trick that allows you to save memory. I don't think it's used much anymore, since memory is no longer as tight a resource as it was in the old days.

The basic idea is this: In a conventional doubly-linked list, you have two pointers to adjacent list elements, a "next" pointer which points to the next element, and a "prev" pointer which points to the previous element. You can therefore traverse the list either forwards or backwards by using the appropriate pointers.

In the reduced-memory implementation, you replace "next" and "prev" with a single value, which is the bitwise exclusive-OR (bitwise-XOR) of "next" and "prev". You therefore reduce the storage for the adjacent element pointers by one half.

Using this technique, it is still possible to traverse the list in either direction, but you need to know the address of the previous (or next) element to do so. For instance, if you are traversing the list in the forward direction, and you have the address of "prev", then you can get "next" by taking the bitwise-XOR of "prev" with the current combined pointer value, which is "prev" XOR "next". The result is "prev" XOR "prev" XOR "next", which is just "next". The same can be done in the opposite direction.

The downside is that you can't do things like delete an element, given a pointer to that element, without knowing the address of either the "prev" or "next" element, since you have no context with which to decode the combined pointer value.

The other downside is that this sort of pointer trick circumvents the normal data type checking mechanism that a compiler may expect.

It's a clever trick, but in all honesty I see very little reason to use it these days.

Basuto answered 13/3, 2016 at 21:25 Comment(0)

I would recommend seeing my second answer to this question, as it is much clearer. But I am not saying this answer is wrong. This is also correct.

A Memory-Efficient Linked List is also called a XOR Linked LIST.

How Does It Work

A XOR (^) Linked List is a Link List in which instead of storing the next and back pointers, we just use one pointer to do the job of both the next and back pointers. Let's first see the XOR logic gates properties:

|-------------|------------|------------|
|    Name     |   Formula  |    Result  |
|-------------|------------|------------|
| Commutative |    A ^ B   |    B ^ A   |
|-------------|------------|------------|
| Associative | A ^ (B ^ C)| (A ^ B) ^ C|
|-------------|------------|------------|
| None (1)    |    A ^ 0   |     A      |
|-------------|------------|------------|
| None (2)    |    A ^ A   |     0      |
|-------------|------------|------------|
| None (3)    | (A ^ B) ^ A|     B      |
|-------------|------------|------------|

Lets now take an example. We have a doubly linked list with four nodes: A, B, C, D. Here's how it looks:

If you see, each node has two pointers, and 1 variable for storing data. So we're using three variables.

Now if you're have a Doubly Linked List with lot's of nodes, the memory it will be using will be too much. Too make it more efficient, we use a Memory-Efficient Doubly Linked List.

A Memory-Efficient Doubly Linked List is a Linked List in which we use just one pointer to move back and forth using XOR and it's properties.

Here's a pictorial representation:

How do you move back and forth?

You have a temporary variable (only one, not in the node). Let's say you're traversing the node from left to right. So you start at node A. In the pointer of node A, you store node B's address. You then move to node B. While moving to node B, in the temporary variable, you store node A's address.

The link (pointer) variable of node B has the address of A ^ C. You would take the previous node’s address (which is A) and XOR it with the current link field , giving you the address of C. Logically, this would look like:

A ^ (A ^ C)

Let's now simplify the equation. We can remove the parentheses and rewrite it because of the Associative property like:

A ^ A ^ C

We can further simplify this to

0 ^ C

because of the second ("None (2)" as stated in the table) property.

Because of the first ("None (1)" as stated in the table) property, this is basically

If you can't understand all this, just simply see the third property ("None (3)" as stated in the table).

Now, you got the address of node C. This will be the same process for going back.

Let's say that you were going from node C to node B. You would store the address of node C in the temporary variable, do the process given above again.

NOTE: Everything like A, B, C are addresses. Thanks to Bathsheba for telling me to make it clear.

Disadvantages of XOR Linked List

As Lundin mentioned, all the conversions have to be done from/to uintptr_t.
As Sami Kuhmonen mentioned, you have to start from a well-defined start point, not just a random node.
You cannot just jump a node. You have to go in order.

Also note that a XOR Linked List is not better than a doubly-linked list in most cases.

References

Godhood answered 7/3, 2016 at 10:41 Comment(0)

OK, so you've seen the XOR linked list, which saves you one pointer per item... but that is an ugly, ugly data structure, and far from the best you can do.

If you're worried about memory, it's almost always better to use a doubly-linked list with more than 1 element per node, like a linked list of arrays.

For example, while an XOR linked list costs 1 pointer per item, plus the item itself, A doubly-linked list with 16 items per node costs 3 pointers for each 16 items, or 3/16 pointers per item. (the extra pointer is the cost of the integer that records how many items are in the node) That is less than 1.

In addition to the memory savings, you get advantages in locality because all 16 items in the node are next to each other in memory. Algorithms that iterate through the list will be faster.

Note that an XOR-linked list also requires you to allocate or free memory each time you add or delete a node, and that is an expensive operation. With the array-linked list, you can do better than this by allowing nodes to be less than completely full. If you allow 5 empty item slots, for example, then you only have allocate or free memory on every 3rd insert or delete at worst.

There are many possible strategies for determining how and when to allocate or free nodes.

Hack answered 12/3, 2016 at 21:23 Comment(0)

You already got a pretty elaborate explanation of XOR linked list, I'll share a few more ideas on memory optimization.

Pointers normally take 8 bytes on 64 bit machines. It's necessary to address any point in RAM beyond 4GB addressable with 32 bit pointers.
Memory managers typically deal with fixed size blocks, not bytes. E.g. C malloc usually allocates within 16 byte granularity.

These two things imply that if your data is 1 byte, corresponding doubly linked list element will take 32 bytes (8+8+1, rounded up to the nearest 16 multiple). With XOR trick you can get it down to 16.

However, to optimize even further, you can consider using your own memory manager, that: (a) deals with blocks at lower granuarity, such as 1 byte or maybe even go into bits, (b) has stricter limit on overall size. For example, if you know your list will always fit inside a continuous block of 100 MB, you only need 27 bits to address any byte within that block. Not 32 or 64 bits.

If you aren't developing a generic list class, but you know specific usage patterns for your application, in many cases implementing such memory manager may be an easy job. For example, if you know you will never allocate more than 1000 elements and each element takes 5 bytes, you can implement memory manager as 5000-byte array with a variable that holds index of the first free byte, and as you allocate extra element, you just take that index and move it forward by the allocated size. In this case your pointers will not be real pointers (like int*) but will be just indexes inside that 5000-byte array.

Labialize answered 19/3, 2016 at 0:15 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++