Linked list implementation in Linux kernel

Linked list implementation in Linux kernel

We learned a lot about linked and solve around 30 odd problems : Linked list problems. However, the actual implementation of a linked list in Linux kernel is very different than what we learned. Let us understand how a linked list is implemented in Linux kernel and used in kernel code.

In a simple linked list, nodes contain data and point to the next node in the linked list. In other words, its the list which contains nodes which are linked. A typical example of the structure of a node of this kind of a list is:

class Node {
  private int data;
  private Node next;

  public Node (int data, int next){
      this.data = data;
      this.next = next;   
  }

  //Getters
};

However, linked lists in the Linux kernel, it’s the other way around, that is linked list is contained inside the node. This means, that there is no next pointer inside the node and each node is effectively a head node like a circular linked list. Also, it is a doubly linked list. A lot of things in one sentence!!

Linked list implementation in Kernel

Let’s understand it in detail. As said above, linked list is contained inside the node, structure of node is like this:

struct node {
 int data;
 list_head list; // list is inside the node
};

Here list_head is what defined as :

struct list_head{
  struct list_head *next, *prev;
}

See it has two pointers, essentially, making any node which contains this structure, part of a doubly linked list. The most interesting part of this kind of definition of a node is that same node can be part of multiple lists without being reallocated for each list. For example, in traditionally linked lists, if we need two linked lists: one as odd numbers and other as prime numbers, we would have to define two linked lists, one with odd numbers and other with prime numbers. With implementation provided in the Linux kernel, we can attach the same node to two lists as shown below, where an odd number which is also prime is allocated only once.

struct numbers {
 int number;
 list_head odd_numbers; // contains all odd numbers
 list_head primer_numbers; // Contains all prime numbers
};

How to access a node in list in Linux Kernel

We understood the node structure, how can we access a given node of a linked list. It was simple to do in a normal linked list as the base address of node accessible. In list implemented in Linux kernel, we have a pointer to the list_head structure in the next node and not a pointer to next node itself, as shown below.

linked list representation in linux kernel

There is a beautiful trick in C, which is used here to access the base address of the node whose list_head pointer is given. Once the base address of a node is known, accessing the node becomes similar to a normal linked list. The trick is that given a pointer to list_head in the structure; to find the base pointer of structure, find the offset at which list_head is stored in list. Once, we know the offset, (how many bytes, it is far from base address), then just subtract that offset from the absolute address of the pointer (which is given) and we get the base address. Figure explains

node representation in linked list in linux kernel

Let’s take an example, we will use structure numbers as given above. To get offset of element number in that, code is:

(unsigned long)(&((struct numbers *)0)->number)

Now, that we have offset of number and absolute address of number, we can get the base address of struct numbers as:

((struct numbers *)((char *)(pos) - \
          (unsigned long)(&((numbers *)0)->number)))

ANSI C defines the offsetof() macro in which lets you compute the offset of field f in struct s as offsetof(struct s, f). If for some reason you have to code this sort of thing yourself, one possibility is

#define offsetof(type, f) ((size_t) \
 ((char *)&((type *)0)->f - (char *)(type *)0))

Above code is not portable and some compilers may have problems with it.

There are some MACROS which are defined in the linux kernel which are useful in dealing with linked lists. Below are some examples:

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr)-(unsigned long)(&((type *)0)->member)))
<pre>#define LIST_HEAD(name) \
	struct list_head name = LIST_HEAD_INIT(name)

#define INIT_LIST_HEAD(ptr) do { \
	(ptr)-&gt;next = (ptr); (ptr)-&gt;prev = (ptr); \
} while (0)

Please refer to this file to understand various macros which are used in Linux kernel.

In next post, we will use these constructs and see how can we created a list, access it and delete a list.

Please share if there is something wrong or suggestion for improvements. Also, if you like the content, please share it.

Linked list data structure

Linked list data structure

Linked list is a very important data structure to understand as lot of problems are asked based on linked list in Amazon, Microsoft and Google interview. Today, we will understand the basics of linked list data structure and it’s implementation. 

Linked list represent linear sequence of elements. Each element connected to next element using chain of references. Another data structure which store linear sequence of items is array. There are some advantages and uses cases where linked list way of storing sequence is more efficient than array, I will cover that into next post : Arrays Vs Linked lists.

In last paragraph, I emphasized on linkedlist being linear data structure. In linear data structure, there is a sequence and order how elements are inserted, arranged and traversed. In order to go to tail of linked list, we have to go through all of the nodes.

linked list data structure
linear data structure when elements can be traversed only in one order


Non linear data structures are the ones where elements are not arranged or traversed in a specific order. One element may be connected to many others, hence we cannot traverse them in the same order every time. Example of non-linear data structure would be maps, dictionaries, trees, graphs etc.

linked list as data structure
Non  linear data structure when nodes cannot be traversed in one order always

Linked list implementation

Linked list consists of node, any number of nodes. Each node contains two things : first, value of the node, this value can be of any type, integer, string, or other user defined type. Second, a reference which points to next node in linked list. A node can be declared as follows:

typedef struct Node {
	int data;
	struct Node * next;
} Node;
Node structure
Linked list

What happens if the node is last node in linked list? At last node, next pointer of the node points to the null. It’s very important to understand this bit, as this condition will be used on almost every problem you have to solve on linked list.

Linked list is dynamic data structure. By dynamic data structure, we mean, it’s size and nature is not defined at the time of compilation, but defined at run time. Every time, a new node is added to linked list, new memory location is allocated and previous node’s next pointer will point to new node.

Operations of linked list

  • Adding node at the end of list
    There are three basic steps to add a node to linked list at end:
  1. Check if there is already a node
    1. If no, then create a new node and return it as head of linked list.
  2. If there is a node,
    1. Scan through linked list using next pointer, reach to the last node.
    2. Create a new node, and point next pointer of last node to this new node.
Node * createNode(int val){
	Node * newNode = (Node *)malloc(sizeof(Node));
	if(newNode){
		newNode->data = val;
		newNode->next = NULL;
	}
	return newNode;
}

void addNode(Node **headRef, int value){
	//create new node
	Node *newNode = createNode(value);

	//find the last node
	Node *currentNode = *headRef;
	while(currentNode && currentNode->next != NULL){
		currentNode = currentNode->next;
	}
	if(currentNode)
		currentNode->next = newNode;
	}
	else{
		//Change headRef to point to new head.
		*headRef = newNode;
	}
}

Complexity of adding a node to linked list is O(n). 

  • Insert node at head of list
    In this case too, we allocate a new node, however, this time we do not have to scan the entire list. Every time we add node to list, it’s head changes though.
  1. Check if there is already a node
    1. If no, then create a new node and return it as head of linked list.
  2. If there is a node,
    1. Create a new node, and point next pointer new node to head.
    2. Return new node as head pointer.
Node * createNode(int val){
	Node * newNode = (Node *)malloc(sizeof(Node));
	if(newNode){
		newNode->data = val;
		newNode->next = NULL;
	}
	return newNode;
}

void addNode(Node **headRef, int value){
	//create new node
	Node *newNode = createNode(value);
	newNode->next = *headRef;
	*headRef = newNode;
}

Linked list data structure problems

It’s very important to understand that linked list is a recursive data structure. Base case is a linked list with no node, represented by NULL node. Every problem on linked list can be solved using template : process one node, and then recursively process the remaining linked list.

In programming terms, linked list is divided into two parts, head and tail. The node being processed is called head and rest of the linked list is tail. Tail has the exactly same structure as the original list. 

Problems like merging linked lists, reverse a linked list, find length of linked list all can be solved using the same template of processing one node and the recursively call function on remaining node. 

Types of linked list

There are three types of linked lists :
1. Singly linked list 
Singly linked lists contain nodes with data and reference, i.e., next, which points to the next node in the sequence of nodes. The next pointer of the last node will point to null. In singly linked list you can traverse only in one direction.

singly linked list
singly linked list

2. Doubly linked list
In a doubly linked list, each node contains two links – previous, which points to the node before current node and next,  which points to next node. The previous pointer of the first node and next pointer of the last node will point to null. In doubly linked list, you can traverse it both directions. Two references adds to weight as extra memory is required.

doubly linked list
doubly linked list

3. Circular linked list
In circular linked list, next pointer of  the last node will point to the first node. A circular linked list can be both singly as well as doubly linked list.

circular linked list
Circular doubly linked list

This was all for basics of linked list, I know problems on them are hard to solve but if you look at all the problems, they boil down to one thing : understanding of node and how recursion can be used. In next posts, we will be solving many of these problems and see how we can use these basics.

Please share if there is something wrong or missing. If you are interested in contributing to website and share your knowledge with thousands of users across world, please reach out to us at communications@algorithmsandme.com

Delete a given node from linked list

Delete a node from linked list

Write a program to delete a node from a linked list. For example, in the linked list shown below, if we were asked to delete node 5, the result will be the linked list as shown.

delete node from linked list

delete a node in linked list

To delete a node from the linked list, first, we need to find the node with the given value. We compare given value with the value of each node of the list; if the values match, we found the node to delete. To delete a node in a singly linked list, next pointer of the previous node of current node will point to next node of the current node. previous->next = current->next

However, if we are matching the value of the node with the given value as mentioned above, we don’t have access to the previous node of the current node.

So what shall we do? There are two ways we can solve this problem: First, we compare the value of next of the current node instead of the current node, that way we are at the previous node of the node to be deleted when the match happens. Second, keep track of the previous node while traversing the linked list forward.

The node to be deleted can be anywhere in the linked list, there are three possibilities:

  • Node to be deleted is the head node of linked list.
  • Node to be deleted is any of the middle nodes.
  • Node to be deleted is last node of linked list.
  • In the first case, we should also update head pointer of the linked list. For second and third cases, we can just change the pointers: next of the previous node points to the next of the current node).

    package com.company;
    
    /**
     * Created by sangar on 14.10.18.
     */
    public class LinkedList<T> {
        private Node<T> head;
    
        public LinkedList(){
            head = null;
        }
    
        public void insert(T data){
            //If this is the first node to insert
            if(this.head == null){
                this.head = new Node<>(data);
            }
            else{
                Node current = this.head;
                /* Defensive programming, just in case current is null
                * As we check null condition for head earlier, it should
                * not be null in this while loop ever though
                * */
                while(current != null && current.getNext() != null){
                    current = current.getNext();
                }
                //We are at the last node.
                current.setNext(new Node(data));
            }
        }
    
        public Node<T> get(T data){
            /*
                Base condition if there is no nodes,
                return null
             */
            if(this.head == null){
                return null;
            }
            else{
                Node current = this.head;
                /* Defensive programming, just in case current is null
                * As we check null condition for head earlier, it should
                * not be null in this while loop ever though
                * */
                while(current != null && current.getData() != data){
                    current = current.getNext();
                }
    
                return current;
            }
        }
    
        public void deleteNode(int val){
            Node current = this.head;
            Node prev = null;
    
            while(current != null && (int)current.getData() != val){
                prev = current;
                current = current.getNext();
            }
    
            //Case where node to be deleted is head
            if(current == this.head){
                this.head = this.head.getNext();
                 return;
            }
    
            if(current != null){
                prev.setNext(current.getNext());
            }
            //let the gc free the memory
        }
    
        public void printList(){
            Node current = this.head;
    
            while(current != null){
                System.out.print(" " + current.getData());
                current = current.getNext();
            }
        }
    }
    

    Delete a node from a linked list with a reference

    Above implementation will not work if there are duplicate values in a linked list, we will delete the first node with the given value and it may not be always the node intended to be deleted. What if we were given the pointer or address to the node to be deleted?

    One way is to scan through the linked list and find the node matching the pointer. We can do that by comparing current->next to the pointer given. If they are equal, we can link the current->next = givenNode->next.

    Can we avoid the traversing of the linked list? Yes, we can. We know the pointer to the node, let’s say it is givenNode. If we copy value of givenNode->next node into givenNode->value, we can delete the givenNode->next node. Before deleting the node, we will link givenNode->next = givenNode->next->next.

    delete a node from linked list

    This method will not work if the node to be deleted is the last node of the linked list. In that case, even if we avoid accessing the null node after the last node, we may not be able to touch previous node’s next pointer, which will point to a freed node once we delete the last node.

        public void deleteNode(Node nodeToDelete){
            Node current = this.head;
    
            if(nodeToDelete == null) return;
    
            if(nodeToDelete.getNext() != null){
                //Copy data
                nodeToDelete.setData(nodeToDelete.getNext().getData());
                //Change pointer
                nodeToDelete.setNext(nodeToDelete.getNext().getNext());
            }
            //let the gc free the memory
        }
    

    The complexity of both of the above algorithms to delete a node in a linked list is O(N) where N is the number of nodes in the linked list.

    Please share if there is anything wrong or missing. If you are preparing for an interview, and want to have personalized coaching to guide you through it, please signup here for free interview preparation session.

    Merge point of two linked lists

    Merge point of two linked lists

    Given two linked lists, we need to find out that if two linked list merge, if yes, find merge point.
    For example, in the figure below, node(5) is the merge point of two linked lists.

    merge point of two linked list

    How can we identify that two linked list do not merge? Easy, if two linked lists merge at any given node including the last node, the last node should be same. If the last nodes of two linked lists are different, we can safely say that two linked list do not merge.

    Merge point of two linked lists

    How to find the merge point is the next problem. From the above diagram, notice that all the nodes after merge point are same if two linked list are merged at a certain node. So that part of two linked lists is common to both.
    We can use this insight to find the merge point. If lengths of two linked lists are equal, then we can just start a traversal of those lists from heads and they will meet the merge point.

    What if the lengths are different? In this case, there is an equal number of nodes in two linked list after the merge point. The difference has to be before the merge point. If we get the difference between the lengths and move the pointer on the longer list head by the difference, we will reach the merge point at the same time when we traverse lists again.

    merge point of two linked lists

    If length of two linked lists is different, problem reduces to the problem where we need to reach at the end of two lists at the same time. There is a simple solution to that.

    Merge point of two linked lists: implementation

    #include<stdlib.h>
    #include<stdio.h>
    
    typedef struct node{
        int data;
        struct node *next;
    } Node;
    
    void print_list(Node * head){
            while(head){
                     printf("%d->" , head->data );
                     head = head->next;
            }
    
            printf("NULL");
            printf("\n");
    
    }
    
    int merge_point(Node *L1, Node *L2){
    
            Node * current1 = L1;
            Node * current2  = L2;
    
            int count_1 = 0;
            int count_2 = 0;
            //If any one list is null, return false
            if(!current1 || !current2 ) return -1;
    
            //Count number of nodes in list 1;
            while(current1->next){
                    count_1++;
                    current1 = current1->next;
            }
            // Count number of nodes in list 2
            while(current2->next){
                    count_2++;
                    current2 = current2->next;
            }
            /*If last nodes of both linked list are not same,
            linked list don't merge. */
            if(current1 != current2)
                    return -1;
    
            //Calculate the difference in lengths of linked list.
            int diff = abs(count_1 - count_2);
    
            //Move the longer linked list by diff number of nodes
            if(count_1 < count_2){
                    Node * temp = L1;
                    L1 = L2;
                    L2 = temp;
            }
            current1 = L1;
            current2 = L2;
    
            while(diff && current1){
                    diff--;
                    current1 =  current1->next;
            }
            // Now move both linked list till they meet at merge point
            while(current1 != current2){
                    current1 =  current1->next;
                    current2 =  current2->next;
            }
            return current1->data;
    }
    
    void push(Node **head, int value){
        if(*head == NULL){
            (*head) = (Node *)malloc(sizeof(Node));
            (*head)->data = value;
            (*head)->next = NULL;
        }
        else{
            Node * newNode= (Node *)malloc(sizeof(Node));
            if(newNode){
                newNode->data = value;
                newNode->next = (*head);
                *head = newNode;
            }
        }
    }
    
    void main(){
            Node * L1 = NULL;
            Node * L2 = NULL;
     		push(&L1,3);
     		Node * temp1 = L1;
            push(&L1,4);
            push(&L1,6);
            push(&L1,7);
            push(&L1,8);
            push(&L1,9);
            print_list(L1);
            printf("\n");
            push(&L2,5);
            Node * temp2 = L2;
            push(&L2,7);
            push(&L2,8);
            push(&L2,2);
            push(&L2,1);
            push(&L2,10);
            printf("\n");
    
            temp2->next = temp1;
    
            print_list(L2);
            int result1 = merge_point(L1, L2);
            if(result1 != -1){
                    printf("\n Merge Point : %d", result1);
            }
            else{
                    printf("\n Lists don't merge");
            }
    }
    

    The complexity of the algorithm to find merge point of two linked list is O(n), however, we scan the lists multiple times.

    Please share if there is something wrong or missing. If you are preparing for an interview and want to have personalized coaching, please signup for a free demo session.