Longest increasing subsequence in O(nlogn)

Longest increasing subsequence in O(nlogn)

In last post, longest increasing subsequence , we discussed brute force and dynamic programming solution. Complexity of brute force solution is exponential where are for dynamic programming approach is O(n2). Question is can we solve longest increasing subsequence in nlogn complexity?

For completeness shake, problem statement is : Given an array of integers, find length longest increasing subsequence. Increasing subsequence contains elements A[i] and A[j] if i<j and A[i]< A[j]. For example, in array A = [2,4,5,3,1,6,7], increasing subsequence are [2,4,5,6,7], [2,3,6,7], [1,6,7] and many more. Longest such subsequence has length of 5.

longest increasing subsequence in O(nlogn)

Longest increasing subsequence in O(nlogn)

Basic idea behind solution is to keep track of all active subsequences at given point of time. Based on the current elements being consider, update these active lists. To understand this process, let’s work out an example.

A = {2,8,7}
Monotonically increasing subsequences are {2,8} and {2,7}

What if we add another element, 11 in this?

A = {2,8,7,11}
Monotonically increasing subsequences are {2,8,11} and {2,7,11}

What if new element 9 is added to array? What happens now? If we add it t0 subsequences, length of longest subsequence remains 3.

A = {2,8,7,11,9}
Monotonically increasing subsequences are {2,8,9} and {2,7,9}

Question is should we create new active subsequences with length 3 adding 9 to them or continue with 11 in them? Consider the next element is 10. At this point we know, that adding 9 to subsequence leads us to longer subsequences than keeping 11.

How do we decide when to replace and when to continue with the old element in subsequence?  We can add or replace current element A[i] to an existing sequence if there is an element A[j] (j > i) such that E < A[i] < A[j] or (E > A[i] < A[j] – for replace), E being the last element in subsequence. In above example, E = 11, A[i] = 9 and A[j] = 10.

What if A[i] is smaller than all elements in all subsequences? In this case, we have to create a new list and add A[i] into it. Whole idea is to maintain lists of increasing sequences and update them based  on new element.  Every time new element is to be added, scan all lists (for end elements) in decreasing order of their length. Below algorithm gives how to add, replace new element in existing list or to create a new list with it.

1. If A[i] is smallest among all end candidates of active lists, start new active list of length 1.
2. If A[i] is largest among all end candidates of active lists, clone the largest active list, and extend it by A[i].
3. If A[i] is in between, find a list with largest end element that is smaller than A[i]. Clone and extend this list by A[i]. 
4. Discard all other lists of same length as that of this modified list.

Let’s take an example and see how it works:
A = [ 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13 ].

A[0] = 0. Case 1. Since there are no active lists, create a new one.
Lists = [[0]].

A[1] = 8. Case 2. A[i] is greater than end elements of al lists, so select the longest one and clone it extend.
Lists [ [0], [0, 8]

A[2] = 4. A[i] is less than end of one of the list and greater than other. Find the list which has end element less than A[i], it’s [0]. Clone it, extend and discard all list with similar length lists.
List = [ [0], [0, 4] ].
[0, 8] is discarded

A[3] = 12. Same as A[1].
Lists = [ [0], [0, 4], [0, 4, 12] ]

A[4] = 2. Same as A[2], Clone the one with largest end which is less than A[i], extend it and discard all same length lists.
Lists = [ [0], [0, 2], [0,4,12]
[0, 4] is discarded.

A[5] = 10. Same as A[4]. Clone, extend and discard.
Lists = [ [0], [0, 2], [0,4,10] ]
[0, 4, 12] is discarded.

A[6] = 6. Same as A[5] Clone, extend and discard.
Lists = [ [0], [0, 2], [0,2,6] ]
[0, 4, 10] is discarded.

A[7] = 14. Same as A[1]. Clone and extend.
Lists = [ [0], [0, 2], [0,2,6],  [0,2,6,14] ]

A[8] = 1. Same as A[6]. Clone, extend and discard.
Lists = [ [0], [0,1], [0,2,6] [0,2,6,14]
[0, 2] is discarded

A[9] = 9. Same A[8]. Clone, extend and discard.
Lists = [ [0], [0,1], [0,2,6] [0,2,6,9] ].
[ 0, 2, 6, 14 ] is discarded.

A[10] = 5. Same as A[9]. Clone, extend and discard.
Lists = [ [0], [0,1], [0,2,5] [0,2,6,9]].
[ 0, 2, 6] is discarded.

A[11] = 13. Same as A[1]. Clone and extend.
Lists = [ [0], [0,1], [0,2,5] [0,2,6,9], [0,2,6,9,13]].

LIS is [ 0,2,6,9,13] with length of 5.

This looks like lot of things to be done just for maintaining list and there requires a lot of space to store all of these lists. There is an insight which helps do optimize all this stuff, observer that all we are using from lists are their end elements. We do not care what was prior to them in list. So, can we store just end elements of an auxiliary array and do operations on them? Size of this array in worst case will be n if arrays is sorted.

To extend the list, add another element in auxiliary array. To replace just overwrite the smallest element which is greater than current element. To find smallest element which is greater than current element, we use algorithm called ‘Ceiling’, which is modification of binary search.

To find length of longest subsequence, keep track of length of auxiliary array, because at the end, length of it will be length of  LIS.

Implementation of longest increasing subsequence in O(nlogn)

#include<stdio.h>
#include<stdlib.h>
 
int ceilElement(int a[],int start,int end,int key){
	while(end-start > 1){
		int mid = start +(end - start)/2;
		if(a[mid]>=key){
			end = mid;
		}
		else{
			start = mid;
		}
	}
	return end;
}
 
int longestIncreasingSubsequence(int input[], int size)
{
	if(!size)
		return 0;
 
	int a[size];
	int length=1;
 
	a[0] = input[0];
 
	for(int i=1; i<size; i++)
	{
		if(input[i]< a[0]){
			a[0]=input[i];
		}
		else if(input[i]>a[length-1]){
			a[length++]=input[i];
		}
		else
			a[ ceilElement(a,-1,length-1,input[i]) ]= input[i];
	}
	return length;
}
 
int main() {
	int a[]={0,8,4,12,2,10,6,14,1,9,5,13,1,7,15};
	int size =sizeof(a)/sizeof(a[0]);
 
	printf("Longest Increasing Sub-sequence is = %d",longestIncreasingSubsequence(a,size));
	return 0;
}

Complexity of this algorithm is O(n log n) as for each element in array, it requires log n time to find ceiling of it and put it at correct position.

This article is influence by http://www.geeksforgeeks.org/longest-monotonically-increasing-subsequence-size-n-log-n/ and http://stackoverflow.com/questions/6129682/longest-increasing-subsequenceonlogn and comments provided by readers under these articles.

Please share if you find something wrong or missing. Also, if you want to contribute to website, please refer Publishing and contact us. We would love to publish your article and at the same time, will pay you too.