Longest Increasing Subsequence in O(nlogn)

In the last post, longest increasing subsequence, we discussed the brute force and dynamic programming based solutions. The complexity of brute force solution is exponential whereas for the dynamic programming approach it is O(n2). Question is – Can we find the longest increasing subsequence in nlogn complexity?

Let’s revisit the problem statement: Given an array of integers, find the length of the longest increasing subsequence. An increasing subsequence contains elements A[i] and A[j] only if i < j and A[i] <  A[j]. For example, in an array A = [2,4,5,3,1,6,7], the increasing subsequences are [2,4,5,6,7], [2,3,6,7], [1,6,7] and many more. The longest subsequence here has a length of 5. The basic idea behind the solution is to keep track of all active subsequences at a given point of time. Based on the current elements being considered, update these active lists. To understand this process, let’s work out an example.

A = {2,8,7}
Monotonically increasing subsequences are {2,8} and {2,7}

What if we add another element, 11 in this?

A = {2,8,7,11}
Monotonically increasing subsequences are {2,8,11} and {2,7,11}

What if new element 9 is added to array? What happens now? If we add it t0 subsequences, length of longest subsequence remains 3.

A = {2,8,7,11,9}
Monotonically increasing subsequences are {2,8,9} and {2,7,9}

The decision to take for each element being considered is whether we create new active subsequences with length 3 with element 9 in them or continue with 11. If the next element is 10 we know that adding 9 to subsequence leads us to longer subsequences rather than keeping 11.

How do we decide when to replace and when to continue with the old element in the list of subsequences?  We can add or replace the current element A[i] to an existing sequence if there is an element A[j] (j > i) such that E < A[i] < A[j] or (E > A[i] < A[j] – for replace), E being the last element in subsequence. In above example, E = 11, A[i] = 9 and A[j] = 10.

What if A[i] is smaller than all elements in the present list of subsequences? In this case, we have to create a new list and add A[i] into it. The invariant is to maintain lists of increasing sequences and update them based on the next element encountered.  Each time a new element is to be added, scan all the lists of subsequences in decreasing order of their length. The following algorithm shows how to add/replace the new elements in the existing lists or to create a new list with it.

1. If A[i] is the smallest among all end candidates of active lists, start new active list with A[i] of length 1.
2. If A[i] is largest among all end candidates of active lists, clone the largest active list, and append A[i] to it.
3. If A[i] is in between, find a list with largest end element that is smaller than A[i]. Clone and append A[i] to these lists.
4. Discard all other lists of same length as that of this modified list.

Let’s take an example and see how it works with an array A:
A = [ 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13 ].

A = 0. Case 1. Since there are no active lists of subsequences, create a new one.
Lists = [].

A = 8.  A[i] is greater than the end elements of all the lists, so select the longest one and append A to it.
Lists [ , [0, 8] ]

A = 4. A[i] is less than the last element of one of the list and greater than the other. Find the list which has a last element less than A[i], it’s the first list containing . Clone it and append A to it and discard all other lists of the same length.
Lists = [ , [0, 4] ].
[0, 8] is discarded

A = 12. Same as A since its greater than all the last elements of the current lists.
Lists = [ , [0, 4], [0, 4, 12] ]

A = 2. Same as A, Clone the one with largest end which is less than A[i], append A to it and discard all same length lists.
Lists = [ , [0, 2], [0,4,12]
[0, 4] is discarded.

A = 10. Same as A. Clone, append and discard.
Lists = [ , [0, 2], [0,4,10] ]
[0, 4, 12] is discarded.

A = 6. Same as A Clone, append and discard.
Lists = [ , [0, 2], [0,2,6] ]
[0, 4, 10] is discarded.

A = 14. Same as A. Clone and append.
Lists = [ , [0, 2], [0,2,6],  [0,2,6,14] ]

A = 1. Same as A. Clone, append and discard.
Lists = [ , [0,1], [0,2,6] [0,2,6,14]
[0, 2] is discarded

A = 9. Same A. Clone, append and discard.
Lists = [ , [0,1], [0,2,6] [0,2,6,9] ].
[ 0, 2, 6, 14 ] is discarded.

A = 5. Same as A. Clone, append and discard.
Lists = [ , [0,1], [0,2,5] [0,2,6,9]].
[ 0, 2, 6] is discarded.

A = 13. Same as A. Clone and append.
Lists = [ , [0,1], [0,2,5] [0,2,6,9], [0,2,6,9,13]].

Hence, the Longest Increasing Subsequence is [ 0,2,6,9,13] with a length of 5.

It seems like a lot of things need to be done just for maintaining the lists and there is significant space complexity required to store all of these lists. We can optimize on this, observe that the only thing that we are using from the lists are their last elements and sizes. We do not care what was prior to them in list. So, can we store the last elements of an auxiliary array and do operations on them? Size of this array in worst case will be n if the input array is sorted.

To append to the list, add another element in the auxiliary array. To replace just overwrite the smallest element which is greater than the current element. To find the smallest element which is greater than the current element, we use an algorithm called ‘Ceiling’, which is a modification of binary search.

To find the length of the longest subsequence, keep track of the length of the auxiliary array, because, in the end, length of it will be the length of  LIS.

Implementation of longest increasing subsequence in O(nlogn)

#include<stdio.h>
#include<stdlib.h>

int ceilElement(int a[],int start,int end,int key){
while(end-start > 1){
int mid = start +(end - start)/2;
if(a[mid]>=key){
end = mid;
}
else{
start = mid;
}
}
return end;
}

int longestIncreasingSubsequence(int input[], int size)
{
if(!size)
return 0;

int a[size];
int length=1;

a = input;

for(int i=1; i<size; i++)
{
if(input[i]< a){
a=input[i];
}
else if(input[i]>a[length-1]){
a[length++]=input[i];
}
else {
a[ ceilElement(a,-1,length-1,input[i]) ]= input[i];
}
return length;
}

int main() {
int a[]={0,8,4,12,2,10,6,14,1,9,5,13,1,7,15};
int size =sizeof(a)/sizeof(a);

printf("Longest Increasing Sub-sequence is =     %d",longestIncreasingSubsequence(a,size));
return 0;
}

The complexity of this algorithm is O(n log n) as for each element in the array, it requires log n time to find the ceiling of it and put it at the correct position.

This article has taken some inspiration from: http://www.geeksforgeeks.org/longest-monotonically-increasing-subsequence-size-n-log-n/ and http://stackoverflow.com/questions/6129682/longest-increasing-subsequenceonlogn and the comments provided by readers under these articles.

Please share if you find something wrong or missing. Also, if you want to contribute to the website, please refer to Publishing and contact us. We would love to publish your article and at the same time, will pay you too.