Find k number in sliding window problem

Tags: , , , ,

Sliding window problem

Given a large integer array of size x, window size of n and a random number k, find smallest k numbers in every window of n elements in array. This is commonly know as sliding window problem. For example: for an array [2,3,1,5,6,4,2,5,4,3,8] k = 2 and n = 6, output should be [1,2],[1,2],[1,3][1,4][1,3][1,3]. How? see below figure.

This problem regularly features in Amazon interviews.

Find k numbers in sliding window : thoughts

If we spit down the problem, it reduces to find k smallest elements in an array, which can easily be solve in multiple ways. All we have to take care of is moving the window and storing results for each window.

Quick sort method
First way is to use quick sort, we randomly pick a pivot and put it in right place. When pivot is at right place, all elements on the right side of pivot are greater than pivot and all elements on the left side are less than pivot. If pivot is a kth position in array, all elements on left side of pivot automatically become K smallest elements of given array. In worst case this method take O(n log n) for each window.

Using heaps
What are we interested in is k elements, what if from current window, we take out first k numbers and consider them as k smallest elements? This set of k numbers may change based value of following numbers in the window. Which way? If new number is smaller than any of the number chosen randomly, new number has to be added into the k smallest element set. However, we have only k spaces there, so someone has to move out.

If new number is less than any number in set, it must be less than maximum number in set

Given above fact, we can always swap new number with maximum of set. Now problem is how to find max in a set? This set will modified repeatedly, so we cannot just sort it once and find the max. For use cases when data is changing and we have to find max of that set, heaps are the best data structures to use. In this case we will use max heap. Max heap is kind of heap where children of root node are smaller than root node. Max heap will give us O(1) complexity to find max and O(log n) complexity to heapify on removal old max and insertion of new number.

Algorithm

  1. Create a max heap with first k elements of window.
  2. Scan through remaining elements in window
    1. If root of max heap is less than new number, remove the root and add new element to heap
    2. All elements in heap at the end of processing are k smallest numbers in window.

    Sliding window algorithm to find k smallest elements : Implementation

    #include<stdio.h>
    #include<stdlib.h>
    #include <math.h>
    
    typedef struct node {
    	struct node * left;
    	struct node * right;
    	int data;
    } heapNode;
    
    int leftChild(int i){
    	return 2*i + 1;
    }
    
    int rightChild(int i){
    	return 2*i + 2;
    }
    
    void swapPtr(heapNode *a[], int i, int largest){
    	heapNode *temp = a[i];
    	a[i] = a[largest];
    	a[largest] = temp;
    }
    /* This function heapifies heap after removal of root  
    or at time of building heap from an array */
    void max_heapify_ptr(heapNode *a[], int i, int len){
            int largest = i;
            int left, right;
    
            left = leftChild(i);
            right = rightChild(i);
           
            if(left <= len && a[i]->data <a[left]->data){
                    largest = left;
            }
            if(right <= len && a[largest]->data < a[right]->data){
                    largest = right;
            }
            if(largest != i){
                    swapPtr(a, i, largest);
                    max_heapify_ptr(a, largest, len);
            }
    }
    
    /* Building heap from given elements */
    void build_max_heap_ptr(heapNode *a[], int len){
            int i = len/2 +1;
            for(; i>=0; i--){
                    max_heapify_ptr(a,i, len);
            }
    }
    
    /* This function allocates node of heap */
    heapNode * create_node(int data){
            heapNode *node = (heapNode *)(malloc)(sizeof(heapNode));
            if(node){
                    node->data = data;
            }
            return node;
    
    }
    
    /* This function is real implementation of 
    the sliding window algorithm */
    void slide_window(int buffer[], int N, int K, int buffer_len){
    
        int i =0, j =0,s;
        heapNode *max_heap[K+1];
        int num = K;
    
        for(j=0 ; j + N < buffer_len; j++){
          /* Window starts at index 0 and is of size N */
           printf("\nCurrent window :");
           for(s =j; s<j+N; s++){
               printf("%d ", buffer[s]);
           }
           printf("\n");
           /* Put K element from N element window */
           for(i=0;i<K; i++){
           /* Since we wold be doing for every window, 
              avoiding reallocation of node */
               if(max_heap[i]){
                    max_heap[i]->data = buffer[i+j];
                }
                else{
                    max_heap[i] = create_node(buffer[i+j]);
                }
            }
            /* Build min heap with those entered elements */
             build_max_heap_ptr(max_heap,K-1);
    
            /*Now for all remaining N-K-1 elements in window, 
             check if they fit in max heap */ 
             for(i=K+j; i< N+j; i++){
                 heapNode * root = max_heap[0];
                 if(buffer[i] < root->data){
                       root->data = buffer[i];
                       max_heapify_ptr(max_heap, 0, K-1);
                  }
              }
              
              /*Print the current max heap, it will contain K smallest 
                element in current window */
               printf("K minimum elements in this window :");
               for(int x=0; x< K; x++){
               	printf("%d ", max_heap[x]->data);
               }
               
               
            }
    }
    /* Driver Program to execute above code */
    int main(){
       int buffer[10] = {1,4,5,6,3,2,4,8,9,6};
    
       int K= 4;
       int N =5;
       
       int size = sizeof(buffer)/ sizeof(buffer[0]);
       
       slide_window(buffer,N, K,size);
       return 0;
    }
    

    Following figures explain how window slides and how heap is updated.
    1. Window starts at index 0 and ends at N. We take K minimum elements among N elements and store in max heap. Array is given in below picture with window size of 9 and k = 4.
    First step is to create a max heap with first 4 elements of window.

    sliding window problem

    Next we are looking at 4, which is less than max in max heap. So we remove the max from heap and add the new element(4) to heap.

    k smallest element in sliding window

    Next is 2, which is less than max in max heap. So we remove the max from heap and add the new element(2) to heap.

    Next is 3, which is less than max in max heap. So we remove the max from heap and add the new element(3) to heap.

    Next we have 10 and 11 which are greater than root of max heap, so nothing happens.

    We come to end of window. Therefore, 4 smallest element in window are [ 1,2,3,4 ]

    Next window moves one step ahead, that’s where you discard the max heap and create the new empty one and repeat the process.

    We can actually avoid discarding the entire heap when window moves, however complexity of overall algorithm will remain the same. This problem is asked in a different way, which is to find maximum in sliding window.

    #include <iostream>
    #include<deque>
    using namespace std;
    
    void slidingWindow(int buffer[], int n, int w, int output[])
    {
       deque<int> Q;
       int i;
       /*Initilize deque Q for first window, put all W elements, however also
       removing elements which cannot be maximum in this window */
       for (i = 0; i < w; i++)
       {
       	   //This is where we are removing all less than elements
           while (!Q.empty() && buffer[i] >= buffer[Q.back()])
               Q.pop_back();
           // Pushing the index
           Q.push_back(i);
       }
      
       for (i = w; i < n; i++)
       {
           output[i-w] = buffer[Q.front()];
    
           //update Q for new window
           while (!Q.empty() && buffer[i] >= buffer[Q.back()])
               Q.pop_back();
    
           //Pop older element outside window from Q    
           while (!Q.empty() && Q.front() <= i-w)
               Q.pop_front();
          
           //Insert current element in Q
           Q.push_back(i);
       }
       output[n-w] = buffer[Q.front()];
    }
    
    int main(){
    	int a[]={3,5,4,2,-1,4,0,-3};
    	int n = sizeof(a)/sizeof(a[0]);
    	int output[n];
    
    	slidingWindow(a,n,4,output);
    	return 0;
    }
    

    Worst case complexity of sliding window algorithm would be O(n2k). K is included as it takes O(k) complexity to build heap of k elements.

    Please share if there is something wrong or missing.