Range sum query- Immutable array

Range sum query- Immutable array

Write a service which given an integer array, returns the sum of the elements between indices i and j (i ≤ j), inclusive. Example: nums = [-2, 0, 3, -5, 2, -1]
sumRange(0, 2) -> 1
sumRange(2, 5) -> -1
sumRange(0, 5) -> -3

Also, the input set does not change during the calls to the sumRange(i,j).

The brute force solution is to calculate the sum of all the elements A[i] to A[j] whenever a sumRange(i,j) is called. This method has time complexity of O(n). It is OK to have this solution for small scale but as the number of queries goes up, processing of all the numbers from i to j would be inefficient. Also, imagine a case where the array itself is very large, then O(n) complexity for each query will lead to choking of your service.

Range sum query- Immutable array : thoughts

There are two hints for optimization is in the question, first, the array is immutable, it does not change. Second, we have to build a service, that means we have a server with us. These two things allow us to pre-compute and store results even before the query is made.

Now, the question is what do we pre-compute and how do we store it? We can precompute the sum of all the elements between each i and j and store them in a two-dimensional array. range[i][j] stores the sum of all the elements between index i and j. It will use O(n2) additional memory, however, the response time for each sumRange query will be constant. Also, the preprocessing step is O(n2)

Can we optimize for space as well? If I know the sum of all the elements from index 0 to index i and sum of all the elements from 0 to j, can I find the sum of all the elements between i and j? Of course, we can do it.

 Sum(i,j) = Sum(0,j) - Sum(0,i) + A[i]. 

Below diagram explains it.
range sum query array

However, integer array is not passed in the query request, we cannot use it while calculating the sum. Instead, we will use formula like: Sum(i,j) = Sum(0,j) – Sum(0,i-1), which is equivalent to the above.

We will pre-calculate the sum of all the elements between index 0 and j for all j>=0 and jImplementation

class NumArray {

    int[] rangeSum;
    public NumArray(int[] nums) {
        rangeSum = new int[nums.length];
            rangeSum[0] = nums[0]; 
            for(int i=1; i<nums.length; i++){
                rangeSum[i] = rangeSum[i-1] + nums[i];
    public int sumRange(int i, int j) {
        if(i==0) return rangeSum[j];
        return rangeSum[j] - rangeSum[i-1];

Now, the preprocessing step is O(n). N additional space is used. At the same time query response time is O(1).

Please share if there is anything wrong or missing in the post. If you are preparing for an interview and needs coaching to prepare for it, please book a free demo session with us.

Find combinations which add up to a number

Combination sum problem

Given an array of integers (candidates) (without duplicates) and a target number (target), find all unique combinations in candidates where the candidate numbers sums to target.

Also, same candidate can occur in the combination as multiple times.

For example, Input: candidates = [4,3,5,9], target = 9, a solution set is:[ [9], [3,3,3], [4,5]]

How can do we go about it? What happens if I take the coin 4 in the current example? Then all need to find in the candidates array if there is a combination adds up to 9-4 = 5. Seems like a recursion. For recursion, we need a termination condition. In this case, if I have on candidates to add and target is greater than zero, then whatever combination I have till now has no value, so I terminate the recursion in this case.

Second what if I have already found a combination which adds up to target? Then I will put that combination in the list of combinations and return.

What happens in recursive implementation? Well, we go through each coin, add that to current combination and see if leads to the target? If it does, it will be added to the result list along with the list of other candidates. If not, we just remove the current coin (backtrack) from the current combination and try the next coin.

This approach is called exhaustive search and backtracking paradigm of problem-solving where you search the entire input set to see to find the answer. However, in this case, we can prune the search path as soon as we know that the current set of candidates add up more than the target.

Combination sum : implementation

class Solution {
    public List<List<Integer>> combinationSum(int[] candidates,
                                              int target) {
        /* The result list contains all the combination 
           which add up to target.
        List<List<Integer>> result = new ArrayList<List<Integer>> ();
        //We start with the first coin and search exhaustively.
                           new ArrayList<Integer>(),
        return result;
    public void combinationSumUtil(int[] candidates, 
                                  int target,
                                  List<List<Integer>> result,
                                  List<Integer> current, 
                                  int index){
           First termination condition: if there are no coins left
           and required target is more than zero.
        if(target > 0 && index == candidates.length){

           Second termination condition: if target is zero,
           we can add the current combination to the result
        if(target == 0 && index < candidates.length){
            result.add(new ArrayList<>(current));
           Start from the current index, and go through
           all the coins.
        for(int i=index; i<candidates.length; i++){
               This is where we prune the branches 
               of our exhaustive search
            if(target - candidates[i] >=0){
                current.add(candidates[i]); // add to the list
                                   result, current, i);
                /* Remove the candidate from the list and 
                   check other combinations.
                if(current.size() > 0)

The time complexity is C(n,1) + C(n,2) + … + C(n,n) = 2^n – C(n,0) = O(2n).

The beauty of this solution is that it works with negative candidates as well, where the Dynamic solution for it may not work.

Maximum area rectangle in a histogram

A histogram is a diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval. Below is an example of a histogram.

maximum area rectangle in histogram

Given a histogram, whose class interval is 1, find maximum area rectangle in it. Let me explain the problem in more details.

In the histogram above, there are at least 6 rectangles with areas 2, 1,5,6,2, and 3. Are there more rectangles? Yes, we can make more rectangles by combining some of these rectangles. A few are shown below.

Apparently, the largest area rectangle in the histogram in the example is 2 x 5 = 10 rectangle. The task is to find a rectangle with maximum area in a given histogram. The histogram will be given as an array of the height of each block, in the example, input will be [2,1,5,6,2,3].

Maximum area rectangle: thoughts

First insight after looking at the rectangles above is: block can be part of a rectangle with a height less than or equal to its height. For each block of height h[i], check what all blocks on the left can be part of a rectangle with this block. All the blocks on the left side with a height greater than the current block height can be part of such a rectangle.
Similarly, all the blocks on the right side with a height greater than the current block height can be part of such a rectangle.
Idea is to calculate leftLimit and rightLimit and find the area (rightLimit - leftLimit) * h[i].
Check if this area is greater than previously known area, then update the maximum area else, continue to the next block.

class Solution {
    public int largestRectangleArea(int[] heights) {
        if(heights.length == 0) return 0;
        int maxArea = Integer.MIN_VALUE;

        for(int i=0; i<heights.length; i++){
            //Find the left limit for current block
            int leftLimit = findLeftLimit(heights, i);

            //Find the right limit for current block
            int rightLimit = findRightLimit(heights, i);

            int currentArea = (rightLimit - leftLimit-1) * heights[i];
            maxArea = Integer.max(maxArea, currentArea);

        return maxArea;

    private int findLeftLimit(int [] heights, int index){
        int j = index-1;
        while (j >= 0 && heights[j] >= heights[index]) j--;

        return j;

    private int findRightLimit(int [] heights, int index){
        int j = index+1;
        while (j < heights.length && heights[j] >= heights[index])

        return j;

The time complexity of the implementation is O(n2); we will left and right of each block which will take n operations, we do it for n blocks and hence the complexity is quadratic. Can we optimize the time complexity?

If heights[j] >= heights[i] and leftLimit of index j is already known, can we safely say that it will also be the leftLimit of index i as well?
Can we say the same thing for rightLimit well? Answers to all the questions are yes. If we store the left and right limit for all indices already seen, we can avoid re-calculating them.

class Solution {
    public int largestRectangleArea(int[] heights) {
        if(heights.length == 0) return 0;

        int maxArea = Integer.MIN_VALUE;

        //Finds left limit for each index, complexity O(n)
        int [] leftLimit = getLeftLimits(heights);
        //Find right limit for each index, complexity O(n)
        int [] rightLimit = getRightLimits(heights);

        for(int i=0; i<heights.length; i++){
            int currentArea = 
                (rightLimit[i] - leftLimit[i] -1) * heights[i];
            maxArea = Integer.max(maxArea, currentArea);

        return maxArea;

    private int[] getLeftLimits(int [] heights){

        int [] leftLimit = new int[heights.length];
        leftLimit[heights.length-1] = -1;

        for(int i=0; i<heights.length; i++) {
            int j = i - 1;
            while (j >= 0 && heights[j] >= heights[i]) {
                j = leftLimit[j];
            leftLimit[i] = j;
        return leftLimit;

    private int[] getRightLimits (int [] heights){

        int [] rightLimit = new int[heights.length];
        rightLimit[heights.length-1] = heights.length;

        for(int i=heights.length-2; i>=0; i--){
            int j = i+1;
                      && heights[j] > heights[i]){
                j = rightLimit[j];
            rightLimit[i] = j;
        return rightLimit;

The array leftLimitcontains at index i the closest index j to the left of i such that height[j] < height[i]. You can think about each value of the array as a pointer (or an arrow) pointing to such j for every i. How to calculate leftLimit[i]? Just point the arrow one to the left and if necessary just follow the arrows from there, until you get to proper j. The key idea here to see why this algorithm runs in O(n) is to observe that each arrow is followed at most once.

Largest area rectangle: stack-based solution

There is a classic method to solve this problem using the stack as well. Let’s see if we can build a stack-based solution using the information we already have. Let’s we do not calculate the area of the rectangle which includes the bar when we are processing it. When should we process it? Where should this bar be put on? If we want to create a rectangle with a height of this bar, we should find the left and right boundaries of such a rectangle. We should put this bar on a stack.
Now when you are processing bar j if height[j] is less than the bar on the top of the stack, we pop out the bar at the top. Why? Because this is the first bar on the right which has a height less than the height of the bar at top of the stack. This means if we want to make a rectangle with a height of the bar at the top of the stack, this index means the right boundary. This also gives away that all the blocks on the stack are in increasing order, as we never put a block which has a height less than the height of block at the top on to the stack. It means the next bar on the stack is the first bar which has a height lower than the bar at the top. To calculate the area of the rectangle with height as h[top], we need to take width as current index j - stack.peek() - 1

So the idea is that:

  1. For each bar, take its height as the rectangle’s height. Then find the left and right boundaries of this rectangle.
  2. The second top bar in the stack is always the first bar lower than the top bar on the stack on the left.
  3. The bar that j points to is always the first bar lower than the top bar in the stack on the right.
  4. After step 2 and 3, we know the left and right boundaries, then know the width, then know the area.
private int maxAreaUsingStack(int[] heights){

        Stack<Integer> s = new Stack<>();

        int maxArea = 0;
        for(int i=0; i<=heights.length; i++){
            //Handling the last case
            int h = i == heights.length ? 0 : heights[i];
            while(!s.empty() && h < heights[s.peek()]){
                int top = s.pop();
                int leftLimit = s.isEmpty() ? -1 : s.peek();
                int width = i-leftLimit-1;

                int area = width * heights[top];
                maxArea = Integer.max(area, maxArea);
        return maxArea;
The time complexity of the code is O(n) with an additional space complexity of O(n) If you are preparing for a technical interview in companies like Amazon, Facebook, etc and want help with preparation, please register for a coaching session with us.

System design basics – II

System design basics – Cache

In last post, we discussed the design fundamentals for database, in this post let’s talk about caching.

What is caching? Caching is a mechanism to store and fetch data faster by applications without referring to the original source of data. Usually, applications store data which they think will be used again, instead of doing a round trip to the original source, applications use the stored data to serve requests. These are mostly software caches implemented using main memory.


A cache is a storage which stores a subset of the entire data based on current usage and probability of its future usage for faster execution of requests. This storage is typically in main memory which has must faster read time compared to disk.

In typical microprocessors, hardware cache exists which between CPU and main memory, these are very high-speed miniature memories which store instructions and data, which are processed by CPU, referred to as L1, L2 caches.

In distributed web systems, caching can be done at various places and levels which we will discuss in a while.

Caching eviction strategy

As mentioned above, cache stores a subset of the original data. Question is which subset and how does it work when data which is not in the cache is requested? There are different approaches to select which data stays in the cache, these are called cache eviction strategies.

Before understanding more on eviction strategies, a cache miss is an event where the application did not find data it requested in the cache. A cache hit is when the application did find what requested in the cache.

  1. First in first out : In the event of a cache miss, the entity which came first into the cache is evicted first. Every entity has a timestamp associated with it and the oldest timestamp entity is removed first.
  2. Least recently used: In the case of a cache miss, a page which was used least recently gets evicted from the cache.
  3. Least frequently used: In case of a cache miss, the entity which is least frequently used among all the entities in the cache is thrown out of the cache.

There are other eviction strategies like minimum used eviction or heuristic-based eviction.

Cache hit ration is dependent on many parameters, first, the size of cache key-space, the more unique cache keys your application generates, the less chance you have to reuse any one of them. Always consider ways to reduce the number of possible cache keys. second, the number of items you can store in the cache, the more objects you can physically fit into your cache, the better your cache hit ratio. Third, longevity, how long each object can be stored in cache before expiring or being invalidated.

Caching levels

Caches can be applied and leveraged throughout various layers of technology including Operating Systems, Networking layers including Content Delivery Networks (CDN) and DNS, web applications, and Databases.
In a web application, most of the time, user requests need the same data to fulfill those request. If each request starts hitting your database, your servers will be overloaded and response time will slow. To avoid unnecessary load on servers and to decrease the response time, we place caches in between our databases and application. These caches can be on the same servers as the database, completely separate servers or at the application servers. Based on the metric and function you want to optimize, use appropriate caching level.

– Client caching
Caching which is done at the client-side like operating system and browser of the user. Typical examples are Address Resolution Protocol and static assets like HTML and CSS. Remember, you did nothing in this case, everything is done by browser and not your system.

– CDN caching
Imagine that you want to serve your static content like Javascript files, HTML templates, CSS without going to web servers. Anyways, web servers do not anything with static content than just passing it along. This is where Content Distribution Networks come in the picture. One can create CDNs near geographic locations of users and server static content to users from the nearest CDN, which makes website or app load faster.

– Server caching
We could have a cache directly on the servers. Each time a request is made to the service, the server will quickly return local, cached data if it exists. If it is not in the cache, the requesting node will query the data by going to network storage such as a database.
How this solution will scale as we may grow to many nodes? If we decide to expand to multiple nodes, it’s still quite possible to have each node host its own cache. However, if your load balancer randomly distributes requests across the nodes, the same request will go to different nodes, thus increasing cache misses.

Another design pattern to handle the cache miss problems is to have a common cache for the entire system which all server write and read from. This pattern scales better and even if the requests in the same session go to multiple servers, the user gets the consistent experience without latency. Trouble is that now you cache layer has become a bottleneck.

Caching approach

– Cache-aside
In this caching approach, we write directly on to the DB. The cache reads the info from DB in case of a miss and then stores it till the eviction policy moves it out of cache. This approach can lead to higher read latency in case of applications which write and re-read the information quickly.

– Write-through
Where writes go through the cache and write is confirmed as success only if writes to database and the cache succeed. There is data consistency between cache and database. If your cache crashes due to power failures or other disruptions and restarts, nothing will be lost. However, write latency will be higher in this case as there are writes to two separate systems.

– Write-behind (write-back)
In this approach we write on cache and it is confirmed as soon as it is done on cache without writing it on to database. This write is asynchronously synced to database. It results in quick write latency and high write throughput for the write-intensive applications. However, you have a risk of loss of data incase the caching layer dies because the only single copy of the written data is in the cache. We can improve this by having more than one replica acknowledging the write in the cache.

Advantages of using cache in your system design

1. Improve Application Performance
Because memory is orders of magnitude faster than disk (magnetic or SSD), reading data from the in-memory cache is extremely fast (sub-millisecond). This significantly faster data access improves the overall performance of the application.

2.Reduce Database Cost
A single cache instance can provide hundreds of thousands of IOPS (Input/output operations per second), potentially replacing a number of database instances, thus driving the total cost down. This is especially significant if the primary database charges per throughput.

3. Reduce the Load on the Backend
By redirecting significant parts of the read load from the backend database to the in-memory layer, caching can reduce the load on your database, and protect it from slower performance under load, or even from crashing at times of spikes.

4. Eliminate Database Hotspots
In many applications, it is likely that a small subset of data, such as a celebrity profile or popular product, will be accessed more frequently than the rest. This can result in hot spots in your database and may require overprovisioning of database resources based on the throughput requirements for the most frequently used data. Storing common keys in an in-memory cache mitigates the need to overprovision while providing fast and predictable performance for the most commonly accessed data.

5.Increase Read Throughput (IOPS)
In addition to lower latency, in-memory systems also offer much higher request rates (IOPS) relative to a comparable disk-based database. A single instance used as a distributed side-cache can serve hundreds of thousands of requests per second.

System design basics – I

System design basics

System design round is one of the rounds in any technical interview. The idea of this round is to know your design skills, analytical and trade-off skills, to see if you can take a fuzzy problem, break it down it in small executable chunks and actually solve the problem. The beauty is that there are no correct or incorrect answers, it is more of a discussion.

To have a fruitful discussion, one must know the basics and should be able to prove his/her claims with facts. This article discusses what are all those basic technical concepts you should be aware of when you go into a system design round.

System Design concepts: Databases

Databases are essentially required in any application/system unless it is a non-authenticated, static, read-only and not frequently changing content only application. It is therefore important to know a few concepts around databases.
First of all, understand that these concepts are independent of what actual DB used, it can be open source like MySQL or proprietary like MS-SQL Server.

Imagine a simple application like which is set up like the figure below. What are the problems with setup?

system design interview

There are three problems to start with among many others: First, what if the only DB server goes down or we have to take it down for maintenance? It impacts the system’s availability. Second, on the other instance, what if the server crashes and become unrecoverable, all application data in it is also gone, this affects the application’s data availability and persistence. Third, what if there are many reads and writes request coming to the server. The only server will degrade in performance, simple read requests will take a lot of time waiting for complex writes to finish, impacting the performance of the overall system.

What can we do? Let’s set up a new server, which is nothing but a replica of the original server and store everything which is on the original server. This process is called replication, the original server is called master and the copy server is called slave.

System design DB replication

How does it solve the problems: First, having a replicated server gives us the power to immediately switch to a new server in case the master goes down. The system will keep running as if nothing happened and nobody except your database administrators needs to know anything.
Second, you have the exact copy of the data, so even you lose data on one server, you can always extract that from the other.
Third, since the servers contain the exact same data, you can read from one server and write on another. This improves the performance of your read requests (which are usually more in any system) and is not blocked on write.

So in essence, replicating your DB servers gives you system availability, protection against data loss and performance gain .

Usually, in production systems, there are two masters, (one stand by) and many slaves attached to those masters. Also, you want to keep a server with replication delayed by a few hours (24 hours ideally). Just in case if data on master get corrupted ( developer run a wrong query, file system issues) and it gets replicated immediately to all the servers, all of the data is now corrupted. This delayed replicated server can give you sane data although you lose data of a day.

System design replication of database

Read more about replication in MySQL

Where is the catch, where is the tradeoff here? We improved availability and performance, however, we have risked the consistency. Imagine a case, where data is written on the master. Due to some lag, data replication to slave did not happen quickly. If another service reads from the slave the same data, either it does not get the data or gets an obsolete piece of data.

Replication improves availability and performance, however, risks consistency.

A database is essentially a collection of tables which are nothing but rows and columns. The number of rows and columns in tables vary. Let’s you have a very popular application and half of the earth’s population is registered to your application. In this case, your user table has too many rows. On the other hand, you have 10K employee and you want to store everything about those employees in one table called employee, here you have a table which has too many columns.

Before understanding partitioning, let’s understand why too many rows or columns in the table are problems for a system?
With too many rows, index size will be large and search performance will degrade, this will impact the overall performance of the service for a specific user. Let’s say a few thousands of users are accessing your service which queries this table, one user query starts degrading the other and your system goes into this spiral of degrading performance. If this big data gets corrupted, it impacts the availability of service to all the users.

With too many columns, you will be pulling a lot of data in every query even when you need just a few columns out of those. This will indirectly impact the performance and put a lot of unnecessary data on the network.

What is the solution to too many rows or columns? The answer is partitioning. There are two types of partitioning:
Horizontal partitioning
In horizontal partitioning, a table is divided into multiple partitions, each partition stores rows of the table. These partitions can be stored in multiple database clusters. One of the examples is that all the users from Germany, Netherlands, and France in a table called EuropeUser whereas all the users in USA and Canada are stored in a table called NorthAmericaUser and so on for Asia, Africa, etc. If you want to know all the users of your service, all you need to unite all these tables.
Gain is that now you have to search less number of rows to find a European user compared to earlier, which improves query performance and hence improves the response time of service. If table EuropeUser is corrupted or deleted, your service is still available in North America and Asia.

system design horizontal partitioning

Vertical partitioning
In vertical partitioning, we divide the table into multiple partitions and each partition contains certain columns. For example, an employee table with id, name, email, picture, salary, organization can be partitioned into three one with columns: id, name, and email; second: id, picture; Third: id, org, and salary.

vertical partitioning

There are multiple ways you can partition a table:

Range partitioning – selects a partition by determining if the partitioning key is inside a certain range. It distributes tuples based on the value intervals (ranges) of some attribute.

List partitioning – a partition is assigned a list of values. If the partitioning key has one of these values, the partition is chosen. For example, all rows where the column Country is either Iceland, Norway, Sweden, Finland or Denmark could build a partition for the Nordic countries.

Composite partitioning – allows for certain combinations of the above partitioning schemes, for example first applying a range partitioning and then a hash partitioning.
Consistent hashing could be considered a composite of hash and list partitioning where the hash reduces the key space to a size that can be listed.

Round-robin partitioning – the simplest strategy, it ensures uniform data distribution. With n partitions, the ith tuple in insertion order is assigned to partition (i mod n). This strategy enables the sequential access to a relation to be done in parallel.

Hash partitioning – applies a hash function to some attribute that yields the partition number. This strategy allows exact-match queries on the selection attribute to be processed by exactly one node and all other queries to be processed by all the nodes in parallel.

Partitioning improves your database availability and performance.

We already know that horizontal partitioning splits a big table into multiple partitions, however, these partitions are stored in the same schema and on the same server. Sharding takes it one step beyond, where partitions can be stored on multiple servers and in different schemas. This division of data and its storage on multiple servers gives flexibility to store more data which does not fit on single servers and fast search responses on the data.

Algorithmic sharding
In algorithmic sharding, the client can determine a given partition’s database without any help. Usually, there is a partition key, we can derive cluster id using that partition key. A simple hash function Hash(key) can be used a sharding function. Example of such sharding implementation is Memcached.

Dynamic sharding
In dynamic sharding, an external locator service determines the location of data.To read and write data, clients need to consult the locator service first.

Sharding a database table before it has been optimized locally causes premature complexity. Sharding should be used only when all other options for optimization are inadequate.


This is true because adding sharding actual may reduce performance if your data is not uniformly distributed by creating hotspots. Also, it adds lot of operational overheads, adding a new node in the system may require re-sharding and moving data between nodes.

Intersection of two arrays

Intersection of two arrays

Given two unsorted arrays of integers, find intersection of these two arrays. Intersection means common elements in the given two arrays. For example, A = [1,4,3,2,5,6] B = [3,2,1,5,6,7,8,10] intersection of A and B is [ 1,3,2,5,6 ].

Sort array and then use binary search
As given arrays are unsorted, sort one of the arrays, preferably the larger one. Then search each element of another array in the sorted array using binary search. If the element is present, put it into the intersection array.

class Solution {
    public int[] intersection(int[] nums1, int[] nums2) {
        int len1 = nums1.length;
        int len2 = nums2.length;
        Set<Integer> result = new HashSet<>();
        for(int i=0; i<len2; i++){
            if(binarySearch(nums1, nums2[i]) != -1){
        int i = 0;
        int[] resultArray = new int[result.size()];
        for(Integer num : result){
            resultArray[i++] = num ;
        return resultArray;
    private int binarySearch(int[] a, int key) {
        for(int i=0; i<a.length; i++){
            if(a[i] == key) return i;
        return -1;

The time complexity of binary search method to find intersection is O(nlogn) for sorting and then O(mlogn) for searching. Effective time complexity becomes O((n+m)logn) which is not optimal.

Sort and use merge to find common elements
Again in this method, sort two arrays first. Then use two pointers to scan both arrays simultaneously. (Please refer to merge part of merge sort ). The difference is we will put only common elements, instead of all.

The time complexity of merge sort method is O(nlogn) + O(mlogm) for sorting and then O(m+n) for scanning both arrays. It is worst than the binary search method.

Use hash
Create a hash with key as elements from the smaller array (saves space). Then scan through other array and see if the element is present in hash. If yes, put into intersection array else do not.

package AlgorithmsAndMe;


import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IntersectionTwoArrays {

    public List<Integer> findIntersecton(int[] a, int[] b) {
        List<Integer> result = new ArrayList<>();
        Map<Integer, Boolean> existingElements = new HashMap<>();

        for (int i = 0; i < a.length; i++) {
            existingElements.put(a[i], true);

        for (int i = 0; i < b.length; i++) {
            if (existingElements.containsKey(b[i])) {
        return result;

Test case

package Test;

import AlgorithmsAndMe.DuplicatesInArray;
import AlgorithmsAndMe.IntersectionTwoArrays;

import java.util.List;
import java.util.Set;

public class IntersectonTwoArraysTest {

    IntersectionTwoArrays intersectionTwoArrays
             = new IntersectionTwoArrays();

    public void testIntersectionTwoArrays() {
        int [] a = {1,6,3};
        int [] b = {1,2,3};
        List<Integer> result = intersectionTwoArrays.findIntersecton(a,b);

        result.forEach(s -> System.out.println(s));

This method has the complexity of O(n) where n is the number of elements in the larger array and extra space complexity of O(m) where m is the number of elements in the smaller array.

These methods to find the intersection of two arrays do not work when there are duplicate elements in any of the array as they will be part of intersection array only once.

Please share if there is something wrong or missing. we would love to hear from you.

Minimizing maximum lateness

Minimizing maximum lateness : Greedy algorithm

Since we have chosen the greed, let continue with it for one more post at least. Today’s problem is to minimize maximum lateness of a task. Let me clarify the problem: given a processor which processes one process at a time and as always given a list of processes to be scheduled on that processor, with the intention that maximum late process should be minimized. Contrary to previous problems, this time, we are not provided with start time and end time, but we are given length of time ti process will run and deadline it has to meet di, fi is actual finish time of process completion.

Lateness of a process is defined as
li = max{0, fi − di}, i.e. the length of time past its deadline that it finishes.
Goal here to schedule all tasks to minimize maximum lateness L = max li For example:

minimize maximum lateness

Minimizing maximum lateness : algorithm

Let’s decide our optimization strategy. There is some order in which jobs can be decided: shortest job first, earliest deadline first, least slack time first.

Let’s see if any of the above strategies work for the optimal solution. For shortest processing time first, consider example P1 = (1,100) P2 = (10, 10). If we schedule the shortest job first as in order (P1, P2), lateness will be 91, but if we take them as (P2, P1), lateness will be 0. So, clearly taking the shortest process first does not give us an optimal solution.

Check for the smallest slack time approach. See if you can come up with some counterexample that it does not work.

That leaves us with only one option, take the process which has the most pressing deadline, that is the one with the smallest deadline and yet not scheduled. If you have noticed, the example given for the problem statement is solved using this method. So, we know it works.

  1. Sort all job in ascending order of deadlines
  2. Start with time t = 0
  3. For each job in the list
    1. Schedule the job at time t
    2. Finish time = t + processing time of job
    3. t = finish time
  4. Return (start time, finish time) for each job

Minimizing maximum lateness : implementation

from operator import itemgetter

jobs = [(1, 3, 6), (2, 2, 9), (3, 1, 8), (4, 4, 9), 
        (5, 3, 14), (6, 2, 15)] 

def get_minimum_lateness():
	schedule =[];
	max_lateness = 0
	t = 0;
	sorted_jobs = sorted(jobs,key=itemgetter(2))
	for job in sorted_jobs:
		job_start_time = t
		job_finish_time = t + job[1]

		t = job_finish_time
		if(job_finish_time > job[2]):
			max_lateness =  max (max_lateness, (job_finish_time - job[2]))
		schedule.append((job_start_time, job_finish_time))

	return max_lateness, schedule

max_lateness, sc = get_minimum_lateness();
print "Maximum lateness will be :" + str(max_lateness)
for t in sc:
	print t[0], t[1]

The complexity of implementation is dominated by sort function, which is O(nlogn), rest of processing takes O(n).

Please share your suggestions or if you find something is wrong in comments. We would love to hear what you have to say. If you find this post interesting, please feel free to share or like.

Coin change problem : Greedy algorithm

Coin change problem : Greedy algorithm

Today, we will learn a very common problem which can be solved using the greedy algorithm. If you are not very familiar with a greedy algorithm, here is the gist: At every step of the algorithm, you take the best available option and hope that everything turns optimal at the end which usually does. The problem at hand is coin change problem, which goes like given coins of denominations 1,5,10,25,100; find out a way to give a customer an amount with the fewest number of coins. For example, if I ask you to return me change for 30, there are more than two ways to do so like

Amount: 30
Solutions : 3 X 10  ( 3 coins ) 
            6 X 5   ( 6 coins ) 
            1 X 25 + 5 X 1 ( 6 coins )
            1 X 25 + 1 X 5 ( 2 coins )

The last solution is the optimal one as it gives us a change of amount only with 2 coins, where as all other solutions provide it in more than two coins.

Solution for coin change problem using greedy algorithm is very intuitive and called as cashier’s algorithm. Basic principle is : At every iteration in search of a coin, take the largest coin which can fit into remaining amount we need change for at the instance. At the end you will have optimal solution.

Coin change problem : Algorithm

1. Sort n denomination coins in increasing order of value.
2. Initialize set of coins as empty. S = {}
3. While amount is not zero:
3.1 Ck is largest coin such that amount > Ck
3.1.1 If there is no such coin return “no viable solution”
3.1.2 Else include the coin in the solution S.
3.1.3 Decrease the remaining amount = amount – Ck

Coin change problem : implementation

#include <stdio.h>
int coins[] = { 1,5,10,25,100 };
int findMaxCoin(int amount, int size){
	for(int i=0; i<size; i++){
	    if(amount < coins[i]) return i-1;
	return -1;

int findMinimumCoinsForAmount(int amount, int change[]){
	int numOfCoins = sizeof(coins)/sizeof(coins[0]);
	int count = 0;
	    int k = findMaxCoin(amount, numOfCoins);
	    if(k == -1)
                printf("No viable solution");
                amount-= coins[k];
		change[count++] = coins[k];
	return count;
int main(void) {
	int change[10]; // This needs to be dynamic
	int amount = 34;
	int count = findMinimumCoinsForAmount(amount, change);
	printf("\n Number of coins for change of %d : %d", amount, count);
	printf("\n Coins : ");
	for(int i=0; i<count; i++){
		printf("%d ", change[i]);
	return 0;

What will the time complexity of the implementation? First of all, we are sorting the array of coins of size n, hence complexity with O(nlogn). While loop, the worst case is O(amount). If all we have is the coin with 1-denomination. Overall complexity for coin change problem becomes O(n log n) + O(amount).

Will this algorithm work for all sort of denominations? The answer is no. It will not give any solution if there is no coin with denomination 1. So be careful while applying this algorithm.

Please share if you have any suggestion or if you want me to write on a specific topic. If you liked the post, share it!

Find duplicate numbers in array

Find all duplicate numbers in array

Given an array of positive integers in range 0 to N-1, find all duplicate numbers in the array. The array is not sorted. For example:
A = [2,4,3,2,1,5,4] Duplicate numbers are 2,4 whereas in A = [4,1,3,2,1,1,5,5] duplicate numbers are 1,5.

Brute force solution would be to keep track of every number which is already visited. The basic idea behind the solution is to keep track that whether we have visited the number before or not. Which data structure is good for quick lookups like this? Of course a map or hash.
The time complexity of this solution is O(n) but it has an additional space complexity of O(n).

To reduce space requirement, a bit array can be used, where ith index is set whenever we encounter number i in the given array. If the bit is set already, its a duplicate number. It takes O(n) extra space which is actually less than earlier O(n) as only bits are used. The time complexity remains O(n)

Find duplicate numbers in an array without additional space

Can we use the given array itself to keep track of the already visited numbers? How can we change a number in an array while also be able to get the original number back whenever needed? That is where reading the problem statement carefully comes. Since array contains only positive numbers, we can negate the number at the index equal to the number visited. If ever find a number at any index negative, that means we have seen that number earlier as well and hence should be a duplicate.

Idea is to make the number at ith index of array negative whenever we see number i in the array. If the number at ith index is already negative, it means we have already visited this number and it is duplicate. Limitation of this method is that it will not work for negative numbers.

Duplicate numbers implementation

package AlgorithmsAndMe;

import java.util.HashSet;
import java.util.Set;

public class DuplicatesInArray {

    public Set<Integer> getAllDuplicates(int[] a ) 
                              throws IllegalArgumentException {

        Set<Integer> result = new HashSet<>();

        if(a == null) return result;

        for(int i=0; i<a.length; i++) {
            //In case input is wrong
            if(Math.abs(a[i]) >= a.length ){
               throw new IllegalArgumentException();
            if (a[Math.abs(a[i])] < 0) {
            } else {
                a[Math.abs(a[i])] = -a[Math.abs(a[i])];
        return result;

Test cases

package Test;

import AlgorithmsAndMe.DuplicatesInArray;
import java.util.Set;

public class DuplicatesInArrayTest {

    DuplicatesInArray duplicatesInArray = new DuplicatesInArray();

    public void testDuplicatesInArray() {
        int [] a = { 1,2,3,4,2,5,4,3,3};
        Set<Integer> result = duplicatesInArray.getAllDuplicates(a);

        result.forEach(s -> System.out.println(s));

    public void testDuplicatesInArrayWithNullArray() {
        Set<Integer> result = duplicatesInArray.getAllDuplicates(null);

        result.forEach(s -> System.out.println(s));

    //This case should generate an exception as 3 is greater than the size.
    public void testDuplicatesInArrayWithNullArray() {
        int [] a = { 1,2,3};
             Set<Integer> result = duplicatesInArray.getAllDuplicates(a);
        } catch (IllegalArgumentException  e){
            System.out.println("invalid input provided");

The complexity of the algorithm to find duplicate elements in an array is O(n).

Linked list implementation in Linux kernel

Linked list implementation in Linux kernel

We learned a lot about linked and solve around 30 odd problems : Linked list problems. However, the actual implementation of a linked list in Linux kernel is very different than what we learned. Let us understand how a linked list is implemented in Linux kernel and used in kernel code.

In a simple linked list, nodes contain data and point to the next node in the linked list. In other words, its the list which contains nodes which are linked. A typical example of the structure of a node of this kind of a list is:

class Node {
  private int data;
  private Node next;

  public Node (int data, int next){ = data; = next;   


However, linked lists in the Linux kernel, it’s the other way around, that is linked list is contained inside the node. This means, that there is no next pointer inside the node and each node is effectively a head node like a circular linked list. Also, it is a doubly linked list. A lot of things in one sentence!!

Linked list implementation in Kernel

Let’s understand it in detail. As said above, linked list is contained inside the node, structure of node is like this:

struct node {
 int data;
 list_head list; // list is inside the node

Here list_head is what defined as :

struct list_head{
  struct list_head *next, *prev;

See it has two pointers, essentially, making any node which contains this structure, part of a doubly linked list. The most interesting part of this kind of definition of a node is that same node can be part of multiple lists without being reallocated for each list. For example, in traditionally linked lists, if we need two linked lists: one as odd numbers and other as prime numbers, we would have to define two linked lists, one with odd numbers and other with prime numbers. With implementation provided in the Linux kernel, we can attach the same node to two lists as shown below, where an odd number which is also prime is allocated only once.

struct numbers {
 int number;
 list_head odd_numbers; // contains all odd numbers
 list_head primer_numbers; // Contains all prime numbers

How to access a node in list in Linux Kernel

We understood the node structure, how can we access a given node of a linked list. It was simple to do in a normal linked list as the base address of node accessible. In list implemented in Linux kernel, we have a pointer to the list_head structure in the next node and not a pointer to next node itself, as shown below.

linked list representation in linux kernel

There is a beautiful trick in C, which is used here to access the base address of the node whose list_head pointer is given. Once the base address of a node is known, accessing the node becomes similar to a normal linked list. The trick is that given a pointer to list_head in the structure; to find the base pointer of structure, find the offset at which list_head is stored in list. Once, we know the offset, (how many bytes, it is far from base address), then just subtract that offset from the absolute address of the pointer (which is given) and we get the base address. Figure explains

node representation in linked list in linux kernel

Let’s take an example, we will use structure numbers as given above. To get offset of element number in that, code is:

(unsigned long)(&((struct numbers *)0)->number)

Now, that we have offset of number and absolute address of number, we can get the base address of struct numbers as:

((struct numbers *)((char *)(pos) - \
          (unsigned long)(&((numbers *)0)->number)))

ANSI C defines the offsetof() macro in which lets you compute the offset of field f in struct s as offsetof(struct s, f). If for some reason you have to code this sort of thing yourself, one possibility is

#define offsetof(type, f) ((size_t) \
 ((char *)&((type *)0)->f - (char *)(type *)0))

Above code is not portable and some compilers may have problems with it.

There are some MACROS which are defined in the linux kernel which are useful in dealing with linked lists. Below are some examples:

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr)-(unsigned long)(&amp;((type *)0)-&gt;member)))
<pre>#define LIST_HEAD(name) \
	struct list_head name = LIST_HEAD_INIT(name)

#define INIT_LIST_HEAD(ptr) do { \
	(ptr)-&gt;next = (ptr); (ptr)-&gt;prev = (ptr); \
} while (0)

Please refer to this file to understand various macros which are used in Linux kernel.

In next post, we will use these constructs and see how can we created a list, access it and delete a list.

Please share if there is something wrong or suggestion for improvements. Also, if you like the content, please share it.