Matrix chain multiplication-dynamic programming

Matrix chain multiplication dynamic programming

What is matrix chain multiplication in general? To read on that please refer to Wiki. However, today’s problem is not about actually multiplying chain of matrices, but to find out optimal way to multiply them in order to minimize number of scalar multiplications. There is something called as dimensions of matrix: rows  and columns.  To be able to multiply two matrices, it is required that number of columns in first matrix is equal to number of rows of second matrix. If we multiply a matrix of dimension M x N to another matrix N x P, we can a matrix of dimension M x P.

matrix chain multiplication

How many scalar multiplications needs to be done to multiply M x N to N x P matrix? It’s M x N x P.

Given N matrices with dimensions, find optimal way to multiply these matrices, in order to minimize total number of scalar multiplications.

Matrix chain multiplication : line of thoughts

Before going further, lets understand some basics of matrix multiplication

  1. Matrix multiplication is associative i.e.  A* (B*C) = (A*B) *C
  2. It is not commutative i.e  A * (B*C) not equal to A * (C * B)
  3. To multiply two matrices, they should be compatible i.e. no of columns in first matrix should be equal to number of rows of second matrix. No of columns of first matrix = No. of rows of second matrix

Since matrix multiplication is associative, this problem actually reduces to figure out a way to put parenthesis around matrices so that total number of scalar multiplications are least.
Let’s take an example and understand. We have matrices A, B, C and D with dimensions array as 10 x 20, 20 x 30, 30 x 40, 40 x 50 respectively.

How can we solve this problem manually?  Given chain of matrices is as ABCD. There are three ways to split the chain into two parts :  as (A) x (BCD) or as (AB) x (CD) or as (ABC) x (D).

Any which way, we have smaller problems to solve now. If we take the first split, cost of multiplication of ABCD is cost of multiplication A + cost of (BCD) + cost of multiplication of A x (BCD).

Similarly for rest two splits. Answer will be minimum of all three possible splits.

To get cost of ABCD,  solve cost of BCD.  (BCD) can be split in two parts : (B) x (CD) or (BC) x (D).

We will continue with (B) x (CD). For cost of (CD), splits in one way (C) x (D). Cost of which is nothing but M x N x P where C is matrix of dimension M x N and D is matrix of dimension N x P.

Cost of  (BC) = M x N x P. Below figure shows the detailed calculation of each possible splits and final gives the answer as minimum of all possible splits.

matrix chain multiplication brute force
Notice that for N matrices, there N-1 ways to split the chain. This manual calculation can easily be implemented as recursive solution.

Recursive implementation of matrix chain multiplication

package com.company;

/**
 * Created by sangar on 31.12.17.
 */
public class MCM {

    public static int matrixChainMultiplication(int[] P, int i, int j){
        int count = 0;
        int min = Integer.MAX_VALUE;

        System.out.println("("+ i + "," + j + ")");
        if(i==j) return 0; // No cost of multiplying zero matrix

        for(int k=i; k<j; k++){
            System.out.println("Parent : ("+ i + "," + j + ")");
            count = matrixChainMultiplication(P,i, k)
                    + matrixChainMultiplication(P, k+1, j)
                    +   P[i-1]*P[k]*P[j];

            min =  Integer.min(count, min);
        }

        return min;
    }

    public static void main(String[] args) {
        int arr[] = new int[] {1, 2, 3, 4, 3};
        int n = arr.length;

        System.out.println("Minimum number of multiplications is "+
                matriChainMultiplication(arr));
    }
}

While implementing, we will get the matrices as array P where P[i-1] and P[i] represent the dimension of matrix i.

Matrix chain multiplication : Dynamic programming approach

However, complexity will be exponential in time and hence of no use for every large inputs. Also, if you look at the calculations tree, there are many sub-problems which are solved again and again. This is called overlapping sub-problems. What can be done to avoid calculating sub-problems again and again? Save it somewhere.

matrix chain multiplication : overlapping subproblems

If we can save cost of multiplication of matrices  i to j, we can refer it back when needed. This technique is called as memorization in dynamic programming.

Cost of multiplying matrices Ai to Aj  is cost of

Cost (Ai, Aj) = Cost(Ai,Ak) + Cost(Ak+1,Aj )+(P[i-1] * P[k] * P[j])

Idea is to find out K such that cost(Ai, Aj) becomes minimum. If M[i,j] represents the cost to multiply matrix i to matrix j, then,

M[i,j]  = M[i,k] + M[K+1,j] + ((P[i-1] * P[k] * P[j])

When calculating M[i,j]; M[i,k] and M[k+1,j] should be already available, this is called bottom up filling of matrix. Also, M[i,i] = 0 as cost of multiplying a single matrix will be 0.

Since, we are taking bottom up approach to fill our solution matrix, start calculating by grouping two matrices at a time, then 3 and then 4 till we reach N matrices chain. We start with length L= 2 and go on to solve the table entry for length N. M[1, N] will give us the final cost.

To find minimum cost(i,j), we need to find a K such that expression

Cost (Ai, Aj) = Cost(Ai,Ak) + Cost(Ak+1,Aj )+(P[i-1] * P[k] * P[j])

becomes minimum, hence

M[i,j] = min (M[i,j], (M[i,k] + M[k+1,j], P[i-1] * P[k] * P[j]))

Implementation for multiplication of matrix problem

package com.company;

/**
 * Created by sangar on 31.12.17.
 */
public class MCM {

    public  static  int matriChainMultiplicationDP(int[] P){
        int n = P.length;

        int[][] M = new int[n][n];

        for(int i=0; i<n; i++){
            for(int j=0; j<n; j++){
                M[i][j] = 0;
            }
        }

        for(int L=2; L<n; L++){
            /* For every position i, we check every chain of len L */
            for(int i=1; i<n-L+1; i++){
                int j = i+L-1;
                M[i][j] = Integer.MAX_VALUE;

                /* For matrix i to j, check every split K */
                for(int k=i; k<j; k++){
                    int temp = M[i][k] + M[k+1][j] + P[i-1] * P[k] * P[j];
                    /* Check if the current count is less than minimum */
                    M[i][j] = Integer.min(temp, M[i][j]);
                }
            }
        }

        return M[1][n-1];
    }
    public static void main(String[] args) {
        int arr[] = new int[] {1, 2, 3, 4, 3};
        int n = arr.length;

        System.out.println("Minimum number of multiplications is "+
                matriChainMultiplicationDP(arr));
    }
}

C implementation of matrix chain multiplication

#include<stdlib.h>
#include<stdio.h>

#define MAX_INT 10000000

int matrixMultiplication(int p[], int N){

  int L,i, j, temp;

  int M[N][N];

  for(i=0; i<N; i++){
      for(j=0; j<N; j++){
           M[i][j] = 0;
      }
  }       

  for(L=2; L<N; L++){
    /* For every position i, we check every chain of len L */
    for(i=1; i<N-L+1; i++){
        j = i+L-1;
        M[i][j] = MAX_INT;
        /* For matrix i to j, check every split K */
            for(int k=i; k<j; k++){
                temp = M[i][k] + M[k+1][j] + p[i-1] * p[k] * p[j];
               /* Check if the current count is less than minimum */
                if(temp < M[i][j]){
                    M[i][j] = temp;                 
                }
            }
        }
    }
    return M[1][N-1];
}

/* Driver program to run above code */
int main(){

    int p [] ={10, 20, 30, 40, 30};
    int n = sizeof(p)/sizeof(p[0]);
    
    printf("%d\n", matrixMultiplication(p,n));
    
    return 0;
}

Let’s run through an example and understand how does this code work?

P = {10, 20,30,40,30}, 
dimensions of matrix [1] = 10X20, 
dimensions of matrix [2] = 20X30, 
dimensions of matrix [3] = 30X40,
dimensions of matrix [4] = 40X30

Function will be called with array P and size which is 5.
Table of N size is declared. M[6][6] because we will need to store M[1][4].

Diagonal of the table is set to zero, as cost of multiplication of single matrix is 0.

M[0][0] = M[1][1] =M[2][2]=M[3][3]=M[4][4]= M[5][5]=M[6][6] =0

Start with for loop with L =2. Inner for loop will run from i =1  to N-L+1 = 5-2+1 =4, j will be i +L -1. which is 2.
Inner most loop runs with K=1 (i) till k = 2 (j). M[1,2] is INT_MAX.

temp = M[1,1] + M[2,2] + P[0] *P[1]*P[2] = 6000.

Since it is less than INT_MAX M[1,2] = 6000.
Similarly, for i =2, j = 2 + 2 -1 =3. Above process is followed and M[2,3] = 24000 and so on.

M[3,4] = P[2] * P[3] * P[4] = 36000.

Coming to L =3. i =1, j = 4-1 =3. K =1 to K <3.
For K =1,

temp =  M[1,1] + M[2,3] + P[0] * [1] * P[3] = 24000 +8000 = 32000

For K=2,

temp = M[1,2] + M[3,3] + P[0]*[2]*P[3] = 6000 + 12000 = 18000

Hence M[1,3] = min(32000, and 18000) = 18000 with i = 1.

With i =2. j =2+3-1 =4. K =2 to K <4
For K =2,

temp = M[2,2] + M[3,4] + P[1] * P[2]*P[4] =  36000 + 18000 = 54000.

For K =3,

temp = M[2,3] + M[4,4] + P[1] * P[3]*P[4] =  32000 + 24000 =  66000.

Hence M[1,3] remains 18000.
Same process is followed and the entries are filled. Finally, matrix will look like this :

0 6000 18000 30000 
0 0 24000 48000 
0 0 0 36000 
0 0 0 0

M[1,N-1] will be solution to our problem.

Time complexity of matrix chain multiplication using dynamic programming is O(n2). Also space complexity is O(n2).
Reference
http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/Dynamic/chainMatrixMult.htm

If you want to contribute to website, please contact us. Please share if you there is something wrong or missing.