Matrices-Unit27.mws

MATRICES AND MATRIX OPERATIONS: Unit 27

Dr. Wlodzislaw Kostecki

The Papua New Guinea University of Technology (PNGUT)

Department of Electrical and Communication Engineering

Lae, Morobe Province

Papua New Guinea

Copyright © 2000 by Wlodzislaw Kostecki

All rights reserved

-------------------------------------------------------------------

(27) Differentiation of matrices comprising functions

OBJECTIVES :

To state the condition necessary for a matrix containing functions of one variable to have a derivative with respect to this variable.

To introduce the concept of a differentiable matrix.

To provide a definition of matrix derivative in the form of a symbolic expression.

To provide an example of a derivative of a rectangular matrix whose elements are functions.

To provide an example of how to determine the largest interval about the origin in which the derivative of a matrix is defined.

To specify and illustrate properties of matrix differentiation.

> restart : with(linalg, equal, inverse, orthog, transpose) :

If some or all elements of a matrix [ A ] are functions of one variable t , it is convenient to denote the matrix as [ A ( t ) ] when considering differentiation of such a matrix.

When the elements of a matrix [ A ( t ) ] are functions of a real variable t and all are differentiable with respect to t in some common interval t[0] < t < t[1] , a derivative of [ A ( t ) ] with respect to t may be defined, and the matrix [ A ( t ) ] is said to be differentiable in that interval.

Consider a ( 2 × 3 ) matrix [ A ( t )] given as

> A(t) := matrix(2, 3, [a[11](t), a[12](t), a[13](t), a[21](t), a[22](t), a[23](t)]) : 'A(t)' = A(t) ;

A(t) = matrix([[a[11](t), a[12](t), a[13](t)], [a[2...

The derivative of [ A ( t ) ] with respect to t is defined to be the matrix of the same order with elements Diff(a[ij],t) , viz.

> `der(A)` := map(Diff, A(t), t) : Diff('A(t)', t) = matrix(`der(A)`) ;

Diff(A(t),t) = matrix([[Diff(a[11](t),t), Diff(a[12...

This operation can be displayed in "like-in-a-book" form, viz.

> Diff(A(t), t) = matrix(`der(A)`) ;

Diff(matrix([[a[11](t), a[12](t), a[13](t)], [a[21]...

For example, let the elements of a ( 2 × 3 ) matrix [ A ( t ) ] be the functions as shown

> A(t) := matrix(2, 3, [cosh(t), sin(t), cosh(2*t), sinh(t), cos(t), sinh(2*t)]) : 'A(t)' = A(t) ;

A(t) = matrix([[cosh(t), sin(t), cosh(2*t)], [sinh(...

The derivative of [ A ( t ) ] is the following matrix:

> `der(A)` := map(diff, A(t), t) : Diff('A(t)', t) = matrix(`der(A)`) ;

Diff(A(t),t) = matrix([[sinh(t), cos(t), 2*sinh(2*t...

or, in "like-in-a-book" form,

> Diff(A(t), t) = matrix(`der(A)`) ;

Diff(matrix([[cosh(t), sin(t), cosh(2*t)], [sinh(t)...

The following example illustrates how to determine the largest interval about the origin, in which a matrix derivative is defined.

Find the largest interval about the origin, in which the derivative of a matrix [ A ] is defined, given that

> A(t) := matrix(2, 3, [2*t^4, tan(t), ln(t^2-3)/2, 2*cos(t), t^2+6, cosh(t)]) : 'A(t)' = A(t) ;

A(t) = matrix([[2*t^4, tan(t), 1/2*ln(t^2-3)], [2*c...

Step 1 . Compute the derivative of [ A ]:

> `der(A)` := map(diff, A(t), t) : Diff('A(t)', t) = matrix(`der(A)`) ;

Diff(A(t),t) = matrix([[8*t^3, 1+tan(t)^2, t/(t^2-3...

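The functions at locations (1,2) and (1,3) of this matrix have discontinuities, so the derivative is not defined over the entire set of real numbers.

Consequently, it is necessary to determine the interval about the origin in which each of these two functions is continuous, and then to find the part common to both intervals.

Note that for t^2 < 3 the element ln(t^2-3)/2 of [ A ] itself is complex-valued. On the principal branch of the logarithm it differs from ln(3-t^2)/2 only by the constant I*Pi/2, so its derivative is still t / ( t^2-3 ).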

Step 2 . Find the interval about the origin within which the element at location (1,2) is continuous:

It is a well-known fact that the branch of the function y = tan(t) that goes through the origin is asymptotic to the lines described by the equations

> t[n_a] := -Pi/2 : t[p_a] := Pi/2 : 't[n_a]' = t[n_a] ; 't[p_a]' = t[p_a] ;

t[n_a] = -1/2*Pi

t[p_a] = 1/2*Pi

where t[n_a] and t[p_a] are the end points of the interval with the central point at t = 0 within which the function is continuous.

Consequently, the branch of the function y = 1+tan(t)^2 is also asymptotic to the same lines and its values at both end points, t[n_a] and t[p_a] , tend to infinity as shown

> Limit([`der(A)`[1,2]], t=t[n_a]) = limit(`der(A)`[1,2], t=t[n_a]) ;
Limit([`der(A)`[1,2]], t=t[p_a]) = limit(`der(A)`[1,2], t=t[p_a]) ;

Limit([1+tan(t)^2],t = -1/2*Pi) = infinity

Limit([1+tan(t)^2],t = 1/2*Pi) = infinity

Therefore, the branch of the function y = 1+tan(t)^2 that goes through the origin is continuous in the open interval ( -Pi /2, Pi /2).

Step 3 . Find the interval about the origin within which the element at location (1,3) is continuous:

This function is discontinuous where its denominator becomes zero. Equating the denominator to zero

> Eq_denom(el13) := denom(`der(A)`[1,3]) = 0 : Eq_denom(el13) ;

t^2-3 = 0

and solving the resultant equation yields

> solution := solve({Eq_denom(el13)}) : solution ;

{t = sqrt(3)}, {t = -sqrt(3)}

Let the numerical parts of the solution be assigned the names t[n_d] and t[p_d] . Thus, the points at which the function y = t / ( t^2-3 ) is discontinuous are

> t[n_d] := rhs(solution[2][1]) : t[p_d] := rhs(solution[1][1]) : 't[n_d]' = t[n_d] ; 't[p_d]' = t[p_d] ;

t[n_d] = -sqrt(3)

t[p_d] = sqrt(3)

The points t[n_d] and t[p_d] are the end points of the interval with the central point at t = 0 within which the function exists and is continuous.

The function y = t / ( t^2-3 ) is not only discontinuous at both end points but also has no limit at these points, as shown

> Limit([`der(A)`[1,3]], t=t[n_d]) = limit(`der(A)`[1,3], t=t[n_d]) ;
Limit([`der(A)`[1,3]], t=t[p_d]) = limit(`der(A)`[1,3], t=t[p_d]) ;

Limit([t/(t^2-3)],t = -sqrt(3)) = undefined

Limit([t/(t^2-3)],t = sqrt(3)) = undefined
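
The value undefined arises because the one-sided limits at each of these points are infinite with opposite signs. A minimal sketch, assuming the optional direction arguments left and right of the limit command, makes this visible:

> limit(`der(A)`[1,3], t=t[p_d], left) ; limit(`der(A)`[1,3], t=t[p_d], right) ;

-infinity

infinity

> limit(`der(A)`[1,3], t=t[n_d], left) ; limit(`der(A)`[1,3], t=t[n_d], right) ;

-infinity

infinity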

Therefore, the function y = t / ( t^2-3 ) is continuous in the open interval ( -sqrt(3) , sqrt(3) ).

Step 4 . Determine the interval common to both functions:

Since

> t[n_d] < t[n_a] ;

-sqrt(3) < -1/2*Pi

and

> t[p_a] < t[p_d] ;

1/2*Pi < sqrt(3)

both functions are continuous within the interval -Pi /2 < t < Pi /2 and the derivative of matrix [ A ] is defined in this interval.
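
The two inequalities in Step 4 are only displayed by Maple, not evaluated. A minimal sketch of a numerical cross-check, assuming the default setting of Digits and the end points assigned in Steps 2 and 3:

> evalf(t[p_a]) ; evalf(t[p_d]) ;

1.570796327

1.732050808

Since the first value is the smaller one, the interval ( -Pi /2, Pi /2) lies inside ( -sqrt(3) , sqrt(3) ).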

* * *

N.B. If two matrices, [ A ( t ) ] and [ B ( t ) ], are conformable for addition and differentiable in some common interval, the derivative of their sum is equal to the sum of the derivatives of both matrices

Diff([A(t)+B(t)],t) = Diff(A(t),t)+Diff(B(t),t)

For example, let ( 2 × 3 ) matrices [ A ( t ) ] and [ B ( t ) ] be given as

> A(t) := matrix(2, 3, [t, t^3, -1, t^2, 2*t, -3*t^2]) : B(t) := matrix(2, 3, [2, t^2, t^3, -2*t, 3*t^2, -1]) :

> 'A(t)' = A(t) ; 'B(t)' = B(t) ;

A(t) = matrix([[t, t^3, -1], [t^2, 2*t, -3*t^2]])

B(t) = matrix([[2, t^2, t^3], [-2*t, 3*t^2, -1]])

(a) The derivative of the sum of the two matrices is the following ( 2 × 3 ) matrix:

> `der(A+B)` := map(factor, map(diff, evalm(A(t)+B(t)), t)) :

> Diff(['A(t)' + 'B(t)'], t) = matrix(`der(A+B)`) ;

Diff([A(t)+B(t)],t) = matrix([[1, t*(2+3*t), 3*t^2]...

(b) The sum of the derivatives of both matrices is the following ( 2 × 3 ) matrix:

> `der(A) + der(B)` := map(factor, evalm(map(diff, A(t), t) + map(diff, B(t), t))) :

> Diff('A(t)', t) + Diff('B(t)', t) = matrix(`der(A) + der(B)`) ;

Diff(A(t),t)+Diff(B(t),t) = matrix([[1, t*(2+3*t), ...

Both matrices of (a) and (b) are equal by inspection.
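
Instead of comparing by inspection, the equal function loaded at the start of this unit can be applied to the two results. A minimal sketch:

> equal(`der(A+B)`, `der(A) + der(B)`) ;

true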

* * *

N.B. The derivative of the product of a scalar (number) and a matrix is equal to the product of the scalar and the matrix derivative

Diff([k*A(t)],t) = k*Diff(A(t),t)

For example, consider a ( 2 × 3 ) matrix [ A ( t ) ] given as

> A(t) := matrix(2, 3, [t, t^3, -1, t^2, 2*t, -3*t^2]) : 'A(t)' = A(t) ;

A(t) = matrix([[t, t^3, -1], [t^2, 2*t, -3*t^2]])

and the scalar k = 3 .

> k := 3 :

(a) The derivative of the product of the scalar and the matrix is the following ( 2 × 3 ) matrix:

> `der(kA)` := map(diff, evalm(k * A(t)), t) : Diff(['k*A(t)'], t) = matrix(`der(kA)`) ;

Diff([k*A(t)],t) = matrix([[3, 9*t^2, 0], [6*t, 6, ...

(b) The product of the scalar and the matrix derivative is the following ( 2 × 3 ) matrix:

> `k der(A)` := evalm(k * map(diff, A(t), t)) : 'k' * Diff('A(t)', t) = matrix(`k der(A)`) ;

k*Diff(A(t),t) = matrix([[3, 9*t^2, 0], [6*t, 6, -1...

Both matrices of (a) and (b) are equal by inspection.
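
The rule does not depend on the particular value k = 3. A minimal sketch with a symbolic scalar, where the unassigned name c is an assumption introduced only for this illustration and linalg[scalarmul] is called by its long name because it is not among the functions loaded at the start of this unit:

> equal(map(diff, linalg[scalarmul](A(t), c), t), linalg[scalarmul](map(diff, A(t), t), c)) ;

true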

* * *

N.B. If two matrices, [ A ( t ) ] and [ B ( t ) ], are conformable for multiplication and differentiable in some common interval, the derivative of their product is equal to the product of the derivative of the pre-multiplier and the post-multiplier plus the product of the pre-multiplier and the derivative of the post-multiplier

Diff([A(t)*B(t)],t) = Diff(A(t),t)*B(t)+A(t)*Diff(B...

For example, consider a ( 2 × 3 ) matrix [ A ( t ) ] given as

> A(t) := matrix(2, 3, [t, t^3, -1, t^2, 2*t, -3*t^2]) : 'A(t)' = A(t) ;

A(t) = matrix([[t, t^3, -1], [t^2, 2*t, -3*t^2]])

and a ( 3 × 3 ) matrix [ B ( t ) ] given as

> B(t) := matrix(3, 3, [2*t^2, -3*t, t, t^3, -3*t^2, 2, 2*t^3, -2*t, 2*t^2]) : 'B(t)' = B(t) ;

B(t) = matrix([[2*t^2, -3*t, t], [t^3, -3*t^2, 2], ...

(a) The derivative of the matrix product is the following ( 2 × 3 ) matrix:

> `der(AB)` := map(sort, map(factor, map(diff, evalm(A(t) &* B(t)), t))) :

> Diff(['A(t)' * 'B(t)'], t) = matrix(`der(AB)`) ;

Diff([A(t)*B(t)],t) = matrix([[6*t^5, -15*t^4-6*t+2...

(b) The product of the derivative of the pre-multiplier and the post-multiplier plus the product of the pre-multiplier and the derivative of the post-multiplier is the following ( 2 × 3 ) matrix:

> `der(A) B + A der(B)` := map(sort, map(factor, evalm(map(diff, A(t), t) &* B(t) + A(t) &* map(diff, B(t), t)))) :

> Diff('A(t)', t) * 'B(t)' + 'A(t)' * Diff('B(t)', t) = matrix(`der(A) B + A der(B)`) ;

Diff(A(t),t)*B(t)+A(t)*Diff(B(t),t) = matrix([[6*t^...

The equal function applied to the resultant matrices of (a) and (b), i.e.

> equal(`der(AB)`, `der(A) B + A der(B)`) ;

true

returns the Boolean value true , which verifies that both matrices are equal.
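
The rule is not restricted to polynomial elements. A minimal sketch for two ( 2 × 2 ) matrices with unspecified differentiable elements, where the names P , Q , p and q are assumptions introduced only for this illustration; the difference between both sides of the rule should reduce to the zero matrix:

> P := matrix(2, 2, [p[11](t), p[12](t), p[21](t), p[22](t)]) : Q := matrix(2, 2, [q[11](t), q[12](t), q[21](t), q[22](t)]) :

> map(expand, evalm(map(diff, evalm(P &* Q), t) - map(diff, P, t) &* Q - P &* map(diff, Q, t))) ;

matrix([[0, 0], [0, 0]])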

* * *

N.B. It follows immediately from the above ( by setting [ B ( t ) ] = [ A ( t ) ] ) that for a square matrix [ A ( t ) ], in general,

Diff([A(t)]^2,t) is not equal to 2*A(t)*Diff(A(t),t)

but

Diff([A(t)]^2,t) = Diff(A(t),t)*A(t)+A(t)*Diff(A(t)...

For example, consider a ( 2 × 2 ) matrix [ A ( t ) ] given as

> A(t) := matrix(2, 2, [2, t^2, t^3, 1]) : 'A(t)' = A(t) ;

A(t) = matrix([[2, t^2], [t^3, 1]])

(a) The derivative of the matrix squared is the following ( 2 × 2 ) matrix:

> `der(A^2)` := map(diff, evalm(A(t)^2), t) : Diff(['A(t)']^2, t) = matrix(`der(A^2)`) ;

Diff([A(t)]^2,t) = matrix([[5*t^4, 6*t], [9*t^2, 5*...

(b) The doubled product of the matrix and its derivative is the following ( 2 × 2 ) matrix:

> `2A der(A)` := evalm(2*A(t) &* map(diff, A(t), t)) : 2 * 'A(t)' * Diff('A(t)', t) = matrix(`2A der(A)`) ;

2*A(t)*Diff(A(t),t) = matrix([[6*t^4, 8*t], [6*t^2,...

which is evidently not equal to the derivative of the matrix squared.

To display the formula for the derivative of [ A ( t ) ]^2 in the next step, its right-hand side must be entered as written. This, however, is normally impossible because Maple's automatic simplification treats the ordinary product as commutative and merges the two terms, which yields this unwanted result

> Diff('A(t)', t) * 'A(t)' + 'A(t)' * Diff('A(t)', t) ;

2*A(t)*Diff(A(t),t)

To get around this problem, a special undocumented command gensym (called as `tools/gensym`) can be used. This command generates a synonym for a variable, i.e. a variable that looks the same as the original one but has a different machine address, and so is considered by Maple to be different.

(c) The product of the matrix derivative and the matrix plus the product of the matrix and its derivative is the following ( 2 × 2 ) matrix:

> `der(A) A + A der(A)` := evalm(evalm(map(diff, A(t), t) &* A(t)) + evalm(A(t) &* map(diff, A(t), t))) :

> Diff('A(t)', t) * `tools/gensym`(A)(t) + 'A(t)' * Diff('A(t)', t) = matrix(`der(A) A + A der(A)`) ;

Diff(A(t),t)*A(t)+A(t)*Diff(A(t),t) = matrix([[5*t^...

which is equal to the derivative of the matrix squared.
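
A possible alternative for the display, offered only as a sketch: since the neutral operator &* is not commutative, the two products written with it are not merged by Maple. This assumes that showing &* in the displayed formula in place of the ordinary product sign is acceptable.

> Diff('A(t)', t) &* 'A(t)' + 'A(t)' &* Diff('A(t)', t) = matrix(`der(A) A + A der(A)`) ;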

* * *

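N.B. If a square matrix [ A ( t ) ] commutes with its derivative, i.e. A(t)*Diff(A(t),t) = Diff(A(t),t)*A(t) , then and only then

Diff(A(t)^2,t) = 2*A(t)*Diff(A(t),t)

Orthogonality alone does not guarantee this, but the orthogonal ( 2 × 2 ) rotation matrix does commute with its derivative. For example, consider a ( 2 × 2 ) matrix [ A ( t ) ] given as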

> A(t) := matrix(2, 2, [cos(t), -sin(t), sin(t), cos(t)]) : 'A(t)' = A(t) ;

A(t) = matrix([[cos(t), -sin(t)], [sin(t), cos(t)]]...

(a) Check whether [ A ( t ) ] is an orthogonal matrix:

> orthog(A(t)) ;

true

(b) The derivative of the matrix squared is the following ( 2 × 2 ) matrix:

> `der(A^2)` := map(combine, map(diff, evalm(A(t)^2), t)) : Diff(['A(t)']^2, t) = matrix(`der(A^2)`) ;

Diff([A(t)]^2,t) = matrix([[-2*sin(2*t), -2*cos(2*t...

(c) The doubled product of the matrix and its derivative is the following ( 2 × 2 ) matrix:

> `2A der(A)` := map(combine, evalm(2*A(t) &* map(diff, A(t), t))) :

> 2 * 'A(t)' * Diff('A(t)', t) = matrix(`2A der(A)`) ;

2*A(t)*Diff(A(t),t) = matrix([[-2*sin(2*t), -2*cos(...

Both matrices of (b) and (c) are equal by inspection.
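
The agreement of (b) and (c) can be traced to the fact that this particular [ A ( t ) ] commutes with its derivative, so that both terms of the product rule coincide. A minimal sketch of a direct check, where the name dA is an assumption introduced only for this illustration:

> dA := map(diff, A(t), t) :

> equal(evalm(A(t) &* dA), evalm(dA &* A(t))) ;

true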

* * *

N.B. If [ A ( t ) ] is a non-singular, differentiable matrix, then the derivative of the matrix inverse is equal to the negative of the product of the matrix inverse, the matrix derivative, and the matrix inverse, in that order

Diff(Inv(A(t)),t) = -Inv(A(t))*Diff(A(t),t)*Inv(A(t...

For example, consider a ( 2 × 2 ) matrix [ A ( t ) ] given as

> A(t) := matrix(2, 2, [1, t, 2, t^2]) : 'A(t)' = A(t) ;

A(t) = matrix([[1, t], [2, t^2]])

(a) The derivative of the inverse of [ A ( t ) ] is the following ( 2 × 2 ) matrix:

> `der(inv(A))` := map(simplify, map(diff, inverse(A(t)), t)) : Diff(Inv('A(t)'), t) = matrix(`der(inv(A))`) ;

Diff(Inv(A(t)),t) = matrix([[-2*1/((t-2)^2), 1/((t-...

(b) The negative product of the inverse of [ A ( t ) ], derivative of [ A ( t ) ], and inverse of [ A ( t ) ] is the following ( 2 × 2 ) matrix:

> `-inv(A) der(A) inv(A)` := map(simplify, evalm(-inverse(A(t)) &* map(diff, A(t), t) &* inverse(A(t)))) :

> `–Inv`('A(t)') * Diff('A(t)', t) * Inv('A(t)') = matrix(`-inv(A) der(A) inv(A)`) ;

`–Inv`(A(t))*Diff(A(t),t)*Inv(A(t)) = matrix([[-2*1...

The equal function applied to the resultant matrices of (a) and (b), i.e.

> equal(`der(inv(A))`, `-inv(A) der(A) inv(A)`) ;

true

returns the Boolean value true , which verifies that both matrices are equal.
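
The formula follows from differentiating the identity [ A ( t ) ] Inv( A ( t ) ) = [ I ] with the product rule, which gives Diff(A(t),t)*Inv(A(t)) + A(t)*Diff(Inv(A(t)),t) = [ 0 ]; pre-multiplying by Inv( A ( t ) ) and rearranging yields the stated result. A minimal sketch checking this intermediate identity for the matrix of the example:

> map(simplify, evalm(map(diff, A(t), t) &* inverse(A(t)) + A(t) &* map(diff, inverse(A(t)), t))) ;

matrix([[0, 0], [0, 0]])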

* * *

N.B. The derivative of a constant matrix [ K ] is the zero matrix of the same order

Diff(K,t) = [ 0 ]

For example, consider a ( 2 × 3 ) matrix [ K ] given as

> K := matrix(2, 3, [1, b, a^2, a, b^2, c]) : K = matrix(K) ;

K = matrix([[1, b, a^2], [a, b^2, c]])

The derivative of [ K ] is the following ( 2 × 3 ) zero matrix:

> `der(K)` := map(diff, K, t) : Diff(K, t) = matrix(`der(K)`) ;

Diff(K,t) = matrix([[0, 0, 0], [0, 0, 0]])

* * *

N.B. If two square non-singular matrices, [ A ( t ) ] and [ B ( t ) ], are differentiable in some common interval, the derivative of the inverse of their product is equal to the product of the derivative of the inverse of the post-multiplier and the inverse of the pre-multiplier plus the product of the inverse of the post-multiplier and the derivative of the inverse of the pre-multiplier

Diff(Inv(A(t)*B(t)),t) = Diff(Inv(B(t)),t)*Inv(A(t)...

For example, consider ( 2 × 2 ) matrices [ A ( t ) ] and [ B ( t ) ] given as

> A(t) := matrix(2, 2, [t, 1, -1, t^2]) : B(t) := matrix(2, 2, [2, t^2, t^3, 1]) : 'A(t)' = A(t) ; 'B(t)' = B(t) ;

A(t) = matrix([[t, 1], [-1, t^2]])

B(t) = matrix([[2, t^2], [t^3, 1]])

(a) The derivative of the inverse of the product of both matrices is the following ( 2 × 2 ) matrix:

> `der(inv(AB))` := map(sort, map(factor, map(diff, inverse(evalm(A(t) &* B(t))), t))) :

> Diff(Inv('A(t)' * 'B(t)'), t) = matrix(`der(inv(AB))`) ;

Diff(Inv(A(t)*B(t)),t) = matrix([[0, -5*t^4/((t^5-2...

(b) The product of the derivative of the inverse of the post-multiplier and the inverse of the pre-multiplier plus the product of the inverse of the post-multiplier and the derivative of the inverse of the pre-multiplier is the following ( 2 × 2 ) matrix:

> `der(inv(B)) inv(A) + inv(B) der(inv(A))` := map(sort, map(factor, evalm(map(diff, inverse(B(t)), t) &* inverse(A(t)) + inverse(B(t)) &* map(diff, inverse(A(t)), t)))) :

> Diff(Inv('B(t)'), t) * Inv('A(t)') + Inv('B(t)') * Diff(Inv('A(t)'), t) = matrix(`der(inv(B)) inv(A) + inv(B) der(inv(A))`) ;

Diff(Inv(B(t)),t)*Inv(A(t))+Inv(B(t))*Diff(Inv(A(t)...

The equal function applied to the resultant matrices of (a) and (b), i.e.

> equal(`der(inv(AB))`, `der(inv(B)) inv(A) + inv(B) der(inv(A))`) ;

true

returns the Boolean value true , which verifies that both matrices are equal.
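
The same derivative can also be cross-checked against the rule for the derivative of an inverse, applied to the product treated as a single matrix. A minimal sketch, where the name AB is an assumption introduced only for this illustration; the sum of the result of (a) and the product Inv(AB) Diff(AB,t) Inv(AB) should reduce to the zero matrix:

> AB := evalm(A(t) &* B(t)) :

> map(simplify, evalm(`der(inv(AB))` + inverse(AB) &* map(diff, AB, t) &* inverse(AB))) ;

matrix([[0, 0], [0, 0]])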

* * *

N.B. If two matrices, [ A ( t ) ] and [ B ( t ) ], are conformable for multiplication and differentiable in some common interval, the derivative of the transpose of their product is equal to the product of the derivative of the transpose of the post-multiplier and the transpose of the pre-multiplier plus the product of the transpose of the post-multiplier and the derivative of the transpose of the pre-multiplier

Diff(Transp(A(t)*B(t)),t) = Diff(Transp(B(t)),t)*Tr...

For example, consider the same ( 2 × 2 ) matrices [ A ( t ) ] and [ B ( t ) ] as before.

(a) The derivative of the transpose of the product of both matrices is the following ( 2 × 2 ) matrix:

> `der(transp(AB))` := map(diff, transpose(evalm(A(t) &* B(t))), t) :

> Diff(Transp('A(t)' * 'B(t)'), t) = matrix(`der(transp(AB))`) ;

Diff(Transp(A(t)*B(t)),t) = matrix([[2+3*t^2, 5*t^4...

(b) The product of the derivative of the transpose of the post-multiplier and the transpose of the pre-multiplier plus the product of the transpose of the post-multiplier and the derivative of the transpose of the pre-multiplier is the following ( 2 × 2 ) matrix:

> `der(transp(B)) transp(A) + transp(B) der(transp(A))` := evalm(map(diff, transpose(B(t)), t) &* transpose(A(t)) + transpose(B(t)) &* map(diff, transpose(A(t)), t)) :

> Diff(Transp('B(t)'), t) * Transp('A(t)') + Transp('B(t)') * Diff(Transp('A(t)'), t) = matrix(`der(transp(B)) transp(A) + transp(B) der(transp(A))`) ;

Diff(Transp(B(t)),t)*Transp(A(t))+Transp(B(t))*Diff...

The equal function applied to the resultant matrices of (a) and (b), i.e.

> equal(`der(transp(AB))`, `der(transp(B)) transp(A) + transp(B) der(transp(A))`) ;

true

returns the Boolean value true , which verifies that both matrices are equal.
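
The rule combines the product rule with two facts: the transpose of a product is the product of the transposes in reverse order, and transposition commutes with differentiation because the latter acts element by element. A minimal sketch checking the second fact for the matrix [ A ( t ) ] of this example:

> equal(map(diff, transpose(A(t)), t), transpose(map(diff, A(t), t))) ;

true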

* * *

Proceed to Unit (28) for " Limits of matrices comprising functions ".

-------------------------------------------------------------------