Matrices-Unit25.mws

MATRICES AND MATRIX OPERATIONS: Unit 25

Dr. Wlodzislaw Kostecki

The Papua New Guinea University of Technology (PNGUT)

Department of Electrical and Communication Engineering

Lae, Morobe Province

Papua New Guinea

Copyright © 2000 by Wlodzislaw Kostecki

All rights reserved

-------------------------------------------------------------------

(25) Functions of matrices comprising non-algebraic functions

OBJECTIVES :

To include a statement based on author s own computational experiments that computation of functions of matrices comprising non-algebraic functions becomes very extensive for matrices of order higher than second .

To stress that the method using truncated Maclaurin’ s series is completely impractical for this kind of matrices of order higher than second because of the prohibitive computation time.

To include an observation that computation of functions of second-order matrices having four non-algebraic functions, or three such functions and a number (including zero at an 'improper' location) is also very extensive.

To include an observation that computation of functions of second-order matrices with non-algebraic functions is practically tractable only if a matrix contains a zero element at either of two specific locations in the matrix.

To substantiate the preceding observation by inclusion of an analysis of eigenvalues of second-order matrices with symbolic-function elements.

To demonstrate that the eigenvalues of a matrix containing a zero element at either of the two specific locations are equal to the elements on the main diagonal of the matrix.

To provide four step-by-step solved examples for computation of functions of second-order matrices with at least two non-algebraic functions and a zero element at a 'proper' location using in each case all the three methods of computation of matrix functions that are employed in Unit (23).

To provide one step-by-step solved example for computation of a function of a second-order matrix with two non-algebraic functions and a zero element at an 'improper' location using the same three methods.

To provide a comparison of the resultant matrix functions by computing the function of each matrix for a given numerical value of the variable.

To show how to use the eigenvects function for matrices whose elements are non-algebraic functions.

To include computations of the limits of the resultant matrix functions having a discontinuity on the interval of convergence of the function.

To stress that, and explain why, the numerical values of the elements on the main diagonal of the limit of a matrix function having an element with discontinuity are equal when the variable tends to the value at which the element is discontinuous.

> restart : interface(warnlevel=0) : with(linalg, augment, charpoly, coldim, diag, eigenvals, eigenvects, exponential, rowdim) :

This Unit deals with computation of functions of square matrices in which at least one element is a non-algebraic or transcendental function of one real variable t .

A function f of a matrix [ A ( t ) ] whose elements are non-algebraic functions of a real variable t may be computed only if the function is well defined for the matrix. [ Refer to Unit (23) for the well-defined function. ]

It was stated in the preceding Units (23 and 24) that a well-defined function of a matrix [ A ( t ) ] may be computed using the Cayley-Hamilton theorem, similarity of matrices, or the Maclaurin series representation of the corresponding function f(x) .

However, it appears that computation of functions of matrices with non-algebraic functions becomes very extensive for matrices of order higher than ( 2 × 2 ). The method using truncated Maclaurin’ s series turns out to be completely impractical for such matrices because of the prohibitive computation time.

Even if matrices of order ( 2 × 2 ) only are involved, computations of a matrix function are very extensive irrespective of the method used. The exception are matrices that comprise a zero element either at location (1,2) or (2,1). In this case, eigenvalues of the matrix are equal to its elements at locations (1,1) and (2,2), i.e. on the main diagonal. This ensures that a function of the matrix may be found without a great computational effort.

These observations are briefly analysed hereunder.

Consider ( 2 × 2 ) matrices [ A ( t ) ] and [ B ( t ) ] comprising a zero element at location (1,2) and (2,1), respectively.

> A(t) := matrix(2, 2, [a[11](t), 0, a[21](t), a[22](t)]) : B(t) := matrix(2, 2, [b[11](t), b[12](t), 0, b[22](t)]) :

> 'A(t)' = A(t) ; 'B(t)' = B(t) ;

A(t) = matrix([[a[11](t), 0], [a[21](t), a[22](t)]]...

B(t) = matrix([[b[11](t), b[12](t)], [0, b[22](t)]]...

The characteristic equation of either matrix has the same form, viz.

> l := lambda : char_eq('A(t)') := charpoly(A(t), l[A]) = 0 : char_eq('B(t)') := charpoly(B(t), l[B]) = 0 :

> char_eq('A(t)') ; char_eq('B(t)') ;

(lambda[A]-a[11](t))*(lambda[A]-a[22](t)) = 0

(lambda[B]-b[11](t))*(lambda[B]-b[22](t)) = 0

Inspection of the characteristic equations shows clearly that the eigenvalues are equal to the elements at locations (1,2) and (2,1).

The solution of either equation yields the eigenvalues, viz.

> solution(char_eq('A(t)')) := solve({char_eq('A(t)')}, {l[A]}) :

> solution(char_eq('B(t)')) := solve({char_eq('B(t)')},{l[B]}) :

> solution(char_eq('A(t)')) ; solution(char_eq('B(t)')) ;

{lambda[A] = a[22](t)}, {lambda[A] = a[11](t)}

{lambda[B] = b[22](t)}, {lambda[B] = b[11](t)}

whence

> l1A := rhs(op(solution(char_eq('A(t)'))[1])) : l2A := rhs(op(solution(char_eq('A(t)'))[2])) :

> l1B := rhs(op(solution(char_eq('B(t)'))[1])) : l2B := rhs(op(solution(char_eq('B(t)'))[2])) :

> l[1][A] = l1A ; l[2][A] = l2A ; `` ; l[1][B] = l1B ; l[2][B] = l2B ;

lambda[1][A] = a[22](t)

lambda[2][A] = a[11](t)

``

lambda[1][B] = b[22](t)

lambda[2][B] = b[11](t)

The above analysis has been done for instructive purposes to show why the eigenvalues are equal to the elements at the specified locations of a matrix. Naturally, the eigenvalues may be computed 'directly' using the eigenvals function, which is not that instructive but gives the same result with simpler commands, viz.

> charroots('A(t)') := eigenvals(A(t)) : charroots('B(t)') := eigenvals(B(t)) :

> l1A := charroots('A(t)')[1] : l2A := charroots('A(t)')[2] :

> l1B := charroots('B(t)')[1] : l2B := charroots('B(t)')[2] :

> l[1][A] = l1A ; l[2][A] = l2A ; `` ; l[1][B] = l1B ; l[2][B] = l2B ;

lambda[1][A] = a[11](t)

lambda[2][A] = a[22](t)

``

lambda[1][B] = b[11](t)

lambda[2][B] = b[22](t)

If a matrix contains a zero element, which is at a location different than (1,2) or (2,1), i.e. on the main diagonal, for example,

> A(t) := matrix(2, 2, [0, a[12](t), a[21](t), a[22](t)]) : B(t) := matrix(2, 2, [b[11](t), b[12](t), b[21](t), 0]) :

> 'A(t)' = A(t) ; 'B(t)' = B(t) ;

A(t) = matrix([[0, a[12](t)], [a[21](t), a[22](t)]]...

B(t) = matrix([[b[11](t), b[12](t)], [b[21](t), 0]]...

then the expressions of eigenvalues become more complicated since the square root is involved, viz.

> charroots('A(t)') := eigenvals(A(t)) : charroots('B(t)') := eigenvals(B(t)) :

> l1A := charroots('A(t)')[1] : l2A := charroots('A(t)')[2] :

> l1B := charroots('B(t)')[1] : l2B := charroots('B(t)')[2] :

> l[1][A] = l1A ; l[2][A] = l2A ; `` ; l[1][B] = l1B ; l[2][B] = l2B ;

lambda[1][A] = 1/2*a[22](t)+1/2*sqrt(a[22](t)^2+4*a...

lambda[2][A] = 1/2*a[22](t)-1/2*sqrt(a[22](t)^2+4*a...

``

lambda[1][B] = 1/2*b[11](t)+1/2*sqrt(b[11](t)^2+4*b...

lambda[2][B] = 1/2*b[11](t)-1/2*sqrt(b[11](t)^2+4*b...

If a matrix contains no zero element, for example

> A(t) := matrix(2, 2, [a[11](t), a[12](t), a[21](t), a[22](t)]) : 'A(t)' = A(t) ;

A(t) = matrix([[a[11](t), a[12](t)], [a[21](t), a[2...

then the expressions of eigenvalues also include the square root and are yet more complex, viz.

> charroots('A(t)') := eigenvals(A(t)) : l1A := charroots('A(t)')[1] : l2A := charroots('A(t)')[2] :

> l[1][A] = l1A ; l[2][A] = l2A ;

lambda[1][A] = 1/2*a[11](t)+1/2*a[22](t)+1/2*sqrt(a...

lambda[2][A] = 1/2*a[11](t)+1/2*a[22](t)-1/2*sqrt(a...

even if re-arranged to assume more compact forms

> l1A := a[11](t)/2 + a[22](t)/2 + sqrt((a[11](t)-a[22](t))^2 + 4*a[12](t)*a[21](t))/2 :

> l2A := a[11](t)/2 + a[22](t)/2 - sqrt((a[11](t)-a[22](t))^2 + 4*a[12](t)*a[21](t))/2 :

> l[1][A] = l1A ; l[2][A] = l2A ;

lambda[1][A] = 1/2*a[11](t)+1/2*a[22](t)+1/2*sqrt((...

lambda[2][A] = 1/2*a[11](t)+1/2*a[22](t)-1/2*sqrt((...

In view of the above analysis, it is stated that mathematically tractable are those matrices with transcendental-function elements whose order does not exceed ( 2 × 2 ) and which contain a zero element either at location (1,2) or (2,1). The first four examples involve such matrices. Their function elements have also been carefully selected so that all the three methods may be used for computation of a well-defined function of the matrix without a great computational effort.

The last, fifth, example is associated with Example 1 inasmuch as the same function is computed and the matrix contains the same function elements, but the zero element is at location (2,2) . The intent of this example is to illustrate the aforementioned much greater effort in computing the matrix function.

Example 1

Compute cos( [ A ( t ) ] ) of a ( 2 × 2 ) matrix [ A ( t ) ] given as

> A(t) := matrix(2, 2, [sin(t), t+1, 0, cos(t)]) : 'A(t)' = A(t) ;

A(t) = matrix([[sin(t), t+1], [0, cos(t)]])

The power series of cos(x) is convergent for all x [refer to Unit (23)], which implies that cos( [ A ] ) is well defined for every square matrix. This, in turn, means that the function cos( [ A ( t ) ] ) is well defined for every value of t , so it may be computed for this matrix. See also the note at the end of Method 1.

For convenience in computations, let

> B(t) := A(t) : A(t) := 'A(t)' :

Method 1 . Using the Cayley-Hamilton theorem

Step 1 . Having in mind that n = 2 for this matrix, write equation (1) of Unit (23) B for [ A ( t ) ]:

> `cos(A(t))` := a[1] * A(t) + a[0]*U : cos(A(t)) = `cos(A(t))` ; Cos(A(t)) := `cos(A(t))` :

cos(A(t)) = a[1]*A(t)+a[0]*U

Step 2 . Bearing in mind that the unit matrix [ U ] appropriately sized for this case is

> U := diag(1, 1) : U = matrix(U) ;

U = matrix([[1, 0], [0, 1]])

evaluate the matrix equation for cos( [ A ( t ) ] ):

> `cos(A(t))` := evalm(subs(A(t)=B(t), `cos(A(t))`)) : cos(A(t)) = matrix(`cos(A(t))`) ;

cos(A(t)) = matrix([[a[1]*sin(t)+a[0], a[1]*(t+1)],...

Step 3 . Formulate equation (2) of Unit (23) B corresponding to this case:

> `r(l)` := subs(A(t)=l, U=1, Cos(A(t))) : r(l) = `r(l)` ;

r(lambda) = a[1]*lambda+a[0]

Step 4 . Determine the eigenvalues of [ A ( t ) ]:

> charroots(A(t)) := eigenvals(subs(A(t)=B(t), A(t))) : char_roots(A(t)) = charroots(A(t)) ;

char_roots(A(t)) = (sin(t), cos(t))

Extracting the individual eigenvalues lambda[i] yields

> No_roots(A(t)) := nops([charroots(A(t))]) :

> for i to No_roots(A(t)) do ch_r[i](A(t)) := charroots(A(t))[i] : print(l[i] = ch_r[i](A(t))) : od :

lambda[1] = sin(t)

lambda[2] = cos(t)

Step 5 . Taking notice of the fact that the roots are distinct, formulate equation (3) of Unit (23) B in a general form:

Since the function f(lambda) corresponding to this case is

> `f(l)` := cos(l) : f(l) = `f(l)` ;

f(lambda) = cos(lambda)

therefore, equation (3) assumes the form

> Eq3_f(l) := `f(l)` = `r(l)` : Eq3_f(l) ;

cos(lambda) = a[1]*lambda+a[0]

Step 6 . Obtain a set of equations by substituting either eigenvalue into equation (3):

> for i to No_roots(A(t)) do Eq3_f[i](l) := subs(l=ch_r[i](A(t)), Eq3_f(l)) : print(Eq3_f[i](l)) : od :

cos(sin(t)) = a[1]*sin(t)+a[0]

cos(cos(t)) = a[1]*cos(t)+a[0]

Step 7 . Solve the simultaneous equations for the unknown coefficients:

> solution := solve({Eq3_f[1](l), Eq3_f[2](l)}, {a[0], a[1]}) : solution ;

{a[0] = (-cos(sin(t))*cos(t)+sin(t)*cos(cos(t)))/(-...

Extracting either unknown from the solution set yields

> assign(solution) : for i to No_roots(A(t)) do a[i-1] := a[i-1] : print(evaln(a[i-1]) = a[i-1]) : od :

a[0] = (-cos(sin(t))*cos(t)+sin(t)*cos(cos(t)))/(-c...

a[1] = -(cos(cos(t))-cos(sin(t)))/(-cos(t)+sin(t))

Step 8 . Substitute the values of the coefficients into the matrix cos( [ A ( t ) ] ), simplify the elements of the resultant matrix, and add a distinguishing subscript, say, C_H ( C ayley- H amilton) to its name for future comparative purposes:

> `cos(A(t))[C_H]` := map(normal, map(x->eval(x), `cos(A(t))`)) :

> cos(A(t))[C_H] = matrix(`cos(A(t))[C_H]`) ;

cos(A(t))[C_H] = matrix([[cos(sin(t)), -(cos(cos(t)...

> for i to No_roots(A(t)) do a[i-1] := evaln(a[i-1]) : od :

N.B. In general, the function cos( [ A ( t ) ] ) is well defined for every value of t . This is true if the elements of the matrix function are continuous functions for every t .

It can be easily noticed that the function element at location (1,2) in the above matrix is discontinuous if the denominator becomes zero . Equating the denominator to zero

> Eq_denom(el12) := denom(`cos(A(t))[C_H]`[1,2]) = 0 : Eq_denom(el12) ;

-cos(t)+sin(t) = 0

and solving the resultant equation yield

> solution := solve({Eq_denom(el12)}) : solution ;

{t = 1/4*Pi}

Let the numerical part of the solution be assigned the name t[d_s] . Thus,

> t[d_s] := rhs(solution[1]) : 't[d_s]' = t[d_s] ;

t[d_s] = 1/4*Pi

The denominator becomes zero at every Pi/4+2*Pi*n , where n = 0, ±1, ±2, ±3.

If the function cos( [ A ( t ) ] ) is to be computed for t = t[d_s] , this can be done only if a limiting value of the matrix element exists when t tends to the numerical value of t[d_s] . Consequently, this must be checked by computing the limit as follows:

> `limit(el[12](cos(A(t))[C_H]))` := limit(`cos(A(t))[C_H]`[1,2], t=t[d_s]) :

> Limit(element[12](cos(A(t))[C_H]), t=t[d_s]) = `limit(el[12](cos(A(t))[C_H]))` ;

Limit(element[12](cos(A(t))[C_H]),t = 1/4*Pi) = -1/...

or, in a more compact form,

> `limit(el[12](cos(A(t))[C_H]))` := factor(`limit(el[12](cos(A(t))[C_H]))`) :

> Limit(element[12](cos(A(t))[C_H]), t=t[d_s]) = `limit(el[12](cos(A(t))[C_H]))` ;

Limit(element[12](cos(A(t))[C_H]),t = 1/4*Pi) = -1/...

In computational practice, the above check may be omitted and an attempt made to compute the value of the function cos( [ A ( t ) ] ) for t = t[d_s] directly as the limit of the matrix function when t tends to the numerical value of t[d_s] . This yields

> `lim cos(A(t))[C_H]` := map(factor, map(limit, `cos(A(t))[C_H]`, t=t[d_s])) :

> Limit(cos(A(t))[C_H], t=t[d_s]) = matrix(`lim cos(A(t))[C_H]`) ;

Limit(cos(A(t))[C_H],t = 1/4*Pi) = matrix([[cos(1/2...

Floating-point evaluation of the above gives

> Limit(cos(A(t))[C_H], t=t[d_s]) = evalf(matrix(`lim cos(A(t))[C_H]`)) ;

Limit(cos(A(t))[C_H],t = 1/4*Pi) = matrix([[.760244...

[ For computation of limits of function matrices, refer to Unit (28). ]

Method 2 . Using similarity of matrices

Step 1 . Check if the eigenvalues of [ A ( t ) ] are distinct:

See Step 4 above where the eigenvalues are found to be distinct.

Step 2 . Find the sequence of lists containing eigenvalues and eigenvectors of [ A ( t ) ]:

WARNING : The eigenvects function works only for a matrix whose elements are rationals, rational functions, algebraic numbers, or algebraic functions. Since matrix [ A ( t ) ] comprises non-algebraic functions, eigenvects cannot be used for [ A ( t ) ]. To get around this problem, construct an auxiliary matrix [ C ] as follows.

> C := matrix(2, 2) :

> C[1,1] := f[11] : C[1,2] := f[12] : C[2,1] := B(t)[2,1] : C[2,2] := f[22] :

> C := matrix(C) : C = matrix(C) ;

C = matrix([[f[11], f[12]], [0, f[22]]])

whose elements f[11] , f[12] , and f[22] denote, respectively,

> f[11] = B(t)[1,1] ; f[12] = B(t)[1,2] ; f[22] = B(t)[2,2] ;

f[11] = sin(t)

f[12] = t+1

f[22] = cos(t)

Determine the eigenvalues of [ C ]:

> charroots(C) := eigenvals(C) : char_roots(C) = charroots(C) ;

char_roots(C) = (f[11], f[22])

whence

> No_roots(C) := nops([charroots(C)]) :

> for i to No_roots(C) do ch_r[i](C) := charroots(C)[i] : print(l[i] = ch_r[i](C)) : od :

lambda[1] = f[11]

lambda[2] = f[22]

The sequence of lists containing eigenvalues and eigenvectors of [ C ] is

> `roots&vectors(C)` := eigenvects(C) : roots_and_vectors(C) = `roots&vectors(C)` ;

roots_and_vectors(C) = ([f[22], 1, {vector([1, -(-f...

Step 3 . Extract the eigenvectors v[lambda[i]] corresponding to eigenvalues lambda[i] of [ C ]:

> for i to No_roots(C) do e[i] := charroots(C)[i] : List[i] := `roots&vectors(C)`[i] : od :

> for j to No_roots(C) do for i to No_roots(C) do if List[i][1] = e[j] then Lst[j] := `roots&vectors(C)`[i] : fi : od : od :

> for i to No_roots(C) do ch_v[i](C) := op(Lst[i][3]) : print(v[lambda[i]] = eval(ch_v[i](C)), ` --> ` * lambda[i] = ch_r[i](C)) : od :

v[lambda[1]] = vector([1, 0]), `  -->  `*lambda[1] ...

v[lambda[2]] = vector([1, -(-f[22]+f[11])/f[12]]), ...

Conversion of the eigenvectors v[lambda[i]] into column matrices X[lambda[i]] yields

> for i to No_roots(C) do X[l[i]] := convert(ch_v[i](C), matrix) : print(X[l[i]] = matrix(X[l[i]])) : od :

X[lambda[1]] = matrix([[1], [0]])

X[lambda[2]] = matrix([[1], [-(-f[22]+f[11])/f[12]]...

Step 4 . Construct the 'unique' modal matrix [ M ] associated with [ C ]:

> i := 'i' : `seq(X[l[i]])` := seq(X[l[i]], i=1..coldim(C)) : M := augment(`seq(X[l[i]])`) : M=matrix(M) ;

M = matrix([[1, 1], [0, -(-f[22]+f[11])/f[12]]])

Step 5 . Construct the unique Jordan form [ J ] of [ C ]:

> J_C := array(diagonal, 1..rowdim(C), 1..coldim(C)) :

> for i to rowdim(C) do for j to coldim(C) do if j = i then J_C[i,j] := ch_r[i](C) fi : od : od :

> J := matrix(J_C) : J = matrix(J) ;

J = matrix([[f[11], 0], [0, f[22]]])

Step 6 . Compute the function cosine of matrix [ J ]:

> J_C := array(diagonal, 1..rowdim(J), 1..coldim(J)) :

> for i to rowdim(J) do for j to coldim(J) do if j = i then J_C[i,j] := cos(J[i,j]) fi : od : od :

> `cos(J)` := matrix(J_C) : cos(J) = matrix(`cos(J)`) ;

cos(J) = matrix([[cos(f[11]), 0], [0, cos(f[22])]])...

Step 7 . Compute cos( [ C ] ) = [ M ] cos( [ J ] ) Inv [ M ] and simplify the elements of the resultant matrix:

> `cos(C)` := map(normal, evalm(M &* `cos(J)` &* M^(-1))) :

> cos(C) = matrix(`cos(C)`) ;

cos(C) = matrix([[cos(f[11]), f[12]*(cos(f[11])-cos...

Substitute the elements f[11] , f[12] , and f[22] with their corresponding function elements of [ A ( t ) ], substitute these elements into the matrix cos( [ C ] ), rename it as cos( [ A ( t ) ] ), and add a distinguishing subscript, say, s_m ( s imilarity of m atrices) to the name of the resultant matrix for future comparative purposes:

> `cos(A(t))[s_m]` := subs(f[11]=B(t)[1,1], f[12]=B(t)[1,2], f[22]=B(t)[2,2], matrix(`cos(C)`)) :

> cos(A(t))[s_m] = matrix(`cos(A(t))[s_m]`) ;

cos(A(t))[s_m] = matrix([[cos(sin(t)), (t+1)*(cos(s...

The symbolic forms of matrices cos(A(t))[C_H] and cos(A(t))[s_m] are equal.

Method 3 . Using truncated Maclaurin’ s series

Step 1 . Find the eigenvalues of [ A ( t ) ]:

This step was already done earlier to check if the eigenvalues of [ A ( t ) ] are distinct, which was essential for both preceding methods. It is not necessary for this method since the series of cos(x) is convergent for all x , which implies that cos( [ A ( t ) ] ) is well defined for every square matrix.

Step 2 . Write the expression for Maclaurin’ s series representing the function cos(x) and substitute the matrix name for x :

> `cos(A(t))` := U + subs(x=A(t), Sum((-1)^n*x^(2*n)/(2*n)!, n=1..infinity)) : cos(A(t)) = `cos(A(t))` ;

cos(A(t)) = U+Sum((-1)^n*A(t)^(2*n)/(2*n)!,n = 1 .....

Step 3 . Substitute the matrix names with their corresponding matrix structures:

> `cos(A(t))` := subs(A(t)=B(t), U=matrix(U), `cos(A(t))`) : cos(A(t)) = `cos(A(t))` ;

cos(A(t)) = matrix([[1, 0], [0, 1]])+Sum((-1)^n*mat...

Step 4 . Evaluate the matrix cos( [ A ( t ) ] ) with its symbolic elements after truncating the series to its first 50 terms.

At this point, a relative disadvantage of this method appears in comparison with the previous methods: displaying of the matrix for a large number of terms of the Maclaurin series becomes impractical.

Exemplarily, adopt at this point only n = 2 , simplify the matrix elements, and display the matrix

> `cos(A(t))` := map(combine, evalm(subs(A(t)=B(t), U + sum((-1)^n*A(t)^(2*n)/(2*n)!, n=1..2)))) :

> cos(A(t)) = matrix(`cos(A(t))`) ;

cos(A(t)) = matrix([[49/64+11/48*cos(2*t)+1/192*cos...

However, computational experiments show that the number of series terms adopted, n , should be about 50 to ensure a satisfactory accuracy. On the other hand, it is rather unlikely that a display of such matrices with their symbolic elements would be of any use. Practically, the result of numerical computation of the matrix function for a given value of t is required only.

Consequently, n = 50 is adopted and the resultant matrix is added a distinguishing subscript, say, M_s ( M aclaurin’ s s eries ) to its name for future comparative purposes. Moreover, the function combine is dropped, which results in some saving of computation time.

Thus, the final expression for cos( [ A ( t ) ] ) computed by this method is adopted as follows:

> `cos(A(t))[M_s]`:= evalm(subs(A(t)=B(t), U + sum((-1)^n*A(t)^(2*n)/(2*n)!, n=1..50))) :

> cos(A(t))[M_s] = subs(x=B(t), matrix(U) + Sum((-1)^n*x^(2*n)/(2*n)!, n=1..50)) ; B(t) := 'B(t)' :

cos(A(t))[M_s] = matrix([[1, 0], [0, 1]])+Sum((-1)^...

Comparison of numerical results of computation of cos( [ A ( t ) ] )

Compute cos( [ A ( t ) ] ) if t = 2 using all the three methods.

> t := 2 :

(a) For Method 1 using the Cayley-Hamilton theorem:

> `cos(A(t))[C_H]` := map(x->eval(x), `cos(A(t))[C_H]`) : cos(A('t'))[C_H] = matrix(`cos(A(t))[C_H]`) ;

cos(A(t))[C_H] = matrix([[cos(sin(2)), -3*(cos(cos(...

Floating-point evaluation yields

> `cos(A(t))[C_H]` := evalf(matrix(`cos(A(t))[C_H]`)) : cos(A('t'))[C_H] = matrix(`cos(A(t))[C_H]`) ;

cos(A(t))[C_H] = matrix([[.6143002821, -.6798166899...

(b) For Method 2 using similarity of matrices:

> `cos(A(t))[s_m]` := map(x->eval(x), `cos(A(t))[s_m]`) : cos(A('t'))[s_m] = matrix(`cos(A(t))[s_m]`) ;

cos(A(t))[s_m] = matrix([[cos(sin(2)), 3*(cos(sin(2...

Floating-point evaluation yields

> `cos(A(t))[s_m]` := evalf(matrix(`cos(A(t))[s_m]`)) : cos(A('t'))[s_m] = matrix(`cos(A(t))[s_m]`) ;

cos(A(t))[s_m] = matrix([[.6143002821, -.6798166899...

(c) For Method 3 using truncated Maclaurin’ s series:

> `cos(A(t))[M_s]` := evalf(map(x->eval(x), `cos(A(t))[M_s]`)) :

> cos(A('t'))[M_s] = matrix(`cos(A(t))[M_s]`) ; t := 't' :

cos(A(t))[M_s] = matrix([[.6143002822, -.6798166896...

The matrices of (a) and (b) are precisely equal and the matrix of (c) may be considered practically equal to the former ones.

* * *

Example 2

Compute sin( [ A ( t ) ] ) of a ( 2 × 2 ) matrix [ A ( t ) ] given as

> A(t) := matrix(2, 2, [4*t^2+3*t, 0, 3*t*sin(2*t), -5*t+2]) : 'A(t)' = A(t) ;

A(t) = matrix([[4*t^2+3*t, 0], [3*t*sin(2*t), -5*t+...

The power series of sin(x) is convergent for all x [refer to Unit (23)], which implies that sin( [ A ] ) is well defined for every square matrix. This, in turn, means that the function sin( [ A ( t ) ] ) is well defined for every value of t , so it may be computed for this matrix. See also the note at the end of Method 1.

For convenience in computations, let

> B(t) := A(t) : A(t) := 'A(t)' :

Method 1 . Using the Cayley-Hamilton theorem

Step 1 . Having in mind that n = 2 for this matrix, write equation (1) of Unit (23) B for [ A ( t ) ]:

> `sin(A(t))` := a[1] * A(t) + a[0]*U : sin(A(t)) = `sin(A(t))` ; Sin(A(t)) := `sin(A(t))` :

sin(A(t)) = a[1]*A(t)+a[0]*U

Step 2 . With the unit matrix [ U ] as in Example 1, evaluate the matrix equation for sin( [ A ( t ) ] ):

> `sin(A(t))` := evalm(subs(A(t)=B(t), `sin(A(t))`)) : sin(A(t)) = matrix(`sin(A(t))`) ;

sin(A(t)) = matrix([[a[1]*(4*t^2+3*t)+a[0], 0], [3*...

Step 3 . Formulate equation (2) of Unit (23) B corresponding to this case:

> `r(l)` := subs(A(t)=l, U=1, Sin(A(t))) : r(l) = `r(l)` ;

r(lambda) = a[1]*lambda+a[0]

Step 4 . Determine the eigenvalues of [ A ( t ) ]:

> charroots(A(t)) := eigenvals(subs(A(t)=B(t), A(t))) : char_roots(A(t)) = charroots(A(t)) ;

char_roots(A(t)) = (4*t^2+3*t, -5*t+2)

Extracting the individual eigenvalues lambda[i] yields

> No_roots(A(t)) := nops([charroots(A(t))]) :

> for i to No_roots(A(t)) do ch_r[i](A(t)) := charroots(A(t))[i] : print(l[i] = ch_r[i](A(t))) : od :

lambda[1] = 4*t^2+3*t

lambda[2] = -5*t+2

Step 5 . Taking notice of the fact that the roots are distinct, formulate equation (3) of Unit (23) B in a general form:

Since the function f(lambda) corresponding to this case is

> `f(l)` := sin(l) : f(l) = `f(l)` ;

f(lambda) = sin(lambda)

therefore, equation (3) assumes the form

> Eq3_f(l) := `f(l)` = `r(l)` : Eq3_f(l) ;

sin(lambda) = a[1]*lambda+a[0]

Step 6 . Obtain a set of equations by substituting either eigenvalue into equation (3):

> for i to No_roots(A(t)) do Eq3_f[i](l) := subs(l=ch_r[i](A(t)), Eq3_f(l)) : print(Eq3_f[i](l)) : od :

sin(4*t^2+3*t) = a[1]*(4*t^2+3*t)+a[0]

sin(-5*t+2) = a[1]*(-5*t+2)+a[0]

Step 7 . Solve the simultaneous equations for the unknown coefficients:

> solution := solve({Eq3_f[1](l), Eq3_f[2](l)}, {a[0], a[1]}) : solution ;

{a[0] = 1/2*(-4*sin(5*t-2)*t^2-3*sin(5*t-2)*t+5*t*s...

Extracting either unknown from the solution set yields

> assign(solution) : for i to No_roots(A(t)) do a[i-1] := a[i-1] : print(evaln(a[i-1]) = a[i-1]) : od :

a[0] = 1/2*(-4*sin(5*t-2)*t^2-3*sin(5*t-2)*t+5*t*si...

a[1] = -1/2*(-sin(4*t^2+3*t)-sin(5*t-2))/(2*t^2+4*t...

After some re-arrangements, both coefficients are written in more compact forms, viz.

> a[0]:=((5*t-2)*sin(4*t^2+3*t) - (4*t^2+3*t)*sin(5*t-2))/(4*t^2+8*t-2) : a[1]:=numer(a[1])/denom(a[1]) :

> for i to No_roots(A(t)) do print(evaln(a[i-1]) = a[i-1]) : od :

a[0] = ((5*t-2)*sin(4*t^2+3*t)-(4*t^2+3*t)*sin(5*t-...

a[1] = (sin(4*t^2+3*t)+sin(5*t-2))/(4*t^2+8*t-2)

Step 8 . Substitute the values of the coefficients into the matrix sin( [ A ( t ) ] ), simplify the elements of the resultant matrix, and add a distinguishing subscript, say, C_H ( C ayley- H amilton) to its name for future comparative purposes:

> `sin(A(t))[C_H]` := map(normal, map(x->eval(x), `sin(A(t))`)) :

> sin(A(t))[C_H] = matrix(`sin(A(t))[C_H]`) ;

sin(A(t))[C_H] = matrix([[sin(4*t^2+3*t), 0], [3/2*...

> for i to No_roots(A(t)) do a[i-1] := evaln(a[i-1]) : od :

N.B. In general, the function sin( [ A ( t ) ] ) is well defined for every value of t . This is true if the elements of the matrix function are continuous functions for every t .

It can be easily noticed that the function element at location (2,1) in the above matrix is discontinuous if the denominator becomes zero . Equating the denominator to zero

> Eq_denom(el21) := denom(`sin(A(t))[C_H]`[2,1]) = 0 : Eq_denom(el21) ;

4*t^2+8*t-2 = 0

and solving the resultant equation yield

> solution := solve({Eq_denom(el21)}) : solution ;

{t = -1+1/2*sqrt(6)}, {t = -1-1/2*sqrt(6)}

Let the numerical parts of the solution be assigned the names t[d_s[1]] and t[d_s[2]] . Thus,

> for i to 2 do t[d_s[i]] := rhs(solution[i][1]) : print(evaln(t[d_s[i]]) = t[d_s[i]]) : od :

t[d_s[1]] = -1+1/2*sqrt(6)

t[d_s[2]] = -1-1/2*sqrt(6)

Floating-point evaluation gives

> for i to 2 do t[d_s[i]] := evalf(t[d_s[i]]) : print(evaln(t[d_s[i]]) = t[d_s[i]]) : od :

t[d_s[1]] = .224744872

t[d_s[2]] = -2.224744872

If the function sin( [ A ( t ) ] ) is to be computed for t[d_s[1]] or t[d_s[2]] , this can be done only if a limiting value of the matrix element exists when t tends to the numerical value of t[d_s[1]] or t[d_s[2]] . Consequently, this must be checked by computing the limit as follows:

for t = t[d_s[1]]

> `limit(el[21](sin(A(t))[C_H]))` := evalf(limit(`sin(A(t))[C_H]`[2,1], t=t[d_s[1]])) :

> Limit(element[21](sin(A(t))[C_H]), t=t[d_s[1]]) = `limit(el[21](sin(A(t))[C_H]))` ;

Limit(element[21](sin(A(t))[C_H]),t = .224744872) =...

for t = t[d_s[2]]

> `limit(el[21](sin(A(t))[C_H]))` := evalf(limit(`sin(A(t))[C_H]`[2,1], t=t[d_s[2]])) :

> Limit(element[21](sin(A(t))[C_H]), t=t[d_s[2]]) = `limit(el[21](sin(A(t))[C_H]))` ;

Limit(element[21](sin(A(t))[C_H]),t = -2.224744872)...

In computational practice, the above check may be omitted and an attempt made to compute the value of the function cos( [ A ( t ) ] ) for t = t[d_s[1]] or t = t[d_s[2]] directly as the limit of the matrix function when t tends to the numerical value of t[d_s[1]] or t[d_s[2]] . This gives

for t = t[d_s[1]]

> `lim sin(A(t))[C_H]` := evalf(map(limit, `sin(A(t))[C_H]`, t=t[d_s[1]])) :

> Limit(sin(A(t))[C_H], t=t[d_s[1]]) = matrix(`lim sin(A(t))[C_H]`) ;

Limit(sin(A(t))[C_H],t = .224744872) = matrix([[.76...

for t = t[d_s[2]]

> `lim sin(A(t))[C_H]` := evalf(map(limit, `sin(A(t))[C_H]`, t=t[d_s[2]])) :

> Limit(sin(A(t))[C_H], t=t[d_s[2]]) = matrix(`lim sin(A(t))[C_H]`) ;

Limit(sin(A(t))[C_H],t = -2.224744872) = matrix([[....

Notice that the elements at locations (1,1) and (2,2) in the matrix sin(A(t))[C_H] are equal if t equals the root t[d_s[1]] or t[d_s[2]] . Explanation of this fact follows from equating both elements, viz.

> `Eq_el(11)_el(22)` := `sin(A(t))[C_H]`[1,1] = `sin(A(t))[C_H]`[2,2] : `Eq_el(11)_el(22)` ;

sin(4*t^2+3*t) = -sin(5*t-2)

Since the two elements are equal, their operands are also equal, which leads to the equation

> `Eq_op(el(11)_el(22))` := sign(`sin(A(t))[C_H]`[1,1]) * op(op(abs(`sin(A(t))[C_H]`[1,1]))) = sign(`sin(A(t))[C_H]`[2,2]) * op(op(abs(`sin(A(t))[C_H]`[2,2]))) : `Eq_op(el(11)_el(22))` ;

4*t^2+3*t = -5*t+2

Solution of this equation, i.e.

> solution := solve({`Eq_op(el(11)_el(22))`}) : solution ;

{t = -1+1/2*sqrt(6)}, {t = -1-1/2*sqrt(6)}

are the same roots, which were obtained as zeros of the denominator of the element at location (2,1) in the matrix sin(A(t))[C_H] . Thus, the two elements are equal when t = t[d_s[1]] or t = t[d_s[2]] .

Method 2 . Using similarity of matrices

Step 1 . Check if the eigenvalues of [ A ( t ) ] are distinct:

See Step 4 above where the eigenvalues are found to be distinct.

Step 2 . Find the sequence of lists containing eigenvalues and eigenvectors of [ A ( t ) ]:

For the reason stated at the same point of Example 1, the eigenvects function cannot be used for [ A ( t ) ]. To get around this problem, construct an auxiliary matrix [ C ] as follows.

> C := matrix(2, 2) :

> C[1,1] := f[11] : C[1,2] := B(t)[1,2] : C[2,1] := f[21] : C[2,2] := f[22] :

> C := matrix(C) : C = matrix(C) ;

C = matrix([[f[11], 0], [f[21], f[22]]])

whose elements f[11] , f[21] , and f[22] denote, respectively,

> f[11] = B(t)[1,1] ; f[21] = B(t)[2,1] ; f[22] = B(t)[2,2] ;

f[11] = 4*t^2+3*t

f[21] = 3*t*sin(2*t)

f[22] = -5*t+2

Determine the eigenvalues of [ C ]:

> charroots(C) := eigenvals(C) : char_roots(C) = charroots(C) ;

char_roots(C) = (f[11], f[22])

whence

> No_roots(C) := nops([charroots(C)]) :

> for i to No_roots(C) do ch_r[i](C) := charroots(C)[i] : print(l[i] = ch_r[i](C)) : od :

lambda[1] = f[11]

lambda[2] = f[22]

The sequence of lists containing eigenvalues and eigenvectors of [ C ] is

> `roots&vectors(C)` := eigenvects(C) : roots_and_vectors(C) = `roots&vectors(C)` ;

roots_and_vectors(C) = ([f[11], 1, {vector([(-f[22]...

Step 3 . Extract the eigenvectors v[lambda[i]] corresponding to eigenvalues lambda[i] of [ C ]:

> for i to No_roots(C) do e[i] := charroots(C)[i] : List[i] := `roots&vectors(C)`[i] : od :

> for j to No_roots(C) do for i to No_roots(C) do if List[i][1] = e[j] then Lst[j] := `roots&vectors(C)`[i] : fi : od : od :

> for i to No_roots(C) do ch_v[i](C) := op(Lst[i][3]) : print(v[lambda[i]] = eval(ch_v[i](C)), ` --> ` * lambda[i] = ch_r[i](C)) : od :

v[lambda[1]] = vector([(-f[22]+f[11])/f[21], 1]), `...

v[lambda[2]] = vector([0, 1]), `  -->  `*lambda[2] ...

Conversion of the eigenvectors v[lambda[i]] into column matrices X[lambda[i]] yields

> for i to No_roots(C) do X[l[i]] := convert(ch_v[i](C), matrix) : print(X[l[i]] = matrix(X[l[i]])) : od :

X[lambda[1]] = matrix([[(-f[22]+f[11])/f[21]], [1]]...

X[lambda[2]] = matrix([[0], [1]])

Step 4 . Construct the 'unique' modal matrix [ M ] associated with [ C ]:

> i := 'i' : `seq(X[l[i]])` := seq(X[l[i]], i=1..coldim(C)) : M := augment(`seq(X[l[i]])`) : M=matrix(M) ;

M = matrix([[(-f[22]+f[11])/f[21], 0], [1, 1]])

Step 5 . Construct the unique Jordan form [ J ] of [ C ]:

> J_C := array(diagonal, 1..rowdim(C), 1..coldim(C)) :

> for i to rowdim(C) do for j to coldim(C) do if j = i then J_C[i,j] := ch_r[i](C) fi : od : od :

> J := matrix(J_C) : J = matrix(J) ;

J = matrix([[f[11], 0], [0, f[22]]])

Step 6 . Compute the function sine of matrix [ J ]:

> J_C := array(diagonal, 1..rowdim(J), 1..coldim(J)) :

> for i to rowdim(J) do for j to coldim(J) do if j = i then J_C[i,j] := sin(J[i,j]) fi : od : od :

> `sin(J)` := matrix(J_C) : sin(J) = matrix(`sin(J)`) ;

sin(J) = matrix([[sin(f[11]), 0], [0, sin(f[22])]])...

Step 7 . Compute sin( [ C ] ) = [ M ] sin( [ J ] ) Inv [ M ] and simplify the elements of the resultant matrix:

> `sin(C)`:=map(normal, evalm(M &* `sin(J)` &* M^(-1))) : sin(C) = matrix(`sin(C)`) ;

sin(C) = matrix([[sin(f[11]), 0], [f[21]*(sin(f[11]...

Substitute the elements f[11] , f[21] , and f[22] with their corresponding function elements of [ A ( t ) ], substitute these elements into the matrix sin( [ C ] ), rename it as sin( [ A ( t ) ] ), and add a distinguishing subscript, say, s_m ( s imilarity of m atrices) to the name of the resultant matrix for future comparative purposes:

> `sin(A(t))[s_m]` := subs(f[11]=B(t)[1,1], f[21]=B(t)[2,1], f[22]=B(t)[2,2], matrix(`sin(C)`)) :

> sin(A(t))[s_m] = matrix(`sin(A(t))[s_m]`) ;

sin(A(t))[s_m] = matrix([[sin(4*t^2+3*t), 0], [3*t*...

The symbolic forms of matrices sin(A(t))[C_H] and sin(A(t))[s_m] are equal.

Method 3 . Using truncated Maclaurin’ s series

Step 1 . Find the eigenvalues of [ A ( t ) ]:

This step was already done earlier to check if the eigenvalues of [ A ( t ) ] are distinct, which was essential for both preceding methods. It is not necessary for this method since the series of