

CHAPTER 3 Linear Algebra

3.4 Solving linear equations using matrix algebra

We now turn our attention to an important application of matrix algebra, which is to solve sets of linear equations.

3.4.1 Representation

Systems of linear equations are conveniently represented by matrices. Consider the set of linear equations:

3x + 2y + z = 5
-8x + y + 4z = -2
9x + 0.5y + 4z = 0.9

We can represent this set of equations by the matrix

[ 3    2    1     5   ]
[ -8   1    4    -2   ]
[ 9    0.5  4     0.9 ]

DRAFT - Version 3-Elementary row operations and Gaussian elimination


where the position of a number in the matrix implicitly identifies it either as a coefficient of a variable or a value on the right-hand side. This representation can be used for any set of linear equations. If every entry in the rightmost column is 0, then the system is said to be homogeneous. The submatrix corresponding to the left-hand side of the linear equations is called the coefficient matrix.
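As a concrete illustration (our own, with hypothetical variable names), the augmented matrix for this system can be stored as a list of rows in Python:

```python
# Augmented matrix for the system:
#   3x + 2y   + z  = 5
#  -8x + y    + 4z = -2
#   9x + 0.5y + 4z = 0.9
# Each row holds the coefficients followed by the right-hand-side value.
augmented = [
    [3.0,  2.0, 1.0,  5.0],
    [-8.0, 1.0, 4.0, -2.0],
    [9.0,  0.5, 4.0,  0.9],
]

# The coefficient matrix is the submatrix without the rightmost column.
coefficients = [row[:-1] for row in augmented]

# The system is homogeneous when every right-hand-side entry is zero.
is_homogeneous = all(row[-1] == 0 for row in augmented)

print(coefficients)
print(is_homogeneous)
```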

3.4.2 Elementary row operations and Gaussian elimination

Given a set of equations, certain simple operations allow us to generate new equations. For example, multiplying the left- and right-hand sides of any equation by a scalar generates a new equation. Moreover, we can add or subtract the left- and right-hand sides of any pair of equations to also generate new equations.

In our example above, the first two equations are 3x + 2y + z = 5 and -8x + y + 4z = -2. We can multiply the first equation by 3 to get the new equation 9x + 6y + 3z = 15. We can also add the two equations to get a new equation (3-8)x + (2+1)y + (1+4)z = (5-2), which gives us the equation -5x + 3y + 5z = 3.

We can also combine these operations. For example, we could multiply the second equation by 2 and subtract it from the first one like this:

(3 - (-16))x + (2 - 2)y + (1 - 8)z = 5 - (-4)
19x - 7z = 9

This results in an equation where the variable y has been eliminated (i.e., does not appear). We can similarly multiply the third equation by 4 and subtract it from the first one to obtain another equation that also eliminates y. We now have two equations in two variables that we can trivially solve to obtain x and z. Putting their values back into any of the three equations allows us to find y.

This approach, in essence, is the well-known technique called Gaussian elimination. In this technique, we pick any one variable and use multiplications and additions on the set of equations to eliminate that variable from all but one equation. This transforms a system with n variables and m equations to a system with n-1 variables and m-1 equations. We can now recurse to obtain, in the end[1], an equation with one variable, which solves the system for that variable. By substituting this value back into the reduced set of equations, we solve the system.

When using a matrix representation of the set of equations, the elementary operations of multiplying an equation by a scalar and of adding two equations correspond to two row operations. The first row operation multiplies all the elements of a row by a scalar and the second row operation is the element-by-element addition of two rows. It is easy to see that these are exactly analogous to the operations in the previous paragraphs. The Gaussian technique uses these elementary row operations to manipulate the matrix representation of a set of linear equations so that one row looks like this: [0 0 ... 0 1 0 ... 0 a].

This allows us to read off the value of that variable. We can use this to substitute for this variable in the other equations, so that we are left with a system of equations with one less unknown, and, by recursion, to find the values of all the variables.

EXAMPLE 7: GAUSSIAN ELIMINATION

Use row operations and Gaussian elimination to solve the system given by

[ 3    2    1     5   ]
[ -8   1    4    -2   ]
[ 9    0.5  4     0.9 ]

Solution:

[1] Assuming that the equations are self-consistent and have at least one solution. More on this below.


Subtract row 3 from row 2 to obtain

[ 3     2     1      5   ]
[ -17   0.5   0     -2.9 ]
[ 9     0.5   4      0.9 ]

Then subtract 0.25 times row 3 from row 1 to obtain

[ 0.75   1.875   0    4.775 ]
[ -17    0.5     0   -2.9   ]
[ 9      0.5     4    0.9   ]

Note that the first two rows represent a pair of equations in two unknowns. Multiply the second row by 1.875/0.5 = 3.75 and subtract it from the first row to obtain

[ 64.5   0     0    15.65 ]
[ -17    0.5   0    -2.9  ]
[ 9      0.5   4     0.9  ]

This allows us to read off x as 15.65/64.5 = 0.2426. Substituting this into row 2, we get -17*0.2426 + 0.5y = -2.9, which we solve to get y = 2.4496. Substituting these values into the third row, we get 9*0.2426 + 0.5*2.4496 + 4z = 0.9, so that z = -0.6271. Checking, 3*0.2426 + 2*2.4496 - 0.6271 = 4.9999, which is within rounding error of 5.

■

In practice, choosing which variable to eliminate first has important consequences. Choosing a variable unwisely may require us to maintain matrix elements to very high degrees of precision, which is costly. There is a considerable body of work on algorithms that carefully choose the variables to eliminate, which are also called the pivots. Standard matrix packages, such as MATLAB, implement these algorithms.
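To make the procedure concrete, here is a minimal Python sketch of Gaussian elimination with partial pivoting (our own illustration, not a production routine; the function name is hypothetical). At each step it picks the largest available entry in a column as the pivot, to limit the loss of precision discussed above, and then back-substitutes:

```python
def gaussian_solve(a, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting.

    A minimal sketch: assumes a square system with a unique solution
    (non-zero pivots after row exchanges).
    """
    n = len(a)
    # Build the augmented matrix so row operations act on both sides.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this variable from every row below the pivot row.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the resulting triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

x, y, z = gaussian_solve([[3, 2, 1], [-8, 1, 4], [9, 0.5, 4]], [5, -2, 0.9])
print(round(x, 4), round(y, 4), round(z, 4))  # close to 0.2426, 2.4496, -0.6271
```

Running this on the example system reproduces, up to rounding, the values obtained by hand in Example 7.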

3.4.3 Rank

So far, we have assumed that a set of linear equations always has a consistent solution. This is not always the case. A set of equations has no solution or an infinite number of solutions if it is over-determined or under-determined, respectively. A system is over-determined if the same variable assumes inconsistent values. For example, a trivial over-determined system is the set of equations: x = 1 and x = 2. Gaussian elimination will fail for such systems.

A system is under-determined if it admits more than one answer. A trivial instance of an under-determined system is the system of linear equations: x + y = 1 because we can choose an infinite number of values of x and y that satisfy this equation.

Gaussian elimination on such a system results in some set of variables expressed as linear combinations of the independent variables. Each assignment of values to the independent variables will result in finding a consistent solution to the system.

Given a system of m linear equations using n variables, the system is under-determined if m < n. If m is at least as large as n, the system may or may not be under-determined, depending on whether some equations are ‘repeated.’ Specifically, we define an equation as being linearly dependent on a set of other equations if it can be expressed as a linear combination of the other equations (the vector corresponding to this equation is a linear combination of the vectors corresponding to the other equations). If one equation in a system of linear equations is linearly dependent on the others, then we can reduce that equation to the equation 0 = 0 by a suitable combination of multiplications and additions. Thus, this equation does not give us any additional information and can be removed from the system without changing the solution.

If, of the m equations in a system, k can be expressed as linear combinations of the other m - k equations, then we really only have m - k equations to work with. This value is called the rank of the system, denoted r. If r < n, then the system is under-determined. If r = n, then there is only one solution to the system. If r > n, then the system is over-determined, and therefore inconsistent. Note that the rank of a matrix is the same as the cardinality of the basis set of the corresponding set of row vectors.

EXAMPLE 8: RANK

We have already seen that the system of equations

[ 3    2    1     5   ]
[ -8   1    4    -2   ]
[ 9    0.5  4     0.9 ]

has a unique assignment of consistent values to the variables x, y, and z. Therefore, it has a rank of 3.


Consider the system

[ 3    2    1     5  ]
[ -8   1    4    -2  ]
[ 6    4    2    10  ]

We see that the third row is just the first row multiplied by 2. Therefore, it adds no additional information to the system and can be removed. The rank of this system is 2 (it is under-determined), and the resultant system has an infinity of solutions.

Now, consider the system

[ 3    2    1     5   ]
[ -8   1    4    -2   ]
[ 9    0.5  4     0.9 ]
[ 3    2    1     4   ]

We know that the first three rows are linearly independent and have a rank of 3. The fourth row is inconsistent with the first row, so the system is over-determined and has no solution. The resulting system has a rank of 4.

■

Many techniques are known to determine the rank of a system of equations. These are, however, beyond the scope of this discussion. For our purpose, it suffices to attempt Gaussian elimination and report a system to be over-determined, that is, have a rank greater than n, if an inconsistency is found, and to be under-determined, that is, with a rank smaller than n, if an infinite number of solutions can be found. If the system is under-determined, the rank is the number of equations that do not reduce to the trivial equation 0 = 0.
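Following this recipe, the classification can be sketched in Python (an illustration under our own naming; the text does not prescribe an implementation). The sketch row-reduces the augmented matrix, counts the rows whose left-hand side does not reduce to 0 = 0, and flags a leftover row of the form 0 = c, with c non-zero, as an inconsistency:

```python
def classify_system(aug, n_vars, eps=1e-9):
    """Classify a system by Gaussian elimination on its augmented matrix.

    A sketch of the procedure described above: returns (rank, status),
    where rank counts the equations whose left-hand side does not reduce
    to 0 = 0.
    """
    m = [list(row) for row in aug]
    rank = 0
    for col in range(n_vars):
        # Find a not-yet-used row with a usable pivot in this column.
        pivot = next((r for r in range(rank, len(m)) if abs(m[r][col]) > eps),
                     None)
        if pivot is None:
            continue  # no remaining equation constrains this variable
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this variable from every other row.
        for r in range(len(m)):
            if r != rank and abs(m[r][col]) > eps:
                factor = m[r][col] / m[rank][col]
                for c in range(col, n_vars + 1):
                    m[r][c] -= factor * m[rank][c]
        rank += 1
    # Leftover rows have an all-zero left-hand side; a non-zero right-hand
    # side there means 0 = c with c != 0, i.e., an inconsistent system.
    if any(abs(row[n_vars]) > eps for row in m[rank:]):
        return rank, "over-determined"
    if rank < n_vars:
        return rank, "under-determined"
    return rank, "unique"

print(classify_system([[3, 2, 1, 5], [-8, 1, 4, -2], [6, 4, 2, 10]], 3))
```

Note one small divergence from the text's bookkeeping: the sketch reports the rank of the consistent part, so the four-equation system of Example 8 is reported as rank 3 and over-determined, where the text counts the inconsistent row and calls it rank 4.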

3.4.4 Determinants

We now turn our attention to the study of a determinant of a matrix. Even the most enthusiastic of mathematicians will admit that the study of determinants is a rather dry topic. Moreover, the determinant of a matrix does not, by itself, have much practical value. Although they can compactly represent the solution of a set of linear equations, actually computing solutions using the determinant is impractical. The real reason to persist in mastering determinants is as a necessary prelude to the deep and elegant area of the eigenvalues of a matrix.

The determinant D = det A of a two-by-two matrix is a scalar defined as follows:

D = det A = | a11  a12 | = a11 a22 - a12 a21        (EQ 9)
            | a21  a22 |

Note the use of vertical lines (instead of brackets) to indicate the determinant of the matrix, rather than the matrix itself. The determinant of a two-by-two matrix is called a determinant of order 2.

To describe the determinant of a larger (square) matrix, we will need the concept of submatrix corresponding to an element ajk. This is the matrix A from which the jth row and the kth column have been deleted and is denoted Sjk(A). The determinant of this submatrix, i.e., det Sjk(A) = |Sjk(A)| is a scalar called the minor of ajk and is denoted Mjk.

Note that the submatrix of a matrix has one fewer row and column. The determinant of an n-by-n matrix has order n. Therefore, each of its minors has an order n-1.

We now define another auxiliary term, which is the co-factor of ajk denoted Cjk. This is defined by

Cjk = (-1)^(j+k) Mjk        (EQ 10)

We are now in a position to define the determinant of a matrix. The determinant of a matrix is defined recursively as follows:


D = Σ(j=1..n) aij Cij = Σ(k=1..n) aki Cki        (EQ 11)

where i is an arbitrary row or column. It can be shown that D does not change no matter which column or row is chosen for expansion. Moreover, it can be shown (see the Exercises) that the determinant of a matrix does not change if the matrix is transposed. That is, |A| = |AT|.

EXAMPLE 9: DETERMINANTS

Compute the determinant of the matrix

[ 2   5  -2 ]
[ 4   9   8 ]
[ 3   0   1 ]

Solution:

We will compute this by expanding the third row, so that we can ignore the middle co-factor corresponding to the element a32 = 0. The determinant is given by

D = a31 C31 + a33 C33 = 3(-1)^(3+1) M31 + 1(-1)^(3+3) M33
  = 3 | 5  -2 | + 1 | 2  5 |
      | 9   8 |     | 4  9 |
  = 3(40 - (-18)) + 1(18 - 20) = 174 + (-2) = 172

As a check, we expand by the center column to obtain

D = a12 C12 + a22 C22 = 5(-1)^(1+2) M12 + 9(-1)^(2+2) M22
  = -5 | 4  8 | + 9 | 2  -2 |
       | 3  1 |     | 3   1 |
  = -5(4 - 24) + 9(2 - (-6)) = 100 + 72 = 172

■
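The recursive definition can be turned into a short Python sketch (our own illustration; it always expands along the first row, although EQ 11 permits expansion along any row or column). It is written for clarity, not efficiency: the recursion does O(n!) work.

```python
def det(a):
    """Determinant by cofactor expansion along the first row (EQ 11)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for k in range(n):
        # Minor of the element in row 1, column k+1: delete that row
        # and column from the matrix.
        minor = [row[:k] + row[k + 1:] for row in a[1:]]
        # Cofactor sign (-1)^(j+k) alternates along the row.
        total += (-1) ** k * a[0][k] * det(minor)
    return total

print(det([[2, 5, -2], [4, 9, 8], [3, 0, 1]]))  # 172, as in Example 9
```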

Here are some useful properties of determinants:

A determinant can be computed by expanding any row or column of a matrix. Therefore, if a matrix has a zero column or row, then its determinant is 0.

Multiplying every element in a row or a column of a matrix by the constant c results in multiplying its determinant by the same factor.

Interchanging two rows or columns of matrix A results in a matrix B such that |B| = -|A|. Therefore, if a matrix has identical rows or columns, its determinant must be zero, because zero is the only number whose negation leaves it unchanged.

A square matrix with n rows and columns has rank n if and only if it has a non-zero determinant.

A square matrix has an inverse (is non-singular) if and only if it has a non-zero determinant.

3.4.5 Cramer’s theorem

Computing the determinant of a matrix allows us to (in theory, at least) trivially solve a system of equations. In practice, computing the determinant is more expensive than Gaussian elimination, so Cramer’s theorem, discussed below, is useful mostly to give us insight into the nature of the solution.

Cramer’s theorem states that if a system of n linear equations in n variables Ax = b has a non-zero coefficient determinant D = det A, then the system has precisely one solution, given by


xi = Di/D

where Di is the determinant of the matrix obtained by substituting b for the ith column of A. Thus, if we know the corresponding determinants, we can directly compute the values xi using this theorem (this is also called Cramer’s rule).

A system is said to be homogeneous if b = 0. In this case, each of the Di values is zero (why?), so that each of the xi values is also 0. If the determinant of the coefficient matrix A, i.e., D, is 0, and the system is homogeneous, then Cramer’s rule assigns each variable the indeterminate quantity 0/0. However, it can be shown that in this case the system does, in fact, have non-zero solutions. This important fact is the point of departure for the computation of the eigenvalues of a matrix.
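Cramer's rule is easy to sketch in Python on top of a cofactor-expansion determinant (an illustration with our own function names; as noted above, this approach is far more expensive than Gaussian elimination and is shown only for insight):

```python
def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** k * a[0][k]
               * det([row[:k] + row[k + 1:] for row in a[1:]])
               for k in range(len(a)))

def cramer_solve(a, b):
    """Solve Ax = b by Cramer's rule: xi = Di / D.

    Di is the determinant of A with column i replaced by b; the rule is
    valid only when D = det A is non-zero.
    """
    d = det(a)
    if d == 0:
        raise ValueError("coefficient determinant is zero")
    x = []
    for i in range(len(a)):
        # Build A with its ith column replaced by b.
        ai = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(a)]
        x.append(det(ai) / d)
    return x

sol = cramer_solve([[3, 2, 1], [-8, 1, 4], [9, 0.5, 4]], [5, -2, 0.9])
print([round(v, 4) for v in sol])  # close to [0.2426, 2.4496, -0.6271]
```

Running it on the system of Example 7 reproduces the solution found there by elimination.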

3.4.6 The inverse of a matrix

The inverse of a square matrix A denoted A-1 is a matrix such that AA-1 = A-1A = I.

EXAMPLE 10: INVERSE

Prove that the inverse of a matrix is unique.

Solution:

If A had an inverse B as well as an inverse C, then AB=BA=AC=CA=I. So, B = BI = B(AC) = (BA)C = IC = C.

■

Not all square matrices are invertible: a matrix that does not have an inverse is called a singular matrix. All singular matrices have a determinant of zero. If a matrix is not singular, its inverse is given by:

A^-1 = (1/|A|) [Cjk]^T = (1/|A|) [ C11  C21  ...  Cn1 ]
                                 [ C12  C22  ...  Cn2 ]
                                 [ ...  ...  ...  ... ]
                                 [ C1n  C2n  ...  Cnn ]        (EQ 12)

where Cjk is the co-factor of ajk. Note that the co-factor matrix is transposed when compared with A. As a special case, the inverse of a two-by-two matrix

A = [ a11  a12 ]
    [ a21  a22 ]

is

A^-1 = (1/|A|) [  a22  -a12 ]        (EQ 13)
               [ -a21   a11 ]

EXAMPLE 11: INVERSE

Compute the inverse of the matrix

[ 2  3 ]
[ 6  2 ]

The determinant of the matrix is 2*2 - 6*3 = -14. We can use Equation 13 to compute the inverse as:

A^-1 = -(1/14) [  2  -3 ]
               [ -6   2 ]

■
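Equation 13 translates directly into code. A minimal Python sketch (function names are our own) that refuses to invert a singular matrix:

```python
def det2(a):
    """2x2 determinant (EQ 9): a11*a22 - a12*a21."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inverse2(a):
    """Invert a 2x2 matrix via EQ 13: swap the diagonal entries, negate
    the off-diagonal entries, and divide by the determinant."""
    d = det2(a)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[ a[1][1] / d, -a[0][1] / d],
            [-a[1][0] / d,  a[0][0] / d]]

inv = inverse2([[2, 3], [6, 2]])
print(inv)  # equals -(1/14) * [[2, -3], [-6, 2]], as in Example 11
```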
