
Determinant

In linear algebra, the determinant is a scalar value that can be computed from the elements of a square matrix and encodes certain properties of the linear
transformation described by the matrix. The determinant of a matrix A is denoted det(A), det A, or |A|. Geometrically, it can be viewed as the volume
scaling factor of the linear transformation described by the matrix. This is also the signed volume of the n-dimensional parallelepiped spanned by the
column or row vectors of the matrix. The determinant is positive or negative according to whether the linear mapping preserves or reverses the orientation
of n-space.

In the case of a 2 × 2 matrix the determinant may be defined as

    det [[a, b], [c, d]] = ad − bc.

Similarly, for a 3 × 3 matrix A with entries a, b, c, d, e, f, g, h, i (read row by row), its determinant is

    det(A) = a · det [[e, f], [h, i]] − b · det [[d, f], [g, i]] + c · det [[d, e], [g, h]].

Each determinant of a 2 × 2 matrix in this equation is called a "minor" of the matrix A. This procedure can be extended to give a recursive definition for the
determinant of an n × n matrix, the minor expansion formula.

Determinants occur throughout mathematics. For example, a matrix is often used to represent the coefficients in a system of linear equations, and the
determinant can be used to solve those equations, although other methods of solution are much more computationally efficient. In linear algebra, a matrix
(with entries in a field) is invertible (also called nonsingular) if and only if its determinant is non-zero, and correspondingly the matrix is singular if and
only if its determinant is zero. This leads to the use of determinants in defining the characteristic polynomial of a matrix, whose roots are the eigenvalues.
In analytic geometry, determinants express the signed n-dimensional volumes of n-dimensional parallelepipeds. This leads to the use of determinants in
calculus, the Jacobian determinant in the change of variables rule for integrals of functions of several variables. Determinants appear frequently in algebraic
identities such as the Vandermonde identity.

Determinants possess many algebraic properties, including that the determinant of a product of matrices is equal to the product of determinants. Special
types of matrices have special determinants; for example, the determinant of an orthogonal matrix is always plus or minus one, and the determinant of a
complex Hermitian matrix is always real.

Contents
Geometric meaning
Definition
2 × 2 matrices
3 × 3 matrices
n × n matrices
Levi-Civita symbol

Properties of the determinant


Schur complement
Multiplicativity and matrix groups
Laplace's formula and the adjugate matrix
Sylvester's determinant theorem
Properties of the determinant in relation to other notions
Relation to eigenvalues and trace
Upper and lower bounds
Cramer's rule
Block matrices
Derivative
Abstract algebraic aspects
Determinant of an endomorphism
Exterior algebra
Transformation on alternating multilinear n-forms
Square matrices over commutative rings and abstract properties
Generalizations and related notions
Infinite matrices
Operators in von Neumann algebras
Related notions for non-commutative rings
Further variants
Calculation
Decomposition methods
Further methods
History
Applications
Linear independence
Orientation of a basis
Volume and Jacobian determinant
Vandermonde determinant (alternant)
Circulants
See also
Notes
References
External links

Geometric meaning
If an n × n real matrix A is written in terms of its column vectors A = [a1  a2  …  an], then A maps the unit n-cube to the n-dimensional parallelotope defined by the vectors a1, a2, …, an, the region

    P = { c1 a1 + c2 a2 + … + cn an : 0 ≤ ci ≤ 1 }.

The determinant gives the signed n-dimensional volume of this parallelotope, det(A) = ±vol(P), and hence describes more generally the n-dimensional
volume scaling factor of the linear transformation produced by A.[1] (The sign shows whether the transformation preserves or reverses orientation.) In
particular, if the determinant is zero, then this parallelotope has volume zero and is not fully n-dimensional, which indicates that the dimension of the image
of A is less than n. This means that A produces a linear transformation which is neither onto nor one-to-one, and so is not invertible.

Definition
There are various equivalent ways to define the determinant of a square matrix A, i.e. one with the same number of rows and columns. Perhaps the simplest
way to express the determinant is by considering the elements in the top row and the respective minors; starting at the left, multiply the element by the
minor, then subtract the product of the next element and its minor, and alternate adding and subtracting such products until all elements in the top row have
been exhausted. For example, for a 4 × 4 matrix A = (ai,j) the result is

    det(A) = a1,1 M1,1 − a1,2 M1,2 + a1,3 M1,3 − a1,4 M1,4,

where Mi,j denotes the 3 × 3 minor obtained by deleting row i and column j of A.
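As an illustration of this recursive minor expansion, here is a minimal Python sketch (the function name and the test matrix are chosen for this example only; the method takes on the order of n! operations and is not how determinants are computed in practice):

    import numpy as np

    def det_by_minors(M):
        """Determinant by cofactor expansion along the top row."""
        M = np.asarray(M, dtype=float)
        n = M.shape[0]
        if n == 1:
            return M[0, 0]
        total = 0.0
        for j in range(n):
            minor = np.delete(np.delete(M, 0, axis=0), j, axis=1)  # drop row 0 and column j
            total += (-1) ** j * M[0, j] * det_by_minors(minor)
        return total

    A = [[2, 1, 0, 3], [0, 1, 4, 1], [5, 2, 1, 0], [1, 3, 2, 1]]
    print(det_by_minors(A), np.linalg.det(A))  # the two values agree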

Another way to define the determinant is expressed in terms of the columns of the matrix.
If we write an n × n matrix A in terms of its column vectors A = [a1  a2  …  an], where the aj are column vectors of size n, then the determinant of A is defined so that

    det [a1 … b·aj + c·v … an] = b · det [a1 … aj … an] + c · det [a1 … v … an],
    det [a1 … aj  aj+1 … an] = − det [a1 … aj+1  aj … an],
    det(I) = 1,

where b and c are scalars, v is any vector of size n and I is the identity matrix of size n. These equations say that the determinant is a linear function of each
column, that interchanging adjacent columns reverses the sign of the determinant, and that the determinant of the identity matrix is 1. These properties
mean that the determinant is an alternating multilinear function of the columns that maps the identity matrix to the underlying unit scalar. These suffice to
uniquely calculate the determinant of any square matrix. Provided the underlying scalars form a field (more generally, a commutative ring with unity), the
definition below shows that such a function exists, and it can be shown to be unique.[2]

Equivalently, the determinant can be expressed as a sum of products of entries of the matrix where each product has n terms and the coefficient of each
product is −1 or 1 or 0 according to a given rule: it is a polynomial expression of the matrix entries. This expression grows rapidly with the size of the
matrix (an n × n matrix contributes n! terms), so it will first be given explicitly for the case of 2 × 2 matrices and 3 × 3 matrices, followed by the rule for
arbitrary size matrices, which subsumes these two cases.

Assume A is a square matrix with n rows and n columns, so that it can be written as A = (ai,j), with entries ai,j for i, j = 1, …, n.

The entries can be numbers or expressions (as happens when the determinant is used to define a characteristic polynomial); the definition of the determinant
depends only on the fact that they can be added and multiplied together in a commutative manner.

The determinant of A is denoted by det(A), or it can be denoted directly in terms of the matrix entries by writing enclosing bars instead of brackets.

2 × 2 matrices
The Leibniz formula for the determinant of a 2 × 2 matrix is

    det [[a, b], [c, d]] = ad − bc.

If the matrix entries are real numbers, the matrix A can be used to represent two linear maps: one that maps
the standard basis vectors to the rows of A, and one that maps them to the columns of A. In either case, the
images of the basis vectors form a parallelogram that represents the image of the unit square under the
mapping. The parallelogram defined by the rows of the above matrix is the one with vertices at (0, 0),
(a, b), (a + c, b + d), and (c, d), as shown in the accompanying diagram.

[Figure: The area of the parallelogram is the absolute value of the determinant of the matrix formed by the vectors representing the parallelogram's sides.]

The absolute value of ad − bc is the area of the parallelogram, and thus represents the scale factor by which
areas are transformed by A. (The parallelogram formed by the columns of A is in general a different
parallelogram, but since the determinant is symmetric with respect to rows and columns, the area will be the same.)

The absolute value of the determinant together with the sign becomes the oriented area of the
parallelogram. The oriented area is the same as the usual area, except that it is negative when the angle
from the first to the second vector defining the parallelogram turns in a clockwise direction (which is
opposite to the direction one would get for the identity matrix).
To show that ad − bc is the signed area, one may consider a matrix containing two vectors a = (a, b) and b = (c, d) representing the parallelogram's sides.
The signed area can be expressed as |a||b| sin θ for the angle θ between the vectors, which is simply base times height, the length of one vector times the
perpendicular component of the other. Due to the sine this already is the signed area, yet it may be expressed more conveniently using the cosine of the
complementary angle to a perpendicular vector, e.g. a⊥ = (−b, a), so that the area is |a⊥||b| cos θ′, which by the pattern of the scalar product is equal to ad − bc:

    a⊥ · b = (−b, a) · (c, d) = ad − bc.

Thus the determinant gives the scaling factor and the orientation induced by the mapping represented by A. When the determinant is equal to one, the linear
mapping defined by the matrix is equi-areal and orientation-preserving.

The object known as the bivector is related to these ideas. In 2D, it can be interpreted as an oriented plane segment formed by imagining two vectors each
with origin (0, 0), and coordinates (a, b) and (c, d). The bivector magnitude (denoted by (a, b) ∧ (c, d)) is the signed area, which is also the determinant
ad − bc.[3]
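As a quick numerical illustration (a hypothetical example, with columns (a, b) and (c, d) chosen arbitrarily for this sketch), the determinant equals ad − bc and changes sign when the two columns are interchanged, reflecting the reversal of orientation:

    import numpy as np

    a, b, c, d = 3.0, 1.0, 1.0, 2.0
    M = np.array([[a, c],
                  [b, d]])           # columns are (a, b) and (c, d)
    print(np.linalg.det(M))           # 5.0 = a*d - b*c, the signed area
    print(np.linalg.det(M[:, ::-1]))  # -5.0: swapped columns reverse the orientation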

3 × 3 matrices
The Laplace formula for the determinant of a 3 × 3 matrix A with entries a, b, c, d, e, f, g, h, i (read row by row) is

    det(A) = a · det [[e, f], [h, i]] − b · det [[d, f], [g, i]] + c · det [[d, e], [g, h]];

this can be expanded out to give

    det(A) = aei + bfg + cdh − ceg − bdi − afh,

which is the Leibniz formula for the determinant of a 3 × 3 matrix.

[Figure: The volume of this parallelepiped is the absolute value of the determinant of the matrix formed by the rows constructed from the vectors r1, r2, and r3.]

The rule of Sarrus is a mnemonic for the 3 × 3 matrix determinant: the sum of the products of three diagonal north-west to south-east lines of matrix elements, minus the sum of the products of three diagonal south-west to north-east lines of elements, when the copies of the first two columns of the matrix are written beside it as in the illustration. This scheme for calculating the determinant of a 3 × 3 matrix does not carry over into higher dimensions.

[Figure: Sarrus' rule: The determinant of the three columns on the left is the sum of the products along the solid diagonals minus the sum of the products along the dashed diagonals.]

n × n matrices
The determinant of a matrix of arbitrary size can be defined by the Leibniz formula or the Laplace formula.

The Leibniz formula for the determinant of an n × n matrix A is

    det(A) = Σ_{σ ∈ Sn} sgn(σ) · a_{1,σ1} a_{2,σ2} ⋯ a_{n,σn}.

Here the sum is computed over all permutations σ of the set {1, 2, …, n}. A permutation is a function that reorders this set of integers. The value in the ith
position after the reordering σ is denoted by σi. For example, for n = 3, the original sequence 1, 2, 3 might be reordered to σ = [2, 3, 1], with σ1 = 2, σ2 = 3,
and σ3 = 1. The set of all such permutations (also known as the symmetric group on n elements) is denoted by Sn. For each permutation σ, sgn(σ) denotes
the signature of σ, a value that is +1 whenever the reordering given by σ can be achieved by successively interchanging two entries an even number of
times, and −1 whenever it can be achieved by an odd number of such interchanges.

In any of the summands, the term

    Π_{i=1}^{n} a_{i,σi} = a_{1,σ1} a_{2,σ2} ⋯ a_{n,σn}

is notation for the product of the entries at positions (i, σi), where i ranges from 1 to n.

For example, the determinant of a 3 × 3 matrix A (n = 3) is

    det(A) = a1,1 a2,2 a3,3 − a1,1 a2,3 a3,2 − a1,2 a2,1 a3,3 + a1,2 a2,3 a3,1 + a1,3 a2,1 a3,2 − a1,3 a2,2 a3,1.
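The Leibniz formula can be transcribed directly into code. The following Python sketch (helper names invented for this illustration) sums over all n! permutations and is therefore only practical for small matrices:

    import numpy as np
    from itertools import permutations

    def sign(perm):
        """Signature of a permutation given as a tuple of 0-based indices."""
        inversions = sum(1 for i in range(len(perm))
                         for j in range(i + 1, len(perm)) if perm[i] > perm[j])
        return -1 if inversions % 2 else 1

    def det_leibniz(A):
        A = np.asarray(A, dtype=float)
        n = A.shape[0]
        return sum(sign(p) * np.prod([A[i, p[i]] for i in range(n)])
                   for p in permutations(range(n)))

    A = [[1, 2, 3], [0, 4, 5], [1, 0, 6]]
    print(det_leibniz(A), np.linalg.det(A))  # both are 22 (up to rounding)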

Levi-Civita symbol
It is sometimes useful to extend the Leibniz formula to a summation in which not only permutations, but all sequences of n indices in the range 1, …, n
occur, ensuring that the contribution of a sequence will be zero unless it denotes a permutation. Thus the totally antisymmetric Levi-Civita symbol ε_{i1,…,in}
extends the signature of a permutation, by setting ε_{σ(1),…,σ(n)} = sgn(σ) for any permutation σ of n, and ε_{i1,…,in} = 0 when no permutation σ exists such
that σ(r) = ir for r = 1, …, n (or equivalently, whenever some pair of indices are equal). The determinant for an n × n matrix can then be expressed
using an n-fold summation as

    det(A) = Σ_{i1,…,in = 1}^{n} ε_{i1,…,in} a_{1,i1} ⋯ a_{n,in},

or using two epsilon symbols as

    det(A) = (1/n!) Σ ε_{i1,…,in} ε_{j1,…,jn} a_{i1,j1} ⋯ a_{in,jn},

where now each ir and each jr should be summed over 1, …, n.

However, through the use of tensor notation and the suppression of the summation symbol (Einstein's summation convention) we can obtain a much more
compact expression of the determinant of a second order system of n dimensions:

    det(A) = (1/n!) e^{i1⋯in} e^{j1⋯jn} a_{i1 j1} ⋯ a_{in jn},

where the e's represent 'e-systems' that take on the values 0, +1 and −1 given the number of permutations of their indices. More specifically, an e-system is
equal to 0 when there is a repeated index; +1 when an even number of permutations of the indices is present; −1 when an odd number of permutations
is present. The number of indices present in the e-systems is equal to n and the expression can be generalized in this manner.[4]

Properties of the determinant


The determinant has many properties. Some basic properties of determinants are:

1. det(In) = 1, where In is the n × n identity matrix.

2. det(Aᵀ) = det(A), where Aᵀ denotes the transpose of A.

3. det(A⁻¹) = 1/det(A) = det(A)⁻¹.

4. For square matrices A and B of equal size, det(AB) = det(A) det(B).

5. det(cA) = cⁿ det(A) for an n × n matrix A and a scalar c.

6. For positive semidefinite matrices A, B, and C of equal size, det(A + B + C) + det(C) ≥ det(A + C) + det(B + C),
with the corollary det(A + B) ≥ det(A) + det(B).[5][6]

7. If A is a triangular matrix, i.e. ai,j = 0 whenever i > j or, alternatively, whenever i < j, then its determinant equals the product of the
diagonal entries:

    det(A) = a1,1 a2,2 ⋯ an,n.

This can be deduced from some of the properties below, but it follows most easily directly from the Leibniz formula (or from the Laplace expansion), in
which the identity permutation is the only one that gives a non-zero contribution.

A number of additional properties relate to the effects on the determinant of changing particular rows or columns:

8. Viewing an n × n matrix as being composed of n columns, the determinant is an n-linear function. This means that if the jth column of a
matrix A is written as a sum v + w of two column vectors, and all other columns are left unchanged, then the determinant of A is
the sum of the determinants of the matrices obtained from A by replacing the jth column by v and then by w
(and a similar relation holds when writing a column as a scalar multiple of a column vector).

9. If in a matrix, any row or column has all elements equal to zero, then the determinant of that matrix is 0.
10. This n-linear function is an alternating form. This means that whenever two columns of a matrix are identical, or more generally some
column can be expressed as a linear combination of the other columns (i.e. the columns of the matrix form a linearly dependent set), its
determinant is 0.
Properties 1, 8 and 10 — which all follow from the Leibniz formula — completely characterize the determinant; in other words the determinant is the
unique function from n × n matrices to scalars that is n-linear alternating in the columns, and takes the value 1 for the identity matrix (this characterization
holds even if scalars are taken in any given commutative ring). To see this it suffices to expand the determinant by multi-linearity in the columns into a
(huge) linear combination of determinants of matrices in which each column is a standard basis vector. These determinants are either 0 (by property 9) or
else ±1 (by properties 1 and 12 below), so the linear combination gives the expression above in terms of the Levi-Civita symbol. While less technical in
appearance, this characterization cannot entirely replace the Leibniz formula in defining the determinant, since without it the existence of an appropriate
function is not clear. For matrices over non-commutative rings, properties 8 and 9 are incompatible for n ≥ 2,[7] so there is no good definition of the
determinant in this setting.

Property 2 above implies that properties for columns have their counterparts in terms of rows:

11. Viewing an n × n matrix as being composed of n rows, the determinant is an n-linear function.
12. This n-linear function is an alternating form: whenever two rows of a matrix are identical, its determinant is 0.
13. Interchanging any pair of columns or rows of a matrix multiplies its determinant by −1. This follows from properties 8 and 10 (it is a
general property of multilinear alternating maps). More generally, any permutation of the rows or columns multiplies the determinant by
the sign of the permutation. By permutation, it is meant viewing each row as a vector Ri (equivalently each column as Ci) and reordering
the rows (or columns) by interchange of Rj and Rk (or Cj and Ck), where j, k are two indices chosen from 1 to n for an n × n square matrix.
14. Adding a scalar multiple of one column to another column does not change the value of the determinant. This is a consequence of
properties 8 and 10 in the following way: by property 8 the determinant changes by a multiple of the determinant of a matrix with two
equal columns, which determinant is 0 by property 10. Similarly, adding a scalar multiple of one row to another row leaves the
determinant unchanged.
Property 5 says that the determinant on n × n matrices is homogeneous of degree n. These properties can be used to facilitate the computation of
determinants by simplifying the matrix to the point where the determinant can be determined immediately. Specifically, for matrices with coefficients in a
field, properties 13 and 14 can be used to transform any matrix into a triangular matrix, whose determinant is given by property 7; this is essentially the
method of Gaussian elimination. For example, the determinant of

    A = [[−2, 2, −3], [−1, 1, 3], [2, 0, −1]]

can be computed using the following matrices:

    B = [[−2, 2, −3], [0, 0, 4.5], [2, 0, −1]],
    C = [[−2, 2, −3], [0, 0, 4.5], [0, 2, −4]],
    D = [[−2, 2, −3], [0, 2, −4], [0, 0, 4.5]].

Here, B is obtained from A by adding −1/2 × the first row to the second, so that det(A) = det(B). C is obtained from B by adding the first to the third row, so
that det(C) = det(B). Finally, D is obtained from C by exchanging the second and third row, so that det(D) = −det(C). The determinant of the (upper)
triangular matrix D is the product of its entries on the main diagonal: (−2) · 2 · 4.5 = −18. Therefore, det(A) = −det(D) = +18.
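The same elimination procedure is easy to sketch in Python (an illustrative implementation, not an excerpt from any library): reduce the matrix to upper triangular form, flip the sign for every row exchange, and multiply the diagonal entries.

    import numpy as np

    def det_gauss(A):
        """Determinant via Gaussian elimination with partial pivoting."""
        U = np.array(A, dtype=float)
        n = U.shape[0]
        sign = 1.0
        for k in range(n):
            p = k + np.argmax(np.abs(U[k:, k]))       # choose a pivot row
            if U[p, k] == 0.0:
                return 0.0                            # singular matrix
            if p != k:
                U[[k, p]] = U[[p, k]]                 # row swap flips the sign
                sign = -sign
            U[k+1:] -= np.outer(U[k+1:, k] / U[k, k], U[k])  # eliminate below the pivot
        return sign * np.prod(np.diag(U))

    A = [[-2, 2, -3], [-1, 1, 3], [2, 0, -1]]
    print(det_gauss(A), np.linalg.det(A))             # both are 18 (up to rounding)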

Schur complement
The following identity holds for a Schur complement of a square matrix M = [[A, B], [C, D]] (with D invertible):

    det(M) = det(D) · det(A − B D⁻¹ C).

The Schur complement arises as the result of performing a block Gaussian elimination by multiplying the matrix M from the right with a block lower
triangular matrix

    L = [[Ip, 0], [−D⁻¹C, Iq]].

Here Ip denotes a p × p identity matrix. After multiplication with the matrix L the Schur complement A − B D⁻¹ C appears in the upper p × p block. The product matrix is

    M L = [[A − B D⁻¹ C, B], [0, D]].

That is, we have effected a Gaussian decomposition

    M = [[Ip, B D⁻¹], [0, Iq]] · [[A − B D⁻¹ C, 0], [0, D]] · [[Ip, 0], [D⁻¹C, Iq]].

The first and last matrices on the RHS have determinant unity, so we have

    det(M) = det(A − B D⁻¹ C) · det(D).

This is Schur's determinant identity.
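A quick numerical sanity check of the identity (an illustrative sketch with arbitrary block sizes; D is shifted to keep it safely invertible):

    import numpy as np

    rng = np.random.default_rng(0)
    p, q = 3, 2
    A = rng.standard_normal((p, p))
    B = rng.standard_normal((p, q))
    C = rng.standard_normal((q, p))
    D = rng.standard_normal((q, q)) + 5 * np.eye(q)

    M = np.block([[A, B], [C, D]])
    schur = A - B @ np.linalg.inv(D) @ C            # Schur complement of D in M
    print(np.linalg.det(M))
    print(np.linalg.det(D) * np.linalg.det(schur))  # the same value up to rounding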

Multiplicativity and matrix groups


The determinant of a matrix product of square matrices equals the product of their determinants:

    det(AB) = det(A) det(B).

Thus the determinant is a multiplicative map. This property is a consequence of the characterization given above of the determinant as the unique n-linear
alternating function of the columns with value 1 on the identity matrix, since the function Mn(K) → K that maps M ↦ det(AM) can easily be seen to be n-linear and alternating in the columns of M, and takes the value det(A) at the identity. The formula can be generalized to (square) products of rectangular
matrices, giving the Cauchy–Binet formula, which also provides an independent proof of the multiplicative property.

The determinant det(A) of a matrix A is non-zero if and only if A is invertible or, yet another equivalent statement, if its rank equals the size of the matrix. If
so, the determinant of the inverse matrix is given by

    det(A⁻¹) = 1/det(A).

In particular, products and inverses of matrices with determinant one still have this property. Thus, the set of such matrices (of fixed size n) forms a group
known as the special linear group. More generally, the word "special" indicates the subgroup of another matrix group consisting of matrices of determinant one.
Examples include the special orthogonal group (which if n is 2 or 3 consists of all rotation matrices), and the special unitary group.

Laplace's formula and the adjugate matrix


Laplace's formula expresses the determinant of a matrix in terms of its minors. The minor Mi,j is defined to be the determinant of the (n−1) × (n−1) matrix
that results from A by removing the i-th row and the j-th column. The expression (−1)^(i+j) Mi,j is known as a cofactor. The determinant of A is given by

    det(A) = Σ_{j=1}^{n} (−1)^(i+j) ai,j Mi,j   (for a fixed i)
           = Σ_{i=1}^{n} (−1)^(i+j) ai,j Mi,j   (for a fixed j).

Calculating det(A) by means of this formula is referred to as expanding the determinant along a row, the i-th row using the first form with fixed i, or
expanding along a column, using the second form with fixed j. For example, the Laplace expansion of the 3 × 3 matrix

    A = [[a, b, c], [d, e, f], [g, h, i]]

along the second column (j = 2 and the sum runs over i) is given by

    det(A) = −b · det [[d, f], [g, i]] + e · det [[a, c], [g, i]] − h · det [[a, c], [d, f]].

However, Laplace expansion is efficient for small matrices only.

The adjugate matrix adj(A) is the transpose of the matrix consisting of the cofactors, i.e.,

    (adj(A))i,j = (−1)^(i+j) Mj,i.

In terms of the adjugate matrix, Laplace's expansion can be written as[8]

    det(A) · I = A adj(A) = adj(A) A.

Sylvester's determinant theorem


Sylvester's determinant theorem states that for A, an m × n matrix, and B, an n × m matrix (so that A and B have dimensions allowing them to be multiplied
in either order forming a square matrix):

    det(Im + AB) = det(In + BA),

where Im and In are the m × m and n × n identity matrices, respectively.
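The identity is easy to test numerically (a small illustrative sketch with arbitrary rectangular factors):

    import numpy as np

    rng = np.random.default_rng(1)
    m, n = 4, 2
    A = rng.standard_normal((m, n))
    B = rng.standard_normal((n, m))
    print(np.linalg.det(np.eye(m) + A @ B))   # det(I_m + AB)
    print(np.linalg.det(np.eye(n) + B @ A))   # det(I_n + BA), equal up to rounding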

From this general result several consequences follow.

a. For the case of column vector c and row vector r, each with m components, the formula allows quick calculation of the determinant of a
matrix that differs from the identity matrix by a matrix of rank 1:

    det(Im + c r) = 1 + r c.

b. More generally,[9] for any invertible m × m matrix X,

    det(X + AB) = det(X) det(In + B X⁻¹ A).

c. For a column and row vector as above:

    det(X + c r) = det(X) (1 + r X⁻¹ c) = det(X) + r adj(X) c.

Properties of the determinant in relation to other notions

Relation to eigenvalues and trace


Let A be an arbitrary n × n matrix of complex numbers with eigenvalues λ1, λ2, …, λn. (Here it is understood that an eigenvalue with algebraic
multiplicity μ occurs μ times in this list.) Then the determinant of A is the product of all eigenvalues,

    det(A) = λ1 λ2 ⋯ λn.

The product of all non-zero eigenvalues is referred to as the pseudo-determinant.

Conversely, determinants can be used to find the eigenvalues of the matrix A: they are the solutions of the characteristic equation

    det(A − λI) = 0,

where I is the identity matrix of the same dimension as A and λ is a (scalar) number which solves the equation (there are no more than n solutions, where n
is the dimension of A).

A Hermitian matrix is positive definite if all its eigenvalues are positive. Sylvester's criterion asserts that this is equivalent to the determinants of the
submatrices Ak, the upper-left k × k corner of A, being positive, for all k between 1 and n.

The trace tr(A) is by definition the sum of the diagonal entries of A and also equals the sum of the eigenvalues. Thus, for complex matrices A,

    det(exp(A)) = exp(tr(A)),

or, for real matrices A,

    tr(A) = log(det(exp(A))).

Here exp(A) denotes the matrix exponential of A, because every eigenvalue λ of A corresponds to the eigenvalue exp(λ) of exp(A). In particular, given any
logarithm of A, that is, any matrix L satisfying

    exp(L) = A,

the determinant of A is given by

    det(A) = exp(tr(L)).

For example, for n = 2, n = 3, and n = 4, respectively,

    det(A) = (1/2)[(tr A)² − tr(A²)],
    det(A) = (1/6)[(tr A)³ − 3 tr(A) tr(A²) + 2 tr(A³)],
    det(A) = (1/24)[(tr A)⁴ − 6 (tr A)² tr(A²) + 3 (tr(A²))² + 8 tr(A) tr(A³) − 6 tr(A⁴)],

cf. Cayley–Hamilton theorem. Such expressions are deducible from combinatorial arguments, Newton's identities, or the Faddeev–LeVerrier algorithm.
That is, for generic n, det A = (−1)ⁿ c0, the signed constant term of the characteristic polynomial, determined recursively from

    cn = 1;   cn−m = −(1/m) Σ_{k=1}^{m} cn−m+k tr(Aᵏ)   (1 ≤ m ≤ n).[10]

In the general case, this may also be obtained from

    det(A) = Σ Π_{l=1}^{n} ((−1)^(l+1)/l)^(kl) (1/kl!) tr(Aˡ)^(kl),

where the sum is taken over the set of all integers kl ≥ 0 satisfying the equation

    Σ_{l=1}^{n} l · kl = n.

The formula can be expressed in terms of the complete exponential Bell polynomial of n arguments sl = −(l − 1)! tr(Aˡ) as

    det(A) = ((−1)ⁿ/n!) Bn(s1, s2, …, sn).

This formula can also be used to find the determinant of a matrix AIJ with multidimensional indices I = (i1, i2, …, ir) and J = (j1, j2, …, jr). The product
and trace of such matrices are defined in a natural way as

An important arbitrary dimension n identity can be obtained from the Mercator series expansion of the logarithm when the expansion converges. If every
eigenvalue of A is less than 1 in absolute value,

where I is the identity matrix. More generally, if

is expanded as a formal power series in s, then all coefficients of sᵐ for m > n are zero and the remaining polynomial is det(I + sA).
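The relation det(exp(A)) = exp(tr(A)) from above can also be verified numerically; the sketch below uses SciPy's matrix exponential (illustrative only):

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(2)
    A = rng.standard_normal((4, 4))
    print(np.linalg.det(expm(A)))   # det(exp(A))
    print(np.exp(np.trace(A)))      # exp(tr(A)), the same value up to rounding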

Upper and lower bounds


For a positive definite matrix A, the trace operator gives the following tight lower and upper bounds on the log determinant

    tr(I − A⁻¹) ≤ log det(A) ≤ tr(A − I),

with equality if and only if A = I. This relationship can be derived via the formula for the KL-divergence between two multivariate normal distributions.

Also,

    n/tr(A⁻¹) ≤ det(A)^(1/n) ≤ tr(A)/n ≤ √(tr(A²)/n).

These inequalities can be proved by bringing the matrix A to the diagonal form. As such, they represent the well-known fact that the harmonic mean is less
than the geometric mean, which is less than the arithmetic mean, which is, in turn, less than the root mean square.

Cramer's rule
For a matrix equation

    Ax = b,

given that A has a nonzero determinant, the solution is given by Cramer's rule:

    xi = det(Ai)/det(A),   i = 1, 2, …, n,

where Ai is the matrix formed by replacing the ith column of A by the column vector b. This follows immediately by column expansion of the determinant,
i.e.

    det(Ai) = det [a1, …, b, …, an] = Σ_{j=1}^{n} xj det [a1, …, ai−1, aj, ai+1, …, an] = xi det(A),

where the vectors aj are the columns of A. The rule is also implied by the identity

    A adj(A) = adj(A) A = det(A) In.

It has recently been shown that Cramer's rule can be implemented in O(n³) time,[11] which is comparable to more common methods of solving systems of
linear equations, such as LU, QR, or singular value decomposition.
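A direct implementation of Cramer's rule is short (an illustrative Python sketch; LU-based solvers are preferred in practice):

    import numpy as np

    def solve_cramer(A, b):
        """Solve Ax = b by Cramer's rule; requires det(A) != 0."""
        A = np.asarray(A, dtype=float)
        b = np.asarray(b, dtype=float)
        d = np.linalg.det(A)
        x = np.empty_like(b)
        for i in range(A.shape[0]):
            Ai = A.copy()
            Ai[:, i] = b                      # replace the i-th column by b
            x[i] = np.linalg.det(Ai) / d
        return x

    A = np.array([[2.0, 1.0], [5.0, 7.0]])
    b = np.array([11.0, 13.0])
    print(solve_cramer(A, b))                 # [ 7.111..., -3.222...]
    print(np.linalg.solve(A, b))              # same solution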

Block matrices
Suppose A, B, C, and D are matrices of dimension n × n, n × m, m × n, and m × m, respectively. Then

    det [[A, 0], [C, D]] = det(A) det(D) = det [[A, B], [0, D]].

This can be seen from the Leibniz formula, or from a decomposition like (for the former case)

When A is invertible, one has

    det [[A, B], [C, D]] = det(A) det(D − C A⁻¹ B),

as can be seen by employing the decomposition

    [[A, B], [C, D]] = [[A, 0], [C, Im]] · [[In, A⁻¹B], [0, D − C A⁻¹ B]].

When D is invertible, a similar identity with det(D) factored out can be derived analogously,[12] that is,

    det [[A, B], [C, D]] = det(D) det(A − B D⁻¹ C).

When the blocks are square matrices of the same order further formulas hold. For example, if C and D commute (i.e., CD = DC), then the following
formula comparable to the determinant of a 2 × 2 matrix holds:[13]

    det [[A, B], [C, D]] = det(AD − BC).

Generally, if all pairs of n × n matrices of the np × np block matrix commute, then the determinant of the block matrix is equal to the determinant of the
matrix obtained by computing the determinant of the block matrix considering its entries as the entries of a p × p matrix.[14] As the previous formula shows,
for p = 2, this criterion is sufficient, but not necessary.
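Both block formulas are easy to test numerically. In the sketch below (illustrative only) the blocks are random and D is taken to be the identity, so that it trivially commutes with C:

    import numpy as np

    rng = np.random.default_rng(3)
    n = 3
    A, B, C = (rng.standard_normal((n, n)) for _ in range(3))
    D = np.eye(n)                                    # commutes with every C

    M = np.block([[A, B], [C, D]])
    print(np.linalg.det(M))
    print(np.linalg.det(A @ D - B @ C))              # det(AD - BC), same up to rounding

    T = np.block([[A, B], [np.zeros((n, n)), D]])    # block upper triangular
    print(np.linalg.det(T), np.linalg.det(A) * np.linalg.det(D))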

When A = D and B = C, the blocks are square matrices of the same order and the following formula holds (even if A and B do not commute):

    det [[A, B], [B, A]] = det(A − B) det(A + B).

When D is a 1 × 1 matrix, B is a column vector, and C is a row vector then

    det [[A, B], [C, D]] = (D − C A⁻¹ B) det(A) = D det(A) − C adj(A) B.

Let λ be a scalar complex number. If a block matrix is square, its characteristic polynomial can be factored with

Derivative
It can be seen, e.g. using the Leibniz formula, that the determinant of real (or analogously for complex) square matrices is a polynomial function from
Rⁿ ˣ ⁿ to R, and so it is everywhere differentiable. Its derivative can be expressed using Jacobi's formula:[15]

    d det(A)/dα = tr(adj(A) · dA/dα),

where adj(A) denotes the adjugate of A. In particular, if A is invertible, we have

    d det(A)/dα = det(A) tr(A⁻¹ · dA/dα).

Expressed in terms of the entries of A, these are

    ∂ det(A)/∂Ai,j = adj(A)j,i = det(A) (A⁻¹)j,i.

Yet another equivalent formulation is

    det(A + εX) − det(A) = tr(adj(A) X) ε + O(ε²) = det(A) tr(A⁻¹X) ε + O(ε²),

using big O notation. The special case where A = I, the identity matrix, yields

    det(I + εX) = 1 + tr(X) ε + O(ε²).

This identity is used in describing the tangent space of certain matrix Lie groups.

If the matrix A is written as A = [a  b  c] where a, b, c are column vectors of length 3, then the gradient over one of the three vectors may be written
as the cross product of the other two:

    ∇a det(A) = b × c,   ∇b det(A) = c × a,   ∇c det(A) = a × b.
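Jacobi's formula can be checked with a small finite-difference experiment (an illustrative sketch; the adjugate is computed here as det(A)·A⁻¹, which assumes A is invertible):

    import numpy as np

    rng = np.random.default_rng(4)
    A = rng.standard_normal((4, 4))
    X = rng.standard_normal((4, 4))
    eps = 1e-6

    adjA = np.linalg.det(A) * np.linalg.inv(A)                        # adjugate of A
    numeric = (np.linalg.det(A + eps * X) - np.linalg.det(A)) / eps   # directional derivative
    print(numeric, np.trace(adjA @ X))                                # agree closely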

Abstract algebraic aspects

Determinant of an endomorphism
The above identities concerning the determinant of products and inverses of matrices imply that similar matrices have the same determinant: two matrices A
and B are similar, if there exists an invertible matrix X such that A = X⁻¹BX. Indeed, repeatedly applying the above identities yields

    det(A) = det(X)⁻¹ det(B) det(X) = det(B).

The determinant is therefore also called a similarity invariant. The determinant of a linear transformation

    T : V → V

for some finite-dimensional vector space V is defined to be the determinant of the matrix describing it, with respect to an arbitrary choice of basis in V. By
the similarity invariance, this determinant is independent of the choice of the basis for V and therefore only depends on the endomorphism T.

Exterior algebra
The determinant of a linear transformation A : V → V of an n-dimensional vector space V can be formulated in a coordinate-free manner by considering the
nth exterior power ΛⁿV of V. A induces a linear map

    ΛⁿA : ΛⁿV → ΛⁿV,   v1 ∧ v2 ∧ … ∧ vn ↦ Av1 ∧ Av2 ∧ … ∧ Avn.

As ΛⁿV is one-dimensional, the map ΛⁿA is given by multiplying with some scalar. This scalar coincides with the determinant of A, that is to say

    ΛⁿA = det(A) · id.

This definition agrees with the more concrete coordinate-dependent definition. This follows from the characterization of the determinant given above. For
example, switching two columns changes the sign of the determinant; likewise, permuting the vectors in the exterior product v1 ∧ v2 ∧ v3 ∧ … ∧ vn to
v2 ∧ v1 ∧ v3 ∧ … ∧ vn, say, also changes its sign.

For this reason, the highest non-zero exterior power Λⁿ(V) is sometimes also called the determinant of V and similarly for more involved objects such as
vector bundles or chain complexes of vector spaces. Minors of a matrix can also be cast in this setting, by considering lower alternating forms ΛᵏV with k < n.

Transformation on alternating multilinear n-forms


The vector space W of all alternating multilinear n-forms on an n-dimensional vector space V has dimension one. To each linear transformation T on V we
associate a linear transformation T′ on W, where for each w in W we define (T′w)(x1, …, xn) = w(Tx1, …, Txn). As a linear transformation on a one-dimensional space, T′ is equivalent to a scalar multiple. We call this scalar the determinant of T.

Square matrices over commutative rings and abstract properties


The determinant can also be characterized as the unique function

    D : Mn(K) → K

from the set of all n × n matrices with entries in a field K to this field satisfying the following three properties: first, D is an n-linear function: considering
all but one column of A fixed, the determinant is linear in the remaining column, that is

    D(v1, …, a vi + b w, …, vn) = a D(v1, …, vi, …, vn) + b D(v1, …, w, …, vn)

for any column vectors v1, ..., vn, and w and any scalars (elements of K) a and b. Second, D is an alternating function: for any matrix A with two identical
columns, D(A) = 0. Finally, D(In) = 1, where In is the identity matrix.

This fact also implies that every other n-linear alternating function F: Mn(K) → K satisfies

    F(M) = F(In) · det(M).

This definition can also be extended where K is a commutative ring R, in which case a matrix is invertible if and only if its determinant is an invertible
element in R. For example, a matrix A with entries in Z, the integers, is invertible (in the sense that there exists an inverse matrix with integer entries) if the
determinant is +1 or −1. Such a matrix is called unimodular.

The determinant defines a mapping

    det : GLn(R) → R×

between the group of invertible n × n matrices with entries in R and the multiplicative group of units in R. Since it respects the multiplication in both
groups, this map is a group homomorphism. Secondly, given a ring homomorphism f: R → S, there is a map GLn(f): GLn(R) → GLn(S) given by replacing
all entries in R by their images under f. The determinant respects these maps, i.e., given a matrix A = (ai,j) with entries in R, the identity

    f(det((ai,j))) = det((f(ai,j)))

holds. In other words, the following diagram commutes:

For example, the determinant of the complex conjugate of a complex matrix (which is also the determinant of its conjugate transpose) is the complex
conjugate of its determinant, and for integer matrices: the reduction modulo m of the determinant of such a matrix is equal to the determinant of the matrix
reduced modulo m (the latter determinant being computed using modular arithmetic). In the language of category theory, the determinant is a natural
transformation between the two functors GLn and (⋅)× (see also Natural transformation#Determinant).[16] Adding yet another layer of abstraction, this is
captured by saying that the determinant is a morphism of algebraic groups, from the general linear group to the multiplicative group, det : GLn → Gm.
Generalizations and related notions

Infinite matrices
For matrices with an infinite number of rows and columns, the above definitions of the determinant do not carry over directly. For example, in the Leibniz
formula, an infinite sum (all of whose terms are infinite products) would have to be calculated. Functional analysis provides different extensions of the
determinant for such infinite-dimensional situations, which however only work for particular kinds of operators.

The Fredholm determinant defines the determinant for operators known as trace class operators by an appropriate generalization of the formula

    det(I + A) = exp(tr(log(I + A))).

Another infinite-dimensional notion of determinant is the functional determinant.

Operators in von Neumann algebras


For operators in a finite factor, one may define a positive real-valued determinant called the Fuglede−Kadison determinant using the canonical trace. In
fact, corresponding to every tracial state on a von Neumann algebra there is a notion of Fuglede−Kadison determinant.

Related notions for non-commutative rings


For square matrices with entries in a non-commutative ring, there are various difficulties in defining determinants analogously to that for commutative
rings. A meaning can be given to the Leibniz formula provided that the order for the product is specified, and similarly for other ways to define the
determinant, but non-commutativity then leads to the loss of many fundamental properties of the determinant, for instance the multiplicative property or the
fact that the determinant is unchanged under transposition of the matrix. Over non-commutative rings, there is no reasonable notion of a multilinear form
(existence of a nonzero bilinear form with a regular element of R as value on some pair of arguments implies that R is commutative). Nevertheless, various
notions of non-commutative determinant have been formulated, which preserve some of the properties of determinants, notably quasideterminants and the
Dieudonné determinant. It may be noted that if one considers certain specific classes of matrices with non-commutative elements, then there are examples
where one can define the determinant and prove linear algebra theorems that are very similar to their commutative analogs. Examples include quantum
groups and q-determinant, Capelli matrix and Capelli determinant, super-matrices and Berezinian; Manin matrices form the class of matrices closest to
matrices with commutative elements.

Further variants
Determinants of matrices in superrings (that is, Z2-graded rings) are known as Berezinians or superdeterminants.[17]

The permanent of a matrix is defined as the determinant, except that the factors sgn(σ) occurring in Leibniz's rule are omitted. The immanant generalizes
both by introducing a character of the symmetric group Sn in Leibniz's rule.

Calculation
Determinants are mainly used as a theoretical tool. They are rarely calculated explicitly in numerical linear algebra, where for applications like checking
invertibility and finding eigenvalues the determinant has largely been supplanted by other techniques.[18] Computational geometry, however, does
frequently use calculations related to determinants.[19]

Naive methods of implementing an algorithm to compute the determinant include using the Leibniz formula or Laplace's formula. Both these approaches
are extremely inefficient for large matrices, though, since the number of required operations grows very quickly: it is of order n! (n factorial) for an n × n
matrix M. For example, Leibniz's formula requires calculating n! products. Therefore, more involved techniques have been developed for calculating
determinants.

Decomposition methods
Given a matrix A, some methods compute its determinant by writing A as a product of matrices whose determinants can be more easily computed. Such
techniques are referred to as decomposition methods. Examples include the LU decomposition, the QR decomposition or the Cholesky decomposition (for
positive definite matrices). These methods are of order O(n³), which is a significant improvement over O(n!).

The LU decomposition expresses A in terms of a lower triangular matrix L, an upper triangular matrix U and a permutation matrix P:

    A = PLU.

The determinants of L and U can be quickly calculated, since they are the products of the respective diagonal entries. The determinant of P is just the sign
of the corresponding permutation (which is +1 for an even permutation and −1 for an odd permutation). The determinant of A is then

    det(A) = det(P) · det(L) · det(U).

(See determinant identities.) Moreover, the decomposition can be chosen such that L is a unitriangular matrix and therefore has determinant 1, in which
case the formula further simplifies to

    det(A) = det(P) · det(U).
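A short sketch with SciPy's LU factorization (illustrative; scipy.linalg.lu returns P, L, U with A = P @ L @ U and L unit lower triangular, so det(L) = 1):

    import numpy as np
    from scipy.linalg import lu

    A = np.array([[-2.0, 2.0, -3.0],
                  [-1.0, 1.0,  3.0],
                  [ 2.0, 0.0, -1.0]])

    P, L, U = lu(A)                          # A = P @ L @ U
    det_P = np.linalg.det(P)                 # +1 or -1, the sign of the permutation
    print(det_P * np.prod(np.diag(U)))       # det(A) = det(P) * det(U)
    print(np.linalg.det(A))                  # both are 18 (up to rounding)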

Further methods
If the determinant of A and the inverse of A have already been computed, the matrix determinant lemma allows rapid calculation of the determinant of
A + uvᵀ, where u and v are column vectors:

    det(A + uvᵀ) = (1 + vᵀA⁻¹u) det(A).

Since the definition of the determinant does not need divisions, a question arises: do fast algorithms exist that do not need divisions? This is especially
interesting for matrices over rings. Indeed, algorithms with run-time proportional to n4 exist. An algorithm of Mahajan and Vinay, and Berkowitz[20] is
based on closed ordered walks (short clow). It computes more products than the determinant definition requires, but some of these products cancel and the
sum of these products can be computed more efficiently. The final algorithm looks very much like an iterated product of triangular matrices.

If two matrices of order n can be multiplied in time M(n), where M(n) ≥ nᵃ for some a > 2, then the determinant can be computed in time O(M(n)).[21] This
means, for example, that an O(n^2.376) algorithm exists based on the Coppersmith–Winograd algorithm.

Charles Dodgson (i.e. Lewis Carroll of Alice's Adventures in Wonderland fame) invented a method for computing determinants called Dodgson
condensation. Unfortunately this interesting method does not always work in its original form.

Algorithms can also be assessed according to their bit complexity, i.e., how many bits of accuracy are needed to store intermediate values occurring in the
computation. For example, the Gaussian elimination (or LU decomposition) method is of order O(n3), but the bit length of intermediate values can become
exponentially long.[22] The Bareiss algorithm, on the other hand, an exact-division method based on Sylvester's identity, is also of order n³, but the bit
complexity is roughly the bit size of the original entries in the matrix times n.[23]

History
Historically, determinants were used long before matrices: originally, a determinant was defined as a property of a system of linear equations. The
determinant "determines" whether the system has a unique solution (which occurs precisely if the determinant is non-zero). In this sense, determinants were
first used in the Chinese mathematics textbook The Nine Chapters on the Mathematical Art (九章算術), written by Chinese scholars around the 3rd century BCE. In
Europe, 2 × 2 determinants were considered byCardano at the end of the 16th century and larger ones by Leibniz.[24][25][26][27]

In Japan, Seki Takakazu (関 孝和) is credited with the discovery of the resultant and the determinant (at first in 1683, the complete version no later than
1710). In Europe, Cramer (1750) added to the theory, treating the subject in relation to sets of equations. The recurrence law was first announced by Bézout
(1764).

It was Vandermonde (1771) who first recognized determinants as independent functions.[24] Laplace (1772)[28][29] gave the general method of expanding a
determinant in terms of its complementary minors: Vandermonde had already given a special case. Immediately following, Lagrange (1773) treated
determinants of the second and third order and applied it to questions ofelimination theory; he proved many special cases of general identities.

Gauss (1801) made the next advance. Like Lagrange, he made much use of determinants in the theory of numbers. He introduced the word determinant
(Laplace had used resultant), though not in the present signification, but rather as applied to the discriminant of a quantic. Gauss also arrived at the notion
of reciprocal (inverse) determinants, and came very near the multiplication theorem.

The next contributor of importance is Binet (1811, 1812), who formally stated the theorem relating to the product of two matrices of m columns and n rows,
which for the special case of m = n reduces to the multiplication theorem. On the same day (November 30, 1812) that Binet presented his paper to the
Academy, Cauchy also presented one on the subject. (See Cauchy–Binet formula.) In this he used the word determinant in its present sense,[30][31]
summarized and simplified what was then known on the subject, improved the notation, and gave the multiplication theorem with a proof more satisfactory
than Binet's.[24][32] With him begins the theory in its generality.

The next important figure was Jacobi[25] (from 1827). He early used the functional determinant which Sylvester later called the Jacobian, and in his
memoirs in Crelle's Journal for 1841 he specially treats this subject, as well as the class of alternating functions which Sylvester has called alternants.
About the time of Jacobi's last memoirs,Sylvester (1839) and Cayley began their work.[33][34]
The study of special forms of determinants has been the natural result of the completion of the general theory. Axisymmetric determinants have been
studied by Lebesgue, Hesse, and Sylvester; persymmetric determinants by Sylvester and Hankel; circulants by Catalan, Spottiswoode, Glaisher, and Scott;
skew determinants and Pfaffians, in connection with the theory of orthogonal transformation, by Cayley; continuants by Sylvester; Wronskians (so called
by Muir) by Christoffel and Frobenius; compound determinants by Sylvester, Reiss, and Picquet; Jacobians and Hessians by Sylvester; and symmetric
gauche determinants by Trudi. Of the textbooks on the subject Spottiswoode's was the first. In America, Hanus (1886), Weld (1893), and Muir/Metzler
(1933) published treatises.

Applications

Linear independence
As mentioned above, the determinant of a matrix (with real or complex entries, say) is zero if and only if the column vectors (or the row vectors) of the
matrix are linearly dependent. Thus, determinants can be used to characterize linearly dependent vectors. For example, given two linearly independent
vectors v1, v2 in R3, a third vector v3 lies in the plane spanned by the former two vectors exactly if the determinant of the 3 × 3 matrix consisting of the
three vectors is zero. The same idea is also used in the theory of differential equations: given n functions f1(x), …, fn(x) (supposed to be n − 1 times
differentiable), the Wronskian is defined to be the determinant of the n × n matrix whose (i, j) entry is the (i − 1)st derivative fj^(i−1)(x):

    W(f1, …, fn)(x) = det [ fj^(i−1)(x) ]_{1 ≤ i, j ≤ n}.

It is non-zero (for some x) in a specified interval if and only if the given functions and all their derivatives up to order n−1 are linearly independent. If it can
be shown that the Wronskian is zero everywhere on an interval then, in the case of analytic functions, this implies the given functions are linearly
dependent. See the Wronskian and linear independence.

Orientation of a basis
The determinant can be thought of as assigning a number to every sequence of n vectors in Rn, by using the square matrix whose columns are the given
vectors. For instance, an orthogonal matrix with entries in Rn represents an orthonormal basis in Euclidean space. The determinant of such a matrix
determines whether the orientation of the basis is consistent with or opposite to the orientation of the standard basis. If the determinant is +1, the basis has
the same orientation. If it is −1, the basis has the opposite orientation.

More generally, if the determinant of A is positive, A represents an orientation-preserving linear transformation (if A is an orthogonal 2 × 2 or 3 × 3 matrix,
this is a rotation), while if it is negative,A switches the orientation of the basis.

Volume and Jacobian determinant


As pointed out above, the absolute value of the determinant of real vectors is equal to the volume of the parallelepiped spanned by those vectors. As a
consequence, if f : Rⁿ → Rⁿ is the linear map represented by the matrix A, and S is any measurable subset of Rⁿ, then the volume of f(S) is given by
|det(A)| times the volume of S. More generally, if the linear map f : Rⁿ → Rᵐ is represented by the m × n matrix A, then the n-dimensional volume of f(S) is
given by:

    volume(f(S)) = √(det(AᵀA)) · volume(S).

By calculating the volume of the tetrahedron bounded by four points, determinants can be used to identify skew lines. The volume of any tetrahedron, given its
vertices a, b, c, and d, is (1/6)·|det(a − b, b − c, c − d)|, or any other combination of pairs of vertices that would form a spanning tree over the vertices.

For a general differentiable function, much of the above carries over by considering the Jacobian matrix of f. For

    f : Rⁿ → Rⁿ,

the Jacobian matrix is the n × n matrix whose entries are given by

    (D f)i,j = ∂fi/∂xj.

Its determinant, the Jacobian determinant, appears in the higher-dimensional version of integration by substitution: for suitable functions f and an open
subset U of Rⁿ (the domain of f), the integral over f(U) of some other function φ : Rⁿ → Rᵐ is given by

    ∫_{f(U)} φ(v) dv = ∫_U φ(f(u)) |det(D f)(u)| du.

The Jacobian also occurs in the inverse function theorem.

Vandermonde determinant (alternant)


The third order Vandermonde determinant is

    det [[1, 1, 1], [x1, x2, x3], [x1², x2², x3²]] = (x2 − x1)(x3 − x1)(x3 − x2).

In general, the nth-order Vandermonde determinant is[35]

    Π_{1 ≤ i < j ≤ n} (xj − xi),

where the right-hand side is the continued product of all the differences that can be formed from the n(n−1)/2 pairs of numbers taken from x1, x2, …, xn,
with the order of the differences taken in the reversed order of the suffixes that are involved.
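A numerical spot check of the product formula (an illustrative sketch; numpy.vander with increasing=True builds the rows 1, x, x², …):

    import numpy as np
    from itertools import combinations

    x = np.array([2.0, 3.0, 5.0, 7.0])
    V = np.vander(x, increasing=True)        # row i is [1, x_i, x_i^2, x_i^3]
    product = np.prod([x[j] - x[i] for i, j in combinations(range(len(x)), 2)])
    print(np.linalg.det(V), product)         # both equal 240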

Circulants
Second order:

    det [[c1, c2], [c2, c1]] = (c1 + c2)(c1 − c2).

Third order:

    det [[c1, c2, c3], [c3, c1, c2], [c2, c3, c1]] = (c1 + c2 + c3)(c1 + ω c2 + ω² c3)(c1 + ω² c2 + ω c3),

where ω and ω² are the complex cube roots of 1. In general, the nth-order circulant determinant is[35]

    Π_{j=1}^{n} (c1 + c2 ωj + c3 ωj² + … + cn ωj^(n−1)),

where ωj is an nth root of 1.

See also
Cauchy determinant
Dieudonné determinant
Determinant identities
Functional determinant
Immanant
Matrix determinant lemma
Permanent
Slater determinant

Notes
1. "Determinants and Volumes" (https://textbooks.math.gatech.edu/ila/determinants-volumes.html)
. textbooks.math.gatech.edu. Retrieved
16 March 2018.
2. Serge Lang, Linear Algebra, 2nd Edition, Addison-Wesley, 1971, pp 173, 191.
3. WildLinAlg episode 4 (https://www.youtube.com/watch?v=6XghF70fqkY), Norman J Wildberger, Univ. of New South Wales, 2010, lecture
via youtube
4. McConnell (1957). Applications of Tensor Analysis. Dover Publications. pp. 10–17.
5. Lin, Minghua; Sra, Suvrit (2014). "Completely strong superadditivity of generalized matrix functions". arXiv:1410.1958 (https://arxiv.org/abs/1410.1958) [math.FA (https://arxiv.org/archive/math.FA)].
6. Paksoy; Turkmen; Zhang (2014). "Inequalities of Generalized Matrix Functions via Tensor Products". Electronic Journal of Linear Algebra. 27. doi:10.13001/1081-3810.1622 (https://doi.org/10.13001%2F1081-3810.1622).
7. In a non-commutative setting left-linearity (compatibility with left-multiplication by scalars) should be distinguished from right-linearity. Assuming linearity in the columns is taken to be left-linearity, one would have, for non-commuting scalars a, b:

a contradiction. There is no useful notion of multi-linear functions over a non-commutative ring.


8. § 0.8.2 of R. A. Horn & C. R. Johnson:Matrix Analysis 2nd ed. (2013) Cambridge University Press.ISBN 978-0-521-54823-6.
9. Proofs can be found inhttp://www.ee.ic.ac.uk/hp/staff/dmb/matrix/proof003.html
10. A proof can be found in Appendix B of Kondratyuk, L. A.; Krivoruchenko, M. I. (1992). "Superconducting quark matter in SU(2) color group". Zeitschrift für Physik A. 344: 99–115. Bibcode:1992ZPhyA.344...99K (http://adsabs.harvard.edu/abs/1992ZPhyA.344...99K). doi:10.1007/BF01291027 (https://doi.org/10.1007%2FBF01291027).
11. Habgood, Ken; Arel, Itamar (2012). "A condensation-based application of Cramer's rule for solving large-scale linear systems". Journal of Discrete Algorithms. 10: 98–109. doi:10.1016/j.jda.2011.06.007 (https://doi.org/10.1016%2Fj.jda.2011.06.007).
12. These identities were taken fromhttp://www.ee.ic.ac.uk/hp/staff/dmb/matrix/proof003.html
13. Proofs are given in Silvester, J. R. (2000). "Determinants of Block Matrices" (http://www.ee.iisc.ernet.in/new/people/faculty/prasantg/downloads/blocks.pdf) (PDF). Math. Gazette. 84: 460–467. JSTOR 3620776 (https://www.jstor.org/stable/3620776).
14. Sothanaphan, Nat (January 2017). "Determinants of block matrices with noncommuting blocks". Linear Algebra and its Applications. 512: 202–218. doi:10.1016/j.laa.2016.10.004 (https://doi.org/10.1016%2Fj.laa.2016.10.004).
15. § 0.8.10 of R. A. Horn & C. R. Johnson:Matrix Analysis 2nd ed. (2013) Cambridge University Press.ISBN 978-0-521-54823-6.
16. Mac Lane, Saunders (1998), Categories for the Working Mathematician, Graduate Texts in Mathematics 5 ((2nd ed.) ed.), Springer-
Verlag, ISBN 0-387-98403-8
17. Varadarajan, V. S (2004), Supersymmetry for mathematicians: An introduction (https://books.google.com/?id=sZ1-G4hQgIIC&pg=PA116&dq=Berezinian#v=onepage&q=Berezinian&f=false), ISBN 978-0-8218-3574-6.
18. L. N. Trefethen and D. Bau, Numerical Linear Algebra(SIAM, 1997). e.g. in Lecture 1: "... we mention that the determinant, though a
convenient notion theoretically, rarely finds a useful role in numerical algorithms."
19. A survey of state-of-the-art algorithms for computing determinants and their advantages and disadvantages including results of performance tests, is included in Fisikopoulos, Vissarion; Peñaranda, Luis (2016). "Faster geometric algorithms via dynamic determinant computation" (https://arxiv.org/pdf/1206.7067.pdf) (PDF). Computational Geometry. Elsevier B. V. 54: 1–16. arXiv:1206.7067 (https://arxiv.org/abs/1206.7067). doi:10.1016/j.comgeo.2015.12.001 (https://doi.org/10.1016%2Fj.comgeo.2015.12.001). ISSN 0925-7721 (https://www.worldcat.org/issn/0925-7721). The survey is section 1.1 Previous work, and the results of tests are in section 4.3 Determinant computation experiments.
20. http://page.inf.fu-berlin.de/~rote/Papers/pdf/Division-free+algorithms.pdf
21. Bunch, J. R.; Hopcroft, J. E. (1974). "Triangular Factorization and Inversion by Fast Matrix Multiplication".Mathematics of Computation.
28 (125): 231–236. doi:10.1090/S0025-5718-1974-0331751-8(https://doi.org/10.1090%2FS0025-5718-1974-0331751-8) .
22. Fang, Xin Gui; Havas, George (1997)."On the worst-case complexity of integer Gaussian elimination"(http://perso.ens-lyon.fr/gilles.villar
d/BIBLIOGRAPHIE/PDF/ft_gateway.cfm.pdf) (PDF). Proceedings of the 1997 international symposium on Symbolic and algebraic
computation. ISSAC '97. Kihei, Maui, Hawaii, United States: ACM. pp. 28–31.doi:10.1145/258726.258740(https://doi.org/10.1145%2F2
58726.258740). ISBN 0-89791-875-4.
23. Bareiss, Erwin (1968),"Sylvester's Identity and Multistep Integer-Preserving Gaussian Elimination"
(http://www.ams.org/journals/mcom/1
968-22-103/S0025-5718-1968-0226829-0/S0025-5718-1968-0226829-0.pdf) (PDF), Mathematics of Computation, 22 (102): 565–578,
doi:10.2307/2004533 (https://doi.org/10.2307%2F2004533), JSTOR 2004533 (https://www.jstor.org/stable/2004533)
24. Campbell, H: "Linear Algebra With Applications", pages 111–112. Appleton Century Crofts, 1971
25. Eves, H: "An Introduction to the History of Mathematics", pages 405, 493–494, Saunders College Publishing, 1990.
26. A Brief History of Linear Algebra and Matrix Theory: "Archived copy" (https://web.archive.org/web/20120910034016/http://darkwing.uoregon.edu/~vitulli/441.sp04/LinAlgHistory.html). Archived from the original (http://darkwing.uoregon.edu/~vitulli/441.sp04/LinAlgHistory.html) on 2012-09-10. Retrieved 2012-01-24.
27. Cajori, F. A History of Mathematics, p. 80 (https://books.google.com/books?id=bBoPAAAAIAAJ&pg=PA80#v=onepage&f=false)
28. Expansion of determinants in terms of minors: Laplace, Pierre-Simon (de) "Researches sur le calcul intégral et sur le systéme du
monde," Histoire de l'Académie Royale des Sciences(Paris), seconde partie, pages 267–376 (1772).
29. Muir, Sir Thomas, The Theory of Determinants in the historical Order of Development[London, England: Macmillan and Co., Ltd., 1906].
JFM 37.0181.02 (https://zbmath.org/?format=complete&q=an:37.0181.02)
30. The first use of the word "determinant" in the modern sense appeared in: Cauchy, Augustin-Louis "Memoire sur les fonctions qui ne
peuvent obtenir que deux valeurs égales et des signes contraires par suite des transpositions operées entre les variables qu'elles
renferment," which was first read at the Institute de France in Paris on November 30, 1812, and which was subsequently published in the
Journal de l'Ecole Polytechnique, Cahier 17, Tome 10, pages 29–112 (1815).
31. Origins of mathematical terms:http://jeff560.tripod.com/d.html
32. History of matrices and determinants:http://www-history.mcs.st-and.ac.uk/history/HistTopics/Matrices_and_determinants.html
33. The first use of vertical lines to denote a determinant appeared in: Cayley, Arthur "On a theorem in the geometry of position," Cambridge Mathematical Journal, vol. 2, pages 267–271 (1841).
34. History of matrix notation:http://jeff560.tripod.com/matrices.html
35. Gradshteyn, Izrail Solomonovich; Ryzhik, Iosif Moiseevich; Geronimus, Yuri Veniaminovich; Tseytlin, Michail Yulyevich (February 2007).
"14.31". In Jeffrey, Alan; Zwillinger, Daniel. Table of Integrals, Series, and Products. Translated by Scripta Technica, Inc. (7 ed.).
Academic Press, Inc. ISBN 0-12-373637-4. LCCN 2010481177 (https://lccn.loc.gov/2010481177). MR 2360010 (https://www.ams.org/ma
thscinet-getitem?mr=2360010).

References
Axler, Sheldon Jay (1997), Linear Algebra Done Right (2nd ed.), Springer-Verlag, ISBN 0-387-98259-0
de Boor, Carl (1990), "An empty exercise" (PDF), ACM SIGNUM Newsletter, 25 (2): 3–7, doi:10.1145/122272.122273.
Lay, David C. (August 22, 2005), Linear Algebra and Its Applications (3rd ed.), Addison Wesley, ISBN 978-0-321-28713-7
Meyer, Carl D. (February 15, 2001), Matrix Analysis and Applied Linear Algebra, Society for Industrial and Applied Mathematics (SIAM), ISBN 978-0-89871-454-8, archived from the original on 2009-10-31
Muir, Thomas (1960) [1933], A treatise on the theory of determinants, Revised and enlarged by William H. Metzler, New York, NY: Dover
Poole, David (2006), Linear Algebra: A Modern Introduction (2nd ed.), Brooks/Cole, ISBN 0-534-99845-3
G. Baley Price (1947) "Some identities in the theory of determinants", American Mathematical Monthly 54: 75–90, MR0019078
Horn, R. A.; Johnson, C. R. (2013), Matrix Analysis (2nd ed.), Cambridge University Press, ISBN 978-0-521-54823-6
Anton, Howard (2005), Elementary Linear Algebra (Applications Version) (9th ed.), Wiley International
Leon, Steven J. (2006), Linear Algebra With Applications (7th ed.), Pearson Prentice Hall

External links
Suprunenko, D.A. (2001) [1994], "Determinant", in Hazewinkel, Michiel, Encyclopedia of Mathematics, Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN 978-1-55608-010-4
Weisstein, Eric W. "Determinant". MathWorld.
O'Connor, John J.; Robertson, Edmund F., "Matrices and determinants", MacTutor History of Mathematics archive, University of St Andrews.
Determinant Interactive Program and Tutorial
Linear algebra: determinants. Compute determinants of matrices up to order 6 using Laplace expansion you choose.
Matrices and Linear Algebra on the Earliest Uses Pages
Determinants explained in an easy fashion in the 4th chapter as a part of a Linear Algebra course.
Instructional Video on taking the determinant of an nxn matrix (Khan Academy)
"The determinant". Essence of linear algebra – via YouTube.

Laplace expansion
In linear algebra, the Laplace expansion, named after Pierre-Simon Laplace, also called cofactor expansion, is an expression for the determinant |B| of an n × n matrix B that is a
weighted sum of the determinants of n sub-matrices (or minors) of B, each of size (n−1) × (n−1). The Laplace expansion is of theoretical interest as one of several ways to view and
compute the determinant.

The i, j cofactor of the matrix B is the scalar Cij defined by

    Cij = (−1)^(i+j) Mij,

where Mij is the i, j minor of B, that is, the determinant of the (n − 1) × (n − 1) matrix that results from deleting the i-th row and the j-th column of B.

Then the Laplace expansion is given by the following

Theorem. Suppose B = [bij] is an n × n matrix and fix any i, j ∈ {1, 2, ..., n}.

Then its determinant |B| is given by:

    |B| = bi1 Ci1 + bi2 Ci2 + ⋯ + bin Cin = Σ_{j′=1}^{n} bij′ Cij′
        = b1j C1j + b2j C2j + ⋯ + bnj Cnj = Σ_{i′=1}^{n} bi′j Ci′j,

where the bij′ and bi′j are the values of the matrix's row or column that were excluded by the step of finding the minor matrix for the cofactor (see example below).

Contents
Examples
Proof
Laplace expansion of a determinant by complementary minors
Example
General statement
Computational expense
See also
References
External links

Examples
Consider the matrix

The determinant of this matrix can be computed by using the Laplace expansion along any one of its rows or columns. For instance, an expansion along the first row yields:

Laplace expansion along the second column yields the same result:

It is easy to verify that the result is correct: the matrix is singular because the sum of its first and third columns is twice the second column, and hence its determinant is zero.
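As an illustrative sketch (not part of the original article), the expansion along the first row can be coded directly. The 3×3 matrix below is an assumed stand-in chosen to have the property just described (first plus third column equals twice the second), so its determinant should come out as 0.

    def det_laplace(B):
        """Determinant by Laplace (cofactor) expansion along the first row."""
        n = len(B)
        if n == 1:
            return B[0][0]
        total = 0
        for j in range(n):
            # minor: delete row 0 and column j
            minor = [row[:j] + row[j+1:] for row in B[1:]]
            cofactor = (-1) ** j * det_laplace(minor)
            total += B[0][j] * cofactor
        return total

    B = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]          # assumed example: col 1 + col 3 == 2 * col 2
    print(det_laplace(B))    # 0, as expected for a singular matrix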

Proof
Suppose is an n × n matrix and fix i, j ∈ {1, 2, ..., n}. For clarity we also label the entries of that compose its minor matrix as

for
Consider the terms in the expansion of |B| that have b_ij as a factor. Each has the form

for some permutation τ ∈ Sn with , and a unique and evidently related permutation which selects the same minor entries as τ. Similarly, each choice of σ determines a corresponding τ, i.e. the correspondence is a bijection between and . The explicit relation between and can be written as

where is a temporary shorthand notation for a cycle . This operation decrements all indices larger than j so that every index fits in the set {1, 2, ..., n−1}.

The permutation τ can be derived from σ as follows. Define by for and . Then is expressed as

Now, the operation which applies first and then applies is (notice that applying A before B is equivalent to applying the inverse of A to the upper row of B in Cauchy's two-line notation),

where is temporary shorthand notation for .

The operation which applies first and then applies is

The above two are equal, and thus,

where is the inverse of , which is .

Thus

Since the two cycles can be written respectively as and transpositions,

And since the map is bijective,

from which the result follows.

Laplace expansion of a determinant by complementary minors


Laplace's cofactor expansion can be generalised as follows.

Example
Consider the matrix

The determinant of this matrix can be computed by using Laplace's cofactor expansion along the first two rows as follows. First note that there are 6 sets of two distinct numbers in {1, 2, 3, 4}, namely {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4} and {3, 4}; let S be the aforementioned set.

By defining the complementary cofactors to be


,

and the sign of their permutation to be

, where .

The determinant of A can be written out as

where is the complementary set to .

In our explicit example this gives us

As above, it is easy to verify that the result is correct: the matrix is singular because the sum of its first and third columns is twice the second column, and hence its determinant is zero.

General statement
Let be an n × n matrix and the set of k-element subsets of {1, 2, ... , n}, an element in it. Then the determinant of can be expanded along the k rows identified
by as follows:

where is the sign of the permutation determined by and , equal to , the square minor of obtained by deleting from rows and columns
with indices in and respectively, and (called the complement of ) defined to be , and being the complement of and respectively.

This coincides with the theorem above when k = 1. The same thing holds for any fixed k columns.
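A small computational sketch of this general statement (an illustration, not from the article) for k = 2 and a 4×4 matrix: expand along the first two rows, pairing each 2×2 minor taken from those rows with the complementary 2×2 minor from the remaining rows, with sign (−1) raised to the sum of the (1-based) row and column indices involved.

    from itertools import combinations
    import numpy as np

    def det_two_row_expansion(A):
        """Expand det(A) of a 4x4 matrix along rows 0 and 1 by complementary minors."""
        rows, all_cols = [0, 1], [0, 1, 2, 3]
        total = 0.0
        for cols in combinations(all_cols, 2):
            comp = [c for c in all_cols if c not in cols]
            minor = np.linalg.det(A[np.ix_(rows, cols)])
            comp_minor = np.linalg.det(A[np.ix_([2, 3], comp)])
            # sign = (-1)^(sum of 1-based row indices + sum of 1-based column indices)
            sign = (-1) ** (sum(r + 1 for r in rows) + sum(c + 1 for c in cols))
            total += sign * minor * comp_minor
        return total

    A = np.random.rand(4, 4)
    print(np.allclose(det_two_row_expansion(A), np.linalg.det(A)))   # True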

Computational expense
The Laplace expansion is computationally inefficient for high-dimension matrices, with a time complexity in big O notation of O(n!). Alternatively, using a decomposition into triangular matrices as in the LU decomposition can yield determinants with a time complexity of O(n^3).[1]
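For comparison, here is a brief sketch (illustrative, not from the article) of the O(n^3) route: factor the matrix with SciPy's LU routine and multiply the diagonal of U, tracking the sign of the row permutation.

    import numpy as np
    from scipy.linalg import lu_factor

    def det_via_lu(A):
        """Determinant from an LU factorization: product of U's diagonal times permutation sign."""
        lu, piv = lu_factor(A)
        # each pivot that differs from its own index corresponds to one row swap
        sign = (-1) ** np.sum(piv != np.arange(len(piv)))
        return sign * np.prod(np.diag(lu))

    A = np.random.rand(5, 5)
    print(np.allclose(det_via_lu(A), np.linalg.det(A)))   # True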

See also
Leibniz formula for determinants

References
1. Stoer, Bulirsch: Introduction to Numerical Mathematics

David Poole: Linear Algebra. A Modern Introduction. Cengage Learning 2005, ISBN 0-534-99845-3, pp. 265–267 (restricted online copy, p. 265, at Google Books)
Harvey E. Rose: Linear Algebra. A Pure Mathematical Approach. Springer 2002, ISBN 3-7643-6905-1, pp. 57–60 (restricted online copy, p. 57, at Google Books)

External links
Laplace expansion in C (in Portuguese)
Laplace expansion in Java (in Portuguese)

Leibniz formula for determinants
In algebra, the Leibniz formula, named in honor of Gottfried Leibniz, expresses the determinant of a square matrix in terms of permutations of the
matrix elements. If A is an n×n matrix, where a_{i,j} is the entry in the i-th row and j-th column of A, the formula is

    det(A) = Σ_{σ ∈ S_n} sgn(σ) ∏_{i=1}^{n} a_{i,σ(i)},

where sgn is the sign function of permutations in the permutation group S_n, which returns +1 and −1 for even and odd permutations, respectively.

Another common notation used for the formula is in terms of the Levi-Civita symbol and makes use of the Einstein summation notation, where it becomes

    det(A) = ε_{i_1 i_2 ... i_n} a_{1 i_1} a_{2 i_2} ... a_{n i_n},

which may be more familiar to physicists.

Directly evaluating the Leibniz formula from the definition requires on the order of n! · n operations in general—that is, a number of operations asymptotically proportional to n factorial—because n! is the number of order-n permutations. This is impractically difficult for large n. Instead, the determinant can be evaluated in O(n^3) operations by forming the LU decomposition (typically via Gaussian elimination or similar methods), in which case det(A) = ± det(L) det(U), with the sign determined by the row permutation used, and the determinants of the triangular matrices L and U are simply the products of their diagonal entries. (In practical applications of numerical linear algebra, however, explicit computation of the determinant is rarely required.) See, for example, Trefethen & Bau (1997).
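As an illustrative sketch (not part of the article), the Leibniz formula can be transcribed almost literally using a permutation generator; the helper perm_sign below, which counts inversions, is an assumed utility rather than a standard library call.

    from itertools import permutations

    def perm_sign(p):
        """Sign of a permutation given as a tuple, via its inversion count."""
        inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
        return -1 if inversions % 2 else 1

    def det_leibniz(A):
        """Leibniz formula: sum over all permutations sigma of sgn(sigma) * prod_i A[i][sigma(i)]."""
        n = len(A)
        total = 0
        for sigma in permutations(range(n)):
            term = perm_sign(sigma)
            for i in range(n):
                term *= A[i][sigma[i]]
            total += term
        return total

    print(det_leibniz([[1, 2], [3, 4]]))   # -2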

Formal statement and proof


Theorem. There exists exactly one function

which is alternating multilinear w.r.t. columns and such that F(I) = 1.

Proof.

Uniqueness: Let be such a function, and let be an matrix. Call the -th column of , i.e. , so that

Also, let denote the -th column vector of the identity matrix.

Now one writes each of the 's in terms of the , i.e.

As is multilinear, one has

From alternation it follows that any term with repeated indices is zero. The sum can therefore be restricted to tuples with non-repeating indices, i.e.
permutations:
Because F is alternating, the columns can be swapped until it becomes the identity. The sign function is defined to count the number of swaps necessary and account for the resulting sign change. One finally gets:

as is required to be equal to .

Therefore no function besides the function defined by the Leibniz formula can be a multilinear alternating function with F(I) = 1.

Existence: We now show that F, where F is the function defined by the Leibniz formula, has these three properties.

Multilinear:

Alternating:

For any let be the tuple equal to with the and indices switched.

Thus if then .
Finally, :

Thus the only alternating multilinear function with F(I) = 1 is the function defined by the Leibniz formula, and it in fact also has these three properties. Hence the determinant can be defined as the only function

with these three properties.

See also
Matrix
Laplace expansion
Cramer's rule

References
Hazewinkel, Michiel, ed. (2001) [1994], "Determinant", Encyclopedia of Mathematics, Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN 978-1-55608-010-4
Trefethen, Lloyd N.; Bau, David (June 1, 1997). Numerical Linear Algebra. SIAM. ISBN 978-0898713619.

Cayley–Hamilton theorem
In linear algebra, the Cayley–Hamilton theorem (named after the mathematicians Arthur
Cayley and William Rowan Hamilton) states that every square matrix over a commutative ring
(such as the real or complex field) satisfies its own characteristic equation.

If A is a given n×n matrix and I_n is the n×n identity matrix, then the characteristic polynomial of A is defined as[7]

    p(λ) = det(λ I_n − A),
where det is the determinant operation and λ is a scalar element of the base ring. Since the
entries of the matrix are (linear or constant) polynomials in λ, the determinant is also an n-th
order monic polynomial in λ. The Cayley–Hamilton theorem states that substituting the matrix
A for λ in this polynomial results in the zero matrix:

    p(A) = 0.
The powers of A, obtained by substitution from powers of λ, are defined by repeated matrix
multiplication; the constant term of p(λ) gives a multiple of the power A0, which is defined as
the identity matrix. The theorem allows A^n to be expressed as a linear combination of the lower matrix powers of A. When the ring is a field, the Cayley–Hamilton theorem is equivalent to the statement that the minimal polynomial of a square matrix divides its characteristic polynomial.

The theorem was first proved in 1853[8] in terms of inverses of linear functions of quaternions, a non-commutative ring, by Hamilton.[4][5][6] This corresponds to the special case of certain 4 × 4 real or 2 × 2 complex matrices. The theorem holds for general quaternionic matrices.[9][nb 1] Cayley in 1858 stated it for 3 × 3 and smaller matrices, but only published a proof for the 2 × 2 case.[2] The general case was first proved by Frobenius in 1878.[10]

[Figure: Arthur Cayley, F.R.S. (1821–1895) is widely regarded as Britain's leading pure mathematician of the 19th century. Cayley in 1848 went to Dublin to attend lectures on quaternions by Hamilton, their discoverer. Later Cayley impressed him by being the second to publish work on them.[1] Cayley proved the theorem for matrices of dimension 3 and less, publishing a proof for the two-dimensional case.[2][3] As for n × n matrices, Cayley stated “..., I have not thought it necessary to undertake the labor of a formal proof of the theorem in the general case of a matrix of any degree”.]

Contents
Examples
1×1 matrices
2×2 matrices

Applications
Determinant and inverse matrix
n-th Power of matrix
Matrix functions
Algebraic Number Theory
Proving the theorem in general
Preliminaries
Adjugate matrices
A direct algebraic proof
A proof using polynomials with matrix coefficients
A synthesis of the first two proofs
A proof using matrices of endomorphisms
A bogus "proof": p(A) = det(AIn − A) = det(A − A) = 0
Abstraction and generalizations
See also
Remarks
Notes
References
External links

Examples

1×1 matrices
For a 1×1 matrix A = (a_{1,1}), the characteristic polynomial is given by p(λ) = λ − a_{1,1}, and so p(A) = (a_{1,1}) − a_{1,1} = 0 is obvious.

2×2 matrices
As a concrete example, let

Its characteristic polynomial is given by

[Figure: William Rowan Hamilton (1805–1865), Irish physicist, astronomer, and mathematician, first foreign member of the American National Academy of Sciences. While maintaining opposing positions about how geometry should be studied, Hamilton always remained on the best terms with Cayley.[1] Hamilton proved that for a linear function of quaternions there exists a certain equation, depending on the linear function, that is satisfied by the linear function itself.[4][5][6]]

The Cayley–Hamilton theorem claims that, if we define

then

We can verify by computation that indeed,

For a generic 2×2 matrix,

the characteristic polynomial is given by p(λ) = λ^2 − (a + d)λ + (ad − bc), so the Cayley–Hamilton theorem states that

    A^2 − (a + d)A + (ad − bc)I_2 = 0,

which is indeed always the case, evident by working out the entries of A^2.
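A quick numeric check (an illustration with an assumed matrix, since the article's displayed example is not reproduced here): for any 2×2 matrix, A^2 − (a + d)A + (ad − bc)I_2 should be the zero matrix.

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.]])          # assumed example matrix
    trace, det = np.trace(A), np.linalg.det(A)
    p_of_A = A @ A - trace * A + det * np.eye(2)
    print(np.allclose(p_of_A, 0))     # True: A satisfies its characteristic polynomial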

Applications

Determinant and inverse matrix


For a general n×n invertible matrix A, i.e., one with nonzero determinant, A^{−1} can thus be written as an (n − 1)-th order polynomial expression in A: as indicated, the Cayley–Hamilton theorem amounts to the identity

The coefficients c_i are given by the elementary symmetric polynomials of the eigenvalues of A. Using Newton identities, the elementary symmetric polynomials can in turn be expressed in terms of power sum symmetric polynomials of the eigenvalues:

where tr(A^k) is the trace of the matrix A^k. Thus, we can express c_i in terms of the trace of powers of A.

In general, the formula for the coefficients c_i is given in terms of complete exponential Bell polynomials as[nb 2]

In particular, the determinant of A corresponds to c_0. Thus, the determinant can be written as a trace identity

Likewise, the characteristic polynomial can be written as

and, by multiplying both sides by A^{−1} (note −(−1)^n = (−1)^{n−1}), one is led to an expression for the inverse of A as a trace identity,

For instance, the first few Bell polynomials are B_0 = 1, B_1(x_1) = x_1, B_2(x_1, x_2) = x_1^2 + x_2, and B_3(x_1, x_2, x_3) = x_1^3 + 3 x_1 x_2 + x_3.

Using these to specify the coefficients c_i of the characteristic polynomial of a 2×2 matrix yields

The coefficient c_0 gives the determinant of the 2×2 matrix and c_1 gives minus its trace, while its inverse is given by

It is apparent from the general formula for c_{n−k}, expressed in terms of Bell polynomials, that the expressions

and

always give the coefficients c_{n−1} of λ^{n−1} and c_{n−2} of λ^{n−2} in the characteristic polynomial of any n×n matrix, respectively. So, for a 3×3 matrix A, the statement of the Cayley–Hamilton theorem can also be written as

where the right-hand side designates a 3×3 matrix with all entries reduced to zero. Likewise, this determinant in the n = 3 case is now

This expression gives the negative of the coefficient c_{n−3} of λ^{n−3} in the general case, as seen below.

Similarly, one can write for a 4×4 matrix A,

where, now, the determinant is c_{n−4},

and so on for larger matrices. The increasingly complex expressions for the coefficients c_k are deducible from Newton's identities or the Faddeev–LeVerrier algorithm.
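The Faddeev–LeVerrier recursion mentioned here can be sketched as follows (an illustration under the conventions p(λ) = λ^n + c_{n−1}λ^{n−1} + ... + c_0; the code is not taken from the article).

    import numpy as np

    def faddeev_leverrier(A):
        """Coefficients c_0..c_n of the characteristic polynomial and the final matrix M_n."""
        n = A.shape[0]
        c = np.zeros(n + 1); c[n] = 1.0           # leading coefficient of lambda^n
        M = np.zeros_like(A)
        for k in range(1, n + 1):
            M = A @ M + c[n - k + 1] * np.eye(n)
            c[n - k] = -np.trace(A @ M) / k
        return c, M                                # note: A^{-1} = -M / c[0] when c[0] != 0

    A = np.array([[2., 1.], [1., 2.]])
    c, M = faddeev_leverrier(A)
    print(c)                                       # [ 3. -4.  1.], i.e. p(x) = x^2 - 4x + 3
    print(np.allclose(-M / c[0], np.linalg.inv(A)))   # True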

Another method for obtaining these coefficients ck for a general n×n matrix, provided no root be zero, relies on the following alternative
expression for the determinant,

Hence, by virtue of theMercator series,

where the exponential only needs to be expanded to order λ^{−n}, since p(λ) is of order n, the net negative powers of λ automatically vanishing by the C–H theorem. (Again, this requires a ring containing the rational numbers.) The coefficients of λ can be directly written in terms of complete Bell polynomials by comparing this expression with the generating function of the Bell polynomial.

Differentiation of this expression with respect to λ allows determination of the generic coefficients of the characteristic polynomial for general n,
as determinants of m×m matrices,[nb 3]

n-th Power of matrix


The Cayley–Hamilton theorem always provides a relationship between the powers of A (though not always the simplest one), which allows one to simplify expressions involving such powers, and to evaluate them without having to compute the power A^n or any higher powers of A.
As an example, for the theorem gives

Then, to calculate A4, observe

Likewise,

Notice that we have been able to write the matrix power as the sum of two terms. In fact, a matrix power of any order k can be written as a matrix polynomial of degree at most n − 1, where n is the size of the square matrix. This is an instance where the Cayley–Hamilton theorem can be used to express a matrix function, which we will discuss below systematically.
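A sketch of this power-reduction idea (illustrative; the matrix is an assumed example): reduce x^k modulo the characteristic polynomial and evaluate the degree-one remainder at A, instead of multiplying A by itself k times.

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.]])                      # assumed example matrix
    # characteristic polynomial x^2 - tr(A) x + det(A), highest power first
    p = [1.0, -np.trace(A), np.linalg.det(A)]
    x_to_k = [1.0] + [0.0] * 4                    # coefficients of x^4
    _, r = np.polydiv(x_to_k, p)                  # remainder, degree < 2 (highest power first)
    r = np.concatenate([np.zeros(2 - len(r)), r]) # pad in case leading terms were trimmed
    A4 = r[0] * A + r[1] * np.eye(2)              # evaluate the remainder at the matrix A
    print(np.allclose(A4, np.linalg.matrix_power(A, 4)))   # True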

Matrix functions
Given an analytic function

and the characteristic polynomial p(x) of degree n of an n × n matrix A, the function can be expressed using long division as

    f(x) = q(x) p(x) + r(x),

where q(x) is some quotient polynomial and r(x) is a remainder polynomial such that 0 ≤ deg r(x) < n. By the Cayley–Hamilton theorem, replacing x by the matrix A gives p(A) = 0, so one has

    f(A) = r(A).

Thus, the analytic function of the matrix A can be expressed as a matrix polynomial of degree less than n.

Let the remainder polynomial be

    r(x) = c_0 + c_1 x + ... + c_{n−1} x^{n−1}.

Since p(λ) = 0 at each eigenvalue λ of A, evaluating the function f(x) at the n eigenvalues of A yields

    f(λ_i) = r(λ_i),  for i = 1, 2, ..., n.

This amounts to a system of n linear equations, which can be solved to determine the coefficients c_i. Thus, one has

When the eigenvalues are repeated, that is λ_i = λ_j for some i ≠ j, two or more equations are identical, and hence the linear equations cannot be solved uniquely. For such cases, for an eigenvalue λ with multiplicity m, the first m − 1 derivatives of p(x) vanish at that eigenvalue. Thus, there are the extra m − 1 linearly independent equations
which, when combined with the others, yield the required n equations to solve for c_i.

Finding a polynomial that passes through the points (λi, f (λi)) is essentially an interpolation problem, and can be solved using Lagrange or
Newton interpolation techniques, leading to Sylvester's formula.

For example, suppose the task is to find the polynomial representation of

The characteristic polynomial is p(x) = (x − 1)(x − 3) = x^2 − 4x + 3, and the eigenvalues are λ = 1, 3. Let r(x) = c_0 + c_1 x. Evaluating f(λ) = r(λ) at the eigenvalues, one obtains two linear equations, e^t = c_0 + c_1 and e^{3t} = c_0 + 3c_1. Solving the equations yields c_0 = (3e^t − e^{3t})/2 and c_1 = (e^{3t} − e^t)/2. Thus, it follows that

If, instead, the function were f(A) = sin At, then the coefficients would have been c0 = (3 sin t - sin 3t)/2 and c1 = (sin 3t - sin t)/2;
hence

As a further example, when considering

then the characteristic polynomial is p(x) = x^2 + 1, and the eigenvalues are λ = i, −i. As before, evaluating the function at the eigenvalues gives us the linear equations e^{it} = c_0 + ic_1 and e^{−it} = c_0 − ic_1; the solution of which gives c_0 = (e^{it} + e^{−it})/2 = cos t and c_1 = (e^{it} − e^{−it})/2i = sin t. Thus, for this case,

which is a rotation matrix.
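A numerical sketch of this last case (illustrative; A = [[0, −1], [1, 0]] is an assumed representative with characteristic polynomial x^2 + 1): the two coefficients recovered from the eigenvalue equations reproduce the matrix exponential.

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0., -1.],
                  [1.,  0.]])       # assumed matrix with p(x) = x^2 + 1, eigenvalues +-i
    t = 0.7
    c0, c1 = np.cos(t), np.sin(t)   # from e^{it} = c0 + i c1 and e^{-it} = c0 - i c1
    f_of_A = c0 * np.eye(2) + c1 * A
    print(np.allclose(f_of_A, expm(A * t)))   # True: cos(t) I + sin(t) A is the rotation by t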

A standard example of such usage is the exponential map from the Lie algebra of a matrix Lie group into the group. It is given by a matrix exponential,

Such expressions have long been known for SU(2),

where the σ are the Pauli matrices, and for SO(3),

which is Rodrigues' rotation formula. For the notation, see rotation group SO(3)#A note on Lie algebra.

More recently, expressions have appeared for other groups, like the Lorentz group SO(3, 1),[11] O(4, 2)[12] and SU(2, 2),[13] as well as
GL(n, R).[14] The group O(4, 2) is the conformal group of spacetime, SU(2, 2) its simply connected cover (to be precise, the simply
connected cover of the connected component SO+(4, 2) of O(4, 2)). The expressions obtained apply to the standard representation of these
groups. They require knowledge of (some of) the eigenvalues of the matrix to exponentiate. For SU(2) (and hence for SO(3)), closed expressions have recently been obtained for all irreducible representations, i.e. of any spin.[15]

Algebraic Number Theory


The Cayley–Hamilton theorem is an effective tool for computing the minimal polynomial of
algebraic integers. For example, given a finite extension of and an algebraic
integer which is a non-zero linear combination of the we can
compute the minimal polynomial of by finding a matrix representing the -linear
transformation

If we call this transformation matrix , then we can find the minimal polynomial by applying
the Cayley–Hamilton theorem to .[16]
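A hedged sketch of this idea (the specific field and element are assumptions for illustration, not the article's example): take α = √2 + √3 in Q(√2, √3) with basis (1, √2, √3, √6), write the matrix of multiplication by α in that basis, and read off its characteristic polynomial, which here is also the minimal polynomial.

    from sympy import Matrix, symbols

    x = symbols('x')
    # Columns give the coordinates of alpha*1, alpha*sqrt2, alpha*sqrt3, alpha*sqrt6
    # in the basis (1, sqrt2, sqrt3, sqrt6), for alpha = sqrt2 + sqrt3 (assumed example).
    M = Matrix([[0, 2, 3, 0],
                [1, 0, 0, 3],
                [1, 0, 0, 2],
                [0, 1, 1, 0]])
    print(M.charpoly(x).as_expr())   # x**4 - 10*x**2 + 1, the minimal polynomial of sqrt2 + sqrt3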

Proving the theorem in general


The Cayley–Hamilton theorem is an immediate consequence of the existence of the Jordan normal form for matrices over algebraically closed fields. In this section direct proofs are presented.

[Figure: Ferdinand Georg Frobenius (1849–1917), German mathematician. His main interests were elliptic functions, differential equations, and later group theory. In 1878 he gave the first full proof of the Cayley–Hamilton theorem.[10]]

As the examples above show, obtaining the statement of the Cayley–Hamilton theorem for an n×n matrix requires two steps: first the coefficients c_i of the characteristic polynomial are determined by development as a polynomial in t of the determinant det(t I_n − A),

and then these coefficients are used in a linear combination of powers of A that is equated to the n×n null matrix:

The left-hand side can be worked out to an n×n matrix whose entries are (enormous) polynomial expressions in the set of entries a_{i,j} of A, so the Cayley–Hamilton theorem states that each of these n^2 expressions equals 0. For any fixed value of n these identities can be obtained by tedious but completely straightforward algebraic manipulations. None of these computations can show, however, why the Cayley–Hamilton theorem should be valid for matrices of all possible sizes n, so a uniform proof for all n is needed.

Preliminaries
If a vector v of size n happens to be an eigenvector of A with eigenvalue λ, in other words if A⋅v = λv, then

    p(A)⋅v = p(λ)v,

which is the null vector since p(λ) = 0 (the eigenvalues of A are precisely the roots of p(t)). This holds for all possible eigenvalues λ, so the two matrices equated by the theorem certainly give the same (null) result when applied to any eigenvector. Now if A admits a basis of eigenvectors, in other words if A is diagonalizable, then the Cayley–Hamilton theorem must hold for A, since two matrices that give the same values when applied to each element of a basis must be equal.


Consider now the function which maps matrices to matrices given by the formula , i.e. which takes
a matrix and plugs it into its own characteristic polynomial. Not all matrices are diagonalizable, but for matrices with complex coefficients
many of them are: the set of diagonalizable complex square matrices of a given size is dense in the set of all such square matrices[17] (for a
matrix to be diagonalizable it suffices for instance that its characteristic polynomial not have any multiple roots). Now viewed as a function
(since matrices have entries) we see that this function is continuous. This is true because the entries of the image of a matrix
are given by polynomials in the entries of the matrix. Since

and since the set of diagonalizable matrices is dense, by continuity this function must map the entire set of matrices to the zero matrix. Therefore the Cayley–Hamilton theorem is true for complex matrices, and must therefore also hold for real- or rational-valued matrices.

While this provides a valid proof, the argument is not very satisfactory, since the identities represented by the theorem do not in any way depend
on the nature of the matrix (diagonalizable or not), nor on the kind of entries allowed (for matrices with real entries the diagonalizable ones do
not form a dense set, and it seems strange one would have to consider complex matrices to see that the Cayley–Hamilton theorem holds for
them). We shall therefore now consider only arguments that prove the theorem directly for any matrix using algebraic manipulations only; these
also have the benefit of working for matrices with entries in any commutative ring.

There is a great variety of such proofs of the Cayley–Hamilton theorem, of which several will be given here. They vary in the amount of abstract
algebraic notions required to understand the proof. The simplest proofs use just those notions needed to formulate the theorem (matrices,
polynomials with numeric entries, determinants), but involve technical computations that render somewhat mysterious the fact that they lead
precisely to the correct conclusion. It is possible to avoid such details, but at the price of involving more subtle algebraic notions: polynomials
with coefficients in a non-commutative ring, or matrices with unusual kinds of entries.

Adjugate matrices
All proofs below use the notion of the adjugate matrix adj(M) of an n×n matrix M, the transpose of its cofactor matrix.

This is a matrix whose coefficients are given by polynomial expressions in the coefficients of M (in fact, by certain (n − 1)×(n − 1)
determinants), in such a way that the following fundamental relations hold:

    adj(M) · M = M · adj(M) = det(M) · I_n.

These relations are a direct consequence of the basic properties of determinants: evaluation of the (i, j) entry of the matrix product on the left gives the expansion by column j of the determinant of the matrix obtained from M by replacing column i by a copy of column j, which is det(M) if i = j and zero otherwise; the matrix product on the right is similar, but for expansions by rows.

Being a consequence of just algebraic expression manipulation, these relations are valid for matrices with entries in any commutative ring
(commutativity must be assumed for determinants to be defined in the first place). This is important to note here, because these relations will be
applied below for matrices with non-numeric entries such as polynomials.
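A short numeric sketch (illustrative, not from the article) of these fundamental relations: build adj(M) entrywise from (n−1)×(n−1) cofactor determinants and confirm that M·adj(M) and adj(M)·M both equal det(M)·I_n.

    import numpy as np

    def adjugate(M):
        """Adjugate: transpose of the cofactor matrix, built from (n-1)x(n-1) minors."""
        n = M.shape[0]
        C = np.zeros_like(M, dtype=float)
        for i in range(n):
            for j in range(n):
                minor = np.delete(np.delete(M, i, axis=0), j, axis=1)
                C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
        return C.T

    M = np.random.rand(4, 4)
    d = np.linalg.det(M)
    print(np.allclose(M @ adjugate(M), d * np.eye(4)),
          np.allclose(adjugate(M) @ M, d * np.eye(4)))   # True True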
A direct algebraic proof
This proof uses just the kind of objects needed to formulate the Cayley–Hamilton theorem: matrices with polynomials as entries. The matrix
t I_n − A whose determinant is the characteristic polynomial of A is such a matrix, and since polynomials form a commutative ring, it has an adjugate B = adj(t I_n − A).

Then, according to the right-hand fundamental relation of the adjugate, one has

    (t I_n − A) · B = det(t I_n − A) · I_n = p(t) · I_n.

Since B is also a matrix with polynomials in t as entries, one can, for each i, collect the coefficients of t^i in each entry to form a matrix B_i of numbers, such that one has

(The way the entries of B are defined makes clear that no powers higher than t^{n−1} occur.) While this looks like a polynomial with matrices as coefficients, we shall not consider such a notion; it is just a way to write a matrix with polynomial entries as a linear combination of n constant matrices, and the coefficient t^i has been written to the left of the matrix to stress this point of view.

Now, one can expand the matrix product in our equation by bilinearity

Writing

one obtains an equality of two matrices with polynomial entries, written as linear combinations of constant matrices with powers of t as
coefficients.

Such an equality can hold only if in any matrix position the entry that is multiplied by a given power ti is the same on both sides; it follows that
the constant matrices with coefficient ti in both expressions must be equal. Writing these equations then fori from n down to 0, one finds

Finally, multiply the equation of the coefficients of ti from the left by Ai, and sum up:

The left-hand sides form a telescoping sum and cancel completely; the right-hand sides add up to p(A):

This completes the proof.


A proof using polynomials with matrix coefficients
This proof is similar to the first one, but tries to give meaning to the notion of polynomial with matrix coefficients that was suggested by the
expressions occurring in that proof. This requires considerable care, since it is somewhat unusual to consider polynomials with coefficients in a
non-commutative ring, and not all reasoning that is valid for commutative polynomials can be applied in this setting.

Notably, while arithmetic of polynomials over a commutative ring models the arithmetic of polynomial functions, this is not the case over a non-
commutative ring (in fact there is no obvious notion of polynomial function in this case that is closed under multiplication). So when considering
polynomials in t with matrix coefficients, the variable t must not be thought of as an "unknown", but as a formal symbol that is to be manipulated
according to given rules; in particular one cannot just sett to a specific value.

Let M(n, R) be the ring of n×n matrices with entries in some ring R (such as the real or complex numbers) that has A as an element. Matrices with polynomials in t as coefficients, such as t I_n − A or its adjugate B in the first proof, are elements of M(n, R[t]).

By collecting like powers of t, such matrices can be written as "polynomials" in t with constant matrices as coefficients; write M(n, R)[t] for the
set of such polynomials. Since this set is in bijection with M(n, R[t]), one defines arithmetic operations on it correspondingly, in particular
multiplication is given by

respecting the order of the coefficient matrices from the two operands; obviously this gives a non-commutative multiplication.

Thus, the identity

from the first proof can be viewed as one involving a multiplication of elements in
M(n, R)[t].

At this point, it is tempting to simply set t equal to the matrix A , which makes the first factor on the left equal to the null matrix, and the right
hand side equal to p(A); however, this is not an allowed operation when coefficients do not commute. It is possible to define a "right-evaluation
map" evA : M[t] → M, which replaces each ti by the matrix power Ai of A , where one stipulates that the power is always to be multiplied on the
right to the corresponding coefficient.

But this map is not a ring homomorphism: the right-evaluation of a product differs in general from the product of the right-evaluations. This is so
because multiplication of polynomials with matrix coefficients does not model multiplication of expressions containing unknowns: a product
is defined assuming thatt commutes with N, but this may fail if t is replaced by the matrixA.

One can work around this difficulty in the particular situation at hand, since the above right-evaluation map does become a ring homomorphism if
the matrix A is in the center of the ring of coefficients, so that it commutes with all the coefficients of the polynomials (the argument proving this
is straightforward, exactly because commutingt with coefficients is now justified after evaluation).

Now, A is not always in the center of M, but we may replace M with a smaller ring provided it contains all the coefficients of the polynomials in
question: , A, and the coefficients of the polynomial B. The obvious choice for such a subring is the centralizer Z of A, the subring of all
matrices that commute withA; by definition A is in the center of Z.

This centralizer obviously contains , and A, but one has to show that it contains the matrices . To do this, one combines the two fundamental
relations for adjugates, writing out the adjugateB as a polynomial:
Equating the coefficients shows that for each i, we have A Bi = Bi A as desired. Having found the proper setting in which evA is indeed a
homomorphism of rings, one can complete the proof as suggested above:

This completes the proof.

A synthesis of the first two proofs


In the first proof, one was able to determine the coefficients B_i of B based on the right-hand fundamental relation for the adjugate only. In fact the first n equations derived can be interpreted as determining the quotient B of the Euclidean division of the polynomial p(t)I_n on the left by the monic polynomial I_n t − A, while the final equation expresses the fact that the remainder is zero. This division is performed in the ring of polynomials with matrix coefficients. Indeed, even over a non-commutative ring, Euclidean division by a monic polynomial P is defined, and always produces a unique quotient and remainder with the same degree condition as in the commutative case, provided it is specified at which side one wishes P to be a factor (here that is to the left).

To see that the quotient and remainder are unique (which is the important part of the statement here), it suffices to write PQ + r = PQ′ + r′ as P(Q − Q′) = r′ − r, and to observe that since P is monic, P(Q − Q′) cannot have a degree less than that of P, unless Q = Q′.

But the dividend p(t)In and divisor Int−A used here both lie in the subring (R[A])[t], where R[A] is the subring of the matrix ring M(n, R)
generated by A: the R-linear span of all powers of A. Therefore, the Euclidean division can in fact be performed within that commutative
polynomial ring, and of course it then gives the same quotient B and remainder 0 as in the larger ring; in particular this shows that B in fact lies
in (R[A])[t].

But, in this commutative setting, it is valid to sett to A in the equation

; in other words, to apply the evaluation map

which is a ring homomorphism, giving

just like in the second proof, as desired.

In addition to proving the theorem, the above argument tells us that the coefficients Bi of B are polynomials in A, while from the second proof
we only knew that they lie in the centralizer Z of A; in general Z is a larger subring than R[A], and not necessarily commutative. In particular
the constant term B0= adj(−A) lies in R[A]. Since A is an arbitrary square matrix, this proves that adj(A) can always be expressed as a
polynomial in A (with coefficients that depend onA).

In fact, the equations found in the first proof allow successively expressing as polynomials in A, which leads to the identity
valid for all n×n matrices, where

is the characteristic polynomial ofA.

Note that this identity also implies the statement of the Cayley–Hamilton theorem: one may move adj(−A) to the right hand side, multiply the
resulting equation (on the left or on the right) byA, and use the fact that

A proof using matrices of endomorphisms


As was mentioned above, the matrix p(A) in statement of the theorem is obtained by first evaluating the determinant and then substituting the
matrix A for t; doing that substitution into the matrix before evaluating the determinant is not meaningful. Nevertheless, it is possible to
give an interpretation where p(A) is obtained directly as the value of a certain determinant, but this requires a more complicated setting, one of
matrices over a ring in which one can interpret both the entries of A, and all of A itself. One could take for this the ring M(n, R) of n×n
matrices over R, where the entry is realised as , and A as itself. But considering matrices with matrices as entries might cause
confusion with block matrices, which is not intended, as that gives the wrong notion of determinant (recall that the determinant of a matrix is
defined as a sum of products of its entries, and in the case of a block matrix this is generally not the same as the corresponding sum of products of
its blocks!). It is clearer to distinguish A from the endomorphism φ of an n-dimensional vector space V (or free R-module if R is not a field)
defined by it in a basis e1, ..., en, and to take matrices over the ring End(V) of all such endomorphisms. Then φ ∈ End(V) is a possible matrix
entry, while A designates the element of M(n, End(V)) whose i,j entry is endomorphism of scalar multiplication by ; similarly In will be
interpreted as element of M(n, End(V)). However, since End(V) is not a commutative ring, no determinant is defined on M(n, End(V)); this can
only be done for matrices over a commutative subring of End(V). Now the entries of the matrix all lie in the subring R[φ] generated by
the identity and φ, which is commutative. Then a determinant map M(n, R[φ]) → R[φ] is defined, and evaluates to the value p(φ)
of the characteristic polynomial of A at φ (this holds independently of the relation between A and φ); the Cayley–Hamilton theorem states that
p(φ) is the null endomorphism.

In this form, the following proof can be obtained from that of (Atiyah & MacDonald 1969, Prop. 2.4) (which in fact is the more general
statement related to the Nakayama lemma; one takes for the ideal in that proposition the whole ring R). The fact that A is the matrix of φ in the
basis e1, ..., en means that

One can interpret these asn components of one equation inVn, whose members can be written using the matrix-vector productM(n, End(V)) × Vn
→ Vn that is defined as usual, but with individual entries ψ∈ End(V) and v in V being "multiplied" by forming ; this gives:

where is the element whose component i is ei (in other words it is the basis e1, ..., en of V written as a column of vectors). Writing this
equation as

one recognizes the transpose of the matrix considered above, and its determinant (as element of M(n, R[φ])) is also p(φ). To derive
from this equation that p(φ) = 0 ∈ End(V), one left-multiplies by the adjugate matrix of , which is defined in the matrix ring M(n,
R[φ]), giving
The associativity of matrix-matrix and matrix-vector multiplication used in the first step is a purely formal property of those operations,
independent of the nature of the entries. Now component i of this equation says that p(φ)(ei) = 0 ∈ V; thus p(φ) vanishes on all ei, and since these
elements generate V it follows that p(φ) = 0 ∈ End(V), completing the proof.

One additional fact that follows from this proof is that the matrix A whose characteristic polynomial is taken need not be identical to the value φ
substituted into that polynomial; it suffices that φ be an endomorphism ofV satisfying the initial equations

for some sequence of elements e1,...,en that generate V (which space might have smaller dimension than n, or in case the ring R is not a field it
might not be a free module at all).

A bogus "proof": p(A) = det(AIn − A) = det(A − A) = 0


One persistent elementary but incorrect argument[18] for the theorem is to "simply" take the definition

and substitute A for λ, obtaining

There are many ways to see why this argument is wrong. First, in the Cayley–Hamilton theorem, p(A) is an n×n matrix. However, the right hand side
of the above equation is the value of a determinant, which is a scalar. So they cannot be equated unless n = 1 (i.e. A is just a scalar). Second, in
the expression , the variable λ actually occurs at the diagonal entries of the matrix . To illustrate, consider the
characteristic polynomial in the previous example again:

If one substitutes the entire matrixA for λ in those positions, one obtains

in which the "matrix" expression is simply not a valid one. Note, however, that if scalar multiples of identity matrices instead of scalars are
subtracted in the above, i.e. if the substitution is performed as

then the determinant is indeed zero, but the expanded matrix in question does not evaluate to ; nor can its determinant (a scalar) be
compared to p(A) (a matrix). So the argument that p(A) = det(AI_n − A) = 0 still does not apply.

Actually, if such an argument holds, it should also hold when other multilinear forms instead of the determinant are used. For instance, if we consider the permanent function and define q(λ) = perm(λ I_n − A), then by the same argument, we should be able to "prove" that q(A) = 0. But this statement is demonstrably wrong. In the 2-dimensional case, for instance, the permanent of a matrix is given by

So, for the matrix A in the previous example,


Yet one can verify that

One of the proofs for the Cayley–Hamilton theorem above bears some similarity to the argument that p(A) = det(AI_n − A) = 0. By introducing a matrix with non-numeric coefficients, one can actually let A live inside a matrix entry, but then that matrix is not equal to A, and the conclusion is reached differently.

Abstraction and generalizations


The above proofs show that the Cayley–Hamilton theorem holds for matrices with entries in any commutative ring R, and that p(φ) = 0 will hold whenever φ is an endomorphism of an R-module generated by elements e_1, ..., e_n that satisfies

This more general version of the theorem is the source of the celebrated Nakayama lemma in commutative algebra and algebraic geometry.

See also
Companion matrix

Remarks
1. Due to the non-commutative nature of the multiplication operation for quaternions and related constructions, care needs to be taken with definitions, most notably in this context, for the determinant. The theorem holds as well for the slightly less well-behaved split-quaternions, see Alagös, Oral & Yüce (2012). The rings of quaternions and split-quaternions can both be represented by certain 2 × 2 complex matrices. (When restricted to unit norm, these are the groups SU(2) and SU(1, 1) respectively.) Therefore it is not surprising that the theorem holds.

There is no such matrix representation for the octonions, since the multiplication operation is not associative in this case. However, a modified Cayley–Hamilton theorem still holds for the octonions, see Tian (2000).
2. An explicit expression for these coefficients is

where the sum is taken over the sets of all integer partitions k_l ≥ 0 satisfying the equation

3. See, e.g., p. 54 of Brown 1994, which solves Jacobi's formula,

where B is the adjugate matrix of the next section. There also exists an equivalent, related recursive algorithm introduced by Urbain Le Verrier and Dmitry Konstantinovich Faddeev—the Faddeev–LeVerrier algorithm, which reads

    M_0 = 0,  c_n = 1;
    M_k = A M_{k−1} + c_{n−k+1} I,  c_{n−k} = −(1/k) tr(A M_k),  for k = 1, ..., n;
    A^{−1} = −M_n / c_0

(see, e.g., p. 88 of Gantmacher 1960). Observe A^{−1} = −M_n / c_0 as the recursion terminates. See the algebraic proof in the following section, which relies on the modes of the adjugate, B_k ≡ M_{n−k}. Specifically, and the above derivative of p, when one traces it, yields

(Hou 1998), and the above recursions, in turn.

Notes
1. Crilly 1998
2. Cayley 1858, pp. 17–37
3. Cayley 1889, pp. 475–496
4. Hamilton 1864a
5. Hamilton 1864b
6. Hamilton 1862
7. Atiyah & MacDonald 1969
8. Hamilton 1853, p. 562
9. Zhang 1997
10. Frobenius 1878
11. Zeni & Rodrigues 1992
12. Barut, Zeni & Laufer 1994a
13. Barut, Zeni & Laufer 1994b
14. Laufer 1997
15. Curtright, Fairlie & Zachos 2014
16. Stein, William. Algebraic Number Theory, a Computational Approach (http://wstein.org/books/ant/ant.pdf) (PDF). p. 29.
17. Bhatia 1997, p. 7
18. Garrett 2007, p. 381

References
Alagös, Y.; Oral, K.; Yüce, S. (2012). "Split Quaternion Matrices". Miskolc Mathematical Notes. 13 (2): 223–232. ISSN 1787-2405 (open access)
Atiyah, M. F.; MacDonald, I. G. (1969), Introduction to Commutative Algebra, Westview Press, ISBN 978-0-201-40751-8
Barut, A. O.; Zeni, J. R.; Laufer, A. (1994a). "The exponential map for the conformal group O(2,4)". J. Phys. A: Math. Gen. 27 (15): 5239–5250. arXiv:hep-th/9408105. Bibcode:1994JPhA...27.5239B. doi:10.1088/0305-4470/27/15/022.
Barut, A. O.; Zeni, J. R.; Laufer, A. (1994b). "The exponential map for the unitary group SU(2,2)". J. Phys. A: Math. Gen. 27 (20): 6799–6806. arXiv:hep-th/9408145. Bibcode:1994JPhA...27.6799B. doi:10.1088/0305-4470/27/20/017.
Bhatia, R. (1997). Matrix Analysis. Graduate Texts in Mathematics. 169. Springer. ISBN 978-0387948461.
Brown, Lowell S. (1994). Quantum Field Theory. Cambridge University Press. ISBN 978-0-521-46946-3.
Cayley, A. (1858). "A Memoir on the Theory of Matrices". Philos. Trans. 148.
Cayley, A. (1889). The Collected Mathematical Papers of Arthur Cayley. (Classic Reprint). 2. Forgotten Books. ASIN B008HUED9O.
Crilly, T. (1998). "The young Arthur Cayley". Notes Rec. R. Soc. Lond. 52 (2): 267–282. doi:10.1098/rsnr.1998.0050.
Curtright, T. L.; Fairlie, D. B.; Zachos, C. K. (2014). "A compact formula for rotations as spin matrix polynomials". SIGMA. 10 (2014): 084. arXiv:1402.3541. Bibcode:2014SIGMA..10..084C. doi:10.3842/SIGMA.2014.084.
Frobenius, G. (1878). "Ueber lineare Substitutionen und bilineare Formen". J. Reine Angew. Math. 84: 1–63.
Gantmacher, F. R. (1960). The Theory of Matrices. NY: Chelsea Publishing. ISBN 978-0-8218-1376-8.
Garrett, Paul B. (2007). Abstract Algebra. NY: Chapman and Hall/CRC. ISBN 978-1584886891.
Hamilton, W. R. (1853). Lectures on Quaternions. Dublin.
Hamilton, W. R. (1862). "On the Existence of a Symbolic and Biquadratic Equation which is satisfied by the Symbol of Linear or Distributive Operation on a Quaternion". The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science. series iv. 24: 127–128. ISSN 1478-6435. Retrieved 2015-02-14.
Hamilton, W. R. (1864a). "On a New and General Method of Inverting a Linear and Quaternion Function of a Quaternion". Proceedings of the Royal Irish Academy. viii: 182–183. (communicated on June 9, 1862)
Hamilton, W. R. (1864b). "On the Existence of a Symbolic and Biquadratic Equation, which is satisfied by the Symbol of Linear Operation in Quaternions". Proceedings of the Royal Irish Academy. viii: 190–101. (communicated on June 23, 1862)
Hou, S. H. (1998). "Classroom Note: A Simple Proof of the Leverrier–Faddeev Characteristic Polynomial Algorithm". SIAM Review. 40 (3): 706–709. Bibcode:1998SIAMR..40..706H. doi:10.1137/S003614459732076X.
Householder, Alston S. (2006). The Theory of Matrices in Numerical Analysis. Dover Books on Mathematics. ISBN 978-0486449722.
Laufer, A. (1997). "The exponential map of GL(N)". J. Phys. A: Math. Gen. 30 (15): 5455–5470. arXiv:hep-th/9604049. Bibcode:1997JPhA...30.5455L. doi:10.1088/0305-4470/30/15/029.
Tian, Y. (2000). "Matrix representations of octonions and their application". Advances in Applied Clifford Algebras. 10 (1): 61–90. arXiv:math/0003166v2. CiteSeerX 10.1.1.237.2217. doi:10.1007/BF03042010. ISSN 0188-7009.
Zeni, J. R.; Rodrigues, W. A. (1992). "A thoughtful study of Lorentz transformations by Clifford algebras". Int. J. Mod. Phys. A. 7 (8): 1793 pp. Bibcode:1992IJMPA...7.1793Z. doi:10.1142/S0217751X92000776.
Zhang, F. (1997). "Quaternions and matrices of quaternions". Linear Algebra and its Applications. 251: 21–57. doi:10.1016/0024-3795(95)00543-9. ISSN 0024-3795 – via ScienceDirect (open archive).
External links
Hazewinkel, Michiel, ed. (2001) [1994], "Cayley–Hamilton theorem", Encyclopedia of Mathematics, Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN 978-1-55608-010-4
A proof from PlanetMath.
The Cayley–Hamilton theorem at MathPages

