
Matrix Tutorial 2: Matrix Operations

A tutorial on matrix operations: addition, subtraction, and multiplication of matrices. An introduction to multiplication and division of matrices by a scalar is provided. Includes determinants.

Dr. E. Garcia

Mi Islita.com

Email | Last Update: 07/09/06

Topics

Learning by Doing

Matrix Operations at Once

Addition and Subtraction of Matrices

Multiplication of Matrices

Multiplication and Division of Matrices by a Scalar

Orthogonal Matrices

Transpose and Inverse Properties

Determinants

Tutorial Review

References

Learning by Doing

In Part 1 of this tutorial we introduced the reader to different types of matrices, digraphs, and Markov chains. We used lots of graphics to help users visualize the concepts. Now it is time to discuss matrix operations. As mentioned before, only the most common and basic operations will be covered. Here we will use a learning-by-doing approach. Thus, rather than staring at some equations, you must do your part.

• Do a quick first reading of this tutorial. Don't skip sections. Don't worry if you don't completely understand a key concept the first time. This first scan is analogous to the visual scanning of key concepts you did in Part 1. The idea is to place some global weights of knowledge in your "database" (mind) and later associate with each concept a local weight of specific knowledge. Incidentally, this teaching approach resembles the way term weights are computed; i.e., by considering global and local information.

• Once you have finished, go back and read each section again carefully. This is when you are going to concept-map text to images, form associations, and execute.

• By execute I mean that each time you encounter an equation or figure describing some calculations, try to replicate the calculations from scratch. Don't skip sections.

• Once you have finished, we suggest you invent your own exercises and solve them. If you prefer, use tabular data, for example from the sports or business section of a newspaper.

• Try to solve the exercises presented in the review section.

Matrix Operations at Once

The rules for addition, subtraction, multiplication, and division involving matrices are as follows. Let us first assume that matrices A and B are used to construct matrix Z. It must follow that for

• Addition: Z = A + B; zij = aij + bij

• Subtraction: Z = A - B; zij = aij - bij

• Multiplication: Z = A*B, if # columns in A = # rows in B; zij = ai1*b1j + ai2*b2j + ai3*b3j + ... + ain*bnj

The rules for multiplication and division of a matrix by a scalar (a real number) are

simpler. If matrix Z is constructed by multiplying all elements of matrix A by a scalar c, then

its elements are zij = c*aij. In an analogous manner, dividing matrix A by c gives zij = (1/c)*aij.

All these operations are illustrated in Figure 1. Let's revisit these one by one.

Figure 1. Some matrix operations.

Addition and Subtraction of Matrices

To add or subtract matrices, they must be of identical order. This just means that the matrices involved must have the same number of rows and columns. If they don't have the same number of rows and columns, we cannot add or subtract them.

The expression

zij = aij + bij

means "to the element in row i, column j of matrix A, add the element in row i, column j of matrix B". If we do this with each element of A and B we end up with matrix Z. An example is given in Figure 2.

Figure 2. Addition operation.

means "to element in row i, column j of matrix A deduct element in row i, column j of matrix

B". If we do this with each element of A and B we end with matrix Z. See Figure 3.

Multiplication of Matrices

Consider two matrices A and B with the following characteristics: the number of columns in

A equals the number of rows in B. These are conformable with respect to one another, and

they can be multiplied together to form a new matrix Z.

The expression

zij = ai1*b1j + ai2*b2j + ai3*b3j + ... + ain*bnj

means "add the products obtained by multiplying the elements in row i of matrix A by the corresponding elements in column j of matrix B". Figure 4 illustrates what we mean by this statement.

Matrix multiplication has a catch, as we mentioned before. The order in which we multiply terms does matter. The reason is that we need to multiply row elements by column elements, one by one. Therefore A*B and B*A can produce different results. We say "can produce" because there exist special cases in which the operation is commutative (order does not matter). An example is when we deal with diagonal matrices. Diagonal matrices were described in Part 1.
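As a quick check of this point, here is a small sketch showing that A*B and B*A generally differ, while diagonal matrices commute. The matrices are made up for illustration.

```python
# Matrix multiplication is generally not commutative.
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(A @ B)                          # rows of A times columns of B
print(B @ A)                          # a different result
print(np.array_equal(A @ B, B @ A))   # False

# Diagonal matrices are one of the special cases where order does not matter.
D1 = np.diag([2, 3])
D2 = np.diag([4, 5])
print(np.array_equal(D1 @ D2, D2 @ D1))  # True
```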

Multiplication and Division of Matrices by a Scalar

The rules for multiplication and division of a matrix by a scalar are similar. Since multiplying a number x by 1/c is the same as dividing x by c, let's consider these operations at once.

If all elements of matrix A are multiplied by a scalar c to construct matrix Z, then zij = c*aij. Similarly, dividing matrix A by c gives zij = (1/c)*aij. The expression

zij = c*aij

means "multiply each element in row i, column j by c", and the expression

zij = (1/c)*aij

means "divide each element in row i, column j by c". These two operations are shown in Figure 5, where c = 2.

Figure 6 shows that a scalar matrix is obtained when an identity matrix is multiplied by a scalar. As we will see in Part 3 of this tutorial, subtracting a scalar matrix from a regular matrix is an important operation.
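Since Figures 5 and 6 are not reproduced here, the following sketch illustrates the same operations with NumPy; the values are illustrative only.

```python
# Multiplying/dividing a matrix by a scalar, and building a scalar matrix
# from the identity matrix.
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
c = 2

print(c * A)        # z_ij = c * a_ij
print(A / c)        # z_ij = (1/c) * a_ij

I = np.eye(2)
print(c * I)        # a scalar matrix: c on the diagonal, zeros elsewhere
```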

Orthogonal Matrices

A regular matrix M (one whose determinant is not equal to zero) is said to be orthogonal if multiplying it by its transpose yields the identity matrix I; i.e., M*M^T = I. Orthogonal matrices have interesting properties. If M is orthogonal:

1. its transpose and inverse are identical: M^T = M^-1.

2. the product with its transpose is commutative: M*M^T = M^T*M.

3. its transpose is also an orthogonal matrix.

4. when multiplied by an orthogonal matrix, the product is an orthogonal matrix.

5. its determinant is +/- 1. The reverse is not necessarily true; i.e., not all matrices whose determinant is +/- 1 are orthogonal.

6. the sum of the squares of the elements in a given row or column is equal to 1.

7. the sum of the products of corresponding elements in any two different rows or columns (i.e., their dot product) is equal to zero.

Conversely, a square matrix (one with the same number of rows and columns) is orthogonal if both of the following conditions hold:

1. the sum of the squares of the elements in every row or column is equal to 1.

2. the sum of the products of corresponding elements in every pair of rows or columns (i.e., their dot products) is equal to zero.

As we can see, it is quite easy to determine whether a regular or square matrix is orthogonal. Just check for these properties.
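A small sketch of these checks follows; the rotation matrix used below is a standard, assumed example of an orthogonal matrix (it is not taken from the tutorial's figures).

```python
# Checking the orthogonality conditions listed above.
import numpy as np

theta = np.pi / 6
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(M @ M.T, np.eye(2)))           # M * M^T = I
print(np.allclose(M.T, np.linalg.inv(M)))        # M^T = M^-1
print(np.allclose((M ** 2).sum(axis=1), 1.0))    # squared row elements sum to 1
print(np.isclose(abs(np.linalg.det(M)), 1.0))    # determinant is +/- 1
```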

Transpose and Inverse Properties

(ABC)^T = C^T * B^T * A^T

(ABC)^-1 = C^-1 * B^-1 * A^-1

A^-1 * A = A * A^-1 = I

Since matrix division is not defined, it is impossible to divide a matrix expression by a given

matrix. However, the desired effect is achieved by multiplying the expression by the inverse

of the given matrix (2).
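Here is a short sketch verifying the reverse-order rules above and the "multiply by the inverse instead of dividing" idea; the matrices are arbitrary invertible examples, not ones from the text.

```python
# Reverse-order rules for transposes and inverses, and division via the inverse.
import numpy as np

A = np.array([[2., 1.], [0., 3.]])
B = np.array([[1., 4.], [2., 1.]])
C = np.array([[3., 0.], [1., 2.]])

print(np.allclose((A @ B @ C).T, C.T @ B.T @ A.T))       # (ABC)^T = C^T B^T A^T
print(np.allclose(np.linalg.inv(A @ B @ C),
                  np.linalg.inv(C) @ np.linalg.inv(B) @ np.linalg.inv(A)))

# "Dividing" an expression by B is achieved by multiplying by B's inverse:
X = A @ B
print(np.allclose(X @ np.linalg.inv(B), A))
```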

Determinants

A determinant is a function that associates a scalar with a square matrix. This scalar can assume any real value, including zero. A matrix with a nonzero determinant is an invertible matrix (we can calculate its inverse matrix). If the determinant is zero (det = 0), the matrix is called a noninvertible matrix. Don't worry about matrix inversions yet.

To indicate that we are referring to the determinant of A and not to matrix A, we surround the symbol A with pipes ("|"). The symbolic definition of the determinant of a matrix A is shown in Figure 7, along with the cases m = n = 2 and m = n = 3.

In the figure, the second subscripts within each term are all distinct, the number of terms is n!, and v is the number of inversions of the second subscripts. Thus, the determinant of a matrix of order n = 2 has two terms and 1 negative sign, and the determinant of a matrix of order n = 3 has 6 terms and 3 negative signs. Sample calculations are given in Figure 8.

There are other methods for solving determinants (triangularization, reduction methods, etc.). For large matrices there are plenty of software solutions to choose from.

If the determinant of a square matrix is not zero, the matrix is described as a regular matrix. If the determinant is zero, the matrix is described as a singular matrix. The problem of transforming a regular matrix into a singular matrix is referred to as the eigenvalue problem. The eigenvalue problem and two important concepts, eigenvalues and eigenvectors, will be explained in Part 3 of this tutorial.
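As a quick illustration of the regular/singular distinction, here is a sketch that classifies a matrix by its determinant; the example matrices are made up.

```python
# Classify a square matrix as regular (invertible) or singular by its determinant.
import numpy as np

def classify(M):
    det = np.linalg.det(M)
    return "regular (invertible)" if not np.isclose(det, 0.0) else "singular (noninvertible)"

A = np.array([[1., 2.], [3., 4.]])   # det = -2
B = np.array([[1., 2.], [2., 4.]])   # det = 0 (second row is twice the first)

print(classify(A))
print(classify(B))
```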

Tutorial Review

1. Create two different matrices A and B, both of order n = 2. Show that A*B and B*A produce different results.

2. Consider the m = n = 2 matrix with the following elements: a11 = -18; a12 = 29; a21 = 30; a22 = 4. Calculate its trace and its determinant. Is this a regular or a singular matrix? Is this an invertible or a noninvertible matrix?

3. Calculate the transpose matrices for the matrices shown in Figure 7. Calculate the determinants of the transposed matrices. Are these regular or singular matrices? Are these invertible or noninvertible matrices?

References

1. Graphical Exploratory Data Analysis; S.H.C. du Toit, A.G.W. Steyn and R.H. Stumpf; Springer-Verlag (1986).

2. Handbook of Applied Mathematics for Engineers and Scientists; Max Kurtz; McGraw-Hill (1991).

Matrix Tutorial 3: Eigenvalues and Eigenvectors

A tutorial on eigenvalues, eigenvectors and their properties.

Includes step by step how-to calculations. An introduction

to vector iteration, the Power Method and the Deflation

Method is provided.

Dr. E. Garcia

Mi Islita.com

Email | Last Update: 07/17/06

Topics

The Eigenvalue Problem

Calculating Eigenvalues

Eigenvectors

Properties of Eigenvalues and Eigenvectors

Computing Eigenvectors from Eigenvalues

Computing Eigenvalues from Eigenvectors

The Power Method (Vector Iteration)

The Deflation Method

Why should we care about all this?

Beware of Link Model Speculators

Tutorial Review

References

Putting Everything Together

In Part 1 of this three-part tutorial we defined different types of matrices. We covered digraphs, stochastic matrices, and Markov chains. We also mentioned how some search engine marketers have derived blogonomies out of these and similar concepts.

In Part 2 we covered the addition, subtraction, and multiplication of matrices. We also discussed multiplication and division of matrices by a scalar and the calculation of determinants from square matrices. We mentioned that if a determinant has a nonzero value, its matrix is described as regular, and that if a determinant has zero value, its matrix is described as singular.

It is now time to put everything together, to demystify eigenvalues, eigenvectors, and present

some practical applications.

The Eigenvalue Problem

Recall from Part 2 that multiplying an identity matrix I by a scalar c produces a scalar matrix Z = c*I. Subtracting this from a regular matrix A gives a new matrix A - c*I.

Equation 1: A - Z = A - c*I

Now suppose we choose c so that the determinant of this new matrix vanishes,

Equation 2: |A - c*I| = 0

and A has been transformed into a singular matrix. The problem of transforming a regular matrix into a singular matrix is referred to as the eigenvalue problem.

Note that subtracting c*I from A is equivalent to subtracting the scalar c from the main diagonal of A. For the determinant of the new matrix to vanish, the trace of A must be equal to the sum of specific values of c. For which values of c?

Calculating Eigenvalues

Figure 1. The eigenvalue problem.

In the figure we started with a matrix A of order n = 2 and subtracted from it the matrix Z = c*I. Applying the method of determinants for m = n = 2 matrices discussed in Part 2 gives

|A - c*I| = c^2 - 17*c + 42 = 0

c1 = 3 and c2 = 14.

Note that c1 + c2 = 17, confirming that these characteristic values must add up to the trace of the original matrix A (13 + 4 = 17).

The polynomial expression we just obtained is called the characteristic equation, and the c values are termed the latent roots or eigenvalues of matrix A. These are the values of c for which the determinant vanishes (|A - c*I| = 0).

The fraction of the trace accounted for by the largest root is

c2/trace = 14/17 = 0.824, or 82.4%

Thus, c2 = 14 is the largest eigenvalue, accounting for more than 82% of the trace. The largest eigenvalue of a matrix is also called the principal eigenvalue.
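Figure 1 is not reproduced here, so the sketch below uses an assumed 2 x 2 matrix that is consistent with the values quoted in the text (trace 13 + 4 = 17, determinant 42, eigenvalues 3 and 14); it is a reconstruction, not the figure itself.

```python
# Recovering the eigenvalues from the characteristic equation of a 2x2 matrix.
import numpy as np

A = np.array([[13., 5.],
              [2.,  4.]])   # assumed reconstruction of the Figure 1 matrix

# For a 2x2 matrix, |A - c*I| = c^2 - trace(A)*c + det(A) = 0
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
print(np.roots(coeffs))        # roots 14 and 3
print(np.linalg.eigvals(A))    # the same eigenvalues, computed directly
```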

There are many scenarios, as in Principal Component Analysis (PCA) and Singular Value Decomposition (SVD), in which some eigenvalues are so small that they are ignored. The remaining eigenvalues are then added together to compute an estimated fraction. This estimate is then used as a correlation criterion for the so-called Rank Two approximation.

SVD and PCA are techniques used in cluster analysis. In information retrieval, SVD is used

in Latent Semantic Indexing (LSI) while PCA is used in Information Space (IS). These will

be discussed in upcoming tutorials.

Now that the eigenvalues are known, these are used to compute the latent vectors of matrix

A. These are the so-called eigenvectors.

Eigenvectors

For each eigenvalue ci we can form the matrix

Equation 3: A - ci*I

Multiplying this by a column vector Xi with the same number of rows as A and setting the result to zero leads to

Equation 4: (A - ci*I)*Xi = 0

This is a system of homogeneous equations, and since |A - ci*I| = 0 the system has an infinite number of solutions.

Corresponding to every eigenvalue ci is a set of eigenvectors Xi, the number of eigenvectors in the set being infinite. Furthermore, eigenvectors that correspond to different eigenvalues are linearly independent of one another.

At this point it might be a good idea to highlight several properties of eigenvalues and eigenvectors. The following pertain only to the matrices we are discussing here.

• the absolute value of the determinant of A (|det A|) is the product of the absolute values of the eigenvalues of matrix A.

• c = 0 is an eigenvalue of A if A is a singular (noninvertible) matrix.

• if A is an n x n triangular matrix (upper triangular, lower triangular) or a diagonal matrix, the eigenvalues of A are the diagonal entries of A.

• A and its transpose matrix have the same eigenvalues.

• the eigenvalues of a symmetric matrix are all real.

• the eigenvectors of a symmetric matrix are orthogonal, but only for distinct eigenvalues.

• the dominant or principal eigenvector of a matrix is an eigenvector corresponding to the eigenvalue of largest magnitude (for real numbers, largest absolute value) of that matrix.

• for a transition matrix, the dominant eigenvalue is always 1.

• the smallest eigenvalue of matrix A is the reciprocal of the largest eigenvalue of A^-1, i.e., of the inverse of A.

Computing Eigenvectors from Eigenvalues

If we know an eigenvalue, its eigenvector can be computed. The reverse process is also possible; i.e., given an eigenvector, its corresponding eigenvalue can be calculated.

Let's use the example of Figure 1 to compute an eigenvector for c1 = 3. From Equation 2 we write

Note that c1 = 3 gives a set with an infinite number of eigenvectors. For the other eigenvalue, c2 = 14, we obtain

Figure 3. Eigenvectors for eigenvalue c2 = 14.

As shown in Figure 4, plotting these vectors confirms that eigenvectors corresponding to different eigenvalues are linearly independent of one another. Note that each eigenvalue produces an infinite set of eigenvectors, all being multiples of a normalized vector. So, instead of plotting candidate eigenvectors for a given eigenvalue, one could simply represent an entire set by its normalized eigenvector. This is done by rescaling coordinates; in this case, by taking coordinate ratios. In our example, the coordinates of these normalized eigenvectors are:

1. (0.5, -1) for c1 = 3.

2. (1, 0.2) for c2 = 14.

Figure 4. Eigenvectors for different eigenvalues are linearly independent.

Mathematicians love to normalize eigenvectors in terms of their Euclidean length (L), so all vectors are unit length. To illustrate, in the preceding example the coordinates of the two eigenvectors are (0.5, -1) and (1, 0.2). Their lengths are

for c1 = 3: L = [0.5^2 + (-1)^2]^1/2 = 1.12

for c2 = 14: L = [1^2 + 0.2^2]^1/2 = 1.02

Dividing each coordinate by the corresponding length gives the unit eigenvectors:

for c1 = 3: (0.5/1.12, -1/1.12) = (0.45, -0.89)

for c2 = 14: (1/1.02, 0.2/1.02) = (0.98, 0.20)

You can do the same and normalize eigenvectors to your heart's content, but it is time consuming (and boring). Fortunately, if you use software packages, these will return unit eigenvectors for you by default.
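For instance, NumPy returns unit-length eigenvectors. The sketch below uses the same assumed reconstruction of the Figure 1 matrix introduced earlier.

```python
# numpy.linalg.eig returns unit-length eigenvectors, one per column.
import numpy as np

A = np.array([[13., 5.],
              [2.,  4.]])   # assumed reconstruction of the Figure 1 matrix

values, vectors = np.linalg.eig(A)
print(values)                            # eigenvalues, 14 and 3
print(vectors)                           # one unit eigenvector per column
print(np.linalg.norm(vectors, axis=0))   # each column has length 1
```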

Computing Eigenvalues from Eigenvectors

This is a lot easier to do. First we rearrange Equation 4. Since I*X = X (the identity matrix acts like 1), we can write the general expression

Equation 5: A*X = c*X

Now to illustrate calculations let's use the example given by Professor C.J. (Keith) van

Rijsbergen in chapter 4, page 58 of his great book The Geometry of Information Retrieval (3),

which we have reviewed already.

This result can be confirmed by simply computing the determinant of A and calculating the latent roots. This should give two latent roots or eigenvalues, c = 4^(1/2) = +/- 2. That is, one eigenvalue must be c1 = +2 and the other must be c2 = -2. This also confirms that c1 + c2 = trace of A, which in this case is zero.

An alternate method for computing eigenvalues from eigenvectors consists of calculating the so-called Rayleigh Quotient:

c = (X^T*A*X) / (X^T*X)

For the example given in Figure 5, X^T*A*X = 36 and X^T*X = 18; hence, c = 36/18 = 2.

Rayleigh Quotients give you eigenvalues in a straightforward manner. You might want to use this method instead of inspection, or as a double-checking method. You can also use it in combination with other iterative methods like the Power Method.
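The matrix from van Rijsbergen's example is not reproduced in this text, so the sketch below uses an assumed stand-in with eigenvalues +/- 2 and eigenvector X = (3, 3), which reproduces the quoted values X^T*A*X = 36 and X^T*X = 18.

```python
# Rayleigh Quotient: c = (X^T A X) / (X^T X).
import numpy as np

A = np.array([[0., 2.],
              [2., 0.]])    # assumed stand-in with eigenvalues +2 and -2
X = np.array([3., 3.])      # an eigenvector for eigenvalue 2

rayleigh = (X @ A @ X) / (X @ X)
print(X @ A @ X, X @ X, rayleigh)   # 36.0 18.0 2.0
```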

Eigenvalues can be ordered in terms of their absolute values to find the dominant or largest

eigenvalue of a matrix. Thus, if two distinct hypothetical matrices have the following set of

eigenvalues

• 5, 8, -7; then |8| > |-7| > |5| and 8 is the dominant eigenvalue.

• 0.2, -1, 1; then |1| = |-1| > |0.2| and since |1| = |-1| there is no dominant eigenvalue.

The Power Method (Vector Iteration)

One of the simplest methods for finding the largest eigenvalue and eigenvector of a matrix is the Power Method, also called the Vector Iteration Method. The method fails if there is no dominant eigenvalue. The procedure is as follows:

1. Assign to the candidate matrix an arbitrary eigenvector with at least one element being nonzero.

2. Compute a new eigenvector.

3. Normalize the eigenvector, where the normalization scalar is taken as an initial eigenvalue.

4. Multiply the original matrix by the normalized eigenvector to calculate a new eigenvector.

5. Normalize this eigenvector, where the normalization scalar is taken as the new eigenvalue.

6. Repeat the entire process until the absolute relative error between successive eigenvalues satisfies an arbitrary tolerance (threshold) value.

It cannot get any easier than this. Let's take a look at a simple example.

Figure 6. Power Method for finding an eigenvector with the largest eigenvalue.

What we have done here is repeatedly apply the matrix to an arbitrarily chosen starting vector. The result converges nicely to the largest eigenvalue of the matrix.
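Below is a compact sketch of the steps listed above, applied to the same assumed reconstruction of the Figure 1 matrix. It normalizes by the largest-magnitude element, so that the normalization scalar serves as the eigenvalue estimate.

```python
# Power Method (vector iteration) for the dominant eigenvalue/eigenvector.
import numpy as np

def power_method(A, x0, tol=1e-8, max_iter=100):
    x = np.asarray(x0, dtype=float)
    eigenvalue = 0.0
    for _ in range(max_iter):
        y = A @ x                                   # multiply matrix by current vector
        new_eigenvalue = y[np.argmax(np.abs(y))]    # normalization scalar
        x = y / new_eigenvalue                      # normalized eigenvector estimate
        if abs(new_eigenvalue - eigenvalue) < tol * abs(new_eigenvalue):
            break
        eigenvalue = new_eigenvalue
    return eigenvalue, x

A = np.array([[13., 5.],
              [2.,  4.]])   # assumed reconstruction of the Figure 1 matrix
c, x = power_method(A, [1., 0.])
print(c)   # ~14, the largest eigenvalue
print(x)   # ~(1.0, 0.2), the eigenvector in relative coordinates
```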

Figure 7 provides a visual representation of the iteration process obtained through the Power

Method for the matrix given in Figure 3. As expected, for its largest eigenvalue the iterated

vector converges to an eigenvector of relative coordinates (1, 0.20).

Figure 7. Visual representation of vector iteration.

It can be demonstrated that choosing an initial vector whose first element is 1 and whose other elements are zero produces, in the next iteration step, a vector whose elements are the first column of the matrix. Thus, one could simply choose the first column of a matrix as an initial seed.

Whether or not you try a matrix column as an initial seed, keep in mind that the rate of convergence of the Power Method depends on the nature of the eigenvalues. For closely spaced eigenvalues, the rate of convergence can be slow. Several methods for improving the rate of convergence have been proposed (Shifted Iteration, Shifted Inverse Iteration, or transformation methods). I will not discuss these at this time.

The Deflation Method

There are different methods for finding subsequent eigenvalues of a matrix. I will discuss only one of these: the Deflation Method. Deflation is a straightforward approach. Essentially, this is what we do (a short sketch follows the list):

1. First, we use the Power Method to find the largest eigenvalue and eigenvector of matrix A.

2. Multiply the largest eigenvector by its transpose and then by the largest eigenvalue. This produces the matrix Z* = c*X*X^T.

3. Compute a new matrix A* = A - Z* = A - c*X*X^T.

4. Apply the Power Method to A* to compute its largest eigenvalue. This in turn should be the second largest eigenvalue of the initial matrix A.
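The sketch below applies these steps to the same assumed reconstruction of the Figure 1 matrix. For brevity, the dominant eigenpair is taken from NumPy rather than the power-method sketch above; the eigenvector is normalized to unit length before forming Z*. This simple rank-one deflation recovers the second eigenvalue for this 2 x 2 example; for larger nonsymmetric matrices a more careful deflation (using left eigenvectors) may be needed.

```python
# Deflation: remove the dominant eigenvalue, then look for the next one.
import numpy as np

A = np.array([[13., 5.],
              [2.,  4.]])   # assumed reconstruction of the Figure 1 matrix

values, vectors = np.linalg.eig(A)
i = np.argmax(np.abs(values))
c1, x1 = values[i], vectors[:, i]     # dominant eigenvalue (~14) and unit eigenvector

Z_star = c1 * np.outer(x1, x1)        # Z* = c * X * X^T
A_star = A - Z_star                   # deflated matrix

# The dominant eigenvalue of A* is the second eigenvalue of A, c = 3.
print(np.max(np.linalg.eigvals(A_star)))
```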

Figure 8 shows deflation in action for the example given in Figures 1 and 2. After a few iterations the method converges smoothly to the second largest eigenvalue of the matrix. Neat!

Figure 8. Finding the second largest eigenvalue with the Deflation Method.

Note. We want to thank Mr. William Cotton for pointing out an error in the original version of this figure, which was then compounded in the calculations. These have since been corrected. After the corrections, deflation was still able to reach the correct second eigenvalue of c = 3. Results can be double-checked using Rayleigh Quotients.

We can use deflation to find subsequent eigenvector-eigenvalue pairs, but there is a point

wherein rounding error reduces the accuracy below acceptable limits. For this reason other

methods, like Jacobi's Method, are preferred when one needs to compute many or all

eigenvalues of a matrix.

Why should we care about all this?

Armed with this knowledge, you should be able to better understand articles that discuss link models like PageRank, their advantages and limitations, and when these succeed or fail and why. The assumption behind these models is that surfing the web by jumping from link to link is like a random walk describing a Markov chain process over a set of linked web pages.

The matrix is considered the transition probability matrix of the Markov chain, having elements strictly between zero and one. For such matrices the Perron-Frobenius Theorem tells us that the largest eigenvalue of the matrix is equal to one (c = 1) and that the corresponding eigenvector, which satisfies the equation

Equation 7: A*X = X

does exist and is the principal eigenvector (state vector) of the Markov chain, with the elements of X being the pageranks. Thus, according to theory, iteration should enable one to compute the largest eigenvalue and this principal eigenvector, whose elements are the pageranks of the individual pages.
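Here is a hedged sketch of this idea on a small made-up 3-page link matrix (it is not Google's matrix or one from the text). Since the matrix below is row-stochastic, the state vector is iterated with the transposed (column-stochastic) form, so the fixed point satisfies A^T*x = x; this is the usual convention for stationary distributions.

```python
# Iterating a transition matrix toward its principal eigenvector (eigenvalue 1).
import numpy as np

# Row i holds the probabilities of moving from page i to each page.
A = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4],
              [0.5, 0.3, 0.2]])

x = np.ones(3) / 3
for _ in range(100):
    x = A.T @ x
    x = x / x.sum()

print(x)                          # stationary distribution (principal eigenvector)
print(np.allclose(A.T @ x, x))    # satisfies the eigenvalue-1 equation
```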

Beware of Link Model Speculators

If you are interested in reading how PageRank is computed, stay away from speculators, especially from search engine marketers. It is hard to find accurate explanations in SEO or SEM forums or from those that sell link-based services. I suggest instead that you read university research articles from those that have conducted serious research work on link graphs and PageRank-based models. Great explanations are all over the place. However, some of these are derivative work and might not reflect how Google actually implements PageRank these days (only those at Google know or should know this, or whether PageRank has been phased out for something better). Still, these research papers are based on experimentation and their results are verifiable.

There is a scientific paper I would like readers to at least consider: Link Analysis, Eigenvectors and Stability, by Ng, Zheng and Jordan of the University of California, Berkeley (5). In this paper the authors use many of the topics described herein to explain the HITS and PageRank models. Regarding the latter they write:

Figure 9. PageRank explanation, according to Ng, Zheng and Jordan of the University of California, Berkeley.

Note that the last equation in Figure 9 is of the form A*X = X as in Equation 7; that is, p is

the principal eigenvector (p = X) and can be obtained through iterations.

After completing this 3-part tutorial you should be able to grasp the gist of this paper. The

group even made an interesting connection between HITS and LSI (latent semantic indexing).

If you are a student looking for a good term paper on Perron-Frobenius Theory and PageRank computations, I recommend the term paper by Jacob Miles Prystowsky and Levi Gill, Calculating Web Page Authority Using the PageRank Algorithm (6). This paper discusses PageRank and some how-to calculations involving the Power Method we have described.

How many iterations are required to compute PageRank values? Only Google knows. According to this Perron-Frobenius review from Professor Stephen Boyd of Stanford (7), the original paper on Google claims that for 24 million pages 50 iterations were required. A lot of things have changed since then, including methods for improving PageRank and new flaws discovered in this and similar link models. These flaws have been the result of the commercial nature of the Web. Not surprisingly, models that work well under controlled, noise-free conditions often fail miserably when transferred to a noisy environment. These topics will be discussed in detail in upcoming articles.

Meanwhile, if you are still thinking that this entire numerical apparatus validates the notion that on the Web links can be equated to votes of citation importance, or that the treatment validates the link citation-literature citation analogy a la Eugene Garfield's Impact Factors, think again. This has been one of the biggest fallacies around, promoted by many link spammers, a few IR practitioners, and several search engine marketers with vested interests.

Literature citation and Impact Factors are driven by editorial policies and peer review. On the Web anyone can add, remove, or exchange links at any time for any reason whatsoever. Anyone can buy, sell, or trade links for any sort of vested interest, or overwrite links at will. In such a noisy environment, far from the controlled conditions observed in a computer lab, peer review and citation policies are almost absent or, at best, contaminated by commercialization. Evidently, under such circumstances the link citation-literature citation analogy, and the notion that a link is a vote of citation importance for the content of a document, cannot be sustained.

Tutorial Review

1. Show that multiplying an identity matrix I by a scalar c produces a scalar matrix Z; i.e., Z = c*I.

2. Prove that subtracting c*I from a regular matrix A is equivalent to subtracting the scalar c from the diagonal of A.

3. Given the following matrix,

Prove that these are indeed the three eigenvalues of the matrix. Calculate the

corresponding eigenvectors.

4. Use the Power Method to calculate the largest eigenvalue of the matrix given in

Exercise 3.

5. Use the Deflation Method to calculate the second largest eigenvalue of the matrix

given in Exercise 3.

References

1. Graphical Exploratory Data Analysis; S.H.C. du Toit, A.G.W. Steyn and R.H. Stumpf; Springer-Verlag (1986).

2. Handbook of Applied Mathematics for Engineers and Scientists; Max Kurtz; McGraw-Hill (1991).

3. The Geometry of Information Retrieval; C.J. (Keith) van Rijsbergen; Cambridge University Press (2004).

4. Lecture 8: Eigenvalue Equations; S. Xiao; University of Iowa.

5. Link Analysis, Eigenvectors and Stability; Ng, Zheng and Jordan; University of California, Berkeley.

6. Calculating Web Page Authority Using the PageRank Algorithm; Jacob Miles Prystowsky and Levi Gill; College of the Redwoods, Eureka, CA (2005).

7. Perron-Frobenius review; Stephen Boyd; EE363: Linear Dynamical Systems, Stanford University, Winter Quarter (2005-2006).


Matrix Tutorial 1

A matrix tutorial. Includes square, triangular, scalar, transpose, and stochastic matrices. Also covers the rank of a matrix, digraphs, and Markov chains.

Dr. E. Garcia

Mi Islita.com

Email | Last Update: 07/11/06

Topics

Principal and Trace of a Square Matrix

Row Vectors, Column Vectors, Scalar and Transpose Matrices

The Rank of a Matrix

Demystifying Stochastic Matrices

Digraphs, Indegrees and Outdegrees

Markov Chains and Link Models

SEO Blogonomies: The Search Engine Markov Chain

What's Next?

Tutorial Review

References

About this Math Tutorial

This tutorial introduces matrices, eigenvalues, and eigenvectors to IR students and search engine marketers. In Part 1 we go through some definitions and familiarize readers with different types of matrices. Emphasis is given to stochastic matrices. In Part 2 we stop momentarily to explain some basic matrix operations. Part 3 demystifies eigenvalues and eigenvectors, showing how to calculate them.

We hope that presenting the material in this order, i.e., visualization of matrices first, followed by matrix operations, might help students associate math operations with what they have already visualized. Currently, many matrix tutorials intermingle execution with visualization, forcing students to stop and do a one-by-one mapping between text and graphics before processing new material. In our opinion that approach injects an unnecessary level of difficulty into the discourse.

By separating visualization from execution, by the end of this tutorial the reader will be able to discriminate between different types of matrices. Students will be able to identify key concepts such as the rank of a matrix, digraphs, and Markov chains without resorting to math operations.

We do not pretend to make this tutorial a comprehensive review. Rather, the material is limited to what we think might be relevant to link models and cluster structures. Applications and examples are provided.

Most of the material and examples are taken from two great books (1, 2) I read way back in grad school, before the inception of commercial search engines (Google, Yahoo, MSN, etc.) on the Web scene:

1. Graphical Exploratory Data Analysis; S.H.C. du Toit, A.G.W. Steyn and R.H. Stumpf; Springer-Verlag (1986).

2. Handbook of Applied Mathematics for Engineers and Scientists; Max Kurtz; McGraw-Hill (1991).

Why did we write this tutorial for an audience consisting of IR students and search marketers? Well, there are plenty of reasons. Consider this:

• eigenvectors can be used to understand link models and networks.

• eigenvalues, eigenvectors, stochastic matrices, Markov chains, etc., are used to understand dissimilar random processes.

• often students and search engine marketers find these concepts too abstract or complex to understand.

• research articles about these topics are often misquoted in SEM discussion forums and SEO blogs, and key concepts become "blogonomies".

Thus, one of the goals of this tutorial is to help our audience grasp these concepts, while we dispel some myths. This way, the next time a reader encounters these topics he or she can grasp the gist of the discourse, or at least a good portion of it.

Let us first define what a matrix is and go through some basic definitions.

A matrix is just a rectangular array of rows (m) and columns (n); that is, a table. Thus, tabular data entered into an Excel spreadsheet can be viewed as a matrix. If you run a mom-and-pop business and for some reason you have arranged numbers or letters in rows and columns, you have handled matrices already.

If a matrix has the same number of rows (m) and columns (n), it is termed a square matrix; i.e., m = n. The matrix is said to be of the nth order, or of order n. Thus, an array consisting of two rows and two columns is a square matrix of order m = n = 2, and an array consisting of three rows and three columns is a square matrix of order m = n = 3.

Elements of a matrix are identified by assigning subscripts to rows and columns. Thus, for

matrix A its elements are aij. For instance, a32 means element in row 3 column 2.

The diagonal extending from the upper-left corner to the lower-right corner of a square matrix is termed the principal. The elements of the principal are termed the principal elements or diagonal elements. The sum of the principal elements is the trace of the matrix. The trace is an important concept, as we will see in Part 2 and Part 3 of this tutorial. These concepts are illustrated in Figure 1.

A one-row matrix is called a row vector. Similarly, a one-column matrix is termed a column vector. A null matrix is one with all elements being zero.

A matrix in which all nondiagonal elements have zero value is a diagonal matrix. If all

elements of a diagonal matrix are equal, we call this a scalar matrix. If all elements of a

scalar matrix are 1 this is termed a unit matrix or an identity matrix, I.

A transpose matrix AT is obtained by converting rows into columns and columns into rows.

Some of these definitions are illustrated in Figure 2.
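Since Figures 1 and 2 are not reproduced here, the following short sketch illustrates the same definitions with NumPy; the values are illustrative.

```python
# Diagonal, scalar, identity, and transpose matrices, plus the trace.
import numpy as np

D = np.diag([2, 5, 7])      # diagonal matrix: nondiagonal elements are zero
S = 3 * np.eye(3)           # scalar matrix: equal elements on the principal
I = np.eye(3)               # identity (unit) matrix

A = np.array([[1, 2, 3],
              [4, 5, 6]])
print(A.T)                  # transpose: rows become columns and vice versa
print(np.trace(D))          # trace: sum of the principal (diagonal) elements
```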

A matrix in which all elements above or below the principal have zero value is a triangular

matrix. Moreover, a triangular matrix is classified as lower-triangular or upper-triangular,

respectively, according to whether the zero elements lie above or below the principal.

The rank of a matrix is equal to the number of linearly independent rows or linearly independent columns it contains, whichever of these two numbers is smaller. Accordingly, the rank of a square matrix is equal to the number of nonzero rows in its upper-triangular matrix or the number of nonzero columns in its equivalent lower-triangular matrix, whichever of these two numbers is smaller.

Figure 3 shows a square matrix and its equivalent triangular matrix. The latter was obtained by subjecting the matrix to elementary column operations. Don't worry for now about transforming a square matrix into a triangular matrix. What is important is the following: since B contains 3 nonzero columns, A is of rank 3.

Figure 3. Rank of a square matrix.

Another way of computing the rank of a matrix involves the use of singular values. This will

be discussed in an upcoming tutorial on Singular Value Decomposition (SVD).
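For a quick check without doing the triangularization by hand, NumPy computes the rank from the singular values just mentioned. The example matrix below is made up: its third row is the sum of the first two, so only two rows are linearly independent.

```python
# Computing the rank of a matrix.
import numpy as np

A = np.array([[1., 2., 3.],
              [0., 1., 4.],
              [1., 3., 7.]])   # row 3 = row 1 + row 2

print(np.linalg.matrix_rank(A))   # 2
```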

Sometimes search engine marketers and those that sell links quote papers about link models, not knowing that the term "rank" of a link graph is used in those articles in reference to the rank of a matrix, and not in reference to any web page ranks (i.e., the positioning of search results). The next thing one reads from these marketers is what we call a bunch of blogonomies. We call a "blogonomy" the dissemination of false knowledge through blogs or public forums, and "blogorrhea" when a false concept is promoted for a profit.

If all elements of a matrix are nonnegative, we can normalize rows by adding the row elements together and dividing each element by the corresponding row total. Obviously, the normalized row elements then add up to 1. In general, a matrix whose row sums (or column sums) all equal 1 is called a stochastic matrix. Elements of a stochastic matrix can be zero as long as the row totals (or column totals) equal 1. See Figure 4.

Since a stochastic matrix can also be obtained by normalizing columns, authors often use the expressions row-stochastic matrix and column-stochastic matrix in order to distinguish between the two cases. The expression doubly stochastic matrix is reserved for square matrices whose row sums and column sums both equal 1. This is the case when both matrix A and its transpose A^T are stochastic.

A directed graph or digraph consists of a number of points (nodes) linked together by arrows or lines, also called edges. Arrows indicate the direction of the relationship between two nodes. The number of arrows ending at a specific node is called the indegree of the node, and the number of arrows leading from it is called the outdegree.

To illustrate these concepts, let me use the example presented by the authors of Graphical Exploratory Data Analysis (1) from 1986.

Here they represented the friendship between six individuals as a digraph (any similarity with the link graphs flying around?). The direction of the arrows says it all: 1, 3 and 6 consider 2 a friend, but 2 is friendly with 3 only.

The following array describes how the nodes are related. Note that the row totals give the outdegrees and the column totals give the indegrees.

Figure 6. Indegrees and outdegrees for the friendship between six persons.

When these types of relationships are represented in matrix notation, the resultant array is called an adjacency matrix. Dividing each row element of the adjacency matrix by the corresponding outdegree yields a row-stochastic matrix.
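The friendship matrix of Figure 6 is not reproduced here, so the sketch below applies the same steps to a made-up 4-node digraph: sum the rows to get outdegrees, the columns to get indegrees, and divide each row by its outdegree to obtain a row-stochastic matrix.

```python
# From adjacency matrix to row-stochastic matrix.
import numpy as np

adj = np.array([[0, 1, 1, 0],
                [0, 0, 1, 0],
                [1, 0, 0, 1],
                [0, 1, 0, 0]], dtype=float)

outdegrees = adj.sum(axis=1)            # row totals
indegrees = adj.sum(axis=0)             # column totals
row_stochastic = adj / outdegrees[:, None]

print(outdegrees, indegrees)
print(row_stochastic)                   # each row now sums to 1
```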

Now that we have the basic ideas clarified, let's move forward and talk about random processes.

A random process is a process or series of events that occur by chance. If the process evolves in time, it is called a Markov chain. Looking at some of the stochastic matrices we have derived: if instead of mere numbers the elements represent probabilities pij, these are called transition probabilities. The corresponding matrix is termed a transition matrix.

Therefore, it can be said that a Markov chain is just a random process evolving in time according to the transition probabilities of the Markov chain.

SEO Blogonomies: The Search Engine Markov Chain

False knowledge about these topics is prevalent in circles associated with search engine optimization (SEO). This is a social phenomenon most notorious in the blogosphere and in public forums (sites and discussion forums). Because of this, we call the phenomenon "blogonomies". We are currently compiling a list of the most notorious blogonomies spread across the search engine marketing world.

Many blogonomies are promoted by well-known SEO and SEM specialists. These folks are called "experts" by their followers and pose as such at their SEM conferences. They often quote each other or call each other "experts". Many of these folks like to walk the fine line of fallacies, producing material where false concepts are decorated with scientific terms and "fat" words. They are also experts in damage control and in saving face.

We will not explain here why these are blogonomies, since that is self-evident. What we want is to make the reader aware of the phenomenon. As a sample of what you could expect to see listed in our SEO Blogonomies, here is one: The Search Engine Markov Chain Blogonomy.

Some SEOs have written, giving readers the impression, that search engines use a mythical Markov Chain to find patterns in search engine results or sites, as if such a chain were a special kind of detection instrument, tool, or technique that is applied to find keyword patterns in a web page or to detect how a document was optimized. This is pure nonsense.

There is no such thing as a mythical Search Engine Markov Chain; it exists only in the minds of these folks and their followers, who often misquote research articles. A Markov chain is simply a random process that occurs over time according to some transition probabilities.

Suppose we run an experiment that has N possible results (states). Suppose that we keep repeating the experiment and that the probability of each result or state occurring on a given repetition depends only on the result of the immediately preceding repetition of the experiment. This is called a Markov chain.

Thus a Markov chain is not an instrument, technique, tool, or the like that allegedly is used by search engines to rank web pages or to find word patterns in documents. It is true that there is a lot of research in which things have been modeled as Markov processes in an attempt to better understand behaviors and link graphs, but the analogy stops there.

It is also true that there is something called an absorbing Markov chain, but this is a specific case involving random walks with absorbing states. Perhaps it might be a good idea to write a tutorial on regular Markov chains and absorbing Markov chains or, better, recommend that readers take a look at the book by James T. Sandefur, Discrete Dynamical Systems, Theory and Applications (Oxford University Press; Chapter 6, Absorbing Markov Chains) (3). If you like fractals, chaos and iterations, this book is for you.

Meanwhile, if while drunk you have walked randomly from one point to another, chances are that you have "markov-chained yourself" already.

What's Next?

What does all this discourse have to do with web links (linked web pages)? Well, consider a random walk over a set of linked pages. This can be defined by a transition matrix, which in this case is the link matrix. The largest eigenvector of the transition matrix tells us the probabilities of the walk ending on the candidate pages. To understand the significance of this statement we first need to define what we mean by the largest eigenvector and how it is computed. This, and the calculations involved, will be explained step by step in Part 2 and Part 3 of this tutorial.

Tutorial Review

2. Which of the following is a square matrix: an array of 3 rows and 2 columns or an

array of 10 rows and 10 columns?

3. Look at the business or sport section of a newspaper and try to find tabular data that

could be represented as a square matrix. Calculate its trace and transpose.

4. Derive a column-stochastic matrix from Figure 6.

5. Look at your site map. Try to derive a digraph from your site map or link structure.

For this exercise, consider only your pages, ignoring third-party links (this of course

will represent an ideal scenario). Compute a row-stochastic matrix or a column-

stochastic matrix. Have fun.

6. What is a markov chain?

References

1. Graphical Exploratory Data Analysis; S.H.C. du Toit, A.G.W. Steyn and R.H. Stumpf; Springer-Verlag (1986).

2. Handbook of Applied Mathematics for Engineers and Scientists; Max Kurtz; McGraw-Hill (1991).

3. Discrete Dynamical Systems, Theory and Applications; James T. Sandefur; Oxford University Press; Chapter 6, Absorbing Markov Chains (1990).
