General Relativity

PHYS 3033 - General Relativity
Lee Kai Ming

CYMP Rm 415A
kmlee@lily.physics.hku.hk
2859 2370
http://www.physics.hku.hk/phys3033/
July 2013
Preface
We only have time to cover a tiny part of general relativity in this course.
Students need to read other books. Here are a few comments.
N. Christensen and T. Moore, Teaching General Relativity to Undergrad-
uates, Physics Today, 65, 6, 41 (June 2012). This is an article in Physics
Today, comparing the approaches of various books. A must read.
R.M. Wald, General Relativity, The University of Chicago Press (1984).
This is a good one, but not an introductory one.
B. O'Neill, Semi-Riemannian Geometry, Academic Press (1983). A math-
ematics book. I love it. In this course, I will follow Wald and O'Neill a lot.
S. Weinberg, Gravitation and Cosmology: Principles and Applications of
the General Theory of Relativity, Wiley, New York (1972). Clear explanation
in physics, but not in mathematics. I could understand this only after I had
read O'Neill.
C.W. Misner, K.S. Thorne, J.A. Wheeler, Gravitation, W.H. Freeman,
San Francisco (1973). A very important book, but I hate it. The mathemat-
ics is not rigorous to my taste. Too many boxes and too heavy.
B.F. Schutz, A First Course in General Relativity, Cambridge University
Press, New York (1985; 2nd ed. 2009). I have not read it. I heard that it is
good. If you find other books too old, the second edition is a younger choice.
J.B. Hartle, Gravity: An Introduction to Einstein's General Relativity,
Addison-Wesley, San Francisco (2003). Also, I have not read it. Some stu-
dents comment that this is a popular science book with equations, and that one
cannot actually learn general relativity by reading it.
I will employ the mathematics approach. Students might find that half
(or more than half) of my course is mathematics. I am not apologetic because
this is the essence of general relativity. (Imagine learning electromagnetism
without vector calculus.) To learn well, some of the prerequisites are lin-
ear algebra, calculus of several variables, special relativity, just to name a
few. (Passing my course in special relativity does NOT prepare you to
understand this one. Don't worry, passing this course demands much less.)
Finally, two tips from Christensen and Moore: students should ask them-
selves what they can measure, and it is important to "... draw the distinction
between global coordinates and real physical measurements performed in a
local laboratory".
Contents
Preface
1 Linear Algebra
1.1 Vector Spaces
1.2 Inner Product
1.3 Tensor Product
2 Review of Special Relativity
2.1 Conventions
2.2 Lorentz Transformations
2.3 Time Dilation and Lorentz Contraction
2.4 Energy Momentum Vector
2.5 Charge, Current and Conservation
2.6 Stress-energy Tensor
3 Differentiable Manifolds
3.1 An Example
3.2 Smooth Manifolds
3.3 Vector Fields and One Forms
3.4 Tensor Fields
3.5 Metric Tensor
4 Curvature
4.1 Covariant Derivative
4.2 Parallel Transport
4.3 Levi-Civita Connection
4.4 Curvature
5 General Relativity
5.1 Equivalence Principle
5.2 Proper Time
5.3 Newtonian Limit
5.4 Time Dilation
5.5 Einstein's Field Equation
6 Schwarzschild Solution
6.1 Metric in Standard Form
6.2 Christoffel Symbols
6.3 Ricci Tensor
6.4 The Solution
6.5 Black Holes
Index
Chapter 1
Linear Algebra
One of the central ideas of general relativity is that coordinates are for human
beings. Nature does not need them. We have to learn how to describe things
without using coordinates and what happens if we change the coordinate
systems. We start with vector spaces.
1.1 Vector Spaces

Definition 1.1. A vector space $V$ (over real numbers $\mathbb{R}$) is a non-empty set
together with two operations, called scalar multiplication $\cdot : \mathbb{R} \times V \to V$
and vector addition $+ : V \times V \to V$ satisfying
1. $\forall u, v, w \in V$, $u + (v + w) = (u + v) + w$;
2. $\forall u, v \in V$, $u + v = v + u$;
3. There exists an element, zero vector, $0 \in V$ such that $\forall v \in V$, $0 + v = v$;
4. $\forall v \in V$, there exists an inverse $u \in V$ such that $v + u = 0$. We denote
this $u$ by $-v$;
5. $\forall a \in \mathbb{R}$, $u, v \in V$, $a(u + v) = au + av$;
6. $\forall a, b \in \mathbb{R}$, $v \in V$, $(a + b)v = av + bv$;
7. $\forall a, b \in \mathbb{R}$, $v \in V$, $a(bv) = (ab)v$;
8. For $1 \in \mathbb{R}$, $\forall v \in V$, $1v = v$.
Example 1.2. The first non-trivial example is the plane $\mathbb{R}^2 \equiv \mathbb{R} \times \mathbb{R}$. This is
the set of all ordered pairs of real numbers, almost always interpreted as
the x- and y-coordinates of a point on the plane relative to some pre-chosen
coordinate system.
This could be generalized to any positive integer $n$, with the case $n = 3$
being the three dimensional space. Let $\mathbb{R}^n \equiv \mathbb{R} \times \cdots \times \mathbb{R}$ be the direct
product of $n$ factors of $\mathbb{R}$. Elements of $\mathbb{R}^n$ are $n$-tuples of real numbers,
$v = (v_1, v_2, \ldots, v_n) \in \mathbb{R}^n$. Scalar multiplication is defined by
$$a v = a(v_1, v_2, \ldots, v_n) \equiv (a v_1, a v_2, \ldots, a v_n) \tag{1.1}$$
where, for example, $a v_1$ on the right hand side is just multiplication of two
real numbers. Vector addition is
$$(u_1, u_2, \ldots, u_n) + (v_1, v_2, \ldots, v_n) = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n) . \tag{1.2}$$
The zero vector is $(0, 0, \ldots, 0)$ and the inverse is $(-v_1, -v_2, \ldots, -v_n)$. Stu-
dents should check that $\mathbb{R}^n$ satisfies all the requirements in Definition 1.1.
Spacetime of special relativity is modeled after $\mathbb{R}^4$.
Example 1.3. Consider the differential equation
$$\frac{d^2 y}{dx^2} + y = 0 . \tag{1.3}$$
The solutions form a vector space. Explicitly, if $y_a = a^1 \cos x + a^2 \sin x$ and
$y_b = b^1 \cos x + b^2 \sin x$ are two solutions of the differential equation, then their
sum
$$y_a + y_b = (a^1 + b^1) \cos x + (a^2 + b^2) \sin x \tag{1.4}$$
is also a solution. (Here, the superscripts are indices, not powers. For example,
$a^2$ is a real number that has no relation with $a^1$ and is certainly not the square of
$a$. This notation is standard.)
The solutions of the inhomogeneous differential equation
$$\frac{d^2 y}{dx^2} + y = 1 \tag{1.5}$$
do not form a vector space. The sum of two solutions is not a
solution anymore, because the right hand side adds up to 2.
Definition 1.4. Given $m$ vectors $v_1, v_2, \ldots, v_m$. They are linearly inde-
pendent if the solution of the equation
$$a^1 v_1 + a^2 v_2 + \cdots + a^m v_m = 0 \tag{1.6}$$
is only $a^1 = a^2 = \cdots = a^m = 0$.
Example 1.5. In $\mathbb{R}^2$, $(1, 0)$ and $(0, 1)$ are linearly independent. A single
vector $(1, 0)$ is linearly independent. $(1, 0)$, $(0, 1)$ and $(1, 1)$ are not linearly
independent. $(1, 0)$ and $(2, 0)$ are not linearly independent. A single zero
vector $(0, 0)$ is not linearly independent.
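The cases in Example 1.5 can be checked mechanically. The following is a minimal sketch, not part of the original notes: it assumes the vectors are given as tuples of integers or fractions, and the helper names `rank` and `linearly_independent` are made up for this illustration. It uses the fact that $m$ vectors are linearly independent exactly when Gaussian elimination leaves $m$ non-zero rows.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        # look for a row (at or below r) with a non-zero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # clear the column below the pivot
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def linearly_independent(vectors):
    # Eq. (1.6) has only the zero solution exactly when rank == number of vectors
    return rank(vectors) == len(vectors)


for vs in ([(1, 0), (0, 1)], [(1, 0)], [(1, 0), (0, 1), (1, 1)],
           [(1, 0), (2, 0)], [(0, 0)]):
    print(vs, linearly_independent(vs))
```

Exact fractions are used instead of floating point so that the test "is this entry zero" is never spoiled by rounding.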
Definition 1.6. A maximal set of linearly independent vectors is a basis
of the vector space. We mean that $B = \{v_1, v_2, \ldots, v_n\}$ is a basis if
1. $v_1, v_2, \ldots, v_n$ are linearly independent;
2. if $v \in V$ is any vector, $v, v_1, v_2, \ldots, v_n$ are linearly dependent.
Theorem 1.7. All bases have the same number of vectors. This is the
dimension of the vector space.
Example 1.8. The dimension of $\mathbb{R}^n$ is $n$. For $\mathbb{R}^2$, $\{(1, 0), (0, 1)\}$ is a basis,
$\{(1, 0), (1, 1)\}$ is also a basis and $\{(2, 0), (0, 2)\}$ is yet another basis.
The dimension of the vector space of solutions in Example 1.3 is 2.
The space of all continuous functions from real numbers to real numbers
$f : \mathbb{R} \to \mathbb{R}$ is also a vector space. (Check this.) Its dimension is infinite.
Theorem 1.9. If $V$ is a vector space of dimension $n$ and $v_1, \ldots, v_n$ form a
basis, then any vector $v \in V$ can be expressed as a linear combination of
$v_1, \ldots, v_n$
$$v = a^1 v_1 + a^2 v_2 + \cdots + a^n v_n . \tag{1.7}$$
Proof. The vectors $v, v_1, v_2, \ldots, v_n$ are linearly dependent, so we have, for some
real numbers $b, b^1, b^2, \ldots, b^n$ that are not all zero,
$$b v + b^1 v_1 + \cdots + b^n v_n = 0 . \tag{1.8}$$
$b \neq 0$, otherwise the above equation shows that $v_1, \ldots, v_n$ are linearly depen-
dent. Hence, taking $a^\mu = -b^\mu / b$, $\mu = 1, \ldots, n$, we have the required result.
Definition 1.10. The numbers $a^1, \ldots, a^n$ in Eq. (1.7) are the coordinates
or components of the vector $v$ relative to the basis $v_1, \ldots, v_n$.
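What happens if we choose another basis $u_1, \ldots, u_n$? The $v_\mu$ can be expressed
as linear combinations of the $u_\nu$,
$$v_\mu = \sum_{\nu=1}^{n} L_\mu{}^\nu u_\nu \tag{1.9}$$
for some real numbers $L_\mu{}^\nu$. (We choose to use one subscript and one super-
script for a reason.) The $v_\mu$ also form a basis, so
$$u_\mu = \sum_{\nu=1}^{n} K_\mu{}^\nu v_\nu \tag{1.10}$$
for some other real numbers $K_\mu{}^\nu$. We have
$$v_\mu = \sum_{\nu=1}^{n} L_\mu{}^\nu u_\nu = \sum_{\nu,\lambda=1}^{n} L_\mu{}^\nu K_\nu{}^\lambda v_\lambda$$
$$0 = \sum_{\lambda=1}^{n} \left( \sum_{\nu=1}^{n} L_\mu{}^\nu K_\nu{}^\lambda - \delta_\mu^\lambda \right) v_\lambda . \tag{1.11}$$
The Kronecker delta is defined as
$$\delta_\mu^\nu = \begin{cases} 1 & \text{if } \mu = \nu \\ 0 & \text{if } \mu \neq \nu \end{cases} . \tag{1.12}$$
It is just the entries of the identity matrix. Back to Eq. (1.11), the $v_\lambda$ are
linearly independent. Their coefficients must be zero
$$\delta_\mu^\lambda = \sum_{\nu=1}^{n} L_\mu{}^\nu K_\nu{}^\lambda . \tag{1.13}$$
Similarly, we have $\delta_\mu^\lambda = \sum_{\nu=1}^{n} K_\mu{}^\nu L_\nu{}^\lambda$, which means, as matrices, $(L_\mu{}^\nu)$ and
$(K_\mu{}^\nu)$ are inverse to each other, and their determinants are non-zero.
What is the relation between components? A vector $v \in V$ has
different components relative to different bases,
$$v = \sum_{\mu=1}^{n} a^\mu v_\mu = \sum_{\nu=1}^{n} b^\nu u_\nu . \tag{1.14}$$
Applying Eq. (1.9),
$$\sum_{\nu=1}^{n} b^\nu u_\nu = \sum_{\mu=1}^{n} a^\mu v_\mu = \sum_{\mu,\nu=1}^{n} a^\mu L_\mu{}^\nu u_\nu . \tag{1.15}$$
Comparing coefficients, we have
$$b^\nu = \sum_{\mu=1}^{n} a^\mu L_\mu{}^\nu . \tag{1.16}$$
This is how the components of the same vector transform relative to two
bases. Physicists would say that "if something transforms like this, then it
is a vector". I hate this very vague statement.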
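As a concrete check of Eqs. (1.9), (1.10), (1.13) and (1.16), here is a minimal sketch using two of the bases of $\mathbb{R}^2$ from Example 1.8. The particular numbers and the helper names `matmul` and `combine` are assumptions chosen for this illustration; `L[mu][nu]` stores $L_\mu{}^\nu$ and `K[mu][nu]` stores $K_\mu{}^\nu$.

```python
# Two bases of R^2 taken from Example 1.8
v_basis = [(1, 0), (0, 1)]
u_basis = [(1, 0), (1, 1)]

# v_1 = u_1 and v_2 = -u_1 + u_2, so these are the L of Eq. (1.9)
L = [[1, 0], [-1, 1]]
# u_1 = v_1 and u_2 = v_1 + v_2, so these are the K of Eq. (1.10)
K = [[1, 0], [1, 1]]


def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(coeffs, basis):
    """The vector sum_mu coeffs[mu] * basis[mu], as a tuple."""
    dim = len(basis[0])
    return tuple(sum(c * e[i] for c, e in zip(coeffs, basis)) for i in range(dim))


# Eq. (1.13): L and K are inverse to each other
print(matmul(L, K), matmul(K, L))

# Components a^mu relative to v_mu, and b^nu from Eq. (1.16)
a = [3, 2]
b = [sum(a[mu] * L[mu][nu] for mu in range(2)) for nu in range(2)]

# Both sets of components describe the same vector, Eq. (1.14)
print(b, combine(a, v_basis), combine(b, u_basis))
```

The components change from `a` to `b`, but the vector they describe does not, which is the content of Eq. (1.14).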
Definition 1.11. A function on a set is just an assignment of a number to
every element of the set. A linear function on a vector space $f : V \to \mathbb{R}$
must also satisfy
$$f(av + bu) = a f(v) + b f(u) \tag{1.17}$$
for any vectors $u$ and $v$ and any numbers $a$ and $b$.
Given two linear functions $f$ and $g$, we define scalar multiplication and
their sum by
$$(af)(v) = a f(v) \tag{1.18}$$
$$(f + g)(v) = f(v) + g(v) , \tag{1.19}$$
which means the symbol $af$ denotes a linear function whose value at $v$
equals $a$ times the value of $f$ at $v$, and similarly for $f + g$.
Definition 1.12. With the definitions given in the last paragraph, given a
vector space $V$, the space of all linear functions also forms a vector space, the
dual space $V^*$.
If $v_\mu$ is a basis, then for any vector $v = \sum_\mu a^\mu v_\mu$, we have
$$f(v) = f\Big(\sum_\mu a^\mu v_\mu\Big) = \sum_\mu a^\mu f(v_\mu) . \tag{1.20}$$
Hence, if we know the values of a linear function on a basis, we know its
value at any vector.
Definition 1.13. Given a basis $v_\mu$, define linear functions $v^\nu$ by
$$v^\nu(v_\mu) = \delta_\mu^\nu . \tag{1.21}$$
We also have $v^\nu(\sum_\mu a^\mu v_\mu) = a^\nu$. The set $\{v^\nu\}$ is the dual basis of the basis
$v_\mu$.
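For a general linear function $f$, define
$$\phi_\mu \equiv f(v_\mu) , \tag{1.22}$$
we have
$$f(v) = f\Big(\sum_\mu a^\mu v_\mu\Big) = \sum_\mu a^\mu f(v_\mu) = \sum_\mu a^\mu \phi_\mu \tag{1.23}$$
$$= \sum_\mu \phi_\mu v^\mu(v) = \Big(\sum_\mu \phi_\mu v^\mu\Big)(v) , \tag{1.24}$$
which is
$$f = \sum_\mu \phi_\mu v^\mu . \tag{1.25}$$
Hence, $v^\mu$ is a basis of $V^*$, and $\phi_\mu$ are the components of $f$. For a finite
dimensional vector space, the dual space has the same dimension.
In terms of another dual basis $u^\mu$, let
$$f = \sum_\mu \phi'_\mu u^\mu . \tag{1.26}$$
We have
$$\phi'_\mu = f(u_\mu) = \sum_\nu \phi_\nu v^\nu(u_\mu) = \sum_\nu \phi_\nu v^\nu\Big(\sum_\lambda K_\mu{}^\lambda v_\lambda\Big) = \sum_{\nu,\lambda} K_\mu{}^\lambda \phi_\nu v^\nu(v_\lambda) = \sum_{\nu,\lambda} K_\mu{}^\lambda \phi_\nu \delta_\lambda^\nu = \sum_\nu K_\mu{}^\nu \phi_\nu . \tag{1.27}$$
This is how the components of a linear function transform. One should com-
pare with Eq. (1.16).
Another way to say the same thing is, by Eq. (1.26) and Eq. (1.27),
$$f = \sum_\mu \phi'_\mu u^\mu = \sum_{\mu,\nu} \phi_\nu \left( K_\mu{}^\nu u^\mu \right) . \tag{1.28}$$
Comparing with Eq. (1.25), we have
$$v^\nu = \sum_\mu K_\mu{}^\nu u^\mu . \tag{1.29}$$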
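Example 1.14. The value of a linear function $f$ at a vector $v$ is independent
of coordinates. How could we express it in terms of coordinates?
In terms of basis and dual basis, a vector and a linear function are given
by Eq. (1.7) and Eq. (1.25). We have the identity
$$f(v) = \Big(\sum_\mu \phi_\mu v^\mu\Big)\Big(\sum_\nu a^\nu v_\nu\Big) = \sum_{\mu,\nu} \phi_\mu a^\nu v^\mu(v_\nu) = \sum_{\mu,\nu} \phi_\mu a^\nu \delta_\nu^\mu = \sum_\mu \phi_\mu a^\mu . \tag{1.30}$$
This is just Eq. (1.23).
In terms of another basis and the corresponding dual basis, we have
$f(v) = \sum_\mu \phi'_\mu b^\mu$, by Eq. (1.14) and Eq. (1.26). We check that the two expressions
are equal, by Eq. (1.16), Eq. (1.27) and Eq. (1.13),
$$\sum_\mu \phi'_\mu b^\mu = \sum_{\mu,\nu,\lambda} K_\mu{}^\nu \phi_\nu a^\lambda L_\lambda{}^\mu = \sum_{\nu,\lambda} \Big( \sum_\mu L_\lambda{}^\mu K_\mu{}^\nu \Big) \phi_\nu a^\lambda = \sum_{\nu,\lambda} \delta_\lambda^\nu \phi_\nu a^\lambda = \sum_\nu \phi_\nu a^\nu . \tag{1.31}$$
Hence, such a sum is invariant under coordinate transformation.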
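The invariance in Eq. (1.31) can be verified numerically. This is a small sketch under assumed numbers: the bases of $\mathbb{R}^2$, the linear function $f(x, y) = 5x + 7y$ and the vector $(3, 2)$ are all chosen only for illustration.

```python
# Bases of R^2: v_mu is the standard basis, u_mu = sum_nu K_mu^nu v_nu
v_basis = [(1, 0), (0, 1)]
u_basis = [(1, 0), (1, 1)]
K = [[1, 0], [1, 1]]    # Eq. (1.10)
L = [[1, 0], [-1, 1]]   # Eq. (1.9), the inverse of K


def f(vec):
    """An arbitrary linear function on R^2 (assumed for this example)."""
    x, y = vec
    return 5 * x + 7 * y


# Components of f relative to each dual basis, Eq. (1.22)
phi = [f(e) for e in v_basis]
phi_new = [f(e) for e in u_basis]

# The same components obtained from the transformation rule, Eq. (1.27)
phi_from_K = [sum(K[mu][nu] * phi[nu] for nu in range(2)) for mu in range(2)]

# Components of the vector (3, 2) in both bases, Eq. (1.16)
a = [3, 2]
b = [sum(a[mu] * L[mu][nu] for mu in range(2)) for nu in range(2)]

# Eq. (1.31): the contraction does not depend on the basis
value_old = sum(p * c for p, c in zip(phi, a))
value_new = sum(p * c for p, c in zip(phi_new, b))
print(phi_new, phi_from_K, value_old, value_new, f((3, 2)))
```

Note that the components of the vector transform with $L$ while the components of the linear function transform with $K$; the two changes cancel in the sum.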

Example 1.15. In $\mathbb{R}^3$, using the standard coordinate system, a linear function
has the form
$$f(x, y, z) = ax + by + cz . \tag{1.32}$$
We see that the equation $f(x, y, z) = 0$ defines a plane. (What about the
special case $a = b = c = 0$?) In general $\mathbb{R}^n$, the solution of $f = 0$ defines
a hyperplane, which has dimension $n - 1$. (I try to avoid defining vector
subspace. Let's see if I can.)
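Theorem 1.16. For a finite dimensional vector space $V$, its double dual
$(V^*)^*$ is (naturally isomorphic to) $V$ itself.
Proof. What is $(V^*)^*$? If $\psi \in (V^*)^*$, then $\psi : V^* \to \mathbb{R}$ is a linear function of
linear functions. Hence, we must consider its values on linear functions.
Let $v_\mu$ and $v^\mu$ be a basis and the dual basis of $V$. We consider the values of $\psi$
on $v^\mu$,
$$a^\mu \equiv \psi(v^\mu) . \tag{1.33}$$
Then, we define the vector $v \in V$ by
$$v \equiv \sum_\mu a^\mu v_\mu = \sum_\mu \psi(v^\mu) v_\mu . \tag{1.34}$$
For any linear function $f$ on $V$, expand it as $f = \sum_\mu \phi_\mu v^\mu$, we compute
$$\psi(f) = \psi\Big(\sum_\mu \phi_\mu v^\mu\Big) = \sum_\mu \phi_\mu \psi(v^\mu) = \sum_\mu \phi_\mu a^\mu = \sum_\mu a^\mu f(v_\mu) = f\Big(\sum_\mu a^\mu v_\mu\Big) = f(v) , \tag{1.35}$$
where we have used the definition in Eq. (1.22). We conclude that the action
of $\psi$ on a linear function is exactly the same as the action of the linear
function on the vector $v$. Also, $v$ is independent of the basis used because
by Eq. (1.9) and Eq. (1.29), if $u_\mu$ is another basis,
$$\sum_\mu \psi(v^\mu) v_\mu = \sum_{\mu,\nu,\lambda} \psi(K_\nu{}^\mu u^\nu) L_\mu{}^\lambda u_\lambda = \sum_{\mu,\nu,\lambda} K_\nu{}^\mu L_\mu{}^\lambda \psi(u^\nu) u_\lambda = \sum_{\nu,\lambda} \delta_\nu^\lambda \psi(u^\nu) u_\lambda = \sum_\nu \psi(u^\nu) u_\nu . \tag{1.36}$$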
1.2 Inner Product

We now talk about the "length" of a vector. It turns out that the "length
square" is a more convenient object.
Definition 1.17. The inner product of a vector space is a real valued
function of two vectors, $(\cdot , \cdot) : V \times V \to \mathbb{R}$, satisfying
1. symmetry: $(v, u) = (u, v)$;
2. linearity: $(av + bu, w) = a(v, w) + b(u, w)$.
Note that in the above equation, the addition on the left hand side is the
vector addition, while on the right hand side, it is the addition of real numbers.
Because of the symmetry, we also have
$$(av + bu, ct + dw) = ac(v, t) + ad(v, w) + bc(u, t) + bd(u, w) , \tag{1.37}$$
where $a, b, c, d \in \mathbb{R}$ and $u, v, t, w \in V$. We say that functions with this
property are multilinear, which means that they are linear in each of their
arguments.
Example 1.18. The standard inner product in $\mathbb{R}^n$ is defined as: for two
vectors $v = (a^1, \ldots, a^n)$ and $u = (b^1, \ldots, b^n)$
$$(v, u) = a^1 b^1 + \cdots + a^n b^n = \sum_{\mu,\nu} \delta_{\mu\nu} a^\mu b^\nu \tag{1.38}$$
where the numbers $\delta_{\mu\nu}$ are
$$\delta_{\mu\nu} = \begin{cases} 1 & \text{if } \mu = \nu \\ 0 & \text{if } \mu \neq \nu \end{cases} . \tag{1.39}$$
$\mathbb{R}^n$ together with this inner product is a Euclidean space. In $\mathbb{R}^3$, the inner
product is usually called the dot product. In Euclidean space, we could talk
about the length or magnitude of a vector, or the angle between two
vectors, etc.
Not all vector spaces with inner product are Euclidean spaces.
Theorem 1.19. For an $n$ dimensional vector space with inner product, a basis
$v_\mu$ can be chosen such that
$$(v_\mu, v_\nu) = \begin{cases} -1 & \text{if } \mu = \nu = 1, \ldots, l \\ 1 & \text{if } \mu = \nu = l + 1, \ldots, m \\ 0 & \text{otherwise} \end{cases} \tag{1.40}$$
for some $l$ and $m$. Explicitly, for vectors $v = \sum_\mu a^\mu v_\mu$ and $u = \sum_\mu b^\mu v_\mu$,
$$(v, u) = -a^1 b^1 - \cdots - a^l b^l + a^{l+1} b^{l+1} + \cdots + a^m b^m . \tag{1.41}$$
Note that the coefficients of the $a^{m+1} b^{m+1}$, etc., terms are zero, and there could
be no positive terms and/or negative terms and/or zero terms.
Proof. We sketch the proof, by mathematical induction. First consider $n = 1$.
If $v \in V$ is non-zero, then all other vectors are multiples of $v$. If $(v, v) \neq 0$, we could
normalize it by
$$v_1 \equiv \frac{v}{\sqrt{|(v, v)|}} . \tag{1.42}$$
We have $(v_1, v_1) = \pm 1$ and we are done. If $(v, v) = 0$, then the inner product is
identically zero. We finish the case $n = 1$.
For $n > 1$, if for all $v \in V$, $(v, v) = 0$, then the inner product is identically
zero. We are done. Otherwise there exists a vector $v$ such that $(v, v) \neq 0$. We could assume
that it is normalized. Consider all the vectors $w$ that are orthogonal or
perpendicular to $v$,
$$W \equiv \{ w \in V \,|\, (v, w) = 0 \} . \tag{1.43}$$
Then, $W$ is a vector space of dimension $n - 1$, and the original inner product
defines an inner product on $W$. Hence, by the assumption of induction, we
can choose a basis of $W$ such that the inner product of $W$ has the form
Eq. (1.41). Putting back $v$, we have an equation for the inner product of $V$ with one
more term. (Check all the details. How about the case $n = 0$?)
Definition 1.20. If there are no zero terms in Eq. (1.41), ($m = n$), the
inner product is non-degenerate. As a matrix, the determinant of $(\eta_{\mu\nu})$, where $\eta_{\mu\nu} \equiv (v_\mu, v_\nu)$, is
non-zero. Its inverse is denoted by $\eta^{\mu\nu}$,
$$\sum_\nu \eta^{\mu\nu} \eta_{\nu\lambda} = \sum_\nu \eta_{\lambda\nu} \eta^{\nu\mu} = \delta_\lambda^\mu . \tag{1.44}$$
(A more proper notation should be $(\eta^{-1})^{\mu\nu}$, but this is too clumsy.)
Theorem 1.21. An inner product is non-degenerate if and only if for any
vector $0 \neq v \in V$, there is another vector $u$ such that $(v, u) \neq 0$.
Proof. If the inner product is degenerate, by Eq. (1.41), we see that $(v_n, u) =
0$ for all vectors $u \in V$.
If the inner product is non-degenerate and $v = \sum_\mu a^\mu v_\mu \neq 0$, then one of
the $a^\mu$ is non-zero, and $(v, v_\mu) = \pm a^\mu \neq 0$.
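From now on, we consider only non-degenerate inner products. We could
choose a basis such that $(v_\mu, v_\mu) = \pm 1$. Similar to Eq. (1.38), define
$$\eta_{\mu\nu} = (v_\mu, v_\nu) . \tag{1.45}$$
How does it transform in terms of another basis? By Eq. (1.10),
$$\eta'_{\mu\nu} \equiv (u_\mu, u_\nu) = \Big( \sum_\alpha K_\mu{}^\alpha v_\alpha , \sum_\beta K_\nu{}^\beta v_\beta \Big) = \sum_{\alpha,\beta} K_\mu{}^\alpha K_\nu{}^\beta (v_\alpha, v_\beta) = \sum_{\alpha,\beta} K_\mu{}^\alpha K_\nu{}^\beta \eta_{\alpha\beta} . \tag{1.46}$$
Note that it transforms as the product of two coordinates of linear functions,
Eq. (1.27).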
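Example 1.22. Not all vectors have non-zero length. In $\mathbb{R}^4$, consider
$$\eta_{\mu\nu} = \begin{cases} -1 & \text{if } \mu = \nu = 1 \\ 1 & \text{if } \mu = \nu = 2, 3, 4 \\ 0 & \text{otherwise} \end{cases} . \tag{1.47}$$
Then, for $v \equiv (1, 1, 0, 0)$, $(v, v) = 0$.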
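A short sketch of Example 1.22 in code, assuming the diagonal form of Eq. (1.47) with one negative and three positive entries; the names `ETA_DIAGONAL` and `inner` are made up for this illustration.

```python
# Diagonal entries of eta in Eq. (1.47): -1 for the first index, +1 for the rest
ETA_DIAGONAL = (-1, 1, 1, 1)


def inner(v, u):
    """Inner product (v, u) = sum_mu eta_mumu a^mu b^mu for a diagonal eta."""
    return sum(e * a * b for e, a, b in zip(ETA_DIAGONAL, v, u))


null_vector = (1, 1, 0, 0)
print(inner(null_vector, null_vector))     # a non-zero vector of zero "length square"
print(inner((1, 0, 0, 0), (1, 0, 0, 0)))   # negative "length square"
print(inner((0, 1, 0, 0), (0, 1, 0, 0)))   # positive "length square"
```

The inner product is still non-degenerate in the sense of Theorem 1.21: the null vector has a non-zero inner product with, for example, $(1, 0, 0, 0)$.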
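Example 1.23. For every vector $v$, we could define a linear function $\tilde v :
V \to \mathbb{R}$ by the formula
$$\tilde v(u) = (v, u) . \tag{1.48}$$
In terms of components, we have
$$\tilde v(u) = (v, u) = \Big( \sum_\mu a^\mu v_\mu , \sum_\nu b^\nu v_\nu \Big) = \sum_{\mu,\nu} a^\mu b^\nu \eta_{\mu\nu} = \Big( \sum_{\mu,\nu} a^\mu \eta_{\mu\nu} v^\nu \Big)(u) . \tag{1.49}$$
We have
$$\tilde v = \sum_{\mu,\nu} a^\mu \eta_{\mu\nu} v^\nu . \tag{1.50}$$
Similarly, for a linear function $f = \sum_\mu \phi_\mu v^\mu \in V^*$, define the vector
$$\tilde f \equiv \sum_{\mu,\nu} \eta^{\mu\nu} \phi_\mu v_\nu , \tag{1.51}$$
then, for any vector $u$
$$f(u) = (\tilde f, u) . \tag{1.52}$$
(Check this.)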
1.3 Tensor Product

Suppose V and W are two vector spaces of dimensions n and m respectively.
We first give a down-to-earth definition of tensor product.
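Definition 1.24. The tensor product of $V$ and $W$ is an $nm$ dimensional
vector space, denoted by $V \otimes W$. For some $v_i \in V$ and $w_j \in W$, create a
symbol $v_i \otimes w_j$, then elements of $V \otimes W$ are finite linear combinations
$$\sum_{i,j} A^{i,j} \, v_i \otimes w_j \tag{1.53}$$
where $A^{i,j} \in \mathbb{R}$, subject to the constraints
1. $(v + v') \otimes w = v \otimes w + v' \otimes w$,
2. $v \otimes (w + w') = v \otimes w + v \otimes w'$,
3. $(av) \otimes w = v \otimes (aw) = a(v \otimes w)$ for $a \in \mathbb{R}$.
If $v_\mu$ and $w_\nu$ are bases of $V$ and $W$, one basis of $V \otimes W$ is $v_\mu \otimes w_\nu$, $\mu = 1, \ldots, n$,
$\nu = 1, \ldots, m$, and elements are of the form
$$t = \sum_{\mu=1,\nu=1}^{n,m} a^{\mu\nu} \, v_\mu \otimes w_\nu . \tag{1.54}$$
The $a^{\mu\nu}$ could be thought of as the components of the element $t$ relative to the
basis $v_\mu \otimes w_\nu$.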
Example 1.25. In quantum mechanics, if there are three possible states for
particle A and four states for particle B, how many states are there if we
consider them together? The answer is twelve. The state space for A is a
three dimensional vector space, while the state space for B is four dimensional.
The state space of A and B together is the tensor product of the two.
If both particles have two states, $|1\rangle$ and $|2\rangle$, the total space has dimension
four. The normalized state
$$\frac{1}{\sqrt{2}} \left( |1\rangle \otimes |1\rangle + |2\rangle \otimes |2\rangle \right) \tag{1.55}$$
cannot be written simply as $|a\rangle \otimes |b\rangle$. It is called an entangled state, very
important in quantum information and quantum computing.
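One way to see that the state in Eq. (1.55) is not a simple product is to look at its components. For two two-state particles, a product $|a\rangle \otimes |b\rangle$ has components $c^{ij} = a^i b^j$, so the $2 \times 2$ matrix $(c^{ij})$ has zero determinant. The sketch below uses this criterion for non-zero states; the function names and the sample product state are assumptions made for illustration, and real coefficients are assumed.

```python
import math


def outer(a, b):
    """Components c[i][j] = a[i] * b[j] of the product state |a> (x) |b>."""
    return [[ai * bj for bj in b] for ai in a]


def is_product_state(c, tol=1e-12):
    """A non-zero 2x2 component matrix is a simple product exactly when det c = 0."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol


s = 1 / math.sqrt(2)
entangled = [[s, 0.0], [0.0, s]]           # components of Eq. (1.55)
product = outer((0.6, 0.8), (1.0, 0.0))    # an arbitrary product state

print(is_product_state(entangled))  # determinant is 1/2, so not a product
print(is_product_state(product))    # determinant is 0, so a product
```

The determinant test is special to the $2 \times 2$ case; for larger spaces one would check that the component matrix has rank one.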
Definition 1.26. Now, we give the rigorous mathematical construction of the
tensor product. Students can skip this.
For vector spaces $V$ and $W$, consider the vector space $F$ generated by
elements of the set $V \times W$. For an element $t \in F$, there are finitely many $a_i \in \mathbb{R}$,
$v_i \in V$ and $w_i \in W$ such that
$$t = \sum_i a_i (v_i, w_i) , \tag{1.56}$$
where we recall $(v_i, w_i) \in V \times W$. The dimension of $F$ is infinite.
Let $K$ be the subspace of $F$ generated by all elements of the forms
1. $(v + v', w) - (v, w) - (v', w)$;
2. $(v, w + w') - (v, w) - (v, w')$;
3. $(av, w) - (v, aw)$;
4. $a(v, w) - (av, w)$.
The quotient vector space $F/K$ is the tensor product of $V$ and $W$. The coset
$(v, w) + K$ of the element $(v, w) \in F$ is denoted by $v \otimes w$.
Example 1.27. What is the difference between direct product and tensor
product? Direct product behaves more like addition, while tensor product
is more like multiplication.
For example, the direct product of $\mathbb{R}^m$ and $\mathbb{R}^n$ is $\mathbb{R}^{m+n} = \mathbb{R}^m \times \mathbb{R}^n$. Their
tensor product is $\mathbb{R}^{mn} = \mathbb{R}^m \otimes \mathbb{R}^n$.
Let $v \in \mathbb{R}^2$ and $u \in \mathbb{R}$. $v$ could be thought of as a vector in the xy-plane,
while $u$ is a vector along the z-axis. In $\mathbb{R}^3 = \mathbb{R}^2 \times \mathbb{R}$, $v + u$ is a three
dimensional vector. Even if $u$ is zero, we still have a non-zero vector, in the
xy-plane.
However, in $\mathbb{R}^2 \otimes \mathbb{R} = \mathbb{R}^2$, if $u$ is zero, then $v \otimes u = v \otimes 0 = 0$.
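Definition 1.28. For more than two vector spaces, we could define their
tensor product iteratively,
$$U \otimes V \otimes \cdots \otimes W = ( \cdots ((U \otimes V) \otimes \cdots \otimes W) \cdots ) . \tag{1.57}$$
Example 1.29. How do the components of an element of a tensor product
transform under a change of basis? Using Eq. (1.9) and a similar one for $W$
(and changing the notation from $u_\mu$ to $v'_\mu$), we have
$$t = \sum_{\mu=1,\nu=1}^{n,m} a^{\mu\nu} \, v_\mu \otimes w_\nu = \sum_{\mu=1,\nu=1}^{n,m} a^{\mu\nu} \Big( \sum_{\alpha=1}^{n} (L_V)_\mu{}^\alpha v'_\alpha \Big) \otimes \Big( \sum_{\beta=1}^{m} (L_W)_\nu{}^\beta w'_\beta \Big) = \sum_{\alpha=1,\beta=1}^{n,m} \Big( \sum_{\mu=1,\nu=1}^{n,m} a^{\mu\nu} (L_V)_\mu{}^\alpha (L_W)_\nu{}^\beta \Big) v'_\alpha \otimes w'_\beta . \tag{1.58}$$
We can write this as
$$a'^{\alpha\beta} = \sum_{\mu=1,\nu=1}^{n,m} a^{\mu\nu} (L_V)_\mu{}^\alpha (L_W)_\nu{}^\beta \tag{1.59}$$
where $a'^{\alpha\beta}$ are the components of the same element relative to another basis.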
For a basis $v_\mu$ of $V$, we have a natural dual basis $v^\mu$ of $V^*$. Similarly for
$W$ and $W^*$. Then, the corresponding basis of $V^* \otimes W^*$ is $v^\mu \otimes w^\nu$. They
define some natural linear functions on $V \otimes W$ by
$$(v^\mu \otimes w^\nu)(v_\alpha \otimes w_\beta) = \delta_\alpha^\mu \delta_\beta^\nu , \tag{1.60}$$
where on the right hand side, it is the multiplication of two real numbers. It
is easy to see that linear functions on $V \otimes W$, $f : V \otimes W \to \mathbb{R}$, are linear
combinations of $v^\mu \otimes w^\nu$:
$$f = \sum_{\mu,\nu} b_{\mu\nu} \, v^\mu \otimes w^\nu \tag{1.61}$$
for some real numbers $b_{\mu\nu}$. We have proved
Theorem 1.30. For finite dimensional vector spaces,
$$(V \otimes W)^* \cong V^* \otimes W^* \tag{1.62}$$
where $\cong$ means naturally isomorphic, or students could think that the two
sides of the equation are the same.
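Definition 1.31. Let $V$ be a finite dimensional vector space, $V^*$ its dual
space. A tensor, $T$, of type $(k, l)$ over $V$ is an element of
$$T \in V \otimes \cdots \otimes V \otimes V^* \otimes \cdots \otimes V^* , \tag{1.63}$$
where there are $k$ factors of $V$ and $l$ factors of $V^*$. Hence, a tensor of type
$(1, 0)$ is just a vector and a tensor of type $(0, 1)$ is a linear function. A tensor
of type $(0, 0)$ is understood as a real number.
In terms of basis and dual basis,
$$T = \sum T^{\mu \ldots \nu}{}_{\alpha \ldots \beta} \, v_\mu \otimes \cdots \otimes v_\nu \otimes v^\alpha \otimes \cdots \otimes v^\beta . \tag{1.64}$$
Using the fact that
$$(V \otimes \cdots \otimes V \otimes V^* \otimes \cdots \otimes V^*)^* = V^* \otimes \cdots \otimes V^* \otimes V \otimes \cdots \otimes V , \tag{1.65}$$
a tensor can also be understood as a multilinear function
$$T : V^* \times \cdots \times V^* \times V \times \cdots \times V \to \mathbb{R} , \tag{1.66}$$
$$T(\ldots, a v^* + b v'^*, \ldots) = a T(\ldots, v^*, \ldots) + b T(\ldots, v'^*, \ldots) , \tag{1.67}$$
$$T(\ldots, a v + b v', \ldots) = a T(\ldots, v, \ldots) + b T(\ldots, v', \ldots) . \tag{1.68}$$
In the second and third lines, $v^*, v'^* \in V^*$, $v, v' \in V$, and ". . ." means that those arguments are the same
on both sides. This is how many books define tensors.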
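Theorem 1.32. Under a different basis, the transformation rule for tensors
is given by
$$T'^{\mu \ldots \nu}{}_{\alpha \ldots \beta} = \sum L_{\mu'}{}^{\mu} \cdots L_{\nu'}{}^{\nu} K_\alpha{}^{\alpha'} \cdots K_\beta{}^{\beta'} \, T^{\mu' \ldots \nu'}{}_{\alpha' \ldots \beta'} , \tag{1.69}$$
where the sum is over all primed indices. (Students should know what I am talking about here.)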

Example 1.33. In general, tensors are not symmetric. For example, if $T$ is
of type $(0, 2)$, then interpreted as a multilinear function, $T(v, u) = T(u, v)$ is
not necessarily true and hence, in general,
$$T_{\mu\nu} \neq T_{\nu\mu} . \tag{1.70}$$
But the inner product is symmetric, and multilinear. As a result, inner products
are type $(0, 2)$ tensors. In terms of a basis, $v_\mu$, define
$$\eta_{\mu\nu} \equiv (v_\mu, v_\nu) , \tag{1.71}$$
we have
$$(v, u) = \Big( \sum_\mu a^\mu v_\mu , \sum_\nu b^\nu v_\nu \Big) = \sum_{\mu,\nu} a^\mu b^\nu (v_\mu, v_\nu) = \sum_{\mu,\nu} \eta_{\mu\nu} a^\mu b^\nu . \tag{1.72}$$
See also Eq. (1.38) and Eq. (1.41).

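Example 1.34. An important operation on tensors is tensor contraction.
We first illustrate by example. Let $T$ be of type $(2, 1)$,
$$T = \sum_{\mu,\nu,\lambda} T^{\mu\nu}{}_\lambda \, v_\mu \otimes v_\nu \otimes v^\lambda . \tag{1.73}$$
Treat it as a multilinear function $T : V^* \times V^* \times V \to \mathbb{R}$, its contraction on the
first and third factors, say, is
$$\sum_\alpha T(v^\alpha, \cdot , v_\alpha) \tag{1.74}$$
which is a type $(1, 0)$ tensor. In terms of components, we have
$$\sum_{\mu,\nu,\lambda,\alpha} T^{\mu\nu}{}_\lambda \, v_\mu(v^\alpha) \, v_\nu \, v^\lambda(v_\alpha) = \sum_{\mu,\nu,\lambda,\alpha} T^{\mu\nu}{}_\lambda \, \delta_\mu^\alpha \delta_\alpha^\lambda \, v_\nu = \sum_{\mu,\nu} T^{\mu\nu}{}_\mu \, v_\nu . \tag{1.75}$$
Contraction is independent of basis because, by Eq. (1.9) and Eq. (1.29),
$$\sum_\mu T(v^\mu, \cdot , v_\mu) = \sum_{\mu,\nu,\lambda} T(K_\nu{}^\mu u^\nu, \cdot , L_\mu{}^\lambda u_\lambda) = \sum_{\mu,\nu,\lambda} L_\mu{}^\lambda K_\nu{}^\mu \, T(u^\nu, \cdot , u_\lambda) = \sum_\nu T(u^\nu, \cdot , u_\nu) . \tag{1.76}$$
In general, the contraction of a type $(k, l)$ tensor is a type $(k - 1, l - 1)$ tensor.
In terms of components, it is just summing up the specified upper and lower
indices,
$$\sum_\mu T^{\ldots \mu \ldots}{}_{\ldots \mu \ldots} . \tag{1.77}$$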
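The simplest case of Eq. (1.77) is a type $(1, 1)$ tensor, whose contraction is the trace of its component matrix. The sketch below checks that the contraction is unchanged by the transformation rule of Theorem 1.32. The component values and the matrices `L` and `K` (mutually inverse, as in Eq. (1.13)) are arbitrary choices for illustration; `T[alpha][beta]` stores $T^\alpha{}_\beta$.

```python
# Mutually inverse change-of-basis matrices: L[mu][nu] = L_mu^nu, K[mu][nu] = K_mu^nu
L = [[1, 0], [-1, 1]]
K = [[1, 0], [1, 1]]

# Components T^alpha_beta of a type (1, 1) tensor in the old basis (arbitrary values)
T = [[1, 2], [3, 4]]


def transform(T, L, K):
    """T'^mu_nu = sum_{alpha,beta} L_alpha^mu K_nu^beta T^alpha_beta, Eq. (1.69)."""
    n = len(T)
    return [[sum(L[a][mu] * K[nu][b] * T[a][b] for a in range(n) for b in range(n))
             for nu in range(n)]
            for mu in range(n)]


def contract(T):
    """Sum over the one upper and one lower index, Eq. (1.77)."""
    return sum(T[mu][mu] for mu in range(len(T)))


T_new = transform(T, L, K)
print(T_new)
print(contract(T), contract(T_new))
```

The individual components change, but the contraction, being a type $(0, 0)$ tensor, is the same real number in both bases.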

Example 1.35. Another common operation is the outer product of two
tensors. The outer product of a type $(k, l)$ tensor $R$ and a type $(m, n)$ tensor
$S$ is a type $(k + m, l + n)$ tensor $T \equiv R \otimes S$. In terms of components, it is just
multiplication of real numbers, but there will be many more components
$$T^{\mu \ldots \nu \, \alpha \ldots \beta}{}_{\rho \ldots \sigma \, \gamma \ldots \delta} = R^{\mu \ldots \nu}{}_{\rho \ldots \sigma} \, S^{\alpha \ldots \beta}{}_{\gamma \ldots \delta} . \tag{1.78}$$
In this language, $v \otimes u$ is the outer product of the vectors $v$ and $u$.

Chapter 2
Review of Special Relativity
Hopefully, students know almost everything in this chapter.
2.1 Conventions
$$c = 1 \tag{2.1}$$
(If you ask what $c$ is, you should not take this course.) It is barely acceptable
that $c$ still does not equal one after students learn special relativity. It
is not acceptable not to put $c = 1$ in general relativity. By doing this,
$c$ is the natural conversion factor between time and length. Hence, 1 s =
299792458 m. The mass of the electron is about 511 keV. Leisurely, I walk with a
speed of 0.000000003. The duration of this lecture is of the order of the distance
between the Sun and Saturn.
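The numbers quoted above can be reproduced with a few lines of arithmetic. This is a rough sketch: the walking speed of 1 m/s and the mean Sun-Saturn distance of about $1.43 \times 10^{12}$ m are values assumed here for illustration, not taken from the notes.

```python
C = 299_792_458  # metres per second, exact by definition of the metre


def seconds_to_metres(t):
    """With c = 1, a time of t seconds is a length of t * C metres."""
    return t * C


def metres_to_seconds(d):
    """With c = 1, a length of d metres is a time of d / C seconds."""
    return d / C


# A leisurely walk of about 1 m/s (assumed) as a dimensionless speed
walking_speed = 1.0 / C
print(f"walking speed: {walking_speed:.1e}")

# Mean Sun-Saturn distance, roughly 1.43e12 m (assumed), expressed as a time
sun_saturn_metres = 1.43e12
lecture_minutes = metres_to_seconds(sun_saturn_metres) / 60
print(f"Sun-Saturn distance: about {lecture_minutes:.0f} minutes")
```

With these assumptions the distance to Saturn corresponds to a bit more than an hour, which is indeed the order of magnitude of a lecture.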
We sum over repeated indices if one is an upper index and one is a lower index.
This is just the contraction of tensors introduced in Example 1.34.
$$T^{\ldots \mu \ldots}{}_{\ldots \mu \ldots} = \sum_\mu T^{\ldots \mu \ldots}{}_{\ldots \mu \ldots} . \tag{2.2}$$
The invariant interval is
$$\Delta s^2 = -(\Delta t)^2 + (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 \tag{2.3}$$
$$= -(\Delta x^0)^2 + (\Delta x^1)^2 + (\Delta x^2)^2 + (\Delta x^3)^2 = \eta_{\mu\nu} \Delta x^\mu \Delta x^\nu . \tag{2.4}$$
We also say that the signature is $(-+++)$. Different books adopt different
conventions. We also put $t = x^0$ and
$$\eta_{\mu\nu} = \begin{cases} -1 & \text{if } \mu = \nu = 0 \\ 1 & \text{if } \mu = \nu = 1, 2, 3 \\ 0 & \text{otherwise} \end{cases} . \tag{2.5}$$
The whole theory of special relativity is living in the space $\mathbb{R}^4$ with the inner
product given by Eq. (2.4). This is the Minkowski space. We usually do
not specify a fixed basis/frame.
2.2 Lorentz Transformations

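See Section 2.1 of Weinberg. We know that a Lorentz transformation between
two frames is given by
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu \tag{2.6}$$
(if the origins of the two frames coincide) such that the invariant interval is invariant
$$\eta_{\mu\nu} x'^\mu x'^\nu = \eta_{\mu\nu} \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta x^\alpha x^\beta = \eta_{\alpha\beta} x^\alpha x^\beta . \tag{2.7}$$
Hence,
$$\eta_{\mu\nu} \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta = \eta_{\alpha\beta} . \tag{2.8}$$
Take the determinant of both sides and notice that $\det(\eta_{\mu\nu}) = -1$, we have
$$\det(\Lambda^\mu{}_\nu)^2 = 1 , \qquad \det(\Lambda^\mu{}_\nu) = \pm 1 . \tag{2.9}$$
Take the 00-components of Eq. (2.8),
$$-1 = -\Lambda^0{}_0 \Lambda^0{}_0 + \Lambda^1{}_0 \Lambda^1{}_0 + \Lambda^2{}_0 \Lambda^2{}_0 + \Lambda^3{}_0 \Lambda^3{}_0 \tag{2.10}$$
$$\Lambda^0{}_0 \Lambda^0{}_0 = 1 + \Lambda^1{}_0 \Lambda^1{}_0 + \Lambda^2{}_0 \Lambda^2{}_0 + \Lambda^3{}_0 \Lambda^3{}_0$$
$$\Lambda^0{}_0 \geq 1 \quad \text{or} \quad \Lambda^0{}_0 \leq -1 . \tag{2.11}$$
The identity transformation has $\Lambda^0{}_0 = +1$ and $\det(\Lambda^\mu{}_\nu) = +1$; and if we
continuously change the parameters, for example, the speed, $\Lambda^0{}_0$ cannot jump
from $\geq 1$ to $\leq -1$. We therefore consider only Lorentz transformations with $\Lambda^0{}_0 \geq 1$
and $\det(\Lambda^\mu{}_\nu) = 1$. These are called proper Lorentz transformations.
They form a group, i.e., the composite of two proper Lorentz transformations
is also a proper Lorentz transformation, etc. (Other transformations are
related to time reversal and/or space inversion.)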
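How does this abstract formulation relate to the boost and rotation of
coordinates? First, define $\gamma$ and $v$ by
$$\Lambda^0{}_0 \equiv \gamma \equiv \frac{1}{\sqrt{1 - v^2}} . \tag{2.12}$$
$\Lambda^0{}_0 \geq 1$ implies $1 > v \geq 0$. Then, for $i = 1, 2, 3$, define
$$v_i = \Lambda^i{}_0 / \gamma . \tag{2.13}$$
Because of Eq. (2.10), $v^2 = v_1^2 + v_2^2 + v_3^2$. A pure boost with velocity
$(v_1, v_2, v_3)$ is the Lorentz transformation $\tilde\Lambda$ with
$$\tilde\Lambda^0{}_0 = \gamma \tag{2.14}$$
$$\tilde\Lambda^i{}_0 = \tilde\Lambda^0{}_i = \gamma v_i \tag{2.15}$$
$$\tilde\Lambda^i{}_j = \delta_{ij} + v_i v_j \frac{\gamma - 1}{v^2} . \tag{2.16}$$
By direct calculation, we have
$$\Lambda = \tilde\Lambda R \tag{2.17}$$
and $R$ has the form
$$R = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & * & * & * \\ 0 & * & * & * \\ 0 & * & * & * \end{pmatrix} \tag{2.18}$$
where the lower-right $3 \times 3$ matrix is a rotation. (If you are not familiar
with Euler angles, etc, see, for example, H. Goldstein, C.P. Poole Jr. and
J.L. Safko, Classical Mechanics.) Eq. (2.17) is understood as: all proper
Lorentz transformations
$$\eta_{\mu\nu} \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta = \eta_{\alpha\beta} , \qquad \Lambda^0{}_0 \geq 1 , \qquad \det(\Lambda^\mu{}_\nu) = 1 , \tag{2.19}$$
can be thought of as the composition of a rotation of the coordinate axes and
then a boost along some direction.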
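The pure boost of Eqs. (2.14) to (2.16) can be written down explicitly and tested against the defining condition Eq. (2.8). The following is a minimal sketch with plain nested lists; the function names `boost` and `preserves_eta` and the sample velocities are assumptions made for this illustration.

```python
import math

# Minkowski metric with signature (-+++), Eq. (2.5)
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]


def boost(v):
    """4x4 matrix of a pure boost with velocity v = (v1, v2, v3), 0 < |v| < 1."""
    v2 = sum(x * x for x in v)
    if not 0 < v2 < 1:
        raise ValueError("speed must satisfy 0 < |v| < 1 in units with c = 1")
    gamma = 1 / math.sqrt(1 - v2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma                                   # Eq. (2.14)
    for i in range(3):
        lam[i + 1][0] = lam[0][i + 1] = gamma * v[i]    # Eq. (2.15)
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            lam[i + 1][j + 1] = delta + v[i] * v[j] * (gamma - 1) / v2  # Eq. (2.16)
    return lam


def preserves_eta(lam, tol=1e-12):
    """Check eta_{mu nu} Lambda^mu_alpha Lambda^nu_beta = eta_{alpha beta}, Eq. (2.8)."""
    for a in range(4):
        for b in range(4):
            total = sum(ETA[m][n] * lam[m][a] * lam[n][b]
                        for m in range(4) for n in range(4))
            if abs(total - ETA[a][b]) > tol:
                return False
    return True


print(boost((0.6, 0.0, 0.0))[0][0])            # gamma for v = 0.6
print(preserves_eta(boost((0.6, 0.0, 0.0))))
print(preserves_eta(boost((0.3, 0.4, 0.5))))
```

The case $v = 0$ is excluded only because Eq. (2.16) divides by $v^2$; the boost is then simply the identity.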
2.3 Time Dilation and Lorentz Contraction

Suppose that two events A and B happened at the same location but at different
times in some frame, then their coordinates could be $(x^0_A, x^1_A, x^2_A, x^3_A)$ and
$(x^0_B, x^1_B, x^2_B, x^3_B)$ where $x^i_A = x^i_B$ for $i = 1, 2, 3$. The duration between them
is
$$\Delta x^0 = |x^0_A - x^0_B| . \tag{2.20}$$
In another frame, the two events will happen at different locations and
different times. By Eq. (2.6), the duration in the new frame is
$$\Delta x'^0 = |x'^0_A - x'^0_B| = |\Lambda^0{}_\mu (x^\mu_A - x^\mu_B)| = \Lambda^0{}_0 |x^0_A - x^0_B| \geq |x^0_A - x^0_B| . \tag{2.21}$$
This is time dilation.
Note also that
$$-(\Delta x^0)^2 = -(x^0_A - x^0_B)^2$$
$$= \eta_{\mu\nu} (x^\mu_A - x^\mu_B)(x^\nu_A - x^\nu_B)$$
$$= \eta_{\mu\nu} (x'^\mu_A - x'^\mu_B)(x'^\nu_A - x'^\nu_B) . \tag{2.22}$$
We have the second line because the two events are at the same location, and
the expression in the second line is Lorentz invariant, hence the third line.
Lorentz contraction or length contraction means that the length of
a moving object will be shorter along the direction of motion.
2.4 Energy Momentum Vector

For a particle of mass $m$, moving with constant velocity $(v_1, v_2, v_3)$, we have as
usual,
$$\gamma \equiv \frac{1}{\sqrt{1 - v^2}} . \tag{2.23}$$
Its energy and momentum are $E = m\gamma$ and
$$(p^1, p^2, p^3) = m\gamma (v_1, v_2, v_3) . \tag{2.24}$$
Define $p^0 = E$, then $(p^0, p^1, p^2, p^3)$ transforms as a four vector, which means,
if two frames are related by Eq. (2.6), the energy-momentum in another
frame is given by
$$p'^\mu = \Lambda^\mu{}_\nu p^\nu . \tag{2.25}$$
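A quick consistency check of Eqs. (2.23) and (2.24): the Minkowski "length square" of the energy-momentum vector, $\eta_{\mu\nu} p^\mu p^\nu = -E^2 + |\vec p|^2$, should equal $-m^2$ for any velocity, which is what one expects of a four vector whose value in the rest frame is $(m, 0, 0, 0)$. The sketch below uses assumed sample values for the mass and velocity, and the function names are made up for illustration.

```python
import math


def four_momentum(m, v):
    """(p^0, p^1, p^2, p^3) = m * gamma * (1, v1, v2, v3), Eqs. (2.23)-(2.24)."""
    v2 = sum(x * x for x in v)
    gamma = 1 / math.sqrt(1 - v2)
    return [m * gamma] + [m * gamma * x for x in v]


def minkowski_square(p):
    """eta_{mu nu} p^mu p^nu with signature (-+++)."""
    return -p[0] ** 2 + sum(x * x for x in p[1:])


p = four_momentum(2.0, (0.6, 0.0, 0.0))
print(p)                    # energy 2.5 and momentum 1.5 along x, up to rounding
print(minkowski_square(p))  # close to -m^2 = -4

q = four_momentum(2.0, (0.3, 0.4, 0.5))
print(minkowski_square(q))  # again close to -4
```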
2.5 Charge, Current and Conservation

See Section 2.6 of Weinberg. We first list some properties of the Dirac $\delta$ function.
It is defined such that if $f : \mathbb{R} \to \mathbb{R}$ is a continuous real valued function, then
$$\int_{-\infty}^{\infty} f(x) \, \delta(x - x_0) \, dx = f(x_0) . \tag{2.26}$$
(Mathematicians will say that the $\delta$ function is a distribution.) For some other
function $g(x)$, assume that $g(x) = 0$ has only one root at $x = x_0$ and $g'(x_0) >
0$, let $y = g(x)$, then
$$\int f(x) \, \delta(g(x)) \, dx = \int f(x) \, \delta(y) \, \frac{dx}{dy} \, dy = \int f(x) \, \delta(y) \left( \frac{dg(x)}{dx} \right)^{-1} dy = f(x_0) \, g'(x_0)^{-1} . \tag{2.27}$$
If $g'(x_0) < 0$, we have to switch the limits and get $-f(x_0) \, g'(x_0)^{-1}$.
Hence,
$$\delta(g(x)) = \frac{\delta(x - x_0)}{|g'(x_0)|} . \tag{2.28}$$
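Eq. (2.28) can be tested numerically by replacing the $\delta$ function with a narrow Gaussian. This is only a sketch under stated assumptions: the Gaussian width, the integration range, the midpoint rule and the sample functions $f(x) = \cos x$ and $g(x) = 1 - 2x$ (root $x_0 = 1/2$, $g'(x_0) = -2$) are all choices made for illustration.

```python
import math


def delta_eps(y, eps):
    """Gaussian of width eps, which approximates the delta function as eps -> 0."""
    return math.exp(-y * y / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))


def integrate(func, lo, hi, n=200_000):
    """Midpoint rule with n equal subintervals."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


def g(x):
    return 1 - 2 * x   # single root at x0 = 0.5, with g'(x0) = -2 < 0


x0 = 0.5
lhs = integrate(lambda x: math.cos(x) * delta_eps(g(x), 1e-3), -5.0, 5.0)
rhs = math.cos(x0) / abs(-2)   # f(x0) / |g'(x0)|, from Eq. (2.28)
print(lhs, rhs)
```

Because $g'(x_0)$ is negative here, the example also illustrates why the absolute value is needed in Eq. (2.28).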
If $g(x) = 0$ has multiple roots, we have to sum them up. For multi-
dimensional $\delta$ functions and multi-dimensional integrals, we have a similar
formula. We only need the case that $y^i = A^i{}_j x^j$ where $(A^i{}_j)$ is an invertible
matrix. Denote $\delta^n(y^i) = \delta(y^1) \delta(y^2) \cdots \delta(y^n)$,
$$\int f(x^1, \ldots, x^n) \, \delta^n(y^i) \, dx^1 \cdots dx^n = \int f(x^1, \ldots, x^n) \, \delta^n(y^i) \left| \frac{\partial(x^1 \cdots x^n)}{\partial(y^1 \cdots y^n)} \right| dy^1 \cdots dy^n = \frac{f(0, \ldots, 0)}{|\det(A^i{}_j)|} . \tag{2.29}$$
As a result,
$$\delta^n(y^i) = |\det(A^i{}_j)^{-1}| \, \delta^n(x^i) . \tag{2.30}$$
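Suppose we have chosen a frame and some charged particle is moving
with coordinates $x^\mu = f^\mu(\tau)$. The trajectory is called the world line of the
particle and $\tau$ is a parameter on the world line. Its speed must be less than
or equal to 1
$$\left( \frac{df^1/d\tau}{df^0/d\tau} \right)^2 + \left( \frac{df^2/d\tau}{df^0/d\tau} \right)^2 + \left( \frac{df^3/d\tau}{df^0/d\tau} \right)^2 \leq 1$$
$$\left( \frac{df^1}{d\tau} \right)^2 + \left( \frac{df^2}{d\tau} \right)^2 + \left( \frac{df^3}{d\tau} \right)^2 \leq \left( \frac{df^0}{d\tau} \right)^2$$
$$\eta_{\mu\nu} \frac{df^\mu}{d\tau} \frac{df^\nu}{d\tau} \leq 0 . \tag{2.31}$$
Its motion will induce a current. Let its charge be $e$. Define
$$J^\mu(x) \equiv \int d\tau \, e \, \delta^4(x - f(\tau)) \, \frac{df^\mu(\tau)}{d\tau} . \tag{2.32}$$
With the above discussion of the $\delta$ function, we see that $J^\mu(x)$ is a four vector.
To understand its physical meaning, let us assume for simplicity that the particle is
moving with constant velocity. Then, we could choose the parameter $\tau = x^0$
as the time coordinate and $f^i(\tau) = v^i \tau + X^i$. We have $df^i/d\tau = v^i$ for
$i = 1, 2, 3$, and $f^0(\tau) = \tau = x^0$.
$$J^i = \int d\tau \, e \, \delta^4(x - f(\tau)) \, v^i = \int d\tau \, e \, \delta(x^0 - \tau) \, \delta^3(x^i - v^i \tau - X^i) \, v^i = e \, \delta^3(x^i - v^i x^0 - X^i) \, v^i \tag{2.33}$$
$$J^0 = \int d\tau \, e \, \delta^4(x - f(\tau)) = e \, \delta^3(x^i - v^i x^0 - X^i) . \tag{2.34}$$
$J^0$ is the charge density and $J^i$ is the current density. It is not difficult to
convince oneself that this interpretation is valid for general motion (varying
velocity).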
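Denote $\partial_\mu \equiv \partial / \partial x^\mu$. We are going to prove that $\partial_\mu J^\mu = 0$.
$$\partial_\mu J^\mu = e \int d\tau \, \partial_\mu \delta^4(x - f(\tau)) \, \frac{df^\mu(\tau)}{d\tau}$$
$$= e \int d\tau \, \frac{\partial \delta^4(x - f(\tau))}{\partial x^\mu} \frac{df^\mu(\tau)}{d\tau}$$
$$= -e \int d\tau \, \frac{\partial \delta^4(x - f(\tau))}{\partial f^\mu(\tau)} \frac{df^\mu(\tau)}{d\tau}$$
$$= -e \int d\tau \, \frac{d \, \delta^4(x - f(\tau))}{d\tau}$$
$$= -e \, \delta^4(x - f(\tau)) \Big|_{\tau = -\infty}^{\tau = \infty}$$
$$= 0 . \tag{2.35}$$
We have the third equality because, for example,
$$\frac{\partial \delta^4(x - f(\tau))}{\partial x^0} = \frac{\partial \delta(x^0 - f^0(\tau))}{\partial x^0} \, \delta(x^1 - f^1(\tau)) \, \delta(x^2 - f^2(\tau)) \, \delta(x^3 - f^3(\tau))$$
$$= -\frac{\partial \delta(x^0 - f^0(\tau))}{\partial f^0(\tau)} \, \delta(x^1 - f^1(\tau)) \, \delta(x^2 - f^2(\tau)) \, \delta(x^3 - f^3(\tau))$$
$$= -\frac{\partial \delta^4(x - f(\tau))}{\partial f^0(\tau)} . \tag{2.36}$$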
For any tensor that satisfies Eq. (2.35), we have a conservation law, as we
going to explain. In vector calculus notations, Eq. (2.35) is
0 J 0 + 1 J 1 + 2 J 2 + 3 J 3 = 0
J 0
+J = 0 . (2.37)
t
Define the quantity
\[ Q \equiv \int J^0(x)\,dx^1 dx^2 dx^3\,. \tag{2.38} \]
Using Gauss's theorem, we find that $Q$ is time-independent:
\[ \frac{dQ}{dt} = \int d^3x\,\partial_0 J^0 = -\int d^3x\,\nabla\cdot\mathbf{J}(x) = -\lim_{S\to\infty}\int_S \mathbf{J}\cdot d\mathbf{S} = 0\,, \tag{2.39} \]
where $S$ is some surface that tends to enclose everything. This further illustrates the concept of current density, as $\int_S \mathbf{J}\cdot d\mathbf{S}$ is the rate of charge passing through the surface $S$.
In the simple case of Eq. (2.32), $Q = e$ is just the charge of the particle. (Check this.) For more than one particle, labeled by $n$, of charge $e_n$, we just sum them up:
\[ J^\mu(x) \equiv \sum_n \int d\tau\, e_n\,\delta^4(x - f_n(\tau))\,\frac{df_n^\mu(\tau)}{d\tau}\,. \tag{2.40} \]
In this case, $Q = \sum_n e_n$.
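For the single particle, the check can be sketched in the constant-velocity case, where the charge density is $J^0 = e\,\delta^3(x^i - v^i x^0 - X^i)$ as in Eq. (2.34). This covers only that special case; general motion needs the full integral over the world line.

```latex
Q = \int J^0(x)\,dx^1 dx^2 dx^3
  = \int e\,\delta^3\!\left(x^i - v^i x^0 - X^i\right) dx^1 dx^2 dx^3
  = e .
```

The three-dimensional $\delta$ function integrates to 1 at every fixed time $x^0$, so the result does not depend on $x^0$, in agreement with Eq. (2.39).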
2.6 Stress-energy Tensor

See Section 2.8 of Weinberg, Chapter 4 of Schutz. Anticipating that particles will move in a curved spacetime and, in turn, particles will determine the curvature of spacetime, we need a way, the stress-energy tensor or just stress tensor, to describe materials.
Using the notation of the last section, for a system of particles with energy-momentum four-vectors $p_n^\mu(\tau)$, their stress tensor is defined as
\[ T^{\mu\nu}(x) \equiv \sum_n \int d\tau\, p_n^\mu(\tau)\,\frac{df_n^\nu}{d\tau}\,\delta^4(x - f_n(\tau))\,. \tag{2.41} \]
This is a tensor because $T'^{\mu\nu} = \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta T^{\alpha\beta}$ under a Lorentz transformation $\Lambda$. It is also symmetric,
\[ T^{\mu\nu}(x) = T^{\nu\mu}(x)\,, \tag{2.42} \]
because $p_n^\mu(\tau)$ is proportional to $df_n^\mu/d\tau$, with proportionality factor $p_n^0\,d\tau/dx^0$.
If the particles are free, then $p_n^\mu$ are constants. We can prove that $\partial_\nu T^{\mu\nu}(x) = 0$ just by replacing $e$ by $p_n^\mu$ in Eq. (2.35). Similarly, we could, for a fixed $i$, interpret $T^{ij}$ as the momentum flux of the $i$-th momentum and $\int_S \mathbf{T}^i\cdot d\mathbf{S}$ (where $\mathbf{T}^i$ has components $T^{ij}$) as the rate of momentum passing through the surface $S$.
In general, the stress-energy tensor is not conserved:
\[ \partial_\nu T^{\mu\nu}(x) \neq 0\,. \tag{2.43} \]
This is expected, as in the presence of an external force, energy and momentum are not conserved. However, in many cases, it is conserved.
1. If the particles interact only during collisions that are strictly localized in space, it is conserved. See Section 2.8 of Weinberg.
2. The stress-energy tensor for a fluid is conserved; see Chapter 4 of Schutz.
3. For particles interacting with the electromagnetic field, the sum of the stress-energy tensor for the particles and a suitably constructed stress-energy tensor for the electromagnetic field is conserved. (Electromagnetic theory in general relativity is a very beautiful theory. Unfortunately, we don't have time to talk about it.)
For other materials, there can be no general recipe for constructing $T^{\mu\nu}$ (p.335 of O'Neill). However, we do have some guidelines on how to construct the stress-energy tensor (p.96 of Schutz). One prominent requirement is that the tensor must be conserved. If it is not, we put in more terms for other fields or materials, just as in a mechanical system we look for the force that changes the total energy, until conservation of energy (or conservation of the stress-energy tensor) is restored. Stress-energy tensors for all known cases are conserved.
Chapter 3
Differentiable Manifolds
How can spacetime be curved? How do particles move in the curved space-
time? The necessary mathematical tool to describe all these is the theory of
differentiable manifolds. This is fundamental to general relativity and many
branches of mathematics.
3.1 An Example
Back to Newtonian concepts, space is described as $\mathbb{R}^3$ and the trajectory of a particle is a curve $x : \mathbb{R} \to \mathbb{R}^3$, a mapping from time to space. The velocity is a vector $dx/dt$. This space is flat.
What we want to discuss is the surface of a sphere, $S$:
\[ x^2 + y^2 + z^2 = R^2\,. \tag{3.1} \]
It is two dimensional: we need two real numbers to describe a point on the surface. For example, if we consider only the northern hemisphere, we could choose $(x, y)$ as coordinates. Then, the point is $(x, y, +\sqrt{R^2 - x^2 - y^2})$. This coordinate system cannot be extended past the equator.

We could choose the polar coordinates: $(\theta, \phi)$ is the point $(x, y, z) = (R\sin\theta\cos\phi, R\sin\theta\sin\phi, R\cos\theta)$. This is good except at the north and south poles, where $\phi$ is undefined. The transformation equations between the two coordinate systems are given by
\[
\begin{aligned} x &= R\sin\theta\cos\phi \\ y &= R\sin\theta\sin\phi \end{aligned}
\qquad\text{and}\qquad
\begin{aligned} \theta &= \sin^{-1}\!\left(\sqrt{x^2+y^2}/R\right) \\ \phi &= \tan^{-1}(y/x) \end{aligned}
\,. \tag{3.2}
\]
Note that the pair of equations on the left is only valid for the northern hemisphere without the north pole, while the pair on the right is even more troublesome because we have to worry about the points where $x = 0$, the various branches of arcsine and arctangent, etc. However, in their respective domains of definition, they are infinitely differentiable.
It can be proved that no single coordinate system can cover the whole
surface. Hence, the worries about the domains of definition of coordinate
systems and transformation of coordinates are unavoidable.
We need functions on the surface $S$, for example, the temperature at each point of $S$. Mathematical functions can be quite weird. I cannot resist quoting the function $f : S \to \mathbb{R}$,
\[ f(x, y, z) = \begin{cases} 1 & \text{if } x \text{ is a rational number} \\ 0 & \text{if } x \text{ is an irrational number.} \end{cases} \tag{3.3} \]
The function itself does not depend on any coordinate system. But to write it down concretely, we need some coordinate system. I write the function as $f(x, y, z)$, but in fact it depends only on two coordinates, for example, $(x, y)$ for points in the northern hemisphere. We could also express it as a function of $(\theta, \phi)$ by Eq. (3.2). Physicists consider only functions that are infinitely differentiable when expressed in some coordinates.
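A particle moving on the surface $S$ is given by a function $f : (a, b) \to S$, where $(a, b) = \{a < t < b\}$ is the time interval of concern. Once again, this is independent of the coordinate system. If at some particular time the particle is in the domain of definition of some coordinate system, we could express its position as, e.g., $(x(t), y(t))$ or $(\theta(t), \phi(t))$.
In terms of coordinates, the velocity is
\[ \left(\frac{dx(t)}{dt}, \frac{dy(t)}{dt}\right) \tag{3.4} \]
or
\[ \left(\frac{d\theta(t)}{dt}, \frac{d\phi(t)}{dt}\right)\,. \tag{3.5} \]
We even have
\[ \frac{dx(t)}{dt} = \frac{\partial x}{\partial\theta}\frac{d\theta(t)}{dt} + \frac{\partial x}{\partial\phi}\frac{d\phi(t)}{dt} \tag{3.6} \]
\[ \frac{dy(t)}{dt} = \frac{\partial y}{\partial\theta}\frac{d\theta(t)}{dt} + \frac{\partial y}{\partial\phi}\frac{d\phi(t)}{dt} \tag{3.7} \]
where the partial derivatives are calculated from Eq. (3.2), and similar equations hold for $d\theta(t)/dt$ and $d\phi(t)/dt$.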
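As a quick sanity check of Eq. (3.6), the chain-rule expression can be compared with a direct numerical derivative of $x(t)$. This is a minimal sketch: the radius and the curve $\theta(t) = 0.4 + 0.3t$, $\phi(t) = t^2$ are arbitrary illustrative choices, not from the notes.

```python
import math

R = 1.5  # illustrative radius

def theta_of(t):
    return 0.4 + 0.3 * t      # hypothetical curve, theta component

def phi_of(t):
    return t * t              # hypothetical curve, phi component

def x_of(t):
    """x coordinate along the curve, x = R sin(theta) cos(phi)."""
    return R * math.sin(theta_of(t)) * math.cos(phi_of(t))

def dx_dt_chain(t):
    """Chain rule: dx/dt = (dx/dtheta) dtheta/dt + (dx/dphi) dphi/dt."""
    theta, phi = theta_of(t), phi_of(t)
    dtheta_dt, dphi_dt = 0.3, 2.0 * t
    dx_dtheta = R * math.cos(theta) * math.cos(phi)
    dx_dphi = -R * math.sin(theta) * math.sin(phi)
    return dx_dtheta * dtheta_dt + dx_dphi * dphi_dt

def dx_dt_numeric(t, h=1e-6):
    """Central finite difference of x(t)."""
    return (x_of(t + h) - x_of(t - h)) / (2.0 * h)

print(dx_dt_chain(0.8), dx_dt_numeric(0.8))
```

The two values agree to within the finite-difference error.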
But wait! What do we mean by velocity here? In some sense, velocity should have a $z$-component. However, $S$ is two dimensional, so there are only two directions for the particle to move. $(x, y)$ and $(\theta, \phi)$ are only two coordinate systems out of infinitely many. Are they special? We have not even talked about how to measure the length of a curve on the surface... We have enough motivation to start the following sections.
3.2 Smooth Manifolds

See O'Neill. Let $U$ be an open set of $\mathbb{R}^n$. A real valued function $f : U \to \mathbb{R}$ is smooth, $C^\infty$, if all partial derivatives of $f$ to all orders exist and are continuous in $U$. A function $f : \mathbb{R}^m \to \mathbb{R}^n$ is smooth if all $n$ components of $f$ are smooth. A smooth manifold is something that locally behaves just like $\mathbb{R}^n$.
Definition 3.1. Let $M$ be a set, or a topological space if you are a mathematician. An $n$ dimensional coordinate system on a subset $U$ of $M$ is a one-to-one map $\xi : U \to \mathbb{R}^n$,
\[ \xi(p) = (x^1(p), \dots, x^n(p)) \tag{3.8} \]
for $p \in U$. $x^\mu(p)$ are the coordinate functions.

Different coordinate systems could be defined on different domains, but we demand that the transformation functions are smooth. Let $\eta : V \to \mathbb{R}^n$ be another coordinate system. Suppose that their intersection is non-empty, $U \cap V \neq \emptyset$. We say that they overlap smoothly if both the functions $\xi \circ \eta^{-1}$ and $\eta \circ \xi^{-1}$ are smooth.
Figure 3.1: Transformation of two coordinate systems.
Definition 3.2. An $n$ dimensional smooth manifold is a set that is covered by $n$ dimensional coordinate systems such that any two overlap smoothly. (There are some other technical requirements that I omitted.)
Example 3.3. Let $V$ be an $n$ dimensional vector space. Choose a basis $v_\mu$. Then, $v \in V$ can be expressed as a linear combination
\[ v = a^\mu v_\mu\,. \tag{3.9} \]
Define $\xi : V \to \mathbb{R}^n$ by
\[ \xi(v) = (a^1, \dots, a^n)\,. \tag{3.10} \]
Then, $\xi$ is a coordinate system on $V$ and we can cover $V$ by one coordinate system. If we choose another basis, the coordinate transformation is given by multiplication by a matrix, which is smooth. Hence, $V$ is an $n$ dimensional smooth manifold. Note that there are other non-linear coordinate systems on $V$, for example, the polar coordinate system.
Example 3.4. The surface of a sphere in Section 3.1 is a two dimensional smooth manifold.
Definition 3.5. For a real valued function $f : M \to \mathbb{R}$, its coordinate expression in terms of $\xi$ is $f \circ \xi^{-1} : \xi(U) \to \mathbb{R}$. This is a real valued function of $n$ real numbers,
\[ f = (f \circ \xi^{-1})(x^1, \dots, x^n)\,. \tag{3.11} \]
If the coordinate expression is smooth for all coordinate systems, we say that $f$ is a smooth function on $M$. The set of all smooth functions on $M$ is denoted by $\mathcal{F}(M)$.
We have simple properties that the sum and the product of two smooth
functions are smooth, etc. Hence, F(M ) is a (infinite dimensional) vector
space.
Definition 3.6. This is the most important definition in manifold theory, also very elegant and beautiful. Enjoy it.
For $p \in M$, a tangent vector to $M$ at $p$ is a real valued function $v : \mathcal{F}(M) \to \mathbb{R}$ that satisfies
1. $\mathbb{R}$-linearity: $v(af + bg) = a\,v(f) + b\,v(g)$,
2. Leibniz rule: $v(fg) = v(f)\,g(p) + f(p)\,v(g)$
for all $a, b \in \mathbb{R}$ and $f, g \in \mathcal{F}(M)$. (This is the concept of directional derivatives.)
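To see that directional derivatives really behave like this, here is a minimal numerical sketch on $\mathbb{R}^2$ (used as a stand-in manifold with its obvious coordinates). The point `p`, the direction `u` and the test functions are arbitrary illustrative choices; the derivative is approximated by a central difference, so the two properties hold only up to a small numerical error.

```python
import math

def directional_derivative(func, p, u, h=1e-5):
    """Central-difference estimate of d/dt func(p + t u) at t = 0."""
    forward = func(p[0] + h * u[0], p[1] + h * u[1])
    backward = func(p[0] - h * u[0], p[1] - h * u[1])
    return (forward - backward) / (2.0 * h)

p = (0.3, -1.2)   # hypothetical base point
u = (2.0, 0.5)    # hypothetical direction

def f(x, y):
    return math.sin(x) * y

def g(x, y):
    return x * x + math.exp(y)

def v(func):
    """The 'tangent vector' at p: a map from functions to real numbers."""
    return directional_derivative(func, p, u)

a, b = 1.5, -0.7
# R-linearity: v(a f + b g) - (a v(f) + b v(g)) should vanish
linear_gap = v(lambda x, y: a * f(x, y) + b * g(x, y)) - (a * v(f) + b * v(g))
# Leibniz rule: v(f g) - (v(f) g(p) + f(p) v(g)) should vanish
leibniz_gap = v(lambda x, y: f(x, y) * g(x, y)) - (v(f) * g(*p) + f(*p) * v(g))
print(linear_gap, leibniz_gap)
```

Both gaps come out at the level of numerical noise, which is the content of the two conditions in the definition.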
Definition 3.7. For $p \in M$, the set of vectors at $p$ is the tangent space $T_p(M)$. Define addition and scalar multiplication of vectors by
\[ (v + w)(f) \equiv v(f) + w(f) \tag{3.12} \]
\[ (av)(f) \equiv a\,v(f) \tag{3.13} \]
for $v, w \in T_p(M)$ and $a \in \mathbb{R}$. $T_p(M)$ is a vector space, as defined in Definition 1.1.
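Definition 3.8. Let $\xi = (x^1, \dots, x^n)$ be a coordinate system at $p \in M$. For $f \in \mathcal{F}(M)$, its coordinate expression is a real valued function of $n$ real numbers, $(f \circ \xi^{-1})(x^1, \dots, x^n)$. We could calculate its partial derivative
\[ \frac{\partial (f \circ \xi^{-1})}{\partial x^\mu} \tag{3.14} \]
near $p$. Define a vector by evaluating the above partial derivative at $p$, $\partial_\mu|_p : \mathcal{F}(M) \to \mathbb{R}$,
\[ (\partial_\mu|_p)(f) \equiv \left.\frac{\partial (f \circ \xi^{-1})}{\partial x^\mu}\right|_p\,. \tag{3.15} \]
This can be visualized as an arrow at $p$ tangent to the $x^\mu$ coordinate line.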

Theorem 3.9. Let $v \in T_p(M)$. (1) If $h \in \mathcal{F}(M)$ is a constant function, then $v(h) = 0$. (2) If $f, g \in \mathcal{F}(M)$ are equal in a neighborhood of $p$, then $v(f) = v(g)$.
Proof. (1) Let $e$ be the constant function $e(p) \equiv 1$. We have, by the Leibniz rule,
\[ v(e) = v(e \cdot e) = v(e)\,e(p) + e(p)\,v(e) = 2v(e)\,. \tag{3.16} \]
Hence, $v(e) = 0$, and $v(h) = v(h\,e) = h\,v(e) = 0$, where $h$ also denotes the constant value of the function.
(2) $f - g = 0$ in some neighborhood of $p$. We could find a smooth function $b$ such that it equals 1 in a smaller neighborhood of $p$ and equals 0 in the region where $f \neq g$. Then, $b(f - g) = 0$ everywhere. We have
\[ 0 = v(b(f - g)) = v(b)\,(f(p) - g(p)) + b(p)\,(v(f) - v(g)) = v(f) - v(g) \tag{3.17} \]
because $v(b(f-g)) = v(0) = 0$ by part (1), $f(p) = g(p)$ and $b(p) = 1$.
Theorem 3.10 (Basis Theorem). Let $\xi = (x^1, \dots, x^n)$ be a coordinate system at $p \in M$. The vectors $\partial_\mu|_p$ form a basis of $T_p(M)$, and
\[ v = v(x^\mu)\,\partial_\mu|_p\,. \tag{3.18} \]
Hence, $T_p(M)$ is an $n$ dimensional vector space.
Proof. The idea of the proof is that for any function $f \in \mathcal{F}(M)$ and point $q$ near $p$, there exist functions $f_\mu$ such that
\[ f(q) = f(p) + f_\mu(q)\,(x^\mu(q) - x^\mu(p))\,. \tag{3.19} \]
This is very similar to the Taylor expansion of a function. Then,
\[
\begin{aligned}
v(f) &= v\big(f(p) + f_\mu(q)\,(x^\mu(q) - x^\mu(p))\big) \\
&= 0 + v(f_\mu)\,(x^\mu(p) - x^\mu(p)) + f_\mu(p)\,v\big(x^\mu(q) - x^\mu(p)\big) \\
&= f_\mu(p)\,v(x^\mu)\,.
\end{aligned}
\tag{3.20}
\]
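Let the coordinates of $p$ be $\xi(p) = (x_p^1, \dots, x_p^n)$. For a function $g$ defined near $(x_p^1, \dots, x_p^n)$, and another point $(x^1, \dots, x^n)$ near $\xi(p)$, we consider the function $h$, defined on an open interval containing $[0, 1]$,
\[ h(t) = g\big(t(x^1 - x_p^1, \dots, x^n - x_p^n) + (x_p^1, \dots, x_p^n)\big)\,. \tag{3.21} \]
By the fundamental theorem of calculus,
\[
\begin{aligned}
g\big(r(x^1 - x_p^1, \dots, x^n - x_p^n) + (x_p^1, \dots, x_p^n)\big)
&= h(r) \\
&= h(0) + \int_0^r \frac{dh}{dt}\,dt \\
&= h(0) + \int_0^r \frac{\partial g}{\partial x^\mu}\big(t(x^1 - x_p^1, \dots, x^n - x_p^n) + (x_p^1, \dots, x_p^n)\big)\,dt\,(x^\mu - x_p^\mu) \\
&\equiv g(x_p^1, \dots, x_p^n) + g_\mu\,(x^\mu - x_p^\mu)\,,
\end{aligned}
\tag{3.22}
\]
where $g_\mu$ is the function
\[ g_\mu = \int_0^r \frac{\partial g}{\partial x^\mu}\big(t(x^1 - x_p^1, \dots, x^n - x_p^n) + (x_p^1, \dots, x_p^n)\big)\,dt\,. \tag{3.23} \]
Putting $r = 1$, we have
\[ g(x^1, \dots, x^n) = g(x_p^1, \dots, x_p^n) + g_\mu(r = 1)\,(x^\mu - x_p^\mu)\,. \tag{3.24} \]
Setting $g = f \circ \xi^{-1}$, the coordinate expression of $f$, the above equation becomes
\[ f(q) = f(p) + g_\mu(r = 1)\,(x^\mu(q) - x_p^\mu) \tag{3.25} \]
where $g_\mu(r = 1)$ also depends on $q$ via $(x^1, \dots, x^n)$. $x^\mu(q)$ are functions on $M$ as the coordinates of $q$.
Eq. (3.20) gives us
\[ v(f) = g_\mu(r = 1, q = p)\,v(x^\mu)\,, \tag{3.26} \]
where
\[
\begin{aligned}
g_\mu(r = 1, q = p)
&= \int_0^1 \frac{\partial g}{\partial x^\mu}\big(t(x_p^1 - x_p^1, \dots, x_p^n - x_p^n) + (x_p^1, \dots, x_p^n)\big)\,dt \\
&= \int_0^1 \frac{\partial g}{\partial x^\mu}(x_p^1, \dots, x_p^n)\,dt \\
&= \frac{\partial g}{\partial x^\mu}(x_p^1, \dots, x_p^n) \\
&= \frac{\partial (f \circ \xi^{-1})}{\partial x^\mu}(x_p^1, \dots, x_p^n) \\
&= (\partial_\mu|_p)(f)\,.
\end{aligned}
\tag{3.27}
\]
To show that the $\partial_\mu|_p$ are linearly independent, suppose $a^\mu\,\partial_\mu|_p = 0$. Apply this vector to the coordinate function $x^\nu$,
\[ 0 = a^\mu\,\partial_\mu|_p(x^\nu) = a^\mu \left.\frac{\partial x^\nu}{\partial x^\mu}\right|_p = a^\mu\,\delta^\nu_\mu = a^\nu\,. \tag{3.28} \]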
Example 3.11. The trajectory of a particle, or a curve, on a smooth manifold is a smooth map $\gamma : I \to M$ where $I$ is an open interval of real numbers. The velocity vector is defined as follows.
At different times, the particle is at different locations. As a result, the velocity vectors live in different tangent spaces. We do not know how to compare velocity vectors in different tangent spaces (yet).
At time $t$, the particle is at $\gamma(t) \in M$. The velocity vector is a function $\gamma'(t_0) : \mathcal{F}(M) \to \mathbb{R}$,
\[ \gamma'(t_0)(f) = \left.\frac{d f(\gamma(t))}{dt}\right|_{t_0}\,. \tag{3.29} \]
It is easy to check that this definition satisfies the requirements in Definition 3.6. Hence, $\gamma'(t) \in T_{\gamma(t)}(M)$.
In terms of a coordinate system, by Theorem 3.10,
\[ \gamma'(t) = \frac{dx^\mu(\gamma(t))}{dt}\,\partial_\mu|_{\gamma(t)}\,. \tag{3.30} \]
$dx^\mu(\gamma(t))/dt$ are the components of the velocity vector relative to the coordinate system. See Eq. (3.4) and Eq. (3.5).
3.3 Vector Fields and One Forms

Definition 3.12. A vector field on a manifold $M$ is a function that assigns to each point a vector at that point. If $f \in \mathcal{F}(M)$ and $V$ is a vector field, $Vf$ is the function
\[ (Vf)(p) \equiv V_p(f)\,, \tag{3.31} \]
which is the action of the vector at $p$ on the function $f$. We say that a vector field is smooth if $Vf \in \mathcal{F}(M)$ for all $f \in \mathcal{F}(M)$.
Theorem 3.13. In terms of a coordinate system, the vector fields $\partial_\mu$ defined in Eq. (3.15) are smooth. Theorem 3.10 tells us that any vector field can be written as
\[ V = (V x^\mu)\,\partial_\mu\,. \tag{3.32} \]
$V$ is smooth if and only if $V x^\mu$ are smooth as functions for all $\mu$.
Definition 3.14. The set of all smooth vector fields on $M$ is denoted by $\mathfrak{X}(M)$. It is an infinite dimensional vector space.
Definition 3.15. At $p \in M$, the tangent space $T_p(M)$ is an $n$ dimensional vector space. Its dual is the cotangent space $T_p(M)^*$, which is also an $n$ dimensional vector space. Hence, $\omega_p \in T_p(M)^*$ means $\omega_p$ is a real valued linear function of vectors at $p$, $\omega_p : T_p(M) \to \mathbb{R}$.
A one-form $\omega$, or covector, on $M$ is a function that assigns to each point an element $\omega_p$ of the cotangent space. This is the object dual to a vector field. $\omega$ is smooth if and only if $\omega V$ are smooth as functions for all smooth vector fields $V$.
The set of all smooth one-forms on $M$ is denoted by $\mathfrak{X}^*(M)$. It is an infinite dimensional vector space.
Definition 3.16. For a function $f \in \mathcal{F}(M)$, its differential is the one-form $df$ defined by
\[ (df)(v) \equiv v(f)\,. \tag{3.33} \]
The right hand side is the action of a vector on a function, see Definition 3.6. $df$ is smooth because for a smooth vector field $V$, $(df)(V) = V(f)$ is a smooth function.
Theorem 3.17. In terms of a coordinate system, we have
\[ \omega = \omega(\partial_\mu)\,dx^\mu \tag{3.34} \]
and
\[ df = \partial_\mu f\,dx^\mu\,. \tag{3.35} \]
Proof. (You have seen Eq. (3.35) since kindergarten. Taken at face value, it must be true. But now, all symbols have different meanings.)
The coordinates $x^\mu$ are functions on $M$; their differentials are
\[ (dx^\mu)(v) = v(x^\mu)\,. \tag{3.36} \]
Applying the above equation to the vector field $\partial_\nu$, defined in Eq. (3.15), we have
\[ (dx^\mu)(\partial_\nu) = \partial_\nu(x^\mu) = \frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu\,. \tag{3.37} \]
The first equality is the definition, the second is the second line of Eq. (3.15), and the third is just ordinary partial differentiation.
To prove Eq. (3.34), note that $\omega$ is a one-form, $\partial_\mu$ is a vector field and $\omega(\partial_\mu)$ is a function. The right hand side of the equation is the multiplication of a function and the differential of the coordinate (summed over $\mu$). The action of the right hand side on the vector field $\partial_\nu$ is
\[ (\omega(\partial_\mu)\,dx^\mu)(\partial_\nu) = \omega(\partial_\mu)\,(dx^\mu)(\partial_\nu) = \omega(\partial_\mu)\,\delta^\mu_\nu = \omega(\partial_\nu)\,. \tag{3.38} \]
Since the $\partial_\nu$ span the tangent space at each point, we have proved Eq. (3.34). In particular, for the differential $df$,
\[ (df)(\partial_\mu) = \partial_\mu f\,, \tag{3.39} \]
and we have Eq. (3.35).
3.4 Tensor Fields

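We could repeat everything in Section 1.3.
Definition 3.18. A tensor field or just tensor on $M$ of type $(k, l)$ is an element of
\[ T \in \mathfrak{X}(M) \otimes \cdots \otimes \mathfrak{X}(M) \otimes \mathfrak{X}^*(M) \otimes \cdots \otimes \mathfrak{X}^*(M) \equiv \mathcal{T}(k, l)\,, \tag{3.40} \]
where there are $k$ factors of $\mathfrak{X}(M)$ and $l$ factors of $\mathfrak{X}^*(M)$. Hence, a tensor field of type $(1, 0)$ is just a vector field and a tensor field of type $(0, 1)$ is a one-form. A tensor field of type $(0, 0)$ is understood as a function.
A tensor field can also be understood as a multilinear function
\[ T : \mathfrak{X}^*(M) \times \cdots \times \mathfrak{X}^*(M) \times \mathfrak{X}(M) \times \cdots \times \mathfrak{X}(M) \to \mathbb{R}\,, \tag{3.41} \]
\[ T(\dots, f\omega + g\omega', \dots) = f\,T(\dots, \omega, \dots) + g\,T(\dots, \omega', \dots)\,, \tag{3.42} \]
\[ T(\dots, fV + gV', \dots) = f\,T(\dots, V, \dots) + g\,T(\dots, V', \dots) \tag{3.43} \]
for any functions $f, g \in \mathcal{F}(M)$, vector fields $V, V' \in \mathfrak{X}(M)$ and one-forms $\omega, \omega' \in \mathfrak{X}^*(M)$.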
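Theorem 3.19. Let $\xi = (x^1, \dots, x^n)$ and $\eta = (\bar{x}^1, \dots, \bar{x}^n)$ be two coordinate systems, with coordinate vector fields $\partial_\mu$ and $\bar\partial_\mu$ respectively, and $v_p \in T_p(M)$, $\omega_p \in T_p(M)^*$. Then
\[ v_p(\bar{x}^\mu) = \left.\frac{\partial \bar{x}^\mu}{\partial x^\nu}\right|_p v_p(x^\nu)\,, \tag{3.44} \]
\[ \omega_p(\bar\partial_\mu) = \left.\frac{\partial x^\nu}{\partial \bar{x}^\mu}\right|_p \omega_p(\partial_\nu)\,. \tag{3.45} \]
These are the transformation rules for vectors and one-forms.
Proof. $\bar{x}^\mu$ are functions on $M$; in terms of $x^\nu$, their coordinate expressions are written as $\bar{x}^\mu(x^\nu)$. The ordinary partial derivative is $\partial \bar{x}^\mu/\partial x^\nu$. Similarly define $\partial x^\nu/\partial \bar{x}^\mu$.
By definition, the action of the vector $\partial_\nu$ on $\bar{x}^\mu$ is the function
\[ \partial_\nu(\bar{x}^\mu) = \frac{\partial \bar{x}^\mu}{\partial x^\nu}\,. \tag{3.46} \]
Hence,
\[ v_p(\bar{x}^\mu) = (v_p(x^\nu)\,\partial_\nu)(\bar{x}^\mu) = v_p(x^\nu)\,(\partial_\nu \bar{x}^\mu)_p = \left.\frac{\partial \bar{x}^\mu}{\partial x^\nu}\right|_p v_p(x^\nu)\,. \tag{3.47} \]
Similarly,
\[ \omega_p(\bar\partial_\mu) = (\omega_p(\partial_\nu)\,dx^\nu)(\bar\partial_\mu) = \omega_p(\partial_\nu)\,(\bar\partial_\mu x^\nu)_p = \left.\frac{\partial x^\nu}{\partial \bar{x}^\mu}\right|_p \omega_p(\partial_\nu)\,. \tag{3.48} \]
Students should be able to work out the transformation rules for general tensor fields.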
3.5 Metric Tensor

Definition 3.20. A metric tensor or just metric $g$ on $M$ is a tensor of type $(0, 2)$, $g \in \mathcal{T}(0, 2)$, which is symmetric and non-degenerate:
1. Symmetry means that as a function $g : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathbb{R}$,
\[ g(V, W) = g(W, V) \tag{3.49} \]
for any vector fields $V$ and $W$.
2. At each point $p \in M$, $g$ induces an inner product on $T_p(M)$. We demand that this inner product is non-degenerate at all points.
A metric essentially tells us how to measure the infinitesimal distance between two nearby points. An informal explanation is the following. If the coordinates of two nearby points are $x^\mu$ and $x^\mu + \Delta x^\mu$, we expect the distance between them to be something like
\[ \sqrt{\sum_\mu \Delta x^\mu\,\Delta x^\mu} \tag{3.50} \]
if the coordinates are Euclidean. In general, we have
\[ \sqrt{g_{\mu\nu}\,\Delta x^\mu\,\Delta x^\nu} \tag{3.51} \]
where $g_{\mu\nu}$ is exactly the metric tensor because $\Delta x^\mu$ corresponds to the coordinates of a vector. The square root is troublesome, so we consider the squared distance instead of the distance. Also, we need negative squared distances in relativity.
Now we understand that a metric tells us the lengths of vectors at each point. Theorem 1.19 allows us to choose a basis such that the metric at each point has the standard form. Since the metric is smooth and non-degenerate, the numbers of positive and negative terms must vary continuously; hence, they are constant. Here comes the title of O'Neill.
Definition 3.21. A semi-Riemannian manifold is a smooth manifold $M$ with a metric tensor $g$.
In general relativity, the metric tensor of spacetime is determined by the material and dynamics, as we are going to see. But the signature is fixed at $(-+++)$. This is the Lorentz metric.
In terms of coordinates,
\[ g = g_{\mu\nu}\,dx^\mu \otimes dx^\nu\,. \tag{3.52} \]
The inverse of the matrix $(g_{\mu\nu})$ is denoted by $(g^{\mu\nu})$. Its entries are the components of a tensor of type $(2, 0)$, usually also denoted by the same letter $g$.
Just like in Example 1.23, using the metric and its inverse, we could freely associate a vector to a one-form and a one-form to a vector. We have
\[ g^{\mu\nu} g_{\nu\lambda} = \delta^\mu_\lambda\,, \tag{3.53} \]
\[ g_{\mu\nu} g^{\nu\lambda} = \delta_\mu^\lambda\,, \tag{3.54} \]
\[ V_\mu = g_{\mu\nu} V^\nu\,, \tag{3.55} \]
\[ V^\mu = g^{\mu\nu} V_\nu\,. \tag{3.56} \]
In the third and fourth lines, we use superscripts and subscripts to distinguish vectors from one-forms.
Example 3.22. For an $n$ dimensional vector space, a standard metric is given by Theorem 1.19. Then, $g_{\mu\mu} = \pm 1, 0$, and they do not depend on the points. ($g_{\mu\nu} = 0$ for $\mu \neq \nu$.)
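Example 3.23. Recall that in Section 3.1, we discussed the surface of a sphere. The following is a physicist's way to find the natural metric on it.
$\mathbb{R}^3$, as a vector space, has a natural metric
\[ g = dx \otimes dx + dy \otimes dy + dz \otimes dz\,. \tag{3.57} \]
To find the components of the metric in polar coordinates, notice that
\[ x = R\sin\theta\cos\phi\,, \qquad dx = R\cos\theta\cos\phi\,d\theta - R\sin\theta\sin\phi\,d\phi \tag{3.58} \]
\[ y = R\sin\theta\sin\phi\,, \qquad dy = R\cos\theta\sin\phi\,d\theta + R\sin\theta\cos\phi\,d\phi \tag{3.59} \]
\[ z = R\cos\theta\,, \qquad dz = -R\sin\theta\,d\theta\,. \tag{3.60} \]
Hence, the metric for the surface of a sphere is
\[
\begin{aligned}
g &= (R\cos\theta\cos\phi\,d\theta - R\sin\theta\sin\phi\,d\phi) \otimes (R\cos\theta\cos\phi\,d\theta - R\sin\theta\sin\phi\,d\phi) \\
&\quad + (R\cos\theta\sin\phi\,d\theta + R\sin\theta\cos\phi\,d\phi) \otimes (R\cos\theta\sin\phi\,d\theta + R\sin\theta\cos\phi\,d\phi) \\
&\quad + R^2\sin^2\theta\,d\theta \otimes d\theta \\
&= R^2\cos^2\theta\,d\theta \otimes d\theta + R^2\sin^2\theta\,d\phi \otimes d\phi + R^2\sin^2\theta\,d\theta \otimes d\theta \\
&= R^2\,d\theta \otimes d\theta + R^2\sin^2\theta\,d\phi \otimes d\phi\,.
\end{aligned}
\tag{3.61}
\]
(Be careful that $d\theta \otimes d\phi \neq d\phi \otimes d\theta$.) In component form,
\[ g_{\theta\theta} = R^2 \tag{3.62} \]
\[ g_{\phi\phi} = R^2\sin^2\theta \tag{3.63} \]
\[ g_{\theta\phi} = g_{\phi\theta} = 0\,. \tag{3.64} \]
To justify this calculation mathematically requires a discussion of submanifolds and the restriction of a metric. We will not go through those, but I hope you get some intuitive idea of what is going on.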
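The result for the sphere can be cross-checked numerically by pulling back the Euclidean metric of $\mathbb{R}^3$: the components are the dot products of the tangent vectors $\partial(x,y,z)/\partial\theta$ and $\partial(x,y,z)/\partial\phi$. The sketch below approximates these tangent vectors by finite differences; the function names, the radius and the sample point are illustrative choices.

```python
import math

def embed(theta, phi, R):
    """Point of the sphere of radius R in R^3 for polar coordinates (theta, phi)."""
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(theta))

def induced_metric(theta, phi, R=2.0, h=1e-6):
    """2x2 matrix of dot products of the coordinate tangent vectors (central differences)."""
    d_theta = [(a - b) / (2.0 * h)
               for a, b in zip(embed(theta + h, phi, R), embed(theta - h, phi, R))]
    d_phi = [(a - b) / (2.0 * h)
             for a, b in zip(embed(theta, phi + h, R), embed(theta, phi - h, R))]
    tangents = [d_theta, d_phi]
    return [[sum(u * w for u, w in zip(tangents[a], tangents[b])) for b in range(2)]
            for a in range(2)]

g = induced_metric(0.7, 1.9, R=2.0)
print(g[0][0], 2.0 ** 2)                        # g_theta_theta vs R^2
print(g[1][1], (2.0 * math.sin(0.7)) ** 2)      # g_phi_phi vs R^2 sin^2(theta)
print(g[0][1])                                  # cross term, close to zero
```

The printed values reproduce $g_{\theta\theta} = R^2$, $g_{\phi\phi} = R^2\sin^2\theta$ and a vanishing cross term, up to finite-difference error.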
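How about the $(x, y)$ coordinate system for the northern hemisphere?
\[
\begin{aligned}
z &= \sqrt{R^2 - x^2 - y^2} \\
dz &= -\frac{x\,dx + y\,dy}{\sqrt{R^2 - x^2 - y^2}} \\
dz \otimes dz &= \frac{x^2\,dx \otimes dx + xy\,(dx \otimes dy + dy \otimes dx) + y^2\,dy \otimes dy}{R^2 - x^2 - y^2}\,.
\end{aligned}
\tag{3.65}
\]
Hence, we have
\[ g = \frac{1}{R^2 - x^2 - y^2}\Big((R^2 - y^2)\,dx \otimes dx + xy\,(dx \otimes dy + dy \otimes dx) + (R^2 - x^2)\,dy \otimes dy\Big)\,, \tag{3.66} \]
and
\[ g_{xx} = \frac{R^2 - y^2}{R^2 - x^2 - y^2} \tag{3.67} \]
\[ g_{yy} = \frac{R^2 - x^2}{R^2 - x^2 - y^2} \tag{3.68} \]
\[ g_{xy} = g_{yx} = \frac{xy}{R^2 - x^2 - y^2}\,. \tag{3.69} \]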
Chapter 4
Curvature
We follow Chapter 3 of Wald.
4.1 Covariant Derivative

How can we compare tensors at different points on a manifold? If a particle is moving in $\mathbb{R}^3$, for example, we can easily calculate its velocity vector as a function of time, and if it changes, we say that there is acceleration.
If the particle is moving on a manifold, then the velocity vectors at different points live in different tangent spaces. We need some kind of differentiation to define the acceleration.
In partial differentiation, we need to specify the variable with respect to which we differentiate. In a more general context, this is called the directional derivative. So, a direction or a vector is needed to define differentiation on a manifold. For a function, the action of a vector on it, defined in Definition 3.6, is already a differentiation. We should not change this.
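Definition 4.1. A covariant derivative $\nabla$ on a smooth manifold $M$ is a map which sends a tensor field of type $(k, l)$ to a tensor field of type $(k, l + 1)$,
\[ \nabla : \mathcal{T}(k, l) \to \mathcal{T}(k, l + 1)\,. \tag{4.1} \]
The covariant derivative of a tensor field has one more one-form component, which could act on a vector field to give back a tensor field of the same type. This vector field is the direction of the directional derivative, usually written as $\nabla_V$. It must satisfy the following five conditions.
1. Linearity: For $T, U \in \mathcal{T}(k, l)$, $a, b \in \mathbb{R}$ and $f \in \mathcal{F}(M)$,
\[ \nabla_V(aT + bU) = a\nabla_V T + b\nabla_V U\,, \tag{4.2} \]
\[ \nabla_{fV} T = f\,\nabla_V T\,. \tag{4.3} \]
The second line is just the fact that $\nabla T$ is a tensor field of type $(k, l+1)$ and we evaluate it at the vector field $V$. Note that it is not true that $\nabla_V(fT) = f\,\nabla_V T$ for a function $f \in \mathcal{F}(M)$.
2. Leibniz rule: For $T \in \mathcal{T}(k, l)$ and $U \in \mathcal{T}(k', l')$,
\[ \nabla_V(T \otimes U) = \nabla_V T \otimes U + T \otimes \nabla_V U\,. \tag{4.4} \]
3. Commutativity with contraction: Tensor contraction turns a tensor of type $(k, l)$ into one of type $(k-1, l-1)$, see Example 1.34. This requirement states that "contract first and then differentiate" and "differentiate first and then contract" give the same result. We illustrate with an example in terms of a coordinate system. Let $T$ be of type $(3, 2)$; we have
\[ T = T^{\mu\nu\lambda}{}_{\rho\sigma}\,\partial_\mu \otimes \partial_\nu \otimes \partial_\lambda \otimes dx^\rho \otimes dx^\sigma\,. \tag{4.5} \]
Let $\tilde{T}$ be the contraction of $T$ over the first and fourth indices,
\[ \tilde{T} = T^{\mu\nu\lambda}{}_{\mu\sigma}\,\partial_\nu \otimes \partial_\lambda \otimes dx^\sigma\,. \tag{4.6} \]
More notation: let
\[ \nabla T = S^{\mu\nu\lambda}{}_{\rho\sigma;\tau}\,\partial_\mu \otimes \partial_\nu \otimes \partial_\lambda \otimes dx^\rho \otimes dx^\sigma \otimes dx^\tau \tag{4.7} \]
\[ \nabla \tilde{T} = R^{\nu\lambda}{}_{\sigma;\tau}\,\partial_\nu \otimes \partial_\lambda \otimes dx^\sigma \otimes dx^\tau\,. \tag{4.8} \]
(The semi-colons in the subscripts are customary.) Finally, we demand
\[ S^{\mu\nu\lambda}{}_{\mu\sigma;\tau} = R^{\nu\lambda}{}_{\sigma;\tau}\,. \tag{4.9} \]
It is easier to understand the concept than to go through the formulas.
4. For all $f \in \mathcal{F}(M)$, we demand
\[ \nabla_V f = V(f)\,. \tag{4.10} \]
5. Torsion free: For all $f \in \mathcal{F}(M)$,
\[ (\nabla_V W - \nabla_W V) f = V(W(f)) - W(V(f))\,. \tag{4.11} \]
(Some gravity theories do not require this condition. They are called torsion gravity. We do not discuss those.)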
There are a lot of conditions. One wonders if any covariant derivative exists at all. The fact is, for a general manifold, there are infinitely many covariant derivatives.
Definition 4.2. In terms of a coordinate system, we define
\[ \nabla_\mu \equiv \nabla_{\partial_\mu}\,. \tag{4.12} \]
This just simplifies the notation of the covariant derivative along the vector tangent to the coordinate lines. Here come the Christoffel symbols, $\Gamma^\lambda_{\mu\nu}$,
\[ \nabla_\mu \partial_\nu \equiv \Gamma^\lambda_{\mu\nu}\,\partial_\lambda\,. \tag{4.13} \]
Some explanation is in order. $\partial_\nu$ is a vector field. Its covariant derivative along the direction $\partial_\mu$ is another vector field. We expand the resulting vector field in terms of $\partial_\lambda$. The coefficients are the Christoffel symbols. All of $\lambda$, $\mu$ and $\nu$ run over $1, \dots, n$, so there are $n^3$ Christoffel symbols for an $n$ dimensional manifold.
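Theorem 4.3. In terms of Christoffel symbols, the covariant derivative of a general vector field $W = W^\mu \partial_\mu$ is given by
\[ \nabla_V W \equiv W^\mu{}_{;\nu} V^\nu \partial_\mu = V^\nu\left(\partial_\nu W^\mu + \Gamma^\mu_{\nu\lambda} W^\lambda\right)\partial_\mu\,. \tag{4.14} \]
Proof.
\[
\begin{aligned}
\nabla_V W &= \nabla_{V^\nu \partial_\nu} W \\
&= V^\nu\,\nabla_\nu(W^\mu \partial_\mu) \\
&= V^\nu\big((\nabla_\nu W^\mu)\,\partial_\mu + W^\mu\,(\nabla_\nu \partial_\mu)\big) \\
&= V^\nu\big((\partial_\nu W^\mu)\,\partial_\mu + W^\mu\,\Gamma^\lambda_{\nu\mu}\,\partial_\lambda\big)\,.
\end{aligned}
\tag{4.15}
\]
The second line is by Eq. (4.3), the third is the Leibniz rule and the fourth line is due to Eq. (4.10) and Eq. (4.13) because $W^\mu$ are just functions. Relabeling the summed indices in the last term gives Eq. (4.14), which we rewrite as
\[ W^\mu{}_{;\nu} = \partial_\nu W^\mu + \Gamma^\mu_{\nu\lambda} W^\lambda\,. \tag{4.16} \]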
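Theorem 4.4.
\[ \Gamma^\lambda_{\mu\nu} = \Gamma^\lambda_{\nu\mu}\,. \tag{4.17} \]
Proof. By Eq. (4.11), we have
\[
\begin{aligned}
(\nabla_V W - \nabla_W V) f &= V(W(f)) - W(V(f)) \\
\left(V^\nu W^\mu{}_{;\nu} - W^\nu V^\mu{}_{;\nu}\right)\partial_\mu f &= V^\nu \partial_\nu(W^\mu \partial_\mu f) - W^\nu \partial_\nu(V^\mu \partial_\mu f) \\
V^\nu W^\mu{}_{;\nu} - W^\nu V^\mu{}_{;\nu} &= V^\nu \partial_\nu W^\mu - W^\nu \partial_\nu V^\mu \\
V^\nu \Gamma^\mu_{\nu\lambda} W^\lambda - W^\nu \Gamma^\mu_{\nu\lambda} V^\lambda &= 0 \\
\left(\Gamma^\mu_{\nu\lambda} - \Gamma^\mu_{\lambda\nu}\right) V^\nu W^\lambda &= 0\,.
\end{aligned}
\tag{4.18}
\]
This is valid for all $V$ and $W$, hence the result.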

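Theorem 4.5. The covariant derivative of a general one-form $\omega = \omega_\mu dx^\mu$ is given by
\[ \nabla_V \omega \equiv \omega_{\mu;\nu} V^\nu dx^\mu = V^\nu\left(\partial_\nu \omega_\mu - \Gamma^\lambda_{\nu\mu}\,\omega_\lambda\right) dx^\mu\,. \tag{4.19} \]
Proof. Consider yet another vector field $W = W^\mu \partial_\mu$. Then, $W \otimes \omega$ is a tensor field of type $(1, 1)$ and by Eq. (4.4), we have
\[ \nabla_V(W \otimes \omega) = (\nabla_V W) \otimes \omega + W \otimes (\nabla_V \omega)\,. \tag{4.20} \]
Condition 3 tells us that we could do the contraction before or after the differentiation,
\[
\begin{aligned}
\nabla_V(W^\mu \omega_\mu) &= (\nabla_V W)^\mu\,\omega_\mu + W^\mu\,(\nabla_V \omega)_\mu \\
V^\nu \partial_\nu(W^\mu \omega_\mu) &= V^\nu W^\mu{}_{;\nu}\,\omega_\mu + W^\mu V^\nu \omega_{\mu;\nu} \\
V^\nu\big((\partial_\nu W^\mu)\,\omega_\mu + W^\mu\,\partial_\nu \omega_\mu\big) &= V^\nu\big(\partial_\nu W^\mu + \Gamma^\mu_{\nu\lambda} W^\lambda\big)\,\omega_\mu + W^\mu V^\nu \omega_{\mu;\nu} \\
V^\nu W^\mu\,\partial_\nu \omega_\mu &= V^\nu \Gamma^\mu_{\nu\lambda} W^\lambda\,\omega_\mu + W^\mu V^\nu \omega_{\mu;\nu} \\
V^\nu W^\mu\,\partial_\nu \omega_\mu &= V^\nu \Gamma^\lambda_{\nu\mu} W^\mu\,\omega_\lambda + W^\mu V^\nu \omega_{\mu;\nu} \\
\omega_{\mu;\nu} &= \partial_\nu \omega_\mu - \Gamma^\lambda_{\nu\mu}\,\omega_\lambda\,.
\end{aligned}
\tag{4.21}
\]
Eq. (4.10) tells us how to differentiate functions. With Theorem 4.3, Theorem 4.5 and the Leibniz rule, we know how to differentiate any tensor field.
Theorem 4.6.
\[ T^{\mu\dots}{}_{\nu\dots;\lambda} = \partial_\lambda T^{\mu\dots}{}_{\nu\dots} + \Gamma^\mu_{\lambda\rho}\,T^{\rho\dots}{}_{\nu\dots} + \cdots - \Gamma^\rho_{\lambda\nu}\,T^{\mu\dots}{}_{\rho\dots} - \cdots \tag{4.22} \]
(Note that in this equation, Condition 3 is just adding and subtracting the same Christoffel symbol terms.)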
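Theorem 4.7. Christoffel symbols are not components of a tensor. (Wald has a weird sense of humor to claim that Christoffel symbols form a tensor.)
Proof. Let bars denote quantities in a second coordinate system $\bar{x}^\mu$. By Theorem 3.19, we have
\[
\begin{aligned}
\bar\Gamma^\lambda_{\mu\nu}\,\bar\partial_\lambda
&= \nabla_{\bar\partial_\mu} \bar\partial_\nu \\
&= \nabla_{\frac{\partial x^\alpha}{\partial \bar{x}^\mu}\partial_\alpha}\left(\frac{\partial x^\beta}{\partial \bar{x}^\nu}\,\partial_\beta\right) \\
&= \frac{\partial x^\alpha}{\partial \bar{x}^\mu}\left(\partial_\alpha\!\left(\frac{\partial x^\beta}{\partial \bar{x}^\nu}\right)\partial_\beta + \frac{\partial x^\beta}{\partial \bar{x}^\nu}\,\nabla_\alpha \partial_\beta\right) \\
&= \frac{\partial^2 x^\beta}{\partial \bar{x}^\mu \partial \bar{x}^\nu}\,\partial_\beta + \frac{\partial x^\alpha}{\partial \bar{x}^\mu}\frac{\partial x^\beta}{\partial \bar{x}^\nu}\,\Gamma^\gamma_{\alpha\beta}\,\partial_\gamma \\
&= \frac{\partial^2 x^\beta}{\partial \bar{x}^\mu \partial \bar{x}^\nu}\frac{\partial \bar{x}^\lambda}{\partial x^\beta}\,\bar\partial_\lambda + \frac{\partial x^\alpha}{\partial \bar{x}^\mu}\frac{\partial x^\beta}{\partial \bar{x}^\nu}\frac{\partial \bar{x}^\lambda}{\partial x^\gamma}\,\Gamma^\gamma_{\alpha\beta}\,\bar\partial_\lambda\,,
\end{aligned}
\]
so that
\[ \bar\Gamma^\lambda_{\mu\nu} = \frac{\partial \bar{x}^\lambda}{\partial x^\gamma}\frac{\partial x^\alpha}{\partial \bar{x}^\mu}\frac{\partial x^\beta}{\partial \bar{x}^\nu}\,\Gamma^\gamma_{\alpha\beta} + \frac{\partial \bar{x}^\lambda}{\partial x^\beta}\frac{\partial^2 x^\beta}{\partial \bar{x}^\mu \partial \bar{x}^\nu}\,. \tag{4.23} \]
Because of the second term on the right, Christoffel symbols are not components of a tensor.
For any tensor of type $(1, 2)$ which is symmetric in the lower indices,
\[ S^\lambda{}_{\mu\nu} = S^\lambda{}_{\nu\mu}\,, \tag{4.24} \]
defining new Christoffel symbols
\[ \tilde\Gamma^\lambda_{\mu\nu} \equiv \Gamma^\lambda_{\mu\nu} + S^\lambda{}_{\mu\nu}\,, \tag{4.25} \]
we have a new covariant derivative.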
4.2 Parallel Transport

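Recall the velocity vector $\gamma'$ of a curve $\gamma$ in Example 3.11.
Definition 4.8. A vector $v(\gamma(\tau))$ given at each point of the curve is said to be parallel transported as one moves along the curve if
\[ \nabla_{\gamma'} v = 0\,. \tag{4.26} \]
A tensor $T$ is parallel transported if
\[ \nabla_{\gamma'} T = 0\,. \tag{4.27} \]
Putting $V = \gamma'$ and $W = v$ in Theorem 4.3, Eq. (4.26) is translated to
\[
\begin{aligned}
\frac{dx^\nu(\gamma(\tau))}{d\tau}\left(\frac{\partial v^\mu}{\partial x^\nu} + \Gamma^\mu_{\nu\lambda} v^\lambda\right) &= 0 \\
\frac{dv^\mu(\gamma(\tau))}{d\tau} + \Gamma^\mu_{\nu\lambda}\,\frac{dx^\nu(\gamma(\tau))}{d\tau}\,v^\lambda &= 0\,.
\end{aligned}
\tag{4.28}
\]
This is an ordinary differential equation. The existence and uniqueness theorem for initial value problems guarantees the solution. A very important fact is that parallel transport depends on the curve. If one parallel transports a vector along a closed curve back to the starting point, it is generally true that the vector will be rotated. We will spend a lot of time on this in Section 4.4.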
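The rotation after a closed loop can be seen in a small numerical experiment. The sketch below borrows the Christoffel symbols of the unit sphere computed later in Example 4.12 ($\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$, $\Gamma^\phi_{\theta\phi} = \cot\theta$) and integrates the parallel transport equation once around the line of latitude $\theta = \theta_0$, using $\phi$ itself as the parameter. The Runge-Kutta integrator and all function names are illustrative choices, not part of the notes.

```python
import math

def transport_rhs(theta0, v):
    """Parallel transport along theta = theta0 with parameter phi:
    dv^theta/dphi = sin(theta0) cos(theta0) v^phi,  dv^phi/dphi = -cot(theta0) v^theta."""
    v_theta, v_phi = v
    return (math.sin(theta0) * math.cos(theta0) * v_phi,
            -math.cos(theta0) / math.sin(theta0) * v_theta)

def transport_around_latitude(theta0, v_start, steps=2000):
    """Integrate once around the latitude circle (phi from 0 to 2 pi) with classical RK4."""
    dphi = 2.0 * math.pi / steps
    v = v_start
    for _ in range(steps):
        k1 = transport_rhs(theta0, v)
        k2 = transport_rhs(theta0, (v[0] + 0.5 * dphi * k1[0], v[1] + 0.5 * dphi * k1[1]))
        k3 = transport_rhs(theta0, (v[0] + 0.5 * dphi * k2[0], v[1] + 0.5 * dphi * k2[1]))
        k4 = transport_rhs(theta0, (v[0] + dphi * k3[0], v[1] + dphi * k3[1]))
        v = (v[0] + dphi / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]),
             v[1] + dphi / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]))
    return v

theta0 = math.pi / 3.0
v_final = transport_around_latitude(theta0, (1.0, 0.0))
# Orthonormal components are (v^theta, sin(theta0) v^phi)
print(v_final[0], math.sin(theta0) * v_final[1])
```

For $\theta_0 = \pi/3$ the vector comes back pointing in the opposite direction: in the orthonormal components $(v^\theta, \sin\theta_0\, v^\phi)$ the transported vector rotates by the angle $2\pi\cos\theta_0 = \pi$. On the equator ($\theta_0 = \pi/2$) the same code returns the vector unchanged.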
Definition 4.9. A curve $\gamma$ is called a geodesic if its velocity vector is parallel transported,
\[ \nabla_{\gamma'} \gamma' = 0\,. \tag{4.29} \]
In physics textbooks, given a coordinate system, the coordinate representation of a curve is written as $x^\mu(\tau)$. Then, the equation for a geodesic is
\[ \frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu_{\nu\lambda}\,\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} = 0\,. \tag{4.30} \]
Note that the parametrization $\tau$ is no longer arbitrary; it is usually taken as the proper time on the curve. In Wald, this is called an affine parametrization. A geodesic can be thought of as the "straightest possible line" between two points, or the shortest trajectory between two points. Also see Wald.
4.3 Levi-Civita Connection

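Theorem 4.10. For a semi-Riemannian manifold with metric $g$, there is a unique covariant derivative such that
\[ \nabla g = 0\,. \tag{4.31} \]
Proof. We try to solve for the Christoffel symbols from Eq. (4.31). By Eq. (4.22), applied to $g$, we have
\[ 0 = \partial_\lambda g_{\mu\nu} - \Gamma^\rho_{\lambda\mu}\,g_{\rho\nu} - \Gamma^\rho_{\lambda\nu}\,g_{\mu\rho}\,, \tag{4.32} \]
\[ 0 = \partial_\mu g_{\nu\lambda} - \Gamma^\rho_{\mu\nu}\,g_{\rho\lambda} - \Gamma^\rho_{\mu\lambda}\,g_{\nu\rho}\,, \tag{4.33} \]
\[ 0 = \partial_\nu g_{\lambda\mu} - \Gamma^\rho_{\nu\lambda}\,g_{\rho\mu} - \Gamma^\rho_{\nu\mu}\,g_{\lambda\rho}\,. \tag{4.34} \]
We add the first two lines and subtract the third, and note that both $g$ and $\Gamma$ are symmetric in the lower indices,
\[
\begin{aligned}
\partial_\lambda g_{\mu\nu} + \partial_\mu g_{\nu\lambda} - \partial_\nu g_{\lambda\mu}
&= \Gamma^\rho_{\lambda\mu}\,g_{\rho\nu} + \Gamma^\rho_{\lambda\nu}\,g_{\mu\rho} + \Gamma^\rho_{\mu\nu}\,g_{\rho\lambda} + \Gamma^\rho_{\mu\lambda}\,g_{\nu\rho} - \Gamma^\rho_{\nu\lambda}\,g_{\rho\mu} - \Gamma^\rho_{\nu\mu}\,g_{\lambda\rho} \\
&= 2\,\Gamma^\rho_{\lambda\mu}\,g_{\rho\nu}\,.
\end{aligned}
\tag{4.35}
\]
Multiplying by $g^{\nu\sigma}$, we finally have
\[
\begin{aligned}
2\,\Gamma^\rho_{\lambda\mu}\,g_{\rho\nu}\,g^{\nu\sigma} &= g^{\nu\sigma}\left(\partial_\lambda g_{\mu\nu} + \partial_\mu g_{\nu\lambda} - \partial_\nu g_{\lambda\mu}\right) \\
\Gamma^\sigma_{\lambda\mu} &= \frac{1}{2}\,g^{\sigma\nu}\left(\partial_\lambda g_{\mu\nu} + \partial_\mu g_{\nu\lambda} - \partial_\nu g_{\lambda\mu}\right)\,.
\end{aligned}
\tag{4.36}
\]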
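We now check that the $n^3$ objects defined by the above equation satisfy Eq. (4.23). Bars denote quantities in a second coordinate system $\bar{x}^\mu$. To simplify notation, define
\[ L^\mu{}_\nu \equiv \frac{\partial x^\mu}{\partial \bar{x}^\nu}\,, \tag{4.37} \]
\[ K^\mu{}_\nu \equiv \frac{\partial \bar{x}^\mu}{\partial x^\nu}\,. \tag{4.38} \]
Then, for example, $\bar\partial_\mu = L^\nu{}_\mu\,\partial_\nu$ and $\bar\partial_\mu L^\alpha{}_\nu = \bar\partial_\nu L^\alpha{}_\mu$.
\[
\begin{aligned}
\bar{g}^{\sigma\nu}\,\bar\partial_\lambda \bar{g}_{\mu\nu}
&= \left(K^\sigma{}_\alpha K^\nu{}_\beta\, g^{\alpha\beta}\right) \bar\partial_\lambda \left(L^\gamma{}_\mu L^\delta{}_\nu\, g_{\gamma\delta}\right) \\
&= K^\sigma{}_\alpha K^\nu{}_\beta\, g^{\alpha\beta} \left( (\bar\partial_\lambda L^\gamma{}_\mu)\, L^\delta{}_\nu\, g_{\gamma\delta} + L^\gamma{}_\mu\, (\bar\partial_\lambda L^\delta{}_\nu)\, g_{\gamma\delta} + L^\gamma{}_\mu L^\delta{}_\nu\, \bar\partial_\lambda g_{\gamma\delta} \right) \\
&= K^\sigma{}_\alpha L^\gamma{}_\mu\, g^{\alpha\delta}\, \bar\partial_\lambda g_{\gamma\delta} + K^\sigma{}_\alpha K^\nu{}_\beta L^\gamma{}_\mu\, g^{\alpha\beta} g_{\gamma\delta}\, \bar\partial_\lambda L^\delta{}_\nu + K^\sigma{}_\alpha\, \bar\partial_\lambda L^\alpha{}_\mu \\
&= K^\sigma{}_\alpha L^\gamma{}_\mu L^\epsilon{}_\lambda\, g^{\alpha\delta}\, \partial_\epsilon g_{\gamma\delta} + K^\sigma{}_\alpha K^\nu{}_\beta L^\gamma{}_\mu\, g^{\alpha\beta} g_{\gamma\delta}\, \bar\partial_\lambda L^\delta{}_\nu + K^\sigma{}_\alpha\, \bar\partial_\lambda L^\alpha{}_\mu\,.
\end{aligned}
\tag{4.39}
\]
Reversing $\lambda$ and $\mu$, we have
\[
\bar{g}^{\sigma\nu}\,\bar\partial_\mu \bar{g}_{\lambda\nu}
= K^\sigma{}_\alpha L^\gamma{}_\lambda L^\epsilon{}_\mu\, g^{\alpha\delta}\, \partial_\epsilon g_{\gamma\delta} + K^\sigma{}_\alpha K^\nu{}_\beta L^\gamma{}_\lambda\, g^{\alpha\beta} g_{\gamma\delta}\, \bar\partial_\mu L^\delta{}_\nu + K^\sigma{}_\alpha\, \bar\partial_\mu L^\alpha{}_\lambda\,.
\tag{4.40}
\]
We also need
\[
\begin{aligned}
\bar{g}^{\sigma\nu}\,\bar\partial_\nu \bar{g}_{\lambda\mu}
&= \left(K^\sigma{}_\alpha K^\nu{}_\beta\, g^{\alpha\beta}\right) \bar\partial_\nu \left(L^\gamma{}_\lambda L^\delta{}_\mu\, g_{\gamma\delta}\right) \\
&= K^\sigma{}_\alpha K^\nu{}_\beta\, g^{\alpha\beta} L^\gamma{}_\lambda L^\delta{}_\mu\, \bar\partial_\nu g_{\gamma\delta} + K^\sigma{}_\alpha K^\nu{}_\beta\, g^{\alpha\beta} g_{\gamma\delta}\, (\bar\partial_\nu L^\gamma{}_\lambda)\, L^\delta{}_\mu \\
&\quad + K^\sigma{}_\alpha K^\nu{}_\beta\, g^{\alpha\beta} g_{\gamma\delta}\, L^\gamma{}_\lambda\, \bar\partial_\nu L^\delta{}_\mu\,.
\end{aligned}
\tag{4.41}
\]
Finally, using $\bar\partial_\nu L^\gamma{}_\lambda = \bar\partial_\lambda L^\gamma{}_\nu$ to cancel the terms containing two factors of $K$,
\[
\begin{aligned}
\bar{g}^{\sigma\nu}\left(\bar\partial_\lambda \bar{g}_{\mu\nu} + \bar\partial_\mu \bar{g}_{\lambda\nu} - \bar\partial_\nu \bar{g}_{\lambda\mu}\right)
&= K^\sigma{}_\alpha L^\gamma{}_\lambda L^\epsilon{}_\mu\, g^{\alpha\delta}\left(\partial_\gamma g_{\epsilon\delta} + \partial_\epsilon g_{\gamma\delta} - \partial_\delta g_{\gamma\epsilon}\right) \\
&\quad + K^\sigma{}_\alpha\, \bar\partial_\lambda L^\alpha{}_\mu + K^\sigma{}_\alpha\, \bar\partial_\mu L^\alpha{}_\lambda\,.
\end{aligned}
\tag{4.42}
\]
Up to a factor of 1/2, this is exactly Eq. (4.23).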

A better logical flow is the following. From the metric, we define the $n^3$ objects by Eq. (4.36). We verify that they satisfy Eq. (4.23). By Eq. (4.22), we can then define a covariant derivative. This takes care of existence. By direct verification, we can prove Eq. (4.31). (Do it.) Uniqueness is given by the fact that any covariant derivative that satisfies Eq. (4.31) must have the form Eq. (4.36), as we showed from Eq. (4.32) to Eq. (4.36).
See O'Neill for a coordinate-free proof. That's why I love O'Neill.
Definition 4.11. The unique covariant derivative provided by Theorem 4.10 is called the Levi-Civita connection of the semi-Riemannian manifold.
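Example 4.12. We calculate the Levi-Civita connection for the surface of a sphere in polar coordinates. This is not as daunting as you might think, as the cross terms are zero, Eq. (3.64).
First of all, the inverse metric is easy:
\[ g^{\theta\theta} = \frac{1}{R^2} \tag{4.43} \]
\[ g^{\phi\phi} = \frac{1}{R^2\sin^2\theta} \tag{4.44} \]
\[ g^{\theta\phi} = g^{\phi\theta} = 0\,. \tag{4.45} \]
The Christoffel symbols are
\[
\begin{aligned}
\Gamma^\theta_{\theta\theta} &= \tfrac{1}{2}\,g^{\theta\theta}\,\partial_\theta g_{\theta\theta} = 0 \\
\Gamma^\theta_{\theta\phi} = \Gamma^\theta_{\phi\theta} &= \tfrac{1}{2}\,g^{\theta\theta}\left(\partial_\theta g_{\phi\theta} + \partial_\phi g_{\theta\theta} - \partial_\theta g_{\theta\phi}\right) = 0 \\
\Gamma^\theta_{\phi\phi} &= \tfrac{1}{2}\,g^{\theta\theta}\left(2\,\partial_\phi g_{\phi\theta} - \partial_\theta g_{\phi\phi}\right) = -\sin\theta\cos\theta \\
\Gamma^\phi_{\theta\theta} &= \tfrac{1}{2}\,g^{\phi\phi}\left(2\,\partial_\theta g_{\theta\phi} - \partial_\phi g_{\theta\theta}\right) = 0 \\
\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} &= \tfrac{1}{2}\,g^{\phi\phi}\left(\partial_\theta g_{\phi\phi} + \partial_\phi g_{\phi\theta} - \partial_\phi g_{\theta\phi}\right) = \cot\theta \\
\Gamma^\phi_{\phi\phi} &= \tfrac{1}{2}\,g^{\phi\phi}\,\partial_\phi g_{\phi\phi} = 0\,.
\end{aligned}
\tag{4.46}
\]
Hence, we have
\[ \Gamma^\theta_{\theta\theta} = \Gamma^\theta_{\theta\phi} = \Gamma^\theta_{\phi\theta} = \Gamma^\phi_{\theta\theta} = \Gamma^\phi_{\phi\phi} = 0 \tag{4.47} \]
\[ \Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta \tag{4.48} \]
\[ \Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta\,. \tag{4.49} \]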
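These values can be reproduced numerically from the metric alone. The sketch below implements the formula $\Gamma^\sigma_{\lambda\mu} = \tfrac12 g^{\sigma\nu}(\partial_\lambda g_{\mu\nu} + \partial_\mu g_{\nu\lambda} - \partial_\nu g_{\lambda\mu})$ for a two dimensional metric, with the derivatives of the metric approximated by central differences. The function names and the sample point are illustrative, and the index ordering `gamma[s][l][m]` for $\Gamma^s_{lm}$ is a convention chosen here.

```python
import math

def sphere_metric(coords, R=1.0):
    """Metric of the sphere in polar coordinates: diag(R^2, R^2 sin^2(theta))."""
    theta, _phi = coords
    return [[R * R, 0.0], [0.0, (R * math.sin(theta)) ** 2]]

def invert_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def metric_derivatives(metric, coords, h=1e-5):
    """dg[k][i][j] approximates the partial derivative of g_ij with respect to x^k."""
    n = len(coords)
    dg = []
    for k in range(n):
        plus, minus = list(coords), list(coords)
        plus[k] += h
        minus[k] -= h
        gp, gm = metric(plus), metric(minus)
        dg.append([[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(n)] for i in range(n)])
    return dg

def christoffel(metric, coords):
    """gamma[s][l][m] = Gamma^s_{lm} = 1/2 g^{s n} (d_l g_{m n} + d_m g_{n l} - d_n g_{l m})."""
    n = len(coords)
    g_inv = invert_2x2(metric(coords))
    dg = metric_derivatives(metric, coords)
    return [[[0.5 * sum(g_inv[s][nu] * (dg[l][m][nu] + dg[m][nu][l] - dg[nu][l][m])
                        for nu in range(n))
              for m in range(n)]
             for l in range(n)]
            for s in range(n)]

theta = 1.0
gamma = christoffel(sphere_metric, [theta, 0.3])
print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))   # Gamma^theta_{phi phi}
print(gamma[1][0][1], math.cos(theta) / math.sin(theta))    # Gamma^phi_{theta phi}
```

Index 0 stands for $\theta$ and index 1 for $\phi$. The remaining components come out as zero up to numerical noise.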
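Let $\gamma(t) = (\theta(t), \phi(t))$ be a curve, and $v(t) = v^\theta(t)\,\partial_\theta + v^\phi(t)\,\partial_\phi$ be a vector field on it. The equation of parallel transport, Eq. (4.28), is
\[ \frac{dv^\theta}{dt} - \sin\theta\cos\theta\,\dot\phi(t)\,v^\phi = 0\,, \tag{4.50} \]
\[ \frac{dv^\phi}{dt} + \cot\theta\left(\dot\theta(t)\,v^\phi + \dot\phi(t)\,v^\theta\right) = 0\,. \tag{4.51} \]
Hence, the curve is a geodesic if
\[ \frac{d^2\theta}{dt^2} - \sin\theta\cos\theta\left(\frac{d\phi}{dt}\right)^2 = 0\,, \tag{4.52} \]
\[ \frac{d^2\phi}{dt^2} + 2\cot\theta\,\frac{d\theta}{dt}\frac{d\phi}{dt} = 0\,. \tag{4.53} \]
By inspection, we can verify that the lines of longitude, $\phi = \text{constant}$, $\theta = kt$, are geodesics. Also, the equator, $\theta = \pi/2$, $\phi = kt$, is a geodesic. However, the other lines of latitude, $\theta = \text{constant} \neq \pi/2$, are not geodesics.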
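Beyond these special curves, one can integrate the two geodesic equations of the unit sphere numerically, $\ddot\theta = \sin\theta\cos\theta\,\dot\phi^2$ and $\ddot\phi = -2\cot\theta\,\dot\theta\dot\phi$, and check that the solution stays on a great circle, that is, in a plane through the center of the sphere. This is a minimal sketch with a classical Runge-Kutta step; the initial data, step size and function names are illustrative choices.

```python
import math

def geodesic_rhs(state):
    """First-order form of the geodesic equations on the unit sphere."""
    theta, phi, dtheta, dphi = state
    return (dtheta,
            dphi,
            math.sin(theta) * math.cos(theta) * dphi ** 2,
            -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi)

def rk4_step(state, dt):
    def shifted(base, k, factor):
        return tuple(b + factor * ki for b, ki in zip(base, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, 0.5 * dt))
    k3 = geodesic_rhs(shifted(state, k2, 0.5 * dt))
    k4 = geodesic_rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def embed(theta, phi):
    """Point of the unit sphere in R^3."""
    return (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

# Start on the equator with a velocity tilted out of the equatorial plane.
state = (math.pi / 2.0, 0.0, -0.3, 1.0)
theta, phi, dtheta, dphi = state
position = embed(theta, phi)
velocity = (dtheta * math.cos(theta) * math.cos(phi) - dphi * math.sin(theta) * math.sin(phi),
            dtheta * math.cos(theta) * math.sin(phi) + dphi * math.sin(theta) * math.cos(phi),
            -dtheta * math.sin(theta))
normal = cross(position, velocity)   # normal of the plane of the expected great circle

max_plane_deviation = 0.0
for _ in range(6000):
    state = rk4_step(state, 1e-3)
    point = embed(state[0], state[1])
    deviation = abs(sum(n * x for n, x in zip(normal, point)))
    max_plane_deviation = max(max_plane_deviation, deviation)

final_speed_sq = state[2] ** 2 + math.sin(state[0]) ** 2 * state[3] ** 2
print(max_plane_deviation, final_speed_sq)
```

The deviation from the plane stays at the level of the integration error, and the squared speed $\dot\theta^2 + \sin^2\theta\,\dot\phi^2$ keeps its initial value 1.09, as expected for an affinely parametrized geodesic.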
4.4 Curvature
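Let $\nabla$ be a covariant derivative (not necessarily Levi-Civita), $\omega$ be a one-form and $f$ a function. Then,
\[
\begin{aligned}
(\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu)(f\omega)
&= \nabla_\mu \nabla_\nu (f\omega) - \nabla_\nu \nabla_\mu (f\omega) \\
&= \nabla_\mu(\omega\,\nabla_\nu f + f\,\nabla_\nu \omega) - \nabla_\nu(\omega\,\nabla_\mu f + f\,\nabla_\mu \omega) \\
&= \omega\,\nabla_\mu \nabla_\nu f + \nabla_\nu f\,\nabla_\mu \omega + \nabla_\mu f\,\nabla_\nu \omega + f\,\nabla_\mu \nabla_\nu \omega \\
&\quad - \omega\,\nabla_\nu \nabla_\mu f - \nabla_\mu f\,\nabla_\nu \omega - \nabla_\nu f\,\nabla_\mu \omega - f\,\nabla_\nu \nabla_\mu \omega \\
&= f\,\nabla_\mu \nabla_\nu \omega - f\,\nabla_\nu \nabla_\mu \omega \\
&= f\,(\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu)\,\omega\,.
\end{aligned}
\tag{4.54}
\]
The action of the operator $\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu$ on one-forms is exactly like that of a tensor because it is linear over the functions.
Definition 4.13. See Section 3.2 of Wald. In component form, let
\[ (\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu)(\omega_\lambda\,dx^\lambda) \equiv R_{\mu\nu\lambda}{}^\rho\,\omega_\rho\,dx^\lambda\,. \tag{4.55} \]
$R_{\mu\nu\lambda}{}^\rho$ is a tensor field of type $(1, 3)$, called the Riemann curvature tensor. (Different books have different conventions. Be careful when comparing equations.) This equation can also be written as
\[ \omega_{\lambda;\nu;\mu} - \omega_{\lambda;\mu;\nu} = R_{\mu\nu\lambda}{}^\rho\,\omega_\rho\,. \tag{4.56} \]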
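Theorem 4.14. Acting on a vector field $V$, we have
\[ (\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu)(V^\lambda \partial_\lambda) = -R_{\mu\nu\rho}{}^\lambda\,V^\rho\,\partial_\lambda\,. \tag{4.57} \]
Proof. Because ordinary partial derivatives commute,
\[
\begin{aligned}
0 &= \partial_\mu(\partial_\nu(\omega_\lambda V^\lambda)) - \partial_\nu(\partial_\mu(\omega_\lambda V^\lambda)) \\
&= (\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu)(\omega_\lambda V^\lambda) \\
&= (\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu)(\omega \otimes V) \\
&= \nabla_\mu(\nabla_\nu \omega \otimes V + \omega \otimes \nabla_\nu V) - \nabla_\nu(\nabla_\mu \omega \otimes V + \omega \otimes \nabla_\mu V) \\
&= \nabla_\mu \nabla_\nu \omega \otimes V + \nabla_\nu \omega \otimes \nabla_\mu V + \nabla_\mu \omega \otimes \nabla_\nu V + \omega \otimes \nabla_\mu \nabla_\nu V \\
&\quad - \nabla_\nu \nabla_\mu \omega \otimes V - \nabla_\mu \omega \otimes \nabla_\nu V - \nabla_\nu \omega \otimes \nabla_\mu V - \omega \otimes \nabla_\nu \nabla_\mu V \\
&= \big((\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu)\,\omega\big) \otimes V + \omega \otimes (\nabla_\mu \nabla_\nu V - \nabla_\nu \nabla_\mu V) \\
&= R_{\mu\nu\lambda}{}^\rho\,\omega_\rho V^\lambda + \omega_\rho\,\big((\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu)\,V\big)^\rho\,.
\end{aligned}
\tag{4.58}
\]
In the third line, it is understood that we will contract the two factors after covariant differentiation. Since $\omega$ is arbitrary, the result follows.
The operator itself is not a tensor because, for example, its actions on one-forms and on vectors are not the same, but a tensor outer product should have the same action, see Example 1.35.
One could easily generalize the action of $\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu$ to tensors (one $+R$ for each lower index and one $-R$ for each upper index).
We find the expression of the curvature tensor in terms of Christoffel symbols by Eq. (4.56).
\[
\begin{aligned}
\omega_{\lambda;\nu;\mu}
&= \partial_\mu \omega_{\lambda;\nu} - \Gamma^\sigma_{\mu\nu}\,\omega_{\lambda;\sigma} - \Gamma^\sigma_{\mu\lambda}\,\omega_{\sigma;\nu} \\
&= \partial_\mu\left(\partial_\nu \omega_\lambda - \Gamma^\rho_{\nu\lambda}\,\omega_\rho\right) - \Gamma^\sigma_{\mu\nu}\left(\partial_\sigma \omega_\lambda - \Gamma^\rho_{\sigma\lambda}\,\omega_\rho\right) - \Gamma^\sigma_{\mu\lambda}\left(\partial_\nu \omega_\sigma - \Gamma^\rho_{\nu\sigma}\,\omega_\rho\right) \\
&= \partial_\mu \partial_\nu \omega_\lambda - (\partial_\mu \Gamma^\rho_{\nu\lambda})\,\omega_\rho - \Gamma^\rho_{\nu\lambda}\,\partial_\mu \omega_\rho - \Gamma^\sigma_{\mu\nu}\,\partial_\sigma \omega_\lambda \\
&\quad + \Gamma^\sigma_{\mu\nu}\Gamma^\rho_{\sigma\lambda}\,\omega_\rho - \Gamma^\sigma_{\mu\lambda}\,\partial_\nu \omega_\sigma + \Gamma^\sigma_{\mu\lambda}\Gamma^\rho_{\nu\sigma}\,\omega_\rho\,.
\end{aligned}
\tag{4.59}
\]
If we reverse $\mu$ and $\nu$ and subtract, the first, fourth and fifth terms cancel with themselves, and the third and sixth cancel each other.
\[
\begin{aligned}
R_{\mu\nu\lambda}{}^\rho\,\omega_\rho &= \left(-\partial_\mu \Gamma^\rho_{\nu\lambda} + \partial_\nu \Gamma^\rho_{\mu\lambda} + \Gamma^\sigma_{\mu\lambda}\Gamma^\rho_{\nu\sigma} - \Gamma^\sigma_{\nu\lambda}\Gamma^\rho_{\mu\sigma}\right)\omega_\rho \\
R_{\mu\nu\lambda}{}^\rho &= \partial_\nu \Gamma^\rho_{\mu\lambda} - \partial_\mu \Gamma^\rho_{\nu\lambda} + \Gamma^\sigma_{\mu\lambda}\Gamma^\rho_{\sigma\nu} - \Gamma^\sigma_{\nu\lambda}\Gamma^\rho_{\sigma\mu}\,.
\end{aligned}
\tag{4.60}
\]
This is the required equation. Brave students should check that it transforms as a type $(1, 3)$ tensor by Eq. (4.23).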
The curvature tensor tells us how a vector is rotated after being parallel
transported around a closed loop. We will prove this in steps. First, let $v$ be
an arbitrary vector field, and $w_0 \equiv v(x^\mu, x^\nu)$ be the vector at the point
$(x^\mu, x^\nu)$. If we parallel transport $w_0$ along the coordinate line from $x^\mu$ to
$x^\mu + \epsilon$ to get a vector field $w$ on the line, how will $w(x^\mu + \epsilon, x^\nu)$ differ
from $v(x^\mu + \epsilon, x^\nu)$? To simplify notations, in the following paragraphs, we
will use subscripts 0 and 1 to denote that the quantity is evaluated at $(x^\mu, x^\nu)$
and $(x^\mu + \epsilon, x^\nu)$ respectively, usually after differentiation.
Theorem 4.15. Up to second order in $\epsilon$,
\[ w_1 = v_1 - \epsilon\,\nabla_\mu v_1 + \frac{\epsilon^2}{2}\,\nabla_\mu\nabla_\mu v_1 \tag{4.61} \]
where the right hand side is evaluated at $(x^\mu + \epsilon, x^\nu)$ and we do not sum over
$\mu$.
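Proof. The vector field $w$ is parallel, it satisfies
\[ \partial_\mu w^\rho = -\Gamma^\rho_{\mu\sigma}\,w^\sigma . \tag{4.62} \]
However, $v$ does not. By simple Taylor's series,
\[ \begin{aligned}
w_1^\rho &= w_0^\rho + \epsilon\,(\partial_\mu w^\rho)_0 + \frac{1}{2}\epsilon^2\,(\partial_\mu\partial_\mu w^\rho)_0 \\
&= w_0^\rho - \epsilon\,(\Gamma^\rho_{\mu\sigma})_0\,w_0^\sigma - \frac{1}{2}\epsilon^2\,\bigl(\partial_\mu(\Gamma^\rho_{\mu\sigma}w^\sigma)\bigr)_0 \\
&= v_0^\rho - \epsilon\,(\Gamma^\rho_{\mu\sigma})_0\,v_0^\sigma - \frac{1}{2}\epsilon^2\,(\partial_\mu\Gamma^\rho_{\mu\sigma})_0\,v_0^\sigma - \frac{1}{2}\epsilon^2\,(\Gamma^\rho_{\mu\sigma})_0\,(\partial_\mu w^\sigma)_0 \\
&= v_0^\rho - \epsilon\,(\Gamma^\rho_{\mu\sigma})_0\,v_0^\sigma - \frac{1}{2}\epsilon^2\,(\partial_\mu\Gamma^\rho_{\mu\sigma})_0\,v_0^\sigma + \frac{1}{2}\epsilon^2\,(\Gamma^\rho_{\mu\sigma}\Gamma^\sigma_{\mu\lambda})_0\,v_0^\lambda .
\end{aligned} \tag{4.63} \]
By Taylor's series again,
\[ \begin{aligned}
v_0^\rho &= v_1^\rho - \epsilon\,(\partial_\mu v^\rho)_1 + \frac{\epsilon^2}{2}\,(\partial_\mu\partial_\mu v^\rho)_1 , \\
(\Gamma^\rho_{\mu\sigma})_0 &= (\Gamma^\rho_{\mu\sigma})_1 - \epsilon\,(\partial_\mu\Gamma^\rho_{\mu\sigma})_1 .
\end{aligned} \tag{4.64} \]
Substitute into Eq. (4.63). With every quantity on the right hand side evaluated at point 1, we have
\[ \begin{aligned}
w_1^\rho &= v^\rho - \epsilon\,\partial_\mu v^\rho + \frac{\epsilon^2}{2}\,\partial_\mu\partial_\mu v^\rho
 - \epsilon\,(\Gamma^\rho_{\mu\sigma} - \epsilon\,\partial_\mu\Gamma^\rho_{\mu\sigma})(v^\sigma - \epsilon\,\partial_\mu v^\sigma) \\
&\quad - \frac{1}{2}\epsilon^2\,(\partial_\mu\Gamma^\rho_{\mu\sigma})\,v^\sigma + \frac{1}{2}\epsilon^2\,\Gamma^\rho_{\mu\sigma}\Gamma^\sigma_{\mu\lambda}\,v^\lambda \\
&= v^\rho - \epsilon\,(\partial_\mu v^\rho + \Gamma^\rho_{\mu\sigma}v^\sigma) \\
&\quad + \frac{\epsilon^2}{2}\bigl(\partial_\mu\partial_\mu v^\rho + 2\Gamma^\rho_{\mu\sigma}\partial_\mu v^\sigma + 2(\partial_\mu\Gamma^\rho_{\mu\sigma})v^\sigma - (\partial_\mu\Gamma^\rho_{\mu\sigma})v^\sigma + \Gamma^\rho_{\mu\sigma}\Gamma^\sigma_{\mu\lambda}v^\lambda\bigr) \\
&= v^\rho - \epsilon\,\nabla_\mu v^\rho
 + \frac{\epsilon^2}{2}\bigl(\partial_\mu\partial_\mu v^\rho + 2\Gamma^\rho_{\mu\sigma}\partial_\mu v^\sigma + (\partial_\mu\Gamma^\rho_{\mu\sigma})v^\sigma + \Gamma^\rho_{\mu\sigma}\Gamma^\sigma_{\mu\lambda}v^\lambda\bigr) .
\end{aligned} \tag{4.65} \]
To simplify the last line, note that since $\mu$ is some fixed direction, $\nabla_\mu v$ is a
vector field, not a tensor of type (1, 1). We have
\[ \begin{aligned}
\nabla_\mu\nabla_\mu v^\rho &= \partial_\mu(\nabla_\mu v^\rho) + \Gamma^\rho_{\mu\sigma}\nabla_\mu v^\sigma \\
&= \partial_\mu\partial_\mu v^\rho + (\partial_\mu\Gamma^\rho_{\mu\sigma})v^\sigma + 2\Gamma^\rho_{\mu\sigma}\partial_\mu v^\sigma + \Gamma^\rho_{\mu\sigma}\Gamma^\sigma_{\mu\lambda}v^\lambda .
\end{aligned} \tag{4.66} \]
Put them together,
\[ \begin{aligned}
w_1 &= v_1 - \epsilon\,\nabla_\mu v_1 + \frac{\epsilon^2}{2}\,\nabla_\mu\nabla_\mu v_1 \\
&= \Bigl(1 - \epsilon\,\nabla_\mu + \frac{\epsilon^2}{2}\,\nabla_\mu\nabla_\mu\Bigr)v_1 .
\end{aligned} \tag{4.67} \]
This is very similar to Taylor's series.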

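[Figure 4.1: Parallel transport of a vector. The vector $w_0$ at $(x^\mu, x^\nu)$ is carried to $w_1$ at $(x^\mu + \epsilon, x^\nu)$, then to $w_2$ at $(x^\mu + \epsilon, x^\nu + \delta)$, then to $w_3$ at $(x^\mu, x^\nu + \delta)$, and back to $w_4$ at $(x^\mu, x^\nu)$.]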
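Consider a loop which is a parallelogram in coordinates, Fig. 4.1, with sides $\epsilon$ along
$x^\mu$ and $\delta$ along $x^\nu$. With obvious notations, we have up to second order
\[ \begin{aligned}
w_2 &= \Bigl(1 - \delta\nabla_\nu + \frac{\delta^2}{2}\nabla_\nu\nabla_\nu\Bigr)\Bigl(1 - \epsilon\nabla_\mu + \frac{\epsilon^2}{2}\nabla_\mu\nabla_\mu\Bigr)v_2 \\
&= \Bigl(1 - \epsilon\nabla_\mu - \delta\nabla_\nu + \frac{\epsilon^2}{2}\nabla_\mu\nabla_\mu + \epsilon\delta\,\nabla_\nu\nabla_\mu + \frac{\delta^2}{2}\nabla_\nu\nabla_\nu\Bigr)v_2 \\
w_3 &= \Bigl(1 + \epsilon\nabla_\mu + \frac{\epsilon^2}{2}\nabla_\mu\nabla_\mu\Bigr)
 \Bigl(1 - \epsilon\nabla_\mu - \delta\nabla_\nu + \frac{\epsilon^2}{2}\nabla_\mu\nabla_\mu + \epsilon\delta\,\nabla_\nu\nabla_\mu + \frac{\delta^2}{2}\nabla_\nu\nabla_\nu\Bigr)v_3 \\
&= \Bigl(1 - \delta\nabla_\nu + \frac{\delta^2}{2}\nabla_\nu\nabla_\nu + \epsilon\delta\,(\nabla_\nu\nabla_\mu - \nabla_\mu\nabla_\nu)\Bigr)v_3
\end{aligned} \tag{4.68} \]
\[ \begin{aligned}
w_4 &= \Bigl(1 + \delta\nabla_\nu + \frac{\delta^2}{2}\nabla_\nu\nabla_\nu\Bigr)
 \Bigl(1 - \delta\nabla_\nu + \frac{\delta^2}{2}\nabla_\nu\nabla_\nu + \epsilon\delta\,(\nabla_\nu\nabla_\mu - \nabla_\mu\nabla_\nu)\Bigr)v_4 \\
&= \bigl(1 + \epsilon\delta\,(\nabla_\nu\nabla_\mu - \nabla_\mu\nabla_\nu)\bigr)\,v_4 .
\end{aligned} \tag{4.69} \]
Here $v_4$ is the vector field at point 4, which is just $v_4 = v_0 = w_0$. Hence, the
difference after a loop of parallel transport is
\[ w_4^\rho - w_0^\rho = \epsilon\delta\,(\nabla_\nu\nabla_\mu - \nabla_\mu\nabla_\nu)\,v^\rho = \epsilon\delta\,R_{\mu\nu\sigma}{}^{\rho}\,v^\sigma . \tag{4.70} \]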
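Theorem 4.16.
1. $R_{\mu\nu\rho}{}^{\sigma} = -R_{\nu\mu\rho}{}^{\sigma}$
2. $R_{\mu\nu\rho}{}^{\sigma} + R_{\nu\rho\mu}{}^{\sigma} + R_{\rho\mu\nu}{}^{\sigma} = 0$
3. For Levi-Civita connection, define
\[ R_{\mu\nu\rho\sigma} \equiv R_{\mu\nu\rho}{}^{\lambda}\,g_{\lambda\sigma} . \tag{4.71} \]
We have
\[ R_{\mu\nu\rho\sigma} = -R_{\mu\nu\sigma\rho} . \tag{4.72} \]
4. Bianchi identity:
\[ R_{\mu\nu\sigma}{}^{\lambda}{}_{;\rho} + R_{\nu\rho\sigma}{}^{\lambda}{}_{;\mu} + R_{\rho\mu\sigma}{}^{\lambda}{}_{;\nu} = 0 \tag{4.73} \]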
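Proof. To prove Property 2, stare at Eq. (4.60) long enough, or notice that
Christoffel symbols are symmetric, Theorem 4.4. Pay attention to the first
two terms in Eq. (4.60), the lower indices of the cyclic sum of curvature
tensors are
\[ \nu(\mu\rho) - \mu(\nu\rho) + \rho(\nu\mu) - \nu(\rho\mu) + \mu(\rho\nu) - \rho(\mu\nu) = 0 . \tag{4.74} \]
For the last two terms in Eq. (4.60),
\[ (\mu\rho)\nu - (\nu\rho)\mu + (\nu\mu)\rho - (\rho\mu)\nu + (\rho\nu)\mu - (\mu\nu)\rho = 0 . \tag{4.75} \]
To prove Property 3,
\[ \begin{aligned}
0 &= \nabla_\lambda g_{\rho\sigma} \\
0 &= (\nabla_\mu\nabla_\nu - \nabla_\nu\nabla_\mu)\,g_{\rho\sigma} \\
0 &= R_{\mu\nu\rho}{}^{\lambda}\,g_{\lambda\sigma} + R_{\mu\nu\sigma}{}^{\lambda}\,g_{\rho\lambda} .
\end{aligned} \tag{4.76} \]
For Property 4, on one hand
\[ \omega_{\sigma;\rho;\nu;\mu} - \omega_{\sigma;\rho;\mu;\nu} = R_{\mu\nu\rho}{}^{\lambda}\,\omega_{\sigma;\lambda} + R_{\mu\nu\sigma}{}^{\lambda}\,\omega_{\lambda;\rho} . \tag{4.77} \]
On the other hand,
\[ \begin{aligned}
\omega_{\sigma;\nu;\mu;\rho} - \omega_{\sigma;\mu;\nu;\rho} &= (\omega_{\sigma;\nu;\mu} - \omega_{\sigma;\mu;\nu})_{;\rho} \\
&= (R_{\mu\nu\sigma}{}^{\lambda}\,\omega_\lambda)_{;\rho} \\
&= R_{\mu\nu\sigma}{}^{\lambda}{}_{;\rho}\,\omega_\lambda + R_{\mu\nu\sigma}{}^{\lambda}\,\omega_{\lambda;\rho} .
\end{aligned} \tag{4.78} \]
Cyclically permute the indices $\mu$, $\nu$ and $\rho$ in these two equations and sum up. The left hand sides agree, so
\[ \begin{aligned}
&R_{\mu\nu\rho}{}^{\lambda}\omega_{\sigma;\lambda} + R_{\nu\rho\mu}{}^{\lambda}\omega_{\sigma;\lambda} + R_{\rho\mu\nu}{}^{\lambda}\omega_{\sigma;\lambda}
 + R_{\mu\nu\sigma}{}^{\lambda}\omega_{\lambda;\rho} + R_{\nu\rho\sigma}{}^{\lambda}\omega_{\lambda;\mu} + R_{\rho\mu\sigma}{}^{\lambda}\omega_{\lambda;\nu} \\
&\quad = R_{\mu\nu\sigma}{}^{\lambda}{}_{;\rho}\omega_\lambda + R_{\nu\rho\sigma}{}^{\lambda}{}_{;\mu}\omega_\lambda + R_{\rho\mu\sigma}{}^{\lambda}{}_{;\nu}\omega_\lambda
 + R_{\mu\nu\sigma}{}^{\lambda}\omega_{\lambda;\rho} + R_{\nu\rho\sigma}{}^{\lambda}\omega_{\lambda;\mu} + R_{\rho\mu\sigma}{}^{\lambda}\omega_{\lambda;\nu} \\
&R_{\mu\nu\rho}{}^{\lambda}\omega_{\sigma;\lambda} + R_{\nu\rho\mu}{}^{\lambda}\omega_{\sigma;\lambda} + R_{\rho\mu\nu}{}^{\lambda}\omega_{\sigma;\lambda}
 = R_{\mu\nu\sigma}{}^{\lambda}{}_{;\rho}\omega_\lambda + R_{\nu\rho\sigma}{}^{\lambda}{}_{;\mu}\omega_\lambda + R_{\rho\mu\sigma}{}^{\lambda}{}_{;\nu}\omega_\lambda \\
&R_{\mu\nu\sigma}{}^{\lambda}{}_{;\rho}\omega_\lambda + R_{\nu\rho\sigma}{}^{\lambda}{}_{;\mu}\omega_\lambda + R_{\rho\mu\sigma}{}^{\lambda}{}_{;\nu}\omega_\lambda = 0 .
\end{aligned} \tag{4.79} \]
We have the last line because of Property 2. Since $\omega$ is arbitrary, we have
the result.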
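Definition 4.17. The Ricci tensor is defined as
\[ R_{\mu\rho} \equiv R_{\mu\nu\rho}{}^{\nu} . \tag{4.80} \]
For Levi-Civita connection, the scalar curvature is
\[ R \equiv R_{\mu\nu}\,g^{\mu\nu} . \tag{4.81} \]
Theorem 4.18.
\[ R_{\mu\nu} = R_{\nu\mu} . \tag{4.82} \]
Proof. By Eq. (4.72), we have
\[ \begin{aligned}
g^{\rho\sigma}R_{\mu\nu\rho\sigma} &= -g^{\rho\sigma}R_{\mu\nu\sigma\rho} \\
R_{\mu\nu\rho}{}^{\rho} &= -R_{\mu\nu\rho}{}^{\rho} \\
R_{\mu\nu\rho}{}^{\rho} &= 0 .
\end{aligned} \tag{4.83} \]
Contract $\nu$ and $\sigma$ in Property 2 of Theorem 4.16,
\[ \begin{aligned}
R_{\mu\nu\rho}{}^{\nu} + R_{\nu\rho\mu}{}^{\nu} + R_{\rho\mu\nu}{}^{\nu} &= 0 \\
R_{\mu\rho} - R_{\rho\mu} &= 0
\end{aligned} \tag{4.84} \]
where we have used Property 1 of Theorem 4.16.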
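Definition 4.19. The Einstein tensor is
\[ G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}R\,g_{\mu\nu} \tag{4.85} \]
for a Levi-Civita connection.
Theorem 4.20.
\[ \nabla^\mu G_{\mu\nu} \equiv g^{\mu\rho}\nabla_\rho G_{\mu\nu} = 0 . \tag{4.86} \]
Proof. By Bianchi identity,
\[ \begin{aligned}
R_{\mu\nu\sigma}{}^{\lambda}{}_{;\rho} + R_{\nu\rho\sigma}{}^{\lambda}{}_{;\mu} + R_{\rho\mu\sigma}{}^{\lambda}{}_{;\nu} &= 0 \\
R_{\mu\nu\sigma}{}^{\nu}{}_{;\rho} + R_{\nu\rho\sigma}{}^{\nu}{}_{;\mu} + R_{\rho\mu\sigma}{}^{\nu}{}_{;\nu} &= 0 \\
R_{\mu\sigma;\rho} - R_{\rho\sigma;\mu} + R_{\rho\mu\sigma}{}^{\nu}{}_{;\nu} &= 0 \\
g^{\mu\sigma}\bigl(R_{\mu\sigma;\rho} - R_{\rho\sigma;\mu} + g^{\nu\lambda}R_{\rho\mu\sigma\lambda;\nu}\bigr) &= 0 \\
R_{;\rho} - g^{\mu\sigma}R_{\rho\sigma;\mu} - g^{\mu\sigma}g^{\nu\lambda}R_{\rho\mu\lambda\sigma;\nu} &= 0 \\
R_{;\rho} - g^{\mu\sigma}R_{\rho\sigma;\mu} - g^{\nu\lambda}R_{\rho\lambda;\nu} &= 0 \\
g^{\mu\sigma}\bigl((R\,g_{\rho\sigma})_{;\mu} - R_{\rho\sigma;\mu} - R_{\rho\sigma;\mu}\bigr) &= 0 .
\end{aligned} \tag{4.87} \]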
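Example 4.21. We calculate the curvature tensor, Ricci tensor and scalar
curvature of the surface of a sphere. There are sixteen components for the
curvature tensor, but many of them are zero because of Property 1 and
Property 3 of Theorem 4.16. The only non-zero terms are
\[ R_{\theta\phi\theta\phi} = -R_{\phi\theta\theta\phi} = -R_{\theta\phi\phi\theta} = R_{\phi\theta\phi\theta} . \tag{4.88} \]
Eq. (4.60) gives
\[ \begin{aligned}
R_{\theta\phi\theta\phi} &= g_{\phi\phi}\,R_{\theta\phi\theta}{}^{\phi} \\
&= g_{\phi\phi}\bigl(\partial_\phi\Gamma^\phi_{\theta\theta} - \partial_\theta\Gamma^\phi_{\phi\theta} + \Gamma^\lambda_{\theta\theta}\Gamma^\phi_{\phi\lambda} - \Gamma^\lambda_{\phi\theta}\Gamma^\phi_{\theta\lambda}\bigr) \\
&= R^2\sin^2\theta\,\bigl(-\partial_\theta\cot\theta - \cot^2\theta\bigr) \\
&= R^2\sin^2\theta\,\Bigl(\frac{1}{\sin^2\theta} - \cot^2\theta\Bigr) \\
&= R^2\sin^2\theta .
\end{aligned} \tag{4.89} \]
The Ricci tensor is
\[ R_{\theta\theta} = R_{\theta\theta\theta}{}^{\theta} + R_{\theta\phi\theta}{}^{\phi} = 1 \tag{4.90} \]
\[ R_{\theta\phi} = R_{\phi\theta} = R_{\theta\theta\phi}{}^{\theta} + R_{\theta\phi\phi}{}^{\phi} = 0 \tag{4.91} \]
\[ R_{\phi\phi} = R_{\phi\theta\phi}{}^{\theta} + R_{\phi\phi\phi}{}^{\phi} = g^{\theta\theta}R_{\phi\theta\phi\theta} = \sin^2\theta . \tag{4.92} \]
The scalar curvature is
\[ g^{\theta\theta}R_{\theta\theta} + g^{\phi\phi}R_{\phi\phi} = \frac{2}{R^2} . \tag{4.93} \]
Is this what you expected? (I am sorry that both the scalar curvature and
the radius of the sphere are denoted by R.)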
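As a cross-check of this example, here is a minimal numerical sketch that feeds the sphere's Christoffel symbols into Eq. (4.60), in the index convention $R_{\mu\nu\rho}{}^{\sigma} = \partial_\nu\Gamma^\sigma_{\mu\rho} - \partial_\mu\Gamma^\sigma_{\nu\rho} + \Gamma^\lambda_{\mu\rho}\Gamma^\sigma_{\nu\lambda} - \Gamma^\lambda_{\nu\rho}\Gamma^\sigma_{\mu\lambda}$, and contracts as in Definition 4.17. The function names, the finite-difference step and the sample values are illustrative assumptions, not part of the notes.

```python
import math

def christoffel(theta):
    """Sphere Christoffel symbols; index 0 = theta, 1 = phi; G[a][b][c] = Gamma^a_{bc}."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def riemann(theta, h=1e-5):
    """R[m][n][r][s] = R_{mu nu rho}^sigma; theta-derivatives by central differences."""
    G, Gp, Gm = christoffel(theta), christoffel(theta + h), christoffel(theta - h)

    def dG(d, a, b, c):
        # partial_d Gamma^a_{bc}; nothing depends on phi (d == 1).
        return (Gp[a][b][c] - Gm[a][b][c]) / (2 * h) if d == 0 else 0.0

    R = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for m in range(2):
        for n in range(2):
            for r in range(2):
                for s in range(2):
                    val = dG(n, s, m, r) - dG(m, s, n, r)
                    for l in range(2):
                        val += G[l][m][r] * G[s][n][l] - G[l][n][r] * G[s][m][l]
                    R[m][n][r][s] = val
    return R

def ricci_and_scalar(theta, radius):
    """Ricci tensor R_{mu rho} = R_{mu nu rho}^nu and the scalar curvature."""
    R = riemann(theta)
    ricci = [[sum(R[m][n][r][n] for n in range(2)) for r in range(2)] for m in range(2)]
    g_inv = [1.0 / radius ** 2, 1.0 / (radius ** 2 * math.sin(theta) ** 2)]
    scalar = g_inv[0] * ricci[0][0] + g_inv[1] * ricci[1][1]
    return ricci, scalar

ricci, scalar = ricci_and_scalar(theta=1.0, radius=2.0)
print(ricci, scalar)
```

For a sphere of radius 2 the scalar curvature should come out close to $2/R^2 = 0.5$, independent of the sample point $\theta$.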
Chapter 5
General Relativity
Einstein revolutionized the concept of space and time with special relativity.
One of the main results is that nothing can travel faster than the speed of
light. The theory of gravity available at that time was Newtonian gravity,
which is not compatible with special relativity because the effect of gravity
in that theory is transmitted instantaneously. Einstein gave us another theory
of gravity, general relativity. He revolutionized the concept of spacetime
again.
5.1 Equivalence Principle

See Section 1.4 and Section 4.3 of Wald. Electromagnetism merges with special
relativity (and general relativity) very well. (I am sorry that we don't
have time for this. Please read the chapter in John D. Jackson, Classical
Electrodynamics, Wiley, 1999 or Landau and Lifshitz, The Classical Theory
of Fields, Butterworth-Heinemann, 1980.) We study the electromagnetic
interaction by studying the motions of electrically neutral and charged particles.
The difference in their motions will tell us the presence or absence and the
strength of the electromagnetic field. Spacetime is just a stage on which the
particles move. Can we do the same to gravity?
The answer is no because of the equivalence principle, which states
that all bodies are influenced by gravity and they fall in exactly the same
way. Another way to say it is that the inertial mass is proportional to the
gravitational mass, or the two kinds of masses are equivalent. (There are
quite a few versions of the equivalence principle, the strong, the weak, etc. We
don't distinguish them.)
Inertial mass $m_i$ describes how a body responds to a force, for example,
\[ m_i = \frac{F}{a} . \tag{5.1} \]
On the other hand, gravitational mass $m_g$ describes how strong an attractive
force will be between two bodies
\[ F = \frac{m_{g1}\,m_{g2}}{r^2} . \tag{5.2} \]
There is no logical reason why there should be any relationship between them,
just like there is no relationship between the inertial mass or gravitational
mass and the electric charge of a particle.
The equivalence principle states that $m_g$ is proportional to $m_i$ for everything
and with the same proportionality constant. In all books, the form of gravitational
attraction is adjusted to
\[ F = G\,\frac{m_{i1}\,m_{i2}}{r^2} , \tag{5.3} \]
where $G$ is the gravitational constant.
Whether equivalence principle is valid or not rests on experiments. It
is well tested for ordinary material, i.e., protons, neutrons and electrons.
Testing anti-hydrogen will be interesting.
One striking conclusion of the equivalence principle is that under the influence of
a gravitational field, everything will fall in exactly the same way,
\[ \frac{d^2 r}{dt^2} = -\frac{GM}{r^2} , \tag{5.4} \]
independent of the mass of the particle. We are not saying that everything
will fall in exactly the same way everywhere. A non-uniform gravitational
field is a simple example.
We could imagine that inside a free falling room, so long as it is not
large enough to explore the non-uniformity of the gravitational field, we do
not feel any gravitational attraction at all. The acceleration cancels the
gravitational attraction. This is a local inertial frame and all conclusions
of special relativity should be valid, provided that we do not look outside the
room.
We elaborate a little more. In a typical room on the surface of the Earth,
we feel the gravitational attraction of the Earth and the so-called little g is
$9.8\ \mathrm{m\,s^{-2}}$. We cannot detect the non-uniformity of the gravitational attraction
of the Earth. One has to have a room comparable to the size of the Earth
to find that. If the room is free falling to the center of the Earth, then
equivalence principle states that doing experiments inside the room cannot
tell us whether it is free falling or it is far away from any massive body
such that it does not experience any gravitational attraction, no matter how
complicated or clever the experiments are.
(If we wait long enough, we can tell it is not far away from massive
body because everything is falling toward the center of the Earth, hence, two
floating particles will approach each other. The difference in gravitational
forces at different locations is called the tidal force.)
Because everything is affected by the gravitational field, the logical conclusion
is: Instead of associating a property to all particles, the gravitational field
manifests itself as some properties of the spacetime, in which all particles are
moving. We say that the Minkowski space for special relativity is flat because
its metric tensor is constant and all curvature tensors are zero. A gravitational
field corresponds to a deviation from the flat geometry. We have a curved
spacetime.
The mathematical tool needed to describe gravity is then a theory of
spacetime such that locally, we could choose a coordinate system which gives
us back special relativity. We conclude: Spacetime is a semi-Riemannian
manifold with a Lorentz metric.
How to determine the metric will be discussed in Section 5.5. Given
a metric with Lorentz signature, how can we write down the physical equations?
Two general principles are very useful. The first one is that the
equations must be coordinate covariant, also called general covariant,
which means that the equations must transform nicely under a change
of coordinates. The second is that each equation must reduce to the corresponding
one in special relativity (or even Newtonian mechanics) if the spacetime is
flat.
Nature does not care much about coordinate systems. Coordinate covariance
should not only be valid in general relativity, but also in special
relativity and other physical theories. It is only that in special relativity, there
are overwhelmingly convenient coordinate systems and we ignore the others.
There is nothing to prevent us from thinking about Lorentz transformations in
polar coordinates of Minkowski space, apart from the fact that it will be very
clumsy.
The whole theory of tensor analysis is created for this purpose. We recast
an equation in special relativity in a form of tensors, then replace $\eta_{\mu\nu}$ by $g_{\mu\nu}$
and $\partial_\mu$ by $\nabla_\mu$. We will have an equation that is general covariant in curved
spacetime which satisfies both principles above. In fact, this step is necessary
even for polar coordinates of Minkowski space.
The world line of a free particle in Minkowski space is given by
\[ 0 = \frac{dv^\mu}{d\tau} \tag{5.5} \]
where $v^\mu = dx^\mu/d\tau$ is the velocity vector and $\tau$ is a parameter on the world
line. The condition that its speed is less than or equal to the speed of light is
\[ 0 \ge \eta_{\mu\nu}\,v^\mu v^\nu . \tag{5.6} \]
In curved spacetime (and in general coordinates), they are promoted to
\[ 0 = \frac{dx^\nu}{d\tau}\nabla_\nu v^\mu = \frac{dx^\nu}{d\tau}\frac{\partial v^\mu}{\partial x^\nu} + \Gamma^\mu_{\nu\rho}\,\frac{dx^\nu}{d\tau}\,v^\rho = \frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{\nu\rho}\,\frac{dx^\nu}{d\tau}\frac{dx^\rho}{d\tau} \tag{5.7} \]
\[ 0 \ge g_{\mu\nu}\,\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} . \tag{5.8} \]
Eq. (5.7) is just the geodesic equation, Eq. (4.30).
Can we add a term like $R\,v^\mu$ to the left hand side of Eq. (5.7)? Adding
such a term does not violate the two principles because the scalar curvature
$R = 0$ in flat spacetime. Yes, we could do so, the resulting theory is called
f(R) gravity. In general relativity, we take the simplest form and propose
that a free falling particle in curved spacetime moves along a geodesic.
5.2 Proper Time

Suppose we have a material clock traveling along the trajectory $x^\mu(\lambda)$, not
necessarily a geodesic. (We change the parameter from $\tau$ to $\lambda$.) The word
material means it travels with speed less than the speed of light, so we have
\[ 0 > g_{\mu\nu}\,\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda} . \tag{5.9} \]
Then, how does it tell time?
Let me explain the question in more detail. First of all, coordinate
time does not tell us anything physical. We are not asking for the readings
of $x^0(\lambda)$. Assume that there are observers situated at each point of the
whole universe. When the parameter takes some specific value $\lambda = \lambda_1$, the
clock is at coordinates $(x^0(\lambda_1), x^1(\lambda_1), x^2(\lambda_1), x^3(\lambda_1))$. The observer at the
exact same location can take the reading of the clock, let us say, $t_1$. Since the
observer is with the clock, there will be no time delay, or any other complications.
Repeat the observation again and again, with observers at other
locations. Afterward, the observers get together to compare their records.
They could reconstruct the relation between the time lapse of the clock and
the parameter $\lambda$.
In between $\lambda$ and $\lambda + \Delta\lambda$, the clock is approximately moving with constant
velocity. We could choose a local inertial frame. According to Eq. (2.22), the
time lapse is
\[ (\Delta\tau)^2 = -g_{\mu\nu}\,(\Delta x^\mu)(\Delta x^\nu) . \tag{5.10} \]
The proper time is
\[ \tau = \int \sqrt{-g_{\mu\nu}\,\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda}}\;d\lambda . \tag{5.11} \]
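To make Eq. (5.11) concrete, the following is a minimal numerical sketch for the special case of Minkowski space, $g_{\mu\nu} = \eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ with $c = 1$. The function names and the two sample world lines are hypothetical choices for illustration only.

```python
import math

def proper_time(worldline, lam0, lam1, steps=1000):
    """Midpoint-rule estimate of Eq. (5.11) in Minkowski space (assumed metric).

    worldline(lam) returns (t, x, y, z); derivatives use central differences."""
    d = (lam1 - lam0) / steps
    h = 1e-6
    total = 0.0
    for i in range(steps):
        lam = lam0 + (i + 0.5) * d
        plus, minus = worldline(lam + h), worldline(lam - h)
        v = [(a - b) / (2 * h) for a, b in zip(plus, minus)]
        interval = -v[0] ** 2 + v[1] ** 2 + v[2] ** 2 + v[3] ** 2
        total += math.sqrt(-interval) * d  # interval < 0 for a material clock
    return total

def uniform(lam):
    # Clock moving along x with constant speed 0.6, parametrized by coordinate time.
    return (lam, 0.6 * lam, 0.0, 0.0)

def circular(lam):
    # Clock on a unit circle with angular speed 0.6, hence also speed 0.6.
    return (lam, math.cos(0.6 * lam), math.sin(0.6 * lam), 0.0)

tau_uniform = proper_time(uniform, 0.0, 10.0)
tau_circular = proper_time(circular, 0.0, 10.0)
print(tau_uniform, tau_circular)
```

Both clocks have speed 0.6, so over a coordinate time of 10 each should record a proper time close to $10\sqrt{1 - 0.36} = 8$, even though only the first one follows a geodesic.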
5.3 Newtonian Limit

See Section 3.4 of Weinberg. Consider a slow moving particle in a static
weak gravitational field, which means we can choose a coordinate system
such that one of the coordinates is time-like, $g_{00} < 0$, the other coordinates
are space-like and the metric tensor does not depend on the time coordinate.
A satellite moving around the Earth is a good example. To qualify as a
theory of gravity, equations of general relativity should reduce to our familiar
Newtonian equations.
The Cartesian coordinates should be convenient. We expect that the
metric will deviate only slightly from the flat metric
\[ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} \tag{5.12} \]
where $|h_{\mu\nu}| \ll 1$. The metric is static, so all time derivatives are zero, $\partial_0 g_{\mu\nu} = 0$.
The particle's speed is small, so we have
\[ \frac{dx^0}{d\tau} \gg \frac{dx^i}{d\tau} \tag{5.13} \]
for $i = 1, 2, 3$. (Remember the factor of c somewhere?) The equation of
geodesic becomes
\[ \frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{00}\,\frac{dx^0}{d\tau}\frac{dx^0}{d\tau} = 0 , \tag{5.14} \]
where by Eq. (4.36) and up to first order in $h$,
\[ \begin{aligned}
\Gamma^\mu_{00} &= \frac{1}{2}\,\eta^{\mu\nu}\,(\partial_0 h_{0\nu} + \partial_0 h_{0\nu} - \partial_\nu h_{00}) \\
&= -\frac{1}{2}\,\eta^{\mu\mu}\,\partial_\mu h_{00} .
\end{aligned} \tag{5.15} \]
(We do not sum over $\mu$ in the last line.) In component form, we have
\[ \frac{d^2x^i}{d\tau^2} = \frac{1}{2}\,\partial_i h_{00}\left(\frac{dt}{d\tau}\right)^2 \tag{5.16} \]
\[ \frac{d^2t}{d\tau^2} = 0 . \tag{5.17} \]
The second line tells us that up to some simple coordinate changes, we could
take $\tau = t$. The first equation becomes
\[ \frac{d^2\mathbf{x}}{dt^2} = \frac{1}{2}\,\mathrm{grad}\,h_{00} . \tag{5.18} \]
Comparing with Eq. (5.4) implies that
\[ h_{00} = -2\Phi + \text{constant} \tag{5.19} \]
where $\Phi$ is the gravitational potential. The constant is zero because very
far away, the metric should tend to the Minkowski metric,
\[ g_{00} = -(1 + 2\Phi) . \tag{5.20} \]
Here the gravitational potential is dimensionless, while the traditional potential
has dimension of energy divided by mass. Putting back the factors of
c,
\[ \Phi = -\frac{GM}{Rc^2} . \tag{5.21} \]
On the surface of the Earth, $|\Phi| \sim 10^{-9}$.
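The order of magnitude quoted above can be checked with Eq. (5.21). This is only a minimal sketch; the rounded constants and the function name `dimensionless_potential` are assumptions introduced for illustration.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
M_EARTH = 5.972e24   # mass of the Earth, kg (rounded)
R_EARTH = 6.371e6    # mean radius of the Earth, m (rounded)
C = 2.998e8          # speed of light, m/s (rounded)

def dimensionless_potential(mass, radius):
    """Eq. (5.21): Phi = -G M / (R c^2)."""
    return -G * mass / (radius * C ** 2)

phi = dimensionless_potential(M_EARTH, R_EARTH)
h00 = -2.0 * phi           # Eq. (5.19) with the constant set to zero
g00 = -(1.0 + 2.0 * phi)   # Eq. (5.20)
print(phi, h00, g00)
```

With these inputs $|\Phi|$ is about $7 \times 10^{-10}$, so $g_{00}$ differs from $-1$ only in the ninth decimal place.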
5.4 Time Dilation

See Section 3.5 of Weinberg. Within a small room, we cannot detect time
dilation at all because the equivalence principle states that every clock in the
room will run at exactly the same rate.
To investigate time dilation of two different locations, we still assume that
the spacetime is static. Suppose there are two clocks, at points A and B,
and we could send light from one point to another. The actual curve could
be found by solving the geodesic equation. Light is traveling with the speed of
light (I feel stupid saying this.)
\[ \begin{aligned}
0 &= g_{\mu\nu}\,\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda} \\
0 &= g_{00}\,(dt)^2 + 2g_{ti}\,dt\,dx^i + g_{ij}\,dx^i dx^j \\
dt &= \frac{-g_{ti}\,dx^i - \sqrt{(g_{ti}\,dx^i)^2 - g_{00}\,g_{ij}\,dx^i dx^j}}{g_{00}} .
\end{aligned} \tag{5.22} \]
Integrating along the geodesic, we can find the coordinate time needed for
light to travel from A to B. The conclusion we need is that this does not
depend on $t$. Let it be $T$.
Light pulses could be sent out from A regularly according to A's clock,
and B could compare the arrival times of the pulses to see if there is time
dilation.
Clocks run at the rate of proper time. For a clock at a fixed location, by
Eq. (5.10), the relation between proper time and coordinate time is
\[ \Delta\tau = \sqrt{-g_{00}}\;\Delta t . \tag{5.23} \]
The first pulse was sent out at $t = 0$, say. It arrived at B at $t = T$. When
the clock at A read $\Delta\tau_A$, the coordinate time was $\Delta t = \Delta\tau_A/\sqrt{-g_{00}(A)}$ and
the second pulse was sent out. It arrived at B at coordinate time $\Delta t + T$. For
observers at B, the proper time difference between the two pulses is
\[ \begin{aligned}
\Delta\tau_B &= \sqrt{-g_{00}(B)}\;\Delta t \\
&= \sqrt{\frac{g_{00}(B)}{g_{00}(A)}}\;\Delta\tau_A .
\end{aligned} \tag{5.24} \]
If point B is far away from any gravitational field, $g_{00}(B) = -1$, while
point A is at the surface of the Earth, $g_{00}(A) = -(1 - 2GM/R)$, the ratio of
proper times is
\[ \frac{\Delta\tau_B}{\Delta\tau_A} \approx 1 + \frac{GM}{R} > 1 . \tag{5.25} \]
Clocks (and everything) in a gravitational field run slower. We have time
dilation.
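The step from Eq. (5.24) to Eq. (5.25) is a first-order expansion, and a short numerical sketch shows how good it is. The function name and the illustrative value $\Phi = -7 \times 10^{-10}$ for the Earth's surface are assumptions, and $g_{00} = -(1 + 2\Phi)$ is taken from Eq. (5.20).

```python
import math

def rate_ratio(phi_a, phi_b=0.0):
    """Eq. (5.24) with g00 = -(1 + 2 Phi): proper time at B per unit proper time at A."""
    return math.sqrt((1.0 + 2.0 * phi_b) / (1.0 + 2.0 * phi_a))

phi_surface = -7.0e-10              # rough dimensionless potential at A (illustrative)
exact = rate_ratio(phi_surface)     # B is far away, Phi_B = 0
first_order = 1.0 - phi_surface     # Eq. (5.25): 1 + GM/R in units with c = 1

print(exact - 1.0, first_order - 1.0)
```

The two numbers agree to within rounding, since the neglected terms are of order $\Phi^2 \sim 10^{-18}$.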
This effect has been verified by experiments. The fractional difference, $\sim 10^{-9}$, although
small, is much larger than the precision of atomic clocks. However, instead
of an atomic clock at infinity for reference, scientists compare the rate of
atomic clocks on the surface of the Earth and atomic clocks in orbit, which
are free falling, hence equivalent to clocks in no gravitational attraction.
5.5 Einsteins Field Equation

In this section, we take up the task to find out how to determine the metric
of spacetime. The main result is Einstein's field equation or just
Einstein's equation
\[ R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R = 8\pi G\,T_{\mu\nu} . \tag{5.26} \]
We will give an argument why Einstein's equation is reasonable, but it cannot
prove the equation. This equation is part of the basic assumption of general
relativity, just like the inverse square law is part of the basic assumption of
Newtonian gravity.
The equivalence principle implies that effects of gravity could be described
by a curved spacetime, but gravity itself is determined by matter. Hence,
matter can affect spacetime. This is roughly stated as Mach's principle:
spacetime is influenced by matter. Spacetime is not just a stage anymore.
Spacetime and matter affect each other and evolve with each other.
Just like the equation of motion of ordinary matter, we expect that after
we specify the initial "position" and "velocity" of the spacetime, the
governing equation should tell us how it evolves. Hence, we expect that
the governing equation is a second order differential equation in time on the
metric. By general covariance, it is a second order differential equation in all
coordinates.
How could we describe matter? There is no rigid body in special relativity
because the speed of light is the maximum allowed speed. There is also no
rigid body in general relativity. Everything is fluid, or at least, fluid-like. To
describe it, one may think that the energy-momentum of the fluid at each
point, i.e. an energy-momentum four vector field, is a good choice. There
are at least two reasons why it isn't so. Firstly, it is impossible to form a
four vector out of the second order derivatives of the metric tensor. (One has
to play around with the metric tensor a lot to convince oneself of this.) This
road is not very promising to relate geometry and matter.
The second reason is that we have forgotten the pressure of the fluid. Pressure
is due to the interaction of the particles. Usually, pressure increases
with density. This is why matter will not easily self gravitate and form a
black hole.
The next candidate in the line is the stress tensor, mentioned in Section 2.6.
Einstein did find an equation relating the stress tensor of matter and the
geometry of spacetime. We will take the stress tensor as the appropriate
quantity to describe matter. Here, we would just like to mention that the
perfect fluid is defined by
\[ T_{\mu\nu} = \rho\,u_\mu u_\nu + P\,(g_{\mu\nu} + u_\mu u_\nu) , \tag{5.27} \]
where $u_\mu = g_{\mu\nu}u^\nu$ is the velocity four vector of the fluid at the point. $\rho$ is the
density and $P$ is the pressure, both measured in the rest frame of the fluid at the
point. We do not dive into the details of this equation. The bunch of particles
in Section 2.6 describes something astronomers call dust, something
with $P = 0$.
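Following Section 4.3 of Wald (but beware of some sign errors in (3.3.18),
(4.3.18) and (4.3.19)), we consider the tidal acceleration in a non-uniform
gravitational field. In Newtonian gravity, the equation of motion is
\[ \frac{d^2\mathbf{x}}{dt^2} = -\mathrm{grad}\,\Phi . \tag{5.28} \]
The difference in accelerations at two different points is
\[ \begin{aligned}
-\mathrm{grad}\,\Phi\,|_{\mathbf{x}_2} + \mathrm{grad}\,\Phi\,|_{\mathbf{x}_1}
&= -((\mathbf{x}_2 - \mathbf{x}_1)\cdot\mathrm{grad})\,\mathrm{grad}\,\Phi \\
&= -(\Delta\mathbf{x}\cdot\mathrm{grad})\,\mathrm{grad}\,\Phi .
\end{aligned} \tag{5.29} \]
If the initial separation of the two particles is along the $x^\nu$ direction with
distance $\epsilon$, the tidal acceleration is
\[ -\epsilon\,\partial_\nu\partial^\rho\Phi \tag{5.30} \]
where $\rho$ labels the acceleration vector.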

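In curved spacetime, the tidal effect displays itself as the separation of
particles that follow their geodesics. Assume we could choose a coordinate
system such that the geodesics go along the $x^\mu$ direction and the two particles are
away from each other by a distance $\epsilon$ in the $x^\nu$ direction.
If $v$ is the initial velocity, the initial velocity at $x^\nu = \epsilon$ is the parallel
transport of $v$ along the coordinate line $x^\nu$. After some time $\Delta t$, the
two particles travel a distance $\delta = v^\mu\Delta t$ along the $x^\mu$ direction. To compare their
velocities, we parallel transport back.
In summary, the first particle, with initial velocity $v$, goes to $x^\mu = \delta$ in
time $\Delta t$. The initial velocity of the second particle is the parallel transport
of $v$ along $x^\nu$. Then, it moves to $(x^\nu, x^\mu) = (\epsilon, \delta)$. We parallel transport its
velocity back to $(x^\nu, x^\mu) = (0, \delta)$ for comparison. This is just Eq. (4.68). The
difference in velocities is then
\[ \begin{aligned}
&\Bigl(1 - \delta\nabla_\mu + \frac{\delta^2}{2}\nabla_\mu\nabla_\mu + \epsilon\delta\,(\nabla_\mu\nabla_\nu - \nabla_\nu\nabla_\mu)\Bigr)v^\rho
 - \Bigl(1 - \delta\nabla_\mu + \frac{\delta^2}{2}\nabla_\mu\nabla_\mu\Bigr)v^\rho \\
&\quad = \epsilon\delta\,(\nabla_\mu\nabla_\nu - \nabla_\nu\nabla_\mu)\,v^\rho \\
&\quad = -\epsilon\delta\,R_{\mu\nu\sigma}{}^{\rho}\,v^\sigma .
\end{aligned} \tag{5.31} \]
The tidal acceleration is
\[ -\epsilon\delta\,R_{\mu\nu\sigma}{}^{\rho}\,v^\sigma/\Delta t = -\epsilon\,R_{\mu\nu\sigma}{}^{\rho}\,v^\mu v^\sigma . \tag{5.32} \]
We have the correspondence
\[ R_{\mu\nu\sigma}{}^{\rho}\,v^\mu v^\sigma \leftrightarrow \partial_\nu\partial^\rho\Phi . \tag{5.33} \]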
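Poisson's equation for gravitational potential and mass density is
\[ \nabla^2\Phi = 4\pi G\rho , \tag{5.34} \]
and $\rho \approx T_{\mu\nu}v^\mu v^\nu$ as shown in Eq. (5.27) or deduced from Eq. (2.41) (after
some work). Contracting the two free indices of Eq. (5.33), we have
\[ R_{\mu\nu}\,v^\mu v^\nu \approx 4\pi G\rho \tag{5.35} \]
or
\[ \begin{aligned}
R_{\mu\nu}\,v^\mu v^\nu &\approx 4\pi G\,T_{\mu\nu}\,v^\mu v^\nu \\
R_{\mu\nu} &\approx 4\pi G\,T_{\mu\nu} .
\end{aligned} \tag{5.36} \]
However, $\nabla^\mu T_{\mu\nu} = 0 \neq \nabla^\mu R_{\mu\nu}$. They are not equal to each other. The hint
is in Theorem 4.20. We propose
\[ R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R = kG\,T_{\mu\nu} . \tag{5.37} \]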
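To fix the constant $k$, we follow Section 7.1 of Weinberg. Consider a static
weak gravitational field of some non-relativistic system. Then, $|T_{ij}| \ll |T_{00}|$,
where $i, j = 1, 2, 3$. By Eq. (5.37),
\[ R_{ij} \approx \frac{1}{2}\,g_{ij}R . \tag{5.38} \]
For weak field, $g_{\mu\nu} \approx \eta_{\mu\nu}$,
\[ R \approx -R_{00} + \sum_i R_{ii} = -R_{00} + \frac{3}{2}R , \tag{5.39} \]
we have $R \approx 2R_{00}$ and
\[ G_{00} \equiv R_{00} - \frac{1}{2}\,g_{00}R \approx 2R_{00} . \tag{5.40} \]
We calculate $R_{00}$ for a weak static gravitational field. We could ignore the
$\Gamma\Gamma$ terms and time derivatives in Eq. (4.60),
\[ \begin{aligned}
R_{00} &= \partial_\mu\Gamma^\mu_{00} \\
&= \frac{1}{2}\,\partial_\mu\bigl(g^{\mu\nu}(\partial_0 g_{0\nu} + \partial_0 g_{0\nu} - \partial_\nu g_{00})\bigr) \\
&= -\frac{1}{2}\,\partial_\mu\bigl(g^{\mu\nu}\partial_\nu g_{00}\bigr) \\
&= \sum_i \partial_i\partial_i\Phi ,
\end{aligned} \tag{5.41} \]
where we have used Eq. (5.20). Finally, we have by Eq. (5.34)
\[ \begin{aligned}
G_{00} &= kG\,T_{00} \\
2\sum_i \partial_i\partial_i\Phi &= kG\rho \\
8\pi G\rho &= kG\rho \\
8\pi &= k ,
\end{aligned} \tag{5.42} \]
and Einstein's field equation, Eq. (5.26). (I hope the sign is correct.)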
Students should also consult the two derivations in Weinberg. His method
is mainly finding out all possible combinations of second derivatives of the
metric tensor, to see that only one combination is a non-trivial tensor and it
must be proportional to the stress tensor.
There is another more natural derivation, the action principle. Please
read Chapter 12 of Weinberg or Appendix E of Wald. It is much more
convincing.
Einstein's field equation is a highly non-linear partial differential equation.
Mathematicians classify it as a hyperbolic equation. Physicists are
concerned with the Cauchy problem or Cauchy surface, see Chapter 8 of Wald
and Section 7.5 of Weinberg. It all means that one could choose a three
dimensional space, specify on it the initial spatial metric tensor, the matter
distribution, and their velocities, then Einstein's field equation will in
principle tell their evolution. One has to solve for the metric of spacetime
and the motion of material simultaneously. This requirement, in fact, puts some
constraints on spacetime.
Chapter 6
Schwarzschild Solution
This is the most interesting solution of general relativity. The derivation is
quite tedious, but we have to go through it.
6.1 Metric in Standard Form

See Section 8.1 and 8.2 of Weinberg. We use bold type to denote vectors in
three dimensional Euclidean space. The dot product $\mathbf{x}\cdot\mathbf{y}$ is invariant under
rotation. A metric is isotropic if it is invariant under rotation of the space,
it is static if it is independent of time. What is the most general metric which
is both static and isotropic?
A function $f$ is isotropic and static if and only if it is a function of
$r \equiv \sqrt{x_1^2 + x_2^2 + x_3^2}$. For tensors of type (0, 2), we have $(\mathbf{x}\cdot d\mathbf{x})^2$, $d\mathbf{x}\cdot d\mathbf{x}$, $dt^2$
and $dt\,\mathbf{x}\cdot d\mathbf{x}$. Hence, the proper time could be written as
\[ d\tau^2 = F(r)\,dt^2 - 2E(r)\,dt\,\mathbf{x}\cdot d\mathbf{x} - D(r)\,(\mathbf{x}\cdot d\mathbf{x})^2 - C(r)\,d\mathbf{x}\cdot d\mathbf{x} . \tag{6.1} \]
Change to polar coordinates, we have
\[ \mathbf{x}\cdot d\mathbf{x} = r\,dr \tag{6.2} \]
\[ d\mathbf{x}\cdot d\mathbf{x} = dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2 \tag{6.3} \]
and
\[ d\tau^2 = F(r)\,dt^2 - 2rE(r)\,dt\,dr - r^2D(r)\,dr^2 - C(r)\bigl(dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2\bigr) . \tag{6.4} \]
To eliminate the cross term, we define
\[ t' \equiv t + \psi(r) \tag{6.5} \]
where
\[ \frac{d\psi(r)}{dr} = -\frac{rE(r)}{F(r)} . \tag{6.6} \]
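Then, writing $\psi' \equiv d\psi/dr$ for the function $\psi(r)$ of Eq. (6.5),
\[ dt = dt' - \psi'(r)\,dr \tag{6.7} \]
and
\[ \begin{aligned}
F\,dt^2 - 2rE\,dt\,dr - r^2D\,dr^2
&= F\,(dt' - \psi'\,dr)^2 - 2rE\,(dt' - \psi'\,dr)\,dr - r^2D\,dr^2 \\
&= F\,dt'^2 - 2F\psi'\,dt'\,dr + F\psi'^2\,dr^2 - 2rE\,dt'\,dr + 2rE\psi'\,dr^2 - r^2D\,dr^2 \\
&= F\,dt'^2 + \bigl(F\psi'^2 + 2rE\psi' - r^2D\bigr)\,dr^2 \\
&= F\,dt'^2 - r^2\Bigl(\frac{E^2}{F} + D\Bigr)dr^2 \\
&\equiv F\,dt'^2 - G\,dr^2 .
\end{aligned} \tag{6.8} \]
The proper time becomes
\[ d\tau^2 = F(r)\,dt'^2 - G(r)\,dr^2 - C(r)\bigl(dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2\bigr) . \tag{6.9} \]
We further define $r' \equiv \sqrt{C(r)}\;r$,
\[ d\tau^2 = B(r')\,dt'^2 - A(r')\,dr'^2 - r'^2\bigl(d\theta^2 + \sin^2\theta\,d\phi^2\bigr) \tag{6.10} \]
for some functions $B(r')$ and $A(r')$. (Students should work out these two
functions in terms of $F$, $G$ and $C$.) Finally, we re-label the coordinates
\[ d\tau^2 = B(r)\,dt^2 - A(r)\,dr^2 - r^2\bigl(d\theta^2 + \sin^2\theta\,d\phi^2\bigr) . \tag{6.11} \]
This is the standard form of static and isotropic metric.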
6.2 Christoffel Symbols

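The components of the metric tensor are
\[ g_{tt} = -B(r) \tag{6.12} \]
\[ g_{rr} = A(r) \tag{6.13} \]
\[ g_{\theta\theta} = r^2 \tag{6.14} \]
\[ g_{\phi\phi} = r^2\sin^2\theta , \tag{6.15} \]
and all cross terms are zero. Their inverses are
\[ g^{tt} = -1/B(r) \tag{6.16} \]
\[ g^{rr} = 1/A(r) \tag{6.17} \]
\[ g^{\theta\theta} = 1/r^2 \tag{6.18} \]
\[ g^{\phi\phi} = 1/(r^2\sin^2\theta) . \tag{6.19} \]
The Christoffel symbols for the Levi-Civita connection are given by Eq. (4.36).
For our case, we have some simplification. In the following equation, we do
not have to sum over $\mu$,
\[ \Gamma^\mu_{\nu\rho} = \frac{1}{2}\,g^{\mu\mu}\,(\partial_\nu g_{\mu\rho} + \partial_\rho g_{\mu\nu} - \partial_\mu g_{\nu\rho}) . \tag{6.20} \]
There are forty Christoffel symbols. Most of them are zero because the metric
tensor depends only on $r$ and $\theta$. Let us calculate them. For $\mu = t$,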
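\[ \Gamma^t_{\nu\rho} = \frac{1}{2}\,g^{tt}\,(\delta^t_\rho\,\partial_\nu g_{tt} + \delta^t_\nu\,\partial_\rho g_{tt}) . \tag{6.21} \]
The only non-zero components are
\[ \Gamma^t_{tr} = \frac{B'}{2B} . \tag{6.22} \]
(A prime denotes differentiation with respect to $r$.) Put $\mu = r$,
\[ \Gamma^r_{\nu\rho} = \frac{1}{2}\,g^{rr}\,(\delta^r_\rho\,\partial_\nu g_{rr} + \delta^r_\nu\,\partial_\rho g_{rr} - \partial_r g_{\nu\rho}) . \tag{6.23} \]
\[ \Gamma^r_{tt} = -\frac{1}{2}\,g^{rr}\,\partial_r g_{tt} = \frac{B'}{2A} \tag{6.24} \]
\[ \Gamma^r_{tr} = \frac{1}{2}\,g^{rr}\,\partial_t g_{rr} = 0 \tag{6.25} \]
\[ \Gamma^r_{t\theta} = 0 \tag{6.26} \]
\[ \Gamma^r_{t\phi} = 0 \tag{6.27} \]
\[ \Gamma^r_{rr} = \frac{1}{2}\,g^{rr}\,\partial_r g_{rr} = \frac{A'}{2A} \tag{6.28} \]
\[ \Gamma^r_{r\theta} = \frac{1}{2}\,g^{rr}\,\partial_\theta g_{rr} = 0 \tag{6.29} \]
\[ \Gamma^r_{r\phi} = 0 \tag{6.30} \]
\[ \Gamma^r_{\theta\theta} = -\frac{1}{2}\,g^{rr}\,\partial_r g_{\theta\theta} = -\frac{r}{A} \tag{6.31} \]
\[ \Gamma^r_{\theta\phi} = 0 \tag{6.32} \]
\[ \Gamma^r_{\phi\phi} = -\frac{1}{2}\,g^{rr}\,\partial_r g_{\phi\phi} = -\frac{r\sin^2\theta}{A} . \tag{6.33} \]
Now, for $\mu = \theta$,
\[ \Gamma^\theta_{\nu\rho} = \frac{1}{2}\,g^{\theta\theta}\,(\delta^\theta_\rho\,\partial_\nu g_{\theta\theta} + \delta^\theta_\nu\,\partial_\rho g_{\theta\theta} - \partial_\theta g_{\nu\rho}) . \tag{6.34} \]
\[ \Gamma^\theta_{tt} = \Gamma^\theta_{tr} = \Gamma^\theta_{t\theta} = \Gamma^\theta_{t\phi} = \Gamma^\theta_{rr} = 0 \tag{6.35} \]
\[ \Gamma^\theta_{r\theta} = \frac{1}{2}\,g^{\theta\theta}\,\partial_r g_{\theta\theta} = \frac{1}{r} \tag{6.36} \]
\[ \Gamma^\theta_{r\phi} = 0 \tag{6.37} \]
\[ \Gamma^\theta_{\theta\theta} = \Gamma^\theta_{\theta\phi} = 0 \tag{6.38} \]
\[ \Gamma^\theta_{\phi\phi} = -\frac{1}{2}\,g^{\theta\theta}\,\partial_\theta g_{\phi\phi} = -\sin\theta\cos\theta . \tag{6.39} \]
Finally, $\mu = \phi$,
\[ \Gamma^\phi_{\nu\rho} = \frac{1}{2}\,g^{\phi\phi}\,(\delta^\phi_\rho\,\partial_\nu g_{\phi\phi} + \delta^\phi_\nu\,\partial_\rho g_{\phi\phi}) . \tag{6.40} \]
The non-zero components are
\[ \Gamma^\phi_{r\phi} = \frac{1}{2}\,g^{\phi\phi}\,\partial_r g_{\phi\phi} = \frac{1}{r} \tag{6.41} \]
\[ \Gamma^\phi_{\theta\phi} = \frac{1}{2}\,g^{\phi\phi}\,\partial_\theta g_{\phi\phi} = \cot\theta . \tag{6.42} \]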
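We group together the non-zero components in another order for easy reference.
\[ \Gamma^t_{tr} = \frac{B'}{2B} \qquad \Gamma^r_{tt} = \frac{B'}{2A} \tag{6.43} \]
\[ \Gamma^t_{rt} = \frac{B'}{2B} \qquad \Gamma^r_{rr} = \frac{A'}{2A} \qquad \Gamma^\theta_{r\theta} = \frac{1}{r} \qquad \Gamma^\phi_{r\phi} = \frac{1}{r} \tag{6.44} \]
\[ \Gamma^r_{\theta\theta} = -\frac{r}{A} \qquad \Gamma^\theta_{\theta r} = \frac{1}{r} \qquad \Gamma^\phi_{\theta\phi} = \cot\theta \tag{6.45} \]
\[ \Gamma^r_{\phi\phi} = -\frac{r\sin^2\theta}{A} \qquad \Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta \qquad \Gamma^\phi_{\phi r} = \frac{1}{r} \qquad \Gamma^\phi_{\phi\theta} = \cot\theta \tag{6.46} \]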
6.3 Ricci Tensor

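We have some preparation before we calculate the Ricci tensor.
\[ \Gamma^\sigma_{\sigma t} = 0 \tag{6.47} \]
\[ \Gamma^\sigma_{\sigma r} = \frac{B'}{2B} + \frac{A'}{2A} + \frac{2}{r} \tag{6.48} \]
\[ \Gamma^\sigma_{\sigma\theta} = \cot\theta \tag{6.49} \]
\[ \Gamma^\sigma_{\sigma\phi} = 0 . \tag{6.50} \]
\[ \Gamma^\sigma_{\lambda t}\Gamma^\lambda_{\sigma t} = 2\,\Gamma^t_{tr}\Gamma^r_{tt} = \frac{B'^2}{2AB} \tag{6.51} \]
\[ \Gamma^\sigma_{\lambda t}\Gamma^\lambda_{\sigma r} = 0 \tag{6.52} \]
\[ \Gamma^\sigma_{\lambda t}\Gamma^\lambda_{\sigma\theta} = 0 \tag{6.53} \]
\[ \Gamma^\sigma_{\lambda t}\Gamma^\lambda_{\sigma\phi} = 0 \tag{6.54} \]
\[ \Gamma^\sigma_{\lambda r}\Gamma^\lambda_{\sigma r} = \frac{B'^2}{4B^2} + \frac{A'^2}{4A^2} + \frac{2}{r^2} \tag{6.55} \]
\[ \Gamma^\sigma_{\lambda r}\Gamma^\lambda_{\sigma\theta} = \frac{\cot\theta}{r} \tag{6.56} \]
\[ \Gamma^\sigma_{\lambda r}\Gamma^\lambda_{\sigma\phi} = 0 \tag{6.57} \]
\[ \Gamma^\sigma_{\lambda\theta}\Gamma^\lambda_{\sigma\theta} = -\frac{2}{A} + \cot^2\theta \tag{6.58} \]
\[ \Gamma^\sigma_{\lambda\theta}\Gamma^\lambda_{\sigma\phi} = 0 \tag{6.59} \]
\[ \Gamma^\sigma_{\lambda\phi}\Gamma^\lambda_{\sigma\phi} = -\frac{2\sin^2\theta}{A} - 2\cos^2\theta \tag{6.60} \]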
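Now, we can calculate the Ricci tensor by Eq. (4.60)
\[ R_{\mu\nu} = \partial_\sigma\Gamma^\sigma_{\mu\nu} - \partial_\mu\Gamma^\sigma_{\sigma\nu} + \Gamma^\lambda_{\mu\nu}\Gamma^\sigma_{\sigma\lambda} - \Gamma^\sigma_{\lambda\mu}\Gamma^\lambda_{\sigma\nu} . \tag{6.61} \]
\[ \begin{aligned}
R_{tt} &= \partial_\sigma\Gamma^\sigma_{tt} + \Gamma^r_{tt}\Gamma^\sigma_{\sigma r} - \Gamma^\sigma_{\lambda t}\Gamma^\lambda_{\sigma t} \\
&= \partial_r\Bigl(\frac{B'}{2A}\Bigr) + \frac{B'}{2A}\Bigl(\frac{B'}{2B} + \frac{A'}{2A} + \frac{2}{r}\Bigr) - \frac{B'^2}{2AB} \\
&= \frac{B''}{2A} - \frac{A'B'}{2A^2} + \frac{B'^2}{4AB} + \frac{A'B'}{4A^2} + \frac{B'}{rA} - \frac{B'^2}{2AB} \\
&= \frac{B''}{2A} - \frac{A'B'}{4A^2} - \frac{B'^2}{4AB} + \frac{B'}{rA} \\
&= \frac{B''}{2A} - \frac{B'}{4A}\Bigl(\frac{A'}{A} + \frac{B'}{B}\Bigr) + \frac{B'}{rA} .
\end{aligned} \tag{6.62} \]
\[ R_{tr} = -\partial_r\Gamma^\sigma_{\sigma t} + \Gamma^\lambda_{rt}\Gamma^\sigma_{\sigma\lambda} - \Gamma^\sigma_{\lambda r}\Gamma^\lambda_{\sigma t} = 0 . \tag{6.63} \]
\[ R_{t\theta} = R_{t\phi} = 0 . \tag{6.64} \]
\[ \begin{aligned}
R_{rr} &= \partial_\sigma\Gamma^\sigma_{rr} - \partial_r\Gamma^\sigma_{\sigma r} + \Gamma^r_{rr}\Gamma^\sigma_{\sigma r} - \Gamma^\sigma_{\lambda r}\Gamma^\lambda_{\sigma r} \\
&= \partial_r\Bigl(\frac{A'}{2A}\Bigr) - \partial_r\Bigl(\frac{B'}{2B} + \frac{A'}{2A} + \frac{2}{r}\Bigr)
 + \frac{A'}{2A}\Bigl(\frac{B'}{2B} + \frac{A'}{2A} + \frac{2}{r}\Bigr) - \Bigl(\frac{B'^2}{4B^2} + \frac{A'^2}{4A^2} + \frac{2}{r^2}\Bigr) \\
&= -\partial_r\Bigl(\frac{B'}{2B} + \frac{2}{r}\Bigr) + \frac{A'B'}{4AB} + \frac{A'}{rA} - \frac{B'^2}{4B^2} - \frac{2}{r^2} \\
&= -\frac{B''}{2B} + \frac{B'^2}{2B^2} + \frac{A'B'}{4AB} + \frac{A'}{rA} - \frac{B'^2}{4B^2} \\
&= -\frac{B''}{2B} + \frac{B'}{4B}\Bigl(\frac{A'}{A} + \frac{B'}{B}\Bigr) + \frac{A'}{rA} .
\end{aligned} \tag{6.65} \]
\[ \begin{aligned}
R_{r\theta} &= \partial_\sigma\Gamma^\sigma_{r\theta} - \partial_r\Gamma^\sigma_{\sigma\theta} + \Gamma^\lambda_{r\theta}\Gamma^\sigma_{\sigma\lambda} - \Gamma^\sigma_{\lambda r}\Gamma^\lambda_{\sigma\theta} \\
&= \Gamma^\theta_{r\theta}\Gamma^\sigma_{\sigma\theta} - \Gamma^\sigma_{\lambda r}\Gamma^\lambda_{\sigma\theta} \\
&= 0 .
\end{aligned} \tag{6.66} \]
\[ R_{r\phi} = \partial_\sigma\Gamma^\sigma_{r\phi} - \partial_r\Gamma^\sigma_{\sigma\phi} + \Gamma^\lambda_{r\phi}\Gamma^\sigma_{\sigma\lambda} - \Gamma^\sigma_{\lambda r}\Gamma^\lambda_{\sigma\phi} = 0 . \tag{6.67} \]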
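\[ \begin{aligned}
R_{\theta\theta} &= \partial_\sigma\Gamma^\sigma_{\theta\theta} - \partial_\theta\Gamma^\sigma_{\sigma\theta} + \Gamma^\lambda_{\theta\theta}\Gamma^\sigma_{\sigma\lambda} - \Gamma^\sigma_{\lambda\theta}\Gamma^\lambda_{\sigma\theta} \\
&= -\partial_r\Bigl(\frac{r}{A}\Bigr) - \partial_\theta\cot\theta - \frac{r}{A}\Bigl(\frac{B'}{2B} + \frac{A'}{2A} + \frac{2}{r}\Bigr) + \frac{2}{A} - \cot^2\theta \\
&= -\frac{1}{A} + \frac{rA'}{A^2} + \frac{1}{\sin^2\theta} - \frac{rB'}{2AB} - \frac{rA'}{2A^2} - \frac{2}{A} + \frac{2}{A} - \cot^2\theta \\
&= -\frac{1}{A} + \frac{r}{2A}\Bigl(\frac{A'}{A} - \frac{B'}{B}\Bigr) + 1 .
\end{aligned} \tag{6.68} \]
\[ R_{\theta\phi} = \partial_\sigma\Gamma^\sigma_{\theta\phi} - \partial_\theta\Gamma^\sigma_{\sigma\phi} + \Gamma^\lambda_{\theta\phi}\Gamma^\sigma_{\sigma\lambda} - \Gamma^\sigma_{\lambda\theta}\Gamma^\lambda_{\sigma\phi} = 0 . \tag{6.69} \]
\[ \begin{aligned}
R_{\phi\phi} &= \partial_\sigma\Gamma^\sigma_{\phi\phi} - \partial_\phi\Gamma^\sigma_{\sigma\phi} + \Gamma^\lambda_{\phi\phi}\Gamma^\sigma_{\sigma\lambda} - \Gamma^\sigma_{\lambda\phi}\Gamma^\lambda_{\sigma\phi} \\
&= -\partial_r\Bigl(\frac{r\sin^2\theta}{A}\Bigr) - \partial_\theta(\sin\theta\cos\theta)
 - \frac{r\sin^2\theta}{A}\Bigl(\frac{B'}{2B} + \frac{A'}{2A} + \frac{2}{r}\Bigr) - \sin\theta\cos\theta\cot\theta
 + \frac{2\sin^2\theta}{A} + 2\cos^2\theta \\
&= -\frac{\sin^2\theta}{A} + \frac{rA'\sin^2\theta}{A^2} - \cos^2\theta + \sin^2\theta
 - \frac{r\sin^2\theta}{A}\Bigl(\frac{B'}{2B} + \frac{A'}{2A} + \frac{2}{r}\Bigr) - \sin\theta\cos\theta\cot\theta
 + \frac{2\sin^2\theta}{A} + 2\cos^2\theta \\
&= -\frac{\sin^2\theta}{A} + \frac{rA'\sin^2\theta}{A^2} + \sin^2\theta - \frac{r\sin^2\theta}{A}\Bigl(\frac{B'}{2B} + \frac{A'}{2A}\Bigr) \\
&= \sin^2\theta\,\Bigl(-\frac{1}{A} + \frac{r}{2A}\Bigl(\frac{A'}{A} - \frac{B'}{B}\Bigr) + 1\Bigr) \\
&= \sin^2\theta\;R_{\theta\theta} .
\end{aligned} \tag{6.70} \]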
6.4 The Solution

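The Schwarzschild solution is the general static and isotropic solution
of Einstein's field equation in empty space. Minkowski space is a special
solution.
In empty space, the stress tensor is zero, $T_{\mu\nu} = 0$. Einstein's field equation
becomes
\[ R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R = 0 . \tag{6.71} \]
Contraction gives us
\[ \begin{aligned}
R - \frac{4}{2}R &= 0 \\
R &= 0 .
\end{aligned} \tag{6.72} \]
Substitute back into Eq. (6.71), we get
\[ R_{\mu\nu} = 0 . \tag{6.73} \]
We need to solve $R_{tt} = R_{rr} = R_{\theta\theta} = 0$. We see that
\[ \begin{aligned}
\frac{R_{rr}}{A} + \frac{R_{tt}}{B} &= \frac{A'}{rA^2} + \frac{B'}{rAB} \\
0 &= \frac{A'}{A} + \frac{B'}{B} \\
\text{constant} &= AB .
\end{aligned} \tag{6.74} \]
Looking back at Eq. (6.11), we demand that $\lim_{r\to\infty} A = \lim_{r\to\infty} B = 1$.
Hence, the constant is one. $R_{tt} = 0$ becomes
\[ \begin{aligned}
\frac{B''}{2A} + \frac{B'}{rA} &= 0 \\
B'' &= -\frac{2B'}{r} \\
\frac{d\log B'}{dr} &= -\frac{2}{r} \\
B' &= \frac{\text{constant}}{r^2} \\
B &= \text{constant} - \frac{\text{constant}}{r} \\
B &= 1 - \frac{C}{r}
\end{aligned} \tag{6.75} \]
where we have taken care of the boundary condition and $C$ is a constant.
Then,
\[ A = \Bigl(1 - \frac{C}{r}\Bigr)^{-1} . \tag{6.76} \]
We still have to check that $R_{rr} = R_{\theta\theta} = 0$. With the second last line of
Eq. (6.74),
\[ \begin{aligned}
R_{rr} &= -\frac{B''}{2B} + \frac{A'}{rA} \\
&= \frac{C}{r^3B} - \frac{B'}{rB} = 0 .
\end{aligned} \tag{6.77} \]
\[ \begin{aligned}
R_{\theta\theta} &= -\frac{1}{A} + \frac{r}{2A}\Bigl(\frac{A'}{A} - \frac{B'}{B}\Bigr) + 1 \\
&= -B - rB\,\frac{B'}{B} + 1 \\
&= -1 + \frac{C}{r} - \frac{C}{r} + 1 \\
&= 0 .
\end{aligned} \tag{6.78} \]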
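The vacuum check above can be repeated numerically. The sketch below simply evaluates the closed forms of Eqs. (6.62), (6.65) and (6.68) for $B = 1 - C/r$, $A = 1/B$, using the analytic derivatives of $B$; the function name and the sample values $r = 5$, $C = 2$ are illustrative assumptions.

```python
def ricci_components(r, C):
    """R_tt, R_rr, R_theta-theta from Eqs. (6.62), (6.65), (6.68) for B = 1 - C/r, A = 1/B."""
    B = 1.0 - C / r
    dB = C / r ** 2            # B'
    ddB = -2.0 * C / r ** 3    # B''
    A = 1.0 / B
    dA = -dB / B ** 2          # A' = (1/B)'
    bracket = dA / A + dB / B  # vanishes because AB = constant
    R_tt = ddB / (2 * A) - dB / (4 * A) * bracket + dB / (r * A)
    R_rr = -ddB / (2 * B) + dB / (4 * B) * bracket + dA / (r * A)
    R_thth = -1.0 / A + r / (2 * A) * (dA / A - dB / B) + 1.0
    return R_tt, R_rr, R_thth

components = ricci_components(r=5.0, C=2.0)
print(components)
```

All three components should vanish up to rounding error for any $r > C$.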
For locations far away from the center, the gravitational field is weak. By
Eqs. (5.20) and (5.21),
\[ g_{00} = -\Bigl(1 - \frac{2GM}{r}\Bigr) . \tag{6.79} \]
It is natural to set the constant $C = 2GM$ where $M$ is the mass of the object
at the center. We have the celebrated Schwarzschild solution
\[ d\tau^2 = \Bigl(1 - \frac{2GM}{r}\Bigr)dt^2 - \Bigl(1 - \frac{2GM}{r}\Bigr)^{-1}dr^2 - r^2\bigl(d\theta^2 + \sin^2\theta\,d\phi^2\bigr) . \tag{6.80} \]
We define the Schwarzschild radius
\[ r_S = 2GM \quad \Bigl(= \frac{2GM}{c^2} \text{ after putting back the factors of } c\Bigr) . \tag{6.81} \]
For the Sun, the Schwarzschild solution is applicable to the region outside the
Sun because inside, the stress tensor is non-zero. The Schwarzschild radius
for a solar mass object is about 3 km, well within the Sun. We don't have to
worry about the coordinate singularity of the solution at $r = r_S$.
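The figure of about 3 km follows directly from Eq. (6.81). Here is a minimal sketch; the rounded constants and the function name `schwarzschild_radius` are assumptions introduced for illustration.

```python
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C_LIGHT = 2.998e8   # speed of light, m/s (rounded)
M_SUN = 1.989e30    # solar mass, kg (rounded)
R_SUN = 6.957e8     # solar radius, m (rounded)

def schwarzschild_radius(mass):
    """Eq. (6.81) with the factors of c restored: r_S = 2 G M / c^2, in metres."""
    return 2.0 * G * mass / C_LIGHT ** 2

r_s = schwarzschild_radius(M_SUN)
print(r_s / 1000.0, "km;", r_s / R_SUN, "of the solar radius")
```

The result is a little under 3 km, a tiny fraction of the solar radius, so the horizon lies deep inside the region where the vacuum solution does not apply.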
Please read Section 8.5 of Weinberg and Section 8.6 or Section 6.3 of
Wald for deflection of light by the Sun and precession of the perihelion of
Mercury respectively. These are the classical tests of general relativity.
6.5 Black Holes

If we take the Schwarzschild solution seriously, we have to ask what happens
at $r = r_S$? First of all, this is just a coordinate singularity, just like the north
and south poles of polar coordinates. If we express the coordinates $(t, r)$ in
terms of $(T, X)$ by
\[ \Bigl(\frac{r}{2GM} - 1\Bigr)e^{r/2GM} = X^2 - T^2 , \tag{6.82} \]
\[ \frac{t}{2GM} = \log\frac{X + T}{X - T} = 2\tanh^{-1}(T/X) , \tag{6.83} \]
we could transform the metric to
\[ ds^2 = \frac{32G^3M^3\,e^{-r/2GM}}{r}\bigl(-dT^2 + dX^2\bigr) + r^2\bigl(d\theta^2 + \sin^2\theta\,d\phi^2\bigr) \tag{6.84} \]
where we treat $r$ in the above equation as a function of $(T, X)$, see Section 6.4
of Wald or Section 8.8 of Weinberg. We see that $r = r_S$ has no problem at
all.
However, the singularity at $r = 0$ is real. It cannot be removed by
choosing another coordinate system, as the curvature scalar $R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}$ blows
up there. Hence, general relativity breaks down near $r = 0$. We think we
need something called quantum gravity to study the region $r \approx 0$. This
object, with vanishing stress tensor except near $r = 0$, is called a black hole
or a Schwarzschild black hole. It is electrically neutral and non-rotating. The
surface where $r = r_S$ is the event horizon of the black hole.
Recall the discussion of time dilation in Section 5.4. For a point with
coordinate $r > r_S$, the ratio of time dilation compared to the time at infinity
is given by Eq. (5.24),
\[ \frac{\Delta\tau_B}{\Delta\tau_A} = \frac{1}{\sqrt{1 - 2GM/r}} . \tag{6.85} \]
This tends to infinity if $r \to r_S$. A photon emitted at $r$ will be redshifted
by this factor when it gets to infinity. This could be understood as the loss
of energy of the photon when it climbs up the gravitational potential from
the black hole. Not just the photon, an observer far away will observe that
everything near the event horizon moves in slow motion.
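A photon could orbit around a black hole; a neutron star is not
massive/small enough for that. To find out the radius of the photon sphere,
assume the photon is moving at the equator, then $r$ is fixed, $\theta = \pi/2$. The
proper time of a photon is zero, $d\tau^2 = 0$,
\[ \begin{aligned}
0 &= \Bigl(1 - \frac{2GM}{r}\Bigr)dt^2 - r^2 d\phi^2 \\
\frac{r}{\sqrt{1 - 2GM/r}}\,\frac{d\phi}{dt} &= 1 .
\end{aligned} \tag{6.86} \]
This just means that the photon is traveling with speed 1. $r$ and $\theta$ are fixed,
hence $dr/d\lambda = d\theta/d\lambda = 0$ where $\lambda$ is a parameter on the geodesic. The
geodesic equation, Eq. (4.30), for $r$ is
\[ \begin{aligned}
0 &= \frac{d^2r}{d\lambda^2} + \Gamma^r_{\mu\nu}\,\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda} \\
0 &= \Gamma^r_{tt}\,\frac{dt}{d\lambda}\frac{dt}{d\lambda} + \Gamma^r_{\phi\phi}\,\frac{d\phi}{d\lambda}\frac{d\phi}{d\lambda} \\
0 &= \frac{B'}{2A}\,\frac{dt}{d\lambda}\frac{dt}{d\lambda} - \frac{r}{A}\,\frac{d\phi}{d\lambda}\frac{d\phi}{d\lambda} \\
\frac{d\phi}{dt} &= \sqrt{\frac{B'}{2r}} \\
&= \sqrt{\frac{GM}{r^3}} .
\end{aligned} \tag{6.87} \]
Put together the two equations, we have
\[ \begin{aligned}
\frac{\sqrt{1 - 2GM/r}}{r} &= \sqrt{\frac{GM}{r^3}} \\
\frac{1 - 2GM/r}{r^2} &= \frac{GM}{r^3} \\
r &= 3GM = \frac{3}{2}\,r_S .
\end{aligned} \tag{6.88} \]
Further analysis shows that the photon sphere is unstable. A slight perturbation
of the photon will make it go into the black hole or to infinity.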
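The radius found in Eq. (6.88) can be recovered numerically by asking where the angular rate from the null condition, Eq. (6.86), equals the rate from the radial geodesic equation, Eq. (6.87). This is a minimal sketch in units with $c = 1$; the function names and the bisection bracket are illustrative assumptions.

```python
import math

def orbit_mismatch(r, GM=1.0):
    """d(phi)/dt from Eq. (6.86) minus d(phi)/dt from Eq. (6.87) at radius r > 2GM."""
    null_rate = math.sqrt(1.0 - 2.0 * GM / r) / r
    geodesic_rate = math.sqrt(GM / r ** 3)
    return null_rate - geodesic_rate

def photon_sphere_radius(GM=1.0, tol=1e-12):
    """Bisection for the root of orbit_mismatch between just outside r_S and 10 GM."""
    lo, hi = 2.000001 * GM, 10.0 * GM
    while hi - lo > tol * GM:
        mid = 0.5 * (lo + hi)
        if orbit_mismatch(mid, GM) < 0.0:
            lo = mid   # mismatch is negative inside the photon sphere
        else:
            hi = mid
    return 0.5 * (lo + hi)

radius = photon_sphere_radius()
print(radius)
```

With $GM = 1$ the root should sit at $r = 3$, that is $\tfrac{3}{2}r_S$, in agreement with Eq. (6.88).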
How much time does it take to fall freely to the event horizon? It depends
on the observer. Since Eq. (6.84) is regular everywhere except at $r = 0$, we
expect the proper time ($d\tau^2 = -ds^2$) of the freely falling particle to
be finite.
For a photon on a radial trajectory, $\theta$ and $\phi$ are fixed, and the
proper time is zero. Eq. (6.80) gives us
\begin{align}
\left(1 - \frac{2GM}{r}\right) dt^2 &= \left(1 - \frac{2GM}{r}\right)^{-1} dr^2 \notag \\
\int_0^t dt &= -\int_{r_0}^{r} \frac{dr}{1 - r_S/r} \notag \\
t &= r_0 - r + r_S \log\frac{r_0 - r_S}{r - r_S} . \tag{6.89}
\end{align}
We see that $t \to \infty$ as $r \to r_S$. It takes infinite coordinate time to
reach the event horizon. However, the coordinate time is just the proper time
of an observer infinitely far away. We conclude that an observer far away will
never see the photon reach the event horizon. A massive particle moves
slower than light, so it takes even longer than the photon to
reach a location near the event horizon.
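The logarithmic divergence in Eq. (6.89) is easy to see numerically. The following sketch assumes units with $c = 1$ and $r_S = 1$, evaluates the closed form, and compares it with a direct midpoint-rule integration of $dr/(1 - r_S/r)$. The function names, the starting radius $r_0 = 5$ and the step count are illustrative assumptions.

```python
import math

def infall_time(r, r0, r_s=1.0):
    """Coordinate time of Eq. (6.89) for a photon falling radially from r0 to r."""
    if not (r_s < r <= r0):
        raise ValueError("need r_S < r <= r0")
    return r0 - r + r_s * math.log((r0 - r_s) / (r - r_s))

def infall_time_numeric(r, r0, r_s=1.0, steps=100_000):
    """Midpoint-rule estimate of the integral of dr / (1 - r_S/r) from r to r0."""
    h = (r0 - r) / steps
    total = 0.0
    for i in range(steps):
        x = r + (i + 0.5) * h
        total += h / (1.0 - r_s / x)
    return total

r0 = 5.0
# Closed form versus direct integration at a radius well outside the horizon.
print(infall_time(1.5, r0), infall_time_numeric(1.5, r0))

# The coordinate time grows like -r_S * log(r - r_S) as r approaches r_S.
for r in (3.0, 1.5, 1.01, 1.0001, 1.000001):
    print(f"r = {r:10.6f}   t = {infall_time(r, r0):8.3f}")
```

Each factor of 100 closer to the horizon adds only about $r_S \log 100 \approx 4.6\, r_S$ to $t$, so the divergence is slow but unbounded.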
It will take infinite time, both proper and coordinate, to discuss
everything we know about black holes.
Index

basis, 3
black hole, 69
boost, 17

Christoffel symbol, 38
component, 3
coordinate, 3
coordinate system, 26
cotangent space, 31
covariant derivative, 36
covector, 31

differential, 31
dimension, 3
direct product, 2
dual space, 5
dust, 58

Einstein tensor, 49
Einstein's equation, 57
Einstein's field equation, 57
equivalence principle, 51
Euclidean space, 8
event horizon, 69

general covariant, 53
geodesic, 41
gravitational constant, 52

hyperplane, 7

inner product, 8
invariant interval, 16
inverse, 1
isotropic, 61

Kronecker delta, 4

Leibnitz rule, 27, 37
length contraction, 19
Levi-Civita connection, 42
linearly independent, 2
Lorentz contraction, 19
Lorentz metric, 34

Mach's principle, 57
metric tensor, 33
Minkowski space, 17
momentum flux, 22
multilinear function, 8

non-degenerate, 9

one-form, 31
outer product, 15

parallel transport, 40
perfect fluid, 58
photon sphere, 69
proper Lorentz transformation, 17
proper time, 54

Ricci tensor, 48
Riemann curvature tensor, 44

scalar curvature, 48
scalar multiplication, 1
Schwarzschild radius, 68
Schwarzschild solution, 66, 68
semi-Riemannian manifold, 34
signature, 16
smooth, 26
smooth function, 27
smooth manifold, 26
smooth one-form, 31
smooth vector field, 30
stress tensor, 22
stress-energy tensor, 22

tangent space, 27
tangent vector, 27
tensor, 13
tensor contraction, 14
tensor field, 32
tensor product, 11
tidal force, 52, 58
time dilation, 56
torsion free, 37

vector addition, 1
vector field, 30
vector space, 1
velocity vector, 30

world line, 20
