
Statistical Mechanics

Eric DHoker
Department of Physics and Astronomy,
University of California, Los Angeles, CA 90095, USA
15 September 2012
Contents

1 Introduction
  1.1 Brief history
  1.2 Units and physical constants

2 Statistical Physics and Thermodynamics
  2.1 Micro-states and macro-states
    2.1.1 Counting micro-states; entropy
  2.2 Statistical and thermal equilibrium
  2.3 The thermodynamic limit, extensive and intensive variables
  2.4 Work and Heat
    2.4.1 The first law of thermodynamics
    2.4.2 The second law of thermodynamics
  2.5 Relations between thermodynamic variables
  2.6 Non-interacting microscopic constituents: the ideal gas
  2.7 The simplest ideal gas
    2.7.1 Counting micro-states in the simplest ideal gas
  2.8 Mixing entropy and the Gibbs paradox
  2.9 Indistinguishability of identical particles
  2.10 Thermodynamic transformations

3 Classical Statistical Ensembles
  3.1 Classical mechanics
  3.2 Time averaging
  3.3 Statistical ensembles
  3.4 The density function
  3.5 The Liouville theorem
  3.6 Equilibrium ensembles and distribution functions
  3.7 The uniform ensemble
  3.8 The micro-canonical ensemble
  3.9 The canonical ensemble
  3.10 Deriving the canonical from the micro-canonical ensemble
  3.11 The Gibbs and the grand-canonical ensembles

4 Applications of Classical Statistical Ensembles
  4.1 Application I: The Maxwell distribution
  4.2 Application II: Magnetism in classical statistical mechanics
  4.3 Application III: Diatomic ideal gases (classical)
  4.4 Quantum vibrational modes
  4.5 Quantum rotational modes

5 Quantum Statistical Ensembles
  5.1 Quantum Mechanics
  5.2 Mixed quantum states
  5.3 Subsystems of a quantum system in a pure state
  5.4 The density matrix
  5.5 Ensemble averages and time-evolution
  5.6 The density matrix for a subsystem of a pure state
  5.7 Statistical entropy of a density matrix
  5.8 The uniform and micro-canonical ensembles
  5.9 Construction of the density matrix in the canonical ensemble
  5.10 Generalized equilibrium ensembles

6 Applications of the canonical ensemble
  6.1 The statistics of paramagnetism
  6.2 Non-relativistic Boltzmann ideal gas
  6.3 Van der Waals equation of state
  6.4 The Mayer cluster expansion

7 Systems of indistinguishable quantum particles
  7.1 FD and BE quantum permutation symmetries
  7.2 BE and FD statistics for N identical free particles
  7.3 Boltzmann statistics rederived
  7.4 Fermi-Dirac statistics
  7.5 Bose-Einstein statistics
  7.6 Comparing the behavior of the occupation numbers

8 Ideal Fermi-Dirac Gases
  8.1 The simplest ideal Fermi-Dirac gas
  8.2 Entropy, specific heat, and equation of state
  8.3 Corrections to the Boltzmann gas
  8.4 Zero temperature behavior
  8.5 Low temperature behavior: The Sommerfeld expansion
  8.6 Pauli paramagnetism of ideal gases
  8.7 Landau diamagnetism
  8.8 White dwarfs

9 Bose-Einstein statistics
  9.1 Black body radiation
  9.2 Cosmic micro-wave background radiation
  9.3 Thermodynamic variables for Bose-Einstein ideal gases
  9.4 Bose-Einstein condensation
  9.5 Behavior of the specific heat

10 Phase coexistence: thermodynamics
  10.1 Conditions for phase equilibrium
  10.2 Latent heat
  10.3 Clausius-Clapeyron equation
  10.4 Example of the Van der Waals gas-liquid transition
  10.5 The Maxwell construction

11 Phase transitions: statistical mechanics
  11.1 Classification of phase transitions
  11.2 The Ising Model
  11.3 Exact solution of the 1-dimensional Ising Model
  11.4 Ordered versus disordered phases
  11.5 Mean-field theory solution of the Ising model

12 Functional integral methods
  12.1 Path integral representation for the partition function
  12.2 The classical = high temperature limit
  12.3 Integrating out momenta
Bibliography

Course textbook

Statistical Mechanics,
R.K. Pathria, Elsevier Publishing (2011).

Classics

Thermodynamics and Statistical Mechanics,
A. Sommerfeld, Academic Press (1956);
Thermodynamics and Statistics,
E. Fermi, Phoenix Science Series (1966);
Statistical Mechanics, An advanced course with problems and solutions,
R. Kubo, North Holland (1971);
Statistical Physics, Part 1,
L.D. Landau and E.M. Lifschitz, Pergamon Press, Third edition (1982);
Statistical Mechanics, A set of Lectures,
R.P. Feynman, Benjamin Cummings Publishing (1982);
Statistical Thermodynamics,
E. Schrödinger, Dover Publications (1989).

Standard textbooks

Fundamentals of statistical and thermal physics,
F. Reif, McGraw Hill;
From Microphysics to Macrophysics,
Methods and Applications of Statistical Mechanics, Vol 1,
R. Balian, Springer (2007);
Introduction to Statistical Physics,
K. Huang, CRC Press, Second Edition (2010);
Statistical Physics of Particles,
M. Kardar, Cambridge University Press (2010);
Statistical Physics of Fields,
M. Kardar, Cambridge University Press (2007).
1 Introduction
The goal of statistical mechanics is to explain the physical properties of macroscopic systems in terms of the dynamics of their microscopic constituents.

A macroscopic system is one that contains a large number of microscopic constituents; for example, 12 grams of pure Carbon (^12 C) contains 1 mole, or 6.022141 × 10^23, Carbon atoms, and constitutes a macroscopic system. As such, almost all of the matter that we encounter in everyday life is macroscopic.
Microscopic constituents may be atoms, such as Carbon atoms in the above example.
They may, however, also be larger than atoms and consist of molecules, each one of which is composed of several or many atoms. It is also possible to consider the microscopic constituents to be smaller than atoms, and work with nuclei and electrons, as would be appropriate for ionized gases, or even smaller constituents, such as quarks and gluons in a quark plasma environment. The choice of which particles to consider as microscopic constituents is guided by the length-scales and time-scales on which the system is considered.
The dynamical laws which individual microscopic constituents obey are generally known.
If the microscopic constituents are taken to be electrons and nuclei, and the effects of the weak, strong, and gravitational forces can be neglected, then the dynamics of electro-magnetism remains as the only relevant interaction between the microscopic constituents. Almost all matter encountered naturally on Earth should result, in one way or another, from the dynamics of the electro-magnetic interactions between these constituents.
The physical properties of the corresponding macroscopic systems show, however, tremendous qualitative and quantitative differences. The qualitative distinctions are usually referred to as phases of matter. In gases, the microscopic constituents interact only weakly. In both solids and liquids the interactions are strong and produce various degrees of spatial ordering. In crystals and quasi-crystals, only discrete translations and rotations survive as symmetries. Customary liquids are homogeneous and isotropic phases (with unbroken translation and rotation symmetries). Liquid crystals share certain properties with liquids, others with solids. For example, in a nematic phase molecules are aligned along a spatial direction in a translationally invariant way, while in a smectic phase there is ordering along layers but molecules need not be pointing along the same direction. Electric and magnetic properties provide further and finer distinctions between possible phases, giving rise to conductors, semi-conductors, super-conductors, insulators, paramagnets, ferromagnets, and so on.
The goal of statistical mechanics is to explain quantitatively the physical properties of these phases in terms of the dynamics of the underlying microscopic constituents. The time-scales and length-scales over which we seek this information are typically macroscopic, on the order of seconds and centimeters. The degrees of freedom corresponding to time-scales and length-scales on the order of atomic processes will be treated effectively, or statistically.
1.1 Brief history
The development of statistical mechanics was driven by attempts to understand thermodynamics from a dynamical microscopic point of view. Thermodynamics is a phenomenological (or purely macroscopic) theory used to describe the transfer of heat, work, and chemical constituents in and out of a macroscopic system. The development of thermodynamics goes back to the early 19-th century, at a time when the concept of energy, and its conservation, were still ill-understood and controversial. Sadi Carnot (1796-1832) uncovered some of the physical principles that underlie the functioning of steam-engines, and was led to distinguish between reversible and irreversible processes. James Joule (1818-1889) extended the concept of energy by including heat. In doing so, he clarified the conservation of energy, and laid the foundation for the first law of thermodynamics. Rudolf Clausius (1822-1888) clarified the role of energy conservation in the Carnot-cycle, introduced the concept of entropy, and first stated the second law of thermodynamics. The third law of thermodynamics was proposed in 1905 by Walther Nernst (1864-1941).
The development of statistical mechanics also grew out of the kinetic theory of gases. Daniel Bernoulli (1700-1782) accounted for the pressure exerted by a gas on the walls of its container in terms of collisions of molecules against the wall. The concepts of mean free path and collision rate were introduced by Clausius, and incorporated by James Clerk Maxwell (1831-1879) into a precise theory for the statistical distribution of velocities obeyed by the molecules of a gas in equilibrium, the so-called Maxwell distribution. Ludwig Boltzmann (1844-1906) generalized the Maxwell distribution to include external forces, thereby replacing the purely kinetic energy of the Maxwell distribution by the total energy in the famous Boltzmann factor e^{-E/kT}. The Boltzmann transport equation describes the kinetic dynamics of gases out of equilibrium; it may be used to study how equilibrium is reached in time. The range of applicability of these kinetic theories is limited by the fact that their microscopic constituents are invariably assumed to be weakly interacting (or non-interacting).

The kinetic theories of gases provide a statistical molecular interpretation of thermodynamic quantities and phenomena. In particular, Boltzmann's famous relation between the entropy S of a macro-state and the number Ω of micro-states accessible to this macro-state,

S = k ln Ω

provides one of the fundamental links between a thermodynamic quantity S, and a statistical mechanical quantity Ω. Understanding exactly what is being counted by Ω requires introducing the concept of an ensemble. The kinetic theory led Maxwell and Boltzmann to introduce ensembles for gases of non-interacting molecules.
Josiah Willard Gibbs (1839-1903) held the first professorship in theoretical physics in the United States, at Yale University. Although his work on statistical mechanics (a term he coined) and on vector calculus (which he invented) was carried out in relative isolation, its influence on 20-th century physics has been profound. Amongst many other contributions, Gibbs proposed to think of ensembles in an abstract but generally applicable formalism, which allows for the counting of Ω in a systematic statistical framework. Twenty-five years before the discovery of quantum mechanics, Gibbs showed that counting molecules as distinguishable constituents leads to the Gibbs paradox. The paradox evaporates as soon as constituents are treated quantum mechanically, and counted as indistinguishable. The Gibbs approach to ensembles will be explained in detail and used extensively during this course.
The discovery of quantum mechanics through the analysis of the spectrum of black body radiation by Max Planck (1858-1947) revolutionized physics in general and statistical mechanics in particular. Remarkably, the statistical counting formalism of Gibbs lent itself perfectly to incorporating the changes brought about by the quantum theory. These fundamental changes are as follows,
1. Quantum states are discrete, countable entities;
2. Identical quantum constituents (they could be electrons, protons, atoms, or molecules)
are indistinguishable from one another and must be counted as such;
3. Integer spin particles obey Bose-Einstein statistics;
4. Half-integer spin particles obey Fermi-Dirac statistics.
Satyendra Bose (1894-1974) pioneered the quantum counting for photons, while Albert Einstein (1879-1955) extended the method to (integer spin) atoms and molecules. Enrico Fermi (1901-1954) and Paul Dirac (1902-1984) invented the modification needed to treat electrons, protons, and other spin 1/2 particles. A systematic revision of the principles of statistical mechanics, required by quantum mechanics, was pioneered by Lev Landau (1908-1968) and John von Neumann (1903-1957), who introduced the tool of the density matrix. Landau developed a systematic theory of second order phase transitions, and obtained general predictions for the critical behavior near the phase transition point. Landau also invented a general theory of quantum liquids well-suited to describing liquids of bosonic or fermionic excitations, and applied it successfully to explaining the superfluidity of liquid ^4 He.
In the second half of the 20-th century, many developments of statistical mechanics went hand in hand with progress in quantum field theory. The functional integral of Dirac and Feynman provided a general starting point for the statistical mechanics of fields. Statistical mechanical methods, such as block spin methods, were applied to the problem of renormalization in quantum field theory, while the renormalization group methods of field theory provided systematic approximation methods in statistical mechanics. Kenneth Wilson (1936-present) fully developed the interplay between quantum field theory and statistical mechanics to derive a physical understanding of renormalization, and to usher in a detailed and calculable approach to the theory of critical phenomena and second order phase transitions.
In the 1970s, Jacob Bekenstein (1947-present) put forward a surprising and bold hypothesis, namely that black holes, when considered quantum mechanically, must have a definite entropy. Further evidence for this hypothesis was supplied by Stephen Hawking (1942-present), who showed that the objects we refer to classically as black holes actually emit radiation due to quantum effects. The spectrum of Hawking radiation is that of a black body, which implies that a black hole must have a temperature in addition to entropy, and thus behaves as a thermodynamic object. Intuitively, the assignment of an entropy makes physical sense when one considers that micro-states living inside the black hole horizon are shielded away from us and are thus accessible to us only in the form of a macro-state. From a statistical mechanics point of view, the question thus arises as to what these micro-states inside the event horizon are. For very special types of black holes, the answer is provided by string theory, and the micro-states are found to be certain branes, namely spatially extended objects of various dimensions that generalize membranes and strings. But the nature of the micro-states for a regular black hole, such as the Schwarzschild black hole, is not understood to date.
1.2 Units and physical constants

The following values of physical constants may be useful.

Planck's constant      ℏ = 1.0546 × 10^{-27} erg sec
speed of light         c = 2.9979 × 10^{10} cm sec^{-1}
Newton's constant      G_N = 6.6720 × 10^{-8} cm^3 g^{-1} sec^{-2}
Boltzmann's constant   k = 1.3807 × 10^{-16} erg K^{-1}
                       k = 8.6175 × 10^{-5} eV K^{-1}
electron charge        e = 1.6022 × 10^{-19} Coulomb (in MKS units)
electron mass          m_e = 9.11 × 10^{-28} g = 0.511 MeV/c^2
Bohr radius            r_B = ℏ/(m_e c α) = 0.53 Å
Avogadro's number      N_A = 6.022141 × 10^{23}
ideal gas constant     R = 8.32 J K^{-1} mol^{-1}

The following conversion relations between physical units may be useful.

1 Å = 10^{-8} cm
1 erg = 1 g cm^2 sec^{-2}
1 erg = 6.241 × 10^{11} eV
1 erg = 10^{-7} J
1 eV = 1.6022 × 10^{-19} J
1 eV = 1.6022 × 10^{-12} erg
1 eV = 1.1605 × 10^4 K
1 cal = 4.184 J
1 Btu = 1055 J
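These conversion factors lend themselves to a quick mechanical check. The following sketch (the constant and function names are my own, not from the notes) verifies that the eV-to-Kelvin and eV-to-Joule entries follow from the quoted values:

```python
# Energy-unit conversions built from the table above (values as quoted there).
ERG_PER_EV = 1.6022e-12     # 1 eV in erg
J_PER_ERG = 1.0e-7          # 1 erg in J
K_BOLTZMANN_EV = 8.6175e-5  # Boltzmann's constant k in eV/K

def ev_to_erg(energy_ev):
    """Convert an energy from eV to erg."""
    return energy_ev * ERG_PER_EV

def ev_to_joule(energy_ev):
    """Convert an energy from eV to J, going through erg."""
    return ev_to_erg(energy_ev) * J_PER_ERG

def ev_to_kelvin(energy_ev):
    """Temperature T such that kT equals the given energy."""
    return energy_ev / K_BOLTZMANN_EV

# Internal consistency of the table:
# 1 eV should come out as ~1.6022e-19 J and ~1.1605e4 K.
print(ev_to_joule(1.0))
print(ev_to_kelvin(1.0))
```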
2 Statistical Physics and Thermodynamics
In this section, we shall introduce some of the basic methods of statistical mechanics and their implications for thermodynamics in a relatively informal way. In the subsequent sections, more complete definitions and applications of statistical ensembles will be introduced and applied. As much as possible, we shall discuss the cases of classical mechanics and quantum physics in parallel.
2.1 Micro-states and macro-states
Consider a system of an extremely large number N of microscopic constituents (often referred to as particles), for example on the order of Avogadro's number, 10^{23}. A micro-state corresponds to a precise and complete specification of each and every one of these particles. From the point of view of the classical mechanics of point-like particles, a micro-state is described by the generalized positions q_i(t) and their velocities q̇_i(t) for i = 1, ..., 3N in the Lagrangian formulation, or by the generalized momenta p_i(t) and positions q_i(t) in Hamiltonian mechanics. From the point of view of quantum mechanics, a micro-state corresponds to a vector |ψ⟩ in the Hilbert space H of the quantum mechanical system. In a coordinate basis, the state |ψ⟩ is represented by a wave function ψ(q_1, ..., q_3N), though additional quantum numbers, such as spin, may have to be included as well. In both the classical and the quantum mechanical cases it would be utterly impossible to specify a micro-state completely, as doing so would require supplying on the order of N ~ 10^{23} numerical entries.
A macro-state, either from the point of view of classical or quantum mechanics, is specified by a relatively small number of thermodynamic variables. Exactly which variables are specified will depend on the nature of the physical system. The geometrical volume V in which the system is constrained to evolve is often used to specify a macro-state. The total energy E of a system plays a special role in view of the general principle of energy conservation, and is almost always used to specify macro-states. In fact, any quantity that is conserved on general principles may be used to specify macro-states. These include total momentum, total angular momentum, total electric charge, total magnetization, and total particle number (in situations where the dynamics of the system conserves particle number). Their conservation guarantees their exact matching between the micro-states and the macro-states.
2.1.1 Counting micro-states; entropy
Many different micro-states will correspond to one single macro-state. Given the thermodynamic variables specifying a macro-state, we denote by Ω the number of micro-states in which the given macro-state can be realized. If a macro-state is specified by its internal energy E, then the number of micro-states in terms of which this macro-state can be realized will be a function of E, denoted Ω(E). If the total volume V is used in addition, Ω becomes a function of two variables, Ω(E, V), and so on. The more thermodynamic variables we use, the more precise our specification of the macro-state will be. In practice, macro-states will be described by just a few thermodynamic variables. Thus, specifying a macro-state will always amount to omitting a large amount of information about the system.

The key logical connection between statistical mechanics and thermodynamics is made by Boltzmann's formula for the entropy S,

S = k ln Ω     (2.1)

in terms of the number Ω of micro-states accessible to a given macro-state. For macro-states specified by energy and volume, for example, both sides will be functions of E, V. The dependence of Ω(E, V), and thus of S(E, V), on E and V is governed by the specific dynamical laws of the system. In classical mechanics as well as in quantum physics these laws are encapsulated in the microscopic degrees of freedom, and the precise form of the Hamiltonian in terms of these degrees of freedom. The exact computation of Ω and S is possible only for non-interacting systems. Yet, non-interacting systems provide approximations to a wealth of real physical problems, and we will carry out the calculations of Ω and S there shortly.
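To make formula (2.1) concrete, consider a toy system not treated in the notes: N two-state spins, of which n point up. The macro-state "n up" is realized by Ω(n) = N!/(n!(N-n)!) micro-states. A minimal sketch (the spin model and all names are my own illustration):

```python
from math import lgamma, log

K_B = 1.3807e-16  # Boltzmann's constant in erg/K (Section 1.2)

def ln_omega(N, n):
    """ln Omega for N two-state spins with n up:
    Omega = N! / (n! (N - n)!), computed via log-Gamma so large N is safe."""
    return lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)

def entropy(N, n):
    """Boltzmann entropy S = k ln Omega, in erg/K."""
    return K_B * ln_omega(N, n)

N = 1000
# A unique micro-state (all spins down, Omega = 1) carries zero entropy ...
print(entropy(N, 0))  # 0.0
# ... while the most disordered macro-state n = N/2 maximizes S, approaching
# the Stirling estimate ln Omega ~ N ln 2 for large N.
print(ln_omega(N, N // 2) / (N * log(2)))
```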
2.2 Statistical and thermal equilibrium
From common experience and detailed experiments we know that an isolated system, namely one which has no contacts or exchanges with the outside, will settle into a preferred state of thermal equilibrium after a sufficiently long time has passed. More precisely, statistical equilibrium may be defined as the thermodynamic state in which the given macro-state may be realized in terms of the largest possible number of micro-states. In other words, statistical equilibrium is achieved for maximal disorder. As a result, the state of statistical equilibrium may be quantitatively characterized as the macro-state with maximal entropy.
To see how this works, consider two macroscopic systems S_1 and S_2, parametrized by the thermodynamic variables E_1, V_1 and E_2, V_2. Without spelling out the precise microscopic dynamics at work here, we shall simply assume that the numbers of micro-states accessible to each one of the above macro-states are given by Ω_1(E_1, V_1) and Ω_2(E_2, V_2) respectively. We now bring these two macro-systems in contact with one another, as shown in Figure 1 below. In a first stage, we keep the volumes V_1, V_2 fixed, but allow for energy exchange between S_1 and S_2, for example by placing a heat-conducting screen in between them. The total system will be denoted by S, and is assumed to be isolated. By energy conservation, the total energy E of the system is given by

E = E_1 + E_2     (2.2)
Figure 1: Two macroscopic systems S_1 and S_2 in contact.
and is fixed since S is assumed to be isolated. In general, the number of accessible micro-states Ω(E) of S at energy E (recall the volumes are fixed) depends on the interactions between S_1 and S_2 and is not readily calculable. We shall consider here an idealized situation where the interactions between S_1 and S_2 are neglected (which can be justified in part since they occur through a surface and have small effects in the bulk). As a result, the system S at energy E may be in any of the macro-states (E_1, E_2) with E_1 + E_2 = E, with the following corresponding number of micro-states,

Ω_0(E_1, E_2) = Ω_1(E_1) Ω_2(E_2)     (2.3)
We see that the partition of S into the subsystems S_1 and S_2, which are in mutual thermal contact, provides us with one macroscopic parameter which is not yet fixed, namely the energy E_1 of the subsystem S_1. The state of equilibrium will be achieved when Ω_0 is maximal for given E. In the process of evolving towards equilibrium, the free parameter E_1 will adjust to produce the maximum possible value for Ω_0(E_1, E_2), so that we have,

∂ ln Ω_0(E_1, E_2)/∂E_1 = ∂ ln Ω_1(E_1)/∂E_1 − ∂ ln Ω_2(E_2)/∂E_2 = 0     (2.4)

Partial derivatives are used to make it clear that all other thermodynamic variables, such as V and N, are being kept constant. Recognizing k ln Ω as the entropy by Boltzmann's equation (2.1), and the derivative of S with respect to E as the inverse of the absolute temperature T,

1/T = ∂S(E)/∂E     (2.5)

the equilibrium condition (2.4) may be re-expressed in terms of the temperatures T of the two systems,

T_1 = T_2     (2.6)
namely, at equilibrium, the temperatures of the two subsystems coincide.
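The maximization behind condition (2.4) can be illustrated numerically. Suppose, purely as a toy assumption of mine (not a form derived in the notes), that each subsystem has Ω_i(E_i) ∝ E_i^{a_i}, as ideal gases will turn out to with a_i of order 3N_i/2. Then (2.4) reads a_1/E_1 = a_2/E_2, and a brute-force maximum of Ω_0 lands exactly on that split:

```python
from math import log

# Toy densities of states: Omega_i(E_i) = E_i ** a_i (my assumption).
a1, a2 = 300.0, 700.0
E_total = 1000.0  # fixed total energy of the isolated system S

def ln_omega0(E1):
    """ln Omega_0(E1, E2) = ln Omega_1(E1) + ln Omega_2(E_total - E1)."""
    return a1 * log(E1) + a2 * log(E_total - E1)

# Brute-force scan over partitions of the total energy.
E1_star = max((E_total * i / 10000 for i in range(1, 10000)), key=ln_omega0)

# Equilibrium condition a1/E1 = a2/E2 predicts E1 = E_total * a1/(a1+a2) = 300.
print(E1_star)
```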
Extending the above derivation, assume that the systems S_1 and S_2 are not only in thermal contact, but that also the wall separating them is movable, so that the volumes V_1 and V_2 may vary, all the while keeping the total volume V of the system fixed,

V = V_1 + V_2     (2.7)

In addition, let us assume that one or several further conserved quantities N, such as momentum, angular momentum, or the conserved numbers of various species of particles, may also be exchanged between the two systems, all the while keeping their total value fixed,

N = N_1 + N_2     (2.8)
By the same arguments as were used for energy exchange, equilibrium of the system S will be characterized by the following equalities,

∂ ln Ω_1(E_1, V_1, N_1)/∂V_1 − ∂ ln Ω_2(E_2, V_2, N_2)/∂V_2 = 0
∂ ln Ω_1(E_1, V_1, N_1)/∂N_1 − ∂ ln Ω_2(E_2, V_2, N_2)/∂N_2 = 0     (2.9)
In terms of the entropy S, temperature T, pressure P, and chemical potential μ, defined by,

(∂S(E, V, N)/∂E)_{V,N} = 1/T
(∂S(E, V, N)/∂V)_{E,N} = P/T
(∂S(E, V, N)/∂N)_{E,V} = −μ/T     (2.10)

the equilibrium conditions become,

T_1 = T_2     P_1 = P_2     μ_1 = μ_2     (2.11)
which constitute, of course, the well-known equations of thermodynamical equilibrium. Thus equilibrium guarantees that T, P, μ are constant throughout the system.

These relations may be further generalized to the case of more than two subsystems in thermal contact. If S may be subdivided into three subsystems S_1, S_2, S_3, which are all in mutual thermal contact, then the equilibrium conditions between S_1, S_2 on the one hand, and S_2, S_3 on the other hand, will imply that the subsystems S_1 and S_3 are also in equilibrium with one another. This fact is referred to as the 0-th law of thermodynamics in Kardar.
2.3 The thermodynamic limit, extensive and intensive variables
Putting together multiple identical copies of a given macro-system allows us to scale the energy E, volume V, and particle number N, as well as any other conserved quantities, by a common scale factor λ,

E → λE     V → λV     N → λN     (2.12)

This scaling will be reliable in the limit where boundary and surface effects can be neglected, an approximation which is expected to hold in the limit where λ → ∞. This limit is referred to as the thermodynamic limit.

Thermodynamic variables which scale linearly with λ are referred to as extensive, while those which are left unchanged under λ are referred to as intensive. For example, the energy density E/V and the number density N/V are intensive. Given that thermodynamic equilibrium sets T, P, and μ equal across the bulk of a substance, these quantities should be intensive. As a result of the defining relations in (2.10), the entropy is expected to be an extensive thermodynamic variable.
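Extensivity of the entropy can be checked on an ideal-gas-like form of S. The explicit functional form below is assumed for illustration only (it is not derived until the ideal gas is treated): what matters is that S depends on E, V, N only through the intensive ratios E/N and V/N, times one overall factor of N.

```python
from math import log

def entropy_over_k(E, V, N, const=2.5):
    """S/k = N [ ln(V/N) + (3/2) ln(E/N) + const ] -- an assumed
    ideal-gas-like form, used only to exhibit the scaling S -> lambda S."""
    return N * (log(V / N) + 1.5 * log(E / N) + const)

S = entropy_over_k(E=10.0, V=2.0, N=5.0)
lam = 7.0
S_scaled = entropy_over_k(E=lam * 10.0, V=lam * 2.0, N=lam * 5.0)

# Under (E, V, N) -> (lam E, lam V, lam N) the intensive ratios are unchanged,
# so S scales by exactly one factor of lam: S is extensive.
print(S_scaled / S)
```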
2.4 Work and Heat
The partial derivative relations of (2.10) may equivalently be expressed in differential form,

dE = T dS − P dV + μ dN     (2.13)

and may be viewed as a fundamental relation of thermodynamics. Carefully note what this formula means. It gives the relation between the changes in thermodynamic variables for neighboring states of thermodynamic equilibrium of two subsystems in contact with one another, the whole system S being isolated, and not undergoing any change in total entropy. Such changes are referred to as reversible, since no total entropy is produced.

In classical and quantum mechanics, it is often convenient to separate the contributions to the energy into kinetic and potential, even though only the sum of the two is conserved. The potential energy may be viewed as summarizing the change in the Hamiltonian due to changes in the external conditions, such as the volume, or an electric potential. In thermodynamics, the changes dE in the internal energy which result from changes in the external conditions, such as a change in the volume V or in the conserved quantity N, are referred to as the work δW done by the system. In the case at hand, the contributions to the work are given by,

δW = P dV − μ dN     (2.14)

Since the parameters V and N are not statistical in nature, but are supplied as external conditions, this relation holds for reversible as well as irreversible processes, as long as the changes produced (in time) are such that the system remains in equilibrium throughout.
All changes in the internal energy which are not work will be lumped together as heat, and denoted by δQ. Thus, heat transfer may be thought of as associated with changes in the kinetic energy of the system.
2.4.1 The rst law of thermodynamics
The rst law of thermodynamics reects the principle of conservation of energy,
dE = QW (2.15)
The minus sign in front of the work is a matter of convention, and results from the fact that we
have defined the work done *by* the system (as opposed to the work done *on* the system).
The notations δQ and δW are used here to stress the following important distinction with
dE, dS, dV, etc. The thermodynamic variables V, E, N, S are all thermodynamic state
functions, and depend only on the (equilibrium) macro-state under consideration. Thus, the
corresponding differentials dV, dE, dN, dS are exact differentials, and their integrals
depend only on the initial and final macro-states of the system. Heat and work are not, in
general, thermodynamic state functions, and integrals of the differentials δQ, δW do depend
on the path followed to go from one macro-state of the system to another. The differentials
δQ and δW are not closed (and thus not exact); the notation δ is used to indicate this (Clausius
introduced this distinction, and used the notation đ rather than δ).
When the thermodynamic changes are reversible, we may combine formulas (2.14) and
(2.15), which are generally valid, with formula (2.13), which is valid only for reversible
processes, to obtain a relation between heat and entropy,

    δQ = T dS        (2.16)

In view of its very derivation, this relation will hold only for reversible processes. For
irreversible processes, the relation does not hold. Instead, we need to appeal to the second
law of thermodynamics to clarify the situation.
2.4.2 The second law of thermodynamics
The second law of thermodynamics may be formulated in a number of equivalent ways.
1. Clausius: No thermodynamic process is possible whose sole result is the transfer of
heat from a cooler to a hotter body.
2. Kelvin: No thermodynamic process is possible whose sole result is the complete con-
version of heat into work.
These statements may be translated into an inequality with the help of a theorem by Clausius
(proven, for example, in Kardar). In a general thermodynamic process (reversible or
irreversible), the entropy change dS for given heat transfer δQ and temperature T will be larger
than δQ/T, so that for any cyclic thermodynamic process we have,

    ∮ δQ/T ≤ 0        (2.17)

Here, δQ is the heat increment supplied to the system at temperature T. For any reversible
process we have equality, while for any irreversible process we have strict inequality.
2.5 Relations between thermodynamic variables
Next, we derive a number of immediate but fundamental thermodynamic relations from
(2.13), an equation we repeat here for convenience,

    dE = T dS − P dV + μ dN        (2.18)

(1) Applying the Euler equation for homogeneous functions of degree 1 to the entropy in the
thermodynamic limit, and using the defining equations of (2.10), we derive the relation,

    E = TS − PV + μN        (2.19)

Taking the total differential of this relation, and then eliminating dE with the help of (2.18),
gives the Gibbs-Duhem relation,

    S dT − V dP + N dμ = 0        (2.20)
(2) Next, we provide alternative formulas for T, P, and μ as derivatives of the internal energy
E rather than of the entropy S. The derivations are standard, and extend to many other
thermodynamic relations. It is immediate from (2.18) that we alternatively have,

    T = (∂E/∂S)_{V,N}        P = −(∂E/∂V)_{S,N}        μ = (∂E/∂N)_{S,V}        (2.21)

Homogeneity of E now leads to the same equation as we found in (2.19).
(3) To change independent variables from S, V, N to T, V, N we use the free energy F,

    F = E − TS        (2.22)

in terms of which equation (2.18) takes the form,

    dF = −S dT − P dV + μ dN        (2.23)
It follows that S, P, and μ may be represented as derivatives of F with respect to the variables
T, V, and N respectively,

    S = −(∂F/∂T)_{V,N}        P = −(∂F/∂V)_{T,N}        μ = (∂F/∂N)_{T,V}        (2.24)
(4) The specific heat functions C_V and C_P are defined as follows,

    C_V = T (∂S/∂T)_{V,N}        C_P = T (∂S/∂T)_{P,N}        (2.25)
Again, alternative formulas are available in terms of derivatives of the internal energy, or
of the free energy. We quote the alternative formulas for C_V and C_P,

    C_V = (∂E/∂T)_{V,N}        C_P = (∂E/∂T)_{P,N} + P (∂V/∂T)_{P,N}        (2.26)
2.6 Non-interacting microscopic constituents: the ideal gas
In a macroscopic system of non-interacting microscopic constituents, the number N of par-
ticles is conserved, as is the total energy E of the system. One refers to such a system as an
ideal gas since, generally, gasses are characterized by weakly interacting particles.
Without any detailed model or calculations, and using scaling arguments alone, one can
deduce the dependence of the number of states Ω(E, V, N) on V (always in the thermodynamic
limit where V → ∞). Assuming that the size of the particles is negligible compared
to V, we see that introducing a single particle in a volume V produces a number of states
proportional to V (doubling V will double the number of states). Now, since the particles
are non-interacting and of negligible size, introducing a second particle will just multiply the
number of 1-particle states by V, etc. Thus, we have,

    Ω(E, V, N) ∼ V^N        S(E, V, N) ∼ kN ln V        (2.27)
Computing the pressure, using the second relation in (2.10), we find P = NkT/V. It is
customary to express this relation in terms of the gas constant R = N_A k, defined in terms
of k and Avogadro's number N_A, so that it takes the form of the ideal gas law,

    PV = nRT = NkT        (2.28)

where n = N/N_A is the number of moles of the gas.
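As a quick numerical illustration of (2.28) (the mole number, temperature, and volume below are illustrative choices, not taken from the notes):

```python
# Ideal gas law PV = nRT = NkT (eq. 2.28): one mole of gas at room
# temperature in a 25-liter box, illustrative values only.
k = 1.380649e-23        # Boltzmann constant, J/K
N_A = 6.02214076e23     # Avogadro's number
R = N_A * k             # gas constant, J/(mol K)

n = 1.0                 # moles
N = n * N_A             # number of particles
T = 300.0               # kelvin
V = 0.025               # cubic meters

P = N * k * T / V       # pressure from PV = NkT
assert abs(P - n * R * T / V) < 1e-6 * P   # PV = nRT gives the same answer
print(P)
```

The pressure comes out close to atmospheric (about 1.0 × 10⁵ Pa), as expected for one mole in roughly 25 liters at room temperature.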
2.7 The simplest ideal gas
Not all ideal gasses are the same. This is due to the fact that the number of degrees of
freedom of each microscopic constituent by itself may vary. Each microscopic constituent in
any ideal gas will have three translational degrees of freedom, namely the three components
of momentum. Clearly, the simplest ideal gas corresponds to the case where these are the
only degrees of freedom of each microscopic constituent. More generally, the atoms and
molecules in a general ideal gas will have also rotational and internal degrees of freedom,
with corresponding quantum numbers. In this section, we shall treat only the simplest ideal
gas, and postpone treatment of more complicated cases.
To also extract the dependence on energy and particle number, we need a more detailed
model, specifying in particular whether the gas is relativistic or not, and quantum mechanical
or classical. We shall work with the non-relativistic quantum version, for non-interacting
particles confined to a square box of size L and volume V = L³, with Dirichlet boundary
conditions (or equivalently an infinite positive potential outside the box). The micro-states
of the system are specified by the quantum state of each particle i = 1, ..., N, namely by three
non-negative integers n^i = (n^i_x, n^i_y, n^i_z). The energy of a particle in this state is given by,

    ε(n^i_x, n^i_y, n^i_z) = (π²ℏ²/2mL²) [(n^i_x)² + (n^i_y)² + (n^i_z)²]        (2.29)
The energy of the system in a micro-state specified by the n^i is given by,

    E = Σ_{i=1}^{N} ε(n^i_x, n^i_y, n^i_z)        (2.30)
Thus, specifying a macro-state by E, V, and N precisely determines Ω(E, V, N) as the
number of solutions in non-negative integers n_r, for r = 1, ..., 3N, of the equation,

    Σ_{r=1}^{3N} n_r² = 2mEL²/(π²ℏ²)        (2.31)

We see that the counting of states depends on N, and on the combination EL² = EV^{2/3}, but
not on E and V separately. As a result, the functional dependence of S can be simplified,
since it may be expressed in terms of a function s of only 2 variables,

    S(E, V, N) = s(EV^{2/3}, N)        (2.32)
Therefore, the derivatives of S with respect to E and V are related to one another,

    E (∂S/∂E)|_{V,N} − (3/2) V (∂S/∂V)|_{E,N} = 0        (2.33)
so that one derives the following formulas,

    P = (2/3) E/V        E = (3/2) NkT        (2.34)
The second relation reproduces the classic result that the average thermal energy per particle
is (3/2) kT, or the average energy per degree of freedom is (1/2) kT. Finally, combining the earlier
result of (2.27) with that of (2.32), we derive the complete dependence on E as well,

    S(E, V, N) = kN ln(V E^{3/2} / N^{5/2}) + s(N)        (2.35)

where s(N) depends only on N. If S is to be an extensive quantity, then s(N) should be a
linear function of N, given by s(N) = s_0 N for some constant s_0.
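The counting problem (2.31) can also be checked by brute force for a single particle (3N = 3): the number of non-negative integer solutions with Σ_r n_r² ≤ R² approaches the volume of the positive octant of a ball of radius R as R grows, which is exactly the approximation exploited in the next subsection. A minimal sketch (R² values are illustrative):

```python
import math
from itertools import product

# Brute-force count of non-negative integer solutions of
#   n_1^2 + ... + n_d^2 <= R2     (cf. eq. (2.31), R2 = 2mEL^2/(pi^2 hbar^2))
# for a single particle, d = 3N = 3, compared with the volume of the
# positive octant of the ball of radius sqrt(R2).
def count_states(R2, dim):
    nmax = math.isqrt(R2)
    return sum(1 for n in product(range(nmax + 1), repeat=dim)
               if sum(x * x for x in n) <= R2)

def octant_volume(R2, dim):
    R = math.sqrt(R2)
    return (math.pi ** (dim / 2) / math.gamma(dim / 2 + 1)) * R ** dim / 2 ** dim

for R2 in (100, 400, 1600):
    print(R2, count_states(R2, 3), round(octant_volume(R2, 3), 1))
```

The exact count exceeds the octant volume by a boundary (surface) correction that becomes relatively smaller as R grows.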
2.7.1 Counting micro-states in the simplest ideal gas
Next, we investigate how the actual counting of micro-states for the non-relativistic ideal gas
proceeds, and which results we obtain. The combinatorial problem is well-posed in (2.31).
It may be solved in the approximation where mEL²/ℏ² ≫ 1 by computing the volume of
the positive n_r quadrant of a sphere in 3N dimensions, whose radius ρ is defined by,

    ρ² = 2mEV^{2/3}/(π²ℏ²)        (2.36)
The volume of the (n−1)-dimensional sphere of unit radius (embedded in n-dimensional flat
Euclidean space) is given by¹,

    V_{S^{n−1}} = 2π^{n/2}/Γ(n/2)        (2.37)
As a result, the number of micro-states is given by,

    Ω(E, V, N) ≈ (1/2^{3N}) · 2π^{3N/2} ρ^{3N−1}/Γ(3N/2) ≈ (2/Γ(3N/2)) (mEV^{2/3}/(2πℏ²))^{3N/2}        (2.38)
In the second approximation, we have used the fact that N ≫ 1 to drop the −1 in the exponent
of ρ. Using the Stirling formula for the Γ-function evaluated at large argument n ≫ 1,

    ln Γ(n + 1) ≈ n ln n − n + (1/2) ln(2πn) + O(1/n)        (2.39)
¹ This formula may be obtained by evaluating an n-dimensional spherically symmetric Gaussian integral
in two different ways: first as a product of identical 1-dimensional Gaussian integrals; second by expressing
the integral in spherical coordinates, and factoring out the volume V_{S^{n−1}}.
and converting the result of (2.38) to a formula for the entropy, we find,

    S(E, V, N) ≈ kN ln[(V/N)(mE/(3πℏ²N))^{3/2}] + (3/2) kN + kN ln N        (2.40)
Note that the argument of the logarithm is properly dimensionless, and forms an intensive
combination of E, V, N. The first two terms are extensive contributions, as one would expect
for the total entropy, and their dependence on E and V is in accord with the results derived
on general scaling grounds in equation (2.35). The last term is consistent with the form of
(2.35) as well, but fails to be properly extensive. It is useful to re-express E in terms of T
using the second equation in (2.34), and we find,

    S(E, V, N) = kN ln[(V/N)(mkT/(2πℏ²))^{3/2}] + (3/2) kN + kN ln N        (2.41)
The failure of the last term to be properly extensive is referred to as the Gibbs paradox.
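The Stirling approximation (2.39), which was used to pass from (2.38) to (2.40), is easy to check numerically; a quick sketch, where Python's math.lgamma plays the role of ln Γ:

```python
import math

# Numerical check of the Stirling expansion (2.39):
#   ln Gamma(n + 1) ~ n ln n - n + (1/2) ln(2 pi n) + O(1/n)
def stirling(n):
    return n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)

for n in (10, 100, 1000):
    err = math.lgamma(n + 1) - stirling(n)
    print(n, err)   # the error decreases roughly like 1/(12 n)
```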
2.8 Mixing entropy and the Gibbs paradox
The non-extensive formula for the entropy leads to a paradox for the following reasons. Consider
again the situation of the two systems 𝒮_1 and 𝒮_2 depicted in Figure 1, with gasses such
that the masses of their microscopic constituents are equal, m_1 = m_2 = m. We begin by
assuming that each system is isolated and in thermal equilibrium, and that their temperatures
coincide, T_1 = T_2 = T. We shall also assume that their pressures coincide so that, by the
ideal gas law (2.28), the number densities must also be equal, N_1/V_1 = N_2/V_2 = ρ. We will
leave their chemical potentials unconstrained.
First, consider bringing the systems 𝒮_1 and 𝒮_2 into thermal contact, by allowing exchanges
of energy and volume, but not particle number. Introduction or removal of thermal contact
are then reversible processes, and no increase in entropy is generated, since the preparation of
the systems guarantees thermal equilibrium of 𝒮, as long as no particles are being exchanged.
Second, consider the gasses in 𝒮_1 and 𝒮_2 to be composed of identical particles, and
allow exchange also of particle number. Again, the introduction or removal of thermal
contact are reversible processes with no associated entropy increase.
Third, consider the gas in 𝒮_1 to be composed of one species of particles, but the gas in
𝒮_2 to be composed of another species, which happens to have equal mass (this can happen
approximately, to rather high precision, in Nature; for example with the proton and the
neutron, or with isobaric nuclei of different elements such as ⁴⁰K and ⁴⁰Ca). Now, we know
that allowing the two gasses to mix will increase the entropy. One way to see this is that each
species of particles can have its separate chemical potential, and these chemical potentials
are different for the two isolated systems 𝒮_1 and 𝒮_2 prior to thermal contact. Thus, upon
producing thermal contact, the total system will not be in thermal equilibrium.
Let us explore the predictions of formula (2.41), in order to compare the entropies S_α of
the systems 𝒮_α for α = 1, 2,

    S_α = kN_α ln[(1/ρ)(mkT/(2πℏ²))^{3/2}] + kN_α (3/2 + ln N_α)        (2.42)
with the entropy S of the total system 𝒮,

    S = k(N_1 + N_2) ln[(1/ρ)(mkT/(2πℏ²))^{3/2}] + k(N_1 + N_2)(3/2 + ln(N_1 + N_2))        (2.43)
The mixing entropy increase S − S_1 − S_2 may be evaluated using the Stirling formula,

    S − S_1 − S_2 = k ln[(N_1 + N_2)!/(N_1! N_2!)]        (2.44)

a result which clearly cries out for a combinatorial interpretation.
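The combinatorial formula (2.44) can be checked against the Stirling-formula expression obtained by subtracting twice (2.42) from (2.43). A small numerical sketch (the particle numbers are illustrative):

```python
import math

# Mixing entropy (2.44), in units of k:
#   ln[(N1 + N2)! / (N1! N2!)]
# compared with the Stirling-formula form obtained from (2.42)-(2.43):
#   (N1+N2) ln(N1+N2) - N1 ln N1 - N2 ln N2
def mixing_entropy(N1, N2):
    return math.lgamma(N1 + N2 + 1) - math.lgamma(N1 + 1) - math.lgamma(N2 + 1)

def stirling_form(N1, N2):
    N = N1 + N2
    return N * math.log(N) - N1 * math.log(N1) - N2 * math.log(N2)

N1, N2 = 10**6, 2 * 10**6
exact = mixing_entropy(N1, N2)
approx = stirling_form(N1, N2)
print(exact, approx)   # agree to a few parts in 10^6 at these particle numbers
```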
2.9 Indistinguishability of identical particles
When Gibbs articulated the above paradox in 1875, quantum mechanics was still 25 years
away, and he did not have the factor of h in the expression for the entropy. The paradox
existed even then, and Gibbs proposed an ad hoc remedy by dividing the number of accessible
micro-states by the combinatorial factor N!.
We now know the reason for this factor: in quantum mechanics, identical particles are
indistinguishable. Proper counting of indistinguishable particles requires the inclusion of
the factor N!, though we also know, of course, that quantum mechanics will impose further
modifications on this counting, depending on whether we are dealing with fermions or bosons.
The result is the Sackur-Tetrode formula,

    S = kN ln[(V/N)(mkT/(2πℏ²))^{3/2}] + (5/2) kN        (2.45)
It was our initial treatment of the identical particles as distinguishable (namely each one
being labeled by (p_i, q_i)) that produced the paradox.
The heat capacities C_V, C_P for the ideal gas may be evaluated using nothing more than
the ideal gas laws of (2.34), and we find,

    C_V = (3/2) Nk        C_P = (5/2) Nk        (2.46)
Evaluating the internal energy as a function of the entropy allows us to compute the chemical
potential for the ideal gas, and we find,

    E = (3πℏ² N^{5/3}/mV^{2/3}) exp[2S/(3kN) − 5/3]

    μ = −kT ln[(V/N)(mkT/(2πℏ²))^{3/2}]        (2.47)
Note that the ubiquitous combination,

    λ = (2πℏ²/mkT)^{1/2}        (2.48)

has the dimension of length, and is usually referred to as the thermal wavelength.
Recasting the formula for the entropy (2.45) in terms of the density ρ = N/V and the
thermal wavelength λ, we find,

    S = −kN ln(ρλ³) + (5/2) kN        (2.49)

Since the entropy must be positive by definition, we see that the above formula is valid only
when the density is small compared with the scale set by the thermal wavelength,

    ρλ³ ≪ 1        (2.50)

This is the condition for a Boltzmann (or sufficiently dilute) gas. For ρλ³ ≳ 1, we will need
to appeal to the full quantum statistics of Bose-Einstein or Fermi-Dirac type instead.
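As a rough numerical sketch of the diluteness condition (2.50) — the gas, temperature, and pressure below are illustrative choices, not taken from the notes — helium at room temperature and atmospheric pressure is deep in the Boltzmann regime, and (2.49) then gives its entropy per particle:

```python
import math

# Thermal wavelength (2.48) and diluteness parameter rho*lambda^3 (2.50)
# for helium gas at room temperature and atmospheric pressure
# (illustrative values).
hbar = 1.054571817e-34    # J s
k = 1.380649e-23          # J/K
m = 6.646e-27             # helium-4 atomic mass, kg (approximate)
T = 300.0                 # K
P = 1.0e5                 # Pa

lam = math.sqrt(2 * math.pi * hbar**2 / (m * k * T))  # thermal wavelength (2.48)
rho = P / (k * T)                                     # number density from PV = NkT
x = rho * lam**3                                      # diluteness parameter
S_over_Nk = -math.log(x) + 2.5                        # Sackur-Tetrode (2.49)
print(lam, x, S_over_Nk)
```

The wavelength is about half an angstrom, ρλ³ is of order 10⁻⁶, and the entropy per particle is about 15 k, comfortably positive.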
2.10 Thermodynamic transformations
A thermodynamic transformation corresponds to a change of the macroscopic state of a
system. If the thermodynamic variables are, for example, P, V, N, then any change in these
variables produces a thermodynamic transformation. Special thermodynamic transforma-
tions are often of interest, and usually correspond to keeping one or another thermodynamic
variable fixed. If a system is initially in equilibrium, then a thermodynamic transformation
can be brought about only by changing the external conditions on the system.
We shall distinguish the following thermodynamic transformations,
1. Quasi-static: if the external conditions change slowly enough to keep the system in
equilibrium throughout the transformation process.
2. Reversible: a transformation producing no increase in total entropy. (A reversible
process is quasi-static, but the converse does not hold.)
3. Adiabatic: a transformation in which only the external conditions on the system are
changed, but no heat is transferred: δQ = 0. As a result, we have dE = −δW.
(Note that Landau and Lifshitz use terminology where the term adiabatic stands for what we
understand to be reversible adiabatic; see the next entry.)
4. Reversible adiabatic: combining the two properties, one shows that the entropy of the
system remains unchanged. (For the ideal gas, this means that TV^{2/3}, or equivalently
PV^{5/3}, is constant, for fixed particle number N.)
5. Isothermal: a process at constant temperature. (For an ideal gas, the internal energy
is then constant, and the product PV is constant, for fixed particle number N.)
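The reversible adiabatic invariant quoted in item 4 can be recovered numerically by integrating dE = −δW = −P dV for the simplest ideal gas. A minimal sketch in units where Nk = 1 (step count and end points are illustrative):

```python
# Reversible adiabatic transformation of the simplest ideal gas:
# integrate dE = -P dV with E = (3/2) N k T and P = N k T / V (Nk = 1),
# i.e. (3/2) dT = -(T/V) dV, and check that T V^(2/3) stays constant.
def adiabat(T0, V0, V1, steps=200000):
    T, V = T0, V0
    dV = (V1 - V0) / steps
    for _ in range(steps):
        T += -(2.0 / 3.0) * (T / V) * dV   # forward-Euler step
        V += dV
    return T, V

T0, V0 = 300.0, 1.0
T1, V1 = adiabat(T0, V0, 2.0)
print(T0 * V0**(2 / 3), T1 * V1**(2 / 3))   # nearly equal
```

Doubling the volume cools the gas by the factor 2^{2/3}, and PV^{5/3} = (T/V^{2/3})·V^{5/3}·... follows from the same invariant together with PV = NkT.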
3 Classical Statistical Ensembles
In the preceding section, we have succeeded in counting the number of micro-states for an
ideal gas of non-interacting non-relativistic particles confined to a box. As soon as interactions
are turned on, however, an exact counting of micro-states in the thermodynamic limit
can be carried out only very rarely. Therefore, a general formalism of the statistical approach
to arbitrary systems is needed which, in particular, will permit a systematic calculation of
perturbative approximations.
The general formalism of statistical mechanics is based on the theory of statistical en-
sembles, a concept which dates back to Maxwell and Boltzmann for gasses, and which was
articulated in wide generality by Gibbs. We shall begin with ensembles in classical mechan-
ics, where the formulation may be set and understood in more familiar terrain, and introduce
the density function. Ensembles in the quantum case will be introduced in the subsequent
section, where we shall also distinguish between pure and mixed ensembles, and introduce
the density matrix.
3.1 Classical mechanics
The Hamiltonian formulation of classical mechanics will be used to describe a system of N
classical particles. Points in phase space 𝒫 may be described by generalized coordinates
(p_i, q_i) with i = 1, ..., s. For point particles without internal degrees of freedom in three
space dimensions, we have s = 3N; when further internal degrees of freedom are present,
such as rotational or vibrational degrees of freedom, we will have s > 3N. A point in
phase space completely specifies a mechanical state at a given time, and corresponds to a
micro-state. Phase space 𝒫 is the space of all possible micro-states of the classical system.
Dynamics is governed by a classical Hamiltonian H(p, q; t) = H(p_1, ..., p_s, q_1, ..., q_s; t).
The time evolution of a micro-state, namely the evolution in time t of its N particles, is
given by the time evolution of the generalized coordinates via the Hamilton equations,

    ṗ_i = −∂H/∂q_i        q̇_i = ∂H/∂p_i        (3.1)

where the dot denotes the derivative with respect to t. Time-evolution produces a flow
on phase space. When the Hamiltonian H has no explicit time-dependence, and is of the
form H(p, q), the total energy of the system is conserved. We may then consistently restrict
attention to dynamics at fixed total energy E,

    H(p, q) = E        (3.2)
for each value of E. The corresponding subspaces 𝒫_E of phase space do not intersect one
another for different values of E, and the family of spaces 𝒫_E provides a foliation of 𝒫,
depicted schematically in Figure 2. Some of these leaves may be empty sets; this happens,
for example, for negative E with the harmonic oscillator Hamiltonian, which is non-negative
throughout 𝒫.
Figure 2: Foliation of phase space 𝒫 by the conserved energy variable E. Leaves 𝒫_E at
different energies do not intersect.
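As a concrete sketch of the flow generated by (3.1), the following integrates a one-dimensional harmonic oscillator with a leapfrog (kick-drift-kick) scheme; the Hamiltonian and step sizes are illustrative choices. The trajectory stays on its energy leaf to good accuracy:

```python
# The flow (3.1) for a 1-d harmonic oscillator H = p^2/(2m) + m w^2 q^2/2
# (m = w = 1, illustrative), integrated with a leapfrog scheme; the
# trajectory remains on (very close to) its constant-energy leaf.
m, w = 1.0, 1.0

def H(p, q):
    return p * p / (2 * m) + 0.5 * m * w * w * q * q

def evolve(p, q, dt, steps):
    for _ in range(steps):
        p -= 0.5 * dt * m * w * w * q   # half kick:  pdot = -dH/dq
        q += dt * p / m                 # full drift: qdot = +dH/dp
        p -= 0.5 * dt * m * w * w * q   # half kick
    return p, q

p0, q0 = 0.0, 1.0
p1, q1 = evolve(p0, q0, dt=0.01, steps=10000)
print(H(p0, q0), H(p1, q1))   # both close to 0.5
```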
3.2 Time averaging
For a very large number N of interacting particles, it will be impossible to specify the initial
conditions and to solve the evolution equations for individual micro-states. Instead, we shall
be interested in predicting statistically averaged quantities.
One conceivable average is over long periods of time. Consider a mechanical function
f(p, q) on phase space, and follow its evolution in time f(p(t), q(t)) under the Hamilton
equations (3.1). Since we cannot predict p(t) and q(t), we cannot predict f(p(t), q(t)) either.
Calculating the average of f over long periods of time,

    lim_{T→∞} (1/T) ∫_t^{t+T} dt′ f(p(t′), q(t′))        (3.3)
may offer a better prospect. Although the time-averaging approach may seem natural, it
has a number of drawbacks. The time-average of (3.3) may depend on the initial time
t, and on the initial conditions imposed on the time-evolution. Ideally, such data will be
washed out for large enough T provided we have strong enough interactions. Unfortunately,
characterizing the behavior of a system over long time intervals is very complicated, and the
question whether the initial conditions will or will not be washed out for large enough
T introduces a difficult dynamical complication. In particular, during its time evolution, the
system may or may not reach all of the allowed phase space, a problem that is the subject
of the ergodic theorem of dynamical systems. In fact, for free particles (say on a torus with
periodic boundary conditions), or for systems with maximal numbers of conserved quantities,
we know that not all of the phase space allowed by total energy will be reached.
3.3 Statistical ensembles
The Gibbs approach to statistical mechanics via the use of ensembles greatly simplifies and
clarifies the manner in which we take averages. The approach also enjoys better physical
motivation, as well as much wider applicability due to its more formal setting (an excellent
combination for making progress).
Let us begin with the physical motivation in the case of classical mechanics. When we
wish to make statistical predictions, for example about the macroscopic system of one liter
of water, we are interested not so much in predicting the behavior of a single specific one-liter
bottle of water (whose initial conditions or micro-state we would not know precisely
anyway). Instead, we are rather interested in making predictions about any liter of water
with the same composition and macroscopic variables. In particular, we do not prescribe
the initial conditions of the system beyond giving its macroscopic variables. Therefore, from
the outset, we are interested in considering together all possible micro-states to which a
given macro-state has access, and averaging over them. To summarize, in addition to time
averaging of just one specific system, we will take an average also over all possible similar
systems, with the same Hamiltonian, but with different initial conditions. This procedure
makes physical sense because the initial conditions on the systems were not specified for a
given macro-state anyway.
Let us now provide a precise set-up. Following Gibbs, we consider not one system, but
rather a collection of 𝒜 systems, the dynamics of each one of these systems being governed
by the same Hamiltonian H(p, q; t), but whose initial conditions will be different. Recall
that, at any given time, a single system is characterized by a single point in phase space 𝒫.
Thus, at any given time, the collection of 𝒜 systems will be characterized by a cloud of 𝒜
points in phase space, each point corresponding precisely to one system. Such a collection
(or cloud) of phase space points is referred to as an ensemble. Each point (or system) in the
ensemble evolves in time according to the same Hamiltonian H(p, q; t).
It is important to point out that, by construction, the different systems collected in an
ensemble do not interact with one another.
It cannot be emphasized enough that a system composed of N similar interacting elements
(or similar interacting subsystems, such as a gas of N interacting identical molecules)
is not an ensemble of independent systems, and should not be confused with an ensemble.²
² To emphasize this distinction, we shall denote the number of particles or subsystems by N, but the
number of non-interacting systems collected in an ensemble by 𝒜.
Figure 3: Systems in phase space 𝒫: (a) a single system corresponds to a point, and its
time-evolution gives a curve in 𝒫; (b) a cloud of points defines an ensemble; (c) the time
evolution of an ensemble.
Instead, when a system is composed of N interacting molecules, the Gibbs formulation of
statistical mechanics will require 𝒜 copies of the entire system of N molecules.
We conclude by noting that in a system composed of N identical non-interacting
constituents, each constituent can be used as a system in its own right, the total system then
being equivalent to an ensemble of 𝒜 = N such non-interacting subsystems. This is why
our earlier treatment of non-interacting molecules in an ideal gas produces the same result
as will be obtained from a treatment by ensemble theory.
3.4 The density function
Consider a mechanical system parametrized by generalized coordinates (p_i, q_i), i = 1, ..., s,
on phase space 𝒫, whose dynamics is governed by a Hamiltonian H(p, q; t). An ensemble of
systems is then represented, at a fixed time t, by a cloud of 𝒜 points in 𝒫. In the statistical
approach, we shall consider 𝒜 extremely large, in such a way that the cloud of points may
be well-approximated by a continuous distribution,

    dw = ρ(p, q; t) dp dq        dp dq = ∏_{i=1}^{s} dp_i dq_i        (3.4)
where dp dq stands for the canonical volume form on phase space. The interpretation is that
dw counts the number of points of the ensemble contained within an infinitesimal volume
dp dq surrounding the point (p, q) at time t. Equivalently, the number of phase space points
𝒜_D of the ensemble contained in a finite domain D, and the total number of points 𝒜 in
the entire ensemble, are given by,

    𝒜_D = ∫_D dp dq ρ(p, q; t)        𝒜 = ∫_𝒫 dp dq ρ(p, q; t)        (3.5)
The ensemble average f̄(t) of any function f(p, q) on phase space is defined by,

    f̄(t) = (1/𝒜) ∫_𝒫 dp dq ρ(p, q; t) f(p, q)        (3.6)

Since all ensemble averages are unchanged under multiplication of ρ by a positive constant,
we will throughout use the normalized distribution function ρ(p, q; t)/𝒜 instead of ρ itself,
and continue to use the notation ρ(p, q; t) to signify the normalized distribution function. In
the standard language of probability theory, the normalized distribution function is then a
probability distribution.
The time-averaging procedure of (3.3) on a single system reaching equilibrium may be
extended to a time-averaging procedure on the entire ensemble. To see how this works,
consider an infinitesimal volume of size dp dq surrounding a point (p, q) in phase space. At
time t, various individual points in the cloud of phase space points representing the ensemble
will be inside the volume dp dq. This number is given by 𝒜 dp dq ρ(p, q; t), for a normalized
distribution function ρ. Thus, the time-average of (3.3) may be expressed as follows,

    lim_{T→∞} (1/T) ∫_t^{t+T} dt′ f(p(t′), q(t′)) = lim_{T→∞} (1/T) ∫_t^{t+T} dt′ ∫_𝒫 dp dq ρ(p, q; t′) f(p, q)        (3.7)
All dependence on T and t is now contained in the t′-integral over the density function.
Clearly, equilibrium will be attained for large enough T provided the combination

    (1/T) ∫_t^{t+T} dt′ ρ(p, q; t′)        (3.8)

tends towards a finite, t-independent limit as T → ∞. This will be realized provided the
following equilibrium condition holds at all times t,

    ∂ρ(p, q; t)/∂t = 0        (3.9)

Note that the general density function ρ(p, q; t) introduced through the method of ensembles
need not correspond to a system in equilibrium. The added advantage of the ensemble
formulation is that we have a clear and clean characterization of equilibrium, given in terms
of the density function by condition (3.9).
3.5 The Liouville theorem
The Liouville theorem expresses the fact that the number of material points in a Hamiltonian
system is conserved. This conservation applies to the phase space points describing the
systems of an ensemble. We characterize an ensemble by a normalized density function
ρ(p, q; t), so that the number of ensemble points inside an infinitesimal volume dp dq of phase
space is given by 𝒜 dp dq ρ(p, q; t). The number of ensemble points 𝒜_D contained inside a
domain D (of dimension 2s) is given by the integral over the density function as in (3.5),

    𝒜_D(t) = 𝒜 ∫_D dp dq ρ(p, q; t)        (3.10)
Since ρ depends explicitly on t, the number 𝒜_D will in general depend on t as well.
Conservation of the number of points is expressed by the conservation equation, which relates the
density function ρ to the current density (ρ ṗ_i, ρ q̇_i) on all of phase space,

    ∂ρ/∂t + Σ_{i=1}^{s} [∂(ρ ṗ_i)/∂p_i + ∂(ρ q̇_i)/∂q_i] = 0        (3.11)
where the sum of partial derivatives gives the divergence of the current (ρ ṗ_i, ρ q̇_i). Using
the Hamilton equations to eliminate ṗ_i and q̇_i, and using the relation,

    ∂ṗ_i/∂p_i + ∂q̇_i/∂q_i = −(∂/∂p_i)(∂H/∂q_i) + (∂/∂q_i)(∂H/∂p_i) = 0        (3.12)
we find,

    ∂ρ/∂t + Σ_{i=1}^{s} [ṗ_i ∂ρ/∂p_i + q̇_i ∂ρ/∂q_i] = 0        (3.13)
This conservation equation may be recast in two equivalent ways. First, given that ρ is a
function of p, q, t, we see that (3.13) expresses its total time-independence,

    dρ/dt = 0        (3.14)

as the Hamiltonian system evolves in time according to the Hamilton equations. This means
that the density does not change in time, so that the ensemble evolves as an incompressible
fluid does. Alternatively, (3.13) may be expressed with the help of the Poisson brackets {·, ·}.
We normalize the generalized coordinates canonically by {p_i, p_j} = {q_i, q_j} = 0 and {q_i, p_j} = δ_ij.
Equations (3.13) and (3.14) are then equivalent to,

    ∂ρ/∂t + {ρ, H} = 0        (3.15)

We stress that the above results hold for any distribution function, whether it describes an
ensemble in equilibrium or out of equilibrium.
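Liouville's theorem can be illustrated numerically. The leapfrog map used below is symplectic, so a small circle of ensemble points, pushed through the flow of an anharmonic Hamiltonian (the Hamiltonian and all parameters are illustrative choices), keeps its enclosed phase-space area; the area is estimated with the shoelace formula:

```python
import math

# Liouville's theorem: Hamiltonian flow preserves phase-space volume.
# Push a small circle of ensemble points through the (symplectic)
# leapfrog map for H = p^2/2 + q^4/4 and compare enclosed areas.
def leapfrog(p, q, dt, steps):
    for _ in range(steps):
        p -= 0.5 * dt * q**3   # half kick: pdot = -dH/dq = -q^3
        q += dt * p            # drift:     qdot = +dH/dp = p
        p -= 0.5 * dt * q**3   # half kick
    return p, q

def shoelace(pts):
    """Polygon area enclosed by a closed list of (p, q) points."""
    area = 0.0
    for (p1, q1), (p2, q2) in zip(pts, pts[1:] + pts[:1]):
        area += p1 * q2 - p2 * q1
    return abs(area) / 2.0

n = 2000
circle = [(0.1 * math.sin(2 * math.pi * i / n),
           1.0 + 0.1 * math.cos(2 * math.pi * i / n)) for i in range(n)]
evolved = [leapfrog(p, q, dt=0.01, steps=500) for p, q in circle]
print(shoelace(circle), shoelace(evolved))   # both close to pi * 0.01
```

The circle is deformed (sheared) by the flow, but its area is unchanged: the ensemble evolves like an incompressible fluid.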
3.6 Equilibrium ensembles and distribution functions
The construction of equilibrium ensembles, and of their associated distribution functions, is
guided by two key principles. The first is the condition for equilibrium in (3.9). The second
is Boltzmann's assumption of equal a priori probabilities. We explain each of these below.
If a distribution function ρ describes an ensemble in equilibrium, then by (3.9) and (3.15)
it must satisfy the following conditions, which are equivalent to one another,

    ∂ρ/∂t = 0        {ρ, H} = 0        (3.16)

Both relations provide a concrete characterization of equilibrium density functions, and thus
of ensembles in equilibrium. We shall concentrate on conservative systems, in which case H
has no explicit time dependence. The second equation in (3.16) then instructs us that ρ is a
conserved quantity. Generically, a conservative Hamiltonian exhibits only a single conserved
quantity, namely the total mechanical energy E = H(p, q), so that ρ will be a function of
the Hamiltonian only. When further conserved quantities exist, such as total momentum, total
angular momentum, particle number, or electric charge, then ρ will also depend on the
corresponding conserved mechanical functions.
The principle of equal a priori probabilities postulates that, if a certain macro-state has
been completely specified by a certain number of thermodynamic variables, and is accessible
by a number Ω of micro-states, then the probability for finding the system in any one of
these micro-states is the same for all micro-states, and thus equal to 1/Ω. As this probability
distribution does not favor any one micro-state above another, it is referred to as an unbiased
probability distribution.
3.7 The uniform ensemble
In the uniform ensemble, ρ(p, q) is taken to be constant, i.e. independent of p, q, t. This
means that all systems in the ensemble have the same statistical probability, in accord
with the Boltzmann principle of equal a priori probabilities. In view of (3.16), the uniform
distribution function characterizes an equilibrium ensemble. Ensemble averages are obtained
as integrals over phase space by,

    f̄ = ∫_𝒫 dp dq f(p, q) / ∫_𝒫 dp dq        (3.17)

It is readily seen that these expressions will be finite and make physical sense only if the
volume of phase space is finite, for example when 𝒫 is compact. For most systems of
physical interest, this will not be the case. Thus, in classical mechanics, this ensemble is not
that useful. The uniform ensemble will be useful later, however, when considering certain
quantum mechanical systems.
3.8 The micro-canonical ensemble
In the micro-canonical ensemble, the total (or internal) energy E is used to specify the macro-states
completely. All micro-states with total energy E have equal weight, in accord with the
Boltzmann principle of equal a priori probabilities. The motion of every system is restricted
to the subspace 𝒫_E of energy E of the full phase space 𝒫, and the density function ρ_E for
energy E is supported on 𝒫_E with uniform weight. As a result, the normalized distribution
function for the micro-canonical ensemble may be written down explicitly,

    ρ_E(p, q) = (1/A(E)) δ(H(p, q) − E)        A(E) = ∫_𝒫 dp dq δ(H(p, q) − E)        (3.18)
The micro-canonical ensemble average of a phase space function f(p, q) is given by,

    f̄ = ∫_𝒫 dp dq ρ_E(p, q) f(p, q)        ∫_𝒫 dp dq ρ_E(p, q) = 1        (3.19)

and the statistical entropy of the ensemble is given by,

    S(E) = k ln Ω(E)        Ω(E) = A(E)/A_0        (3.20)
The need for the presence of a non-purely-mechanical normalization factor A_0 in the entropy
formula was already made clear by considering the simplest ideal gas, which showed that A_0
has a quantum component involving ℏ. It is clear that this factor is needed also from the
consideration of dimensions: the total number of micro-states Ω(E) is dimensionless, but
A(E), defined in (3.18), has dimensions. Note that Ω(E) is defined in an intrinsic geometric
manner, and is invariant under canonical transformations on p, q.
In practice, it will often be more convenient, and more physical, to define the micro-canonical ensemble for a finite range of energies, instead of strictly at one value of the energy. This treatment will be especially valuable in the quantum case, where energy levels are usually discrete. To do so, we define a shell of energies in the range [E, E + Δ], with

ρ_{E,Δ}(p, q) = (1/N(E, Δ)) × { 1 if E ≤ H(p, q) ≤ E + Δ ;  0 otherwise }   (3.21)

where the normalization factor N(E, Δ) is fixed by requiring unit total probability. The ensemble averages are defined accordingly as in (3.19), with ρ_E replaced by ρ_{E,Δ}.
3.9 The canonical ensemble
The canonical ensemble is the most standard one. As a result, a number of different derivations are available, and will be discussed here.
The canonical ensemble may be obtained from the micro-canonical ensemble by changing variables from fixed internal energy E to fixed absolute temperature T. The density function ρ(p, q; T) for the canonical ensemble is the Laplace transform of the density function ρ_E(p, q) for the micro-canonical ensemble, and is given by,

ρ(p, q; T) = (1/Z(T)) e^{−H(p,q)/kT}   (3.22)
The normalization factor Z(T) is referred to as the partition function, and is given by,

Z(T) = ∫_Γ dp dq e^{−H(p,q)/kT}   (3.23)
Statistical averages in the canonical ensemble are obtained by,

⟨f⟩ = (1/Z(T)) ∫_Γ dp dq e^{−H(p,q)/kT} f(p, q)   (3.24)
The canonical ensemble is more useful in carrying out calculations than the micro-canonical
ensemble. The three formulas above are fundamental throughout statistical mechanics.
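As a concrete sanity check (not part of the original derivation), the three formulas can be evaluated numerically for a one-dimensional harmonic oscillator, H = p²/2m + mω²q²/2, in units k = m = ω = 1; equipartition then predicts ⟨H⟩ = T. A minimal sketch:

```python
import numpy as np

# A minimal numerical sketch (illustrative, units k = m = w = 1) of the
# canonical formulas (3.22)-(3.24) for a 1D harmonic oscillator
# H = p^2/2m + m w^2 q^2 / 2; equipartition predicts <H> = kT.
def canonical_average(f, T, L=30.0, n=1501):
    p = np.linspace(-L, L, n)
    q = np.linspace(-L, L, n)
    dp, dq = p[1] - p[0], q[1] - q[0]
    P, Q = np.meshgrid(p, q, indexing="ij")
    H = 0.5 * P**2 + 0.5 * Q**2
    boltz = np.exp(-H / T)                    # Boltzmann factor e^{-H/kT}
    Z = boltz.sum() * dp * dq                 # partition function (3.23), up to Z_0
    return float((f(P, Q) * boltz).sum() * dp * dq / Z)   # average (3.24)

T = 2.0
E = canonical_average(lambda P, Q: 0.5 * P**2 + 0.5 * Q**2, T)
print(round(E, 3))   # -> 2.0 : kT/2 per quadratic degree of freedom, times two
```

The constant Z_0 drops out of any normalized average, which is why it can be omitted in this kind of numerical check.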
To make contact with other familiar thermodynamic functions, we proceed as follows. By definition, the ensemble average of the Hamiltonian H(p, q) gives the internal energy E. It may be expressed via a derivative of Z with respect to T,

E = ⟨H⟩ = kT² (∂/∂T) ln Z(T)   (3.25)
To compare this result with formulas in thermodynamics, we express it in terms of the free energy F, with the help of F = E − TS and S = −∂F/∂T, to obtain,

E = −T² (∂/∂T) ( F/T )   (3.26)
Comparison of (3.25) and (3.26) shows that the free energy is given by,

F(T) = −kT ln ( Z(T)/Z_0 )   (3.27)
which is one of the most fundamental formulas of correspondence between statistical mechanics and thermodynamics in the canonical ensemble. The constant Z_0 is not determined by classical statistical mechanics. Its presence shifts the entropy by a constant. We know from the example of the simplest ideal gas that Z_0 is quantum mechanical in origin. More importantly, the constant is needed to make the argument of the logarithm properly dimensionless. Finally, we have already established in the ideal gas calculation that the quantum indistinguishability of identical particles requires the presence of an extra combinatorial factor N!. Thus, the correct constant is found to be,
Z_0 = (2πℏ)^s N!   (3.28)

where s is the number of degrees of freedom (s = 3N for N point particles in three dimensions).
This result may be shown in all generality by first using the functional integral formulation of the quantum problem and then taking its classical limit.
An alternative derivation makes use of the fact that the normalized density function is a probability distribution. As appropriate for the canonical ensemble, we assume that the density function depends only on the energy, namely on the Hamiltonian of the system. Now consider two independent systems (each system having its own phase space), with Hamiltonians H_1, H_2, and corresponding density functions ρ(H_1) and ρ(H_2). The Hamiltonian of the total system is then H = H_1 + H_2. The probability distributions ρ(H_1) and ρ(H_2) are probabilistically independent. Therefore, as always with probabilities, the probability distribution ρ(H) of the combined system must be the product,

ρ(H) = ρ(H_1) ρ(H_2)   (3.29)
But since we have H = H_1 + H_2, this implies that the dependence of ρ(H) on H must be exponential. This gives the canonical distribution with the Boltzmann factor.
3.10 Deriving the canonical from the micro-canonical ensemble
The construction of the preceding section is somewhat formal. Here, we present another derivation of the distribution function (3.22) which provides more physical insight. Keeping a system (denoted here by S) in equilibrium at a fixed temperature T may be realized physically by putting the system S in thermal contact with a very large system S′, which is assumed to be in equilibrium at temperature T. The system S′ provides a heat bath, or thermal reservoir, with which S can freely exchange energy. A pictorial representation of the system is given in Figure 4. We keep the total energy E_tot of the combined system fixed, so that the set-up used early on for the micro-canonical ensemble applies. We shall assume here that the micro-states of the system S are discrete, may be labelled by a discrete index n, and have discrete energy levels E_n.
The total number Ω_tot(E_tot) of micro-states to which the combined system has access is then given by,

Ω_tot(E_tot) = Σ_E Ω(E) Ω′(E′)   (3.30)

where E and E′ are the energies of the systems S and S′ respectively, and the total energy is given by E_tot = E + E′. Labeling all the micro-states of S by the discrete label n, we have
Figure 4: A small macro-system S in thermal equilibrium with a large heat bath S′.
the following explicit formula for Ω(E),

Ω(E) = Σ_n δ_{E,E_n}   (3.31)

where δ_{E,E_n} is the Kronecker delta. It is important to stress that the sum over n is to run over all distinct states, not just over all allowed energies; in particular, there may be several states of S with energy E_n, and the degeneracy of the energy level E_n must be properly included in the sum. Using (3.31) together with the entropy relation (3.20), formula (3.30) may be cast in terms of a sum over all micro-states of S,

Ω_tot(E_tot) = Σ_n Ω′(E_tot − E_n) = Σ_n exp{ S′(E_tot − E_n)/k }   (3.32)
Now, we use the fact that the heat bath system S′ is huge compared with S, so that we can assume E_n/E_tot ≪ 1, and expand in this quantity,

S′(E_tot − E_n) = S′(E_tot) − E_n ∂S′(E_tot)/∂E_tot + (1/2) E_n² ∂²S′(E_tot)/∂E_tot² + O(E_n³)   (3.33)
Now, by definition, the first derivative,

∂S′(E_tot)/∂E_tot = 1/T   (3.34)
gives the temperature T of the heat bath. Since S′(E_tot) is extensive, the second derivative term is of order E_n/E_tot and may be neglected compared to the first derivative term. Thus, the total number of states becomes,

Ω_tot(E_tot) = e^{S′(E_tot)/k} Σ_n e^{−E_n/kT}   (3.35)
35
Using now the principle of equal a priori probabilities, we know that every state n with energy E_n is equally likely, so we find the number of states of the total system accessible to micro-state n to be given by,

e^{S′(E_tot)/k} e^{−E_n/kT}   (3.36)

As a result, the probability of finding the whole system in a state such that S is in the micro-state n is given by the ratio,

ρ_n = e^{−E_n/kT} / Z(T)      Z(T) = Σ_n e^{−E_n/kT}   (3.37)

Note that in these probabilities, all reference to the heat bath has vanished, except for its temperature. In classical mechanics, the possible micro-states are labelled by the generalized coordinates (p, q), the corresponding energy being given by the Hamiltonian H(p, q). The probability becomes the normalized density function, so that we recover (3.22).
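The counting argument above is easy to illustrate numerically. In this hedged sketch (the bath model and all numbers are illustrative, not from the text), the heat bath is a collection of M independent two-level units of energy spacing ε, with ε = k = 1; the micro-canonical weight Ω′(E_tot − E_n) of each system state then reproduces the Boltzmann factor, with the temperature read off from the bath entropy S′ = k ln Ω′:

```python
from math import comb, exp, log

# Hedged numerical illustration: a two-level system S with E_n in {0, eps}
# in contact with a bath of M independent two-level units of energy 0 or eps
# (units eps = k = 1).  Counting bath micro-states Omega'(E_tot - E_n)
# reproduces the Boltzmann ratio rho_1/rho_0 = exp(-eps/kT).
M, E_tot = 10_000, 3_000                      # bath size and total energy

def omega_bath(E):
    """Number of bath micro-states with E units of energy excited."""
    return comb(M, E)

# Micro-canonical weight of each system state n is Omega'(E_tot - E_n):
w0, w1 = omega_bath(E_tot - 0), omega_bath(E_tot - 1)
ratio_counting = w1 / w0

# Bath inverse temperature: 1/kT = d(ln Omega')/dE' = ln((M - E')/E') (Stirling)
beta = log((M - E_tot) / E_tot)
ratio_boltzmann = exp(-beta)

print(abs(ratio_counting - ratio_boltzmann) < 1e-3)   # -> True: they agree
```

The two ratios differ only by terms of order 1/M, which is the content of neglecting the second derivative term in (3.33).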
3.11 The Gibbs and the grand-canonical ensembles
The Gibbs ensemble is a generalization of the canonical ensemble in which time-independent external generalized forces may be applied to the system, and work may be exchanged between systems in contact. Each type of generalized force f has a conjugate generalized displacement x, together forming a pair of thermodynamic variables. The variable x is a function of phase space. Most frequently x is of the form x = x(p, q), though when x is the number of particles N, or the volume V of space, the functional dependence is somewhat unusual. Classic examples of thermodynamic paired variables are listed in Table 1.
System      generalized force f       generalized displacement x, X
Fluid       pressure P                volume V
Film        surface tension           area A
String      string tension            length L
Magnetic    magnetic field B          magnetization M
Electric    electric field E          charge q, polarization P
Chemical    chemical potential μ      particle number N

Table 1: Paired thermodynamic variables
We shall assume that the generalized forces f on the system are kept fixed in time, as is required for an equilibrium ensemble. The work done on the system by a generalized displacement x is then linear in x, and given by W = f x. The corresponding change in the Hamiltonian is given by H(p, q) → H(p, q) − f x(p, q). (If several generalized forces f_α are applied, we will instead have a sum Σ_α f_α x_α(p, q).) The distribution function is given by,

ρ(p, q; T, f) = (1/Z(T, f)) e^{−(H(p,q) − f x(p,q))/kT}   (3.38)
The normalization factor Z(T, f) is referred to as the grand canonical partition function,

Z(T, f) = ∫_Γ dp dq e^{−(H(p,q) − f x(p,q))/kT}   (3.39)
Statistical averages in the Gibbs ensemble are obtained by,

⟨f⟩ = (1/Z(T, f)) ∫_Γ dp dq e^{−(H(p,q) − f x(p,q))/kT} f(p, q)   (3.40)
In particular, the ensemble average of the generalized displacement variable x will be denoted by X = ⟨x⟩. The Gibbs free energy is defined by,

G(T, f) = E − TS − f X   (3.41)
and is related to the partition function by,

G(T, f) = −kT ln Z(T, f)   (3.42)
When the exchange of particles is permitted, the corresponding Gibbs ensemble is sometimes
referred to as the grand-canonical ensemble.
4 Applications of Classical Statistical Ensembles
4.1 Application I: The Maxwell distribution
The density function in the canonical ensemble may be readily used to derive the Maxwell distribution for the momenta or velocities in a classical ideal gas. The particles being non-interacting, we can focus on the velocity distribution of a single particle, the case of N particles being obtained by multiplying the single-particle probabilities and partition functions. The single-particle partition function Z_1 is given by,

Z_1(T) = ∫ d³p d³q/(2πℏ)³ e^{−H(p,q)/kT}   (4.1)
The Hamiltonian is that of a free non-relativistic particle, H(p, q) = p²/2m, with no dependence on q. Thus, the q-integral may be carried out and yields the spatial volume V. Changing variables from momentum to velocity, p = mv, we have,

Z_1(T) = ( m³ V/(2πℏ)³ ) ∫ d³v e^{−mv²/2kT}   (4.2)
It is straightforward to evaluate the integral, and we find,

Z_1(T) = ( m³ V/(2πℏ)³ ) ( 2πkT/m )^{3/2}   (4.3)
As a result, the probability distribution for the speed v of one particle is given by,

ρ_Max(v) = ( m/2πkT )^{3/2} 4π v² e^{−mv²/2kT}      ∫_0^∞ dv ρ_Max(v) = 1   (4.4)
This is the Maxwell distribution. It is readily checked that the average velocity ⟨v⟩ vanishes, while the average velocity squared, and thus the average energy, are given by,

⟨v²⟩ = ∫_0^∞ dv ρ_Max(v) v² = 3kT/m      ⟨H⟩ = (3/2) kT   (4.5)

as expected from the equipartition theorem.
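A quick numerical check (illustrative, in units k = m = 1) confirms that (4.4) is normalized and reproduces ⟨v²⟩ = 3kT/m of (4.5):

```python
import numpy as np

# Illustrative check of the Maxwell distribution (4.4)-(4.5) in units k = m = 1:
# integrate rho_Max(v) on a fine grid and verify the normalization and <v^2>.
T = 1.5
v = np.linspace(0.0, 30.0, 30_001)
rho = (1.0 / (2.0 * np.pi * T)) ** 1.5 * 4.0 * np.pi * v**2 * np.exp(-v**2 / (2.0 * T))
dv = v[1] - v[0]
norm = rho.sum() * dv
v2 = (rho * v**2).sum() * dv
print(round(float(norm), 4), round(float(v2), 4))   # -> 1.0 4.5, i.e. 3kT/m
```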
4.2 Application II: Magnetism in classical statistical mechanics
Magnetic fields couple to matter in two ways: first, to the currents of moving electric charges; second, to intrinsic magnetic dipole moments. The first effect is diamagnetic, the second paramagnetic. The intrinsic magnetic moments of elementary particles are a purely quantum effect, at the same level as their spin. The magnetic coupling to electric charge is governed by the minimally coupled Hamiltonian for charged particles,

H(p, q) = Σ_{i=1}^N (1/2m_i) ( p_i − e_i A_i(q) )² + U(q)   (4.6)
Here, U(q) is any potential contribution which depends only on the generalized positions q; A_i(q) is the vector potential acting on particle i; and e_i, m_i are respectively the electric charge and mass of particle i. Note that the vector potential may include the effects of an external magnetic field, as well as the effects of internal magnetic interactions. We begin by computing the partition function,

Z(T) = 1/((2πℏ)^{3N} N!) ∫ Π_{i=1}^N d³p_i d³q_i e^{−H(p,q)/kT}   (4.7)
It may be decomposed as follows,

Z(T) = 1/((2πℏ)^{3N} N!) ∫ Π_{i=1}^N d³q_i e^{−U(q)/kT} Π_{i=1}^N ∫ d³p_i exp{ −( p_i − e_i A_i(q) )² / 2m_i kT }   (4.8)
Each integration measure d³p_i is invariant under translation by an arbitrary vector, which may depend on i. In particular, it is invariant under the shifts,

p_i → p′_i = p_i − e_i A_i(q)   (4.9)
Performing these changes of variables, we obtain,

Z(T) = 1/((2πℏ)^{3N} N!) ∫ Π_{i=1}^N d³q_i e^{−U(q)/kT} Π_{i=1}^N ∫ d³p_i exp{ −p_i² / 2m_i kT }   (4.10)
All magnetic effects have vanished from the partition function. This result is referred to as the Bohr-Van Leeuwen theorem. In particular, all thermodynamic functions are independent of external magnetic fields. Thus, there is no diamagnetism in classical statistical mechanics. Insofar as intrinsic magnetic moments are purely quantum effects, there are then, strictly speaking, also no paramagnetic effects classically.
4.3 Application III: Diatomic ideal gasses (classical)
The simplest ideal gas model only included translational degrees of freedom, as would be
suitable for point-like microscopic constituents. Single atoms can of course also have excited
quantum states, but these start to be relevant only at relatively high temperatures. For example, the first excited state of Hydrogen is 10.2 eV above the ground state, corresponding to a temperature of order 10^5 K. Polyatomic molecules do, however, have low-lying extra degrees of freedom, such as rotational and vibrational ones, which are relevant at room temperature.
Consider a simple model of an ideal gas composed of diatomic molecules. The atoms in each molecule can rotate in the two directions perpendicular to the symmetry axis of the molecule with moment of inertia I, while we will neglect the moment of inertia of rotations about the symmetry axis. The atoms in each molecule can also vibrate by changing their relative distance, and we shall model these vibrations by a harmonic oscillator of frequency ω. The translational, rotational and vibrational degrees of freedom are all decoupled from one another, so that their Hamiltonians add up, and their phase space measures factorize. Thus, the partition function of an individual molecule (recall that the gas is ideal, so the total partition function is given by the N-th power and a prefactor of 1/N!) factorizes,
Z_1(T) = Z_transl Z_rot Z_vib   (4.11)

The factor Z_transl was already computed, and is given by (4.3).
The rotational degrees of freedom may be parametrized by the angles (θ, φ) on the sphere, and the associated Lagrangian is given by,

L = (I/2) ( θ̇² + φ̇² sin²θ )   (4.12)
The conjugate momenta are p_θ = I θ̇ and p_φ = I φ̇ sin²θ, and the Hamiltonian is given by,

H_rot = p_θ²/2I + p_φ²/( 2I sin²θ )   (4.13)
Carefully including the measure with the standard normalization, we find,

Z_rot = ∫ dθ dφ dp_θ dp_φ /(2πℏ)² e^{−H_rot/kT} = 2IkT/ℏ²   (4.14)
Using (3.25), we derive the contribution from the rotational energy per molecule,

E_rot = kT   (4.15)

in accord with the equipartition theorem for 2 rotational degrees of freedom.
Finally, there is a single vibrational degree of freedom for the relative distance between the two atoms, which we will denote by x, and whose Hamiltonian is given by,

H_vib = p_x²/2m + (1/2) m ω² x²   (4.16)
from which we deduce,

Z_vib = kT/ℏω      E_vib = kT   (4.17)
The factor of 2 in this contribution compared with the naive expectation from equipartition is due to the presence of the potential energy. Adding it all up, we find the internal energy of the gas as follows,

E = N ( E_transl + E_rot + E_vib ) = (7/2) NkT   (4.18)
In particular, the specific heat is found to be,

C_V = (7/2) Nk   (4.19)
This prediction is very simple and quite distinct from the one for mono-atomic gasses. So, one immediate question is: does (4.19) agree with experiment? For most gasses at room temperature, the answer is no; instead, their specific heat is more in line with the prediction for a mono-atomic gas. The explanation lies, even at room temperature, in quantum effects, which we shall study in the next section.
4.4 Quantum vibrational modes
Consider the vibrational effects first. Classically, the oscillator can carry any small amount of energy above that of its rest position. But this is not so in quantum mechanics, since the energy levels of the oscillator are discrete,

ε_n = ( 1/2 + n ) ℏω      n = 0, 1, 2, 3, …   (4.20)
Thus, the minimal amount of energy (or quantum) above the ground state is ℏω, which is finite. This introduces a characteristic temperature scale in the problem,

T_vib = ℏω/k   (4.21)
As a consequence, we expect that at temperatures below T_vib, the vibrational mode will be strongly suppressed by the Boltzmann factor, and will not contribute to the specific heat.
We shall now work out the quantum contribution from the vibrational modes quantitatively. The partition function is given by,

Z_vib(β) = Σ_{n=0}^∞ e^{−β ε_n} = e^{−βℏω/2} / ( 1 − e^{−βℏω} )   (4.22)
The free energy and internal energy for the vibrational mode are given by,

F_vib = −kT ln Z_vib = (1/2) ℏω + kT ln ( 1 − e^{−βℏω} )

E_vib = −(∂/∂β) ln Z_vib = ℏω ( 1/2 + 1/( e^{βℏω} − 1 ) )   (4.23)
The specific heat is then readily evaluated, and we find,

C_vib = ∂E_vib/∂T = k (T_vib/T)² e^{T_vib/T} / ( e^{T_vib/T} − 1 )²   (4.24)

The contribution to the total specific heat from the N vibrational modes is simply C_V = N C_vib. The behavior of this function may be inferred from its asymptotic behavior,
T ≫ T_vib      C_V ≈ kN

T ≪ T_vib      C_V ≈ kN (T_vib/T)² e^{−T_vib/T}   (4.25)

which is clearly reflected in a numerical plot of the function, given in Figure 5. The large T contribution corresponds to the classical behavior found in the previous section. Below T_vib, the vibrational contribution to the specific heat turns off quickly.
Figure 5: Contribution to the specific heat from a quantum vibrational mode (plotted against x = T/T_vib), together with the contribution from the quantum rotational modes (plotted against x = T/T_rot).
4.5 Quantum rotational modes
A similar story applies to the rotational modes. Their contribution to the specific heat will be cut off below a characteristic temperature T_rot as well, since the energy levels of the quantum rotor are again discrete. The energy levels of a (simplified) spherical model are given by ε_ℓ = ℓ(ℓ + 1) ℏ²/2I with degeneracy 2ℓ + 1, so that the partition function is given by,

Z_rot = Σ_{ℓ=0}^∞ (2ℓ + 1) e^{−ℓ(ℓ+1) T_rot/T}      T_rot = ℏ²/2Ik   (4.26)
Analytically, this sum would be given by an elliptic function, but we can easily plot the function numerically, as shown in Figure 5. It is useful to get some order of magnitude estimates for the transition temperatures. For molecular Hydrogen, H₂, we have,

T_vib ≈ 1500 K      ω ≈ 2 × 10¹⁴ s⁻¹
T_rot ≈ 100 K       I ≈ 0.46 × 10⁻⁴⁰ erg s²   (4.27)

These orders of magnitude demonstrate that quantum effects are relevant at relatively low as well as at relatively high temperatures.
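The truncated sum (4.26) is straightforward to evaluate numerically. The sketch below (in units of k, with x = T/T_rot, and finite differences standing in for the derivatives, an illustrative shortcut) recovers the classical equipartition value at high temperature:

```python
import numpy as np

# Numerical sketch of the rotational partition sum (4.26), per molecule and
# in units of k, with x = T/T_rot; the specific heat follows from
# C/k = d/dx [ x^2 d(ln Z_rot)/dx ], evaluated here by finite differences.
def z_rot(x, lmax=200):
    l = np.arange(lmax + 1)
    return float(np.sum((2 * l + 1) * np.exp(-l * (l + 1) / x)))

def c_rot(x, h=1e-4):
    s = lambda y: y**2 * (np.log(z_rot(y + h)) - np.log(z_rot(y - h))) / (2 * h)
    return (s(x + h) - s(x - h)) / (2 * h)

print(round(float(c_rot(5.0)), 2))   # -> 1.0, the classical equipartition value
```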
5 Quantum Statistical Ensembles
In this section, we shall define ensembles in quantum mechanics, distinguish between pure and mixed ensembles, and introduce the density matrix. We have set up the classical case in such a way that it will resemble most closely the treatment of the quantum case. The uniform, micro-canonical and canonical ensembles are then introduced for the quantum case.
5.1 Quantum Mechanics
In quantum mechanics, physical states are represented by vectors in a Hilbert space H, and denoted by bras and kets following Dirac. Two vectors |ψ⟩ and |ψ′⟩ correspond to the same physical state if and only if there exists a non-zero complex number λ such that |ψ′⟩ = λ |ψ⟩. Observables are represented by self-adjoint linear operators on H. A state |φ_i⟩ has a definite measured value a_i for an observable A provided |φ_i⟩ is an eigenstate of A with eigenvalue a_i.

Let |ψ⟩ be an arbitrary state in H, and let {|φ_i⟩} denote a set of orthonormal states. Then, the probability P_i for measuring the state |ψ⟩ to be in one of the states |φ_i⟩ is given by,

P_i = |⟨φ_i|ψ⟩|²      ⟨ψ|ψ⟩ = 1   (5.1)
Finally, in the Schrödinger picture of quantum mechanics, observables are time-independent and states |ψ(t)⟩ evolve in time according to the Schrödinger equation associated with a Hamiltonian H, which is itself an observable,

iℏ (d/dt) |ψ(t)⟩ = H |ψ(t)⟩   (5.2)
Although these statements certainly provide the basic principles of quantum mechanics, not
all physical states of systems encountered in Nature can be properly described by vectors
in a Hilbert space. To make the distinction, one refers to a state described by a vector in
Hilbert space as a pure state. Instances where the vector description is not adequate include,
1. Incoherent mixtures of pure states;
2. Subsystems of a system in a pure quantum state.
Understanding both of these cases will turn out to be of fundamental importance in quantum
statistical mechanics, and we shall begin by discussing them in some detail.
5.2 Mixed quantum states
The interference properties of quantum mechanics may be illustrated using the physical examples of polarized beams of photons or of spin 1/2 particles. In either set-up, the first piece of apparatus used is usually a polarizer, responsible for filtering a definite polarization out of an unpolarized beam. Particles in a polarized beam are pure quantum states. In particular, a polarized beam of Silver atoms in the Stern-Gerlach experiment should be viewed as an ensemble of spins which are all in the same quantum state. The states in the beam are said to be coherent and to form a pure ensemble.
What we have not yet provided in quantum mechanics is a mathematical description for a beam which is unpolarized, or is said to be in a mixed state. The defining property of an unpolarized beam of spin 1/2 particles is that measurement of the spin operator n·S along any direction n, with n² = 1, gives ±ℏ/2 with equal probabilities, namely 50% each.

We begin by showing that an unpolarized beam cannot be described mathematically by a vector in Hilbert space. Let us assume the contrary, and represent the particles in the unpolarized beam by a vector |ψ⟩. The probability P_n for observing the spins in a quantum state |n, +⟩, which is the eigenstate of n·S with eigenvalue +ℏ/2, would then be given by P_n = |⟨n, +|ψ⟩|². To compute this probability, we express the normalized states |n, +⟩ and |ψ⟩ in a fixed orthonormal basis {|+⟩, |−⟩} of the Hilbert space, parametrized as follows,

|n, +⟩ = cos θ e^{iφ} |+⟩ + sin θ e^{−iφ} |−⟩
|ψ⟩ = cos α e^{iβ} |+⟩ + sin α e^{−iβ} |−⟩   (5.3)
The probability P_n is then given by,

P_n = cos²θ cos²α + sin²θ sin²α + 2 cos θ sin θ cos α sin α cos(2φ − 2β)   (5.4)

For an unpolarized beam, P_n must be independent of both θ and φ. Independence of φ requires sin 2α = 0 (since θ is to be arbitrary). Its solutions α = 0, π/2 respectively give P_n = cos²θ and P_n = sin²θ, neither of which can be independent of θ. We conclude that an unpolarized beam cannot be described mathematically by a state in Hilbert space.
5.3 Subsystems of a quantum system in a pure state
Consider a quantum system whose Hilbert space H is the tensor product of two Hilbert spaces H_A and H_B. A simple example is provided by A and B both being two-state systems. Here, H_A and H_B may be generated by the orthonormal states |±, A⟩ and |±, B⟩, in which the operators S_A^z and S_B^z are diagonal and take eigenvalues ±ℏ/2. Consider a pure state |ψ⟩ of the system H; for example, the singlet state is given by,

|ψ⟩ = (1/√2) ( |+, A⟩ ⊗ |−, B⟩ − |−, A⟩ ⊗ |+, B⟩ )   (5.5)
Now suppose that we measure observables in the subsystem A only, such as for example S_A^z. As such, this question does not quite make sense yet, because S_A^z is an operator in H_A, but not in H. Since we make no observations on system B, the only natural way to extend S_A^z to an operator on H is by using the identity matrix I_B in H_B,

S_A^z → S̃_A^z = S_A^z ⊗ I_B   (5.6)
The expectation value of S̃_A^z in the pure quantum state |ψ⟩ is then given as follows,

⟨ψ| S̃_A^z |ψ⟩ = (1/2) ( ⟨+, A|S_A^z|+, A⟩ + ⟨−, A|S_A^z|−, A⟩ ) = 0   (5.7)

using the orthonormality of the states |±, B⟩. In fact, by the same reasoning, one shows that we also have ⟨ψ| n·S̃_A |ψ⟩ = 0 for any n.
Remarkably, we have established that observing a subsystem of a total system which is in a pure quantum state produces quantum expectation values which are identical to those of a mixture of pure quantum states. Thus, the problem of representing mixtures of pure quantum states and the problem of observing only a subsystem are closely related to one another, and neither of them permits a description by vectors in Hilbert space alone.
5.4 The density matrix
A new mathematical tool is needed to describe quantum systems in a mixture, or observed
only by the observables of a subsystem. This formalism was introduced by Landau and von
Neumann in 1927, and the key object is referred to as the density matrix.
A pure ensemble, by definition, is a collection of physical systems such that every member of the ensemble is characterized by the same element |ψ⟩ in Hilbert space.

A mixed ensemble, by definition, is a collection of physical systems such that a fraction w_1 of the members is characterized by a pure state |ψ_1⟩, a fraction w_2 of the members is characterized by a pure state |ψ_2⟩, and so on. We shall assume that each pure state ket |ψ_i⟩ is normalized, but the vectors |ψ_1⟩, |ψ_2⟩, … do not need to be mutually orthogonal to one another. Thus, a mixed ensemble is characterized by N vectors |ψ_i⟩ for i = 1, …, N, each one representing a pure state, and entering into the mixed state with a population fraction w_i ≥ 0, which quantitatively indicates the proportion of state |ψ_i⟩ in the mixture. The population fractions are normalized by,

Σ_{i=1}^N w_i = 1   (5.8)
A mixture is an incoherent superposition of pure states, which means that all relative phase information of the pure states must be lost in the mixture. This is achieved by superimposing, not the states |ψ_i⟩ in Hilbert space, but rather the projection operators |ψ_i⟩⟨ψ_i| associated with each pure state. The density matrix ρ for an ensemble of N pure states |ψ_i⟩ incoherently superimposed with population fractions w_i is defined by,

ρ = Σ_{i=1}^N |ψ_i⟩ w_i ⟨ψ_i|   (5.9)
Since the superpositions involved here are incoherent, the weights w_i may be thought of as classical probability weights assigned to each population. We stress that the various different pure states |ψ_i⟩ are not required to be orthogonal to one another, since it should certainly be possible to superimpose pure states which are not orthogonal to one another.
The following properties of the density matrix immediately result,

1. Self-adjointness: ρ† = ρ;
2. Unit trace: Tr(ρ) = 1;
3. Non-negativity: ⟨φ|ρ|φ⟩ ≥ 0 for all |φ⟩ ∈ H;
4. A state corresponding to ρ is pure (rank 1) if and only if ρ is a projection operator.

Conversely, any operator on H satisfying properties 1, 2, 3 above is of the form (5.9) with normalization (5.8). The above three conditions guarantee that ρ can be diagonalized in an orthonormal basis {|φ_i⟩}, with real non-negative eigenvalues p_i,

ρ = Σ_i |φ_i⟩ p_i ⟨φ_i|      Σ_i p_i = 1   (5.10)
Note that |φ_i⟩ need not be proportional to |ψ_i⟩, and p_i need not coincide with w_i. Thus, a given density matrix will have several equivalent representations in terms of pure states.

To check property 4, we note that if ρ corresponds to a pure state, it is rank 1, and thus a projection operator in view of property 2. Conversely, substituting the expression (5.10) into the projection operator relation ρ² = ρ requires p_i (p_i − 1) = 0 for all i. This condition is solved by either p_i = 0 or p_i = 1, but Tr(ρ) = 1 requires that p_i equal 1 for a single i.
Using the above properties of density matrices, we construct the general density matrix for a 2-state system, for which the Hilbert space is 2-dimensional. The density matrix is then a 2 × 2 matrix, and may be expressed in terms of the Pauli matrices σ = (σ₁, σ₂, σ₃),

ρ = ( I + a·σ )/2   (5.11)

where a is a real 3-vector. Self-adjointness and normalization are guaranteed by (5.11), while positivity requires |a| ≤ 1. Pure states precisely correspond to the boundary |a| = 1,
while any density matrix with |a| < 1 corresponds to a mixed state which is not pure. The ensemble average of the spin operator n·S in the direction n is readily calculated,

⟨n·S⟩ = Tr( ρ n·S ) = (ℏ/2) n·a   (5.12)
The above formula confirms that the state is unpolarized for a = 0, partially polarized for 0 < |a| < 1, and pure for |a| = 1. This gives a nice geometrical representation of the space of density matrices for the two-state system.
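The parametrization (5.11)-(5.12) is easily verified numerically; a small illustrative sketch (ℏ = 1, with an arbitrarily chosen Bloch vector a):

```python
import numpy as np

# Illustrative numpy check of (5.11)-(5.12), with hbar = 1: rho = (I + a.sigma)/2
# has unit trace and non-negative eigenvalues for |a| <= 1, and the spin
# average along n is (1/2) n.a.
sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

a = np.array([0.3, -0.4, 0.5])                    # |a| < 1: a mixed state
rho = 0.5 * (np.eye(2) + np.einsum("i,ijk->jk", a, sigma))

n = np.array([0.0, 0.0, 1.0])                     # measure the spin along z
spin_avg = 0.5 * np.trace(rho @ np.einsum("i,ijk->jk", n, sigma)).real

print(round(np.trace(rho).real, 6), round(float(spin_avg), 6))   # -> 1.0 0.25
```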
5.5 Ensemble averages and time-evolution
In a pure normalized state |ψ⟩, we defined the expectation value of an observable A by the matrix element ⟨ψ|A|ψ⟩ = Tr( A |ψ⟩⟨ψ| ), a quantity that gives the quantum mechanical weighed probability average of the eigenvalues of A. In a mixed state, these quantum mechanical expectation values must be further weighed by the population fraction of each pure state in the mixture. One denotes this double quantum and statistical average by either ⟨A⟩ or by Ā, and one defines the ensemble average of the observable A by,

⟨A⟩ = Ā = Σ_i w_i ⟨ψ_i|A|ψ_i⟩ = Tr(ρ A)   (5.13)
It follows from the properties of the density matrix that the ensemble average of any self-adjoint operator is real. Note that formula (5.13) holds whether ρ corresponds to a pure state or a mixture.

The time-evolution of the density matrix may be derived from the Schrödinger equation for pure states. In the Schrödinger picture, the time evolution of a quantum state |ψ(t)⟩ is governed by a self-adjoint Hamiltonian operator H on H via the Schrödinger equation (5.2). Assuming that the population fractions w_i of (5.9), or equivalently the p_i of (5.10), do not change in time during this evolution, the density matrix ρ(t) will obey the following time evolution equation,

iℏ (d/dt) ρ(t) = [ H, ρ(t) ]   (5.14)

To prove this formula, one first derives the time evolution equation for |ψ(t)⟩⟨ψ(t)| and then takes the weighed average with the time-independent population fractions. This time evolution may be solved in terms of the unitary evolution operator U(t) = e^{−itH/ℏ}, by,

ρ(t) = U(t) ρ(0) U(t)†   (5.15)

The normalization condition Tr(ρ) = 1 is automatically preserved under time evolution.
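Both (5.13) and the evolution law (5.15) can be checked on a small random example; the sketch below (ℏ = 1, with a randomly chosen 3-level Hamiltonian, all numbers illustrative) verifies that the evolution is unitary and preserves Tr ρ and the spectrum {p_i}:

```python
import numpy as np

# A small illustrative check (hbar = 1, a randomly chosen 3-level system)
# that the solution (5.15) of the evolution equation (5.14) is unitary and
# preserves both Tr(rho) and the eigenvalues p_i of the density matrix.
rng = np.random.default_rng(1)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = 0.5 * (X + X.conj().T)                    # a self-adjoint Hamiltonian

w = np.array([0.5, 0.3, 0.2])                 # population fractions, sum to 1
V = np.linalg.qr(rng.normal(size=(3, 3)))[0]  # orthonormal states as columns
rho0 = (V * w) @ V.conj().T                   # rho = sum_i |psi_i> w_i <psi_i|

e, P = np.linalg.eigh(H)
t = 0.7
U = P @ np.diag(np.exp(-1j * t * e)) @ P.conj().T   # U(t) = exp(-i t H)
rho_t = U @ rho0 @ U.conj().T                       # time-evolved density matrix

print(round(np.trace(rho_t).real, 6))               # -> 1.0
print(np.allclose(np.sort(np.linalg.eigvalsh(rho_t)), np.sort(w)))   # -> True
```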
5.6 The density matrix for a subsystem of a pure state
Next, we return to the problem of observing a subsystem with Hilbert space H_A of a total system with Hilbert space H = H_A ⊗ H_B. Let {|φ_i, A⟩} be an orthonormal basis in H_A, and {|φ_j, B⟩} an orthonormal basis in H_B, with i = 1, …, dim H_A and j = 1, …, dim H_B. The states |φ_i, A⟩ ⊗ |φ_j, B⟩ then form an orthonormal basis in H = H_A ⊗ H_B, with the above ranges for i, j. A normalized pure quantum state |ψ⟩ in H may be expressed in this basis by,

|ψ⟩ = Σ_{i,j} ψ_{ij} |φ_i, A⟩ ⊗ |φ_j, B⟩      Σ_{i,j} |ψ_{ij}|² = 1   (5.16)
where ψ_{ij} are complex coefficients. Let us now compute the purely quantum mechanical expectation value of an observable A of the subsystem H_A in the pure state |ψ⟩ ∈ H. Making no observations in subsystem B may be represented by extending the operator A on H_A to an operator Ã on H by letting,

A → Ã = A ⊗ I_B   (5.17)

where I_B is the identity operator in H_B. This is a generalization of a similar extension we used for the two-state system in (5.6). Evaluating the expectation value of Ã in the pure state |ψ⟩ of H, we find,

⟨ψ|Ã|ψ⟩ = Σ_{i,j} Σ_{i′,j′} ψ*_{i′j′} ψ_{ij} ( ⟨φ_{i′}, A| ⊗ ⟨φ_{j′}, B| ) ( A ⊗ I_B ) ( |φ_i, A⟩ ⊗ |φ_j, B⟩ )   (5.18)

Using the orthogonality relation ⟨φ_{j′}, B|φ_j, B⟩ = δ_{j,j′}, and introducing the linear operator ρ_A on H_A, defined by,

ρ_A = Σ_{i,i′,j} |φ_i, A⟩ ψ_{ij} ψ*_{i′j} ⟨φ_{i′}, A|   (5.19)

we see that we have,

⟨ψ|Ã|ψ⟩ = Tr_{H_A} ( ρ_A A )   (5.20)
The operator ρ_A is self-adjoint, satisfies Tr_{H_A}(ρ_A) = 1, and is positive. Thus, ρ_A qualifies as a density matrix. The density matrix ρ_A is a projection operator, and corresponds to a pure state of H_A, if and only if it has rank 1, which in turn requires the matrix ψ_{ij} to have rank 1. In all other cases, ρ_A corresponds to a non-trivial mixture.
A quicker, but more formal, derivation of the above calculation makes use of the relation,

Tr_H O = Tr_{H_A} ( Tr_{H_B} O )   (5.21)

valid for any operator O on H = H_A ⊗ H_B, as well as the relation ⟨ψ|Ã|ψ⟩ = Tr_H ( Ã |ψ⟩⟨ψ| ). Applying both of these relations gives (5.20) with,

ρ_A = Tr_{H_B} ( |ψ⟩⟨ψ| )   (5.22)

It is readily checked that the right hand formula for ρ_A reproduces (5.19).
49
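The partial-trace construction is easy to check numerically on a small example. The sketch below (plain Python; the two-qubit state and the observable are hypothetical choices made for illustration, not taken from the text) builds the reduced density matrix from the coefficients psi_{ij} as in (5.19), and verifies that its trace is 1, that the direct expectation value (5.18) agrees with Tr(rho_A A) of (5.20), and that rho_A is a non-trivial mixture (purity 1/2) even though the total state is pure.

```python
# Hypothetical example: dim H_A = dim H_B = 2, with psi[i][j] the
# coefficients psi_{ij} of (5.16) for a maximally entangled state.
psi = [[1.0, 0.0], [0.0, 1.0]]
norm = sum(abs(psi[i][j]) ** 2 for i in range(2) for j in range(2)) ** 0.5
psi = [[psi[i][j] / norm for j in range(2)] for i in range(2)]

# Reduced density matrix, eq. (5.19): (rho_A)_{i i'} = sum_j psi_{ij} psi*_{i'j}
rhoA = [[sum(psi[i][j] * psi[ip][j].conjugate() for j in range(2))
         for ip in range(2)] for i in range(2)]
trace = rhoA[0][0] + rhoA[1][1]

# Purity Tr(rho_A^2); a value < 1 signals a non-trivial mixture
purity = sum(rhoA[i][k] * rhoA[k][i] for i in range(2) for k in range(2))

# An observable A on H_A, extended to A x I_B as in (5.17)
A = [[1.0, 0.0], [0.0, -1.0]]

# <psi| A x I_B |psi> computed directly in the product basis, eq. (5.18)
expect_direct = sum(psi[ip][j].conjugate() * A[ip][i] * psi[i][j]
                    for i in range(2) for ip in range(2) for j in range(2))

# Tr_{H_A}(rho_A A), eq. (5.20); the two expectation values must agree
expect_reduced = sum(rhoA[i][ip] * A[ip][i] for i in range(2) for ip in range(2))
```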
5.7 Statistical entropy of a density matrix

The density matrix for an ensemble of $n$ pure (orthonormal) states $|\phi_i\rangle$ occurring with probability $p_i$ in the ensemble is given by,

$\rho = \sum_{i=1}^n |\phi_i\rangle\, p_i\, \langle\phi_i|$   $\sum_{i=1}^n p_i = 1$   (5.23)

When all probabilities are equal and given by $p_i = 1/n$, the density matrix corresponds to the uniform ensemble, and the entropy is given by Boltzmann's formula $S = k \ln n$. Let us now compute the entropy when the probabilities are not all equal.

To do so, we follow Gibbs again. Instead of considering just one system, consider a very large number $\mathcal{N}$ of identical systems, and let us compute the entropy not just for one system, but for all $\mathcal{N}$ of them. By the definition of what probability means for large $\mathcal{N}$, we then see that of those $\mathcal{N}$ systems, $\mathcal{N}_i = p_i \mathcal{N}$ will be in state $|\phi_i\rangle$, with

$\sum_{i=1}^n \mathcal{N}_i = \mathcal{N}$   $\mathcal{N}_i \geq 0$   (5.24)

The number of micro-states that can realize a macro-state with the $\mathcal{N}_i$ systems in state $|\phi_i\rangle$ is then given by a standard combinatorial formula,

$\Omega_{\mathcal{N}} = \frac{\mathcal{N}!}{\mathcal{N}_1! \cdots \mathcal{N}_n!}$   (5.25)

This expression and the corresponding entropy $S_{\mathcal{N}}$ for the ensemble of $\mathcal{N}$ systems may be recast in terms of the probabilities $p_i$ and the total number of copies $\mathcal{N}$, and we find,

$S_{\mathcal{N}} = k \ln \Omega_{\mathcal{N}} = k \Big( \ln \mathcal{N}! - \sum_{i=1}^n \ln (p_i \mathcal{N})! \Big)$   (5.26)

Since $\mathcal{N}$ is very large, we may use the Stirling formula $\ln m! \approx m \ln m - m$, and find,

$S_{\mathcal{N}} = k \Big( \mathcal{N} \ln \mathcal{N} - \mathcal{N} - \sum_{i=1}^n \big( p_i \mathcal{N} \ln (p_i \mathcal{N}) - p_i \mathcal{N} \big) \Big)$   (5.27)

The term $\mathcal{N} \ln \mathcal{N} - \mathcal{N}$ cancels since the $p_i$ sum to 1, and we are left with,

$S_{\mathcal{N}} = -k \mathcal{N} \sum_{i=1}^n p_i \ln p_i$   (5.28)

We find that the entropy for the $\mathcal{N}$ systems of the ensemble is simply proportional to $\mathcal{N}$, as should be expected. Thus, it makes sense to extract the entropy $S$ for a single system by dividing $S_{\mathcal{N}}$ by $\mathcal{N}$ and setting $S = S_{\mathcal{N}}/\mathcal{N}$, so that we find,

$S(p_1, \cdots, p_n) = -k \sum_{i=1}^n p_i \ln p_i$   (5.29)
or equivalently, in terms of the density matrix directly, we have,

$S(\rho) = -k\, {\rm Tr}(\rho \ln \rho)$   (5.30)

Setting all probabilities equal, $p_i = 1/n$, for the uniform ensemble, we recover $S = k \ln n$.

Some basic properties of the statistical entropy are as follows.

1. The above construction of the statistical entropy does not assume equilibrium.

2. Positivity: $S(p_1, \cdots, p_n) \geq 0$.

3. The minimum $S = 0$ is attained when all probability assignments are 0, except for a single entry $p_j = 1$. The entropy vanishes if and only if $\rho$ corresponds to a pure state.

4. The maximum $S_{\rm max} = k \ln n$ is attained when all probabilities are equal, $p_i = 1/n$.

5. Invariance under conjugation of the density operator by a unitary transformation. In particular, the entropy is invariant under time evolution, under the assumption that the probabilities $p_i$ remain unchanged in time.

6. Additivity upon combination of two subsystems which are statistically uncorrelated. Let the systems be described by Hilbert spaces $\mathcal{H}_a$ and $\mathcal{H}_b$, with density operators $\rho_a$ and $\rho_b$ respectively; then the full Hilbert space is $\mathcal{H}_{ab} = \mathcal{H}_a \otimes \mathcal{H}_b$ and the density matrix for the combined system is $\rho_{ab} = \rho_a \otimes \rho_b$. The entropy is then additive,

$S(\rho_{ab}) = S(\rho_a) + S(\rho_b)$   (5.31)

7. Subadditivity upon dividing a system with Hilbert space $\mathcal{H}_{ab}$ and density operator $\rho_{ab}$ into two subsystems with Hilbert spaces $\mathcal{H}_a$ and $\mathcal{H}_b$, and density matrices $\rho_a = {\rm Tr}_{\mathcal{H}_b}(\rho_{ab})$ and $\rho_b = {\rm Tr}_{\mathcal{H}_a}(\rho_{ab})$, which are statistically correlated. The full density operator $\rho_{ab}$ is not the tensor product of $\rho_a$ and $\rho_b$, in view of the non-trivial statistical correlations between the two subsystems. Instead, one only has the inequality,

$S(\rho_{ab}) \leq S(\rho_a) + S(\rho_b)$   (5.32)

The proofs of these properties will be developed in problem sets.

Entropy may be given a meaning beyond traditional statistical mechanics. In developing a theory of information around 1948, Claude Shannon was led to a generalized notion of entropy that characterizes the amount of missing information for a given ensemble. In the case of information theory, the ensembles consist of messages, sent in words and sentences. To make contact with the previous sections, a message may be viewed as a mixture of a certain number of letters and words, occurring with certain probabilities $p_i$. Shannon was led precisely to the entropy of (5.29) to characterize quantitatively the missing information.
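The entropy formula (5.29) and a few of its properties can be exercised directly. A minimal sketch (probability values chosen arbitrarily for illustration; $k = 1$) checks $S = k \ln n$ for the uniform ensemble, $S = 0$ for a pure state, maximality of the uniform case, and additivity (5.31) for uncorrelated subsystems:

```python
import math

def entropy(p, k=1.0):
    """Statistical entropy S = -k sum_i p_i ln p_i of eq. (5.29);
    terms with p_i = 0 contribute zero."""
    return -k * sum(pi * math.log(pi) for pi in p if pi > 0)

n = 4
s_uniform = entropy([1.0 / n] * n)        # uniform ensemble: S = k ln n
s_pure = entropy([1.0, 0.0, 0.0, 0.0])    # pure state: S = 0

# Additivity (5.31) for uncorrelated subsystems: p_ab is the product p_a p_b
pa = [0.7, 0.3]
pb = [0.5, 0.25, 0.25]
pab = [x * y for x in pa for y in pb]
s_additive = entropy(pab)
```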
5.8 The uniform and micro-canonical ensembles

The ensembles defined in classical statistical mechanics may be generalized to the quantum case. In this section, we shall briefly discuss the uniform and micro-canonical ensembles, and spend most of our attention on the canonical ensemble, whose usefulness is the greatest.

In the uniform ensemble, no macro-state information is given. By the principle of equal a priori probabilities, the density matrix is proportional to the identity operator in the full Hilbert space $\mathcal{H}$ of the quantum system, $\rho = I_{\mathcal{H}}/\Omega$, where $\Omega$ is the total number of states, which is equal to the dimension of $\mathcal{H}$. Clearly, the uniform density matrix is time-independent and corresponds to an equilibrium ensemble. The corresponding entropy is $S = k \ln \Omega$.

In the micro-canonical ensemble, the total energy of the system is fixed to be $E$. The density matrix has support only on eigenstates $|E_i, \alpha_i\rangle$ of the Hamiltonian with energy $E_i = E$, the $\alpha_i$ denoting degeneracies of the energy level $E_i$. In view of the principle of equal a priori probabilities, the weights of all the states $|E, \alpha\rangle$, for the various values of $\alpha$, are all the same, and given by the inverse of the total number of states $\Omega(E)$ at energy $E$. Thus, the density matrix may be expressed in terms of the projection operator $P_E$ onto states of energy $E$,

$\rho = \frac{P_E}{\Omega(E)} = \frac{1}{\Omega(E)} \sum_\alpha |E, \alpha\rangle \langle E, \alpha|$   (5.33)

The corresponding entropy is $S(E) = k \ln \Omega(E)$.
5.9 Construction of the density matrix in the canonical ensemble

In the canonical ensemble, the temperature $T$ is fixed. This may be achieved by putting the system in contact with another very large system (or heat bath), and letting the combined system reach equilibrium. Internal energy $E$ may then be exchanged with the heat bath, and its average value must then be determined so as to achieve temperature $T$. In terms of the density matrix, we have the equations,

${\rm Tr}(\rho) = 1$   ${\rm Tr}(\rho H) = E$   (5.34)

The normalization constraint has been made explicit here so that the variations of $\rho$ may be taken freely. Equilibrium will be achieved by maximizing the entropy, subject to the above constraints. Extremization subject to constraints is carried out mathematically with the help of Lagrange multipliers, one for each constraint. Thus, we shall extremize the combination $S(\rho) - \alpha\, {\rm Tr}(\rho) - k\beta\, {\rm Tr}(\rho H)$, and set,

$0 = \delta \big( S(\rho) - \alpha\, {\rm Tr}(\rho) - k\beta\, {\rm Tr}(\rho H) \big) = -k\, \delta\, {\rm Tr}(\rho \ln \rho) - \alpha\, {\rm Tr}(\delta\rho) - k\beta\, {\rm Tr}(\delta\rho\, H)$   (5.35)

The first term is calculated with the help of the mathematical identity,

$\delta\, {\rm Tr}(\rho \ln \rho) = {\rm Tr}\big( \delta\rho \ln \rho + \delta\rho \big)$   (5.36)

Putting all together, we find the following relation for $\delta\rho$,

$0 = {\rm Tr}\Big( \delta\rho\, \big( \ln \rho + (k + \alpha)/k\, I_{\mathcal{H}} + \beta H \big) \Big)$   (5.37)

The operators $\rho$, $\delta\rho$, $I_{\mathcal{H}}$, and $H$ are all self-adjoint. Satisfying this equation for all $\delta\rho$ then requires that $\ln \rho + \beta H$ be proportional to the identity, or,

$\rho = \frac{e^{-\beta H}}{Z(\beta)}$   (5.38)

The parameter $\beta$ must be identified with the inverse temperature, $\beta = 1/kT$. The partition function $Z(\beta)$ is determined by the normalization condition ${\rm Tr}\,\rho = 1$, and is thus given by,

$Z(\beta) = {\rm Tr}\big( e^{-\beta H} \big)$   (5.39)

where the trace extends over the full Hilbert space. The internal energy may now be computed directly from $Z(\beta)$ by,

$E(\beta) = {\rm Tr}(\rho H) = \frac{1}{Z(\beta)} {\rm Tr}\big( H\, e^{-\beta H} \big) = -\frac{\partial \ln Z(\beta)}{\partial \beta}$   (5.40)

and the free energy is found to be,

$F(\beta) = -kT \ln Z(\beta)$   (5.41)

Finally, we show that the thermodynamic definition of the entropy coincides with its statistical definition. To this end, we start with the thermodynamic relation $E = F + TS$ to obtain the entropy with the help of (5.40) and (5.41),

$S = \frac{1}{T} {\rm Tr}(\rho H) + k \ln Z$   (5.42)

Taking the logarithm of the operator equation (5.38), we have $\ln \rho = -\beta H - \ln Z$. Using this formula to eliminate $H$ in (5.42), we find,

$S = -k\, {\rm Tr}(\rho \ln \rho)$   (5.43)

reproducing the statistical definition of entropy.
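The canonical construction can be illustrated on a small system with a diagonal Hamiltonian; the three energy levels below are a hypothetical choice, with units in which $k = 1$ so that $T = 1/\beta$. The sketch checks (5.40), $E = -\partial \ln Z/\partial\beta$, by a finite difference, and that the thermodynamic entropy (5.42) agrees with the statistical entropy (5.43).

```python
import math

# Hypothetical three-level system; units with k = 1, so T = 1/beta.
eps = [0.0, 1.0, 3.0]
beta = 0.7

def lnZ(b):
    """ln Z(beta), with Z of eq. (5.39) for a diagonal Hamiltonian."""
    return math.log(sum(math.exp(-b * e) for e in eps))

Z = math.exp(lnZ(beta))
p = [math.exp(-beta * e) / Z for e in eps]    # Boltzmann weights, eq. (5.38)
E = sum(pe * e for pe, e in zip(p, eps))      # E = Tr(rho H)

# Check E = -d(ln Z)/d(beta), eq. (5.40), by a central finite difference
h = 1e-6
E_fd = -(lnZ(beta + h) - lnZ(beta - h)) / (2 * h)

# Thermodynamic entropy (5.42) versus statistical entropy (5.43)
F = -lnZ(beta) / beta                         # eq. (5.41): F = -kT ln Z
S_thermo = (E - F) * beta                     # S = (E - F)/T
S_stat = -sum(pe * math.log(pe) for pe in p)
```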
5.10 Generalized equilibrium ensembles

The derivation of the Boltzmann weights and associated density matrix given above corresponds to the canonical ensemble, in which only the energy of the system is kept constant on average. In the grand canonical ensemble, both the energy and the number of particles in the system are kept constant on average. More generally, we may consider an ensemble in which the ensemble averages of a number of commuting observables $A_i$, $i = 1, \cdots, K$, are kept constant. To compute the associated density operator of this ensemble, we extremize the entropy with respect to variations of $\rho$, under the constraint that the ensemble averages ${\rm Tr}(\rho A_i)$ are kept constant. Using again Lagrange multipliers $\lambda_i$, $i = 1, \cdots, K$, we extremize

$-{\rm Tr}\big( \rho \ln \rho \big) - \sum_{i=1}^K \lambda_i\, {\rm Tr}(\rho A_i)$   (5.44)

Upon enforcing the normalization ${\rm Tr}(\rho) = 1$, this gives,

$\rho = \frac{1}{Z} \exp\Big( -\sum_{i=1}^K \lambda_i A_i \Big)$   $Z = {\rm Tr}\, \exp\Big( -\sum_{i=1}^K \lambda_i A_i \Big)$   (5.45)

In the grand canonical ensemble, for example, these quantities are

$\rho = \frac{1}{Z}\, e^{-\beta H + \beta\mu N}$   $Z = {\rm Tr}\big( e^{-\beta H + \beta\mu N} \big)$   (5.46)

where $N$ is the number operator and $\mu$ is the chemical potential. Other observables whose ensemble averages are often kept fixed in this way are electric charge, baryon number, electron number, etc.
6 Applications of the canonical ensemble

We shall now illustrate the use of the canonical (and micro-canonical) distributions on some simple, but physically important, systems.

6.1 The statistics of paramagnetism

Consider a system of $N$ non-interacting magnetic dipoles with individual magnetic moment $\vec{\mu}$ in the presence of an external magnetic field $\vec{B}$. The magnetic dipoles may be those of elementary particles, such as the electron, proton or neutron, in which case they are quantum effects, along with the spin of these particles. The corresponding Hamiltonian is given by,

$H = -\sum_{i=1}^N \vec{\mu}_i \cdot \vec{B}$   (6.1)

Quantum mechanically,[3] the magnetic moment is proportional to the spin $\vec{s}$ of the particle (or more generally to the total angular momentum),

$\vec{\mu}_i = \frac{g \mu_B\, \vec{s}_i}{\hbar}$   $\mu_B = \frac{\hbar e}{2 m_e c}$   (6.2)

where $\mu_B$ is the Bohr magneton, given in terms of the mass $m_e$ of the electron and the basic unit of electric charge $e$ (namely the charge of the electron), and $g$ is the Landé factor. For the electron, we have $g \approx 2$. For a given type of particle, the total spin $s$ will be fixed, and given by the eigenvalue $\hbar^2 s(s+1)$ of $\vec{s}^{\,2}$. The eigenvalues of $-\vec{\mu} \cdot \vec{B}$ are then given by,

$-g \mu_B B m$   $m = -s, -s+1, \cdots, s-1, s$   (6.3)

Since the particles are non-interacting, the partition function $Z$ is given by the $N$-th power of the single-particle partition function $Z_1$, which in turn is given by,

$Z = (Z_1)^N$   $Z_1 = \sum_{m=-s}^{s} e^{\beta g \mu_B B m}$   (6.4)

The sum over $m$ is geometric, and may be carried out analytically,

$Z_1 = \frac{\sinh((2s+1)x)}{\sinh x}$   $x = \frac{1}{2} \beta g \mu_B B$   (6.5)

[3] The classical treatment is parallel to the classical treatment of the electric dipole in an electric field, which was solved in problem set 3.
The corresponding free energy $F$, internal energy $E$, and magnetization $M$ are given by,

$F = -kTN \ln Z_1$   $E = -N \frac{\partial \ln Z_1}{\partial \beta}$   $M = \frac{N}{\beta} \frac{\partial \ln Z_1}{\partial B}$   (6.6)

Of special interest is the magnetization, which may be recast as follows,

$M = \frac{1}{2} N g \mu_B \Big( (2s+1)\, \frac{\cosh((2s+1)x)}{\sinh((2s+1)x)} - \frac{\cosh x}{\sinh x} \Big)$   (6.7)

For large $x$ (corresponding to large $B$ and/or small $T$), the magnetization saturates at the value $sNg\mu_B$, while for small $x$ (corresponding to small $B$ and/or large $T$), the magnetization follows the Curie law,

$M = \frac{1}{3}\, s(s+1)\, N g^2 \mu_B^2\, \frac{B}{kT}$   (6.8)

As expected, it behaves linearly in $B$ for small $B$, and tends to zero for large $T$.
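The closed form (6.5) and both limits of the magnetization can be verified numerically. In the sketch below (the spin and field values are arbitrary illustrations; dimensionless variable $x$ as in (6.5)), mag(x) is the magnetization per dipole in units of $\frac{1}{2} g \mu_B$, i.e. the bracket of (6.7); for small $x$ it approaches the Curie form $\frac{4}{3} s(s+1) x$, which is (6.8) rewritten in terms of $x$, and for large $x$ it saturates at $2s$.

```python
import math

def Z1(x, s):
    """Single-dipole partition function: sum over m = -s, ..., s of e^{2 x m},
    since each term of (6.4) is e^{beta g mu_B B m} = e^{2 x m}."""
    total, m = 0.0, -s
    while m <= s:
        total += math.exp(2 * x * m)
        m += 1
    return total

s, x = 1.5, 0.3   # hypothetical spin-3/2 dipole, moderate field

# Closed form (6.5): Z1 = sinh((2s+1)x) / sinh(x)
z_sum = Z1(x, s)
z_closed = math.sinh((2 * s + 1) * x) / math.sinh(x)

# Bracket of (6.7): (2s+1) coth((2s+1)x) - coth(x),
# the magnetization per dipole in units of (1/2) g mu_B
coth = lambda y: math.cosh(y) / math.sinh(y)
mag = lambda y: (2 * s + 1) * coth((2 * s + 1) * y) - coth(y)

# Curie limit (6.8) rewritten in terms of x: (4/3) s (s+1) x
x_small = 1e-3
curie = 4.0 / 3.0 * s * (s + 1) * x_small
```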
6.2 Non-relativistic Boltzmann ideal gas

We return once more to the system of an ideal gas of $N$ particles, in the approximation of low density where Boltzmann statistics may be used. At higher densities, one will need to appeal to Fermi-Dirac or Bose-Einstein statistics instead. The microscopic constituents in an ideal gas are mutually non-interacting, so that the partition function is simply the $N$-th power of the partition function for a single particle. The single-particle Hamiltonian takes the following form,

$H_1 = H_{\rm transl} + H_{\rm int}$   $H_{\rm transl} = \frac{\vec{p}^{\,2}}{2M}$   (6.9)

Here, $H_{\rm transl}$ corresponds to the center of mass motion for total momentum $\vec{p}$ and total mass $M$, while $H_{\rm int}$ corresponds to the internal degrees of freedom, in which we include rotational and vibrational modes. These two Hamiltonians commute with one another, and may be simultaneously diagonalized. Putting the system in a large cubic box of linear size $L$ and volume $V = L^3$, with Dirichlet boundary conditions, the translational energy levels are given by,

$E_{\rm transl} = \frac{\pi^2 \hbar^2}{2ML^2} \big( n_x^2 + n_y^2 + n_z^2 \big)$   $n_x, n_y, n_z > 0$   (6.10)

and $E_{\rm int} = \epsilon_n$, where the $\epsilon_n$ are the energy levels of the internal degrees of freedom of a single particle. The levels $\epsilon_n$ are clearly independent of $V$ and $N$. Thus, the partition function decomposes as follows,

$Z = (Z_{\rm transl})^N\, (Z_{\rm int})^N$   (6.11)

The translational part was already computed earlier, and we have,

$F_{\rm transl} = -NkT + NkT \ln\Big( \frac{N \lambda(T)^3}{V} \Big)$   $\lambda(T)^2 = \frac{2\pi\hbar^2}{MkT}$   (6.12)

where $\lambda(T)$ is the thermal wavelength introduced in (2.48). In all generality, all that we can state about the internal part $Z_{\rm int}$ is that it depends only on $T$, but not on $V$ or $N$,

$F_{\rm int}(T) = -kT \ln Z_{\rm int}$   $Z_{\rm int} = \sum_n e^{-\beta \epsilon_n}$   (6.13)

We conclude that a general non-relativistic Boltzmann ideal gas has the following free energy,

$F = N F_{\rm int}(T) - NkT + NkT \ln\Big( \frac{N \lambda(T)^3}{V} \Big)$   (6.14)

Using the standard thermodynamic relations of (2.24), we find,

$E = \frac{3}{2} NkT + N \frac{\partial (\beta F_{\rm int})}{\partial \beta}$

$S = -N F'_{\rm int} + \frac{5}{2} Nk - Nk \ln\Big( \frac{N \lambda(T)^3}{V} \Big)$

$PV = NkT$

$\mu = F_{\rm int} + kT \ln\Big( \frac{N \lambda(T)^3}{V} \Big)$   (6.15)

We observe as general properties of any ideal gas that:

- the internal energy per particle $E/N$ depends only on $T$, and not on the density $N/V$;
- the law $PV = NkT$ holds for all ideal gases.
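The ideal gas law in (6.15) follows from the free energy by $P = -\partial F/\partial V$; a quick numerical check (arbitrary illustrative values, units with $k = 1$, and no internal structure, so $F_{\rm int} = 0$):

```python
import math

# Illustrative values; units with k = 1; no internal degrees of freedom.
N, T = 100.0, 2.0
Nlam3 = 0.37          # N * lambda(T)^3, an arbitrary positive constant
                      # (it drops out of the pressure at fixed T)

def F(V):
    """Free energy (6.14) with F_int = 0: F = -NkT + NkT ln(N lambda^3 / V)."""
    return -N * T + N * T * math.log(Nlam3 / V)

# Pressure P = -dF/dV by a central finite difference
V, h = 50.0, 1e-5
P = -(F(V + h) - F(V - h)) / (2 * h)
```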
6.3 Van der Waals equation of state

In real gases, interactions between atoms do occur, and must be taken into account. We shall here treat the case of a classical mono-atomic gas of identical particles, so that the only degrees of freedom are the translational ones. Thus, the Hamiltonian for $N$ particles will be,

$H = \sum_{i=1}^N \frac{\vec{p}_i^{\,2}}{2m} + U$   (6.16)

Generally, $U$ will be a function of all variables at once. In a mono-atomic gas, $U$ will be a function only of the positions of the atoms. In the approximation of a dilute real gas, however, one may assume that the interactions occur only between pairs of particles, neglecting interactions between triples, quadruples, etc. In the partition function, the momentum and coordinate integrals factorize. The momentum integrals give the contribution of the ideal gas, and may be factored out. Thus, we find,

$Z = Z_{\rm ideal}\, Z_U$   (6.17)

where the ideal gas part is given by

$Z_{\rm ideal} = \frac{V^N}{N!} \Big( \frac{mkT}{2\pi\hbar^2} \Big)^{3N/2}$   (6.18)

while the interaction part is given by,

$Z_U = 1 + \frac{1}{V^N} \prod_{i=1}^N \int d^3 q_i\, \Big( e^{-U(q)/kT} - 1 \Big)$   (6.19)

We have rearranged this partition function to expose the value for the ideal gas when $U = 0$. There are $N(N-1)/2 \approx N^2/2$ combinatorial ways in which the two-body interaction can occur amongst $N$ bodies, so that within this approximation, we find,

$Z_U = 1 + \frac{N^2}{2V^2} \int d^3 q_1\, d^3 q_2\, \Big( e^{-U(q_1, q_2)/kT} - 1 \Big)$   (6.20)

If $U$ depends only on the relative distance between the particles, then this formula may be further simplified, and we find,

$Z_U = 1 - \frac{N^2}{V} B(T)$   $B(T) = \frac{1}{2} \int d^3 q\, \Big( 1 - e^{-U(q)/kT} \Big)$   (6.21)

where $B(T)$ depends only on $T$ and the interaction, but not on $V$, under the assumption that the interaction is sufficiently short-ranged.

If the interaction is everywhere weak compared to the temperature scale, then $Z_U$ is close to 1, and we may use $\ln Z_U \approx -N^2 B(T)/V$ to derive the following expression for the free energy,

$F = -NkT + NkT \ln\Big( \frac{N \lambda(T)^3}{V} \Big) + kT\, \frac{N^2 B(T)}{V}$   (6.22)

The pressure $P = -\partial F/\partial V$ is given by,

$PV = NkT \Big( 1 + \frac{N B(T)}{V} \Big)$   (6.23)

This approximation is not very realistic for real gases, though. Instead, at room temperature, there is a hard core of radius $r_0$ inside which the interaction is very strong (compared to $T$), but negligible outside of this core. Thus, we may then use the following approximate formula for $B$,

$B(T) = 2\pi \int_0^{r_0} r^2\, dr + \frac{2\pi}{kT} \int_{r_0}^\infty r^2\, dr\, U$   (6.24)

The first term is independent of $T$, while the second is inversely proportional to it. For an attractive potential $U < 0$, we have,

$B(T) = b - \frac{a}{kT}$   $a, b > 0$   (6.25)

The corresponding free energy is found to be,

$F = -NkT + NkT \ln\Big( \frac{N \lambda(T)^3}{V} \Big) + \frac{N^2}{V} (bkT - a)$   (6.26)

Computing the pressure, we find,

$P = \frac{NkT}{V} + \frac{N^2}{V^2} (bkT - a)$   (6.27)

or

$P + a \frac{N^2}{V^2} = \frac{NkT}{V} + \frac{N^2}{V^2}\, bkT \approx \frac{NkT}{V - Nb}$   (6.28)

which gives the Van der Waals equation for a real gas,

$\Big( P + a \frac{N^2}{V^2} \Big) (V - Nb) = NkT$   (6.29)

under the assumption that $V \gg Nb$.
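The virial coefficient $B(T)$ of (6.21) can be computed numerically for a model potential. The sketch below assumes a hard core of radius $r_0$ with a weak attractive well of depth $u_0$ out to $r_1$ (a hypothetical choice, units with $k = 1$), and compares the exact integral with the weak-well form $B \approx b - a/kT$ of (6.24) and (6.25); the two agree up to corrections of order $(u_0/kT)^2$.

```python
import math

# Assumed model potential (hypothetical): hard core of radius r0, weak
# attractive well of depth u0 out to r1, zero beyond; units with k = 1.
r0, r1, u0 = 1.0, 2.0, 0.05
T = 1.0

def U(r):
    return math.inf if r < r0 else (-u0 if r < r1 else 0.0)

# B(T) = 2 pi * integral of r^2 (1 - e^{-U(r)/kT}) dr, eq. (6.21),
# evaluated by the midpoint rule
n_steps, rmax = 200000, 3.0
dr = rmax / n_steps
B = 0.0
for i in range(n_steps):
    r = (i + 0.5) * dr
    B += r * r * (1.0 - math.exp(-U(r) / T)) * dr
B *= 2.0 * math.pi

# Weak-well parameters of (6.24)-(6.25): B ~ b - a/kT with
b = 2.0 * math.pi * r0 ** 3 / 3.0
a = 2.0 * math.pi / 3.0 * (r1 ** 3 - r0 ** 3) * u0
```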
6.4 The Mayer cluster expansion

More generally, we can perform a more systematic expansion for a system in which the interaction potential is a sum of two-body potentials acting between any pair of identical particles. We shall denote this potential by $U(r_{ij})$, where $r_{ij} = |\vec{r}_i - \vec{r}_j|$, and we use $\vec{r}$ here rather than $\vec{q}$ for the position vectors. The partition function

$Z = \frac{1}{N!\, (2\pi\hbar)^{3N}} \prod_{i=1}^N \int d^3 p_i\, d^3 r_i\, e^{-\beta H}$   (6.30)

factors as follows,

$Z = Z_{\rm ideal}\, Z_U$   $Z_U = \frac{1}{V^N} \prod_{i=1}^N \int d^3 r_i\, e^{-\beta \sum_{j<k} U(r_{jk})}$   (6.31)

where the ideal gas partition function was given in (6.18). Defining the following function,

$f_{jk} = e^{-\beta U(r_{jk})} - 1$   (6.32)

we may recast $Z_U$ as follows,

$Z_U = \frac{1}{V^N} \prod_{i=1}^N \int d^3 r_i\, \prod_{j<k} (1 + f_{jk})$   (6.33)

The Mayer expansion is obtained by expanding the product in powers of $f$, and we have,

$Z_U = \frac{1}{V^N} \prod_{i=1}^N \int d^3 r_i\, \Big( 1 + \sum_{j<k} f_{jk} + \sum_{j<k,\, m<n} f_{jk} f_{mn} + \cdots \Big)$   (6.34)

The first term gives 1; the second term gives

$\frac{N^2}{2V} \int d^3 r\, f(r)$   (6.35)

and so on. The higher order terms admit a diagrammatic expansion.
7 Systems of indistinguishable quantum particles

Identical quantum particles (or systems of particles, such as nuclei, atoms or molecules) are not only indistinguishable; they obey specific quantum permutation symmetries. In 3 space dimensions (and higher), the only two types of permutation symmetries allowed by local Poincaré invariant quantum field theory correspond to either Bose-Einstein or Fermi-Dirac statistics, respectively for particles (or systems) of integer spin or of half odd-integer spin. This correspondence will be heavily used, but not be proven here, as it is properly the subject of quantum field theory.

7.1 FD and BE quantum permutation symmetries

Concretely, we consider a system of $N$ identical particles (or systems), which we will label by an integer $n = 1, \cdots, N$. We denote the Hilbert space of a single particle by $\mathcal{H}$, and label a basis of quantum states $|\lambda\rangle = |\lambda^1, \cdots, \lambda^A\rangle$ in $\mathcal{H}$ by an array of $A$ quantum numbers $\lambda^\alpha$ with $\alpha = 1, \cdots, A$, corresponding to a maximal set of commuting observables $O^\alpha$ in $\mathcal{H}$,

$O^\alpha\, |\lambda^1, \cdots, \lambda^A\rangle = \lambda^\alpha\, |\lambda^1, \cdots, \lambda^A\rangle$   (7.1)

The Hilbert space $\mathcal{H}_N$ of the $N$ indistinguishable particles is then given by the tensor product of $N$ copies of $\mathcal{H}$,

$\mathcal{H}_N = \mathcal{H} \otimes \cdots \otimes \mathcal{H}$   ($N$ factors)   (7.2)

The observables $O^\alpha$ on $\mathcal{H}$ may be naturally extended to observables $O_n^\alpha$ on $\mathcal{H}_N$, with the help of the identity operator $I$ in $\mathcal{H}$, and we have,

$O_n^\alpha = I \otimes \cdots \otimes I \otimes O^\alpha \otimes I \otimes \cdots \otimes I$   ($n-1$ factors of $I$ before $O^\alpha$, $N-n$ after)   (7.3)

The quantum numbers labeling the states of $\mathcal{H}_N$ may be taken to be those of the maximal set of commuting observables $O_n^\alpha$ of $\mathcal{H}_N$, with $n = 1, \cdots, N$ and $\alpha = 1, \cdots, A$. Thus, the basis states of $\mathcal{H}_N$ may be labeled by,

$|\lambda_1\rangle \otimes |\lambda_2\rangle \otimes \cdots \otimes |\lambda_N\rangle$   (7.4)

We stress that $\lambda_n$ here stands for the full array $\lambda_n = (\lambda_n^1, \lambda_n^2, \cdots, \lambda_n^A)$. The action of the operators $O_n^\alpha$ in this basis may be read off from their definition,

$O_n^\alpha\, |\lambda_1\rangle \otimes |\lambda_2\rangle \otimes \cdots \otimes |\lambda_N\rangle = \lambda_n^\alpha\, |\lambda_1\rangle \otimes |\lambda_2\rangle \otimes \cdots \otimes |\lambda_N\rangle$   (7.5)

The action of a permutation $\sigma$ amongst these $N$ particles is defined by its action on each particle, and may be expressed in terms of the quantum numbers $\lambda_n$. The quantum permutation symmetry allowed for a physical state $|\lambda_1, \lambda_2, \cdots, \lambda_N\rangle \in \mathcal{H}_N$ of $N$ identical particles with quantum numbers $\lambda_n$ for $n = 1, \cdots, N$ is either one of the following,

$|\lambda_{\sigma(1)}, \lambda_{\sigma(2)}, \cdots, \lambda_{\sigma(N)}\rangle = +\,|\lambda_1, \lambda_2, \cdots, \lambda_N\rangle$   (Bose-Einstein)
$|\lambda_{\sigma(1)}, \lambda_{\sigma(2)}, \cdots, \lambda_{\sigma(N)}\rangle = (-)^\sigma\, |\lambda_1, \lambda_2, \cdots, \lambda_N\rangle$   (Fermi-Dirac)   (7.6)

where $(-)^\sigma$ denotes the signature of the permutation $\sigma$. The states $|\lambda_1, \lambda_2, \cdots, \lambda_N\rangle$ may be constructed explicitly by symmetrizing (BE) or anti-symmetrizing (FD) the states of (7.4),

$|\lambda_1, \lambda_2, \cdots, \lambda_N\rangle = \frac{1}{\sqrt{N!}} \sum_{\sigma \in S_N} \eta_\sigma\, |\lambda_{\sigma(1)}\rangle \otimes |\lambda_{\sigma(2)}\rangle \otimes \cdots \otimes |\lambda_{\sigma(N)}\rangle$   (7.7)

where

$\eta_\sigma = 1$   (Bose-Einstein)   $\eta_\sigma = (-)^\sigma$   (Fermi-Dirac)   (7.8)

and $S_N$ is the set of all permutations of $N$ particles.

This implies that two identical fermions cannot occupy the same one-particle quantum state; i.e. two identical fermions cannot have the same quantum numbers. Two or more identical bosons, however, may freely occupy the same quantum state.
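The (anti)symmetrization (7.7) can be implemented directly for a small number of particles. The sketch below (labels are arbitrary placeholders) builds the states by summing over permutations with the signs $\eta_\sigma$ of (7.8), and checks that the Fermi-Dirac state is normalized and antisymmetric, and that it vanishes identically when two one-particle labels coincide, which is the Pauli exclusion principle noted above.

```python
import itertools
import math

def permutation_sign(perm):
    """Signature (-)^sigma of a permutation given as a tuple of 0-based images.
    A cycle of length L contributes a factor (-1)^(L-1)."""
    sign, seen = 1, set()
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        sign *= (-1) ** (length - 1)
    return sign

def multi_particle_state(labels, eta_fermi):
    """Eq. (7.7): (anti)symmetrize |lambda_1> x ... x |lambda_N>.
    Returns a dict mapping product-basis tuples to amplitudes."""
    N = len(labels)
    state = {}
    for perm in itertools.permutations(range(N)):
        coeff = permutation_sign(perm) if eta_fermi else 1
        key = tuple(labels[perm[i]] for i in range(N))
        state[key] = state.get(key, 0.0) + coeff / math.sqrt(math.factorial(N))
    return state

# Fermi-Dirac state with all labels distinct: normalized and antisymmetric
fd = multi_particle_state(("a", "b", "c"), eta_fermi=True)
norm_fd = sum(v * v for v in fd.values())

# Pauli exclusion: two identical labels make the FD state vanish identically
fd_pauli = multi_particle_state(("a", "a", "c"), eta_fermi=True)
max_amp = max(abs(v) for v in fd_pauli.values())
```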
7.2 BE and FD statistics for N identical free particles

The BE and FD quantum permutation symmetry conditions were formulated on the micro-states of the theory. To derive the consequences of these microscopic conditions on macroscopic states of indistinguishable particles, and on the counting of the number of micro-states to which a macro-state has access, we need to develop some new counting techniques. In this section, we limit our study to that of non-interacting particles.

Since the particles are non-interacting, the Hamiltonian is just the sum of the $N$ one-particle Hamiltonians $H^{(1)}$. Following the notation of the previous section, it is given by,

$H = \sum_{n=1}^N H_n$   $H_n = I \otimes \cdots \otimes I \otimes H^{(1)} \otimes I \otimes \cdots \otimes I$   ($n-1$ factors of $I$ before $H^{(1)}$, $N-n$ after)   (7.9)

We denote the eigenstates of $H^{(1)}$ by $|\lambda\rangle$ and the corresponding eigenvalue by $\varepsilon_\lambda$, and use $\lambda$ as a collective label for all quantum numbers of the one-particle states ($\lambda$ was denoted by $(\lambda^1, \cdots, \lambda^A)$ in the preceding section). A macro-state with total number of particles $N$, total energy $E$, and $N_\lambda$ particles in micro-state $|\lambda\rangle$ then satisfies,

$\sum_\lambda N_\lambda = N$   $\sum_\lambda N_\lambda\, \varepsilon_\lambda = E$   (7.10)

In the thermodynamic limit, the energy levels of $H^{(1)}$ will become closely spaced, and approach a continuum distribution.

To impose the quantum permutation symmetry on the micro-states, we appeal to the coarse-graining procedure, and divide the (basis of states in the) Hilbert space $\mathcal{H}$ of all states into discrete cells, labelled by an integer $i = 1, 2, 3, \cdots$. Physically, this may be done by ordering the states according to the energy levels of a single particle $H^{(1)}$ and/or any other conserved quantum number. The number of (distinct) states in cell $i$ is denoted by $G_i$. In the thermodynamic limit, the spectrum of $H^{(1)}$ becomes closely spaced, so that we may assume $G_i \gg 1$ for all $i$. We denote the number of particles in cell $i$ by $N_i$, and assume that $N_i \gg 1$. Thus, each cell by itself may be viewed as a macroscopic subsystem of the whole macro-state. The set-up is schematically represented in Figure 6, where the set of basis states of $\mathcal{H}$ is divided into cells $i$ (two cells are separated by a long vertical dash), each cell having $G_i$ quantum states (each state is indicated with a short vertical dash), and $N_i$ particles (each particle is indicated with a dot above an available micro-state).

Figure 6: Cell decomposition of the spectrum of available micro-states.

In practice, it will be assumed that the cells are small enough so that the energy and/or other conserved quantum numbers remain constant throughout a given cell. Finally, it must be stressed that any physically observable quantity should be independent of the precise coarse-graining mesh that has been applied to the system. One such physical quantity is the mean occupation number $n_i$ for each cell, defined by,

$n_i = \frac{N_i}{G_i}$   (7.11)

The number $\Omega_i$ of micro-states in each cell, and the associated contribution to the entropy $S_i$, are related to the total number of micro-states $\Omega$ and the total entropy $S$ by,

$\Omega = \prod_i \Omega_i$   $S = k \sum_i \ln \Omega_i$   (7.12)

The object is to compute the numbers $\Omega_i$ of micro-states in each cell $i$, and the associated total energy $E$ and total particle number $N$, given by,

$N = \sum_i N_i = \sum_i G_i n_i$   $E = \sum_i N_i\, \varepsilon_i = \sum_i G_i n_i\, \varepsilon_i$   (7.13)

where $\varepsilon_i$ is the average energy of the one-particle states in cell $i$.

Equilibrium will be achieved by maximizing the entropy with respect to the occupation numbers $n_i$, while keeping fixed the sizes of the cells $G_i$, the total energy $E$ and, when appropriate, also the total number of particles $N$. This may be achieved with the help of Lagrange multipliers $\beta$ and $\mu$, as usual,

$\frac{\partial}{\partial n_i} \Big( S/k - \beta E + \beta \mu N \Big) = 0$   (7.14)

Note that the sizes $G_i$ of the cells should be viewed as artifacts of the coarse-graining, and therefore should not enter into any physical quantities.
7.3 Boltzmann statistics rederived

As a warm-up, we may use this construction to re-derive Boltzmann statistics. The number of micro-states $\Omega_i$ available to cell $i$ is given by $G_i^{N_i}$ for distinguishable particles, and by $G_i^{N_i}/N_i!$ for indistinguishable particles. In the approximation $G_i \gg 1$, the entropy is given by,

$S = k \sum_i G_i \big( -n_i \ln n_i + n_i \big)$   (7.15)

Maximizing $S$ according to (7.14), keeping $G_i$, $E$, $N$ fixed, we recover the Boltzmann distribution $n_i = e^{-\beta(\varepsilon_i - \mu)}$. Note that this result is independent of $G_i$, as anticipated.
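The Lagrange-multiplier condition (7.14) applied to the entropy (7.15) can be checked numerically: at the Boltzmann occupations $n_i = e^{-\beta(\varepsilon_i - \mu)}$, the functional $\Phi = S/k - \beta E + \beta\mu N$ should be stationary and maximal. The cell data below are hypothetical; units with $k = 1$.

```python
import math

# Hypothetical cell data; units with k = 1.
eps = [0.0, 0.5, 1.3, 2.0]     # average cell energies eps_i
G = [10.0, 20.0, 15.0, 5.0]    # cell sizes G_i
beta, mu = 1.2, -0.3           # Lagrange multipliers of (7.14)

def phi(n):
    """Phi = S/k - beta E + beta mu N, with S from (7.15),
    E = sum_i G_i n_i eps_i and N = sum_i G_i n_i."""
    s_over_k = sum(g * (-x * math.log(x) + x) for g, x in zip(G, n))
    E = sum(g * x * e for g, x, e in zip(G, n, eps))
    Npart = sum(g * x for g, x in zip(G, n))
    return s_over_k - beta * E + beta * mu * Npart

# Boltzmann occupations of section 7.3
n_star = [math.exp(-beta * (e - mu)) for e in eps]

# The numerical gradient of Phi should vanish at n_star
h = 1e-6
grad = []
for i in range(len(eps)):
    nplus, nminus = list(n_star), list(n_star)
    nplus[i] += h
    nminus[i] -= h
    grad.append((phi(nplus) - phi(nminus)) / (2 * h))
```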
7.4 Fermi-Dirac statistics

For Fermi-Dirac statistics, at most one particle can occupy a given quantum state. Thus, the number of micro-states accessible to the macroscopic subsystem of a single cell is,

$\Omega_i = \frac{G_i!}{N_i!\, (G_i - N_i)!}$   (7.16)

In the approximation of large $G_i$, the total entropy is given by,

$S = -k \sum_i G_i \Big( n_i \ln n_i + (1 - n_i) \ln(1 - n_i) \Big)$   (7.17)

Note that this derivation of the entropy does not appeal to any equilibrium arguments; it is simply based on coarse graining and quantum statistics counting.

To obtain the equilibrium distribution for the occupation numbers $n_i$, we use again (7.14), for fixed $G_i$, $E$ and, where appropriate, fixed $N$,

$\sum_i G_i\, \delta n_i \Big( \ln n_i - \ln(1 - n_i) + \beta \varepsilon_i - \beta \mu \Big) = 0$   (7.18)

Setting this variation to zero for each $\delta n_i$ gives an equation for each $n_i$,

$n_i = \frac{1}{e^{\beta(\varepsilon_i - \mu)} + 1}$   (7.19)

where $\beta$ and $\mu$ are related to $E$, $N$ by,

$E = \sum_i \frac{G_i\, \varepsilon_i}{e^{\beta(\varepsilon_i - \mu)} + 1}$   $N = \sum_i \frac{G_i}{e^{\beta(\varepsilon_i - \mu)} + 1}$   (7.20)

We see that $G_i$ naturally has the interpretation of the degeneracy of the energy level $\varepsilon_i$. The partition function is readily deduced, and we find,

$Z = \prod_i \Big( 1 + e^{-\beta(\varepsilon_i - \mu)} \Big)^{G_i}$   (7.21)

In the grand-canonical ensemble, the partition function is directly related to the Gibbs free energy $G$ by the relation $G = -kT \ln Z$ of (3.42). For our case, this quantity is given by,

$G = E - TS - \mu N = -PV$   (7.22)

where we have used the homogeneity relation $E = TS - PV + \mu N$ of (2.19). In terms of the partition function, we thus find,

$PV = kT \sum_i G_i \ln\Big( 1 + e^{-\beta(\varepsilon_i - \mu)} \Big)$   (7.23)

which precisely corresponds to filling up each micro-state with at most one particle. This is a very convenient relation, which gives the equation of state directly.
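The Stirling-approximated cell entropy (7.17) can be compared with the exact count (7.16) for a single cell with large $G_i$ and $N_i$. The values below are hypothetical; entropies are in units of $k$, and the exact log-factorials are evaluated with `math.lgamma`.

```python
import math

# One cell with G states and N particles (hypothetical large values).
G, N = 10**6, 3 * 10**5

# Exact count (7.16): ln Omega = ln G! - ln N! - ln (G - N)!
exact = math.lgamma(G + 1) - math.lgamma(N + 1) - math.lgamma(G - N + 1)

# Stirling form (7.17) for this single cell, with n = N / G
n = N / G
approx = -G * (n * math.log(n) + (1 - n) * math.log(1 - n))

rel_err = abs(exact - approx) / exact
```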
Figure 7: Comparison of the occupation numbers for quantum statistics.
7.5 Bose-Einstein statistics

For Bose-Einstein statistics, an arbitrary number of particles may occupy any given quantum state. Therefore, the number of accessible micro-states is given by,

$\Omega_i = \frac{(G_i + N_i - 1)!}{N_i!\, (G_i - 1)!}$   (7.24)

In the approximation of large $G_i$, the total entropy is given by,

$S = -k \sum_i G_i \Big( n_i \ln n_i - (1 + n_i) \ln(1 + n_i) \Big)$   (7.25)

Again, this derivation does not require the system to be in equilibrium.

To obtain the equilibrium distribution, we again use (7.14), and obtain,

$n_i = \frac{1}{e^{\beta(\varepsilon_i - \mu)} - 1}$   (7.26)

where $\beta$ and $\mu$ are related to $E$, $N$ by,

$E = \sum_i \frac{G_i\, \varepsilon_i}{e^{\beta(\varepsilon_i - \mu)} - 1}$   $N = \sum_i \frac{G_i}{e^{\beta(\varepsilon_i - \mu)} - 1}$   (7.27)

The free energy and partition function are readily deduced, and we find,

$F = kT \sum_i G_i \ln\Big( 1 - e^{-\beta(\varepsilon_i - \mu)} \Big)$   $Z = \prod_i \Big( \sum_{n=0}^\infty e^{-n\beta(\varepsilon_i - \mu)} \Big)^{G_i}$   (7.28)

which corresponds to filling up each micro-state with an arbitrary number of particles.
7.6 Comparing the behavior of the occupation numbers

Boltzmann, Fermi-Dirac, and Bose-Einstein statistics produce qualitatively different behaviors for the occupation numbers, as shown in Figure 7. Bose-Einstein statistics is consistent only for $\varepsilon - \mu \geq 0$, while Boltzmann and Fermi-Dirac are defined for all ranges of $\varepsilon - \mu$. Both FD and BE statistics asymptote to Boltzmann in the limit of large $(\varepsilon - \mu)/kT$, which is the limit of large energy levels, or better even, of

$n \ll 1$   (7.29)

This was the limit in which the Sackur-Tetrode formula of the simplest ideal gas was valid. Thus, we see that FD and BE statistics will produce modifications thereof when the occupation numbers are not small.
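The qualitative comparison of Figure 7 is easy to reproduce numerically. Writing $x = (\varepsilon - \mu)/kT$, the sketch below checks that the FD occupation lies below and the BE occupation above the Boltzmann curve, and that all three agree in the dilute limit $n \ll 1$ of (7.29).

```python
import math

# Occupation number as a function of x = (eps - mu)/kT for each statistics
def n_boltzmann(x):
    return math.exp(-x)

def n_fermi(x):
    return 1.0 / (math.exp(x) + 1.0)

def n_bose(x):
    return 1.0 / (math.exp(x) - 1.0)   # consistent only for x > 0

# Dilute limit: at large x all three occupation numbers coincide, n << 1
x = 8.0
nb, nf, nbe = n_boltzmann(x), n_fermi(x), n_bose(x)

# At moderate x, FD lies below and BE above the Boltzmann curve
ordering = n_fermi(1.0) < n_boltzmann(1.0) < n_bose(1.0)
```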
8 Ideal Fermi-Dirac Gases

In this section, we shall discuss ideal gases of particles obeying Fermi-Dirac statistics. Applications to the physics of electrons in metals, their magnetic properties, low energy behavior, white dwarfs and neutron stars, will be treated in relative detail.

Recall the basic equations for an ideal FD gas from equations (7.20) and (7.23),

$N = \sum_i \frac{1}{e^{\beta(\varepsilon_i - \mu)} + 1}$

$E = \sum_i \frac{\varepsilon_i}{e^{\beta(\varepsilon_i - \mu)} + 1}$

$G = -PV = -kT \ln Z = -kT \sum_i \ln\Big( 1 + e^{-\beta(\varepsilon_i - \mu)} \Big)$   (8.1)

where we take the sum over all states labelled by $i$, including their degeneracies, with corresponding energy $\varepsilon_i$. Here, $Z$ stands for the grand canonical partition function. The Gibbs free energy $G$, which is the natural thermodynamic potential in the grand canonical ensemble, is related to these quantities by $G = -PV$. As we are assuming that the gas is ideal, no mutual interactions take place; the Hamiltonian is just that of the free relativistic or non-relativistic particle, and the $\varepsilon_i$ are the free particle energies.
8.1 The simplest ideal Fermi-Dirac gas

For particles whose energy depends only on translational degrees of freedom, we may set,

$\sum_i \to \frac{gV}{(2\pi\hbar)^3} \int d^3 p$   (8.2)

where a degeneracy factor $g$ has been included to account for the number of internal degrees of freedom of each particle. For example, when this degree of freedom is spin $s$, we have $g = 2s + 1$. The standard thermodynamic functions are then given as follows,

$N = \frac{gV}{(2\pi\hbar)^3} \int d^3 p\, \frac{1}{e^{\beta(p^2/2m - \mu)} + 1}$

$E = \frac{gV}{2m (2\pi\hbar)^3} \int d^3 p\, \frac{p^2}{e^{\beta(p^2/2m - \mu)} + 1}$

$PV = kT\, \frac{gV}{(2\pi\hbar)^3} \int d^3 p\, \ln\Big( 1 + e^{-\beta(p^2/2m - \mu)} \Big)$   (8.3)

It is a property of the simplest ideal gas thermodynamics that temperature may be scaled out of all these integrals. To do so, we express the quantities in terms of the composite variable $z$, referred to as the fugacity, and we will continue to use the notation $\lambda = \lambda(T)$ for the thermal wavelength,

$z = e^{\beta\mu}$   $\lambda = \Big( \frac{2\pi\hbar^2}{mkT} \Big)^{1/2}$   (8.4)

Expressing $\mu$ in terms of $z$, and $p$ in terms of $x$ defined by $p^2 = 2mkT\, x$, we have the following alternative expressions,

$\frac{N}{V} = \frac{g}{\lambda^3}\, f_{3/2}(z)$   $E = \frac{3}{2} PV$   $\frac{P}{kT} = \frac{g}{\lambda^3}\, f_{5/2}(z)$   (8.5)

The Fermi-Dirac functions $f_\nu(z)$ depend only upon the fugacity $z$, and are defined as follows,

$f_\nu(z) \equiv \frac{1}{\Gamma(\nu)} \int_0^\infty \frac{x^{\nu-1}\, dx}{z^{-1} e^x + 1}$   (8.6)

From (8.4), we have $0 < z < \infty$. More generally, in the regime $0 < {\rm Re}(z) < \infty$, the integral is absolutely convergent for ${\rm Re}(\nu) > 0$, and may be analytically continued to all $\nu \in \mathbb{C}$. The functions have the following expansion in powers of $z$,

$f_\nu(z) = \sum_{n=1}^\infty (-)^{n+1}\, \frac{z^n}{n^\nu}$   (8.7)

The series is absolutely convergent only for $|z| \leq 1$ and $1 < \nu$, and gives the small-$z$ behavior of the functions, or equivalently the behavior for $\mu/kT \ll -1$. From the expansion in (8.7), it is immediate that we have the following recursion relation,

$z\, \frac{\partial}{\partial z}\, f_\nu(z) = f_{\nu-1}(z)$   (8.8)

which may also be derived directly from the integral representation, and holds for all $z \in \mathbb{C}$. The function $f_\nu(1)$ is related to the Riemann $\zeta$-function $\zeta(\nu)$,

$f_\nu(1) = \Big( 1 - \frac{1}{2^{\nu-1}} \Big) \zeta(\nu)$   $\zeta(\nu) = \sum_{n=1}^\infty \frac{1}{n^\nu}$   (8.9)

where $\zeta(\nu)$ is well-known to be meromorphic throughout $\mathbb{C}$, with a single pole at $\nu = 1$.
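The Fermi-Dirac functions are straightforward to evaluate numerically. The sketch below computes $f_\nu(z)$ of (8.6) by a simple midpoint rule (a sketch, not production-quality quadrature), checks it against the series (8.7) at small $z$, and against the $\zeta$-function relation (8.9) at $z = 1$ for $\nu = 2$, using $\zeta(2) = \pi^2/6$.

```python
import math

def f_nu(nu, z, n_steps=200000, x_max=60.0):
    """f_nu(z) of eq. (8.6) by the midpoint rule; adequate for nu >= 1
    and moderate z (the integrand decays like e^{-x} at large x)."""
    dx = x_max / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * dx
        total += x ** (nu - 1.0) / (math.exp(x) / z + 1.0) * dx
    return total / math.gamma(nu)

# Series check (8.7) at small z, for nu = 5/2
z = 0.3
series = sum((-1) ** (n + 1) * z ** n / n ** 2.5 for n in range(1, 60))
numeric = f_nu(2.5, z)

# Zeta-function relation (8.9) at z = 1, for nu = 2: f_2(1) = (1/2) pi^2/6
f2_at_1 = f_nu(2.0, 1.0)
exact = (1.0 - 2.0 ** (1.0 - 2.0)) * math.pi ** 2 / 6.0
```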
8.2 Entropy, specific heat, and equation of state

The free energy $F = E - TS = -PV + \mu N$ may be expressed as,

$F = -PV + \mu N = NkT \Big( \ln z - \frac{f_{5/2}(z)}{f_{3/2}(z)} \Big)$   (8.10)
Figure 8: Left panel: numerical plot of the Fermi-Dirac functions f_{1/2}(z), f_{3/2}(z), and f_{5/2}(z). Right panel: the equation of state relation in terms of z.
from which the entropy is obtained by S = (E - F)/T,

    S = Nk \left( \frac{5}{2} \frac{f_{5/2}(z)}{f_{3/2}(z)} - \ln(z) \right)    (8.11)
To compute the specific heat C_V (which is defined at constant V and N), the change in the fugacity with temperature (as N is being kept constant) must be taken into account,

    \frac{1}{z} \left( \frac{\partial z}{\partial T} \right)_V = - \frac{3}{2T} \frac{f_{3/2}(z)}{f_{1/2}(z)}    (8.12)
Re-expressing E in terms of PV, and PV in terms of f_\nu using (8.5), we find,

    \frac{C_V}{Nk} = \frac{15}{4} \frac{f_{5/2}(z)}{f_{3/2}(z)} - \frac{9}{4} \frac{f_{3/2}(z)}{f_{1/2}(z)}    (8.13)
The equation of state is obtained by forming the combination,

    \frac{PV}{NkT} = \frac{f_{5/2}(z)}{f_{3/2}(z)}    (8.14)
This function is plotted against z in Figure 8.
8.3 Corrections to the Boltzmann gas

In the regime of low density compared to the thermal wavelength scale,

    \frac{N \lambda^3}{V} = g\, f_{3/2}(z) \ll 1    (8.15)
we have z \ll 1 (see Figure 8 for the numerical behavior of f_{3/2}(z)), and we recover the Boltzmann gas with,

    z \approx \frac{N \lambda^3}{g V}  \qquad  f_\nu(z) \approx z    (8.16)

In this limit the equation of state reduces to PV = NkT, and the internal energy, entropy and specific heat are given by their Boltzmann expressions, with E = 3PV/2 as well as,

    E = \frac{3}{2}\, NkT  \qquad  S = \frac{5}{2}\, Nk - Nk \ln \left( \frac{N\lambda^3}{gV} \right)  \qquad  C_V = \frac{3}{2}\, Nk    (8.17)
Using the expansion for small z of (8.7), we easily obtain the leading corrections to the Boltzmann expressions,

    PV = NkT \left( 1 + \frac{N \lambda^3}{4\sqrt{2}\, gV} \right)

    S = Nk \left( \frac{5}{2} + \frac{N \lambda^3}{8\sqrt{2}\, gV} \right) - Nk \ln \left( \frac{N\lambda^3}{gV} \right)

    C_V = \frac{3}{2}\, Nk \left( 1 - \frac{N \lambda^3}{8\sqrt{2}\, gV} \right)    (8.18)

The increase in the pressure, at fixed temperature, is the result of the exclusion principle.
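The leading virial correction in (8.18) can be tested numerically (an illustrative Python check, not part of the notes): solve N\lambda^3/gV = f_{3/2}(z) for z at small density, and compare the exact ratio (8.14) with the first-order prediction:

```python
import math

def f(nu, z, terms=80):
    # series (8.7), adequate for small z
    return sum((-1) ** (n + 1) * z ** n / n ** nu for n in range(1, terms + 1))

def invert_f32(target):
    # solve f_{3/2}(z) = target for z in (0,1) by bisection
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(1.5, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

nl3 = 0.01                                   # N lambda^3 / (g V), small
z = invert_f32(nl3)
eos = f(2.5, z) / f(1.5, z)                  # PV/NkT from (8.14)
pred = 1.0 + nl3 / (4.0 * math.sqrt(2.0))    # leading correction of (8.18)
```

The exact equation of state and the truncated expansion agree to well within the size of the neglected second-order term.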
8.4 Zero temperature behavior

The Fermi-Dirac distribution exhibits a drastic simplification in the limit of low temperature, clearly apparent from Figure 9. In the limit T = 0, the distribution is a step function,

    \lim_{T \to 0^+} \frac{1}{e^{\beta(\epsilon - \mu)} + 1} = \begin{cases} 1 & {\rm for} \ \epsilon < \mu \\ 0 & {\rm for} \ \epsilon > \mu \end{cases}    (8.19)
The chemical potential \mu = \epsilon_F of the system at T = 0 is referred to as the Fermi energy. The Pauli exclusion principle underlies the physical interpretation of the distribution: as the lowest energy quantum states are filled up, additional particles must be placed in successively higher energy states, up till the Fermi energy. The sharp cut-off is the result of having exactly zero temperature. The corresponding surface of momenta p satisfying the relation,

    \epsilon_F = \frac{p^2}{2m}  \qquad  p_F = |p|    (8.20)
is referred to as the Fermi surface. The magnitude p_F of the momenta lying on the Fermi surface is constant, and is referred to as the Fermi momentum. Strictly at T = 0, the various thermodynamic functions may be evaluated in terms of the parameters m, V, \epsilon_F, and we find,

    N = \frac{gV}{(2\pi\hbar)^3} \int_0^{p_F} 4\pi p^2\, dp = \frac{4\pi g V p_F^3}{3 (2\pi\hbar)^3}

    E = \frac{gV}{(2\pi\hbar)^3} \int_0^{p_F} 4\pi p^2\, dp\, \frac{p^2}{2m} = \frac{4\pi g V p_F^5}{10\, m (2\pi\hbar)^3}    (8.21)
Energy density and pressure are found to be,

    \frac{E}{N} = \frac{3}{5}\, \epsilon_F  \qquad  P = \frac{2}{5} \frac{N}{V}\, \epsilon_F    (8.22)
Using the expression for the free energy F = E - TS = -PV + \mu N, the entropy is found to vanish. This is in accord with the fact that the system at T = 0 is expected to be in its (unique) microscopic ground state.

An immediate application of the low temperature thermodynamics of a gas of fermions is to massive gravitational bodies in which the pressure generated by the exclusion principle is compensated by the force of gravity at equilibrium. This is the case of neutron stars, and of the hypothetical quark stars and strange stars. Later on, we shall discuss in some detail the case of white dwarfs, where electrons are the key players.
8.5 Low temperature behavior: The Sommerfeld expansion

For low temperatures, the Fermi-Dirac distribution is close to a step function, but shows that some states with energies less than the Fermi energy \epsilon_F are empty, while some states with energies above \epsilon_F are filled. The spread in these energies is set by the temperature kT, as is manifest from Figure 9. In practice, it is only the contribution of these degrees of freedom that will be important in the low temperature thermodynamics of the system. In particular, the number of degrees of freedom contributing to the specific heat should decrease as T decreases, and should no longer be given by the classical constant value C_V \approx Nk. Indeed, we shall find that C_V \sim T for small T.
A systematic expansion for the thermodynamic functions at low but finite temperature is somewhat delicate. It corresponds to a regime where z \gg 1, and is provided by the Sommerfeld expansion. Following Pathria, we set z = e^\eta with \eta = \beta\mu, and begin by evaluating a generalized form of the Fermi-Dirac functions,

    F[\varphi] = \int_0^\infty dx\, \frac{\varphi(x)}{e^{x-\eta} + 1}  \qquad  z = e^\eta    (8.23)
Figure 9: Low temperature behavior of the Fermi-Dirac occupation number for \mu = 6, and kT = 0.125, 0.25, and 1, respectively in blue, black, and red.
for any function \varphi(x) with polynomial growth in |x|. By separating the integration region at the point x = \eta, we readily obtain (after suitable shifts of variables in each region),

    F[\varphi] = \int_0^\eta dx\, \varphi(x) + \int_0^\infty dx\, \frac{\varphi(\eta+x) - \varphi(\eta-x)}{e^x + 1} + \int_\eta^\infty dx\, \frac{\varphi(\eta-x)}{e^x + 1}    (8.24)
The first two terms produce power law dependences in large \eta, while the last term is suppressed by exponentials O(e^{-\eta}), and may be neglected compared to the first two. Expanding \varphi(\eta \pm x) in a power series in x around x = 0, and using the formula (8.9) for the value \nu = 2n+2, in the form \int_0^\infty dx\, x^{2n+1}/(e^x+1) = \Gamma(2n+2) \left( 1 - 2^{-(2n+1)} \right) \zeta(2n+2), we have the following final formula,

    F[\varphi] = \int_0^\eta dx\, \varphi(x) + \sum_{n=0}^\infty 2 \left( 1 - \frac{1}{2^{2n+1}} \right) \zeta(2n+2)\, \varphi^{(2n+1)}(\eta) + O(e^{-\eta})    (8.25)
or, for the first few orders, we have,

    F[\varphi] = \int_0^\eta dx\, \varphi(x) + \frac{\pi^2}{6}\, \varphi^{(1)}(\eta) + \frac{7\pi^4}{360}\, \varphi^{(3)}(\eta) + \cdots    (8.26)
where \varphi^{(2n+1)}(\eta) denotes the derivative of order 2n+1 evaluated at the point \eta. Using this formula with \varphi(x) = x^{\nu-1}/\Gamma(\nu), we find,

    f_\nu(z) = \frac{\eta^\nu}{\Gamma(\nu+1)} \left( 1 + \nu(\nu-1) \frac{\pi^2}{6\eta^2} + \nu(\nu-1)(\nu-2)(\nu-3) \frac{7\pi^4}{360\eta^4} + \cdots \right) + O(e^{-\eta})    (8.27)
The functions we need are given as follows,

    f_{1/2}(z) = \frac{\eta^{1/2}}{\Gamma(3/2)} \left( 1 - \frac{\pi^2}{24\eta^2} - \frac{7\pi^4}{384\eta^4} + \cdots \right) + O(e^{-\eta})

    f_{3/2}(z) = \frac{\eta^{3/2}}{\Gamma(5/2)} \left( 1 + \frac{\pi^2}{8\eta^2} + \frac{7\pi^4}{640\eta^4} + \cdots \right) + O(e^{-\eta})

    f_{5/2}(z) = \frac{\eta^{5/2}}{\Gamma(7/2)} \left( 1 + \frac{5\pi^2}{8\eta^2} - \frac{7\pi^4}{384\eta^4} + \cdots \right) + O(e^{-\eta})    (8.28)
Recall that this expansion holds in \eta = \beta\mu = \mu/kT for large \eta, i.e. for small T at fixed \mu.
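The accuracy of the expansion is easily checked numerically (a Python sketch added for illustration): compare the truncated series for f_{3/2} in (8.28) with a direct quadrature of the integral (8.6) at large \eta:

```python
import math

def f_fd(nu, eta, tmax=14.0, steps=200_000):
    # f_nu(z) with z = e^eta, by midpoint quadrature after x = t^2
    h = tmax / steps
    s = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        s += 2.0 * t ** (2.0 * nu - 1.0) / (math.exp(t * t - eta) + 1.0)
    return s * h / math.gamma(nu)

eta = 30.0
direct = f_fd(1.5, eta)
sommerfeld = (eta ** 1.5 / math.gamma(2.5)) * (
    1.0 + math.pi ** 2 / (8 * eta ** 2) + 7 * math.pi ** 4 / (640 * eta ** 4)
)
```

At \eta = 30 the truncation error is of order \eta^{-6}, so the two results agree to many digits.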
Corrections to various thermodynamic functions may now be evaluated with the help of the general formulas (8.5), (8.11), and (8.13). We begin by expressing the chemical potential at finite temperature in terms of the Fermi energy \epsilon_F, by comparing the formulas,

    \frac{N}{V} = \frac{4\pi g\, (2m\epsilon_F)^{3/2}}{3 (2\pi\hbar)^3}  \qquad  \frac{N}{V} = \frac{g}{\lambda^3}\, f_{3/2}(z) = \frac{g}{\lambda^3} \frac{1}{\Gamma(5/2)} \left( \frac{\mu}{kT} \right)^{3/2} \left( 1 + \frac{\pi^2}{8\eta^2} \right)    (8.29)
We find to this order of approximation,

    \epsilon_F = \mu \left( 1 + \frac{\pi^2}{12\eta^2} \right)  \qquad  \mu = \epsilon_F \left( 1 - \frac{\pi^2}{12} \left( \frac{kT}{\epsilon_F} \right)^2 \right)    (8.30)
In terms of T and \epsilon_F, the energy per particle, entropy, and specific heat are given by,

    \frac{E}{N} = \frac{3}{5}\, \epsilon_F \left( 1 + \frac{5\pi^2}{12} \left( \frac{kT}{\epsilon_F} \right)^2 \right)  \qquad  \frac{S}{Nk} = \frac{C_V}{Nk} = \frac{\pi^2 kT}{2 \epsilon_F}    (8.31)
8.6 Pauli paramagnetism of ideal gasses

Spin 1/2 particles have intrinsic magnetic dipole moments, and therefore produce paramagnetic effects in a gas or liquid. An external magnetic field B splits each energy level in two, with effective energies,

    \epsilon_\pm = \frac{p^2}{2m} \mp \mu_B B    (8.32)

where \mu_B = e\hbar/2mc is the Bohr magneton. Since the shifts \mp\mu_B B enter in combination with the chemical potential \mu for the spin 1/2 particles, the total Gibbs free energy is given by,

    G(T, V, \mu, B) = G_0(T, V, \mu + \mu_B B) + G_0(T, V, \mu - \mu_B B)    (8.33)

where G_0 is the Gibbs free energy of the simplest spinless ideal gas, and is given by,

    G_0(T, V, \mu) = -\frac{kTV}{(2\pi\hbar)^3} \int d^3p\, \ln \left( 1 + e^{\beta(\mu - p^2/2m)} \right) = -\frac{kTV}{\lambda^3}\, f_{5/2}(e^{\beta\mu})    (8.34)
The paramagnetic susceptibility, per unit volume and at zero external magnetic field, is then obtained as follows,

    \chi_{\rm para} = -\frac{1}{V} \frac{\partial^2 G}{\partial B^2} \bigg|_{B=0} = -2 \frac{\mu_B^2}{V} \frac{\partial^2 G_0}{\partial \mu^2} = 2 \frac{\mu_B^2}{V} \left( \frac{\partial N}{\partial \mu} \right)_{T,V}    (8.35)

At zero temperature, we may use the first relation in (8.21) for g = 1, expressed in terms of the chemical potential,

    N = V\, \frac{(2m\mu)^{3/2}}{6\pi^2 \hbar^3}    (8.36)
and we find,

    \chi_{\rm para} = \frac{\mu_B^2\, m\, p_F}{\pi^2 \hbar^3} = \frac{3}{2} \frac{\mu_B^2}{\epsilon_F} \frac{N}{V}    (8.37)

Finite temperature corrections may be deduced using the full expression for G_0. At high temperatures, we recover, of course, the Curie-Weiss law already derived using Boltzmann statistics in section 6.
8.7 Landau diamagnetism

The energy levels of a charged particle in the presence of a constant magnetic field are arranged into Landau levels, labelled by an integer n = 0, 1, 2, \cdots, and given by,

    \epsilon(p_z, n) = \frac{p_z^2}{2m} + \frac{eB\hbar}{mc} \left( n + \frac{1}{2} \right)    (8.38)

where p_z is the momentum along the direction of the magnetic field. Particles with a magnetic moment require an extra \mp \mu_B B term, but at small magnetic fields, the case we are most interested in here, this effect is paramagnetic and may be treated independently along the lines of the preceding section.
Some care is needed in correctly normalizing the summation over Landau levels. For weak magnetic fields, the Landau levels are closely spaced and degenerate into the continuous p_x, p_y spectrum at B = 0. We may use this correspondence to normalize the sum, as follows. The energy spacing between two successive Landau levels \epsilon(p_z, n+1) and \epsilon(p_z, n) is eB\hbar/mc; computing the number of quantum states in this energy interval from the classical measure, we find,

    \int_{\epsilon(p_z,n)}^{\epsilon(p_z,n+1)} \frac{dx\, dy\, dp_x\, dp_y}{(2\pi\hbar)^2} = \frac{L_x L_y\, eB}{2\pi\hbar c}    (8.39)
where L_x and L_y are the linear dimensions of a square box in the plane perpendicular to B. This factor provides the correct normalization for the calculation of the grand canonical partition function, actually for all values of the field B,

    G = -kT \ln Z = -kT \sum_{n=0}^\infty \frac{L_x L_y\, eB}{2\pi\hbar c} \int \frac{dz\, dp_z}{2\pi\hbar} \ln \left( 1 + e^{\beta(\mu - \epsilon(p_z,n))} \right)    (8.40)
Since we are interested in weak magnetic fields, eB\hbar/mc \ll kT, we may use an approximate evaluation of the sum over n, by using the Euler-McLaurin formula to first order,

    \sum_{n=0}^\infty f \left( n + \frac{1}{2} \right) = \int_0^\infty dx\, f(x) + \frac{1}{24} f'(0)    (8.41)
up to higher derivative corrections (which will be accompanied by higher powers of B). The contribution of the integral is proportional to

    eB \int_0^\infty dx\, \ln \left( 1 + e^{\beta(\mu - p_z^2/2m - eB\hbar x/mc)} \right)    (8.42)

By changing integration variables eBx \to x, we see that the integral is in fact independent of B, and for our purposes of studying magnetic properties, will be immaterial. We abbreviate its contribution to G as G_{B=0}. The remaining contribution is readily evaluated, and we find,
    G = G_{B=0} + \frac{1}{24} \frac{V e^2 B^2 \hbar}{(2\pi\hbar)^2\, mc^2} \int dp_z\, \frac{1}{e^{\beta(p_z^2/2m - \mu)} + 1}    (8.43)
Changing variables p_z = \sqrt{2mkT}\, x and z = e^{\beta\mu}, we may express the result in terms of a Fermi-Dirac function, and the Bohr magneton \mu_B = e\hbar/2mc,

    G = G_{B=0} + \frac{mV}{24\pi^2\hbar^3}\, \mu_B^2 B^2\, (2mkT)^{1/2} \sqrt{\pi}\, f_{1/2}(z)    (8.44)
The diamagnetic susceptibility per unit volume is defined by,

    \chi_{\rm diam} = -\frac{1}{V} \frac{\partial^2 G}{\partial B^2} \bigg|_{T,\mu}    (8.45)

and is found to be given by

    \chi_{\rm diam} = -\frac{m}{12\pi^2\hbar^3}\, \mu_B^2\, (2mkT)^{1/2} \sqrt{\pi}\, f_{1/2}(z)    (8.46)
For the Boltzmann regime, with z \ll 1, and z \approx N\lambda^3/V, we get,

    \chi_{\rm diam} \approx -\frac{N}{V} \frac{\mu_B^2}{3kT}    (8.47)

In the low temperature regime where z \gg 1, we have f_{1/2}(z) \approx 2(\mu/kT)^{1/2}/\sqrt{\pi}, and we find,

    \chi_{\rm diam} = -\frac{N \mu_B^2}{4 V \epsilon_F}    (8.48)
Assembling the contributions from the paramagnetic and diamagnetic parts, we find for large temperature,

    \chi = \chi_{\rm para} + \chi_{\rm diam} = \frac{N}{VkT} \left( \mu_B^2 - \frac{1}{3} \mu_B^2 \right) = \frac{2 N \mu_B^2}{3VkT}    (8.49)

Thus, at high temperatures, gasses tend to be paramagnetic. The same is generally true at low temperatures.
8.8 White dwarfs

The Pauli exclusion principle prevents electrons in atoms from collapsing into the lowest atomic quantum state, and it also prevents certain types of large astronomical bodies from collapsing under the forces of their gravitational attraction. The oldest system where this mechanism was applied is the case of white dwarfs. More recent systems include neutron stars, and possibly quark stars and strange stars. Here, we shall discuss in detail only the case of white dwarfs.

The energy radiated by the sun is generated by the nuclear reactions converting Hydrogen into Deuterium and then into Helium. The reaction pretty much ends when this conversion is complete, as the ^4He nucleus has larger binding energy per nucleon than its three subsequent elements, Lithium ^7Li, Beryllium ^9Be and Boron ^{11}B. At that point the star runs out of thermonuclear fuel, and may be expected to collapse under its gravitational weight. But this is not what happens, at least for certain sufficiently small stars, referred to as white dwarfs. They remain stable in size and mass, and glow relatively faintly. The typical numbers are as follows,

    M \approx 10^{33}\, {\rm g}  \qquad  \rho \approx 10^7\, {\rm g/cm}^3  \qquad  T \approx 10^7\, {\rm K} \approx 10^3\, {\rm eV}    (8.50)
where M is the total mass of the star, \rho its mass density, and T its temperature. At this temperature, all ^4He is completely ionized, so we have a plasma of N electrons and N/2 Helium nuclei. The relations between the total mass M and volume V are as follows,

    M = N(m + 2m_p) \approx 2 m_p N  \qquad  \frac{N}{V} = \frac{\rho}{2m_p}    (8.51)

The electron density N/V allows us to compute the Fermi momentum by,

    \frac{N}{V} = \frac{p_F^3}{3\pi^2 \hbar^3} = \frac{\rho}{2m_p}    (8.52)
The ratio of the Fermi momentum to mc is given by,

    \frac{p_F}{mc} = \left( \frac{3\rho}{16\pi m_p} \right)^{1/3} \frac{2\pi\hbar}{mc}    (8.53)

Substituting the above data for the characteristics of a white dwarf, we get p_F/mc of order unity, so that the electrons are in fact relativistic. The Fermi energy and temperature T_F = \epsilon_F/k are respectively found to be \epsilon_F \approx 10^6 eV, and T_F \approx 10^{10} K. Thus the actual temperature of the white dwarfs is small compared to the Fermi temperature, and the electron gas is fully degenerate. The electrons produce the largest contribution to the pressure against gravitation, and so we shall neglect the contribution of the nuclei.
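These estimates are easily reproduced (a Python sketch using standard CGS constants, which are assumed values not quoted in the text):

```python
import math

# CGS constants (assumed values, not given in the text)
hbar, c = 1.0546e-27, 2.998e10
m_e, m_p = 9.109e-28, 1.6726e-24
rho = 1.0e7                                 # white dwarf density (8.50), g/cm^3

n_e = rho / (2.0 * m_p)                     # electron density, from (8.51)
p_F = hbar * (3.0 * math.pi ** 2 * n_e) ** (1.0 / 3.0)     # from (8.52)
x = p_F / (m_e * c)                         # relativity parameter of (8.53)
eps_F = math.sqrt((p_F * c) ** 2 + (m_e * c ** 2) ** 2)    # relativistic (8.55)
eps_F_eV = eps_F / 1.602e-12                # erg -> eV
```

One finds p_F/mc of order unity and \epsilon_F of order 10^6 eV, confirming that the electron gas is relativistic and fully degenerate at T \approx 10^7 K.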
To study the equilibrium conditions quantitatively, we compute the pressure of this relativistic gas of electrons. Using the Gibbs ensemble, we have,

    P = \frac{g\, kT}{(2\pi\hbar)^3} \int d^3p\, \ln \left( 1 + e^{\beta(\mu - \epsilon(p))} \right)    (8.54)
where the relativistic energy is given by,^4

    \epsilon(p) = \sqrt{p^2 c^2 + m^2 c^4}    (8.55)

In the degenerate limit, \beta\mu is large and positive, and the logarithmic term vanishes when \epsilon(p) > \mu. The contribution for \epsilon(p) < \mu is dominated by the exponential part of the argument, so that we obtain,

    P = \frac{g}{(2\pi\hbar)^3} \int d^3p\, (\mu - \epsilon(p))\, \theta(\mu - \epsilon(p))    (8.56)

^4 Equivalently, one could use for \epsilon(p) only the relativistic kinetic energy; the difference amounts to a shift of \mu by the rest energy mc^2.
Carrying out the angular integration, and changing variables to

    p = mc \sinh\psi  \qquad  p_F = mc \sinh\psi_F  \qquad  \mu = \epsilon_F = \epsilon(p_F)    (8.57)

we have,

    P = \frac{4\pi g\, m^4 c^5}{(2\pi\hbar)^3} \int_0^{\psi_F} d\psi\, \cosh\psi\, \sinh^2\psi\, (\cosh\psi_F - \cosh\psi)    (8.58)
The integral is readily computed, and we find,

    P = \frac{\pi g\, m^4 c^5}{6 (2\pi\hbar)^3}\, A \left( \frac{p_F}{mc} \right)    (8.59)

where the function A is given by,

    A(x) = x (2x^2 - 3) \sqrt{1 + x^2} + 3 \ln \left( x + \sqrt{x^2+1} \right)    (8.60)

The plot of A(x)/2x^4 in Figure 10 was produced with the following Maple code,

with(plots):
A := x -> x*(2*x^2-3)*sqrt(x^2+1) + 3*ln(x+sqrt(x^2+1)):
p1 := plot(A(x)/(2*x^4), x=0..10):
p2 := plot(1, x=0..10, color=[black]):
display(p1,p2);

Figure 10: The function A(x)/2x^4 versus x.
Next, we need to balance the pressure of the electron gas against the gravitational pull on them. In terms of the work done by each, we have,

    \delta E_P = -P\, 4\pi R^2\, dR  \qquad  \delta E_g = \gamma\, \frac{G M^2}{R^2}\, dR    (8.61)

Here, \gamma is a fudge factor of order unity introduced in order to account for the fact that the mass distribution in a real star will not be uniform. At equilibrium, these two contributions must balance one another, so we must have,

    P = \frac{\gamma}{4\pi} \frac{G M^2}{R^4}    (8.62)
We now relate the Fermi momentum to the radius, by using the formula for the mass density \rho = 3M/4\pi R^3, and we find,

    x = \frac{p_F}{mc} = \frac{R_0}{R}  \qquad  R_0 = \frac{2\pi\hbar}{mc} \left( \frac{9M}{64\pi^2 m_p} \right)^{1/3}    (8.63)
The equilibrium relation then takes the form,

    \frac{A(x)}{x^4} = \frac{6\gamma}{\pi^{1/3}} \frac{G M^2}{\hbar c} \left( \frac{8 m_p}{9M} \right)^{4/3}    (8.64)
or after some simplifications,

    M = \frac{9}{64\, m_p^2} \left( \frac{3\pi}{\gamma^3} \right)^{1/2} \left( \frac{\hbar c}{G} \right)^{3/2} \left( \frac{A(x)}{2x^4} \right)^{3/2}    (8.65)
Now, it is easy to see, for example from Figure 10, that the function A(x) satisfies,

    A(x) \le 2 x^4    (8.66)

for all 0 < x < \infty. As a result, there is a maximum mass that a white dwarf can have, whose value is given by the Chandrasekhar limit, which in our approximation is given by,

    M_c = \frac{9}{64\, m_p^2} \left( \frac{3\pi}{\gamma^3} \right)^{1/2} \left( \frac{\hbar c}{G} \right)^{3/2}    (8.67)

In practice, the limiting mass is not much larger than the mass of the sun: M_c \approx 1.44\, M_\odot. We conclude that our sun will end its life as a white dwarf.
9 Bose-Einstein statistics

In this section, we shall discuss ideal gasses of particles obeying Bose-Einstein statistics. Applications to Planck's black body radiation, microwave background radiation, and Bose-Einstein condensation will be given.

Recall that the basic equations for an ideal BE gas, from equations (7.26), (7.27), and (7.28), are given by,

    N = \sum_i \frac{1}{e^{\beta(\epsilon_i - \mu)} - 1}  \qquad  E = \sum_i \frac{\epsilon_i}{e^{\beta(\epsilon_i - \mu)} - 1}

    G = -PV = -kT \ln Z = kT \sum_i \ln \left( 1 - e^{\beta(\mu - \epsilon_i)} \right)    (9.1)

where we take the sum over all states labelled by i, including their degeneracies, and corresponding energy \epsilon_i. We shall normalize the ground state energy to zero, so that the Bose-Einstein distribution requires,

    \mu \le 0    (9.2)

Here, Z stands for the grand canonical partition function. The Gibbs free energy G is related to these quantities by G = -PV. As we are assuming that the gas is ideal, no mutual interactions take place; the Hamiltonian is just that of the free relativistic or non-relativistic particle, and the \epsilon_i are the free particle energies.
9.1 Black body radiation

Black body radiation refers to a system of photons which are in equilibrium at temperature T with some weakly interacting quantity of matter, usually the material walls of the box or oven in which the photons are being observed. Photons without matter present interact too weakly (through scattering of light by light) to achieve equilibrium. It is key to observe that the number of photons N is not conserved. Hence, the chemical potential associated with the number of photons must vanish. In other words, there is no potential energy associated with the number of photons created, only with their total kinetic energy. Thus, in the above formulas, we must set

    \mu = 0    (9.3)

and we will really be dealing with the canonical ensemble, where G is actually the Helmholtz free energy. Finally, for photons, the energies are given by,

    \epsilon = \hbar\omega    (9.4)
where \omega = |k| c. Since a photon has two degrees of polarization, we have g = 2, and obtain the following formulas for energy and pressure,

    E = \frac{2V}{(2\pi\hbar)^3} \int d^3p\, \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1}  \qquad  PV = -kT \frac{2V}{(2\pi\hbar)^3} \int d^3p\, \ln \left( 1 - e^{-\beta\hbar\omega} \right)    (9.5)
Expressing momentum p = \hbar k, performing the angular integration, and then transforming |k| into frequency \omega, we have,

    \frac{2V\, d^3p}{(2\pi\hbar)^3} \to \frac{V}{\pi^2}\, \frac{\omega^2\, d\omega}{c^3}    (9.6)
The energy density and pressure distributions are then given by,

    \frac{E}{V} = \int_0^\infty \frac{\omega^2\, d\omega}{\pi^2 c^3}\, \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1}  \qquad  P = -kT \int_0^\infty \frac{\omega^2\, d\omega}{\pi^2 c^3}\, \ln \left( 1 - e^{-\beta\hbar\omega} \right)    (9.7)
The famous Planck black body frequency distribution curve is shown in Figure 11. The maximum in the energy distribution function is attained at the maximum of the function \omega^3/(e^{\beta\hbar\omega} - 1), which is given approximately by \hbar\omega_{\rm max} = 2.822\, kT.
The curve in Figure 11 was generated with the Maple commands,

with(plots):
plot(x^3/(exp(x)-1), x=0..15);

Figure 11: The normalized energy distribution function x^3/(e^x - 1) versus x = \hbar\omega/kT.
The temperature dependence may be easily extracted by changing integration variables, \beta\hbar\omega = x. The integrals may be performed by expanding the integrands in powers of the exponential e^{-x}, and we find,

    \frac{E}{V} = \frac{\pi^2 (kT)^4}{15\, c^3 \hbar^3}  \qquad\qquad  \int_0^\infty \frac{dx\, x^3}{e^x - 1} = 6\, \zeta(4)

    P = \frac{\pi^2 (kT)^4}{45\, c^3 \hbar^3}  \qquad  -\int_0^\infty dx\, x^2 \ln \left( 1 - e^{-x} \right) = 2\, \zeta(4)    (9.8)

where \zeta(z) is the Riemann zeta function, and we have \zeta(4) = \pi^4/90. It readily follows that internal energy and pressure are related by,

    E = 3PV    (9.9)
Since the Helmholtz free energy is given by F = E - TS = -PV (since \mu = 0), we can also compute the entropy, and we find,

    \frac{S}{V} = \frac{4\pi^2 k^4 T^3}{45\, c^3 \hbar^3}    (9.10)
One immediate application is to the Stefan-Boltzmann law of black body radiation. A small hole made in the vessel that contains the black body radiation will let a small amount of the radiation out. While the system is then no longer strictly at equilibrium, it is sufficiently close to equilibrium that we can continue to use equilibrium formulas. In particular, the rate of radiation energy flow per unit area is given by,

    \frac{c}{4} \frac{E}{V} = \sigma T^4  \qquad  \sigma = \frac{\pi^2 k^4}{60\, c^2 \hbar^3}    (9.11)

where \sigma is the Stefan-Boltzmann constant.
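Both the value of \sigma and the position of the Planck maximum are quick to verify numerically (a Python sketch; the CGS constants are assumed standard values):

```python
import math

# CGS constants (assumed values, not given in the text)
k, c, hbar = 1.380649e-16, 2.99792458e10, 1.054571817e-27

sigma = math.pi ** 2 * k ** 4 / (60.0 * c ** 2 * hbar ** 3)   # eq. (9.11)

# Maximum of x^3/(e^x - 1): setting the derivative to zero gives x = 3(1 - e^-x),
# solved here by fixed-point iteration
x = 3.0
for _ in range(60):
    x = 3.0 * (1.0 - math.exp(-x))
```

One recovers \sigma \approx 5.67 \times 10^{-5} erg cm^{-2} s^{-1} K^{-4} and \hbar\omega_{max} \approx 2.822\, kT, as quoted above.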
9.2 Cosmic micro-wave background radiation

The Cosmic Micro-wave Background (CMB) radiation was predicted by Robert Dicke on theoretical grounds and discovered experimentally, by accident, by Penzias and Wilson, who were awarded the Nobel Prize in 1978. The universe is permeated by an electro-magnetic radiation, obeying the Planck black body radiation spectral distribution to 1 part in 10^5, and corresponding to a temperature of T_{CMB} = 2.725 K. This is a very low temperature compared to any nuclear or atomic energy scales. Where does it come from?

Assuming the Big Bang took place, then early in the evolution of the Universe, various particle species are all interacting relatively strongly, which allows for local equilibrium to occur. As the Universe cooled down, one crosses the combination temperature T_{comb}. Above T_{comb}, nucleons and electrons form a plasma which interacts with the photons, while below T_{comb} ionized nuclei and electrons combine. The typical ionization energy of atoms and molecules is on the order of T_{comb} \approx 1\, {\rm eV} \approx 10^4\, {\rm K}. At temperatures below T_{comb}, photons do not directly interact with charged particles any more, but rather with electrically neutral atoms and molecules. As a result, below T_{comb}, photons are no longer in equilibrium with atoms and molecules, and decouple. So, below T_{comb}, photons remain as an isolated free gas.

The fact that we observe today CMB radiation at a much lower temperature is explained by the expansion of the universe. Assuming that this expansion is adiabatic, namely that entropy is conserved, we can read off the relation between temperature and volume of the universe from formula (9.10). Thus, we know how the volume V of the present Universe is related to the volume V_{comb} of the Universe at the time of combination:

    \frac{V}{V_{comb}} = \frac{T_{comb}^3}{T_{CMB}^3}    (9.12)

Given the temperatures, this number evaluates to approximately 5 \times 10^{10}. The corresponding red-shift factor is approximately 3000, as may be deduced by comparing the average photon energies.
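These two numbers follow directly from (9.12) (a short Python check, using the temperatures quoted in the text):

```python
T_comb = 1.0e4      # combination temperature in K, order of magnitude from the text
T_cmb = 2.725       # present CMB temperature in K

volume_ratio = (T_comb / T_cmb) ** 3    # V / V_comb, eq. (9.12)
redshift = T_comb / T_cmb               # ratio of average photon energies
```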
9.3 Thermodynamic variables for Bose-Einstein ideal gasses

We shall now consider an ideal gas of non-relativistic particles obeying BE statistics, for which particle number is conserved (unlike in the case of photons, where photon number was not conserved). Using the standard energy-momentum relation \epsilon(p) = p^2/2m, it is clear that the ground state energy vanishes, so that the chemical potential needs to obey \mu \le 0. As a result, the fugacity z = e^{\beta\mu} obeys z \le 1, and the Bose-Einstein functions,

    g_\nu(z) \equiv \frac{1}{\Gamma(\nu)} \int_0^\infty \frac{dx\, x^{\nu-1}}{z^{-1} e^x - 1}    (9.13)

are completely described by the series expansion,

    g_\nu(z) = \sum_{n=1}^\infty \frac{z^n}{n^\nu}    (9.14)

since this series converges absolutely for |z| \le 1 as long as \nu > 1.
Special care needs to be exerted in translating the discrete quantum spectrum of a BE gas in a box of finite size into formulas for continuous distributions as V becomes large. The reason is that the ground state of the 1-particle Hamiltonian, namely for p = 0, can be occupied by a small or by a macroscopically large number of particles, without extra cost in energy. This can happen when z is very close to 1, since then the occupation number of the \epsilon = 0 state can become large,

    \frac{1}{z^{-1} e^{\beta\epsilon} - 1} \bigg|_{\epsilon = 0} = \frac{z}{1-z}    (9.15)

The quantum states at finite energy are not affected by this phenomenon, and the customary translation into continuum states may be applied there. For the internal energy, the zero energy states do not contribute, and we have,

    \frac{E}{V} = \frac{4\pi}{(2\pi\hbar)^3} \int_0^\infty dp\, p^2\, \frac{p^2/2m}{z^{-1} e^{\beta p^2/2m} - 1}    (9.16)
Isolating the \epsilon = 0 contribution to N and P gives the following formulas,

    \frac{N}{V} = \frac{1}{V} \frac{z}{1-z} + \frac{4\pi}{(2\pi\hbar)^3} \int_0^\infty dp\, p^2\, \frac{1}{z^{-1} e^{\beta p^2/2m} - 1}

    P = -\frac{kT}{V} \ln(1-z) - \frac{4\pi kT}{(2\pi\hbar)^3} \int_0^\infty dp\, p^2\, \ln \left( 1 - z e^{-\beta p^2/2m} \right)    (9.17)
Next, we change variables, to exhibit the number N_0 of particles in the ground state,

    N_0 = \frac{z}{1-z}  \qquad  z = \frac{N_0}{N_0 + 1} \approx 1 - \frac{1}{N_0}    (9.18)

Assuming that N_0/V tends to a finite value in the thermodynamic limit, we see that the effects of N_0 survive only in the above formulas for the particle number, but drop out of the formula for the pressure, since we have V^{-1} \ln(1-z) \approx -V^{-1} \ln N_0 \to 0 as V \to \infty with N_0/V fixed. The final formulas for the total number of particles and pressure are thus,
    N - N_0 = \frac{4\pi V}{(2\pi\hbar)^3} \int_0^\infty dp\, p^2\, \frac{1}{z^{-1} e^{\beta p^2/2m} - 1}

    P = -\frac{4\pi kT}{(2\pi\hbar)^3} \int_0^\infty dp\, p^2\, \ln \left( 1 - z e^{-\beta p^2/2m} \right)    (9.19)
In terms of the BE functions, we obtain,

    N - N_0 = \frac{V}{\lambda^3}\, g_{3/2}(z)  \qquad  E = \frac{3}{2}\, PV  \qquad  P = \frac{kT}{\lambda^3}\, g_{5/2}(z)    (9.20)
The equation of state is obtained by eliminating z between the relation

    \frac{PV}{(N - N_0) kT} = \frac{g_{5/2}(z)}{g_{3/2}(z)}    (9.21)

and the first relation in (9.20).
9.4 Bose-Einstein condensation

The number N_e = N - N_0 of excited states, i.e. states with non-zero energy, is bounded from above by the fact that,

    g_{3/2}(z) \le g_{3/2}(1) = \zeta(3/2) \approx 2.612    (9.22)

for all values of z in the physical range 0 < z \le 1, and we have,

    N_e = N - N_0 \le \zeta(3/2)\, V \left( \frac{mkT}{2\pi\hbar^2} \right)^{3/2}    (9.23)

As T is lowered, at fixed volume and fixed total number of particles N, the number of excited states N_e must decrease. At some critical temperature T_c, one is forced to populate the states at zero energy in a macroscopic way, namely with N_0/V finite in the large V limit. This critical temperature is obtained by setting N_0 = 0, and we find,

    T_c = \frac{2\pi\hbar^2}{mk} \left( \frac{N}{\zeta(3/2)\, V} \right)^{2/3}    (9.24)
To leading approximation, namely by setting z = 1, the number of excited and zero energy states are related as follows, for T < T_c,

    \frac{N_0}{N} = 1 - \frac{N_e}{N} = 1 - \left( \frac{T}{T_c} \right)^{3/2}    (9.25)
9.5 Behavior of the specific heat

A calculation of the specific heat may be carried out in parallel to the one done for a FD gas in section 8.2. For BE gasses, however, we must be careful to carry out the calculation corresponding to the gas phase and the condensed phase separately. In either case, the starting point is,

    C_V = \frac{\partial E}{\partial T} \bigg|_V = \frac{3}{2}\, V \frac{\partial P}{\partial T} \bigg|_V  \qquad  P = \frac{kT}{\lambda^3}\, g_{5/2}(z)    (9.26)
In the gas phase, we have z < 1, while in the condensed phase, we set z = 1 in view of (9.18). In the condensed phase, the formula for the pressure simplifies, and we have,

    P = \frac{kT}{\lambda^3}\, g_{5/2}(1)    (9.27)

All reference to the volume has disappeared, so that the specific heat becomes,

    \frac{C_V}{Nk} = \frac{15}{4} \frac{V}{N \lambda^3}\, g_{5/2}(1) \sim T^{3/2}  \qquad\qquad  T < T_c    (9.28)
In the gas phase, the calculation is similar to the one that led to (8.12) for the FD gas, and uses the fact that z depends on T. Thus we have,

    \frac{\partial P}{\partial T} \bigg|_V = \frac{5}{2} \frac{k}{\lambda^3}\, g_{5/2}(z) + \frac{kT}{\lambda^3}\, g_{3/2}(z) \left( \frac{\partial \ln z}{\partial T} \right)_V    (9.29)

The derivative of \ln z is computed by differentiating the equation N \lambda^3 = V g_{3/2}(z), and we find,

    g_{1/2}(z) \left( \frac{\partial \ln z}{\partial T} \right)_V = -\frac{3}{2} \frac{N \lambda^3}{VT}    (9.30)
Putting all together, the specific heat in the gaseous phase is found to be,

    \frac{C_V}{Nk} = \frac{15}{4} \frac{g_{5/2}(z)}{g_{3/2}(z)} - \frac{9}{4} \frac{g_{3/2}(z)}{g_{1/2}(z)}  \qquad\qquad  T > T_c    (9.31)
As z \to 1, we have g_{1/2}(z) \to \infty, and so near the transition point the second term cancels, and we find that the specific heat in the gas phase and in the condensed phase match at the transition point, since we have N \lambda^3 = V g_{3/2}(z) there. The slope at T = T_c is discontinuous across the transition. This is a sign of a phase transition, as we shall study in detail in subsequent sections.
Figure 12: Molecular specific heat curve across the Bose-Einstein condensation transition.
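The value of the specific heat at the transition, where N\lambda^3 = V\zeta(3/2), is the same from either (9.28) or (9.31) with z = 1, namely (15/4)\,\zeta(5/2)/\zeta(3/2); this added Python check computes it from the series (9.14):

```python
def zeta(s, N=100_000):
    # zeta(s) for s > 1, with an Euler-Maclaurin tail estimate for the sum
    return sum(n ** -s for n in range(1, N + 1)) + N ** (1 - s) / (s - 1) - 0.5 * N ** -s

cv_at_tc = 3.75 * zeta(2.5) / zeta(1.5)   # C_V/Nk at T = T_c
```

The result, approximately 1.926, lies above the classical value 3/2, consistent with the cusp shown in Figure 12.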
10 Phase coexistence: thermodynamics

A given system or material can exhibit different physically distinct phases, as a function of the ranges of its macroscopic parameters. A common example is that of water, which may exist in its solid phase of ice, its liquid phase, or its vapor phase. Different phases may co-exist in equilibrium. The passage from one phase to another, as thermodynamic variables are changed, is referred to as a phase transition. Given the equations of state in both phases, thermodynamics gives an effective formulation of the conditions of equilibrium between different phases, and transitions between them. The underlying microscopic dynamics which is responsible for the system existing in one phase or another, or transitioning between one phase and another, is, however, captured only by the application of statistical mechanics.

10.1 Conditions for phase equilibrium

We shall begin with a thermodynamic study of the equilibrium between gasses, liquids and solids, or any other forms of matter. If we have two systems, referred to as S_1 and S_2, such as the two different phases of matter, in equilibrium with one another, then their temperature, pressure and chemical potentials must match,

    T_2 = T_1  \qquad  P_2 = P_1  \qquad  \mu_2 = \mu_1    (10.1)
Given these conditions, it will be most convenient to work not with the variables E, V, N but rather with the intensive parameters T, P, \mu in terms of which the equilibrium conditions are written. In either phase, these three variables are not independent, so we shall choose T, P and express the chemical potential for each phase as a function of T, P,

    \mu_1 = \mu_1(T, P)  \qquad  \mu_2 = \mu_2(T, P)    (10.2)

The chemical potentials \mu_1 and \mu_2 have different functional forms for the two phases. For given T, P it may, or may not, be possible to set the chemical potentials equal. Thus we conclude that

    if \mu_1(T, P) = \mu_2(T, P)  then phases 1 and 2 can coexist at T, P
    if \mu_1(T, P) \ne \mu_2(T, P)  then phases 1 and 2 cannot coexist at T, P    (10.3)

Generally, the relation \mu_1(T, P) = \mu_2(T, P) will hold on a line segment in the T, P plane, as shown in Figure 13.
Figure 13: Phase coexistence curve in (T, P) and (V, P) variables ending at a critical point.
In terms of different thermodynamic variables, such as V, P, the phase diagram takes on a completely different form. Above P_c, we read off from the T, P diagram that there is a unique system for each value of V, P, but this ceases to be the case for P < P_c. The two phases below P_c may coexist, but their densities are unequal, leaving the volume only partially determined. We shall analyze this phenomenon with the help of the Van der Waals equation of state.
Figure 14: A gas-liquid-solid phase diagram exhibiting a triple point and an end point.
More than two phases can coexist pairwise at various temperatures and pressures. It is also possible for three phases to coexist at a point (generically). In fact this is the case for the standard gas-liquid-solid phase diagram, as shown schematically in Figure 14. At the triple point, we have the relations,

    \mu_{\rm gas}(T, P) = \mu_{\rm liquid}(T, P) = \mu_{\rm solid}(T, P)    (10.4)

This phase diagram applies, for example, to water.
10.2 Latent heat

The intensive quantities different from T, P, \mu are generally discontinuous across a phase separation line. These include the molecular volume v = V/N and the molecular entropy s = S/N. We shall assume that the substance of interest is composed of a single material, such as water, and that molecules of this substance are only transferred from one phase to another. The total N is then conserved. Evaluating the relation

    E = TS - PV + \mu N    (10.5)

on both phases, and taking into account that T, P, \mu coincide for these phases at equilibrium, we find the relation,

    E_2 - E_1 = T(S_2 - S_1) - P(V_2 - V_1)    (10.6)

By the first law of thermodynamics, we have

    E_2 - E_1 + P(V_2 - V_1) = Q_2 - Q_1 = Q    (10.7)

namely the latent heat transfer between the phases. Combining both equations, we find that

    Q = T(S_2 - S_1)    (10.8)

This relation shows that the process of phase change is reversible, upon release or absorption of the latent heat, since it agrees with the reversible relation \delta Q = T\, dS at constant T.
10.3 Clausius-Clapeyron equation

The Clausius-Clapeyron equation gives the shape of the coexistence curve in terms of the differences of molecular volume and entropy. It is a good exercise in thermodynamics gymnastics to work this out. Along the coexistence curve between two phases, we have,

    \mu_1(T, P) = \mu_2(T, P)    (10.9)

and this relation determines the dependence between T, P along the coexistence curve. Differentiating this relation with respect to T,

    \frac{\partial \mu_1}{\partial T} + \frac{\partial \mu_1}{\partial P} \frac{dP}{dT} = \frac{\partial \mu_2}{\partial T} + \frac{\partial \mu_2}{\partial P} \frac{dP}{dT}    (10.10)
The quantity of interest to us is dP/dT, as this gives a differential equation for the coexistence curve. To evaluate the other quantities in the equation, we use T, P, N as independent thermodynamic variables, where N will in the end be fixed. The appropriate thermodynamic potential is

    \Phi = E - TS + PV = \mu N    (10.11)

sometimes referred to as the Gibbs potential. Its differential is given by,

    d\Phi = -S\, dT + V\, dP + \mu\, dN    (10.12)

but the last term will not be important here as dN = 0. The relation may also be expressed in terms of \mu using (10.5) or (10.11), and for constant N we find,

    d\mu = -s\, dT + v\, dP    (10.13)

where again s = S/N and v = V/N. The desired ingredients for (10.10) now immediately follow, and we have,

    \frac{\partial \mu}{\partial T} \bigg|_P = -s  \qquad  \frac{\partial \mu}{\partial P} \bigg|_T = v    (10.14)

so that,

    \frac{dP}{dT} = \frac{s_2 - s_1}{v_2 - v_1}    (10.15)

This is the Clausius-Clapeyron equation.
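A standard use of (10.15) is to estimate the slope of the vapor-pressure curve; the Python sketch below does this per gram rather than per molecule (the ratio is unchanged), with s_2 - s_1 = Q/T from (10.8). The numerical inputs are illustrative values for water at its normal boiling point, not taken from the text:

```python
# Illustrative CGS numbers for water at its normal boiling point (assumed)
L = 2.26e10       # latent heat per gram, erg/g
T = 373.15        # boiling temperature, K
dv = 1.67e3       # v_vapor - v_liquid per gram, cm^3/g

dPdT = L / (T * dv)          # eq. (10.15) with s_2 - s_1 = (Q per gram)/T
dPdT_Pa_per_K = 0.1 * dPdT   # 1 dyn/cm^2 = 0.1 Pa
```

The result, a few kPa per Kelvin, is the familiar magnitude of the boiling-point shift of water with pressure.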
10.4 Example of the Van der Waals gas-liquid transition
The Van der Waals equation of state,
(P + a/V²)(V − b) = NkT,   a, b > 0 (10.16)
is rich enough to describe an interacting gas which can condense into a liquid phase. To study how this works, we work in T, P variables, the total number N of molecules being fixed. Given T, P, the equation for the volume V is a polynomial of third degree with real coefficients,
PV³ − (bP + NkT)V² + aV − ab = 0 (10.17)
The equation always has at least one real root, the additional two roots being either both real, or complex conjugates of one another. For sufficiently large P and T, we may approximate the equation by PV³ − NkTV² = 0, which reduces to the ideal gas equation of state, and the solution is unique. Thus, for T, P sufficiently large, we have two complex conjugate solutions in addition to the single real root. As T, P are lowered, all three solutions will become real.
This will happen when the two roots of the derivative equation coincide, a point we denote by V_c. But V_c must also satisfy the Van der Waals equation, so that the equation must be equivalent to (V − V_c)³ = 0. Multiplying by P_c and expanding gives,
P_c V³ − 3P_c V_c V² + 3P_c V_c² V − P_c V_c³ = 0 (10.18)
Identifying coefficients with the equation of (10.17) gives,
bP_c + NkT_c = 3P_c V_c,   a = 3P_c V_c²,   ab = P_c V_c³ (10.19)
which is solved uniquely as follows,
V_c = 3b,   P_c = a/(27b²),   NkT_c = 8a/(27b) (10.20)
Eliminating a, b from the Van der Waals equation in favor of T_c, P_c, V_c gives,
(P/P_c + 3V_c²/V²)(3V/V_c − 1) = 8 T/T_c (10.21)
Note that this curve is universal: it has no free parameters left in terms of the normalized thermodynamic variables T/T_c, P/P_c, and V/V_c. A schematic plot of its isothermal curves (lines of constant T) in the (V, P) variables is given in figure 15.
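The critical values (10.20) can be checked numerically: at T = T_c, P = P_c, the cubic (10.17) must have a triple root at V = V_c. A minimal sketch (the values of a, b below are arbitrary illustrative choices, in units where k = 1):

```python
# Check that the Van der Waals critical point (10.20) makes the cubic (10.17)
# degenerate: a triple root at V = Vc, so the cubic and its first derivative
# in V both vanish there. Units and parameter values are illustrative only.
a, b, N, k = 2.0, 0.5, 1.0, 1.0

Vc = 3.0 * b
Pc = a / (27.0 * b**2)
Tc = 8.0 * a / (27.0 * b * N * k)

def cubic(V, T, P):
    # P V^3 - (b P + N k T) V^2 + a V - a b, from (10.17)
    return P * V**3 - (b * P + N * k * T) * V**2 + a * V - a * b

assert abs(cubic(Vc, Tc, Pc)) < 1e-12
# first derivative: 3 P V^2 - 2 (b P + N k T) V + a
assert abs(3.0 * Pc * Vc**2 - 2.0 * (b * Pc + N * k * Tc) * Vc + a) < 1e-12
print("triple root confirmed at Vc =", Vc)
```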
For T > T_c, the pressure is a monotonically decreasing function as V increases at constant T. This is in accord with the physical behavior of a gas. For T < T_c, however, the situation is more complex. As one decreases the volume away from very large volume, the pressure at first increases monotonically, as was the case for T > T_c. But as the volume is further reduced, there comes a point where the pressure starts to decrease (on the lowest curve in the left panel of figure 15). This behavior is not consistent with that of a gas. Rather, the tendency of the pressure to decrease should be viewed as a symptom of condensation into the liquid phase.
10.5 The Maxwell construction
The actual physical equilibrium curve between the two coexisting phases may be derived by using the Gibbs-Duhem relation,
S dT − V dP + N dμ = 0 (10.22)
Figure 15: The (V, P) diagram of isotherms for the Van der Waals equation of state on the left panel, and the Maxwell construction for phase equilibrium on the right panel.
Along an isotherm we have dT = 0, so the first term drops out. The remaining terms, with constant N, give the differential relation N dμ = V dP. Integrating this relation between the two phases must reflect the equality of the chemical potential in the two phases, and thus we must have,
0 = N(μ_2 − μ_1) = ∫_1^2 V dP (10.23)
This relation determines a horizontal line in the (V, P) diagram of the Van der Waals equation of state such that the area above the line equals the area below it. This is precisely the Maxwell construction for equilibrium between two phases.
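The equal-area condition (10.23) is easy to implement numerically for the Van der Waals isotherms in the reduced variables of (10.21). The sketch below (not part of the original notes; the bisection bracket and the test temperature T/T_c = 0.9 are illustrative choices) finds the coexistence pressure by bisecting on the signed area between the isotherm and a horizontal line:

```python
# Numerical Maxwell construction for a Van der Waals isotherm, in the reduced
# variables of (10.21): p = P/Pc, v = V/Vc, t = T/Tc, (p + 3/v^2)(3v - 1) = 8t.
# For t < 1 we seek the pressure p* at which the outer roots v1 < v3 of
# p_iso(v) = p* satisfy the equal-area condition of (10.23):
#   integral from v1 to v3 of (p_iso(v) - p*) dv = 0.
import numpy as np

def p_iso(v, t):
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2

def outer_roots(p, t):
    # p_iso(v) = p  <=>  3 p v^3 - (p + 8 t) v^2 + 9 v - 3 = 0
    r = np.roots([3.0 * p, -(p + 8.0 * t), 9.0, -3.0])
    r = np.sort(r.real[np.abs(r.imag) < 1e-7])
    return r[0], r[-1]

def area(p, t):
    # signed area between isotherm and line; antiderivative of p_iso is
    # (8t/3) ln(3v - 1) + 3/v
    v1, v3 = outer_roots(p, t)
    F = lambda v: (8.0 * t / 3.0) * np.log(3.0 * v - 1.0) + 3.0 / v
    return F(v3) - F(v1) - p * (v3 - v1)

def coexistence_pressure(t, p_lo, p_hi, n=60):
    # bisection: area > 0 means the line sits too low, area < 0 too high
    for _ in range(n):
        p_mid = 0.5 * (p_lo + p_hi)
        if area(p_mid, t) > 0.0:
            p_lo = p_mid
        else:
            p_hi = p_mid
    return 0.5 * (p_lo + p_hi)

p_star = coexistence_pressure(0.9, 0.45, 0.72)
print(f"reduced coexistence pressure at T/Tc = 0.9: p* = {p_star:.4f}")
```

The bracket (0.45, 0.72) must lie between the two spinodal pressures of the isotherm, so that three real roots exist throughout the bisection.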
11 Phase transitions: statistical mechanics
A statistical mechanics understanding of the existence and coexistence of various phases of
matter and the phase transitions between them is what we will pursue in this section. After
a brief classification of the different kinds of phase transitions in Nature, we will proceed to studying the Ising model, which captures so many of the key features of phase transitions.
11.1 Classification of phase transitions
In the gas-liquid-solid phase diagram of figure 10, the variables T, P, μ are continuous across the phase coexistence curves, but the molecular volume v and molecular entropy s are discontinuous. When this is the case, the phase transition across the coexistence curve is referred to as first order. A first order transition is always accompanied by the exchange of latent heat, and a change of molecular volume, both of which are related to the slope dP/dT by the Clausius-Clapeyron equation.
More generally, the free energy F, or the Gibbs potential G for the grand canonical ensemble, or whichever thermodynamic potential is given directly by the logarithm of the partition function ln Z, is always continuous, but its derivatives may exhibit different degrees of discontinuity. The order of a phase transition is n ≥ 1 provided the partition function Z and its first n − 1 derivatives are continuous, but its derivative of order n is discontinuous. Thus, in a second order phase transition, the molecular entropy and molecular volume are continuous as well as the free energy and Gibbs potential, but the specific heat is discontinuous, since it is given by two derivatives of the partition function.
Examples of first order transitions include the gas-liquid-solid transitions already discussed. Examples of second order phase transitions include the Curie phase transition in ferromagnets, and the phase transitions to superfluidity, superconductivity, and so on. Landau developed a general theory of second order phase transitions based on symmetry considerations and the existence of an order parameter. Generally, second order phase transitions in the presence of an external field, such as a magnetic field, may either disappear or become first order. Conversely, the nature of the first order phase transition along the gas-liquid coexistence curve in figure 10 changes quantitatively: the latent heat diminishes as one approaches the end point of the curve, where it vanishes. Thus, the phase transition precisely at the end point is second order.
Finally, it pays to pause for a moment and think about the significance of a phase transition from a purely mathematical point of view. Consider a partition function in the canonical ensemble, given in terms of the energy levels ε_i of the system,
Z(β) = Σ_i e^{−βε_i} (11.1)
If this sum were finite, then Z(β) would be an analytic function of β, and in particular all its derivatives would be continuous to all orders. So, in any system with a finite number of states, no phase transitions can ever occur. Even with an infinite number of states available, Z(β) will often still be analytic, as is the case for the harmonic oscillator for example. So, one really needs an infinite number of degrees of freedom, usually corresponding to the thermodynamic limit,
N, V → ∞ (11.2)
In finite volume V, and with finite N, no phase transitions will occur. But in Nature, N is very large, yet not infinite. While this is certainly true, it is mathematically more convenient to view a function as discontinuous than to consider a function whose slope is of order N ≈ 10²³. So, to some extent, phase transitions provide a mathematically simplified picture of a very, very large system.
11.2 The Ising Model
The Ising model is one of the simplest, and most useful, systems in statistical mechanics. Its degrees of freedom are classical spins s_i which can take the values ±1 at each site i of a d-dimensional lattice, as depicted by arrows in figure 16. It is customary to take the lattice to be square (i.e. cubic in 3 dimensions), but the Ising model has been considered on most other types of lattices as well, including triangular, hexagonal, random, Bethe, etc.
Figure 16: A configuration of spins in the 2-dimensional Ising model.
The Hamiltonian is taken as follows,
H = −J Σ_{⟨i,j⟩} s_i s_j − b Σ_i s_i (11.3)
Here J is the coupling strength, and b is an exterior parameter such as a magnetic field; both of these parameters are real, and could be positive or negative. The symbol ⟨i, j⟩ indicates that the sum over the sites i, j is restricted in some way, for example to nearest neighbors.
The Ising model describes binary degrees of freedom. It may capture the interactions between magnetic moments of electrons at lattice sites, in the approximation where only the contribution of one of their spatial components is retained (when all components are retained, we have the Heisenberg model instead). It can also capture the degrees of freedom of a binary mixture of substances A and B, where s_i = +1 corresponds to an atom of A while s_i = −1 corresponds to an atom of B.
To begin the study of the dynamics of the system, we set the external field b = 0. The nature of the ground state of the system then depends on the sign of J.
The coupling J > 0 is referred to as ferromagnetic, because the two possible ground states have all spins equal, either all up or all down. Note that the nature of these ground states does not depend on the lattice structure. When thought of as magnetic spins, the system would then be magnetized in either one of its ground states. Of course, thermal fluctuations may wash out the magnetization as the temperature is increased. One can show that the spin waves around this ground state obey a non-relativistic dispersion relation ω ∼ k².
The coupling J < 0 is referred to as anti-ferromagnetic. The nature of the ground state now depends to some degree on the structure of the lattice. The energy associated with the coupling between two spins i, j included in the sum over ⟨i, j⟩ is minimized when s_i and s_j are opposite. But if the coupling is nearest neighbor on a triangular lattice, then it is impossible to satisfy minimum energy for all three bonds on the triangle; the system is said to be frustrated. On a d-dimensional square lattice, whose sites are labelled by integers (i_1, i_2, ..., i_d), we do have a two-fold degenerate ground state with absolutely minimal energy, given by a perfectly alternating spin assignment,
s_{i_1, i_2, ..., i_d} = (−1)^{i_1 + i_2 + ⋯ + i_d} s_0,   s_0 = ±1 (11.4)
This state is called the Néel ground state of an anti-ferromagnetic system. One can show that the spin waves around this ground state obey a linear dispersion relation ω ∼ |k|, which is akin to the relation for a massless relativistic particle.
11.3 Exact solution of the 1-dimensional Ising Model
The partition function Z is defined by,
Z = Σ_{s_1, s_2, ..., s_N = ±1} exp( β Σ_{i=1}^N (J s_i s_{i+1} + b s_i) ) (11.5)
where we make the system periodic by requiring s_{N+1} = s_1. To compute Z, we write it as the trace of a sequential product,
Z = tr( E_1 E_2 ⋯ E_{N−1} E_N )
E_i = exp( βJ s_i s_{i+1} + (βb/2)(s_i + s_{i+1}) ) (11.6)
The problem now becomes one of 2 × 2 matrix multiplication. To see this, we work out the matrix elements of E_i that enter here, for σ, σ′ = ±1,
T_{σ,σ′} = exp( βJ σσ′ + (βb/2)(σ + σ′) ) (11.7)
The partition function is then given by,
Z = Σ_{σ_i = ±1} T_{σ_1,σ_2} T_{σ_2,σ_3} ⋯ T_{σ_N,σ_1} = tr( T^N ) (11.8)
Written out explicitly, the transfer matrix T is given by,
T = ( e^{β(J+b)}   e^{−βJ}
      e^{−βJ}     e^{β(J−b)} ) (11.9)
Its eigenvalues λ_± satisfy the equation,
λ² − 2λ e^{βJ} ch(βb) + 2 sh(2βJ) = 0 (11.10)
which is solved by,
λ_± = e^{βJ} ch(βb) ± [ e^{2βJ} sh²(βb) + e^{−2βJ} ]^{1/2} (11.11)
Therefore, the partition function is given by Z = λ_+^N + λ_−^N for all values of N, β, J, and b. Since we have λ_+ > |λ_−| for all values and signs of J, the thermodynamic limit of the system simplifies as we take N → ∞. The free energy per site is given by,
F/N = −kT lim_{N→∞} (1/N) ln( λ_+^N + λ_−^N ) = −kT ln λ_+ (11.12)
This function is analytic in β, so there are no phase transitions for any finite value of β. In other words, the system is in the same thermodynamic phase for all temperatures.
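The transfer-matrix result is easily verified against a brute-force sum over all spin configurations for small N. A minimal sketch (with the sign convention H = −J Σ s_i s_{i+1} − b Σ s_i and periodic boundary conditions, as above; the parameter values are arbitrary):

```python
# Check Z = tr(T^N) = lambda_+^N + lambda_-^N for the 1d Ising model (11.5)
# against direct enumeration of all 2^N periodic spin configurations.
import itertools
import math
import numpy as np

def Z_enum(N, beta, J, b):
    Z = 0.0
    for s in itertools.product([1, -1], repeat=N):
        # H = -J sum s_i s_{i+1} - b sum s_i, periodic: s_{N+1} = s_1
        E = -sum(J * s[i] * s[(i + 1) % N] + b * s[i] for i in range(N))
        Z += math.exp(-beta * E)
    return Z

def Z_transfer(N, beta, J, b):
    # transfer matrix (11.9)
    T = np.array([[math.exp(beta * (J + b)), math.exp(-beta * J)],
                  [math.exp(-beta * J),      math.exp(beta * (J - b))]])
    lam = np.linalg.eigvalsh(T)
    return sum(l**N for l in lam)

N, beta, J, b = 8, 0.7, 1.0, 0.3
Z1, Z2 = Z_enum(N, beta, J, b), Z_transfer(N, beta, J, b)
assert abs(Z1 - Z2) < 1e-8 * Z2
print("Z =", Z2)
```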
11.4 Ordered versus disordered phases
An important qualitative characteristic of the dynamics of statistical magnetic systems is order versus disorder. For magnetic systems, this property may be understood systematically in terms of the magnetization, defined as the thermodynamic average of the spin,
m(b) = −(1/N) ∂F/∂b (11.13)
In the case of the 1-dimensional Ising model, this quantity is easily computed for both J > 0 and J < 0. The eigenvalue λ_+ depends on b through an even function of b, and thus the magnetization m(b) at zero external magnetic field always vanishes. This result is interpreted as the fact that the spins in the system, on average and at all temperatures, point in all directions randomly, so that their total contribution to the magnetization vanishes in the bulk.
When can a system be ordered then? We have seen previously that for J > 0, the minimum energy states are,
s_i = +1 for all i
s_i = −1 for all i (11.14)
These ground states are mapped into one another by the spin reversal symmetry R of the Hamiltonian at b = 0. If both ground states contribute to the partition function, then the total magnetization will be wiped out, and the system will remain in a disordered phase. When N is finite this will always be the case. But when N → ∞, it is possible for the system to get stuck in one ground state or the other. The reason this only happens for infinite N is that it would then take an infinite number of spin flips to transition between the s_i = +1 and s_i = −1 states, and this may become energetically impossible. When the system gets stuck in one of its ground states, then m(0) ≠ 0 and we have spontaneous magnetization, familiar from ferromagnetism below the Curie temperature. The operation of spin reversal, which is a symmetry of the Hamiltonian for b = 0, is then NOT a symmetry of the physical system any more, as a definite non-zero value of m(0) is not invariant under R. The symmetry R is said to be spontaneously broken, and the system is then in an ordered phase, close to one of its ground states. We have already shown that, for the 1-dimensional Ising model, this phenomenon does not take place.
The 2-dimensional Ising model, however, does exhibit an ordered phase below a critical temperature T_c. This has been known since the model was solved exactly by Lars Onsager in 1944, and the critical temperature is known analytically,
sh(2β_c J) = 1,   2β_c J = 0.881374 (11.15)
The corresponding magnetization was computed by C.N. Yang,
m(0) = [ 1 − 1/sh⁴(2βJ) ]^{1/8}   for T < T_c
m(0) = 0   for T > T_c (11.16)
Note that as T → T_c from below, the expression m(0) ∼ (T_c − T)^{1/8} vanishes and joins continuously with the T > T_c result m(0) = 0. The phase transition at T = T_c is therefore second order. The exponent 1/8 is referred to as a critical exponent, in this case of the magnetization order parameter m. Critical exponents tend to be universal objects, whose dependence on the detailed short-distance interactions is limited.
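The critical coupling in (11.15) and the magnetization formula (11.16) are easy to evaluate numerically; sh(2β_c J) = 1 gives 2β_c J = ln(1 + √2). A short sketch:

```python
# Evaluate the Onsager/Yang results for the 2d Ising model: the critical
# coupling from sh(2 beta_c J) = 1, and the spontaneous magnetization (11.16).
import math

J = 1.0
beta_c = math.asinh(1.0) / (2.0 * J)   # 2 beta_c J = ln(1 + sqrt(2))
print(f"2 beta_c J = {2.0 * beta_c * J:.6f}")  # 0.881374

def m0(beta, J=1.0):
    # spontaneous magnetization (11.16); zero at and above Tc (beta <= beta_c)
    if beta <= beta_c:
        return 0.0
    return (1.0 - math.sinh(2.0 * beta * J) ** (-4)) ** 0.125

assert m0(0.99 * beta_c) == 0.0                          # T > Tc: disordered
assert 0.0 < m0(1.01 * beta_c) < m0(2.0 * beta_c) < 1.0  # T < Tc: ordered
```

Because of the small exponent 1/8, m(0) rises extremely steeply just below T_c: already at β = 1.01 β_c it is of order 0.7.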
Whether the 3-dimensional Ising model allows for an exact solution is one of the great outstanding problems of statistical mechanics. Proposals have been made that the model behaves as a theory of free fermionic random surfaces, but the details have never been conclusive. Numerical studies of the model have shown, however, that it too admits a phase transition between ordered (low temperature) and disordered (high temperature) phases.
11.5 Mean-field theory solution of the Ising model
Short of an exact solution, we may try to approach the problem for the ferromagnetic coupling J > 0 with the help of some drastic simplifications. In mean-field theory, one assumes that fluctuations of the spins away from their average value are small, and one determines this average value self-consistently. Our starting point is to recast the Hamiltonian in terms of the average magnetization per spin m, which is to be determined. We proceed as follows,
s_i s_j = (s_i − m)(s_j − m) + m(s_i + s_j) − m² (11.17)
The key assumption of mean-field theory is that the statistical average of the first term on the right side may be neglected. The remaining mean-field Hamiltonian is obtained by summing over all pairs ⟨i, j⟩. Reformulating this sum in terms of a sum over individual sites may be done by noticing that on a d-dimensional square lattice, the number of bonds from a site is 2d. The Hamiltonian is then given by,
H_mf = JNdm² − b_e Σ_i s_i,   b_e = b + 2dJm (11.18)
Here, b_e is the effective magnetic field felt by each spin s_i due to the mean value of the spins that surround it. The partition function is now easily computed, leaving m as a parameter which remains to be determined self-consistently. We find,
Z = e^{−βJNdm²} ( e^{βb_e} + e^{−βb_e} )^N (11.19)
The average magnetization computed from Z is then to be set equal to m,
m = (1/N) ∂ ln Z/∂(βb) = tanh( β(b + 2dJm) ) (11.20)
Solving for m at b = 0, we see that there is always the solution m = 0. When 2dβJ < 1, namely for higher temperatures 2dJ < kT, m = 0 is the only solution, and we have
Figure 17: Plot of the curves x in blue, and tanh(2x), tanh(x), and tanh(x/2) in red; only the first produces an intersection with x ≠ 0.
no spontaneous magnetization. For 2dJ > kT, on the other hand, we also have a solution with non-zero m, as may be seen by inspecting figure 17.
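The self-consistency condition at b = 0 can also be solved by simple fixed-point iteration of m → tanh(2dβJ m), which converges to the stable solution in each regime. A minimal sketch (the initial guess and parameter values are illustrative):

```python
# Fixed-point iteration for the mean-field equation m = tanh(2 d beta J m)
# at b = 0, illustrating figure 17: a nonzero solution exists only
# when 2 d beta J > 1.
import math

def solve_m(two_d_beta_J, m_init=0.9, n_iter=200):
    m = m_init
    for _ in range(n_iter):
        m = math.tanh(two_d_beta_J * m)
    return m

print(solve_m(0.5))  # 2 d beta J < 1: iterates collapse to m = 0
print(solve_m(2.0))  # 2 d beta J > 1: converges to a nonzero magnetization
```

Below the transition the map is a contraction toward the origin; above it, m = 0 becomes unstable and the iteration flows to the nontrivial intersection of figure 17.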
It is straightforward to solve in the approximation where 2dβJ − 1 ≪ 1, so that m will be small. One finds the following mean-field result for the magnetization near the phase transition point,
m ≈ [ 6dJk β_c² (T_c − T) ]^{1/2},   2dβ_c J = 1 (11.21)
We see that mean-field theory gives a completely wrong answer for d = 1, where there is no phase transition at all in view of our exact result. For d = 2, comparison with the Onsager solution shows that β_c is off by a factor of 2, but more importantly, the critical exponent is off by a factor of 4. So, while mean-field theory correctly predicts a phase transition for d = 2, the details are not impressive. Numerical simulations show that in three dimensions,
m ∼ (T_c − T)^{0.308} (11.22)
so this is closer to the mean-field exponent of 0.5 for d = 3. In fact, one can argue that mean-field theory becomes exact in the limit d → ∞.
12 Functional integral methods
Functional integral methods may be extended from quantum mechanics to quantum statistical mechanics, where they provide an extremely useful formalism. In this section, we begin by deriving the functional integral representation for the partition function of a quantum mechanical system. We use the path integral representation to study the classical limit, low and high temperature asymptotics, as well as perturbation theory.
12.1 Path integral representation for the partition function
We give the derivation for the case of a system with just one degree of freedom, p, q, represented by quantum operators P, Q via the correspondence principle. The operators obey canonical commutation relations [Q, P] = iħ, and time evolution is governed by a self-adjoint, time-independent quantum Hamiltonian H = H(P, Q). The generalization to N degrees of freedom will be straightforward. The object is to calculate the partition function,
Z = Tr( e^{−βH} ) (12.1)
The Boltzmann operator e^{−βH} may be viewed as an analytic continuation to imaginary time t → −iβħ of the evolution operator U(t) = e^{−itH/ħ}. Thus, the derivation given here for the partition function will show strong parallels, but also some important differences, with the construction of the path integral representation for quantum mechanics.
Recall the respective bases in which the operators Q and P are diagonal,
Q|q⟩ = q|q⟩
P|p⟩ = p|p⟩ (12.2)
Orthonormality and completeness hold as follows,
⟨q′|q⟩ = δ(q − q′),   I_H = ∫ dq |q⟩⟨q|
⟨p′|p⟩ = 2πħ δ(p − p′),   I_H = ∫ (dp/2πħ) |p⟩⟨p| (12.3)
where I_H is the identity element in Hilbert space. Translation operators in both bases satisfy,
e^{+iaP/ħ} Q e^{−iaP/ħ} = Q + a,   e^{−iaP/ħ}|q⟩ = |q + a⟩
e^{−ibQ/ħ} P e^{+ibQ/ħ} = P + b,   e^{+ibQ/ħ}|p⟩ = |p + b⟩ (12.4)
From these relations, we deduce the values of the mixed matrix elements,
⟨q|p⟩ = e^{+iqp/ħ},   ⟨p|q⟩ = e^{−iqp/ħ} (12.5)
This summarizes all that will be needed to derive the functional integral representation. To evaluate the trace of the Boltzmann operator in (12.1), we divide β into N intervals of equal length ε, such that β = Nε, and use the formula,
e^{−βH} = ( I_H e^{−εH} )^N (12.6)
We represent I_H using both completeness relations of (12.3) in the following manner,
I_H = ∫ (dp dq/2πħ) |q⟩⟨q|p⟩⟨p| (12.7)
In practice, we will need N different pairs of integration variables p_n, q_n with n = 1, ..., N, namely one pair for each inserted identity operator. Working this out, we find,
Z = Tr( e^{−βH} ) = ∏_{n=1}^N ∫ (dp_n dq_n/2πħ) ⟨q_{n−1}|p_n⟩⟨p_n|e^{−εH}|q_n⟩ (12.8)
with the understanding that we set q_0 = q_N.
Letting ε → 0, we may expand the Boltzmann operator in a power series in ε,
e^{−εH} = I_H − εH + O(ε²) (12.9)
The whole purpose of inserting the completeness relations in both |p⟩ and |q⟩ states is to obtain a mixed representation in which matrix elements of both the kinetic and the potential terms in the Hamiltonian may be evaluated explicitly. Consider, for example, a Hamiltonian of the form H = P²/2m + U(Q). It is manifest that we have ⟨p|H|q⟩ = ⟨p|q⟩ H(p, q), where H(p, q) is the classical Hamiltonian. Given a general Hamiltonian H, we can always order all P operators to the left of all Q operators, using [Q, P] = iħ. It is in this form that we can use it to define the classical Hamiltonian H(p, q) by,
⟨p|H|q⟩ = ⟨p|q⟩ H(p, q) (12.10)
The needed matrix elements are now readily evaluated, and we have,
⟨p_n|e^{−εH}|q_n⟩ = ⟨p_n|q_n⟩ ( 1 − εH(p_n, q_n) + O(ε²) ) ≈ ⟨p_n|q_n⟩ e^{−εH(p_n, q_n)} (12.11)
up to contributions that are of higher order in ε. It remains to evaluate the combination,
⟨q_{n−1}|p_n⟩⟨p_n|q_n⟩ = e^{i(q_{n−1} − q_n)p_n/ħ} (12.12)
Putting all together, we have,
⟨q_{n−1}|p_n⟩⟨p_n|e^{−εH(P,Q)}|q_n⟩ = e^{i(q_{n−1} − q_n)p_n/ħ − εH(p_n, q_n)} (12.13)
The final step consists of a change of notation,
τ_n = εnħ = βħn/N
q_n = q(τ_n)
p_n = p(τ_n) (12.14)
The argument of the exponential may now be recast as follows,
(i/ħ)(q_{n−1} − q_n) p_n − εH(p_n, q_n)
= −((τ_n − τ_{n−1})/ħ) [ i ((q(τ_{n−1}) − q(τ_n))/(τ_{n−1} − τ_n)) p(τ_n) + H(p(τ_n), q(τ_n)) ]
≈ −(1/ħ) ∫_{τ_{n−1}}^{τ_n} dτ [ i q̇(τ) p(τ) + H(p(τ), q(τ)) ] (12.15)
Taking N → ∞, so that ε → 0, renders the first-order expansion in ε exact. We introduce the functional integration measure,
∫ Dp Dq = lim_{N→∞} ∏_{n=1}^N ∫ (dp(τ_n) dq(τ_n)/2πħ) (12.16)
and notice that the trace condition produces a periodicity requirement on the position coordinate, q(βħ) = q(0). Putting all together,
Z = ∫ Dp Dq exp( −(1/ħ) ∫_0^{βħ} dτ [ i q̇(τ) p(τ) + H(p(τ), q(τ)) ] ) (12.17)
The momenta enter without derivatives in this expression. Without loss of generality, we may impose periodic boundary conditions also on the variable p, so that we have,
p(βħ) = p(0)
q(βħ) = q(0) (12.18)
Although expression (12.17) involves a factor of i, we are assured from the outset that the partition function must be real.
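For the harmonic oscillator, the discretized path integral can be carried out exactly and compared with the known result Z = 1/(2 sh(βħω/2)). The sketch below (not part of the original notes; units ħ = m = 1, with the standard Gaussian measure) integrates out the momenta first, leaving a Gaussian integral over periodic paths whose value is a product over the eigenvalues of the circulant discretized action:

```python
# Discretized Euclidean path integral for the harmonic oscillator
# H = p^2/2m + m w^2 q^2/2, in units hbar = m = 1, as a check of (12.17).
# After the Gaussian p-integrals, Z_N is a Gaussian integral over periodic
# paths q_0, ..., q_{N-1}; diagonalizing the circulant quadratic form gives
# Z_N as a product over Fourier modes n = 0, ..., N-1.
import math

def Z_discretized(beta, w, N):
    eps = beta / N
    Z = 1.0
    for n in range(N):
        lam = 2.0 - 2.0 * math.cos(2.0 * math.pi * n / N) + (eps * w) ** 2
        Z /= math.sqrt(lam)
    return Z

def Z_exact(beta, w):
    # Z = sum_k exp(-beta w (k + 1/2)) = 1 / (2 sinh(beta w / 2))
    return 1.0 / (2.0 * math.sinh(beta * w / 2.0))

beta, w = 2.0, 1.0
for N in (10, 100, 1000):
    print(N, Z_discretized(beta, w, N))
print("exact:", Z_exact(beta, w))
```

The discretized values converge to the exact answer as N → ∞, as the construction above requires.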
The generalization to s degrees of freedom p_i, q_i, with i = 1, ..., s, and a Hamiltonian H(p_1, ..., p_s, q_1, ..., q_s) = H(p, q), is straightforward, and we find,
Z = ∫ ( ∏_{i=1}^s Dp_i Dq_i ) exp( −(1/ħ) ∫_0^{βħ} dτ [ i Σ_{i=1}^s q̇_i(τ) p_i(τ) + H(p(τ), q(τ)) ] ) (12.19)
with the periodicity conditions (12.18) enforced on every p_i, q_i.
12.2 The classical = high temperature limit
In view of the periodicity conditions (12.18), we may expand the functions p(τ) and q(τ) in an orthogonal basis of periodic functions on the interval [0, βħ],
q(τ) = q + Σ_{n=1}^∞ ( a_n e^{iω_n τ} + a_n^* e^{−iω_n τ} )
p(τ) = p + Σ_{n=1}^∞ ( b_n e^{iω_n τ} + b_n^* e^{−iω_n τ} ) (12.20)
The thermal or Matsubara frequencies are given by,
ω_n = 2πn/(βħ) (12.21)
and the zero modes p, q have been separated off explicitly.
When βħ → 0, the frequencies ω_n for n > 0 become large, while the zero modes p, q are unaffected. When the Matsubara frequencies are much larger than all the characteristic scales of the Hamiltonian, we may use the following βħ → 0 approximation,
(1/βħ) ∫_0^{βħ} dτ H(p(τ), q(τ)) ≈ H(p, q) (12.22)
where only the zero modes p, q are retained in H(p, q) on the right side. Separating also the integration measure into its zero-mode and non-zero-mode contributions,
Dp Dq = (dp dq/2πħ) D′p D′q (12.23)
and using the fact that the non-zero-modes precisely integrate to 1, namely,
∫ D′p D′q exp( −(i/ħ) ∫_0^{βħ} dτ p(τ) q̇(τ) ) = 1 (12.24)
we find that the βħ → 0 approximation to the partition function is given by,
Z ≈ ∫ (dp dq/2πħ) e^{−βH(p,q)} (12.25)
which is the properly normalized classical result we had already encountered earlier.
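The approach to the classical limit is easily seen for the harmonic oscillator, where both sides are known in closed form: the exact quantum Z = 1/(2 sh(βħω/2)), while the classical integral (12.25) gives Z_cl = 1/(βħω). A short numerical check:

```python
# High-temperature (classical) limit for the harmonic oscillator:
# exact quantum Z = 1/(2 sinh(beta hbar w / 2)) approaches the classical
# phase-space integral (12.25), Z_cl = 1/(beta hbar w), as beta -> 0.
import math

hbar, w = 1.0, 1.0
for beta in (1.0, 0.1, 0.01):
    Zq = 1.0 / (2.0 * math.sinh(beta * hbar * w / 2.0))
    Zcl = 1.0 / (beta * hbar * w)
    print(beta, Zq / Zcl)  # ratio tends to 1 in the high-temperature limit
```

The ratio differs from 1 at order (βħω)², consistent with the neglected non-zero Matsubara modes.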
12.3 Integrating out momenta
Returning to the non-classical case for arbitrary βħ, we may seek to integrate out the momentum variables explicitly, since they enter purely algebraically. When H is quadratic-linear in p, this is always possible. Consider for example the case of a Hamiltonian for a charged particle in a magnetic field and potential U(q),
H = (p − eA(q))²/2m + U(q) (12.26)
The integrand of the exponential may be rearranged as follows,
∫_0^{βħ} dτ [ i q̇ p + H(p, q) ] = ∫_0^{βħ} dτ [ (1/2m)(p − eA + im q̇)² + (m/2) q̇² + ie q̇·A + U(q) ]
The first term produces a Gaussian integral in p, which may be carried out, and gives a factor Z_0 in the partition function which depends only on ħ, m, and β, but not on the dynamics of the system contained in A and U(q). Thus, we have,
Z = Z_0 ∫ Dq exp( −(1/ħ) ∫_0^{βħ} dτ [ (m/2) q̇² + ie q̇·A + U(q) ] ) (12.27)
The object under the τ-integration is the Euclidean Lagrangian. Note the factor of i in front of the electromagnetic interaction!