Rolf Enderlein
Humboldt University Berlin, University of São Paulo
Norman J. M. Horing
Stevens Institute of Technology
Hoboken, NJ
World Scientific
Singapore New Jersey London Hong Kong
Published by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 9810223870
Printed in Singapore.
Dedication
This book is dedicated to the memory of Adele and Werner Enderlein (parents of R.E.), and to the memory of Joseph and Esther Morgenstern (Morganstein) (grandparents of N.J.M.H.).
Preface
People come to technical books with a vast array of different needs and requirements, arising from their differing educational backgrounds, professional orientations and career objectives. This is particularly evident in the field of semiconductors, which stands at the juncture of physics, chemistry, electronic engineering, material science and mathematics. No longer just an academic discipline, this field is at the heart of an ongoing revolution in communications, computation and electronic device applications that innovate many fields and change modern life in myriad ways, large and small. Its profound impact and further potential command interest and attention from all corners of the earth, and from a wide variety of students and researchers. The clear need to address a broadly diversified and variously motivated readership has weighed heavily in the authors' considerations. It poses a pedagogical problem faced by many teachers of intermediate level courses on semiconductor physics. Generally speaking, every student has previously studied about half of the course material. The difficulty lies in the fact that each student's exposure is likely to have been to a different half, depending on which lower level courses and teachers they have had, and where the emphasis lay. To accommodate readers with varied backgrounds we start from first principles and provide fully detailed explanations and proofs, assuming only that the reader is familiar with the Schrödinger equation. This intensively tutorial treatment of the electronic properties of semiconductors includes recent fundamental developments and is carried through to the physical principles of device operation, to meet the needs of readers interested in engineering aspects of semiconductors, as well as those interested in basic physics.
Clarity of explanation and breadth of exposure relating to the electronic properties of semiconductors, from first principles to modern devices, are our principal objectives in this frankly pedagogical book. We offer full mathematical derivations to strengthen understanding and discuss the physical significance of results, avoiding reliance on hand-waving arguments alone. To support the reader's introduction to the physics of semiconductors, we provide a thorough grounding in the basic principles of solid state physics, assuming no prior knowledge of the field on the part of the reader. An elementary discussion of the crystal structure, chemical nature and macroscopic properties characterizing semiconductors is given in Chapter 1. Moreover, we also include an extensive appendix to guide the reader through group theory and its applications in connection with the symmetry properties of semiconductors, which are of major importance. Besides spatially homogeneous bulk semiconductors, we undertake a full exposition of inhomogeneous semiconductor junctions and heterostructures because of their crucial role
The book has emerged from lectures which the authors presented for physics students at the Humboldt University of Berlin, Germany, and the State University of São Paulo, Brazil, and for physics and engineering physics students at the Stevens Institute of Technology in Hoboken, New Jersey, U.S.A. Part of the book has similarities with the German book "Grundlagen der Halbleiterphysik" ("Fundamentals of Semiconductor Physics"), which was written by one of us (R. E.) together with A. Schenk. We are thankful to Dr. Schenk (now at ETH Zürich) for allowing us to use part of his work in the present volume. In writing this book we have had excellent support from many of our colleagues at our own and other universities. We are particularly thankful to Prof. Dr. J. Auth (Humboldt University Berlin), Prof. Dr. F. Bechstedt (Friedrich Schiller University Jena), Prof. Dr. W. A. Harrison (Stanford University), Prof. Dr. M. Scheffler (Fritz-Haber-Institut, Berlin), Prof. Dr. J. R. Leite, Prof. Dr. A. Fazzio, and Prof. Dr. J. L. Alves (State University São Paulo), as well as to Prof. Dr. H. L. Cui, Prof. Dr. G. Rothberg, Mr. G. Lichtner (Stevens Institute of Technology), and Prof. Dr. G. Gumbs (Hunter College, CUNY, New York), who read parts of the manuscript and contributed helpful suggestions and critical remarks. The technical assistance of Mrs. Hannelore Enderlein is gratefully acknowledged.
Rolf Enderlein
São Paulo, October 1996
Contents
1 Characterization of semiconductors
1.1 Introduction
1.2 Atomic structure of ideal crystals
1.2.1 Crystal lattices
1.2.2 Point groups of equivalent directions and crystal classes
1.2.3 Space groups and crystal structures
1.2.4 Cubic semiconductor structures
1.2.5 Hexagonal semiconductor structures
1.3 Chemical nature of semiconductors. Material classes
1.3.1 Group IV elemental semiconductors
1.3.2 III-V semiconductors
1.3.3 II-VI semiconductors
1.3.4 Group VI elemental semiconductors
1.3.5 IV-VI semiconductors
1.3.6 Other compound semiconductors
1.4 Macroscopic properties and their microscopic implications
1.4.1 Electrical conductivity
1.4.2 Dependence of conductivity on the semiconductor state
1.4.3 Optical absorption spectrum and the band model of semiconductors
1.4.4 Electrical conductivity in the band model
1.4.5 The Hall effect and the existence of positively charged freely mobile carriers
1.4.6 Semiconductors far from thermodynamic equilibrium
2 Electronic structure of ideal crystals
2.1 Atomic cores and valence electrons
2.2 The dynamical problem
2.2.1 Schrödinger equation for the interacting core and valence electron system
2.2.2 Adiabatic approximation. Lattice dynamics
2.2.3 One-particle approximation. One-particle Schrödinger equation
2.3 General properties of stationary one-electron states in a crystal
2.3.1 Symmetry properties of the one-electron Schrödinger equation
2.3.2 Bloch theorem
2.3.3 Reciprocal vector space and the reciprocal lattice
2.3.4 Relation between energy eigenvalues and quasi-wavevector
2.4 Schrödinger equation solution in the nearly-free-electron approximation
2.4.1 Nondegenerate perturbation theory
2.4.2 Degenerate perturbation theory
2.5 Band structure
2.5.1 Brillouin zones
2.5.2 Degeneracy of energy bands
2.5.3 Critical points and effective masses
2.5.4 Density of states
2.5.5 Spin
2.5.6 Calculational methods for band structure determination
2.6 Tight binding approximation
2.6.1 Fundamentals
2.6.2 TB theory of diamond and zincblende type semiconductors
2.6.3 sp3-hybrids, total energy and chemical bonding
2.7 k·p method
2.7.1 Fundamentals
2.7.2 Valence bands of diamond structure semiconductors without spin-orbit interaction
2.7.3 Luttinger-Kohn model
2.7.4 Kane model
2.8 Band structure of important semiconductors
2.8.1 Silicon
2.8.2 Germanium
2.8.3 III-V semiconductors
2.8.4 II-VI semiconductors
2.8.5 IV-VI semiconductors
2.8.6 Tellurium and selenium
3 Electronic structure of semiconductor crystals with perturbations
3.1 Atomic structure of real semiconductor crystals
3.1.1 Classification of perturbations
3.1.2 Point perturbations
3.1.3 Formation of point perturbations and their movement
3.1.4 Line and planar defects
3.2 One-electron Schrödinger equation for point perturbations
3.2.1 Electron-core interaction
3.2.2 Electron-electron interaction
3.3 Effective mass equation
3.3.1 Effective mass equation for a single band
3.3.2 Multiband effective mass equation
3.4 Shallow levels. Donor and acceptor states
3.4.1 Hydrogen model
3.4.2 Improvements upon the hydrogen model
3.5 Deep levels
3.5.1 General characterization of deep levels
3.5.2 Defect molecule model
3.5.3 Solution methods for the one-electron Schrödinger equation of a crystal with a point perturbation
3.5.4 Correlation effects
3.5.5 Results for selected deep centers
3.6 Clean semiconductor surfaces
3.6.1 The concept of clean surfaces
3.6.2 Atomic structure of clean surfaces
3.6.3 Electronic structure of crystals with a surface
3.6.4 Atomic and electronic structure of particular surfaces
3.7 Semiconductor microstructures
3.7.1 Heterojunctions
3.7.2 Microstructures: Fabrication, classifications, examples
3.7.3 Methods for electronic structure calculations
3.7.4 Electronic structure of particular microstructures
3.8 Macroscopic electric fields
3.8.1 Effective mass equation and stationary electron states
3.8.2 Non-stationary states. Bloch oscillations
3.8.3 Interband tunneling
3.8.4 Photon assisted interband tunneling
3.9 Macroscopic magnetic fields
3.9.1 Effective mass equation in a magnetic field
3.9.2 Solution of the effective mass equation
4 Electron system in thermodynamic equilibrium
4.1 Fundamentals of the statistical description
4.2 Calculation of average particle numbers
4.2.1 Configuration-independent one-particle states
4.2.2 Configuration-dependent one-particle states
4.3 Density of states
4.3.1 Total electron concentration
4.3.2 Density of states of ideal semiconductors
4.3.3 Density of states of real semiconductors
4.4 Free carrier concentrations
4.4.1 Conservation of total electron number
4.4.2 Free carrier concentration dependence on Fermi energy. Law of mass action
4.4.3 Intrinsic semiconductors
4.4.4 Extrinsic semiconductors
4.4.5 Compensation of donors and acceptors
4.4.6 More complex cases
5 Nonequilibrium processes in semiconductors
5.1 Fundamentals of the statistical description of nonequilibrium processes
5.2 Systematics of nonequilibrium processes in semiconductors
5.2.1 Temporal inhomogeneity and spatial homogeneity
5.2.2 Spatial inhomogeneity and temporal homogeneity
5.2.3 Space and time inhomogeneities
5.3 Generation and annihilation of free charge carriers
5.3.1 Generation processes
5.3.2 Unipolar annihilation of free charge carriers: capture at deep centers
5.3.3 Bipolar annihilation of carriers at deep centers
5.4 Drift current
5.5 Diffusion and annihilation of free carriers
5.6 Equilibrium of free carriers in inhomogeneously doped semiconductors
6 Semiconductor junctions in thermodynamic equilibrium
6.1 pn-junction
6.1.1 Establishment of thermodynamic equilibrium
6.1.2 Diffusion voltage
6.1.3 Spatial variation of the electric and chemical potentials: Schottky approximation
6.2 Heterojunctions
6.2.1 Equilibrium condition
6.2.2 Electrostatic potential. GaAs/Ga1-xAlxAs heterojunction as an example
6.3 Metal-semiconductor junctions
6.3.1 Energy level diagram before establishing equilibrium
6.3.2 Electrostatic potential
6.3.3 Schottky barrier
6.4 Insulator-semiconductor junctions
6.4.1 Thermodynamic equilibrium
6.4.2 Influence of interface states
6.4.3 Semiconductor surfaces
7 Semiconductor junctions under nonequilibrium conditions
7.1 pn-junction in an external voltage
7.1.1 Electrostatic potential profile
7.1.2 Mechanism of current transport through a pn-junction
7.1.3 Chemical potential profiles for electrons and holes
7.1.4 Dependence of current density on voltage
7.1.5 Bipolar transistor
7.1.6 Tunnel diode
7.2 pn-junction in interaction with light
7.2.1 Photoeffect at a pn-junction. Photodiode and photovoltaic element
7.2.2 Laser diode
7.3 Metal-semiconductor junction in an external voltage. Rectifiers
7.4 Insulator-semiconductor junction in an external voltage
7.4.1 Field effect
7.4.2 Inversion layers
7.4.3 MOSFET
Appendices
A Group theory for applications in semiconductor physics
A.1 Definitions and concepts
A.1.1 Group definition
A.1.2 Concepts
A.2 Rigid displacements
A.2.1 Definition
A.2.2 Translations
A.2.3 Orthogonal transformations
A.2.4 Geometrical interpretation
A.2.5 Screw rotations and glide reflections
A.3 Translation, point and space groups
A.3.1 Lattice translation groups
A.3.2 Point groups
A.3.3 Space groups
A.4 Representations of groups
A.4.1 Introduction
A.4.2 Irreducible representations
A.4.3 Products of representations
A.5 Representations of the full rotation group
A.5.1 Vector representation of the rotation group and generators of infinitesimal rotations
A.5.2 Representations for dimensions other than three
A.6 Spinor representations
A.6.1 Space-dependent spinors
A.6.2 Representation $\mathcal{D}_{1/2}$
A.6.3 Irreducible spinor representations
A.6.4 Double group method
A.7 Projective representations
A.7.1 Factor systems
A.7.2 Definitions and theorems
A.7.3 Construction of projective representations
A.8 Time reversal symmetry
A.8.1 Time reversal operator
A.8.2 Additional degeneracy of energy eigenvalues
A.8.3 Additional selection rules for matrix elements
A.9 Irreducible representations of space groups
A.9.1 Representations of translation groups
A.9.2 Star of wavevectors
A.9.3 Small point groups and their projective representations
A.9.4 Representations of the full space group
A.9.5 Spinor representations of space groups
A.9.6 Implications of time reversal symmetry
A.9.7 Compatibility
A.10 Irreducible representations of small point groups
A.10.1 Character tables
A.10.2 Multiplication tables
A.10.3 Compatibility relations
Index
Chapter 1
Characterization of semiconductors
1.1 Introduction
Semiconductors are identified as a unique material group on the basis of their common macroscopic properties, as is done for metals, dielectrics and magnetic materials. The name semiconductor stems from the fact that such materials have moderately good conductivity, higher than that of insulators and lower than that of metals. However, if this were the only property which these materials had in common, the term semiconductor would have only a very weak foundation. But such is not the case. In fact, many materials having conductivity between that of metals and insulators simultaneously display a series of further common properties. In particular, their conductivity depends very strongly on material state, for example, on temperature and chemical purity, much more so than in the case of metals. For sufficiently pure semiconductors, the conductivity decays by orders of magnitude while cooling down from room temperature to liquid helium temperature. At absolute zero temperature, semiconductor conductivity almost vanishes, in contrast to the conductivity of metals, which rises modestly with falling temperature. The conductivity of metals reaches its maximum at low temperature, and for superconductors it effectively becomes infinitely large. In regard to the dependence of semiconductor conductivity on the degree of purity, it has been found that a given semiconductor in a very pure state can resemble an insulator, while in a highly polluted state it acts like a metal, among other peculiarities. Furthermore, irradiation with light can cause a transition from insulator-like behavior to metal-like behavior of one and the same semiconductor.

There are yet other optical properties shared by semiconductors: The optical absorption spectra of semiconductors exhibit a threshold. Below the threshold frequency, light can pass through practically without losses, while above it the light is strongly absorbed. Moreover, good luminescence properties in the visible and infrared spectral range are also characteristic of many semiconductors. Thus, the identification of a semiconductor material involves several characteristic properties, not just the one of moderately good electrical conduction. One has a semiconductor only if all such properties apply. This criterion excludes ionic conductors, for example, which exhibit conductivity values of the right order of magnitude, but do not display the characteristic temperature dependence.

One may justifiably question the extent to which this definition of a semiconductor really makes sense. For example, it is not yet obvious why just the above properties are selected as defining features while others are not, and what internal connections may exist among them. A full answer can only be given by means of a microscopic theory of semiconductor properties, which will be developed systematically in the chapters to follow. For the moment, we invite the reader to join us in the recognition that all macroscopic properties involved in the definition of a semiconductor can be traced back to a common microscopic origin, namely the nature of the spectrum of allowed energy levels and the particulars of their population by electrons. To be specific, the permissible energy levels of a semiconductor form bands which are separated by forbidden regions, and the characteristic electron population of allowed energy levels is such that, at absolute zero temperature, a semiconductor is characterized by having only completely occupied and completely empty energy bands (no partially filled bands). It is this common microscopic feature which underlies the totality of macroscopic material properties that uniquely define a semiconductor.
It also provides the basis for uncovering yet other common macroscopic features of this class of materials, beyond the ones already discussed. For instance, it may be expected that semiconductors should be predominantly solid crystalline materials, since the formation of energy bands with gaps between them is most likely to occur in the crystalline phase. Nevertheless, amorphous and liquid semiconductors cannot be completely excluded, since a certain regularity of the relative positions of neighboring atoms also exists in the amorphous and liquid phases. Actually, in addition to solid crystalline semiconductors, which are the main representatives, a series of amorphous semiconductors has also been found, silicon and selenium being important examples. Liquid semiconductors are also possible, with melted tellurium among them. Other semiconducting materials, e.g., silicon and germanium, are metals in the liquid phase.

In this book we restrict our considerations to solid crystalline semiconductors. The discussion also partially applies to amorphous and liquid semiconductors, but in most cases modifications are necessary. Even the basic concept of the quantum mechanical energy spectrum of electrons has to be defined in a different way, and a proper treatment of amorphous and liquid materials cannot be accommodated within the framework of this introduction to semiconductor physics. Interested readers are referred to the extensive literature on this subject (see, for example, Elliot, 1983; Mott and Davis, 1979; Bonch-Bruevich, Enderlein, Esser, Keiper, Mironov, and Zvyagin, 1984; Adler, Fritzsche, and Ovshinsky, 1985).

The microscopic definition discussed above contains no recognizable constraints with regard to the chemical nature of semiconductors. One may expect, therefore, that semiconductors should be distinguished by having a large chemical diversity. This is in fact true: semiconductors may be composed of a large variety of chemical elements and compounds. The most fully explored candidates and those used for technical applications today are crystalline semiconductors consisting of relatively small chemical units, i.e., either elements or binary and ternary compounds.

Knowledge of the existence of a distinct material group called semiconductors developed, historically, only relatively late. Metals have been used by man since antiquity, but semiconductors attracted attention for the first time only a century and a half ago. The first reference to a characteristic semiconductor property dates back to Faraday, who in 1833 observed an increase of the electric conductivity of silver sulfide with temperature. The exponential form of this increase was discovered by Hittorf in 1851. The term semiconductor was introduced in 1911 by Königsberger and Weiss, after a similar term of about the same context had already been employed by Ebert (1789) and Bromme (1851).
The late and, initially, relatively slow development of semiconductor physics is primarily due to the circumstance that the characteristic properties of semiconductors depend strongly on their degree of purity, more precisely, on the presence or absence of certain chemical elements. This is also the reason that many semiconducting materials in their natural form as minerals do not display the typical properties of a semiconductor: they are too heavily polluted and have too many structural defects. Natural diamonds, for example, are semiconductors only in rare cases. Accordingly, clean fabrication of the materials in the laboratory and the controlled incorporation of chemical elements played a crucial role from the very beginning. The lack of such control in preparation presented fundamental difficulties which had to be overcome in the early days of semiconductor research. The necessary accuracy of composition control, which amounts to one atom in one hundred thousand or less, exceeded the accuracy that prevailed in chemistry at the time by orders of magnitude. It was necessary to raise the accuracy of chemical composition control to a level wherein one could measure one millionth of a mole fraction instead of one thousandth. Only in this way could reliable results be achieved with semiconductors. Since such accuracy was achieved only gradually, semiconductor physics, in its early history, was confronted with apparently mysterious phenomena and contradictory results. For example, silicon and germanium were first thought to be metals, until the recognition that the impurity concentrations of certain elements were too large to achieve semiconducting properties.

From the very beginning, semiconductor physics research was impeded by the need for expensive fabrication of material samples. The field could develop only to the extent that good samples could be made available. Naturally, the first samples were either materials which could be fabricated relatively cheaply or materials which occurred in nature in a suitable form, albeit not ideal. Among these first materials were metal sulfides, such as lead sulfide in its mineral form, galena, as well as copper oxide (Cu2O) and selenium in their artificially grown forms, all of which displayed good semiconducting properties even with relatively strong pollution and structural imperfections. In 1874, Braun discovered that contacts between certain metal sulfides and metal tips exhibited different electrical resistance upon reversal of the polarity of the applied voltage. Such point contact structures were used in radio receivers as rectifiers at the beginning of our century. One can mark these point contact rectifiers as the first semiconductor devices. Similar rectifying action was also found for selenium and copper oxide. Moreover, a large change of electrical conductivity could also be achieved in these materials through irradiation by light. For selenium, this property was discovered as early as 1852 by Hittorf. Since the beginning of the 20th century, this effect has also been used practically in the selenium photocell. The first technical use of copper oxide as a rectifier was accomplished in 1926 by Grondahl, followed by rectifiers using selenium. The first practical application of copper oxide in photocells was accomplished in 1932 by Lang.
Because of their technological importance, selenium and copper oxide were the first semiconductors to be subjected to more detailed physical investigations. The semiconducting metal sulfides, selenides and tellurides were already known earlier because of their good luminescence properties. In the further exploration of these materials' semiconducting behavior, luminescence physics partially merged with semiconductor physics. In the mid-nineteen-thirties, the search for a solid-state-based electronic switching element which could replace the vacuum tube was extended to the elemental semiconductors germanium and silicon. The most important results of this research, which turned out to be decisive for the whole further development of semiconductor physics, were the invention of the germanium-based bipolar transistor in 1949, and the realization of the field effect transistor with the help of silicon at the end of the nineteen-fifties. With the introduction of silicon, the stage was set for the development of semiconductor microelectronics. Later, a similar role was played by compounds involving elements of the third and fifth groups of the periodic table, like gallium arsenide or phosphide, making possible the development of semiconductor optoelectronics.

The broad technical application of its results distinguishes semiconductor physics now from its early days. It is well established that semiconductors are exceptionally well suited for necessary functions in electronics and electrical engineering. This is by no means accidental, but is due to the microscopic nature of semiconductors, which permits the controlled variation of characteristic material properties by external means over a wide range of parameters. The great technical importance of semiconductors has made thorough physical investigation of these materials necessary, but it has also justified the high cost involved in their fabrication and study. Owing to both of these aspects, semiconductors are the best explored and understood materials of condensed matter today. Moreover, a multitude of physical phenomena which occur in other solid state materials may also be observed in semiconductors, often in the most distinctive way. For this reason, studies of semiconductors can also provide knowledge of other solid state materials. Semiconductors have in fact become model systems for basic research in condensed matter physics.

The presentation of the microscopic principles of semiconductor physics will occupy most of this book. The introductory first chapter lies outside of this framework because it involves discussion of the results on a phenomenological basis. The characterization of semiconductors by means of their unique macroscopic features, which we have touched upon above, will be continued in Chapter 1. In this regard, atomic structure will be discussed in section 1.2, chemical nature in section 1.3 and macroscopic physical properties in section 1.4. In dealing with macroscopic properties we will not restrict ourselves to mere description, but we will also use them to motivate the microscopic model of semiconductors introduced above.
In this connection, we will make the first step towards a microscopic theory of semiconductors in section 1.4. Naturally, this will have to be done in a heuristic way, and many questions postponed until later. The full presentation of the microscopic principles of semiconductor physics will follow in later chapters.
1.2
means that, for a given atom, there are remote atoms possessing the same short-range order complexes as the original atom, and that the positions of the remote atoms are related to the position of the original atom by simple transformations. The crystal is considered to be infinitely large in this context. Atoms having identical short-range order complexes are termed equivalent. Equivalent atoms are necessarily of the same chemical species, but chemically identical atoms need not necessarily be equivalent.
1.2.1
Crystal lattices
of three non-coplanar vectors $\mathbf{A}_1, \mathbf{A}_2, \mathbf{A}_3$ with arbitrary integer coefficients $t_1, t_2, t_3$. The parallelepipeds of the crystal spanned by the particular displacement vectors $\mathbf{A}_1, \mathbf{A}_2, \mathbf{A}_3$ are called unit cells. By putting unit cells together the whole crystal may be constructed. The size of a unit cell and the number of atoms in it are not fixed by the above definition, and can in fact be taken arbitrarily large, as long as they remain finite. The pertinent question is not how large a unit cell can be, but rather how small. The answer to this question leads us to the definition of the primitive unit cell and the crystal lattice.
Definition
The smallest possible unit cell is called a primitive unit cell. In the extreme case this cell can contain only 1 atom, but as a rule, it has several atoms. If there is only one atom per cell, then the short-range order merges into the long-range order. If the unit cell is taken to be a primitive one, the vectors $\mathbf{A}_1, \mathbf{A}_2, \mathbf{A}_3$ are some minimal vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$. The parallelepiped spanned by these vectors is a primitive unit cell. Each translation by a vector $\mathbf{R}$ of the form

$$\mathbf{R} = r_1 \mathbf{a}_1 + r_2 \mathbf{a}_2 + r_3 \mathbf{a}_3$$

with integer coefficients $r_1, r_2, r_3$ transforms the crystal into itself. One refers to this property as the translational symmetry of the crystal. The point set defined by the vectors $\mathbf{R}$ forms a spatial lattice called the crystal lattice. The vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ are termed primitive lattice vectors. The volume $\Omega_0$ of a primitive unit cell may be written as the triple scalar product of $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$:

$$\Omega_0 = \mathbf{a}_1 \cdot [\mathbf{a}_2 \times \mathbf{a}_3]. \qquad (1.3)$$
While the lattice of a crystal and the volume $\Omega_0$ of its primitive unit cell are well defined, this is not the case for its primitive lattice vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$, nor for the form of its primitive unit cell. Any set of integer linear combinations of the primitive vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ which yields a triple scalar product equal to the volume $\Omega_0$ is again a set of primitive lattice vectors, and the parallelepiped spanned by them forms a primitive unit cell. The corners of the parallelepipeds do not necessarily have to lie on lattice points. Each parallelepiped, shifted arbitrarily in space, is again a primitive unit cell. The parallelepiped form is also not imperative, as other forms are possible. An especially compact primitive unit cell is the so-called Wigner-Seitz cell. The center of this cell lies on a lattice point and its surface is formed by the perpendicular bisector planes which divide in half the line segments joining the center to adjacent lattice points. Translations which transform a crystal into itself, by definition, do the same for the lattice of the crystal. Here, the translations are through lattice vectors $\mathbf{R}$, called lattice translations. The set of all lattice translations forms a group. This term describes a mathematical set of elements among which a 'multiplication' is defined that results in products which are also elements of the set. Further properties of a set forming a group are listed in Appendix A. In particular, there must be an identity element, and the inverse of an element must also be an element of the set. In the case of translations the 'multiplication' is the consecutive application of two of these transformations. Since two consecutive lattice translations constitute yet another lattice translation, and the requirements of Appendix A are also satisfied, the set of all lattice translations of a crystal forms, in fact, a group, called the translation symmetry group or simply the translation group.
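The freedom in choosing primitive lattice vectors can be illustrated with a short numerical sketch. It assumes a simple cubic example lattice with unit lattice constant; the helper names `cell_volume` and `combine` are hypothetical and are not taken from the text. The sketch evaluates the triple scalar product of equation (1.3) for the original vectors, for an alternative set of integer combinations, and for a set that spans a larger, non-primitive cell.

```python
def cross(u, v):
    """Vector product u x v of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product u . v."""
    return sum(x * y for x, y in zip(u, v))


def cell_volume(a1, a2, a3):
    """Triple scalar product a1 . (a2 x a3), cf. equation (1.3)."""
    return dot(a1, cross(a2, a3))


def combine(coeffs, vectors):
    """Integer linear combination sum_k coeffs[k] * vectors[k]."""
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(3))


# Assumed example: simple cubic lattice with lattice constant 1.
a = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Another set of integer combinations with the same triple product:
# these vectors are again primitive lattice vectors.
b = [combine((1, 1, 0), a), combine((0, 1, 1), a), combine((0, 0, 1), a)]

# A set with a doubled triple product: it spans a unit cell, but not a primitive one.
d = [combine((2, 0, 0), a), a[1], a[2]]

print(cell_volume(*a), cell_volume(*b), cell_volume(*d))
```

Only the sets whose triple product equals $\Omega_0$ qualify as primitive lattice vectors; the third set generates only every second lattice point along its first direction.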
Groups of symmetry elements play a central role in the description of the atomic structure and other microscopic properties of crystals. Appendix A provides a thorough discussion of groups as needed in this book.
their positions. In a translation, all points are shifted by the same vector, with no points fixed. Rotations transform right-handed Cartesian coordinate systems into right-handed systems, but the application of reflections and inversion to a right-handed system results in a left-handed one. It turns out that there are orthogonal transformations which transform lattices into themselves. They are called point symmetry operations of lattices. The set of all point symmetry operations of a lattice forms a group, as does the set of all lattice translations. The multiplication of two of these operations is again understood to represent their consecutive application. Groups of point symmetry operations are termed point groups. In Appendix A we describe them in detail. Not all of the various point groups listed in Appendix A are allowed as symmetry groups of crystal lattices, but only particular ones which are called holohedral point groups. We will derive them now by demonstrating that they must have three special properties. First, all of these point groups must contain the inversion transformation with respect to the lattice point $\mathbf{R} = 0$. This may be seen as follows: Inversion with respect to 0 transforms a lattice point $\mathbf{R}$ into $-\mathbf{R}$. Like $\mathbf{R}$, the point $-\mathbf{R}$ is a lattice point, having $-r_1, -r_2, -r_3$ as integer coefficients. Therefore, inversion with respect to 0 transforms an arbitrary lattice into itself. It follows immediately that inversion with respect to any other lattice point will do the same. Second, it turns out that rotation symmetry axes through lattice points can only be 2-, 3-, 4- and 6-fold, while 5-, 7- and higher-fold axes are not compatible with the translation symmetry of the lattice. One may prove this as follows: Let $C_n$ be a rotation about such an axis through an angle $2\pi/n$. We consider a lattice plane perpendicular to this axis and denote a primitive lattice vector of the corresponding planar lattice by $\mathbf{f}$ (see Figure 1.1).
Rotating it through $2\pi/n$, it becomes $C_n\mathbf{f}$, and a rotation by $-2\pi/n$ transforms it into $C_n^{-1}\mathbf{f}$. If, as we suppose, $C_n$ belongs to the point group of the lattice, then so does $C_n^{-1}$. Thus both $C_n\mathbf{f}$ and $C_n^{-1}\mathbf{f}$ are vectors of the plane lattice. The same holds for the sum $C_n\mathbf{f} + C_n^{-1}\mathbf{f}$ of the two vectors. Moreover, $C_n\mathbf{f} + C_n^{-1}\mathbf{f}$ represents a vector parallel to $\mathbf{f}$. This means that $C_n\mathbf{f} + C_n^{-1}\mathbf{f}$ must be an integer multiple of $\mathbf{f}$. Since the largest possible length of $C_n\mathbf{f} + C_n^{-1}\mathbf{f}$ can only be $2|\mathbf{f}|$, one has $C_n\mathbf{f} + C_n^{-1}\mathbf{f} = p_n\mathbf{f}$ with $p_n = -2, -1, 0, 1$ or $2$. This implies that the relation

$$p_n = 2\cos\left(\frac{2\pi}{n}\right) \qquad (1.4)$$

must hold for $p_n$. For $p_n = -2$, equation (1.4) yields $n = 2$. For $p_n = -1$ one has $n = 3$, for $p_n = 0$ it follows that $n = 4$ and for $p_n = 1$ the solution is $n = 6$. For $p_n = 2$ equation (1.4) has only the trivial solution $n = 1$. This completes the proof concerning rotation symmetry operations. For so-called quasicrystals, which do not exhibit an exact translation symmetry, rotations about 5-fold and other axes are also possible. Third, one finds that point groups of lattices having 3-, 4- or 6-fold axes of rotation must also necessarily contain mirror planes parallel to each of these axes. The explicit proof of this assertion will not be presented here. All three required properties described above are satisfied by each of exactly seven point groups, namely $C_i$ ($\bar{1}$), $C_{2h}$ ($2/m$), $D_{2h}$ ($mmm$), $D_{3d}$ ($\bar{3}m$), $D_{4h}$ ($4/mmm$), $D_{6h}$ ($6/mmm$) and $O_h$ ($m3m$). The point group notations used here are those of Schonflies and the international notations are given in brackets. Both systems of notation are explained in Appendix A. In summary, the above results mean that exactly seven different point groups are possible for spatial lattices. They define the seven crystal systems: triclinic ($C_i$), monoclinic ($C_{2h}$), orthorhombic ($D_{2h}$), trigonal ($D_{3d}$), tetragonal ($D_{4h}$), hexagonal ($D_{6h}$), and cubic ($O_h$).
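The restriction on rotation axes derived above lends itself to a brute-force check. The following is a minimal sketch, not part of the original derivation: it scans the orders $n$ up to an arbitrarily chosen cutoff and keeps those for which $2\cos(2\pi/n)$ is an integer. The function name `allowed_rotation_orders` and the parameters `max_n` and `tol` are assumptions made for this illustration.

```python
import math


def allowed_rotation_orders(max_n=12, tol=1e-9):
    """Return (n, p_n) for all n <= max_n with p_n = 2 cos(2 pi / n) an integer."""
    allowed = []
    for n in range(1, max_n + 1):
        p = 2 * math.cos(2 * math.pi / n)
        # tol absorbs floating-point rounding in the cosine.
        if abs(p - round(p)) < tol:
            allowed.append((n, round(p)))
    return allowed


print(allowed_rotation_orders())
```

Within the scanned range only $n = 1, 2, 3, 4$ and $6$ survive, with the values of $p_n$ found in the proof. The cutoff is harmless, because for $n > 6$ the quantity $2\cos(2\pi/n)$ lies strictly between 1 and 2.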
Bravais lattices
Within a given crystal system, several different types of lattices may exist. Their common property is that they all have the same point symmetry, but they may differ otherwise. Figure 1.2 visualizes them by means of their unit cells. These differences give rise to different lattice types. The simplest lattices of a given point symmetry are represented at the far left of each row in Figure 1.2. They are called primitive lattices. Even these simple lattices are not unambiguously determined by their point symmetry. If one, for instance, multiplies all lattice vectors by the same real number, i.e. stretches or compresses the lattices evenly on all sides, the point symmetry remains unchanged. In less symmetrical crystal systems one may even change certain length relationships or angles between primitive lattice vectors without disturbing the point symmetry. In the tetragonal system the height of the
rectangular parallelepiped in relation to its basis may be changed arbitrarily. Generally speaking, the primitive lattices are determined only up to continuous point transformations preserving their point symmetries. Moreover, starting from a primitive lattice one can produce other sets of regularly ordered points by adding new points to each primitive unit cell in equivalent positions. As before, equivalent refers to positions which are either identical or differ by a lattice vector. If one places the new points in symmetrical positions, i.e. those which are transformed into themselves or equivalent points by the point symmetry operations, such as the centers of the primitive unit cells, one obtains a new point set having the same point symmetry as the original lattice. The new point set, in general, no longer forms a lattice, but may be thought of as the union of several lattices placed within each other. In some special cases, however, the result may still be a primitive lattice. Whether this happens or not is a question which must be explored separately in each case. If the answer is positive, one has to examine whether the lattice is only another realization of the original primitive lattice, i.e. whether or not it can be brought back to the original one by a continuous and symmetry preserving transformation. It turns out that both cases may occur. If the lattices cannot be transformed into each other by such a transformation, then this implies that there are two different types of lattices with the same point symmetry. One calls them different Bravais types or Bravais lattices. According to this definition two Bravais lattices are of the same type if they may be transformed into each other by a continuous and point symmetry preserving transformation, otherwise they are Bravais lattices of different types. As an example, we consider the different Bravais lattices in the case of the cubic crystal system.
If one adds the body centers of the primitive unit-cell cubes as additional points to a primitive cubic lattice, the resulting point set has the cubic point symmetry and again forms a lattice. The same holds if the added points are the centers of the faces of the primitive unit-cell cubes instead of the body centers. Neither the body-centered nor the face-centered cubic lattice can be transformed back to a primitive cubic lattice by means of a continuous and symmetry preserving transformation, nor can the two centered lattices be transformed into each other by such a transformation. Therefore, they represent cubic lattices of two new Bravais types. If one adds both face and body centers, together with the edge centers, to the primitive cubic lattice, then the new point set again forms a primitive cubic lattice, however, with a lattice constant equal to half of that of the original lattice. It may be transformed back to the original primitive cubic lattice by means of a continuous and symmetry preserving transformation, thus it does not represent a cubic lattice of a new Bravais type. Altogether one finds three different cubic Bravais lattices, the primitive (p), the body-centered (bc), and the face-centered (fc) ones. Analogous considerations have to be made for the other 6 primitive lattices.
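Whether a centered point set still forms a lattice can be tested mechanically. The sketch below is an illustration under stated assumptions: positions are given as fractional coordinates of the cubic reference cell, and a point set is taken to be a lattice exactly when it is closed under addition modulo whole cell translations. The function name `is_lattice` is hypothetical. The test does not decide the Bravais type, which requires the separate continuity argument of the text. Note that face centers plus body centers alone fail the closure test; the edge centers are needed as well to obtain the primitive cubic lattice with half the lattice constant.

```python
from fractions import Fraction as F
from itertools import product


def is_lattice(points):
    """True if the fractional positions are closed under addition modulo 1."""
    pts = {tuple(c % 1 for c in p) for p in points}
    return all(tuple((x + y) % 1 for x, y in zip(p, q)) in pts
               for p, q in product(pts, repeat=2))


z, h = F(0), F(1, 2)
corner = [(z, z, z)]
body = [(h, h, h)]
faces = [(h, h, z), (h, z, h), (z, h, h)]
edges = [(h, z, z), (z, h, z), (z, z, h)]

print("body centered:       ", is_lattice(corner + body))
print("face centered:       ", is_lattice(corner + faces))
print("faces + body:        ", is_lattice(corner + faces + body))
print("faces + body + edges:", is_lattice(corner + faces + body + edges))
```

Exact rational arithmetic from the `fractions` module is used so that the membership test does not depend on floating-point rounding.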
[Figure 1.2: Unit cells of the Bravais lattices. Panel titles recovered: Cubic P, Cubic I, Cubic F, Tetragonal I, Triclinic, Monoclinic, Monoclinic I, Trigonal.]
In this way, one finds that, in total, 14 different Bravais lattices are possible in the 7 crystal systems. The assignment of these lattices to the crystal systems is indicated in Table 1.1. The Bravais lattices themselves are shown in Figure 1.2.
1.2.2
The lattice of a crystal serves as a conceptual basis for the illustration of its translation symmetry. Lattice points do not necessarily have to be occupied by atoms, which may be found at general points of the primitive unit cells. Generally, a primitive unit cell contains several atoms which may be either chemically identical or different. We denote them here by an atom index $l$ which takes the values $l = 1, 2, \ldots, L$, where $L$ is the total number of atoms in a primitive unit cell. For $L \geq 2$ the set of all $L$ atoms is called the basis of the crystal. In the case $L = 1$ one says that the crystal has no basis. The position of the $l$-th atom, relative to the corner $\mathbf{R}$ of a primitive unit cell, is described by a vector $\boldsymbol{\tau}_l$. The position $\mathbf{R}_l$ of this atom relative to the origin is then given by

$$\mathbf{R}_l = \mathbf{R} + \boldsymbol{\tau}_l. \qquad (1.5)$$

Without loss of generality one may always set one of the vectors $\boldsymbol{\tau}_l$, e.g. $\boldsymbol{\tau}_1$, equal to zero. If the primitive unit cell contains only 1 atom, it may be placed in one of the corners of the cell. A crystal without basis may thus be described as a lattice whose points are all occupied by atoms. For $L \geq 2$, the crystal may be generated in such a way that one replicates its crystal lattice $L$-fold, then shifts the resulting sublattices relative to the first by, respectively, the vectors $\boldsymbol{\tau}_2, \ldots, \boldsymbol{\tau}_L$, and finally occupies the points of the shifted lattices, respectively, with atoms of the species $2, \ldots, L$. With only one atom per primitive unit cell, any point symmetry operation of the lattice will necessarily transform the whole crystal into itself. For crystals with basis, however, this is not true in general. For this reason it is meaningful to consider, besides the point symmetry operations of the crystal lattice, also orthogonal transformations which map physically equivalent directions of the crystal into each other, without necessarily bringing the crystal back onto itself. We explain the meaning of physical equivalence by using the example of the relation between the vectors of the electric current density $\mathbf{j}$ and the electric field strength $\mathbf{E}$ in a crystal. Generally, $\mathbf{j}$ is a nonlinear function of $\mathbf{E}$ and, because of crystal anisotropy, the directions of the two vectors may be different. If $\mathbf{E}$ and $\mathbf{j}$ are transformed from their original directions relative to the crystal into new ones, without changing their relative orientation, the relation between the new $\mathbf{j}$ and $\mathbf{E}$, in general, will differ from the relation before rotation. Analogous statements hold for reflections and rotation-reflections. If there are orthogonal transformations
Table 1.1: Symmetry classification of crystals. The following abbreviations are used: p - primitive, bc - body centered, fc - face centered, bfc - basis face centered.

[Columns: crystal system, holohedral group, Bravais lattice, crystal class, number of space groups. Rows: triclinic, monoclinic, rhombic, trigonal, tetragonal, hexagonal, cubic.]
which leave the relation between the two vectors unchanged (which ultimately must be verified experimentally), the original and the transformed directions are called physically equivalent. The group of orthogonal transformations which map physically equivalent directions of a crystal into each other is called the point group of equivalent directions. It defines the crystal class. Each crystal class corresponds to a particular point group of equivalent directions. In crystals without basis the point group of equivalent directions coincides with the holohedral point group. For crystals with a basis this is no longer true in general, and the point groups of equivalent directions are generally subgroups of the holohedral groups. There are as many different point groups of equivalent directions, or crystal classes, as there are different subgroups of the 7 holohedral groups. Using Appendix A one can easily show that their number amounts to 32 (see Table 1.1). Not all crystal classes can be realized in all crystal systems: the point group which defines the crystal class has to be a subgroup of the holohedral group of the crystal lattice. Each crystal class is, however, found in at least one crystal system, several in more than one. In assigning the crystal classes to the different crystal systems one proceeds as follows: A class which exists in several systems is attributed to the one with the lowest common symmetry. In this way one obtains the assignment between crystal systems and crystal classes shown in Table 1.1.
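The notion of physically equivalent directions can be made concrete in a simplified setting. The sketch below assumes, purely for illustration, a linear relation $\mathbf{j} = \sigma \mathbf{E}$ with a conductivity tensor $\sigma$ (the text allows a nonlinear relation in general). Under this assumption an orthogonal transformation $R$ leaves the relation unchanged exactly when $R \sigma R^T = \sigma$. The tensor values and all function names are hypothetical.

```python
def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def rotate_tensor(R, sigma):
    """Transformed tensor R sigma R^T."""
    return matmul(matmul(R, sigma), transpose(R))


def leaves_relation_unchanged(R, sigma):
    """True if j = sigma E has the same form before and after applying R."""
    return rotate_tensor(R, sigma) == sigma


# Assumed example tensor (arbitrary units): one value in the xy-plane, another along z.
SIGMA = [[2, 0, 0], [0, 2, 0], [0, 0, 5]]

C4_Z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # rotation by 90 degrees about z
C4_X = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]   # rotation by 90 degrees about x

print(leaves_relation_unchanged(C4_Z, SIGMA))
print(leaves_relation_unchanged(C4_X, SIGMA))
```

For this example tensor the rotation about $z$ maps physically equivalent directions into each other, while the rotation about $x$ exchanges the inequivalent $y$ and $z$ directions.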
1.2.3
It remains for us to explore the symmetry of the crystal as a whole. This consists of the set of all rigid displacements which transform the atoms of the crystal into identical or equivalent positions. It is obvious that symmetry operations which transform equivalent atoms into each other must necessarily also do the same with physically equivalent directions. The converse, however, is not always true: after carrying out a rotation, reflection, rotation-reflection or rotation-inversion which transforms a particular direction into a physically equivalent one, the atoms of the crystal are not necessarily also transformed into equivalent atoms. It may be necessary to add to a rotation yet another translation parallel to its axis (screw-rotation), or to add to a reflection a displacement parallel to the mirror plane (glide-reflection), or to add both in the case of rotation-reflections and rotation-inversions. That one records in this way all conceivable symmetry operations follows from a theorem proven in Appendix A, which states that every rigid displacement which is not a pure translation or an orthogonal transformation must be a screw-rotation or a glide-reflection. The parallel displacement $\mathbf{p}$ which follows a rotation about an $n$-fold axis in a screw-rotation must be an integer multiple of one $n$-th of the smallest lattice vector in the direction of the axis. This follows immediately if one
considers the $n$-th power of the screw-rotation, which is also a symmetry element. This describes an $n$-fold repetition of the rotation by $2\pi/n$, i.e. a rotation by $2\pi$, or no rotation at all, followed by the translation $n\mathbf{p}$. Since the crystal must pass into an equivalent position by such a transformation, the vector $n\mathbf{p}$ must be a whole lattice vector. As it points in the direction of the screw axis, it is necessarily a multiple of the shortest lattice vector in this direction. In the international notation system the multiplicity is described by a lower index affixed to the symbol for the rotation axis. This also indicates that the axis is not an ordinary one but a screw axis (with right-handed thread). The symbol $6_3$, for example, means a 6-fold screw axis with a parallel displacement by half the shortest lattice vector in the direction of the axis. For a glide-reflection one can similarly show that the translation in the mirror plane must be a superposition of multiples of halves of the smallest linearly independent lattice vectors in this plane. In summary, we may state that the symmetry operations on a crystal are of the following types: translations by lattice vectors, proper rotations, reflections, rotary reflections, rotation-inversions as well as screw-rotations and glide-reflections. Furthermore, all combinations of these transformations are allowed. The set of all symmetry operations on a crystal forms a group. It is called a space group. If the crystal has no screw-rotations or glide-reflections as symmetry operations, then its space group necessarily contains the point group of equivalent directions as a subgroup. Such space groups are called symmorphic. Space groups with screw-rotations or glide-reflections are called nonsymmorphic. The latter do not contain the point group of equivalent directions of the crystal.
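The argument about the $n$-th power of a screw-rotation can be followed step by step in a small sketch. It assumes a 4-fold screw axis along $z$ and measures lengths in units of the shortest lattice vector along the axis; the representation of an operation as a pair (rotation matrix, translation) and the names `compose`, `power` and `screw_4` are choices made for this illustration only.

```python
from fractions import Fraction

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
C4_Z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # rotation by 2*pi/4 about z


def compose(op2, op1):
    """Operation 'op2 after op1'; each op = (R, t) acts as x -> R x + t."""
    R2, t2 = op2
    R1, t1 = op1
    R = [[sum(R2[i][k] * R1[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [sum(R2[i][k] * t1[k] for k in range(3)) + t2[i] for i in range(3)]
    return R, t


def power(op, n):
    """n-fold repetition of op."""
    result = (IDENTITY, [0, 0, 0])
    for _ in range(n):
        result = compose(op, result)
    return result


def screw_4(m):
    """Screw axis 4_m: rotation C4_Z followed by a shift of m/4 along z."""
    return C4_Z, [Fraction(0), Fraction(0), Fraction(m, 4)]


for m in (1, 2, 3):
    R, t = power(screw_4(m), 4)
    print(m, R == IDENTITY, t)
```

In each case the fourth power is a pure translation by $m$ whole lattice vectors along the axis, which is the condition used in the text. A displacement that is not a multiple of one quarter of the lattice vector would fail it.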
Each element of a symmorphic space group is the product of an element of its translation group and an element of the point group of equivalent directions. The set of all possible space groups of crystals may be obtained in the following way: One considers, first of all, crystals of the triclinic system. In this case only the primitive Bravais lattice is possible. The corresponding space groups may easily be determined: there are only two. Similarly one proceeds with all other combinations of crystal classes, Bravais lattices and crystal systems, moving in the direction of increasing symmetry. At each stage of counting only the newly occurring space groups are added. In this way one obtains the numbers indicated in Table 1.1. The total is 230. Each of these 230 possible space groups corresponds to a particular crystal structure. By specifying its space group, the structure of a crystal is uniquely determined, except, of course, for changes of the distances between atoms and of angles between lines connecting atoms which do not affect the symmetry. The majority of semiconductors belong to a small selection of the possible crystal structures or space groups. Five are especially important. Their designations in the international system are $Fd3m$ (diamond structure), $F\bar{4}3m$ (zincblende structure), $Fm3m$ (rocksalt structure), $P6_3mc$ (wurtzite structure), and $P3_121$ or $P3_221$ (selenium structure). Each of these crystal structures is more or less closely tied to a particular material group. We consider this connection in more detail in the next section, which deals with the chemical nature of semiconductors. The five crystal structures above belong to just two different crystal systems: the diamond, zincblende and rocksalt structures belong to the cubic system and the wurtzite and selenium structures belong to the hexagonal system. We start with the three cubic structures. Table 1.2 summarizes their properties.
1.2.4
Crystals having diamond, zincblende and rocksalt structures not only belong to the same crystal system, but also have the same Bravais lattice, namely the face centered cubic, abbreviated as fcc. The fcc lattice is commonly described in a Cartesian coordinate system whose unit vectors $\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z$ are taken in the directions of the cubic crystal axes. The lattice constant $a$ is the distance between the lattice points of a primitive cubic reference lattice, which is obtained from the fcc lattice by omitting the face centers. The primitive lattice vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ of the fcc lattice may be chosen in the form

$$\mathbf{a}_1 = \frac{a}{2}(\mathbf{e}_y + \mathbf{e}_z), \quad \mathbf{a}_2 = \frac{a}{2}(\mathbf{e}_z + \mathbf{e}_x), \quad \mathbf{a}_3 = \frac{a}{2}(\mathbf{e}_x + \mathbf{e}_y).$$
The lattice constants for a series of semiconductors with fcc lattices are listed in Table 1.3. We stress that the cubic lattice constant is neither the distance between two lattice points (it is $a/\sqrt{2}$ for the fcc lattice), nor the distance between two atoms in a crystal with this structure (which is $\sqrt{3}a/4$). For the volume $\Omega_0$ of the primitive unit cell one obtains the value $a^3/4$ from equation (1.3). In the case of silicon this yields $\Omega_0 = 4.00 \times 10^{-23}$ cm$^3$. A silicon crystal of volume 1 cm$^3$, therefore, contains $2.5 \times 10^{22}$ primitive unit cells. The lattice does not determine the positions of the atoms. This is done by the basis of the crystal, in which respect the three cubic semiconductor structures differ.
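The silicon numbers quoted above follow from the lattice constant in Table 1.3 and can be recomputed with a few lines. This is a minimal sketch: the particular choice of fcc primitive vectors in `fcc_primitive_vectors` is one common convention and is an assumption of the example, as are all function and variable names.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def fcc_primitive_vectors(a):
    """One possible set of fcc primitive vectors (assumed convention)."""
    h = a / 2
    return (0.0, h, h), (h, 0.0, h), (h, h, 0.0)


def cell_volume(a1, a2, a3):
    """Magnitude of the triple scalar product, cf. equation (1.3)."""
    return abs(dot(a1, cross(a2, a3)))


A_SI_CM = 5.43e-8  # silicon lattice constant from Table 1.3 (5.43 Angstrom), in cm

omega0 = cell_volume(*fcc_primitive_vectors(A_SI_CM))
lattice_spacing = A_SI_CM / math.sqrt(2)     # nearest lattice points
bond_length = math.sqrt(3) * A_SI_CM / 4     # nearest atoms

print(f"primitive cell volume: {omega0:.2e} cm^3")
print(f"cells per cm^3:        {1 / omega0:.2e}")
print(f"lattice point spacing: {lattice_spacing * 1e8:.2f} Angstrom")
print(f"nearest-atom distance: {bond_length * 1e8:.2f} Angstrom")
```

The triple product reproduces $a^3/4$, and the two distances make explicit that neither coincides with the cubic lattice constant itself.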
Table 1.2: Characteristics of the five most common structure types of semiconductors. $0 < \zeta < 1/2$, 0 < n < ... More details are given in the text.

Structure    Space group               Point group    Bravais lattice    Basis
Diamond      $Fd3m$                    $m3m$          cubic fc           see text
Zincblende   $F\bar{4}3m$              $\bar{4}3m$    cubic fc           see text
Rocksalt     $Fm3m$                    $m3m$          cubic fc           see text
Wurtzite     $P6_3mc$                  $6mm$          hexagonal P        see text
Selenium     $P3_121$ or $P3_221$      $32$           hexagonal P        see text
Table 1.3: Lattice constants of semiconductors of various types of structures (a in Å).

Diamond structure
Material    a
C           ...
Si          5.43
Ge          5.65
α-Sn        6.46

Zincblende structure
Material    a
...         4.37
GaN         4.54
AlP         5.47
AlAs        5.66
AlSb        6.14
GaP         5.45
GaAs        5.65
GaSb        6.13
InP         5.87
InAs        6.05
InSb        6.47
ZnS         5.43
ZnSe        5.66
ZnTe        6.08
CdTe        6.42
HgSe        6.08
HgTe        6.37

Rocksalt structure
Material    a
PbS         5.93
PbSe        6.12
PbTe        6.45
SnTe        6.3
CdO         4.70
MgS         5.20
MgSe        5.46

Wurtzite structure
Material    a       c/a
ZnS         3.82    1.64
MgTe        4.53    1.62
AlN         3.11    1.60
CdS         4.14    1.62
CdSe        4.31    1.63
GaN         3.19    1.62
InN         3.54    1.61

Selenium structure
Material    a       c/a
Se          4.36    1.14
Te          4.47    1.33
Figure 1.3: Spatial model of a crystal having diamond (a) and zincblende structure (b). In the diamond structure all atoms are of the same chemical element, in the zincblende structure the atoms are of two different chemical elements.
Figure 1.4: Projection of a crystal having diamond and zincblende structure in a (100) plane (left) and a (111) plane (right). Atoms of the same size and darkness lie in the same plane. The vertical sequence of atomic layers is indicated on the right hand side of the projections. The sequence for the (100) plane is repeated after 4 layers, and that for the (111) plane after 6 layers.

Crystals with diamond and zincblende structures may also be described as a composite of two interpenetrating fcc lattices displaced with respect to each other by the vector $\boldsymbol{\tau}_2$ and with their lattice points occupied by, respectively, chemically equivalent or different atoms. The geometric relations are depicted in Figures 1.3 and 1.4 in, respectively, a 3-dimensional representation and a plane projection. From Figure 1.4 one can readily see that for both crystal structures the cubic axes through an atom form 4-fold mirror-rotation or inversion-rotation axes (which are the same in this drawing). The body diagonals through an atom are 3-fold axes of rotation, and the planes through opposite cube edges are mirror planes. Both structures are therefore transformed into themselves by the point group $T_d$, the symmetry group of a tetrahedron. It follows that their point groups of equivalent directions are at least $T_d$. For the zincblende structure there are no further point symmetry operations which transform equivalent directions into each other. The point group of equivalent directions is therefore $T_d$, and the space group turns out to be symmorphic. It is denoted by $F\bar{4}3m$.

Figure 1.5: Stereograms of the point groups $O_h$ (left) and $T_d$ (right) of, respectively, the diamond and zincblende structures.
In the case of diamond structure, there is yet another symmetry operation, namely that which transforms the two chemically equivalent atoms of the basis into each other. It may be described as reflection in a plane perpendicular to a cubic axis, say $\mathbf{e}_z$, which cuts the connecting line between two atoms at its center, followed by a translation in that plane by $(a/4)(\mathbf{e}_x + \mathbf{e}_y)$. One, therefore, has a glide-reflection as additional symmetry element. The space group of diamond structure becomes nonsymmorphic in this way. The point group of directions of this structure may be obtained from the point group $T_d$ of the zincblende structure by adding the reflections in planes perpendicular to the three cubic axes. This yields the full cubic group $O_h$, as one can easily determine by means of the stereograms of the two groups in Figure 1.5 (for an introduction to the stereograms of point groups, see Appendix A). The group $O_h$ can be generated from the tetrahedral group in yet another way, namely by adding the inversion operation. As an element of the space group, inversion must be relative to the site of an atom and be followed by a translation through the vector $\boldsymbol{\tau}_2$ or $-\boldsymbol{\tau}_2$, depending on whether one considers a 1- or 2-atom. For future reference we bear in mind that for the diamond and zincblende structures, inversion always involves an exchange of the two sublattices.
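The glide-reflection described above can be verified on a finite piece of the diamond structure. The sketch works with integer coordinates in units of $a/4$ and assumes atom 1 at the origin and atom 2 at $(a/4)(1,1,1)$, with the mirror plane at $z = a/8$ halfway between them. These placements and the names `sublattice` and `glide` are assumptions made for the illustration.

```python
from itertools import product


def sublattice(p):
    """Return 1 or 2 if p (integer coordinates in units of a/4) is an atom site
    of the diamond/zincblende structure, otherwise None."""
    for label, shift in ((1, (0, 0, 0)), (2, (1, 1, 1))):
        q = tuple(c - s for c, s in zip(p, shift))
        # fcc lattice points: all coordinates even, sum divisible by 4.
        if all(c % 2 == 0 for c in q) and sum(q) % 4 == 0:
            return label
    return None


def glide(p):
    """Reflection in the plane z = a/8, then translation by (a/4)(e_x + e_y)."""
    x, y, z = p
    return (x + 1, y + 1, 1 - z)


sites = [p for p in product(range(-4, 5), repeat=3) if sublattice(p) is not None]

# Every atom site must land on a site of the other sublattice.
exchanges = all(sublattice(glide(p)) == 3 - sublattice(p) for p in sites)
print(len(sites), exchanges)
```

Because the operation exchanges the two sublattices, it is a symmetry only when both carry the same chemical species, as in diamond. In the zincblende structure the same operation would interchange the two kinds of atoms.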
tal structures, the so-called coordination. Consider atom 1 of the basis of the unit cell at $\mathbf{R} = 0$. The basis atom 2 of this cell is one of its nearest neighbors. If all rotation-reflections are executed with respect to the axes through atom 1, then the basis atom 2 of the primitive unit cell at $\mathbf{R} = 0$ gives rise to three more basis atoms of type 2. They are all at the same distance from atom 1 as the 2-atom first considered, but naturally they lie in three other primitive unit cells. This means that each 1-atom has, altogether, four nearest neighboring 2-atoms. Since it would also have been possible to start from a 2-atom in this consideration, the same also holds with respect to a 2-atom, and, of course, independently of whether the 2-atom is of the same chemical nature as the 1-atom (diamond structure) or is not (zincblende structure). The four nearest neighbor atoms lie at the corners of a tetrahedron, whose center is occupied by the atom itself (see Figure 1.3). Their relative positions are given by the vectors $(a/4)(1, 1, 1)$, $(a/4)(1, \bar{1}, \bar{1})$, $(a/4)(\bar{1}, 1, \bar{1})$, $(a/4)(\bar{1}, \bar{1}, 1)$. The distance between nearest neighbor atoms is $\sqrt{3}a/4$. For silicon this is 2.35 Å. The second nearest neighbors belong to the same sublattice. Therefore, for both structures, they are atoms of the same chemical species. They are located at the nearest-neighbor lattice points of the fcc lattice, relative to the central atom, thus their positions are $(a/2)(\pm\mathbf{e}_x \pm \mathbf{e}_z)$, $(a/2)(\pm\mathbf{e}_y \pm \mathbf{e}_z)$, $(a/2)(\pm\mathbf{e}_x \pm \mathbf{e}_y)$. This means that there are 12 second nearest neighbors in each structure.

Rocksalt structure

The basis of this crystal structure consists, like that of zincblende structure, of two atoms of different chemical nature which are displaced relative to each other in the direction of the space diagonal of the reference cube. In contrast to zincblende structure, the displacement is not, however, a quarter but half of the length of the space diagonal of the reference cube, whence

$$\boldsymbol{\tau}_2 = \frac{a}{2}(\mathbf{e}_x + \mathbf{e}_y + \mathbf{e}_z).$$
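The coordination of the cubic structures can be checked by direct enumeration. The sketch below assumes integer coordinates in units of $a/4$, a central atom of species 1 at the origin, and a second species shifted by a basis vector `tau`: $(1,1,1)$ for the diamond and zincblende structures, $(2,2,2)$ for a displacement of half the space diagonal as in the rocksalt structure. The function name `shells` and its parameters are hypothetical.

```python
from collections import Counter
from itertools import product


def shells(tau, n_shells=2, reach=8):
    """Closest neighbor shells around the atom at the origin.

    Returns a list of ((squared distance in units of (a/4)^2, species), count)."""
    counts = Counter()
    for R in product(range(-reach, reach + 1, 2), repeat=3):
        if sum(R) % 4 != 0:          # keep only fcc lattice points
            continue
        for species, shift in ((1, (0, 0, 0)), (2, tau)):
            p = tuple(r + s for r, s in zip(R, shift))
            d2 = sum(c * c for c in p)
            if d2 > 0:               # skip the central atom itself
                counts[(d2, species)] += 1
    return sorted(counts.items())[:n_shells]


print("diamond/zincblende:", shells((1, 1, 1)))
print("rocksalt:          ", shells((2, 2, 2)))
```

A squared distance of 3 corresponds to $\sqrt{3}a/4$, 4 to $a/2$ and 8 to $a/\sqrt{2}$. The enumeration yields four nearest and twelve second nearest neighbors for the tetrahedral structures, and six nearest and twelve second nearest neighbors for the larger displacement.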
Figure 1.7: Projection of a crystal having rocksalt structure in the (100) plane (left) and the (111) plane (right). The vertical sequence of atomic layers is shown on the right hand side of the projections. In the (100) case the stacking repeats itself after 4 layers, and in the (111) case after 6 layers.

Accordingly, a crystal having rocksalt structure can be described as a composite of two fcc lattices, displaced with respect to each other by the vector $\boldsymbol{\tau}_2$, and with their lattice points occupied by chemically different atoms. This structure is illustrated in Figures 1.6 and 1.7. As one can see from these drawings, the crystal has the full symmetry of the primitive reference cube. The point group of equivalent directions is therefore the full cubic group $O_h$ (see Figure 1.5). The space group is symmorphic and is denoted by $Fm3m$. In the direction of a cubic axis, the atoms are separated by a distance of $a/2$, with 1- and 2-atoms alternating. Since shorter distances do not occur, these atoms are nearest neighbors. Each atom, therefore, has 6 nearest neighbors. The second nearest neighbors are the nearest neighboring atoms of the same fcc sublattice. Their number amounts to 12, as in the zincblende structure. The relative positions are also the same as in this structure.
1.2.5
The primitive lattice vectors of the hexagonal lattice may be written in the form

$$\mathbf{a}_1 = a\,\mathbf{e}_x^h, \quad \mathbf{a}_2 = -\frac{a}{2}\,\mathbf{e}_x^h + \frac{\sqrt{3}}{2}a\,\mathbf{e}_y^h, \quad \mathbf{a}_3 = c\,\mathbf{e}_z^h.$$

Here $\mathbf{e}_x^h, \mathbf{e}_y^h, \mathbf{e}_z^h$ are unit vectors of a Cartesian coordinate system whose z-axis points in the direction of the c-axis of the hexagonal lattice and whose x-axis is identified arbitrarily with one of the three symmetric lattice directions in the plane perpendicular to the c-axis. The two primitive lattice vectors $\mathbf{a}_1$ and $\mathbf{a}_2$ in this plane have equal lengths and form an angle of $2\pi/3$ with each other. The volume $\Omega_0$ of a primitive unit cell is $\sqrt{3}a^2c/2$. The lattice constants $a$ and $c$ of some semiconductors with a hexagonal lattice are listed in Table 1.3.

Wurtzite structure

The basis of crystals with wurtzite structure consists of 4 atoms. We denote them by $M_1$, $X_1$, $M_2$ and $X_2$. Each pair of these atoms, namely $M_1$, $M_2$ on the one hand, and $X_1$, $X_2$ on the other hand, are chemically equivalent, while the two pairs are of different elements M and X. If atom $M_1$ is sited at a lattice point, then the position vectors of the three others are
$$\vec{\tau}_{X1} = \zeta c\,\mathbf{e}_z^h , \tag{1.10}$$
Here, ζ is a parameter which may take any value between 0 and 1/2. A wurtzite type crystal can be understood as a superposition of four interpenetrating hexagonal lattices, with the lattice points of two of them occupied by atoms of chemical species M, and two of them by atoms of species X. The two M-lattices are displaced relative to each other by τ_M2, as are the two X-lattices. The displacement of the X1-lattice with respect to the M1-lattice is τ_X1. In Figures 1.8 and 1.9 these relations are illustrated in, respectively, spatial and planar displays. The M-atoms are shown in black and the X-atoms in white. From these figures one can easily determine the symmetry of the wurtzite structure. The c-axis through the center of one of the empty hexagons in Figure 1.9 forms a 6-fold screw axis 6_3: a 2π/6 rotation about this axis must be connected with a parallel displacement by c/2 in order to transform the crystal into itself. This also means that this axis represents a 3-fold proper axis of rotation. The axis contains six inequivalent mirror planes, among them three proper reflection planes and three glide-reflection planes joined with translations by c/2 in the direction of the c-axis. The corresponding point group of directions is C_6v (see Figure 1.10). The space group of the wurtzite structure is nonsymmorphic and is denoted by P6_3mc.

Consider the atom M1 in the unit cell at R = 0. Its neighbor in the same unit cell is atom X1 at the site τ_X1. Another neighbor of M1 is the atom X2 at the site -c e_z^h + τ_X2 of the unit cell at R = -c e_z^h. Since the c-axis through atom M1 is a 3-fold symmetry axis (see Figure 1.9), one obtains from X2 two additional X-atoms at the same distance but in other unit cells. Altogether, there are four neighboring X-atoms. In order that all four be at the same distance from M1, and, correspondingly, all four be
Figure 1.9: Projection of a crystal with ideal wurtzite structure on the plane normal to the c-axis. The vertical sequence of atomic layers is shown on the right hand side.
nearest neighbors of M1, the distance |τ_X1| between M1 and X1 must be equal to the distance |-c e_z^h + τ_X2| between M1 and X2 in the unit cell at R = -c e_z^h. This results in the condition

$$\frac{c}{a} = \left[ 3 \left( \zeta - \frac{1}{4} \right) \right]^{-1/2} . \tag{1.11}$$
If it is fulfilled, the atom M1 has four nearest neighbors of chemical species X. The same statement is true for M2 and, with the interchange of M and X, likewise for X1 and X2. In semiconductor crystals of the wurtzite type, the condition (1.11) is always satisfied almost perfectly. This means that
the four nearest neighbors of an atom are sited at the corners of a slightly deformed tetrahedron which surrounds the central atom symmetrically, and
which is compressed or stretched in the direction of the c-axis. The degree of deformation depends on the ratio c/a. For the deformation to vanish, the distance between the M1- and X1-lattice planes perpendicular to the c-axis must be three times the distance between the X1- and M2-lattice planes perpendicular to this axis, just as in the zincblende structure. This yields ζ = 3(1/2 - ζ) or ζ = 3/8 which, taken together with equation (1.11), results in

$$\frac{c}{a} = \sqrt{\frac{8}{3}} = 1.633 . \tag{1.12}$$
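The equal-distance argument above can be checked numerically. The following is only a sketch under stated assumptions: the position of the M2 atom is taken as τ_M2 = (1/3) a_1 + (2/3) a_2 + (1/2) a_3, one common choice for the wurtzite basis that is not quoted from the text, and the condition on ζ is written as ζ = 1/4 + (1/3)(a/c)², which follows from equating the two M-X distances in that geometry. The function names are illustrative.

```python
import math

def zeta_from_ratio(c_over_a):
    """Internal parameter zeta that makes all four M-X distances equal
    (assumed form of the equal-distance condition)."""
    return 0.25 + 1.0 / (3.0 * c_over_a ** 2)

def neighbor_vectors(a, c, zeta):
    """Vectors from atom M1 (at the origin) to its four neighboring X atoms."""
    axial = (0.0, 0.0, zeta * c)  # X1 in the same unit cell
    # X2 of the cell at R = -c e_z, with the assumed tau_M2 = (a/sqrt(3)) e_y + (c/2) e_z
    first = (0.0, a / math.sqrt(3.0), (zeta - 0.5) * c)
    basal = []
    for k in range(3):  # apply the 3-fold rotation about the c-axis
        phi = 2.0 * math.pi * k / 3.0
        x = first[0] * math.cos(phi) - first[1] * math.sin(phi)
        y = first[0] * math.sin(phi) + first[1] * math.cos(phi)
        basal.append((x, y, first[2]))
    return [axial] + basal

def length(v):
    return math.sqrt(sum(comp * comp for comp in v))

def bond_angle_deg(u, v):
    dot = sum(p * q for p, q in zip(u, v))
    return math.degrees(math.acos(dot / (length(u) * length(v))))

ideal_ratio = math.sqrt(8.0 / 3.0)           # ideal c/a
ideal_zeta = zeta_from_ratio(ideal_ratio)    # should come out as 3/8
bonds = neighbor_vectors(1.0, ideal_ratio, ideal_zeta)
print("c/a =", round(ideal_ratio, 3), " zeta =", round(ideal_zeta, 3))
print("bond lengths / a:", [round(length(b), 4) for b in bonds])
print("bond angle:", round(bond_angle_deg(bonds[0], bonds[1]), 2), "degrees")
```

For the ideal ratio the four bond lengths coincide and the angle between bonds is the tetrahedral angle, as expected for an undeformed tetrahedron. For a non-ideal c/a the lengths stay equal when ζ is chosen from the condition, but the angle deviates from the tetrahedral value.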
Table 1.2 shows that all wurtzite type semiconductors listed there also satisfy this condition quite well. This indicates that the ordering of the nearest neighbors in the wurtzite structure is almost the same as in the zincblende structure. If a wurtzite crystal exactly obeys conditions (1.11) and (1.12), one says that it has an ideal wurtzite structure. The distance between two nearest neighbor atoms in the zincblende structure is √3 a_c/4, with a_c the cubic lattice constant. The corresponding distance in the wurtzite structure is ζc = √(3/8) a. For the two distances to be equal, the equation

$$a_c = \sqrt{2}\,a \tag{1.13}$$

must hold. This result means that in a crystal having wurtzite structure which fulfills conditions (1.11) and (1.12), the nearest neighbor atoms are positioned at exactly the same sites as in a crystal having zincblende structure with lattice constant √2 a. If one multiplies the lattice constants a of the wurtzite type crystals in Table 1.2 by √2, then one in fact obtains values which fit well with the cubic lattice constants of the zincblende structure crystals of this table. The volume √3 a²c/2 of a primitive unit cell of the ideal wurtzite structure is a_c³/2, which is twice that of the zincblende reference structure given by a_c³/4. Of course, this does not at all mean that the wurtzite structure may be traced back to the zincblende structure. Indeed, the actual positions of the second nearest neighbors differ in the two crystal structures, although they are all at the same distance a = a_c/√2. This may be seen most easily from the projections of the two crystals on the plane perpendicular to the c-axis (see Figure 1.4) or to the space diagonal e_x + e_y + e_z (see Figure 1.9). In the wurtzite structure the vertical stacking repeats itself after two double layers of nearest neighbor atoms, but in the zincblende structure only after three. The wurtzite structure can be set up, therefore, by stacking one upon another two different double layers A and B, the zincblende structure by stacking three double layers A', B', C' (see Figure 1.11). The uppermost double layer A' of the zincblende structure in Figure 1.11b coincides with the
Figure 1.11: Construction of a crystal having wurtzite structure (top) and zincblende structure (bottom) by stacking of double layers of nearest-neighbor atoms. Atoms shown by full circles form the upper part of a double layer, and atoms shown by open circles form the lower part of such a layer. In stacking, the nets shown in each of the double layers must be placed at the same positions. The stacking sequence is depicted on the upper right. Further explanations are given in the text.

uppermost double layer A of the wurtzite structure in Figure 1.11a, while B' and C' differ from B. By examining the layers in Figure 1.11 we also note the different locations of some of the second nearest neighbor atoms of the two crystal structures. Among the 12 second nearest neighbors of an atom in an A = A' layer, in the ideal wurtzite structure three are found in the upper B-layer and three in the lower B-layer, while in the zincblende structure three second nearest neighbors are in the C'-layer above and three in the B'-layer below. In Figure 1.11 this fact is illustrated by emphasis on the second nearest neighbor atoms mentioned. Those in double layer B' are not in the same positions as those in double layer B. Thus the second nearest neighbor sites are partially different in the ideal wurtzite and the zincblende structures. A comparison of the two crystal structures shows that the same ordering of the nearest neighbor atoms may be consistent with different orderings of more remote atoms. From this observation we may conclude that, in a solid,
the long-range order could be perturbed without significantly changing the short-range order of the corresponding crystal. This is just what is observed in amorphous silicon and other amorphous semiconductors.

Selenium structure

The basis of trigonal selenium and tellurium crystals consists of three atoms. The vectors of the basis may be written in the form
~3 = 2craez
+  e,h, 3
2c
(1.14)
where α is a parameter ranging between 0 and 1/2. The atoms are ordered on parallel spirals, in which successive atomic positions are rotated with respect to each other by an angle of 120° (see Figure 1.12). The space group can be determined from the projection of the crystal on the plane perpendicular to the c-axis (see Figure 1.13). It contains one 3-fold screw axis parallel to the c-axis which goes through the center of one of the empty triangles in Figure 1.13. A rotation by 120° or 240° must be followed by a displacement by c/3 or 2c/3 in the direction of the c-axis, to transform the crystal into itself. Besides these screw operations, rotations are allowed with respect to the three inequivalent 2-fold axes perpendicular to the c-axis. These rotation axes pass through the 3-fold screw axis and through one of the three surrounding atoms. The symmetry operations mentioned correspond to the space group P3_1 21. Besides this, the selenium structure may also have symmetry corresponding to the space group P3_2 21, in which the spirals have a left-handed rather than a right-handed thread. The point group of directions is, in both cases, D_3 (see Figure 1.14).

Figure 1.13: Projection of a crystal having selenium structure on the plane normal to the c-axis. The vertical sequence of atomic layers is shown on the right hand side.

Concluding this discussion of the characterization of the most common semiconductor crystal structures, it should be emphasized that these structures apply to the ideal case of infinite, structurally perfect, and chemically absolutely pure materials. Real crystals deviate from this ideal case in varying degrees. There are structural defects such as stacking faults, step or screw dislocations, vacancies, or atoms on interstitial crystal sites. Such defects will be considered in detail in Chapter 3, in the context of the electronic structure of perturbed semiconductors. In the next section we will discuss the chemical composition of ideal semiconductor crystals.
1.3
According to their chemical nature, most semiconductors are inorganic materials. Examples of organic semiconductors include anthracene, naphthalene and polyacetylene. In this book, we restrict our considerations to inorganic semiconductors, because these are much better understood and have much greater technological importance than semiconductors of the organic type. Nevertheless, organic semiconductors are also of interest for science and technology. The best way to get an overview of the different classes of inorganic semiconducting materials is to examine the periodic table of elements. In Table 1.4, a part of the periodic table is shown with which many elemental and compound semiconductors are associated. We start with the elemental semiconductors of group IV.
Table 1.4: Part of the periodic table with which many elemental and compound semiconductors are associated.

    II    III   IV    V     VI

          B     C     N     O
    Mg    Al    Si    P     S
    Zn    Ga    Ge    As    Se
    Cd    In    Sn    Sb    Te
    Hg    Tl    Pb    Bi
1.3.1
Semiconductors of this group are, starting at the top of column IV, carbon (C) in the form of diamond, silicon (Si), germanium (Ge), and tin (Sn) in its gray modification known as α-Sn. All cited semiconductors of this group crystallize into the structure of diamond. They differ from each other in regard to their position between metals and insulators. Diamond behaves much like an insulator and tin, on the contrary, much like a metal. Between them one has the two typical semiconductors silicon and germanium. Both materials play an essential role in micro- and optoelectronics. It is well known that silicon is the material of preference in microelectronics and, with this, one of the most important materials in all modern technology.

Besides the pure elemental semiconductors Si and Ge, alloys of both materials also have semiconducting properties. The notation for such alloys is (Si,Ge), or Si(1-x)Ge(x) to indicate the respective mole fractions 1 - x and x of the two alloy components. According to their structure, the alloys may be identified as mixed crystals. These are crystals with the same geometric order of atom sites as in the case of the alloy components, here that of diamond in both cases. The regular crystal sites are occupied, however, randomly by Si or Ge atoms. The probability of finding a particular element on a given site is determined by the mole fraction of this element in the alloy. In the case of Si(1-x)Ge(x), the probability of finding Si is 1 - x, and of finding Ge is x. The lattice constants of mixed crystals, in many cases, follow from the lattice constants of the constituents by means of linear interpolation. This result is known as Vegard's rule. The cubic lattice constant a(Si,Ge) of a Si(1-x)Ge(x) alloy, for example, is given by the relation a(Si,Ge) = (1 - x) a_Si + x a_Ge, where a_Si and a_Ge are the cubic lattice constants of Si and Ge, respectively. A mixed crystal strongly deviates from an ideal crystal, but the semiconducting property of the two pure crystals is often retained in the mixed crystal. This is important, and by no means obvious. It was recognized at the beginning of the sixties in (Si,Ge) alloys. Later it was also shown to be true in many other material combinations. The discovery of semiconducting alloys significantly expanded the family of semiconductor materials. It became possible to tailor materials with desired properties, just by varying the alloy elements and mole fractions.

The common feature of the elements of the main group IV of the periodic table is that there are four electrons in the outer shell of their electron clouds, the so-called valence shell. The primitive unit cell, which here has two atoms, therefore contains 8 electrons. Later we will prove that semiconducting properties are related to the crystal structure and the number of valence electrons per primitive unit cell. Using this result, one may conclude that compounds of elements from main groups III and V, and from main groups II and VI of the periodic table should also be semiconductors, provided they have zincblende structure, since the crystals of these compounds also have 8 electrons per primitive unit cell. Actually the III-V compounds (with a few exceptions) do crystallize into the zincblende structure. In the case of II-VI compounds the wurtzite structure can also occur, besides the zincblende structure. The two structures are, however, very similar, as has been pointed out above, so that semiconducting behavior may be expected also in their case.
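Vegard's rule and the site-occupation probabilities described above are simple enough to put into a few lines. The sketch below is only illustrative: the numerical lattice constants are commonly quoted room-temperature values and are an assumption, not taken from the text, and the function names are hypothetical.

```python
# Assumed room-temperature cubic lattice constants in angstrom (not from the text).
A_SI = 5.431
A_GE = 5.658

def vegard_lattice_constant(x, a_si=A_SI, a_ge=A_GE):
    """Lattice constant of a Si(1-x)Ge(x) mixed crystal by linear interpolation."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("mole fraction x must lie between 0 and 1")
    return (1.0 - x) * a_si + x * a_ge

def site_occupation(x):
    """Probability of finding Si or Ge on a given regular crystal site."""
    return {"Si": 1.0 - x, "Ge": x}

for x in (0.0, 0.25, 0.5, 1.0):
    print(f"x = {x:4.2f}  a = {vegard_lattice_constant(x):.4f} A  p = {site_occupation(x)}")
```

The interpolation reproduces the pure constituents at x = 0 and x = 1 by construction; how well it describes a real alloy in between is an empirical question that the rule itself does not answer.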
1.3.2 III-V semiconductors
The conjecture that the III-V compounds should be semiconductors is amply confirmed. Following the group III and group V columns of the periodic table from the top down, one obtains the following compound semiconductors: BN, BP, BAs, AlN, AlP, AlAs, AlSb, GaN, GaP, GaAs, GaSb, InN, InP, InAs and InSb. Except for the nitrides, all these compounds crystallize into the zincblende structure. The nitrides are stable in the wurtzite structure; BN and GaN also have metastable zincblende phases. Just as in the case of the elemental semiconductors of group IV, the mixed crystals made of binary III-V compounds also have semiconducting properties. Examples are (Ga,Al)As, Ga(As,P), (In,Ga)As and (In,Ga)(As,P). The principal applications of the III-V semiconductors and their alloys lie in the field of optoelectronics. (Ga,Al)As and Ga(As,P) are used, for example, in light emitting diodes (LEDs) and laser diodes for the near infrared to green spectral region. GaN is a promising material for blue LEDs and laser diodes. The quaternary alloy system (Ga,In)(As,P) is used in making lasers and photodiodes for optical fiber communications at 1.55 μm wavelength, which provides maximum transmission for SiO2-based fibers. GaAs and, lately, also (Ga,Al)As are employed for transistors with extremely short time delay. BN and GaN can be used for electronic devices to be operated at high temperatures.
1.3.4
Selenium and tellurium are likewise good semiconductors. Both crystallize into the same trigonal chain structure, which is characteristic of these materials (see Table 1.2). Selenium has a long history of practical use. Before the introduction of power rectifiers based on silicon, selenium ones were used. Today, above all, the photoelectric properties of selenium are exploited. This applies, for example, to photocopying where the photosensitive layer of the photostatic cylinder may be made of selenium or a selenium alloy.
1.3.6
It is to be expected that a combination of two elements of group IV will be a semiconductor. It seems that SiC is the only stable compound of this kind. It is one of the longest known semiconductors of all. Electroluminescence was observed in this material as early as 1907. SiC occurs in many crystal structures, among them the zincblende and wurtzite structures. After a long period of only moderate interest in SiC, much activity is now being devoted to this material. The reason is the good thermal stability of SiC, which makes it possible to fabricate electronic devices for high temperature and high power applications.

The elements of groups II and V may form semiconducting II3-V2 compounds. Examples are Zn3As2 and Cd3As2, which possess both tetragonal and cubic crystalline phases. One also finds semiconducting properties in compounds composed of elements of groups V and VI with the stoichiometric composition V2-VI3. In particular, the oxides, sulfides, and selenides of the semimetals As, Sb, and Bi are among them. With regard to crystal structure, a relatively large diversity exists among these compounds. Different structure types of the trigonal, orthorhombic and monoclinic crystal systems have been observed. The very good thermoelectric properties of Bi2Se3 and Bi2Te3 have long been used for cooling elements.

Finally, one also finds semiconductors among III-VI compounds. Examples are Ga2Se3, Ga2Te3 or In2Se3, which display a 2:3 stoichiometry. They crystallize into so-called defect structures. These are modifications of certain basic crystallographic structures as, for example, of the zincblende or wurtzite type, which distinguish themselves by having an (ordered or disordered) network of cation vacancies. These vacancies are necessary to make possible a stoichiometry which deviates from that of the basic structure. Among the semiconducting III-VI compounds, however, there are also those with 1:1 stoichiometry. The layer structures GaS, GaSe, and GaTe are examples. A semiconductor of this group displaying ferromagnetic properties is EuSe.

So far we have exclusively considered binary compound semiconductors. However, there are also semiconductors generated from ternary and quaternary compounds. Among the ternary I-III-VI compounds, one has, in particular, the I-III-VI2 materials which crystallize, such as CuGaS2, into the chalcopyrite structure belonging to the tetragonal crystal system. Semiconductors are also counted among the II-IV-V2 compounds, with ZnSiP2 as an example. The crystal structure is also of chalcopyrite type.

Concluding this survey of the most important semiconducting material classes, we add two more general remarks. Firstly, it has to be remembered that we have stated the chemical composition of ideal semiconductor crystals, meaning, in particular, chemically pure materials. Absolute chemical purity, however, occurs in nature just as seldom as absolute crystallographic perfection. Each real semiconductor material contains chemical impurities. In order that crystals composed of such materials really be semiconductors, their impurity concentrations must not be too large; bear in mind what has been said about this point in the introduction. On the other hand, semiconductors must not be too pure either. In order to have a sufficiently large electric conductivity, they must often be intentionally polluted with certain chemical elements. We will undertake this discussion in greater detail in the next subsection. Secondly, it is striking that the semiconducting substance classes which were defined primarily by their chemical nature also display, as a rule, the same crystallographic structure. The reason for this relation between chemical composition and atomic structure is by no means obvious. It is closely connected with the electronic structure of crystals which we will treat in Chapter 2.
Table 1.5: Electrical conductivity σ and charge carrier concentrations n for metals, semiconductors, and insulators.

    Material class    σ / Ω⁻¹ cm⁻¹    n / cm⁻³
    Metal             ...             ...
    Semiconductor     10^3 ...        10^21 ... 10^10
    Insulator         ...             < 10^10
1.4.1 Electrical conductivity
As already mentioned, the term semiconductor refers to a characteristic macroscopic property of these materials, namely their electrical conductivity σ. The latter is defined as the proportionality factor between the current density j and the electric field E in Ohm's law
$$\mathbf{j} = \sigma \mathbf{E} . \tag{1.15}$$
Here, it is assumed that anisotropies either do not exist or are negligible. The following example gives an idea of the order of magnitude of the conductivity: Consider a cube with an edge length of 1 cm, and apply a voltage of 1 V between two of its opposite faces. Then a current of 1 A will flow if the cube material has a conductivity of σ = 1 Ω⁻¹ cm⁻¹. The orders of magnitude spanned by the σ-values of the three material classes metals, semiconductors, and insulators are assembled in Table 1.5. The striking features of this table are the large changes both between the three classes of substances and within the semiconductor material class. The question arises how these sharp changes may be understood on a physical basis.

To construct an explanation, we use the known representation of the electrical conductivity σ as the product of the electron concentration n, the mobility μ, and the electric charge e of an electron,
$$\sigma = e n \mu . \tag{1.16}$$
The mobility μ is defined as the ratio of the magnitude of the average velocity of an electron to the strength of the electric field driving it (for a derivation see Chapter 5). Formula (1.16) for the conductivity σ shows that its variations may, in principle, be caused by different values of the mobility μ and/or by different electron concentrations n. The mobility is actually determined by the scattering processes which electrons undergo due to the perturbations of the crystal lattice. One expects, of course, differences with respect to the strengths of these processes in metals, semiconductors, and insulators, but variations by several orders of magnitude are very unlikely. The orders of magnitude of the mobilities μ_ME, μ_SC and μ_IS in normal metals, semiconductors and insulators, respectively, should be more or less the same:
$$\mu_{ME} \approx \mu_{SC} \approx \mu_{IS} . \tag{1.17}$$
This conjecture is confirmed by experiment, and consequently it follows that the large conductivity differences between the three material groups are essentially due to their different electron concentrations. If one assumes for metals one free electron per atom, which amounts to approximately 10^22 electrons per cm³, then the conductivity values in Table 1.5 yield the electron concentrations of semiconductors and insulators listed in this table.
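The argument above, that equal mobilities make conductivity ratios equal to concentration ratios, can be traced with a short calculation. This is a sketch only: the mobility value of 100 cm²/(V s) is a hypothetical round number chosen for illustration, and the function names are not from the text.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in coulomb

def conductivity(n_per_cm3, mobility_cm2_per_vs):
    """sigma = e n mu, in ohm^-1 cm^-1 for n in cm^-3 and mu in cm^2/(V s)."""
    return E_CHARGE * n_per_cm3 * mobility_cm2_per_vs

def carrier_concentration(sigma, mobility_cm2_per_vs):
    """Invert sigma = e n mu for the electron concentration n in cm^-3."""
    return sigma / (E_CHARGE * mobility_cm2_per_vs)

def cube_current(sigma, edge_cm=1.0, voltage=1.0):
    """Current through a cube: I = j * area = sigma * (V / L) * L**2 = sigma * V * L."""
    return sigma * voltage * edge_cm

MU = 100.0  # hypothetical common mobility for all three material classes

print("cube example, sigma = 1:", cube_current(1.0), "A")
sigma_metal = conductivity(1e22, MU)  # one free electron per atom, about 1e22 cm^-3
for factor in (1.0, 1e-3, 1e-9, 1e-15):
    sigma = sigma_metal * factor
    print(f"sigma = {sigma:9.3e} ohm^-1 cm^-1 -> n = {carrier_concentration(sigma, MU):9.3e} cm^-3")
```

Because the mobility cancels in the ratio, the particular value chosen for `MU` affects only the absolute conductivities, not the conclusion that n scales in proportion to σ.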
1.4.2
The large range of conductivity values of semiconductors, which extends over many orders of magnitude up to 10^3 Ω⁻¹ cm⁻¹, does not arise, as one might suppose, from the chemical diversity of semiconductors, but is primarily due to the fact that a given semiconductor material covers the entire range of σ-values by itself if its macroscopic state is varied. By state we mean, firstly, the temperature, and secondly the concentrations of certain impurities. Considering first the change of the conductivity with temperature, we note that it is particularly pronounced if the semiconductor contains practically no impurities; one calls this an intrinsic semiconductor. In Figure 1.15 the dependence of the conductivity on temperature is shown for very pure silicon and germanium samples. The conductivity increases with rising temperature according to the exponential law
$$\sigma = \sigma_0 \exp\left( -\frac{E_g}{2kT} \right) . \tag{1.18}$$
Here, σ_0 is a factor which depends only weakly on temperature, k is the Boltzmann constant and E_g is an energy of about 0.7 eV in the case of germanium and 1.1 eV in the case of silicon. Formula (1.18) also describes the temperature dependence of the conductivity for other pure semiconductor materials fairly well at sufficiently high temperatures, save that the values for σ_0 and E_g must be changed. In Table 1.6 the E_g-values are listed for some important semiconductors.

There is a second way of varying the conductivity of a semiconductor over a wide range, namely to intentionally pollute it with atoms of certain chemical elements. One calls this process doping with impurity atoms. In this context, the doped semiconductor materials are called extrinsic semiconductors. As an example we choose silicon, doped with arsenic atoms
Epzianov, 1979.)
of concentration N_D. In Figure 1.16 the change of conductivity is shown with N_D varying from 10^14 cm⁻³ upward. At the temperature of 300 K corresponding to the data of Figure 1.16, the conductivity rises monotonically with doping concentration. Again, a change of several orders of magnitude is exhibited.
Table 1.6: E_g-values for several semiconductor materials at 300 K (in eV). (After Landolt-Börnstein, 1982.)

    Material   E_g     Material   E_g     Material   E_g
    C          5.5     GaN        3.30    ZnSe       2.67
    Si         1.11    GaP        2.27    CdS        2.50
    Ge         0.66    GaAs       1.43    CdTe       1.43
    β-SiC      2.2     GaSb       0.71    HgTe       0.00
    BN         6.2     InN        1.95    PbS        0.37
    AlN        6.28    InP        1.26    PbSe       0.26
    AlP        2.45    InAs       0.36    PbTe       0.30
    AlAs       3.14    InSb       0.18    Se         1.8
    AlSb       1.63    ZnS        3.56    Te         0.33
Figure 1.17: Temperature dependence of the conductivity σ of As-doped silicon for different arsenic concentrations N_D: 1: 1.75 × 10^14, 2: 2.1 × 10^15, 3: 1.75 × 10^16, 4: 1.3 × 10^17, 5: 2.2 × 10^18, 6: 2.7 × 10^19 cm⁻³. (After Morin and Maita, 1954.)
The temperature dependence of the conductivity is substantially weaker for doped silicon than for pure silicon, as may be seen from Figure 1.17. Each curve
in this figure corresponds to a particular concentration of arsenic atoms. The temperature rise becomes weaker with increasing doping, and for the highest doping it completely vanishes. At higher temperatures, it even turns downward in a decrease that is more pronounced for purer materials. The rise of curve 1 above 400 K indicates that the corresponding (relatively pure) material starts to behave like an intrinsic semiconductor (see Figure 1.15).

The origin of the difference in the temperature characteristics of the conductivity of pure and doped semiconductors remains to be addressed. In this regard, it will be helpful to discuss yet another macroscopic property of semiconductors, namely optical absorption.
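To get a feeling for how steep the intrinsic temperature dependence is, one can evaluate the ratio of conductivities at two temperatures. The sketch below assumes the exponential law in the form σ = σ_0 exp(-E_g/2kT) with σ_0 treated as strictly constant, which is a simplification since σ_0 is only weakly temperature dependent; the gap values are the rounded ones quoted in the text, and the function name is illustrative.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def intrinsic_conductivity_ratio(e_gap_ev, t_low, t_high):
    """sigma(t_high) / sigma(t_low) for sigma = sigma_0 * exp(-E_g / (2 k T)),
    with sigma_0 assumed to be temperature independent."""
    exponent = -e_gap_ev / (2.0 * K_B_EV) * (1.0 / t_high - 1.0 / t_low)
    return math.exp(exponent)

for name, e_gap in (("Ge", 0.7), ("Si", 1.1)):
    ratio = intrinsic_conductivity_ratio(e_gap, 300.0, 400.0)
    print(f"{name}: sigma(400 K) / sigma(300 K) = {ratio:.1f}")
```

Under these assumptions, heating from 300 K to 400 K raises the intrinsic conductivity by a factor of roughly 30 for germanium and roughly 200 for silicon, so the larger gap gives the steeper temperature dependence.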
1.4.3
The measured values of the absorption coefficient of silicon are displayed in Figure 1.18, showing α(ħω) as a function of the photon energy ħω. The particular shape of this spectrum, namely the vanishing of α below a threshold energy, the steep rise at this energy up to values which are larger by orders of magnitude, and the weak decrease from these high values with further increase of photon energy, is characteristic of all semiconductors. In simplified terms, one may say that semiconductors are transparent below the threshold photon energy, and absorbing above it. The values of the optical threshold energies coincide so well with the E_g-values known from the temperature dependence of the electrical conductivity (see Table 1.6), that an intimate relation is strongly suggested. Below we will analyze this relation in greater detail.

Consider a semiconductor in thermodynamic equilibrium at absolute zero temperature, and let E_v be the maximum energy which an electron may have in such a semiconductor. If an electron of that energy absorbs a photon of energy ħω, then its energy increases to the value E_v + ħω. The absorption can, however, only occur if the energy E_v + ħω is in the allowed range of electron energies. It is well known that in quantum mechanics not all energies are allowed, but only specific energies which are eigenvalues of the Hamiltonian. For atoms these are the energies corresponding to the various Bohr orbitals. Thus, in semiconductors, absorption may be nonvanishing above ħω = E_g only if the electrons can take energy values E > E_v + E_g ≡ E_c, i.e. if the energy values E > E_c are allowed. In turn, the vanishing of absorption for ħω < E_g can be explained if the energy values between E_v and E_c are forbidden. Such a distribution of allowed energy values is shown schematically in Figure 1.19. The representation of these values as a function
Figure 1.20: Fermi distribution function f(E) as a function of energy at different temperatures. The energy unit is E_g, and E_F has been set at E_g/2.
of electron coordinate is intended to illustrate the degree of localization of the electron. A straight line across the whole crystal from 0 to L means that the electron at this energy is spread out over the whole crystal. The regions with closely clustered allowed energy values and electron states extended over the whole crystal are called energy bands. The lower band in Figure 1.19 is the valence band, and the upper is the conduction band. Between these two bands lies a region of forbidden energy values, called the forbidden zone or the energy gap.

The existence of energy bands alone is not in itself sufficient, however, for the explanation of the observed absorption spectrum of semiconductors. For this it is also necessary that the valence band be occupied by electrons and the conduction band be empty. If there were no electrons available with energies in the valence band, then there would also be no electrons to absorb a photon to make the transition to the conduction band possible. If, on the other hand, all states of the conduction band were to be occupied, then no electron could be excited to this band, and no photon absorbed in such a transition since, according to the Pauli exclusion principle, each state of the conduction band can be occupied by only one electron.
Figure 1.21: Distribution of electrons of an ideal semiconductor over the valence and conduction bands. On the left for T = 0 K, on the right for T > 0 K.

The population of energy levels by electrons at equilibrium is determined by the Fermi distribution function f(E). This function represents the probability of finding an electron in an energy level E, and it is given by the expression
$$f(E) = \frac{1}{\exp\left( \dfrac{E - E_F}{kT} \right) + 1} . \tag{1.19}$$
Here E_F is the so-called Fermi energy. Like temperature, the Fermi energy is an intensive thermodynamic state variable, namely the free enthalpy per particle or the chemical potential. A more detailed treatment of the Fermi distribution function will be given in Chapter 4. The probability of occupation of a particular energy value E depends decisively on the relative position of E with respect to the Fermi energy. If E < E_F holds, and in addition |E - E_F| >> kT, then f(E) essentially has the value 1. If, on the contrary, E > E_F and again |E - E_F| >> kT, then f(E) is approximately zero. The shape of f(E) is shown schematically in Figure 1.20. The width of the energy region where the transition from 1 to 0 takes place is of the order of magnitude kT. The lower the temperature, the more abrupt this transition becomes. At T = 0 K the transition is steplike. Occasionally, one says that f(E) has the form of an ice block at T = 0 K, which melts at higher temperatures. In Figure 1.20 we have also implied that E_v lies below E_F and E_c above E_F, in order for the valence band to be almost completely occupied and the conduction band to be almost empty. This means that
Figure 1.22: Absorption of photons by electron transitions from valence to conduction band.
the Fermi energy must be located within the energy gap between E_v and E_c, sufficiently far away from the two band edges (measured in units of kT). With this, we have described an essential microscopic property of pure semiconductors: their Fermi level lies in the energy gap between the valence and conduction bands. The electrons are distributed over the two bands as shown in Figure 1.21. This knowledge provides a complete explanation of the absorption spectrum in Figure 1.18. The explanation is illustrated in Figure 1.22 and needs no further comment.

The differences in the properties of semiconductors and metals may be traced back, in essence, to the different positions of the Fermi levels in these two types of materials. In metals the Fermi level lies within the conduction band. Insulators do not differ from semiconductors qualitatively with respect to their Fermi level positions, i.e. the Fermi levels are found in the energy gap in both cases, but the gap of insulators is typically larger than that of semiconductors. If E_g > 3.5 eV, as a rule, one has an insulator.
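The behavior of the Fermi distribution described above (nearly 1 below E_F, nearly 0 above, a transition region of width of order kT, a step at T = 0 K) can be made concrete with a small numerical sketch. The choice of a gap of 1.1 eV with the Fermi energy at midgap is an illustrative assumption, and the function name is hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def fermi(e_ev, e_fermi_ev, t_kelvin):
    """Fermi distribution f(E) = 1 / (exp((E - E_F) / kT) + 1)."""
    if t_kelvin == 0.0:
        # Steplike limit at absolute zero; 0.5 exactly at E = E_F.
        if e_ev < e_fermi_ev:
            return 1.0
        return 0.5 if e_ev == e_fermi_ev else 0.0
    x = (e_ev - e_fermi_ev) / (K_B_EV * t_kelvin)
    if x > 0.0:
        # Rewritten form avoids overflow of exp(x) for large positive x.
        ex = math.exp(-x)
        return ex / (1.0 + ex)
    return 1.0 / (1.0 + math.exp(x))

E_V, E_C = 0.0, 1.1        # assumed band edges in eV (silicon-like gap)
E_F = 0.5 * (E_V + E_C)    # assumed midgap Fermi energy

for t in (0.0, 300.0, 600.0):
    print(f"T = {t:5.0f} K  f(E_v) = {fermi(E_V, E_F, t):.12f}  f(E_c) = {fermi(E_C, E_F, t):.3e}")
```

With these assumptions the occupation probability at the conduction band edge stays tiny at room temperature, which is the quantitative content of the statement that the conduction band is almost empty and the valence band almost full.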
\[ n = n_0 \, e^{-E_g/2kT} \tag{1.20} \]
with $n_0$ as a factor of dimension cm$^{-3}$ depending weakly on temperature. As we know, the electrons of intrinsic semiconductors are statistically distributed over the valence and conduction bands in accordance with the Fermi distribution function $f(E)$. At $T = 0$ K the valence band is fully occupied, and the conduction band is completely empty. The electrons in the fully occupied valence band cannot contribute to carrier transport. The reason for this is again the Pauli principle, according to which an electron state may be occupied by only one electron. Since all valence states are occupied, no state change is possible in the valence band by redistributing the electrons. This means that the $T = 0$ K state will remain unchanged in an electric field, and consequently no current will flow. However, a current will arise from the relatively few electrons which, according to the Fermi distribution function, are populating the conduction band at finite temperature. Their concentration can be easily calculated. For a pure semiconductor $E_c - E_F \gg kT$ holds. Thus, in the Fermi distribution function $f(E)$ of equation (1.19), one may neglect the 1 in the denominator for energies $E$ in the conduction band, i.e. energies with $E > E_c$. In so doing, one obtains, approximately, the Boltzmann distribution function
\[ f(E) \approx e^{-(E - E_F)/kT}. \tag{1.21} \]
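How good the Boltzmann approximation is can be estimated with a few lines of code. The sketch below assumes the standard Fermi function $1/(e^{x} + 1)$ and its Boltzmann limit $e^{-x}$, both written in terms of the dimensionless variable $x = (E - E_F)/kT$. The function names are illustrative.

```python
import math


def fermi(x):
    """Fermi function in terms of x = (E - E_F) / kT."""
    return 1.0 / (math.exp(x) + 1.0)


def boltzmann(x):
    """Boltzmann approximation, obtained by dropping the 1 in the denominator."""
    return math.exp(-x)


def relative_error(x):
    """Relative deviation of the Boltzmann form from the Fermi function."""
    return (boltzmann(x) - fermi(x)) / fermi(x)


for x in (1, 3, 5, 10, 20):
    print(f"(E - E_F)/kT = {x:2d}: relative error = {relative_error(x):.3e}")
```

Analytically the relative error equals $e^{-x}$, so it is already below one percent a few $kT$ above the Fermi energy and entirely negligible when $E_c - E_F \gg kT$.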
The concentration of energy levels in the conduction band having energies between $E$ and $E + dE$ is $\rho_c(E)\,dE$, where $\rho_c(E)$ is the so-called density of states of the conduction band, which describes the number of states per unit energy and unit volume. In Chapter 2 we will calculate this quantity explicitly; here its mere existence suffices. In terms of $\rho_c(E)$ one may write the electron concentration $n$ of the conduction band in the form
\[ n = \int_{E_c}^{\infty} dE \, \rho_c(E) \, f(E). \tag{1.22} \]
If one substitutes $f(E)$ from (1.21) into the integrand of (1.22), it follows that
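\[ n = N_c \, e^{-(E_c - E_F)/kT}, \tag{1.23} \]
where
\[ N_c = \int_{E_c}^{\infty} dE \, \rho_c(E) \, e^{-(E - E_c)/kT} \tag{1.24} \]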
may be understood as the effective density of states of the conduction band. To ensure the consistency of the expressions (1.20) and (1.23) for $n$, we identify $n_0 = N_c$ and $(E_c - E_F) = E_g/2$, or
\[ E_F = E_v + \frac{1}{2} E_g, \tag{1.25} \]
i.e. the Fermi level lies exactly in the middle between the valence and conduction bands. Strictly speaking, this is only true for $T = 0$ K, as we will see later. The corrections at higher temperatures are, however, quite small, and they can be ignored here. Another inaccuracy of the above consideration that we have ignored is that, at finite temperatures, the valence band population also changes, so that no longer are all states of this band occupied. Under these circumstances the electrons of the valence band also make a contribution to the current. The temperature dependence of this contribution is roughly the same as that of the conduction band. The electric charge transport due to the electrons of a not completely occupied valence band will be considered more fully later. It is connected with a remarkable observation of the Hall effect, which we will discuss below. However, we will first clarify how one may understand the strong change of conductivity with impurity concentration in extrinsic semiconductors. For this purpose we again use the energy band model of Figure 1.19 for an ideal semiconductor. If arsenic impurity atoms are present in a silicon crystal, then this model has to be altered. We will later prove (see Chapter 3) that each of the arsenic impurity atoms gives rise to an energy level in the forbidden gap, just below the conduction band edge, and that the electrons in these levels are localized at the site of the impurity atom. For this reason, we have marked these levels in Figure 1.23 by short line segments. At temperature $T = 0$ K each of these levels is occupied by one electron, namely the fifth valence electron of an arsenic atom replacing a silicon atom having only four valence electrons (see Figure 1.23). If the temperature is increased, these electrons are excited into the conduction band.
In this way, bound electrons which formerly could not participate in carrier transport become freely mobile electrons (see Figure 1.24) which can contribute to transport. If the concentration $N_D$ of arsenic atoms is not too large, and the temperature $T$ not too low, then practically all arsenic atoms are ionized and the concentration $n$ of free electrons is equal to that of the arsenic atoms, i.e. one has $n = N_D$. This corresponds to the approximate proportionality between $\sigma$ and $N_D$ observed in Figures 1.16 and 1.17. If the carrier concentration has the value $N_D$, which at sufficiently high temperatures $T$ is actually the case, then the conductivity becomes independent of $T$ as far as the carrier concentration is concerned. The only remaining source of a temperature dependence for $\sigma$ is the $T$-dependence of the mobility $\mu$. It is this relatively weak temperature effect which shows up
Figure 1.23: Ordering of the allowed energy values of electrons in a silicon crystal doped with arsenic (schematically). Each arsenic atom corresponds to one localized energy level.
Figure 1.24: Distribution of the electrons of a silicon crystal doped with arsenic over the allowed energy values. On the left for $T = 0$ K, on the right for $T > 0$ K.
[Figure 1.25, panels a) to d): boron-doped Si and phosphorus-doped Si samples, with the current, the Hall current and the field indicated.]
Figure 1.25: Illustration of the Hall effect in two silicon samples, one doped with phosphorus, and the other with boron.
1.4.5
The Hall effect and the existence of positively charged freely mobile carriers
We consider two semiconductor samples of the extrinsic type, both made of the same material, however, with different types of doping. To be specific, we assume two silicon samples, one doped with phosphorus as a group V element, and the other with boron as a group III element (see Figure 1.25). Applying a voltage to both samples, as shown in Figure 1.25a and b, a current $I$ will flow. For both samples it has the same direction, namely from '+' to '-'. Consider, now, the same two samples in a magnetic field $\mathbf{B}$ (see Figure 1.25c and d). As is well known, the Hall effect will be observed in such circumstances, i.e. a current component $I_H$ perpendicular to both the electric field $\mathbf{E}$ and the magnetic field $\mathbf{B}$ will arise. Under the conditions
of Figure 1.25 no current can flow in this direction, and therefore there will be a voltage $U_H$ such that the current caused by it will just compensate the Hall current. Experimentally one finds that the Hall voltage $U_H$ has different polarities in the two samples. This means that the corresponding Hall currents flow in opposite directions. If we assume that the magnetic field is directed normal to the plane of the figure and that it points into this plane, then the Hall current flows upwards in the boron-doped sample, and downwards in the phosphorus-doped one. The Hall effect can also be measured in metals. In this case the direction of the Hall current is the same as in the silicon sample doped with phosphorus. How can the different behavior of the silicon sample doped with boron be understood? The explanation is quite clear if one assumes that, in contrast to metals and the phosphorus-doped silicon sample, the current in the boron-doped sample is not carried by negative charge carriers but by positive ones. This is illustrated in Figure 1.25d. A charge $q$, which moves with velocity $\mathbf{v}$ in a magnetic field $\mathbf{B}$, experiences the Lorentz force
\[ \mathbf{F} = \frac{q}{c} \, [\mathbf{v} \times \mathbf{B}]. \tag{1.26} \]
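The sign argument based on the Lorentz force can be verified with a small vector calculation. The sketch below is illustrative only: it assumes a coordinate system with x pointing to the right, y pointing upwards and z pointing out of the plane of the figure, so that a field pointing into the plane is along $-z$. Charges, speeds and the field strength are set to unit magnitude, and $c = 1$ is used since only directions matter here.

```python
def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def lorentz_force(q, v, b_field, c=1.0):
    """Lorentz force F = (q / c) [v x B] (Gaussian units, as in (1.26))."""
    return tuple(q / c * component for component in cross(v, b_field))


B_INTO_PLANE = (0.0, 0.0, -1.0)  # assumed axes: z points out of the plane

# Positive carrier moving to the right, negative carrier moving to the left.
force_on_positive = lorentz_force(+1.0, (1.0, 0.0, 0.0), B_INTO_PLANE)
force_on_negative = lorentz_force(-1.0, (-1.0, 0.0, 0.0), B_INTO_PLANE)

print("force on positive carrier:", force_on_positive)
print("force on negative carrier:", force_on_negative)
```

In both cases the y-component of the force is positive, i.e. both kinds of carriers are deflected upwards. Since the deflected charge has opposite sign in the two cases, the resulting Hall currents and Hall voltages have opposite signs.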
Since negative charges move to the left, and positive ones to the right, the Lorentz force $\mathbf{F}$, which depends directly on the sign of the charge, has, for both charge signs, the same direction, namely upwards. For the boron-doped sample this means a Hall current directed upwards, but in the phosphorus-doped sample the Hall current is downwards, since in this case the charge carriers are negative. This is just the observed behavior, which means that the assumption of positive freely mobile charge carriers is successful in explaining the unusual sign of the Hall current in boron-doped silicon. In this way the experimental observations of the Hall effect reveal a remarkable general property of semiconductors: For doping with certain atoms, the current is not carried by negative charges, as one would expect considering the negative charge of electrons, but by positive ones. In other words, in addition to the electrons as negative freely mobile charge carriers, one also has positive ones in semiconductors under certain conditions. This surprising observation may be understood as follows. One may demonstrate, as we will do later explicitly, that boron atoms which substitute silicon atoms in a silicon crystal give rise to energy levels in the forbidden zone just above the valence band edge. This is illustrated in Figure 1.26. Electrons in such energy levels are localized at the sites of the boron atoms. At very low temperatures these states are not occupied, and the boron atoms are electrically neutral. At finite temperatures electrons from the valence band are excited into the boron levels (Figure 1.27), leaving behind unoccupied states, or occupation holes, in the valence band. We
Figure 1.26: Ordering of the allowed energy values of electrons in a silicon crystal doped with boron (schematically). Each boron atom corresponds to one localized energy level.
immediately recognize that these occupation holes of the valence band behave just like freely mobile positive charge carriers in an applied electric field. This is illustrated in Figure 1.28. For simplicity only one hole is assumed to be in an otherwise completely occupied valence band, placed at the upper band edge. The population of the band edge is shown in Figure 1.28 at different points of time. At the beginning, the hole is at the outermost left position. The adjacent electrons experience a force which tends to move them to the left. But only the electron neighboring the hole on its right-hand side can follow this force, since all other electrons are blocked because the states to their left are already occupied. The electron immediately on the right-hand side of the hole moves into the hole and leaves behind another hole to the right of the first one. The hole has thus moved one site further to the right. In the next time interval this process is repeated, and the hole again moves one site further to the right, etc. Thus, in an electric field, the hole moves like a freely mobile electron, but in the opposite direction, as if carrying a positive charge. In summary, holes in the valence band behave like freely mobile positive charge carriers. This qualitative introduction of the concept of holes will later be elaborated by more quantitative considerations. As we have seen above, the sign of the Hall voltage tells one whether the free carriers of an extrinsic semiconductor are negatively or positively charged, i.e. whether they are electrons or holes. In the first case one speaks of an n-type semiconductor (the n stands for negative), in the second of a p-type semiconductor (p for positive). The intrinsic semiconductors introduced earlier have neither electrons nor holes from impurity atoms. Their mobile electrons are generated by thermal excitation of bound electrons from the valence band to the conduction band. Since each excited electron leaves behind a hole in the valence band, one also has mobile holes in intrinsic semiconductors. Their number is equal to that of the mobile electrons. Holes are also present in n-type semiconductors, but in very small numbers compared to electrons. Analogously, a few electrons occur in p-type materials, with most of the carriers being holes. One calls the many mobile carriers of extrinsic semiconductors majority carriers and the few mobile carriers minority carriers.
Figure 1.28: Empty states (holes) in the valence band behave like freely mobile positive charge carriers in an electric field.
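The hopping picture of hole motion described above lends itself to a toy simulation. The following sketch is a deliberately crude model, not a physical calculation: the band edge is represented by a row of states (a list of booleans), the field pushes every electron one site to the left per time interval, and an electron can move only if the state to its left is empty. The function names and the row length are illustrative assumptions.

```python
def step(states):
    """Advance the row of states by one time interval.

    states: list of bools, True = occupied by an electron, False = hole.
    An electron moves one site to the left only if that site is empty.
    """
    new_states = list(states)
    i = 0
    while i < len(states) - 1:
        if not states[i] and states[i + 1]:
            # The electron to the right of the hole moves into the hole.
            new_states[i], new_states[i + 1] = True, False
            i += 2  # this electron has already moved in this interval
        else:
            i += 1
    return new_states


def hole_positions(n_states, n_steps):
    """Track a single hole that starts at the outermost left position."""
    states = [False] + [True] * (n_states - 1)
    positions = [states.index(False)]
    for _ in range(n_steps):
        states = step(states)
        positions.append(states.index(False))
    return positions


print(hole_positions(5, 4))
```

Although every electron is pushed to the left, the position of the empty state advances one site to the right per time interval, which is the behavior of a positive charge in the same field.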
The Hall effect can also be used for purposes other than the determination of whether an extrinsic semiconductor is n- or p-type. The absolute value of the Hall voltage determines the concentration of majority carriers. We will verify this assertion for an n-type semiconductor. In this case the current density $\mathbf{j}$ can be expressed in the form $\mathbf{j} = -en\mathbf{v}$, which permits us to rewrite equation (1.26) as
\[ \mathbf{F} = \frac{1}{nc} \, [\mathbf{j} \times \mathbf{B}]. \tag{1.27} \]
This is the same force as that caused by an electric field $\mathbf{E}_H = \mathbf{F}/(-e)$. The voltage corresponding to $\mathbf{E}_H$ is by definition the Hall voltage $U_H$. With $b$ as the width of the sample normal to $\mathbf{j}$ and $\mathbf{B}$, the expression below for $U_H$ follows:
\[ U_H = R_H \, j B b, \tag{1.28} \]
where
\[ R_H = -\frac{1}{enc} \tag{1.29} \]
is the so-called Hall constant. We can write an analogous expression for holes, only the electron concentration has to be replaced by the hole concentration $p$. Thus, by measuring the Hall voltage, one can also determine the majority carrier concentration.
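As an illustration of how a measured Hall voltage translates into a carrier concentration, consider the following sketch. It is written in SI units, where the Hall constant for electrons reads $R_H = -1/(en)$ (the factor $c$ of the Gaussian form drops out). The function names, the sign convention (negative Hall voltage for electrons) and all numerical values are assumptions chosen for illustration, not data from the text.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C


def hall_voltage(n, j, b_field, width):
    """Hall voltage U_H = R_H * j * B * b for electrons, with R_H = -1/(e n).

    n in m^-3, j in A/m^2, b_field in T, width in m (SI units).
    """
    r_hall = -1.0 / (E_CHARGE * n)
    return r_hall * j * b_field * width


def majority_carriers(u_hall, j, b_field, width):
    """Infer carrier type and concentration from a measured Hall voltage."""
    concentration = j * b_field * width / (E_CHARGE * abs(u_hall))
    carrier_type = "n" if u_hall < 0 else "p"
    return carrier_type, concentration


# Hypothetical sample: n = 1e22 m^-3, j = 1e4 A/m^2, B = 0.5 T, b = 1 mm.
u_h = hall_voltage(1e22, 1e4, 0.5, 1e-3)
print("Hall voltage in V:", u_h)
print(majority_carriers(u_h, 1e4, 0.5, 1e-3))
```

The round trip recovers the concentration that was put in, and the sign of the voltage identifies the carrier type under the assumed orientation of current and field.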
1.4.6
All properties considered hitherto were for semiconductors in a state of thermodynamic equilibrium or in close proximity to it. However, semiconductors may easily be driven into states far from equilibrium. Here, 'far' means that characteristic macroscopic properties of the semiconductor deviate strongly from those in equilibrium. To be more specific about these qualitative statements, we examine the example of photoconduction. Consider a sample of an intrinsic semiconductor, shielded against unwanted influence of light. If, in this 'dark state', the conductivity is measured, one obtains a relatively small value, in accordance with the relatively low carrier concentration of an intrinsic semiconductor in thermodynamic equilibrium. However, if one irradiates the sample with light which is absorbed by the semiconductor (see Figure 1.22), the conductivity will rise more or less strongly, depending on the intensity of the light and the magnitude of the absorption coefficient. This is shown schematically in Figure 1.29, assuming realistic conditions. When the exposure of the sample to light ceases, its conductivity decreases to the original low dark value. Evidently, electrons in the conduction band and holes in the valence band were created by irradiation with light to such an extent that their equilibrium values were exceeded by orders of magnitude. A simple estimate for Figure 1.29
Figure 1.29: Conductivity of an intrinsic semiconductor irradiated with light as a function of intensity $I$ of the absorbed radiation (schematically).
shows that the carrier concentration was increased from about 10'' cm$^{-3}$ (equilibrium value) up to [...] cm$^{-3}$ by light of intensity 10 W cm$^{-2}$. The electron-hole pairs created by the radiation decay, however, after a short time. This follows from the fact that continuous irradiation leads to a stationary conductivity value, and hence a constant carrier concentration, and, on the other hand, from the observation that the conductivity decays down to the dark value after switching off the light source. The latter observation means that thermodynamic equilibrium is reestablished by so-called recombination processes.
Besides irradiation with light, extreme nonequilibrium states in semiconductors can also be created in other ways, for example, by putting an n-type semiconductor in contact with a p-type semiconductor or with a metal, or by applying voltage to a semiconductor which previously had been isolated by a thin insulating layer from one of the electrodes. The ability to create extreme nonequilibrium states in semiconductors is extensively used in electronic devices. Almost all applications of semiconductors in such devices rest on this unique possibility. Nonequilibrium processes in semiconductors and the most important semiconductor devices based on them, such as electric rectifiers, bipolar and unipolar transistors, tunnel diodes, photodetectors, solar cells, as well as luminescence and laser diodes, will be dealt with in the second part of this book, i.e. in Chapters 5, 6 and 7. In the first part of the book, the basic concepts, discussed above in a heuristic way, will be developed from first principles. This applies to the stationary electron states of an ideal semiconductor (Chapter 2), their modifications by impurity atoms and other deviations from the ideal crystal, as well as by external fields (Chapter 3), and the statistical distribution of charge carriers over energy levels in thermodynamic equilibrium (Chapter 4).
Chapter 2
2.1
Qualitatively, the kind of changes contemplated may be characterized as follows: There is no doubt that the electrons of the outer shells, i.e. the valence electrons, will react most strongly in assembling the isolated atoms of a crystal, for they are the primary agents which bind the atoms into the crystal state (Table 2.1). Whether, and to what extent, the electrons of the inner shells also change their states is harder to predict. One may suppose that such inner shell changes will be comparatively slight, at least for those inner shells which lie much lower energetically than the valence shells (Table 2.2). Because the electrons of these deep shells are localized so close to their respective nuclei, they feel potential changes produced by surrounding atoms as being almost uniform. Strictly speaking, this means that the wavefunctions of these electrons are essentially unaltered. Their energy levels shift, however, specifically by the change of the constant potential value across their localization region. The term solid state shifts is used for these shifts of the inner electron levels. By measuring these shifts one can obtain information about the chemical nature and geometric structure of the environment of an atom in a solid. In a way, the inner electrons thereby serve as probes.

Table 2.1: Characterization of atomic cores and valence electrons of main group elements from which semiconducting materials may be formed.

Atom   Nucleus   Core electrons                            Valence electrons
B      5+        1s^2                                      2s^2 2p
C      6+        1s^2                                      2s^2 2p^2
N      7+        1s^2                                      2s^2 2p^3
Al     13+       1s^2 2s^2 2p^6                            3s^2 3p
Si     14+       1s^2 2s^2 2p^6                            3s^2 3p^2
P      15+       1s^2 2s^2 2p^6                            3s^2 3p^3
Ga     31+       1s^2 2s^2 2p^6 3s^2 3p^6 3d^10            4s^2 4p
Ge     32+       1s^2 2s^2 2p^6 3s^2 3p^6 3d^10            4s^2 4p^2
As     33+       1s^2 2s^2 2p^6 3s^2 3p^6 3d^10            4s^2 4p^3

Here, we are interested in the electronic structure of crystals, and in this regard the pertinent feature is that the wavefunctions of the inner shell electrons of the atoms in the crystal undergo only weak changes. This statement is even better justified for the wavefunctions of the nucleons in the atomic nuclei, which remain practically unaffected. Furthermore, if the grown crystal is exposed to certain external perturbations (heat, pressure, electromagnetic fields), the states of the electrons of the inner shells and those of the nucleons frequently do not change. For the external perturbations which are of prime interest in semiconductor physics, this is even generally true. Therefore, in determining the electronic structure of semiconductor crystals and the influence of external perturbations on them, the states of the inner electron shells and those of the nuclei, as a rule, can be assumed to be the same as those of the free atoms. This allows one to consider the atomic nuclei and
Table 2.2: Energy levels $\epsilon_s$ and $\epsilon_p$ of the valence electrons and $-\epsilon_c$ of the shallowest core electrons for some chemical elements which may occur in semiconducting materials. All energies are given in eV. (After Herman and Skillman, 1963.)
Atom
tp
3
6.6
12.5
195
0
14.1
A1
4.9
Si
6.5
P
8.3
17.1 147.4
S 1. 03
20.8
Zn 3.1
Ga
4.9 11.4 31.7
9.00 11.5
17.5
, t
23.0 29.1
405 537
10.1
87.5
13.6
8.4
20.7
ec
291
116
182
cP
6.4
7.9
95 .
3.4
4.7
5.9
7.2
8.6
3.5
5.8
the inner shell electrons jointly as subsystems of the crystal, whose internal structure is of no further interest since it does not change. The structure is, so to speak, frozen. In this sense the subsystems composed of atomic nuclei and inner electrons are elementary building blocks of the crystal. One refers to them as atomic cores.
Since the crystal also contains valence electrons as independent particles, we arrive at a picture which is fundamental for the further analysis: the picture of a crystal as a system composed of atomic cores and valence electrons. Sometimes one refers to this concept as the frozen-core approximation. In Table 2.1 the division into cores and valence electrons is indicated for some elements from which semiconductor crystals are made. The frozen nature of the electron states of the cores of a crystal, and their lack of response to external influences, generally prevails, but as always, there are exceptions. In heavy metals such as zinc, the inner d-shells are, energetically, rather close to the outer valence shells (see Tables 2.1 and 2.2). In this case the d-electrons significantly participate in chemical bonding and can no longer be included in the core, which is unchangeable by definition. Moreover, the inner shell electrons of crystals can be excited by means of electromagnetic radiation in the far UV and X-ray region. This can also occur by means of an electron beam. In solid state nuclear reactions, even the states of the nuclei of the crystal atoms change.
2.2
2.2.1
In dealing with the atomic structure of crystals in Chapter 1, we found that their atoms are not located at arbitrary positions but at well defined locations, specifically at points which are consistent with the existence of a lattice and a unit cell. In this, the atoms were assumed to be pointlike and the space between them was empty. The crystal itself was imagined to extend to infinity. We now consider a more realistic model of a crystal. First, we replace the pointlike picture of atoms by introducing spatially extended atomic cores and take their centers of gravity as the atom sites. Second, we recognize that the centers of gravity can also move to other positions than those prescribed for the ideal crystal. In this way, we also account for the fact that the atomic cores in crystals can execute oscillations around their equilibrium positions, and that only these equilibrium positions form an ideal crystal. Third, the space between the massive elements of the crystal, i.e. the cores, is now no longer assumed to be empty, as was done before, but we acknowledge that valence electrons are present there. The assumption of infinite extension of the crystal which, of course, is also not exact, will be addressed at a later stage. This assumption excludes effects due to the existence of bounding surfaces. As far as the electrons are concerned, these effects are treated in section 3.6. The atomic cores will be marked by an integer subindex $j$, and the valence electrons by an integer subindex $i$. Both should start with 1 and run upwards, reaching arbitrarily large values since we are considering an infinite crystal. By means of a conceptual device which we are about to introduce, and notwithstanding the infinite extent of the crystal, only finite sets of $J$ cores and $N$ valence electrons need be considered.
To understand this, we imagine the infinite crystal to be divided into parallelepipeds of macroscopic size in such a way that their edges are parallel to the primitive lattice vectors $a_1, a_2, a_3$ of the crystal. These edges are to be given by the vectors $G a_1, G a_2, G a_3$ with $G$ a large integer. Each of these parallelepipeds should contain an equal number $J$ of cores and $N$ of electrons. One calls these parallelepipeds periodicity regions. Of all possible motions of the particles of the infinite crystal, we now select those particular ones for which the cores and electrons in different periodicity regions have the same positions relative to the origin of their own region, and also have the same speeds. In this way the infinite crystal becomes a periodic continuation of one particular periodicity region, and it suffices to describe the motion of the $J$ cores and $N$ electrons of this particular region. If the periodicity regions are made sufficiently large, they will encompass all types of motions of an infinite crystal with desired accuracy. The concept of the periodicity region makes it possible to pass from the original infinite space problem of motion to a finite one without thereby losing the translational symmetry of the infinite crystal.

Figure 2.1: Description of the positions of the atomic cores and valence electrons (left part) as well as the interactions between these particles (right part).

We use $X_j$ to denote the center-of-mass coordinates of the $j$th atomic core, and $x_i$ for the position of the $i$th electron, which is further assumed to be pointlike (see Figure 2.1). The $j$th core mass will be denoted by $M_j$. Of course, there are only as many different values of $M_j$ as there are chemically different types of atoms in the crystal, so most of the $M_j$ values are identical. In the case of electrons we can omit the index $i$ from their masses since they have the common mass $m$. The momentum of the $j$th core is called $P_j$, and that of the $i$th electron $p_i$, such that
We are interested in the motion of the interacting atomic cores and valence electrons of the infinite crystal, which can only be adequately treated by means of quantum mechanics. The state of the system is described by a wavefunction $\Psi$, which depends on the coordinates $x_i$ of all electrons and $X_j$ of all atomic cores, as well as on the time $t$. Since we assume periodicity of the motion with respect to a periodicity region, it suffices to consider $\Psi$ as a function of the coordinates $x_i$ of the $N$ electrons and the coordinates $X_j$ of the $J$ cores of only one periodicity region. The state of the particles in the remaining periodicity regions can then be described by means of a periodic continuation of this function, i.e. by means of the relation
\[ \Psi(x_1 + G a_\alpha, x_2 + G a_\alpha, \ldots, X_J + G a_\alpha) = \Psi(x_1, x_2, \ldots, X_J), \tag{2.2} \]
with $\alpha = 1, 2, 3$. We will now set up the Hamiltonian $H$ of the system of the $N$ electrons and $J$ atomic cores of a periodicity region. We use $T_c$ and $T_e$ to denote the kinetic energies of the atomic cores and of the electrons, respectively, and we define $V(x_1, x_2, \ldots, x_N, X_1, X_2, \ldots, X_J)$ to be the potential energy of the system. The Hamiltonian is the sum of the kinetic and potential energy operators,
\[ H = T_c + T_e + V. \tag{2.3} \]
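The kinetic energies $T_c$ and $T_e$ can be expressed in terms of the momenta $P_j$ and $p_i$ of the cores and electrons as follows:
\[ T_c = \sum_{j=1}^{J} \frac{P_j^2}{2 M_j}, \qquad T_e = \sum_{i=1}^{N} \frac{p_i^2}{2m}. \tag{2.4} \]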
The potential energy is due to three interactions (see Figure 2.1): (1) The repulsive Coulomb interaction of the electrons with each other. The corresponding potential energy is denoted by $V_{ee}$. It depends only on the coordinates of the electrons, as given by
\[ V_{ee}(x_1, x_2, \ldots, x_N) = \frac{1}{2} \sum_{i \neq i'} \frac{e^2}{|x_i - x_{i'}|}. \tag{2.5} \]
(2) The interaction of the electrons with the atomic cores due to their mutually attractive Coulomb forces, and also due to (repulsive) forces of quantum mechanical origin, which become effective if the valence electron wavefunctions overlap the inner electron shells of the atomic cores. The electron-core interaction potential energy $V_{ec}$ depends on the locations of both the electrons and cores. With respect to the electrons, it is evidently additive, i.e.
\[ V_{ec}(x_1, \ldots, x_N, X_1, \ldots, X_J) = \sum_{i=1}^{N} V_c(x_i, X_1, X_2, \ldots, X_J), \tag{2.6} \]
where $V_c(x_i, X_1, X_2, \ldots, X_J)$ is the potential energy of the $i$th electron in the field of all cores. (3) The mutual interaction of cores, which at sufficiently large distances is again of Coulomb type. If the distances become small, repulsive forces of quantum mechanical origin also occur. The core-core interaction potential energy will be denoted by $V_{cc}$. It depends only on the locations of the atomic cores, i.e.
\[ V_{cc} = V_{cc}(X_1, X_2, \ldots, X_J). \tag{2.7} \]
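The first of the three contributions, the electron-electron repulsion (1), is simple enough to be written down as code. The sketch below assumes the standard pairwise Coulomb form in Gaussian units, $e^2/|x_i - x_{i'}|$ per pair of electrons, and treats the electrons as classical point charges at given positions. The function name and the use of dimensionless units with $e = 1$ are illustrative assumptions.

```python
import math
from itertools import combinations


def electron_repulsion_energy(positions, e=1.0):
    """Coulomb repulsion energy of point electrons (Gaussian units).

    positions: sequence of 3-component coordinate tuples.
    Summing once over every unordered pair is the same as taking one half
    of the double sum over all i != i'.
    """
    total = 0.0
    for r_a, r_b in combinations(positions, 2):
        total += e * e / math.dist(r_a, r_b)
    return total


# Two electrons at distance 2, and three electrons on a unit triangle.
pair = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, math.sqrt(3) / 2, 0.0)]
print(electron_repulsion_energy(pair))
print(electron_repulsion_energy(triangle))
```

The number of pairs grows quadratically with the number of electrons, which already hints at why the full many-particle problem cannot be attacked directly for a macroscopic crystal.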
Summing the three potential parts (2.5), (2.6) and (2.7), one gets the total potential
\[ V = V_{ee} + V_{ec} + V_{cc}, \tag{2.8} \]
which determines the dynamical problem of the crystal uniquely. This problem is described by the time-dependent Schrödinger equation
\[ i\hbar \frac{\partial}{\partial t} \Psi = H \Psi, \tag{2.9} \]
whose solution may be determined in terms of the eigenvalue problem for the Hamiltonian $H$, which is given by the time-independent Schrödinger equation
\[ H \Psi = E \Psi. \tag{2.10} \]
The normalization condition for the wavefunction with reference to a periodicity region is
\[ \langle \Psi | \Psi \rangle \equiv \int d^3x_1 \cdots d^3x_N \, d^3X_1 \cdots d^3X_J \, |\Psi(x_1, x_2, \ldots, x_N, X_1, X_2, \ldots, X_J)|^2 = 1. \tag{2.11} \]
Attempts to solve this eigenvalue problem exactly are hopeless from the very beginning, because it involves a macroscopic system, i.e. a system with about $10^{23}$ electrons and a similar number of atomic cores, the motions of which are mutually coupled in a rather complex way. One must therefore resort to approximations. Such approximations must first provide the means to reduce the gigantic number of electrons, and secondly, allow for a proper decoupling of the electron and core motions. The second simplification is achieved by the so-called adiabatic approximation, and the first by the one-particle approximation. These two approximations will be elaborated below. We begin with the adiabatic approximation, and in the course of the discussion it will also become clear how this somewhat unexpected designation arises.
2.2.2
The adiabatic approximation (also known as the Born-Oppenheimer approximation) is based on the fact that the mass of the atomic cores is many tens of thousands of times larger than that of the electrons (in Si, e.g., 52 thousand times, and in mercury 368 thousand times). In addition, it takes advantage of the fact that in a crystal the kinetic energy of an atomic core is, on average,
smaller than that of a valence electron. This can be seen in the following way: If the cores and valence electrons were free particles, i.e. if they did not interact, then the average kinetic energy of a core would be approximately $(3/2)\,kT$. That of a valence electron would be about $(3/5)\,E_F$, where $E_F$ is defined as the Fermi energy of an electron gas of the same density. The difference between the average kinetic energies of the two types of particles arises from the fact that the electrons obey Fermi statistics, whereas atomic cores obey Boltzmann statistics. For typical concentrations of valence electrons in a crystal of about $10^{23}$ cm$^{-3}$ the Fermi energy $E_F$ is of the order of magnitude of electron volts, while $kT$ reaches only about 0.1 eV below the melting point of the crystal. The average kinetic energy of a core is therefore generally smaller than that of an electron. This remains true when the interactions between the electrons and cores, which were omitted above, are taken into account. Writing $(M_j/2)\langle \dot{X}_j^2 \rangle$ and $(m/2)\langle \dot{x}_i^2 \rangle$ as the average kinetic energies of a core and an electron in a crystal, we thus have
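\[ \frac{M_j}{2} \langle \dot{X}_j^2 \rangle < \frac{m}{2} \langle \dot{x}_i^2 \rangle, \tag{2.12} \]
and it follows that
\[ \langle \dot{X}_j^2 \rangle < \frac{m}{M_j} \langle \dot{x}_i^2 \rangle. \tag{2.13} \]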
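The orders of magnitude entering this argument can be reproduced with a short calculation. The sketch below assumes the standard free-electron-gas expression $E_F = \hbar^2 (3\pi^2 n)^{2/3} / 2m$, which is not quoted in this section, and uses illustrative inputs: an electron concentration of $10^{23}$ cm$^{-3}$, a temperature of 1000 K as a stand-in for "below the melting point", and the atomic mass of silicon in place of the core mass. All function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant in J s
M_E = 9.1093837015e-31       # electron mass in kg
EV = 1.602176634e-19         # one electron volt in J
K_B = 1.380649e-23           # Boltzmann constant in J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # in kg


def fermi_energy_ev(n_per_cm3):
    """Fermi energy of a free electron gas of given density (assumed formula)."""
    n = n_per_cm3 * 1e6  # convert cm^-3 to m^-3
    return HBAR**2 * (3 * math.pi**2 * n) ** (2 / 3) / (2 * M_E) / EV


def average_electron_ke_ev(n_per_cm3):
    """Average kinetic energy (3/5) E_F of a free electron."""
    return 0.6 * fermi_energy_ev(n_per_cm3)


def average_core_ke_ev(temperature_k):
    """Average kinetic energy (3/2) kT of a free core."""
    return 1.5 * K_B * temperature_k / EV


def max_speed_ratio(core_mass_u):
    """Upper bound sqrt(m / M) on the rms speed ratio implied by (2.13)."""
    return math.sqrt(M_E / (core_mass_u * ATOMIC_MASS_UNIT))


print("E_F in eV:", fermi_energy_ev(1e23))
print("electron KE in eV:", average_electron_ke_ev(1e23))
print("core KE in eV:", average_core_ke_ev(1000.0))
print("speed ratio bound for Si:", max_speed_ratio(28.0855))
```

Under these assumptions the average electron energy is several electron volts, the average core energy is about a tenth of an electron volt, and the cores are slower than the electrons by more than two orders of magnitude.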
Correspondingly, one may say that, on statistical average, the cores move much slower than the electrons. This observation plays an important role in the following considerations. To simplify the notation, we replace the $N$-component sequence of the vectors $(x_1, x_2, \ldots, x_N)$ by $x$, and the $J$-component sequence of vectors $(X_1, X_2, \ldots, X_J)$ by $X$, i.e. we write
\[ x = (x_1, x_2, \ldots, x_N), \qquad X = (X_1, X_2, \ldots, X_J). \tag{2.14} \]
To take advantage of the slow motion of the cores we write the solution $\Psi(x, X)$ of the Schrödinger equation (2.10) for the total crystal in the form of a product
\[ \Psi(x, X) = \psi(x, X) \cdot \phi(X). \tag{2.15} \]
The necessary normalization (2.11) of the total wavefunction $\Psi(x, X)$ with respect to a periodicity region is assured if each of the two factors of (2.15) is normalized with respect to this region, i.e. if
\[ \int d^3x_1 \cdots d^3x_N \, |\psi(x, X)|^2 = 1 \tag{2.16} \]
and
\[ \int d^3X_1 \cdots d^3X_J \, |\phi(X)|^2 = 1 \tag{2.17} \]
are assumed. An analogous statement holds for the periodicity condition (2.2) of $\Psi(x, X)$. To assure overall periodicity with respect to a periodicity region we assume it for $\psi(x, X)$ and $\phi(X)$ separately. An Ansatz of the form (2.15) is always possible since it does not assert separability of the variables $x$ and $X$, but merely splits off from the wavefunction $\Psi(x, X)$ a factor $\phi(X)$ which depends only on one variable, $X$, while retaining the full dependence on both coordinates in the second factor $\psi(x, X)$. The Ansatz (2.15) becomes nontrivial if we proceed as follows: Firstly, we assume that $\psi(x, X)$ is the solution of a Schrödinger equation for electrons,
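$$\left[ T_e + V_{ee,ec}(x, X) \right] \psi(x, X) = U(X)\, \psi(x, X), \tag{2.18}$$
where
$$V_{ee,ec}(x, X) = V_{ee}(x) + V_{ec}(x, X) \tag{2.19}$$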
and $U(X)$ is in the nature of an electron energy eigenvalue. Secondly, we demand that the split-off factor, $\phi(X)$, of the total wavefunction (2.15) satisfy a Schrödinger equation in which the coordinates of electrons do not appear. It turns out that such an equation cannot be derived rigorously, but only in a special approximation, the adiabatic approximation which was mentioned above. Yet without any approximation we have
$$\left[ T_e + T_c + V_{ee,ec} + V_{cc} \right] \psi(x, X)\,\phi(X) = E\, \psi(x, X)\,\phi(X). \tag{2.20}$$
The set $\{\psi\phi\}$ of the eigenfunctions of the Schrödinger equation (2.20) forms an orthonormalized basis set in the Hilbert space of the crystal. Therefore, relation (2.20) is satisfied if it holds for the Fourier-type coefficients relative to all basis functions $\psi'\phi'$ of this set, i.e., if the identity
$$\langle \psi'\phi' | T_e + T_c + V_{ee,ec} + V_{cc} | \psi\phi \rangle = E\, \langle \psi'\phi' | \psi\phi \rangle \tag{2.21}$$
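is valid. The necessary simplification concerns the matrix element $\langle \psi'\phi' | T_c | \psi\phi \rangle$ of the kinetic energy of cores in this equation. Using relation (2.4) between $T_c$ and the squares $P_j^2$ of core momenta, and applying the product rule for differentiation we get, first of all,
$$\langle \psi'\phi' | T_c | \psi\phi \rangle = \sum_j \frac{1}{2M_j} \left[ \langle \psi'\phi' | \phi\, P_j^2 \psi \rangle + 2\, \langle \psi'\phi' | (P_j \psi) \cdot (P_j \phi) \rangle + \langle \psi'\phi' | \psi\, P_j^2 \phi \rangle \right]. \tag{2.22}$$
The first two terms on the right-hand side of this equation turn out to be small compared to the kinetic energy term of electrons in equation (2.20). One has the order of magnitude relations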
(2.23)
(2.24)
Here $M$ is a typical value of the core masses $M_j$. The two equations (2.23) and (2.24) are of fundamental importance in crystal dynamics, because they are ultimately responsible for the decoupling of electron dynamics from the dynamics of the cores. Therefore we present the proof of these equations in Appendix B. Here we proceed on the assumption that these relations are proven. The terms of relative orders of magnitude $(m/M)^{1/2} \approx 10^{-2}$ and $(m/M) \approx 10^{-4}$ will be neglected henceforth. With this the operator $T_c$ for the combined kinetic energies of all cores satisfies the approximate relation
$$T_c \left[ \psi(x, X)\,\phi(X) \right] \approx \psi(x, X)\, T_c\, \phi(X). \tag{2.25}$$
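The size of the two small parameters can be checked for concrete core masses. The sketch below is only illustrative: the atomic masses are approximate standard atomic weights supplied here as assumptions (the ionic core is slightly lighter than the neutral atom, which is ignored), and the function name is hypothetical.

```python
import math

ELECTRON_MASS_U = 5.48579909e-4   # electron mass in atomic mass units (u)


def adiabatic_parameters(core_mass_u: float):
    """Return (m/M, (m/M)**0.5) for a core of mass core_mass_u given in u."""
    ratio = ELECTRON_MASS_U / core_mass_u
    return ratio, math.sqrt(ratio)


if __name__ == "__main__":
    # Illustrative core masses in u (assumed values, not taken from the text).
    for name, mass_u in (("H", 1.008), ("Si", 28.086), ("Ge", 72.63)):
        ratio, root = adiabatic_parameters(mass_u)
        print(f"{name}: m/M = {ratio:.1e}, (m/M)^(1/2) = {root:.1e}")
```

For the semiconductor elements in this list both parameters fall somewhat below the round figures quoted above, so neglecting the corresponding terms is, if anything, better justified there.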
This means that $T_c$ effectively does not act on $\psi(x, X)$. In view of this relation we reconsider the Schrödinger equation (2.20) for the crystal, replacing the terms which still depend on $x$ by means of the electron Schrödinger equation (2.18) with terms involving the electron energy eigenvalue $U(X)$. Finally, forming the scalar product with $\psi$, we obtain the relation
$$\left[ T_c + V_{cc}(X) + U(X) \right] \phi(X) = E\, \phi(X). \tag{2.26}$$
This represents the Schrödinger equation for the atomic cores in which the coordinates $x$ of the electrons do not appear. The state of the electron system, however, enters this equation, namely via its energy $U(X)$ which plays the role of a potential (referred to as adiabatic potential).
In summary, we have reached the following description of the total crystal, viewed as an interacting system of atomic cores and electrons: The subsystem of electrons is described by a separate wavefunction $\psi(x, X)$, which obeys a Schrödinger equation in which the coordinates of the cores enter only as parameters in the potential, but do not occur as differential operators in the kinetic energy. In this way, the motion of the electrons is treated as if the cores were at rest. Core motion, which does, of course, occur despite its neglect with respect to electron motion, is described by the wavefunction $\phi(X)$ and the Schrödinger equation (2.26). The potential of this equation contains, besides the core-core interaction energy, a second contribution $U(X)$. This originates in the interaction of the electrons with the cores, for without such an interaction the eigenvalue $U$ in the electron Schrödinger equation (2.18) would be a constant independent of $X$, which could be omitted. The potential contribution $U(X)$ caused by the electron-core interaction does not depend, however, on the electron coordinates. It is an average value over all their positions $x$. The weight with which the various positions $x$ enter this average is the probability $|\psi(x, X)|^2\, d^{3N}x$ of finding the electron system in a volume element $d^{3N}x$ at the position $x$, since equation (2.18) implies that
$$U(X) = \langle \psi(X) | T_e + V_{ee,ec}(X) | \psi(X) \rangle. \tag{2.27}$$
One can alternatively express this as follows: The electrons move so fast that they are no longer seen by the cores as pointlike particles, but as smeared out over all space. Equation (2.26) for $\phi(X)$ thus contains the same assumption as equation (2.18) for $\psi(x, X)$, namely that the cores move much slower than the electrons. In so far as this feature is seen from the point of view of the electrons, equation (2.18) follows, whereas from the point of view of the cores one obtains equation (2.26). Since the relation between the velocities of the cores and the electrons, according to (2.13), is determined by the inverse ratio of their masses, it is clear why this ratio must be small for the two Schrödinger equations (2.18) and (2.26) to hold jointly in an approximate sense.
It remains yet to clarify what effects are neglected because of the above approximation and why this approximation is called adiabatic. In quantum mechanics, one understands adiabatic temporal changes of potentials in the sense that the changes proceed so slowly that no quantum mechanical transitions will occur between the discrete quantum states of the potential, which themselves evolve slowly from the initial onset of time variation. The state of the system thus conforms continuously to the evolving new potential values as a function of time, without any transitions to other states. That exactly this situation is described by equations (2.18) and (2.26) may be seen from (2.24). If one considers the previously neglected term in (2.24) of relative order of magnitude $(m/M)^{1/2}$, then the total Hamiltonian $\mathcal{H}$ of the crystal has nonvanishing off-diagonal elements, given by (2.28), and quantum mechanical transitions between the different eigenstates $\psi\phi$ and $\psi'\phi'$ of the crystal are recognized to occur. These transitions are caused by the kinetic energy of the cores exclusively. If terms of the order of magnitude $(m/M)^{1/2}$ are omitted, then the quantum transitions due to core motion are also neglected.
This is equivalent to the assumption that the core motion be adiabatically slow, in the quantum mechanical understanding of this term. The term adiabatic thus refers to the essential character of the approximation in neglecting $(m/M)^{1/2}$. This approximation is useful, of course, only as long as transitions between different eigenstates $\psi\phi$ play no important dynamical role. This is actually the case in regard to many crystal properties and phenomena. There are, however, also effects for which this does not hold, notably electric current transport. The fact that the electric conductivity of an absolutely pure crystal does not become excessively large is due in large part to the scattering of carriers from the oscillations of the atomic cores, i.e. to non-adiabatic quantum transitions between different electron and core states. Also, in the recombination of electron-hole pairs mentioned in Chapter 1, these transitions play a decisive role, with the lattice of atomic cores absorbing the energy which is released during recombination. Formally, one may understand non-adiabatic transitions as the result of an interaction between the electrons and the motion of the atomic cores. Since such core motion, as we will see below, represents a superposition of lattice oscillations, also known as phonon excitations, this interaction is called the electron-phonon interaction.
We have yet to explore how the two Schrödinger equations (2.18) and (2.26) for the electrons and cores can actually be solved. The problem is that both equations are, at the outset, not completely determined: the one for the cores contains the adiabatic potential $U(X)$, which can be known only after the equation for the electrons has been solved, and the electron equation can be fully defined only if the positions $X$ of the cores in the potential $V_{ee,ec}(x, X)$ are known. The direct way to overcome this difficulty would be the following: One assumes a particular spatial ordering $X$ of the cores and uses it to determine the potential $V_{ee,ec}(x, X)$. The latter is then used to solve the electron Schrödinger equation (2.18) (we will not discuss here how this is accomplished, as it will be the subject of the next subsection, 2.2.3). From the solution of the Schrödinger equation (2.18) one obtains the value of the adiabatic potential $U$ at the position $X$ of the cores. The same procedure is then applied to all other possible positions $X$, whereby the adiabatic potential $U(X)$ and the Schrödinger equation (2.26) for the cores are completely determined. This equation can then be used to calculate the core wavefunction $\phi(X)$. It follows that the dynamical problem for the crystal as a whole is solved, since one would know its eigenfunctions $\Psi(x, X) = \psi(x, X)\,\phi(X)$. In reality, however, this procedure is unsuccessful. One cannot solve the electron Schrödinger equation for all possible core positions. Therefore, a simplified procedure is necessary. It contains additional approximations, but has the advantage of being feasible in practical terms. In this approach, one ignores the motion of the atomic cores completely and assumes that they are resting in certain equilibrium positions $X^{eq}$. In reality, they execute oscillations around these positions with amplitudes that become smaller as the temperature of the crystal decreases. However, due to
the quantum mechanical phenomenon of zero-point oscillations, such motion remains finite even at absolute zero temperature.
Figure 2.2: Iterative calculation of the equilibrium positions of the atomic cores.
The equilibrium positions $X^{eq}$ are unknown at the outset. One can determine them by demanding that the total energy
$$V_0(X) = U(X) + V_{cc}(X) \tag{2.29}$$
of the crystal in equilibrium have a minimum at $X^{eq}$. Equivalent to this is the requirement that the forces $-\nabla_X V_0(X)$ on the cores, the so-called Hellmann-Feynman forces, vanish at the equilibrium positions:
$$\nabla_X V_0(X) \big|_{X = X^{eq}} = 0. \tag{2.30}$$
Bearing this in mind, we may employ the iteration process below for the solution of the two coupled adiabatic equations (see Figure 2.2): In this process, one assumes certain trial equilibrium positions $X^0$, enters them in the electron Schrödinger equation, and determines the eigenvalue $U(X^0)$. This solution is then used to determine the potential $V_0(X^0)$ and the Hellmann-Feynman forces. Thanks to the Feynman theorem, taken jointly with appropriate analytical transformations, one can determine these forces without numerically calculating the potential in the environment of $X^0$. After the first iteration cycle, the Hellmann-Feynman forces will, in general, not yet vanish, signifying that the cores are still not at equilibrium positions. By means of the nonvanishing forces one determines new trial positions $X^1$. The new positions are then substituted again into the electron Schrödinger equation (2.18), to calculate a new eigenvalue $U(X^1)$, and the latter determines the corresponding Hellmann-Feynman forces. This procedure is to be repeated until the forces become zero. The corresponding core positions are then the equilibrium positions $X^{eq}$. In this way one reaches a very important result, the determination of the atomic structure of the crystal. Such structure calculations are successfully carried out currently for many solid state systems, including a series of semiconductor crystals. With regard to semiconductors, it can be shown, for instance, that under normal conditions, Si has the diamond structure, and that its lattice constant $a$ is 5.43 Å.
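The iteration cycle just described can be illustrated with a one-dimensional toy model. In the sketch below, a Morse potential stands in for $V_0(X)$ and its analytic derivative stands in for the Hellmann-Feynman force; both, together with the fixed step length and the function names, are assumptions made for illustration. A real calculation would obtain the force from a solution of the electron equation (2.18) in every cycle.

```python
import math


def morse_energy(x: float, depth=1.0, width=1.0, x_min=2.0) -> float:
    """Toy stand-in for V_0(X): a Morse potential with its minimum at x_min."""
    return depth * (1.0 - math.exp(-width * (x - x_min))) ** 2


def morse_force(x: float, depth=1.0, width=1.0, x_min=2.0) -> float:
    """Force -dV_0/dX of the toy potential (analytic, no finite differences)."""
    u = math.exp(-width * (x - x_min))
    return -2.0 * depth * width * (1.0 - u) * u


def relax(x_trial: float, step=0.2, tol=1e-10, max_cycles=1000):
    """Repeat the cycle 'force -> new trial position' until the force vanishes."""
    x = x_trial
    for cycle in range(max_cycles):
        force = morse_force(x)
        if abs(force) < tol:
            return x, cycle            # equilibrium position reached
        x += step * force              # new trial position from the nonvanishing force
    raise RuntimeError("forces did not converge")


if __name__ == "__main__":
    x_eq, cycles = relax(2.5)
    print(f"equilibrium at X = {x_eq:.6f} after {cycles} cycles")
```

Moving the coordinate a fixed fraction along the force is the simplest possible update rule; it is used here only because it keeps the correspondence with Figure 2.2 transparent.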
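In so far as the cores deviate only slightly from their equilibrium positions, the potential $V_0(X)$ may be expanded about $X^{eq}$ up to second order. The linear term vanishes because of (2.30), so that
$$V_0(X) = V_0(X^{eq}) + \frac{1}{2}\,(X - X^{eq}) \cdot \nabla_X \nabla_X V_0(X^{eq}) \cdot (X - X^{eq}). \tag{2.31}$$
With this expansion, the Schrödinger equation (2.26) for the cores takes the form
$$\left[ T_c + \frac{1}{2}\,(X - X^{eq}) \cdot \nabla_X \nabla_X V_0(X^{eq}) \cdot (X - X^{eq}) \right] \phi(X) = E\,\phi(X), \tag{2.32}$$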
where, for simplicity, $V_0(X^{eq}) = 0$ has been assumed. Equation (2.32) describes a system of coupled harmonic oscillators. The restoring forces are determined by the second derivatives of the potential $V_0(X)$. Using the eigenvectors and eigenvalues of the matrix of restoring forces (actually, of the so-called dynamical matrix which also includes the kinetic energy term), one can easily transform to a system of uncoupled harmonic oscillators. Their motions are called normal mode oscillations or lattice oscillations, and their excitation quanta are phonons. Phonons are a good example of the introduction of a concept which is of fundamental importance for the dynamics of many-body systems, including the many-electron system of a crystal which will engage us in the next subsection. The concept we have in mind here is identified by the terms elementary excitations or quasiparticles (both terms are commonly used). This concept is based on the possibility of decomposing the motion of a system of mutually interacting particles (in our case the atomic cores of a crystal) into non-interacting components of motion (phonons in our case). The phonons or, more generally, the elementary excitations are, so to speak, the elements of motion of the system, while the atomic cores or, more generally, the actual particles, form the structural elements of the system. An elementary excitation involves coordinated motion of all structural elements of the system. Conversely, the motion of an individual structural element is a superposition of all elementary excitations: the motion of the atomic cores, for example, is a superposition of all normal mode oscillations or phonons. Besides the one-electron and one-hole excitations, the phonons are the most important elementary excitations, or quasiparticles, of a crystal.
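The decoupling into normal modes can be made concrete for the smallest nontrivial case. The following sketch assumes a toy system that does not appear in the text: two equal masses, each tied to a fixed wall and to each other by springs. For this system the eigenvalues and eigenvectors of the 2x2 dynamical matrix are known in closed form; the function name is hypothetical.

```python
import math


def normal_modes_two_masses(mass: float, k_wall: float, k_mid: float):
    """Normal-mode angular frequencies and displacement patterns.

    Layout (an assumed toy system): wall -k_wall- m -k_mid- m -k_wall- wall.
    The dynamical matrix is D = K / mass with
    K = [[k_wall + k_mid, -k_mid], [-k_mid, k_wall + k_mid]].
    """
    a = (k_wall + k_mid) / mass      # diagonal element of D
    b = -k_mid / mass                # off-diagonal element of D
    # Closed-form eigenvalues of the symmetric matrix [[a, b], [b, a]].
    eigenvalues = (a - abs(b), a + abs(b))
    # Eigenvectors: in-phase (1, 1) and out-of-phase (1, -1) displacements.
    norm = 1.0 / math.sqrt(2.0)
    modes = ((norm, norm), (norm, -norm))
    frequencies = tuple(math.sqrt(ev) for ev in eigenvalues)
    return frequencies, modes


if __name__ == "__main__":
    freqs, modes = normal_modes_two_masses(mass=1.0, k_wall=1.0, k_mid=1.0)
    for omega, mode in zip(freqs, modes):
        print(f"omega = {omega:.4f}, displacement pattern = {mode}")
```

Each displacement pattern involves both masses at once, while the motion of a single mass is a superposition of the two patterns, which mirrors the relation between elementary excitations and structural elements described above.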
In this book we will deal mainly with electronic elementary excitations, and will include phonons only if it is otherwise impossible to properly describe electron dynamics. Relevant phonon information will simply be cited without detailed justification, since a thorough development of the theory of phonons is beyond the scope of this book. Our choice of subject matter here is conditioned by the fact that electrons and holes are much more important than phonons for understanding the properties of semiconductors as they are used in electronic devices. Readers who are particularly interested in phonons are referred to other books (see, e.g., Born and Huang, 1968; Bilz and Kress, 1979; Bonch-Bruevich and Kalashnikov, 1982).
We return now to the Schrödinger equation (2.18) for electrons. In the sense of Figure 2.2, we approximate the core positions $X$ by their equilibrium values $X^{eq}$. As far as the latter are concerned, we take the point of view that they are known from experimental structure investigations, e.g., by means of X-ray diffraction. For common semiconductor crystals this is in fact true in all cases. Taking this approach, the potential $V_{ee,ec}$ in the electron Schrödinger equation (2.18) is well-defined from the very beginning. To simplify the notation, we suppress the core coordinates $X$ in the potential $V_{ee,ec} = V_{ee} + V_{ec}$ henceforth, writing $V_{ec}(x, X) \equiv V_{ec}(x)$. Similarly, we write the electron wavefunction $\psi(x, X)$ as $\psi(x)$. Using equations (2.19), (2.5) and (2.6), the Hamiltonian $H = T_e + V_{ee,ec}$ of the $N$-electron system in equation (2.18) takes the explicit form
2.2.3
With the Hamiltonian $H$ of (2.33), the $N$-electron Schrödinger equation (2.18) can be written as
$$H \psi(x_1, x_2, \ldots, x_N) = U \psi(x_1, x_2, \ldots, x_N). \tag{2.34}$$
The wavefunction $\psi(x_1, x_2, \ldots, x_N)$ must be periodic and normalized with respect to a periodicity region. The Schrödinger equation (2.34) is impossible to solve directly since it describes an interacting system of electrons having a tremendously large number of particles, of order $10^{22}$. The goal of this subsection is to provide an approximate description which allows one to reduce the number of particles down to the minimum number 1. This will be done by developing a one-particle Schrödinger equation whose solutions are related to those of the true many-electron Schrödinger equation in a well-defined and sufficiently simple way.
Hartree approximation
In keeping with the remarks above, we assume the existence of an infinite set of one-particle wavefunctions $\varphi_1, \varphi_2, \ldots, \varphi_\infty$, from which the stationary states $\psi(x_1, x_2, \ldots, x_N)$ of the $N$-electron system may be constructed. The $\varphi_\nu(x)$, $\nu = 1, 2, \ldots, \infty$, are, firstly, taken to be periodic with respect to a periodicity region, as is the wavefunction $\psi(x_1, x_2, \ldots, x_N)$ itself, i.e. they satisfy the condition
$$\varphi_\nu(x) = \varphi_\nu(x + G a_j), \qquad j = 1, 2, 3. \tag{2.35}$$
Secondly, they are assumed to form a complete orthonormal set of functions in Hilbert space, symbolically
$$\langle \varphi_{\nu'} | \varphi_\nu \rangle = \delta_{\nu'\nu}. \tag{2.36}$$
Employing $\varphi_\nu(x)$ we form wavefunctions for the $N$-electron system in the following way. We first associate each of the $N$ electrons with a particular one-particle state $\varphi_\nu(x)$, i.e. particle 1 with state $\varphi_{\nu_1}$, particle 2 with state $\varphi_{\nu_2}$, etc., up to particle $N$ which is associated with the state $\varphi_{\nu_N}$. Alternatively, we may say that we occupy state $\varphi_{\nu_1}$ with particle 1, state $\varphi_{\nu_2}$ with particle 2, etc. Due to the Pauli exclusion principle, each state can host only one particle, ignoring spin (which we do at this stage). Thus a given state $\varphi_1, \varphi_2, \ldots, \varphi_\infty$ may occur among the populated ones $\varphi_{\nu_1}, \varphi_{\nu_2}, \ldots, \varphi_{\nu_N}$ not more than once. Most of the states will not even occur once, i.e., not at all (bear in mind that there is an infinite number of them). These states remain unoccupied. The set of quantum numbers $(\nu_1, \nu_2, \ldots, \nu_N)$, termed configuration, defines the state of the $N$-electron system uniquely if we understand that the first number in this set refers to the state of particle 1, the second to the state of particle 2, etc. Henceforth, we abbreviate the configuration $(\nu_1, \nu_2, \ldots, \nu_N)$ by $\{\nu\}$. Thirdly, we assign to each configuration $\{\nu\}$ of the $N$-electron system a wavefunction $\psi_{\{\nu\}}(x_1, x_2, \ldots, x_N)$ which is given by the following product of one-particle wavefunctions:
$$\psi_{\{\nu\}}(x_1, x_2, \ldots, x_N) = \varphi_{\nu_1}(x_1)\, \varphi_{\nu_2}(x_2) \cdots \varphi_{\nu_N}(x_N). \tag{2.37}$$
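A small numerical illustration of the product form (2.37) may help. The sketch below assumes, purely for illustration, standing waves in a one-dimensional box as the orthonormal one-particle states and checks by a midpoint-rule integration that two-particle product states inherit normalization and orthogonality from them. The orbital choice and all function names are hypothetical and not taken from the text.

```python
import math


def orbital(nu: int, x: float, length: float = 1.0) -> float:
    """Assumed one-particle state: nu-th standing wave in a box of size length."""
    return math.sqrt(2.0 / length) * math.sin(nu * math.pi * x / length)


def product_state(config, positions, length: float = 1.0) -> float:
    """Value of the product wavefunction (2.37) for the configuration config."""
    value = 1.0
    for nu, x in zip(config, positions):
        value *= orbital(nu, x, length)
    return value


def overlap_two_particles(config_a, config_b, points: int = 200,
                          length: float = 1.0) -> float:
    """Midpoint-rule overlap integral of two 2-particle product states."""
    h = length / points
    grid = [(k + 0.5) * h for k in range(points)]
    total = 0.0
    for x1 in grid:
        for x2 in grid:
            total += (product_state(config_a, (x1, x2), length)
                      * product_state(config_b, (x1, x2), length))
    return total * h * h


if __name__ == "__main__":
    print(overlap_two_particles((1, 2), (1, 2)))   # same configuration
    print(overlap_two_particles((1, 2), (2, 1)))   # particles exchanged
```

The configurations (1, 2) and (2, 1) are treated as different states here, in keeping with the convention that the first quantum number refers to particle 1 and the second to particle 2.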
Disregarding the mutual interaction of the electrons for the moment, the product (2.37) forms an eigenstate of the $N$-electron system if the one-particle wavefunctions $\varphi_{\nu_j}(x_j)$ are energy eigenstates of the individual one-electron subsystem Hamiltonians. This suggests the question whether a similar result might be possible for interacting electrons, i.e. whether it will be possible to choose the $\varphi_{\nu_j}(x_j)$ in such a way that the product state $\psi_{\{\nu\}}$ obeys the Schrödinger equation
$$H \psi_{\{\nu\}}(x_1, \ldots, x_N) = U_{\{\nu\}}\, \psi_{\{\nu\}}(x_1, \ldots, x_N) \tag{2.38}$$
for the fully interacting $N$-electron system, if not rigorously, then at least in some reasonable approximation. To address this question one may use the variational principle of quantum mechanics. In this procedure, the one-particle wavefunctions $\varphi_{\nu_j}(x_j)$ are determined such that the expectation value of the $N$-electron Hamiltonian $H$ becomes a minimum for $N$-electron states of the product form (2.37). Here we take a slightly different approach, and start from the Schrödinger equation (2.38). This procedure has the advantage that one can at once determine whether there is a suitable approximation in which $\psi_{\{\nu\}}$ may be written in the product form (2.37), and also learn the nature of that approximation.
Considering an $N$-electron system, we label a particular electron $i$, and this index can take all values between 1 and $N$. The Hilbert space of the system of the $N-1$ remaining electrons $1, 2, \ldots, i-1, i+1, \ldots, N$ is spanned by the set of product functions
$$\varphi_{\mu_1}(x_1) \cdots \varphi_{\mu_{i-1}}(x_{i-1})\, \varphi_{\mu_{i+1}}(x_{i+1}) \cdots \varphi_{\mu_N}(x_N) \tag{2.39}$$
with $\mu_1, \ldots, \mu_{i-1}, \mu_{i+1}, \ldots, \mu_N$ ranging over all possible values $1, 2, \ldots, \infty$ independently of each other. In this remaining Hilbert space we form the Fourier-type coefficients of the Schrödinger equation (2.38), i.e., we multiply this equation by the complex conjugated product function (2.39) and integrate over all $x_1, x_2, \ldots, x_N$ with the exception of $x_i$. In this way we obtain
(2.40)
Due to the orthogonality of the $\varphi_\nu$, the right-hand side of this equation differs from zero only if the $\mu$-values coincide with the $\nu$-values, i.e., if $\mu_1 = \nu_1, \ldots, \mu_{i-1} = \nu_{i-1}, \mu_{i+1} = \nu_{i+1}, \ldots, \mu_N = \nu_N$ holds. However, the left-hand side of equation (2.40) differs from zero also if $\mu_j \neq \nu_j$ for one or several $j \neq i$. Thus equation (2.38) cannot hold rigorously, which means that the eigenfunctions $\psi_{\{\nu\}}$ of $H$ cannot be written exactly as a product (2.37) of one-particle wavefunctions. This is only possible under the condition that the non-diagonal elements of the Hamiltonian operator on the left-hand side of (2.40) may be neglected. It is this approximation which makes possible the reduction of the $N$-particle wavefunctions to products of one-particle wavefunctions. It is called a one-particle approximation. Strictly speaking, it is the simplest variant of a one-particle approximation, the so-called Hartree approximation. A more accurate one-particle approximation, called the Hartree-Fock approximation, will be discussed below. Within the framework of the Hartree approximation the equation system (2.40) involves only the diagonal terms with $\mu_j = \nu_j$ for each $j \neq i$, and correspondingly takes the form
(2.42)
On the right-hand side of this equation only the first three terms depend on the electron coordinates $x_i$ while the last two are constants in this regard. If one substitutes the expression (2.42) into equation (2.41), then the last two terms can be grouped together with $U_{\{\nu\}}$ to form the new eigenvalue
(2.45)
we rewrite (2.41) as
The final relation (2.46) has the form of a Schrödinger equation for the $i$-th particle, where $V^i_{\{\nu\}}(x_i)$ is the potential energy of this particle and $E_{\nu_i}$ is its energy eigenvalue. Beside the potential energy $V_c(x_i)$ due to the atomic cores, $V^i_{\{\nu\}}(x_i)$ also contains the contribution $V^i_{H\{\nu\}}(x_i)$. It is caused by the mutual interaction of electrons, and is commonly called the Hartree potential. In explicit form, $V^i_{H\{\nu\}}(x_i)$ reads
(2.47)
Here the integration runs over a periodicity region. The Hartree potential $V^i_{H\{\nu\}}(x_i)$ describes the potential energy of the $i$-th particle in the Coulomb potential produced by the charge distribution $-e \sum_{k \neq i} |\varphi_{\nu_k}(x)|^2$ of the remaining particles. The factor 1/2 of the electron-electron interaction potential (2.5) does not occur in expressions (2.45) and (2.47) for the Hartree potential since each electron pair contributes only once. Obviously, the Hartree potential and the corresponding energy eigenvalues depend on the configuration $\{\nu\}$ of the $N$-electron system, and also on the index $i$ of the particle which was removed. The one-particle Schrödinger equation (2.46) derived above for the $i$-th electron holds for each other electron as well, only with a somewhat different potential. This difference will now be removed, together with the dependence of the Hartree potential on the configuration $\{\nu\}$ of the $N$-electron system. We argue as follows: If the number $N$ of electrons is macroscopically large, as in the case of the electron system of a crystal, and if we consider only one-particle states which are spatially spread out more or less evenly over the entire crystal, there is no significant difference if we extend the sum over $k$ in equation (2.47) for the potential $V^i_{H\{\nu\}}(x_i)$ to include $k = i$. Then the $i$-dependence of the potential no longer exists. The error thereby incurred is of relative order of magnitude $1/N$. If one considers, on the other hand, only states $\{\nu\}$ of the $N$-electron system which are similar to each other, one may also neglect the $\{\nu\}$-dependence of the potential and replace $V_{H\{\nu\}}(x)$ by the value for a representative configuration $\{\nu^0\}$. The question is, does such a representative configuration exist in the case of a semiconductor, and if so, what is it? The answer to the former question is, under normal conditions, yes.
For a representative configuration in the above-mentioned sense, we have the state of the $N$-electron system with lowest total energy, the so-called ground state. In this state all one-particle states $\varphi_\nu$ with energies $E_\nu$ below a special energy value (the Fermi energy) are occupied, and the states with energies above are empty. Under normal conditions the states of the $N$-electron system which occur in semiconductors, and also in other solids, deviate very little from the ground state. Non-normal conditions are associated with large deviations, such as semiconductors which are displaced to a highly excited state by intense laser irradiation. Excluding such extreme cases, the Hartree potential $V_{H\{\nu\}}(x)$ for the configuration $\{\nu\}$ is almost the same as that for the ground state configuration $\{\nu^0\}$, and correspondingly we have
(2.48)
$$V(x_i) = V_c(x_i) + V_H(x_i). \tag{2.50}$$
The extent to which the approximation of a configuration-independent Hartree potential is valid again depends on the kind of one-particle states involved. For the extended, plane-wave-like one-particle states of an ideal crystal this approximation works better than for the localized one-particle states of a real semiconductor. In the latter case the configuration dependence of the potential may become essential (see Chapter 3 for further discussion). Using (2.50), the Schrödinger equation (2.46) becomes
(2.51)
The Hamiltonian of this equation is the same for all particles and no longer depends on the configuration of the $N$-particle system. Equation (2.51) is therefore the one-particle Schrödinger equation par excellence, devoid of any reference to a particular particle or configuration of the $N$-electron system. We may therefore omit the index $i$ in equation (2.51). Using the one-particle Hamiltonian
$$H = \frac{p^2}{2m} + V(x), \tag{2.52}$$
the one-particle Schrödinger equation reads
$$H \varphi_\nu(x) = E_\nu\, \varphi_\nu(x). \tag{2.53}$$
The Hamiltonian $H$ of equation (2.52) is Hermitian, and it is natural to assume that its eigenfunctions form a complete orthonormal set in Hilbert space. This assumption has in fact been made at the outset, with respect to the one-particle states $\varphi_\nu(x)$ forming the product wavefunction of the $N$-electron system.
In summary, the discussion above has shown the following: Within the framework of the one-particle approximation, i.e. neglecting non-diagonal elements of the Hamiltonian, the product wavefunctions $\psi_{\{\nu\}}(x_1, x_2, \ldots, x_N)$ are eigenstates of the $N$-electron system provided that the one-particle wavefunctions of the product functions satisfy the one-particle Schrödinger equation (2.53). Solving this equation and forming the product wavefunction (2.37), one gets approximate solutions of the $N$-electron Schrödinger equation. In this way we have reached the goal which was formulated at the outset, to replace the $N$-electron problem by a one-particle problem whose solution has a well-defined and sufficiently simple connection with the solution of the $N$-electron problem.
The idea that the $\varphi_\nu(x)$ are energy eigenstates and the $E_\nu$ are energies of single electrons, underlying the above consideration, needs to be made more precise. Because of the electron-electron interaction, the motion of a particular electron is always tied to the motion of all others, and the energy of an electron is also, in part, energy of interaction. The latter statement manifests itself clearly in relation (2.43) between the one-particle energies $E_\nu$ and the total energy $U_{\{\nu\}}$ of the $N$-electron system, which we will explore in more detail. First of all, it can be further simplified. Using the one-particle Schrödinger equation, one can reexpress the terms on the right-hand side of equation (2.43) by one-particle eigenvalues, leading to
The energy of the $N$-particle system is therefore not just the sum of all one-particle energies. It is necessary to subtract the Coulomb interaction energy of the particles. Therein is reflected the fact that the $E_{\nu_i}$ contain a certain portion of interaction energy with other electrons. This is doubly counted in the sum $\sum_i E_{\nu_i}$ of one-particle energies, once in summing over the particles themselves, and once in summing over their interaction partners, which is done in $E_{\nu_i}$ automatically. To correct this, one must subtract the Coulomb interaction energy.
This shows that the $\varphi_\nu(x)$ may be interpreted as states of single electrons only in a generalized sense. In reality the $\varphi_\nu(x)$ describe stationary states of the motion of the $N$-electron system in which all electrons are involved. These states of motion are not mutually coupled, as in the case of normal oscillations of a system of interacting atomic cores. Using the terminology introduced in that context one may consider the states $\varphi_\nu(x)$ as states of quasiparticles or elementary excitations of the $N$-electron system. The $E_\nu$ are the corresponding quasiparticle or excitation energies. There is, however, a qualitative difference between these elementary excitations of the electron system and the normal oscillations of a crystal. This may be made clear as follows. If one adds to the $N$-electron system (which we will assume to be in the ground state) one more electron, i.e. if one passes over to an $(N+1)$-electron system, then the one-particle Hamiltonian (2.52) does not change within the framework of the approximations made above. The one-particle wavefunctions $\varphi_\nu$ of the $N$-electron system therefore also approximately describe the elementary excitations of the system of $(N+1)$ electrons. This means that an eigenstate of the $(N+1)$-electron system may be realized by keeping the previously available $N$ electrons in their one-particle states and adding the $(N+1)$-th electron in one of the one-particle states $\varphi_{\nu^*}$ of the $N$-electron system which were previously not occupied. Thus, by adding an electron, the energy of the system rises approximately by $E_{\nu^*}$. This means that the eigenvalue $E_{\nu^*}$ of the one-particle Hamiltonian may be understood as the energy of an electron added to the system. This statement is called Koopmans' theorem. From it, one can learn more about the kind of elementary excitations of the $N$-electron system that are described by the $\varphi_\nu$. These are states in which, as always, all electrons of the system are involved, but not all in the same way. Only one of the electrons is moving in such states, while the others play a passive role; they determine the potential in which this movement occurs. One therefore refers to these states as one-particle excitations of the $N$-electron system, and to their energies as one-particle excitation energies or, in short, one-particle energies.
In addition to the one-particle excitations considered above there may yet be others. This can be confirmed by taking an $(N-1)$-electron system instead of the $(N+1)$-electron system. The missing electron corresponds to a hole in a previously occupied one-particle state $\varphi_{\nu^0}$. The excitation energy of the hole is $-E_{\nu^0}$, which did not occur among the one-particle excitations considered above. It therefore represents an additional one-particle excitation. If an electron is removed from state $\nu_1^0$ and simultaneously an electron is added in state $\nu_1^*$, then this corresponds to the excitation of the $N$-electron system from state $(\nu_1^0, \nu_2^0, \ldots, \nu_N^0)$ into state $(\nu_1^*, \nu_2^0, \ldots, \nu_N^0)$. The energy difference with respect to the ground state amounts to $E_{\nu_1^*} - E_{\nu_1^0}$. It corresponds to the excitation energy of an electron-hole pair with the electron in state $\varphi_{\nu_1^*}$ and the hole in state $\varphi_{\nu_1^0}$.
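The bookkeeping behind these statements can be written down in a few lines. The sketch below assumes a hypothetical list of one-particle energies and represents a configuration simply as the set of occupied level indices (spin ignored, as in the text). It returns the energy of a configuration relative to the ground state as the sum of the energies of added electrons minus the sum of the energies of removed ones. All names and numbers are illustrative.

```python
def ground_state(energies, n_electrons):
    """Indices of the n_electrons lowest one-particle levels (spin ignored)."""
    order = sorted(range(len(energies)), key=lambda nu: energies[nu])
    return set(order[:n_electrons])


def excitation_energy(energies, occupied, n_electrons):
    """Energy of the configuration `occupied` relative to the ground state.

    Within the one-particle approximation this is the sum of the energies of
    the added electrons minus the sum of the energies of the removed ones.
    """
    reference = ground_state(energies, n_electrons)
    electrons = set(occupied) - reference      # newly occupied states
    holes = reference - set(occupied)          # emptied states
    return (sum(energies[nu] for nu in electrons)
            - sum(energies[nu] for nu in holes))


if __name__ == "__main__":
    levels = [0.0, 1.0, 2.5, 4.0, 5.5]         # assumed one-particle energies
    print(excitation_energy(levels, {0, 1, 2, 3}, 3))   # one electron added
    print(excitation_energy(levels, {0, 1}, 3))         # one electron removed
    print(excitation_energy(levels, {0, 1, 3}, 3))      # one electron-hole pair
```

The three calls correspond to an added electron, a hole, and an electron-hole pair, in that order.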
If one excites a second electron from state $\varphi_{\nu_2^0}$ into state $\varphi_{\nu_2^*}$, the energy difference with respect to the ground state is $(E_{\nu_1^*} - E_{\nu_1^0}) + (E_{\nu_2^*} - E_{\nu_2^0})$, etc. The excitation energies of the $N$-electron system can thus be written as a linear superposition of one-particle energies. This is valid only within the one-particle approximation. In a strict sense one also has many-particle excitations, which will be considered in more detail below. As far as the one-particle excitations are concerned, there are no others than the ones considered above, at least as long as one ignores spin and the magnetic interaction between electrons.
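Because the Hartree potential in the one-particle Hamiltonian (2.52) is built from the occupied eigenfunctions of that same Hamiltonian, equation (2.53) can in practice only be solved by iterating until input and output agree. The following toy sketch shows such a loop. It assumes a two-site model with two electrons sharing the lowest orbital and a mean-field on-site term; the model, its parameters, the linear mixing step, and all function names are illustrative assumptions and are not the Hartree equations of the text.

```python
import math


def lowest_orbital(a: float, d: float, t: float):
    """Lowest eigenvalue and normalized eigenvector of [[a, -t], [-t, d]], t != 0."""
    delta = 0.5 * (d - a)
    energy = 0.5 * (a + d) - math.sqrt(delta ** 2 + t ** 2)
    c1, c2 = t, a - energy                 # unnormalized eigenvector
    norm = math.hypot(c1, c2)
    return energy, (c1 / norm, c2 / norm)


def self_consistent_density(site_energies=(0.0, 1.0), t=1.0, u=1.0,
                            mixing=0.5, tol=1e-10, max_cycles=200):
    """Iterate density -> potential -> orbital -> density until it stops changing.

    Toy model (an assumption for illustration): two electrons share the lowest
    orbital of a two-site system; on site i each electron feels a mean-field
    potential u * n_i / 2 produced by the other electron.
    """
    n1, n2 = 1.0, 1.0                      # trial density (starting guess)
    for cycle in range(1, max_cycles + 1):
        a = site_energies[0] + 0.5 * u * n1
        d = site_energies[1] + 0.5 * u * n2
        _, (c1, c2) = lowest_orbital(a, d, t)
        new_n1, new_n2 = 2.0 * c1 ** 2, 2.0 * c2 ** 2
        change = abs(new_n1 - n1)
        # Linear mixing of old and new densities damps oscillations.
        n1 += mixing * (new_n1 - n1)
        n2 += mixing * (new_n2 - n2)
        if change < tol:
            return (n1, n2), cycle
    raise RuntimeError("self-consistency not reached")


if __name__ == "__main__":
    (n1, n2), cycles = self_consistent_density()
    print(f"n1 = {n1:.6f}, n2 = {n2:.6f} after {cycles} cycles")
```

The convergence test compares the density going into a cycle with the density coming out of it, which is the same stopping criterion the iteration for the one-particle wavefunctions uses.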
Figure 2.3: Self-consistent solution of the one-particle Schrödinger equation.
The Hartree potential entering the one-particle Schrödinger equation (2.53) is itself built from the eigenfunctions of this equation, so that the potential and the eigenfunctions have to be determined together. The situation resembles that of the coupled equations for the electrons and the atomic cores. As was done there, we may also solve the present problem iteratively (see Figure 2.3). We employ one-particle wavefunctions $\varphi^0_\nu(x)$ close to the true stationary one-electron states. Using $\varphi^0_\nu(x)$ we determine a potential $V^0_H(x)$ according to equation (2.49), form the total potential $V^0(x)$ by means of (2.50), and use this to solve the one-particle Schrödinger equation (2.53). The solutions $\varphi^1_\nu(x)$ are then substituted into formula (2.49), thereby determining new potentials $V^1_H(x)$ and $V^1(x)$. With the latter one recalculates the eigenfunctions $\varphi^2_\nu(x)$, etc. One continues this iterative procedure until the eigenfunctions, and with them also the potential in the following iteration step, no longer change within a specified limit of accuracy. The eigenfunctions and potential are then said to be determined self-consistently.
Spin and spin-orbit interaction
At this point in our treatment of the one-electron approximation, it is appropriate to recognize that electrons have a spin, i.e. an internal angular momentum with the two possible values $+\hbar/2$ and $-\hbar/2$ in a given direction. This is to say that electrons are capable of a motion in spin space, beside their motion in coordinate space, which in this context is called orbital motion. As orbital motion involves dependence of the wavefunction $\varphi(\mathbf{x})$ on the space coordinate $\mathbf{x}$, spin motion involves a dependence of $\varphi(\mathbf{x}, s)$ on the spin variable $s$, which may take the two possible values $s = \frac{1}{2}$ and $s = -\frac{1}{2}$ (below the latter value will be written as $\bar{\frac{1}{2}}$). Thus, in consideration of spin, the wavefunction of an electron changes from an ordinary vector $\varphi(\mathbf{x})$ in Hilbert space to an element $\{\varphi(\mathbf{x}, \frac{1}{2}), \varphi(\mathbf{x}, \bar{\frac{1}{2}})\}$ in the product space of the ordinary Hilbert space and the two-component spin space, a so-called two-component spinor function. To determine the spinor state of an electron uniquely, the quantum number $\lambda$ which defines the state must also specify the spin state. If the latter is independent of the state in coordinate space, this may be done by specifying another quantum number $\sigma$ for the spin motion along with the quantum number $\nu$ of the orbital motion, i.e. by setting $\lambda = (\nu, \sigma)$, where $\sigma$ may take the two values $\uparrow$ (spin up) and $\downarrow$ (spin down). The spinor $\{\varphi_\lambda(\mathbf{x}, \frac{1}{2}), \varphi_\lambda(\mathbf{x}, \bar{\frac{1}{2}})\}$ can then be represented as a product of only one spatially varying function $\varphi_\nu(\mathbf{x})$ and a spinor $\{\chi_\sigma(\frac{1}{2}), \chi_\sigma(\bar{\frac{1}{2}})\}$ which does not change in coordinate space. The two spinor components $\varphi_\lambda(\mathbf{x}, s)$ can then be written as

$$\varphi_\lambda(\mathbf{x}, s) = \varphi_\nu(\mathbf{x})\,\chi_\sigma(s). \qquad (2.55)$$
In general, however, the orbital and spin motions are coupled. This is mainly due to the fact that, on the one hand, the spin motion is accompanied by a magnetic moment of the electron, and that, on the other hand, the orbital motion gives rise to a magnetic field which couples to that magnetic moment. In quantum theory it is shown that this interaction, which is called spin-orbit interaction, can be represented by the following additional term $H_{so}$ in the one-electron Hamiltonian:

$$H_{so} = \frac{\hbar}{4m^2c^2}\,[\nabla V(\mathbf{x}) \times \mathbf{p}] \cdot \vec{\sigma}. \qquad (2.56)$$
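Here $V(\mathbf{x})$ denotes, as before, the periodic crystal potential of equation (2.50), and $\vec{\sigma}$ is the vector whose three components are Pauli's spin matrices. In spin space one usually refers all quantities to the basis $\chi_\uparrow = (1, 0)$, $\chi_\downarrow = (0, 1)$. Then the components of $\vec{\sigma}$ are

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (2.57)$$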
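The defining properties of the Pauli matrices in the standard representation (2.57) can be checked with a few lines of Python. The sketch below stores the matrices as nested tuples of complex numbers; the helper names (`matmul`, `commutator`, `apply`) are our own choices and not notation from the text.

```python
# Pauli matrices in the basis chi_up = (1, 0), chi_down = (0, 1).
SIGMA_X = ((0, 1), (1, 0))
SIGMA_Y = ((0, -1j), (1j, 0))
SIGMA_Z = ((1, 0), (0, -1))
IDENTITY = ((1, 0), (0, 1))

CHI_UP = (1, 0)
CHI_DOWN = (0, 1)

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def scale(c, a):
    """Multiply a 2x2 matrix by the number c."""
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))

def commutator(a, b):
    """[a, b] = ab - ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(ab[i][j] - ba[i][j] for j in range(2)) for i in range(2))

def apply(a, v):
    """Apply a 2x2 matrix to a two-component spinor."""
    return tuple(sum(a[i][k] * v[k] for k in range(2)) for i in range(2))
```

With these definitions, `apply(SIGMA_Z, CHI_UP)` returns the spinor unchanged and `apply(SIGMA_Z, CHI_DOWN)` returns it with reversed sign, i.e. the two basis spinors are eigenvectors of $\sigma_z$ with eigenvalues $+1$ and $-1$, corresponding to the spin values $\pm\hbar/2$.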
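Taking account of spin and the spin-orbit interaction, the one-particle Schrödinger equation (2.53) in Hartree approximation takes the form

$$\sum_{s'} \left[ \left( \frac{\mathbf{p}^2}{2m} + V(\mathbf{x}) \right) \delta_{ss'} + (H_{so})_{ss'} \right] \varphi_\lambda(\mathbf{x}, s') = E_\lambda\, \varphi_\lambda(\mathbf{x}, s). \qquad (2.58)$$

Spin-orbit interaction is in fact an important consideration in determination of the energy spectra of many semiconductors.

Hartree-Fock approximation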
An obvious drawback of the Hartree approximation is that the wavefunction of the $N$-particle system is not antisymmetric with respect to the exchange of two particles, a requirement of the Pauli exclusion principle. This defect can be easily remedied by replacing the product wavefunction (2.37) of the Hartree state by a linear combination of product wavefunctions with exchanged particle indices and altered signs. In conjunction with this, the spin of the electrons has to be considered, such that the wavefunction of the $i$th particle is given by the spinor $\varphi_{\lambda_i}(\mathbf{x}_i, s_i)$. The antisymmetric linear combination of the product wavefunctions may be written in the form of a so-called Slater determinant

$$\psi_{\{\lambda\}}(\mathbf{x}_1 s_1, \ldots, \mathbf{x}_N s_N) = \frac{1}{\sqrt{N!}} \begin{vmatrix} \varphi_{\lambda_1}(\mathbf{x}_1, s_1) & \varphi_{\lambda_2}(\mathbf{x}_1, s_1) & \cdots & \varphi_{\lambda_N}(\mathbf{x}_1, s_1) \\ \varphi_{\lambda_1}(\mathbf{x}_2, s_2) & \varphi_{\lambda_2}(\mathbf{x}_2, s_2) & \cdots & \varphi_{\lambda_N}(\mathbf{x}_2, s_2) \\ \vdots & \vdots & & \vdots \\ \varphi_{\lambda_1}(\mathbf{x}_N, s_N) & \varphi_{\lambda_2}(\mathbf{x}_N, s_N) & \cdots & \varphi_{\lambda_N}(\mathbf{x}_N, s_N) \end{vmatrix}. \qquad (2.59)$$

In this determinant, an exchange of the variables of two electrons leads to the exchange of the two corresponding rows. The sign of the determinant thereby changes, so that the Slater determinant actually has the requisite antisymmetry property. If two of the quantum numbers $\lambda_1, \lambda_2, \ldots, \lambda_N$ are equal, then two columns of the determinant are identical and it vanishes. This means that no states of the $N$-electron system are allowed with two electrons in the same one-particle state. This is just the Pauli principle, automatically enforced by the use of the determinantal form of the $N$-particle wavefunction.
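The two properties just described, the sign change under exchange of two electrons and the vanishing for two equal quantum numbers, can be verified numerically for a small system. The sketch below evaluates a Slater determinant for three electrons by summing over permutations. The one-dimensional orbitals are hypothetical test functions chosen only for illustration, and spin is omitted for brevity.

```python
import math
from itertools import permutations

def permutation_sign(perm):
    """Sign (+1 or -1) of a permutation, obtained by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def slater_determinant(orbitals, coords):
    """Antisymmetrized product; rows correspond to electrons, columns to states."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for electron, state in enumerate(perm):
            term *= orbitals[state](coords[electron])
        total += term
    return total / math.sqrt(math.factorial(n))

# Hypothetical one-particle states on a 1D coordinate (assumed for illustration).
orbitals = [
    lambda x: math.exp(-x * x),
    lambda x: x * math.exp(-x * x),
    lambda x: (2 * x * x - 1) * math.exp(-x * x),
]

coords = [0.3, -0.7, 1.1]
swapped = [-0.7, 0.3, 1.1]  # electrons 1 and 2 exchanged

psi = slater_determinant(orbitals, coords)
psi_swapped = slater_determinant(orbitals, swapped)

# Two electrons placed in the same one-particle state: the determinant vanishes.
psi_pauli = slater_determinant([orbitals[0], orbitals[0], orbitals[2]], coords)
```

Up to rounding errors, `psi_swapped` equals `-psi`, and `psi_pauli` is zero. The permutation sum scales as $N!$ and is meant only as a transparent illustration for very small $N$.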
Employing such Slater determinants as $N$-particle wavefunctions, as opposed to simple products, a one-particle Schrödinger equation for $\varphi_\nu(\mathbf{x})$ may be derived in the same way as before, but the potential in this equation is somewhat different from that in the Hartree equation. It contains an additional contribution, the so-called exchange potential $V_X(\mathbf{x})$, and the total potential reads

$$V(\mathbf{x}) = V_c(\mathbf{x}) + V_H(\mathbf{x}) + V_X(\mathbf{x}). \qquad (2.60)$$
In the case of negligibly small spin-orbit interaction, the orbital state may be characterized by a separate quantum number $\nu_i$, and the spin state by a separate quantum number $\sigma_i$. The spinor components $\varphi_{\lambda_i}(\mathbf{x}_i, s_i)$ are of the form (2.55) in such circumstances. The sum $V_H(\mathbf{x}) + V_X(\mathbf{x})$ of the Hartree and exchange potentials can then be written in a relatively simple form. It can be shown that their action on the coordinate-dependent factor of the one-particle wavefunction $\varphi_{\nu_i}(\mathbf{x})$ takes the form

(2.61)
The first term on the right-hand side of this equation is the Hartree potential. The factor of 2 results from summing over the two spin states associated with each wavefunction $\varphi_{\nu_k}(\mathbf{x})$. The second term corresponds to the exchange potential. Formally, it differs from the first term by exchanging the states at the two positions $\mathbf{x}$ and $\mathbf{x}'$. The factor of $\frac{1}{2}$ reflects the fact that, firstly, the exchange potential acts only between electrons of the same spin, and, secondly, that for the ground state with total spin 0, half of the electrons are in spin-down states, and half are in spin-up states. In this way the magnitude of the exchange potential is influenced by the existence of electron spin, although its value is the same for spin-up and spin-down states. Equation (2.61) also shows that, unlike the Hartree potential, the exchange potential $V_X$ is nonlocal. The effect of the exchange potential on the wavefunction $\varphi_\nu(\mathbf{x})$ is represented by an integral operator. In actual calculations one often uses a local approximation for $V_X(\mathbf{x})$. The exchange potential proves to be attractive, which is to be expected: the antisymmetric form (2.59) of the total wavefunction $\psi_{\{\lambda\}}$ means that the probability of finding two electrons with the same spin at the same position is zero, so that one has an exchange hole around each electron. This lowering of electron density in the vicinity of an electron results in an attractive potential in addition to the repulsive Hartree potential, since the total Hartree wavefunction (2.37) does not account for the exchange hole. The improved one-particle Schrödinger equation with the potential of (2.60) and (2.61) is called the Hartree-Fock equation, and the one-particle approximation which underlies it is called Hartree-Fock approximation. Here, the Hartree and exchange potentials are understood as those for the ground state configuration $\nu_i^0$ of the $N$-electron system. The effects of the electron-electron interaction which are still neglected within the Hartree-Fock approximation with configuration-independent Hartree and exchange potentials are called correlation effects.

Correlation effects

Correlation effects are, first of all, manifested in the fact that the true one-particle excitation energies of an $N$-electron system differ from those in the Hartree-Fock approximation. In particular, these excitation energies depend on the configuration of the system; one has a configuration dependence. Secondly, Slater determinants, which in Hartree-Fock approximation are considered to be eigenstates of the total Hamiltonian, in fact do not diagonalize this Hamiltonian exactly; there are nonvanishing off-diagonal elements, an effect which is termed configuration interaction. The exact eigenstates of the total Hamiltonian are linear combinations of different Slater determinants, and the corresponding energy eigenvalues are no longer sums of one-particle excitation energies, as had been the case for individual Slater determinants. In other terms, the exact eigenstates of the $N$-electron system are not one-particle excitations, but many-particle excitations. Examples of many-particle excitations include two-particle excitations of an electron and a hole which are bound together by their Coulomb interaction. The excitation energy of such a bound electron-hole pair, the so-called exciton, is smaller than that of a free electron-hole pair, differing by the binding energy of the pair. The reason for the designation correlation effect for this phenomenon is obvious: binding may be understood as a correlation between the positions of the electron and the hole, since their separation by a distance of about a Bohr radius is more probable than all others.
This interpretation presents the correct concept of correlation in other cases also: the states of the electrons are no longer independent of each other, but are correlated, contrary to the assumptions implicit in the one-particle approximation. Collective many-particle excitations are excitations of states in which all electrons of the system participate in comparable measure. Examples include the plasma oscillations of an electron system. They form a direct electronic analogy to the lattice vibrations of the atomic cores of a crystal. Their excitation quanta are called plasmons. The consideration of correlation effects stands among the most difficult problems of solid state theory which, even today, is not completely solved. A comprehensive analysis of this problem is far beyond the scope of the present book. Readers who are particularly interested in correlation effects will find discussions in a number of textbooks (see, e.g., Abrikosov, Gorkov, and Dzyaloshinski, 1963; Fetter and Walecka, 1971; Ziman, 1974; Callaway, 1976; Madelung, 1978; Harrison, 1981). Below we summarize some results which will be needed in Chapter 3. In doing so, we will concentrate on one-particle excitations, i.e. individual electrons and holes moving in the force field of all other electrons as well as in the force field of the atomic cores.
Correlation effects on one-particle excitations. Density functional theory.
Correlation effects on one-particle excitations may be treated by means of the Green's function theory of many-particle systems. The poles of the one-particle Green's function in the complex energy plane represent one-particle excitation energies (more strictly speaking, the real parts of these poles are the energy levels, and the imaginary parts are the lifetime broadening energies of the one-particle excitations). The one-particle Green's function is governed by the Dyson equation, which contains correlation effects through the so-called mass or self-energy operator. Another method, which works well for one-particle states involved in the ground state of the many-particle system, is known as density functional theory. This method relies on a theorem, the Hohenberg-Kohn theorem, which ensures that the ground state energy $E_0$ of an interacting electron system in an external potential $V_c(\mathbf{x})$ is a functional $E_0[n(\mathbf{x})]$ of the total electron density $n(\mathbf{x})$ of the ground state alone. This implies, first of all, that the total energy $E_0[n(\mathbf{x})]$ depends on the one-particle wavefunctions only through the ground state density $n(\mathbf{x})$ and, moreover, that the density enters at every point $\mathbf{x}$, through an integral over $\mathbf{x}$. The one-particle wavefunctions determine the ground state density by means of the equation

$$n(\mathbf{x}) = \sum_{i=1}^{N} |\varphi_{\nu_i}(\mathbf{x})|^2, \qquad (2.62)$$
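where $\varphi_{\nu_i}(\mathbf{x})$ denote the one-particle states which, in the ground state of the $N$-electron system, are populated by electrons $i = 1, 2, \ldots, N$. According to the variational principle of quantum mechanics, the wavefunctions $\varphi_{\nu_i}(\mathbf{x})$ adjust so that, while keeping their norms $\langle \varphi_{\nu_i} | \varphi_{\nu_i} \rangle$ constant, the total energy $E_0[n(\mathbf{x})]$ is minimized. This requires the vanishing of the variational derivative of the functional $E_0[n(\mathbf{x})] - \sum_i E_i \langle \varphi_{\nu_i} | \varphi_{\nu_i} \rangle$ with respect to $\varphi^*_{\nu_i}(\mathbf{x})$, where the factors $E_i$ are variational parameters, therefore

$$\frac{\delta}{\delta \varphi^*_{\nu_i}(\mathbf{x})} \left\{ E_0[n(\mathbf{x})] - \sum_j E_j \langle \varphi_{\nu_j} | \varphi_{\nu_j} \rangle \right\} = 0. \qquad (2.63)$$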
In this functional derivative the value of $\varphi^*_{\nu_i}(\mathbf{x})$ at a certain point $\mathbf{x}$ is taken as an independent variable, with respect to which the common derivative is taken. The total energy functional $E_0[n(\mathbf{x})]$ of the ground state may be decomposed into several energy contributions, namely, the kinetic energy $E_{kin}[n(\mathbf{x})]$, the external potential energy $E_c[n(\mathbf{x})]$, the Hartree energy $E_H[n(\mathbf{x})]$, and the exchange and correlation energies, which are usually summed in the exchange-correlation energy $E_{xc}[n(\mathbf{x})]$. Thus

$$E_0[n(\mathbf{x})] = E_{kin}[n(\mathbf{x})] + E_c[n(\mathbf{x})] + E_H[n(\mathbf{x})] + E_{xc}[n(\mathbf{x})]. \qquad (2.64)$$
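The external potential energy and the Hartree energy are given by

$$E_c[n(\mathbf{x})] = \int_\Omega d^3x\, V_c(\mathbf{x})\, n(\mathbf{x}), \qquad (2.65)$$

$$E_H[n(\mathbf{x})] = \frac{e^2}{2} \int_\Omega \int_\Omega d^3x'\, d^3x\, \frac{n(\mathbf{x})\, n(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}. \qquad (2.66)$$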
The functional $E_{xc}[n(\mathbf{x})]$ is less obvious. It is usually taken in a local approximation called the local density approximation (LDA). The LDA starts with the homogeneous electron system without any external potential. In this case the density $n(\mathbf{x})$ is a constant $n$ in space, and $E_{xc}(n)$ reduces to an ordinary function of $n$. Dividing $E_{xc}(n)$ by the total number $n\Omega$ of electrons yields the exchange-correlation energy $\epsilon_{xc}(n)$ of the free electron gas, per electron. The total exchange-correlation energy $E_{xc}[n(\mathbf{x})]$ of a weakly inhomogeneous electron gas of density $n(\mathbf{x})$ should then be given approximately by the expression

$$E_{xc}[n(\mathbf{x})] = \int_\Omega d^3x\, n(\mathbf{x})\, \epsilon_{xc}(n(\mathbf{x})). \qquad (2.67)$$

Finally, the kinetic energy functional $E_{kin}[n(\mathbf{x})]$ has to be specified. By definition, we have

$$E_{kin}[n(\mathbf{x})] = \sum_{i=1}^{N} \int_\Omega d^3x\, \varphi^*_{\nu_i}(\mathbf{x}) \left[ -\frac{\hbar^2}{2m} \nabla^2 \right] \varphi_{\nu_i}(\mathbf{x}). \qquad (2.68)$$
Although this expression does not look like a functional of $n(\mathbf{x})$, it is indeed possible to transform it into such a form, because all other terms in the total energy functional $E_0[n(\mathbf{x})]$ of (2.64) are functionals of $n(\mathbf{x})$, and the Hohenberg-Kohn theorem enforces this for $E_{kin}[n(\mathbf{x})]$. For the evaluation of the variational equation (2.63) we do not, however, need the explicit functional form of $E_{kin}[n(\mathbf{x})]$; expression (2.68) suffices. Its variational derivative with respect to $\varphi^*_{\nu_i}(\mathbf{x})$ is given by

$$\frac{\delta E_{kin}[n(\mathbf{x})]}{\delta \varphi^*_{\nu_i}(\mathbf{x})} = -\frac{\hbar^2}{2m} \nabla^2 \varphi_{\nu_i}(\mathbf{x}). \qquad (2.69)$$
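The variational derivatives of the remaining energy contributions follow in the same manner:

$$\frac{\delta E_c[n(\mathbf{x})]}{\delta \varphi^*_{\nu_i}(\mathbf{x})} = V_c(\mathbf{x})\, \varphi_{\nu_i}(\mathbf{x}), \qquad (2.70)$$

$$\frac{\delta E_H[n(\mathbf{x})]}{\delta \varphi^*_{\nu_i}(\mathbf{x})} = V_H(\mathbf{x})\, \varphi_{\nu_i}(\mathbf{x}), \qquad (2.71)$$

$$\frac{\delta E_{xc}[n(\mathbf{x})]}{\delta \varphi^*_{\nu_i}(\mathbf{x})} = V_{xc}(\mathbf{x})\, \varphi_{\nu_i}(\mathbf{x}), \qquad (2.72)$$

where

$$V_{xc}(\mathbf{x}) = \left. \frac{d\,[n\, \epsilon_{xc}(n)]}{dn} \right|_{n = n(\mathbf{x})} \qquad (2.73)$$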
denotes the exchange-correlation potential. The latter can be determined if the exchange-correlation energy $\epsilon_{xc}(n)$ of the homogeneous electron gas is known as a function of $n$. This dependence can be obtained by calculating $\epsilon_{xc}(n)$ numerically for different values of $n$ and then fitting the data to appropriate explicit functions. Using this procedure, various exchange-correlation potentials have been proposed, for example

$$V_{xc}(\mathbf{x}) = -\left( \frac{3}{\pi} \right)^{1/3} e^2\, n^{1/3}(\mathbf{x}) \left[ 1 + 0.7734\, z \ln\left( 1 + \frac{1}{z} \right) \right] \qquad (2.74)$$

with $z = r_s/21$, where $r_s = [3/(4\pi n(\mathbf{x}))]^{1/3}/a_B$ and $a_B$ the Bohr radius (Hedin, Lundqvist, 1971). Substituting into equation (2.63) the functional derivatives obtained above, one arrives at
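$$\left[ -\frac{\hbar^2}{2m} \nabla^2 + V(\mathbf{x}) \right] \varphi_\nu(\mathbf{x}) = E_\nu\, \varphi_\nu(\mathbf{x}) \qquad (2.75)$$

with

$$V(\mathbf{x}) = V_c(\mathbf{x}) + V_H(\mathbf{x}) + V_{xc}(\mathbf{x}) \qquad (2.76)$$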
as an effective one-electron potential. The electron index $i$ has been omitted here because the equation is the same for all electrons. Relation (2.75), with the potential $V(\mathbf{x})$ given by (2.76), is known as Kohn-Sham equation. As compared to the one-electron potential $V(\mathbf{x})$ of the Hartree or Hartree-Fock equations, that of the Kohn-Sham equation additionally accounts for correlation effects. The physical significance of the solutions of the Kohn-Sham equation is, however, less direct than that of the solutions of the Hartree or Hartree-Fock equations. Generally, the eigenvalues of the Kohn-Sham equation cannot be understood in the sense of one-particle excitation energies of the $N$-electron system, as is possible for the eigenvalues of the Hartree or Hartree-Fock equations according to Koopmans' theorem. A misinterpretation of this kind may lead to large errors. This applies, in particular, to electron-hole excitation energies in semiconductor crystals, defining the energy gap. The resulting erroneous gaps are about 50% smaller than the experimental values. However, the eigenvalues and eigenfunctions of the Kohn-Sham equation can be properly used to calculate the total energy of the ground state of the $N$-electron system by means of the total energy functional (2.64). The dependence of the total energy on external parameters such as, for example, the positions of atomic cores also can be obtained in this way. Minimizing the total energy with respect to the core positions yields the atomic structure of the crystal. The one-particle excitation energies may be obtained as the differences of the total energies of the $(N+1)$- or $(N-1)$-electron systems, on the one hand, and the $N$-electron system, on the other hand, where the $(N+1)$-electron system applies for electron excitations and the $(N-1)$-electron system for hole excitations.
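To give a feeling for the size of the correlation correction in a local exchange-correlation potential of the Hedin-Lundqvist type quoted above, the following Python sketch evaluates it as the exchange-only potential $-(3/\pi)^{1/3} e^2 n^{1/3}$ times the factor $1 + 0.7734\, z \ln(1 + 1/z)$ with $z = r_s/21$. We assume atomic units ($e^2 = 1$, $a_B = 1$) as defaults and the usual definition $r_s = [3/(4\pi n)]^{1/3}/a_B$; the function names are our own.

```python
import math

def wigner_seitz_radius(n, bohr_radius=1.0):
    """Dimensionless density parameter r_s for an electron density n."""
    return (3.0 / (4.0 * math.pi * n)) ** (1.0 / 3.0) / bohr_radius

def exchange_potential(n, e2=1.0):
    """Exchange-only part, -(3/pi)^(1/3) e^2 n^(1/3)."""
    return -((3.0 / math.pi) ** (1.0 / 3.0)) * e2 * n ** (1.0 / 3.0)

def hedin_lundqvist_vxc(n, e2=1.0, bohr_radius=1.0):
    """Exchange potential multiplied by the correlation enhancement factor."""
    z = wigner_seitz_radius(n, bohr_radius) / 21.0
    return exchange_potential(n, e2) * (1.0 + 0.7734 * z * math.log(1.0 + 1.0 / z))

# Density for which r_s = 21, i.e. z = 1 (chosen only because the factor is then easy to check).
n_ref = 3.0 / (4.0 * math.pi * 21.0 ** 3)
ratio_ref = hedin_lundqvist_vxc(n_ref) / exchange_potential(n_ref)

# At high density the correlation correction becomes small.
ratio_dense = hedin_lundqvist_vxc(1.0e6) / exchange_potential(1.0e6)
```

The ratio `ratio_ref` equals $1 + 0.7734 \ln 2$, while `ratio_dense` is close to 1: in this parametrization correlation makes the potential more attractive than exchange alone, and the relative correction grows as the density decreases.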
2.3
The one-electron Schrödinger equation (2.53) is specified by the form of the potential $V(\mathbf{x})$, which is different for crystals of different chemical composition and atomic structure. However, there are certain general properties of $V(\mathbf{x})$ which do not depend on the particular material nature of the crystal. As we already know, a crystal remains invariant under a transformation by an element of its space group. Rotations, reflections and rotation-reflections of the point group of directions transform a given crystal direction into a physically equivalent one. The symmetry of the crystal is transferred directly to the potential $V(\mathbf{x})$. Consequently, the Schrödinger equation (2.53) is endowed with corresponding symmetry properties. We shall first describe these and then explore their implications for the stationary one-electron states $\varphi_\nu(\mathbf{x})$. In doing so, we initially neglect electron spin.
2.3.1
For simplicity, we restrict our considerations here to crystals whose space groups are symmorphic. These are groups which contain solely translations, rotations, reflections as well as rotation-reflections, while screw rotations and glide-reflections are excluded. The general case of non-symmorphic space groups, and with it the particularly important case of the diamond structure, is treated in Appendix A. It turns out that the results derived for symmorphic space groups are also valid, with minor modifications, for non-symmorphic ones. We use the symbol $t_\mathbf{R}$ to denote the lattice translation operator which causes a translation of all points $\mathbf{x}$ through a lattice vector $\mathbf{R}$,

$$t_\mathbf{R}\, \mathbf{x} = \mathbf{x} + \mathbf{R}. \qquad (2.77)$$
The set of all lattice translation operators forms the translation group, which we already encountered in Chapter 1. Since the lattice translation operators are symmetry elements of the crystal, the translation group is a subgroup of the space group which, by definition, contains all symmetry elements of the crystal. As in the case of translations, we also assign operators to rotations, reflections and rotation-reflections of the point group of directions. These operators also act on the spatial vectors $\mathbf{x}$, and we denote them by the symbol $\alpha$ and call them point symmetry operators. For the symmorphic space groups considered here, each element $g$ may be thought of as a product $\alpha \cdot t_\mathbf{R}$ or $t_\mathbf{R} \cdot \alpha$ of a translation $t_\mathbf{R}$ and a point symmetry operation $\alpha$. Since an arbitrary position vector $\mathbf{x}$ may be represented in a basis spanned by the three primitive lattice vectors $\mathbf{a}_1$, $\mathbf{a}_2$ and $\mathbf{a}_3$, it suffices to specify the effect of $\alpha$ on these. We define (see Appendix A)

$$\alpha\, \mathbf{a}_i = \sum_j \alpha_{ij}\, \mathbf{a}_j, \qquad (2.78)$$

with $\alpha_{ij}$ as real coefficients. By means of this relation each operator $\alpha$ is uniquely associated with a corresponding matrix $\alpha_{ij}$. If the three primitive lattice vectors $\mathbf{a}_i$ are orthogonal and of the same length, the $\alpha_{ij}$ form orthogonal matrices, i.e. their inverse matrices are the same as their transposed ones. However, this holds only for the primitive cubic lattice, and is not true for all other 13 Bravais lattices. Thus, the $\alpha_{ij}$ are not in general orthogonal matrices. The effect of $\alpha$ on an arbitrary position vector $\mathbf{x}$ may easily be determined by means of its decomposition

$$\mathbf{x} = \sum_j x_j\, \mathbf{a}_j \qquad (2.79)$$

with respect to the three primitive lattice vectors $\mathbf{a}_j$. The $x_j$ are the components of $\mathbf{x}$ with respect to these vectors. Applying $\alpha$, we obtain

$$\alpha\, \mathbf{x} = \sum_j x_j\, \alpha\, \mathbf{a}_j = \sum_j \sum_i x_j\, \alpha_{ji}\, \mathbf{a}_i, \qquad (2.80)$$

or

$$\alpha\, \mathbf{x} = \sum_i \Big( \sum_j \alpha_{ji}\, x_j \Big) \mathbf{a}_i, \qquad (2.81)$$

from which it follows that the transformation $\alpha$, which was originally defined as a transformation of the basis vectors $\mathbf{a}_i$ with fixed coordinates $x_i$, may also be understood as a countertransformation of the coordinates with fixed basis vectors. Indeed, the coordinate countertransformation takes place with the transpose matrix $\alpha_{ji}$ of $\alpha_{ij}$, in contradistinction to the transformation (2.78) of the basis vectors. This is to say that

$$(\alpha\, \mathbf{x})_i = \sum_j \alpha_{ji}\, x_j. \qquad (2.82)$$
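The distinction between transforming the basis vectors and countertransforming the coordinates can be made concrete with a small numerical example. The Python sketch below uses a two-dimensional hexagonal lattice as a simplified stand-in for the three primitive vectors, and a rotation through 60 degrees as the point symmetry operation. It assumes the index convention $\alpha\, \mathbf{a}_i = \sum_j \alpha_{ij}\, \mathbf{a}_j$ for the basis vectors, so that the coordinates transform with the transposed matrix; all names are our own.

```python
import math

# Primitive vectors of a 2D hexagonal lattice (illustrative stand-in for a1, a2, a3).
A = [(1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0)]

# Matrix alpha_ij of the 60-degree rotation, read off from
#   alpha a_1 = a_2,   alpha a_2 = -a_1 + a_2   (assumed convention: row i gives alpha a_i).
ALPHA = [[0, 1], [-1, 1]]

def rotate60(v):
    """Apply the rotation directly in Cartesian coordinates."""
    c, s = math.cos(math.pi / 3.0), math.sin(math.pi / 3.0)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def combine(coeffs, vectors):
    """Linear combination sum_j coeffs[j] * vectors[j]."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(2))

def transform_coordinates(coords):
    """Coordinate countertransformation x'_i = sum_j alpha_ji x_j (transposed matrix)."""
    return [sum(ALPHA[j][i] * coords[j] for j in range(2)) for i in range(2)]

def close(u, v, tol=1e-12):
    """Componentwise comparison of two vectors."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

coords = [2.0, -3.0]                      # components x_1, x_2 of some vector x
x = combine(coords, A)                    # x = x_1 a_1 + x_2 a_2
rotated_directly = rotate60(x)
rotated_via_coordinates = combine(transform_coordinates(coords), A)

# alpha_ij is not an orthogonal matrix here: alpha * alpha^T differs from the unit matrix.
gram = [[sum(ALPHA[i][k] * ALPHA[j][k] for k in range(2)) for j in range(2)] for i in range(2)]
```

Both routes give the same rotated vector, and `gram` differs from the unit matrix, which illustrates that the $\alpha_{ij}$ need not be orthogonal when the primitive vectors are not mutually orthogonal.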
If the crystal remains invariant under the transformations $t_\mathbf{R}$ and $\alpha$, then this must also hold for the potential $V(\mathbf{x})$ with which the transformed crystal acts on an electron at position $\mathbf{x}$. To express this fact formally, we define the operation of $t_\mathbf{R}$ and $\alpha$ on an arbitrary position-dependent scalar crystal function $S(\mathbf{x})$ as follows:

$$t_\mathbf{R}\, S(\mathbf{x}) = S(t_\mathbf{R}^{-1} \mathbf{x}), \qquad (2.83)$$

$$\alpha\, S(\mathbf{x}) = S(\alpha^{-1} \mathbf{x}). \qquad (2.84)$$
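The role of the inverse operation in the definitions (2.83) and (2.84) can be tried out numerically. In the Python sketch below, an operator acts on a function by evaluating the function at the inversely transformed point. Two non-commuting point operations in the plane (a rotation through 90 degrees and a reflection) serve as examples; the test function and all names are hypothetical choices made for illustration.

```python
import math

def act(op_inverse, f):
    """Operator acting on a scalar function: (op f)(x) = f(op^{-1} x)."""
    return lambda x: f(op_inverse(x))

def rot90(x):
    """Rotation through +90 degrees."""
    return (-x[1], x[0])

def rot90_inv(x):
    """Rotation through -90 degrees, the inverse of rot90."""
    return (x[1], -x[0])

def mirror_x(x):
    """Reflection x_1 -> -x_1; this operation is its own inverse."""
    return (-x[0], x[1])

def scalar_field(x):
    """Arbitrary test function S(x) without any special symmetry."""
    return math.exp(-(x[0] - 1.0) ** 2 - (x[1] - 0.3) ** 2)

point = (0.4, -1.2)

# Product operator (rot90 * mirror_x) applied step by step: first mirror_x, then rot90.
stepwise = act(rot90_inv, act(mirror_x, scalar_field))(point)

# Same result from the inverse of the product, (rot90 * mirror_x)^{-1} = mirror_x^{-1} rot90^{-1}.
direct = scalar_field(mirror_x(rot90_inv(point)))

# Inverses taken in the wrong order, for comparison.
wrong_order = scalar_field(rot90_inv(mirror_x(point)))
```

Here `stepwise` and `direct` agree, whereas `wrong_order` gives a different value, because the inverse of a product of two non-commuting operations is the product of the inverses in reversed order.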
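It is striking that the operators $t_\mathbf{R}$ and $\alpha$ act on $S(\mathbf{x})$ in such a way that $\mathbf{x}$ is replaced by $t_\mathbf{R}^{-1}\mathbf{x}$ or $\alpha^{-1}\mathbf{x}$, but not by $t_\mathbf{R}\mathbf{x}$ or $\alpha\mathbf{x}$, as one might have expected. The chosen definition stems from the recognition that the transformed property is that of the transformed crystal at the original position $\mathbf{x}$. This is the same, however, as the property of the original crystal at the inverse transformed position. Formally the definitions (2.83) and (2.84) guarantee the correct multiplication order of two non-commuting operators $\alpha_1$, $\alpha_2$ under the function symbol, because $(\alpha_1 \alpha_2)^{-1} = \alpha_2^{-1} \alpha_1^{-1}$. Applying these definitions to the potential $V(\mathbf{x})$ and simultaneously requiring crystal symmetry, we have

$$t_\mathbf{R}\, V(\mathbf{x}) = V(t_\mathbf{R}^{-1} \mathbf{x}) = V(\mathbf{x}), \qquad (2.85)$$

$$\alpha\, V(\mathbf{x}) = V(\alpha^{-1} \mathbf{x}) = V(\mathbf{x}). \qquad (2.86)$$

For the action on the product of $V(\mathbf{x})$ with an arbitrary function $\varphi(\mathbf{x})$ this implies

$$t_\mathbf{R}\, V(\mathbf{x})\, \varphi(\mathbf{x}) = V(\mathbf{x})\, t_\mathbf{R}\, \varphi(\mathbf{x}), \qquad (2.87)$$

$$\alpha\, V(\mathbf{x})\, \varphi(\mathbf{x}) = V(\mathbf{x})\, \alpha\, \varphi(\mathbf{x}), \qquad (2.88)$$

from which the commutation relations

$$[t_\mathbf{R}, V] = 0, \qquad (2.89)$$

$$[\alpha, V] = 0 \qquad (2.90)$$

follow, with $[A, B] = AB - BA$ as abbreviation for the commutator of two operators $A$ and $B$. One says that $t_\mathbf{R}$ and $\alpha$ commute with the potential $V(\mathbf{x})$. Such commutativity also holds for the other contribution of the Hamiltonian, the kinetic energy operator $T = \mathbf{p}^2/2m$. For $t_\mathbf{R}$ this follows directly from the relations $\mathbf{p} = -i\hbar\nabla_\mathbf{x}$ and $\nabla_\mathbf{x} = \nabla_{\mathbf{x}+\mathbf{R}}$, while for $\alpha$ it follows from the fact that $\mathbf{p}^2$ is the square of the length of the momentum vector, as well as from the observation that the length of a vector is not changed by a translation, rotation or reflection. Thus,

$$[t_\mathbf{R}, T] = [\alpha, T] = 0. \qquad (2.91)$$

For the Hamiltonian $H = T + V$ it follows that

$$[t_\mathbf{R}, H] = 0, \qquad (2.92)$$

$$[\alpha, H] = 0. \qquad (2.93)$$

Since the elements $g$ of a symmorphic space group may be written as products of $t_\mathbf{R}$ and $\alpha$, one also has

$$[g, H] = 0. \qquad (2.94)$$

The latter relation expresses in final form the implication of crystal symmetry for the Schrödinger equation. We will use it extensively in the analysis of symmetry properties of stationary one-electron states to follow.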
2.3.2
Bloch theorem
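Let $\varphi_E$ be an eigenfunction of $H$ having eigenvalue $E$. Then the stationary Schrödinger equation (2.53) holds, where the state index $\nu$ has been replaced by $E$:

$$H \varphi_E(\mathbf{x}) = E\, \varphi_E(\mathbf{x}). \qquad (2.95)$$

In addition, one has the periodicity condition with respect to a periodicity region,

$$\varphi_E(\mathbf{x} + G \mathbf{a}_j) = \varphi_E(\mathbf{x}), \quad j = 1, 2, 3, \qquad (2.96)$$

and the normalization condition

$$\langle \varphi_E | \varphi_E \rangle = 1. \qquad (2.97)$$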
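Through the periodicity condition (2.96), the symmetry group, which at the outset includes an infinite number of lattice translations, is reduced to the finite subgroup containing only those translations $t_\mathbf{R}$ which do not fall outside the periodicity region. The commutativity of $t_\mathbf{R}$ and $\alpha$ with $H$ has the consequence that, along with $\varphi_E(\mathbf{x})$, also $t_\mathbf{R}\varphi_E(\mathbf{x})$, $\alpha\varphi_E(\mathbf{x})$ and $g\varphi_E(\mathbf{x})$ are eigenfunctions of $H$ having the same eigenvalue $E$. Thus, one has

$$H\, t_\mathbf{R}\, \varphi_E(\mathbf{x}) = E\, t_\mathbf{R}\, \varphi_E(\mathbf{x}), \qquad (2.98)$$

$$H\, \alpha\, \varphi_E(\mathbf{x}) = E\, \alpha\, \varphi_E(\mathbf{x}), \qquad (2.99)$$

$$H\, g\, \varphi_E(\mathbf{x}) = E\, g\, \varphi_E(\mathbf{x}). \qquad (2.100)$$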
Consider now the eigenfunction $g\varphi_E$. If the symmetry operation $g$ ranges over the whole space group, the totality of vectors $g\varphi_E$ spans a subspace of the Hilbert space whose dimension $d$ is in general larger than 1. This is to say that the eigenvalue $E$, for symmetry reasons, is $d$-fold degenerate. It can be shown that, apart from special cases, $d$ equals the number of different elements $\alpha$ of the point group, and then the $d$ basis functions of the subspace may be chosen in the form $\alpha\varphi_E$ with $\alpha$ ranging over the entire point group. This result will not, however, be used in what follows, and the basis functions will be written in the general form $\varphi_{E1}, \varphi_{E2}, \ldots, \varphi_{Ed}$. They span that subspace of the Hilbert space which contains the eigenfunctions of the Hamiltonian with the eigenvalue $E$. According to its construction this space is invariant under the operation of an arbitrary element $g$ of the space group. In terms of the concept of irreducible representations of groups (an introduction is provided in Appendix A), we may alternatively express this observation by saying that the eigenfunctions of $H$ for a particular eigenvalue $E$ give rise to a $d$-dimensional irreducible representation of the space group. The same statement also follows for each subgroup of this group, particularly for the subgroup of all translations. The space spanned by $\varphi_{E1}, \varphi_{E2}, \ldots, \varphi_{Ed}$ thus also provides a representation of the translation group. However, this representation is, in general, no longer irreducible. That means that the original basis $\varphi_{E1}, \varphi_{E2}, \ldots, \varphi_{Ed}$ may be transformed into a new basis $\varphi_E, \varphi'_E, \varphi''_E, \ldots, \varphi_E^{(d-1)}$ in such a way that the representation matrices of all translation operators $t_\mathbf{R}$ written in the new basis are constructed from lower dimensional matrices, ordered along the diagonal.
It is of special importance that the translation group is a group of Abelian type: the result of two translations $t_{\mathbf{R}_1}$ and $t_{\mathbf{R}_2}$ does not depend on the sequence in which they are executed. Formally this is expressed by the equation

$$t_{\mathbf{R}_1}\, t_{\mathbf{R}_2} = t_{\mathbf{R}_2}\, t_{\mathbf{R}_1}. \qquad (2.101)$$
As is demonstrated in Appendix A, all irreducible representations of Abelian groups are 1-dimensional. This means that the lower dimensional matrices along the diagonals of the $t_\mathbf{R}$ representation matrices are 1-dimensional. Each of the basis functions $\varphi_E, \varphi'_E, \ldots, \varphi_E^{(d-1)}$ therefore forms a representation space of the translation group by itself. Thus, all functions $t_\mathbf{R}\varphi_E$, with $t_\mathbf{R}$ an arbitrary element of the translation group, are linearly dependent on $\varphi_E$, all functions $t_\mathbf{R}\varphi'_E$ are linearly dependent on $\varphi'_E$, etc. This means that $t_\mathbf{R}\varphi_E$ itself may be written in the form

$$t_\mathbf{R}\, \varphi_E(\mathbf{x}) = c(\mathbf{R})\, \varphi_E(\mathbf{x}), \qquad (2.102)$$
where $c(\mathbf{R})$ is a complex coefficient. Analogous equations hold for all other functions $t_\mathbf{R}\varphi'_E$, $t_\mathbf{R}\varphi''_E$, .... This is equivalent to the statement that the chosen $\varphi_E, \varphi'_E, \varphi''_E, \ldots$ are eigenfunctions of the translation operator. This result forms the content of the
Bloch theorem:
The eigenfunctions $\varphi_E$ of the Hamiltonian $H$ of a crystal can be chosen such that they are simultaneously eigenfunctions of the lattice translation operators of the crystal.
The particular energy eigenfunctions whose existence is stated by this theorem are termed Bloch functions. The proof of the Bloch theorem sketched above relies in an essential way on features of group representations. Although it forms the most appropriate proof, one may also proceed without the tools of group theory, using a mathematical theorem which ensures that two commuting Hermitian or unitary operators have a common set of eigenfunctions. The Bloch theorem is of such great importance that several remarks are appropriate. In the first place, it has to be emphasized that this theorem is an immediate consequence of the commutativity of translations. For the symmetry operations $\alpha$ of the point group, for example, which in general do not commute, it does not hold, which means that in general the eigenfunctions of $H$ cannot be chosen to be simultaneous eigenfunctions of the operators $\alpha$ of the point group. Secondly, the theorem does not say that every conceivable eigenfunction of $H$ is also necessarily an eigenfunction of $t_\mathbf{R}$. This holds only for specially chosen $\varphi_E$. The Bloch theorem ensures that such a choice is always possible. Thirdly, this theorem also does not imply that only one eigenfunction exists for particular eigenvalues $E$ and $c(\mathbf{R})$ of, respectively, $H$ and $t_\mathbf{R}$. In reality there are always several. The pair of eigenvalues $E$, $c(\mathbf{R})$ is therefore not sufficient to uniquely characterize the eigenstates of $H$ and of the group of operators $t_\mathbf{R}$. Quantities which provide such unique characterization have yet to be identified. It turns out that there are three real numbers $k_1, k_2, k_3$ which allow one to distinguish the different irreducible representations of the translation group of the crystal. To understand this we must examine these representations in greater detail. They are defined by the eigenvalue equation (2.102) of the translation operators.
To start, we will show that the eigenvalues $c_{k_1k_2k_3}(\mathbf{R})$ of this equation may be written in the form

$$c_{k_1k_2k_3}(\mathbf{R}) = \exp[i\beta_{k_1k_2k_3}(\mathbf{R})] \qquad (2.103)$$

with

$$\beta_{k_1k_2k_3}(\mathbf{R}) = 2\pi\, (k_1 r_1 + k_2 r_2 + k_3 r_3). \qquad (2.104)$$

Here, the $k_1, k_2, k_3$ are the above mentioned real numbers which determine the representations uniquely. The factor $2\pi$ is introduced to simplify expressions which later will arise. To prove equation (2.103) we first show

$$|c(\mathbf{R})|^2 = 1 \qquad (2.105)$$
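(the subscript indices $k_1, k_2, k_3$ on $c(\mathbf{R})$ and $\beta(\mathbf{R})$ will be suppressed temporarily). The proof is based on the normalization of $\varphi_E$ according to equation (2.97), which leads to

$$\langle t_\mathbf{R}\varphi_E | t_\mathbf{R}\varphi_E \rangle = \langle \varphi_E | \varphi_E \rangle. \qquad (2.106)$$

This holds because the translation $t_\mathbf{R}$ of the integration variable through $\mathbf{R}$ in $\langle t_\mathbf{R}\varphi_E | t_\mathbf{R}\varphi_E \rangle$ may be absorbed by a change of variables jointly with an application of the periodicity of the wavefunction $\varphi_E(\mathbf{x})$ with respect to the periodicity region. On the other hand, it follows from relation (2.102) that

$$\langle t_\mathbf{R}\varphi_E | t_\mathbf{R}\varphi_E \rangle = |c(\mathbf{R})|^2\, \langle \varphi_E | \varphi_E \rangle. \qquad (2.107)$$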
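Considering (2.106) and (2.107) together, (2.103) is verified at once. Secondly, we show that

$$c(\mathbf{R}_1 + \mathbf{R}_2) = c(\mathbf{R}_1) \cdot c(\mathbf{R}_2) \qquad (2.108)$$

must hold. To prove this, we use the following obvious relations:

$$t_{\mathbf{R}_1 + \mathbf{R}_2} = t_{\mathbf{R}_1}\, t_{\mathbf{R}_2}, \qquad (2.109)$$

$$t_{\mathbf{R}_1 + \mathbf{R}_2}\, \varphi_E = c(\mathbf{R}_1 + \mathbf{R}_2)\, \varphi_E, \qquad (2.110)$$

$$t_{\mathbf{R}_1}\, t_{\mathbf{R}_2}\, \varphi_E = c(\mathbf{R}_1)\, c(\mathbf{R}_2)\, \varphi_E. \qquad (2.111)$$

Comparison of the last two relations immediately shows that (2.108) is also true. Employing (2.103) for $c(\mathbf{R})$, equation (2.108) now yields

$$\exp[i\beta(\mathbf{R}_1 + \mathbf{R}_2)] = \exp[i\beta(\mathbf{R}_1)]\, \exp[i\beta(\mathbf{R}_2)], \qquad (2.112)$$

whence

$$\beta(\mathbf{R}_1 + \mathbf{R}_2) = \beta(\mathbf{R}_1) + \beta(\mathbf{R}_2). \qquad (2.113)$$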
This means that $\beta(\mathbf{R})$ is a homogeneous linear function of the components $r_1, r_2, r_3$ of $\mathbf{R}$. As such $\beta(\mathbf{R})$ must have the form (2.104). We now proceed to the eigenfunctions of the translation operator. Employing $\varphi_{Ek_1k_2k_3}(\mathbf{x})$ to denote the eigenfunction having eigenvalue $c_{k_1k_2k_3}(\mathbf{R})$ of equation (2.103), we claim that $\varphi_{Ek_1k_2k_3}(\mathbf{x})$ can be written in the form

(2.114)

where $u_{Ek_1k_2k_3}(\mathbf{x})$ denotes a lattice-periodic function, such that, for any lattice vector $\mathbf{R}$,

$$u_{Ek_1k_2k_3}(\mathbf{x} - \mathbf{R}) = u_{Ek_1k_2k_3}(\mathbf{x}). \qquad (2.115)$$

The factor $1/\sqrt{\Omega}$ with $\Omega = G^3\Omega_0$ is introduced in (2.114) in order that the normalization integral of $u_{Ek_1k_2k_3}(\mathbf{x})$ with respect to a primitive unit cell be 1. The proof of (2.114) may be carried out by verifying that functions of the form (2.114) obey equation (2.102). This results in

(2.116)

which we know to be true. The functions $\varphi_{Ek_1k_2k_3}(\mathbf{x})$ of (2.114) are referred to as Bloch functions and the factors $u_{Ek_1k_2k_3}(\mathbf{x})$ as Bloch factors. Recalling that the $\varphi_{Ek_1k_2k_3}(\mathbf{x})$ are also simultaneously eigenfunctions of the Hamiltonian, we may express the Bloch theorem in a somewhat more specific form than we did above:

The eigenfunctions of the one-electron Hamiltonian of a crystal can be chosen in the form (2.114).

The real numbers $k_1, k_2, k_3$ characterizing the various Bloch functions may be understood as components of a vector. However, as we will soon see, this is not a vector in coordinate space, but one in a space which is reciprocal to coordinate space.
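The statements proved above, namely that a lattice translation multiplies a Bloch function by a phase factor of modulus 1 and that these phase factors multiply when translations are combined, can be checked on a one-dimensional toy example. The Python sketch below assumes a Bloch function built as a plane-wave factor times a lattice-periodic factor, with the translation acting as in (2.83). The lattice constant, the periodic function, the sign convention of the exponent and all names are assumptions made only for illustration; the checks do not depend on that sign convention.

```python
import cmath
import math

A_LATTICE = 1.0  # lattice constant of a hypothetical 1D crystal

def bloch_factor(x):
    """Lattice-periodic function: unchanged when x is shifted by A_LATTICE."""
    phase = 2.0 * math.pi * x / A_LATTICE
    return 1.0 + 0.3 * math.cos(phase) + 0.1j * math.sin(2.0 * phase)

def bloch_function(x, k):
    """Plane wave times lattice-periodic factor; k is measured in units of 2*pi/A_LATTICE."""
    return cmath.exp(2j * math.pi * k * x / A_LATTICE) * bloch_factor(x)

def translated(x, r, k):
    """Translation through r lattice constants acting as (t_R f)(x) = f(x - R)."""
    return bloch_function(x - r * A_LATTICE, k)

def translation_eigenvalue(r, k, x=0.37):
    """Ratio of the translated function to the original one, evaluated at the point x."""
    return translated(x, r, k) / bloch_function(x, k)

k = 0.23
c1 = translation_eigenvalue(1, k)
c2 = translation_eigenvalue(2, k)
c3 = translation_eigenvalue(3, k)

# The ratio must not depend on the point where it is evaluated (eigenfunction property).
c1_elsewhere = translation_eigenvalue(1, k, x=-1.9)
```

The ratio `c1` is the same at every point, has modulus 1, and satisfies `c3 == c1 * c2` up to rounding, in line with relations (2.102), (2.105) and (2.108).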
2.3.3
The starting point for understanding the nature of the components $k_1, k_2, k_3$ is their transformation law under point symmetry transformations $\alpha$ in coordinate space. This will now be explored.
Transformation properties of $k_1, k_2, k_3$
At the outset, it is clear from equation (2.99) that both p ~ k ~ k ~ k andx ~ ( a q E k l k z k 3 (X) are degenerate eigenfunctions of the Hamiltonian H with the same eigenvalue E . The question arises whether a p , g k l k Z k 3 ( x ) is also an eigenfunction of the translation operator and, if so, what eigenvalue of t R pertains to it. In order to answer this question, we form tRfffpEklkzk3 = t ~ p ~ k ~ k ~ k ~ (and 'obtain the equation a  ~ )
= e x ~ l i P k , k ~ k  , ( ~ ~ ~ ~ R ) l ~ ~ ' ~ ~ k ~ k(2.117) ~ ) . ~ k ~ (
k ~ ( ~ ) The latter relation means that a p ~ k ~ k ~ is indeed an eigenfunction of ~R)]. the translation operator t R with eigenvalue e ~ p [ i P k ~ k ~ k ~ ( a Evaluating ,f3klkZ~3(a1R) explicitly using equations (2.104) and (2.82), we have
(2.118)
This expression is subject to an interpretation which differs from the previous one and which has important consequences for later developments. The new interpretation is that the transformation, which hitherto operated on the vector components of $\mathbf{R}$, should now be understood to operate on the real numbers $k_1, k_2, k_3$ instead. Of course, the new interpretation cannot change the value of the expression (2.118). To assure this, the $k_j$ must transform according to the transposed matrix $(\alpha^{-1})^T_{ij}$ of $\alpha^{-1}$ instead of the matrix $\alpha^{-1}_{ij}$, which applies in the case of the components $r_j$ of $\mathbf{R}$. This is to say that
(2.119)
must hold. One may therefore write
(2.121)
This equation means that the transformation $\alpha$, which originally operated on the wavefunction $\varphi_{E k_1 k_2 k_3}(\mathbf{x})$, has been transposed to operate on the $k_j$, which were initially introduced as real numbers without any particular transformation behavior. The transformation properties of the $k_j$ thus defined are similar, but not identical, to those of vector components. The difference lies in the fact that it is not the matrix $\alpha_{ij}$ itself that multiplies the column vector of the components, but rather the transposed inverse matrix $(\alpha^{-1})^T$. We will now prove that this is exactly the way the components of a vector transform in a space reciprocal to the ordinary space of position vectors $\mathbf{x}$.
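The role of the transposed inverse can be checked numerically. The following is only a minimal sketch: the 2x2 matrix (a 60 degree rotation of a plane hexagonal lattice written in its oblique basis), the numbers `k` and `r`, and all function names are hypothetical illustration choices, not quantities taken from the text. It shows that the sum over $k_j$ times the transformed components of R keeps its value when the transposed inverse matrix is moved onto the $k_j$, and that using the untransposed matrix instead gives a different number.

```python
# Sketch: the phase keeps its value whether alpha^{-1} acts on the lattice
# components r_j or its transpose acts on the numbers k_j.
# Matrix and numbers are hypothetical illustration values.

def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def transpose(m):
    """Return the transposed matrix as a list of rows."""
    return [list(col) for col in zip(*m)]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# 60 degree rotation of a 2D hexagonal lattice, expressed in the oblique
# basis a1, a2; in this basis the matrix is not orthogonal.
alpha = [[0, -1], [1, 1]]
alpha_inv = inverse_2x2(alpha)

k = [0.3, -0.7]   # real numbers k_1, k_2
r = [2, 5]        # integer components of a lattice vector R

phase_on_r = dot(k, mat_vec(alpha_inv, r))             # alpha^{-1} acts on r
phase_on_k = dot(mat_vec(transpose(alpha_inv), k), r)  # transposed inverse acts on k
phase_wrong = dot(mat_vec(alpha_inv, k), r)            # untransposed matrix on k

print(phase_on_r, phase_on_k, phase_wrong)
```

For an orthogonal matrix the transposed inverse coincides with the matrix itself, which is why the distinction only becomes visible in a non-orthogonal basis such as the one assumed here.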
Definition of the reciprocal vector space
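The space of position vectors $\mathbf{x}$ is defined through its basis $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$. The corresponding reciprocal vector space is determined by a basis $\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3$ said to be reciprocal to the original basis $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$. The reciprocal basis is defined by the set of equations
$$\mathbf{a}_i \cdot \mathbf{b}_j = 2\pi\,\delta_{ij}. \tag{2.122}$$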
These equations state that the vectors of the reciprocal basis are normal to, respectively, one of the three planes spanned by pairs of direct basis vectors. Among themselves, the $\mathbf{b}_j$ are in general nonorthogonal to the extent that the $\mathbf{a}_i$ are mutually nonorthogonal. The lengths of the reciprocal basis vectors $\mathbf{b}_j$ are determined by the three equations which follow from (2.122) for $j = i$. Altogether, the equation set (2.122) determines the reciprocal basis vectors $\mathbf{b}_j$ uniquely, as follows:
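$$\mathbf{b}_1 = \frac{2\pi}{\Omega_0}\,[\mathbf{a}_2 \times \mathbf{a}_3], \qquad \mathbf{b}_2 = \frac{2\pi}{\Omega_0}\,[\mathbf{a}_3 \times \mathbf{a}_1], \qquad \mathbf{b}_3 = \frac{2\pi}{\Omega_0}\,[\mathbf{a}_1 \times \mathbf{a}_2]. \tag{2.123}$$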
Here $\Omega_0 = \mathbf{a}_1 \cdot [\mathbf{a}_2 \times \mathbf{a}_3]$ is the volume of a primitive unit cell of the crystal lattice. If one subjects the direct basis to a rotation or reflection, this induces the same rotation or reflection of the reciprocal basis, as follows immediately from the defining equations (2.122) or (2.123). Because of these relations the 'reciprocal' basis $(\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3)$ is rigidly joined with the direct basis $(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3)$ and will transform when the latter is transformed. However, since the reciprocal basis is different from the direct one (only if $(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3)$ are orthogonal unit vectors are the two basis sets the same, up to the factor $2\pi$), the matrix which describes the rotation or reflection of the $\mathbf{a}_i$ is different from that which transforms the $\mathbf{b}_j$. Using equation (2.123) one may easily show that the $\mathbf{b}_j$ are transformed with the transposed inverse of the transposed matrix $\alpha^T$ which transforms the $\mathbf{a}_i$, that is, with the inverse matrix $\alpha^{-1}$. In other words, the inverse matrix describes the same rotation or reflection of the reciprocal basis as the transposed matrix does for the direct basis. Consider now an arbitrary vector $\mathbf{k}$ of reciprocal vector space, which can be written as
$$\mathbf{k} = \sum_i k_i\,\mathbf{b}_i. \tag{2.124}$$
As we know, the vector components themselves transform in accordance with the transpose of the transformation of the basis vectors. Since the transformation of the reciprocal basis calls for the inverse matrix, the components $k_i$ will transform with the transposed inverse matrix $(\alpha^{-1})^T$. This verifies our earlier assertion that equation (2.121) describes the transformation of the components of a reciprocal vector. Using this result, we obtain
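$$\sum_i k'_i\,\mathbf{b}_i = \alpha\mathbf{k}, \tag{2.125}$$
where the $k'_i$ are the transformed components of (2.121).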
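The construction of the reciprocal basis from the cross products (2.123) can be tried out directly. The sketch below is illustrative only: the face centered cubic primitive vectors with cube edge `a = 1` and the helper names are assumptions made for the example. It verifies the defining relations (2.122) and also checks that applying the same construction to the reciprocal basis leads back to the direct basis.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def reciprocal_basis(a1, a2, a3):
    """Return (b1, b2, b3) from the cross-product formulas (2.123)."""
    omega0 = dot(a1, cross(a2, a3))      # (signed) volume of the primitive cell
    scale = 2 * math.pi / omega0
    pairs = ((a2, a3), (a3, a1), (a1, a2))
    return tuple([scale * c for c in cross(u, v)] for u, v in pairs)

a = 1.0  # hypothetical cube edge
direct = ([0, a / 2, a / 2], [a / 2, 0, a / 2], [a / 2, a / 2, 0])  # fcc primitive vectors
recip = reciprocal_basis(*direct)

# Rows of a_i . b_j / (2*pi); relation (2.122) requires the unit matrix.
for ai in direct:
    print([round(dot(ai, bj) / (2 * math.pi), 12) for bj in recip])

# Applying the construction a second time should return the direct basis.
back = reciprocal_basis(*recip)
print(back)
```

For this choice of direct basis the first reciprocal vector comes out as $(2\pi/a)(-1, 1, 1)$, and the other two follow by cyclic permutation of the position of the minus sign.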
The reciprocal vector $\mathbf{k}$ of (2.124) may be used to express the phase $\beta_{k_1 k_2 k_3}(\mathbf{R})$ of the eigenvalue (2.103) of the translation operator in the more compact form
$$\beta_{k_1 k_2 k_3}(\mathbf{R}) = \mathbf{k} \cdot \mathbf{R}. \tag{2.126}$$
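Using this expression the eigenvalue of the translation operator (2.103) becomes
$$e^{i\mathbf{k}\cdot\mathbf{R}}, \tag{2.127}$$
and the Bloch function (2.114) takes the form
$$\varphi_{E\mathbf{k}}(\mathbf{x}) = \frac{1}{\sqrt{\Omega}}\,e^{i\mathbf{k}\cdot\mathbf{x}}\,u_{E\mathbf{k}}(\mathbf{x}) \tag{2.128}$$
with
$$u_{E\mathbf{k}}(\mathbf{x} + \mathbf{R}) = u_{E\mathbf{k}}(\mathbf{x}). \tag{2.129}$$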
A further consideration illuminates the physical meaning of the reciprocal vector $\mathbf{k}$. For this purpose we first assume the potential $V(\mathbf{x})$ to be a constant independent of $\mathbf{x}$, appropriate to a free electron. All the above-mentioned results hold, of course, in this case. But there is more. In this case translations through arbitrary vectors $\mathbf{r}$ and rotations about any axis and through arbitrary angles are symmetry operations, for they do not change the potential and, consequently, also leave the Hamiltonian invariant. Therefore,
$$u_{E\mathbf{k}}(\mathbf{x} + \mathbf{r}) = u_{E\mathbf{k}}(\mathbf{x}) \tag{2.130}$$
for arbitrary vectors $\mathbf{r}$. This means that the Bloch factor $u_{E\mathbf{k}}(\mathbf{x})$ is a constant, and the eigenfunctions of the Hamiltonian are of the form
$$\varphi_{E\mathbf{k}}(\mathbf{x}) = \frac{1}{\sqrt{\Omega}}\,e^{i\mathbf{k}\cdot\mathbf{x}}. \tag{2.131}$$
This is just the well-known result that the stationary states of a free electron may be taken as plane waves of a given wavevector $\mathbf{k}$. If the potential is not completely constant but, as happens in a crystal, remains constant only under translations through lattice vectors, the meaning of $\mathbf{k}$ as wavevector is largely preserved. It is therefore called a quasi-wavevector. Of course, the dimension of the quasi-wavevector is also that of a reciprocal length. The Bloch functions $\varphi_{E\mathbf{k}}(\mathbf{x})$ of (2.128) may be understood as travelling plane waves modulated spatially by the lattice-periodic Bloch factor $u_{E\mathbf{k}}(\mathbf{x})$. That the stationary one-electron states may be chosen in this form is the content of Bloch's theorem. This theorem does not say, however, that these states are necessarily modulated travelling plane waves, just as the stationary states of a free electron need not be travelling plane waves. One can also have standing plane waves, spherical waves etc.

Figure 2.4: Mesh of allowed points in k-space due to the periodic boundary conditions in a 2-dimensional model.
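Discretization of k-space
The periodicity condition (2.96) for the eigenstates of the Hamiltonian has largely been ignored so far. Fortunately, it can be satisfied very simply. The Bloch functions $\varphi_{E\mathbf{k}}(\mathbf{x})$ obey the relation
$$\varphi_{E\mathbf{k}}(\mathbf{x} + G\mathbf{a}_j) = e^{2\pi i G k_j}\,\varphi_{E\mathbf{k}}(\mathbf{x}), \tag{2.132}$$
so that the periodicity condition is satisfied if
$$k_j = \frac{l_j}{G} \quad \text{with integer } l_j, \quad j = 1, 2, 3. \tag{2.133}$$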
The k-vectors defined by (2.133) form a finely meshed net in k-space (see Figure 2.4). The only permissible k-vectors are points of this net, since the Bloch functions must be periodic with respect to the periodicity region. The k-space thus has a discrete structure; its set of points is countable. The distance between different k-vectors is, however, so small because of the large size of $G$ that $\mathbf{k}$ can practically be treated as a continuously varying quantity despite its discrete character. This approximation will be used extensively later.
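A small numerical illustration of this discretization may help. It is a sketch for a 1-dimensional chain, under the assumption that the allowed components are $k = (l/G)\,b$ with integer $l$ and $b = 2\pi/a$; the values of `a` and `G` and the function name are hypothetical. The code lists the mesh points inside one primitive cell of k-space and tests whether a plane wave with the given k repeats after the periodicity length $Ga$.

```python
import cmath
import math

a = 1.0               # hypothetical lattice constant of a 1D chain
G = 8                 # hypothetical number of primitive cells in the periodicity region
b = 2 * math.pi / a   # reciprocal basis vector, from a * b = 2*pi

# Allowed wavevectors k = (l / G) * b inside one primitive cell of k-space.
allowed_k = [l / G * b for l in range(G)]

def is_periodic(k):
    """True if exp(i k x) repeats after the periodicity length G * a."""
    return abs(cmath.exp(1j * k * G * a) - 1) < 1e-9

print(all(is_periodic(k) for k in allowed_k))   # mesh points satisfy the condition
print(is_periodic(0.5 / G * b))                 # a point halfway between two mesh points
print(allowed_k[1] - allowed_k[0], b / G)       # mesh spacing b / G
```

The number of mesh points per primitive cell of k-space equals the number G of primitive cells in the periodicity region, and the spacing b/G shrinks as G grows, which is the basis for treating k as quasi-continuous.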
Reciprocal lattice
Just as in direct space, one may also consider a point lattice in reciprocal space. In this context, the so-called reciprocal lattice is especially useful. It is defined by taking the reciprocal basis vectors $\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3$ of (2.123) as its primitive lattice vectors. A reciprocal lattice vector $\mathbf{K}$ is then given by the relation
$$\mathbf{K} = \sum_{j=1}^{3} K_j\,\mathbf{b}_j \tag{2.135}$$
with arbitrary integers $K_j$. As in coordinate space, one also has 14 different lattice types in reciprocal space, namely the 14 Bravais lattices. The reciprocal lattices bear a close relation to the corresponding direct lattices. The following properties can be verified easily: (i) A reciprocal lattice has the same point symmetry as the direct lattice with which it is associated. (ii) The reciprocal lattice of a reciprocal lattice is the direct lattice. (iii) If the direct lattice is primitive, then the same holds for the corresponding reciprocal lattice. The reciprocal lattice of a centered direct lattice is also centered. The Bravais types of primitive direct lattices are the same as those of their reciprocal lattices, while the Bravais types of centered direct lattices and of their reciprocal lattices differ in certain cases. For example, a body centered cubic reciprocal lattice corresponds to a face centered cubic direct lattice, and, conversely, a face centered cubic reciprocal lattice corresponds to a body centered cubic direct lattice.
2.3.4
We have not as yet directed attention to the question of whether there may be a connection between the quasi-wavevector $\mathbf{k}$ and the energy eigenvalue $E$, and what form it might take. In fact such a connection does exist, and it will now be explored in some detail. That a Bloch function $\varphi_{E\mathbf{k}}$ for a given wavevector $\mathbf{k}$ cannot be an eigenfunction of $H$ for an arbitrary energy eigenvalue $E$ may be seen as follows. If one substitutes $\varphi_{E\mathbf{k}}(\mathbf{x})$ of (2.128) into the Schrödinger equation (2.95), one obtains the following eigenvalue equation for $u_{E\mathbf{k}}(\mathbf{x})$:
$$\left[\frac{1}{2m}\,(\mathbf{p} + \hbar\mathbf{k})^2 + V(\mathbf{x})\right] u_{E\mathbf{k}}(\mathbf{x}) = E\,u_{E\mathbf{k}}(\mathbf{x}). \tag{2.136}$$
The k-dependence of the operator in this equation transfers to the eigenvalue $E$, i.e. $E$ becomes a function $E = E(\mathbf{k})$ of the quasi-wavevector $\mathbf{k}$. In the case of a free particle with $V(\mathbf{x}) = 0$, one has $E(\mathbf{k}) = (\hbar^2/2m)\,k^2$. For a crystal one expects, of course, more complicated dependencies than this. The specific form of the function $E(\mathbf{k})$ is determined by the particular shape of the periodic potential $V(\mathbf{x})$. Some general properties of $E(\mathbf{k})$ follow, however, just from the space and time symmetry of $V(\mathbf{x})$. We will discuss this now, beginning with point symmetry, whose implications are relatively easy to examine.
Point symmetry
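We consider the Schrödinger equation (2.95), on the one hand, for the quasi-wavevector $\alpha\mathbf{k}$, where $\alpha$ is taken to be an arbitrary element of the point group of the crystal:
$$H\,\varphi_{E\alpha\mathbf{k}}(\mathbf{x}) = E(\alpha\mathbf{k})\,\varphi_{E\alpha\mathbf{k}}(\mathbf{x}), \tag{2.137}$$
and, on the other hand, for the quasi-wavevector $\mathbf{k}$, but with an application of the transformation $\alpha$ as follows:
$$H\,\alpha\varphi_{E\mathbf{k}}(\mathbf{x}) = E(\mathbf{k})\,\alpha\varphi_{E\mathbf{k}}(\mathbf{x}). \tag{2.138}$$
The relations (2.120) and (2.124), derived in the preceding subsection, mean that $\alpha\varphi_{E\mathbf{k}}$ is an eigenfunction of $H$ for quasi-wavevector $\alpha\mathbf{k}$. If there is only one eigenfunction for a given wavevector, which we initially take to be the case, $\alpha\varphi_{E\mathbf{k}}(\mathbf{x})$ must be identical with $\varphi_{E\alpha\mathbf{k}}(\mathbf{x})$, because $\varphi_{E\alpha\mathbf{k}}(\mathbf{x})$ is by definition the eigenfunction for $\alpha\mathbf{k}$. Thus,
$$E(\alpha\mathbf{k}) = E(\mathbf{k}). \tag{2.140}$$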
The assumption that there is only one eigenfunction of the Hamiltonian for the quasi-wavevector $\mathbf{k}$ holds for almost all $\mathbf{k}$. There is an exception for those special k-values for which $\alpha\mathbf{k}$ does not differ from $\mathbf{k}$, or from a vector $\mathbf{k} + \mathbf{K}$ equivalent to $\mathbf{k}$, for all $\alpha$. One refers to such vectors as symmetrical k-vectors. They will be considered in section 2.5, but we exclude symmetrical k-vectors at this point. For nonsymmetrical k-vectors, the fact that there is only one eigenfunction of $H$ follows from the one-dimensionality of the irreducible representations of the translation group (see Appendix A).
A property of the Hamiltonian $H$ which we have not employed thus far is its invariance under time reversal. In order for the Schrödinger equation to remain unchanged under time reversal, the time reversed wavefunction must be defined as the complex conjugate of the original wavefunction. Then the time reversed Bloch function of wavevector $\mathbf{k}$ has wavevector $-\mathbf{k}$. Since both Bloch functions have the same eigenvalue, one obtains
$$E(\mathbf{k}) = E(-\mathbf{k}). \tag{2.141}$$
This relation has additional significance, beyond that of relation (2.140) obtained by means of spatial symmetry, only if the point group of the crystal does not contain the inversion.
$$\mathbf{K} \cdot \mathbf{R} = 2\pi \sum_j K_j\,r_j \tag{2.143}$$
with $K_j$ and $r_j$ as integers. Therefore we obtain the full spectrum of eigenvalues of $t_\mathbf{R}$ if $\mathbf{k}$ varies within a particular primitive unit cell of k-space. The corresponding k-vectors will be denoted by $\mathbf{k}_1$. The k-vectors of any other primitive unit cell differ from these by a reciprocal lattice vector $\mathbf{K}$ and thus do not lead to new eigenvalues of the translation operator. However, the energy eigenvalues $E(\mathbf{k})$ do differ, so that $E(\mathbf{k}_1) \neq E(\mathbf{k}_1 + \mathbf{K})$ holds in general. Table 2.3 illustrates these connections. This asymmetry between the k-dependence of the eigenvalues $e^{i\mathbf{k}\cdot\mathbf{R}}$ of $t_\mathbf{R}$ and that of $E(\mathbf{k})$ is the reason for the introduction of two different representation schemes for the energy eigenvalue function $E(\mathbf{k})$. In the representation scheme employed thus far, $\mathbf{k}$ varies over the entire reciprocal space, and to each vector $\mathbf{k}$ we associate only one k-dependent energy function $E(\mathbf{k})$. With respect to the eigenvalues of the translation operator, this description is highly redundant, since one encounters the same eigenvalue an infinite number of times. This disadvantage is eliminated if the selection of the region over which $\mathbf{k}$ varies is not fitted to the energy $E(\mathbf{k})$ but rather to the eigenvalues of the translation operator, i.e., if $\mathbf{k}$ is allowed to vary only over a primitive unit cell rather than over the whole k-space. Then we must accept, however, that an infinite number of different energy eigenvalues are assigned to each k-vector, namely all those which follow from $E(\mathbf{k})$ by means of the prescription
$$E_\mathbf{K}(\mathbf{k}_1) = E(\mathbf{k}_1 + \mathbf{K}), \tag{2.144}$$
where $\mathbf{k}_1$ is a vector of the primitive unit cell at $\mathbf{K} = 0$. In this way the single-valued function $E(\mathbf{k})$ over all k-space is transformed to an infinitely multivalued function $E_\mathbf{K}(\mathbf{k}_1)$ over a primitive unit cell. An analogous statement holds for the eigenfunctions of the Hamiltonian. Each Bloch function $\varphi_{E\mathbf{k}}$ over all k-space corresponds to an infinite number of different Bloch functions $\varphi_{E_\mathbf{K}\mathbf{k}_1}(\mathbf{x}) = \varphi_{E(\mathbf{k}_1+\mathbf{K})\,\mathbf{k}_1+\mathbf{K}}(\mathbf{x})$. One calls the representation involving the entire k-space the extended zone scheme, and that involving the primitive unit cell at $\mathbf{K} = 0$ the reduced zone scheme. For most problems in semiconductor physics the reduced scheme is more convenient than the extended scheme.

Table 2.3: Representation of band structure in the extended and reduced zone schemes.

Wavevector                                  Eigenvalue of t_R      Eigenvalue of H      Representation
k = k_1 + K from infinite space             exp[i k_1 . R]         E(k)                 Extended zone scheme
k_1 from primitive unit cell at K = 0       exp[i k_1 . R]         E_K(k_1)             Reduced zone scheme

Later, we will demonstrate that the energy $E(\mathbf{k})$, as a function over all k-space, is not continuous everywhere but has discontinuities at certain k-planes. Anticipating this, one may then inquire whether one can choose the primitive unit cell of the reduced scheme at $\mathbf{K} = 0$ in such a way that the discontinuities of the branches $E_\mathbf{K}(\mathbf{k})$ do not occur in the interior, but only on the surface of the primitive unit cell. If such a special primitive unit cell exists at all, it must have at least the symmetry of the point group of directions, as follows from the symmetry relation (2.140). This means that only the Wigner-Seitz cell of the reciprocal lattice is a possible candidate for such a primitive unit cell. Of course, its point group is that of the crystal lattice and, therefore, larger than the requisite point group of directions.
We know, however, that lattices have fewer point groups than there are point groups of equivalent crystal directions. The next-lower lattice point group would not contain the needed point group of directions. The issue in question thus revolves upon whether the function branches $E_\mathbf{K}(\mathbf{k})$ are continuous in the interior of the Wigner-Seitz cell of the reciprocal lattice at $\mathbf{K} = 0$. We will prove in the next section that this is indeed the case for the branch $E_0(\mathbf{k}_1)$. Moreover, it will be shown that the other branches $E_\mathbf{K}(\mathbf{k}_1)$ of the infinitely multivalued function $E(\mathbf{k})$ may be redefined in terms of new branches such that each single-valued branch is also continuous over the Wigner-Seitz cell at $\mathbf{K} = 0$. One calls these continuous branches energy bands, and the Wigner-Seitz cell of the reciprocal lattice at $\mathbf{K} = 0$ is the first Brillouin zone. Having started from general considerations and obvious conjectures, we thus arrive at the important conclusion that the spectrum of allowed energy values of crystal electrons will have the form of energy bands, and that these energy bands arise by varying the quasi-wavevector $\mathbf{k}$ over the first Brillouin zone. We will discuss the concept of Brillouin zones along with the still outstanding proof of the continuity of $E(\mathbf{k})$ over these zones more fully in the next section.
2.4
It is well known that the Schrödinger equation can be solved approximately if the potential, or a part of it, represents a small effect which may be treated by means of perturbation theory. To apply this procedure, the solution of the unperturbed problem must be known. At this point, we will treat the entire periodic potential by means of perturbation theory. Even if it is not to be expected that this will yield results which are quantitatively accurate, some qualitative understanding can be gained in this way. If one takes the periodic potential as a perturbation, then the unperturbed problem is that of a completely free electron. The corresponding Hamiltonian $H_0$ is simply the kinetic energy of the electron, i.e.
$$H_0 = \frac{\mathbf{p}^2}{2m}. \tag{2.145}$$
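The perturbing operator $H_1$, which together with $H_0$ forms the total Hamiltonian
$$H = H_0 + H_1, \tag{2.146}$$
is the periodic potential,
$$H_1 = V(\mathbf{x}). \tag{2.147}$$
We seek expansions of the eigenfunctions $\varphi_\mathbf{k}$,
$$\varphi_\mathbf{k} = \varphi^0_\mathbf{k} + \delta\varphi^1_\mathbf{k} + \delta\varphi^2_\mathbf{k} + \ldots, \tag{2.148}$$
and eigenvalues $E(\mathbf{k})$,
$$E(\mathbf{k}) = E^0(\mathbf{k}) + \delta E_1(\mathbf{k}) + \delta E_2(\mathbf{k}) + \ldots \tag{2.149}$$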
with respect to the perturbation potential, here with respect to the periodic potential $V(\mathbf{x})$. In this discussion, we suppress the energy index $E$ of the eigenfunctions for the sake of brevity. This leads to no ambiguity since we work in the extended zone scheme, where the energy is a unique function of the quasi-wavevector $\mathbf{k}$. The zeroth order terms obey the equation
$$H_0\,\varphi^0_\mathbf{k} = E^0(\mathbf{k})\,\varphi^0_\mathbf{k}. \tag{2.150}$$
In order that the perturbation theory be applicable, the zeroth order solutions must form a complete orthonormal set of functions, such that
$$\langle \varphi^0_{\mathbf{k}'} \,|\, \varphi^0_\mathbf{k} \rangle = \delta_{\mathbf{k}'\mathbf{k}} \tag{2.151}$$
and
$$\sum_\mathbf{k} \varphi^{0*}_\mathbf{k}(\mathbf{x}')\,\varphi^0_\mathbf{k}(\mathbf{x}) = \delta(\mathbf{x} - \mathbf{x}') \tag{2.152}$$
must hold. The k-summation of (2.152) is extended over all points of the infinite finely meshed net of Figure 2.4. The two relations (2.151) and (2.152) are in fact valid for the solutions
$$\varphi^0_\mathbf{k}(\mathbf{x}) = \frac{1}{\sqrt{\Omega}}\,e^{i\mathbf{k}\cdot\mathbf{x}} \tag{2.153}$$
of equation (2.150). The validity of the orthonormality relation (2.151) may be easily verified by direct calculation, and the completeness relation (2.152) is proved in the theory of Fourier series. The energy eigenvalue $E^0(\mathbf{k})$ of zeroth order for $\varphi^0_\mathbf{k}$ is
$$E^0(\mathbf{k}) = \frac{\hbar^2}{2m}\,k^2. \tag{2.154}$$
We will assume at the outset that this eigenvalue is not degenerate, more exactly, that this holds for k-values for which the perturbation operator $V$ has non-vanishing matrix elements $\langle \varphi^0_{\mathbf{k}'} | V | \varphi^0_\mathbf{k} \rangle$ with any vector $\mathbf{k}'$. It is not possible to exclude degeneracy completely because wavevectors $\mathbf{k}$ of the same length lead to the same energy eigenvalue (2.154).
2.4.1
(2.155)
(2.156)
The energy correction $\delta E_1(\mathbf{k})$ of the first order is the average value $\langle \varphi^0_\mathbf{k} | V | \varphi^0_\mathbf{k} \rangle$ of the periodic potential. This value may be set equal to zero. The off-diagonal matrix elements of the potential involved in (2.155), (2.156) are given by
(2.157)
Because of the lattice periodicity of $V(\mathbf{x})$ we can transform the integral over the periodicity region in (2.157) into a sum of integrals over the primitive unit cell $\Omega_0(0)$ at $\mathbf{R} = 0$:
Noting that
(2.159)
Using this result, the correction $\delta\varphi^1_\mathbf{k}$ to the wavefunction takes the form
(2.162)
and the total wavefunction
$$\varphi_\mathbf{k} = \varphi^0_\mathbf{k} + \delta\varphi^1_\mathbf{k}. \tag{2.163}$$
One recognizes that this has the expected Bloch form (2.128) for energy eigenfunctions of the crystal Hamiltonian. Moreover, (2.163) yields the Bloch factor as
(2.164)
The second order energy correction $\delta E_2(\mathbf{k})$ may be determined using (2.161) as
(2.165)
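The behaviour of the second order correction can be explored with a short numerical sketch. It assumes the standard nondegenerate perturbation result, in which the second order shift is the sum over $\mathbf{K} \neq 0$ of $|V(\mathbf{K})|^2 / [E^0(\mathbf{k}) - E^0(\mathbf{k}+\mathbf{K})]$; this form is an assumption made here for illustration. The 1-dimensional lattice, the units with $\hbar^2/2m = 1$, the cosine potential and all names are likewise hypothetical.

```python
import math

HBAR2_OVER_2M = 1.0     # units with hbar^2 / 2m = 1 (assumption)
A = 1.0                 # hypothetical lattice constant
B = 2 * math.pi / A     # shortest reciprocal lattice vector

def e0(k):
    """Free-electron energy of zeroth order."""
    return HBAR2_OVER_2M * k * k

def delta_e2(k, fourier):
    """Assumed second order shift: sum over K != 0 of |V(K)|^2 / (E0(k) - E0(k+K)).
    `fourier` maps the integer n in K = n * B to the Fourier coefficient V(K)."""
    return sum(abs(v) ** 2 / (e0(k) - e0(k + n * B))
               for n, v in fourier.items() if n != 0)

# Hypothetical cosine potential V(x) = 0.2 * cos(2*pi*x/A): only V(+B) and V(-B) are nonzero.
fourier = {1: 0.1, -1: 0.1}

for k in (0.0, -0.5 * math.pi, -0.9 * math.pi, -0.999 * math.pi):
    print(f"k = {k:+.4f}   dE2 = {delta_e2(k, fourier):+.6f}")
```

As k approaches $-\pi/a$, the denominator belonging to $K = 2\pi/a$ tends to zero and the computed shift grows without bound, which signals that the nondegenerate treatment is no longer adequate there.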
The corrections to the eigenfunctions (2.162) and the energy eigenvalues (2.165) will become large if the denominator in these expressions becomes small. Actually, the denominator may even vanish when
$$E^0(\mathbf{k}) - E^0(\mathbf{k} + \mathbf{K}) = 0. \tag{2.166}$$
In such a case, the basic premise for validity of nondegenerate perturbation theory is violated, namely that there be no degeneracy among the unperturbed energy eigenvalues for those k-values for which the matrix elements of $V(\mathbf{x})$ do not vanish. However, just such a degeneracy exists if equation (2.166) holds, with the states $\varphi^0_\mathbf{k}$ and $\varphi^0_{\mathbf{k}+\mathbf{K}}$ degenerate with each other. Below, we will accommodate this case and discuss the corresponding degenerate perturbation theory, but first we analyze the k-values for which the condition (2.166) is fulfilled. Using (2.154) for $E^0(\mathbf{k})$, we get from (2.166)
$$\mathbf{k} \cdot \mathbf{K} + \frac{1}{2}\,|\mathbf{K}|^2 = 0. \tag{2.167}$$
This equation can be solved geometrically. It is obvious from Figure 2.5 that the tips of the corresponding k-vectors lie on a plane perpendicular to the vectors $\mathbf{K}$ or $-\mathbf{K}$, with this plane intersecting the vector $-\mathbf{K}$ at its half-length. Such planes are familiar from the theory of X-ray diffraction by crystal lattices. They are called Bragg reflection planes. The nondegenerate perturbation-theoretic results discussed above may be interpreted quite clearly if one regards the problem of the calculation of the eigenfunctions in the periodic potential as a scattering problem. In this, the unperturbed eigenfunctions $\varphi^0_\mathbf{k}$ are thought of as incoming waves and the perturbed eigenfunctions $\varphi_\mathbf{k}$ as outgoing waves, $\delta\varphi_\mathbf{k}$ corresponding to the scattered part of the latter. That the perturbation theory according to formula (2.162) yields only minor corrections to the unperturbed eigenfunctions is related in this picture to the fact that the incoming and scattered plane waves involve different wavelengths, since $|\mathbf{k}| \neq |\mathbf{k} + \mathbf{K}|$, and correspondingly they are not capable of significant constructive interference.

[Figure 2.5: Bragg reflection plane for the reciprocal lattice vector K.]

Figure 2.6: Illustration of perturbation theory with respect to the periodic crystal potential in the vicinity of a Bragg reflection plane where the zeroth order energy levels are degenerate.
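The geometric content of the Bragg condition (2.167) can be made concrete with a few lines of code. This is a sketch only: the square reciprocal lattice, the particular vectors and the function names are hypothetical. It evaluates the left-hand side of (2.167) and compares it with $|\mathbf{k}+\mathbf{K}|^2 - |\mathbf{k}|^2$, which is proportional to the free-electron energy difference appearing in (2.166).

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def bragg_residual(k, K):
    """Left-hand side of the Bragg condition (2.167): k.K + |K|^2 / 2."""
    return dot(k, K) + 0.5 * dot(K, K)

def length_mismatch(k, K):
    """|k + K|^2 - |k|^2, proportional to E0(k + K) - E0(k) for a free electron."""
    shifted = [x + y for x, y in zip(k, K)]
    return dot(shifted, shifted) - dot(k, k)

K = [2 * math.pi, 0.0]               # hypothetical reciprocal lattice vector (square lattice, a = 1)
on_plane = [-math.pi, 0.37]          # tip on the plane through -K/2, perpendicular to K
off_plane = [-0.8 * math.pi, 0.37]   # tip away from that plane

for k in (on_plane, off_plane):
    print(bragg_residual(k, K), length_mismatch(k, K))
```

The second number is always twice the first, so the residual vanishes exactly when $\mathbf{k}$ and $\mathbf{k}+\mathbf{K}$ have the same length, that is, when the incoming and scattered plane waves have the same wavelength.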
For incoming waves with k-vectors close to the Bragg reflection plane of the reciprocal vector $\mathbf{K}$, the scattered part $\delta\varphi_\mathbf{k}$ contains a particularly large plane wave component of wavevector $\mathbf{k} + \mathbf{K}$. This is due to the fact that, in this case, the incoming plane wave and the scattered wave of wavevector $\mathbf{k} + \mathbf{K}$ have almost the same wavelengths. They are thus capable of constructive interference which strongly enhances the amplitude of this particular scattered wave. If $\mathbf{k}$ is located exactly on the Bragg reflection plane, one may understand the scattered wave with wavevector $\mathbf{k} + \mathbf{K}$ as the plane wave reflected at this plane in the sense of geometrical optics. This follows immediately from the construction of the Bragg reflection planes in Figure 2.5. The scattered wave of wavevector $\mathbf{k} + \mathbf{K}$ is strengthened by interference, which makes it the reflected wave (bear in mind that in the wave picture reflection represents an interference phenomenon). This interpretation can be universally applied to the propagation of plane waves of any kind in a system of scattering centers periodically ordered on a lattice, including X-rays. Actually, the above results for electron waves were discovered using X-ray radiation.
The amplitudes of plane waves reflected at Bragg planes, according to formula (2.162), formally become infinitely large. What is expected, however, is that the amplitudes of the reflected and incoming waves should be comparable. This contradiction arises from the misuse of nondegenerate perturbation theory, which is no longer applicable under the conditions of reflection. For k-vectors which obey the Bragg condition (2.167), one must apply degenerate perturbation theory.
2.4.2
$$\mathbf{k}_0 \cdot \mathbf{K} + \frac{1}{2}\,|\mathbf{K}|^2 = 0 \tag{2.168}$$
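holds for a particular reciprocal lattice vector $\mathbf{K}$. We denote a small deviation from $\mathbf{k}_0$ by $\Delta\mathbf{k}$, where 'small' means that $|\Delta\mathbf{k}| \ll |\mathbf{K}|$. A vector $\mathbf{k}$ of the form
$$\mathbf{k} = \mathbf{k}_0 + \Delta\mathbf{k} \tag{2.169}$$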
is then a vector close to the Bragg plane involved. The same holds for the vector $\mathbf{k} + \mathbf{K}$, as it lies close to the parallel plane in Figure 2.6. According to quantum mechanical perturbation theory for the case of degenerate zeroth order states, one proceeds as follows. The perturbed eigenfunctions $\varphi_\mathbf{k}$ are sought as linear combinations of the two (almost) degenerate eigenfunctions $\varphi^0_\mathbf{k}$ and $\varphi^0_{\mathbf{k}+\mathbf{K}}$ in zeroth order, writing
$$\varphi_\mathbf{k}(\mathbf{x}) = c_0\,\varphi^0_\mathbf{k}(\mathbf{x}) + c_\mathbf{K}\,\varphi^0_{\mathbf{k}+\mathbf{K}}(\mathbf{x}) \tag{2.170}$$
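with $c_0$ and $c_\mathbf{K}$ coefficients to be determined. In the lowest non-vanishing order of perturbation theory, the Schrödinger equation leads to the following set of equations:
$$\begin{pmatrix} E^0(\mathbf{k}) - E & V(\mathbf{K})^* \\ V(\mathbf{K}) & E^0(\mathbf{k}+\mathbf{K}) - E \end{pmatrix} \begin{pmatrix} c_0 \\ c_\mathbf{K} \end{pmatrix} = 0. \tag{2.171}$$
For the set to have a nontrivial solution, its determinant must vanish, whence
$$[E^0(\mathbf{k}) - E]\,[E^0(\mathbf{k}+\mathbf{K}) - E] - |V(\mathbf{K})|^2 = 0, \tag{2.172}$$
with the two solutions
$$E_\pm(\mathbf{k}) = \frac{1}{2}\left[E^0(\mathbf{k}) + E^0(\mathbf{k}+\mathbf{K})\right] \pm \sqrt{\frac{1}{4}\left[E^0(\mathbf{k}) - E^0(\mathbf{k}+\mathbf{K})\right]^2 + |V(\mathbf{K})|^2}. \tag{2.173}$$
For further evaluation of this expression, we assume that $\Delta\mathbf{k}$ is directed parallel to $\mathbf{K}$ (see Figure 2.6). Expanding the square root in powers of $|\Delta\mathbf{k}|$ and terminating the series with the second order, we find
$$E_\pm(\mathbf{k}) = E^0(\mathbf{k}_0) \pm |V(\mathbf{K})| + \frac{\hbar^2}{2m}\left[1 \pm \frac{\hbar^2 K^2}{4m\,|V(\mathbf{K})|}\right]\Delta k^2. \tag{2.174}$$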
Figure 2.7: The degeneracy of the two energy parabolas $E(\mathbf{k} + \Delta\mathbf{k})$ and $E(\mathbf{k} + \mathbf{K} + \Delta\mathbf{k})$ at the Bragg reflection plane $\Delta\mathbf{k} = 0$ (left part of figure) is removed by the periodic crystal potential. In the formerly continuous energy spectrum a gap arises at this plane (right part of the figure).
The remarkable feature of this result is that the 2-fold degenerate energy eigenvalue $E^0(\mathbf{k}_0) = E^0(\mathbf{k}_0 + \mathbf{K})$ at the Bragg reflection plane, i.e. for $\Delta\mathbf{k} = 0$, splits into two different eigenvalues which are shifted by $|V(\mathbf{K})|$ to higher and lower energies, respectively. Between them there is a region of width $2\,|V(\mathbf{K})|$ where no allowed energy values may exist (see Figure 2.7). This means that, in the formerly continuous energy spectrum of the free electron, an energy gap appears as a result of the periodic potential. This occurs at all Bragg reflection planes, provided that the corresponding Fourier components $V(\mathbf{K})$ of the periodic potential do not vanish. Except for the Bragg planes, the function $E(\mathbf{k})$ and the energy spectrum remain continuous. The result derived here by means of perturbation theory is also valid well beyond the framework of validity of this approximation. The following statement remains true also in the general case.
The energy function $E(\mathbf{k})$ is continuous everywhere in k-space, with the exception of the Bragg reflection planes where, in general, discontinuities occur and the energy spectrum exhibits gaps.
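The splitting at a Bragg plane can be reproduced numerically from the two-state problem alone. The sketch below assumes a 1-dimensional chain with lattice constant 1, units in which $\hbar^2/2m = 1$, and a hypothetical value for $|V(\mathbf{K})|$; the function names are illustrative. It returns the two roots of the vanishing 2x2 determinant for wavevectors $\mathbf{k} = \mathbf{k}_0 + \Delta\mathbf{k}$ near the plane.

```python
import math

HBAR2_OVER_2M = 1.0    # units with hbar^2 / 2m = 1 (assumption)
K = 2 * math.pi        # reciprocal lattice vector of a hypothetical chain with a = 1
K0 = -K / 2            # point on the Bragg plane belonging to K
V_K = 0.5              # hypothetical magnitude of the Fourier coefficient V(K)

def e0(k):
    """Free-electron energy of zeroth order."""
    return HBAR2_OVER_2M * k * k

def split_levels(dk, v=V_K):
    """Roots (E_minus, E_plus) of (E0(k) - E)(E0(k+K) - E) - v^2 = 0 at k = K0 + dk."""
    e_a, e_b = e0(K0 + dk), e0(K0 + dk + K)
    mean, half_diff = 0.5 * (e_a + e_b), 0.5 * (e_a - e_b)
    root = math.sqrt(half_diff ** 2 + v ** 2)
    return mean - root, mean + root

for dk in (-0.2, -0.05, 0.0, 0.05, 0.2):
    lower, upper = split_levels(dk)
    print(f"dk = {dk:+.2f}   E- = {lower:.4f}   E+ = {upper:.4f}   gap = {upper - lower:.4f}")
```

At dk = 0 the separation of the two roots equals twice the assumed $|V(\mathbf{K})|$, and moving away from the plane in either direction raises the upper root and lowers the lower one.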
In regard to the shape of the function $E_\pm(\mathbf{k})$ close to the Bragg planes, the relation (2.174) provides the following picture (see Figure 2.7). The two branches $E_+(\mathbf{k})$ and $E_-(\mathbf{k})$ approach $\Delta\mathbf{k} = 0$ with horizontal tangents to the limiting values $E^0(\mathbf{k}_0) + |V(\mathbf{K})|$ and $E^0(\mathbf{k}_0) - |V(\mathbf{K})|$, the upper branch with positive curvature, and the lower with a negative one. The latter observation follows from (2.174) since the applicability of perturbation theory is restricted in validity by $|V(\mathbf{K})| \ll \hbar^2 K^2/2m$, which means that the sign
2.5
Band structure
The perturbation-theory calculations of the preceding section were carried out in the extended zone scheme for $E(\mathbf{k})$. Alternatively, there is also the reduced scheme introduced in section 2.3, in which the quasi-wavevector $\mathbf{k}$ varies only over a primitive unit cell of reciprocal space. In this, each of an infinite number of choices for this cell can be used. Here, we will determine the choice of primitive unit cell of the reciprocal lattice in such a way that the desired property of the function $E(\mathbf{k})$ discussed above, namely that it have discontinuities only at Bragg reflection planes and otherwise be continuous, is described as simply as possible. Simplicity of the description calls for the discontinuities of $E(\mathbf{k})$ to occur only on the boundaries of the unit cell, with $E(\mathbf{k})$ continuous in the interior. The question of whether there is a primitive unit cell which guarantees that is equivalent to the question of whether there exists a primitive unit cell which is bounded by Bragg planes, but has no other such planes in its interior. The prescription for the construction of the Bragg planes (see section 2.4) makes it immediately clear that the Wigner-Seitz cell of the reciprocal lattice at $\mathbf{K} = 0$ has the desired property: it is bounded completely by planes which obey equation (2.167), namely Bragg planes, and in its interior it is devoid of such planes. This means that within the Wigner-Seitz cell of the reciprocal lattice at $\mathbf{K} = 0$, the function $E(\mathbf{k})$ is continuous. However, it remains to be clarified how the rest of k-space can be reduced to the Wigner-Seitz cell at $\mathbf{K} = 0$ in such a way that the resulting new function branches for $E(\mathbf{k})$ are also continuous over this cell. It is clear that this cannot be done by simply dividing k-space into Wigner-Seitz cells and then translating all cells not containing the origin through reciprocal lattice vectors back to the cell at $\mathbf{K} = 0$. The reason for this is that the noncentral Wigner-Seitz cells are cut by Bragg planes.
The correct procedure goes back to Brillouin and will be described below.
2.5.1
Brillouin zones
Definition
Consider, in k-space, centrosymmetric bodies which contain the origin and are entirely bounded by parts of Bragg reflection planes. These bodies are arranged and enumerated according to their volume. The smallest will be the Wigner-Seitz cell at $\mathbf{K} = 0$. The second, next-largest body, has the volume of two Wigner-Seitz cells, as one may easily demonstrate. It contains the first body jointly with the Bragg planes bounding it. If one removes the first body from the second, then one obtains a hollow body of the volume of one Wigner-Seitz cell, which is bounded by Bragg planes inside and outside, but contains no such planes in its interior. This construction may be carried further. Removing the $\nu$th body from the $(\nu + 1)$th, one again obtains a hollow body having the volume of the primitive unit cell in k-space. It is bounded by Bragg planes, and has no such planes in its interior. These difference bodies are called Brillouin zones (abbreviated BZs). The $\nu$th of these bodies is called the $\nu$th BZ; the Wigner-Seitz cell at $\mathbf{K} = 0$ is, accordingly, the first BZ.

Figure 2.8: The first 10 Brillouin zones of a square plane lattice. (After Brillouin, 1953).

In Figure 2.8 the first 10 Brillouin zones are shown for a 2-dimensional square reciprocal lattice. The first BZ for two important 3-dimensional lattices, i.e. the fcc lattice and the hexagonal lattice, are shown in Figure 2.9. We will derive the shape of the first of them below. Focusing on a point $\mathbf{k}_\nu$ of the $\nu$th BZ, a vector $\mathbf{K}(\mathbf{k}_\nu)$ of the reciprocal lattice can be attached to $\mathbf{k}_\nu$ such that $\mathbf{k}_\nu - \mathbf{K}(\mathbf{k}_\nu)$ represents a vector $\mathbf{k}_1$ of the first BZ,
$$\mathbf{k}_1 = \mathbf{k}_\nu - \mathbf{K}(\mathbf{k}_\nu). \tag{2.175}$$
We omit the formal proof for this important result here. For the square lattice it follows immediately from Figures 2.8 and 2.10. Its validity in the general 3-dimensional case is clearly plausible. If, instead of $\mathbf{k}_\nu$, another point $\mathbf{k}'_\nu$ of the $\nu$th BZ is chosen, another reciprocal lattice vector $\mathbf{K}(\mathbf{k}'_\nu)$ may be necessary in order that $\mathbf{k}'_\nu - \mathbf{K}(\mathbf{k}'_\nu)$ shall be a vector of the first BZ. The set of reciprocal lattice vectors $\mathbf{K}(\mathbf{k}_\nu), \mathbf{K}(\mathbf{k}'_\nu), \ldots$ which arises if $\mathbf{k}_\nu$ varies over the whole $\nu$th BZ is just the set of K-vectors which defines the internal boundary planes of the $\nu$th BZ. The correspondence of the points of the $\nu$th BZ to points of the first BZ given by equation (2.175) is unambiguous in both directions: each $\mathbf{k}_\nu$ corresponds exactly to one $\mathbf{k}_1$, and each $\mathbf{k}_1$ exactly to one $\mathbf{k}_\nu$. One terms this correspondence folding of the $\nu$th BZ into the first. The term displacing would probably provide a better description. During such folding or displacing, the external or internal boundary planes of the $\nu$th BZ may be translated to the interior of the first BZ. There, they border on other boundary planes of the $\nu$th BZ. This means that $\mathbf{k}_1$-vectors which lie immediately to the right and to the left of such planes will involve different reciprocal lattice vectors. The folding operation (2.175) provides the mechanism for changing from the extended description of the energy function $E(\mathbf{k})$ over all infinite k-space to the reduced description within the first BZ. The transfer of the function values is defined above by relation (2.144). It states that the value $E(\mathbf{k})$ at a point $\mathbf{k} = \mathbf{k}_\nu$ of the $\nu$th BZ is assigned to that particular point of the first BZ which arises from $\mathbf{k}_\nu$ by folding the $\nu$th BZ into the first.

Figure 2.9: The first Brillouin zones of the two Bravais lattices which apply to many semiconductors: (left) the face centered cubic lattice of diamond, zincblende and rocksalt structures; (right) the hexagonal lattice of wurtzite and selenium structures.
By this assignment a multivalued function $E_\nu(\mathbf{k}_1)$ is defined within the first BZ. The unique function $E(\mathbf{k})$ over the whole k-space is mapped into the multivalued function $E_\nu(\mathbf{k}_1)$ within the first BZ. Formally, this is expressed by the equation
$$E_\nu(\mathbf{k}_1) = E(\mathbf{k}_1 + \mathbf{K}(\mathbf{k}_\nu)) = E(\mathbf{k}_\nu). \tag{2.176}$$

Figure 2.10: Folding of the second through ninth Brillouin zones into the first BZ for the plane square lattice of Figure 2.8. (After Brillouin, 1953).
We maintain that the branches $E_\nu(\mathbf{k}_1)$ of the multivalued function are continuous within their entire range of definition, i.e. within the first BZ. For points $\mathbf{k}_1$ which correspond to points $\mathbf{k}_\nu$ of the interior of the $\nu$th BZ, the validity of this assertion is obvious. For those particular $\mathbf{k}_1$-planes arising from a boundary plane of the $\nu$th BZ, continuity is not immediately evident, for in penetrating such a $\mathbf{k}_1$-plane the original vector $\mathbf{k}_\nu$ of the $\nu$th BZ jumps by the negative of the reciprocal lattice vector which defines this boundary plane. However, this jump does not affect the energy eigenvalue $E(\mathbf{k}_\nu)$ in the extended zone scheme, as may be seen by means of the degenerate perturbation theory in section 2.4. Thus the functions $E_\nu(\mathbf{k}_1)$ are also continuous on the particular planes in question.

Each of the function branches $E_\nu(\mathbf{k}_1)$ defined above encompasses some finite interval of values on the energy scale. The term energy band for the set of $E_\nu(\mathbf{k}_1)$-values is thus obvious; $\nu$ is the so-called band index. Between the various function branches, or energy bands, energy regions may exist in which no energy eigenvalues occur. These regions are called forbidden zones or energy gaps. The whole set of functions $E_\nu(\mathbf{k}_1)$ is referred to collectively as the band structure. In Figure 2.11 we illustrate the band structure using the model of a 1-dimensional lattice. The solid curves correspond to a completely free electron, and the dashed ones to an electron in a weak periodic potential. One recognizes that energy gaps occur not only on the boundaries of the first BZ but also at its center at $k = 0$. This is due to the fact that, during folding, surface points of the first BZ are displaced to the BZ center, and the energy gaps from those points also appear at the center.

Figure 2.11: Reduction of the energy parabola of a free electron into the first BZ of a 1-dimensional lattice. The changes caused by a weak, periodic lattice potential are indicated by dashed lines.
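The solid curves of Figure 2.11 can be reproduced with a minimal sketch. The function name `folded_free_bands` and the cutoff `n_max` are assumptions of this illustration; energies are measured in units of $(\hbar^2/2m)(2\pi/a)^2$ and the reduced wavevector in units of $2\pi/a$.

```python
def folded_free_bands(zeta, n_max=2):
    """Folded free-electron energies (k1 + K)^2 in a 1-dimensional lattice.

    zeta is the reduced wavevector k1 in units of 2*pi/a, -1/2 <= zeta <= 1/2;
    K = n * 2*pi/a with n = -n_max, ..., n_max labels the folded branches.
    """
    return sorted((zeta + n) ** 2 for n in range(-n_max, n_max + 1))

center = folded_free_bands(0.0)   # branch energies at the BZ center
edge = folded_free_bands(0.5)     # branch energies at the BZ boundary
print(center)
print(edge)
```

Pairs of equal values appear both at the boundary and at the center. These are the points where a weak periodic potential lifts the degeneracy and opens the gaps indicated by the dashed curves.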
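The folding procedure into the first BZ must also be carried out for the eigenfunctions of the Hamiltonian. Let $\varphi_{\nu\mathbf{k}_1}$ denote the Bloch function of energy $E_\nu(\mathbf{k}_1)$ and of reduced wavevector $\mathbf{k}_1$. In the extended scheme the same wavefunction reads $\varphi_{\mathbf{k}_\nu}(\mathbf{x})$ with $\mathbf{k}_\nu = \mathbf{k}_1 + \mathbf{K}(\mathbf{k}_\nu)$. Thus,

$$\varphi_{\nu\mathbf{k}_1}(\mathbf{x}) = \varphi_{\mathbf{k}_1 + \mathbf{K}(\mathbf{k}_\nu)}(\mathbf{x}). \tag{2.177}$$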
The representation of $E(\mathbf{k})$ as a multivalued function $E_\nu(\mathbf{k}_1)$ over the first BZ is by far the most common and frequently used description of the energy spectrum of a crystal. The first Brillouin zone is, therefore, of outstanding importance in solid state physics. Unless otherwise specified, the quasi-wavevector will henceforth always be understood to lie in the first BZ. Thus we will write $\mathbf{k}$ instead of $\mathbf{k}_1$, which means, in particular, $E_\nu(\mathbf{k})$ instead of $E_\nu(\mathbf{k}_1)$. The Schrödinger equation for an electron in a crystal then takes the form

$$H\varphi_{\nu\mathbf{k}}(\mathbf{x}) = E_\nu(\mathbf{k})\,\varphi_{\nu\mathbf{k}}(\mathbf{x}). \tag{2.178}$$
The first BZ is the fully symmetric primitive unit cell of the reciprocal lattice. The relation between the reciprocal and the direct lattices was discussed in section 2.3, where it was seen that each of the 14 direct Bravais lattices corresponds to a particular reciprocal lattice of the same symmetry, although not necessarily of the same Bravais type. Accordingly, there are 14 different reciprocal Bravais lattices, which also means 14 different first BZs. Two of them, the two that are most important for semiconductors, are shown in Figure 2.9.

Empty-lattice band structure

In Figure 2.11 the energy $E(k)$ of a free electron moving in 1 dimension has been represented over the first BZ of a 1-dimensional lattice. Similar representations are possible in 3 dimensions. Below we demonstrate this for the face centered cubic lattice which applies to crystals of the diamond and zincblende structures. According to Table 1.2, the primitive vectors of this lattice may be chosen in the form

$$\mathbf{a}_1 = \frac{a}{2}(1,0,1), \quad \mathbf{a}_2 = \frac{a}{2}(1,1,0), \quad \mathbf{a}_3 = \frac{a}{2}(0,1,1). \tag{2.179}$$

The corresponding basis vectors of the reciprocal lattice follow from equation (2.123), with the result

$$\mathbf{b}_1 = \frac{2\pi}{a}(1,\bar{1},1), \quad \mathbf{b}_2 = \frac{2\pi}{a}(1,1,\bar{1}), \quad \mathbf{b}_3 = \frac{2\pi}{a}(\bar{1},1,1). \tag{2.180}$$
The first BZ is bounded by the 14 planes perpendicular to the following reciprocal lattice vectors:
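(1) 8 lattice vectors having the length of the primitive lattice vectors given above, which means the smallest possible length overall. These vectors read:

$$\frac{2\pi}{a}(\pm 1, \pm 1, \pm 1). \tag{2.181}$$

(2) 6 lattice vectors of the next larger length, directed along the cubic axes:

$$\frac{2\pi}{a}(\pm 2, 0, 0), \quad \frac{2\pi}{a}(0, \pm 2, 0), \quad \frac{2\pi}{a}(0, 0, \pm 2). \tag{2.182}$$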
Figure 2.12: Symmetry points and irreducible parts of the first BZ of the fcc lattice (left) and the hexagonal lattice (right).

The first BZ of the fcc lattice obtained by means of these reciprocal lattice vectors is shown in Figure 2.9. Perpendicular to the three cubic axes it is bounded by squares, and normal to the four space diagonals by hexagons. It is common to denote the symmetry points of the first BZ by capital letters. Greek letters are assigned to symmetry points in the interior of the first BZ, and Latin letters to symmetry points on its surface (see Figure 2.12). The center of the BZ, for example, is $\Gamma$, the point at which a cubic axis cuts the BZ surface is $X$, and the connecting line between $\Gamma$ and $X$ is $\Delta$. We will now reduce the energy parabola $E(\mathbf{k}) = (\hbar^2/2m)\mathbf{k}^2$ of a free electron along the $\Delta$-line, i.e. for points

$$\Gamma = \frac{2\pi}{a}(0,0,0), \quad \Delta = \frac{2\pi}{a}(0,0,\zeta), \quad X = \frac{2\pi}{a}(0,0,1), \tag{2.183}$$

with $0 < \zeta < 1$. According to equation (2.176), the folded branches are

$$E_\nu(\mathbf{k}) = \frac{\hbar^2}{2m}\left(\mathbf{k} + \mathbf{K}(\mathbf{k}_\nu)\right)^2. \tag{2.184}$$
Using the above listed 14 reciprocal lattice vectors $\mathbf{K}(\mathbf{k}_\nu)$ in this relation, the second and third BZs are folded into the first. For convenience we introduce $E_0 = (\hbar^2/2m)(2\pi/a)^2$ as an energy unit, and set

$$E_\nu(\mathbf{k}) = E_0\,\varepsilon_\nu(\zeta). \tag{2.185}$$

With a lattice constant $a$ of 5 Å, a value of about 6.0 eV follows for $E_0$. For $\varepsilon_\nu(\zeta)$ we have

$$\varepsilon_\nu(\zeta) = K_{\nu x}^2 + K_{\nu y}^2 + (\zeta + K_{\nu z})^2, \tag{2.186}$$
where $K_{\nu x}, K_{\nu y}, K_{\nu z}$ are the components of the vectors $\mathbf{K}_\nu$ in units of $2\pi/a$. In Table 2.4 we show the reduced energy functions $\varepsilon_\nu(\zeta)$ for the 14 vectors $\mathbf{K}_\nu$ together with those for $\mathbf{K} = 0$. In some cases the same value of the function $\varepsilon_\nu(\zeta)$ is obtained for 4 different $\mathbf{K}_\nu$, meaning that the corresponding energy bands are 4-fold degenerate. In Figure 2.13 the content of Table 2.4 is illustrated graphically, along with the band structure parallel to the $\Lambda$-line between $\Gamma$ and $L$, which is not given in Table 2.4. Again, one recognizes that there are $\mathbf{k}$-points associated with the same energy value several times.

The band structure shown in Figure 2.13 is that of an empty fcc lattice, i.e. an fcc lattice whose lattice points are not occupied by atoms, since the latter would create a non-zero periodic potential $V(\mathbf{x})$ if they were present. For a vanishing potential $V(\mathbf{x})$, no definite lattice exists because of its arbitrary translation symmetry. Nevertheless, a definite lattice, the fcc lattice, and a definite lattice constant were chosen in the above procedure. This was an arbitrary choice, in the sense that we could have also chosen any other of the 14 Bravais lattices, and any other lattice constant. However, if we want the empty lattice band structure to resemble the band structure of a really existing crystal, then we cannot take any empty lattice band structure but must choose the one for the lattice applying to the crystal under consideration. Later we will verify that the band structures of real crystals and the pertinent empty lattice band structures have in fact common features. In Figure 2.13 such features are, for example, the appearance of the lowest energy eigenvalue at the $\Gamma$-point and the development of several non-degenerate energy bands from this level along the $\Delta$-line, one of them crossing another band at the $X$-point.
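The energy unit $E_0$ and the degeneracies of the folded branches can be checked with a short sketch based on equations (2.181), (2.182), (2.185) and (2.186). The constants are rounded CODATA values, and all function and variable names are assumptions of this illustration.

```python
import math
from collections import Counter
from itertools import product

HBAR = 1.054571817e-34   # J s
M_E = 9.1093837015e-31   # kg
EV = 1.602176634e-19     # J per eV

def e0_in_ev(a_angstrom):
    """E0 = (hbar^2 / 2m) (2*pi/a)^2 in eV for a lattice constant given in Angstrom."""
    a = a_angstrom * 1e-10
    return HBAR**2 / (2.0 * M_E) * (2.0 * math.pi / a) ** 2 / EV

# K = 0 plus the 14 vectors of (2.181) and (2.182), in units of 2*pi/a
K_VECTORS = [(0, 0, 0)]
K_VECTORS += list(product((1, -1), repeat=3))
K_VECTORS += [(2, 0, 0), (-2, 0, 0), (0, 2, 0), (0, -2, 0), (0, 0, 2), (0, 0, -2)]

def reduced_energy(K, zeta):
    """Equation (2.186) for a point (0, 0, zeta) on the Delta line."""
    kx, ky, kz = K
    return kx**2 + ky**2 + (zeta + kz) ** 2

# A branch along Delta is fixed by (Kx^2 + Ky^2, Kz); count the K-vectors sharing it.
branches = Counter((kx**2 + ky**2, kz) for kx, ky, kz in K_VECTORS)
# Degeneracies of the levels at the Gamma point (zeta = 0)
gamma_levels = Counter(reduced_energy(K, 0) for K in K_VECTORS)

print(round(e0_in_ev(5.0), 2), "eV")
for (c, kz), deg in sorted(branches.items()):
    print(f"eps = {c} + (zeta {kz:+d})^2, degeneracy {deg}")
print(sorted(gamma_levels.items()))
```

The branch list contains three 4-fold degenerate functions and three non-degenerate ones, and at $\Gamma$ the 15 vectors collapse onto the three levels $0$, $3E_0$ and $4E_0$.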
It is also consistent that the energetically high-lying bands display a relatively high degeneracy at the $\Gamma$-point, although for real fcc crystals the degeneracy at maximum can be only 3-fold, and not 8-fold as for the empty fcc lattice at the energy $E = 3E_0$, and not 6-fold as for the empty fcc lattice at $E = 4E_0$. A degeneracy higher than 3-fold turns out to be accidental, i.e. not caused by symmetry, but by the particular values of the potential, which is identically zero in this case. The width of the lowest energy band in Figure 2.13 is about $E_0$, i.e. 6 eV. Also, this result is not too far from the actual widths in fcc crystals, as we will verify below.

As with the reduction along the $\Delta$- and $\Lambda$-lines, one can plot the band structure along other symmetry lines in the interior and on the boundary of the first BZ. A complete representation of the functions $E_\nu(\mathbf{k})$ over the points of 3-dimensional $\mathbf{k}$-space would require a 4-dimensional space. In the 3-dimensional space available to us it is only possible to represent bands $E_\nu(\mathbf{k})$ over planes in $\mathbf{k}$-space. Such representations are, however, uncommon. As a rule, one plots the energy against lines in $\mathbf{k}$-space, as in Figure 2.13.

Table 2.4: Reduced energy functions $\varepsilon_\nu(\zeta)$ of the empty fcc lattice along the $\Delta$-line. Columns: $(K_{\nu x}, K_{\nu y}, K_{\nu z})$; $\varepsilon_\nu(\zeta)$; notation; degree of degeneracy. Entries include $\zeta^2$, $2 + (\zeta + 1)^2$ and $4 + \zeta^2$.

Figure 2.13: Band structure of the empty fcc lattice (energy versus wavevector along symmetry lines).

The question of what values the energy band functions $E_\nu(\mathbf{k})$ take outside the first BZ does not arise or is meaningless, since the $E_\nu(\mathbf{k})$ were
just defined by transferring the function values from other BZs to the first one. For this reason, the $E_\nu(\mathbf{k})$ already encompass the entire spectrum of eigenvalues as $\mathbf{k}$ varies over the first BZ. If one wishes to also consider the functions $E_\nu(\mathbf{k})$ outside of the first BZ, one has to define them there. The most natural manner to do this is the periodic continuation.

Periodic continuation

In this context, the values of $E_\nu(\mathbf{k})$ in Wigner-Seitz cells having centers at $\mathbf{K} \neq 0$ are defined by the relation

$$E_\nu(\mathbf{k} + \mathbf{K}) = E_\nu(\mathbf{k}), \tag{2.187}$$

where $\mathbf{k}$ is a point of the first BZ. By periodic continuation, one can of course extend each function originally defined only within the first BZ to the entire $\mathbf{k}$-space. The question is what analytic properties this function has. If one considers an arbitrary function, discontinuities will occur on the boundaries of the Wigner-Seitz cells, i.e. the periodically continued function will not be continuous in the entire $\mathbf{k}$-space. In order to have continuity, $E_\nu(\mathbf{k})$ must satisfy certain conditions on the boundary of the first BZ. These conditions follow from the fact that the boundary of the first BZ consists of pairs of equivalent parallel planes (see Figure 2.6). We denote a point on one plane of such a pair by $\mathbf{k}_0$, and $\mathbf{k}'_0$ denotes the equivalent point on the other plane of the pair. If $\mathbf{k}_0$ belongs to the first BZ, as we will assume, then $\mathbf{k}'_0$ cannot also belong to it, because a primitive unit cell contains only non-equivalent points. The value of $E_\nu(\mathbf{k}'_0)$ is defined by the periodic continuation and, as such, it is equal to $E_\nu(\mathbf{k}_0)$. In order for the periodically continued function to be continuous, this value must coincide with the limiting value of $E_\nu(\mathbf{k})$ if one approaches the point $\mathbf{k}'_0$ coming from the interior of the first BZ. Thus, a necessary condition for continuity of the periodically continued function is that the original function, defined only over the first BZ, obey the boundary condition

$$\lim_{\mathbf{k} \to \mathbf{k}'_0} E_\nu(\mathbf{k}) = E_\nu(\mathbf{k}_0), \tag{2.188}$$
where $\mathbf{k}$ is in the interior of the first BZ. This condition is also sufficient, since the continuity of the periodically continued function $E_\nu(\mathbf{k})$ implies that equation (2.188) holds. Therefore, the boundary condition (2.188) for the function $E_\nu(\mathbf{k})$ defined over the first BZ, and the continuity of the periodically continued function $E_\nu(\mathbf{k})$, are just different descriptions of the same property. Using continuity of the function $E_\nu(\mathbf{k})$ defined over all $\mathbf{k}$-space, this property may be expressed in a more convenient way. This is the reason why one continues the energy band functions $E_\nu(\mathbf{k})$, which actually have meaning only within the first BZ, periodically over the entire $\mathbf{k}$-space. One could also forego this, and use the condition (2.188) directly.

We will now prove that the energy bands $E_\nu(\mathbf{k})$ actually obey the boundary condition (2.188), and thus are continuous everywhere as periodically continued functions. To this end, we use the Schrödinger equation (2.178) for the eigenvalues $E_\nu(\mathbf{k})$. The corresponding Bloch functions $\varphi_{\nu\mathbf{k}}(\mathbf{x})$ may be expanded in plane waves with wavevectors $\mathbf{k} + \mathbf{K}$, where $\mathbf{K}$ is an arbitrary reciprocal lattice vector, as follows:
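$$\varphi_{\nu\mathbf{k}}(\mathbf{x}) = \sum_{\mathbf{K}} c_{\nu\mathbf{k}}(\mathbf{K})\,\frac{1}{\sqrt{\Omega}}\,e^{i(\mathbf{k}+\mathbf{K})\cdot\mathbf{x}}. \tag{2.189}$$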
Using this result, the Schrödinger equation (2.178) takes the form

$$\sum_{\mathbf{K}'}\left[\frac{\hbar^2}{2m}(\mathbf{k}+\mathbf{K})^2\,\delta_{\mathbf{K}\mathbf{K}'} + \langle\mathbf{K}|V|\mathbf{K}'\rangle - E_\nu(\mathbf{k})\,\delta_{\mathbf{K}\mathbf{K}'}\right] c_{\nu\mathbf{k}}(\mathbf{K}') = 0. \tag{2.190}$$

The $E_\nu(\mathbf{k})$ are the eigenvalues of the matrix of coefficients of this system of equations, and their $\mathbf{k}$-dependence is determined by that of the matrix. The latter is manifestly continuous. That it also exhibits the periodicity of the reciprocal lattice, one may verify as follows. Consider the coefficient matrix of (2.190) at the point $\mathbf{k} + \mathbf{K}''$ with $\mathbf{K}''$ an arbitrary reciprocal lattice vector. If one replaces the column and row indices $\mathbf{K}'$ and $\mathbf{K}$, respectively, by $\mathbf{K}' - \mathbf{K}''$ and $\mathbf{K} - \mathbf{K}''$, using the relation $\langle\mathbf{K} - \mathbf{K}''|V|\mathbf{K}' - \mathbf{K}''\rangle = \langle\mathbf{K}|V|\mathbf{K}'\rangle$, one obtains the matrix at the original $\mathbf{k}$-vector. This implies that the eigenvalues $E_\nu(\mathbf{k} + \mathbf{K}'')$ and $E_\nu(\mathbf{k})$ are identical, as stated.

The above proof employs only the lattice translation symmetry of the potential $V(\mathbf{x})$. The continuity of the periodically continued function $E_\nu(\mathbf{k})$ in the entire reciprocal space is therefore an immediate consequence of the periodicity of $V(\mathbf{x})$ in the direct lattice.

The continuity of the periodically repeated energy band functions $E_\nu(\mathbf{k})$ is often very useful. It ensures, on the one hand, that the $E_\nu(\mathbf{k})$ may be approximated fairly closely by a Fourier series with relatively few terms. On the other hand, mathematical theorems dealing with the types and numbers of extrema of periodic analytic functions may also be applied. This is particularly important for the theory of the optical properties of semiconductors.

We return now to a question which was previously explored in section 2.3, namely the implications of point symmetry for band structure. This was discussed above in the extended zone scheme, and symmetric $\mathbf{k}$-points were excluded. Now we use the reduced zone scheme description and concentrate on symmetric $\mathbf{k}$-vectors.
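Periodicity and the boundary condition (2.188) can be checked numerically in the simplest case, the empty 1-dimensional lattice, where the eigenvalues are known in closed form. This is only a sketch under that assumption ($V = 0$); the function name `band` and the cutoff `n_max` are hypothetical. Wavevectors are in units of $2\pi/a$ and energies in units of $(\hbar^2/2m)(2\pi/a)^2$.

```python
def band(nu, k, n_max=10):
    """nu-th lowest value (nu = 1, 2, ...) among (k + n)^2, n = -n_max..n_max.

    For V = 0 these are the eigenvalues of the plane-wave matrix at wavevector k.
    Shifting k by a reciprocal lattice vector only relabels the index n.
    """
    return sorted((k + n) ** 2 for n in range(-n_max, n_max + 1))[nu - 1]

k = 0.3
periodic = all(abs(band(nu, k + 1.0) - band(nu, k)) < 1e-12 for nu in (1, 2, 3))

# Boundary condition (2.188): approach k0' = +1/2 from inside, compare with k0 = -1/2
inside_limit = band(1, 0.5 - 1e-9)
equivalent_point = band(1, -0.5)
print(periodic, inside_limit, equivalent_point)
```

The relabelling of the index `n` under a shift of `k` plays the same role as the replacement of the row and column indices in the proof above.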
2.5.2
The first BZ, as the Wigner-Seitz cell of the reciprocal lattice, also exhibits the point symmetry of this lattice which, for its part, is the same as the point symmetry of the corresponding direct lattice. The latter statement follows directly from the definition (2.123) of the reciprocal basis $\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3$. The first BZ of the fcc lattice has, therefore, the point symmetry of a cube, and its point group is $O_h$ ($m3m$). The energy bands likewise have a particular point symmetry, namely that of the equivalent directions of the crystal. This is expressed by equation (2.140), which underlies the extended zone scheme. It may, however, be transferred immediately to the reduced scheme, so that we may also write

$$E_\nu(\mathbf{k}) = E_\nu(\alpha\mathbf{k}), \tag{2.191}$$
where $\alpha$ is an element of the point group of equivalent directions of the crystal. This group is generally smaller than the point symmetry group of the lattice, but in special circumstances it can also be the same. The latter case occurs for the diamond structure, where the point group of equivalent crystal directions is likewise $O_h$.

In section 2.3, symmetric $\mathbf{k}$-vectors, i.e. vectors which are transformed by at least one element of the point group $P$ of equivalent directions into themselves or into an equivalent vector, were excluded. Now we also admit such $\mathbf{k}$-vectors to consideration. In regard to their effect on symmetric $\mathbf{k}$-vectors, the elements $\alpha$ of $P$ split into two subsets. The first contains those elements which transform $\mathbf{k}$ into itself or, at points on the surface of the first BZ, into a vector $\alpha\mathbf{k}$ equivalent to $\mathbf{k}$ which differs from $\mathbf{k}$ only by a reciprocal lattice vector $\mathbf{K}$. This set forms a subgroup $P_\mathbf{k}$ of $P$, and is called the small point group of $\mathbf{k}$. The second subset contains all those elements of $P$ which transform $\mathbf{k}$ neither into itself nor into a vector equivalent to $\mathbf{k}$. The set of all different and non-equivalent vectors $\alpha\mathbf{k}$ is called the star of $\mathbf{k}$. We denote the number of elements of the point group $P$ by $p$, and the number of different star points of $\mathbf{k}$ by $s_\mathbf{k}$. Since each star point has the same point symmetry, the number $p_\mathbf{k}$ of elements of the small point group $P_\mathbf{k}$ is given by the relation $p_\mathbf{k} = p/s_\mathbf{k}$. A general $\mathbf{k}$-vector has no point symmetry, whence $p_\mathbf{k} = 1$ and $s_\mathbf{k} = p$. For $\mathbf{k}$-vectors on symmetry lines or planes, $p_\mathbf{k}$ has a value between 1 and $p$. The center of the first BZ is special: all elements of $P$ are also elements of $P_\mathbf{k}$, so that $p_\mathbf{k} = p$ and $s_\mathbf{k} = 1$. A wavevector whose small point group does not consist of only the unity element is a symmetry point. The points of the first BZ in Figure 2.12 marked with large Greek or Latin letters are among them.
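The relation $p_\mathbf{k} = p/s_\mathbf{k}$ can be verified by direct enumeration for the point group $O_h$ and the fcc lattice. The sketch below represents the 48 elements of $O_h$ as signed permutations of the Cartesian components and treats two wavevectors as equivalent if they differ by a reciprocal lattice vector, i.e. (in units of $2\pi/a$) by an integer triple whose components are all even or all odd. All names are assumptions of this illustration.

```python
from fractions import Fraction as F
from itertools import permutations, product

# The 48 elements of O_h act on Cartesian components as signed permutations.
OH = [(p, s) for p in permutations(range(3)) for s in product((1, -1), repeat=3)]

def act(op, k):
    perm, signs = op
    return tuple(signs[i] * k[perm[i]] for i in range(3))

def equivalent(k1, k2):
    """True if k1 - k2 is a reciprocal lattice vector of the fcc lattice."""
    d = [x - y for x, y in zip(k1, k2)]
    if any(x.denominator != 1 for x in d):
        return False
    return len({int(x) % 2 for x in d}) == 1   # all even or all odd

def small_group_order_and_star(k):
    """Return (p_k, s_k) for a wavevector k given in units of 2*pi/a."""
    k = tuple(F(x) for x in k)
    p_k = sum(equivalent(act(op, k), k) for op in OH)
    star = []
    for op in OH:
        image = act(op, k)
        if not any(equivalent(image, s) for s in star):
            star.append(image)
    return p_k, len(star)

points = {
    "Gamma": (0, 0, 0),
    "Delta": (0, 0, F(1, 4)),
    "X": (0, 0, 1),
    "L": (F(1, 2), F(1, 2), F(1, 2)),
    "general": (F(1, 7), F(2, 7), F(3, 7)),
}
results = {name: small_group_order_and_star(k) for name, k in points.items()}
for name, (p_k, s_k) in results.items():
    print(name, p_k, s_k, p_k * s_k)
```

In every case the product $p_\mathbf{k} s_\mathbf{k}$ equals $p = 48$. Note that $X$ and $L$ only acquire their full small point groups because vectors differing by a reciprocal lattice vector are counted as equivalent.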
Using the terminology above, the point symmetry of band structure as given by equation (2.191) may also be described by saying that the energy of a particular band has the same value at all points of the star of a wavevector. One speaks of the star degeneracy of energy bands.
Because of star degeneracy, the energy band functions $E_\nu(\mathbf{k})$ need to be calculated only for a section of the first BZ which covers the region between adjacent star points (Figure 2.12). One calls such a section an irreducible part of the first BZ. The energy eigenvalues for the remainder of the first BZ are obtained by means of the symmetric continuation (2.191) of the values of the irreducible part. The irreducible part is the $p$th part of the first BZ. If one considers that, in the case of diamond structure, the number $p$ of elements of the point group $O_h$ is 48, one can appreciate how greatly the description of band structure is simplified by exploiting the point symmetry of crystals.

In section 2.3 it was shown that for any eigenfunction $\varphi_{\nu\mathbf{k}}(\mathbf{x})$, $\alpha\varphi_{\nu\mathbf{k}}(\mathbf{x}) = \varphi_{\nu\mathbf{k}}(\alpha^{-1}\mathbf{x})$ is also an eigenfunction of the Hamiltonian $H$ with the same eigenvalue $E_\nu(\mathbf{k})$, provided $\alpha$ belongs to the symmetry group of $H$. Those elements $\alpha$ of this group which also do not change the wavevector $\mathbf{k}$, or change it only by a reciprocal lattice vector, will thus transform an eigenfunction $\varphi_{\nu\mathbf{k}}(\mathbf{x})$ of a particular band $\nu$ and $\mathbf{k}$-vector into an eigenfunction $\varphi_{\nu\mathbf{k}}(\alpha^{-1}\mathbf{x})$ of the same band and the same $\mathbf{k}$-vector. Thus, for the case in which not all $\alpha\varphi_{\nu\mathbf{k}}(\mathbf{x})$ are linearly dependent, there are several linearly independent eigenfunctions for a given band index $\nu$ and wavevector $\mathbf{k}$. Let their number be $d_{\nu\mathbf{k}}$. This means that the energy band $E_\nu(\mathbf{k})$ is $d_{\nu\mathbf{k}}$-fold degenerate at the point $\mathbf{k}$. One calls this kind of degeneracy band degeneracy. We employ $\varphi^l_{\nu\mathbf{k}}(\mathbf{x})$, $l = 1, 2, \ldots, d_{\nu\mathbf{k}}$, to denote a basis set in the subspace of eigenfunctions of $H$ having the eigenvalue $E_\nu(\mathbf{k})$. The functions $\alpha\varphi^l_{\nu\mathbf{k}}(\mathbf{x})$ are then likewise eigenfunctions with the eigenvalue $E_\nu(\mathbf{k})$. As such, they can be expressed as linear combinations of the basis functions $\varphi^{l'}_{\nu\mathbf{k}}(\mathbf{x})$, whence

$$\alpha\varphi^l_{\nu\mathbf{k}}(\mathbf{x}) = \sum_{l'} D_{l'l}(\alpha)\,\varphi^{l'}_{\nu\mathbf{k}}(\mathbf{x}), \tag{2.192}$$
with $D_{l'l}(\alpha)$ as expansion coefficients. Through relation (2.192) each element $\alpha$ of the small point group is associated with a matrix $D_{l'l}(\alpha)$. In Appendix A we explain that the matrices $D_{l'l}(\alpha)$ form a representation of the small point group of the wavevector $\mathbf{k}$. If the degeneracy of the energy eigenvalues is only due to the symmetry of the Hamiltonian, then this representation is irreducible. We may say that the eigenfunctions of the Hamiltonian for a particular $\mathbf{k}$-vector and band index $\nu$ span a subspace in Hilbert space which gives rise to an irreducible representation of the small point group. The dimension of this representation, i.e. the dimension of its matrices, corresponds to the degree of degeneracy of the energy band for the $\mathbf{k}$-vector under consideration. If one knows all irreducible representations of the small point group $P_\mathbf{k}$, then one also knows what degrees of degeneracy of the energy bands are possible at $\mathbf{k}$. This relation between degeneracy and symmetry is quite remarkable. It holds not only for the energy bands of crystals, but for the eigenvalues of any Hamiltonian in quantum mechanics. It is a good example of how a relatively abstract mathematical theory, the theory of groups, has immediate physical consequences.

Table 2.5: Irreducible representations (notations and dimensions) of the small point groups of symmetry points and lines of the first BZ of the fcc lattice for crystals with the diamond structure. For the point $X$, the projective irreducible representations are given in the crystallographic factor system. (See Appendix A.)
The number of distinct irreducible representations of an arbitrary finite group is finite, as are the dimensions of the representations. For the point groups of equivalent directions, the irreducible representations can be only 1-, 2- and 3-dimensional; 1-dimensional for small groups, 2-dimensional only for groups that are 'not too small', and 3-dimensional only for the largest point groups, specifically for $O_h$ ($m3m$), $O$ ($432$), $T_d$ ($\bar{4}3m$), $T_h$ ($m3$) and $T$ ($23$) of the cubic crystal system.
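The possible dimensions are constrained by a standard theorem of group theory: the squares of the dimensions of all irreducible representations of a finite group add up to the number of group elements. The sketch below checks this for a few point groups. The dimension lists are standard character-table values quoted as an assumption of this illustration, not taken from Table 2.5.

```python
# Dimensions of the irreducible (vector) representations, from standard character tables.
IRREP_DIMS = {
    "O_h": [1, 1, 2, 3, 3, 1, 1, 2, 3, 3],
    "T_d": [1, 1, 2, 3, 3],
    "C_4v": [1, 1, 1, 1, 2],
    "C_3v": [1, 1, 2],
}
GROUP_ORDER = {"O_h": 48, "T_d": 24, "C_4v": 8, "C_3v": 6}

def sum_of_squares(dims):
    """Sum of d^2 over all irreducible representations of a group."""
    return sum(d * d for d in dims)

for name, dims in IRREP_DIMS.items():
    print(name, sum_of_squares(dims), GROUP_ORDER[name], max(dims))
```

Only the cubic groups in this list reach dimension 3, and none exceeds it.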
To illustrate these somewhat abstract statements and to prepare for the discussion of actual band structures in section 2.8, we consider some special $\mathbf{k}$-vectors of the first BZ of the fcc lattice. At the center $\Gamma$, the small point group is identical with the full point group $P$ for crystals of diamond structure, i.e. with $O_h$. In the context of energy band theory, the irreducible representations of the small point groups are denoted by the same capital Greek or Latin letters which stand for the $\mathbf{k}$-vectors, supplemented by subscripts. For the $\Gamma$-point and diamond structure, the irreducible representations are denoted by $\Gamma_1, \Gamma_2, \Gamma_{12}, \Gamma_{25}, \Gamma_{15}$, and $\Gamma'_1, \Gamma'_2, \Gamma'_{12}, \Gamma'_{25}, \Gamma'_{15}$. This somewhat strange indexing refers back to the so-called compatibility relations between different representations (see Appendix A for more detail). The primes indicate that the involved representations differ from the unprimed ones only by a minus sign for the inversion matrix. In Table 2.5, all irreducible representations of the point group $O_h$, which is the small point group of $\Gamma$ for crystals with the diamond structure, are listed. In addition to the symbols of the representations, their dimensions are given. The irreducible representations of the small point groups $P_\mathbf{k}$ of the symmetry points $\Delta$, $X$, $\Lambda$ and $L$ for crystals having the diamond structure are also indicated in Table 2.5. According to this table, one has 1-, 2- and 3-fold degenerate bands at the center of the first BZ, while at all other points only 1- and 2-fold degeneracies occur for crystals which have diamond structure. At $X$ there are 2-fold degenerate bands exclusively. This peculiarity is due to the fact that the space group of the diamond structure is non-symmorphic. In this case, the irreducible representations of the small point group of $X$ involve a factor system, which means that multiplication of two representation matrices yields the matrix of the product element, save only for a scalar factor (see Appendix A). In such circumstances, it is understandable that 1-dimensional representations may not be possible.

The degeneracy at $\Gamma$ may be compared with that of a free atom. In the latter case the degeneracy is likewise a consequence of rotation and reflection symmetries. However, arbitrary rotations and reflections are possible in this case, i.e. the point group contains an infinite number of elements. The dimensions of the irreducible representations of this group are determined by the angular momentum quantum number $l$, where $l$ can take the values $0, 1, 2, \ldots, \infty$. Amounting to $2l + 1$, all odd numbers are possible as dimensions of irreducible representations, and hence also as the multiplicities of energy eigenvalues.
In the case of the hydrogen atom, the symmetry-related degeneracy is still not the full degeneracy. In addition, one has an accidental degeneracy caused by the particular shape of the Coulomb potential. In this case the energy eigenvalues differ only for different principal quantum numbers $n$. Thus all eigenstates corresponding to a given $n$, with their various $l$-values $0, 1, 2, \ldots, n - 1$, have the same energy. The degeneracy is thus $n^2$-fold for the hydrogen atom.
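The count behind the $n^2$-fold degeneracy is the sum of the multiplicities $2l + 1$ over $l = 0, \ldots, n - 1$. A minimal check (the function name is hypothetical, and spin is not counted):

```python
def shell_degeneracy(n):
    """Number of orbital states with principal quantum number n: sum of 2l + 1."""
    return sum(2 * l + 1 for l in range(n))

print([shell_degeneracy(n) for n in range(1, 6)])
```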
2.5.3
The degeneracy of energy bands at symmetry points of the first BZ, which we treated in the preceding subsection, is a direct consequence of the symmetry of these points. We proceed now to another consequence of this symmetry, which will lead us to the concept of critical points and effective masses.

If a particular symmetry point $\mathbf{k}_0$ of the first BZ is transformed into itself under the action of a point symmetry operation $\alpha$, then points $\mathbf{k}_0 + \delta\mathbf{k}$ in the vicinity of $\mathbf{k}_0$ are necessarily transformed into points close to $\mathbf{k}_0$. If, for example, $\alpha$ is a reflection with respect to a plane normal to the $x$-axis, then points of the form $\mathbf{k}_0 + \delta k\,\mathbf{e}_x$, with $\mathbf{e}_x$ as unit vector in $x$-direction, will transform into $\mathbf{k}_0 - \delta k\,\mathbf{e}_x$. Because of the symmetry relation (2.191) for energy bands, $E_\nu(\mathbf{k})$ will have the same value before and after the reflection. From this it follows that the first derivative of the function $E_\nu(\mathbf{k})$ with respect to $k_x$ must vanish at the mirror plane. If $\mathbf{k}_0$ also simultaneously lies on a second mirror plane perpendicular to the first, e.g., one perpendicular to the $y$-axis, then the $k_y$-derivative of $E_\nu(\mathbf{k})$ will also be zero. If $\mathbf{k}_0$ is even more symmetric and involves a third mirror plane normal to the $z$-axis, then the derivative with respect to $k_z$ also has to vanish. What is being demonstrated here using the example of mirror planes has a general significance: some or all of the first derivatives of the energy band functions $E_\nu(\mathbf{k})$ with respect to the three wavevector components vanish at symmetry points, depending on the degree of symmetry. With some ambiguity of expression, we may say that at symmetry centers all three first derivatives are zero, at symmetry lines two vanish, namely those with respect to the normal plane, and on symmetry planes one derivative vanishes, namely the one in the normal direction. On a symmetry line there are also often special points where, for reasons that have nothing to do with symmetry, the remaining first derivative parallel to the line also vanishes.

The first derivatives of the function $E_\nu(\mathbf{k})$ with respect to $\mathbf{k}$ have a direct physical meaning. One can easily prove (and we will do so in section 3.3) that they are, apart from a constant factor, equal to the expectation value of the momentum operator $\mathbf{p}$ in a Bloch state, thus
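$$\nabla_\mathbf{k} E_\nu(\mathbf{k}) = \frac{\hbar}{m}\,\langle\nu\mathbf{k}|\mathbf{p}|\nu\mathbf{k}\rangle. \tag{2.193}$$

Points $\mathbf{k}_c$ at which all three first derivatives vanish are called critical points:

$$\nabla_\mathbf{k} E_\nu(\mathbf{k})\big|_{\mathbf{k} = \mathbf{k}_c} = 0. \tag{2.194}$$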
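With the requirement that $E_\nu(\mathbf{k})$ be a regular function at $\mathbf{k}_c$, one may approximate it in the neighborhood of $\mathbf{k}_c$ by a Taylor expansion to second order,

$$E_\nu(\mathbf{k}) = E_\nu(\mathbf{k}_c) + \frac{1}{2}\sum_{\alpha\beta}\left.\frac{\partial^2 E_\nu(\mathbf{k})}{\partial k_\alpha\,\partial k_\beta}\right|_{\mathbf{k}_c}(k_\alpha - k_{c\alpha})(k_\beta - k_{c\beta}). \tag{2.195}$$

The second derivatives define the tensor

$$M^{-1}_{\nu\alpha\beta} = \frac{1}{\hbar^2}\left.\frac{\partial^2 E_\nu(\mathbf{k})}{\partial k_\alpha\,\partial k_\beta}\right|_{\mathbf{k}_c}. \tag{2.196}$$

Using this tensor, the expansion (2.195) of $E_\nu(\mathbf{k})$ may be written in the form

$$E_\nu(\mathbf{k}) = E_\nu(\mathbf{k}_c) + \frac{\hbar^2}{2}\sum_{\alpha\beta} M^{-1}_{\nu\alpha\beta}\,(k_\alpha - k_{c\alpha})(k_\beta - k_{c\beta}). \tag{2.197}$$

The tensor components enter this expression in a manner similar to the way the reciprocal mass $m^{-1}$ enters the energy dispersion relation of a free electron. Actually one can obtain the expression (2.154) for the energy of a free electron from (2.197) by first taking the energy zero at $E_\nu(\mathbf{k}_c)$, which is unimportant, and then by substituting

$$\mathbf{k}_c = 0, \qquad M^{-1}_{\nu\alpha\beta} = \frac{1}{m}\,\delta_{\alpha\beta}. \tag{2.198}$$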
The components of the inverse tensor $M_\nu$ of $M^{-1}_\nu$ accordingly have the dimension of a mass. One therefore calls $M_\nu$ the effective mass tensor. The term effective refers to the fact that the electrons of a crystal are not free electrons but are subject to the influence of the periodic potential. The presence of this potential mandates that the wavevector of the free electron be replaced by the quasi-wavevector $\mathbf{k}$, and that the quadratic dependence of the energy on $\mathbf{k}$ become a more complicated dependence. Only in the vicinity of critical points does the quadratic dependence persist, but with coefficients $M^{-1}_{\nu\alpha\beta}$ which differ from those of the free electron. They are effective coefficients which involve the effect of the periodic potential. In contrast to the free electron case, where the coefficients form a multiple $m^{-1}$ of the unit tensor $\delta_{\alpha\beta}$, i.e. a scalar quantity, for the case of an electron in a crystal they represent a tensor $M^{-1}_{\nu\alpha\beta}$ with non-vanishing off-diagonal elements and general diagonal elements. An analogous statement holds for the inverse, the effective mass tensor $M_\nu$. Thus, in general, the effective mass is a tensorial quantity. Therein the crystal potential manifests itself, generally generating different mass values in different spatial directions. In special cases one can approximate the effective mass by a scalar quantity $m^*_\nu$. Then, the difference between the mass of the free electron and that of the crystal electron reduces to the different magnitudes of the two masses.
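How an effective mass follows from the curvature of a band can be sketched numerically. The cosine band used below is a hypothetical 1-dimensional model (not a band of any real crystal), with an assumed width parameter `t` of 1 eV and an assumed lattice constant of 5 Å; the derivatives are taken by central finite differences.

```python
import math

HBAR2_OVER_ME = 7.61996  # hbar^2 / m_e in eV * Angstrom^2 (rounded)

def band(k, t=1.0, a=5.0):
    """Hypothetical model band E(k) = -2 t cos(k a); k in 1/Angstrom, E in eV."""
    return -2.0 * t * math.cos(k * a)

def first_derivative(f, k, h=1e-4):
    return (f(k + h) - f(k - h)) / (2.0 * h)

def second_derivative(f, k, h=1e-4):
    return (f(k + h) - 2.0 * f(k) + f(k - h)) / h**2

def effective_mass_ratio(f, k_c):
    """m*/m_e from the 1-dimensional analogue of the curvature definition (2.196)."""
    return HBAR2_OVER_ME / second_derivative(f, k_c)

k_center, k_edge = 0.0, math.pi / 5.0
print(first_derivative(band, k_center), effective_mass_ratio(band, k_center))
print(first_derivative(band, k_edge), effective_mass_ratio(band, k_edge))
```

Both points are mirror-symmetric, so the first derivative vanishes there. The effective mass is positive at the band minimum and negative at the band maximum, with the same magnitude for this particular model.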
The mathematical description of effective mass can be simplified by exploiting the symmetry of the tensor $M^{-1}_{\nu\alpha\beta}$ with respect to an exchange of its indices $\alpha$ and $\beta$, i.e. by using the relation

$$M^{-1}_{\nu\alpha\beta} = M^{-1}_{\nu\beta\alpha}. \tag{2.199}$$

This property follows directly from the defining equation (2.196) for $M^{-1}_{\nu\alpha\beta}$. As is well known from linear algebra, such a real symmetric tensor can be brought to diagonal form by a coordinate transformation, in our case of the components of the $\mathbf{k}$-vector, to the principal axis system. With $m^{*-1}_{\nu\alpha}$ being the principal diagonal elements of $M^{-1}_\nu$ in this system, one has

$$M^{-1}_{\nu\alpha\beta} = m^{*-1}_{\nu\alpha}\,\delta_{\alpha\beta}. \tag{2.200}$$
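The $m^{*-1}_{\nu\alpha}$ are real. In general, they can take both positive and negative values. The same applies to the inverse quantities, the effective masses $m^*_{\nu\alpha}$ themselves. Taking $k_\alpha$ as components of $\mathbf{k}$ in the principal axis system, we have

$$E_\nu(\mathbf{k}) = E_\nu(\mathbf{k}_c) + \sum_\alpha \frac{\hbar^2}{2m^*_{\nu\alpha}}\,(k_\alpha - k_{c\alpha})^2. \tag{2.201}$$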
If all three effective masses are positive, the energy function $E_\nu(\mathbf{k})$ has a minimum at the critical point $\mathbf{k}_c$, and if all three are negative it has a maximum. If the three effective masses do not all have the same sign, then $E_\nu(\mathbf{k})$ has a saddle point.

Consider the particular case in which the critical point lies on a 4-fold symmetry axis of the first BZ, and that this is simultaneously one of the principal axes. Then, for symmetry reasons, the two principal tensor elements normal to this axis must be equal. With $\alpha = 3$ for the symmetrical principal axis we thus have

$$m^*_{\nu 1} = m^*_{\nu 2} \equiv m^*_{\nu\perp}. \tag{2.202}$$
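The corresponding element for the symmetrical principal axis, $m^*_{\nu 3} \equiv m^*_{\nu\parallel}$, in general differs from $m^*_{\nu\perp}$. If $\mathbf{k}_c$, however, lies on a symmetry center of the first BZ, then all three elements are in fact identical, i.e.

$$m^*_{\nu 1} = m^*_{\nu 2} = m^*_{\nu 3} \equiv m^*_\nu. \tag{2.203}$$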
For the frequently observed case of a symmetry point at the center of the first BZ, i.e. with $\mathbf{k}_c = 0$, one gets the simple dispersion law

$$E_\nu(\mathbf{k}) = E_\nu(0) + \frac{\hbar^2\mathbf{k}^2}{2m^*_\nu}. \tag{2.204}$$
This differs from that of a free electron only in that the effective mass $m^*_\nu$ appears instead of the free mass, and that the energy at $\mathbf{k} = 0$ assumes a value $E_\nu(0)$ which depends on the band index and, therefore, cannot be set equal to zero for all bands by changing the zero of energy.

In the following subsection we will again deal with band structure in its general form $E_\nu(\mathbf{k})$. We will introduce the so-called density of states, which encompasses essential information on band structure in just one function of energy. Often, a knowledge of the density of states is enough to calculate important macroscopic properties of semiconductor crystals.
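The classification of critical points by the signs of the principal effective masses can be sketched for a hypothetical model band, $E(\mathbf{k}) = -2t(\cos k_x a + \cos k_y a + \cos k_z a)$ on a simple cubic lattice. This model and all names below are assumptions of the illustration. Its mixed second derivatives vanish, so the Cartesian axes are the principal axes, and units with $\hbar = 1$ are used.

```python
import math

def principal_inverse_masses(kc, t=1.0, a=1.0):
    """Principal elements d^2E/dk_alpha^2 of the reciprocal mass tensor (hbar = 1)."""
    return [2.0 * t * a * a * math.cos(k * a) for k in kc]

def classify(inverse_masses):
    """Minimum, maximum or saddle point according to the signs of the principal masses."""
    if all(m > 0 for m in inverse_masses):
        return "minimum"
    if all(m < 0 for m in inverse_masses):
        return "maximum"
    return "saddle point"

critical_points = {
    "zone center": (0.0, 0.0, 0.0),
    "face center": (math.pi, 0.0, 0.0),
    "zone corner": (math.pi, math.pi, math.pi),
}
kinds = {name: classify(principal_inverse_masses(kc)) for name, kc in critical_points.items()}
print(kinds)
```

At the zone center the three principal masses of this model are equal, so the isotropic dispersion law of the form (2.204) applies there.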
2.5.4
Density of states
Definition
We first consider an arbitrary infinite one-electron system. Its Hamiltonian will be denoted by $H$, its eigenstates (disregarding spin) by $|i\rangle$, and its energy eigenvalues by $E_i$, $i = 1, 2, \ldots, \infty$. Periodicity of the eigenstates is assumed with respect to a periodicity region of volume $\Omega$. The quantity

$$\rho(E) = \frac{2}{\Omega}\sum_i \delta(E - E_i) \tag{2.205}$$
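is called the density of states (DOS) of the system. We will show that this designation is justified, i.e., that $\rho(E)$ does represent the number of one-particle states of energy $E$ per unit energy and volume. To this end we integrate equation (2.205) over a small energy interval $E < E' < E + \Delta E$, obtaining

$$\rho(E)\,\Delta E = \frac{2}{\Omega}\sum_i \int_E^{E+\Delta E} dE'\,\delta(E' - E_i). \tag{2.206}$$

The integral on the right-hand side has the value

$$\int_E^{E+\Delta E} dE'\,\delta(E' - E_i) = \begin{cases} 1, & E < E_i < E + \Delta E, \\ 0, & \text{otherwise}. \end{cases} \tag{2.207}$$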
Thus, each state $i$ whose energy lies between $E$ and $E + \Delta E$ contributes '1' to the sum on $i$, with no contributions from all other states. The sum in (2.206) equals, therefore, the number $\Delta z(E)$ of states having energy in this interval. Considering spin, one has

$$\rho(E)\,\Delta E = \frac{2}{\Omega}\,\Delta z(E), \tag{2.208}$$
confirming the designation of $\rho(E)$ as DOS.

The DOS expression (2.205) may be written in a more compact form which will be used in Chapter 3 in the derivation of important general results. Noting that the $\delta$-function $\delta(E)$ is the imaginary part of $(1/\pi)\,1/(E - i\epsilon)$ with $\epsilon$ a small positive number, one can easily prove that $\rho(E)$ of (2.205) equals

$$\rho(E) = \frac{2}{\pi\Omega}\,\mathrm{Im}\,\mathrm{Tr}\,\frac{1}{E - i\epsilon - H}, \tag{2.209}$$
where trace sytxho1 Tr denotes the sum over all diagonal elements with
respect to any basis set in Hihert space. If this set is identified with thr eigmstatcs 12) of ff, expression (2.209) transforms immediately into (2.205). We now consider the DOS p ( E ) defined by (2.205) for an ideal crystal. The cigenstates li) are Rloch states Ivk), in this case, and the DOS p ( F ) therefore reads
p(E)
2 C 6 ( E 
(2.210)
uk
The k-sum is over all points of the finely meshed net of Figure 2.4 which lie within the first BZ. Because of the fineness of this network, the result of the summation is almost identical to that of an integration. The substitution of the sum by an integral is done in accordance with the prescription

$$\sum_{\mathbf{k}} \ldots = \frac{\Omega}{8\pi^3} \int_{\text{1st BZ}} d^3k \ldots \tag{2.211}$$

The factor $\Omega/8\pi^3$ in front of the integral occurs because the volume of a mesh element is the $G^3$-th part of the volume $8\pi^3/\Omega_0$ of a primitive unit cell of the reciprocal lattice, which is $8\pi^3/\Omega$. One must divide by this volume in writing the sum as an integral. The density of states follows as

$$\rho(E) = \frac{1}{4\pi^3} \sum_\nu \int_{\text{1st BZ}} d^3k\, \delta(E - E_\nu(\mathbf{k})), \tag{2.212}$$

which is independent of the volume $\Omega$ of the system. The density of states restricted to the $\nu$-th energy band, $\rho_\nu(E)$, is defined by the expression

$$\rho_\nu(E) = \frac{1}{4\pi^3} \int_{\text{1st BZ}} d^3k\, \delta(E - E_\nu(\mathbf{k})), \tag{2.213}$$

and it differs from zero only for energy values for which the argument of the $\delta$-function may become zero, i.e. for energies in the $\nu$-th band. Summing over all bands one again obtains the entire density of states

$$\rho(E) = \sum_\nu \rho_\nu(E). \tag{2.214}$$
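The equivalence between counting states and the resolvent form of the DOS can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the spin factor 2 in the definition, the convention $\delta(E) = (1/\pi)\,\mathrm{Im}\,[1/(E - i\epsilon)]$ with a small but finite $\epsilon$, and a hypothetical toy spectrum of equally spaced levels. The function names are illustrative only.

```python
import math

def dos_from_resolvent(energy, levels, volume, eps):
    """rho(E) = 2 / (pi * volume) * Im sum_i 1 / (E - i*eps - E_i).

    With finite eps every delta function becomes a Lorentzian of width eps.
    """
    trace = sum(1.0 / complex(energy - e_i, -eps) for e_i in levels)
    return 2.0 / (math.pi * volume) * trace.imag

def count_states(levels, e_lo, e_hi):
    """Number Delta Z of levels (spin disregarded) with e_lo < E_i < e_hi."""
    return sum(1 for e_i in levels if e_lo < e_i < e_hi)

# Hypothetical toy spectrum: 400 equally spaced levels between 0 and 2,
# in a periodicity region of unit volume.
levels = [0.005 * (i + 0.5) for i in range(400)]
volume = 1.0

# Integrate the broadened DOS over the window 0.5 < E < 1.5 (midpoint rule)
# and compare with (2 / volume) * Delta Z for the same window.
e_lo, e_hi, steps, eps = 0.5, 1.5, 400, 0.01
h = (e_hi - e_lo) / steps
integral = h * sum(
    dos_from_resolvent(e_lo + (j + 0.5) * h, levels, volume, eps)
    for j in range(steps)
)
expected = 2.0 / volume * count_states(levels, e_lo, e_hi)
print(integral, expected)
```

The two numbers agree up to the small weight that the Lorentzian tails carry across the window edges; this difference shrinks as `eps` is reduced while the level spacing stays finer than `eps`.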
Expression (2.213) for $\rho_\nu(E)$ may be transformed into a surface integral over the constant energy surface $E_\nu(\mathbf{k}) = E$ using the rules of calculation for the $\delta$-function, thus obtaining

$$\rho_\nu(E) = \frac{1}{4\pi^3} \int_{E_\nu(\mathbf{k}) = E} \frac{df}{|\nabla_{\mathbf{k}} E_\nu(\mathbf{k})|}, \tag{2.215}$$

with $df$ as element of the constant energy surface $E_\nu(\mathbf{k}) = E$. The integrand $1/|\nabla_{\mathbf{k}} E_\nu(\mathbf{k})|$ is, apart from a factor $\hbar$, the reciprocal absolute value of the group velocity of an electron in the Bloch state $|\nu\mathbf{k}\rangle$. The smaller this velocity is at a particular k-point on the constant energy surface or, in other words, the longer an electron stays at it, the larger is its contribution to the density of states. If $|\nabla_{\mathbf{k}} E_\nu(\mathbf{k})| = 0$, i.e. at a critical point, the contribution is, formally, infinitely large. From this it follows that for energy values $E$ whose isoenergy surfaces contain critical points, singularities of the density of states as a function of energy are to be expected. They are called van Hove singularities.
We use the isotropic parabolic dispersion law (2.216), $E_\nu(\mathbf{k}) = E_{\nu 0} + \hbar^2 k^2 / 2 m_\nu^*$, to calculate the density of states $\rho_\nu(E)$ from equation (2.213). The effective mass $m_\nu^*$ is taken to be positive, i.e., $E_\nu(\mathbf{k})$ should have a minimum at $\mathbf{k} = 0$. The use of the dispersion law (2.216) is only justified within its limits of validity, i.e. for energy values $E_\nu(\mathbf{k})$ close to the minimum $E_{\nu 0}$. Only for these $E$-values can the density of states, as calculated by means of (2.216), be relied upon. For such energies it so happens that the value of $\rho_\nu(E)$ does not depend on the upper limit of integration. We transfer this limit, for simplicity, from the edge of the first BZ to infinity, obtaining
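$$\rho_\nu(E) = \frac{1}{4\pi^3} \int d^3k\, \delta\!\left(E - E_{\nu 0} - \frac{\hbar^2 k^2}{2 m_\nu^*}\right). \tag{2.217}$$

In this integral, we transform to new variables $x, y, z$ with $(x, y, z) = (\hbar/\sqrt{2 m_\nu^*})\,(k_x, k_y, k_z)$. To start, we execute the integration over $z$ using the rule

$$\delta(z_0^2 - z^2) = \frac{1}{2|z_0|} \left[\delta(z - z_0) + \delta(z + z_0)\right] \tag{2.218}$$

with

$$z_0 = \sqrt{E - E_{\nu 0} - x^2 - y^2} \quad \text{for} \quad E - E_{\nu 0} - x^2 - y^2 > 0. \tag{2.219}$$

It follows that

$$\rho_\nu(E) = \frac{1}{2\pi^2} \left(\frac{2 m_\nu^*}{\hbar^2}\right)^{3/2} \sqrt{E - E_{\nu 0}}\;\, \theta(E - E_{\nu 0}), \tag{2.220}$$

where

$$\theta(E - E_{\nu 0}) = \begin{cases} 1 & \text{for } E > E_{\nu 0}, \\ 0 & \text{for } E < E_{\nu 0} \end{cases} \tag{2.221}$$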
is the Heaviside unit step function. If one considers, instead of an isotropic parabolic band, an anisotropic one with three different effective masses $m_{\nu 1}^*$, $m_{\nu 2}^*$, $m_{\nu 3}^*$ corresponding to the principal axes of the effective mass tensor, then the density of states is again given by expression (2.220), but $m_\nu^*$ must be replaced as follows

$$m_\nu^* \to m_D^* = \left(m_{\nu 1}^*\, m_{\nu 2}^*\, m_{\nu 3}^*\right)^{1/3} \tag{2.222}$$

by the so-called density-of-states mass $m_D^*$. This may be seen immediately if one reviews the calculation for the isotropic case. The anisotropic effective mass involves a change only upon substitution of the variables for the components of the k-vector: the factors for the 3 components are different and lead to the modification indicated in equation (2.222).

According to expression (2.220) the density of states of a parabolic band with $m_\nu^* > 0$ exhibits square-root-like behavior at the lower band edge, and continues following a square root law up to infinitely large energies. The latter property is a consequence of our untenable assumption of parabolic dispersion up to the upper band edge and the replacement of this edge by the point at infinity. In reality the band edge lies at finite energies, and the density of states again falls off to 0 there. It has, qualitatively, the shape shown by the dashed curve of Figure 2.14. The sudden square-root-like increase of the DOS at the band edge reflects the fact that the function $\rho(E)$ has singularities at these energies: the first derivative with respect to $E$ is 0 if one approaches the edge from the forbidden zone, and it is $+\infty$ if one approaches it from the interior of the band. This is one of the already mentioned van Hove singularities of the DOS. These singularities occur not only at energies where a particular band $E_\nu(\mathbf{k})$ has a minimum, as in the case considered here, but at all energies corresponding to critical points of the energy bands, thus also at maxima and saddle points.

Figure 2.14: DOS of a parabolic isotropic band in the vicinity of its minimum (solid curve). The dashed curve shows the DOS of a more realistic band.
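The square-root law for a parabolic band can be evaluated directly. The sketch below is an illustration under stated assumptions, not the authors' code: it takes the result for an isotropic parabolic band in the form $\rho_\nu(E) = (1/2\pi^2)(2m_\nu^*/\hbar^2)^{3/2}\sqrt{E - E_{\nu 0}}$ above the band edge (spin included), uses SI units, and forms the density-of-states mass as the geometric mean of the three principal masses. The function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J s
M0 = 9.1093837015e-31    # free electron mass in kg
EV = 1.602176634e-19     # one electron volt in J

def dos_parabolic(energy, band_edge, m_eff):
    """DOS per unit energy and unit volume of an isotropic parabolic band.

    energy and band_edge in J, m_eff in kg; result in 1 / (J m^3).
    Below the band edge the Heaviside step function gives zero.
    """
    if energy <= band_edge:
        return 0.0
    prefactor = (2.0 * m_eff / HBAR**2) ** 1.5 / (2.0 * math.pi**2)
    return prefactor * math.sqrt(energy - band_edge)

def dos_mass(m1, m2, m3):
    """Density-of-states mass of an anisotropic parabolic band."""
    return (m1 * m2 * m3) ** (1.0 / 3.0)

# Usage with hypothetical mass values (in units of the free electron mass).
m_d = dos_mass(0.9, 0.2, 0.2) * M0
for e_mev in (0.0, 10.0, 40.0):
    print(e_mev, dos_parabolic(e_mev * 1e-3 * EV, 0.0, m_d))
```

Quadrupling the energy above the band edge doubles the DOS, and integrating the expression from the band edge up to $E$ reproduces the number of states $k_E^3/3\pi^2$ per unit volume inside the sphere of radius $k_E$.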
Counting states: 3D, 2D, and 1D density of states
The square root energy dependence of the density of states in the case of a parabolic dispersion law may also be demonstrated in the following, more vivid way (also see Figure 2.15). For simplicity we set $E_{\nu 0} = 0$, and omit factors which do not depend on $k$ or $E$. The number $\Delta Z$ of band states with energy between $E$ and $E + \Delta E$ is the number of different k-points of the finely meshed net which yield energy values in this interval. These k-values lie in a thin spherical shell in k-space (see Figure 2.15) which, because $E \propto k^2$, has radius $k_E \propto \sqrt{E}$ and thickness $\Delta k$ which, since $\Delta E \propto k_E\, \Delta k$, is given by

$$\Delta k \propto \frac{\Delta E}{\sqrt{E}}. \tag{2.223}$$

The volume $\Delta V$ of the shell is therefore

$$\Delta V \propto k_E^2\, \Delta k \propto \sqrt{E}\, \Delta E. \tag{2.224}$$

Since the density of the finely meshed net of k-points of the first BZ is the same everywhere, the number of k-points in the spherical shell is proportional to its volume $\Delta V$. Hence $\Delta Z \propto \sqrt{E}\, \Delta E$ and

$$\rho(E) \propto \sqrt{E}. \tag{2.225}$$
Such considerations can easily be applied to energy bands in 2-dimensional (2D) and 1-dimensional (1D) k-spaces. Such k-spaces have physical (as well as mathematical) meaning, in particular for electron systems whose free motion is confined to one or two dimensions of 3-dimensional space. We will encounter examples of such systems in Chapter 3: surfaces and quantum wells for the 2D case, and electrons in a homogeneous magnetic field for the 1D case. For 2D k-space, the volume of the spherical shell is replaced by the area $\Delta F$ of a circular ring, such that $\Delta F \propto k_E\, \Delta k \propto \Delta E$ and

$$\rho(E) \propto \text{const}. \tag{2.226}$$

The density of states of a 2D electron system is therefore independent of $E$. In the case of a 1D k-space, $\Delta V$ is replaced by the length $\Delta k$ of the k-interval itself. Thus,

$$\rho(E)\, \Delta E \propto \frac{\Delta E}{\sqrt{E}}, \quad \text{i.e.} \quad \rho(E) \propto \frac{1}{\sqrt{E}}. \tag{2.227}$$
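The counting argument can be reproduced literally by counting mesh points. The sketch below is illustrative only: it assumes a dimensionless parabolic band $E = |\mathbf{k}|^2$ with all constants dropped, a uniform mesh on the cube $[-1, 1]^d$, and two energy windows of equal width. The function names and mesh sizes are hypothetical choices.

```python
import math
from itertools import product

def count_states(dim, e_lo, e_hi, n):
    """Count mesh points k in [-1, 1]^dim with e_lo <= |k|^2 < e_hi.

    The mesh has 2n equally spaced points per axis, so its density is the
    same everywhere, as assumed in the counting argument.
    """
    axis = [(i + 0.5) / n for i in range(-n, n)]
    count = 0
    for k in product(axis, repeat=dim):
        energy = sum(c * c for c in k)
        if e_lo <= energy < e_hi:
            count += 1
    return count

def dos_exponent(dim, n, width=0.1):
    """Estimate p in rho(E) ~ E**p from windows centred at E = 0.25 and 0.85."""
    z_low = count_states(dim, 0.2, 0.2 + width, n)
    z_high = count_states(dim, 0.8, 0.8 + width, n)
    return math.log(z_high / z_low) / math.log(0.85 / 0.25)

# Points per half-axis, chosen so that every window holds many points.
MESH = {3: 30, 2: 200, 1: 20000}
exponents = {dim: dos_exponent(dim, n) for dim, n in MESH.items()}
print(exponents)
```

The estimated exponents come out close to $+1/2$, $0$ and $-1/2$ for the 3D, 2D and 1D cases, respectively, with small deviations from the finite window width and the discreteness of the mesh.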
2.5.5
Spin
Thus far, spin has been omitted from our discussion of the general properties of stationary one-electron states of crystals. It turns out, however, that spin and spin-orbit interaction play an important role in most semiconductors. Therefore, the question arises as to how the results derived above for scalar wavefunctions change when electron states are no longer scalar but spinor functions and the spin-orbit interaction $H_{so}$ is taken into account. The starting point to address this question is the one-electron Schrödinger equation (2.58) with spin, which may be written in the following short form
(2.228)
The first point we will consider is that of spatial symmetry in the presence of spin.

Spatial symmetry

Inspecting the explicit form of the spin-orbit interaction operator (see formula (2.56)), one readily recognizes that $H_{so}$ has the full symmetry of the crystal. Thus it commutes with all elements $g$ of the space group. Since the same holds for the spin-free part $H$ of the total Hamiltonian $H + H_{so}$, one has

$$[H + H_{so},\, g] = 0. \tag{2.229}$$
To exploit this symmetry property of the total Hamiltonian we must know how the components $\varphi(\mathbf{x}, s)$ of a spinor transform under the operations $g$ of the space group. Considering translations first, we have

[…] rise to the Bloch theorem. In the same way that this theorem was proved without spin above, its validity may also be demonstrated here in the presence of spin. It states that the solutions $\{\varphi_{\lambda\mathbf{k}}(\mathbf{x}, \tfrac{1}{2}), \varphi_{\lambda\mathbf{k}}(\mathbf{x}, -\tfrac{1}{2})\}$ of the Schrödinger equation (2.228) for the eigenvalue $E_\lambda(\mathbf{k})$ can be chosen in the form of Bloch type spinor functions with a particular quasi-wavevector $\mathbf{k}$, i.e. as a plane wave $e^{i\mathbf{k}\cdot\mathbf{x}}$ multiplied by lattice-periodic functions $u_{\lambda\mathbf{k}}(\mathbf{x}, s)$, the spin-dependent Bloch factors. If $\mathbf{k}$ is restricted to the first BZ, then $E_\lambda(\mathbf{k})$ represents a continuous function of $\mathbf{k}$. The band index $\lambda$ here refers to both the state of the orbital motion and the spin state.

Second, we consider point symmetry operations $\alpha$, whose action in transforming the components of a spinor is discussed in Appendix A. The results of Appendix A are applied here in the form
with
In this, $\psi$, $\theta$ and $\varphi$ are the Euler angles of the orthogonal transformation $\alpha$. If the spinor is spatially constant, the transformation just reduces to multiplication by the matrix $D_{1/2}(\alpha)$. The set of all orthogonal transformations $\alpha$ forms an (infinite) group. Through relation (2.233) this group is assigned a matrix group $D_{1/2}(\alpha)$. A peculiarity of this mapping is that it is not unique: a change of the angles $\varphi$ or $\psi$ in (2.233) by $2\pi$ leads to a change of the sign of the associated matrix, even though this signifies the identity transformation. One may say that under a rotation through $2\pi$ a spinor does not transform into itself, but into its negative. This gives rise to multiplication rules for the representation matrices which deviate from those of the group itself. Thus the representations of the full orthogonal group in the space of two-component spinors are not representations in the ordinary sense, but in a generalized sense. They form projective representations with a special factor system (see Appendix A), or, in short, spinor representations. By means of the so-called double group, which includes each element of the full orthogonal group twice, once in the original form, and once multiplied with a rotation through $2\pi$, one may trace back the spinor representations to ordinary representations. The spinor representations are those representations of the double group which do not occur among the ordinary representations of the single group (for the derivation of this result see Appendix A).

Up to this point we have only given attention to the transformations of spinors in spin space. The states of electrons are described by spinor fields having positional dependence. These give rise to spinor representations of the (full) orthogonal group which differ, in general, from $D_{1/2}$. The totality of spinor representations may be obtained from the ordinary irreducible representations $D_\nu$ and the particular spinor representation $D_{1/2}$ of the full orthogonal group. Indeed, if a spinor field $\varphi_\nu(\mathbf{x}, s)$, with $s = \pm\tfrac{1}{2}$, transforms according to a certain ordinary representation $D_\nu$ in coordinate space, then its transformation in coordinate-spin space is governed by the product representation $D_{1/2} \times D_\nu$. If, here, $D_\nu$ encompasses all vector representations, then all spinor representations are obtained. Using the concept of spinor representations, the already discussed connection between the eigenfunctions of the crystal Hamiltonian for a given eigenvalue and the irreducible representations of its symmetry group may be generalized in the following way:

The spinor eigenfunctions of the crystal Hamiltonian $H + H_{so}$ having the same energy eigenvalue span a representation space of an irreducible spinor representation of its symmetry group.
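The sign change of a spinor under a rotation through $2\pi$, discussed above, can be made explicit with $2 \times 2$ matrices. The sketch below is an illustration under an assumption: instead of the Euler-angle parametrization referred to in the text, it uses the equivalent axis-angle form $\exp(-i\,\omega\,\mathbf{n}\cdot\boldsymbol{\sigma}/2) = \cos(\omega/2) - i \sin(\omega/2)\,\mathbf{n}\cdot\boldsymbol{\sigma}$ for a rotation by the angle $\omega$ about the unit vector $\mathbf{n}$. The function names are hypothetical.

```python
import math

def spin_rotation(axis, angle):
    """Spin-1/2 matrix cos(angle/2) - i sin(angle/2) n.sigma for unit axis n."""
    nx, ny, nz = axis
    c, s = math.cos(angle / 2.0), math.sin(angle / 2.0)
    return [[complex(c, -s * nz), complex(-s * ny, -s * nx)],
            [complex(s * ny, -s * nx), complex(c, s * nz)]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

z_axis = (0.0, 0.0, 1.0)
full_turn = spin_rotation(z_axis, 2.0 * math.pi)    # close to minus the unit matrix
double_turn = spin_rotation(z_axis, 4.0 * math.pi)  # close to plus the unit matrix

# Two successive rotations through pi multiply to the 2*pi matrix, i.e. to
# minus the unit matrix, although the rotation itself is the identity.
half_turn = spin_rotation(z_axis, math.pi)
print(matmul(half_turn, half_turn))
```

Only after a rotation through $4\pi$ does the matrix return to the unit matrix, which is the reason each group element appears twice in the double group.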
Table 2.6: Irreducible spinor representations of the small point groups of symmetry points and lines of the first BZ of the fcc lattice for crystals with the diamond structure.
The product representations $D_{1/2} \times D_\nu$, taken as representations of the small point groups of a particular wavevector rather than of the full orthogonal group, are generally reducible. This means that bands which are degenerate without spin by reason of spatial symmetry may split if spin is taken into account. The magnitude of the energy splitting depends on the spin-orbit interaction; without this interaction, degeneracy persists, but as an accidental rather than a symmetry-induced degeneracy. The following example illustrates the removal of degeneracy. We consider the valence band of zincblende type semiconductors, which is triply degenerate at $\Gamma$. The product representation in this case is $D_{1/2} \times \Gamma_{15}$. It decomposes into the two irreducible spinor representations $\Gamma_8$ and $\Gamma_7$ of the point group $T_d$ of $\Gamma$, where $\Gamma_8$ is of dimension 4 and $\Gamma_7$ of dimension 2. This means that, due to the spin-orbit interaction, the triply degenerate $\Gamma_{15}$ valence band without spin splits into the 4-fold degenerate $\Gamma_8$ band and the 2-fold degenerate $\Gamma_7$ band, accounting for the effects of spin. In Table 2.6 the irreducible spinor representations are shown for some symmetry points of the first BZ of diamond type semiconductors. A systematic treatment of these representations is given in Appendix A.
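The dimension count $6 = 4 + 2$ in this example can be mimicked by elementary angular momentum addition. The sketch below rests on an assumption not made in the text: the $\Gamma_{15}$ states are treated as p-like ($l = 1$) and the full rotation group is used in place of $T_d$, so that the multiplets $j = 3/2$ and $j = 1/2$ stand in for $\Gamma_8$ and $\Gamma_7$. The function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction

def multiplet_dimensions(l, s=Fraction(1, 2)):
    """Dimensions 2j+1 of the multiplets contained in the product l x s.

    All values m_l + m_s are tabulated; the multiplet with the highest
    remaining m is then removed repeatedly until nothing is left.
    """
    l = Fraction(l)
    weights = Counter()
    ml = -l
    while ml <= l:
        ms = -s
        while ms <= s:
            weights[ml + ms] += 1
            ms += 1
        ml += 1
    dims = []
    while weights:
        j = max(weights)          # highest m still present defines j
        m = -j
        while m <= j:
            weights[m] -= 1
            if weights[m] == 0:
                del weights[m]
            m += 1
        dims.append(int(2 * j + 1))
    return dims

print(multiplet_dimensions(1))   # p-like orbital states with spin 1/2
```

For $l = 1$ the six product states split into a 4-fold and a 2-fold multiplet, matching the dimensions of $\Gamma_8$ and $\Gamma_7$ quoted above; the size of the energy splitting is not determined by this counting.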
(2.234)
To prove this assertion, one executes the time reversal operation on the spin-orbit interaction operator $H_{so}$ of equation (2.56) explicitly. The vector $\boldsymbol{\sigma}$ then transforms into $\boldsymbol{\sigma}^*$, and $\mathbf{p}$ into $-\mathbf{p}$, so that the net change in $H_{so}$ is the replacement of $\boldsymbol{\sigma}$ by $-\boldsymbol{\sigma}^*$. Employing the operator $K$, on the other hand, the effect of time reversal on $H_{so}$ may be expressed in terms of the similarity transformed operator $K H_{so} K^{-1}$. For the two expressions to be identical, $K$ must have the form given above in equation (2.234), apart from a phase factor which remains undetermined. The wavefunction $\{\varphi_\lambda(\mathbf{x}, \tfrac{1}{2}), \varphi_\lambda(\mathbf{x}, -\tfrac{1}{2})\}$ must be replaced by $K\{\varphi_\lambda(\mathbf{x}, \tfrac{1}{2}), \varphi_\lambda(\mathbf{x}, -\tfrac{1}{2})\}$ for the Schrödinger equation (2.180) to remain valid.

The question, under what conditions time reversal symmetry induces degeneracy of eigenfunctions not accounted for by spatial symmetry alone, is treated in Appendix A in full generality. Here we deal only with a special case. We consider a non-symmetric wavevector $\mathbf{k}$. Let $\{\varphi_{\lambda\mathbf{k}}(\mathbf{x}, \tfrac{1}{2}), \varphi_{\lambda\mathbf{k}}(\mathbf{x}, -\tfrac{1}{2})\}$ be an eigenfunction with energy $E_\lambda(\mathbf{k})$. If the point group of directions contains the inversion, then the eigenfunction $\{\varphi_{\lambda -\mathbf{k}}(\mathbf{x}, \tfrac{1}{2}), \varphi_{\lambda -\mathbf{k}}(\mathbf{x}, -\tfrac{1}{2})\}$ corresponds to the same eigenvalue $E_\lambda(-\mathbf{k}) = E_\lambda(\mathbf{k})$. The time reverse of the Bloch function $\{\varphi_{\lambda -\mathbf{k}}(\mathbf{x}, \tfrac{1}{2}), \varphi_{\lambda -\mathbf{k}}(\mathbf{x}, -\tfrac{1}{2})\}$ likewise has energy $E_\lambda(\mathbf{k})$ and wavevector $\mathbf{k}$. It is linearly independent of $\{\varphi_{\lambda\mathbf{k}}(\mathbf{x}, \tfrac{1}{2}), \varphi_{\lambda\mathbf{k}}(\mathbf{x}, -\tfrac{1}{2})\}$, since the two spinors are mutually orthogonal, as one may easily show by evaluating the scalar product (here, one also has to sum over the spin coordinate $s$). Thus, two linearly independent eigenstates of the same energy have the wavevector $\mathbf{k}$. Since $\mathbf{k}$ was chosen arbitrarily, it follows that, due to time reversal symmetry, all bands of inversion-symmetric crystals are at least twofold degenerate.
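The orthogonality of a spinor and its time reverse can be checked with a few lines of arithmetic. The sketch below is an illustration under an assumption: it takes the time reversal operator in the widely used form $K = -i\sigma_y K_0$, with $K_0$ denoting complex conjugation, which is fixed only up to a phase factor. The spinor is evaluated at a single point $\mathbf{x}$, and the numerical values are arbitrary.

```python
def time_reverse(spinor):
    """Apply K = -i sigma_y K_0 to a two-component spinor (up, down)."""
    up, down = spinor
    return (-down.conjugate(), up.conjugate())

def scalar_product(a, b):
    """Scalar product of two spinors, summed over the spin coordinate."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

# Arbitrary spinor components at one point x (hypothetical values).
phi = (complex(0.6, 0.2), complex(-0.3, 0.7))
k_phi = time_reverse(phi)

print(scalar_product(phi, k_phi))   # vanishes: phi and K phi are orthogonal
print(time_reverse(k_phi))          # equals -phi: K applied twice gives -1
```

Since the integrand of the scalar product vanishes at every point, the integral over $\mathbf{x}$ vanishes as well, so a spinor state and its time reverse are linearly independent.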
2.5.6
There are many methods, quasi-analytic and numerical, for calculating the band structure of crystals. In the following we will give an overview. In band structure calculations one is confronted with two problems: firstly, the formulation of the one-electron Schrödinger equation, i.e. the determination of the effective periodic one-electron potential $V(\mathbf{x})$ of the crystal, and secondly, finding the eigenvalues and eigenfunctions of that equation. The various methods of band structure calculation differ in the manner in which these two problems are resolved. Here we will deal mainly with the second problem, i.e. with the solution of the Schrödinger equation whose potential is known. The first problem, the determination of the effective one-electron potential $V(\mathbf{x})$, has in principle been solved in section 2.2, where we dealt with the one-electron approximation for the many-electron system of a crystal. Here, we only address some further details of this problem.
[…] cases, the Hartree and Hartree-Fock approximations fail to give satisfying results, so that correlation effects must also be included. As indicated in section 2.2, one way this can be done is by means of the density functional theory in its local approximation (local density approximation or LDA). The LDA method yields very good results as far as valence bands are concerned, but it fails if applied to the conduction bands. In particular, as already mentioned in section 2.2, it does not reproduce the correct value of the fundamental energy gap. A procedure which avoids this failure is the Green's function method mentioned in section 2.2. This approach is now being applied more frequently under the name quasi-particle method (see, e.g., Bechstedt, 1992). Within this framework the self-energy operator is often taken in the so-called GW approximation (G stands for Green's function and W for the Coulomb potential).

We add two further remarks, related to the potential of the atomic cores. Firstly, only for relatively light atoms, such as C or Si, may spin and the spin-orbit interaction of the valence electrons be ignored. For the heavier atoms, such as Ge, this interaction is essential and must be incorporated in the effective one-electron potential. Secondly, the decomposition of the total electron system into valence and core electrons need not be made in the literal sense of these terms. What counts is which electrons are frozen in their atomic states and, therefore, need not be treated self-consistently, and which electrons must be. The latter are valence electrons in a more general sense. In III-V semiconductors, for example, they can also include d-electrons which, of course, do not belong to the valence shell of one of the two elements involved. In the extreme case, all electrons are treated as valence electrons. Then one has the so-called 'all electron problem'.
The solution of this problem involves an extraordinarily large numerical effort, so that such all-electron band structure calculations have been performed in only a few cases to date. From the physical point of view, they are the most satisfying. With increasing computing power they will become ever more important.
The solution procedures can be divided into three groups: firstly, the matrix methods, secondly, the cell methods, and thirdly, the muffin-tin methods.
Matrix methods
In the application of matrix methods one represents Bloch type eigenfunctions of a given quasi-wavevector $\mathbf{k}$ in a particular basis set consisting of a finite number of functions of the Bloch type. The Hamiltonian of the crystal is thereby represented by a k-dependent complex Hermitian matrix which has as many columns and rows as the basis set has functions per k-vector. The calculation of band structure is thus traced back to the determination of the eigenvalues and eigenvectors of a k-dependent Hamiltonian matrix. The basis sets employed differ by the number and kinds of functions they contain. One would like to manage with the fewest possible functions, to minimize the numerical effort. The price for this is a loss of precision, because with fewer basis vectors the eigenfunctions are necessarily approximated more crudely than with a larger number; in the extreme, one needs an infinite number of functions. For a fixed number of basis functions the level of precision achieved is higher for basis functions which are better adjusted to represent the eigenstates. The following basis functions prove to be of practical use:

- Plane waves $|\mathbf{k}+\mathbf{K}\rangle$ with $\mathbf{k}$ being a vector of the first BZ and $\mathbf{K}$ a reciprocal lattice vector. This constitutes a generalization of the nearly free electron approximation.
- Bloch sums of atomic orbitals, also called LCAO's (Linear Combinations of Atomic Orbitals), or of other localized functions. This includes the tight binding method.
- Bloch functions with Bloch factors for a special wavevector, referred to as Luttinger-Kohn functions. The so-called $\mathbf{k}\cdot\mathbf{p}$-method uses these functions. In some circumstances, the tight binding and $\mathbf{k}\cdot\mathbf{p}$-methods may be used to derive analytic expressions for the k-dispersion of energy bands. Such expressions are extremely useful to achieve a physical understanding of band structure. Therefore, we treat these two methods in greater detail below (sections 2.6 and 2.7).
- Orthogonalized plane waves, called OPW's. An OPW function $|\mathrm{OPW}_{\mathbf{k}+\mathbf{K}}\rangle$ is obtained from a plane wave $|\mathbf{k}+\mathbf{K}\rangle$ by subtracting a certain linear combination of core band eigenfunctions $|c\mathbf{k}\rangle$ of the crystal Hamiltonian. Within the frozen core approximation, the core band eigenfunctions $|c\mathbf{k}\rangle$ may be taken as Bloch sums of the core states of single atoms. The linear combinations to be subtracted are chosen such that the OPW's are orthogonal to all core eigenstates, so that

$$\langle c\mathbf{k}\,|\,\mathrm{OPW}_{\mathbf{k}+\mathbf{K}}\rangle = 0 \tag{2.236}$$

holds. The OPW's must have the form

$$|\mathrm{OPW}_{\mathbf{k}+\mathbf{K}}\rangle = |\mathbf{k}+\mathbf{K}\rangle - \sum_c \langle c\mathbf{k}|\mathbf{k}+\mathbf{K}\rangle\, |c\mathbf{k}\rangle \tag{2.237}$$

in order to satisfy the orthogonality condition (2.236). Making the expansion functions orthogonal to the core eigenstates accounts for the fact that eigenstates of the Hamiltonian for different energies are always mutually orthogonal. This means that the sought-after eigenstates of the valence electrons of the crystal for a given quasi-wavevector $\mathbf{k}$ must be orthogonal to the core eigenstates having the same quasi-wavevector $\mathbf{k}$. For this reason the OPW's are much better adjusted to represent the eigenfunctions of the valence electrons than are pure plane waves; correspondingly one needs fewer OPW's than plane waves to accurately represent the valence eigenfunctions. A further development of the OPW method is the pseudopotential method.
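The reduction of a band structure calculation to the eigenvalues of a k-dependent matrix can be illustrated with the plane-wave basis in one dimension. The sketch below is a toy model under stated assumptions, not a method from the text: units with $\hbar^2/2m = 1$, lattice constant 1, and a hypothetical model potential $V(x) = 2 V_1 \cos(Gx)$ with $G = 2\pi$, for which the Hamiltonian matrix in the basis $|k + nG\rangle$ is real, symmetric and tridiagonal. Its eigenvalues are found by Sturm-sequence bisection.

```python
import math

def plane_wave_matrix(k, v1, n_max, g=2.0 * math.pi):
    """Diagonal and off-diagonal of H in the basis |k + n g>, n = -n_max..n_max."""
    diag = [(k + n * g) ** 2 for n in range(-n_max, n_max + 1)]
    off = [v1] * (2 * n_max)          # <k + n g| V |k + (n + 1) g> = v1
    return diag, off

def count_below(diag, off, x):
    """Number of eigenvalues below x (Sturm sequence, tridiagonal matrix)."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        coupling = off[i - 1] ** 2 / q if i > 0 else 0.0
        q = d - x - coupling
        if q == 0.0:
            q = 1e-300                # avoid a division by zero in the next step
        if q < 0.0:
            count += 1
    return count

def eigenvalue(diag, off, index, lo, hi, tol=1e-10):
    """Eigenvalue number index (0 = lowest), located by bisection in [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > index:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def lowest_bands(k, v1, n_max=5, how_many=2):
    """Lowest band energies at wavevector k from 2 * n_max + 1 plane waves."""
    diag, off = plane_wave_matrix(k, v1, n_max)
    lo = min(diag) - 2.0 * abs(v1) - 1.0   # Gershgorin bounds on the spectrum
    hi = max(diag) + 2.0 * abs(v1) + 1.0
    return [eigenvalue(diag, off, i, lo, hi) for i in range(how_many)]

e0, e1 = lowest_bands(math.pi, 0.5)        # k at the zone boundary
print(e1 - e0)                             # gap close to 2 * v1
```

At the zone boundary the two lowest eigenvalues are separated by a gap close to $2|V_1|$, the nearly-free-electron result, and increasing `n_max` changes the lowest bands only slightly for a weak potential of this kind.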
Pseudopotential method
The idea underlying the pseudopotential method is to transfer the core state orthogonality term in equation (2.237) from the OPW's to the one-electron Hamiltonian $H$ of the crystal. This transfer is done as follows. Consider the core electrons, which we have hitherto taken jointly with the atomic nuclei to form the cores, to be independent particles, just like the valence electrons. This means that the potential created by the core electrons is no longer included in the core potential $V_c$, but is added to the effective one-electron potential $V_H + V_{xc}$ of the electron-electron interaction. Because of this reinterpretation of the one-electron Hamiltonian $H$, its eigenstates now also include core states, for which we have

$$H|c\mathbf{k}\rangle = E_c\,|c\mathbf{k}\rangle. \tag{2.238}$$

The valence band eigenvalues $E$ and eigenfunctions $|\psi_{\mathbf{k}}\rangle$ similarly satisfy the eigenvalue equation

$$H|\psi_{\mathbf{k}}\rangle = E\,|\psi_{\mathbf{k}}\rangle. \tag{2.239}$$

The expansion of $|\psi_{\mathbf{k}}\rangle$ with respect to OPW's, with expansion coefficients $a_{\mathbf{K}}$, reads

$$|\psi_{\mathbf{k}}\rangle = \sum_{\mathbf{K}} a_{\mathbf{K}}\, |\mathrm{OPW}_{\mathbf{k}+\mathbf{K}}\rangle. \tag{2.240}$$

Removal of the core states from the expansion functions mandates changing from the eigenfunctions $|\psi_{\mathbf{k}}\rangle$ to other functions $|\psi^{ps}_{\mathbf{k}}\rangle$ that have the same expansion coefficients, but plane waves as expansion functions:

$$|\psi^{ps}_{\mathbf{k}}\rangle = \sum_{\mathbf{K}} a_{\mathbf{K}}\, |\mathbf{k}+\mathbf{K}\rangle. \tag{2.241}$$

Surprisingly, the artificial wavefunctions $|\psi^{ps}_{\mathbf{k}}\rangle$ are in fact eigenfunctions of a particular Hamiltonian $H^{ps}$, having the same eigenvalue $E$ to which the eigenfunctions $|\psi_{\mathbf{k}}\rangle$ of $H$ correspond. In fact, by applying $H$ to $|\psi_{\mathbf{k}}\rangle$ one immediately finds that

$$H^{ps}|\psi^{ps}_{\mathbf{k}}\rangle = E\,|\psi^{ps}_{\mathbf{k}}\rangle \tag{2.242}$$

with

$$H^{ps} = H + \sum_c (E - E_c)\,|c\mathbf{k}\rangle\langle c\mathbf{k}|. \tag{2.243}$$

The Hamiltonian $H^{ps}$ is a well-defined linear operator, although it is non-local and depends on the eigenvalue $E$ itself. It is called a pseudo-Hamiltonian. The $|\psi^{ps}_{\mathbf{k}}\rangle$ are termed pseudo-wavefunctions. Each eigenvalue of $H^{ps}$ is simultaneously an eigenvalue of $H$, but the reverse statement does not hold. While core levels $E_c$ are eigenvalues of $H$, they are not also eigenvalues of $H^{ps}$ since the pseudo-wavefunctions of the core states vanish. The pseudo-Hamiltonian $H^{ps}$ may be written in a form which clarifies its meaning. To begin with, one revokes the reinterpretation of the Hamiltonian $H$, i.e. considers the core electrons no longer as independent particles which enter the effective one-electron potential $V_H + V_{xc}$ of the electron-electron interaction, but includes them again in the atomic cores, making them contributors to the core potential $V_c$. Then $H^{ps}$ takes the form
$$H^{ps} = \frac{\mathbf{p}^2}{2m} + V_H + V_{xc} + V_c + \sum_c (E - E_c)\,|c\mathbf{k}\rangle\langle c\mathbf{k}|. \tag{2.244}$$

The last two terms in this expression jointly constitute the so-called pseudopotential $V_c^{ps}$,

$$V_c^{ps} = V_c + \sum_c (E - E_c)\,|c\mathbf{k}\rangle\langle c\mathbf{k}|. \tag{2.245}$$

The second term on the right is significant only in the core regions. There, it is preferentially positive, indicating that it repels valence and conduction band electrons away from the cores. One can show that it largely compensates the variation of the core potential $V_c$ in these regions. Because of this, the pseudopotential $V_c^{ps}$ is relatively smooth throughout the cores. It can be made even smoother if one exploits a property of $V_c^{ps}$ which has not yet been discussed. We refer to the non-uniqueness of the repulsive part of $V_c^{ps}$ in equation (2.245). If the bra-vectors $(E - E_c)\langle c\mathbf{k}|$ in (2.245) are replaced by completely arbitrary functions (while keeping the ket-vectors $|c\mathbf{k}\rangle$ unchanged), the eigenvalues of the pseudo-Schrödinger equation (2.242) remain the same; only the pseudo-wavefunctions change. This freedom may be used to make the pseudopotential still smoother, and also to fulfill other requirements, such as, for example, norm-conservation of the pseudo-wavefunctions.

The smoothness of the pseudopotential makes it possible to restrict the plane-wave expansion (2.241) of the pseudo-wavefunction to terms having small reciprocal lattice vectors $\mathbf{K}$. Consequently, the representation matrix of the pseudo-Hamiltonian $H^{ps}$ is small and easy to diagonalize. This is the reason that the pseudopotential method is very helpful in calculating valence and conduction band structures of semiconductors. In order to apply this method, the pseudopotential must be known explicitly, of course. In principle it can be determined from the defining equation (2.245), with core levels and wavefunctions taken from atomic calculations, and with bra-vectors $\langle c\mathbf{k}|$ substituted by appropriate functions. In practical applications, the pseudopotential $V_c^{ps}$ is replaced by approximate expressions which range from empirical local pseudopotentials with adjustable parameters up to non-local pseudopotentials including core states of s-, p- and d-symmetry.

The pseudopotential method is generally successful if the true valence band eigenfunctions are also sufficiently smooth outside of the core region. This is the case as long as the valence band states are composed mainly of atomic s- and p-orbitals. If d-orbitals contribute to these states in an essential manner, i.e. if the d-electrons of the atoms are significantly involved in the chemical bonding of the crystal, then the pseudopotential method becomes problematic, because its main advantage, having the pseudo-eigenfunctions built up from a relatively small number of plane waves, no longer applies. This may occur in the case of III-V and II-VI compound semiconductors whose cations have flat d-levels, as in the case of Zn, for example (see Table 2.2). A completely different approach to solving the one-electron Schrödinger equation is taken in the so-called cell methods.
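The statement that a pseudo-wavefunction is an eigenfunction of the pseudo-Hamiltonian with the true valence eigenvalue can be verified in a small toy model. The sketch below is not a crystal calculation: it assumes a hypothetical three-dimensional state space with one core state and two valence states chosen by hand, the three unit vectors playing the role of plane waves, and a pseudo-Hamiltonian of the form $H^{ps} = H + \sum_c (E - E_c)|c\rangle\langle c|$. All names and numbers are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def combine(alpha, a, beta, b):
    """Linear combination alpha * a + beta * b of two vectors."""
    return [alpha * x + beta * y for x, y in zip(a, b)]

def projector(v, weight):
    """Matrix weight * |v><v| for a real vector v."""
    return [[weight * vi * vj for vj in v] for vi in v]

def add(m1, m2):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def apply(m, v):
    return [dot(row, v) for row in m]

# Hand-picked orthonormal eigenstates and energies (hypothetical values).
core = [1 / math.sqrt(3)] * 3
val1 = [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]
val2 = [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)]
E_CORE, E_VAL1, E_VAL2 = -10.0, -1.0, 2.0
H = add(add(projector(core, E_CORE), projector(val1, E_VAL1)),
        projector(val2, E_VAL2))

# "Plane waves" are the unit vectors; OPWs have the core part projected out.
plane_waves = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
opws = [combine(1.0, pw, -dot(core, pw), core) for pw in plane_waves]

# Pseudo-wavefunction: valence state plus an arbitrary admixture of the core.
pseudo = combine(1.0, val1, 0.7, core)
# The same coefficients used with OPWs give back the true valence state.
true_state = [sum(pseudo[K] * opws[K][i] for K in range(3)) for i in range(3)]

H_ps = add(H, projector(core, E_VAL1 - E_CORE))
residual = combine(1.0, apply(H_ps, pseudo), -E_VAL1, pseudo)
print(true_state, residual)
```

The residual vanishes for any admixture of the core state, which reflects the non-uniqueness of the pseudo-wavefunction noted above, whereas the original Hamiltonian applied to the same vector does not return a multiple of it.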
Cell methods
These methods are 3-dimensional generalizations of the method of matching conditions usually employed in solving the Schrödinger equation for the square well potential and other 1-dimensional potentials. In cell methods, one first determines linearly independent solutions of the Schrödinger equation within a primitive unit cell for arbitrary energies. Forming Bloch sums with them, one constructs the solution for the total crystal. The requirement that these functions, and their first derivatives, be continuous at the boundaries of the unit cells determines the energy eigenvalues and eigenfunctions. This method suffers from the arbitrariness of the choice of the unit cells and the difficulty of satisfying the boundary conditions over the whole surface, i.e. at an infinite number of points. The most natural choice of unit cell is the Wigner-Seitz cell. A further development of cell methods lies, in a sense, in the muffin-tin methods.
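The 1-dimensional matching procedure that cell methods generalize can be sketched for a periodic piecewise-constant potential. The example below is an illustration under stated assumptions, not part of the text: units with $\hbar^2/2m = 1$, a hypothetical unit cell given as a list of (length, potential) segments, and the pair $(\psi, \psi')$ propagated continuously through the cell by transfer matrices. The Bloch condition then requires the trace of the cell matrix to equal $2\cos(ka)$, so an energy is allowed only if the absolute value of the trace does not exceed 2.

```python
import math

def segment_matrix(energy, potential, length):
    """Transfer matrix of (psi, psi') across a region of constant potential."""
    diff = energy - potential
    if diff > 0.0:                      # oscillating solutions
        q = math.sqrt(diff)
        return [[math.cos(q * length), math.sin(q * length) / q],
                [-q * math.sin(q * length), math.cos(q * length)]]
    if diff < 0.0:                      # exponentially varying solutions
        kappa = math.sqrt(-diff)
        return [[math.cosh(kappa * length), math.sinh(kappa * length) / kappa],
                [kappa * math.sinh(kappa * length), math.cosh(kappa * length)]]
    return [[1.0, length], [0.0, 1.0]]  # energy equal to the potential

def cell_trace(energy, cell):
    """Trace of the transfer matrix across one unit cell."""
    m = [[1.0, 0.0], [0.0, 1.0]]
    for length, potential in cell:
        s = segment_matrix(energy, potential, length)
        m = [[sum(s[i][k] * m[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]
    return m[0][0] + m[1][1]

def is_allowed(energy, cell):
    """True if a Bloch solution with real wavevector exists at this energy."""
    return abs(cell_trace(energy, cell)) <= 2.0

# Hypothetical cell: a well of width 0.5 followed by a barrier of height 10.
cell = [(0.5, 0.0), (0.5, 10.0)]
for energy in (0.01, 6.5):
    print(energy, cell_trace(energy, cell), is_allowed(energy, cell))
```

In one dimension the matching has to be enforced only at isolated points, which is why the condition reduces to a single trace; in three dimensions the corresponding conditions must hold over the whole cell surface, which is the difficulty mentioned above.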
Muffin-tin methods
In the present method one delimits spheres around the atoms, and leaves some empty space around them. The crystal then looks something like a muffin tin, whence the method's name. Within the spheres one uses the spherically symmetric potentials of the atoms, and in the surrounding regions one takes the potential to be uniform.
In a special muffin-tin method, the augmented plane wave (APW) method, the solutions of the Schrödinger equation within the spheres are expanded with respect to angular momentum eigenfunctions. The radial parts of the expansion are determined by the radial Schrödinger equation. This is solved numerically for the various angular momentum quantum numbers. Then, the still unknown expansion coefficients are determined by the requirement that the solutions within the spheres must join continuously with the solutions outside. The latter are plane waves of some wavevector $\mathbf{k} + \mathbf{K}$. The functions constructed in this way are called augmented plane waves (APW's). They are, of course, not eigenfunctions of the one-electron Schrödinger equation of the crystal, but may be taken as a basis for them. In contradistinction to the basis functions used in matrix methods, the APW's still depend on the unknown energy eigenvalues. An eigenfunction expansion with respect to APW's leads, as in matrix methods, to a homogeneous set of equations for the expansion coefficients, but the matrix elements are, unlike the Hamiltonian in matrix methods, functions of energy. However, the APW matrices are in general smaller, as a consequence of the use of better adjusted basis functions. Often, a linearized energy dependence of the matrix elements yields useful results in the linearized APW or LAPW method. If, in constructing the APW's, Gaussian functions are used instead of the angular momentum eigenfunctions, one speaks of muffin-tin orbitals (MTO's), upon which the MTO and LMTO methods rest.
The different APW's are indexed by reciprocal lattice vectors $\mathbf{K}$, in addition to angular momentum quantum numbers. In another muffin-tin method, called the Korringa-Kohn-Rostoker (KKR) method, which takes advantage of the formal scattering theory of quantum mechanics in the Green's function formulation, the expansion functions are also angular momentum eigenfunctions between the spheres.
In band structure calculations for semiconductors, some of the methods listed above are used more frequently than others. Among the ab initio procedures, the pseudopotential method combined with density functional theory in its local approximation, and lately with the Green's function method, is particularly important. In addition, the APW and LMTO methods are also used, mainly in their linearized forms. Of practical importance among the empirical procedures are, above all, the empirical versions of the tight binding and of the pseudopotential methods.
2.6
In the nearly-free-electron approximation, the eigenstates of the one-electron Hamiltonian of a crystal are represented by a superposition of plane waves. The non-diagonal matrix elements of the periodic potential with respect to these functions are treated as a small perturbation. Thus the eigenstates are weakly disturbed plane waves in which the electrons are spread out almost uniformly over the whole crystal. Such a distribution might be valid for electrons of the conduction band, but it does not correspond to the reality one should expect for valence electrons if one considers the crystal to be formed from previously isolated atoms. In free atoms, the valence electrons are localized at their respective atomic cores. Although this localization must be partially breached in the crystal, in order for chemical bonding to take place, an almost complete delocalization such as is assumed in the nearly-free-electron approximation is not to be expected. This suggests a more appropriate approximation which takes atomic wavefunctions as the basis set and treats the non-diagonal elements with respect to these functions as small perturbations. This approximation is called the tight binding (TB) approximation. The errors of this approximation are expected to be small if the valence electrons of the crystal are well localized at the atoms. The approximation of nearly free electrons will work very poorly in this case, while it gives good results if the electrons are weakly localized, i.e. when the tight binding approximation is not applicable. In this sense the two approximations are complementary. Results which are obtained from both these approximations may be considered to be independent of any particular approximation, i.e. to be exact. The term exact here means within the framework of the simplifications made earlier.
One of these simplifications, the approximation of frozen cores introduced in section 2.1, is particularly important because it allows us to deal with only the valence electrons of the free atoms, while the core electrons are incorporated in the atomic cores. In this section we will develop the basic principles of the TB approximation. These principles will be applied to semiconductors of the diamond and zincblende type, and it will be shown that the TB approximation not only is capable of explaining the valence band structure of these crystals, but also provides insight into their chemical bonding and atomic structures.
2.6.1
Fundamentals
Atomic orbitals
The basis functions of the TB approximation are the one-particle eigenfunctions of the valence electrons of the free atoms, more strictly, of the atoms composing the crystal under consideration. These eigenfunctions are called atomic orbitals. The spinor character of the orbitals may be taken into account, but we will omit it here for brevity. The most important property of the orbitals, which turns out to be decisive for the TB approximation, is their spatial symmetry. The latter is determined by the symmetry of the Hartree or Hartree-Fock potentials of the atomic cores. These potentials are isotropic if all core shells are fully populated by electrons. In the case of atoms forming diamond and zincblende type semiconductors, this condition is always satisfied. Thus we may assume isotropic core potentials, and the atomic orbitals of given energy eigenvalues form basis sets of irreducible representations of the full orthogonal symmetry group. These representations are characterized by an angular momentum quantum number $l$ which may assume all non-negative integral values. The irreducible representation of a given quantum number $l$ is $(2l+1)$-fold degenerate. The $2l+1$ basis functions are distinguished by the magnetic quantum number $m$ which takes all integer values between $-l$ and $+l$. The energy spectrum of the free atom is degenerate with respect to $m$. For the hydrogen atom there is also a degeneracy with respect to $l$. This is not due to the spatial symmetry of the potential but to its pure Coulomb form. Here this additional degeneracy may not be assumed. For each value of the quantum number $l$ one has an infinite set of different energy eigenvalues. These are distinguished by the main quantum number $n$ which may take all integer values from $l+1$ to $\infty$. In this way the energy levels $E_{nl}$ of an atom depend on the two quantum numbers $n$ and $l$, and the corresponding eigenfunctions $\phi_{nlm}(\mathbf{x})$ depend on three quantum numbers $n, l, m$. The eigenfunctions $\phi_{nlm}(\mathbf{x})$ can be written as products of the radial wavefunctions $R_{nl}(|\mathbf{x}|)$ and spherical harmonics $Y_{lm}(\mathbf{x}/|\mathbf{x}|)$,
$$\phi_{nlm}(\mathbf{x}) = R_{nl}(|\mathbf{x}|)\, Y_{lm}(\mathbf{x}/|\mathbf{x}|). \tag{2.246}$$
Here, it is assumed that the atomic core is located at the coordinate origin $\mathbf{x} = 0$. A wavefunction with the quantum number $l = 0$ is called an s-orbital, one with $l = 1$ a p-orbital, with $l = 2$ a d-orbital etc. In order to represent the eigenstates of the valence electrons of a crystal, one needs, rigorously speaking, all orbitals of the cores of its atoms since only the totality of all orbitals forms a complete basis set in Hilbert space. However, not all of these contribute in an essential manner. The largest contributions are to be expected from orbitals forming the valence shells of the free atoms. Within the TB approximation one takes only these orbitals into account. This corresponds to a perturbation-theoretic treatment of the Hamiltonian matrix with respect to the atomic orbital basis; only matrix elements between valence orbitals are considered while those involving other orbitals are neglected. For the elemental semiconductors of the fourth group of the periodic table the valence shell orbitals are formed by the four ns- and np-states with $n = 2$ for C, $n = 3$ for Si, $n = 4$ for Ge, and $n = 5$ for α-Sn. In the case of semiconductors composed of different elements, the valence shell orbitals of the various atoms must be considered. For GaAs that means the one 4s-state and the three 4p-states of Ga, and the one 4s- and the three 4p-states of As. For GaP one has, besides the above-mentioned 4s- and 4p-states of Ga, the one 3s-state and the three 3p-states of P.
The valence shell orbitals used as basis functions need not, of course, be populated by electrons in the case of the free atoms. For Si, for example, two of the three p-orbitals are empty. Similarly, the eigenstates of the crystal, which will be calculated later by means of the TB approximation, will also not be completely populated. As this applies to quantum mechanics in general, the eigenstates are candidates for possible population. Whether they are populated or not depends on the macroscopic state of the system, e.g., on the temperature of the crystal.
In the examples considered above the valence shells of the atoms are formed by s- and p-states. This is the typical case for tetrahedral semiconductors composed of elements of the main groups II, IV, and VI of the periodic table, but it is by no means the only possibility, especially if one also includes other material classes. For face-centered cubic metals such as Cu and Ni, for example, the valence shells are formed by 3d- and 4s-states. In section 2.1 it was already mentioned that d-states may contribute to the valence shell of II-VI semiconductors with heavy metal atoms such as Zn. Here, we will exclusively consider semiconductor materials for which only s- and p-states need to be taken as basis functions. The corresponding spherical harmonics $Y_{lm}(\mathbf{x}/|\mathbf{x}|)$ are
Using these harmonics, eigenfunctions $\phi_{n00}$, $\phi_{n11}$, $\phi_{n10}$, $\phi_{n1-1}$ may be formed according to equation (2.246). Instead of $\phi_{n11}$ and $\phi_{n1-1}$ one may use the linear combinations
The latter are also energy eigenfunctions because of the degeneracy of $\phi_{n11}$ and $\phi_{n1-1}$ with respect to the magnetic quantum number $m$. For the sake of uniformity, one sets
Figure 2.16: Polar diagrams of the atomic s- and p-orbitals in Cartesian representation.
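For orientation, the $l = 0$ and $l = 1$ spherical harmonics and the Cartesian combinations built from them may be written out explicitly. The following is a sketch that assumes the Condon-Shortley phase convention; this convention is an assumption made here, and with a different phase choice the signs in the linear combinations change accordingly.

```latex
\begin{align}
Y_{00} &= \frac{1}{\sqrt{4\pi}}, \qquad
Y_{10} = \sqrt{\frac{3}{4\pi}}\,\frac{z}{|\mathbf{x}|}, \qquad
Y_{1\pm 1} = \mp\sqrt{\frac{3}{8\pi}}\,\frac{x \pm i y}{|\mathbf{x}|}, \\
\phi_{nx} &= \frac{1}{\sqrt{2}}\left(\phi_{n1-1} - \phi_{n11}\right)
          = \sqrt{\frac{3}{4\pi}}\,R_{n1}(|\mathbf{x}|)\,\frac{x}{|\mathbf{x}|}, \\
\phi_{ny} &= \frac{i}{\sqrt{2}}\left(\phi_{n1-1} + \phi_{n11}\right)
          = \sqrt{\frac{3}{4\pi}}\,R_{n1}(|\mathbf{x}|)\,\frac{y}{|\mathbf{x}|}, \\
\phi_{ns} &= \phi_{n00}, \qquad \phi_{nz} = \phi_{n10}.
\end{align}
```

In this form the three p-type functions differ only by the Cartesian component appearing in the angular factor, which is why they are convenient for crystals with cubic axes.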
The eigenfunctions of equations (2.246) to (2.248) will be referred to as spherical orbitals, and those of (2.249), (2.250) as Cartesian orbitals. The latter are visualized in Figure 2.16. The quantum numbers $nlm$ of the spherical orbitals, as well as $ns, nx, ny, nz$ of the Cartesian orbitals, will be abbreviated by a general index $\alpha$. The orbitals considered above are orthonormalized, i.e. one has
$$\langle\phi_{\alpha'}|\phi_{\alpha}\rangle = \delta_{\alpha'\alpha}. \tag{2.251}$$
If the atomic core is not located at the coordinate origin, as has been assumed thus far, but at a particular lattice position $\mathbf{R}+\boldsymbol{\tau}_j$, the corresponding orbitals will be denoted by $\phi_{\alpha j\mathbf{R}}(\mathbf{x})$. They may be traced back to the orbitals $\phi_{\alpha}(\mathbf{x})$ of atoms located at the origin by shifting their arguments in accordance with
$$\phi_{\alpha j\mathbf{R}}(\mathbf{x}) = \phi_{\alpha}(\mathbf{x}-\mathbf{R}-\boldsymbol{\tau}_j). \tag{2.252}$$
Two different orbitals $\phi_{\alpha' j'\mathbf{R}'}(\mathbf{x})$ and $\phi_{\alpha j\mathbf{R}}(\mathbf{x})$ with identical values of $\mathbf{R}'$ and $\mathbf{R}$ as well as of $j'$ and $j$, but different values of $\alpha'$ and $\alpha$, are also orthogonal to each other. For $\mathbf{R}' \neq \mathbf{R}$ or $j' \neq j$, i.e. for orbitals at different centers, no such orthogonality exists. Although the two orbitals are localized in different spatial regions, and the integral over the product of the two, the so-called overlap integral, turns out to be relatively small, it may not be neglected because its influence on the energy eigenvalues is of the same order of magnitude as the matrix elements of the Hamiltonian between orbitals at different centers. The latter elements are essential because they are responsible for the bonding between atoms in a crystal and for the splitting of the atomic energy levels into bands. The nonorthogonality overlap integrals must therefore also be taken into account. This may be done directly, by writing down and solving the eigenvalue problem for the crystal Hamiltonian in the nonorthogonal basis set of the atomic orbitals. This procedure is, however, quite inconvenient because the matrix of overlap integrals has to be calculated explicitly and diagonalized together with the Hamiltonian matrix.
It is more useful to employ a set of orthogonalized orbitals by forming suitable linear combinations of the $\phi_{\alpha j\mathbf{R}}(\mathbf{x})$. Here, 'suitable' means that the new orbitals should have the same spatial symmetries as the original atomic orbitals $\phi_{\alpha j\mathbf{R}}(\mathbf{x})$, because these symmetries are the essential properties that allow the matrix elements of the Hamiltonian to be reduced to a few constants. Consequently, the new orbitals must likewise form basis sets of irreducible representations of the full orthogonal group, each set being characterized by a particular angular momentum quantum number $l$. That there are indeed linear combinations of atomic orbitals possessing these properties forms the content of Löwdin's theorem (see, e.g., Slater, Koster, 1954). We will use this theorem below. The orbitals $\phi_{\alpha j\mathbf{R}}(\mathbf{x})$, which thus far have been taken as ordinary atomic orbitals, will henceforth be understood as atomic orbitals in the sense of Löwdin's theorem. The orbitals $\phi_{\alpha j\mathbf{R}}(\mathbf{x})$ are now, of course, no longer given by the expressions (2.246) to (2.250), but their indices $\alpha \equiv nlm$ will retain the meaning they previously had for the ordinary atomic orbitals. Although explicit expressions for the Löwdin orbitals $\phi_{\alpha j\mathbf{R}}(\mathbf{x})$ can in principle be provided, they will not be given here since they are not needed if one proceeds in the manner to be discussed below. The Löwdin orbitals are, by definition, also orthogonal for different centers, such that
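$$\langle\phi_{\alpha' j'\mathbf{R}'}|\phi_{\alpha j\mathbf{R}}\rangle = \delta_{\alpha'\alpha}\,\delta_{j'j}\,\delta_{\mathbf{R}'\mathbf{R}}. \tag{2.253}$$
According to (2.252), a lattice translation by $\mathbf{R}'$ transforms the orbitals as
$$\phi_{\alpha j\mathbf{R}}(\mathbf{x}+\mathbf{R}') = \phi_{\alpha j\,\mathbf{R}-\mathbf{R}'}(\mathbf{x}). \tag{2.254}$$
From the orbitals one forms the linear combinations
$$\phi_{\alpha j\mathbf{k}}(\mathbf{x}) = \frac{1}{\sqrt{G^3}}\sum_{\mathbf{R}} e^{i\mathbf{k}\cdot(\mathbf{R}+\boldsymbol{\tau}_j)}\,\phi_{\alpha j\mathbf{R}}(\mathbf{x}). \tag{2.255}$$
The sum over $\mathbf{R}$ extends over all $G^3$ lattice points of the periodicity region. By means of (2.254) one may readily verify that
$$\phi_{\alpha j\mathbf{k}}(\mathbf{x}+\mathbf{R}) = e^{i\mathbf{k}\cdot\mathbf{R}}\,\phi_{\alpha j\mathbf{k}}(\mathbf{x}), \tag{2.256}$$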
which identifies the $\phi_{\alpha j\mathbf{k}}(\mathbf{x})$ as Bloch functions of quasi-wavevector $\mathbf{k}$. The $\phi_{\alpha j\mathbf{k}}(\mathbf{x})$ are called Bloch sums of atomic orbitals. If one replaces $\mathbf{k}$ in $\phi_{\alpha j\mathbf{k}}$ by a wavevector which differs from $\mathbf{k}$ by a reciprocal lattice vector $\mathbf{K}$ then, because of the relation $\exp[i\mathbf{K}\cdot\mathbf{R}] = 1$, the original wavefunction is reproduced except for an unimportant phase factor. Therefore, it suffices to take $\mathbf{k}$ in the first BZ. Obviously, the use of the Bloch sums (2.255) automatically puts us in the reduced zone scheme. The orthogonality of the Löwdin orbitals $\phi_{\alpha j\mathbf{R}}(\mathbf{x})$ results in the orthogonality of their Bloch sums $\phi_{\alpha j\mathbf{k}}(\mathbf{x})$, such that
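$$\langle\phi_{\alpha' j'\mathbf{k}'}|\phi_{\alpha j\mathbf{k}}\rangle = \delta_{\alpha'\alpha}\,\delta_{j'j}\,\delta_{\mathbf{k}'\mathbf{k}}. \tag{2.257}$$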
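The orthonormality (2.257) and the Bloch property (2.256) can be checked numerically on a small model. The sketch below assumes a hypothetical one-dimensional ring of `G` sites with one orthonormal localized orbital per site; the names `G`, `a`, and `bloch` are illustrative and not taken from the text.

```python
import numpy as np

G = 8                                   # hypothetical number of unit cells in the periodicity region
a = 1.0                                 # hypothetical lattice constant
R = a * np.arange(G)                    # lattice points of the ring
k = 2.0 * np.pi * np.arange(G) / (G * a)  # wavevectors allowed by periodic boundary conditions

# Column n holds the expansion coefficients of the Bloch sum with wavevector k[n]
# in the basis of orthonormal localized orbitals sitting at the sites R.
bloch = np.exp(1j * np.outer(R, k)) / np.sqrt(G)

# Overlap matrix <phi_k' | phi_k>; orthonormal orbitals make this a plain matrix product.
overlap = bloch.conj().T @ bloch

# Translating by one lattice constant shifts the coefficients cyclically by one site.
translated = np.roll(bloch, -1, axis=0)
```

The overlap matrix comes out as the unit matrix, and the translated coefficients equal the original ones multiplied by $e^{ika}$, which is the one-dimensional analogue of (2.256).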
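In the TB approximation, the eigenfunctions $\varphi_{\nu\mathbf{k}}(\mathbf{x})$ of the Schrödinger equation are written as linear combinations
$$\varphi_{\nu\mathbf{k}}(\mathbf{x}) = \sum_{j\alpha}\langle\alpha j\mathbf{k}|\varphi_{\nu\mathbf{k}}\rangle\,\phi_{\alpha j\mathbf{k}}(\mathbf{x}) \tag{2.258}$$
of Bloch sums. The expansion coefficients $\langle\alpha j\mathbf{k}|\varphi_{\nu\mathbf{k}}\rangle$ and the energy eigenvalues $E_{\nu}(\mathbf{k})$ follow from the eigenvalue equations
$$\sum_{j'\alpha'}\langle\alpha j\mathbf{k}|H|\alpha' j'\mathbf{k}\rangle\,\langle\alpha' j'\mathbf{k}|\varphi_{\nu\mathbf{k}}\rangle = E_{\nu}(\mathbf{k})\,\langle\alpha j\mathbf{k}|\varphi_{\nu\mathbf{k}}\rangle, \tag{2.259}$$
where the matrix elements of the Hamiltonian are given by the expressions
$$\langle\alpha j\mathbf{k}|H|\alpha' j'\mathbf{k}\rangle = \sum_{\mathbf{R}'} e^{i\mathbf{k}\cdot(\mathbf{R}'+\boldsymbol{\tau}_{j'}-\boldsymbol{\tau}_j)}\,\langle\alpha j\mathbf{0}|H|\alpha' j'\mathbf{R}'\rangle \tag{2.260}$$
with
$$\langle\alpha j\mathbf{0}|H|\alpha' j'\mathbf{R}'\rangle = \int d^3x\;\phi^{*}_{\alpha}(\mathbf{x}-\boldsymbol{\tau}_j)\,H\,\phi_{\alpha'}(\mathbf{x}-\mathbf{R}'-\boldsymbol{\tau}_{j'}). \tag{2.261}$$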
In deriving equation (2.260), the lattice translational symmetry of $H$ has been used. For a given wavevector $\mathbf{k}$, the $\langle\alpha j\mathbf{k}|H|\alpha' j'\mathbf{k}\rangle$ form a square matrix with a finite number of rows and columns, the same as the number of different orbitals per primitive unit cell which were used for the representation of the eigenfunctions, i.e. the number $J$ of atoms per primitive unit cell times the number $A$ of orbitals per atom. The integrals $\langle\alpha j\mathbf{0}|H|\alpha' j'\mathbf{R}'\rangle$ in (2.261) describe the interaction of electrons in orbitals $\alpha$ and $\alpha'$, where one of the orbitals belongs to the atom at $\boldsymbol{\tau}_j$ and the other to the atom at $\mathbf{R}'+\boldsymbol{\tau}_{j'}$. These integrals substantially decrease with increasing distance between the atoms, so that it suffices in most cases to extend the sum on $\mathbf{R}'$ in (2.260) over the nearest, or if need be, the second nearest neighbor atoms only. Once the integrals $\langle\alpha j\mathbf{0}|H|\alpha' j'\mathbf{R}'\rangle$ and, through them, also the matrix elements $\langle\alpha j\mathbf{k}|H|\alpha' j'\mathbf{k}\rangle$ are known, the energy eigenvalues and eigenfunctions of the Schrödinger equation may be obtained by calculating the eigenvalues and eigenvectors of the $(J\times A)\times(J\times A)$-dimensional Hamiltonian matrix of equation (2.260). This is an easily solvable task which occasionally can be treated analytically but can, in any case, be done numerically. For each $\mathbf{k}$ one has $J\times A$ eigenvalues and eigenvectors. They will be partially degenerate if symmetrical $\mathbf{k}$-vectors are considered. If $\mathbf{k}$ varies over the first BZ, the $J\times A$ energy eigenvalues form $J\times A$ energy bands.
Simple example: cubic crystal composed of s-atoms
We will illustrate the above discussion with a simple model. Consider a crystal having a primitive cubic lattice and one atom per primitive unit cell ($j = 1$). The atoms are placed at the lattice points $\mathbf{R}$, with $\boldsymbol{\tau}_1 = 0$. The set of atomic orbitals will be restricted to one ns-orbital only, such that $\alpha = ns$. With $J = 1$ and $A = 1$, a matrix of size $1\times 1 = 1$ is obtained. Thus the solution of the eigenvalue problem is, in fact, trivial. In evaluating the matrix element $\langle ns1\mathbf{k}|H|ns1\mathbf{k}\rangle$, we will terminate the $\mathbf{R}'$-sum in expression (2.260) at the nearest neighbor atoms. The latter are located at $\mathbf{R}_{1,2} = (\pm a,0,0)$, $\mathbf{R}_{3,4} = (0,\pm a,0)$, $\mathbf{R}_{5,6} = (0,0,\pm a)$. For the matrix elements occurring in equation (2.260), we use the notation
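$$\langle ns1\mathbf{0}|H|ns1\mathbf{0}\rangle = \epsilon_{ns}, \tag{2.262}$$
$$\langle ns1\mathbf{0}|H|ns1\mathbf{R}_t\rangle = \beta_{ns}, \qquad t = 1, 2, \ldots, 6. \tag{2.263}$$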
Here, we have employed the fact that, for reasons of symmetry, all six neighbor atoms give rise to the same value of the integral (2.261). The value of $\beta_{ns}$ depends on the overlap of the ns-orbitals localized at adjacent atoms. With increasing distance between the atoms, $\beta_{ns}$ approaches 0. The value of $\epsilon_{ns}$, in a crude approximation, may be identified with the energy of the ns-level of the free atom. In terms of $\epsilon_{ns}$ and $\beta_{ns}$, the matrix element $\langle ns1\mathbf{k}|H|ns1\mathbf{k}\rangle$ of (2.260) is given by
$$\langle ns1\mathbf{k}|H|ns1\mathbf{k}\rangle = \epsilon_{ns} + \beta_{ns}\sum_{t=1}^{6} e^{i\mathbf{k}\cdot\mathbf{R}_t}. \tag{2.264}$$
It follows that the energy eigenvalues $E_{ns}(\mathbf{k})$ of the Schrödinger equation are
$$E_{ns}(\mathbf{k}) = \epsilon_{ns} + 2\beta_{ns}\left[\cos(k_x a) + \cos(k_y a) + \cos(k_z a)\right]. \tag{2.265}$$
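The step from the neighbor sum (2.264) to the cosine form (2.265) can be checked numerically. The following is a minimal sketch; the function names and the parameter values `eps_ns`, `beta_ns`, `a` are illustrative assumptions, not values from the text.

```python
import numpy as np

# Six nearest neighbors of the simple cubic lattice, in units of the lattice constant.
NEIGHBORS = np.array([[1, 0, 0], [-1, 0, 0],
                      [0, 1, 0], [0, -1, 0],
                      [0, 0, 1], [0, 0, -1]], dtype=float)

def s_band_from_sum(k, eps_ns, beta_ns, a):
    """Evaluate (2.264): eps_ns + beta_ns * sum_t exp(i k.R_t)."""
    phases = np.exp(1j * (a * NEIGHBORS) @ np.asarray(k, dtype=float))
    return float((eps_ns + beta_ns * phases.sum()).real)

def s_band_closed_form(k, eps_ns, beta_ns, a):
    """Evaluate (2.265): eps_ns + 2 beta_ns (cos kx a + cos ky a + cos kz a)."""
    return float(eps_ns + 2.0 * beta_ns * np.cos(a * np.asarray(k, dtype=float)).sum())

# Illustrative (hypothetical) parameters: level energy, neighbor integral, lattice constant.
eps_ns, beta_ns, a = -10.0, -0.5, 1.0

# Dispersion along the line (0, 0, kz) of the first BZ and the resulting band width.
kz_values = np.linspace(-np.pi / a, np.pi / a, 201)
band = np.array([s_band_from_sum((0.0, 0.0, kz), eps_ns, beta_ns, a) for kz in kz_values])
band_width = band.max() - band.min()
```

Along the line $(0, 0, k_z)$ the two cosines with $k_x = k_y = 0$ contribute a constant shift, so the band width on this line is $4|\beta_{ns}|$.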
It is of interest to further discuss the eigenvalues $E_{ns}(\mathbf{k})$, to gain insight into the formation of energy bands. In this context, the main quantum number $n$ will be allowed to take not just one value, as previously assumed, but several. This means that the valence electrons in the free atom do not occupy only one s-level, but several s-levels $\epsilon_{ns}$ differing in the value of $n$. To justify the application of the results obtained above in the present case, the matrix elements of $H$ between s-orbitals having different values of $n$ must be negligibly small. We assume this to be true. In addition, we suppose that $\epsilon_{ns}$ is negative for all $n$, and that the sign of $\beta_{ns}$ alternates, such that for $n = 1$ it is taken to be negative, for $n = 2$ positive, for $n = 3$ negative etc. This behavior reflects the differing numbers of nodes of the atomic wavefunctions for different values of $n$. The absolute values of $\beta_{ns}$ are expected to increase with growing $n$, corresponding to the larger values which the ns-orbitals with larger $n$ have at the nearest neighbor atoms. The separation between adjacent energy levels $\epsilon_{ns}$ should, however, always be larger than $4|\beta_{ns}|$.
Figure 2.17: Energy bands in tight binding approximation for the simple s-atom crystal described in the text. (Horizontal axis: wavevector $(0, 0, k_z)$ from $-\pi/a$ to $\pi/a$; the bands are labeled 1s to 4s.)
If the above conditions are met, the energy band dispersion along the line $(0, 0, k_z)$ of the first BZ of the simple cubic lattice under consideration has the form shown in Figure 2.17. Each level $\epsilon_{ns}$ of the free atom gives rise to an energy band of the crystal. The width of these bands amounts to $4|\beta_{ns}|$, thus increasing with increasing $n$. The bands are separated by forbidden energy regions. Their widths decrease with growing $n$. If the distance between nearest neighbor atoms increases, then the parameters $\beta_{ns}$ of equation (2.263) decrease because of decreasing overlap of the orbitals, as has been pointed out above. The same holds for the widths of the energy bands: they become narrower if the distance between neighboring atoms grows. They approach the discrete levels $\epsilon_{ns}$ of free atoms as this distance becomes infinitely large, corresponding to $\beta_{ns} = 0$. In the latter case, electron states with different values of the wavevector component $k_z$ have the same energy, $\epsilon_{ns}$. These levels are, therefore, highly degenerate. If the infinitely remote atoms again approach each other, then the $\beta_{ns}$-values become finite because of the onset of overlap of neighboring orbitals. Correspondingly, the degeneracy of electron states with different $k_z$ is removed, and the discrete levels of the free atoms spread into bands.
The TB approximation quite naturally explains, in this way, how discrete energy levels of the free atoms transform into energy bands of the crystal. The gaps between the bands occur naturally in this approach, since the discrete atomic levels are separated by energy gaps from the outset. This stands in contrast to the approximation of nearly free electrons, where the occurrence of energy gaps calls for an explanation. On the other hand, 'bandwidth', in the form of the infinitely broad energy continuum of the free electron, is present at the outset in the latter approach. The gaps induced into the energy continuum were seen to arise because of the strong perturbation of plane wave states by the periodic lattice array of atomic cores for wavevectors on the Bragg reflection planes. Comparison of the two approximation procedures reveals the difference between the underlying concepts: the TB approximation emphasizes the atoms and the short-range ordered complexes of the crystal, while the approximation of nearly free electrons focuses on the crystal as a whole and the long-range ordering of the atoms.
Such comparison also shows that the short-range and long-range order concepts are equivalent in the sense that they result in the same characteristic features of the electronic structure of crystals: both concepts predict the existence of energy bands separated by gaps. Figure 2.18 depicts the manner in which the bands and gaps arise in the two approximations.
2.6.2
Semiconductors of the diamond and zincblende type are tetrahedrally coordinated cubic crystals with two atoms per primitive unit cell and four valence shell orbitals per atom, among them one s-orbital ($l = 0$) and three p-orbitals ($l = 1$). The application of the TB method to this special case is of particular importance. It will be developed in the present subsection. Only nearest neighbor interactions will be taken into account because this introduces considerable simplification and nevertheless gives results of reasonable accuracy. For the p-orbitals, we choose the Cartesian form, so that the orbital index $\alpha$ takes the values $ns, nx, ny$ and $nz$. The Cartesian components $x, y, z$ refer to the cubic crystal axes. The main quantum number $n$ will be suppressed below because it may assume only one value here for each of the two atoms, although not necessarily the same. As a first step we determine the $8\times 8$ Hamiltonian matrix for diamond type crystals. Later, it will be generalized to zincblende type structures.
Figure 2.18: Illustration of the origin of energy bands and gaps in the energy spectrum of a crystal.
Hamiltonian matrix
We start with matrix elements between orbitals at equivalent atoms.
Matrix elements between orbitals at equivalent atoms
These are elements of the general form $\langle\alpha j\mathbf{k}|H|\alpha' j\mathbf{k}\rangle$. In the case of diamond type crystals, the two atoms $j = 1$ and $j = 2$ of the primitive unit cell are of the same chemical nature, thus their matrix elements will be identical. Since we are restricting ourselves to nearest neighbor interactions, only the term with $\mathbf{R}' = 0$ needs to be considered in the $\mathbf{R}'$-sum of formula (2.260). Thereby, $\langle\alpha j\mathbf{k}|H|\alpha' j\mathbf{k}\rangle$ becomes independent of $\mathbf{k}$. The eigenfunctions $|s1\mathbf{0}\rangle$ and $|x1\mathbf{0}\rangle$, $|y1\mathbf{0}\rangle$, $|z1\mathbf{0}\rangle$ transform, respectively, according to the unit representation $\Gamma_1$ and the vector representation $\Gamma_{15}$ of the cubic point group $O_h$. Since $O_h$ is the symmetry group of $H$, the matrix element $\langle sj\mathbf{k}|H|sj\mathbf{k}\rangle$ likewise belongs to the unit representation. According to Appendix A this means that its value, in general, is nonzero, and we denote it by $\epsilon_s$. In a crude approximation, $\epsilon_s$ is the energy of the s-orbital of the free atom. The matrix elements between s- and p-orbitals at the same atom transform according to the representation $\Gamma_1\times\Gamma_1\times\Gamma_{15} = \Gamma_{15}$, which does not contain the unit representation. Then, according to Appendix A, these elements must vanish. The matrix elements between p- and p-orbitals belong to the representation $\Gamma_{15}\times\Gamma_1\times\Gamma_{15} = \Gamma_1+\Gamma_{12}+\Gamma_{15}'+\Gamma_{25}'$ (see Table A.27). Here, the unit representation occurs exactly once, which means that the p-p-matrix contains exactly one independent constant. A more detailed analysis shows that this constant corresponds to the three nonvanishing and mutually identical elements $\langle x1\mathbf{0}|H|x1\mathbf{0}\rangle = \langle y1\mathbf{0}|H|y1\mathbf{0}\rangle = \langle z1\mathbf{0}|H|z1\mathbf{0}\rangle = \epsilon_p$. Again, as in the case of $\epsilon_s$, the value of $\epsilon_p$ is roughly the energy of the corresponding p-orbital of the free atom. The nondiagonal p-p-matrix elements must be zero according to the above symmetry analysis. Summarizing, we have
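$$\langle\alpha j\mathbf{k}|H|\alpha' j\mathbf{k}\rangle = \epsilon_{\alpha}\,\delta_{\alpha\alpha'}, \qquad \epsilon_{\alpha} = \begin{cases}\epsilon_s, & \alpha = s,\\ \epsilon_p, & \alpha = x, y, z.\end{cases} \tag{2.266}$$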
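In evaluating the matrix elements between orbitals at nonequivalent atoms, i.e. elements of the general form $\langle\alpha j\mathbf{k}|H|\alpha' j'\mathbf{k}\rangle$ with $j \neq j'$, the $\mathbf{R}'$-sum in (2.260) may be restricted to the 4 lattice points $\mathbf{R}_t$, $t = 1, 2, 3, 4$, whose primitive unit cells host the 4 nearest neighbor atoms. With this simplification the Hamiltonian matrix of (2.260) becomes
$$\langle\alpha j\mathbf{k}|H|\alpha' j'\mathbf{k}\rangle = \sum_{t=1}^{4} e^{i\mathbf{k}\cdot(\mathbf{R}_t+\boldsymbol{\tau}_{j'}-\boldsymbol{\tau}_j)}\,\langle\alpha j\mathbf{0}|H|\alpha' j'\mathbf{R}_t\rangle. \tag{2.267}$$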
The values of $j$ and $j'$ are complementary to each other because the nearest neighbor atoms lie in the other respective sublattice. For $j = 1$ one has $j' = 2$, and for $j = 2$ then $j' = 1$. We will restrict ourselves to the first case, i.e. $j = 1$, $j' = 2$, because the second may be determined from the first with minor changes. The four nearest neighbors of a 1-atom are located in the primitive unit cells at
$$\mathbf{R}_1 = 0, \qquad \mathbf{R}_2 = -\mathbf{a}_1, \qquad \mathbf{R}_3 = -\mathbf{a}_2, \qquad \mathbf{R}_4 = -\mathbf{a}_3 \tag{2.268}$$
(see Figure 2.19). The calculation of the matrix elements $\langle\alpha 1\mathbf{0}|H|\alpha' 2\mathbf{R}_t\rangle$ between the different Cartesian orbitals and the four different values of $\mathbf{R}_t$ is somewhat laborious. It will be carried out in several (five) steps. In the first step we determine the matrices $\langle\alpha 1\mathbf{0}|H|\alpha' 2\mathbf{R}_t\rangle$ between spherical orbitals.
Figure 2.19: Atom of sublattice 1 and its four nearest neighbors in sublattice 2.
$$\langle lm1\mathbf{0}|H|l'm'2\mathbf{0}\rangle = \delta_{mm'}\,\langle lm1\mathbf{0}|H|l'm2\mathbf{0}\rangle \equiv \delta_{mm'}\,V^{12}_{ll'm}, \tag{2.269}$$
neighbor directions. For the second-nearest neighbor directions, the symmetry of $H$ is smaller. Because of that, the matrix elements $\langle\alpha 1\mathbf{k}|H|\alpha' 2\mathbf{k}\rangle$ of equation (2.260) are, in general, nondiagonal with respect to $m, m'$ if the $\mathbf{R}'$-sum is extended to the second-nearest or more remote atoms. The diagonality holds, however, in an approximate sense, which may be seen as follows. The periodic potential $V(\mathbf{x})$ in $H$ represents a sum $\sum_{j''\mathbf{R}''} v_{j''}(\mathbf{x}-\mathbf{R}''-\boldsymbol{\tau}_{j''})$ of potential contributions of all the individual atomic cores of the crystal. The matrix elements of $H$ between orbitals at different centers $j\mathbf{R}$ and $j'\mathbf{R}'$ thereby decompose into sums over all centers $j''\mathbf{R}''$. The largest contributions will arise from terms where the center index $j''\mathbf{R}''$ coincides either with $j\mathbf{R}$ or with $j'\mathbf{R}'$. If one considers only such two-center terms and neglects all three-center contributions, then the integrand of the matrix elements $\langle lm1\mathbf{0}|H|l'm'2\mathbf{0}\rangle$ has the full axial symmetry, so that these elements become diagonal with respect to $m, m'$. The hermiticity of the Hamiltonian and the particular form of the wavefunctions in (2.252) lead to the relation
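$$\langle lm1\mathbf{0}|H|\ldots \tag{2.270}$$
$$\langle 001\mathbf{0}|H|002\mathbf{0}\rangle, \qquad \langle 001\mathbf{0}|H|102\mathbf{0}\rangle, \qquad \langle 101\mathbf{0}|H|102\mathbf{0}\rangle \tag{2.271}$$
are independent of each other, and the two elements
$$\langle 111\mathbf{0}|H|112\mathbf{0}\rangle = \langle 1{-}11\mathbf{0}|H|1{-}12\mathbf{0}\rangle \tag{2.272}$$
are identical. These elements are illustrated in Figure 2.20. This figure also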
Since the atomic orbitals under consideration are those of bound states, and since the eigenvalues of the Hamiltonian for bound orbitals are generally negative, one may expect negative values for $V_{ss\sigma}$, $V_{pp\pi}$, and positive ones for $V_{sp\sigma}$, $V_{pp\sigma}$. Taking account of the strength of the overlap of the orbitals in Figure 2.20, one can conclude that the absolute values of these elements should obey the relations
(2.274)
Figure 2.20: Interatomic matrix elements $V_{ss\sigma}$, $V_{sp\sigma}$, $V_{pp\sigma}$ and $V_{pp\pi}$ between s- and p-orbitals at neighboring atoms.
The matrix elements $\langle lm1\mathbf{0}|H|l'm'2\mathbf{0}\rangle$ of $H$ with respect to spherical orbitals calculated above will be used to derive, in the second step, the matrix elements of this operator with respect to the Cartesian orbitals. The z-axis of the Cartesian coordinate system is taken to be the same as that of the spherical coordinate system used above. This means that the z-axis points in the direction of the connecting line between atom $1\mathbf{0}$ and atom $2\mathbf{0}$. The corresponding x- and y-axes lie in the plane normal to the z-axis, apart from an irrelevant rotation about this axis. The pair-related Cartesian coordinate system thus defined differs from the formerly introduced crystal-related system which is given by the three cubic crystal axes. The coordinates in the pair-related system will be denoted by $\bar{x}, \bar{y}, \bar{z}$. The pair-related $\bar{s}$-orbital coincides with the s-orbital with respect to the cubic-axes system. In terms of this notation, the matrix elements of $H$ between Cartesian orbitals in the pair-related system read $\langle\bar{s}1\mathbf{0}|H|\bar{s}2\mathbf{0}\rangle$, $\langle\bar{s}1\mathbf{0}|H|\bar{x}2\mathbf{0}\rangle$, $\langle\bar{s}1\mathbf{0}|H|\bar{y}2\mathbf{0}\rangle$, $\langle\bar{s}1\mathbf{0}|H|\bar{z}2\mathbf{0}\rangle$, $\langle\bar{x}1\mathbf{0}|H|\bar{x}2\mathbf{0}\rangle$, $\langle\bar{x}1\mathbf{0}|H|\bar{y}2\mathbf{0}\rangle$ etc. Since the Cartesian orbitals are defined in terms of spherical orbitals by equations (2.249), (2.250), the matrix elements between Cartesian orbitals can also be related to those between spherical orbitals. The corresponding relations are given below. Elements which are complex conjugates due to hermiticity of the Hamiltonian, such as $\langle\bar{s}1\mathbf{0}|H|\bar{z}2\mathbf{0}\rangle$ and $\langle\bar{z}2\mathbf{0}|H|\bar{s}1\mathbf{0}\rangle$, are listed only once. The relations read
$$\begin{aligned}
\langle\bar{s}1\mathbf{0}|H|\bar{s}2\mathbf{0}\rangle &= V_{ss\sigma},\\
\langle\bar{s}1\mathbf{0}|H|\bar{x}2\mathbf{0}\rangle &= \langle\bar{s}1\mathbf{0}|H|\bar{y}2\mathbf{0}\rangle = 0,\\
\langle\bar{s}1\mathbf{0}|H|\bar{z}2\mathbf{0}\rangle &= V_{sp\sigma},\\
\langle\bar{x}1\mathbf{0}|H|\bar{x}2\mathbf{0}\rangle &= \langle\bar{y}1\mathbf{0}|H|\bar{y}2\mathbf{0}\rangle = V_{pp\pi},\\
\langle\bar{x}1\mathbf{0}|H|\bar{y}2\mathbf{0}\rangle &= \langle\bar{x}1\mathbf{0}|H|\bar{z}2\mathbf{0}\rangle = \langle\bar{y}1\mathbf{0}|H|\bar{z}2\mathbf{0}\rangle = 0,\\
\langle\bar{z}1\mathbf{0}|H|\bar{z}2\mathbf{0}\rangle &= V_{pp\sigma}.
\end{aligned} \tag{2.275}$$
To develop the representation of the Hamiltonian matrix (2.267), we need the elements $\langle\alpha 1\mathbf{0}|H|\alpha' 2\mathbf{R}_t\rangle$ of $H$ between Cartesian orbitals which refer to the three cubic crystal axes rather than to the pair-related ones. We determine these elements in the third step.
To this end, the Cartesian orbitals $x, y, z$ related to the cubic-axes system must be expressed in terms of the pair-related Cartesian orbitals $\bar{x}, \bar{y}, \bar{z}$. To determine this relation, we consider a rotation which transforms the crystal-related axes system into the pair-related one. The transformation is characterized by Euler angles $\psi$, $\theta$ and $\varphi$. Since the orientation of the pair-related system is defined only up to an arbitrary rotation about the $\bar{z}$-axis, the Euler angle $\varphi$ may, without any loss of generality, be set equal to zero. As noted in Appendix A, the rotation matrix $A$ of equation (A.31) transforms the coordinates before rotation into the rotated ones. The basis vectors are transformed by the transposed matrix, which, in the present case of rotation, is also the inverse matrix. Since the $x, y, z$ and $\bar{x}, \bar{y}, \bar{z}$ are understood here in the sense of basis vectors, we have
$$\begin{pmatrix} x\\ y\\ z\end{pmatrix} = \begin{pmatrix}\cos\psi & -\cos\theta\sin\psi & \sin\theta\sin\psi\\ \sin\psi & \cos\theta\cos\psi & -\sin\theta\cos\psi\\ 0 & \sin\theta & \cos\theta\end{pmatrix}\begin{pmatrix}\bar{x}\\ \bar{y}\\ \bar{z}\end{pmatrix}. \tag{2.276}$$
Below, we will see that the direction cosines $(p_1, q_1, r_1)$ of the pair-related $\bar{z}$-axis with respect to the crystal-related cubic axes play an important role. These are the elements of the third column of the rotation matrix in equation (2.276), i.e.
$$p_1 = \sin\theta\sin\psi, \qquad q_1 = -\sin\theta\cos\psi, \qquad r_1 = \cos\theta. \tag{2.277}$$
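Using equations (2.275) to (2.277), the matrix elements in the crystal-related system may be evaluated. One obtains
$$\begin{aligned}
\langle s1\mathbf{0}|H|s2\mathbf{0}\rangle &= V_{ss\sigma},\\
\langle s1\mathbf{0}|H|x2\mathbf{0}\rangle &= p_1 V_{sp\sigma}, \quad \langle s1\mathbf{0}|H|y2\mathbf{0}\rangle = q_1 V_{sp\sigma}, \quad \langle s1\mathbf{0}|H|z2\mathbf{0}\rangle = r_1 V_{sp\sigma},\\
\langle x1\mathbf{0}|H|x2\mathbf{0}\rangle &= p_1^2 V_{pp\sigma} + (1-p_1^2)V_{pp\pi},\\
\langle y1\mathbf{0}|H|y2\mathbf{0}\rangle &= q_1^2 V_{pp\sigma} + (1-q_1^2)V_{pp\pi},\\
\langle z1\mathbf{0}|H|z2\mathbf{0}\rangle &= r_1^2 V_{pp\sigma} + (1-r_1^2)V_{pp\pi},\\
\langle x1\mathbf{0}|H|y2\mathbf{0}\rangle &= p_1 q_1 (V_{pp\sigma}-V_{pp\pi}),\\
\langle x1\mathbf{0}|H|z2\mathbf{0}\rangle &= p_1 r_1 (V_{pp\sigma}-V_{pp\pi}),\\
\langle y1\mathbf{0}|H|z2\mathbf{0}\rangle &= q_1 r_1 (V_{pp\sigma}-V_{pp\pi}).
\end{aligned} \tag{2.278}$$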
These relations are valid for matrix elements involving the two nearest neighbor atoms belonging to the same unit cell at R = 0. From these elements we may determine the elements with the nearest neighbor atoms of different unit cells. This will be done in the next step.
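The transformation from the pair-related to the crystal-related axes can be sketched in code as the well-known two-center (Slater-Koster) form for s- and p-orbitals. The function name `two_center_block` is hypothetical, and the sign used for the p-s elements assumes two identical atoms, as in the diamond structure; the numerical values in the usage lines are the Ge entries read from Table 2.7, with signs as stated in the text.

```python
import numpy as np

def two_center_block(direction, V_sss, V_sps, V_pps, V_ppp):
    """Return the 4x4 block <alpha 1|H|alpha' 2> for alpha, alpha' in (s, x, y, z).

    `direction` points from atom 1 to atom 2; only its direction cosines (p, q, r) matter.
    """
    d = np.asarray(direction, dtype=float)
    c = d / np.linalg.norm(d)                 # direction cosines (p, q, r)
    block = np.empty((4, 4))
    block[0, 0] = V_sss                       # <s|H|s>
    block[0, 1:] = c * V_sps                  # <s|H|x>, <s|H|y>, <s|H|z>
    block[1:, 0] = -c * V_sps                 # <x|H|s> etc.; sign flip assumes identical atoms
    # <i|H|j> = c_i c_j (V_pps - V_ppp) + delta_ij V_ppp for i, j in (x, y, z)
    block[1:, 1:] = np.outer(c, c) * (V_pps - V_ppp) + np.eye(3) * V_ppp
    return block

# Bond along the z-axis: the block must reduce to the pair-related elements.
along_z = two_center_block((0, 0, 1), -1.79, 2.36, 4.15, -1.04)
# Bond along (1, 1, 1): the situation of the first nearest neighbor in the diamond structure.
along_111 = two_center_block((1, 1, 1), -1.79, 2.36, 4.15, -1.04)
```

For a bond along the z-axis the only nonvanishing s-p element is the one with the z-orbital, and the p-p part is diagonal with $V_{pp\pi}$ for x, y and $V_{pp\sigma}$ for z, in agreement with the pair-related relations.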
iv) fourth step.
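The vectors pointing from the central 1-atom to the four nearest neighbor atoms will be denoted by $\mathbf{d}_t$, $t = 1, 2, 3, 4$, such that $\mathbf{d}_t = \mathbf{R}_t + \boldsymbol{\tau}_2 - \boldsymbol{\tau}_1$. For the diamond structure, the two sublattices are displaced with respect to each other by the vector $(a/4)(1,1,1)$. Using equation (2.268) for $\mathbf{R}_t$ and the explicit form of the primitive lattice vectors of the face-centered cubic lattice, we obtain
$$\mathbf{d}_1 = \frac{a}{4}(1,1,1), \qquad \mathbf{d}_2 = \frac{a}{4}(1,-1,-1), \qquad \mathbf{d}_3 = \frac{a}{4}(-1,1,-1), \qquad \mathbf{d}_4 = \frac{a}{4}(-1,-1,1). \tag{2.279}$$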
These relations determine the direction cosines $(p_t, q_t, r_t)$ of the connecting lines between the central 1-atom and the nearest neighbor atoms $2t$. In particular, $(p_1, q_1, r_1) = (1/\sqrt{3})(1,1,1)$, $(p_2, q_2, r_2) = (1/\sqrt{3})(1,-1,-1)$ etc. The unknown matrix elements $\langle\alpha 1\mathbf{0}|H|\alpha' 2\mathbf{R}_t\rangle$ for $t = 2, 3, 4$ follow from the elements $\langle\alpha 1\mathbf{0}|H|\alpha' 2\mathbf{R}_1\rangle$ in equation (2.278) by replacing the direction cosines $(p_1, q_1, r_1)$ by $(p_t, q_t, r_t)$ in them.
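The neighbor vectors and their direction cosines can be generated from the lattice data in a few lines. This sketch assumes the fcc primitive vectors $\mathbf{a}_1 = (a/2)(0,1,1)$, $\mathbf{a}_2 = (a/2)(1,0,1)$, $\mathbf{a}_3 = (a/2)(1,1,0)$ and atom 1 at the origin; a different labeling of the primitive vectors only permutes $\mathbf{d}_2, \mathbf{d}_3, \mathbf{d}_4$.

```python
import numpy as np

a = 1.0                                        # cubic lattice constant (arbitrary units)

# Assumed fcc primitive lattice vectors.
a1 = 0.5 * a * np.array([0.0, 1.0, 1.0])
a2 = 0.5 * a * np.array([1.0, 0.0, 1.0])
a3 = 0.5 * a * np.array([1.0, 1.0, 0.0])

tau1 = np.zeros(3)                             # atom 1 of the unit cell at the origin
tau2 = 0.25 * a * np.array([1.0, 1.0, 1.0])    # atom 2 displaced by (a/4)(1,1,1)

# Unit cells hosting the four nearest neighbors of the 1-atom.
R = np.array([np.zeros(3), -a1, -a2, -a3])

# Connecting vectors d_t = R_t + tau_2 - tau_1 and their direction cosines.
d = R + tau2 - tau1
cosines = d / np.linalg.norm(d, axis=1, keepdims=True)
p, q, r = cosines.T
```

All four vectors have the length $(\sqrt{3}/4)a$, and products of two different direction cosines reproduce the third one up to a factor $1/\sqrt{3}$, which is what reduces the number of independent neighbor sums later on.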
v) fifth step
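The elements $\langle\alpha 1\mathbf{0}|H|\alpha' 2\mathbf{R}_t\rangle$ determined above are used to calculate the $\mathbf{k}$-dependent matrix elements $\langle\alpha 1\mathbf{k}|H|\alpha' 2\mathbf{k}\rangle$ between Bloch sums. For complex conjugate elements, again, only one relation will be given. With the help of equation (2.267), we obtain
$$\begin{aligned}
\langle s1\mathbf{k}|H|s2\mathbf{k}\rangle &= V_{ss\sigma}\sum_{t=1}^{4} e^{i\mathbf{k}\cdot\mathbf{d}_t},\\
\langle s1\mathbf{k}|H|x2\mathbf{k}\rangle &= V_{sp\sigma}\sum_{t=1}^{4} p_t\, e^{i\mathbf{k}\cdot\mathbf{d}_t},\\
\langle x1\mathbf{k}|H|x2\mathbf{k}\rangle &= \sum_{t=1}^{4}\left[p_t^2 V_{pp\sigma} + (1-p_t^2)V_{pp\pi}\right] e^{i\mathbf{k}\cdot\mathbf{d}_t},\\
\langle x1\mathbf{k}|H|y2\mathbf{k}\rangle &= (V_{pp\sigma}-V_{pp\pi})\sum_{t=1}^{4} p_t q_t\, e^{i\mathbf{k}\cdot\mathbf{d}_t},
\end{aligned} \tag{2.280}$$
with analogous expressions for the remaining Cartesian components.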
The seven different $t$-sums which enter the matrix elements (2.280) may be reduced to just four because of the obvious relations $p_t q_t = r_t/\sqrt{3}$, $p_t r_t = q_t/\sqrt{3}$, and $q_t r_t = p_t/\sqrt{3}$. The four independent sums are
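$$\begin{aligned}
g_1(\mathbf{k}) &= e^{i\mathbf{k}\cdot\mathbf{d}_1} + e^{i\mathbf{k}\cdot\mathbf{d}_2} + e^{i\mathbf{k}\cdot\mathbf{d}_3} + e^{i\mathbf{k}\cdot\mathbf{d}_4},\\
g_2(\mathbf{k}) &= e^{i\mathbf{k}\cdot\mathbf{d}_1} + e^{i\mathbf{k}\cdot\mathbf{d}_2} - e^{i\mathbf{k}\cdot\mathbf{d}_3} - e^{i\mathbf{k}\cdot\mathbf{d}_4},\\
g_3(\mathbf{k}) &= e^{i\mathbf{k}\cdot\mathbf{d}_1} - e^{i\mathbf{k}\cdot\mathbf{d}_2} + e^{i\mathbf{k}\cdot\mathbf{d}_3} - e^{i\mathbf{k}\cdot\mathbf{d}_4},\\
g_4(\mathbf{k}) &= e^{i\mathbf{k}\cdot\mathbf{d}_1} - e^{i\mathbf{k}\cdot\mathbf{d}_2} - e^{i\mathbf{k}\cdot\mathbf{d}_3} + e^{i\mathbf{k}\cdot\mathbf{d}_4}.
\end{aligned} \tag{2.282}$$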
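The four phase sums are easy to evaluate numerically. The sketch below assumes the neighbor vectors $\mathbf{d}_1 = (a/4)(1,1,1)$, $\mathbf{d}_2 = (a/4)(1,-1,-1)$, $\mathbf{d}_3 = (a/4)(-1,1,-1)$, $\mathbf{d}_4 = (a/4)(-1,-1,1)$ and the sign patterns given by the signs of $p_t$, $q_t$, $r_t$; the function name `g_sums` is hypothetical.

```python
import numpy as np

def g_sums(k, a=1.0):
    """Return the array (g1, g2, g3, g4) of nearest neighbor phase sums at wavevector k."""
    d = 0.25 * a * np.array([[1, 1, 1],
                             [1, -1, -1],
                             [-1, 1, -1],
                             [-1, -1, 1]], dtype=float)
    phases = np.exp(1j * d @ np.asarray(k, dtype=float))
    # Row 1: all plus signs; rows 2-4: signs of the x-, y-, z-components of d_t.
    signs = np.array([[1, 1, 1, 1],
                      [1, 1, -1, -1],
                      [1, -1, 1, -1],
                      [1, -1, -1, 1]])
    return signs @ phases

g_gamma = g_sums((0.0, 0.0, 0.0))              # BZ center
g_x = g_sums((2.0 * np.pi, 0.0, 0.0))          # point (2 pi / a)(1, 0, 0) for a = 1
```

At the BZ center only $g_1$ survives, so all matrix elements that mix s- and p-orbitals or different p-orbitals vanish there.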
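Finally, we can write down the Hamiltonian matrix $\langle\alpha j\mathbf{k}|H|\alpha' j'\mathbf{k}\rangle$ between Bloch sums in explicit form. Arranging the eight basis functions $|\alpha j\mathbf{k}\rangle$ in the sequence $|s1\mathbf{k}\rangle$, $|x1\mathbf{k}\rangle$, $|y1\mathbf{k}\rangle$, $|z1\mathbf{k}\rangle$, $|s2\mathbf{k}\rangle$, $|x2\mathbf{k}\rangle$, $|y2\mathbf{k}\rangle$, $|z2\mathbf{k}\rangle$, this matrix is given by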
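Table 2.7: Tight binding parameters of diamond type semiconductors (in eV).

        $\epsilon_s$    $\epsilon_p$    $V_{ss\sigma}$    $V_{sp\sigma}$    $V_{pp\sigma}$    $V_{pp\pi}$
C       -17.52    -8.97    -4.50    5.91    10.41    -2.60
Si      -13.55    -6.52    -1.93    2.54     4.47
Ge      -14.38    -6.36    -1.79    2.36     4.15    -1.04
         1.38      1.92     1.68    2.20     0.55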
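The interatomic entries of Table 2.7 can be compared with an inverse-square dependence on the nearest neighbor distance $d_0 = (\sqrt{3}/4)a$. The sketch below uses the value $\hbar^2/m = 7.62$ eV Å$^2$ quoted in the text and the $V_{ss\sigma}$ entries of Si and Ge; the lattice constants `a_Si` and `a_Ge` are assumed standard values, not taken from this section, and the function names are hypothetical.

```python
import math

HBAR2_OVER_M = 7.62            # eV * Angstrom^2, as quoted in the text

def nearest_neighbor_distance(a):
    """Nearest neighbor distance d0 = (sqrt(3)/4) a of the diamond structure."""
    return math.sqrt(3.0) / 4.0 * a

def dimensionless_factor(V, d0):
    """Factor eta such that V = eta * (hbar^2 / m) / d0^2."""
    return V * d0**2 / HBAR2_OVER_M

# Assumed cubic lattice constants in Angstrom.
a_Si, a_Ge = 5.431, 5.658
d_Si = nearest_neighbor_distance(a_Si)
d_Ge = nearest_neighbor_distance(a_Ge)

# V_ss_sigma of Si and Ge in eV, read from Table 2.7.
V_Si, V_Ge = -1.93, -1.79
eta_Si = dimensionless_factor(V_Si, d_Si)
eta_Ge = dimensionless_factor(V_Ge, d_Ge)

ratio_table = V_Si / V_Ge                  # ratio of the tabulated elements
ratio_scaling = (d_Ge / d_Si) ** 2         # ratio expected for inverse-square scaling
```

The two dimensionless factors agree to within about one percent, which illustrates that the material dependence of the interatomic elements is carried almost entirely by the nearest neighbor distance.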
method. Equating the TB eigenvalues with those measured or known from other calculations, one obtains a system of equations which determines the unknown parameters uniquely. There are other variants of this procedure. For instance, the intraatomic matrix elements $\epsilon_s$ and $\epsilon_p$ may be identified with the atomic s- and p-energies, leaving only the interatomic constants $E_{ss}$, $E_{sp}$, $E_{xx}$ and $E_{xy}$ for the fitting procedure. Instead of $E_{ss}$, $E_{sp}$, $E_{xx}$ and $E_{xy}$, one often takes the interatomic matrix elements $V_{ss\sigma}$, $V_{sp\sigma}$, $V_{pp\sigma}$ and $V_{pp\pi}$ as independent parameters. Together with $\epsilon_s$ and $\epsilon_p$, they are referred to as tight binding parameters (see Table 2.7). Once these are determined, one is able to calculate the band structure at all $\mathbf{k}$-points. The whole procedure may therefore be considered to be an interpolation of band energies between special points of the first BZ. It is called the empirical tight binding (ETB) method. This method can also be applied to the calculation of the electronic structure of perturbed semiconductors, which will be treated in Chapter 3. In this case it serves as an extrapolation method, because it helps to extrapolate from the electronic structure of the ideal crystal to that of the perturbed one.
Inspecting the interatomic TB parameters listed for the various diamond type semiconductors in Table 2.7, it may be seen that, within an appropriate error limit, each of these parameters may be expressed in terms of the material-dependent nearest neighbor distance $d_0$ of the crystal, and a universal factor $\eta_{ll'm}$,
$$V_{ll'm} = \eta_{ll'm}\,\frac{\hbar^2}{m\,d_0^2}, \tag{2.284}$$
signifying that the interatomic matrix elements of the Hamiltonian scale with the inverse-square of the nearest neighbor distance. The factor $\hbar^2/m = 7.62$ eV Å$^2$ in equation (2.284) has been introduced in order to make the universal factors $\eta_{ll'm}$ dimensionless. The nearest neighbor distances $d_0$ follow from the cubic lattice constants $a$ in Table 1.2 by means of the relation $d_0 = (\sqrt{3}/4)a$.
Ultimately, the empirical $d_0^{-2}$-dependence in equation (2.284) originates from the kinetic energy operator of the crystal Hamiltonian (Froyen, Harrison, 1979). The eigenstates of this operator are plane waves and its eigenvalues are proportional to the inverse-square of the wavelength of its eigenstates. As a consequence of this, the empty lattice band structure scales with the inverse-square of the lattice constant. The TB band structure follows by diagonalizing the TB Hamiltonian (2.283), resulting in closed analytical expressions for the energy band levels at special $\mathbf{k}$-points of the first BZ. If these expressions are identified with the empty lattice band levels at the same $\mathbf{k}$-points, respectively, linear equations for the TB parameters are obtained. Their solutions scale with the inverse-square of the lattice constant $a$ because the empty lattice band levels do so. Moreover, numerical values for the universal TB parameters $\eta_{ll'm}$ follow from these equations. They are close to the values listed in Table 2.8 which have been derived from more precise band structure data (the meaning of $\eta_{s^*p\sigma}$ will be explained below in connection with zincblende type semiconductors). The intraatomic matrix elements $\epsilon_s$ and $\epsilon_p$ corresponding to the set of universal interatomic TB parameters $\eta_{ll'm}$ in Table 2.8 are shown in Table 2.9 for a series of atoms forming diamond and zincblende type semiconductors. They represent atomic s- and p-level energies which deviate somewhat from the energy levels given in Table 2.1.
Table 2.9: Intraatomic TB parameters (in eV) to be used in conjunction with the universal interatomic TB parameters of Table 2.8. (Source same as in Table 2.8.)
A $\mathbf{k}$-point of particular interest is the BZ center $\mathbf{k} = 0$. Here, the eigenvalues and eigenfunctions of the Hamiltonian matrix (2.283) may be obtained in closed analytic form. The results are the following 8 eigenvalues $E_i(\mathbf{0})$, $i = 1, 2, \ldots, 8$.
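$$\begin{aligned}
|E_1 0\rangle &= \tfrac{1}{\sqrt{2}}\,(1,0,0,0,\;1,0,0,0), & |E_8 0\rangle &= \tfrac{1}{\sqrt{2}}\,(1,0,0,0,\,-1,0,0,0),\\
|E_2 0\rangle &= \tfrac{1}{\sqrt{2}}\,(0,1,0,0,\;0,-1,0,0), & |E_5 0\rangle &= \tfrac{1}{\sqrt{2}}\,(0,1,0,0,\;0,1,0,0),\\
|E_3 0\rangle &= \tfrac{1}{\sqrt{2}}\,(0,0,1,0,\;0,0,-1,0), & |E_6 0\rangle &= \tfrac{1}{\sqrt{2}}\,(0,0,1,0,\;0,0,1,0),\\
|E_4 0\rangle &= \tfrac{1}{\sqrt{2}}\,(0,0,0,1,\;0,0,0,-1), & |E_7 0\rangle &= \tfrac{1}{\sqrt{2}}\,(0,0,0,1,\;0,0,0,1).
\end{aligned} \qquad (2.286)$$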
Since $\varepsilon_p$ is negative and $E_{xx}$ positive, the triply degenerate level $E_2(0) = E_3(0) = E_4(0)$ lies below the triply degenerate level $E_5(0) = E_6(0) = E_7(0)$. The constants $\varepsilon_s$ and $E_{ss}$ both have negative values, therefore the $E_1(0)$ level is lower than the $E_8(0)$ level. Owing to the fact that the atomic s-energy $\varepsilon_s$ lies below the p-energy $\varepsilon_p$, the eigenvalue $E_1(0)$ is also smaller than the deeper of the two triply degenerate levels. This means that $E_1(0)$ is the deepest of the four levels. Therefore, the energetic ordering of the levels is determined by the relations

$$E_1(0) < E_{2,3,4}(0) < E_{5,6,7}(0). \qquad (2.287)$$
For the position of the $E_8(0)$ level, there are still three possibilities (important materials to which the three possible cases apply are listed alongside the cases), namely:
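i) $E_8(0) > E_{5,6,7}(0)$: C, Si,
ii) $E_{2,3,4}(0) < E_8(0) < E_{5,6,7}(0)$: Ge,
iii) $E_8(0) < E_{2,3,4}(0)$: $\alpha$-Sn.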
These relations mean that the $E_8(0)$ level moves down with respect to the other two levels as the size of the atoms increases.
It turns out that the ordering of the eight energy bands at $\mathbf{k} = 0$ remains the same over the entire first BZ. This is important because the positions of the energy bands relative to each other determine the likelihood of their population by electrons. As already mentioned at the beginning of this section, not all of the bands are expected to be populated, just as the s- and p-levels of the free atom whose orbitals were used as basis functions were not completely filled. Electron population of the ground state of the crystal, i.e. at temperature $T = 0$, may be obtained as follows: For a simple band, each k-value corresponds to 2 eigenstates of opposite spin. For a periodicity region of volume $\Omega = G^3\Omega_0$, the first BZ contains, as does every primitive unit cell of reciprocal space, $G^3$ allowed k-values (see section 2.3). A simple energy band therefore has $2 \times G^3$ states. Four such bands are necessary to host the $(2 \times 4)\,G^3 = 8G^3$ valence electrons of a periodicity region. In the ground state of the crystal, therefore, the four lowest bands are populated, and the four highest bands are empty. This means that in the case of C, Si and Ge, $E_1(\mathbf{k})$, $E_2(\mathbf{k})$, $E_3(\mathbf{k})$, $E_4(\mathbf{k})$ are the populated valence bands, and $E_5(\mathbf{k})$, $E_6(\mathbf{k})$, $E_7(\mathbf{k})$, $E_8(\mathbf{k})$ are the empty conduction bands. For $\alpha$-Sn, $E_1(\mathbf{k})$ and $E_8(\mathbf{k})$ form valence bands, together with two of the three bands $E_2(\mathbf{k})$, $E_3(\mathbf{k})$, $E_4(\mathbf{k})$. The remaining band of the three is a conduction band. Since it is degenerate at $\mathbf{k} = 0$ with the highest valence band, the energy gap of $\alpha$-Sn vanishes. The above band assignment allows us to determine the symmetry of the valence and conduction band states at $\Gamma$.
We know that degenerate eigenstates of the crystal Hamiltonian having the same energy value form a set of basis functions for an irreducible representation of the small point group for the wavevector $\mathbf{k}$. For $\mathbf{k} = 0$ this group coincides with the full point group of equivalent crystal directions, which, here, is $O_h$. The dimensions of the irreducible representations are the same as the degrees of degeneracy of the corresponding energy levels. Therefore, the eigenfunctions for $E_1(0)$ and $E_8(0)$ each belong to 1-dimensional representations, and those of $E_2(0) = E_3(0) = E_4(0)$ and $E_5(0) = E_6(0) = E_7(0)$ each belong to 3-dimensional representations. According to equation (2.286), the deepest valence band level $E_1(0)$ has the eigenfunction $(1/\sqrt{2})[\,|s10\rangle + |s20\rangle\,]$. In order to determine its transformation properties under the operations of the point group $O_h$, it is useful to decompose $O_h$ into two parts, firstly, the tetrahedron subgroup $T_d$ containing only elements which are not involved with an exchange of the two sublattices 1 and 2, and secondly, the remainder of $O_h$ which is composed of all elements of $T_d$ multiplied by the inversion. Each of the elements of the second part of $O_h$ exchanges the two sublattices. For brevity, we will term
the latter elements 'exchanging', and the former ones 'nonexchanging'. The eigenfunction $(1/\sqrt{2})[\,|s10\rangle + |s20\rangle\,]$ for $E_1(0)$ transforms into itself under the action of both types of elements, thus it belongs to the unit representation $\Gamma_1$. For the eigenfunction $(1/\sqrt{2})[\,|s10\rangle - |s20\rangle\,]$ of the $E_8(0)$ level, the transformation into itself occurs only under the action of nonexchanging elements, while a factor $-1$ is generated in the case of exchanging elements. It follows from the character table of the irreducible representations of $O_h$ given in Appendix A that this transformation corresponds to the representation $\Gamma_2'$.
The upper valence band level $E_2(0) = E_3(0) = E_4(0)$ possesses the three eigenstates $(1/\sqrt{2})[\,|x10\rangle - |x20\rangle\,]$, $(1/\sqrt{2})[\,|y10\rangle - |y20\rangle\,]$, $(1/\sqrt{2})[\,|z10\rangle - |z20\rangle\,]$. Under the action of the nonexchanging elements of $O_h$, these functions transform like vector components. Inversion, which is part of the exchanging elements, reverses the sign of the vector components, and the exchange of the two sublattices also reverses the sign of the whole eigenfunctions. Under the exchanging elements, the three eigenfunctions therefore transform as they would under the action of the corresponding nonexchanging elements. The character table of the irreducible representations of $O_h$ in Appendix A shows that this transformation is characteristic of the representation $\Gamma_{25}'$. Similarly, one finds that the eigenstates $(1/\sqrt{2})[\,|x10\rangle + |x20\rangle\,]$, $(1/\sqrt{2})[\,|y10\rangle + |y20\rangle\,]$, $(1/\sqrt{2})[\,|z10\rangle + |z20\rangle\,]$ belonging to the eigenvalues $E_5(0) = E_6(0) = E_7(0)$ transform according to the irreducible representation $\Gamma_{15}$.

We summarize the results of our TB band structure calculations as follows: For crystals which have the diamond structure, i.e. for C, Si, Ge, as a rule, the highest valence band is 3-fold degenerate and belongs to the irreducible representation $\Gamma_{25}'$ of the point group $O_h$. The lowest conduction band at $\Gamma$ exhibits either a similar 3-fold degeneracy, in which case it belongs to the representation $\Gamma_{15}$ (C, Si), or it is nondegenerate and belongs to the representation $\Gamma_2'$ (Ge). For $\alpha$-Sn, the $\Gamma_2'$ band lies below the $\Gamma_{25}'$ band, which, in this way, is partially a conduction band. These results are illustrated graphically in Figure 2.21.
Figure 2.21: Ordering of energy bands at the center $\Gamma$ of the first BZ for semiconductors of diamond and zincblende type. (Panels: diamond structure, zincblende structure.)

Band structures of diamond type semiconductors calculated by means of the empirical TB method reflect the essential features of the real valence bands of these materials quite well. In order to apply the TB approximation to semiconductors having the zincblende structure, the Hamiltonian matrix (2.283) must be modified as follows. Firstly, it has to be recognized that the matrix elements $\varepsilon_s$ and $\varepsilon_p$ between orbitals at the same center depend on whether the center is an atom of chemical species 1 or 2. This means that two different s- and p-energies have to be inserted into the two $4 \times 4$ diagonal blocks of the matrix (2.283): $\varepsilon_s^1$, $\varepsilon_p^1$ in the block at the upper left, and $\varepsilon_s^2$, $\varepsilon_p^2$ in that at the lower right. Secondly, the relation $V^{12}_{sp\sigma} = V^{21}_{sp\sigma} \equiv V_{sp\sigma}$ of equation (2.273) cannot be used in the zincblende case because it rests on the chemical identity of the two atoms of a unit cell. For the matrix element $V^{12}_{sp\sigma}$, the s-state belongs to a 1-atom, and the p-state to a 2-atom, while for $V^{21}_{sp\sigma}$ the s-state belongs to a 2-atom, and the p-state to a 1-atom. Therefore, the matrix element $V^{21}_{sp\sigma}$ is given by an independent constant $\tilde{V}_{sp\sigma}$ rather than by $V_{sp\sigma}$ (the parameters $\tilde{\eta}_{sp\sigma}$ in Table 2.8 are associated with $\tilde{V}_{sp\sigma}$ by means of equation (2.284)). The other exchange relations in (2.284) remain valid because they refer to matrix elements whose orbitals at the two different atoms belong to the same state. With these two changes in equation (2.283), the TB Hamiltonian matrix for zincblende type semiconductors becomes
with $\tilde{E}_{sp} = (1/\sqrt{3})\,\tilde{V}_{sp\sigma}$. Using this matrix, the band structure and the eigenstates of zincblende type semiconductors may be calculated. The spatial symmetry of the eigenstates at the BZ center is similar to that which was found for diamond type crystals above, except that the point group $O_h$ has to be replaced by the tetrahedron group $T_d$. The representation $\Gamma_{25}'$ of $O_h$ thereby becomes the representation $\Gamma_{15}$ of $T_d$, $\Gamma_2'$ is replaced by $\Gamma_1$, and $\Gamma_{15}$ remains $\Gamma_{15}$. In contrast to diamond type crystals, one has practically only one energetic ordering of the conduction bands here, the $\Gamma_1$ band being the deepest.
2.6.3
Once the band structure is known, the total energy of the valence electrons can be calculated. Formula (2.54) of section 2.1 indicates how this may be done for the ground state of the crystal. One has to sum the energy levels of all valence electrons, and subsequently remove the doubly counted electron-electron Coulomb interaction energy from this sum. The total energy of the valence electrons of the crystal or, strictly speaking, its deviation from the total energy of the valence electrons of the free atoms which were brought together to form the crystal, defines the energy gain due to chemical bonding, known as the cohesion energy of the crystal. This definition is reasonable because the valence electrons are the only parts of the atoms whose states change when the crystal is formed. In order to obtain the total energy in closed analytic form, one needs explicit mathematical expressions for the valence band energies at all points $\mathbf{k}$ of the first BZ. However, the TB approximation in the form developed above produces such expressions only at particular symmetry points. To find them everywhere, one must introduce further simplifications. A starting point for this is a formulation of the TB approximation which employs certain linear combinations of s- and p-orbitals as basis functions, rather than the atomic orbitals of definite angular momentum quantum numbers $l$ which were used above. These linear combinations are called sp$^3$-hybrid orbitals. In contrast to the atomic orbitals, they are not energy eigenfunctions of the atoms.

sp$^3$-hybrid orbitals
The four hybrid orbitals $|h_1 1\mathbf{R}\rangle$, $|h_2 1\mathbf{R}\rangle$, $|h_3 1\mathbf{R}\rangle$, $|h_4 1\mathbf{R}\rangle$ of a 1-atom in the unit cell at $\mathbf{R}$ are defined by the equations
Figure 2.22: Illustration of the wavefunctions involved in chemical bonding in tetrahedrally coordinated semiconductors: sp$^3$-hybrid orbitals (a, b), bonding orbitals (c) and antibonding orbitals (d).
In Figure 2.22 the probability distributions of the four sp$^3$-hybrid orbitals are shown in the form of polar diagrams. The orbitals resemble clubs pointing to one of the nearest neighbor atoms in the sublattice 2, e.g., $|h_1 1\mathbf{R}\rangle$ to atom $2_1$, $|h_2 1\mathbf{R}\rangle$ to atom $2_2$, etc.
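The geometric statement above can be illustrated with a minimal sketch. Since the defining equations of the hybrids are not reproduced here, the sketch assumes the common textbook convention $|h_t\rangle = \tfrac12(|s\rangle \pm |x\rangle \pm |y\rangle \pm |z\rangle)$ with the sign patterns listed in the code; this sign assignment and the names `HYBRIDS`, `hybrid_directions` and `is_orthonormal` are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Rows: hybrids h1..h4; columns: coefficients of the atomic orbitals (s, x, y, z).
# Assumed sign convention (a common textbook choice, not taken from the text).
HYBRIDS = 0.5 * np.array([
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1,  1, -1],
    [1, -1, -1,  1],
], dtype=float)


def hybrid_directions(hybrids):
    """Unit vectors along the p-part of each hybrid, i.e. the direction of its lobe."""
    p_part = hybrids[:, 1:]
    return p_part / np.linalg.norm(p_part, axis=1, keepdims=True)


def is_orthonormal(hybrids, tol=1e-12):
    """True if the hybrids are orthonormal, given orthonormal s- and p-orbitals."""
    return np.allclose(hybrids @ hybrids.T, np.eye(len(hybrids)), atol=tol)


if __name__ == "__main__":
    dirs = hybrid_directions(HYBRIDS)
    print("orthonormal:", is_orthonormal(HYBRIDS))
    # Angle between two lobes; cos(angle) = -1/3 for tetrahedral directions.
    print("angle (deg):", np.degrees(np.arccos(dirs[0] @ dirs[1])))
```

With this convention the four lobes point along (1,1,1), (1,-1,-1), (-1,1,-1) and (-1,-1,1), the directions from an atom to its four nearest neighbors in a tetrahedrally coordinated lattice.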
Similarly, the four sp$^3$-hybrid orbitals $|h_1 2\mathbf{R}\rangle$, $|h_2 2\mathbf{R}\rangle$, $|h_3 2\mathbf{R}\rangle$, $|h_4 2\mathbf{R}\rangle$ at a 2-atom in the unit cell at $\mathbf{R}$ are defined by the relations
These orbitals point to an atom of type 1. The following four orbitals are directed to the 1-atom in the unit cell at $\mathbf{R}$: orbital $|h_1 2\,\mathbf{R}_1 + \mathbf{R}\rangle$ at atom $2_1$, $|h_2 2\,\mathbf{R}_2 + \mathbf{R}\rangle$ at atom $2_2$, $|h_3 2\,\mathbf{R}_3 + \mathbf{R}\rangle$ at atom $2_3$, and $|h_4 2\,\mathbf{R}_4 + \mathbf{R}\rangle$ at atom $2_4$. Hybrid orbitals at the same center are orthonormalized with respect to each other, as are the s- and p-orbitals from which they are constructed. If the latter are understood in the sense of Löwdin, orthogonality also holds for hybrid orbitals at different centers. For each unit cell there are 8 associated hybrid orbitals $|h_t j\mathbf{R}\rangle$ with $t = 1, 2, 3, 4$ and $j = 1, 2$, as there are 8 atomic orbitals for the two free atoms of a unit cell. From the hybrid orbitals of a given sublattice one may form Bloch sums $|h_t j\mathbf{k}\rangle$ by means of the relation
(2.291)
in complete analogy to the Bloch sums $\phi_{\alpha j\mathbf{k}}$ involving atomic orbitals in equation (2.255). Due to the definitions (2.289) and (2.290) of the hybrid orbitals in terms of atomic orbitals, the Bloch sums $|h_t j\mathbf{k}\rangle$ may be thought to arise from the Bloch sums $\phi_{\alpha j\mathbf{k}}$ of atomic orbitals by means of a unitary transformation. The same holds for the Hamiltonian matrix $\langle h_t j\mathbf{k}|H|h_{t'} j'\mathbf{k}\rangle$ with respect to the basis $|h_t j\mathbf{k}\rangle$; this matrix may be understood as arising from the above mentioned unitary transformation of the known Hamiltonian matrix $\langle \alpha j\mathbf{k}|H|\alpha' j'\mathbf{k}\rangle$ with respect to the basis $\phi_{\alpha j\mathbf{k}}$. As such, it has the same eigenvalues as the original matrix, and its eigenvectors describe the same eigenfunctions. The implementation of the TB method by means of hybrid orbitals instead of atomic orbitals having a definite angular momentum quantum number is therefore nothing but the solution of the same eigenvalue problem in another representation. This statement holds, of course, only as long as equivalent approximations are made in the two representations. However, the hybrid orbital representation is well suited for further approximations, beyond those already made earlier. These approximations facilitate the derivation of simple analytical
expressions for the eigenvalues and eigenvectors which may be used to explicitly determine the total energy of the valence electrons of a crystal in closed analytical form.

Hamiltonian in hybrid-orbital representation

Here, we describe the most important additional approximation available in hybrid orbital representation, which utilizes the fact that those particular hybrid orbitals at nearest neighbor atoms which point toward one another will also overlap one another more strongly than all others. In consequence of this, the Hamiltonian matrix elements $\langle h_t 1\mathbf{R}|H|h_t 2\,\mathbf{R}_t + \mathbf{R}\rangle$ between these orbitals will be the largest. Approximately, one need only consider these elements, while all others may be neglected. The former elements do not depend on $t$ and $\mathbf{R}$, as can be seen from the explicit expressions for the hybrid orbitals (2.289), (2.290). Their common value, denoted by $V_2$, may be determined from (2.289), (2.290) and equations (2.271) to (2.273) as
The corresponding matrix elements $\langle h_t 1\mathbf{k}|H|h_t 2\mathbf{k}\rangle$ between the Bloch sums of hybrid orbitals follow from $V_2$ by multiplying this quantity with the factor

$$e_t = e^{i\mathbf{k}\cdot\mathbf{d}_t}. \qquad (2.293)$$
In the case of Hamiltonian matrix elements between hybrid orbitals at the same center, one has to distinguish between diagonal and nondiagonal elements. The nondiagonal elements $\langle h_t j\mathbf{R}|H|h_{t'} j\mathbf{R}\rangle$ have the same value for all $\mathbf{R}$ and orbital quantum numbers $t$, $t'$. For the common value $V_1$ one finds by means of (2.289), (2.290) the expression

$$V_1 \equiv \langle h_t j\mathbf{R}|H|h_{t'} j\mathbf{R}\rangle = \tfrac{1}{4}(\varepsilon_s - \varepsilon_p), \qquad t \neq t', \qquad (2.294)$$

with $\varepsilon_s$ and $\varepsilon_p$ defined in equation (2.266). Since only nearest neighbor atoms are considered, the result is the same as for the matrix element $\langle h_t j\mathbf{k}|H|h_{t'} j\mathbf{k}\rangle$ between the corresponding Bloch sums. An analogous result holds for the diagonal elements $\langle h_t j\mathbf{R}|H|h_t j\mathbf{R}\rangle$ at the same center. Their common value $\varepsilon_h$ is

$$\varepsilon_h \equiv \langle h_t j\mathbf{R}|H|h_t j\mathbf{R}\rangle = \tfrac{1}{4}(\varepsilon_s + 3\varepsilon_p). \qquad (2.295)$$

Again, this result is the same as for the matrix elements $\langle h_t j\mathbf{k}|H|h_t j\mathbf{k}\rangle$ between the corresponding Bloch sums. Numerical values for $\varepsilon_h$, along with values of $V_1$ and $V_2$, are listed in Table 2.10.
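The two on-site results can be checked by transforming the atomic on-site matrix diag($\varepsilon_s, \varepsilon_p, \varepsilon_p, \varepsilon_p$) into a hybrid basis. The sketch below assumes the usual convention $|h_t\rangle = \tfrac12(|s\rangle \pm |x\rangle \pm |y\rangle \pm |z\rangle)$ for the hybrids; the function name `hybrid_onsite_matrix` and the numerical energies in the usage example are illustrative assumptions, not values quoted from the tables.

```python
import numpy as np


def hybrid_onsite_matrix(eps_s, eps_p):
    """On-site Hamiltonian in an sp3-hybrid basis, built from atomic s- and p-energies.

    The hybrid coefficients below follow an assumed textbook sign convention.
    """
    # Rows: hybrids h1..h4; columns: atomic orbitals (s, x, y, z).
    U = 0.5 * np.array([
        [1,  1,  1,  1],
        [1,  1, -1, -1],
        [1, -1,  1, -1],
        [1, -1, -1,  1],
    ], dtype=float)
    H_atomic = np.diag([eps_s, eps_p, eps_p, eps_p])
    return U @ H_atomic @ U.T


if __name__ == "__main__":
    # Illustrative energies in eV (assumed, of the order of atomic term values).
    H = hybrid_onsite_matrix(eps_s=-13.55, eps_p=-6.52)
    print("eps_h =", H[0, 0])   # diagonal element, (eps_s + 3 eps_p) / 4
    print("V_1   =", H[0, 1])   # off-diagonal element, (eps_s - eps_p) / 4
```

All diagonal elements of the result equal $\tfrac14(\varepsilon_s + 3\varepsilon_p)$ and all off-diagonal elements equal $\tfrac14(\varepsilon_s - \varepsilon_p)$, independent of the particular sign pattern chosen for the hybrids.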
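In writing down the Hamiltonian matrix $\langle h_t j\mathbf{k}|H|h_{t'} j'\mathbf{k}\rangle$ in the hybrid-orbital representation, we arrange the eight basis functions in the sequence $|h_1 1\mathbf{k}\rangle$, $|h_2 1\mathbf{k}\rangle$, $|h_3 1\mathbf{k}\rangle$, $|h_4 1\mathbf{k}\rangle$, $|h_1 2\mathbf{k}\rangle$, $|h_2 2\mathbf{k}\rangle$, $|h_3 2\mathbf{k}\rangle$, $|h_4 2\mathbf{k}\rangle$. Then the matrix $\langle h_t j\mathbf{k}|H|h_{t'} j'\mathbf{k}\rangle$ takes the form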
(2.296)
This matrix is also known as the Weaire-Thorpe Hamiltonian. Remarkably, its 8 eigenvalues can be obtained in closed analytical form. Denoting them by $E_i^b$, $E_i^a$, $i = 1, 2, 3, 4$, we have
Table 2.10: Hybrid matrix elements calculated from the TB parameters in Table 2.7 (in eV).
8.27
1.76 2.01
2.98
Ge
8.37
2.76
Figure 2.23: Evolution of the energy bands of semiconductors with the diamond structure within the TB approximation. The right-hand part shows the band structure of Si calculated by means of equations (2.297). (Horizontal axis: wavevector.)
Here $g_1(\mathbf{k})$ is the structure-dependent factor defined in equation (2.282). The actual positions and k-dispersions of the bands (2.297) are determined by the parameters $\varepsilon_h$, $V_1$ and $V_2$. Using the values for Si given in Table 2.10, one obtains the band structure shown in the right-hand part of Figure 2.23. In this regard, the 4 bands indicated by b lie below the energy gap, and the 4 bands indicated by a lie above it. This means that in the ground state of the crystal, the b-bands are fully populated by electrons, i.e. they form the valence bands, and the a-bands are completely empty, thus they are the conduction bands. Essential features of these relationships are the negative sign of $V_2$, and the validity of the magnitude relation $|V_2| > |V_1|$ between the absolute values of $V_1$ and $V_2$. Figure 2.23 also illustrates how the different bands emerge from the atomic s- and p-levels due to the two interactions $V_1$ and $V_2$. Roughly speaking, $V_2$ determines the distance between the centers of gravity of the valence and conduction band complexes, and $V_1$ the width of these bands.

Now we return to the main goal of this subsection, the calculation of the total energy of the crystal. To accomplish this, the energy values of the four valence bands $E_i^b(\mathbf{k})$ determined above must be summed over all $i$ and $\mathbf{k}$. It
turns out that this task may even be carried out analytically if a suitable additional approximation, the so-called bond orbital approximation, is made.
Bond orbital approximation

To introduce this approximation, we review the already solved problem of diagonalizing the Hamiltonian (2.296), but proceed in a somewhat different way. The diagonalization will now be carried out in two steps. In the first step, the eigenvalues and eigenvectors of the matrix (2.296) are calculated without taking account of the $V_1$-terms, i.e. temporarily setting $V_1$ to zero. As may be seen from formula (2.297), this leads to the reduction of the energy bands to the two dispersionless levels given by
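$$\varepsilon_b = \varepsilon_h - |V_2|, \qquad \varepsilon_a = \varepsilon_h + |V_2|. \qquad (2.298)$$

Each level is four-fold degenerate. The corresponding eigenvectors $|b_t\mathbf{k}\rangle$ and $|a_t\mathbf{k}\rangle$, $t = 1, 2, 3, 4$, are

$$\begin{aligned}
\langle h_{t'} j\mathbf{k}|b_1\mathbf{k}\rangle &= \tfrac{1}{\sqrt{2}}\,(1,0,0,0,\;e_1^*,0,0,0),\\
&\;\;\vdots\\
\langle h_{t'} j\mathbf{k}|b_4\mathbf{k}\rangle &= \tfrac{1}{\sqrt{2}}\,(0,0,0,1,\;0,0,0,e_4^*),
\end{aligned} \qquad (2.299)$$

$$\begin{aligned}
\langle h_{t'} j\mathbf{k}|a_1\mathbf{k}\rangle &= \tfrac{1}{\sqrt{2}}\,(1,0,0,0,\;-e_1^*,0,0,0),\\
&\;\;\vdots\\
\langle h_{t'} j\mathbf{k}|a_4\mathbf{k}\rangle &= \tfrac{1}{\sqrt{2}}\,(0,0,0,1,\;0,0,0,-e_4^*).
\end{aligned} \qquad (2.300)$$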
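The first diagonalization step can be checked numerically. The sketch below builds an 8 x 8 matrix of the structure described in the text: $\varepsilon_h$ on the diagonal, $V_1$ between different hybrids at the same atom, and $V_2 e_t$ with $e_t = e^{i\mathbf{k}\cdot\mathbf{d}_t}$ between the two hybrids of the same bond. The orientation of the nearest neighbor vectors `D_VECS`, the function names and the parameter values in the usage example are assumptions made for illustration.

```python
import numpy as np

# Nearest neighbor vectors d_t of the diamond structure in units of the cubic
# lattice constant a (assumed orientation: atom 1 at the origin).
D_VECS = 0.25 * np.array([
    [1,  1,  1],
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
], dtype=float)


def hybrid_hamiltonian(k, eps_h, V1, V2, a=1.0):
    """8x8 Hamiltonian in the basis |h_1 1k>..|h_4 1k>, |h_1 2k>..|h_4 2k>."""
    onsite = eps_h * np.eye(4) + V1 * (np.ones((4, 4)) - np.eye(4))
    phases = np.exp(1j * (a * D_VECS) @ np.asarray(k, dtype=float))  # e_t
    H = np.zeros((8, 8), dtype=complex)
    H[:4, :4] = onsite
    H[4:, 4:] = onsite
    H[:4, 4:] = V2 * np.diag(phases)
    H[4:, :4] = H[:4, 4:].conj().T
    return H


def band_energies(k, eps_h, V1, V2, a=1.0):
    """Eigenvalues of the hybrid Hamiltonian at wavevector k, in ascending order."""
    return np.linalg.eigvalsh(hybrid_hamiltonian(k, eps_h, V1, V2, a))


if __name__ == "__main__":
    k = [0.3, -1.1, 2.0]                      # arbitrary wavevector (units of 1/a)
    # Hypothetical parameters in eV; V1 = 0 reproduces the two flat levels.
    print(band_energies(k, eps_h=-8.0, V1=0.0, V2=-3.0))
```

For $V_1 = 0$ the eight eigenvalues collapse onto $\varepsilon_h + V_2$ and $\varepsilon_h - V_2$ for every $\mathbf{k}$, each four-fold degenerate; switching on $V_1$ restores the dispersion.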
Each of the eigenfunctions $|b_t\mathbf{k}\rangle$ and $|a_t\mathbf{k}\rangle$ is a linear combination of Bloch sums of two hybrid orbitals pointing toward one another, or, equivalently, a Bloch sum of the linear combinations of these orbitals. In the case of b-states the hybrid orbitals are added, and the corresponding linear combinations $|b_t\mathbf{R}\rangle$ are given by
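$$|b_t\mathbf{R}\rangle = \tfrac{1}{\sqrt{2}}\left[\,|h_t 1\mathbf{R}\rangle + |h_t 2\,\mathbf{R}_t + \mathbf{R}\rangle\,\right], \qquad t = 1, 2, 3, 4. \qquad (2.301)$$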
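The exponential factors $e_t^*$ of the eigenvectors (2.299) and (2.300) were compensated by the $e_t$-factors of the Bloch sums. In the case of a-states the hybrid orbitals are subtracted, and the corresponding linear combinations $|a_t\mathbf{R}\rangle$ are given by

$$|a_t\mathbf{R}\rangle = \tfrac{1}{\sqrt{2}}\left[\,|h_t 1\mathbf{R}\rangle - |h_t 2\,\mathbf{R}_t + \mathbf{R}\rangle\,\right], \qquad t = 1, 2, 3, 4. \qquad (2.302)$$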
The polar diagrams of these functions are shown in Figure 2.22. For reasons which will be clarified later, one refers to the orbitals $|b_t\mathbf{R}\rangle$ as bonding and to the orbitals $|a_t\mathbf{R}\rangle$ as antibonding orbitals. With the help of bonding and antibonding orbitals the eigenfunctions of the Hamiltonian matrix (2.296) with $V_1 = 0$ may be written in the form
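$$|b_t\mathbf{k}\rangle = \frac{1}{\sqrt{G^3}}\sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}}\,|b_t\mathbf{R}\rangle, \qquad t = 1, 2, 3, 4, \qquad (2.303)$$

$$|a_t\mathbf{k}\rangle = \frac{1}{\sqrt{G^3}}\sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}}\,|a_t\mathbf{R}\rangle, \qquad t = 1, 2, 3, 4. \qquad (2.304)$$

In the second step of the diagonalization procedure the $V_1$-terms are included. The Hamilton matrix (2.296) is transformed into the basis set of the previously calculated eigenvectors $|b_t\mathbf{k}\rangle$ and $|a_t\mathbf{k}\rangle$. This results in the $(8 \times 8)$-matrix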
(2.305)
(2.307)
as block elements. The expression for $H_{aa}$ follows from that for $H_{bb}$ if in the latter $\varepsilon_b$ is replaced by $\varepsilon_a$. The structure factors in (2.306) and (2.307) are defined as follows:
The $(4 \times 4)$-matrix $H_{bb}$ couples the various bonding states, and $H_{aa}$ couples the various antibonding states. The nondiagonal matrix $H_{ab}$ describes the interaction between the two types of states. It gives rise to corrections to the eigenvalues of relative order of magnitude $|V_1| / 2|V_2|$. Considering the actual values of $V_1$ and $V_2$, these corrections are rather small. This suggests treating them as perturbations. The zeroth approximation, i.e. the complete neglect of the interaction between bonding and antibonding states, is referred to as the bond orbital approximation. Within this approximation the valence and conduction bands follow from separate eigenvalue equations, the former by diagonalizing the matrix $H_{bb}$, the latter by diagonalizing $H_{aa}$. To calculate the total energy of the crystal one needs the total energy of all valence electrons. The latter may be calculated by means of formula (2.54) which expresses the total energy of an interacting electron system by means of its one-particle energies. One has
$$E^{val}_{total} = 2\sum_{\mathbf{k}}\sum_{i=1}^{4} E_i^b(\mathbf{k}) - E_{Cl}, \qquad (2.309)$$
where $E_{Cl}$ means the Coulomb energy of the interacting valence electrons, which is counted twice in summing over all band states. The factor 2 accounts for the two spin states. Within the bond orbital approximation, the $i$-sum in (2.309) may be carried out in closed form, with the result
$$\sum_{i=1}^{4} E_i^b(\mathbf{k}) = 4\varepsilon_b. \qquad (2.310)$$

To prove this relation, we write the eigenvalues $E_i^b(\mathbf{k})$ in (2.310) as diagonal elements of $H_{bb}$ between eigenstates, and note that within the bond orbital approximation the eigenstates for a given wavevector $\mathbf{k}$ are linear
combinations of the bonding states $|b_t\mathbf{k}\rangle$ only. In this matter, the linear combinations are generated by the unitary transformation which diagonalizes the Hermitian matrix $H_{bb}$ of (2.306). If one sums the eigenvalues $E_i^b(\mathbf{k})$ over all $i$, and takes advantage of unitarity, then the diagonal elements of $H_{bb}$ with respect to the eigenvectors become the diagonal elements of this matrix with respect to the bonding states, i.e. $\varepsilon_b$, as is stated in relation (2.310). One may also prove this relation in another way, using Vieta's theorem, which states that the sum of all zeros of a polynomial of degree $n$ with leading coefficient 1 equals the negative of the coefficient of the $(n-1)$th degree term. In the case of the characteristic polynomial of a matrix, this coefficient represents the negative of the sum of all diagonal elements, here, therefore, $-4\varepsilon_b$, confirming the validity of equation (2.310).
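The sum rule can also be verified numerically by projecting the hybrid-orbital Hamiltonian onto the four bonding states and comparing the sum of the eigenvalues of the resulting block with $4\varepsilon_b$. As before, this is only a sketch: the nearest neighbor vectors `D_VECS`, the function names and the parameter values are assumptions for illustration, and the matrix is built from the verbal description of (2.296) rather than copied from it.

```python
import numpy as np

# Nearest neighbor vectors of the diamond structure in units of a (assumed orientation).
D_VECS = 0.25 * np.array([
    [1,  1,  1],
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
], dtype=float)


def hybrid_hamiltonian(k, eps_h, V1, V2, a=1.0):
    """8x8 Hamiltonian in the hybrid Bloch-sum basis (sublattice 1 first, then 2)."""
    onsite = eps_h * np.eye(4) + V1 * (np.ones((4, 4)) - np.eye(4))
    phases = np.exp(1j * (a * D_VECS) @ np.asarray(k, dtype=float))
    H = np.zeros((8, 8), dtype=complex)
    H[:4, :4] = onsite
    H[4:, 4:] = onsite
    H[:4, 4:] = V2 * np.diag(phases)
    H[4:, :4] = H[:4, 4:].conj().T
    return H


def bonding_block(k, eps_h, V1, V2, a=1.0):
    """H_bb: the Hamiltonian projected onto the four bonding states |b_t k>."""
    phases = np.exp(1j * (a * D_VECS) @ np.asarray(k, dtype=float))
    # Columns are the bonding vectors (unit vector t on sublattice 1, e_t^* on sublattice 2).
    B = np.vstack([np.eye(4), np.diag(phases.conj())]) / np.sqrt(2)
    return B.conj().T @ hybrid_hamiltonian(k, eps_h, V1, V2, a) @ B


if __name__ == "__main__":
    eps_h, V1, V2 = -8.0, -1.0, -3.0          # hypothetical values in eV
    k = [0.7, 0.2, -1.3]
    E_b = np.linalg.eigvalsh(bonding_block(k, eps_h, V1, V2))
    print(E_b.sum(), 4 * (eps_h + V2))        # both sides of the sum rule
```

The individual eigenvalues of the bonding block depend on $\mathbf{k}$ through $V_1$, but their sum does not, because $V_1$ only enters the off-diagonal elements of $H_{bb}$.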
(2.311)
To get the total energy of the crystal, the energy of the atomic cores must be
added to the energy value of equation (2.311). In doing so, one may again use the fact that the core states of the crystal do not differ from those of the free atoms. This means that only the mutual electrostatic interaction of the cores results in a structure-dependent energy contribution, while the internal core energies sum to a constant $E_0$. The core-core interaction energy has, approximately, the same value as the electron-electron interaction energy $E_{Cl}$ between the valence electrons on different atoms. This is true because the valence electron charge of an atom equals its core charge for the crystals considered here. The core-core interaction energy approximately cancels, therefore, against the negative Coulomb energy $-E_{Cl}$ of valence electrons in expression (2.311). Finally, the total energy of the crystal is given by
(2.312)
The $d$-dependence of this energy is due to the fact that both $\varepsilon_h$ and $V_2$ depend on $d$: the hybrid energies $\varepsilon_h$ have $d$-dependence as they are defined by the diagonal matrix elements of the Hamiltonian $H$ between Löwdin orbitals which contain overlap integrals between s- and p-orbitals of adjacent atoms, and $V_2$ because this quantity is the matrix element of $H$ between hybrid orbitals at nearest neighbor atoms. $V_2$ decreases with decreasing distance $d$, corresponding to an attractive force between the atoms. The hybrid energy $\varepsilon_h$ increases as $d$ decreases, corresponding to a repulsive force. For large $d$, the attraction dominates over the repulsion, and for small $d$, the repulsion dominates over the attraction. Overall, the total energy $E^{crys}_{total}(d)$ of the crystal varies with $d$ as shown schematically in Figure 2.24. At the equilibrium distance $d_0$, it takes its absolute minimum value. This means that the initially free atoms will not remain free but form a diamond type crystal with nearest neighbor distance $d_0$. They experience what is called covalent chemical bonding.
In order to provide a better physical understanding of the nature of covalent chemical bonding, we compare the total energy $E^{crys}_{total}(d)$ of the crystal with the total energy $E^{atom}_{total}$ of $2G^3$ free atoms. For the elements of the fourth group of the periodic table with their two electrons in atomic s-levels and two in atomic p-levels, one has
(2.313)
where $E_0$, again, accounts for the energy of the atomic cores. The negative difference of the two energies (2.312) and (2.313) represents the cohesion energy of the crystal. It is given by the expression
Figure 2.24: Total energy of the crystal as a function of the nearest neighbor distance $d$ (schematically).
Formally, the occurrence of a positive cohesion energy is due to the fact that the matrix element $V_2$ of $H$ between hybrid orbitals at adjacent atoms pointing toward one another is negative, and that $|4V_2|$ exceeds the energy difference $(\varepsilon_p - \varepsilon_s)$. The latter difference may be understood as the energy increase of an atom if one of its two s-electrons is lifted into a p-state or, equivalently, if its four valence electrons are put into four sp$^3$-hybrid orbitals rather than into two s- and two p-orbitals. One calls this population the promoted configuration of the atom. In sp$^3$-hybrid states, the electrons of adjacent atoms are capable of pronounced interference. This can be constructive or destructive, depending on whether bonding or antibonding states are considered. In the case of constructive interference, the probability amplitude becomes relatively large in the region between the two atoms and the two electrons of the interfering sp$^3$-hybrids undergo a delocalization (see Figure 2.22). In this process, the potential energy of the Coulomb interaction of the two electrons among themselves and with the atomic cores remains almost unchanged. However, their kinetic energy decreases considerably. This may be understood in terms of the Heisenberg Uncertainty Principle which tells us that a weaker localization, i.e. a larger positional uncertainty, corresponds to a smaller momentum uncertainty and, therefore, to a smaller kinetic energy. Altogether, the energy of the two electrons decreases, because of constructive interference, in a bonding state. The energy gain per atom amounts to $4|V_2|$. If it exceeds the energy necessary for promoting an atom into its sp$^3$-state, i.e. if the condition $4|V_2| > (\varepsilon_p - \varepsilon_s)$ holds, it is energetically favorable for covalent chemical bonding to occur. As we have seen, quantum mechanical phenomenology is essential in the interpretation of this behavior.
Unlike the bonding of oppositely charged ions, covalent bonding between neutral atoms cannot be understood in terms of classical physics.

The condition necessary for the occurrence of bonding eigenstates able to host all valence electrons is the ordering of the nearest neighbors of an atom on the corners of a tetrahedron, i.e. the diamond structure of the crystal. In this way the above consideration also justifies focusing on the tetrahedral crystal structure of diamond type crystals, which was merely assumed at the outset. The atomic structure follows, so to speak, from the electronic structure.
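The energy bookkeeping behind the bonding condition can be written down in a few lines. The sketch below is a reconstruction under stated assumptions, not the text's own formula: it takes $\varepsilon_h = \tfrac14(\varepsilon_s + 3\varepsilon_p)$ with the free-atom values of $\varepsilon_s$ and $\varepsilon_p$ (so the $d$-dependence of $\varepsilon_h$ is ignored), puts all four valence electrons of an atom into the bonding level $\varepsilon_b = \varepsilon_h - |V_2|$, and assumes that core-core and electron-electron Coulomb terms cancel as argued above. The function name and the numerical inputs are hypothetical.

```python
def cohesion_energy_per_atom(eps_s, eps_p, V2):
    """Cohesion energy per atom in the bond orbital picture (simplified bookkeeping).

    Assumptions: free-atom eps_s and eps_p are used for the hybrid energy, and
    Coulomb double-counting cancels against the core-core repulsion.
    """
    eps_h = (eps_s + 3.0 * eps_p) / 4.0       # hybrid energy of the promoted atom
    eps_b = eps_h - abs(V2)                   # bonding level
    e_crystal = 4.0 * eps_b                   # four valence electrons in bonding states
    e_free_atom = 2.0 * eps_s + 2.0 * eps_p   # s^2 p^2 configuration of the free atom
    return e_free_atom - e_crystal


if __name__ == "__main__":
    # Hypothetical values in eV.
    e_coh = cohesion_energy_per_atom(eps_s=-13.55, eps_p=-6.52, V2=-3.0)
    print(e_coh)   # positive if 4|V2| exceeds the promotion energy eps_p - eps_s
```

Algebraically the returned value equals $4|V_2| - (\varepsilon_p - \varepsilon_s)$, so it is positive exactly when the condition $4|V_2| > (\varepsilon_p - \varepsilon_s)$ quoted above is met.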
Ionic bonding
The ionic contributions to chemical bonding will now be calculated for materials having the zincblende structure. As is well known, a series of III-V and II-VI compound semiconductors form crystals of this type. For the Hamiltonian matrix (2.305), the transition from the diamond to the zincblende structure means that $\varepsilon_h$ in the upper left $(4 \times 4)$-block has to be replaced by the hybrid energy $\varepsilon_h^1$ of the 1-atom, and in the lower $(4 \times 4)$-block by the hybrid energy $\varepsilon_h^2$ of the 2-atom. With this replacement, the bonding and antibonding energy levels become
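$$\varepsilon_b = \bar{\varepsilon}_h - \sqrt{V_2^2 + V_3^2}, \qquad \varepsilon_a = \bar{\varepsilon}_h + \sqrt{V_2^2 + V_3^2}, \qquad (2.315)$$

where $\bar{\varepsilon}_h = (\varepsilon_h^1 + \varepsilon_h^2)/2$ and $V_3 = (\varepsilon_h^1 - \varepsilon_h^2)/2$. The energy separation between the two levels is larger than that of diamond type crystals. This results in an enlargement of the energy gap between the valence and conduction bands. The bonding and antibonding orbitals are given by the expressions

$$|b_t\mathbf{R}\rangle = \sqrt{\tfrac{1}{2}(1 - \alpha_p)}\;|h_t 1\mathbf{R}\rangle + \sqrt{\tfrac{1}{2}(1 + \alpha_p)}\;|h_t 2\,\mathbf{R}_t + \mathbf{R}\rangle, \qquad (2.316)$$

$$|a_t\mathbf{R}\rangle = \sqrt{\tfrac{1}{2}(1 + \alpha_p)}\;|h_t 1\mathbf{R}\rangle - \sqrt{\tfrac{1}{2}(1 - \alpha_p)}\;|h_t 2\,\mathbf{R}_t + \mathbf{R}\rangle, \qquad (2.317)$$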
where we set $\alpha_p = V_3/\sqrt{V_2^2 + V_3^2}$. The factors $(1/2)(1 - \alpha_p)$ and $(1/2)(1 + \alpha_p)$ in (2.316) and (2.317) represent the probabilities of finding an electron in the bonding state at atom 1 or 2, respectively. One calls $\alpha_p$ the polarity of bonding orbitals or simply the polarity of bonding. If $\varepsilon_h^2$ is deeper than $\varepsilon_h^1$, then $V_3$, and also $\alpha_p$, are positive. The electron preferentially stays at atom 2. In this way the polarity of bonding orbitals is such that, in the ground state, where the electrons occupy only bonding orbitals, the previously electrically neutral atoms become charged. Atom 1 becomes the positive cation, and atom 2 is the negative anion. The charge of the cation is given by $eZ^*$ with $Z^* = (Z_1 - 4 + 4\alpha_p)$, where $Z_1$ is the number of valence electrons at the free 1-atom. The anion charge is $e(Z_2 - 4 - 4\alpha_p) = -eZ^*$, i.e. the unit cell is neutral. Owing to this redistribution of electron charge, the electron-electron interaction energy to be subtracted from the sum of one-particle energies, because of double counting, takes a different value. It is, therefore, no longer completely compensated by the electrostatic interaction energy between atomic cores. This leads to an additional contribution to the total energy of the crystal which may be interpreted as the electrostatic interaction energy between anions and cations. One calls it the Madelung energy $E_{Mad}$. The general expression for $E_{Mad}$ is
where the sum extends over a periodicity region. With $E_{Mad}$, the total energy of the crystal is
The Madelung energy is negative, i.e. it strengthens chemical bonding. Since the bonding is then partially due to attractive forces between ions, one refers to it as partially ionic bonding. The absolute value of the Madelung energy is, on the one hand, proportional to the number $G^3$ of unit cells, and on the other hand, inversely proportional to the distance $d$ between two adjacent ions. One therefore sets
(2.320)
with $\alpha$ as the so-called Madelung constant. The latter depends on crystal structure and can easily be calculated numerically. In Table 2.11, the $\alpha$-values are listed for crystal structures which are observed in materials composed of group IV elements as well as III-V, II-VI and I-VII compounds. The value for the wurtzite structure in Table 2.11 corresponds to the ideal tetrahedral case with an equivalent cubic lattice constant $\sqrt{2}a$ (see Chapter 1). The contribution of the Madelung energy to the total energy of a given compound will be larger for larger effective charge number $Z^*$ of the compound. This results in a tendency of compounds with larger $Z^*$ values to crystallize in structures with Madelung constants larger than that of the zincblende structure. Therefore, in passing from the III-V through the II-VI to the I-VII compounds, one observes a transition from the zincblende structure through the wurtzite to the rocksalt and cesium chloride structures. The cesium chloride structure follows from the rocksalt structure by replacing the two face-centered cubic sublattices by two primitive cubic sublattices, shifted in the same way with respect to each other as in the rocksalt
Table 2.11: Madelung constants $\alpha$ for various crystal structures.

Structure     alpha
Zincblende    1.6381
Wurtzite      1.6410
Rocksalt      1.7476
structure, i.e. by $(a/2, a/2, a/2)$. With growing polarity of the bonding, the energy gap becomes larger, as mentioned above. This explains the transition from the semiconducting properties of the group IV crystals to the insulating nature of the I-VII compound crystals. In the case of the I-VII compounds, the absolute values of $V_3$ are so large in comparison with $V_2$ that the bonding polarity $\alpha_p$ is approximately unity. This implies that almost all valence electrons of the compound stay at the anion. Then the crystal consists of positive ions of the group I atoms, which have lost all their valence electrons, and negative ions of the group VII atoms whose valence shells are completely filled. One refers to such crystals as ionic crystals. In this case, the energy gain due to the transfer of electrons from cations to anions, which represents an essential part of the bonding energy and forms the driving force for the formation of ions, no longer depends on the crystal structure. This structure is determined by the Madelung energy only. Therefore, ionic crystals exhibit structures with particularly large Madelung energies, i.e. rocksalt and cesium chloride structures.
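As an illustration of the remark that Madelung constants can be calculated numerically, the following sketch evaluates the constant of the rocksalt structure, referred to the nearest neighbor distance. It uses Evjen's summation over neutral cubes, a standard technique chosen here for its simplicity and not one named in the text; the function name `madelung_rocksalt_evjen` is likewise an assumption.

```python
import itertools


def madelung_rocksalt_evjen(n):
    """Madelung constant of the rocksalt structure from Evjen's neutral-cube sum.

    Ions sit on the integer points (i, j, k) in units of the nearest neighbor
    distance; the cube extends from -n to n along each axis.
    """
    total = 0.0
    for i, j, k in itertools.product(range(-n, n + 1), repeat=3):
        if i == 0 and j == 0 and k == 0:
            continue                          # skip the reference ion itself
        # Ions on faces, edges and corners of the cube are shared with
        # neighboring cubes and count with weight 1/2, 1/4 and 1/8.
        weight = 1.0
        for c in (i, j, k):
            if abs(c) == n:
                weight *= 0.5
        # Charge of the ion at (i, j, k) relative to the reference ion.
        sign = -1.0 if (i + j + k) % 2 else 1.0
        total += weight * sign / (i * i + j * j + k * k) ** 0.5
    return -total


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, madelung_rocksalt_evjen(n))
```

The result for growing cube size can be compared with the rocksalt entry 1.7476 of Table 2.11. Structures with more than one inequivalent neighbor geometry, such as zincblende or wurtzite, would need a different lattice enumeration or an Ewald summation.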
2.7 k·p method
2.7.1 Fundamentals
Luttinger-Kohn functions
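The k·p method rests on a particular property of the Bloch type eigenfunctions $\varphi_{\nu\mathbf{k}}(\mathbf{x})$ of the crystal Hamiltonian $H$. As we know, these functions (which will be denoted below by $\langle\mathbf{x}|\nu\mathbf{k}\rangle$ instead of $\varphi_{\nu\mathbf{k}}(\mathbf{x})$) are the product of an exponential factor $\exp(i\mathbf{k}\cdot\mathbf{x})$ and the lattice-periodic Bloch factor $u_{\nu\mathbf{k}}(\mathbf{x})$. If one replaces the wavevector $\mathbf{k}$ in $u_{\nu\mathbf{k}}(\mathbf{x})$ by a constant $\mathbf{k}_0$, while retaining $\mathbf{k}$ in the exponential factor, then the resulting functions

$$\langle\mathbf{x}|\nu\mathbf{k}\mathbf{k}_0\rangle = e^{i\mathbf{k}\cdot\mathbf{x}}\,u_{\nu\mathbf{k}_0}(\mathbf{x}) \qquad (2.321)$$

are, of course, no longer eigenfunctions of $H$, but they do form a complete orthonormalized basis set in Hilbert space, just as the Bloch functions do, whence

$$\langle\nu\mathbf{k}\mathbf{k}_0|\nu'\mathbf{k}'\mathbf{k}_0\rangle = \delta_{\nu\nu'}\,\delta_{\mathbf{k}\mathbf{k}'}, \qquad (2.322)$$

$$\sum_{\nu\mathbf{k}}\langle\mathbf{x}|\nu\mathbf{k}\mathbf{k}_0\rangle\langle\nu\mathbf{k}\mathbf{k}_0|\mathbf{x}'\rangle = \delta(\mathbf{x}-\mathbf{x}'). \qquad (2.323)$$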
The validity of these relations follows directly from the completeness and orthonormality of the Bloch functions. The $\langle\mathbf{x}|\nu\mathbf{k}\mathbf{k}_0\rangle$ are referred to as Luttinger-Kohn functions. They are determined by the Bloch factors $u_{\nu\mathbf{k}_0}(\mathbf{x})$ for the special wavevector $\mathbf{k}_0$, in contrast to the Bloch functions, which require full knowledge of $u_{\nu\mathbf{k}}(\mathbf{x})$ for all wavevectors $\mathbf{k}$. The k·p method takes advantage of this property of the Luttinger-Kohn functions. In this method, one represents the Schrödinger equation for a crystal electron in terms of the complete orthonormalized set of these functions. The resulting matrix elements of $H$ can be expressed, as we will see later, by the matrix elements of $H$ between the Bloch factors $u_{\nu\mathbf{k}}(\mathbf{x})$ for $\mathbf{k} = \mathbf{k}_0$. These elements are, of course, just as little known as the Bloch factors themselves. However, one may take them as empirical parameters. If one does so and inserts values for the parameters, then the Hamiltonian matrix in the Luttinger-Kohn basis is completely determined. Diagonalizing this matrix yields the eigenvalues and eigenfunctions of the crystal Hamiltonian $H$ for all values of $\mathbf{k}$. This means that the k·p method allows one to calculate, from the Bloch matrix elements at only one point $\mathbf{k}_0$, the eigenvalues and eigenfunctions over the entire first BZ, i.e. to extrapolate from the particular point $\mathbf{k}_0$ to the entire first BZ.

Often one is only interested in solutions in the vicinity of a critical point $\mathbf{k}_c$, e.g. in the vicinity of the valence band maximum or the conduction band minimum. Then it is expedient, although not necessary, to identify $\mathbf{k}_0$ with $\mathbf{k}_c$. If $\mathbf{k}_c$ lies, for example, at the center $\Gamma$ of the first BZ, as often occurs, one has $\mathbf{k}_0 = 0$. This choice will be used later. At the outset, $\mathbf{k}_0$ should still be considered an arbitrary point of the first BZ. In order to accomplish the above program, we expand the Bloch functions $\langle\mathbf{x}|\nu\mathbf{k}\rangle$ with respect to Luttinger-Kohn functions $\langle\mathbf{x}|\mu\mathbf{k}'\mathbf{k}_0\rangle$.
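Only terms with $\mathbf{k}' = \mathbf{k}$ occur in this expansion because of the lattice translation symmetry of both functions, whence

$$|\nu\mathbf{k}\rangle = \sum_{\mu}|\mu\mathbf{k}\mathbf{k}_0\rangle\langle\mu\mathbf{k}\mathbf{k}_0|\nu\mathbf{k}\rangle. \qquad (2.324)$$

With this expansion, the Schrödinger equation (2.178) in the Luttinger-Kohn representation becomes

$$\sum_{\mu'}\langle\mu\mathbf{k}\mathbf{k}_0|H|\mu'\mathbf{k}\mathbf{k}_0\rangle\langle\mu'\mathbf{k}\mathbf{k}_0|\nu\mathbf{k}\rangle = E_\nu(\mathbf{k})\,\langle\mu\mathbf{k}\mathbf{k}_0|\nu\mathbf{k}\rangle. \qquad (2.325)$$

k·p Hamiltonian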
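The matrix elements $\langle\mu\mathbf{k}\mathbf{k}_0|H|\mu'\mathbf{k}\mathbf{k}_0\rangle$ of the Hamiltonian between Luttinger-Kohn functions can be traced back to matrix elements $\langle\mu\mathbf{k}_0|\mathbf{p}|\mu'\mathbf{k}_0\rangle$ of the momentum operator $\mathbf{p}$ between Bloch functions, if one uses the easily proven commutation relation

$$[\mathbf{p}^2, e^{i\mathbf{k}\cdot\mathbf{x}}] = e^{i\mathbf{k}\cdot\mathbf{x}}\,(2\hbar\mathbf{k}\cdot\mathbf{p} + \hbar^2k^2), \qquad (2.326)$$

which yields

$$\langle\mu\mathbf{k}\mathbf{k}_0|H|\mu'\mathbf{k}\mathbf{k}_0\rangle = E^0_\mu(\mathbf{k})\,\delta_{\mu\mu'} + \frac{\hbar}{m}(\mathbf{k}-\mathbf{k}_0)\cdot\langle\mu\mathbf{k}_0|\mathbf{p}|\mu'\mathbf{k}_0\rangle, \qquad (2.327)$$

where

$$E^0_\mu(\mathbf{k}) = E_\mu(\mathbf{k}_0) + \frac{\hbar^2}{2m}(\mathbf{k}-\mathbf{k}_0)^2. \qquad (2.328)$$

The matrix on the right-hand side of (2.327) allows for an important rewriting. If one defines

$$H_{k\cdot p}(\mathbf{k}) = H_0(\mathbf{k}) + \frac{\hbar}{m}(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{p}, \qquad (2.329)$$

with

$$H_0(\mathbf{k}) = H + \frac{\hbar^2}{2m}(\mathbf{k}-\mathbf{k}_0)^2, \qquad (2.330)$$

then one has

$$\langle\mu\mathbf{k}\mathbf{k}_0|H|\mu'\mathbf{k}\mathbf{k}_0\rangle = \langle\mu\mathbf{k}_0|H_{k\cdot p}(\mathbf{k})|\mu'\mathbf{k}_0\rangle. \qquad (2.331)$$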
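The latter relation means that the actual Hamiltonian matrix of $H$ in the k-dependent Luttinger-Kohn basis $|\mu\mathbf{k}\mathbf{k}_0\rangle$ equals the representative matrix of a fictitious k-dependent Hamiltonian $H_{k\cdot p}(\mathbf{k})$ in the k-independent Bloch basis $|\mu\mathbf{k}_0\mathbf{k}_0\rangle \equiv |\mu\mathbf{k}_0\rangle$ for the wavevector $\mathbf{k} = \mathbf{k}_0$. The k-dependence of the Luttinger-Kohn basis on the left-hand side of equation (2.327) has been transferred to the new Hamiltonian $H_{k\cdot p}(\mathbf{k})$ on the right-hand side. The Schrödinger equation (2.325), with this new Hamiltonian, reads

$$\sum_{\mu'}\langle\mu\mathbf{k}_0|H_{k\cdot p}(\mathbf{k})|\mu'\mathbf{k}_0\rangle\langle\mu'\mathbf{k}\mathbf{k}_0|\nu\mathbf{k}\rangle = E_\nu(\mathbf{k})\,\langle\mu\mathbf{k}\mathbf{k}_0|\nu\mathbf{k}\rangle. \qquad (2.332)$$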
The components of the eigenvectors $|\nu\mathbf{k}\rangle$ in (2.332) refer to the Luttinger-Kohn basis $|\mu\mathbf{k}\mathbf{k}_0\rangle$, although the operator $H_{k\cdot p}(\mathbf{k})$ is represented in the Bloch basis $|\mu\mathbf{k}_0\rangle$. Solution of the Schrödinger equation (2.332) involves the diagonalization of the matrix $\langle\mu\mathbf{k}_0|H_{k\cdot p}(\mathbf{k})|\mu'\mathbf{k}_0\rangle$. For $\mathbf{k} = \mathbf{k}_0$ this matrix is automatically diagonal, by virtue of the fact that the Bloch functions $|\mu\mathbf{k}_0\rangle$ are eigenfunctions of the Hamiltonian $H_{k\cdot p}(\mathbf{k}_0) = H_0(\mathbf{k}_0)$. For $\mathbf{k} \neq \mathbf{k}_0$, the $|\mu\mathbf{k}_0\rangle$ states are no longer eigenfunctions of $H_{k\cdot p}(\mathbf{k})$, so that the matrix $\langle\mu\mathbf{k}_0|H_{k\cdot p}(\mathbf{k})|\mu'\mathbf{k}_0\rangle$ has off-diagonal elements with respect to the band indices. Formally, one may interpret these nonvanishing elements as arising from an interaction between different bands. Since this interaction results from the $(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{p}$ term in $H_{k\cdot p}(\mathbf{k})$, one calls it the k·p interaction. In this, the bands which are mutually coupled are not bands in the sense of the eigenvalues of the actual crystal Hamiltonian $H$ (the latter are uncoupled by definition); they are fictitious bands $E^0_\mu(\mathbf{k})$ defined by equation (2.328). As the point $\mathbf{k}$ in $H_{k\cdot p}(\mathbf{k})$ approaches $\mathbf{k}_0$, the k·p interaction tends to zero. For k-vectors sufficiently close to $\mathbf{k}_0$, one can treat this interaction with the help of quantum mechanical perturbation theory. Apart from the square term in $(\mathbf{k}-\mathbf{k}_0)$ already present in $E^0_\nu(\mathbf{k})$, this entails a power series expansion of the energy bands $E_\nu(\mathbf{k})$ with respect to $(\mathbf{k}-\mathbf{k}_0)$ about the point $\mathbf{k}_0$. The form of the perturbation theoretical expansion depends on whether the unperturbed bands, i.e. the eigenvalues $E_\nu(\mathbf{k}_0)$, are degenerate or not. We will first consider the simpler case of nondegenerate bands.
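In first order perturbation theory the eigenvalue $E^1_\nu(\mathbf{k})$ arising from $E^0_\nu(\mathbf{k})$ is given by the relation

$$E^1_\nu(\mathbf{k}) = E^0_\nu(\mathbf{k}) + \frac{\hbar}{m}(\mathbf{k}-\mathbf{k}_0)\cdot\langle\nu\mathbf{k}_0|\mathbf{p}|\nu\mathbf{k}_0\rangle, \qquad (2.333)$$

and the Bloch function $|\nu\mathbf{k}\rangle^1$ arising from $|\nu\mathbf{k}\rangle^0 = |\nu\mathbf{k}\mathbf{k}_0\rangle$ by

$$|\nu\mathbf{k}\rangle^1 = |\nu\mathbf{k}\mathbf{k}_0\rangle + \frac{\hbar}{m}\sum_{\mu\neq\nu}|\mu\mathbf{k}\mathbf{k}_0\rangle\,\frac{(\mathbf{k}-\mathbf{k}_0)\cdot\langle\mu\mathbf{k}_0|\mathbf{p}|\nu\mathbf{k}_0\rangle}{E_\nu(\mathbf{k}_0)-E_\mu(\mathbf{k}_0)}.$$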
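Since the first derivatives $\nabla_{\mathbf{k}}E_\nu(\mathbf{k})$ of the exact band energies $E_\nu(\mathbf{k})$ at $\mathbf{k}_0$ depend only on linear expansion terms of $E_\nu(\mathbf{k})$ in $\mathbf{k}-\mathbf{k}_0$, no approximation is needed to obtain the relation $\nabla_{\mathbf{k}}E_\nu(\mathbf{k})|_{\mathbf{k}_0} = \nabla_{\mathbf{k}}E^1_\nu(\mathbf{k})|_{\mathbf{k}_0}$. Considering (2.333), this exact relation yields $\nabla_{\mathbf{k}}E_\nu(\mathbf{k})|_{\mathbf{k}_0} = (\hbar/m)\langle\nu\mathbf{k}_0|\mathbf{p}|\nu\mathbf{k}_0\rangle$. This holds the same content as equation (2.193) used above without proof, because $\mathbf{k}_0$ may be an arbitrary point of the first BZ. In particular, if $\mathbf{k}_0$ is a critical point $\mathbf{k}_c$, i.e. if $\nabla_{\mathbf{k}}E_\nu(\mathbf{k})|_{\mathbf{k}_c} = 0$ holds, then the k·p correction vanishes in first order perturbation theory. One must proceed to the second order to get a nonvanishing contribution from the k·p perturbation. The result reads

$$E_\nu(\mathbf{k}) = E_\nu(\mathbf{k}_c) + \frac{\hbar^2}{2}\sum_{\alpha\beta}(k-k_c)_\alpha\,(M_\nu^{-1})_{\alpha\beta}\,(k-k_c)_\beta,$$

with

$$(M_\nu^{-1})_{\alpha\beta} = \frac{1}{m}\delta_{\alpha\beta} + \frac{1}{m^2}\sum_{\mu\neq\nu}\frac{\langle\nu\mathbf{k}_c|p_\alpha|\mu\mathbf{k}_c\rangle\langle\mu\mathbf{k}_c|p_\beta|\nu\mathbf{k}_c\rangle + \langle\nu\mathbf{k}_c|p_\beta|\mu\mathbf{k}_c\rangle\langle\mu\mathbf{k}_c|p_\alpha|\nu\mathbf{k}_c\rangle}{E_\nu(\mathbf{k}_c)-E_\mu(\mathbf{k}_c)}. \qquad (2.337)$$

Generalizing the terminology introduced in section 2.6, we call $M_\nu^{-1}$ the effective mass tensor at the critical point $\mathbf{k}_c$. For the diagonal elements $1/m^*_{\nu\alpha}$ of $M_\nu^{-1}$ with respect to the principal axis system, one obtains from (2.337) the relation

$$\frac{1}{m^*_{\nu\alpha}} = \frac{1}{m} + \frac{2}{m^2}\sum_{\mu\neq\nu}\frac{|\langle\nu\mathbf{k}_c|p_\alpha|\mu\mathbf{k}_c\rangle|^2}{E_\nu(\mathbf{k}_c)-E_\mu(\mathbf{k}_c)}. \qquad (2.338)$$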
This relation connects the effective masses with the matrix elements of the momentum operator between different bands and with the energy separation of bands at the critical point. The tendency indicated is that the absolute values of the effective masses become larger for smaller momentum matrix elements and larger band separations. One expects small effective masses for large momentum matrix elements and small band separations. As far as the band separations are concerned (only for them can one make an easy estimate), we will later find confirmation of this tendency in all concrete cases. For pairs of bands which are closer to each other than to all other bands and, therefore, whose mutual interaction is stronger than that with all other bands, relation (2.338) allows one to also draw a conclusion about the signs of the effective masses. According to it, the energetically higher of the two bands should have a positive effective mass, and the energetically lower a mass of approximately the same absolute value but of negative sign. This conclusion also proves to be valid in all cases in which the assumptions of this calculation apply.
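The statements about magnitudes and signs of the effective masses can be checked on the simplest case of two nondegenerate bands coupled by a single momentum matrix element. The sketch below is an illustration under stated assumptions, not a calculation from the text: it takes a 2 x 2 k·p matrix with band edges 0 and `e_gap`, writes the coupling through an energy `e_p` = $2|p|^2/m$, and uses the free-electron kinetic energy `t` = $\hbar^2k^2/2m$ as the independent variable. The numerical values in the usage part are hypothetical.

```python
import math


def exact_bands(t: float, e_gap: float, e_p: float) -> tuple:
    """Exact eigenvalues (lower, upper) of the two-band k.p matrix.

    Diagonal elements: t and e_gap + t; off-diagonal element: sqrt(t * e_p).
    All energies share one unit (e.g. eV).
    """
    mid = 0.5 * e_gap
    root = math.sqrt(mid * mid + t * e_p)
    return mid + t - root, mid + t + root


def inverse_mass_ratios(e_gap: float, e_p: float) -> tuple:
    """m/m* of (lower, upper) band from second-order perturbation theory."""
    return 1.0 - e_p / e_gap, 1.0 + e_p / e_gap


def parabolic_bands(t: float, e_gap: float, e_p: float) -> tuple:
    """Second-order (effective mass) approximation to the two bands."""
    lower, upper = inverse_mass_ratios(e_gap, e_p)
    return lower * t, e_gap + upper * t


if __name__ == "__main__":
    E_GAP, E_P = 1.5, 25.0  # hypothetical values, same energy unit
    print("m/m* (lower, upper):", inverse_mass_ratios(E_GAP, E_P))
    for t in (1e-4, 1e-3, 1e-2):
        print(t, exact_bands(t, E_GAP, E_P), parabolic_bands(t, E_GAP, E_P))
```

For a coupling energy much larger than the gap, the upper band comes out with a small positive mass and the lower band with a negative mass of nearly the same magnitude, and the parabolic approximation degrades as `t * e_p` becomes comparable to the square of the gap.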
Band degeneracy
Critical points are often symmetry centers or lie on symmetry lines, and at these symmetry points, degeneracy of the energy bands often occurs. If this happens, one must carry out second order k·p perturbation theory for degenerate bands. In quantum mechanics, perturbation theory for degenerate energy levels is commonly of first order: the matrix of the perturbing Hamiltonian operator between the degenerate states has to be diagonalized (we remind the reader of the nearly free electron approximation in section 2.4). This procedure does not apply here because the perturbation matrix at critical points vanishes in first order. One must therefore choose a variant of perturbation theory for degenerate energies which works in second order. To this end, one constructs the matrix of the perturbation operator not between degenerate unperturbed eigenstates, as is commonly done, but between the (also degenerate) eigenstates of first order perturbation theory. By diagonalizing this matrix one obtains the eigenvalues in second order perturbation theory. These are, in general, no longer degenerate. An important case in which the k·p perturbation matrix between the degenerate unperturbed states vanishes is the valence band maximum of semiconductors with diamond structure. This case will now be investigated. In doing so, we initially neglect the spin-orbit interaction. This approximation is valid for semiconductor materials composed of light elements only, including, for example, Si. For other materials this procedure serves as a zero-order approximation which can be used to proceed further (as we will do below).
2.7.2
As we know from section 2.3, the valence band maximum of diamond type semiconductors is located at the center $\Gamma$ of the first BZ. Therefore, we set $\mathbf{k}_0 = 0$. The maximum is 3-fold degenerate. We denote the three pertinent Bloch functions by $|vm0\rangle$, where $m$ can assume the values $x, y, z$. According to section 2.6, these eigenfunctions belong to the irreducible representation $\Gamma'_{25}$ of the cubic group $O_h$. As indicated in Appendix A, a basis of this representation is formed by the products $yz$, $zx$, $xy$ of the components $x, y, z$ of the position vector $\mathbf{x}$. Therefore, with regard to their transformation properties under the action of elements of $O_h$, we may identify $|vx0\rangle$ with $yz$, $|vy0\rangle$ with $zx$, and $|vz0\rangle$ with $xy$. The vector components $x, y, z$ of the position vector itself transform in accordance with the irreducible representation $\Gamma_{15}$ of $O_h$.
For the subgroup $T_d$ of $O_h$ the two representations $\Gamma_{15}$ and $\Gamma'_{25}$ coincide. For semiconductors having the zincblende structure, the three degenerate states $|vx0\rangle$, $|vy0\rangle$, $|vz0\rangle$ of the valence band maximum may therefore be associated with $x, y, z$ insofar as their transformation behavior is concerned. In the case of the diamond structure, $x, y, z$ are merely a shorthand notation. The vanishing of the matrix $\langle vm0|\mathbf{p}|vm'0\rangle$ of the momentum operator between valence band states at $\Gamma$, anticipated above, may easily be demonstrated using the pertinent criterion for such vanishing given in Appendix A: The operator $\mathbf{p}$ transforms according to the irreducible representation $\Gamma_{15}$ of $O_h$. The matrix $\langle vm0|\mathbf{p}|vm'0\rangle$ therefore belongs to the reducible representation $\Gamma'_{25}\times\Gamma_{15}\times\Gamma'_{25} = \Gamma'_{25}\times(\Gamma'_2 + \Gamma'_{12} + \Gamma_{15} + \Gamma_{25})$, wherein the identity representation does not occur. According to Appendix A this means that the matrix $\langle vm0|\mathbf{p}|vm'0\rangle$ must vanish. One can also obtain this result by means of inversion symmetry alone. We have chosen the somewhat more troublesome method of proof because it may also be applied in other, less obvious cases, as we will see immediately below.
In order to apply degenerate second order perturbation theory, the solutions of Schrödinger's equation (2.332) are needed to first order in the k·p perturbation. For the orthonormalized Bloch valence band eigenstates $|vm\mathbf{k}\rangle^1$ one finds
(2.339)
where we set $E_\mu(0) = E_\mu$ for brevity, and the degenerate valence band energy $E_v(0)$ is denoted by $E_v$. The third term in (2.339) guarantees the normalization. Considering the sum on $\mu$, the value $\mu = vm$ does not need to be specifically excluded because the matrix elements $\langle\mu|\mathbf{p}|vm'\rangle$ for $\mu = vm$ vanish anyway. Expressions of the form (2.339) also hold for the approximate Bloch functions $|\mu\mathbf{k}\rangle^1$ of the remaining bands $\mu$ with $\mu \neq vm$, but we omit an explicit presentation of them here. The states $|vm\mathbf{k}\rangle^1$ and $|\mu\mathbf{k}\rangle^1$ with $\mu \neq vm$ will now be used as a basis set to represent the Hamiltonian $H$. The resulting matrix is 'almost' diagonal, because the basis functions are 'almost' eigenfunctions. In particular, the submatrix of the three valence bands is coupled to the remainder of the matrix only by elements of second order in the k·p perturbation. These elements give rise to corrections of the valence band energies which are only of third order and can be neglected.
In second order perturbation theory, the valence band energies $E_v$ and corresponding Bloch states $|E_v\rangle$ therefore follow from an eigenvalue equation which is decoupled from the remaining bands, namely

$$\sum_{m'}{}^1\langle vm\mathbf{k}|H|vm'\mathbf{k}\rangle^1\;{}^1\langle vm'\mathbf{k}|E_v\rangle = E_v\;{}^1\langle vm\mathbf{k}|E_v\rangle. \qquad (2.340)$$
The initial occurrence of interaction between the valence band and the remaining bands is incorporated in the matrix ${}^1\langle vm\mathbf{k}|H|vm'\mathbf{k}\rangle^1$ in first order k·p perturbation theory.
Hamiltonian matrix
The (3 × 3) Hamiltonian matrix ${}^1\langle vm\mathbf{k}|H|vm'\mathbf{k}\rangle^1$ of equation (2.340) can be obtained by means of expression (2.339) for the perturbed states $|vm\mathbf{k}\rangle^1$. A short calculation yields
where
is a fourth-rank tensor. Since the states $|\mu\rangle \equiv |\mu 0\rangle$ are eigenfunctions of $H$ with eigenvalue $E_\mu = E_\mu(0)$, one may write (2.342) in the more compact form
With respect to the indices $\alpha, \beta$ the tensor $D^{\alpha\beta}_{mm'}$ is symmetric, and with respect to the indices $m, m'$ it is Hermitian. From equation (2.343) one can see that $D^{\alpha\beta}_{mm'}$ transforms under symmetry operations of the cubic group $O_h$ according to the 4-fold product representation $[\Gamma'_{25}\times\Gamma'_{25}]_s\times[\Gamma_{15}\times\Gamma_{15}]_s$, where the index $s$ denotes the symmetrical part of the product. According to Appendix A, $D^{\alpha\beta}_{mm'}$ then contains as many independent elements as the number of times the identity representation occurs in the product $[\Gamma'_{25}\times\Gamma'_{25}]_s\times[\Gamma_{15}\times\Gamma_{15}]_s$. Using Appendix A, one finds $[\Gamma_{15}\times\Gamma_{15}]_s = [\Gamma'_{25}\times\Gamma'_{25}]_s = \Gamma_1 + \Gamma_{12} + \Gamma'_{25}$, which yields $[\Gamma'_{25}\times\Gamma'_{25}]_s\times[\Gamma_{15}\times\Gamma_{15}]_s = 3\Gamma_1 + \Gamma_2 + 4\Gamma_{12} + 3\Gamma'_{15} + 5\Gamma'_{25}$. The tensor $D^{\alpha\beta}_{mm'}$ therefore has three independent components. One can show that these correspond to the three types of nonvanishing matrix elements $D^{xx}_{xx}$, $D^{yy}_{xx}$ and $D^{xy}_{xy}$. We introduce the abbreviations $L = D^{xx}_{xx}$, $M = D^{yy}_{xx}$, and $N = D^{xy}_{xy} + D^{yx}_{xy}$. The elements $L$, $M$ and $N$ can be calculated if the Bloch factors are known. In the absence of this information, however, we consider $L$, $M$ and $N$ to be empirical parameters (as indicated at the outset) and use their connection with the Bloch factors only to identify some general properties, such as the fact that they can be chosen real. Since all remaining matrix elements $D^{\alpha\beta}_{mm'}$ vanish, the Hamiltonian matrix of the valence band has the form

$$\begin{pmatrix} Lk_x^2 + M(k_y^2+k_z^2) & Nk_xk_y & Nk_xk_z \\ Nk_xk_y & Lk_y^2 + M(k_z^2+k_x^2) & Nk_yk_z \\ Nk_xk_z & Nk_yk_z & Lk_z^2 + M(k_x^2+k_y^2) \end{pmatrix}. \qquad (2.344)$$
Method of invariants
The Hamiltonian matrix (2.344) can also be derived in a somewhat different way, which leads to the goal more quickly, but is formally more demanding. One uses the fact that the Hamiltonian matrix ${}^1\langle vm\mathbf{k}|H|vm'\mathbf{k}\rangle^1$ can be represented as a linear combination of the 9 matrices of a basis in the product space $|vm0\rangle\langle vm'0|$ which transforms according to the representation $[\Gamma'_{25}\times\Gamma'_{25}]$ of the point group $O_h$. This representation is reducible. By decomposing it into its irreducible parts, one obtains a basis which consists of subbases, each of which belongs to a particular irreducible representation of $O_h$. Such a matrix basis can easily be constructed by means of the 3-dimensional angular momentum matrices $I_x, I_y, I_z$ (considered in Appendix A) and their products, since it is known how these matrices transform, namely according to the pseudovector representation $\Gamma'_{15}$. In the product space $k_\alpha k_\beta$ of the components of the vector $\mathbf{k}$, one proceeds in a similar way. One determines a basis from subbases which transform according to the irreducible parts of the representation $[\Gamma_{15}\times\Gamma_{15}]_s$. The Hamiltonian matrix represents an element in the product space of the two spaces which is invariant under transformations of the point group $O_h$. Such invariant elements of the product space can be produced by forming scalar products of subbases of the two spaces which transform according to the same irreducible representation. As seen in Appendix A, the corresponding scalar products belong to the identity representation, that is to say, they are invariant. To find the most general Hamiltonian matrix compatible with the symmetry $O_h$, one has to determine all invariants of the product space. If one then multiplies each by a real scalar factor and sums them all, one obtains the most general invariant of the product space and thus the most general Hamiltonian matrix compatible with $O_h$ symmetry. This process is called the method of invariants. It is applicable to arbitrary symmetry groups and degrees of degeneracy, and it quickly leads to the goal if one considers spin and spin-orbit interaction. It also allows one to determine the matrices for perturbing Hamiltonians other than that of the k·p interaction, such as the interaction between the angular momentum of Bloch electrons and an external magnetic field (see section 3.9) or the interaction with mechanical strain. In this book, we will only use the method of invariants occasionally. A comprehensive outline of the method with several applications is given by Bir and Pikus (1974).

Valence band structure
The eigenvalues of the matrix (2.344) form three valence bands $E_{v1}(\mathbf{k})$, $E_{v2}(\mathbf{k})$, $E_{v3}(\mathbf{k})$. For the three symmetric k-directions [100], [111] and [110] the dispersion curves are determined by simple analytical expressions as follows:
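[100]:
$$E_{v1/2}(\mathbf{k}) = Mk^2, \qquad E_{v3}(\mathbf{k}) = Lk^2, \qquad (2.345)$$

[111]:
$$E_{v1/2}(\mathbf{k}) = \tfrac{1}{3}[L + 2M - N]k^2, \qquad E_{v3}(\mathbf{k}) = \tfrac{1}{3}[L + 2M + 2N]k^2, \qquad (2.346)$$

[110]:
$$E_{v1}(\mathbf{k}) = Mk^2, \qquad E_{v2}(\mathbf{k}) = \tfrac{1}{2}[L + M + N]k^2, \qquad (2.347)$$

$$E_{v3}(\mathbf{k}) = \tfrac{1}{2}[L + M - N]k^2. \qquad (2.348)$$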
Along the two directions [100] and [111], the valence band, being triply degenerate at $\Gamma$, splits into two bands, one 2-fold degenerate and one nondegenerate (see Figure 2.25). In the [110]-direction and also for all more asymmetric k-vectors, no degeneracy remains. This indicates a 3-fold splitting of the valence band for such $\mathbf{k}$. All bands are parabolic, but evidently, in general, not isotropic. One speaks of a warping of energy bands. In the case of Si, one has $L = -5.64$, $M = -3.60$, $N = -8.68$ in units of $(\hbar^2/2m)$. Using these values, the two degenerate bands $E_{v1/2}(\mathbf{k})$ of equations (2.345) or (2.346) have smaller curvatures than the third band $E_{v3}(\mathbf{k})$ in these equations. Thus the first two bands correspond to the heavy holes and the third band to the light holes of Si. Isotropy exists only if $N = 0$ and $L = M$. Then, there also is no longer any distinction between light and heavy holes. Conversely, anisotropy grows stronger as the difference between the masses of the two types of holes becomes larger. The results discussed above were obtained without consideration of spin-orbit interaction. However, for most of the diamond and zincblende type semiconductors, the valence band structure is significantly influenced by this interaction (in the case of Si it is small, but often not negligible). We now proceed to consider the effects of spin-orbit interaction.

Figure 2.25: Valence band dispersion for diamond type semiconductors in the vicinity of the $\Gamma'_{25}$ maximum for different k-directions.
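The closed-form dispersions along [100], [111] and [110] can be cross-checked against the (3 × 3) matrix itself. The sketch below assumes the standard form of the valence band matrix, with diagonal elements $Lk_x^2 + M(k_y^2+k_z^2)$ (and cyclic permutations) and off-diagonal elements $Nk_xk_y$ etc., and it assumes negative signs for the Si parameters, as required for a band maximum. Function names are illustrative. Instead of a numerical diagonalization, it verifies that each closed-form curvature is a root of the secular determinant.

```python
import math


def valence_matrix(k, L, M, N):
    """3x3 k.p matrix in the basis |vx>, |vy>, |vz>; units of hbar^2/2m."""
    k2 = sum(c * c for c in k)
    h = [[N * k[i] * k[j] for j in range(3)] for i in range(3)]
    for i in range(3):
        h[i][i] = L * k[i] ** 2 + M * (k2 - k[i] ** 2)
    return h


def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def secular(h, energy):
    """det(H - E*1); vanishes when E is an eigenvalue of H."""
    shifted = [[h[i][j] - (energy if i == j else 0.0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)


def closed_form(direction, L, M, N):
    """Curvatures E/k^2 of (E_v1, E_v2, E_v3) from (2.345)-(2.348)."""
    if direction == "100":
        return (M, M, L)
    if direction == "111":
        return ((L + 2 * M - N) / 3, (L + 2 * M - N) / 3, (L + 2 * M + 2 * N) / 3)
    if direction == "110":
        return (M, (L + M + N) / 2, (L + M - N) / 2)
    raise ValueError(direction)


UNIT_VECTORS = {
    "100": (1.0, 0.0, 0.0),
    "111": (1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3)),
    "110": (1 / math.sqrt(2), 1 / math.sqrt(2), 0.0),
}
SI = dict(L=-5.64, M=-3.60, N=-8.68)  # signs assumed negative (band maximum)

if __name__ == "__main__":
    for name, k in UNIT_VECTORS.items():
        h = valence_matrix(k, **SI)
        curv = closed_form(name, **SI)
        residual = max(abs(secular(h, e)) for e in curv)
        print(name, [round(e, 3) for e in curv], "max |det| =", residual)
```

With these parameters the doubly degenerate branch along [100] and [111] has the smaller curvature magnitude, in line with its identification as the heavy hole band.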
2.7.3
we have
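The total Hamiltonian of the system is obtained by adding the spin-orbit interaction $H_{so}$ to the Hamiltonian $H$ in its absence, where $H_{so}$ is given by equation (2.56) as

$$H_{so} = \frac{\hbar}{4m^2c^2}\,[\nabla V(\mathbf{x})\times\mathbf{p}]\cdot\boldsymbol{\sigma}. \qquad (2.350)$$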
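One has to be aware that the $\langle\mathbf{x}s|\nu\sigma\mathbf{k}\rangle$ are eigenfunctions of $H$, but not of $H + H_{so}$. Correspondingly, the $\langle\mathbf{x}s|\nu\sigma\mathbf{k}0\rangle$ signify the spin-dependent Luttinger-Kohn functions of $H$, but not of $H + H_{so}$. The matrix representation of the Schrödinger equation with respect to the spin-dependent Luttinger-Kohn basis reads

$$\sum_{\mu'\sigma'}\langle\mu\sigma\mathbf{k}0|H + H_{so}|\mu'\sigma'\mathbf{k}0\rangle\langle\mu'\sigma'\mathbf{k}0|E_\nu\rangle = E_\nu\,\langle\mu\sigma\mathbf{k}0|E_\nu\rangle. \qquad (2.351)$$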
In calculating the matrix of $H + H_{so}$ of (2.351), an additional k-dependent term appears in comparison with the spinless case, as a consequence of the fact that $H_{so}$ contains the momentum operator. This term has the same form as the $(\hbar/m)\mathbf{k}\cdot\mathbf{p}$ term arising from $H$, except that the operator $\mathbf{p}$ is replaced by the operator $(\hbar/4mc^2)[\boldsymbol{\sigma}\times\nabla V(\mathbf{x})]$. The additional term can be taken into account by replacing the operator $H_{k\cdot p}$ of equation (2.329) (for $\mathbf{k}_0 = 0$) by the operator
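$$H_{k\cdot\pi} = H_0(\mathbf{k}) + \frac{\hbar}{m}\,\mathbf{k}\cdot\boldsymbol{\pi}, \qquad (2.352)$$

with

$$\boldsymbol{\pi} = \mathbf{p} + \frac{\hbar}{4mc^2}\,[\boldsymbol{\sigma}\times\nabla V(\mathbf{x})]. \qquad (2.353)$$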
becomes
+ H,,
(2.354)
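$$\sum_{\mu'\sigma'}\langle\mu\sigma 0|H_{k\cdot\pi} + H_{so}|\mu'\sigma'0\rangle\langle\mu'\sigma'\mathbf{k}0|E_\nu\rangle = E_\nu\,\langle\mu\sigma\mathbf{k}0|E_\nu\rangle. \qquad (2.355)$$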
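Up to this point, we have kept the discussion general. Now we wish to explore the particular consequences of spin-orbit interaction for the previously considered valence band states. To this end, we need the matrix elements $\langle vm\sigma 0|H_{so}|vm'\sigma'0\rangle$ of $H_{so}$ between the spin-dependent Bloch states $|vm\sigma 0\rangle$, namely,

$$\langle vm\sigma 0|H_{so}|vm'\sigma'0\rangle = \frac{\hbar}{4m^2c^2}\,\langle vm0|[\nabla V(\mathbf{x})\times\mathbf{p}]|vm'0\rangle\cdot\langle\sigma|\boldsymbol{\sigma}|\sigma'\rangle. \qquad (2.356)$$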
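To evaluate the matrix element $\langle vm0|[\nabla V(\mathbf{x})\times\mathbf{p}]|vm'0\rangle$ in coordinate space we make use of crystal symmetry, as was done before in the calculation of the matrix elements of the momentum operator $\mathbf{p}$. The operator $[\nabla V(\mathbf{x})\times\mathbf{p}]$ is a pseudovector and transforms according to the irreducible representation $\Gamma'_{15}$ of $O_h$. The entire third-rank tensor $\langle vm0|[\nabla V(\mathbf{x})\times\mathbf{p}]|vm'0\rangle$ therefore belongs to the reducible representation $\Gamma'_{25}\times\Gamma'_{15}\times\Gamma'_{25} = \Gamma'_{25}\times(\Gamma_2 + \Gamma_{12} + \Gamma'_{15} + \Gamma'_{25})$, in which the identity representation occurs exactly once. The tensor $\langle vm0|[\nabla V(\mathbf{x})\times\mathbf{p}]|vm'0\rangle$ consequently contains one independent constant. This constant coincides with the matrix elements $\langle vy0|[\nabla V(\mathbf{x})\times\mathbf{p}]_z|vx0\rangle = \langle vz0|[\nabla V(\mathbf{x})\times\mathbf{p}]_x|vy0\rangle = \langle vx0|[\nabla V(\mathbf{x})\times\mathbf{p}]_y|vz0\rangle$, as well as with the negative of $\langle vx0|[\nabla V(\mathbf{x})\times\mathbf{p}]_z|vy0\rangle = \langle vy0|[\nabla V(\mathbf{x})\times\mathbf{p}]_x|vz0\rangle = \langle vz0|[\nabla V(\mathbf{x})\times\mathbf{p}]_y|vx0\rangle$, where $\langle vx0|[\nabla V(\mathbf{x})\times\mathbf{p}]_z|vy0\rangle = -\langle vy0|[\nabla V(\mathbf{x})\times\mathbf{p}]_z|vx0\rangle$ holds. Because the Bloch factors are real, these elements are pure imaginary. We denote the value of $\langle vy0|[\nabla V(\mathbf{x})\times\mathbf{p}]_z|vx0\rangle$ by $(4m^2c^2/\hbar)(i/3)\Delta$, i.e. we set

$$\frac{\hbar}{4m^2c^2}\,\langle vy0|[\nabla V(\mathbf{x})\times\mathbf{p}]_z|vx0\rangle = i\,\frac{\Delta}{3}. \qquad (2.357)$$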
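Below, we will see that the constant $\Delta$ is the energy splitting of the valence band at $\Gamma$ due to spin-orbit interaction. Applying equation (2.357) and the explicit form of the spin matrices given in (2.57), the matrix $\langle vm\sigma 0|H_{so}|vm'\sigma'0\rangle$ becomes

$$\langle vm\sigma 0|H_{so}|vm'\sigma'0\rangle = \frac{\Delta}{3}\begin{pmatrix} 0 & -i & 0 & 0 & 0 & 1 \\ i & 0 & 0 & 0 & 0 & -i \\ 0 & 0 & 0 & -1 & i & 0 \\ 0 & 0 & -1 & 0 & i & 0 \\ 0 & 0 & -i & -i & 0 & 0 \\ 1 & i & 0 & 0 & 0 & 0 \end{pmatrix}. \qquad (2.358)$$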
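Here the rows and columns are associated with the basis functions in the sequence $|vx\uparrow 0\rangle \equiv |x\uparrow\rangle$, $|vy\uparrow 0\rangle \equiv |y\uparrow\rangle$, ..., $|vz\downarrow 0\rangle \equiv |z\downarrow\rangle$. Spin-orbit interaction couples orbital states and spin states to each other. At $\mathbf{k} = 0$ the expressions $E^0_\mu(\mathbf{k})$ and $\mathbf{k}\cdot\boldsymbol{\pi}$ are both zero. The eigenenergies and eigenfunctions of the total Hamiltonian $H_{k\cdot\pi} + H_{so}$ are therefore also those of $H_{so}$. The Schrödinger equation (2.355) for $\mathbf{k} = 0$ becomes

$$\sum_{m'\sigma'}\langle vm\sigma 0|H_{so}|vm'\sigma'0\rangle\langle vm'\sigma'0|E\rangle = E\,\langle vm\sigma 0|E\rangle. \qquad (2.359)$$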
r. Angular momcntum
basis.
1 p
(2.360)
IE )
v1 
Jz
IEv3) =
Jz
O,O, 1,  i , O ) ,
(2.361)
$$E_{v5/6} = -\frac{2}{3}\Delta, \qquad (2.362)$$
(o, fi
(2.363)
The components of the eigenvectors given by (2.361) to (2.363) refer to the basis functions $|vm\sigma 0\rangle$. If we abbreviate these by $|m\sigma\rangle = |x\uparrow\rangle, |y\uparrow\rangle, |z\uparrow\rangle, |x\downarrow\rangle, |y\downarrow\rangle, |z\downarrow\rangle$, the eigenvectors take the form
The eigenvectors $|E_{vi}\rangle$, $i = 1, 2, \ldots, 6$ of (2.364) have a simple meaning. The $|E_{v1}\rangle, |E_{v2}\rangle, |E_{v3}\rangle, |E_{v4}\rangle$ are basis functions of the irreducible representation $\Gamma_8^+$ of the cubic group $O_h$. According to Appendix A, these representations emerge from the representation $D_{3/2}$ of the full rotation group if $D_{3/2}$ is taken as a representation of the subgroup $O_h$ with $+1$ for inversion. It has also been shown that the basis functions of this representation are the simultaneous eigenfunctions of the angular momentum squared $\mathbf{J}^2$ for the eigenvalue $j(j+1)$ with $j = \frac{3}{2}$ (in units of $\hbar^2$), and of the z-component $J_z$ of $\mathbf{J}$ for the eigenvalues $m_j = \frac{3}{2}, \frac{1}{2}, -\frac{1}{2}$ and $-\frac{3}{2}$. One therefore denotes the first four eigenfunctions of (2.364) by
4.
I)2 2
33
= 1rz
31 + iy t), I)2 4 2
= "2Iz
fi
t) + la: + iY 1 1 1 7
33 i = 1. 22
(2.365)
I)2 2
3T
[& I.
 iy
T)
+2/2
111%I)
Jz
iY
I),
The $|E_{v5}\rangle, |E_{v6}\rangle$ are basis functions of the irreducible representation $\Gamma_7^+$ of $O_h$. These representations do not arise from any representation $D_j$ of the full orthogonal group, in particular not from a representation for $j = \frac{1}{2}$ (this happens with $\Gamma_6^+$ in the case of $O_h$ or $\Gamma_6$ in the case of $T_d$), but the expectation values of $\mathbf{J}^2$ and $J_z$ are the same as those in the $D_{1/2}$ basis. Therefore one also uses the angular momentum notation for the last two eigenfunctions of (2.364), i.e. one sets
rt
Each of the eigenvectors (2.365) and (2.366) is determined only up to a phase factor, which is chosen here such that the states with negative total angular momentum, $|\frac{3}{2}\,{-\frac{3}{2}}\rangle$, $|\frac{3}{2}\,{-\frac{1}{2}}\rangle$, $|\frac{1}{2}\,{-\frac{1}{2}}\rangle$, follow, respectively, from the states with positive total angular momentum, $|\frac{3}{2}\,\frac{3}{2}\rangle$, $|\frac{3}{2}\,\frac{1}{2}\rangle$, $|\frac{1}{2}\,\frac{1}{2}\rangle$, by means of time reversal, i.e. by forming the complex conjugate of the original eigenvector and subsequently multiplying it by $\sigma_y$. We refer to the functions $|j m_j\rangle$ of (2.365) and (2.366) henceforth as the angular momentum basis. According to (2.360) and (2.362), eigenstates having the same eigenvalue of the angular momentum squared, $\mathbf{J}^2$, also have the same energy eigenvalue, while the energy eigenvalues differ if states with different eigenvalues of $\mathbf{J}^2$ are considered. The valence band, being 3-fold degenerate at the $\Gamma$-point if spin is not taken into account, therefore splits into two bands, one with $j = \frac{3}{2}$ and one with $j = \frac{1}{2}$, if the spin-orbit interaction is considered. That such a spin-orbit splitting must occur can be recognized just by means of a group theoretical analysis of the problem. The six valence band states at $\Gamma$ transform in accordance with the 6-dimensional representation $D_{1/2}\times\Gamma'_{25}$ of $O_h$. This representation is reducible, according to Appendix A, as $D_{1/2}\times\Gamma'_{25} = \Gamma_7^+ + \Gamma_8^+$.

The energy separation of the two levels at $\Gamma$ is given by the constant $\Delta$, which determines the strength of the spin-orbit interaction. Therefore $\Delta$ is called the spin-orbit splitting energy. One may interpret $\Delta$ as the difference of the spin-orbit interaction energy between the states with $j = \frac{3}{2}$ and those with $j = \frac{1}{2}$. As one should expect, the states with larger angular momentum lie energetically above the states with smaller angular momentum. States with different $m_j$, i.e. with different projections $J_z$ of total angular momentum on the z-axis, but the same $\mathbf{J}^2$, have the same spin-orbit interaction energies. Therefore the degeneracy of these states remains.
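The splitting of the six valence band states at $\Gamma$ into a 4-fold level at $\Delta/3$ and a 2-fold level at $-2\Delta/3$ can be verified with a few lines of code. The sketch below builds a 6 × 6 matrix $(\Delta/3)\sum_j I_j\otimes\sigma_j$ from the Cartesian angular momentum matrices $(I_j)_{mm'} = -i\varepsilon_{jmm'}$ and the Pauli matrices. This sign and phase convention is an assumption made for the illustration and need not coincide in every entry with the matrix of the text. Rather than diagonalizing, the code checks that the matrix satisfies $(H - \Delta/3)(H + 2\Delta/3) = 0$ and has zero trace, which fixes the multiplicities at 4 and 2.

```python
PAULI = (
    ((0, 1), (1, 0)),        # sigma_x
    ((0, -1j), (1j, 0)),     # sigma_y
    ((1, 0), (0, -1)),       # sigma_z
)


def levi_civita(i: int, j: int, k: int) -> int:
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def spin_orbit_matrix(delta: float):
    """6x6 matrix in the basis x-up, y-up, z-up, x-down, y-down, z-down."""
    h = [[0j] * 6 for _ in range(6)]
    for s in range(2):
        for m in range(3):
            for s2 in range(2):
                for m2 in range(3):
                    total = 0j
                    for j in range(3):
                        orbital = -1j * levi_civita(j, m, m2)  # (I_j)_{m m2}
                        total += PAULI[j][s][s2] * orbital
                    h[3 * s + m][3 * s2 + m2] = delta / 3 * total
    return h


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def shifted(h, c):
    """Return h + c * unit matrix."""
    return [[h[i][j] + (c if i == j else 0) for j in range(len(h))]
            for i in range(len(h))]


def max_residual(delta: float) -> float:
    """Largest |element| of (H - delta/3)(H + 2 delta/3); zero if both
    delta/3 and -2 delta/3 are the only eigenvalues."""
    h = spin_orbit_matrix(delta)
    prod = matmul(shifted(h, -delta / 3), shifted(h, 2 * delta / 3))
    return max(abs(x) for row in prod for x in row)


if __name__ == "__main__":
    DELTA = 0.044  # eV, illustrative value
    h = spin_orbit_matrix(DELTA)
    print("residual:", max_residual(DELTA))
    print("trace:", sum(h[i][i] for i in range(6)))
```

With eigenvalues restricted to $\Delta/3$ and $-2\Delta/3$ and a vanishing trace, the multiplicities $n_1$ and $n_2$ obey $n_1 - 2n_2 = 0$ and $n_1 + n_2 = 6$, i.e. $n_1 = 4$ and $n_2 = 2$.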
The above statements refer to valence band states at the center $\Gamma$ of the first BZ, where the k·p interaction vanishes. Off $\Gamma$, this interaction is no longer zero and must be taken into account in addition to the spin-orbit interaction. We have seen how this can be done approximately in the preceding section, without consideration of spin. The method used there indicates the following procedure in the presence of spin and spin-orbit interaction: One determines the functions $|\mu\sigma\mathbf{k}\rangle^1$ which diagonalize the operator $H_{k\cdot\pi}$ of (2.352) in first order perturbation theory. In analogy to equation (2.339), one finds for the valence band states $|vm\sigma\mathbf{k}\rangle^1$ the expression
where we use the same abbreviations as in (2.339). Analogous relations hold for the states of the other bands. The functions $|\mu\sigma\mathbf{k}\rangle^1$ form a complete orthonormal set in terms of which the Hamiltonian $H$ may be represented. The submatrix with respect to the valence band states $|vm\sigma\mathbf{k}\rangle^1$ is decoupled from the remainder of the matrix in second order perturbation theory. Since $H_{so}$ also only couples the valence band states among themselves, but not to states from other bands, the Schrödinger equation (2.355) in this representation reads
The spin-dependent term of $\mathbf{k}\cdot\boldsymbol{\pi}$ in the eigenstates $|vm\sigma\mathbf{k}\rangle^1$ of (2.367) describes the change of the k·p interaction due to spin-orbit interaction. Since the two interactions are supposed to be weak, this change is second order small. It will be omitted below. Then we have, approximately,

$${}^1\langle vm\sigma\mathbf{k}|H_{so}|vm'\sigma'\mathbf{k}\rangle^1 \approx \langle vm\sigma 0|H_{so}|vm'\sigma'0\rangle.$$
Now we use the fact that $H_{so}$ is diagonal in the angular momentum basis $|j m_j\rangle$ of (2.365) and (2.366). It is clear that this basis follows from $|vm\sigma 0\rangle \equiv |m\sigma\rangle$ by a unitary transformation

$$|j m_j\rangle = \sum_{m\sigma} U_{j m_j,\,m\sigma}\,|m\sigma\rangle. \qquad (2.370)$$

The corresponding unitary transformation matrix $U_{j m_j,\,m\sigma}$ can be readily obtained from the relations (2.365) and (2.366). One has
(2.371)
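If one applies this transformation to the Schrödinger equation (2.355), then the matrix $\langle vm\sigma 0|H_{so}|vm'\sigma'0\rangle$ takes the diagonal form

$$\langle j m_j|H_{so}|j' m'_j\rangle = \begin{pmatrix} \frac{\Delta}{3} & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{\Delta}{3} & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{\Delta}{3} & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{\Delta}{3} & 0 & 0 \\ 0 & 0 & 0 & 0 & -\frac{2\Delta}{3} & 0 \\ 0 & 0 & 0 & 0 & 0 & -\frac{2\Delta}{3} \end{pmatrix}. \qquad (2.372)$$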
The sum of the two matrices (2.372) and (2.373) is the new Hamiltonian matrix. It has the same eigenvalues as the original matrix, even though its form deviates from that of the original. The difference in form is, above all, that the new matrix is already diagonal at $\mathbf{k} = 0$. Nondiagonal elements occur for $\mathbf{k} \neq 0$. Among them, the elements between basis vectors $|j m_j\rangle$, $|j' m'_j\rangle$ with $m_j \neq m'_j$ but $j = j'$ play a different role than the ones with $j \neq j'$. While the influence of the j-diagonal elements on the eigenvalues is independent of the size of the spin-orbit splitting $\Delta$, it does depend on it for the j-off-diagonal elements. The magnitude of the latter can be estimated as the larger of the two terms $|N||\mathbf{k}|^2$ or $|L - M||\mathbf{k}|^2$. If one assumes that $\mathrm{Max}\{|N|, |L - M|\}\,|\mathbf{k}|^2 \ll \Delta$ holds, then a perturbation theoretical treatment is possible. It yields an energy correction of order of magnitude $[\mathrm{Max}\{|N|, |L - M|\}]^2|\mathbf{k}|^4/\Delta$. Under the assumptions made, this is small compared to $\Delta$. That means that the j-off-diagonal elements of the transformed Hamiltonian matrix can be neglected if the k-vectors are sufficiently close to $\Gamma$. We will assume below that this is the case, although the Luttinger-Kohn model also covers the general case of a (6 × 6) Hamiltonian matrix. Neglecting j-off-diagonal elements, the Hamiltonian decomposes into two blocks, one (4 × 4) block corresponding to the basis vectors of the representation $\Gamma_8^+$, and one (2 × 2) block for the basis vectors of the representation $\Gamma_7^+$. The $\Gamma_8^+$ matrix reads
The 1 multiplying $\Delta/3$ is the (4 × 4) unity matrix, and the quantities $Q, S, T, R$ stand for the following expressions:
The Hamiltonian matrix of the $\Gamma_8^+$ valence band can also be derived using the method of invariants, which was discussed at an earlier stage. To do this, one needs the angular momentum matrices for spin $J = \frac{3}{2}$ as well as their products with each other. These matrices are given in Appendix A. The resulting Hamiltonian matrix agrees exactly with that of equation (2.374). This means that the neglect of the coupling terms between the spin-orbit and k·p interactions has no influence on the general form of the $\Gamma_8^+$ Hamiltonian matrix. Even with this additional approximation one obtains the most general $\Gamma_8^+$ Hamiltonian compatible with the symmetry of the crystal. Only the explicit expressions for the constants $L, M, N$ in the matrix $D^{\alpha\beta}_{mm'}$ of (2.343) are affected. Since these are understood as empirical parameters, this also does not play an important role. The matrix (2.374) has the two 2-fold degenerate eigenvalues $E^{\pm}_{\Gamma_8}(\mathbf{k})$,

(2.376)

Using the explicit expressions for $Q, T, R, S$, this yields

$$E^{\pm}_{\Gamma_8}(\mathbf{k}) = \frac{\Delta}{3} + Ak^2 \pm \sqrt{B^2k^4 + C^2(k_x^2k_y^2 + k_y^2k_z^2 + k_z^2k_x^2)}, \qquad (2.377)$$

with

$$A = \frac{1}{3}(L + 2M), \qquad B = \frac{1}{3}(L - M), \qquad C^2 = \frac{1}{3}[N^2 - (L - M)^2]. \qquad (2.378)$$
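The $\Gamma_7^+$ (2 × 2) matrix block of the full (6 × 6) matrix is already diagonal. Its 2-fold degenerate eigenvalue $E_{\Gamma_7}(\mathbf{k})$ is given by

$$E_{\Gamma_7}(\mathbf{k}) = -\frac{2}{3}\Delta + Ak^2. \qquad (2.379)$$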
The energy level $E_{v1/2/3/4} = E_{\Gamma_8}(0)$ in equation (2.360) is 4-fold degenerate at $\Gamma$. Correspondingly, two 2-fold degenerate bands $E^{\pm}_{\Gamma_8}(\mathbf{k})$ arise off of $\Gamma$, and starting from the 2-fold degenerate level $E_{v5/6} = E_{\Gamma_7}(0)$ in (2.362), which lies at an energy separation $\Delta$ below, a similar 2-fold degenerate band $E_{\Gamma_7}(\mathbf{k})$ (see Figure 2.26) evolves off of $\Gamma$. The 2-fold degeneracy of all bands is a consequence of time reversal symmetry jointly with spatial inversion symmetry (see Appendix A). The $E^{+}_{\Gamma_8}$ band has weaker curvature; it corresponds to the band of heavy holes. For $k_x = k_y = 0$, the pertinent eigenfunctions are $|\frac{3}{2}\,\frac{3}{2}\rangle$ and $|\frac{3}{2}\,{-\frac{3}{2}}\rangle$. The $E^{-}_{\Gamma_8}$ band is that of the light holes. The eigenfunctions for $k_x = k_y = 0$ read $|\frac{3}{2}\,\frac{1}{2}\rangle$ and $|\frac{3}{2}\,{-\frac{1}{2}}\rangle$. For arbitrary k-directions, the heavy and light hole states are linear combinations of all four basis functions $|\frac{3}{2}\,\frac{3}{2}\rangle$, $|\frac{3}{2}\,\frac{1}{2}\rangle$, $|\frac{3}{2}\,{-\frac{1}{2}}\rangle$ and $|\frac{3}{2}\,{-\frac{3}{2}}\rangle$. The described structure of the valence band around $\Gamma$ represents what is called the Luttinger-Kohn model. The particular form of the bands is determined by the three constants $A, B, C$. Instead of the latter, one can use the dimensionless parameters $\gamma_1, \gamma_2, \gamma_3$, called Luttinger parameters, defined by

Figure 2.26: Valence band structure of diamond and zincblende type semiconductors in the Luttinger-Kohn model. The dispersion for the two k-directions is different (band warping).
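\[ A = -\frac{\hbar^2}{2m}\,\gamma_1, \qquad B = -\frac{\hbar^2}{2m}\,2\gamma_2, \qquad C^2 = 12 \left( \frac{\hbar^2}{2m} \right)^2 \left( \gamma_3^2 - \gamma_2^2 \right). \tag{2.380} \]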
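The warping can be made concrete numerically. The following Python sketch assumes the dispersion in the form E±(k) = A k² ± [B² k⁴ + C² (k_x² k_y² + k_y² k_z² + k_z² k_x²)]^(1/2), with A, B, C expressed through the Luttinger parameters as in (2.380). The function names, the rounded value ħ²/2m ≈ 3.81 eV Å², and the parameter set γ1 = 6.85, γ2 = 2.10, γ3 = 2.90 (values often quoted for GaAs) are assumptions made for this illustration only; they are not taken from Table 2.12.

```python
import math

HBAR2_OVER_2M = 3.81  # hbar^2/(2m) in eV*Angstrom^2 for the free electron mass (rounded)

def dkk_constants(gamma1, gamma2, gamma3):
    """Constants A, B (eV*Angstrom^2) and C^2 from the Luttinger parameters."""
    a = -HBAR2_OVER_2M * gamma1
    b = -HBAR2_OVER_2M * 2.0 * gamma2
    c2 = 12.0 * HBAR2_OVER_2M**2 * (gamma3**2 - gamma2**2)
    return a, b, c2

def hole_bands(k, a, b, c2):
    """Return (E_plus, E_minus) in eV for a wavevector k given in 1/Angstrom."""
    kx, ky, kz = k
    k2 = kx * kx + ky * ky + kz * kz
    mixed = kx * kx * ky * ky + ky * ky * kz * kz + kz * kz * kx * kx
    root = math.sqrt(b * b * k2 * k2 + c2 * mixed)
    return a * k2 + root, a * k2 - root

def hole_masses(direction, a, b, c2):
    """Heavy and light hole masses along a direction, in units of m (positive)."""
    norm = math.sqrt(sum(c * c for c in direction))
    unit = tuple(c / norm for c in direction)
    # At |k| = 1 the band energies equal the coefficients of k^2 along this direction.
    e_plus, e_minus = hole_bands(unit, a, b, c2)
    return -HBAR2_OVER_2M / e_plus, -HBAR2_OVER_2M / e_minus

if __name__ == "__main__":
    a, b, c2 = dkk_constants(6.85, 2.10, 2.90)  # illustrative values only
    for direction in ((1, 0, 0), (1, 1, 1)):
        m_heavy, m_light = hole_masses(direction, a, b, c2)
        print(direction, round(m_heavy, 3), round(m_light, 3))
```

Within this sketch the masses along [100] are m/(γ1 ∓ 2γ2) and those along [111] are m/(γ1 ∓ 2γ3); the two directions give the same result only for γ2 = γ3, i.e. for C = 0.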
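Values of the Luttinger parameters are listed in Table 2.12. The constants L, M, N, which were originally used, may be expressed in terms of the Luttinger parameters by the relations:
\[ L = -\frac{\hbar^2}{2m}\,(\gamma_1 + 4\gamma_2), \qquad M = -\frac{\hbar^2}{2m}\,(\gamma_1 - 2\gamma_2), \qquad N = -\frac{\hbar^2}{2m}\,6\gamma_3. \tag{2.381} \]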
Both energy band functions E±_Γ8+(k) and E_Γ7+(k) depend on the square of |k|, i.e. they are parabolic. This would not have resulted if the interaction between the Γ8+ states and the spin-orbit-split Γ7+ states had not been neglected, as it in fact was. Concerning the dependencies on the direction of k, the spin-orbit-split band E_Γ7+(k) is isotropic, while the heavy and light hole bands E±_Γ8+(k) are not. In their case one again has a warping of energy bands as discussed above (see Figure 2.26). The constant C measures the strength of warping. In the case C = 0, the warping vanishes. For C ≠ 0, the point k = 0 is a singularity of the energy band functions E±_Γ8+(k) in that the second derivatives with respect to the components of k depend on the direction from which one approaches the point k = 0. The effective mass tensor, as given by equation (2.196), is not defined in such circumstances. Instead, one can define an anisotropic effective mass, by differentiating in equation (2.377) not with respect to the components of k, but with respect to |k|. If the warping of energy bands is ignored, for instance, by averaging
Table 2.12: k·p band parameters for selected diamond and zincblende type semiconductors. E_0, Δ, and E_P in eV. E_P = (2m/ħ²)P² is a measure of the constant P in Kane's band model. The values for γ1, γ2, γ3 are adjusted to the Luttinger-Kohn model. Temperature below 70 K. (After Landolt-Börnstein, 1982.)
Material
71
yz
73
C
Si
Ge
cz  S n
Ei 5.48
3.4
A
0
0.044 0.29 0.8 0.34
0.8
Ep
0.90
0.4 1.52
26.3
39
GaAs
2.9
5.26 16.91 1.53 8.8
GaSb
InSb ZnSe HgTe
4.03
15.64 0.67 10.6
0.70 0.18
2.67 0.30
0.98
0.42
1.08
over all directions, this yields the ordinary isotropic heavy and light hole masses, but in the sense of an average. The Luttinger-Kohn model was described above for the case of diamond type semiconductors. Formally, for materials having the zincblende structure, the model does not apply because the matrix elements of the momentum operator p between the triply degenerate valence band states without spin, |vm0⟩, are in general non-zero: these states transform according to the vector representation Γ15 of the tetrahedral group Td, and the matrix elements ⟨vm0|p|vm'0⟩ belong to the product representation Γ15 × Γ15 × Γ15. The latter contains the unity representation, as distinguished from diamond type semiconductors, where the unity representation does not occur in the corresponding Γ25' × Γ15 × Γ25' product. The reason for this is the absence of inversion symmetry in the zincblende structure. The non-vanishing matrix elements ⟨vm0|p|vm'0⟩ give rise to terms linear with respect to k in the Γ15 valence band Hamiltonian, besides the quadratic terms which are already present in the diamond case (see equation (2.344)). However, as a rule, the k-linear terms are small, and the Luttinger-Kohn model also applies to zincblende type materials, provided the other requirements which underlie this model are satisfied. Table 2.12 therefore also contains Luttinger-Kohn parameters for semiconductors of the zincblende type. The most important requirement for the Luttinger-Kohn model to be valid is
the validity of the assumption that the k·p interaction of the valence band with the deepest conduction band is weak enough to be treated by means of perturbation theory. This is justified as long as the energy separation E_0 of the lowest conduction band from the Γ8 valence band (not to be confused with the fundamental energy gap) is sufficiently large. One expects deviations from the Luttinger-Kohn model to become noticeable if E_0 is small. Table 2.12 shows that the E_0 value for InSb, for example, clearly lies below those for C, Si and Ge. In the case of α-Sn and HgTe, E_0 even becomes negative. Simultaneously, the spin-orbit splitting energy Δ becomes relatively large in some of the zincblende type materials, so that even Δ > E_0 holds. Describing the valence band of such semiconductors by means of the Luttinger-Kohn model would entail treating the effect of the remote spin-orbit-split band exactly, while taking into account the energetically closer conduction band only by means of perturbation theory. Such a procedure is not meaningful and one must seek a different, more appropriate description. A model which is tailored exactly to such circumstances is the Kane model, which we will now discuss. In this matter, we assume that the point group of equivalent directions is the tetrahedral group Td, and no longer the cubic group Oh as above, thereby encompassing both types of semiconductors, those of zincblende type as well as those of diamond type. In the latter case, inversion symmetry still has to be added. This involves a specialization of the results, which may be easily done, should the need arise.
2.7.4
Kane model
The Kane model is based on the following assumptions. Firstly, it is assumed that the k·p interaction of the valence band with the deepest conduction band at Γ is so strong that it must be treated exactly. Secondly, at Γ the spinless valence band should have the symmetry Γ15, and the spinless conduction band should have the symmetry Γ1. This assumption corresponds to the situation which actually exists in semiconductors of zincblende type. Thirdly, the interaction of the valence and conduction bands with all remaining bands (referred to as remote) is assumed to be small, so that it may be treated by perturbation theory, similar to the Luttinger-Kohn model in which the interaction of the valence band with all other bands was treated in this way. Here, we will simplify further and neglect this interaction completely. In addition, we will also neglect the direct k·p interaction among the three Γ15 valence bands which, as mentioned above, does not rigorously vanish for zincblende type crystals, giving rise to k-linear terms in the Hamiltonian. It turns out that the latter approximation is valid in most cases.
Neglect of interaction with remote bands

We analyze the generally valid Schrödinger equation (2.351) using the assumptions and approximations discussed above, considering spin and spin-orbit interaction from the outset. We again denote the three valence band indices by vm, m = x, y, z. The conduction band index c will be augmented by s, indicating the s- or Γ1-symmetry of the conduction band state at Γ (the coincidence of this notation s with that of the spin variable s is unfortunate, but unavoidable, and the reader should keep the different meaning of s clearly in mind to avoid confusion). The pertinent spin-dependent Bloch functions at k = 0 are |vmσ0⟩ and |csσ0⟩, respectively. The matrix elements of the term H_0(k) of H_k·p are given by
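\[ \langle vm\sigma 0 | H_0(\mathbf{k}) | vm'\sigma' 0 \rangle = \delta_{mm'}\,\delta_{\sigma\sigma'}\,\frac{\hbar^2}{2m} k^2, \tag{2.382} \]
\[ \langle cs\sigma 0 | H_0(\mathbf{k}) | cs\sigma' 0 \rangle = \delta_{\sigma\sigma'} \left[ E_c + \frac{\hbar^2}{2m} k^2 \right], \tag{2.383} \]
\[ \langle vm\sigma 0 | H_0(\mathbf{k}) | cs\sigma' 0 \rangle = 0. \tag{2.384} \]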
In the other operator term of H_k·p, namely (ħ/m)k·π, we may neglect the spin-dependent part by virtue of the same arguments as in the Luttinger-Kohn model. The three needed matrix elements of this operator may then be determined as
(2.385)
The matrix elements ⟨vm0|p|vm'0⟩ of p between the valence band states are neglected in accordance with the assumptions made above. The diagonal element ⟨cs0|p|cs0⟩ of p in the conduction band state |cs0⟩ vanishes exactly. The matrix elements ⟨cs0|p|vm0⟩ between valence and conduction band states transform according to the product representation Γ1 × Γ15 × Γ15 = Γ1 + Γ12 + Γ15 + Γ25. Since the unity representation is contained in it exactly once, the matrix does not have to vanish; it contains exactly one independent element. As such, one may choose ⟨cs0|p_x|vx0⟩ and set it equal to i(m/ħ)P. The nonvanishing matrix elements of p are then given by the relation
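\[ \langle cs0 | p_x | vx0 \rangle = \langle cs0 | p_y | vy0 \rangle = \langle cs0 | p_z | vz0 \rangle = i\,\frac{m}{\hbar}\,P. \tag{2.386} \]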
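The factor i guarantees that P is real, if the Bloch factors are real as we assume. The other factor (m/ħ) was introduced for convenience in the final dispersion relations. With the momentum matrix elements of (2.386), the Hamiltonian matrix ⟨μσ0|H_k·p(k)|μ'σ'0⟩ takes the form
\[
\frac{\hbar^2 k^2}{2m}\,\mathbf{1} +
\begin{pmatrix}
E_c & iPk_x & iPk_y & iPk_z & 0 & 0 & 0 & 0 \\
-iPk_x & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-iPk_y & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-iPk_z & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & E_c & iPk_x & iPk_y & iPk_z \\
0 & 0 & 0 & 0 & -iPk_x & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -iPk_y & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -iPk_z & 0 & 0 & 0
\end{pmatrix}
\tag{2.387}
\]
Rows and columns are ordered in the sequence |s↑⟩, |x↑⟩, |y↑⟩, |z↑⟩, |s↓⟩, |x↓⟩, |y↓⟩, |z↓⟩.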
Finally, the matrix elements ⟨μσ0|H_so|μ'σ'0⟩ of the spin-orbit interaction operator H_so have to be determined. For the Γ15 valence band elements ⟨vmσ0|H_so|vm'σ'0⟩, one can adopt the results which were formerly derived for the Γ25' valence band, because Γ25' coincides with Γ15 for the tetrahedral group. There are new matrix elements ⟨vmσ0|H_so|csσ'0⟩ and ⟨csσ0|H_so|csσ'0⟩ involving the conduction band states. The coordinate-dependent factor of the first matrix element transforms according to the representation Γ15 × Γ25 × Γ1 = Γ2 + Γ12 + Γ15 + Γ25. The unity representation does not occur here, thus this factor vanishes and with it the whole matrix element ⟨vmσ0|H_so|csσ'0⟩, whence
\[ \langle vm\sigma 0 | H_{so} | cs\sigma' 0 \rangle = 0. \tag{2.388} \]
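The selection rules used in this section all rest on counting how often the unity representation occurs in a triple product of representations of Td. The following Python sketch carries out this count with the standard character table of Td. It is an illustration under stated assumptions: the identification Γ1 = A1, Γ2 = A2, Γ12 = E, Γ15 = T2 (polar vector), Γ25 = T1 (axial vector), the dictionary keys and the function name `decompose` are choices made here, not notation of the text.

```python
# Classes of T_d in the order: E, 8 C3, 3 C2, 6 S4, 6 sigma_d
CLASS_SIZES = (1, 8, 3, 6, 6)
CHARACTERS = {
    "G1": (1, 1, 1, 1, 1),     # A1, unity representation
    "G2": (1, 1, 1, -1, -1),   # A2
    "G12": (2, -1, 2, 0, 0),   # E
    "G15": (3, 0, -1, -1, 1),  # T2, transforms like a polar vector (x, y, z)
    "G25": (3, 0, -1, 1, -1),  # T1, transforms like an axial vector
}
GROUP_ORDER = sum(CLASS_SIZES)  # 24 symmetry elements

def decompose(*factors):
    """Multiplicity of each irreducible representation in a direct product."""
    product = [1] * len(CLASS_SIZES)
    for name in factors:
        product = [p * c for p, c in zip(product, CHARACTERS[name])]
    multiplicities = {}
    for name, chi in CHARACTERS.items():
        # All characters of T_d are real, so no complex conjugation is needed.
        total = sum(n * p * c for n, p, c in zip(CLASS_SIZES, product, chi))
        multiplicities[name] = total // GROUP_ORDER
    return multiplicities

if __name__ == "__main__":
    print(decompose("G1", "G15", "G15"))   # momentum between conduction and valence band
    print(decompose("G15", "G15", "G15"))  # momentum within the valence band
    print(decompose("G15", "G25", "G1"))   # spin-orbit between valence and conduction band
```

In this count the unity representation appears once in Γ1 × Γ15 × Γ15 and in Γ15 × Γ15 × Γ15, and not at all in Γ15 × Γ25 × Γ1, in line with the statements made above.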
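The coordinate-dependent factor of ⟨csσ0|H_so|csσ'0⟩ belongs to the product representation Γ1 × Γ25 × Γ1 = Γ25 and must therefore likewise vanish,
\[ \langle cs\sigma 0 | H_{so} | cs\sigma' 0 \rangle = 0. \tag{2.389} \]
The total Hamiltonian matrix thus takes the form
\[
\langle \mu\sigma 0 | H_{k\cdot p}(\mathbf{k}) + H_{so} | \mu'\sigma' 0 \rangle =
\frac{\hbar^2 k^2}{2m}\,\mathbf{1} +
\begin{pmatrix}
E_c & iPk_x & iPk_y & iPk_z & 0 & 0 & 0 & 0 \\
-iPk_x & 0 & -i\tfrac{\Delta}{3} & 0 & 0 & 0 & 0 & \tfrac{\Delta}{3} \\
-iPk_y & i\tfrac{\Delta}{3} & 0 & 0 & 0 & 0 & 0 & -i\tfrac{\Delta}{3} \\
-iPk_z & 0 & 0 & 0 & 0 & -\tfrac{\Delta}{3} & i\tfrac{\Delta}{3} & 0 \\
0 & 0 & 0 & 0 & E_c & iPk_x & iPk_y & iPk_z \\
0 & 0 & 0 & -\tfrac{\Delta}{3} & -iPk_x & 0 & i\tfrac{\Delta}{3} & 0 \\
0 & 0 & 0 & -i\tfrac{\Delta}{3} & -iPk_y & -i\tfrac{\Delta}{3} & 0 & 0 \\
0 & \tfrac{\Delta}{3} & i\tfrac{\Delta}{3} & 0 & -iPk_z & 0 & 0 & 0
\end{pmatrix}
\tag{2.390}
\]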
Here, the order of rows and columns is the same as in (2.387). For k = 0 and vanishing E_c, this matrix reduces to the matrix of the spin-orbit interaction operator H_so. If one rearranges the rows and columns of this matrix in such a way that those relating to the conduction band states |s↑⟩ and |s↓⟩ occur in the left upper corner, side by side, then the matrix decomposes into a (2 × 2) block for the conduction band, and a (6 × 6) block for the valence band. The eigenfunctions of the two blocks are simultaneously also eigenfunctions of the total matrix. The (2 × 2) conduction band block is already diagonal, i.e. |s↑⟩ and |s↓⟩ are eigenfunctions of the Hamiltonian matrix (2.390) at k = 0. The (6 × 6) valence band block is identical with the matrix of the spin-orbit interaction operator H_so of (2.350) for the Luttinger-Kohn model. The eigenfunctions at k = 0 here are therefore also the vectors |3/2 m_{3/2}⟩ and |1/2 m_{1/2}⟩.
The latter basis should be particularly suitable for solution of the eigenvalue problem for the Hamiltonian matrix (2.390) at k ≠ 0. The matrix (2.390) is, however, so simple that one can also obtain the secular equation directly. We will do this before we further consider the angular momentum basis. To eliminate the free electron part (ħ²/2m)k² from the eigenvalues E(k), we write them in the form
\[ E(\mathbf{k}) = E'(\mathbf{k}) + \frac{\hbar^2}{2m} k^2. \tag{2.391} \]
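Secular equation

\[ \left( E'(\mathbf{k}) - \frac{\Delta}{3} \right)^2 \left( \left[ E'(\mathbf{k}) - E_c \right] \left[ E'(\mathbf{k}) - \frac{\Delta}{3} \right] \left[ E'(\mathbf{k}) + \frac{2\Delta}{3} \right] - \left[ E'(\mathbf{k}) + \frac{\Delta}{3} \right] P^2 k^2 \right)^2 = 0. \tag{2.392} \]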
The fact that the two factors in round brackets appear squared signifies an at least 2-fold degeneracy of all eigenvalues. The reason for this is, again, time reversal symmetry jointly with spatial inversion symmetry (we remind the reader that the term of the Hamiltonian which can break inversion symmetry in the case of Td-symmetry has been neglected). Accordingly, one has in general four 2-fold degenerate bands E'_1(k), E'_2(k), E'_3(k), E'_4(k). It is also noteworthy that in the secular equation (2.392), k enters only in the form of k². This means that all four bands are isotropic, in contrast to the Luttinger-Kohn model where a warping of the valence bands occurs. In the case of k = 0, the energy levels of (2.392) are given by
\[ E_1(0) = E_c, \qquad E_2(0) = E_3(0) = \frac{\Delta}{3}, \qquad E_4(0) = -\frac{2\Delta}{3}. \tag{2.393} \]
One may draw conclusions from these expressions in regard to the meaning of the four energy bands E_i(k): E_1(k) is the Γ6 conduction band, E_2(k) and E_3(k) are the two upper Γ8 valence bands, which are degenerate at Γ, and E_4(k) corresponds to the spin-orbit-split Γ7 valence band. The energy separation E_0 of the Γ6 conduction band and the Γ8 valence band at Γ is obtained as
\[ E_0 = E_c - \frac{\Delta}{3}. \tag{2.394} \]
As long as E_0 is positive, it represents the energy gap E_g at Γ. The case of negative E_0 is discussed below. For one of the two upper valence bands (the one which arises from the vanishing of the first factor of the secular equation (2.392) and which is denoted by i = 2), the energy E'_2(k) does not depend on k. For E'_3(k), a k-dependence follows with finite negative curvature, as we will soon see. Thus E'_2(k) corresponds to a band of (infinitely) heavy holes, and E'_3(k) to a band of light holes. If one adds the (ħ²/2m)k² term, then the band E_2(k) displays a positive curvature. It is relatively small because of the large free electron mass, but the positive sign contradicts what is to be expected for a valence band. This unexpected prediction for E_2(k) results from the fact that the interaction of the valence band with all remaining bands, except with the deepest conduction band, was completely neglected. In order to treat the heavy hole band correctly, the interaction with remote bands must also be considered, at least by perturbation theory as in the Luttinger-Kohn model. This will be done below.
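The factorization of the secular equation can be checked numerically. The Python sketch below builds an 8 × 8 matrix containing the k·p coupling iPk_j between the s state and the x, y, z states of equal spin and a spin-orbit part of the standard form (Δ/3)L·σ/ħ for real x, y, z functions, with the free electron term left out. It then compares the characteristic polynomial with the product of a squared factor (E' − Δ/3) and a squared cubic factor. The basis ordering, the sign convention of the spin-orbit block, the function names and the numerical values are assumptions of this sketch.

```python
def kane_matrix(k, Ec, delta, P):
    """8x8 matrix in the basis s-up, x-up, y-up, z-up, s-down, x-down, y-down, z-down."""
    d = delta / 3.0
    H = [[0j] * 8 for _ in range(8)]
    for s in (0, 4):  # one 4x4 k.p block per spin direction
        H[s][s] = complex(Ec)
        for j, kj in enumerate(k, start=1):
            H[s][s + j] = 1j * P * kj
            H[s + j][s] = -1j * P * kj
    # Spin-orbit elements above the diagonal; the rest follows from hermiticity.
    spin_orbit = {(1, 2): -1j * d, (1, 7): d + 0j, (2, 7): -1j * d,
                  (3, 5): -d + 0j, (3, 6): 1j * d, (5, 6): 1j * d}
    for (row, col), value in spin_orbit.items():
        H[row][col] = value
        H[col][row] = value.conjugate()
    return H

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in matrix]
    n = len(A)
    det = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) == 0:
            return 0j
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det
        det *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
    return det

def characteristic_polynomial(E, k, Ec, delta, P):
    """det(H - E*1) for the matrix built above."""
    H = kane_matrix(k, Ec, delta, P)
    for i in range(8):
        H[i][i] -= E
    return determinant(H)

def secular_polynomial(E, k2, Ec, delta, P):
    """Squared flat-band factor times squared cubic factor."""
    d = delta / 3.0
    cubic = (E - Ec) * (E - d) * (E + 2 * d) - P * P * k2 * (E + d)
    return ((E - d) * cubic) ** 2

if __name__ == "__main__":
    k = (0.1, 0.2, -0.15)  # arbitrary direction, arbitrary units
    k2 = sum(c * c for c in k)
    for E in (0.37, -0.5, 2.0):
        print(E, characteristic_polynomial(E, k, 1.0, 0.3, 2.0).real,
              secular_polynomial(E, k2, 1.0, 0.3, 2.0))
```

Because the comparison is made at energies that are not eigenvalues, agreement of the two columns tests the full polynomial identity and not only the positions of the roots. Repeating the run with a different direction of k at fixed |k| illustrates the isotropy of the bands.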
According to the assumptions made at the outset, the interaction with remote bands is weak and may be taken into account by means of second order k·p perturbation theory. The 8 × 8 Hamiltonian matrix of equation (2.390) for the conduction-valence band complex contains two 4 × 4 blocks of definite spin with rows and columns referring to the conduction band s state and the three x, y, z valence band states without spin, respectively. To include the interaction with remote bands, an additional 4 × 4 matrix of second order in k has to be added to each of these 4 × 4 blocks. Since the interaction between conduction and valence band states contributes already in first order, second order corrections occurring at sx-, sy-, sz-, and xs-, ys-, zs-positions may be omitted. For the ss-element and the 3 × 3 valence band submatrix, second order corrections become important. Their general forms follow from symmetry arguments as above. The correction of the ss-element may be written in the form A_c k², with A_c a constant. The perturbation correction to the 3 × 3 valence band submatrix has the general form of the 3 × 3 matrix in equation (2.344) with parameters L, M, N defined like the matrix elements D in equation (2.341), however, with the conduction band excluded from the band summation because this band is not remote. The 4 × 4 matrix block thus determined is added to each of the two 4 × 4 diagonal blocks already present in the 8 × 8 matrix (2.390). Finally, the whole 8 × 8 matrix is subjected to a unitary transformation into a basis set in which the spin-orbit interaction part of the Hamiltonian becomes diagonal. The latter requirement is evidently satisfied by a basis which, as in the Luttinger-Kohn model, contains the 6 angular momentum eigenfunctions |3/2 3/2⟩, |3/2 1/2⟩, |3/2 −1/2⟩, |3/2 −3/2⟩, |1/2 1/2⟩, |1/2 −1/2⟩, and in addition the two conduction band states |s↑⟩ and |s↓⟩.
This corresponds to an 8 × 8 unitary transformation matrix composed of a 2 × 2 unity matrix block for the two conduction band states, and the 6 × 6 matrix block from equation (2.371) for the six valence band states. Carrying out the unitary transformation one obtains the general 8 × 8 Kane Hamiltonian which applies to any diamond or zincblende type material, including those which are already well described by the Luttinger-Kohn model. However, even in these cases the Kane model is more precise than the Luttinger-Kohn model, because the valence-conduction band interaction is treated exactly rather than approximately as in the Luttinger-Kohn model. If one uses the Kane model in cases in which the Luttinger-Kohn model already works well, one has to be aware that generally the parameters γ1, γ2, γ3 have different values in the two models, since those of the Luttinger-Kohn model contain the valence-conduction band interaction while those of the Kane model do not. The general 8 × 8 Kane Hamiltonian is given by rather lengthy expressions. To avoid these below, the 4 × 4 block matrix of the remote band interaction will be reduced to a special case before proceeding further. We put L = M = A_v, and N = 0, which means physically that the remote bands affect heavy and light holes in the same way, and do not disturb the isotropy of the bands. Ordering rows and columns in the sequence |s↑⟩, |s↓⟩, |3/2 3/2⟩, |3/2 1/2⟩, ..., |1/2 −1/2⟩, the transformed 8 × 8 Hamiltonian matrix with simplified remote band interaction becomes
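[8 × 8 matrix: the diagonal elements are U, U, V, V, V, V, W, W in the above ordering; the off-diagonal elements connect the conduction band states with the valence band states and are proportional to P_+, P_− or P_z]   (2.395)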
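where the notations P_± = (1/√2)P(k_x ± ik_y), P_z = Pk_z, U = E_c + A_c k², V = (1/3)Δ + A_v k², and W = −(2/3)Δ + A_v k² are used. The eigenvalues of this matrix follow from the secular equation
\[ \left( E' - \frac{\Delta}{3} - A_v k^2 \right)^2 \left( \left[ E' - E_c - A_c k^2 \right] \left[ E' - \frac{\Delta}{3} - A_v k^2 \right] \left[ E' + \frac{2\Delta}{3} - A_v k^2 \right] - \left[ E' + \frac{\Delta}{3} - A_v k^2 \right] P^2 k^2 \right)^2 = 0, \tag{2.396} \]
where E' stands for E'(k). This equation only differs from equation (2.392) in that the factors whose vanishing define the conduction and valence bands contain, respectively, the additional terms −A_c k² and −A_v k².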
Solution of the secular equation in limiting cases
The zeros of the first factor in (2.396) yield, as seen previously, the Γ8 band of heavy holes (i = 2). However, the dispersion relation for it now reads
\[ E_2(\mathbf{k}) = \frac{\Delta}{3} + A_v k^2 + \frac{\hbar^2}{2m} k^2. \tag{2.397} \]
By choosing a negative value of appropriate magnitude for A_v, the dispersion for heavy holes can be brought into agreement with experimental findings. The zeros of the second factor in (2.396) determine the dispersion of the Γ6 conduction band (i = 1), the Γ8 band of light holes (i = 3), and the spin-orbit-split Γ7 band (i = 4). For the conduction band, the dispersion is changed due to the A_c k² term, and for the two valence bands due to the A_v k² term. But, here, these corrections are added to already existing strong dispersion terms. We will therefore neglect them in the following, as we neglected the weak dispersion due to the free electron term (ħ²/2m)k² earlier. Then the eigenvalue equation for the three bands i = 1, 3, 4 reads
\[ \left[ E'_i(\mathbf{k}) - E_c \right] \left[ E'_i(\mathbf{k}) - \frac{\Delta}{3} \right] \left[ E'_i(\mathbf{k}) + \frac{2\Delta}{3} \right] - \left[ E'_i(\mathbf{k}) + \frac{\Delta}{3} \right] P^2 k^2 = 0. \tag{2.398} \]
This equation will be solved approximately in three limiting cases with respect to the order of magnitude relations between the energy gap E_0 and the spin-orbit splitting energy Δ, as well as with respect to the sign of E_0, namely firstly for E_0 >> Δ, E_0 > 0, secondly for E_0 << Δ, E_0 > 0, and thirdly for |E_0| << Δ, E_0 < 0. The significance of a negative value of E_0 will be discussed while treating the third case. All three cases actually occur in zincblende type semiconductors, as a look at Table 2.12 immediately shows. The first case corresponds to materials with wide energy gaps whose valence band complex could be described just as well by means of the Luttinger-Kohn model; the second case refers to semiconductors with narrow energy gaps; and the third to materials whose energy gaps vanish.
Case 1: E_0 = E_g >> Δ, E_0 > 0
We consider energy values E_i(k) in the various bands with energy separations |E_i(k) − E_i(0)| from the respective band extrema which are small compared with E_0. For such energies, the conduction band E_1(k) approximately obeys the equation
\[ \left[ E_1(\mathbf{k}) - E_c \right] E_g - P^2 k^2 = 0, \tag{2.399} \]
and for the two valence bands E_3(k) and E_4(k) we have
\[ \left[ E_i(\mathbf{k}) - \frac{\Delta}{3} \right] \left[ E_i(\mathbf{k}) + \frac{2\Delta}{3} \right] E_g + \left[ E_i(\mathbf{k}) + \frac{\Delta}{3} \right] P^2 k^2 = 0, \quad i = 3, 4. \tag{2.400} \]
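These equations are solved by
\[ E_1(\mathbf{k}) = E_c + \frac{P^2}{E_g} k^2 \tag{2.401} \]
and
\[ E_{3/4}(\mathbf{k}) = -\frac{1}{2} \left[ \frac{\Delta}{3} + \frac{P^2}{E_g} k^2 \right] \pm \frac{1}{2} \sqrt{ \Delta^2 - \frac{2}{3} \Delta \frac{P^2}{E_g} k^2 + \left( \frac{P^2}{E_g} \right)^2 k^4 }. \tag{2.402} \]
Under the condition (P²/E_g)k² << Δ, a parabolic approximation for the otherwise nonparabolic valence bands E_{3/4}(k) is possible, namely
\[ E_3(\mathbf{k}) = \frac{\Delta}{3} - \frac{2}{3} \frac{P^2}{E_g} k^2, \qquad E_4(\mathbf{k}) = -\frac{2\Delta}{3} - \frac{1}{3} \frac{P^2}{E_g} k^2. \tag{2.403} \]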
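How good the wide-gap approximations are can be judged by comparing them with the numerically determined roots of the cubic equation (2.398). The Python sketch below does this for the light hole and spin-orbit-split bands. The closed form and its parabolic limit are coded in the form E_{3/4} = −(1/2)[Δ/3 + a] ± (1/2)[Δ² − (2/3)Δa + a²]^(1/2) and E_3 = Δ/3 − (2/3)a, E_4 = −(2/3)Δ − (1/3)a with a = (P²/E_g)k². The function names, the bisection brackets and the numbers E_g = 1.5 eV, Δ = 0.3 eV, P² = 25 × 3.81 eV² Å² are assumptions chosen only for illustration.

```python
import math

def cubic(E, k2, Ec, delta, P):
    """Left-hand side of the eigenvalue equation for the bands i = 1, 3, 4."""
    d = delta / 3.0
    return (E - Ec) * (E - d) * (E + 2 * d) - P * P * k2 * (E + d)

def bisect(f, lo, hi, steps=200):
    """Root of f between lo and hi; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def valence_exact(k2, Ec, delta, P):
    """Light hole (i = 3) and split-off (i = 4) roots for a positive gap and k2 > 0."""
    d = delta / 3.0
    f = lambda E: cubic(E, k2, Ec, delta, P)
    e3 = bisect(f, -d, d)                               # cubic changes sign between -d and d
    e4 = bisect(f, -2 * d - P * math.sqrt(k2), -2 * d)  # and once more below -2d
    return e3, e4

def valence_wide_gap(k2, Eg, delta, P):
    """Closed form and parabolic limit of the wide-gap case."""
    a = P * P * k2 / Eg
    root = math.sqrt(delta**2 - (2.0 / 3.0) * delta * a + a * a)
    closed = (-0.5 * (delta / 3 + a) + 0.5 * root, -0.5 * (delta / 3 + a) - 0.5 * root)
    parabolic = (delta / 3 - (2.0 / 3.0) * a, -2 * delta / 3 - a / 3)
    return closed, parabolic

if __name__ == "__main__":
    Eg, delta, P = 1.5, 0.3, math.sqrt(25.0 * 3.81)  # eV, eV, eV*Angstrom (illustrative)
    Ec = Eg + delta / 3
    for k in (0.01, 0.03, 0.06):                      # wavevectors in 1/Angstrom
        exact = valence_exact(k * k, Ec, delta, P)
        closed, parabolic = valence_wide_gap(k * k, Eg, delta, P)
        print(k, exact, closed, parabolic)
```

In such a comparison the light hole band is reproduced more closely than the spin-orbit-split band, since replacing the energy denominator E_0 + Δ by E_g affects only the latter.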
The structure of the four bands E_{1/2/3/4}(k) is illustrated in Figure 2.27a.

Case 2: E_0 = E_g << Δ, E_0 > 0
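The energy values E_i(k) now have energy separations |E_i(k) − E_i(0)| from the respective band extrema which are small compared to Δ. This means they can be comparable to E_0. For the conduction band and the light hole band one then gets from (2.398), approximately,
\[ \left[ E_i(\mathbf{k}) - E_c \right] \left[ E_i(\mathbf{k}) - \frac{\Delta}{3} \right] - \frac{2}{3} P^2 k^2 = 0, \quad i = 1, 3, \tag{2.404} \]
from which
\[ E_{1/3}(\mathbf{k}) = \frac{\Delta}{3} + \frac{E_g}{2} \pm \frac{1}{2} \sqrt{ E_g^2 + \frac{8}{3} P^2 k^2 } \tag{2.405} \]
follows. In general, the dispersion laws for the electrons and light holes are again nonparabolic. Only for energy separations from the band extrema which are small compared to E_g, more exactly for (8/3)P²k² << E_g², a k²-dependence emerges, namely
\[ E_1(\mathbf{k}) = E_c + \frac{2}{3} \frac{P^2}{E_g} k^2, \qquad E_3(\mathbf{k}) = \frac{\Delta}{3} - \frac{2}{3} \frac{P^2}{E_g} k^2. \tag{2.406} \]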
Figure 2.27: Valence band structure of zincblende type semiconductors in the Kane model for limiting cases: (a) E_0 >> Δ, (b) E_0 << Δ, E_0 > 0, (c) |E_0| << Δ, E_0 < 0.

Since the energy region of width E_g above the band extrema is relatively narrow for the narrow gap semiconductors considered here, one has in these materials, even at relatively small separations from the band edges, nonparabolicities in the dispersion laws of the electrons and light holes. As far as these particles are concerned, small energy gaps and nonparabolicities occur together. For the spin-orbit-split band, one obtains, without further approximations,
\[ E_4(\mathbf{k}) = -\frac{2}{3} \Delta - \frac{P^2}{3\Delta} k^2. \tag{2.407} \]
The dispersion curves of all four bands in the limiting case considered here are depicted in Figure 2.27b.

Case 3: |E_0| << Δ, E_0 < 0

According to equation (2.394), negative values of E_0 mean that the Γ6 conduction band level lies below the Γ8 level. The relation |E_0| << Δ guarantees that the spin-orbit-split Γ7 level is found further below it (see Figure 2.27c). If, again, only energy values are considered with separations |E_i(k) − E_i(0)| from the respective band extrema which are small compared to Δ, one obtains dispersion relations having the same form as in the previously considered case E_0 << Δ, E_0 > 0 (see equation (2.406)). For electrons and light holes they yield, under the condition (P²/Δ)k² << |E_0|, the approximate parabolic dispersion laws
\[ E_1(\mathbf{k}) = E_c - \frac{2}{3} \frac{P^2}{|E_0|} k^2, \qquad E_3(\mathbf{k}) = \frac{\Delta}{3} + \frac{2}{3} \frac{P^2}{|E_0|} k^2. \tag{2.408} \]
Because of the negative sign of E_0, band 1 now lies energetically lower and exhibits a negative effective mass, and band 3 lies higher and has a positive effective mass. For the spin-orbit-split band E_4(k), equation (2.407) holds unchanged, and for the band E_2(k) of the heavy holes, relation (2.397) also remains the same. Thus, the band E_3(k) is, among all four bands, the highest energetically, with the exception of Γ where it is degenerate with the band of the heavy holes. Since the 8 valence electrons of a zincblende type semiconductor are only enough to occupy 6 of the 8 bands of the Γ6, Γ7, Γ8 band complex (2 electrons per unit cell are necessary to fill the deepest Γ6 valence band, omitted from consideration here), the E_3(k) band remains empty. It becomes the conduction band, which is also in accord with its positive effective mass. At Γ, its separation from the uppermost valence band, the E_2(k) band of heavy holes, is zero. This means that the energy gap vanishes in this case. Negative values of the parameter E_0, which signifies the energy gap when it is positive, cause E_0 to lose this significance, and the real energy gap becomes zero. Materials with vanishing energy gap are called zero-gap semiconductors. Examples are HgTe as a zincblende type semiconductor, and α-Sn as a semiconductor of the diamond type. In Table 2.13, the effective masses m_i are listed for the three limiting cases considered above. Generally, one has |m_1| ≤ |m_3| < |m_2|. The effective masses of the electrons and light holes are proportional to |E_0|/P² throughout. The rule discussed above for degenerate bands is thereby confirmed to be valid also for nondegenerate bands, wherein the mass decreases as |E_0| decreases and P increases. For |E_0| << Δ the effective masses of the electrons and light holes are almost identical; for E_0 >> Δ the electrons are lighter than the light holes.
For the spin-orbit-split band, the proportionality of the effective masses to |E_0|/P² exists only in the case E_0 >> Δ; for |E_0| << Δ the mass m_4 becomes independent of E_0 and proportional to Δ.
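As a short worked step, the proportionality of the electron mass to E_g/P² can be turned into a mass ratio by comparing the parabolic dispersion of the conduction band with ħ²k²/2m_1 and using the quantity E_P = (2m/ħ²)P² from the caption of Table 2.12. The free electron term is neglected as in the text, and the numbers inserted at the end are illustrative assumptions, not entries of Table 2.12.

```latex
% Wide gap (E_0 >> \Delta): E_1(k) = E_c + (P^2/E_g) k^2
\frac{\hbar^2}{2 m_1} = \frac{P^2}{E_g}
\quad\Longrightarrow\quad
\frac{m_1}{m} = \frac{\hbar^2}{2m}\,\frac{E_g}{P^2} = \frac{E_g}{E_P},
\qquad E_P = \frac{2m}{\hbar^2}\,P^2 .

% Narrow gap (E_0 << \Delta): E_1(k) = E_c + (2/3)(P^2/E_g) k^2
\frac{m_1}{m} = \frac{3}{2}\,\frac{E_g}{E_P} .

% Illustration with assumed values E_P = 25 eV:
%   E_g = 1.5 eV (wide gap):   m_1/m = 1.5/25         = 0.06
%   E_g = 0.2 eV (narrow gap): m_1/m = (3/2)(0.2/25)  = 0.012
```

The two lines show the trend stated above: for a given P the mass shrinks in proportion to the gap.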
Band      E_0 >> Δ, E_0 > 0       E_0 << Δ, E_0 > 0           |E_0| << Δ, E_0 < 0
Γ6(1)     1 · (E_g/P²)            (3/2) · (E_g/P²)            −(3/2) · (|E_0|/P²)
Γ8(3)     −(3/2) · (E_g/P²)       −(3/2) · (E_g/P²)           (3/2) · (|E_0|/P²)
Γ7(4)     −3 · (E_g/P²)           −3 · (E_g/P²) · (Δ/E_g)     −3 · (|E_0|/P²) · (Δ/|E_0|)

2.8
In this section we discuss the band structures of some important semiconductors. In all cases, the results presented are based on both theoretical and experimental investigations. Experimental data concerning the band structure of semiconductors are mainly obtained by means of optical reflectance spectroscopy. It turns out that characteristic structures of the reflectance spectra, like peaks or shoulders, are directly related to optical transitions at critical points of the energy difference between the initial and final bands involved. The frequencies of these structures are experimental measures of the energy separations between initial and final bands at critical points. To enhance the characteristic spectral features, changes of the reflectance spectra due to external perturbations are measured, as, for example, mechanical strain, electric and magnetic fields, light, or heat. By modulating the perturbations periodically in time with frequencies in the kHz range, these spectral changes can be measured very precisely by means of frequency and phase sensitive techniques. Examples of this so-called modulation spectroscopy are electroreflectance (ER), piezoreflectance, thermoreflectance, and photoreflectance (PR) (for more on electroreflectance see section 3.7). Details of the band structure at critical points, like effective masses of free carriers, may also be extracted from transport measurements. Again, external perturbations, in particular magnetic fields, are applied to induce changes. Magnetotransport phenomena, like magnetoresistance and the Shubnikov-de Haas effect, are examples. In cyclotron resonance one measures the absorption of microwave radiation by a semiconductor sample in the presence of a magnetic field to obtain the effective masses of electrons and holes (section 3.8). None of the experimental methods is capable of revealing the entire band structure of a given semiconductor material at all points of the first BZ. To obtain this, one is obliged to carry out band structure calculations. Experimental data enter these calculations in various ways. This is obvious if empirical methods are employed: their results have to be fitted to experimental data, as, for example, to energy separations between bands at critical points obtained from modulation spectroscopy. Less obvious, but nevertheless existing, is the need of experimental data for ab initio calculations. Although these methods are free of fitting parameters, various approximations are involved which call for experimental confirmation or even corrections of the results, as, for example, in the case of the erroneous fundamental energy gap in the local density approximation. Below we present the results of numerical band structure calculations using one or another of the methods described in section 2.5. We will not specify which particular method was applied since that is not of interest here. Our main concern is with the qualitative features of the energy bands. We will demonstrate that these may be understood, at least partially, just by means of the general results derived in the preceding sections and in Appendix A. This is particularly true for features involving the degree of band degeneracy at symmetry points of the first BZ, which follow from the irreducible representations of the space group of the crystal under consideration (see Appendix A). The band structure models derived by the empty lattice approach, the tight binding method, and the k·p method in sections 2.4, 2.6 and 2.7, respectively, will also be helpful. We begin our discussion with silicon.
2.8.1
Silicon
In Figure 2.28 the band structure of Si is shown. Spin-orbit interaction plays only a minor role for the overall behavior of energy bands in Si, so the energy levels may be classified by means of the ordinary irreducible representations of the small point groups of the wavevectors k. These are subgroups of the full point group Oh of equivalent directions. According to what we already know about the dimensions of these representations at the various symmetry points of the first BZ (see Table 2.5 and Appendix A), one expects at most 3-fold degenerate levels at the symmetry center Γ, only 2-fold at the symmetry point X, and at most 2-fold on the symmetry lines Δ, Λ and at L. This expectation is confirmed by the band structure shown in Figure 2.28. The deepest energy level of the entire band structure occurs at Γ, is nondegenerate, and belongs to the irreducible representation Γ1. The same result was previously obtained in the band structure analysis of diamond type crystals by means of the empty lattice and tight binding approaches in earlier sections (see Figures 2.13 and 2.21, respectively). Also, off of Γ, the numerical calculation of the Γ1 band of Figure 2.28 exhibits the behavior predicted by these simple approaches.

Figure 2.28: Band structure of Si. The energy unit is eV. (After Chelikowsky and Cohen, 1974.)

Differences between the numerical and empty-lattice band structures occur at the second level at Γ, as may be seen from Figure 2.29. The second empty lattice level has an 8-fold degeneracy which exceeds what is compatible with the Oh-symmetry of Γ. In a real diamond type crystal this degeneracy is removed. As indicated in Figure 2.29, a splitting into levels of Γ1-, Γ2'-, Γ25'-, and Γ15-symmetry will occur. In this way the corresponding Γ25', Γ15, and Γ2' levels in Figure 2.28 could have emerged from the 8-fold degenerate empty lattice level. The tight binding analysis of the band structure of diamond type semiconductors in section 2.5 has already shown that the second level (from below) at Γ has Γ25' symmetry. For the third level at Γ, the representations Γ15 or Γ2' were possible, according to this analysis. As Figure 2.28 shows, Γ15 applies in the case of Si, while the Γ2' level is the fourth. Along the Δ- and Λ-lines, the two 3-fold degenerate levels Γ25' and Γ15 must split since only 1- and 2-dimensional irreducible representations are possible there. Both a 3-fold splitting into 3 simple bands as well as a 2-fold splitting into a 2-fold degenerate and a simple band are conceivable. From Figure 2.28 one sees that the latter case holds, both along the Δ- and Λ-lines, and for the Γ25' and the Γ15 levels. Since there are only 2-dimensional representations at X (for an explanation see Appendix A), two simple bands along the Δ-line must merge at this point, as actually happens in Figure 2.28.
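The 8-fold degeneracy of the second empty lattice level at Γ is a matter of counting reciprocal lattice vectors of equal length. The following Python sketch does the counting, assuming an fcc direct lattice whose reciprocal lattice vectors are (2π/a)(h, k, l) with h, k, l either all even or all odd; spin is not counted. The function name and the cutoff parameter are choices made for this illustration.

```python
from collections import Counter
from itertools import product

def gamma_levels(max_index=3):
    """Empty-lattice levels at Gamma for an fcc crystal.

    Returns a dict mapping |G|^2 (in units of (2*pi/a)^2) to the number of
    reciprocal lattice vectors G of that length. Only levels with
    |G|^2 <= max_index^2 are returned, because these are counted completely.
    """
    counts = Counter()
    for h, k, l in product(range(-max_index, max_index + 1), repeat=3):
        # The reciprocal lattice of fcc is bcc: h, k, l all even or all odd.
        if h % 2 == k % 2 == l % 2:
            counts[h * h + k * k + l * l] += 1
    limit = max_index * max_index
    return {g2: n for g2, n in sorted(counts.items()) if g2 <= limit}

if __name__ == "__main__":
    for g2, degeneracy in gamma_levels().items():
        print(g2, degeneracy)
```

The lowest level (G = 0) is nondegenerate, and the second level, formed by the eight vectors (2π/a)(±1, ±1, ±1), is 8-fold degenerate; this is the level that splits into Γ1, Γ2', Γ25' and Γ15 in the real crystal.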
Figure 2.29: Band structure of the empty fcc lattice. The energy unit is the same as in Figure 2.28. The irreducible representations of the energy bands are also indicated. For comparison, the band structure of Si is shown in the same figure by dotted lines.
A 2-fold degenerate band along Δ cannot merge with another band at X, but must terminate in a doubly degenerate level at X. From this it follows, for example, that the upper of the two bands arising from the Γ25' level along the Δ-line must be 2-fold degenerate, and the lower band must be simple. A look at Table 2.4 shows that the only 2-dimensional representation of the small point group of Δ is Δ5. At X this representation is compatible either with X1 or X3 (the compatibility relations between irreducible representations are derived in Appendix A). Figure 2.28 shows that X1 is correct in this case. In a similar way one may conclude that the representation of the lower level at X, arising from the Γ25' level, must be X2. A similar analysis for the splitting of the Γ15 level along the Δ-line shows that the lower level is nondegenerate and belongs to the irreducible representation Δ1, and the upper level is 2-fold degenerate and belongs to Δ5. The intersection between the Δ2' band emerging from the Γ2' level, with the Δ5 band emerging from the Γ15 level, is not due to symmetry, but reflects an accidental degeneracy.
Consider next the two bands arising from the Γ25' level on the Λ-line. The upper is the 2-fold degenerate Λ3 band, and the lower is the simple Λ1 band. The simple band does not merge with another simple band at L, but remains separated from it by a finite gap, in contrast to the band behavior at X. This is possible because at L there are also 1-dimensional representations, L1 and L2', which give rise to the two lowest levels at L. The splitting of the Γ15 level along the Λ-line is quite similar to that of the Γ25' level; the lower band is nondegenerate and belongs to Λ1, and the upper is 2-fold degenerate and belongs to Λ3.

The occupation of the energy bands of Si and other diamond type crystals has already been discussed in section 2.6, using the band structure which follows in tight binding approximation. In the ground state of the crystal, the band associated with the 1-fold Γ1 level, and the bands arising from the 3-fold Γ25' level, are completely occupied, while the above-lying Γ15 and Γ2' bands are completely empty. Thus the Γ1 and Γ25' bands form the valence bands of Si, and the Γ15 and Γ2' bands, as well as all higher bands, form the conduction bands. The two groups of bands are separated by k-dependent forbidden energy regions, marked by dashes in Figure 2.28. Moreover, there are forbidden energy values which occur at all k-vectors. This means that the band structure calculations for Si yield a finite energy gap, and, indeed, they explain the semiconductor character of Si. While the absolute maximum of the uppermost valence band lies at Γ, the absolute minimum of the lowest conduction band is located on the Δ-line close to the edge of the first BZ, and its irreducible representation is Δ1. Semiconductors with the absolute extrema of the valence and conduction bands located at different points of the first BZ are called indirect. Silicon is, therefore, an indirect semiconductor. If the extrema occur at the same point, the semiconductor is called direct. The property of a semiconductor material in being direct or indirect (Table 2.14 contains information about this) has important physical and technological consequences. For example, indirect materials are in general not suitable for manufacturing light-emitting devices, unless one takes special measures such as a particular doping. Following the general lines of section 2.5, we will now examine the effective mass tensors of Si.
We will restrict ourselves to the two critical points mentioned above, at which the conduction band has its absolute minimum and the valence band has its absolute maximum. The vicinities of these points are the regions of the first BZ which, under thermodynamic equilibrium conditions, host most of the free electrons and holes. The effective masses at these points are therefore the effective masses of the electrons and holes of silicon. Owing to the cubic symmetry of the band structure of Si, a minimum of the conduction band on a particular Δ-line is automatically accompanied by 5 other minima on the star of Δ-lines. This means that there are, altogether, 6 minima or valleys, a term which is often used for the vicinities of the minima. Since Si has several valleys, one calls it a many-valley semiconductor. The valleys are centered at the points
axes of the effective mass tensors of the various valleys. Each principal axis represents a 4-fold rotation symmetry axis. To proceed further, we select, arbitrarily, the valley centered at (0, 0, k₀). The band structure of this valley is given by the expressions (2.201) and (2.202), which are applicable here, as their conditions of validity are satisfied. Setting the band index ν equal to c, which refers to the conduction band, the dispersion relation E_c(k) of this band becomes
(2.409)
If the zero point of the energy scale is put at the valence band maximum, as we do here, then E_g is the fundamental energy gap. In the vicinity of the valence band maximum, the general results of the k·p-method of section 2.7 are applicable. Without spin-orbit interaction (see equations (2.345) to (2.348)) one has two valence bands for each of the two symmetry directions Δ and Λ, an upper E_v1,2(k) which is 2-fold degenerate, and a lower E_v3(k) which is nondegenerate. Along less symmetrical k-directions E_v1,2(k) splits into two bands. However, spin-orbit interaction cannot be neglected; although it has little effect on the overall band structure of Si, it influences the valence band structure considerably in an energy interval of several kT below the maximum at Γ, which is the energy region where most of the holes are located. Spin-orbit interaction makes the uppermost valence band level at Γ, which has a 6-fold degeneracy in Si if spin is considered, split into an upper 4-fold degenerate Γ₈⁺ level and a lower 2-fold degenerate Γ₇⁺ level. The splitting energy amounts to 44 meV. Away from Γ, the upper Γ₈⁺ level decomposes into the two bands of heavy and light holes according to equation (2.377). The lower Γ₇⁺ level gives rise to the spin-orbit-split band. The heavy and light hole bands are strongly warped and each exhibits 2-fold spin degeneracy, which is due to time reversal symmetry in combination with spatial inversion (see Appendix A). If warping is neglected, the two bands can be described by isotropic effective masses. These are negative because of the maximum at Γ. Extracting the negative sign, we define the positive effective masses m*_vh and m*_vl of, respectively, heavy and light holes as the negatives of these band masses. For the heavy and light hole valence bands, denoted E_vh(k) and E_vl(k), we have
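$$E_{vh}(\mathbf{k}) = -\frac{\hbar^{2} k^{2}}{2 m^{*}_{vh}}, \qquad E_{vl}(\mathbf{k}) = -\frac{\hbar^{2} k^{2}}{2 m^{*}_{vl}}. \tag{2.410}$$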
Numerical values for the effective hole masses of Si are given in Table 2.14. A vivid view of the conduction and valence band structure in the vicinity of the band edges is provided by the corresponding isoenergy surfaces. These are obtained by fixing a specific energy and then drawing all k-vectors for which the bands of equations (2.409) and (2.411) yield this energy value (see Figure 2.30). For the conduction band, the isoenergy surfaces are ellipsoids of revolution, pointing in the direction of the symmetry axis. Each of the six star points is the center of such an ellipsoid. For the valence bands, within the isotropic approximation, the isoenergy surfaces are concentric spheres centered at Γ. The inner sphere corresponds to the light hole band, and the outer to the heavy hole band. In reality the valence bands are not isotropic but have only cubic symmetry. Thus their isoenergy surfaces are warped, as shown in Figure 2.30. If the conduction band is populated by electrons up to a given energy, then the k-vectors of these electrons lie within the ellipsoid of revolution corresponding to this energy. Accordingly, the k-values of the holes lie within one of the two spheres or the two bodies bounded by the warped surfaces.

Figure 2.30: Isoenergy surfaces of the electrons (on the left) and holes (on the right) for Si.
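The valley picture described above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: it generates the star of a point on a Δ-line by applying the 48 cubic point group operations, and evaluates the standard ellipsoidal form E = E_g + ħ²Δk_l²/(2m_l) + ħ²Δk_t²/(2m_t) near one valley. The names `m_l`, `m_t`, `K0`, the lattice constant, the position 0.85·2π/a and the mass values are assumptions chosen for illustration and are not taken from the text.

```python
import itertools
import numpy as np

HBAR2_OVER_2M0 = 3.81  # hbar^2 / (2 m0) in eV * Angstrom^2


def cubic_point_group():
    """Return the 48 operations of O_h as signed permutation matrices."""
    ops = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1, -1), repeat=3):
            op = np.zeros((3, 3))
            for row in range(3):
                op[row, perm[row]] = signs[row]
            ops.append(op)
    return ops


def star(k):
    """Distinct images of the wavevector k under all cubic operations."""
    k = np.asarray(k, dtype=float)
    images = {
        tuple(float(x) for x in np.round(op @ k, 9) + 0.0)
        for op in cubic_point_group()
    }
    return sorted(images)


def valley_energy(k, center, e_gap, m_l, m_t):
    """Ellipsoidal band energy (eV) near a valley centered at `center`.

    m_l and m_t are the assumed longitudinal and transverse masses
    in units of the free electron mass; k is in 1/Angstrom.
    """
    center = np.asarray(center, dtype=float)
    dk = np.asarray(k, dtype=float) - center
    axis = center / np.linalg.norm(center)   # valley axis = Delta direction
    dk_l = dk @ axis                          # longitudinal component
    dk_t2 = dk @ dk - dk_l ** 2               # squared transverse component
    return e_gap + HBAR2_OVER_2M0 * (dk_l ** 2 / m_l + dk_t2 / m_t)


# Illustrative (assumed) numbers for a Si-like material.
A_LATTICE = 5.43                      # Angstrom
K0 = 0.85 * 2 * np.pi / A_LATTICE     # assumed valley position on the Delta-line
valleys = star((0.0, 0.0, K0))
print(len(valleys), "valleys:", valleys)
```

With m_l larger than m_t, the same displacement costs less energy along the valley axis than perpendicular to it, which is why the isoenergy surface is an ellipsoid elongated along the Δ-direction.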
Figure 2.31: Band structure of Ge. The energy unit is eV. (After Chelikowsky and Cohen, 1974.)
2.8.2
Germanium
In Figure 2.31 the band structure of germanium is depicted. It is similar to that of silicon, owing to the fact that both materials have diamond structure. The differences between the two band structures are due, among other reasons, to the fact that the spin-orbit interaction is considerably larger in Ge than in Si (see Table 2.14). This results from the larger orbital velocity of the valence electrons of Ge because of the larger atomic nucleus of this element. When spin-orbit interaction is taken into account, the valence band maximum at Γ is formed by an upper 4-fold degenerate Γ₈⁺ level and a lower 2-fold degenerate Γ₇⁺ level, separated from the upper level by 340 meV. Away from Γ, the upper Γ₈⁺ level decomposes into the two 2-fold degenerate bands of heavy and light holes of, respectively, Δ₆ and Δ₇ symmetry. As in the case of Si, the bands are warped but often considered in an isotropic approximation. The lowest conduction band level Γ₇⁻ at Γ arises from the Γ₂′ level without spin, which was the second conduction band level in the case of Si. In Ge this level is shifted down considerably so that it becomes lower than the Γ₆⁻ and Γ₈⁻ doublet which arises from the Γ₁₅ level without spin. The most important
Table 2.14: Characteristic data of the band structure of selected semiconductors. Energies in eV, effective masses in free electron masses. Temperature 300 K. For Si, Ge, GaAs, GaP, CdTe, and HgTe read i = vh, vl (heavy, light holes), and for CdS (hexagonal phase), PbTe, Te, and Se read i = ∥, ⊥ (parallel, perpendicular to symmetry axis). (After Landolt-Börnstein, 1982.)
Material
CB minimum
Energy gap
1.1 1
0.66
hl
II
0.15 0.044
I _
Si
Ge GaAs
A
L
0.044 0.29
0.54 0.3
r
X
1.43
2.27
0.34
0.08
0.7
0.30
0.45
0.67
0.087 0.17
0.1 0.03 0.7
GaP CdTe
HgTe
CdS
r r
1.43
0.00
0.4
0.3
r
L
2.50 0.30
0.33
0.064
0.24
0.07
5
0.028
0.1
PbTe
0.31 0.11
0.022
0.24
k
Se 
H
Z(?)
1.8
 
difference between the conduction bands of Ge and Si is the location of the respective absolute minima. In Ge it does not occur on a Δ-line, but at L, the BZ boundary point of the Λ-line. Therefore, the conduction band of Ge has 8 half-valleys instead of the 6 full valleys of Si. The minima at L are L₆⁺ levels arising from the Λ₆ band on the Λ-line, which enters the Γ₇⁻ level at Γ.
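The count of 8 half-valleys can be checked with a short sketch. The Python fragment below is a minimal illustration, not part of the original treatment: it builds the star of the L-point (½, ½, ½) in units of 2π/a and then groups star members that differ by a reciprocal lattice vector of the fcc lattice. The helper names are hypothetical; the only physics assumed is that the reciprocal lattice of fcc is bcc, whose vectors in units of 2π/a have integer components that are all even or all odd.

```python
import itertools
import numpy as np


def cubic_point_group():
    """Return the 48 operations of O_h as signed permutation matrices."""
    ops = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1, -1), repeat=3):
            op = np.zeros((3, 3))
            for row in range(3):
                op[row, perm[row]] = signs[row]
            ops.append(op)
    return ops


def star(k):
    """Distinct images of k (units of 2*pi/a) under all cubic operations."""
    k = np.asarray(k, dtype=float)
    images = {
        tuple(float(x) for x in np.round(op @ k, 9) + 0.0)
        for op in cubic_point_group()
    }
    return sorted(images)


def is_fcc_reciprocal_vector(g):
    """True if g (units of 2*pi/a) has integer components, all even or all odd."""
    g = np.asarray(g, dtype=float)
    rounded = np.round(g)
    if not np.allclose(g, rounded):
        return False
    parities = {int(x) % 2 for x in rounded}
    return len(parities) == 1


def inequivalent(points):
    """Keep one representative per class of points differing by a reciprocal lattice vector."""
    representatives = []
    for p in points:
        if not any(is_fcc_reciprocal_vector(np.subtract(p, q)) for q in representatives):
            representatives.append(p)
    return representatives


L_POINT = (0.5, 0.5, 0.5)
l_star = star(L_POINT)
print(len(l_star), "star points,", len(inequivalent(l_star)), "inequivalent")
```

The eight star points pair up through vectors of the type (1, 1, 1), so the 8 half-valleys at the zone boundary are equivalent to 4 full valleys, whereas the six Δ-valleys of Si lie inside the zone and remain six distinct valleys.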
2.8.3
III-V Semiconductors
For the two III-V compound semiconductors GaAs and GaP, the band structures are shown in Figure 2.32. Their first BZs are the same as that of Si and Ge, since they have the same Bravais lattice. The full point group of equivalent directions is T_d for both materials. Spin-orbit interaction is important for the overall band structure of GaAs, while for GaP it may be neglected. Thus the band structure of GaAs is described in terms of spinor representations, and that of GaP in terms of ordinary representations. For both materials, the valence band maximum lies at Γ. Without spin and spin-orbit interaction (the case of GaP), it belongs to the irreducible representation Γ₁₅ of the point group T_d, which arises from the representation Γ₂₅′ of O_h if the latter is taken as a representation of T_d. With spin-orbit interaction (the case of GaAs), the Γ₁₅ valence band maximum splits into an upper Γ₈ level and a lower Γ₇ level. In the upper Γ₈ level, the two 2-fold degenerate heavy and light hole bands merge together as in the case of Ge and Si. The lower Γ₇ level gives rise to the 2-fold degenerate spin-orbit-split band.

The 2-fold degeneracy of the valence bands at X, observed in the case of Si and Ge, splits in GaAs and GaP. The reason for this is that the two atoms of the primitive unit cell are no longer identical, which means that the point symmetry of equivalent directions is reduced from O_h to T_d. Therefore, one also has 2-dimensional spinor representations (X₆, X₇) instead of only one 4-dimensional (X₅) in the case of Si and Ge, and also 1-dimensional ordinary representations (X₁, X₂, X₃, X₄) instead of only 2-dimensional ones (X₁, X₂, X₃, X₄) in the case of Si and Ge (see Appendix A).

The conduction band minimum of GaAs occurs at the Γ-point and belongs to the spinor representation Γ₆. Thus, GaAs has a direct energy gap. One of the peculiarities of its conduction band structure is the relative minimum at the L-point only about 0.4 eV above the absolute minimum at Γ. In the case of GaP, the conduction band minimum occurs at X and belongs to the representation X₁. Since the valence band maximum resides at Γ, GaP is an indirect semiconductor.

Figure 2.32: Band structure of III-V semiconductors GaAs (top) and GaP (bottom). The energy unit is eV. (After Chelikowsky and Cohen, 1974.)

Figure 2.33: Band structure of II-VI semiconductors CdTe (left) and HgTe (right). The energy unit is eV. (After Chadi, Walter, Cohen, Petroff and Balkanski, 1972.)
2.8.4
II-VI semiconductors
The band structures of two typical II-VI semiconductors with zincblende structure, CdTe and HgTe, are shown in Figure 2.33. The band structure of CdTe is similar to that of GaAs; in particular, the conduction and valence band edges are located at the Γ-point, as also occurs in GaAs. The same holds for HgTe, but as mentioned in section 2.7, the two bands touch each other at Γ, so that the energy gap vanishes. The reason for this is easy to understand, and was also already discussed in section 2.7: the Γ₆ level, which lies above the Γ₈ level in CdTe, is found below it in HgTe. The 8 valence electrons per primitive unit cell therefore suffice to occupy only one of the two Γ₈ bands merging at Γ. The second band remains unoccupied and becomes the conduction band. Since it is degenerate with the uppermost valence band at Γ, the fundamental energy gap is zero. According to the general definitions of Chapter 1, this means that HgTe is no longer a semiconductor, because in semiconductors the valence and conduction bands must be separated by a finite energy gap. It is more fitting to term HgTe a semimetal.

As an example of a II-VI semiconductor which does not exhibit the zincblende but rather the wurtzite structure, we choose the hexagonal CdS (see Figure 2.34). The first BZ is that of the hexagonal Bravais lattice, which is shown along with its symmetry points in Figure 2.12. CdS has a direct energy gap at Γ. Furthermore, compared to the cubic materials of diamond and zincblende structure, the valence band is additionally split at Γ because of the hexagonal deformation of CdS. Thus, the valence band is composed of 3 bands, namely the Γ₇ band, which is separated from the other two by spin-orbit interaction, and the two bands Γ₉ and Γ₇ into which the Γ₈ band decomposes under the hexagonal deformation. The latter effect is called crystal field splitting (see the right part of Figure 2.34).

Figure 2.35: Band structure of PbTe. The energy unit is eV. (After Martinez, Schlüter and Cohen, 1975.)
Figure 2.36: Band structure of Te (left) and Se (right). The energy unit is eV. (After Maschke, 1971; Stuff, Maschke and Laude, 1973.)
2.8.5
IV-VI semiconductors
The band structure of a typical IV-VI semiconductor, PbTe, is depicted in Figure 2.35. It crystallizes in the rocksalt structure. Its Bravais lattice is, thus, again the fcc lattice and its first BZ is the same as that of the diamond and zincblende type crystals. The conduction band minimum and the valence band maximum both lie at the edge point L of the first BZ. This means that PbTe is a direct gap semiconductor with many valleys, a special feature which is hardly realized in any semiconductor family other than the IV-VI compounds. The energy gap of PbTe is relatively small; it amounts to about 0.2 eV. Like InSb among the III-V compounds, PbTe belongs to the group of narrow gap semiconductors.
2.8.6
The band structures of tellurium and selenium are shown in Figure 2.36. Both materials have hexagonal Bravais lattices. Therefore, they have the same first BZ as the hexagonal CdS. In the case of Te, both the valence band maximum and the conduction band minimum occur at the H-point of the first BZ. The material is therefore direct. Se has an indirect gap; the two edges both lie away from Γ, that of the conduction band probably at Z, and that of the valence band probably at H.
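The direct or indirect character of the materials discussed in this section follows from a simple comparison of the two band edge positions. As a compact summary, the sketch below collects the locations stated in the text in a small lookup table; the dictionary layout and the function name `gap_type` are illustrative choices, and the entries for Se carry the same uncertainty ("probably") as in the text. HgTe is omitted because its gap vanishes.

```python
# (valence band maximum, conduction band minimum) as stated in the text.
BAND_EDGES = {
    "Si":   ("Γ", "Δ"),
    "Ge":   ("Γ", "L"),
    "GaAs": ("Γ", "Γ"),
    "GaP":  ("Γ", "X"),
    "CdTe": ("Γ", "Γ"),
    "CdS":  ("Γ", "Γ"),
    "PbTe": ("L", "L"),
    "Te":   ("H", "H"),
    "Se":   ("H", "Z"),   # both locations are only probable
}


def gap_type(material):
    """Return 'direct' if both band edges lie at the same point of the first BZ."""
    vbm, cbm = BAND_EDGES[material]
    return "direct" if vbm == cbm else "indirect"


for name in BAND_EDGES:
    print(f"{name:5s} {gap_type(name)}")
```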
Chapter 3
3.1
The reference system for the description of the atomic structure of a real crystal is the ideal crystal. As discussed in Chapter 1, the latter may be characterized as follows: All regular sites

$$\mathbf{R}_i = r_1\mathbf{a}_1 + r_2\mathbf{a}_2 + r_3\mathbf{a}_3 + \boldsymbol{\delta} \tag{3.1}$$

defined by the crystal structure, are occupied by atoms of the 'correct' chemical species, and other sites are empty. In a real crystal, this characterization holds for the vast majority of regular and irregular sites, which provides the justification to speak of a crystal at all, albeit a perturbed one. However, there is deviation from the ideal occupation at some of the regular and irregular sites. In this sense, one then refers to the crystal as either perturbed or real.
3.1.1
Classification of perturbations
The perturbations to be treated can be classified in accordance with several points of view. On the one hand, one can distinguish them on the basis of whether they are of a purely chemical nature, or purely structural, or of mixed type. In the first case, only regular sites of the crystal are occupied, but not with chemically 'correct' atoms throughout. One refers to this as chemical or compositional disorder. In the case of elemental crystals, this kind of disorder necessarily means the presence of impurity atoms. In this context the perturbed crystal, which hosts the impurity atom, is called the host crystal. A crystal formed from a chemical compound may also exhibit compositional disorder without impurity atoms, namely because of a perturbed stoichiometric composition. In the case of structural perturbations, only chemically 'correct' atoms are present, but these are not always positioned on regular crystal sites but also on irregular ones. Furthermore, not all regular sites are occupied by atoms; some sites remain empty. Structural perturbations are also called structural defects or simply defects. The third case is the most general one: regular or irregular sites of the crystal are partially occupied by chemically wrong atoms, and also the correct atoms partially occupy wrong sites.

Deviations from an ideal crystal may, on the other hand, be distinguished according to the macroscopic extension of the perturbations. A perturbation that is limited to one or a few neighboring regular or irregular crystal sites is called 0-dimensional or a point perturbation. If the perturbation extends over sites located on a line or a plane, it is referred to as a 1-dimensional or line perturbation and a 2-dimensional or plane perturbation, respectively. Combining the two classification schemes, one may refer to structural point or line perturbations, compositional point perturbations etc. The dimensions in the second classification scheme apply to the microscopic core of the perturbation. There may be smaller perturbations induced by that core which extend in three dimensions. Examples are charged impurity atoms, which, due to their long-range Coulomb forces, change the potential energy of an electron even over distances large compared to the lattice constant. Below we will characterize the various perturbations in more detail, starting with point perturbations.
3.1.2
Point perturbations
In this subsection, we describe the most important compositional and structural point perturbations of semiconductor crystals. An illustration is given in Figure 3.1.
(1) We begin with an impurity atom on a regular crystal site. Since the impurity atom substitutes an atom of the host crystal (see Figure 3.1b) it is referred to as a substitutional impurity. Examples of substitutional impurities are a phosphorus atom in Si on a Si-site, or a sulphur atom in GaAs on an As-site. To avoid a somewhat cumbersome description, in the first case one uses the symbol Si:P, and in the second, the symbol GaAs:S_As. As a rule, impurity atoms which are chemically similar to atoms of the host crystal are incorporated substitutionally. For this reason, many elements of the main groups of the periodic table, if added to group-IV elemental semiconductors as well as binary III-V and II-VI compound semiconductors, form substitutional impurities. The substitutional incorporation, in most cases, occurs on that lattice site which corresponds to the chemically most similar of the two atoms in the binary compound semiconductor. Therefore, the doping of GaAs with S leads to the above mentioned point perturbation GaAs:S_As (and not GaAs:S_Ga), and the doping of GaAs with Si leads to GaAs:Si_Ga (and not GaAs:Si_As).
(2) An unoccupied regular crystal site is called a vacancy, as depicted in Figure 3.1c. In semiconductors made of binary chemical compounds, one has to distinguish between cation and anion vacancies, as shown in Figure 3.1c. Vacancies occur in all important semiconductor crystals; the general symbol is V. The vacancy in Si is denoted by Si:V, the cation vacancy in GaAs by GaAs:V_Ga, and the anion vacancy by GaAs:V_As.
(3) If the impurity atom does not occupy a regular crystal site but a site between regular ones, one has an interstitial impurity atom (Figure 3.1d). In order for an impurity atom to stay at an interstitial site, it must have sufficiently low energy there. It is quite clear that this will be satisfied for interstitial sites which either have high local symmetry or which lie on a bond between two atoms. In the latter case the crystal has bond centered interstitials. The high symmetry interstitial sites in tetrahedral semiconductors may be sites with tetrahedral local symmetry in the neighborhood of the cation or of the anion (for group IV elemental semiconductors the latter distinction is void, of course). Moreover, there are high symmetry sites with hexagonal symmetry. One refers to these, respectively, as tetrahedral and hexagonal interstitials. The incorporation of impurity atoms on interstitial crystal sites is especially likely when the impurity atom deviates relatively strongly from the atoms of the host crystal as, for example, in the case of transition metal atoms in semiconductors composed of elements of the main groups. The general symbol for an interstitial is I. An interstitial Fe atom in Si on a tetrahedral site is denoted by Si : I, ;.
(4) If a chemically 'correct' atom of the crystal occupies an interstitial site rather than a regular one, one has a self-interstitial (as shown in Figure 3.1e). In order for such a structural point defect to develop in a crystal, there must be enough space between the host atoms, i.e., the crystal should not be packed too densely. This happens, for example, in the case of tetrahedral semiconductors, particularly in Si and Ge which have purely covalent bonding.

(5) If a crystal consists of two different chemical elements, then an atom of the first may occupy a regular site of the second, and vice versa. Such point perturbations are called antisite defects, as illustrated in Figure 3.1. In the case of GaAs, for example, a Ga atom may be located at an As-site (this is called an As-antisite defect), and an As atom may occupy a Ga-site (this is called a Ga-antisite defect). The symbols are As_Ga for the As-antisite defect, and Ga_As for the Ga-antisite defect.

Interstitials, vacancies and antisite defects are structural point perturbations, or point defects. The compositional point perturbations, i.e. the
Figure 3.1: Illustration of the most important point perturbations in semiconductors using the example of a crystal with two atoms per unit cell of the same chemical element (left-hand side) and different chemical elements (middle and right-hand side). [Panel labels: ideal crystal; A atom; S_A; S_B; vacancy (V); interstitial (I); interstitial impurity; A-antisite; B-antisite.]
Table 3.1: Electron configuration of main group elements. In the rightmost column the respective closed shells are indicated.
The division of elements into groups, which is commonly used in chemistry, also proves to be helpful for the classification of impurity atoms in semiconductors. This is not surprising because the incorporation of an impurity atom in a crystal indicates a more or less strong chemical bonding. We summarize this group division of chemical elements below.

The periodic table consists of two types of groups of elements, the main groups and the transition groups. Of the first 98 elements, 50 belong to the main groups and 48 to the transition groups. The elements of the main groups in Table 3.1 have in common the feature that electron shells with angular momentum quantum numbers l ≥ 2 either do not occur at all or, if they exist, are completely filled or completely empty, i.e. no partially filled shells of this kind occur. The energetically highest, and thus in general only partially filled, shells of these elements have either l = 0 or l = 1. Therefore, they are s- and p-shells. Because of the relatively large spatial extension of s- and p-shells in comparison with d- and f-shells, the former are simultaneously also the outer shells of the atoms, which are responsible for chemical bonding. One thus also speaks of sp-bonding elements. The rare gas elements are special cases, in which the s- and p-shells are also completely occupied.

Table 3.2: Electron configuration of transition elements. In the rightmost column the respective closed shells are shown.
[Iron group: Sc 3d¹4s², Ti 3d²4s², V 3d³4s², Cr 3d⁵4s¹, Mn 3d⁵4s², Fe 3d⁶4s², Co 3d⁷4s², Ni 3d⁸4s²; closed shells 3s²3p⁶.]

For the elements of the transition groups presented in Tables 3.2 and 3.3, the shells with l ≥ 2 are energetically the highest and, thus, in general they are the ones that are not completely occupied. Because of the relation n ≥ l + 1 between the main quantum number n and the angular momentum quantum number l, the d-shells (l = 2) are possible only for n ≥ 3, the f-shells (l = 3) only for n ≥ 4, etc. Accordingly, one has the shells 3d, 4d, 4f, 5d, 5f, 5g, 6d, 6f, 6g, 6h etc. Since among the first 98 elements of the periodic table, however, the 5g-shell already remains unoccupied, only d- and f-shells are to be considered, namely the d-shells 3d, 4d, 5d, 6d, and the f-shells 4f and 5f.

Table 3.3: Electron configuration of rare earths and actinides. In the right column the respective closed shells are indicated.

The filling of the 3d-, 4d- and 5d-shells takes place in the series of transition metals (together with the filling of the 4s-, 5s- and 6s-shells). Among the transition metals, one distinguishes the iron group in which the 3d- and 4s-shells are being filled, the palladium group in which the same happens with the 4d- and 5s-shells, and the platinum group where the 5d- and 6s-shells are being filled. The 4f-shells are being filled in the rare earth elements, and the 5f-shells in the actinides. In comparison with the s- and p-orbitals, the d- and f-orbitals have a smaller extension in space; they lie mostly within the s- and p-shells of the same main quantum number n. Therefore, mainly s- and p-electrons are involved in chemical bonding. This explains the remarkable chemical similarity of the rare earth elements with each other, and a certain similarity of these elements with the elements of the main groups.
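The shell counting used above follows directly from the relation n ≥ l + 1. As a small check, the sketch below enumerates all shells with l ≥ 2 up to n = 6 and reproduces the list 3d, 4d, 4f, 5d, 5f, 5g, 6d, 6f, 6g, 6h. The function names are illustrative, and the capacity 2(2l + 1) is the standard count of spin-orbital states per shell rather than something stated in the text.

```python
# Spectroscopic letters for l = 0, 1, 2, ...
SHELL_LETTERS = "spdfghik"


def allowed_shells(n_max, l_min=0):
    """List shells 'nl' with l_min <= l <= n - 1 for n = 1 .. n_max."""
    shells = []
    for n in range(1, n_max + 1):
        for l in range(l_min, n):          # enforces n >= l + 1
            shells.append(f"{n}{SHELL_LETTERS[l]}")
    return shells


def capacity(l):
    """Maximum number of electrons in a shell with angular momentum l."""
    return 2 * (2 * l + 1)


print(allowed_shells(6, l_min=2))
print("d-shell capacity:", capacity(2), " f-shell capacity:", capacity(3))
```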
In the case of an ionized donor and an ionized acceptor, the lowering of total energy through the formation of a bound complex or associate is particularly obvious: the two point perturbations are oppositely charged and attract each other through electrostatic forces. This leads to the formation of donor-acceptor pairs, in which the donor and acceptor atoms occupy neighboring sites in the crystal. In general, the pairs are stable at several possible distances, which gives rise to a variety of different donor-acceptor pair complexes.
Di- and multivacancies
When there are two vacancies, the mechanism for the formation of bound pairs can also be easily understood: the (internal) surface of the crystal is reduced if two previously isolated vacancies move together to occupy neighboring crystal sites. One calls this associate a divacancy. Analogous statements hold for the association of more than two vacancies, which are called multivacancies.
Frenkel defects
If, in a crystal, an atom moves from a regular site to an interstitial site, then it leaves behind a vacancy which attracts the interstitial. Thus a defect pair is formed in this process which consists of a self-interstitial and a vacancy. It is called a Frenkel defect.

There are important point perturbation complexes which occur only in a specific material or material group. We now consider some examples for Si and GaAs.
Point perturbation complexes in Si
For various reasons, hydrogen is often present in Si. In p-type Si, H atoms undergo chemical bonding with the available acceptor atoms. Thereby the electron of the H atom is captured by the acceptor atom, which becomes singly negatively charged. One may also say that the acceptor gives its bound hole to the H atom rather than to the valence band because it has lower energy there. In B-doped Si, for example, a negatively charged B ion and a positively charged H ion are formed in this way. The two ions attract each other by Coulomb forces, which results in the formation of a neutral (H,B) pair. Of course, the pair will not be able to accept an electron, which means that the B atom has lost its ability to act as an acceptor.

Chalcogen atoms like S, Se, and Te are incorporated in Si not only as single atoms, but also as two-atom molecules. Oxygen in Si enters into bonding with a vacancy, forming a pair which probably constitutes the so-called A-center known from capacitance measurements. Oxygen is also involved in a series of other defect complexes in Si, among others the so-called thermal donors, which are so named because of their origin in thermal treatment.
Point perturbation complexes in GaAs
A prominent defect associate in GaAs is the so-called DX-center, which acts as a donor. It is found in GaAs and also in (Ga,Al)As mixed crystals under appropriate conditions (e.g., high pressure; for more see section 3.5).
Originally, the DX-center was attributed to a donor atom, like S_As, bound to another point perturbation whose nature was unknown at that time and, therefore, was denoted by X. Currently, the DX-center is thought to be due to the donor atom alone or, more strictly, to a donor atom which is incorporated interstitially, and not substitutionally as commonly happens. Another
by point perturbations become larger and reach mesoscopic size, one refers to them as aggregates. Complexes of macroscopic size such as, for example, oxygen or heavy metals in Si crystals, are called
Lattice relaxation
a point perturbation differ from those in the ideal crystal. They are nonzero, in general, at the ideal crystal sites. Thus the atoms are forced to move to new equilibrium sites. This is known as lattice relaxation (in Figure 3.1 this effect is omitted). The new equilibrium sites are initially unknown. In principle, they can be determined by means of atomic structure calculations for the perturbed crystal. These have to be performed simultaneously with calculations of the electronic structure, just as is done in the self-consistent calculations of the electronic and atomic structures of ideal crystals described in Chapter 2, section 2.2. However, there is an important difference between the two cases. For ideal crystals, the calculation of atomic structure may be avoided since complete and reliable experimental data are available for them. In regard to the atomic structure of a crystal in the vicinity of a point perturbation, however, in many cases hardly more than the symmetry is known from experiment. Thus the self-consistent calculation of the electronic and atomic structures must actually be carried out if lattice relaxation becomes important.

Experimental information concerning the symmetry of lattice relaxation may be derived from observation of the Jahn-Teller effect. This effect results in splitting of the degenerate energy levels of a point perturbation due to a symmetry-lowering lattice relaxation. In the case of a vacancy, for example, the lattice relaxation can reduce the original tetrahedral symmetry T_d to tetragonal symmetry D_2d. Such spontaneous symmetry breaking occurs when it leads to lowering of the total energy of the crystal. This may happen when, in the unrelaxed state, the point perturbation has a degenerate level which is only partially occupied. First of all, the degenerate level will split due to the symmetry-lowering relaxation. According to perturbation theory, this splitting proceeds such that the center of gravity of the levels remains unchanged. Thus, along with levels shifted up, there are also levels which are shifted down. If the latter are predominantly occupied, then the energy of the electrons localized at the point perturbation decreases. This energy reduction can compensate the increase in total energy due to the removal of atoms from their equilibrium sites. If this happens, the relaxation is energetically favorable and will take place spontaneously.

In the case of the Jahn-Teller effect, the displacements of atoms are of the order of magnitude of one tenth of an Angstrom, i.e. they are relatively small. Larger displacements, of the order of magnitude of one Angstrom, are observed at point perturbations for which different atomic structures are stable, depending on external conditions as, for example, the position of the Fermi level. This phenomenon is observed at the DX-centers in GaAs and (Ga,Al)As mentioned above (see section 3.5 for more detail).
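The energy balance behind the Jahn-Teller effect can be illustrated with a toy model. The sketch below is not the calculation referred to in the text; it assumes a threefold orbitally degenerate level that splits linearly in a distortion coordinate q into a singlet (shift −2vq) and a doublet (shift +vq), so that the center of gravity is preserved, and it adds a harmonic elastic cost ½Kq². The coupling `v`, the force constant `k_el` and their values are hypothetical.

```python
def electronic_shift(n_electrons, v, q):
    """Change of the one-electron energy sum for a split threefold level.

    The singlet shifts by -2*v*q and the doublet by +v*q, so the weighted
    center of gravity (1*(-2vq) + 2*(vq)) / 3 stays at zero. Capacities
    include spin (2 and 4 electrons). Electrons fill the lowest levels first.
    """
    levels = sorted([(-2.0 * v * q, 2), (v * q, 4)])
    remaining, total = n_electrons, 0.0
    for shift, cap in levels:
        take = min(remaining, cap)
        total += take * shift
        remaining -= take
    return total


def total_energy(n_electrons, v, k_el, q):
    """Electronic gain plus elastic cost 0.5 * k_el * q**2 of the distortion."""
    return electronic_shift(n_electrons, v, q) + 0.5 * k_el * q * q


def relaxed_state(n_electrons, v, k_el, q_grid):
    """Return (q, energy) of the lowest total energy on a grid of distortions."""
    energies = [total_energy(n_electrons, v, k_el, q) for q in q_grid]
    best = min(range(len(q_grid)), key=energies.__getitem__)
    return q_grid[best], energies[best]


# Hypothetical parameters: v in eV/Angstrom, k_el in eV/Angstrom^2.
Q_GRID = [i / 1000 for i in range(-1000, 1001)]
for n in (1, 2, 3, 6):
    q_min, e_min = relaxed_state(n, v=1.0, k_el=10.0, q_grid=Q_GRID)
    print(f"n = {n}: q = {q_min:+.3f} Angstrom, energy change = {e_min:+.3f} eV")
```

In this model a partially occupied level always gains energy by distorting, while a completely filled (or empty) level gains nothing, in line with the argument above.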
3.1.3
Formation of structural defects

All of the above-mentioned defects of ideal crystal structure may in fact exist in real semiconductors. There are various reasons for their occurrence, the most important and general being the second law of thermodynamics. According to this law, the thermodynamic equilibrium state of a crystal at temperature $T$ and pressure $p$ is characterized by a minimum of the Gibbs free energy $G = H - TS$. Here $H$ is the enthalpy and $S$ the entropy of the system. The entropy of a macroscopic state is proportional to the logarithm of its microscopic realization probability (or thermodynamic probability), and the proportionality factor is given by Boltzmann's constant $k$. The system considered here is the totality of the atoms of the crystal. Let us assume that there are only chemically correct atoms, and that these are randomly distributed in space. The ideal crystal is then formed when the atoms move to the regular crystal sites. This corresponds to a very special state of the system. It is extremely improbable compared to the large number of states in which deviations from the ideal configuration of atoms appear as described above. The entropy $S$ of the ideal crystalline state is smaller, therefore, than that of the imperfect crystalline states. The minimum of the enthalpy $H$ is reached in the ideal crystalline state. Since the entropy $S$ takes larger values in other states, the minimum of $H$ is not necessarily coincident with a minimum of the Gibbs free energy $G$. Depending on temperature, a minimum of $G = H - TS$ can also be attained for a state of the crystal which is less advantageous with respect to enthalpy, but more advantageous with respect to entropy, i.e. for a state with a nonvanishing concentration of structural defects. As a concrete example, we
consider a vacancy in an elemental semiconductor. In an ideal crystal, the periodicity region contains a number $J$ of identical atoms. A vacancy is created when one of the atoms is removed from the crystal. Altogether, there are $J$ different possibilities for such a removal, as many as there are atoms. For each realization of the ideal crystal one has, therefore, $J$ realizations of the crystal with a vacancy. If $n$ independent vacancies exist, the number of realizations is $J(J-1)\cdots(J-n+1)/n!$. The gain of entropy $\Delta S$ compared to the ideal crystal therefore amounts to

$$\Delta S = k \ln\left[\frac{J!}{n!\,(J-n)!}\right].$$
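As a quick numerical illustration of this counting argument, the sketch below evaluates $\Delta S / k = \ln[J!/(n!(J-n)!)]$ once via the log-gamma function and once with Stirling's formula $\ln(m!) \approx m \ln m - m$. The values of `J` and `n` are arbitrary illustrative choices, not data from the text.

```python
import math


def entropy_gain_exact(J, n):
    """Delta S / k = ln[ J! / (n! (J-n)!) ], evaluated via log-gamma."""
    return math.lgamma(J + 1) - math.lgamma(n + 1) - math.lgamma(J - n + 1)


def entropy_gain_stirling(J, n):
    """Same quantity with Stirling's formula ln(m!) ~ m ln(m) - m."""
    def ln_fact(m):
        return m * math.log(m) - m if m > 0 else 0.0
    return ln_fact(J) - ln_fact(n) - ln_fact(J - n)


J, n = 10**6, 10**3  # illustrative numbers of sites and vacancies
exact = entropy_gain_exact(J, n)
approx = entropy_gain_stirling(J, n)
print(exact, approx, abs(exact - approx) / exact)
```

For macroscopic numbers of sites and vacancies the two expressions agree to a small fraction of a percent, which is what justifies the use of Stirling's formula when the free energy is minimized.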
In a more rigorous treatment, the entropy of lattice oscillations has yet to be considered in $\Delta S$, or more accurately, the change of this entropy because of the alteration of the spectrum of lattice oscillations in the presence of the vacancies. We will neglect this effect in what follows. What must, however, be taken into account is the enthalpy $H_f$ necessary for the formation of a vacancy (at constant pressure). For $n$ independent vacancies the enthalpy requirement is $nH_f$. Altogether, the gain of Gibbs free energy $\Delta G$ due to the formation of $n$ vacancies becomes

$$\Delta G = nH_f - kT \ln\left[\frac{J!}{n!\,(J-n)!}\right].$$

This expression has to be minimized with respect to $n$ for a fixed value of $T$ and $p$. According to Stirling's formula, $\ln(n!) \approx n\ln(n) - n$, the minimum condition may be written as

$$H_f - kT \ln\left(\frac{J-n}{n}\right) = 0.$$

For $n \ll J$, the expression

$$n = J \exp\left(-\frac{H_f}{kT}\right) \qquad (3.5)$$

follows from this relation. The enthalpy of formation $H_f$ of a vacancy typically amounts to several eV. Assuming $H_f = 2.5$ eV and $T = 1400$ K (roughly the crystallization temperature of Si), one gets the value $n = J \times 5 \times 10^{-10}$ from expression (3.5). Because $J = 5 \times 10^{22}$ cm$^{-3}$ (for Si), a vacancy concentration of about $10^{13}$ cm$^{-3}$ follows. Similar concentrations are obtained for other point defects. These values refer, by their derivation, to the high temperatures assumed above, at which the crystal is grown. Cooling down to room temperature often does not, however, substantially change the defect concentrations. The defects are frozen in. From this one may conclude that the presence of a considerable number of point defects in
crystals is essentially unavoidable. The Si single-crystal bars used in microelectronics as a material base, in fact, contain vacancies and interstitials in the above estimated concentration of about $10^{13}$ cm$^{-3}$.

Incorporation of impurity atoms

Chemical point perturbations are not as unavoidable as structural defects. Of course, the laws of thermodynamics in this case also act in a direction which leads away from the absolute chemical purity of the ideal crystal. This tendency, however, can only be effective to the extent that chemically 'wrong' atoms are available in the raw materials and the chamber in which the crystal is grown. The state of a more or less homogeneous distribution of impurity atoms in the whole chamber, i.e. the state in which the growing crystal also contains impurity atoms, has smaller Gibbs free energy than that of the chemically absolutely pure crystal with all the impurity atoms confined to the remainder of the chamber. However, by cleaning the raw materials and the growth chamber (in thermodynamic terms, by an extraction of entropy) the number of chemically 'wrong' atoms can be reduced. Theoretically, no limit exists for the degree of purity achievable; practically, such a limit is set by the cleaning expenses, of course. Therefore, one reduces the concentration of impurity atoms only to the level which is absolutely necessary for the application under consideration. A boundary for the achievable concentration of impurity atoms in a crystal exists, as a rule, in the form of an upper limit. Under thermodynamic equilibrium conditions it is given by the solubility of the corresponding elements in the semiconductor crystal. The solubility increases with rising temperature according to a law which is similar to that for the vacancy concentration of equation (3.5), provided $H_f$ is understood as the formation enthalpy of the impurity.
The latter depends on the chemical nature of the impurity atom and host crystal, and it also differs for incorporation at different crystal sites, such as substitutional or interstitial ones. High solubility values are achieved, as a rule, if the element of the impurity atom is chemically similar to at least one of the elements of the crystal. Incorporation at substitutional sites is preferred in this case. If the crystal consists of elements of the main groups, which is the case for the majority of known semiconductor materials, one has relatively high solubility, especially for impurity atoms of these groups themselves. In the extreme case, an alloy will be formed with the impurity atoms. The dissolving of Al in GaAs, for example, results in a (Ga,Al)As alloy. In such cases, the solubility equals the concentration of the host crystal atoms, i.e. it is of the order of $10^{22}$ cm$^{-3}$. For P in Si or Ge in GaAs, solubility values of $10^{21}$ cm$^{-3}$ are reached, which lie just a little below. For transition metal elements, which differ strongly chemically from the main group elements, the solubilities in semiconductors made of such elements are substantially smaller, of the order of $10^{14}$ cm$^{-3}$.
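The equilibrium estimate based on expression (3.5) can be checked numerically. The sketch below uses the values quoted in the text ($H_f = 2.5$ eV, $T = 1400$ K, $J = 5 \times 10^{22}$ cm$^{-3}$) and, in addition, minimizes $\Delta G / kT = n H_f/kT - \ln[J!/(n!(J-n)!)]$ by brute force for a small model crystal. The model values `J_small` and `h_small` are illustrative assumptions, as are the function names.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K


def vacancy_fraction(h_f, temperature):
    """Equilibrium fraction n/J = exp(-H_f / kT), valid for n << J."""
    return math.exp(-h_f / (K_B * temperature))


def delta_g_over_kt(n, J, h):
    """Delta G / kT = n*h - ln[J!/(n!(J-n)!)], with h = H_f / kT."""
    ln_w = math.lgamma(J + 1) - math.lgamma(n + 1) - math.lgamma(J - n + 1)
    return n * h - ln_w


# Values quoted in the text for Si.
fraction = vacancy_fraction(2.5, 1400.0)
concentration = 5e22 * fraction  # vacancies per cm^3
print(f"n/J = {fraction:.2e}, concentration = {concentration:.2e} cm^-3")

# Brute-force minimization on a small model crystal (illustrative values).
J_small, h_small = 100_000, 5.0
n_min = min(range(J_small + 1),
            key=lambda n: delta_g_over_kt(n, J_small, h_small))
print(n_min, J_small / (math.exp(h_small) + 1), J_small * math.exp(-h_small))
```

With these inputs the fraction comes out near $10^{-9}$, so the concentration lies in the $10^{13}$ to $10^{14}$ cm$^{-3}$ range, the same order of magnitude as the estimate in the text. The brute-force minimum agrees with $J/(e^{H_f/kT}+1)$, which reduces to the exponential law for $n \ll J$. The same function gives the temperature trend of the solubility if $H_f$ is read as the formation enthalpy of the impurity.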
There are various procedures to introduce impurity atoms into crystals. The easiest one exploits the growth of the crystal from the melt: the impurities are added to the melt of the host material, then the melt is cooled down to cause crystallization. Another method proceeds in the solid state: the impurity atoms are diffused in from an outer source. To speed up diffusion, the crystal is heated. Besides the two processes mentioned, which introduce impurity atoms under equilibrium conditions, there are also non-equilibrium procedures to reach this goal, among them the so-called ion implantation. In the latter, the impurity atoms, after they have initially been ionized and then accelerated by an electric field, penetrate into the crystal where they are implanted. This implantation process must be followed by heating of the crystal in order to heal the defects which arise during implantation (annealing). The heating also makes it possible for the impurity atoms, which were initially placed at sites relatively indiscriminately, to reach their equilibrium sites. If the concentration of implanted impurity atoms is larger than the equilibrium solubility, the surplus atoms are later excluded from the crystal to a certain extent (precipitation). In many cases, a relatively large number of surplus atoms remains in the crystal, which then is in a frozen-in non-equilibrium state.

Migration and diffusion of point perturbations

Point perturbations interact with the atoms of the crystal surrounding them. Both the point perturbations themselves and the surrounding atoms are in permanent motion because of the lattice oscillations. This results in more or less chaotic forces on the point perturbations. Due to the effects of these forces, the perturbations move over to equivalent crystal sites in adjacent primitive unit cells. This is referred to as a migration of point
perturbations. The migration of a vacancy proceeds such that an equivalent neighboring atom moves to the vacancy site and leaves a vacancy behind at its former position. The latter vacancy is displaced with respect to the original vacancy. A substitutional impurity atom migrates mainly with the aid of a vacancy, namely by filling a neighboring vacancy site. In contrast to this, an interstitial impurity atom does not need the help of a vacancy for its migration; it can move directly to an adjacent equivalent crystal site. The initial and final states of an elementary migration act are relatively stable states of the crystal. Between them, intermediate states with larger total energies occur. This means that an energy barrier has to be overcome in an elementary migration act. One calls it the migration barrier $E_m$. In order for migration to occur, the lattice oscillations must provide the energy $E_m$ or, if the pressure $p$ is held constant rather than the volume $V$, as commonly occurs, the enthalpy $H_m = E_m + pV$. In this way, the migration rate becomes proportional to the activation factor $\exp(-H_m/kT)$. For the migration of a vacancy in Si and Ge, $H_m$ amounts to about 1 eV, and for the migration of a self-interstitial in these materials, it is smaller than 0.25 eV.

Migration constitutes the elementary microscopic event underlying the macroscopic process of diffusion. The latter occurs when the distribution of the migrating point perturbations is spatially inhomogeneous. For not-too-large gradients of the defect concentrations, Fick's first law may be applied in many cases. Correspondingly, the diffusion of a particular kind of point perturbation may be traced back to just one material constant, namely the diffusion coefficient $D$. The temperature dependence of the latter also exhibits an activation behavior,

$$D = D_0 \exp\left(-\frac{Q}{kT}\right), \qquad (3.6)$$
where $D_0$ is the limit of $D$ at high temperatures. In the case of diffusion through interstitial sites, the activation enthalpy $Q$ equals the corresponding migration enthalpy $H_m$. For diffusion through vacancies, $Q$ is given by the sum of the migration enthalpy $H_m$ and the formation enthalpy $H_f$ for a vacancy. Substitutional impurity atoms preferentially diffuse through vacancies. In Table 3.4 the $Q$-values for substitutional and interstitial diffusion of some impurity atoms in Si are listed. For P in Si, which diffuses substitutionally, the diffusion coefficient at 1500 K is about $10^{-12}$ cm$^2$ s$^{-1}$. Au atoms in Si preferentially diffuse through interstitial sites. Their diffusion coefficient at 1500 K of about $2 \times 10^{-6}$ cm$^2$ s$^{-1}$ exceeds that for P in Si by 6 orders of magnitude. The reason for this huge difference is the substantially smaller activation enthalpy for interstitial diffusion as compared to substitutional.
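[Table 3.4: Activation enthalpies $Q$ for substitutional and interstitial diffusion of impurity atoms in Si; surviving row labels: Ga, Au, Cu.]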
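The sensitivity of the diffusion coefficient to the activation enthalpy can be made concrete with a short sketch of the Arrhenius law $D = D_0 \exp(-Q/kT)$. The prefactor and the two $Q$ values below are hypothetical placeholders chosen only to show the trend; they are not the entries of Table 3.4.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K


def diffusion_coefficient(d0, q, temperature):
    """Arrhenius law D = D0 * exp(-Q / kT); D carries the units of d0."""
    return d0 * math.exp(-q / (K_B * temperature))


def enthalpy_gap_for_ratio(ratio, temperature):
    """Difference in Q (eV) giving the stated ratio of D values at equal D0."""
    return K_B * temperature * math.log(ratio)


T = 1500.0  # K, the temperature used in the text
# Hypothetical parameters (assumptions, for illustration only).
d_subst = diffusion_coefficient(d0=1.0, q=3.5, temperature=T)
d_inter = diffusion_coefficient(d0=1.0, q=1.7, temperature=T)
ratio = d_inter / d_subst
gap = enthalpy_gap_for_ratio(1e6, T)
print(f"ratio = {ratio:.2e}")
print(f"gap for a factor 10^6 = {gap:.2f} eV")
```

At 1500 K a difference of somewhat less than 2 eV in the activation enthalpy is already enough to separate two diffusion coefficients by six orders of magnitude, assuming equal prefactors.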
3.1.4
With an increase in the number of point perturbations, a tendency arises for these perturbations to arrange themselves on lines or planes of macroscopic size. Then one has 1- and 2-dimensional perturbations, respectively, in contrast to the 0-dimensional ones considered above. In practice, ordering on lines and planes occurs only for structural point perturbations. The most important representatives of a 1-dimensional perturbation are step and screw dislocations. At a step dislocation, the occupation of a particular lattice plane breaks down along a line of lattice points: the plane is occupied on one side of this line (representing the step dislocation line), but not on the other side (see Figure 3.2). If one calculates the line integral over the path shown in Figure 3.2 (left-hand side), then the result will not be equal to zero, as would be the case for an ideal crystal, but equal to a nonzero vector perpendicular to the dislocation line. It is called the Burgers vector. For a screw dislocation, one has to view the crystal as being cut by a semi-infinite lattice plane, which is bounded by a line of lattice points representing the screw line. Then the lattice planes left and right of the cutting plane are shifted parallel to the screw dislocation line by a lattice vector. After that one reconnects the two crystal halves (see Figure 3.2, right-hand side). The line integral about the screw dislocation line yields a vector parallel to this line. In a dislocation, deviations from crystal structure are, indeed, not limited to a line as one could initially assume. Slight atomic displacements (strain) occur in a finite macroscopic environment. The microscopic core of the perturbation is limited, however, to the dislocation line. The two most important examples of 2-dimensional perturbations are stacking faults and grain boundaries.
In a crystal with stacking faults, the various lattice planes carrying the atoms are not stacked in the same way as in the ideal crystal, but certain lattice planes are twisted by an angle. Grain boundaries occur in crystals which in fact consist of two differently oriented half crystals: the grain boundary is the lattice plane at which these two half crystals meet. In a sense, surfaces of crystals may also be considered as 2-dimensional perturbations.
Having surveyed the atomic structure of real semiconductor crystals, we will now proceed to the electronic structure of such crystals. In this, we restrict our considerations to crystals with compositional or structural point perturbations; that is, we exclude 1- and 2-dimensional perturbations. This is done in recognition of the fact that only point perturbations can play an important positive role in semiconductor devices, while perturbations of higher dimensions are generally disruptive. Fortunately, the latter can also be more easily avoided than 0-dimensional perturbations, because they are less effective in increasing entropy. The silicon bars used in microelectronics are now grown practically free of dislocations and grain boundaries. In the following section we will formulate the Schrödinger equation for the one-electron states of a crystal with a point perturbation.
3.2
In regard to its general form, the one-electron Schrödinger equation for a crystal with point perturbations does not substantially differ from the one-electron Schrödinger equation for an ideal crystal. The reason for this is that the derivation of the latter equation in section 2.2 never actually involved the periodicity of the ideal lattice. Only the one-electron potential $V^{\rm pert}(\mathbf{x})$ of the perturbed crystal differs from the corresponding potential $V(\mathbf{x})$ of the ideal crystal given in equation (2.76). However, just like the latter, the potential $V^{\rm pert}(\mathbf{x})$ of the perturbed crystal is the sum of three contributions: the potential $V_c^{\rm pert}(\mathbf{x})$ caused by the atomic cores of the perturbed crystal, the Hartree potential $V_H^{\rm pert}(\mathbf{x})$ of the valence electrons, and the exchange-correlation potential $V_{XC}^{\rm pert}(\mathbf{x})$ of these electrons. In analogy to equation (2.76) we therefore have

$$V^{\rm pert}(\mathbf{x}) = V_c^{\rm pert}(\mathbf{x}) + V_H^{\rm pert}(\mathbf{x}) + V_{XC}^{\rm pert}(\mathbf{x}). \qquad (3.7)$$

With this, the one-electron Schrödinger equation for the wave function $\psi_\nu(\mathbf{x})$ of an electron of the perturbed crystal reads

$$\left[\frac{\mathbf{p}^2}{2m} + V^{\rm pert}(\mathbf{x})\right]\psi_\nu(\mathbf{x}) = E_\nu\,\psi_\nu(\mathbf{x}). \qquad (3.8)$$
3.2.1
First of all, we discuss the potential $V_c^{\rm pert}(\mathbf{x})$ of the electron-core interaction. We decompose it into a sum of the core potential $V_c(\mathbf{x})$ for the ideal crystal, and a potential $V'_c(\mathbf{x})$ which describes the change of the core potential due to the point perturbation. Thus,

$$V_c^{\rm pert}(\mathbf{x}) = V_c(\mathbf{x}) + V'_c(\mathbf{x}). \qquad (3.9)$$
In further discussion, we assume that only one point perturbation exists in the crystal. The center of this perturbation will be taken as the origin of our Cartesian coordinate system, and we proceed on the assumption that the potential $V'_c(\mathbf{x})$ falls off with increasing distance from the origin and finally approaches zero:

$$V'_c(\mathbf{x}) \to 0 \quad \text{for} \quad |\mathbf{x}| \to \infty. \qquad (3.10)$$
The point perturbations listed in section 3.1 differ as to how fast this decay proceeds. We consider, initially, the case of a substitutional impurity atom whose core contains almost the same numbers of protons and neutrons as the core of the host atom. With this requirement, the two atomic cores differ mainly through their different charges. One refers to such impurities as isocoric impurity atoms. This case occurs, for example, if a P atom with only one additional proton and one additional neutron substitutes a host atom of a Si crystal (as illustrated in Figure 3.3). We denote the number of (positive) elementary charges of the core of the impurity atom by $Z_I$, and the number of (positive) elementary charges of the core of the host atom by $Z_H$. Because of charge neutrality of the individual atoms, $Z_I$ and $Z_H$ are simultaneously also the numbers of valence electrons of these atoms. The potential energy of a valence electron of an impurity atom of the type described differs from the potential energy of an electron of the host atom, above all, by the change of the Coulomb potential. If one considers the cores to be pointlike and neglects spatial dispersion of dielectric screening, then one has, approximately,
$$V'_c(\mathbf{x}) = -\frac{(Z_I - Z_H)\,e^2}{\epsilon\,|\mathbf{x}|}, \qquad (3.11)$$

where $\epsilon$ is the static dielectric constant of the semiconductor material. Deviations from this perturbation potential are to be expected in close proximity to the impurity atom, on the one hand, because there the spatial dispersion of dielectric screening cannot be neglected and, on the other hand, because the core of the impurity atom differs not only by its charge number from that of the host atom, but also in other respects. In fact, each core has a finite spatial extension because of its spatially extended core electrons (the nucleus of the core may be treated as a point charge). In close proximity to the center, the core electrons give rise to additional forces, besides the electrostatic point charge forces already counted in (3.11). These additional forces are caused by higher moments of the core electron charge distribution, and also by exchange and correlation effects. If the cores of the host and impurity atoms differ, which always happens if the atoms are not identical, the additional core forces will also differ. Both effects, the spatial dispersion of screening and the differences of additional core forces, are jointly termed central cell corrections.

The perturbation potential (3.11) is distinguished in that, with the exclusion of a close environment of the impurity atom, its variation over a primitive unit cell is relatively small, which implies that it remains significant over distances from the impurity atom that are large compared to the lattice constant. In this context, it is called a long-range potential. We may state, therefore, that isocoric impurity atoms are approximately described by smooth or long-range perturbation potentials.

The potentials which apply for non-isocoric impurity atoms are different. Consider first the case in which the charge $eZ_I$ of the impurity atom core equals the charge $eZ_H$ of the host atom core.
This does not necessarily mean that the cores of the two atoms must have the same numbers of core electrons, because different numbers of protons of the atomic nuclei may compensate the difference of core electron charge. The requirement of equal core charges implies, however, that the numbers of valence electrons of the two atoms must coincide, as described by the term isovalent impurity atoms. Examples of non-isocoric isovalent impurity atoms are C atoms in Si or Ge crystals, or N atoms substituting P atoms in GaP. The potential energy of an electron at a non-isocoric, isovalent impurity atom differs from the potential energy at the host atom not through the screened Coulomb potential (3.11), which vanishes for $Z_I = Z_H$, but through the different core electron shells in the two cases. The perturbation potential $V'_c(\mathbf{x})$ accordingly contains only electrostatic contributions of higher moments of the core electron charge distribution difference, as well as exchange and correlation contributions due to this difference. Both kinds of contributions decay more rapidly with
increasing distance from the center than does the Coulomb potential of a point charge; they are therefore described as short-range. These are also the potential contributions which, in the context of the isocoric impurity atoms above, gave rise to central cell corrections. The exact determination of the perturbation potential is a problem which, like the determination of the periodic core potential of an ideal crystal, can ultimately be solved only by numerical calculations. These show, in fact, that the perturbation potentials of isovalent impurity atoms differ from zero substantially only over a distance of a few lattice constants. Consequently, we have

$$V'_c(\mathbf{x}) \begin{cases} \neq 0 & \text{for } |\mathbf{x}| \lesssim \text{a few lattice constants}, \\ \approx 0 & \text{for } |\mathbf{x}| > \text{a few lattice constants}. \end{cases} \qquad (3.12)$$

Figure 3.3: Illustration of the origin of long-range (left-hand side) and short-range (right-hand side) core perturbation potentials.
Short-range perturbation potentials apply not only to isovalent, non-isocoric substitutional impurities, but also to structural defects such as vacancies or interstitials. In the case of a vacancy, the occurrence of a short-range perturbation potential is particularly obvious. The removal of a Si atom from the chain of Si atoms in Figure 3.3a yields the potential profile of $V_c(\mathbf{x}) + V'_c(\mathbf{x})$ as depicted in Figure 3.3b, and the perturbation potential has the form shown in Figure 3.3c. The latter approximately represents the negative of the core potential $v_c(\mathbf{x})$ of the missing atom.
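To get a feeling for how slowly the long-range contribution decays compared with the scale of a unit cell, the following sketch evaluates a screened point-charge potential energy of the form $-(Z_I - Z_H)e^2/(\epsilon |\mathbf{x}|)$ in Gaussian units. The numerical inputs (P in Si with $Z_I = 5$, $Z_H = 4$, $\epsilon = 11.7$, lattice constant 5.43 Angstrom) are assumptions introduced for this illustration and are not taken from the text.

```python
E2 = 14.3996  # e^2 in eV * Angstrom (Gaussian units)


def screened_coulomb(r, z_impurity, z_host, epsilon):
    """Long-range perturbation potential energy in eV at distance r (Angstrom)."""
    if r <= 0:
        raise ValueError("r must be positive")
    return -(z_impurity - z_host) * E2 / (epsilon * r)


# Assumed illustrative inputs for P on a Si site.
EPS_SI, A_SI = 11.7, 5.43
for cells in (1, 2, 5, 10):
    r = cells * A_SI
    print(cells, round(screened_coulomb(r, 5, 4, EPS_SI), 4))

# An isovalent impurity (equal core charges) has no Coulomb contribution.
print(screened_coulomb(A_SI, 4, 4, EPS_SI))
```

Under these assumptions the potential energy is still a few hundredths of an eV at ten lattice constants from the center, whereas a short-range potential in the sense of (3.12) has essentially vanished there.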
Finally, we forgo the requirement that the core charges of the two non-isocoric atoms be the same. Examples of this most general case are Cd atoms with their 2-fold core charge in a crystal of Si atoms with their 4-fold core charge, or Sn atoms with their 4-fold core charge on a Ga site in GaAs, thereby substituting a triply positively charged host atom core. In these cases, the perturbation potential represents a superposition of the screened Coulomb potential (3.11), which accounts for the different core charges, and a short-range potential which takes account of all remaining differences between the cores as well as aspects of the spatial dispersion of screening beyond those accounted for in equation (3.11). The effects of the two potential contributions are not independent of each other. The same short-range potential has a greater effect if the Coulomb potential, due to a larger core charge difference between the impurity and host atoms, is stronger. This comes about because the stronger Coulomb potential pulls the electrons closer to the core, where the short-range potential is essential.

Just like the electron-core interaction potential, the Hartree potential $V_H(\mathbf{x})$ and the exchange-correlation potential $V_{XC}(\mathbf{x})$ also undergo changes in the presence of point perturbations. Below, we describe these changes for the Hartree potential $V_H^{\rm pert}(\mathbf{x})$ and the exchange potential $V_X^{\rm pert}(\mathbf{x})$ of the Hartree-Fock approximation. Similar results can be derived for the exchange-correlation potential $V_{XC}^{\rm pert}(\mathbf{x})$ of the LDA method.
3.2.2
Hartree potential
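The Hartree potential $V_H^{\rm pert}(\mathbf{x}_i)$ of the $i$-th electron is, as before (see relation 2.49), given by the expression

$$V_H^{\rm pert}(\mathbf{x}_i) = e^2 \sum_{k \neq i} \int d^3x'\, \frac{|\psi_{\nu_k}(\mathbf{x}')|^2}{|\mathbf{x}_i - \mathbf{x}'|}. \qquad (3.13)$$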
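However, the one-particle states $\psi_{\nu_k}(\mathbf{x})$ here are not those of the ideal crystal, but those of the perturbed crystal, and the summation runs over the one-particle states $\nu_k$ occupied in the ground state of the perturbed crystal. From the physical point of view it is quite clear that the perturbed crystal has two kinds of stationary one-particle states: first, those with energy eigenvalues which were already allowed for the ideal crystal, i.e. those within the bands, and, secondly, those with energy eigenvalues within the energy gap between the valence and conduction bands. The states of the first kind represent pure Bloch states only in zero-order approximation with respect to the perturbation potential, while, in higher approximations, superpositions of Bloch states occur. These states remain extended over the whole crystal, whereas the states of the second kind are localized at the center. Accordingly, the Hartree potential may be decomposed into a part due to the extended states and a part due to the localized states,

$$V_H^{\rm pert}(\mathbf{x}_i) = V_H^{\rm extd}(\mathbf{x}_i) + V_H^{\rm loc}(\mathbf{x}_i), \qquad (3.14)$$

with

$$V_H^{\rm extd}(\mathbf{x}_i) = e^2 \sum_{k}^{\rm extd} \int d^3x'\, \frac{|\psi_{\nu_k}(\mathbf{x}')|^2}{|\mathbf{x}_i - \mathbf{x}'|} \qquad (3.15)$$

and

$$V_H^{\rm loc}(\mathbf{x}_i) = e^2 \sum_{k \neq i}^{\rm loc} \int d^3x'\, \frac{|\psi_{\nu_k}(\mathbf{x}')|^2}{|\mathbf{x}_i - \mathbf{x}'|}. \qquad (3.16)$$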
Here we assume that the electron $i$ on which the potential acts is located at the center. This assumption will also be maintained below. The localized part $V_H^{\rm loc}(\mathbf{x}_i)$ of the Hartree potential depends on the number $n$ of electrons at the center. There are two sources of this dependence. First, the number of electrons which enter changes with $n$: $V_H^{\rm loc}(\mathbf{x}_i)$ becomes more repulsive if $n$ is large, and less so if $n$ is small. Second, the wave functions of the localized occupied states $\nu_k$, over which the sum in (3.16) is extended, depend on $n$. They become more localized if electrons are removed from the center, and less localized if electrons are added. One says that the wave functions relax. For reasons similar to those causing the localized part $V_H^{\rm loc}(\mathbf{x}_i)$ of the Hartree potential of the perturbed crystal to depend on the number $n$ of electrons at the center, the extended part $V_H^{\rm extd}(\mathbf{x}_i)$ also differs from that of the ideal crystal, firstly, because the number $N$ of extended electrons is decreased by the localized ones and, secondly, because the states of the extended electrons relax. The first change causes a relative potential correction of order of magnitude $1/N$ and thus can be omitted (a comparable
approximation was employed previously in the derivations of the Hartree potential and of Koopmans' theorem in section 2.2). The change of the extended states is essential, however, because all $N - n$ occupied states are affected. In this connection, the probability amplitude of the extended states is reduced close to the localized electrons of the center because of Coulomb repulsion. Hence, a positive excess charge arises in the vicinity of the localized electrons which screens the Coulomb potential of these electrons. This change may be accounted for by replacing the localized part $V_H^{\rm loc}(\mathbf{x}_i)$ of the perturbed Hartree potential by a screened potential $V'_H(\mathbf{x}_i)$, and simultaneously substituting the extended part $V_H^{\rm extd}(\mathbf{x}_i)$ of the perturbed Hartree potential by the Hartree potential $V_H(\mathbf{x}_i)$ of the ideal crystal. Accordingly, we set $V_H^{\rm loc}(\mathbf{x}_i) \to V'_H(\mathbf{x}_i)$ and $V_H^{\rm extd}(\mathbf{x}_i) \to V_H(\mathbf{x}_i)$. Then it follows

$$V_H^{\rm pert}(\mathbf{x}_i) = V_H(\mathbf{x}_i) + V'_H(\mathbf{x}_i), \qquad (3.17)$$

where, by definition, $V'_H(\mathbf{x}_i)$ is given by the expression

$$V'_H(\mathbf{x}_i) = e^2 \sum_{k \neq i}^{\rm loc} \int d^3x'\, \frac{\epsilon^{-1}(\mathbf{x}_i, \mathbf{x}')}{|\mathbf{x}_i - \mathbf{x}'|}\, |\psi_{\nu_k}(\mathbf{x}')|^2. \qquad (3.18)$$
Exchange potential
As we know from section 2.2, the exchange potential describes the Coulomb interaction with the exchange hole which occurs because two electrons of the same spin cannot reside at the same position. For the localized electrons of the perturbed crystal, the positional uncertainty is smaller than that of the extended ones. Accordingly, their exchange interaction will be stronger than that of the extended electrons. It is therefore again expedient to decompose the entire exchange potential $V_X^{\rm pert}(\mathbf{x}_i)$, which an electron $i$ localized at the center feels, into the two parts $V_X^{\rm extd}(\mathbf{x}_i)$ and $V_X^{\rm loc}(\mathbf{x}_i)$ of, respectively, the $(N - n)$ extended and $n$ localized electrons. We replace the extended part by the exchange potential $V_X(\mathbf{x}_i)$ of the ideal crystal and simultaneously substitute the localized part by an effective exchange potential $V'_X(\mathbf{x}_i)$. With this replacement, we have

$$V_X^{\rm pert}(\mathbf{x}_i) = V_X(\mathbf{x}_i) + V'_X(\mathbf{x}_i). \qquad (3.19)$$
(3.20)
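where the summation over $k$ includes only particles whose spin $\sigma_k$ equals the spin $\sigma_i$ of particle $i$.

Substituting the three potential parts (3.9), (3.17) and (3.19) into the one-electron Schrödinger equation of the perturbed crystal, the various terms may be arranged such that the effective one-electron potential $V(\mathbf{x})$ of the ideal crystal (2.60) occurs. The equation reads

$$\left[\frac{\mathbf{p}^2}{2m} + V(\mathbf{x}_i) + V'_c(\mathbf{x}_i) + V'_H(\mathbf{x}_i) + V'_X(\mathbf{x}_i)\right]\psi_{\nu_i}(\mathbf{x}_i) = E_{\nu_i}\,\psi_{\nu_i}(\mathbf{x}_i). \qquad (3.21)$$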
The three perturbative potential parts in this equation have different effects. In general, the core perturbation potential $V'_c(\mathbf{x}_i)$ is the strongest. The occurrence of localized states at the center, the central feature of our considerations above, is mainly due to this potential contribution. If only one electron is localized at the center, the two other potential parts $V'_H(\mathbf{x}_i)$ and $V'_X(\mathbf{x}_i)$ due to electron-electron interactions vanish completely. Generally, they are nonzero and lead to corrections of the localized eigensolutions of the Schrödinger equation (3.21) with $V'_c(\mathbf{x})$ retained as the only perturbation potential. These corrections will now be estimated by means of perturbation theory.
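Perturbation theory

With $V'_c(\mathbf{x})$ as the only perturbation potential, the Schrödinger equation reads

$$\left[H + V'_c(\mathbf{x})\right]\psi_\nu(\mathbf{x}) = E_\nu\,\psi_\nu(\mathbf{x}), \qquad (3.22)$$

where $H = \mathbf{p}^2/2m + V(\mathbf{x})$, as before, signifies the one-electron Hamiltonian operator of the ideal crystal. The index $i$ is omitted because all electrons feel the same core perturbation potential. If only one particle is localized at the center, equation (3.22) is exact.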
Hartree energy
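The energy correction due to the Hartree potential for an electron $i$ in the localized state $\psi_{\nu_i}$ is given in first-order perturbation theory by the expectation value $\langle \psi_{\nu_i} | V'_H | \psi_{\nu_i} \rangle$. Using (3.18), this expression may be written as

$$\langle \psi_{\nu_i} | V'_H | \psi_{\nu_i} \rangle = e^2 \sum_{k \neq i}^{\rm loc} \int d^3x\, |\psi_{\nu_i}(\mathbf{x})|^2 \int d^3x'\, \frac{\epsilon^{-1}(\mathbf{x}, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\, |\psi_{\nu_k}(\mathbf{x}')|^2. \qquad (3.23)$$

We extract the factor $\epsilon^{-1}(\mathbf{x}, \mathbf{x}')\,|\mathbf{x} - \mathbf{x}'|^{-1}$ from the $\mathbf{x}'$-integral by evaluating it at an average position $\bar{\mathbf{x}}$ in place of $\mathbf{x}'$. The remaining $\mathbf{x}'$-integral yields 1 because of the normalization of the wave function $\psi_{\nu_k}(\mathbf{x}')$. The integral over $\mathbf{x}$, multiplied by $e^2$, will be denoted by $U_{\nu_i}$, i.e. we set

$$U_{\nu_i} = e^2 \int d^3x\, |\psi_{\nu_i}(\mathbf{x})|^2\, \frac{\epsilon^{-1}(\mathbf{x}, \bar{\mathbf{x}})}{|\mathbf{x} - \bar{\mathbf{x}}|}. \qquad (3.24)$$

With this, the expectation value of the perturbation of the Hartree potential becomes

$$\langle \psi_{\nu_i} | V'_H | \psi_{\nu_i} \rangle = (n - 1)\, U_{\nu_i}. \qquad (3.25)$$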
Next we will calculate the corresponding energy correction due to the perturbation $V'_X(\mathbf{x})$ of the exchange potential (3.19). The latter depends on the spin of the electron because, contrary to the ideal crystal case where the numbers of electrons with 'spin-up' and 'spin-down' are equal in the ground state, the corresponding numbers $n_\uparrow$ and $n_\downarrow$ of electrons localized at the center can differ. In terms of the total number $n = n_\uparrow + n_\downarrow$ and the total spin projection $M_S = \frac{1}{2}(n_\uparrow - n_\downarrow)$ of the localized electrons, the two partial numbers $n_\uparrow$ and $n_\downarrow$ may be written in the form

$$n_\uparrow = \frac{1}{2}(n + 2M_S), \qquad n_\downarrow = \frac{1}{2}(n - 2M_S). \qquad (3.26)$$

For the expectation value $\langle \psi_{\nu_i} | V'_X | \psi_{\nu_i} \rangle$ of the perturbation of the exchange potential, equation (3.20) yields the expression

$$\langle \psi_{\nu_i} | V'_X | \psi_{\nu_i} \rangle = -(n_{\sigma_i} - 1)\, J \quad \text{for} \quad n_{\sigma_i} \geq 1, \qquad (3.27)$$

where $J$ is the so-called exchange integral of equation (3.28). The dependence of this integral on the orbital state $\nu' = \nu_k$ has been ignored in expression (3.28).
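The spin bookkeeping used here follows directly from $n = n_\uparrow + n_\downarrow$ and $M_S = \frac{1}{2}(n_\uparrow - n_\downarrow)$. The sketch below computes the partial occupations and, as an illustration, a first-order energy shift of the assumed form $(n - 1)U - (n_\sigma - 1)J$, in which each of the other $n - 1$ localized electrons contributes a Hartree term $U$ and each other electron of equal spin an exchange term $-J$. The numerical values of `u` and `j` and the function names are hypothetical.

```python
from fractions import Fraction


def spin_occupations(n, m_s):
    """Return (n_up, n_down) from n = n_up + n_down, M_S = (n_up - n_down)/2."""
    n_up = Fraction(n) / 2 + Fraction(m_s)
    n_down = Fraction(n) / 2 - Fraction(m_s)
    if (n_up < 0 or n_down < 0
            or n_up.denominator != 1 or n_down.denominator != 1):
        raise ValueError("n and M_S do not describe integer occupations")
    return int(n_up), int(n_down)


def first_order_shift(n, m_s, spin, u, j):
    """Assumed Hartree-plus-exchange shift for one localized electron."""
    n_up, n_down = spin_occupations(n, m_s)
    n_sigma = n_up if spin == "up" else n_down
    if n_sigma < 1:
        raise ValueError("no electron with this spin at the center")
    return (n - 1) * u - (n_sigma - 1) * j


U_EV, J_EV = 0.3, 0.05  # hypothetical values in eV
print(spin_occupations(3, Fraction(3, 2)))           # all three spins up
print(first_order_shift(2, 1, "up", U_EV, J_EV))     # parallel spins
print(first_order_shift(2, 0, "up", U_EV, J_EV))     # antiparallel spins
```

In this picture two electrons with parallel spins see a smaller net repulsion than two electrons with antiparallel spins, and for a single localized electron both corrections vanish.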
To solve the Schrödinger equation (3.21) explicitly, the number $n$ of electrons at the center must be known. This question will now be addressed.

Number of electrons at the center
To say that an electron is at the center means occupying a state which is IocaIized there. The number n. of electrons at the center equals, therefore, the number of electrons in localized states. This number is initially unknown. It can only be determined selfconsistenbly because the Hamihonian which determines the states depends on it. In thermodynamic equilibrium and at. temperature T = 0 K the number of electrons at the center may be found by selfconsistently counting the number of localized states which belong to energy eigenvalues below the Fermi level. Since the position of the Fermi level depends on the thermodynamic state of t,he semiconductor, the number n. of electrons at. the center is also a state dependent quantity. Below, we introduce some common not,ations used in this context and estimate n for particular impurity atoms whose chemical bonding to the host crystal is known. We assume the point perturbation to be elechically neutral, which means that in generating it, the same number of positive and negative elementary charges were added to the ideal crystal (as in the case of an impurity atom on an interstitial site): or removed from it (as in the case of a vacancy) : or removed and added (as in the case of a substitutional impurity atom). Depending on i t s environment in the crystal, the paint perturbation enters into a more 01 less strong chemical bonding with the surroundiug atoms. A phosphorus atom in a Sicrystal, for example, is chemically bound like a Si atom, i.e. the .Y and pstates of the P atom are involved in the formation of the valence band states of the crystal, and 4 of its 5 a2p3valenceelectrons are hosted by these states. Therefore, only 1 of the 5 valence electrons of the P atom i s available to occupy b c a l i z d states in the energy gap. T i means that n = 1 holds for the neutral hs phosphorus substitutional impurity in Si. If a Si atom in a Sicrystal is substiluld by a boron atom, one has T =  1. 
Therefore, 1 hole is available to occupy the states localized at the center. Similarly, n = −1 emerges if a P atom in a ZnS crystal replaces an S atom. The number V of electrons of an impurity atom X which are involved in its bonding to the crystal and which, thus, have energies in the valence band and not in the gap, is generally referred to as the oxidation state of the atom. The latter is denoted by $X^{V+}$. A phosphorus atom, for example, which is installed in a Si crystal on a regular crystal site, has the oxidation state $P^{4+}$. This notation, which looks like the number of elementary charges at the atom without being such, originates from its prior use for crystals and impurity atoms with purely ionic bonding. In the latter, the valence electrons in fact pass from the impurity atom to the surrounding crystal and the impurity atom is left behind as an $X^{V+}$ ion. This is no longer true if the bonding is covalent or partially covalent. Then some of the bonding electrons remain at the impurity atom, and the true number of its elementary charges differs from V. The n valence electrons of the impurity atom, which remain after the departure of the V electrons into the bonds with the crystal, and are available for occupation of the localized levels in the energy gap, are sometimes called active electrons. In the case of an impurity atom with $V_i$ valence electrons one has $n = V_i - V$ active electrons. So far, the charge state of the impurity atom prior to its incorporation into the crystal was taken to be neutral. However, positively or negatively charged ions can also be introduced into the crystal, and atoms introduced in a neutral charge state can have electrons removed or added within the crystal. Similar statements hold for structural defects. If a particular perturbation center X is not neutral for the lack of Q electrons, in the sense just specified, one says that the charge state of X is Q and writes $X^{(Q)}$ (where negative Q means surplus electrons). The charge state should not be confused with the oxidation state. In the purely ionic case, the distinction between the two can be expressed most easily: the charge state counts the elementary charges of the atom outside the crystal, and the oxidation state counts its elementary charges inside. Generally, the oxidation state of a center in charge state Q will be $X^{(V+Q)+}$ if the oxidation state of the neutral center is V. The reason is that, in this case, V + Q valence electrons are not available for occupation of localized states. The number n of electrons which are available, i.e. the number of active electrons, amounts to $V_i - (V + Q)$ if $V_i$, as before, denotes the number of valence electrons of the neutral impurity atom. The notations for the oxidation state and the charge state are summed up in the common symbol $X^{(V+Q)+(Q+)}$.
A singly ionized sulphur atom in a Si crystal, for example, is denoted by $S^{5+(1+)}$. Of the 6 valence electrons of the S atom, 6 − 5 = 1 are available for occupation of localized energy levels in the gap, instead of 2 electrons in the case of the neutral center $S^{4+(0+)}$. Transition metal (TM) atoms can be installed in Si and other tetrahedral semiconductor crystals both substitutionally and interstitially. The oxidation states V and, therefore, also the numbers n of electrons at the impurity atom, are different in the two cases. For substitutional TM atoms, V equals the number of electrons which are left at the atom after it is bound to the crystal. Interstitial impurity atoms are only weakly bound, i.e. the number V of electrons of the TM atom which occupy bonding valence states is almost zero. The oxidation state is therefore $TM^{0+}$. Oxidation and charge states coincide in this case, and the number n of electrons at a TM atom equals the number of its valence electrons. Interstitial Fe atoms in Si are found in the oxidation state $Fe^{0+}$, substitutional ones in the oxidation state $Fe^{4+}$. In the first case, n = 8 (six electrons in 3d-orbitals and two in 4s-orbitals), and in the second case n = 4.
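The counting rule $n = V_i - (V + Q)$ can be checked against the examples quoted in the text with a few lines of Python. This is only a bookkeeping sketch; the function name and the dictionary of examples are illustrative choices, and the oxidation state V = 4 for boron in Si is the value implied by the statement n = −1 in the text.

```python
def active_electrons(valence_electrons, oxidation_state, charge_state=0):
    """Return n = V_i - (V + Q), the number of active electrons at a center.

    valence_electrons: V_i, valence electrons of the neutral impurity atom
    oxidation_state:   V, electrons bound in valence-band (bonding) states
    charge_state:      Q, number of electrons missing from the neutral center
    """
    return valence_electrons - (oxidation_state + charge_state)


# (V_i, V, Q) for the cases discussed in the text
examples = {
    "P substitutional in Si, neutral": (5, 4, 0),
    "B substitutional in Si, neutral": (3, 4, 0),
    "S in Si, neutral": (6, 4, 0),
    "S in Si, singly ionized": (6, 4, 1),
    "Fe interstitial in Si": (8, 0, 0),
    "Fe substitutional in Si": (8, 4, 0),
}

if __name__ == "__main__":
    for label, (v_i, v, q) in examples.items():
        print(f"{label}: n = {active_electrons(v_i, v, q)}")
```

A negative result, as for boron, corresponds to holes rather than electrons being available at the center.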
With the background analysis set forth above, we are now sufficiently prepared to address the solution of the one-electron Schrödinger equation for the crystal with a point perturbation. Which solution methods can be applied with success depends decisively on whether the core perturbation potential is long- or short-range. Of course, the lattice translation symmetry, which drastically simplifies the solution of the Schrödinger equation in the case of an ideal crystal, is perturbed by both kinds of potentials, but only for short-range potentials in an essential manner. For long-range potentials the deviations from lattice translation symmetry are relatively weak. In this case the Schrödinger equation, in a certain sense, may be decomposed into two equations, one for the periodic potential of the ideal crystal, which was solved previously, and one for the perturbation potential. The latter is called the effective mass equation, which we will derive in the following section.
3.3
We start with the Schrödinger equation (3.8) of the perturbed crystal in the form (3.21). The most important assumption we will make relates to the core part $V'_c(\mathbf{x})$ of the perturbation potential in this equation. This potential part is supposed to be smooth on the atomic length scale. In conjunction with this assumption, point perturbations with short-range core perturbation potentials $V'_c(\mathbf{x})$ are ruled out from the very beginning, because these cannot be considered to be smooth. Point perturbations with Coulombic core perturbation potentials are still allowed, although the Coulomb form of this potential is not necessary for the derivation of the effective mass equation. The other parts of the total perturbation potential of equation (3.21), i.e. the perturbations $V'_H(\mathbf{x})$ and $V'_X(\mathbf{x})$ of the Hartree and exchange potentials, are automatically smooth if $V'_c(\mathbf{x})$ has this property, because these potential parts are determined by the solutions of the Schrödinger equation (3.21), which are smooth if $V'_c(\mathbf{x})$ is also. The wavefunction dependence of $V'_H(\mathbf{x})$ and $V'_X(\mathbf{x})$ requires self-consistent solutions of the Schrödinger equation. In this section we do not intend to solve this equation explicitly but rather to transform it into another equation, the effective mass equation, which can be solved more easily. Although the self-consistency demand is transferred to the effective mass equation, it does not interfere with its derivation. For the latter, $V'_H(\mathbf{x})$ and $V'_X(\mathbf{x})$ may be treated as smooth external potentials which, together with $V'_c(\mathbf{x})$, add up to form a smooth total perturbation potential. For the derivation which follows, the potential need not be the smooth perturbation potential of a point perturbation; any smooth potential $U(\mathbf{x})$ is allowed. This is important inasmuch as it becomes possible in this way to utilize the effective mass equation not only for point perturbations with Coulombic core perturbation potentials, but also for macroscopic perturbations, such as that associated with an external electric field, which are smooth on the atomic length scale by definition. The goal of the following considerations is to simplify the one-electron Schrödinger equation

$$[H + U(\mathbf{x})]\,\psi(\mathbf{x}) = E\,\psi(\mathbf{x}) \tag{3.29}$$
for the perturbed crystal in stages, so that, ultimately, a more easily solvable equation, namely the effective mass equation, results. In this matter, we will employ the fact that, in the vicinity of critical points of a certain band $\nu$, the energy of an electron of an ideal crystal depends quadratically on the quasi-wavevector k, i.e. in the same way that the energy of a free electron depends on its momentum. However, the free electron mass m is replaced by the effective mass $m^*_\nu$ of the particular band and critical point chosen. The effective mass includes the effects of interaction of the electron with the periodic potential of the crystal. Therefore, it is generally a tensor, and if it can be reduced to a scalar mass, the value of the latter generally differs from the mass of a free electron. Having in mind the effective mass description of the band energy versus quasi-wavevector k relation, one may suspect that the influence of a perturbation potential $U(\mathbf{x})$ on an electron of the crystal can be calculated in an approximate way as follows: One eliminates the periodic potential from the one-electron Schrödinger equation for the perturbed crystal, while simultaneously replacing the free mass m in the kinetic energy operator by the effective mass $m^*_\nu$. The resulting Schrödinger equation represents the one-band effective mass equation in its simplest form. It can be solved much more easily than the original Schrödinger equation, which includes the periodic crystal potential in explicit form. For a substitutional P atom in Si, for example, the effective mass equation is essentially the same as the Schrödinger equation for the hydrogen atom, whose solutions are already known. The procedure described above needs, of course, further justification. To provide this, questions have to be addressed which have been left open so far, for example, how the wavefunction of the effective mass equation relates to the wavefunction of the original Schrödinger equation, and what conditions must be placed on the perturbation potential $U(\mathbf{x})$ and the wavefunction $\psi(\mathbf{x})$ in order for the effective mass equation to be applicable.
3.3.1
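To address these questions and derive the effective mass equation for a single band, we rewrite the Schrödinger equation (3.29) in the Bloch representation,

$$\psi(\mathbf{x}) = \sum_{\nu\mathbf{k}} \langle \nu\mathbf{k}|\psi\rangle\, \varphi_{\nu\mathbf{k}}(\mathbf{x}), \tag{3.30}$$

with the expansion coefficients given by

$$\langle \nu\mathbf{k}|\psi\rangle = \int d^3x\, \varphi^*_{\nu\mathbf{k}}(\mathbf{x})\, \psi(\mathbf{x}). \tag{3.31}$$

Since the $\varphi_{\nu\mathbf{k}}(\mathbf{x})$ are eigenfunctions of H with eigenvalue $E_\nu(\mathbf{k})$, the Schrödinger equation (3.29) takes the following form:

$$\sum_{\nu'\mathbf{k}'} \left[ E_\nu(\mathbf{k})\,\delta_{\nu\nu'}\delta_{\mathbf{k}\mathbf{k}'} + \langle \nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle \right] \langle \nu'\mathbf{k}'|\psi\rangle = E\,\langle \nu\mathbf{k}|\psi\rangle. \tag{3.32}$$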
To transform this equation into the effective mass equation, the assumption made at the outset that $U(\mathbf{x})$ should be a smooth potential must be specified further. This is done by assuming that the change of $U(\mathbf{x})$ over a primitive unit cell is small in comparison with the change of the periodic crystal potential $V(\mathbf{x})$ over such a cell. To formulate the condition for smoothness in this sense quantitatively, we decompose $U(\mathbf{x})$ in a Fourier series, using the same notation introduced previously in Chapter 2. We have
(3.33)
with Fourier coefficients

(3.34)

The sum in (3.33) is extended over the entire infinite k-space, meaning over all Brillouin zones. Smooth functions $U(\mathbf{x})$ in the above sense have Fourier coefficients $\langle\mathbf{k}|U\rangle$ which, for large k-vectors, more strictly speaking, for k-vectors outside of the first BZ, are small compared to the Fourier coefficients $\langle\mathbf{k}|V\rangle$ of the periodic crystal potential $V(\mathbf{x})$. The latter components were calculated in section 2.4. According to formula (2.161), they are non-zero only if k is a reciprocal lattice vector K differing from zero. This means that the smoothness condition for $U(\mathbf{x})$ may be expressed as

$$|\langle\mathbf{k}|U\rangle| \ll |\langle\mathbf{K}|V\rangle| \quad \text{for } \mathbf{k} \text{ outside the first BZ and for } \mathbf{K} \neq 0. \tag{3.35}$$
The left-hand side of the inequality (3.35) is of the order of magnitude of the change of the perturbation potential $U(\mathbf{x})$ over a primitive unit cell of the direct lattice, that is, $a|\nabla U|$ with a as lattice constant. The right-hand side of (3.35) can be estimated by a characteristic band structure energy, like the fundamental energy gap $E_g$ between the valence and conduction bands. With this, the inequality (3.35) takes the form

$$a\,|\nabla U| \ll E_g. \tag{3.36}$$
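To get a feeling for condition (3.36), one can evaluate $a|\nabla U|/E_g$ for a screened Coulomb potential at several distances from the center. The sketch below assumes a potential of the form $e^2/(4\pi\varepsilon_0\varepsilon r)$ and uses rounded, Si-like illustrative values (lattice constant 0.543 nm, dielectric constant 11.7, gap 1.12 eV); the function name and the default parameters are assumptions made for this example and are not taken from the text.

```python
COULOMB_EV_NM = 1.44  # e^2 / (4 pi eps0) in eV nm (rounded)


def smoothness_ratio(r_nm, lattice_const_nm=0.543, eps_r=11.7, gap_ev=1.12):
    """Return a |grad U| / E_g for a screened Coulomb potential at distance r.

    A value much smaller than 1 means that condition (3.36) is satisfied.
    """
    grad_u = COULOMB_EV_NM / (eps_r * r_nm ** 2)  # |dU/dr| in eV/nm
    return lattice_const_nm * grad_u / gap_ev


if __name__ == "__main__":
    for r in (0.2, 0.5, 1.0, 2.0, 5.0):
        print(f"r = {r:4.1f} nm: a|grad U|/E_g = {smoothness_ratio(r):.4f}")
```

With these assumed numbers the ratio drops well below 1 at distances of a few lattice constants, whereas close to the impurity core it is of order 1 or larger, which illustrates why only the long-range part of the potential can be treated as smooth.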
It states that, for a perturbation potential $U(\mathbf{x})$ to be smooth and our first condition to be fulfilled, the changes of $U(\mathbf{x})$ over a primitive unit cell must be small compared to the energy gap. Secondly, we assume that the wavefunction $\psi(\mathbf{x})$ can be set up exclusively from Bloch functions of only one band $\nu_0$, i.e., that $\psi(\mathbf{x})$ should be of the form

$$\psi(\mathbf{x}) = \sum_{\mathbf{k}} F_{\nu_0}(\mathbf{k})\, \varphi_{\nu_0\mathbf{k}}(\mathbf{x}), \tag{3.37}$$
with $F_{\nu_0}(\mathbf{k})$ as a function which has yet to be determined. It turns out that this requirement leads to no contradictions if the perturbation potential is smooth. The third requirement refers to the function $F_{\nu_0}(\mathbf{k})$. It is assumed that $F_{\nu_0}(\mathbf{k})$ differs from zero only for small k-vectors in the sense used in equation (3.35). Fourthly, it will be assumed that for small k-vectors, which according to (3.35) have to be considered exclusively, the Bloch factors $u_{\nu_0\mathbf{k}}(\mathbf{x})$ in the Bloch functions $\varphi_{\nu_0\mathbf{k}}(\mathbf{x})$ can be approximately replaced by their values at k = 0:

$$u_{\nu_0\mathbf{k}}(\mathbf{x}) \approx u_{\nu_0 0}(\mathbf{x}). \tag{3.38}$$
In the following subsection 3.3.2 we will see that the two last requirements can be justified when eigenstates of the perturbed crystal exist having energy eigenvalues just above and below the edge of band $\nu_0$, and if only these eigenstates are considered. If, additionally, the band edge lies at k = 0, the small energy deviations also correspond to small k-values, and relation (3.38) holds approximately. The restriction of the location of the band edge to k = 0 can be omitted, and the derivation procedure can also be applied to band extrema other than the center of the first BZ, the only modification being the replacement of the BZ center k = 0 by the non-central critical point $\mathbf{k}_c$, as well as of k by $\mathbf{k} - \mathbf{k}_c$ in the corresponding equations. The above four conditions will now be used to simplify the Schrödinger equation (3.32). Applying relation (3.38), the wavefunction (3.37) may be written as

$$\psi(\mathbf{x}) = u_{\nu_0 0}(\mathbf{x})\, \frac{1}{\sqrt{\Omega}} \sum_{\mathbf{k}} F_{\nu_0}(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{x}}. \tag{3.39}$$
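The k-sum is the Fourier transform $F_{\nu_0}(\mathbf{x})$ of the function $F_{\nu_0}(\mathbf{k})$,

$$F_{\nu_0}(\mathbf{x}) = \frac{1}{\sqrt{\Omega}} \sum_{\mathbf{k}} F_{\nu_0}(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{x}}, \tag{3.40}$$

whence

$$\psi(\mathbf{x}) = u_{\nu_0 0}(\mathbf{x})\, F_{\nu_0}(\mathbf{x}). \tag{3.41}$$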
The function $F_{\nu_0}(\mathbf{x})$ is termed the envelope function or, in short, the envelope. The envelope function, by definition, is a smooth function. Equation (3.41) means that the true wavefunction $\psi(\mathbf{x})$ is obtained by enveloping the rapidly oscillating Bloch factor $u_{\nu_0 0}(\mathbf{x})$ with the smooth envelope function $F_{\nu_0}(\mathbf{x})$.
The two conditions concerning the smoothness of $U(\mathbf{x})$ and $F_{\nu_0}(\mathbf{x})$ make it possible to represent the matrix element $\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle$ in the Schrödinger equation (3.32) in a substantially simpler form. We rewrite this element using the Fourier representation (3.33) of $U(\mathbf{x})$ and the product form of the Bloch functions $\varphi_{\nu\mathbf{k}}(\mathbf{x})$, obtaining
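Furthermore, we transform the integral over the periodicity region into a sum of integrals over the unit cells of this region. A lattice point R is associated with each unit cell, so that the sum runs over all lattice points of the periodicity region. Because of the lattice periodicity of the Bloch factors, only terms of the form $\exp[i(\mathbf{k}'' + \mathbf{k}' - \mathbf{k})\cdot\mathbf{R}]$ remain under the lattice sum, where $\mathbf{k}''$ denotes the wavevector of the Fourier expansion of $U(\mathbf{x})$. The summation over them can be executed easily. With K as an arbitrary reciprocal lattice vector, it follows that

$$\sum_{\mathbf{R}} e^{i(\mathbf{k}'' + \mathbf{k}' - \mathbf{k})\cdot\mathbf{R}} = G^3\, \delta_{\mathbf{k}'' + \mathbf{k}' - \mathbf{k},\,\mathbf{K}}. \tag{3.43}$$

The matrix element thus becomes

$$\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle = \sum_{\mathbf{K}} \langle\mathbf{k} - \mathbf{k}' + \mathbf{K}|U\rangle\, C^{\nu\mathbf{k}}_{\nu'\mathbf{k}'}(\mathbf{K}), \tag{3.44}$$

where the abbreviation

(3.45)

has been introduced ($\Omega = G^3\Omega_0$). The Bloch components of the wavefunction $\psi(\mathbf{x})$ in the Schrödinger equation (3.32), $\langle\nu\mathbf{k}|\psi\rangle$ and $\langle\nu'\mathbf{k}'|\psi\rangle$, differ from zero only for small k and k' because of the smoothness of the envelope. Therefore we need the matrix elements $\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle$ only for small k and k'. With regard to the smoothness requirement for the potential $U(\mathbf{x})$, the terms in expression (3.44) for $\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle$ yield no significant contributions to the Schrödinger equation (3.32) if $\mathbf{K} \neq 0$, so that

$$\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle \approx \langle\mathbf{k} - \mathbf{k}'|U\rangle\, C^{\nu\mathbf{k}}_{\nu'\mathbf{k}'}(0). \tag{3.46}$$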
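Approximating the Bloch factors $u_{\nu\mathbf{k}}(\mathbf{x})$ and $u_{\nu'\mathbf{k}'}(\mathbf{x})$ by, respectively, $u_{\nu 0}(\mathbf{x})$ and $u_{\nu' 0}(\mathbf{x})$, and applying the orthogonality of the Bloch functions, the Bloch integrals $C^{\nu\mathbf{k}}_{\nu'\mathbf{k}'}(0)$ of equation (3.45) follow as

$$C^{\nu\mathbf{k}}_{\nu'\mathbf{k}'}(0) = \delta_{\nu\nu'}. \tag{3.47}$$

Substituting this relation in expression (3.44) we obtain

$$\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle \approx \langle\mathbf{k}|U|\mathbf{k}'\rangle\, \delta_{\nu\nu'}. \tag{3.48}$$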
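The chain of steps leading from the Fourier expansion of $U(\mathbf{x})$ to relation (3.48) can be sketched as follows. The normalization conventions in the first three lines (prefactor $1/\Omega$ in the Fourier coefficients, Bloch functions normalized over the periodicity region $\Omega$) and the explicit form written for the Bloch integral are assumptions made for this sketch; they are chosen so that the results agree with (3.43), (3.44), (3.47) and (3.48), and the conventions of Chapter 2 may differ by constant factors.

```latex
% Assumed conventions (illustrative, not taken verbatim from the text)
U(\mathbf{x}) = \sum_{\mathbf{k}''} \langle\mathbf{k}''|U\rangle\, e^{i\mathbf{k}''\cdot\mathbf{x}},
\qquad
\langle\mathbf{k}''|U\rangle = \frac{1}{\Omega}\int_{\Omega} d^3x\, e^{-i\mathbf{k}''\cdot\mathbf{x}}\, U(\mathbf{x}),
\qquad
\varphi_{\nu\mathbf{k}}(\mathbf{x}) = \frac{1}{\sqrt{\Omega}}\, e^{i\mathbf{k}\cdot\mathbf{x}}\, u_{\nu\mathbf{k}}(\mathbf{x}).

% Matrix element with Fourier expansion and product form inserted
\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle
= \frac{1}{\Omega}\sum_{\mathbf{k}''} \langle\mathbf{k}''|U\rangle
  \int_{\Omega} d^3x\, e^{i(\mathbf{k}''+\mathbf{k}'-\mathbf{k})\cdot\mathbf{x}}\,
  u^*_{\nu\mathbf{k}}(\mathbf{x})\, u_{\nu'\mathbf{k}'}(\mathbf{x}).

% Split x = R + x' with x' in the unit cell \Omega_0 and use the lattice sum (3.43);
% only k'' = k - k' + K survives, and G^3/\Omega = 1/\Omega_0
\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle
= \sum_{\mathbf{K}} \langle\mathbf{k}-\mathbf{k}'+\mathbf{K}|U\rangle\,
  \frac{1}{\Omega_0}\int_{\Omega_0} d^3x'\, e^{i\mathbf{K}\cdot\mathbf{x}'}\,
  u^*_{\nu\mathbf{k}}(\mathbf{x}')\, u_{\nu'\mathbf{k}'}(\mathbf{x}').

% Keep K = 0 only, replace the Bloch factors by their k = 0 values,
% and use their orthonormality over the unit cell
\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle
\approx \langle\mathbf{k}-\mathbf{k}'|U\rangle\,
  \frac{1}{\Omega_0}\int_{\Omega_0} d^3x'\, u^*_{\nu 0}(\mathbf{x}')\, u_{\nu' 0}(\mathbf{x}')
= \langle\mathbf{k}-\mathbf{k}'|U\rangle\, \delta_{\nu\nu'}
= \langle\mathbf{k}|U|\mathbf{k}'\rangle\, \delta_{\nu\nu'}.
```

In the last step, $\langle\mathbf{k}|U|\mathbf{k}'\rangle$ is the matrix element of U between normalized plane waves, which under the assumed conventions coincides with the Fourier coefficient $\langle\mathbf{k}-\mathbf{k}'|U\rangle$.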
With this relation, the decisive step in simplifying the Schrödinger equation (3.32) for the perturbed crystal has been done: the coupling between the different energy bands caused by the perturbation potential has been eliminated. The Schrödinger equation now decomposes into separate equations for the various individual bands, and these equations can in fact be solved by a one-band Ansatz of the form (3.37). The coupling between different wavevectors remains in place. This can be processed relatively easily. We employ all results achieved up to now in the Schrödinger equation (3.32), obtaining the following relation for $F_{\nu_0}(\mathbf{k})$:

$$E_{\nu_0}(\mathbf{k})\, F_{\nu_0}(\mathbf{k}) + \sum_{\mathbf{k}'} \langle\mathbf{k}|U|\mathbf{k}'\rangle\, F_{\nu_0}(\mathbf{k}') = E\, F_{\nu_0}(\mathbf{k}). \tag{3.49}$$
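This equation will be transformed from k-space into coordinate space. To this end we multiply by $(1/\sqrt{\Omega})\exp(i\mathbf{k}\cdot\mathbf{x}) = \langle\mathbf{x}|\mathbf{k}\rangle$ and sum over k. The first term on the left-hand side thereby becomes

$$\sum_{\mathbf{k}} E_{\nu_0}(\mathbf{k})\, F_{\nu_0}(\mathbf{k})\, \langle\mathbf{x}|\mathbf{k}\rangle = E_{\nu_0}(-i\nabla_{\mathbf{x}})\, F_{\nu_0}(\mathbf{x}). \tag{3.50}$$

One can easily prove that this transformation is correct by expanding $E_{\nu_0}(\mathbf{k})$ in a power series and by using the identity

$$-i\nabla_{\mathbf{x}}\, \langle\mathbf{x}|\mathbf{k}\rangle = \mathbf{k}\, \langle\mathbf{x}|\mathbf{k}\rangle. \tag{3.51}$$

The second term on the left-hand side becomes

$$U(\mathbf{x})\, F_{\nu_0}(\mathbf{x}), \tag{3.52}$$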
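The power series argument behind the replacement of k by $-i\nabla_{\mathbf{x}}$ can be written out in a few lines. The expansion coefficients $c_{\alpha_1\ldots\alpha_n}$ are introduced here only for the purpose of the sketch; the plane waves are taken as $\langle\mathbf{x}|\mathbf{k}\rangle = \Omega^{-1/2}\exp(i\mathbf{k}\cdot\mathbf{x})$.

```latex
% Power series of the band dispersion (coefficients c are illustrative notation)
E_{\nu_0}(\mathbf{k}) = \sum_{n}\ \sum_{\alpha_1 \ldots \alpha_n}
  c_{\alpha_1 \ldots \alpha_n}\, k_{\alpha_1} \cdots k_{\alpha_n}.

% Each factor k_\alpha acting on a plane wave can be traded for a derivative
k_{\alpha}\, \langle\mathbf{x}|\mathbf{k}\rangle
  = -i\,\frac{\partial}{\partial x_{\alpha}}\, \langle\mathbf{x}|\mathbf{k}\rangle
\quad\Longrightarrow\quad
k_{\alpha_1} \cdots k_{\alpha_n}\, \langle\mathbf{x}|\mathbf{k}\rangle
  = (-i\partial_{\alpha_1}) \cdots (-i\partial_{\alpha_n})\, \langle\mathbf{x}|\mathbf{k}\rangle.

% The derivatives do not depend on k and can be pulled out of the k-sum
\sum_{\mathbf{k}} E_{\nu_0}(\mathbf{k})\, F_{\nu_0}(\mathbf{k})\, \langle\mathbf{x}|\mathbf{k}\rangle
  = E_{\nu_0}(-i\nabla_{\mathbf{x}}) \sum_{\mathbf{k}} F_{\nu_0}(\mathbf{k})\, \langle\mathbf{x}|\mathbf{k}\rangle
  = E_{\nu_0}(-i\nabla_{\mathbf{x}})\, F_{\nu_0}(\mathbf{x}).
```

For a purely quadratic dispersion only the terms with n = 0 and n = 2 are present, and the operator reduces to a second order differential operator.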
and the right-hand side of (3.49) immediately becomes $E F_{\nu_0}(\mathbf{x})$. With this, one finally gets the following equation for $F_{\nu_0}(\mathbf{x})$:

$$\left[ E_{\nu_0}(-i\nabla_{\mathbf{x}}) + U(\mathbf{x}) \right] F_{\nu_0}(\mathbf{x}) = E\, F_{\nu_0}(\mathbf{x}). \tag{3.53}$$

For an isotropic parabolic band one has

$$E_{\nu_0}(-i\nabla_{\mathbf{x}}) = E_{\nu_0}(0) + \frac{\hbar^2}{2m^*_{\nu_0}}\, (-i\nabla_{\mathbf{x}})^2, \tag{3.54}$$

and

$$\left[ E_{\nu_0}(0) - \frac{\hbar^2}{2m^*_{\nu_0}}\, \nabla^2_{\mathbf{x}} + U(\mathbf{x}) \right] F_{\nu_0}(\mathbf{x}) = E\, F_{\nu_0}(\mathbf{x}). \tag{3.55}$$

With the derivation of this equation, initially conjectured and now verified, it is quite clear that the influence of the perturbation potential can be approximately determined from an equation in which the periodic potential no longer appears and the effective mass replaces the free electron mass. Equation (3.55) is therefore the desired effective mass equation. The eigenfunction $F_{\nu_0}(\mathbf{x})$ of this equation plays the role of an envelope function for the Bloch factor $u_{\nu_0 0}(\mathbf{x})$ in equation (3.41) for the true wavefunction $\psi(\mathbf{x})$. Equations (3.53) or (3.55) are also called envelope function equations in this context. The essential requirements involved in the derivation of the effective mass equation were the smoothness of the perturbation potential, the smoothness of the wavefunction and its composition of Bloch functions from only one band, as well as the k-independence of the Bloch factors. These assumptions are decisive, and are often not fulfilled. Short-range perturbation potentials, for example, are not smooth. Nevertheless, the effective mass equation (3.55) is a suitable instrument for the solution of a series of important problems of solid state physics. Point perturbations with smooth potentials are only one example of this. Other problems which can be solved with the help of the effective mass equation include artificial superstructures in a crystal and external electric fields. These will be treated later in sections 3.7 and 3.8, respectively. Also, the Coulomb attraction between electrons and holes, which results in the formation of excitons as pointed out in Chapter 2, may be treated by means of this equation. To account for external magnetic fields, the effective mass equation has to be modified in a way which will be indicated later in section 3.9.
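That an effective mass equation of the form (3.55) is no harder to solve than an ordinary Schrödinger equation can be illustrated numerically. The sketch below treats a one-dimensional analogue with the band edge taken as energy zero, in units with $\hbar = 1$, discretizes it by finite differences, and finds the lowest eigenvalues of the resulting tridiagonal matrix by Sturm-sequence bisection. The harmonic test potential, the grid parameters, and the function names are assumptions chosen for this example only.

```python
import math


def count_below(diag, off, lam):
    """Count eigenvalues below lam of a symmetric tridiagonal matrix
    with diagonal `diag` and constant off-diagonal `off` (Sturm count)."""
    count = 0
    q = diag[0] - lam
    if q < 0.0:
        count += 1
    for d in diag[1:]:
        if q == 0.0:
            q = 1e-200  # avoid division by zero
        q = d - lam - off * off / q
        if q < 0.0:
            count += 1
    return count


def envelope_levels(potential, mass, half_width=8.0, n_points=1601, n_levels=3):
    """Lowest eigenvalues of -(1/(2*mass)) F'' + U(x) F = E F (hbar = 1)
    on [-half_width, half_width] with F = 0 at the boundaries."""
    dx = 2.0 * half_width / (n_points - 1)
    xs = [-half_width + i * dx for i in range(n_points)]
    kin = 1.0 / (2.0 * mass * dx * dx)
    diag = [2.0 * kin + potential(x) for x in xs]
    off = -kin
    lower = min(diag) - 2.0 * kin  # Gershgorin bounds on the spectrum
    upper = max(diag) + 2.0 * kin
    levels = []
    for level in range(n_levels):
        lo, hi = lower, upper
        for _ in range(80):  # bisection on the eigenvalue count
            mid = 0.5 * (lo + hi)
            if count_below(diag, off, mid) > level:
                hi = mid
            else:
                lo = mid
        levels.append(0.5 * (lo + hi))
    return levels


if __name__ == "__main__":
    mass, spring = 0.5, 1.0  # effective mass and force constant (arbitrary units)
    omega = math.sqrt(spring / mass)
    numeric = envelope_levels(lambda x: 0.5 * spring * x * x, mass)
    for n, energy in enumerate(numeric):
        print(n, round(energy, 4), round(omega * (n + 0.5), 4))
```

The harmonic potential is used only because its exact levels $\omega(n + 1/2)$ with $\omega = \sqrt{K/m^*}$ are known, so the dependence of the spectrum on the effective mass can be checked directly; any other smooth potential can be passed in the same way.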
3.3.2
The one-band effective mass equation in its general form (3.53), derived in the preceding subsection, completely solves the eigenvalue problem for an electron in a crystal in the presence of a smooth perturbation potential. The practical use of this equation depends, however, on the dispersion law $E_\nu(\mathbf{k})$ of the band under consideration. In the vicinity of the minima or maxima of non-degenerate bands of cubic crystals one has a purely parabolic and isotropic k-dependence, and the effective mass equation is, as we have seen, no more difficult to solve than an ordinary Schrödinger equation with an external potential. This picture changes for bands which display degeneracy at the extremum. As demonstrated in section 2.7 on k·p theory, one then in general has non-parabolic and non-isotropic dispersion laws. This does not involve a major difference if the one-band effective mass equation can be solved in k-space, i.e. in the form of equation (3.49). However, for various reasons it can be necessary to transform the effective mass equation into coordinate space and to solve it there. This applies if basis functions other than plane waves are better adjusted to the symmetry of the perturbation potential as, for example, the Coulomb potential of an impurity atom, or if no perturbation potential is present, but the perturbation is introduced through boundary conditions, such as in the case of artificial superstructures like superlattices and quantum wells. Because of the non-parabolic and anisotropic dispersion laws in the case of degenerate bands, the effective mass equation (3.53) in coordinate space is more complicated; in particular, higher order differential operators occur which make solution of the eigenvalue problem practically impossible. A resolution of this situation is offered by k·p perturbation theory.
As we have seen, the non-parabolic and anisotropic dispersion laws of degenerate bands occur in this theory through diagonalization of the matrix of the Hamiltonian with respect to a basis set of Bloch functions $|\nu\mathbf{k}\rangle^1$, which are exact up to the first order in the k·p perturbation (see formula (2.339)). The elements of this matrix are relatively simple linear and quadratic functions of the components of k. In the case of degenerate bands this suggests, therefore, not to take a one-component effective mass equation as starting point, but a multi-component one which is obtained by writing the Schrödinger equation for the perturbed crystal in the approximate Bloch basis $|\nu\mathbf{k}\rangle^1$. We will undertake this program now. Regarding the perturbation potential $U(\mathbf{x})$ and the envelope function $F(\mathbf{x})$, we pose the same requirements as in subsection 3.3.1. The wavefunction is expanded with respect to the approximate Bloch basis as

$$\psi(\mathbf{x}) = \sum_{\nu\mathbf{k}} {}^1\langle\nu\mathbf{k}|\psi\rangle\, \langle\mathbf{x}|\nu\mathbf{k}\rangle^1, \tag{3.56}$$

and the Schrödinger equation (3.29) in this representation takes the form

$$\sum_{\nu'\mathbf{k}'} {}^1\langle\nu\mathbf{k}|H + U|\nu'\mathbf{k}'\rangle^1\ {}^1\langle\nu'\mathbf{k}'|\psi\rangle = E\ {}^1\langle\nu\mathbf{k}|\psi\rangle. \tag{3.57}$$
The matrix ${}^1\langle\nu\mathbf{k}|H|\nu'\mathbf{k}'\rangle^1$ of the unperturbed Hamiltonian H is diagonal with respect to k, k', and block-diagonal with respect to the band indices $\nu$, $\nu'$, with the blocks referring to degenerate bands. The diagonal elements ${}^1\langle\nu\mathbf{k}|U|\nu\mathbf{k}'\rangle^1$ of the perturbation potential U may be calculated by means of relation (3.48), which gives the elements of U with respect to approximate Bloch functions whose Bloch factors $u_{\nu\mathbf{k}}$ were replaced by $u_{\nu 0}$. In the terminology of section 2.7, these are Bloch functions in zero-order k·p perturbation theory or Luttinger-Kohn functions. Using relation (2.339), which expresses the first order Bloch functions $|\nu\mathbf{k}\rangle^1$ in terms of Luttinger-Kohn functions, it follows that the diagonal elements ${}^1\langle\nu\mathbf{k}|U|\nu\mathbf{k}'\rangle^1$ are the Fourier transforms $\langle\mathbf{k}|U|\mathbf{k}'\rangle$, just as in relation (3.48) of subsection 3.3.1. However, unlike this relation, the matrix ${}^1\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle^1$ also has non-vanishing off-diagonal elements with respect to $\nu$, $\nu'$. This is due to the k-dependence of the Bloch factors in $|\nu\mathbf{k}\rangle^1$ which has been omitted in subsection 3.3.1. The non-vanishing off-diagonal elements ${}^1\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle^1$ may be calculated in the same manner as the diagonal elements before. One obtains
The factor multiplying $\langle\mathbf{k}|\nabla U|\mathbf{k}'\rangle$ in equation (3.58) has the order of magnitude of the lattice constant a, as can easily be seen by replacing p by x by means of Heisenberg's equation of motion. This implies that the off-diagonal elements of $U(\mathbf{x})$ are of the order of magnitude of the relative change of $U(\mathbf{x})$ over a primitive unit cell. Terms of this order of magnitude are to be neglected within the framework of effective mass theory. Thus the matrix elements of U with respect to first order Bloch functions $|\nu\mathbf{k}\rangle^1$ are approximately given by the relation

$${}^1\langle\nu\mathbf{k}|U|\nu'\mathbf{k}'\rangle^1 \approx \langle\mathbf{k}|U|\mathbf{k}'\rangle\, \delta_{\nu\nu'}, \tag{3.59}$$

which corresponds to relation (3.48) of subsection 3.3.1. Note that equation (3.59) is valid independent of whether the two bands $\nu$ and $\nu'$ are degenerate or not; even degenerate bands are not coupled by perturbation potentials U if these potentials are smooth. Before simplifying the Schrödinger equation (3.57) still further for the actually interesting case of degenerate bands, we will examine the case of non-degenerate bands.
Nondegenerate bands
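We assume that $\psi(\mathbf{x})$ can be expressed in terms of the approximate Bloch functions $|\nu_0\mathbf{k}\rangle^1$ of only one band $\nu_0$, so that

$${}^1\langle\nu\mathbf{k}|\psi\rangle = \delta_{\nu\nu_0}\, F_{\nu_0}(\mathbf{k}). \tag{3.60}$$

With the Ansatz (3.60), $\psi(\mathbf{x})$ may be written as

$$\psi(\mathbf{x}) = \sum_{\mathbf{k}} F_{\nu_0}(\mathbf{k})\, \langle\mathbf{x}|\nu_0\mathbf{k}\rangle^1. \tag{3.61}$$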
Employing expression (2.334) for the approximate Bloch functions $|\nu_0\mathbf{k}\rangle^1$, it follows that
where $F_{\nu_0}(\mathbf{x})$ is the Fourier transform of $F_{\nu_0}(\mathbf{k})$ in coordinate space. The second term of $\psi(\mathbf{x})$ in equation (3.62) does not occur in the corresponding equation (3.41) for $\psi(\mathbf{x})$. This term is again due to the k-dependence of the Bloch factor in $|\nu\mathbf{k}\rangle^1$, which was neglected in subsection 3.3.1. In fact, its relative magnitude with respect to the first term is of the order of the relative change of the envelope function $F_{\nu_0}(\mathbf{x})$ over a unit cell. Terms of this order of magnitude are again to be neglected within the framework of effective mass theory. We therefore obtain the same expression for $\psi(\mathbf{x})$ as in equation (3.41) above. The Hamiltonian matrix ${}^1\langle\nu\mathbf{k}|H|\nu'\mathbf{k}'\rangle^1$ in (3.57) is approximately diagonal with respect to the band indices, and the diagonal elements are the eigenvalues $E_\nu(\mathbf{k})$ of H calculated in section 2.7 in second order k·p perturbation theory. Employing expression (2.336) for $E_\nu(\mathbf{k})$, we obtain
where the coefficients of the quadratic form in k are the elements of the reciprocal effective mass tensor according to formula (2.337). Using the Ansatz (3.60) for ${}^1\langle\nu\mathbf{k}|\psi\rangle$ and expression (3.63) for ${}^1\langle\nu\mathbf{k}|H|\nu'\mathbf{k}'\rangle^1$, the Schrödinger equation (3.57) yields
This equation is in agreement with the effective mass equation (3.53) if the latter has $E_{\nu_0}(-i\nabla)$ replaced by a parabolic expression in the components of $-i\nabla$, refraining, however, from the assumption of isotropy of the band structure. If the effective mass tensor reduces to a scalar quantity $(1/m^*_{\nu_0})\,\delta_{\alpha\beta}$, then (3.65) yields the effective mass equation in the form (3.55). The derivation of the effective mass equation based on k·p perturbation theory thus produces this equation automatically in parabolic approximation. In the earlier derivation of subsection 3.3.1, an effective mass equation was obtained that did not yet contain this approximation, which was invoked only later. The advantage of the derivation of the effective mass equation within the framework of k·p perturbation theory is clearly manifested when degenerate bands are considered, which we will address next.
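The parabolic but anisotropic dispersion that enters the effective mass equation through the reciprocal effective mass tensor can be made concrete with a short numerical sketch. It evaluates $E(\mathbf{k}) = E_0 + (\hbar^2/2)\sum_{\alpha\beta} k_\alpha (1/m^*)_{\alpha\beta} k_\beta$ with masses in units of the free electron mass and k in nm⁻¹. The function names and the ellipsoidal mass values (0.19 transverse, 0.98 longitudinal, roughly Si-like) are illustrative assumptions and are not taken from the text.

```python
HBAR2_OVER_2M0 = 0.0381  # hbar^2 / (2 m0) in eV nm^2 (rounded)


def band_energy(k, inv_mass, e0=0.0):
    """Parabolic band energy in eV for wavevector k (1/nm).

    inv_mass is the 3x3 reciprocal effective mass tensor in units of 1/m0.
    """
    quad = sum(k[a] * inv_mass[a][b] * k[b] for a in range(3) for b in range(3))
    return e0 + HBAR2_OVER_2M0 * quad


def diagonal_inverse_mass(mx, my, mz):
    """Reciprocal mass tensor for principal-axis masses mx, my, mz (units of m0)."""
    return [[1.0 / mx, 0.0, 0.0], [0.0, 1.0 / my, 0.0], [0.0, 0.0, 1.0 / mz]]


if __name__ == "__main__":
    inv_mass = diagonal_inverse_mass(0.19, 0.19, 0.98)
    print("transverse:  ", band_energy((1.0, 0.0, 0.0), inv_mass))
    print("longitudinal:", band_energy((0.0, 0.0, 1.0), inv_mass))
```

When the three principal masses coincide, the tensor is proportional to the unit matrix and the expression reduces to the isotropic form underlying equation (3.55).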
Degenerate bands

We direct our attention to the valence band maximum of semiconductors with diamond type structure (see section 2.7). Accordingly, we assume that at k = 0 a degenerate band level $E_\nu(0)$ exists with either symmetry $\Gamma'_{25}$ or $\Gamma^+_8$, depending on whether spin is omitted or not. First we consider the case without spin. The three degenerate $\Gamma'_{25}$ Bloch states of energy $E_\nu(0)$ are distinguished, as before, by an integer index m. The Bloch functions at k = 0 are therefore written as $|\nu m 0\rangle$. If we are interested in eigenstates $\psi(\mathbf{x})$ of the Schrödinger equation of the perturbed crystal whose energy eigenvalues are expected to be near the valence band edge, we can represent $\psi(\mathbf{x})$ as a superposition of the Bloch functions $|\nu m\mathbf{k}\rangle^1$ in first order k·p perturbation theory. Thus, we set
(3.66)
As above, in the case of nondegenerate bands, it approximately follows that
(3.67)
with $F_m(\mathbf{x})$ as the Fourier transform of $F_m(\mathbf{k})$ in coordinate representation. Now, expression (3.67) for ${}^1\langle\nu\mathbf{k}|\psi\rangle$ is substituted into the Schrödinger equation (3.57). The subspace of valence band states is approximately decoupled from the remainder of the Hilbert space. Using equation (2.341), the matrix elements ${}^1\langle\nu m\mathbf{k}|H|\nu m'\mathbf{k}'\rangle^1$ may be written as
where the coefficients $D^{\alpha\beta}_{mm'}$ are defined in equation (2.342). The matrix elements ${}^1\langle\nu\mathbf{k}|U|\nu m'\mathbf{k}'\rangle^1$ of the perturbation potential U follow from equation (3.59). They vanish if $\nu \neq \nu m'$. For $\nu = \nu m$ they differ from zero only if m = m'. With this, the Schrödinger equation (3.57) takes the form
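Transforming to coordinate space and taking account of relation (3.51), the equation

$$\sum_{m'} \Big[\, \cdots + U(\mathbf{x})\, \delta_{mm'} \Big] F_{m'}(\mathbf{x}) = E\, F_m(\mathbf{x}) \tag{3.70}$$

follows, in which the dots denote the matrix elements of H from (3.68) with k replaced by $-i\nabla_{\mathbf{x}}$. Using matrix notation, with 1 for the unit matrix $\delta_{mm'}$ and $F(\mathbf{x})$ for the vector $(F_1(\mathbf{x}), F_2(\mathbf{x}), F_3(\mathbf{x}))$, this equation may be written in matrix form as

(3.71)

For the $\Gamma'_{25}$ valence band the Hamiltonian matrix is given by relation (2.344). With this, the envelope function equation then reads, explicitly,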
Here, derivatives of the type $\partial^2/\partial x^2$ are denoted by symbols like $\partial_x^2$, and those of type $\partial^2/\partial x\,\partial y$ by symbols like $\partial_x\partial_y$. Consideration of spin leads to two changes. Firstly, one has to use spin-dependent Bloch functions $\langle s\mathbf{x}|\nu m\sigma\mathbf{k}\rangle^1$ and, therefore, also spin-dependent envelope functions $F_{m\sigma}(\mathbf{x}, s)$, and secondly the spin-orbit interaction operator $H_{so}$ is to be added to the Hamiltonian operator H. Thereby, $H_{\mathbf{k}\cdot\mathbf{p}}$ becomes $H_{\mathbf{k}\cdot\boldsymbol{\pi}}$, which has the consequence for D that the matrix elements of p are replaced by those of $\boldsymbol{\pi}$ from equation (2.353). If interference terms between the k·p interaction and the spin-orbit interaction $H_{so}$ are neglected, the representation of the latter in the approximate Bloch basis leads to a k-independent tensor $H_{so}$. The envelope function equation then reads
It is advantageous to first transform this equation into the angular momentum basis $|j m_j\rangle$, in which the operator $H_{so}$ is diagonal, before one proceeds to solve it. Using the arguments of section 2.7, and assuming a sufficiently large spin-orbit splitting energy $\Delta$, the coupling between the $\Gamma^+_7$ and $\Gamma^+_8$ valence bands may be neglected. For the $\Gamma^+_8$ valence band, i.e. with $j = \frac{3}{2}$, one arrives at the following 4-component envelope function equation:
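The quantities entering this equation are given by expressions (3.75) to (3.78), which are built from the valence band parameters L, M and N and from second derivatives with respect to the coordinates x, y and z.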
The energy origin in (3.74) was moved to the valence band maximum, which was previously located at $\Delta/3$. The envelope equation (3.74) makes it possible to calculate acceptor states and hole bands in superlattices of the diamond and zincblende type semiconductors. One can derive a similar equation within the Kane model of the $\Gamma_8$ valence band and $\Gamma_6$ conduction band complex of zincblende type semiconductors.
3.4
We will now explore stationary states of a semiconductor crystal caused by point perturbations which give rise to smooth long-range perturbation potentials $V'(\mathbf{x})$. The most important point perturbations of this kind are isocoric substitutional impurity atoms, i.e. atoms from rows and columns of the periodic table which are close to the row and column of the host atom which is substituted. Below we will concentrate on such point perturbations. We denote the difference of the core charge numbers of the impurity and host atoms by

$$\Delta Z = Z_I - Z_H, \tag{3.79}$$
and assume that it is non-zero. The sign of $\Delta Z$ can be positive as well as negative. In this sense we can speak of positive and negative point perturbations: the point perturbation is positive if the core charge number of the impurity atom is larger than that of the host atom, and negative if it is smaller. For the perturbation $V'_c(\mathbf{x})$ of the core potential, we may use expression (3.11), which here we write as
(3.80)
In general, $V'_c(\mathbf{x})$ does not yet represent the entire perturbation potential in the one-particle Schrödinger equation (3.21) for the perturbed crystal. One still must add the Hartree potential $V'_H(\mathbf{x})$ and the exchange potential $V'_X(\mathbf{x})$ caused by other electrons localized at the center. For impurity atoms with only one valence electron more or less than the host atom, these potentials, as a rule, have the effect that only one electron or hole can be bound at the center. For this one electron or hole, the Hartree and the exchange parts of the perturbation potentials vanish, i.e. $V'_c(\mathbf{x})$ is itself the entire perturbation potential, and the one-particle Schrödinger equation has the form (3.22). For impurity atoms with $|\Delta Z| > 1$, $V'_H(\mathbf{x})$ and $V'_X(\mathbf{x})$ do not vanish. The effect of a non-vanishing $V'_H(\mathbf{x})$ will be discussed later. To solve the one-electron Schrödinger equation (3.22) with the potential $V'_c(\mathbf{x})$ of (3.80), we may use the effective mass theory derived in section 3.3. Here, effective mass equations must be written down for those critical points of energy bands where changes of the spectrum of energy eigenvalues due to the perturbation potential are to be expected. In Chapter 1, it was pointed out that such changes occur at the bottom of the conduction band and the top of the valence band: for impurity atoms with $\Delta Z > 0$ discrete levels appear in the energy gap just underneath the conduction band edge,
$E_c(k) = E_c + \frac{\hbar^2}{2 m_n^*} k^2,$
(3.81)
$E_v(k) = E_v - \frac{\hbar^2}{2 m_p^*} k^2,$
(3.82)
with $m_n^*$ and $m_p^*$ as effective masses of electrons and holes, respectively. In reality, for many semiconductors such as Si and Ge, the conduction band edge does not lie at $k = 0$. The valence band edge is found at that point for many semiconductors including the ones mentioned, but there is degeneracy between heavy and light hole bands at $k = 0$. Nevertheless, we will initially proceed with the above simplifying assumptions. The relevance and necessary improvements of this idealized model will be discussed separately. For reasons which will become clear below, the model is often referred to as the hydrogen model.
3.4.1
Hydrogen model
For the hydrogen model we have the two effective mass equations
(3.83)
(3.84)
In the usual form of a Schrodinger equation they read
(3.86)
Apart from notation, signs, and constant factors these are the Schrödinger equations for charged particles in the potential of a point charge, just as in the case of the hydrogen atom. From experience in quantum mechanics, we know that for an attractive potential, there is a continuum of positive energy eigenvalues and, in addition, there are discrete energy levels which occur for negative energies. For repulsive potentials there are only positive energy eigenvalues, but no bound states at negative energies. From this it follows that for a positive point perturbation (upper sign in equations (3.85), (3.86)), discrete energy levels are to be expected just below the conduction band edge, and for negative point perturbations such levels are expected just above the valence band edge. These discrete levels lie in the energy gap between the conduction and valence bands where, as we know, no energy eigenvalues can appear in an ideal crystal. We consider first the case of the conduction band.
Conduction band
Transferring results from the quantum mechanical treatment of the hydrogen atom to our present situation, we get the discrete energy levels $E_n$ of principal quantum number $n$, $n = 1, 2, \ldots$, as $E_n - E_c = -E_B/n^2$ or
(3.87)
with
(3.88)
where $E_B^H = e^4 m/(2\hbar^2)$ denotes the binding or Rydberg energy of the hydrogen atom. The wavefunction $F_{cnlm}(x)$ for principal quantum number $n = 1$, angular momentum quantum number $l = 0$ and magnetic quantum number $m = 0$ reads
(3.89)
with
(3.90)
where $a_B^H = \hbar^2/(e^2 m)$ denotes the Bohr radius of the hydrogen atom. The orders of magnitude of the energy $E_B$ and the length $a_B$ can be estimated easily. With $E_B^H \approx 13.6$ eV, $a_B^H = 0.5$ Å, $m_n^*/m = 0.2$, $\epsilon = 11.4$ and $\Delta Z = 1$, it follows that
(3.91)
$a_B \approx 11.4 \left(\frac{1}{0.2}\right) 0.5$ Å $\approx 29$ Å.
(3.92)
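The two order-of-magnitude estimates can be reproduced with a few lines of Python. This is a minimal sketch which assumes the usual hydrogenic scaling laws, $E_B = (m_n^*/m)(\Delta Z/\epsilon)^2 E_B^H$ and $a_B = \epsilon\, a_B^H / ((m_n^*/m)\,\Delta Z)$; the function names are illustrative only and are not taken from the text.

```python
# Hydrogen-model estimates for a shallow donor.
# Assumption: hydrogenic scaling with effective mass and dielectric constant.
RYDBERG_EV = 13.6      # binding energy of the hydrogen atom in eV
BOHR_RADIUS_A = 0.5    # Bohr radius of the hydrogen atom in Angstrom (rounded)


def binding_energy_ev(mass_ratio, epsilon, delta_z=1):
    """Assumed scaling: E_B = (m*/m) * (dZ / eps)**2 * Rydberg energy."""
    return mass_ratio * (delta_z / epsilon) ** 2 * RYDBERG_EV


def effective_bohr_radius_a(mass_ratio, epsilon, delta_z=1):
    """Assumed scaling: a_B = eps / ((m*/m) * dZ) * Bohr radius."""
    return epsilon / (mass_ratio * delta_z) * BOHR_RADIUS_A


def level_offsets_ev(mass_ratio, epsilon, n_max=3, delta_z=1):
    """Offsets E_n - E_c = -E_B / n**2 for n = 1 .. n_max."""
    e_b = binding_energy_ev(mass_ratio, epsilon, delta_z)
    return [-e_b / n ** 2 for n in range(1, n_max + 1)]


if __name__ == "__main__":
    e_b = binding_energy_ev(0.2, 11.4)
    a_b = effective_bohr_radius_a(0.2, 11.4)
    print(f"E_B = {1000 * e_b:.1f} meV, a_B = {a_b:.1f} Angstrom")
    print("E_n - E_c in meV:", [round(1000 * e, 2) for e in level_offsets_ev(0.2, 11.4)])
```

With the numbers quoted above, the sketch gives a binding energy of roughly 21 meV and a localization radius of about 29 Å, and it shows how quickly the levels with $n = 2, 3$ approach the band edge.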
Compared with the width of the energy gap, which is of order of magnitude 1 eV, the discrete energy level $E_1$ lies closely below the conduction band edge (see Figure 3.4). Even closer are the states with $n = 2, 3, \ldots$. The designation shallow levels is used in this situation. The wavefunction for energy level $E_1$ decays exponentially with distance from the core of the impurity atom with the characteristic decay length $a_B$. In contrast to band states, which are evenly spread out over the entire crystal, we have, therefore, localized electron states (see Figure 3.4). The localization radius $a_B$ is clearly larger than the distance between two crystal atoms, which is approximately 2.5 Å. This means that the localization of an electron at an attracting impurity atom is weak if considered on the interatomic length scale of the crystal. Exactly this is to be expected considering the smoothness requirement made at the outset, which states that only Fourier components of small wavevectors $k$ should contribute to the wavefunction. We can also examine the validity of this requirement quantitatively. Omitting a $k$-independent factor, the Fourier transform of $F_{c100}(x)$ is given by the expression
(3.93)
At the edges of the first BZ one has $k \approx \pi/a$ and $a_B k \approx \pi (a_B/a) \approx 10$. Between the center of the first BZ and its edge, $F_{c100}(k)$ therefore falls off by about 4 powers of ten. The smoothness requirement is therefore fulfilled very well. In Figure 3.4 the localization of the wavefunction in k-space is also indicated. Recalling the theory of the hydrogen atom, it is known that the effect of an attractive Coulomb potential is not restricted to the formation of bound states with energy levels in the previously forbidden negative energy region. Changes also occur in the continuum of positive energy eigenvalues which are allowed even without Coulomb potential. They are manifested in the fact that the eigenstates, which were previously spherical waves, are scattered by the Coulomb potential. Their amplitude at the positive center becomes larger, and further away from the center it is smaller. This leads to a change of the density of states in the continuous part of the energy spectrum (the definition of the density of states is given in section 2.5). According to Levinson's theorem, which we will discuss in the next section in somewhat greater detail, this change takes place such that the total number of all states in the presence of the perturbing potential, including the bound states at negative energies, remains the same as the total number of states without the perturbing potential. For each bound state of negative energy a state of positive energy is therefore excluded from the energy spectrum.
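The fall-off by about four powers of ten can be checked numerically. The sketch below assumes that the ground state envelope function is proportional to $\exp(-|x|/a_B)$, whose three-dimensional Fourier transform has the closed form $8\pi a_B^3/[1 + (a_B k)^2]^2$. It compares this closed form with a direct radial integration; the function names and the integration parameters are illustrative choices.

```python
import math


def radial_fourier_transform(k, a=1.0, r_max_over_a=60.0, steps=200_000):
    """Midpoint-rule value of 4*pi * integral of r^2 exp(-r/a) sin(kr)/(kr) dr."""
    h = r_max_over_a * a / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        # sin(kr)/(kr) tends to 1 for k -> 0
        j0 = math.sin(k * r) / (k * r) if k != 0.0 else 1.0
        total += r * r * math.exp(-r / a) * j0
    return 4.0 * math.pi * total * h


def closed_form(k, a=1.0):
    """Assumed closed form 8*pi*a^3 / (1 + (a k)^2)^2 of the same transform."""
    return 8.0 * math.pi * a ** 3 / (1.0 + (a * k) ** 2) ** 2


if __name__ == "__main__":
    # a_B * k = 10 corresponds to the zone-edge estimate of the text
    ratio_numeric = radial_fourier_transform(10.0) / radial_fourier_transform(0.0)
    ratio_closed = closed_form(10.0) / closed_form(0.0)
    print(f"numeric fall-off: {ratio_numeric:.2e}, closed form: {ratio_closed:.2e}")
```

Both ratios come out close to $10^{-4}$, in line with the statement that the envelope function has practically no weight near the zone boundary.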
Figure 3.4: States in coordinate space and in k-space for substitutional impurity atoms with either positive or negative differences $\Delta Z$ between the impurity core charge numbers and the host core charge numbers (upper part of figure; panels $\Delta Z > 0$ and $\Delta Z < 0$). The occupation of the shallow levels by electrons is presented in the lower part of the figure. The levels act as donors of electrons for $\Delta Z > 0$ and as acceptors of electrons for $\Delta Z < 0$.
The above-mentioned results for shallow impurity levels appended to an isotropic parabolic conduction band can be immediately transferred to an isotropic parabolic valence band. Below, we will write them down without further derivation.
Valence band
For a negative point perturbation there appear discrete energy levels in the energy gap closely above the valence band edge. The energies of these levels are given by the expressions
$E_n = E_v + \frac{E_B}{n^2},$
(3.94)
(3.95)
The pertinent wavefunctions are localized at the perturbation center. The wavefunction of the ground state, $n = 1$ and $l = m = 0$, has the form (3.89)
with
(3.96)
as the effective Bohr radius. In Figure 3.4 (lower part) the energy levels and localization regions are shown schematically. Although the population of energy levels by electrons will not be treated systematically until the next chapter, we wish to deal now with a special case, namely with the occupation of the shallow impurity levels discussed above at absolute zero temperature, $T = 0$ K. This consideration will lead us to a better understanding of the nature of these impurity states. First of all, we assume the case of a positive point perturbation, i.e., the core charge number $Z_I$ of the impurity atom is taken to be larger than the core charge number $Z_H$ of the host atom. For simplicity, we assume $Z_I = Z_H + 1$. Then the impurity atom also has one valence electron more than the host atom. To be specific, we can imagine it as a P atom in a Si crystal. According to Levinson's theorem, which mandates that the total number of states must remain the same both with and without perturbation, one can proceed on the assumption that nothing will change due to the impurity atom regarding the number of states in the valence band. Therefore, all the valence electrons of the host atoms and all those of the impurity atom, except for the additional electron, can be placed in the valence band. The additional impurity electron has to reside in the lowest unoccupied energy level. That is the shallow $n = 1$ level just below the conduction band edge which, according to the above results, arises from the impurity atom. The resulting occupation is shown in Figure 3.4. If the temperature is increased this electron can easily be excited from the shallow level to the conduction band. There, it is no longer localized but spread out uniformly over the whole crystal, representing a freely mobile negative charge carrier. The shallow energy level therefore functions as a donor of a free carrier. One calls it a donor level and the impurity atom itself a donor.
One can also say that a donor atom (with $\Delta Z = 1$) has one surplus valence electron which is bound relatively weakly and can be transferred easily by thermal excitation to the conduction band. The impurity atom remains in the singly positively charged state, i.e., it is singly ionized. For $\Delta Z \geq 2$ one has doubly or multiply ionizable donors. We will discuss this in more detail later. Next, we consider an impurity atom with $Z_I = Z_H - 1$, i.e. with one less positive elementary charge in the core and, with this, also one less valence electron than in the host atom. Due to the substitution of a host atom by such an impurity atom, the number of states in the valence band decreases, according to Levinson's theorem, by one if only the $n = 1$ level is counted. With this, the number of valence band states is still large enough to host all
valence electrons, but there also remain no unoccupied states in the valence band. The $n = 1$ level in the energy gap remains empty at $T = 0$ K. If the temperature is increased, an electron from the valence band can easily be excited into the impurity level. The level accepts an electron, i.e. it functions as an acceptor. In the valence band itself, a hole remains which, as we know from Chapter 1, behaves like a positive freely mobile charge carrier. Consequently, we can also say that, under thermal excitation, the acceptor transfers a hole to the valence band, whereas at $T = 0$ K the hole was bound to the acceptor. This picture corresponds to the model of a negatively charged point perturbation which binds a positive hole and transfers it to the valence band under excitation. The analogy to the positively charged donor center which binds an electron and donates it to the conduction band is obvious. Similar interpretations are available for doubly and multiply ionizable acceptors.
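The sign rule for donors and acceptors can be condensed into a small lookup. The following sketch takes $\Delta Z$ to be the difference of the valence electron numbers of impurity and host, which holds for the substitutional main-group impurities discussed here; the table of elements and the function name are illustrative choices.

```python
# Valence electron numbers of a few main-group atoms (periodic-table values;
# the selection of elements is only illustrative).
VALENCE_ELECTRONS = {
    "B": 3, "Al": 3, "Ga": 3, "In": 3,
    "Si": 4, "Ge": 4,
    "P": 5, "As": 5, "Sb": 5, "Bi": 5,
    "S": 6,
}


def classify_impurity(impurity, host):
    """Return (delta_z, role) for a substitutional impurity in an elemental host."""
    delta_z = VALENCE_ELECTRONS[impurity] - VALENCE_ELECTRONS[host]
    if delta_z > 0:
        role = "donor"        # positive point perturbation
    elif delta_z < 0:
        role = "acceptor"     # negative point perturbation
    else:
        role = "isovalent"    # no Coulomb-like perturbation
    return delta_z, role


if __name__ == "__main__":
    for atom in ("P", "Al", "S"):
        print(atom, "in Si:", classify_impurity(atom, "Si"))
```

For P in Si the function returns a donor with $\Delta Z = 1$, for Al in Si an acceptor with $\Delta Z = -1$, and for S in Si a doubly ionizable donor with $\Delta Z = 2$.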
3.4.2
As already indicated, the above hydrogen model of shallow impurities, for which the band extrema lie at the $\Gamma$ point and the bands are nondegenerate at $\Gamma$ as well as isotropic and parabolic in its neighborhood, cannot be directly applied to many semiconductor materials including Si. Moreover, as we know from section 3.2, the perturbation potential $V'(x)$ has, in general, a short-range part as well as an electron-electron interaction part which are not considered in the hydrogen model. Below, we treat corrections to the hydrogen model which can be traced back to the cited effects. Some of these corrections appear only at donors, others only at acceptors. Therefore, we treat the two kinds of impurities separately.
Donors
The conduction band of Si and other indirect semiconductors differs in three ways from the simple isotropic band model. First, the minimum lies outside of the center of the first BZ; second, for symmetry reasons, one minimum carries the implication of several equivalent minima or valleys; and third, the band structure in the neighborhood of each individual minimum is anisotropic. While the off-center location of band minima has no direct effect on the donor binding energy, the many-valley feature and the anisotropy of the band structure do have such an effect. Due to the existence of several valleys (distinguished by a valley index $i$), the wavefunction $\psi(x)$ of the donor state has to be expanded with respect to the approximate Bloch functions $\langle x|c k_i [k - k_i]\rangle$ for all mutually degenerate minima $k_i$, rather than with respect to the approximate Bloch function $\langle x|c k_c [k - k_c]\rangle$ of one minimum $k_c$ only. Because of this, the one-component envelope function
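$\sum_{j} \sum_{k'} \left[ \frac{\hbar^2}{2} (k - k_i) M_i^{-1} (k - k_i)\, \delta_{ij} \delta_{kk'} + \langle k|V'|k'\rangle \right] F_j(k' - k_j) = E F_i(k - k_i),$
(3.98)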
where $M_i$ is the effective mass tensor of valley $i$. The wavevectors $k - k_i$ and $k' - k_j$ vary in a small neighborhood about the zero point, i.e. $k$ and $k'$ are vectors near the respective minima $k_i$ and $k_j$. For $i \neq j$, the matrix elements $\langle k|V'|k'\rangle$ of the perturbation potential $V'(x)$ can approximately be replaced by the $k$- and $k'$-independent expressions $\langle k_i|V'|k_j\rangle$. The latter can easily be transformed into coordinate space where they result in an additional $\delta$-function-like potential. With $i = j$, the matrix elements $\langle k|V'|k'\rangle$ result in the perturbation potential $V'(x)$ which is already present in the hydrogen model. The effective mass equation (3.98) in coordinate space thus reads
$= E F_i(x).$
(3.99)
If $V'(x)$ is the screened Coulomb potential of equation (3.80), as was assumed, the absolute values of the intervalley Fourier components $\langle k_i|V'|k_j\rangle$ are negligibly small in comparison with the intravalley Fourier components, because of the large wavevectors $k_i - k_j$ occurring in the former combined with the $(1/|k|^2)$-dependence of the Fourier components of the Coulomb potential on $k$. However, as we know from section 3.2, the true perturbation potential $V'(x)$ differs from the pure Coulomb potential, on the one hand because the screening of a point charge by the semiconductor is wavevector dependent, and on the other hand because $V'(x)$ contains a short-range part, which cannot be traced back to the potential of a point charge in any way. Both modifications can cause the intervalley matrix elements $\langle k_i|V'|k_j\rangle$ of $V'(x)$ to take larger values. Nevertheless, they still remain so small that they can be treated by means of perturbation theory. In zero-order approximation they may be omitted completely. Thereby, one obtains a separate effective mass equation for each valley from (3.99). In the case of $k_i = (0, 0, k_z)$, it reads, in coordinate space
Analogous equations hold for the other valleys. This means that, neglecting intervalley coupling, all valleys give rise to the same donor level.
Anisotropic effective masses
In contrast to the hydrogen model, the effective mass of the envelope equation (3.100) depends on direction in k-space. In this circumstance, analytical solutions are not available, and one must resort to approximations and numerical calculations. One possible approach is the application of a variational procedure, in which the eigenfunctions are represented as linear combinations of members of a set of auxiliary functions whose general shapes are adjusted as well as possible to the real solution, but which still contain free parameters. Using this representation one calculates the energy expectation value of the Hamiltonian and varies the parameters until its absolute minimum is reached. This minimum value then yields an approximation for the ground state energy, which is the better, the more closely the linear combination of auxiliary functions approximates the actual eigenfunction. The success of this approach thus depends in an essential way on the auxiliary functions used. They should, in particular, correctly reflect the symmetry of the actual eigenfunctions, in the case considered here the symmetry of the ground state wavefunction. One can also employ the variational process for the calculation of excited states. In this case the wavefunction of the first excited state must, in addition, be orthogonal to the ground state wavefunction, that of the second excited state must be orthogonal to both previously calculated states, etc. Numerical results obtained by means of this procedure for the energy levels of the P-donor in Si are reproduced in Figure 3.5. For a simple estimate of the anisotropy effect, one can replace the reciprocal effective mass $1/m_n^*$ of the hydrogen model by the reciprocal effective mass $(1/m_\parallel^* + 2/m_\perp^*)/3$ obtained by averaging the direction-dependent reciprocal effective masses of (3.100) over all directions in k-space. With this, one obtains energy levels for the singly ionizable donor in Si, which are shown in Figure 3.5a. The binding energy $E_B$ comes out as 29 meV in this approximation.
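The simple averaging estimate can be sketched as follows. The longitudinal and transverse mass ratios used below (0.98 and 0.19) are assumed, commonly quoted values for Si and do not come from the text; the dielectric constant 11.4 is the one used earlier in this section, and the hydrogenic scaling of the binding energy is likewise an assumption.

```python
RYDBERG_EV = 13.6  # binding energy of the hydrogen atom in eV


def averaged_mass_ratio(m_long, m_trans):
    """Mass ratio m*/m belonging to the averaged reciprocal mass (1/m_l + 2/m_t)/3."""
    return 3.0 / (1.0 / m_long + 2.0 / m_trans)


def donor_binding_energy_ev(m_long, m_trans, epsilon):
    """Hydrogen-model binding energy evaluated with the averaged mass."""
    return averaged_mass_ratio(m_long, m_trans) / epsilon ** 2 * RYDBERG_EV


if __name__ == "__main__":
    # Assumed illustrative parameters for Si (not taken from the text).
    m_long, m_trans, epsilon = 0.98, 0.19, 11.4
    print(f"averaged m*/m = {averaged_mass_ratio(m_long, m_trans):.3f}")
    print(f"E_B = {1000 * donor_binding_energy_ev(m_long, m_trans, epsilon):.1f} meV")
```

With these assumed parameters the estimate comes out at about 27 meV, close to the 29 meV quoted above; the precise figure depends on the mass and dielectric constant values adopted.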
Intervalley coupling
As a second corrective step, the intervalley coupling in the effective mass equation (3.99) will be considered. In perturbation theory, the calculation of eigenvalue corrections involves the diagonalization of the matrix $\Omega \langle k_i|V'|k_j\rangle |F_j(0)|^2$. Using the example of Si, this calculation will now be performed explicitly. In this case one has 6 minima, which lie on the cubic axes $k_x, k_y, k_z$ and $k_{\bar x}, k_{\bar y}, k_{\bar z}$ close to the respective X-points. From symmetry considerations it follows that $\Omega \langle k_i|V'|k_j\rangle |F_j(0)|^2$ has the general form
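$\begin{pmatrix} 0 & b & b & c & b & b \\ b & 0 & b & b & c & b \\ b & b & 0 & b & b & c \\ c & b & b & 0 & b & b \\ b & c & b & b & 0 & b \\ b & b & c & b & b & 0 \end{pmatrix},$
(3.101)
where the abbreviations
$b = \Omega \langle k_x|V'|k_y\rangle |F_x(0)|^2, \quad c = \Omega \langle k_x|V'|k_{\bar x}\rangle |F_x(0)|^2$
(3.102)
have been used. Lines and columns in (3.101) are written in the sequence $k_x, k_y, k_z, k_{\bar x}, k_{\bar y}, k_{\bar z}$. The matrix (3.101) has the three eigenvalues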
$\Delta E_1 = 4b + c,$ 1-fold,
$\Delta E_2 = -2b + c,$ 2-fold,
$\Delta E_3 = -c,$ 3-fold,
(3.103)
with the degrees of degeneracy indicated. Thus, intervalley coupling partially removes the 6-fold degeneracy among the valleys.
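The eigenvalues and their degeneracies can be verified directly by applying the coupling matrix to symmetry-adapted combinations of the six valleys. The sketch below assumes the valley order $x, y, z, \bar x, \bar y, \bar z$, with coupling $c$ between opposite valleys and $b$ between all other pairs; the numerical values chosen for $b$ and $c$ are arbitrary test numbers, not physical parameters.

```python
def valley_matrix(b, c):
    """6x6 coupling matrix in the valley order x, y, z, -x, -y, -z."""
    size = 6
    matrix = [[b] * size for _ in range(size)]
    for i in range(size):
        matrix[i][i] = 0.0               # zero diagonal
        matrix[i][(i + 3) % size] = c    # coupling of opposite valleys
    return matrix


def apply(matrix, vector):
    """Matrix-vector product with plain Python lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def is_eigenvector(matrix, vector, eigenvalue, tol=1e-12):
    """Check matrix * vector == eigenvalue * vector component by component."""
    image = apply(matrix, vector)
    return all(abs(w - eigenvalue * v) < tol for w, v in zip(image, vector))


# Symmetry-adapted valley combinations (not normalized).
ONE_FOLD = [[1, 1, 1, 1, 1, 1]]
TWO_FOLD = [[1, -1, 0, 1, -1, 0], [1, 1, -2, 1, 1, -2]]
THREE_FOLD = [[1, 0, 0, -1, 0, 0], [0, 1, 0, 0, -1, 0], [0, 0, 1, 0, 0, -1]]

if __name__ == "__main__":
    b, c = 1.0, 0.5   # arbitrary illustrative values
    m = valley_matrix(b, c)
    print(all(is_eigenvector(m, v, 4 * b + c) for v in ONE_FOLD))
    print(all(is_eigenvector(m, v, -2 * b + c) for v in TWO_FOLD))
    print(all(is_eigenvector(m, v, -c) for v in THREE_FOLD))
```

The fully symmetric combination belongs to $4b + c$, the two combinations that are even under exchange of opposite valleys but orthogonal to the symmetric one belong to $-2b + c$, and the three odd combinations belong to $-c$.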
The splitting of the ground state donor level in Si due to intervalley coupling can also be predicted just on the basis of group theory. One considers the representation of the symmetry group of the envelope equation (3.99) for Si, i.e. of $O_h$, in the space of the 6 reciprocal vector components $k_x, k_y, k_z, k_{\bar x}, k_{\bar y}, k_{\bar z}$. This representation is reducible. To demonstrate this one may employ the transformation rules of the vector components $x, y, z, \bar x, \bar y, \bar z$ under the action of the elements of $O_h$ (summarized in Table A.7 of Appendix A), realizing that these components transform in the same way as the reciprocal vector components considered here. Then one easily finds that the irreducible parts of this representation are $\Gamma_1$ ($A_1$), $\Gamma_{12}$ ($E$) and $\Gamma_{25}$ ($T_2$). The notations $A$, $E$, $T$ are commonly used in the theory of localized states, and we will also employ them here. With this, it is clear that the symmetry of the $\Delta E_1$-state is $A_1$, that of the two $\Delta E_2$-states is $E$, and that of the three $\Delta E_3$-states is $T_2$.
The size of the splitting depends on the intervalley matrix elements of the perturbation potential. If one considers only the first of the two above-mentioned effects which result in matrix elements of considerable size, i.e. the wavevector dependence of screening, and assumes that screening has become fully ineffective at large wavevectors $k_i - k_j$, therefore replacing the screened Coulomb potential $V'(x)$ of equation (3.80) by the bare Coulomb potential $(e^2/|x|)$ in $\langle k_i|V'|k_j\rangle$, then it follows that
(3.104)
With this, one obtains $b$ as 2.36 meV and $c$ as 1.2 meV. This yields $\Delta E_1 = 10.8$ meV, $\Delta E_2 = 3.6$ meV and $\Delta E_3 = 1.2$ meV. A 3-fold splitting of the P-donor ground state level in Si has, in fact, been observed experimentally (Aggarwal and Ramdas, 1965).
The simple theoretical estimate presented above agrees remarkably well with the experimental splittings of 11.6 meV between the $A_1$- and $T_2$-levels, and of 1.3 meV between the $E$- and $T_2$-levels (not resolved in Figure 3.5). The results of a numerical treatment of intervalley coupling shown in Figure 3.5 are even closer to the experimental values. However, the agreement is not as good for the absolute position of the ground state level. The addition of $\Delta E_1$, i.e. of 10.8 meV, to the binding energy of 29 meV without intervalley interaction yields a corrected binding energy of 39.8 meV, which is closer to the experimental value of 45 meV for Si:P but still clearly below it.
Chemical shifts
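Table 3.5 (entries in meV)
Host  Type       Impurity: binding energy
Ge    donors     P: 13, As: 14, Sb: 10, Bi: 13
Ge    acceptors  B: 11, Al: 11, Ga: 11, In: 12
Si    acceptors  In: 153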
Table 3.5 lists measured binding energies of singly ionizable donor and acceptor atoms in Si, Ge and GaAs. Not only are the absolute values of these energies striking, but also the fact that they are not equal for donor or acceptor atoms of different chemical species stands out. Contrary to the prediction of the above theory, there are dependencies of binding energies on the chemical nature of the impurity atoms. One refers to these as chemical shifts. The absence of chemical shifts in the calculated binding energies is a consequence of the approximation of the perturbation potential $V'(x)$ as a purely Coulombic potential in the hydrogen model. As we know from section 3.2, the correct perturbation potential of an impurity atom also contains a short-range part, to which all central cell corrections contribute. Among these, there are also contributions which depend on the chemical nature of the donor. The latter are the main reason for the experimentally measured chemical shifts of the donor binding energies. The effects of central cell corrections are particularly large in the case of the 1s ground state, as one can recognize from Figure 3.5 for the special case of the P-donor in Si. A weaker influence, and, accordingly, better agreement between experiment and the predictions of a theory omitting central cell effects occurs for the excited states with $n = 2, 3, \ldots$. This is to be expected since the excited state wavefunctions have maxima that do not lie at the perturbation center $x = 0$
(which does happen in the case of the ground state) but further outside. Electrons in the excited states are therefore less affected by changes of the potential in the central cell than are electrons in the ground state. Generally, the occurrence of pronounced chemical shifts means that the perturbation center is no longer shallow and the effective mass theory is no longer applicable for its theoretical treatment (see section 3.5 for further discussion).
Acceptors
Corrections to the simple hydrogen model are also necessary for acceptors. Of course, the maximum of the valence band lies in the center $\Gamma$ of the first BZ for all diamond and zincblende type semiconductors, as is assumed in the hydrogen model. However, this maximum is degenerate: depending on whether spin-orbit interaction is negligible or important one has, respectively, the 3-fold degeneracies of the representations $\Gamma'_{25}$ (diamond) or $\Gamma_{15}$ (zincblende) in the space of scalar functions, or the 4-fold degeneracies of the representations $\Gamma_8^+$ (diamond) or $\Gamma_8$ (zincblende) in the space of two-component spinor functions. In the vicinity of $\Gamma$, this degeneracy is lifted and one has three or two anisotropic valence bands. In section 2.7 these were approximated by two isotropic parabolic bands, one for heavy holes and one for light holes. If one also uses this approximation here, then there are two hydrogen-like series of acceptor levels, one for each sort of holes. Calculating acceptor binding energies by means of expression (3.95) in the case of Si yields 50 meV for heavy holes which are expected to form the ground state. In the case of Ge, the result is 17 meV for heavy holes. One can hardly expect these values to be in agreement with experiment. In fact, they differ appreciably from the measured ground state binding energies shown in Table 3.5. Evidently, nonparabolicity and anisotropy of the band structure play an essential role and must be taken into account in realistic calculations.
This may be done by means of the multiband effective mass theory developed in the preceding section. Below, we will explain the application of this theory to singly ionizable acceptors in Si. Owing to the small spin-orbit splitting energy $\Delta$ of 44 meV and the relatively large acceptor binding energies $E_B$ (68 meV for the isocoric impurity atom Al), spin-orbit interaction initially will be neglected. Later this approximation needs to be corrected since $E_B$ is not much larger than $\Delta$. Without spin and spin-orbit interaction the wavefunction $\psi_A(x)$ is a linear combination of the three Bloch functions $\langle x|v m 0\rangle$, $m = x, y, z$, of the representation $\Gamma'_{25}$, which are enveloped by the components $F_m(x)$ of the envelope function vector $F(x)$. Thus
(3.105)
The corresponding effective mass equation of the $\Gamma'_{25}$ valence band is given by relation (3.72). If, therein, $U(x)$ is identified as the Coulomb potential (3.80), one obtains, in the case $\Delta Z = -1$,
The $F_{A1}(x), F_{A2}(x), \ldots$ belong to a space which differs from the ordinary Hilbert space in that it is not spanned by one-component basis functions, but by three-component basis functions. A representation in this space is given by the product of two representations, one of which corresponds to the 3D representation according to which the components of each of the three-component functions $F_{A1}, F_{A2}, \ldots$ transform separately, and the other is the representation according to which the three-component functions transform among each other. One can easily show that the latter representation must be $\Gamma_A$, so that $\psi_{A1}(x), \psi_{A2}(x), \ldots$, formed according to relation (3.105), belong to the representation $\Gamma_A$, as has been supposed. For the representation $D_A$ of $F_{A1}(x), F_{A2}(x), \ldots$ it follows that
(3.107)
This representation is reducible. The irreducible components determine the symmetries of the three-component basis functions to be used for the representation of the three-component envelope function of the acceptor state under consideration. Equation (3.107) yields a remarkable conclusion concerning the symmetry of acceptor states. An envelope function basis vector which transforms according to the unity representation occurs in the expansion of the acceptor eigenfunctions $\psi_{A1}(x), \psi_{A2}(x), \ldots$ only when the symmetry $\Gamma_A$ of these functions is $\Gamma'_{25}$. The basis vector belonging to the unity representation is the only one which does not vanish at the central site $x = 0$. Since the wavefunction of the ground state will also be nonzero at $x = 0$, $D_A$ must contain the unity representation if $A$ is the ground state. According to (3.107) this is only possible if the ground state has the symmetry $\Gamma'_{25}$. Acceptor levels of other symmetry are necessarily excited states. The symmetries of all three-component basis functions involved in the construction of the ground state envelope function are given by the relation
(3.108)
shown in Table A.27 of Appendix A. Relation (3.108) determines the symmetries of the three-component auxiliary functions under rotations, and therefore the angular dependencies of these functions. In order to also get their radial dependencies one expands the $\Gamma_1$, $\Gamma_{12}$, $\Gamma'_{15}$ and $\Gamma'_{25}$ functions with respect to angular momentum eigenfunctions with the various quantum numbers $l$, $m$ and multiplies the expansion coefficients of different values of $l$ by corresponding radial wavefunctions $r^l \exp(-r/r_l)$. These are formed in analogy to the eigenfunctions for the Coulomb potential, but with $l$-dependent localization radii $r_l$. The latter are treated as variational parameters, just like the coefficients of the auxiliary functions belonging to different irreducible representations. Applying this procedure to the acceptor ground state of Si, a binding energy of 31 meV is obtained (Schechter, 1962). More recent calculations, taking spin-orbit interaction into account, have resulted in a value of 44 meV (Baldereschi and Lipari, 1977), which is very close to the experimental value of 45 meV for B in Table 3.5 but, of course, does not account for the pronounced chemical shifts seen in this table. Calculations of acceptor binding energies have also been performed for materials like Ge whose spin-orbit splitting energies are large compared to the acceptor binding energies. Spin has to be taken into account under such circumstances, i.e. the $\Gamma'_{25}$ valence band has to be replaced by the two $\Gamma_8^+$ and $\Gamma_7^+$ bands. Owing to the large spin-orbit splitting energy, however, the spin-orbit-split $\Gamma_7^+$ band can be omitted. For the expansion of the 4-component envelope function of the remaining $\Gamma_8^+$ band one needs auxiliary functions whose symmetries include $2\Gamma'_{15} + 2\Gamma'_{25}$.
is taken into account, then the 1s-level is shifted up by just this energy $U_s$. By exciting one electron into the conduction band, the neutral S-donor
S(0+) becomes a singly positively charged S(1+) donor. For the 1s-level of the S(1+) ion the shift in energy by $U_s$ does not occur. Thus this level is shifted down by $U_s$ in comparison with the 1s-level of the S(0+) atom. Experimentally, one finds that the ionization energy of the neutral S(0+) donor is 0.31 eV, and the ionization energy of the S(1+) donor is 0.69 eV (see Figure 3.6). From these values a Hubbard energy $U_s$ of 0.28 eV may be deduced. However, both ionization energies are substantially larger than the result 4 x 0.03 eV = 0.12 eV which follows from the hydrogen model. Evidently, the central cell short-range potential contribution is essential in the case of the S-donor in Si. This donor is a deep center rather than a shallow one.
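The comparison with the hydrogen model can be made explicit with a short sketch. It assumes that the hydrogenic ionization energy scales with the square of the charge difference, so that a doubly ionizable donor gives four times the single-donor value of about 0.03 eV; the measured energies are the ones quoted above, and the function name is illustrative.

```python
def hydrogenic_ionization_ev(single_donor_ev, delta_z):
    """Assumed hydrogen-model scaling: energy grows as delta_z squared."""
    return delta_z ** 2 * single_donor_ev


# Hydrogen-model estimate for a donor with delta_z = 2.
ESTIMATE_EV = hydrogenic_ionization_ev(0.03, 2)

# Measured ionization energies of the S donor in Si, as quoted in the text.
MEASURED_EV = {"S(0+)": 0.31, "S(1+)": 0.69}

# Factor by which each measured value exceeds the hydrogen-model estimate.
RATIOS = {state: value / ESTIMATE_EV for state, value in MEASURED_EV.items()}

if __name__ == "__main__":
    print(f"hydrogen-model estimate: {ESTIMATE_EV:.2f} eV")
    for state, ratio in RATIOS.items():
        print(f"{state}: measured value is {ratio:.1f} times the estimate")
```

Both measured ionization energies exceed the hydrogen-model estimate by more than a factor of two, which is the quantitative content of the statement that the S-donor in Si is a deep center.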
3.5
Deep levels
Figure 3.6: Ionization energy of substitutional impurity atoms in various tetrahedrally coordinated host semiconductors as a function of the strength of the perturbation potential, measured in terms of valence shell s-level differences between impurity and host atoms in the case of donors, and valence shell p-level differences in the case of acceptors. Donor energies are relative to the conduction band edge, and acceptor energies to the valence band edge. Axis label: atomic electronegativity (eV). (After Vogl, 1981. Reproduced from Boer, 1990.)
In the discussion above, the periodic potential of the crystal was omitted
from consideration. An experimental proof of the threshold behavior of short-range potentials in crystals may be taken from Figure 3.6. There, the ionization energies of a series of substitutional impurity atoms are plotted as a function of the difference between the atomic valence shell energy levels of the host and impurity atoms. This difference can serve as a measure of the strength of the short-range potential. The fact that for donor states
for selected deep centers, among them the vacancy in Si, the substitutional impurity atoms of the main groups of the periodic table, the group of 3d transition metals, and the group of rare earths. Also, we discuss the DX center and the EL2 center in GaAs.
3.5.2
The 'Tight Binding' (TB) method developed in section 2.6 represents one of the various procedures for calculating band structures of ideal crystals. Unlike other methods it uses basis functions to represent the Hamiltonian which are localized on the atomic length scale. Since the perturbation potentials of deep centers are localized on the same scale, the TB method should be particularly well suited for such centers. Of course, one must choose the Hamiltonian matrix elements empirically in order to arrive at useful practical results, and the results also cannot be expected to be very accurate in a quantitative sense. However, the method should be suited to the derivation of simple models that exhibit the essential physical features of real deep centers. The simplest among these models is the so-called defect-molecule model, which we will introduce below. In doing so, many-particle effects and lattice relaxation will be ignored. We will mainly use the model to demonstrate the existence of deep levels and to explore the symmetry of the pertinent eigenfunctions.
At the outset we have to clarify which of the various tight binding basis sets should be used for the representation of the Hamiltonian in the case of a deep center. In the ideal crystal case considered in section 2.6, the atomic orbitals or hybrid orbitals $|h_t R j\rangle$ were not used directly; rather, we employed wavevector-dependent Bloch sums $|\alpha k j\rangle$ or $|h_t k j\rangle$ formed from them. This was advantageous because the translation symmetry of the crystal could be exploited in this way. The latter symmetry no longer exists in a crystal with a deep center, so that localized basis functions, i.e. atomic or hybrid orbitals, can also be used without any loss. We select hybrid orbitals because these produce drastic simplifications of the Hamiltonian matrix which, like the Bloch sums of hybrid orbitals in the case of an ideal crystal, allow the eigenvalues of the Hamiltonian to be calculated in closed analytical form.
We introduce the defect-molecule model using the vacancy as an example. Later, we will apply this model to substitutional impurities from the main groups of the periodic table. As the host crystal we take, in all cases, an elemental semiconductor of group IV, like Si.
Vacancy
Figure 3.7 shows part of a Si crystal containing a vacancy. Symmetry considerations make clear that the origin of the vacancy is not important, i.e. it does not matter whether it originated by removal of a Si atom from sublattice 1 or from sublattice 2. Here, we consider the removal of an atom from sublattice 1 and, to be still more specific, from the primitive unit cell at $\mathbf{R} = 0$. The perturbation potential $V(\mathbf{x})$ is the negative of the potential produced by this atom in the crystal. Because of the removal of the 1-atom, the hybrid orbitals $|h_t 2 \mathbf{R}_t\rangle$, $t = 1, 2, 3, 4$, of the four surrounding 2-atoms in the unit cells $\mathbf{R}_t$, pointing inwards, no longer have a hybrid orbital of a 1-atom to which they can bind. They are called dangling hybrids. The three other hybrid orbitals at a surrounding 2-atom interact with inwardly directed hybrid orbitals at atoms lying still further away (these atoms are not shown in Figure 3.7). The hybrid orbitals at a particular surrounding 2-atom also interact among themselves, including the one dangling hybrid at this atom. This means that the dangling hybrids are coupled to the entire crystal through nearest-neighbor interactions. If interactions between hybrid orbitals at the same atom are omitted, i.e. if the matrix element $V_1$ introduced in section 2.6 is set to zero, which means $\epsilon_s = \epsilon_p$, then the dangling hybrids are decoupled from the remainder of the crystal. They then interact only among themselves. Since the atoms at which these hybrids are located are second-nearest neighbors, this interaction is not within the framework of the first-nearest neighbor interaction which we have been exclusively considering in the treatment of an ideal crystal in section 2.6; rather, it has the sense of a second-nearest neighbor interaction. The latter must be taken into account here in order to arrive at nontrivial results. Based on the approximations made above, the crystal with a vacancy
decomposes into two partial systems which do not interact with each other: first, the partial system of the four interacting dangling hybrid orbitals, one at each of the four atoms surrounding the vacancy, and second, the partial system of all remaining hybrid orbitals of the crystal with the vacancy, i.e. the hybrid orbitals of all atoms which are not directly adjacent to the vacancy, and the three hybrids at each one of the four adjacent atoms which are not already included in the first partial system. With some ambiguity, we refer to the first partial system as the defect molecule (here, the vacancy molecule) and to the second as the rest-of-crystal. This designation also encompasses the term defect molecule model for the tight binding approximation described above.
For each hybrid of the rest-of-crystal another hybrid exists in the rest-of-crystal which points to it. The Hamiltonian matrix elements between the various pairs have the same value, namely the value given above in equation (2.292) defining the matrix element $V_2$. The energy spectrum of the rest-of-crystal is therefore identical with that of the infinite ideal crystal in the simplest TB approximation, consisting of the bonding level $\epsilon_b = \epsilon_h - |V_2|$ and the antibonding level $\epsilon_a = \epsilon_h + |V_2|$. The splitting of these highly degenerate levels into bands remains incomplete because of the neglect of the interaction between hybrids at the same atom, i.e. the neglect of $V_1$.
In calculating the energy spectrum of the first partial system, i.e. the vacancy molecule, we need the matrix elements of the perturbed Hamiltonian $H + V$ of the crystal with vacancy. The diagonal elements $\langle h_t 2 \mathbf{R}_t | H + V | h_t 2 \mathbf{R}_t \rangle$ are simply the hybrid energies $\epsilon_h$, since the elements $\langle h_t 2 \mathbf{R}_t | V | h_t 2 \mathbf{R}_t \rangle$ of the vacancy potential $V$ between hybrid orbitals localized off the vacancy site are small. One therefore has
$$\langle h_t 2 \mathbf{R}_t | H + V | h_t 2 \mathbf{R}_t \rangle = \epsilon_h, \qquad t = 1, 2, 3, 4.$$
In order to obtain the non-diagonal (second-nearest neighbor) matrix elements $\langle h_t 2 \mathbf{R}_t | H + V | h_{t'} 2 \mathbf{R}_{t'} \rangle$ between different dangling hybrid orbitals $t$ and $t' \neq t$, one has to recall the symmetry of the perturbed Hamiltonian $H + V$. Since, by creating a vacancy in sublattice 1, the two sublattices are no longer equivalent, the symmetry of the crystal is no longer given by the full cubic point group $O_h$, but rather by the tetrahedral group $T_d$. Nevertheless, all non-diagonal elements are equal for symmetry reasons, such that
$$\langle h_t 2 \mathbf{R}_t | H + V | h_{t'} 2 \mathbf{R}_{t'} \rangle = W, \qquad t' \neq t,$$
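with $W$ as second-nearest neighbor interaction energy. Because of the predominantly negative values of the operator $H + V$ acting on hybrid orbitals and the predominantly positive values of the hybrid orbital products, $W$ is expected to be negative. The absolute value of $W$ must be determined empirically. The energy levels of the vacancy molecule are obtained by diagonalizing the matrix
$$\begin{pmatrix} \epsilon_h & W & W & W \\ W & \epsilon_h & W & W \\ W & W & \epsilon_h & W \\ W & W & W & \epsilon_h \end{pmatrix}.$$
It has a simple eigenvalue and a triply degenerate eigenvalue,
$$E^{vac}_{A_1} = \epsilon_h + 3W, \tag{3.113}$$
$$E^{vac}_{T_2} = \epsilon_h - W, \tag{3.114}$$
with the eigenfunctions
$$|E^{vac}_1\rangle = \frac{1}{2}\left[\,|h_1 2 \mathbf{R}_1\rangle + |h_2 2 \mathbf{R}_2\rangle + |h_3 2 \mathbf{R}_3\rangle + |h_4 2 \mathbf{R}_4\rangle\,\right], \tag{3.115}$$
$$|E^{vac}_2\rangle = \frac{1}{\sqrt{2}}\left[\,|h_1 2 \mathbf{R}_1\rangle - |h_2 2 \mathbf{R}_2\rangle\,\right], \tag{3.116}$$
$$|E^{vac}_3\rangle = \frac{1}{\sqrt{2}}\left[\,|h_1 2 \mathbf{R}_1\rangle - |h_3 2 \mathbf{R}_3\rangle\,\right], \tag{3.117}$$
$$|E^{vac}_4\rangle = \frac{1}{\sqrt{2}}\left[\,|h_1 2 \mathbf{R}_1\rangle - |h_4 2 \mathbf{R}_4\rangle\,\right]. \tag{3.118}$$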
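As a quick numerical cross-check of the vacancy-molecule spectrum, the following sketch diagonalizes the 4x4 matrix of the four dangling hybrids (hybrid energy on the diagonal, $W$ everywhere else). The function name and the numerical values of $\epsilon_h$ and $W$ are illustrative assumptions, not fitted Si parameters.

```python
import numpy as np

def vacancy_molecule(eps_h, W):
    """Diagonalize the 4x4 dangling-hybrid matrix (eps_h on the diagonal, W elsewhere)."""
    H = np.full((4, 4), float(W))
    np.fill_diagonal(H, eps_h)
    return np.linalg.eigh(H)  # eigenvalues in ascending order, eigenvectors as columns

# Illustrative (assumed) values; W < 0 as argued in the text.
eps_h, W = 0.0, -0.5
levels, vectors = vacancy_molecule(eps_h, W)

print("levels:", levels)                 # one level eps_h + 3W, three levels eps_h - W
print("A1 eigenvector:", vectors[:, 0])  # equal weight on all four dangling hybrids
```

For negative $W$ the non-degenerate level is the lowest one, and its eigenvector is the totally symmetric combination of the four hybrids.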
One can easily demonstrate that $|E^{vac}_1\rangle$ belongs to the irreducible representation $A_1$ of the symmetry group $T_d$ of the vacancy, and the three functions $|E^{vac}_2\rangle$, $|E^{vac}_3\rangle$, $|E^{vac}_4\rangle$ to the irreducible representation $T_2$ of this group (see Table A.6 of Appendix A). The eigenfunction of the $A_1$ level resembles an atomic s-orbital of a Si atom, and the three $T_2$ eigenfunctions are similar to the three p-orbitals of such an atom. Evidently, the $sp^3$ hybridization of the atomic orbitals in a Si crystal is removed at a vacancy. The states are more atom-like, in consequence of the fact that the crystal potential is no longer fully effective in the vicinity of a vacancy. In Figure 3.8, the energy spectrum of the defect molecule model of a vacancy is shown along with that of the rest-of-crystal. The $A_1$ level lies below the $T_2$ level because of the negative value of $W$. Whether it lies below or above the bonding level of the crystal depends on whether we have $3|W| > |V_2|$ or $3|W| < |V_2|$. The question of whether it is found in the valence band or in the energy gap cannot be answered within the defect molecule model because, therein, the valence band has shrunk to the bonding energy level. It is just as difficult to decide whether the $T_2$ level lies in the conduction band or in the energy gap. Experiment and more exact calculations, which we will discuss later in more detail, show that the $A_1$ level lies in the valence band, and the $T_2$ level in the energy gap. With this,
the $T_2$ level is the actual deep level of the vacancy. Of the four electrons of the neutral vacancy (each of the four dangling bonds yields one), two must be hosted by this level while the other two occupy the $A_1$ level in the valence band. Using the terminology introduced above, we may say that the oxidation state of the neutral vacancy is $V^{2+}$. The defect molecule model of the vacancy reflects the actual relationships remarkably well. In any case it provides a qualitatively correct physical picture of the electronic structure of a vacancy in group-IV elemental semiconductors. Refinements of this picture will be discussed further below. Here we treat a second example to illustrate the defect molecule model, which emerges from the vacancy by occupying its empty lattice site with an impurity atom.
Figure 3.8: Energy spectrum of the defect molecule model of a Si vacancy along with that of the rest-of-crystal. The distribution of the 4 electrons of the defect molecule over the energy levels is also shown.
Substitutional impurity atoms with $sp^3$ bonding
We consider a substitutional impurity atom in an elemental semiconductor of group IV, which we can again imagine as a Si crystal (see Figure 3.9). Let the substituted Si atom be that of sublattice 1 in the primitive unit cell $\mathbf{R} = 0$. Like Si, the impurity atom should belong to one of the main groups II, III, IV, V, or VI of the periodic table so that the valence shell is formed by s- and p-orbitals, which all lie energetically higher than any other occupied orbitals (in contrast to rare earth atoms). The perturbation potentials of these impurity atoms in an elemental semiconductor of group IV possess, as we know, both a short-range part and also a long-range Coulomb part. The
latter will be omitted from consideration below. This approximation does not affect the answer to the question of whether a particular impurity atom forms a deep level or not, although it influences the actual position of this level in the energy gap. The latter cannot be determined within the defect molecule model anyway. The four $sp^3$ hybrid orbitals of the impurity atom will be denoted by $|h_t i 0\rangle$, and the four hybrid orbitals of the host crystal pointing in the direction of this atom will be denoted by $|h_t 2 \mathbf{R}_t\rangle$, $t = 1, 2, 3, 4$. Neglecting interactions between the hybrid orbitals at the host atoms, the crystal with a substitutional impurity decomposes, just as before in the vacancy case, into two partial systems: first, the defect molecule with the 8 orbitals $|h_t i 0\rangle$, $|h_t 2 \mathbf{R}_t\rangle$, $t = 1, 2, 3, 4$, and second, the rest-of-crystal with all remaining orbitals. The energy spectrum of the rest-of-crystal again coincides with that of the ideal crystal within the framework of the approximations used here. The Hamiltonian matrix of the defect molecule is composed of elements of the general form
$$\langle h_t j \mathbf{R}_t | H + V | h_{t'} j' \mathbf{R}_{t'} \rangle, \tag{3.119}$$
where $j$ and $j'$ take the values $i$ and 2 independently of each other (for $j = i$ the lattice vector is $\mathbf{R} = 0$). We consider only the most important elements of this matrix, namely
$$\langle h_t i 0 | H + V | h_t i 0 \rangle = \epsilon_h^i, \tag{3.120}$$
$$\langle h_t 2 \mathbf{R}_t | H + V | h_t 2 \mathbf{R}_t \rangle = \epsilon_h^h, \tag{3.121}$$
$$\langle h_t i 0 | H + V | h_t 2 \mathbf{R}_t \rangle = V_2, \tag{3.122}$$
$$\langle h_t 2 \mathbf{R}_t | H + V | h_{t'} 2 \mathbf{R}_{t'} \rangle = W, \qquad t' \neq t. \tag{3.123}$$
In this, $\epsilon_h^i$ and $\epsilon_h^h$ signify, respectively, the hybrid energies of the impurity and host atoms, $V_2$ corresponds to the matrix element of $H$ of equation (2.292) between hybrid orbitals at nearest-neighbor atoms of the ideal crystal pointing toward each other, and $W$ describes, as in the vacancy case, the second-nearest neighbor interaction between different host atom hybrid orbitals pointing toward the impurity atom. All elements $\langle h_t j \mathbf{R}_t | H + V | h_{t'} j' \mathbf{R}_{t'} \rangle$ not listed above are neglected, among them also the $V_1$-like elements $\langle h_t i 0 | H + V | h_{t'} i 0 \rangle$ between different hybrid orbitals at the impurity atom, i.e. with $t' \neq t$. With these approximations, the Hamiltonian matrix of the defect molecule may now be written down in explicit form. In doing so the rows and columns are ordered in the sequence $|h_1 i 0\rangle$, $|h_2 i 0\rangle$, $|h_3 i 0\rangle$, $|h_4 i 0\rangle$, $|h_1 2 \mathbf{R}_1\rangle$, $|h_2 2 \mathbf{R}_2\rangle$, $|h_3 2 \mathbf{R}_3\rangle$, $|h_4 2 \mathbf{R}_4\rangle$, and the Hamiltonian matrix is
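$$\begin{pmatrix}
\epsilon_h^i & 0 & 0 & 0 & V_2 & 0 & 0 & 0 \\
0 & \epsilon_h^i & 0 & 0 & 0 & V_2 & 0 & 0 \\
0 & 0 & \epsilon_h^i & 0 & 0 & 0 & V_2 & 0 \\
0 & 0 & 0 & \epsilon_h^i & 0 & 0 & 0 & V_2 \\
V_2 & 0 & 0 & 0 & \epsilon_h^h & W & W & W \\
0 & V_2 & 0 & 0 & W & \epsilon_h^h & W & W \\
0 & 0 & V_2 & 0 & W & W & \epsilon_h^h & W \\
0 & 0 & 0 & V_2 & W & W & W & \epsilon_h^h
\end{pmatrix}. \tag{3.124}$$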
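This matrix has four distinct eigenvalues, two simple and two triply degenerate. The eigenfunctions of the two simple levels (we distinguish them by indices $b$ and $a$) belong to the 1-dimensional irreducible representation $A_1$ of the tetrahedral group $T_d$, and those of the two triply degenerate levels (again distinguished by indices $b$ and $a$) belong to the 3-dimensional irreducible representation $T_2$ of $T_d$. The corresponding energy levels are, respectively, denoted as $E^{A_1}_{a,b}$ and $E^{T_2}_{a,b}$, whence
$$E^{A_1}_{a,b} = \frac{1}{2}\left(\epsilon_h^i + \epsilon_h^h + 3W\right) \pm \sqrt{\frac{1}{4}\left(\epsilon_h^i - \epsilon_h^h - 3W\right)^2 + V_2^2}, \tag{3.125}$$
$$E^{T_2}_{a,b} = \frac{1}{2}\left(\epsilon_h^i + \epsilon_h^h - W\right) \pm \sqrt{\frac{1}{4}\left(\epsilon_h^i - \epsilon_h^h + W\right)^2 + V_2^2}, \tag{3.126}$$
where the upper sign refers to the antibonding level $a$ and the lower sign to the bonding level $b$.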
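The level structure of the 8x8 defect-molecule matrix can be verified numerically. The sketch below builds the matrix from its two diagonal blocks (impurity hybrids with energy $\epsilon_h^i$, host hybrids with energy $\epsilon_h^h$ and mutual coupling $W$) and the coupling $V_2$, and compares its spectrum with the eigenvalues of two symmetry-adapted 2x2 problems, in which the impurity hybrid energy is coupled by $V_2$ to $\epsilon_h^h + 3W$ ($A_1$) or to $\epsilon_h^h - W$ ($T_2$). All function names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def defect_molecule_matrix(eps_i, eps_host, V2, W):
    """8x8 matrix in the order: four impurity hybrids, then four host hybrids."""
    one = np.eye(4)
    host_block = eps_host * one + W * (np.ones((4, 4)) - one)
    return np.block([[eps_i * one, V2 * one],
                     [V2 * one, host_block]])

def two_level(e1, e2, coupling):
    """Lower (bonding) and upper (antibonding) eigenvalue of [[e1, c], [c, e2]]."""
    mean = 0.5 * (e1 + e2)
    root = np.sqrt(0.25 * (e1 - e2) ** 2 + coupling ** 2)
    return mean - root, mean + root

def closed_form_levels(eps_i, eps_host, V2, W):
    """A1 and T2 levels from the symmetry-adapted 2x2 problems."""
    return {"A1": two_level(eps_i, eps_host + 3 * W, V2),
            "T2": two_level(eps_i, eps_host - W, V2)}

# Illustrative (assumed) parameters in arbitrary energy units.
params = dict(eps_i=1.0, eps_host=0.0, V2=-2.0, W=-0.5)
numeric = np.linalg.eigvalsh(defect_molecule_matrix(**params))
analytic = closed_form_levels(**params)

# Each A1 level appears once, each T2 level three times.
expected = sorted(list(analytic["A1"]) + 3 * list(analytic["T2"]))
print(numeric)
print(expected)
```

The degeneracy pattern (1, 3, 1, 3) is fixed by the $T_d$ symmetry and does not depend on the chosen numbers.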
Figure 3.10: Energy levels of the defect molecule of an $sp^3$-bonding impurity atom in a tetrahedral semiconductor as a function of the hybrid energy difference $(\epsilon_h^i - \epsilon_h^h)$. Horizontal lines indicate the bonding level $\epsilon_b$ and the antibonding level $\epsilon_a$ of the host crystal.
In Figure 3.10 these levels are plotted as functions of the difference $(\epsilon_h^i - \epsilon_h^h)$ between the hybrid energies of impurity and host atoms. This difference represents a measure of the strength of the short-range perturbation potential of the impurity atom. In order to better understand the physical meaning of the energy levels derived above, it is helpful to consider two limiting cases. First, we assume the impurity atom to coincide with the host atom, which means $\epsilon_h^i = \epsilon_h^h$. If one also neglects the second-nearest neighbor interaction energy $W$, then the energy spectrum of the perturbed crystal of equations (3.125) and (3.126) must coincide with that of the ideal host crystal in the simplest TB approximation, i.e. a bonding energy level $\epsilon_b = \epsilon_h - |V_2|$ and an antibonding energy level $\epsilon_a = \epsilon_h + |V_2|$ must emerge. This is in fact the case. Thus it is also clear that the two energy levels $E^{A_1}_{a,b}$ of the perturbed crystal correspond to bonding and antibonding states of $A_1$ symmetry, and the two energy levels $E^{T_2}_{a,b}$ to bonding and antibonding states of $T_2$ symmetry. The two undisturbed levels $\epsilon_h \pm |V_2|$ in Figure 3.10 encompass the energy gap of the host crystal. Second, we consider the limiting case of $(\epsilon_h^i - \epsilon_h^h)$ tending toward $+\infty$ or $-\infty$ which, for a given host crystal, also means $\epsilon_h^i \to +\infty$ or $\epsilon_h^i \to -\infty$, respectively. In the limit
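$\epsilon_h^i \to +\infty$ we have
$$E^{A_1}_a \to +\infty, \qquad E^{T_2}_a \to +\infty, \tag{3.127}$$
$$E^{A_1}_b = \epsilon_h^h + 3W, \qquad E^{T_2}_b = \epsilon_h^h - W, \tag{3.128}$$
and in the limit $\epsilon_h^i \to -\infty$ we have
$$E^{A_1}_a = \epsilon_h^h + 3W, \qquad E^{T_2}_a = \epsilon_h^h - W, \tag{3.129}$$
$$E^{A_1}_b \to -\infty, \qquad E^{T_2}_b \to -\infty. \tag{3.130}$$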
In the limiting case $\epsilon_h^i \to +\infty$, the two antibonding levels $E^{A_1}_a$ and $E^{T_2}_a$ of (3.127) tend, along with $\epsilon_h^i$, toward $+\infty$, i.e. they leave the energy spectrum on its high energy side, while the two bonding levels $E^{A_1}_b$ and $E^{T_2}_b$ tend toward, respectively, the $A_1$ and $T_2$ levels of the vacancy of equations (3.113) and (3.114). This is understandable because the limiting case $\epsilon_h^i \to +\infty$ means that no impurity hybrid energy level exists at finite energy and, therefore, there is also effectively no impurity atom. In other words, there is a vacancy. With $\epsilon_h^i \to -\infty$ the two bonding energy levels $E^{A_1}_b$ and $E^{T_2}_b$ leave the energy spectrum at its low energy side, while the antibonding levels $E^{A_1}_a$ and $E^{T_2}_a$ tend toward the $A_1$ and $T_2$ levels of the vacancy of equations (3.113) and (3.114). This occurs for the same reason as above in the limit $\epsilon_h^i \to +\infty$, namely because $\epsilon_h^i \to -\infty$ also means that there is a vacancy at the impurity site. For finite positive values of $(\epsilon_h^i - \epsilon_h^h)$, the bonding $A_1$ and $T_2$ levels can occur in the energy gap, forming deep levels there, and for finite negative values of $(\epsilon_h^i - \epsilon_h^h)$ the same can happen with the two antibonding $A_1$ and $T_2$ levels. A look at Figure 3.10 shows that the most favorable candidate for an impurity level to be in the energy gap is the bonding $T_2$ level if $(\epsilon_h^i - \epsilon_h^h) > 0$ holds, and the antibonding $A_1$ level if $(\epsilon_h^i - \epsilon_h^h) < 0$ holds. More rigorous calculations confirm this conjecture in that they find exactly these levels to be in the gap. Below, we will describe such calculations and discuss methods for solving the one-electron Schrödinger equation (3.8) for the crystal with a point perturbation.
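The two limiting cases discussed above can be traced numerically. The following sketch tabulates the bonding and antibonding $A_1$ and $T_2$ levels as a function of the hybrid energy difference, using the eigenvalues of the 2x2 problems in which the impurity hybrid energy is coupled by $V_2$ to the vacancy levels $\epsilon_h^h + 3W$ and $\epsilon_h^h - W$. The function name, default parameters and the sampled differences are assumptions for illustration only.

```python
import numpy as np

def levels(delta, eps_host=0.0, V2=-2.0, W=-0.5):
    """Bonding/antibonding A1 and T2 levels for eps_i = eps_host + delta."""
    eps_i = eps_host + delta
    out = {}
    for name, e_vac in (("A1", eps_host + 3 * W), ("T2", eps_host - W)):
        mean = 0.5 * (eps_i + e_vac)
        root = np.sqrt(0.25 * (eps_i - e_vac) ** 2 + V2 ** 2)
        out[name] = (mean - root, mean + root)  # (bonding, antibonding)
    return out

# Very large |delta| mimics the limits eps_i -> +infinity and -infinity.
for delta in (-1e6, -3.0, 0.0, 3.0, 1e6):
    lv = levels(delta)
    print(f"delta={delta:>12.1f}  A1={lv['A1']}  T2={lv['T2']}")
```

For very large positive differences the bonding pair approaches the vacancy levels, and for very large negative differences the antibonding pair does, while the other pair moves out of the spectrum together with the impurity hybrid energy.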
3.5.3
Solution methods for the one-electron Schrödinger equation of a crystal with a point perturbation
The solution of the Schrödinger equation for a crystal with a point perturbation begins with the determination of the effective one-electron potential $V_{pert}(\mathbf{x})$. In section 3.2 this task was already addressed in general terms. Two remarkable differences were found in relation to an ideal crystal: first, the effective one-electron potential $V_{pert}(\mathbf{x})$ depends on the electron population of (localized) one-particle states, and second, the atomic structure of the perturbed crystal, which participates in the determination of the potential $V_{pert}(\mathbf{x})$, is in many cases initially unknown and must be calculated self-consistently together with the electronic structure. Below, we will eliminate such additional difficulties of electronic structure calculations for point perturbations by assuming that the population of the center and the atomic structure of the perturbed crystal are known. Thus the potential $V_{pert}(\mathbf{x})$ of the center can also be considered to be known. In regard to the
Supercell method
At the boundary of a properly shaped cluster, one can add an additional cluster of the same kind. If one does this repeatedly and continues the whole process ad infinitum, one finally arrives at an infinite crystal. The primitive unit cell of this crystal is now, however, no longer that of the crystal hosting the point perturbation, but it is the cluster, which in this context is called a supercell. The crystal is referred to as a supercrystal. The periodic repetition of the point perturbation in the supercell method causes the discrete deep levels to become k-dependent bands of finite widths. If the supercell is made large enough, these widths are negligibly small and it suffices to calculate the band structure of the supercrystal for one k-point only, e.g., for $\mathbf{k} = 0$. The appealing feature of the supercell method is that the whole apparatus of band structure calculations for ideal crystals can be exploited for the electronic and atomic structure determinations of deep centers.
In the cluster and supercell methods one obtains the deep levels of the perturbed crystal by numerically calculating the energy spectrum of a model system, i.e. of the cluster in the first method and of the supercrystal in the second. Results for the band structure of the ideal crystal are neither necessary nor useful in either method. Also, the decomposition of the entire effective one-electron potential into that of an ideal crystal and a perturbation potential need not be made. In the third method for calculating the electronic structure of perturbed crystals, the Green's function method, this decomposition is essential, and the band structure of the ideal crystal is required.
Green's function method
The Green's function method employs techniques and insights of quantum mechanical scattering theory. It provides results not only about bound states with energy levels in the gap of the ideal crystal, but also about scattering states having energies in the allowed energy spectrum of the ideal crystal. First we consider the bound states.
Bound states: Koster-Slater method
To explain this method we write down the Schrödinger equation (3.21) for the perturbed crystal once again in a somewhat modified form,
$$[E - H]\,\psi = V'\psi. \tag{3.131}$$
Since we are interested in bound states, the eigenvalues $E$ which we seek from this equation lie outside of the bands or, to be more precise, in the fundamental energy gap between the highest valence band and the lowest conduction band. As a preliminary we note that equation (3.131) can be formally solved for $\psi$ by multiplying both sides by the inverse $[E - H]^{-1}$ of the operator $[E - H]$. The inverse operator $[E - H]^{-1}$ stands in close relation to the Green's function of the unperturbed crystal. The retarded Green's function $G^0(E)$ is defined by the equation
$$G^0(E) = \frac{1}{E + i\delta - H}, \tag{3.132}$$
where $i\delta$ is an infinitesimal positive imaginary part which is set to zero after serving to remove singularities at the energy eigenvalues of the unperturbed crystal. This procedure assures that the wavefunction response of the unperturbed crystal conforms to the causality principle, in that the response occurs only after the system has been perturbed. As our present interest is in deep levels in the forbidden part of the energy spectrum of the ideal crystal, not in changes of wavefunctions with energies in the already existing continuous part, no singularities occur in $G^0(E)$ of equation (3.132) in the range of $E$ of interest to us, and we may ignore $\delta$. Using $G^0(E)$, equation (3.131) may be formally rewritten as
$$[G^0(E)V' - 1]\,\psi = 0. \tag{3.133}$$
Nontrivial solutions are possible only for energies $E$ for which the determinant of the matrix of the operator $[G^0(E)V' - 1]$, in an appropriate orthonormalized basis set, vanishes, i.e. if
$$\mathrm{Det}[G^0(E)V' - 1] = 0 \tag{3.134}$$
holds. The energy eigenvalues $E$ satisfying this equation are deep levels. These levels can thus be determined by calculating the Green's function $G^0(E)$ of the unperturbed crystal and solving equation (3.134). This procedure is referred to as the Koster-Slater method, and equation (3.131) as the Koster-Slater equation. The Green's operator $G^0(E)$ can be calculated if the band structure $E_\nu(\mathbf{k})$ and the Bloch functions of the ideal crystal are known. The matrix representation of $G^0(E)$ in the Bloch basis reads
$$\langle \nu\mathbf{k} | G^0(E) | \nu'\mathbf{k}' \rangle = \frac{\delta_{\nu\nu'}\,\delta_{\mathbf{k}\mathbf{k}'}}{E + i\delta - E_\nu(\mathbf{k})}. \tag{3.135}$$
With this we only need to know the matrix elements $\langle \nu\mathbf{k} | V' | \nu'\mathbf{k}' \rangle$ of the perturbation potential $V'$ in Bloch representation in order to determine the deep levels using equation (3.134). However, it would be more expedient for the matrix representation of the perturbation potential to use localized wavefunctions as, for example, atomic orbitals. If one does so, then the matrix representation of the Green's operator is not as simple as in the Bloch basis. A particular compromise in choosing a basis set for the representation of the eigenvalue equation (3.134) is the use of so-called Wannier functions.
Koster-Slater method in Wannier representation
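Wannier functions are linear combinations of all Bloch functions for a given band index $\nu$ of the form
$$|\nu\mathbf{R}\rangle = \frac{1}{\sqrt{G^3}} \sum_{\mathbf{k}} e^{-i\mathbf{k}\cdot\mathbf{R}}\, e^{i\varphi_{\mathbf{k}}}\, |\nu\mathbf{k}\rangle. \tag{3.136}$$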
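Here the summation is over the whole first BZ, $G^3$ is the number of primitive unit cells in a periodicity region, and $\varphi_{\mathbf{k}}$ are certain phase angles. If the latter are chosen properly, the Wannier functions $|\nu\mathbf{R}\rangle$ turn out to be well localized in the unit cell at the lattice point $\mathbf{R}$. If the eigenvalue equation (3.133), on which the condition (3.134) rests, is written in terms of this basis, it reads
$$\sum_{\nu'\mathbf{R}'} \Big[ \sum_{\nu''\mathbf{R}''} \langle \nu\mathbf{R} | G^0(E) | \nu''\mathbf{R}'' \rangle \langle \nu''\mathbf{R}'' | V' | \nu'\mathbf{R}' \rangle - \delta_{\nu\nu'}\delta_{\mathbf{R}\mathbf{R}'} \Big] \langle \nu'\mathbf{R}' | \psi \rangle = 0. \tag{3.137}$$
The matrix elements of the Green's operator $G^0(E)$ may be obtained by means of (3.135) and (3.136) as
$$\langle \nu\mathbf{R} | G^0(E) | \nu'\mathbf{R}' \rangle = \delta_{\nu\nu'}\, \frac{1}{G^3} \sum_{\mathbf{k}} \frac{e^{i\mathbf{k}\cdot(\mathbf{R} - \mathbf{R}')}}{E + i\delta - E_\nu(\mathbf{k})}. \tag{3.138}$$
The matrix elements $\langle \nu\mathbf{R} | V' | \nu'\mathbf{R}' \rangle$ of the potential are particularly large if $\mathbf{R}$ and $\mathbf{R}'$ are identical, and both are equal to the lattice vector of the cell which hosts the point perturbation. As before, we assume $\mathbf{R} = 0$ for this cell. Neglecting all other elements we find
$$\langle \nu\mathbf{R} | \psi \rangle = \sum_{\nu'} \langle \nu\mathbf{R} | G^0(E) | \nu 0 \rangle \langle \nu 0 | V' | \nu' 0 \rangle \langle \nu' 0 | \psi \rangle. \tag{3.140}$$
For the particular lattice point $\mathbf{R} = 0$ it reads
$$\langle \nu 0 | \psi \rangle = G^0_\nu(E) \sum_{\nu'} \langle \nu 0 | V' | \nu' 0 \rangle \langle \nu' 0 | \psi \rangle. \tag{3.141}$$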
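Equation (3.141) forms a closed set of equations for the central cell components $\langle \nu 0 | \psi \rangle$ only. If the latter are known, all other components $\langle \nu\mathbf{R} | \psi \rangle$ follow at once from equation (3.140). For a nontrivial solution of the homogeneous system (3.141) to exist, its determinant has to vanish, i.e.
$$\mathrm{Det}\big[ G^0_\nu(E)\, \langle \nu 0 | V' | \nu' 0 \rangle - \delta_{\nu\nu'} \big] = 0 \tag{3.142}$$
must hold. The diagonal elements $\langle \nu 0 | G^0(E) | \nu 0 \rangle \equiv G^0_\nu(E)$ of the Green's operator in equation (3.142) may be obtained from (3.138) as
$$G^0_\nu(E) \equiv \langle \nu 0 | G^0(E) | \nu 0 \rangle = \frac{1}{G^3} \sum_{\mathbf{k}} \frac{1}{E + i\delta - E_\nu(\mathbf{k})}. \tag{3.143}$$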
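To make the Koster-Slater condition concrete, the following sketch applies it to a deliberately simplified stand-in for the crystal: a single band on a one-dimensional ring of $N$ sites with nearest-neighbor hopping $t$ (band energies $2t\cos k$) and a point perturbation $U$ acting on one site only. In this one-band case the determinant condition reduces to $U\,G^0_{00}(E) = 1$, with $G^0_{00}(E)$ the k-averaged diagonal host Green's function. The model, the function names and the parameter values are assumptions for illustration; they are not the three-dimensional model treated in the text.

```python
import numpy as np

def g00(E, t, N):
    """Diagonal host Green's function (1/N) * sum_k 1 / (E - E(k)), E(k) = 2 t cos k."""
    k = 2.0 * np.pi * np.arange(N) / N
    return np.mean(1.0 / (E - 2.0 * t * np.cos(k)))

def koster_slater_level(U, t, N):
    """Solve U * g00(E) = 1 below the band by bisection (attractive defect, U < 0)."""
    band_bottom = -2.0 * abs(t)
    lo, hi = band_bottom - 2.0 * abs(U) - 1.0, band_bottom - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if U * g00(mid, t, N) - 1.0 > 0.0:
            hi = mid   # too close to the band edge: the root lies below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def exact_level(U, t, N):
    """Lowest eigenvalue of the perturbed ring from direct diagonalization."""
    H = np.zeros((N, N))
    for n in range(N):
        H[n, (n + 1) % N] = H[(n + 1) % N, n] = t
    H[0, 0] = U
    return np.linalg.eigvalsh(H)[0]

U, t, N = -1.0, 1.0, 200   # illustrative values in units of the hopping energy
E_ks = koster_slater_level(U, t, N)
E_exact = exact_level(U, t, N)
print(E_ks, E_exact, -np.sqrt(U ** 2 + 4 * t ** 2))
```

The Green's function route needs only the host band structure and the local matrix element of the perturbation, whereas the direct diagonalization treats the whole perturbed system; for a long ring both agree with the infinite-chain result $-\sqrt{U^2 + 4t^2}$.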
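The dominant bands in equation (3.142) are those which form the gap, i.e. the uppermost valence band $v$ and the lowest conduction band $c$. If we consider only them and neglect all others, equation (3.142) becomes
$$\mathrm{Det}\begin{pmatrix} G^0_v(E)\langle v0|V'|v0\rangle - 1 & G^0_v(E)\langle v0|V'|c0\rangle \\ G^0_c(E)\langle c0|V'|v0\rangle & G^0_c(E)\langle c0|V'|c0\rangle - 1 \end{pmatrix} = 0, \tag{3.144}$$
or more explicitly
$$\big[G^0_v(E)\langle v0|V'|v0\rangle - 1\big]\big[G^0_c(E)\langle c0|V'|c0\rangle - 1\big] - G^0_v(E)\,G^0_c(E)\,\langle v0|V'|c0\rangle\langle c0|V'|v0\rangle = 0. \tag{3.145}$$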
To solve this equation, the host Green's functions $G^0_v(E)$ and $G^0_c(E)$, as well as the perturbation potential matrix elements, have to be known. Below we calculate the host Green's functions for isotropic parabolic bands of finite widths. In expression (3.143) for $G^0_\nu(E)$ we replace the k-sum by an integral. This yields
$$G^0_\nu(E) = \frac{1}{G^3}\, \frac{\Omega}{8\pi^3} \int d^3k\, \frac{1}{E + i\delta - E_\nu(\mathbf{k})}, \qquad \nu = c, v, \tag{3.146}$$
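$\Omega$ being the volume of a periodicity region. Introducing the identity $\int dE'\, \delta(E' - E_\nu(\mathbf{k})) = 1$ into this integral, $G^0_\nu(E)$ may be written as
$$G^0_\nu(E) = \frac{1}{G^3} \int_{-\infty}^{\infty} dE'\, \frac{\rho_\nu(E')}{E + i\delta - E'}, \tag{3.147}$$
where $\rho_\nu(E)$ is the density of states per unit energy and spin state,
$$\rho_\nu(E) = \frac{\Omega}{8\pi^3} \int d^3k\, \delta(E - E_\nu(\mathbf{k})), \qquad \nu = c, v, \tag{3.148}$$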
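which differs from the DOS in equation (2.213) by a factor $(\Omega/2)$. In evaluating $\rho_\nu(E)$, we approximate the true valence and conduction bands by isotropic parabolic ones, however, taking their band widths $\Delta E_\nu$ to be the same as those of the original bands. This corresponds to the use of an effective mass for each band averaged over the whole first BZ. The bandwidths are introduced by putting the total number of states of the approximate isotropic parabolic bands equal to $G^3$. With $E_v$ and $E_c$ denoting the valence and conduction band edges, $E_c - E_v = E_g$, it then follows that
$$\rho_v(E) = \frac{3}{2}\, \frac{G^3}{\Delta E_v^{3/2}}\, \sqrt{E_v - E}\;\, \theta(E_v - E)\, \theta(E - E_v + \Delta E_v), \tag{3.149}$$
$$\rho_c(E) = \frac{3}{2}\, \frac{G^3}{\Delta E_c^{3/2}}\, \sqrt{E - E_c}\;\, \theta(E - E_c)\, \theta(E_c + \Delta E_c - E), \tag{3.150}$$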
where $\theta(E)$ is the unit step function. We substitute these expressions into the Green's function $G^0_\nu(E)$ of equation (3.147). The $E'$-integral is readily done. The imaginary part of $G^0_\nu(E)$ equals $(-\pi/G^3)$ times $\rho_\nu(E)$. The real parts $\mathrm{Re}\,G^0_v(E)$ and $\mathrm{Re}\,G^0_c(E)$ are given by
(3.151)
(3.152)
Although complex numbers appear in these expressions, they are in fact real. This would be obvious if we replaced the ln-function by an arctan-function. We avoid this because it is more convenient to handle the many-valued character of the ln-function rather than that of the arctan-function. Altogether, the host Green's functions of our model depend on three parameters: the energy gap $E_g$ and the two band widths $\Delta E_v$ and $\Delta E_c$. The latter are measures of the average kinetic energies of electrons in the valence or conduction bands; large bandwidths mean large average kinetic energies (or small effective masses). Later, the Green's function method in Wannier representation will be used to address the question of whether or not main group impurity atoms in tetrahedral semiconductors form deep levels in the gap.
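For energies inside the gap the host Green's functions of the parabolic-band model are real and can be evaluated by a simple quadrature. The sketch below is a minimal illustration under stated assumptions: the density of states of each band is taken as $\frac{3}{2}\,\Delta E^{-3/2}\sqrt{x}$ per primitive unit cell, with $x$ the distance from the band edge into the band (normalized to one state per band), the energy zero is placed at the valence band edge, and the band widths and the gap are illustrative numbers. The arctan expression used for comparison follows from elementary integration of this assumed density of states; it is not a transcription of the book's ln-form, and all function names are hypothetical.

```python
import numpy as np

def band_integral(e, width, n_nodes=400):
    """Integral of rho(x) / (e + x) over a parabolic band of the given width.

    rho(x) = 1.5 * width**-1.5 * sqrt(x) is normalized to one state per band;
    e > 0 is the distance of the energy from the band edge into the gap.
    """
    # Substituting x = u**2 removes the square-root behaviour at the band edge.
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    u_max = np.sqrt(width)
    u = 0.5 * u_max * (nodes + 1.0)
    integrand = 1.5 * width ** -1.5 * 2.0 * u ** 2 / (e + u ** 2)
    return 0.5 * u_max * np.sum(weights * integrand)

def band_integral_closed(e, width):
    """Closed form of the same integral in terms of the arctan function."""
    return (3.0 / width) * (1.0 - np.sqrt(e / width) * np.arctan(np.sqrt(width / e)))

def host_green_functions(E, E_gap, width_v, width_c):
    """Real G_v and G_c for 0 < E < E_gap (energy zero at the valence band edge)."""
    g_v = +band_integral(E, width_v)           # valence states lie below E
    g_c = -band_integral(E_gap - E, width_c)   # conduction states lie above E
    return g_v, g_c

E_gap, width_v, width_c = 1.2, 12.0, 10.0      # illustrative values in eV
for E in (0.2, 0.6, 1.0):
    print(E, host_green_functions(E, E_gap, width_v, width_c))
```

Within the gap the valence band contributes a positive and the conduction band a negative value, and both contributions grow in magnitude as the energy approaches the respective band edge.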
The perturbation potential also gives rise to changes within the energy bands of the ideal crystal. Of course, the energies of these bands are still allowed quantum mechanically in the presence of the perturbation, so that in this respect there is no change. However, change within the energy bands of the ideal crystal is induced by the perturbation in the form of a modified density of allowed energy levels per unit energy, i.e. in the form of a modified density of states. To calculate this change it is expedient to introduce the Green's function of the perturbed crystal,
$$G(E) = \frac{1}{E + i\delta - [H + V']}. \tag{3.153}$$
According to formula (2.209), the imaginary part $\mathrm{Im}\,\mathrm{Tr}[G(E)]$ of the trace represents (apart from a factor $-1/\pi$) the density of states $\rho(E)$ of the system, here that of the crystal with the point perturbation. The Green's function of the perturbed crystal obeys the equation
$$G(E) = G^0(E) + G^0(E)\, V'\, G(E), \tag{3.154}$$
which follows at once from the definition (3.153) of $G(E)$. In quantum field theory, this relation is known as the Dyson equation. Using this equation and performing some simple calculations, the DOS expression (2.209) may be brought into the form
$$\rho(E) = \frac{1}{\pi}\, \frac{d}{dE}\, \mathrm{Im}\, \ln \mathrm{Det}[G(E)]. \tag{3.155}$$
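The Dyson equation and the determinant relation that underlies the change of the density of states can be checked on a small matrix model. In the sketch below a random real symmetric matrix stands in for the ideal-crystal Hamiltonian and a single diagonal element for the point perturbation; the matrix size, the energy and the finite value of $\delta$ (kept finite for numerical stability) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

def random_symmetric(size):
    """Random real symmetric matrix used as a stand-in Hamiltonian."""
    a = rng.normal(size=(size, size))
    return 0.5 * (a + a.T)

H0 = random_symmetric(n)   # plays the role of the ideal-crystal Hamiltonian H
V = np.zeros((n, n))
V[0, 0] = -1.5             # "point perturbation" acting on one basis state only

z = 0.3 + 0.05j            # E + i*delta with a small positive delta
one = np.eye(n)
G0 = np.linalg.inv(z * one - H0)
G = np.linalg.inv(z * one - H0 - V)

# Dyson equation: G = G0 + G0 V G
dyson_residual = np.max(np.abs(G - (G0 + G0 @ V @ G)))

# Determinant relation: Det G = Det G0 / Det(1 - G0 V)
det_ratio = np.linalg.det(G) * np.linalg.det(one - G0 @ V) / np.linalg.det(G0)

print(dyson_residual, det_ratio)
```

The determinant relation is what allows the difference of the logarithms of $\mathrm{Det}[G(E)]$ and $\mathrm{Det}[G^0(E)]$ to be expressed through $\mathrm{Det}[1 - G^0(E)V']$ alone.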
In an analogous way, the DOS $\rho^0(E)$ of the unperturbed crystal may be expressed in terms of the unperturbed Green's function $G^0(E)$. We seek the change
$$\Delta\rho(E) = \rho(E) - \rho^0(E) \tag{3.156}$$
of the DOS due to the point perturbation. This change may be obtained from (3.155) and (3.156) as
$$\Delta\rho(E) = -\frac{1}{\pi}\, \frac{d}{dE}\, \mathrm{Im}\, \ln \mathrm{Det}[1 - G^0(E)V']. \tag{3.157}$$
This relation can be used to calculate the total change of the DOS in the entire energy range between $-\infty$ and $+\infty$. Integrating $\Delta\rho(E)$ over this interval, and considering the fact that $G^0(E)$ vanishes for $E \to \pm\infty$, yields
$$\int_{-\infty}^{\infty} dE\, \Delta\rho(E) = 0. \tag{3.158}$$
This relation signifies that under the action of the perturbation potential, the total number of states remains unchanged. This result was already used in section 3.4 in the context of shallow levels, and there it was referred to as Levinson's theorem. Here, this theorem means that for each new state of the crystal created by the point perturbation, a state which existed without perturbation must cease to exist. The deep states in the energy gap occur therefore at the expense of band states. For each state occurring in the gap, a state is lost from a band,
$$\int_{gap} dE\, \Delta\rho(E) = -\int_{bands} dE\, \Delta\rho(E). \tag{3.159}$$
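The state counting expressed by this relation can be illustrated with a finite model. The sketch below uses a one-band ring of $N$ sites with hopping $t$ (a simplified stand-in, not the crystal model of the text) and counts how many eigenvalues lie inside the host band $[-2|t|, 2|t|]$ with and without an attractive point perturbation $U$ on one site. Function names and parameter values are illustrative assumptions.

```python
import numpy as np

def ring_hamiltonian(N, t, U=0.0):
    """Nearest-neighbor ring with an on-site perturbation U at site 0."""
    H = np.zeros((N, N))
    for n in range(N):
        H[n, (n + 1) % N] = H[(n + 1) % N, n] = t
    H[0, 0] = U
    return H

def count_states(levels, t, tol=1e-9):
    """Return (number of states inside the host band, number outside)."""
    edge = 2.0 * abs(t)
    in_band = int(np.sum(np.abs(levels) <= edge + tol))
    return in_band, len(levels) - in_band

N, t, U = 100, 1.0, -1.0
ideal = np.linalg.eigvalsh(ring_hamiltonian(N, t))
perturbed = np.linalg.eigvalsh(ring_hamiltonian(N, t, U))

print("ideal     (in band, outside):", count_states(ideal, t))
print("perturbed (in band, outside):", count_states(perturbed, t))
```

The total number of states is the same in both cases; the one level split off below the band is compensated by one state fewer inside the band.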
In the above derivation of Levinson's theorem, no assumptions about the spatial variation of the perturbation potential were made. This theorem, therefore, also applies for purely long-range potentials. Thus, we have proven what was anticipated earlier in section 3.4, namely, that for each shallow level in the gap, a state is lost from a band. In Chapter 4, where we will calculate the electron population distribution over the energy levels in the gap and the bands in thermodynamic equilibrium, this result will play an important role. At certain energy values in the bands, the DOS of the perturbed crystal can display maxima or minima. One speaks of these as resonance and antiresonance states. These states emerge when the short-range perturbation potential can bind or antibind states at band energies. Since the corresponding localized level is degenerate with the band energy continuum, the localization of resonance and antiresonance states differs from that of deep level states in the gap: with increasing distance from the center, the eigenfunctions do not decay exponentially but oscillate with an amplitude decaying to zero according to a power law.
3.5.4
Correlation effects
Correlation effects, as discussed in section 2.1, are important for the N-electron system of a crystal with one-particle states localized at a deep center. One of these effects is based on the configuration dependence of the Hartree and exchange potentials. Another results from the fact that Slater determinants, even those calculated by means of configuration-dependent Hartree and exchange potentials, are not exact eigenstates of the N-electron Hamiltonian. The exact eigenstates are linear combinations of different Slater determinants, an effect which is referred to as configuration interaction (see section 2.1). The two correlation effects, configuration dependence and configuration interaction, will be discussed below for deep centers. In regard to configuration dependence, we continue the general discussion of section 3.2 here.
For an ideal crystal, the energy eigenvalues $E_\nu$ of the one-particle Schrödinger equation have direct physical meaning. Apart from their sign, they are the ionization energies $I_\nu$ of the corresponding eigenstates $\nu$, i.e. $E_\nu = -I_\nu$. In this regard, according to section 2.2, the ionization energy $I_\nu$ is defined as the difference
$$I_\nu = E_{total}(\{\nu\}'_\nu) - E_{total}(\{\nu\}') \tag{3.160}$$
of the total energy $E_{total}(\{\nu\}'_\nu)$ of the N-electron system in the ionized state $\{\nu\}'_\nu$, and the total energy $E_{total}(\{\nu\}')$ in the ground state $\{\nu\}'$. The ionized state $\{\nu\}'_\nu$ differs from the ground state $\{\nu\}'$ in that a particle, which in the ground state of the system occupies a one-particle state of energy $E_\nu$, is transferred to a one-particle state of energy 0 corresponding to the vacuum level. As in section 2.1, one says that an electron is removed from the system. This expression has to be used with care, however, for if taken literally it misrepresents the charge neutrality of the system. The total energy of the ground state follows, according to formula (2.54), by summing all occupied one-particle energies, followed by subtraction of the electrostatic interaction energy of the electrons because the latter is double-counted in forming the sum of one-particle energies. The equation $E_\nu = -I_\nu$ is the content of Koopmans' theorem, which was explained in section 2.1. The essential requirement for the validity of this theorem is the approximate population independence of the Hartree and exchange potentials or, more generally, of the effective one-particle potential. This requirement is not satisfied for a crystal with a point perturbation. The potentials depend on the number $n$ of electrons occupying one-particle states localized at the center, and so do the energy eigenvalues $E_\nu$ of the one-particle Schrödinger equation of the crystal with a point perturbation. We will denote these one-particle energies by $E_\nu^{(n)}$ henceforth to emphasize this dependence explicitly. Of course, the definition (3.160) of the ionization energy is also valid in this case, but it is no longer true that $I_\nu$ represents the negative one-particle energy $-E_\nu^{(n)}$. That this cannot hold is immediately clear if one recognizes that the eigenvalues $E_\nu^{(n)}$ and $E_\nu^{(n-1)}$ of the one-particle Schrödinger equation with, respectively, $n$ and $n - 1$ electrons at the center differ from each other: the level $E_\nu^{(n-1)}$ is deeper than the level $E_\nu^{(n)}$ because the removal of an electron results in the positive core being less strongly screened, so that the remaining electrons are more strongly attracted. In this situation, we say that the electrons at the center relax on the removal of an electron from the center. If one further considers that the total energies of both systems enter in the definition (3.160) of $I_\nu$, that of the relaxed system with $(n - 1)$ electrons at the center, and that of the unrelaxed system with $n$ electrons at the center, there is no way to explain why the energy difference $E_{total}(\{\nu\}'_\nu) - E_{total}(\{\nu\}')$ should be equal to the negative eigenvalue $-E_\nu^{(n)}$ of the unrelaxed system and not equal to the negative eigenvalue $-E_\nu^{(n-1)}$ of the relaxed system. In reality, the ionization energy $E_{total}(\{\nu\}'_\nu) - E_{total}(\{\nu\}')$ is not equal to either one, but lies somewhere in between the two. It can be shown that it is approximately given by the negative of the one-particle energy eigenvalue of the N-electron system with the fictitious number $(n - 1/2)$ of electrons at the center. This is plausible because the electron, during its removal, feels, so to speak, the potential with $n$ localized electrons half of the time, and the potential with $(n - 1)$ localized electrons the other half of the time. To carry out an exact calculation of the ionization energy of a center which, in its ground state, has $n$ localized electrons, the one-particle energy eigenvalues $E_\nu^{(n)}$ are not sufficient. Applying equation (2.54), these one-particle energies provide the total energy $E_{total}(\{\nu\}')$ of the ground state, but they cannot be used to calculate the total energy $E_{total}(\{\nu\}'_\nu)$ of the ionized state with $n - 1$ localized electrons at the center. To obtain the latter, one also needs the one-particle energies $E_\nu^{(n-1)}$ of the N-electron system with $n - 1$ localized electrons. In density functional theory, the two total energies follow more directly: one determines the eigenfunctions of the Kohn-Sham equation for the centers with $n$ and $n - 1$ electrons, then forms the corresponding ground state densities, and evaluates the total energy functional (2.64) using these densities. Ionization, i.e. exciting an electron to the vacuum level, is only one of the various possible one-particle excitation processes of the N-electron system of a crystal having a perturbation center.
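The statement that the ionization energy is given by the one-particle energy at the fictitious occupation $(n - 1/2)$ can be illustrated with a toy model. The sketch below assumes a total energy that is quadratic in the number $n$ of localized electrons and takes the occupation-dependent one-particle energy to be its derivative with respect to $n$; both the functional form and the numbers $\epsilon_0$ and $U$ are assumptions chosen for illustration, not results derived in the text. For a quadratic total energy the half-occupation rule holds exactly.

```python
def total_energy(n, eps0=-10.0, U=2.0):
    """Toy total energy of a center holding n localized electrons (assumed form)."""
    return eps0 * n + 0.5 * U * n ** 2

def one_particle_energy(n, eps0=-10.0, U=2.0):
    """Occupation-dependent level, taken here as dE_total/dn (an assumption)."""
    return eps0 + U * n

n = 2
ionization = total_energy(n - 1) - total_energy(n)

print("I                 =", ionization)
print("-E(n)   unrelaxed =", -one_particle_energy(n))
print("-E(n-1) relaxed   =", -one_particle_energy(n - 1))
print("-E(n-1/2)         =", -one_particle_energy(n - 0.5))
```

In this model the level with $n - 1$ electrons lies deeper than the one with $n$ electrons, and the ionization energy falls between the two negative eigenvalues, coinciding with the value at half occupation.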
Generally, one may examine an excited state $\{\nu\}_\nu^{\nu'}$ with an electron in a formerly unoccupied one-particle state $\nu'$ of energy below the vacuum level, and a hole in a formerly occupied one-particle state $\nu$. The corresponding excitation energy $I_{\nu'\nu}$ is given by the total energy difference between the excited state $\{\nu\}_\nu^{\nu'}$ and the ground state $\{\nu\}^0$,

$$I_{\nu'\nu} = E_{\mathrm{total}}(\{\nu\}_\nu^{\nu'}) - E_{\mathrm{total}}(\{\nu\}^0). \qquad (3.161)$$
As in the case of ionization, the excitation energy $I_{\nu'\nu}$ is not equal to the one-particle energy difference $E_{\nu'} - E_\nu$. Here, we are interested in excitation processes involving changes of the populations of one-particle states localized at the deep center. There are different types of such processes. Firstly, in the final state, all electrons may still be localized at the center, but with one electron having changed its localized one-particle state. Such excitations are called internal transitions of the center. Secondly, an electron originally localized at the center may undergo a transition to the bottom of the conduction band. This is referred to as a donor transition. Thirdly, an electron
Below, we will use this terminology. Acceptor ionization levels may be traced back to donor ionization levels. In fact, ionizing a neutral acceptor center $A(0)$ leaves this center in the singly negative charge state $A(-)$, and a hole appears at the valence band edge. If one subjects the center $A(-) \equiv D(-)$ to a donor transition $D(-/0)$, then the center returns to its neutral charge state $D(0) \equiv A(0)$, and an electron appears at the conduction band edge. As a result of this two-step ionization process of the center $A(0)$, an electron-hole pair is excited while the state of the center has not changed. The sum of the two ionization energies $A(0/-)$ and $D(-/0)$ equals, therefore, the (minimum) excitation energy of an electron-hole pair, namely, the gap energy $E_g$ (see Figure 3.11). Generally, for an acceptor $A(Q)$ in charge state $Q$, one has

$$A(Q/(Q-1)) + D((Q-1)/Q) = E_g. \qquad (3.162)$$
edge instead of the valence band edge (as is done in equation (3.162)), thus replacing $A(Q/(Q-1))$ by $E_g - A(Q/(Q-1))$, then it follows from (3.162) that

$$A(Q/(Q-1)) = D((Q-1)/Q). \qquad (3.163)$$
This unified description of donor and acceptor excitation levels makes it possible to decide in a simple way whether a given center $X$ represents a donor, an acceptor, both a donor and an acceptor, or neither. One has
(a) a pure donor, if the ionization level $X(Q/(Q+1))$ lies in the gap and, simultaneously, the ionization level $X((Q-1)/Q)$ does not;
(b) a pure acceptor, if $X((Q-1)/Q)$ lies in the gap and, simultaneously, $X(Q/(Q+1))$ does not;
(c) if both levels $X((Q-1)/Q)$ and $X(Q/(Q+1))$ are found in the gap, the center is both a donor and an acceptor. One then calls it amphoteric;
(d) if neither of the two levels $X((Q-1)/Q)$ and $X(Q/(Q+1))$ lies in the gap, the center is neither a donor nor an acceptor.
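The four cases can be expressed as a short decision routine. The following is a minimal sketch; the function name, the convention of measuring both levels from the valence band edge, and the numerical values in the usage lines are assumptions made for illustration only.

```python
def classify_center(donor_level, acceptor_level, gap):
    """Classify a center X(Q) from its two ionization levels.

    donor_level    -- X(Q/(Q+1)) in eV, counted from the valence band edge
    acceptor_level -- X((Q-1)/Q) in eV, counted from the valence band edge
    gap            -- gap energy E_g in eV
    """
    def in_gap(level):
        return 0.0 < level < gap

    is_donor = in_gap(donor_level)
    is_acceptor = in_gap(acceptor_level)
    if is_donor and is_acceptor:
        return "amphoteric"        # case (c)
    if is_donor:
        return "pure donor"        # case (a)
    if is_acceptor:
        return "pure acceptor"     # case (b)
    return "neither"               # case (d)

# Hypothetical level positions in a crystal with a 1.1 eV gap:
print(classify_center(0.3, 0.6, 1.1))    # both levels in the gap
print(classify_center(0.3, 1.4, 1.1))    # only the donor level in the gap
print(classify_center(-0.2, 0.6, 1.1))   # only the acceptor level in the gap
```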
The difference between the acceptor ionization energy $X((Q-1)/Q)$ and the donor ionization energy $X(Q/(Q+1))$ of an amphoteric center $X(Q)$ is, by definition, the Hubbard energy $U$. One therefore has

$$U = X((Q-1)/Q) - X(Q/(Q+1)). \qquad (3.164)$$
Since the Hubbard energy $U$ is commonly positive, the acceptor ionization level usually lies higher in the gap than the donor ionization level. The number of electrons bound at a center in thermodynamic equilibrium depends on the position of the Fermi level with respect to the ionization levels. From the outset it is clear that the ionization level $X(Q/(Q+1))$ also marks that position of the Fermi level at which the charge state of the center changes: if $E_F$ lies just above $X(Q/(Q+1))$, the charge state $Q$ is realized; if $E_F$ lies just below $X(Q/(Q+1))$, the charge state $(Q+1)$ is realized. From this observation one may conclude that the charge state $Q$ occurs when the Fermi energy lies above the ionization level $X(Q/(Q+1))$ and simultaneously below the ionization level $X((Q-1)/Q)$, i.e. between the two levels $X(Q/(Q+1))$ and $X((Q-1)/Q)$. Of course, for this conclusion to be valid, both levels have to be located in the gap.
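The rule for the equilibrium charge state may be sketched in code as follows. The sketch assumes the normal level ordering (positive $U$), with all levels counted from the valence band edge; the dictionary layout and the level values are hypothetical.

```python
def charge_state(fermi, levels):
    """Equilibrium charge state for a given Fermi energy.

    levels maps Q to the ionization level X(Q/(Q+1)) in eV.
    Assumes normal ordering: more negative Q has the higher level.
    """
    q = max(levels) + 1                  # most positive charge state
    for Q in sorted(levels, reverse=True):
        if fermi > levels[Q]:
            q = Q                        # Fermi level above X(Q/(Q+1)): state Q
        else:
            break
    return q

def hubbard_energy(levels, Q):
    """U = X((Q-1)/Q) - X(Q/(Q+1)) for the charge state Q."""
    return levels[Q - 1] - levels[Q]

# Hypothetical amphoteric center: X(0/+) = 0.3 eV, X(-/0) = 0.6 eV
levels = {0: 0.3, -1: 0.6}
for fermi in (0.2, 0.45, 0.8):
    print(fermi, charge_state(fermi, levels))
print("U =", hubbard_energy(levels, 0))
```

With these values the center is positive for the lowest Fermi energy, neutral when the Fermi level lies between the two levels, and negative above the acceptor level.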
Configuration interaction
If there is degeneracy among the various one-particle states of an $N$-electron system, then there is also degeneracy between the Slater determinants formed
These are, altogether, $3^2 = 9$, as one should expect. If spin and exchange interaction are considered, the symmetric and antisymmetric wavefunctions have slightly different energies because the exchange energy depends on total spin $S$, which differs for the two groups of states. Within each group there is degeneracy, however, at least within the framework of the one-particle approximation. If the configuration interaction is taken into account, this degeneracy is removed. The result of this removal can be derived by means of group theory, strictly speaking,
by means of the decomposition of symmetric and antisymmetric product representations into irreducible parts (see Appendix A). For the symmetric product one obtains $\{T_2 \times T_2\}_s = A_1 + E + T_2$, and for the antisymmetric product $\{T_2 \times T_2\}_a = T_1$. In these relations, the factors on the left-hand sides are representations in one-particle Hilbert space, while on the right-hand sides one has representations in two-particle Hilbert space. This means that every state on the left-hand side can host 1 electron, and every state on the right-hand side 2 electrons. To avoid confusion, the representations in one-particle space are denoted by lowercase letters in this context, i.e. by $a_1$, $a_2$, $e$, $t_1$, $t_2$ etc., instead of by the uppercase letters $A_1$, $A_2$, $E$, $T_1$, $T_2$ etc. which are common in group theory. Here, uppercase letters are used for two- (or, generally, multi-) particle representations. This is chosen in accordance with the notation for free atoms, where $s, p, d, f$ represent one-particle states, and $S, P, D, F$ represent many-particle states. Below, we will employ the distinction between lowercase and uppercase representations wherever an ambiguity might occur. As in the free atom case, the total spin $S$ of the many-electron state is indicated by the spin multiplicity $2S+1$, appended to the representation letter at its upper left. Using the notations introduced above, we may conclude the analysis of our model deep center by stating that configuration interaction will split its $\{t_2^2\}$-configuration into the 4 different two-electron energy levels $^1A_1$, $^1E$, $^1T_2$, and $^3T_1$.
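The decomposition of the symmetric and antisymmetric products can be checked numerically with characters. The sketch below uses the standard character table of the tetrahedral group $T_d$ and the usual character formulas for symmetrized squares, $\chi_{s,a}(g) = [\chi(g)^2 \pm \chi(g^2)]/2$; the variable and function names are chosen here for illustration.

```python
# Character table of T_d; classes ordered as E, 8C3, 3C2, 6S4, 6sigma_d.
CLASS_SIZES = [1, 8, 3, 6, 6]
IRREPS = {
    "A1": [1, 1, 1, 1, 1],
    "A2": [1, 1, 1, -1, -1],
    "E":  [2, -1, 2, 0, 0],
    "T1": [3, 0, -1, 1, -1],
    "T2": [3, 0, -1, -1, 1],
}
# Index of the class containing g^2 for g in each class:
# E^2 = E, C3^2 in 8C3, C2^2 = E, S4^2 in 3C2, sigma_d^2 = E.
SQUARE_CLASS = [0, 1, 0, 2, 0]

def squared_product_characters(chi, symmetric):
    """Characters of the symmetric or antisymmetric square of chi."""
    sign = 1 if symmetric else -1
    return [(c * c + sign * chi[SQUARE_CLASS[k]]) // 2
            for k, c in enumerate(chi)]

def decompose(chi):
    """Multiplicities of the irreducible representations contained in chi."""
    order = sum(CLASS_SIZES)
    result = {}
    for name, irrep in IRREPS.items():
        n = sum(s * a * b for s, a, b in zip(CLASS_SIZES, chi, irrep)) // order
        if n:
            result[name] = n
    return result

t2 = IRREPS["T2"]
print(decompose(squared_product_characters(t2, symmetric=True)))
print(decompose(squared_product_characters(t2, symmetric=False)))
```

The symmetric product contains $A_1$, $E$ and $T_2$ (dimensions $1 + 2 + 3 = 6$), the antisymmetric product contains $T_1$ (dimension 3), which together account for the 9 product states.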
If more than two electrons are localized at the center, and if one-particle states of different irreducible representations are involved as, for example, in the ground state configuration $\{a_1^2 t_2^2\}$ of the neutral vacancy with 4 electrons, then the corresponding many-electron levels can be obtained in the same way as above. The only difference is that the symmetric and antisymmetric products to be decomposed into irreducible parts have more than two factors and the factors are not necessarily the same.
The splitting of the many-particle levels of deep centers of crystals has its counterpart in the fine structure of the many-particle energy levels of free atoms with more than one electron. For such atoms, many-particle states, formed from one-particle states of given total spin $S$ and orbital angular momentum $L$, having different total angular momenta $J$, give rise to slightly different energy levels. (Recall that the irreducible representations of the full rotation group, into which the products of the irreducible one-particle representations decompose, are distinguished by $J$.) This fine structure splitting has its largest effect for electron shells which are strongly localized, i.e. for d- and f-shells (as opposed to s- and p-shells). One may expect, therefore, that impurity atoms with unoccupied d- and f-shells, i.e. atoms of transition groups of the periodic table, should result in deep levels exhibiting pronounced fine structure splittings. This is in fact the case, as we will see below.
Below we discuss the structure of several deep centers which are important either from the scientific or technological point of view. Knowledge about these centers is, in every case, the product of combined experimental and theoretical investigations. Since we have thus far treated only theoretical methods, we first present a short overview of the experimental methods.
Experimental methods
One can divide the experimental methods for investigation of deep centers into two groups: on the one hand, methods which measure ground state properties of the centers, and on the other hand, methods which give experimental data on center properties in thermally or optically excited states. Among the methods of the first kind are measurements of magnetic properties, like Electron Paramagnetic Resonance (EPR), Electron-Nuclear Double Resonance (ENDOR) and magnetic susceptibility. These methods provide data concerning the total spin $S$ of the centers and, if anisotropy effects are measured, also spatial symmetries. The chemical identity of the centers can be determined (in addition to other methods) by means of mass spectroscopy or of Rutherford Backscattering (RBS). Measurements of the Extended X-ray Absorption Fine Structure (EXAFS) provide data about the geometrical ordering of the atoms in the vicinity of a point perturbation. To investigate the excitation properties of deep centers, optical and electrical methods are available. Ionization energies can be determined by means of optical absorption spectroscopy and photoconductivity measurements, mainly in the infrared spectral region. The cross-sections of deep centers for emission of free charge carriers can be determined by means of time-resolved current or capacitance measurements at pn-junctions or at other depletion layers. By suddenly applying a reverse bias at such a junction, the deep levels are lifted relative to the Fermi level. The new equilibrium state of the junction corresponds to fewer electrons in the deep levels than previously. This state does not occur suddenly, however; it adjusts exponentially through emission of electrons from the deep levels into the conduction band (we assume deep donor levels here; the case of deep acceptor levels may be treated analogously).
In the conduction band, the electrons are freely mobile carriers and are immediately swept out by the positive electrode at the n-region. This results in an exponentially decaying current, which for its part leads to an increasing positive charging and, thus, an increasing capacitance of the junction. The decay time of the current and the rise time of the associated capacitance change are determined by the emission probability from
the deep centers. Measuring the capacitance rise time yields experimental values for the emission probability. Of particular importance is the so-called Deep Level Transient Spectroscopy (DLTS), wherein a reverse-biased pn-junction is exposed to periodically repeated voltage pulses of forward (deep level filling) polarity. The recovery time for the capacitance change after a filling pulse has been switched off is measured as a function of temperature. This function exhibits maxima which, under certain conditions, can be used to determine the ionization energies of the deep centers. Ionization energy values from DLTS measurements and other thermal equilibrium techniques are, as a rule, smaller than optically measured values: the lattice has time to relax if ionization proceeds thermally, and this lowers the energy of the final state. Optical ionization occurs instantaneously, hence lattice relaxation is not possible.
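How an ionization energy may be extracted from temperature-dependent emission data can be illustrated by a short sketch. It assumes, beyond what is stated in the text, that the thermal emission rate follows the commonly used form $e(T) = A\,T^2 \exp(-E/k_B T)$, so that $\ln(e/T^2)$ plotted against $1/T$ is a straight line of slope $-E/k_B$. The prefactor, the energy and the temperature grid are hypothetical, and the "measured" rates are generated synthetically.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K

def emission_rate(temperature, energy, prefactor):
    """Assumed thermal emission rate e(T) = A * T^2 * exp(-E / (k_B T))."""
    return prefactor * temperature**2 * math.exp(-energy / (K_B * temperature))

def activation_energy(temperatures, rates):
    """Least-squares slope of ln(e/T^2) versus 1/T, multiplied by -k_B."""
    xs = [1.0 / t for t in temperatures]
    ys = [math.log(e / t**2) for t, e in zip(temperatures, rates)]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return -K_B * slope

# Synthetic data for a hypothetical level 0.30 eV below the band edge
temperatures = [200.0 + 10.0 * i for i in range(8)]
rates = [emission_rate(t, 0.30, 1.0e7) for t in temperatures]
print(activation_energy(temperatures, rates))
```

With real data the points scatter around the straight line, and the fitted energy is the thermal ionization energy, which, as noted above, is as a rule smaller than the optical one.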
Perhaps the best understood point perturbation is the vacancy in Si, so we initiate our discussion of particular deep centers with it.
Vacancy in Si
of the vacancy was treated in section 3.2. It predicts the existence of two bound one-particle states, a non-degenerate $a_1$-state, and a triply degenerate $t_2$-state. More rigorous calculations within the one-particle approximation (Baraff, Schlüter, 1980; Bernholc, Pantelides, 1980) show that the $a_1$-level can be excluded as a deep state in the gap because it lies in the valence band (see Figure 3.12a). As such, it is fully occupied, having two electrons. The $t_2$-level can lie in the gap depending on how many electrons it hosts. At maximum, this can be 6, and at minimum 0. Thus there exist 7 charge states of the vacancy, namely $V(2+)$, $V(+)$, $V(0)$, $V(-)$, $V(2-)$, $V(3-)$ and $V(4-)$. The one-particle energies of these centers differ by the Hubbard energy $U$. A simple estimate yields 0.3 eV for $U$. Thus, many-body effects, more strictly speaking, configuration dependencies of one-particle energies, should be important in the case of the Si-vacancy. Other many-body effects, including configuration mixing, are small, and a description of the vacancy in terms of one-particle states, albeit configuration dependent ones, is approximately justified.
Owing to the value of about 0.3 eV for $U$, it may be expected that three or four ionization levels could fit in the Si gap of about 1.1 eV. Actually, the donor levels $V(+/2+)$, $V(0/+)$, and the acceptor levels $V(-/0)$ and $V(2-/-)$ are found there (as above, the acceptor levels are counted relative to the conduction band edge). In Figure 3.12 (part b), calculated positions of these levels are shown. These are not yet final positions because lattice relaxation has not yet been considered. The latter leads to changes which are discussed below (see Figure 3.13). In charge state $V(2+)$, no electrons are available
for the population of the $t_2$-level, therefore no Jahn-Teller distortion of the vacancy occurs. In charge state $V(+)$, the $t_2$-level is occupied by 1 electron. Its energy can be lowered through a tetragonal Jahn-Teller distortion. The symmetry after the distortion is $D_{2d}$. For this group, the 3-dimensional irreducible representation $t_2$ splits into the 1-dimensional representation $b_2$ and the 2-dimensional representation $e$. The $b_2$-level lies energetically below the $e$-level and is singly occupied. In charge state $V(0)$, the additional electron can also be hosted by the $b_2$-level. In order to gain additional energy, the tetragonal Jahn-Teller distortion is strengthened. In charge state $V(-)$, population of the $e$-level begins. Through a further distortion of the vacancy, which reduces its symmetry from $D_{2d}$ to $C_{2v}$, the $e$-level splits into two levels and the additional electron is placed in the deeper of the two. This level can also still host the additional electron of the charge state $V(2-)$, with an increase of the $C_{2v}$-distortion. If one takes account of the energy shifts due to Jahn-Teller distortion, the resulting level positions are as shown in Figure 3.12c. The exchange interaction evidently plays only a minor role, so that practically no spin-splitting of the vacancy levels occurs. Considering that the wavefunctions of the deep vacancy levels extend as far as the nearest neighbor atoms, this is understandable. A surprising result concerning the level positions in Figure 3.12c is that the donor level $V(0/+)$ lies below the donor level $V(+/2+)$. The usual ordering of the ionization level of the more negative charge state above the ionization level of the less negative charge state is thus reversed. Formally, it seems as if the Hubbard energy $U$ would be negative instead of positive for the transition $V(+/2+)$. One therefore also calls the vacancy in Si a negative-$U$ center.
Of course, the interaction energy between two electrons at the center does not really change its sign, but the increase of ionization energy due to the Jahn-Teller effect on the $V(+/2+)$ transition amounts to only about half of that for the $V(0/+)$ transition since the Jahn-Teller effect is absent at the $V(2+)$ center. If the vacancy was initially in the neutral state, i.e. with the Fermi energy lying above the $V(0/+)$ donor level and below the $V(-/0)$ acceptor level, and the Fermi energy is subsequently lowered below the $V(0/+)$ level, then the vacancy will initially capture a hole from the valence band and pass into charge state $V(+)$, and from there, without further change of the Fermi energy, i.e. spontaneously, it will capture another hole, passing into charge state $V(2+)$. The occurrence of an effectively negative $U$ at the vacancy in Si was first predicted theoretically (Baraff, Schlüter, 1980) and later found experimentally in correlated EPR and DLTS measurements. This phenomenon has now been observed at a number of other deep centers.
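That the singly positive charge state is never the equilibrium state of a negative-$U$ center can be verified with a few lines of code. The sketch below compares the energies of the charge states 0, + and 2+ relative to the neutral center, assuming that removing an electron from the center to the Fermi reservoir costs the Fermi energy minus the corresponding ionization level. The level positions are hypothetical and merely respect the ordering described in the text.

```python
def relative_energy(q, fermi, levels):
    """Energy of charge state q (0, 1 or 2) relative to the neutral center.

    levels[0] = V(0/+), levels[1] = V(+/2+), in eV above the valence band edge.
    Each removed electron is transferred to the reservoir at the Fermi energy.
    """
    return q * fermi - sum(levels[:q])

def stable_charge_states(levels, gap, steps=1100):
    """Set of charge states that are lowest in energy for some Fermi energy."""
    found = set()
    for i in range(steps + 1):
        fermi = gap * i / steps
        found.add(min((0, 1, 2), key=lambda q: relative_energy(q, fermi, levels)))
    return found

# Hypothetical level positions (eV); reversed ordering means negative U
negative_u = [0.05, 0.13]   # V(0/+) below V(+/2+)
positive_u = [0.13, 0.05]   # normal ordering, for comparison
print(stable_charge_states(negative_u, 1.1))
print(stable_charge_states(positive_u, 1.1))
```

For the reversed ordering only the states 0 and 2+ occur; the center switches directly between them when the Fermi energy crosses the mean of the two levels. For the normal ordering all three charge states appear.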
Figure 3.12: Deep levels of a vacancy in Si: a) Calculated ionization levels without Hubbard corrections and lattice relaxation. b) Levels of a) with Hubbard corrections. c) Experimental ionization levels which, in addition, include Jahn-Teller shifts. The numbers give the level distances from the valence band edge in eV. (After Watkins, 1984.)
Figure 3.13: Defect molecule model of a vacancy in Si. Different charge states of the vacancy are shown, taking into account the corresponding Jahn-Teller distortions. On the right-hand side, the basis functions of the irreducible representations of the deep levels are indicated (a, b, c and d stand for the dangling hybrids of the 4 surrounding atoms). (After Watkins, 1984.)

relatively different systems as, for instance, Si:Hg.
The perturbation potential of a main group impurity atom in a tetrahedral semiconductor contains, besides the short-range part, as a rule, also a long-range Coulomb part. For isocoric impurity atoms, i.e. atoms whose cores do not deviate too strongly from that of the host atom, the Coulomb potential is in general the main contribution. As shown in section 3.4, this potential leads to shallow donor or acceptor levels. These are the levels principally involved if main group elements are used as doping atoms for
Table 3.6: Experimental deep level positions of neutral main group substitutional impurity atoms in Si (in eV). A dash indicates that no deep level occurs in the gap for this particular substitutional impurity, meaning either that no localized state exists at all (C, Ge, Sn, Pb), or that such states exist but are shallow (B, Al, Ga, P, As, Sb, Bi). An empty space means that the corresponding impurity atom is either not incorporated substitutionally, or that there is no unambiguous experimental data. The neutral vacancy (Vac.) is shown for comparison. (Data compiled from Landolt-Börnstein, 1982.)
II               III              IV                V                VI
Be               B    -           C    -            N   E_c - 0.14   O
Mg               Al   -           Vac. E_c - 0.43   P    -           S   E_c - 0.31
Zn  E_v + 0.32   Ga   -           Ge   -            As   -           Se  E_c - 0.3
Cd  E_v + 0.55   In  E_v + 0.15   Sn   -            Sb   -           Te  E_c - 0.20
Hg  E_c - 0.31   Tl  E_v + 0.25   Pb   -            Bi   -           Po
tetrahedral semiconductors. For non-isocoric elements of the main groups, however, the short-range potential dominates in general, and, in particular cases, gives rise to deep levels in the gap (see Table 3.6). If an impurity atom belongs to the same column of the periodic table as the host atom, one says that the two atoms are isovalent. In this case, the Coulomb potential vanishes completely and the short-range potential remains as the only potential contribution. Isovalent substitutional impurity atoms therefore lead either to deep levels (this occurs if the isovalent host and impurity atoms are chemically dissimilar, as in the case of GaP:N or ZnTe:O), or to no localized levels at all (this takes place for chemically very similar isovalent host and impurity atoms, as in the case of Si:Ge, which forms an alloy). An important theoretical problem which has yet to be solved for main group impurity atoms in tetrahedral semiconductors is to understand why certain elements cause deep levels in the gap while others do not. The impurity problem mentioned was treated above within the defect molecule model. Although group-IV elemental semiconductors were assumed as host crystals, the general results obtained there also apply to compound semiconductors.
Accordingly, a substitutional sp$^3$-bonding impurity atom in an sp$^3$-bonding host crystal will introduce two bonding levels $E_b^{a_1}$ and $E_b^{t_2}$ of, respectively, $a_1$- and $t_2$-symmetry, and two antibonding levels $E_a^{a_1}$ and $E_a^{t_2}$ of these symmetries (see Figure 3.10). Which of these levels are located in the gap, and which are not, cannot be decided by means of the defect molecule model. On the assumption that there is a deep level in the gap, one can guess its symmetry as follows: The perturbation potential $V = \epsilon_h^i - \epsilon_h$ has negative values (or is attracting) if the hybrid energy of the impurity atom is lower than that of the substituted host atom, i.e. if the group number of the impurity atom is higher than that of the host atom. Furthermore, $V$ has positive values (or is repelling) if the hybrid energy of the impurity atom is larger than that of the host atom, i.e. for impurity atoms with lower group numbers than the host atom. If, by varying the impurity atom, the perturbation potential $V = \epsilon_h^i - \epsilon_h$ takes increasingly negative values starting from zero, then a deep level in the gap will evolve from the conduction band below a certain negative threshold value. If the perturbation potential $V$ takes increasingly positive values starting from zero, then one expects a deep level to arise from the valence band above a certain positive threshold value. Using this observation, a look at Figure 3.10 indicates that for negative (attracting) perturbation potentials, the deep level should be the antibonding $a_1$-level, and for positive (repelling) perturbation potentials, the deep level in the gap should be the bonding $t_2$-level.
The above conclusions are essentially confirmed by the more accurate Green's function tight-binding calculations performed by Hjalmarson, Vogl, Wolford, and Dow (1980). Some results of these authors, concerning $a_1$-levels, are depicted in Figure 3.14. Since the $a_1$-levels arise mainly from atomic s-orbitals, they are plotted against the s-orbital energy of the impurity atoms in Figure 3.14. Concerning the question of whether or not a main group impurity atom gives rise to a deep level, the indications of Figure 3.14 agree surprisingly well with experiment. For the particular case of Si this can be seen by means of comparison of Figure 3.14 with Table 3.6, which summarizes experimental results for this host crystal. The data from both sources agree that no deep levels exist in the case of the isovalent group-IV atoms C, Ge, Sn, Pb. Among the group-V substitutional impurity atoms N, P, As, Sb, Bi, only N gives rise to a deep level while all others result in shallow levels (even those do not exist for the isovalent group-IV atoms). In the case of group-VI atoms, Figure 3.14 indicates deep levels for substitutional O, S, and Se, while experimentally one also finds a deep level for substitutional Te. For impurity elements left of column IV of the periodic table, like Ga or Zn, the perturbation potential is positive, and the deep level in the gap is expected to be $t_2$-like rather than $a_1$-like. Such levels are not well described in the TB calculations quoted above.
Figure 3.15: Energy levels and their populations for sp$^3$-bonding main group impurity atoms in elemental semiconductors of group IV, within the defect molecule model. The partial illustrations a) to f) correspond to impurity atoms of groups I to VII as follows: a - V, b - VI, c - VII, d - III, e - II, f - I.

and the defect molecule model. Of the 9 electrons of a Si:group-V atom molecule (remember that only N results in a deep level in this case), 8 are hosted by the four bonding states and the remaining electron occupies the antibonding $a_1$-state in the gap. In a Si:group-VI atom molecule with 10 electrons, 2 occupy the $a_1$-level, and in the Si:group-VII atom molecule with 11 electrons, in addition 1 electron has to be placed into the antibonding $t_2$-level. This state of the molecule will certainly not be stable. As in the vacancy case, a Jahn-Teller distortion will occur which removes the degeneracy of the $t_2$-level and allows for a lower total energy by occupying levels shifted downwards. The defect molecules of group-III, II and I atoms have, respectively, 7, 6 and 5 electrons. Of them, 2 electrons are hosted by the bonding $a_1$-level in the valence band. There are, respectively, 5, 4, and 3 electrons to be placed in the bonding $t_2$-state which forms the deep level in this case. One may also say that this level hosts, respectively, 1, 2, and 3 holes. Again, a Jahn-Teller distortion will occur which lowers the total energy. Qualitative
differences from this model occur for elements of the first main group which have no occupied p-states but relatively shallow closed d-shells (Cu, Ag) or d- and f-shells (Au). For these atoms, d-electrons participate in chemical bonding with the host crystal. We will discuss this problem in more detail in the context of the transition metal impurity atoms. The question of whether deep levels exist or not for a particular substitutional impurity atom can be treated analytically, employing the Wannier representation of the Green's function derived above. As this calculation provides further physical insight into the formation of deep levels, we will discuss it below, again taking Si as the host crystal. The Green's function in Wannier representation has already been determined above (see equations (3.158) and (3.159)). For the evaluation of the Koster-Slater equation (3.152) we need the matrix elements $V_{cc}$, $V_{cv}$, $V_{vv}$ of the perturbation potential $V$ in this representation. To calculate them we use the results of the TB approximation of section 2.5. In the simplest version of this approximation, the Hamiltonian of the ideal crystal is diagonalized by the bonding and antibonding Bloch functions $|bt\mathbf{k}\rangle$ and $|at\mathbf{k}\rangle$, $t = 1, 2, 3, 4$, of equations (2.303) and (2.304). Thus, in an approximate sense, the bonding and antibonding orbitals $|bt\mathbf{R}\rangle$ and $|at\mathbf{R}\rangle$ are the Wannier functions of the eigenvalue equation (3.152). We will use them to get explicit expressions for $V_{cc}$, $V_{cv}$ and $V_{vv}$. First, we write down the matrix representation of the unperturbed Hamiltonian $H$ between bonding orbitals. It is given by
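$$\langle bt\mathbf{R}|H|bt\mathbf{R}'\rangle = \epsilon_h\,\delta_{\mathbf{R}\mathbf{R}'}, \qquad (3.165)$$

where $\epsilon_h$ is the hybrid energy of a host atom. For the perturbed Hamiltonian $H + V$, one has the same matrix elements for all $\mathbf{R}$ and $\mathbf{R}'$ with the exception of $\mathbf{R} = \mathbf{R}' = 0$. The latter elements are

$$\langle bt0|H + V|bt0\rangle = \frac{1}{2}(\epsilon_h + \epsilon_h^i), \qquad (3.166)$$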
where $\epsilon_h^i$ is the hybrid energy of the impurity atom. Since none of the matrix elements of (3.165) and (3.166) depend on $t$, one may identify $|bt0\rangle$ with the Wannier function $|v0\rangle$ of the uppermost valence band. Then, taking the difference of equations (3.166) and (3.165), an expression for $\langle v0|V|v0\rangle \equiv V_{vv}$ follows,

$$V_{vv} = \langle v0|V|v0\rangle = \frac{1}{2}(\epsilon_h^i - \epsilon_h). \qquad (3.167)$$
The factor $1/2$ in (3.166) arises because the Wannier function is equally spread out over the two atomic sites of a unit cell, while the substitution of an atom occurs at only one site. Analogously, matrix elements of $H$ and $H + V$ between antibonding orbitals may be used to obtain an expression for $\langle c0|V|c0\rangle \equiv V_{cc}$, and matrix elements between bonding and antibonding orbitals to obtain $V_{cv}$. The result is

$$V_{cc} = V_{cv} = V_{vv} \equiv V_0. \qquad (3.168)$$
We seek deep level solutions $E_t$ of the Koster-Slater equation (3.152) located in the gap, i.e. with $0 < E_t < E_g$. For such energies, the imaginary parts of $G_c^0(E)$ and $G_v^0(E)$ are zero, and equation (3.152) transforms into

$$1 - V_0\,\mathrm{Re}\left[G_c^0(E_t) + G_v^0(E_t)\right] = 0. \qquad (3.169)$$
Solutions of equation (3.169) within the gap do not exist for all possible values of the 4 parameters entering, i.e. the gap energy $E_g$, the two band widths $\Delta E_c$, $\Delta E_v$, and the perturbation potential constant $V_0$. This is evident if one considers the particular case $\Delta E_{c,v} \gg E_g$, which can be realized. The arguments of the logarithmic functions in $G_c^0(E)$ and $G_v^0(E)$ are close to 1 in this case, and the logarithms themselves become negligibly small. If one assumes, in addition, that $\Delta E_c = \Delta E_v \equiv \Delta E$, then
$$\mathrm{Re}\left[G_c^0(E) + G_v^0(E)\right] = \frac{3\pi}{2}\,\Delta E^{-3/2}\left[\sqrt{E_g - E} - \sqrt{E}\,\right]. \qquad (3.170)$$
With this expression, the deep level condition (3.169) takes the form

$$1 = V_0\,\frac{3\pi}{2}\,\Delta E^{-3/2}\left[\sqrt{E_g - E_t} - \sqrt{E_t}\,\right]. \qquad (3.171)$$

It has solutions $E_t$ only in the energy gap $0 < E_t < E_g$, within which $E_t$ has to be restricted anyway because the deep level condition (3.170) is valid only there. For negative $V_0$, one necessarily has $E_t > E_g/2$, and for positive $V_0$, $E_t < E_g/2$. For $|V_0| \to \infty$, $E_t$ approaches the value $E_g/2$, i.e. the deep level is pinned at the midgap position. This corresponds to the pinning of $E_t$ to the vacancy level discussed in subsection 3.5.2. In order for a solution $E_t$ of equation (3.171) to exist at all, $|V_0|$ must exceed a minimum value $|V_0|_{\min}$ given by

$$|V_0|_{\min} = \frac{2}{3\pi}\,\frac{\Delta E^{3/2}}{\sqrt{E_g}}. \qquad (3.172)$$

This minimum value of $|V_0|$ follows from equation (3.171) by identifying $E_t$ with the lowest possible deep level position in the gap, namely $E_t = 0$, in the case of positive $V_0$, and by identifying $E_t$ with the highest possible value of $E_t$, namely $E_g$, in the case of negative $V_0$. Considering equation (3.172), the solution of equation (3.171) may be written in the closed form

$$E_t = \frac{E_g}{2}\left[1 - \frac{|V_0|_{\min}}{V_0}\sqrt{2 - \left(\frac{|V_0|_{\min}}{V_0}\right)^2}\,\right]. \qquad (3.173)$$
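The dependence of the deep level position on the potential strength can be explored numerically. The sketch below assumes the wide-band form of the deep level condition, $1 = V_0\,(3\pi/2)\,\Delta E^{-3/2}\,[\sqrt{E_g - E_t} - \sqrt{E_t}\,]$, with equal band widths $\Delta E$, and the threshold $|V_0|_{\min} = (2/3\pi)\,\Delta E^{3/2}/\sqrt{E_g}$ that follows from it. Solving the condition for $E_t$ with $r = |V_0|_{\min}/V_0$ gives $E_t = (E_g/2)\,[1 - r\sqrt{2 - r^2}\,]$. The function names are chosen here for illustration, and the example values are those quoted for Si in the text ($E_g = 1.1$ eV, $\Delta E = 3.3$ eV).

```python
import math

def v0_min(gap, bandwidth):
    """Threshold |V0|_min = (2 / (3 pi)) * bandwidth^(3/2) / sqrt(gap)."""
    return 2.0 / (3.0 * math.pi) * bandwidth**1.5 / math.sqrt(gap)

def deep_level(v0, gap, bandwidth):
    """Deep level position E_t, or None if |V0| is below the threshold."""
    threshold = v0_min(gap, bandwidth)
    if abs(v0) < threshold:
        return None
    r = threshold / v0
    return 0.5 * gap * (1.0 - r * math.sqrt(2.0 - r * r))

def residual(e_t, v0, gap, bandwidth):
    """Left-hand side minus right-hand side of the assumed deep level condition."""
    g = 1.5 * math.pi * bandwidth**-1.5 * (math.sqrt(gap - e_t) - math.sqrt(e_t))
    return 1.0 - v0 * g

gap, bandwidth = 1.1, 3.3
print("threshold:", v0_min(gap, bandwidth))
for v0 in (-5.0, -2.0, 2.0, 5.0):
    print(v0, deep_level(v0, gap, bandwidth))
```

Negative $V_0$ gives levels in the upper half of the gap, positive $V_0$ in the lower half, and in both cases the level moves towards midgap as $|V_0|$ grows.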
Table 3.7: Perturbation potential matrix elements $V_0 = (1/2)(\epsilon_h^i - \epsilon_h)$ (in eV) for substitutional impurity atoms of main groups in Si, calculated by means of Herman-Skillman s- and p-orbital energies. The latter are partially reproduced in Table 2.2. (After Herman and Skillman, 1963.)

III          V            VI           VII
                                       F   -6.71
                                       Cl  -3.56
             As  -0.99    Se  -1.97    Br  -2.98
             Sb  -0.42    Te  -1.22    I   -2.03
Tl   1.17    Bi  -0.24    Po  -0.96    At  -1.68
According to condition (3.172) for a deep level to be formed, the absolute value of the potential and the energy gap must be relatively large, and the bandwidth relatively small. That $|V_0|$ needs to be large enough is obvious. The bandwidth must be sufficiently small because it corresponds to the average kinetic energy, which militates against binding. The gap is the energy range where the deep level has to be placed, so it should not be too small. Enlarging the gap and lowering the bandwidth increases the likelihood of forming a deep level. The condition $|V_0| > |V_0|_{\min}$ may even serve as a quantitative guide, as the following numerical example for impurity atoms in Si demonstrates. In this case one has $E_g = 1.1$ eV, and $\Delta E_c = \Delta E_v = 3.3$ eV is a reasonable choice. With these values, $|V_0|_{\min}$ is 1.21 eV. The perturbation potentials $V_0$ for impurity atoms of the main groups II to VI of the periodic table, calculated by means of equation (3.167), are listed in Table 3.7. According to our model, a deep level should exist for $|V_0| > 1.21$ eV, and should not exist for $|V_0| < 1.21$ eV. Comparison with experimental results for deep levels of substitutional impurity atoms in Si shown in Table 3.6 reveals that
the criterion is correct in all cases, except for In and Tl, which are close to the border line of our model but still on the side where no deep levels exist, while experimentally such levels are found.
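The comparison can be sketched for a few elements. The magnitudes $|V_0|$ below are the Table 3.7 entries for As, Se, Sb, Te, Bi and Tl; the experimental flags summarize what the text and Table 3.6 state about these elements (deep levels for Se, Te and Tl, only shallow levels for As, Sb and Bi). The dictionary names are chosen here for illustration.

```python
V0_MIN = 1.21  # threshold in eV for E_g = 1.1 eV and band widths of 3.3 eV

# |V0| in eV, as listed in Table 3.7
ABS_V0 = {"As": 0.99, "Se": 1.97, "Sb": 0.42, "Te": 1.22, "Bi": 0.24, "Tl": 1.17}

# Whether a deep level is found experimentally, as stated in the text
DEEP_LEVEL_OBSERVED = {"As": False, "Se": True, "Sb": False,
                       "Te": True, "Bi": False, "Tl": True}

predicted = {element: value > V0_MIN for element, value in ABS_V0.items()}
mismatches = sorted(element for element in ABS_V0
                    if predicted[element] != DEEP_LEVEL_OBSERVED[element])
print(mismatches)  # prints ['Tl']
```

Of the elements included here, only Tl falls on the wrong side of the threshold, and it does so by a small margin (1.17 eV against 1.21 eV).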
Transition metal impurity atoms
The transition metal (TM) impurity atoms of the iron group, i.e. those with partially filled 3d-shells (see Table 3.2), are incorporated in Si crystals predominantly on interstitial sites of tetrahedral symmetry, while the 5d-TM atoms prefer substitutional cation sites. The position of the 4d-TM atoms in Si lies between the two, i.e. both interstitial and substitutional incorporations are observed. In III-V and II-VI semiconductors, all three groups of TM atoms prefer substitutional incorporation on cation sites. The solubility of most transition metal elements in Si is relatively small, lying in the range of … cm$^{-3}$. Higher values, in the range of $10^{\ldots}$ cm$^{-3}$, are reached in tetrahedral compound semiconductors, and for Mn in GaAs the solubility reaches as high as $10^{\ldots}$ cm$^{-3}$. Mn also is a transition metal which gives rise to alloys with certain II-VI compound semiconductors, such as, for example, (Zn,Mn)Te or (Cd,Mn)Te. Semiconductor alloys exist also with Fe and Co. In tetrahedral semiconductors, most of the TM atoms form deep levels. Among them, the levels of TM elements with partially filled 3d-shells (the so-called 3d-TM elements) are by far the best known. Concerning 4d- and 5d-TM impurity atoms, it is understood that deep levels also exist in their case (Beeler, Andersen, Scheffler, 1985 and 1990; Alves, Leite, 1986). This is not surprising since the TM elements are all chemically very similar. Here, we restrict our considerations to the 3d-TM atoms. These are the elements Sc, Ti, V, Cr, Mn, Fe, Co and Ni. Sometimes, the neighbors of Ni to the right in the periodic table, Cu and Zn, are also included, although, owing to their respective $3d^{10}4s$ and $3d^{10}4s^2$ configurations, these are not in fact transition metals. In tetrahedral semiconductor crystals, however, they behave similarly. Special interest in the investigation of 3d-TM atoms results mainly from the fact that their deep levels play an important part in electronic devices.
They can serve as recombination centers through which the lifetime of nonequilibrium carriers is shortened, or as capture centers for free carriers which partially compensate the effect of dopant atoms. These effects are undesirable in most cases, so one must try to minimize the contamination of the devices by 3d-TM atoms. Occasionally, 3d-TM impurities can also be useful. This happens, for example, in the case of Cr in GaAs which, due to its compensating effect, makes the crystals semi-insulating. There are numerous experimental investigations on 3d-TM impurity atoms in tetrahedral semiconductors (interested readers are referred to the book by Fleurov and Kikoin, 1994). Besides donor and acceptor transitions, internal transitions of the 3d-TM impurity atoms are also observed. Below we
concentrate on the donor and acceptor transitions. Figure 3.16 contains data related to this, as well as showing the ionization energies of the free 3d-TM ions. The latter are larger than the ionization energies of the 3d-TM atoms hosted by semiconductor crystals by more than one order of magnitude. The case of free atoms is also striking in regard to the large variations of ionization energies between different ionization states. In semiconductor crystals these variations are almost two orders of magnitude smaller so that, as in the case of the vacancy, ionization levels of several charge states may be found in the energy gap. Considering the substantially stronger localization of d-electrons and the much weaker screening of their interaction in comparison with the valence electrons, this is surprising and needs explanation. We will return to this point later.

Theoretically, the donor and acceptor ionization levels of substitutional 3d-TM impurity atoms are now well understood. A simple one-particle model, which is supported by ab initio calculations, is a defect molecule with the 3d-TM atom on a substitutional cation site at its center (see Figure 3.17). In this model, the TM atom is represented by the five 3d-orbitals and the one 4s-orbital of its valence shell, and the crystal is represented by the four sp$^3$-hybrid orbitals of the four surrounding anion atoms pointing towards the TM atom. The energies of these orbitals will be denoted by $\epsilon_d$, $\epsilon_s$ and $\epsilon_h$, respectively. Considering the tetrahedral symmetry of the impurity center, we decompose the four sp$^3$-hybrid orbitals of the four surrounding atoms pointing towards the TM atom into a state with $A_1$-symmetry and three states $|t_{2h}\rangle$ with $T_2$-symmetry. The s-orbital of the TM atom is deformed by the crystal into an orbital of $A_1$-symmetry, and the five d-states decompose into two states $|e\rangle$ with $E$-symmetry and three states $|t_{2d}\rangle$ with $T_2$-symmetry.
The two $a_1$-orbitals interact with each other and give rise to a bonding state deep in the valence band and an antibonding state deep in the conduction band. The two e-states remain without bonding; their energy levels are likely to be found either in the gap or in the valence band. The two triply degenerate $t_2$-states $|t_{2h}\rangle$ and $|t_{2d}\rangle$ do interact mutually. The corresponding interaction matrix elements are of the type $V_{pd\sigma}$ and $V_{pd\pi}$ (see section 2.6). Neglecting $V_{pd\pi}$-type matrix elements, one triply degenerate bonding level $E_b$ and one triply degenerate antibonding level $E_a$ arise. For these two levels, the same formulas apply as those which were derived in section 2.6 for interacting hybrids at nearest neighbor atoms of an ideal crystal. Denoting the $V_{pd\sigma}$-type matrix element by $V_{hd}$, we have

$$E_{b,a} = \bar{\epsilon} \mp \sqrt{\delta^2 + V_{hd}^2}, \qquad (3.174)$$

where

$$\bar{\epsilon} = \tfrac{1}{2}(\epsilon_d + \epsilon_h), \qquad \delta = \tfrac{1}{2}(\epsilon_d - \epsilon_h).$$
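The corresponding three-component wavefunctions $|t_{2b}\rangle$ and $|t_{2a}\rangle$ of, respectively, energy $E_b$ and $E_a$, are

$$|t_{2b}\rangle = \alpha\,|t_{2d}\rangle + \beta\,|t_{2h}\rangle, \qquad |t_{2a}\rangle = \beta\,|t_{2d}\rangle - \alpha\,|t_{2h}\rangle, \qquad (3.175)$$

where we set

$$\alpha^2 = \frac{1}{2}\left(1 - \frac{\delta}{\sqrt{\delta^2 + V_{hd}^2}}\right), \qquad \beta^2 = \frac{1}{2}\left(1 + \frac{\delta}{\sqrt{\delta^2 + V_{hd}^2}}\right). \qquad (3.177)$$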
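As a numerical illustration of the two-level formulas (3.174) and (3.177), the following minimal Python sketch evaluates the bonding and antibonding $t_2$ levels and the weights $\alpha^2$, $\beta^2$. It assumes the standard two-level form $E_{b,a} = \bar{\epsilon} \mp \sqrt{\delta^2 + V_{hd}^2}$ with $\delta = (\epsilon_d - \epsilon_h)/2$. The function name and the sample energies are hypothetical values chosen for illustration only, not data from the text.

```python
import math


def t2_defect_molecule(eps_d, eps_h, v_hd):
    """Two-level model for the coupled t2 states (all energies in eV).

    Returns (E_b, E_a, alpha2, beta2), where alpha2 is the d-orbital
    weight of the bonding state and beta2 its hybrid-orbital weight.
    """
    mean = 0.5 * (eps_d + eps_h)      # average orbital energy
    delta = 0.5 * (eps_d - eps_h)     # half the d-h level separation
    root = math.hypot(delta, v_hd)    # sqrt(delta^2 + V_hd^2)
    e_b = mean - root                 # bonding level
    e_a = mean + root                 # antibonding level
    alpha2 = 0.5 * (1.0 - delta / root)
    beta2 = 0.5 * (1.0 + delta / root)
    return e_b, e_a, alpha2, beta2


# Hypothetical example: d-level 2 eV below the hybrid level, V_hd = 1 eV.
e_b, e_a, alpha2, beta2 = t2_defect_molecule(eps_d=-2.0, eps_h=0.0, v_hd=1.0)
print(f"E_b = {e_b:.3f} eV, E_a = {e_a:.3f} eV")
print(f"alpha^2 = {alpha2:.3f}, beta^2 = {beta2:.3f}")
```

With the d-level chosen below the hybrid level, the sketch gives $\alpha^2 > \beta^2$: the bonding state is mainly d-like, and the antibonding state is mainly hybrid-like.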
Figure 3.17: Energy level diagram of the defect molecule model for a substitutional 3d-TM impurity atom in a tetrahedral semiconductor. Explanations are given in the text.

Since the bonding and antibonding eigenvectors $|t_{2b}\rangle$ and $|t_{2a}\rangle$ of equation (3.175) are linear combinations of basis functions with $t_2$-symmetry, they also have $t_2$-symmetry (as already indicated by the notation). It turns out that the bonding $t_{2b}$-level lies in the valence band, the nonbonding e-level in the valence band or the gap, and the antibonding $t_{2a}$-level in the gap above the e-level or in the conduction band. As regards deep levels of the 3d-TM atoms in the energy gap, therefore, the 2-fold degenerate e-level and the triply degenerate antibonding $t_{2a}$-level lying above it are the candidates (see Figure 3.17). The eigenfunctions of the e-level are linear combinations of the d-orbitals of the TM atom and are therefore strongly localized at the latter. Concerning the localization of the $t_{2b}$- and $t_{2a}$-levels, the position of the two orbital energies $\epsilon_d$ and $\epsilon_h$ relative to each other is decisive. If $\epsilon_d$ lies deeper than $\epsilon_h$, then $\alpha > \beta$ holds, meaning that the bonding $t_{2b}$-level is mainly formed from the d-states of the TM atom, while the antibonding $t_{2a}$-level, which forms the deep level, arises to a considerable extent from the $t_2$-components of the four sp$^3$-hybrid orbitals of the surrounding host crystal atoms. Exactly this picture emerges from the Green's function ab initio calculations mentioned above (Zunger, Lindefelt, 1983). In Figure 3.18 the calculated one-particle energy levels are shown for some 3d-TM atoms in GaP.

The most striking feature of the above description of the electronic structure of substitutional 3d-TM impurities is the existence of deep levels whose wavefunctions are spread out over the surrounding host atoms. These levels
are host-like to a certain extent rather than purely TM-atom-like. This picture stands in remarkable contrast to the so-called ionic model, which was long believed to be correct for substitutional TM atoms also. In this model, the deep level is strongly localized at the TM atom. Today there exists clear experimental evidence that this is not the case.

The ionic model is based on the so-called ligand field theory. This entails an approach to the deep level problem for impurity atoms which differs fundamentally from the one used in this book. Within ligand field theory, it is assumed that an impurity atom X in the crystal has essentially the same electronic structure as the free $X^{n+}$ ion, where $n+$ means the oxidation state of the impurity atom in the crystal introduced above. The energy levels and wavefunctions of this ion are weakly disturbed by the crystal field at the impurity site, an effect referred to as crystal field splitting. While this model applies relatively well in the case of ionic crystals (where $\epsilon_h$ lies deeper than $\epsilon_d$), it evidently fails in the case of the covalent or partially covalent tetrahedral semiconductor crystals.

The oxidation states of ligand field theory are, however, also reproduced in the approach taken here. To demonstrate this we consider the example of a Co atom substituting the metal atom in a III-V compound. There are 9 electrons in the $3d^74s^2$ valence shell of Co, and 5 in the valence shell of the group V atom to which the Co atom binds. 8 of these 14 electrons are hosted by bonding states of the deep center (2 by the bonding $a_1$-state, and 6 by the bonding $t_2$-state). Six electrons remain for the population of the deep impurity states, i.e. the nonbonding e-states and the antibonding $t_2$-states. Since 9 electrons are expected at a neutral Co atom, the oxidation state of Co in a III-V compound is Co$^{3+}$.
Substitutional Co in a II-VI compound has the oxidation state Co$^{2+}$, since one additional valence electron is provided by the group VI atom of the host crystal.
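The electron counting used for Co can be written as a short Python sketch. The function name and its arguments are hypothetical. The only physical inputs are those stated in the text: the bonding $a_1$- and $t_2$-states hold 2 + 6 = 8 electrons, and the remaining electrons populate the deep impurity states.

```python
def tm_oxidation_state(tm_valence_electrons, anion_valence_electrons,
                       bonding_capacity=8):
    """Electron counting for a substitutional TM atom on a cation site.

    Returns (oxidation_state, deep_state_electrons) for the neutral center.
    bonding_capacity = 2 (bonding a1) + 6 (bonding t2).
    """
    total = tm_valence_electrons + anion_valence_electrons
    deep_state_electrons = total - bonding_capacity
    # Electrons missing at the TM atom relative to the free neutral atom.
    oxidation_state = tm_valence_electrons - deep_state_electrons
    return oxidation_state, deep_state_electrons


# Co (3d^7 4s^2, 9 valence electrons) in a III-V and in a II-VI host.
print(tm_oxidation_state(9, 5))  # group V anion
print(tm_oxidation_state(9, 6))  # group VI anion
```

In this counting the oxidation state depends only on the number of electrons the anion contributes, which is why the same TM atom appears as 3+ in III-V hosts and as 2+ in II-VI hosts.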
Figure 3.19: Calculated multiplet structure for substitutional 3d impurity atoms in ZnS. (After Fazzio, Caldas, and Zunger, 1984.)
Ligand field theory accounts for the fine structure of the deep impurity levels, which is not explained by the approach taken above. Although this approach provides a qualitative understanding of deep 3d-TM centers in tetrahedral semiconductors, the excitation energies it yields are not correct quantitatively, particularly not for the internal transitions of the 3d-TM centers. To be quantitatively correct, this approach must be refined by including many-body effects, in particular configuration interaction. While the latter effect does not play an important role for main group impurity atoms, as we have seen above, it becomes important for TM impurity atoms with their strongly localized open d-shells (as was speculated in section 3.5.4 using the analogy with free TM atoms). Figure 3.19 demonstrates this in the case of TM atoms in ZnS.
The differences between the donor or acceptor ionization energies of a particular TM atom in different host crystals exhibit interesting behavior. Experimentally one finds that these differences are almost independent of the atom considered. Therefore the 'ionization-energy-versus-TM-atom' curves of different semiconductors are parallel to each other and can be translated to overlay by rigid displacements along the energy axis (Caldas, Fazzio, Zunger, 1984; Langer, Heinrich, 1985). In Figure 3.20 this fact is shown for acceptor and donor ionization levels of a 3d-TM impurity atom in a series of III-V and II-VI compound semiconductors. The rigid displacement along the energy axis has an important physical meaning, as we will discuss further below.

Figure 3.20: (a) Experimental acceptor ionization energies of 3d-TM impurities in III-V semiconductors. (b) Experimental donor (open symbols) and acceptor (filled symbols) ionization energies of 3d-TM atoms in II-VI semiconductors. (After Langer and Heinrich, 1985.)

Here, we will provide a simple explanation for the similarity of the 'ionization-energy-versus-TM-atom' curves in different semiconductors (Tersoff, Harrison, 1986). In this discussion we will use the defect molecule model of 3d-TM impurity atoms presented above. In addition, however, the dependence of the energy levels $\epsilon_d$ and $\epsilon_h$ on their population will be taken into account in a self-consistent way. Let $n_d$, $n_h$ and $n_d^0$, $n_h^0$ be the electron numbers in, respectively, the atomic d- and sp$^3$-hybrid orbitals, the former after charge transfer between the impurity atom and the crystal has taken place, and the latter before. The dependence of the levels $\epsilon_d$ and $\epsilon_h$ on population may, in the linear approximation of equation (3.25), be written as
$$\epsilon_d = \epsilon_d^0 + U_d\,(n_d - n_d^0), \qquad \epsilon_h = \epsilon_h^0 + U_h\,(n_h - n_h^0). \qquad (3.178)$$
Here $\epsilon_d^0$, $\epsilon_h^0$ denote the energy levels without charge transfer, and $U_d$, $U_h$ are, respectively, the Hubbard energies of the d-states and of the h-states of the sp$^3$-hybrids. The electron numbers in the bonding and antibonding $t_2$-states will, respectively, be denoted by $n_b$ and $n_a$, and the electron number for the e-states by $n_e$. Of the total number of electrons of the defect molecule,

$$n_d = n_b\,\alpha^2 + n_a\,\beta^2 + n_e \qquad (3.179)$$

are in d-states, and

$$n_h = n_b\,\beta^2 + n_a\,\alpha^2 \qquad (3.180)$$

are in h-states. The bonding $T_2$-level $E_b$ lies in the valence band; its three states are therefore completely occupied, and $n_b = 6$ holds. The difference $\delta = (\epsilon_d - \epsilon_h)/2$ between the d- and h-states adjusts self-consistently: if $\epsilon_h$ increases, then $\delta$ becomes small and, with this, also $\beta$ and $n_h$ are small. As a consequence, $n_d$ increases which, according to (3.178), results in an increase of $\epsilon_d$ and, hence, an increase of $\delta$. Self-consistency is then achieved if the two equations (3.178) for the level positions $\epsilon_d$ and $\epsilon_h$ are satisfied with values $n_d$ and $n_h$ from equations (3.179) and (3.180), respectively, and values $\alpha^2$, $\beta^2$ from (3.177). For the level difference $\delta = (\epsilon_d - \epsilon_h)/2$ this condition yields
$$\delta = \delta_0 + \frac{U_d}{2}\left[\frac{1}{2}(n_a + 6) + \frac{1}{2}(n_a - 6)\frac{\delta}{\sqrt{\delta^2 + V_{hd}^2}} + n_e - n_d^0\right] - \frac{U_h}{2}\left[\frac{1}{2}(n_a + 6) - \frac{1}{2}(n_a - 6)\frac{\delta}{\sqrt{\delta^2 + V_{hd}^2}} - n_h^0\right], \qquad (3.181)$$

where $2\delta_0 = \epsilon_d^0 - \epsilon_h^0$ denotes the level distance without charge transfer. The crucial point at this juncture is that $U_d$, because of the strong localization of the d-electrons and the weak screening of their interaction, has very large values (on the order of 10 eV). Therefore, the factor multiplying $U_d$ in equation (3.181) must practically vanish in order to satisfy this equation, leading to the approximate relation

$$\frac{1}{2}(n_a + 6) + \frac{1}{2}(n_a - 6)\frac{\delta}{\sqrt{\delta^2 + V_{hd}^2}} + n_e - n_d^0 \approx 0, \qquad (3.182)$$

which uniquely determines $\delta$. According to this relation, the d-level adjusts in the crystal in such a way that the charge transfer between the crystal and the d-shell practically vanishes (of course, it cannot vanish completely because adjustment of the levels requires a bit of charge to be transferred). The level position adjusted in this way is called the 'neutrality level'. According to equation (3.182), the self-consistent value of $\delta$ and, thus, also the energy separation $\epsilon_d - \epsilon_h = 2\delta$ between the d- and h-level, is independent of $\epsilon_h$. Since $\epsilon_h$ is the only quantity which changes substantially in the series of the III-V and II-VI compound semiconductors, the separation between $\epsilon_d$ and $\epsilon_h$ is the same in different semiconductors. This also holds for the antibonding $t_{2a}$-level and for the nonbonding e-level, i.e. for the two levels which are candidates for deep levels in the gap: their distance from $\epsilon_h$ is also independent of $\epsilon_h$. For the $t_{2a}$-level this follows from equation (3.174), and for the e-level from the fact that its position is determined by the location of the d-level in the crystal. We thus arrive at the conclusion that the deep levels of TM atoms in
the gap are pinned at the hybrid level $\epsilon_h$ of the surrounding host anions. If the coupling between the anion hybrids and the rest of the crystal is taken into account, the anion-hybrid energy $\epsilon_h$ has to be replaced by the average cation-anion hybrid energy of the crystal.

The results above explain the experimental finding that the deep-level-versus-TM-atom curves for two different semiconductors can be translated to overlay each other by rigid displacements. Moreover, they associate the magnitude of the displacements with the difference between the average hybrid energies of the two semiconductors. This quantity has a close relation to the valence band discontinuity at a semiconductor heterostructure, as will be discussed in more detail in section 3.8.

The large Hubbard energies of the atomic d-states of the TM impurity atoms and their pinning at the dangling hybrid levels of the surrounding atoms also contribute to the striking phenomenology noted above, namely that the deep levels of TM atoms depend only relatively weakly on their population, so that levels of several charge states can fit in the gap. In fact, if an additional electron is put in the deep $t_{2a}$-level and, with this, part of it is also added to the TM atom, the d-level of the latter will be shifted up and the bonding $t_{2b}$-level will be depolarized ($\alpha$ becoming smaller and $\beta$ larger in equation (3.177)). This means that electron charge density will flow from the TM atom to the surrounding host atoms, in almost the same measure as was added to the TM atom when the additional electron was placed in the deep antibonding $t_{2a}$-level.
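The pinning argument can be checked numerically. The sketch below is a minimal illustration under stated assumptions. It solves the self-consistency condition obtained by inserting the occupations (3.179) and (3.180), with $n_b = 6$, into the linear level shifts (3.178), and compares the result with the neutrality condition $n_d = n_d^0$. All function names and all parameter values ($V_{hd} = 1$ eV, $U_h = 1$ eV, the occupation numbers) are hypothetical and serve only to show the trend.

```python
import math

N_B = 6  # the bonding t2 level lies in the valence band and is filled


def occupations(delta, v_hd, n_a, n_e):
    """Electron numbers (n_d, n_h) for a given half-separation delta."""
    x = delta / math.hypot(delta, v_hd)
    alpha2 = 0.5 * (1.0 - x)
    beta2 = 0.5 * (1.0 + x)
    n_d = N_B * alpha2 + n_a * beta2 + n_e
    n_h = N_B * beta2 + n_a * alpha2
    return n_d, n_h


def neutral_delta(v_hd, n_a, n_e, n_d0):
    """delta for which the d-shell charge transfer vanishes (n_d = n_d0)."""
    x = (2.0 * (n_d0 - n_e) - (n_a + N_B)) / (n_a - N_B)
    if not -1.0 < x < 1.0:
        raise ValueError("no neutral solution for these occupations")
    return v_hd * x / math.sqrt(1.0 - x * x)


def self_consistent_delta(delta0, u_d, u_h, v_hd, n_a, n_e, n_d0, n_h0):
    """Solve delta = delta0 + U_d (n_d - n_d0)/2 - U_h (n_h - n_h0)/2.

    Uses bisection; the residual grows monotonically with delta for
    n_a <= 6, and the bracket of +-100 eV is an assumption of this sketch.
    """
    def residual(delta):
        n_d, n_h = occupations(delta, v_hd, n_a, n_e)
        return (delta - delta0
                - 0.5 * u_d * (n_d - n_d0)
                + 0.5 * u_h * (n_h - n_h0))

    lo, hi = -100.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Hypothetical center: n_a = 2, n_e = 4, free-atom d-count 7, hybrid count 5.
params = dict(v_hd=1.0, n_a=2, n_e=4, n_d0=7, n_h0=5)
print("neutral delta:", neutral_delta(1.0, 2, 4, 7))
for delta0 in (-1.0, 1.0):
    d = self_consistent_delta(delta0, u_d=10.0, u_h=1.0, **params)
    print(f"delta0 = {delta0:+.1f} eV -> self-consistent delta = {d:.3f} eV")
```

For these assumed values, shifting $\delta_0$ by 2 eV moves the self-consistent $\delta$ by only a few tenths of an eV (roughly from 0.40 to 0.63 eV), and the result approaches the neutrality value as $U_d$ is increased further.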
Since the Hubbard energies of the s- and p-shells of the host atoms are substantially smaller than those of the d-orbitals of the TM atoms, the deep TM levels are shifted up only by several tenths of an eV, rather than by 10 eV as in the case of the free TM ion, if one more electron is placed at the center (Haldane, Anderson, 1976).

Cu, Ag, Au in Si

These three elements (abbreviated below as NM, for noble metal) play an important part in silicon device technology. As impurity atoms, they constitute effective capture centers for nonequilibrium charge carriers. Their high solubility in Si is remarkable (in the region of … cm$^{-3}$ to … cm$^{-3}$ at melting temperature), as is their fast diffusion. Their incorporation in the host crystal is predominantly substitutional. The elements tend to form complexes with other elements, for example Au with O, Fe or 3d-TMs. This makes it somewhat difficult to identify the pure substitutional NM centers. Although the elements Cu, Ag and Au belong to the first main group of the periodic table, their behavior as impurity atoms in tetrahedral semiconductors resembles that of transition metals. This is understandable if one considers that for these elements only the outer s-shell, but not the outer p-shell, is occupied, and that the closed d-shell (Cu, Ag) or d- and f-shells (Au) lie energetically relatively shallow below the s- and p-shells. According to calculations by Fazzio, Caldas and Zunger (1985), the antibonding $t_2$-state lies in the energy gap, just as in the case of transition metals, and as in the latter case, this state arises from the interaction of the impurity-derived d-states (more exactly, their $t_2$-component) with the $t_2$-components of the dangling bonds of the surrounding host atoms. Because of the relatively weak localization of this deep impurity state, its Hubbard energy is small so that, again, several ionization levels can fit within the gap. For all three NM atoms, the calculations result in an NM(0/+) donor level and an NM($-$/0) acceptor level. In the case of Ag and Au, the amphoteric character has also been confirmed experimentally.
Rare earth atoms

The rare earth (RE) atoms are among the following elements: Ce, Pr, Nd, Pm, Sm, Eu, Ac, Th, Pa, U, Np, Pu, Am, and Cm. The valence shell configuration of the majority of these atoms may be described as $4f^n6s^2$, with the number $n$ of f-electrons varying from 2 for Ce to 14 for Yb (see Table 3.3). Exceptions are Gd and Tb, which have one 5d valence electron in addition. The investigation of RE impurities in Si as well as in the III-V and II-VI semiconductors has received new stimulus very recently. The reason for this is the luminescence of RE impurity atoms in these crystals in the technologically interesting visible and infrared spectral regions. In equilibrium the RE atoms are incorporated both substitutionally and interstitially. The equilibrium solubility of the RE atoms is, however, rather small. For practical applications, nonequilibrium incorporation techniques like ion implantation must be used although, generally, only part of the implanted atoms are optically or electrically active. For erbium (Er), impurity concentrations of about … cm$^{-3}$ have been reported in Si, and of about $10^{18}$ cm$^{-3}$ in GaAs.

The technologically interesting luminescence discussed above is caused by internal transitions within the 4f-shell of the RE atoms or, more strictly speaking, between the various levels arising from the many-particle levels of the 4f-shell under the influence of the crystal field. The 4f-orbitals remain, in fact, almost unchanged by incorporation of the RE atoms in the crystal. This may be traced back to the fact that these orbitals are strongly localized and, moreover, strongly shielded by the 6s-electrons, which shut out influences of the surrounding crystal. Furthermore, for RE elements heavier than Nd, there is also strong shielding by the completely occupied 5s- and 5p-shells lying outside the 4f-shells for these elements.
While internal electron transitions within the 4f-shell have been studied for a long time, transitions to energy levels outside of the 4f-shell, including donor and acceptor transitions into conduction and valence band states of the crystal, have been subjected to more detailed investigation only recently.

Figure 3.21: Donor ionization levels of substitutional rare earth impurity atoms in different semiconductors. (After Delerue and Lannoo, 1994.)
A rough but qualitatively correct picture of the electronic structure of substitutional RE impurity atoms in tetrahedral semiconductors may again be obtained by means of a defect molecule model (Delerue and Lannoo, 1991). In this model, the 4f-orbitals neither interact with the 5d- and 6s-orbitals of the RE atom nor do they couple with the orbitals of the surrounding host atoms. The RE atom is represented by its 5d- and 6s-orbitals only. The 4f-orbitals, nevertheless, are involved in forming deep states because electrons are transferred to or from them. The host crystal is described by the four sp$^3$-hybrid orbitals at the four neighboring atoms pointing towards the RE atom. The model is therefore largely analogous to that of 3d-TM atoms described above: the five 5d-orbitals decompose into two e-orbitals and three $t_2$-orbitals, and the 6s-orbital becomes an $a_1$-orbital under the influence of the tetrahedrally symmetric crystal field. The four sp$^3$-hybrid orbitals at the neighboring atoms pointing toward the RE atom split into one $a_1$- and three $t_2$-orbitals. The interaction between the two $a_1$-orbitals results in a bonding and an antibonding $a_1$-state lying deep in the valence and conduction bands, respectively. The e-states remain without bonding to the host crystal, while the host- and RE-derived $t_2$-states couple to each other, forming bonding and antibonding $t_2$-states.

The differences between 3d-TM and RE impurities are essentially of a quantitative nature. The 5d-levels of the RE atoms are higher in energy and therefore closer to the hybrid levels at the neighboring atoms than are the 3d-levels in the 3d-TM case. Because of this, the e-level and the antibonding $t_2$-level of the RE atom (which may lie in the gap in the case of 3d-TM atoms) are lifted into the conduction band.
The position of the 4f-shell with respect to the above levels still has to be determined. In the model by Delerue and Lannoo this is done by assuming that the distance between the 4f-shell and the average 5d-level in the crystal is the same as that between the 4f-shell and the 5d-shell in the free RE atom. The ionization levels of the 4f-shell calculated by means of the above defect molecule model are shown in Figure 3.21 for various RE atoms and host crystals. Two different oxidation states of the RE atom, RE$^{3+}$ and RE$^{2+}$, have been assumed. One recognizes that, for InP, the oxidation state RE$^{3+}$ is stable in most cases, because the RE(3+/4+) ionization levels are still below the Fermi level, while the RE(2+/3+) ionization levels are above it. For CdTe, the oxidation state RE$^{2+}$ is found to be stable in most cases. With a few exceptions, these predictions are confirmed by experimental investigations.

Oxidation states can also be derived in a more direct way from the above defect molecule model. In a III-V compound host crystal, the RE defect molecule has $2 + n + 5$ electrons, $2 + n$ of them from the RE atom and 5 of them from the host crystal anion. These electrons either occupy states in the 4f-shell or are in bonds between the RE atom and the crystal. The bonding $a_1$- and $t_2$-states host $2 + 6 = 8$ electrons, provided that they are energetically lower than the f-shell, which in fact turns out to be the case. Thus, $2 + n + 5 - 8 = n - 1$ electrons remain for the f-shell. The oxidation state of a neutral RE atom is RE$^{3+}$ in such circumstances, because 3 electrons are missing at the RE atom: the one f-electron and the two s-electrons in bonds with the host crystal. In II-VI compounds, the oxidation state of a neutral RE atom should be RE$^{2+}$ according to this consideration, since the host crystal anion delivers 6 electrons instead of 5.
A striking feature of the ionization-energy-versus-RE-atom curves for different semiconductors in Figure 3.21 is that they can be brought to overlie each other by rigid relative displacements. This is reminiscent of the ionization energies of the 3d-TM atoms, where the same feature is observed. The reason is similar to that of the 3d-TM case: on the one hand, the 4f-levels are pinned electrostatically to the average 5d-levels by means of the self-consistent charge exchange between the d-shell and the f-shell with its large Hubbard energy, and on the other hand, the 5d-levels are pinned at the average hybrid energy levels.
As antisite defect in GaAs (GaAs:As$_{\mathrm{Ga}}$)

The interest in this defect arises mainly from the fact that it is closely connected with the so-called EL2 center, which is one of the most common point perturbations in GaAs. The EL2 center is a double donor and plays an important role in GaAs electronics. Through deliberate use of its properties, p-type GaAs, which normally arises in crystal growth, can be transformed into semi-insulating GaAs, which is required for the manufacture of GaAs
MESFETs and other electronic devices based on GaAs. However, in regard to materials for light emitting optoelectronic devices, the concentration of EL2 centers has to be kept as small as possible because EL2 acts to quench luminescence. Whether the As antisite defect GaAs:As$_{\mathrm{Ga}}$ is really identical with the EL2 center, or whether it is only part of a complex which represents the EL2 center, is still somewhat controversial today. Nevertheless, important properties of the EL2 center, in particular its structural metastability, can be explained in terms of the simple GaAs:As$_{\mathrm{Ga}}$ antisite defect alone (Chadi, 1992; Dabrowski, Scheffler, 1992).

The GaAs:As$_{\mathrm{Ga}}$ antisite defect may be viewed as a special case of a substitutional main group impurity atom with sp$^3$-bonding in a tetrahedral semiconductor. For the latter point perturbation, the defect molecule model was developed in subsection 3.5.3. Accordingly, one has a bonding and an antibonding $A_1$-level, and a bonding and an antibonding $T_2$-level. From ab initio calculations it is known that the bonding levels lie within the valence band, the antibonding $A_1$-level in the gap, and the antibonding $T_2$-level in the conduction band. Of the 10 electrons of the defect molecule, 8 can be placed into the two bonding levels within the valence band. The remaining two electrons just suffice to fill the antibonding $A_1$-level in the gap. These are the two electrons which, by exciting the crystal, are transferred to the conduction band, giving rise to the double-donor nature of the center. However, the population of the two antibonding states is energetically costly. Therefore it is not surprising that, for the As atom, positions other than the substitutional one may be more favorable for minimization of the total energy.
In the ab initio calculations mentioned above, it was shown that a displacement of the As atom by about 1 Å in the [111] direction, i.e. a displacement away from one of the 4 nearest neighbor As atoms towards an interstitial position about 0.2 Å below the plane spanned by the other three As atoms (see Figure 3.22), results in a relative total energy minimum which lies only 0.25 eV above the absolute minimum of the substitutional As site. In addition to the stable substitutional state of the defect, there exists, therefore, a metastable interstitial state. The stable state is separated from the metastable one by an energy barrier of about 0.8 eV, and in the reverse direction the barrier amounts to about 0.34 eV. If one of the two donor electrons at the substitutional center is optically excited, the barrier decreases substantially and a thermal transition into the metastable state is possible. In this state the center is not capable of capturing the electron which was previously optically excited into the conduction band. This electron remains there, resulting in the experimentally observed persistent photoconductivity. Only by heating the sample above room temperature can the As atom return to its stable substitutional site.

The main reason for the relatively small energy difference between the substitutional and interstitial locations of the As atom is that, between an interstitial As atom and the three adjacent As atoms, strong sp$^2$-bonds can be formed similar to the bonds in the graphite structure of carbon and that of gray As. The nonbonding p-orbital of the As atom, which lies relatively high energetically, does not increase the total energy of the interstitial state because it remains largely unoccupied; the two remaining electrons are hosted by the dangling bond of the fourth, remote As atom.

Figure 3.22: Geometry of the stable and metastable state of the As antisite defect in GaAs. (After Dabrowski and Scheffler, 1992.)
DX centers

A metastability similar to that of the GaAs:As$_{\mathrm{Ga}}$ antisite defect is also observed at other point perturbations, among them the so-called DX centers in GaAs and (Ga,Al)As mixed crystals (Lang, Logan, 1977). The microscopic nature of these centers was a puzzle for a long time. It was only clear that they were related to impurity atoms of the main groups IV and VI of the periodic table, which normally are incorporated substitutionally on cation and anion sites, respectively, and form shallow donors. Under certain conditions (hydrostatic pressure or strong n-type doping in the case of GaAs, and AlAs mole fractions larger than 22% in the case of (Al,Ga)As) deep levels emerge from these shallow donors. Originally this transition was attributed to the formation of a defect complex involving the donor atom (D) and an unknown point perturbation (X). Now, it is clear that the DX center just represents another state of the shallow donor. In contrast to the latter, the donor atom of a DX center is not neutral but singly negatively charged, and it is not located at a substitutional site but at an interstitial site. If the special conditions mentioned above are not fulfilled, then the shallow donor represents the stable state of the impurity atom, and the DX state is metastable. However, as for the EL2 center, the relative total energy minimum of the DX state lies only slightly (about 0.2 eV) above the absolute minimum of the shallow donor state. The reason for this energy balance is the same as in the EL2 case, namely the energetically costly population of the antibonding $A_1$-level at the shallow donor. In the singly negatively charged state which the donor takes under high doping, (and ev
3.6
3.6.1
Every solid is bounded by a surface. Nonetheless, the model of an infinite solid, which neglects the presence of a surface, works very well in many cases. Why is this possible? The reason is, first, that in many cases one deals with properties, such as transport, optical, magnetic, mechanical or thermal properties, to which all the atoms of the solid contribute more or less to the same extent, and, secondly, that there are many more atoms in the bulk of a solid than at its surface, provided the solid is of macroscopic size. In the case of a silicon cube of 1 cm × 1 cm × 1 cm, for example,
one has $5 \times 10^{22}$ bulk atoms and only $4 \times 10^{15}$ surface atoms. Besides the bulk properties mentioned above, there are other properties or processes, like crystal growth, oxidation, etching or catalysis, which are determined by the surface atoms only. In these cases, the model of an infinite solid does not apply, of course. Moreover, in semiconductors, bulk properties like transport are often not controlled by all atoms but only by dopant atoms. Then the number $N_s$ of surface atoms per cm$^2$ is to be compared with the number of dopant atoms. For a semiconductor sample of thickness $d$, area 1 cm$^2$, and doping concentration $N_D$, the number $N_D \times d$ of dopant atoms per cm$^2$ can easily become as small as the number $N_s$ of surface atoms. This means that the transport properties of such a semiconductor sample will depend on its surface. In the history of semiconductor physics, this was recognized at a very early stage. Examples which demonstrate this are the electric rectification at a semiconductor-metal contact (discovered by F. Braun in 1874), and the unsuccessful attempts to build a field effect transistor in the late thirties of this century. The failure was caused by electron states localized at the surface which captured all the electrons induced by the external voltage.

The surfaces used in the early field effect experiments were far from perfect. They were made by cutting or cleaving a semiconductor sample in air. At best, they were clean and smooth in a macroscopic sense, but not microscopically. They exhibited surface roughness on the 100 nm scale, structural defects on the 1 nm scale, and impurity atoms at and below the surface. The surface states responsible for the difficulties of the early field effect transistor were due to these imperfections. In the present section we will deal with surfaces which are free of such imperfections.
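The atom counts quoted for the silicon cube, and the comparison between $N_D \times d$ and $N_s$, can be reproduced with a few lines of Python. This is a rough sketch under stated assumptions: a Si lattice constant of 5.431 Å, 8 atoms per cubic unit cell, and ideal unreconstructed (100) faces with 2 atoms per $a^2$ on all six sides of the cube. The function names and the sample doping level are hypothetical.

```python
A_SI_CM = 5.431e-8  # assumed Si lattice constant in cm


def bulk_atom_density(a_cm=A_SI_CM):
    """Atoms per cm^3 in the diamond structure (8 atoms per cubic cell)."""
    return 8.0 / a_cm**3


def surface_atom_density_100(a_cm=A_SI_CM):
    """Atoms per cm^2 on an ideal (100) plane (2 atoms per a^2)."""
    return 2.0 / a_cm**2


def cube_counts(edge_cm, a_cm=A_SI_CM):
    """Numbers of bulk and surface atoms of a cube with (100) faces."""
    bulk = bulk_atom_density(a_cm) * edge_cm**3
    surface = 6.0 * surface_atom_density_100(a_cm) * edge_cm**2
    return bulk, surface


def critical_thickness(n_dopant_cm3, a_cm=A_SI_CM):
    """Thickness d (cm) at which N_D * d equals the surface density N_s."""
    return surface_atom_density_100(a_cm) / n_dopant_cm3


bulk, surface = cube_counts(1.0)
print(f"bulk atoms: {bulk:.2e}, surface atoms: {surface:.2e}")
print(f"d for N_D = 1e16 cm^-3: {critical_thickness(1e16):.3e} cm")
```

Under these assumptions the sketch gives about $5 \times 10^{22}$ bulk atoms and $4 \times 10^{15}$ surface atoms, in line with the numbers quoted above. For a hypothetical doping level of $10^{16}$ cm$^{-3}$, a layer thinner than a fraction of a millimeter already contains fewer dopant atoms per cm$^2$ than there are surface atoms.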
Perfect surfaces of semiconductor crystals in this sense necessarily represent a particular lattice plane occupied only by chemically correct atoms at regular sites. No impurity atoms are allowed above or below the surface. The surface is reduced to what it means ideally, the termination of the crystal. Perfect surfaces in this sense cannot, of course, be realized in practice. One may only try to approximate them so closely that the existing imperfections do not essentially change the properties of the surface as compared to the properties which a perfect surface would have, if it really could be made. One calls such almost perfect surfaces clean. Although this term refers only to chemical composition, it also implies structural perfection or atomic smoothness. There are essentially three ways to manufacture clean surfaces. All three need ultrahigh vacuum (UHV), i.e. pressures below Torr: (i) Treatment of imperfect surfaces by ion bombardment and thermal annealing (generally in several cycles). (ii) Cleavage under UHV conditions (only surfaces which are cleavage planes
$$\mathbf{n} = h_1 \mathbf{b}_1 + h_2 \mathbf{b}_2 + h_3 \mathbf{b}_3 . \qquad (3.182)$$
Here, reciprocal lattice vectors are understood without the factor $2\pi$ in definition (2.122). If the normal direction $\mathbf{n}$ is given, the coefficients $(h_1 h_2 h_3)$ are only determined up to a common integer factor. We choose this factor such that $h_1, h_2, h_3$ have no common divisor. Then the coefficients $h_1, h_2, h_3$ are the Miller indices of the lattice plane under consideration, however, referring to the primitive lattice vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ as coordinate axes rather than to the crystallographic axes. The latter are parallel to the primitive lattice vectors only in the case of primitive Bravais lattices. For centered lattices, like the face centered cubic one, they have different directions, and the coefficients $(h_1 h_2 h_3)$ differ from the common Miller indices $(hkl)$. If necessary, one can easily switch from one representation to the other. The $(h_1 h_2 h_3)$ are, apart from a common factor, obtained by multiplying the $(hkl)$ with the matrix which transforms the three nonprimitive crystallographic axis vectors into the three primitive lattice vectors. Using the above characterization of lattice planes by their normal direction $\mathbf{n}$, the lattice points
$$\mathbf{R}_0 = r_{10}\mathbf{a}_1 + r_{20}\mathbf{a}_2 + r_{30}\mathbf{a}_3 \qquad (3.183)$$
of the lattice plane perpendicular to $\mathbf{n}$ which contains the zero point may be defined by the equation
$$\mathbf{n}\cdot\mathbf{R}_0 \equiv h_1 r_{10} + h_2 r_{20} + h_3 r_{30} = 0. \qquad (3.184)$$
The unique character of this equation lies in the fact that only integer solutions $r_{10}, r_{20}, r_{30}$ are allowed. In mathematics it is called a Diophantine equation.
Whereas the Miller indices define an infinite family of parallel lattice planes, equation (3.184) yields only a single plane, namely that member of the infinite family which contains the zero point. One can show that all lattice planes
$$\mathbf{R}_l = r_{1l}\mathbf{a}_1 + r_{2l}\mathbf{a}_2 + r_{3l}\mathbf{a}_3 \qquad (3.185)$$
of the infinite family of parallel planes are obtained by replacing the right hand side of equation (3.184) by arbitrary integers $l$,
$$\mathbf{n}\cdot\mathbf{R}_l \equiv h_1 r_{1l} + h_2 r_{2l} + h_3 r_{3l} = l. \qquad (3.186)$$
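The switch between the common Miller indices $(hkl)$ and the indices $(h_1 h_2 h_3)$ referred to primitive lattice vectors can be sketched as follows for the face centered cubic lattice. The particular choice of primitive vectors and the function names are assumptions made for this illustration; other, equivalent choices of $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ permute the result.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

# Primitive fcc vectors a1, a2, a3 in units of the cubic axes (one common choice).
FCC_PRIMITIVE = [
    (Fraction(0), Fraction(1, 2), Fraction(1, 2)),
    (Fraction(1, 2), Fraction(0), Fraction(1, 2)),
    (Fraction(1, 2), Fraction(1, 2), Fraction(0)),
]


def primitive_indices(hkl, primitive=FCC_PRIMITIVE):
    """Return (h1, h2, h3) without common divisor for conventional indices (hkl)."""
    # h_i is the projection of the plane normal onto the primitive vector a_i.
    raw = [sum(c * h for c, h in zip(a, hkl)) for a in primitive]
    # Clear the denominators, then divide out the common integer factor.
    denom = reduce(lambda x, y: x * y // gcd(x, y), (r.denominator for r in raw), 1)
    ints = [int(r * denom) for r in raw]
    g = reduce(gcd, (abs(i) for i in ints))
    if g == 0:
        raise ValueError("(000) does not define a lattice plane")
    return tuple(i // g for i in ints)


def plane_index(h, r):
    """Integer l of equation (3.186) for the lattice point with coefficients r."""
    return sum(hi * ri for hi, ri in zip(h, r))


if __name__ == "__main__":
    for hkl in [(1, 1, 1), (1, 0, 0), (1, 1, 0)]:
        print(hkl, "->", primitive_indices(hkl))
```

With this choice of primitive vectors the cubic (100) plane carries the primitive indices (011), and the (110) plane the indices (112), while (111) is unchanged.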
Table 3.8: Primitive surface lattice vectors and stacking vectors of low index surfaces of semiconductor crystals.

The points of the lattice plane defined by equation (3.184) form a 2D lattice. The primitive vectors of this lattice will be denoted by $\mathbf{f}_1$ and $\mathbf{f}_2$. This is done in such a way that the three vectors $\mathbf{f}_1, \mathbf{f}_2, \mathbf{n}$ form a right-handed coordinate system. The $\mathbf{f}_1, \mathbf{f}_2$ may be expressed in terms of the primitive lattice vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ of the crystal under consideration. In order to determine the coefficients of this representation, the Diophantine equation (3.184) has to be solved, which can be done by means of the Euclidean algorithm (see Bechstedt and Enderlein, 1988). The primitive vectors $\mathbf{f}_1, \mathbf{f}_2$ obtained in this way are shown in Table 3.8 for several low index lattice planes of the 5 common semiconductor crystal structures. Using the lattice vectors $\mathbf{f}_1, \mathbf{f}_2$ and arbitrary integers $s_1, s_2$, the lattice plane given by relations (3.183) and (3.184) can be represented as
$$\mathbf{R}_0 = s_1\mathbf{f}_1 + s_2\mathbf{f}_2. \qquad (3.187)$$
The lattice planes $\mathbf{R}_l$ defined by equations (3.185) and (3.186) can be written in the form
$$\mathbf{R}_l = s_1\mathbf{f}_1 + s_2\mathbf{f}_2 + l\,\mathbf{f}_3, \qquad (3.188)$$
where $\mathbf{f}_3$ is a vector complementing $\mathbf{f}_1$ and $\mathbf{f}_2$ to form a set of primitive lattice vectors $\mathbf{f}_1, \mathbf{f}_2, \mathbf{f}_3$ of the 3D bulk lattice of the crystal. The vector $\mathbf{f}_3$ can be determined from the Diophantine equation (3.186) for $l = 1$ in just the same way, i.e. by means of the Euclidean algorithm, as the vectors $\mathbf{f}_1, \mathbf{f}_2$ were obtained above. The choice of $\mathbf{f}_3$ is not unique, of course, and any vector $\mathbf{f}_3'$ which differs from $\mathbf{f}_3$ by a vector within the lattice plane can also be used. We call $\mathbf{f}_3$ the stacking vector because it determines how the lattice planes are stacked in the crystal.

The considerations above show that the primitive unit cell of a crystal may be chosen as a parallelepiped with one of its pairs of parallel faces parallel to a given lattice plane. This implies that the structure of a crystal may be characterized as consisting of parallel lattice planes which are displaced with respect to each other and which are each occupied by atoms of a particular species (see Figure 3.23). To be more specific, the lattice plane $\mathbf{R}_0$ is occupied by atoms of species 1, the next plane, displaced by $\boldsymbol{\tau}_2$ with respect to the first one, is occupied by atoms of species 2, etc., and the plane displaced by $\boldsymbol{\tau}_J$ is occupied by atoms of type $J$. It may happen that two or more atoms of the basis are located at the same lattice plane. In that case an atomic layer consists of two or more basis atoms. The lattice plane $\mathbf{R}_0 + \boldsymbol{\tau}_J$ completes the construction of a crystal slab which, in the vertical direction, encompasses exactly one primitive bulk unit cell. This slab is called a primitive crystal slab. A lattice plane occupied by atoms is referred to as an atomic layer. The second primitive crystal slab begins with a lattice plane occupied by atoms of species 1 and is displaced with respect to the zeroth plane of the first slab by the stacking vector $\mathbf{f}_3$, followed by a plane with atoms of species 2 which is displaced by $\mathbf{f}_3 + \boldsymbol{\tau}_2$ etc., the last plane of the second primitive crystal slab being occupied by atoms of species $J$ and displaced by $\mathbf{f}_3 + \boldsymbol{\tau}_J$. The second primitive crystal slab is followed by another $J$ planes displaced with respect to the first slab by $2\mathbf{f}_3$ instead of $\mathbf{f}_3$. The crystal can therefore be thought of as consisting of successive primitive crystal slabs situated one above the other, each of which consists of atomic layers containing different types of atoms and which are laterally displaced with respect to one another.

Figure 3.23: Construction of a crystal from its lattice planes. The lattice points of a given plane are occupied by identical atoms. Different planes may host different atoms if the crystal has a basis. In the figure, a crystal with only one atom per primitive unit cell is shown.
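The solution of the Diophantine equations (3.184) and (3.186) by the Euclidean algorithm may be sketched as follows. This is one possible construction, not necessarily the one used for Table 3.8: it returns integer coefficient triples with respect to $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ which satisfy $\mathbf{n}\cdot\mathbf{f}_1 = \mathbf{n}\cdot\mathbf{f}_2 = 0$ and $\mathbf{n}\cdot\mathbf{f}_3 = 1$, and whose determinant equals +1, so that they form a primitive, right-handed set if $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ do. The vectors obtained are in general not the shortest possible ones, and the function names are hypothetical.

```python
def ext_gcd(a, b):
    """Extended Euclidean algorithm: (g, x, y) with a*x + b*y = g = gcd(a, b) >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def plane_basis(h):
    """For indices (h1, h2, h3) without common divisor return integer triples
    (f1, f2, f3) with h.f1 = h.f2 = 0 and h.f3 = 1, referred to a1, a2, a3."""
    h1, h2, h3 = h
    g, x, y = ext_gcd(h1, h2)            # h1*x + h2*y = g
    if g == 0:                           # h = (0, 0, +1) or (0, 0, -1)
        if abs(h3) != 1:
            raise ValueError("indices must have no common divisor")
        return (h3, 0, 0), (0, 1, 0), (0, 0, h3)
    one, p, q = ext_gcd(g, h3)           # g*p + h3*q = 1
    if one != 1:
        raise ValueError("indices must have no common divisor")
    f1 = (h2 // g, -(h1 // g), 0)        # in-plane vector with zero a3 component
    f2 = (x * h3, y * h3, -g)            # second in-plane vector
    f3 = (x * p, y * p, q)               # stacking vector, h.f3 = 1
    return f1, f2, f3


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def det3(a, b, c):
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


if __name__ == "__main__":
    print(plane_basis((1, 1, 2)))
```

Because the determinant of the three coefficient triples is 1, every lattice point can be written as $s_1\mathbf{f}_1 + s_2\mathbf{f}_2 + l\,\mathbf{f}_3$ with integers $s_1, s_2, l$, which is the content of equation (3.188).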
The location of an individual atom can be specified by the number $l$ of the primitive crystal slab, the number $j$ of the atomic sublattice and the integer coordinates $s_1, s_2$ within the lattice plane. The position $\mathbf{R}(j, l, s_1, s_2)$ of an atom $j$ in crystal slab $l$ can therefore be written as
$$\mathbf{R}(j, l, s_1, s_2) = s_1\mathbf{f}_1 + s_2\mathbf{f}_2 + l\,\mathbf{f}_3 + \boldsymbol{\tau}_j. \qquad (3.189)$$
The complete set of locations of atoms in an infinite 3D crystal can be obtained by assigning all possible integer values from $-\infty$ to $+\infty$ for $l, s_1, s_2$, and all integer values from 1 to $J$ for $j$. The normal direction $\mathbf{n}$, upon which the whole construction of lattice planes is based, is arbitrary in the case of an infinite 3D crystal. Any choice of $\mathbf{n}$ yields the same crystal. We can immediately employ the above representation of an infinite crystal in describing a crystal with a surface. The normal direction $\mathbf{n}$ is then, however, fixed by the direction perpendicular to the surface. We start with the description of a crystal having an ideal surface. The exact meaning of ideal in this context is explained immediately below.
We consider a crystal surface given by a lattice plane perpendicular to $\mathbf{n}$ as defined by equation (3.184). A crystal having such a surface may be generated from an infinite crystal by removing all atomic layers above the surface and retaining those below. Since the forces acting on atoms situated beneath a lattice plane in an infinite crystal are also partially due to the atoms located above the plane, we can, in general, expect that the forces acting on atoms in a crystal with a surface differ from those acting in an infinite crystal. The deviation from the infinite case, however, diminishes with increasing distance of atoms from the surface, and we can thus assume that the forces acting on, and hence the positions of, atoms deep inside the crystal bulk are, to a good approximation, the same as those in an infinite crystal. This is, however, not true for atoms situated near the surface: the forces acting on them are appreciably different, resulting in displacements of atomic positions with respect to those of the infinite crystal. We will discuss these displacements below. Here, we assume that they are not present and, correspondingly, that the atoms at and immediately below the surface have the same positions as they would in an infinite crystal. The term ideal surface is used to refer to this configuration. The atoms of a crystal having an ideal surface are thus located at the positions $\mathbf{R}(j, l, s_1, s_2)$ given by equation (3.189); however, only positions below the surface plane are occupied. These obey the relation
$$\mathbf{n}\cdot\mathbf{R}(j, l, s_1, s_2) = l + \mathbf{n}\cdot\boldsymbol{\tau}_j \le 0. \qquad (3.190)$$
The surface or first atomic layer is obtained if the left hand side of this relation is taken to be zero. A solution of (3.190) is $l = 0$ and $\mathbf{n}\cdot\boldsymbol{\tau}_j = 0$. The latter condition signifies $j = 1$ since $\boldsymbol{\tau}_1 = 0$ is assumed. Thus, the first atomic layer corresponds to the particular lattice plane perpendicular to $\mathbf{n}$ which goes through zero and whose lattice points are occupied by basis atoms of species 1. There may be other $\boldsymbol{\tau}_j$ besides $\boldsymbol{\tau}_1$ which, although not being zero themselves, have a zero projection $\mathbf{n}\cdot\boldsymbol{\tau}_j$ with respect to $\mathbf{n}$. Then the basis atoms of this species $j$ are also located in the first atomic layer. They are shifted with respect to the atoms of species 1 by a vector $\boldsymbol{\tau}_j$ parallel to the surface. Such multiple-species occupancy of an atomic layer occurs, for instance, in the case of (110) surfaces of diamond and zincblende type crystals. In this case one has two atoms in each primitive unit cell of the 2D lattice of a lattice plane: in the case of Si, for example, 2 Si atoms, and in the case of GaAs, 1 Ga atom and 1 As atom.

The distinguished role of atomic species 1 in the above considerations results from the choice of the origin: it has been placed at an atomic site of species 1. Of course, the coordinate system may be shifted in such a way that its origin coincides with the location of any other basis atom. In each case a different surface is obtained. Even if the basis atoms are chemically identical the surfaces may differ from each other in a topological sense. In the case of diamond type crystals, for instance, two topologically different (111) surfaces exist, one with three nearest neighbors above and one below the surface and another with one above and three below. The latter surface is more stable than the former one, and the latter is meant if one refers to a (111) surface. For other crystal structures and surfaces the situation is similar. If topologically different surfaces exist for a given set of Miller indices, generally, one of them is more stable than any other, and that is the one which is commonly realized in experiment and studied theoretically.
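The occupation rule (3.190) and the grouping of basis atoms into atomic layers can be illustrated by a short sketch. The projections $\mathbf{n}\cdot\boldsymbol{\tau}_j$ used below are hypothetical input values chosen for illustration (two species in the same plane, and two species in different planes); the function name is likewise an assumption.

```python
from collections import defaultdict
from fractions import Fraction


def atomic_layers(n_dot_tau, n_slabs):
    """Group the occupied planes of a crystal with ideal surface into atomic layers.

    n_dot_tau[j-1] is the projection n.tau_j of basis atom j (tau_1 = 0).
    The slabs l = 0, -1, ..., -(n_slabs - 1) are scanned. The result is a list
    of (height, [(j, l), ...]) sorted from the surface downwards.
    """
    layers = defaultdict(list)
    for l in range(0, -n_slabs, -1):
        for j, proj in enumerate(n_dot_tau, start=1):
            height = l + proj            # left hand side of relation (3.190)
            if height <= 0:              # only planes at or below the surface
                layers[height].append((j, l))
    return sorted(layers.items(), key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    # Both basis atoms in the same plane, as for (110) zincblende surfaces.
    print(atomic_layers([Fraction(0), Fraction(0)], 2))
    # Second basis atom in a separate plane (hypothetical projection 1/4).
    print(atomic_layers([Fraction(0), Fraction(1, 4)], 2))
```

In the second example the plane of species 2 belonging to slab $l = 0$ would lie above the surface and is therefore left unoccupied; the first plane of species 2 belongs to slab $l = -1$.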
We first consider the translation symmetry of an ideal surface and the corresponding lattice.
Translation symmetry and lattices of ideal crystal surfaces
The translation symmetry of a crystal with an ideal surface can be derived from the following observations:
(i) Only translations within the surface lattice plane perpendicular to $\mathbf{n}$ are admissible. Any translation leading out of this plane would alter the spatial location of the surface and thus would not transform the system 'crystal with surface' into itself. The symmetry group of translations and, correspondingly, also the lattice of a crystal possessing a surface are, therefore, only 2-dimensional, although the crystal with surface is 3-dimensional in extent.
(ii) Only those translations are admissible which transform each atomic layer into itself. The construction of a crystal, layer by layer, as described above implies that if a particular translation transforms the first atomic layer into itself, this also holds for any other layer.
It immediately follows that the group of translations of a crystal with a surface is identical to the translation group of the first atomic layer. The lattice vectors $\mathbf{r}$ of a crystal possessing an ideal surface perpendicular to the normal direction $\mathbf{n}$ are therefore those of equation (3.187), so that
$$\mathbf{r} = s_1\mathbf{f}_1 + s_2\mathbf{f}_2 \qquad (3.191)$$
holds. The vectors f1, f 2 are the primitive lattice vectors of the 2D lattice of the ideal surface. The primitive unit cell of the crystal with surface has the form of a prism bounded above by the parallelogram spanned by f1, f 2 and extending to minus infinity in the direction of f3. For the lowindex surfaces of the common semiconductor crystal structures, the three vectors f1, f2, f 3 are listed in Table 3.8. As in the 3D case, 2D lattices may be divided into crystal systems and Bravais lattices according to their symmetry with respect to rotations and reflections. We now consider the possible plane crystal systems and plane Bravais lattices. Their point symmetry elements are necessarily rotations about axes which are perpendicular to the 2D lattice plane, and reflections at lines within the 2D lattice plane. Thus, the point groups are either C , ( n ) or C,, (nm, nmm). Since the rotation through 180' is always a symmetry element of a plane lattice, only even n are allowed. The highest value of n may be readily obtained from the derivation of the possible rotation symmetry axes of 3D crystals in Chapter 1. There, n 5 6 was found. Thus, the possible multiplicities of rotation symmetry axis of plane lattices are n = 2,4, and 6. A lattice which only contains a 2fold symmetry axis is either a completely general oblique lattice or a rectangular lattice. The point groups of these lattices are, respectively, C 2 (2) and C2, (2mm). Lattices with a 4fold symmetry axis also possess 4 reflection lines which are rotated through 45' with respect to each other. The point group of such a lattice is therefore C4v (4mm). Similarly, lattices with a 6fold axis have 6 reflection lines which meet at an angle of 30'. In this case the point group is c 6 v (6mm). 
There are thus 4 different plane crystal systems: the oblique with holohedral point group 2, the rectangular with holohedral point group 2mm, the quadratic with holohedral point group 4mm and the hexagonal with holohedral point group 6mm (see Figure 3.24). The possible plane Bravais lattices are obtained as follows. First, one takes the four primitive lattices with unit cells which are parallelograms having either no particular symmetry, or that of a rectangle, a square or an equilateral hexagon. Then one adds additional points to each of the unit cells of these lattices in such a way that their point symmetries are not lowered. Only one new lattice is obtained in this way, namely the 'body' centered rectangular lattice. This lattice cannot be transformed by continuous and symmetry preserving transformations into the primitive rectangular lattice, nor into any other primitive lattice. So it forms an additional Bravais lattice. In all other cases, the addition of points either leads back to the primitive lattice or results in no lattice at all (i.e. it creates a crystal with a basis). We conclude that five plane Bravais lattices can be realized within the framework of the 4 plane crystal systems: only the primitive lattice in the oblique case, the primitive and the centered in the rectangular case, and again only the primitive in the quadratic and hexagonal cases (see Figure 3.24).

Figure 3.24: The 5 plane Bravais lattices of the 4 plane crystal systems.

With the last statements we have completed the symmetry classification of the plane lattices of crystal surfaces. The lattice types of the ideal low index surfaces of the common semiconductor structures are summarized in Table 3.9. We now turn our attention to the symmetries of crystals with surface as a whole.
Point and space group symmetries of ideal crystal surfaces
We wish to establish, on the one hand, point groups which transfer equivalent directions of a crystal with surface into one another and, on the other hand, space groups which transform the crystal with a surface into itself. In the latter case this implies that not only does the 2D lattice transform into itself, but so do equivalent atoms occupying the primitive unit cells. These atoms are located at positions $\mathbf{R}(j, l, s_1, s_2)$ given by equation (3.189). To express the 2D nature of the translation symmetry of a crystal with a surface explicitly, we denote the atomic positions by $\mathbf{R}_{jl}(s_1, s_2) \equiv \mathbf{R}(j, l, s_1, s_2)$ and write
$$\mathbf{R}_{jl}(s_1, s_2) = s_1\mathbf{f}_1 + s_2\mathbf{f}_2 + \boldsymbol{\tau}_{jl},$$
Table 3.9: Structural properties of ideal low index surfaces of the five common semiconductor crystal structures. Columns 6 and 7 give, respectively, the number of atomic layers of an irreducible crystal slab and the number of basis atoms per layer.

Bravais lattice    Point group    Space group
hexagonal          3m             p3m1
square             2mm            p2mm
p-rectangular      2mm            p2mg
hexagonal          3m             p3m1
square             2mm            p2mm
p-rectangular      m              p1m1
hexagonal          3m             p3m1
square             4mm            p4mm
p-rectangular      2mm            p2mm
hexagonal          3m             p3m1
where
$$\boldsymbol{\tau}_{jl} = l\,\mathbf{f}_3 + \boldsymbol{\tau}_j$$
represents the basis of the crystal with surface. This basis contains an infinite number of vectors corresponding to the infinite number of atoms in a primitive unit cell. The point and space group elements of a crystal with surface must leave the surface invariant, i.e. only 2D symmetry groups need to be taken into consideration. Here, as in the case of translation symmetry, we therefore also have the situation that, although a crystal bounded by a surface extends in three dimensions, its point and space groups are only 2-dimensional. The point groups of directions of crystals with plane surfaces are necessarily subgroups of the holohedral point groups of the plane lattices, i.e. subgroups of the point groups 2, 2mm, 4mm and 6mm. There are exactly 10 such groups. Their stereograms are shown in Figure 3.25. This implies the existence of 10 different crystal classes which are associated with corresponding crystal systems as indicated in Figure 3.25.

Figure 3.25: The 10 point groups of equivalent directions of crystals with plane surface.
The possible plane space groups can be found as follows (see Figure 3.26). To start with, it is evident that each of the 10 point groups of equivalent directions combined with the corresponding associated lattice gives rise to a space group. The space groups p1, p211, p1m1, p2mm, p4, p4mm, p3, p3m1, p6 and p6mm originate in this manner. Since the point groups of the rectangular crystal system are each associated with two Bravais lattices, primitive and centered, we find two further space groups, c1m1 and c2mm. In the case of the point group 3m, there are two different possibilities of positioning the reflection lines relative to the hexagonal lattice vectors, either through the vertices of the equilateral hexagon of the Wigner-Seitz cell, as assumed in the case of p3m1, or such that they bisect its edges. In the latter case one has the group p31m as a 13th space group.

One should further note that the point group of directions of a crystal remains unchanged if, in its space group, a glide reflection line is substituted for an ordinary reflection line. One must therefore examine the 13 space groups already established to determine whether the substitution of a reflection line m by a glide reflection line g (i.e. a reflection in m in conjunction with a translation $\boldsymbol{\tau}$ by half of the shortest lattice vector parallel to m) leads to a new space group. One easily finds that this is not the case for the hexagonal crystal system. In the quadratic crystal system it is possible to substitute a system of glide reflection lines for one but not both of the nonequivalent reflection line systems. This yields the additional space group p4mg. The additional space groups in the case of the primitive rectangular crystal system are p1g1 (from p1m1) and p2mg, p2gg (from p2mm). The centered rectangular and the oblique crystal systems do not give rise to additional space groups. In the case of crystals with plane surface, there are therefore a total of 17 different space groups. Four of them involve glide reflections, i.e. they are nonsymmorphic.

Figure 3.26: The 17 space groups of crystals with plane surface.

The 2D point and space groups of the various surfaces of a given crystal may be derived from the 3D point and space groups of the infinite crystal under consideration. They are, in fact, the subgroups of the 3D groups which contain only those symmetry elements which transform lattice planes
In the preceding section the atomic structure of crystal surfaces was considered under the assumption that the atoms of the crystal bounded by a surface occupy the same positions $\mathbf{R}_{jl}(s_1, s_2)$ as they did in the infinite 3D crystal, if the former is generated from the latter by removing one half of it. As already noted, this assumption is actually not valid. The atoms of the surface layer experience different forces than those acting in the bulk of the crystal, and are thus subjected to displacements from their original sites in the crystal bulk. Since the forces acting on the atoms of the second layer are in part determined by the positions of the atoms in the first, these forces are also subjected to changes accompanied by displacements in the second layer, and so on for each successive layer. All one can assume is that the displacements decrease from one layer to the next and vanish altogether at a depth that is relatively far from the surface. Here, we present a more detailed description of the surface-induced displacements of atoms and discuss the resulting altered symmetries as compared to those of ideal crystal surfaces.
Translation symmetry of relaxed and reconstructed surfaces
We denote the displacements of atoms due to the formation of the surface by $\delta\mathbf{R}_{jl}(s_1, s_2)$, and the new positions of atoms in the crystal with surface by $\mathbf{R}'_{jl}(s_1, s_2)$. Then
$$\mathbf{R}'_{jl}(s_1, s_2) = \mathbf{R}_{jl}(s_1, s_2) + \delta\mathbf{R}_{jl}(s_1, s_2). \qquad (3.195)$$

Figure 3.27: Projections of a diamond or zincblende type crystal onto its low index surfaces (111), (110) and (100).
The displacements $\delta\mathbf{R}_{jl}$ may be divided into two classes with regard to their effect on translation symmetry. If the latter is not affected, the displacements are termed relaxation. In this case equivalent atoms in different primitive unit cells are displaced in the same way, i.e.
$$\delta\mathbf{R}_{jl}(s_1, s_2) = \delta\mathbf{R}_{jl}(0, 0) \quad \text{for all } s_1, s_2. \qquad (3.196)$$

Figure 3.28: Surface relaxation (a) and surface reconstruction (b) extending up to the second atomic layer. A 2 × 2 reconstruction is shown in part (b).
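Criterion (3.196) lends itself to a simple numerical test. The sketch below takes the displacements of one sublattice on a finite sample of unit cells, treated as periodic, and determines the smallest periods $n$ and $m$ along $\mathbf{f}_1$ and $\mathbf{f}_2$. If both periods equal 1, the displacements do not depend on $s_1, s_2$ and constitute a relaxation; otherwise the translation symmetry is changed. The data layout and the function names are assumptions of this illustration, and only periods along the two axes are detected, not centered or rotated surface cells.

```python
def _shifted(s1, s2, axis, p, size):
    """Cell index after a shift by p cells along f1 (axis 0) or f2 (axis 1)."""
    return ((s1 + p) % size, s2) if axis == 0 else (s1, (s2 + p) % size)


def displacement_period(dR, size):
    """Smallest (n, m) with dR(s1 + n, s2) = dR(s1, s2) and dR(s1, s2 + m) = dR(s1, s2).

    dR maps (s1, s2), 0 <= s1, s2 < size, to a displacement tuple; the sample
    is treated as periodic with period `size` in both directions.
    """
    def period(axis):
        for p in range(1, size + 1):
            if size % p:
                continue  # only periods which divide the sample size
            if all(dR[(s1, s2)] == dR[_shifted(s1, s2, axis, p, size)]
                   for s1 in range(size) for s2 in range(size)):
                return p  # p = size always succeeds, so the loop never falls through

    return period(0), period(1)


def classify(dR, size):
    """Return 'relaxation' if (3.196) holds on the sample, else 'reconstruction'."""
    return "relaxation" if displacement_period(dR, size) == (1, 1) else "reconstruction"


if __name__ == "__main__":
    size = 4
    uniform = {(s1, s2): (0, 0, -1) for s1 in range(size) for s2 in range(size)}
    paired = {(s1, s2): ((1, 0, 0) if s1 % 2 == 0 else (-1, 0, 0))
              for s1 in range(size) for s2 in range(size)}
    print(classify(uniform, size), displacement_period(uniform, size))
    print(classify(paired, size), displacement_period(paired, size))
```

The second example, in which neighboring atoms along $\mathbf{f}_1$ are displaced toward each other, doubles the period along $\mathbf{f}_1$ only.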
Only the basis vectors $\boldsymbol{\tau}_{jl}$ are altered; the lattice vectors remain unchanged in this case (see Figure 3.28a). If, on the other hand, the translation symmetry is altered, the displacements are termed reconstruction. In this case equivalent atoms in different unit cells are not all displaced in the same manner, i.e. $\delta\mathbf{R}_{jl}(s_1, s_2)$ depends on $s_1, s_2$. Both the basis and the lattice are changed. We first consider the changed lattice translation symmetry of reconstructed surfaces (see Figure 3.28b). In describing reconstruction it is useful to divide the crystal with a surface into two slabs parallel to the surface, an upper slab containing the atomic layers with displaced atoms, and a lower slab encompassing all other layers, i.e. layers with nondisplaced atoms. The upper slab is sometimes called a selvedge; here we use the terms surface slab and bulk slab (or simply bulk) for the upper and lower slabs, respectively. By $T_s$ and $T_b$ we denote, respectively, the plane translation symmetry groups of the surface and bulk slabs. The translations which transform the crystal with surface into itself must belong to both groups of translations, $T_s$ as well as $T_b$. The translation group $T$ of the whole crystal with surface is thus the intersection
$$T = T_s \cap T_b \qquad (3.197)$$
of $T_s$ and $T_b$. Alternatively, one can say that $T$ is the largest common subgroup of both groups $T_s$ and $T_b$. There are two possibilities: either $T$ only consists of the identity translation, which means that the lattices defined by $T_s$ and $T_b$ are noncommensurate and the crystal with surface does not possess any lattice translation symmetry, or $T$ contains more elements than just the identity, in which case one says that the two lattices derived from $T_s$ and $T_b$ are commensurate. The lattice associated with $T$ is called the coincidence lattice. If, in particular, $T_s$ is a subgroup of $T_b$ then $T$ is equal to $T_s$, i.e. the coincidence lattice is identical to the lattice of the surface slab. If $T_s$ is not a subgroup of $T_b$ then $T$ cannot be equal to $T_s$ and is necessarily a proper subgroup of $T_s$, i.e. it is smaller than $T_s$. One can thus distinguish between the following three cases with regard to the translation symmetry of crystals having a reconstructed surface: (i) no translation symmetry, (ii) translation symmetry exists but is smaller than that of the surface slab, (iii) translation symmetry exists and is the same as that of the surface slab.
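The distinction between cases (ii) and (iii) can be made concrete in the simplest situation, in which the surface slab has periods along $\mathbf{f}_1$ and $\mathbf{f}_2$ which are rational multiples $p/q$ of the bulk periods. This restriction to axis-parallel periods is a simplifying assumption of the sketch, as are the function names. If $p/q$ is in lowest terms, the common translations along the respective direction are the multiples of $p$ bulk periods, so the coincidence period equals the surface period only for $q = 1$. The noncommensurate case (i) corresponds to irrational ratios and cannot be represented by the exact fractions used here.

```python
from fractions import Fraction


def coincidence_multiple(ratio):
    """Smallest common period, in bulk periods, of the bulk period 1 and the
    surface period `ratio` (a positive rational number)."""
    ratio = Fraction(ratio)
    if ratio <= 0:
        raise ValueError("the period ratio must be positive")
    # Fraction stores p/q in lowest terms; k*p/q is an integer only if q divides k.
    return ratio.numerator


def translation_case(ratio1, ratio2):
    """Return 'iii' if the surface slab lattice is itself the coincidence lattice,
    and 'ii' if the coincidence lattice is coarser than the surface slab lattice."""
    r1, r2 = Fraction(ratio1), Fraction(ratio2)
    return "iii" if r1.denominator == 1 and r2.denominator == 1 else "ii"


if __name__ == "__main__":
    print(coincidence_multiple(Fraction(3, 2)), translation_case(Fraction(3, 2), 1))
    print(coincidence_multiple(2), translation_case(2, 1))
```

A surface period of 3/2 bulk periods thus leads to a coincidence period of 3 bulk periods, i.e. twice the surface period, whereas a surface period of 2 bulk periods is already the coincidence period.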
If one assumes that, among the various conceivable surface reconstructions with a given degree of translation symmetry, that particular reconstruction will take place which allows for maximum translation symmetry of the crystal with surface (the system surface slab plus bulk slab), then only the third of the above possibilities can be realized. A formal proof of this assumption does not exist, and it is probably not valid without exceptions; as a rule, however, it yields correct results. Using the above conclusions we are able to determine the primitive lattice vectors of the reconstructed surface in terms of the primitive lattice vectors $\mathbf{f}_1, \mathbf{f}_2$ of the ideal surface. The latter are, by definition, also the primitive lattice vectors of the bulk slab of the crystal with surface. Let $\mathbf{f}_1'$ and $\mathbf{f}_2'$ be the primitive lattice vectors of the reconstructed surface slab. They may be linearly composed of $\mathbf{f}_1, \mathbf{f}_2$ according to
$$\mathbf{f}_1' = Q_{11}\mathbf{f}_1 + Q_{12}\mathbf{f}_2, \qquad (3.198)$$
$$\mathbf{f}_2' = Q_{21}\mathbf{f}_1 + Q_{22}\mathbf{f}_2. \qquad (3.199)$$
Here, the coefficients $Q_{ik}$, $i, k = 1, 2$, of the transformation matrix $Q$ are, at the outset, arbitrary real numbers. They need to be rational if and only if the two lattices derived from $\mathbf{f}_1', \mathbf{f}_2'$ and $\mathbf{f}_1, \mathbf{f}_2$ have a coincidence lattice, i.e. in case (ii) above. If the coincidence lattice is identical to the lattice derived from $\mathbf{f}_1', \mathbf{f}_2'$, i.e. in the particularly interesting case (iii), it follows that the $Q_{ik}$ must have integer values. In this case, the lattice derived from $\mathbf{f}_1', \mathbf{f}_2'$ is simultaneously the lattice of the reconstructed surface. We thus arrive at the important conclusion that in the only case of practical interest, case (iii) above, surface reconstruction can be described by a 2 × 2 matrix with integer elements. The two most common forms of this type of reconstruction have a special notation (Wood notation): (1) The nondiagonal elements vanish, i.e. $\mathbf{f}_1'$ and $\mathbf{f}_2'$ are parallel to $\mathbf{f}_1, \mathbf{f}_2$, respectively, and their lengths are integer multiples of the respective lengths of the latter. We thus have
$$\mathbf{f}_1' = n\,\mathbf{f}_1, \qquad \mathbf{f}_2' = m\,\mathbf{f}_2, \qquad (3.200)$$
with $n$ and $m$ being integers. The primitive unit cell of the surface slab contains $n \times m$ primitive unit cells of the bulk slab. This is said to constitute an $n \times m$ reconstruction. An $n \times m$ reconstructed crystal surface of a particular material C parallel to a lattice plane with Miller indices $(hkl)$ (or $(hkil)$ in the case of hexagonal symmetry) is characterized by the symbolic notation
$$\mathrm{C}(hkl)\; n \times m. \qquad (3.201)$$
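(2) The off-diagonal elements of $Q$ are not equal to zero, i.e. $\mathbf{f}_1'$ is not parallel to $\mathbf{f}_1$ and/or $\mathbf{f}_2'$ is not parallel to $\mathbf{f}_2$. The angles $\angle(\mathbf{f}_1', \mathbf{f}_1)$ and $\angle(\mathbf{f}_2', \mathbf{f}_2)$, which are in general not equal to each other, are assumed to be equal in the case under consideration. This means that the two vectors $\mathbf{f}_1, \mathbf{f}_2$ (with tails joined at the same point) can be transformed into the two corresponding vectors $\mathbf{f}_1', \mathbf{f}_2'$ by a rotation through the angle $\angle(\mathbf{f}_1', \mathbf{f}_1) = \angle(\mathbf{f}_2', \mathbf{f}_2) = \alpha$ about an axis which is perpendicular to the surface, with a subsequent rescaling of $\mathbf{f}_1, \mathbf{f}_2$ by the factors $|\mathbf{f}_1'|/|\mathbf{f}_1|$ and $|\mathbf{f}_2'|/|\mathbf{f}_2|$, respectively. A $(hkl)$ surface of a particular material C reconstructed in this way is characterized by the symbolic notation
$$\mathrm{C}(hkl)\; \frac{|\mathbf{f}_1'|}{|\mathbf{f}_1|} \times \frac{|\mathbf{f}_2'|}{|\mathbf{f}_2|} - \alpha. \qquad (3.202)$$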
The factors $|\mathbf{f}_1'|/|\mathbf{f}_1|$ and $|\mathbf{f}_2'|/|\mathbf{f}_2|$ are in general irrational, in contrast to the $Q_{ik}$, which in the case considered here are integers. Examples of both of the special reconstruction forms discussed, as well as of the general reconstruction form in case (iii), are shown in Figure 3.29. Sometimes, in the notation (3.201), $n \times m$ is replaced by $p$-$n \times m$ or $c$-$n \times m$. The lattice vectors $\mathbf{f}_1' = n\mathbf{f}_1$ and $\mathbf{f}_2' = m\mathbf{f}_2$ are then not necessarily primitive as originally assumed in the notation (3.201), and in addition to primitive ($p$) reconstructed surface lattices, also centered ($c$) ones are possible. This can only take place, however, for rectangular surface lattices. Thus, the modified notation applies only to this case, although it is also sometimes used (formally incorrectly) for square lattices. In the rectangular case the notation $c$-$n \times m$ describes a type of reconstruction which is not covered by one of the two notations (3.201), (3.202), and which can otherwise only be characterized by the 2 × 2 matrix $Q$ itself. For square reconstructed lattices the $c$-$n \times m$ notation is just a simpler description of a reconstruction of type (3.202). The point symmetry of a reconstructed surface lattice is generally lower than that of the ideal surface lattice from which it is derived. This implies that the crystal with reconstructed surface belongs to a different plane crystal system than the crystal with ideal surface. For example, the 2 × 1 reconstruction of a square lattice leads to a rectangular lattice. The same holds for the 2 × 1 reconstruction of a hexagonal lattice, which also results in a rectangular lattice.

Figure 3.29: Three different types of surface reconstructions: (a) 1 × 2, (b) $\sqrt{3} \times \sqrt{3} - 30^\circ$, (c) general type; the Wood notation does not apply in this case, the matrix notation does, with $Q_{11} = 5$, $Q_{12} = 1$, $Q_{21} = 2$, $Q_{22} = 2$.
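The relation between the matrix $Q$ and the Wood notation can be sketched numerically. The functions below apply $Q$ to Cartesian vectors $\mathbf{f}_1, \mathbf{f}_2$, compare the two rotation angles, and return the length ratios and the common angle if the notation (3.202) applies. The hexagonal lattice with an angle of 120° between $\mathbf{f}_1$ and $\mathbf{f}_2$, the matrix used for the $\sqrt{3} \times \sqrt{3}$ example, the square lattice assumed for the matrix of Figure 3.29(c), and all function names are assumptions of this illustration.

```python
import math


def reconstructed_vectors(f1, f2, Q):
    """Apply the 2x2 matrix Q of (3.198), (3.199) to the Cartesian vectors f1, f2."""
    (q11, q12), (q21, q22) = Q
    g1 = (q11 * f1[0] + q12 * f2[0], q11 * f1[1] + q12 * f2[1])
    g2 = (q21 * f1[0] + q22 * f2[0], q21 * f1[1] + q22 * f2[1])
    return g1, g2


def rotation_deg(u, v):
    """Signed angle in degrees which rotates the direction of u into that of v."""
    cross = u[0] * v[1] - u[1] * v[0]
    dot = u[0] * v[0] + u[1] * v[1]
    return math.degrees(math.atan2(cross, dot))


def wood_parameters(f1, f2, Q, tol=1e-9):
    """Return (|f1'|/|f1|, |f2'|/|f2|, alpha) if both vectors are rotated by the
    same angle alpha, otherwise None (only the matrix notation applies)."""
    g1, g2 = reconstructed_vectors(f1, f2, Q)
    a1, a2 = rotation_deg(f1, g1), rotation_deg(f2, g2)
    if abs(a1 - a2) > tol:
        return None
    return math.hypot(*g1) / math.hypot(*f1), math.hypot(*g2) / math.hypot(*f2), a1


def cell_ratio(Q):
    """Number of bulk unit cells per surface unit cell, |det Q|."""
    (q11, q12), (q21, q22) = Q
    return abs(q11 * q22 - q12 * q21)


HEX_F1 = (1.0, 0.0)
HEX_F2 = (-0.5, math.sqrt(3) / 2)

if __name__ == "__main__":
    print(wood_parameters(HEX_F1, HEX_F2, ((2, 1), (-1, 1))), cell_ratio(((2, 1), (-1, 1))))
    print(wood_parameters((1.0, 0.0), (0.0, 1.0), ((5, 1), (2, 2))), cell_ratio(((5, 1), (2, 2))))
```

For the hexagonal example both length ratios equal $\sqrt{3}$ and the rotation angle is 30°, with three bulk cells per surface cell. For the matrix of Figure 3.29(c) applied to a square lattice the two rotation angles differ, so that no Wood notation exists.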
A further item is worth mentioning concerning the surface reconstruction itself. It follows from the point symmetry of the crystal with ideal surface. If the latter belongs to the square crystal system, i.e. if it has a square lattice and one of the two point symmetry groups 4mm or 4, the directions of the two primitive lattice vectors are symmetrically equivalent. A particular reconstruction which increases the surface unit cell in the direction of $\mathbf{f}_1$ by a factor $n$ and in the direction of $\mathbf{f}_2$ by a factor $m$ is equivalent to another reconstruction which does the same for, respectively, the symmetrically equivalent vectors $\mathbf{f}_2$ and $\mathbf{f}_1$ (Figure 3.30). An analogous statement holds for an ideal surface of the hexagonal crystal system, having a hexagonal lattice and one of the point groups 6mm, 6, 3m or 3. In this case, three symmetrically equivalent directions exist (Figure 3.30). If there is no physical reason which makes one of the different symmetrically equivalent reconstructions more likely than another, they will take place simultaneously in different regions of the surface. The result is the formation of domains of otherwise identical, but differently oriented, reconstructed unit cells. Due to the domain structure, the overall translation symmetry of the surface is destroyed. Structural imperfections of a more local nature occur where the boundaries of such domains meet.
Figure 3.30: Symmetrically equivalent 2 x 1 reconstructions in the case of ideal surfaces belonging to the square (left) and hexagonal crystal systems.
Point and space symmetries
The point and space symmetries of relaxed or reconstructed surfaces are generally of lower degree than those of the corresponding ideal surfaces. They depend not only on the point and space symmetries of the irreducible crystal slab, as in the case of an ideal surface, but also on the point and space symmetries of the relaxed or reconstructed surface slab. If the surface slab consists of more layers than the irreducible crystal slab, then its point and space groups can be taken to determine the point and space groups of the whole crystal with relaxed or reconstructed surface. If the surface slab contains fewer layers than the irreducible crystal slab, it is expedient to add further atomic layers (which then do not contain displaced atoms) to make up the difference. The point and space group elements of the thus extended surface slab which are simultaneously point and space group elements of the irreducible crystal slab form, respectively, the point and space groups of the whole crystal with relaxed or reconstructed surface. As was done for the translation symmetry group above, one may argue that the point and space groups of the relaxed or reconstructed surface slab should be subgroups of the point and space groups of the corresponding ideal surface. If this is the case, the point and space groups of the relaxed or reconstructed surface slab are, respectively, the point and space groups of the whole crystal with relaxed or reconstructed surface.
3.6.3
The electrons and cores of a crystal with a surface undergo the same interactions as the electrons and cores of an infinite bulk crystal, the only difference being that the cores and electrons of the removed semi-infinite crystal above the surface are missing, and the positions of the cores just below the surface differ from those in the infinite bulk. The two basic approximations of the theory of the interacting electron-core system of an infinite crystal, namely the adiabatic approximation and the one-electron approximation, are not influenced by these modifications, so they may also be used in the presence of a surface. In electronic structure calculations of infinite bulk semiconductor crystals, the core positions are commonly taken as input data. This is possible because these positions are crystal sites of high symmetry which are well known from X-ray diffraction experiments. For crystals with a surface, this can no longer be assumed. The positions of atoms in the surface layers of relaxed or reconstructed surfaces are crystal sites of lower symmetry. In many cases, they are not known or only incompletely known from experimental investigations, because these investigations are more difficult and less precise than X-ray diffraction in the case of bulk crystals (a short review of these methods will be given further below). X-ray diffraction cannot be applied to surfaces because it lacks surface sensitivity. In such circumstances, the positions of atomic cores at surfaces are to be treated as output, rather than input, data of the electronic structure calculations. The way that this can be accomplished was discussed in section 2.2 in general terms, and in section 3.5 with respect to point defects. It involves, first, the calculation of the total energy of the electron-core system for a variety of different sets of core positions and, second, the minimization of the total energy with respect to these sets. The set that minimizes the total energy gives the actual core positions. The most critical part of the total energy is the energy of the electron system. In order to obtain it, the one-electron energies of the crystal with surface have to be calculated for assumed core positions.
Formally, this involves the same task as in the case of bulk crystals, namely, the calculation of stationary one-electron states for given core positions. Below, we will demonstrate how this problem can be solved in the case of crystals with surfaces. The remaining parts of the procedure for determining the atomic and electronic structures of surfaces, i.e. the calculation of the entire total energy including the core-core interaction energy, and the minimization of the total energy with respect to the core positions, will not be treated here because they are mainly numerical tasks. The assumption of a priori known core positions is valid if ideal surfaces are considered. In this case they are the infinite bulk positions. The electronic structures of ideal surfaces are important as reference data for the electronic structures of relaxed and reconstructed surfaces. Thus we will also deal with them.
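The stationary one-electron states $\psi_E(\mathbf{x})$ of the crystal with surface obey the Schrödinger equation

$$\left[\frac{\mathbf{p}^2}{2m} + V(\mathbf{x})\right]\psi_E(\mathbf{x}) = E\,\psi_E(\mathbf{x}) \qquad (3.203)$$

as in the case of an infinite bulk crystal (see equation 2.53). The one-electron potential $V(\mathbf{x})$ may be written as the sum

$$V(\mathbf{x}) = V_c(\mathbf{x}) + V_e(\mathbf{x}) \qquad (3.204)$$

of the electron-core interaction potential $V_c(\mathbf{x})$, and an effective one-electron potential $V_e(\mathbf{x})$ due to electron-electron interaction. For $V_e(\mathbf{x})$, any of the one-electron approximations introduced in section 2.1 may be used. The core potential $V_c(\mathbf{x})$ follows from the corresponding expression for an infinite crystal if the summation is restricted to cores within and below the surface. Using the surface adapted notation $\mathbf{r} + \boldsymbol{\tau}_j + l\mathbf{f}_3$ for the position of the $j$th basis atom of the primitive bulk unit cell at the bulk lattice point $\mathbf{r} + l\mathbf{f}_3$, $j = 1, 2, \ldots, J$, $l = 0, 1, \ldots, \infty$, this leads to

$$V_c(\mathbf{x}) = \sum_{\mathbf{r}} \sum_{l=0}^{\infty} \sum_{j=1}^{J} v_j(\mathbf{x} - \mathbf{r} - \boldsymbol{\tau}_j - l\mathbf{f}_3), \qquad (3.205)$$

where $v_j(\mathbf{x})$ is the potential of a core of species $j$ located at $\mathbf{R} = 0$. The core potential $V_c(\mathbf{x})$ and, hence, also $V_e(\mathbf{x})$ and $V(\mathbf{x})$ have the translation symmetry of the 2D surface lattice, so that

$$V(\mathbf{x}) = V(\mathbf{x} + \mathbf{r}). \qquad (3.206)$$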
As in the 3D bulk case, this symmetry can be used to derive the Bloch theorem.
Bloch theorem
This theorem states that the energy eigenfunctions $\psi_E(\mathbf{x})$ of the Schrödinger equation (3.203) may be chosen simultaneously as eigenfunctions of the surface lattice translation operators $t_{\mathbf{r}}$. This allows one to write these functions in the form of Bloch functions $\psi_{E\mathbf{q}}(\mathbf{x})$ with a 2D quasi-wavevector

$$\mathbf{q} = q_1\mathbf{g}_1 + q_2\mathbf{g}_2, \qquad (3.207)$$

where $q_1, q_2$ are arbitrary real numbers, and $\mathbf{g}_1$ and $\mathbf{g}_2$ are the primitive lattice vectors of the reciprocal surface lattice, defined by the relations

$$\mathbf{f}_i \cdot \mathbf{g}_k = 2\pi\delta_{ik}, \qquad i, k = 1, 2. \qquad (3.208)$$

The latter equations are solved by the vectors

$$\mathbf{g}_1 = N^{-1}\,\mathbf{f}_2 \times [\mathbf{f}_1 \times \mathbf{f}_2], \qquad \mathbf{g}_2 = N^{-1}\,\mathbf{f}_1 \times [\mathbf{f}_2 \times \mathbf{f}_1] \qquad (3.209)$$
with $N = (1/2\pi)[(\mathbf{f}_1 \cdot \mathbf{f}_1)(\mathbf{f}_2 \cdot \mathbf{f}_2) - (\mathbf{f}_1 \cdot \mathbf{f}_2)^2]$ as normalization constant. The two vectors $\mathbf{g}_1$, $\mathbf{g}_2$ of (3.209) lie in the plane spanned by the two primitive surface lattice vectors $\mathbf{f}_1$, $\mathbf{f}_2$, i.e., within the surface. The same statement applies to the 2D wavevector $\mathbf{q}$. As in the 3D bulk case it is convenient to introduce a region of macroscopic size with respect to which the eigenfunctions $\psi_{E\mathbf{q}}(\mathbf{x})$ can be assumed to be periodic. In the case of a crystal with surface this region forms a parallelogram spanned by the edge vectors $G\mathbf{f}_1$, $G\mathbf{f}_2$, with $G$ being a large integer. The area $\Omega_\parallel$ of the periodicity region is $G^2|\mathbf{f}_1 \times \mathbf{f}_2|$. The Bloch functions normalized to it may be written as

$$\psi_{E\mathbf{q}}(\mathbf{x}) = \frac{1}{\sqrt{\Omega_\parallel}}\, e^{i\mathbf{q}\cdot\mathbf{x}}\, u_{E\mathbf{q}}(\mathbf{x}), \qquad (3.210)$$

where $u_{E\mathbf{q}}(\mathbf{x})$ is the Bloch factor, which has the periodicity $u_{E\mathbf{q}}(\mathbf{x}) = u_{E\mathbf{q}}(\mathbf{x} + \mathbf{r})$ of the surface lattice and is normalized with respect to a primitive surface unit cell. To guarantee the periodicity of the Bloch functions $\psi_{E\mathbf{q}}(\mathbf{x})$ of (3.210) with respect to the periodicity parallelogram, the wavevectors $\mathbf{q}$ must have the form

$$\mathbf{q} = \frac{p_1}{G}\,\mathbf{g}_1 + \frac{p_2}{G}\,\mathbf{g}_2 \qquad (3.211)$$

with $p_1, p_2$ as integers. This means that the $\mathbf{q}$-vectors must belong to a finely meshed lattice similar to that of Figure 2.4. For the macroscopically large values of $G$ which we assume, $\mathbf{q}$ is practically continuous, although the number of different $\mathbf{q}$-values within a given region of $\mathbf{q}$-space is finite.
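The construction of the reciprocal surface lattice in (3.208), (3.209) and of the discrete wavevector mesh (3.211) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the hexagonal test lattice are hypothetical, and the lattice vectors are given as 3-component tuples lying in the surface plane.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def reciprocal_surface_vectors(f1, f2):
    """Primitive reciprocal surface lattice vectors following equation (3.209)."""
    n = (dot(f1, f1) * dot(f2, f2) - dot(f1, f2) ** 2) / (2.0 * math.pi)
    g1 = tuple(c / n for c in cross(f2, cross(f1, f2)))
    g2 = tuple(c / n for c in cross(f1, cross(f2, f1)))
    return g1, g2

def q_mesh(g1, g2, big_g):
    """Allowed q-vectors (p1*g1 + p2*g2)/G inside one reciprocal unit cell, cf. (3.211)."""
    return [tuple((p1 * a + p2 * b) / big_g for a, b in zip(g1, g2))
            for p1 in range(big_g) for p2 in range(big_g)]

# Hypothetical hexagonal surface lattice with unit lattice constant.
f1 = (1.0, 0.0, 0.0)
f2 = (0.5, math.sqrt(3.0) / 2.0, 0.0)
g1, g2 = reciprocal_surface_vectors(f1, f2)
print(dot(f1, g1) / (2 * math.pi), dot(f1, g2), dot(f2, g1), dot(f2, g2) / (2 * math.pi))
print(len(q_mesh(g1, g2, 4)))
```

Each reciprocal unit cell contains exactly $G^2$ allowed wavevectors, one per primitive surface unit cell of the periodicity parallelogram.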
$$\mathbf{q} \cdot \mathbf{g} + \tfrac{1}{2}\,|\mathbf{g}|^2 = 0, \qquad (3.212)$$
where $\mathbf{g}$ is an arbitrary surface reciprocal lattice vector. Equation (3.212) defines lines in the 2D $\mathbf{q}$-space which are analogous to Bragg reflection planes in the 3D $\mathbf{k}$-space of a bulk crystal. On Bragg reflection lines, the energy function $E(\mathbf{q})$ has discontinuities. These lines may be used to define 2D Brillouin zones in just the same way as was done in the 3D case in section 2.4. One speaks of surface Brillouin zones (BZs). The surface BZs of the square lattice have, in fact, already been used in section 2.4 as an illustration for the 3D case. Of particular importance is the first surface BZ. It may also be defined as the Wigner-Seitz cell of the reciprocal surface lattice. Since there are 5 different plane Bravais lattices and, hence, 5 different reciprocal surface lattices, there are also 5 different first surface BZs. They are shown in Figure 3.31. Their shapes are the same as those of the Wigner-Seitz cells of the corresponding direct lattices since the Bravais types of the direct and reciprocal surface lattices always coincide. The first surface BZs are, by definition, free of Bragg reflection lines. Thus the energy function $E(\mathbf{q})$ is continuous within these zones. Furthermore, any higher surface BZ of order $p$ may be reduced to the first surface BZ, and the energy function $E(\mathbf{q})$ in the $p$th surface BZ can be folded back to the first surface BZ. There, it forms a continuous function $E_\mu(\mathbf{q})$ which is called a surface energy band. Thus we may state that the energy eigenvalues of a crystal with surface form energy bands over the first surface BZ. There are surface bands of different types, regarding their relations to the energy bands of 3D bulk crystals without surface. Below we characterize these differences in a qualitative way.
Types of eigenstates
Bulk states
Figure 3.31: Surface BZs of the 5 plane lattices: (a) oblique, (b) p-rectangular, (c) c-rectangular, (d) square, (e) hexagonal. Symmetry lines and points are also shown, and their notations are introduced.
Figure 3.32: The three different types of electron energy eigenstates of a crystal with surface below the vacuum level (schematically).
Consider an infinite bulk crystal, and cleave it into two semi-infinite crystals with a surface parallel to a particular lattice plane. The spectra of energy eigenvalues of the two semi-infinite crystals will contain all energy levels which were already eigenvalues of the infinite bulk crystal before cleaving. This implies that each crystal with surface possesses an energy eigenvalue spectrum which is partially made up of energy eigenvalues of the infinite bulk crystal from which it is derived. The eigenfunctions of the crystal with surface corresponding to these eigenvalues, if examined at positions outside of the crystal, will decay exponentially with increasing distance from the surface, while inside they will practically be the same as those of the infinite crystal, i.e. they will exhibit undamped oscillations throughout the whole semi-infinite crystal, as do the eigenfunctions of the infinite bulk crystal (see Figure 3.32). Eigenstates of a crystal with surface exhibiting such properties are called bulk states. The corresponding energy eigenvalues form bulk state surface energy bands $E_\mu(\mathbf{q})$. These bands can be obtained by projecting the bulk bands $E_\nu(\mathbf{k})$ of the infinite crystal onto the first surface BZ. Here, 'projecting' means that one assigns to a particular point $\mathbf{q}$ of the first surface BZ all bulk band energies $E_\nu(\mathbf{k})$ corresponding to $\mathbf{k}$-vectors of the first bulk BZ having the same projection $\mathbf{q}$ on the surface BZ plane, but with various components $k_\perp$ perpendicular to it. Formally, this can be expressed by the relations

$$E_\mu(\mathbf{q}) = E_\nu(\mathbf{q}, k_\perp), \qquad \mu \equiv \nu k_\perp. \qquad (3.213)$$
The bulk state surface band index $\mu$ in (3.213) replaces the bulk quantum numbers $\nu k_\perp$. As $k_\perp$ varies continuously, $\mu$ also does. This means that a particular band of the infinite bulk crystal gives rise to a continuum of surface bands. In this way, the infinitely large number of atoms in a primitive unit cell of the crystal with surface manifests itself. We illustrate the projection of bulk bands onto the surface BZ using the (100) surface of a diamond type crystal as an example. The band structure is taken from the empty lattice model introduced in section 2.4. As a first step, the bulk BZ is to be projected onto the plane of the first surface BZ (see Figure 3.33). In doing so, one notes that part of the projected bulk BZ lies in the second surface BZ. This has to be folded back to the first surface BZ, together with its energy values. In this way we obtain the projections of the lowest three empty lattice bands shown in Figure 3.34. In addition to $\mathbf{q}$-vectors for which all energy values are allowed, there are also $\mathbf{q}$-vectors for which certain energy values are forbidden. This peculiarity is found in other, more realistic cases as well; it represents a general feature of the projected bulk band structure. In the forbidden energy regions, states may occur which are localized at the surface.
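The projection procedure can be illustrated with a model that is much simpler than the empty lattice bands of the text. The sketch below assumes a single nearest-neighbor tight binding band $E(\mathbf{k}) = -2t(\cos k_x + \cos k_y + \cos k_z)$ of a simple cubic crystal with unit lattice constant and a (001) surface; this model, the hopping parameter `t`, and all function names are assumptions made for illustration and do not appear in the original. For this geometry the projected bulk BZ coincides with the first surface BZ, so no back-folding is needed.

```python
import math

def bulk_band(kx, ky, kz, t=1.0):
    """Assumed toy bulk band of a simple cubic tight binding crystal."""
    return -2.0 * t * (math.cos(kx) + math.cos(ky) + math.cos(kz))

def projected_range(qx, qy, t=1.0, n_perp=201):
    """Energy interval obtained by scanning k_perp over the bulk BZ at fixed q."""
    energies = [
        bulk_band(qx, qy, -math.pi + 2.0 * math.pi * i / (n_perp - 1), t)
        for i in range(n_perp)
    ]
    return min(energies), max(energies)

def is_allowed(energy, qx, qy, t=1.0):
    """True if the energy lies inside the projected bulk band at the point q."""
    low, high = projected_range(qx, qy, t)
    return low <= energy <= high

for q in [(0.0, 0.0), (math.pi / 2, math.pi / 2), (math.pi, math.pi)]:
    print(q, projected_range(*q))
```

In this toy model the energy $E = 0$ lies inside the projected band at $\mathbf{q} = (\pi/2, \pi/2)$ but is forbidden at $\mathbf{q} = (0, 0)$, which mirrors the statement that forbidden energy regions exist for some $\mathbf{q}$-vectors even where the bulk spectrum as a whole has no gap.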
Figure 3.33: First (100) surface BZ (shaded area) together with the projected first bulk BZ for a crystal with fcc Bravais lattice.
Surface states
Such states are to be expected for the same reason that localized states are observed in the case of point or 0-dimensional perturbations. In contrast to the latter, surfaces constitute 2-dimensional perturbations. The localization occurs, therefore, only in $3 - 2 = 1$ dimension, namely that perpendicular to the surface. If the energy lies in the forbidden region of the projected bulk band structure, the decay of the eigenfunction towards the bulk proceeds exponentially (see Figure 3.32). The states are then called bound surface states and the corresponding energy bands $E_\mu(\mathbf{q})$ are bound surface bands. Besides these, one has surface resonances and antiresonances. The latter two occur at energies in the allowed region of the projected bulk band structure and give rise to resonant or antiresonant surface bands $E_\mu(\mathbf{q})$. They manifest themselves in an increase (resonance) or decrease (antiresonance) of the density of states (DOS). The eigenfunctions at these energies are also localized at the surface, but decay less rapidly towards the bulk (according to a power law) than the exponentially decaying bound surface states. Antiresonances are necessary in order to satisfy Levinson's theorem, which holds for surfaces as well as for point perturbations. If bound surface states exist, the antiresonances must compensate the increase of the total DOS in the previously forbidden energy region.
given wavevector $\mathbf{q}$. It also results in symmetry relations between the values of $E_\mu(\mathbf{q})$ for different $\mathbf{q}$ values, and it determines the spatial symmetries of the eigenfunctions $\psi_{\mu\mathbf{q}}(\mathbf{x})$. The key to such conclusions is, in analogy to the infinite bulk case, the set of irreducible representations of the space group of the given crystal with surface. This is based on the fact that the eigenfunctions for a particular energy eigenvalue form a basis set of an irreducible representation of this group. Such a representation may be characterized by the star $\{\mathbf{q}\}$ of the wavevector $\mathbf{q}$ and the irreducible representations of the small point group of $\mathbf{q}$ with the factor system of equation (A.157). The dimensions of the irreducible representations determine the possible degrees of degeneracy of an energy band $E_\mu(\mathbf{q})$ at the point $\mathbf{q}$. Moreover, at all points of the star $\{\mathbf{q}\}$, $E_\mu(\mathbf{q})$ has the same value. Since the 10 possible point groups of equivalent directions have at most 12 elements (this happens in the case of $C_{6v}$ ($6mm$)), and since the small point groups of the symmetric $\mathbf{q}$-points are in general even smaller than the corresponding point groups of equivalent directions, only irreducible representations with dimensions equal to 1 or 2 will appear. 2D representations are likely to occur in cases where, on the one hand, the small point groups are large enough and, on the other hand, the corresponding space groups contain glide reflections. Then nontrivial factor systems arise for points $\mathbf{q}$ on the boundary of the first BZ, and 1D representations might not be possible at all. This happens for 2 of the 5 space groups of the primitive rectangular Bravais lattice, which will be studied below as an example. In Figure 3.35, the possible types of band structures are shown on certain symmetry lines of the p-rectangular surface BZ. In the case of the space group $p2mg$, which applies for (110) surfaces of diamond type crystals, only 2D representations exist at $M$ and $X$. The two 1D representations on the $Z$ line connecting these points belong to the same energy eigenvalue because of time reversal symmetry. For the space group $p2gg$, one has 2D representations only at $X$ and $X'$. The representations on the connecting line $X - Z - M - Z' - X'$ are 1D, but the corresponding energy eigenvalues are degenerate because of time reversal symmetry. For the three remaining space groups $p2mm$, $p1m1$, and $p1g1$, the representations at all symmetry points are 1D.
Figure 3.35: Symmetry and degeneracy of the surface energy bands for the 5 space groups of the p-rectangular Bravais lattice. (After Terzibaschian and Enderlein, 1986.)
A 3D crystal with surface may be characterized as a crystal with a 2D lattice and primitive unit cells extending infinitely in the direction perpendicular to the surface. The electronic structure of a crystal with a 2D lattice may be calculated by means of any of the methods known from band structure calculations of 3D crystals, with the exception that the dimensionality of the lattice has to be changed from 3 to 2. In the tight binding method, for example, one has to use Bloch sums of atomic orbitals $\phi_\alpha(\mathbf{x})$ over all points $\mathbf{r}$ of the 2D surface lattice, rather than over all points $\mathbf{R}$ of the 3D bulk lattice. With the surface adapted notation $\mathbf{r} + \boldsymbol{\tau}_j + l\mathbf{f}_3$ for a regular crystal site, the orbital $\alpha$ localized at such a site is given by $\phi_\alpha(\mathbf{x} - \mathbf{r} - \boldsymbol{\tau}_j - l\mathbf{f}_3) \equiv \phi_{\alpha j l \mathbf{r}}(\mathbf{x})$. The corresponding Bloch sums $\phi_{\alpha j l \mathbf{q}}(\mathbf{x})$ are defined as

$$\phi_{\alpha j l \mathbf{q}}(\mathbf{x}) = \frac{1}{G} \sum_{\mathbf{r}} e^{i\mathbf{q}\cdot\mathbf{r}}\, \phi_{\alpha j l \mathbf{r}}(\mathbf{x}). \qquad (3.214)$$

The number of different Bloch sums is infinitely large even if a finite number of orbitals is used per atom, because the integer $l$, which counts the primitive crystal slabs, runs from 0 to $\infty$. Consequently the Hamiltonian is given by an infinite matrix in the atomic orbital representation $\phi_{\alpha j l \mathbf{r}}(\mathbf{x})$. Some specific approximation is necessary to transform this to a finite matrix. One possibility is to consider a slab of the crystal, i.e. to cut off the semi-infinite crystal at a particular lattice plane parallel to the surface and discard the remainder. One may say that, in the direction perpendicular to the surface, the true semi-infinite crystal is replaced by a cluster (see Figure 3.36). The latter has two plane surfaces, one of them being real (that of the semi-infinite crystal), and the other one (obtained by cutting the semi-infinite crystal) not. This differs from the cluster method in the case of point defects discussed in section 3.5, where the whole cluster surface is artificial. The slab method may be combined with any band structure calculation method for infinite bulk crystals. Its combination with the tight binding method will be illustrated below by means of an example. We consider the ideal (111) surface of a diamond type crystal which, according to subsection 3.6.2, has a hexagonal Bravais lattice. Instead of the one $s$- and three $p$-orbitals per atom we use the four hybrid orbitals, i.e. we replace $\alpha$ in the Bloch sum (3.214) by $h_t$, $t = 1, 2, 3, 4$. Only nearest neighbor interactions are taken into account. An illustration of this model is given in Figure 3.37. To get the matrix elements of the Hamiltonian $H$ between Bloch sums, we first need these elements between the localized hybrid orbitals $|h_t j l \mathbf{r}\rangle$. The diagonal elements are

$$\langle h_t j l \mathbf{r}|H|h_t j l \mathbf{r}\rangle = \epsilon_h, \qquad (3.215)$$

and the nondiagonal elements between different hybrids at the same atom $j' = j$ are

$$\langle h_t j l \mathbf{r}|H|h_{t'} j l \mathbf{r}\rangle = V_1, \qquad t' \neq t. \qquad (3.216)$$

Figure 3.36: Illustration of the slab method for surface band structure calculations.
Figure 3.37: Defect molecule model of the ideal (111) surface of a crystal with diamond structure.
The two types of matrix elements in (3.215) and (3.216) are the same for atoms at the surface and in the bulk. This is not true for elements between hybrids located at different atoms. Let $j = 1$, $l = 0$ designate the surface atom layer, and consider the elements $\langle h_t 1 0 \mathbf{r}|H|h_t 2 0 \mathbf{r}_t\rangle$ between the hybrid $h_t$ located at the surface atom $1 0 \mathbf{r}$ and the hybrid $|h_t 2 0 \mathbf{r}_t\rangle$ at its nearest neighbor $2 0 \mathbf{r}_t$ below the surface, pointing toward $|h_t 1 0 \mathbf{r}\rangle$. Since no nearest neighbor hybrid exists for the hybrid $|h_1 1 0 \mathbf{r}\rangle$ pointing out of the surface, all nearest neighbor elements involving $|h_1 1 0 \mathbf{r}\rangle$ vanish. The other three elements with $t = 2, 3, 4$ are equal to the parameter $V_2$ introduced in equation (2.292),
# t
(3.219) (3.220)
t = 2,3,4 ,
where e t is given by equation (2.240), with dt being a vector which points from the surface atom 10r toward its nearest neighbor atom 20rt in the direction of the hybrid t. We have
d2 = (lTT),
2
d3 =
(lll), 2
a  
d4 =
a(TT1). 2
(3.221)
The hybrids $|h_t 2 0 \mathbf{r}_t\rangle$ at the nearest neighbor atoms, pointing to the surface atom $1 0 \mathbf{r}$, are coupled to the other hybrids at these atoms, and these hybrids interact with hybrids at more remote atoms. Thus an infinite matrix would occur if we did not restrict consideration to a slab, as we in fact do. Here, we will go one step further and neglect all couplings between the hybrids at the nearest neighbor atoms. Then the 7 Bloch sums $|h_t 1 0 \mathbf{q}\rangle$, $t = 1, 2, 3, 4$, and $|h_t 2 0 \mathbf{q}\rangle$, $t = 2, 3, 4$, are completely decoupled from the rest of the Bloch sums (see Figure 3.37). In this way we arrive at a simplified model of the crystal with surface which effectively reduces it to the first two atomic layers and treats the atoms of the second layer only in an approximate way. This model represents an analog of the defect molecule model in the case of a point perturbation. If the basis functions are arranged in the order $|h_1 1 0 \mathbf{q}\rangle$, $|h_2 1 0 \mathbf{q}\rangle$, $|h_3 1 0 \mathbf{q}\rangle$, $|h_4 1 0 \mathbf{q}\rangle$, $|h_2 2 0 \mathbf{q}\rangle$, $|h_3 2 0 \mathbf{q}\rangle$, $|h_4 2 0 \mathbf{q}\rangle$, then the Hamiltonian matrix is
(3.222)
The eigenvalues of this matrix are plotted in Figure 3.38 for different symmetry lines of the first surface BZ of the hexagonal lattice of Figure 3.31.
Figure 3.38: Band structure of the ideal (111) surface of a diamond crystal within the defect molecule model. The TB parameters are $V_1 = 2.13$ eV, $V_2 = 6.98$ eV. The hybrid energy $\epsilon_h$ is used as the energy origin.
The three lowest and three highest bands are bulk state surface bands. They arise from the three hybrids of the surface atom which bind this atom back to its nearest neighbors in the second layer. More strictly, they correspond to the 3 back bonding and 3 antibonding states. The band in the gap is due to the dangling hybrid of the surface atom. It forms a bound surface band. If we were to consider more than 2 atomic layers, the region where the bulk bands in Figure 3.38 occur would be covered by more bulk state surface bands, while the bound surface band in the gap would hardly change.
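The level structure just described can be reproduced from the matrix elements stated in the text. The sketch below is an illustration under explicit assumptions, not the matrix (3.222) itself: the coupling between the Bloch sums $|h_t 1 0 \mathbf{q}\rangle$ and $|h_t 2 0 \mathbf{q}\rangle$ is taken as $V_2 e^{i\varphi_t}$ with an arbitrary $\mathbf{q}$-dependent phase $\varphi_t$, the parameter values are used with the signs in which they appear in the caption, and the closed-form levels in `analytic_levels` were derived for this sketch by splitting the 7 x 7 problem into a symmetric 3 x 3 block and a doubly degenerate 2 x 2 block. All function names are hypothetical.

```python
import cmath
import math

def defect_molecule_hamiltonian(v1, v2, phases=(0.0, 0.0, 0.0), e_h=0.0):
    """7x7 matrix in the basis h1..h4 (surface atom), h2..h4 (second layer).
    The phase factors exp(i*phase) are an assumed form of the q dependence."""
    h = [[0j] * 7 for _ in range(7)]
    for i in range(7):
        h[i][i] = e_h
    for i in range(4):                      # hybrids on the same surface atom
        for k in range(4):
            if i != k:
                h[i][k] = v1
    for t in range(3):                      # back bonds to the second layer
        a, b = 1 + t, 4 + t
        coupling = v2 * cmath.exp(1j * phases[t])
        h[a][b] = coupling
        h[b][a] = coupling.conjugate()
    return h

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n, det = len(m), 1.0 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            return 0j
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def char_det(h, energy):
    """det(H - E) for the given Hamiltonian matrix."""
    shifted = [[h[i][k] - (energy if i == k else 0.0) for k in range(7)] for i in range(7)]
    return determinant(shifted)

def analytic_levels(v1, v2, e_h=0.0):
    """Closed-form levels derived for this sketch (see lead-in)."""
    s1 = math.sqrt(4.0 * v1 ** 2 + v2 ** 2)
    s2 = math.sqrt(v1 ** 2 + 4.0 * v2 ** 2)
    return sorted([e_h, e_h + v1 + s1, e_h + v1 - s1]
                  + 2 * [e_h + 0.5 * (-v1 + s2), e_h + 0.5 * (-v1 - s2)])

levels = analytic_levels(2.13, 6.98)
h = defect_molecule_hamiltonian(2.13, 6.98, phases=(0.3, -1.1, 2.0))
print(levels)
print([abs(char_det(h, e)) < 1e-6 for e in levels])
```

Under these assumptions one level sits exactly at the hybrid energy, with three levels below and three above it, and the phases drop out of the spectrum because they can be absorbed into the second-layer basis functions; the bands of this simplified model are therefore dispersionless.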
Supercell method
The slab considered in the preceding subsection may be repeated periodically in the direction perpendicular to the surface, simultaneously inserting several layers of vacancies between two neighboring slabs (see Figure 3.39). This structure may be considered to be an infinite repetition of the original crystal with surface, each repetition being approximated by a finite slab of several atomic layers embedded between vacancy layers. This arrangement represents a 3D supercrystal composed of 1D supercells (bear in mind that in the case of point defects in section 3.5, we similarly considered a 3D supercrystal composed of 3D supercells). The band structure of the supercrystal is approximately the same as that of the original crystal with surface, provided the vacancy slabs are thick enough to suppress coupling between neighboring crystal slabs, and the latter slabs are thick enough to simulate a semi-infinite crystal. The bands $E_\mu(\mathbf{q})$ of the crystal with surface are obtained from the bulk bands $E_\nu(\mathbf{k})$ of the supercrystal by plotting the latter on the surface BZ. In this, $k_\perp$ may be chosen arbitrarily because any dispersion of $E_\nu(\mathbf{k}) = E_\nu(\mathbf{q}, k_\perp)$ with respect to $k_\perp$ would indicate a coupling between the slabs, which has been excluded. The band structure of the supercrystal may be obtained by means of any 3D band structure calculation method without modification. This makes the supercell method particularly appealing. Combined with the pseudopotential method, as well as the local density functional or quasi-particle approximations, it represents the most important calculational method for electronic and atomic structure determinations of surfaces.
Figure 3.39: Supercrystal obtained by a periodic repetition of supercells. The latter are composed of crystal slabs embedded between vacancy layers.
Defect model and Green's function method
A crystal with surface may also be viewed as a 3D crystal with a 2D perturbation. The significance of this characterization is the following: Consider first an ideal infinite 3D crystal, then remove some number of neighboring atomic layers parallel to the surface under consideration or, equivalently, create the same number of vacancy layers (see Figure 3.40). What remains are two identical semi-infinite crystals which are only displaced with respect to each other. They do not interact provided the number of vacancy layers is large enough, which we will assume. The electronic structures of the two semi-infinite crystals coincide, and are identical with that of the considered crystal with surface. As in the case of a single vacancy, the states bound at the defect may be obtained by means of the Green's function $G^0(E)$ of the unperturbed 3D crystal, more strictly, by the vanishing of the determinant of $[G^0(E)V' - 1]$ (see equation (3.134)). Here, the perturbation potential $V'(\mathbf{x})$ represents the difference of the potential energy of an electron in the perturbed and unperturbed crystals. It is the negative of the sum of the potentials of the removed atoms or the sum of all vacancy potentials. It has the 2D lattice symmetry of the surface so that

$$V'(\mathbf{x}) = V'(\mathbf{x} + \mathbf{r}). \qquad (3.223)$$

Figure 3.40: Illustration of the defect method for calculating surface energy band structure.
If only nearest neighbor interactions are taken into account, a single layer of vacancies is enough to decouple the two semi-infinite crystals. For the analysis of the bound state condition (3.74), one has to use a particular basis set, just as in the case of a point perturbation. Wannier functions are again a possible choice, but here they involve localization only in one direction, namely that perpendicular to the surface. We denote these functions by $|\nu \mathbf{q} R_\perp\rangle$, where $R_\perp$ is the component of a 3D lattice vector $\mathbf{R}$ perpendicular to the surface, such that
(3.224)
The matrix representation of the Green's function $G^0(E)$ with respect to this basis set is
For the perturbation potential $V'(\mathbf{x})$ we take that of a single layer of vacancies located at $R_\perp = 0$. Then the matrix representation of $V'(\mathbf{x})$ is
where $V'_{\nu\nu'}(\mathbf{q}) = \langle \nu \mathbf{q} 0|V'|\nu' \mathbf{q} 0\rangle$ has been used. The relevant matrix elements $\langle \nu \mathbf{q} 0|G^0 V' - 1|\nu' \mathbf{q} 0\rangle$ of $G^0 V' - 1$ take the form
with
(3.228)
According to equation (3.134), the determinant of the matrix (3.227) must vanish for energy eigenvalues $E$ in the gap of the ideal crystal, i.e.

$$\mathrm{Det}\left[G^0_\nu(E, \mathbf{q})\,V'_{\nu\nu'}(\mathbf{q}) - \delta_{\nu\nu'}\right] = 0. \qquad (3.229)$$

If one compares this equation with the corresponding relation (3.94) for deep centers, it will be noted that the $\mathbf{q}$-dependence which occurs here was absent there. In the case of surfaces this dependence causes the eigenvalues in the gap to form deep bands rather than deep levels (which was the case for point perturbations).
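How a bound state condition of the type $\mathrm{Det}[G^0 V' - 1] = 0$ produces a band rather than a level can be seen in a toy model. The sketch below is not the vacancy layer calculation of the text: it assumes a single-band simple cubic tight binding crystal with hopping `t` and unit lattice constant, and a perturbation that merely raises the site energy by `u > 0` on one lattice plane. For fixed $\mathbf{q}$ the problem then reduces to a linear chain shifted by $-2t(\cos q_x + \cos q_y)$, and the condition becomes $1 - u\,G_{00}(z) = 0$. All names and parameters are hypothetical.

```python
import math

def chain_green_00(z, t=1.0, n_k=2000):
    """Site-diagonal Green's function of an infinite chain with band -2t*cos(k),
    for real z outside the band, approximated by a uniform k-sum."""
    total = 0.0
    for m in range(n_k):
        k = 2.0 * math.pi * (m + 0.5) / n_k
        total += 1.0 / (z + 2.0 * t * math.cos(k))
    return total / n_k

def bound_offset(u, t=1.0):
    """Solve 1 - u*G00(z) = 0 above the chain band by bisection (u > 0, not too small)."""
    if u <= 0.0:
        raise ValueError("this sketch only treats a repulsive plane, u > 0")
    low, high = 2.0 * t + 1e-3, 2.0 * t + u + 1.0
    for _ in range(60):
        mid = 0.5 * (low + high)
        if 1.0 - u * chain_green_00(mid, t) < 0.0:
            low = mid       # u*G00 still exceeds 1: root lies at higher energy
        else:
            high = mid
    return 0.5 * (low + high)

def bound_surface_band(qx, qy, u, t=1.0):
    """Energy of the bound state as a function of the 2D wavevector q."""
    return -2.0 * t * (math.cos(qx) + math.cos(qy)) + bound_offset(u, t)

print(bound_offset(1.0), math.sqrt(5.0))
print(bound_surface_band(0.0, 0.0, 1.0), bound_surface_band(math.pi, math.pi, 1.0))
```

In this model the numerical root agrees with the known chain result $z = \sqrt{u^2 + 4t^2}$, and the bound state energy follows the dispersion of the in-plane motion, so that it always lies above the projected bulk band at the same $\mathbf{q}$.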
Transfer matrix method
Finally, still another method for surface band structure calculations should be mentioned. It is based on the transfer matrix concept of quantum mechanics. The transfer matrix $M(E)$ is formed from solutions of the Schrödinger equation over one unit cell of the bulk crystal for particular boundary conditions in the direction perpendicular to the surface. At first, the energy $E$ is arbitrary. Transferring the wave function from the surface unit cell to the $n$-th unit cell below the surface can be done by applying the $n$-th power $M^n(E)$ of $M(E)$. Bound surface states are obtained for those energy values $E$ for which $M^n(E)$ decays exponentially with $n$. The practical use of this
3.6.4
In this subsection we will deal with the atomic and electronic structures of some important semiconductor surfaces, including the various reconstruction states of the Si (111) surface, the Si (100) surface, the (110) surface of GaAs and other III-V compound semiconductors, as well as the (111) and (100) surfaces of GaAs. It is advisable to treat the atomic structure together with the electronic structure because of the close relation between the two kinds of structures, as was pointed out earlier. We begin with a short introduction to the experimental methods of surface structure analysis, again referring to both the atomic and electronic aspects.
Experimental methods for surface structure analysis
Experimental methods for determining the atomic structure of bulk crystals are all based on the interaction of waves with the atomic cores and valence electrons of the crystal. If the wavelength is of the order of the distance between the atoms, i.e. of the order of magnitude of 1 Å, the crystal constitutes a 3D diffraction lattice and diffraction maxima will occur in prescribed directions in space. The crystal structure may be determined from the positions and intensities of these maxima. The various experimental methods differ primarily in the nature of the waves employed. X-rays are by far the most important for determining the structure of bulk crystals. A wavelength in the region of 1 Å corresponds to a photon energy in the range of 10 keV. Electron and neutron waves are also diffracted by bulk crystals. Electron energies in the region of 100 eV and neutron energies of 0.1 eV are required for wavelengths in the Å region. Since they are neutral particles, X-ray photons and neutrons interact only relatively weakly with the crystal. They can pass through crystals of macroscopic thickness and be backscattered from macroscopic depths within them. X-rays and neutrons thus yield information on all atomic layers of a crystal, including those at the surface.
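The orders of magnitude quoted above follow from $\lambda = hc/E$ for photons and from the nonrelativistic de Broglie relation $\lambda = h/\sqrt{2mE}$ for electrons and neutrons. A short sketch of this arithmetic, with hypothetical function names and CODATA values of the constants, is:

```python
import math

H = 6.62607015e-34         # Planck constant, J s
C = 299792458.0            # speed of light, m/s
EV = 1.602176634e-19       # one electron volt in joule
M_E = 9.1093837015e-31     # electron mass, kg
M_N = 1.67492749804e-27    # neutron mass, kg

def photon_wavelength_angstrom(energy_ev):
    """Wavelength of a photon of the given energy, lambda = h*c/E."""
    return H * C / (energy_ev * EV) * 1e10

def matter_wavelength_angstrom(energy_ev, mass):
    """Nonrelativistic de Broglie wavelength, lambda = h/sqrt(2*m*E)."""
    return H / math.sqrt(2.0 * mass * energy_ev * EV) * 1e10

print(photon_wavelength_angstrom(1.0e4))        # X-ray photon of 10 keV
print(matter_wavelength_angstrom(100.0, M_E))   # electron of 100 eV
print(matter_wavelength_angstrom(0.1, M_N))     # neutron of 0.1 eV
```

All three wavelengths come out close to 1 Å, in line with the energies stated in the text.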
Since the number of surface layers is extremely small in comparison to the total number of layers, the diffraction patterns are dominated by the bulk of the crystals. The interaction of electrons with the atomic cores and the valence electrons of a crystal is significantly stronger than that of photons and neutrons. Electrons having energy less than 100 k e V can not pass through a crystal of
372 Chapter 3. Electronic structure of semiconductor crystals with perturbations macroscopic thickness. Experimentally, one therefore has only the backscattering available and then only elastically backscattered electrons can be employed in forming diffraction patterns. These originate at a depth which, on average, is equal to the inelastic mean free path of the electrons. This varies relatively independently of the particular crystal under consideration from 4 A to 10 for energies of 20 eV to 300 eV. Electron diffraction in this energy region is thus hardly suitable for examining bulk crystals but it can be readily employed in studying crystal surfaces. The diffraction of low energy electrons in the region of 100 eV is in fact the most intensively used method for surface structure determination. It is referred to as LEED (Low Energy Electron Diffraction). The principle of LEED may be explained as follows. We consider an incident electron wave with the wavevector ki. The interaction with the crystal generates scattered waves with wavevectors k,. The scattering potential has the translation symmetry of the surface lattice, thus its Fourier components differ from zero only for vectors g of the reciprocal surface lattice. This means that only those scattered wavevectors k, can occur whose components k,ll parallel to the surface differ from the parallel component kill of ki by a reciprocal surface lattice vector g, hence (3.230) There is no relation between the components of k, and ki perpendicular to the surface because there is no translation symmetry of the scattering potential in this direction. In writing down equation (3.230) we have implicitly assumed that only one scattering event takes place. This relation also applies, however, to multiple scattering processes. This is important because electrons scattered back from the surface have, as a rule, experienced many scattering events, in contrast to Xray photons which typically have been scattered only once. 
This difference is due to the above-mentioned fact that electrons interact with the crystal much more strongly than do X-ray photons. The electrons measured in LEED are elastically scattered. One therefore has

$$|\mathbf{k}_s| = |\mathbf{k}_i|. \tag{3.231}$$
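The two LEED conditions, conservation of the parallel wavevector up to a reciprocal surface lattice vector $\mathbf{g}$ (3.230) and elastic scattering (3.231), can also be solved numerically. The following Python sketch is an illustration only and not part of the original treatment. It assumes a free-electron relation between energy and wavenumber outside the crystal, and the helper names `reciprocal_2d` and `leed_beams` as well as the parameter `n_max` are hypothetical. For each $\mathbf{g}$ it returns the polar angle of the backscattered beam; vectors $\mathbf{g}$ with $|\mathbf{k}_{i\parallel} + \mathbf{g}| > |\mathbf{k}_i|$ yield no propagating beam and are skipped.

```python
import math

# Free-electron value of hbar^2 / (2 m) in eV * Angstrom^2 (assumption: vacuum dispersion)
HBAR2_OVER_2M = 3.80998


def reciprocal_2d(a1, a2):
    """Return the 2D reciprocal basis (b1, b2) with a_i . b_j = 2*pi*delta_ij."""
    det = a1[0] * a2[1] - a1[1] * a2[0]
    b1 = (2 * math.pi * a2[1] / det, -2 * math.pi * a2[0] / det)
    b2 = (-2 * math.pi * a1[1] / det, 2 * math.pi * a1[0] / det)
    return b1, b2


def leed_beams(energy_ev, a1, a2, k_i_par=(0.0, 0.0), n_max=3):
    """Polar angles (degrees from the surface normal) of the diffracted beams.

    Keys are integer pairs (h, l) labelling g = h*b1 + l*b2.
    """
    k = math.sqrt(energy_ev / HBAR2_OVER_2M)  # |k_i| in 1/Angstrom
    b1, b2 = reciprocal_2d(a1, a2)
    beams = {}
    for h in range(-n_max, n_max + 1):
        for l in range(-n_max, n_max + 1):
            # Parallel component of k_s: k_i_par + g
            kx = k_i_par[0] + h * b1[0] + l * b2[0]
            ky = k_i_par[1] + h * b1[1] + l * b2[1]
            k_par_sq = kx * kx + ky * ky
            if k_par_sq > k * k:
                continue  # g lies outside the sphere: no propagating beam
            # Elastic scattering fixes the perpendicular component: |k_s| = |k_i|
            k_perp = math.sqrt(k * k - k_par_sq)
            beams[(h, l)] = math.degrees(math.atan2(math.sqrt(k_par_sq), k_perp))
    return beams


# Illustrative hexagonal surface lattice; a = 3.84 Angstrom is an assumed value
a = 3.84
a1 = (a, 0.0)
a2 = (-a / 2, a * math.sqrt(3) / 2)
beams_100ev = leed_beams(100.0, a1, a2)
```

At normal incidence the (0, 0) beam leaves along the surface normal, and the six shortest reciprocal lattice vectors of the hexagonal lattice give beams at one common polar angle, so the symmetry of the reciprocal surface lattice shows up directly in the beam pattern.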
A solution of the two equations (3.230) and (3.231) always exists for given vectors $\mathbf{k}_i$ and $\mathbf{g}$, as long as $|\mathbf{k}_{i\parallel} + \mathbf{g}| \le |\mathbf{k}_i|$ (this is in remarkable contrast to coherent scattering of electrons from a 3D bulk crystal, which can only occur if $\mathbf{k}_s$ lies on a Bragg reflection plane). The solution of equations (3.230) and (3.231) can be readily carried out using the construction shown in Figure 3.41. The points at which the vertical lines passing through the reciprocal lattice points $\mathbf{g}$ intersect the
sphere $|\mathbf{k}_s| = |\mathbf{k}_i|$ determine the directions in which diffraction maxima occur. There is exactly one maximum for each reciprocal lattice point $\mathbf{g}$. The reciprocal surface lattice can thus be read immediately from the distribution of the diffraction maxima on the registration screen. The direct surface lattice is the reciprocal of the reciprocal surface lattice. Some typical LEED images are shown in Figure 3.42. The bright points correspond to the reciprocal lattice of the ideal surface, and the less bright points to the finer reciprocal lattice of the reconstructed surface. In this way it is relatively easy to determine the surface lattice by means of LEED. Obtaining the actual positions of the atoms is more difficult. One needs additional experimental and theoretical information about the intensity of the diffraction maxima as a function of the energy of the incident electrons (dynamical LEED).

Besides LEED, there are other methods for surface structure analysis which, although they are not a substitute for LEED, can supplement it. These methods include diffraction of energetic electrons in the region of some 10 keV, known as Reflection High Energy Electron Diffraction (RHEED), diffraction of X-rays incident almost parallel to the surface, and diffraction of slow helium atoms (of ≈ 100 meV). Scattering of energetic ions (≈ 1 MeV) is used in techniques like Rutherford backscattering (RBS) and ion channeling. Imaging procedures of significance are transmission electron microscopy (TEM), scanning tunneling microscopy (STM) and atomic force microscopy (AFM). The latter two methods have become particularly important.

As with atomic structure determination, a variety of methods exists to study the electronic structure of surfaces, in particular the bound surface states in the energy gap of the bulk crystal. The most powerful and universal method is photoemission spectroscopy (PES).
This method relies on the external photoeffect in which an electron is emitted from the crystal by
Figure 3.42: LEED pictures of six differently prepared GaAs (100) surfaces. (After Drathen, Ranke and Jacobi, 1978.)
absorbing a photon of sufficiently high energy. The emitted photoelectrons are spectrally decomposed with respect to their kinetic energies. The energy spectrum of photoelectrons thus obtained maps the density of states of occupied electron levels of the crystal. To enhance surface states and discriminate against bulk states, photoelectrons with kinetic energies around 50 eV are used, whose inelastic mean free path is only about 5 Å and which can therefore only come from this depth below the surface. These electron energies correspond to photon energies which are not substantially larger, i.e. in the far ultraviolet region. The term UPS (Ultraviolet Photoemission Spectroscopy) is used in this context. The only practically suitable radiation source in this energy region is the electron synchrotron. By measuring angle-resolved photoemission spectra (ARUPS), the wavevector dispersion of the bound surface energy bands can be determined. To study unoccupied surface states one may use inverse PES, in which electrons captured by such states emit photons.

Figure 3.43: Geometry of the ideal Si (111) surface (left) and of the Si (111) 2 × 1 surface according to the buckling model (right).

Besides photoemission, a variety of other techniques exists which can provide data on surface states. In principle, any experimental technique which probes the electronic structure of bulk crystals can be employed for surface electronic structure investigations, provided it can be made surface-sensitive. This applies to optical reflectivity, electrical transport, photoconductivity, and capacitance measurements, as well as electron energy loss spectroscopy (EELS). Experimental techniques like field effect measurements fulfill this requirement from the very beginning. By controlling the energy of tunneling electrons in scanning tunneling microscopy, surface states can be resolved spatially and energetically (scanning tunneling spectroscopy).

Experimental techniques which primarily measure the electronic structure can also provide data on the atomic structure. The solid state shifts of core levels (see section 2.1) are an example. These shifts differ for atoms in the bulk and at the surface because of the altered atomic structure at the surface. The difference (typically some tenths of an eV) can be measured by means of PES and UPS. On the other hand, the shifts can be calculated on the basis of a particular surface structure model. By comparing theory and experiment one can evaluate the feasibility of various models of surface structure.

The calculation of the total energy is a purely theoretical test of the validity of a particular surface structure model, and it may be used to determine the parameters which can be varied in such a model. If the model has optimized parameters and results in a lower total energy than other models, it may be given preference over them.
[Figure: plot with horizontal axis "2D wave vector"]
Si surfaces
(111) surface
The geometry of the ideal (111) surface of diamond type crystals is illustrated in Figure 3.43 (left). The surface lattice is hexagonal, and the two primitive lattice vectors are $\mathbf{f}_1$, $\mathbf{f}_2$ of Table 3.8. There is one surface atom per primitive unit cell, and one dangling bond per surface atom. The (111) surface is the cleavage plane of diamond type crystals. By cleaving a Si crystal in UHV at room temperature, one obtains a 2 × 1 reconstructed (111) surface. After annealing it at 500 °C, a 7 × 7 reconstructed state evolves, which remains stable at room temperature. The occurrence of a 2 × 1 reconstruction immediately after cleavage is to be expected if one examines the band structure of the ideal (111) surface (see Figure 3.44) and considers, in particular, the electron occupancy of the bound surface band in the fundamental gap. This band arises from dangling sp³ hybrids of surface atoms, and can host 2 electrons per surface unit cell. Since 3 of the 4 valence electrons of a surface atom are in bonds with second-layer atoms, only 1 electron per surface unit cell is left for the bound surface band. Thus this band is only half-filled, and the ideal Si (111) surface is metallic.
This state is unlikely to be stable, however, i.e. surface reconstruction is likely to take place. Below we discuss a particularly simple reconstruction model, the so-called buckling model (see Figure 3.43, right-hand side), which in the early days of clean semiconductor surface physics was believed to be correct. Later, it was realized that buckling is energetically advantageous only for III-V compound semiconductor surfaces, while it is not advantageous for group-IV semiconductor surfaces, including the (111) surface of Si.

To introduce the buckling model we consider doubling of the primitive unit cell in the direction of the primitive lattice vector $\mathbf{f}_1$, which according to Table 3.8 points in the direction $[1\bar{1}0]$. The hexagonal lattice with doubled primitive unit cell forms a rectangular lattice with primitive lattice vectors $2\mathbf{f}_1 + \mathbf{f}_2$ and $\mathbf{f}_2$. The short side of the rectangular primitive unit cell, shown in Figure 3.43 (right), is parallel to $[01\bar{1}]$, and its long side parallel to $[2\bar{1}\bar{1}]$. The corresponding first surface BZ is also a rectangle (see Figure 3.31) with its long side, i.e. its Γ-X direction, parallel to $[01\bar{1}]$, and its short side, i.e. its X-M direction, parallel to $[2\bar{1}\bar{1}]$. The rectangular BZ is half as big as the original hexagonal BZ, and each band of the latter gives rise to two bands in the former, a direct and a backfolded one. There is no gap between these two bands because they arise from the same band of the larger BZ. The surface is still metallic. A gap arises if the so far formal 2 × 1 reconstruction is made real. This can be done by a buckling of the surface, i.e. by alternately raising and lowering atoms in rows parallel to $\mathbf{f}_1$ above the surface and below it (see Figure 3.43, right).
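The backfolding argument can be illustrated with a deliberately simple one-dimensional toy model. It is not a calculation for the Si (111) surface: it assumes a chain with one orbital per site, nearest-neighbor hopping `t`, and on-site energies alternating between `-delta` and `+delta`, where `delta` stands in for the energy difference between dangling hybrids at raised and lowered atoms. All names and parameter values are hypothetical. For `delta = 0` the doubled cell is purely formal and the direct and backfolded bands touch at the boundary of the halved zone; for `delta > 0` a gap of `2 * delta` opens there.

```python
import math


def folded_bands(k, t=1.0, delta=0.0, a=1.0):
    """Lower and upper band energy at wavevector k of the halved zone.

    Eigenvalues of the 2x2 Bloch matrix [[-delta, h], [h, +delta]]
    with the off-diagonal element h = -2 t cos(k a).
    """
    h = -2.0 * t * math.cos(k * a)
    e = math.sqrt(delta * delta + h * h)
    return -e, e


def band_gap(t=1.0, delta=0.0, a=1.0, n=201):
    """Gap between the two bands, sampled over the halved zone [-pi/2a, pi/2a]."""
    ks = [-math.pi / (2 * a) + i * math.pi / (a * (n - 1)) for i in range(n)]
    lower_max = max(folded_bands(k, t, delta, a)[0] for k in ks)
    upper_min = min(folded_bands(k, t, delta, a)[1] for k in ks)
    return upper_min - lower_max


gap_formal = band_gap(delta=0.0)    # formal doubling only: bands touch
gap_buckled = band_gap(delta=0.3)   # inequivalent sites: gap of 2 * delta
```

With one electron per site, i.e. two electrons per doubled cell, the lower band is completely filled and the upper band is empty as soon as the gap opens, which is the situation described for the buckled surface.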
In this process, the three backbonding hybrids of a raised atom become more p-like and the dangling hybrid at this atom more s-like, simultaneously lowering its energy, while the three backbonding hybrids at a lowered atom become more sp²-like and the dangling hybrid at this atom more p-like, simultaneously raising its energy. The two bound surface bands derived from these s- and p-like dangling hybrids are just the bands below and above the gap discussed before. The lower s-like band can host all electrons of the dangling hybrids, while no electrons remain for the population of the upper p-like band. If the total energy of this state were in fact lower than that of the ideal surface, buckling would take place spontaneously, i.e. the translation symmetry of the surface would spontaneously be lowered. A similar spontaneous symmetry breaking, the Jahn-Teller effect, was discussed in the context of point perturbations in section 3.5. There, the point symmetry was broken, while no translation symmetry was involved. If the translation symmetry is broken, as in the case of surface reconstruction, one speaks of a Peierls instability or a Peierls transition. However, as has been indicated at the outset, buckling turns out to be energetically unfavorable in the case of Si (111) surfaces. Populating the lower s-like band with two electrons per primitive surface unit cell means transferring charge from the atoms lowered below the surface to the atoms
Figure 3.45: π-bonded chain model of the Si (111) 2 × 1 surface (After Pandey, 1982). Part (a) shows the unreconstructed surface in top and side views. The top view in the second row has been rotated with respect to the top view in the first row in order to allow for the side view below. Part (b) shows the same views of the surface as in part (a), but after reconstruction has taken place.
raised above. This implies the creation of an electric dipole, which is too costly in energy to actually take place. Using the terminology of section 2.2 we may say that correlation effects of the electron-electron interaction or, more strictly speaking, the configuration dependence of one-electron states, prevent the buckled Si (111) surface from being lower in energy than the ideal one.

The reconstruction model which actually applies to the Si (111) surface is the so-called π-bonded chain model, illustrated in Figure 3.45. In this model, second-layer atoms in rows parallel to $\mathbf{f}_1 + \mathbf{f}_2$, i.e. along the $[10\bar{1}]$ direction in Figure 3.45a (including atom number 2), are raised into the first layer as shown in Figure 3.45b, breaking their bonds with atoms in the third layer (for example, the 2-5 bond). The dangling bond of the new surface atom (say atom 2) is used to establish bonds with atoms of the first layer (the 2-1 bond in this case). These can only be π-bonds (indicated by double lines in Figure 3.45b) because the dangling bonds are perpendicular to the surface. In this way π-bonded chains occur along the $[10\bar{1}]$ direction (Pandey, 1982). The dangling bonds left at the third-layer atoms (for example, atom 5) are saturated by hybrids of the first-layer atoms which have been lowered down to the second layer (for example, atom 3). The surface is in fact 2 × 1 reconstructed. This may be seen by taking the primitive lattice vectors of the ideal surface to be $\mathbf{f}_1 + \mathbf{f}_2$ and $\mathbf{f}_2$. Doubling $\mathbf{f}_2$ yields the rectangular lattice indicated in Figure 3.45b by dashed lines. Its primitive lattice vectors are $\mathbf{f}_1 + \mathbf{f}_2$ and $-2\mathbf{f}_2 + (\mathbf{f}_1 + \mathbf{f}_2) = \mathbf{f}_1 - \mathbf{f}_2$, so that the short side of the rectangle is parallel to the chain direction $[10\bar{1}]$, and the long side perpendicular to it (parallel to $[1\bar{2}1]$).

A peculiarity of the π-bonded chain model is that it has a different bonding topology in comparison with the ideal (111) surface and also with respect to the buckling model.
While the latter exhibit rings with 6 mutually bonded atoms (see Figure 3.45a), the former shows alternating rings with 5 and 7 bonded atoms (Figure 3.45b). This is due to the fact that bonds existing at the ideal and buckled surfaces are broken and new bonds are established in the π-bonded chain model. The total energy of this model is clearly below that of the ideal surface (by about 0.5 eV per surface atom). Thus it represents a good candidate for the reconstruction of the Si (111) surface. Further evidence is provided by ARUPS and optical measurements. Figure 3.46 shows the wavevector dispersion of the two bound surface bands as obtained from ARUPS measurements, together with the calculated dispersion of these bands. The agreement is quite satisfactory. The strong dispersion of the bound surface band on the Γ-X line and the weak dispersion on the X-M line of the rectangular surface BZ are easily understandable: the long Γ-X side of the rectangular unit cell in q-space corresponds to the short side of the rectangular unit cell in coordinate space, which is also the direction of the π-bonded chains. One expects strong dispersion along the chains and weak dispersion in the perpendicular direction, which is exactly what is seen in Figure 3.46.
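The lattice-vector bookkeeping behind the rectangular 2 × 1 cells can be checked with integer arithmetic. The sketch below is an illustration under stated assumptions: Table 3.8 is not reproduced here, so the primitive vectors are represented only by their direction indices, $[1\bar{1}0]$ for `f1` and $[01\bar{1}]$ for `f2`, in units in which both have equal length. The helper names are hypothetical.

```python
def add(u, v):
    return tuple(x + y for x, y in zip(u, v))


def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


# Assumed direction indices of the primitive surface lattice vectors
f1 = (1, -1, 0)
f2 = (0, 1, -1)
normal = (1, 1, 1)  # surface normal of the (111) plane

# Rectangular cell of the chain model: f1 + f2 and f1 - f2
chain = add(f1, f2)       # (1, 0, -1), the chain direction
long_side = sub(f1, f2)   # (1, -2, 1)

# Rectangular cell of the buckling model: 2 f1 + f2 and f2
buckling_long = add(add(f1, f1), f2)  # (2, -1, -1)

# Squared cell areas; a ratio of 4 means the area itself is doubled
area_sq_ideal = dot(cross(f1, f2), cross(f1, f2))
area_sq_chain = dot(cross(chain, long_side), cross(chain, long_side))
```

Both rectangular cells have orthogonal edges lying in the (111) plane, and each contains twice the area of the primitive hexagonal cell, as required for a 2 × 1 reconstruction.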
Figure 3.46: Dispersion of the bonding (B) and antibonding (A) bound surface state bands along the Γ-X-M line for the Si (111) 2 × 1 surface. Curves are calculated within the π-bonding chain model, points are obtained by means of ARUPS measurements. (After Martensson, Cri
[Figure 3.46 plot; horizontal axis: 2D wave vector]
Figure 3.47: Differential reflectivity spectrum of the Si (111) 2 × 1 surface. (After Chiarotti, Nannarone, Pastore and Chiaradia, 1971.)
[Figure 3.47 plot; horizontal axis: Energy (eV)]