THE
PRINCIPLES
OF
QUANTUM MECHANICS
BY
P. A. M. DIRAC
LUCASIAN PROFESSOR OF MATHEMATICS
IN THE UNIVERSITY OF CAMBRIDGE
THIRD EDITION
OXFORD
AT THE CLARENDON PRESS
Oxford University Press, Amen House, London E.C.4
GLASGOW NEW YORK TORONTO MELBOURNE WELLINGTON
BOMBAY CALCUTTA MADRAS CAPE TOWN
Second edition 1935
Reprinted photographically in Great Britain
at the University Press, Oxford, 1948, 1949
from sheets of the third edition
PREFACE TO THIRD EDITION
THE book has again been mostly rewritten to bring in various
improvements. The chief of these is the use of the notation of bra
and ket vectors, which I have developed since 1939. This notation
allows a more direct connexion to be made between the formalism
in terms of the abstract quantities corresponding to states and
observables and the formalism in terms of representatives; in fact
the two formalisms become welded into a single comprehensive
scheme. With the help of this notation several of the deductions in
the book take a simpler and neater form.
Other substantial alterations include:
(i) A new presentation of the theory of systems with similar
particles, based on Fock's treatment of the theory of radiation
adapted to the present notation. This treatment is simpler and more
powerful than the one given in earlier editions of the book.
(ii) A further development of quantum electrodynamics, including
the theory of the Wentzel field. The theory of the electron in interaction
with the electromagnetic field is carried as far as it can be at
the present time without getting on to speculative ground.
P. A. M. D.
ST. JOHN'S COLLEGE, CAMBRIDGE
21 April 1947
FROM THE
PREFACE TO THE SECOND EDITION
THE book has been mostly rewritten. I have tried by carefully overhauling
the method of presentation to give the development of the
theory in a rather less abstract form, without making any sacrifices
in exactness of expression or in the logical character of the development.
This should make the work suitable for a wider circle of
readers, although the reader who likes abstractness for its own sake
may possibly prefer the style of the first edition.
The main change has been brought about by the use of the word
'state' in a three-dimensional non-relativistic sense. It would seem
at first sight a pity to build up the theory largely on the basis of
non-relativistic concepts. The use of the non-relativistic meaning of
'state', however, contributes so essentially to the possibilities of
clear exposition as to lead one to suspect that the fundamental ideas
of the present quantum mechanics are in need of serious alteration at
just this point, and that an improved theory would agree more closely
with the development here given than with a development which
aims at preserving the relativistic meaning of 'state' throughout.
P. A. M. D.
THE INSTITUTE FOR ADVANCED STUDY
PRINCETON
27 November 1934
FROM THE
PREFACE TO THE FIRST EDITION
THE methods of progress in theoretical physics have undergone a
vast change during the present century. The classical tradition
has been to consider the world to be an association of observable
objects (particles, fluids, fields, etc.) moving about according to
definite laws of force, so that one could form a mental picture in
space and time of the whole scheme. This led to a physics whose aim
was to make assumptions about the mechanism and forces connecting
these observable objects, to account for their behaviour in the
simplest possible way. It has become increasingly evident in recent
times, however, that nature works on a different plan. Her fundamental
laws do not govern the world as it appears in our mental
picture in any very direct way, but instead they control a substratum
of which we cannot form a mental picture without introducing
irrelevancies. The formulation of these laws requires the use
of the mathematics of transformations. The important things in
the world appear as the invariants (or more generally the nearly
invariants, or quantities with simple transformation properties)
of these transformations. The things we are immediately aware of
are the relations of these nearly invariants to a certain frame of
reference, usually one chosen so as to introduce special simplifying
features which are unimportant from the point of view of general
theory.
The growth of the use of transformation theory, as applied first to
relativity and later to the quantum theory, is the essence of the new
method in theoretical physics. Further progress lies in the direction
of making our equations invariant under wider and still wider transformations.
This state of affairs is very satisfactory from a philosophical
point of view, as implying an increasing recognition of the
part played by the observer in himself introducing the regularities
that appear in his observations, and a lack of arbitrariness in the ways
of nature, but it makes things less easy for the learner of physics.
The new theories, if one looks apart from their mathematical setting,
are built up from physical concepts which cannot be explained in
terms of things previously known to the student, which cannot even
be explained adequately in words at all. Like the fundamental concepts
(e.g. proximity, identity) which every one must learn on his
arrival into the world, the newer concepts of physics can be mastered
only by long familiarity with their properties and uses.
From the mathematical side the approach to the new theories
presents no difficulties, as the mathematics required (at any rate that
which is required for the development of physics up to the present)
is not essentially different from what has been current for a considerable
time. Mathematics is the tool specially suited for dealing with
abstract concepts of any kind and there is no limit to its power in this
field. For this reason a book on the new physics, if not purely descriptive
of experimental work, must be essentially mathematical. All the
same the mathematics is only a tool and one should learn to hold the
physical ideas in one's mind without reference to the mathematical
form. In this book I have tried to keep the physics to the forefront,
by beginning with an entirely physical chapter and in the later work
examining the physical meaning underlying the formalism wherever
possible. The amount of theoretical ground one has to cover before
being able to solve problems of real practical value is rather large, but
this circumstance is an inevitable consequence of the fundamental
part played by transformation theory and is likely to become more
pronounced in the theoretical physics of the future.
With regard to the mathematical form in which the theory can be
presented, an author must decide at the outset between two methods.
There is the symbolic method, which deals directly in an abstract way
with the quantities of fundamental importance (the invariants, etc.,
of the transformations) and there is the method of coordinates or
representations, which deals with sets of numbers corresponding to
these quantities. The second of these has usually been used for the
presentation of quantum mechanics (in fact it has been used practically
exclusively with the exception of Weyl's book Gruppentheorie
und Quantenmechanik). It is known under one or other of the two
names 'Wave Mechanics' and 'Matrix Mechanics' according to which
physical things receive emphasis in the treatment, the states of a
system or its dynamical variables. It has the advantage that the kind
of mathematics required is more familiar to the average student, and
also it is the historical method.
The symbolic method, however, seems to go more deeply into the
nature of things. It enables one to express the physical laws in a neat
and concise way, and will probably be increasingly used in the future
as it becomes better understood and its own special mathematics gets
developed. For this reason I have chosen the symbolic method,
introducing the representatives later merely as an aid to practical
calculation. This has necessitated a complete break from the historical
line of development, but this break is an advantage through
enabling the approach to the new ideas to be made as direct as
possible.
P. A. M. D.
ST. JOHN'S COLLEGE, CAMBRIDGE
29 May 1930
CONTENTS
I. The Principle of Superposition .......... 1
1. The Need for a Quantum Theory .......... 1
2. The Polarization of Photons .......... 4
3. Interference of Photons .......... 7
4. Superposition and Indeterminacy .......... 10
5. Mathematical Formulation of the Principle .......... 14
6. Bra and Ket Vectors .......... 18
II. Dynamical Variables and Observables .......... 23
7. Linear Operators .......... 23
8. Conjugate Relations .......... 26
9. Eigenvalues and Eigenvectors .......... 29
10. Observables .......... 34
11. Functions of Observables .......... 41
12. The General Physical Interpretation .......... 45
13. Commutability and Compatibility .......... 49
III. Representations .......... 53
14. Basic Vectors .......... 53
15. The δ Function .......... 58
16. Properties of the Basic Vectors .......... 62
17. The Representation of Linear Operators .......... 67
18. Probability Amplitudes .......... 72
19. Theorems about Functions of Observables .......... 76
20. Developments in Notation .......... 79
IV. The Quantum Conditions .......... 84
21. Poisson Brackets .......... 84
22. Schrödinger's Representation .......... 89
23. The Momentum Representation .......... 94
24. Heisenberg's Principle of Uncertainty .......... 97
25. Displacement Operators .......... 99
26. Unitary Transformations .......... 103
V. The Equations of Motion .......... 108
27. Schrödinger's Form for the Equations of Motion .......... 108
28. Heisenberg's Form for the Equations of Motion .......... 111
29. Stationary States .......... 116
30. The Free Particle .......... 118
31. The Motion of Wave Packets .......... 121
32. The Action Principle .......... 125
33. The Gibbs Ensemble .......... 130
VI. Elementary Applications .......... 136
34. The Harmonic Oscillator .......... 136
35. Angular Momentum .......... 140
36. Properties of Angular Momentum .......... 144
37. The Spin of the Electron .......... 149
38. Motion in a Central Field of Force .......... 152
39. Energy-levels of the Hydrogen Atom .......... 156
40. Selection Rules .......... 159
41. The Zeeman Effect for the Hydrogen Atom .......... 165
VII. Perturbation Theory .......... 167
42. General Remarks .......... 167
43. The Change in the Energy-levels caused by a Perturbation .......... 168
44. The Perturbation considered as causing Transitions .......... 172
45. Application to Radiation .......... 175
46. Transitions caused by a Perturbation Independent of the Time .......... 178
47. The Anomalous Zeeman Effect .......... 181
VIII. Collision Problems .......... 185
48. General Remarks .......... 185
49. The Scattering Coefficient .......... 188
50. Solution with the Momentum Representation .......... 193
51. Dispersive Scattering .......... 199
52. Resonance Scattering .......... 201
53. Emission and Absorption .......... 204
IX. Systems Containing Several Similar Particles .......... 207
54. Symmetrical and Antisymmetrical States .......... 207
55. Permutations as Dynamical Variables .......... 211
56. Permutations as Constants of the Motion .......... 213
57. Determination of the Energy-levels .......... 216
58. Application to Electrons .......... 219
X. Theory of Radiation .......... 225
59. An Assembly of Bosons .......... 225
60. The Connexion between Bosons and Oscillators .......... 227
61. Emission and Absorption of Bosons .......... 232
62. Applications to Photons .......... 235
63. The Interaction Energy between Photons and an Atom .......... 239
64. Emission, Absorption, and Scattering of Radiation .......... 244
65. An Assembly of Fermions .......... 248
XI. Relativistic Theory of the Electron .......... 252
66. Relativistic Treatment of a Particle .......... 252
67. The Wave Equation for the Electron .......... 253
68. Invariance under a Lorentz Transformation .......... 257
69. The Motion of a Free Electron .......... 260
70. Existence of the Spin .......... 263
71. Transition to Polar Variables .......... 266
72. The Fine-structure of the Energy-levels of Hydrogen .......... 268
73. Theory of the Positron .......... 272
XII. Quantum Electrodynamics .......... 275
74. Relativistic Notation .......... 275
75. The Quantum Conditions for the Field .......... 278
76. The Hamiltonian for the Field .......... 283
77. The Supplementary Conditions .......... 285
78. Classical Electrodynamics in Hamiltonian Form .......... 289
79. Passage to the Quantum Theory .......... 296
80. Elimination of the Longitudinal Waves .......... 300
81. Discussion of the Transverse Waves .......... 306
Index .......... 310
I
THE PRINCIPLE OF SUPERPOSITION
1. The need for a quantum theory
CLASSICAL mechanics has been developed continuously from the time
of Newton and applied to an ever-widening range of dynamical
systems, including the electromagnetic field in interaction with
matter. The underlying ideas and the laws governing their application
form a simple and elegant scheme, which one would be inclined
to think could not be seriously modified without having all its
attractive features spoilt. Nevertheless it has been found possible to
set up a new scheme, called quantum mechanics, which is more
suitable for the description of phenomena on the atomic scale and
which is in some respects more elegant and satisfying than the
classical scheme. This possibility is due to the changes which the
new scheme involves being of a very profound character and not
clashing with the features of the classical theory that make it so
attractive, as a result of which all these features can be incorporated
in the new scheme.
The necessity for a departure from classical mechanics is clearly
shown by experimental results. In the first place the forces known
in classical electrodynamics are inadequate for the explanation of the
remarkable stability of atoms and molecules, which is necessary in
order that materials may have any definite physical and chemical
properties at all. The introduction of new hypothetical forces will not
save the situation, since there exist general principles of classical
mechanics, holding for all kinds of forces, leading to results in direct
disagreement with observation. For example, if an atomic system has
its equilibrium disturbed in any way and is then left alone, it will be set
in oscillation and the oscillations will get impressed on the surrounding
electromagnetic field, so that their frequencies may be observed
with a spectroscope. Now whatever the laws of force governing the
equilibrium, one would expect to be able to include the various frequencies
in a scheme comprising certain fundamental frequencies and
their harmonics. This is not observed to be the case. Instead, there
is observed a new and unexpected connexion between the frequencies,
called Ritz's Combination Law of Spectroscopy, according to which all
the frequencies can be expressed as differences between certain terms,
the number of terms being much less than the number of frequencies.
This law is quite unintelligible from the classical standpoint.
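The arithmetic behind the Combination Law can be sketched in a few lines (this illustration is not in Dirac's text; the hydrogen-like term values T_n = R/n² follow the Rydberg formula, with R in arbitrary units): a small set of terms generates a much larger set of frequencies, since every pair of terms contributes one difference.

```python
from itertools import combinations

# Schematic illustration of Ritz's Combination Law: every observed
# spectral frequency is the difference between two "terms".  Here the
# terms take the hydrogen-like form T_n = R/n^2 (Rydberg), with the
# constant R in arbitrary units.
R = 1.0
terms = [R / n**2 for n in range(1, 5)]        # 4 terms

# All pairwise differences give the predicted line frequencies.
frequencies = sorted(abs(a - b) for a, b in combinations(terms, 2))

# n terms yield n(n-1)/2 frequencies, so the number of terms grows
# much more slowly than the number of observed lines.
print(len(terms), len(frequencies))            # 4 terms, 6 frequencies
```

With 10 terms one would already account for 45 lines, which is the economy the law expresses.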
One might try to get over the difficulty without departing from
classical mechanics by assuming each of the spectroscopically observed
frequencies to be a fundamental frequency with its own degree
of freedom, the laws of force being such that the harmonic vibrations
do not occur. Such a theory will not do, however, even apart from
the fact that it would give no explanation of the Combination Law,
since it would immediately bring one into conflict with the experimental
evidence on specific heats. Classical statistical mechanics
enables one to establish a general connexion between the total number
of degrees of freedom of an assembly of vibrating systems and its
specific heat. If one assumes all the spectroscopic frequencies of an
atom to correspond to different degrees of freedom, one would get a
specific heat for any kind of matter very much greater than the
observed value. In fact the observed specific heats at ordinary
temperatures are given fairly well by a theory that takes into account
merely the motion of each atom as a whole and assigns no internal
motion to it at all.
This leads us to a new clash between classical mechanics and the
results of experiment. There must certainly be some internal motion
in an atom to account for its spectrum, but the internal degrees of
freedom, for some classically inexplicable reason, do not contribute
to the specific heat. A similar clash is found in connexion with the
energy of oscillation of the electromagnetic field in a vacuum. Classical
mechanics requires the specific heat corresponding to this energy to
be infinite, but it is observed to be quite finite. A general conclusion
from experimental results is that oscillations of high frequency do
not contribute their classical quota to the specific heat.
As another illustration of the failure of classical mechanics we may
consider the behaviour of light. We have, on the one hand, the
phenomena of interference and diffraction, which can be explained
only on the basis of a wave theory; on the other, phenomena such as
photo-electric emission and scattering by free electrons, which show
that light is composed of small particles. These particles, which
are called photons, have each a definite energy and momentum, depending
on the frequency of the light, and appear to have just as
real an existence as electrons, or any other particles known in physics.
A fraction of a photon is never observed.
Experiments have shown that this anomalous behaviour is not
peculiar to light, but is quite general. All material particles have
wave properties, which can be exhibited under suitable conditions.
We have here a very striking and general example of the breakdown
of classical mechanics, not merely an inaccuracy in its laws of motion,
but an inadequacy of its concepts to supply us with a description of
atomic events.
The necessity to depart from classical ideas when one wishes to
account for the ultimate structure of matter may be seen, not only
from experimentally established facts, but also from general philosophical
grounds. In a classical explanation of the constitution of
matter, one would assume it to be made up of a large number of small
constituent parts and one would postulate laws for the behaviour of
these parts, from which the laws of the matter in bulk could be deduced.
This would not complete the explanation, however, since the
question of the structure and stability of the constituent parts is left
untouched. To go into this question, it becomes necessary to postulate
that each constituent part is itself made up of smaller parts, in
terms of which its behaviour is to be explained. There is clearly no
end to this procedure, so that one can never arrive at the ultimate
structure of matter on these lines. So long as big and small are merely
relative concepts, it is no help to explain the big in terms of the small.
It is therefore necessary to modify classical ideas in such a way as to
give an absolute meaning to size.
At this stage it becomes important to remember that science is
concerned only with observable things and that we can observe an
object only by letting it interact with some outside influence. An act
of observation is thus necessarily accompanied by some disturbance
of the object observed. We may define an object to be big when the
disturbance accompanying our observation of it may be neglected,
and small when the disturbance cannot be neglected. This definition
is in close agreement with the common meanings of big and small.
It is usually assumed that, by being careful, we may cut down the
disturbance accompanying our observation to any desired extent.
The concepts of big and small are then purely relative and refer to the
gentleness of our means of observation as well as to the object being
described. In order to give an absolute meaning to size, such as is
required for any theory of the ultimate structure of matter, we have
to assume that there is a limit to the fineness of our powers of observation
and the smallness of the accompanying disturbance, a limit which is
inherent in the nature of things and can never be surpassed by improved
technique or increased skill on the part of the observer. If the object under
observation is such that the unavoidable limiting disturbance is negligible,
then the object is big in the absolute sense and we may apply
classical mechanics to it. If, on the other hand, the limiting disturbance
is not negligible, then the object is small in the absolute
sense and we require a new theory for dealing with it.
A consequence of the preceding discussion is that we must revise
our ideas of causality. Causality applies only to a system which is
left undisturbed. If a system is small, we cannot observe it without
producing a serious disturbance and hence we cannot expect to find
any causal connexion between the results of our observations.
Causality will still be assumed to apply to undisturbed systems and
the equations which will be set up to describe an undisturbed system
will be differential equations expressing a causal connexion between
conditions at one time and conditions at a later time. These equations
will be in close correspondence with the equations of classical
mechanics, but they will be connected only indirectly with the results
of observations. There is an unavoidable indeterminacy in the calculation
of observational results, the theory enabling us to calculate in
general only the probability of our obtaining a particular result when
we make an observation.
2. The polarization of photons
The discussion in the preceding section about the limit to the
gentleness with which observations can be made and the consequent
indeterminacy in the results of those observations does not provide
any quantitative basis for the building up of quantum mechanics.
For this purpose a new set of accurate laws of nature is required.
One of the most fundamental and most drastic of these is the Principle
of Superposition of States. We shall lead up to a general formulation
of this principle through a consideration of some special cases, taking
first the example provided by the polarization of light.
It is known experimentally that when plane-polarized light is used
for ejecting photo-electrons, there is a preferential direction for the
electron emission. Thus the polarization properties of light are closely
connected with its corpuscular properties and one must ascribe a
polarization to the photons. One must consider, for instance, a beam
of light plane-polarized in a certain direction as consisting of photons
each of which is plane-polarized in that direction and a beam of
circularly polarized light as consisting of photons each circularly
polarized. Every photon is in a certain state of polarization, as we
shall say. The problem we must now consider is how to fit in these
ideas with the known facts about the resolution of light into polarized
components and the recombination of these components.
Let us take a definite case. Suppose we have a beam of light passing
through a crystal of tourmaline, which has the property of letting
through only light plane-polarized perpendicular to its optic axis.
Classical electrodynamics tells us what will happen for any given
polarization of the incident beam. If this beam is polarized perpendicular
to the optic axis, it will all go through the crystal; if
parallel to the axis, none of it will go through; while if polarized at
an angle α to the axis, a fraction sin²α will go through. How are we
to understand these results on a photon basis?
A beam that is plane-polarized in a certain direction is to be
pictured as made up of photons each plane-polarized in that
direction. This picture leads to no difficulty in the cases when our
incident beam is polarized perpendicular or parallel to the optic axis.
We merely have to suppose that each photon polarized perpendicular
to the axis passes unhindered and unchanged through the crystal,
while each photon polarized parallel to the axis is stopped and absorbed.
A difficulty arises, however, in the case of the obliquely
polarized incident beam. Each of the incident photons is then
obliquely polarized and it is not clear what will happen to such a
photon when it reaches the tourmaline.
A question about what will happen to a particular photon under
certain conditions is not really very precise. To make it precise one
must imagine some experiment performed having a bearing on the
question and inquire what will be the result of the experiment. Only
questions about the results of experiments have a real significance
and it is only such questions that theoretical physics has to consider.
In our present example the obvious experiment is to use an incident
beam consisting of only a single photon and to observe what appears
on the back side of the crystal. According to quantum mechanics
the result of this experiment will be that sometimes one will find a
whole photon, of energy equal to the energy of the incident photon,
on the back side and other times one will find nothing. When one
finds a whole photon, it will be polarized perpendicular to the optic
axis. One will never find only a part of a photon on the back side.
If one repeats the experiment a large number of times, one will find
the photon on the back side in a fraction sin²α of the total number
of times. Thus we may say that the photon has a probability sin²α
of passing through the tourmaline and appearing on the back side
polarized perpendicular to the axis and a probability cos²α of being
absorbed. These values for the probabilities lead to the correct
classical results for an incident beam containing a large number of
photons.
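The recovery of the classical result can be checked with a short simulation (a sketch, not part of Dirac's text): each photon independently passes with probability sin²α and is absorbed with probability cos²α, no photon is ever split, and the transmitted fraction of a large beam approaches the classical value sin²α.

```python
import math
import random

# Monte Carlo sketch of single-photon transmission through tourmaline:
# each obliquely polarized photon passes independently with
# probability sin^2(alpha) and is absorbed with probability
# cos^2(alpha); a fraction of a photon is never observed.
def transmitted_fraction(alpha, n_photons, rng):
    p_pass = math.sin(alpha) ** 2
    passed = sum(1 for _ in range(n_photons) if rng.random() < p_pass)
    return passed / n_photons

rng = random.Random(0)                 # fixed seed for repeatability
alpha = math.radians(30)               # 30 degrees from the optic axis
frac = transmitted_fraction(alpha, 100_000, rng)
print(round(frac, 2))                  # close to sin^2(30 deg) = 0.25
```

For a beam of many photons the statistical fluctuation shrinks, which is why the classical intensity law emerges from purely probabilistic single-photon events.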
In this way we preserve the individuality of the photon in all
cases. We are able to do this, however, only because we abandon the
determinacy of the classical theory. The result of an experiment is
not determined, as it would be according to classical ideas, by the
conditions under the control of the experimenter. The most that can
be predicted is a set of possible results, with a probability of occurrence
for each.
The foregoing discussion about the result of an experiment with a
single obliquely polarized photon incident on a crystal of tourmaline
answers all that can legitimately be asked about what happens to an
obliquely polarized photon when it reaches the tourmaline. Questions
about what decides whether the photon is to go through or not and
how it changes its direction of polarization when it does go through
cannot be investigated by experiment and should be regarded as
outside the domain of science. Nevertheless some further description
is necessary in order to correlate the results of this experiment with
the results of other experiments that might be performed with
photons and to fit them all into a general scheme. Such further
description should be regarded, not as an attempt to answer questions
outside the domain of science, but as an aid to the formulation of
rules for expressing concisely the results of large numbers of
experiments.
The further description provided by quantum mechanics runs as
follows. It is supposed that a photon polarized obliquely to the optic
axis may be regarded as being partly in the state of polarization
parallel to the axis and partly in the state of polarization perpendicular
to the axis. The state of oblique polarization may be considered
as the result of some kind of superposition process applied to
the two states of parallel and perpendicular polarization. This implies
a certain special kind of relationship between the various states of
polarization, a relationship similar to that between polarized beams in
classical optics, but which is now to be applied, not to beams, but to
the states of polarization of one particular photon. This relationship
allows any state of polarization to be resolved into, or expressed as a
superposition of, any two mutually perpendicular states of polarization.
When we make the photon meet a tourmaline crystal, we are
subjecting it to an observation. We are observing whether it is polarized
parallel or perpendicular to the optic axis. The effect of making this
observation is to force the photon entirely into the state of parallel
or entirely into the state of perpendicular polarization. It has to
make a sudden jump from being partly in each of these two states to
being entirely in one or other of them. Which of the two states it will
jump into cannot be predicted, but is governed only by probability
laws. If it jumps into the parallel state it gets absorbed and if it
jumps into the perpendicular state it passes through the crystal and
appears on the other side preserving this state of polarization.
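The claim that any state of polarization can be resolved into any two mutually perpendicular states can be illustrated with two-component vectors. The sketch below is a modern aid, not from the text (the function names are assumptions of mine): it resolves an obliquely polarized ket into the perpendicular pair at an arbitrary angle β and checks that the squared components always account for the whole photon, whatever pair is chosen.

```python
import math

def basis(beta):
    """A pair of mutually perpendicular linear-polarization kets,
    at angles beta and beta + 90 degrees to the optic axis."""
    return (math.cos(beta), math.sin(beta)), (-math.sin(beta), math.cos(beta))

def resolve(state, beta):
    """Components of a polarization ket in the perpendicular pair at beta."""
    e1, e2 = basis(beta)
    a1 = e1[0] * state[0] + e1[1] * state[1]
    a2 = e2[0] * state[0] + e2[1] * state[1]
    return a1, a2

oblique = (math.cos(math.radians(25)), math.sin(math.radians(25)))
for beta_deg in (0.0, 17.0, 60.0):          # any perpendicular pair will do
    a1, a2 = resolve(oblique, math.radians(beta_deg))
    # the squared components always exhaust the whole photon
    assert abs(a1 ** 2 + a2 ** 2 - 1.0) < 1e-12
```

Observing "parallel or perpendicular" to a given axis then forces the photon entirely into one member of that particular pair, with probabilities given by the squared components.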
3. Interference of photons
In this section we shall deal with another example of superposition.
We shall again take photons, but shall be concerned with their position
in space and their momentum instead of their polarization. If
we are given a beam of roughly monochromatic light, then we know
something about the location and momentum of the associated
photons. We know that each of them is located somewhere in the
region of space through which the beam is passing and has a momentum
in the direction of the beam of magnitude given in terms of the
frequency of the beam by Einstein's photo-electric law: momentum
equals frequency multiplied by a universal constant. When we have
such information about the location and momentum of a photon we
shall say that it is in a definite translational state.
We shall discuss the description which quantum mechanics provides
of the interference of photons. Let us take a definite experiment
demonstrating interference. Suppose we have a beam of light
which is passed through some kind of interferometer, so that it gets
split up into two components and the two components are subsequently
made to interfere. We may, as in the preceding section, take
an incident beam consisting of only a single photon and inquire what
will happen to it as it goes through the apparatus. This will present
to us the difficulty of the conflict between the wave and corpuscular
theories of light in an acute form.
Corresponding to the description that we had in the case of the
polarization, we must now describe the photon as going partly into
each of the two components into which the incident beam is split.
The photon is then, as we may say, in a translational state given by the
superposition of the two translational states associated with the two
components. We are thus led to a generalization of the term `translational
state' applied to a photon. For a photon to be in a definite
translational state it need not be associated with one single beam of
light, but may be associated with two or more beams of light which
are the components into which one original beam has been split.† In
the accurate mathematical theory each translational state is associated
with one of the wave functions of ordinary wave optics, which wave
function may describe either a single beam or two or more beams
into which one original beam has been split. Translational states are
thus superposable in a similar way to wave functions.
Let us consider now what happens when we determine the energy
in one of the components. The result of such a determination must
be either the whole photon or nothing at all. Thus the photon must
change suddenly from being partly in one beam and partly in the
other to being entirely in one of the beams. This sudden change is
due to the disturbance in the translational state of the photon which
the observation necessarily makes. It is impossible to predict in which
of the two beams the photon will be found. Only the probability of
either result can be calculated from the previous distribution of the
photon over the two beams.
One could carry out the energy measurement without destroying the
component beam by, for example, reflecting the beam from a movable
mirror and observing the recoil. Our description of the photon allows
us to infer that, after such an energy measurement, it would not be
possible to bring about any interference effects between the two components.
So long as the photon is partly in one beam and partly in
the other, interference can occur when the two beams are superposed,
but this possibility disappears when the photon is forced entirely into
† The circumstance that the superposition idea requires us to generalize our
original meaning of translational states, but that no corresponding generalization was
needed for the states of polarization of the preceding section, is an accidental one
with no underlying theoretical significance.
one of the beams by an observation. The other beam then no longer
enters into the description of the photon, so that it counts as being
entirely in the one beam in the ordinary way for any experiment that
may subsequently be performed on it.
On these lines quantum mechanics is able to effect a reconciliation
of the wave and corpuscular properties of light. The essential point
is the association of each of the translational states of a photon with
one of the wave functions of ordinary wave optics. The nature of this
association cannot be pictured on a basis of classical mechanics, but
is something entirely new. It would be quite wrong to picture the
photon and its associated wave as interacting in the way in which
particles and waves can interact in classical mechanics. The association
can be interpreted only statistically, the wave function giving
us information about the probability of our finding the photon in any
particular place when we make an observation of where it is.
Some time before the discovery of quantum mechanics people
realized that the connexion between light waves and photons must
be of a statistical character. What they did not clearly realize, however,
was that the wave function gives information about the probability
of one photon being in a particular place and not the probable
number of photons in that place. The importance of the distinction
can be made clear in the following way. Suppose we have a beam
of light consisting of a large number of photons split up into two components
of equal intensity. On the assumption that the intensity of
a beam is connected with the probable number of photons in it, we
should have half the total number of photons going into each component.
If the two components are now made to interfere, we should
require a photon in one component to be able to interfere with one in
the other. Sometimes these two photons would have to annihilate one
another and other times they would have to produce four photons.
This would contradict the conservation of energy. The new theory,
which connects the wave function with probabilities for one photon,
gets over the difficulty by making each photon go partly into each of
the two components. Each photon then interferes only with itself.
Interference between two different photons never occurs.
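The arithmetic behind "each photon interferes only with itself" can be sketched numerically. In the illustration below (a modern aid under assumed conventions, not Dirac's notation), the photon's two translational amplitudes are superposed and the detection probability per photon is the squared modulus of their sum; over a beam of N photons the expected count is N times this probability, so energy is conserved on average without any photon-photon interaction.

```python
import cmath
import math

def detection_probability(phase):
    """A single photon split equally between two interferometer paths:
    the two translational amplitudes are superposed (the photon
    interferes only with itself) before the modulus is squared."""
    a1 = 1 / math.sqrt(2)
    a2 = cmath.exp(1j * phase) / math.sqrt(2)
    return abs(a1 + a2) ** 2 / 2    # normalized to lie between 0 and 1

print(detection_probability(0.0))          # constructive, near 1.0
print(detection_probability(math.pi / 2))  # intermediate, near 0.5
print(detection_probability(math.pi))      # destructive, near 0.0
```

On the rejected photon-number picture, interference would require pairs of photons to annihilate or to double, contradicting energy conservation; here no such step ever appears.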
The association of particles with waves discussed above is not
restricted to the case of light, but is, according to modern theory,
of universal applicability. All kinds of particles are associated with
waves in this way and conversely all wave motion is associated with
particles. Thus all particles can be made to exhibit interference
effects and all wave motion has its energy in the form of quanta. The
reason why these general phenomena are not more obvious is on
account of a law of proportionality between the mass or energy of the
particles and the frequency of the waves, the coefficient being such
that for waves of familiar frequencies the associated quanta are
extremely small, while for particles even as light as electrons the
associated wave frequency is so high that it is not easy to demonstrate
interference.
4. Superposition and indeterminacy
The reader may possibly feel dissatisfied with the attempt in the
two preceding sections to fit in the existence of photons with the
classical theory of light. He may argue that a very strange idea has
been introduced: the possibility of a photon being partly in each of
two states of polarization, or partly in each of two separate beams.
But even with the help of this strange idea no satisfying picture of
the fundamental single-photon processes has been given. He may say
further that this strange idea did not provide any information about
experimental results for the experiments discussed, beyond what
could have been obtained from an elementary consideration of
photons being guided in some vague way by waves. What, then, is
the use of the strange idea?
In answer to the first criticism it may be remarked that the main
object of physical science is not the provision of pictures, but is the
formulation of laws governing phenomena and the application of
these laws to the discovery of new phenomena. If a picture exists,
so much the better; but whether a picture exists or not is a matter
of only secondary importance. In the case of atomic phenomena
no picture can be expected to exist in the usual sense of the word
`picture', by which is meant a model functioning essentially on
classical lines. One may, however, extend the meaning of the word
`picture' to include any way of looking at the fundamental laws which
makes their self-consistency obvious. With this extension, one may
gradually acquire a picture of atomic phenomena by becoming
familiar with the laws of the quantum theory.
With regard to the second criticism, it may be remarked that for
many simple experiments with light, an elementary theory of waves
and photons connected in a vague statistical way would be adequate
to account for the results. In the case of such experiments quantum
mechanics has no further information to give. In the great majority
of experiments, however, the conditions are too complex for an
elementary theory of this kind to be applicable and some more
elaborate scheme, such as is provided by quantum mechanics, is then
needed. The method of description that quantum mechanics gives
in the more complex cases is applicable also to the simple cases and
although it is then not really necessary for accounting for the experimental
results, its study in these simple cases is perhaps a suitable
introduction to its study in the general case.
There remains an overall criticism that one may make to the whole
scheme, namely, that in departing from the determinacy of the
classical theory a great complication is introduced into the description
of Nature, which is a highly undesirable feature. This complication
is undeniable, but it is offset by a great simplification, provided
by the general principle of superposition of states, which we shall now
go on to consider. But first it is necessary to make precise the important
concept of a `state' of a general atomic system.
Let us take any atomic system, composed of particles or bodies
with specified properties (mass, moment of inertia, etc.) interacting
according to specified laws of force. There will be various possible
motions of the particles or bodies consistent with the laws of force.
Each such motion is called a state of the system. According to
classical ideas one could specify a state by giving numerical values
to all the coordinates and velocities of the various component parts
of the system at some instant of time, the whole motion being then
completely determined. Now the argument of pp. 3 and 4 shows that
we cannot observe a small system with that amount of detail which
classical theory supposes. The limitation in the power of observation
puts a limitation on the number of data that can be assigned to a
state. Thus a state of an atomic system must be specified by fewer
or more indefinite data than a complete set of numerical values
for all the coordinates and velocities at some instant of time. In the
case when the system is just a single photon, a state would be completely
specified by a given state of motion in the sense of §3
together with a given state of polarization in the sense of §2.
A state of a system may be defined as an undisturbed motion that
is restricted by as many conditions or data as are theoretically
possible without mutual interference or contradiction. In practice
the conditions could be imposed by a suitable preparation of the
system, consisting perhaps in passing it through various kinds of
sorting apparatus, such as slits and polarimeters, the system being
left undisturbed after the preparation. The word `state' may be
used to mean either the state at one particular time (after the
preparation), or the state throughout the whole of time after the
preparation. To distinguish these two meanings, the latter will be
called a `state of motion' when there is liable to be ambiguity.
The general principle of superposition of quantum mechanics
applies to the states, with either of the above meanings, of any one
dynamical system. It requires us to assume that between these
states there exist peculiar relationships such that whenever the
system is definitely in one state we can consider it as being partly
in each of two or more other states. The original state must be
regarded as the result of a kind of superposition of the two or more
new states, in a way that cannot be conceived on classical ideas. Any
state may be considered as the result of a superposition of two or
more other states, and indeed in an infinite number of ways. Conversely
any two or more states may be superposed to give a new
state. The procedure of expressing a state as the result of superposition
of a number of other states is a mathematical procedure
that is always permissible, independent of any reference to physical
conditions, like the procedure of resolving a wave into Fourier components.
Whether it is useful in any particular case, though, depends
on the special physical conditions of the problem under consideration.
In the two preceding sections examples were given of the superposition
principle applied to a system consisting of a single photon.
§2 dealt with states differing only with regard to the polarization and
§3 with states differing only with regard to the motion of the photon
as a whole.
The nature of the relationships which the superposition principle
requires to exist between the states of any system is of a kind that
cannot be explained in terms of familiar physical concepts. One
cannot in the classical sense picture a system being partly in each of
two states and see the equivalence of this to the system being completely
in some other state. There is an entirely new idea involved,
to which one must get accustomed and in terms of which one must
proceed to build up an exact mathematical theory, without having
any detailed classical picture.
When a state is formed by the superposition of two other states,
it will have properties that are in some vague way intermediate
between those of the two original states and that approach more or
less closely to those of either of them according to the greater or less
`weight' attached to this state in the superposition process. The new
state is completely defined by the two original states when their
relative weights in the superposition process are known, together
with a certain phase difference, the exact meaning of weights and
phases being provided in the general case by the mathematical theory.
In the case of the polarization of a photon their meaning is that provided
by classical optics, so that, for example, when two perpendicularly
plane polarized states are superposed with equal weights, the
new state may be circularly polarized in either direction, or linearly
polarized at an angle ¼π, or else elliptically polarized, according to
the phase difference.
The non-classical nature of the superposition process is brought
out clearly if we consider the superposition of two states, A and B,
such that there exists an observation which, when made on the
system in state A, is certain to lead to one particular result, a say, and
when made on the system in state B is certain to lead to some different
result, b say. What will be the result of the observation when made
on the system in the superposed state? The answer is that the result
will be sometimes a and sometimes b, according to a probability law
depending on the relative weights of A and B in the superposition
process. It will never be different from both a and b. The intermediate
character of the state formed by superposition thus expresses
itself through the probability of a particular result for an observation
being intermediate between the corresponding probabilities for the original
states,† not through the result itself being intermediate between the
corresponding results for the original states.
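This probability law for a superposed state can be illustrated numerically. In the sketch below (a modern illustrative aid; the quantitative weighting rule |c₁|² : |c₂|² anticipates the mathematical theory developed later in the book), the observation on the superposed state always yields one of the two original results a and b, never anything intermediate, while the long-run frequency of a lies between the certainties of the original states.

```python
import random

def observe(c1, c2, rng):
    """Observation on the superposed state c1*A + c2*B, where A and B
    give the definite results 'a' and 'b'.  The outcome is always one
    of a and b, with probabilities fixed by the relative weights."""
    w1, w2 = abs(c1) ** 2, abs(c2) ** 2
    return "a" if rng.random() < w1 / (w1 + w2) else "b"

rng = random.Random(7)
results = [observe(1.0, 2.0, rng) for _ in range(50_000)]
frac_a = results.count("a") / len(results)
assert set(results) <= {"a", "b"}   # never a result other than a or b
print(frac_a)                       # near 1/5, intermediate between 1 and 0
```

It is the probability that is intermediate, not the result itself, exactly as the text insists.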
In this way we see that such a drastic departure from ordinary
ideas as the assumption of superposition relationships between the
states is possible only on account of the recognition of the importance
of the disturbance accompanying an observation and of the consequent
indeterminacy in the result of the observation. When an
observation is made on any atomic system that is in a given state,
† The probability of a particular result for the state formed by superposition is not
always intermediate between those for the original states in the general case when
those for the original states are not zero or unity, so there are restrictions on the
`intermediateness' of a state formed by superposition.
in general the result will not be determinate, i.e., if the experiment
is repeated several times under identical conditions several different
results may be obtained. It is a law of nature, though, that if the
experiment is repeated a large number of times, each particular result
will be obtained in a definite fraction of the total number of times, so
that there is a definite probability of its being obtained. This probability
is what the theory sets out to calculate. Only in special cases
when the probability for some result is unity is the result of the
experiment determinate.
The assumption of superposition relationships between the states
leads to a mathematical theory in which the equations that define
a state are linear in the unknowns. In consequence of this, people
have tried to establish analogies with systems in classical mechanics,
such as vibrating strings or membranes, which are governed by linear
equations and for which, therefore, a superposition principle holds.
Such analogies have led to the name `Wave Mechanics' being sometimes
given to quantum mechanics. It is important to remember,
however, that the superposition that occurs in quantum mechanics is
of an essentially different nature from any occurring in the classical
theory, as is shown by the fact that the quantum superposition principle
demands indeterminacy in the results of observations in order
to be capable of a sensible physical interpretation. The analogies are
thus liable to be misleading.
5. Mathematical formulation of the principle
A profound change has taken place during the present century in
the opinions physicists have held on the mathematical foundations
of their subject. Previously they supposed that the principles of
Newtonian mechanics would provide the basis for the description
of the whole of physical phenomena and that all the theoretical
physicist had to do was suitably to develop and apply these principles.
With the recognition that there is no logical reason why
Newtonian and other classical principles should be valid outside the
domains in which they have been experimentally verified has come
the realization that departures from these principles are indeed
necessary. Such departures find their expression through the introduction
of new mathematical formalisms, new schemes of axioms
and rules of manipulation, into the methods of theoretical physics.
Quantum mechanics provides a good example of the new ideas. It
requires the states of a dynamical system and the dynamical variables
to be interconnected in quite strange ways that are unintelligible
from the classical standpoint. The states and dynamical variables
have to be represented by mathematical quantities of different
natures from those ordinarily used in physics. The new scheme
becomes a precise physical theory when all the axioms and rules of
manipulation governing the mathematical quantities are specified
and when in addition certain laws are laid down connecting physical
facts with the mathematical formalism, so that from any given
physical conditions equations between the mathematical quantities
may be inferred and vice versa. In an application of the theory one
would be given certain physical information, which one would proceed
to express by equations between the mathematical quantities.
One would then deduce new equations with the help of the axioms
and rules of manipulation and would conclude by interpreting these
new equations as physical conditions. The justification for the whole
scheme depends, apart from internal consistency, on the agreement
of the final results with experiment.
We shall begin to set up the scheme by dealing with the mathematical
relations between the states of a dynamical system at one
instant of time, which relations will come from the mathematical
formulation of the principle of superposition. The superposition process
is a kind of additive process and implies that states can in some
way be added to give new states. The states must therefore be connected
with mathematical quantities of a kind which can be added
together to give other quantities of the same kind. The most obvious
of such quantities are vectors. Ordinary vectors, existing in a space
of a finite number of dimensions, are not sufficiently general for
most of the dynamical systems in quantum mechanics. We have to
make a generalization to vectors in a space of an infinite number of
dimensions, and the mathematical treatment becomes complicated
by questions of convergence. For the present, however, we shall deal
merely with some general properties of the vectors, properties which
can be deduced on the basis of a simple scheme of axioms, and
questions of convergence and related topics will not be gone into
until the need arises.
It is desirable to have a special name for describing the vectors
which are connected with the states of a system in quantum mechanics,
whether they are in a space of a finite or an infinite number of
dimensions. We shall call them ket vectors, or simply kets, and denote
a general one of them by a special symbol |⟩. If we want to specify
a particular one of them by a label, A say, we insert it in the middle,
thus |A⟩. The suitability of this notation will become clear as the
scheme is developed.
Ket vectors may be multiplied by complex numbers and may be
added together to give other ket vectors, e.g. from two ket vectors
|A⟩ and |B⟩ we can form

c₁|A⟩ + c₂|B⟩ = |R⟩,   (1)

say, where c₁ and c₂ are any two complex numbers. We may also
perform more general linear processes with them, such as adding an
infinite sequence of them, and if we have a ket vector |x⟩, depending
on and labelled by a parameter x which can take on all values in a
certain range, we may integrate it with respect to x, to get another
ket vector

∫ |x⟩ dx = |Q⟩,

say. A ket vector which is expressible linearly in terms of certain
others is said to be dependent on them. A set of ket vectors are called
independent if no one of them is expressible linearly in terms of the
others.
We now assume that each state of a dynamical system at a particular
time corresponds to a ket vector, the correspondence being such that if a
state results from the superposition of certain other states, its corresponding
ket vector is expressible linearly in terms of the corresponding ket
vectors of the other states, and conversely. Thus the state R results from
a superposition of the states A and B when the corresponding ket
vectors are connected by (1).
The above assumption leads to certain properties of the superposition
process, properties which are in fact necessary for the word
`superposition' to be appropriate. When two or more states are
superposed, the order in which they occur in the superposition
process is unimportant, so the superposition process is symmetrical
between the states that are superposed. Again, we see from equation
(1) that (excluding the case when the coefficient c₁ or c₂ is zero) if
the state R can be formed by superposition of the states A and B,
then the state A can be formed by superposition of B and R, and B
can be formed by superposition of A and R. The superposition
relationship is symmetrical between all three states A, B, and R.
A state which results from the superposition of certain other
states will be said to be dependent on those states. More generally,
a state will be said to be dependent on any set of states, finite or
infinite in number, if its corresponding ket vector is dependent on
the corresponding ket vectors of the set of states. A set of states
will be called independent if no one of them is dependent on the
others.
To proceed with the mathematical formulation of the superposition
principle we must introduce a further assumption, namely the assumption
that by superposing a state with itself we cannot form any new
state, but only the original state over again. If the original state
corresponds to the ket vector |A⟩, when it is superposed with itself
the resulting state will correspond to

c₁|A⟩ + c₂|A⟩ = (c₁+c₂)|A⟩,

where c₁ and c₂ are numbers. Now we may have c₁+c₂ = 0, in which
case the result of the superposition process would be nothing at all,
the two components having cancelled each other by an interference
effect. Our new assumption requires that, apart from this special
case, the resulting state must be the same as the original one, so that
(c₁+c₂)|A⟩ must correspond to the same state that |A⟩ does. Now
c₁+c₂ is an arbitrary complex number and hence we can conclude
that if the ket vector corresponding to a state is multiplied by any
complex number, not zero, the resulting ket vector will correspond to the
same state. Thus a state is specified by the direction of a ket vector
and any length one may assign to the ket vector is irrelevant. All
the states of the dynamical system are in one-one correspondence
with all the possible directions for a ket vector, no distinction being
made between the directions of the ket vectors |A⟩ and −|A⟩.
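The statement that only the direction of a ket matters can be expressed as a small computational test. The sketch below is an illustrative aid (finite-dimensional kets as lists of complex components, names my own): two kets correspond to the same state exactly when one is a non-zero complex multiple of the other, which by the Cauchy-Schwarz equality condition happens precisely when |⟨1|2⟩| equals the product of their lengths; the zero ket is rejected, since it corresponds to no state at all.

```python
import math

def same_state(ket1, ket2, tol=1e-9):
    """Kets correspond to the same state exactly when one is a non-zero
    complex multiple of the other, i.e. they have the same direction
    (Cauchy-Schwarz equality)."""
    n1 = math.sqrt(sum(abs(c) ** 2 for c in ket1))
    n2 = math.sqrt(sum(abs(c) ** 2 for c in ket2))
    if n1 < tol or n2 < tol:
        raise ValueError("the zero ket vector corresponds to no state at all")
    overlap = sum(a.conjugate() * b for a, b in zip(ket1, ket2))
    return abs(abs(overlap) - n1 * n2) < tol

ket = [1 + 0j, 2 - 1j]
assert same_state(ket, [(3 + 4j) * c for c in ket])   # any non-zero multiple
assert same_state(ket, [-c for c in ket])             # |A> and -|A> agree
assert not same_state(ket, [1 + 0j, 0j])              # a different direction
```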
The assumption just made shows up very clearly the fundamental
difference between the superposition of the quantum theory and any
kind of classical superposition. In the case of a classical system for
which a superposition principle holds, for instance a vibrating membrane,
when one superposes a state with itself the result is a different
state, with a different magnitude of the oscillations. There is no
physical characteristic of a quantum state corresponding to the
magnitude of the classical oscillations, as distinct from their quality,
described by the ratios of the amplitudes at different points of
the membrane. Again, while there exists a classical state with zero
amplitude of oscillation everywhere, namely the state of rest, there
does not exist any corresponding state for a quantum system, the
zero ket vector corresponding to no state at all.
Given two states corresponding to the ket vectors |A⟩ and |B⟩,
the general state formed by superposing them corresponds to a ket
vector |R⟩ which is determined by two complex numbers, namely
the coefficients c₁ and c₂ of equation (1). If these two coefficients are
multiplied by the same factor (itself a complex number), the ket
vector |R⟩ will get multiplied by this factor and the corresponding
state will be unaltered. Thus only the ratio of the two coefficients
is effective in determining the state R. Hence this state is determined
by one complex number, or by two real parameters. Thus
from two given states, a twofold infinity of states may be obtained
by superposition.
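The dependence on a single complex ratio can be checked directly: multiplying both coefficients by a common complex factor leaves the direction of the resulting ket, and hence the state, unchanged. A minimal sketch (illustrative names, finite-dimensional kets):

```python
def superpose(c1, c2, ketA, ketB):
    """Ket of the state formed by superposing A and B with weights c1, c2."""
    return [c1 * a + c2 * b for a, b in zip(ketA, ketB)]

def direction(ket, tol=1e-12):
    """Strip off the irrelevant overall complex factor, leaving only the
    direction: scale the first sizeable component to 1."""
    pivot = next(c for c in ket if abs(c) > tol)
    return [c / pivot for c in ket]

ketA, ketB = [1 + 0j, 0j], [0j, 1 + 0j]
r1 = direction(superpose(1 + 0j, 2 + 3j, ketA, ketB))
r2 = direction(superpose(5 - 1j, (2 + 3j) * (5 - 1j), ketA, ketB))
# both coefficient pairs have the same ratio c2/c1, hence the same state
assert all(abs(x - y) < 1e-12 for x, y in zip(r1, r2))
```

The ratio c₂/c₁ is one complex number, i.e. two real parameters: the twofold infinity of states described in the text.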
This result is confirmed by the examples discussed in §§ 2 and 3.
In the example of §2 there are just two independent states of polarization
for a photon, which may be taken to be the states of plane
polarization parallel and perpendicular to some fixed direction, and
from the superposition of these two a twofold infinity of states of
polarization can be obtained, namely all the states of elliptic polarization,
the general one of which requires two parameters to describe
it. Again, in the example of §3, from the superposition of two given
states of motion for a photon a twofold infinity of states of motion
may be obtained, the general one of which is described by two
parameters, which may be taken to be the ratio of the amplitudes
of the two wave functions that are added together and their phase
relationship. This confirmation shows the need for allowing complex
coefficients in equation (1). If these coefficients were restricted to be
real, then, since only their ratio is of importance for determining the
direction of the resultant ket vector |R⟩ when |A⟩ and |B⟩ are
given, there would be only a simple infinity of states obtainable from
the superposition.
6. Bra and ket vectors
Whenever we have a set of vectors in any mathematical theory,
we can always set up a second set of vectors, which mathematicians
call the dual vectors. The procedure will be described for the case
when the original vectors are our ket vectors.
Suppose we have a number φ which is a function of a ket vector
|A⟩, i.e. to each ket vector |A⟩ there corresponds one number φ,
and suppose further that the function is a linear one, which means
that the number corresponding to |A⟩ + |A′⟩ is the sum of the
numbers corresponding to |A⟩ and to |A′⟩, and the number corresponding
to c|A⟩ is c times the number corresponding to |A⟩, c
being any numerical factor. Then the number φ corresponding to
any |A⟩ may be looked upon as the scalar product of that |A⟩ with
some new vector, there being one of these new vectors for each linear
function of the ket vectors |A⟩. The justification for this way of
looking at φ is that, as will be seen later (see equations (5) and (6)),
the new vectors may be added together and may be multiplied by
numbers to give other vectors of the same kind. The new vectors
are, of course, defined only to the extent that their scalar products
with the original ket vectors are given numbers, but this is sufficient
for one to be able to build up a mathematical theory about
them.
We shall call the new vectors bra vectors, or simply bras, and denote
a general one of them by the symbol ⟨ |, the mirror image of the
symbol for a ket vector. If we want to specify a particular one of
them by a label, B say, we write it in the middle, thus ⟨B|. The
scalar product of a bra vector ⟨B| and a ket vector |A⟩ will be
written ⟨B|A⟩, i.e. as a juxtaposition of the symbols for the bra
and ket vectors, that for the bra vector being on the left, and the
two vertical lines being contracted to one for brevity.

One may look upon the symbols ⟨ and ⟩ as a distinctive kind of
brackets. A scalar product ⟨B|A⟩ now appears as a complete bracket
expression and a bra vector ⟨B| or a ket vector |A⟩ as an incomplete
bracket expression. We have the rules that any complete bracket
expression denotes a number and any incomplete bracket expression
denotes a vector, of the bra or ket kind according to whether it contains
the first or second part of the brackets.
The condition that the scalar product of ⟨B| and |A⟩ is a linear
function of |A⟩ may be expressed symbolically by

⟨B|{|A⟩+|A′⟩} = ⟨B|A⟩+⟨B|A′⟩,    (2)
⟨B|{c|A⟩} = c⟨B|A⟩,    (3)

c being any number.
A bra vector is considered to be completely defined when its scalar
product with every ket vector is given, so that if a bra vector has its
scalar product with every ket vector vanishing, the bra vector itself
must be considered as vanishing. In symbols, if

⟨P|A⟩ = 0,  all |A⟩,
then ⟨P| = 0.    (4)
The sum of two bra vectors ⟨B| and ⟨B′| is defined by the condition
that its scalar product with any ket vector |A⟩ is the sum of the
scalar products of ⟨B| and ⟨B′| with |A⟩,

{⟨B|+⟨B′|}|A⟩ = ⟨B|A⟩+⟨B′|A⟩,    (5)

and the product of a bra vector ⟨B| and a number c is defined by the
condition that its scalar product with any ket vector |A⟩ is c times
the scalar product of ⟨B| with |A⟩,

{c⟨B|}|A⟩ = c⟨B|A⟩.    (6)
Equations (2) and (5) show that products of bra and ket vectors
satisfy the distributive axiom of multiplication, and equations (3)
and (6) show that multiplication by numerical factors satisfies the
usual algebraic axioms.
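The linearity rules (2), (3), (5), and (6) can be checked concretely in a finite-dimensional model, which is an illustrative assumption and no part of the abstract theory: kets become complex column vectors, bras become conjugate-transposed rows, and the scalar product becomes a matrix product.

```python
import numpy as np

# Illustrative finite-dimensional model (an assumption, not Dirac's text):
# kets are complex column vectors, bras are conjugate-transposed rows.
ket_A = np.array([[1.0 + 2.0j], [3.0 - 1.0j]])
ket_A2 = np.array([[0.5j], [2.0]])              # plays the role of |A'>
bra_B = np.array([[2.0 - 1.0j], [1.0j]]).conj().T
bra_B2 = np.array([[1.0], [-1.0j]]).conj().T    # plays the role of <B'|
c = 1.5 - 0.5j

def braket(bra, ket):
    """Complete bracket expression <B|A>: a single complex number."""
    return (bra @ ket).item()

# Rule (2): <B|{|A> + |A'>} = <B|A> + <B|A'>
assert np.isclose(braket(bra_B, ket_A + ket_A2),
                  braket(bra_B, ket_A) + braket(bra_B, ket_A2))
# Rule (3): <B|{c|A>} = c <B|A>
assert np.isclose(braket(bra_B, c * ket_A), c * braket(bra_B, ket_A))
# Rule (5): {<B| + <B'|}|A> = <B|A> + <B'|A>
assert np.isclose(braket(bra_B + bra_B2, ket_A),
                  braket(bra_B, ket_A) + braket(bra_B2, ket_A))
# Rule (6): {c <B|}|A> = c <B|A>
assert np.isclose(braket(c * bra_B, ket_A), c * braket(bra_B, ket_A))
```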
The bra vectors, as they have been here introduced, are quite a
different kind of vector from the kets, and so far there is no connexion
between them except for the existence of a scalar product of a bra
and a ket. We now make the assumption that there is a one-one
correspondence between the bras and the kets, such that the bra corresponding
to |A⟩+|A′⟩ is the sum of the bras corresponding to |A⟩ and
to |A′⟩, and the bra corresponding to c|A⟩ is c̄ times the bra corresponding
to |A⟩, c̄ being the conjugate complex number to c. We shall
use the same label to specify a ket and the corresponding bra. Thus
the bra corresponding to |A⟩ will be written ⟨A|.
The relationship between a ket vector and the corresponding bra
makes it reasonable to call one of them the conjugate imaginary of
the other. Our bra and ket vectors are complex quantities, since they
can be multiplied by complex numbers and are then of the same
nature as before, but they are complex quantities of a special kind
which cannot be split up into real and pure imaginary parts. The
usual method of getting the real part of a complex quantity, by
taking half the sum of the quantity itself and its conjugate, cannot
be applied since a bra and a ket vector are of different natures and
cannot be added together. To call attention to this distinction, we
shall use the words 'conjugate complex' to refer to numbers and
other complex quantities which can be split up into real and pure
imaginary parts, and the words 'conjugate imaginary' for bra and
ket vectors, which cannot. With the former kind of quantity, we
shall use the notation of putting a bar over one of them to get the
conjugate complex one.
On account of the one-one correspondence between bra vectors and
ket vectors, any state of our dynamical system at a particular time may
be specified by the direction of a bra vector just as well as by the direction
of a ket vector. In fact the whole theory will be symmetrical in its
essentials between bras and kets.
Given any two ket vectors |A⟩ and |B⟩, we can construct from
them a number ⟨B|A⟩ by taking the scalar product of the first with
the conjugate imaginary of the second. This number depends linearly
on |A⟩ and antilinearly on |B⟩, the antilinear dependence meaning
that the number formed from |B⟩+|B′⟩ is the sum of the numbers
formed from |B⟩ and from |B′⟩, and the number formed from c|B⟩
is c̄ times the number formed from |B⟩. There is a second way in
which we can construct a number which depends linearly on |A⟩ and
antilinearly on |B⟩, namely by forming the scalar product of |B⟩
with the conjugate imaginary of |A⟩ and taking the conjugate complex
of this scalar product. We assume that these two numbers are
always equal, i.e.

⟨B|A⟩ = \overline{⟨A|B⟩}.    (7)

Putting |B⟩ = |A⟩ here, we find that the number ⟨A|A⟩ must be
real. We make the further assumption

⟨A|A⟩ > 0,    (8)

except when |A⟩ = 0.
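In a finite-dimensional sketch where the bra ⟨A| is taken to be the conjugate transpose of the ket |A⟩ (an identification assumed here purely for illustration), assumptions (7) and (8) hold automatically:

```python
import numpy as np

# Finite-dimensional sketch (illustrative assumption): bras as conjugate
# transposes of kets, so (7) and (8) become properties of np.vdot-style
# scalar products.
rng = np.random.default_rng(0)
ket_A = rng.normal(size=(3, 1)) + 1j * rng.normal(size=(3, 1))
ket_B = rng.normal(size=(3, 1)) + 1j * rng.normal(size=(3, 1))

def braket(ket_x, ket_y):
    """<X|Y>: scalar product of |Y> with the conjugate imaginary of |X>."""
    return (ket_x.conj().T @ ket_y).item()

# Assumption (7): <B|A> is the conjugate complex of <A|B>.
assert np.isclose(braket(ket_B, ket_A), np.conj(braket(ket_A, ket_B)))
# Consequence: <A|A> is real; assumption (8): <A|A> > 0 for |A> != 0.
aa = braket(ket_A, ket_A)
assert np.isclose(aa.imag, 0.0) and aa.real > 0
```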
In ordinary space, from any two vectors one can construct a
number, their scalar product, which is a real number and is symmetrical
between them. In the space of bra vectors or the space of
ket vectors, from any two vectors one can again construct a number,
the scalar product of one with the conjugate imaginary of the
other, but this number is complex and goes over into the conjugate
complex number when the two vectors are interchanged. There is
thus a kind of perpendicularity in these spaces, which is a generalization
of the perpendicularity in ordinary space. We shall call a bra
and a ket vector orthogonal if their scalar product is zero, and two
bras or two kets will be called orthogonal if the scalar product of one
with the conjugate imaginary of the other is zero. Further, we shall
say that two states of our dynamical system are orthogonal if the
vectors corresponding to these states are orthogonal.

The length of a bra vector ⟨A| or of the conjugate imaginary ket
vector |A⟩ is defined as the square root of the positive number
⟨A|A⟩. When we are given a state and wish to set up a bra or ket
vector to correspond to it, only the direction of the vector is given
and the vector itself is undetermined to the extent of an arbitrary
numerical factor. It is often convenient to choose this numerical
factor so that the vector is of length unity. This procedure is called
normalization and the vector so chosen is said to be normalized. The
vector is not completely determined even then, since one can still
multiply it by any number of modulus unity, i.e. any number e^{iγ}
where γ is real, without changing its length. We shall call such a
number a phase factor.
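Normalization and the residual phase freedom can be sketched numerically (again in an assumed finite-dimensional model): dividing a ket by its length gives a normalized vector, and multiplying by e^{iγ} changes the vector without changing its length.

```python
import numpy as np

# Sketch (illustrative): normalize a ket to unit length; a phase factor
# exp(i*gamma) alters the vector but not its length.
ket = np.array([3.0 + 4.0j, 0.0, 1.0j])
length = np.sqrt(np.vdot(ket, ket).real)     # square root of <A|A>
ket_normalized = ket / length
assert np.isclose(np.vdot(ket_normalized, ket_normalized).real, 1.0)

gamma = 0.7                                  # any real number
rephased = np.exp(1j * gamma) * ket_normalized
assert np.isclose(np.vdot(rephased, rephased).real, 1.0)  # length unchanged
```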
The foregoing assumptions give the complete scheme of relations
between the states of a dynamical system at a particular time. The
relations appear in mathematical form, but they imply physical
conditions, which will lead to results expressible in terms of observations
when the theory is developed further. For instance, if two states
are orthogonal, it means at present simply a certain equation in our
formalism, but this equation implies a definite physical relationship
between the states, which further developments of the theory will
enable us to interpret in terms of observational results (see the
bottom of p. 35).
II. DYNAMICAL VARIABLES AND OBSERVABLES

7. Linear operators
In the preceding section we considered a number which is a linear
function of a ket vector, and this led to the concept of a bra vector.
We shall now consider a ket vector which is a linear function of a
ket vector, and this will lead to the concept of a linear operator.

Suppose we have a ket |F⟩ which is a function of a ket |A⟩, i.e.
to each ket |A⟩ there corresponds one ket |F⟩, and suppose further
that the function is a linear one, which means that the |F⟩ corresponding
to |A⟩+|A′⟩ is the sum of the |F⟩'s corresponding to |A⟩
and to |A′⟩, and the |F⟩ corresponding to c|A⟩ is c times the |F⟩
corresponding to |A⟩, c being any numerical factor. Under these
conditions, we may look upon the passage from |A⟩ to |F⟩ as the
application of a linear operator to |A⟩. Introducing the symbol α
for the linear operator, we may write

|F⟩ = α|A⟩,

in which the result of α operating on |A⟩ is written like a product
of α with |A⟩. We make the rule that in such products the ket vector
must always be put on the right of the linear operator. The above
conditions of linearity may now be expressed by the equations

α{|A⟩+|A′⟩} = α|A⟩+α|A′⟩,
α{c|A⟩} = cα|A⟩.    (1)
A linear operator is considered to be completely defined when the
result of its application to every ket vector is given. Thus a linear
operator is to be considered zero if the result of its application to every
ket vanishes, and two linear operators are to be considered equal if
they produce the same result when applied to every ket.

Linear operators can be added together, the sum of two linear
operators being defined to be that linear operator which, operating
on any ket, produces the sum of what the two linear operators
separately would produce. Thus α+β is defined by

{α+β}|A⟩ = α|A⟩+β|A⟩    (2)

for any |A⟩. Equation (2) and the first of equations (1) show that
products of linear operators with ket vectors satisfy the distributive
axiom of multiplication.
Linear operators can also be multiplied together, the product of
two linear operators being defined as that linear operator, the application
of which to any ket produces the same result as the application
of the two linear operators successively. Thus the product αβ is
defined as the linear operator which, operating on any ket |A⟩,
changes it into that ket which one would get by operating first on
|A⟩ with β, and then on the result of the first operation with α. In
symbols

{αβ}|A⟩ = α{β|A⟩}.

This definition appears as the associative axiom of multiplication for
the triple product of α, β, and |A⟩, and allows us to write this triple
product as αβ|A⟩ without brackets. However, this triple product is
in general not the same as what we should get if we operated on |A⟩
first with α and then with β, i.e. in general αβ|A⟩ differs from βα|A⟩,
so that in general αβ must differ from βα. The commutative axiom of
multiplication does not hold for linear operators. It may happen as a
special case that two linear operators ξ and η are such that ξη and
ηξ are equal. In this case we say that ξ commutes with η, or that ξ
and η commute.
By repeated applications of the above processes of adding and
multiplying linear operators, one can form sums and products of
more than two of them, and one can proceed to build up an algebra
with them. In this algebra the commutative axiom of multiplication
does not hold, and also the product of two linear operators may
vanish without either factor vanishing. But all the other axioms of
ordinary algebra, including the associative and distributive axioms
of multiplication, are valid, as may easily be verified.

If we take a number k and multiply it into ket vectors, it appears
as a linear operator operating on ket vectors, the conditions (1) being
fulfilled with k substituted for α. A number is thus a special case of
a linear operator. It has the property that it commutes with all linear
operators and this property distinguishes it from a general linear
operator.
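A finite-dimensional sketch (an illustrative assumption: linear operators modelled as complex matrices) makes the failure of the commutative axiom, and the special role of numbers, concrete:

```python
import numpy as np

# Sketch: linear operators as complex matrices (a finite-dimensional
# stand-in for the abstract operators, assumed only for illustration).
alpha = np.array([[0, 1], [0, 0]], dtype=complex)
beta = np.array([[0, 0], [1, 0]], dtype=complex)
ket = np.array([[1.0], [2.0 + 1.0j]])

# Distributive axiom, equation (2): {alpha + beta}|A> = alpha|A> + beta|A>
assert np.allclose((alpha + beta) @ ket, alpha @ ket + beta @ ket)
# The commutative axiom fails: alpha*beta != beta*alpha in general.
assert not np.allclose(alpha @ beta, beta @ alpha)
# Here even the product of two nonzero operators can vanish:
assert np.allclose((alpha @ alpha), 0)
# A number k acts as a special linear operator commuting with everything.
k = 2.0 - 1.0j
assert np.allclose(k * (alpha @ ket), alpha @ (k * ket))
```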
So far we have considered linear operators operating only on ket
vectors. We can give a meaning to their operating also on bra vectors,
in the following way. Take the scalar product of any bra ⟨B| with
the ket α|A⟩. This scalar product is a number which depends
linearly on |A⟩ and therefore, from the definition of bras, it may be
considered as the scalar product of |A⟩ with some bra. The bra thus
defined depends linearly on ⟨B|, so we may look upon it as the result of
some linear operator applied to ⟨B|. This linear operator is uniquely
determined by the original linear operator α and may reasonably be
called the same linear operator operating on a bra. In this way our
linear operators are made capable of operating on bra vectors.
A suitable notation to use for the resulting bra when α operates on
the bra ⟨B| is ⟨B|α, as in this notation the equation which defines
⟨B|α is

{⟨B|α}|A⟩ = ⟨B|{α|A⟩}    (3)

for any |A⟩, which simply expresses the associative axiom of multiplication
for the triple product of ⟨B|, α, and |A⟩. We therefore
make the general rule that in a product of a bra and a linear operator,
the bra must always be put on the left. We can now write the triple
product of ⟨B|, α, and |A⟩ simply as ⟨B|α|A⟩ without brackets. It
may easily be verified that the distributive axiom of multiplication
holds for products of bras and linear operators just as well as for
products of linear operators and kets.
There is one further kind of product which has a meaning in our
scheme, namely the product of a ket vector and a bra vector with
the ket on the left, such as |A⟩⟨B|. To examine this product, let us
multiply it into an arbitrary ket |P⟩, putting the ket on the right,
and assume the associative axiom of multiplication. The product is
then |A⟩⟨B|P⟩, which is another ket, namely |A⟩ multiplied by the
number ⟨B|P⟩, and this ket depends linearly on the ket |P⟩. Thus
|A⟩⟨B| appears as a linear operator that can operate on kets. It
can also operate on bras, its product with a bra ⟨Q| on the left being
⟨Q|A⟩⟨B|, which is the number ⟨Q|A⟩ times the bra ⟨B|. The
product |A⟩⟨B| is to be sharply distinguished from the product
⟨B|A⟩ of the same factors in the reverse order, the latter product
being, of course, a number.
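In the matrix sketch (an illustrative assumption) the product |A⟩⟨B| becomes an outer product, and its action on kets and bras can be checked directly:

```python
import numpy as np

# Sketch: |A><B| as an outer product matrix, acting on kets from the
# left and on bras from the right (finite-dimensional illustration only).
ket_A = np.array([1.0 + 1.0j, 2.0])
ket_B = np.array([0.5, -1.0j])
ket_P = np.array([2.0, 1.0 + 0.5j])

op = np.outer(ket_A, ket_B.conj())        # the operator |A><B|
# |A><B|P> is |A> multiplied by the number <B|P>
assert np.allclose(op @ ket_P, ket_A * np.vdot(ket_B, ket_P))
# <Q|A><B| is the number <Q|A> times the bra <B|
bra_Q = ket_P.conj()                      # take <Q| conjugate to |P>
assert np.allclose(bra_Q @ op, np.vdot(ket_P, ket_A) * ket_B.conj())
```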
We now have a complete algebraic scheme involving three kinds
of quantities, bra vectors, ket vectors, and linear operators. They can
be multiplied together in the various ways discussed above, and the
associative and distributive axioms of multiplication always hold,
but the commutative axiom of multiplication does not hold. In this
general scheme we still have the rules of notation of the preceding
section, that any complete bracket expression, containing ⟨ on the
left and ⟩ on the right, denotes a number, while any incomplete
bracket expression, containing only ⟨ or ⟩, denotes a vector.
With regard to the physical significance of the scheme, we have
already assumed that the bra vectors and ket vectors, or rather the
directions of these vectors, correspond to the states of a dynamical
system at a particular time. We now make the further assumption
that the linear operators correspond to the dynamical variables at that
time. By dynamical variables are meant quantities such as the
coordinates and the components of velocity, momentum and angular
momentum of particles, and functions of these quantities, in fact
the variables in terms of which classical mechanics is built up. The
new assumption requires that these quantities shall occur also in
quantum mechanics, but with the striking difference that they are
now subject to an algebra in which the commutative axiom of multiplication
does not hold.
This different algebra for the dynamical variables is one of the
most important ways in which quantum mechanics differs from
classical mechanics. We shall see later on that, in spite of this fundamental
difference, the dynamical variables of quantum mechanics
still have many properties in common with their classical counterparts,
and it will be possible to build up a theory of them closely
analogous to the classical theory and forming a beautiful generalization
of it.

It is convenient to use the same letter to denote a dynamical
variable and the corresponding linear operator. In fact, we may consider
a dynamical variable and the corresponding linear operator to
be both the same thing, without getting into confusion.
8. Conjugate relations

Our linear operators are complex quantities, since one can multiply
them by complex numbers and get other quantities of the same nature.
Hence they must correspond in general to complex dynamical variables,
i.e. to complex functions of the coordinates, velocities, etc. We
need some further development of the theory to see what kind of
linear operator corresponds to a real dynamical variable.

Consider the ket which is the conjugate imaginary of ⟨P|α. This
ket depends antilinearly on ⟨P| and thus depends linearly on |P⟩.
It may therefore be considered as the result of some linear operator
operating on |P⟩. This linear operator is called the adjoint of α and
we shall denote it by ᾱ. With this notation, the conjugate imaginary
of ⟨P|α is ᾱ|P⟩.
In formula (7) of Chapter I put ⟨P|α for ⟨A| and its conjugate
imaginary ᾱ|P⟩ for |A⟩. The result is

⟨B|ᾱ|P⟩ = \overline{⟨P|α|B⟩}.    (4)

This is a general formula holding for any ket vectors |B⟩, |P⟩ and
any linear operator α, and it expresses one of the most frequently
used properties of the adjoint.

Putting ᾱ for α in (4), we get

⟨B|\overline{ᾱ}|P⟩ = \overline{⟨P|ᾱ|B⟩} = ⟨B|α|P⟩,

by using (4) again with |P⟩ and |B⟩ interchanged. This holds for
any ket |P⟩, so we can infer from (4) of Chapter I,

⟨B|\overline{ᾱ} = ⟨B|α,

and since this holds for any bra vector ⟨B|, we can infer

\overline{ᾱ} = α.

Thus the adjoint of the adjoint of a linear operator is the original linear
operator. This property of the adjoint makes it like the conjugate
complex of a number, and it is easily verified that in the special case
when the linear operator is a number, the adjoint linear operator is
the conjugate complex number. Thus it is reasonable to assume that
the adjoint of a linear operator corresponds to the conjugate complex of
a dynamical variable. With this physical significance for the adjoint
of a linear operator, we may call the adjoint alternatively the conjugate
complex linear operator, which conforms with our notation ᾱ.
A linear operator may equal its adjoint, and is then called self-adjoint.
It corresponds to a real dynamical variable, so it may be
called alternatively a real linear operator. Any linear operator may
be split up into a real part and a pure imaginary part. For this
reason the words 'conjugate complex' are applicable to linear
operators and not the words 'conjugate imaginary'.
The conjugate complex of the sum of two linear operators is
obviously the sum of their conjugate complexes. To get the conjugate
complex of the product of two linear operators α and β, we apply
formula (7) of Chapter I with

⟨A| = ⟨P|α,  ⟨B| = ⟨Q|β̄,
so that  |A⟩ = ᾱ|P⟩,  |B⟩ = β|Q⟩.

The result is

⟨Q|β̄ᾱ|P⟩ = \overline{⟨P|αβ|Q⟩} = ⟨Q|\overline{αβ}|P⟩

from (4). Since this holds for any |P⟩ and ⟨Q|, we can infer that

\overline{αβ} = β̄ᾱ.    (5)

Thus the conjugate complex of the product of two linear operators equals
the product of the conjugate complexes of the factors in the reverse order.
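In matrix terms the adjoint is the conjugate transpose (an identification assumed here for illustration, not stated in the text), and rules (4) and (5), together with the adjoint-of-the-adjoint property, can be verified numerically:

```python
import numpy as np

# Sketch: adjoint of an operator modelled as the conjugate transpose of
# its matrix (illustrative finite-dimensional assumption).
rng = np.random.default_rng(1)
alpha = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
beta = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

adj = lambda m: m.conj().T
# Rule (5): adjoint of (alpha beta) is adjoint(beta) adjoint(alpha).
assert np.allclose(adj(alpha @ beta), adj(beta) @ adj(alpha))
# Adjoint of the adjoint is the original operator.
assert np.allclose(adj(adj(alpha)), alpha)
# Rule (4): <B|adj(alpha)|P> equals the conjugate of <P|alpha|B>.
ket_B = rng.normal(size=3) + 1j * rng.normal(size=3)
ket_P = rng.normal(size=3) + 1j * rng.normal(size=3)
lhs = np.vdot(ket_B, adj(alpha) @ ket_P)
rhs = np.conj(np.vdot(ket_P, alpha @ ket_B))
assert np.isclose(lhs, rhs)
```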
As simple examples of this result, it should be noted that, if ξ and
η are real, in general ξη is not real. This is an important difference
from classical mechanics. However, ξη+ηξ is real, and so is i(ξη−ηξ).
Only when ξ and η commute is ξη itself also real. Further, if ξ is real,
then so is ξ² and, more generally, ξⁿ with n any positive integer.

We may get the conjugate complex of the product of three linear
operators by successive applications of the rule (5) for the conjugate
complex of the product of two of them. We have

\overline{αβγ} = \overline{α(βγ)} = \overline{βγ} ᾱ = γ̄β̄ᾱ,    (6)

so the conjugate complex of the product of three linear operators
equals the product of the conjugate complexes of the factors in the
reverse order. The rule may easily be extended to the product of any
number of linear operators.
In the preceding section we saw that the product |A⟩⟨B| is a linear
operator. We may get its conjugate complex by referring directly to
the definition of the adjoint. Multiplying |A⟩⟨B| into a general bra
⟨P| we get ⟨P|A⟩⟨B|, whose conjugate imaginary ket is

\overline{⟨P|A⟩} |B⟩ = ⟨A|P⟩|B⟩ = |B⟩⟨A|P⟩.

Hence

\overline{|A⟩⟨B|} = |B⟩⟨A|.    (7)
We now have several rules concerning conjugate complexes and
conjugate imaginaries of products, namely equation (7) of Chapter I,
equations (4), (5), (6), (7) of this chapter, and the rule that the
conjugate imaginary of ⟨P|α is ᾱ|P⟩. These rules can all be summed
up in a single comprehensive rule, the conjugate complex or conjugate
imaginary of any product of bra vectors, ket vectors, and linear operators
is obtained by taking the conjugate complex or conjugate imaginary of
each factor and reversing the order of all the factors. The rule is easily
verified to hold quite generally, also for the cases not explicitly given
above.
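The comprehensive rule, conjugate each factor and reverse the order, can be checked for the case of equation (7) in the matrix sketch (conjugate transposes standing in for adjoints, an illustrative assumption):

```python
import numpy as np

# Sketch of the comprehensive rule for the product |A><B| of equation (7):
# conjugate each factor and reverse the order, giving |B><A|.
rng = np.random.default_rng(3)
ket_A = rng.normal(size=3) + 1j * rng.normal(size=3)
ket_B = rng.normal(size=3) + 1j * rng.normal(size=3)

op = np.outer(ket_A, ket_B.conj())            # the operator |A><B|
adjoint = op.conj().T                         # its conjugate complex
assert np.allclose(adjoint, np.outer(ket_B, ket_A.conj()))   # equals |B><A|
```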
THEOREM. If ξ is a real linear operator and

ξᵐ|P⟩ = 0    (8)

for a particular ket |P⟩, m being a positive integer, then

ξ|P⟩ = 0.
To prove the theorem, take first the case when m = 2. Equation
(8) then gives

⟨P|ξ²|P⟩ = 0,

showing that the ket ξ|P⟩ multiplied by the conjugate imaginary bra
⟨P|ξ is zero. From the assumption (8) of Chapter I with ξ|P⟩ for |A⟩,
we see that ξ|P⟩ must be zero. Thus the theorem is proved for m = 2.

Now take m > 2 and put

ξᵐ⁻²|P⟩ = |Q⟩.

Equation (8) now gives ξ²|Q⟩ = 0. Applying the theorem for m = 2,
we get

ξ|Q⟩ = 0
or  ξᵐ⁻¹|P⟩ = 0.    (9)

By repeating the process by which equation (9) is obtained from
(8), we obtain successively

ξᵐ⁻²|P⟩ = 0,  ξᵐ⁻³|P⟩ = 0,  ...,  ξ²|P⟩ = 0,  ξ|P⟩ = 0,

and so the theorem is proved generally.
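The key step of the proof, that ⟨P|ξ²|P⟩ is the squared length of the ket ξ|P⟩ when ξ is real, can be illustrated numerically (with a real symmetric matrix standing in for a self-adjoint operator, an assumption made for the sketch):

```python
import numpy as np

# Sketch: for a self-adjoint xi, <P|xi^2|P> equals the squared length of
# the ket xi|P>, so xi^2|P> = 0 forces xi|P> = 0.
xi = np.array([[1.0, 1.0], [1.0, 1.0]])   # real symmetric, hence self-adjoint
v = np.array([0.3, 2.0])
assert np.isclose(np.vdot(v, xi @ xi @ v).real,
                  np.linalg.norm(xi @ v) ** 2)

ket_P = np.array([1.0, -1.0])             # chosen so that xi|P> vanishes
assert np.allclose(xi @ xi @ ket_P, 0)    # xi^2|P> = 0 ...
assert np.allclose(xi @ ket_P, 0)         # ... indeed accompanies xi|P> = 0
```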
9. Eigenvalues and eigenvectors
We must make a further development of the theory of linear
operators, consisting in studying the equation

α|P⟩ = a|P⟩,    (10)

where α is a linear operator and a is a number. This equation usually
presents itself in the form that α is a known linear operator and the
number a and the ket |P⟩ are unknowns, which we have to try to
choose so as to satisfy (10), ignoring the trivial solution |P⟩ = 0.

Equation (10) means that the linear operator α applied to the ket
|P⟩ just multiplies this ket by a numerical factor without changing
its direction, or else multiplies it by the factor zero, so that it ceases
to have a direction. This same α applied to other kets will, of course,
in general change both their lengths and their directions. It should
be noticed that only the direction of |P⟩ is of importance in equation
(10). If one multiplies |P⟩ by any number not zero, it will not affect
the question of whether (10) is satisfied or not.
Together with equation (10), we should consider also the conjugate
imaginary form of equation

⟨Q|α = b⟨Q|,    (11)

where b is a number. Here the unknowns are the number b and the
non-zero bra ⟨Q|. Equations (10) and (11) are of such fundamental
importance in the theory that it is desirable to have some special
words to describe the relationships between the quantities involved.
If (10) is satisfied, we shall call a an eigenvalue† of the linear operator
α, or of the corresponding dynamical variable, and we shall call |P⟩
an eigenket of the linear operator or dynamical variable. Further, we
shall say that the eigenket |P⟩ belongs to the eigenvalue a. Similarly,
if (11) is satisfied, we shall call b an eigenvalue of α and ⟨Q| an
eigenbra belonging to this eigenvalue. The words eigenvalue, eigenket,
eigenbra have a meaning, of course, only with reference to a linear
operator or dynamical variable.
Using this terminology, we can assert that, if an eigenket of α is
multiplied by any number not zero, the resulting ket is also an
eigenket and belongs to the same eigenvalue as the original one.
It is possible to have two or more independent eigenkets of a linear
operator belonging to the same eigenvalue of that linear operator,
e.g. equation (10) may have several solutions, |P1⟩, |P2⟩, |P3⟩,... say,
all holding for the same value of a, with the various eigenkets |P1⟩,
|P2⟩, |P3⟩,... independent. In this case it is evident that any linear
combination of the eigenkets is another eigenket belonging to the
same eigenvalue of the linear operator, e.g.

c₁|P1⟩+c₂|P2⟩+c₃|P3⟩+...

is another solution of (10), where c₁, c₂, c₃,... are any numbers.
In the special case when the linear operator α of equations (10) and
(11) is a number, k say, it is obvious that any ket |P⟩ and bra ⟨Q|
will satisfy these equations provided a and b equal k. Thus a number
considered as a linear operator has just one eigenvalue, and any ket
is an eigenket and any bra is an eigenbra, belonging to this eigenvalue.
The theory of eigenvalues and eigenvectors of a linear operator α
which is not real is not of much use for quantum mechanics. We
shall therefore confine ourselves to real linear operators for the further
development of the theory. Putting for α the real linear operator ξ,
we have instead of equations (10) and (11)

ξ|P⟩ = a|P⟩,    (12)
⟨Q|ξ = b⟨Q|.    (13)

† The word 'proper' is sometimes used instead of 'eigen', but this is not satisfactory
as the words 'proper' and 'improper' are often used with other meanings. For example,
in §§ 15 and 46 the words 'improper function' and 'proper-energy' are used.
Three important results can now be readily deduced.

(i) The eigenvalues are all real numbers. To prove that a satisfying
(12) is real, we multiply (12) by the bra ⟨P| on the left, obtaining

⟨P|ξ|P⟩ = a⟨P|P⟩.

Now from equation (4) with ⟨B| replaced by ⟨P| and α replaced by
the real linear operator ξ, we see that the number ⟨P|ξ|P⟩ must be
real, and from (8) of § 6, ⟨P|P⟩ must be real and not zero. Hence a
is real. Similarly, by multiplying (13) by |Q⟩ on the right, we can
prove that b is real.
Suppose we have a solution of (12) and we form the conjugate
imaginary equation, which will read

⟨P|ξ = a⟨P|

in view of the reality of ξ and a. This conjugate imaginary equation
now provides a solution of (13), with ⟨Q| = ⟨P| and b = a. Thus
we can infer

(ii) The eigenvalues associated with eigenkets are the same as the
eigenvalues associated with eigenbras.

(iii) The conjugate imaginary of any eigenket is an eigenbra belonging
to the same eigenvalue, and conversely. This last result makes it reasonable
to call the state corresponding to any eigenket or to the conjugate
imaginary eigenbra an eigenstate of the real dynamical variable ξ.
Eigenvalues and eigenvectors of various real dynamical variables
are used very extensively in quantum mechanics, so it is desirable
to have some systematic notation for labelling them. The following
is suitable for most purposes. If ξ is a real dynamical variable, we
call its eigenvalues ξ′, ξ″, ξ‴, etc. Thus we have a letter by itself
denoting a real dynamical variable or a real linear operator, and the
same letter with primes or an index attached denoting a number,
namely an eigenvalue of what the letter by itself denotes. An eigenvector
may now be labelled by the eigenvalue to which it belongs.
Thus |ξ′⟩ denotes an eigenket belonging to the eigenvalue ξ′ of the
dynamical variable ξ. If in a piece of work we deal with more than
one eigenket belonging to the same eigenvalue of a dynamical variable,
we may distinguish them one from another by means of a further
label, or possibly of more than one further label. Thus, if we are
dealing with two eigenkets belonging to the same eigenvalue of ξ,
we may call them |ξ′1⟩ and |ξ′2⟩.
THEOREM. Two eigenvectors of a real dynamical variable belonging
to different eigenvalues are orthogonal.

To prove the theorem, let |ξ′⟩ and |ξ″⟩ be two eigenkets of the real
dynamical variable ξ, belonging to the eigenvalues ξ′ and ξ″ respectively.
Then we have the equations

ξ|ξ′⟩ = ξ′|ξ′⟩,    (14)
ξ|ξ″⟩ = ξ″|ξ″⟩.    (15)

Taking the conjugate imaginary of (14) we get

⟨ξ′|ξ = ξ′⟨ξ′|.

Multiplying this by |ξ″⟩ on the right gives

⟨ξ′|ξ|ξ″⟩ = ξ′⟨ξ′|ξ″⟩

and multiplying (15) by ⟨ξ′| on the left gives

⟨ξ′|ξ|ξ″⟩ = ξ″⟨ξ′|ξ″⟩.

Hence, subtracting,

(ξ′−ξ″)⟨ξ′|ξ″⟩ = 0,    (16)

showing that, if ξ′ ≠ ξ″, ⟨ξ′|ξ″⟩ = 0 and the two eigenvectors |ξ′⟩
and |ξ″⟩ are orthogonal. This theorem will be referred to as the
orthogonality theorem.
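Results (i) and the orthogonality theorem can be illustrated numerically with a Hermitian matrix standing in for a real linear operator (an assumption made for the sketch; `numpy.linalg.eigh` returns real eigenvalues and an orthonormal set of eigenvectors):

```python
import numpy as np

# Sketch: a self-adjoint (Hermitian) matrix as a real linear operator.
rng = np.random.default_rng(2)
m = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
xi = m + m.conj().T                          # self-adjoint by construction

eigenvalues, eigenkets = np.linalg.eigh(xi)  # columns of eigenkets
assert np.allclose(eigenvalues.imag, 0)      # (i) eigenvalues are real
# Orthogonality theorem: eigenkets belonging to different eigenvalues are
# orthogonal; eigh returns an orthonormal set, so the Gram matrix is 1.
gram = eigenkets.conj().T @ eigenkets
assert np.allclose(gram, np.eye(4))
```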
We have been discussing properties of the eigenvalues and eigenvectors
of a real linear operator, but have not yet considered the
question of whether, for a given real linear operator, any eigenvalues
and eigenvectors exist, and if so, how to find them. This question
is in general very difficult to answer. There is one useful special case,
however, which is quite tractable, namely when the real linear
operator, ξ say, satisfies an algebraic equation

φ(ξ) ≡ ξⁿ + a₁ξⁿ⁻¹ + a₂ξⁿ⁻² + ... + aₙ = 0,    (17)

the coefficients a being numbers. This equation means, of course,
that the linear operator φ(ξ) produces the result zero when applied
to any ket vector or to any bra vector.

Let (17) be the simplest algebraic equation that ξ satisfies. Then
it will be shown that

(α) The number of eigenvalues of ξ is n.

(β) There are so many eigenkets of ξ that any ket whatever can
be expressed as a sum of such eigenkets.

The algebraic form φ(ξ) can be factorized into n linear factors, the
result being

φ(ξ) = (ξ−c₁)(ξ−c₂)(ξ−c₃)...(ξ−cₙ),    (18)
say, the c's being numbers, not assumed to be all different. This
factorization can be performed with ξ a linear operator just as well
as with ξ an ordinary algebraic variable, since there is nothing
occurring in (18) that does not commute with ξ. Let the quotient
when φ(ξ) is divided by (ξ−cᵣ) be χᵣ(ξ), so that

φ(ξ) = (ξ−cᵣ)χᵣ(ξ)    (r = 1, 2, 3, ..., n).

Then, for any ket |P⟩,

(ξ−cᵣ)χᵣ(ξ)|P⟩ = φ(ξ)|P⟩ = 0.    (19)

Now χᵣ(ξ)|P⟩ cannot vanish for every ket |P⟩, as otherwise χᵣ(ξ)
itself would vanish and we should have ξ satisfying an algebraic
equation of degree n−1, which would contradict the assumption that
(17) is the simplest equation that ξ satisfies. If we choose |P⟩ so that
χᵣ(ξ)|P⟩ does not vanish, then equation (19) shows that χᵣ(ξ)|P⟩ is
an eigenket of ξ, belonging to the eigenvalue cᵣ. The argument holds
for each value of r from 1 to n, and hence each of the c's is an eigenvalue
of ξ. No other number can be an eigenvalue of ξ, since if ξ′ is
any eigenvalue, belonging to an eigenket |ξ′⟩,

ξ|ξ′⟩ = ξ′|ξ′⟩

and we can deduce

φ(ξ)|ξ′⟩ = φ(ξ′)|ξ′⟩,

and since the left-hand side vanishes we must have φ(ξ′) = 0.
To complete the proof of (α) we must verify that the c's are all
different. Suppose the c's are not all different and cₛ occurs m times
say, with m > 1. Then φ(ξ) is of the form

φ(ξ) = (ξ−cₛ)ᵐθ(ξ)

with θ(ξ) a rational integral function of ξ. Equation (17) now gives us

(ξ−cₛ)ᵐθ(ξ)|A⟩ = 0    (20)

for any ket |A⟩. Since cₛ is an eigenvalue of ξ it must be real, so that
ξ−cₛ is a real linear operator. Equation (20) is now of the same form
as equation (8) with ξ−cₛ for ξ and θ(ξ)|A⟩ for |P⟩. From the theorem
connected with equation (8) we can infer that

(ξ−cₛ)θ(ξ)|A⟩ = 0.

Since the ket |A⟩ is arbitrary,

(ξ−cₛ)θ(ξ) = 0,

which contradicts the assumption that (17) is the simplest equation
that ξ satisfies. Hence the c's are all different and (α) is proved.
Let χᵣ(cᵣ) be the number obtained when cᵣ is substituted for ξ in
the algebraic expression χᵣ(ξ). Since the c's are all different, χᵣ(cᵣ)
cannot vanish. Consider now the expression

1 − Σᵣ χᵣ(ξ)/χᵣ(cᵣ).    (21)
If ce is substituted for 6 here, every term in the sum vanishes except
the one for which r = s, since x,(f) contains (&c,) as a factor when
r # 8, and the term for which r = s is unity, so the whole expression
vanishes. Thus the expression (21) vanishes when 4 is put equal to
any of the n numbers ci,cz,...,c,. Since, however, the expression
is only of degree n- 1 in f, it must vanish identically. If we now
apply the linear Operator (21) to an arbitrary ket 1 P) and equate
the result to Zero, we get
IQ = 7 &jx.(s)Ip~. (22)
Esch term in the sum on the right here is, according to (19), an
eigenket of f, if it does not vanish. Equation (22) thus expresses the
arbitrary ket 1 P) as a sum of eigenkets of f, and thus (/3) is proved.
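The resolution (22) can be checked numerically in a finite number of dimensions. The following sketch (the matrix and roots are illustrative stand-ins, not taken from the text) builds the operators χ_r(ξ)/χ_r(c_r) for a Hermitian matrix satisfying an algebraic equation with distinct roots, and verifies that they sum to unity and split an arbitrary ket into eigenkets:

```python
import numpy as np

# Illustrative observable: Hermitian, satisfying (xi-1)(xi-2)(xi-5) = 0
xi = np.diag([1.0, 2.0, 2.0, 5.0])
cs = [1.0, 2.0, 5.0]                 # the distinct roots c_1, ..., c_n
I = np.eye(4)

def chi_op(r):
    """chi_r(xi): product over s != r of (xi - c_s), as an operator."""
    out = I.copy()
    for s, c in enumerate(cs):
        if s != r:
            out = out @ (xi - c * I)
    return out

def chi_num(r, x):
    """chi_r(x) as a number, for x a root c_r."""
    out = 1.0
    for s, c in enumerate(cs):
        if s != r:
            out *= (x - c)
    return out

# Equation (21): 1 - sum_r chi_r(xi)/chi_r(c_r) vanishes identically,
# i.e. the sum of the operators equals the unit operator.
total = sum(chi_op(r) / chi_num(r, cs[r]) for r in range(len(cs)))
assert np.allclose(total, I)

# Equation (22): an arbitrary ket |P> is resolved into eigenkets of xi.
P = np.array([1.0, -2.0, 0.5, 3.0])
parts = [chi_op(r) @ P / chi_num(r, cs[r]) for r in range(len(cs))]
assert np.allclose(sum(parts), P)
for r, part in enumerate(parts):
    assert np.allclose(xi @ part, cs[r] * part)   # eigenket with eigenvalue c_r
```

Each non-vanishing term chi_op(r) @ P / chi_num(r, cs[r]) is an (unnormalized) eigenket belonging to c_r, exactly as the proof of (β) asserts.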
As a simple example we may consider a real linear operator σ that
satisfies the equation
σ² = 1. (23)
Then σ has the two eigenvalues 1 and −1. Any ket |P⟩ can be
expressed as
|P⟩ = ½(1 + σ)|P⟩ + ½(1 − σ)|P⟩.
It is easily verified that the two terms on the right here are eigenkets
of σ, belonging to the eigenvalues 1 and −1 respectively, when they
do not vanish.
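A two-dimensional matrix whose square is the unit matrix makes this concrete. The sketch below uses a Pauli-like matrix as a hypothetical stand-in for σ and verifies the decomposition into the two eigenkets:

```python
import numpy as np

# A real linear operator with u @ u = 1, standing in for sigma of the text
u = np.array([[0.0, 1.0],
              [1.0, 0.0]])
assert np.allclose(u @ u, np.eye(2))

P = np.array([0.6, 0.8])                 # an arbitrary ket |P>
plus = 0.5 * (np.eye(2) + u) @ P         # (1/2)(1 + u)|P>
minus = 0.5 * (np.eye(2) - u) @ P        # (1/2)(1 - u)|P>

assert np.allclose(plus + minus, P)      # the two terms recover |P>
assert np.allclose(u @ plus, plus)       # eigenket, eigenvalue +1
assert np.allclose(u @ minus, -minus)    # eigenket, eigenvalue -1
```

The two operators ½(1 ± u) are exactly the projectors χ_r(u)/χ_r(c_r) of (21) for the roots c = ±1.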
10. Observables
We have made a number of assumptions about the way in which
states and dynamical variables are to be represented mathematically
in the theory. These assumptions are not, by themselves, laws of
nature, but become laws of nature when we make some further
assumptions that provide a physical interpretation of the theory.
Such further assumptions must take the form of establishing connexions
between the results of observations, on one hand, and the
equations of the mathematical formalism on the other.
When we make an observation we measure some dynamical variable.
It is obvious physically that the result of such a measurement must
always be a real number, so we should expect that any dynamical
variable that we can measure must be a real dynamical variable.
One might think one could measure a complex dynamical variable
by measuring separately its real and pure imaginary parts. But this
would involve two measurements or two observations, which would
be all right in classical mechanics, but would not do in quantum
mechanics, where two observations in general interfere with one
another: it is not in general permissible to consider that two observations
can be made exactly simultaneously, and if they are made in
quick succession the first will usually disturb the state of the system
and introduce an indeterminacy that will affect the second. We
therefore have to restrict the dynamical variables that we can
measure to be real, the condition for this in quantum mechanics
being as given in §8. Not every real dynamical variable can be
measured, however. A further restriction is needed, as we shall see
later.
We now make some assumptions for the physical interpretation of
the theory. If the dynamical system is in an eigenstate of a real
dynamical variable ξ, belonging to the eigenvalue ξ', then a measurement
of ξ will certainly give as result the number ξ'. Conversely, if the system
is in a state such that a measurement of a real dynamical variable ξ is
certain to give one particular result (instead of giving one or other of
several possible results according to a probability law, as is in general
the case), then the state is an eigenstate of ξ and the result of the measurement
is the eigenvalue of ξ to which this eigenstate belongs. These
assumptions are reasonable on account of the eigenvalues of real
linear operators being always real numbers.
Some of the immediate consequences of the assumptions will be
noted. If we have two or more eigenstates of a real dynamical
variable ξ belonging to the same eigenvalue ξ', then any state
formed by superposition of them will also be an eigenstate of ξ
belonging to the eigenvalue ξ'. We can infer that if we have two or
more states for which a measurement of ξ is certain to give the result
ξ', then for any state formed by superposition of them a measurement
of ξ will still be certain to give the result ξ'. This gives us some insight
into the physical significance of superposition of states. Again, two
eigenstates of ξ belonging to different eigenvalues are orthogonal.
We can infer that two states for which a measurement of ξ is certain
to give two different results are orthogonal. This gives us some
insight into the physical significance of orthogonal states.
When we measure a real dynamical variable ξ, the disturbance
involved in the act of measurement causes a jump in the state of the
dynamical system. From physical continuity, if we make a second
measurement of the same dynamical variable ξ immediately after
the first, the result of the second measurement must be the same as
that of the first. Thus after the first measurement has been made,
there is no indeterminacy in the result of the second. Hence, after
the first measurement has been made, the system is in an eigenstate
of the dynamical variable ξ, the eigenvalue it belongs to being equal
to the result of the first measurement. This conclusion must still hold
if the second measurement is not actually made. In this way we see
that a measurement always causes the system to jump into an eigenstate
of the dynamical variable that is being measured, the eigenvalue
this eigenstate belongs to being equal to the result of the measurement.
We can infer that, with the dynamical system in any state, any
result of a measurement of a real dynamical variable is one of its eigenvalues.
Conversely, every eigenvalue is a possible result of a measurement
of the dynamical variable for some state of the system, since it is
certainly the result if the state is an eigenstate belonging to this
eigenvalue. This gives us the physical significance of eigenvalues.
The set of eigenvalues of a real dynamical variable are just the
possible results of measurements of that dynamical variable and the
calculation of eigenvalues is for this reason an important problem.
Another assumption we make connected with the physical inter-
pretation of the theory is that, if a certain real dynamical variable
ξ is measured with the system in a particular state, the states into which
the system may jump on account of the measurement are such that the
original state is dependent on them. Now these states into which
the system may jump are all eigenstates of ξ, and hence the original
state is dependent on eigenstates of ξ. But the original state may be
any state, so we can conclude that any state is dependent on eigenstates
of ξ. If we define a complete set of states to be a set such that
any state is dependent on them, then our conclusion can be formulated:
the eigenstates of ξ form a complete set.
Not every real dynamical variable has sufficient eigenstates to form
a complete set. Those whose eigenstates do not form complete sets
are not quantities that can be measured. We obtain in this way a
further condition that a dynamical variable has to satisfy in order
that it shall be susceptible to measurement, in addition to the condition
that it shall be real. We call a real dynamical variable whose
eigenstates form a complete set an observable. Thus any quantity
that can be measured is an observable.
The question now presents itself: Can every observable be
measured? The answer theoretically is yes. In practice it may be
very awkward, or perhaps even beyond the ingenuity of the experimenter,
to devise an apparatus which could measure some particular
observable, but the theory always allows one to imagine that the
measurement can be made.
Let us examine mathematically the condition for a real dynamical
variable ξ to be an observable. Its eigenvalues may consist of a
(finite or infinite) discrete set of numbers, or alternatively, they
may consist of all numbers in a certain range, such as all numbers
lying between a and b. In the former case, the condition that
any state is dependent on eigenstates of ξ is that any ket can
be expressed as a sum of eigenkets of ξ. In the latter case the
condition needs modification, since one may have an integral instead
of a sum, i.e. a ket |P⟩ may be expressible as an integral of eigenkets
of ξ,
|P⟩ = ∫ |ξ'⟩ dξ', (24)
|ξ'⟩ being an eigenket of ξ belonging to the eigenvalue ξ' and the
range of integration being the range of eigenvalues, as such a ket is
dependent on eigenkets of ξ. Not every ket dependent on eigenkets
of ξ can be expressed in the form of the right-hand side of (24), since
one of the eigenkets itself cannot, and more generally any sum of
eigenkets cannot. The condition for the eigenstates of ξ to form a
complete set must thus be formulated, that any ket |P⟩ can be
expressed as an integral plus a sum of eigenkets of ξ, i.e.
|P⟩ = ∫ |ξ'c⟩ dξ' + Σ_r |ξ^r d⟩, (25)
where the |ξ'c⟩, |ξ^r d⟩ are all eigenkets of ξ, the labels c and d being
inserted to distinguish them when the eigenvalues ξ' and ξ^r are equal,
and where the integral is taken over the whole range of eigenvalues
and the sum is taken over any selection of them. If this condition
is satisfied in the case when the eigenvalues of ξ consist of a range
of numbers, then ξ is an observable.
There is a more general case that sometimes occurs, namely the
eigenvalues of ξ may consist of a range of numbers together with a
discrete set of numbers lying outside the range. In this case the
condition that ξ shall be an observable is still that any ket shall be
expressible in the form of the right-hand side of (25), but the sum
over r is now a sum over the discrete set of eigenvalues as well as a
selection of those in the range.
It is often very difficult to decide mathematically whether a par-
ticular real dynamical variable satisfies the condition for being an
observable or not, because the whole problem of finding eigenvalues
and eigenvectors is in general very difficult. However, we may have
good reason on experimental grounds for believing that the dynamical
variable can be measured and then we may reasonably assume that it
is an observable even though the mathematical proof is missing. This is
a thing we shall frequently do during the course of development of the
theory, e.g. we shall assume the energy of any dynamical system to be
always an observable, even though it is beyond the power of present-day
mathematical analysis to prove it so except in simple cases.
In the special case when the real dynamical variable is a number,
every state is an eigenstate and the dynamical variable is obviously
an observable. Any measurement of it always gives the same result,
so it is just a physical constant, like the charge on an electron.
A physical constant in quantum mechanics may thus be looked upon
either as an observable with a single eigenvalue or as a mere number
appearing in the equations, the two points of view being equivalent.
If the real dynamical variable satisfies an algebraic equation, then
the result (β) of the preceding section shows that the dynamical
variable is an observable. Such an observable has a finite number
of eigenvalues. Conversely, any observable with a finite number of
eigenvalues satisfies an algebraic equation, since if the observable ξ
has as its eigenvalues ξ', ξ'', ..., ξ⁽ⁿ⁾, then
(ξ − ξ')(ξ − ξ'')...(ξ − ξ⁽ⁿ⁾)|P⟩ = 0
holds for |P⟩ any eigenket of ξ, and thus it holds for any |P⟩ whatever,
because any ket can be expressed as a sum of eigenkets of ξ
on account of ξ being an observable. Hence
(ξ − ξ')(ξ − ξ'')...(ξ − ξ⁽ⁿ⁾) = 0. (26)
As an example we may consider the linear operator |A⟩⟨A|, where
|A⟩ is a normalized ket. This linear operator is real according to (7),
and its square is
{|A⟩⟨A|}² = |A⟩⟨A|A⟩⟨A| = |A⟩⟨A| (27)
since ⟨A|A⟩ = 1. Thus its square equals itself and so it satisfies an
algebraic equation and is an observable. Its eigenvalues are 1 and 0,
with |A⟩ as the eigenket belonging to the eigenvalue 1 and all kets
orthogonal to |A⟩ as eigenkets belonging to the eigenvalue 0. A
measurement of the observable thus certainly gives the result 1 if
the dynamical system is in the state corresponding to |A⟩ and the
result 0 if the system is in any orthogonal state, so the observable
may be described as the quantity which determines whether the
system is in the state |A⟩ or not.
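In matrix language |A⟩⟨A| is the outer product of a normalized vector with its conjugate. The sketch below (the particular ket is an arbitrary choice) checks equation (27) and the two eigenvalues:

```python
import numpy as np

A = np.array([1.0, 1.0j, 0.0]) / np.sqrt(2)   # a normalized ket |A>
assert np.isclose(A.conj() @ A, 1.0)          # <A|A> = 1

proj = np.outer(A, A.conj())                  # the operator |A><A|
assert np.allclose(proj @ proj, proj)         # its square equals itself, eq. (27)

evs = np.linalg.eigvalsh(proj)                # eigenvalues 0, 0, 1
assert np.allclose(evs, [0.0, 0.0, 1.0])

assert np.allclose(proj @ A, A)               # |A> belongs to eigenvalue 1
B = np.array([0.0, 0.0, 1.0])                 # a ket orthogonal to |A>
assert np.allclose(proj @ B, 0)               # orthogonal kets belong to eigenvalue 0
```

Measuring this observable answers the yes/no question "is the system in the state |A⟩?", with ⟨x|proj|x⟩ giving the probability of "yes" (anticipating §12).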
Before concluding this section we should examine the conditions
for an integral such as occurs in (24) to be significant. Suppose |X⟩
and |Y⟩ are two kets which can be expressed as integrals of eigenkets
of the observable ξ,
|X⟩ = ∫ |ξ'x⟩ dξ', |Y⟩ = ∫ |ξ''y⟩ dξ'',
x and y being used as labels to distinguish the two integrands. Then
we have, taking the conjugate imaginary of the first equation and
multiplying by the second
⟨X|Y⟩ = ∬ ⟨ξ'x|ξ''y⟩ dξ' dξ''. (28)
Consider now the single integral
∫ ⟨ξ'x|ξ''y⟩ dξ''. (29)
From the orthogonality theorem, the integrand here must vanish
over the whole range of integration except the one point ξ'' = ξ'.
If the integrand is finite at this point, the integral (29) vanishes, and
if this holds for all ξ', we get from (28) that ⟨X|Y⟩ vanishes. Now
in general ⟨X|Y⟩ does not vanish, so in general ⟨ξ'x|ξ'y⟩ must be
infinitely great in such a way as to make (29) non-vanishing and
finite. The form of infinity required for this will be discussed in §15.
In our work up to the present it has been implied that our bra and
ket vectors are of finite length and their scalar products are finite.
We see now the need for relaxing this condition when we are dealing
with eigenvectors of an observable whose eigenvalues form a range.
If we did not relax it, the phenomenon of ranges of eigenvalues could
not occur and our theory would be too weak for most practical
problems.
Taking |Y⟩ = |X⟩ above, we get the result that in general ⟨ξ'x|ξ'x⟩
is infinitely great. We shall assume that if |ξ'x⟩ ≠ 0
∫ ⟨ξ'x|ξ'x⟩ dξ' > 0, (30)
as the axiom corresponding to (8) of §6 for vectors of infinite
length.
The space of bra or ket vectors when the vectors are restricted to
be of finite length and to have finite scalar products is called by
mathematicians a Hilbert space. The bra and ket vectors that we
now use form a more general space than a Hilbert space.
We can now see that the expansion of a ket |P⟩ in the form of the
right-hand side of (25) is unique, provided there are not two or more
terms in the sum referring to the same eigenvalue. To prove this
result, let us suppose that two different expansions of |P⟩ are possible.
Then by subtracting one from the other, we get an equation
of the form
0 = ∫ |ξ'a⟩ dξ' + Σ_s |ξ^s b⟩, (31)
a and b being used as new labels for the eigenvectors, and the sum
over s including all terms left after the subtraction of one sum from
the other. If there is a term in the sum in (31) referring to an eigenvalue
ξ^t not in the range, we get, by multiplying (31) on the left by
⟨ξ^t b| and using the orthogonality theorem,
0 = ⟨ξ^t b|ξ^t b⟩,
which contradicts (8) of §6. Again, if the integrand in (31) does not
vanish for some eigenvalue ξ'' not equal to any ξ^s occurring in the
sum, we get, by multiplying (31) on the left by ⟨ξ''a| and using the
orthogonality theorem,
0 = ∫ ⟨ξ''a|ξ'a⟩ dξ',
which contradicts (30). Finally, if there is a term in the sum in (31)
referring to an eigenvalue ξ^t in the range, we get, multiplying (31) on
the left by ⟨ξ^t b|,
0 = ∫ ⟨ξ^t b|ξ'a⟩ dξ' + ⟨ξ^t b|ξ^t b⟩ (32)
and multiplying (31) on the left by ⟨ξ^t a|,
0 = ∫ ⟨ξ^t a|ξ'a⟩ dξ' + ⟨ξ^t a|ξ^t b⟩. (33)
Now the integral in (33) is finite, so ⟨ξ^t a|ξ^t b⟩ is finite and ⟨ξ^t b|ξ^t a⟩ is
finite. The integral in (32) must then be zero, so ⟨ξ^t b|ξ^t b⟩ is zero and
we again have a contradiction. Thus every term in (31) must vanish
and the expansion of a ket |P⟩ in the form of the right-hand side of
(25) must be unique.
11. Functions of observables
Let ξ be an observable. We can multiply it by any real number k
and get another observable kξ. In order that our theory may be
self-consistent it is necessary that, when the system is in a state such
that a measurement of the observable ξ certainly gives the result ξ',
a measurement of the observable kξ shall certainly give the result kξ'.
It is easily verified that this condition is fulfilled. The ket corresponding
to a state for which a measurement of ξ certainly gives the result
ξ' is an eigenket of ξ, |ξ'⟩ say, satisfying
ξ|ξ'⟩ = ξ'|ξ'⟩.
This equation leads to
kξ|ξ'⟩ = kξ'|ξ'⟩,
showing that |ξ'⟩ is an eigenket of kξ belonging to the eigenvalue kξ',
and thus that a measurement of kξ will certainly give the result kξ'.
More generally, we may take any real function of ξ, f(ξ) say, and
consider it as a new observable which is automatically measured
whenever ξ is measured, since an experimental determination of the
value of ξ also provides the value of f(ξ). We need not restrict f(ξ) to
be real, and then its real and pure imaginary parts are two observables
which are automatically measured when ξ is measured. For the theory
to be consistent it is necessary that, when the system is in a state
such that a measurement of ξ certainly gives the result ξ', a measurement
of the real and pure imaginary parts of f(ξ) shall certainly give
for results the real and pure imaginary parts of f(ξ'). In the case when
f(ξ) is expressible as a power series
f(ξ) = c₀ + c₁ξ + c₂ξ² + c₃ξ³ + ...,
the c's being numbers, this condition can again be verified by elementary
algebra. In the case of more general functions f it may not be
possible to verify the condition. The condition may then be used to
define f(ξ), which we have not yet defined mathematically. In this
way we can get a more general definition of a function of an observ-
able than is provided by power series.
We define f(ξ) in general to be that linear operator which satisfies
f(ξ)|ξ'⟩ = f(ξ')|ξ'⟩ (34)
for every eigenket |ξ'⟩ of ξ, f(ξ') being a number for each eigenvalue ξ'.
It is easily seen that this definition is self-consistent when applied to
eigenkets |ξ'⟩ that are not independent. If we have an eigenket |ξ'A⟩
dependent on other eigenkets of ξ, these other eigenkets must all
belong to the same eigenvalue ξ', otherwise we should have an equation
of the type (31), which we have seen is impossible. On multiplying
the equation which expresses |ξ'A⟩ linearly in terms of the other
eigenkets of ξ by f(ξ) on the left, we merely multiply each term in it
by the number f(ξ'), so we obviously get a consistent equation.
Further, equation (34) is sufficient to define the linear operator f(ξ)
completely, since to get the result of f(ξ) multiplied into an arbitrary
ket |P⟩, we have only to expand |P⟩ in the form of the right-hand
side of (25) and take
f(ξ)|P⟩ = ∫ f(ξ')|ξ'c⟩ dξ' + Σ_r f(ξ^r)|ξ^r d⟩. (35)
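For a finite-dimensional Hermitian matrix the definition (34) amounts to applying f to each eigenvalue in an eigenbasis. A minimal sketch (the matrix and the choice f = exp are illustrative assumptions):

```python
import numpy as np

def f_of_observable(f, xi):
    """Build f(xi) from the definition f(xi)|xi'> = f(xi')|xi'>."""
    w, V = np.linalg.eigh(xi)               # eigenvalues w, eigenkets V[:, i]
    return V @ np.diag(f(w)) @ V.conj().T

xi = np.array([[2.0, 1.0],
               [1.0, 2.0]])                 # eigenvalues 1 and 3
g = f_of_observable(np.exp, xi)

# Each eigenket of xi is an eigenket of f(xi) with eigenvalue f(xi')
w, V = np.linalg.eigh(xi)
for i in range(2):
    v = V[:, i]
    assert np.allclose(g @ v, np.exp(w[i]) * v)

# For f given by a power series the definition agrees with the series,
# e.g. f(x) = x^2 reproduces the matrix product xi @ xi:
assert np.allclose(f_of_observable(lambda x: x**2, xi), xi @ xi)
```

Nothing here requires f to be analytic or continuous: any single-valued f defined on the eigenvalues will do, exactly as the text goes on to state.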
The conjugate complex \overline{f(ξ)} of f(ξ) is defined by the conjugate
imaginary equation to (34), namely
⟨ξ'|\overline{f(ξ)} = f̄(ξ')⟨ξ'|,
holding for any eigenbra ⟨ξ'|, f̄(ξ') being the conjugate complex
function to f(ξ'). Let us replace ξ' here by ξ'' and multiply the
equation on the right by the arbitrary ket |P⟩. Then we get, using
the expansion (25) for |P⟩,
⟨ξ''|\overline{f(ξ)}|P⟩ = f̄(ξ'')⟨ξ''|P⟩
= f̄(ξ'') ∫ ⟨ξ''|ξ'c⟩ dξ' + f̄(ξ'')⟨ξ''|ξ''d⟩
= ∫ f̄(ξ')⟨ξ''|ξ'c⟩ dξ' + f̄(ξ'')⟨ξ''|ξ''d⟩ (36)
with the help of the orthogonality theorem, ⟨ξ''|ξ''d⟩ being understood
to be zero if ξ'' is not one of the eigenvalues to which the terms
in the sum in (25) refer. Again, putting the conjugate complex
function f̄(ξ') for f(ξ') in (35) and multiplying on the left by ⟨ξ''|,
we get
⟨ξ''|f̄(ξ)|P⟩ = ∫ f̄(ξ')⟨ξ''|ξ'c⟩ dξ' + f̄(ξ'')⟨ξ''|ξ''d⟩.
The right-hand side here equals that of (36), since the integrands
vanish for ξ' ≠ ξ'', and hence
⟨ξ''|\overline{f(ξ)}|P⟩ = ⟨ξ''|f̄(ξ)|P⟩.
§ 11 FUNCTIONS OF OBSERVABLES
This holds for ⟨ξ''| any eigenbra and |P⟩ any ket, so
\overline{f(ξ)} = f̄(ξ). (37)
Thus the conjugate complex of the linear operator f(ξ) is the conjugate
complex function f̄ of ξ.
It follows as a corollary that if f(ξ') is a real function of ξ', f(ξ) is
a real linear operator. f(ξ) is then also an observable, since its
eigenstates form a complete set, every eigenstate of ξ being also an
eigenstate of f(ξ).
With the above definition we are able to give a meaning to any
function f of an observable, provided only that the domain of existence
of the function of a real variable f(x) includes all the eigenvalues of the
observable. If the domain of existence contains other points besides
these eigenvalues, then the values of f(x) for these other points will
not affect the function of the observable. The function need not be
analytic or continuous. The eigenvalues of a function f of an observable
are just the function f of the eigenvalues of the observable.
It is important to observe that the possibility of defining a function
f of an observable requires the existence of a unique number f(x) for
each value of x which is an eigenvalue of the observable. Thus the
function f(x) must be single-valued. This may be illustrated by considering
the question: When we have an observable f(A) which is a
real function of the observable A, is the observable A a function of
the observable f(A)? The answer to this is yes, if different eigenvalues
A' of A always lead to different values of f(A'). If, however, there
exist two different eigenvalues of A, A' and A'' say, such that
f(A') = f(A''), then, corresponding to the eigenvalue f(A') of the
observable f(A), there will not be a unique eigenvalue of the observable
A and the latter will not be a function of the observable f(A).
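A small numerical illustration of this point, with f(x) = x² (the matrices are illustrative stand-ins, not from the text):

```python
import numpy as np

# Two eigenvalues of A map to the same value of f: f(1) = f(-1) = 1
A = np.diag([1.0, -1.0])
fA = A @ A
assert np.allclose(fA, np.eye(2))
# fA is the unit matrix, so no function applied to fA can recover A:
# A is not a function of f(A) here.

# With eigenvalues on which f is one-to-one the inversion works:
B = np.diag([2.0, 3.0])
fB = B @ B                                   # eigenvalues 4 and 9, all distinct
w, V = np.linalg.eigh(fB)
recovered = V @ np.diag(np.sqrt(w)) @ V.T    # the positive square root of fB
assert np.allclose(recovered, B)             # B is a function of f(B)
```

The failure in the first case is exactly the loss of single-valuedness: the eigenvalue 1 of f(A) corresponds to two different eigenvalues of A.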
It may easily be verified mathematically, from the definition, that
the sum or product of two functions of an observable is a function
of that observable and that a function of a function of an observable
is a function of that observable. Also it is easily seen that the whole
theory of functions of an observable is symmetrical between bras and
kets and that we could equally well work from the equation
⟨ξ'|f(ξ) = f(ξ')⟨ξ'| (38)
instead of from (34).
We shall conclude this section with a discussion of two examples
which are of great practical importance, namely the reciprocal and
the square root. The reciprocal of an observable exists if the observable
does not have the eigenvalue zero. If the observable α does not
have the eigenvalue zero, the reciprocal observable, which we call α⁻¹
or 1/α, will satisfy
α⁻¹|α'⟩ = α'⁻¹|α'⟩, (39)
where |α'⟩ is an eigenket of α belonging to the eigenvalue α'. Hence
αα⁻¹|α'⟩ = α'⁻¹α|α'⟩ = |α'⟩.
Since this holds for any eigenket |α'⟩, we must have
αα⁻¹ = 1. (40)
Similarly,
α⁻¹α = 1. (41)
Either of these equations is sufficient to determine α⁻¹ completely,
provided α does not have the eigenvalue zero. To prove this in the
case of (40), let x be any linear operator satisfying the equation
αx = 1
and multiply both sides on the left by the α⁻¹ defined by (39). The
result is
α⁻¹αx = α⁻¹
and hence from (41)
x = α⁻¹.
Equations (40) and (41) can be used to define the reciprocal, when
it exists, of a general linear operator α, which need not even be real.
One of these equations by itself is then not necessarily sufficient. If
any two linear operators α and β have reciprocals, their product αβ
has the reciprocal
(αβ)⁻¹ = β⁻¹α⁻¹, (42)
obtained by taking the reciprocal of each factor and reversing their
order. We verify (42) by noting that its right-hand side gives unity
when multiplied by αβ, either on the right or on the left. This reciprocal
law for products can be immediately extended to more than
two factors, i.e.,
(αβγ...)⁻¹ = ...γ⁻¹β⁻¹α⁻¹.
The square root of an observable α always exists, and is real if α
has no negative eigenvalues. We write it √α or α^½. It satisfies
√α|α'⟩ = ±√α'|α'⟩, (43)
|α'⟩ being an eigenket of α belonging to the eigenvalue α'. Hence
√α√α|α'⟩ = √α'√α'|α'⟩ = α'|α'⟩ = α|α'⟩,
and since this holds for any eigenket |α'⟩ we must have
√α√α = α. (44)
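Both examples, and the sign ambiguity in (43), can be checked on a small Hermitian matrix by working in its eigenbasis (the matrices here are illustrative stand-ins):

```python
import numpy as np

alpha = np.array([[2.0, 1.0],
                  [1.0, 2.0]])                 # eigenvalues 1 and 3, none zero
w, V = np.linalg.eigh(alpha)

# The reciprocal: reciprocal of each eigenvalue in the eigenbasis
inv = V @ np.diag(1.0 / w) @ V.T
assert np.allclose(alpha @ inv, np.eye(2))     # eq. (40)
assert np.allclose(inv @ alpha, np.eye(2))     # eq. (41)

# Reversed-order rule (42) for a product of two invertible operators
beta = np.array([[1.0, 2.0],
                 [0.0, 1.0]])
assert np.allclose(np.linalg.inv(alpha @ beta),
                   np.linalg.inv(beta) @ np.linalg.inv(alpha))

# The positive square root: positive sign taken in (43) for every eigenvalue
root = V @ np.diag(np.sqrt(w)) @ V.T
assert np.allclose(root @ root, alpha)         # eq. (44)

# Flipping the sign for one eigenvalue gives another square root,
# illustrating the 2^n choices mentioned below
other = V @ np.diag([-np.sqrt(w[0]), np.sqrt(w[1])]) @ V.T
assert np.allclose(other @ other, alpha)
```

Note that equation (44) alone does not distinguish `root` from `other`; the sign convention in (43) does.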
On account of the ambiguity of sign in (43) there will be several
square roots. To fix one of them we must specify a particular sign
in (43) for each eigenvalue. This sign may vary irregularly from one
eigenvalue to the next and equation (43) will always define a linear
operator √α satisfying (44) and forming a square-root function of α.
If there is an eigenvalue of α with two or more independent eigenkets
belonging to it, then we must, according to our definition of a function,
have the same sign in (43) for each of these eigenkets. If we
took different signs, however, equation (44) would still hold, and hence
equation (44) by itself is not sufficient to define √α, except in the
special case when there is only one independent eigenket of α belonging
to any eigenvalue.
The number of different square roots of an observable is 2ⁿ, where
n is the total number of eigenvalues not zero. In practice the square-root
function is used only for observables without negative eigenvalues
and the particular square root that is useful is the one for
which the positive sign is always taken in (43). This one will be called
the positive square root.
12. The general physical interpretation
The assumptions that we made at the beginning of §10 to get a
physical interpretation of the mathematical theory are of a rather
special kind, since they can be used only in connexion with eigenstates.
We need some more general assumption which will enable us
to extract physical information from the mathematics even when we
are not dealing with eigenstates.
In classical mechanics an observable always, as we say, 'has a
value' for any particular state of the system. What is there in quantum
mechanics corresponding to this? If we take any observable ξ
and any two states x and y, corresponding to the vectors ⟨x| and |y⟩,
then we can form the number ⟨x|ξ|y⟩. This number is not very
closely analogous to the value which an observable can 'have' in the
classical theory, for three reasons, namely, (i) it refers to two states
of the system, while the classical value always refers to one, (ii) it is
in general not a real number, and (iii) it is not uniquely determined
by the observable and the states, since the vectors ⟨x| and |y⟩ contain
arbitrary numerical factors. Even if we impose on ⟨x| and |y⟩ the
condition that they shall be normalized, there will still be an undetermined
factor of modulus unity in ⟨x|ξ|y⟩. These three reasons cease
to apply, however, if we take the two states to be identical and |y⟩
to be the conjugate imaginary vector to ⟨x|. The number that we
then get, namely ⟨x|ξ|x⟩, is necessarily real, and also it is uniquely
determined when ⟨x| is normalized, since if we multiply ⟨x| by the
numerical factor e^{ic}, c being some real number, we must multiply
|x⟩ by e^{−ic} and ⟨x|ξ|x⟩ will be unaltered.
One might thus be inclined to make the tentative assumption that
the observable ξ 'has the value' ⟨x|ξ|x⟩ for the state x, in a sense
analogous to the classical sense. This would not be satisfactory,
though, for the following reason. Let us take a second observable η,
which would have by the above assumption the value ⟨x|η|x⟩ for
this same state. We should then expect, from classical analogy, that
for this state the sum of the two observables would have a value
equal to the sum of the values of the two observables separately and
the product of the two observables would have a value equal to the
product of the values of the two observables separately. Actually, the
tentative assumption would give for the sum of the two observables
the value ⟨x|ξ+η|x⟩, which is, in fact, equal to the sum of ⟨x|ξ|x⟩
and ⟨x|η|x⟩, but for the product it would give the value ⟨x|ξη|x⟩
or ⟨x|ηξ|x⟩, neither of which is connected in any simple way with
⟨x|ξ|x⟩ and ⟨x|η|x⟩.
However, since things go wrong only with the product and not with
the sum, it would be reasonable to call ⟨x|ξ|x⟩ the average value of
the observable ξ for the state x. This is because the average of the
sum of two quantities must equal the sum of their averages, but the
average of their product need not equal the product of their averages.
We therefore make the general assumption that if the measurement
of the observable ξ for the system in the state corresponding to |x⟩ is
made a large number of times, the average of all the results obtained will
be ⟨x|ξ|x⟩, provided |x⟩ is normalized. If |x⟩ is not normalized, as is
necessarily the case if the state x is an eigenstate of some observable
belonging to an eigenvalue in a range, the assumption becomes that
the average result of a measurement of ξ is proportional to ⟨x|ξ|x⟩.
This general assumption provides a basis for a general physical interpretation
of the theory.
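In a finite number of dimensions ⟨x|ξ|x⟩ equals the probability-weighted sum of the eigenvalues, and a simulated run of repeated measurements has this as its sample mean. A sketch under illustrative choices of matrix and state:

```python
import numpy as np

rng = np.random.default_rng(0)

xi = np.array([[2.0, 1.0],
               [1.0, 2.0]])                 # eigenvalues 1 and 3
w, V = np.linalg.eigh(xi)

x = np.array([0.6, 0.8])                    # a normalized ket |x>
expectation = x @ xi @ x                    # <x|xi|x>

# Probability of each eigenvalue: squared component along each eigenket
probs = np.abs(V.T @ x) ** 2
assert np.isclose(probs.sum(), 1.0)
assert np.isclose(expectation, probs @ w)   # <x|xi|x> = sum of value * probability

# Simulate many measurements; the running average approaches <x|xi|x>
samples = rng.choice(w, size=200_000, p=probs)
assert abs(samples.mean() - expectation) < 0.01
```

The identity `expectation == probs @ w` is the finite-dimensional content of the general assumption; the sampling step illustrates the phrase "made a large number of times".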
The expression that an observable 'has a particular value' for a
particular state is permissible in quantum mechanics in the special
case when a measurement of the observable is certain to lead to the
particular value, so that the state is an eigenstate of the observable.
It may easily be verified from the algebra that, with this restricted
meaning for an observable 'having a value', if two observables have
values for a particular state, then for this state the sum of the two
observables (if this sum is an observable†) has a value equal to the
sum of the values of the two observables separately and the product
of the two observables (if this product is an observable‡) has a value
equal to the product of the values of the two observables separately.
In the general case we cannot speak of an observable having a value
for a particular state, but we can speak of its having an average value
for the state. We can go further and speak of the probability of its
having any specified value for the state, meaning the probability of
this specified value being obtained when one makes a measurement of
the observable. This probability can be obtained from the general
assumption in the following way.
Let the observable be ξ and let the state correspond to the normal-
ized ket |x⟩. Then the general assumption tells us, not only that the
average value of ξ is ⟨x|ξ|x⟩, but also that the average value of any
function of ξ, f(ξ) say, is ⟨x|f(ξ)|x⟩. Take f(ξ) to be that function of ξ
which is equal to unity when ξ = a, a being some real number, and
zero otherwise. This function of ξ has a meaning according to our
general theory of functions of an observable, and it may be denoted
by δ_{ξa} in conformity with the general notation of the symbol δ with
two suffixes given on p. 62 (equation (17)). The average value of
this function of ξ is just the probability, P_a say, of ξ having the value
a. Thus
P_a = ⟨x|δ_{ξa}|x⟩. (45)
If a is not an eigenvalue of ξ, δ_{ξa} multiplied into any eigenket of ξ is
zero, and hence δ_{ξa} = 0 and P_a = 0. This agrees with a conclusion
of §10, that any result of a measurement of an observable must be
one of its eigenvalues.
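Concretely, δ_{ξa} is the function of ξ that is 1 at a and 0 elsewhere, i.e. the projector onto the eigenspace of a. A sketch of equation (45) with an illustrative matrix and state (both my own choices, not from the text):

```python
import numpy as np

xi = np.diag([1.0, 1.0, 3.0])               # eigenvalue 1 occurs twice
w, V = np.linalg.eigh(xi)

def delta_xi(a):
    """delta_{xi,a}: 1 on eigenvalues equal to a, 0 elsewhere, as an operator."""
    return V @ np.diag(np.isclose(w, a).astype(float)) @ V.T

x = np.array([0.0, 0.6, 0.8])               # a normalized ket |x>

P1 = x @ delta_xi(1.0) @ x                  # probability of the result 1, eq. (45)
P3 = x @ delta_xi(3.0) @ x                  # probability of the result 3
assert np.isclose(P1, 0.36) and np.isclose(P3, 0.64)
assert np.isclose(P1 + P3, 1.0)             # the probabilities exhaust unity

# a not an eigenvalue: delta_{xi,a} vanishes, so P_a = 0
assert np.allclose(delta_xi(2.0), 0)
```

Because the eigenvalue 1 is degenerate, delta_xi(1.0) projects onto a two-dimensional eigenspace; the construction is independent of how a basis is chosen within it.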
If the possible results of a measurement of ξ form a range of num-
bers, the probability of ξ having exactly a particular value will be
zero in most physical problems. The quantity of physical importance
is then the probability of ξ having a value within a small range, say
from a to a+da. This probability, which we may call P(a) da, is
† This is not obviously so, since the sum may not have sufficient eigenstates to
form a complete set, in which case the sum, considered as a single quantity, would
not be measurable.
‡ Here the reality condition may fail, as well as the condition for the eigenstates
to form a complete set.
equal to the average value of that function of ξ which is equal to
unity for ξ lying within the range a to a+da and zero otherwise.
This function of ξ has a meaning according to our general theory of
functions of an observable. Denoting it by χ(ξ), we have
P(a) da = ⟨x|χ(ξ)|x⟩. (46)
If the range a to a+da does not include any eigenvalues of ξ, we
have as above χ(ξ) = 0 and P(a) = 0. If |x⟩ is not normalized, the
right-hand sides of (45) and (46) will still be proportional to the
probability of ξ having the value a and lying within the range a to
a+da respectively.
The assumption of §10, that a measurement of ξ is certain to give
the result ξ' if the system is in an eigenstate of ξ belonging to the
eigenvalue ξ', is consistent with the general assumption for physical
interpretation and can in fact be deduced from it. Working from the
general assumption we see that, if |ξ'⟩ is an eigenket of ξ belonging
to the eigenvalue ξ', then, in the case of discrete eigenvalues of ξ,
δ_{ξa}|ξ'⟩ = 0 unless a = ξ',
and in the case of a range of eigenvalues of ξ,
χ(ξ)|ξ'⟩ = 0 unless the range a to a+da includes ξ'.
In either case, for the state corresponding to |ξ'⟩, the probability of
ξ having any value other than ξ' is zero.
An eigenstate of ξ belonging to an eigenvalue ξ′ lying in a range
is a state which cannot strictly be realized in practice, since it would
need an infinite amount of precision to get ξ to equal exactly ξ′.
The most that could be attained in practice would be to get ξ to lie
within a narrow range about the value ξ′. The system would then
be in a state approximating to an eigenstate of ξ. Thus an eigenstate
belonging to an eigenvalue in a range is a mathematical idealization
of what can be attained in practice. All the same such eigenstates
play a very useful role in the theory and one could not very well do
without them. Science contains many examples of theoretical concepts
which are limits of things met with in practice and are useful
for the precise formulation of laws of nature, although they are not
realizable experimentally, and this is just one more of them. It may
be that the infinite length of the ket vectors corresponding to these
eigenstates is connected with their unrealizability, and that all realizable
states correspond to ket vectors that can be normalized and that
form a Hilbert space.
13. Commutability and compatibility
A state may be simultaneously an eigenstate of two observables.
If the state corresponds to the ket vector |A⟩ and the observables are
ξ and η, we should then have the equations

ξ|A⟩ = ξ′|A⟩,
η|A⟩ = η′|A⟩,

where ξ′ and η′ are eigenvalues of ξ and η respectively. We can now
deduce

ξη|A⟩ = ξη′|A⟩ = η′ξ|A⟩ = η′ξ′|A⟩ = ξ′η|A⟩ = ηξ|A⟩,

or (ξη − ηξ)|A⟩ = 0.

This suggests that the chances for the existence of a simultaneous
eigenstate are most favourable if ξη − ηξ = 0 and the two observables
commute. If they do not commute a simultaneous eigenstate is not
impossible, but is rather exceptional. On the other hand, if they do
commute there exist so many simultaneous eigenstates that they form a
complete set, as will now be proved.
Let ξ and η be two commuting observables. Take an eigenket of
η, |η′⟩ say, belonging to the eigenvalue η′, and expand it in terms
of eigenkets of ξ in the form of the right-hand side of (25), thus

|η′⟩ = ∫ |ξ′η′c⟩ dξ′ + Σ_r |ξʳη′d⟩.     (47)

The eigenkets of ξ on the right-hand side here have η′ inserted in
them as an extra label, in order to remind us that they come from
the expansion of a special ket vector, namely |η′⟩, and not a general
one as in equation (25). We can now show that each of these eigenkets
of ξ is also an eigenket of η belonging to the eigenvalue η′. We
have

0 = (η−η′)|η′⟩ = ∫ (η−η′)|ξ′η′c⟩ dξ′ + Σ_r (η−η′)|ξʳη′d⟩.     (48)

Now the ket (η−η′)|ξ′η′c⟩ satisfies

ξ{(η−η′)|ξ′η′c⟩} = (η−η′)ξ|ξ′η′c⟩ = ξ′(η−η′)|ξ′η′c⟩,

showing that it is an eigenket of ξ belonging to the eigenvalue ξ′,
and similarly the ket (η−η′)|ξʳη′d⟩ is an eigenket of ξ belonging to
the eigenvalue ξʳ. Equation (48) thus gives an integral plus a sum
of eigenkets of ξ equal to zero, which, as we have seen with equation
(31), is impossible unless the integrand and every term in the sum
vanishes. Hence

(η−η′)|ξ′η′c⟩ = 0,   (η−η′)|ξʳη′d⟩ = 0,

so that all the kets appearing on the right-hand side of (47) are
eigenkets of η as well as of ξ. Equation (47) now gives |η′⟩ expanded
in terms of simultaneous eigenkets of ξ and η. Since any ket can be
expanded in terms of eigenkets |η′⟩ of η, it follows that any ket can
be expanded in terms of simultaneous eigenkets of ξ and η, and thus
the simultaneous eigenstates form a complete set.
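The theorem just proved can be checked directly in a finite number of dimensions: two commuting Hermitian matrices, standing in for ξ and η, possess a complete set of simultaneous eigenvectors. The following numerical sketch is not part of the original text; it assumes Python with NumPy, and the matrices and eigenvalues are purely illustrative:

```python
import numpy as np

# Build a random unitary Q, then two Hermitian matrices diagonal in the
# basis of its columns; such a pair necessarily commutes, like xi and eta.
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(M)                     # columns of Q: a common eigenbasis
xi  = Q @ np.diag([1.0, 1.0, 2.0]) @ Q.conj().T   # note the degeneracy in xi
eta = Q @ np.diag([5.0, 6.0, 6.0]) @ Q.conj().T

assert np.allclose(xi @ eta, eta @ xi)     # the observables commute

# Every column of Q is a simultaneous eigenket, and together the columns
# form a complete set (Q is unitary, so they span the whole space).
for k in range(3):
    v = Q[:, k]
    assert np.allclose(xi @ v, (v.conj() @ xi @ v) * v)
    assert np.allclose(eta @ v, (v.conj() @ eta @ v) * v)
assert np.allclose(Q @ Q.conj().T, np.eye(3))
```

The degenerate eigenvalue of ξ shows why the extra labels c and d of equation (47) can be needed: the eigenvalue pair alone need not single out one state.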
The above simultaneous eigenkets of ξ and η, |ξ′η′c⟩ and |ξʳη′d⟩,
are labelled by the eigenvalues ξ′ and η′, or ξʳ and η′, to which they
belong, together with the labels c and d which may also be necessary.
The procedure of using eigenvalues as labels for simultaneous eigen-
vectors will be generally followed in the future, just as it has been
followed in the past for eigenvectors of single observables.
The converse to the above theorem says that, if ξ and η are two
observables such that their simultaneous eigenstates form a complete set,
then ξ and η commute. To prove this, we note that, if |ξ′η′⟩ is a
simultaneous eigenket belonging to the eigenvalues ξ′ and η′,

(ξη − ηξ)|ξ′η′⟩ = (ξ′η′ − η′ξ′)|ξ′η′⟩ = 0.     (49)

Since the simultaneous eigenstates form a complete set, an arbitrary
ket |P⟩ can be expanded in terms of simultaneous eigenkets |ξ′η′⟩,
for each of which (49) holds, and hence

(ξη − ηξ)|P⟩ = 0

and so ξη − ηξ = 0.
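The converse admits an equally direct finite-dimensional check. In this Python/NumPy sketch (not in the original text; values illustrative) the observables are built from one complete orthonormal set of simultaneous eigenkets, and the commutator then vanishes because it annihilates every ket:

```python
import numpy as np

# One complete orthonormal set of simultaneous eigenkets: the columns of a
# unitary U. Each observable is assigned its eigenvalues in that basis.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((4, 4))
                    + 1j * rng.standard_normal((4, 4)))
xi  = U @ np.diag([0., 1., 2., 3.]) @ U.conj().T
eta = U @ np.diag([7., 7., 8., 9.]) @ U.conj().T

# (xi eta - eta xi) annihilates every simultaneous eigenket, and these
# span the space, so the commutator is the zero operator.
assert np.allclose(xi @ eta - eta @ xi, np.zeros((4, 4)))
```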
The idea of simultaneous eigenstates may be extended to more
than two observables and the above theorem and its converse still
hold, i.e. if any set of observables commute, each with all the others,
their simultaneous eigenstates form a complete set, and conversely.
The same arguments used for the proof with two observables are
adequate for the general case; e.g., if we have three commuting
observables ξ, η, ζ, we can expand any simultaneous eigenket of ξ
and η in terms of eigenkets of ζ and then show that each of these
eigenkets of ζ is also an eigenket of ξ and of η. Thus the simultaneous
eigenket of ξ and η is expanded in terms of simultaneous eigenkets
of ξ, η, and ζ, and since any ket can be expanded in terms of simultaneous
eigenkets of ξ and η, it can also be expanded in terms of
simultaneous eigenkets of ξ, η, and ζ.
The orthogonality theorem applied to simultaneous eigenkets tells
us that two simultaneous eigenvectors of a set of commuting observables
are orthogonal if the sets of eigenvalues to which they belong
differ in any way.
Owing to the simultaneous eigenstates of two or more commuting
observables forming a complete set, we can set up a theory of functions
of two or more commuting observables on the same lines as the
theory of functions of a single observable given in § 11. If ξ, η, ζ,...
are commuting observables, we define a general function f of them
to be that linear operator f(ξ, η, ζ,...) which satisfies

f(ξ, η, ζ,...)|ξ′η′ζ′...⟩ = f(ξ′, η′, ζ′,...)|ξ′η′ζ′...⟩,     (50)

where |ξ′η′ζ′...⟩ is any simultaneous eigenket of ξ, η, ζ,... belonging
to the eigenvalues ξ′, η′, ζ′,... . Here f is any function such that
f(a, b, c,...) is defined for all values of a, b, c,... which are eigenvalues
of ξ, η, ζ,... respectively. As with a function of a single observable
defined by (34), we can show that f(ξ, η, ζ,...) is completely determined
by (50), that

⟨ξ′η′ζ′...| f̄(ξ, η, ζ,...) = f̄(ξ′, η′, ζ′,...) ⟨ξ′η′ζ′...|,

corresponding to (37), and that if f(a, b, c,...) is a real function,
f(ξ, η, ζ,...) is real and is an observable.
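Definition (50) is directly computable in a finite-dimensional model: one applies f to the eigenvalues in the simultaneous eigenbasis. A Python/NumPy sketch (not in the original text; matrices and the function f are illustrative):

```python
import numpy as np

# Two commuting observables with a shared orthogonal eigenbasis U.
U, _ = np.linalg.qr(np.array([[1., 1.], [1., -1.]]))
xi_vals  = np.array([2., 3.])
eta_vals = np.array([5., 5.])
xi  = U @ np.diag(xi_vals)  @ U.T
eta = U @ np.diag(eta_vals) @ U.T

def f_of(f):
    # Definition (50): act with f(xi', eta') on each simultaneous eigenket.
    return U @ np.diag(f(xi_vals, eta_vals)) @ U.T

# For a polynomial f the definition agrees with ordinary operator algebra:
g = f_of(lambda a, b: a * b + b)
assert np.allclose(g, xi @ eta + eta)
```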
We can now proceed to generalize the results (45) and (46). Given
a set of commuting observables ξ, η, ζ,..., we may form that function
of them which is equal to unity when ξ = a, η = b, ζ = c,..., a, b, c,...
being real numbers, and is equal to zero when any of these conditions
is not fulfilled. This function may be written δ_{ξa} δ_{ηb} δ_{ζc}..., and is in
fact just the product in any order of the factors δ_{ξa}, δ_{ηb}, δ_{ζc},... defined
as functions of single observables, as may be seen by substituting this
product for f(ξ, η, ζ,...) in the left-hand side of (50). The average
value of this function for any state is the probability, P_{abc...} say, of
ξ, η, ζ,... having the values a, b, c,... respectively for that state. Thus
if the state corresponds to the normalized ket vector |x⟩, we get from
our general assumption for physical interpretation

P_{abc...} = ⟨x| δ_{ξa} δ_{ηb} δ_{ζc} ... |x⟩.     (51)

P_{abc...} is zero unless each of the numbers a, b, c,... is an eigenvalue of
the corresponding observable. If any of the numbers a, b, c,... is an
eigenvalue in a range of eigenvalues of the corresponding observable,
P_{abc...} will usually again be zero, but in this case we ought to replace
the requirement that this observable shall have exactly one value by
the requirement that it shall have a value lying within a small range,
which involves replacing one of the δ factors in (51) by a factor like
the χ(ξ) of equation (46). On carrying out such a replacement for
each of the observables ξ, η, ζ,..., whose corresponding numerical
value a, b, c,... lies in a range of eigenvalues, we shall get a probability
which does not in general vanish.
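Formula (51) can be exercised numerically: in a finite model each factor δ_{ξa} is the projector onto the eigenspace of ξ with eigenvalue a, and the joint probability is the expectation of the product of these projectors. A Python/NumPy sketch (not in the original text; eigenvalues and the ket are illustrative):

```python
import numpy as np

# Work directly in the simultaneous eigenbasis, where both are diagonal.
xi_vals  = np.array([1., 1., 2.])          # eigenvalues of xi per basis state
eta_vals = np.array([4., 5., 5.])          # eigenvalues of eta per basis state

def delta_factor(vals, a):
    # The function of the observable equal to unity where it has value a:
    return np.diag((vals == a).astype(float))

x = np.array([1., 2., 2.]) / 3.0           # a normalized ket |x>
P = x @ delta_factor(xi_vals, 1.) @ delta_factor(eta_vals, 4.) @ x
# Only the first basis state has xi = 1 and eta = 4 simultaneously.
assert np.isclose(P, (1. / 3.) ** 2)
```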
If certain observables commute, there exist states for which they all
have particular values, in the sense explained at the bottom of p. 46,
namely the simultaneous eigenstates. Thus one can give a meaning to
several commuting observables having values at the same time. Further, we
see from (51) that for any state one can give a meaning to the probability
of particular results being obtained for simultaneous measurements of
several commuting observables. This conclusion is an important new
development. In general one cannot make an observation on a
system in a definite state without disturbing that state and spoiling
it for the purposes of a second observation. One cannot then give
any meaning to the two observations being made simultaneously.
The above conclusion tells us, though, that in the special case when
the two observables commute, the observations are to be considered
as non-interfering or compatible, in such a way that one can give a
meaning to the two observations being made simultaneously and can
discuss the probability of any particular results being obtained. The
two observations may, in fact, be considered as a single observation
of a more complicated type, the result of which is expressible by two
numbers instead of a single number. From the point of view of general
theory, any two or more commuting observables may be counted as a
single observable, the result of a measurement of which consists of two or
more numbers. The states for which this measurement is certain to
lead to one particular result are the simultaneous eigenstates.
III
REPRESENTATIONS
14. Basic vectors
IN the preceding chapters we set up an algebraic scheme involving
certain abstract quantities of three kinds, namely bra vectors, ket
vectors, and linear operators, and we expressed some of the fundamental
laws of quantum mechanics in terms of them. It would be
possible to continue to develop the theory in terms of these abstract
quantities and to use them for applications to particular problems.
However, for some purposes it is more convenient to replace the
abstract quantities by sets of numbers with analogous mathematical
properties and to work in terms of these sets of numbers. The procedure
is similar to using coordinates in geometry, and has the advantage
of giving one greater mathematical power for the solving of
particular problems.
The way in which the abstract quantities are to be replaced by
numbers is not unique, there being many possible ways corresponding
to the many systems of coordinates one can have in geometry. Each
of these ways is called a representation and the set of numbers that
replace an abstract quantity is called the representative of that
abstract quantity in the representation. Thus the representative of
an abstract quantity corresponds to the coordinates of a geometrical
object. When one has a particular problem to work out in quantum
mechanics, one can minimize the labour by using a representation
in which the representatives of the more important abstract quantities
occurring in that problem are as simple as possible.
To set up a representation in a general way, we take a complete
set of bra vectors, i.e. a set such that any bra can be expressed
linearly in terms of them (as a sum or an integral or possibly an
integral plus a sum). These bras we call the basic bras of the representation.
They are sufficient, as we shall see, to fix the representation
completely.
Take any ket |a⟩ and form its scalar product with each of the basic
bras. The numbers so obtained constitute the representative of |a⟩.
They are sufficient to determine the ket |a⟩ completely, since if there
is a second ket, |a₁⟩ say, for which these numbers are the same, the
difference |a⟩ − |a₁⟩ will have its scalar product with any basic bra
vanishing, and hence its scalar product with any bra whatever will
vanish, and |a⟩ − |a₁⟩ itself will vanish.
We may suppose the basic bras to be labelled by one or more
parameters, λ₁, λ₂,..., λᵤ, each of which may take on certain numerical
values. The basic bras will then be written ⟨λ₁λ₂...λᵤ| and the representative
of |a⟩ will be written ⟨λ₁λ₂...λᵤ|a⟩. This representative will
now consist of a set of numbers, one for each set of values that
λ₁, λ₂,..., λᵤ may have in their respective domains. Such a set of
numbers just forms a function of the variables λ₁, λ₂,..., λᵤ. Thus the
representative of a ket may be looked upon either as a set of numbers
or as a function of the variables used to label the basic bras.
If the number of independent states of our dynamical system is
finite, equal to n say, it is sufficient to take n basic bras, which may
be labelled by a single parameter λ taking on the values 1, 2, 3,..., n.
The representative of any ket |a⟩ now consists of the set of n numbers
⟨1|a⟩, ⟨2|a⟩, ⟨3|a⟩,..., ⟨n|a⟩, which are precisely the coordinates of
the vector |a⟩ referred to a system of coordinates in the usual way.
The idea of the representative of a ket vector is just a generalization
of the idea of the coordinates of an ordinary vector and reduces to
the latter when the number of dimensions of the space of the ket
vectors is finite.
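For a finite number of independent states the correspondence between representatives and ordinary coordinates is exact, as a brief Python/NumPy sketch shows (not in the original text; the choice of basic bras as rows of a rotation matrix is illustrative):

```python
import numpy as np

# Basic bras: the rows of an orthogonal matrix (here a plane rotation).
theta = 0.3
B = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])   # rows: <1|, <2|
ket_a = np.array([2.0, -1.0])                     # an arbitrary ket |a>

rep = B @ ket_a                     # representative: the numbers <1|a>, <2|a>
# The representative determines the ket completely:
assert np.allclose(B.T @ rep, ket_a)
```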
In a general representation there is no need for the basic bras to
be all independent. In most representations used in practice, however,
they are all independent, and also satisfy the more stringent
condition that any two of them are orthogonal. The representation
is then called an orthogonal representation.
Take an orthogonal representation with basic bras ⟨λ₁λ₂...λᵤ|,
labelled by parameters λ₁, λ₂,..., λᵤ whose domains are all real. Take
a ket |a⟩ and form its representative ⟨λ₁λ₂...λᵤ|a⟩. Now form the
numbers λ₁⟨λ₁λ₂...λᵤ|a⟩ and consider them as the representative of
a new ket |b⟩. This is permissible since the numbers forming the
representative of a ket are independent, on account of the basic bras
being independent. The ket |b⟩ is defined by the equation

⟨λ₁λ₂...λᵤ|b⟩ = λ₁⟨λ₁λ₂...λᵤ|a⟩.

The ket |b⟩ is evidently a linear function of the ket |a⟩, so it may
be considered as the result of a linear operator applied to |a⟩. Calling
this linear operator L₁, we have

|b⟩ = L₁|a⟩
and hence

⟨λ₁λ₂...λᵤ|L₁|a⟩ = λ₁⟨λ₁λ₂...λᵤ|a⟩.

This equation holds for any ket |a⟩, so we get

⟨λ₁λ₂...λᵤ|L₁ = λ₁⟨λ₁λ₂...λᵤ|.     (1)

Equation (1) may be looked upon as the definition of the linear
operator L₁. It shows that each basic bra is an eigenbra of L₁, the
value of the parameter λ₁ being the eigenvalue belonging to it.
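With discrete parameter values, equation (1) says simply that L₁ is diagonal in the representation, with the labels λ₁ as its eigenvalues. A minimal Python/NumPy sketch (not in the original text; the labels are illustrative):

```python
import numpy as np

lam1 = np.array([1., 2., 3., 4.])   # values of the parameter lambda_1
L1 = np.diag(lam1)                  # equation (1): <lam|L_1 = lambda_1 <lam|

# Acting on the representative of any ket |a>, L_1 multiplies each
# component by the label of the corresponding basic bra.
rep_a = np.array([1., 0., 2., -1.])
assert np.allclose(L1 @ rep_a, lam1 * rep_a)
assert np.allclose(L1, L1.conj().T)  # L_1 is real, as proved in the text
```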
From the condition that the basic bras are orthogonal we can
deduce that L₁ is real and is an observable. Let λ₁′, λ₂′,..., λᵤ′ and
λ₁″, λ₂″,..., λᵤ″ be two sets of values for the parameters λ₁, λ₂,..., λᵤ.
We have, putting λ′'s for the λ's in (1) and multiplying on the right
by |λ₁″λ₂″...λᵤ″⟩, the conjugate imaginary of the basic bra ⟨λ₁″λ₂″...λᵤ″|,

⟨λ₁′λ₂′...λᵤ′|L₁|λ₁″λ₂″...λᵤ″⟩ = λ₁′⟨λ₁′λ₂′...λᵤ′|λ₁″λ₂″...λᵤ″⟩.

Interchanging λ′'s and λ″'s,

⟨λ₁″λ₂″...λᵤ″|L₁|λ₁′λ₂′...λᵤ′⟩ = λ₁″⟨λ₁″λ₂″...λᵤ″|λ₁′λ₂′...λᵤ′⟩.

On account of the basic bras being orthogonal, the right-hand sides
here vanish unless λᵣ″ = λᵣ′ for all r from 1 to u, in which case the
right-hand sides are equal, and they are also real, λ₁′ being real. Thus,
whether the λ″'s are equal to the λ′'s or not, ⟨λ₁″λ₂″...λᵤ″|L₁|λ₁′λ₂′...λᵤ′⟩
equals the conjugate complex of ⟨λ₁′λ₂′...λᵤ′|L₁|λ₁″λ₂″...λᵤ″⟩, which is

⟨λ₁″λ₂″...λᵤ″|L̄₁|λ₁′λ₂′...λᵤ′⟩

from equation (4) of § 8. Since the ⟨λ₁′λ₂′...λᵤ′|'s form a complete set
of bras and the |λ₁″λ₂″...λᵤ″⟩'s form a complete set of kets, we can
infer that L₁ = L̄₁. The further condition required for L₁ to be an
observable, namely that its eigenstates shall form a complete set, is
obviously satisfied since it has as eigenbras the basic bras, which
form a complete set.
We can similarly introduce linear operators L₂, L₃,..., Lᵤ by multi-
plying ⟨λ₁λ₂...λᵤ|a⟩ by the factors λ₂, λ₃,..., λᵤ in turn and considering
the resulting sets of numbers as representatives of kets. Each of these
L's can be shown in the same way to have the basic bras as eigenbras
and to be real and an observable. The basic bras are simultaneous
eigenbras of all the L's. Since these simultaneous eigenbras form a
complete set, it follows from a theorem of § 13 that any two of the
L's commute.
It will now be shown that, if ξ₁, ξ₂,..., ξᵤ are any set of commuting
observables, we can set up an orthogonal representation in which the basic
bras are simultaneous eigenbras of ξ₁, ξ₂,..., ξᵤ. Let us suppose first that
there is only one independent simultaneous eigenbra of ξ₁, ξ₂,..., ξᵤ
belonging to any set of eigenvalues ξ₁′, ξ₂′,..., ξᵤ′. Then we may take
these simultaneous eigenbras, with arbitrary numerical coefficients, as
our basic bras. They are all orthogonal on account of the orthogonality
theorem (any two of them will have at least one eigenvalue different,
which is sufficient to make them orthogonal) and there are sufficient
of them to form a complete set, from a result of § 13. They may
conveniently be labelled by the eigenvalues ξ₁′, ξ₂′,..., ξᵤ′ to which they
belong, so that one of them is written ⟨ξ₁′ξ₂′...ξᵤ′|.
Passing now to the general case when there are several independent
simultaneous eigenbras of ξ₁, ξ₂,..., ξᵤ belonging to some sets of eigenvalues,
we must pick out from all the simultaneous eigenbras belonging
to a set of eigenvalues ξ₁′, ξ₂′,..., ξᵤ′ a complete subset, the members
of which are all orthogonal to one another. (The condition of completeness
here means that any simultaneous eigenbra belonging to the
eigenvalues ξ₁′, ξ₂′,..., ξᵤ′ can be expressed linearly in terms of the
members of the subset.) We must do this for each set of eigenvalues
ξ₁′, ξ₂′,..., ξᵤ′ and then put all the members of all the subsets together
and take them as the basic bras of the representation. These bras
are all orthogonal, two of them being orthogonal from the orthogonality
theorem if they belong to different sets of eigenvalues and from
the special way in which they were chosen if they belong to the same
set of eigenvalues, and they form altogether a complete set of bras,
as any bra can be expressed linearly in terms of simultaneous eigenbras
and each simultaneous eigenbra can then be expressed linearly
in terms of the members of a subset. There are infinitely many ways
of choosing the subsets, and each way provides one orthogonal
representation.
For labelling the basic bras in this general case, we may use the
eigenvalues ξ₁′, ξ₂′,..., ξᵤ′ to which they belong, together with certain
additional real variables λ₁, λ₂,..., λᵥ, say, which must be introduced to
distinguish basic vectors belonging to the same set of eigenvalues
from one another. A basic bra is then written ⟨ξ₁′ξ₂′...ξᵤ′λ₁λ₂...λᵥ|.
Corresponding to the variables λ₁, λ₂,..., λᵥ, we can define linear
operators L₁, L₂,..., Lᵥ by equations like (1) and can show that these
linear operators have the basic bras as eigenbras, and that they are
real and observables, and that they commute with one another and
with the ξ's. The basic bras are now simultaneous eigenbras of all
the commuting observables ξ₁, ξ₂,..., ξᵤ, L₁, L₂,..., Lᵥ.
Let us define a complete set of commuting observables to be a set of
observables which all commute with one another and for which there
is only one simultaneous eigenstate belonging to any set of eigen-
values. Then the observables ξ₁, ξ₂,..., ξᵤ, L₁, L₂,..., Lᵥ form a complete
set of commuting observables, there being only one independent simul-
taneous eigenbra belonging to the eigenvalues ξ₁′, ξ₂′,..., ξᵤ′, λ₁, λ₂,..., λᵥ,
namely the corresponding basic bra. Similarly the observables
L₁, L₂,..., Lᵤ defined by equation (1) and the following work form
a complete set of commuting observables. With the help of this
definition the main results of the present section can be concisely
formulated thus:
(i) The basic bras of an orthogonal representation are simul-
taneous eigenbras of a complete set of commuting observ-
ables.
(ii) Given a complete set of commuting observables, we can set
up an orthogonal representation in which the basic bras are
simultaneous eigenbras of this complete set.
(iii) Any set of commuting observables can be made into a com-
plete commuting set by adding certain observables to it.
(iv) A convenient way of labelling the basic bras of an orthogonal
representation is by means of the eigenvalues of the complete
set of commuting observables of which the basic bras are
simultaneous eigenbras.
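Result (iii) can be made concrete: a degenerate observable does not label its eigenbasis uniquely, but adjoining a commuting observable can complete the set. A Python/NumPy sketch (not in the original text; eigenvalues illustrative):

```python
import numpy as np

# xi alone is degenerate: two basis states share the eigenvalue 1.
xi = np.diag([1., 1., 2.])
# L commutes with xi (both are diagonal) and splits the degenerate pair.
L = np.diag([0., 1., 0.])
assert np.allclose(xi @ L, L @ xi)

# Each pair of eigenvalues (xi', L') now picks out exactly one basis
# state, so {xi, L} is a complete set of commuting observables.
pairs = list(zip(np.diag(xi), np.diag(L)))
assert len(set(pairs)) == len(pairs)
```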
The conjugate imaginaries of the basic bras of a representation we
call the basic kets of the representation. Thus, if the basic bras are
denoted by ⟨λ₁λ₂...λᵤ|, the basic kets will be denoted by |λ₁λ₂...λᵤ⟩.
The representative of a bra ⟨b| is given by its scalar product with
each of the basic kets, i.e. by ⟨b|λ₁λ₂...λᵤ⟩. It may, like the representative
of a ket, be looked upon either as a set of numbers or as a
function of the variables λ₁, λ₂,..., λᵤ. We have

⟨b|λ₁λ₂...λᵤ⟩ = conjugate complex of ⟨λ₁λ₂...λᵤ|b⟩,

showing that the representative of a bra is the conjugate complex of the
representative of the conjugate imaginary ket. In an orthogonal representation,
where the basic bras are simultaneous eigenbras of a complete
set of commuting observables, ξ₁, ξ₂,..., ξᵤ say, the basic kets
will be simultaneous eigenkets of ξ₁, ξ₂,..., ξᵤ.
We have not yet considered the lengths of the basic vectors. With
an orthogonal representation, the natural thing to do is to normalize
the basic vectors, rather than leave their lengths arbitrary, and so
introduce a further stage of simplification into the representation.
However, it is possible to normalize them only if the parameters
which label them all take on discrete values. If any of these parameters
are continuous variables that can take on all values in a range,
the basic vectors are eigenvectors of some observable belonging to
eigenvalues in a range and are of infinite length, from the discussion
in § 10 (see p. 39 and top of p. 40). Some other procedure is then
needed to fix the numerical factors by which the basic vectors may
be multiplied. To get a convenient method of handling this question
a new mathematical notation is required, which will be given in the
next section.
15. The δ function
Our work in § 10 led us to consider quantities involving a certain
kind of infinity. To get a precise notation for dealing with these
infinities, we introduce a quantity δ(x) depending on a parameter x
satisfying the conditions

∫_{−∞}^{∞} δ(x) dx = 1,
δ(x) = 0 for x ≠ 0.     (2)

To get a picture of δ(x), take a function of the real variable x which
vanishes everywhere except inside a small domain, of length ε say,
surrounding the origin x = 0, and which is so large inside this domain
that its integral over this domain is unity. The exact shape of the
function inside this domain does not matter, provided there are no
unnecessarily wild variations (for example provided the function
is always of order ε⁻¹). Then in the limit ε → 0 this function will go
over into δ(x).
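The picture just described is easy to realize numerically: a rectangular spike of width ε and height 1/ε has unit integral, and integrating it against a continuous function approaches the value of that function at the origin as ε shrinks. A Python/NumPy sketch (not in the original text; grid and test function illustrative):

```python
import numpy as np

def delta_eps(x, eps):
    # Unit-area spike of width eps about the origin; any shape of order
    # 1/eps inside the spike would serve equally well.
    return np.where(np.abs(x) < eps / 2, 1.0 / eps, 0.0)

x = np.linspace(-1.0, 1.0, 2_000_001)
dx = x[1] - x[0]
f = np.cos                                  # a continuous test function
approx = np.sum(f(x) * delta_eps(x, 0.01)) * dx
assert abs(approx - f(0.0)) < 1e-3          # tends to f(0) as eps -> 0
```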
δ(x) is not a function of x according to the usual mathematical
definition of a function, which requires a function to have a definite
value for each point in its domain, but is something more general,
which we may call an 'improper function' to show up its difference
from a function defined by the usual definition. Thus δ(x) is not a
quantity which can be generally used in mathematical analysis like
an ordinary function, but its use must be confined to certain simple
types of expression for which it is obvious that no inconsistency
can arise.
The most important property of δ(x) is exemplified by the follow-
ing equation,

∫_{−∞}^{∞} f(x)δ(x) dx = f(0),     (3)

where f(x) is any continuous function of x. We can easily see the
validity of this equation from the above picture of δ(x). The left-hand
side of (3) can depend only on the values of f(x) very close
to the origin, so that we may replace f(x) by its value at the origin,
f(0), without essential error. Equation (3) then follows from the
first of equations (2). By making a change of origin in (3), we can
deduce the formula

∫_{−∞}^{∞} f(x)δ(x−a) dx = f(a),     (4)

where a is any real number. Thus the process of multiplying a function
of x by δ(x−a) and integrating over all x is equivalent to the process of
substituting a for x. This general result holds also if the function of x is
not a numerical one, but is a vector or linear operator depending on x.
The range of integration in (3) and (4) need not be from −∞ to ∞,
but may be over any domain surrounding the critical point at which
the δ function does not vanish. In future the limits of integration
will usually be omitted in such equations, it being understood that
the domain of integration is a suitable one.
Equations (3) and (4) show that, although an improper function
does not itself have a well-defined value, when it occurs as a factor
in an integrand the integral has a well-defined value. In quantum
theory, whenever an improper function appears, it will be something
which is to be used ultimately in an integrand. Therefore it should be
possible to rewrite the theory in a form in which the improper functions
appear all through only in integrands. One could then eliminate
the improper functions altogether. The use of improper functions
thus does not involve any lack of rigour in the theory, but is merely
a convenient notation, enabling us to express in a concise form
certain relations which we could, if necessary, rewrite in a form not
involving improper functions, but only in a cumbersome way which
would tend to obscure the argument.
An alternative way of defining the δ function is as the differential
coefficient ε′(x) of the function ε(x) given by

ε(x) = 0   (x < 0)
     = 1   (x > 0).     (5)
We may verify that this is equivalent to the previous definition by
substituting ε′(x) for δ(x) in the left-hand side of (3) and integrating
by parts. We find, for g₁ and g₂ two positive numbers,

∫_{−g₁}^{g₂} f(x)ε′(x) dx = [f(x)ε(x)]_{−g₁}^{g₂} − ∫_{−g₁}^{g₂} f′(x)ε(x) dx
                        = f(g₂) − ∫_{0}^{g₂} f′(x) dx
                        = f(0),

in agreement with (3). The δ function appears whenever one differen-
tiates a discontinuous function.
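This alternative definition can also be checked numerically: a finite-difference derivative of the step function ε(x) of equation (5) behaves like δ(x) under an integral sign. A Python/NumPy sketch (not in the original text; grid and test function illustrative):

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 200_001)
dx = x[1] - x[0]
step = (x > 0).astype(float)        # eps(x) of equation (5)
d_step = np.gradient(step, dx)      # numerical eps'(x): a narrow spike at 0

f = np.exp                          # a continuous test function
val = np.sum(f(x) * d_step) * dx
assert abs(val - f(0.0)) < 1e-3     # reproduces equation (3)
```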
There are a number of elementary equations which one can write
down about δ functions. These equations are essentially rules of
manipulation for algebraic work involving δ functions. The meaning
of any of these equations is that its two sides give equivalent results
as factors in an integrand.
Examples of such equations are

δ(−x) = δ(x),     (6)
x δ(x) = 0,     (7)
δ(ax) = a⁻¹ δ(x)   (a > 0),     (8)
δ(x²−a²) = ½a⁻¹{δ(x−a) + δ(x+a)}   (a > 0),     (9)
∫ δ(a−x) dx δ(x−b) = δ(a−b),     (10)
f(x) δ(x−a) = f(a) δ(x−a).     (11)

Equation (6), which merely states that δ(x) is an even function of its
variable x, is trivial. To verify (7) take any continuous function of
x, f(x). Then

∫ f(x) x δ(x) dx = 0,

from (3). Thus x δ(x) as a factor in an integrand is equivalent to
zero, which is just the meaning of (7). (8) and (9) may be verified
by similar elementary arguments. To verify (10) take any continuous
function of a, f(a). Then

∫ f(a) da ∫ δ(a−x) dx δ(x−b) = ∫ δ(x−b) dx ∫ f(a) da δ(a−x)
                             = ∫ δ(x−b) dx f(x) = ∫ f(a) da δ(a−b).
Thus the two sides of (10) are equivalent as factors in an integrand
with a as variable of integration. It may be shown in the same way
that they are equivalent also as factors in an integrand with b as
variable of integration, so that equation (10) is justified from either
of these points of view. Equation (11) is also easily justified, with
the help of (4), from two points of view.
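Rules such as (8) and (11) can likewise be spot-checked numerically by using a narrow Gaussian in place of δ(x) and comparing both sides as factors in an integrand. A Python/NumPy sketch (not in the original text; the width, grid, and test function are illustrative):

```python
import numpy as np

s = 1e-3                            # Gaussian width standing in for delta
x = np.linspace(-1.0, 1.0, 400_001)
dx = x[1] - x[0]
d = lambda t: np.exp(-t**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
f = lambda t: np.cos(t) + t         # a smooth test function

# (8): delta(a x) = a^{-1} delta(x) for a > 0
a = 3.0
assert abs(np.sum(f(x) * d(a * x)) * dx - f(0.0) / a) < 1e-3

# (11): f(x) delta(x - b) = f(b) delta(x - b)
b = 0.25
lhs = np.sum(f(x) * d(x - b)) * dx
rhs = f(b) * np.sum(d(x - b)) * dx
assert abs(lhs - rhs) < 1e-3
```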
Equation (10) would be given by an application of (4) with
f(x) = δ(x−b). We have here an illustration of the fact that we may
often use an improper function as though it were an ordinary continuous
function, without getting a wrong result.
Equation (7) shows that, whenever one divides both sides of an
equation by a variable x which can take on the value zero, one
should add on to one side an arbitrary multiple of δ(x), i.e. from an
equation

A = B     (12)

one cannot infer A/x = B/x, but only

A/x = B/x + c δ(x),     (13)

where c is unknown.
As an illustration of work with the δ function, we may consider the
differentiation of log x. The usual formula

d/dx log x = 1/x     (14)

requires examination for the neighbourhood of x = 0. In order to
make the reciprocal function 1/x well defined in the neighbourhood
of x = 0 (in the sense of an improper function) we must impose on
it an extra condition, such as that its integral from −ε to ε vanishes.
With this extra condition, the integral of the right-hand side of (14)
from −ε to ε vanishes, while that of the left-hand side of (14) equals
log(−1), so that (14) is not a correct equation. To correct it, we must
remember that, taking principal values, log x has a pure imaginary
term iπ for negative values of x. As x passes through the value zero
this pure imaginary term vanishes discontinuously. The differentiation
of this pure imaginary term gives us the result −iπδ(x), so
that (14) should read

d/dx log x = 1/x − iπδ(x).     (15)

The particular combination of reciprocal function and δ function
appearing in (15) plays an important part in the quantum theory of
collision processes (see § 50).
16. Properties of the basic vectors
Using the notation of the δ function, we can proceed with the theory
of representations. Let us suppose first that we have a single observable
ξ forming by itself a complete commuting set, the condition for
this being that there is only one eigenstate of ξ belonging to any
eigenvalue ξ′, and let us set up an orthogonal representation in which
the basic vectors are eigenvectors of ξ and are written ⟨ξ′|, |ξ′⟩.
In the case when the eigenvalues of ξ are discrete, we can normalize
the basic vectors, and we then have

⟨ξ′|ξ″⟩ = 0   (ξ′ ≠ ξ″),
⟨ξ′|ξ′⟩ = 1.

These equations can be combined into the single equation

⟨ξ′|ξ″⟩ = δ_{ξ′ξ″},     (16)

where the symbol δ with two suffixes, which we shall often use in the
future, has the meaning

δ_{rs} = 0   when r ≠ s
      = 1   when r = s.     (17)
In the case when the eigenvalues of ξ are continuous we cannot
normalize the basic vectors. If we now consider the quantity ⟨ξ′|ξ″⟩
with ξ′ fixed and ξ″ varying, we see from the work connected with
expression (29) of § 10 that this quantity vanishes for ξ″ ≠ ξ′ and
that its integral over a range of ξ″ extending through the value ξ′
is finite, equal to c say. Thus

⟨ξ′|ξ″⟩ = c δ(ξ′−ξ″).

From (30) of § 10, c is a positive number. It may vary with ξ′, so
we should write it c(ξ′) or c′ for brevity, and thus we have

⟨ξ′|ξ″⟩ = c′ δ(ξ′−ξ″).     (18)

Alternatively, we have

⟨ξ′|ξ″⟩ = c″ δ(ξ′−ξ″),     (19)

where c″ is short for c(ξ″), the right-hand sides of (18) and (19) being
equal on account of (11).
Let us pass to another representation whose basic vectors are
eigenvectors of ξ, the new basic vectors being numerical multiples of
the previous ones. Calling the new basic vectors ⟨ξ′*|, |ξ′*⟩, with the
additional label * to distinguish them from the previous ones, we have

⟨ξ′*| = k′⟨ξ′|,   |ξ′*⟩ = k̄′|ξ′⟩,
where k′ is short for k(ξ′) and is a number depending on ξ′. We get

⟨ξ′*|ξ″*⟩ = k′k̄″⟨ξ′|ξ″⟩ = k′k̄″c′ δ(ξ′−ξ″)

with the help of (18). This may be written

⟨ξ′*|ξ″*⟩ = k′k̄′c′ δ(ξ′−ξ″)

from (11). By choosing k′ so that its modulus is c′^(−½), which is possible
since c′ is positive, we arrange to have

⟨ξ′*|ξ″*⟩ = δ(ξ′−ξ″).     (20)

The lengths of the new basic vectors are now fixed so as to make the
representation as simple as possible. The way these lengths were
fixed is in some respects analogous to the normalizing of the basic
vectors in the case of discrete ξ′, equation (20) being of the form of
(16) with the δ function δ(ξ′−ξ″) replacing the δ symbol δ_{ξ′ξ″} of
equation (16). We shall continue to work with the new representation
and shall drop the * labels in it to save writing. Thus (20) will now
be written

⟨ξ′|ξ″⟩ = δ(ξ′−ξ″).     (21)
We can develop the theory on closely parallel lines for the discrete and continuous cases. For the discrete case we have, using (16),
    Σ_{ξ'} |ξ'⟩⟨ξ'|ξ''⟩ = Σ_{ξ'} |ξ'⟩ δ_{ξ'ξ''} = |ξ''⟩,
the sum being taken over all eigenvalues. This equation holds for any basic ket |ξ''⟩ and hence, since the basic kets form a complete set,
    Σ_{ξ'} |ξ'⟩⟨ξ'| = 1.   (22)
This is a useful equation expressing an important property of the basic vectors, namely, if |ξ'⟩ is multiplied on the right by ⟨ξ'| the resulting linear operator, summed for all ξ', equals the unit operator. Equations (16) and (22) give the fundamental properties of the basic vectors for the discrete case.
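These two fundamental properties are easy to check numerically in a finite-dimensional analogue. The sketch below is an editorial illustration, not part of Dirac's text; the Hermitian matrix standing in for the observable ξ is a hypothetical example, and the columns returned by `numpy.linalg.eigh` play the role of the basic kets |ξ'⟩.

```python
import numpy as np

# A Hermitian matrix standing in for the observable xi (hypothetical example).
xi = np.array([[2.0, 1.0j], [-1.0j, 3.0]])

# Its orthonormal eigenvectors play the role of the basic kets |xi'>.
eigvals, kets = np.linalg.eigh(xi)   # columns of `kets` are the |xi'>

# Orthonormality: <xi'|xi''> = delta_{xi' xi''}, as in equation (16).
gram = kets.conj().T @ kets
assert np.allclose(gram, np.eye(2))

# Completeness: the sum over xi' of |xi'><xi'| equals the unit operator,
# as in equation (22).
resolution = sum(np.outer(kets[:, k], kets[:, k].conj()) for k in range(2))
assert np.allclose(resolution, np.eye(2))
```
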
Similarly, for the continuous case we have, using (21),
    ∫ |ξ'⟩ dξ' ⟨ξ'|ξ''⟩ = ∫ |ξ'⟩ dξ' δ(ξ'−ξ'') = |ξ''⟩   (23)
from (4) applied with a ket vector for f(x), the range of integration being the range of eigenvalues. This holds for any basic ket |ξ''⟩ and hence
    ∫ |ξ'⟩ dξ' ⟨ξ'| = 1.   (24)
This is of the same form as (22) with an integral replacing the sum.
Equations (21) and (24) give the fundamental properties of the basic
vectors for the continuous case.
Equations (22) and (24) enable one to expand any bra or ket in terms of the basic vectors. For example, we get for the ket |P⟩ in the discrete case, by multiplying (22) on the right by |P⟩,
    |P⟩ = Σ_{ξ'} |ξ'⟩⟨ξ'|P⟩,   (25)
which gives |P⟩ expanded in terms of the |ξ'⟩'s and shows that the coefficients in the expansion are ⟨ξ'|P⟩, which are just the numbers forming the representative of |P⟩. Similarly, in the continuous case,
    |P⟩ = ∫ |ξ'⟩ dξ' ⟨ξ'|P⟩,   (26)
giving |P⟩ as an integral over the |ξ'⟩'s, with the coefficient in the integrand again just the representative ⟨ξ'|P⟩ of |P⟩. The conjugate imaginary equations to (25) and (26) would give the bra vector ⟨P| expanded in terms of the basic bras.
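Equation (25) has an equally direct finite-dimensional illustration (again an added sketch, with a hypothetical randomly generated basis): the representative ⟨ξ'|P⟩ is obtained by scalar products with the basic kets, and summing |ξ'⟩⟨ξ'|P⟩ reassembles |P⟩.

```python
import numpy as np

rng = np.random.default_rng(0)

# An orthonormal basis of C^3 via QR decomposition; columns are the |xi'>.
kets, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))

P = rng.normal(size=3) + 1j * rng.normal(size=3)   # an arbitrary ket |P>

# Representative of |P>: the numbers <xi'|P>.
rep = kets.conj().T @ P

# Equation (25): |P> = sum over xi' of |xi'> <xi'|P>.
P_expanded = kets @ rep
assert np.allclose(P_expanded, P)
```
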
Our present mathematical methods enable us in the continuous case to expand any ket as an integral of eigenkets of ξ. If we do not use the δ function notation, the expansion of a general ket will consist of an integral plus a sum, as in equation (25) of §10, but the δ function enables us to replace the sum by an integral in which the integrand consists of terms each containing a δ function as a factor. For example, the eigenket |ξ''⟩ may be replaced by an integral of eigenkets, as is shown by the second of equations (23).
If ⟨Q| is any bra and |P⟩ any ket we get, by further applications of (22) and (24),
    ⟨Q|P⟩ = Σ_{ξ'} ⟨Q|ξ'⟩⟨ξ'|P⟩   (27)
for discrete ξ' and
    ⟨Q|P⟩ = ∫ ⟨Q|ξ'⟩ dξ' ⟨ξ'|P⟩   (28)
for continuous ξ'. These equations express the scalar product of ⟨Q| and |P⟩ in terms of their representatives ⟨Q|ξ'⟩ and ⟨ξ'|P⟩. Equation (27) is just the usual formula for the scalar product of two vectors in terms of the coordinates of the vectors, and (28) is the natural modification of this formula for the case of continuous ξ', with an integral instead of a sum.
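The scalar-product formula (27) can be checked the same way (an added sketch; the basis and vectors are hypothetical random data).

```python
import numpy as np

rng = np.random.default_rng(1)
kets, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))

P = rng.normal(size=4) + 1j * rng.normal(size=4)
Q = rng.normal(size=4) + 1j * rng.normal(size=4)

# Direct scalar product <Q|P>.
direct = np.vdot(Q, P)

# Equation (27): <Q|P> = sum over xi' of <Q|xi'> <xi'|P>,
# where <Q|xi'> is the conjugate of <xi'|Q>.
rep_P = kets.conj().T @ P          # the numbers <xi'|P>
rep_Q = kets.conj().T @ Q          # the numbers <xi'|Q>
via_rep = np.sum(rep_Q.conj() * rep_P)

assert np.isclose(direct, via_rep)
```
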
The generalization of the foregoing work to the case when ξ has both discrete and continuous eigenvalues is quite straightforward. Using ξ^r and ξ^s to denote discrete eigenvalues and ξ' and ξ'' to denote continuous eigenvalues, we have the set of equations
    ⟨ξ^r|ξ^s⟩ = δ_{ξ^r ξ^s},   ⟨ξ^r|ξ''⟩ = 0,   ⟨ξ'|ξ''⟩ = δ(ξ'−ξ'')   (29)
as the generalization of (16) or (21). These equations express that the basic vectors are all orthogonal, that those belonging to discrete eigenvalues are normalized and those belonging to continuous eigenvalues have their lengths fixed by the same rule as led to (20). From (29) we can derive, as the generalization of (22) or (24),
    Σ_{ξ^r} |ξ^r⟩⟨ξ^r| + ∫ |ξ'⟩ dξ' ⟨ξ'| = 1,   (30)
the range of integration being the range of continuous eigenvalues.
With the help of (30), we get immediately
    |P⟩ = Σ_{ξ^r} |ξ^r⟩⟨ξ^r|P⟩ + ∫ |ξ'⟩ dξ' ⟨ξ'|P⟩   (31)
as the generalization of (25) or (26), and
    ⟨Q|P⟩ = Σ_{ξ^r} ⟨Q|ξ^r⟩⟨ξ^r|P⟩ + ∫ ⟨Q|ξ'⟩ dξ' ⟨ξ'|P⟩   (32)
as the generalization of (27) or (28).
Let us now pass to the general case when we have several commuting observables ξ_1, ξ_2,..., ξ_u forming a complete commuting set and set up an orthogonal representation in which the basic vectors are simultaneous eigenvectors of all of them, and are written ⟨ξ_1'...ξ_u'|, |ξ_1'...ξ_u'⟩. Let us suppose ξ_1, ξ_2,..., ξ_v (v < u) have discrete eigenvalues and ξ_{v+1},..., ξ_u have continuous eigenvalues.
Consider the quantity ⟨ξ_1'..ξ_v' ξ_{v+1}'..ξ_u'|ξ_1'..ξ_v' ξ_{v+1}''..ξ_u''⟩. From the orthogonality theorem, it must vanish unless each ξ_s'' = ξ_s' for s = v+1,..., u. By extending the work connected with expression (29) of §10 to simultaneous eigenvectors of several commuting observables and extending also the axiom (30), we find that the (u−v)-fold integral of this quantity with respect to each ξ_s'' over a range extending through the value ξ_s' is a finite positive number. Calling this number c', the ' denoting that it is a function of ξ_1',..., ξ_v', ξ_{v+1}',..., ξ_u', we can express our results by the equation
    ⟨ξ_1'..ξ_v' ξ_{v+1}'..ξ_u'|ξ_1'..ξ_v' ξ_{v+1}''..ξ_u''⟩ = c' δ(ξ_{v+1}'−ξ_{v+1}'')..δ(ξ_u'−ξ_u''),   (33)
with one δ factor on the right-hand side for each value of s from v+1 to u. We now change the lengths of our basic vectors so as to make c' unity, by a procedure similar to that which led to (20). By a further use of the orthogonality theorem, we get finally
    ⟨ξ_1'...ξ_u'|ξ_1''...ξ_u''⟩ = δ_{ξ_1'ξ_1''}..δ_{ξ_v'ξ_v''} δ(ξ_{v+1}'−ξ_{v+1}'')..δ(ξ_u'−ξ_u''),   (34)
with a two-suffix δ symbol on the right-hand side for each ξ with discrete eigenvalues and a δ function for each ξ with continuous eigenvalues. This is the generalization of (16) or (21) to the case when there are several commuting observables in the complete set.
From (34) we can derive, as the generalization of (22) or (24),
    Σ_{ξ_1'..ξ_v'} ∫..∫ |ξ_1'..ξ_u'⟩ dξ_{v+1}'..dξ_u' ⟨ξ_1'..ξ_u'| = 1,   (35)
the integral being a (u−v)-fold one over all the ξ's with continuous eigenvalues and the summation being over all the ξ's with discrete eigenvalues. Equations (34) and (35) give the fundamental properties of the basic vectors in the present case. From (35) we can immediately write down the generalization of (25) or (26) and of (27) or (28).
The case we have just considered can be further generalized by allowing some of the ξ's to have both discrete and continuous eigenvalues. The modifications required in the equations are quite straightforward, but will not be given here as they are rather cumbersome to write down in general form.
There are some problems in which it is convenient not to make the c' of equation (33) equal unity, but to make it equal to some definite function of the ξ''s instead. Calling this function of the ξ''s ρ'^{-1} we then have, instead of (34),
    ⟨ξ_1'...ξ_u'|ξ_1''...ξ_u''⟩ = ρ'^{-1} δ_{ξ_1'ξ_1''}..δ_{ξ_v'ξ_v''} δ(ξ_{v+1}'−ξ_{v+1}'')..δ(ξ_u'−ξ_u''),   (36)
and instead of (35) we get
    Σ_{ξ_1'..ξ_v'} ∫..∫ |ξ_1'..ξ_u'⟩ ρ' dξ_{v+1}'..dξ_u' ⟨ξ_1'..ξ_u'| = 1.   (37)
ρ' is called the weight function of the representation, ρ' dξ_{v+1}'..dξ_u' being the 'weight' attached to a small volume element of the space of the variables ξ_{v+1}',..., ξ_u'.
The representations we considered previously all had the weight function unity. The introduction of a weight function not unity is entirely a matter of convenience and does not add anything to the mathematical power of the representation. The basic bras ⟨ξ_1'...ξ_u'*| of a representation with the weight function ρ' are connected with the basic bras ⟨ξ_1'...ξ_u'| of the corresponding representation with the weight function unity by
    ⟨ξ_1'...ξ_u'*| = ρ'^{-1/2} ⟨ξ_1'...ξ_u'|,   (38)
as is easily verified. An example of a useful representation with non-unit weight function occurs when one has two ξ's which are the polar and azimuthal angles θ and φ giving a direction in three-dimensional space and one takes ρ' = sin θ'. One then has the element of solid angle sin θ' dθ'dφ' occurring in (37).
17. The representation of linear operators
In §14 we saw how to represent ket and bra vectors by sets of numbers. We now have to do the same for linear operators, in order to have a complete scheme for representing all our abstract quantities by sets of numbers. The same basic vectors that we had in §14 can be used again for this purpose.
Let us suppose the basic vectors are simultaneous eigenvectors of a complete set of commuting observables ξ_1, ξ_2,..., ξ_u. If α is any linear operator, we take a general basic bra ⟨ξ_1'...ξ_u'| and a general basic ket |ξ_1''...ξ_u''⟩ and form the numbers
    ⟨ξ_1'...ξ_u'|α|ξ_1''...ξ_u''⟩.   (39)
These numbers are sufficient to determine α completely, since in the first place they determine the ket α|ξ_1''...ξ_u''⟩ (as they provide the representative of this ket), and the value of this ket for all the basic kets |ξ_1''...ξ_u''⟩ determines α. The numbers (39) are called the representative of the linear operator α or of the dynamical variable α. They are more complicated than the representative of a ket or bra vector in that they involve the parameters that label two basic vectors instead of one.
Let us examine the form of these numbers in simple cases. Take first the case when there is only one ξ, forming a complete commuting set by itself, and suppose that it has discrete eigenvalues ξ'. The representative of α is then the discrete set of numbers ⟨ξ'|α|ξ''⟩. If one had to write out these numbers explicitly, the natural way of arranging them would be as a two-dimensional array, thus:
    ⟨ξ^1|α|ξ^1⟩  ⟨ξ^1|α|ξ^2⟩  ⟨ξ^1|α|ξ^3⟩  . .
    ⟨ξ^2|α|ξ^1⟩  ⟨ξ^2|α|ξ^2⟩  ⟨ξ^2|α|ξ^3⟩  . .   (40)
    ⟨ξ^3|α|ξ^1⟩  ⟨ξ^3|α|ξ^2⟩  ⟨ξ^3|α|ξ^3⟩  . .
    .  .  .  .  .  .  .  .  .  .  .
    .  .  .  .  .  .  .  .  .  .  .
where ξ^1, ξ^2, ξ^3,... are all the eigenvalues of ξ. Such an array is called a matrix and the numbers are called the elements of the matrix. We make the convention that the elements must always be arranged so that those in the same row refer to the same basic bra vector and those in the same column refer to the same basic ket vector.
An element ⟨ξ'|α|ξ'⟩ referring to two basic vectors with the same label is called a diagonal element of the matrix, as all such elements lie on a diagonal. If we put α equal to unity, we have from (16) all the diagonal elements equal to unity and all the other elements equal to zero. The matrix is then called the unit matrix.
If α is real, we have
    ⟨ξ'|α|ξ''⟩ = ⟨ξ''|α|ξ'⟩*.   (41)
The effect of these conditions on the matrix (40) is to make the diagonal elements all real and each of the other elements equal the conjugate complex of its mirror reflection in the diagonal. The matrix is then called a Hermitian matrix.
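In matrix terms, condition (41) says the array equals its own conjugate transpose. A minimal added sketch (not from the text), building a hypothetical real, that is self-adjoint, operator as A + A†:

```python
import numpy as np

rng = np.random.default_rng(2)

# A real (self-adjoint) linear operator: alpha = A + A(dagger), A arbitrary.
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
alpha = A + A.conj().T

# Equation (41): <xi'|alpha|xi''> equals the conjugate of <xi''|alpha|xi'>.
assert np.allclose(alpha, alpha.conj().T)

# The diagonal elements are all real, as stated for Hermitian matrices.
assert np.allclose(np.diag(alpha).imag, 0.0)
```
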
If we put α equal to ξ, we get for a general element of the matrix
    ⟨ξ'|ξ|ξ''⟩ = ξ' ⟨ξ'|ξ''⟩ = ξ' δ_{ξ'ξ''}.   (42)
Thus all the elements not on the diagonal are zero. The matrix is then called a diagonal matrix. Its diagonal elements are just equal to the eigenvalues of ξ. More generally, if we put α equal to f(ξ), a function of ξ, we get
    ⟨ξ'|f(ξ)|ξ''⟩ = f(ξ') δ_{ξ'ξ''}   (43)
and the matrix is again a diagonal matrix.
Let us determine the representative of a product αβ of two linear operators α and β in terms of the representatives of the factors. From equation (22) with ξ'' substituted for ξ' we obtain
    ⟨ξ'|αβ|ξ'''⟩ = ⟨ξ'|α { Σ_{ξ''} |ξ''⟩⟨ξ''| } β|ξ'''⟩
                 = Σ_{ξ''} ⟨ξ'|α|ξ''⟩⟨ξ''|β|ξ'''⟩,   (44)
which gives us the required result. Equation (44) shows that the matrix formed by the elements ⟨ξ'|αβ|ξ'''⟩ equals the product of the matrices formed by the elements ⟨ξ'|α|ξ''⟩ and ⟨ξ''|β|ξ'''⟩ respectively, according to the usual mathematical rule for multiplying matrices. This rule gives for the element in the rth row and sth column of the product matrix the sum of the products of each element in the rth row of the first factor matrix with the corresponding element in the sth column of the second factor matrix. The multiplication of matrices is non-commutative, like the multiplication of linear operators.
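Equation (44) and the non-commutativity remark can both be verified numerically. An added sketch with hypothetical random operators and a random orthonormal basis:

```python
import numpy as np

rng = np.random.default_rng(3)
kets, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # orthonormal basic kets

alpha = rng.normal(size=(3, 3))
beta = rng.normal(size=(3, 3))

def representative(op):
    """Matrix of elements <xi'|op|xi''> in the chosen representation."""
    return kets.conj().T @ op @ kets

# Equation (44): the representative of alpha*beta is the matrix product
# of the representatives of alpha and beta.
assert np.allclose(representative(alpha @ beta),
                   representative(alpha) @ representative(beta))

# Matrix multiplication is non-commutative, like operator multiplication.
assert not np.allclose(representative(alpha) @ representative(beta),
                       representative(beta) @ representative(alpha))
```
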
We can summarize our results for the case when there is only one ξ and it has discrete eigenvalues as follows:
(i) Any linear operator is represented by a matrix.
(ii) The unit operator is represented by the unit matrix.
(iii) A real linear operator is represented by a Hermitian matrix.
(iv) ξ and functions of ξ are represented by diagonal matrices.
(v) The matrix representing the product of two linear operators is the product of the matrices representing the two factors.
Let us now consider the case when there is only one ξ and it has continuous eigenvalues. The representative of α is now ⟨ξ'|α|ξ''⟩, a function of two variables ξ' and ξ'' which can vary continuously. It is convenient to call such a function a 'matrix', using this word in a generalized sense, in order that we may be able to use the same terminology for the discrete and continuous cases. One of these generalized matrices cannot, of course, be written out as a two-dimensional array like an ordinary matrix, since the number of its rows and columns is an infinity equal to the number of points on a line, and the number of its elements is an infinity equal to the number of points in an area.
We arrange our definitions concerning these generalized matrices so that the rules (i)-(v) which we had above for the discrete case hold also for the continuous case. The unit operator is represented by δ(ξ'−ξ'') and the generalized matrix formed by these elements we define to be the unit matrix. We still have equation (41) as the condition for α to be real and we define the generalized matrix formed by the elements ⟨ξ'|α|ξ''⟩ to be Hermitian when it satisfies this condition. ξ is represented by
    ⟨ξ'|ξ|ξ''⟩ = ξ' δ(ξ'−ξ'')   (45)
and f(ξ) by
    ⟨ξ'|f(ξ)|ξ''⟩ = f(ξ') δ(ξ'−ξ''),   (46)
and the generalized matrices formed by these elements we define to be diagonal matrices. From (11), we could equally well have ξ'' and f(ξ'') as the coefficients of δ(ξ'−ξ'') on the right-hand sides of (45) and (46) respectively. Corresponding to equation (44) we now have, from (24),
    ⟨ξ'|αβ|ξ''⟩ = ∫ ⟨ξ'|α|ξ'''⟩ dξ''' ⟨ξ'''|β|ξ''⟩,   (47)
with an integral instead of a sum, and we define the generalized matrix formed by the elements on the right-hand side here to be the product of the matrices formed by ⟨ξ'|α|ξ''⟩ and ⟨ξ'|β|ξ''⟩. With these definitions we secure complete parallelism between the discrete and continuous cases and we have the rules (i)-(v) holding for both.
The question arises how a general diagonal matrix is to be defined in the continuous case, as so far we have only defined the right-hand sides of (45) and (46) to be examples of diagonal matrices. One might be inclined to define as diagonal any matrix whose (ξ', ξ'') elements all vanish except when ξ' differs infinitely little from ξ'', but this would not be satisfactory, because an important property of diagonal matrices in the discrete case is that they always commute with one another and we want this property to hold also in the continuous case. In order that the matrix formed by the elements ⟨ξ'|ω|ξ''⟩ in the continuous case may commute with that formed by the elements on the right-hand side of (45) we must have, using the multiplication rule (47),
    ∫ ⟨ξ'|ω|ξ'''⟩ dξ''' ξ''' δ(ξ'''−ξ'') = ∫ ξ' δ(ξ'−ξ''') dξ''' ⟨ξ'''|ω|ξ''⟩.
With the help of formula (4), this reduces to
    ⟨ξ'|ω|ξ''⟩ ξ'' = ξ' ⟨ξ'|ω|ξ''⟩   (48)
or
    (ξ'−ξ'') ⟨ξ'|ω|ξ''⟩ = 0.
This gives, according to the rule by which (13) follows from (12),
    ⟨ξ'|ω|ξ''⟩ = c' δ(ξ'−ξ''),
where c' is a number that may depend on ξ'. Thus ⟨ξ'|ω|ξ''⟩ is of the form of the right-hand side of (46). For this reason we define only matrices whose elements are of the form of the right-hand side of (46) to be diagonal matrices. It is easily verified that these matrices all commute with one another. One can form other matrices whose (ξ', ξ'') elements all vanish when ξ' differs appreciably from ξ'' and have a different form of singularity when ξ' equals ξ'' [we shall later introduce the derivative δ'(x) of the δ function and δ'(ξ'−ξ'') will then be an example, see §22 equation (19)], but these other matrices are not diagonal according to the definition.
Let us now pass on to the case when there is only one ξ and it has both discrete and continuous eigenvalues. Using ξ^r, ξ^s to denote discrete eigenvalues and ξ', ξ'' to denote continuous eigenvalues, we now have the representative of α consisting of four kinds of quantities, ⟨ξ^r|α|ξ^s⟩, ⟨ξ^r|α|ξ'⟩, ⟨ξ'|α|ξ^s⟩, ⟨ξ'|α|ξ''⟩. These quantities can all be put together and considered to form a more general kind of matrix having some discrete rows and columns and also a continuous range of rows and columns. We define unit matrix, Hermitian matrix, diagonal matrix, and the product of two matrices also for this more general kind of matrix so as to make the rules (i)-(v) still hold. The details are a straightforward generalization of what has gone before and need not be given explicitly.
Let us now go back to the general case of several ξ's, ξ_1, ξ_2,..., ξ_u. The representative of α, expression (39), may still be looked upon as forming a matrix, with rows corresponding to different values of ξ_1',..., ξ_u' and columns corresponding to different values of ξ_1'',..., ξ_u''. Unless all the ξ's have discrete eigenvalues, this matrix will be of the generalized kind with continuous ranges of rows and columns. We again arrange our definitions so that the rules (i)-(v) hold, with rule (iv) generalized to:
(iv') Each ξ_m (m = 1, 2,..., u) and any function of them is represented by a diagonal matrix.
A diagonal matrix is now defined as one whose general element ⟨ξ_1'..ξ_u'|ω|ξ_1''..ξ_u''⟩ is of the form
    c' δ_{ξ_1'ξ_1''}..δ_{ξ_v'ξ_v''} δ(ξ_{v+1}'−ξ_{v+1}'')..δ(ξ_u'−ξ_u'')   (49)
in the case when ξ_1,..., ξ_v have discrete eigenvalues and ξ_{v+1},..., ξ_u have continuous eigenvalues, c' being any function of the ξ''s. This definition is the generalization of what we had with one ξ and makes diagonal matrices always commute with one another. The other definitions are straightforward and need not be given explicitly.
We now have a linear operator always represented by a matrix. The sum of two linear operators is represented by the sum of the matrices representing the operators and this, together with rule (v), means that the matrices are subject to the same algebraic relations as the linear operators. If any algebraic equation holds between certain linear operators, the same equation must hold between the matrices representing those operators.
The scheme of matrices can be extended to bring in the representatives of ket and bra vectors. The matrices representing linear operators are all square matrices with the same number of rows and columns, and with, in fact, a one-one correspondence between their rows and columns. We may look upon the representative of a ket |P⟩ as a matrix with a single column by setting all the numbers ⟨ξ_1'...ξ_u'|P⟩ which form this representative one below the other. The number of rows in this matrix will be the same as the number of rows or columns in the square matrices representing linear operators. Such a single-column matrix can be multiplied on the left by a square matrix ⟨ξ_1'...ξ_u'|α|ξ_1''...ξ_u''⟩ representing a linear operator, by a rule similar to that for the multiplication of two square matrices. The product is another single-column matrix with elements given by
    Σ_{ξ_1''..ξ_v''} ∫..∫ ⟨ξ_1'...ξ_u'|α|ξ_1''...ξ_u''⟩ dξ_{v+1}''..dξ_u'' ⟨ξ_1''...ξ_u''|P⟩.
From (35) this is just equal to ⟨ξ_1'...ξ_u'|α|P⟩, the representative of α|P⟩. Similarly we may look upon the representative of a bra ⟨Q| as a matrix with a single row by setting all the numbers ⟨Q|ξ_1'...ξ_u'⟩ side by side. Such a single-row matrix may be multiplied on the right by a square matrix ⟨ξ_1'...ξ_u'|α|ξ_1''...ξ_u''⟩, the product being another single-row matrix, which is just the representative of ⟨Q|α. The single-row matrix representing ⟨Q| may be multiplied on the right by the single-column matrix representing |P⟩, the product being a matrix with just a single element, which is equal to ⟨Q|P⟩. Finally, the single-row matrix representing ⟨Q| may be multiplied on the left by the single-column matrix representing |P⟩, the product being a square matrix, which is just the representative of |P⟩⟨Q|. In this way all our abstract symbols, linear operators, bra vectors, and ket vectors, can be represented by matrices, which are subject to the same algebraic relations as the abstract symbols themselves.
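The bookkeeping described here maps directly onto column vectors, row vectors, and outer products. An added sketch (all names and values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
alpha = rng.normal(size=(n, n))     # square matrix representing an operator
P = rng.normal(size=(n, 1))         # |P> as a single-column matrix
Q_bra = rng.normal(size=(1, n))     # <Q| as a single-row matrix

col = alpha @ P        # single-column matrix: representative of alpha|P>
row = Q_bra @ alpha    # single-row matrix: representative of <Q|alpha
scalar = Q_bra @ P     # matrix with a single element, equal to <Q|P>
outer = P @ Q_bra      # square matrix: representative of |P><Q|

assert scalar.shape == (1, 1)
assert outer.shape == (n, n)
```
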
18. Probability amplitudes
Representations are of great importance in the physical interpretation of quantum mechanics as they provide a convenient method for obtaining the probabilities of observables having given values. In §12 we obtained the probability of an observable having any specified value for a given state and in §13 we generalized this result and obtained the probability of a set of commuting observables simultaneously having specified values for a given state. Let us now apply this result to a complete set of commuting observables, say the set of ξ's which we have been dealing with already. According to formula (51) of §13, the probability of each ξ_r having the value ξ_r' for the state corresponding to the normalized ket vector |x⟩ is
    P_{ξ'} = ⟨x| δ_{ξ_1 ξ_1'} δ_{ξ_2 ξ_2'} .. δ_{ξ_u ξ_u'} |x⟩.   (50)
If the ξ's all have discrete eigenvalues, we can use (35) with v = u, and no integrals, and get
    P_{ξ'} = |⟨ξ_1'...ξ_u'|x⟩|².   (51)
We thus get the simple result that the probability of the ξ's having the values ξ' is just the square of the modulus of the appropriate coordinate of the normalized ket vector corresponding to the state concerned.
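In a small discrete sketch (added here; the amplitude values are hypothetical), equation (51) reads:

```python
import numpy as np

# Representative of a normalized ket |x> in a discrete representation:
# the probability amplitudes <xi'|x> (hypothetical values).
amplitudes = np.array([0.6, 0.8j])
assert np.isclose(np.vdot(amplitudes, amplitudes).real, 1.0)   # |x> normalized

# Equation (51): the probability of the xi's having the values xi' is the
# squared modulus of the corresponding amplitude.
probabilities = np.abs(amplitudes) ** 2
assert np.allclose(probabilities, [0.36, 0.64])
assert np.isclose(probabilities.sum(), 1.0)
```
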
If the ξ's do not all have discrete eigenvalues, but if, say, ξ_1,..., ξ_v have discrete eigenvalues and ξ_{v+1},..., ξ_u have continuous eigenvalues, then to get something physically significant we must obtain the probability of each ξ_r (r = 1,..., v) having a specified value ξ_r' and each ξ_s (s = v+1,..., u) lying in a specified small range ξ_s' to ξ_s'+dξ_s'. For this purpose we must replace each factor δ_{ξ_s ξ_s'} in (50) by a factor χ_s, which is that function of the observable ξ_s which is equal to unity for ξ_s within the range ξ_s' to ξ_s'+dξ_s' and zero otherwise. Proceeding as before with the help of (35), we obtain for this probability
    P = |⟨ξ_1'...ξ_u'|x⟩|² dξ_{v+1}'..dξ_u'.   (52)
Thus in every case the probability distribution of values for the ξ's is given by the square of the modulus of the representative of the normalized ket vector corresponding to the state concerned.
The numbers which form the representative of a normalized ket (or bra) may for this reason be called probability amplitudes. The square of the modulus of a probability amplitude is an ordinary probability, or a probability per unit range for those variables that have continuous ranges of values.
We may be interested in a state whose corresponding ket |x⟩ cannot be normalized. This occurs, for example, if the state is an eigenstate of some observable belonging to an eigenvalue lying in a range of eigenvalues. The formula (51) or (52) can then still be used to give the relative probability of the ξ's having specified values or having values lying in specified small ranges, i.e. it will give correctly the ratios of the probabilities for different ξ''s. The numbers ⟨ξ_1'...ξ_u'|x⟩ may then be called relative probability amplitudes.
The representation for which the above results hold is characterized by the basic vectors being simultaneous eigenvectors of all the ξ's. It may also be characterized by the requirement that each of the ξ's shall be represented by a diagonal matrix, this condition being easily seen to be equivalent to the previous one. The latter characterization is usually the more convenient one. For brevity, we shall formulate it as each of the ξ's 'being diagonal in the representation'.
Provided the ξ's form a complete set of commuting observables, the representation is completely determined by the characterization, apart from arbitrary phase factors in the basic vectors. Each basic bra ⟨ξ_1'...ξ_u'| may be multiplied by e^{iγ'}, where γ' is any real function of the variables ξ_1',..., ξ_u', without changing any of the conditions which the representation has to satisfy, i.e. the condition that the ξ's are diagonal or that the basic vectors are simultaneous eigenvectors of the ξ's, and the fundamental properties of the basic vectors (34) and (35). With the basic bras changed in this way, the representative ⟨ξ_1'...ξ_u'|P⟩ of a ket |P⟩ gets multiplied by e^{iγ'}, the representative ⟨Q|ξ_1'...ξ_u'⟩ of a bra ⟨Q| gets multiplied by e^{-iγ'} and the representative ⟨ξ_1'...ξ_u'|α|ξ_1''...ξ_u''⟩ of a linear operator α gets multiplied by e^{i(γ'−γ'')}. The probabilities or relative probabilities (51), (52) are, of course, unaltered.
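The phase freedom is easy to exhibit numerically (an added sketch with hypothetical random data): multiplying the representative by e^{iγ'} for any real γ' leaves the probabilities (51) untouched.

```python
import numpy as np

rng = np.random.default_rng(5)
rep = rng.normal(size=4) + 1j * rng.normal(size=4)   # representative <xi'|P>
rep /= np.linalg.norm(rep)                           # normalize |P>

gamma = rng.uniform(0, 2 * np.pi, size=4)   # arbitrary real function gamma' of the xi'
rep_new = np.exp(1j * gamma) * rep          # basic bras multiplied by e^{i gamma'}

# The probabilities (51) are unaltered by the arbitrary phase factors.
assert np.allclose(np.abs(rep_new) ** 2, np.abs(rep) ** 2)
```
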
The probabilities that one calculates in practical problems in quantum mechanics are nearly always obtained from the squares of the moduli of probability amplitudes or relative probability amplitudes. Even when one is interested only in the probability of an incomplete set of commuting observables having specified values, it is usually necessary first to make the set a complete one by the introduction of some extra commuting observables and to obtain the probability of the complete set having specified values (as the square of the modulus of a probability amplitude), and then to sum or integrate over all possible values of the extra observables. A more direct application of formula (51) of §13 is usually not practicable.
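This sum-over-the-extra-observables procedure looks as follows in a small discrete sketch (an added illustration; the amplitudes and the names ξ, η are hypothetical):

```python
import numpy as np

# Amplitudes <xi' eta'|x> for a complete set made of an observable xi of
# interest plus an extra commuting observable eta.
# Rows: values of xi'; columns: values of eta' (hypothetical numbers).
amp = np.array([[0.5, 0.5],
                [0.5j, -0.5]])
assert np.isclose(np.sum(np.abs(amp) ** 2), 1.0)   # normalized state

# Probability of xi alone having a given value: square the moduli first,
# then sum over all possible values of the extra observable eta.
prob_xi = np.sum(np.abs(amp) ** 2, axis=1)
assert np.allclose(prob_xi, [0.5, 0.5])
```
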
To introduce a representation in practice
(i) we look for observables which we would like to have diagonal, either because we are interested in their probabilities or for reasons of mathematical simplicity;
(ii) we must see that they all commute, a necessary condition since diagonal matrices always commute;
(iii) we then see that they form a complete commuting set, and if not we add some more commuting observables to them to make them into a complete commuting set;
(iv) we set up an orthogonal representation with this complete commuting set diagonal.
The representation is then completely determined except for the arbitrary phase factors. For most purposes the arbitrary phase factors are unimportant and trivial, so that we may count the representation as being completely determined by the observables that are diagonal in it. This fact is already implied in our notation, since the only indication in a representative of the representation to which it belongs is the letters denoting the observables that are diagonal.
It may be that we are interested in two representations for the same dynamical system. Suppose that in one of them the complete set of commuting observables ξ_1,..., ξ_u are diagonal and the basic bras are ⟨ξ_1'...ξ_u'| and in the other the complete set of commuting observables η_1,..., η_w are diagonal and the basic bras are ⟨η_1'...η_w'|. A ket |P⟩ will now have the two representatives ⟨ξ_1'...ξ_u'|P⟩ and ⟨η_1'...η_w'|P⟩. If ξ_1,..., ξ_v have discrete eigenvalues and ξ_{v+1},..., ξ_u have continuous eigenvalues and if η_1,..., η_x have discrete eigenvalues and η_{x+1},..., η_w have continuous eigenvalues, we get from (35)
    ⟨η_1'..η_w'|P⟩ = Σ_{ξ_1'..ξ_v'} ∫..∫ ⟨η_1'..η_w'|ξ_1'..ξ_u'⟩ dξ_{v+1}'..dξ_u' ⟨ξ_1'..ξ_u'|P⟩   (53)
and interchanging ξ's and η's
    ⟨ξ_1'..ξ_u'|P⟩ = Σ_{η_1'..η_x'} ∫..∫ ⟨ξ_1'..ξ_u'|η_1'..η_w'⟩ dη_{x+1}'..dη_w' ⟨η_1'..η_w'|P⟩.   (54)
These are the transformation equations which give one representative of |P⟩ in terms of the other. They show that either representative is expressible linearly in terms of the other, with the quantities
    ⟨η_1'..η_w'|ξ_1'..ξ_u'⟩,   ⟨ξ_1'..ξ_u'|η_1'..η_w'⟩   (55)
as coefficients. These quantities are called the transformation functions. Similar equations may be written down to connect the two representatives of a bra vector or of a linear operator. The transformation functions (55) are in every case the means which enable one to pass from one representative to the other. Each of the
transformation functions is the conjugate complex of the other, and they satisfy the conditions
    Σ_{ξ_1'..ξ_v'} ∫..∫ ⟨η_1'..η_w'|ξ_1'..ξ_u'⟩ dξ_{v+1}'..dξ_u' ⟨ξ_1'..ξ_u'|η_1''..η_w''⟩
        = δ_{η_1'η_1''}..δ_{η_x'η_x''} δ(η_{x+1}'−η_{x+1}'')..δ(η_w'−η_w'')   (56)
and the corresponding conditions with ξ's and η's interchanged, as may be verified from (35) and (34) and the corresponding equations for the η's.
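In the finite discrete case the transformation functions (55) form a unitary matrix and (53) becomes an ordinary matrix-vector product. An added sketch with two hypothetical random orthonormal bases:

```python
import numpy as np

rng = np.random.default_rng(6)

# Two orthonormal bases of the same space: the xi- and eta-representations.
xi_kets, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
eta_kets, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))

# Transformation functions <eta'|xi'>, as in (55).
T = eta_kets.conj().T @ xi_kets

P = rng.normal(size=3) + 1j * rng.normal(size=3)
rep_xi = xi_kets.conj().T @ P    # <xi'|P>
rep_eta = eta_kets.conj().T @ P  # <eta'|P>

# Equation (53): <eta'|P> = sum over xi' of <eta'|xi'><xi'|P>.
assert np.allclose(T @ rep_xi, rep_eta)

# The conditions (56): the transformation functions form a unitary matrix.
assert np.allclose(T.conj().T @ T, np.eye(3))
```
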
Transformation functions are examples of probability amplitudes or relative probability amplitudes. Let us take the case when all the ξ's and all the η's have discrete eigenvalues. Then the basic ket |η_1'...η_w'⟩ is normalized, so that its representative in the ξ-representation, ⟨ξ_1'...ξ_u'|η_1'...η_w'⟩, is a probability amplitude for each set of values for the ξ''s. The state to which these probability amplitudes refer, namely the state corresponding to |η_1'...η_w'⟩, is characterized by the condition that a simultaneous measurement of η_1,..., η_w is certain to lead to the results η_1',..., η_w'. Thus |⟨ξ_1'...ξ_u'|η_1'...η_w'⟩|² is the probability of the ξ's having the values ξ_1',..., ξ_u' for the state for which the η's certainly have the values η_1',..., η_w'. Since
    |⟨ξ_1'...ξ_u'|η_1'...η_w'⟩|² = |⟨η_1'...η_w'|ξ_1'...ξ_u'⟩|²,
we have the theorem of reciprocity: the probability of the ξ's having the values ξ' for the state for which the η's certainly have the values η' is equal to the probability of the η's having the values η' for the state for which the ξ's certainly have the values ξ'.
If all the η's have discrete eigenvalues and some of the ξ's have continuous eigenvalues, |⟨ξ_1'...ξ_u'|η_1'...η_w'⟩|² still gives the probability distribution of values for the ξ's for the state for which the η's certainly have the values η'. If some of the η's have continuous eigenvalues, |η_1'...η_w'⟩ is not normalized and |⟨ξ_1'...ξ_u'|η_1'...η_w'⟩|² then gives only the relative probability distribution of values for the ξ's for the state for which the η's certainly have the values η'.
19. Theorems about functions of observables
We shall illustrate the mathematical value of representations by using them to prove some theorems.
THEOREM 1. A linear operator that commutes with an observable ξ commutes also with any function of ξ.
The theorem is obviously true when the function is expressible as a power series. To prove it generally, let ω be the linear operator, so that we have the equation
    ξω − ωξ = 0.   (57)
Let us introduce a representation in which ξ is diagonal. If ξ by itself does not form a complete commuting set of observables, we must make it into a complete commuting set by adding certain observables, β say, to it, and then take the representation in which ξ and the β's are diagonal. (The case when ξ does form a complete commuting set by itself can be looked upon as a special case of the preceding one with the number of β variables zero.) In this representation equation (57) becomes
    ⟨ξ'β'|ξω − ωξ|ξ''β''⟩ = 0,
which reduces to
    ξ' ⟨ξ'β'|ω|ξ''β''⟩ − ⟨ξ'β'|ω|ξ''β''⟩ ξ'' = 0.
In the case when the eigenvalues of ξ are discrete, this equation shows that all the matrix elements ⟨ξ'β'|ω|ξ''β''⟩ of ω vanish except those for which ξ' = ξ''. In the case when the eigenvalues of ξ are continuous it shows, like equation (48), that ⟨ξ'β'|ω|ξ''β''⟩ is of the form
    ⟨ξ'β'|ω|ξ''β''⟩ = c δ(ξ'−ξ''),
where c is some function of ξ' and the β''s and β'''s. In either case we may say that the matrix representing ω 'is diagonal with respect to ξ'. If f(ξ) denotes any function of ξ in accordance with the general theory of §11, which requires f(ξ') to be defined for ξ' any eigenvalue of ξ, we can deduce in either case
    f(ξ') ⟨ξ'β'|ω|ξ''β''⟩ − ⟨ξ'β'|ω|ξ''β''⟩ f(ξ'') = 0.
This gives
    ⟨ξ'β'|f(ξ)ω − ωf(ξ)|ξ''β''⟩ = 0,
so that
    f(ξ)ω − ωf(ξ) = 0
and the theorem is proved.
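A finite discrete illustration of Theorem 1 (an added sketch; the matrices are hypothetical examples): if ξ is diagonal with distinct eigenvalues, an ω commuting with ξ is itself diagonal, and then commutes with f(ξ) as well.

```python
import numpy as np

# xi with non-degenerate discrete eigenvalues, taken diagonal.
xi = np.diag([1.0, 2.0, 4.0])

# Any omega commuting with such a xi must be diagonal too (its matrix
# elements with xi' != xi'' vanish); take a hypothetical example.
omega = np.diag([5.0, -1.0, 3.0])
assert np.allclose(xi @ omega - omega @ xi, 0.0)

# f(xi): apply f to the eigenvalues on the diagonal, as in (43).
f_xi = np.diag(np.exp(np.diag(xi)))

# Theorem 1: omega commutes with f(xi) as well.
assert np.allclose(f_xi @ omega - omega @ f_xi, 0.0)
```
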
As a special case of the theorem, we have the result that any observable that commutes with an observable ξ also commutes with any function of ξ. This result appears as a physical necessity when we identify, as in §13, the condition of commutability of two observables with the condition of compatibility of the corresponding observations. Any observation that is compatible with the measurement of an observable ξ must also be compatible with the measurement of f(ξ), since any measurement of ξ includes in itself a measurement of f(ξ).
THEOREM 2. A linear operator that commutes with each of a complete set of commuting observables is a function of those observables.
Let ω be the linear operator and ξ_1, ξ_2,..., ξ_u the complete set of commuting observables, and set up a representation with these observables diagonal. Since ω commutes with each of the ξ's, the matrix representing it is diagonal with respect to each of the ξ's, by the argument we had above. This matrix is therefore a diagonal matrix and is of the form (49), involving a number c' which is a function of the ξ''s. It thus represents the function of the ξ's that c' is of the ξ''s, and hence ω equals this function of the ξ's.
THEOREM 3. If an observable ξ and a linear operator g are such that any linear operator that commutes with ξ also commutes with g, then g is a function of ξ.
This is the converse of Theorem 1. To prove it, we use the same representation with ξ diagonal as we had for Theorem 1. In the first place, we see that g must commute with ξ itself, and hence the representative of g must be diagonal with respect to ξ, i.e. it must be of the form
    ⟨ξ'β'|g|ξ''β''⟩ = a(ξ'β'β'') δ_{ξ'ξ''}  or  a(ξ'β'β'') δ(ξ'−ξ''),
according to whether ξ has discrete or continuous eigenvalues. Now let ω be any linear operator that commutes with ξ, so that its representative is of the form
    ⟨ξ'β'|ω|ξ''β''⟩ = b(ξ'β'β'') δ_{ξ'ξ''}  or  b(ξ'β'β'') δ(ξ'−ξ'').
By hypothesis ω must also commute with g, so that
    ⟨ξ'β'|gω − ωg|ξ''β''⟩ = 0.   (58)
If we suppose for definiteness that the β's have discrete eigenvalues, (58) leads, with the help of the law of matrix multiplication, to
    Σ_{β'''} {a(ξ'β'β''') b(ξ'β'''β'') − b(ξ'β'β''') a(ξ'β'''β'')} = 0,   (59)
the left-hand side of (58) being equal to the left-hand side of (59) multiplied by δ_{ξ'ξ''} or δ(ξ'−ξ''). Equation (59) must hold for all functions b(ξ'β'β''). We can deduce that
    a(ξ'β'β'') = 0  for β' ≠ β'',
    a(ξ'β'β') = a(ξ'β''β'').
The first of these results shows that the matrix representing g is diagonal and the second shows that a(ξ'β'β') is a function of ξ' only. We can now infer that g is that function of ξ which a(ξ'β'β') is of ξ',
so the theorem is proved. The proof is analogous if some of the β's
have continuous eigenvalues.
Theorems 1 and 3 are still valid if we replace the observable ξ by
any set of commuting observables ξ₁, ξ₂,..., ξᵤ, only formal changes
being needed in the proofs.
20. Developments in notation
The theory of representations that we have developed provides a
general system for labelling kets and bras. In a representation in which
the complete set of commuting observables ξ₁,..., ξᵤ are diagonal any
ket |P⟩ will have a representative ⟨ξ₁'...ξᵤ'|P⟩, or ⟨ξ'|P⟩ for brevity.
This representative is a definite function of the variables ξ', say ψ(ξ').
The function ψ then determines the ket |P⟩ completely, so it may be
used to label this ket, to replace the arbitrary label P. In symbols,
if ⟨ξ'|P⟩ = ψ(ξ')
we put |P⟩ = |ψ(ξ)⟩.   (60)
We must put |P⟩ equal to |ψ(ξ)⟩ and not |ψ(ξ')⟩, since it does not
depend on a particular set of eigenvalues for the ξ's, but only on the
form of the function ψ.
With f(ξ) any function of the observables ξ₁,..., ξᵤ, f(ξ)|P⟩ will
have as its representative
⟨ξ'|f(ξ)|P⟩ = f(ξ')ψ(ξ').
Thus according to (60) we put
f(ξ)|P⟩ = |f(ξ)ψ(ξ)⟩.
With the help of the second of equations (60) we now get
f(ξ)|ψ(ξ)⟩ = |f(ξ)ψ(ξ)⟩.   (61)
This is a general result holding for any functions f and ψ of the ξ's,
and it shows that the vertical line | is not necessary with the new
notation for a ket: either side of (61) may be written simply as
f(ξ)ψ(ξ)⟩. Thus the rule for the new notation becomes:
if ⟨ξ'|P⟩ = ψ(ξ')
we put |P⟩ = ψ(ξ)⟩.   (62)
We may further shorten ψ(ξ)⟩ to ψ⟩, leaving the variables ξ understood,
if no ambiguity arises thereby.
The ket ψ(ξ)⟩ may be considered as the product of the linear
operator ψ(ξ) with a ket which is denoted simply by ⟩ without a
label. We call the ket ⟩ the standard ket. Any ket whatever can be
expressed as a function of the ξ's multiplied into the standard ket.
For example, taking |P⟩ in (62) to be the basic ket |ξ''⟩, we find
|ξ''⟩ = δ_{ξ₁ξ₁''}...δ_{ξᵥξᵥ''} δ(ξᵥ₊₁−ξᵥ₊₁'')...δ(ξᵤ−ξᵤ'')⟩   (63)
in the case when ξ₁,..., ξᵥ have discrete eigenvalues and ξᵥ₊₁,..., ξᵤ have
continuous eigenvalues. The standard ket is characterized by the
condition that its representative ⟨ξ'|⟩ is unity over the whole domain
of the variable ξ', as may be seen by putting ψ = 1 in (62).
A further contraction may be made in the notation, namely to
leave the symbol ⟩ for the standard ket understood. A ket is then
written simply as ψ(ξ), a function of the observables ξ. A function
of the ξ's used in this way to denote a ket is called a wave function.†
The system of notation provided by wave functions is the one usually
used by most authors for calculations in quantum mechanics. In
using it one should remember that each wave function is understood
to have the standard ket multiplied into it on the right, which
prevents one from multiplying the wave function by any operator
on the right. Wave functions can be multiplied by operators only on
the left. This distinguishes them from ordinary functions of the ξ's,
which are operators and can be multiplied by operators on either the
left or the right. A wave function is just the representative of a ket
expressed as a function of the observables ξ, instead of eigenvalues ξ'
for those observables. The square of its modulus gives the probability
(or the relative probability, if it is not normalized) of the ξ's
having specified values, or lying in specified small ranges, for the
corresponding state.
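The probability rule just stated can be illustrated with a discretized wave function; the Gaussian packet and grid below are illustrative choices, not an example from the text:

```python
import numpy as np

# A discretized wave function psi(q) on a grid: an illustrative Gaussian
# packet, with |psi|^2 giving (relative) probabilities.
q = np.linspace(-10.0, 10.0, 2001)
dq = q[1] - q[0]
psi = np.exp(-q**2 / 2.0) * np.exp(1j * 1.5 * q)

# Normalize so that the total probability integrates to unity.
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dq)
prob = np.abs(psi)**2              # probability density |psi(q')|^2

assert np.isclose(np.sum(prob) * dq, 1.0)
# Probability of q lying in the range -1 <= q <= 1 for this state:
p_range = np.sum(prob[np.abs(q) <= 1.0]) * dq
assert abs(p_range - 0.8427) < 0.01   # = erf(1) for this Gaussian
```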
The new notation for bras may be developed in the same way as
for kets. A bra ⟨Q| whose representative ⟨Q|ξ'⟩ is φ̄(ξ') we write
⟨φ̄(ξ)|. With this notation the conjugate imaginary to |ψ(ξ)⟩ is
⟨ψ̄(ξ)|. Thus the rule that we have used hitherto, that a ket and
its conjugate imaginary bra are both specified by the same label,
must be extended to read: if the labels of a ket involve complex
numbers or complex functions, the labels of the conjugate imaginary
bra involve the conjugate complex numbers or functions. As in the
case of kets we can show that ⟨φ̄(ξ)|f(ξ) and ⟨φ̄(ξ)f(ξ)| are the same,
so that the vertical line can be omitted. We can consider ⟨φ̄(ξ) as
the product of the linear operator φ̄(ξ) into the standard bra ⟨, which
† The reason for this name is that in the early days of quantum mechanics all the
examples of these functions were of the form of waves. The name is not a descriptive
one from the point of view of the modern general theory.
is the conjugate imaginary of the standard ket ⟩. We may leave
the standard bra understood, so that a general bra is written as φ̄(ξ),
the conjugate complex of a wave function. The conjugate complex
of a wave function can be multiplied by any linear operator on the
right, but cannot be multiplied by a linear operator on the left. We
can construct triple products of the form ⟨f(ξ)⟩. Such a triple product
is a number, equal to f(ξ') summed or integrated over the whole
domain of eigenvalues for the ξ's,
⟨f(ξ)⟩ = Σ_{ξ₁'}...Σ_{ξᵥ'} ∫...∫ f(ξ') dξᵥ₊₁'...dξᵤ'   (64)
in the case when ξ₁,..., ξᵥ have discrete eigenvalues and ξᵥ₊₁,..., ξᵤ have
continuous eigenvalues.
The standard ket and bra are defined with respect to a representation.
If we carried through the above work with a different representation
in which the complete set of commuting observables η are
diagonal, or if we merely changed the phase factors in the representation
with the ξ's diagonal, we should get a different standard ket and
bra. In a piece of work in which more than one standard ket or bra
appears one must, of course, distinguish them by giving them labels.
A further development of the notation which is of great importance
for dealing with complicated dynamical systems will now be discussed.
Suppose we have a dynamical system describable in terms of dynamical
variables which can all be divided into two sets, set A and set B
say, such that any member of set A commutes with any member of
set B. A general dynamical variable must be expressible as a function
of the A-variables and B-variables together. We may consider
another dynamical system in which the dynamical variables are the
A-variables only; let us call it the A-system. Similarly we may
consider a third dynamical system in which the dynamical variables
are the B-variables only, the B-system. The original system can
then be looked upon as a combination of the A-system and the
B-system in accordance with the mathematical scheme given below.
Let us take any ket |a⟩ for the A-system and any ket |b⟩ for the
B-system. We assume that they have a product |a⟩|b⟩ for which
the commutative and distributive axioms of multiplication hold, i.e.
|a⟩|b⟩ = |b⟩|a⟩,
{c₁|a₁⟩+c₂|a₂⟩}|b⟩ = c₁|a₁⟩|b⟩+c₂|a₂⟩|b⟩,
|a⟩{c₁|b₁⟩+c₂|b₂⟩} = c₁|a⟩|b₁⟩+c₂|a⟩|b₂⟩,
the c's being numbers. We can give a meaning to any A-variable
operating on the product |a⟩|b⟩ by assuming that it operates only
on the |a⟩ factor and commutes with the |b⟩ factor, and similarly
we can give a meaning to any B-variable operating on this product
by assuming that it operates only on the |b⟩ factor and commutes
with the |a⟩ factor. (This makes every A-variable commute with
every B-variable.) Thus any dynamical variable of the original
system can operate on the product |a⟩|b⟩, so this product can be
looked upon as a ket for the original system, and may then be
written |ab⟩, the two labels a and b being sufficient to specify it.
In this way we get the fundamental equations
|a⟩|b⟩ = |b⟩|a⟩ = |ab⟩.   (65)
The multiplication here is of quite a different kind from any that
occurs earlier in the theory. The ket vectors |a⟩ and |b⟩ are in two
different vector spaces and their product is in a third vector space,
which may be called the product of the two previous vector spaces.
The number of dimensions of the product space is equal to the
product of the number of dimensions of each of the factor spaces.
A general ket vector of the product space is not of the form (65), but
is a sum or integral of kets of this form.
Let us take a representation for the A-system in which a complete
set of commuting observables ξ_A of the A-system are diagonal. We
shall then have the basic bras ⟨ξ_A'| for the A-system. Similarly, taking
a representation for the B-system with the observables ξ_B diagonal,
we shall have the basic bras ⟨ξ_B'| for the B-system. The products
⟨ξ_A'|⟨ξ_B'| = ⟨ξ_A'ξ_B'|   (66)
will then provide the basic bras for a representation for the original
system, in which representation the ξ_A's and the ξ_B's will be diagonal.
The ξ_A's and ξ_B's will together form a complete set of commuting
observables for the original system. From (65) and (66) we get
⟨ξ_A'|a⟩⟨ξ_B'|b⟩ = ⟨ξ_A'ξ_B'|ab⟩,   (67)
showing that the representative of |ab⟩ equals the product of the
representatives of |a⟩ and of |b⟩ in their respective representations.
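In modern finite-dimensional language, equation (67) says that the representative of a product ket is a Kronecker (outer) product. A sketch with illustrative two- and three-dimensional factor spaces:

```python
import numpy as np

# Representatives (wave functions) of |a> for the A-system and |b> for the
# B-system, on discrete grids of eigenvalues xi_A', xi_B' (illustrative data).
a = np.array([1.0, 2.0])           # <xi_A'|a>
b = np.array([0.5, -1.0, 3.0])     # <xi_B'|b>

# Representative of |ab> in the product space:
# <xi_A' xi_B'|ab> = <xi_A'|a><xi_B'|b>, i.e. the Kronecker product.
ab = np.kron(a, b)

assert ab.shape == (6,)            # dim of product space = 2 * 3
assert np.isclose(ab[0], a[0] * b[0])

# An A-variable acts only on the |a> factor: (M (x) 1)|ab> = (M|a>)|b>.
M = np.array([[0.0, 1.0], [1.0, 0.0]])
assert np.allclose(np.kron(M, np.eye(3)) @ ab, np.kron(M @ a, b))
```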
We can introduce the standard ket, ⟩_A say, for the A-system,
with respect to the representation with the ξ_A's diagonal, and also
the standard ket ⟩_B for the B-system, with respect to the representation
with the ξ_B's diagonal. Their product ⟩_A⟩_B is then the
standard ket for the original system, with respect to the representation
with the ξ_A's and ξ_B's diagonal. Any ket for the original system
may be expressed as
ψ(ξ_A, ξ_B)⟩_A⟩_B.   (68)
It may be that in a certain calculation we wish to use a particular
representation for the B-system, say the above representation with
the ξ_B's diagonal, but do not wish to introduce any particular
representation for the A-system. It would then be convenient to
use the standard ket ⟩_B for the B-system and no standard ket for
the A-system. Under these circumstances we could write any ket
for the original system as
|ξ_B⟩⟩_B,   (69)
in which |ξ_B⟩ is a ket for the A-system and is also a function of the
ξ_B's, i.e. it is a ket for the A-system for each set of values for the
ξ_B's; in fact (69) equals (68) if we take
|ξ_B⟩ = ψ(ξ_A, ξ_B)⟩_A.
We may leave the standard ket ⟩_B in (69) understood, and then we
have the general ket for the original system appearing as |ξ_B⟩, a ket
for the A-system and a wave function in the variables ξ_B of the
B-system. Examples of this notation will be used in §§ 66 and 79.
The above work can be immediately extended to a dynamical
system describable in terms of dynamical variables which can be
divided into three or more sets A, B, C,... such that any member of
one set commutes with any member of another. Equation (65) gets
generalized to
|a⟩|b⟩|c⟩... = |abc...⟩,
the factors on the left being kets for the component systems and
the ket on the right being a ket for the original system. Equations
(66), (67), and (68) get generalized to many factors in a similar way.
IV
THE QUANTUM CONDITIONS
21. Poisson brackets
OUR work so far has consisted in setting up a general mathematical
scheme connecting states and observables in quantum mechanics.
One of the dominant features of this scheme is that observables, and
dynamical variables in general, appear in it as quantities which do
not obey the commutative law of multiplication. It now becomes
necessary for us to obtain equations to replace the commutative law
of multiplication, equations that will tell us the value of ξη − ηξ when
ξ and η are any two observables or dynamical variables. Only when
such equations are known shall we have a complete scheme of
mechanics with which to replace classical mechanics. These new
equations are called quantum conditions or commutation relations.
The problem of finding quantum conditions is not of such a general
character as those we have been concerned with up to the present. It
is instead a special problem which presents itself with each particular
dynamical system one is called upon to study. There is, however,
a fairly general method of obtaining quantum conditions, applicable
to a very large class of dynamical systems. This is the method of
classical analogy and will form the main theme of the present chapter.
Those dynamical systems to which this method is not applicable
must be treated individually and special considerations used in each
case.
The value of classical analogy in the development of quantum
mechanics depends on the fact that classical mechanics provides a
valid description of dynamical systems under certain conditions,
when the particles and bodies composing the systems are sufficiently
massive for the disturbance accompanying an observation to be
negligible. Classical mechanics must therefore be a limiting case of
quantum mechanics. We should thus expect to find that important
concepts in classical mechanics correspond to important concepts in
quantum mechanics, and, from an understanding of the general
nature of the analogy between classical and quantum mechanics, we
may hope to get laws and theorems in quantum mechanics appearing
as simple generalizations of well-known results in classical mechanics;
in particular we may hope to get the quantum conditions appearing
as a simple generalization of the classical law that all dynamical
variables commute.
Let us take a dynamical system composed of a number of particles
in interaction. As independent dynamical variables for dealing with
the system we may use the Cartesian coordinates of all the particles
and the corresponding Cartesian components of velocity of the particles.
It is, however, more convenient to work with the momentum
components instead of the velocity components. Let us call the
coordinates qᵣ, r going from 1 to three times the number of particles,
and the corresponding momentum components pᵣ. The q's and p's
are called canonical coordinates and momenta.
The method of Lagrange's equations of motion involves introducing
coordinates qᵣ and momenta pᵣ in a more general way, applicable
also for a system not composed of particles (e.g. a system containing
rigid bodies). These more general q's and p's are also called canonical
coordinates and momenta. Any dynamical variable is expressible in
terms of a set of canonical coordinates and momenta.
An important concept in general dynamical theory is the Poisson
bracket. Any two dynamical variables u and v have a P.B. (Poisson
bracket) which we shall denote by [u, v], defined by
[u, v] = Σᵣ {∂u/∂qᵣ ∂v/∂pᵣ − ∂u/∂pᵣ ∂v/∂qᵣ},   (1)
u and v being regarded as functions of a set of canonical coordinates
and momenta qᵣ and pᵣ for the purpose of the differentiations. The
right-hand side of (1) is independent of which set of canonical
coordinates and momenta are used, this being a consequence of the
general definition of canonical coordinates and momenta, so the
P.B. [u, v] is well defined.
The main properties of P.B.'s, which follow at once from their
definition (1), are
[u, v] = −[v, u],   (2)
[u, c] = 0,   (3)
where c is a number (which may be considered as a special case of a
dynamical variable),
[u₁+u₂, v] = [u₁, v]+[u₂, v],
[u, v₁+v₂] = [u, v₁]+[u, v₂],   (4)
[u₁u₂, v] = [u₁, v]u₂+u₁[u₂, v],
[u, v₁v₂] = [u, v₁]v₂+v₁[u, v₂].   (5)
Also the identity
[u, [v, w]]+[v, [w, u]]+[w, [u, v]] = 0   (6)
is easily verified. Equations (4) express that the P.B. [u, v] involves
u and v linearly, while equations (5) correspond to the ordinary rules
for differentiating a product.
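The definition (1) and instances of the properties (2)-(6) and of the values (8) below can be verified symbolically; the sketch (using sympy, with two degrees of freedom and arbitrarily chosen u, v, w) is illustrative:

```python
import sympy as sp

q1, q2, p1, p2 = sp.symbols('q1 q2 p1 p2')
qs, ps = [q1, q2], [p1, p2]

def pb(u, v):
    """Classical Poisson bracket, definition (1), for two degrees of freedom."""
    return sum(sp.diff(u, q) * sp.diff(v, p) - sp.diff(u, p) * sp.diff(v, q)
               for q, p in zip(qs, ps))

u, v, w = q1**2 * p2, p1 * q2, q1 + p1 * p2   # arbitrary sample variables

assert sp.expand(pb(u, v) + pb(v, u)) == 0                      # property (2)
assert pb(u, sp.Integer(3)) == 0                                # property (3)
assert sp.expand(pb(u*v, w) - pb(u, w)*v - u*pb(v, w)) == 0     # first of (5)
assert sp.expand(pb(u, pb(v, w)) + pb(v, pb(w, u))
                 + pb(w, pb(u, v))) == 0                        # identity (6)
assert pb(q1, p1) == 1 and pb(q1, p2) == 0 and pb(p1, p2) == 0  # values (8)
```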
Let us try to introduce a quantum P.B. which shall be the analogue
of the classical one. We assume the quantum P.B. to satisfy all the
conditions (2) to (6), it being now necessary that the order of the
factors u₁ and u₂ in the first of equations (5) should be preserved
throughout the equation, as in the way we have here written it, and
similarly for the v₁ and v₂ in the second of equations (5). These conditions
are already sufficient to determine the form of the quantum
P.B. uniquely, as may be seen from the following argument. We can
evaluate the P.B. [u₁u₂, v₁v₂] in two different ways, since we can use
either of the two formulas (5) first, thus,
[u₁u₂, v₁v₂] = [u₁, v₁v₂]u₂+u₁[u₂, v₁v₂]
 = {[u₁, v₁]v₂+v₁[u₁, v₂]}u₂+u₁{[u₂, v₁]v₂+v₁[u₂, v₂]}
 = [u₁, v₁]v₂u₂+v₁[u₁, v₂]u₂+u₁[u₂, v₁]v₂+u₁v₁[u₂, v₂]
and
[u₁u₂, v₁v₂] = [u₁u₂, v₁]v₂+v₁[u₁u₂, v₂]
 = [u₁, v₁]u₂v₂+u₁[u₂, v₁]v₂+v₁[u₁, v₂]u₂+v₁u₁[u₂, v₂].
Equating these two results, we obtain
[u₁, v₁](u₂v₂−v₂u₂) = (u₁v₁−v₁u₁)[u₂, v₂].
Since this condition holds with u₁ and v₁ quite independent of u₂ and
v₂, we must have
u₁v₁−v₁u₁ = iℏ[u₁, v₁],
u₂v₂−v₂u₂ = iℏ[u₂, v₂],
where ℏ must not depend on u₁ and v₁, nor on u₂ and v₂, and also
must commute with (u₁v₁−v₁u₁). It follows that ℏ must be simply
a number. We want the P.B. of two real variables to be real, as in
the classical theory, which requires, from the work at the top of p. 28,
that ℏ shall be a real number when introduced, as here, with the
coefficient i. We are thus led to the following definition for the
quantum P.B. [u, v] of any two variables u and v,
uv−vu = iℏ[u, v],   (7)
in which ℏ is a new universal constant. It has the dimensions of
action. In order that the theory may agree with experiment, we
must take ℏ equal to h/2π, where h is the universal constant that
was introduced by Planck, known as Planck's constant. It is easily
verified that the quantum P.B. satisfies all the conditions (2), (3), (4),
(5), and (6).
The problem of finding quantum conditions now reduces to the
problem of determining P.B.'s in quantum mechanics. The strong
analogy between the quantum P.B. defined by (7) and the classical
P.B. defined by (1) leads us to make the assumption that the quantum
P.B.'s, or at any rate the simpler ones of them, have the same values
as the corresponding classical P.B.'s. The simplest P.B.'s are those
involving the canonical coordinates and momenta themselves and
have the following values in the classical theory:
[qᵣ, qₛ] = 0,  [pᵣ, pₛ] = 0,  [qᵣ, pₛ] = δᵣₛ.   (8)
We therefore assume that the corresponding quantum P.B.'s also
have the values given by (8). By eliminating the quantum P.B.'s
with the help of (7), we obtain the equations
qᵣqₛ−qₛqᵣ = 0,  pᵣpₛ−pₛpᵣ = 0,
qᵣpₛ−pₛqᵣ = iℏδᵣₛ,   (9)
which are the fundamental quantum conditions. They show us where
the lack of commutability among the canonical coordinates and
momenta lies. They also provide us with a basis for calculating commutation
relations between other dynamical variables. For instance,
if ξ and η are any two functions of the q's and p's expressible as
power series, we may express ξη−ηξ or [ξ, η], by repeated applications
of the laws (2), (3), (4), and (5), in terms of the elementary
P.B.'s given in (8) and so evaluate it. The result is often, in simple
cases, the same as the classical result, or departs from the classical
result only through requiring a special order for factors in a product,
this order being, of course, unimportant in the classical theory. Even
when ξ and η are more general functions of the q's and p's not expressible
as power series, equations (9) are still sufficient to fix the
value of ξη−ηξ, as will become clear from the following work.
Equations (9) thus give the solution of the problem of finding the
quantum conditions, for all those dynamical systems which have a
classical analogue and which are describable in terms of canonical
coordinates and momenta. This does not include all possible systems
in quantum mechanics.
Equations (7) and (9) provide the foundation for the analogy
between quantum mechanics and classical mechanics. They show
that classical mechanics may be regarded as the limiting case of quantum
mechanics when ℏ tends to zero. A P.B. in quantum mechanics is a
purely algebraic notion and is thus a rather more fundamental concept
than a classical P.B., which can be defined only with reference to
a set of canonical coordinates and momenta. For this reason canonical
coordinates and momenta are of less importance in quantum mechanics
than in classical mechanics; in fact, we may have a system in quantum
mechanics for which canonical coordinates and momenta do
not exist and we can still give a meaning to P.B.'s. Such a system
would be one without a classical analogue and we should not be able
to obtain its quantum conditions by the method here described.
From equations (9) we see that two variables with different suffixes
r and s always commute. It follows that any function of qᵣ and pᵣ
will commute with any function of qₛ and pₛ when s differs from r.
Different values of r correspond to different degrees of freedom of the
dynamical system, so we get the result that dynamical variables
referring to different degrees of freedom commute. This law, as we have
derived it from (9), is proved only for dynamical systems with
classical analogues, but we assume it to hold generally. In this way
we can make a start on the problem of finding quantum conditions
for dynamical systems for which canonical coordinates and momenta
do not exist, provided we can give a meaning to different degrees of
freedom, as we may be able to do with the help of physical insight.
We can now see the physical meaning of the division, which was
discussed in the preceding section, of the dynamical variables into
sets, any member of one set commuting with any member of another.
Each set corresponds to certain degrees of freedom, or possibly just
one degree of freedom. The division may correspond to the physical
process of resolving the dynamical system into its constituent parts,
each constituent being capable of existing by itself as a physical
system, and the various constituents having to be brought into
interaction with one another to produce the original system. Alternatively
the division may be merely a mathematical procedure of
resolving the dynamical system into degrees of freedom which cannot
be separated physically, e.g. the system consisting of a particle with
internal structure may be divided into the degrees of freedom describing
the motion of the centre of the particle and those describing the
internal structure.
22. Schrödinger's representation
Let us consider a dynamical system with n degrees of freedom
having a classical analogue, and thus describable in terms of canonical
coordinates and momenta qᵣ, pᵣ (r = 1, 2,..., n). We assume that the
coordinates qᵣ are all observables and have continuous ranges of eigenvalues,
these assumptions being reasonable from the physical significance
of the q's. Let us set up a representation with the q's diagonal.
The question arises whether the q's form a complete commuting set
for this dynamical system. It seems pretty obvious from inspection
that they do. We shall here assume that they do, and the assumption
will be justified later (see top of p. 92). With the q's forming a
complete commuting set, the representation is fixed except for the
arbitrary phase factors in it.
Let us consider first the case of n = 1, so that there is only one q
and one p, satisfying
qp−pq = iℏ.   (10)
Any ket may be written in the standard ket notation ψ(q)⟩. From it
we can form another ket dψ/dq⟩, whose representative is the derivative
of the original one. This new ket is a linear function of the
original one and is thus the result of some linear operator applied to
the original one. Calling this linear operator d/dq, we have
(d/dq) ψ⟩ = dψ/dq⟩.   (11)
Equation (11) holding for all functions ψ defines the linear operator
d/dq. We have
(d/dq) ⟩ = 0.   (12)
Let us treat the linear operator d/dq according to the general theory
of linear operators of §7. We should then be able to apply it to a bra
⟨φ̄(q), the product ⟨φ̄ d/dq being defined, according to (3) of §7, by
{⟨φ̄ (d/dq)} ψ⟩ = ⟨φ̄ {(d/dq) ψ⟩}   (13)
for all functions ψ(q). Taking representatives, we get
∫ ⟨φ̄ (d/dq)|q'⟩ ψ(q') dq' = ∫ φ̄(q') dψ(q')/dq' dq'.   (14)
We can transform the right-hand side by partial integration and get
∫ ⟨φ̄ (d/dq)|q'⟩ ψ(q') dq' = −∫ dφ̄(q')/dq' ψ(q') dq',   (15)
provided the contributions from the limits of integration vanish.
This gives
⟨φ̄ (d/dq)|q'⟩ = −dφ̄(q')/dq',
showing that
⟨φ̄ d/dq = −⟨ dφ̄/dq.   (16)
Thus d/dq operating to the left on the conjugate complex of a wave
function has the meaning of minus differentiation with respect to q.
The validity of this result depends on our being able to make the
passage from (14) to (15), which requires that we must restrict ourselves
to bras and kets corresponding to wave functions that satisfy
suitable boundary conditions. The conditions usually holding in
practice are that they vanish at the boundaries. (Somewhat more
general conditions will be given in the next section.) These conditions
do not limit the physical applicability of the theory, but, on the contrary,
are usually required also on physical grounds. For example,
if q is a Cartesian coordinate of a particle, its eigenvalues run from
−∞ to ∞, and the physical requirement that the particle has zero
probability of being at infinity leads to the condition that the wave
function vanishes for q = ±∞.
The conjugate complex of the linear operator d/dq can be evaluated
by noting that the conjugate imaginary of (d/dq) ψ⟩ or dψ/dq⟩ is
⟨dψ̄/dq, or −⟨ψ̄ d/dq from (16). Thus the conjugate complex of d/dq
is −d/dq, so d/dq is a pure imaginary linear operator.
To get the representative of d/dq we note that, from an application
of formula (63) of §20,
|q''⟩ = δ(q−q'')⟩,   (17)
and hence
(d/dq)|q''⟩ = dδ(q−q'')/dq⟩,   (18)
⟨q'|d/dq|q''⟩ = dδ(q'−q'')/dq'.   (19)
The representative of d/dq involves the derivative of the δ function.
Let us work out the commutation relation connecting d/dq with q.
We have
(d/dq) qψ⟩ = ψ⟩ + q dψ/dq⟩.   (20)
Since this holds for any ket ψ⟩, we have
(d/dq) q − q (d/dq) = 1.   (21)
Comparing this result with (10), we see that −iℏ d/dq satisfies the
same commutation relation with q that p does.
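Relations (20), (21), and their consequence for −iℏ d/dq can be checked symbolically; the sketch below is illustrative, with ψ an arbitrary function:

```python
import sympy as sp

q, hbar = sp.symbols('q hbar', real=True)
psi = sp.Function('psi')(q)

# The linear operator d/dq of (11), acting on an arbitrary wave function.
D = lambda f: sp.diff(f, q)

# (20)-(21): (d/dq) q psi = psi + q d(psi)/dq, i.e. (d/dq)q - q(d/dq) = 1.
assert sp.simplify(D(q * psi) - q * D(psi) - psi) == 0

# Hence p = -i*hbar*d/dq reproduces the quantum condition (10): qp - pq = i*hbar.
P = lambda f: -sp.I * hbar * sp.diff(f, q)
assert sp.simplify(q * P(psi) - P(q * psi) - sp.I * hbar * psi) == 0
```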
To extend the foregoing work to the case of arbitrary n, we write
the general ket as ψ(q₁...qₙ)⟩ = ψ⟩ and introduce the n linear operators
∂/∂qᵣ (r = 1,..., n), which can operate on it in accordance with
the formula
(∂/∂qᵣ) ψ⟩ = ∂ψ/∂qᵣ⟩,   (22)
corresponding to (11). We have
(∂/∂qᵣ) ⟩ = 0,   (23)
corresponding to (12). Provided we restrict ourselves to bras and
kets corresponding to wave functions satisfying suitable boundary
conditions, these linear operators can operate also on bras, in accordance
with the formula
⟨φ̄ ∂/∂qᵣ = −⟨ ∂φ̄/∂qᵣ,   (24)
corresponding to (16). Thus ∂/∂qᵣ can operate to the left on the
conjugate complex of a wave function, when it has the meaning of
minus partial differentiation with respect to qᵣ. We find as before
that each ∂/∂qᵣ is a pure imaginary linear operator. Corresponding
to (21) we have the commutation relations
(∂/∂qᵣ) qₛ − qₛ (∂/∂qᵣ) = δᵣₛ.   (25)
We have further
(∂/∂qᵣ)(∂/∂qₛ) ψ⟩ = ∂²ψ/∂qᵣ∂qₛ⟩ = (∂/∂qₛ)(∂/∂qᵣ) ψ⟩,   (26)
showing that
(∂/∂qᵣ)(∂/∂qₛ) = (∂/∂qₛ)(∂/∂qᵣ).   (27)
Comparing (25) and (27) with (9), we see that the linear operators
−iℏ ∂/∂qᵣ satisfy the same commutation relations with the q's and with
each other that the p's do.
It would be possible to take
pᵣ = −iℏ ∂/∂qᵣ   (28)
without getting any inconsistency. This possibility enables us to see
that the q's must form a complete commuting set of observables,
since it means that any function of the q's and p's could be taken
to be a function of the q's and −iℏ ∂/∂q's and then could not commute
with all the q's unless it is a function of the q's only.
The equations (28) do not necessarily hold. But in any case the
quantities pᵣ + iℏ ∂/∂qᵣ each commute with all the q's, so each of them
is a function of the q's, from Theorem 2 of §19. Thus
pᵣ = −iℏ ∂/∂qᵣ + fᵣ(q).   (29)
Since pᵣ and −iℏ ∂/∂qᵣ are both real, fᵣ(q) must be real. For any
function f of the q's we have
(∂/∂qᵣ) fψ⟩ = (∂f/∂qᵣ) ψ⟩ + f ∂ψ/∂qᵣ⟩,
showing that
(∂/∂qᵣ) f − f (∂/∂qᵣ) = ∂f/∂qᵣ.   (30)
With the help of (29) we can now deduce the general formula
pᵣf − fpᵣ = −iℏ ∂f/∂qᵣ.   (31)
This formula may be written in P.B. notation
[f, pᵣ] = ∂f/∂qᵣ,   (32)
when it is the same as in the classical theory, as follows from (1).
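Formula (31) may be checked symbolically in the special representation (28), for a sample f; the cubic below is an arbitrary illustrative choice:

```python
import sympy as sp

q, hbar = sp.symbols('q hbar', real=True)
psi = sp.Function('psi')(q)
f = q**3 + 2*q          # an illustrative function f(q); any function works

# In the special representation (28), p acts as -i*hbar d/dq.
P = lambda g: -sp.I * hbar * sp.diff(g, q)

# (31): (p f - f p) psi = -i*hbar (df/dq) psi, on an arbitrary wave function.
lhs = P(f * psi) - f * P(psi)
rhs = -sp.I * hbar * sp.diff(f, q) * psi
assert sp.simplify(lhs - rhs) == 0
```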
Multiplying (27) by (−iℏ)² and substituting for −iℏ ∂/∂qᵣ and −iℏ ∂/∂qₛ
their values given by (29), we get
(pᵣ−fᵣ)(pₛ−fₛ) = (pₛ−fₛ)(pᵣ−fᵣ),
which reduces, with the help of the quantum condition pᵣpₛ = pₛpᵣ, to
pᵣfₛ+fᵣpₛ = pₛfᵣ+fₛpᵣ.
This reduces further, with the help of (31), to
∂fₛ/∂qᵣ = ∂fᵣ/∂qₛ,   (33)
showing that the functions fᵣ are all of the form
fᵣ = ∂F/∂qᵣ   (34)
with F independent of r. Equation (29) now becomes
pᵣ = −iℏ ∂/∂qᵣ + ∂F/∂qᵣ.   (35)
We have been working with a representation which is fixed to the
extent that the q's must be diagonal in it, but which contains arbitrary
phase factors. If the phase factors are changed, the operators ∂/∂qᵣ
get changed. It will now be shown that, by a suitable change in the
phase factors, the function F in (35) can be made to vanish, so that
equations (28) are made to hold.
Using stars to distinguish quantities referring to the new representation
with the new phase factors, we shall have the new basic
bras connected with the previous ones by
⟨q₁'...qₙ'*| = e^{iγ'} ⟨q₁'...qₙ'|,   (36)
where γ' = γ(q') is a real function of the q''s. The new representative
of a ket is e^{iγ'} times the old one, showing that e^{iγ} ⟩* = ⟩, so
⟩* = e^{−iγ} ⟩   (37)
as the connexion between the new standard ket and the original one.
The new linear operator (∂/∂qᵣ)* satisfies, corresponding to (22),
(∂/∂qᵣ)* ψ⟩* = ∂ψ/∂qᵣ⟩* = e^{−iγ} ∂ψ/∂qᵣ⟩
with the help of (37). Using (22), this gives
(∂/∂qᵣ)* ψ⟩* = e^{−iγ} (∂/∂qᵣ) ψ⟩ = e^{−iγ} (∂/∂qᵣ) e^{iγ} ψ⟩*,
showing that
(∂/∂qᵣ)* = e^{−iγ} (∂/∂qᵣ) e^{iγ},   (38)
or, with the help of (30),
(∂/∂qᵣ)* = ∂/∂qᵣ + i ∂γ/∂qᵣ.   (39)
By choosing γ so that
F = ℏγ + a constant,   (40)
(35) becomes
pᵣ = −iℏ (∂/∂qᵣ)*.   (41)
Equation (40) fixes γ except for an arbitrary constant, so the representation
is fixed except for an arbitrary constant phase factor.
In this way we see that a representation can be set up in which
the q's are diagonal and equations (28) hold. This representation is
a very useful one for many problems. It will be called Schrödinger's
representation, as it was the representation in terms of which Schrödinger
gave his original formulation of quantum mechanics in 1926.
Schrödinger's representation exists whenever one has canonical q's
and p's, and is completely determined by these q's and p's except for
an arbitrary constant phase factor. It owes its great convenience to
its allowing one to express immediately any algebraic function of the
q's and p's of the form of a power series in the p's as an operator of
differentiation, e.g. if f(q₁,..., qₙ, p₁,..., pₙ) is such a function, we have
f(q₁,..., qₙ, p₁,..., pₙ) = f(q₁,..., qₙ, −iℏ ∂/∂q₁,..., −iℏ ∂/∂qₙ),   (42)
provided we preserve the order of the factors in a product on substituting
the −iℏ ∂/∂q's for the p's.
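Rule (42) may be illustrated for the common case f = p²/2m + V(q) (an illustrative choice, not an example from the text): substituting −iℏ d/dq for p turns f into the familiar differential operator.

```python
import sympy as sp

q, hbar, m = sp.symbols('q hbar m', positive=True)
psi = sp.Function('psi')(q)
V = sp.Function('V')(q)      # an illustrative potential; f = p**2/(2m) + V(q)

# Rule (42): substitute -i*hbar*d/dq for p, keeping factors in order.
def f_op(g):
    p2 = (-sp.I * hbar)**2 * sp.diff(g, q, 2)   # p**2 acting on g
    return p2 / (2 * m) + V * g

expected = -hbar**2 / (2 * m) * sp.diff(psi, q, 2) + V * psi
assert sp.simplify(f_op(psi) - expected) == 0
```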
From (23) and (28), we have
pᵣ ⟩ = 0.   (43)
Thus the standard ket in Schrödinger's representation is characterized
by the condition that it is a simultaneous eigenket of all the momenta
belonging to the eigenvalues zero. Some properties of the basic
vectors of Schrödinger's representation may also be noted. Equation
(22) gives
⟨q₁'...qₙ'| (∂/∂qᵣ) ψ⟩ = ⟨q₁'...qₙ'| ∂ψ/∂qᵣ⟩ = ∂ψ(q')/∂qᵣ' = (∂/∂qᵣ') ⟨q₁'...qₙ'|ψ⟩.
Hence
⟨q₁'...qₙ'| ∂/∂qᵣ = (∂/∂qᵣ') ⟨q₁'...qₙ'|,   (44)
⟨q₁'...qₙ'| pᵣ = −iℏ (∂/∂qᵣ') ⟨q₁'...qₙ'|.   (45)
Similarly, equation (24) leads to
pᵣ |q₁'...qₙ'⟩ = iℏ (∂/∂qᵣ') |q₁'...qₙ'⟩.   (46)
23. The momentum representation
Let us take a system with one degree of freedom, describable in
terms of a q and p with the eigenvalues of q running from −∞ to ∞,
and let us take an eigenket |p'⟩ of p. Its representative in the Schrödinger
representation, ⟨q'|p'⟩, satisfies
p'⟨q'|p'⟩ = ⟨q'|p|p'⟩ = −iℏ d⟨q'|p'⟩/dq',
with the help of (45) applied to the case of one degree of freedom.
The solution of this differential equation for ⟨q'|p'⟩ is
⟨q'|p'⟩ = c' e^{ip'q'/ℏ},   (47)
where c' = c(p') is independent of q', but may involve p'.
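That (47) solves the differential equation above is quickly verified symbolically (q1 and p1 below stand for q' and p'):

```python
import sympy as sp

q1, p1, hbar = sp.symbols('q1 p1 hbar', real=True)
c = sp.Symbol('c')

# The representative (47): <q'|p'> = c' e^{i p' q'/hbar}.
rep = c * sp.exp(sp.I * p1 * q1 / hbar)

# It satisfies p' <q'|p'> = -i*hbar d<q'|p'>/dq'.
assert sp.simplify(-sp.I * hbar * sp.diff(rep, q1) - p1 * rep) == 0
```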
The representative (q' Ip') does not satisfy the boundary conditions
of vanishing at q' = -&o. This gives rise to some d.ifEculty, which
shows itself up most directly in the failure of the orthogonality
theorem. If we take a second eigenket |p''⟩ of p with representative

⟨q'|p''⟩ = c'' e^{iq'p''/ℏ},

belonging to a different eigenvalue p'', we shall have

⟨p'|p''⟩ = c̄'c'' ∫_{−∞}^{∞} e^{iq'(p''−p')/ℏ} dq'.   (48)

This integral does not converge according to the usual definition of
convergence. To bring the theory into order, we adopt a new definition
of convergence of an integral whose domain extends to infinity,
analogous to the Cesàro definition of the sum of an infinite series.
With this new definition, an integral whose value to the upper limit
q' is of the form cos aq' or sin aq', with a a real number not zero, is
counted as zero when q' tends to infinity, i.e. we take the mean value
of the oscillations, and similarly for the lower limit of q' tending to
minus infinity. This makes the right-hand side of (48) vanish for
p'' ≠ p', so that the orthogonality theorem is restored. Also it makes
the right-hand sides of (13) and (14) equal when the bra and the ket
concerned are eigenvectors of p, so that eigenvectors of p become
permissible vectors to use with the operator d/dq. Thus the boundary
conditions that the representative of a permissible bra or ket has to
satisfy become extended to allow the representative to oscillate like
cos aq' or sin aq' as q' goes to infinity or minus infinity.
For p'' very close to p', the right-hand side of (48) involves a δ
function. To evaluate it, we need the formula

∫_{−∞}^{∞} e^{iax} dx = 2πδ(a)   (49)

for real a, which may be proved as follows. The formula evidently
holds for a different from zero, as both sides are then zero. Further
we have, for any continuous function f(a),

∫_{−∞}^{∞} f(a) da ∫_{−g}^{g} e^{iax} dx = ∫_{−∞}^{∞} f(a) da 2a^{−1} sin ag = 2πf(0)

in the limit when g tends to infinity. A more complicated argument
shows that we get the same result if instead of the limits g and −g
we put g_1 and −g_2, and then let g_1 and g_2 tend to infinity in different
ways (not too widely different). This shows the equivalence of both
sides of (49) as factors in an integrand, which proves the formula.
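The limiting step in this proof can be checked numerically. A minimal sketch (illustrative choices, not from the book: f(a) = e^{−a²} so that f(0) = 1, cut-off g = 50, and a finite a-grid):

```python
import numpy as np

# Evaluate the inner integral 2*sin(a*g)/a against f(a) = exp(-a^2);
# as g grows the result should approach 2*pi*f(0) = 2*pi.
g = 50.0
a = np.linspace(-10.0, 10.0, 200001)
da = a[1] - a[0]
f = np.exp(-a**2)
kernel = 2 * g * np.sinc(a * g / np.pi)   # equals 2*sin(a*g)/a, finite at a = 0
integral = np.sum(f * kernel) * da        # simple Riemann sum
print(integral, 2 * np.pi)
```

With this smooth f the agreement with 2πf(0) is already very close at g = 50.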
With the help of (49), (48) becomes

⟨p'|p''⟩ = c̄'c'' 2πδ[(p'−p'')/ℏ] = c̄'c'' h δ(p'−p'')
         = |c'|^2 h δ(p'−p'').   (50)

We have obtained an eigenket of p belonging to any real eigenvalue
p', its representative being given by (47). Any ket |X⟩ can be expanded
in terms of these eigenkets of p, since its representative
⟨q'|X⟩ can be expanded in terms of the representatives (47) by
Fourier analysis. It follows that the momentum p is an observable,
in agreement with the experimental result that momenta can be
observed.
A symmetry now appears between q and p. Each of them is an
observable with eigenvalues extending from −∞ to ∞, and the
commutation relation connecting q and p, equation (10), remains
invariant if we interchange q and p and write −i for i. We have set
up a representation in which q is diagonal and p = −iℏ d/dq. It
follows from the symmetry that we can also set up a representation
in which p is diagonal and

q = iℏ d/dp,   (51)

the operator d/dp being defined by a procedure similar to that used
for d/dq. This representation will be called the momentum representation.
It is less useful than the previous Schrödinger representation
because, while the Schrödinger representation enables one to express
as an operator of differentiation any function of q and p that is a
power series in p, the momentum representation enables one so to
express any function of q and p that is a power series in q, and the
important quantities in dynamics are almost always power series in
p but are often not power series in q. All the same the momentum
representation is of value for certain problems (see § 50).
Let us calculate the transformation function ⟨q'|p'⟩ connecting the
two representations. The basic kets |p'⟩ of the momentum representation
are eigenkets of p and their Schrödinger representatives ⟨q'|p'⟩
are given by (47) with the coefficients c' suitably chosen. The phase
factors of these basic kets must be chosen so as to make (51) hold.
The easiest way to bring in this condition is to use the symmetry
between q and p referred to above, according to which ⟨q'|p'⟩ must
go over into ⟨p'|q'⟩ if we interchange q' and p' and write −i for i.
Now ⟨q'|p'⟩ is equal to the right-hand side of (47) and ⟨p'|q'⟩ to the
conjugate complex expression, and hence c' must be independent of
p'. Thus c' is just a number c. Further, we must have

⟨p'|p''⟩ = δ(p'−p''),

which shows, on comparison with (50), that |c| = h^{−1/2}. We can choose
the arbitrary constant phase factor in either representation so as to
make c = h^{−1/2}, and we then get

⟨q'|p'⟩ = h^{−1/2} e^{iq'p'/ℏ}   (52)

for the transformation function.
The foregoing work may easily be generalized to a system with
n degrees of freedom, describable in terms of n q's and p's, with the
eigenvalues of each q running from −∞ to ∞. Each p will then be
an observable with eigenvalues running from −∞ to ∞, and there
will be symmetry between the set of q's and the set of p's, the
commutation relations remaining invariant if we interchange each q_r
with the corresponding p_r and write −i for i. A momentum representation
can be set up in which the p's are diagonal and each

q_r = iℏ ∂/∂p_r.   (53)

The transformation function connecting it with the Schrödinger
representation will be given by the product of the transformation
functions for each degree of freedom separately, as is shown by
formula (67) of § 20, and will thus be

⟨q_1'q_2'...q_n'|p_1'p_2'...p_n'⟩ = ⟨q_1'|p_1'⟩⟨q_2'|p_2'⟩...⟨q_n'|p_n'⟩
 = h^{−n/2} e^{i(p_1'q_1'+p_2'q_2'+...+p_n'q_n')/ℏ}.   (54)
24. Heisenberg's principle of uncertainty
For a system with one degree of freedom, the Schrödinger and the
momentum representatives of a ket |X⟩ are connected by

⟨p'|X⟩ = h^{−1/2} ∫_{−∞}^{∞} e^{−iq'p'/ℏ} dq' ⟨q'|X⟩,
⟨q'|X⟩ = h^{−1/2} ∫_{−∞}^{∞} e^{iq'p'/ℏ} dp' ⟨p'|X⟩.   (55)

These formulas have an elementary significance. They show that
either of the representatives is given, apart from numerical coefficients,
by the amplitudes of the Fourier components of the other.
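A numerical sketch of this Fourier relation (not from the book; units with ℏ = 1, so that h^{−1/2} becomes (2π)^{−1/2}, and the packet parameters are illustrative choices): the momentum representative of a wave packet e^{−q²/2} e^{i k0 q}, computed from the first of equations (55), peaks at p' = k0.

```python
import numpy as np

# First of equations (55) with hbar = 1:
# <p'|X> = (2*pi)^(-1/2) * integral of exp(-i*q*p') <q|X> dq
N = 2048
L = 40.0
dq = L / N
q = (np.arange(N) - N // 2) * dq
k0 = 5.0
psi_q = np.exp(-q**2 / 2) * np.exp(1j * k0 * q)   # wave packet, carrier k0

p = np.linspace(0.0, 10.0, 501)
psi_p = np.array([np.sum(np.exp(-1j * q * pp) * psi_q) * dq for pp in p])
psi_p /= np.sqrt(2 * np.pi)

peak = p[np.argmax(np.abs(psi_p))]
print(peak)          # near the carrier momentum k0 = 5.0
```

The momentum representative comes out as a Gaussian centred on the carrier momentum, exactly the Fourier-amplitude picture described above.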
It is interesting to apply (55) to a ket whose Schrödinger representative
consists of what is called a wave packet. This is a function
whose value is very small everywhere outside a certain domain, of
width Δq' say, and inside this domain is approximately periodic with
a definite frequency.† If a Fourier analysis is made of such a wave
packet, the amplitude of all the Fourier components will be small,
except those in the neighbourhood of the definite frequency. The
components whose amplitudes are not small will fill up a frequency
band whose width is of the order 1/Δq', since two components whose
frequencies differ by this amount, if in phase in the middle of the
domain Δq', will be just out of phase and interfering at the ends of
this domain. Now in the first of equations (55) the variable
(2π)^{−1}p'/ℏ = p'/h plays the part of frequency. Thus with ⟨q'|X⟩ of the
form of a wave packet, the function ⟨p'|X⟩, being composed of the
amplitudes of the Fourier components of the wave packet, will be
small everywhere in the p'-space outside a certain domain of width

Δp' = h/Δq'.
Let us now apply the physical interpretation of the square of the
modulus of the representative of a ket as a probability. We find that
our wave packet represents a state for which a measurement of q is
almost certain to lead to a result lying in a domain of width Δq' and
a measurement of p is almost certain to lead to a result lying in a
domain of width Δp'. We may say that for this state q has a definite
value with an error of order Δq' and p has a definite value with an
error of order Δp'. The product of these two errors is

Δq'Δp' = h.   (56)

Thus the more accurately one of the variables q, p has a definite
value, the less accurately the other has a definite value. For a system
with several degrees of freedom, equation (56) applies to each degree
of freedom separately.
Equation (56) is known as Heisenberg's Principle of Uncertainty.
It shows clearly the limitations in the possibility of simultaneously
assigning numerical values, for any particular state, to two non-commuting
observables, when those observables are a canonical coordinate
and momentum, and provides a plain illustration of how
observations in quantum mechanics may be incompatible. It also
shows how classical mechanics, which assumes that numerical values
can be assigned simultaneously to all observables, may be a valid
approximation when h can be considered as small enough to be

† Frequency here means reciprocal of wave-length.
negligible. Equation (56) holds only in the most favourable case,
which occurs when the representative of the state is of the form of a
wave packet. Other forms of representative would lead to a Δq' and
Δp' whose product is larger than h.
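For the Gaussian case this can be made quantitative. The sketch below (illustrative parameters, not from the book, with ℏ = 1) computes the spreads of |⟨q'|X⟩|² and |⟨p'|X⟩|² for a Gaussian packet and finds σ_q σ_p = 1/2, i.e. of the order of ℏ, consistent with the order-of-magnitude relation (56):

```python
import numpy as np

# Spreads of a Gaussian packet in the q- and p-representations (hbar = 1).
N = 4096
L = 80.0
dq = L / N
q = (np.arange(N) - N // 2) * dq
sigma = 1.7                                   # illustrative packet width
psi_q = np.exp(-q**2 / (4 * sigma**2))

p = np.linspace(-8.0, 8.0, 2001)
psi_p = np.array([np.sum(np.exp(-1j * q * pp) * psi_q) * dq for pp in p])

def spread(x, weight):
    w = weight / np.sum(weight)
    m = np.sum(w * x)
    return np.sqrt(np.sum(w * (x - m)**2))

sq = spread(q, np.abs(psi_q)**2)
sp = spread(p, np.abs(psi_p)**2)
print(sq * sp)                                # ~0.5, i.e. hbar/2
```

Changing sigma trades width in q against width in p while leaving the product fixed, which is the content of the uncertainty relation for this most favourable case.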
Heisenberg's principle of uncertainty shows that, in the limit when
either q or p is completely determined, the other is completely
undetermined. This result can also be obtained directly from the
transformation function ⟨q'|p'⟩. According to the end of § 18,
|⟨q'|p'⟩|² dq' is proportional to the probability of q having a value in
the small range from q' to q'+dq' for the state for which p certainly
has the value p', and from (52) this probability is independent of q'
for a given dq'. Thus if p certainly has a definite value p', all values
of q are equally probable. Similarly, if q certainly has a definite value
q', all values of p are equally probable.
It is evident physically that a state for which all values of q are
equally probable, or one for which all values of p are equally probable,
cannot be attained in practice, in the first case because of limitations
of size and in the second because of limitations of energy. Thus an
eigenstate of p or an eigenstate of q cannot be attained in practice.
The argument at the end of § 12 already showed that such eigenstates
are unattainable, because of the infinite precision that would be
needed to set them up, and we now have another argument leading
to the same conclusion.
25. Displacement operators
We get a new insight into the meaning of some of the quantum conditions
by making a study of displacement operators. These appear
in the theory when we take into consideration that the scheme of
relations between states and dynamical variables given in Chapter II
is essentially a physical scheme, so that if certain states and dynamical
variables are connected by some relation, on our displacing them all
in a definite way (for example, displacing them all through a distance
δx in the direction of the x-axis of Cartesian coordinates), the new
states and dynamical variables would have to be connected by the
same relation.
The displacement of a state or observable is a perfectly definite
process physically. Thus to displace a state or observable through a
distance δx in the direction of the x-axis, we should merely have to
displace all the apparatus used in preparing the state, or all the
apparatus required to measure the observable, through the distance
δx in the direction of the x-axis, and the displaced apparatus would
define the displaced state or observable. The displacement of a
dynamical variable must be just as definite as the displacement of
an observable, because of the close mathematical connexion between
dynamical variables and observables. A displaced state or dynamical
variable is uniquely determined by the undisplaced state or dynamical
variable together with the direction and magnitude of the displacement.
The displacement of a ket vector is not such a definite thing though.
If we take a certain ket vector, it will represent a certain state and we
may displace this state and get a perfectly definite new state, but this
new state will not determine our displaced ket, but only the direction
of our displaced ket. We help to fix our displaced ket by requiring
that it shall have the same length as the undisplaced ket, but even
then it is not completely determined, but can still be multiplied by
an arbitrary phase factor. One would think at first sight that each
ket one displaces would have a different arbitrary phase factor,
but with the help of the following argument, we see that it must be
the same for them all. We make use of the law that superposition
relationships between states remain invariant under the displacement.
A superposition relationship between states is expressed
mathematically by a linear equation between the kets corresponding
to those states, for example

|R⟩ = c_1|A⟩+c_2|B⟩,   (57)

where c_1 and c_2 are numbers, and the invariance of the superposition
relationship requires that the displaced states correspond to kets
with the same linear equation between them; in our example they
would correspond to |Rd⟩, |Ad⟩, |Bd⟩ say, satisfying

|Rd⟩ = c_1|Ad⟩+c_2|Bd⟩.   (58)

We take these kets to be our displaced kets, rather than these kets
multiplied by arbitrary independent phase factors, which latter
kets would satisfy a linear equation with coefficients differing from c_1, c_2.
The only arbitrariness now left in the displaced kets is that of a single
arbitrary phase factor to be multiplied into all of them.
The condition that linear equations between the kets remain invariant
under the displacement and that an equation such as (58)
holds whenever the corresponding (57) holds, means that the displaced
kets are linear functions of the undisplaced kets and thus each
displaced ket |Pd⟩ is the result of some linear operator applied to the
corresponding undisplaced ket |P⟩. In symbols,

|Pd⟩ = D|P⟩,   (59)

where D is a linear operator independent of |P⟩ and depending only
on the displacement. The arbitrary phase factor by which all the
displaced kets may be multiplied results in D being undetermined
to the extent of an arbitrary numerical factor of modulus unity.
With the displacement of kets made definite in the above manner
and the displacement of bras, of course, made equally definite,
through their being the conjugate imaginaries of the kets, we can
now assert that any symbolic equation between kets, bras, and
dynamical variables must remain invariant under the displacement
of every symbol occurring in it, on account of such an equation
having some physical significance which will not get changed by the
displacement.
Take as an example the equation

⟨Q|P⟩ = c,

c being a number. Then we must have

⟨Qd|Pd⟩ = c = ⟨Q|P⟩.   (60)

From the conjugate imaginary of (59) with Q instead of P,

⟨Qd| = ⟨Q|D̄.   (61)

Hence (60) gives ⟨Q|D̄D|P⟩ = ⟨Q|P⟩.
Since this holds for arbitrary ⟨Q| and |P⟩, we must have

D̄D = 1,   (62)

giving us a general condition which D has to satisfy.
Take as a second example the equation

v|P⟩ = |R⟩,

where v is any dynamical variable. Then, using v_d to denote the
displaced dynamical variable, we must have

v_d|Pd⟩ = |Rd⟩.

With the help of (59) we get

v_d|Pd⟩ = D|R⟩ = Dv|P⟩ = DvD^{−1}|Pd⟩.

Since |Pd⟩ can be any ket, we must have

v_d = DvD^{−1},   (63)
which shows that the linear operator D determines the displacement
of dynamical variables as well as that of kets and bras. Note that
the arbitrary numerical factor of modulus unity in D does not affect
v_d, and also it does not affect the validity of (62).
Let us now pass to an infinitesimal displacement, i.e. taking the
displacement through the distance δx in the direction of the x-axis,
let us make δx → 0. From physical continuity we should expect
a displaced ket |Pd⟩ to tend to the original |P⟩ and we may further
expect the limit

lim_{δx→0} (|Pd⟩−|P⟩)/δx = lim_{δx→0} ((D−1)/δx)|P⟩

to exist. This requires that the limit

lim_{δx→0} (D−1)/δx   (64)

shall exist. This limit is a linear operator which we shall call the
displacement operator for the x-direction and denote by d_x. The
arbitrary numerical factor e^{iγ} with γ real which we may multiply
into D must be made to tend to unity as δx → 0 and then introduces
an arbitrariness in d_x, namely, d_x may be replaced by

lim_{δx→0} (De^{iγ}−1)/δx = lim_{δx→0} (D−1+iγ)/δx = d_x+ia_x,

where a_x is the limit of γ/δx. Thus d_x contains an arbitrary additive
pure imaginary number.
For δx small

D = 1+δx d_x.   (65)

Substituting this into (62), we get

(1+δx d̄_x)(1+δx d_x) = 1,

which reduces, with neglect of δx^2, to

δx(d̄_x+d_x) = 0.

Thus d_x is a pure imaginary linear operator. Substituting (65) into
(63) we get, with neglect of δx^2 again,

v_d = (1+δx d_x)v(1−δx d_x) = v+δx(d_x v−v d_x),   (66)

showing that

lim_{δx→0} (v_d−v)/δx = d_x v−v d_x.   (67)

We may describe any dynamical system in terms of the following
dynamical variables: the Cartesian coordinates x, y, z of the centre of
mass of the system, the components p_x, p_y, p_z of the total momentum
of the system, which are the canonical momenta conjugate to x, y, z
respectively, and any dynamical variables needed for describing
internal degrees of freedom of the system. If we suppose a piece
of apparatus which has been set up to measure x, to be displaced a
distance δx in the direction of the x-axis, it will measure x−δx, hence

x_d = x−δx.

Comparing this with (66) for v = x, we obtain

d_x x−x d_x = −1.   (68)

This is the quantum condition connecting d_x with x. From similar
arguments we find that y, z, p_x, p_y, p_z, and the internal dynamical
variables, which are unaffected by the displacement, must commute with
d_x. Comparing these results with (9), we see that iℏ d_x satisfies just
the same quantum conditions as p_x. Their difference, p_x−iℏ d_x,
commutes with all the dynamical variables and must therefore be a
number. This number, which is necessarily real since p_x and iℏ d_x are
both real, may be made zero by a suitable choice of the arbitrary
pure imaginary number that can be added to d_x. We then have the
result

p_x = iℏ d_x,   (69)

or the x-component of the total momentum of the system is iℏ times the
displacement operator d_x.
This is a fundamental result, which gives a new significance to
displacement operators. There is a corresponding result, of course,
also for the y and z displacement operators d_y and d_z. The quantum
conditions which state that p_x, p_y, and p_z commute with each other
are now seen to be connected with the fact that displacements in
different directions are commutable operations.
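Equation (69) identifies p_x as iℏ times the generator of displacements, so a finite displacement through dx is effected by exp(−i dx p/ℏ). A minimal numerical sketch (ℏ = 1; the grid and the packet are illustrative choices, not from the book), applying this operator in the momentum representation via the FFT:

```python
import numpy as np

# Shift a sampled packet by dx using the displacement generator p (hbar = 1):
# in the momentum representation exp(-i*dx*p/hbar) is just a phase factor.
N = 1024
L = 40.0
dq = L / N
q = (np.arange(N) - N // 2) * dq
psi = np.exp(-(q - 1.0)**2)                  # packet centred at q = 1
dx = 3.0

k = 2 * np.pi * np.fft.fftfreq(N, d=dq)      # momentum grid of the FFT
shifted = np.fft.ifft(np.exp(-1j * k * dx) * np.fft.fft(psi))

centre = q[np.argmax(np.abs(shifted))]
print(centre)                                # near 1 + dx = 4
```

The displaced packet is the original translated bodily through dx, which is exactly what the displaced apparatus of the text would prepare.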
26. Unitary transformations
Let U be any linear operator that has a reciprocal U^{−1} and consider
the equation

α* = UαU^{−1},   (70)

α being an arbitrary linear operator. This equation may be regarded
as expressing a transformation from any linear operator α to a
corresponding linear operator α*, and as such it has rather remarkable
properties. In the first place it should be noted that each α* has the
same eigenvalues as the corresponding α; since, if α' is any eigenvalue
of α and |α'⟩ is an eigenket belonging to it, we have

α|α'⟩ = α'|α'⟩

and hence

α*U|α'⟩ = UαU^{−1}U|α'⟩ = Uα|α'⟩ = α'U|α'⟩,
showing that U|α'⟩ is an eigenket of α* belonging to the same eigenvalue
α', and similarly any eigenvalue of α* may be shown to be also
an eigenvalue of α. Further, if we take several α's that are connected
by algebraic equations and transform them all according to (70), the
corresponding α*'s will be connected by the same algebraic equations.
This result follows from the fact that the fundamental algebraic processes
of addition and multiplication are left invariant by the transformation
(70), as is shown by the following equations:

(α_1+α_2)* = U(α_1+α_2)U^{−1} = Uα_1U^{−1}+Uα_2U^{−1} = α_1*+α_2*,
(α_1α_2)* = Uα_1α_2U^{−1} = Uα_1U^{−1}Uα_2U^{−1} = α_1*α_2*.
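These two properties of the transformation (70) are easy to verify numerically. A sketch with random matrices (illustrative, not from the book; the unitary U is taken from a QR decomposition, so that U^{−1} is its conjugate transpose):

```python
import numpy as np

# alpha* = U alpha U^{-1} with unitary U: same eigenvalues, and (AB)* = A* B*.
rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

def star(M):
    return Q @ M @ Q.conj().T            # U^{-1} is the conjugate transpose

eig = np.sort_complex(np.linalg.eigvals(A))
eig_star = np.sort_complex(np.linalg.eigvals(star(A)))
eig_err = np.max(np.abs(eig - eig_star))
prod_err = np.max(np.abs(star(A @ B) - star(A) @ star(B)))
print(eig_err, prod_err)                 # both ~0
```

Both residuals vanish to rounding error: the spectrum is preserved and products transform into products.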
Let us now see what condition would be imposed on U by the
requirement that any real α transforms into a real α*. Equation
(70) may be written

α*U = Uα.   (71)

Taking the conjugate complex of both sides in accordance with
(5) of § 8 we find, if α and α* are both real,

Ūα* = αŪ.   (72)

Equation (71) gives us

Ūα*U = ŪUα,

and equation (72) gives us

Ūα*U = αŪU.

Hence

ŪUα = αŪU.

Thus ŪU commutes with any real linear operator and therefore also
with any linear operator whatever, since any linear operator can be
expressed as one real one plus i times another. Hence ŪU is a
number. It is obviously real, its conjugate complex according to (5)
of § 8 being the same as itself, and further it must be a positive
number, since for any ket |P⟩, ⟨P|ŪU|P⟩ is positive as well as
⟨P|P⟩. We can suppose it to be unity without any loss of generality
in the transformation (70). We then have

ŪU = 1.   (73)

Equation (73) is equivalent to any of the following

U = Ū^{−1},   Ū = U^{−1},   UŪ = 1.   (74)

A matrix or linear operator U that satisfies (73) and (74) is said
to be unitary and a transformation (70) with unitary U is called a
unitary transformation. A unitary transformation transforms real
linear operators into real linear operators and leaves invariant any
algebraic equation between linear operators. It may be considered
as applying also to kets and bras, in accordance with the equations

|P*⟩ = U|P⟩,   ⟨P*| = ⟨P|Ū = ⟨P|U^{−1},   (75)

and then it leaves invariant any algebraic equation between linear
operators, kets, and bras. It transforms eigenvectors of α into eigenvectors
of α*. From this one can easily deduce that it transforms an
observable into an observable and that it leaves invariant any functional
relation between observables based on the general definition
of a function given in § 11.
The inverse of a unitary transformation is also a unitary transformation,
since from (74), if U is unitary, U^{−1} is also unitary.
Further, if two unitary transformations are applied in succession,
the result is a third unitary transformation, as may be verified in
the following way. Let the two unitary transformations be (70) and

α† = Vα*V^{−1}.

The connexion between α† and α is then

α† = VUαU^{−1}V^{−1} = (VU)α(VU)^{−1}   (76)

from (42) of § 11. Now VU is unitary, since its conjugate complex
is ŪV̄ and

ŪV̄VU = ŪU = 1,

and hence (76) is a unitary transformation.
The transformation given in the preceding section from undisplaced
to displaced quantities is an example of a unitary transformation, as
is shown by equations (62), (63), corresponding to equations (73),
(70), and equations (59), (61), corresponding to equations (75).
In classical mechanics one can make a transformation from the
canonical coordinates and momenta q_r, p_r (r = 1,..., n) to a new set of
variables q_r*, p_r* (r = 1,..., n) satisfying the same P.B. relations as the
q's and p's, i.e. equations (8) of § 21 with q*'s and p*'s replacing the
q's and p's, and can express all dynamical variables in terms of the q*'s
and p*'s. The q*'s and p*'s are then also called canonical coordinates
and momenta and the transformation is called a contact transformation.
One can easily verify that the P.B. of any two dynamical
variables u and v is correctly given by formula (1) of § 21 with q*'s and
p*'s instead of q's and p's, so that the P.B. relationship is invariant
under a contact transformation. This results in the new canonical
coordinates and momenta being on the same footing as the original
ones for many purposes of general dynamical theory, even though the
new coordinates q_r* may not be a set of Lagrangian coordinates but
may be functions of the Lagrangian coordinates and velocities.
It will now be shown that, for a quantum dynamical system that
has a classical analogue, unitary transformations in the quantum theory
are the analogue of contact transformations in the classical theory.
Unitary transformations are more general than contact transformations,
since the former can be applied to systems in quantum
mechanics that have no classical analogue, but for those systems in
quantum mechanics which are describable in terms of canonical
coordinates and momenta, the analogy between the two kinds of
transformation holds. To establish it, we note that a unitary transformation
applied to the quantum variables q_r, p_r gives new variables
q_r*, p_r* satisfying the same P.B. relations, since the P.B. relations are
equivalent to the algebraic relations (9) of § 21 and algebraic relations
are left invariant by a unitary transformation. Conversely, any real
variables q_r*, p_r* satisfying the P.B. relations for canonical coordinates
and momenta are connected with the q_r, p_r by a unitary transformation,
as is shown by the following argument.
We use the Schrödinger representation, and write the basic ket
|q_1'...q_n'⟩ as |q'⟩ for brevity. Since we are assuming that the q_r*, p_r*
satisfy the P.B. relations for canonical coordinates and momenta,
we can set up a Schrödinger representation referring to them, with
the q_r* diagonal and each p_r* equal to −iℏ ∂/∂q_r*. The basic kets in
this second Schrödinger representation will be |q_1*'...q_n*'⟩, which we
write |q*'⟩ for brevity. Now introduce the linear operator U defined by

⟨q*'|U|q'⟩ = δ(q*'−q'),   (77)

where δ(q*'−q') is short for

δ(q*'−q') = δ(q_1*'−q_1')δ(q_2*'−q_2')...δ(q_n*'−q_n').   (78)

The conjugate complex of (77) is

⟨q'|Ū|q*'⟩ = δ(q*'−q'),

and hence†

⟨q'|ŪU|q''⟩ = ∫ ⟨q'|Ū|q*'⟩ dq*' ⟨q*'|U|q''⟩
            = ∫ δ(q*'−q') dq*' δ(q*'−q'')
            = δ(q'−q''),

so that ŪU = 1.

† We use the notation of a single integral sign and dq*' to denote an integral over
all the variables q_1*', q_2*',..., q_n*'. This abbreviation will be used also in future work.
Thus U is a unitary operator. We have further

⟨q*'|q_r* U|q'⟩ = q_r*' δ(q*'−q')
and
⟨q*'|U q_r|q'⟩ = q_r' δ(q*'−q').

The right-hand sides of these two equations are equal on account of
the property of the δ function (11) of § 15, and hence

q_r* U = U q_r
or
q_r* = U q_r U^{−1}.

Again, from (45) and (46),

⟨q*'|p_r* U|q'⟩ = −iℏ (∂/∂q_r*') δ(q*'−q'),
⟨q*'|U p_r|q'⟩ = iℏ (∂/∂q_r') δ(q*'−q').

The right-hand sides of these two equations are obviously equal, and
hence

p_r* U = U p_r
or
p_r* = U p_r U^{−1}.
Thus all the conditions for a unitary transformation are verified.
We get an infinitesimal unitary transformation by taking U in (70)
to differ by an infinitesimal from unity. Put

U = 1+iεF,

where ε is infinitesimal, so that its square can be neglected. Then

U^{−1} = 1−iεF.

The unitary condition (73) or (74) requires that F shall be real. The
transformation equation (70) now takes the form

α* = (1+iεF)α(1−iεF),

which gives

α*−α = iε(Fα−αF).   (79)

It may be written in P.B. notation

α*−α = εℏ[α, F].   (80)

If α is a canonical coordinate or momentum, this is formally the same
as a classical infinitesimal contact transformation.
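A numerical check of the first-order formula (79) (illustrative, not from the book: a random Hermitian generator F plays the part of a real dynamical variable, with a small ε):

```python
import numpy as np

# alpha* - alpha should equal i*eps*(F alpha - alpha F) up to terms in eps^2.
rng = np.random.default_rng(1)
n = 5
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
F = (X + X.conj().T) / 2                     # Hermitian, i.e. "real", generator
alpha = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
eps = 1e-5

U = np.eye(n) + 1j * eps * F
alpha_star = U @ alpha @ np.linalg.inv(U)
first_order = 1j * eps * (F @ alpha - alpha @ F)
residual = np.max(np.abs(alpha_star - alpha - first_order))
print(residual)                              # of order eps^2
```

The residual scales as ε², confirming that the commutator term is the whole first-order change.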
V
THE EQUATIONS OF MOTION
27. Schrödinger's form for the equations of motion
OUR work from § 5 onwards has all been concerned with one instant
of time. It gave the general scheme of relations between states and
dynamical variables for a dynamical system at one instant of time.
To get a complete theory of dynamics we must consider also the
connexion between different instants of time. When one makes an
observation on the dynamical system, the state of the system gets
changed in an unpredictable way, but in between observations
causality applies, in quantum mechanics as in classical mechanics,
and the system is governed by equations of motion which make the
state at one time determine the state at a later time. These equations
of motion we now proceed to study. They will apply so long as the
dynamical system is left undisturbed by any observation or similar
process.† Their general form can be deduced from the principle of
superposition of Chapter I.
Let us consider a particular state of motion throughout the time
during which the system is left undisturbed. We shall have the state
at any time t corresponding to a certain ket which depends on t and
which may be written |t⟩. If we deal with several of these states of
motion we distinguish them by giving them labels such as A, and we
then write the ket which corresponds to the state at time t for one
of them |At⟩. The requirement that the state at one time determines
the state at another time means that |At_0⟩ determines |At⟩ except
for a numerical factor. The principle of superposition applies to these
states of motion throughout the time during which the system is
undisturbed, and means that if we take a superposition relation
holding for certain states at time t_0 and giving rise to a linear equation
between the corresponding kets, e.g. the equation

|Rt_0⟩ = c_1|At_0⟩+c_2|Bt_0⟩,

the same superposition relation must hold between the states of
motion throughout the time during which the system is undisturbed
and must lead to the same equation between the kets corresponding

† The preparation of a state is a process of this kind. It often takes the form of
making an observation and selecting the system when the result of the observation
turns out to be a certain pre-assigned number.
to these states at any time t (in the undisturbed time interval), i.e.
the equation

|Rt⟩ = c_1|At⟩+c_2|Bt⟩,

provided the arbitrary numerical factors by which these kets may be
multiplied are suitably chosen. It follows that the |Pt⟩'s are linear
functions of the |Pt_0⟩'s and each |Pt⟩ is the result of some linear
operator applied to |Pt_0⟩. In symbols

|Pt⟩ = T|Pt_0⟩,   (1)

where T is a linear operator independent of P and depending only
on t (and t_0).
We now assume that each |Pt⟩ has the same length as the corresponding
|Pt_0⟩. It is not necessarily possible to choose the arbitrary
numerical factors by which the |Pt⟩'s may be multiplied so as to
make this so without destroying the linear dependence of the |Pt⟩'s
on the |Pt_0⟩'s, so the new assumption is a physical one and not just
a question of notation. It involves a kind of sharpening of the
principle of superposition. The arbitrariness in |Pt⟩ now becomes
merely a phase factor, which must be independent of P in order that
the linear dependence of the |Pt⟩'s on the |Pt_0⟩'s may be preserved.
From the condition that the length of c_1|Pt⟩+c_2|Qt⟩ equals that of
c_1|Pt_0⟩+c_2|Qt_0⟩ for any complex numbers c_1, c_2, we can deduce that

⟨Qt|Pt⟩ = ⟨Qt_0|Pt_0⟩.   (2)

The connexion between the |Pt⟩'s and |Pt_0⟩'s is formally similar
to the connexion we had in § 25 between the displaced and undisplaced
kets, with a process of time displacement instead of the space displacement
of § 25. Equations (1) and (2) play the part of equations (59)
and (60) of § 25. We can develop the consequences of these equations
as in § 25 and can deduce that T contains an arbitrary numerical
factor of modulus unity and satisfies

T̄T = 1,   (3)

corresponding to (62) of § 25, so T is unitary. We pass to the infinitesimal
case by making t → t_0 and assume from physical continuity that
the limit

lim_{t→t_0} (|Pt⟩−|Pt_0⟩)/(t−t_0)

exists. This limit is just the derivative of |Pt_0⟩ with respect to t_0.
From (1) it equals

d|Pt_0⟩/dt_0 = lim_{t→t_0} ((T−1)/(t−t_0)) |Pt_0⟩.   (4)
The limit Operator occurring here is, like (64) of $25, a pure imaginary
linear Operator and is undetermined to the extent of an arbitrary
additive pure imaginary number. Putting this limit Operator multiplied
by i6 equal to H, or rather H(t,) since it may depend on t,,
equation (4) becomes, when written for a general t,
&4po
- = lqt>p>.
dt (5)
Equation (5) gives the general law for the variation with time of
the ket corresponding to the state at any time. It is Schroedinger's
form for the equations of motion. It involves just one real linear
operator H(t), which must be characteristic of the dynamical system
under consideration. We assume that H(t) is the total energy of
the system. There are two justifications for this assumption, (i) the
analogy with classical mechanics, which will be developed in the
next section, and (ii) we have H(t) appearing as iℏ times an operator
of displacement in time similar to the operators of displacement in
the x, y, and z directions of § 25, so corresponding to (69) of § 25
we should have H(t) equal to the total energy, since the theory of
relativity puts energy in the same relation to time as momentum to
distance.
We assume on physical grounds that the total energy of a system
is always an observable. For an isolated system it is a constant, and
may then be written H. Even when it is not a constant we shall often
write it simply H, leaving its dependence on t understood. If the
energy depends on t, it means the system is acted on by external
forces. An action of this kind is to be distinguished from a disturbance
caused by a process of observation, as the former is compatible
with causality and equations of motion while the latter is not.
We can get a connexion between H(t) and the T of equation (1)
by substituting for |Pt⟩ in (5) its value given by equation (1). This
gives
iℏ (dT/dt)|Pt₀⟩ = H(t)T|Pt₀⟩.
Since |Pt₀⟩ may be any ket, we have
iℏ dT/dt = H(t)T. (6)
Equation (5) is very important for practical problems, where it is
usually used in conjunction with a representation. Introducing a
representation with a complete set of commuting observables ξ
diagonal and putting ⟨ξ′|Pt⟩ equal to ψ(ξ′t), we have, passing to the
standard ket notation,
|Pt⟩ = ψ(ξt)⟩.
Equation (5) now becomes
iℏ (d/dt)ψ(ξt)⟩ = H ψ(ξt)⟩. (7)
Equation (7) is known as Schroedinger's wave equation and its solutions
ψ(ξt) are time-dependent wave functions. Each solution corresponds to
a state of motion of the system and the square of its modulus gives
the probability of the ξ's having specified values at any time t. For
a system describable in terms of canonical coordinates and momenta
we may use Schroedinger's representation and can then take H to be
an operator of differentiation in accordance with (42) of § 22.
28. Heisenberg's form for the equations of motion
In the preceding section we set up a picture of the states of
undisturbed motion by making each of them correspond to a moving
ket, the state at any time corresponding to the ket at that time. We
shall call this the Schroedinger picture. Let us apply to our kets the
unitary transformation which makes each ket |a⟩ go over into
|a*⟩ = T⁻¹|a⟩. (8)
This transformation is of the form given by (75) of § 26 with T⁻¹ for
U, but it depends on the time t since T depends on t. It is thus to be
pictured as the application of a continuous motion (consisting of
rotations and uniform deformations) to the whole ket vector space.
A ket which is originally fixed becomes a moving one, its motion being
given by (8) with |a⟩ independent of t. On the other hand, a ket
which is originally moving to correspond to a state of undisturbed
motion, i.e. in accordance with equation (1), becomes fixed, since on
substituting |Pt⟩ for |a⟩ in (8) we get |a*⟩ independent of t. Thus
the transformation brings the kets corresponding to states of undisturbed
motion to rest.
The unitary transformation must be applied also to bras and linear
operators, in order that equations between the various quantities may
remain invariant. The transformation applied to bras is given by the
conjugate imaginary of (8) and applied to linear operators it is given
by (70) of § 26 with T⁻¹ for U, i.e.
α* = T⁻¹αT. (9)
A linear operator which is originally fixed transforms into a moving
linear operator in general. Now a dynamical variable corresponds to
a linear operator which is originally fixed (because it does not refer
to t at all), so after the transformation it corresponds to a moving
linear operator. The transformation thus leads us to a new picture
of the motion, in which the states correspond to fixed vectors and
the dynamical variables to moving linear operators. We shall call
this the Heisenberg picture.
The physical condition of the dynamical system at any time
involves the relation of the dynamical variables to the state, and
the change of the physical condition with time may be ascribed
either to a change in the state, with the dynamical variables kept
fixed, which gives us the Schroedinger picture, or to a change in the
dynamical variables, with the state kept fixed, which gives us the
Heisenberg picture.
In the Heisenberg picture there are equations of motion for the
dynamical variables. Take a dynamical variable corresponding to
the fixed linear operator v in the Schroedinger picture. In the Heisenberg
picture it corresponds to a moving linear operator, which we
write as v_t instead of v*, to bring out its dependence on t, and which
is given by
v_t = T⁻¹vT (10)
or Tv_t = vT.
Differentiating with respect to t, we get
(dT/dt)v_t + T(dv_t/dt) = v(dT/dt).
With the help of (6), this gives
HTv_t + iℏT(dv_t/dt) = vHT,
or
iℏ dv_t/dt = T⁻¹vHT − T⁻¹HTv_t
= v_t H_t − H_t v_t, (11)
where H_t = T⁻¹HT. (12)
Equation (11) may be written in P.B. notation
dv_t/dt = [v_t, H_t]. (13)
Equation (11) or (13) shows how any dynamical variable varies
with time in the Heisenberg picture and gives us Heisenberg's form
for the equations of motion. These equations of motion are determined
by the one linear operator H_t, which is just the transform of the linear
operator H occurring in Schroedinger's form for the equations of
motion and corresponds to the energy in the Heisenberg picture. We
shall call the dynamical variables in the Heisenberg picture, where
they vary with the time, Heisenberg dynamical variables, to distinguish
them from the fixed dynamical variables of the Schroedinger picture,
which we shall call Schroedinger dynamical variables. Each Heisenberg
dynamical variable is connected with the corresponding Schroedinger
dynamical variable by equation (10). Since this connexion is a unitary
transformation, all algebraic and functional relationships are the
same for both kinds of dynamical variable. We have T = 1 for
t = t₀, so that v_{t₀} = v and any Heisenberg dynamical variable at time
t₀ equals the corresponding Schroedinger dynamical variable.
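The equivalence of the two pictures can be illustrated numerically: the sketch below (assumed natural units ℏ = 1, arbitrarily chosen Hermitian H and v, t₀ = 0) checks that ⟨Pt|v|Pt⟩ in the Schroedinger picture equals ⟨Pt₀|v_t|Pt₀⟩ with v_t given by equation (10):

```python
import numpy as np

hbar = 1.0  # natural units, an assumption for the illustration

# Arbitrary Hermitian Hamiltonian H and observable v.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (A + A.conj().T) / 2
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
v = (B + B.conj().T) / 2

def T_of(t):
    # T = e^{-iHt/hbar} for constant H, taking t0 = 0.
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t / hbar)) @ V.conj().T

t = 1.7
T = T_of(t)
ket0 = np.array([1.0, 0.0, 0.0], dtype=complex)   # |Pt0>

# Schroedinger picture: the ket moves, v stays fixed.
ket_t = T @ ket0
exp_schroedinger = (ket_t.conj() @ v @ ket_t).real

# Heisenberg picture: the ket stays fixed, v_t = T^{-1} v T moves, eq. (10).
v_t = T.conj().T @ v @ T
exp_heisenberg = (ket0.conj() @ v_t @ ket0).real

picture_gap = abs(exp_schroedinger - exp_heisenberg)
```

The two expectation values agree to machine precision, and at t = t₀ the propagator reduces to the unit matrix, so v_{t₀} = v as stated above.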
Equation (13) can be compared with classical mechanics, where we
also have dynamical variables varying with the time. The equations
of motion of classical mechanics can be written in the Hamiltonian
form
dq_r/dt = ∂H/∂p_r,   dp_r/dt = −∂H/∂q_r, (14)
where the q's and p's are a set of canonical coordinates and momenta
and H is the energy expressed as a function of them and possibly also
of t. The energy expressed in this way is called the Hamiltonian.
Equations (14) give, for v any function of the q's and p's that does
not contain the time t explicitly,
dv/dt = Σ_r {(∂v/∂q_r)(dq_r/dt) + (∂v/∂p_r)(dp_r/dt)}
= Σ_r {(∂v/∂q_r)(∂H/∂p_r) − (∂v/∂p_r)(∂H/∂q_r)}
= [v, H], (15)
with the classical definition of a P.B., equation (1) of § 21. This is
of the same form as equation (13) in the quantum theory. We thus
get an analogy between the classical equations of motion in the
Hamiltonian form and the quantum equations of motion in Heisenberg's
form. This analogy provides a justification for the assumption
that the linear operator H introduced in the preceding section is the
energy of the system in quantum mechanics.
In classical mechanics a dynamical system is defined mathematically
when the Hamiltonian is given, i.e. when the energy is given
in terms of a set of canonical coordinates and momenta, as this is
sufficient to fix the equations of motion. In quantum mechanics a
dynamical system is defined mathematically when the energy is
given in terms of dynamical variables whose commutation relations
are known, as this is then sufficient to fix the equations of motion,
in both Schroedinger's and Heisenberg's form. We need to have
either H expressed in terms of the Schroedinger dynamical variables
or H_t expressed in terms of the corresponding Heisenberg dynamical
variables, the functional relationship being, of course, the same in
both cases. We call the energy expressed in this way the Hamiltonian
of the dynamical system in quantum mechanics, to keep up the
analogy with the classical theory.
A system in quantum mechanics always has a Hamiltonian, whether
the system is one that has a classical analogue and is describable in
terms of canonical coordinates and momenta or not. However, if the
system does have a classical analogue, its connexion with classical
mechanics is specially close and one can usually assume that the
Hamiltonian is the same function of the canonical coordinates and
momenta in the quantum theory as in the classical theory.† There
would be a difficulty in this, of course, if the classical Hamiltonian
involved a product of factors whose quantum analogues do not commute,
as one would not know in which order to put these factors in
the quantum Hamiltonian, but this does not happen for most of the
elementary dynamical systems whose study is important for atomic
physics. In consequence we are able also largely to use the same
language for describing dynamical systems in the quantum theory as
in the classical theory (e.g. to talk about particles with given masses
moving through given fields of force), and when given a system in
classical mechanics, can usually give a meaning to 'the same' system
in quantum mechanics.
Equation (13) holds for v_t any function of the Heisenberg dynamical
variables not involving the time explicitly, i.e. for v any constant
† This assumption is found in practice to be successful only when applied with the
dynamical coordinates and momenta referring to a Cartesian system of axes and not
to more general curvilinear coordinates.
linear operator in the Schroedinger picture. It shows that such a
function v_t is constant if v commutes with H. We then have
v_t = v,
and we call v_t or v a constant of the motion. It is necessary that v shall
commute with H at all times, which is usually possible only if H is
constant. In this case we can substitute H for v in (13) and deduce
that H_t is constant, showing that H itself is then a constant of the
motion. Thus if the Hamiltonian is constant in the Schroedinger
picture, it is also constant in the Heisenberg picture.
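A constant of the motion can be exhibited numerically by taking for v a polynomial in H, which certainly commutes with H; the Heisenberg variable v_t of equation (10) then does not move at all (a sketch in assumed natural units ℏ = 1, with an arbitrary Hermitian H):

```python
import numpy as np

hbar = 1.0  # natural units, an assumption for the illustration

# A Hermitian H and an observable built as a polynomial in H, so that
# [v, H] = 0 and v is a constant of the motion.
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2
v = H @ H + 2.0 * H

w, V = np.linalg.eigh(H)
t = 3.2
T = V @ np.diag(np.exp(-1j * w * t / hbar)) @ V.conj().T  # T = e^{-iHt/hbar}

# Heisenberg variable at time t, eq. (10): unchanged because v commutes with H.
v_t = T.conj().T @ v @ T
drift = np.abs(v_t - v).max()
```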
For an isolated system, a system not acted on by any external
forces, there are always certain constants of the motion. One of these
is the total energy or Hamiltonian. Others are provided by the
displacement theory of § 25. It is evident physically that the total
energy must remain unchanged if all the dynamical variables are
displaced in a certain way, so equation (63) of § 25 must hold with
v_d = v = H. Thus D commutes with H and is a constant of the
motion. Passing to the case of an infinitesimal displacement, we see
that the displacement operators d_x, d_y, and d_z are constants of the
motion and hence, from (69) of § 25, the total momentum is a constant
of the motion. Again, the total energy must remain unchanged if all
the dynamical variables are subjected to a certain rotation. This
leads, as will be shown in § 35, to the result that the total angular
momentum is a constant of the motion. The laws of conservation of
energy, momentum, and angular momentum hold for an isolated system
in the Heisenberg picture in quantum mechanics, as they hold in
classical mechanics.
Two forms for the equations of motion of quantum mechanics have
now been given. Of these, the Schroedinger form is the more useful
one for practical problems, as it provides the simpler equations. The
unknowns in Schroedinger's wave equation are the numbers which
form the representative of a ket vector, while Heisenberg's equation
of motion for a dynamical variable, if expressed in terms of a representation,
would involve as unknowns the numbers forming the
representative of the dynamical variable. The latter are far more
numerous and therefore more difficult to evaluate than the Schroedinger
unknowns. Heisenberg's form for the equations of motion is
of value in providing an immediate analogy with classical mechanics
and enabling one to see how various features of classical theory, such
as the conservation laws referred to above, are translated into
quantum theory.
29. Stationary states
We shall here deal with a dynamical system whose energy is constant.
Certain specially simple relations hold for this case. Equation
(6) can be integrated† to give
T = e^{−iH(t−t₀)/ℏ},
with the help of the initial condition that T = 1 for t = t₀. This
result substituted into (1) gives
|Pt⟩ = e^{−iH(t−t₀)/ℏ}|Pt₀⟩, (16)
which is the integral of Schroedinger's equation of motion (5), and
substituted into (10) it gives
v_t = e^{iH(t−t₀)/ℏ} v e^{−iH(t−t₀)/ℏ}, (17)
which is the integral of Heisenberg's equation of motion (11), H_t being
now equal to H. Thus we have solutions of the equations of motion
in a simple form. However, these solutions are not of much practical
value, because of the difficulty involved in evaluating the operator
e^{−iH(t−t₀)/ℏ}, unless H is particularly simple, and for practical purposes
one usually has to fall back on Schroedinger's wave equation.
Let us consider a state of motion such that at time t₀ it is an eigenstate
of the energy. The ket |Pt₀⟩ corresponding to it at this time
must be an eigenket of H. If H′ is the eigenvalue to which it belongs,
equation (16) gives
|Pt⟩ = e^{−iH′(t−t₀)/ℏ}|Pt₀⟩,
showing that |Pt⟩ differs from |Pt₀⟩ only by a phase factor. Thus
the state always remains an eigenstate of the energy, and further, it
does not vary with the time at all, since the direction of the ket |Pt⟩
does not vary with the time. Such a state is called a stationary state.
The probability for any particular result of an observation on it is
independent of the time when the observation is made. From our
assumption that the energy is an observable, there are sufficient
stationary states for an arbitrary state to be dependent on them.
The time-dependent wave function ψ(ξt) representing a stationary
state of energy H′ will vary with time according to the law
ψ(ξt) = ψ₀(ξ)e^{−iH′t/ℏ}, (18)
† The integration can be carried out as though H were an ordinary algebraic
variable instead of a linear operator, because there is no quantity that does not
commute with H in the work.
and Schroedinger's wave equation (7) for it reduces to
Hψ₀ = H′ψ₀. (19)
This equation merely asserts that the state represented by ψ₀ is an
eigenstate of H. We call a function ψ₀ satisfying (19) an eigenfunction
of H, belonging to the eigenvalue H′.
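That a stationary state varies only by a phase factor, so that all observation probabilities are time-independent, is easy to confirm numerically (assumed natural units ℏ = 1, arbitrary Hermitian H, t₀ = 0):

```python
import numpy as np

hbar = 1.0  # natural units, an assumption for the illustration

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (A + A.conj().T) / 2
w, V = np.linalg.eigh(H)

psi0 = V[:, 0]               # an eigenket of H with eigenvalue H' = w[0]
t = 2.5
T = V @ np.diag(np.exp(-1j * w * t / hbar)) @ V.conj().T
psi_t = T @ psi0             # eq. (16) applied to an energy eigenket

# |Pt> should differ from |Pt0> only by the phase factor e^{-iH't/hbar}:
phase = np.exp(-1j * w[0] * t / hbar)
phase_error = np.abs(psi_t - phase * psi0).max()

# Probabilities of any observation are therefore time-independent:
prob_drift = np.abs(np.abs(psi_t) ** 2 - np.abs(psi0) ** 2).max()
```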
In the Heisenberg picture the stationary states correspond to fixed
eigenvectors of the energy. We can set up a representation in which
all the basic vectors are eigenvectors of the energy and so correspond
to stationary states in the Heisenberg picture. We call such a representation
a Heisenberg representation. The first form of quantum
mechanics, discovered by Heisenberg in 1925, was in terms of a
representation of this kind. The energy is diagonal in the representation.
Any other diagonal dynamical variable must commute with the
energy and is therefore a constant of the motion. The problem of
setting up a Heisenberg representation thus reduces to the problem
of finding a complete set of commuting observables, each of which
is a constant of the motion, and then making these observables
diagonal. The energy must be a function of these observables, from
Theorem 2 of § 19. It is sometimes convenient to take the energy
itself as one of them.
Let α denote the complete set of commuting observables in a
Heisenberg representation, so that the basic vectors are written ⟨α′|,
|α″⟩. The energy is a function of these observables α, say H = H(α).
From (17) we get
⟨α′|v_t|α″⟩ = ⟨α′|e^{iH(t−t₀)/ℏ} v e^{−iH(t−t₀)/ℏ}|α″⟩
= e^{i(H′−H″)(t−t₀)/ℏ} ⟨α′|v|α″⟩, (20)
where H′ = H(α′) and H″ = H(α″). The factor ⟨α′|v|α″⟩ on the right-hand
side here is independent of t, being an element of the matrix
representing the fixed linear operator v. Formula (20) shows how the
Heisenberg matrix elements of any Heisenberg dynamical variable
vary with time, and it makes v_t satisfy the equation of motion (11),
as is easily verified. The variation given by (20) is simply periodic
with the frequency
|H′−H″|/2πℏ = |H′−H″|/h, (21)
depending only on the energy difference of the two stationary states
to which the matrix element refers. This result is closely connected
with the Combination Law of Spectroscopy and Bohr's Frequency
Condition, according to which (21) is the frequency of the electromagnetic
radiation emitted or absorbed when the system makes a
transition under the influence of radiation between the stationary
states α′ and α″, the eigenvalues of H being Bohr's energy levels.
These matters will be dealt with in § 45.
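The periodic variation (20) of a Heisenberg matrix element can be checked directly for a two-level system (assumed natural units ℏ = 1, arbitrary example energies and observable):

```python
import numpy as np

hbar = 1.0  # natural units, an assumption for the illustration

# Two stationary states with energies H' and H'' (arbitrary example values)
# and a fixed observable v.
Hp, Hpp = 0.3, 1.1
v = np.array([[0.0, 0.7], [0.7, 0.2]], dtype=complex)

def v_heisenberg(t):
    # v_t = e^{iHt/hbar} v e^{-iHt/hbar}, eq. (17) with t0 = 0 and H diagonal.
    U = np.diag(np.exp(-1j * np.array([Hp, Hpp]) * t / hbar))
    return U.conj().T @ v @ U

# Eq. (20): the off-diagonal element just rotates with the Bohr phase
# e^{i(H'-H'')(t-t0)/hbar}; its modulus is fixed, and the oscillation
# frequency depends only on the energy difference, as in (21).
t = 0.9
elem = v_heisenberg(t)[0, 1]
expected = np.exp(1j * (Hp - Hpp) * t / hbar) * v[0, 1]
element_error = abs(elem - expected)
```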
30. The free particle
The most fundamental and elementary application of quantum
mechanics is to the system consisting merely of a free particle, or
particle not acted on by any forces. For dealing with it we use as
dynamical variables the three Cartesian coordinates x, y, z and their
conjugate momenta p_x, p_y, p_z. The Hamiltonian is equal to the
kinetic energy of the particle, namely
H = (1/2m)(p_x² + p_y² + p_z²) (22)
according to Newtonian mechanics, m being the mass.
This formula
is valid only if the velocity of the particle is small compared with c,
the velocity of light. For a rapidly moving particle, such as we often
have to deal with in atomic theory, (22) must be replaced by the
relativistic formula
H = c(m²c² + p_x² + p_y² + p_z²)^{1/2}. (23)
For small values of p_x, p_y, and p_z (23) goes over into (22), except for
the constant term mc² which corresponds to the rest-energy of the
particle in the theory of relativity and which has no influence on the
equations of motion. Formulas (22) and (23) can be taken over
directly into the quantum theory, the square root in (23) being now
understood as the positive square root defined at the end of § 11.
The constant term mc² by which (23) differs from (22) for small values
of p_x, p_y, and p_z can still have no physical effects, since the Hamiltonian
in the quantum theory, as introduced in § 27, is undefined to
the extent of an arbitrary additive real constant.
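The relation between (22) and (23) for slow particles can be verified numerically: for p ≪ mc the relativistic Hamiltonian differs from mc² plus the Newtonian kinetic energy only by a correction of relative order (p/mc)⁴. The sketch uses approximate SI values for the electron mass and the velocity of light:

```python
import math

# Approximate SI values for an electron.
m = 9.109e-31      # kg
c = 2.998e8        # m/s

def H_newton(px, py, pz):
    # Eq. (22): kinetic energy (1/2m)(px^2 + py^2 + pz^2).
    return (px**2 + py**2 + pz**2) / (2 * m)

def H_rel(px, py, pz):
    # Eq. (23): c(m^2 c^2 + px^2 + py^2 + pz^2)^(1/2).
    return c * math.sqrt(m**2 * c**2 + px**2 + py**2 + pz**2)

# For p << mc the two differ only by the rest-energy mc^2.
p = 1e-3 * m * c                      # a slow particle, v ~ c/1000
gap = H_rel(p, 0, 0) - (m * c**2 + H_newton(p, 0, 0))
rel_gap = abs(gap) / (m * c**2)       # fractional size of the correction
```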
We shall here work with the more accurate formula (23). We shall
first solve the Heisenberg equations of motion. From the quantum
conditions (9) of § 21, p_x commutes with p_y and p_z, and hence, from
Theorem 1 of § 19 extended to a set of commuting observables, p_x
commutes with any function of p_x, p_y, and p_z and therefore with H.
It follows that p_x is a constant of the motion. Similarly p_y and p_z are
constants of the motion. These results are the same as in the classical
theory. Again, the equation of motion for a coordinate, x_t say, is,
according to (11),
iℏ ẋ_t = iℏ dx_t/dt = x_t c(m²c² + p_x² + p_y² + p_z²)^{1/2} − c(m²c² + p_x² + p_y² + p_z²)^{1/2} x_t.
The right-hand side here can be evaluated by means of formula
(31) of § 22 with the roles of coordinates and momenta interchanged,
so that it reads
x f(p) − f(p) x = iℏ ∂f/∂p_x, (24)
f now being any function of the p's. This gives
ẋ_t = c p_x(m²c² + p_x² + p_y² + p_z²)^{−1/2} = c²p_x/H. (25)
Similarly,
ẏ_t = c²p_y/H,   ż_t = c²p_z/H.
The magnitude of the velocity is
v = (ẋ_t² + ẏ_t² + ż_t²)^{1/2} = c²(p_x² + p_y² + p_z²)^{1/2}/H. (26)
Equations (25) and (26) are just the same as in the classical theory.
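Equations (25) and (26) imply that the speed c²(p_x²+p_y²+p_z²)^{1/2}/H is always less than c, approaching it for large momenta; a short numerical check (assumed natural units m = c = 1):

```python
import math

m, c = 1.0, 1.0   # natural units, an assumption for the illustration

def velocity(px, py, pz):
    # Eqs. (25)-(26): x-dot = c^2 p_x / H etc., with H given by eq. (23).
    H = c * math.sqrt(m**2 * c**2 + px**2 + py**2 + pz**2)
    return (c**2 * px / H, c**2 * py / H, c**2 * pz / H)

vx, vy, vz = velocity(3.0, 4.0, 0.0)
speed = math.sqrt(vx**2 + vy**2 + vz**2)
# Here |p| = 5, so the speed is c^2 * 5 / H = 5/sqrt(26), a little below c.
```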
Let us consider a state that is an eigenstate of the momenta,
belonging to the eigenvalues p_x′, p_y′, p_z′. This state must be an eigenstate
of the Hamiltonian, belonging to the eigenvalue
H′ = c(m²c² + p_x′² + p_y′² + p_z′²)^{1/2}, (27)
and must therefore be a stationary state. The possible values for H′
are all numbers from mc² to ∞, as in the classical theory. The wave
function ψ(xyz) representing this state at any time in Schroedinger's
representation must satisfy
p_x′ ψ(xyz)⟩ = p_x ψ(xyz)⟩ = −iℏ (∂ψ/∂x)⟩,
with similar equations for p_y and p_z. These equations show that
ψ(xyz) is of the form
ψ(xyz) = a e^{i(p_x′x + p_y′y + p_z′z)/ℏ}, (28)
where a is independent of x, y, and z. From (18) we see now that the
time-dependent wave function ψ(xyzt) is of the form
ψ(xyzt) = a₀ e^{i(p_x′x + p_y′y + p_z′z − H′t)/ℏ}, (29)
where a₀ is independent of x, y, z, and t.
The function (29) of x, y, z, and t describes plane waves in space-time.
We see from this example the suitability of the terms 'wave
function' and 'wave equation'. The frequency of the waves is
ν = H′/h, (30)
their wavelength is
λ = h/(p_x′² + p_y′² + p_z′²)^{1/2} = h/P′, (31)
P′ being the length of the vector (p_x′, p_y′, p_z′), and their motion is in
the direction specified by the vector (p_x′, p_y′, p_z′) with the velocity
λν = H′/P′ = c²/v′, (32)
v′ being the velocity of the particle corresponding to the momentum
(p_x′, p_y′, p_z′) as given by formula (26). Equations (30), (31), and (32)
are easily seen to hold in all Lorentz frames of reference, the expression
on the right-hand side of (29) being, in fact, relativistically
invariant with p_x′, p_y′, p_z′ and H′ as the components of a 4-vector.
These properties of relativistic invariance led de Broglie, before the
discovery of quantum mechanics, to postulate the existence of waves
of the form (29) associated with the motion of any particle. They
are therefore known as de Broglie waves.
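Formulas (30)-(32) can be evaluated for a concrete case. The sketch below takes an electron moving at an illustrative speed of 10⁶ m/s (approximate SI constants) and confirms that the phase velocity λν equals c²/v′, exceeding c even though the particle itself moves slower than light:

```python
import math

# Approximate SI constants.
h = 6.626e-34      # J s
m = 9.109e-31      # kg
c = 2.998e8        # m/s

vprime = 1.0e6                                  # particle speed, example value
gamma = 1.0 / math.sqrt(1.0 - (vprime / c)**2)
P = gamma * m * vprime                          # relativistic momentum P'
E = gamma * m * c**2                            # energy H' of eq. (27)

lam = h / P          # eq. (31): de Broglie wavelength
nu = E / h           # eq. (30): frequency

# Eq. (32): the phase velocity lambda*nu is c^2/v', faster than light,
# while the particle itself moves at v' < c.
phase_velocity = lam * nu
ratio = phase_velocity / (c**2 / vprime)
```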
In the limiting case when the mass m is made to tend to zero, the
classical velocity of the particle v becomes equal to c and hence, from
(32), the wave velocity also becomes c. The waves are then like the
light-waves associated with a photon, with the difference that they
contain no reference to the polarization and involve a complex exponential
instead of sines and cosines. Formulas (30) and (31) are
still valid, connecting the frequency of the light-waves with the
energy of the photon and the wavelength of the light-waves with
the momentum of the photon.
For the state represented by (29), the probability of the particle
being found in any specified small volume when an observation of its
position is made is independent of where the volume is. This provides
an example of Heisenberg's principle of uncertainty, the state being
one for which the momentum is accurately given and for which, in
consequence, the position is completely unknown. Such a state is,
of course, a limiting case which never occurs in practice. The states
usually met with in practice are those represented by wave packets,
which may be formed by superposing a number of waves of the type
(29) belonging to slightly different values of (p_x′, p_y′, p_z′), as discussed
in § 24. The ordinary formula in hydrodynamics for the velocity of
such a wave packet, i.e. the group velocity of the waves, is
dν/d(1/λ), (33)
which gives, from (30) and (31),
dH′/dP′ = cP′(m²c² + P′²)^{−1/2} = c²P′/H′ = v′. (34)
This is just the velocity of the particle. The wave packet moves in
the same direction and with the same velocity as the particle moves
in classical mechanics.
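Equation (34) can be confirmed by differentiating H′(P′) numerically (assumed natural units m = c = 1): the group velocity dH′/dP′ agrees with the particle velocity c²P′/H′ of equation (26):

```python
import math

m, c = 1.0, 1.0   # natural units, an assumption for the illustration

def H_of_P(P):
    # H' = c(m^2 c^2 + P'^2)^(1/2), eq. (27) written in terms of P' = |p'|.
    return c * math.sqrt(m**2 * c**2 + P**2)

P = 0.75
# Group velocity dH'/dP' by a central difference...
dP = 1e-6
group_v = (H_of_P(P + dP) - H_of_P(P - dP)) / (2 * dP)

# ...compared with the particle velocity v' = c^2 P'/H' of eq. (26).
particle_v = c**2 * P / H_of_P(P)
gap = abs(group_v - particle_v)
```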
31. The motion of wave packets
The result just deduced for a free particle is an example of a general
principle. For any dynamical system with a classical analogue, a state
for which the classical description is valid as an approximation is
represented in quantum mechanics by a wave packet, all the coordinates
and momenta having approximate numerical values, whose
accuracy is limited by Heisenberg's principle of uncertainty. Now
Schroedinger's wave equation fixes how such a wave packet varies with
time, so in order that the classical description may remain valid, the
wave packet should remain a wave packet and should move according
to the laws of classical dynamics. We shall verify that this is so.
We take a dynamical system having a classical analogue and let
its Hamiltonian be H(q_r, p_r) (r = 1, 2,..., n). The corresponding classical
dynamical system will have as Hamiltonian H_c(q_r, p_r) say, obtained
by putting ordinary algebraic variables for the q_r and p_r in H(q_r, p_r)
and making ℏ → 0 if it occurs in H(q_r, p_r). The classical Hamiltonian
H_c is, of course, a real function of its variables. It is usually a
quadratic function of the momenta p_r, but not always so, the
relativistic theory of a free particle being an example where it is not.
The following argument is valid for H_c any algebraic function of the p's.
We suppose that the time-dependent wave function in Schroedinger's
representation is of the form
ψ(qt) = A e^{iS/ℏ}, (35)
where A and S are real functions of the q's and t which do not vary
very rapidly with their arguments. The wave function is then of the
form of waves, with A and S determining the amplitude and phase
respectively. Schroedinger's wave equation (7) gives
iℏ (∂/∂t){A e^{iS/ℏ}}⟩ = H(q_r, p_r) A e^{iS/ℏ}⟩
or
{iℏ ∂A/∂t − A ∂S/∂t}⟩ = e^{−iS/ℏ} H(q_r, p_r) A e^{iS/ℏ}⟩. (36)
Now e^{−iS/ℏ} is evidently a unitary linear operator and may be used for
U in equation (70) of § 26 to give us a unitary transformation. The
q's remain unchanged by this transformation, each p_r goes over into
e^{−iS/ℏ} p_r e^{iS/ℏ} = p_r + ∂S/∂q_r,
with the help of (31) of § 22, and H goes over into
e^{−iS/ℏ} H(q_r, p_r) e^{iS/ℏ} = H(q_r, p_r + ∂S/∂q_r),
since algebraic relations are preserved by the transformation. Thus
(36) becomes
{iℏ ∂A/∂t − A ∂S/∂t}⟩ = H(q_r, p_r + ∂S/∂q_r) A⟩. (37)
Let us now suppose that ℏ can be counted as small and let us neglect
terms involving ℏ in (37). This involves neglecting the p_r's that occur
in H in (37), since each p_r is equivalent to the operator −iℏ ∂/∂q_r
operating on the functions of the q's to the right of it. The surviving
terms give
−∂S/∂t = H_c(q_r, ∂S/∂q_r). (38)
This is a differential equation which the phase function S has to
satisfy. The equation is determined by the classical Hamiltonian
function H_c and is known as the Hamilton-Jacobi equation in classical
dynamics. It allows S to be real and so shows that the assumption
of the wave form (35) does not lead to an inconsistency.
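For the free particle the phase of the plane wave (29), S = p′q − H′t, furnishes an explicit solution of (38); a finite-difference check in one dimension (assumed natural units m = c = 1):

```python
import math

m, c = 1.0, 1.0   # natural units, an assumption for the illustration

def Hc(p):
    # Classical relativistic Hamiltonian of the free particle, one dimension.
    return c * math.sqrt(m**2 * c**2 + p**2)

p0 = 0.4                                    # a constant momentum, example value

def S(q, t):
    # Phase function of the plane wave (29): S = p0 q - H' t.
    return p0 * q - Hc(p0) * t

q, t, eps = 1.3, 0.2, 1e-6
dS_dq = (S(q + eps, t) - S(q - eps, t)) / (2 * eps)   # should equal p0
dS_dt = (S(q, t + eps) - S(q, t - eps)) / (2 * eps)   # should equal -H'

# Hamilton-Jacobi equation (38): -dS/dt = Hc(dS/dq).
residual = abs(-dS_dt - Hc(dS_dq))
```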
To obtain an equation for A, we must retain the terms in (37)
which are linear in ℏ and see what they give. A direct evaluation of
these terms is rather awkward in the case of a general function H,
and we can get the result we require more easily by first multiplying
both sides of (37) by the bra vector ⟨Af, where f is an arbitrary real
function of the q's. This gives
⟨Af{iℏ ∂A/∂t − A ∂S/∂t}⟩ = ⟨Af H(q_r, p_r + ∂S/∂q_r) A⟩.
The conjugate complex equation is
⟨{−iℏ ∂A/∂t − A ∂S/∂t} f A⟩ = ⟨A H(q_r, p_r + ∂S/∂q_r) f A⟩.
Subtracting and dividing out by iℏ, we obtain
2⟨Af ∂A/∂t⟩ = ⟨A [f, H(q_r, p_r + ∂S/∂q_r)] A⟩. (39)
We now have to evaluate the P.B.
[f, H(q_r, p_r + ∂S/∂q_r)].
Our assumption that ℏ can be counted as small enables us to expand
H(q_r, p_r + ∂S/∂q_r) as a power series in the p's. The terms of zero degree
will contribute nothing to the P.B. The terms of the first degree in
the p's give a contribution to the P.B. which can be evaluated most
easily with the help of the classical formula (1) of § 21 (this formula
being valid also in the quantum theory if u is independent of the p's
and v is linear in the p's). The amount of this contribution is
Σ_r (∂f/∂q_r)[∂H_c(q_r, p_r)/∂p_r],
the notation meaning that we must substitute ∂S/∂q_r for each p_r in
the function [ ] of the q's and p's, so as to obtain a function of the q's
only. The terms of higher degree in the p's give contributions to the
P.B. which vanish when ℏ → 0. Thus (39) becomes, with neglect of
terms involving ℏ, which is equivalent to the neglect of ℏ² in (37),
2⟨Af ∂A/∂t⟩ = ⟨A Σ_r (∂f/∂q_r)[∂H_c/∂p_r] A⟩. (40)
Now if a(q) and b(q) are any two functions of the q's, formula
(64) of § 20 gives
⟨a(q) b(q)⟩ = ∫ a(q′) b(q′) dq′,
and so
⟨a(q) ∂b(q)/∂q_r⟩ = −⟨(∂a(q)/∂q_r) b(q)⟩, (41)
provided a(q) and b(q) satisfy suitable boundary conditions, as discussed
in §§ 22 and 23. Hence (40) may be written
2⟨Af ∂A/∂t⟩ = −⟨f Σ_r ∂/∂q_r {A²[∂H_c/∂p_r]}⟩.
Since this holds for an arbitrary real function f, we must have
∂A²/∂t = −Σ_r ∂/∂q_r {A²[∂H_c/∂p_r]}. (42)
This is the equation for the amplitude A of the wave function. To
get an understanding of its significance, let us suppose we have a fluid
moving in the space of the variables q, the density of the fluid at any
point and time being A² and its velocity being
dq_r/dt = [∂H_c/∂p_r]. (43)
Equation (42) is then just the equation of conservation for such a
fluid. The motion of the fluid is determined by the function S
satisfying (38), there being one possible motion for each solution
of (38).
For a given S, let us take a solution of (42) for which at some
definite time the density A² vanishes everywhere outside a certain
small region. We may suppose this region to move with the fluid,
its velocity at each point being given by (43), and then the equation
of conservation (42) will require the density always to vanish outside
the region. There is a limit to how small the region may be, imposed
by the approximation we made in neglecting ℏ in (39). This approximation
is valid only provided
ℏ ∂A/∂q_r ≪ (∂S/∂q_r) A,
or
ℏ (∂A/∂q_r)/A ≪ ∂S/∂q_r,
which requires that A shall vary by an appreciable fraction of itself
only through a range of the q's in which S varies by many times ℏ,
i.e. a range consisting of many wavelengths of the wave function (35).
Our solution is then a wave packet of the type discussed in § 24 and
remains so for all time.
We thus get a wave function representing a state of motion for
which the coordinates and momenta have approximate numerical
values throughout all time. Such a state of motion in quantum
theory corresponds to the states with which classical theory deals.
The motion of our wave packet is determined by equations (38) and
(43). From these we get, defining p_r as ∂S/∂q_r,
dp_r/dt = ∂²S/∂q_r∂t + Σ_s (∂²S/∂q_r∂q_s)(dq_s/dt)
= −∂H_c/∂q_r, (44)
where in the last line the p's are counted as independent of the q's
before the partial differentiation. Equations (43) and (44) are just
the classical equations of motion in Hamiltonian form and show that
the wave packet moves according to the laws of classical mechanics.
We see in this way how the classical equations of motion are derivable
from the quantum theory as a limiting case.
By a more accurate solution of the wave equation one can show
that the accuracy with which the coordinates and momenta simultaneously
have numerical values cannot remain permanently as
favourable as the limit allowed by Heisenberg's principle of uncertainty,
equation (56) of § 24, but if it is initially so it will become
less favourable, the wave packet undergoing a spreading.†
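Both conclusions, the classical motion of the packet centre and the slow spreading, can be seen in a direct numerical solution of Schroedinger's wave equation for a free particle. The sketch below (assumed natural units ℏ = m = 1, non-relativistic Hamiltonian p²/2m, arbitrary packet parameters) evolves a Gaussian packet exactly in the momentum representation:

```python
import numpy as np

hbar, m = 1.0, 1.0   # natural units, an assumption for the illustration

# Grid and an initial Gaussian wave packet with mean momentum p0.
N, L = 2048, 200.0
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
dx = x[1] - x[0]
p = 2 * np.pi * hbar * np.fft.fftfreq(N, d=dx)
x0, p0, sigma = -30.0, 2.0, 2.0
psi = np.exp(-(x - x0) ** 2 / (4 * sigma ** 2) + 1j * p0 * x / hbar)
psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * dx)

def evolve(psi, t):
    # Free-particle evolution is exact in the momentum representation:
    # each Fourier component picks up the phase e^{-i p^2 t / (2 m hbar)}.
    phase = np.exp(-1j * p ** 2 * t / (2 * m * hbar))
    return np.fft.ifft(phase * np.fft.fft(psi))

def mean_x(psi):
    return np.sum(x * np.abs(psi) ** 2) * dx

def spread(psi):
    rho = np.abs(psi) ** 2
    mu = np.sum(x * rho) * dx
    return np.sqrt(np.sum((x - mu) ** 2 * rho) * dx)

t = 10.0
psi_t = evolve(psi, t)
classical_x = x0 + (p0 / m) * t          # where classical mechanics puts it
center_error = abs(mean_x(psi_t) - classical_x)
spreading = spread(psi_t) - spread(psi)  # the packet widens with time
```

The packet centre tracks the classical trajectory while the width grows, illustrating both the classical limit and the spreading referred to above.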
32. The action principle‡
Equation (10) shows that the Heisenberg dynamical variables at
time t, v_t, are connected with their values at time t₀, v_{t₀} or v, by a
unitary transformation. The Heisenberg variables at time t+δt are
connected with their values at time t by an infinitesimal unitary
transformation, as is shown by the equation of motion (11) or (13),
which gives the connexion between v_{t+δt} and v_t of the form of (79) or
(80) of § 26 with H_t for F and δt/ℏ for ε. The variation with time of
the Heisenberg dynamical variables may thus be looked upon as the
continuous unfolding of a unitary transformation. In classical
mechanics the dynamical variables at time t+δt are connected with
their values at time t by an infinitesimal contact transformation and
the whole motion may be looked upon as the continuous unfolding of a
contact transformation. We have here the mathematical foundation
of the analogy between the classical and quantum equations of
motion, and can develop it to bring out the quantum analogue of all
the main features of the classical theory of dynamics.
Suppose we have a representation in which the complete set of
commuting observables ξ are diagonal, so that a basic bra is ⟨ξ′|.
We can introduce a second representation in which the basic bras are
⟨ξ′*| = ⟨ξ′|T. (45)
The new basic bras depend on the time t and give us a moving
representation, like a moving system of axes in an ordinary vector
space. Comparing (45) with the conjugate imaginary of (8), we see
that the new basic vectors are just the transforms in the Heisenberg
picture of the original basic vectors in the Schroedinger picture, and
hence they must be connected with the Heisenberg dynamical
† See Kennard, Z. f. Physik, 44 (1927), 344; Darwin, Proc. Roy. Soc. A, 117 (1927), 268.
‡ This section may be omitted by the student who is not specially concerned with
higher dynamics.
variables ξ_t in the same way in which the original basic vectors are
connected with the Schroedinger dynamical variables ξ. In particular,
each ⟨ξ′*| must be an eigenvector of the ξ_t's belonging to the eigenvalues
ξ′. It may therefore be written ⟨ξ′_t|, with the understanding
that the numbers ξ′ are the same eigenvalues of the ξ_t's that the ξ′'s
are of the ξ's. From (45) we get

    ⟨ξ′_t|P⟩ = ⟨ξ′|T|P⟩,   (46)

showing that the transformation function is just the representative
of T|P⟩ in the original representation.
Differentiating (45) with respect to t and using (6), we get

    iħ (d/dt)⟨ξ′_t| = ⟨ξ′_t|H_t

with the help of (12). Multiplying on the right by any ket |a⟩
independent of t, we get

    iħ (d/dt)⟨ξ′_t|a⟩ = ⟨ξ′_t|H_t|a⟩ = ∫ ⟨ξ′_t|H_t|ξ″_t⟩ dξ″ ⟨ξ″_t|a⟩,   (47)

if we take for definiteness the case of continuous eigenvalues for the
ξ's. Now equation (5), written in terms of representatives, reads

    iħ (d/dt)⟨ξ′|Pt⟩ = ∫ ⟨ξ′|H|ξ″⟩ dξ″ ⟨ξ″|Pt⟩.   (48)
Since ⟨ξ′_t|H_t|ξ″_t⟩ is the same function of the variables ξ′_t and ξ″_t that
⟨ξ′|H|ξ″⟩ is of ξ′ and ξ″, equations (47) and (48) are of precisely the
same form, with the variables ξ′_t, ξ″_t in (47) playing the role of the
variables ξ′ and ξ″ in (48) and the function ⟨ξ′_t|a⟩ playing the role
of the function ⟨ξ′|Pt⟩. We can thus look upon (47) as a form of
Schroedinger's wave equation, with the function ⟨ξ′_t|a⟩ of the variables
ξ′_t as the wave function. In this way Schroedinger's wave equation
appears in a new light, as the condition on the representative, in the
moving representation with the Heisenberg variables ξ_t diagonal, of the
fixed ket corresponding to a state in the Heisenberg picture. The function
⟨ξ′_t|a⟩ owes its variation with time to its left factor ⟨ξ′_t|, in contradistinction
to the function ⟨ξ′|Pt⟩, which owes its variation with time
to its right factor |Pt⟩.
If we put |a⟩ = |ξ″⟩ in (47), we get

    iħ (d/dt)⟨ξ′_t|ξ″⟩ = ⟨ξ′_t|H_t|ξ″⟩,   (49)

§ 32 THE ACTION PRINCIPLE
showing that the transformation function ⟨ξ′_t|ξ″⟩ satisfies Schroe-
dinger's wave equation. Now ξ_{t₀} = ξ, so we must have

    ⟨ξ′_{t₀}|ξ″⟩ = δ(ξ′−ξ″),   (50)

the δ function here being understood as the product of a number of
factors, one for each ξ-variable, such as occurs on the right-hand side
of equation (34) of § 16. Thus the
transformation function ⟨ξ′_t|ξ″⟩ is that solution of Schroedinger's wave
equation for which the ξ's certainly have the values ξ″ at time t₀.
The square of its modulus, |⟨ξ′_t|ξ″⟩|², is the relative probability of the
ξ's having the values ξ′ at time t > t₀, if they certainly have the values
ξ″ at time t₀. We may write ⟨ξ′_t|ξ″⟩ as ⟨ξ′_t|ξ″_{t₀}⟩ and consider it as
depending on t₀ as well as on t. To get its dependence on t₀, we take
the conjugate complex of equation (49), interchange t and t₀, and also
interchange single primes and double primes. This gives

    −iħ (d/dt₀)⟨ξ′_t|ξ″_{t₀}⟩ = ⟨ξ′_t|H_{t₀}|ξ″_{t₀}⟩.   (51)
The foregoing discussion of the transformation function ⟨ξ′_t|ξ″_{t₀}⟩ is
valid with the ξ's any complete set of commuting observables. The
equations were written down for the case of the ξ's having continuous
eigenvalues, but they would still be valid if any of the ξ's have
discrete eigenvalues, provided the necessary formal changes are made
in them. Let us now take a dynamical system having a classical
analogue and let us take the ξ's to be the coordinates q. Put

    ⟨q′_t|q″_{t₀}⟩ = e^{iS/ħ}   (52)

and so define the function S of the variables q′_t, q″_{t₀}. This function also
depends explicitly on t. (52) is a solution of Schroedinger's wave
equation and, if ħ can be counted as small, it can be handled in the
same way as (35) was. The S of (52) differs from the S of (35) on
account of there being no A in (52), which makes the S of (52) complex,
but the real part of this S equals the S of (35) and its pure
imaginary part is of the order ħ. Thus, in the limit ħ → 0, the S of
(52) will equal that of (35) and will therefore satisfy, corresponding
to (36),

    ∂S/∂t = −H_c(q′_t, p′_t),   (53)

where

    p′_{tr} = ∂S/∂q′_{tr},   (54)

and H_c is the Hamiltonian of the classical analogue of our quantum
dynamical system. But (52) is also a solution of (51) with q's for ξ's,
which is the conjugate complex of Schroedinger's wave equation in the
variables q″ or q″_{t₀}. This causes S to satisfy also†

    ∂S/∂t₀ = H_c(q″, p″),   (55)

where

    p″_r = −∂S/∂q″_r.   (56)

The solution of the Hamilton-Jacobi equations (53), (55) is the
action function of classical mechanics for the time interval t₀ to t,
i.e. it is the time integral of the Lagrangian L,

    S = ∫_{t₀}^{t} L(t′) dt′.   (57)

Thus the S defined by (52) is the quantum analogue of the classical action
function and equals it in the limit ħ → 0. To get the quantum analogue
of the classical Lagrangian, we pass to the case of an infinitesimal
time interval by putting t = t₀+δt, and we then have ⟨q′_{t₀+δt}|q″_{t₀}⟩ as the
analogue of e^{iL(t₀)δt/ħ}. For the sake of the analogy, one should consider
L(t₀) as a function of the coordinates q′ at time t₀+δt and the coordinates
q″ at time t₀, rather than as a function of the coordinates
and velocities at time t₀, as one usually does.
The principle of least action in classical mechanics says that the
action function (57) remains stationary for small variations of the trajectory
of the system which do not alter the end points, i.e. for small
variations of the q's at all intermediate times between t₀ and t with q_{t₀}
and q_t fixed. Let us see what it corresponds to in the quantum theory.

Put

    exp[ i ∫_{t_a}^{t_b} L(t′) dt′ / ħ ] = exp[ iS(t_b, t_a)/ħ ] = B(t_b, t_a),   (58)

so that B(t_b, t_a) corresponds to ⟨q′_{t_b}|q′_{t_a}⟩ in the quantum theory. (We
here allow q′_{t_a} and q′_{t_b} to denote different eigenvalues of q_{t_a} and q_{t_b}, to
save having to introduce a large number of primes into the analysis.)
Now suppose the time interval t₀ → t to be divided up into a large
number of small time intervals t₀ → t₁, t₁ → t₂, ..., t_{m−1} → t_m, t_m → t, by
the introduction of a sequence of intermediate times t₁, t₂, ..., t_m. Then

    B(t, t₀) = B(t, t_m)B(t_m, t_{m−1}) ⋯ B(t₂, t₁)B(t₁, t₀).   (59)

The corresponding quantum equation, which follows from the property
of basic vectors (35) of § 16, is

    ⟨q′_t|q′_{t₀}⟩ = ∫∫⋯∫ ⟨q′_t|q′_m⟩ dq′_m ⟨q′_m|q′_{m−1}⟩ dq′_{m−1} ⋯ ⟨q′₂|q′₁⟩ dq′₁ ⟨q′₁|q′_{t₀}⟩,
                                                                      (60)
† For a more accurate comparison of transformation functions with classical
theory, see Van Vleck, Proc. Nat. Acad. 14, 178.
q_k being written for q_{t_k} for brevity. At first sight there does not seem
to be any close correspondence between (59) and (60). We must,
however, analyse the meaning of (59) rather more carefully. We must
regard each factor B as a function of the q's at the two ends of the
time interval to which it refers. This makes the right-hand side of
(59) a function, not only of q_t and q_{t₀}, but also of all the intermediate
q's. Equation (59) is valid only when we substitute for the intermediate
q's in its right-hand side their values for the real trajectory,
small variations in which values leave S stationary and therefore also,
from (58), leave B(t, t₀) stationary. It is the process of substituting
these values for the intermediate q's which corresponds to the integrations
over all values for the intermediate q′'s in (60). The quantum
analogue of the action principle is thus absorbed in the composition
law (60) and the classical requirement that the values of the intermediate
q's shall make S stationary corresponds to the condition
in quantum mechanics that all values of the intermediate q′'s
are important in proportion to their contribution to the integral
in (60).
Let us see how (59) can be a limiting case of (60) for ħ small. We
must suppose the integrand in (60) to be of the form e^{iF/ħ}, where F is
a function of q′_{t₀}, q′₁, q′₂, ..., q′_m, q′_t which remains continuous as ħ tends
to zero, so that the integrand is a rapidly oscillating function when
ħ is small. The integral of such a rapidly oscillating function will be
extremely small, except for the contribution arising from a region in
the domain of integration where comparatively large variations in
the q′_k produce only very small variations in F. Such a region must
be the neighbourhood of a point where F is stationary for small variations
of the q′_k. Thus the integral in (60) is determined essentially by
the value of the integrand at a point where the integrand is stationary
for small variations of the intermediate q′'s, and so (60) goes over
into (59).
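This limiting behaviour is easy to check numerically. The sketch below is our own illustration, not part of the text: it takes the simplest stationary-phase model, F(q) = q² in a single integration variable, so that the only stationary point is q = 0 and the limiting value of the integral of e^{iF/ħ} is the Fresnel result √(πħ) e^{iπ/4}, whose modulus shrinks as √ħ.

```python
import numpy as np

# Stationary-phase illustration (ours): F(q) = q^2, stationary point at q = 0.
def oscillatory_integral(hbar, L=8.0, n=160001):
    # Riemann-sum approximation over [-L, L]; the grid is fine enough to
    # resolve the fastest oscillation (local wavelength pi*hbar/L at q = L).
    q = np.linspace(-L, L, n)
    dq = q[1] - q[0]
    return np.sum(np.exp(1j * q**2 / hbar)) * dq

hbar_small = 0.02
I_small = oscillatory_integral(hbar_small)
# Limiting stationary-phase (Fresnel) value: sqrt(pi*hbar) * exp(i*pi/4).
exact = np.sqrt(np.pi * hbar_small) * np.exp(1j * np.pi / 4)
rel_err = abs(I_small - exact) / abs(exact)

# The modulus scales as sqrt(hbar): reducing hbar by a factor 4 halves |I|.
ratio = abs(oscillatory_integral(0.02)) / abs(oscillatory_integral(0.08))
```

The whole value of the integral comes from the neighbourhood of q = 0; away from it the rapid oscillations cancel, which is the one-variable analogue of the cancellation among non-stationary trajectories in (60).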
Equations (54) and (56) express that the variables q′_t, p′_t are con-
nected with the variables q″, p″ by a contact transformation and are
one of the standard forms of writing the equations of a contact transformation.
There is an analogous form for writing the equations of a
unitary transformation in quantum mechanics. We get from (52), with
the help of (45) of § 22,

    ⟨q′_t|p_{tr}|q″⟩ = −iħ (∂/∂q′_{tr})⟨q′_t|q″⟩ = (∂S/∂q′_{tr})⟨q′_t|q″⟩.   (61)
Similarly, with the help of (46) of § 22,

    ⟨q′_t|p_r|q″⟩ = iħ (∂/∂q″_r)⟨q′_t|q″⟩ = −(∂S/∂q″_r)⟨q′_t|q″⟩.   (62)

From the general definition of functions of commuting observables,
we have

    ⟨q′_t|f(q_t)g(q)|q″⟩ = f(q′_t)g(q″)⟨q′_t|q″⟩,   (63)

where f(q_t) and g(q) are functions of the q_t's and q's respectively. Let
G(q_t, q) be any function of the q_t's and q's consisting of a sum or
integral of terms each of the form f(q_t)g(q), so that all the q_t's in G
occur to the left of all the q's. Such a function we call well ordered.
Applying (63) to each of the terms in G and adding or integrating,
we get

    ⟨q′_t|G(q_t, q)|q″⟩ = G(q′_t, q″)⟨q′_t|q″⟩.

Now let us suppose each p_{tr} and p_r can be expressed as a well-ordered
function of the q_t's and q's and write these functions p_{tr}(q_t, q), p_r(q_t, q).
Putting these functions for G, we get

    ⟨q′_t|p_{tr}|q″⟩ = p_{tr}(q′_t, q″)⟨q′_t|q″⟩,
    ⟨q′_t|p_r|q″⟩ = p_r(q′_t, q″)⟨q′_t|q″⟩.

Comparing these equations with (61) and (62) respectively, we see
that

    p_{tr}(q′_t, q″) = ∂S(q′_t, q″)/∂q′_{tr},   p_r(q′_t, q″) = −∂S(q′_t, q″)/∂q″_r.

This means that

    p_{tr} = ∂S(q_t, q)/∂q_{tr},   p_r = −∂S(q_t, q)/∂q_r,   (64)

provided the right-hand sides of (64) are written as well-ordered
functions.

These equations are of the same form as (54) and (56), but refer to
the non-commuting quantum variables q_t, q instead of the ordinary
algebraic variables q′_t, q″. They show how the conditions for a unitary
transformation between quantum variables are analogous to the conditions
for a contact transformation between classical variables. The
analogy is not complete, however, because the classical S must be real
and there is no simple condition corresponding to this for the S of (64).
33. The Gibbs ensemble
In our work up to the present we have been assuming all along that
our dynamical system at each instant of time is in a definite state,
that is to say, its motion is specified as completely and accurately as
is possible without conflicting with the general principles of the theory.
In the classical theory this would mean, of course, that all the coordi-
nates and momenta have specified values. Now we may be interested
in a motion which is specified to a lesser extent than this maximum
possible. The present section will be devoted to the methods to be
used in such a case.
The procedure in classical mechanics is to introduce what is called
a Gibbs ensemble, the idea of which is as follows. We consider all the
dynamical coordinates and momenta as Cartesian coordinates in a
certain space, the phase space, whose number of dimensions is twice
the number of degrees of freedom of the system. Any state of the
system can then be represented by a point in this space. This point
will move according to the classical equations of motion (14). Suppose,
now, that we are not given that the system is in a definite state
at any time, but only that it is in one or other of a number of possible
states according to a definite probability law. We should then be
able to represent it by a fluid in the phase space, the mass of fluid in
any volume of the phase space being the total probability of the
system being in any state whose representative point lies in that
volume. Each particle of the fluid will be moving according to the
equations of motion (14). If we introduce the density ρ of the fluid
at any point, equal to the probability per unit volume of phase space
of the system being in the neighbourhood of the corresponding state,
we shall have the equation of conservation

    dρ/dt = −[ρ, H].   (65)

This may be considered as the equation of motion for the fluid, since
it determines the density ρ for all time if ρ is given initially as a
function of the q's and p's. It is, apart from the minus sign, of the
same form as the ordinary equation of motion (15) for a dynamical
variable.

The requirement that the total probability of the system being in
any state shall be unity gives us a normalizing condition for ρ

    ∫∫ ρ dq dp = 1,   (66)

the integration being over the whole of phase space and the single
differential dq or dp being written to denote the product of all the
dq's or dp's. If β denotes any function of the dynamical variables,
the average value of β will be

    ∫∫ βρ dq dp.   (67)

It makes only a trivial alteration in the theory, but often facilitates
discussion, if we work with a density ρ differing from the above one
by a positive constant factor, k say, so that we have instead of (66)

    ∫∫ ρ dq dp = k.

With this density we can picture the fluid as representing a number
k of similar dynamical systems, all following through their motions
independently in the same place, without any mutual disturbance or
interaction. The density at any point would then be the probable or
average number of systems in the neighbourhood of any state per unit
volume of phase space, and expression (67) would give the average
total value of β for all the systems. Such a set of dynamical systems,
which is the ensemble introduced by Gibbs, is usually not realizable
in practice, except as a rough approximation, but it forms all the
same a useful theoretical abstraction.
We shall now see that there exists a corresponding density ρ
in quantum mechanics, having properties analogous to the above.
It was first introduced by von Neumann. Its existence is rather
surprising in view of the fact that phase space has no meaning in
quantum mechanics, there being no possibility of assigning numerical
values simultaneously to the q's and p's.

We consider a dynamical system which is at a certain time in one
or other of a number of possible states according to some given
probability law. These states may be either a discrete set or a continuous
range, or both together. We shall here take for definiteness
the case of a discrete set and suppose them labelled by a parameter m.
Let the normalized ket vectors corresponding to them be |m⟩ and let
the probability of the system being in the mth state be P_m. We then
define the quantum density ρ by

    ρ = Σ_m |m⟩P_m⟨m|.   (68)

Let ρ′ be any eigenvalue of ρ and |ρ′⟩ an eigenket belonging to this
eigenvalue. Then

    ρ|ρ′⟩ = Σ_m |m⟩P_m⟨m|ρ′⟩ = ρ′|ρ′⟩
so that

    Σ_m ⟨ρ′|m⟩P_m⟨m|ρ′⟩ = ρ′⟨ρ′|ρ′⟩

or

    Σ_m P_m |⟨m|ρ′⟩|² = ρ′⟨ρ′|ρ′⟩.

Now P_m, being a probability, can never be negative. It follows that
ρ′ cannot be negative. Thus ρ has no negative eigenvalues, in analogy
with the fact that the classical density ρ is never negative.
Let us now obtain the equation of motion for our quantum ρ. In
Schroedinger's picture the kets and bras in (68) will vary with the time
in accordance with Schroedinger's equation (5) and the conjugate
imaginary of this equation, while the P_m's will remain constant, since
the system, so long as it is left undisturbed, cannot change over from
a state corresponding to one ket satisfying Schroedinger's equation to
a state corresponding to another. We thus have

    iħ dρ/dt = Σ_m {H|m⟩P_m⟨m| − |m⟩P_m⟨m|H}
             = Hρ − ρH.   (69)

This is the quantum analogue of the classical equation of motion
(65). Our quantum ρ, like the classical one, is determined for all time
if it is given initially.
From the assumption of § 12, the average value of any observable
β when the system is in the state m is ⟨m|β|m⟩. Hence if the system
is distributed over the various states m according to the probability
law P_m, the average value of β will be Σ_m P_m⟨m|β|m⟩. If we introduce
a representation with a discrete set of basic ket vectors |ξ′⟩ say, this
equals

    Σ_m Σ_{ξ′} ⟨ξ′|β|m⟩P_m⟨m|ξ′⟩ = Σ_{ξ′} ⟨ξ′|βρ|ξ′⟩ = Σ_{ξ′} ⟨ξ′|ρβ|ξ′⟩,   (70)

the last step being easily verified with the law of matrix multiplication,
equation (44) of § 17. The expressions (70) are the analogue of
the expression (67) of the classical theory. Whereas in the classical
theory we have to multiply β by ρ and take the integral of the
product over all phase space, in the quantum theory we have to
multiply β by ρ, with the factors in either order, and take the
diagonal sum of the product in a representation. If the representation
involves a continuous range of basic vectors |ξ′⟩, we get instead
of (70)

    ∫ ⟨ξ′|βρ|ξ′⟩ dξ′,   (71)

so that we must carry through a process of 'integrating along the
diagonal' instead of summing the diagonal elements. We shall define
(71) to be the diagonal sum of βρ in the continuous case. It can easily
be verified, from the properties of transformation functions (56) of
§ 18, that the diagonal sum is the same for all representations.

From the condition that the |m⟩'s are normalized we get, with
discrete ξ′'s,

    Σ_{ξ′} ⟨ξ′|ρ|ξ′⟩ = Σ_{ξ′} Σ_m ⟨ξ′|m⟩P_m⟨m|ξ′⟩ = Σ_m P_m = 1,   (72)

since the total probability of the system being in any state is unity.
This is the analogue of equation (66). The probability of the system
being in the state ξ′, or the probability of the observables ξ which
are diagonal in the representation having the values ξ′, is, according
to the rule for interpreting representatives of kets (51) of § 18,

    Σ_m |⟨ξ′|m⟩|² P_m = ⟨ξ′|ρ|ξ′⟩,   (73)

which gives us a meaning for each term in the sum on the left-hand
side of (72). For continuous ξ′'s, the right-hand side of (73) gives the
probability of the ξ's having values in the neighbourhood of ξ′ per
unit range of variation of the values ξ′.
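These properties of ρ are easy to check with small matrices. The sketch below is our own illustration (the kets, the probabilities, and the observable β are arbitrary choices): it builds ρ as in (68), and verifies that ρ has no negative eigenvalues, that its diagonal sum is unity as in (72), and that the average of β may be taken with the factors in either order, as in (70).

```python
import numpy as np

# Toy density (our illustration): three arbitrary normalized kets |m> in a
# 5-dimensional space, combined with probabilities P_m as in (68).
rng = np.random.default_rng(0)
n = 5
kets = []
for _ in range(3):
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    kets.append(v / np.linalg.norm(v))
P = np.array([0.5, 0.3, 0.2])

rho = sum(Pm * np.outer(v, v.conj()) for Pm, v in zip(P, kets))

eig_min = np.linalg.eigvalsh(rho).min()   # rho has no negative eigenvalues
trace_rho = np.trace(rho).real            # diagonal sum = 1, as in (72)

# Average of an arbitrary real symmetric observable beta, as in (70):
beta = rng.normal(size=(n, n))
beta = beta + beta.T
avg_direct = sum(Pm * (v.conj() @ beta @ v).real for Pm, v in zip(P, kets))
avg_br = np.trace(beta @ rho).real        # diagonal sum of beta*rho
avg_rb = np.trace(rho @ beta).real        # the factors in the other order
```

Note that the |m⟩'s need not be mutually orthogonal for any of these properties to hold, exactly as in the text.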
As in the classical theory, we may take a density equal to k times
the above ρ and consider it as representing a Gibbs ensemble of k
similar dynamical systems, between which there is no mutual disturbance
or interaction. We shall then have k on the right-hand side
of (72), and (70) or (71) will give the total average β for all the
members of the ensemble, while (73) will give the total probability
of a member of the ensemble having values for its ξ's equal to ξ′
or in the neighbourhood of ξ′ per unit range of variation of the
values ξ′.

An important application of the Gibbs ensemble is to a dynamical
system in thermodynamic equilibrium with its surroundings at a
given temperature T. Gibbs showed that such a system is represented
in classical mechanics by the density

    ρ = c e^{−H/kT},   (74)

H being the Hamiltonian, which is now independent of the time, k
being Boltzmann's constant, and c being a number chosen to make
the normalizing condition (66) hold. This formula may be taken over
unchanged into the quantum theory. At high temperatures, (74)
becomes ρ = c, which gives, on being substituted into the right-hand
side of (73), c⟨ξ′|ξ′⟩ = c in the case of discrete ξ′'s. This shows that
at high temperatures all discrete states are equally probable.
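A minimal numerical sketch of the quantum form of (74) (our illustration, with ħ = ω = k = 1 and a truncation to 50 harmonic-oscillator levels, all assumed for convenience): ρ is diagonal in the energy representation with entries proportional to e^{−(n+½)/T}, and at a high temperature the populations of neighbouring discrete states become equal, while at a low temperature they do not.

```python
import numpy as np

# Thermal density for a truncated oscillator (our illustration:
# hbar = omega = k = 1, 50 levels kept).
levels = np.arange(50) + 0.5     # eigenvalues (n + 1/2) of H

def populations(T):
    w = np.exp(-levels / T)
    return w / w.sum()           # the constant c fixed by unit diagonal sum

ratio_cold = populations(0.5)[1] / populations(0.5)[0]      # exp(-2), far from 1
ratio_hot = populations(1000.0)[1] / populations(1000.0)[0] # nearly 1
```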
VI
ELEMENTARY APPLICATIONS
34. The harmonic oscillator
A SIMPLE and interesting example of a dynamical system in quantum
mechanics is the harmonic oscillator. This example is of importance
for general theory, because it forms a corner-stone in the theory of
radiation. The dynamical variables needed for describing the system
are just one coordinate q and its conjugate momentum p. The
Hamiltonian in classical mechanics is

    H = (1/2m)(p² + m²ω²q²),   (1)

where m is the mass of the oscillating particle and ω is 2π times the
frequency. We assume the same Hamiltonian in quantum mechanics.
This Hamiltonian, together with the quantum condition (10) of § 22,
define the system completely.

The Heisenberg equations of motion are

    q̇_t = [q_t, H] = p_t/m,
    ṗ_t = [p_t, H] = −mω²q_t.   (2)
It is convenient to introduce the dimensionless complex dynamical
variable

    η = (2mħω)^{−1/2}(p + imωq).   (3)

The equations of motion (2) give

    η̇_t = (2mħω)^{−1/2}(−mω²q_t + iωp_t) = iωη_t.

This equation can be integrated to give

    η_t = η₀ e^{iωt},   (4)

where η₀ is a linear operator independent of t, and is equal to the
value of η_t at time t = 0. The above equations are all as in the
classical theory.

We can express q and p in terms of η and its conjugate complex η̄
and may thus work entirely in terms of η and η̄. We have

    ħω ηη̄ = (2m)^{−1}(p + imωq)(p − imωq)
          = (2m)^{−1}[p² + m²ω²q² + imω(qp − pq)]
          = H − ½ħω   (5)

and similarly

    ħω η̄η = H + ½ħω.   (6)

Thus

    η̄η − ηη̄ = 1.   (7)
Equation (5) or (6) gives H in terms of η and η̄ and (7) gives the
commutation relation connecting η and η̄. From (5)

    ħω η̄ηη̄ = η̄H − ½ħωη̄

and from (6)

    ħω η̄ηη̄ = Hη̄ + ½ħωη̄.

Thus

    η̄H − Hη̄ = ħωη̄.   (8)

Also, (7) leads to

    η̄ηⁿ − ηⁿη̄ = nη^{n−1}   (9)

for any positive integer n, as may be verified by induction, since, by
multiplying (9) by η on the left, we can deduce (9) with n+1 for n.

Let H′ be an eigenvalue of H and |H′⟩ an eigenket belonging to it.
From (5)

    ħω⟨H′|ηη̄|H′⟩ = ⟨H′|H − ½ħω|H′⟩ = (H′ − ½ħω)⟨H′|H′⟩.

Now ⟨H′|ηη̄|H′⟩ is the square of the length of the ket η̄|H′⟩, and
hence

    ⟨H′|ηη̄|H′⟩ ⩾ 0,

the case of equality occurring only if η̄|H′⟩ = 0. Also ⟨H′|H′⟩ > 0.
Thus

    H′ ⩾ ½ħω,   (10)

the case of equality occurring only if η̄|H′⟩ = 0. From the form (1)
of H as a sum of squares, we should expect its eigenvalues to be all
positive or zero (since the average value of H for any state must be
positive or zero). We now have the more stringent condition (10).

From (8)

    Hη̄|H′⟩ = (η̄H − ħωη̄)|H′⟩ = (H′ − ħω)η̄|H′⟩.   (11)

Now if H′ ≠ ½ħω, η̄|H′⟩ is not zero and is then according to (11) an
eigenket of H belonging to the eigenvalue H′ − ħω. Thus, with H′
any eigenvalue of H not equal to ½ħω, H′ − ħω is another eigenvalue
of H. We can repeat the argument and infer that, if H′ − ħω ≠ ½ħω,
H′ − 2ħω is another eigenvalue of H. Continuing in this way, we
obtain the series of eigenvalues H′, H′−ħω, H′−2ħω, H′−3ħω, ...,
which cannot extend to infinity, because then it would contain eigenvalues
contradicting (10), and can terminate only with the value ½ħω.

Again, from the conjugate complex of equation (8)

    Hη|H′⟩ = (ηH + ħωη)|H′⟩ = (H′ + ħω)η|H′⟩,

showing that H′ + ħω is another eigenvalue of H, with η|H′⟩ as an
eigenket belonging to it, unless η|H′⟩ = 0. The latter alternative
can be ruled out, since it would lead to

    0 = ħω η̄η|H′⟩ = (H + ½ħω)|H′⟩ = (H′ + ½ħω)|H′⟩,
which contradicts (10). Thus H′ + ħω is always another eigenvalue
of H, and so are H′ + 2ħω, H′ + 3ħω and so on. Hence the eigenvalues
of H are the series of numbers

    ½ħω, (3/2)ħω, (5/2)ħω, (7/2)ħω, ...,   (12)

extending to infinity. These are the possible energy values for the
harmonic oscillator.
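The ladder argument above can be checked with finite matrices. The sketch below is our own illustration, with m = ω = ħ = 1 and a truncation to N levels (entries near the cut-off are spoiled by the truncation and are not to be trusted): η is built from matrices for q and p exactly as in (3), and the commutation relation (7), the eigenvalues (12), and the normalization ⟨0|η̄ⁿηⁿ|0⟩ = n! derived below are verified on the low-lying states.

```python
import numpy as np

# Finite-matrix check of the ladder argument (our illustration: m = omega =
# hbar = 1, truncation to N levels; the top rows/columns are unreliable).
N = 30
a = np.diag(np.sqrt(np.arange(1.0, N)), k=1)   # lowering matrix: a|n> = sqrt(n)|n-1>
q = (a + a.T) / np.sqrt(2.0)                   # q in these units
p = 1j * (a.T - a) / np.sqrt(2.0)              # p in these units
eta = (p + 1j * q) / np.sqrt(2.0)              # eq. (3), with 2*m*hbar*omega = 2
eta_bar = eta.conj().T                         # conjugate complex of eta
H = (p @ p + q @ q) / 2.0                      # eq. (1)

comm = eta_bar @ eta - eta @ eta_bar           # eq. (7): unity, away from the cut-off
energies = np.sort(np.linalg.eigvalsh(H))      # lowest values: (n + 1/2), as in (12)

vac = np.zeros(N, dtype=complex)
vac[0] = 1.0                                   # lowest eigenket |0>, eta_bar|0> = 0
vec4 = np.linalg.matrix_power(eta, 4) @ vac    # eta^4 |0>
norm_sq_4 = np.vdot(vec4, vec4).real           # square of its length: 4! = 24
```

The truncation shows up only at the edge: the last diagonal element of `comm`, for instance, is not 1, which is the finite-matrix trace of the fact that (7) cannot hold exactly in any finite number of dimensions.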
Let |0⟩ be an eigenket of H belonging to the lowest eigenvalue
½ħω, so that

    η̄|0⟩ = 0,   (13)

and form the sequence of kets

    |0⟩, η|0⟩, η²|0⟩, η³|0⟩, ....   (14)

These kets are all eigenkets of H, belonging to the sequence of eigen-
values (12) respectively. From (9) and (13)

    η̄ηⁿ|0⟩ = nη^{n−1}|0⟩   (15)

for any non-negative integer n. Thus the set of kets (14) is such that
η or η̄ applied to any one of the set gives a ket dependent on the set.
Now all the dynamical variables in our problem are expressible in terms
of η and η̄, so the kets (14) must form a complete set (otherwise there
would be some more dynamical variables). There is just one of these
kets for each eigenvalue (12) of H, so H by itself forms a complete
commuting set of observables. The kets (14) correspond to the various
stationary states of the oscillator. The stationary state with energy
(n + ½)ħω, corresponding to ηⁿ|0⟩, is called the nth quantum state.

The square of the length of the ket ηⁿ|0⟩ is

    ⟨0|η̄ⁿηⁿ|0⟩ = n⟨0|η̄^{n−1}η^{n−1}|0⟩

with the help of (15). By induction, we find that

    ⟨0|η̄ⁿηⁿ|0⟩ = n!   (16)

provided |0⟩ is normalized. Thus the kets (14) multiplied by the
coefficients n!^{−1/2} with n = 0, 1, 2,..., respectively form the basic kets
of a representation, namely the representation with H diagonal. Any
ket |x⟩ can be expanded in the form

    |x⟩ = Σ_{n=0}^{∞} x_n ηⁿ|0⟩,   (17)

where the x_n's are numbers. In this way the ket |x⟩ is put into
correspondence with a power series Σ x_n ηⁿ in the variable η, the
various terms in the power series corresponding to the various
stationary states. If |x⟩ is normalized, it defines a state for which
the probability of the oscillator being in the nth quantum state,
i.e. the probability of H having the value (n + ½)ħω, is

    P_n = n!|x_n|²,   (18)

as follows from the same argument which led to (51) of § 18.

We may consider the ket |0⟩ as a standard ket and the power series
in η as a wave function, since any ket can be expressed as such a
wave function multiplied into this standard ket. The present kind
of wave function differs from the usual kind, introduced by equations
(62) of § 20, in that it is a function of the complex dynamical variable
η instead of observables. It is, however, for many purposes the most
convenient wave function to use for describing states of the harmonic
oscillator. The standard ket |0⟩ satisfies the condition (13), which
replaces the conditions (43) of § 22 for the standard ket in Schroedinger's
representation.

Let us introduce Schroedinger's representation with q diagonal and
obtain the representatives of the stationary states. From (13) and (3)

    (p − imωq)|0⟩ = 0,

so

    ⟨q′|p − imωq|0⟩ = 0.

With the help of (45) of § 22, this gives

    ħ (d/dq′)⟨q′|0⟩ + mωq′⟨q′|0⟩ = 0.   (19)

The solution of this differential equation is

    ⟨q′|0⟩ = (mω/πħ)^{1/4} e^{−mωq′²/2ħ},   (20)

the numerical coefficient being chosen so as to make |0⟩ normalized.
We have here the representative of the normal state, as the state of
lowest energy is called. The representatives of the other stationary
states can be obtained from it. We have from (3)

    ⟨q′|ηⁿ|0⟩ = (2mħω)^{−n/2} ⟨q′|(p + imωq)ⁿ|0⟩
              = (2mħω)^{−n/2} iⁿ (−ħ d/dq′ + mωq′)ⁿ ⟨q′|0⟩
              = iⁿ(2mħω)^{−n/2}(mω/πħ)^{1/4} (−ħ d/dq′ + mωq′)ⁿ e^{−mωq′²/2ħ}.   (21)

This may easily be worked out for small values of n. The result is of
the form of e^{−mωq′²/2ħ} times a power series of degree n in q′. A further
factor n!^{−1/2} must be inserted in (21) to get the normalized representative
of the nth quantum state. The factor iⁿ may be discarded, being
merely a phase factor.
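Equations (19)–(21) can be verified on a grid. The following sketch is our own illustration, again with m = ω = ħ = 1: it checks that the normal-state representative (20) satisfies the differential equation (19) and is normalized, and that applying −ħ d/dq′ + mωq′ with the factor n!^{−1/2}(2mħω)^{−n/2} inserted (and the phase iⁿ discarded) yields a normalized first excited representative orthogonal to the normal state.

```python
import numpy as np

# Grid check of (19)-(21) (our illustration: m = omega = hbar = 1).
qg = np.linspace(-10.0, 10.0, 4001)
dq = qg[1] - qg[0]
psi0 = np.pi**-0.25 * np.exp(-qg**2 / 2)       # eq. (20)

residual = np.gradient(psi0, dq) + qg * psi0   # left-hand side of (19); ~ 0

# n = 1 case of (21), with the normalizing factor 1/sqrt(2) inserted and the
# phase factor i discarded:
psi1 = (-np.gradient(psi0, dq) + qg * psi0) / np.sqrt(2.0)

norm0 = np.sum(psi0**2) * dq                   # normalization of the normal state
norm1 = np.sum(psi1**2) * dq                   # normalization of the first excited state
overlap = np.sum(psi0 * psi1) * dq             # orthogonality
```

Working out the algebra confirms the grid result: psi1 reduces to √2·q′·psi0, which is e^{−q′²/2} times a polynomial of degree 1 in q′, as stated in the text.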
35. Angular momentum
Let us consider a particle described by the three Cartesian coordi-
nates x, y, z and their conjugate momenta p_x, p_y, p_z. Its angular
momentum about the origin is defined as in the classical theory, by

    m_x = yp_z − zp_y,   m_y = zp_x − xp_z,   m_z = xp_y − yp_x,   (22)

or by the vector equation

    m = x×p.

We must evaluate the P.B.s of the angular momentum components
with the dynamical variables x, p_x, etc., and with each other. This
we can do most conveniently with the help of the laws (4) and (5) of
§ 21, thus

    [m_z, x] = [xp_y − yp_x, x] = −y[p_x, x] = y,
    [m_z, y] = [xp_y − yp_x, y] = x[p_y, y] = −x,   (23)

    [m_z, z] = [xp_y − yp_x, z] = 0,   (24)

and similarly,

    [m_z, p_x] = p_y,   [m_z, p_y] = −p_x,   (25)

    [m_z, p_z] = 0,   (26)

with corresponding relations for m_x and m_y. Again

    [m_y, m_z] = [zp_x − xp_z, m_z] = z[p_x, m_z] − [x, m_z]p_z
               = −zp_y + yp_z = m_x,
    [m_z, m_x] = m_y,   [m_x, m_y] = m_z.   (27)

These results are all the same as in the classical theory. The sign in
the results (23), (25), and (27) may easily be remembered from the
rule that the + sign occurs when the three dynamical variables, consisting
of the two in the P.B. on the left-hand side and the one
forming the result on the right, are in the cyclic order (xyz) and the
− sign occurs otherwise. Equations (27) may be put in the vector
form

    m×m = iħm.   (28)

Now suppose we have several particles with angular momenta
m₁, m₂, .... Each of these angular momentum vectors will satisfy
(28), thus

    m_r × m_r = iħm_r,

and any one of them will commute with any other, so that

    m_r × m_s + m_s × m_r = 0   (r ≠ s).
Hence if M = Σ_r m_r is the total angular momentum,

    M×M = Σ_{rs} m_r × m_s = Σ_r m_r × m_r + Σ_{r<s} (m_r × m_s + m_s × m_r)
        = iħ Σ_r m_r = iħM.   (29)

This result is of the same form as (28), so that the components of the
total angular momentum M of any number of particles satisfy the
same commutation relations as those of the angular momentum of
a single particle.
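Relations (28) and (29) can be checked directly on matrices. The sketch below is our own illustration, with ħ = 1 and the 2×2 matrices m = σ/2 standing in for the individual angular momenta, these being a convenient finite set of matrices satisfying (28); the total M for two such commuting sets then satisfies the same relations, as (29) asserts.

```python
import numpy as np

# Matrix check of (28) and (29) (our illustration: hbar = 1, m = sigma/2).
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

def cross_self(mx, my, mz):
    # Components of m x m, e.g. (m x m)_x = m_y m_z - m_z m_y.
    return (my @ mz - mz @ my, mz @ mx - mx @ mz, mx @ my - my @ mx)

cx, cy, cz = cross_self(sx, sy, sz)   # each equals i times sx, sy, sz: eq. (28)

# Total M for two commuting angular momenta:
eye = np.eye(2)
Mx = np.kron(sx, eye) + np.kron(eye, sx)
My = np.kron(sy, eye) + np.kron(eye, sy)
Mz = np.kron(sz, eye) + np.kron(eye, sz)
Cx, Cy, Cz = cross_self(Mx, My, Mz)   # each equals i times Mx, My, Mz: eq. (29)
```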
Let A_x, A_y, A_z denote the three coordinates of any one of the
particles, or else the three components of momentum of one of
the particles. The A's will commute with the angular momenta of
the other particles, and hence from (23), (24), (25), and (26)

    [M_z, A_x] = A_y,   [M_z, A_y] = −A_x,   [M_z, A_z] = 0.   (30)

If B_x, B_y, B_z are a second set of three quantities denoting the
coordinates or momentum components of one of the particles, they
will satisfy similar relations to (30). We shall then have

    [M_z, A_x B_x + A_y B_y + A_z B_z]
      = [M_z, A_x]B_x + A_x[M_z, B_x] + [M_z, A_y]B_y + A_y[M_z, B_y]
          + [M_z, A_z]B_z + A_z[M_z, B_z]
      = A_y B_x + A_x B_y − A_x B_y − A_y B_x
      = 0.

Thus the scalar product A_x B_x + A_y B_y + A_z B_z commutes with M_z,
and similarly with M_x and M_y. Introduce the vector product

    A×B = C,

or

    A_y B_z − A_z B_y = C_x,   A_z B_x − A_x B_z = C_y,   A_x B_y − A_y B_x = C_z.

We have

    [M_z, C_x] = −A_x B_z + A_z B_x = C_y

and similarly

    [M_z, C_y] = −C_x,   [M_z, C_z] = 0.

These equations are again of the form (30), with C for A. We can
conclude from this work that equations of the form (30) hold for the
three components of any vector that we can construct from our
dynamical variables, and that any scalar commutes with M.
We can introduce linear operators R referring to rotations about
the origin in the same way in which we introduced the linear operators
D in § 25 referring to displacements. Taking a rotation through an
angle δφ about the z-axis and making δφ infinitesimal, we can obtain
the limit operator corresponding to (64) of § 25,

    lim_{δφ→0} (R − 1)/δφ,

which we shall call the rotation operator about the z-axis and denote
by r_z. Like the displacement operators, r_z is a pure imaginary linear
operator and is undetermined to the extent of an arbitrary additive
pure imaginary number. Corresponding to (66) of § 25, the change
in any dynamical variable v caused by a rotation through a small
angle δφ about the z-axis is

    δφ(r_z v − vr_z),   (31)

to the first order in δφ. Now the changes produced in the three
components A_x, A_y, A_z of a vector by a (right-handed) rotation δφ
about the z-axis applied to all measuring apparatus are δφA_y,
−δφA_x, and 0 respectively, and any scalar quantity is unchanged by
the rotation. Equating these changes to (31), we find that

    r_z A_x − A_x r_z = A_y,   r_z A_y − A_y r_z = −A_x,
    r_z A_z − A_z r_z = 0,

and r_z commutes with any scalar. Comparing these results with (30),
we see that iħr_z satisfies the same commutation relations as M_z.
Their difference, M_z − iħr_z, commutes with all the dynamical variables
and must therefore be a number. This number, which is necessarily
real since M_z and iħr_z are real, may be made zero by a suitable choice
of the arbitrary pure imaginary number that can be added to r_z. We
then have the result

    M_z = iħr_z.   (32)

Similar equations hold for M_x and M_y. They are the analogues of (69)
of § 25. Thus the total angular momentum is connected with the rotation
operators as the total momentum is connected with the displacement
operators. This conclusion is valid for any point as origin.
The above argument applies to the angular momentum arising
from the motion of particles, defined by (22) for each particle. There
is another kind of angular momentum occurring in atomic theory,
spin angular momentum. The former kind of angular momentum will
be called orbital angukzr momentum, to distinguish it. The spin angular
momentum of a particle should be pictured as due to some internal
motion of the particle, so that it is associated with different degrees
of freedom from those describing the motion of the particle as a whole,
§ 35 ANGULAR MOMENTUM
and hence the dynamical variables that describe the spin must commute
with x, y, z, p_x, p_y, and p_z. The spin does not correspond very
closely to anything in classical mechanics, so the method of classical
analogy is not suitable for studying it. However, we can build up a
theory of the spin simply from the assumption that the components
of the spin angular momentum are connected with the rotation operators
in the same way as we had above for orbital angular momentum,
i.e. equation (32) holds with M_z as the z component of the spin angular
momentum of a particle and r_z as the rotation operator about the
z-axis referring to states of spin of that particle. With this assumption,
the commutation relations connecting the components of the
spin angular momentum M with any vector A referring to the spin
must be of the standard form (30), and hence, taking A to be the
spin angular momentum itself, we have equation (29) holding also
for the spin. We now have (29) holding quite generally, for any sum
of spin and orbital angular momenta, and also (30) will hold generally,
for M the total spin and orbital angular momentum and A any vector
dynamical variable, and the connexion between angular momentum
and rotation operators will be always valid.
As an immediate consequence of this connexion, we can deduce the
law of conservation of angular momentum. For an isolated system, the
Hamiltonian must be unchanged by any rotation about the origin, in
other words it must be a scalar, so it must commute with the angular
momentum about the origin. Thus the angular momentum is a
constant of the motion. For this argument the origin may be any
point.
As a second immediate consequence, we can deduce that a state
with zero total angular momentum is spherically symmetrical. The state
will correspond to a ket |S⟩, say, satisfying
M_x|S⟩ = M_y|S⟩ = M_z|S⟩ = 0,
and hence r_x|S⟩ = r_y|S⟩ = r_z|S⟩ = 0.
This shows that the ket |S⟩ is unaltered by infinitesimal rotations,
and it must therefore be unaltered by finite rotations, since the latter
can be built up from infinitesimal ones. Thus the state is spherically
symmetrical. The converse theorem, a spherically symmetrical state
has zero total angular momentum, is also true, though its proof is not
quite so simple. A spherically symmetrical state corresponds to a ket
|S⟩ whose direction is unaltered by any rotation. Thus the change
in |S⟩ produced by a rotation operator r_x, r_y, or r_z must be a numerical
multiple of |S⟩, say
r_x|S⟩ = c_x|S⟩,  r_y|S⟩ = c_y|S⟩,  r_z|S⟩ = c_z|S⟩,
where the c's are numbers. This gives
M_x|S⟩ = iℏc_x|S⟩,  M_y|S⟩ = iℏc_y|S⟩,  M_z|S⟩ = iℏc_z|S⟩. (33)
These equations are not consistent with the commutation relations
(29) for M_x, M_y, M_z unless c_x = c_y = c_z = 0, in which case the state
has zero total angular momentum. We have in (33) an example of
a ket which is simultaneously an eigenket of the three non-commuting
linear operators M_x, M_y, M_z, and this is possible only if all three
eigenvalues are zero.
36. Properties of angular momentum
There are some general properties of angular momentum, deducible
simply from the commutation relations between the three components.
These properties must hold equally for spin and orbital angular
momentum. Let m_x, m_y, m_z be the three components of an angular
momentum, and introduce the quantity β defined by
β = m_x² + m_y² + m_z².
Since β is a scalar it must commute with m_x, m_y, and m_z. Let us
suppose we have a dynamical system for which m_x, m_y, m_z are the
only dynamical variables. Then β commutes with everything and
must be a number. We can study this dynamical system on much
the same lines as we used for the harmonic oscillator in § 34.
Put m_x − im_y = η.
From the commutation relations (27) we get
η̄η = (m_x + im_y)(m_x − im_y) = m_x² + m_y² − i(m_x m_y − m_y m_x)
   = β − m_z² + ℏm_z (34)
and similarly ηη̄ = β − m_z² − ℏm_z. (35)
Thus η̄η − ηη̄ = 2ℏm_z. (36)
Also m_z η − η m_z = iℏm_y − ℏm_x = −ℏη. (37)
We assume that the components of an angular momentum are
observables and thus m_z has eigenvalues. Let m_z′ be one of them,
and |m_z′⟩ an eigenket belonging to it. From (34)
⟨m_z′|η̄η|m_z′⟩ = ⟨m_z′|β − m_z² + ℏm_z|m_z′⟩ = (β − m_z′² + ℏm_z′)⟨m_z′|m_z′⟩.
The left-hand side here is the square of the length of the ket η|m_z′⟩
and is thus greater than or equal to zero, the case of equality occurring
if and only if η|m_z′⟩ = 0. Hence
β − m_z′² + ℏm_z′ ≥ 0,
or β + ¼ℏ² ≥ (m_z′ − ½ℏ)². (38)
Thus β + ¼ℏ² ≥ 0.
Defining the number k by
k + ½ℏ = (β + ¼ℏ²)^½ = (m_x² + m_y² + m_z² + ¼ℏ²)^½, (39)
so that k ≥ −½ℏ, the inequality (38) becomes
k + ½ℏ ≥ |m_z′ − ½ℏ|
or k + ℏ ≥ m_z′ ≥ −k. (40)
An equality occurs if and only if η|m_z′⟩ = 0. Similarly from (35)
⟨m_z′|ηη̄|m_z′⟩ = (β − m_z′² − ℏm_z′)⟨m_z′|m_z′⟩,
showing that β − m_z′² − ℏm_z′ ≥ 0
or k ≥ m_z′ ≥ −k − ℏ,
with an equality occurring if and only if η̄|m_z′⟩ = 0. This result
combined with (40) shows that k ≥ 0 and
k ≥ m_z′ ≥ −k, (41)
with m_z′ = k if η̄|m_z′⟩ = 0 and m_z′ = −k if η|m_z′⟩ = 0.
From (37)
m_z η|m_z′⟩ = (η m_z − ℏη)|m_z′⟩ = (m_z′ − ℏ)η|m_z′⟩.
Now if m_z′ ≠ −k, η|m_z′⟩ is not zero and is then an eigenket of m_z
belonging to the eigenvalue m_z′ − ℏ. Similarly, if m_z′ − ℏ ≠ −k, m_z′ − 2ℏ
is another eigenvalue of m_z, and so on. We get in this way a series
of eigenvalues m_z′, m_z′−ℏ, m_z′−2ℏ, ..., which must terminate from (41),
and can terminate only with the value −k. Again, from the conjugate
complex of equation (37)
m, rilmL> = (@b+f$) Im;> = (mi+f+j W,
showing that rni+fi is another eigenvalue of m, unless Olms) = 0, in
which case rnz = k. Continuing in this way we get a series of eigenvalues
mL,mL+fi, rnL+%i ,..., which must termirrate from (.41), and
tan terminate only with the value k. We tan conclude that 2k is an
integral multiple of iti and that the eigenvalues of m, are
k, k-4, k-4%, . . . . -k+fi, -k. (42)
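The ladder construction above can be checked numerically. The following sketch (ours, not Dirac's; ℏ = 1 and the label j = k/ℏ are conventions we introduce) builds matrix representations of m_z and the lowering operator η = m_x − im_y, and verifies (36), (37), and the spectrum (42):

```python
import numpy as np

def angular_momentum(j, hbar=1.0):
    """Matrices for m_z and the lowering operator eta = m_x - i*m_y,
    in the basis |j>, |j-1>, ..., |-j> of m_z eigenstates."""
    dim = int(round(2 * j)) + 1
    mvals = hbar * np.array([j - n for n in range(dim)])   # k, k-hbar, ..., -k
    mz = np.diag(mvals)
    # standard lowering amplitude: <m-1| eta |m> = hbar*sqrt(j(j+1) - m(m-1))
    eta = np.zeros((dim, dim))
    for n in range(dim - 1):
        m = j - n
        eta[n + 1, n] = hbar * np.sqrt(j * (j + 1) - m * (m - 1))
    return mz, eta

hbar = 1.0
for j in (0.5, 1.0, 1.5, 2.0):
    mz, eta = angular_momentum(j, hbar)
    etab = eta.conj().T                       # eta-bar, the raising operator
    # (36): eta-bar eta - eta eta-bar = 2 hbar m_z
    assert np.allclose(etab @ eta - eta @ etab, 2 * hbar * mz)
    # (37): m_z eta - eta m_z = -hbar eta
    assert np.allclose(mz @ eta - eta @ mz, -hbar * eta)
    # spectrum (42): k, k-hbar, ..., -k with k = j*hbar
    assert np.allclose(np.sort(np.diag(mz)), np.arange(-j, j + 1) * hbar)
```

The chain terminates at ±k exactly as in the text: η annihilates the lowest basis ket and η̄ the highest.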
The eigenvalues of m_x and m_y are the same, from symmetry. These
eigenvalues are all integral or half odd integral multiples of ℏ, according
to whether 2k is an even or odd multiple of ℏ.
Let |max⟩ be an eigenket of m_z belonging to the maximum eigenvalue
k, so that
η̄|max⟩ = 0, (43)
and form the sequence of kets
|max⟩, η|max⟩, η²|max⟩, ..., η^{2k/ℏ}|max⟩. (44)
These kets are all eigenkets of m_z, belonging to the sequence of eigenvalues
(42) respectively. The set of kets (44) is such that the operator
η applied to any one of them gives a ket dependent on the set (η
applied to the last gives zero), and from (36) and (43) one sees
that η̄ applied to any one of the set also gives a ket dependent on the
set. All the dynamical variables for the system we are now dealing
with are expressible in terms of η and η̄, so the set of kets (44) is a
complete set. There is just one of these kets for each eigenvalue (42)
of m_z, so m_z by itself forms a complete commuting set of observables.
It is convenient to define the magnitude of the angular momentum
vector m to be k, given by (39), rather than β^½, because the possible
values for k are
0, ½ℏ, ℏ, 3ℏ/2, 2ℏ, ..., (45)
extending to infinity, while the possible values for β^½ are a more
complicated set of numbers.
For a dynamical system involving other dynamical variables besides
m_x, m_y, and m_z, there may be variables that do not commute with β.
Then β is no longer a number, but a general linear operator. This
happens for any orbital angular momentum (22), as x, y, z, p_x, p_y, and
p_z do not commute with β. We shall assume that β is always an
observable, and k can then be defined by (39) with the positive square
root function and is also an observable. We shall call k so defined
the magnitude of the angular momentum vector m in the general
case. The above analysis by which we obtained the eigenvalues of
m_z is still valid if we replace |m_z′⟩ by a simultaneous eigenket |k′m_z′⟩
of the commuting observables k and m_z, and leads to the result that
the possible eigenvalues for k are the numbers (45), and for each
eigenvalue k′ of k the eigenvalues of m_z are the numbers (42) with k′
substituted for k. We have here an example of a phenomenon which
we have not met with previously, namely that with two commuting
observables, the eigenvalues of one depend on what eigenvalue we
assign to the other. This phenomenon may be understood as the two
observables being not altogether independent, but partially functions
of one another. The number of independent simultaneous eigenkets
of k and m_z belonging to the eigenvalues k′ and m_z′ must be independent
of m_z′, since for each independent |k′m_z′⟩ we can obtain an
independent |k′m_z″⟩, for any m_z″ in the sequence (42), by multiplying
|k′m_z′⟩ by a suitable power of η or η̄.
As an example let us consider a dynamical system with two angular
momenta m_1 and m_2, which commute with one another. If there are
no other dynamical variables, then all the dynamical variables commute
with the magnitudes k_1 and k_2 of m_1 and m_2, so k_1 and k_2 are
numbers. However, the magnitude K of the resultant angular
momentum M = m_1+m_2 is not a number (it does not commute
with the components of m_1 and m_2) and it is interesting to work out
the eigenvalues of K. This can be done most simply by a method
of counting independent kets. There is one independent simultaneous
eigenket of m_1z and m_2z belonging to any eigenvalue m_1z′ having one
of the values k_1, k_1−ℏ, k_1−2ℏ, ..., −k_1 and any eigenvalue m_2z′ having one
of the values k_2, k_2−ℏ, k_2−2ℏ, ..., −k_2, and this ket is an eigenket
of M_z belonging to the eigenvalue M_z′ = m_1z′+m_2z′. The possible
values of M_z′ are thus k_1+k_2, k_1+k_2−ℏ, k_1+k_2−2ℏ, ..., −k_1−k_2, and
the number of times each of them occurs is given by the following
scheme (if we assume for definiteness that k_1 ≥ k_2),
k_1+k_2,  k_1+k_2−ℏ,  k_1+k_2−2ℏ,  ...,  k_1−k_2,  k_1−k_2−ℏ,  ...
   1,         2,           3,       ...,  2k_2/ℏ+1,  2k_2/ℏ+1,  ...   (46)
 ...,  −k_1+k_2,  −k_1+k_2−ℏ,  ...,  −k_1−k_2
 ...,  2k_2/ℏ+1,   2k_2/ℏ,     ...,     1
Now each eigenvalue K′ of K will be associated with the eigenvalues
K′, K′−ℏ, K′−2ℏ, ..., −K′ for M_z, with the same number of independent
simultaneous eigenkets of K and M_z for each of them. The total
number of independent eigenkets of M_z belonging to any eigenvalue
M_z′ must be the same, whether we take them to be simultaneous
eigenkets of m_1z and m_2z or simultaneous eigenkets of K and M_z, i.e.
it is always given by the scheme (46). It follows that the eigenvalues
for K are
k_1+k_2,  k_1+k_2−ℏ,  k_1+k_2−2ℏ,  ...,  k_1−k_2, (47)
and that for each of these eigenvalues for K and an eigenvalue for
M_z going with it there is just one independent simultaneous eigenket
of K and M_z.
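The counting argument behind scheme (46) is easy to mechanize. A sketch (ℏ = 1; the function name and the peeling strategy are ours): tally the multiplicity of each M_z′ = m_1z′ + m_2z′, then repeatedly strip off the multiplet headed by the largest remaining M_z′.

```python
from collections import Counter
from fractions import Fraction as F

def coupled_k_values(k1, k2):
    """Eigenvalues of K for M = m1 + m2, found by counting kets (hbar = 1)."""
    k1, k2 = F(k1), F(k2)
    # multiplicity of each eigenvalue M_z' = m1z' + m2z'
    counts = Counter(k1 - i + k2 - j
                     for i in range(int(2 * k1) + 1)
                     for j in range(int(2 * k2) + 1))
    ks = []
    while counts:
        top = max(counts)          # largest remaining M_z' heads a new multiplet K'
        ks.append(top)
        m = top
        while m >= -top:           # remove one ket for each M_z' in that multiplet
            counts[m] -= 1
            if counts[m] == 0:
                del counts[m]
            m -= 1
    return ks

# k1 = 2, k2 = 1 gives K = 3, 2, 1; two spins 1/2 give K = 1, 0
assert coupled_k_values(2, 1) == [3, 2, 1]
assert coupled_k_values(F(1, 2), F(1, 2)) == [1, 0]
```

Each strip removes one ket for every M_z′ between K′ and −K′, exactly as in the comparison of the two complete sets of eigenkets in the text.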
The effect of rotations on eigenkets of angular momentum variables
should be noted. Take any eigenket |M_z′⟩ of the z component of total
angular momentum for any dynamical system, and apply to it a small
rotation through an angle δφ about the z-axis. It will change into
(1 + δφ r_z)|M_z′⟩ = (1 − iδφ M_z/ℏ)|M_z′⟩
with the help of (32). This equals
(1 − iδφ M_z′/ℏ)|M_z′⟩
to the first order in δφ. Thus |M_z′⟩ gets multiplied by the numerical
factor e^{−iδφM_z′/ℏ}. By applying a succession of these small rotations, we
find that the application of a finite rotation through an angle φ about
the z-axis causes |M_z′⟩ to get multiplied by e^{−iφM_z′/ℏ}. Putting φ = 2π,
we find that an application of one revolution about the z-axis leaves
|M_z′⟩ unchanged if the eigenvalue M_z′ is an integral multiple of ℏ and
causes |M_z′⟩ to change sign if M_z′ is half an odd integral multiple of ℏ.
Now consider an eigenket |K′⟩ of the magnitude K of the total angular
momentum. If the eigenvalue K′ is an integral multiple of ℏ, the
possible eigenvalues of M_z are all integral multiples of ℏ and the application
of one revolution about the z-axis must leave |K′⟩ unchanged.
Conversely, if K′ is half an odd integral multiple of ℏ, the possible eigenvalues
of M_z are all half odd integral multiples of ℏ and the revolution
must change the sign of |K′⟩. From symmetry, the application of a
revolution about any other axis must have the same effect on |K′⟩
as one about the z-axis. We thus get the general result, the application
of one revolution about any axis leaves a ket unchanged or changes its
sign according to whether it belongs to eigenvalues of the magnitude of
the total angular momentum which are integral or half odd integral
multiples of ℏ. A state, of course, is always unaffected by the revolution,
since a state is unaffected by a change of sign of the ket corresponding
to it.
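The numerical factor e^{−iφM_z′/ℏ} can be tabulated directly; a minimal sketch (ℏ = 1, function name ours):

```python
import numpy as np

# Phase factor picked up by |M_z'> under a rotation through phi about the z-axis
def rotation_factor(phi, m, hbar=1.0):
    return np.exp(-1j * phi * m / hbar)

two_pi = 2 * np.pi
# integral multiple of hbar: ket unchanged after one revolution
assert np.isclose(rotation_factor(two_pi, 1.0), 1.0)
# half odd integral multiple: ket changes sign
assert np.isclose(rotation_factor(two_pi, 0.5), -1.0)
# a 4*pi rotation restores the half-integral ket as well
assert np.isclose(rotation_factor(2 * two_pi, 0.5), 1.0)
```

The last line illustrates the closing remark: the state (ray) is unaffected even when the ket changes sign.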
For a dynamical system involving only orbital angular momenta,
a ket must be unchanged by a revolution about an axis, since we can
set up Schrödinger's representation, with the coordinates of all the
particles diagonal, and the Schrödinger representative of a ket will
get brought back to its original value by the revolution. It follows
that the eigenvalues of the magnitude of an orbital angular momentum
are always integral multiples of ℏ. The eigenvalues of a component
of an orbital angular momentum are also always integral multiples
of ℏ. For a spin angular momentum, Schrödinger's representation
does not exist and both kinds of eigenvalue are possible.
37. The spin of the electron
Electrons, and also some of the other fundamental particles (protons,
neutrons) have a spin whose magnitude is ½ℏ. This is found
from experimental evidence, and also there are theoretical reasons
showing that this spin value is more elementary than any other, even
spin zero (see Chapter XI). The study of this particular spin is therefore
of special importance.
For dealing with an angular momentum m whose magnitude is ½ℏ,
it is convenient to put
m = ½ℏσ. (48)
The components of the vector σ then satisfy, from (27),
σ_y σ_z − σ_z σ_y = 2iσ_x,
σ_z σ_x − σ_x σ_z = 2iσ_y, (49)
σ_x σ_y − σ_y σ_x = 2iσ_z.
The eigenvalues of m_z are ½ℏ and −½ℏ, so the eigenvalues of σ_z are 1
and −1, and σ_z² has just the one eigenvalue 1. It follows that σ_z² must
equal 1, and similarly for σ_x² and σ_y², i.e.
σ_x² = σ_y² = σ_z² = 1. (50)
We can get equations (49) and (50) into a simpler form by means of
some straightforward non-commutative algebra. From (50)
σ_y²σ_z − σ_z σ_y² = 0
or σ_y(σ_y σ_z − σ_z σ_y) + (σ_y σ_z − σ_z σ_y)σ_y = 0
or σ_y σ_x + σ_x σ_y = 0
with the help of the first of equations (49). This means σ_x σ_y = −σ_y σ_x.
Two dynamical variables or linear operators like these which satisfy
the commutative law of multiplication except for a minus sign will
be said to anticommute. Thus σ_x anticommutes with σ_y. From symmetry
each of the three dynamical variables σ_x, σ_y, σ_z must anticommute
with any other. Equations (49) may now be written
σ_y σ_z = iσ_x = −σ_z σ_y,
σ_z σ_x = iσ_y = −σ_x σ_z, (51)
σ_x σ_y = iσ_z = −σ_y σ_x,
and also from (50)
σ_x σ_y σ_z = i. (52)
Equations (50), (51), (52) are the fundamental equations satisfied by
the spin variables σ describing a spin whose magnitude is ½ℏ.
Let us set up a matrix representation for the σ's and let us take σ_z
to be diagonal. If there are no other independent dynamical variables
besides the m's or σ's in our dynamical system, then σ_z by itself forms
a complete set of commuting observables, since the form of equations
(50) and (51) is such that we cannot construct out of σ_x, σ_y, and σ_z
any new dynamical variable that commutes with σ_z. The diagonal
elements of the matrix representing σ_z being the eigenvalues 1 and
−1 of σ_z, the matrix itself will be
( 1   0 )
( 0  −1 ).
Let σ_x be represented by
( a_1  a_2 )
( a_3  a_4 ).
This matrix must be Hermitian, so that a_1 and a_4 must be real and
a_2 and a_3 conjugate complex numbers. The equation σ_z σ_x = −σ_x σ_z
gives us
( a_1   a_2 )   ( −a_1   a_2 )
( −a_3  −a_4 ) = ( −a_3   a_4 ),
so that a_1 = a_4 = 0. Hence σ_x is represented by a matrix of the form
( 0    a_2 )
( a_3  0  ).
The equation σ_x² = 1 now shows that a_2 a_3 = 1. Thus a_2 and a_3, being
conjugate complex numbers, must be of the form e^{iα} and e^{−iα} respectively,
where α is a real number, so that σ_x is represented by a
matrix of the form
( 0       e^{iα} )
( e^{−iα}  0     ).
Similarly it may be shown that σ_y is also represented by a matrix of
this form. By suitably choosing the phase factors in the representation,
which is not completely determined by the condition that σ_z
shall be diagonal, we can arrange that σ_x shall be represented by the
matrix
( 0  1 )
( 1  0 ).
The representative of σ_y is then determined by the equation
σ_y = iσ_x σ_z. We thus obtain finally the three matrices
( 0  1 )   ( 0  −i )   ( 1   0 )
( 1  0 ),  ( i   0 ),  ( 0  −1 ) (53)
to represent σ_x, σ_y, and σ_z respectively, which matrices satisfy all the
algebraic relations (49), (50), (51), (52). The component of the vector
σ in an arbitrary direction specified by the direction cosines l, m, n,
namely lσ_x + mσ_y + nσ_z, is represented by
( n      l−im )
( l+im    −n ). (54)
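With the standard matrices of (53) these relations are quickly verified; a numerical sketch (ours, ℏ = 1):

```python
import numpy as np

# The representation (53) with sigma_z diagonal (the standard Pauli matrices)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# (50): each component squares to 1
for s in (sx, sy, sz):
    assert np.allclose(s @ s, np.eye(2))
# (51): products and anticommutation, e.g. sigma_x sigma_y = i sigma_z = -sigma_y sigma_x
assert np.allclose(sx @ sy, 1j * sz) and np.allclose(sx @ sy, -(sy @ sx))
assert np.allclose(sy @ sz, 1j * sx) and np.allclose(sz @ sx, 1j * sy)
# (52): sigma_x sigma_y sigma_z = i
assert np.allclose(sx @ sy @ sz, 1j * np.eye(2))
# (49): commutation relations, e.g. sigma_y sigma_z - sigma_z sigma_y = 2i sigma_x
assert np.allclose(sy @ sz - sz @ sy, 2j * sx)
```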
The representative of a ket vector will consist of just two numbers,
corresponding to the two values +1 and −1 for σ_z′. These two numbers
form a function of the variable σ_z′ whose domain consists of only
the two points +1 and −1. The state for which σ_z has the value unity
will be represented by the function, f_α(σ_z′) say, consisting of the pair
of numbers 1, 0 and that for which σ_z has the value −1 will be
represented by the function, f_β(σ_z′) say, consisting of the pair 0, 1.
Any function of the variable σ_z′, i.e. any pair of numbers, can be
expressed as a linear combination of these two. Thus any state can
be obtained by superposition of the two states for which σ_z equals +1 and
−1 respectively. For example, the state for which the component of
σ in the direction l, m, n, represented by (54), has the value +1 is
represented by the pair of numbers a, b which satisfy
na + (l−im)b = a,
(l+im)a − nb = b.
Thus a/b = (l−im)/(1−n) = (1+n)/(l+im).
This state can be regarded as a superposition of the two states for
which σ_z equals +1 and −1, the relative weights in the superposition
process being as
|a|² : |b|² = |l−im|² : (1−n)² = 1+n : 1−n. (55)
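Equation (55) can be confirmed by diagonalizing the matrix (54); a sketch with an arbitrarily chosen unit direction (l, m, n) = (0.36, 0.48, 0.80):

```python
import numpy as np

# a sample unit vector of direction cosines (l, m, n)
l, m, n = 0.36, 0.48, 0.80
assert np.isclose(l * l + m * m + n * n, 1.0)

# the matrix (54) representing the spin component along (l, m, n)
comp = np.array([[n, l - 1j * m],
                 [l + 1j * m, -n]])
vals, vecs = np.linalg.eigh(comp)
assert np.allclose(sorted(vals), [-1.0, 1.0])      # eigenvalues are -1 and +1

# eigenvector for eigenvalue +1: weights |a|^2 : |b|^2 = (1+n) : (1-n), eq. (55)
plus = vecs[:, np.argmax(vals)]
a2, b2 = abs(plus[0]) ** 2, abs(plus[1]) ** 2
assert np.isclose(a2 / b2, (1 + n) / (1 - n))
```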
For the complete description of an electron (or other elementary
particle with spin ½ℏ) we require the spin dynamical variables σ,
whose connexion with the spin angular momentum is given by (48),
together with the Cartesian coordinates x, y, z and momenta p_x, p_y,
p_z. The spin dynamical variables commute with these coordinates
and momenta. Thus a complete set of commuting observables for a
system consisting of a single electron will be x, y, z, σ_z. In a representation
in which these are diagonal, the representative of any state
will be a function of four variables x′, y′, z′, σ_z′. Since σ_z′ has a domain
consisting of only two points, namely 1 and −1, this function of four
variables is the same as two functions of three variables, namely the
two functions
⟨x′y′z′|⟩₊ = ⟨x′, y′, z′, +1|⟩,  ⟨x′y′z′|⟩₋ = ⟨x′, y′, z′, −1|⟩. (56)
Thus the presence of the spin may be considered either as introducing a
new variable into the representative of a state or as giving this representative
two components.
38. Motion in a central field of force
An atom consists of a massive positively charged nucleus together
with a number of electrons moving round, under the influence of the
attractive force of the nucleus and their own mutual repulsions. An
exact treatment of this dynamical system is a very difficult mathematical
problem. One can, however, gain some insight into the main
features of the system by making the rough approximation of regarding
each electron as moving independently in a certain central field
of force, namely that of the nucleus, assumed fixed, together with
some kind of average of the forces due to the other electrons. Thus
our present problem of the motion of a particle in a central field of
force forms a corner-stone in the theory of the atom.
Let the Cartesian coordinates of the particle, referred to a system
of axes with the centre of force as origin, be x, y, z and the corresponding
components of momentum p_x, p_y, p_z. The Hamiltonian,
with neglect of relativistic mechanics, will be of the form
H = 1/2m · (p_x² + p_y² + p_z²) + V, (57)
where V, the potential energy, is a function only of (x²+y²+z²). To
develop the theory it is convenient to introduce polar dynamical
variables. We introduce first the radius r, defined as the positive
square root
r = (x² + y² + z²)^½.
Its eigenvalues go from 0 to ∞. If we evaluate its P.B.s with p_x, p_y,
and p_z, we obtain, with the help of formula (32) of § 22,
[r, p_x] = x/r,  [r, p_y] = y/r,  [r, p_z] = z/r,
the same as in the classical theory. We introduce also the dynamical
variable p_r defined by
rp_r = xp_x + yp_y + zp_z. (58)
Its P.B. with r is given by
r[r, p_r] = [r, rp_r] = [r, xp_x + yp_y + zp_z]
= x[r, p_x] + y[r, p_y] + z[r, p_z]
= x·x/r + y·y/r + z·z/r = r.
Hence [r, p_r] = 1
or rp_r − p_r r = iℏ.
The commutation relation between r and p_r is just the one for a
canonical coordinate and momentum, namely equation (10) of § 22.
This makes p_r like the momentum conjugate to the r coordinate, but
it is not exactly equal to this momentum because it is not real, its
conjugate complex being
p̄_r = (p_x x + p_y y + p_z z)r⁻¹ = (xp_x + yp_y + zp_z − 3iℏ)r⁻¹
= (rp_r − 3iℏ)r⁻¹ = p_r − 2iℏr⁻¹. (59)
Thus p_r − iℏr⁻¹ is real and is the true momentum conjugate to r.
The angular momentum m of the particle about the origin is given
by (22) and its magnitude k is given by (39). Since r and p_r are
scalars, they commute with m, and therefore also with k.
We can express the Hamiltonian in terms of r, p_r, and k. We have,
if Σ denotes a sum over cyclic permutations of the suffixes x, y, z,
k(k+ℏ) = Σ m_z² = Σ (xp_y − yp_x)²
= Σ (xp_y xp_y + yp_x yp_x − xp_y yp_x − yp_x xp_y)
= Σ (x²p_y² + y²p_x² − xp_x p_y y − yp_y p_x x + x²p_x² − xp_x p_x x − 2iℏxp_x)
= (x²+y²+z²)(p_x²+p_y²+p_z²) − (xp_x+yp_y+zp_z)(p_x x + p_y y + p_z z + 2iℏ)
= r²(p_x²+p_y²+p_z²) − rp_r(p̄_r r + 2iℏ)
= r²(p_x²+p_y²+p_z²) − r²p̄_r p_r,
from (59). Hence
H = 1/2m · (p̄_r p_r + k(k+ℏ)/r²) + V. (60)
This form for H is such that k commutes not only with H, as is
necessary since k is a constant of the motion, but also with every
dynamical variable occurring in H, namely r, p_r, and V, which is a
function of r. In consequence, a simple treatment becomes possible,
namely, we may consider an eigenstate of k belonging to an eigenvalue
k′ and then we can substitute k′ for k in (60) and get a problem
in one degree of freedom r.
Let us introduce Schrödinger's representation with x, y, z diagonal.
Then p_x, p_y, p_z are equal to the operators −iℏ∂/∂x, −iℏ∂/∂y, −iℏ∂/∂z
respectively. A state is represented by a wave function ψ(xyzt) satisfying
Schrödinger's wave equation (7) of § 27, which now reads, with
H given by (57),
iℏ ∂ψ/∂t = {−ℏ²/2m · (∂²/∂x² + ∂²/∂y² + ∂²/∂z²) + V}ψ. (61)
We may pass from the Cartesian coordinates x, y, z to the polar
coordinates r, θ, φ by means of the equations
x = r sinθ cosφ,
y = r sinθ sinφ, (62)
z = r cosθ,
and may express the wave function in terms of the polar coordinates,
so that it reads ψ(rθφt). The equations (62) give the operator equation
∂/∂r = (∂x/∂r)∂/∂x + (∂y/∂r)∂/∂y + (∂z/∂r)∂/∂z
= (x/r)∂/∂x + (y/r)∂/∂y + (z/r)∂/∂z,
which shows, on being compared with (58), that p_r = −iℏ ∂/∂r. Thus
Schrödinger's wave equation reads, with the form (60) for H,
iℏ ∂ψ/∂t = {−ℏ²/2m · (1/r)(∂²/∂r²)r + k(k+ℏ)/2mr² + V}ψ. (63)
1 1
Here k is a certain linear Operator which, since it commutes with r
and a/ar, tan involve only 6, #, a/8, and a/a+. From the formula
w+w = m~+??g+?n~, (64)
which Comes from (39), and from (62) one tan work out the form of
k(k+fi) and one finds
W+fi) 1 asinoa 1 a2
--@---=---
sin 8 ae ----
ae sin29 ap ' (65)
This Operator is well known in mathematical physics. Its eigenfunctions
are called sphericul harmonics and its eigenvalues are
n(n,+l) where n is an integer. Thus the theory of spherical harmonics
provides an alternative proof that the eigenvalues of k are
integral multiples of $.
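The eigenvalue property quoted for (65) can be spot-checked by finite differences on the φ-independent part of the operator, using the order-1 and order-2 harmonics cos θ and 3cos²θ − 1 (the grid sizes below are arbitrary choices of ours):

```python
import numpy as np

def legendre_operator(f, theta):
    """Apply -(1/sin t) d/dt (sin t df/dt), the phi-independent part of (65),
    by central differences on a uniform theta grid."""
    g = np.sin(theta) * np.gradient(f, theta)        # sin t * df/dt
    return -np.gradient(g, theta) / np.sin(theta)

theta = np.linspace(0.05, np.pi - 0.05, 2001)        # stay clear of the poles
for n, f in [(1, np.cos(theta)), (2, 3 * np.cos(theta) ** 2 - 1)]:
    lf = legendre_operator(f, theta)
    # interior points should satisfy the eigenvalue relation with value n(n+1)
    err = np.max(np.abs(lf[5:-5] - n * (n + 1) * f[5:-5]))
    assert err < 1e-3
```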
For an eigenstate of k belonging to the eigenvalue nℏ (n a non-negative
integer) the wave function will be of the form
ψ = r⁻¹χ(rt)S_n(θφ), (66)
where S_n(θφ) satisfies
k(k+ℏ)S_n = n(n+1)ℏ²S_n, (67)
i.e. from (65) S_n is a spherical harmonic of order n. The factor r⁻¹
is inserted in (66) for convenience. Substituting (66) into (63), we
get as the equation for χ
iℏ ∂χ/∂t = {−ℏ²/2m · (∂²/∂r² − n(n+1)/r²) + V}χ. (68)
If the state is a stationary state belonging to the energy value H′,
χ will be of the form
χ(rt) = χ₀(r)e^{−iH′t/ℏ}
and (68) will reduce to
H′χ₀ = {−ℏ²/2m · (d²/dr² − n(n+1)/r²) + V}χ₀. (69)
This equation may be used to determine the energy-levels H′ of the
system. For each solution χ₀ of (69), arising from a given n, there
will be 2n+1 independent states, because there are 2n+1 independent
solutions of (67) corresponding to the 2n+1 different values
that a component of the angular momentum, m_z say, can take on.
The probability of the particle being in an element of volume
dxdydz is proportional to |ψ|²dxdydz. With ψ of the form (66) this
becomes r⁻²|χ|²|S_n|²dxdydz. The probability of the particle being in
a spherical shell between r and r+dr is then proportional to |χ|²dr.
It now becomes clear that, in solving equation (68) or (69), we must
impose a boundary condition on the function χ at r = 0, namely the
function must be such that the integral to the origin ∫₀ |χ|² dr is
convergent. If this integral were not convergent, the wave function
would represent a state for which the chances are infinitely in favour
of the particle being at the origin and such a state would not be
physically admissible.
The boundary condition at r = 0 obtained by the above consideration
of probabilities is, however, not sufficiently stringent. We get a
more stringent condition by verifying that the wave function obtained
by solving the wave equation in polar coordinates (63) really satisfies
the wave equation in Cartesian coordinates (61). Let us take the case
of V = 0, giving us the problem of the free particle. Applied to a
stationary state with energy H′ = 0, equation (61) gives
∇²ψ = 0, (70)
where ∇² is written for the Laplacian operator ∂²/∂x² + ∂²/∂y² + ∂²/∂z²,
and equation (63) gives
{(1/r)(∂²/∂r²)r − k(k+ℏ)/ℏ²r²}ψ = 0. (71)
A solution of (71) for k = 0 is ψ = r⁻¹. This does not satisfy
(70), since, although ∇²r⁻¹ vanishes for any finite value of r, its integral
through a volume containing the origin is −4π (as may be verified
by transforming this volume integral to a surface integral by means
of Gauss's theorem), and hence
∇²r⁻¹ = −4π δ(x)δ(y)δ(z). (72)
Thus not every solution of (71) gives a solution of (70), and more
generally, not every solution of (63) is a solution of (61). We must
impose on the solution of (63) the condition that it shall not tend to
infinity as rapidly as r⁻¹ when r → 0 in order that, when substituted
into (61), it shall not give a δ function on the right like the right-hand
side of (72). Only when equation (63) is supplemented with this condition
does it become equivalent to equation (61). We thus have the
boundary condition rψ → 0 or χ → 0 as r → 0.
There are also boundary conditions for the wave function at r = ∞.
If we are interested only in 'closed' states, i.e. states for which the
particle does not go off to infinity, we must restrict the integral to
infinity ∫^∞ |χ(r)|² dr to be convergent. These closed states, however,
are not the only ones that are physically permissible, as we can also
have states in which the particle arrives from infinity, is scattered
by the central field of force, and goes off to infinity again. For these
states the wave function may remain finite as r → ∞. Such states will
be dealt with in Chapter VIII under the heading of collision problems.
In any case the wave function must not tend to infinity as r → ∞, or
it will represent a state that has no physical meaning.
39. Energy-levels of the hydrogen atom
The above analysis may be applied to the problem of the hydrogen
atom with neglect of relativistic mechanics and the spin of the
electron. The potential energy V is now† −e²/r, so that equation
(69) becomes
d²χ₀/dr² − n(n+1)χ₀/r² + 2me²χ₀/ℏ²r = −2mH′χ₀/ℏ². (73)
A thorough investigation of this equation has been given by Schrödinger.‡
We shall here obtain its eigenvalues H′ by an elementary
argument.
It is convenient to put
χ₀ = f(r)e^{−r/a}, (74)
introducing the new function f(r), where a is one or other of the
square roots
a = ±√(−ℏ²/2mH′). (75)
Equation (73) now becomes
d²f/dr² − (2/a)df/dr − n(n+1)f/r² + 2me²f/ℏ²r = 0. (76)
We look for a solution of this equation in the form of a power series
f(r) = Σ_s c_s r^s, (77)
in which consecutive values for s differ by unity although these
values themselves need not be integers. On substituting (77) in (76)
we obtain
Σ_s c_s{s(s−1)r^{s−2} − (2s/a)r^{s−1} − n(n+1)r^{s−2} + (2me²/ℏ²)r^{s−1}} = 0,
which gives, on equating to zero the coefficient of r^{s−2}, the following
relation between successive coefficients c_s,
c_s[s(s−1) − n(n+1)] = c_{s−1}[2(s−1)/a − 2me²/ℏ²]. (78)
We saw in the preceding section that only those eigenfunctions χ
are allowed that tend to zero with r and hence, from (74), f(r) must
tend to zero with r. The series (77) must therefore terminate on the
side of small s and the minimum value of s must be greater than zero.
Now the only possible minimum values of s are those that make the
coefficient of c_s in (78) vanish, i.e. n+1 and −n, and the second
of these is negative or zero. Thus the minimum value of s must be
n+1. Since n is always an integer, the values of s will all be integers.
† The e here, denoting minus the charge on an electron, is, of course, to be distinguished
from the e denoting the base of exponentials.
‡ Schrödinger, Ann. d. Physik, 79 (1926), 361.
The series (77) will in general extend to infinity on the side of large s.
For large values of s the ratio of successive terms is
c_s r^s / c_{s−1} r^{s−1} = 2r/sa
according to (78). Thus the series (77) will always converge, as the
ratios of the higher terms to one another are the same as for the series
Σ_s (1/s!)(2r/a)^s, (79)
which converges to e^{2r/a}.
We must now examine how our solution χ₀ behaves for large
values of r. We must distinguish between the two cases of H′ positive
and H′ negative. For H′ negative, a given by (75) will be real. Suppose
we take the positive value for a. Then as r → ∞ the sum of the
series (77) will tend to infinity according to the same law as the sum
of the series (79), i.e. the law e^{2r/a}. Thus, from (74), χ₀ will tend to
infinity according to the law e^{r/a} and will not represent a physically
possible state. There is therefore in general no permissible solution
of (73) for negative values of H′. An exception arises, however, whenever
the series (77) terminates on the side of large s, in which case the
boundary conditions are all satisfied. The condition for this termination
of the series is that the coefficient of c_{s−1} in (78) shall vanish for
some value of the suffix s−1 not less than its minimum value n+1,
which is the same as the condition that
s/a − me²/ℏ² = 0
for some integer s not less than n+1. With the help of (75) this
condition becomes
H′ = −me⁴/2s²ℏ², (80)
and is thus a condition for the energy-level H′. Since s may be any
positive integer, the formula (80) gives a discrete set of negative
energy-levels for the hydrogen atom. These are in agreement with
experiment. For each of them (except the lowest one s = 1) there
are several independent states, as there are various possible values
for n, namely any positive or zero integer less than s. This multiplicity
of states belonging to an energy-level is in addition to that
mentioned in the preceding section arising from the various possible
values for a component of angular momentum, which latter multiplicity
occurs with any central field of force. The n multiplicity occurs
only with an inverse square law of force and even then is removed
when one takes relativistic mechanics into account, as will be found
in Chapter XI. The solution χ₀ of (73) when H′ satisfies (80) tends to
zero exponentially as r → ∞ and thus represents a closed state (corresponding
to an elliptic orbit in Bohr's theory).
For any positive value of H′, a given by (75) will be pure imaginary.
The series (77), which is like the series (79) for large r, will now have a
sum that remains finite as r → ∞. Thus χ₀ given by (74) will now remain
finite as r → ∞ and will therefore be a permissible solution of (73),
giving a wave function ψ that tends to zero according to the law r⁻¹ as
r → ∞. Hence in addition to the discrete set of negative energy-levels
(80), all positive energy-levels are allowed. The states of positive
energy are not closed, since for them the integral to infinity ∫^∞ |χ₀|² dr
does not converge. (These states correspond to the hyperbolic orbits
of Bohr's theory.)
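Formula (80) can be recovered numerically by diagonalizing the radial operator of (69) with V = −e²/r on a finite-difference grid. A sketch in atomic units (ℏ = m = e = 1, so (80) reads H′ = −1/2s²; the grid parameters are arbitrary choices of ours):

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

def hydrogen_levels(n, rmax=150.0, npts=3000, count=3):
    """Lowest eigenvalues of -1/2 d^2/dr^2 + n(n+1)/(2r^2) - 1/r (a.u.),
    acting on chi_0 with chi_0(0) = chi_0(rmax) = 0, cf. eq. (73)."""
    r = np.linspace(0, rmax, npts + 2)[1:-1]       # interior grid points
    h = r[1] - r[0]
    # tridiagonal finite-difference Hamiltonian
    diag = 1.0 / h**2 + n * (n + 1) / (2 * r**2) - 1.0 / r
    off = -0.5 / h**2 * np.ones(npts - 1)
    return eigh_tridiagonal(diag, off, eigvals_only=True,
                            select='i', select_range=(0, count - 1))

for n in (0, 1):
    levels = hydrogen_levels(n)
    # (80): H' = -1/(2 s^2) with s = n+1, n+2, ...
    exact = [-0.5 / s**2 for s in range(n + 1, n + 4)]
    assert np.allclose(levels, exact, atol=2e-3)
```

The lowest level for each n appears at s = n+1, reproducing both the spectrum and the minimum-s condition of the series argument.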
40. Selection rules
If a dynamical system is set up in a certain stationary state, it will
remain in that stationary state so long as it is not acted upon by
outside forces. Any atomic system in practice, however, frequently
gets acted upon by external electromagnetic fields, under whose
influence it is liable to cease to be in one stationary state and to make
a transition to another. The theory of such transitions will be developed
in §§ 44 and 45. A result of this theory is that, to a high degree
of accuracy, transitions between two states cannot occur under the
influence of electromagnetic radiation if, in a Heisenberg representation
with these two stationary states as two of the basic states, the
matrix element, referring to these two states, of the representative
of the total electric displacement D of the system vanishes. Now it
happens for many atomic systems that the great majority of the
matrix elements of D in a Heisenberg representation do vanish, and
hence there are severe limitations on the possibilities for transitions.
The rules that express these limitations are called selection rules.
The idea of selection rules tan be refined by a more detailed
application of the theory of $5 44 and 45, according to which
the matrix elements of the different Cartesian components of the
vector D are associated with different states of polarization of the
160 ELEMENTARY APPLICATIONS § 40
electromagnetic radiation. The nature of this association is just what one would get if one considered the matrix elements, or rather their real parts, as the amplitudes of harmonic oscillators which interact with the field of radiation according to classical electrodynamics.

There is a general method for obtaining all selection rules, as follows. Let us call the constants of the motion which are diagonal in the Heisenberg representation α's and let D be one of the Cartesian components of D. We must obtain an algebraic equation connecting D and the α's which does not involve any dynamical variables other than D and the α's and which is linear in D. Such an equation will be of the form
$\sum_r f_r D g_r = 0, \qquad (81)$
where the $f_r$'s and $g_r$'s are functions of the α's only. If this equation is expressed in terms of representatives, it gives us
$\sum_r f_r(\alpha')\,\langle\alpha'|D|\alpha''\rangle\, g_r(\alpha'') = 0$
or
$\langle\alpha'|D|\alpha''\rangle \sum_r f_r(\alpha')\, g_r(\alpha'') = 0,$
which shows that $\langle\alpha'|D|\alpha''\rangle = 0$ unless
$\sum_r f_r(\alpha')\, g_r(\alpha'') = 0. \qquad (82)$
This last equation, giving the connexion which must exist between α' and α'' in order that $\langle\alpha'|D|\alpha''\rangle$ may not vanish, constitutes the selection rule, so far as the component D of D is concerned.
Our work on the harmonic oscillator in § 34 provides an example of a selection rule. Equation (8) is of the form (81) with $\bar\eta$ for D and H playing the part of the α's, and it shows that the matrix elements $\langle H'|\bar\eta|H''\rangle$ of $\bar\eta$ all vanish except those for which $H'' - H' = \hbar\omega$. The conjugate complex of this result is that the matrix elements $\langle H'|\eta|H''\rangle$ of $\eta$ all vanish except those for which $H'' - H' = -\hbar\omega$. Since q is a numerical multiple of $\bar\eta - \eta$, its matrix elements $\langle H'|q|H''\rangle$ all vanish except those for which $H'' - H' = \pm\hbar\omega$. If the harmonic oscillator carries an electric charge, its electric displacement D will be proportional to q. The selection rule is then that only those transitions can take place in which the energy H changes by a single quantum $\hbar\omega$.
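This selection rule is easy to check numerically; here is a minimal sketch (a truncated oscillator with $\hbar\omega$ scaled out; not from the text) confirming that the only non-vanishing matrix elements of q connect adjacent energy-levels.

```python
import numpy as np

# Build the matrix of q for a truncated harmonic oscillator in the energy
# representation and confirm that the only non-vanishing matrix elements
# connect adjacent energy-levels, so that H changes by exactly one quantum
# hbar*omega in any dipole transition.
N = 8                             # truncation size; the pattern is N-independent
n = np.arange(1, N)
a = np.diag(np.sqrt(n), k=1)      # annihilation operator, <n-1|a|n> = sqrt(n)
q = a + a.T                       # q is proportional to (a + a-dagger)

rows, cols = np.nonzero(~np.isclose(q, 0.0))
assert np.all(np.abs(rows - cols) == 1)   # selection rule: N changes by +-1
```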
We shall now obtain the selection rules for $m_z$ and k for an electron moving in a central field of force. The components of electric displacement are here proportional to the Cartesian coordinates x, y, z.
Taking first $m_z$, we have that $m_z$ commutes with z, or that
$m_z z - z m_z = 0.$
This is an equation of the required type (81), giving us the selection rule
$m_z' - m_z'' = 0$
for the z-component of the displacement. Again, from equations (23) we have
$[m_z, [m_z, x]] = [m_z, y] = -x,$
or
$m_z^2 x - 2 m_z x m_z + x m_z^2 - \hbar^2 x = 0,$
which is also of the type (81) and gives us the selection rule
$m_z'^2 - 2 m_z' m_z'' + m_z''^2 - \hbar^2 = 0$
or
$(m_z' - m_z'' - \hbar)(m_z' - m_z'' + \hbar) = 0$
for the x-component of the displacement. The selection rule for the y-component is the same. Thus our selection rules for $m_z$ are that in transitions associated with radiation with a polarization corresponding to an electric dipole in the z-direction, $m_z$ cannot change, while in transitions associated with a polarization corresponding to an electric dipole in the x-direction or y-direction, $m_z$ must change by $\pm\hbar$.
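As an illustrative sketch (with ℏ set to 1 and a hypothetical run of $m_z$ eigenvalues; not from the text), the general condition (82) applied to the x-component equation picks out exactly the transitions in which $m_z$ changes by one unit of ℏ:

```python
import numpy as np

# Apply the selection-rule condition (m' - m'' - hbar)(m' - m'' + hbar) = 0
# to every ordered pair of m_z eigenvalues and confirm that only transitions
# with m_z changing by +-hbar survive.
hbar = 1.0
m_values = hbar * np.arange(-2, 3)     # eigenvalues of m_z, multiples of hbar

def allowed(mp, mpp):
    # the x- (or y-) component condition from the text
    return np.isclose((mp - mpp - hbar) * (mp - mpp + hbar), 0.0)

pairs = [(mp, mpp) for mp in m_values for mpp in m_values if allowed(mp, mpp)]
assert all(abs(mp - mpp) == hbar for mp, mpp in pairs)
```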
We can determine more accurately the state of polarization of the radiation associated with a transition in which $m_z$ changes by $\pm\hbar$ by considering the condition for the non-vanishing of matrix elements of $x+iy$ and $x-iy$. We have
$[m_z, x+iy] = y - ix = -i(x+iy)$
or
$m_z(x+iy) - (x+iy)(m_z+\hbar) = 0,$
which is again of the type (81). It gives
$m_z' - m_z'' - \hbar = 0$
as the condition that $\langle m_z'|x+iy|m_z''\rangle$ shall not vanish. Similarly,
$m_z' - m_z'' + \hbar = 0$
is the condition that $\langle m_z'|x-iy|m_z''\rangle$ shall not vanish. Hence
$\langle m_z'|x-iy|m_z'-\hbar\rangle = 0,$
or
$\langle m_z'|x|m_z'-\hbar\rangle = i\,\langle m_z'|y|m_z'-\hbar\rangle = (a+ib)e^{i\omega t}$
say, a, b, and ω being real. The conjugate complex of this is
$\langle m_z'-\hbar|x|m_z'\rangle = -i\,\langle m_z'-\hbar|y|m_z'\rangle = (a-ib)e^{-i\omega t}.$
Thus the vector $\tfrac{1}{2}\{\langle m_z'|D|m_z'-\hbar\rangle + \langle m_z'-\hbar|D|m_z'\rangle\}$, which determines
the state of polarization of the radiation associated with transitions for which $m_z'' = m_z' - \hbar$, has the following three components:
$\tfrac{1}{2}\{\langle m_z'|x|m_z'-\hbar\rangle + \langle m_z'-\hbar|x|m_z'\rangle\} = \tfrac{1}{2}\{(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\cos\omega t - b\sin\omega t,$
$\tfrac{1}{2}\{\langle m_z'|y|m_z'-\hbar\rangle + \langle m_z'-\hbar|y|m_z'\rangle\} = \tfrac{1}{2}i\{-(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\sin\omega t + b\cos\omega t, \qquad (83)$
$\tfrac{1}{2}\{\langle m_z'|z|m_z'-\hbar\rangle + \langle m_z'-\hbar|z|m_z'\rangle\} = 0.$
From the form of these components we see that the associated radiation moving in the z-direction will be circularly polarized, that moving in any direction in the xy-plane will be linearly polarized in this plane, and that moving in intermediate directions will be elliptically polarized. The direction of circular polarization for radiation moving in the z-direction will depend on whether ω is positive or negative, and this will depend on which of the two states $m_z'$ or $m_z'' = m_z' - \hbar$ has the greater energy.
We shall now determine the selection rule for k. We have
$[k(k+\hbar), z] = [m_x^2, z] + [m_y^2, z]$
$= -y m_x - m_x y + x m_y + m_y x$
$= 2(m_y x - m_x y + i\hbar z)$
$= 2(m_y x - y m_x) = 2(x m_y - m_x y).$
Similarly,
$[k(k+\hbar), x] = 2(y m_z - m_y z)$
and
$[k(k+\hbar), y] = 2(m_x z - x m_z).$
Hence
$[k(k+\hbar), [k(k+\hbar), z]]$
$= 2[k(k+\hbar),\; m_y x - m_x y + i\hbar z]$
$= 2 m_y [k(k+\hbar), x] - 2 m_x [k(k+\hbar), y] + 2i\hbar [k(k+\hbar), z]$
$= 4 m_y (y m_z - m_y z) - 4 m_x (m_x z - x m_z) + 2\{k(k+\hbar)z - z k(k+\hbar)\}$
$= 4(m_x x + m_y y + m_z z) m_z - 4(m_x^2 + m_y^2 + m_z^2) z + 2\{k(k+\hbar)z - z k(k+\hbar)\}.$
From (22)
$m_x x + m_y y + m_z z = 0 \qquad (84)$
and hence
$[k(k+\hbar), [k(k+\hbar), z]] = -2\{k(k+\hbar)z + z k(k+\hbar)\},$
which gives
$k^2(k+\hbar)^2 z - 2k(k+\hbar)\, z\, k(k+\hbar) + z k^2(k+\hbar)^2 - 2\hbar^2\{k(k+\hbar)z + z k(k+\hbar)\} = 0. \qquad (85)$
Similar equations hold for x and y. These equations are of the required type (81), and give us the selection rule
$k'^2(k'+\hbar)^2 - 2k'(k'+\hbar)\,k''(k''+\hbar) + k''^2(k''+\hbar)^2 - 2\hbar^2 k'(k'+\hbar) - 2\hbar^2 k''(k''+\hbar) = 0,$
which reduces to
$(k'+k''+2\hbar)(k'+k'')(k'-k''+\hbar)(k'-k''-\hbar) = 0.$
A transition can take place between two states k' and k'' only if one of these four factors vanishes.
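The reduction to four factors can be checked by computer algebra; a sketch with sympy (writing A = k'(k'+ℏ) and B = k''(k''+ℏ), so that the selection-rule expression is (A−B)² − 2ℏ²(A+B)):

```python
import sympy as sp

# Verify that the quartic selection-rule expression for k factorizes into
# (k'+k''+2h)(k'+k'')(k'-k''+h)(k'-k''-h), with h standing for hbar.
kp, kpp, hbar = sp.symbols("kp kpp hbar")
A = kp * (kp + hbar)
B = kpp * (kpp + hbar)
lhs = (A - B) ** 2 - 2 * hbar**2 * (A + B)
rhs = (kp + kpp + 2 * hbar) * (kp + kpp) * (kp - kpp + hbar) * (kp - kpp - hbar)
assert sp.expand(lhs - rhs) == 0    # identical polynomials
```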
Now the first of the factors, $(k'+k''+2\hbar)$, can never vanish, since the eigenvalues of k are all positive or zero. The second, $(k'+k'')$, can vanish only if k' = 0 and k'' = 0. But transitions between two states with these values for k cannot occur on account of other selection rules, as may be seen from the following argument. If two states (labelled respectively with a single prime and a double prime) are such that k' = 0 and k'' = 0, then from (41) and the corresponding results for $m_x$ and $m_y$, $m_x' = m_y' = m_z' = 0$ and $m_x'' = m_y'' = m_z'' = 0$. The selection rule for $m_z$ now shows that the matrix elements of x and y referring to the two states must vanish, as the value of $m_z$ does not change during the transition, and the similar selection rule for $m_x$ or $m_y$ shows that the matrix element of z also vanishes. Thus transitions between the two states cannot occur. Our selection rule for k now reduces to
$(k'-k''+\hbar)(k'-k''-\hbar) = 0,$
showing that k must change by $\pm\hbar$. This selection rule may be written
$k'^2 - 2k'k'' + k''^2 - \hbar^2 = 0,$
and since this is the condition that a matrix element $\langle k'|z|k''\rangle$ shall not vanish, we get the equation
$k^2 z - 2kzk + zk^2 - \hbar^2 z = 0$
or
$[k, [k, z]] = -z, \qquad (86)$
a result which could not easily be obtained in a more direct way.
As a final example we shall obtain the selection rule for the magnitude K of the total angular momentum M of a general atomic system. Let x, y, z be the coordinates of one of the electrons. We must obtain the condition that the (K', K'') matrix element of x, y, or z shall not vanish. This is evidently the same as the condition that the (K', K'') matrix element of $\lambda_1$, $\lambda_2$, or $\lambda_3$ shall not vanish, where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are any three independent linear functions of x, y, and z with numerical coefficients, or more generally with any coefficients that commute with K and are thus represented by matrices which are diagonal with respect to K. Let
$\lambda_0 = M_x x + M_y y + M_z z,$
$\lambda_x = M_y z - M_z y - i\hbar x,$
$\lambda_y = M_z x - M_x z - i\hbar y,$
$\lambda_z = M_x y - M_y x - i\hbar z.$
We have
$M_x \lambda_x + M_y \lambda_y + M_z \lambda_z = \sum (M_y M_z - M_z M_y - i\hbar M_x)\, x = 0 \qquad (87)$
from (29). Thus $\lambda_x$, $\lambda_y$, and $\lambda_z$ are not linearly independent functions of x, y, and z. Any two of them, however, together with $\lambda_0$ are three linearly independent functions of x, y, and z and may be taken as the above $\lambda_1$, $\lambda_2$, $\lambda_3$, since the coefficients $M_x$, $M_y$, $M_z$ all commute with K. Our problem thus reduces to finding the condition that the (K', K'') matrix elements of $\lambda_0$, $\lambda_x$, $\lambda_y$, and $\lambda_z$ shall not vanish. The physical meanings of these λ's are that $\lambda_0$ is proportional to the component of the vector (x, y, z) in the direction of the vector M, and $\lambda_x$, $\lambda_y$, $\lambda_z$ are proportional to the Cartesian components of the component of (x, y, z) perpendicular to M.
Since $\lambda_0$ is a scalar it must commute with K. It follows that only the diagonal elements $\langle K'|\lambda_0|K'\rangle$ of $\lambda_0$ can differ from zero, so the selection rule is that K cannot change so far as $\lambda_0$ is concerned. Applying (30) to the vector $\lambda_x$, $\lambda_y$, $\lambda_z$, we have
$[M_z, \lambda_x] = \lambda_y, \qquad [M_z, \lambda_y] = -\lambda_x, \qquad [M_z, \lambda_z] = 0.$
These relations between $M_z$ and $\lambda_x$, $\lambda_y$, $\lambda_z$ are of exactly the same form as the relations (23), (24) between $m_z$ and x, y, z, and also (87) is of the same form as (84). The dynamical variables $\lambda_x$, $\lambda_y$, $\lambda_z$ thus have the same properties relative to the angular momentum M as x, y, z have relative to m. The deduction of the selection rule for k when the electric displacement is proportional to (x, y, z) can therefore be taken over and applied to the selection rule for K when the electric displacement is proportional to $(\lambda_x, \lambda_y, \lambda_z)$. We find in this way that, so far as $\lambda_x$, $\lambda_y$, $\lambda_z$ are concerned, the selection rule for K is that it must change by $\pm\hbar$.

Collecting results, we have as the selection rule for K that it must change by 0 or $\pm\hbar$. We have considered the electric displacement
produced by only one of the electrons, but the same selection rule must hold for each electron and thus also for the total electric displacement.
41. The Zeeman effect for the hydrogen atom
We shall now consider the system of a hydrogen atom in a uniform magnetic field. The Hamiltonian (57) with $V = -e^2/r$, which describes the hydrogen atom in no external field, gets modified by the magnetic field, the modification, according to classical mechanics, consisting in the replacement of the components of momentum, $p_x$, $p_y$, $p_z$, by $p_x + e/c \cdot A_x$, $p_y + e/c \cdot A_y$, $p_z + e/c \cdot A_z$, where $A_x$, $A_y$, $A_z$ are the components of the vector potential describing the field. For a uniform field of magnitude $\mathcal{H}$ in the direction of the z-axis we may take $A_x = -\tfrac{1}{2}\mathcal{H}y$, $A_y = \tfrac{1}{2}\mathcal{H}x$, $A_z = 0$. The classical Hamiltonian will then be
$H = \frac{1}{2m}\Bigl\{\Bigl(p_x - \frac{e\mathcal{H}}{2c}y\Bigr)^2 + \Bigl(p_y + \frac{e\mathcal{H}}{2c}x\Bigr)^2 + p_z^2\Bigr\} - \frac{e^2}{r}.$
This classical Hamiltonian may be taken over into the quantum theory if we add on to it a term giving the effect of the spin of the electron. According to experimental evidence and according to the theory of Chapter XI, the electron has a magnetic moment $-e\hbar/2mc \cdot \sigma$, where σ is the spin vector of § 37. The energy of this magnetic moment in the magnetic field will be $e\hbar\mathcal{H}/2mc \cdot \sigma_z$. Thus the total quantum Hamiltonian will be
$H = \frac{1}{2m}\Bigl\{\Bigl(p_x - \frac{e\mathcal{H}}{2c}y\Bigr)^2 + \Bigl(p_y + \frac{e\mathcal{H}}{2c}x\Bigr)^2 + p_z^2\Bigr\} - \frac{e^2}{r} + \frac{e\hbar\mathcal{H}}{2mc}\sigma_z. \qquad (88)$
There ought strictly to be other terms in this Hamiltonian giving the interaction of the magnetic moment of the electron with the electric field of the nucleus of the atom, but this effect is small, of the same order of magnitude as the correction one gets by taking relativistic mechanics into account, and will be neglected here. It will be taken into account in the relativistic theory of the electron given in Chapter XI.

If the magnetic field is not too large, we can neglect terms involving $\mathcal{H}^2$, so that the Hamiltonian (88) reduces to
$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - \frac{e^2}{r} + \frac{e\mathcal{H}}{2mc}(m_z + \hbar\sigma_z). \qquad (89)$
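The reduction from (88) to (89) can be sketched with computer algebra, treating all quantities as commuting classical symbols (adequate here, since div A = 0 for this choice of A, so the cross terms p·A and A·p agree): the part of (88) linear in the field strength is exactly (eℋ/2mc)·m_z with m_z = x p_y − y p_x.

```python
import sympy as sp

# Expand the kinetic part of (88) and extract the term linear in the field
# strength Hf; it should be e/(2mc) * (x*py - y*px), the m_z coupling of (89).
x, y, px, py, pz, e, c, m, Hf, r = sp.symbols("x y px py pz e c m Hf r")
H88 = ((px - e*Hf*y/(2*c))**2 + (py + e*Hf*x/(2*c))**2 + pz**2) / (2*m) - e**2/r
linear_term = sp.expand(H88).coeff(Hf, 1)
assert sp.simplify(linear_term - e*(x*py - y*px)/(2*m*c)) == 0
```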
The extra terms due to the magnetic field are now $e\mathcal{H}/2mc \cdot (m_z + \hbar\sigma_z)$. But these extra terms commute with the total Hamiltonian and are thus constants of the motion. This makes the problem very easy. The stationary states of the system, i.e. the eigenstates of the Hamiltonian (89), will be those eigenstates of the Hamiltonian for no field that are simultaneously eigenstates of the observables $m_z$ and $\sigma_z$, or at least of the one observable $m_z + \hbar\sigma_z$, and the energy-levels of the system will be those for the system with no field, given by (80) if one considers only closed states, increased by an eigenvalue of $e\mathcal{H}/2mc \cdot (m_z + \hbar\sigma_z)$. Thus stationary states of the system with no field for which $m_z$ has the numerical value $m_z'$, an integral multiple of $\hbar$, and for which also $\sigma_z$ has the numerical value $\sigma_z' = \pm 1$, will still be stationary states when the field is applied. Their energy will be increased by an amount consisting of the sum of two parts, a part $e\mathcal{H}/2mc \cdot m_z'$ arising from the orbital motion, which part may be considered as due to an orbital magnetic moment $-e m_z'/2mc$, and a part $e\hbar\mathcal{H}/2mc \cdot \sigma_z'$ arising from the spin. The ratio of the orbital magnetic moment to the orbital angular momentum $m_z'$ is $-e/2mc$, which is half the ratio of the spin magnetic moment to the spin angular momentum. This fact is sometimes referred to as the magnetic anomaly of the spin.

Since the energy-levels now involve $m_z$, the selection rule for $m_z$ obtained in the preceding section becomes capable of direct comparison with experiment. We take a Heisenberg representation in which, among other constants of the motion, $m_z$ and $\sigma_z$ are diagonal. The selection rule for $m_z$ now requires $m_z$ to change by $\hbar$, 0, or $-\hbar$, while $\sigma_z$, since it commutes with the electric displacement, will not change at all. Thus the energy difference between the two states taking part in the transition process will differ by an amount $e\hbar\mathcal{H}/2mc$, 0, or $-e\hbar\mathcal{H}/2mc$ from its value for no magnetic field. Hence, from Bohr's frequency condition, the frequency of the associated electromagnetic radiation will differ by $e\mathcal{H}/4\pi mc$, 0, or $-e\mathcal{H}/4\pi mc$ from that for no magnetic field. This means that each spectral line for no magnetic field gets split up by the field into three components. If one considers radiation moving in the z-direction, then from (83) the two outer components will be circularly polarized, while the central undisplaced one will be of zero intensity. These results are in agreement with experiment and also with the classical theory of the Zeeman effect.
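As a numerical illustration (Gaussian-unit values assumed; not from the text), the shift $e\mathcal{H}/4\pi mc$ of each outer Zeeman component works out to about 1.4 MHz per gauss:

```python
from math import pi

# Normal Zeeman shift e*H/(4*pi*m*c) of the outer components, in Hz,
# for a field of 10^4 gauss (1 tesla), using CGS constants.
e = 4.803e-10      # electron charge, esu
m = 9.109e-28      # electron mass, g
c = 2.998e10       # speed of light, cm/s
H_field = 1.0e4    # field strength, gauss

delta_nu = e * H_field / (4 * pi * m * c)   # frequency shift, Hz
assert 1.3e10 < delta_nu < 1.5e10           # about 14 GHz per tesla
```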
VII
PERTURBATION THEORY
42. General remarks
IN the preceding chapter exact treatments were given of some simple dynamical systems in the quantum theory. Most quantum problems, however, cannot be solved exactly with the present resources of mathematics, as they lead to equations whose solutions cannot be expressed in finite terms with the help of the ordinary functions of analysis. For such problems one can often use a perturbation method. This consists in splitting up the Hamiltonian into two parts, one of which must be simple and the other small. The first part may then be considered as the Hamiltonian of a simplified or unperturbed system, which can be dealt with exactly, and the addition of the second will then require small corrections, of the nature of a perturbation, in the solution for the unperturbed system. The requirement that the first part shall be simple requires in practice that it shall not involve the time explicitly. If the second part contains a small numerical factor ε, we can obtain the solution of our equations for the perturbed system in the form of a power series in ε, which, provided it converges, will give the answer to our problem with any desired accuracy. Even when the series does not converge, the first approximation obtained by means of it is usually fairly accurate.

There are two distinct methods in perturbation theory. In one of these the perturbation is considered as causing a modification of the states of motion of the unperturbed system. In the other we do not consider any modification to be made in the states of the unperturbed system, but we suppose that the perturbed system, instead of remaining permanently in one of these states, is continually changing from one to another, or making transitions, under the influence of the perturbation. Which method is to be used in any particular case depends on the nature of the problem to be solved. The first method is useful usually only when the perturbing energy (the correction in the Hamiltonian for the undisturbed system) does not involve the time explicitly, and is then applied to the stationary states. It can be used for calculating things that do not refer to any definite time, such as the energy-levels of the stationary states of the perturbed system, or, in the case of collision problems, the probability of scattering through
a given angle. The second method must, on the other hand, be used for solving all problems involving a consideration of time, such as those about the transient phenomena that occur when the perturbation is suddenly applied, or more generally problems in which the perturbation varies with the time in any way (i.e. in which the perturbing energy involves the time explicitly). Again, this second method must be used in collision problems, even though the perturbing energy does not here involve the time explicitly, if one wishes to calculate absorption and emission probabilities, since these probabilities, unlike a scattering probability, cannot be defined without reference to a state of affairs that varies with the time.

One can summarize the distinctive features of the two methods by saying that, with the first method, one compares the stationary states of the perturbed system with those of the unperturbed system; with the second method one takes a stationary state of the unperturbed system and sees how it varies with time under the influence of the perturbation.
43. The change in the energy-levels caused by a perturbation
The first of the above-mentioned methods will now be applied to the calculation of the changes in the energy-levels of a system caused by a perturbation. We assume the perturbing energy, like the Hamiltonian for the unperturbed system, not to involve the time explicitly. Our problem has a meaning, of course, only provided the energy-levels of the unperturbed system are discrete and the differences between them are large compared with the changes in them caused by the perturbation. This circumstance results in the treatment of perturbation problems by the first method having some different features according to whether the energy-levels of the unperturbed system are discrete or continuous.

Let the Hamiltonian of the perturbed system be
$H = E + V, \qquad (1)$
E being the Hamiltonian of the unperturbed system and V the small perturbing energy. By hypothesis each eigenvalue H' of H lies very close to one and only one eigenvalue E' of E. We shall use the same number of primes to specify any eigenvalue of H and the eigenvalue of E to which it lies very close. Thus we shall have H'' differing from E'' by a small quantity of order V and differing from E' by a quantity that is not small unless E' = E''. We must now take care always to
use different numbers of primes to specify eigenvalues of H and E which we do not want to lie very close together.

To obtain the eigenvalues of H, we have to solve the equation
$H|H'\rangle = H'|H'\rangle$
or
$(H' - E)|H'\rangle = V|H'\rangle. \qquad (2)$
Let $|0\rangle$ be an eigenket of E belonging to the eigenvalue E' and suppose the $|H'\rangle$ and H' that satisfy (2) to differ from $|0\rangle$ and E' only by small quantities and to be expressed as
$|H'\rangle = |0\rangle + |1\rangle + |2\rangle + \ldots,$
$H' = E' + a_1 + a_2 + \ldots, \qquad (3)$
where $|1\rangle$ and $a_1$ are of the first order of smallness (i.e. the same order as V), $|2\rangle$ and $a_2$ are of the second order, and so on. Substituting these expressions in (2), we obtain
$\{E' - E + a_1 + a_2 + \ldots\}\{|0\rangle + |1\rangle + |2\rangle + \ldots\} = V\{|0\rangle + |1\rangle + \ldots\}.$
If we now separate the terms of zero order, of the first order, of the second order, and so on, we get the following set of equations,
$(E' - E)|0\rangle = 0,$
$(E' - E)|1\rangle + a_1|0\rangle = V|0\rangle, \qquad (4)$
$(E' - E)|2\rangle + a_1|1\rangle + a_2|0\rangle = V|1\rangle,$
$\ldots$
The first of these equations tells us, what we have already assumed, that $|0\rangle$ is an eigenket of E belonging to the eigenvalue E'. The others enable us to calculate the various corrections $|1\rangle, |2\rangle, \ldots, a_1, a_2, \ldots$.

For the further discussion of these equations it is convenient to introduce a representation in which E is diagonal, i.e. a Heisenberg representation for the unperturbed system, and to take E itself as one of the observables whose eigenvalues label the representatives. Let the others, in the event of others being necessary, as is the case when there is more than one eigenstate of E belonging to any eigenvalue, be called β's. A basic bra is then $\langle E''\beta''|$. Since $|0\rangle$ is an eigenket of E belonging to the eigenvalue E', we have
$\langle E''\beta''|0\rangle = \delta_{E''E'}\, f(\beta''), \qquad (5)$
where $f(\beta'')$ is some function of the variables β''. With the help of this result the second of equations (4), written in terms of representatives, becomes
$(E' - E'')\langle E''\beta''|1\rangle + a_1 \delta_{E''E'} f(\beta'') = \sum_{\beta'} \langle E''\beta''|V|E'\beta'\rangle f(\beta'). \qquad (6)$
Putting E'' = E' here, we get
$a_1 f(\beta'') = \sum_{\beta'} \langle E'\beta''|V|E'\beta'\rangle f(\beta'). \qquad (7)$
Equation (7) is of the form of the standard equation in the theory of eigenvalues, so far as the variables β' are concerned. It shows that the various possible values for $a_1$ are the eigenvalues of the matrix $\langle E'\beta''|V|E'\beta'\rangle$. This matrix is a part of the representative of the perturbing energy in the Heisenberg representation for the unperturbed system, namely, the part consisting of those elements that refer to the same unperturbed energy-level E' for their row and column. Each of these values for $a_1$ gives, to the first order, an energy-level of the perturbed system lying close to the energy-level E' of the unperturbed system.† There may thus be several energy-levels of the perturbed system lying close to the one energy-level E' of the unperturbed system, their number being anything not exceeding the number of independent states of the unperturbed system belonging to the energy-level E'. In this way the perturbation may cause a separation or partial separation of the energy-levels that coincide at E' for the unperturbed system.

Equation (7) also determines, to the zero order, the representatives $\langle E''\beta''|0\rangle$ of the stationary states of the perturbed system belonging to energy-levels lying close to E', any solution $f(\beta')$ of (7) substituted in (5) giving one such representative. Each of these stationary states of the perturbed system approximates to one of the stationary states of the unperturbed system, but the converse, that each stationary state of the unperturbed system approximates to one of the stationary states of the perturbed system, is not true, since the general stationary state of the unperturbed system belonging to the energy-level E' is represented by the right-hand side of (5) with an arbitrary function $f(\beta'')$. The problem of finding which stationary states of the unperturbed system approximate to stationary states of the perturbed system, i.e. the problem of finding the solutions $f(\beta')$ of (7), corresponds to the problem of 'secular perturbations' in classical mechanics. It should be noted that the above results are independent of the values of all those matrix elements of the perturbing
† To distinguish these energy-levels one from another we should require some more elaborate notation, since according to the present notation they must all be specified by the same number of primes, namely by the number of primes specifying the energy-level of the unperturbed system from which they arise. For our present purposes, however, this more elaborate notation is not required.
energy which refer to two different energy-levels of the unperturbed system.
Let us see what the above results become in the specially simple case when there is only one stationary state of the unperturbed system belonging to each energy-level.† In this case E alone fixes the representation, no β's being required. The sum in (7) now reduces to a single term and we get
$a_1 = \langle E'|V|E'\rangle. \qquad (8)$
There is only one energy-level of the perturbed system lying close to any energy-level of the unperturbed system and the change in energy is equal, in the first order, to the corresponding diagonal element of the perturbing energy in the Heisenberg representation for the unperturbed system, or to the average value of the perturbing energy for the corresponding unperturbed state. The latter formulation of the result is the same as in classical mechanics when the unperturbed system is multiply periodic.

We shall proceed to calculate the second-order correction $a_2$ in the energy-level for the case when the unperturbed system is non-degenerate. Equation (5) for this case reads
$\langle E''|0\rangle = \delta_{E''E'},$
with neglect of an unimportant numerical factor, and equation (6) reads
$(E' - E'')\langle E''|1\rangle + a_1 \delta_{E''E'} = \langle E''|V|E'\rangle.$
This gives us the value of $\langle E''|1\rangle$ when $E'' \ne E'$, namely
$\langle E''|1\rangle = \frac{\langle E''|V|E'\rangle}{E' - E''}. \qquad (9)$
The third of equations (4), written in terms of representatives, becomes
$(E' - E'')\langle E''|2\rangle + a_1\langle E''|1\rangle + a_2 \delta_{E''E'} = \sum_{E'''} \langle E''|V|E'''\rangle\langle E'''|1\rangle.$
Putting E'' = E' here, we get
$a_1\langle E'|1\rangle + a_2 = \sum_{E''} \langle E'|V|E''\rangle\langle E''|1\rangle,$
which reduces, with the help of (8), to
$a_2 = \sum_{E'' \ne E'} \langle E'|V|E''\rangle\langle E''|1\rangle.$
† A system with only one stationary state belonging to each energy-level is often called non-degenerate and one with two or more stationary states belonging to an energy-level is called degenerate, although these words are not very appropriate from the modern point of view.
Substituting for $\langle E''|1\rangle$ from (9), we obtain finally
$a_2 = \sum_{E'' \ne E'} \frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E' - E''}, \qquad (10)$
giving for the total energy change to the second order
$a_1 + a_2 = \langle E'|V|E'\rangle + \sum_{E'' \ne E'} \frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E' - E''}.$
The method may be developed for the calculation of the higher approximations if required. General recurrence formulas giving the nth order corrections in terms of those of lower order have been obtained by Born, Heisenberg, and Jordan.†
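Formulas (8) and (10) are easy to test numerically; a sketch (with hypothetical levels and perturbation, not from the text): for a small Hermitian V added to a non-degenerate diagonal E, the sum E' + a₁ + a₂ should reproduce the exact eigenvalues up to errors of the third order in V.

```python
import numpy as np

# Compare first- plus second-order perturbation theory with exact
# diagonalization for a small random symmetric perturbation.
rng = np.random.default_rng(0)
E_levels = np.array([0.0, 1.0, 2.5, 4.0])       # unperturbed, non-degenerate
M = rng.standard_normal((4, 4))
V = 1e-3 * (M + M.T)                             # small real symmetric V

exact = np.linalg.eigvalsh(np.diag(E_levels) + V)

approx = []
for i, Ei in enumerate(E_levels):
    a1 = V[i, i]                                        # equation (8)
    a2 = sum(V[i, j] * V[j, i] / (Ei - E_levels[j])     # equation (10)
             for j in range(4) if j != i)
    approx.append(Ei + a1 + a2)

assert np.allclose(np.sort(approx), exact, atol=1e-6)   # third-order agreement
```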
44. The perturbation considered as causing transitions
We shall now consider the second of the two perturbation methods mentioned in § 42. We suppose again that we have an unperturbed system governed by a Hamiltonian E which does not involve the time explicitly, and a perturbing energy V which can now be an arbitrary function of the time. The Hamiltonian for the perturbed system is again H = E + V. For the present method it does not make any essential difference whether the energy-levels of the unperturbed system, i.e. the eigenvalues of E, form a discrete or continuous set. We shall, however, take the discrete case, for definiteness. We shall again work with a Heisenberg representation for the unperturbed system, but as there will now be no advantage in taking E itself as one of the observables whose eigenvalues label the representatives, we shall suppose we have a general set of α's to label the representatives.

Let us suppose that at the initial time $t_0$ the system is in a state for which the α's certainly have the values α'. The ket corresponding to this state is the basic ket $|\alpha'\rangle$. If there were no perturbation, i.e. if the Hamiltonian were E, this state would be stationary. The perturbation causes the state to change. At time t the ket corresponding to the state in Schroedinger's picture will be $T|\alpha'\rangle$, according to equation (1) of § 27. The probability of the α's then having the values α'' is
$P(\alpha'\alpha'') = |\langle\alpha''|T|\alpha'\rangle|^2. \qquad (11)$
For $\alpha'' \ne \alpha'$, $P(\alpha'\alpha'')$ is the probability of a transition taking place from state α' to state α'' during the time interval $t_0 \to t$, while $P(\alpha'\alpha')$
† Z. f. Physik, 35 (1925), 565.
is the probability of no transition taking place at all. The sum of $P(\alpha'\alpha'')$ for all α'' is, of course, unity.

Let us now suppose that initially the system, instead of being certainly in the state α', is in one or other of various states α' with the probability $P_{\alpha'}$ for each. The Gibbs density corresponding to this distribution is, according to (68) of § 33,
$\rho = \sum_{\alpha'} |\alpha'\rangle P_{\alpha'} \langle\alpha'|. \qquad (12)$
At time t, each ket $|\alpha'\rangle$ will have changed to $T|\alpha'\rangle$ and each bra $\langle\alpha'|$ to $\langle\alpha'|\bar T$, so ρ will have changed to
$\rho_t = \sum_{\alpha'} T|\alpha'\rangle P_{\alpha'} \langle\alpha'|\bar T. \qquad (13)$
The probability of the α's then having the values α'' will be, from (73) of § 33,
$\langle\alpha''|\rho_t|\alpha''\rangle = \sum_{\alpha'} \langle\alpha''|T|\alpha'\rangle P_{\alpha'} \langle\alpha'|\bar T|\alpha''\rangle = \sum_{\alpha'} P_{\alpha'} P(\alpha'\alpha'') \qquad (14)$
with the help of (11). This result expresses that the probability of the system being in the state α'' at time t is the sum of the probabilities of the system being initially in any state $\alpha' \ne \alpha''$ and making a transition from state α' to state α'' and the probability of its being initially in the state α'' and making no transition. Thus the various transition probabilities act independently of one another, according to the ordinary laws of probability.

The whole problem of calculating transitions thus reduces to the determination of the probability amplitudes $\langle\alpha''|T|\alpha'\rangle$. These can be worked out from the differential equation for T, equation (6) of § 27, or
$i\hbar\, dT/dt = HT = (E + V)T. \qquad (15)$
The calculation can be simplified by working with
$T^* = e^{iE(t-t_0)/\hbar}\, T. \qquad (16)$
We have
$i\hbar\, dT^*/dt = e^{iE(t-t_0)/\hbar}(-ET + i\hbar\, dT/dt) = e^{iE(t-t_0)/\hbar}\, VT = V^* T^*, \qquad (17)$
where
$V^* = e^{iE(t-t_0)/\hbar}\, V\, e^{-iE(t-t_0)/\hbar}, \qquad (18)$
i.e. $V^*$ is the result of applying a certain unitary transformation to V. Equation (17) is of a more convenient form than (15), because (17) makes the change in $T^*$ depend entirely on the perturbation V, and
for V = 0 it would make $T^*$ equal its initial value, namely unity. We have from (16)
$\langle\alpha''|T^*|\alpha'\rangle = e^{iE''(t-t_0)/\hbar}\, \langle\alpha''|T|\alpha'\rangle,$
so that
$P(\alpha'\alpha'') = |\langle\alpha''|T^*|\alpha'\rangle|^2, \qquad (19)$
showing that $T^*$ and T are equally good for determining transition probabilities.

Our work up to the present has been exact. We now assume V is a small quantity of the first order and express $T^*$ in the form
$T^* = 1 + T_1^* + T_2^* + \ldots, \qquad (20)$
where $T_1^*$ is of the first order, $T_2^*$ is of the second, and so on. Substituting (20) into (17) and equating terms of equal order, we get
$i\hbar\, dT_1^*/dt = V^*,$
$i\hbar\, dT_2^*/dt = V^* T_1^*, \qquad (21)$
$\ldots$
From the first of these equations we obtain
$T_1^* = -i\hbar^{-1} \int_{t_0}^{t} V^*(t')\, dt', \qquad (22)$
from the second we obtain
$T_2^* = -\hbar^{-2} \int_{t_0}^{t} V^*(t')\, dt' \int_{t_0}^{t'} V^*(t'')\, dt'', \qquad (23)$
and so on. For many practical problems it is sufficiently accurate to retain only the term $T_1^*$, which gives for the transition probability $P(\alpha'\alpha'')$ with $\alpha'' \ne \alpha'$
$P(\alpha'\alpha'') = \hbar^{-2} \Bigl|\langle\alpha''| \int_{t_0}^{t} V^*(t')\, dt' |\alpha'\rangle\Bigr|^2 = \hbar^{-2} \Bigl|\int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt'\Bigr|^2. \qquad (24)$
We obtain in this way the transition probability to the second order of accuracy. The result depends only on the matrix element $\langle\alpha''|V^*(t')|\alpha'\rangle$ of $V^*(t')$ referring to the two states concerned, with t' going from $t_0$ to t. Since $V^*$ is real, like V,
$\overline{\langle\alpha''|V^*(t')|\alpha'\rangle} = \langle\alpha'|V^*(t')|\alpha''\rangle$
and hence
$P(\alpha'\alpha'') = P(\alpha''\alpha') \qquad (25)$
to the second order of accuracy.
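Formula (24) can be checked on the simplest possible case, a two-level system with a constant perturbation v switched on at $t_0 = 0$ (a numerical sketch with hypothetical values, not from the text): then $\langle 2|V^*(t')|1\rangle = v\,e^{i\omega_{21}t'}$, the integral in (24) can be done exactly, and the first-order probability agrees with exact time evolution when v is small.

```python
import numpy as np

# Compare the first-order formula (24) with exact evolution of a two-level
# system under a small constant off-diagonal coupling v.
hbar = 1.0
E1, E2 = 0.0, 1.0
v = 1e-3                                    # weak, so first order suffices
w21 = (E2 - E1) / hbar
H = np.array([[E1, v], [v, E2]])
t = 5.0

# exact evolution: psi(t) = exp(-i*H*t/hbar) acting on state |1>
w, Q = np.linalg.eigh(H)
U = Q @ np.diag(np.exp(-1j * w * t / hbar)) @ Q.conj().T
P_exact = abs((U @ np.array([1.0, 0.0]))[1]) ** 2

# equation (24): P = hbar^-2 * |v * integral of exp(i*w21*t') from 0 to t|^2
integral = (np.exp(1j * w21 * t) - 1.0) / (1j * w21)
P_first_order = (v / hbar) ** 2 * abs(integral) ** 2

assert abs(P_exact - P_first_order) < 1e-9
```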
Sometimes one is interested in a transition α' → α'' such that the matrix element $\langle\alpha''|V^*|\alpha'\rangle$ vanishes, or is small compared with other matrix elements of $V^*$. It is then necessary to work to a higher accuracy. If we retain only the terms $T_1^*$ and $T_2^*$, we get, for $\alpha'' \ne \alpha'$,
$P(\alpha'\alpha'') = \hbar^{-2} \Bigl| \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt' - i\hbar^{-1} \sum_{\alpha''' \ne \alpha',\,\alpha''} \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'''\rangle\, dt' \int_{t_0}^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\, dt'' \Bigr|^2. \qquad (26)$
The terms α''' = α' and α''' = α'' are omitted from the sum since they are small compared with other terms of the sum, on account of the smallness of $\langle\alpha''|V^*|\alpha'\rangle$. To interpret the result (26), we may suppose that the term
$\int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt' \qquad (27)$
gives rise to a transition directly from state α' to state α'', while the term
$-i\hbar^{-1} \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'''\rangle\, dt' \int_{t_0}^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\, dt'' \qquad (28)$
gives rise to a transition from state α' to state α''', followed by a transition from state α''' to state α''. The state α''' is called an intermediate state in this interpretation. We must add the term (27) to the various terms (28) corresponding to different intermediate states and then take the square of the modulus of the sum, which means that there is interference between the different transition processes (the direct one and those involving intermediate states) and one cannot give a meaning to the probability for one of these processes by itself. For each of these processes, however, there is a probability amplitude. If one carries out the perturbation method to a higher degree of accuracy, one obtains a result which can be interpreted similarly, with the help of more complicated transition processes involving a succession of intermediate states.
45. Application to radiation
In the preceding section a general theory of the perturbation of an
atomic system was developed, in which the perturbing energy could
vary with the time in an arbitrary way. A perturbation of this
kind can be realized in practice by allowing incident electromagnetic
radiation to fall on the system. Let us see what our result (24) reduces
to in this case.
If we neglect the effects of the magnetic field of the incident radiation,
and if we further assume that the wave-lengths of the harmonic
components of this radiation are all large compared with the dimensions
of the atomic system, then the perturbing energy is simply the
scalar product

V = (\mathbf{D}, \mathscr{E}),   (29)

where D is the total electric displacement of the system and ℰ is
the electric force of the incident radiation. We suppose ℰ to be a
given function of the time. If we take for simplicity the case when
the incident radiation is plane polarized with its electric vector in
a certain direction and let D denote the Cartesian component of the
vector D in this direction, the expression (29) for V reduces to the
ordinary product

V = D\mathscr{E},

where ℰ is the magnitude of the vector ℰ. The matrix elements of
V are

\langle\alpha''|V|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\,\mathscr{E},

since ℰ is a number. The matrix element ⟨α''|D|α'⟩ is independent
of t. From (18)
\langle\alpha''|V^*(t)|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\,e^{i(E''-E')t/\hbar}\,\mathscr{E}(t),

and hence the expression (24) for the transition probability becomes

P(\alpha'\alpha'') = \hbar^{-2}\,|\langle\alpha''|D|\alpha'\rangle|^2\,\Big|\int_{t_0}^{t}e^{i(E''-E')t'/\hbar}\,\mathscr{E}(t')\,dt'\Big|^2.   (30)

If the incident radiation during the time interval t_0 to t is resolved
into its Fourier components, the energy crossing unit area per unit
frequency range about the frequency ν will be, according to classical
electrodynamics,

E_\nu = \frac{c}{2\pi}\,\Big|\int_{t_0}^{t}e^{2\pi i\nu t'}\,\mathscr{E}(t')\,dt'\Big|^2.   (31)

Comparing this with (30), we obtain

P(\alpha'\alpha'') = 2\pi\hbar^{-2}c^{-1}\,|\langle\alpha''|D|\alpha'\rangle|^2\,E_\nu,   (32)

where

\nu = |E''-E'|/h.   (33)

From this result we see in the first place that the transition probability
depends only on that Fourier component of the incident radiation
whose frequency ν is connected with the change of energy by (33).
This gives us Bohr's Frequency Condition and shows how the ideas
of Bohr's atomic theory, which was the forerunner of quantum
mechanics, can be fitted in with quantum mechanics.
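Bohr's frequency condition lends itself to a direct numerical check. The following sketch is an editorial illustration, not part of the text: it evaluates the time integral appearing in the transition probability (30) for a monochromatic perturbation ℰ(t') = cos 2πν₀t' and confirms that the probability, regarded as a function of E''−E', peaks where |E''−E'|/ℏ = 2πν₀, i.e. at ν = |E''−E'|/h. The units ℏ = 1 and all parameter values are arbitrary choices of the sketch.

```python
import cmath, math

def transition_weight(omega, nu0, T=100.0, n=5000):
    """|integral_0^T e^{i*omega*t} cos(2*pi*nu0*t) dt|^2 by the midpoint rule.

    This is the factor multiplying |<a''|D|a'>|^2 / hbar^2 in (30),
    with omega standing for (E'' - E')/hbar (hbar = 1)."""
    dt = T / n
    s = 0j
    for k in range(n):
        t = (k + 0.5) * dt
        s += cmath.exp(1j * omega * t) * math.cos(2 * math.pi * nu0 * t) * dt
    return abs(s) ** 2

nu0 = 0.25                       # frequency of the incident wave
resonance = 2 * math.pi * nu0    # Bohr condition: (E'' - E')/hbar = 2*pi*nu0
omegas = [resonance * (0.5 + 0.01 * k) for k in range(101)]
peak = max(omegas, key=lambda w: transition_weight(w, nu0))
print(peak / resonance)          # close to 1: the peak sits at nu = |E''-E'|/h
```

The longer the interval T, the more sharply the probability concentrates about the resonant frequency.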
The present elementary theory does not tell us anything about the
energy of the field of radiation. It would be reasonable to assume,
though, that the energy absorbed or liberated by the atomic system
in the transition process comes from or goes into the component of
the radiation with frequency ν given by (33). This assumption will
be justified by the more complete theory of radiation given in
Chapter X. The result (32) is then to be interpreted as the probability
of the system, if initially in the state of lower energy, absorbing
radiation and being carried to the upper state, and, if initially in
the upper state, being stimulated by the incident radiation to emit
and fall to the lower state. The present theory does not account for
the experimental fact that the system, if in the upper state with no
incident radiation, can emit spontaneously and fall to the lower state,
but this also will be accounted for by the more complete theory of
Chapter X.
The existence of the phenomenon of stimulated emission was inferred
by Einstein† long before the discovery of quantum mechanics,
from a consideration of statistical equilibrium between atoms and a
field of black-body radiation satisfying Planck's law. Einstein showed
that the transition probability for stimulated emission must equal
that for absorption between the same pair of states, in agreement
with the present quantum theory, and deduced also a relation connecting
this transition probability with that for spontaneous emission,
which relation is in agreement with the theory of Chapter X.
The matrix element ⟨α''|D|α'⟩ in (32) plays the part of the amplitude
of one of the Fourier components of D in the classical theory of
a multiply-periodic system interacting with radiation. In fact it was
the idea of replacing classical Fourier components by matrix elements
which led Heisenberg to the discovery of quantum mechanics in 1925.
Heisenberg assumed that the formulas describing the interaction with
radiation of a system in the quantum theory can be obtained from
the classical formulas by substituting for the Fourier components of
the total electric displacement of the system the corresponding matrix
elements. According to this assumption applied to spontaneous emission,
a system having an electric moment D will, when in the state
† Einstein, Phys. Zeits. 18 (1917), 121.
α', spontaneously emit radiation of frequency ν = (E'−E'')/h, where
E'' is an energy-level, less than E', of some state α'', at the rate

\frac{4}{3}\,\frac{(2\pi\nu)^4}{c^3}\,|\langle\alpha''|D|\alpha'\rangle|^2.   (34)
The distribution of this radiation over the different directions of
emission and its state of polarization for each direction will be the
same as that for a classical electric dipole of moment equal to the
real part of ⟨α''|D|α'⟩. To interpret this rate of emission of radiant
energy as a transition probability, we must divide it by the quantum
of energy of this frequency, namely hν, and call it the probability per
unit time of this quantum being spontaneously emitted, with the
atomic system simultaneously dropping to the state α'' of lower
energy. These assumptions of Heisenberg are justified by the present
radiation theory, supplemented by the spontaneous transition theory
of Chapter X.
46. Transitions caused by a perturbation independent of the
time
The perturbation method of § 44 is still valid when the perturbing
energy V does not involve the time t explicitly. Since the total
Hamiltonian H in this case does not involve t explicitly, we could
now, if desired, deal with the system by the perturbation method of
§ 43 and find its stationary states. Whether this method would be
convenient or not would depend on what we want to find out about
the system. If what we have to calculate makes an explicit reference
to the time, e.g. if we have to calculate the probability of the system
being in a certain state at one time when we are given that it is in a
certain state at another time, the method of § 44 would be the more
convenient one.
Let us see what the result (24) for the transition probability becomes
when V does not involve t explicitly, and let us take t_0 = 0 to simplify
the writing. The matrix element ⟨α''|V|α'⟩ is now independent of t,
and from (18)

\langle\alpha''|V^*(t')|\alpha'\rangle = \langle\alpha''|V|\alpha'\rangle\,e^{i(E''-E')t'/\hbar},   (35)

so that

\int_0^t\langle\alpha''|V^*(t')|\alpha'\rangle\,dt' = \langle\alpha''|V|\alpha'\rangle\,\frac{e^{i(E''-E')t/\hbar}-1}{i(E''-E')/\hbar},

provided E'' ≠ E'. Thus the transition probability (24) becomes

P(\alpha'\alpha'') = |\langle\alpha''|V|\alpha'\rangle|^2\,[e^{i(E''-E')t/\hbar}-1][e^{-i(E''-E')t/\hbar}-1]/(E''-E')^2
= 2\,|\langle\alpha''|V|\alpha'\rangle|^2\,[1-\cos((E''-E')t/\hbar)]/(E''-E')^2.   (36)
If E'' differs appreciably from E', this transition probability is small
and remains so for all values of t. This result is required by the law
of the conservation of energy. The total energy H is constant and
hence the proper-energy E (i.e. the energy with neglect of the part
V due to the perturbation), being approximately equal to H, must
be approximately constant. This means that if E initially has the
numerical value E', at any later time there must be only a small
probability of its having a numerical value differing considerably
from E'.
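The behaviour of the first-order result (36) can be seen in a small numerical experiment. The sketch below is an editorial illustration, not part of the text: it integrates the Schrödinger equation for just two states of proper-energies E' and E'' coupled by a small constant perturbing energy V, and compares the exact probability of finding the system in the second state with formula (36). The two-level truncation, the parameter values, and the units ℏ = 1 are assumptions of the sketch.

```python
import math

def exact_p(E1, E2, V, t, n=2000):
    """|c2(t)|^2 from i dc/dt = Hc (hbar = 1), by fourth-order Runge-Kutta."""
    def deriv(c):
        c1, c2 = c
        return (-1j * (E1 * c1 + V * c2), -1j * (V * c1 + E2 * c2))
    c, h = (1 + 0j, 0j), t / n
    for _ in range(n):
        k1 = deriv(c)
        k2 = deriv((c[0] + h/2 * k1[0], c[1] + h/2 * k1[1]))
        k3 = deriv((c[0] + h/2 * k2[0], c[1] + h/2 * k2[1]))
        k4 = deriv((c[0] + h * k3[0], c[1] + h * k3[1]))
        c = (c[0] + h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
             c[1] + h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))
    return abs(c[1]) ** 2

def first_order_p(E1, E2, V, t):
    """Formula (36): 2|V|^2 [1 - cos((E''-E')t/hbar)] / (E''-E')^2."""
    return 2 * V**2 * (1 - math.cos((E2 - E1) * t)) / (E2 - E1) ** 2

E1, E2, V = 0.0, 1.0, 0.02   # |V| much smaller than E'' - E'
for t in (1.0, 5.0, 20.0):
    print(exact_p(E1, E2, V, t), first_order_p(E1, E2, V, t))
```

For |V| small compared with E''−E' the two columns agree closely and both remain small for all t, as stated above; as |V| is increased the first-order formula fails.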
On the other hand, when the initial state α' is such that there exists
another state α'' having the same or very nearly the same proper-energy
E, the probability of a transition to the final state α'' may be
quite large. The case of physical interest now is that in which there
is a continuous range of final states α'' having a continuous range of
proper-energy levels E'' passing through the value E' of the proper-energy
of the initial state. The initial state must not be one of the
continuous range of final states, but may be either a separate discrete
state or one of another continuous range of states. We shall now have,
remembering the rules of § 18 for the interpretation of probability
amplitudes with continuous ranges of states, that, with P(α'α'')
having the value (36), the probability of a transition to a final state
within the small range α'' to α''+dα'' will be P(α'α'') dα'' if the initial
state α' is discrete and will be proportional to this quantity if α' is
one of a continuous range.
We may suppose that the α's describing the final state consist of
E together with a number of other dynamical variables β, so that we
have a representation like that of § 43 for the degenerate case. (The
β's, however, need have no meaning for the initial state α'.) We shall
suppose for definiteness that the β's have only discrete eigenvalues.
The total probability of a transition to a final state α'' for which the
β's have the values β'' and E'' has any value (there will be a strong
probability of its having a value near the initial value E') will now
be (or be proportional to)

\int 2\,|\langle E''\beta''|V|\alpha'\rangle|^2\,\frac{1-\cos((E''-E')t/\hbar)}{(E''-E')^2}\,dE'' = 2t\hbar^{-1}\int|\langle E''\beta''|V|\alpha'\rangle|^2\,\frac{1-\cos x}{x^2}\,dx   (37)

if one makes the substitution (E''−E')t/ℏ = x. For large values of t
this reduces to

2t\hbar^{-1}\,|\langle E'\beta''|V|\alpha'\rangle|^2\int_{-\infty}^{\infty}\frac{1-\cos x}{x^2}\,dx = 2\pi t\hbar^{-1}\,|\langle E'\beta''|V|\alpha'\rangle|^2.   (38)
Thus the total probability up to time t of a transition to a final state
for which the β's have the values β'' is proportional to t. There is
therefore a definite probability coefficient, or probability per unit time,
for the transition process under consideration, having the value

2\pi\hbar^{-1}\,|\langle E'\beta''|V|\alpha'\rangle|^2.   (39)

It is proportional to the square of the modulus of the matrix element,
associated with this transition, of the perturbing energy.
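The factor π that turns (37) into (38) comes from the definite integral ∫_{−∞}^{∞}(1−cos x)/x² dx = π, which may be confirmed numerically. The following check is an editorial illustration; the finite cutoff standing in for the infinite limits, with a 2/X tail correction, is an assumption of the sketch.

```python
import math

def f(x):
    # the integrand of (38); its limit at x = 0 is 1/2
    return (1 - math.cos(x)) / (x * x)

# midpoint rule on [0, X], doubled for the even integrand; the tail
# beyond +-X contributes about 2/X and is added as a correction
X, n = 1000.0, 200000
dx = X / n
total = 2 * sum(f((k + 0.5) * dx) for k in range(n)) * dx + 2 / X
print(total, math.pi)   # the two values agree to a few decimals
```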
If the matrix element ⟨E'β''|V|α'⟩ is small compared with other
matrix elements of V, we must work with the more accurate formula
(26). We have from (35)

\int_0^t\langle\alpha''|V^*(t')|\alpha'''\rangle\,dt'\int_0^{t'}\langle\alpha'''|V^*(t'')|\alpha'\rangle\,dt''
= \langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle\int_0^t e^{i(E''-E''')t'/\hbar}\,dt'\int_0^{t'}e^{i(E'''-E')t''/\hbar}\,dt''
= \frac{\langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{i(E'''-E')/\hbar}\int_0^t\big(e^{i(E''-E')t'/\hbar}-e^{i(E''-E''')t'/\hbar}\big)\,dt'.

For E''' close to E', only the first term in the integrand here gives rise
to a transition probability of physical importance and the second
term may be discarded. Using this result in (26) we get

P(\alpha'\alpha'') = 2\,\Big|\langle\alpha''|V|\alpha'\rangle-\sum_{\alpha'''\neq\alpha',\alpha''}\frac{\langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{E'''-E'}\Big|^2\,\frac{1-\cos((E''-E')t/\hbar)}{(E''-E')^2},
which replaces (36). Proceeding as before, we obtain for the transition
probability per unit time to a final state for which the β's have
the values β'', and E has a value close to its initial value E',

2\pi\hbar^{-1}\,\Big|\langle E'\beta''|V|\alpha'\rangle-\sum_{\alpha'''}\frac{\langle E'\beta''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{E'''-E'}\Big|^2.   (40)

This formula shows how intermediate states, differing from the initial
state and the final state, play a role in the determination of a probability
coefficient.
In order that the approximations used in deriving (39) and (40) may
be valid, the time t must be not too small and not too large. It must
be large compared with the periods of the atomic system in order that
the approximate evaluation of the integral (37) leading to the result
(38) may be valid, while it must not be excessively large or else the
general formula (24) or (26) will break down. In fact one could make
the probability (38) greater than unity by taking t large enough. The
upper limit to t is fixed by the condition that the probability (24) or
(26), or t times (39) or (40), must be small compared with unity. There
is no difficulty in t satisfying both these conditions simultaneously
provided the perturbing energy V is sufficiently small.
47. The anomalous Zeeman effect
One of the simplest examples of the perturbation method of § 43
is the calculation of the first-order change in the energy-levels of an
atom caused by a uniform magnetic field. The problem of a hydrogen
atom in a uniform magnetic field has already been dealt with in § 41
and was so simple that perturbation theory was unnecessary. The
case of a general atom is not much more complicated when we make
a few approximations such that we can set up a simple model for the
atom.
We first of all consider the atom in the absence of the magnetic
field and look for constants of the motion or quantities that are
approximately constants of the motion. The total angular momentum
of the atom, the vector j say, is certainly a constant of the
motion. This angular momentum may be regarded as the sum of two
parts, the total orbital angular momentum of all the electrons, l say,
and the total spin angular momentum, s say. Thus we have j = l+s.
Now the effect of the spin magnetic moments on the motion of the
electrons is small compared with the effect of the Coulomb forces and
may be neglected as a first approximation. With this approximation
the spin angular momentum of each electron is a constant of the
motion, there being no forces tending to change its orientation. Thus
s, and hence also l, will be constants of the motion. The magnitudes,
l, s, and j say, of l, s, and j will be given by

l+\tfrac{1}{2}\hbar = (l_x^2+l_y^2+l_z^2+\tfrac{1}{4}\hbar^2)^{\frac{1}{2}},
s+\tfrac{1}{2}\hbar = (s_x^2+s_y^2+s_z^2+\tfrac{1}{4}\hbar^2)^{\frac{1}{2}},
j+\tfrac{1}{2}\hbar = (j_x^2+j_y^2+j_z^2+\tfrac{1}{4}\hbar^2)^{\frac{1}{2}},
corresponding to equation (39) of § 36. They commute with each
other, and from (47) of § 36 we see that with given numerical values
for l and s the possible numerical values for j are

l+s,\quad l+s-\hbar,\quad \ldots,\quad |l-s|.
Let us consider a stationary state for which l, s, and j have definite
numerical values in agreement with the above scheme. The energy
of this state will depend on l, but one might think that with neglect
of the spin magnetic moments it would be independent of s, and
also of the direction of the vector s relative to l, and thus of j. It will
be found in Chapter IX, however, that the energy depends very much
on the magnitude s of the vector s, although independent of its
direction when one neglects the spin magnetic moments, on account
of certain phenomena arising from the fact that the electrons are
indistinguishable one from another. There are thus different energy-levels
of the system for each different value of l and s. This means
that l and s are functions of the energy, according to the general
definition of a function given in § 11, since the l and s of a stationary
state are fixed when the energy of that state is fixed.
We can now take into account the effect of the spin magnetic
moments, treating it as a small perturbation according to the method
of § 43. The energy of the unperturbed system will still be approximately
a constant of the motion and hence l and s, being functions
of this energy, will still be approximately constants of the motion.
The directions of the vectors l and s, however, not being functions of
the unperturbed energy, need not now be approximately constants
of the motion and may undergo large secular variations. Since the
vector j is constant, the only possible variation of l and s is a precession
about the vector j. We thus have an approximate model of
the atom consisting of the two vectors l and s of constant lengths
precessing about their sum j, which is a fixed vector. The energy is
determined mainly by the magnitudes of l and s and depends only
slightly on their relative directions, specified by j. Thus states with
the same l and s and different j will have only slightly different
energy-levels, forming what is called a multiplet term.
Let us now take this atomic model as our unperturbed system and
suppose it to be subjected to a uniform magnetic field of magnitude ℋ
in the direction of the z-axis. The extra energy due to this magnetic
field will consist of a term

e\mathscr{H}/2mc\,.\,(m_z+\hbar\sigma_z),   (41)
like the last term in equation (89) of § 41, contributed by each
electron, and will thus be altogether

e\mathscr{H}/2mc\,.\,\sum(m_z+\hbar\sigma_z) = e\mathscr{H}/2mc\,.\,(l_z+2s_z) = e\mathscr{H}/2mc\,.\,(j_z+s_z).   (42)

This is our perturbing energy V. We shall now use the method of
§ 43 to determine the changes in the energy-levels caused by this V.
The method will be legitimate only provided the field is so weak that
V is small compared with the energy differences within a multiplet.
Our unperturbed system is degenerate, on account of the direction
of the vector j being undetermined. We must therefore take, from
the representative of V in a Heisenberg representation for the unperturbed
system, those matrix elements that refer to one particular
energy-level for their row and column, and obtain the eigenvalues of
the matrix thus formed. We can do this best by first splitting up V
into two parts, one of which is a constant of the unperturbed motion,
so that its representative contains only matrix elements referring to
the same unperturbed energy-level for their row and column, while
the representative of the other contains only matrix elements referring
to two different unperturbed energy-levels for their row and
column, so that this second part does not affect the first-order perturbation.
The term involving j_z in (42) is a constant of the unperturbed
motion and thus belongs entirely to the first part. For the
term involving s_z we have
s_z = (j_x^2+j_y^2+j_z^2)^{-1}\{(j_x s_x+j_y s_y+j_z s_z)j_z+\gamma_x j_y-\gamma_y j_x\},   (43)

where

\gamma_x = s_z j_y-j_z s_y = s_z l_y-l_z s_y,\qquad
\gamma_y = j_z s_x-s_z j_x = l_z s_x-s_z l_x.   (44)
The first term in this expression for s_z is a constant of the unperturbed
motion and thus belongs entirely to the first part, while the second
term, as we shall now see, belongs entirely to the second part.
Corresponding to (44) we can introduce

\gamma_z = l_x s_y-l_y s_x.

It can now easily be verified that

j_x\gamma_x+j_y\gamma_y+j_z\gamma_z = 0

and from (30) of § 35

[j_z,\gamma_x] = \gamma_y,\qquad [j_z,\gamma_y] = -\gamma_x,\qquad [j_z,\gamma_z] = 0.
These relations connecting j_x, j_y, j_z and γ_x, γ_y, γ_z are of the same form
as the relations connecting m_x, m_y, m_z and x, y, z in the calculation
in § 40 of the selection rule for the matrix elements of z in a representation
with k diagonal. From the result there obtained, that all
matrix elements of z vanish except those referring to two k values
differing by ±ℏ, we can infer that all matrix elements of γ_z, and
similarly of γ_x and γ_y, in a representation with j diagonal, vanish
except those referring to two j values differing by ±ℏ. The coefficients
of γ_x and γ_y in the second term on the right-hand side of (43)
commute with j, so the representative of the whole of this term will
contain only matrix elements referring to two j values differing by
±ℏ, and thus referring to two different energy-levels of the unperturbed
system.
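These algebraic properties of γ can be verified concretely in matrices. The sketch below is an editorial illustration, not part of the text: it takes the smallest non-trivial case, two commuting angular momenta of magnitude ½ℏ built from Pauli matrices (ℏ = 1), forms γ exactly as in (44), and checks both that j_xγ_x+j_yγ_y+j_zγ_z = 0 and that j_zγ_x−γ_xj_z = iγ_y, which is the commutator form of the Poisson-bracket relation [j_z, γ_x] = γ_y.

```python
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    na, nb = len(a), len(b)
    return [[a[i][j] * b[k][m] for j in range(na) for m in range(nb)]
            for i in range(na) for k in range(nb)]

def norm(a):
    return max(abs(x) for row in a for x in row)

I2 = [[1, 0], [0, 1]]
hx = [[0, 0.5], [0.5, 0]]       # spin-1/2 angular momentum matrices (hbar = 1)
hy = [[0, -0.5j], [0.5j, 0]]
hz = [[0.5, 0], [0, -0.5]]

l = [kron(h, I2) for h in (hx, hy, hz)]   # stands in for l
s = [kron(I2, h) for h in (hx, hy, hz)]   # stands in for s
j = [add(a, b) for a, b in zip(l, s)]     # j = l + s

# gamma exactly as defined in (44)
gx = add(mul(s[2], j[1]), scale(-1, mul(j[2], s[1])))   # s_z j_y - j_z s_y
gy = add(mul(j[2], s[0]), scale(-1, mul(s[2], j[0])))   # j_z s_x - s_z j_x
gz = add(mul(l[0], s[1]), scale(-1, mul(l[1], s[0])))   # l_x s_y - l_y s_x

jg = add(add(mul(j[0], gx), mul(j[1], gy)), mul(j[2], gz))
print(norm(jg))                                      # 0: j.gamma vanishes

comm = add(mul(j[2], gx), scale(-1, mul(gx, j[2])))  # j_z g_x - g_x j_z
print(norm(add(comm, scale(-1, scale(1j, gy)))))     # 0: equals i*gamma_y
```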
Hence the perturbing energy V becomes, when we neglect that
part of it whose representative consists of matrix elements referring
to two different unperturbed energy-levels,

e\mathscr{H}/2mc\,.\,\{1+(j_x s_x+j_y s_y+j_z s_z)(j_x^2+j_y^2+j_z^2)^{-1}\}\,j_z.   (45)

Since j_x s_x+j_y s_y+j_z s_z = ½{j(j+ℏ)+s(s+ℏ)−l(l+ℏ)} and j_x²+j_y²+j_z² = j(j+ℏ),
this equals

e\mathscr{H}/2mc\,.\,\Big\{1+\frac{j(j+\hbar)+s(s+\hbar)-l(l+\hbar)}{2j(j+\hbar)}\Big\}\,j_z.   (46)

The eigenvalues of this give the first-order changes in the energy-levels.
We can make the representative of this expression diagonal
by choosing our representation such that j_z is diagonal, and it then
gives us directly the first-order changes in the energy-levels caused by
the magnetic field. This expression is known as Landé's formula.
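In the conventional spectroscopic units, in which angular momenta are quoted in multiples of ℏ so that j(j+ℏ) becomes j(j+1)ℏ², the factor in braces in Landé's formula is the familiar g-factor. The short check below is an editorial illustration; the doublet terms chosen are the usual alkali examples.

```python
def lande_g(l, s, j):
    """Lande factor 1 + [j(j+1) + s(s+1) - l(l+1)] / [2 j(j+1)]."""
    return 1 + (j*(j + 1) + s*(s + 1) - l*(l + 1)) / (2 * j*(j + 1))

print(lande_g(0, 0.5, 0.5))   # 2S_1/2: g = 2
print(lande_g(1, 0.5, 0.5))   # 2P_1/2: g = 2/3
print(lande_g(1, 0.5, 1.5))   # 2P_3/2: g = 4/3
```

The different g-factors of the levels of a multiplet are what produce the anomalous (unequally spaced) Zeeman pattern.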
The result (46) holds only provided the perturbing energy V is small
compared with the energy differences within a multiplet. For larger
values of V a more complicated theory is required. For very strong
fields, however, for which V is large compared with the energy differences
within a multiplet, the theory is again very simple. We may
now neglect altogether the energy of the spin magnetic moments for
the atom with no external field, so that for our unperturbed system
the vectors l and s themselves are constants of the motion, and not
merely their magnitudes l and s. Our perturbing energy V, which is
still eℋ/2mc.(j_z+s_z), is now a constant of the motion for the unperturbed
system, so that its eigenvalues give directly the changes in the
energy-levels. These eigenvalues are integral or half-odd integral
multiples of eℋℏ/2mc according to whether the number of electrons
in the atom is even or odd.
VIII
COLLISION PROBLEMS
48. General remarks
IN this chapter we shall investigate problems connected with a particle
which, coming from infinity, encounters or 'collides with' some
atomic system and, after being scattered through a certain angle, goes
off to infinity again. The atomic system which does the scattering
we shall call, for brevity, the scatterer. We thus have a dynamical
system composed of an incident particle and a scatterer interacting
with each other, which we must deal with according to the laws of
quantum mechanics, and for which we must, in particular, calculate
the probability of scattering through any given angle. The scatterer
is usually assumed to be of infinite mass and to be at rest throughout
the scattering process. The problem was first solved by Born by a
method substantially equivalent to that of the next section. We must
take into account the possibility that the scatterer, considered as a
system by itself, may have a number of different stationary states
and that if it is initially in one of these states when the particle arrives
from infinity, it may be left in a different one when the particle goes
off to infinity again. The colliding particle may thus induce transitions
in the scatterer.
The Hamiltonian for the whole system of scatterer plus particle
will not involve the time explicitly, so that this whole system will
have stationary states represented by periodic solutions of Schrödinger's
wave equation. The meaning of these stationary states
requires a little care to be properly understood. It is evident that
for any state of motion of the system the particle will spend nearly all
its time at infinity, so that the time average of the probability of the
particle being in any finite volume will be zero. Now for a stationary
state the probability of the particle being in a given finite volume,
like any other result of observation, must be independent of the time,
and hence this probability will equal its time average, which we have
seen is zero. Thus only the relative probabilities of the particle being
in different finite volumes will be physically significant, their absolute
values being all zero. The total energy of the system has a continuous
range of eigenvalues, since the initial energy of the particle can be
anything. Thus a ket, |s⟩ say, corresponding to a stationary state,
being an eigenket of the total energy, must be of infinite length. We
can see a physical reason for this, since if |s⟩ were normalized and if
δ denotes that observable (a certain function of the position of
the particle) that is equal to unity if the particle is in a given finite
volume and zero otherwise, then ⟨s|δ|s⟩ would be zero, meaning that
the average value of δ, i.e. the probability of the particle being in the
given volume, is zero. Such a ket |s⟩ would not be a convenient one
to work with. However, with |s⟩ of infinite length, ⟨s|δ|s⟩ can be
finite and would then give the relative probability of the particle
being in the given volume.
In picturing a state of a system corresponding to a ket |x⟩ which
is not normalized, but for which ⟨x|x⟩ = n say, it may be convenient
to suppose that we have n similar systems all occupying the same
space but with no interaction between them, so that each one follows
out its own motion independently of the others, as we had in the
theory of the Gibbs ensemble in § 33. We can then interpret ⟨x|α|x⟩,
where α is any observable, directly as the total α for all the n systems.
In applying these ideas to the above-mentioned |s⟩ of infinite length,
corresponding to a stationary state of the system of scatterer plus
colliding particle, we should picture an infinite number of such systems
with the scatterers all located at the same point and the particles
distributed continuously throughout space. The number of particles
in a given finite volume would be pictured as ⟨s|δ|s⟩, δ being the
observable defined above, which has the value unity when the particle
is in the given volume and zero otherwise. If the ket is represented
by a Schrödinger wave function involving the Cartesian coordinates
of the particle, then the square of the modulus of the wave function
could be interpreted directly as the density of particles in the picture.
One must remember, however, that each of these particles has its own
individual scatterer. Different particles may belong to scatterers in
different states. There will thus be one particle density for each state
of the scatterer, namely the density of those particles belonging to
scatterers in that state. This is taken account of by the wave function
involving variables describing the state of the scatterer in addition
to those describing the position of the particle.
For determining scattering coefficients we have to investigate
stationary states of the whole system of scatterer plus particle. For
instance, if we want to determine the probability of scattering in
various directions when the scatterer is initially in a given stationary
state and the incident particle has initially a given velocity in a given
direction, we must investigate that stationary state of the whole
system whose picture, according to the above method, contains at
great distances from the point of location of the scatterers only
particles moving with the given initial velocity and direction and
belonging each to a scatterer in the given initial stationary state,
together with particles moving outward from the point of location
of the scatterers and belonging possibly to scatterers in various
stationary states. This picture corresponds closely to the actual state
of affairs in an experimental determination of scattering coefficients,
with the difference that the picture really describes only one actual
system of scatterer plus particle. The distribution of outward moving
particles at infinity in the picture gives us immediately all the information
about scattering coefficients that could be obtained by experiment.
For practical calculations about the stationary state described
by this picture one may use a perturbation method somewhat like
that of § 43, taking as unperturbed system, for example, that for
which there is no interaction between the scatterer and particle.
In dealing with collision problems, a further possibility to be taken
into consideration is that the scatterer may perhaps be capable of
absorbing and re-emitting the particle. This possibility arises when
there exists one or more states of absorption of the whole system, a
state of absorption being an approximately stationary state which
is closed in the sense mentioned at the end of § 38 (i.e. for which
the probability of the particle being at a greater distance than r from
the scatterer tends to zero as r → ∞). Since a state of absorption is
only approximately stationary, its property of being closed will be
only a transient one, and after a sufficient lapse of time there will be
a finite probability of the particle being on its way to infinity.
Physically this means there is a finite probability of spontaneous
emission of the particle. The fact that we had to use the word
'approximately' in stating the conditions required for the phenomena
of emission and absorption to be able to occur shows that these conditions
are not expressible in exact mathematical language. One can give
a meaning to these phenomena only with reference to a perturbation
method. They occur when the unperturbed system (of scatterer plus
particle) has stationary states that are closed. The introduction of the
perturbation spoils the stationary property of these states and gives
rise to spontaneous emission and its converse absorption.
For calculating absorption and emission probabilities it is necessary
to deal with non-stationary states of the system, in contradistinction
to the case for scattering coefficients, so that the perturbation method
of § 44 must be used. Thus for calculating an emission coefficient
we must consider the non-stationary states of absorption described
above. Again, since an absorption is always followed by a re-emission,
it cannot be distinguished from a scattering in any experiment involving
a steady state of affairs, corresponding to a stationary state
of the system. The distinction can be made only by reference to a
non-steady state of affairs, e.g. by use of a stream of incident particles
that has a sharp beginning, so that the scattered particles will appear
immediately after the incident particles meet the scatterers, while
those that have been absorbed and re-emitted will begin to appear
only some time later. This stream of particles would be the picture
of a certain ket of infinite length, which could be used for calculating
the absorption coefficient.
49. The scattering coefficient
We shall now consider the calculation of scattering coefficients,
taking first the case when there is no absorption and emission, which
means that our unperturbed system has no closed stationary states.
We may conveniently take this unperturbed system to be that for
which there is no interaction between the scatterer and particle. Its
Hamiltonian will thus be of the form

E = H_s+W,   (1)

where H_s is that for the scatterer alone and W that for the particle
alone, namely, with neglect of relativistic mechanics,

W = 1/2m\,.\,(p_x^2+p_y^2+p_z^2).   (2)

The perturbing energy V, assumed small, will now be a function of
the Cartesian coordinates of the particle x, y, z, and also, perhaps,
of its momenta p_x, p_y, p_z, together with dynamical variables describing
the scatterer.
Since we are now interested only in stationary states of the whole
system, we use a perturbation method like that of § 43. Our unperturbed
system now necessarily has a continuous range of energy-levels,
since it contains a free particle, and this gives rise to certain
modifications in the perturbation method. The question of the change
in the energy-levels caused by the perturbation, which was the main
question of § 43, no longer has a meaning, and the convention in § 43
of using the same number of primes to denote nearly equal eigenvalues
of E and H now drops out. Again, the splitting of energy-levels
which we had in § 43 when the unperturbed system is degenerate
cannot now arise, since if the unperturbed system is degenerate the
perturbed one, which must also have a continuous range of energy-levels,
will also be degenerate to exactly the same extent.
We again use the general scheme of equations developed at the
beginning of § 43, equations (1) to (4) there, but we now take our
unperturbed stationary state forming the zero-order approximation
to belong to an energy-level E' just equal to the energy-level H' of
our perturbed stationary state. Thus the a's introduced in the second
of equations (3) § 43 are now all zero and the second of equations
(4) there now reads

(E'-E)|1\rangle = V|0\rangle.   (3)

Similarly, the third of equations (4) § 43 now reads

(E'-E)|2\rangle = V|1\rangle.   (4)

We shall proceed to solve equation (3) and to obtain the scattering
coefficient to the first order. We shall need equation (4) in § 51.
Let the α's denote a complete set of commuting observables describing
the scatterer, which are constants of the motion when the scatterer is
alone and may thus be used for labelling the stationary states of the
scatterer. This requires that H_s shall commute with the α's and be
a function of them. We can now take a representation of the whole
system in which the α's and x, y, z, the coordinates of the particle,
are diagonal. This will make H_s diagonal. Let |0⟩ be represented by
⟨xα'|0⟩ and |1⟩ by ⟨xα'|1⟩, the single variable x being written to
denote x, y, z and the prime being omitted from x for brevity. In the
same way the single differential d³x will be written to denote the
product dx dy dz. Equation (3), written in terms of representatives,
becomes, with the help of (1) and (2),
{E' − H_s(α') + ℏ²/2m . ∇²}⟨xα'|1⟩ = Σ_{α''} ∫ ⟨xα'|V|x''α''⟩ d³x'' ⟨x''α''|0⟩.
(5)
Suppose that the incident particle has the momentum p⁰ and that
the initial stationary state of the scatterer is α⁰. The stationary state
of our unperturbed system is now the one for which p = p⁰ and
α = α⁰, and hence its representative is
⟨xα'|0⟩ = δ_{α'α⁰} e^{i(p⁰,x)/ℏ}. (6)
This makes equation (5) reduce to
{E' − H_s(α') + ℏ²/2m . ∇²}⟨xα'|1⟩ = ∫ ⟨xα'|V|x⁰α⁰⟩ d³x⁰ e^{i(p⁰,x⁰)/ℏ}
or (k² + ∇²)⟨xα'|1⟩ = F, (7)
where k² = 2mℏ⁻²{E' − H_s(α')} (8)
and F = 2mℏ⁻² ∫ ⟨xα'|V|x⁰α⁰⟩ d³x⁰ e^{i(p⁰,x⁰)/ℏ}, (9)
a definite function of x, y, z, and α'. We must also have
E' = H_s(α⁰) + p⁰²/2m. (10)
Our problem now is to obtain a solution ⟨xα'|1⟩ of (7) which, for
values of x, y, z denoting points far from the scatterer, represents
only outward moving particles. The square of its modulus, |⟨xα'|1⟩|²,
will then give the density of scattered particles belonging to scatterers
in the state α' when the density of the incident particles is |⟨xα⁰|0⟩|²,
which is unity. If we transform to polar coordinates r, θ, φ, equation
(7) becomes
{k² + ∂²/∂r² + 2r⁻¹ ∂/∂r + r⁻²(sinθ)⁻¹ ∂/∂θ(sinθ ∂/∂θ) + r⁻²(sinθ)⁻² ∂²/∂φ²}⟨rθφα'|1⟩ = F. (11)
Now F must tend to zero as r → ∞, on account of the physical
requirement that the interaction energy between the scatterer and
particle must tend to zero as the distance between them tends to
infinity. If we neglect F in (11) altogether, an approximate solution
for large r is ⟨rθφα'|1⟩ = u(θφα') r⁻¹ e^{ikr}, (12)
where u is an arbitrary function of θ, φ, and α', since this expression
substituted in the left-hand side of (11) gives a result of order r⁻³.
When we do not neglect F, the solution of (11) will still be of the
form (12) for large r, provided F tends to zero sufficiently rapidly as
r → ∞, but the function u will now be definite and determined by the
solution for smaller values of r.
For values α' of the α's such that k², defined by (8), is positive, the
k in (12) must be chosen to be the positive square root of k², in order
that (12) may represent only outward moving particles, i.e. particles
for which the radial component of momentum, which from § 38
equals p_r − iℏr⁻¹ or −iℏ(∂/∂r + r⁻¹), has a positive value. We now
have that the density of scattered particles belonging to scatterers in
state α', equal to the square of the modulus of (12), falls off with
increasing r according to the inverse square law, as is physically
necessary, and their angular distribution is given by |u(θφα')|².
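The outward-moving character of (12) may be verified directly (a step written out here, not part of the original text): applying the radial momentum operator −iℏ(∂/∂r + r⁻¹) quoted above from § 38 to the asymptotic form (12) gives

```latex
-i\hbar\left(\frac{\partial}{\partial r}+\frac{1}{r}\right)
   u(\theta\phi\alpha')\,r^{-1}e^{ikr}
 = -i\hbar\left(-\frac{1}{r^{2}}+\frac{ik}{r}+\frac{1}{r^{2}}\right)
   u(\theta\phi\alpha')\,e^{ikr}
 = \hbar k\,u(\theta\phi\alpha')\,r^{-1}e^{ikr},
```

so (12) is an eigenstate of the radial momentum with eigenvalue ℏk, which is positive when k is taken as the positive square root of k².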
Further, the magnitude, P' say, of the momentum of these scattered
particles must equal kℏ, the momentum being radial for large r,
so that their energy is equal to
P'²/2m = k²ℏ²/2m = E' − H_s(α') = p⁰²/2m − {H_s(α') − H_s(α⁰)},
with the help of (8) and (10). This is just the energy of an incident
particle, namely p⁰²/2m, reduced by the increase in energy of the
scatterer, namely H_s(α') − H_s(α⁰), in agreement with the law of conservation
of energy. For values α' of the α's such that k² is negative
there are no scattered particles, the total initial energy being insufficient
for the scatterer to be left in the state α'.
We must now evaluate u(θφα') for a set of values α' for the α's such
that k² is positive, and obtain the angular distribution of the scattered
particles belonging to scatterers in state α'. It is sufficient to evaluate
u for the direction θ = 0 of the pole of the polar coordinates, since
this direction is arbitrary. We make use of Green's theorem, which
states that for any two functions of position A and B the volume
integral ∫(A∇²B − B∇²A) d³x taken over any volume equals the
surface integral ∫(A ∂B/∂n − B ∂A/∂n) dS taken over the boundary
of the volume, ∂/∂n denoting differentiation along the normal to
the surface. We take
A = e^{−ikr cosθ}, B = ⟨rθφα'|1⟩
and apply the theorem to a large sphere with the origin as centre.
The volume integrand is thus
e^{−ikr cosθ} ∇²⟨rθφα'|1⟩ − ⟨rθφα'|1⟩ ∇² e^{−ikr cosθ}
= e^{−ikr cosθ} (∇² + k²)⟨rθφα'|1⟩ = e^{−ikr cosθ} F
from (7) or (11), while the surface integrand is, with the help of (12),
e^{−ikr cosθ} ∂/∂r{u r⁻¹ e^{ikr}} − u r⁻¹ e^{ikr} ∂/∂r e^{−ikr cosθ}
= e^{−ikr cosθ} u(−r⁻² + ikr⁻¹) e^{ikr} + ik cosθ . u r⁻¹ e^{ikr} e^{−ikr cosθ}
= iku r⁻¹ (1 + cosθ) e^{ikr(1−cosθ)}
with neglect of r⁻². Hence we get
∫ e^{−ikr cosθ} F d³x = ∫₀^{2π} dφ ∫₀^π r² sinθ dθ . iku r⁻¹ (1 + cosθ) e^{ikr(1−cosθ)},
the volume integral on the left being taken over the whole of space.
The right-hand side becomes, on being integrated by parts with
respect to θ,
∫₀^{2π} dφ {[u(1 + cosθ) e^{ikr(1−cosθ)}]_{θ=0}^{θ=π} − ∫₀^π e^{ikr(1−cosθ)} d/dθ{u(1 + cosθ)} dθ}.
The second term in the { } brackets is of the order of magnitude of
r⁻¹, as would be revealed by further partial integrations, and may
therefore be neglected. We are thus left with
∫ e^{−ikr cosθ} F d³x = −2 ∫₀^{2π} dφ u(0φα') = −4π u(0φα'),
giving the value of u(0φα') for the direction θ = 0.
This result may be written
u(0φα') = −(4π)⁻¹ ∫ e^{−iP'r cosθ/ℏ} F d³x, (13)
since P' = kℏ. If the vector p' denotes the momentum of the scattered
particles coming off in a certain direction (and is thus of magnitude
P'), the value of u for this direction will be
u(θ'φ'α') = −(4π)⁻¹ ∫ e^{−i(p',x)/ℏ} F d³x,
as follows from (13) if one takes this direction to be the pole of the
polar coordinates. This becomes, with the help of (9),
u(θ'φ'α')
= −(2π)⁻¹ mℏ⁻² ∫∫ e^{−i(p',x)/ℏ} d³x ⟨xα'|V|x⁰α⁰⟩ d³x⁰ e^{i(p⁰,x⁰)/ℏ}
= −2πmh ⟨p'α'|V|p⁰α⁰⟩, (14)
when one makes a transformation from the coordinates x to the
momenta p of the particle, using the transformation function (54)
of § 23. The single letter p is here used as a label for the three
components of momentum.
The density of scattered particles belonging to scatterers in state
α' is now given by |u(θ'φ'α')|²/r². Since their velocity is P'/m, the
rate at which these particles appear per unit solid angle about the
direction of the vector p' will be P'/m . |u(θ'φ'α')|². The density of
the incident particles is, as we have seen, unity, so that the number
of incident particles crossing unit area per unit time is equal to their
velocity P⁰/m, where P⁰ is the magnitude of p⁰. Hence the effective
area that must be hit by an incident particle in order to be scattered
in a unit solid angle about the direction p' and then belong to a
scatterer in state α' will be
P'/P⁰ . |u(θ'φ'α')|² = 4π²m²h²P'/P⁰ . |⟨p'α'|V|p⁰α⁰⟩|². (15)
This is the scattering coefficient for transitions α⁰ → α' of the scatterer.
It depends on that matrix element ⟨p'α'|V|p⁰α⁰⟩ of the perturbing
energy V whose column p⁰α⁰ and whose row p'α' refer respectively to
the initial and final states of the unperturbed system, between which
the scattering transition process takes place. The result (15) is thus
in some ways analogous to the result (24) of § 44, although the
numerical coefficients are different in the two cases, corresponding
to the different natures of the two transition processes.
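As a numerical illustration (an addition, not part of Dirac's text), suppose the scatterer has no internal states, so that the α labels drop out and (14), (15) reduce to the familiar Born approximation: the amplitude is u = −(m/2πℏ²) ∫ e^{−i(q,x)/ℏ} V(x) d³x with q = p' − p⁰, and for elastic scattering (P' = P⁰) the coefficient (15) is just |u|². The sketch below checks the quadrature against the closed form for an assumed Yukawa potential V(r) = A e^{−r/a}/r, in units with ℏ = m = 1; the values of A, a, and q are purely illustrative.

```python
import math

hbar, m = 1.0, 1.0        # units with hbar = m = 1 (assumption for this sketch)
A, a = 1.0, 2.0           # illustrative Yukawa strength and range

def born_amplitude(q):
    """u = -(m/(2*pi*hbar^2)) * integral of exp(-i q.x/hbar) V(x) d^3x
    for V(r) = A*exp(-r/a)/r.  The angular integration is done analytically,
    leaving (4*pi*hbar/q) * integral of V(r)*sin(q r/hbar)*r dr, which is
    evaluated here by simple quadrature."""
    n, rmax = 100000, 100.0
    dr = rmax / n
    s = 0.0
    for i in range(1, n + 1):
        r = i * dr
        s += A * math.exp(-r / a) * math.sin(q * r / hbar)  # V(r)*r = A*exp(-r/a)
    integral = 4.0 * math.pi * hbar / q * s * dr
    return -m / (2.0 * math.pi * hbar**2) * integral

def cross_section_exact(q):
    """Closed form of |u|^2 for the Yukawa potential: 4 m^2 A^2/(q^2 + hbar^2/a^2)^2."""
    return 4.0 * m**2 * A**2 / (q**2 + (hbar / a)**2)**2

q = 1.3                   # momentum transfer |p' - p0|, illustrative
assert abs(born_amplitude(q)**2 - cross_section_exact(q)) < 1e-4
```

In the limit a → ∞ the closed form goes over into the Rutherford law, the classic check on the first-order treatment.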
50. Solution with the momentum representation
The result (15) for the scattering coefficient makes a reference only
to that representation in which the momentum p is diagonal. One
would thus expect to be able to get a more direct proof of the result
by working all the time in the p-representation, instead of working
in the x-representation and transforming at the end to the p-representation,
as was done in § 49. This would not at first sight appear
to be a great improvement, as the lack of directness of the x-representation
method is offset by its more direct applicability, it being
possible to picture the square of the modulus of the x-representative
of a state as the density of a stream of particles in process of being
scattered. The x-representation method has, however, other more
serious disadvantages. One of the main applications of the theory
of collisions is to the case of photons as incident particles. Now a
photon is not a simple particle but has a polarization. It is evident
from classical electromagnetic theory that a photon with a definite
momentum, i.e. one moving in a definite direction with a definite
frequency, may have a definite state of polarization (linear, circular,
etc.), while a photon with a definite position, which is to be pictured
as an electromagnetic disturbance confined to a very small volume,
cannot have any definite polarization. These facts mean that the
polarization observable of a photon commutes with its momentum
but not with its position. This results in the p-representation method
being immediately applicable to the case of photons, it being only
necessary to introduce the polarization variable into the representatives
and treat it along with the α's describing the scatterer, while the
x-representation method is not applicable. Further, in dealing with
photons, it is necessary to take relativistic mechanics into account.
This can easily be done in the p-representation method, but not so
easily in the x-representation method.
Equation (3) still holds with relativistic mechanics, but W is now
given by W²c⁻² = m²c² + P² = m²c² + p_x² + p_y² + p_z² (16)
instead of by (2). Written in terms of p-representatives, equation (3)
gives {E' − H_s(α') − W}⟨pα'|1⟩ = ⟨pα'|V|0⟩,
p being written instead of p' for brevity and W being understood as
a definite function of p_x, p_y, p_z given by (16).
This may be written
(W' − W)⟨pα'|1⟩ = ⟨pα'|V|0⟩, (17)
where W' = E' − H_s(α') (18)
and is the energy required by the law of conservation of energy for
a scattered particle belonging to a scatterer in state α'. The ket |0⟩
is represented by (6) in the x-representation and the basic ket |p⁰α⁰⟩
is represented by
⟨xα'|p⁰α⁰⟩ = δ_{α'α⁰} ⟨x|p⁰⟩ = δ_{α'α⁰} h^{−3/2} e^{i(p⁰,x)/ℏ},
from the transformation function (54) of § 23. Hence
|0⟩ = h^{3/2}|p⁰α⁰⟩, (19)
and equation (17) may be written
(W' − W)⟨pα'|1⟩ = h^{3/2}⟨pα'|V|p⁰α⁰⟩. (20)
We now make a transformation from the Cartesian coordinates
p_x, p_y, p_z of p to its polar coordinates P, ω, χ, given by
p_x = P cosω, p_y = P sinω cosχ, p_z = P sinω sinχ.
If in the new representation we take the weight function P² sinω,
then the weight attached to any volume of p-space will be the same
as in the previous p-representation, so that the transformation will
mean simply a relabelling of the rows and columns of the matrices
without any alteration of the matrix elements. Thus (20) will become
in the new representation
(W' − W)⟨Pωχα'|1⟩ = h^{3/2}⟨Pωχα'|V|P⁰ω⁰χ⁰α⁰⟩, (21)
W being now a function of the single variable P.
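Differentiating (16) gives a relation between P and W that is used repeatedly below (a step written out here for reference, not in the original):

```latex
\frac{W^{2}}{c^{2}} = m^{2}c^{2} + P^{2}
\;\Longrightarrow\;
\frac{2W\,dW}{c^{2}} = 2P\,dP
\;\Longrightarrow\;
P\,dP = \frac{W\,dW}{c^{2}},
```

which converts integrals over the magnitude of the momentum into integrals over the energy.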
The coefficient of ⟨Pωχα'|1⟩, namely W' − W, is now simply a
multiplying factor and not a differential operator as it was with the
x-representation method. We can therefore divide out by this factor
and obtain an explicit expression for ⟨Pωχα'|1⟩. When, however, α'
is such that W', defined by (18), is greater than mc², this factor will
have the value zero for a certain point in the domain of the variable
P, namely the point P = P', given in terms of W' by (16). The
function ⟨Pωχα'|1⟩ will then have a singularity at this point. This
singularity shows that ⟨Pωχα'|1⟩ represents an infinite number of
particles moving about at great distances from the scatterers with
energies indefinitely close to W' and it is therefore this singularity
that we have to study to get the angular distribution of the particles
at infinity.
The result of dividing out (21) by the factor W' − W is, according
to (13) of § 15,
⟨Pωχα'|1⟩ = h^{3/2}⟨Pωχα'|V|P⁰ω⁰χ⁰α⁰⟩/(W' − W) + λ(ωχα') δ(W' − W),
(22)
where λ is an arbitrary function of ω, χ, and α'. To give a meaning
to the first term on the right-hand side of (22), we make the convention
that its integral with respect to P over a range that includes the
value P' is the limit when ε → 0 of the integral when the small
domain P' − ε to P' + ε is excluded from the range of integration.
This is sufficient to make the meaning of (22) precise, since we are
interested effectively only in the integrals of the representatives of
states when the representation has continuous ranges of rows and
columns. We see that equation (21) is inadequate to determine the
representative ⟨Pωχα'|1⟩ completely, on account of the arbitrary
function λ occurring in (22). We must choose this λ such that
⟨Pωχα'|1⟩ represents only outward moving particles, since we want
the only inward moving particles to be those corresponding to |0⟩.
Let us take first the general case when the representative ⟨Pωχ|⟩
of a state of the particle satisfies an equation of the type
(W' − W)⟨Pωχ|⟩ = f(Pωχ), (23)
where f(Pωχ) is any function of P, ω, and χ, and W' is a number
greater than mc², so that ⟨Pωχ|⟩ is of the form
⟨Pωχ|⟩ = f(Pωχ)/(W' − W) + λ(ωχ) δ(W' − W), (24)
and let us determine now what λ must be in order that ⟨Pωχ|⟩ may
represent only outward moving particles. We can do this by transforming
⟨Pωχ|⟩ to the x-representation, or rather the (rθφ)-representation,
and comparing it with (12) for large values of r. The
transformation function is
⟨rθφ|Pωχ⟩ = h^{−3/2} e^{i(p,x)/ℏ} = h^{−3/2} e^{iPr[cosω cosθ + sinω sinθ cos(χ−φ)]/ℏ}.
For the direction θ = 0 we find
⟨r0φ|⟩ = h^{−3/2} ∫∫∫ e^{iPr cosω/ℏ} ⟨Pωχ|⟩ P² sinω dP dω dχ
= h^{−3/2} ∫∫ P² dP dχ . iℏ/Pr {[e^{iPr cosω/ℏ}⟨Pωχ|⟩]_{ω=0}^{ω=π} − ∫₀^π e^{iPr cosω/ℏ} ∂⟨Pωχ|⟩/∂ω dω}.
The second term in the { } brackets is of order r⁻², as may be verified
by further partial integrations with respect to ω, and can therefore
be neglected. We are left with
⟨r0φ|⟩ = ih^{−1/2} r⁻¹ ∫₀^∞ P dP {e^{−iPr/ℏ}⟨Pπχ|⟩ − e^{iPr/ℏ}⟨P0χ|⟩}, (25)
the χ-integration contributing a factor 2π, since at the poles ω = 0
and ω = π the representative does not depend on χ.
When we substitute for ⟨Pωχ|⟩ its value given by (24), the first
term in the integrand in (25) gives
ih^{−1/2} r⁻¹ ∫₀^∞ P dP e^{−iPr/ℏ} {f(Pπχ)/(W' − W) + λ(πχ) δ(W' − W)}. (26)
The term involving δ(W' − W) here may be integrated immediately
and gives, when one uses the relation P dP = W dW/c², which
follows from (16),
ih^{−1/2} c⁻² r⁻¹ ∫_{mc²}^∞ W dW e^{−iPr/ℏ} λ(πχ) δ(W' − W)
= ih^{−1/2} c⁻² r⁻¹ W' λ(πχ) e^{−iP'r/ℏ}. (27)
To integrate the other term in (26) we use the formula
∫₀^∞ g(P) e^{−iPr/ℏ}/(P' − P) dP = g(P') ∫₀^∞ e^{−iPr/ℏ}/(P' − P) dP, (28)
with neglect of terms involving r⁻¹, for any continuous function g(P),
which formula holds since ∫₀^∞ K(P) e^{−iPr/ℏ} dP is of order r⁻¹ for any
continuous function K(P) and since the difference
g(P)/(P' − P) − g(P')/(P' − P)
is continuous. The right-hand side of (28), when evaluated with
neglect of terms involving r⁻¹, and also with neglect of the small
domain P' − ε to P' + ε in the domain of integration, gives
g(P') ∫₀^∞ e^{−iPr/ℏ}/(P' − P) dP = g(P') e^{−iP'r/ℏ} ∫_{−∞}^∞ e^{i(P'−P)r/ℏ}/(P' − P) dP
= ig(P') e^{−iP'r/ℏ} ∫_{−∞}^∞ sin{(P' − P)r/ℏ}/(P' − P) dP = iπ g(P') e^{−iP'r/ℏ}. (29)
In our present example g(P) is
g(P) = ih^{−1/2} r⁻¹ P f(Pπχ)(P' − P)/(W' − W),
which has the limiting value when P = P',
g(P') = ih^{−1/2} r⁻¹ P' f(P'πχ) W'/P'c² = ih^{−1/2} c⁻² r⁻¹ W' f(P'πχ).
Substituting this in (29) and adding on the expression (27), we obtain
the following value for the integral (26),
h^{−1/2} c⁻² r⁻¹ W' {−πf(P'πχ) + iλ(πχ)} e^{−iP'r/ℏ}. (30)
Similarly the second term in the integrand in (25) gives
h^{−1/2} c⁻² r⁻¹ W' {−πf(P'0χ) − iλ(0χ)} e^{iP'r/ℏ}. (31)
The sum of these two expressions is the value of ⟨r0φ|⟩ when r is
large.
We require that ⟨r0φ|⟩ shall represent only outward moving
particles, and hence it must be of the form of a multiple of e^{iP'r/ℏ}.
Thus (30) must vanish, so that
λ(πχ) = −iπf(P'πχ). (32)
We see in this way that the condition that ⟨rθφ|⟩ shall represent
only outward moving particles in the direction θ = 0 fixes the value
of λ for the opposite direction θ = π. Since the direction θ = 0 or
ω = 0 of the pole of our polar coordinates is not in any way singular,
we can generalize (32) to
λ(ωχ) = −iπf(P'ωχ), (33)
which gives the value of λ for an arbitrary direction. This value
substituted in (24) gives a result that may be written
⟨Pωχ|⟩ = f(Pωχ){1/(W' − W) − iπ δ(W' − W)}, (34)
since one can substitute P' for P in the coefficient of a term involving
δ(W' − W) as a factor without changing the value of the term. The
condition that ⟨Pωχ|⟩ shall represent only outward moving particles is
thus that it shall contain the factor
{1/(W' − W) − iπ δ(W' − W)}. (35)
It is interesting to note that this factor is of the form of the right-hand
side of equation (15) of § 16.
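In more recent notation (an aside, not Dirac's wording), the factor (35), with the principal-value convention adopted after (22) for its first term, is the boundary value

```latex
\frac{1}{W'-W} - i\pi\,\delta(W'-W)
  \;=\; \lim_{\epsilon\to+0}\,\frac{1}{W'-W+i\epsilon},
```

so the condition for purely outward moving particles amounts to approaching the singular point P = P' from one definite side of the real axis.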
With λ given by (33), expression (30) vanishes and the value of
⟨r0φ|⟩ for large r is given by expression (31) alone, thus
⟨r0φ|⟩ = −2πh^{−1/2} c⁻² r⁻¹ W' f(P'0χ) e^{iP'r/ℏ}.
This may be generalized to
⟨rθφ|⟩ = −2πh^{−1/2} c⁻² r⁻¹ W' f(P'ωχ) e^{iP'r/ℏ},
giving the value of ⟨rθφ|⟩ for any direction θ, φ in terms of f(P'ωχ)
for the same direction labelled by ω, χ. This is of the form (12) with
u(ωχ) = −2πh^{−1/2} c⁻² W' f(P'ωχ)
and thus represents a distribution of outward moving particles of
momentum P' whose number is
4π²W'P'/hc² . |f(P'ωχ)|² (36)
per unit solid angle per unit time. This distribution is the one
represented by the ⟨Pωχ|⟩ of (34).
From this general result we can infer that, whenever we have a
representative ⟨Pωχ|⟩ representing only outward moving particles
and satisfying an equation of the type (23), the number per unit solid
angle per unit time of these particles is given by (36). If this ⟨Pωχ|⟩
occurs in a problem in which the number of incident particles is one
per unit volume, it will correspond to a scattering coefficient of
4π²W⁰W'P'/hc⁴P⁰ . |f(P'ωχ)|². (37)
It is only the value of the function f(Pωχ) for the point P = P' that
is of importance.
If we now apply this general theory to our equations (21) and
(22), we have
f(Pωχ) = h^{3/2}⟨Pωχα'|V|P⁰ω⁰χ⁰α⁰⟩.
Hence from (37) the scattering coefficient is
4π²h²W⁰W'P'/c⁴P⁰ . |⟨P'ωχα'|V|P⁰ω⁰χ⁰α⁰⟩|². (38)
If one neglects relativity and puts W⁰W'/c⁴ = m², this result reduces
to the result (15) obtained in the preceding section by means of
Green's theorem.
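The non-relativistic reduction can be made explicit (a step added here): expanding (16) for small P,

```latex
W = c\sqrt{m^{2}c^{2}+P^{2}} = mc^{2} + \frac{P^{2}}{2m}
      + O\!\left(\frac{P^{4}}{m^{3}c^{2}}\right),
\qquad
\frac{W^{0}W'}{c^{4}} = m^{2}\left(1 + O\!\left(\frac{P^{2}}{m^{2}c^{2}}\right)\right),
```

so to lowest order W⁰W'/c⁴ = m² and (38) goes over into (15).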
51. Dispersive scattering
We shall now determine the scattering when the incident particle
is capable of being absorbed, that is, when our unperturbed system
of scatterer plus particle has closed stationary states with the particle
absorbed. The existence of these closed states for the unperturbed
system will be found to have a considerable effect on the scattering
for the perturbed system, and indeed an effect that depends very
much on the energy of the incident particle, giving rise to the phenomenon
of dispersion in optics when the incident particle is taken to
be a photon.
We use a representation for which the basic kets correspond to
the stationary states of the unperturbed system, as was the case with
the p-representation of the preceding section. We take these stationary
states to be the states |p'α'⟩ for which the particle has a definite
momentum p' and the scatterer is in a definite state α', together with
the closed states, k say, which form a separate discrete set, and
assume that these states are all independent and orthogonal. This
assumption is not accurate when the particle is an electron or atomic
nucleus, since in this case for an absorbed state k the particle will
still certainly be somewhere, so that one would expect to be able to
expand |k⟩ in terms of the eigenkets |x'α'⟩ of x, y, z, and the α's,
and hence also in terms of the |p'α'⟩'s. On the other hand, when the
particle is a photon it will no longer exist for the absorbed states,
which are then certainly independent of and orthogonal to the states
|p'α'⟩ for which the particle does exist. Thus the assumption is valid
in this case, which is an important practical one.
Since we are concerned with scattering, we must still deal with
stationary states of the whole system. We shall now, however, have
to work to the second order of accuracy, so that we cannot use merely
the first-order equation (3), but must use also (4). Equation (3)
becomes, when written in terms of representatives in our present
representation,
(W' − W)⟨pα'|1⟩ = ⟨pα'|V|0⟩,
(E' − E_k)⟨k|1⟩ = ⟨k|V|0⟩,  (39)
where W' is the function of E' and the α's given by (18) and E_k is the
energy of the stationary state k of the unperturbed system. Similarly,
equation (4) becomes
(W' − W)⟨pα'|2⟩ = ⟨pα'|V|1⟩,
(E' − E_k)⟨k|2⟩ = ⟨k|V|1⟩.  (40)
Expanding the right-hand sides by matrix multiplication, we get
(W' − W)⟨pα'|2⟩
= Σ_{α''} ∫ ⟨pα'|V|p''α''⟩ d³p'' ⟨p''α''|1⟩ + Σ_{k''} ⟨pα'|V|k''⟩⟨k''|1⟩,
(E' − E_k)⟨k|2⟩  (41)
= Σ_{α''} ∫ ⟨k|V|p''α''⟩ d³p'' ⟨p''α''|1⟩ + Σ_{k''} ⟨k|V|k''⟩⟨k''|1⟩.
The ket |0⟩ is still given by (19), so (39) may be written
(W' − W)⟨pα'|1⟩ = h^{3/2}⟨pα'|V|p⁰α⁰⟩, (42)
(E' − E_k)⟨k|1⟩ = h^{3/2}⟨k|V|p⁰α⁰⟩. (43)
We may assume that the matrix elements ⟨k'|V|k''⟩ of V vanish,
since these matrix elements are not essential to the phenomena under
investigation, and if they did not vanish it would mean simply that
the absorbed states k had not been suitably chosen. We shall further
assume that the matrix elements ⟨p'α'|V|p''α''⟩ are of the second order
of smallness when the matrix elements ⟨k'|V|p''α''⟩, ⟨p'α'|V|k''⟩ are
taken to be of the first order of smallness. This assumption will be
justified for the case of photons in § 64. We now have from (43) and
(42) that ⟨k|1⟩ is of the first order of smallness, provided E' does not
lie near one of the discrete set of energy-levels E_k, and ⟨pα'|1⟩ is of
the second order. The value of ⟨pα'|2⟩ to the second order will thus
be given, from the first of equations (41), by
(W' − W)⟨pα'|2⟩ = h^{3/2} Σ_{k''} ⟨pα'|V|k''⟩⟨k''|V|p⁰α⁰⟩/(E' − E_{k''}).
The total correction in the wave function to the second order, namely
⟨pα'|1⟩ plus ⟨pα'|2⟩, therefore satisfies
(W' − W){⟨pα'|1⟩ + ⟨pα'|2⟩}
= h^{3/2}{⟨pα'|V|p⁰α⁰⟩ + Σ_k ⟨pα'|V|k⟩⟨k|V|p⁰α⁰⟩/(E' − E_k)}.
This equation is of the type (23), provided α' is such that W' > mc²,
which means that α' as a final state for the scatterer is not inconsistent
with the law of conservation of energy. We can therefore infer
from the general result (37) that the scattering coefficient is
4π²h²W⁰W'P'/c⁴P⁰ . |⟨p'α'|V|p⁰α⁰⟩ + Σ_k ⟨p'α'|V|k⟩⟨k|V|p⁰α⁰⟩/(E' − E_k)|². (44)
The scattering may now be considered as composed of two parts,
a part that arises from the matrix element ⟨p'α'|V|p⁰α⁰⟩ of the perturbing
energy and a part that arises from the matrix elements
⟨p'α'|V|k⟩ and ⟨k|V|p⁰α⁰⟩. The first part, which is the same as our
previously obtained result (38), may be called the direct scattering.
The second part may be considered as arising from an absorption of
the incident particle into some state k, followed immediately by a
re-emission in a different direction, and is like the transitions through
an intermediate state considered in § 44. The fact that we have to
add the two terms before taking the square of the modulus denotes
interference between the two kinds of scattering. There is no experimental
way of separating the two kinds, the distinction between
them being only mathematical.
52. Resonance scattering
Suppose the energy of the incident particle to be varied continuously
while the initial state α⁰ of the scatterer is kept fixed, so
that the total energy E' or H' varies continuously. The formula (44)
now shows that as E' approaches one of the discrete set of energy-levels
E_k, the scattering becomes very large. In fact, according to
formula (44) the scattering should be infinite when E' is exactly equal
to an E_k. An infinite scattering coefficient is, of course, physically
impossible, so that we can infer that the approximations used in
deriving (44) are no longer legitimate when E' is close to an E_k. To
investigate the scattering in this case we must therefore go back to
the exact equation (E' − E)|H'⟩ = V|H'⟩,
equation (2) of § 43 with E' written for H', and use a different method
of approximating to its solution. This exact equation, written in
terms of representatives like (41), becomes
(W' − W)⟨pα'|H'⟩
= Σ_{α''} ∫ ⟨pα'|V|p''α''⟩ d³p'' ⟨p''α''|H'⟩ + Σ_{k''} ⟨pα'|V|k''⟩⟨k''|H'⟩,
(E' − E_k)⟨k|H'⟩  (45)
= Σ_{α''} ∫ ⟨k|V|p''α''⟩ d³p'' ⟨p''α''|H'⟩ + Σ_{k''} ⟨k|V|k''⟩⟨k''|H'⟩.
Let us take one particular E_k and consider the case when E' is close
to it. The large term in the scattering coefficient (44) now arises from
those elements of the matrix representing V that lie in row k or in
column k, i.e. those of the type ⟨k|V|pα'⟩ or ⟨pα'|V|k⟩. The scattering
arising from the other matrix elements of V is of a smaller order
of magnitude. This suggests that in our exact equations (45) we should
make the approximation of neglecting all the matrix elements of V
except the important ones, which are those of the type ⟨pα'|V|k⟩ or
⟨k|V|pα'⟩, where α' is a state of the scatterer that has not too much
energy to be disallowed as a final state by the law of conservation of
energy. These equations then reduce to
(W' − W)⟨pα'|H'⟩ = ⟨pα'|V|k⟩⟨k|H'⟩,
(E' − E_k)⟨k|H'⟩ = Σ_{α'} ∫ ⟨k|V|pα'⟩ d³p ⟨pα'|H'⟩,  (46)
the α' summation being over those values of α' for which W', given
by (18), is > mc². These equations are now sufficiently simple for us
to be able to solve exactly without further approximation.
From the first of equations (46) we obtain by division
⟨pα'|H'⟩ = ⟨pα'|V|k⟩⟨k|H'⟩/(W' − W) + λ δ(W' − W). (47)
We must choose λ, which may be any function of the momentum
p and α', such that (47) represents the incident particles corresponding
to |0⟩ or h^{3/2}|p⁰α⁰⟩ together with only outward moving particles. [The
representative of h^{3/2}|p⁰α⁰⟩ is actually of the form λ δ(W' − W), since
the conditions α' = α⁰ and p = p⁰ for it not to vanish lead to
W' = E' − H_s(α') = E' − H_s(α⁰) = W⁰ = W.] Thus (47) must be
⟨pα'|H'⟩ = h^{3/2}⟨pα'|p⁰α⁰⟩ +
+ ⟨pα'|V|k⟩⟨k|H'⟩{1/(W' − W) − iπ δ(W' − W)}, (48)
and from the general formula (37) the scattering coefficient will be
4π²W⁰W'P'/hc⁴P⁰ . |⟨p'α'|V|k⟩|² |⟨k|H'⟩|². (49)
It remains for us to determine the value of ⟨k|H'⟩. We can do this
by substituting for ⟨pα'|H'⟩ in the second of equations (46) its value
given by (48). This gives
(E' − E_k)⟨k|H'⟩ = h^{3/2}⟨k|V|p⁰α⁰⟩ +
+ ⟨k|H'⟩ Σ_{α'} ∫ |⟨k|V|pα'⟩|² {1/(W' − W) − iπ δ(W' − W)} d³p
= h^{3/2}⟨k|V|p⁰α⁰⟩ + ⟨k|H'⟩(a − ib),
where a = Σ_{α'} ∫ |⟨k|V|pα'⟩|² d³p/(W' − W) (50)
and b = π Σ_{α'} ∫ |⟨k|V|pα'⟩|² δ(W' − W) d³p
= π Σ_{α'} ∫∫∫ |⟨k|V|Pωχα'⟩|² δ(W' − W) P² dP sinω dω dχ
= π Σ_{α'} P'W'c⁻² ∫∫ |⟨k|V|P'ωχα'⟩|² sinω dω dχ. (51)
Thus ⟨k|H'⟩ = h^{3/2}⟨k|V|p⁰α⁰⟩/(E' − E_k − a + ib). (52)
Note that a and b are real and that b is positive.
This value for ⟨k|H'⟩ substituted in (49) gives for the scattering
coefficient
4π²h²W⁰W'P'/c⁴P⁰ . |⟨p'α'|V|k⟩|² |⟨k|V|p⁰α⁰⟩|²/{(E' − E_k − a)² + b²}. (53)
One can obtain the total effective area that the incident particle
must hit in order to be scattered anywhere by integrating (53) over
all directions of scattering, i.e. by integrating over all directions of
the vector p' with its magnitude kept fixed at P', and then summing
over all α' that are to be taken into consideration, i.e. for which
W' > mc². This gives, with the help of (51), the result
4πh²W⁰/c²P⁰ . b|⟨k|V|p⁰α⁰⟩|²/{(E' − E_k − a)² + b²}. (54)
If we suppose E' to vary continuously through the value E_k, the
main variation of (53) or (54) will be due to the small denominator
(E' − E_k − a)² + b². If we neglect the dependence of the other factors
in (53) and (54) on E', then the maximum scattering will occur when
E' has the value E_k + a and the scattering will be half its maximum
when E' differs from this value by an amount b. The large amount of
scattering that occurs for values of the energy of the incident particle
that make E' nearly equal to E_k gives rise to the phenomenon of an
absorption line. The centre of the line is displaced by an amount
a from the resonance energy of the incident particle, i.e. the energy
which would make the total energy just E_k, while the quantity b is
what is sometimes called the half-width of the line.
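The energy dependence just described is a Lorentzian in E', and the statements about the maximum and the half-width can be checked numerically (a sketch added here; the values of E_k, a, and b are purely illustrative):

```python
import math

# illustrative numbers (assumed): level position, line shift, half-width
Ek, a, b = 5.0, 0.3, 0.05

def line(E):
    """Energy dependence of the resonance coefficients (53), (54): b/((E-Ek-a)^2+b^2)."""
    return b / ((E - Ek - a)**2 + b**2)

peak = line(Ek + a)                                # maximum at E' = Ek + a
assert abs(line(Ek + a + b) - peak / 2) < 1e-12    # half maximum one half-width away
assert abs(line(Ek + a - b) - peak / 2) < 1e-12

# integral of the line over E' equals pi, as used in the next section
n, lo, hi = 400000, Ek - 400.0, Ek + 400.0
h = (hi - lo) / n
area = sum(line(lo + (i + 0.5) * h) for i in range(n)) * h   # midpoint rule
assert abs(area - math.pi) < 1e-3
```

The last assertion anticipates the integral ∫ b/{(E' − E_k − a)² + b²} dE' = π used in § 53 below.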
53. Emission and absorption
For studying emission and absorption we must consider non-stationary
states of the system and must use the perturbation method
of § 44. To determine the coefficient of spontaneous emission we must
take an initial state for which the particle is absorbed, corresponding
to a ket |k⟩, and determine the probability that at some later time
the particle shall be on its way to infinity with a definite momentum.
The method of § 46 can now be applied. From the result (39) of that
section we see that the probability per unit time per unit range of ω'
and χ' of the particle being emitted in any direction ω', χ' with the
scatterer being left in state α' is
2πℏ⁻¹ |⟨W'ω'χ'α'|V|k⟩|², (55)
provided, of course, that α' is such that the energy W', given by (18),
of the particle is greater than mc². For values of α' that do not satisfy
this condition there is no emission possible. The matrix element
⟨W'ω'χ'α'|V|k⟩ here must refer to a representation in which W, ω, χ,
and the α's are diagonal with the weight function unity. The matrix
elements of V appearing in the three preceding sections refer to a representation
in which p_x, p_y, p_z are diagonal with the weight function
unity, or P, ω, χ are diagonal with the weight function P² sinω.
They would thus refer to a representation in which W, ω, χ are
diagonal with the weight function dP/dW . P² sinω = WP/c² . sinω.
Thus the matrix element ⟨W'ω'χ'α'|V|k⟩ in (55) is equal to
(W'P'/c² . sinω')^{1/2} times our previous matrix element ⟨P'ω'χ'α'|V|k⟩
or ⟨p'α'|V|k⟩, so that (55) is equal to
2πℏ⁻¹ W'P'c⁻² sinω' |⟨p'α'|V|k⟩|².
The probability of emission per unit solid angle per unit time, with
the scatterer simultaneously dropping to state α', is thus
2πW'P'/ℏc² . |⟨p'α'|V|k⟩|². (56)
To obtain the total probability per unit time of the particle being
emitted in any direction, with any final state for the scatterer, we
must integrate (56) over all angles ω', χ' and sum over all states α'
whose energy H_s(α') is such that H_s(α') + mc² < E_k. The result is
just 2b/ℏ, where b is defined by (51). There is thus this simple relation
between the total emission coefficient and the half-width b of the
absorption line.
Let us now consider absorption. This requires that we shall take
an initial state for which the particle is certainly not absorbed but is
incident with a definite momentum. Thus the ket corresponding to
the initial state must be of the form (19). We must now determine
the probability of the particle being absorbed after time t. Since our
final state k is not one of a continuous range, we cannot use directly
the result (39) of § 46. If, however, we take
|0⟩ = |p⁰α⁰⟩, (57)
as the ket corresponding to the initial state, the analysis of §§ 44 and 46
is still applicable as far as equation (36) and shows us that the probability
of the particle being absorbed into state k after time t is
2|⟨k|V|p⁰α⁰⟩|² [1 − cos{(E_k − E')t/ℏ}]/(E_k − E')².
This corresponds to a distribution of incident particles of density
h⁻³, owing to the omission of the factor h^{3/2} from (57), as compared
with (19). The probability of there being an absorption after time
t when there is one incident particle crossing unit area per unit time
is therefore
2h³W⁰/c²P⁰ . |⟨k|V|p⁰α⁰⟩|² [1 − cos{(E_k − E')t/ℏ}]/(E_k − E')². (58)
To obtain the absorption coefficient we must consider the incident
particles not all to have exactly the same energy W⁰ = E' − H_s(α⁰),
but to have a distribution of energy values about the correct value
E_k − H_s(α⁰) required for absorption. If we take a beam of incident
particles consisting of one crossing unit area per unit time per unit
energy range, the probability of there being an absorption after time
t will be given by the integral of (58) with respect to E'. This integral
may be evaluated in the same way as (37) of § 46 and is equal to
4π²h²W⁰t/c²P⁰ . |⟨k|V|p⁰α⁰⟩|².
The probability per unit time of an absorption taking place with an
incident beam of one particle per unit area per unit time per unit
energy range is therefore
4π²h²W⁰/c²P⁰ . |⟨k|V|p⁰α⁰⟩|², (59)
which is the absorption coefficient.
The connexion between the absorption and emission coefficients
(59) and (56) and the resonance scattering coefficients calculated in
the preceding section should be noted. When the incident beam does
not consist of particles all with the same energy, but consists of a unit
distribution of particles per unit energy range crossing unit area per
unit time, the total number of incident particles with energies near
an absorption line that get scattered will be given by the integral
of (54) with respect to E'. If one neglects the dependence of the
numerator of (54) on E', this integral will, since
∫_{−∞}^∞ b/{(E' − E_k − a)² + b²} dE' = π,
have just the value (59). Thus the total number of scattered particles
in the neighbourhood of an absorption line is equal to the total number
absorbed. We can therefore regard all these scattered particles as
absorbed particles that are subsequently re-emitted in a different
direction. Further, the number of particles in the neighbourhood of
the absorption line that get scattered per unit solid angle about a
given direction specified by p' and then belong to scatterers in state
α' will be given by the integral with respect to E' of (53), which
integral has in the same way the value
This is just equal to the absorption coefficient (59) multiplied by the
emission coefficient (56) divided by 2b/ℏ, the total emission coefficient.
This is in agreement with the point of view of regarding the resonance
scattered particles as those that are absorbed and then re-emitted,
with the absorption and emission processes governed independently
each by its own probability law, since this point of view would
make the fraction of the total number of absorbed particles that are
re-emitted in a unit solid angle about a given direction just the
emission coefficient for this direction divided by the total emission
coefficient.
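The line-shape integral used above is of Lorentzian type and can be checked numerically. A minimal sketch, assuming the form $\int b\,dE'/\{(E'-E^k)^2+b^2\} = \pi$ with half-width b; the values chosen for b and $E^k$ are arbitrary:

```python
import numpy as np

# Check  ∫ b dE' / ((E' - E^k)^2 + b^2) = π  (limits -∞ to +∞),
# here truncated to a wide finite range and evaluated by the
# trapezoidal rule.  b and Ek are arbitrary illustrative values.
b, Ek = 0.5, 3.0
E = np.linspace(Ek - 1.0e4, Ek + 1.0e4, 2_000_001)
integrand = b / ((E - Ek) ** 2 + b ** 2)
value = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(E))
print(value)   # close to π
```

The result is independent of both b and $E^k$, which is why the total number of scattered particles can match the total number absorbed whatever the width of the line.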
IX
SYSTEMS CONTAINING SEVERAL SIMILAR PARTICLES
54. Symmetrical and antisymmetrical states
If a system in atomic physics contains a number of particles of the
same kind, e.g. a number of electrons, the particles are absolutely
indistinguishable one from another. No observable change is made
when two of them are interchanged. This circumstance gives rise to
some curious phenomena in quantum mechanics having no analogue
in the classical theory, which arise from the fact that in quantum
mechanics a transition may occur resulting in merely the interchange
of two similar particles, which transition then could not be detected
by any observational means. A satisfactory theory ought, of course,
to count two observationally indistinguishable states as the same
state and to deny that any transition does occur when two similar
particles exchange places. We shall find that it is possible to reformulate
the theory so that this is so.
Suppose we have a system containing n similar particles. We may
take as our dynamical variables a set of variables $\xi_1$ describing the
first particle, the corresponding set $\xi_2$ describing the second particle,
and so on up to the set $\xi_n$ describing the nth particle. We shall then
have the $\xi_r$'s commuting with the $\xi_s$'s for $r \neq s$. (We may require
certain extra variables, describing what the system consists of in
addition to the n similar particles, but it is not necessary to mention
these explicitly in the present chapter.) The Hamiltonian describing
the motion of the system will now be expressible as a function of the
$\xi_1, \xi_2,\ldots, \xi_n$. The fact that the particles are similar requires that the
Hamiltonian shall be a symmetrical function of the $\xi_1, \xi_2,\ldots, \xi_n$, i.e. it
shall remain unchanged when the sets of variables $\xi_r$ are interchanged
or permuted in any way. This condition must hold, no matter what
perturbations are applied to the system. In fact, any quantity of
physical significance must be a symmetrical function of the $\xi$'s.
Let $|a_1\rangle$, $|b_1\rangle$,... be kets for the first particle considered as a dynamical
system by itself. There will be corresponding kets $|a_2\rangle$, $|b_2\rangle$,... for
the second particle by itself, and so on. We can get a ket for the
assembly by taking the product of kets for each particle by itself,
for example
$$|a_1\rangle|b_2\rangle|c_3\rangle\ldots|g_n\rangle = |a_1 b_2 c_3 \ldots g_n\rangle, \qquad (1)$$
say, according to the notation of (65) of §20. The ket (1) corresponds
to a special kind of state for the assembly, which may be described
by saying that each particle is in its own state, corresponding to its
own factor on the left-hand side of (1). The general ket for the
assembly is of the form of a sum or integral of kets like (1), and
corresponds to a state for the assembly for which one cannot say that
each particle is in its own state, but only that each particle is partly
in several states, in a way which is correlated with the other particles
being partly in several states. If the kets $|a_1\rangle$, $|b_1\rangle$,... are a set of
basic kets for the first particle by itself, the kets $|a_2\rangle$, $|b_2\rangle$,... will be
a set of basic kets for the second particle by itself, and so on, and the
kets (1) will be a set of basic kets for the assembly. We call the representation
provided by such basic kets for the assembly a symmetrical
representation, as it treats all the particles on the same footing.
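Numerically, a product ket like (1) is a tensor (Kronecker) product of single-particle vectors. A minimal two-particle sketch, with arbitrarily chosen basis vectors in a hypothetical 3-dimensional single-particle space:

```python
import numpy as np

# Two-particle product ket |a_1>|b_2> modelled as a Kronecker product.
a = np.array([1.0, 0.0, 0.0])   # |a> in a 3-dimensional space
b = np.array([0.0, 1.0, 0.0])   # |b>
ket_ab = np.kron(a, b)           # |a_1 b_2>: a 9-component vector
ket_ba = np.kron(b, a)           # |b_1 a_2>: the interchanged ket
print(ket_ab.shape)              # (9,)
```

The two kets `ket_ab` and `ket_ba` are distinct vectors in the assembly space, which is exactly what makes the interchange a non-trivial linear operator below.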
In (1) we may interchange the kets for the first two particles and
get another ket for the assembly, namely
$$|b_1\rangle|a_2\rangle|c_3\rangle\ldots|g_n\rangle = |b_1 a_2 c_3 \ldots g_n\rangle.$$
More generally, we may interchange the role of the first two particles
in any ket for the assembly and get another ket for the assembly.
The process of interchanging the first two particles is an operator
which can be applied to kets for the assembly, and is evidently a
linear operator, of the type dealt with in §7. Similarly, the process
of interchanging any pair of particles is a linear operator, and by
repeated applications of such interchanges we get any permutation
of the particles appearing as a linear operator which can be applied
to kets for the assembly. A permutation is called an even permutation
or an odd permutation according to whether it can be built up from
an even or an odd number of interchanges.
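The parity of a permutation follows directly from its cycle structure, since a cycle of length L can be built from L−1 interchanges. A small sketch (the function name is an illustrative choice):

```python
def parity(perm):
    """Return +1 for an even permutation, -1 for an odd one.

    perm maps position i to perm[i]; it is a permutation of 0..n-1.
    A cycle of length L decomposes into L-1 interchanges, so the
    permutation is even iff the total number of interchanges is even.
    """
    n = len(perm)
    seen = [False] * n
    swaps = 0
    for i in range(n):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            swaps += length - 1
    return 1 if swaps % 2 == 0 else -1

print(parity([1, 0, 2]))   # one interchange: odd, gives -1
print(parity([1, 2, 0]))   # a 3-cycle = two interchanges: even, gives +1
```

This parity is what fixes the + or − sign attached to each term of the antisymmetrical sums appearing below.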
A ket for the assembly $|X\rangle$ is called symmetrical if it is unchanged
by any permutation, i.e. if
$$P|X\rangle = |X\rangle \qquad (2)$$
for any permutation P. It is called antisymmetrical if it is unchanged
by any even permutation and has its sign changed by any odd
permutation, i.e. if
$$P|X\rangle = \pm|X\rangle, \qquad (3)$$
the + or − sign being taken according to whether P is even or odd.
The state corresponding to a symmetrical ket is called a symmetrical
state, and the state corresponding to an antisymmetrical ket is called
an antisymmetrical state. In a symmetrical representation, the representative
of a symmetrical ket is a symmetrical function of the
variables referring to the various particles and the representative of
an antisymmetrical ket is an antisymmetrical function.
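Conditions (2) and (3) can be checked concretely for two particles, where the only non-trivial permutation is the interchange. A minimal sketch, assuming 2-dimensional single-particle spaces:

```python
import numpy as np

d = 2                                  # single-particle dimension (assumed)
# Interchange operator P on the two-particle space: P (x ⊗ y) = y ⊗ x.
P = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        P[j * d + i, i * d + j] = 1.0

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
sym  = np.kron(a, b) + np.kron(b, a)   # symmetrical ket:     P|X> = |X>
anti = np.kron(a, b) - np.kron(b, a)   # antisymmetrical ket: P|X> = -|X>
print(np.allclose(P @ sym, sym), np.allclose(P @ anti, -anti))
```

For n > 2 the same construction applies with one such operator for each pair of particles, and (2) or (3) must hold for all of them simultaneously.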
In the Schrödinger picture, the ket corresponding to a state of the
assembly will vary with time according to Schrödinger's equation of
motion. If it is initially symmetrical it must always remain symmetrical,
since, owing to the Hamiltonian being symmetrical, there
is nothing to disturb the symmetry. Similarly if the ket is initially
antisymmetrical it must always remain antisymmetrical. Thus a
state which is initially symmetrical always remains symmetrical and
a state which is initially antisymmetrical always remains antisymmetrical.
In consequence, it may be that for a particular kind of
particle only symmetrical states occur in nature, or only antisymmetrical
states occur in nature. If either of these possibilities
held, it would lead to certain special phenomena for the particles in
question.
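The conservation of symmetry rests on the symmetrical Hamiltonian commuting with every permutation. A sketch for two particles, using a hypothetical symmetrical Hamiltonian built from Pauli matrices (not one from the text):

```python
import numpy as np

sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])
I  = np.eye(2)

# A hypothetical symmetrical two-particle Hamiltonian: the same form
# for both particles, so interchanging them leaves H unchanged.
H = 0.5 * (np.kron(sz, I) + np.kron(I, sz)) + np.kron(sx, sx)

# The interchange operator for two 2-dimensional particles.
P = np.array([[1, 0, 0, 0],
              [0, 0, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1]], dtype=float)

print(np.allclose(H @ P - P @ H, 0))   # True: HP = PH
```

Since HP = PH, a ket satisfying P|X⟩ = ±|X⟩ retains that property under the equation of motion, which is the statement italicized above.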
Let us suppose first that only antisymmetrical states occur in
nature. The ket (1) is not antisymmetrical and so does not correspond
to a state occurring in nature. From (1) we can in general form
an antisymmetrical ket by applying all possible permutations to it
and adding the results, with the coefficient −1 inserted before those
terms arising from an odd permutation, so as to get
$$\sum_P \pm P\,|a_1 b_2 c_3 \ldots g_n\rangle, \qquad (4)$$
the + or − sign being taken according to whether P is even or odd.
The ket (4) may be written as a determinant
$$\begin{vmatrix} |a_1\rangle & |a_2\rangle & \ldots & |a_n\rangle \\ |b_1\rangle & |b_2\rangle & \ldots & |b_n\rangle \\ \ldots & \ldots & \ldots & \ldots \\ |g_1\rangle & |g_2\rangle & \ldots & |g_n\rangle \end{vmatrix} \qquad (5)$$
and its representative in a symmetrical representation is a determinant.
The ket (4) or (5) is not the general antisymmetrical ket, but
is a specially simple one. It corresponds to a state for the assembly
for which one can say that certain particle-states, namely the states
a, b, c,..., g, are occupied, but one cannot say which particle is in
which state, each particle being equally likely to be in any state. If
two of the particle-states a, b, c,..., g are the same, the ket (4) or (5)
vanishes and does not correspond to any state for the assembly.
Thus two particles cannot occupy the same state. More generally, the
occupied states must be all independent, otherwise (4) or (5) vanishes.
This is an important characteristic of particles for which only antisymmetrical
states occur in nature. It leads to a special statistics,
which was first studied by Fermi, so we shall call particles for which
only antisymmetrical states occur in nature fermions.
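The vanishing of the determinantal ket when two particle-states coincide can be verified directly: the representative of (5) is a determinant whose rows are the single-particle amplitudes, and equal rows give zero. A sketch with arbitrary amplitudes in a hypothetical 3-dimensional single-particle space:

```python
import numpy as np

def antisym_ket(states):
    """Antisymmetrized n-particle representative, built as a
    determinant (as in (5)) over the n-fold product basis."""
    n = len(states)
    dim = len(states[0])
    out = np.zeros((dim,) * n)
    for idx in np.ndindex(*(dim,) * n):
        # determinant of the matrix  M[r][s] = states[r][idx[s]]
        M = np.array([[states[r][idx[s]] for s in range(n)]
                      for r in range(n)])
        out[idx] = np.linalg.det(M)
    return out

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
print(np.allclose(antisym_ket([a, b]), 0))   # False: distinct states survive
print(np.allclose(antisym_ket([a, a]), 0))   # True: a repeated state vanishes
```

The second call places both particles in state a; two rows of the determinant coincide and the whole ket vanishes, which is the exclusion behaviour of fermions.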
Let us suppose now that only symmetrical states occur in nature.
The ket (1) is not symmetrical, except in the special case when all the
particle-states a, b, c,..., g are the same, but we can always obtain a
symmetrical ket from it by applying all possible permutations to it
and adding the results, so as to get
$$\sum_P P\,|a_1 b_2 c_3 \ldots g_n\rangle. \qquad (6)$$
The ket (6) is not the general symmetrical ket, but is a specially
simple one. It corresponds to a state for the assembly for which one
can say that certain particle-states are occupied, namely the states
a, b, c,..., g, without being able to say which particle is in which state.
It is now possible for two or more of the states a, b, c,..., g to be the
same, so that two or more particles can be in the same state. In spite
of this, the statistics of the particles is not the same as the usual
statistics of the classical theory. The new statistics was first studied
by Bose, so we shall call particles for which only symmetrical states
occur in nature bosons.
We can see the difference of Bose statistics from the usual statistics
by considering a special case, that of only two particles and only two
independent states a and b for a particle. According to classical
mechanics, if the assembly of two particles is in thermodynamic
equilibrium at a high temperature, each particle will be equally likely
to be in either state. There is thus a probability ¼ of both particles
being in state a, a probability ¼ of both particles being in state b,
and a probability ½ of one particle being in each state. In the quantum
theory there are three independent symmetrical states for the
pair of particles, corresponding to the symmetrical kets $|a_1\rangle|a_2\rangle$,
$|b_1\rangle|b_2\rangle$, and $|a_1\rangle|b_2\rangle + |b_1\rangle|a_2\rangle$, and describable as both particles in
state a, both particles in state b, and one particle in each state
respectively. For thermodynamic equilibrium at a high temperature
these three states are equally probable, as was shown in §33, so that
there is a probability ⅓ of both particles being in state a, a probability
⅓ of both particles being in state b, and a probability ⅓ of one particle
being in each state. Thus with Bose statistics the probability of two
particles being in the same state is greater than with classical statistics.
Bose statistics differ from classical statistics in the opposite direction
to Fermi statistics, for which the probability of two particles being
in the same state is zero.
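The counting behind this comparison can be reproduced by enumeration: classically the four arrangements aa, ab, ba, bb are equally likely, while with Bose statistics arrangements differing only by a permutation are one and the same state. A sketch:

```python
from fractions import Fraction
from itertools import product

# Classical: each of the 2 particles independently in state 'a' or 'b'.
classical = list(product("ab", repeat=2))        # 4 equally likely arrangements
p_both_a_classical = Fraction(classical.count(("a", "a")), len(classical))

# Bose: arrangements related by a permutation are the SAME state,
# so the distinct states are {aa, bb, ab}, all equally probable.
bose = sorted(set(tuple(sorted(arr)) for arr in classical))
p_both_a_bose = Fraction(sum(1 for s in bose if s == ("a", "a")), len(bose))

print(p_both_a_classical, p_both_a_bose)   # 1/4 versus 1/3
```

The enumeration gives ¼ classically and ⅓ with Bose statistics for both particles in state a, matching the probabilities quoted in the text.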
In building up a theory of atoms on the lines mentioned at the
beginning of §38, to get agreement with experiment one must assume
that two electrons are never in the same state. This rule is known as
Pauli's exclusion principle. It shows us that electrons are fermions.
Planck's law of radiation shows us that photons are bosons, as only the
Bose statistics for photons will lead to Planck's law. Similarly, for
each of the other kinds of particle known in physics, there is experimental
evidence to show either that they are fermions or that they
are bosons. Protons, neutrons, positrons are fermions, α-particles are
bosons. It appears that all particles occurring in nature are either
fermions or bosons, and thus only antisymmetrical or symmetrical
states for an assembly of similar particles are met with in practice.
Other more complicated kinds of symmetry are possible mathematically,
but do not apply to any known particles. With a theory which
allows only antisymmetrical or only symmetrical states for a particular
kind of particle, one cannot make a distinction between two states
which differ only through a permutation of the particles, so that the
transitions mentioned at the beginning of this section disappear.
55. Permutations as dynamical variables
We shall now build up a general theory for a system containing n
similar particles when states with any kind of symmetry properties
are allowed, i.e. when there is no restriction to only symmetrical or
only antisymmetrical states. The general state now will not be symmetrical
or antisymmetrical, nor will it be expressible linearly in
terms of symmetrical and antisymmetrical states when n > 2. This
theory will not apply directly to any particles occurring in nature,
but all the same it is useful for setting up an approximate treatment
for an assembly of electrons, as will be shown in §58.
We have seen that each permutation P of the n particles is a linear
operator which can be applied to any ket for the assembly. Hence
we can regard P as a dynamical variable in our system of n particles.
There are n! permutations, each of which can be regarded as a
dynamical variable. One of them, $P_1$ say, is the identical permutation,
which is equal to unity. The product of any two permutations is a
third permutation and hence any function of the permutations is
reducible to a linear function of them. Any permutation P has a
reciprocal $P^{-1}$ satisfying
$$PP^{-1} = P^{-1}P = P_1 = 1.$$
A permutation P can be applied to a bra $\langle X|$ for the assembly,
to give another bra, which we shall denote for the present by $P\langle X|$.
If P is applied to both factors of the product $\langle X|Y\rangle$, the product
must be unchanged, since it is just a number, independent of any
order of the particles. Thus
$$\{P\langle X|\}\,P|Y\rangle = \langle X|Y\rangle,$$
showing that
$$P\langle X| = \langle X|P^{-1}. \qquad (7)$$
Now $P\langle X|$ is the conjugate imaginary of $P|X\rangle$ and is thus equal to
$\langle X|\bar{P}$, and hence from (7)
$$\bar{P} = P^{-1}. \qquad (8)$$
Thus a permutation is not in general a real dynamical variable, its
conjugate complex being equal to its reciprocal.
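Relation (8) can be seen concretely: in a symmetrical representation a permutation acts as a real permutation matrix, whose conjugate transpose equals its reciprocal. A sketch for a 3-cycle:

```python
import numpy as np

# Permutation matrix for a 3-cycle: basis vector e_i -> e_perm[i].
perm = [1, 2, 0]
P = np.zeros((3, 3))
for i, j in enumerate(perm):
    P[j, i] = 1.0

# Conjugate (here just transpose, P being real) equals reciprocal,
# i.e. P-bar = P^{-1} as in (8).
print(np.allclose(P.T @ P, np.eye(3)))       # True
print(np.allclose(P.T, np.linalg.inv(P)))    # True
```

The matrix P is real but not symmetric, so it equals its reciprocal only for interchanges; this is the sense in which a general permutation fails to be a real dynamical variable.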
Any permutation of the numbers 1, 2, 3,..., n may be expressed in
the cyclic notation, e.g. with n = 8
$$P_a = (143)(27)(58)(6), \qquad (9)$$
in which each number is to be replaced by the succeeding number in
a bracket, unless it is the last in a bracket, when it is to be replaced
by the first in that bracket. Thus $P_a$ changes the numbers 12345678
into 47138625. The type of any permutation is specified by the
partition of the number n which is provided by the number of numbers
in each of the brackets. Thus the type of $P_a$ is specified by the
partition 8 = 3+2+2+1. Permutations of the same type, i.e. corresponding
to the same partition, we shall call similar. Thus, for
example, $P_a$ in (9) is similar to
$$P_b = (871)(35)(46)(2). \qquad (10)$$
The whole of the n! possible permutations may be divided into sets
of similar permutations, each such set being called a class. The permutation
$P_1 = 1$ forms a class by itself. Any permutation is similar
to its reciprocal.
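The cyclic notation and the partition specifying a permutation's type are easy to mechanize. A sketch using 1-based numbers, with $P_a$ of (9) as the example (the function names are illustrative choices):

```python
def cycles(perm):
    """Decompose a 1-based permutation (a dict i -> perm[i])
    into its cycles, as in the cyclic notation."""
    seen, out = set(), []
    for start in sorted(perm):
        if start not in seen:
            cyc, j = [], start
            while j not in seen:
                seen.add(j)
                cyc.append(j)
                j = perm[j]
            out.append(tuple(cyc))
    return out

def cycle_type(perm):
    """Partition of n giving the permutation's type (its class)."""
    return sorted((len(c) for c in cycles(perm)), reverse=True)

# P_a sends 12345678 -> 47138625, i.e. 1 -> 4, 2 -> 7, 3 -> 1, ...
Pa = dict(zip(range(1, 9), [4, 7, 1, 3, 8, 6, 2, 5]))
print(cycles(Pa))        # [(1, 4, 3), (2, 7), (5, 8), (6,)]
print(cycle_type(Pa))    # [3, 2, 2, 1]  -- the partition 8 = 3+2+2+1
```

The recovered cycles (143)(27)(58)(6) and the partition 3+2+2+1 agree with (9) and the type stated in the text.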
When two permutations $P_a$ and $P_b$ are similar, either of them $P_b$
may be obtained by making a certain permutation $P_x$ in the other
$P_a$. Thus, in our example (9), (10) we can take $P_x$ to be the permutation
that changes 14327586 into 87135462, i.e. the permutation
$$P_x = (18623)(475).$$
Different ways of writing $P_a$ and $P_b$ in the cyclic notation would lead
to different $P_x$'s. Any of these $P_x$'s applied to the product $P_a|X\rangle$
would change it into $P_b P_x|X\rangle$, i.e.
$$P_x P_a|X\rangle = P_b P_x|X\rangle.$$
Hence
$$P_b = P_x P_a P_x^{-1}, \qquad (11)$$
which expresses the condition for $P_a$ and $P_b$ to be similar as an
algebraic equation. The existence of any $P_x$ satisfying (11) is sufficient
to show that $P_a$ and $P_b$ are similar.
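Relation (11) can be verified with the concrete permutations of (9) and (10): conjugating $P_a$ by $P_x$ reproduces $P_b$. A sketch with 1-based permutations as dicts:

```python
def compose(p, q):
    """(p after q): i -> p(q(i)), for 1-based permutation dicts."""
    return {i: p[q[i]] for i in q}

def inverse(p):
    return {v: k for k, v in p.items()}

# P_a = (143)(27)(58)(6):  sends 12345678 -> 47138625
Pa = dict(zip(range(1, 9), [4, 7, 1, 3, 8, 6, 2, 5]))
# P_b = (871)(35)(46)(2)
Pb = {8: 7, 7: 1, 1: 8, 3: 5, 5: 3, 4: 6, 6: 4, 2: 2}
# P_x = (18623)(475)
Px = {1: 8, 8: 6, 6: 2, 2: 3, 3: 1, 4: 7, 7: 5, 5: 4}

# Check (11):  P_b = P_x P_a P_x^{-1}
print(compose(Px, compose(Pa, inverse(Px))) == Pb)   # True
```

Since $P_x P_a P_x^{-1}$ and $P_b$ agree on every number 1 to 8, the two permutations of (9) and (10) are similar, as claimed.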
56. Permutations as constants of the motion
Any