THE
PRINCIPLES
OF
QUANTUM MECHANICS
BY
P. A. M. DIRAC
LUCASIAN PROFESSOR OF MATHEMATICS
IN THE UNIVERSITY OF CAMBRIDGE
THIRD EDITION
OXFORD
AT THE CLARENDON PRESS
Oxford University Press, Amen House, London E.C.4
GLASGOW NEW YORK TORONTO MELBOURNE WELLINGTON
BOMBAY CALCUTTA MADRAS CAPE TOWN
Second edition 1935
Reprinted photographically in Great Britain
at the University Press, Oxford, 1948, 1949
from sheets of the third edition
PREFACE TO THIRD EDITION
THE book has again been mostly rewritten to bring in various
improvements. The chief of these is the use of the notation of bra
and ket vectors, which I have developed since 1939. This notation
allows a more direct connexion to be made between the formalism
in terms of the abstract quantities corresponding to states and
observables and the formalism in terms of representatives; in fact
the two formalisms become welded into a single comprehensive
scheme. With the help of this notation several of the deductions in
the book take a simpler and neater form.
Other substantial alterations include:
(i) A new presentation of the theory of systems with similar
particles, based on Fock's treatment of the theory of radiation
adapted to the present notation. This treatment is simpler and more
powerful than the one given in earlier editions of the book.
(ii) A further development of quantum electrodynamics, including
the theory of the Wentzel field. The theory of the electron in interaction
with the electromagnetic field is carried as far as it can be at
the present time without getting on to speculative ground.
P. A. M. D.
ST. JOHN'S COLLEGE, CAMBRIDGE
21 April 1947
FROM THE
PREFACE TO THE SECOND EDITION
THE book has been mostly rewritten. I have tried by carefully overhauling
the method of presentation to give the development of the
theory in a rather less abstract form, without making any sacrifices
in exactness of expression or in the logical character of the development.
This should make the work suitable for a wider circle of
readers, although the reader who likes abstractness for its own sake
may possibly prefer the style of the first edition.
The main change has been brought about by the use of the word
'state' in a three-dimensional non-relativistic sense. It would seem
at first sight a pity to build up the theory largely on the basis of
non-relativistic concepts. The use of the non-relativistic meaning of
'state', however, contributes so essentially to the possibilities of
clear exposition as to lead one to suspect that the fundamental ideas
of the present quantum mechanics are in need of serious alteration at
just this point, and that an improved theory would agree more closely
with the development here given than with a development which
aims at preserving the relativistic meaning of 'state' throughout.
P. A. M. D.
THE INSTITUTE FOR ADVANCED STUDY
PRINCETON
27 November 1934
FROM THE
PREFACE TO THE FIRST EDITION
THE methods of progress in theoretical physics have undergone a
vast change during the present century. The classical tradition
has been to consider the world to be an association of observable
objects (particles, fluids, fields, etc.) moving about according to
definite laws of force, so that one could form a mental picture in
space and time of the whole scheme. This led to a physics whose aim
was to make assumptions about the mechanism and forces connecting
these observable objects, to account for their behaviour in the
simplest possible way. It has become increasingly evident in recent
times, however, that nature works on a different plan. Her fundamental
laws do not govern the world as it appears in our mental
picture in any very direct way, but instead they control a substratum
of which we cannot form a mental picture without introducing
irrelevancies. The formulation of these laws requires the use
of the mathematics of transformations. The important things in
the world appear as the invariants (or more generally the nearly
invariants, or quantities with simple transformation properties)
of these transformations. The things we are immediately aware of
are the relations of these nearly invariants to a certain frame of
reference, usually one chosen so as to introduce special simplifying
features which are unimportant from the point of view of general
theory.
The growth of the use of transformation theory, as applied first to
relativity and later to the quantum theory, is the essence of the new
method in theoretical physics. Further progress lies in the direction
of making our equations invariant under wider and still wider transformations.
This state of affairs is very satisfactory from a philosophical
point of view, as implying an increasing recognition of the
part played by the observer in himself introducing the regularities
that appear in his observations, and a lack of arbitrariness in the ways
of nature, but it makes things less easy for the learner of physics.
The new theories, if one looks apart from their mathematical setting,
are built up from physical concepts which cannot be explained in
terms of things previously known to the student, which cannot even
be explained adequately in words at all. Like the fundamental concepts
(e.g. proximity, identity) which every one must learn on his
arrival into the world, the newer concepts of physics can be mastered
only by long familiarity with their properties and uses.
From the mathematical side the approach to the new theories
presents no difficulties, as the mathematics required (at any rate that
which is required for the development of physics up to the present)
is not essentially different from what has been current for a considerable
time. Mathematics is the tool specially suited for dealing with
abstract concepts of any kind and there is no limit to its power in this
field. For this reason a book on the new physics, if not purely descriptive
of experimental work, must be essentially mathematical. All the
same the mathematics is only a tool and one should learn to hold the
physical ideas in one's mind without reference to the mathematical
form. In this book I have tried to keep the physics to the forefront,
by beginning with an entirely physical chapter and in the later work
examining the physical meaning underlying the formalism wherever
possible. The amount of theoretical ground one has to cover before
being able to solve problems of real practical value is rather large, but
this circumstance is an inevitable consequence of the fundamental
part played by transformation theory and is likely to become more
pronounced in the theoretical physics of the future.
With regard to the mathematical form in which the theory can be
presented, an author must decide at the outset between two methods.
There is the symbolic method, which deals directly in an abstract way
with the quantities of fundamental importance (the invariants, etc.,
of the transformations) and there is the method of coordinates or
representations, which deals with sets of numbers corresponding to
these quantities. The second of these has usually been used for the
presentation of quantum mechanics (in fact it has been used practically
exclusively with the exception of Weyl's book Gruppentheorie
und Quantenmechanik). It is known under one or other of the two
names 'Wave Mechanics' and 'Matrix Mechanics' according to which
physical things receive emphasis in the treatment, the states of a
system or its dynamical variables. It has the advantage that the kind
of mathematics required is more familiar to the average student, and
also it is the historical method.
The symbolic method, however, seems to go more deeply into the
nature of things. It enables one to express the physical laws in a neat
and concise way, and will probably be increasingly used in the future
as it becomes better understood and its own special mathematics gets
developed. For this reason I have chosen the symbolic method,
introducing the representatives later merely as an aid to practical
calculation. This has necessitated a complete break from the historical
line of development, but this break is an advantage through
enabling the approach to the new ideas to be made as direct as
possible.
P. A. M. D.
ST. JOHN'S COLLEGE, CAMBRIDGE
29 May 1930
CONTENTS
I. The Principle of Superposition .......... 1
1. The Need for a Quantum Theory .......... 1
2. The Polarization of Photons .......... 4
3. Interference of Photons .......... 7
4. Superposition and Indeterminacy .......... 10
5. Mathematical Formulation of the Principle .......... 14
6. Bra and Ket Vectors .......... 18
II. Dynamical Variables and Observables .......... 23
7. Linear Operators .......... 23
8. Conjugate Relations .......... 26
9. Eigenvalues and Eigenvectors .......... 29
10. Observables .......... 34
11. Functions of Observables .......... 41
12. The General Physical Interpretation .......... 45
13. Commutability and Compatibility .......... 49
III. Representations .......... 53
14. Basic Vectors .......... 53
15. The δ Function .......... 58
16. Properties of the Basic Vectors .......... 62
17. The Representation of Linear Operators .......... 67
18. Probability Amplitudes .......... 72
19. Theorems about Functions of Observables .......... 76
20. Developments in Notation .......... 79
IV. The Quantum Conditions .......... 84
21. Poisson Brackets .......... 84
22. Schrödinger's Representation .......... 89
23. The Momentum Representation .......... 94
24. Heisenberg's Principle of Uncertainty .......... 97
25. Displacement Operators .......... 99
26. Unitary Transformations .......... 103
V. The Equations of Motion .......... 108
27. Schrödinger's Form for the Equations of Motion .......... 108
28. Heisenberg's Form for the Equations of Motion .......... 111
29. Stationary States .......... 116
30. The Free Particle .......... 118
31. The Motion of Wave Packets .......... 121
32. The Action Principle .......... 125
33. The Gibbs Ensemble .......... 130
VI. Elementary Applications .......... 136
34. The Harmonic Oscillator .......... 136
35. Angular Momentum .......... 140
36. Properties of Angular Momentum .......... 144
37. The Spin of the Electron .......... 149
38. Motion in a Central Field of Force .......... 152
39. Energy-levels of the Hydrogen Atom .......... 156
40. Selection Rules .......... 159
41. The Zeeman Effect for the Hydrogen Atom .......... 165
VII. Perturbation Theory .......... 167
42. General Remarks .......... 167
43. The Change in the Energy-levels caused by a Perturbation .......... 168
44. The Perturbation considered as causing Transitions .......... 172
45. Application to Radiation .......... 175
46. Transitions caused by a Perturbation Independent of the Time .......... 178
47. The Anomalous Zeeman Effect .......... 181
VIII. Collision Problems .......... 185
48. General Remarks .......... 185
49. The Scattering Coefficient .......... 188
50. Solution with the Momentum Representation .......... 193
51. Dispersive Scattering .......... 199
52. Resonance Scattering .......... 201
53. Emission and Absorption .......... 204
IX. Systems Containing Several Similar Particles .......... 207
54. Symmetrical and Antisymmetrical States .......... 207
55. Permutations as Dynamical Variables .......... 211
56. Permutations as Constants of the Motion .......... 213
57. Determination of the Energy-levels .......... 216
58. Application to Electrons .......... 219
X. Theory of Radiation .......... 225
59. An Assembly of Bosons .......... 225
60. The Connexion between Bosons and Oscillators .......... 227
61. Emission and Absorption of Bosons .......... 232
62. Applications to Photons .......... 235
63. The Interaction Energy between Photons and an Atom .......... 239
64. Emission, Absorption, and Scattering of Radiation .......... 244
65. An Assembly of Fermions .......... 248
XI. Relativistic Theory of the Electron .......... 252
66. Relativistic Treatment of a Particle .......... 252
67. The Wave Equation for the Electron .......... 253
68. Invariance under a Lorentz Transformation .......... 257
69. The Motion of a Free Electron .......... 260
70. Existence of the Spin .......... 263
71. Transition to Polar Variables .......... 266
72. The Fine-structure of the Energy-levels of Hydrogen .......... 268
73. Theory of the Positron .......... 272
XII. Quantum Electrodynamics .......... 275
74. Relativistic Notation .......... 275
75. The Quantum Conditions for the Field .......... 278
76. The Hamiltonian for the Field .......... 283
77. The Supplementary Conditions .......... 285
78. Classical Electrodynamics in Hamiltonian Form .......... 289
79. Passage to the Quantum Theory .......... 296
80. Elimination of the Longitudinal Waves .......... 300
81. Discussion of the Transverse Waves .......... 306
Index .......... 310
I
THE PRINCIPLE OF SUPERPOSITION
1. The need for a quantum theory
CLASSICAL mechanics has been developed continuously from the time
of Newton and applied to an ever-widening range of dynamical
systems, including the electromagnetic field in interaction with
matter. The underlying ideas and the laws governing their application
form a simple and elegant scheme, which one would be inclined
to think could not be seriously modified without having all its
attractive features spoilt. Nevertheless it has been found possible to
set up a new scheme, called quantum mechanics, which is more
suitable for the description of phenomena on the atomic scale and
which is in some respects more elegant and satisfying than the
classical scheme. This possibility is due to the changes which the
new scheme involves being of a very profound character and not
clashing with the features of the classical theory that make it so
attractive, as a result of which all these features can be incorporated
in the new scheme.
The necessity for a departure from classical mechanics is clearly
shown by experimental results. In the first place the forces known
in classical electrodynamics are inadequate for the explanation of the
remarkable stability of atoms and molecules, which is necessary in
order that materials may have any definite physical and chemical
properties at all. The introduction of new hypothetical forces will not
save the situation, since there exist general principles of classical
mechanics, holding for all kinds of forces, leading to results in direct
disagreement with observation. For example, if an atomic system has
its equilibrium disturbed in any way and is then left alone, it will be set
in oscillation and the oscillations will get impressed on the surrounding
electromagnetic field, so that their frequencies may be observed
with a spectroscope. Now whatever the laws of force governing the
equilibrium, one would expect to be able to include the various frequencies
in a scheme comprising certain fundamental frequencies and
their harmonics. This is not observed to be the case. Instead, there
is observed a new and unexpected connexion between the frequencies,
called Ritz's Combination Law of Spectroscopy, according to which all
the frequencies can be expressed as differences between certain terms,
the number of terms being much less than the number of frequencies.
This law is quite unintelligible from the classical standpoint.
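The arithmetic behind the Combination Law can be sketched in a few lines (this illustration is not in Dirac's text; the hydrogen-like term values T_n = R/n² follow the Rydberg formula, with R in arbitrary units): a small set of terms generates a much larger set of frequencies, since every pair of terms contributes one difference.

```python
from itertools import combinations

# Schematic illustration of Ritz's Combination Law: every observed
# spectral frequency is the difference between two "terms".  Here the
# terms take the hydrogen-like form T_n = R/n^2 (Rydberg), with the
# constant R in arbitrary units.
R = 1.0
terms = [R / n**2 for n in range(1, 5)]        # 4 terms

# All pairwise differences give the predicted line frequencies.
frequencies = sorted(abs(a - b) for a, b in combinations(terms, 2))

# n terms yield n(n-1)/2 frequencies, so the number of terms grows
# much more slowly than the number of observed lines.
print(len(terms), len(frequencies))            # 4 terms, 6 frequencies
```

With 10 terms one would already account for 45 lines, which is the economy the law expresses.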
One might try to get over the difficulty without departing from
classical mechanics by assuming each of the spectroscopically observed
frequencies to be a fundamental frequency with its own degree
of freedom, the laws of force being such that the harmonic vibrations
do not occur. Such a theory will not do, however, even apart from
the fact that it would give no explanation of the Combination Law,
since it would immediately bring one into conflict with the experimental
evidence on specific heats. Classical statistical mechanics
enables one to establish a general connexion between the total number
of degrees of freedom of an assembly of vibrating systems and its
specific heat. If one assumes all the spectroscopic frequencies of an
atom to correspond to different degrees of freedom, one would get a
specific heat for any kind of matter very much greater than the
observed value. In fact the observed specific heats at ordinary
temperatures are given fairly well by a theory that takes into account
merely the motion of each atom as a whole and assigns no internal
motion to it at all.
This leads us to a new clash between classical mechanics and the
results of experiment. There must certainly be some internal motion
in an atom to account for its spectrum, but the internal degrees of
freedom, for some classically inexplicable reason, do not contribute
to the specific heat. A similar clash is found in connexion with the
energy of oscillation of the electromagnetic field in a vacuum. Classical
mechanics requires the specific heat corresponding to this energy to
be infinite, but it is observed to be quite finite. A general conclusion
from experimental results is that oscillations of high frequency do
not contribute their classical quota to the specific heat.
As another illustration of the failure of classical mechanics we may
consider the behaviour of light. We have, on the one hand, the
phenomena of interference and diffraction, which can be explained
only on the basis of a wave theory; on the other, phenomena such as
photo-electric emission and scattering by free electrons, which show
that light is composed of small particles. These particles, which
are called photons, have each a definite energy and momentum, depending
on the frequency of the light, and appear to have just as
real an existence as electrons, or any other particles known in physics.
A fraction of a photon is never observed.
Experiments have shown that this anomalous behaviour is not
peculiar to light, but is quite general. All material particles have
wave properties, which can be exhibited under suitable conditions.
We have here a very striking and general example of the breakdown
of classical mechanics, not merely an inaccuracy in its laws of motion,
but an inadequacy of its concepts to supply us with a description of
atomic events.
The necessity to depart from classical ideas when one wishes to
account for the ultimate structure of matter may be seen, not only
from experimentally established facts, but also from general philosophical
grounds. In a classical explanation of the constitution of
matter, one would assume it to be made up of a large number of small
constituent parts and one would postulate laws for the behaviour of
these parts, from which the laws of the matter in bulk could be deduced.
This would not complete the explanation, however, since the
question of the structure and stability of the constituent parts is left
untouched. To go into this question, it becomes necessary to postulate
that each constituent part is itself made up of smaller parts, in
terms of which its behaviour is to be explained. There is clearly no
end to this procedure, so that one can never arrive at the ultimate
structure of matter on these lines. So long as big and small are merely
relative concepts, it is no help to explain the big in terms of the small.
It is therefore necessary to modify classical ideas in such a way as to
give an absolute meaning to size.
At this stage it becomes important to remember that science is
concerned only with observable things and that we can observe an
object only by letting it interact with some outside influence. An act
of observation is thus necessarily accompanied by some disturbance
of the object observed. We may define an object to be big when the
disturbance accompanying our observation of it may be neglected,
and small when the disturbance cannot be neglected. This definition
is in close agreement with the common meanings of big and small.
It is usually assumed that, by being careful, we may cut down the
disturbance accompanying our observation to any desired extent.
The concepts of big and small are then purely relative and refer to the
gentleness of our means of observation as well as to the object being
described. In order to give an absolute meaning to size, such as is
required for any theory of the ultimate structure of matter, we have
to assume that there is a limit to the fineness of our powers of observation
and the smallness of the accompanying disturbance, a limit which is
inherent in the nature of things and can never be surpassed by improved
technique or increased skill on the part of the observer. If the object under
observation is such that the unavoidable limiting disturbance is negligible,
then the object is big in the absolute sense and we may apply
classical mechanics to it. If, on the other hand, the limiting disturbance
is not negligible, then the object is small in the absolute
sense and we require a new theory for dealing with it.
A consequence of the preceding discussion is that we must revise
our ideas of causality. Causality applies only to a system which is
left undisturbed. If a system is small, we cannot observe it without
producing a serious disturbance and hence we cannot expect to find
any causal connexion between the results of our observations.
Causality will still be assumed to apply to undisturbed systems and
the equations which will be set up to describe an undisturbed system
will be differential equations expressing a causal connexion between
conditions at one time and conditions at a later time. These equations
will be in close correspondence with the equations of classical
mechanics, but they will be connected only indirectly with the results
of observations. There is an unavoidable indeterminacy in the calculation
of observational results, the theory enabling us to calculate in
general only the probability of our obtaining a particular result when
we make an observation.
2. The polarization of photons
The discussion in the preceding section about the limit to the
gentleness with which observations can be made and the consequent
indeterminacy in the results of those observations does not provide
any quantitative basis for the building up of quantum mechanics.
For this purpose a new set of accurate laws of nature is required.
One of the most fundamental and most drastic of these is the Principle
of Superposition of States. We shall lead up to a general formulation
of this principle through a consideration of some special cases, taking
first the example provided by the polarization of light.
It is known experimentally that when plane-polarized light is used
for ejecting photo-electrons, there is a preferential direction for the
electron emission. Thus the polarization properties of light are closely
connected with its corpuscular properties and one must ascribe a
polarization to the photons. One must consider, for instance, a beam
of light plane-polarized in a certain direction as consisting of photons
each of which is plane-polarized in that direction and a beam of
circularly polarized light as consisting of photons each circularly
polarized. Every photon is in a certain state of polarization, as we
shall say. The problem we must now consider is how to fit in these
ideas with the known facts about the resolution of light into polarized
components and the recombination of these components.
Let us take a definite case. Suppose we have a beam of light passing
through a crystal of tourmaline, which has the property of letting
through only light plane-polarized perpendicular to its optic axis.
Classical electrodynamics tells us what will happen for any given
polarization of the incident beam. If this beam is polarized perpendicular
to the optic axis, it will all go through the crystal; if
parallel to the axis, none of it will go through; while if polarized at
an angle α to the axis, a fraction sin²α will go through. How are we
to understand these results on a photon basis?
A beam that is plane-polarized in a certain direction is to be
pictured as made up of photons each plane-polarized in that
direction. This picture leads to no difficulty in the cases when our
incident beam is polarized perpendicular or parallel to the optic axis.
We merely have to suppose that each photon polarized perpendicular
to the axis passes unhindered and unchanged through the crystal,
while each photon polarized parallel to the axis is stopped and absorbed.
A difficulty arises, however, in the case of the obliquely
polarized incident beam. Each of the incident photons is then
obliquely polarized and it is not clear what will happen to such a
photon when it reaches the tourmaline.
A question about what will happen to a particular photon under
certain conditions is not really very precise. To make it precise one
must imagine some experiment performed having a bearing on the
question and inquire what will be the result of the experiment. Only
questions about the results of experiments have a real significance
and it is only such questions that theoretical physics has to consider.
In our present example the obvious experiment is to use an incident
beam consisting of only a single photon and to observe what appears
on the back side of the crystal. According to quantum mechanics
the result of this experiment will be that sometimes one will find a
whole photon, of energy equal to the energy of the incident photon,
on the back side and other times one will find nothing. When one
finds a whole photon, it will be polarized perpendicular to the optic
axis. One will never find only a part of a photon on the back side.
If one repeats the experiment a large number of times, one will find
the photon on the back side in a fraction sin²α of the total number
of times. Thus we may say that the photon has a probability sin²α
of passing through the tourmaline and appearing on the back side
polarized perpendicular to the axis and a probability cos²α of being
absorbed. These values for the probabilities lead to the correct
classical results for an incident beam containing a large number of
photons.
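The recovery of the classical result can be checked with a short simulation (a sketch, not part of Dirac's text): each photon independently passes with probability sin²α and is absorbed with probability cos²α, no photon is ever split, and the transmitted fraction of a large beam approaches the classical value sin²α.

```python
import math
import random

# Monte Carlo sketch of single-photon transmission through tourmaline:
# each obliquely polarized photon passes independently with
# probability sin^2(alpha) and is absorbed with probability
# cos^2(alpha); a fraction of a photon is never observed.
def transmitted_fraction(alpha, n_photons, rng):
    p_pass = math.sin(alpha) ** 2
    passed = sum(1 for _ in range(n_photons) if rng.random() < p_pass)
    return passed / n_photons

rng = random.Random(0)                 # fixed seed for repeatability
alpha = math.radians(30)               # 30 degrees from the optic axis
frac = transmitted_fraction(alpha, 100_000, rng)
print(round(frac, 2))                  # close to sin^2(30 deg) = 0.25
```

For a beam of many photons the statistical fluctuation shrinks, which is why the classical intensity law emerges from purely probabilistic single-photon events.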
In this way we preserve the individuality of the photon in all
cases. We are able to do this, however, only because we abandon the
determinacy of the classical theory. The result of an experiment is
not determined, as it would be according to classical ideas, by the
conditions under the control of the experimenter. The most that can
be predicted is a set of possible results, with a probability of occurrence
for each.
The foregoing discussion about the result of an experiment with a
single obliquely polarized photon incident on a crystal of tourmaline
answers all that can legitimately be asked about what happens to an
obliquely polarized photon when it reaches the tourmaline. Questions
about what decides whether the photon is to go through or not and
how it changes its direction of polarization when it does go through
cannot be investigated by experiment and should be regarded as
outside the domain of science. Nevertheless some further description
is necessary in order to correlate the results of this experiment with
the results of other experiments that might be performed with
photons and to fit them all into a general scheme. Such further
description should be regarded, not as an attempt to answer questions
outside the domain of science, but as an aid to the formulation of
rules for expressing concisely the results of large numbers of
experiments.
The further description provided by quantum mechanics runs as
follows. It is supposed that a photon polarized obliquely to the optic
axis may be regarded as being partly in the state of polarization
parallel to the axis and partly in the state of polarization perpendicular
to the axis. The state of oblique polarization may be considered
as the result of some kind of superposition process applied to
the two states of parallel and perpendicular polarization. This implies
a certain special kind of relationship between the various states of
polarization, a relationship similar to that between polarized beams in
classical optics, but which is now to be applied, not to beams, but to
the states of polarization of one particular photon. This relationship
allows any state of polarization to be resolved into, or expressed as a
superposition of, any two mutually perpendicular states of polarization.
When we make the photon meet a tourmaline crystal, we are
subjecting it to an observation. We are observing whether it is polarized
parallel or perpendicular to the optic axis. The effect of making this
observation is to force the photon entirely into the state of parallel
or entirely into the state of perpendicular polarization. It has to
make a sudden jump from being partly in each of these two states to
being entirely in one or other of them. Which of the two states it will
jump into cannot be predicted, but is governed only by probability
laws. If it jumps into the parallel state it gets absorbed and if it
jumps into the perpendicular state it passes through the crystal and
appears on the other side preserving this state of polarization.
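The claim that any state of polarization can be resolved into any two mutually perpendicular states can be illustrated with two-component vectors. The sketch below is a modern aid, not from the text (the function names are assumptions of mine): it resolves an obliquely polarized ket into the perpendicular pair at an arbitrary angle β and checks that the squared components always account for the whole photon, whatever pair is chosen.

```python
import math

def basis(beta):
    """A pair of mutually perpendicular linear-polarization kets,
    at angles beta and beta + 90 degrees to the optic axis."""
    return (math.cos(beta), math.sin(beta)), (-math.sin(beta), math.cos(beta))

def resolve(state, beta):
    """Components of a polarization ket in the perpendicular pair at beta."""
    e1, e2 = basis(beta)
    a1 = e1[0] * state[0] + e1[1] * state[1]
    a2 = e2[0] * state[0] + e2[1] * state[1]
    return a1, a2

oblique = (math.cos(math.radians(25)), math.sin(math.radians(25)))
for beta_deg in (0.0, 17.0, 60.0):          # any perpendicular pair will do
    a1, a2 = resolve(oblique, math.radians(beta_deg))
    # the squared components always exhaust the whole photon
    assert abs(a1 ** 2 + a2 ** 2 - 1.0) < 1e-12
```

Observing "parallel or perpendicular" to a given axis then forces the photon entirely into one member of that particular pair, with probabilities given by the squared components.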
3. Interference of photons
In this section we shall deal with another example of superposition.
We shall again take photons, but shall be concerned with their position
in space and their momentum instead of their polarization. If
we are given a beam of roughly monochromatic light, then we know
something about the location and momentum of the associated
photons. We know that each of them is located somewhere in the
region of space through which the beam is passing and has a momentum
in the direction of the beam of magnitude given in terms of the
frequency of the beam by Einstein's photo-electric law: momentum
equals frequency multiplied by a universal constant. When we have
such information about the location and momentum of a photon we
shall say that it is in a definite translational state.
We shall discuss the description which quantum mechanics provides
of the interference of photons. Let us take a definite experiment
demonstrating interference. Suppose we have a beam of light
which is passed through some kind of interferometer, so that it gets
split up into two components and the two components are subsequently
made to interfere. We may, as in the preceding section, take
an incident beam consisting of only a single photon and inquire what
will happen to it as it goes through the apparatus. This will present
to us the difficulty of the conflict between the wave and corpuscular
theories of light in an acute form.
Corresponding to the description that we had in the case of the
polarization, we must now describe the photon as going partly into
each of the two components into which the incident beam is split.
The photon is then, as we may say, in a translational state given by the
superposition of the two translational states associated with the two
components. We are thus led to a generalization of the term `translational
state' applied to a photon. For a photon to be in a definite
translational state it need not be associated with one single beam of
light, but may be associated with two or more beams of light which
are the components into which one original beam has been split.† In
the accurate mathematical theory each translational state is associated
with one of the wave functions of ordinary wave optics, which wave
function may describe either a single beam or two or more beams
into which one original beam has been split. Translational states are
thus superposable in a similar way to wave functions.
Let us consider now what happens when we determine the energy
in one of the components. The result of such a determination must
be either the whole photon or nothing at all. Thus the photon must
change suddenly from being partly in one beam and partly in the
other to being entirely in one of the beams. This sudden change is
due to the disturbance in the translational state of the photon which
the observation necessarily makes. It is impossible to predict in which
of the two beams the photon will be found. Only the probability of
either result can be calculated from the previous distribution of the
photon over the two beams.
One could carry out the energy measurement without destroying the
component beam by, for example, reflecting the beam from a movable
mirror and observing the recoil. Our description of the photon allows
us to infer that, after such an energy measurement, it would not be
possible to bring about any interference effects between the two components.
So long as the photon is partly in one beam and partly in
the other, interference can occur when the two beams are superposed,
but this possibility disappears when the photon is forced entirely into
† The circumstance that the superposition idea requires us to generalize our
original meaning of translational states, but that no corresponding generalization was
needed for the states of polarization of the preceding section, is an accidental one
with no underlying theoretical significance.
one of the beams by an observation. The other beam then no longer
enters into the description of the photon, so that it counts as being
entirely in the one beam in the ordinary way for any experiment that
may subsequently be performed on it.
On these lines quantum mechanics is able to effect a reconciliation
of the wave and corpuscular properties of light. The essential point
is the association of each of the translational states of a photon with
one of the wave functions of ordinary wave optics. The nature of this
association cannot be pictured on a basis of classical mechanics, but
is something entirely new. It would be quite wrong to picture the
photon and its associated wave as interacting in the way in which
particles and waves can interact in classical mechanics. The association
can be interpreted only statistically, the wave function giving
us information about the probability of our finding the photon in any
particular place when we make an observation of where it is.
Some time before the discovery of quantum mechanics people
realized that the connexion between light waves and photons must
be of a statistical character. What they did not clearly realize, however,
was that the wave function gives information about the probability
of one photon being in a particular place and not the probable
number of photons in that place. The importance of the distinction
can be made clear in the following way. Suppose we have a beam
of light consisting of a large number of photons split up into two components
of equal intensity. On the assumption that the intensity of
a beam is connected with the probable number of photons in it, we
should have half the total number of photons going into each component.
If the two components are now made to interfere, we should
require a photon in one component to be able to interfere with one in
the other. Sometimes these two photons would have to annihilate one
another and other times they would have to produce four photons.
This would contradict the conservation of energy. The new theory,
which connects the wave function with probabilities for one photon,
gets over the difficulty by making each photon go partly into each of
the two components. Each photon then interferes only with itself.
Interference between two different photons never occurs.
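The arithmetic behind "each photon interferes only with itself" can be sketched numerically. In the illustration below (a modern aid under assumed conventions, not Dirac's notation), the photon's two translational amplitudes are superposed and the detection probability per photon is the squared modulus of their sum; over a beam of N photons the expected count is N times this probability, so energy is conserved on average without any photon-photon interaction.

```python
import cmath
import math

def detection_probability(phase):
    """A single photon split equally between two interferometer paths:
    the two translational amplitudes are superposed (the photon
    interferes only with itself) before the modulus is squared."""
    a1 = 1 / math.sqrt(2)
    a2 = cmath.exp(1j * phase) / math.sqrt(2)
    return abs(a1 + a2) ** 2 / 2    # normalized to lie between 0 and 1

print(detection_probability(0.0))          # constructive, near 1.0
print(detection_probability(math.pi / 2))  # intermediate, near 0.5
print(detection_probability(math.pi))      # destructive, near 0.0
```

On the rejected photon-number picture, interference would require pairs of photons to annihilate or to double, contradicting energy conservation; here no such step ever appears.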
The association of particles with waves discussed above is not
restricted to the case of light, but is, according to modern theory,
of universal applicability. All kinds of particles are associated with
waves in this way and conversely all wave motion is associated with
particles. Thus all particles can be made to exhibit interference
effects and all wave motion has its energy in the form of quanta. The
reason why these general phenomena are not more obvious is on
account of a law of proportionality between the mass or energy of the
particles and the frequency of the waves, the coefficient being such
that for waves of familiar frequencies the associated quanta are
extremely small, while for particles even as light as electrons the
associated wave frequency is so high that it is not easy to demonstrate
interference.
4. Superposition and indeterminacy
The reader may possibly feel dissatisfied with the attempt in the
two preceding sections to fit in the existence of photons with the
classical theory of light. He may argue that a very strange idea has
been introduced: the possibility of a photon being partly in each of
two states of polarization, or partly in each of two separate beams.
But even with the help of this strange idea no satisfying picture of
the fundamental single-photon processes has been given. He may say
further that this strange idea did not provide any information about
experimental results for the experiments discussed, beyond what
could have been obtained from an elementary consideration of
photons being guided in some vague way by waves. What, then, is
the use of the strange idea?
In answer to the first criticism it may be remarked that the main
object of physical science is not the provision of pictures, but is the
formulation of laws governing phenomena and the application of
these laws to the discovery of new phenomena. If a picture exists,
so much the better; but whether a picture exists or not is a matter
of only secondary importance. In the case of atomic phenomena
no picture can be expected to exist in the usual sense of the word
`picture', by which is meant a model functioning essentially on
classical lines. One may, however, extend the meaning of the word
`picture' to include any way of looking at the fundamental laws which
makes their self-consistency obvious. With this extension, one may
gradually acquire a picture of atomic phenomena by becoming
familiar with the laws of the quantum theory.
With regard to the second criticism, it may be remarked that for
many simple experiments with light, an elementary theory of waves
and photons connected in a vague statistical way would be adequate
to account for the results. In the case of such experiments quantum
mechanics has no further information to give. In the great majority
of experiments, however, the conditions are too complex for an
elementary theory of this kind to be applicable and some more
elaborate scheme, such as is provided by quantum mechanics, is then
needed. The method of description that quantum mechanics gives
in the more complex cases is applicable also to the simple cases and
although it is then not really necessary for accounting for the experimental
results, its study in these simple cases is perhaps a suitable
introduction to its study in the general case.
There remains an overall criticism that one may make to the whole
scheme, namely, that in departing from the determinacy of the
classical theory a great complication is introduced into the description
of Nature, which is a highly undesirable feature. This complication
is undeniable, but it is offset by a great simplification, provided
by the general principle of superposition of states, which we shall now
go on to consider. But first it is necessary to make precise the important
concept of a `state' of a general atomic system.
Let us take any atomic system, composed of particles or bodies
with specified properties (mass, moment of inertia, etc.) interacting
according to specified laws of force. There will be various possible
motions of the particles or bodies consistent with the laws of force.
Each such motion is called a state of the system. According to
classical ideas one could specify a state by giving numerical values
to all the coordinates and velocities of the various component parts
of the system at some instant of time, the whole motion being then
completely determined. Now the argument of pp. 3 and 4 shows that
we cannot observe a small system with that amount of detail which
classical theory supposes. The limitation in the power of observation
puts a limitation on the number of data that can be assigned to a
state. Thus a state of an atomic system must be specified by fewer
or more indefinite data than a complete set of numerical values
for all the coordinates and velocities at some instant of time. In the
case when the system is just a single photon, a state would be completely
specified by a given state of motion in the sense of §3
together with a given state of polarization in the sense of §2.
A state of a system may be defined as an undisturbed motion that
is restricted by as many conditions or data as are theoretically
possible without mutual interference or contradiction. In practice
the conditions could be imposed by a suitable preparation of the
system, consisting perhaps in passing it through various kinds of
sorting apparatus, such as slits and polarimeters, the system being
left undisturbed after the preparation. The word `state' may be
used to mean either the state at one particular time (after the
preparation), or the state throughout the whole of time after the
preparation. To distinguish these two meanings, the latter will be
called a `state of motion' when there is liable to be ambiguity.
The general principle of superposition of quantum mechanics
applies to the states, with either of the above meanings, of any one
dynamical system. It requires us to assume that between these
states there exist peculiar relationships such that whenever the
system is definitely in one state we can consider it as being partly
in each of two or more other states. The original state must be
regarded as the result of a kind of superposition of the two or more
new states, in a way that cannot be conceived on classical ideas. Any
state may be considered as the result of a superposition of two or
more other states, and indeed in an infinite number of ways. Conversely
any two or more states may be superposed to give a new
state. The procedure of expressing a state as the result of superposition
of a number of other states is a mathematical procedure
that is always permissible, independent of any reference to physical
conditions, like the procedure of resolving a wave into Fourier components.
Whether it is useful in any particular case, though, depends
on the special physical conditions of the problem under consideration.
In the two preceding sections examples were given of the superposition
principle applied to a system consisting of a single photon.
§2 dealt with states differing only with regard to the polarization and
§3 with states differing only with regard to the motion of the photon
as a whole.
The nature of the relationships which the superposition principle
requires to exist between the states of any system is of a kind that
cannot be explained in terms of familiar physical concepts. One
cannot in the classical sense picture a system being partly in each of
two states and see the equivalence of this to the system being completely
in some other state. There is an entirely new idea involved,
to which one must get accustomed and in terms of which one must
proceed to build up an exact mathematical theory, without having
any detailed classical picture.
When a state is formed by the superposition of two other states,
it will have properties that are in some vague way intermediate
between those of the two original states and that approach more or
less closely to those of either of them according to the greater or less
`weight' attached to this state in the superposition process. The new
state is completely defined by the two original states when their
relative weights in the superposition process are known, together
with a certain phase difference, the exact meaning of weights and
phases being provided in the general case by the mathematical theory.
In the case of the polarization of a photon their meaning is that provided
by classical optics, so that, for example, when two perpendicularly
plane polarized states are superposed with equal weights, the
new state may be circularly polarized in either direction, or linearly
polarized at an angle ¼π, or else elliptically polarized, according to
the phase difference.
The non-classical nature of the superposition process is brought
out clearly if we consider the superposition of two states, A and B,
such that there exists an observation which, when made on the
system in state A, is certain to lead to one particular result, a say, and
when made on the system in state B is certain to lead to some different
result, b say. What will be the result of the observation when made
on the system in the superposed state? The answer is that the result
will be sometimes a and sometimes b, according to a probability law
depending on the relative weights of A and B in the superposition
process. It will never be different from both a and b. The intermediate
character of the state formed by superposition thus expresses
itself through the probability of a particular result for an observation
being intermediate between the corresponding probabilities for the original
states,† not through the result itself being intermediate between the
corresponding results for the original states.
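This probability law for a superposed state can be illustrated numerically. In the sketch below (a modern illustrative aid; the quantitative weighting rule |c₁|² : |c₂|² anticipates the mathematical theory developed later in the book), the observation on the superposed state always yields one of the two original results a and b, never anything intermediate, while the long-run frequency of a lies between the certainties of the original states.

```python
import random

def observe(c1, c2, rng):
    """Observation on the superposed state c1*A + c2*B, where A and B
    give the definite results 'a' and 'b'.  The outcome is always one
    of a and b, with probabilities fixed by the relative weights."""
    w1, w2 = abs(c1) ** 2, abs(c2) ** 2
    return "a" if rng.random() < w1 / (w1 + w2) else "b"

rng = random.Random(7)
results = [observe(1.0, 2.0, rng) for _ in range(50_000)]
frac_a = results.count("a") / len(results)
assert set(results) <= {"a", "b"}   # never a result other than a or b
print(frac_a)                       # near 1/5, intermediate between 1 and 0
```

It is the probability that is intermediate, not the result itself, exactly as the text insists.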
In this way we see that such a drastic departure from ordinary
ideas as the assumption of superposition relationships between the
states is possible only on account of the recognition of the importance
of the disturbance accompanying an observation and of the consequent
indeterminacy in the result of the observation. When an
observation is made on any atomic system that is in a given state,
† The probability of a particular result for the state formed by superposition is not
always intermediate between those for the original states in the general case when
those for the original states are not zero or unity, so there are restrictions on the
`intermediateness' of a state formed by superposition.
in general the result will not be determinate, i.e., if the experiment
is repeated several times under identical conditions several different
results may be obtained. It is a law of nature, though, that if the
experiment is repeated a large number of times, each particular result
will be obtained in a definite fraction of the total number of times, so
that there is a definite probability of its being obtained. This probability
is what the theory sets out to calculate. Only in special cases
when the probability for some result is unity is the result of the
experiment determinate.
The assumption of superposition relationships between the states
leads to a mathematical theory in which the equations that define
a state are linear in the unknowns. In consequence of this, people
have tried to establish analogies with systems in classical mechanics,
such as vibrating strings or membranes, which are governed by linear
equations and for which, therefore, a superposition principle holds.
Such analogies have led to the name `Wave Mechanics' being sometimes
given to quantum mechanics. It is important to remember,
however, that the superposition that occurs in quantum mechanics is
of an essentially different nature from any occurring in the classical
theory, as is shown by the fact that the quantum superposition principle
demands indeterminacy in the results of observations in order
to be capable of a sensible physical interpretation. The analogies are
thus liable to be misleading.
5. Mathematical formulation of the principle
A profound change has taken place during the present century in
the opinions physicists have held on the mathematical foundations
of their subject. Previously they supposed that the principles of
Newtonian mechanics would provide the basis for the description
of the whole of physical phenomena and that all the theoretical
physicist had to do was suitably to develop and apply these principles.
With the recognition that there is no logical reason why
Newtonian and other classical principles should be valid outside the
domains in which they have been experimentally verified has come
the realization that departures from these principles are indeed
necessary. Such departures find their expression through the introduction
of new mathematical formalisms, new schemes of axioms
and rules of manipulation, into the methods of theoretical physics.
Quantum mechanics provides a good example of the new ideas. It
requires the states of a dynamical system and the dynamical variables
to be interconnected in quite strange ways that are unintelligible
from the classical standpoint. The states and dynamical variables
have to be represented by mathematical quantities of different
natures from those ordinarily used in physics. The new scheme
becomes a precise physical theory when all the axioms and rules of
manipulation governing the mathematical quantities are specified
and when in addition certain laws are laid down connecting physical
facts with the mathematical formalism, so that from any given
physical conditions equations between the mathematical quantities
may be inferred and vice versa. In an application of the theory one
would be given certain physical information, which one would proceed
to express by equations between the mathematical quantities.
One would then deduce new equations with the help of the axioms
and rules of manipulation and would conclude by interpreting these
new equations as physical conditions. The justification for the whole
scheme depends, apart from internal consistency, on the agreement
of the final results with experiment.
We shall begin to set up the scheme by dealing with the mathematical
relations between the states of a dynamical system at one
instant of time, which relations will come from the mathematical
formulation of the principle of superposition. The superposition process
is a kind of additive process and implies that states can in some
way be added to give new states. The states must therefore be connected
with mathematical quantities of a kind which can be added
together to give other quantities of the same kind. The most obvious
of such quantities are vectors. Ordinary vectors, existing in a space
of a finite number of dimensions, are not sufficiently general for
most of the dynamical systems in quantum mechanics. We have to
make a generalization to vectors in a space of an infinite number of
dimensions, and the mathematical treatment becomes complicated
by questions of convergence. For the present, however, we shall deal
merely with some general properties of the vectors, properties which
can be deduced on the basis of a simple scheme of axioms, and
questions of convergence and related topics will not be gone into
until the need arises.
It is desirable to have a special name for describing the vectors
which are connected with the states of a system in quantum mechanics,
whether they are in a space of a finite or an infinite number of
dimensions. We shall call them ket vectors, or simply kets, and denote
a general one of them by a special symbol |⟩. If we want to specify
a particular one of them by a label, A say, we insert it in the middle,
thus |A⟩. The suitability of this notation will become clear as the
scheme is developed.
Ket vectors may be multiplied by complex numbers and may be
added together to give other ket vectors, e.g. from two ket vectors
|A⟩ and |B⟩ we can form

c₁|A⟩ + c₂|B⟩ = |R⟩,   (1)

say, where c₁ and c₂ are any two complex numbers. We may also
perform more general linear processes with them, such as adding an
infinite sequence of them, and if we have a ket vector |x⟩, depending
on and labelled by a parameter x which can take on all values in a
certain range, we may integrate it with respect to x, to get another
ket vector

∫ |x⟩ dx = |Q⟩,

say. A ket vector which is expressible linearly in terms of certain
others is said to be dependent on them. A set of ket vectors are called
independent if no one of them is expressible linearly in terms of the
others.
We now assume that each state of a dynamical system at a particular
time corresponds to a ket vector, the correspondence being such that if a
state results from the superposition of certain other states, its corresponding
ket vector is expressible linearly in terms of the corresponding ket
vectors of the other states, and conversely. Thus the state R results from
a superposition of the states A and B when the corresponding ket
vectors are connected by (1).
The above assumption leads to certain properties of the superposition
process, properties which are in fact necessary for the word
`superposition' to be appropriate. When two or more states are
superposed, the order in which they occur in the superposition
process is unimportant, so the superposition process is symmetrical
between the states that are superposed. Again, we see from equation
(1) that (excluding the case when the coefficient c₁ or c₂ is zero) if
the state R can be formed by superposition of the states A and B,
then the state A can be formed by superposition of B and R, and B
can be formed by superposition of A and R. The superposition
relationship is symmetrical between all three states A, B, and R.
A state which results from the superposition of certain other
states will be said to be dependent on those states. More generally,
a state will be said to be dependent on any set of states, finite or
infinite in number, if its corresponding ket vector is dependent on
the corresponding ket vectors of the set of states. A set of states
will be called independent if no one of them is dependent on the
others.
To proceed with the mathematical formulation of the superposition
principle we must introduce a further assumption, namely the assumption
that by superposing a state with itself we cannot form any new
state, but only the original state over again. If the original state
corresponds to the ket vector |A⟩, when it is superposed with itself
the resulting state will correspond to

c₁|A⟩ + c₂|A⟩ = (c₁+c₂)|A⟩,

where c₁ and c₂ are numbers. Now we may have c₁+c₂ = 0, in which
case the result of the superposition process would be nothing at all,
the two components having cancelled each other by an interference
effect. Our new assumption requires that, apart from this special
case, the resulting state must be the same as the original one, so that
(c₁+c₂)|A⟩ must correspond to the same state that |A⟩ does. Now
c₁+c₂ is an arbitrary complex number and hence we can conclude
that if the ket vector corresponding to a state is multiplied by any
complex number, not zero, the resulting ket vector will correspond to the
same state. Thus a state is specified by the direction of a ket vector
and any length one may assign to the ket vector is irrelevant. All
the states of the dynamical system are in one-one correspondence
with all the possible directions for a ket vector, no distinction being
made between the directions of the ket vectors |A⟩ and −|A⟩.
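The statement that only the direction of a ket matters can be expressed as a small computational test. The sketch below is an illustrative aid (finite-dimensional kets as lists of complex components, names my own): two kets correspond to the same state exactly when one is a non-zero complex multiple of the other, which by the Cauchy-Schwarz equality condition happens precisely when |⟨1|2⟩| equals the product of their lengths; the zero ket is rejected, since it corresponds to no state at all.

```python
import math

def same_state(ket1, ket2, tol=1e-9):
    """Kets correspond to the same state exactly when one is a non-zero
    complex multiple of the other, i.e. they have the same direction
    (Cauchy-Schwarz equality)."""
    n1 = math.sqrt(sum(abs(c) ** 2 for c in ket1))
    n2 = math.sqrt(sum(abs(c) ** 2 for c in ket2))
    if n1 < tol or n2 < tol:
        raise ValueError("the zero ket vector corresponds to no state at all")
    overlap = sum(a.conjugate() * b for a, b in zip(ket1, ket2))
    return abs(abs(overlap) - n1 * n2) < tol

ket = [1 + 0j, 2 - 1j]
assert same_state(ket, [(3 + 4j) * c for c in ket])   # any non-zero multiple
assert same_state(ket, [-c for c in ket])             # |A> and -|A> agree
assert not same_state(ket, [1 + 0j, 0j])              # a different direction
```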
The assumption just made shows up very clearly the fundamental
difference between the superposition of the quantum theory and any
kind of classical superposition. In the case of a classical system for
which a superposition principle holds, for instance a vibrating membrane,
when one superposes a state with itself the result is a different
state, with a different magnitude of the oscillations. There is no
physical characteristic of a quantum state corresponding to the
magnitude of the classical oscillations, as distinct from their quality,
described by the ratios of the amplitudes at different points of
the membrane. Again, while there exists a classical state with zero
amplitude of oscillation everywhere, namely the state of rest, there
does not exist any corresponding state for a quantum system, the
zero ket vector corresponding to no state at all.
Given two states corresponding to the ket vectors |A⟩ and |B⟩,
the general state formed by superposing them corresponds to a ket
vector |R⟩ which is determined by two complex numbers, namely
the coefficients c₁ and c₂ of equation (1). If these two coefficients are
multiplied by the same factor (itself a complex number), the ket
vector |R⟩ will get multiplied by this factor and the corresponding
state will be unaltered. Thus only the ratio of the two coefficients
is effective in determining the state R. Hence this state is determined
by one complex number, or by two real parameters. Thus
from two given states, a twofold infinity of states may be obtained
by superposition.
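The dependence on a single complex ratio can be checked directly: multiplying both coefficients by a common complex factor leaves the direction of the resulting ket, and hence the state, unchanged. A minimal sketch (illustrative names, finite-dimensional kets):

```python
def superpose(c1, c2, ketA, ketB):
    """Ket of the state formed by superposing A and B with weights c1, c2."""
    return [c1 * a + c2 * b for a, b in zip(ketA, ketB)]

def direction(ket, tol=1e-12):
    """Strip off the irrelevant overall complex factor, leaving only the
    direction: scale the first sizeable component to 1."""
    pivot = next(c for c in ket if abs(c) > tol)
    return [c / pivot for c in ket]

ketA, ketB = [1 + 0j, 0j], [0j, 1 + 0j]
r1 = direction(superpose(1 + 0j, 2 + 3j, ketA, ketB))
r2 = direction(superpose(5 - 1j, (2 + 3j) * (5 - 1j), ketA, ketB))
# both coefficient pairs have the same ratio c2/c1, hence the same state
assert all(abs(x - y) < 1e-12 for x, y in zip(r1, r2))
```

The ratio c₂/c₁ is one complex number, i.e. two real parameters: the twofold infinity of states described in the text.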
This result is confirmed by the examples discussed in §§ 2 and 3.
In the example of §2 there are just two independent states of polarization
for a photon, which may be taken to be the states of plane
polarization parallel and perpendicular to some fixed direction, and
from the superposition of these two a twofold infinity of states of
polarization can be obtained, namely all the states of elliptic polarization,
the general one of which requires two parameters to describe
it. Again, in the example of §3, from the superposition of two given
states of motion for a photon a twofold infinity of states of motion
may be obtained, the general one of which is described by two
parameters, which may be taken to be the ratio of the amplitudes
of the two wave functions that are added together and their phase
relationship. This confirmation shows the need for allowing complex
coefficients in equation (1). If these coefficients were restricted to be
real, then, since only their ratio is of importance for determining the
direction of the resultant ket vector |R⟩ when |A⟩ and |B⟩ are
given, there would be only a simple infinity of states obtainable from
the superposition.
6. Bra and ket vectors
Whenever we have a set of vectors in any mathematical theory,
we can always set up a second set of vectors, which mathematicians
call the dual vectors. The procedure will be described for the case
when the original vectors are our ket vectors.
Suppose we have a number φ which is a function of a ket vector
|A⟩, i.e. to each ket vector |A⟩ there corresponds one number φ,
and suppose further that the function is a linear one, which means
that the number corresponding to |A⟩ + |A′⟩ is the sum of the
numbers corresponding to |A⟩ and to |A′⟩, and the number corresponding
to c|A⟩ is c times the number corresponding to |A⟩, c
being any numerical factor. Then the number φ corresponding to
any |A⟩ may be looked upon as the scalar product of that |A⟩ with
some new vector, there being one of these new vectors for each linear
function of the ket vectors |A⟩. The justification for this way of
looking at φ is that, as will be seen later (see equations (5) and (6)),
the new vectors may be added together and may be multiplied by
numbers to give other vectors of the same kind. The new vectors
are, of course, defined only to the extent that their scalar products
with the original ket vectors are given numbers, but this is sufficient
for one to be able to build up a mathematical theory about
them.
We shall call the new vectors bra vectors, or simply bras, and denote
a general one of them by the symbol ⟨ |, the mirror image of the
symbol for a ket vector. If we want to specify a particular one of
them by a label, B say, we write it in the middle, thus ⟨B|. The
scalar product of a bra vector ⟨B| and a ket vector |A⟩ will be
written ⟨B|A⟩, i.e. as a juxtaposition of the symbols for the bra
and ket vectors, that for the bra vector being on the left, and the
two vertical lines being contracted to one for brevity.

One may look upon the symbols ⟨ and ⟩ as a distinctive kind of
brackets. A scalar product ⟨B|A⟩ now appears as a complete bracket
expression and a bra vector ⟨B| or a ket vector |A⟩ as an incomplete
bracket expression. We have the rules that any complete bracket
expression denotes a number and any incomplete bracket expression
denotes a vector, of the bra or ket kind according to whether it contains
the first or second part of the brackets.
The condition that the scalar product of ⟨B| and |A⟩ is a linear
function of |A⟩ may be expressed symbolically by

⟨B|{|A⟩+|A′⟩} = ⟨B|A⟩+⟨B|A′⟩,    (2)
⟨B|{c|A⟩} = c⟨B|A⟩,    (3)

c being any number.
A bra vector is considered to be completely defined when its scalar
product with every ket vector is given, so that if a bra vector has its
scalar product with every ket vector vanishing, the bra vector itself
must be considered as vanishing. In symbols, if

⟨P|A⟩ = 0,  all |A⟩,
then ⟨P| = 0.    (4)
The sum of two bra vectors ⟨B| and ⟨B′| is defined by the condition
that its scalar product with any ket vector |A⟩ is the sum of the
scalar products of ⟨B| and ⟨B′| with |A⟩,

{⟨B|+⟨B′|}|A⟩ = ⟨B|A⟩+⟨B′|A⟩,    (5)

and the product of a bra vector ⟨B| and a number c is defined by the
condition that its scalar product with any ket vector |A⟩ is c times
the scalar product of ⟨B| with |A⟩,

{c⟨B|}|A⟩ = c⟨B|A⟩.    (6)
Equations (2) and (5) show that products of bra and ket vectors
satisfy the distributive axiom of multiplication, and equations (3)
and (6) show that multiplication by numerical factors satisfies the
usual algebraic axioms.
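The linearity rules (2), (3), (5), and (6) can be checked concretely in a finite-dimensional model, which is an illustrative assumption and no part of the abstract theory: kets become complex column vectors, bras become conjugate-transposed rows, and the scalar product becomes a matrix product.

```python
import numpy as np

# Illustrative finite-dimensional model (an assumption, not Dirac's text):
# kets are complex column vectors, bras are conjugate-transposed rows.
ket_A = np.array([[1.0 + 2.0j], [3.0 - 1.0j]])
ket_A2 = np.array([[0.5j], [2.0]])              # plays the role of |A'>
bra_B = np.array([[2.0 - 1.0j], [1.0j]]).conj().T
bra_B2 = np.array([[1.0], [-1.0j]]).conj().T    # plays the role of <B'|
c = 1.5 - 0.5j

def braket(bra, ket):
    """Complete bracket expression <B|A>: a single complex number."""
    return (bra @ ket).item()

# Rule (2): <B|{|A> + |A'>} = <B|A> + <B|A'>
assert np.isclose(braket(bra_B, ket_A + ket_A2),
                  braket(bra_B, ket_A) + braket(bra_B, ket_A2))
# Rule (3): <B|{c|A>} = c <B|A>
assert np.isclose(braket(bra_B, c * ket_A), c * braket(bra_B, ket_A))
# Rule (5): {<B| + <B'|}|A> = <B|A> + <B'|A>
assert np.isclose(braket(bra_B + bra_B2, ket_A),
                  braket(bra_B, ket_A) + braket(bra_B2, ket_A))
# Rule (6): {c <B|}|A> = c <B|A>
assert np.isclose(braket(c * bra_B, ket_A), c * braket(bra_B, ket_A))
```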
The bra vectors, as they have been here introduced, are quite a
different kind of vector from the kets, and so far there is no connexion
between them except for the existence of a scalar product of a bra
and a ket. We now make the assumption that there is a one-one
correspondence between the bras and the kets, such that the bra corresponding
to |A⟩+|A′⟩ is the sum of the bras corresponding to |A⟩ and
to |A′⟩, and the bra corresponding to c|A⟩ is c̄ times the bra corresponding
to |A⟩, c̄ being the conjugate complex number to c. We shall
use the same label to specify a ket and the corresponding bra. Thus
the bra corresponding to |A⟩ will be written ⟨A|.
The relationship between a ket vector and the corresponding bra
makes it reasonable to call one of them the conjugate imaginary of
the other. Our bra and ket vectors are complex quantities, since they
can be multiplied by complex numbers and are then of the same
nature as before, but they are complex quantities of a special kind
which cannot be split up into real and pure imaginary parts. The
usual method of getting the real part of a complex quantity, by
taking half the sum of the quantity itself and its conjugate, cannot
be applied since a bra and a ket vector are of different natures and
cannot be added together. To call attention to this distinction, we
shall use the words 'conjugate complex' to refer to numbers and
other complex quantities which can be split up into real and pure
imaginary parts, and the words 'conjugate imaginary' for bra and
ket vectors, which cannot. With the former kind of quantity, we
shall use the notation of putting a bar over one of them to get the
conjugate complex one.
On account of the one-one correspondence between bra vectors and
ket vectors, any state of our dynamical system at a particular time may
be specified by the direction of a bra vector just as well as by the direction
of a ket vector. In fact the whole theory will be symmetrical in its
essentials between bras and kets.
Given any two ket vectors |A⟩ and |B⟩, we can construct from
them a number ⟨B|A⟩ by taking the scalar product of the first with
the conjugate imaginary of the second. This number depends linearly
on |A⟩ and antilinearly on |B⟩, the antilinear dependence meaning
that the number formed from |B⟩+|B′⟩ is the sum of the numbers
formed from |B⟩ and from |B′⟩, and the number formed from c|B⟩
is c̄ times the number formed from |B⟩. There is a second way in
which we can construct a number which depends linearly on |A⟩ and
antilinearly on |B⟩, namely by forming the scalar product of |B⟩
with the conjugate imaginary of |A⟩ and taking the conjugate complex
of this scalar product. We assume that these two numbers are
always equal, i.e.

⟨B|A⟩ = \overline{⟨A|B⟩}.    (7)

Putting |B⟩ = |A⟩ here, we find that the number ⟨A|A⟩ must be
real. We make the further assumption

⟨A|A⟩ > 0,    (8)

except when |A⟩ = 0.
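In a finite-dimensional sketch where the bra ⟨A| is taken to be the conjugate transpose of the ket |A⟩ (an identification assumed here purely for illustration), assumptions (7) and (8) hold automatically:

```python
import numpy as np

# Finite-dimensional sketch (illustrative assumption): bras as conjugate
# transposes of kets, so (7) and (8) become properties of np.vdot-style
# scalar products.
rng = np.random.default_rng(0)
ket_A = rng.normal(size=(3, 1)) + 1j * rng.normal(size=(3, 1))
ket_B = rng.normal(size=(3, 1)) + 1j * rng.normal(size=(3, 1))

def braket(ket_x, ket_y):
    """<X|Y>: scalar product of |Y> with the conjugate imaginary of |X>."""
    return (ket_x.conj().T @ ket_y).item()

# Assumption (7): <B|A> is the conjugate complex of <A|B>.
assert np.isclose(braket(ket_B, ket_A), np.conj(braket(ket_A, ket_B)))
# Consequence: <A|A> is real; assumption (8): <A|A> > 0 for |A> != 0.
aa = braket(ket_A, ket_A)
assert np.isclose(aa.imag, 0.0) and aa.real > 0
```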
In ordinary space, from any two vectors one can construct a
number, their scalar product, which is a real number and is symmetrical
between them. In the space of bra vectors or the space of
ket vectors, from any two vectors one can again construct a number,
the scalar product of one with the conjugate imaginary of the
other, but this number is complex and goes over into the conjugate
complex number when the two vectors are interchanged. There is
thus a kind of perpendicularity in these spaces, which is a generalization
of the perpendicularity in ordinary space. We shall call a bra
and a ket vector orthogonal if their scalar product is zero, and two
bras or two kets will be called orthogonal if the scalar product of one
with the conjugate imaginary of the other is zero. Further, we shall
say that two states of our dynamical system are orthogonal if the
vectors corresponding to these states are orthogonal.

The length of a bra vector ⟨A| or of the conjugate imaginary ket
vector |A⟩ is defined as the square root of the positive number
⟨A|A⟩. When we are given a state and wish to set up a bra or ket
vector to correspond to it, only the direction of the vector is given
and the vector itself is undetermined to the extent of an arbitrary
numerical factor. It is often convenient to choose this numerical
factor so that the vector is of length unity. This procedure is called
normalization and the vector so chosen is said to be normalized. The
vector is not completely determined even then, since one can still
multiply it by any number of modulus unity, i.e. any number e^{iγ}
where γ is real, without changing its length. We shall call such a
number a phase factor.
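Normalization and the residual phase freedom can be sketched numerically (again in an assumed finite-dimensional model): dividing a ket by its length gives a normalized vector, and multiplying by e^{iγ} changes the vector without changing its length.

```python
import numpy as np

# Sketch (illustrative): normalize a ket to unit length; a phase factor
# exp(i*gamma) alters the vector but not its length.
ket = np.array([3.0 + 4.0j, 0.0, 1.0j])
length = np.sqrt(np.vdot(ket, ket).real)     # square root of <A|A>
ket_normalized = ket / length
assert np.isclose(np.vdot(ket_normalized, ket_normalized).real, 1.0)

gamma = 0.7                                  # any real number
rephased = np.exp(1j * gamma) * ket_normalized
assert np.isclose(np.vdot(rephased, rephased).real, 1.0)  # length unchanged
```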
The foregoing assumptions give the complete scheme of relations
between the states of a dynamical system at a particular time. The
relations appear in mathematical form, but they imply physical
conditions, which will lead to results expressible in terms of observations
when the theory is developed further. For instance, if two states
are orthogonal, it means at present simply a certain equation in our
formalism, but this equation implies a definite physical relationship
between the states, which further developments of the theory will
enable us to interpret in terms of observational results (see the
bottom of p. 35).
II. DYNAMICAL VARIABLES AND OBSERVABLES

7. Linear operators
In the preceding section we considered a number which is a linear
function of a ket vector, and this led to the concept of a bra vector.
We shall now consider a ket vector which is a linear function of a
ket vector, and this will lead to the concept of a linear operator.

Suppose we have a ket |F⟩ which is a function of a ket |A⟩, i.e.
to each ket |A⟩ there corresponds one ket |F⟩, and suppose further
that the function is a linear one, which means that the |F⟩ corresponding
to |A⟩+|A′⟩ is the sum of the |F⟩'s corresponding to |A⟩
and to |A′⟩, and the |F⟩ corresponding to c|A⟩ is c times the |F⟩
corresponding to |A⟩, c being any numerical factor. Under these
conditions, we may look upon the passage from |A⟩ to |F⟩ as the
application of a linear operator to |A⟩. Introducing the symbol α
for the linear operator, we may write

|F⟩ = α|A⟩,

in which the result of α operating on |A⟩ is written like a product
of α with |A⟩. We make the rule that in such products the ket vector
must always be put on the right of the linear operator. The above
conditions of linearity may now be expressed by the equations

α{|A⟩+|A′⟩} = α|A⟩+α|A′⟩,
α{c|A⟩} = cα|A⟩.    (1)
A linear operator is considered to be completely defined when the
result of its application to every ket vector is given. Thus a linear
operator is to be considered zero if the result of its application to every
ket vanishes, and two linear operators are to be considered equal if
they produce the same result when applied to every ket.

Linear operators can be added together, the sum of two linear
operators being defined to be that linear operator which, operating
on any ket, produces the sum of what the two linear operators
separately would produce. Thus α+β is defined by

{α+β}|A⟩ = α|A⟩+β|A⟩    (2)

for any |A⟩. Equation (2) and the first of equations (1) show that
products of linear operators with ket vectors satisfy the distributive
axiom of multiplication.
Linear operators can also be multiplied together, the product of
two linear operators being defined as that linear operator, the application
of which to any ket produces the same result as the application
of the two linear operators successively. Thus the product αβ is
defined as the linear operator which, operating on any ket |A⟩,
changes it into that ket which one would get by operating first on
|A⟩ with β, and then on the result of the first operation with α. In
symbols

{αβ}|A⟩ = α{β|A⟩}.

This definition appears as the associative axiom of multiplication for
the triple product of α, β, and |A⟩, and allows us to write this triple
product as αβ|A⟩ without brackets. However, this triple product is
in general not the same as what we should get if we operated on |A⟩
first with α and then with β, i.e. in general αβ|A⟩ differs from βα|A⟩,
so that in general αβ must differ from βα. The commutative axiom of
multiplication does not hold for linear operators. It may happen as a
special case that two linear operators ξ and η are such that ξη and
ηξ are equal. In this case we say that ξ commutes with η, or that ξ
and η commute.
By repeated applications of the above processes of adding and
multiplying linear operators, one can form sums and products of
more than two of them, and one can proceed to build up an algebra
with them. In this algebra the commutative axiom of multiplication
does not hold, and also the product of two linear operators may
vanish without either factor vanishing. But all the other axioms of
ordinary algebra, including the associative and distributive axioms
of multiplication, are valid, as may easily be verified.

If we take a number k and multiply it into ket vectors, it appears
as a linear operator operating on ket vectors, the conditions (1) being
fulfilled with k substituted for α. A number is thus a special case of
a linear operator. It has the property that it commutes with all linear
operators and this property distinguishes it from a general linear
operator.
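A finite-dimensional sketch (an illustrative assumption: linear operators modelled as complex matrices) makes the failure of the commutative axiom, and the special role of numbers, concrete:

```python
import numpy as np

# Sketch: linear operators as complex matrices (a finite-dimensional
# stand-in for the abstract operators, assumed only for illustration).
alpha = np.array([[0, 1], [0, 0]], dtype=complex)
beta = np.array([[0, 0], [1, 0]], dtype=complex)
ket = np.array([[1.0], [2.0 + 1.0j]])

# Distributive axiom, equation (2): {alpha + beta}|A> = alpha|A> + beta|A>
assert np.allclose((alpha + beta) @ ket, alpha @ ket + beta @ ket)
# The commutative axiom fails: alpha*beta != beta*alpha in general.
assert not np.allclose(alpha @ beta, beta @ alpha)
# Here even the product of two nonzero operators can vanish:
assert np.allclose((alpha @ alpha), 0)
# A number k acts as a special linear operator commuting with everything.
k = 2.0 - 1.0j
assert np.allclose(k * (alpha @ ket), alpha @ (k * ket))
```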
So far we have considered linear operators operating only on ket
vectors. We can give a meaning to their operating also on bra vectors,
in the following way. Take the scalar product of any bra ⟨B| with
the ket α|A⟩. This scalar product is a number which depends
linearly on |A⟩ and therefore, from the definition of bras, it may be
considered as the scalar product of |A⟩ with some bra. The bra thus
defined depends linearly on ⟨B|, so we may look upon it as the result of
some linear operator applied to ⟨B|. This linear operator is uniquely
determined by the original linear operator α and may reasonably be
called the same linear operator operating on a bra. In this way our
linear operators are made capable of operating on bra vectors.
A suitable notation to use for the resulting bra when α operates on
the bra ⟨B| is ⟨B|α, as in this notation the equation which defines
⟨B|α is

{⟨B|α}|A⟩ = ⟨B|{α|A⟩}    (3)

for any |A⟩, which simply expresses the associative axiom of multiplication
for the triple product of ⟨B|, α, and |A⟩. We therefore
make the general rule that in a product of a bra and a linear operator,
the bra must always be put on the left. We can now write the triple
product of ⟨B|, α, and |A⟩ simply as ⟨B|α|A⟩ without brackets. It
may easily be verified that the distributive axiom of multiplication
holds for products of bras and linear operators just as well as for
products of linear operators and kets.
There is one further kind of product which has a meaning in our
scheme, namely the product of a ket vector and a bra vector with
the ket on the left, such as |A⟩⟨B|. To examine this product, let us
multiply it into an arbitrary ket |P⟩, putting the ket on the right,
and assume the associative axiom of multiplication. The product is
then |A⟩⟨B|P⟩, which is another ket, namely |A⟩ multiplied by the
number ⟨B|P⟩, and this ket depends linearly on the ket |P⟩. Thus
|A⟩⟨B| appears as a linear operator that can operate on kets. It
can also operate on bras, its product with a bra ⟨Q| on the left being
⟨Q|A⟩⟨B|, which is the number ⟨Q|A⟩ times the bra ⟨B|. The
product |A⟩⟨B| is to be sharply distinguished from the product
⟨B|A⟩ of the same factors in the reverse order, the latter product
being, of course, a number.
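In the matrix sketch (an illustrative assumption) the product |A⟩⟨B| becomes an outer product, and its action on kets and bras can be checked directly:

```python
import numpy as np

# Sketch: |A><B| as an outer product matrix, acting on kets from the
# left and on bras from the right (finite-dimensional illustration only).
ket_A = np.array([1.0 + 1.0j, 2.0])
ket_B = np.array([0.5, -1.0j])
ket_P = np.array([2.0, 1.0 + 0.5j])

op = np.outer(ket_A, ket_B.conj())        # the operator |A><B|
# |A><B|P> is |A> multiplied by the number <B|P>
assert np.allclose(op @ ket_P, ket_A * np.vdot(ket_B, ket_P))
# <Q|A><B| is the number <Q|A> times the bra <B|
bra_Q = ket_P.conj()                      # take <Q| conjugate to |P>
assert np.allclose(bra_Q @ op, np.vdot(ket_P, ket_A) * ket_B.conj())
```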
We now have a complete algebraic scheme involving three kinds
of quantities, bra vectors, ket vectors, and linear operators. They can
be multiplied together in the various ways discussed above, and the
associative and distributive axioms of multiplication always hold,
but the commutative axiom of multiplication does not hold. In this
general scheme we still have the rules of notation of the preceding
section, that any complete bracket expression, containing ⟨ on the
left and ⟩ on the right, denotes a number, while any incomplete
bracket expression, containing only ⟨ or ⟩, denotes a vector.
With regard to the physical significance of the scheme, we have
already assumed that the bra vectors and ket vectors, or rather the
directions of these vectors, correspond to the states of a dynamical
system at a particular time. We now make the further assumption
that the linear operators correspond to the dynamical variables at that
time. By dynamical variables are meant quantities such as the
coordinates and the components of velocity, momentum and angular
momentum of particles, and functions of these quantities, in fact
the variables in terms of which classical mechanics is built up. The
new assumption requires that these quantities shall occur also in
quantum mechanics, but with the striking difference that they are
now subject to an algebra in which the commutative axiom of multiplication
does not hold.
This different algebra for the dynamical variables is one of the
most important ways in which quantum mechanics differs from
classical mechanics. We shall see later on that, in spite of this fundamental
difference, the dynamical variables of quantum mechanics
still have many properties in common with their classical counterparts,
and it will be possible to build up a theory of them closely
analogous to the classical theory and forming a beautiful generalization
of it.

It is convenient to use the same letter to denote a dynamical
variable and the corresponding linear operator. In fact, we may consider
a dynamical variable and the corresponding linear operator to
be both the same thing, without getting into confusion.
8. Conjugate relations

Our linear operators are complex quantities, since one can multiply
them by complex numbers and get other quantities of the same nature.
Hence they must correspond in general to complex dynamical variables,
i.e. to complex functions of the coordinates, velocities, etc. We
need some further development of the theory to see what kind of
linear operator corresponds to a real dynamical variable.

Consider the ket which is the conjugate imaginary of ⟨P|α. This
ket depends antilinearly on ⟨P| and thus depends linearly on |P⟩.
It may therefore be considered as the result of some linear operator
operating on |P⟩. This linear operator is called the adjoint of α and
we shall denote it by ᾱ. With this notation, the conjugate imaginary
of ⟨P|α is ᾱ|P⟩.
In formula (7) of Chapter I put ⟨P|α for ⟨A| and its conjugate
imaginary ᾱ|P⟩ for |A⟩. The result is

⟨B|ᾱ|P⟩ = \overline{⟨P|α|B⟩}.    (4)

This is a general formula holding for any ket vectors |B⟩, |P⟩ and
any linear operator α, and it expresses one of the most frequently
used properties of the adjoint.

Putting ᾱ for α in (4), we get

⟨B|\overline{ᾱ}|P⟩ = \overline{⟨P|ᾱ|B⟩} = ⟨B|α|P⟩,

by using (4) again with |P⟩ and |B⟩ interchanged. This holds for
any ket |P⟩, so we can infer from (4) of Chapter I,

⟨B|\overline{ᾱ} = ⟨B|α,

and since this holds for any bra vector ⟨B|, we can infer

\overline{ᾱ} = α.

Thus the adjoint of the adjoint of a linear operator is the original linear
operator. This property of the adjoint makes it like the conjugate
complex of a number, and it is easily verified that in the special case
when the linear operator is a number, the adjoint linear operator is
the conjugate complex number. Thus it is reasonable to assume that
the adjoint of a linear operator corresponds to the conjugate complex of
a dynamical variable. With this physical significance for the adjoint
of a linear operator, we may call the adjoint alternatively the conjugate
complex linear operator, which conforms with our notation ᾱ.
A linear operator may equal its adjoint, and is then called self-adjoint.
It corresponds to a real dynamical variable, so it may be
called alternatively a real linear operator. Any linear operator may
be split up into a real part and a pure imaginary part. For this
reason the words 'conjugate complex' are applicable to linear
operators and not the words 'conjugate imaginary'.
The conjugate complex of the sum of two linear operators is
obviously the sum of their conjugate complexes. To get the conjugate
complex of the product of two linear operators α and β, we apply
formula (7) of Chapter I with

⟨A| = ⟨P|α,  ⟨B| = ⟨Q|β̄,
so that  |A⟩ = ᾱ|P⟩,  |B⟩ = β|Q⟩.

The result is

⟨Q|β̄ᾱ|P⟩ = \overline{⟨P|αβ|Q⟩} = ⟨Q|\overline{αβ}|P⟩

from (4). Since this holds for any |P⟩ and ⟨Q|, we can infer that

\overline{αβ} = β̄ᾱ.    (5)

Thus the conjugate complex of the product of two linear operators equals
the product of the conjugate complexes of the factors in the reverse order.
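In matrix terms the adjoint is the conjugate transpose (an identification assumed here for illustration, not stated in the text), and rules (4) and (5), together with the adjoint-of-the-adjoint property, can be verified numerically:

```python
import numpy as np

# Sketch: adjoint of an operator modelled as the conjugate transpose of
# its matrix (illustrative finite-dimensional assumption).
rng = np.random.default_rng(1)
alpha = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
beta = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

adj = lambda m: m.conj().T
# Rule (5): adjoint of (alpha beta) is adjoint(beta) adjoint(alpha).
assert np.allclose(adj(alpha @ beta), adj(beta) @ adj(alpha))
# Adjoint of the adjoint is the original operator.
assert np.allclose(adj(adj(alpha)), alpha)
# Rule (4): <B|adj(alpha)|P> equals the conjugate of <P|alpha|B>.
ket_B = rng.normal(size=3) + 1j * rng.normal(size=3)
ket_P = rng.normal(size=3) + 1j * rng.normal(size=3)
lhs = np.vdot(ket_B, adj(alpha) @ ket_P)
rhs = np.conj(np.vdot(ket_P, alpha @ ket_B))
assert np.isclose(lhs, rhs)
```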
As simple examples of this result, it should be noted that, if ξ and
η are real, in general ξη is not real. This is an important difference
from classical mechanics. However, ξη+ηξ is real, and so is i(ξη−ηξ).
Only when ξ and η commute is ξη itself also real. Further, if ξ is real,
then so is ξ² and, more generally, ξⁿ with n any positive integer.

We may get the conjugate complex of the product of three linear
operators by successive applications of the rule (5) for the conjugate
complex of the product of two of them. We have

\overline{αβγ} = \overline{α(βγ)} = \overline{βγ} ᾱ = γ̄β̄ᾱ,    (6)

so the conjugate complex of the product of three linear operators
equals the product of the conjugate complexes of the factors in the
reverse order. The rule may easily be extended to the product of any
number of linear operators.
In the preceding section we saw that the product |A⟩⟨B| is a linear
operator. We may get its conjugate complex by referring directly to
the definition of the adjoint. Multiplying |A⟩⟨B| into a general bra
⟨P| we get ⟨P|A⟩⟨B|, whose conjugate imaginary ket is

\overline{⟨P|A⟩} |B⟩ = ⟨A|P⟩|B⟩ = |B⟩⟨A|P⟩.

Hence

\overline{|A⟩⟨B|} = |B⟩⟨A|.    (7)
We now have several rules concerning conjugate complexes and
conjugate imaginaries of products, namely equation (7) of Chapter I,
equations (4), (5), (6), (7) of this chapter, and the rule that the
conjugate imaginary of ⟨P|α is ᾱ|P⟩. These rules can all be summed
up in a single comprehensive rule, the conjugate complex or conjugate
imaginary of any product of bra vectors, ket vectors, and linear operators
is obtained by taking the conjugate complex or conjugate imaginary of
each factor and reversing the order of all the factors. The rule is easily
verified to hold quite generally, also for the cases not explicitly given
above.
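The comprehensive rule, conjugate each factor and reverse the order, can be checked for the case of equation (7) in the matrix sketch (conjugate transposes standing in for adjoints, an illustrative assumption):

```python
import numpy as np

# Sketch of the comprehensive rule for the product |A><B| of equation (7):
# conjugate each factor and reverse the order, giving |B><A|.
rng = np.random.default_rng(3)
ket_A = rng.normal(size=3) + 1j * rng.normal(size=3)
ket_B = rng.normal(size=3) + 1j * rng.normal(size=3)

op = np.outer(ket_A, ket_B.conj())            # the operator |A><B|
adjoint = op.conj().T                         # its conjugate complex
assert np.allclose(adjoint, np.outer(ket_B, ket_A.conj()))   # equals |B><A|
```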
THEOREM. If ξ is a real linear operator and

ξᵐ|P⟩ = 0    (8)

for a particular ket |P⟩, m being a positive integer, then

ξ|P⟩ = 0.
To prove the theorem, take first the case when m = 2. Equation
(8) then gives

⟨P|ξ²|P⟩ = 0,

showing that the ket ξ|P⟩ multiplied by the conjugate imaginary bra
⟨P|ξ is zero. From the assumption (8) of Chapter I with ξ|P⟩ for |A⟩,
we see that ξ|P⟩ must be zero. Thus the theorem is proved for m = 2.

Now take m > 2 and put

ξᵐ⁻²|P⟩ = |Q⟩.

Equation (8) now gives ξ²|Q⟩ = 0. Applying the theorem for m = 2,
we get

ξ|Q⟩ = 0
or  ξᵐ⁻¹|P⟩ = 0.    (9)

By repeating the process by which equation (9) is obtained from
(8), we obtain successively

ξᵐ⁻²|P⟩ = 0,  ξᵐ⁻³|P⟩ = 0,  ...,  ξ²|P⟩ = 0,  ξ|P⟩ = 0,

and so the theorem is proved generally.
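The key step of the proof, that ⟨P|ξ²|P⟩ is the squared length of the ket ξ|P⟩ when ξ is real, can be illustrated numerically (with a real symmetric matrix standing in for a self-adjoint operator, an assumption made for the sketch):

```python
import numpy as np

# Sketch: for a self-adjoint xi, <P|xi^2|P> equals the squared length of
# the ket xi|P>, so xi^2|P> = 0 forces xi|P> = 0.
xi = np.array([[1.0, 1.0], [1.0, 1.0]])   # real symmetric, hence self-adjoint
v = np.array([0.3, 2.0])
assert np.isclose(np.vdot(v, xi @ xi @ v).real,
                  np.linalg.norm(xi @ v) ** 2)

ket_P = np.array([1.0, -1.0])             # chosen so that xi|P> vanishes
assert np.allclose(xi @ xi @ ket_P, 0)    # xi^2|P> = 0 ...
assert np.allclose(xi @ ket_P, 0)         # ... indeed accompanies xi|P> = 0
```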
9. Eigenvalues and eigenvectors
We must make a further development of the theory of linear
operators, consisting in studying the equation

α|P⟩ = a|P⟩,    (10)

where α is a linear operator and a is a number. This equation usually
presents itself in the form that α is a known linear operator and the
number a and the ket |P⟩ are unknowns, which we have to try to
choose so as to satisfy (10), ignoring the trivial solution |P⟩ = 0.

Equation (10) means that the linear operator α applied to the ket
|P⟩ just multiplies this ket by a numerical factor without changing
its direction, or else multiplies it by the factor zero, so that it ceases
to have a direction. This same α applied to other kets will, of course,
in general change both their lengths and their directions. It should
be noticed that only the direction of |P⟩ is of importance in equation
(10). If one multiplies |P⟩ by any number not zero, it will not affect
the question of whether (10) is satisfied or not.
Together with equation (10), we should consider also the conjugate
imaginary form of equation

⟨Q|α = b⟨Q|,    (11)

where b is a number. Here the unknowns are the number b and the
non-zero bra ⟨Q|. Equations (10) and (11) are of such fundamental
importance in the theory that it is desirable to have some special
words to describe the relationships between the quantities involved.
If (10) is satisfied, we shall call a an eigenvalue† of the linear operator
α, or of the corresponding dynamical variable, and we shall call |P⟩
an eigenket of the linear operator or dynamical variable. Further, we
shall say that the eigenket |P⟩ belongs to the eigenvalue a. Similarly,
if (11) is satisfied, we shall call b an eigenvalue of α and ⟨Q| an
eigenbra belonging to this eigenvalue. The words eigenvalue, eigenket,
eigenbra have a meaning, of course, only with reference to a linear
operator or dynamical variable.
Using this terminology, we can assert that, if an eigenket of α is
multiplied by any number not zero, the resulting ket is also an
eigenket and belongs to the same eigenvalue as the original one.
It is possible to have two or more independent eigenkets of a linear
operator belonging to the same eigenvalue of that linear operator,
e.g. equation (10) may have several solutions, |P1⟩, |P2⟩, |P3⟩,... say,
all holding for the same value of a, with the various eigenkets |P1⟩,
|P2⟩, |P3⟩,... independent. In this case it is evident that any linear
combination of the eigenkets is another eigenket belonging to the
same eigenvalue of the linear operator, e.g.

c₁|P1⟩+c₂|P2⟩+c₃|P3⟩+...

is another solution of (10), where c₁, c₂, c₃,... are any numbers.
In the special case when the linear operator α of equations (10) and
(11) is a number, k say, it is obvious that any ket |P⟩ and bra ⟨Q|
will satisfy these equations provided a and b equal k. Thus a number
considered as a linear operator has just one eigenvalue, and any ket
is an eigenket and any bra is an eigenbra, belonging to this eigenvalue.
The theory of eigenvalues and eigenvectors of a linear operator α
which is not real is not of much use for quantum mechanics. We
shall therefore confine ourselves to real linear operators for the further
development of the theory. Putting for α the real linear operator ξ,
we have instead of equations (10) and (11)

ξ|P⟩ = a|P⟩,    (12)
⟨Q|ξ = b⟨Q|.    (13)

† The word 'proper' is sometimes used instead of 'eigen', but this is not satisfactory
as the words 'proper' and 'improper' are often used with other meanings. For example,
in §§ 15 and 46 the words 'improper function' and 'proper-energy' are used.
Three important results can now be readily deduced.

(i) The eigenvalues are all real numbers. To prove that a satisfying
(12) is real, we multiply (12) by the bra ⟨P| on the left, obtaining

⟨P|ξ|P⟩ = a⟨P|P⟩.

Now from equation (4) with ⟨B| replaced by ⟨P| and α replaced by
the real linear operator ξ, we see that the number ⟨P|ξ|P⟩ must be
real, and from (8) of § 6, ⟨P|P⟩ must be real and not zero. Hence a
is real. Similarly, by multiplying (13) by |Q⟩ on the right, we can
prove that b is real.
Suppose we have a solution of (12) and we form the conjugate
imaginary equation, which will read

⟨P|ξ = a⟨P|

in view of the reality of ξ and a. This conjugate imaginary equation
now provides a solution of (13), with ⟨Q| = ⟨P| and b = a. Thus
we can infer

(ii) The eigenvalues associated with eigenkets are the same as the
eigenvalues associated with eigenbras.

(iii) The conjugate imaginary of any eigenket is an eigenbra belonging
to the same eigenvalue, and conversely. This last result makes it reasonable
to call the state corresponding to any eigenket or to the conjugate
imaginary eigenbra an eigenstate of the real dynamical variable ξ.
Eigenvalues and eigenvectors of various real dynamical variables
are used very extensively in quantum mechanics, so it is desirable
to have some systematic notation for labelling them. The following
is suitable for most purposes. If ξ is a real dynamical variable, we
call its eigenvalues ξ′, ξ″, ξ‴, etc. Thus we have a letter by itself
denoting a real dynamical variable or a real linear operator, and the
same letter with primes or an index attached denoting a number,
namely an eigenvalue of what the letter by itself denotes. An eigenvector
may now be labelled by the eigenvalue to which it belongs.
Thus |ξ′⟩ denotes an eigenket belonging to the eigenvalue ξ′ of the
dynamical variable ξ. If in a piece of work we deal with more than
one eigenket belonging to the same eigenvalue of a dynamical variable,
we may distinguish them one from another by means of a further
label, or possibly of more than one further label. Thus, if we are
dealing with two eigenkets belonging to the same eigenvalue of ξ,
we may call them |ξ′1⟩ and |ξ′2⟩.
THEOREM. Two eigenvectors of a real dynamical variable belonging
to different eigenvalues are orthogonal.

To prove the theorem, let |ξ′⟩ and |ξ″⟩ be two eigenkets of the real
dynamical variable ξ, belonging to the eigenvalues ξ′ and ξ″ respectively.
Then we have the equations

ξ|ξ′⟩ = ξ′|ξ′⟩,    (14)
ξ|ξ″⟩ = ξ″|ξ″⟩.    (15)

Taking the conjugate imaginary of (14) we get

⟨ξ′|ξ = ξ′⟨ξ′|.

Multiplying this by |ξ″⟩ on the right gives

⟨ξ′|ξ|ξ″⟩ = ξ′⟨ξ′|ξ″⟩

and multiplying (15) by ⟨ξ′| on the left gives

⟨ξ′|ξ|ξ″⟩ = ξ″⟨ξ′|ξ″⟩.

Hence, subtracting,

(ξ′−ξ″)⟨ξ′|ξ″⟩ = 0,    (16)

showing that, if ξ′ ≠ ξ″, ⟨ξ′|ξ″⟩ = 0 and the two eigenvectors |ξ′⟩
and |ξ″⟩ are orthogonal. This theorem will be referred to as the
orthogonality theorem.
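Results (i) and the orthogonality theorem can be illustrated numerically with a Hermitian matrix standing in for a real linear operator (an assumption made for the sketch; `numpy.linalg.eigh` returns real eigenvalues and an orthonormal set of eigenvectors):

```python
import numpy as np

# Sketch: a self-adjoint (Hermitian) matrix as a real linear operator.
rng = np.random.default_rng(2)
m = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
xi = m + m.conj().T                          # self-adjoint by construction

eigenvalues, eigenkets = np.linalg.eigh(xi)  # columns of eigenkets
assert np.allclose(eigenvalues.imag, 0)      # (i) eigenvalues are real
# Orthogonality theorem: eigenkets belonging to different eigenvalues are
# orthogonal; eigh returns an orthonormal set, so the Gram matrix is 1.
gram = eigenkets.conj().T @ eigenkets
assert np.allclose(gram, np.eye(4))
```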
We have been discussing properties of the eigenvalues and eigenvectors
of a real linear operator, but have not yet considered the
question of whether, for a given real linear operator, any eigenvalues
and eigenvectors exist, and if so, how to find them. This question
is in general very difficult to answer. There is one useful special case,
however, which is quite tractable, namely when the real linear
operator, ξ say, satisfies an algebraic equation

φ(ξ) ≡ ξⁿ + a₁ξⁿ⁻¹ + a₂ξⁿ⁻² + ... + aₙ = 0,    (17)

the coefficients a being numbers. This equation means, of course,
that the linear operator φ(ξ) produces the result zero when applied
to any ket vector or to any bra vector.

Let (17) be the simplest algebraic equation that ξ satisfies. Then
it will be shown that

(α) The number of eigenvalues of ξ is n.

(β) There are so many eigenkets of ξ that any ket whatever can
be expressed as a sum of such eigenkets.

The algebraic form φ(ξ) can be factorized into n linear factors, the
result being

φ(ξ) = (ξ−c₁)(ξ−c₂)(ξ−c₃)...(ξ−cₙ),    (18)
say, the c's being numbers, not assumed to be all different. This
factorization can be performed with ξ a linear operator just as well
as with ξ an ordinary algebraic variable, since there is nothing
occurring in (18) that does not commute with ξ. Let the quotient
when φ(ξ) is divided by (ξ−cᵣ) be χᵣ(ξ), so that

φ(ξ) = (ξ−cᵣ)χᵣ(ξ)    (r = 1, 2, 3, ..., n).

Then, for any ket |P⟩,

(ξ−cᵣ)χᵣ(ξ)|P⟩ = φ(ξ)|P⟩ = 0.    (19)

Now χᵣ(ξ)|P⟩ cannot vanish for every ket |P⟩, as otherwise χᵣ(ξ)
itself would vanish and we should have ξ satisfying an algebraic
equation of degree n−1, which would contradict the assumption that
(17) is the simplest equation that ξ satisfies. If we choose |P⟩ so that
χᵣ(ξ)|P⟩ does not vanish, then equation (19) shows that χᵣ(ξ)|P⟩ is
an eigenket of ξ, belonging to the eigenvalue cᵣ. The argument holds
for each value of r from 1 to n, and hence each of the c's is an eigenvalue
of ξ. No other number can be an eigenvalue of ξ, since if ξ′ is
any eigenvalue, belonging to an eigenket |ξ′⟩,

ξ|ξ′⟩ = ξ′|ξ′⟩

and we can deduce

φ(ξ)|ξ′⟩ = φ(ξ′)|ξ′⟩,

and since the left-hand side vanishes we must have φ(ξ′) = 0.
To complete the proof of (α) we must verify that the c's are all
different. Suppose the c's are not all different and cₛ occurs m times
say, with m > 1. Then φ(ξ) is of the form

φ(ξ) = (ξ−cₛ)ᵐθ(ξ)

with θ(ξ) a rational integral function of ξ. Equation (17) now gives us

(ξ−cₛ)ᵐθ(ξ)|A⟩ = 0    (20)

for any ket |A⟩. Since cₛ is an eigenvalue of ξ it must be real, so that
ξ−cₛ is a real linear operator. Equation (20) is now of the same form
as equation (8) with ξ−cₛ for ξ and θ(ξ)|A⟩ for |P⟩. From the theorem
connected with equation (8) we can infer that

(ξ−cₛ)θ(ξ)|A⟩ = 0.

Since the ket |A⟩ is arbitrary,

(ξ−cₛ)θ(ξ) = 0,

which contradicts the assumption that (17) is the simplest equation
that ξ satisfies. Hence the c's are all different and (α) is proved.
Let χᵣ(cᵣ) be the number obtained when cᵣ is substituted for ξ in
the algebraic expression χᵣ(ξ). Since the c's are all different, χᵣ(cᵣ)
cannot vanish. Consider now the expression

1 − Σᵣ χᵣ(ξ)/χᵣ(cᵣ).    (21)
If ce is substituted for 6 here, every term in the sum vanishes except
the one for which r = s, since x,(f) contains (&c,) as a factor when
r # 8, and the term for which r = s is unity, so the whole expression
vanishes. Thus the expression (21) vanishes when 4 is put equal to
any of the n numbers ci,cz,...,c,. Since, however, the expression
is only of degree n- 1 in f, it must vanish identically. If we now
apply the linear Operator (21) to an arbitrary ket 1 P) and equate
the result to Zero, we get
IQ = 7 &jx.(s)Ip~. (22)
Esch term in the sum on the right here is, according to (19), an
eigenket of f, if it does not vanish. Equation (22) thus expresses the
arbitrary ket 1 P) as a sum of eigenkets of f, and thus (/3) is proved.
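The resolution (22) can be checked numerically in a finite number of dimensions. The following sketch (the matrix and roots are illustrative stand-ins, not taken from the text) builds the operators χ_r(ξ)/χ_r(c_r) for a Hermitian matrix satisfying an algebraic equation with distinct roots, and verifies that they sum to unity and split an arbitrary ket into eigenkets:

```python
import numpy as np

# Illustrative observable: Hermitian, satisfying (xi-1)(xi-2)(xi-5) = 0
xi = np.diag([1.0, 2.0, 2.0, 5.0])
cs = [1.0, 2.0, 5.0]                 # the distinct roots c_1, ..., c_n
I = np.eye(4)

def chi_op(r):
    """chi_r(xi): product over s != r of (xi - c_s), as an operator."""
    out = I.copy()
    for s, c in enumerate(cs):
        if s != r:
            out = out @ (xi - c * I)
    return out

def chi_num(r, x):
    """chi_r(x) as a number, for x a root c_r."""
    out = 1.0
    for s, c in enumerate(cs):
        if s != r:
            out *= (x - c)
    return out

# Equation (21): 1 - sum_r chi_r(xi)/chi_r(c_r) vanishes identically,
# i.e. the sum of the operators equals the unit operator.
total = sum(chi_op(r) / chi_num(r, cs[r]) for r in range(len(cs)))
assert np.allclose(total, I)

# Equation (22): an arbitrary ket |P> is resolved into eigenkets of xi.
P = np.array([1.0, -2.0, 0.5, 3.0])
parts = [chi_op(r) @ P / chi_num(r, cs[r]) for r in range(len(cs))]
assert np.allclose(sum(parts), P)
for r, part in enumerate(parts):
    assert np.allclose(xi @ part, cs[r] * part)   # eigenket with eigenvalue c_r
```

Each non-vanishing term chi_op(r) @ P / chi_num(r, cs[r]) is an (unnormalized) eigenket belonging to c_r, exactly as the proof of (β) asserts.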
As a simple example we may consider a real linear operator σ that
satisfies the equation
σ² = 1. (23)
Then σ has the two eigenvalues 1 and −1. Any ket |P⟩ can be
expressed as
|P⟩ = ½(1 + σ)|P⟩ + ½(1 − σ)|P⟩.
It is easily verified that the two terms on the right here are eigenkets
of σ, belonging to the eigenvalues 1 and −1 respectively, when they
do not vanish.
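A two-dimensional matrix whose square is the unit matrix makes this concrete. The sketch below uses a Pauli-like matrix as a hypothetical stand-in for σ and verifies the decomposition into the two eigenkets:

```python
import numpy as np

# A real linear operator with u @ u = 1, standing in for sigma of the text
u = np.array([[0.0, 1.0],
              [1.0, 0.0]])
assert np.allclose(u @ u, np.eye(2))

P = np.array([0.6, 0.8])                 # an arbitrary ket |P>
plus = 0.5 * (np.eye(2) + u) @ P         # (1/2)(1 + u)|P>
minus = 0.5 * (np.eye(2) - u) @ P        # (1/2)(1 - u)|P>

assert np.allclose(plus + minus, P)      # the two terms recover |P>
assert np.allclose(u @ plus, plus)       # eigenket, eigenvalue +1
assert np.allclose(u @ minus, -minus)    # eigenket, eigenvalue -1
```

The two operators ½(1 ± u) are exactly the projectors χ_r(u)/χ_r(c_r) of (21) for the roots c = ±1.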
10. Observables
We have made a number of assumptions about the way in which
states and dynamical variables are to be represented mathematically
in the theory. These assumptions are not, by themselves, laws of
nature, but become laws of nature when we make some further
assumptions that provide a physical interpretation of the theory.
Such further assumptions must take the form of establishing connexions
between the results of observations, on one hand, and the
equations of the mathematical formalism on the other.
When we make an observation we measure some dynamical variable.
It is obvious physically that the result of such a measurement must
always be a real number, so we should expect that any dynamical
variable that we can measure must be a real dynamical variable.
One might think one could measure a complex dynamical variable
by measuring separately its real and pure imaginary parts. But this
would involve two measurements or two observations, which would
be all right in classical mechanics, but would not do in quantum
mechanics, where two observations in general interfere with one
another: it is not in general permissible to consider that two observations
can be made exactly simultaneously, and if they are made in
quick succession the first will usually disturb the state of the system
and introduce an indeterminacy that will affect the second. We
therefore have to restrict the dynamical variables that we can
measure to be real, the condition for this in quantum mechanics
being as given in §8. Not every real dynamical variable can be
measured, however. A further restriction is needed, as we shall see
later.
We now make some assumptions for the physical interpretation of
the theory. If the dynamical system is in an eigenstate of a real
dynamical variable ξ, belonging to the eigenvalue ξ', then a measurement
of ξ will certainly give as result the number ξ'. Conversely, if the system
is in a state such that a measurement of a real dynamical variable ξ is
certain to give one particular result (instead of giving one or other of
several possible results according to a probability law, as is in general
the case), then the state is an eigenstate of ξ and the result of the measurement
is the eigenvalue of ξ to which this eigenstate belongs. These
assumptions are reasonable on account of the eigenvalues of real
linear operators being always real numbers.
Some of the immediate consequences of the assumptions will be
noted. If we have two or more eigenstates of a real dynamical
variable ξ belonging to the same eigenvalue ξ', then any state
formed by superposition of them will also be an eigenstate of ξ
belonging to the eigenvalue ξ'. We can infer that if we have two or
more states for which a measurement of ξ is certain to give the result
ξ', then for any state formed by superposition of them a measurement
of ξ will still be certain to give the result ξ'. This gives us some insight
into the physical significance of superposition of states. Again, two
eigenstates of ξ belonging to different eigenvalues are orthogonal.
We can infer that two states for which a measurement of ξ is certain
to give two different results are orthogonal. This gives us some
insight into the physical significance of orthogonal states.
When we measure a real dynamical variable ξ, the disturbance
involved in the act of measurement causes a jump in the state of the
dynamical system. From physical continuity, if we make a second
measurement of the same dynamical variable ξ immediately after
the first, the result of the second measurement must be the same as
that of the first. Thus after the first measurement has been made,
there is no indeterminacy in the result of the second. Hence, after
the first measurement has been made, the system is in an eigenstate
of the dynamical variable ξ, the eigenvalue it belongs to being equal
to the result of the first measurement. This conclusion must still hold
if the second measurement is not actually made. In this way we see
that a measurement always causes the system to jump into an eigenstate
of the dynamical variable that is being measured, the eigenvalue
this eigenstate belongs to being equal to the result of the measurement.
We can infer that, with the dynamical system in any state, any
result of a measurement of a real dynamical variable is one of its eigenvalues.
Conversely, every eigenvalue is a possible result of a measurement
of the dynamical variable for some state of the system, since it is
certainly the result if the state is an eigenstate belonging to this
eigenvalue. This gives us the physical significance of eigenvalues.
The set of eigenvalues of a real dynamical variable are just the
possible results of measurements of that dynamical variable and the
calculation of eigenvalues is for this reason an important problem.
Another assumption we make connected with the physical inter-
pretation of the theory is that, if a certain real dynamical variable
ξ is measured with the system in a particular state, the states into which
the system may jump on account of the measurement are such that the
original state is dependent on them. Now these states into which
the system may jump are all eigenstates of ξ, and hence the original
state is dependent on eigenstates of ξ. But the original state may be
any state, so we can conclude that any state is dependent on eigenstates
of ξ. If we define a complete set of states to be a set such that
any state is dependent on them, then our conclusion can be formulated:
the eigenstates of ξ form a complete set.
Not every real dynamical variable has sufficient eigenstates to form
a complete set. Those whose eigenstates do not form complete sets
are not quantities that can be measured. We obtain in this way a
further condition that a dynamical variable has to satisfy in order
that it shall be susceptible to measurement, in addition to the condition
that it shall be real. We call a real dynamical variable whose
eigenstates form a complete set an observable. Thus any quantity
that can be measured is an observable.
The question now presents itself: Can every observable be
measured? The answer theoretically is yes. In practice it may be
very awkward, or perhaps even beyond the ingenuity of the experimenter,
to devise an apparatus which could measure some particular
observable, but the theory always allows one to imagine that the
measurement can be made.
Let us examine mathematically the condition for a real dynamical
variable ξ to be an observable. Its eigenvalues may consist of a
(finite or infinite) discrete set of numbers, or alternatively, they
may consist of all numbers in a certain range, such as all numbers
lying between a and b. In the former case, the condition that
any state is dependent on eigenstates of ξ is that any ket can
be expressed as a sum of eigenkets of ξ. In the latter case the
condition needs modification, since one may have an integral instead
of a sum, i.e. a ket |P⟩ may be expressible as an integral of eigenkets
of ξ,
|P⟩ = ∫ |ξ'⟩ dξ', (24)
|ξ'⟩ being an eigenket of ξ belonging to the eigenvalue ξ' and the
range of integration being the range of eigenvalues, as such a ket is
dependent on eigenkets of ξ. Not every ket dependent on eigenkets
of ξ can be expressed in the form of the right-hand side of (24), since
one of the eigenkets itself cannot, and more generally any sum of
eigenkets cannot. The condition for the eigenstates of ξ to form a
complete set must thus be formulated, that any ket |P⟩ can be
expressed as an integral plus a sum of eigenkets of ξ, i.e.
|P⟩ = ∫ |ξ'c⟩ dξ' + Σ_r |ξ^r d⟩, (25)
where the |ξ'c⟩, |ξ^r d⟩ are all eigenkets of ξ, the labels c and d being
inserted to distinguish them when the eigenvalues ξ' and ξ^r are equal,
and where the integral is taken over the whole range of eigenvalues
and the sum is taken over any selection of them. If this condition
is satisfied in the case when the eigenvalues of ξ consist of a range
of numbers, then ξ is an observable.
There is a more general case that sometimes occurs, namely the
eigenvalues of ξ may consist of a range of numbers together with a
discrete set of numbers lying outside the range. In this case the
condition that ξ shall be an observable is still that any ket shall be
expressible in the form of the right-hand side of (25), but the sum
over r is now a sum over the discrete set of eigenvalues as well as a
selection of those in the range.
It is often very difficult to decide mathematically whether a par-
ticular real dynamical variable satisfies the condition for being an
observable or not, because the whole problem of finding eigenvalues
and eigenvectors is in general very difficult. However, we may have
good reason on experimental grounds for believing that the dynamical
variable can be measured and then we may reasonably assume that it
is an observable even though the mathematical proof is missing. This is
a thing we shall frequently do during the course of development of the
theory, e.g. we shall assume the energy of any dynamical system to be
always an observable, even though it is beyond the power of present-day
mathematical analysis to prove it so except in simple cases.
In the special case when the real dynamical variable is a number,
every state is an eigenstate and the dynamical variable is obviously
an observable. Any measurement of it always gives the same result,
so it is just a physical constant, like the charge on an electron.
A physical constant in quantum mechanics may thus be looked upon
either as an observable with a single eigenvalue or as a mere number
appearing in the equations, the two points of view being equivalent.
If the real dynamical variable satisfies an algebraic equation, then
the result (β) of the preceding section shows that the dynamical
variable is an observable. Such an observable has a finite number
of eigenvalues. Conversely, any observable with a finite number of
eigenvalues satisfies an algebraic equation, since if the observable ξ
has as its eigenvalues ξ', ξ'', ..., ξ⁽ⁿ⁾, then
(ξ − ξ')(ξ − ξ'')...(ξ − ξ⁽ⁿ⁾)|P⟩ = 0
holds for |P⟩ any eigenket of ξ, and thus it holds for any |P⟩ whatever,
because any ket can be expressed as a sum of eigenkets of ξ
on account of ξ being an observable. Hence
(ξ − ξ')(ξ − ξ'')...(ξ − ξ⁽ⁿ⁾) = 0. (26)
As an example we may consider the linear operator |A⟩⟨A|, where
|A⟩ is a normalized ket. This linear operator is real according to (7),
and its square is
{|A⟩⟨A|}² = |A⟩⟨A|A⟩⟨A| = |A⟩⟨A| (27)
since ⟨A|A⟩ = 1. Thus its square equals itself and so it satisfies an
algebraic equation and is an observable. Its eigenvalues are 1 and 0,
with |A⟩ as the eigenket belonging to the eigenvalue 1 and all kets
orthogonal to |A⟩ as eigenkets belonging to the eigenvalue 0. A
measurement of the observable thus certainly gives the result 1 if
the dynamical system is in the state corresponding to |A⟩ and the
result 0 if the system is in any orthogonal state, so the observable
may be described as the quantity which determines whether the
system is in the state |A⟩ or not.
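In matrix language |A⟩⟨A| is the outer product of a normalized vector with its conjugate. The sketch below (the particular ket is an arbitrary choice) checks equation (27) and the two eigenvalues:

```python
import numpy as np

A = np.array([1.0, 1.0j, 0.0]) / np.sqrt(2)   # a normalized ket |A>
assert np.isclose(A.conj() @ A, 1.0)          # <A|A> = 1

proj = np.outer(A, A.conj())                  # the operator |A><A|
assert np.allclose(proj @ proj, proj)         # its square equals itself, eq. (27)

evs = np.linalg.eigvalsh(proj)                # eigenvalues 0, 0, 1
assert np.allclose(evs, [0.0, 0.0, 1.0])

assert np.allclose(proj @ A, A)               # |A> belongs to eigenvalue 1
B = np.array([0.0, 0.0, 1.0])                 # a ket orthogonal to |A>
assert np.allclose(proj @ B, 0)               # orthogonal kets belong to eigenvalue 0
```

Measuring this observable answers the yes/no question "is the system in the state |A⟩?", with ⟨x|proj|x⟩ giving the probability of "yes" (anticipating §12).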
Before concluding this section we should examine the conditions
for an integral such as occurs in (24) to be significant. Suppose |X⟩
and |Y⟩ are two kets which can be expressed as integrals of eigenkets
of the observable ξ,
|X⟩ = ∫ |ξ'x⟩ dξ', |Y⟩ = ∫ |ξ''y⟩ dξ'',
x and y being used as labels to distinguish the two integrands. Then
we have, taking the conjugate imaginary of the first equation and
multiplying by the second
⟨X|Y⟩ = ∬ ⟨ξ'x|ξ''y⟩ dξ' dξ''. (28)
Consider now the single integral
∫ ⟨ξ'x|ξ''y⟩ dξ''. (29)
From the orthogonality theorem, the integrand here must vanish
over the whole range of integration except the one point ξ'' = ξ'.
If the integrand is finite at this point, the integral (29) vanishes, and
if this holds for all ξ', we get from (28) that ⟨X|Y⟩ vanishes. Now
in general ⟨X|Y⟩ does not vanish, so in general ⟨ξ'x|ξ'y⟩ must be
infinitely great in such a way as to make (29) non-vanishing and
finite. The form of infinity required for this will be discussed in §15.
In our work up to the present it has been implied that our bra and
ket vectors are of finite length and their scalar products are finite.
We see now the need for relaxing this condition when we are dealing
with eigenvectors of an observable whose eigenvalues form a range.
If we did not relax it, the phenomenon of ranges of eigenvalues could
not occur and our theory would be too weak for most practical
problems.
Taking |Y⟩ = |X⟩ above, we get the result that in general ⟨ξ'x|ξ'x⟩
is infinitely great. We shall assume that if |ξ'x⟩ ≠ 0
∫ ⟨ξ'x|ξ'x⟩ dξ' > 0, (30)
as the axiom corresponding to (8) of §6 for vectors of infinite
length.
The space of bra or ket vectors when the vectors are restricted to
be of finite length and to have finite scalar products is called by
mathematicians a Hilbert space. The bra and ket vectors that we
now use form a more general space than a Hilbert space.
We can now see that the expansion of a ket |P⟩ in the form of the
right-hand side of (25) is unique, provided there are not two or more
terms in the sum referring to the same eigenvalue. To prove this
result, let us suppose that two different expansions of |P⟩ are possible.
Then by subtracting one from the other, we get an equation
of the form
0 = ∫ |ξ'a⟩ dξ' + Σ_s |ξ^s b⟩, (31)
a and b being used as new labels for the eigenvectors, and the sum
over s including all terms left after the subtraction of one sum from
the other. If there is a term in the sum in (31) referring to an eigenvalue
ξ^t not in the range, we get, by multiplying (31) on the left by
⟨ξ^t b| and using the orthogonality theorem,
0 = ⟨ξ^t b|ξ^t b⟩,
which contradicts (8) of §6. Again, if the integrand in (31) does not
vanish for some eigenvalue ξ'' not equal to any ξ^s occurring in the
sum, we get, by multiplying (31) on the left by ⟨ξ''a| and using the
orthogonality theorem,
0 = ∫ ⟨ξ''a|ξ'a⟩ dξ',
which contradicts (30). Finally, if there is a term in the sum in (31)
referring to an eigenvalue ξ^t in the range, we get, multiplying (31) on
the left by ⟨ξ^t b|,
0 = ∫ ⟨ξ^t b|ξ'a⟩ dξ' + ⟨ξ^t b|ξ^t b⟩ (32)
and multiplying (31) on the left by ⟨ξ^t a|,
0 = ∫ ⟨ξ^t a|ξ'a⟩ dξ' + ⟨ξ^t a|ξ^t b⟩. (33)
Now the integral in (33) is finite, so ⟨ξ^t a|ξ^t b⟩ is finite and ⟨ξ^t b|ξ^t a⟩ is
finite. The integral in (32) must then be zero, so ⟨ξ^t b|ξ^t b⟩ is zero and
we again have a contradiction. Thus every term in (31) must vanish
and the expansion of a ket |P⟩ in the form of the right-hand side of
(25) must be unique.
11. Functions of observables
Let ξ be an observable. We can multiply it by any real number k
and get another observable kξ. In order that our theory may be
self-consistent it is necessary that, when the system is in a state such
that a measurement of the observable ξ certainly gives the result ξ',
a measurement of the observable kξ shall certainly give the result kξ'.
It is easily verified that this condition is fulfilled. The ket corresponding
to a state for which a measurement of ξ certainly gives the result
ξ' is an eigenket of ξ, |ξ'⟩ say, satisfying
ξ|ξ'⟩ = ξ'|ξ'⟩.
This equation leads to
kξ|ξ'⟩ = kξ'|ξ'⟩,
showing that |ξ'⟩ is an eigenket of kξ belonging to the eigenvalue kξ',
and thus that a measurement of kξ will certainly give the result kξ'.
More generally, we may take any real function of ξ, f(ξ) say, and
consider it as a new observable which is automatically measured
whenever ξ is measured, since an experimental determination of the
value of ξ also provides the value of f(ξ). We need not restrict f(ξ) to
be real, and then its real and pure imaginary parts are two observables
which are automatically measured when ξ is measured. For the theory
to be consistent it is necessary that, when the system is in a state
such that a measurement of ξ certainly gives the result ξ', a measurement
of the real and pure imaginary parts of f(ξ) shall certainly give
for results the real and pure imaginary parts of f(ξ'). In the case when
f(ξ) is expressible as a power series
f(ξ) = c₀ + c₁ξ + c₂ξ² + c₃ξ³ + ...,
the c's being numbers, this condition can again be verified by elementary
algebra. In the case of more general functions f it may not be
possible to verify the condition. The condition may then be used to
define f(ξ), which we have not yet defined mathematically. In this
way we can get a more general definition of a function of an observ-
able than is provided by power series.
We define f(ξ) in general to be that linear operator which satisfies
f(ξ)|ξ'⟩ = f(ξ')|ξ'⟩ (34)
for every eigenket |ξ'⟩ of ξ, f(ξ') being a number for each eigenvalue ξ'.
It is easily seen that this definition is self-consistent when applied to
eigenkets |ξ'⟩ that are not independent. If we have an eigenket |ξ'A⟩
dependent on other eigenkets of ξ, these other eigenkets must all
belong to the same eigenvalue ξ', otherwise we should have an equation
of the type (31), which we have seen is impossible. On multiplying
the equation which expresses |ξ'A⟩ linearly in terms of the other
eigenkets of ξ by f(ξ) on the left, we merely multiply each term in it
by the number f(ξ'), so we obviously get a consistent equation.
Further, equation (34) is sufficient to define the linear operator f(ξ)
completely, since to get the result of f(ξ) multiplied into an arbitrary
ket |P⟩, we have only to expand |P⟩ in the form of the right-hand
side of (25) and take
f(ξ)|P⟩ = ∫ f(ξ')|ξ'c⟩ dξ' + Σ_r f(ξ^r)|ξ^r d⟩. (35)
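For a finite-dimensional Hermitian matrix the definition (34) amounts to applying f to each eigenvalue in an eigenbasis. A minimal sketch (the matrix and the choice f = exp are illustrative assumptions):

```python
import numpy as np

def f_of_observable(f, xi):
    """Build f(xi) from the definition f(xi)|xi'> = f(xi')|xi'>."""
    w, V = np.linalg.eigh(xi)               # eigenvalues w, eigenkets V[:, i]
    return V @ np.diag(f(w)) @ V.conj().T

xi = np.array([[2.0, 1.0],
               [1.0, 2.0]])                 # eigenvalues 1 and 3
g = f_of_observable(np.exp, xi)

# Each eigenket of xi is an eigenket of f(xi) with eigenvalue f(xi')
w, V = np.linalg.eigh(xi)
for i in range(2):
    v = V[:, i]
    assert np.allclose(g @ v, np.exp(w[i]) * v)

# For f given by a power series the definition agrees with the series,
# e.g. f(x) = x^2 reproduces the matrix product xi @ xi:
assert np.allclose(f_of_observable(lambda x: x**2, xi), xi @ xi)
```

Nothing here requires f to be analytic or continuous: any single-valued f defined on the eigenvalues will do, exactly as the text goes on to state.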
The conjugate complex \overline{f(ξ)} of f(ξ) is defined by the conjugate
imaginary equation to (34), namely
⟨ξ'|\overline{f(ξ)} = f̄(ξ')⟨ξ'|,
holding for any eigenbra ⟨ξ'|, f̄(ξ') being the conjugate complex
function to f(ξ'). Let us replace ξ' here by ξ'' and multiply the
equation on the right by the arbitrary ket |P⟩. Then we get, using
the expansion (25) for |P⟩,
⟨ξ''|\overline{f(ξ)}|P⟩ = f̄(ξ'')⟨ξ''|P⟩
= f̄(ξ'') ∫ ⟨ξ''|ξ'c⟩ dξ' + f̄(ξ'')⟨ξ''|ξ''d⟩
= ∫ f̄(ξ')⟨ξ''|ξ'c⟩ dξ' + f̄(ξ'')⟨ξ''|ξ''d⟩ (36)
with the help of the orthogonality theorem, ⟨ξ''|ξ''d⟩ being understood
to be zero if ξ'' is not one of the eigenvalues to which the terms
in the sum in (25) refer. Again, putting the conjugate complex
function f̄(ξ') for f(ξ') in (35) and multiplying on the left by ⟨ξ''|,
we get
⟨ξ''|f̄(ξ)|P⟩ = ∫ f̄(ξ')⟨ξ''|ξ'c⟩ dξ' + f̄(ξ'')⟨ξ''|ξ''d⟩.
The right-hand side here equals that of (36), since the integrands
vanish for ξ' ≠ ξ'', and hence
⟨ξ''|\overline{f(ξ)}|P⟩ = ⟨ξ''|f̄(ξ)|P⟩.
§ 11 FUNCTIONS OF OBSERVABLES
This holds for ⟨ξ''| any eigenbra and |P⟩ any ket, so
\overline{f(ξ)} = f̄(ξ). (37)
Thus the conjugate complex of the linear operator f(ξ) is the conjugate
complex function f̄ of ξ.
It follows as a corollary that if f(ξ') is a real function of ξ', f(ξ) is
a real linear operator. f(ξ) is then also an observable, since its
eigenstates form a complete set, every eigenstate of ξ being also an
eigenstate of f(ξ).
With the above definition we are able to give a meaning to any
function f of an observable, provided only that the domain of existence
of the function of a real variable f(x) includes all the eigenvalues of the
observable. If the domain of existence contains other points besides
these eigenvalues, then the values of f(x) for these other points will
not affect the function of the observable. The function need not be
analytic or continuous. The eigenvalues of a function f of an observable
are just the function f of the eigenvalues of the observable.
It is important to observe that the possibility of defining a function
f of an observable requires the existence of a unique number f(x) for
each value of x which is an eigenvalue of the observable. Thus the
function f(x) must be single-valued. This may be illustrated by considering
the question: When we have an observable f(A) which is a
real function of the observable A, is the observable A a function of
the observable f(A)? The answer to this is yes, if different eigenvalues
A' of A always lead to different values of f(A'). If, however, there
exist two different eigenvalues of A, A' and A'' say, such that
f(A') = f(A''), then, corresponding to the eigenvalue f(A') of the
observable f(A), there will not be a unique eigenvalue of the observable
A and the latter will not be a function of the observable f(A).
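A small numerical illustration of this point, with f(x) = x² (the matrices are illustrative stand-ins, not from the text):

```python
import numpy as np

# Two eigenvalues of A map to the same value of f: f(1) = f(-1) = 1
A = np.diag([1.0, -1.0])
fA = A @ A
assert np.allclose(fA, np.eye(2))
# fA is the unit matrix, so no function applied to fA can recover A:
# A is not a function of f(A) here.

# With eigenvalues on which f is one-to-one the inversion works:
B = np.diag([2.0, 3.0])
fB = B @ B                                   # eigenvalues 4 and 9, all distinct
w, V = np.linalg.eigh(fB)
recovered = V @ np.diag(np.sqrt(w)) @ V.T    # the positive square root of fB
assert np.allclose(recovered, B)             # B is a function of f(B)
```

The failure in the first case is exactly the loss of single-valuedness: the eigenvalue 1 of f(A) corresponds to two different eigenvalues of A.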
It may easily be verified mathematically, from the definition, that
the sum or product of two functions of an observable is a function
of that observable and that a function of a function of an observable
is a function of that observable. Also it is easily seen that the whole
theory of functions of an observable is symmetrical between bras and
kets and that we could equally well work from the equation
⟨ξ'|f(ξ) = f(ξ')⟨ξ'| (38)
instead of from (34).
We shall conclude this section with a discussion of two examples
which are of great practical importance, namely the reciprocal and
the square root. The reciprocal of an observable exists if the observable
does not have the eigenvalue zero. If the observable α does not
have the eigenvalue zero, the reciprocal observable, which we call α⁻¹
or 1/α, will satisfy
α⁻¹|α'⟩ = α'⁻¹|α'⟩, (39)
where |α'⟩ is an eigenket of α belonging to the eigenvalue α'. Hence
αα⁻¹|α'⟩ = α'⁻¹α|α'⟩ = |α'⟩.
Since this holds for any eigenket |α'⟩, we must have
αα⁻¹ = 1. (40)
Similarly,
α⁻¹α = 1. (41)
Either of these equations is sufficient to determine α⁻¹ completely,
provided α does not have the eigenvalue zero. To prove this in the
case of (40), let x be any linear operator satisfying the equation
αx = 1
and multiply both sides on the left by the α⁻¹ defined by (39). The
result is
α⁻¹αx = α⁻¹
and hence from (41)
x = α⁻¹.
Equations (40) and (41) can be used to define the reciprocal, when
it exists, of a general linear operator α, which need not even be real.
One of these equations by itself is then not necessarily sufficient. If
any two linear operators α and β have reciprocals, their product αβ
has the reciprocal
(αβ)⁻¹ = β⁻¹α⁻¹, (42)
obtained by taking the reciprocal of each factor and reversing their
order. We verify (42) by noting that its right-hand side gives unity
when multiplied by αβ, either on the right or on the left. This reciprocal
law for products can be immediately extended to more than
two factors, i.e.,
(αβγ...)⁻¹ = ...γ⁻¹β⁻¹α⁻¹.
The square root of an observable α always exists, and is real if α
has no negative eigenvalues. We write it √α or α^½. It satisfies
√α|α'⟩ = ±√α'|α'⟩, (43)
|α'⟩ being an eigenket of α belonging to the eigenvalue α'. Hence
√α√α|α'⟩ = √α'√α'|α'⟩ = α'|α'⟩ = α|α'⟩,
and since this holds for any eigenket |α'⟩ we must have
√α√α = α. (44)
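Both examples, and the sign ambiguity in (43), can be checked on a small Hermitian matrix by working in its eigenbasis (the matrices here are illustrative stand-ins):

```python
import numpy as np

alpha = np.array([[2.0, 1.0],
                  [1.0, 2.0]])                 # eigenvalues 1 and 3, none zero
w, V = np.linalg.eigh(alpha)

# The reciprocal: reciprocal of each eigenvalue in the eigenbasis
inv = V @ np.diag(1.0 / w) @ V.T
assert np.allclose(alpha @ inv, np.eye(2))     # eq. (40)
assert np.allclose(inv @ alpha, np.eye(2))     # eq. (41)

# Reversed-order rule (42) for a product of two invertible operators
beta = np.array([[1.0, 2.0],
                 [0.0, 1.0]])
assert np.allclose(np.linalg.inv(alpha @ beta),
                   np.linalg.inv(beta) @ np.linalg.inv(alpha))

# The positive square root: positive sign taken in (43) for every eigenvalue
root = V @ np.diag(np.sqrt(w)) @ V.T
assert np.allclose(root @ root, alpha)         # eq. (44)

# Flipping the sign for one eigenvalue gives another square root,
# illustrating the 2^n choices mentioned below
other = V @ np.diag([-np.sqrt(w[0]), np.sqrt(w[1])]) @ V.T
assert np.allclose(other @ other, alpha)
```

Note that equation (44) alone does not distinguish `root` from `other`; the sign convention in (43) does.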
On account of the ambiguity of sign in (43) there will be several
square roots. To fix one of them we must specify a particular sign
in (43) for each eigenvalue. This sign may vary irregularly from one
eigenvalue to the next and equation (43) will always define a linear
operator √α satisfying (44) and forming a square-root function of α.
If there is an eigenvalue of α with two or more independent eigenkets
belonging to it, then we must, according to our definition of a function,
have the same sign in (43) for each of these eigenkets. If we
took different signs, however, equation (44) would still hold, and hence
equation (44) by itself is not sufficient to define √α, except in the
special case when there is only one independent eigenket of α belonging
to any eigenvalue.
The number of different square roots of an observable is 2ⁿ, where
n is the total number of eigenvalues not zero. In practice the square-root
function is used only for observables without negative eigenvalues
and the particular square root that is useful is the one for
which the positive sign is always taken in (43). This one will be called
the positive square root.
12. The general physical interpretation
The assumptions that we made at the beginning of §10 to get a
physical interpretation of the mathematical theory are of a rather
special kind, since they can be used only in connexion with eigenstates.
We need some more general assumption which will enable us
to extract physical information from the mathematics even when we
are not dealing with eigenstates.
In classical mechanics an observable always, as we say, 'has a
value' for any particular state of the system. What is there in quantum
mechanics corresponding to this? If we take any observable ξ
and any two states x and y, corresponding to the vectors ⟨x| and |y⟩,
then we can form the number ⟨x|ξ|y⟩. This number is not very
closely analogous to the value which an observable can 'have' in the
classical theory, for three reasons, namely, (i) it refers to two states
of the system, while the classical value always refers to one, (ii) it is
in general not a real number, and (iii) it is not uniquely determined
by the observable and the states, since the vectors ⟨x| and |y⟩ contain
arbitrary numerical factors. Even if we impose on ⟨x| and |y⟩ the
condition that they shall be normalized, there will still be an undetermined
factor of modulus unity in ⟨x|ξ|y⟩. These three reasons cease
to apply, however, if we take the two states to be identical and |y⟩
to be the conjugate imaginary vector to ⟨x|. The number that we
then get, namely ⟨x|ξ|x⟩, is necessarily real, and also it is uniquely
determined when ⟨x| is normalized, since if we multiply ⟨x| by the
numerical factor e^{ic}, c being some real number, we must multiply
|x⟩ by e^{−ic} and ⟨x|ξ|x⟩ will be unaltered.
One might thus be inclined to make the tentative assumption that
the observable ξ 'has the value' ⟨x|ξ|x⟩ for the state x, in a sense
analogous to the classical sense. This would not be satisfactory,
though, for the following reason. Let us take a second observable η,
which would have by the above assumption the value ⟨x|η|x⟩ for
this same state. We should then expect, from classical analogy, that
for this state the sum of the two observables would have a value
equal to the sum of the values of the two observables separately and
the product of the two observables would have a value equal to the
product of the values of the two observables separately. Actually, the
tentative assumption would give for the sum of the two observables
the value ⟨x|ξ+η|x⟩, which is, in fact, equal to the sum of ⟨x|ξ|x⟩
and ⟨x|η|x⟩, but for the product it would give the value ⟨x|ξη|x⟩
or ⟨x|ηξ|x⟩, neither of which is connected in any simple way with
⟨x|ξ|x⟩ and ⟨x|η|x⟩.
However, since things go wrong only with the product and not with
the sum, it would be reasonable to call ⟨x|ξ|x⟩ the average value of
the observable ξ for the state x. This is because the average of the
sum of two quantities must equal the sum of their averages, but the
average of their product need not equal the product of their averages.
We therefore make the general assumption that if the measurement
of the observable ξ for the system in the state corresponding to |x⟩ is
made a large number of times, the average of all the results obtained will
be ⟨x|ξ|x⟩, provided |x⟩ is normalized. If |x⟩ is not normalized, as is
necessarily the case if the state x is an eigenstate of some observable
belonging to an eigenvalue in a range, the assumption becomes that
the average result of a measurement of ξ is proportional to ⟨x|ξ|x⟩.
This general assumption provides a basis for a general physical interpretation
of the theory.
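In a finite number of dimensions ⟨x|ξ|x⟩ equals the probability-weighted sum of the eigenvalues, and a simulated run of repeated measurements has this as its sample mean. A sketch under illustrative choices of matrix and state:

```python
import numpy as np

rng = np.random.default_rng(0)

xi = np.array([[2.0, 1.0],
               [1.0, 2.0]])                 # eigenvalues 1 and 3
w, V = np.linalg.eigh(xi)

x = np.array([0.6, 0.8])                    # a normalized ket |x>
expectation = x @ xi @ x                    # <x|xi|x>

# Probability of each eigenvalue: squared component along each eigenket
probs = np.abs(V.T @ x) ** 2
assert np.isclose(probs.sum(), 1.0)
assert np.isclose(expectation, probs @ w)   # <x|xi|x> = sum of value * probability

# Simulate many measurements; the running average approaches <x|xi|x>
samples = rng.choice(w, size=200_000, p=probs)
assert abs(samples.mean() - expectation) < 0.01
```

The identity `expectation == probs @ w` is the finite-dimensional content of the general assumption; the sampling step illustrates the phrase "made a large number of times".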
The expression that an observable 'has a particular value' for a
particular state is permissible in quantum mechanics in the special
case when a measurement of the observable is certain to lead to the
particular value, so that the state is an eigenstate of the observable.
It may easily be verified from the algebra that, with this restricted
meaning for an observable 'having a value', if two observables have
values for a particular state, then for this state the sum of the two
observables (if this sum is an observable†) has a value equal to the
sum of the values of the two observables separately and the product
of the two observables (if this product is an observable‡) has a value
equal to the product of the values of the two observables separately.
In the general case we cannot speak of an observable having a value
for a particular state, but we can speak of its having an average value
for the state. We can go further and speak of the probability of its
having any specified value for the state, meaning the probability of
this specified value being obtained when one makes a measurement of
the observable. This probability can be obtained from the general
assumption in the following way.
Let the observable be ξ and let the state correspond to the normal-
ized ket |x⟩. Then the general assumption tells us, not only that the
average value of ξ is ⟨x|ξ|x⟩, but also that the average value of any
function of ξ, f(ξ) say, is ⟨x|f(ξ)|x⟩. Take f(ξ) to be that function of ξ
which is equal to unity when ξ = a, a being some real number, and
zero otherwise. This function of ξ has a meaning according to our
general theory of functions of an observable, and it may be denoted
by δ_{ξa} in conformity with the general notation of the symbol δ with
two suffixes given on p. 62 (equation (17)). The average value of
this function of ξ is just the probability, P_a say, of ξ having the value
a. Thus
P_a = ⟨x|δ_{ξa}|x⟩. (45)
If a is not an eigenvalue of ξ, δ_{ξa} multiplied into any eigenket of ξ is
zero, and hence δ_{ξa} = 0 and P_a = 0. This agrees with a conclusion
of §10, that any result of a measurement of an observable must be
one of its eigenvalues.
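Concretely, δ_{ξa} is the function of ξ that is 1 at a and 0 elsewhere, i.e. the projector onto the eigenspace of a. A sketch of equation (45) with an illustrative matrix and state (both my own choices, not from the text):

```python
import numpy as np

xi = np.diag([1.0, 1.0, 3.0])               # eigenvalue 1 occurs twice
w, V = np.linalg.eigh(xi)

def delta_xi(a):
    """delta_{xi,a}: 1 on eigenvalues equal to a, 0 elsewhere, as an operator."""
    return V @ np.diag(np.isclose(w, a).astype(float)) @ V.T

x = np.array([0.0, 0.6, 0.8])               # a normalized ket |x>

P1 = x @ delta_xi(1.0) @ x                  # probability of the result 1, eq. (45)
P3 = x @ delta_xi(3.0) @ x                  # probability of the result 3
assert np.isclose(P1, 0.36) and np.isclose(P3, 0.64)
assert np.isclose(P1 + P3, 1.0)             # the probabilities exhaust unity

# a not an eigenvalue: delta_{xi,a} vanishes, so P_a = 0
assert np.allclose(delta_xi(2.0), 0)
```

Because the eigenvalue 1 is degenerate, delta_xi(1.0) projects onto a two-dimensional eigenspace; the construction is independent of how a basis is chosen within it.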
If the possible results of a measurement of ξ form a range of num-
bers, the probability of ξ having exactly a particular value will be
zero in most physical problems. The quantity of physical importance
is then the probability of ξ having a value within a small range, say
from a to a+da. This probability, which we may call P(a) da, is
† This is not obviously so, since the sum may not have sufficient eigenstates to
form a complete set, in which case the sum, considered as a single quantity, would
not be measurable.
‡ Here the reality condition may fail, as well as the condition for the eigenstates
to form a complete set.
equal to the average value of that function of ξ which is equal to
unity for ξ lying within the range a to a+da and zero otherwise.
This function of ξ has a meaning according to our general theory of
functions of an observable. Denoting it by χ(ξ), we have
P(a) da = ⟨x|χ(ξ)|x⟩. (46)
If the range a to a+da does not include any eigenvalues of ξ, we
have as above χ(ξ) = 0 and P(a) = 0. If |x⟩ is not normalized, the
right-hand sides of (45) and (46) will still be proportional to the
probability of ξ having the value a and lying within the range a to
a+da respectively.
The assumption of §10, that a measurement of ξ is certain to give
the result ξ' if the system is in an eigenstate of ξ belonging to the
eigenvalue ξ', is consistent with the general assumption for physical
interpretation and can in fact be deduced from it. Working from the
general assumption we see that, if |ξ'⟩ is an eigenket of ξ belonging
to the eigenvalue ξ', then, in the case of discrete eigenvalues of ξ,
δ_{ξa}|ξ'⟩ = 0 unless a = ξ',
and in the case of a range of eigenvalues of ξ,
χ(ξ)|ξ'⟩ = 0 unless the range a to a+da includes ξ'.
In either case, for the state corresponding to |ξ'⟩, the probability of
ξ having any value other than ξ' is zero.
An eigenstate of ξ belonging to an eigenvalue ξ′ lying in a range
is a state which cannot strictly be realized in practice, since it would
need an infinite amount of precision to get ξ to equal exactly ξ′.
The most that could be attained in practice would be to get ξ to lie
within a narrow range about the value ξ′. The system would then
be in a state approximating to an eigenstate of ξ. Thus an eigenstate
belonging to an eigenvalue in a range is a mathematical idealization
of what can be attained in practice. All the same such eigenstates
play a very useful role in the theory and one could not very well do
without them. Science contains many examples of theoretical concepts
which are limits of things met with in practice and are useful
for the precise formulation of laws of nature, although they are not
realizable experimentally, and this is just one more of them. It may
be that the infinite length of the ket vectors corresponding to these
eigenstates is connected with their unrealizability, and that all realizable
states correspond to ket vectors that can be normalized and that
form a Hilbert space.
13. Commutability and compatibility
A state may be simultaneously an eigenstate of two observables.
If the state corresponds to the ket vector |A⟩ and the observables are
ξ and η, we should then have the equations

ξ|A⟩ = ξ′|A⟩,
η|A⟩ = η′|A⟩,

where ξ′ and η′ are eigenvalues of ξ and η respectively. We can now
deduce

ξη|A⟩ = ξη′|A⟩ = η′ξ|A⟩ = η′ξ′|A⟩ = ξ′η|A⟩ = ηξ|A⟩,

or (ξη − ηξ)|A⟩ = 0.

This suggests that the chances for the existence of a simultaneous
eigenstate are most favourable if ξη − ηξ = 0 and the two observables
commute. If they do not commute a simultaneous eigenstate is not
impossible, but is rather exceptional. On the other hand, if they do
commute there exist so many simultaneous eigenstates that they form a
complete set, as will now be proved.
Let ξ and η be two commuting observables. Take an eigenket of
η, |η′⟩ say, belonging to the eigenvalue η′, and expand it in terms
of eigenkets of ξ in the form of the right-hand side of (25), thus

|η′⟩ = ∫ |ξ′η′c⟩ dξ′ + Σ_r |ξʳη′d⟩.     (47)

The eigenkets of ξ on the right-hand side here have η′ inserted in
them as an extra label, in order to remind us that they come from
the expansion of a special ket vector, namely |η′⟩, and not a general
one as in equation (25). We can now show that each of these eigenkets
of ξ is also an eigenket of η belonging to the eigenvalue η′. We
have

0 = (η−η′)|η′⟩ = ∫ (η−η′)|ξ′η′c⟩ dξ′ + Σ_r (η−η′)|ξʳη′d⟩.     (48)

Now the ket (η−η′)|ξ′η′c⟩ satisfies

ξ{(η−η′)|ξ′η′c⟩} = (η−η′)ξ|ξ′η′c⟩ = ξ′(η−η′)|ξ′η′c⟩,

showing that it is an eigenket of ξ belonging to the eigenvalue ξ′,
and similarly the ket (η−η′)|ξʳη′d⟩ is an eigenket of ξ belonging to
the eigenvalue ξʳ. Equation (48) thus gives an integral plus a sum
of eigenkets of ξ equal to zero, which, as we have seen with equation
(31), is impossible unless the integrand and every term in the sum
vanishes. Hence

(η−η′)|ξ′η′c⟩ = 0,   (η−η′)|ξʳη′d⟩ = 0,

so that all the kets appearing on the right-hand side of (47) are
eigenkets of η as well as of ξ. Equation (47) now gives |η′⟩ expanded
in terms of simultaneous eigenkets of ξ and η. Since any ket can be
expanded in terms of eigenkets |η′⟩ of η, it follows that any ket can
be expanded in terms of simultaneous eigenkets of ξ and η, and thus
the simultaneous eigenstates form a complete set.
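The theorem just proved can be checked directly in a finite number of dimensions: two commuting Hermitian matrices, standing in for ξ and η, possess a complete set of simultaneous eigenvectors. The following numerical sketch is not part of the original text; it assumes Python with NumPy, and the matrices and eigenvalues are purely illustrative:

```python
import numpy as np

# Build a random unitary Q, then two Hermitian matrices diagonal in the
# basis of its columns; such a pair necessarily commutes, like xi and eta.
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(M)                     # columns of Q: a common eigenbasis
xi  = Q @ np.diag([1.0, 1.0, 2.0]) @ Q.conj().T   # note the degeneracy in xi
eta = Q @ np.diag([5.0, 6.0, 6.0]) @ Q.conj().T

assert np.allclose(xi @ eta, eta @ xi)     # the observables commute

# Every column of Q is a simultaneous eigenket, and together the columns
# form a complete set (Q is unitary, so they span the whole space).
for k in range(3):
    v = Q[:, k]
    assert np.allclose(xi @ v, (v.conj() @ xi @ v) * v)
    assert np.allclose(eta @ v, (v.conj() @ eta @ v) * v)
assert np.allclose(Q @ Q.conj().T, np.eye(3))
```

The degenerate eigenvalue of ξ shows why the extra labels c and d of equation (47) can be needed: the eigenvalue pair alone need not single out one state.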
The above simultaneous eigenkets of ξ and η, |ξ′η′c⟩ and |ξʳη′d⟩,
are labelled by the eigenvalues ξ′ and η′, or ξʳ and η′, to which they
belong, together with the labels c and d which may also be necessary.
The procedure of using eigenvalues as labels for simultaneous eigen-
vectors will be generally followed in the future, just as it has been
followed in the past for eigenvectors of single observables.
The converse to the above theorem says that, if ξ and η are two
observables such that their simultaneous eigenstates form a complete set,
then ξ and η commute. To prove this, we note that, if |ξ′η′⟩ is a
simultaneous eigenket belonging to the eigenvalues ξ′ and η′,

(ξη − ηξ)|ξ′η′⟩ = (ξ′η′ − η′ξ′)|ξ′η′⟩ = 0.     (49)

Since the simultaneous eigenstates form a complete set, an arbitrary
ket |P⟩ can be expanded in terms of simultaneous eigenkets |ξ′η′⟩,
for each of which (49) holds, and hence

(ξη − ηξ)|P⟩ = 0

and so ξη − ηξ = 0.
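The converse admits an equally direct finite-dimensional check. In this Python/NumPy sketch (not in the original text; values illustrative) the observables are built from one complete orthonormal set of simultaneous eigenkets, and the commutator then vanishes because it annihilates every ket:

```python
import numpy as np

# One complete orthonormal set of simultaneous eigenkets: the columns of a
# unitary U. Each observable is assigned its eigenvalues in that basis.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((4, 4))
                    + 1j * rng.standard_normal((4, 4)))
xi  = U @ np.diag([0., 1., 2., 3.]) @ U.conj().T
eta = U @ np.diag([7., 7., 8., 9.]) @ U.conj().T

# (xi eta - eta xi) annihilates every simultaneous eigenket, and these
# span the space, so the commutator is the zero operator.
assert np.allclose(xi @ eta - eta @ xi, np.zeros((4, 4)))
```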
The idea of simultaneous eigenstates may be extended to more
than two observables and the above theorem and its converse still
hold, i.e. if any set of observables commute, each with all the others,
their simultaneous eigenstates form a complete set, and conversely.
The same arguments used for the proof with two observables are
adequate for the general case; e.g., if we have three commuting
observables ξ, η, ζ, we can expand any simultaneous eigenket of ξ
and η in terms of eigenkets of ζ and then show that each of these
eigenkets of ζ is also an eigenket of ξ and of η. Thus the simultaneous
eigenket of ξ and η is expanded in terms of simultaneous eigenkets
of ξ, η, and ζ, and since any ket can be expanded in terms of simultaneous
eigenkets of ξ and η, it can also be expanded in terms of
simultaneous eigenkets of ξ, η, and ζ.
The orthogonality theorem applied to simultaneous eigenkets tells
us that two simultaneous eigenvectors of a set of commuting observables
are orthogonal if the sets of eigenvalues to which they belong
differ in any way.
Owing to the simultaneous eigenstates of two or more commuting
observables forming a complete set, we can set up a theory of functions
of two or more commuting observables on the same lines as the
theory of functions of a single observable given in § 11. If ξ, η, ζ,...
are commuting observables, we define a general function f of them
to be that linear operator f(ξ, η, ζ,...) which satisfies

f(ξ, η, ζ,...)|ξ′η′ζ′...⟩ = f(ξ′, η′, ζ′,...)|ξ′η′ζ′...⟩,     (50)

where |ξ′η′ζ′...⟩ is any simultaneous eigenket of ξ, η, ζ,... belonging
to the eigenvalues ξ′, η′, ζ′,... . Here f is any function such that
f(a, b, c,...) is defined for all values of a, b, c,... which are eigenvalues
of ξ, η, ζ,... respectively. As with a function of a single observable
defined by (34), we can show that f(ξ, η, ζ,...) is completely determined
by (50), that

⟨ξ′η′ζ′...| f̄(ξ, η, ζ,...) = f̄(ξ′, η′, ζ′,...) ⟨ξ′η′ζ′...|,

corresponding to (37), and that if f(a, b, c,...) is a real function,
f(ξ, η, ζ,...) is real and is an observable.
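Definition (50) is directly computable in a finite-dimensional model: one applies f to the eigenvalues in the simultaneous eigenbasis. A Python/NumPy sketch (not in the original text; matrices and the function f are illustrative):

```python
import numpy as np

# Two commuting observables with a shared orthogonal eigenbasis U.
U, _ = np.linalg.qr(np.array([[1., 1.], [1., -1.]]))
xi_vals  = np.array([2., 3.])
eta_vals = np.array([5., 5.])
xi  = U @ np.diag(xi_vals)  @ U.T
eta = U @ np.diag(eta_vals) @ U.T

def f_of(f):
    # Definition (50): act with f(xi', eta') on each simultaneous eigenket.
    return U @ np.diag(f(xi_vals, eta_vals)) @ U.T

# For a polynomial f the definition agrees with ordinary operator algebra:
g = f_of(lambda a, b: a * b + b)
assert np.allclose(g, xi @ eta + eta)
```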
We can now proceed to generalize the results (45) and (46). Given
a set of commuting observables ξ, η, ζ,..., we may form that function
of them which is equal to unity when ξ = a, η = b, ζ = c,..., a, b, c,...
being real numbers, and is equal to zero when any of these conditions
is not fulfilled. This function may be written δ_{ξa} δ_{ηb} δ_{ζc}..., and is in
fact just the product in any order of the factors δ_{ξa}, δ_{ηb}, δ_{ζc},... defined
as functions of single observables, as may be seen by substituting this
product for f(ξ, η, ζ,...) in the left-hand side of (50). The average
value of this function for any state is the probability, P_{abc...} say, of
ξ, η, ζ,... having the values a, b, c,... respectively for that state. Thus
if the state corresponds to the normalized ket vector |x⟩, we get from
our general assumption for physical interpretation

P_{abc...} = ⟨x| δ_{ξa} δ_{ηb} δ_{ζc} ... |x⟩.     (51)

P_{abc...} is zero unless each of the numbers a, b, c,... is an eigenvalue of
the corresponding observable. If any of the numbers a, b, c,... is an
eigenvalue in a range of eigenvalues of the corresponding observable,
P_{abc...} will usually again be zero, but in this case we ought to replace
the requirement that this observable shall have exactly one value by
the requirement that it shall have a value lying within a small range,
which involves replacing one of the δ factors in (51) by a factor like
the χ(ξ) of equation (46). On carrying out such a replacement for
each of the observables ξ, η, ζ,..., whose corresponding numerical
value a, b, c,... lies in a range of eigenvalues, we shall get a probability
which does not in general vanish.
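Formula (51) can be exercised numerically: in a finite model each factor δ_{ξa} is the projector onto the eigenspace of ξ with eigenvalue a, and the joint probability is the expectation of the product of these projectors. A Python/NumPy sketch (not in the original text; eigenvalues and the ket are illustrative):

```python
import numpy as np

# Work directly in the simultaneous eigenbasis, where both are diagonal.
xi_vals  = np.array([1., 1., 2.])          # eigenvalues of xi per basis state
eta_vals = np.array([4., 5., 5.])          # eigenvalues of eta per basis state

def delta_factor(vals, a):
    # The function of the observable equal to unity where it has value a:
    return np.diag((vals == a).astype(float))

x = np.array([1., 2., 2.]) / 3.0           # a normalized ket |x>
P = x @ delta_factor(xi_vals, 1.) @ delta_factor(eta_vals, 4.) @ x
# Only the first basis state has xi = 1 and eta = 4 simultaneously.
assert np.isclose(P, (1. / 3.) ** 2)
```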
If certain observables commute, there exist states for which they all
have particular values, in the sense explained at the bottom of p. 46,
namely the simultaneous eigenstates. Thus one can give a meaning to
several commuting observables having values at the same time. Further, we
see from (51) that for any state one can give a meaning to the probability
of particular results being obtained for simultaneous measurements of
several commuting observables. This conclusion is an important new
development. In general one cannot make an observation on a
system in a definite state without disturbing that state and spoiling
it for the purposes of a second observation. One cannot then give
any meaning to the two observations being made simultaneously.
The above conclusion tells us, though, that in the special case when
the two observables commute, the observations are to be considered
as non-interfering or compatible, in such a way that one can give a
meaning to the two observations being made simultaneously and can
discuss the probability of any particular results being obtained. The
two observations may, in fact, be considered as a single observation
of a more complicated type, the result of which is expressible by two
numbers instead of a single number. From the point of view of general
theory, any two or more commuting observables may be counted as a
single observable, the result of a measurement of which consists of two or
more numbers. The states for which this measurement is certain to
lead to one particular result are the simultaneous eigenstates.
III
REPRESENTATIONS
14. Basic vectors
IN the preceding chapters we set up an algebraic scheme involving
certain abstract quantities of three kinds, namely bra vectors, ket
vectors, and linear operators, and we expressed some of the fundamental
laws of quantum mechanics in terms of them. It would be
possible to continue to develop the theory in terms of these abstract
quantities and to use them for applications to particular problems.
However, for some purposes it is more convenient to replace the
abstract quantities by sets of numbers with analogous mathematical
properties and to work in terms of these sets of numbers. The procedure
is similar to using coordinates in geometry, and has the advantage
of giving one greater mathematical power for the solving of
particular problems.
The way in which the abstract quantities are to be replaced by
numbers is not unique, there being many possible ways corresponding
to the many systems of coordinates one can have in geometry. Each
of these ways is called a representation and the set of numbers that
replace an abstract quantity is called the representative of that
abstract quantity in the representation. Thus the representative of
an abstract quantity corresponds to the coordinates of a geometrical
object. When one has a particular problem to work out in quantum
mechanics, one can minimize the labour by using a representation
in which the representatives of the more important abstract quantities
occurring in that problem are as simple as possible.
To set up a representation in a general way, we take a complete
set of bra vectors, i.e. a set such that any bra can be expressed
linearly in terms of them (as a sum or an integral or possibly an
integral plus a sum). These bras we call the basic bras of the representation.
They are sufficient, as we shall see, to fix the representation
completely.
Take any ket |a⟩ and form its scalar product with each of the basic
bras. The numbers so obtained constitute the representative of |a⟩.
They are sufficient to determine the ket |a⟩ completely, since if there
is a second ket, |a₁⟩ say, for which these numbers are the same, the
difference |a⟩ − |a₁⟩ will have its scalar product with any basic bra
vanishing, and hence its scalar product with any bra whatever will
vanish, and |a⟩ − |a₁⟩ itself will vanish.
We may suppose the basic bras to be labelled by one or more
parameters, λ₁, λ₂,..., λᵤ, each of which may take on certain numerical
values. The basic bras will then be written ⟨λ₁λ₂...λᵤ| and the representative
of |a⟩ will be written ⟨λ₁λ₂...λᵤ|a⟩. This representative will
now consist of a set of numbers, one for each set of values that
λ₁, λ₂,..., λᵤ may have in their respective domains. Such a set of
numbers just forms a function of the variables λ₁, λ₂,..., λᵤ. Thus the
representative of a ket may be looked upon either as a set of numbers
or as a function of the variables used to label the basic bras.
If the number of independent states of our dynamical system is
finite, equal to n say, it is sufficient to take n basic bras, which may
be labelled by a single parameter λ taking on the values 1, 2, 3,..., n.
The representative of any ket |a⟩ now consists of the set of n numbers
⟨1|a⟩, ⟨2|a⟩, ⟨3|a⟩,..., ⟨n|a⟩, which are precisely the coordinates of
the vector |a⟩ referred to a system of coordinates in the usual way.
The idea of the representative of a ket vector is just a generalization
of the idea of the coordinates of an ordinary vector and reduces to
the latter when the number of dimensions of the space of the ket
vectors is finite.
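For a finite number of independent states the correspondence between representatives and ordinary coordinates is exact, as a brief Python/NumPy sketch shows (not in the original text; the choice of basic bras as rows of a rotation matrix is illustrative):

```python
import numpy as np

# Basic bras: the rows of an orthogonal matrix (here a plane rotation).
theta = 0.3
B = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])   # rows: <1|, <2|
ket_a = np.array([2.0, -1.0])                     # an arbitrary ket |a>

rep = B @ ket_a                     # representative: the numbers <1|a>, <2|a>
# The representative determines the ket completely:
assert np.allclose(B.T @ rep, ket_a)
```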
In a general representation there is no need for the basic bras to
be all independent. In most representations used in practice, however,
they are all independent, and also satisfy the more stringent
condition that any two of them are orthogonal. The representation
is then called an orthogonal representation.
Take an orthogonal representation with basic bras ⟨λ₁λ₂...λᵤ|,
labelled by parameters λ₁, λ₂,..., λᵤ whose domains are all real. Take
a ket |a⟩ and form its representative ⟨λ₁λ₂...λᵤ|a⟩. Now form the
numbers λ₁⟨λ₁λ₂...λᵤ|a⟩ and consider them as the representative of
a new ket |b⟩. This is permissible since the numbers forming the
representative of a ket are independent, on account of the basic bras
being independent. The ket |b⟩ is defined by the equation

⟨λ₁λ₂...λᵤ|b⟩ = λ₁⟨λ₁λ₂...λᵤ|a⟩.

The ket |b⟩ is evidently a linear function of the ket |a⟩, so it may
be considered as the result of a linear operator applied to |a⟩. Calling
this linear operator L₁, we have

|b⟩ = L₁|a⟩
and hence

⟨λ₁λ₂...λᵤ|L₁|a⟩ = λ₁⟨λ₁λ₂...λᵤ|a⟩.

This equation holds for any ket |a⟩, so we get

⟨λ₁λ₂...λᵤ|L₁ = λ₁⟨λ₁λ₂...λᵤ|.     (1)

Equation (1) may be looked upon as the definition of the linear
operator L₁. It shows that each basic bra is an eigenbra of L₁, the
value of the parameter λ₁ being the eigenvalue belonging to it.
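With discrete parameter values, equation (1) says simply that L₁ is diagonal in the representation, with the labels λ₁ as its eigenvalues. A minimal Python/NumPy sketch (not in the original text; the labels are illustrative):

```python
import numpy as np

lam1 = np.array([1., 2., 3., 4.])   # values of the parameter lambda_1
L1 = np.diag(lam1)                  # equation (1): <lam|L_1 = lambda_1 <lam|

# Acting on the representative of any ket |a>, L_1 multiplies each
# component by the label of the corresponding basic bra.
rep_a = np.array([1., 0., 2., -1.])
assert np.allclose(L1 @ rep_a, lam1 * rep_a)
assert np.allclose(L1, L1.conj().T)  # L_1 is real, as proved in the text
```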
From the condition that the basic bras are orthogonal we can
deduce that L₁ is real and is an observable. Let λ₁′, λ₂′,..., λᵤ′ and
λ₁″, λ₂″,..., λᵤ″ be two sets of values for the parameters λ₁, λ₂,..., λᵤ.
We have, putting λ′'s for the λ's in (1) and multiplying on the right
by |λ₁″λ₂″...λᵤ″⟩, the conjugate imaginary of the basic bra ⟨λ₁″λ₂″...λᵤ″|,

⟨λ₁′λ₂′...λᵤ′|L₁|λ₁″λ₂″...λᵤ″⟩ = λ₁′⟨λ₁′λ₂′...λᵤ′|λ₁″λ₂″...λᵤ″⟩.

Interchanging λ′'s and λ″'s,

⟨λ₁″λ₂″...λᵤ″|L₁|λ₁′λ₂′...λᵤ′⟩ = λ₁″⟨λ₁″λ₂″...λᵤ″|λ₁′λ₂′...λᵤ′⟩.

On account of the basic bras being orthogonal, the right-hand sides
here vanish unless λᵣ″ = λᵣ′ for all r from 1 to u, in which case the
right-hand sides are equal, and they are also real, λ₁′ being real. Thus,
whether the λ″'s are equal to the λ′'s or not, ⟨λ₁″λ₂″...λᵤ″|L₁|λ₁′λ₂′...λᵤ′⟩
equals the conjugate complex of ⟨λ₁′λ₂′...λᵤ′|L₁|λ₁″λ₂″...λᵤ″⟩, which is

⟨λ₁″λ₂″...λᵤ″|L̄₁|λ₁′λ₂′...λᵤ′⟩

from equation (4) of § 8. Since the ⟨λ₁′λ₂′...λᵤ′|'s form a complete set
of bras and the |λ₁″λ₂″...λᵤ″⟩'s form a complete set of kets, we can
infer that L₁ = L̄₁. The further condition required for L₁ to be an
observable, namely that its eigenstates shall form a complete set, is
obviously satisfied since it has as eigenbras the basic bras, which
form a complete set.
We can similarly introduce linear operators L₂, L₃,..., Lᵤ by multi-
plying ⟨λ₁λ₂...λᵤ|a⟩ by the factors λ₂, λ₃,..., λᵤ in turn and considering
the resulting sets of numbers as representatives of kets. Each of these
L's can be shown in the same way to have the basic bras as eigenbras
and to be real and an observable. The basic bras are simultaneous
eigenbras of all the L's. Since these simultaneous eigenbras form a
complete set, it follows from a theorem of § 13 that any two of the
L's commute.
It will now be shown that, if ξ₁, ξ₂,..., ξᵤ are any set of commuting
observables, we can set up an orthogonal representation in which the basic
bras are simultaneous eigenbras of ξ₁, ξ₂,..., ξᵤ. Let us suppose first that
there is only one independent simultaneous eigenbra of ξ₁, ξ₂,..., ξᵤ
belonging to any set of eigenvalues ξ₁′, ξ₂′,..., ξᵤ′. Then we may take
these simultaneous eigenbras, with arbitrary numerical coefficients, as
our basic bras. They are all orthogonal on account of the orthogonality
theorem (any two of them will have at least one eigenvalue different,
which is sufficient to make them orthogonal) and there are sufficient
of them to form a complete set, from a result of § 13. They may
conveniently be labelled by the eigenvalues ξ₁′, ξ₂′,..., ξᵤ′ to which they
belong, so that one of them is written ⟨ξ₁′ξ₂′...ξᵤ′|.
Passing now to the general case when there are several independent
simultaneous eigenbras of ξ₁, ξ₂,..., ξᵤ belonging to some sets of eigenvalues,
we must pick out from all the simultaneous eigenbras belonging
to a set of eigenvalues ξ₁′, ξ₂′,..., ξᵤ′ a complete subset, the members
of which are all orthogonal to one another. (The condition of completeness
here means that any simultaneous eigenbra belonging to the
eigenvalues ξ₁′, ξ₂′,..., ξᵤ′ can be expressed linearly in terms of the
members of the subset.) We must do this for each set of eigenvalues
ξ₁′, ξ₂′,..., ξᵤ′ and then put all the members of all the subsets together
and take them as the basic bras of the representation. These bras
are all orthogonal, two of them being orthogonal from the orthogonality
theorem if they belong to different sets of eigenvalues and from
the special way in which they were chosen if they belong to the same
set of eigenvalues, and they form altogether a complete set of bras,
as any bra can be expressed linearly in terms of simultaneous eigenbras
and each simultaneous eigenbra can then be expressed linearly
in terms of the members of a subset. There are infinitely many ways
of choosing the subsets, and each way provides one orthogonal
representation.
For labelling the basic bras in this general case, we may use the
eigenvalues ξ₁′, ξ₂′,..., ξᵤ′ to which they belong, together with certain
additional real variables λ₁, λ₂,..., λᵥ, say, which must be introduced to
distinguish basic vectors belonging to the same set of eigenvalues
from one another. A basic bra is then written ⟨ξ₁′ξ₂′...ξᵤ′λ₁λ₂...λᵥ|.
Corresponding to the variables λ₁, λ₂,..., λᵥ, we can define linear
operators L₁, L₂,..., Lᵥ by equations like (1) and can show that these
linear operators have the basic bras as eigenbras, and that they are
real and observables, and that they commute with one another and
with the ξ's. The basic bras are now simultaneous eigenbras of all
the commuting observables ξ₁, ξ₂,..., ξᵤ, L₁, L₂,..., Lᵥ.
Let us define a complete set of commuting observables to be a set of
observables which all commute with one another and for which there
is only one simultaneous eigenstate belonging to any set of eigen-
values. Then the observables ξ₁, ξ₂,..., ξᵤ, L₁, L₂,..., Lᵥ form a complete
set of commuting observables, there being only one independent simul-
taneous eigenbra belonging to the eigenvalues ξ₁′, ξ₂′,..., ξᵤ′, λ₁, λ₂,..., λᵥ,
namely the corresponding basic bra. Similarly the observables
L₁, L₂,..., Lᵤ defined by equation (1) and the following work form
a complete set of commuting observables. With the help of this
definition the main results of the present section can be concisely
formulated thus:
(i) The basic bras of an orthogonal representation are simul-
taneous eigenbras of a complete set of commuting observ-
ables.
(ii) Given a complete set of commuting observables, we can set
up an orthogonal representation in which the basic bras are
simultaneous eigenbras of this complete set.
(iii) Any set of commuting observables can be made into a com-
plete commuting set by adding certain observables to it.
(iv) A convenient way of labelling the basic bras of an orthogonal
representation is by means of the eigenvalues of the complete
set of commuting observables of which the basic bras are
simultaneous eigenbras.
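Result (iii) can be made concrete: a degenerate observable does not label its eigenbasis uniquely, but adjoining a commuting observable can complete the set. A Python/NumPy sketch (not in the original text; eigenvalues illustrative):

```python
import numpy as np

# xi alone is degenerate: two basis states share the eigenvalue 1.
xi = np.diag([1., 1., 2.])
# L commutes with xi (both are diagonal) and splits the degenerate pair.
L = np.diag([0., 1., 0.])
assert np.allclose(xi @ L, L @ xi)

# Each pair of eigenvalues (xi', L') now picks out exactly one basis
# state, so {xi, L} is a complete set of commuting observables.
pairs = list(zip(np.diag(xi), np.diag(L)))
assert len(set(pairs)) == len(pairs)
```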
The conjugate imaginaries of the basic bras of a representation we
call the basic kets of the representation. Thus, if the basic bras are
denoted by ⟨λ₁λ₂...λᵤ|, the basic kets will be denoted by |λ₁λ₂...λᵤ⟩.
The representative of a bra ⟨b| is given by its scalar product with
each of the basic kets, i.e. by ⟨b|λ₁λ₂...λᵤ⟩. It may, like the representative
of a ket, be looked upon either as a set of numbers or as a
function of the variables λ₁, λ₂,..., λᵤ. We have

⟨b|λ₁λ₂...λᵤ⟩ = conjugate complex of ⟨λ₁λ₂...λᵤ|b⟩,

showing that the representative of a bra is the conjugate complex of the
representative of the conjugate imaginary ket. In an orthogonal representation,
where the basic bras are simultaneous eigenbras of a complete
set of commuting observables, ξ₁, ξ₂,..., ξᵤ say, the basic kets
will be simultaneous eigenkets of ξ₁, ξ₂,..., ξᵤ.
We have not yet considered the lengths of the basic vectors. With
an orthogonal representation, the natural thing to do is to normalize
the basic vectors, rather than leave their lengths arbitrary, and so
introduce a further stage of simplification into the representation.
However, it is possible to normalize them only if the parameters
which label them all take on discrete values. If any of these parameters
are continuous variables that can take on all values in a range,
the basic vectors are eigenvectors of some observable belonging to
eigenvalues in a range and are of infinite length, from the discussion
in § 10 (see p. 39 and top of p. 40). Some other procedure is then
needed to fix the numerical factors by which the basic vectors may
be multiplied. To get a convenient method of handling this question
a new mathematical notation is required, which will be given in the
next section.
15. The δ function
Our work in § 10 led us to consider quantities involving a certain
kind of infinity. To get a precise notation for dealing with these
infinities, we introduce a quantity δ(x) depending on a parameter x
satisfying the conditions

∫_{−∞}^{∞} δ(x) dx = 1,
δ(x) = 0 for x ≠ 0.     (2)

To get a picture of δ(x), take a function of the real variable x which
vanishes everywhere except inside a small domain, of length ε say,
surrounding the origin x = 0, and which is so large inside this domain
that its integral over this domain is unity. The exact shape of the
function inside this domain does not matter, provided there are no
unnecessarily wild variations (for example provided the function
is always of order ε⁻¹). Then in the limit ε → 0 this function will go
over into δ(x).
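The picture just described is easy to realize numerically: a rectangular spike of width ε and height 1/ε has unit integral, and integrating it against a continuous function approaches the value of that function at the origin as ε shrinks. A Python/NumPy sketch (not in the original text; grid and test function illustrative):

```python
import numpy as np

def delta_eps(x, eps):
    # Unit-area spike of width eps about the origin; any shape of order
    # 1/eps inside the spike would serve equally well.
    return np.where(np.abs(x) < eps / 2, 1.0 / eps, 0.0)

x = np.linspace(-1.0, 1.0, 2_000_001)
dx = x[1] - x[0]
f = np.cos                                  # a continuous test function
approx = np.sum(f(x) * delta_eps(x, 0.01)) * dx
assert abs(approx - f(0.0)) < 1e-3          # tends to f(0) as eps -> 0
```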
δ(x) is not a function of x according to the usual mathematical
definition of a function, which requires a function to have a definite
value for each point in its domain, but is something more general,
which we may call an 'improper function' to show up its difference
from a function defined by the usual definition. Thus δ(x) is not a
quantity which can be generally used in mathematical analysis like
an ordinary function, but its use must be confined to certain simple
types of expression for which it is obvious that no inconsistency
can arise.
The most important property of δ(x) is exemplified by the follow-
ing equation,

∫_{−∞}^{∞} f(x)δ(x) dx = f(0),     (3)

where f(x) is any continuous function of x. We can easily see the
validity of this equation from the above picture of δ(x). The left-hand
side of (3) can depend only on the values of f(x) very close
to the origin, so that we may replace f(x) by its value at the origin,
f(0), without essential error. Equation (3) then follows from the
first of equations (2). By making a change of origin in (3), we can
deduce the formula

∫_{−∞}^{∞} f(x)δ(x−a) dx = f(a),     (4)

where a is any real number. Thus the process of multiplying a function
of x by δ(x−a) and integrating over all x is equivalent to the process of
substituting a for x. This general result holds also if the function of x is
not a numerical one, but is a vector or linear operator depending on x.
The range of integration in (3) and (4) need not be from −∞ to ∞,
but may be over any domain surrounding the critical point at which
the δ function does not vanish. In future the limits of integration
will usually be omitted in such equations, it being understood that
the domain of integration is a suitable one.
Equations (3) and (4) show that, although an improper function
does not itself have a well-defined value, when it occurs as a factor
in an integrand the integral has a well-defined value. In quantum
theory, whenever an improper function appears, it will be something
which is to be used ultimately in an integrand. Therefore it should be
possible to rewrite the theory in a form in which the improper functions
appear all through only in integrands. One could then eliminate
the improper functions altogether. The use of improper functions
thus does not involve any lack of rigour in the theory, but is merely
a convenient notation, enabling us to express in a concise form
certain relations which we could, if necessary, rewrite in a form not
involving improper functions, but only in a cumbersome way which
would tend to obscure the argument.
An alternative way of defining the δ function is as the differential
coefficient ε′(x) of the function ε(x) given by

ε(x) = 0   (x < 0)
     = 1   (x > 0).     (5)
We may verify that this is equivalent to the previous definition by
substituting ε′(x) for δ(x) in the left-hand side of (3) and integrating
by parts. We find, for g₁ and g₂ two positive numbers,

∫_{−g₁}^{g₂} f(x)ε′(x) dx = [f(x)ε(x)]_{−g₁}^{g₂} − ∫_{−g₁}^{g₂} f′(x)ε(x) dx
                        = f(g₂) − ∫_{0}^{g₂} f′(x) dx
                        = f(0),

in agreement with (3). The δ function appears whenever one differen-
tiates a discontinuous function.
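This alternative definition can also be checked numerically: a finite-difference derivative of the step function ε(x) of equation (5) behaves like δ(x) under an integral sign. A Python/NumPy sketch (not in the original text; grid and test function illustrative):

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 200_001)
dx = x[1] - x[0]
step = (x > 0).astype(float)        # eps(x) of equation (5)
d_step = np.gradient(step, dx)      # numerical eps'(x): a narrow spike at 0

f = np.exp                          # a continuous test function
val = np.sum(f(x) * d_step) * dx
assert abs(val - f(0.0)) < 1e-3     # reproduces equation (3)
```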
There are a number of elementary equations which one can write
down about δ functions. These equations are essentially rules of
manipulation for algebraic work involving δ functions. The meaning
of any of these equations is that its two sides give equivalent results
as factors in an integrand.
Examples of such equations are

δ(−x) = δ(x),     (6)
x δ(x) = 0,     (7)
δ(ax) = a⁻¹ δ(x)   (a > 0),     (8)
δ(x²−a²) = ½a⁻¹{δ(x−a) + δ(x+a)}   (a > 0),     (9)
∫ δ(a−x) dx δ(x−b) = δ(a−b),     (10)
f(x) δ(x−a) = f(a) δ(x−a).     (11)

Equation (6), which merely states that δ(x) is an even function of its
variable x, is trivial. To verify (7) take any continuous function of
x, f(x). Then

∫ f(x) x δ(x) dx = 0,

from (3). Thus x δ(x) as a factor in an integrand is equivalent to
zero, which is just the meaning of (7). (8) and (9) may be verified
by similar elementary arguments. To verify (10) take any continuous
function of a, f(a). Then

∫ f(a) da ∫ δ(a−x) dx δ(x−b) = ∫ δ(x−b) dx ∫ f(a) da δ(a−x)
                             = ∫ δ(x−b) dx f(x) = ∫ f(a) da δ(a−b).
Thus the two sides of (10) are equivalent as factors in an integrand
with a as variable of integration. It may be shown in the same way
that they are equivalent also as factors in an integrand with b as
variable of integration, so that equation (10) is justified from either
of these points of view. Equation (11) is also easily justified, with
the help of (4), from two points of view.
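Rules such as (8) and (11) can likewise be spot-checked numerically by using a narrow Gaussian in place of δ(x) and comparing both sides as factors in an integrand. A Python/NumPy sketch (not in the original text; the width, grid, and test function are illustrative):

```python
import numpy as np

s = 1e-3                            # Gaussian width standing in for delta
x = np.linspace(-1.0, 1.0, 400_001)
dx = x[1] - x[0]
d = lambda t: np.exp(-t**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
f = lambda t: np.cos(t) + t         # a smooth test function

# (8): delta(a x) = a^{-1} delta(x) for a > 0
a = 3.0
assert abs(np.sum(f(x) * d(a * x)) * dx - f(0.0) / a) < 1e-3

# (11): f(x) delta(x - b) = f(b) delta(x - b)
b = 0.25
lhs = np.sum(f(x) * d(x - b)) * dx
rhs = f(b) * np.sum(d(x - b)) * dx
assert abs(lhs - rhs) < 1e-3
```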
Equation (10) would be given by an application of (4) with
f(x) = δ(x−b). We have here an illustration of the fact that we may
often use an improper function as though it were an ordinary continuous
function, without getting a wrong result.
Equation (7) shows that, whenever one divides both sides of an
equation by a variable x which can take on the value zero, one
should add on to one side an arbitrary multiple of δ(x), i.e. from an
equation

A = B     (12)

one cannot infer A/x = B/x, but only

A/x = B/x + c δ(x),     (13)

where c is unknown.
As an illustration of work with the δ function, we may consider the
differentiation of log x. The usual formula

d/dx log x = 1/x     (14)

requires examination for the neighbourhood of x = 0. In order to
make the reciprocal function 1/x well defined in the neighbourhood
of x = 0 (in the sense of an improper function) we must impose on
it an extra condition, such as that its integral from −ε to ε vanishes.
With this extra condition, the integral of the right-hand side of (14)
from −ε to ε vanishes, while that of the left-hand side of (14) equals
log(−1), so that (14) is not a correct equation. To correct it, we must
remember that, taking principal values, log x has a pure imaginary
term iπ for negative values of x. As x passes through the value zero
this pure imaginary term vanishes discontinuously. The differentiation
of this pure imaginary term gives us the result −iπδ(x), so
that (14) should read

d/dx log x = 1/x − iπδ(x).     (15)

The particular combination of reciprocal function and δ function
appearing in (15) plays an important part in the quantum theory of
collision processes (see § 50).
16. Properties of the basic vectors
Using the notation of the δ function, we can proceed with the theory
of representations. Let us suppose first that we have a single observable
ξ forming by itself a complete commuting set, the condition for
this being that there is only one eigenstate of ξ belonging to any
eigenvalue ξ′, and let us set up an orthogonal representation in which
the basic vectors are eigenvectors of ξ and are written ⟨ξ′|, |ξ′⟩.
In the case when the eigenvalues of ξ are discrete, we can normalize
the basic vectors, and we then have

⟨ξ′|ξ″⟩ = 0   (ξ′ ≠ ξ″),
⟨ξ′|ξ′⟩ = 1.

These equations can be combined into the single equation

⟨ξ′|ξ″⟩ = δ_{ξ′ξ″},     (16)

where the symbol δ with two suffixes, which we shall often use in the
future, has the meaning

δ_{rs} = 0   when r ≠ s
      = 1   when r = s.     (17)
In the case when the eigenvalues of ξ are continuous we cannot
normalize the basic vectors. If we now consider the quantity ⟨ξ′|ξ″⟩
with ξ′ fixed and ξ″ varying, we see from the work connected with
expression (29) of § 10 that this quantity vanishes for ξ″ ≠ ξ′ and
that its integral over a range of ξ″ extending through the value ξ′
is finite, equal to c say. Thus

⟨ξ′|ξ″⟩ = c δ(ξ′−ξ″).

From (30) of § 10, c is a positive number. It may vary with ξ′, so
we should write it c(ξ′) or c′ for brevity, and thus we have

⟨ξ′|ξ″⟩ = c′ δ(ξ′−ξ″).     (18)

Alternatively, we have

⟨ξ′|ξ″⟩ = c″ δ(ξ′−ξ″),     (19)

where c″ is short for c(ξ″), the right-hand sides of (18) and (19) being
equal on account of (11).
Let us pass to another representation whose basic vectors are
eigenvectors of ξ, the new basic vectors being numerical multiples of
the previous ones. Calling the new basic vectors ⟨ξ′*|, |ξ′*⟩, with the
additional label * to distinguish them from the previous ones, we have

⟨ξ′*| = k′⟨ξ′|,   |ξ′*⟩ = k̄′|ξ′⟩,
where k′ is short for k(ξ′) and is a number depending on ξ′. We get

⟨ξ′*|ξ″*⟩ = k′k̄″⟨ξ′|ξ″⟩ = k′k̄″c′ δ(ξ′−ξ″)

with the help of (18). This may be written

⟨ξ′*|ξ″*⟩ = k′k̄′c′ δ(ξ′−ξ″)

from (11). By choosing k′ so that its modulus is c′^(−½), which is possible
since c′ is positive, we arrange to have

⟨ξ′*|ξ″*⟩ = δ(ξ′−ξ″).     (20)

The lengths of the new basic vectors are now fixed so as to make the
representation as simple as possible. The way these lengths were
fixed is in some respects analogous to the normalizing of the basic
vectors in the case of discrete ξ′, equation (20) being of the form of
(16) with the δ function δ(ξ′−ξ″) replacing the δ symbol δ_{ξ′ξ″} of
equation (16). We shall continue to work with the new representation
and shall drop the * labels in it to save writing. Thus (20) will now
be written

⟨ξ′|ξ″⟩ = δ(ξ′−ξ″).     (21)
We can develop the theory on closely parallel lines for the discrete and continuous cases. For the discrete case we have, using (16),
    Σ_{ξ'} |ξ'⟩⟨ξ'|ξ''⟩ = Σ_{ξ'} |ξ'⟩ δ_{ξ'ξ''} = |ξ''⟩,
the sum being taken over all eigenvalues. This equation holds for any basic ket |ξ''⟩ and hence, since the basic kets form a complete set,
    Σ_{ξ'} |ξ'⟩⟨ξ'| = 1.   (22)
This is a useful equation expressing an important property of the basic vectors, namely, if |ξ'⟩ is multiplied on the right by ⟨ξ'| the resulting linear operator, summed for all ξ', equals the unit operator. Equations (16) and (22) give the fundamental properties of the basic vectors for the discrete case.
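These two fundamental properties are easy to check numerically in a finite-dimensional analogue. The sketch below is an editorial illustration, not part of Dirac's text; the Hermitian matrix standing in for the observable ξ is a hypothetical example, and the columns returned by `numpy.linalg.eigh` play the role of the basic kets |ξ'⟩.

```python
import numpy as np

# A Hermitian matrix standing in for the observable xi (hypothetical example).
xi = np.array([[2.0, 1.0j], [-1.0j, 3.0]])

# Its orthonormal eigenvectors play the role of the basic kets |xi'>.
eigvals, kets = np.linalg.eigh(xi)   # columns of `kets` are the |xi'>

# Orthonormality: <xi'|xi''> = delta_{xi' xi''}, as in equation (16).
gram = kets.conj().T @ kets
assert np.allclose(gram, np.eye(2))

# Completeness: the sum over xi' of |xi'><xi'| equals the unit operator,
# as in equation (22).
resolution = sum(np.outer(kets[:, k], kets[:, k].conj()) for k in range(2))
assert np.allclose(resolution, np.eye(2))
```
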
Similarly, for the continuous case we have, using (21),
    ∫ |ξ'⟩ dξ' ⟨ξ'|ξ''⟩ = ∫ |ξ'⟩ dξ' δ(ξ'−ξ'') = |ξ''⟩   (23)
from (4) applied with a ket vector for f(x), the range of integration being the range of eigenvalues. This holds for any basic ket |ξ''⟩ and hence
    ∫ |ξ'⟩ dξ' ⟨ξ'| = 1.   (24)
This is of the same form as (22) with an integral replacing the sum.
Equations (21) and (24) give the fundamental properties of the basic
vectors for the continuous case.
Equations (22) and (24) enable one to expand any bra or ket in terms of the basic vectors. For example, we get for the ket |P⟩ in the discrete case, by multiplying (22) on the right by |P⟩,
    |P⟩ = Σ_{ξ'} |ξ'⟩⟨ξ'|P⟩,   (25)
which gives |P⟩ expanded in terms of the |ξ'⟩'s and shows that the coefficients in the expansion are ⟨ξ'|P⟩, which are just the numbers forming the representative of |P⟩. Similarly, in the continuous case,
    |P⟩ = ∫ |ξ'⟩ dξ' ⟨ξ'|P⟩,   (26)
giving |P⟩ as an integral over the |ξ'⟩'s, with the coefficient in the integrand again just the representative ⟨ξ'|P⟩ of |P⟩. The conjugate imaginary equations to (25) and (26) would give the bra vector ⟨P| expanded in terms of the basic bras.
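Equation (25) has an equally direct finite-dimensional illustration (again an added sketch, with a hypothetical randomly generated basis): the representative ⟨ξ'|P⟩ is obtained by scalar products with the basic kets, and summing |ξ'⟩⟨ξ'|P⟩ reassembles |P⟩.

```python
import numpy as np

rng = np.random.default_rng(0)

# An orthonormal basis of C^3 via QR decomposition; columns are the |xi'>.
kets, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))

P = rng.normal(size=3) + 1j * rng.normal(size=3)   # an arbitrary ket |P>

# Representative of |P>: the numbers <xi'|P>.
rep = kets.conj().T @ P

# Equation (25): |P> = sum over xi' of |xi'> <xi'|P>.
P_expanded = kets @ rep
assert np.allclose(P_expanded, P)
```
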
Our present mathematical methods enable us in the continuous case to expand any ket as an integral of eigenkets of ξ. If we do not use the δ function notation, the expansion of a general ket will consist of an integral plus a sum, as in equation (25) of §10, but the δ function enables us to replace the sum by an integral in which the integrand consists of terms each containing a δ function as a factor. For example, the eigenket |ξ''⟩ may be replaced by an integral of eigenkets, as is shown by the second of equations (23).
If ⟨Q| is any bra and |P⟩ any ket we get, by further applications of (22) and (24),
    ⟨Q|P⟩ = Σ_{ξ'} ⟨Q|ξ'⟩⟨ξ'|P⟩   (27)
for discrete ξ' and
    ⟨Q|P⟩ = ∫ ⟨Q|ξ'⟩ dξ' ⟨ξ'|P⟩   (28)
for continuous ξ'. These equations express the scalar product of ⟨Q| and |P⟩ in terms of their representatives ⟨Q|ξ'⟩ and ⟨ξ'|P⟩. Equation (27) is just the usual formula for the scalar product of two vectors in terms of the coordinates of the vectors, and (28) is the natural modification of this formula for the case of continuous ξ', with an integral instead of a sum.
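The scalar-product formula (27) can be checked the same way (an added sketch; the basis and vectors are hypothetical random data).

```python
import numpy as np

rng = np.random.default_rng(1)
kets, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))

P = rng.normal(size=4) + 1j * rng.normal(size=4)
Q = rng.normal(size=4) + 1j * rng.normal(size=4)

# Direct scalar product <Q|P>.
direct = np.vdot(Q, P)

# Equation (27): <Q|P> = sum over xi' of <Q|xi'> <xi'|P>,
# where <Q|xi'> is the conjugate of <xi'|Q>.
rep_P = kets.conj().T @ P          # the numbers <xi'|P>
rep_Q = kets.conj().T @ Q          # the numbers <xi'|Q>
via_rep = np.sum(rep_Q.conj() * rep_P)

assert np.isclose(direct, via_rep)
```
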
The generalization of the foregoing work to the case when ξ has both discrete and continuous eigenvalues is quite straightforward. Using ξ^r and ξ^s to denote discrete eigenvalues and ξ' and ξ'' to denote continuous eigenvalues, we have the set of equations
    ⟨ξ^r|ξ^s⟩ = δ_{ξ^r ξ^s},   ⟨ξ^r|ξ''⟩ = 0,   ⟨ξ'|ξ''⟩ = δ(ξ'−ξ'')   (29)
as the generalization of (16) or (21). These equations express that the basic vectors are all orthogonal, that those belonging to discrete eigenvalues are normalized and those belonging to continuous eigenvalues have their lengths fixed by the same rule as led to (20). From (29) we can derive, as the generalization of (22) or (24),
    Σ_{ξ^r} |ξ^r⟩⟨ξ^r| + ∫ |ξ'⟩ dξ' ⟨ξ'| = 1,   (30)
the range of integration being the range of continuous eigenvalues.
With the help of (30), we get immediately
    |P⟩ = Σ_{ξ^r} |ξ^r⟩⟨ξ^r|P⟩ + ∫ |ξ'⟩ dξ' ⟨ξ'|P⟩   (31)
as the generalization of (25) or (26), and
    ⟨Q|P⟩ = Σ_{ξ^r} ⟨Q|ξ^r⟩⟨ξ^r|P⟩ + ∫ ⟨Q|ξ'⟩ dξ' ⟨ξ'|P⟩   (32)
as the generalization of (27) or (28).
Let us now pass to the general case when we have several commuting observables ξ_1, ξ_2,..., ξ_u forming a complete commuting set and set up an orthogonal representation in which the basic vectors are simultaneous eigenvectors of all of them, and are written ⟨ξ_1'...ξ_u'|, |ξ_1'...ξ_u'⟩. Let us suppose ξ_1, ξ_2,..., ξ_v (v < u) have discrete eigenvalues and ξ_{v+1},..., ξ_u have continuous eigenvalues.
Consider the quantity ⟨ξ_1'..ξ_v' ξ_{v+1}'..ξ_u'|ξ_1'..ξ_v' ξ_{v+1}''..ξ_u''⟩. From the orthogonality theorem, it must vanish unless each ξ_s'' = ξ_s' for s = v+1,..., u. By extending the work connected with expression (29) of §10 to simultaneous eigenvectors of several commuting observables and extending also the axiom (30), we find that the (u−v)-fold integral of this quantity with respect to each ξ_s'' over a range extending through the value ξ_s' is a finite positive number. Calling this number c', the ' denoting that it is a function of ξ_1',..., ξ_v', ξ_{v+1}',..., ξ_u', we can express our results by the equation
    ⟨ξ_1'..ξ_v' ξ_{v+1}'..ξ_u'|ξ_1'..ξ_v' ξ_{v+1}''..ξ_u''⟩ = c' δ(ξ_{v+1}'−ξ_{v+1}'')..δ(ξ_u'−ξ_u''),   (33)
with one δ factor on the right-hand side for each value of s from v+1 to u. We now change the lengths of our basic vectors so as to make c' unity, by a procedure similar to that which led to (20). By a further use of the orthogonality theorem, we get finally
    ⟨ξ_1'...ξ_u'|ξ_1''...ξ_u''⟩ = δ_{ξ_1'ξ_1''}..δ_{ξ_v'ξ_v''} δ(ξ_{v+1}'−ξ_{v+1}'')..δ(ξ_u'−ξ_u''),   (34)
with a two-suffix δ symbol on the right-hand side for each ξ with discrete eigenvalues and a δ function for each ξ with continuous eigenvalues. This is the generalization of (16) or (21) to the case when there are several commuting observables in the complete set.
From (34) we can derive, as the generalization of (22) or (24),
    Σ_{ξ_1'..ξ_v'} ∫..∫ |ξ_1'..ξ_u'⟩ dξ_{v+1}'..dξ_u' ⟨ξ_1'..ξ_u'| = 1,   (35)
the integral being a (u−v)-fold one over all the ξ's with continuous eigenvalues and the summation being over all the ξ's with discrete eigenvalues. Equations (34) and (35) give the fundamental properties of the basic vectors in the present case. From (35) we can immediately write down the generalization of (25) or (26) and of (27) or (28).
The case we have just considered can be further generalized by allowing some of the ξ's to have both discrete and continuous eigenvalues. The modifications required in the equations are quite straightforward, but will not be given here as they are rather cumbersome to write down in general form.
There are some problems in which it is convenient not to make the c' of equation (33) equal unity, but to make it equal to some definite function of the ξ''s instead. Calling this function of the ξ''s ρ'^{-1} we then have, instead of (34),
    ⟨ξ_1'...ξ_u'|ξ_1''...ξ_u''⟩ = ρ'^{-1} δ_{ξ_1'ξ_1''}..δ_{ξ_v'ξ_v''} δ(ξ_{v+1}'−ξ_{v+1}'')..δ(ξ_u'−ξ_u''),   (36)
and instead of (35) we get
    Σ_{ξ_1'..ξ_v'} ∫..∫ |ξ_1'..ξ_u'⟩ ρ' dξ_{v+1}'..dξ_u' ⟨ξ_1'..ξ_u'| = 1.   (37)
ρ' is called the weight function of the representation, ρ' dξ_{v+1}'..dξ_u' being the 'weight' attached to a small volume element of the space of the variables ξ_{v+1}',..., ξ_u'.
The representations we considered previously all had the weight function unity. The introduction of a weight function not unity is entirely a matter of convenience and does not add anything to the mathematical power of the representation. The basic bras ⟨ξ_1'...ξ_u'*| of a representation with the weight function ρ' are connected with the basic bras ⟨ξ_1'...ξ_u'| of the corresponding representation with the weight function unity by
    ⟨ξ_1'...ξ_u'*| = ρ'^{-1/2} ⟨ξ_1'...ξ_u'|,   (38)
as is easily verified. An example of a useful representation with non-unit weight function occurs when one has two ξ's which are the polar and azimuthal angles θ and φ giving a direction in three-dimensional space and one takes ρ' = sin θ'. One then has the element of solid angle sin θ' dθ'dφ' occurring in (37).
17. The representation of linear operators
In §14 we saw how to represent ket and bra vectors by sets of numbers. We now have to do the same for linear operators, in order to have a complete scheme for representing all our abstract quantities by sets of numbers. The same basic vectors that we had in §14 can be used again for this purpose.
Let us suppose the basic vectors are simultaneous eigenvectors of a complete set of commuting observables ξ_1, ξ_2,..., ξ_u. If α is any linear operator, we take a general basic bra ⟨ξ_1'...ξ_u'| and a general basic ket |ξ_1''...ξ_u''⟩ and form the numbers
    ⟨ξ_1'...ξ_u'|α|ξ_1''...ξ_u''⟩.   (39)
These numbers are sufficient to determine α completely, since in the first place they determine the ket α|ξ_1''...ξ_u''⟩ (as they provide the representative of this ket), and the value of this ket for all the basic kets |ξ_1''...ξ_u''⟩ determines α. The numbers (39) are called the representative of the linear operator α or of the dynamical variable α. They are more complicated than the representative of a ket or bra vector in that they involve the parameters that label two basic vectors instead of one.
Let us examine the form of these numbers in simple cases. Take first the case when there is only one ξ, forming a complete commuting set by itself, and suppose that it has discrete eigenvalues ξ'. The representative of α is then the discrete set of numbers ⟨ξ'|α|ξ''⟩. If one had to write out these numbers explicitly, the natural way of arranging them would be as a two-dimensional array, thus:
    ⟨ξ^1|α|ξ^1⟩  ⟨ξ^1|α|ξ^2⟩  ⟨ξ^1|α|ξ^3⟩  . .
    ⟨ξ^2|α|ξ^1⟩  ⟨ξ^2|α|ξ^2⟩  ⟨ξ^2|α|ξ^3⟩  . .   (40)
    ⟨ξ^3|α|ξ^1⟩  ⟨ξ^3|α|ξ^2⟩  ⟨ξ^3|α|ξ^3⟩  . .
    .  .  .  .  .  .  .  .  .  .  .
    .  .  .  .  .  .  .  .  .  .  .
where ξ^1, ξ^2, ξ^3,... are all the eigenvalues of ξ. Such an array is called a matrix and the numbers are called the elements of the matrix. We make the convention that the elements must always be arranged so that those in the same row refer to the same basic bra vector and those in the same column refer to the same basic ket vector.
An element ⟨ξ'|α|ξ'⟩ referring to two basic vectors with the same label is called a diagonal element of the matrix, as all such elements lie on a diagonal. If we put α equal to unity, we have from (16) all the diagonal elements equal to unity and all the other elements equal to zero. The matrix is then called the unit matrix.
If α is real, we have
    ⟨ξ'|α|ξ''⟩ = ⟨ξ''|α|ξ'⟩*.   (41)
The effect of these conditions on the matrix (40) is to make the diagonal elements all real and each of the other elements equal the conjugate complex of its mirror reflection in the diagonal. The matrix is then called a Hermitian matrix.
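In matrix terms, condition (41) says the array equals its own conjugate transpose. A minimal added sketch (not from the text), building a hypothetical real, that is self-adjoint, operator as A + A†:

```python
import numpy as np

rng = np.random.default_rng(2)

# A real (self-adjoint) linear operator: alpha = A + A(dagger), A arbitrary.
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
alpha = A + A.conj().T

# Equation (41): <xi'|alpha|xi''> equals the conjugate of <xi''|alpha|xi'>.
assert np.allclose(alpha, alpha.conj().T)

# The diagonal elements are all real, as stated for Hermitian matrices.
assert np.allclose(np.diag(alpha).imag, 0.0)
```
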
If we put α equal to ξ, we get for a general element of the matrix
    ⟨ξ'|ξ|ξ''⟩ = ξ' ⟨ξ'|ξ''⟩ = ξ' δ_{ξ'ξ''}.   (42)
Thus all the elements not on the diagonal are zero. The matrix is then called a diagonal matrix. Its diagonal elements are just equal to the eigenvalues of ξ. More generally, if we put α equal to f(ξ), a function of ξ, we get
    ⟨ξ'|f(ξ)|ξ''⟩ = f(ξ') δ_{ξ'ξ''}   (43)
and the matrix is again a diagonal matrix.
Let us determine the representative of a product αβ of two linear operators α and β in terms of the representatives of the factors. From equation (22) with ξ'' substituted for ξ' we obtain
    ⟨ξ'|αβ|ξ'''⟩ = ⟨ξ'|α { Σ_{ξ''} |ξ''⟩⟨ξ''| } β|ξ'''⟩
                 = Σ_{ξ''} ⟨ξ'|α|ξ''⟩⟨ξ''|β|ξ'''⟩,   (44)
which gives us the required result. Equation (44) shows that the matrix formed by the elements ⟨ξ'|αβ|ξ'''⟩ equals the product of the matrices formed by the elements ⟨ξ'|α|ξ''⟩ and ⟨ξ''|β|ξ'''⟩ respectively, according to the usual mathematical rule for multiplying matrices. This rule gives for the element in the rth row and sth column of the product matrix the sum of the products of each element in the rth row of the first factor matrix with the corresponding element in the sth column of the second factor matrix. The multiplication of matrices is non-commutative, like the multiplication of linear operators.
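Equation (44) and the non-commutativity remark can both be verified numerically. An added sketch with hypothetical random operators and a random orthonormal basis:

```python
import numpy as np

rng = np.random.default_rng(3)
kets, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # orthonormal basic kets

alpha = rng.normal(size=(3, 3))
beta = rng.normal(size=(3, 3))

def representative(op):
    """Matrix of elements <xi'|op|xi''> in the chosen representation."""
    return kets.conj().T @ op @ kets

# Equation (44): the representative of alpha*beta is the matrix product
# of the representatives of alpha and beta.
assert np.allclose(representative(alpha @ beta),
                   representative(alpha) @ representative(beta))

# Matrix multiplication is non-commutative, like operator multiplication.
assert not np.allclose(representative(alpha) @ representative(beta),
                       representative(beta) @ representative(alpha))
```
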
We can summarize our results for the case when there is only one ξ and it has discrete eigenvalues as follows:
(i) Any linear operator is represented by a matrix.
(ii) The unit operator is represented by the unit matrix.
(iii) A real linear operator is represented by a Hermitian matrix.
(iv) ξ and functions of ξ are represented by diagonal matrices.
(v) The matrix representing the product of two linear operators is the product of the matrices representing the two factors.
Let us now consider the case when there is only one ξ and it has continuous eigenvalues. The representative of α is now ⟨ξ'|α|ξ''⟩, a function of two variables ξ' and ξ'' which can vary continuously. It is convenient to call such a function a 'matrix', using this word in a generalized sense, in order that we may be able to use the same terminology for the discrete and continuous cases. One of these generalized matrices cannot, of course, be written out as a two-dimensional array like an ordinary matrix, since the number of its rows and columns is an infinity equal to the number of points on a line, and the number of its elements is an infinity equal to the number of points in an area.
We arrange our definitions concerning these generalized matrices so that the rules (i)-(v) which we had above for the discrete case hold also for the continuous case. The unit operator is represented by δ(ξ'−ξ'') and the generalized matrix formed by these elements we define to be the unit matrix. We still have equation (41) as the condition for α to be real and we define the generalized matrix formed by the elements ⟨ξ'|α|ξ''⟩ to be Hermitian when it satisfies this condition. ξ is represented by
    ⟨ξ'|ξ|ξ''⟩ = ξ' δ(ξ'−ξ'')   (45)
and f(ξ) by
    ⟨ξ'|f(ξ)|ξ''⟩ = f(ξ') δ(ξ'−ξ''),   (46)
and the generalized matrices formed by these elements we define to be diagonal matrices. From (11), we could equally well have ξ'' and f(ξ'') as the coefficients of δ(ξ'−ξ'') on the right-hand sides of (45) and (46) respectively. Corresponding to equation (44) we now have, from (24),
    ⟨ξ'|αβ|ξ''⟩ = ∫ ⟨ξ'|α|ξ'''⟩ dξ''' ⟨ξ'''|β|ξ''⟩,   (47)
with an integral instead of a sum, and we define the generalized matrix formed by the elements on the right-hand side here to be the product of the matrices formed by ⟨ξ'|α|ξ''⟩ and ⟨ξ'|β|ξ''⟩. With these definitions we secure complete parallelism between the discrete and continuous cases and we have the rules (i)-(v) holding for both.
The question arises how a general diagonal matrix is to be defined in the continuous case, as so far we have only defined the right-hand sides of (45) and (46) to be examples of diagonal matrices. One might be inclined to define as diagonal any matrix whose (ξ', ξ'') elements all vanish except when ξ' differs infinitely little from ξ'', but this would not be satisfactory, because an important property of diagonal matrices in the discrete case is that they always commute with one another and we want this property to hold also in the continuous case. In order that the matrix formed by the elements ⟨ξ'|ω|ξ''⟩ in the continuous case may commute with that formed by the elements on the right-hand side of (45) we must have, using the multiplication rule (47),
    ∫ ⟨ξ'|ω|ξ'''⟩ dξ''' ξ''' δ(ξ'''−ξ'') = ∫ ξ' δ(ξ'−ξ''') dξ''' ⟨ξ'''|ω|ξ''⟩.
With the help of formula (4), this reduces to
    ⟨ξ'|ω|ξ''⟩ ξ'' = ξ' ⟨ξ'|ω|ξ''⟩   (48)
or
    (ξ'−ξ'') ⟨ξ'|ω|ξ''⟩ = 0.
This gives, according to the rule by which (13) follows from (12),
    ⟨ξ'|ω|ξ''⟩ = c' δ(ξ'−ξ''),
where c' is a number that may depend on ξ'. Thus ⟨ξ'|ω|ξ''⟩ is of the form of the right-hand side of (46). For this reason we define only matrices whose elements are of the form of the right-hand side of (46) to be diagonal matrices. It is easily verified that these matrices all commute with one another. One can form other matrices whose (ξ', ξ'') elements all vanish when ξ' differs appreciably from ξ'' and have a different form of singularity when ξ' equals ξ'' [we shall later introduce the derivative δ'(x) of the δ function and δ'(ξ'−ξ'') will then be an example, see §22 equation (19)], but these other matrices are not diagonal according to the definition.
Let us now pass on to the case when there is only one ξ and it has both discrete and continuous eigenvalues. Using ξ^r, ξ^s to denote discrete eigenvalues and ξ', ξ'' to denote continuous eigenvalues, we now have the representative of α consisting of four kinds of quantities, ⟨ξ^r|α|ξ^s⟩, ⟨ξ^r|α|ξ'⟩, ⟨ξ'|α|ξ^s⟩, ⟨ξ'|α|ξ''⟩. These quantities can all be put together and considered to form a more general kind of matrix having some discrete rows and columns and also a continuous range of rows and columns. We define unit matrix, Hermitian matrix, diagonal matrix, and the product of two matrices also for this more general kind of matrix so as to make the rules (i)-(v) still hold. The details are a straightforward generalization of what has gone before and need not be given explicitly.
Let us now go back to the general case of several ξ's, ξ_1, ξ_2,..., ξ_u. The representative of α, expression (39), may still be looked upon as forming a matrix, with rows corresponding to different values of ξ_1',..., ξ_u' and columns corresponding to different values of ξ_1'',..., ξ_u''. Unless all the ξ's have discrete eigenvalues, this matrix will be of the generalized kind with continuous ranges of rows and columns. We again arrange our definitions so that the rules (i)-(v) hold, with rule (iv) generalized to:
(iv') Each ξ_m (m = 1, 2,..., u) and any function of them is represented by a diagonal matrix.
A diagonal matrix is now defined as one whose general element ⟨ξ_1'..ξ_u'|ω|ξ_1''..ξ_u''⟩ is of the form
    c' δ_{ξ_1'ξ_1''}..δ_{ξ_v'ξ_v''} δ(ξ_{v+1}'−ξ_{v+1}'')..δ(ξ_u'−ξ_u'')   (49)
in the case when ξ_1,..., ξ_v have discrete eigenvalues and ξ_{v+1},..., ξ_u have continuous eigenvalues, c' being any function of the ξ''s. This definition is the generalization of what we had with one ξ and makes diagonal matrices always commute with one another. The other definitions are straightforward and need not be given explicitly.
We now have a linear operator always represented by a matrix. The sum of two linear operators is represented by the sum of the matrices representing the operators and this, together with rule (v), means that the matrices are subject to the same algebraic relations as the linear operators. If any algebraic equation holds between certain linear operators, the same equation must hold between the matrices representing those operators.
The scheme of matrices can be extended to bring in the representatives of ket and bra vectors. The matrices representing linear operators are all square matrices with the same number of rows and columns, and with, in fact, a one-one correspondence between their rows and columns. We may look upon the representative of a ket |P⟩ as a matrix with a single column by setting all the numbers ⟨ξ_1'...ξ_u'|P⟩ which form this representative one below the other. The number of rows in this matrix will be the same as the number of rows or columns in the square matrices representing linear operators. Such a single-column matrix can be multiplied on the left by a square matrix ⟨ξ_1'...ξ_u'|α|ξ_1''...ξ_u''⟩ representing a linear operator, by a rule similar to that for the multiplication of two square matrices. The product is another single-column matrix with elements given by
    Σ_{ξ_1''..ξ_v''} ∫..∫ ⟨ξ_1'...ξ_u'|α|ξ_1''...ξ_u''⟩ dξ_{v+1}''..dξ_u'' ⟨ξ_1''...ξ_u''|P⟩.
From (35) this is just equal to ⟨ξ_1'...ξ_u'|α|P⟩, the representative of α|P⟩. Similarly we may look upon the representative of a bra ⟨Q| as a matrix with a single row by setting all the numbers ⟨Q|ξ_1'...ξ_u'⟩ side by side. Such a single-row matrix may be multiplied on the right by a square matrix ⟨ξ_1'...ξ_u'|α|ξ_1''...ξ_u''⟩, the product being another single-row matrix, which is just the representative of ⟨Q|α. The single-row matrix representing ⟨Q| may be multiplied on the right by the single-column matrix representing |P⟩, the product being a matrix with just a single element, which is equal to ⟨Q|P⟩. Finally, the single-row matrix representing ⟨Q| may be multiplied on the left by the single-column matrix representing |P⟩, the product being a square matrix, which is just the representative of |P⟩⟨Q|. In this way all our abstract symbols, linear operators, bra vectors, and ket vectors, can be represented by matrices, which are subject to the same algebraic relations as the abstract symbols themselves.
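The bookkeeping described here maps directly onto column vectors, row vectors, and outer products. An added sketch (all names and values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
alpha = rng.normal(size=(n, n))     # square matrix representing an operator
P = rng.normal(size=(n, 1))         # |P> as a single-column matrix
Q_bra = rng.normal(size=(1, n))     # <Q| as a single-row matrix

col = alpha @ P        # single-column matrix: representative of alpha|P>
row = Q_bra @ alpha    # single-row matrix: representative of <Q|alpha
scalar = Q_bra @ P     # matrix with a single element, equal to <Q|P>
outer = P @ Q_bra      # square matrix: representative of |P><Q|

assert scalar.shape == (1, 1)
assert outer.shape == (n, n)
```
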
18. Probability amplitudes
Representations are of great importance in the physical interpretation of quantum mechanics as they provide a convenient method for obtaining the probabilities of observables having given values. In §12 we obtained the probability of an observable having any specified value for a given state and in §13 we generalized this result and obtained the probability of a set of commuting observables simultaneously having specified values for a given state. Let us now apply this result to a complete set of commuting observables, say the set of ξ's which we have been dealing with already. According to formula (51) of §13, the probability of each ξ_r having the value ξ_r' for the state corresponding to the normalized ket vector |x⟩ is
    P_{ξ'} = ⟨x| δ_{ξ_1 ξ_1'} δ_{ξ_2 ξ_2'} .. δ_{ξ_u ξ_u'} |x⟩.   (50)
If the ξ's all have discrete eigenvalues, we can use (35) with v = u, and no integrals, and get
    P_{ξ'} = |⟨ξ_1'...ξ_u'|x⟩|².   (51)
We thus get the simple result that the probability of the ξ's having the values ξ' is just the square of the modulus of the appropriate coordinate of the normalized ket vector corresponding to the state concerned.
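In a small discrete sketch (added here; the amplitude values are hypothetical), equation (51) reads:

```python
import numpy as np

# Representative of a normalized ket |x> in a discrete representation:
# the probability amplitudes <xi'|x> (hypothetical values).
amplitudes = np.array([0.6, 0.8j])
assert np.isclose(np.vdot(amplitudes, amplitudes).real, 1.0)   # |x> normalized

# Equation (51): the probability of the xi's having the values xi' is the
# squared modulus of the corresponding amplitude.
probabilities = np.abs(amplitudes) ** 2
assert np.allclose(probabilities, [0.36, 0.64])
assert np.isclose(probabilities.sum(), 1.0)
```
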
If the ξ's do not all have discrete eigenvalues, but if, say, ξ_1,..., ξ_v have discrete eigenvalues and ξ_{v+1},..., ξ_u have continuous eigenvalues, then to get something physically significant we must obtain the probability of each ξ_r (r = 1,..., v) having a specified value ξ_r' and each ξ_s (s = v+1,..., u) lying in a specified small range ξ_s' to ξ_s'+dξ_s'. For this purpose we must replace each factor δ_{ξ_s ξ_s'} in (50) by a factor χ_s, which is that function of the observable ξ_s which is equal to unity for ξ_s within the range ξ_s' to ξ_s'+dξ_s' and zero otherwise. Proceeding as before with the help of (35), we obtain for this probability
    P = |⟨ξ_1'...ξ_u'|x⟩|² dξ_{v+1}'..dξ_u'.   (52)
Thus in every case the probability distribution of values for the ξ's is given by the square of the modulus of the representative of the normalized ket vector corresponding to the state concerned.
The numbers which form the representative of a normalized ket (or bra) may for this reason be called probability amplitudes. The square of the modulus of a probability amplitude is an ordinary probability, or a probability per unit range for those variables that have continuous ranges of values.
We may be interested in a state whose corresponding ket |x⟩ cannot be normalized. This occurs, for example, if the state is an eigenstate of some observable belonging to an eigenvalue lying in a range of eigenvalues. The formula (51) or (52) can then still be used to give the relative probability of the ξ's having specified values or having values lying in specified small ranges, i.e. it will give correctly the ratios of the probabilities for different ξ''s. The numbers ⟨ξ_1'...ξ_u'|x⟩ may then be called relative probability amplitudes.
The representation for which the above results hold is characterized by the basic vectors being simultaneous eigenvectors of all the ξ's. It may also be characterized by the requirement that each of the ξ's shall be represented by a diagonal matrix, this condition being easily seen to be equivalent to the previous one. The latter characterization is usually the more convenient one. For brevity, we shall formulate it as each of the ξ's 'being diagonal in the representation'.
Provided the ξ's form a complete set of commuting observables, the representation is completely determined by the characterization, apart from arbitrary phase factors in the basic vectors. Each basic bra ⟨ξ_1'...ξ_u'| may be multiplied by e^{iγ'}, where γ' is any real function of the variables ξ_1',..., ξ_u', without changing any of the conditions which the representation has to satisfy, i.e. the condition that the ξ's are diagonal or that the basic vectors are simultaneous eigenvectors of the ξ's, and the fundamental properties of the basic vectors (34) and (35). With the basic bras changed in this way, the representative ⟨ξ_1'...ξ_u'|P⟩ of a ket |P⟩ gets multiplied by e^{iγ'}, the representative ⟨Q|ξ_1'...ξ_u'⟩ of a bra ⟨Q| gets multiplied by e^{-iγ'} and the representative ⟨ξ_1'...ξ_u'|α|ξ_1''...ξ_u''⟩ of a linear operator α gets multiplied by e^{i(γ'−γ'')}. The probabilities or relative probabilities (51), (52) are, of course, unaltered.
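The phase freedom is easy to exhibit numerically (an added sketch with hypothetical random data): multiplying the representative by e^{iγ'} for any real γ' leaves the probabilities (51) untouched.

```python
import numpy as np

rng = np.random.default_rng(5)
rep = rng.normal(size=4) + 1j * rng.normal(size=4)   # representative <xi'|P>
rep /= np.linalg.norm(rep)                           # normalize |P>

gamma = rng.uniform(0, 2 * np.pi, size=4)   # arbitrary real function gamma' of the xi'
rep_new = np.exp(1j * gamma) * rep          # basic bras multiplied by e^{i gamma'}

# The probabilities (51) are unaltered by the arbitrary phase factors.
assert np.allclose(np.abs(rep_new) ** 2, np.abs(rep) ** 2)
```
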
The probabilities that one calculates in practical problems in quantum mechanics are nearly always obtained from the squares of the moduli of probability amplitudes or relative probability amplitudes. Even when one is interested only in the probability of an incomplete set of commuting observables having specified values, it is usually necessary first to make the set a complete one by the introduction of some extra commuting observables and to obtain the probability of the complete set having specified values (as the square of the modulus of a probability amplitude), and then to sum or integrate over all possible values of the extra observables. A more direct application of formula (51) of §13 is usually not practicable.
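This sum-over-the-extra-observables procedure looks as follows in a small discrete sketch (an added illustration; the amplitudes and the names ξ, η are hypothetical):

```python
import numpy as np

# Amplitudes <xi' eta'|x> for a complete set made of an observable xi of
# interest plus an extra commuting observable eta.
# Rows: values of xi'; columns: values of eta' (hypothetical numbers).
amp = np.array([[0.5, 0.5],
                [0.5j, -0.5]])
assert np.isclose(np.sum(np.abs(amp) ** 2), 1.0)   # normalized state

# Probability of xi alone having a given value: square the moduli first,
# then sum over all possible values of the extra observable eta.
prob_xi = np.sum(np.abs(amp) ** 2, axis=1)
assert np.allclose(prob_xi, [0.5, 0.5])
```
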
To introduce a representation in practice
(i) we look for observables which we would like to have diagonal, either because we are interested in their probabilities or for reasons of mathematical simplicity;
(ii) we must see that they all commute, a necessary condition since diagonal matrices always commute;
(iii) we then see that they form a complete commuting set, and if not we add some more commuting observables to them to make them into a complete commuting set;
(iv) we set up an orthogonal representation with this complete commuting set diagonal.
The representation is then completely determined except for the arbitrary phase factors. For most purposes the arbitrary phase factors are unimportant and trivial, so that we may count the representation as being completely determined by the observables that are diagonal in it. This fact is already implied in our notation, since the only indication in a representative of the representation to which it belongs is the letters denoting the observables that are diagonal.
It may be that we are interested in two representations for the same dynamical system. Suppose that in one of them the complete set of commuting observables ξ_1,..., ξ_u are diagonal and the basic bras are ⟨ξ_1'...ξ_u'| and in the other the complete set of commuting observables η_1,..., η_w are diagonal and the basic bras are ⟨η_1'...η_w'|. A ket |P⟩ will now have the two representatives ⟨ξ_1'...ξ_u'|P⟩ and ⟨η_1'...η_w'|P⟩. If ξ_1,..., ξ_v have discrete eigenvalues and ξ_{v+1},..., ξ_u have continuous eigenvalues and if η_1,..., η_x have discrete eigenvalues and η_{x+1},..., η_w have continuous eigenvalues, we get from (35)
    ⟨η_1'..η_w'|P⟩ = Σ_{ξ_1'..ξ_v'} ∫..∫ ⟨η_1'..η_w'|ξ_1'..ξ_u'⟩ dξ_{v+1}'..dξ_u' ⟨ξ_1'..ξ_u'|P⟩   (53)
and interchanging ξ's and η's
    ⟨ξ_1'..ξ_u'|P⟩ = Σ_{η_1'..η_x'} ∫..∫ ⟨ξ_1'..ξ_u'|η_1'..η_w'⟩ dη_{x+1}'..dη_w' ⟨η_1'..η_w'|P⟩.   (54)
These are the transformation equations which give one representative of |P⟩ in terms of the other. They show that either representative is expressible linearly in terms of the other, with the quantities
    ⟨η_1'..η_w'|ξ_1'..ξ_u'⟩,   ⟨ξ_1'..ξ_u'|η_1'..η_w'⟩   (55)
as coefficients. These quantities are called the transformation functions. Similar equations may be written down to connect the two representatives of a bra vector or of a linear operator. The transformation functions (55) are in every case the means which enable one to pass from one representative to the other. Each of the
transformation functions is the conjugate complex of the other, and they satisfy the conditions
    Σ_{ξ_1'..ξ_v'} ∫..∫ ⟨η_1'..η_w'|ξ_1'..ξ_u'⟩ dξ_{v+1}'..dξ_u' ⟨ξ_1'..ξ_u'|η_1''..η_w''⟩
        = δ_{η_1'η_1''}..δ_{η_x'η_x''} δ(η_{x+1}'−η_{x+1}'')..δ(η_w'−η_w'')   (56)
and the corresponding conditions with ξ's and η's interchanged, as may be verified from (35) and (34) and the corresponding equations for the η's.
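In the finite discrete case the transformation functions (55) form a unitary matrix and (53) becomes an ordinary matrix-vector product. An added sketch with two hypothetical random orthonormal bases:

```python
import numpy as np

rng = np.random.default_rng(6)

# Two orthonormal bases of the same space: the xi- and eta-representations.
xi_kets, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
eta_kets, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))

# Transformation functions <eta'|xi'>, as in (55).
T = eta_kets.conj().T @ xi_kets

P = rng.normal(size=3) + 1j * rng.normal(size=3)
rep_xi = xi_kets.conj().T @ P    # <xi'|P>
rep_eta = eta_kets.conj().T @ P  # <eta'|P>

# Equation (53): <eta'|P> = sum over xi' of <eta'|xi'><xi'|P>.
assert np.allclose(T @ rep_xi, rep_eta)

# The conditions (56): the transformation functions form a unitary matrix.
assert np.allclose(T.conj().T @ T, np.eye(3))
```
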
Transformation functions are examples of probability amplitudes or relative probability amplitudes. Let us take the case when all the ξ's and all the η's have discrete eigenvalues. Then the basic ket |η_1'...η_w'⟩ is normalized, so that its representative in the ξ-representation, ⟨ξ_1'...ξ_u'|η_1'...η_w'⟩, is a probability amplitude for each set of values for the ξ''s. The state to which these probability amplitudes refer, namely the state corresponding to |η_1'...η_w'⟩, is characterized by the condition that a simultaneous measurement of η_1,..., η_w is certain to lead to the results η_1',..., η_w'. Thus |⟨ξ_1'...ξ_u'|η_1'...η_w'⟩|² is the probability of the ξ's having the values ξ_1',..., ξ_u' for the state for which the η's certainly have the values η_1',..., η_w'. Since
    |⟨ξ_1'...ξ_u'|η_1'...η_w'⟩|² = |⟨η_1'...η_w'|ξ_1'...ξ_u'⟩|²,
we have the theorem of reciprocity: the probability of the ξ's having the values ξ' for the state for which the η's certainly have the values η' is equal to the probability of the η's having the values η' for the state for which the ξ's certainly have the values ξ'.
If all the η's have discrete eigenvalues and some of the ξ's have continuous eigenvalues, |⟨ξ_1'...ξ_u'|η_1'...η_w'⟩|² still gives the probability distribution of values for the ξ's for the state for which the η's certainly have the values η'. If some of the η's have continuous eigenvalues, |η_1'...η_w'⟩ is not normalized and |⟨ξ_1'...ξ_u'|η_1'...η_w'⟩|² then gives only the relative probability distribution of values for the ξ's for the state for which the η's certainly have the values η'.
19. Theorems about functions of observables
We shall illustrate the mathematical value of representations by using them to prove some theorems.
THEOREM 1. A linear operator that commutes with an observable ξ commutes also with any function of ξ.
The theorem is obviously true when the function is expressible as a power series. To prove it generally, let ω be the linear operator, so that we have the equation
    ξω − ωξ = 0.   (57)
Let us introduce a representation in which ξ is diagonal. If ξ by itself does not form a complete commuting set of observables, we must make it into a complete commuting set by adding certain observables, β say, to it, and then take the representation in which ξ and the β's are diagonal. (The case when ξ does form a complete commuting set by itself can be looked upon as a special case of the preceding one with the number of β variables zero.) In this representation equation (57) becomes
    ⟨ξ'β'|ξω − ωξ|ξ''β''⟩ = 0,
which reduces to
    ξ' ⟨ξ'β'|ω|ξ''β''⟩ − ⟨ξ'β'|ω|ξ''β''⟩ ξ'' = 0.
In the case when the eigenvalues of ξ are discrete, this equation shows that all the matrix elements ⟨ξ'β'|ω|ξ''β''⟩ of ω vanish except those for which ξ' = ξ''. In the case when the eigenvalues of ξ are continuous it shows, like equation (48), that ⟨ξ'β'|ω|ξ''β''⟩ is of the form
    ⟨ξ'β'|ω|ξ''β''⟩ = c δ(ξ'−ξ''),
where c is some function of ξ' and the β''s and β'''s. In either case we may say that the matrix representing ω 'is diagonal with respect to ξ'. If f(ξ) denotes any function of ξ in accordance with the general theory of §11, which requires f(ξ') to be defined for ξ' any eigenvalue of ξ, we can deduce in either case
    f(ξ') ⟨ξ'β'|ω|ξ''β''⟩ − ⟨ξ'β'|ω|ξ''β''⟩ f(ξ'') = 0.
This gives
    ⟨ξ'β'|f(ξ)ω − ωf(ξ)|ξ''β''⟩ = 0,
so that
    f(ξ)ω − ωf(ξ) = 0
and the theorem is proved.
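A finite discrete illustration of Theorem 1 (an added sketch; the matrices are hypothetical examples): if ξ is diagonal with distinct eigenvalues, an ω commuting with ξ is itself diagonal, and then commutes with f(ξ) as well.

```python
import numpy as np

# xi with non-degenerate discrete eigenvalues, taken diagonal.
xi = np.diag([1.0, 2.0, 4.0])

# Any omega commuting with such a xi must be diagonal too (its matrix
# elements with xi' != xi'' vanish); take a hypothetical example.
omega = np.diag([5.0, -1.0, 3.0])
assert np.allclose(xi @ omega - omega @ xi, 0.0)

# f(xi): apply f to the eigenvalues on the diagonal, as in (43).
f_xi = np.diag(np.exp(np.diag(xi)))

# Theorem 1: omega commutes with f(xi) as well.
assert np.allclose(f_xi @ omega - omega @ f_xi, 0.0)
```
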
As a special case of the theorem, we have the result that any observable that commutes with an observable ξ also commutes with any function of ξ. This result appears as a physical necessity when we identify, as in §13, the condition of commutability of two observables with the condition of compatibility of the corresponding observations. Any observation that is compatible with the measurement of an observable ξ must also be compatible with the measurement of f(ξ), since any measurement of ξ includes in itself a measurement of f(ξ).
THEOREM 2. A linear operator that commutes with each of a complete set of commuting observables is a function of those observables.
Let ω be the linear operator and ξ_1, ξ_2,..., ξ_u the complete set of commuting observables, and set up a representation with these observables diagonal. Since ω commutes with each of the ξ's, the matrix representing it is diagonal with respect to each of the ξ's, by the argument we had above. This matrix is therefore a diagonal matrix and is of the form (49), involving a number c' which is a function of the ξ''s. It thus represents the function of the ξ's that c' is of the ξ''s, and hence ω equals this function of the ξ's.
THEOREM 3. If an observable ξ and a linear operator g are such that any linear operator that commutes with ξ also commutes with g, then g is a function of ξ.
This is the converse of Theorem 1. To prove it, we use the same representation with ξ diagonal as we had for Theorem 1. In the first place, we see that g must commute with ξ itself, and hence the representative of g must be diagonal with respect to ξ, i.e. it must be of the form
    ⟨ξ'β'|g|ξ''β''⟩ = a(ξ'β'β'') δ_{ξ'ξ''}  or  a(ξ'β'β'') δ(ξ'−ξ''),
according to whether ξ has discrete or continuous eigenvalues. Now let ω be any linear operator that commutes with ξ, so that its representative is of the form
    ⟨ξ'β'|ω|ξ''β''⟩ = b(ξ'β'β'') δ_{ξ'ξ''}  or  b(ξ'β'β'') δ(ξ'−ξ'').
By hypothesis ω must also commute with g, so that
    ⟨ξ'β'|gω − ωg|ξ''β''⟩ = 0.   (58)
If we suppose for definiteness that the β's have discrete eigenvalues, (58) leads, with the help of the law of matrix multiplication, to
    Σ_{β'''} {a(ξ'β'β''') b(ξ'β'''β'') − b(ξ'β'β''') a(ξ'β'''β'')} = 0,   (59)
the left-hand side of (58) being equal to the left-hand side of (59) multiplied by δ_{ξ'ξ''} or δ(ξ'−ξ''). Equation (59) must hold for all functions b(ξ'β'β''). We can deduce that
    a(ξ'β'β'') = 0  for β' ≠ β'',
    a(ξ'β'β') = a(ξ'β''β'').
The first of these results shows that the matrix representing g is diagonal and the second shows that a(ξ'β'β') is a function of ξ' only. We can now infer that g is that function of ξ which a(ξ'β'β') is of ξ',
so the theorem is proved. The proof is analogous if some of the β's
have continuous eigenvalues.
Theorems 1 and 3 are still valid if we replace the observable ξ by
any set of commuting observables ξ₁, ξ₂,..., ξᵤ, only formal changes
being needed in the proofs.
20. Developments in notation
The theory of representations that we have developed provides a
general system for labelling kets and bras. In a representation in which
the complete set of commuting observables ξ₁,..., ξᵤ are diagonal any
ket |P⟩ will have a representative ⟨ξ₁'...ξᵤ'|P⟩, or ⟨ξ'|P⟩ for brevity.
This representative is a definite function of the variables ξ', say ψ(ξ').
The function ψ then determines the ket |P⟩ completely, so it may be
used to label this ket, to replace the arbitrary label P. In symbols,
if ⟨ξ'|P⟩ = ψ(ξ')
we put |P⟩ = |ψ(ξ)⟩.   (60)
We must put |P⟩ equal to |ψ(ξ)⟩ and not |ψ(ξ')⟩, since it does not
depend on a particular set of eigenvalues for the ξ's, but only on the
form of the function ψ.
With f(ξ) any function of the observables ξ₁,..., ξᵤ, f(ξ)|P⟩ will
have as its representative
⟨ξ'|f(ξ)|P⟩ = f(ξ')ψ(ξ').
Thus according to (60) we put
f(ξ)|P⟩ = |f(ξ)ψ(ξ)⟩.
With the help of the second of equations (60) we now get
f(ξ)|ψ(ξ)⟩ = |f(ξ)ψ(ξ)⟩.   (61)
This is a general result holding for any functions f and ψ of the ξ's,
and it shows that the vertical line | is not necessary with the new
notation for a ket: either side of (61) may be written simply as
f(ξ)ψ(ξ)⟩. Thus the rule for the new notation becomes:
if ⟨ξ'|P⟩ = ψ(ξ')
we put |P⟩ = ψ(ξ)⟩.   (62)
We may further shorten ψ(ξ)⟩ to ψ⟩, leaving the variables ξ understood,
if no ambiguity arises thereby.
The ket ψ(ξ)⟩ may be considered as the product of the linear
operator ψ(ξ) with a ket which is denoted simply by ⟩ without a
label. We call the ket ⟩ the standard ket. Any ket whatever can be
expressed as a function of the ξ's multiplied into the standard ket.
For example, taking |P⟩ in (62) to be the basic ket |ξ''⟩, we find
|ξ''⟩ = δ_{ξ₁ξ₁''}...δ_{ξᵥξᵥ''} δ(ξᵥ₊₁−ξᵥ₊₁'')...δ(ξᵤ−ξᵤ'')⟩   (63)
in the case when ξ₁,..., ξᵥ have discrete eigenvalues and ξᵥ₊₁,..., ξᵤ have
continuous eigenvalues. The standard ket is characterized by the
condition that its representative ⟨ξ'|⟩ is unity over the whole domain
of the variable ξ', as may be seen by putting ψ = 1 in (62).
A further contraction may be made in the notation, namely to
leave the symbol ⟩ for the standard ket understood. A ket is then
written simply as ψ(ξ), a function of the observables ξ. A function
of the ξ's used in this way to denote a ket is called a wave function.†
The system of notation provided by wave functions is the one usually
used by most authors for calculations in quantum mechanics. In
using it one should remember that each wave function is understood
to have the standard ket multiplied into it on the right, which
prevents one from multiplying the wave function by any operator
on the right. Wave functions can be multiplied by operators only on
the left. This distinguishes them from ordinary functions of the ξ's,
which are operators and can be multiplied by operators on either the
left or the right. A wave function is just the representative of a ket
expressed as a function of the observables ξ, instead of eigenvalues ξ'
for those observables. The square of its modulus gives the probability
(or the relative probability, if it is not normalized) of the ξ's
having specified values, or lying in specified small ranges, for the
corresponding state.
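The probability rule just stated can be illustrated with a discretized wave function; the Gaussian packet and grid below are illustrative choices, not an example from the text:

```python
import numpy as np

# A discretized wave function psi(q) on a grid: an illustrative Gaussian
# packet, with |psi|^2 giving (relative) probabilities.
q = np.linspace(-10.0, 10.0, 2001)
dq = q[1] - q[0]
psi = np.exp(-q**2 / 2.0) * np.exp(1j * 1.5 * q)

# Normalize so that the total probability integrates to unity.
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dq)
prob = np.abs(psi)**2              # probability density |psi(q')|^2

assert np.isclose(np.sum(prob) * dq, 1.0)
# Probability of q lying in the range -1 <= q <= 1 for this state:
p_range = np.sum(prob[np.abs(q) <= 1.0]) * dq
assert abs(p_range - 0.8427) < 0.01   # = erf(1) for this Gaussian
```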
The new notation for bras may be developed in the same way as
for kets. A bra ⟨Q| whose representative ⟨Q|ξ'⟩ is φ̄(ξ') we write
⟨φ̄(ξ)|. With this notation the conjugate imaginary to |ψ(ξ)⟩ is
⟨ψ̄(ξ)|. Thus the rule that we have used hitherto, that a ket and
its conjugate imaginary bra are both specified by the same label,
must be extended to read: if the labels of a ket involve complex
numbers or complex functions, the labels of the conjugate imaginary
bra involve the conjugate complex numbers or functions. As in the
case of kets we can show that ⟨φ̄(ξ)|f(ξ) and ⟨φ̄(ξ)f(ξ)| are the same,
so that the vertical line can be omitted. We can consider ⟨φ̄(ξ) as
the product of the linear operator φ̄(ξ) into the standard bra ⟨, which
† The reason for this name is that in the early days of quantum mechanics all the
examples of these functions were of the form of waves. The name is not a descriptive
one from the point of view of the modern general theory.
is the conjugate imaginary of the standard ket ⟩. We may leave
the standard bra understood, so that a general bra is written as φ̄(ξ),
the conjugate complex of a wave function. The conjugate complex
of a wave function can be multiplied by any linear operator on the
right, but cannot be multiplied by a linear operator on the left. We
can construct triple products of the form ⟨f(ξ)⟩. Such a triple product
is a number, equal to f(ξ') summed or integrated over the whole
domain of eigenvalues for the ξ's,
⟨f(ξ)⟩ = Σ_{ξ₁'}...Σ_{ξᵥ'} ∫...∫ f(ξ') dξᵥ₊₁'...dξᵤ'   (64)
in the case when ξ₁,..., ξᵥ have discrete eigenvalues and ξᵥ₊₁,..., ξᵤ have
continuous eigenvalues.
The standard ket and bra are defined with respect to a representation.
If we carried through the above work with a different representation
in which the complete set of commuting observables η are
diagonal, or if we merely changed the phase factors in the representation
with the ξ's diagonal, we should get a different standard ket and
bra. In a piece of work in which more than one standard ket or bra
appears one must, of course, distinguish them by giving them labels.
A further development of the notation which is of great importance
for dealing with complicated dynamical systems will now be discussed.
Suppose we have a dynamical system describable in terms of dynamical
variables which can all be divided into two sets, set A and set B
say, such that any member of set A commutes with any member of
set B. A general dynamical variable must be expressible as a function
of the A-variables and B-variables together. We may consider
another dynamical system in which the dynamical variables are the
A-variables only; let us call it the A-system. Similarly we may
consider a third dynamical system in which the dynamical variables
are the B-variables only, the B-system. The original system can
then be looked upon as a combination of the A-system and the
B-system in accordance with the mathematical scheme given below.
Let us take any ket |a⟩ for the A-system and any ket |b⟩ for the
B-system. We assume that they have a product |a⟩|b⟩ for which
the commutative and distributive axioms of multiplication hold, i.e.
|a⟩|b⟩ = |b⟩|a⟩,
{c₁|a₁⟩+c₂|a₂⟩}|b⟩ = c₁|a₁⟩|b⟩+c₂|a₂⟩|b⟩,
|a⟩{c₁|b₁⟩+c₂|b₂⟩} = c₁|a⟩|b₁⟩+c₂|a⟩|b₂⟩,
the c's being numbers. We can give a meaning to any A-variable
operating on the product |a⟩|b⟩ by assuming that it operates only
on the |a⟩ factor and commutes with the |b⟩ factor, and similarly
we can give a meaning to any B-variable operating on this product
by assuming that it operates only on the |b⟩ factor and commutes
with the |a⟩ factor. (This makes every A-variable commute with
every B-variable.) Thus any dynamical variable of the original
system can operate on the product |a⟩|b⟩, so this product can be
looked upon as a ket for the original system, and may then be
written |ab⟩, the two labels a and b being sufficient to specify it.
In this way we get the fundamental equations
|a⟩|b⟩ = |b⟩|a⟩ = |ab⟩.   (65)
The multiplication here is of quite a different kind from any that
occurs earlier in the theory. The ket vectors |a⟩ and |b⟩ are in two
different vector spaces and their product is in a third vector space,
which may be called the product of the two previous vector spaces.
The number of dimensions of the product space is equal to the
product of the number of dimensions of each of the factor spaces.
A general ket vector of the product space is not of the form (65), but
is a sum or integral of kets of this form.
Let us take a representation for the A-system in which a complete
set of commuting observables ξ_A of the A-system are diagonal. We
shall then have the basic bras ⟨ξ_A'| for the A-system. Similarly, taking
a representation for the B-system with the observables ξ_B diagonal,
we shall have the basic bras ⟨ξ_B'| for the B-system. The products
⟨ξ_A'|⟨ξ_B'| = ⟨ξ_A'ξ_B'|   (66)
will then provide the basic bras for a representation for the original
system, in which representation the ξ_A's and the ξ_B's will be diagonal.
The ξ_A's and ξ_B's will together form a complete set of commuting
observables for the original system. From (65) and (66) we get
⟨ξ_A'|a⟩⟨ξ_B'|b⟩ = ⟨ξ_A'ξ_B'|ab⟩,   (67)
showing that the representative of |ab⟩ equals the product of the
representatives of |a⟩ and of |b⟩ in their respective representations.
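In modern finite-dimensional language, equation (67) says that the representative of a product ket is a Kronecker (outer) product. A sketch with illustrative two- and three-dimensional factor spaces:

```python
import numpy as np

# Representatives (wave functions) of |a> for the A-system and |b> for the
# B-system, on discrete grids of eigenvalues xi_A', xi_B' (illustrative data).
a = np.array([1.0, 2.0])           # <xi_A'|a>
b = np.array([0.5, -1.0, 3.0])     # <xi_B'|b>

# Representative of |ab> in the product space:
# <xi_A' xi_B'|ab> = <xi_A'|a><xi_B'|b>, i.e. the Kronecker product.
ab = np.kron(a, b)

assert ab.shape == (6,)            # dim of product space = 2 * 3
assert np.isclose(ab[0], a[0] * b[0])

# An A-variable acts only on the |a> factor: (M (x) 1)|ab> = (M|a>)|b>.
M = np.array([[0.0, 1.0], [1.0, 0.0]])
assert np.allclose(np.kron(M, np.eye(3)) @ ab, np.kron(M @ a, b))
```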
We can introduce the standard ket, ⟩_A say, for the A-system,
with respect to the representation with the ξ_A's diagonal, and also
the standard ket ⟩_B for the B-system, with respect to the representation
with the ξ_B's diagonal. Their product ⟩_A⟩_B is then the
standard ket for the original system, with respect to the representation
with the ξ_A's and ξ_B's diagonal. Any ket for the original system
may be expressed as
ψ(ξ_A, ξ_B)⟩_A⟩_B.   (68)
It may be that in a certain calculation we wish to use a particular
representation for the B-system, say the above representation with
the ξ_B's diagonal, but do not wish to introduce any particular
representation for the A-system. It would then be convenient to
use the standard ket ⟩_B for the B-system and no standard ket for
the A-system. Under these circumstances we could write any ket
for the original system as
|ξ_B⟩⟩_B,   (69)
in which |ξ_B⟩ is a ket for the A-system and is also a function of the
ξ_B's, i.e. it is a ket for the A-system for each set of values for the
ξ_B's; in fact (69) equals (68) if we take
|ξ_B⟩ = ψ(ξ_A, ξ_B)⟩_A.
We may leave the standard ket ⟩_B in (69) understood, and then we
have the general ket for the original system appearing as |ξ_B⟩, a ket
for the A-system and a wave function in the variables ξ_B of the
B-system. Examples of this notation will be used in §§ 66 and 79.
The above work can be immediately extended to a dynamical
system describable in terms of dynamical variables which can be
divided into three or more sets A, B, C,... such that any member of
one set commutes with any member of another. Equation (65) gets
generalized to
|a⟩|b⟩|c⟩... = |abc...⟩,
the factors on the left being kets for the component systems and
the ket on the right being a ket for the original system. Equations
(66), (67), and (68) get generalized to many factors in a similar way.
IV
THE QUANTUM CONDITIONS
21. Poisson brackets
OUR work so far has consisted in setting up a general mathematical
scheme connecting states and observables in quantum mechanics.
One of the dominant features of this scheme is that observables, and
dynamical variables in general, appear in it as quantities which do
not obey the commutative law of multiplication. It now becomes
necessary for us to obtain equations to replace the commutative law
of multiplication, equations that will tell us the value of ξη − ηξ when
ξ and η are any two observables or dynamical variables. Only when
such equations are known shall we have a complete scheme of
mechanics with which to replace classical mechanics. These new
equations are called quantum conditions or commutation relations.
The problem of finding quantum conditions is not of such a general
character as those we have been concerned with up to the present. It
is instead a special problem which presents itself with each particular
dynamical system one is called upon to study. There is, however,
a fairly general method of obtaining quantum conditions, applicable
to a very large class of dynamical systems. This is the method of
classical analogy and will form the main theme of the present chapter.
Those dynamical systems to which this method is not applicable
must be treated individually and special considerations used in each
case.
The value of classical analogy in the development of quantum
mechanics depends on the fact that classical mechanics provides a
valid description of dynamical systems under certain conditions,
when the particles and bodies composing the systems are sufficiently
massive for the disturbance accompanying an observation to be
negligible. Classical mechanics must therefore be a limiting case of
quantum mechanics. We should thus expect to find that important
concepts in classical mechanics correspond to important concepts in
quantum mechanics, and, from an understanding of the general
nature of the analogy between classical and quantum mechanics, we
may hope to get laws and theorems in quantum mechanics appearing
as simple generalizations of well-known results in classical mechanics;
in particular we may hope to get the quantum conditions appearing
as a simple generalization of the classical law that all dynamical
variables commute.
Let us take a dynamical system composed of a number of particles
in interaction. As independent dynamical variables for dealing with
the system we may use the Cartesian coordinates of all the particles
and the corresponding Cartesian components of velocity of the particles.
It is, however, more convenient to work with the momentum
components instead of the velocity components. Let us call the
coordinates qᵣ, r going from 1 to three times the number of particles,
and the corresponding momentum components pᵣ. The q's and p's
are called canonical coordinates and momenta.
The method of Lagrange's equations of motion involves introducing
coordinates qᵣ and momenta pᵣ in a more general way, applicable
also for a system not composed of particles (e.g. a system containing
rigid bodies). These more general q's and p's are also called canonical
coordinates and momenta. Any dynamical variable is expressible in
terms of a set of canonical coordinates and momenta.
An important concept in general dynamical theory is the Poisson
bracket. Any two dynamical variables u and v have a P.B. (Poisson
bracket) which we shall denote by [u, v], defined by
[u, v] = Σᵣ {∂u/∂qᵣ ∂v/∂pᵣ − ∂u/∂pᵣ ∂v/∂qᵣ},   (1)
u and v being regarded as functions of a set of canonical coordinates
and momenta qᵣ and pᵣ for the purpose of the differentiations. The
right-hand side of (1) is independent of which set of canonical
coordinates and momenta are used, this being a consequence of the
general definition of canonical coordinates and momenta, so the
P.B. [u, v] is well defined.
The main properties of P.B.'s, which follow at once from their
definition (1), are
[u, v] = −[v, u],   (2)
[u, c] = 0,   (3)
where c is a number (which may be considered as a special case of a
dynamical variable),
[u₁+u₂, v] = [u₁, v]+[u₂, v],
[u, v₁+v₂] = [u, v₁]+[u, v₂],   (4)
[u₁u₂, v] = [u₁, v]u₂+u₁[u₂, v],
[u, v₁v₂] = [u, v₁]v₂+v₁[u, v₂].   (5)
Also the identity
[u, [v, w]]+[v, [w, u]]+[w, [u, v]] = 0   (6)
is easily verified. Equations (4) express that the P.B. [u, v] involves
u and v linearly, while equations (5) correspond to the ordinary rules
for differentiating a product.
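The definition (1) and instances of the properties (2)-(6) and of the values (8) below can be verified symbolically; the sketch (using sympy, with two degrees of freedom and arbitrarily chosen u, v, w) is illustrative:

```python
import sympy as sp

q1, q2, p1, p2 = sp.symbols('q1 q2 p1 p2')
qs, ps = [q1, q2], [p1, p2]

def pb(u, v):
    """Classical Poisson bracket, definition (1), for two degrees of freedom."""
    return sum(sp.diff(u, q) * sp.diff(v, p) - sp.diff(u, p) * sp.diff(v, q)
               for q, p in zip(qs, ps))

u, v, w = q1**2 * p2, p1 * q2, q1 + p1 * p2   # arbitrary sample variables

assert sp.expand(pb(u, v) + pb(v, u)) == 0                      # property (2)
assert pb(u, sp.Integer(3)) == 0                                # property (3)
assert sp.expand(pb(u*v, w) - pb(u, w)*v - u*pb(v, w)) == 0     # first of (5)
assert sp.expand(pb(u, pb(v, w)) + pb(v, pb(w, u))
                 + pb(w, pb(u, v))) == 0                        # identity (6)
assert pb(q1, p1) == 1 and pb(q1, p2) == 0 and pb(p1, p2) == 0  # values (8)
```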
Let us try to introduce a quantum P.B. which shall be the analogue
of the classical one. We assume the quantum P.B. to satisfy all the
conditions (2) to (6), it being now necessary that the order of the
factors u₁ and u₂ in the first of equations (5) should be preserved
throughout the equation, as in the way we have here written it, and
similarly for the v₁ and v₂ in the second of equations (5). These conditions
are already sufficient to determine the form of the quantum
P.B. uniquely, as may be seen from the following argument. We can
evaluate the P.B. [u₁u₂, v₁v₂] in two different ways, since we can use
either of the two formulas (5) first, thus,
[u₁u₂, v₁v₂] = [u₁, v₁v₂]u₂+u₁[u₂, v₁v₂]
 = {[u₁, v₁]v₂+v₁[u₁, v₂]}u₂+u₁{[u₂, v₁]v₂+v₁[u₂, v₂]}
 = [u₁, v₁]v₂u₂+v₁[u₁, v₂]u₂+u₁[u₂, v₁]v₂+u₁v₁[u₂, v₂]
and
[u₁u₂, v₁v₂] = [u₁u₂, v₁]v₂+v₁[u₁u₂, v₂]
 = [u₁, v₁]u₂v₂+u₁[u₂, v₁]v₂+v₁[u₁, v₂]u₂+v₁u₁[u₂, v₂].
Equating these two results, we obtain
[u₁, v₁](u₂v₂−v₂u₂) = (u₁v₁−v₁u₁)[u₂, v₂].
Since this condition holds with u₁ and v₁ quite independent of u₂ and
v₂, we must have
u₁v₁−v₁u₁ = iℏ[u₁, v₁],
u₂v₂−v₂u₂ = iℏ[u₂, v₂],
where ℏ must not depend on u₁ and v₁, nor on u₂ and v₂, and also
must commute with (u₁v₁−v₁u₁). It follows that ℏ must be simply
a number. We want the P.B. of two real variables to be real, as in
the classical theory, which requires, from the work at the top of p. 28,
that ℏ shall be a real number when introduced, as here, with the
coefficient i. We are thus led to the following definition for the
quantum P.B. [u, v] of any two variables u and v,
uv−vu = iℏ[u, v],   (7)
in which ℏ is a new universal constant. It has the dimensions of
action. In order that the theory may agree with experiment, we
must take ℏ equal to h/2π, where h is the universal constant that
was introduced by Planck, known as Planck's constant. It is easily
verified that the quantum P.B. satisfies all the conditions (2), (3), (4),
(5), and (6).
The problem of finding quantum conditions now reduces to the
problem of determining P.B.'s in quantum mechanics. The strong
analogy between the quantum P.B. defined by (7) and the classical
P.B. defined by (1) leads us to make the assumption that the quantum
P.B.'s, or at any rate the simpler ones of them, have the same values
as the corresponding classical P.B.'s. The simplest P.B.'s are those
involving the canonical coordinates and momenta themselves and
have the following values in the classical theory:
[qᵣ, qₛ] = 0,  [pᵣ, pₛ] = 0,  [qᵣ, pₛ] = δᵣₛ.   (8)
We therefore assume that the corresponding quantum P.B.'s also
have the values given by (8). By eliminating the quantum P.B.'s
with the help of (7), we obtain the equations
qᵣqₛ−qₛqᵣ = 0,  pᵣpₛ−pₛpᵣ = 0,
qᵣpₛ−pₛqᵣ = iℏδᵣₛ,   (9)
which are the fundamental quantum conditions. They show us where
the lack of commutability among the canonical coordinates and
momenta lies. They also provide us with a basis for calculating commutation
relations between other dynamical variables. For instance,
if ξ and η are any two functions of the q's and p's expressible as
power series, we may express ξη−ηξ or [ξ, η], by repeated applications
of the laws (2), (3), (4), and (5), in terms of the elementary
P.B.'s given in (8) and so evaluate it. The result is often, in simple
cases, the same as the classical result, or departs from the classical
result only through requiring a special order for factors in a product,
this order being, of course, unimportant in the classical theory. Even
when ξ and η are more general functions of the q's and p's not expressible
as power series, equations (9) are still sufficient to fix the
value of ξη−ηξ, as will become clear from the following work.
Equations (9) thus give the solution of the problem of finding the
quantum conditions, for all those dynamical systems which have a
classical analogue and which are describable in terms of canonical
coordinates and momenta. This does not include all possible systems
in quantum mechanics.
Equations (7) and (9) provide the foundation for the analogy
between quantum mechanics and classical mechanics. They show
that classical mechanics may be regarded as the limiting case of quantum
mechanics when ℏ tends to zero. A P.B. in quantum mechanics is a
purely algebraic notion and is thus a rather more fundamental concept
than a classical P.B., which can be defined only with reference to
a set of canonical coordinates and momenta. For this reason canonical
coordinates and momenta are of less importance in quantum mechanics
than in classical mechanics; in fact, we may have a system in quantum
mechanics for which canonical coordinates and momenta do
not exist and we can still give a meaning to P.B.'s. Such a system
would be one without a classical analogue and we should not be able
to obtain its quantum conditions by the method here described.
From equations (9) we see that two variables with different suffixes
r and s always commute. It follows that any function of qᵣ and pᵣ
will commute with any function of qₛ and pₛ when s differs from r.
Different values of r correspond to different degrees of freedom of the
dynamical system, so we get the result that dynamical variables
referring to different degrees of freedom commute. This law, as we have
derived it from (9), is proved only for dynamical systems with
classical analogues, but we assume it to hold generally. In this way
we can make a start on the problem of finding quantum conditions
for dynamical systems for which canonical coordinates and momenta
do not exist, provided we can give a meaning to different degrees of
freedom, as we may be able to do with the help of physical insight.
We can now see the physical meaning of the division, which was
discussed in the preceding section, of the dynamical variables into
sets, any member of one set commuting with any member of another.
Each set corresponds to certain degrees of freedom, or possibly just
one degree of freedom. The division may correspond to the physical
process of resolving the dynamical system into its constituent parts,
each constituent being capable of existing by itself as a physical
system, and the various constituents having to be brought into
interaction with one another to produce the original system. Alternatively
the division may be merely a mathematical procedure of
resolving the dynamical system into degrees of freedom which cannot
be separated physically, e.g. the system consisting of a particle with
internal structure may be divided into the degrees of freedom describing
the motion of the centre of the particle and those describing the
internal structure.
22. Schrödinger's representation
Let us consider a dynamical system with n degrees of freedom
having a classical analogue, and thus describable in terms of canonical
coordinates and momenta qᵣ, pᵣ (r = 1, 2,..., n). We assume that the
coordinates qᵣ are all observables and have continuous ranges of eigenvalues,
these assumptions being reasonable from the physical significance
of the q's. Let us set up a representation with the q's diagonal.
The question arises whether the q's form a complete commuting set
for this dynamical system. It seems pretty obvious from inspection
that they do. We shall here assume that they do, and the assumption
will be justified later (see top of p. 92). With the q's forming a
complete commuting set, the representation is fixed except for the
arbitrary phase factors in it.
Let us consider first the case of n = 1, so that there is only one q
and one p, satisfying
qp−pq = iℏ.   (10)
Any ket may be written in the standard ket notation ψ(q)⟩. From it
we can form another ket dψ/dq⟩, whose representative is the derivative
of the original one. This new ket is a linear function of the
original one and is thus the result of some linear operator applied to
the original one. Calling this linear operator d/dq, we have
(d/dq) ψ⟩ = dψ/dq⟩.   (11)
Equation (11) holding for all functions ψ defines the linear operator
d/dq. We have
(d/dq) ⟩ = 0.   (12)
Let us treat the linear operator d/dq according to the general theory
of linear operators of §7. We should then be able to apply it to a bra
⟨φ̄(q), the product ⟨φ̄ d/dq being defined, according to (3) of §7, by
{⟨φ̄ (d/dq)} ψ⟩ = ⟨φ̄ {(d/dq) ψ⟩}   (13)
for all functions ψ(q). Taking representatives, we get
∫ ⟨φ̄ (d/dq)|q'⟩ ψ(q') dq' = ∫ φ̄(q') dψ(q')/dq' dq'.   (14)
We can transform the right-hand side by partial integration and get
∫ ⟨φ̄ (d/dq)|q'⟩ ψ(q') dq' = −∫ dφ̄(q')/dq' ψ(q') dq',   (15)
provided the contributions from the limits of integration vanish.
This gives
⟨φ̄ (d/dq)|q'⟩ = −dφ̄(q')/dq',
showing that
⟨φ̄ d/dq = −⟨ dφ̄/dq.   (16)
Thus d/dq operating to the left on the conjugate complex of a wave
function has the meaning of minus differentiation with respect to q.
The validity of this result depends on our being able to make the
passage from (14) to (15), which requires that we must restrict ourselves
to bras and kets corresponding to wave functions that satisfy
suitable boundary conditions. The conditions usually holding in
practice are that they vanish at the boundaries. (Somewhat more
general conditions will be given in the next section.) These conditions
do not limit the physical applicability of the theory, but, on the contrary,
are usually required also on physical grounds. For example,
if q is a Cartesian coordinate of a particle, its eigenvalues run from
−∞ to ∞, and the physical requirement that the particle has zero
probability of being at infinity leads to the condition that the wave
function vanishes for q = ±∞.
The conjugate complex of the linear operator d/dq can be evaluated
by noting that the conjugate imaginary of (d/dq) ψ⟩ or dψ/dq⟩ is
⟨dψ̄/dq, or −⟨ψ̄ d/dq from (16). Thus the conjugate complex of d/dq
is −d/dq, so d/dq is a pure imaginary linear operator.
To get the representative of d/dq we note that, from an application
of formula (63) of §20,
|q''⟩ = δ(q−q'')⟩,   (17)
and hence
(d/dq)|q''⟩ = dδ(q−q'')/dq⟩,   (18)
⟨q'|d/dq|q''⟩ = dδ(q'−q'')/dq'.   (19)
The representative of d/dq involves the derivative of the δ function.
Let us work out the commutation relation connecting d/dq with q.
We have
(d/dq) qψ⟩ = ψ⟩ + q dψ/dq⟩.   (20)
Since this holds for any ket ψ⟩, we have
(d/dq) q − q (d/dq) = 1.   (21)
Comparing this result with (10), we see that −iℏ d/dq satisfies the
same commutation relation with q that p does.
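Relations (20), (21), and their consequence for −iℏ d/dq can be checked symbolically; the sketch below is illustrative, with ψ an arbitrary function:

```python
import sympy as sp

q, hbar = sp.symbols('q hbar', real=True)
psi = sp.Function('psi')(q)

# The linear operator d/dq of (11), acting on an arbitrary wave function.
D = lambda f: sp.diff(f, q)

# (20)-(21): (d/dq) q psi = psi + q d(psi)/dq, i.e. (d/dq)q - q(d/dq) = 1.
assert sp.simplify(D(q * psi) - q * D(psi) - psi) == 0

# Hence p = -i*hbar*d/dq reproduces the quantum condition (10): qp - pq = i*hbar.
P = lambda f: -sp.I * hbar * sp.diff(f, q)
assert sp.simplify(q * P(psi) - P(q * psi) - sp.I * hbar * psi) == 0
```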
To extend the foregoing work to the case of arbitrary n, we write
the general ket as ψ(q₁...qₙ)⟩ = ψ⟩ and introduce the n linear operators
∂/∂qᵣ (r = 1,..., n), which can operate on it in accordance with
the formula
(∂/∂qᵣ) ψ⟩ = ∂ψ/∂qᵣ⟩,   (22)
corresponding to (11). We have
(∂/∂qᵣ) ⟩ = 0,   (23)
corresponding to (12). Provided we restrict ourselves to bras and
kets corresponding to wave functions satisfying suitable boundary
conditions, these linear operators can operate also on bras, in accordance
with the formula
⟨φ̄ ∂/∂qᵣ = −⟨ ∂φ̄/∂qᵣ,   (24)
corresponding to (16). Thus ∂/∂qᵣ can operate to the left on the
conjugate complex of a wave function, when it has the meaning of
minus partial differentiation with respect to qᵣ. We find as before
that each ∂/∂qᵣ is a pure imaginary linear operator. Corresponding
to (21) we have the commutation relations
(∂/∂qᵣ) qₛ − qₛ (∂/∂qᵣ) = δᵣₛ.   (25)
We have further
(∂/∂qᵣ)(∂/∂qₛ) ψ⟩ = ∂²ψ/∂qᵣ∂qₛ⟩ = (∂/∂qₛ)(∂/∂qᵣ) ψ⟩,   (26)
showing that
(∂/∂qᵣ)(∂/∂qₛ) = (∂/∂qₛ)(∂/∂qᵣ).   (27)
Comparing (25) and (27) with (9), we see that the linear operators
−iℏ ∂/∂qᵣ satisfy the same commutation relations with the q's and with
each other that the p's do.
It would be possible to take
pᵣ = −iℏ ∂/∂qᵣ   (28)
without getting any inconsistency. This possibility enables us to see
that the q's must form a complete commuting set of observables,
since it means that any function of the q's and p's could be taken
to be a function of the q's and −iℏ ∂/∂q's and then could not commute
with all the q's unless it is a function of the q's only.
The equations (28) do not necessarily hold. But in any case the
quantities pᵣ + iℏ ∂/∂qᵣ each commute with all the q's, so each of them
is a function of the q's, from Theorem 2 of §19. Thus
pᵣ = −iℏ ∂/∂qᵣ + fᵣ(q).   (29)
Since pᵣ and −iℏ ∂/∂qᵣ are both real, fᵣ(q) must be real. For any
function f of the q's we have
(∂/∂qᵣ) fψ⟩ = (∂f/∂qᵣ) ψ⟩ + f ∂ψ/∂qᵣ⟩,
showing that
(∂/∂qᵣ) f − f (∂/∂qᵣ) = ∂f/∂qᵣ.   (30)
With the help of (29) we can now deduce the general formula
pᵣf − fpᵣ = −iℏ ∂f/∂qᵣ.   (31)
This formula may be written in P.B. notation
[f, pᵣ] = ∂f/∂qᵣ,   (32)
when it is the same as in the classical theory, as follows from (1).
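Formula (31) may be checked symbolically in the special representation (28), for a sample f; the cubic below is an arbitrary illustrative choice:

```python
import sympy as sp

q, hbar = sp.symbols('q hbar', real=True)
psi = sp.Function('psi')(q)
f = q**3 + 2*q          # an illustrative function f(q); any function works

# In the special representation (28), p acts as -i*hbar d/dq.
P = lambda g: -sp.I * hbar * sp.diff(g, q)

# (31): (p f - f p) psi = -i*hbar (df/dq) psi, on an arbitrary wave function.
lhs = P(f * psi) - f * P(psi)
rhs = -sp.I * hbar * sp.diff(f, q) * psi
assert sp.simplify(lhs - rhs) == 0
```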
Multiplying (27) by (−iℏ)² and substituting for −iℏ ∂/∂qᵣ and −iℏ ∂/∂qₛ
their values given by (29), we get
(pᵣ−fᵣ)(pₛ−fₛ) = (pₛ−fₛ)(pᵣ−fᵣ),
which reduces, with the help of the quantum condition pᵣpₛ = pₛpᵣ, to
pᵣfₛ+fᵣpₛ = pₛfᵣ+fₛpᵣ.
This reduces further, with the help of (31), to
∂fₛ/∂qᵣ = ∂fᵣ/∂qₛ,   (33)
showing that the functions fᵣ are all of the form
fᵣ = ∂F/∂qᵣ   (34)
with F independent of r. Equation (29) now becomes
pᵣ = −iℏ ∂/∂qᵣ + ∂F/∂qᵣ.   (35)
We have been working with a representation which is fixed to the
extent that the q's must be diagonal in it, but which contains arbitrary
phase factors. If the phase factors are changed, the operators ∂/∂qᵣ
get changed. It will now be shown that, by a suitable change in the
phase factors, the function F in (35) can be made to vanish, so that
equations (28) are made to hold.
Using stars to distinguish quantities referring to the new representation
with the new phase factors, we shall have the new basic
bras connected with the previous ones by
⟨q₁'...qₙ'*| = e^{iγ'} ⟨q₁'...qₙ'|,   (36)
where γ' = γ(q') is a real function of the q''s. The new representative
of a ket is e^{iγ'} times the old one, showing that e^{iγ} ⟩* = ⟩, so
⟩* = e^{−iγ} ⟩   (37)
as the connexion between the new standard ket and the original one.
The new linear operator (∂/∂qᵣ)* satisfies, corresponding to (22),
(∂/∂qᵣ)* ψ⟩* = ∂ψ/∂qᵣ⟩* = e^{−iγ} ∂ψ/∂qᵣ⟩
with the help of (37). Using (22), this gives
(∂/∂qᵣ)* ψ⟩* = e^{−iγ} (∂/∂qᵣ) ψ⟩ = e^{−iγ} (∂/∂qᵣ) e^{iγ} ψ⟩*,
showing that
(∂/∂qᵣ)* = e^{−iγ} (∂/∂qᵣ) e^{iγ},   (38)
or, with the help of (30),
(∂/∂qᵣ)* = ∂/∂qᵣ + i ∂γ/∂qᵣ.   (39)
By choosing γ so that
F = ℏγ + a constant,   (40)
(35) becomes
pᵣ = −iℏ (∂/∂qᵣ)*.   (41)
Equation (40) fixes γ except for an arbitrary constant, so the representation
is fixed except for an arbitrary constant phase factor.
In this way we see that a representation can be set up in which
the q's are diagonal and equations (28) hold. This representation is
a very useful one for many problems. It will be called Schrödinger's
representation, as it was the representation in terms of which Schrödinger
gave his original formulation of quantum mechanics in 1926.
Schrödinger's representation exists whenever one has canonical q's
and p's, and is completely determined by these q's and p's except for
an arbitrary constant phase factor. It owes its great convenience to
its allowing one to express immediately any algebraic function of the
q's and p's of the form of a power series in the p's as an operator of
differentiation, e.g. if f(q₁,..., qₙ, p₁,..., pₙ) is such a function, we have
f(q₁,..., qₙ, p₁,..., pₙ) = f(q₁,..., qₙ, −iℏ ∂/∂q₁,..., −iℏ ∂/∂qₙ),   (42)
provided we preserve the order of the factors in a product on substituting
the −iℏ ∂/∂q's for the p's.
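Rule (42) may be illustrated for the common case f = p²/2m + V(q) (an illustrative choice, not an example from the text): substituting −iℏ d/dq for p turns f into the familiar differential operator.

```python
import sympy as sp

q, hbar, m = sp.symbols('q hbar m', positive=True)
psi = sp.Function('psi')(q)
V = sp.Function('V')(q)      # an illustrative potential; f = p**2/(2m) + V(q)

# Rule (42): substitute -i*hbar*d/dq for p, keeping factors in order.
def f_op(g):
    p2 = (-sp.I * hbar)**2 * sp.diff(g, q, 2)   # p**2 acting on g
    return p2 / (2 * m) + V * g

expected = -hbar**2 / (2 * m) * sp.diff(psi, q, 2) + V * psi
assert sp.simplify(f_op(psi) - expected) == 0
```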
From (23) and (28), we have
pᵣ ⟩ = 0.   (43)
Thus the standard ket in Schrödinger's representation is characterized
by the condition that it is a simultaneous eigenket of all the momenta
belonging to the eigenvalues zero. Some properties of the basic
vectors of Schrödinger's representation may also be noted. Equation
(22) gives
⟨q₁'...qₙ'| (∂/∂qᵣ) ψ⟩ = ⟨q₁'...qₙ'| ∂ψ/∂qᵣ⟩ = ∂ψ(q')/∂qᵣ' = (∂/∂qᵣ') ⟨q₁'...qₙ'|ψ⟩.
Hence
⟨q₁'...qₙ'| ∂/∂qᵣ = (∂/∂qᵣ') ⟨q₁'...qₙ'|,   (44)
⟨q₁'...qₙ'| pᵣ = −iℏ (∂/∂qᵣ') ⟨q₁'...qₙ'|.   (45)
Similarly, equation (24) leads to
pᵣ |q₁'...qₙ'⟩ = iℏ (∂/∂qᵣ') |q₁'...qₙ'⟩.   (46)
23. The momentum representation
Let us take a system with one degree of freedom, describable in
terms of a q and p with the eigenvalues of q running from −∞ to ∞,
and let us take an eigenket |p'⟩ of p. Its representative in the Schrödinger
representation, ⟨q'|p'⟩, satisfies
p'⟨q'|p'⟩ = ⟨q'|p|p'⟩ = −iℏ d⟨q'|p'⟩/dq',
with the help of (45) applied to the case of one degree of freedom.
The solution of this differential equation for ⟨q'|p'⟩ is
⟨q'|p'⟩ = c' e^{ip'q'/ℏ},   (47)
where c' = c(p') is independent of q', but may involve p'.
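That (47) solves the differential equation above is quickly verified symbolically (q1 and p1 below stand for q' and p'):

```python
import sympy as sp

q1, p1, hbar = sp.symbols('q1 p1 hbar', real=True)
c = sp.Symbol('c')

# The representative (47): <q'|p'> = c' e^{i p' q'/hbar}.
rep = c * sp.exp(sp.I * p1 * q1 / hbar)

# It satisfies p' <q'|p'> = -i*hbar d<q'|p'>/dq'.
assert sp.simplify(-sp.I * hbar * sp.diff(rep, q1) - p1 * rep) == 0
```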
The representative (q' Ip') does not satisfy the boundary conditions
of vanishing at q' = -&o. This gives rise to some d.ifEculty, which
shows itself up most directly in the failure of the orthogonality
theorem. If we take a second eigenket |p''⟩ of p with representative

⟨q'|p''⟩ = c'' e^{iq'p''/ℏ},

belonging to a different eigenvalue p'', we shall have

⟨p'|p''⟩ = c̄'c'' ∫_{−∞}^{∞} e^{iq'(p''−p')/ℏ} dq'.   (48)

This integral does not converge according to the usual definition of
convergence. To bring the theory into order, we adopt a new definition
of convergence of an integral whose domain extends to infinity,
analogous to the Cesàro definition of the sum of an infinite series.
With this new definition, an integral whose value to the upper limit
q' is of the form cos aq' or sin aq', with a a real number not zero, is
counted as zero when q' tends to infinity, i.e. we take the mean value
of the oscillations, and similarly for the lower limit of q' tending to
minus infinity. This makes the right-hand side of (48) vanish for
p'' ≠ p', so that the orthogonality theorem is restored. Also it makes
the right-hand sides of (13) and (14) equal when the bra and the ket
concerned are eigenvectors of p, so that eigenvectors of p become
permissible vectors to use with the operator d/dq. Thus the boundary
conditions that the representative of a permissible bra or ket has to
satisfy become extended to allow the representative to oscillate like
cos aq' or sin aq' as q' goes to infinity or minus infinity.
For p'' very close to p', the right-hand side of (48) involves a δ
function. To evaluate it, we need the formula

∫_{−∞}^{∞} e^{iax} dx = 2πδ(a)   (49)

for real a, which may be proved as follows. The formula evidently
holds for a different from zero, as both sides are then zero. Further
we have, for any continuous function f(a),

∫_{−∞}^{∞} f(a) da ∫_{−g}^{g} e^{iax} dx = ∫_{−∞}^{∞} f(a) da 2a^{−1} sin ag = 2πf(0)

in the limit when g tends to infinity. A more complicated argument
shows that we get the same result if instead of the limits g and −g
we put g_1 and −g_2, and then let g_1 and g_2 tend to infinity in different
ways (not too widely different). This shows the equivalence of both
sides of (49) as factors in an integrand, which proves the formula.
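The limiting step in this proof can be checked numerically. A minimal sketch (illustrative choices, not from the book: f(a) = e^{−a²} so that f(0) = 1, cut-off g = 50, and a finite a-grid):

```python
import numpy as np

# Evaluate the inner integral 2*sin(a*g)/a against f(a) = exp(-a^2);
# as g grows the result should approach 2*pi*f(0) = 2*pi.
g = 50.0
a = np.linspace(-10.0, 10.0, 200001)
da = a[1] - a[0]
f = np.exp(-a**2)
kernel = 2 * g * np.sinc(a * g / np.pi)   # equals 2*sin(a*g)/a, finite at a = 0
integral = np.sum(f * kernel) * da        # simple Riemann sum
print(integral, 2 * np.pi)
```

With this smooth f the agreement with 2πf(0) is already very close at g = 50.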
With the help of (49), (48) becomes

⟨p'|p''⟩ = c̄'c'' 2πδ[(p'−p'')/ℏ] = c̄'c'' h δ(p'−p'')
         = |c'|^2 h δ(p'−p'').   (50)

We have obtained an eigenket of p belonging to any real eigenvalue
p', its representative being given by (47). Any ket |X⟩ can be expanded
in terms of these eigenkets of p, since its representative
⟨q'|X⟩ can be expanded in terms of the representatives (47) by
Fourier analysis. It follows that the momentum p is an observable,
in agreement with the experimental result that momenta can be
observed.
A symmetry now appears between q and p. Each of them is an
observable with eigenvalues extending from −∞ to ∞, and the
commutation relation connecting q and p, equation (10), remains
invariant if we interchange q and p and write −i for i. We have set
up a representation in which q is diagonal and p = −iℏ d/dq. It
follows from the symmetry that we can also set up a representation
in which p is diagonal and

q = iℏ d/dp,   (51)

the operator d/dp being defined by a procedure similar to that used
for d/dq. This representation will be called the momentum representation.
It is less useful than the previous Schrödinger representation
because, while the Schrödinger representation enables one to express
as an operator of differentiation any function of q and p that is a
power series in p, the momentum representation enables one so to
express any function of q and p that is a power series in q, and the
important quantities in dynamics are almost always power series in
p but are often not power series in q. All the same the momentum
representation is of value for certain problems (see § 50).
Let us calculate the transformation function ⟨q'|p'⟩ connecting the
two representations. The basic kets |p'⟩ of the momentum representation
are eigenkets of p and their Schrödinger representatives ⟨q'|p'⟩
are given by (47) with the coefficients c' suitably chosen. The phase
factors of these basic kets must be chosen so as to make (51) hold.
The easiest way to bring in this condition is to use the symmetry
between q and p referred to above, according to which ⟨q'|p'⟩ must
go over into ⟨p'|q'⟩ if we interchange q' and p' and write −i for i.
Now ⟨q'|p'⟩ is equal to the right-hand side of (47) and ⟨p'|q'⟩ to the
conjugate complex expression, and hence c' must be independent of
p'. Thus c' is just a number c. Further, we must have

⟨p'|p''⟩ = δ(p'−p''),

which shows, on comparison with (50), that |c| = h^{−1/2}. We can choose
the arbitrary constant phase factor in either representation so as to
make c = h^{−1/2}, and we then get

⟨q'|p'⟩ = h^{−1/2} e^{iq'p'/ℏ}   (52)

for the transformation function.
The foregoing work may easily be generalized to a system with
n degrees of freedom, describable in terms of n q's and p's, with the
eigenvalues of each q running from −∞ to ∞. Each p will then be
an observable with eigenvalues running from −∞ to ∞, and there
will be symmetry between the set of q's and the set of p's, the
commutation relations remaining invariant if we interchange each q_r
with the corresponding p_r and write −i for i. A momentum representation
can be set up in which the p's are diagonal and each

q_r = iℏ ∂/∂p_r.   (53)

The transformation function connecting it with the Schrödinger
representation will be given by the product of the transformation
functions for each degree of freedom separately, as is shown by
formula (67) of § 20, and will thus be

⟨q_1'q_2'...q_n'|p_1'p_2'...p_n'⟩ = ⟨q_1'|p_1'⟩⟨q_2'|p_2'⟩...⟨q_n'|p_n'⟩
 = h^{−n/2} e^{i(p_1'q_1'+p_2'q_2'+...+p_n'q_n')/ℏ}.   (54)
24. Heisenberg's principle of uncertainty
For a system with one degree of freedom, the Schrödinger and the
momentum representatives of a ket |X⟩ are connected by

⟨p'|X⟩ = h^{−1/2} ∫_{−∞}^{∞} e^{−iq'p'/ℏ} dq' ⟨q'|X⟩,
⟨q'|X⟩ = h^{−1/2} ∫_{−∞}^{∞} e^{iq'p'/ℏ} dp' ⟨p'|X⟩.   (55)

These formulas have an elementary significance. They show that
either of the representatives is given, apart from numerical coefficients,
by the amplitudes of the Fourier components of the other.
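A numerical sketch of this Fourier relation (not from the book; units with ℏ = 1, so that h^{−1/2} becomes (2π)^{−1/2}, and the packet parameters are illustrative choices): the momentum representative of a wave packet e^{−q²/2} e^{i k0 q}, computed from the first of equations (55), peaks at p' = k0.

```python
import numpy as np

# First of equations (55) with hbar = 1:
# <p'|X> = (2*pi)^(-1/2) * integral of exp(-i*q*p') <q|X> dq
N = 2048
L = 40.0
dq = L / N
q = (np.arange(N) - N // 2) * dq
k0 = 5.0
psi_q = np.exp(-q**2 / 2) * np.exp(1j * k0 * q)   # wave packet, carrier k0

p = np.linspace(0.0, 10.0, 501)
psi_p = np.array([np.sum(np.exp(-1j * q * pp) * psi_q) * dq for pp in p])
psi_p /= np.sqrt(2 * np.pi)

peak = p[np.argmax(np.abs(psi_p))]
print(peak)          # near the carrier momentum k0 = 5.0
```

The momentum representative comes out as a Gaussian centred on the carrier momentum, exactly the Fourier-amplitude picture described above.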
It is interesting to apply (55) to a ket whose Schrödinger representative
consists of what is called a wave packet. This is a function
whose value is very small everywhere outside a certain domain, of
width Δq' say, and inside this domain is approximately periodic with
a definite frequency.† If a Fourier analysis is made of such a wave
packet, the amplitude of all the Fourier components will be small,
except those in the neighbourhood of the definite frequency. The
components whose amplitudes are not small will fill up a frequency
band whose width is of the order 1/Δq', since two components whose
frequencies differ by this amount, if in phase in the middle of the
domain Δq', will be just out of phase and interfering at the ends of
this domain. Now in the first of equations (55) the variable
(2π)^{−1}p'/ℏ = p'/h plays the part of frequency. Thus with ⟨q'|X⟩ of the
form of a wave packet, the function ⟨p'|X⟩, being composed of the
amplitudes of the Fourier components of the wave packet, will be
small everywhere in the p'-space outside a certain domain of width

Δp' = h/Δq'.
Let us now apply the physical interpretation of the square of the
modulus of the representative of a ket as a probability. We find that
our wave packet represents a state for which a measurement of q is
almost certain to lead to a result lying in a domain of width Δq' and
a measurement of p is almost certain to lead to a result lying in a
domain of width Δp'. We may say that for this state q has a definite
value with an error of order Δq' and p has a definite value with an
error of order Δp'. The product of these two errors is

Δq'Δp' = h.   (56)

Thus the more accurately one of the variables q, p has a definite
value, the less accurately the other has a definite value. For a system
with several degrees of freedom, equation (56) applies to each degree
of freedom separately.
Equation (56) is known as Heisenberg's Principle of Uncertainty.
It shows clearly the limitations in the possibility of simultaneously
assigning numerical values, for any particular state, to two non-commuting
observables, when those observables are a canonical coordinate
and momentum, and provides a plain illustration of how
observations in quantum mechanics may be incompatible. It also
shows how classical mechanics, which assumes that numerical values
can be assigned simultaneously to all observables, may be a valid
approximation when h can be considered as small enough to be

† Frequency here means reciprocal of wave-length.
negligible. Equation (56) holds only in the most favourable case,
which occurs when the representative of the state is of the form of a
wave packet. Other forms of representative would lead to a Δq' and
Δp' whose product is larger than h.
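For the Gaussian case this can be made quantitative. The sketch below (illustrative parameters, not from the book, with ℏ = 1) computes the spreads of |⟨q'|X⟩|² and |⟨p'|X⟩|² for a Gaussian packet and finds σ_q σ_p = 1/2, i.e. of the order of ℏ, consistent with the order-of-magnitude relation (56):

```python
import numpy as np

# Spreads of a Gaussian packet in the q- and p-representations (hbar = 1).
N = 4096
L = 80.0
dq = L / N
q = (np.arange(N) - N // 2) * dq
sigma = 1.7                                   # illustrative packet width
psi_q = np.exp(-q**2 / (4 * sigma**2))

p = np.linspace(-8.0, 8.0, 2001)
psi_p = np.array([np.sum(np.exp(-1j * q * pp) * psi_q) * dq for pp in p])

def spread(x, weight):
    w = weight / np.sum(weight)
    m = np.sum(w * x)
    return np.sqrt(np.sum(w * (x - m)**2))

sq = spread(q, np.abs(psi_q)**2)
sp = spread(p, np.abs(psi_p)**2)
print(sq * sp)                                # ~0.5, i.e. hbar/2
```

Changing sigma trades width in q against width in p while leaving the product fixed, which is the content of the uncertainty relation for this most favourable case.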
Heisenberg's principle of uncertainty shows that, in the limit when
either q or p is completely determined, the other is completely
undetermined. This result can also be obtained directly from the
transformation function ⟨q'|p'⟩. According to the end of § 18,
|⟨q'|p'⟩|² dq' is proportional to the probability of q having a value in
the small range from q' to q'+dq' for the state for which p certainly
has the value p', and from (52) this probability is independent of q'
for a given dq'. Thus if p certainly has a definite value p', all values
of q are equally probable. Similarly, if q certainly has a definite value
q', all values of p are equally probable.
It is evident physically that a state for which all values of q are
equally probable, or one for which all values of p are equally probable,
cannot be attained in practice, in the first case because of limitations
of size and in the second because of limitations of energy. Thus an
eigenstate of p or an eigenstate of q cannot be attained in practice.
The argument at the end of § 12 already showed that such eigenstates
are unattainable, because of the infinite precision that would be
needed to set them up, and we now have another argument leading
to the same conclusion.
25. Displacement operators
We get a new insight into the meaning of some of the quantum conditions
by making a study of displacement operators. These appear
in the theory when we take into consideration that the scheme of
relations between states and dynamical variables given in Chapter II
is essentially a physical scheme, so that if certain states and dynamical
variables are connected by some relation, on our displacing them all
in a definite way (for example, displacing them all through a distance
δx in the direction of the x-axis of Cartesian coordinates), the new
states and dynamical variables would have to be connected by the
same relation.
The displacement of a state or observable is a perfectly definite
process physically. Thus to displace a state or observable through a
distance δx in the direction of the x-axis, we should merely have to
displace all the apparatus used in preparing the state, or all the
apparatus required to measure the observable, through the distance
δx in the direction of the x-axis, and the displaced apparatus would
define the displaced state or observable. The displacement of a
dynamical variable must be just as definite as the displacement of
an observable, because of the close mathematical connexion between
dynamical variables and observables. A displaced state or dynamical
variable is uniquely determined by the undisplaced state or dynamical
variable together with the direction and magnitude of the displacement.
The displacement of a ket vector is not such a definite thing though.
If we take a certain ket vector, it will represent a certain state and we
may displace this state and get a perfectly definite new state, but this
new state will not determine our displaced ket, but only the direction
of our displaced ket. We help to fix our displaced ket by requiring
that it shall have the same length as the undisplaced ket, but even
then it is not completely determined, but can still be multiplied by
an arbitrary phase factor. One would think at first sight that each
ket one displaces would have a different arbitrary phase factor,
but with the help of the following argument, we see that it must be
the same for them all. We make use of the law that superposition
relationships between states remain invariant under the displacement.
A superposition relationship between states is expressed
mathematically by a linear equation between the kets corresponding
to those states, for example

|R⟩ = c_1|A⟩+c_2|B⟩,   (57)

where c_1 and c_2 are numbers, and the invariance of the superposition
relationship requires that the displaced states correspond to kets
with the same linear equation between them; in our example they
would correspond to |Rd⟩, |Ad⟩, |Bd⟩ say, satisfying

|Rd⟩ = c_1|Ad⟩+c_2|Bd⟩.   (58)

We take these kets to be our displaced kets, rather than these kets
multiplied by arbitrary independent phase factors, which latter
kets would satisfy a linear equation with coefficients differing from c_1, c_2.
The only arbitrariness now left in the displaced kets is that of a single
arbitrary phase factor to be multiplied into all of them.
The condition that linear equations between the kets remain invariant
under the displacement and that an equation such as (58)
holds whenever the corresponding (57) holds, means that the displaced
kets are linear functions of the undisplaced kets and thus each
displaced ket |Pd⟩ is the result of some linear operator applied to the
corresponding undisplaced ket |P⟩. In symbols,

|Pd⟩ = D|P⟩,   (59)

where D is a linear operator independent of |P⟩ and depending only
on the displacement. The arbitrary phase factor by which all the
displaced kets may be multiplied results in D being undetermined
to the extent of an arbitrary numerical factor of modulus unity.
With the displacement of kets made definite in the above manner
and the displacement of bras, of course, made equally definite,
through their being the conjugate imaginaries of the kets, we can
now assert that any symbolic equation between kets, bras, and
dynamical variables must remain invariant under the displacement
of every symbol occurring in it, on account of such an equation
having some physical significance which will not get changed by the
displacement.
Take as an example the equation

⟨Q|P⟩ = c,

c being a number. Then we must have

⟨Qd|Pd⟩ = c = ⟨Q|P⟩.   (60)

From the conjugate imaginary of (59) with Q instead of P,

⟨Qd| = ⟨Q|D̄.   (61)

Hence (60) gives ⟨Q|D̄D|P⟩ = ⟨Q|P⟩.
Since this holds for arbitrary ⟨Q| and |P⟩, we must have

D̄D = 1,   (62)

giving us a general condition which D has to satisfy.
Take as a second example the equation

v|P⟩ = |R⟩,

where v is any dynamical variable. Then, using v_d to denote the
displaced dynamical variable, we must have

v_d|Pd⟩ = |Rd⟩.

With the help of (59) we get

v_d|Pd⟩ = D|R⟩ = Dv|P⟩ = DvD^{−1}|Pd⟩.

Since |Pd⟩ can be any ket, we must have

v_d = DvD^{−1},   (63)
which shows that the linear operator D determines the displacement
of dynamical variables as well as that of kets and bras. Note that
the arbitrary numerical factor of modulus unity in D does not affect
v_d, and also it does not affect the validity of (62).
Let us now pass to an infinitesimal displacement, i.e. taking the
displacement through the distance δx in the direction of the x-axis,
let us make δx → 0. From physical continuity we should expect
a displaced ket |Pd⟩ to tend to the original |P⟩ and we may further
expect the limit

lim_{δx→0} (|Pd⟩−|P⟩)/δx = lim_{δx→0} ((D−1)/δx)|P⟩

to exist. This requires that the limit

lim_{δx→0} (D−1)/δx   (64)

shall exist. This limit is a linear operator which we shall call the
displacement operator for the x-direction and denote by d_x. The
arbitrary numerical factor e^{iγ} with γ real which we may multiply
into D must be made to tend to unity as δx → 0 and then introduces
an arbitrariness in d_x, namely, d_x may be replaced by

lim_{δx→0} (De^{iγ}−1)/δx = lim_{δx→0} (D−1+iγ)/δx = d_x+ia_x,

where a_x is the limit of γ/δx. Thus d_x contains an arbitrary additive
pure imaginary number.
For δx small

D = 1+δx d_x.   (65)

Substituting this into (62), we get

(1+δx d̄_x)(1+δx d_x) = 1,

which reduces, with neglect of δx^2, to

δx(d̄_x+d_x) = 0.

Thus d_x is a pure imaginary linear operator. Substituting (65) into
(63) we get, with neglect of δx^2 again,

v_d = (1+δx d_x)v(1−δx d_x) = v+δx(d_x v−v d_x),   (66)

showing that

lim_{δx→0} (v_d−v)/δx = d_x v−v d_x.   (67)

We may describe any dynamical system in terms of the following
dynamical variables: the Cartesian coordinates x, y, z of the centre of
mass of the system, the components p_x, p_y, p_z of the total momentum
of the system, which are the canonical momenta conjugate to x, y, z
respectively, and any dynamical variables needed for describing
internal degrees of freedom of the system. If we suppose a piece
of apparatus which has been set up to measure x, to be displaced a
distance δx in the direction of the x-axis, it will measure x−δx, hence

x_d = x−δx.

Comparing this with (66) for v = x, we obtain

d_x x−x d_x = −1.   (68)

This is the quantum condition connecting d_x with x. From similar
arguments we find that y, z, p_x, p_y, p_z, and the internal dynamical
variables, which are unaffected by the displacement, must commute with
d_x. Comparing these results with (9), we see that iℏ d_x satisfies just
the same quantum conditions as p_x. Their difference, p_x−iℏ d_x,
commutes with all the dynamical variables and must therefore be a
number. This number, which is necessarily real since p_x and iℏ d_x are
both real, may be made zero by a suitable choice of the arbitrary
pure imaginary number that can be added to d_x. We then have the
result

p_x = iℏ d_x,   (69)

or the x-component of the total momentum of the system is iℏ times the
displacement operator d_x.
This is a fundamental result, which gives a new significance to
displacement operators. There is a corresponding result, of course,
also for the y and z displacement operators d_y and d_z. The quantum
conditions which state that p_x, p_y, and p_z commute with each other
are now seen to be connected with the fact that displacements in
different directions are commutable operations.
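Equation (69) identifies p_x as iℏ times the generator of displacements, so a finite displacement through dx is effected by exp(−i dx p/ℏ). A minimal numerical sketch (ℏ = 1; the grid and the packet are illustrative choices, not from the book), applying this operator in the momentum representation via the FFT:

```python
import numpy as np

# Shift a sampled packet by dx using the displacement generator p (hbar = 1):
# in the momentum representation exp(-i*dx*p/hbar) is just a phase factor.
N = 1024
L = 40.0
dq = L / N
q = (np.arange(N) - N // 2) * dq
psi = np.exp(-(q - 1.0)**2)                  # packet centred at q = 1
dx = 3.0

k = 2 * np.pi * np.fft.fftfreq(N, d=dq)      # momentum grid of the FFT
shifted = np.fft.ifft(np.exp(-1j * k * dx) * np.fft.fft(psi))

centre = q[np.argmax(np.abs(shifted))]
print(centre)                                # near 1 + dx = 4
```

The displaced packet is the original translated bodily through dx, which is exactly what the displaced apparatus of the text would prepare.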
26. Unitary transformations
Let U be any linear operator that has a reciprocal U^{−1} and consider
the equation

α* = UαU^{−1},   (70)

α being an arbitrary linear operator. This equation may be regarded
as expressing a transformation from any linear operator α to a
corresponding linear operator α*, and as such it has rather remarkable
properties. In the first place it should be noted that each α* has the
same eigenvalues as the corresponding α; since, if α' is any eigenvalue
of α and |α'⟩ is an eigenket belonging to it, we have

α|α'⟩ = α'|α'⟩

and hence

α*U|α'⟩ = UαU^{−1}U|α'⟩ = Uα|α'⟩ = α'U|α'⟩,
showing that U|α'⟩ is an eigenket of α* belonging to the same eigenvalue
α', and similarly any eigenvalue of α* may be shown to be also
an eigenvalue of α. Further, if we take several α's that are connected
by algebraic equations and transform them all according to (70), the
corresponding α*'s will be connected by the same algebraic equations.
This result follows from the fact that the fundamental algebraic processes
of addition and multiplication are left invariant by the transformation
(70), as is shown by the following equations:

(α_1+α_2)* = U(α_1+α_2)U^{−1} = Uα_1U^{−1}+Uα_2U^{−1} = α_1*+α_2*,
(α_1α_2)* = Uα_1α_2U^{−1} = Uα_1U^{−1}Uα_2U^{−1} = α_1*α_2*.
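These two properties of the transformation (70) are easy to verify numerically. A sketch with random matrices (illustrative, not from the book; the unitary U is taken from a QR decomposition, so that U^{−1} is its conjugate transpose):

```python
import numpy as np

# alpha* = U alpha U^{-1} with unitary U: same eigenvalues, and (AB)* = A* B*.
rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

def star(M):
    return Q @ M @ Q.conj().T            # U^{-1} is the conjugate transpose

eig = np.sort_complex(np.linalg.eigvals(A))
eig_star = np.sort_complex(np.linalg.eigvals(star(A)))
eig_err = np.max(np.abs(eig - eig_star))
prod_err = np.max(np.abs(star(A @ B) - star(A) @ star(B)))
print(eig_err, prod_err)                 # both ~0
```

Both residuals vanish to rounding error: the spectrum is preserved and products transform into products.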
Let us now see what condition would be imposed on U by the
requirement that any real α transforms into a real α*. Equation
(70) may be written

α*U = Uα.   (71)

Taking the conjugate complex of both sides in accordance with
(5) of § 8 we find, if α and α* are both real,

Ūα* = αŪ.   (72)

Equation (71) gives us

Ūα*U = ŪUα,

and equation (72) gives us

Ūα*U = αŪU.

Hence

ŪUα = αŪU.

Thus ŪU commutes with any real linear operator and therefore also
with any linear operator whatever, since any linear operator can be
expressed as one real one plus i times another. Hence ŪU is a
number. It is obviously real, its conjugate complex according to (5)
of § 8 being the same as itself, and further it must be a positive
number, since for any ket |P⟩, ⟨P|ŪU|P⟩ is positive as well as
⟨P|P⟩. We can suppose it to be unity without any loss of generality
in the transformation (70). We then have

ŪU = 1.   (73)

Equation (73) is equivalent to any of the following

U = Ū^{−1},   Ū = U^{−1},   UŪ = 1.   (74)

A matrix or linear operator U that satisfies (73) and (74) is said
to be unitary and a transformation (70) with unitary U is called a
unitary transformation. A unitary transformation transforms real
linear operators into real linear operators and leaves invariant any
algebraic equation between linear operators. It may be considered
as applying also to kets and bras, in accordance with the equations

|P*⟩ = U|P⟩,   ⟨P*| = ⟨P|Ū = ⟨P|U^{−1},   (75)

and then it leaves invariant any algebraic equation between linear
operators, kets, and bras. It transforms eigenvectors of α into eigenvectors
of α*. From this one can easily deduce that it transforms an
observable into an observable and that it leaves invariant any functional
relation between observables based on the general definition
of a function given in § 11.
The inverse of a unitary transformation is also a unitary transformation,
since from (74), if U is unitary, U^{−1} is also unitary.
Further, if two unitary transformations are applied in succession,
the result is a third unitary transformation, as may be verified in
the following way. Let the two unitary transformations be (70) and

α† = Vα*V^{−1}.

The connexion between α† and α is then

α† = VUαU^{−1}V^{−1} = (VU)α(VU)^{−1}   (76)

from (42) of § 11. Now VU is unitary, since its conjugate complex
is ŪV̄ and

ŪV̄VU = ŪU = 1,

and hence (76) is a unitary transformation.
The transformation given in the preceding section from undisplaced
to displaced quantities is an example of a unitary transformation, as
is shown by equations (62), (63), corresponding to equations (73),
(70), and equations (59), (61), corresponding to equations (75).
In classical mechanics one can make a transformation from the
canonical coordinates and momenta q_r, p_r (r = 1,..., n) to a new set of
variables q_r*, p_r* (r = 1,..., n) satisfying the same P.B. relations as the
q's and p's, i.e. equations (8) of § 21 with q*'s and p*'s replacing the
q's and p's, and can express all dynamical variables in terms of the q*'s
and p*'s. The q*'s and p*'s are then also called canonical coordinates
and momenta and the transformation is called a contact transformation.
One can easily verify that the P.B. of any two dynamical
variables u and v is correctly given by formula (1) of § 21 with q*'s and
p*'s instead of q's and p's, so that the P.B. relationship is invariant
under a contact transformation. This results in the new canonical
coordinates and momenta being on the same footing as the original
ones for many purposes of general dynamical theory, even though the
new coordinates q_r* may not be a set of Lagrangian coordinates but
may be functions of the Lagrangian coordinates and velocities.
It will now be shown that, for a quantum dynamical system that
has a classical analogue, unitary transformations in the quantum theory
are the analogue of contact transformations in the classical theory.
Unitary transformations are more general than contact transformations,
since the former can be applied to systems in quantum
mechanics that have no classical analogue, but for those systems in
quantum mechanics which are describable in terms of canonical
coordinates and momenta, the analogy between the two kinds of
transformation holds. To establish it, we note that a unitary transformation
applied to the quantum variables q_r, p_r gives new variables
q_r*, p_r* satisfying the same P.B. relations, since the P.B. relations are
equivalent to the algebraic relations (9) of § 21 and algebraic relations
are left invariant by a unitary transformation. Conversely, any real
variables q_r*, p_r* satisfying the P.B. relations for canonical coordinates
and momenta are connected with the q_r, p_r by a unitary transformation,
as is shown by the following argument.
We use the Schrödinger representation, and write the basic ket
|q_1'...q_n'⟩ as |q'⟩ for brevity. Since we are assuming that the q_r*, p_r*
satisfy the P.B. relations for canonical coordinates and momenta,
we can set up a Schrödinger representation referring to them, with
the q_r* diagonal and each p_r* equal to −iℏ ∂/∂q_r*. The basic kets in
this second Schrödinger representation will be |q_1*'...q_n*'⟩, which we
write |q*'⟩ for brevity. Now introduce the linear operator U defined by

⟨q*'|U|q'⟩ = δ(q*'−q'),   (77)

where δ(q*'−q') is short for

δ(q*'−q') = δ(q_1*'−q_1')δ(q_2*'−q_2')...δ(q_n*'−q_n').   (78)

The conjugate complex of (77) is

⟨q'|Ū|q*'⟩ = δ(q*'−q'),

and hence†

⟨q'|ŪU|q''⟩ = ∫ ⟨q'|Ū|q*'⟩ dq*' ⟨q*'|U|q''⟩
            = ∫ δ(q*'−q') dq*' δ(q*'−q'')
            = δ(q'−q''),

so that ŪU = 1.

† We use the notation of a single integral sign and dq*' to denote an integral over
all the variables q_1*', q_2*',..., q_n*'. This abbreviation will be used also in future work.
Thus U is a unitary operator. We have further

⟨q*'|q_r* U|q'⟩ = q_r*' δ(q*'−q')
and
⟨q*'|U q_r|q'⟩ = q_r' δ(q*'−q').

The right-hand sides of these two equations are equal on account of
the property of the δ function (11) of § 15, and hence

q_r* U = U q_r
or
q_r* = U q_r U^{−1}.

Again, from (45) and (46),

⟨q*'|p_r* U|q'⟩ = −iℏ (∂/∂q_r*') δ(q*'−q'),
⟨q*'|U p_r|q'⟩ = iℏ (∂/∂q_r') δ(q*'−q').

The right-hand sides of these two equations are obviously equal, and
hence

p_r* U = U p_r
or
p_r* = U p_r U^{−1}.
Thus all the conditions for a unitary transformation are verified.
We get an infinitesimal unitary transformation by taking U in (70)
to differ by an infinitesimal from unity. Put

U = 1+iεF,

where ε is infinitesimal, so that its square can be neglected. Then

U^{−1} = 1−iεF.

The unitary condition (73) or (74) requires that F shall be real. The
transformation equation (70) now takes the form

α* = (1+iεF)α(1−iεF),

which gives

α*−α = iε(Fα−αF).   (79)

It may be written in P.B. notation

α*−α = εℏ[α, F].   (80)

If α is a canonical coordinate or momentum, this is formally the same
as a classical infinitesimal contact transformation.
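A numerical check of the first-order formula (79) (illustrative, not from the book: a random Hermitian generator F plays the part of a real dynamical variable, with a small ε):

```python
import numpy as np

# alpha* - alpha should equal i*eps*(F alpha - alpha F) up to terms in eps^2.
rng = np.random.default_rng(1)
n = 5
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
F = (X + X.conj().T) / 2                     # Hermitian, i.e. "real", generator
alpha = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
eps = 1e-5

U = np.eye(n) + 1j * eps * F
alpha_star = U @ alpha @ np.linalg.inv(U)
first_order = 1j * eps * (F @ alpha - alpha @ F)
residual = np.max(np.abs(alpha_star - alpha - first_order))
print(residual)                              # of order eps^2
```

The residual scales as ε², confirming that the commutator term is the whole first-order change.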
V
THE EQUATIONS OF MOTION
27. Schrödinger's form for the equations of motion
OUR work from § 5 onwards has all been concerned with one instant
of time. It gave the general scheme of relations between states and
dynamical variables for a dynamical system at one instant of time.
To get a complete theory of dynamics we must consider also the
connexion between different instants of time. When one makes an
observation on the dynamical system, the state of the system gets
changed in an unpredictable way, but in between observations
causality applies, in quantum mechanics as in classical mechanics,
and the system is governed by equations of motion which make the
state at one time determine the state at a later time. These equations
of motion we now proceed to study. They will apply so long as the
dynamical system is left undisturbed by any observation or similar
process.† Their general form can be deduced from the principle of
superposition of Chapter I.
Let us consider a particular state of motion throughout the time
during which the system is left undisturbed. We shall have the state
at any time t corresponding to a certain ket which depends on t and
which may be written |t⟩. If we deal with several of these states of
motion we distinguish them by giving them labels such as A, and we
then write the ket which corresponds to the state at time t for one
of them |At⟩. The requirement that the state at one time determines
the state at another time means that |At_0⟩ determines |At⟩ except
for a numerical factor. The principle of superposition applies to these
states of motion throughout the time during which the system is
undisturbed, and means that if we take a superposition relation
holding for certain states at time t_0 and giving rise to a linear equation
between the corresponding kets, e.g. the equation

|Rt_0⟩ = c_1|At_0⟩+c_2|Bt_0⟩,

the same superposition relation must hold between the states of
motion throughout the time during which the system is undisturbed
and must lead to the same equation between the kets corresponding

† The preparation of a state is a process of this kind. It often takes the form of
making an observation and selecting the system when the result of the observation
turns out to be a certain pre-assigned number.
to these states at any time t (in the undisturbed time interval), i.e.
the equation

|Rt⟩ = c_1|At⟩+c_2|Bt⟩,

provided the arbitrary numerical factors by which these kets may be
multiplied are suitably chosen. It follows that the |Pt⟩'s are linear
functions of the |Pt_0⟩'s and each |Pt⟩ is the result of some linear
operator applied to |Pt_0⟩. In symbols

|Pt⟩ = T|Pt_0⟩,   (1)

where T is a linear operator independent of P and depending only
on t (and t_0).
We now assume that each |Pt⟩ has the same length as the corresponding
|Pt_0⟩. It is not necessarily possible to choose the arbitrary
numerical factors by which the |Pt⟩'s may be multiplied so as to
make this so without destroying the linear dependence of the |Pt⟩'s
on the |Pt_0⟩'s, so the new assumption is a physical one and not just
a question of notation. It involves a kind of sharpening of the
principle of superposition. The arbitrariness in |Pt⟩ now becomes
merely a phase factor, which must be independent of P in order that
the linear dependence of the |Pt⟩'s on the |Pt_0⟩'s may be preserved.
From the condition that the length of c_1|Pt⟩+c_2|Qt⟩ equals that of
c_1|Pt_0⟩+c_2|Qt_0⟩ for any complex numbers c_1, c_2, we can deduce that

⟨Qt|Pt⟩ = ⟨Qt_0|Pt_0⟩.   (2)

The connexion between the |Pt⟩'s and |Pt_0⟩'s is formally similar
to the connexion we had in § 25 between the displaced and undisplaced
kets, with a process of time displacement instead of the space displacement
of § 25. Equations (1) and (2) play the part of equations (59)
and (60) of § 25. We can develop the consequences of these equations
as in § 25 and can deduce that T contains an arbitrary numerical
factor of modulus unity and satisfies

T̄T = 1,   (3)

corresponding to (62) of § 25, so T is unitary. We pass to the infinitesimal
case by making t → t_0 and assume from physical continuity that
the limit

lim_{t→t_0} (|Pt⟩−|Pt_0⟩)/(t−t_0)

exists. This limit is just the derivative of |Pt_0⟩ with respect to t_0.
From (1) it equals

d|Pt_0⟩/dt_0 = lim_{t→t_0} ((T−1)/(t−t_0)) |Pt_0⟩.   (4)
The limit Operator occurring here is, like (64) of $25, a pure imaginary
linear Operator and is undetermined to the extent of an arbitrary
additive pure imaginary number. Putting this limit Operator multiplied
by i6 equal to H, or rather H(t,) since it may depend on t,,
equation (4) becomes, when written for a general t,
&4po
- = lqt>p>.
dt (5)
Equation (5) gives the general law for the variation with time of
the ket corresponding to the state at any time. It is Schroedinger's
form for the equations of motion. It involves just one real linear
operator H(t), which must be characteristic of the dynamical system
under consideration. We assume that H(t) is the total energy of
the system. There are two justifications for this assumption, (i) the
analogy with classical mechanics, which will be developed in the
next section, and (ii) we have H(t) appearing as iℏ times an operator
of displacement in time similar to the operators of displacement in
the x, y, and z directions of § 25, so corresponding to (69) of § 25
we should have H(t) equal to the total energy, since the theory of
relativity puts energy in the same relation to time as momentum to
distance.
We assume on physical grounds that the total energy of a system
is always an observable. For an isolated system it is a constant, and
may then be written H. Even when it is not a constant we shall often
write it simply H, leaving its dependence on t understood. If the
energy depends on t, it means the system is acted on by external
forces. An action of this kind is to be distinguished from a disturbance
caused by a process of observation, as the former is compatible
with causality and equations of motion while the latter is not.
We can get a connexion between H(t) and the T of equation (1)
by substituting for |Pt⟩ in (5) its value given by equation (1). This
gives
iℏ (dT/dt)|Pt₀⟩ = H(t)T|Pt₀⟩.
Since |Pt₀⟩ may be any ket, we have
iℏ dT/dt = H(t)T. (6)
Equation (5) is very important for practical problems, where it is
usually used in conjunction with a representation. Introducing a
representation with a complete set of commuting observables ξ
diagonal and putting ⟨ξ′|Pt⟩ equal to ψ(ξ′t), we have, passing to the
standard ket notation,
|Pt⟩ = ψ(ξt)⟩.
Equation (5) now becomes
iℏ (d/dt)ψ(ξt)⟩ = H ψ(ξt)⟩. (7)
Equation (7) is known as Schroedinger's wave equation and its solutions
ψ(ξt) are time-dependent wave functions. Each solution corresponds to
a state of motion of the system and the square of its modulus gives
the probability of the ξ's having specified values at any time t. For
a system describable in terms of canonical coordinates and momenta
we may use Schroedinger's representation and can then take H to be
an operator of differentiation in accordance with (42) of § 22.
28. Heisenberg's form for the equations of motion
In the preceding section we set up a picture of the states of
undisturbed motion by making each of them correspond to a moving
ket, the state at any time corresponding to the ket at that time. We
shall call this the Schroedinger picture. Let us apply to our kets the
unitary transformation which makes each ket |a⟩ go over into
|a*⟩ = T⁻¹|a⟩. (8)
This transformation is of the form given by (75) of § 26 with T⁻¹ for
U, but it depends on the time t since T depends on t. It is thus to be
pictured as the application of a continuous motion (consisting of
rotations and uniform deformations) to the whole ket vector space.
A ket which is originally fixed becomes a moving one, its motion being
given by (8) with |a⟩ independent of t. On the other hand, a ket
which is originally moving to correspond to a state of undisturbed
motion, i.e. in accordance with equation (1), becomes fixed, since on
substituting |Pt⟩ for |a⟩ in (8) we get |a*⟩ independent of t. Thus
the transformation brings the kets corresponding to states of undisturbed
motion to rest.
The unitary transformation must be applied also to bras and linear
operators, in order that equations between the various quantities may
remain invariant. The transformation applied to bras is given by the
conjugate imaginary of (8) and applied to linear operators it is given
by (70) of § 26 with T⁻¹ for U, i.e.
α* = T⁻¹αT. (9)
A linear operator which is originally fixed transforms into a moving
linear operator in general. Now a dynamical variable corresponds to
a linear operator which is originally fixed (because it does not refer
to t at all), so after the transformation it corresponds to a moving
linear operator. The transformation thus leads us to a new picture
of the motion, in which the states correspond to fixed vectors and
the dynamical variables to moving linear operators. We shall call
this the Heisenberg picture.
The physical condition of the dynamical system at any time
involves the relation of the dynamical variables to the state, and
the change of the physical condition with time may be ascribed
either to a change in the state, with the dynamical variables kept
fixed, which gives us the Schroedinger picture, or to a change in the
dynamical variables, with the state kept fixed, which gives us the
Heisenberg picture.
In the Heisenberg picture there are equations of motion for the
dynamical variables. Take a dynamical variable corresponding to
the fixed linear operator v in the Schroedinger picture. In the Heisenberg
picture it corresponds to a moving linear operator, which we
write as v_t instead of v*, to bring out its dependence on t, and which
is given by
v_t = T⁻¹vT (10)
or Tv_t = vT.
Differentiating with respect to t, we get
(dT/dt)v_t + T(dv_t/dt) = v(dT/dt).
With the help of (6), this gives
HTv_t + iℏT(dv_t/dt) = vHT,
or
iℏ dv_t/dt = T⁻¹vHT − T⁻¹HTv_t
= v_t H_t − H_t v_t, (11)
where H_t = T⁻¹HT. (12)
Equation (11) may be written in P.B. notation
dv_t/dt = [v_t, H_t]. (13)
Equation (11) or (13) shows how any dynamical variable varies
with time in the Heisenberg picture and gives us Heisenberg's form
for the equations of motion. These equations of motion are determined
by the one linear operator H_t, which is just the transform of the linear
operator H occurring in Schroedinger's form for the equations of
motion and corresponds to the energy in the Heisenberg picture. We
shall call the dynamical variables in the Heisenberg picture, where
they vary with the time, Heisenberg dynamical variables, to distinguish
them from the fixed dynamical variables of the Schroedinger picture,
which we shall call Schroedinger dynamical variables. Each Heisenberg
dynamical variable is connected with the corresponding Schroedinger
dynamical variable by equation (10). Since this connexion is a unitary
transformation, all algebraic and functional relationships are the
same for both kinds of dynamical variable. We have T = 1 for
t = t₀, so that v_{t₀} = v and any Heisenberg dynamical variable at time
t₀ equals the corresponding Schroedinger dynamical variable.
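The equivalence of the two pictures can be illustrated numerically: the sketch below (assumed natural units ℏ = 1, arbitrarily chosen Hermitian H and v, t₀ = 0) checks that ⟨Pt|v|Pt⟩ in the Schroedinger picture equals ⟨Pt₀|v_t|Pt₀⟩ with v_t given by equation (10):

```python
import numpy as np

hbar = 1.0  # natural units, an assumption for the illustration

# Arbitrary Hermitian Hamiltonian H and observable v.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (A + A.conj().T) / 2
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
v = (B + B.conj().T) / 2

def T_of(t):
    # T = e^{-iHt/hbar} for constant H, taking t0 = 0.
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t / hbar)) @ V.conj().T

t = 1.7
T = T_of(t)
ket0 = np.array([1.0, 0.0, 0.0], dtype=complex)   # |Pt0>

# Schroedinger picture: the ket moves, v stays fixed.
ket_t = T @ ket0
exp_schroedinger = (ket_t.conj() @ v @ ket_t).real

# Heisenberg picture: the ket stays fixed, v_t = T^{-1} v T moves, eq. (10).
v_t = T.conj().T @ v @ T
exp_heisenberg = (ket0.conj() @ v_t @ ket0).real

picture_gap = abs(exp_schroedinger - exp_heisenberg)
```

The two expectation values agree to machine precision, and at t = t₀ the propagator reduces to the unit matrix, so v_{t₀} = v as stated above.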
Equation (13) can be compared with classical mechanics, where we
also have dynamical variables varying with the time. The equations
of motion of classical mechanics can be written in the Hamiltonian
form
dq_r/dt = ∂H/∂p_r,   dp_r/dt = −∂H/∂q_r, (14)
where the q's and p's are a set of canonical coordinates and momenta
and H is the energy expressed as a function of them and possibly also
of t. The energy expressed in this way is called the Hamiltonian.
Equations (14) give, for v any function of the q's and p's that does
not contain the time t explicitly,
dv/dt = Σ_r {(∂v/∂q_r)(dq_r/dt) + (∂v/∂p_r)(dp_r/dt)}
= Σ_r {(∂v/∂q_r)(∂H/∂p_r) − (∂v/∂p_r)(∂H/∂q_r)}
= [v, H], (15)
with the classical definition of a P.B., equation (1) of § 21. This is
of the same form as equation (13) in the quantum theory. We thus
get an analogy between the classical equations of motion in the
Hamiltonian form and the quantum equations of motion in Heisenberg's
form. This analogy provides a justification for the assumption
that the linear operator H introduced in the preceding section is the
energy of the system in quantum mechanics.
In classical mechanics a dynamical system is defined mathematically
when the Hamiltonian is given, i.e. when the energy is given
in terms of a set of canonical coordinates and momenta, as this is
sufficient to fix the equations of motion. In quantum mechanics a
dynamical system is defined mathematically when the energy is
given in terms of dynamical variables whose commutation relations
are known, as this is then sufficient to fix the equations of motion,
in both Schroedinger's and Heisenberg's form. We need to have
either H expressed in terms of the Schroedinger dynamical variables
or H_t expressed in terms of the corresponding Heisenberg dynamical
variables, the functional relationship being, of course, the same in
both cases. We call the energy expressed in this way the Hamiltonian
of the dynamical system in quantum mechanics, to keep up the
analogy with the classical theory.
A system in quantum mechanics always has a Hamiltonian, whether
the system is one that has a classical analogue and is describable in
terms of canonical coordinates and momenta or not. However, if the
system does have a classical analogue, its connexion with classical
mechanics is specially close and one can usually assume that the
Hamiltonian is the same function of the canonical coordinates and
momenta in the quantum theory as in the classical theory.† There
would be a difficulty in this, of course, if the classical Hamiltonian
involved a product of factors whose quantum analogues do not commute,
as one would not know in which order to put these factors in
the quantum Hamiltonian, but this does not happen for most of the
elementary dynamical systems whose study is important for atomic
physics. In consequence we are able also largely to use the same
language for describing dynamical systems in the quantum theory as
in the classical theory (e.g. to talk about particles with given masses
moving through given fields of force), and when given a system in
classical mechanics, can usually give a meaning to 'the same' system
in quantum mechanics.
Equation (13) holds for v_t any function of the Heisenberg dynamical
variables not involving the time explicitly, i.e. for v any constant
† This assumption is found in practice to be successful only when applied with the
dynamical coordinates and momenta referring to a Cartesian system of axes and not
to more general curvilinear coordinates.
linear operator in the Schroedinger picture. It shows that such a
function v_t is constant if v commutes with H. We then have
v_t = v,
and we call v_t or v a constant of the motion. It is necessary that v shall
commute with H at all times, which is usually possible only if H is
constant. In this case we can substitute H for v in (13) and deduce
that H_t is constant, showing that H itself is then a constant of the
motion. Thus if the Hamiltonian is constant in the Schroedinger
picture, it is also constant in the Heisenberg picture.
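A constant of the motion can be exhibited numerically by taking for v a polynomial in H, which certainly commutes with H; the Heisenberg variable v_t of equation (10) then does not move at all (a sketch in assumed natural units ℏ = 1, with an arbitrary Hermitian H):

```python
import numpy as np

hbar = 1.0  # natural units, an assumption for the illustration

# A Hermitian H and an observable built as a polynomial in H, so that
# [v, H] = 0 and v is a constant of the motion.
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2
v = H @ H + 2.0 * H

w, V = np.linalg.eigh(H)
t = 3.2
T = V @ np.diag(np.exp(-1j * w * t / hbar)) @ V.conj().T  # T = e^{-iHt/hbar}

# Heisenberg variable at time t, eq. (10): unchanged because v commutes with H.
v_t = T.conj().T @ v @ T
drift = np.abs(v_t - v).max()
```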
For an isolated system, a system not acted on by any external
forces, there are always certain constants of the motion. One of these
is the total energy or Hamiltonian. Others are provided by the
displacement theory of § 25. It is evident physically that the total
energy must remain unchanged if all the dynamical variables are
displaced in a certain way, so equation (63) of § 25 must hold with
v_d = v = H. Thus D commutes with H and is a constant of the
motion. Passing to the case of an infinitesimal displacement, we see
that the displacement operators d_x, d_y, and d_z are constants of the
motion and hence, from (69) of § 25, the total momentum is a constant
of the motion. Again, the total energy must remain unchanged if all
the dynamical variables are subjected to a certain rotation. This
leads, as will be shown in § 35, to the result that the total angular
momentum is a constant of the motion. The laws of conservation of
energy, momentum, and angular momentum hold for an isolated system
in the Heisenberg picture in quantum mechanics, as they hold in
classical mechanics.
Two forms for the equations of motion of quantum mechanics have
now been given. Of these, the Schroedinger form is the more useful
one for practical problems, as it provides the simpler equations. The
unknowns in Schroedinger's wave equation are the numbers which
form the representative of a ket vector, while Heisenberg's equation
of motion for a dynamical variable, if expressed in terms of a representation,
would involve as unknowns the numbers forming the
representative of the dynamical variable. The latter are far more
numerous and therefore more difficult to evaluate than the Schroedinger
unknowns. Heisenberg's form for the equations of motion is
of value in providing an immediate analogy with classical mechanics
and enabling one to see how various features of classical theory, such
as the conservation laws referred to above, are translated into
quantum theory.
29. Stationary states
We shall here deal with a dynamical system whose energy is constant.
Certain specially simple relations hold for this case. Equation
(6) can be integrated† to give
T = e^{−iH(t−t₀)/ℏ},
with the help of the initial condition that T = 1 for t = t₀. This
result substituted into (1) gives
|Pt⟩ = e^{−iH(t−t₀)/ℏ}|Pt₀⟩, (16)
which is the integral of Schroedinger's equation of motion (5), and
substituted into (10) it gives
v_t = e^{iH(t−t₀)/ℏ} v e^{−iH(t−t₀)/ℏ}, (17)
which is the integral of Heisenberg's equation of motion (11), H_t being
now equal to H. Thus we have solutions of the equations of motion
in a simple form. However, these solutions are not of much practical
value, because of the difficulty involved in evaluating the operator
e^{−iH(t−t₀)/ℏ}, unless H is particularly simple, and for practical purposes
one usually has to fall back on Schroedinger's wave equation.
Let us consider a state of motion such that at time t₀ it is an eigenstate
of the energy. The ket |Pt₀⟩ corresponding to it at this time
must be an eigenket of H. If H′ is the eigenvalue to which it belongs,
equation (16) gives
|Pt⟩ = e^{−iH′(t−t₀)/ℏ}|Pt₀⟩,
showing that |Pt⟩ differs from |Pt₀⟩ only by a phase factor. Thus
the state always remains an eigenstate of the energy, and further, it
does not vary with the time at all, since the direction of the ket |Pt⟩
does not vary with the time. Such a state is called a stationary state.
The probability for any particular result of an observation on it is
independent of the time when the observation is made. From our
assumption that the energy is an observable, there are sufficient
stationary states for an arbitrary state to be dependent on them.
The time-dependent wave function ψ(ξt) representing a stationary
state of energy H′ will vary with time according to the law
ψ(ξt) = ψ₀(ξ)e^{−iH′t/ℏ}, (18)
† The integration can be carried out as though H were an ordinary algebraic
variable instead of a linear operator, because there is no quantity that does not
commute with H in the work.
and Schroedinger's wave equation (7) for it reduces to
Hψ₀ = H′ψ₀. (19)
This equation merely asserts that the state represented by ψ₀ is an
eigenstate of H. We call a function ψ₀ satisfying (19) an eigenfunction
of H, belonging to the eigenvalue H′.
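That a stationary state varies only by a phase factor, so that all observation probabilities are time-independent, is easy to confirm numerically (assumed natural units ℏ = 1, arbitrary Hermitian H, t₀ = 0):

```python
import numpy as np

hbar = 1.0  # natural units, an assumption for the illustration

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (A + A.conj().T) / 2
w, V = np.linalg.eigh(H)

psi0 = V[:, 0]               # an eigenket of H with eigenvalue H' = w[0]
t = 2.5
T = V @ np.diag(np.exp(-1j * w * t / hbar)) @ V.conj().T
psi_t = T @ psi0             # eq. (16) applied to an energy eigenket

# |Pt> should differ from |Pt0> only by the phase factor e^{-iH't/hbar}:
phase = np.exp(-1j * w[0] * t / hbar)
phase_error = np.abs(psi_t - phase * psi0).max()

# Probabilities of any observation are therefore time-independent:
prob_drift = np.abs(np.abs(psi_t) ** 2 - np.abs(psi0) ** 2).max()
```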
In the Heisenberg picture the stationary states correspond to fixed
eigenvectors of the energy. We can set up a representation in which
all the basic vectors are eigenvectors of the energy and so correspond
to stationary states in the Heisenberg picture. We call such a representation
a Heisenberg representation. The first form of quantum
mechanics, discovered by Heisenberg in 1925, was in terms of a
representation of this kind. The energy is diagonal in the representation.
Any other diagonal dynamical variable must commute with the
energy and is therefore a constant of the motion. The problem of
setting up a Heisenberg representation thus reduces to the problem
of finding a complete set of commuting observables, each of which
is a constant of the motion, and then making these observables
diagonal. The energy must be a function of these observables, from
Theorem 2 of § 19. It is sometimes convenient to take the energy
itself as one of them.
Let α denote the complete set of commuting observables in a
Heisenberg representation, so that the basic vectors are written ⟨α′|,
|α″⟩. The energy is a function of these observables α, say H = H(α).
From (17) we get
⟨α′|v_t|α″⟩ = ⟨α′|e^{iH(t−t₀)/ℏ} v e^{−iH(t−t₀)/ℏ}|α″⟩
= e^{i(H′−H″)(t−t₀)/ℏ} ⟨α′|v|α″⟩, (20)
where H′ = H(α′) and H″ = H(α″). The factor ⟨α′|v|α″⟩ on the right-hand
side here is independent of t, being an element of the matrix
representing the fixed linear operator v. Formula (20) shows how the
Heisenberg matrix elements of any Heisenberg dynamical variable
vary with time, and it makes v_t satisfy the equation of motion (11),
as is easily verified. The variation given by (20) is simply periodic
with the frequency
|H′−H″|/2πℏ = |H′−H″|/h, (21)
depending only on the energy difference of the two stationary states
to which the matrix element refers. This result is closely connected
with the Combination Law of Spectroscopy and Bohr's Frequency
Condition, according to which (21) is the frequency of the electromagnetic
radiation emitted or absorbed when the system makes a
transition under the influence of radiation between the stationary
states α′ and α″, the eigenvalues of H being Bohr's energy levels.
These matters will be dealt with in § 45.
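The periodic variation (20) of a Heisenberg matrix element can be checked directly for a two-level system (assumed natural units ℏ = 1, arbitrary example energies and observable):

```python
import numpy as np

hbar = 1.0  # natural units, an assumption for the illustration

# Two stationary states with energies H' and H'' (arbitrary example values)
# and a fixed observable v.
Hp, Hpp = 0.3, 1.1
v = np.array([[0.0, 0.7], [0.7, 0.2]], dtype=complex)

def v_heisenberg(t):
    # v_t = e^{iHt/hbar} v e^{-iHt/hbar}, eq. (17) with t0 = 0 and H diagonal.
    U = np.diag(np.exp(-1j * np.array([Hp, Hpp]) * t / hbar))
    return U.conj().T @ v @ U

# Eq. (20): the off-diagonal element just rotates with the Bohr phase
# e^{i(H'-H'')(t-t0)/hbar}; its modulus is fixed, and the oscillation
# frequency depends only on the energy difference, as in (21).
t = 0.9
elem = v_heisenberg(t)[0, 1]
expected = np.exp(1j * (Hp - Hpp) * t / hbar) * v[0, 1]
element_error = abs(elem - expected)
```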
30. The free particle
The most fundamental and elementary application of quantum
mechanics is to the system consisting merely of a free particle, or
particle not acted on by any forces. For dealing with it we use as
dynamical variables the three Cartesian coordinates x, y, z and their
conjugate momenta p_x, p_y, p_z. The Hamiltonian is equal to the
kinetic energy of the particle, namely
H = (1/2m)(p_x² + p_y² + p_z²) (22)
according to Newtonian mechanics, m being the mass.
This formula
is valid only if the velocity of the particle is small compared with c,
the velocity of light. For a rapidly moving particle, such as we often
have to deal with in atomic theory, (22) must be replaced by the
relativistic formula
H = c(m²c² + p_x² + p_y² + p_z²)^{1/2}. (23)
For small values of p_x, p_y, and p_z (23) goes over into (22), except for
the constant term mc² which corresponds to the rest-energy of the
particle in the theory of relativity and which has no influence on the
equations of motion. Formulas (22) and (23) can be taken over
directly into the quantum theory, the square root in (23) being now
understood as the positive square root defined at the end of § 11.
The constant term mc² by which (23) differs from (22) for small values
of p_x, p_y, and p_z can still have no physical effects, since the Hamiltonian
in the quantum theory, as introduced in § 27, is undefined to
the extent of an arbitrary additive real constant.
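The relation between (22) and (23) for slow particles can be verified numerically: for p ≪ mc the relativistic Hamiltonian differs from mc² plus the Newtonian kinetic energy only by a correction of relative order (p/mc)⁴. The sketch uses approximate SI values for the electron mass and the velocity of light:

```python
import math

# Approximate SI values for an electron.
m = 9.109e-31      # kg
c = 2.998e8        # m/s

def H_newton(px, py, pz):
    # Eq. (22): kinetic energy (1/2m)(px^2 + py^2 + pz^2).
    return (px**2 + py**2 + pz**2) / (2 * m)

def H_rel(px, py, pz):
    # Eq. (23): c(m^2 c^2 + px^2 + py^2 + pz^2)^(1/2).
    return c * math.sqrt(m**2 * c**2 + px**2 + py**2 + pz**2)

# For p << mc the two differ only by the rest-energy mc^2.
p = 1e-3 * m * c                      # a slow particle, v ~ c/1000
gap = H_rel(p, 0, 0) - (m * c**2 + H_newton(p, 0, 0))
rel_gap = abs(gap) / (m * c**2)       # fractional size of the correction
```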
We shall here work with the more accurate formula (23). We shall
first solve the Heisenberg equations of motion. From the quantum
conditions (9) of § 21, p_x commutes with p_y and p_z, and hence, from
Theorem 1 of § 19 extended to a set of commuting observables, p_x
commutes with any function of p_x, p_y, and p_z and therefore with H.
It follows that p_x is a constant of the motion. Similarly p_y and p_z are
constants of the motion. These results are the same as in the classical
theory. Again, the equation of motion for a coordinate, x_t say, is,
according to (11),
iℏ ẋ_t = iℏ dx_t/dt = x_t c(m²c² + p_x² + p_y² + p_z²)^{1/2} − c(m²c² + p_x² + p_y² + p_z²)^{1/2} x_t.
The right-hand side here can be evaluated by means of formula
(31) of § 22 with the roles of coordinates and momenta interchanged,
so that it reads
x f(p) − f(p) x = iℏ ∂f/∂p_x, (24)
f now being any function of the p's. This gives
ẋ_t = c p_x(m²c² + p_x² + p_y² + p_z²)^{−1/2} = c²p_x/H. (25)
Similarly,
ẏ_t = c²p_y/H,   ż_t = c²p_z/H.
The magnitude of the velocity is
v = (ẋ_t² + ẏ_t² + ż_t²)^{1/2} = c²(p_x² + p_y² + p_z²)^{1/2}/H. (26)
Equations (25) and (26) are just the same as in the classical theory.
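Equations (25) and (26) imply that the speed c²(p_x²+p_y²+p_z²)^{1/2}/H is always less than c, approaching it for large momenta; a short numerical check (assumed natural units m = c = 1):

```python
import math

m, c = 1.0, 1.0   # natural units, an assumption for the illustration

def velocity(px, py, pz):
    # Eqs. (25)-(26): x-dot = c^2 p_x / H etc., with H given by eq. (23).
    H = c * math.sqrt(m**2 * c**2 + px**2 + py**2 + pz**2)
    return (c**2 * px / H, c**2 * py / H, c**2 * pz / H)

vx, vy, vz = velocity(3.0, 4.0, 0.0)
speed = math.sqrt(vx**2 + vy**2 + vz**2)
# Here |p| = 5, so the speed is c^2 * 5 / H = 5/sqrt(26), a little below c.
```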
Let us consider a state that is an eigenstate of the momenta,
belonging to the eigenvalues p_x′, p_y′, p_z′. This state must be an eigenstate
of the Hamiltonian, belonging to the eigenvalue
H′ = c(m²c² + p_x′² + p_y′² + p_z′²)^{1/2}, (27)
and must therefore be a stationary state. The possible values for H′
are all numbers from mc² to ∞, as in the classical theory. The wave
function ψ(xyz) representing this state at any time in Schroedinger's
representation must satisfy
p_x′ ψ(xyz)⟩ = p_x ψ(xyz)⟩ = −iℏ (∂ψ/∂x)⟩,
with similar equations for p_y and p_z. These equations show that
ψ(xyz) is of the form
ψ(xyz) = a e^{i(p_x′x + p_y′y + p_z′z)/ℏ}, (28)
where a is independent of x, y, and z. From (18) we see now that the
time-dependent wave function ψ(xyzt) is of the form
ψ(xyzt) = a₀ e^{i(p_x′x + p_y′y + p_z′z − H′t)/ℏ}, (29)
where a₀ is independent of x, y, z, and t.
The function (29) of x, y, z, and t describes plane waves in space-time.
We see from this example the suitability of the terms 'wave
function' and 'wave equation'. The frequency of the waves is
ν = H′/h, (30)
their wavelength is
λ = h/(p_x′² + p_y′² + p_z′²)^{1/2} = h/P′, (31)
P′ being the length of the vector (p_x′, p_y′, p_z′), and their motion is in
the direction specified by the vector (p_x′, p_y′, p_z′) with the velocity
λν = H′/P′ = c²/v′, (32)
v′ being the velocity of the particle corresponding to the momentum
(p_x′, p_y′, p_z′) as given by formula (26). Equations (30), (31), and (32)
are easily seen to hold in all Lorentz frames of reference, the expression
on the right-hand side of (29) being, in fact, relativistically
invariant with p_x′, p_y′, p_z′ and H′ as the components of a 4-vector.
These properties of relativistic invariance led de Broglie, before the
discovery of quantum mechanics, to postulate the existence of waves
of the form (29) associated with the motion of any particle. They
are therefore known as de Broglie waves.
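Formulas (30)-(32) can be evaluated for a concrete case. The sketch below takes an electron moving at an illustrative speed of 10⁶ m/s (approximate SI constants) and confirms that the phase velocity λν equals c²/v′, exceeding c even though the particle itself moves slower than light:

```python
import math

# Approximate SI constants.
h = 6.626e-34      # J s
m = 9.109e-31      # kg
c = 2.998e8        # m/s

vprime = 1.0e6                                  # particle speed, example value
gamma = 1.0 / math.sqrt(1.0 - (vprime / c)**2)
P = gamma * m * vprime                          # relativistic momentum P'
E = gamma * m * c**2                            # energy H' of eq. (27)

lam = h / P          # eq. (31): de Broglie wavelength
nu = E / h           # eq. (30): frequency

# Eq. (32): the phase velocity lambda*nu is c^2/v', faster than light,
# while the particle itself moves at v' < c.
phase_velocity = lam * nu
ratio = phase_velocity / (c**2 / vprime)
```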
In the limiting case when the mass m is made to tend to zero, the
classical velocity of the particle v becomes equal to c and hence, from
(32), the wave velocity also becomes c. The waves are then like the
light-waves associated with a photon, with the difference that they
contain no reference to the polarization and involve a complex exponential
instead of sines and cosines. Formulas (30) and (31) are
still valid, connecting the frequency of the light-waves with the
energy of the photon and the wavelength of the light-waves with
the momentum of the photon.
For the state represented by (29), the probability of the particle
being found in any specified small volume when an observation of its
position is made is independent of where the volume is. This provides
an example of Heisenberg's principle of uncertainty, the state being
one for which the momentum is accurately given and for which, in
consequence, the position is completely unknown. Such a state is,
of course, a limiting case which never occurs in practice. The states
usually met with in practice are those represented by wave packets,
which may be formed by superposing a number of waves of the type
(29) belonging to slightly different values of (p_x′, p_y′, p_z′), as discussed
in § 24. The ordinary formula in hydrodynamics for the velocity of
such a wave packet, i.e. the group velocity of the waves, is
dν/d(1/λ), (33)
which gives, from (30) and (31),
dH′/dP′ = cP′(m²c² + P′²)^{−1/2} = c²P′/H′ = v′. (34)
This is just the velocity of the particle. The wave packet moves in
the same direction and with the same velocity as the particle moves
in classical mechanics.
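Equation (34) can be confirmed by differentiating H′(P′) numerically (assumed natural units m = c = 1): the group velocity dH′/dP′ agrees with the particle velocity c²P′/H′ of equation (26):

```python
import math

m, c = 1.0, 1.0   # natural units, an assumption for the illustration

def H_of_P(P):
    # H' = c(m^2 c^2 + P'^2)^(1/2), eq. (27) written in terms of P' = |p'|.
    return c * math.sqrt(m**2 * c**2 + P**2)

P = 0.75
# Group velocity dH'/dP' by a central difference...
dP = 1e-6
group_v = (H_of_P(P + dP) - H_of_P(P - dP)) / (2 * dP)

# ...compared with the particle velocity v' = c^2 P'/H' of eq. (26).
particle_v = c**2 * P / H_of_P(P)
gap = abs(group_v - particle_v)
```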
31. The motion of wave packets
The result just deduced for a free particle is an example of a general
principle. For any dynamical system with a classical analogue, a state
for which the classical description is valid as an approximation is
represented in quantum mechanics by a wave packet, all the coordinates
and momenta having approximate numerical values, whose
accuracy is limited by Heisenberg's principle of uncertainty. Now
Schroedinger's wave equation fixes how such a wave packet varies with
time, so in order that the classical description may remain valid, the
wave packet should remain a wave packet and should move according
to the laws of classical dynamics. We shall verify that this is so.
We take a dynamical system having a classical analogue and let
its Hamiltonian be H(q_r, p_r) (r = 1, 2,..., n). The corresponding classical
dynamical system will have as Hamiltonian H_c(q_r, p_r) say, obtained
by putting ordinary algebraic variables for the q_r and p_r in H(q_r, p_r)
and making ℏ → 0 if it occurs in H(q_r, p_r). The classical Hamiltonian
H_c is, of course, a real function of its variables. It is usually a
quadratic function of the momenta p_r, but not always so, the
relativistic theory of a free particle being an example where it is not.
The following argument is valid for H_c any algebraic function of the p's.
We suppose that the time-dependent wave function in Schroedinger's
representation is of the form
ψ(qt) = A e^{iS/ℏ}, (35)
where A and S are real functions of the q's and t which do not vary
very rapidly with their arguments. The wave function is then of the
form of waves, with A and S determining the amplitude and phase
respectively. Schroedinger's wave equation (7) gives
iℏ (∂/∂t){A e^{iS/ℏ}}⟩ = H(q_r, p_r) A e^{iS/ℏ}⟩
or
{iℏ ∂A/∂t − A ∂S/∂t}⟩ = e^{−iS/ℏ} H(q_r, p_r) A e^{iS/ℏ}⟩. (36)
Now e^{−iS/ℏ} is evidently a unitary linear operator and may be used for
U in equation (70) of § 26 to give us a unitary transformation. The
q's remain unchanged by this transformation, each p_r goes over into
e^{−iS/ℏ} p_r e^{iS/ℏ} = p_r + ∂S/∂q_r,
with the help of (31) of § 22, and H goes over into
e^{−iS/ℏ} H(q_r, p_r) e^{iS/ℏ} = H(q_r, p_r + ∂S/∂q_r),
since algebraic relations are preserved by the transformation. Thus
(36) becomes
{iℏ ∂A/∂t − A ∂S/∂t}⟩ = H(q_r, p_r + ∂S/∂q_r) A⟩. (37)
Let us now suppose that ℏ can be counted as small and let us neglect
terms involving ℏ in (37). This involves neglecting the p_r's that occur
in H in (37), since each p_r is equivalent to the operator −iℏ ∂/∂q_r
operating on the functions of the q's to the right of it. The surviving
terms give
−∂S/∂t = H_c(q_r, ∂S/∂q_r). (38)
This is a differential equation which the phase function S has to
satisfy. The equation is determined by the classical Hamiltonian
function H_c and is known as the Hamilton-Jacobi equation in classical
dynamics. It allows S to be real and so shows that the assumption
of the wave form (35) does not lead to an inconsistency.
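For the free particle the phase of the plane wave (29), S = p′q − H′t, furnishes an explicit solution of (38); a finite-difference check in one dimension (assumed natural units m = c = 1):

```python
import math

m, c = 1.0, 1.0   # natural units, an assumption for the illustration

def Hc(p):
    # Classical relativistic Hamiltonian of the free particle, one dimension.
    return c * math.sqrt(m**2 * c**2 + p**2)

p0 = 0.4                                    # a constant momentum, example value

def S(q, t):
    # Phase function of the plane wave (29): S = p0 q - H' t.
    return p0 * q - Hc(p0) * t

q, t, eps = 1.3, 0.2, 1e-6
dS_dq = (S(q + eps, t) - S(q - eps, t)) / (2 * eps)   # should equal p0
dS_dt = (S(q, t + eps) - S(q, t - eps)) / (2 * eps)   # should equal -H'

# Hamilton-Jacobi equation (38): -dS/dt = Hc(dS/dq).
residual = abs(-dS_dt - Hc(dS_dq))
```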
To obtain an equation for A, we must retain the terms in (37)
which are linear in ℏ and see what they give. A direct evaluation of
these terms is rather awkward in the case of a general function H,
and we can get the result we require more easily by first multiplying
both sides of (37) by the bra vector ⟨Af, where f is an arbitrary real
function of the q's. This gives
⟨Af{iℏ ∂A/∂t − A ∂S/∂t}⟩ = ⟨Af H(q_r, p_r + ∂S/∂q_r) A⟩.
The conjugate complex equation is
⟨{−iℏ ∂A/∂t − A ∂S/∂t} f A⟩ = ⟨A H(q_r, p_r + ∂S/∂q_r) f A⟩.
Subtracting and dividing out by iℏ, we obtain
2⟨Af ∂A/∂t⟩ = ⟨A [f, H(q_r, p_r + ∂S/∂q_r)] A⟩. (39)
We now have to evaluate the P.B.
[f, H(q_r, p_r + ∂S/∂q_r)].
Our assumption that ℏ can be counted as small enables us to expand
H(q_r, p_r + ∂S/∂q_r) as a power series in the p's. The terms of zero degree
will contribute nothing to the P.B. The terms of the first degree in
the p's give a contribution to the P.B. which can be evaluated most
easily with the help of the classical formula (1) of § 21 (this formula
being valid also in the quantum theory if u is independent of the p's
and v is linear in the p's). The amount of this contribution is
Σ_r (∂f/∂q_r)[∂H_c(q_r, p_r)/∂p_r],
the notation meaning that we must substitute ∂S/∂q_r for each p_r in
the function [ ] of the q's and p's, so as to obtain a function of the q's
only. The terms of higher degree in the p's give contributions to the
P.B. which vanish when ℏ → 0. Thus (39) becomes, with neglect of
terms involving ℏ, which is equivalent to the neglect of ℏ² in (37),
2⟨Af ∂A/∂t⟩ = ⟨A Σ_r (∂f/∂q_r)[∂H_c/∂p_r] A⟩. (40)
Now if a(q) and b(q) are any two functions of the q's, formula
(64) of § 20 gives
⟨a(q) b(q)⟩ = ∫ a(q′) b(q′) dq′,
and so
⟨a(q) ∂b(q)/∂q_r⟩ = −⟨(∂a(q)/∂q_r) b(q)⟩, (41)
provided a(q) and b(q) satisfy suitable boundary conditions, as discussed
in §§ 22 and 23. Hence (40) may be written
2⟨Af ∂A/∂t⟩ = −⟨f Σ_r ∂/∂q_r {A²[∂H_c/∂p_r]}⟩.
Since this holds for an arbitrary real function f, we must have
∂A²/∂t = −Σ_r ∂/∂q_r {A²[∂H_c/∂p_r]}. (42)
This is the equation for the amplitude A of the wave function. To
get an understanding of its significance, let us suppose we have a fluid
moving in the space of the variables q, the density of the fluid at any
point and time being A² and its velocity being
dq_r/dt = [∂H_c/∂p_r]. (43)
Equation (42) is then just the equation of conservation for such a
fluid. The motion of the fluid is determined by the function S
satisfying (38), there being one possible motion for each solution
of (38).
For a given S, let us take a solution of (42) for which at some
definite time the density A² vanishes everywhere outside a certain
small region. We may suppose this region to move with the fluid,
its velocity at each point being given by (43), and then the equation
of conservation (42) will require the density always to vanish outside
the region. There is a limit to how small the region may be, imposed
by the approximation we made in neglecting ℏ in (39). This approximation
is valid only provided
ℏ ∂A/∂q_r ≪ (∂S/∂q_r) A,
or
ℏ (∂A/∂q_r)/A ≪ ∂S/∂q_r,
which requires that A shall vary by an appreciable fraction of itself
only through a range of the q's in which S varies by many times ℏ,
i.e. a range consisting of many wavelengths of the wave function (35).
Our solution is then a wave packet of the type discussed in § 24 and
remains so for all time.
We thus get a wave function representing a state of motion for
which the coordinates and momenta have approximate numerical
values throughout all time. Such a state of motion in quantum
theory corresponds to the states with which classical theory deals.
The motion of our wave packet is determined by equations (38) and
(43). From these we get, defining p_r as ∂S/∂q_r,
dp_r/dt = ∂²S/∂q_r∂t + Σ_s (∂²S/∂q_r∂q_s)(dq_s/dt)
= −∂H_c/∂q_r, (44)
where in the last line the p's are counted as independent of the q's
before the partial differentiation. Equations (43) and (44) are just
the classical equations of motion in Hamiltonian form and show that
the wave packet moves according to the laws of classical mechanics.
We see in this way how the classical equations of motion are derivable
from the quantum theory as a limiting case.
By a more accurate solution of the wave equation one can show
that the accuracy with which the coordinates and momenta simultaneously
have numerical values cannot remain permanently as
favourable as the limit allowed by Heisenberg's principle of uncertainty,
equation (56) of § 24, but if it is initially so it will become
less favourable, the wave packet undergoing a spreading.†
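Both conclusions, the classical motion of the packet centre and the slow spreading, can be seen in a direct numerical solution of Schroedinger's wave equation for a free particle. The sketch below (assumed natural units ℏ = m = 1, non-relativistic Hamiltonian p²/2m, arbitrary packet parameters) evolves a Gaussian packet exactly in the momentum representation:

```python
import numpy as np

hbar, m = 1.0, 1.0   # natural units, an assumption for the illustration

# Grid and an initial Gaussian wave packet with mean momentum p0.
N, L = 2048, 200.0
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
dx = x[1] - x[0]
p = 2 * np.pi * hbar * np.fft.fftfreq(N, d=dx)
x0, p0, sigma = -30.0, 2.0, 2.0
psi = np.exp(-(x - x0) ** 2 / (4 * sigma ** 2) + 1j * p0 * x / hbar)
psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * dx)

def evolve(psi, t):
    # Free-particle evolution is exact in the momentum representation:
    # each Fourier component picks up the phase e^{-i p^2 t / (2 m hbar)}.
    phase = np.exp(-1j * p ** 2 * t / (2 * m * hbar))
    return np.fft.ifft(phase * np.fft.fft(psi))

def mean_x(psi):
    return np.sum(x * np.abs(psi) ** 2) * dx

def spread(psi):
    rho = np.abs(psi) ** 2
    mu = np.sum(x * rho) * dx
    return np.sqrt(np.sum((x - mu) ** 2 * rho) * dx)

t = 10.0
psi_t = evolve(psi, t)
classical_x = x0 + (p0 / m) * t          # where classical mechanics puts it
center_error = abs(mean_x(psi_t) - classical_x)
spreading = spread(psi_t) - spread(psi)  # the packet widens with time
```

The packet centre tracks the classical trajectory while the width grows, illustrating both the classical limit and the spreading referred to above.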
32. The action principle‡
Equation (10) shows that the Heisenberg dynamical variables at
time t, v_t, are connected with their values at time t₀, v_{t₀} or v, by a
unitary transformation. The Heisenberg variables at time t+δt are
connected with their values at time t by an infinitesimal unitary
transformation, as is shown by the equation of motion (11) or (13),
which gives the connexion between v_{t+δt} and v_t of the form of (79) or
(80) of § 26 with H_t for F and δt/ℏ for ε. The variation with time of
the Heisenberg dynamical variables may thus be looked upon as the
continuous unfolding of a unitary transformation. In classical
mechanics the dynamical variables at time t+δt are connected with
their values at time t by an infinitesimal contact transformation and
the whole motion may be looked upon as the continuous unfolding of a
contact transformation. We have here the mathematical foundation
of the analogy between the classical and quantum equations of
motion, and can develop it to bring out the quantum analogue of all
the main features of the classical theory of dynamics.
Suppose we have a representation in which the complete set of
commuting observables ξ are diagonal, so that a basic bra is ⟨ξ′|.
We can introduce a second representation in which the basic bras are
⟨ξ′*| = ⟨ξ′|T. (45)
The new basic bras depend on the time t and give us a moving
representation, like a moving system of axes in an ordinary vector
space. Comparing (45) with the conjugate imaginary of (8), we see
that the new basic vectors are just the transforms in the Heisenberg
picture of the original basic vectors in the Schroedinger picture, and
hence they must be connected with the Heisenberg dynamical
† See Kennard, Z. f. Physik, 44 (1927), 344; Darwin, Proc. Roy. Soc. A, 117 (1927), 268.
‡ This section may be omitted by the student who is not specially concerned with
higher dynamics.
variables ξ_t in the same way in which the original basic vectors are
connected with the Schroedinger dynamical variables ξ. In particular,
each ⟨ξ′*| must be an eigenvector of the ξ_t's belonging to the eigenvalues
ξ′. It may therefore be written ⟨ξ′_t|, with the understanding
that the numbers ξ′ are the same eigenvalues of the ξ_t's that the ξ′'s
are of the ξ's. From (45) we get

    ⟨ξ′_t|P⟩ = ⟨ξ′|T|P⟩,   (46)

showing that the transformation function is just the representative
of T|P⟩ in the original representation.
Differentiating (45) with respect to t and using (6), we get

    iħ (d/dt)⟨ξ′_t| = ⟨ξ′_t|H_t

with the help of (12). Multiplying on the right by any ket |a⟩
independent of t, we get

    iħ (d/dt)⟨ξ′_t|a⟩ = ⟨ξ′_t|H_t|a⟩ = ∫ ⟨ξ′_t|H_t|ξ″_t⟩ dξ″ ⟨ξ″_t|a⟩,   (47)

if we take for definiteness the case of continuous eigenvalues for the
ξ's. Now equation (5), written in terms of representatives, reads

    iħ (d/dt)⟨ξ′|Pt⟩ = ∫ ⟨ξ′|H|ξ″⟩ dξ″ ⟨ξ″|Pt⟩.   (48)
Since ⟨ξ′_t|H_t|ξ″_t⟩ is the same function of the variables ξ′_t and ξ″_t that
⟨ξ′|H|ξ″⟩ is of ξ′ and ξ″, equations (47) and (48) are of precisely the
same form, with the variables ξ′_t, ξ″_t in (47) playing the role of the
variables ξ′ and ξ″ in (48) and the function ⟨ξ′_t|a⟩ playing the role
of the function ⟨ξ′|Pt⟩. We can thus look upon (47) as a form of
Schroedinger's wave equation, with the function ⟨ξ′_t|a⟩ of the variables
ξ′_t as the wave function. In this way Schroedinger's wave equation
appears in a new light, as the condition on the representative, in the
moving representation with the Heisenberg variables ξ_t diagonal, of the
fixed ket corresponding to a state in the Heisenberg picture. The function
⟨ξ′_t|a⟩ owes its variation with time to its left factor ⟨ξ′_t|, in contradistinction
to the function ⟨ξ′|Pt⟩, which owes its variation with time
to its right factor |Pt⟩.
If we put |a⟩ = |ξ″⟩ in (47), we get

    iħ (d/dt)⟨ξ′_t|ξ″⟩ = ⟨ξ′_t|H_t|ξ″⟩,   (49)

§ 32 THE ACTION PRINCIPLE
showing that the transformation function ⟨ξ′_t|ξ″⟩ satisfies Schroe-
dinger's wave equation. Now ξ_{t₀} = ξ, so we must have

    ⟨ξ′_{t₀}|ξ″⟩ = δ(ξ′−ξ″),   (50)

the δ function here being understood as the product of a number of
factors, one for each ξ-variable, such as occurs on the right-hand side
of equation (34) of § 16. Thus the
transformation function ⟨ξ′_t|ξ″⟩ is that solution of Schroedinger's wave
equation for which the ξ's certainly have the values ξ″ at time t₀.
The square of its modulus, |⟨ξ′_t|ξ″⟩|², is the relative probability of the
ξ's having the values ξ′ at time t > t₀, if they certainly have the values
ξ″ at time t₀. We may write ⟨ξ′_t|ξ″⟩ as ⟨ξ′_t|ξ″_{t₀}⟩ and consider it as
depending on t₀ as well as on t. To get its dependence on t₀, we take
the conjugate complex of equation (49), interchange t and t₀, and also
interchange single primes and double primes. This gives

    −iħ (d/dt₀)⟨ξ′_t|ξ″_{t₀}⟩ = ⟨ξ′_t|H_{t₀}|ξ″_{t₀}⟩.   (51)
The foregoing discussion of the transformation function ⟨ξ′_t|ξ″_{t₀}⟩ is
valid with the ξ's any complete set of commuting observables. The
equations were written down for the case of the ξ's having continuous
eigenvalues, but they would still be valid if any of the ξ's have
discrete eigenvalues, provided the necessary formal changes are made
in them. Let us now take a dynamical system having a classical
analogue and let us take the ξ's to be the coordinates q. Put

    ⟨q′_t|q″_{t₀}⟩ = e^{iS/ħ}   (52)

and so define the function S of the variables q′_t, q″_{t₀}. This function also
depends explicitly on t. (52) is a solution of Schroedinger's wave
equation and, if ħ can be counted as small, it can be handled in the
same way as (35) was. The S of (52) differs from the S of (35) on
account of there being no A in (52), which makes the S of (52) complex,
but the real part of this S equals the S of (35) and its pure
imaginary part is of the order ħ. Thus, in the limit ħ → 0, the S of
(52) will equal that of (35) and will therefore satisfy, corresponding
to (36),

    ∂S/∂t = −H_c(q′_t, p′_t),   (53)

where

    p′_{tr} = ∂S/∂q′_{tr},   (54)

and H_c is the Hamiltonian of the classical analogue of our quantum
dynamical system. But (52) is also a solution of (51) with q's for ξ's,
which is the conjugate complex of Schroedinger's wave equation in the
variables q″ or q″_{t₀}. This causes S to satisfy also†

    ∂S/∂t₀ = H_c(q″, p″),   (55)

where

    p″_r = −∂S/∂q″_r.   (56)

The solution of the Hamilton-Jacobi equations (53), (55) is the
action function of classical mechanics for the time interval t₀ to t,
i.e. it is the time integral of the Lagrangian L,

    S = ∫_{t₀}^{t} L(t′) dt′.   (57)

Thus the S defined by (52) is the quantum analogue of the classical action
function and equals it in the limit ħ → 0. To get the quantum analogue
of the classical Lagrangian, we pass to the case of an infinitesimal
time interval by putting t = t₀+δt, and we then have ⟨q′_{t₀+δt}|q″_{t₀}⟩ as the
analogue of e^{iL(t₀)δt/ħ}. For the sake of the analogy, one should consider
L(t₀) as a function of the coordinates q′ at time t₀+δt and the coordinates
q″ at time t₀, rather than as a function of the coordinates
and velocities at time t₀, as one usually does.
The principle of least action in classical mechanics says that the
action function (57) remains stationary for small variations of the trajectory
of the system which do not alter the end points, i.e. for small
variations of the q's at all intermediate times between t₀ and t with q_{t₀}
and q_t fixed. Let us see what it corresponds to in the quantum theory.

Put

    exp[ i ∫_{t_a}^{t_b} L(t′) dt′ / ħ ] = exp[ iS(t_b, t_a)/ħ ] = B(t_b, t_a),   (58)

so that B(t_b, t_a) corresponds to ⟨q′_{t_b}|q′_{t_a}⟩ in the quantum theory. (We
here allow q′_{t_a} and q′_{t_b} to denote different eigenvalues of q_{t_a} and q_{t_b}, to
save having to introduce a large number of primes into the analysis.)
Now suppose the time interval t₀ → t to be divided up into a large
number of small time intervals t₀ → t₁, t₁ → t₂, ..., t_{m−1} → t_m, t_m → t, by
the introduction of a sequence of intermediate times t₁, t₂, ..., t_m. Then

    B(t, t₀) = B(t, t_m)B(t_m, t_{m−1}) ⋯ B(t₂, t₁)B(t₁, t₀).   (59)

The corresponding quantum equation, which follows from the property
of basic vectors (35) of § 16, is

    ⟨q′_t|q′_{t₀}⟩ = ∫∫⋯∫ ⟨q′_t|q′_m⟩ dq′_m ⟨q′_m|q′_{m−1}⟩ dq′_{m−1} ⋯ ⟨q′₂|q′₁⟩ dq′₁ ⟨q′₁|q′_{t₀}⟩,
                                                                      (60)
† For a more accurate comparison of transformation functions with classical
theory, see Van Vleck, Proc. Nat. Acad. 14, 178.
q_k being written for q_{t_k} for brevity. At first sight there does not seem
to be any close correspondence between (59) and (60). We must,
however, analyse the meaning of (59) rather more carefully. We must
regard each factor B as a function of the q's at the two ends of the
time interval to which it refers. This makes the right-hand side of
(59) a function, not only of q_t and q_{t₀}, but also of all the intermediate
q's. Equation (59) is valid only when we substitute for the intermediate
q's in its right-hand side their values for the real trajectory,
small variations in which values leave S stationary and therefore also,
from (58), leave B(t, t₀) stationary. It is the process of substituting
these values for the intermediate q's which corresponds to the integrations
over all values for the intermediate q′'s in (60). The quantum
analogue of the action principle is thus absorbed in the composition
law (60) and the classical requirement that the values of the intermediate
q's shall make S stationary corresponds to the condition
in quantum mechanics that all values of the intermediate q′'s
are important in proportion to their contribution to the integral
in (60).
Let us see how (59) can be a limiting case of (60) for ħ small. We
must suppose the integrand in (60) to be of the form e^{iF/ħ}, where F is
a function of q′_{t₀}, q′₁, q′₂, ..., q′_m, q′_t which remains continuous as ħ tends
to zero, so that the integrand is a rapidly oscillating function when
ħ is small. The integral of such a rapidly oscillating function will be
extremely small, except for the contribution arising from a region in
the domain of integration where comparatively large variations in
the q′_k produce only very small variations in F. Such a region must
be the neighbourhood of a point where F is stationary for small variations
of the q′_k. Thus the integral in (60) is determined essentially by
the value of the integrand at a point where the integrand is stationary
for small variations of the intermediate q′'s, and so (60) goes over
into (59).
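This limiting behaviour is easy to check numerically. The sketch below is our own illustration, not part of the text: it takes the simplest stationary-phase model, F(q) = q² in a single integration variable, so that the only stationary point is q = 0 and the limiting value of the integral of e^{iF/ħ} is the Fresnel result √(πħ) e^{iπ/4}, whose modulus shrinks as √ħ.

```python
import numpy as np

# Stationary-phase illustration (ours): F(q) = q^2, stationary point at q = 0.
def oscillatory_integral(hbar, L=8.0, n=160001):
    # Riemann-sum approximation over [-L, L]; the grid is fine enough to
    # resolve the fastest oscillation (local wavelength pi*hbar/L at q = L).
    q = np.linspace(-L, L, n)
    dq = q[1] - q[0]
    return np.sum(np.exp(1j * q**2 / hbar)) * dq

hbar_small = 0.02
I_small = oscillatory_integral(hbar_small)
# Limiting stationary-phase (Fresnel) value: sqrt(pi*hbar) * exp(i*pi/4).
exact = np.sqrt(np.pi * hbar_small) * np.exp(1j * np.pi / 4)
rel_err = abs(I_small - exact) / abs(exact)

# The modulus scales as sqrt(hbar): reducing hbar by a factor 4 halves |I|.
ratio = abs(oscillatory_integral(0.02)) / abs(oscillatory_integral(0.08))
```

The whole value of the integral comes from the neighbourhood of q = 0; away from it the rapid oscillations cancel, which is the one-variable analogue of the cancellation among non-stationary trajectories in (60).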
Equations (54) and (56) express that the variables q′_t, p′_t are con-
nected with the variables q″, p″ by a contact transformation and are
one of the standard forms of writing the equations of a contact transformation.
There is an analogous form for writing the equations of a
unitary transformation in quantum mechanics. We get from (52), with
the help of (45) of § 22,

    ⟨q′_t|p_{tr}|q″⟩ = −iħ (∂/∂q′_{tr})⟨q′_t|q″⟩ = (∂S/∂q′_{tr})⟨q′_t|q″⟩.   (61)
Similarly, with the help of (46) of § 22,

    ⟨q′_t|p_r|q″⟩ = iħ (∂/∂q″_r)⟨q′_t|q″⟩ = −(∂S/∂q″_r)⟨q′_t|q″⟩.   (62)

From the general definition of functions of commuting observables,
we have

    ⟨q′_t|f(q_t)g(q)|q″⟩ = f(q′_t)g(q″)⟨q′_t|q″⟩,   (63)

where f(q_t) and g(q) are functions of the q_t's and q's respectively. Let
G(q_t, q) be any function of the q_t's and q's consisting of a sum or
integral of terms each of the form f(q_t)g(q), so that all the q_t's in G
occur to the left of all the q's. Such a function we call well ordered.
Applying (63) to each of the terms in G and adding or integrating,
we get

    ⟨q′_t|G(q_t, q)|q″⟩ = G(q′_t, q″)⟨q′_t|q″⟩.

Now let us suppose each p_{tr} and p_r can be expressed as a well-ordered
function of the q_t's and q's and write these functions p_{tr}(q_t, q), p_r(q_t, q).
Putting these functions for G, we get

    ⟨q′_t|p_{tr}|q″⟩ = p_{tr}(q′_t, q″)⟨q′_t|q″⟩,
    ⟨q′_t|p_r|q″⟩ = p_r(q′_t, q″)⟨q′_t|q″⟩.

Comparing these equations with (61) and (62) respectively, we see
that

    p_{tr}(q′_t, q″) = ∂S(q′_t, q″)/∂q′_{tr},   p_r(q′_t, q″) = −∂S(q′_t, q″)/∂q″_r.

This means that

    p_{tr} = ∂S(q_t, q)/∂q_{tr},   p_r = −∂S(q_t, q)/∂q_r,   (64)

provided the right-hand sides of (64) are written as well-ordered
functions.

These equations are of the same form as (54) and (56), but refer to
the non-commuting quantum variables q_t, q instead of the ordinary
algebraic variables q′_t, q″. They show how the conditions for a unitary
transformation between quantum variables are analogous to the conditions
for a contact transformation between classical variables. The
analogy is not complete, however, because the classical S must be real
and there is no simple condition corresponding to this for the S of (64).
33. The Gibbs ensemble
In our work up to the present we have been assuming all along that
our dynamical system at each instant of time is in a definite state,
that is to say, its motion is specified as completely and accurately as
is possible without conflicting with the general principles of the theory.
In the classical theory this would mean, of course, that all the coordi-
nates and momenta have specified values. Now we may be interested
in a motion which is specified to a lesser extent than this maximum
possible. The present section will be devoted to the methods to be
used in such a case.
The procedure in classical mechanics is to introduce what is called
a Gibbs ensemble, the idea of which is as follows. We consider all the
dynamical coordinates and momenta as Cartesian coordinates in a
certain space, the phase space, whose number of dimensions is twice
the number of degrees of freedom of the system. Any state of the
system can then be represented by a point in this space. This point
will move according to the classical equations of motion (14). Suppose,
now, that we are not given that the system is in a definite state
at any time, but only that it is in one or other of a number of possible
states according to a definite probability law. We should then be
able to represent it by a fluid in the phase space, the mass of fluid in
any volume of the phase space being the total probability of the
system being in any state whose representative point lies in that
volume. Each particle of the fluid will be moving according to the
equations of motion (14). If we introduce the density ρ of the fluid
at any point, equal to the probability per unit volume of phase space
of the system being in the neighbourhood of the corresponding state,
we shall have the equation of conservation

    dρ/dt = −[ρ, H].   (65)

This may be considered as the equation of motion for the fluid, since
it determines the density ρ for all time if ρ is given initially as a
function of the q's and p's. It is, apart from the minus sign, of the
same form as the ordinary equation of motion (15) for a dynamical
variable.

The requirement that the total probability of the system being in
any state shall be unity gives us a normalizing condition for ρ

    ∫∫ ρ dq dp = 1,   (66)

the integration being over the whole of phase space and the single
differential dq or dp being written to denote the product of all the
dq's or dp's. If β denotes any function of the dynamical variables,
the average value of β will be

    ∫∫ βρ dq dp.   (67)

It makes only a trivial alteration in the theory, but often facilitates
discussion, if we work with a density ρ differing from the above one
by a positive constant factor, k say, so that we have instead of (66)

    ∫∫ ρ dq dp = k.

With this density we can picture the fluid as representing a number
k of similar dynamical systems, all following through their motions
independently in the same place, without any mutual disturbance or
interaction. The density at any point would then be the probable or
average number of systems in the neighbourhood of any state per unit
volume of phase space, and expression (67) would give the average
total value of β for all the systems. Such a set of dynamical systems,
which is the ensemble introduced by Gibbs, is usually not realizable
in practice, except as a rough approximation, but it forms all the
same a useful theoretical abstraction.
We shall now see that there exists a corresponding density ρ
in quantum mechanics, having properties analogous to the above.
It was first introduced by von Neumann. Its existence is rather
surprising in view of the fact that phase space has no meaning in
quantum mechanics, there being no possibility of assigning numerical
values simultaneously to the q's and p's.

We consider a dynamical system which is at a certain time in one
or other of a number of possible states according to some given
probability law. These states may be either a discrete set or a continuous
range, or both together. We shall here take for definiteness
the case of a discrete set and suppose them labelled by a parameter m.
Let the normalized ket vectors corresponding to them be |m⟩ and let
the probability of the system being in the mth state be P_m. We then
define the quantum density ρ by

    ρ = Σ_m |m⟩P_m⟨m|.   (68)

Let ρ′ be any eigenvalue of ρ and |ρ′⟩ an eigenket belonging to this
eigenvalue. Then

    ρ|ρ′⟩ = Σ_m |m⟩P_m⟨m|ρ′⟩ = ρ′|ρ′⟩
so that

    Σ_m ⟨ρ′|m⟩P_m⟨m|ρ′⟩ = ρ′⟨ρ′|ρ′⟩

or

    Σ_m P_m |⟨m|ρ′⟩|² = ρ′⟨ρ′|ρ′⟩.

Now P_m, being a probability, can never be negative. It follows that
ρ′ cannot be negative. Thus ρ has no negative eigenvalues, in analogy
with the fact that the classical density ρ is never negative.
Let us now obtain the equation of motion for our quantum ρ. In
Schroedinger's picture the kets and bras in (68) will vary with the time
in accordance with Schroedinger's equation (5) and the conjugate
imaginary of this equation, while the P_m's will remain constant, since
the system, so long as it is left undisturbed, cannot change over from
a state corresponding to one ket satisfying Schroedinger's equation to
a state corresponding to another. We thus have

    iħ dρ/dt = Σ_m {H|m⟩P_m⟨m| − |m⟩P_m⟨m|H}
             = Hρ − ρH.   (69)

This is the quantum analogue of the classical equation of motion
(65). Our quantum ρ, like the classical one, is determined for all time
if it is given initially.
From the assumption of § 12, the average value of any observable
β when the system is in the state m is ⟨m|β|m⟩. Hence if the system
is distributed over the various states m according to the probability
law P_m, the average value of β will be Σ_m P_m⟨m|β|m⟩. If we introduce
a representation with a discrete set of basic ket vectors |ξ′⟩ say, this
equals

    Σ_m Σ_{ξ′} ⟨ξ′|β|m⟩P_m⟨m|ξ′⟩ = Σ_{ξ′} ⟨ξ′|βρ|ξ′⟩ = Σ_{ξ′} ⟨ξ′|ρβ|ξ′⟩,   (70)

the last step being easily verified with the law of matrix multiplication,
equation (44) of § 17. The expressions (70) are the analogue of
the expression (67) of the classical theory. Whereas in the classical
theory we have to multiply β by ρ and take the integral of the
product over all phase space, in the quantum theory we have to
multiply β by ρ, with the factors in either order, and take the
diagonal sum of the product in a representation. If the representation
involves a continuous range of basic vectors |ξ′⟩, we get instead
of (70)

    ∫ ⟨ξ′|βρ|ξ′⟩ dξ′,   (71)

so that we must carry through a process of 'integrating along the
diagonal' instead of summing the diagonal elements. We shall define
(71) to be the diagonal sum of βρ in the continuous case. It can easily
be verified, from the properties of transformation functions (56) of
§ 18, that the diagonal sum is the same for all representations.

From the condition that the |m⟩'s are normalized we get, with
discrete ξ′'s,

    Σ_{ξ′} ⟨ξ′|ρ|ξ′⟩ = Σ_{ξ′} Σ_m ⟨ξ′|m⟩P_m⟨m|ξ′⟩ = Σ_m P_m = 1,   (72)

since the total probability of the system being in any state is unity.
This is the analogue of equation (66). The probability of the system
being in the state ξ′, or the probability of the observables ξ which
are diagonal in the representation having the values ξ′, is, according
to the rule for interpreting representatives of kets (51) of § 18,

    Σ_m |⟨ξ′|m⟩|² P_m = ⟨ξ′|ρ|ξ′⟩,   (73)

which gives us a meaning for each term in the sum on the left-hand
side of (72). For continuous ξ′'s, the right-hand side of (73) gives the
probability of the ξ's having values in the neighbourhood of ξ′ per
unit range of variation of the values ξ′.
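These properties of ρ are easy to check with small matrices. The sketch below is our own illustration (the kets, the probabilities, and the observable β are arbitrary choices): it builds ρ as in (68), and verifies that ρ has no negative eigenvalues, that its diagonal sum is unity as in (72), and that the average of β may be taken with the factors in either order, as in (70).

```python
import numpy as np

# Toy density (our illustration): three arbitrary normalized kets |m> in a
# 5-dimensional space, combined with probabilities P_m as in (68).
rng = np.random.default_rng(0)
n = 5
kets = []
for _ in range(3):
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    kets.append(v / np.linalg.norm(v))
P = np.array([0.5, 0.3, 0.2])

rho = sum(Pm * np.outer(v, v.conj()) for Pm, v in zip(P, kets))

eig_min = np.linalg.eigvalsh(rho).min()   # rho has no negative eigenvalues
trace_rho = np.trace(rho).real            # diagonal sum = 1, as in (72)

# Average of an arbitrary real symmetric observable beta, as in (70):
beta = rng.normal(size=(n, n))
beta = beta + beta.T
avg_direct = sum(Pm * (v.conj() @ beta @ v).real for Pm, v in zip(P, kets))
avg_br = np.trace(beta @ rho).real        # diagonal sum of beta*rho
avg_rb = np.trace(rho @ beta).real        # the factors in the other order
```

Note that the |m⟩'s need not be mutually orthogonal for any of these properties to hold, exactly as in the text.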
As in the classical theory, we may take a density equal to k times
the above ρ and consider it as representing a Gibbs ensemble of k
similar dynamical systems, between which there is no mutual disturbance
or interaction. We shall then have k on the right-hand side
of (72), and (70) or (71) will give the total average β for all the
members of the ensemble, while (73) will give the total probability
of a member of the ensemble having values for its ξ's equal to ξ′
or in the neighbourhood of ξ′ per unit range of variation of the
values ξ′.

An important application of the Gibbs ensemble is to a dynamical
system in thermodynamic equilibrium with its surroundings at a
given temperature T. Gibbs showed that such a system is represented
in classical mechanics by the density

    ρ = c e^{−H/kT},   (74)

H being the Hamiltonian, which is now independent of the time, k
being Boltzmann's constant, and c being a number chosen to make
the normalizing condition (66) hold. This formula may be taken over
unchanged into the quantum theory. At high temperatures, (74)
becomes ρ = c, which gives, on being substituted into the right-hand
side of (73), c⟨ξ′|ξ′⟩ = c in the case of discrete ξ′'s. This shows that
at high temperatures all discrete states are equally probable.
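A minimal numerical sketch of the quantum form of (74) (our illustration, with ħ = ω = k = 1 and a truncation to 50 harmonic-oscillator levels, all assumed for convenience): ρ is diagonal in the energy representation with entries proportional to e^{−(n+½)/T}, and at a high temperature the populations of neighbouring discrete states become equal, while at a low temperature they do not.

```python
import numpy as np

# Thermal density for a truncated oscillator (our illustration:
# hbar = omega = k = 1, 50 levels kept).
levels = np.arange(50) + 0.5     # eigenvalues (n + 1/2) of H

def populations(T):
    w = np.exp(-levels / T)
    return w / w.sum()           # the constant c fixed by unit diagonal sum

ratio_cold = populations(0.5)[1] / populations(0.5)[0]      # exp(-2), far from 1
ratio_hot = populations(1000.0)[1] / populations(1000.0)[0] # nearly 1
```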
VI
ELEMENTARY APPLICATIONS
34. The harmonic oscillator
A SIMPLE and interesting example of a dynamical system in quantum
mechanics is the harmonic oscillator. This example is of importance
for general theory, because it forms a corner-stone in the theory of
radiation. The dynamical variables needed for describing the system
are just one coordinate q and its conjugate momentum p. The
Hamiltonian in classical mechanics is

    H = (1/2m)(p² + m²ω²q²),   (1)

where m is the mass of the oscillating particle and ω is 2π times the
frequency. We assume the same Hamiltonian in quantum mechanics.
This Hamiltonian, together with the quantum condition (10) of § 22,
define the system completely.

The Heisenberg equations of motion are

    q̇_t = [q_t, H] = p_t/m,
    ṗ_t = [p_t, H] = −mω²q_t.   (2)
It is convenient to introduce the dimensionless complex dynamical
variable

    η = (2mħω)^{−1/2}(p + imωq).   (3)

The equations of motion (2) give

    η̇_t = (2mħω)^{−1/2}(−mω²q_t + iωp_t) = iωη_t.

This equation can be integrated to give

    η_t = η₀ e^{iωt},   (4)

where η₀ is a linear operator independent of t, and is equal to the
value of η_t at time t = 0. The above equations are all as in the
classical theory.

We can express q and p in terms of η and its conjugate complex η̄
and may thus work entirely in terms of η and η̄. We have

    ħω ηη̄ = (2m)^{−1}(p + imωq)(p − imωq)
          = (2m)^{−1}[p² + m²ω²q² + imω(qp − pq)]
          = H − ½ħω   (5)

and similarly

    ħω η̄η = H + ½ħω.   (6)

Thus

    η̄η − ηη̄ = 1.   (7)
Equation (5) or (6) gives H in terms of η and η̄ and (7) gives the
commutation relation connecting η and η̄. From (5)

    ħω η̄ηη̄ = η̄H − ½ħωη̄

and from (6)

    ħω η̄ηη̄ = Hη̄ + ½ħωη̄.

Thus

    η̄H − Hη̄ = ħωη̄.   (8)

Also, (7) leads to

    η̄ηⁿ − ηⁿη̄ = nη^{n−1}   (9)

for any positive integer n, as may be verified by induction, since, by
multiplying (9) by η on the left, we can deduce (9) with n+1 for n.

Let H′ be an eigenvalue of H and |H′⟩ an eigenket belonging to it.
From (5)

    ħω⟨H′|ηη̄|H′⟩ = ⟨H′|H − ½ħω|H′⟩ = (H′ − ½ħω)⟨H′|H′⟩.

Now ⟨H′|ηη̄|H′⟩ is the square of the length of the ket η̄|H′⟩, and
hence

    ⟨H′|ηη̄|H′⟩ ⩾ 0,

the case of equality occurring only if η̄|H′⟩ = 0. Also ⟨H′|H′⟩ > 0.
Thus

    H′ ⩾ ½ħω,   (10)

the case of equality occurring only if η̄|H′⟩ = 0. From the form (1)
of H as a sum of squares, we should expect its eigenvalues to be all
positive or zero (since the average value of H for any state must be
positive or zero). We now have the more stringent condition (10).

From (8)

    Hη̄|H′⟩ = (η̄H − ħωη̄)|H′⟩ = (H′ − ħω)η̄|H′⟩.   (11)

Now if H′ ≠ ½ħω, η̄|H′⟩ is not zero and is then according to (11) an
eigenket of H belonging to the eigenvalue H′ − ħω. Thus, with H′
any eigenvalue of H not equal to ½ħω, H′ − ħω is another eigenvalue
of H. We can repeat the argument and infer that, if H′ − ħω ≠ ½ħω,
H′ − 2ħω is another eigenvalue of H. Continuing in this way, we
obtain the series of eigenvalues H′, H′−ħω, H′−2ħω, H′−3ħω, ...,
which cannot extend to infinity, because then it would contain eigenvalues
contradicting (10), and can terminate only with the value ½ħω.

Again, from the conjugate complex of equation (8)

    Hη|H′⟩ = (ηH + ħωη)|H′⟩ = (H′ + ħω)η|H′⟩,

showing that H′ + ħω is another eigenvalue of H, with η|H′⟩ as an
eigenket belonging to it, unless η|H′⟩ = 0. The latter alternative
can be ruled out, since it would lead to

    0 = ħω η̄η|H′⟩ = (H + ½ħω)|H′⟩ = (H′ + ½ħω)|H′⟩,
which contradicts (10). Thus H′ + ħω is always another eigenvalue
of H, and so are H′ + 2ħω, H′ + 3ħω and so on. Hence the eigenvalues
of H are the series of numbers

    ½ħω, (3/2)ħω, (5/2)ħω, (7/2)ħω, ...,   (12)

extending to infinity. These are the possible energy values for the
harmonic oscillator.
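The ladder argument above can be checked with finite matrices. The sketch below is our own illustration, with m = ω = ħ = 1 and a truncation to N levels (entries near the cut-off are spoiled by the truncation and are not to be trusted): η is built from matrices for q and p exactly as in (3), and the commutation relation (7), the eigenvalues (12), and the normalization ⟨0|η̄ⁿηⁿ|0⟩ = n! derived below are verified on the low-lying states.

```python
import numpy as np

# Finite-matrix check of the ladder argument (our illustration: m = omega =
# hbar = 1, truncation to N levels; the top rows/columns are unreliable).
N = 30
a = np.diag(np.sqrt(np.arange(1.0, N)), k=1)   # lowering matrix: a|n> = sqrt(n)|n-1>
q = (a + a.T) / np.sqrt(2.0)                   # q in these units
p = 1j * (a.T - a) / np.sqrt(2.0)              # p in these units
eta = (p + 1j * q) / np.sqrt(2.0)              # eq. (3), with 2*m*hbar*omega = 2
eta_bar = eta.conj().T                         # conjugate complex of eta
H = (p @ p + q @ q) / 2.0                      # eq. (1)

comm = eta_bar @ eta - eta @ eta_bar           # eq. (7): unity, away from the cut-off
energies = np.sort(np.linalg.eigvalsh(H))      # lowest values: (n + 1/2), as in (12)

vac = np.zeros(N, dtype=complex)
vac[0] = 1.0                                   # lowest eigenket |0>, eta_bar|0> = 0
vec4 = np.linalg.matrix_power(eta, 4) @ vac    # eta^4 |0>
norm_sq_4 = np.vdot(vec4, vec4).real           # square of its length: 4! = 24
```

The truncation shows up only at the edge: the last diagonal element of `comm`, for instance, is not 1, which is the finite-matrix trace of the fact that (7) cannot hold exactly in any finite number of dimensions.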
Let |0⟩ be an eigenket of H belonging to the lowest eigenvalue
½ħω, so that

    η̄|0⟩ = 0,   (13)

and form the sequence of kets

    |0⟩, η|0⟩, η²|0⟩, η³|0⟩, ....   (14)

These kets are all eigenkets of H, belonging to the sequence of eigen-
values (12) respectively. From (9) and (13)

    η̄ηⁿ|0⟩ = nη^{n−1}|0⟩   (15)

for any non-negative integer n. Thus the set of kets (14) is such that
η or η̄ applied to any one of the set gives a ket dependent on the set.
Now all the dynamical variables in our problem are expressible in terms
of η and η̄, so the kets (14) must form a complete set (otherwise there
would be some more dynamical variables). There is just one of these
kets for each eigenvalue (12) of H, so H by itself forms a complete
commuting set of observables. The kets (14) correspond to the various
stationary states of the oscillator. The stationary state with energy
(n + ½)ħω, corresponding to ηⁿ|0⟩, is called the nth quantum state.

The square of the length of the ket ηⁿ|0⟩ is

    ⟨0|η̄ⁿηⁿ|0⟩ = n⟨0|η̄^{n−1}η^{n−1}|0⟩

with the help of (15). By induction, we find that

    ⟨0|η̄ⁿηⁿ|0⟩ = n!   (16)

provided |0⟩ is normalized. Thus the kets (14) multiplied by the
coefficients n!^{−1/2} with n = 0, 1, 2,..., respectively form the basic kets
of a representation, namely the representation with H diagonal. Any
ket |x⟩ can be expanded in the form

    |x⟩ = Σ_{n=0}^{∞} x_n ηⁿ|0⟩,   (17)

where the x_n's are numbers. In this way the ket |x⟩ is put into
correspondence with a power series Σ x_n ηⁿ in the variable η, the
various terms in the power series corresponding to the various
stationary states. If |x⟩ is normalized, it defines a state for which
the probability of the oscillator being in the nth quantum state,
i.e. the probability of H having the value (n + ½)ħω, is

    P_n = n!|x_n|²,   (18)

as follows from the same argument which led to (51) of § 18.

We may consider the ket |0⟩ as a standard ket and the power series
in η as a wave function, since any ket can be expressed as such a
wave function multiplied into this standard ket. The present kind
of wave function differs from the usual kind, introduced by equations
(62) of § 20, in that it is a function of the complex dynamical variable
η instead of observables. It is, however, for many purposes the most
convenient wave function to use for describing states of the harmonic
oscillator. The standard ket |0⟩ satisfies the condition (13), which
replaces the conditions (43) of § 22 for the standard ket in Schroedinger's
representation.

Let us introduce Schroedinger's representation with q diagonal and
obtain the representatives of the stationary states. From (13) and (3)

    (p − imωq)|0⟩ = 0,

so

    ⟨q′|p − imωq|0⟩ = 0.

With the help of (45) of § 22, this gives

    ħ (d/dq′)⟨q′|0⟩ + mωq′⟨q′|0⟩ = 0.   (19)

The solution of this differential equation is

    ⟨q′|0⟩ = (mω/πħ)^{1/4} e^{−mωq′²/2ħ},   (20)

the numerical coefficient being chosen so as to make |0⟩ normalized.
We have here the representative of the normal state, as the state of
lowest energy is called. The representatives of the other stationary
states can be obtained from it. We have from (3)

    ⟨q′|ηⁿ|0⟩ = (2mħω)^{−n/2} ⟨q′|(p + imωq)ⁿ|0⟩
              = (2mħω)^{−n/2} iⁿ (−ħ d/dq′ + mωq′)ⁿ ⟨q′|0⟩
              = iⁿ(2mħω)^{−n/2}(mω/πħ)^{1/4} (−ħ d/dq′ + mωq′)ⁿ e^{−mωq′²/2ħ}.   (21)

This may easily be worked out for small values of n. The result is of
the form of e^{−mωq′²/2ħ} times a power series of degree n in q′. A further
factor n!^{−1/2} must be inserted in (21) to get the normalized representative
of the nth quantum state. The factor iⁿ may be discarded, being
merely a phase factor.
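Equations (19)–(21) can be verified on a grid. The following sketch is our own illustration, again with m = ω = ħ = 1: it checks that the normal-state representative (20) satisfies the differential equation (19) and is normalized, and that applying −ħ d/dq′ + mωq′ with the factor n!^{−1/2}(2mħω)^{−n/2} inserted (and the phase iⁿ discarded) yields a normalized first excited representative orthogonal to the normal state.

```python
import numpy as np

# Grid check of (19)-(21) (our illustration: m = omega = hbar = 1).
qg = np.linspace(-10.0, 10.0, 4001)
dq = qg[1] - qg[0]
psi0 = np.pi**-0.25 * np.exp(-qg**2 / 2)       # eq. (20)

residual = np.gradient(psi0, dq) + qg * psi0   # left-hand side of (19); ~ 0

# n = 1 case of (21), with the normalizing factor 1/sqrt(2) inserted and the
# phase factor i discarded:
psi1 = (-np.gradient(psi0, dq) + qg * psi0) / np.sqrt(2.0)

norm0 = np.sum(psi0**2) * dq                   # normalization of the normal state
norm1 = np.sum(psi1**2) * dq                   # normalization of the first excited state
overlap = np.sum(psi0 * psi1) * dq             # orthogonality
```

Working out the algebra confirms the grid result: psi1 reduces to √2·q′·psi0, which is e^{−q′²/2} times a polynomial of degree 1 in q′, as stated in the text.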
35. Angular momentum
Let us consider a particle described by the three Cartesian coordi-
nates x, y, z and their conjugate momenta p_x, p_y, p_z. Its angular
momentum about the origin is defined as in the classical theory, by

    m_x = yp_z − zp_y,   m_y = zp_x − xp_z,   m_z = xp_y − yp_x,   (22)

or by the vector equation

    m = x×p.

We must evaluate the P.B.s of the angular momentum components
with the dynamical variables x, p_x, etc., and with each other. This
we can do most conveniently with the help of the laws (4) and (5) of
§ 21, thus

    [m_z, x] = [xp_y − yp_x, x] = −y[p_x, x] = y,
    [m_z, y] = [xp_y − yp_x, y] = x[p_y, y] = −x,   (23)

    [m_z, z] = [xp_y − yp_x, z] = 0,   (24)

and similarly,

    [m_z, p_x] = p_y,   [m_z, p_y] = −p_x,   (25)

    [m_z, p_z] = 0,   (26)

with corresponding relations for m_x and m_y. Again

    [m_y, m_z] = [zp_x − xp_z, m_z] = z[p_x, m_z] − [x, m_z]p_z
               = −zp_y + yp_z = m_x,
    [m_z, m_x] = m_y,   [m_x, m_y] = m_z.   (27)

These results are all the same as in the classical theory. The sign in
the results (23), (25), and (27) may easily be remembered from the
rule that the + sign occurs when the three dynamical variables, consisting
of the two in the P.B. on the left-hand side and the one
forming the result on the right, are in the cyclic order (xyz) and the
− sign occurs otherwise. Equations (27) may be put in the vector
form

    m×m = iħm.   (28)

Now suppose we have several particles with angular momenta
m₁, m₂, .... Each of these angular momentum vectors will satisfy
(28), thus

    m_r × m_r = iħm_r,

and any one of them will commute with any other, so that

    m_r × m_s + m_s × m_r = 0   (r ≠ s).
Hence if M = Σ_r m_r is the total angular momentum,

    M×M = Σ_{rs} m_r × m_s = Σ_r m_r × m_r + Σ_{r<s} (m_r × m_s + m_s × m_r)
        = iħ Σ_r m_r = iħM.   (29)

This result is of the same form as (28), so that the components of the
total angular momentum M of any number of particles satisfy the
same commutation relations as those of the angular momentum of
a single particle.
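Relations (28) and (29) can be checked directly on matrices. The sketch below is our own illustration, with ħ = 1 and the 2×2 matrices m = σ/2 standing in for the individual angular momenta, these being a convenient finite set of matrices satisfying (28); the total M for two such commuting sets then satisfies the same relations, as (29) asserts.

```python
import numpy as np

# Matrix check of (28) and (29) (our illustration: hbar = 1, m = sigma/2).
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

def cross_self(mx, my, mz):
    # Components of m x m, e.g. (m x m)_x = m_y m_z - m_z m_y.
    return (my @ mz - mz @ my, mz @ mx - mx @ mz, mx @ my - my @ mx)

cx, cy, cz = cross_self(sx, sy, sz)   # each equals i times sx, sy, sz: eq. (28)

# Total M for two commuting angular momenta:
eye = np.eye(2)
Mx = np.kron(sx, eye) + np.kron(eye, sx)
My = np.kron(sy, eye) + np.kron(eye, sy)
Mz = np.kron(sz, eye) + np.kron(eye, sz)
Cx, Cy, Cz = cross_self(Mx, My, Mz)   # each equals i times Mx, My, Mz: eq. (29)
```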
Let A_x, A_y, A_z denote the three coordinates of any one of the
particles, or else the three components of momentum of one of
the particles. The A's will commute with the angular momenta of
the other particles, and hence from (23), (24), (25), and (26)

    [M_z, A_x] = A_y,   [M_z, A_y] = −A_x,   [M_z, A_z] = 0.   (30)

If B_x, B_y, B_z are a second set of three quantities denoting the
coordinates or momentum components of one of the particles, they
will satisfy similar relations to (30). We shall then have

    [M_z, A_x B_x + A_y B_y + A_z B_z]
      = [M_z, A_x]B_x + A_x[M_z, B_x] + [M_z, A_y]B_y + A_y[M_z, B_y]
          + [M_z, A_z]B_z + A_z[M_z, B_z]
      = A_y B_x + A_x B_y − A_x B_y − A_y B_x
      = 0.

Thus the scalar product A_x B_x + A_y B_y + A_z B_z commutes with M_z,
and similarly with M_x and M_y. Introduce the vector product

    A×B = C,

or

    A_y B_z − A_z B_y = C_x,   A_z B_x − A_x B_z = C_y,   A_x B_y − A_y B_x = C_z.

We have

    [M_z, C_x] = −A_x B_z + A_z B_x = C_y

and similarly

    [M_z, C_y] = −C_x,   [M_z, C_z] = 0.

These equations are again of the form (30), with C for A. We can
conclude from this work that equations of the form (30) hold for the
three components of any vector that we can construct from our
dynamical variables, and that any scalar commutes with M.
We can introduce linear operators R referring to rotations about
the origin in the same way in which we introduced the linear operators
D in § 25 referring to displacements. Taking a rotation through an
angle δφ about the z-axis and making δφ infinitesimal, we can obtain
the limit operator corresponding to (64) of § 25,

    lim_{δφ→0} (R − 1)/δφ,

which we shall call the rotation operator about the z-axis and denote
by r_z. Like the displacement operators, r_z is a pure imaginary linear
operator and is undetermined to the extent of an arbitrary additive
pure imaginary number. Corresponding to (66) of § 25, the change
in any dynamical variable v caused by a rotation through a small
angle δφ about the z-axis is

    δφ(r_z v − vr_z),   (31)

to the first order in δφ. Now the changes produced in the three
components A_x, A_y, A_z of a vector by a (right-handed) rotation δφ
about the z-axis applied to all measuring apparatus are δφA_y,
−δφA_x, and 0 respectively, and any scalar quantity is unchanged by
the rotation. Equating these changes to (31), we find that

    r_z A_x − A_x r_z = A_y,   r_z A_y − A_y r_z = −A_x,
    r_z A_z − A_z r_z = 0,

and r_z commutes with any scalar. Comparing these results with (30),
we see that iħr_z satisfies the same commutation relations as M_z.
Their difference, M_z − iħr_z, commutes with all the dynamical variables
and must therefore be a number. This number, which is necessarily
real since M_z and iħr_z are real, may be made zero by a suitable choice
of the arbitrary pure imaginary number that can be added to r_z. We
then have the result

    M_z = iħr_z.   (32)

Similar equations hold for M_x and M_y. They are the analogues of (69)
of § 25. Thus the total angular momentum is connected with the rotation
operators as the total momentum is connected with the displacement
operators. This conclusion is valid for any point as origin.
The above argument applies to the angular momentum arising
from the motion of particles, defined by (22) for each particle. There
is another kind of angular momentum occurring in atomic theory,
spin angular momentum. The former kind of angular momentum will
be called orbital angukzr momentum, to distinguish it. The spin angular
momentum of a particle should be pictured as due to some internal
motion of the particle, so that it is associated with different degrees
of freedom from those describing the motion of the particle as a whole,
§ 35 ANGULAR MOMENTUM
and hence the dynamical variables that describe the spin must commute
with x, y, z, p_x, p_y, and p_z. The spin does not correspond very
closely to anything in classical mechanics, so the method of classical
analogy is not suitable for studying it. However, we can build up a
theory of the spin simply from the assumption that the components
of the spin angular momentum are connected with the rotation operators
in the same way as we had above for orbital angular momentum,
i.e. equation (32) holds with M_z as the z component of the spin angular
momentum of a particle and r_z as the rotation operator about the
z-axis referring to states of spin of that particle. With this assumption,
the commutation relations connecting the components of the
spin angular momentum M with any vector A referring to the spin
must be of the standard form (30), and hence, taking A to be the
spin angular momentum itself, we have equation (29) holding also
for the spin. We now have (29) holding quite generally, for any sum
of spin and orbital angular momenta, and also (30) will hold generally,
for M the total spin and orbital angular momentum and A any vector
dynamical variable, and the connexion between angular momentum
and rotation operators will be always valid.
As an immediate consequence of this connexion, we can deduce the
law of conservation of angular momentum. For an isolated system, the
Hamiltonian must be unchanged by any rotation about the origin, in
other words it must be a scalar, so it must commute with the angular
momentum about the origin. Thus the angular momentum is a
constant of the motion. For this argument the origin may be any
point.
As a second immediate consequence, we can deduce that a state
with zero total angular momentum is spherically symmetrical. The state
will correspond to a ket |S⟩, say, satisfying
M_x|S⟩ = M_y|S⟩ = M_z|S⟩ = 0,
and hence r_x|S⟩ = r_y|S⟩ = r_z|S⟩ = 0.
This shows that the ket |S⟩ is unaltered by infinitesimal rotations,
and it must therefore be unaltered by finite rotations, since the latter
can be built up from infinitesimal ones. Thus the state is spherically
symmetrical. The converse theorem, a spherically symmetrical state
has zero total angular momentum, is also true, though its proof is not
quite so simple. A spherically symmetrical state corresponds to a ket
|S⟩ whose direction is unaltered by any rotation. Thus the change
in |S⟩ produced by a rotation operator r_x, r_y, or r_z must be a numerical
multiple of |S⟩, say
r_x|S⟩ = c_x|S⟩,  r_y|S⟩ = c_y|S⟩,  r_z|S⟩ = c_z|S⟩,
where the c's are numbers. This gives
M_x|S⟩ = iℏc_x|S⟩,  M_y|S⟩ = iℏc_y|S⟩,  M_z|S⟩ = iℏc_z|S⟩. (33)
These equations are not consistent with the commutation relations
(29) for M_x, M_y, M_z unless c_x = c_y = c_z = 0, in which case the state
has zero total angular momentum. We have in (33) an example of
a ket which is simultaneously an eigenket of the three non-commuting
linear operators M_x, M_y, M_z, and this is possible only if all three
eigenvalues are zero.
36. Properties of angular momentum
There are some general properties of angular momentum, deducible
simply from the commutation relations between the three components.
These properties must hold equally for spin and orbital angular
momentum. Let m_x, m_y, m_z be the three components of an angular
momentum, and introduce the quantity β defined by
β = m_x² + m_y² + m_z².
Since β is a scalar it must commute with m_x, m_y, and m_z. Let us
suppose we have a dynamical system for which m_x, m_y, m_z are the
only dynamical variables. Then β commutes with everything and
must be a number. We can study this dynamical system on much
the same lines as we used for the harmonic oscillator in § 34.
Put m_x − im_y = η.
From the commutation relations (27) we get
η̄η = (m_x + im_y)(m_x − im_y) = m_x² + m_y² − i(m_x m_y − m_y m_x)
   = β − m_z² + ℏm_z (34)
and similarly ηη̄ = β − m_z² − ℏm_z. (35)
Thus η̄η − ηη̄ = 2ℏm_z. (36)
Also m_z η − η m_z = iℏm_y − ℏm_x = −ℏη. (37)
We assume that the components of an angular momentum are
observables and thus m_z has eigenvalues. Let m_z′ be one of them,
and |m_z′⟩ an eigenket belonging to it. From (34)
⟨m_z′|η̄η|m_z′⟩ = ⟨m_z′|β − m_z² + ℏm_z|m_z′⟩ = (β − m_z′² + ℏm_z′)⟨m_z′|m_z′⟩.
The left-hand side here is the square of the length of the ket η|m_z′⟩
and is thus greater than or equal to zero, the case of equality occurring
if and only if η|m_z′⟩ = 0. Hence
β − m_z′² + ℏm_z′ ≥ 0,
or β + ¼ℏ² ≥ (m_z′ − ½ℏ)². (38)
Thus β + ¼ℏ² ≥ 0.
Defining the number k by
k + ½ℏ = (β + ¼ℏ²)^½ = (m_x² + m_y² + m_z² + ¼ℏ²)^½, (39)
so that k ≥ −½ℏ, the inequality (38) becomes
k + ½ℏ ≥ |m_z′ − ½ℏ|
or k + ℏ ≥ m_z′ ≥ −k. (40)
An equality occurs if and only if η|m_z′⟩ = 0. Similarly from (35)
⟨m_z′|ηη̄|m_z′⟩ = (β − m_z′² − ℏm_z′)⟨m_z′|m_z′⟩,
showing that β − m_z′² − ℏm_z′ ≥ 0
or k ≥ m_z′ ≥ −k − ℏ,
with an equality occurring if and only if η̄|m_z′⟩ = 0. This result
combined with (40) shows that k ≥ 0 and
k ≥ m_z′ ≥ −k, (41)
with m_z′ = k if η̄|m_z′⟩ = 0 and m_z′ = −k if η|m_z′⟩ = 0.
From (37)
m_z η|m_z′⟩ = (η m_z − ℏη)|m_z′⟩ = (m_z′ − ℏ)η|m_z′⟩.
Now if m_z′ ≠ −k, η|m_z′⟩ is not zero and is then an eigenket of m_z
belonging to the eigenvalue m_z′ − ℏ. Similarly, if m_z′ − ℏ ≠ −k, m_z′ − 2ℏ
is another eigenvalue of m_z, and so on. We get in this way a series
of eigenvalues m_z′, m_z′−ℏ, m_z′−2ℏ, ..., which must terminate from (41),
and can terminate only with the value −k. Again, from the conjugate
complex of equation (37)
m, rilmL> = (@b+f$) Im;> = (mi+f+j W,
showing that rni+fi is another eigenvalue of m, unless Olms) = 0, in
which case rnz = k. Continuing in this way we get a series of eigenvalues
mL,mL+fi, rnL+%i ,..., which must termirrate from (.41), and
tan terminate only with the value k. We tan conclude that 2k is an
integral multiple of iti and that the eigenvalues of m, are
k, k-4, k-4%, . . . . -k+fi, -k. (42)
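The ladder construction above can be checked numerically. The following sketch (ours, not Dirac's; ℏ = 1 and the label j = k/ℏ are conventions we introduce) builds matrix representations of m_z and the lowering operator η = m_x − im_y, and verifies (36), (37), and the spectrum (42):

```python
import numpy as np

def angular_momentum(j, hbar=1.0):
    """Matrices for m_z and the lowering operator eta = m_x - i*m_y,
    in the basis |j>, |j-1>, ..., |-j> of m_z eigenstates."""
    dim = int(round(2 * j)) + 1
    mvals = hbar * np.array([j - n for n in range(dim)])   # k, k-hbar, ..., -k
    mz = np.diag(mvals)
    # standard lowering amplitude: <m-1| eta |m> = hbar*sqrt(j(j+1) - m(m-1))
    eta = np.zeros((dim, dim))
    for n in range(dim - 1):
        m = j - n
        eta[n + 1, n] = hbar * np.sqrt(j * (j + 1) - m * (m - 1))
    return mz, eta

hbar = 1.0
for j in (0.5, 1.0, 1.5, 2.0):
    mz, eta = angular_momentum(j, hbar)
    etab = eta.conj().T                       # eta-bar, the raising operator
    # (36): eta-bar eta - eta eta-bar = 2 hbar m_z
    assert np.allclose(etab @ eta - eta @ etab, 2 * hbar * mz)
    # (37): m_z eta - eta m_z = -hbar eta
    assert np.allclose(mz @ eta - eta @ mz, -hbar * eta)
    # spectrum (42): k, k-hbar, ..., -k with k = j*hbar
    assert np.allclose(np.sort(np.diag(mz)), np.arange(-j, j + 1) * hbar)
```

The chain terminates at ±k exactly as in the text: η annihilates the lowest basis ket and η̄ the highest.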
The eigenvalues of m_x and m_y are the same, from symmetry. These
eigenvalues are all integral or half odd integral multiples of ℏ, according
to whether 2k is an even or odd multiple of ℏ.
Let |max⟩ be an eigenket of m_z belonging to the maximum eigenvalue
k, so that
η̄|max⟩ = 0, (43)
and form the sequence of kets
|max⟩, η|max⟩, η²|max⟩, ..., η^{2k/ℏ}|max⟩. (44)
These kets are all eigenkets of m_z, belonging to the sequence of eigenvalues
(42) respectively. The set of kets (44) is such that the operator
η applied to any one of them gives a ket dependent on the set (η
applied to the last gives zero), and from (36) and (43) one sees
that η̄ applied to any one of the set also gives a ket dependent on the
set. All the dynamical variables for the system we are now dealing
with are expressible in terms of η and η̄, so the set of kets (44) is a
complete set. There is just one of these kets for each eigenvalue (42)
of m_z, so m_z by itself forms a complete commuting set of observables.
It is convenient to define the magnitude of the angular momentum
vector m to be k, given by (39), rather than β^½, because the possible
values for k are
0, ½ℏ, ℏ, 3ℏ/2, 2ℏ, ..., (45)
extending to infinity, while the possible values for β^½ are a more
complicated set of numbers.
For a dynamical system involving other dynamical variables besides
m_x, m_y, and m_z, there may be variables that do not commute with β.
Then β is no longer a number, but a general linear operator. This
happens for any orbital angular momentum (22), as x, y, z, p_x, p_y, and
p_z do not commute with β. We shall assume that β is always an
observable, and k can then be defined by (39) with the positive square
root function and is also an observable. We shall call k so defined
the magnitude of the angular momentum vector m in the general
case. The above analysis by which we obtained the eigenvalues of
m_z is still valid if we replace |m_z′⟩ by a simultaneous eigenket |k′m_z′⟩
of the commuting observables k and m_z, and leads to the result that
the possible eigenvalues for k are the numbers (45), and for each
eigenvalue k′ of k the eigenvalues of m_z are the numbers (42) with k′
substituted for k. We have here an example of a phenomenon which
we have not met with previously, namely that with two commuting
observables, the eigenvalues of one depend on what eigenvalue we
assign to the other. This phenomenon may be understood as the two
observables being not altogether independent, but partially functions
of one another. The number of independent simultaneous eigenkets
of k and m_z belonging to the eigenvalues k′ and m_z′ must be independent
of m_z′, since for each independent |k′m_z′⟩ we can obtain an
independent |k′m_z″⟩, for any m_z″ in the sequence (42), by multiplying
|k′m_z′⟩ by a suitable power of η or η̄.
As an example let us consider a dynamical system with two angular
momenta m_1 and m_2, which commute with one another. If there are
no other dynamical variables, then all the dynamical variables commute
with the magnitudes k_1 and k_2 of m_1 and m_2, so k_1 and k_2 are
numbers. However, the magnitude K of the resultant angular
momentum M = m_1+m_2 is not a number (it does not commute
with the components of m_1 and m_2) and it is interesting to work out
the eigenvalues of K. This can be done most simply by a method
of counting independent kets. There is one independent simultaneous
eigenket of m_1z and m_2z belonging to any eigenvalue m_1z′ having one
of the values k_1, k_1−ℏ, k_1−2ℏ, ..., −k_1 and any eigenvalue m_2z′ having one
of the values k_2, k_2−ℏ, k_2−2ℏ, ..., −k_2, and this ket is an eigenket
of M_z belonging to the eigenvalue M_z′ = m_1z′+m_2z′. The possible
values of M_z′ are thus k_1+k_2, k_1+k_2−ℏ, k_1+k_2−2ℏ, ..., −k_1−k_2, and
the number of times each of them occurs is given by the following
scheme (if we assume for definiteness that k_1 ≥ k_2),
k_1+k_2,  k_1+k_2−ℏ,  k_1+k_2−2ℏ,  ...,  k_1−k_2,  k_1−k_2−ℏ,  ...
   1,         2,           3,       ...,  2k_2/ℏ+1,  2k_2/ℏ+1,  ...   (46)
 ...,  −k_1+k_2,  −k_1+k_2−ℏ,  ...,  −k_1−k_2
 ...,  2k_2/ℏ+1,   2k_2/ℏ,     ...,     1
Now each eigenvalue K′ of K will be associated with the eigenvalues
K′, K′−ℏ, K′−2ℏ, ..., −K′ for M_z, with the same number of independent
simultaneous eigenkets of K and M_z for each of them. The total
number of independent eigenkets of M_z belonging to any eigenvalue
M_z′ must be the same, whether we take them to be simultaneous
eigenkets of m_1z and m_2z or simultaneous eigenkets of K and M_z, i.e.
it is always given by the scheme (46). It follows that the eigenvalues
for K are
k_1+k_2,  k_1+k_2−ℏ,  k_1+k_2−2ℏ,  ...,  k_1−k_2, (47)
and that for each of these eigenvalues for K and an eigenvalue for
M_z going with it there is just one independent simultaneous eigenket
of K and M_z.
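The counting argument behind scheme (46) is easy to mechanize. A sketch (ℏ = 1; the function name and the peeling strategy are ours): tally the multiplicity of each M_z′ = m_1z′ + m_2z′, then repeatedly strip off the multiplet headed by the largest remaining M_z′.

```python
from collections import Counter
from fractions import Fraction as F

def coupled_k_values(k1, k2):
    """Eigenvalues of K for M = m1 + m2, found by counting kets (hbar = 1)."""
    k1, k2 = F(k1), F(k2)
    # multiplicity of each eigenvalue M_z' = m1z' + m2z'
    counts = Counter(k1 - i + k2 - j
                     for i in range(int(2 * k1) + 1)
                     for j in range(int(2 * k2) + 1))
    ks = []
    while counts:
        top = max(counts)          # largest remaining M_z' heads a new multiplet K'
        ks.append(top)
        m = top
        while m >= -top:           # remove one ket for each M_z' in that multiplet
            counts[m] -= 1
            if counts[m] == 0:
                del counts[m]
            m -= 1
    return ks

# k1 = 2, k2 = 1 gives K = 3, 2, 1; two spins 1/2 give K = 1, 0
assert coupled_k_values(2, 1) == [3, 2, 1]
assert coupled_k_values(F(1, 2), F(1, 2)) == [1, 0]
```

Each strip removes one ket for every M_z′ between K′ and −K′, exactly as in the comparison of the two complete sets of eigenkets in the text.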
The effect of rotations on eigenkets of angular momentum variables
should be noted. Take any eigenket |M_z′⟩ of the z component of total
angular momentum for any dynamical system, and apply to it a small
rotation through an angle δφ about the z-axis. It will change into
(1 + δφ r_z)|M_z′⟩ = (1 − iδφ M_z/ℏ)|M_z′⟩
with the help of (32). This equals
(1 − iδφ M_z′/ℏ)|M_z′⟩
to the first order in δφ. Thus |M_z′⟩ gets multiplied by the numerical
factor e^{−iδφM_z′/ℏ}. By applying a succession of these small rotations, we
find that the application of a finite rotation through an angle φ about
the z-axis causes |M_z′⟩ to get multiplied by e^{−iφM_z′/ℏ}. Putting φ = 2π,
we find that an application of one revolution about the z-axis leaves
|M_z′⟩ unchanged if the eigenvalue M_z′ is an integral multiple of ℏ and
causes |M_z′⟩ to change sign if M_z′ is half an odd integral multiple of ℏ.
Now consider an eigenket |K′⟩ of the magnitude K of the total angular
momentum. If the eigenvalue K′ is an integral multiple of ℏ, the
possible eigenvalues of M_z are all integral multiples of ℏ and the application
of one revolution about the z-axis must leave |K′⟩ unchanged.
Conversely, if K′ is half an odd integral multiple of ℏ, the possible eigenvalues
of M_z are all half odd integral multiples of ℏ and the revolution
must change the sign of |K′⟩. From symmetry, the application of a
revolution about any other axis must have the same effect on |K′⟩
as one about the z-axis. We thus get the general result, the application
of one revolution about any axis leaves a ket unchanged or changes its
sign according to whether it belongs to eigenvalues of the magnitude of
the total angular momentum which are integral or half odd integral
multiples of ℏ. A state, of course, is always unaffected by the revolution,
since a state is unaffected by a change of sign of the ket corresponding
to it.
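The numerical factor e^{−iφM_z′/ℏ} can be tabulated directly; a minimal sketch (ℏ = 1, function name ours):

```python
import numpy as np

# Phase factor picked up by |M_z'> under a rotation through phi about the z-axis
def rotation_factor(phi, m, hbar=1.0):
    return np.exp(-1j * phi * m / hbar)

two_pi = 2 * np.pi
# integral multiple of hbar: ket unchanged after one revolution
assert np.isclose(rotation_factor(two_pi, 1.0), 1.0)
# half odd integral multiple: ket changes sign
assert np.isclose(rotation_factor(two_pi, 0.5), -1.0)
# a 4*pi rotation restores the half-integral ket as well
assert np.isclose(rotation_factor(2 * two_pi, 0.5), 1.0)
```

The last line illustrates the closing remark: the state (ray) is unaffected even when the ket changes sign.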
For a dynamical system involving only orbital angular momenta,
a ket must be unchanged by a revolution about an axis, since we can
set up Schrödinger's representation, with the coordinates of all the
particles diagonal, and the Schrödinger representative of a ket will
get brought back to its original value by the revolution. It follows
that the eigenvalues of the magnitude of an orbital angular momentum
are always integral multiples of ℏ. The eigenvalues of a component
of an orbital angular momentum are also always integral multiples
of ℏ. For a spin angular momentum, Schrödinger's representation
does not exist and both kinds of eigenvalue are possible.
37. The spin of the electron
Electrons, and also some of the other fundamental particles (protons,
neutrons) have a spin whose magnitude is ½ℏ. This is found
from experimental evidence, and also there are theoretical reasons
showing that this spin value is more elementary than any other, even
spin zero (see Chapter XI). The study of this particular spin is therefore
of special importance.
For dealing with an angular momentum m whose magnitude is ½ℏ,
it is convenient to put
m = ½ℏσ. (48)
The components of the vector σ then satisfy, from (27),
σ_y σ_z − σ_z σ_y = 2iσ_x,
σ_z σ_x − σ_x σ_z = 2iσ_y, (49)
σ_x σ_y − σ_y σ_x = 2iσ_z.
The eigenvalues of m_z are ½ℏ and −½ℏ, so the eigenvalues of σ_z are 1
and −1, and σ_z² has just the one eigenvalue 1. It follows that σ_z² must
equal 1, and similarly for σ_x² and σ_y², i.e.
σ_x² = σ_y² = σ_z² = 1. (50)
We can get equations (49) and (50) into a simpler form by means of
some straightforward non-commutative algebra. From (50)
σ_y²σ_z − σ_z σ_y² = 0
or σ_y(σ_y σ_z − σ_z σ_y) + (σ_y σ_z − σ_z σ_y)σ_y = 0
or σ_y σ_x + σ_x σ_y = 0
with the help of the first of equations (49). This means σ_x σ_y = −σ_y σ_x.
Two dynamical variables or linear operators like these which satisfy
the commutative law of multiplication except for a minus sign will
be said to anticommute. Thus σ_x anticommutes with σ_y. From symmetry
each of the three dynamical variables σ_x, σ_y, σ_z must anticommute
with any other. Equations (49) may now be written
σ_y σ_z = iσ_x = −σ_z σ_y,
σ_z σ_x = iσ_y = −σ_x σ_z, (51)
σ_x σ_y = iσ_z = −σ_y σ_x,
and also from (50)
σ_x σ_y σ_z = i. (52)
Equations (50), (51), (52) are the fundamental equations satisfied by
the spin variables σ describing a spin whose magnitude is ½ℏ.
Let us set up a matrix representation for the σ's and let us take σ_z
to be diagonal. If there are no other independent dynamical variables
besides the m's or σ's in our dynamical system, then σ_z by itself forms
a complete set of commuting observables, since the form of equations
(50) and (51) is such that we cannot construct out of σ_x, σ_y, and σ_z
any new dynamical variable that commutes with σ_z. The diagonal
elements of the matrix representing σ_z being the eigenvalues 1 and
−1 of σ_z, the matrix itself will be
( 1   0 )
( 0  −1 ).
Let σ_x be represented by
( a_1  a_2 )
( a_3  a_4 ).
This matrix must be Hermitian, so that a_1 and a_4 must be real and
a_2 and a_3 conjugate complex numbers. The equation σ_z σ_x = −σ_x σ_z
gives us
( a_1   a_2 )   ( −a_1   a_2 )
( −a_3  −a_4 ) = ( −a_3   a_4 ),
so that a_1 = a_4 = 0. Hence σ_x is represented by a matrix of the form
( 0    a_2 )
( a_3  0  ).
The equation σ_x² = 1 now shows that a_2 a_3 = 1. Thus a_2 and a_3, being
conjugate complex numbers, must be of the form e^{iα} and e^{−iα} respectively,
where α is a real number, so that σ_x is represented by a
matrix of the form
( 0       e^{iα} )
( e^{−iα}  0     ).
Similarly it may be shown that σ_y is also represented by a matrix of
this form. By suitably choosing the phase factors in the representation,
which is not completely determined by the condition that σ_z
shall be diagonal, we can arrange that σ_x shall be represented by the
matrix
( 0  1 )
( 1  0 ).
The representative of σ_y is then determined by the equation
σ_y = iσ_x σ_z. We thus obtain finally the three matrices
( 0  1 )   ( 0  −i )   ( 1   0 )
( 1  0 ),  ( i   0 ),  ( 0  −1 ) (53)
to represent σ_x, σ_y, and σ_z respectively, which matrices satisfy all the
algebraic relations (49), (50), (51), (52). The component of the vector
σ in an arbitrary direction specified by the direction cosines l, m, n,
namely lσ_x + mσ_y + nσ_z, is represented by
( n      l−im )
( l+im    −n ). (54)
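With the standard matrices of (53) these relations are quickly verified; a numerical sketch (ours, ℏ = 1):

```python
import numpy as np

# The representation (53) with sigma_z diagonal (the standard Pauli matrices)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# (50): each component squares to 1
for s in (sx, sy, sz):
    assert np.allclose(s @ s, np.eye(2))
# (51): products and anticommutation, e.g. sigma_x sigma_y = i sigma_z = -sigma_y sigma_x
assert np.allclose(sx @ sy, 1j * sz) and np.allclose(sx @ sy, -(sy @ sx))
assert np.allclose(sy @ sz, 1j * sx) and np.allclose(sz @ sx, 1j * sy)
# (52): sigma_x sigma_y sigma_z = i
assert np.allclose(sx @ sy @ sz, 1j * np.eye(2))
# (49): commutation relations, e.g. sigma_y sigma_z - sigma_z sigma_y = 2i sigma_x
assert np.allclose(sy @ sz - sz @ sy, 2j * sx)
```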
The representative of a ket vector will consist of just two numbers,
corresponding to the two values +1 and −1 for σ_z′. These two numbers
form a function of the variable σ_z′ whose domain consists of only
the two points +1 and −1. The state for which σ_z has the value unity
will be represented by the function, f_α(σ_z′) say, consisting of the pair
of numbers 1, 0 and that for which σ_z has the value −1 will be
represented by the function, f_β(σ_z′) say, consisting of the pair 0, 1.
Any function of the variable σ_z′, i.e. any pair of numbers, can be
expressed as a linear combination of these two. Thus any state can
be obtained by superposition of the two states for which σ_z equals +1 and
−1 respectively. For example, the state for which the component of
σ in the direction l, m, n, represented by (54), has the value +1 is
represented by the pair of numbers a, b which satisfy
na + (l−im)b = a,
(l+im)a − nb = b.
Thus a/b = (l−im)/(1−n) = (1+n)/(l+im).
This state can be regarded as a superposition of the two states for
which σ_z equals +1 and −1, the relative weights in the superposition
process being as
|a|² : |b|² = |l−im|² : (1−n)² = 1+n : 1−n. (55)
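Equation (55) can be confirmed by diagonalizing the matrix (54); a sketch with an arbitrarily chosen unit direction (l, m, n) = (0.36, 0.48, 0.80):

```python
import numpy as np

# a sample unit vector of direction cosines (l, m, n)
l, m, n = 0.36, 0.48, 0.80
assert np.isclose(l * l + m * m + n * n, 1.0)

# the matrix (54) representing the spin component along (l, m, n)
comp = np.array([[n, l - 1j * m],
                 [l + 1j * m, -n]])
vals, vecs = np.linalg.eigh(comp)
assert np.allclose(sorted(vals), [-1.0, 1.0])      # eigenvalues are -1 and +1

# eigenvector for eigenvalue +1: weights |a|^2 : |b|^2 = (1+n) : (1-n), eq. (55)
plus = vecs[:, np.argmax(vals)]
a2, b2 = abs(plus[0]) ** 2, abs(plus[1]) ** 2
assert np.isclose(a2 / b2, (1 + n) / (1 - n))
```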
For the complete description of an electron (or other elementary
particle with spin ½ℏ) we require the spin dynamical variables σ,
whose connexion with the spin angular momentum is given by (48),
together with the Cartesian coordinates x, y, z and momenta p_x, p_y,
p_z. The spin dynamical variables commute with these coordinates
and momenta. Thus a complete set of commuting observables for a
system consisting of a single electron will be x, y, z, σ_z. In a representation
in which these are diagonal, the representative of any state
will be a function of four variables x′, y′, z′, σ_z′. Since σ_z′ has a domain
consisting of only two points, namely 1 and −1, this function of four
variables is the same as two functions of three variables, namely the
two functions
⟨x′y′z′|⟩₊ = ⟨x′, y′, z′, +1|⟩,  ⟨x′y′z′|⟩₋ = ⟨x′, y′, z′, −1|⟩. (56)
Thus the presence of the spin may be considered either as introducing a
new variable into the representative of a state or as giving this representative
two components.
38. Motion in a central field of force
An atom consists of a massive positively charged nucleus together
with a number of electrons moving round, under the influence of the
attractive force of the nucleus and their own mutual repulsions. An
exact treatment of this dynamical system is a very difficult mathematical
problem. One can, however, gain some insight into the main
features of the system by making the rough approximation of regarding
each electron as moving independently in a certain central field
of force, namely that of the nucleus, assumed fixed, together with
some kind of average of the forces due to the other electrons. Thus
our present problem of the motion of a particle in a central field of
force forms a corner-stone in the theory of the atom.
Let the Cartesian coordinates of the particle, referred to a system
of axes with the centre of force as origin, be x, y, z and the corresponding
components of momentum p_x, p_y, p_z. The Hamiltonian,
with neglect of relativistic mechanics, will be of the form
H = 1/2m · (p_x² + p_y² + p_z²) + V, (57)
where V, the potential energy, is a function only of (x²+y²+z²). To
develop the theory it is convenient to introduce polar dynamical
variables. We introduce first the radius r, defined as the positive
square root
r = (x² + y² + z²)^½.
Its eigenvalues go from 0 to ∞. If we evaluate its P.B.s with p_x, p_y,
and p_z, we obtain, with the help of formula (32) of § 22,
[r, p_x] = x/r,  [r, p_y] = y/r,  [r, p_z] = z/r,
the same as in the classical theory. We introduce also the dynamical
variable p_r defined by
rp_r = xp_x + yp_y + zp_z. (58)
Its P.B. with r is given by
r[r, p_r] = [r, rp_r] = [r, xp_x + yp_y + zp_z]
= x[r, p_x] + y[r, p_y] + z[r, p_z]
= x·x/r + y·y/r + z·z/r = r.
Hence [r, p_r] = 1
or rp_r − p_r r = iℏ.
The commutation relation between r and p_r is just the one for a
canonical coordinate and momentum, namely equation (10) of § 22.
This makes p_r like the momentum conjugate to the r coordinate, but
it is not exactly equal to this momentum because it is not real, its
conjugate complex being
p̄_r = (p_x x + p_y y + p_z z)r⁻¹ = (xp_x + yp_y + zp_z − 3iℏ)r⁻¹
= (rp_r − 3iℏ)r⁻¹ = p_r − 2iℏr⁻¹. (59)
Thus p_r − iℏr⁻¹ is real and is the true momentum conjugate to r.
The angular momentum m of the particle about the origin is given
by (22) and its magnitude k is given by (39). Since r and p_r are
scalars, they commute with m, and therefore also with k.
We can express the Hamiltonian in terms of r, p_r, and k. We have,
if Σ denotes a sum over cyclic permutations of the suffixes x, y, z,
k(k+ℏ) = Σ m_z² = Σ (xp_y − yp_x)²
= Σ (xp_y xp_y + yp_x yp_x − xp_y yp_x − yp_x xp_y)
= Σ (x²p_y² + y²p_x² − xp_x p_y y − yp_y p_x x + x²p_x² − xp_x p_x x − 2iℏxp_x)
= (x²+y²+z²)(p_x²+p_y²+p_z²) − (xp_x+yp_y+zp_z)(p_x x + p_y y + p_z z + 2iℏ)
= r²(p_x²+p_y²+p_z²) − rp_r(p̄_r r + 2iℏ)
= r²(p_x²+p_y²+p_z²) − r²p̄_r p_r,
from (59). Hence
H = 1/2m · (p̄_r p_r + k(k+ℏ)/r²) + V. (60)
This form for H is such that k commutes not only with H, as is
necessary since k is a constant of the motion, but also with every
dynamical variable occurring in H, namely r, p_r, and V, which is a
function of r. In consequence, a simple treatment becomes possible,
namely, we may consider an eigenstate of k belonging to an eigenvalue
k′ and then we can substitute k′ for k in (60) and get a problem
in one degree of freedom r.
Let us introduce Schrödinger's representation with x, y, z diagonal.
Then p_x, p_y, p_z are equal to the operators −iℏ∂/∂x, −iℏ∂/∂y, −iℏ∂/∂z
respectively. A state is represented by a wave function ψ(xyzt) satisfying
Schrödinger's wave equation (7) of § 27, which now reads, with
H given by (57),
iℏ ∂ψ/∂t = {−ℏ²/2m · (∂²/∂x² + ∂²/∂y² + ∂²/∂z²) + V}ψ. (61)
We may pass from the Cartesian coordinates x, y, z to the polar
coordinates r, θ, φ by means of the equations
x = r sinθ cosφ,
y = r sinθ sinφ, (62)
z = r cosθ,
and may express the wave function in terms of the polar coordinates,
so that it reads ψ(rθφt). The equations (62) give the operator equation
∂/∂r = (∂x/∂r)∂/∂x + (∂y/∂r)∂/∂y + (∂z/∂r)∂/∂z
= (x/r)∂/∂x + (y/r)∂/∂y + (z/r)∂/∂z,
which shows, on being compared with (58), that p_r = −iℏ ∂/∂r. Thus
Schrödinger's wave equation reads, with the form (60) for H,
iℏ ∂ψ/∂t = {−ℏ²/2m · (1/r)(∂²/∂r²)r + k(k+ℏ)/2mr² + V}ψ. (63)
1 1
Here k is a certain linear Operator which, since it commutes with r
and a/ar, tan involve only 6, #, a/8, and a/a+. From the formula
w+w = m~+??g+?n~, (64)
which Comes from (39), and from (62) one tan work out the form of
k(k+fi) and one finds
W+fi) 1 asinoa 1 a2
--@---=---
sin 8 ae ----
ae sin29 ap ' (65)
This Operator is well known in mathematical physics. Its eigenfunctions
are called sphericul harmonics and its eigenvalues are
n(n,+l) where n is an integer. Thus the theory of spherical harmonics
provides an alternative proof that the eigenvalues of k are
integral multiples of $.
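The eigenvalue property quoted for (65) can be spot-checked by finite differences on the φ-independent part of the operator, using the order-1 and order-2 harmonics cos θ and 3cos²θ − 1 (the grid sizes below are arbitrary choices of ours):

```python
import numpy as np

def legendre_operator(f, theta):
    """Apply -(1/sin t) d/dt (sin t df/dt), the phi-independent part of (65),
    by central differences on a uniform theta grid."""
    g = np.sin(theta) * np.gradient(f, theta)        # sin t * df/dt
    return -np.gradient(g, theta) / np.sin(theta)

theta = np.linspace(0.05, np.pi - 0.05, 2001)        # stay clear of the poles
for n, f in [(1, np.cos(theta)), (2, 3 * np.cos(theta) ** 2 - 1)]:
    lf = legendre_operator(f, theta)
    # interior points should satisfy the eigenvalue relation with value n(n+1)
    err = np.max(np.abs(lf[5:-5] - n * (n + 1) * f[5:-5]))
    assert err < 1e-3
```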
For an eigenstate of k belonging to the eigenvalue nℏ (n a non-negative
integer) the wave function will be of the form
ψ = r⁻¹χ(rt)S_n(θφ), (66)
where S_n(θφ) satisfies
k(k+ℏ)S_n = n(n+1)ℏ²S_n, (67)
i.e. from (65) S_n is a spherical harmonic of order n. The factor r⁻¹
is inserted in (66) for convenience. Substituting (66) into (63), we
get as the equation for χ
iℏ ∂χ/∂t = {−ℏ²/2m · (∂²/∂r² − n(n+1)/r²) + V}χ. (68)
If the state is a stationary state belonging to the energy value H′,
χ will be of the form
χ(rt) = χ₀(r)e^{−iH′t/ℏ}
and (68) will reduce to
H′χ₀ = {−ℏ²/2m · (d²/dr² − n(n+1)/r²) + V}χ₀. (69)
This equation may be used to determine the energy-levels H′ of the
system. For each solution χ₀ of (69), arising from a given n, there
will be 2n+1 independent states, because there are 2n+1 independent
solutions of (67) corresponding to the 2n+1 different values
that a component of the angular momentum, m_z say, can take on.
The probability of the particle being in an element of volume
dxdydz is proportional to |ψ|²dxdydz. With ψ of the form (66) this
becomes r⁻²|χ|²|S_n|²dxdydz. The probability of the particle being in
a spherical shell between r and r+dr is then proportional to |χ|²dr.
It now becomes clear that, in solving equation (68) or (69), we must
impose a boundary condition on the function χ at r = 0, namely the
function must be such that the integral to the origin ∫₀ |χ|² dr is
convergent. If this integral were not convergent, the wave function
would represent a state for which the chances are infinitely in favour
of the particle being at the origin and such a state would not be
physically admissible.
The boundary condition at r = 0 obtained by the above consideration
of probabilities is, however, not sufficiently stringent. We get a
more stringent condition by verifying that the wave function obtained
by solving the wave equation in polar coordinates (63) really satisfies
the wave equation in Cartesian coordinates (61). Let us take the case
of V = 0, giving us the problem of the free particle. Applied to a
stationary state with energy H′ = 0, equation (61) gives
∇²ψ = 0, (70)
where ∇² is written for the Laplacian operator ∂²/∂x² + ∂²/∂y² + ∂²/∂z²,
and equation (63) gives
{(1/r)(∂²/∂r²)r − k(k+ℏ)/ℏ²r²}ψ = 0. (71)
A solution of (71) for k = 0 is ψ = r⁻¹. This does not satisfy
(70), since, although ∇²r⁻¹ vanishes for any finite value of r, its integral
through a volume containing the origin is −4π (as may be verified
by transforming this volume integral to a surface integral by means
of Gauss's theorem), and hence
∇²r⁻¹ = −4π δ(x)δ(y)δ(z). (72)
Thus not every solution of (71) gives a solution of (70), and more
generally, not every solution of (63) is a solution of (61). We must
impose on the solution of (63) the condition that it shall not tend to
infinity as rapidly as r⁻¹ when r → 0 in order that, when substituted
into (61), it shall not give a δ function on the right like the right-hand
side of (72). Only when equation (63) is supplemented with this condition
does it become equivalent to equation (61). We thus have the
boundary condition rψ → 0 or χ → 0 as r → 0.
There are also boundary conditions for the wave function at r = ∞.
If we are interested only in 'closed' states, i.e. states for which the
particle does not go off to infinity, we must restrict the integral to
infinity ∫^∞ |χ(r)|² dr to be convergent. These closed states, however,
are not the only ones that are physically permissible, as we can also
have states in which the particle arrives from infinity, is scattered
by the central field of force, and goes off to infinity again. For these
states the wave function may remain finite as r → ∞. Such states will
be dealt with in Chapter VIII under the heading of collision problems.
In any case the wave function must not tend to infinity as r → ∞, or
it will represent a state that has no physical meaning.
39. Energy-levels of the hydrogen atom
The above analysis may be applied to the problem of the hydrogen
atom with neglect of relativistic mechanics and the spin of the
electron. The potential energy V is now† −e²/r, so that equation
(69) becomes
d²χ₀/dr² − n(n+1)χ₀/r² + 2me²χ₀/ℏ²r = −2mH′χ₀/ℏ². (73)
A thorough investigation of this equation has been given by Schrödinger.‡
We shall here obtain its eigenvalues H′ by an elementary
argument.
It is convenient to put
χ₀ = f(r)e^{−r/a}, (74)
introducing the new function f(r), where a is one or other of the
square roots
a = ±√(−ℏ²/2mH′). (75)
Equation (73) now becomes
d²f/dr² − (2/a)df/dr − n(n+1)f/r² + 2me²f/ℏ²r = 0. (76)
We look for a solution of this equation in the form of a power series
f(r) = Σ_s c_s r^s, (77)
in which consecutive values for s differ by unity although these
values themselves need not be integers. On substituting (77) in (76)
we obtain
Σ_s c_s{s(s−1)r^{s−2} − (2s/a)r^{s−1} − n(n+1)r^{s−2} + (2me²/ℏ²)r^{s−1}} = 0,
which gives, on equating to zero the coefficient of r^{s−2}, the following
relation between successive coefficients c_s,
c_s[s(s−1) − n(n+1)] = c_{s−1}[2(s−1)/a − 2me²/ℏ²]. (78)
We saw in the preceding section that only those eigenfunctions χ
are allowed that tend to zero with r and hence, from (74), f(r) must
tend to zero with r. The series (77) must therefore terminate on the
side of small s and the minimum value of s must be greater than zero.
Now the only possible minimum values of s are those that make the
coefficient of c_s in (78) vanish, i.e. n+1 and −n, and the second
of these is negative or zero. Thus the minimum value of s must be
n+1. Since n is always an integer, the values of s will all be integers.
† The e here, denoting minus the charge on an electron, is, of course, to be distinguished
from the e denoting the base of exponentials.
‡ Schrödinger, Ann. d. Physik, 79 (1926), 361.
The series (77) will in general extend to infinity on the side of large s.
For large values of s the ratio of successive terms is
c_s r^s / c_{s−1} r^{s−1} = 2r/sa
according to (78). Thus the series (77) will always converge, as the
ratios of the higher terms to one another are the same as for the series
Σ_s (1/s!)(2r/a)^s, (79)
which converges to e^{2r/a}.
We must now examine how our solution χ₀ behaves for large
values of r. We must distinguish between the two cases of H′ positive
and H′ negative. For H′ negative, a given by (75) will be real. Suppose
we take the positive value for a. Then as r → ∞ the sum of the
series (77) will tend to infinity according to the same law as the sum
of the series (79), i.e. the law e^{2r/a}. Thus, from (74), χ₀ will tend to
infinity according to the law e^{r/a} and will not represent a physically
possible state. There is therefore in general no permissible solution
of (73) for negative values of H′. An exception arises, however, whenever
the series (77) terminates on the side of large s, in which case the
boundary conditions are all satisfied. The condition for this termination
of the series is that the coefficient of c_{s−1} in (78) shall vanish for
some value of the suffix s−1 not less than its minimum value n+1,
which is the same as the condition that
s/a − me²/ℏ² = 0
for some integer s not less than n+1. With the help of (75) this
condition becomes
H′ = −me⁴/2s²ℏ², (80)
and is thus a condition for the energy-level H′. Since s may be any
positive integer, the formula (80) gives a discrete set of negative
energy-levels for the hydrogen atom. These are in agreement with
experiment. For each of them (except the lowest one s = 1) there
are several independent states, as there are various possible values
for n, namely any positive or zero integer less than s. This multiplicity
of states belonging to an energy-level is in addition to that
mentioned in the preceding section arising from the various possible
values for a component of angular momentum, which latter multiplicity
occurs with any central field of force. The n multiplicity occurs
only with an inverse square law of force and even then is removed
when one takes relativistic mechanics into account, as will be found
in Chapter XI. The solution χ₀ of (73) when H′ satisfies (80) tends to
zero exponentially as r → ∞ and thus represents a closed state (corresponding
to an elliptic orbit in Bohr's theory).
For any positive value of H′, a given by (75) will be pure imaginary.
The series (77), which is like the series (79) for large r, will now have a
sum that remains finite as r → ∞. Thus χ₀ given by (74) will now remain
finite as r → ∞ and will therefore be a permissible solution of (73),
giving a wave function ψ that tends to zero according to the law r⁻¹ as
r → ∞. Hence in addition to the discrete set of negative energy-levels
(80), all positive energy-levels are allowed. The states of positive
energy are not closed, since for them the integral to infinity ∫^∞ |χ₀|² dr
does not converge. (These states correspond to the hyperbolic orbits
of Bohr's theory.)
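Formula (80) can be recovered numerically by diagonalizing the radial operator of (69) with V = −e²/r on a finite-difference grid. A sketch in atomic units (ℏ = m = e = 1, so (80) reads H′ = −1/2s²; the grid parameters are arbitrary choices of ours):

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

def hydrogen_levels(n, rmax=150.0, npts=3000, count=3):
    """Lowest eigenvalues of -1/2 d^2/dr^2 + n(n+1)/(2r^2) - 1/r (a.u.),
    acting on chi_0 with chi_0(0) = chi_0(rmax) = 0, cf. eq. (73)."""
    r = np.linspace(0, rmax, npts + 2)[1:-1]       # interior grid points
    h = r[1] - r[0]
    # tridiagonal finite-difference Hamiltonian
    diag = 1.0 / h**2 + n * (n + 1) / (2 * r**2) - 1.0 / r
    off = -0.5 / h**2 * np.ones(npts - 1)
    return eigh_tridiagonal(diag, off, eigvals_only=True,
                            select='i', select_range=(0, count - 1))

for n in (0, 1):
    levels = hydrogen_levels(n)
    # (80): H' = -1/(2 s^2) with s = n+1, n+2, ...
    exact = [-0.5 / s**2 for s in range(n + 1, n + 4)]
    assert np.allclose(levels, exact, atol=2e-3)
```

The lowest level for each n appears at s = n+1, reproducing both the spectrum and the minimum-s condition of the series argument.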
40. Selection rules
If a dynamical system is set up in a certain stationary state, it will
remain in that stationary state so long as it is not acted upon by
outside forces. Any atomic system in practice, however, frequently
gets acted upon by external electromagnetic fields, under whose
influence it is liable to cease to be in one stationary state and to make
a transition to another. The theory of such transitions will be developed
in §§ 44 and 45. A result of this theory is that, to a high degree
of accuracy, transitions between two states cannot occur under the
influence of electromagnetic radiation if, in a Heisenberg representation
with these two stationary states as two of the basic states, the
matrix element, referring to these two states, of the representative
of the total electric displacement D of the system vanishes. Now it
happens for many atomic systems that the great majority of the
matrix elements of D in a Heisenberg representation do vanish, and
hence there are severe limitations on the possibilities for transitions.
The rules that express these limitations are called selection rules.
The idea of selection rules tan be refined by a more detailed
application of the theory of $5 44 and 45, according to which
the matrix elements of the different Cartesian components of the
vector D are associated with different states of polarization of the
160 ELEMENTARY APPLICATIONS § 40
electromagnetic radiation. The nature of this association is just what one would get if one considered the matrix elements, or rather their real parts, as the amplitudes of harmonic oscillators which interact with the field of radiation according to classical electrodynamics.

There is a general method for obtaining all selection rules, as follows. Let us call the constants of the motion which are diagonal in the Heisenberg representation α's and let D be one of the Cartesian components of D. We must obtain an algebraic equation connecting D and the α's which does not involve any dynamical variables other than D and the α's and which is linear in D. Such an equation will be of the form
$\sum_r f_r D g_r = 0, \qquad (81)$
where the $f_r$'s and $g_r$'s are functions of the α's only. If this equation is expressed in terms of representatives, it gives us
$\sum_r f_r(\alpha')\,\langle\alpha'|D|\alpha''\rangle\, g_r(\alpha'') = 0$
or
$\langle\alpha'|D|\alpha''\rangle \sum_r f_r(\alpha')\, g_r(\alpha'') = 0,$
which shows that $\langle\alpha'|D|\alpha''\rangle = 0$ unless
$\sum_r f_r(\alpha')\, g_r(\alpha'') = 0. \qquad (82)$
This last equation, giving the connexion which must exist between α' and α'' in order that $\langle\alpha'|D|\alpha''\rangle$ may not vanish, constitutes the selection rule, so far as the component D of D is concerned.
Our work on the harmonic oscillator in § 34 provides an example of a selection rule. Equation (8) is of the form (81) with $\bar\eta$ for D and H playing the part of the α's, and it shows that the matrix elements $\langle H'|\bar\eta|H''\rangle$ of $\bar\eta$ all vanish except those for which $H'' - H' = \hbar\omega$. The conjugate complex of this result is that the matrix elements $\langle H'|\eta|H''\rangle$ of $\eta$ all vanish except those for which $H'' - H' = -\hbar\omega$. Since q is a numerical multiple of $\bar\eta - \eta$, its matrix elements $\langle H'|q|H''\rangle$ all vanish except those for which $H'' - H' = \pm\hbar\omega$. If the harmonic oscillator carries an electric charge, its electric displacement D will be proportional to q. The selection rule is then that only those transitions can take place in which the energy H changes by a single quantum $\hbar\omega$.
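This selection rule is easy to check numerically; here is a minimal sketch (a truncated oscillator with $\hbar\omega$ scaled out; not from the text) confirming that the only non-vanishing matrix elements of q connect adjacent energy-levels.

```python
import numpy as np

# Build the matrix of q for a truncated harmonic oscillator in the energy
# representation and confirm that the only non-vanishing matrix elements
# connect adjacent energy-levels, so that H changes by exactly one quantum
# hbar*omega in any dipole transition.
N = 8                             # truncation size; the pattern is N-independent
n = np.arange(1, N)
a = np.diag(np.sqrt(n), k=1)      # annihilation operator, <n-1|a|n> = sqrt(n)
q = a + a.T                       # q is proportional to (a + a-dagger)

rows, cols = np.nonzero(~np.isclose(q, 0.0))
assert np.all(np.abs(rows - cols) == 1)   # selection rule: N changes by +-1
```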
We shall now obtain the selection rules for $m_z$ and k for an electron moving in a central field of force. The components of electric displacement are here proportional to the Cartesian coordinates x, y, z.
Taking first $m_z$, we have that $m_z$ commutes with z, or that
$m_z z - z m_z = 0.$
This is an equation of the required type (81), giving us the selection rule
$m_z' - m_z'' = 0$
for the z-component of the displacement. Again, from equations (23) we have
$[m_z, [m_z, x]] = [m_z, y] = -x,$
or
$m_z^2 x - 2 m_z x m_z + x m_z^2 - \hbar^2 x = 0,$
which is also of the type (81) and gives us the selection rule
$m_z'^2 - 2 m_z' m_z'' + m_z''^2 - \hbar^2 = 0$
or
$(m_z' - m_z'' - \hbar)(m_z' - m_z'' + \hbar) = 0$
for the x-component of the displacement. The selection rule for the y-component is the same. Thus our selection rules for $m_z$ are that in transitions associated with radiation with a polarization corresponding to an electric dipole in the z-direction, $m_z$ cannot change, while in transitions associated with a polarization corresponding to an electric dipole in the x-direction or y-direction, $m_z$ must change by $\pm\hbar$.
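As an illustrative sketch (with ℏ set to 1 and a hypothetical run of $m_z$ eigenvalues; not from the text), the general condition (82) applied to the x-component equation picks out exactly the transitions in which $m_z$ changes by one unit of ℏ:

```python
import numpy as np

# Apply the selection-rule condition (m' - m'' - hbar)(m' - m'' + hbar) = 0
# to every ordered pair of m_z eigenvalues and confirm that only transitions
# with m_z changing by +-hbar survive.
hbar = 1.0
m_values = hbar * np.arange(-2, 3)     # eigenvalues of m_z, multiples of hbar

def allowed(mp, mpp):
    # the x- (or y-) component condition from the text
    return np.isclose((mp - mpp - hbar) * (mp - mpp + hbar), 0.0)

pairs = [(mp, mpp) for mp in m_values for mpp in m_values if allowed(mp, mpp)]
assert all(abs(mp - mpp) == hbar for mp, mpp in pairs)
```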
We can determine more accurately the state of polarization of the radiation associated with a transition in which $m_z$ changes by $\pm\hbar$ by considering the condition for the non-vanishing of matrix elements of $x+iy$ and $x-iy$. We have
$[m_z, x+iy] = y - ix = -i(x+iy)$
or
$m_z(x+iy) - (x+iy)(m_z+\hbar) = 0,$
which is again of the type (81). It gives
$m_z' - m_z'' - \hbar = 0$
as the condition that $\langle m_z'|x+iy|m_z''\rangle$ shall not vanish. Similarly,
$m_z' - m_z'' + \hbar = 0$
is the condition that $\langle m_z'|x-iy|m_z''\rangle$ shall not vanish. Hence
$\langle m_z'|x-iy|m_z'-\hbar\rangle = 0,$
or
$\langle m_z'|x|m_z'-\hbar\rangle = i\,\langle m_z'|y|m_z'-\hbar\rangle = (a+ib)e^{i\omega t}$
say, a, b, and ω being real. The conjugate complex of this is
$\langle m_z'-\hbar|x|m_z'\rangle = -i\,\langle m_z'-\hbar|y|m_z'\rangle = (a-ib)e^{-i\omega t}.$
Thus the vector $\tfrac{1}{2}\{\langle m_z'|D|m_z'-\hbar\rangle + \langle m_z'-\hbar|D|m_z'\rangle\}$, which determines
the state of polarization of the radiation associated with transitions for which $m_z'' = m_z' - \hbar$, has the following three components:
$\tfrac{1}{2}\{\langle m_z'|x|m_z'-\hbar\rangle + \langle m_z'-\hbar|x|m_z'\rangle\} = \tfrac{1}{2}\{(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\cos\omega t - b\sin\omega t,$
$\tfrac{1}{2}\{\langle m_z'|y|m_z'-\hbar\rangle + \langle m_z'-\hbar|y|m_z'\rangle\} = \tfrac{1}{2}i\{-(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\sin\omega t + b\cos\omega t, \qquad (83)$
$\tfrac{1}{2}\{\langle m_z'|z|m_z'-\hbar\rangle + \langle m_z'-\hbar|z|m_z'\rangle\} = 0.$
From the form of these components we see that the associated radiation moving in the z-direction will be circularly polarized, that moving in any direction in the xy-plane will be linearly polarized in this plane, and that moving in intermediate directions will be elliptically polarized. The direction of circular polarization for radiation moving in the z-direction will depend on whether ω is positive or negative, and this will depend on which of the two states $m_z'$ or $m_z'' = m_z' - \hbar$ has the greater energy.
We shall now determine the selection rule for k. We have
$[k(k+\hbar), z] = [m_x^2, z] + [m_y^2, z]$
$= -y m_x - m_x y + x m_y + m_y x$
$= 2(m_y x - m_x y + i\hbar z)$
$= 2(m_y x - y m_x) = 2(x m_y - m_x y).$
Similarly,
$[k(k+\hbar), x] = 2(y m_z - m_y z)$
and
$[k(k+\hbar), y] = 2(m_x z - x m_z).$
Hence
$[k(k+\hbar), [k(k+\hbar), z]]$
$= 2[k(k+\hbar),\; m_y x - m_x y + i\hbar z]$
$= 2 m_y [k(k+\hbar), x] - 2 m_x [k(k+\hbar), y] + 2i\hbar [k(k+\hbar), z]$
$= 4 m_y (y m_z - m_y z) - 4 m_x (m_x z - x m_z) + 2\{k(k+\hbar)z - z k(k+\hbar)\}$
$= 4(m_x x + m_y y + m_z z) m_z - 4(m_x^2 + m_y^2 + m_z^2) z + 2\{k(k+\hbar)z - z k(k+\hbar)\}.$
From (22)
$m_x x + m_y y + m_z z = 0 \qquad (84)$
and hence
$[k(k+\hbar), [k(k+\hbar), z]] = -2\{k(k+\hbar)z + z k(k+\hbar)\},$
which gives
$k^2(k+\hbar)^2 z - 2k(k+\hbar)\, z\, k(k+\hbar) + z k^2(k+\hbar)^2 - 2\hbar^2\{k(k+\hbar)z + z k(k+\hbar)\} = 0. \qquad (85)$
Similar equations hold for x and y. These equations are of the required type (81), and give us the selection rule
$k'^2(k'+\hbar)^2 - 2k'(k'+\hbar)\,k''(k''+\hbar) + k''^2(k''+\hbar)^2 - 2\hbar^2 k'(k'+\hbar) - 2\hbar^2 k''(k''+\hbar) = 0,$
which reduces to
$(k'+k''+2\hbar)(k'+k'')(k'-k''+\hbar)(k'-k''-\hbar) = 0.$
A transition can take place between two states k' and k'' only if one of these four factors vanishes.
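The reduction to four factors can be checked by computer algebra; a sketch with sympy (writing A = k'(k'+ℏ) and B = k''(k''+ℏ), so that the selection-rule expression is (A−B)² − 2ℏ²(A+B)):

```python
import sympy as sp

# Verify that the quartic selection-rule expression for k factorizes into
# (k'+k''+2h)(k'+k'')(k'-k''+h)(k'-k''-h), with h standing for hbar.
kp, kpp, hbar = sp.symbols("kp kpp hbar")
A = kp * (kp + hbar)
B = kpp * (kpp + hbar)
lhs = (A - B) ** 2 - 2 * hbar**2 * (A + B)
rhs = (kp + kpp + 2 * hbar) * (kp + kpp) * (kp - kpp + hbar) * (kp - kpp - hbar)
assert sp.expand(lhs - rhs) == 0    # identical polynomials
```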
Now the first of the factors, $(k'+k''+2\hbar)$, can never vanish, since the eigenvalues of k are all positive or zero. The second, $(k'+k'')$, can vanish only if k' = 0 and k'' = 0. But transitions between two states with these values for k cannot occur on account of other selection rules, as may be seen from the following argument. If two states (labelled respectively with a single prime and a double prime) are such that k' = 0 and k'' = 0, then from (41) and the corresponding results for $m_x$ and $m_y$, $m_x' = m_y' = m_z' = 0$ and $m_x'' = m_y'' = m_z'' = 0$. The selection rule for $m_z$ now shows that the matrix elements of x and y referring to the two states must vanish, as the value of $m_z$ does not change during the transition, and the similar selection rule for $m_x$ or $m_y$ shows that the matrix element of z also vanishes. Thus transitions between the two states cannot occur. Our selection rule for k now reduces to
$(k'-k''+\hbar)(k'-k''-\hbar) = 0,$
showing that k must change by $\pm\hbar$. This selection rule may be written
$k'^2 - 2k'k'' + k''^2 - \hbar^2 = 0,$
and since this is the condition that a matrix element $\langle k'|z|k''\rangle$ shall not vanish, we get the equation
$k^2 z - 2kzk + zk^2 - \hbar^2 z = 0$
or
$[k, [k, z]] = -z, \qquad (86)$
a result which could not easily be obtained in a more direct way.
As a final example we shall obtain the selection rule for the magnitude K of the total angular momentum M of a general atomic system. Let x, y, z be the coordinates of one of the electrons. We must obtain the condition that the (K', K'') matrix element of x, y, or z shall not vanish. This is evidently the same as the condition that the (K', K'') matrix element of $\lambda_1$, $\lambda_2$, or $\lambda_3$ shall not vanish, where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are any three independent linear functions of x, y, and z with numerical coefficients, or more generally with any coefficients that commute with K and are thus represented by matrices which are diagonal with respect to K. Let
$\lambda_0 = M_x x + M_y y + M_z z,$
$\lambda_x = M_y z - M_z y - i\hbar x,$
$\lambda_y = M_z x - M_x z - i\hbar y,$
$\lambda_z = M_x y - M_y x - i\hbar z.$
We have
$M_x \lambda_x + M_y \lambda_y + M_z \lambda_z = \sum (M_y M_z - M_z M_y - i\hbar M_x)\, x = 0 \qquad (87)$
from (29). Thus $\lambda_x$, $\lambda_y$, and $\lambda_z$ are not linearly independent functions of x, y, and z. Any two of them, however, together with $\lambda_0$ are three linearly independent functions of x, y, and z and may be taken as the above $\lambda_1$, $\lambda_2$, $\lambda_3$, since the coefficients $M_x$, $M_y$, $M_z$ all commute with K. Our problem thus reduces to finding the condition that the (K', K'') matrix elements of $\lambda_0$, $\lambda_x$, $\lambda_y$, and $\lambda_z$ shall not vanish. The physical meanings of these λ's are that $\lambda_0$ is proportional to the component of the vector (x, y, z) in the direction of the vector M, and $\lambda_x$, $\lambda_y$, $\lambda_z$ are proportional to the Cartesian components of the component of (x, y, z) perpendicular to M.
Since $\lambda_0$ is a scalar it must commute with K. It follows that only the diagonal elements $\langle K'|\lambda_0|K'\rangle$ of $\lambda_0$ can differ from zero, so the selection rule is that K cannot change so far as $\lambda_0$ is concerned. Applying (30) to the vector $\lambda_x$, $\lambda_y$, $\lambda_z$, we have
$[M_z, \lambda_x] = \lambda_y, \qquad [M_z, \lambda_y] = -\lambda_x, \qquad [M_z, \lambda_z] = 0.$
These relations between $M_z$ and $\lambda_x$, $\lambda_y$, $\lambda_z$ are of exactly the same form as the relations (23), (24) between $m_z$ and x, y, z, and also (87) is of the same form as (84). The dynamical variables $\lambda_x$, $\lambda_y$, $\lambda_z$ thus have the same properties relative to the angular momentum M as x, y, z have relative to m. The deduction of the selection rule for k when the electric displacement is proportional to (x, y, z) can therefore be taken over and applied to the selection rule for K when the electric displacement is proportional to $(\lambda_x, \lambda_y, \lambda_z)$. We find in this way that, so far as $\lambda_x$, $\lambda_y$, $\lambda_z$ are concerned, the selection rule for K is that it must change by $\pm\hbar$.

Collecting results, we have as the selection rule for K that it must change by 0 or $\pm\hbar$. We have considered the electric displacement
produced by only one of the electrons, but the same selection rule must hold for each electron and thus also for the total electric displacement.
41. The Zeeman effect for the hydrogen atom
We shall now consider the system of a hydrogen atom in a uniform magnetic field. The Hamiltonian (57) with $V = -e^2/r$, which describes the hydrogen atom in no external field, gets modified by the magnetic field, the modification, according to classical mechanics, consisting in the replacement of the components of momentum, $p_x$, $p_y$, $p_z$, by $p_x + e/c \cdot A_x$, $p_y + e/c \cdot A_y$, $p_z + e/c \cdot A_z$, where $A_x$, $A_y$, $A_z$ are the components of the vector potential describing the field. For a uniform field of magnitude $\mathcal{H}$ in the direction of the z-axis we may take $A_x = -\tfrac{1}{2}\mathcal{H}y$, $A_y = \tfrac{1}{2}\mathcal{H}x$, $A_z = 0$. The classical Hamiltonian will then be
$H = \frac{1}{2m}\Bigl\{\Bigl(p_x - \frac{e\mathcal{H}}{2c}y\Bigr)^2 + \Bigl(p_y + \frac{e\mathcal{H}}{2c}x\Bigr)^2 + p_z^2\Bigr\} - \frac{e^2}{r}.$
This classical Hamiltonian may be taken over into the quantum theory if we add on to it a term giving the effect of the spin of the electron. According to experimental evidence and according to the theory of Chapter XI, the electron has a magnetic moment $-e\hbar/2mc \cdot \sigma$, where σ is the spin vector of § 37. The energy of this magnetic moment in the magnetic field will be $e\hbar\mathcal{H}/2mc \cdot \sigma_z$. Thus the total quantum Hamiltonian will be
$H = \frac{1}{2m}\Bigl\{\Bigl(p_x - \frac{e\mathcal{H}}{2c}y\Bigr)^2 + \Bigl(p_y + \frac{e\mathcal{H}}{2c}x\Bigr)^2 + p_z^2\Bigr\} - \frac{e^2}{r} + \frac{e\hbar\mathcal{H}}{2mc}\sigma_z. \qquad (88)$
There ought strictly to be other terms in this Hamiltonian giving the interaction of the magnetic moment of the electron with the electric field of the nucleus of the atom, but this effect is small, of the same order of magnitude as the correction one gets by taking relativistic mechanics into account, and will be neglected here. It will be taken into account in the relativistic theory of the electron given in Chapter XI.

If the magnetic field is not too large, we can neglect terms involving $\mathcal{H}^2$, so that the Hamiltonian (88) reduces to
$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - \frac{e^2}{r} + \frac{e\mathcal{H}}{2mc}(m_z + \hbar\sigma_z). \qquad (89)$
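The reduction from (88) to (89) can be sketched with computer algebra, treating all quantities as commuting classical symbols (adequate here, since div A = 0 for this choice of A, so the cross terms p·A and A·p agree): the part of (88) linear in the field strength is exactly (eℋ/2mc)·m_z with m_z = x p_y − y p_x.

```python
import sympy as sp

# Expand the kinetic part of (88) and extract the term linear in the field
# strength Hf; it should be e/(2mc) * (x*py - y*px), the m_z coupling of (89).
x, y, px, py, pz, e, c, m, Hf, r = sp.symbols("x y px py pz e c m Hf r")
H88 = ((px - e*Hf*y/(2*c))**2 + (py + e*Hf*x/(2*c))**2 + pz**2) / (2*m) - e**2/r
linear_term = sp.expand(H88).coeff(Hf, 1)
assert sp.simplify(linear_term - e*(x*py - y*px)/(2*m*c)) == 0
```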
The extra terms due to the magnetic field are now $e\mathcal{H}/2mc \cdot (m_z + \hbar\sigma_z)$. But these extra terms commute with the total Hamiltonian and are thus constants of the motion. This makes the problem very easy. The stationary states of the system, i.e. the eigenstates of the Hamiltonian (89), will be those eigenstates of the Hamiltonian for no field that are simultaneously eigenstates of the observables $m_z$ and $\sigma_z$, or at least of the one observable $m_z + \hbar\sigma_z$, and the energy-levels of the system will be those for the system with no field, given by (80) if one considers only closed states, increased by an eigenvalue of $e\mathcal{H}/2mc \cdot (m_z + \hbar\sigma_z)$. Thus stationary states of the system with no field for which $m_z$ has the numerical value $m_z'$, an integral multiple of $\hbar$, and for which also $\sigma_z$ has the numerical value $\sigma_z' = \pm 1$, will still be stationary states when the field is applied. Their energy will be increased by an amount consisting of the sum of two parts, a part $e\mathcal{H}/2mc \cdot m_z'$ arising from the orbital motion, which part may be considered as due to an orbital magnetic moment $-e m_z'/2mc$, and a part $e\hbar\mathcal{H}/2mc \cdot \sigma_z'$ arising from the spin. The ratio of the orbital magnetic moment to the orbital angular momentum $m_z'$ is $-e/2mc$, which is half the ratio of the spin magnetic moment to the spin angular momentum. This fact is sometimes referred to as the magnetic anomaly of the spin.

Since the energy-levels now involve $m_z$, the selection rule for $m_z$ obtained in the preceding section becomes capable of direct comparison with experiment. We take a Heisenberg representation in which, among other constants of the motion, $m_z$ and $\sigma_z$ are diagonal. The selection rule for $m_z$ now requires $m_z$ to change by $\hbar$, 0, or $-\hbar$, while $\sigma_z$, since it commutes with the electric displacement, will not change at all. Thus the energy difference between the two states taking part in the transition process will differ by an amount $e\hbar\mathcal{H}/2mc$, 0, or $-e\hbar\mathcal{H}/2mc$ from its value for no magnetic field. Hence, from Bohr's frequency condition, the frequency of the associated electromagnetic radiation will differ by $e\mathcal{H}/4\pi mc$, 0, or $-e\mathcal{H}/4\pi mc$ from that for no magnetic field. This means that each spectral line for no magnetic field gets split up by the field into three components. If one considers radiation moving in the z-direction, then from (83) the two outer components will be circularly polarized, while the central undisplaced one will be of zero intensity. These results are in agreement with experiment and also with the classical theory of the Zeeman effect.
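As a numerical illustration (Gaussian-unit values assumed; not from the text), the shift $e\mathcal{H}/4\pi mc$ of each outer Zeeman component works out to about 1.4 MHz per gauss:

```python
from math import pi

# Normal Zeeman shift e*H/(4*pi*m*c) of the outer components, in Hz,
# for a field of 10^4 gauss (1 tesla), using CGS constants.
e = 4.803e-10      # electron charge, esu
m = 9.109e-28      # electron mass, g
c = 2.998e10       # speed of light, cm/s
H_field = 1.0e4    # field strength, gauss

delta_nu = e * H_field / (4 * pi * m * c)   # frequency shift, Hz
assert 1.3e10 < delta_nu < 1.5e10           # about 14 GHz per tesla
```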
VII
PERTURBATION THEORY
42. General remarks
IN the preceding chapter exact treatments were given of some simple dynamical systems in the quantum theory. Most quantum problems, however, cannot be solved exactly with the present resources of mathematics, as they lead to equations whose solutions cannot be expressed in finite terms with the help of the ordinary functions of analysis. For such problems one can often use a perturbation method. This consists in splitting up the Hamiltonian into two parts, one of which must be simple and the other small. The first part may then be considered as the Hamiltonian of a simplified or unperturbed system, which can be dealt with exactly, and the addition of the second will then require small corrections, of the nature of a perturbation, in the solution for the unperturbed system. The requirement that the first part shall be simple requires in practice that it shall not involve the time explicitly. If the second part contains a small numerical factor ε, we can obtain the solution of our equations for the perturbed system in the form of a power series in ε, which, provided it converges, will give the answer to our problem with any desired accuracy. Even when the series does not converge, the first approximation obtained by means of it is usually fairly accurate.

There are two distinct methods in perturbation theory. In one of these the perturbation is considered as causing a modification of the states of motion of the unperturbed system. In the other we do not consider any modification to be made in the states of the unperturbed system, but we suppose that the perturbed system, instead of remaining permanently in one of these states, is continually changing from one to another, or making transitions, under the influence of the perturbation. Which method is to be used in any particular case depends on the nature of the problem to be solved. The first method is useful usually only when the perturbing energy (the correction in the Hamiltonian for the undisturbed system) does not involve the time explicitly, and is then applied to the stationary states. It can be used for calculating things that do not refer to any definite time, such as the energy-levels of the stationary states of the perturbed system, or, in the case of collision problems, the probability of scattering through
a given angle. The second method must, on the other hand, be used for solving all problems involving a consideration of time, such as those about the transient phenomena that occur when the perturbation is suddenly applied, or more generally problems in which the perturbation varies with the time in any way (i.e. in which the perturbing energy involves the time explicitly). Again, this second method must be used in collision problems, even though the perturbing energy does not here involve the time explicitly, if one wishes to calculate absorption and emission probabilities, since these probabilities, unlike a scattering probability, cannot be defined without reference to a state of affairs that varies with the time.

One can summarize the distinctive features of the two methods by saying that, with the first method, one compares the stationary states of the perturbed system with those of the unperturbed system; with the second method one takes a stationary state of the unperturbed system and sees how it varies with time under the influence of the perturbation.
43. The change in the energy-levels caused by a perturbation
The first of the above-mentioned methods will now be applied to the calculation of the changes in the energy-levels of a system caused by a perturbation. We assume the perturbing energy, like the Hamiltonian for the unperturbed system, not to involve the time explicitly. Our problem has a meaning, of course, only provided the energy-levels of the unperturbed system are discrete and the differences between them are large compared with the changes in them caused by the perturbation. This circumstance results in the treatment of perturbation problems by the first method having some different features according to whether the energy-levels of the unperturbed system are discrete or continuous.

Let the Hamiltonian of the perturbed system be
$H = E + V, \qquad (1)$
E being the Hamiltonian of the unperturbed system and V the small perturbing energy. By hypothesis each eigenvalue H' of H lies very close to one and only one eigenvalue E' of E. We shall use the same number of primes to specify any eigenvalue of H and the eigenvalue of E to which it lies very close. Thus we shall have H'' differing from E'' by a small quantity of order V and differing from E' by a quantity that is not small unless E' = E''. We must now take care always to
use different numbers of primes to specify eigenvalues of H and E which we do not want to lie very close together.

To obtain the eigenvalues of H, we have to solve the equation
$H|H'\rangle = H'|H'\rangle$
or
$(H' - E)|H'\rangle = V|H'\rangle. \qquad (2)$
Let $|0\rangle$ be an eigenket of E belonging to the eigenvalue E' and suppose the $|H'\rangle$ and H' that satisfy (2) to differ from $|0\rangle$ and E' only by small quantities and to be expressed as
$|H'\rangle = |0\rangle + |1\rangle + |2\rangle + \ldots,$
$H' = E' + a_1 + a_2 + \ldots, \qquad (3)$
where $|1\rangle$ and $a_1$ are of the first order of smallness (i.e. the same order as V), $|2\rangle$ and $a_2$ are of the second order, and so on. Substituting these expressions in (2), we obtain
$\{E' - E + a_1 + a_2 + \ldots\}\{|0\rangle + |1\rangle + |2\rangle + \ldots\} = V\{|0\rangle + |1\rangle + \ldots\}.$
If we now separate the terms of zero order, of the first order, of the second order, and so on, we get the following set of equations,
$(E' - E)|0\rangle = 0,$
$(E' - E)|1\rangle + a_1|0\rangle = V|0\rangle, \qquad (4)$
$(E' - E)|2\rangle + a_1|1\rangle + a_2|0\rangle = V|1\rangle,$
$\ldots$
The first of these equations tells us, what we have already assumed, that $|0\rangle$ is an eigenket of E belonging to the eigenvalue E'. The others enable us to calculate the various corrections $|1\rangle, |2\rangle, \ldots, a_1, a_2, \ldots$.

For the further discussion of these equations it is convenient to introduce a representation in which E is diagonal, i.e. a Heisenberg representation for the unperturbed system, and to take E itself as one of the observables whose eigenvalues label the representatives. Let the others, in the event of others being necessary, as is the case when there is more than one eigenstate of E belonging to any eigenvalue, be called β's. A basic bra is then $\langle E''\beta''|$. Since $|0\rangle$ is an eigenket of E belonging to the eigenvalue E', we have
$\langle E''\beta''|0\rangle = \delta_{E''E'}\, f(\beta''), \qquad (5)$
where $f(\beta'')$ is some function of the variables β''. With the help of this result the second of equations (4), written in terms of representatives, becomes
$(E' - E'')\langle E''\beta''|1\rangle + a_1 \delta_{E''E'} f(\beta'') = \sum_{\beta'} \langle E''\beta''|V|E'\beta'\rangle f(\beta'). \qquad (6)$
Putting E'' = E' here, we get
$a_1 f(\beta'') = \sum_{\beta'} \langle E'\beta''|V|E'\beta'\rangle f(\beta'). \qquad (7)$
Equation (7) is of the form of the standard equation in the theory of eigenvalues, so far as the variables β' are concerned. It shows that the various possible values for $a_1$ are the eigenvalues of the matrix $\langle E'\beta''|V|E'\beta'\rangle$. This matrix is a part of the representative of the perturbing energy in the Heisenberg representation for the unperturbed system, namely, the part consisting of those elements that refer to the same unperturbed energy-level E' for their row and column. Each of these values for $a_1$ gives, to the first order, an energy-level of the perturbed system lying close to the energy-level E' of the unperturbed system.† There may thus be several energy-levels of the perturbed system lying close to the one energy-level E' of the unperturbed system, their number being anything not exceeding the number of independent states of the unperturbed system belonging to the energy-level E'. In this way the perturbation may cause a separation or partial separation of the energy-levels that coincide at E' for the unperturbed system.

Equation (7) also determines, to the zero order, the representatives $\langle E''\beta''|0\rangle$ of the stationary states of the perturbed system belonging to energy-levels lying close to E', any solution $f(\beta')$ of (7) substituted in (5) giving one such representative. Each of these stationary states of the perturbed system approximates to one of the stationary states of the unperturbed system, but the converse, that each stationary state of the unperturbed system approximates to one of the stationary states of the perturbed system, is not true, since the general stationary state of the unperturbed system belonging to the energy-level E' is represented by the right-hand side of (5) with an arbitrary function $f(\beta'')$. The problem of finding which stationary states of the unperturbed system approximate to stationary states of the perturbed system, i.e. the problem of finding the solutions $f(\beta')$ of (7), corresponds to the problem of 'secular perturbations' in classical mechanics. It should be noted that the above results are independent of the values of all those matrix elements of the perturbing
† To distinguish these energy-levels one from another we should require some more elaborate notation, since according to the present notation they must all be specified by the same number of primes, namely by the number of primes specifying the energy-level of the unperturbed system from which they arise. For our present purposes, however, this more elaborate notation is not required.
energy which refer to two different energy-levels of the unperturbed system.
Let us see what the above results become in the specially simple case when there is only one stationary state of the unperturbed system belonging to each energy-level.† In this case E alone fixes the representation, no β's being required. The sum in (7) now reduces to a single term and we get
$a_1 = \langle E'|V|E'\rangle. \qquad (8)$
There is only one energy-level of the perturbed system lying close to any energy-level of the unperturbed system and the change in energy is equal, in the first order, to the corresponding diagonal element of the perturbing energy in the Heisenberg representation for the unperturbed system, or to the average value of the perturbing energy for the corresponding unperturbed state. The latter formulation of the result is the same as in classical mechanics when the unperturbed system is multiply periodic.

We shall proceed to calculate the second-order correction $a_2$ in the energy-level for the case when the unperturbed system is non-degenerate. Equation (5) for this case reads
$\langle E''|0\rangle = \delta_{E''E'},$
with neglect of an unimportant numerical factor, and equation (6) reads
$(E' - E'')\langle E''|1\rangle + a_1 \delta_{E''E'} = \langle E''|V|E'\rangle.$
This gives us the value of $\langle E''|1\rangle$ when $E'' \ne E'$, namely
$\langle E''|1\rangle = \frac{\langle E''|V|E'\rangle}{E' - E''}. \qquad (9)$
The third of equations (4), written in terms of representatives, becomes
$(E' - E'')\langle E''|2\rangle + a_1\langle E''|1\rangle + a_2 \delta_{E''E'} = \sum_{E'''} \langle E''|V|E'''\rangle\langle E'''|1\rangle.$
Putting E'' = E' here, we get
$a_1\langle E'|1\rangle + a_2 = \sum_{E''} \langle E'|V|E''\rangle\langle E''|1\rangle,$
which reduces, with the help of (8), to
$a_2 = \sum_{E'' \ne E'} \langle E'|V|E''\rangle\langle E''|1\rangle.$
† A system with only one stationary state belonging to each energy-level is often called non-degenerate and one with two or more stationary states belonging to an energy-level is called degenerate, although these words are not very appropriate from the modern point of view.
Substituting for $\langle E''|1\rangle$ from (9), we obtain finally
$a_2 = \sum_{E'' \ne E'} \frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E' - E''}, \qquad (10)$
giving for the total energy change to the second order
$a_1 + a_2 = \langle E'|V|E'\rangle + \sum_{E'' \ne E'} \frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E' - E''}.$
The method may be developed for the calculation of the higher approximations if required. General recurrence formulas giving the nth order corrections in terms of those of lower order have been obtained by Born, Heisenberg, and Jordan.†
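Formulas (8) and (10) are easy to test numerically; a sketch (with hypothetical levels and perturbation, not from the text): for a small Hermitian V added to a non-degenerate diagonal E, the sum E' + a₁ + a₂ should reproduce the exact eigenvalues up to errors of the third order in V.

```python
import numpy as np

# Compare first- plus second-order perturbation theory with exact
# diagonalization for a small random symmetric perturbation.
rng = np.random.default_rng(0)
E_levels = np.array([0.0, 1.0, 2.5, 4.0])       # unperturbed, non-degenerate
M = rng.standard_normal((4, 4))
V = 1e-3 * (M + M.T)                             # small real symmetric V

exact = np.linalg.eigvalsh(np.diag(E_levels) + V)

approx = []
for i, Ei in enumerate(E_levels):
    a1 = V[i, i]                                        # equation (8)
    a2 = sum(V[i, j] * V[j, i] / (Ei - E_levels[j])     # equation (10)
             for j in range(4) if j != i)
    approx.append(Ei + a1 + a2)

assert np.allclose(np.sort(approx), exact, atol=1e-6)   # third-order agreement
```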
44. The perturbation considered as causing transitions
We shall now consider the second of the two perturbation methods mentioned in § 42. We suppose again that we have an unperturbed system governed by a Hamiltonian E which does not involve the time explicitly, and a perturbing energy V which can now be an arbitrary function of the time. The Hamiltonian for the perturbed system is again H = E + V. For the present method it does not make any essential difference whether the energy-levels of the unperturbed system, i.e. the eigenvalues of E, form a discrete or continuous set. We shall, however, take the discrete case, for definiteness. We shall again work with a Heisenberg representation for the unperturbed system, but as there will now be no advantage in taking E itself as one of the observables whose eigenvalues label the representatives, we shall suppose we have a general set of α's to label the representatives.

Let us suppose that at the initial time $t_0$ the system is in a state for which the α's certainly have the values α'. The ket corresponding to this state is the basic ket $|\alpha'\rangle$. If there were no perturbation, i.e. if the Hamiltonian were E, this state would be stationary. The perturbation causes the state to change. At time t the ket corresponding to the state in Schroedinger's picture will be $T|\alpha'\rangle$, according to equation (1) of § 27. The probability of the α's then having the values α'' is
$P(\alpha'\alpha'') = |\langle\alpha''|T|\alpha'\rangle|^2. \qquad (11)$
For $\alpha'' \ne \alpha'$, $P(\alpha'\alpha'')$ is the probability of a transition taking place from state α' to state α'' during the time interval $t_0 \to t$, while $P(\alpha'\alpha')$
† Z. f. Physik, 35 (1925), 565.
is the probability of no transition taking place at all. The sum of $P(\alpha'\alpha'')$ for all α'' is, of course, unity.

Let us now suppose that initially the system, instead of being certainly in the state α', is in one or other of various states α' with the probability $P_{\alpha'}$ for each. The Gibbs density corresponding to this distribution is, according to (68) of § 33,
$\rho = \sum_{\alpha'} |\alpha'\rangle P_{\alpha'} \langle\alpha'|. \qquad (12)$
At time t, each ket $|\alpha'\rangle$ will have changed to $T|\alpha'\rangle$ and each bra $\langle\alpha'|$ to $\langle\alpha'|\bar T$, so ρ will have changed to
$\rho_t = \sum_{\alpha'} T|\alpha'\rangle P_{\alpha'} \langle\alpha'|\bar T. \qquad (13)$
The probability of the α's then having the values α'' will be, from (73) of § 33,
$\langle\alpha''|\rho_t|\alpha''\rangle = \sum_{\alpha'} \langle\alpha''|T|\alpha'\rangle P_{\alpha'} \langle\alpha'|\bar T|\alpha''\rangle = \sum_{\alpha'} P_{\alpha'} P(\alpha'\alpha'') \qquad (14)$
with the help of (11). This result expresses that the probability of the system being in the state α'' at time t is the sum of the probabilities of the system being initially in any state $\alpha' \ne \alpha''$ and making a transition from state α' to state α'' and the probability of its being initially in the state α'' and making no transition. Thus the various transition probabilities act independently of one another, according to the ordinary laws of probability.

The whole problem of calculating transitions thus reduces to the determination of the probability amplitudes $\langle\alpha''|T|\alpha'\rangle$. These can be worked out from the differential equation for T, equation (6) of § 27, or
$i\hbar\, dT/dt = HT = (E + V)T. \qquad (15)$
The calculation can be simplified by working with
$T^* = e^{iE(t-t_0)/\hbar}\, T. \qquad (16)$
We have
$i\hbar\, dT^*/dt = e^{iE(t-t_0)/\hbar}(-ET + i\hbar\, dT/dt) = e^{iE(t-t_0)/\hbar}\, VT = V^* T^*, \qquad (17)$
where
$V^* = e^{iE(t-t_0)/\hbar}\, V\, e^{-iE(t-t_0)/\hbar}, \qquad (18)$
i.e. $V^*$ is the result of applying a certain unitary transformation to V. Equation (17) is of a more convenient form than (15), because (17) makes the change in $T^*$ depend entirely on the perturbation V, and
for V = 0 it would make $T^*$ equal its initial value, namely unity. We have from (16)
$\langle\alpha''|T^*|\alpha'\rangle = e^{iE''(t-t_0)/\hbar}\, \langle\alpha''|T|\alpha'\rangle,$
so that
$P(\alpha'\alpha'') = |\langle\alpha''|T^*|\alpha'\rangle|^2, \qquad (19)$
showing that $T^*$ and T are equally good for determining transition probabilities.

Our work up to the present has been exact. We now assume V is a small quantity of the first order and express $T^*$ in the form
$T^* = 1 + T_1^* + T_2^* + \ldots, \qquad (20)$
where $T_1^*$ is of the first order, $T_2^*$ is of the second, and so on. Substituting (20) into (17) and equating terms of equal order, we get
$i\hbar\, dT_1^*/dt = V^*,$
$i\hbar\, dT_2^*/dt = V^* T_1^*, \qquad (21)$
$\ldots$
From the first of these equations we obtain
$T_1^* = -i\hbar^{-1} \int_{t_0}^{t} V^*(t')\, dt', \qquad (22)$
from the second we obtain
$T_2^* = -\hbar^{-2} \int_{t_0}^{t} V^*(t')\, dt' \int_{t_0}^{t'} V^*(t'')\, dt'', \qquad (23)$
and so on. For many practical problems it is sufficiently accurate to retain only the term $T_1^*$, which gives for the transition probability $P(\alpha'\alpha'')$ with $\alpha'' \ne \alpha'$
$P(\alpha'\alpha'') = \hbar^{-2} \Bigl|\langle\alpha''| \int_{t_0}^{t} V^*(t')\, dt' |\alpha'\rangle\Bigr|^2 = \hbar^{-2} \Bigl|\int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt'\Bigr|^2. \qquad (24)$
We obtain in this way the transition probability to the second order of accuracy. The result depends only on the matrix element $\langle\alpha''|V^*(t')|\alpha'\rangle$ of $V^*(t')$ referring to the two states concerned, with t' going from $t_0$ to t. Since $V^*$ is real, like V,
$\overline{\langle\alpha''|V^*(t')|\alpha'\rangle} = \langle\alpha'|V^*(t')|\alpha''\rangle$
and hence
$P(\alpha'\alpha'') = P(\alpha''\alpha') \qquad (25)$
to the second order of accuracy.
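Formula (24) can be checked on the simplest possible case, a two-level system with a constant perturbation v switched on at $t_0 = 0$ (a numerical sketch with hypothetical values, not from the text): then $\langle 2|V^*(t')|1\rangle = v\,e^{i\omega_{21}t'}$, the integral in (24) can be done exactly, and the first-order probability agrees with exact time evolution when v is small.

```python
import numpy as np

# Compare the first-order formula (24) with exact evolution of a two-level
# system under a small constant off-diagonal coupling v.
hbar = 1.0
E1, E2 = 0.0, 1.0
v = 1e-3                                    # weak, so first order suffices
w21 = (E2 - E1) / hbar
H = np.array([[E1, v], [v, E2]])
t = 5.0

# exact evolution: psi(t) = exp(-i*H*t/hbar) acting on state |1>
w, Q = np.linalg.eigh(H)
U = Q @ np.diag(np.exp(-1j * w * t / hbar)) @ Q.conj().T
P_exact = abs((U @ np.array([1.0, 0.0]))[1]) ** 2

# equation (24): P = hbar^-2 * |v * integral of exp(i*w21*t') from 0 to t|^2
integral = (np.exp(1j * w21 * t) - 1.0) / (1j * w21)
P_first_order = (v / hbar) ** 2 * abs(integral) ** 2

assert abs(P_exact - P_first_order) < 1e-9
```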
Sometimes one is interested in a transition α' → α'' such that the matrix element $\langle\alpha''|V^*|\alpha'\rangle$ vanishes, or is small compared with other matrix elements of $V^*$. It is then necessary to work to a higher accuracy. If we retain only the terms $T_1^*$ and $T_2^*$, we get, for $\alpha'' \ne \alpha'$,
$P(\alpha'\alpha'') = \hbar^{-2} \Bigl| \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt' - i\hbar^{-1} \sum_{\alpha''' \ne \alpha',\,\alpha''} \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'''\rangle\, dt' \int_{t_0}^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\, dt'' \Bigr|^2. \qquad (26)$
The terms α''' = α' and α''' = α'' are omitted from the sum since they are small compared with other terms of the sum, on account of the smallness of $\langle\alpha''|V^*|\alpha'\rangle$. To interpret the result (26), we may suppose that the term
$\int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt' \qquad (27)$
gives rise to a transition directly from state α' to state α'', while the term
$-i\hbar^{-1} \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'''\rangle\, dt' \int_{t_0}^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\, dt'' \qquad (28)$
gives rise to a transition from state α' to state α''', followed by a transition from state α''' to state α''. The state α''' is called an intermediate state in this interpretation. We must add the term (27) to the various terms (28) corresponding to different intermediate states and then take the square of the modulus of the sum, which means that there is interference between the different transition processes (the direct one and those involving intermediate states) and one cannot give a meaning to the probability for one of these processes by itself. For each of these processes, however, there is a probability amplitude. If one carries out the perturbation method to a higher degree of accuracy, one obtains a result which can be interpreted similarly, with the help of more complicated transition processes involving a succession of intermediate states.
45. Application to radiation
In the preceding section a general theory of the perturbation of an
atomic system was developed, in which the perturbing energy could
vary with the time in an arbitrary way. A perturbation of this
kind can be realized in practice by allowing incident electromagnetic
radiation to fall on the system. Let us see what our result (24) reduces
to in this case.
If we neglect the effects of the magnetic field of the incident radiation,
and if we further assume that the wave-lengths of the harmonic
components of this radiation are all large compared with the dimensions
of the atomic system, then the perturbing energy is simply the
scalar product

V = (\mathbf{D}, \mathscr{E}),   (29)

where D is the total electric displacement of the system and ℰ is
the electric force of the incident radiation. We suppose ℰ to be a
given function of the time. If we take for simplicity the case when
the incident radiation is plane polarized with its electric vector in
a certain direction and let D denote the Cartesian component of the
vector D in this direction, the expression (29) for V reduces to the
ordinary product

V = D\mathscr{E},

where ℰ is the magnitude of the vector ℰ. The matrix elements of
V are

\langle\alpha''|V|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\,\mathscr{E},

since ℰ is a number. The matrix element ⟨α''|D|α'⟩ is independent
of t. From (18)
\langle\alpha''|V^*(t)|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\,e^{i(E''-E')t/\hbar}\,\mathscr{E}(t),

and hence the expression (24) for the transition probability becomes

P(\alpha'\alpha'') = \hbar^{-2}\,|\langle\alpha''|D|\alpha'\rangle|^2\,\Big|\int_{t_0}^{t}e^{i(E''-E')t'/\hbar}\,\mathscr{E}(t')\,dt'\Big|^2.   (30)

If the incident radiation during the time interval t_0 to t is resolved
into its Fourier components, the energy crossing unit area per unit
frequency range about the frequency ν will be, according to classical
electrodynamics,

E_\nu = \frac{c}{2\pi}\,\Big|\int_{t_0}^{t}e^{2\pi i\nu t'}\,\mathscr{E}(t')\,dt'\Big|^2.   (31)

Comparing this with (30), we obtain

P(\alpha'\alpha'') = 2\pi\hbar^{-2}c^{-1}\,|\langle\alpha''|D|\alpha'\rangle|^2\,E_\nu,   (32)

where

\nu = |E''-E'|/h.   (33)

From this result we see in the first place that the transition probability
depends only on that Fourier component of the incident radiation
whose frequency ν is connected with the change of energy by (33).
This gives us Bohr's Frequency Condition and shows how the ideas
of Bohr's atomic theory, which was the forerunner of quantum
mechanics, can be fitted in with quantum mechanics.
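Bohr's frequency condition lends itself to a direct numerical check. The following sketch is an editorial illustration, not part of the text: it evaluates the time integral appearing in the transition probability (30) for a monochromatic perturbation ℰ(t') = cos 2πν₀t' and confirms that the probability, regarded as a function of E''−E', peaks where |E''−E'|/ℏ = 2πν₀, i.e. at ν = |E''−E'|/h. The units ℏ = 1 and all parameter values are arbitrary choices of the sketch.

```python
import cmath, math

def transition_weight(omega, nu0, T=100.0, n=5000):
    """|integral_0^T e^{i*omega*t} cos(2*pi*nu0*t) dt|^2 by the midpoint rule.

    This is the factor multiplying |<a''|D|a'>|^2 / hbar^2 in (30),
    with omega standing for (E'' - E')/hbar (hbar = 1)."""
    dt = T / n
    s = 0j
    for k in range(n):
        t = (k + 0.5) * dt
        s += cmath.exp(1j * omega * t) * math.cos(2 * math.pi * nu0 * t) * dt
    return abs(s) ** 2

nu0 = 0.25                       # frequency of the incident wave
resonance = 2 * math.pi * nu0    # Bohr condition: (E'' - E')/hbar = 2*pi*nu0
omegas = [resonance * (0.5 + 0.01 * k) for k in range(101)]
peak = max(omegas, key=lambda w: transition_weight(w, nu0))
print(peak / resonance)          # close to 1: the peak sits at nu = |E''-E'|/h
```

The longer the interval T, the more sharply the probability concentrates about the resonant frequency.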
The present elementary theory does not tell us anything about the
energy of the field of radiation. It would be reasonable to assume,
though, that the energy absorbed or liberated by the atomic system
in the transition process comes from or goes into the component of
the radiation with frequency ν given by (33). This assumption will
be justified by the more complete theory of radiation given in
Chapter X. The result (32) is then to be interpreted as the probability
of the system, if initially in the state of lower energy, absorbing
radiation and being carried to the upper state, and, if initially in
the upper state, being stimulated by the incident radiation to emit
and fall to the lower state. The present theory does not account for
the experimental fact that the system, if in the upper state with no
incident radiation, can emit spontaneously and fall to the lower state,
but this also will be accounted for by the more complete theory of
Chapter X.
The existence of the phenomenon of stimulated emission was inferred
by Einstein† long before the discovery of quantum mechanics,
from a consideration of statistical equilibrium between atoms and a
field of black-body radiation satisfying Planck's law. Einstein showed
that the transition probability for stimulated emission must equal
that for absorption between the same pair of states, in agreement
with the present quantum theory, and deduced also a relation connecting
this transition probability with that for spontaneous emission,
which relation is in agreement with the theory of Chapter X.
The matrix element ⟨α''|D|α'⟩ in (32) plays the part of the amplitude
of one of the Fourier components of D in the classical theory of
a multiply-periodic system interacting with radiation. In fact it was
the idea of replacing classical Fourier components by matrix elements
which led Heisenberg to the discovery of quantum mechanics in 1925.
Heisenberg assumed that the formulas describing the interaction with
radiation of a system in the quantum theory can be obtained from
the classical formulas by substituting for the Fourier components of
the total electric displacement of the system the corresponding matrix
elements. According to this assumption applied to spontaneous emission,
a system having an electric moment D will, when in the state
† Einstein, Phys. Zeits. 18 (1917), 121.
α', spontaneously emit radiation of frequency ν = (E'−E'')/h, where
E'' is an energy-level, less than E', of some state α'', at the rate

\frac{4}{3}\,\frac{(2\pi\nu)^4}{c^3}\,|\langle\alpha''|D|\alpha'\rangle|^2.   (34)
The distribution of this radiation over the different directions of
emission and its state of polarization for each direction will be the
same as that for a classical electric dipole of moment equal to the
real part of ⟨α''|D|α'⟩. To interpret this rate of emission of radiant
energy as a transition probability, we must divide it by the quantum
of energy of this frequency, namely hν, and call it the probability per
unit time of this quantum being spontaneously emitted, with the
atomic system simultaneously dropping to the state α'' of lower
energy. These assumptions of Heisenberg are justified by the present
radiation theory, supplemented by the spontaneous transition theory
of Chapter X.
46. Transitions caused by a perturbation independent of the
time
The perturbation method of § 44 is still valid when the perturbing
energy V does not involve the time t explicitly. Since the total
Hamiltonian H in this case does not involve t explicitly, we could
now, if desired, deal with the system by the perturbation method of
§ 43 and find its stationary states. Whether this method would be
convenient or not would depend on what we want to find out about
the system. If what we have to calculate makes an explicit reference
to the time, e.g. if we have to calculate the probability of the system
being in a certain state at one time when we are given that it is in a
certain state at another time, the method of § 44 would be the more
convenient one.
Let us see what the result (24) for the transition probability becomes
when V does not involve t explicitly, and let us take t_0 = 0 to simplify
the writing. The matrix element ⟨α''|V|α'⟩ is now independent of t,
and from (18)

\langle\alpha''|V^*(t')|\alpha'\rangle = \langle\alpha''|V|\alpha'\rangle\,e^{i(E''-E')t'/\hbar},   (35)

so that

\int_0^t\langle\alpha''|V^*(t')|\alpha'\rangle\,dt' = \langle\alpha''|V|\alpha'\rangle\,\frac{e^{i(E''-E')t/\hbar}-1}{i(E''-E')/\hbar},

provided E'' ≠ E'. Thus the transition probability (24) becomes

P(\alpha'\alpha'') = |\langle\alpha''|V|\alpha'\rangle|^2\,[e^{i(E''-E')t/\hbar}-1][e^{-i(E''-E')t/\hbar}-1]/(E''-E')^2
= 2\,|\langle\alpha''|V|\alpha'\rangle|^2\,[1-\cos((E''-E')t/\hbar)]/(E''-E')^2.   (36)
If E'' differs appreciably from E', this transition probability is small
and remains so for all values of t. This result is required by the law
of the conservation of energy. The total energy H is constant and
hence the proper-energy E (i.e. the energy with neglect of the part
V due to the perturbation), being approximately equal to H, must
be approximately constant. This means that if E initially has the
numerical value E', at any later time there must be only a small
probability of its having a numerical value differing considerably
from E'.
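The behaviour of the first-order result (36) can be seen in a small numerical experiment. The sketch below is an editorial illustration, not part of the text: it integrates the Schrödinger equation for just two states of proper-energies E' and E'' coupled by a small constant perturbing energy V, and compares the exact probability of finding the system in the second state with formula (36). The two-level truncation, the parameter values, and the units ℏ = 1 are assumptions of the sketch.

```python
import math

def exact_p(E1, E2, V, t, n=2000):
    """|c2(t)|^2 from i dc/dt = Hc (hbar = 1), by fourth-order Runge-Kutta."""
    def deriv(c):
        c1, c2 = c
        return (-1j * (E1 * c1 + V * c2), -1j * (V * c1 + E2 * c2))
    c, h = (1 + 0j, 0j), t / n
    for _ in range(n):
        k1 = deriv(c)
        k2 = deriv((c[0] + h/2 * k1[0], c[1] + h/2 * k1[1]))
        k3 = deriv((c[0] + h/2 * k2[0], c[1] + h/2 * k2[1]))
        k4 = deriv((c[0] + h * k3[0], c[1] + h * k3[1]))
        c = (c[0] + h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
             c[1] + h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))
    return abs(c[1]) ** 2

def first_order_p(E1, E2, V, t):
    """Formula (36): 2|V|^2 [1 - cos((E''-E')t/hbar)] / (E''-E')^2."""
    return 2 * V**2 * (1 - math.cos((E2 - E1) * t)) / (E2 - E1) ** 2

E1, E2, V = 0.0, 1.0, 0.02   # |V| much smaller than E'' - E'
for t in (1.0, 5.0, 20.0):
    print(exact_p(E1, E2, V, t), first_order_p(E1, E2, V, t))
```

For |V| small compared with E''−E' the two columns agree closely and both remain small for all t, as stated above; as |V| is increased the first-order formula fails.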
On the other hand, when the initial state α' is such that there exists
another state α'' having the same or very nearly the same proper-energy
E, the probability of a transition to the final state α'' may be
quite large. The case of physical interest now is that in which there
is a continuous range of final states α'' having a continuous range of
proper-energy levels E'' passing through the value E' of the proper-energy
of the initial state. The initial state must not be one of the
continuous range of final states, but may be either a separate discrete
state or one of another continuous range of states. We shall now have,
remembering the rules of § 18 for the interpretation of probability
amplitudes with continuous ranges of states, that, with P(α'α'')
having the value (36), the probability of a transition to a final state
within the small range α'' to α''+dα'' will be P(α'α'') dα'' if the initial
state α' is discrete and will be proportional to this quantity if α' is
one of a continuous range.
We may suppose that the α's describing the final state consist of
E together with a number of other dynamical variables β, so that we
have a representation like that of § 43 for the degenerate case. (The
β's, however, need have no meaning for the initial state α'.) We shall
suppose for definiteness that the β's have only discrete eigenvalues.
The total probability of a transition to a final state α'' for which the
β's have the values β'' and E'' has any value (there will be a strong
probability of its having a value near the initial value E') will now
be (or be proportional to)

\int 2\,|\langle E''\beta''|V|\alpha'\rangle|^2\,\frac{1-\cos((E''-E')t/\hbar)}{(E''-E')^2}\,dE'' = 2t\hbar^{-1}\int|\langle E''\beta''|V|\alpha'\rangle|^2\,\frac{1-\cos x}{x^2}\,dx   (37)

if one makes the substitution (E''−E')t/ℏ = x. For large values of t
this reduces to

2t\hbar^{-1}\,|\langle E'\beta''|V|\alpha'\rangle|^2\int_{-\infty}^{\infty}\frac{1-\cos x}{x^2}\,dx = 2\pi t\hbar^{-1}\,|\langle E'\beta''|V|\alpha'\rangle|^2.   (38)
Thus the total probability up to time t of a transition to a final state
for which the β's have the values β'' is proportional to t. There is
therefore a definite probability coefficient, or probability per unit time,
for the transition process under consideration, having the value

2\pi\hbar^{-1}\,|\langle E'\beta''|V|\alpha'\rangle|^2.   (39)

It is proportional to the square of the modulus of the matrix element,
associated with this transition, of the perturbing energy.
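The factor π that turns (37) into (38) comes from the definite integral ∫_{−∞}^{∞}(1−cos x)/x² dx = π, which may be confirmed numerically. The following check is an editorial illustration; the finite cutoff standing in for the infinite limits, with a 2/X tail correction, is an assumption of the sketch.

```python
import math

def f(x):
    # the integrand of (38); its limit at x = 0 is 1/2
    return (1 - math.cos(x)) / (x * x)

# midpoint rule on [0, X], doubled for the even integrand; the tail
# beyond +-X contributes about 2/X and is added as a correction
X, n = 1000.0, 200000
dx = X / n
total = 2 * sum(f((k + 0.5) * dx) for k in range(n)) * dx + 2 / X
print(total, math.pi)   # the two values agree to a few decimals
```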
If the matrix element ⟨E'β''|V|α'⟩ is small compared with other
matrix elements of V, we must work with the more accurate formula
(26). We have from (35)

\int_0^t\langle\alpha''|V^*(t')|\alpha'''\rangle\,dt'\int_0^{t'}\langle\alpha'''|V^*(t'')|\alpha'\rangle\,dt''
= \langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle\int_0^t e^{i(E''-E''')t'/\hbar}\,dt'\int_0^{t'}e^{i(E'''-E')t''/\hbar}\,dt''
= \frac{\langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{i(E'''-E')/\hbar}\int_0^t\big(e^{i(E''-E')t'/\hbar}-e^{i(E''-E''')t'/\hbar}\big)\,dt'.

For E''' close to E', only the first term in the integrand here gives rise
to a transition probability of physical importance and the second
term may be discarded. Using this result in (26) we get

P(\alpha'\alpha'') = 2\,\Big|\langle\alpha''|V|\alpha'\rangle-\sum_{\alpha'''\neq\alpha',\alpha''}\frac{\langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{E'''-E'}\Big|^2\,\frac{1-\cos((E''-E')t/\hbar)}{(E''-E')^2},
which replaces (36). Proceeding as before, we obtain for the transition
probability per unit time to a final state for which the β's have
the values β'', and E has a value close to its initial value E',

2\pi\hbar^{-1}\,\Big|\langle E'\beta''|V|\alpha'\rangle-\sum_{\alpha'''}\frac{\langle E'\beta''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{E'''-E'}\Big|^2.   (40)

This formula shows how intermediate states, differing from the initial
state and the final state, play a role in the determination of a probability
coefficient.
In order that the approximations used in deriving (39) and (40) may
be valid, the time t must be not too small and not too large. It must
be large compared with the periods of the atomic system in order that
the approximate evaluation of the integral (37) leading to the result
(38) may be valid, while it must not be excessively large or else the
general formula (24) or (26) will break down. In fact one could make
the probability (38) greater than unity by taking t large enough. The
upper limit to t is fixed by the condition that the probability (24) or
(26), or t times (39) or (40), must be small compared with unity. There
is no difficulty in t satisfying both these conditions simultaneously
provided the perturbing energy V is sufficiently small.
47. The anomalous Zeeman effect
One of the simplest examples of the perturbation method of § 43
is the calculation of the first-order change in the energy-levels of an
atom caused by a uniform magnetic field. The problem of a hydrogen
atom in a uniform magnetic field has already been dealt with in § 41
and was so simple that perturbation theory was unnecessary. The
case of a general atom is not much more complicated when we make
a few approximations such that we can set up a simple model for the
atom.
We first of all consider the atom in the absence of the magnetic
field and look for constants of the motion or quantities that are
approximately constants of the motion. The total angular momentum
of the atom, the vector j say, is certainly a constant of the
motion. This angular momentum may be regarded as the sum of two
parts, the total orbital angular momentum of all the electrons, l say,
and the total spin angular momentum, s say. Thus we have j = l+s.
Now the effect of the spin magnetic moments on the motion of the
electrons is small compared with the effect of the Coulomb forces and
may be neglected as a first approximation. With this approximation
the spin angular momentum of each electron is a constant of the
motion, there being no forces tending to change its orientation. Thus
s, and hence also l, will be constants of the motion. The magnitudes,
l, s, and j say, of l, s, and j will be given by

l+\tfrac{1}{2}\hbar = (l_x^2+l_y^2+l_z^2+\tfrac{1}{4}\hbar^2)^{\frac{1}{2}},
s+\tfrac{1}{2}\hbar = (s_x^2+s_y^2+s_z^2+\tfrac{1}{4}\hbar^2)^{\frac{1}{2}},
j+\tfrac{1}{2}\hbar = (j_x^2+j_y^2+j_z^2+\tfrac{1}{4}\hbar^2)^{\frac{1}{2}},
corresponding to equation (39) of § 36. They commute with each
other, and from (47) of § 36 we see that with given numerical values
for l and s the possible numerical values for j are

l+s,\quad l+s-\hbar,\quad \ldots,\quad |l-s|.
Let us consider a stationary state for which l, s, and j have definite
numerical values in agreement with the above scheme. The energy
of this state will depend on l, but one might think that with neglect
of the spin magnetic moments it would be independent of s, and
also of the direction of the vector s relative to l, and thus of j. It will
be found in Chapter IX, however, that the energy depends very much
on the magnitude s of the vector s, although independent of its
direction when one neglects the spin magnetic moments, on account
of certain phenomena arising from the fact that the electrons are
indistinguishable one from another. There are thus different energy-levels
of the system for each different value of l and s. This means
that l and s are functions of the energy, according to the general
definition of a function given in § 11, since the l and s of a stationary
state are fixed when the energy of that state is fixed.
We can now take into account the effect of the spin magnetic
moments, treating it as a small perturbation according to the method
of § 43. The energy of the unperturbed system will still be approximately
a constant of the motion and hence l and s, being functions
of this energy, will still be approximately constants of the motion.
The directions of the vectors l and s, however, not being functions of
the unperturbed energy, need not now be approximately constants
of the motion and may undergo large secular variations. Since the
vector j is constant, the only possible variation of l and s is a precession
about the vector j. We thus have an approximate model of
the atom consisting of the two vectors l and s of constant lengths
precessing about their sum j, which is a fixed vector. The energy is
determined mainly by the magnitudes of l and s and depends only
slightly on their relative directions, specified by j. Thus states with
the same l and s and different j will have only slightly different
energy-levels, forming what is called a multiplet term.
Let us now take this atomic model as our unperturbed system and
suppose it to be subjected to a uniform magnetic field of magnitude ℋ
in the direction of the z-axis. The extra energy due to this magnetic
field will consist of a term

e\mathscr{H}/2mc\,.\,(m_z+\hbar\sigma_z),   (41)
like the last term in equation (89) of § 41, contributed by each
electron, and will thus be altogether

e\mathscr{H}/2mc\,.\,\sum(m_z+\hbar\sigma_z) = e\mathscr{H}/2mc\,.\,(l_z+2s_z) = e\mathscr{H}/2mc\,.\,(j_z+s_z).   (42)

This is our perturbing energy V. We shall now use the method of
§ 43 to determine the changes in the energy-levels caused by this V.
The method will be legitimate only provided the field is so weak that
V is small compared with the energy differences within a multiplet.
Our unperturbed system is degenerate, on account of the direction
of the vector j being undetermined. We must therefore take, from
the representative of V in a Heisenberg representation for the unperturbed
system, those matrix elements that refer to one particular
energy-level for their row and column, and obtain the eigenvalues of
the matrix thus formed. We can do this best by first splitting up V
into two parts, one of which is a constant of the unperturbed motion,
so that its representative contains only matrix elements referring to
the same unperturbed energy-level for their row and column, while
the representative of the other contains only matrix elements referring
to two different unperturbed energy-levels for their row and
column, so that this second part does not affect the first-order perturbation.
The term involving j_z in (42) is a constant of the unperturbed
motion and thus belongs entirely to the first part. For the
term involving s_z we have
s_z = (j_x^2+j_y^2+j_z^2)^{-1}\{(j_x s_x+j_y s_y+j_z s_z)j_z+\gamma_x j_y-\gamma_y j_x\},   (43)

where

\gamma_x = s_z j_y-j_z s_y = s_z l_y-l_z s_y,\qquad
\gamma_y = j_z s_x-s_z j_x = l_z s_x-s_z l_x.   (44)
The first term in this expression for s_z is a constant of the unperturbed
motion and thus belongs entirely to the first part, while the second
term, as we shall now see, belongs entirely to the second part.
Corresponding to (44) we can introduce

\gamma_z = l_x s_y-l_y s_x.

It can now easily be verified that

j_x\gamma_x+j_y\gamma_y+j_z\gamma_z = 0

and from (30) of § 35

[j_z,\gamma_x] = \gamma_y,\qquad [j_z,\gamma_y] = -\gamma_x,\qquad [j_z,\gamma_z] = 0.
These relations connecting j_x, j_y, j_z and γ_x, γ_y, γ_z are of the same form
as the relations connecting m_x, m_y, m_z and x, y, z in the calculation
in § 40 of the selection rule for the matrix elements of z in a representation
with k diagonal. From the result there obtained, that all
matrix elements of z vanish except those referring to two k values
differing by ±ℏ, we can infer that all matrix elements of γ_z, and
similarly of γ_x and γ_y, in a representation with j diagonal, vanish
except those referring to two j values differing by ±ℏ. The coefficients
of γ_x and γ_y in the second term on the right-hand side of (43)
commute with j, so the representative of the whole of this term will
contain only matrix elements referring to two j values differing by
±ℏ, and thus referring to two different energy-levels of the unperturbed
system.
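These algebraic properties of γ can be verified concretely in matrices. The sketch below is an editorial illustration, not part of the text: it takes the smallest non-trivial case, two commuting angular momenta of magnitude ½ℏ built from Pauli matrices (ℏ = 1), forms γ exactly as in (44), and checks both that j_xγ_x+j_yγ_y+j_zγ_z = 0 and that j_zγ_x−γ_xj_z = iγ_y, which is the commutator form of the Poisson-bracket relation [j_z, γ_x] = γ_y.

```python
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    na, nb = len(a), len(b)
    return [[a[i][j] * b[k][m] for j in range(na) for m in range(nb)]
            for i in range(na) for k in range(nb)]

def norm(a):
    return max(abs(x) for row in a for x in row)

I2 = [[1, 0], [0, 1]]
hx = [[0, 0.5], [0.5, 0]]       # spin-1/2 angular momentum matrices (hbar = 1)
hy = [[0, -0.5j], [0.5j, 0]]
hz = [[0.5, 0], [0, -0.5]]

l = [kron(h, I2) for h in (hx, hy, hz)]   # stands in for l
s = [kron(I2, h) for h in (hx, hy, hz)]   # stands in for s
j = [add(a, b) for a, b in zip(l, s)]     # j = l + s

# gamma exactly as defined in (44)
gx = add(mul(s[2], j[1]), scale(-1, mul(j[2], s[1])))   # s_z j_y - j_z s_y
gy = add(mul(j[2], s[0]), scale(-1, mul(s[2], j[0])))   # j_z s_x - s_z j_x
gz = add(mul(l[0], s[1]), scale(-1, mul(l[1], s[0])))   # l_x s_y - l_y s_x

jg = add(add(mul(j[0], gx), mul(j[1], gy)), mul(j[2], gz))
print(norm(jg))                                      # 0: j.gamma vanishes

comm = add(mul(j[2], gx), scale(-1, mul(gx, j[2])))  # j_z g_x - g_x j_z
print(norm(add(comm, scale(-1, scale(1j, gy)))))     # 0: equals i*gamma_y
```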
Hence the perturbing energy V becomes, when we neglect that
part of it whose representative consists of matrix elements referring
to two different unperturbed energy-levels,

e\mathscr{H}/2mc\,.\,\{1+(j_x s_x+j_y s_y+j_z s_z)(j_x^2+j_y^2+j_z^2)^{-1}\}\,j_z.   (45)

Since j_x s_x+j_y s_y+j_z s_z = ½{j(j+ℏ)+s(s+ℏ)−l(l+ℏ)} and j_x²+j_y²+j_z² = j(j+ℏ),
this equals

e\mathscr{H}/2mc\,.\,\Big\{1+\frac{j(j+\hbar)+s(s+\hbar)-l(l+\hbar)}{2j(j+\hbar)}\Big\}\,j_z.   (46)

The eigenvalues of this give the first-order changes in the energy-levels.
We can make the representative of this expression diagonal
by choosing our representation such that j_z is diagonal, and it then
gives us directly the first-order changes in the energy-levels caused by
the magnetic field. This expression is known as Landé's formula.
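In the conventional spectroscopic units, in which angular momenta are quoted in multiples of ℏ so that j(j+ℏ) becomes j(j+1)ℏ², the factor in braces in Landé's formula is the familiar g-factor. The short check below is an editorial illustration; the doublet terms chosen are the usual alkali examples.

```python
def lande_g(l, s, j):
    """Lande factor 1 + [j(j+1) + s(s+1) - l(l+1)] / [2 j(j+1)]."""
    return 1 + (j*(j + 1) + s*(s + 1) - l*(l + 1)) / (2 * j*(j + 1))

print(lande_g(0, 0.5, 0.5))   # 2S_1/2: g = 2
print(lande_g(1, 0.5, 0.5))   # 2P_1/2: g = 2/3
print(lande_g(1, 0.5, 1.5))   # 2P_3/2: g = 4/3
```

The different g-factors of the levels of a multiplet are what produce the anomalous (unequally spaced) Zeeman pattern.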
The result (46) holds only provided the perturbing energy V is small
compared with the energy differences within a multiplet. For larger
values of V a more complicated theory is required. For very strong
fields, however, for which V is large compared with the energy differences
within a multiplet, the theory is again very simple. We may
now neglect altogether the energy of the spin magnetic moments for
the atom with no external field, so that for our unperturbed system
the vectors l and s themselves are constants of the motion, and not
merely their magnitudes l and s. Our perturbing energy V, which is
still eℋ/2mc.(j_z+s_z), is now a constant of the motion for the unperturbed
system, so that its eigenvalues give directly the changes in the
energy-levels. These eigenvalues are integral or half-odd integral
multiples of eℋℏ/2mc according to whether the number of electrons
in the atom is even or odd.
VIII
COLLISION PROBLEMS
48. General remarks
IN this chapter we shall investigate problems connected with a particle
which, coming from infinity, encounters or 'collides with' some
atomic system and, after being scattered through a certain angle, goes
off to infinity again. The atomic system which does the scattering
we shall call, for brevity, the scatterer. We thus have a dynamical
system composed of an incident particle and a scatterer interacting
with each other, which we must deal with according to the laws of
quantum mechanics, and for which we must, in particular, calculate
the probability of scattering through any given angle. The scatterer
is usually assumed to be of infinite mass and to be at rest throughout
the scattering process. The problem was first solved by Born by a
method substantially equivalent to that of the next section. We must
take into account the possibility that the scatterer, considered as a
system by itself, may have a number of different stationary states
and that if it is initially in one of these states when the particle arrives
from infinity, it may be left in a different one when the particle goes
off to infinity again. The colliding particle may thus induce transitions
in the scatterer.
The Hamiltonian for the whole system of scatterer plus particle
will not involve the time explicitly, so that this whole system will
have stationary states represented by periodic solutions of Schrödinger's
wave equation. The meaning of these stationary states
requires a little care to be properly understood. It is evident that
for any state of motion of the system the particle will spend nearly all
its time at infinity, so that the time average of the probability of the
particle being in any finite volume will be zero. Now for a stationary
state the probability of the particle being in a given finite volume,
like any other result of observation, must be independent of the time,
and hence this probability will equal its time average, which we have
seen is zero. Thus only the relative probabilities of the particle being
in different finite volumes will be physically significant, their absolute
values being all zero. The total energy of the system has a continuous
range of eigenvalues, since the initial energy of the particle can be
anything. Thus a ket, |s⟩ say, corresponding to a stationary state,
being an eigenket of the total energy, must be of infinite length. We
can see a physical reason for this, since if |s⟩ were normalized and if
δ denotes that observable (a certain function of the position of
the particle) that is equal to unity if the particle is in a given finite
volume and zero otherwise, then ⟨s|δ|s⟩ would be zero, meaning that
the average value of δ, i.e. the probability of the particle being in the
given volume, is zero. Such a ket |s⟩ would not be a convenient one
to work with. However, with |s⟩ of infinite length, ⟨s|δ|s⟩ can be
finite and would then give the relative probability of the particle
being in the given volume.
In picturing a state of a system corresponding to a ket |x⟩ which
is not normalized, but for which ⟨x|x⟩ = n say, it may be convenient
to suppose that we have n similar systems all occupying the same
space but with no interaction between them, so that each one follows
out its own motion independently of the others, as we had in the
theory of the Gibbs ensemble in § 33. We can then interpret ⟨x|α|x⟩,
where α is any observable, directly as the total α for all the n systems.
In applying these ideas to the above-mentioned |s⟩ of infinite length,
corresponding to a stationary state of the system of scatterer plus
colliding particle, we should picture an infinite number of such systems
with the scatterers all located at the same point and the particles
distributed continuously throughout space. The number of particles
in a given finite volume would be pictured as ⟨s|δ|s⟩, δ being the
observable defined above, which has the value unity when the particle
is in the given volume and zero otherwise. If the ket is represented
by a Schrödinger wave function involving the Cartesian coordinates
of the particle, then the square of the modulus of the wave function
could be interpreted directly as the density of particles in the picture.
One must remember, however, that each of these particles has its own
individual scatterer. Different particles may belong to scatterers in
different states. There will thus be one particle density for each state
of the scatterer, namely the density of those particles belonging to
scatterers in that state. This is taken account of by the wave function
involving variables describing the state of the scatterer in addition
to those describing the position of the particle.
For determining scattering coefficients we have to investigate
stationary states of the whole system of scatterer plus particle. For
instance, if we want to determine the probability of scattering in
various directions when the scatterer is initially in a given stationary
state and the incident particle has initially a given velocity in a given
direction, we must investigate that stationary state of the whole
system whose picture, according to the above method, contains at
great distances from the point of location of the scatterers only
particles moving with the given initial velocity and direction and
belonging each to a scatterer in the given initial stationary state,
together with particles moving outward from the point of location
of the scatterers and belonging possibly to scatterers in various
stationary states. This picture corresponds closely to the actual state
of affairs in an experimental determination of scattering coefficients,
with the difference that the picture really describes only one actual
system of scatterer plus particle. The distribution of outward moving
particles at infinity in the picture gives us immediately all the information
about scattering coefficients that could be obtained by experiment.
For practical calculations about the stationary state described
by this picture one may use a perturbation method somewhat like
that of § 43, taking as unperturbed system, for example, that for
which there is no interaction between the scatterer and particle.
In dealing with collision problems, a further possibility to be taken
into consideration is that the scatterer may perhaps be capable of
absorbing and re-emitting the particle. This possibility arises when
there exists one or more states of absorption of the whole system, a
state of absorption being an approximately stationary state which
is closed in the sense mentioned at the end of § 38 (i.e. for which
the probability of the particle being at a greater distance than r from
the scatterer tends to zero as r → ∞). Since a state of absorption is
only approximately stationary, its property of being closed will be
only a transient one, and after a sufficient lapse of time there will be
a finite probability of the particle being on its way to infinity.
Physically this means there is a finite probability of spontaneous
emission of the particle. The fact that we had to use the word
'approximately' in stating the conditions required for the phenomena
of emission and absorption to be able to occur shows that these conditions
are not expressible in exact mathematical language. One can give
a meaning to these phenomena only with reference to a perturbation
method. They occur when the unperturbed system (of scatterer plus
particle) has stationary states that are closed. The introduction of the
perturbation spoils the stationary property of these states and gives
rise to spontaneous emission and its converse absorption.
For calculating absorption and emission probabilities it is necessary
to deal with non-stationary states of the system, in contradistinction
to the case for scattering coefficients, so that the perturbation method
of § 44 must be used. Thus for calculating an emission coefficient
we must consider the non-stationary states of absorption described
above. Again, since an absorption is always followed by a re-emission,
it cannot be distinguished from a scattering in any experiment involving
a steady state of affairs, corresponding to a stationary state
of the system. The distinction can be made only by reference to a
non-steady state of affairs, e.g. by use of a stream of incident particles
that has a sharp beginning, so that the scattered particles will appear
immediately after the incident particles meet the scatterers, while
those that have been absorbed and re-emitted will begin to appear
only some time later. This stream of particles would be the picture
of a certain ket of infinite length, which could be used for calculating
the absorption coefficient.
49. The scattering coefficient
We shall now consider the calculation of scattering coefficients,
taking first the case when there is no absorption and emission, which
means that our unperturbed system has no closed stationary states.
We may conveniently take this unperturbed system to be that for
which there is no interaction between the scatterer and particle. Its
Hamiltonian will thus be of the form

E = H_s+W,   (1)

where H_s is that for the scatterer alone and W that for the particle
alone, namely, with neglect of relativistic mechanics,

W = 1/2m\,.\,(p_x^2+p_y^2+p_z^2).   (2)

The perturbing energy V, assumed small, will now be a function of
the Cartesian coordinates of the particle x, y, z, and also, perhaps,
of its momenta p_x, p_y, p_z, together with dynamical variables describing
the scatterer.
Since we are now interested only in stationary states of the whole
system, we use a perturbation method like that of § 43. Our unperturbed
system now necessarily has a continuous range of energy-levels,
since it contains a free particle, and this gives rise to certain
modifications in the perturbation method. The question of the change
in the energy-levels caused by the perturbation, which was the main
question of § 43, no longer has a meaning, and the convention in § 43
of using the same number of primes to denote nearly equal eigenvalues
of E and H now drops out. Again, the splitting of energy-levels
which we had in § 43 when the unperturbed system is degenerate
cannot now arise, since if the unperturbed system is degenerate the
perturbed one, which must also have a continuous range of energy-levels,
will also be degenerate to exactly the same extent.
We again use the general scheme of equations developed at the
beginning of § 43, equations (1) to (4) there, but we now take our
unperturbed stationary state forming the zero-order approximation
to belong to an energy-level E' just equal to the energy-level H' of
our perturbed stationary state. Thus the a's introduced in the second
of equations (3) § 43 are now all zero and the second of equations
(4) there now reads

(E'-E)|1\rangle = V|0\rangle.   (3)

Similarly, the third of equations (4) § 43 now reads

(E'-E)|2\rangle = V|1\rangle.   (4)

We shall proceed to solve equation (3) and to obtain the scattering
coefficient to the first order. We shall need equation (4) in § 51.
Let the α's denote a complete set of commuting observables describing
the scatterer, which are constants of the motion when the scatterer is
alone and may thus be used for labelling the stationary states of the
scatterer. This requires that H_s shall commute with the α's and be
a function of them. We can now take a representation of the whole
system in which the α's and x, y, z, the coordinates of the particle,
are diagonal. This will make H_s diagonal. Let |0⟩ be represented by
⟨xα'|0⟩ and |1⟩ by ⟨xα'|1⟩, the single variable x being written to
denote x, y, z and the prime being omitted from x for brevity. In the
same way the single differential d³x will be written to denote the
product dx dy dz. Equation (3), written in terms of representatives,
becomes, with the help of (1) and (2),
{E' − H_s(α') + ℏ²/2m . ∇²}⟨xα'|1⟩ = Σ_{α''} ∫ ⟨xα'|V|x''α''⟩ d³x'' ⟨x''α''|0⟩.
(5)
Suppose that the incident particle has the momentum p⁰ and that
the initial stationary state of the scatterer is α⁰. The stationary state
of our unperturbed system is now the one for which p = p⁰ and
α = α⁰, and hence its representative is
⟨xα'|0⟩ = δ_{α'α⁰} e^{i(p⁰,x)/ℏ}. (6)
This makes equation (5) reduce to
{E' − H_s(α') + ℏ²/2m . ∇²}⟨xα'|1⟩ = ∫ ⟨xα'|V|x⁰α⁰⟩ d³x⁰ e^{i(p⁰,x⁰)/ℏ}
or (k² + ∇²)⟨xα'|1⟩ = F, (7)
where k² = 2mℏ⁻²{E' − H_s(α')} (8)
and F = 2mℏ⁻² ∫ ⟨xα'|V|x⁰α⁰⟩ d³x⁰ e^{i(p⁰,x⁰)/ℏ}, (9)
a definite function of x, y, z, and α'. We must also have
E' = H_s(α⁰) + p⁰²/2m. (10)
Our problem now is to obtain a solution ⟨xα'|1⟩ of (7) which, for
values of x, y, z denoting points far from the scatterer, represents
only outward moving particles. The square of its modulus, |⟨xα'|1⟩|²,
will then give the density of scattered particles belonging to scatterers
in the state α' when the density of the incident particles is |⟨xα⁰|0⟩|²,
which is unity. If we transform to polar coordinates r, θ, φ, equation
(7) becomes
{k² + ∂²/∂r² + 2r⁻¹ ∂/∂r + r⁻²(sinθ)⁻¹ ∂/∂θ(sinθ ∂/∂θ) + r⁻²(sinθ)⁻² ∂²/∂φ²}⟨rθφα'|1⟩ = F. (11)
Now F must tend to zero as r → ∞, on account of the physical
requirement that the interaction energy between the scatterer and
particle must tend to zero as the distance between them tends to
infinity. If we neglect F in (11) altogether, an approximate solution
for large r is ⟨rθφα'|1⟩ = u(θφα') r⁻¹ e^{ikr}, (12)
where u is an arbitrary function of θ, φ, and α', since this expression
substituted in the left-hand side of (11) gives a result of order r⁻³.
When we do not neglect F, the solution of (11) will still be of the
form (12) for large r, provided F tends to zero sufficiently rapidly as
r → ∞, but the function u will now be definite and determined by the
solution for smaller values of r.
For values α' of the α's such that k², defined by (8), is positive, the
k in (12) must be chosen to be the positive square root of k², in order
that (12) may represent only outward moving particles, i.e. particles
for which the radial component of momentum, which from § 38
equals p_r − iℏr⁻¹ or −iℏ(∂/∂r + r⁻¹), has a positive value. We now
have that the density of scattered particles belonging to scatterers in
state α', equal to the square of the modulus of (12), falls off with
increasing r according to the inverse square law, as is physically
necessary, and their angular distribution is given by |u(θφα')|².
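The outward-moving character of (12) may be verified directly (a step written out here, not part of the original text): applying the radial momentum operator −iℏ(∂/∂r + r⁻¹) quoted above from § 38 to the asymptotic form (12) gives

```latex
-i\hbar\left(\frac{\partial}{\partial r}+\frac{1}{r}\right)
   u(\theta\phi\alpha')\,r^{-1}e^{ikr}
 = -i\hbar\left(-\frac{1}{r^{2}}+\frac{ik}{r}+\frac{1}{r^{2}}\right)
   u(\theta\phi\alpha')\,e^{ikr}
 = \hbar k\,u(\theta\phi\alpha')\,r^{-1}e^{ikr},
```

so (12) is an eigenstate of the radial momentum with eigenvalue ℏk, which is positive when k is taken as the positive square root of k².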
Further, the magnitude, P' say, of the momentum of these scattered
particles must equal kℏ, the momentum being radial for large r,
so that their energy is equal to
P'²/2m = k²ℏ²/2m = E' − H_s(α') = p⁰²/2m − {H_s(α') − H_s(α⁰)},
with the help of (8) and (10). This is just the energy of an incident
particle, namely p⁰²/2m, reduced by the increase in energy of the
scatterer, namely H_s(α') − H_s(α⁰), in agreement with the law of conservation
of energy. For values α' of the α's such that k² is negative
there are no scattered particles, the total initial energy being insufficient
for the scatterer to be left in the state α'.
We must now evaluate u(θφα') for a set of values α' for the α's such
that k² is positive, and obtain the angular distribution of the scattered
particles belonging to scatterers in state α'. It is sufficient to evaluate
u for the direction θ = 0 of the pole of the polar coordinates, since
this direction is arbitrary. We make use of Green's theorem, which
states that for any two functions of position A and B the volume
integral ∫(A∇²B − B∇²A) d³x taken over any volume equals the
surface integral ∫(A ∂B/∂n − B ∂A/∂n) dS taken over the boundary
of the volume, ∂/∂n denoting differentiation along the normal to
the surface. We take
A = e^{−ikr cosθ}, B = ⟨rθφα'|1⟩
and apply the theorem to a large sphere with the origin as centre.
The volume integrand is thus
e^{−ikr cosθ} ∇²⟨rθφα'|1⟩ − ⟨rθφα'|1⟩ ∇² e^{−ikr cosθ}
= e^{−ikr cosθ} (∇² + k²)⟨rθφα'|1⟩ = e^{−ikr cosθ} F
from (7) or (11), while the surface integrand is, with the help of (12),
e^{−ikr cosθ} ∂/∂r{u r⁻¹ e^{ikr}} − u r⁻¹ e^{ikr} ∂/∂r e^{−ikr cosθ}
= e^{−ikr cosθ} u(−r⁻² + ikr⁻¹) e^{ikr} + ik cosθ . u r⁻¹ e^{ikr} e^{−ikr cosθ}
= iku r⁻¹ (1 + cosθ) e^{ikr(1−cosθ)}
with neglect of r⁻². Hence we get
∫ e^{−ikr cosθ} F d³x = ∫₀^{2π} dφ ∫₀^π r² sinθ dθ . iku r⁻¹ (1 + cosθ) e^{ikr(1−cosθ)},
the volume integral on the left being taken over the whole of space.
The right-hand side becomes, on being integrated by parts with
respect to θ,
∫₀^{2π} dφ {[u(1 + cosθ) e^{ikr(1−cosθ)}]_{θ=0}^{θ=π} − ∫₀^π e^{ikr(1−cosθ)} d/dθ{u(1 + cosθ)} dθ}.
The second term in the { } brackets is of the order of magnitude of
r⁻¹, as would be revealed by further partial integrations, and may
therefore be neglected. We are thus left with
∫ e^{−ikr cosθ} F d³x = −2 ∫₀^{2π} dφ u(0φα') = −4π u(0φα'),
giving the value of u(0φα') for the direction θ = 0.
This result may be written
u(0φα') = −(4π)⁻¹ ∫ e^{−iP'r cosθ/ℏ} F d³x, (13)
since P' = kℏ. If the vector p' denotes the momentum of the scattered
particles coming off in a certain direction (and is thus of magnitude
P'), the value of u for this direction will be
u(θ'φ'α') = −(4π)⁻¹ ∫ e^{−i(p',x)/ℏ} F d³x,
as follows from (13) if one takes this direction to be the pole of the
polar coordinates. This becomes, with the help of (9),
u(θ'φ'α')
= −(2π)⁻¹ mℏ⁻² ∫∫ e^{−i(p',x)/ℏ} d³x ⟨xα'|V|x⁰α⁰⟩ d³x⁰ e^{i(p⁰,x⁰)/ℏ}
= −2πmh ⟨p'α'|V|p⁰α⁰⟩, (14)
when one makes a transformation from the coordinates x to the
momenta p of the particle, using the transformation function (54)
of § 23. The single letter p is here used as a label for the three
components of momentum.
The density of scattered particles belonging to scatterers in state
α' is now given by |u(θ'φ'α')|²/r². Since their velocity is P'/m, the
rate at which these particles appear per unit solid angle about the
direction of the vector p' will be P'/m . |u(θ'φ'α')|². The density of
the incident particles is, as we have seen, unity, so that the number
of incident particles crossing unit area per unit time is equal to their
velocity P⁰/m, where P⁰ is the magnitude of p⁰. Hence the effective
area that must be hit by an incident particle in order to be scattered
in a unit solid angle about the direction p' and then belong to a
scatterer in state α' will be
P'/P⁰ . |u(θ'φ'α')|² = 4π²m²h²P'/P⁰ . |⟨p'α'|V|p⁰α⁰⟩|². (15)
This is the scattering coefficient for transitions α⁰ → α' of the scatterer.
It depends on that matrix element ⟨p'α'|V|p⁰α⁰⟩ of the perturbing
energy V whose column p⁰α⁰ and whose row p'α' refer respectively to
the initial and final states of the unperturbed system, between which
the scattering transition process takes place. The result (15) is thus
in some ways analogous to the result (24) of § 44, although the
numerical coefficients are different in the two cases, corresponding
to the different natures of the two transition processes.
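As a numerical illustration (an addition, not part of Dirac's text), suppose the scatterer has no internal states, so that the α labels drop out and (14), (15) reduce to the familiar Born approximation: the amplitude is u = −(m/2πℏ²) ∫ e^{−i(q,x)/ℏ} V(x) d³x with q = p' − p⁰, and for elastic scattering (P' = P⁰) the coefficient (15) is just |u|². The sketch below checks the quadrature against the closed form for an assumed Yukawa potential V(r) = A e^{−r/a}/r, in units with ℏ = m = 1; the values of A, a, and q are purely illustrative.

```python
import math

hbar, m = 1.0, 1.0        # units with hbar = m = 1 (assumption for this sketch)
A, a = 1.0, 2.0           # illustrative Yukawa strength and range

def born_amplitude(q):
    """u = -(m/(2*pi*hbar^2)) * integral of exp(-i q.x/hbar) V(x) d^3x
    for V(r) = A*exp(-r/a)/r.  The angular integration is done analytically,
    leaving (4*pi*hbar/q) * integral of V(r)*sin(q r/hbar)*r dr, which is
    evaluated here by simple quadrature."""
    n, rmax = 100000, 100.0
    dr = rmax / n
    s = 0.0
    for i in range(1, n + 1):
        r = i * dr
        s += A * math.exp(-r / a) * math.sin(q * r / hbar)  # V(r)*r = A*exp(-r/a)
    integral = 4.0 * math.pi * hbar / q * s * dr
    return -m / (2.0 * math.pi * hbar**2) * integral

def cross_section_exact(q):
    """Closed form of |u|^2 for the Yukawa potential: 4 m^2 A^2/(q^2 + hbar^2/a^2)^2."""
    return 4.0 * m**2 * A**2 / (q**2 + (hbar / a)**2)**2

q = 1.3                   # momentum transfer |p' - p0|, illustrative
assert abs(born_amplitude(q)**2 - cross_section_exact(q)) < 1e-4
```

In the limit a → ∞ the closed form goes over into the Rutherford law, the classic check on the first-order treatment.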
50. Solution with the momentum representation
The result (15) for the scattering coefficient makes a reference only
to that representation in which the momentum p is diagonal. One
would thus expect to be able to get a more direct proof of the result
by working all the time in the p-representation, instead of working
in the x-representation and transforming at the end to the p-representation,
as was done in § 49. This would not at first sight appear
to be a great improvement, as the lack of directness of the x-representation
method is offset by its more direct applicability, it being
possible to picture the square of the modulus of the x-representative
of a state as the density of a stream of particles in process of being
scattered. The x-representation method has, however, other more
serious disadvantages. One of the main applications of the theory
of collisions is to the case of photons as incident particles. Now a
photon is not a simple particle but has a polarization. It is evident
from classical electromagnetic theory that a photon with a definite
momentum, i.e. one moving in a definite direction with a definite
frequency, may have a definite state of polarization (linear, circular,
etc.), while a photon with a definite position, which is to be pictured
as an electromagnetic disturbance confined to a very small volume,
cannot have any definite polarization. These facts mean that the
polarization observable of a photon commutes with its momentum
but not with its position. This results in the p-representation method
being immediately applicable to the case of photons, it being only
necessary to introduce the polarization variable into the representatives
and treat it along with the α's describing the scatterer, while the
x-representation method is not applicable. Further, in dealing with
photons, it is necessary to take relativistic mechanics into account.
This can easily be done in the p-representation method, but not so
easily in the x-representation method.
Equation (3) still holds with relativistic mechanics, but W is now
given by W²c⁻² = m²c² + P² = m²c² + p_x² + p_y² + p_z² (16)
instead of by (2). Written in terms of p-representatives, equation (3)
gives {E' − H_s(α') − W}⟨pα'|1⟩ = ⟨pα'|V|0⟩,
p being written instead of p' for brevity and W being understood as
a definite function of p_x, p_y, p_z given by (16).
This may be written
(W' − W)⟨pα'|1⟩ = ⟨pα'|V|0⟩, (17)
where W' = E' − H_s(α') (18)
and is the energy required by the law of conservation of energy for
a scattered particle belonging to a scatterer in state α'. The ket |0⟩
is represented by (6) in the x-representation and the basic ket |p⁰α⁰⟩
is represented by
⟨xα'|p⁰α⁰⟩ = δ_{α'α⁰} ⟨x|p⁰⟩ = δ_{α'α⁰} h^{−3/2} e^{i(p⁰,x)/ℏ},
from the transformation function (54) of § 23. Hence
|0⟩ = h^{3/2}|p⁰α⁰⟩, (19)
and equation (17) may be written
(W' − W)⟨pα'|1⟩ = h^{3/2}⟨pα'|V|p⁰α⁰⟩. (20)
We now make a transformation from the Cartesian coordinates
p_x, p_y, p_z of p to its polar coordinates P, ω, χ, given by
p_x = P cosω, p_y = P sinω cosχ, p_z = P sinω sinχ.
If in the new representation we take the weight function P² sinω,
then the weight attached to any volume of p-space will be the same
as in the previous p-representation, so that the transformation will
mean simply a relabelling of the rows and columns of the matrices
without any alteration of the matrix elements. Thus (20) will become
in the new representation
(W' − W)⟨Pωχα'|1⟩ = h^{3/2}⟨Pωχα'|V|P⁰ω⁰χ⁰α⁰⟩, (21)
W being now a function of the single variable P.
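Differentiating (16) gives a relation between P and W that is used repeatedly below (a step written out here for reference, not in the original):

```latex
\frac{W^{2}}{c^{2}} = m^{2}c^{2} + P^{2}
\;\Longrightarrow\;
\frac{2W\,dW}{c^{2}} = 2P\,dP
\;\Longrightarrow\;
P\,dP = \frac{W\,dW}{c^{2}},
```

which converts integrals over the magnitude of the momentum into integrals over the energy.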
The coefficient of ⟨Pωχα'|1⟩, namely W' − W, is now simply a
multiplying factor and not a differential operator as it was with the
x-representation method. We can therefore divide out by this factor
and obtain an explicit expression for ⟨Pωχα'|1⟩. When, however, α'
is such that W', defined by (18), is greater than mc², this factor will
have the value zero for a certain point in the domain of the variable
P, namely the point P = P', given in terms of W' by (16). The
function ⟨Pωχα'|1⟩ will then have a singularity at this point. This
singularity shows that ⟨Pωχα'|1⟩ represents an infinite number of
particles moving about at great distances from the scatterers with
energies indefinitely close to W' and it is therefore this singularity
that we have to study to get the angular distribution of the particles
at infinity.
The result of dividing out (21) by the factor W' − W is, according
to (13) of § 15,
⟨Pωχα'|1⟩ = h^{3/2}⟨Pωχα'|V|P⁰ω⁰χ⁰α⁰⟩/(W' − W) + λ(ωχα') δ(W' − W),
(22)
where λ is an arbitrary function of ω, χ, and α'. To give a meaning
to the first term on the right-hand side of (22), we make the convention
that its integral with respect to P over a range that includes the
value P' is the limit when ε → 0 of the integral when the small
domain P' − ε to P' + ε is excluded from the range of integration.
This is sufficient to make the meaning of (22) precise, since we are
interested effectively only in the integrals of the representatives of
states when the representation has continuous ranges of rows and
columns. We see that equation (21) is inadequate to determine the
representative ⟨Pωχα'|1⟩ completely, on account of the arbitrary
function λ occurring in (22). We must choose this λ such that
⟨Pωχα'|1⟩ represents only outward moving particles, since we want
the only inward moving particles to be those corresponding to |0⟩.
Let us take first the general case when the representative ⟨Pωχ|⟩
of a state of the particle satisfies an equation of the type
(W' − W)⟨Pωχ|⟩ = f(Pωχ), (23)
where f(Pωχ) is any function of P, ω, and χ, and W' is a number
greater than mc², so that ⟨Pωχ|⟩ is of the form
⟨Pωχ|⟩ = f(Pωχ)/(W' − W) + λ(ωχ) δ(W' − W), (24)
and let us determine now what λ must be in order that ⟨Pωχ|⟩ may
represent only outward moving particles. We can do this by transforming
⟨Pωχ|⟩ to the x-representation, or rather the (rθφ)-representation,
and comparing it with (12) for large values of r. The
transformation function is
⟨rθφ|Pωχ⟩ = h^{−3/2} e^{i(p,x)/ℏ} = h^{−3/2} e^{iPr[cosω cosθ + sinω sinθ cos(χ−φ)]/ℏ}.
For the direction θ = 0 we find
⟨r0φ|⟩ = h^{−3/2} ∫∫∫ e^{iPr cosω/ℏ} ⟨Pωχ|⟩ P² sinω dP dω dχ
= h^{−3/2} ∫∫ P² dP dχ . iℏ/Pr {[e^{iPr cosω/ℏ}⟨Pωχ|⟩]_{ω=0}^{ω=π} − ∫₀^π e^{iPr cosω/ℏ} ∂⟨Pωχ|⟩/∂ω dω}.
The second term in the { } brackets is of order r⁻², as may be verified
by further partial integrations with respect to ω, and can therefore
be neglected. We are left with
⟨r0φ|⟩ = ih^{−1/2} r⁻¹ ∫₀^∞ P dP {e^{−iPr/ℏ}⟨Pπχ|⟩ − e^{iPr/ℏ}⟨P0χ|⟩}, (25)
the χ-integration contributing a factor 2π, since at the poles ω = 0
and ω = π the representative does not depend on χ.
When we substitute for ⟨Pωχ|⟩ its value given by (24), the first
term in the integrand in (25) gives
ih^{−1/2} r⁻¹ ∫₀^∞ P dP e^{−iPr/ℏ} {f(Pπχ)/(W' − W) + λ(πχ) δ(W' − W)}. (26)
The term involving δ(W' − W) here may be integrated immediately
and gives, when one uses the relation P dP = W dW/c², which
follows from (16),
ih^{−1/2} c⁻² r⁻¹ ∫_{mc²}^∞ W dW e^{−iPr/ℏ} λ(πχ) δ(W' − W)
= ih^{−1/2} c⁻² r⁻¹ W' λ(πχ) e^{−iP'r/ℏ}. (27)
To integrate the other term in (26) we use the formula
∫₀^∞ g(P) e^{−iPr/ℏ}/(P' − P) dP = g(P') ∫₀^∞ e^{−iPr/ℏ}/(P' − P) dP, (28)
with neglect of terms involving r⁻¹, for any continuous function g(P),
which formula holds since ∫₀^∞ K(P) e^{−iPr/ℏ} dP is of order r⁻¹ for any
continuous function K(P) and since the difference
g(P)/(P' − P) − g(P')/(P' − P)
is continuous. The right-hand side of (28), when evaluated with
neglect of terms involving r⁻¹, and also with neglect of the small
domain P' − ε to P' + ε in the domain of integration, gives
g(P') ∫₀^∞ e^{−iPr/ℏ}/(P' − P) dP = g(P') e^{−iP'r/ℏ} ∫_{−∞}^∞ e^{i(P'−P)r/ℏ}/(P' − P) dP
= ig(P') e^{−iP'r/ℏ} ∫_{−∞}^∞ sin{(P' − P)r/ℏ}/(P' − P) dP = iπ g(P') e^{−iP'r/ℏ}. (29)
In our present example g(P) is
g(P) = ih^{−1/2} r⁻¹ P f(Pπχ)(P' − P)/(W' − W),
which has the limiting value when P = P',
g(P') = ih^{−1/2} r⁻¹ P' f(P'πχ) W'/P'c² = ih^{−1/2} c⁻² r⁻¹ W' f(P'πχ).
Substituting this in (29) and adding on the expression (27), we obtain
the following value for the integral (26),
h^{−1/2} c⁻² r⁻¹ W' {−πf(P'πχ) + iλ(πχ)} e^{−iP'r/ℏ}. (30)
Similarly the second term in the integrand in (25) gives
h^{−1/2} c⁻² r⁻¹ W' {−πf(P'0χ) − iλ(0χ)} e^{iP'r/ℏ}. (31)
The sum of these two expressions is the value of ⟨r0φ|⟩ when r is
large.
We require that ⟨r0φ|⟩ shall represent only outward moving
particles, and hence it must be of the form of a multiple of e^{iP'r/ℏ}.
Thus (30) must vanish, so that
λ(πχ) = −iπf(P'πχ). (32)
We see in this way that the condition that ⟨rθφ|⟩ shall represent
only outward moving particles in the direction θ = 0 fixes the value
of λ for the opposite direction θ = π. Since the direction θ = 0 or
ω = 0 of the pole of our polar coordinates is not in any way singular,
we can generalize (32) to
λ(ωχ) = −iπf(P'ωχ), (33)
which gives the value of λ for an arbitrary direction. This value
substituted in (24) gives a result that may be written
⟨Pωχ|⟩ = f(Pωχ){1/(W' − W) − iπ δ(W' − W)}, (34)
since one can substitute P' for P in the coefficient of a term involving
δ(W' − W) as a factor without changing the value of the term. The
condition that ⟨Pωχ|⟩ shall represent only outward moving particles is
thus that it shall contain the factor
{1/(W' − W) − iπ δ(W' − W)}. (35)
It is interesting to note that this factor is of the form of the right-hand
side of equation (15) of § 16.
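In more recent notation (an aside, not Dirac's wording), the factor (35), with the principal-value convention adopted after (22) for its first term, is the boundary value

```latex
\frac{1}{W'-W} - i\pi\,\delta(W'-W)
  \;=\; \lim_{\epsilon\to+0}\,\frac{1}{W'-W+i\epsilon},
```

so the condition for purely outward moving particles amounts to approaching the singular point P = P' from one definite side of the real axis.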
With λ given by (33), expression (30) vanishes and the value of
⟨r0φ|⟩ for large r is given by expression (31) alone, thus
⟨r0φ|⟩ = −2πh^{−1/2} c⁻² r⁻¹ W' f(P'0χ) e^{iP'r/ℏ}.
This may be generalized to
⟨rθφ|⟩ = −2πh^{−1/2} c⁻² r⁻¹ W' f(P'ωχ) e^{iP'r/ℏ},
giving the value of ⟨rθφ|⟩ for any direction θ, φ in terms of f(P'ωχ)
for the same direction labelled by ω, χ. This is of the form (12) with
u(ωχ) = −2πh^{−1/2} c⁻² W' f(P'ωχ)
and thus represents a distribution of outward moving particles of
momentum P' whose number is
4π²W'P'/hc² . |f(P'ωχ)|² (36)
per unit solid angle per unit time. This distribution is the one
represented by the ⟨Pωχ|⟩ of (34).
From this general result we can infer that, whenever we have a
representative ⟨Pωχ|⟩ representing only outward moving particles
and satisfying an equation of the type (23), the number per unit solid
angle per unit time of these particles is given by (36). If this ⟨Pωχ|⟩
occurs in a problem in which the number of incident particles is one
per unit volume, it will correspond to a scattering coefficient of
4π²W⁰W'P'/hc⁴P⁰ . |f(P'ωχ)|². (37)
It is only the value of the function f(Pωχ) for the point P = P' that
is of importance.
If we now apply this general theory to our equations (21) and
(22), we have
f(Pωχ) = h^{3/2}⟨Pωχα'|V|P⁰ω⁰χ⁰α⁰⟩.
Hence from (37) the scattering coefficient is
4π²h²W⁰W'P'/c⁴P⁰ . |⟨P'ωχα'|V|P⁰ω⁰χ⁰α⁰⟩|². (38)
If one neglects relativity and puts W⁰W'/c⁴ = m², this result reduces
to the result (15) obtained in the preceding section by means of
Green's theorem.
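The non-relativistic reduction can be made explicit (a step added here): expanding (16) for small P,

```latex
W = c\sqrt{m^{2}c^{2}+P^{2}} = mc^{2} + \frac{P^{2}}{2m}
      + O\!\left(\frac{P^{4}}{m^{3}c^{2}}\right),
\qquad
\frac{W^{0}W'}{c^{4}} = m^{2}\left(1 + O\!\left(\frac{P^{2}}{m^{2}c^{2}}\right)\right),
```

so to lowest order W⁰W'/c⁴ = m² and (38) goes over into (15).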
51. Dispersive scattering
We shall now determine the scattering when the incident particle
is capable of being absorbed, that is, when our unperturbed system
of scatterer plus particle has closed stationary states with the particle
absorbed. The existence of these closed states for the unperturbed
system will be found to have a considerable effect on the scattering
for the perturbed system, and indeed an effect that depends very
much on the energy of the incident particle, giving rise to the phenomenon
of dispersion in optics when the incident particle is taken to
be a photon.
We use a representation for which the basic kets correspond to
the stationary states of the unperturbed system, as was the case with
the p-representation of the preceding section. We take these stationary
states to be the states |p'α'⟩ for which the particle has a definite
momentum p' and the scatterer is in a definite state α', together with
the closed states, k say, which form a separate discrete set, and
assume that these states are all independent and orthogonal. This
assumption is not accurate when the particle is an electron or atomic
nucleus, since in this case for an absorbed state k the particle will
still certainly be somewhere, so that one would expect to be able to
expand |k⟩ in terms of the eigenkets |x'α'⟩ of x, y, z, and the α's,
and hence also in terms of the |p'α'⟩'s. On the other hand, when the
particle is a photon it will no longer exist for the absorbed states,
which are then certainly independent of and orthogonal to the states
|p'α'⟩ for which the particle does exist. Thus the assumption is valid
in this case, which is an important practical one.
Since we are concerned with scattering, we must still deal with
stationary states of the whole system. We shall now, however, have
to work to the second order of accuracy, so that we cannot use merely
the first-order equation (3), but must use also (4). Equation (3)
becomes, when written in terms of representatives in our present
representation,
(W' − W)⟨pα'|1⟩ = ⟨pα'|V|0⟩,
(E' − E_k)⟨k|1⟩ = ⟨k|V|0⟩,  (39)
where W' is the function of E' and the α's given by (18) and E_k is the
energy of the stationary state k of the unperturbed system. Similarly,
equation (4) becomes
(W' − W)⟨pα'|2⟩ = ⟨pα'|V|1⟩,
(E' − E_k)⟨k|2⟩ = ⟨k|V|1⟩.  (40)
Expanding the right-hand sides by matrix multiplication, we get
(W' − W)⟨pα'|2⟩
= Σ_{α''} ∫ ⟨pα'|V|p''α''⟩ d³p'' ⟨p''α''|1⟩ + Σ_{k''} ⟨pα'|V|k''⟩⟨k''|1⟩,
(E' − E_k)⟨k|2⟩  (41)
= Σ_{α''} ∫ ⟨k|V|p''α''⟩ d³p'' ⟨p''α''|1⟩ + Σ_{k''} ⟨k|V|k''⟩⟨k''|1⟩.
The ket |0⟩ is still given by (19), so (39) may be written
(W' − W)⟨pα'|1⟩ = h^{3/2}⟨pα'|V|p⁰α⁰⟩, (42)
(E' − E_k)⟨k|1⟩ = h^{3/2}⟨k|V|p⁰α⁰⟩. (43)
We may assume that the matrix elements ⟨k'|V|k''⟩ of V vanish,
since these matrix elements are not essential to the phenomena under
investigation, and if they did not vanish it would mean simply that
the absorbed states k had not been suitably chosen. We shall further
assume that the matrix elements ⟨p'α'|V|p''α''⟩ are of the second order
of smallness when the matrix elements ⟨k'|V|p''α''⟩, ⟨p'α'|V|k''⟩ are
taken to be of the first order of smallness. This assumption will be
justified for the case of photons in § 64. We now have from (43) and
(42) that ⟨k|1⟩ is of the first order of smallness, provided E' does not
lie near one of the discrete set of energy-levels E_k, and ⟨pα'|1⟩ is of
the second order. The value of ⟨pα'|2⟩ to the second order will thus
be given, from the first of equations (41), by
(W' − W)⟨pα'|2⟩ = h^{3/2} Σ_{k''} ⟨pα'|V|k''⟩⟨k''|V|p⁰α⁰⟩/(E' − E_{k''}).
The total correction in the wave function to the second order, namely
⟨pα'|1⟩ plus ⟨pα'|2⟩, therefore satisfies
(W' − W){⟨pα'|1⟩ + ⟨pα'|2⟩}
= h^{3/2}{⟨pα'|V|p⁰α⁰⟩ + Σ_k ⟨pα'|V|k⟩⟨k|V|p⁰α⁰⟩/(E' − E_k)}.
This equation is of the type (23), provided α' is such that W' > mc²,
which means that α' as a final state for the scatterer is not inconsistent
with the law of conservation of energy. We can therefore infer
from the general result (37) that the scattering coefficient is
4π²h²W⁰W'P'/c⁴P⁰ . |⟨p'α'|V|p⁰α⁰⟩ + Σ_k ⟨p'α'|V|k⟩⟨k|V|p⁰α⁰⟩/(E' − E_k)|². (44)
The scattering may now be considered as composed of two parts,
a part that arises from the matrix element ⟨p'α'|V|p⁰α⁰⟩ of the perturbing
energy and a part that arises from the matrix elements
⟨p'α'|V|k⟩ and ⟨k|V|p⁰α⁰⟩. The first part, which is the same as our
previously obtained result (38), may be called the direct scattering.
The second part may be considered as arising from an absorption of
the incident particle into some state k, followed immediately by a
re-emission in a different direction, and is like the transitions through
an intermediate state considered in § 44. The fact that we have to
add the two terms before taking the square of the modulus denotes
interference between the two kinds of scattering. There is no experimental
way of separating the two kinds, the distinction between
them being only mathematical.
52. Resonance scattering
Suppose the energy of the incident particle to be varied continuously
while the initial state α⁰ of the scatterer is kept fixed, so
that the total energy E' or H' varies continuously. The formula (44)
now shows that as E' approaches one of the discrete set of energy-levels
E_k, the scattering becomes very large. In fact, according to
formula (44) the scattering should be infinite when E' is exactly equal
to an E_k. An infinite scattering coefficient is, of course, physically
impossible, so that we can infer that the approximations used in
deriving (44) are no longer legitimate when E' is close to an E_k. To
investigate the scattering in this case we must therefore go back to
the exact equation (E' − E)|H'⟩ = V|H'⟩,
equation (2) of § 43 with E' written for H', and use a different method
of approximating to its solution. This exact equation, written in
terms of representatives like (41), becomes
(W' − W)⟨pα'|H'⟩
= Σ_{α''} ∫ ⟨pα'|V|p''α''⟩ d³p'' ⟨p''α''|H'⟩ + Σ_{k''} ⟨pα'|V|k''⟩⟨k''|H'⟩,
(E' − E_k)⟨k|H'⟩  (45)
= Σ_{α''} ∫ ⟨k|V|p''α''⟩ d³p'' ⟨p''α''|H'⟩ + Σ_{k''} ⟨k|V|k''⟩⟨k''|H'⟩.
Let us take one particular E_k and consider the case when E' is close
to it. The large term in the scattering coefficient (44) now arises from
those elements of the matrix representing V that lie in row k or in
column k, i.e. those of the type ⟨k|V|pα'⟩ or ⟨pα'|V|k⟩. The scattering
arising from the other matrix elements of V is of a smaller order
of magnitude. This suggests that in our exact equations (45) we should
make the approximation of neglecting all the matrix elements of V
except the important ones, which are those of the type ⟨pα'|V|k⟩ or
⟨k|V|pα'⟩, where α' is a state of the scatterer that has not too much
energy to be disallowed as a final state by the law of conservation of
energy. These equations then reduce to
(W' − W)⟨pα'|H'⟩ = ⟨pα'|V|k⟩⟨k|H'⟩,
(E' − E_k)⟨k|H'⟩ = Σ_{α'} ∫ ⟨k|V|pα'⟩ d³p ⟨pα'|H'⟩,  (46)
the α' summation being over those values of α' for which W', given
by (18), is > mc². These equations are now sufficiently simple for us
to be able to solve exactly without further approximation.
From the first of equations (46) we obtain by division
⟨pα'|H'⟩ = ⟨pα'|V|k⟩⟨k|H'⟩/(W' − W) + λ δ(W' − W). (47)
We must choose λ, which may be any function of the momentum
p and α', such that (47) represents the incident particles corresponding
to |0⟩ or h^{3/2}|p⁰α⁰⟩ together with only outward moving particles. [The
representative of h^{3/2}|p⁰α⁰⟩ is actually of the form λ δ(W' − W), since
the conditions α' = α⁰ and p = p⁰ for it not to vanish lead to
W' = E' − H_s(α') = E' − H_s(α⁰) = W⁰ = W.] Thus (47) must be
⟨pα'|H'⟩ = h^{3/2}⟨pα'|p⁰α⁰⟩ +
+ ⟨pα'|V|k⟩⟨k|H'⟩{1/(W' − W) − iπ δ(W' − W)}, (48)
and from the general formula (37) the scattering coefficient will be
4π²W⁰W'P'/hc⁴P⁰ . |⟨p'α'|V|k⟩|² |⟨k|H'⟩|². (49)
It remains for us to determine the value of ⟨k|H'⟩. We can do this
by substituting for ⟨pα'|H'⟩ in the second of equations (46) its value
given by (48). This gives
(E' − E_k)⟨k|H'⟩ = h^{3/2}⟨k|V|p⁰α⁰⟩ +
+ ⟨k|H'⟩ Σ_{α'} ∫ |⟨k|V|pα'⟩|² {1/(W' − W) − iπ δ(W' − W)} d³p
= h^{3/2}⟨k|V|p⁰α⁰⟩ + ⟨k|H'⟩(a − ib),
where a = Σ_{α'} ∫ |⟨k|V|pα'⟩|² d³p/(W' − W) (50)
and b = π Σ_{α'} ∫ |⟨k|V|pα'⟩|² δ(W' − W) d³p
= π Σ_{α'} ∫∫∫ |⟨k|V|Pωχα'⟩|² δ(W' − W) P² dP sinω dω dχ
= π Σ_{α'} P'W'c⁻² ∫∫ |⟨k|V|P'ωχα'⟩|² sinω dω dχ. (51)
Thus ⟨k|H'⟩ = h^{3/2}⟨k|V|p⁰α⁰⟩/(E' − E_k − a + ib). (52)
Note that a and b are real and that b is positive.
This value for ⟨k|H'⟩ substituted in (49) gives for the scattering
coefficient
4π²h²W⁰W'P'/c⁴P⁰ . |⟨p'α'|V|k⟩|² |⟨k|V|p⁰α⁰⟩|²/{(E' − E_k − a)² + b²}. (53)
One can obtain the total effective area that the incident particle
must hit in order to be scattered anywhere by integrating (53) over
all directions of scattering, i.e. by integrating over all directions of
the vector p' with its magnitude kept fixed at P', and then summing
over all α' that are to be taken into consideration, i.e. for which
W' > mc². This gives, with the help of (51), the result
4πh²W⁰/c²P⁰ . b|⟨k|V|p⁰α⁰⟩|²/{(E' − E_k − a)² + b²}. (54)
If we suppose E' to vary continuously through the value E_k, the
main variation of (53) or (54) will be due to the small denominator
(E' − E_k − a)² + b². If we neglect the dependence of the other factors
in (53) and (54) on E', then the maximum scattering will occur when
E' has the value E_k + a and the scattering will be half its maximum
when E' differs from this value by an amount b. The large amount of
scattering that occurs for values of the energy of the incident particle
that make E' nearly equal to E_k gives rise to the phenomenon of an
absorption line. The centre of the line is displaced by an amount
a from the resonance energy of the incident particle, i.e. the energy
which would make the total energy just E_k, while the quantity b is
what is sometimes called the half-width of the line.
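The energy dependence just described is a Lorentzian in E', and the statements about the maximum and the half-width can be checked numerically (a sketch added here; the values of E_k, a, and b are purely illustrative):

```python
import math

# illustrative numbers (assumed): level position, line shift, half-width
Ek, a, b = 5.0, 0.3, 0.05

def line(E):
    """Energy dependence of the resonance coefficients (53), (54): b/((E-Ek-a)^2+b^2)."""
    return b / ((E - Ek - a)**2 + b**2)

peak = line(Ek + a)                                # maximum at E' = Ek + a
assert abs(line(Ek + a + b) - peak / 2) < 1e-12    # half maximum one half-width away
assert abs(line(Ek + a - b) - peak / 2) < 1e-12

# integral of the line over E' equals pi, as used in the next section
n, lo, hi = 400000, Ek - 400.0, Ek + 400.0
h = (hi - lo) / n
area = sum(line(lo + (i + 0.5) * h) for i in range(n)) * h   # midpoint rule
assert abs(area - math.pi) < 1e-3
```

The last assertion anticipates the integral ∫ b/{(E' − E_k − a)² + b²} dE' = π used in § 53 below.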
53. Emission and absorption
For studying emission and absorption we must consider non-stationary
states of the system and must use the perturbation method
of § 44. To determine the coefficient of spontaneous emission we must
take an initial state for which the particle is absorbed, corresponding
to a ket |k⟩, and determine the probability that at some later time
the particle shall be on its way to infinity with a definite momentum.
The method of § 46 can now be applied. From the result (39) of that
section we see that the probability per unit time per unit range of ω'
and χ' of the particle being emitted in any direction ω', χ' with the
scatterer being left in state α' is
2πℏ⁻¹ |⟨W'ω'χ'α'|V|k⟩|², (55)
provided, of course, that α' is such that the energy W', given by (18),
of the particle is greater than mc². For values of α' that do not satisfy
this condition there is no emission possible. The matrix element
⟨W'ω'χ'α'|V|k⟩ here must refer to a representation in which W, ω, χ,
and the α's are diagonal with the weight function unity. The matrix
elements of V appearing in the three preceding sections refer to a representation
in which p_x, p_y, p_z are diagonal with the weight function
unity, or P, ω, χ are diagonal with the weight function P² sinω.
They would thus refer to a representation in which W, ω, χ are
diagonal with the weight function dP/dW . P² sinω = WP/c² . sinω.
Thus the matrix element ⟨W'ω'χ'α'|V|k⟩ in (55) is equal to
(W'P'/c² . sinω')^{1/2} times our previous matrix element ⟨P'ω'χ'α'|V|k⟩
or ⟨p'α'|V|k⟩, so that (55) is equal to
2πℏ⁻¹ W'P'c⁻² sinω' |⟨p'α'|V|k⟩|².
The probability of emission per unit solid angle per unit time, with
the scatterer simultaneously dropping to state α', is thus
2πW'P'/ℏc² . |⟨p'α'|V|k⟩|². (56)
To obtain the total probability per unit time of the particle being
emitted in any direction, with any final state for the scatterer, we
must integrate (56) over all angles ω', χ' and sum over all states α'
whose energy H_s(α') is such that H_s(α') + mc² < E_k. The result is
just 2b/ℏ, where b is defined by (51). There is thus this simple relation
between the total emission coefficient and the half-width b of the
absorption line.
Let us now consider absorption. This requires that we shall take
an initial state for which the particle is certainly not absorbed but is
incident with a definite momentum. Thus the ket corresponding to
the initial state must be of the form (19). We must now determine
the probability of the particle being absorbed after time t. Since our
final state k is not one of a continuous range, we cannot use directly
the result (39) of § 46. If, however, we take
|0⟩ = |p⁰α⁰⟩, (57)
as the ket corresponding to the initial state, the analysis of §§ 44 and 46
is still applicable as far as equation (36) and shows us that the probability
of the particle being absorbed into state k after time t is
2|⟨k|V|p⁰α⁰⟩|² [1 − cos{(E_k − E')t/ℏ}]/(E_k − E')².
This corresponds to a distribution of incident particles of density
h⁻³, owing to the omission of the factor h^{3/2} from (57), as compared
with (19). The probability of there being an absorption after time
t when there is one incident particle crossing unit area per unit time
is therefore
2h³W⁰/c²P⁰ . |⟨k|V|p⁰α⁰⟩|² [1 − cos{(E_k − E')t/ℏ}]/(E_k − E')². (58)
To obtain the absorption coefficient we must consider the incident
particles not all to have exactly the same energy W⁰ = E' − H_s(α⁰),
but to have a distribution of energy values about the correct value
E_k − H_s(α⁰) required for absorption. If we take a beam of incident
particles consisting of one crossing unit area per unit time per unit
energy range, the probability of there being an absorption after time
t will be given by the integral of (58) with respect to E'. This integral
may be evaluated in the same way as (37) of § 46 and is equal to
4π²h²W⁰t/c²P⁰ . |⟨k|V|p⁰α⁰⟩|².
The probability per unit time of an absorption taking place with an
incident beam of one particle per unit area per unit time per unit
energy range is therefore
4π²h²W⁰/c²P⁰ . |⟨k|V|p⁰α⁰⟩|², (59)
which is the absorption coefficient.
The connexion between the absorption and emission coefficients
(59) and (56) and the resonance scattering coefficients calculated in
the preceding section should be noted. When the incident beam does
not consist of particles all with the same energy, but consists of a unit
distribution of particles per unit energy range crossing unit area per
unit time, the total number of incident particles with energies near
an absorption line that get scattered will be given by the integral
of (54) with respect to E'. If one neglects the dependence of the
numerator of (54) on E', this integral will, since
∫_{−∞}^∞ b/{(E' − E_k − a)² + b²} dE' = π,
have just the value (59). Thus the total number of scattered particles
in the neighbourhood of an absorption line is equal to the total number
absorbed. We can therefore regard all these scattered particles as
absorbed particles that are subsequently re-emitted in a different
direction. Further, the number of particles in the neighbourhood of
the absorption line that get scattered per unit solid angle about a
given direction specified by p' and then belong to scatterers in state
α' will be given by the integral with respect to E' of (53), which
integral has in the same way the value
This is just equal to the absorption coefficient (59) multiplied by the
emission coefficient (56) divided by 2b/ℏ, the total emission coefficient.
This is in agreement with the point of view of regarding the resonance
scattered particles as those that are absorbed and then re-emitted,
with the absorption and emission processes governed independently
each by its own probability law, since this point of view would
make the fraction of the total number of absorbed particles that are
re-emitted in a unit solid angle about a given direction just the
emission coefficient for this direction divided by the total emission
coefficient.
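The line-shape integral used above is of Lorentzian type and can be checked numerically. A minimal sketch, assuming the form $\int b\,dE'/\{(E'-E^k)^2+b^2\} = \pi$ with half-width b; the values chosen for b and $E^k$ are arbitrary:

```python
import numpy as np

# Check  ∫ b dE' / ((E' - E^k)^2 + b^2) = π  (limits -∞ to +∞),
# here truncated to a wide finite range and evaluated by the
# trapezoidal rule.  b and Ek are arbitrary illustrative values.
b, Ek = 0.5, 3.0
E = np.linspace(Ek - 1.0e4, Ek + 1.0e4, 2_000_001)
integrand = b / ((E - Ek) ** 2 + b ** 2)
value = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(E))
print(value)   # close to π
```

The result is independent of both b and $E^k$, which is why the total number of scattered particles can match the total number absorbed whatever the width of the line.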
IX
SYSTEMS CONTAINING SEVERAL SIMILAR PARTICLES
54. Symmetrical and antisymmetrical states
If a system in atomic physics contains a number of particles of the
same kind, e.g. a number of electrons, the particles are absolutely
indistinguishable one from another. No observable change is made
when two of them are interchanged. This circumstance gives rise to
some curious phenomena in quantum mechanics having no analogue
in the classical theory, which arise from the fact that in quantum
mechanics a transition may occur resulting in merely the interchange
of two similar particles, which transition then could not be detected
by any observational means. A satisfactory theory ought, of course,
to count two observationally indistinguishable states as the same
state and to deny that any transition does occur when two similar
particles exchange places. We shall find that it is possible to reformulate
the theory so that this is so.
Suppose we have a system containing n similar particles. We may
take as our dynamical variables a set of variables $\xi_1$ describing the
first particle, the corresponding set $\xi_2$ describing the second particle,
and so on up to the set $\xi_n$ describing the nth particle. We shall then
have the $\xi_r$'s commuting with the $\xi_s$'s for $r \neq s$. (We may require
certain extra variables, describing what the system consists of in
addition to the n similar particles, but it is not necessary to mention
these explicitly in the present chapter.) The Hamiltonian describing
the motion of the system will now be expressible as a function of the
$\xi_1, \xi_2,\ldots, \xi_n$. The fact that the particles are similar requires that the
Hamiltonian shall be a symmetrical function of the $\xi_1, \xi_2,\ldots, \xi_n$, i.e. it
shall remain unchanged when the sets of variables $\xi_r$ are interchanged
or permuted in any way. This condition must hold, no matter what
perturbations are applied to the system. In fact, any quantity of
physical significance must be a symmetrical function of the $\xi$'s.
Let $|a_1\rangle$, $|b_1\rangle$,... be kets for the first particle considered as a dynamical
system by itself. There will be corresponding kets $|a_2\rangle$, $|b_2\rangle$,... for
the second particle by itself, and so on. We can get a ket for the
assembly by taking the product of kets for each particle by itself,
for example
$$|a_1\rangle|b_2\rangle|c_3\rangle\ldots|g_n\rangle = |a_1 b_2 c_3 \ldots g_n\rangle, \qquad (1)$$
say, according to the notation of (65) of §20. The ket (1) corresponds
to a special kind of state for the assembly, which may be described
by saying that each particle is in its own state, corresponding to its
own factor on the left-hand side of (1). The general ket for the
assembly is of the form of a sum or integral of kets like (1), and
corresponds to a state for the assembly for which one cannot say that
each particle is in its own state, but only that each particle is partly
in several states, in a way which is correlated with the other particles
being partly in several states. If the kets $|a_1\rangle$, $|b_1\rangle$,... are a set of
basic kets for the first particle by itself, the kets $|a_2\rangle$, $|b_2\rangle$,... will be
a set of basic kets for the second particle by itself, and so on, and the
kets (1) will be a set of basic kets for the assembly. We call the representation
provided by such basic kets for the assembly a symmetrical
representation, as it treats all the particles on the same footing.
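Numerically, a product ket like (1) is a tensor (Kronecker) product of single-particle vectors. A minimal two-particle sketch, with arbitrarily chosen basis vectors in a hypothetical 3-dimensional single-particle space:

```python
import numpy as np

# Two-particle product ket |a_1>|b_2> modelled as a Kronecker product.
a = np.array([1.0, 0.0, 0.0])   # |a> in a 3-dimensional space
b = np.array([0.0, 1.0, 0.0])   # |b>
ket_ab = np.kron(a, b)           # |a_1 b_2>: a 9-component vector
ket_ba = np.kron(b, a)           # |b_1 a_2>: the interchanged ket
print(ket_ab.shape)              # (9,)
```

The two kets `ket_ab` and `ket_ba` are distinct vectors in the assembly space, which is exactly what makes the interchange a non-trivial linear operator below.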
In (1) we may interchange the kets for the first two particles and
get another ket for the assembly, namely
$$|b_1\rangle|a_2\rangle|c_3\rangle\ldots|g_n\rangle = |b_1 a_2 c_3 \ldots g_n\rangle.$$
More generally, we may interchange the role of the first two particles
in any ket for the assembly and get another ket for the assembly.
The process of interchanging the first two particles is an operator
which can be applied to kets for the assembly, and is evidently a
linear operator, of the type dealt with in §7. Similarly, the process
of interchanging any pair of particles is a linear operator, and by
repeated applications of such interchanges we get any permutation
of the particles appearing as a linear operator which can be applied
to kets for the assembly. A permutation is called an even permutation
or an odd permutation according to whether it can be built up from
an even or an odd number of interchanges.
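The parity of a permutation follows directly from its cycle structure, since a cycle of length L can be built from L−1 interchanges. A small sketch (the function name is an illustrative choice):

```python
def parity(perm):
    """Return +1 for an even permutation, -1 for an odd one.

    perm maps position i to perm[i]; it is a permutation of 0..n-1.
    A cycle of length L decomposes into L-1 interchanges, so the
    permutation is even iff the total number of interchanges is even.
    """
    n = len(perm)
    seen = [False] * n
    swaps = 0
    for i in range(n):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            swaps += length - 1
    return 1 if swaps % 2 == 0 else -1

print(parity([1, 0, 2]))   # one interchange: odd, gives -1
print(parity([1, 2, 0]))   # a 3-cycle = two interchanges: even, gives +1
```

This parity is what fixes the + or − sign attached to each term of the antisymmetrical sums appearing below.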
A ket for the assembly $|X\rangle$ is called symmetrical if it is unchanged
by any permutation, i.e. if
$$P|X\rangle = |X\rangle \qquad (2)$$
for any permutation P. It is called antisymmetrical if it is unchanged
by any even permutation and has its sign changed by any odd
permutation, i.e. if
$$P|X\rangle = \pm|X\rangle, \qquad (3)$$
the + or − sign being taken according to whether P is even or odd.
The state corresponding to a symmetrical ket is called a symmetrical
state, and the state corresponding to an antisymmetrical ket is called
an antisymmetrical state. In a symmetrical representation, the representative
of a symmetrical ket is a symmetrical function of the
variables referring to the various particles and the representative of
an antisymmetrical ket is an antisymmetrical function.
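Conditions (2) and (3) can be checked concretely for two particles, where the only non-trivial permutation is the interchange. A minimal sketch, assuming 2-dimensional single-particle spaces:

```python
import numpy as np

d = 2                                  # single-particle dimension (assumed)
# Interchange operator P on the two-particle space: P (x ⊗ y) = y ⊗ x.
P = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        P[j * d + i, i * d + j] = 1.0

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
sym  = np.kron(a, b) + np.kron(b, a)   # symmetrical ket:     P|X> = |X>
anti = np.kron(a, b) - np.kron(b, a)   # antisymmetrical ket: P|X> = -|X>
print(np.allclose(P @ sym, sym), np.allclose(P @ anti, -anti))
```

For n > 2 the same construction applies with one such operator for each pair of particles, and (2) or (3) must hold for all of them simultaneously.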
In the Schrödinger picture, the ket corresponding to a state of the
assembly will vary with time according to Schrödinger's equation of
motion. If it is initially symmetrical it must always remain symmetrical,
since, owing to the Hamiltonian being symmetrical, there
is nothing to disturb the symmetry. Similarly if the ket is initially
antisymmetrical it must always remain antisymmetrical. Thus a
state which is initially symmetrical always remains symmetrical and
a state which is initially antisymmetrical always remains antisymmetrical.
In consequence, it may be that for a particular kind of
particle only symmetrical states occur in nature, or only antisymmetrical
states occur in nature. If either of these possibilities
held, it would lead to certain special phenomena for the particles in
question.
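The conservation of symmetry rests on the symmetrical Hamiltonian commuting with every permutation. A sketch for two particles, using a hypothetical symmetrical Hamiltonian built from Pauli matrices (not one from the text):

```python
import numpy as np

sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])
I  = np.eye(2)

# A hypothetical symmetrical two-particle Hamiltonian: the same form
# for both particles, so interchanging them leaves H unchanged.
H = 0.5 * (np.kron(sz, I) + np.kron(I, sz)) + np.kron(sx, sx)

# The interchange operator for two 2-dimensional particles.
P = np.array([[1, 0, 0, 0],
              [0, 0, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1]], dtype=float)

print(np.allclose(H @ P - P @ H, 0))   # True: HP = PH
```

Since HP = PH, a ket satisfying P|X⟩ = ±|X⟩ retains that property under the equation of motion, which is the statement italicized above.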
Let us suppose first that only antisymmetrical states occur in
nature. The ket (1) is not antisymmetrical and so does not correspond
to a state occurring in nature. From (1) we can in general form
an antisymmetrical ket by applying all possible permutations to it
and adding the results, with the coefficient −1 inserted before those
terms arising from an odd permutation, so as to get
$$\sum_P \pm P\,|a_1 b_2 c_3 \ldots g_n\rangle, \qquad (4)$$
the + or − sign being taken according to whether P is even or odd.
The ket (4) may be written as a determinant
$$\begin{vmatrix} |a_1\rangle & |a_2\rangle & \ldots & |a_n\rangle \\ |b_1\rangle & |b_2\rangle & \ldots & |b_n\rangle \\ \ldots & \ldots & \ldots & \ldots \\ |g_1\rangle & |g_2\rangle & \ldots & |g_n\rangle \end{vmatrix} \qquad (5)$$
and its representative in a symmetrical representation is a determinant.
The ket (4) or (5) is not the general antisymmetrical ket, but
is a specially simple one. It corresponds to a state for the assembly
for which one can say that certain particle-states, namely the states
a, b, c,..., g, are occupied, but one cannot say which particle is in
which state, each particle being equally likely to be in any state. If
two of the particle-states a, b, c,..., g are the same, the ket (4) or (5)
vanishes and does not correspond to any state for the assembly.
Thus two particles cannot occupy the same state. More generally, the
occupied states must be all independent, otherwise (4) or (5) vanishes.
This is an important characteristic of particles for which only antisymmetrical
states occur in nature. It leads to a special statistics,
which was first studied by Fermi, so we shall call particles for which
only antisymmetrical states occur in nature fermions.
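The vanishing of the determinantal ket when two particle-states coincide can be verified directly: the representative of (5) is a determinant whose rows are the single-particle amplitudes, and equal rows give zero. A sketch with arbitrary amplitudes in a hypothetical 3-dimensional single-particle space:

```python
import numpy as np

def antisym_ket(states):
    """Antisymmetrized n-particle representative, built as a
    determinant (as in (5)) over the n-fold product basis."""
    n = len(states)
    dim = len(states[0])
    out = np.zeros((dim,) * n)
    for idx in np.ndindex(*(dim,) * n):
        # determinant of the matrix  M[r][s] = states[r][idx[s]]
        M = np.array([[states[r][idx[s]] for s in range(n)]
                      for r in range(n)])
        out[idx] = np.linalg.det(M)
    return out

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
print(np.allclose(antisym_ket([a, b]), 0))   # False: distinct states survive
print(np.allclose(antisym_ket([a, a]), 0))   # True: a repeated state vanishes
```

The second call places both particles in state a; two rows of the determinant coincide and the whole ket vanishes, which is the exclusion behaviour of fermions.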
Let us suppose now that only symmetrical states occur in nature.
The ket (1) is not symmetrical, except in the special case when all the
particle-states a, b, c,..., g are the same, but we can always obtain a
symmetrical ket from it by applying all possible permutations to it
and adding the results, so as to get
$$\sum_P P\,|a_1 b_2 c_3 \ldots g_n\rangle. \qquad (6)$$
The ket (6) is not the general symmetrical ket, but is a specially
simple one. It corresponds to a state for the assembly for which one
can say that certain particle-states are occupied, namely the states
a, b, c,..., g, without being able to say which particle is in which state.
It is now possible for two or more of the states a, b, c,..., g to be the
same, so that two or more particles can be in the same state. In spite
of this, the statistics of the particles is not the same as the usual
statistics of the classical theory. The new statistics was first studied
by Bose, so we shall call particles for which only symmetrical states
occur in nature bosons.
We can see the difference of Bose statistics from the usual statistics
by considering a special case, that of only two particles and only two
independent states a and b for a particle. According to classical
mechanics, if the assembly of two particles is in thermodynamic
equilibrium at a high temperature, each particle will be equally likely
to be in either state. There is thus a probability ¼ of both particles
being in state a, a probability ¼ of both particles being in state b,
and a probability ½ of one particle being in each state. In the quantum
theory there are three independent symmetrical states for the
pair of particles, corresponding to the symmetrical kets $|a_1\rangle|a_2\rangle$,
$|b_1\rangle|b_2\rangle$, and $|a_1\rangle|b_2\rangle + |b_1\rangle|a_2\rangle$, and describable as both particles in
state a, both particles in state b, and one particle in each state
respectively. For thermodynamic equilibrium at a high temperature
these three states are equally probable, as was shown in §33, so that
there is a probability ⅓ of both particles being in state a, a probability
⅓ of both particles being in state b, and a probability ⅓ of one particle
being in each state. Thus with Bose statistics the probability of two
particles being in the same state is greater than with classical statistics.
Bose statistics differ from classical statistics in the opposite direction
to Fermi statistics, for which the probability of two particles being
in the same state is zero.
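The counting behind this comparison can be reproduced by enumeration: classically the four arrangements aa, ab, ba, bb are equally likely, while with Bose statistics arrangements differing only by a permutation are one and the same state. A sketch:

```python
from fractions import Fraction
from itertools import product

# Classical: each of the 2 particles independently in state 'a' or 'b'.
classical = list(product("ab", repeat=2))        # 4 equally likely arrangements
p_both_a_classical = Fraction(classical.count(("a", "a")), len(classical))

# Bose: arrangements related by a permutation are the SAME state,
# so the distinct states are {aa, bb, ab}, all equally probable.
bose = sorted(set(tuple(sorted(arr)) for arr in classical))
p_both_a_bose = Fraction(sum(1 for s in bose if s == ("a", "a")), len(bose))

print(p_both_a_classical, p_both_a_bose)   # 1/4 versus 1/3
```

The enumeration gives ¼ classically and ⅓ with Bose statistics for both particles in state a, matching the probabilities quoted in the text.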
In building up a theory of atoms on the lines mentioned at the
beginning of §38, to get agreement with experiment one must assume
that two electrons are never in the same state. This rule is known as
Pauli's exclusion principle. It shows us that electrons are fermions.
Planck's law of radiation shows us that photons are bosons, as only the
Bose statistics for photons will lead to Planck's law. Similarly, for
each of the other kinds of particle known in physics, there is experimental
evidence to show either that they are fermions or that they
are bosons. Protons, neutrons, positrons are fermions, α-particles are
bosons. It appears that all particles occurring in nature are either
fermions or bosons, and thus only antisymmetrical or symmetrical
states for an assembly of similar particles are met with in practice.
Other more complicated kinds of symmetry are possible mathematically,
but do not apply to any known particles. With a theory which
allows only antisymmetrical or only symmetrical states for a particular
kind of particle, one cannot make a distinction between two states
which differ only through a permutation of the particles, so that the
transitions mentioned at the beginning of this section disappear.
55. Permutations as dynamical variables
We shall now build up a general theory for a system containing n
similar particles when states with any kind of symmetry properties
are allowed, i.e. when there is no restriction to only symmetrical or
only antisymmetrical states. The general state now will not be symmetrical
or antisymmetrical, nor will it be expressible linearly in
terms of symmetrical and antisymmetrical states when n > 2. This
theory will not apply directly to any particles occurring in nature,
but all the same it is useful for setting up an approximate treatment
for an assembly of electrons, as will be shown in §58.
We have seen that each permutation P of the n particles is a linear
operator which can be applied to any ket for the assembly. Hence
we can regard P as a dynamical variable in our system of n particles.
There are n! permutations, each of which can be regarded as a
dynamical variable. One of them, $P_1$ say, is the identical permutation,
which is equal to unity. The product of any two permutations is a
third permutation and hence any function of the permutations is
reducible to a linear function of them. Any permutation P has a
reciprocal $P^{-1}$ satisfying
$$PP^{-1} = P^{-1}P = P_1 = 1.$$
A permutation P can be applied to a bra $\langle X|$ for the assembly,
to give another bra, which we shall denote for the present by $P\langle X|$.
If P is applied to both factors of the product $\langle X|Y\rangle$, the product
must be unchanged, since it is just a number, independent of any
order of the particles. Thus
$$\{P\langle X|\}\,P|Y\rangle = \langle X|Y\rangle,$$
showing that
$$P\langle X| = \langle X|P^{-1}. \qquad (7)$$
Now $P\langle X|$ is the conjugate imaginary of $P|X\rangle$ and is thus equal to
$\langle X|\bar{P}$, and hence from (7)
$$\bar{P} = P^{-1}. \qquad (8)$$
Thus a permutation is not in general a real dynamical variable, its
conjugate complex being equal to its reciprocal.
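Relation (8) can be seen concretely: in a symmetrical representation a permutation acts as a real permutation matrix, whose conjugate transpose equals its reciprocal. A sketch for a 3-cycle:

```python
import numpy as np

# Permutation matrix for a 3-cycle: basis vector e_i -> e_perm[i].
perm = [1, 2, 0]
P = np.zeros((3, 3))
for i, j in enumerate(perm):
    P[j, i] = 1.0

# Conjugate (here just transpose, P being real) equals reciprocal,
# i.e. P-bar = P^{-1} as in (8).
print(np.allclose(P.T @ P, np.eye(3)))       # True
print(np.allclose(P.T, np.linalg.inv(P)))    # True
```

The matrix P is real but not symmetric, so it equals its reciprocal only for interchanges; this is the sense in which a general permutation fails to be a real dynamical variable.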
Any permutation of the numbers 1, 2, 3,..., n may be expressed in
the cyclic notation, e.g. with n = 8
$$P_a = (143)(27)(58)(6), \qquad (9)$$
in which each number is to be replaced by the succeeding number in
a bracket, unless it is the last in a bracket, when it is to be replaced
by the first in that bracket. Thus $P_a$ changes the numbers 12345678
into 47138625. The type of any permutation is specified by the
partition of the number n which is provided by the number of numbers
in each of the brackets. Thus the type of $P_a$ is specified by the
partition 8 = 3+2+2+1. Permutations of the same type, i.e. corresponding
to the same partition, we shall call similar. Thus, for
example, $P_a$ in (9) is similar to
$$P_b = (871)(35)(46)(2). \qquad (10)$$
The whole of the n! possible permutations may be divided into sets
of similar permutations, each such set being called a class. The permutation
$P_1 = 1$ forms a class by itself. Any permutation is similar
to its reciprocal.
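The cyclic notation and the partition specifying a permutation's type are easy to mechanize. A sketch using 1-based numbers, with $P_a$ of (9) as the example (the function names are illustrative choices):

```python
def cycles(perm):
    """Decompose a 1-based permutation (a dict i -> perm[i])
    into its cycles, as in the cyclic notation."""
    seen, out = set(), []
    for start in sorted(perm):
        if start not in seen:
            cyc, j = [], start
            while j not in seen:
                seen.add(j)
                cyc.append(j)
                j = perm[j]
            out.append(tuple(cyc))
    return out

def cycle_type(perm):
    """Partition of n giving the permutation's type (its class)."""
    return sorted((len(c) for c in cycles(perm)), reverse=True)

# P_a sends 12345678 -> 47138625, i.e. 1 -> 4, 2 -> 7, 3 -> 1, ...
Pa = dict(zip(range(1, 9), [4, 7, 1, 3, 8, 6, 2, 5]))
print(cycles(Pa))        # [(1, 4, 3), (2, 7), (5, 8), (6,)]
print(cycle_type(Pa))    # [3, 2, 2, 1]  -- the partition 8 = 3+2+2+1
```

The recovered cycles (143)(27)(58)(6) and the partition 3+2+2+1 agree with (9) and the type stated in the text.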
When two permutations $P_a$ and $P_b$ are similar, either of them $P_b$
may be obtained by making a certain permutation $P_x$ in the other
$P_a$. Thus, in our example (9), (10) we can take $P_x$ to be the permutation
that changes 14327586 into 87135462, i.e. the permutation
$$P_x = (18623)(475).$$
Different ways of writing $P_a$ and $P_b$ in the cyclic notation would lead
to different $P_x$'s. Any of these $P_x$'s applied to the product $P_a|X\rangle$
would change it into $P_b P_x|X\rangle$, i.e.
$$P_x P_a|X\rangle = P_b P_x|X\rangle.$$
Hence
$$P_b = P_x P_a P_x^{-1}, \qquad (11)$$
which expresses the condition for $P_a$ and $P_b$ to be similar as an
algebraic equation. The existence of any $P_x$ satisfying (11) is sufficient
to show that $P_a$ and $P_b$ are similar.
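Relation (11) can be verified with the concrete permutations of (9) and (10): conjugating $P_a$ by $P_x$ reproduces $P_b$. A sketch with 1-based permutations as dicts:

```python
def compose(p, q):
    """(p after q): i -> p(q(i)), for 1-based permutation dicts."""
    return {i: p[q[i]] for i in q}

def inverse(p):
    return {v: k for k, v in p.items()}

# P_a = (143)(27)(58)(6):  sends 12345678 -> 47138625
Pa = dict(zip(range(1, 9), [4, 7, 1, 3, 8, 6, 2, 5]))
# P_b = (871)(35)(46)(2)
Pb = {8: 7, 7: 1, 1: 8, 3: 5, 5: 3, 4: 6, 6: 4, 2: 2}
# P_x = (18623)(475)
Px = {1: 8, 8: 6, 6: 2, 2: 3, 3: 1, 4: 7, 7: 5, 5: 4}

# Check (11):  P_b = P_x P_a P_x^{-1}
print(compose(Px, compose(Pa, inverse(Px))) == Pb)   # True
```

Since $P_x P_a P_x^{-1}$ and $P_b$ agree on every number 1 to 8, the two permutations of (9) and (10) are similar, as claimed.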
56. Permutations as constants of the motion
Any