\errorcontextlines=10
\documentclass[11pt]{article}
\usepackage[dvips]{color}
\usepackage{graphicx}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{amsbsy}
%\usepackage[dvips]{epsfig}
\usepackage[english]{babel}
%\RequirePackage[hyperindex,colorlinks,backref,dvips]{hyperref}
\RequirePackage {hyperref}
\tolerance=7000
\begin{document}
M.~Born, Z. Phys. {\bf 34,} 858 \hfill {\large \bf 1925}\\
\vspace{2cm}
\begin{center}
{\large \bf On Quantum Mechanics}
\end{center}
\begin{center}
{\Large M.~Born}\\
Received 1925\\
\end{center}
\centerline{--- ---~~~$\diamond~\diamondsuit~\diamond$~~~--- ---}
\noindent
Translation into English: {\it Sources of Quantum Mechanics,}\\
\mbox{Ed. by B. L. van der Waerden,} North Holland, Amsterdam (1967) 277.\\
\centerline{--- ---~~~$\diamond~\diamondsuit~\diamond$~~~--- ---}
\vspace{2cm}
{\small
The recently published theoretical approach of Heisenberg is here developed into
a systematic theory of quantum mechanics (in the first place for systems having
one degree of freedom) with the aid of mathematical matrix methods. After a brief
survey of the latter, the mechanical equations of
motion are derived from a variational principle and it is shown that using
Heisenberg's quantum condition, the principle of energy conservation and Bohr's
frequency condition follow from the mechanical equations. Using the
anharmonic
oscillator as example, the question of uniqueness of
the solution and of the significance of the phases of the partial vibrations is raised. The paper concludes with an attempt to incorporate electromagnetic field
laws into the new theory.}
\section*{\bf
Introduction}
The theoretical approach of Heisenberg \footnote{W.~Heisenberg, Zs. f.
Phys. {\bf 33} (1925) 879.} recently published in this Journal,
which aimed at setting up a new kinematical and mechanical formalism in
conformity with the basic requirements of quantum theory, appears to us of
considerable potential significance. It represents an attempt to render justice to
the new facts by setting up a new and really suitable conceptual system instead of adapting the customary conceptions in a more or less artificial and forced manner. The physical reasoning which led Heisenberg to this development has
been so clearly described by him that any supplementary remarks appear
superfluous. But, as he himself indicates, in its formal, mathematical aspects his approach is but in its initial stages. His hypotheses have been applied only
to simple examples without being fully carried through to a generalized theory.
Having been in an advantageous position to familiarize ourselves with his
ideas throughout their formative stages, we now strive (since his investigations
have been concluded) to clarify the mathematically formal content of his approach and
present some of our results here. These indicate that it is in fact possible,
starting with the basic premises given by Heisenberg, to build up a closed
mathematical theory of quantum mechanics which displays strikingly close
analogies with classical mechanics, but at the same time preserves the
characteristic features of quantum phenomena.
In this we at first confine ourselves, like Heisenberg, to systems having {\it
one degree of freedom} and assume these to be -- from a classical
\mbox{standpoint -- {\it periodic.}} We shall in the continuation of this publication
concern ourselves with the generalization of the mathematical theory to systems
having an
arbitrary number of degrees of freedom, as also to aperiodic motion. A
noteworthy generalization of Heisenberg's approach lies in our confining
ourselves neither to treatment of nonrelativistic mechanics nor to calculations
involving Cartesian systems of coordinates. The only restriction which we
impose upon the choice of coordinates is to base our considerations upon
{\it libration coordinates,} which in classical theory are {\it periodic}
functions of time.
Admittedly, in some instances it might be more reasonable to employ other
coordinates: for example, in the case of a rotating body to introduce the angle
of rotation $\varphi$, which becomes a linear function of time. Heisenberg also
proceeded thus in his treatment of the rotator;
however, it remains undecided whether the approach applied there can be
justified from the standpoint of a consistent quantum mechanics.
The mathematical basis of Heisenberg's treatment is the {\it law of multiplication }
of quantum--theoretical quantities, which he derived from an ingenious
consideration of correspondence arguments. The development of his formalism,
which we give here, is based upon the fact that this rule of multiplication is
none other than the well--known mathematical rule of {\it matrix multiplication.}
The infinite square array (with discrete or continuous indices) which appears at
the start of the next section, termed a {\it matrix,} is a representation of a
physical quantity which is given in classical theory as a function of time.
The mathematical method of treatment inherent in the new quantum mechanics is thereby
characterized through the employment of {\it matrix analysis} in place of the usual
number analysis.
Using this method, we have attempted to tackle some of the simplest
problems in mechanics and electrodynamics. A {\it variational
principle,} derived from correspondence considerations, yields {\it equations of
motion} for the most general Hamilton function which are in closest analogy with
the classical canonical equations. The quantum condition conjoined with one of
the relations which proceed from the equations of motion permits a simple
matrix notation. With the aid of this, one can prove the general validity of the
{\it law of conservation of energy} and the {\it Bohr frequency relation} in the sense
conjectured by Heisenberg: this proof could not be carried through in its entirety
by him even for the simple examples which he considered. We shall later return
in more detail to one of these examples in order to derive a basis for
consideration of the part played by the phases of the partial vibrations in the new
theory. We show finally that the basic laws of the electromagnetic field in a
vacuum can readily be incorporated and we furnish substantiation for the
assumption made by Heisenberg that the squares of the absolute values of the
elements in a matrix representing the electrical moment of an atom provide a
measure for the transition probabilities.\\\\
\section*{
Chapter 1. Matrix Calculation}
\subsection*{
1. Elementary operations. Functions}
~~~~We consider square infinite matrices, \footnote{Further details of matrix algebra
can be found, e.g., in M. B\^ocher, Einf\"uhrung in die h\"ohere
Algebra (translated from the English by Hans Beck; Teubner, Leipzig, 1910) \S~22--25;
also in R. Courant and D. Hilbert, Methoden der mathematischen Physik {\bf
1} (Springer, Berlin, 1924) Chapter I.} which we shall denote by heavy type to
distinguish them from ordinary quantities which will throughout be in light type,
$$
{\bf a} = (a(nm)) = \left(
\begin{array}{cccc}
a(00)&a(01)&a(02)&\cdots\\
a(10)&a(11)&a(12)&\cdots\\
a(20)&a(21)&a(22)&\cdots\\
\cdots&\cdots&\cdots&\cdots
\end{array} \right).
$$
Equality of two matrices is defined as equality of corresponding components:
\begin{equation}
{\bf a} = {\bf b} \qquad \mbox{means} \qquad a(nm) = b(nm).
\end{equation}
Matrix addition is defined as addition of corresponding components:
\begin{equation}
{\bf a} = {\bf b} + {\bf c} \qquad \mbox{means} \qquad a(nm) = b(nm) + c(nm).
\end{equation}
Matrix multiplication is defined by the rule ``rows times columns'', familiar from
the theory of determinants:
\begin{equation}
{\bf a} = {\bf bc} \qquad \mbox{means} \qquad a(nm) = \sum \limits^{\infty}_{k = 0}~
b(nk) c(km).
\end{equation}
Powers are defined by repeated multiplication. The associative rule applies to
multiplication and the distributive rule to combined addition and multiplication:
\begin{equation}
{\bf (ab)c = a(bc);}
\end{equation}
\begin{equation}
{\bf a(b + c) = a b + a c.}
\end{equation}
However, the commutative rule does {\it not} hold for multiplication: it is not in
general correct to set ${\bf a b = b a}$. If ${\bf a}$ and ${\bf b}$ do satisfy this relation, they are said to commute.
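As a check in modern notation (an illustration of ours, not part of the original text), the multiplication rule (3) and the failure of the commutative law can be rehearsed on small finite matrices, here stored as lists of rows:

```python
# Matrix product a = bc by "rows times columns", rule (3),
# for finite square matrices represented as lists of rows.
def matmul(b, c):
    n = len(b)
    return [[sum(b[i][k] * c[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

b = [[0, 1], [0, 0]]
c = [[0, 0], [1, 0]]

# bc and cb differ: matrix multiplication is not commutative.
print(matmul(b, c))  # [[1, 0], [0, 0]]
print(matmul(c, b))  # [[0, 0], [0, 1]]
```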
The {\it unit matrix} defined by
\begin{equation}
{\bf 1} = (\delta_{nm}), \qquad \left\{
\begin{array}{l}
\delta_{nm} = 0 \quad \mbox{for} \quad n \ne m,\\
\delta_{nm} = 1 \quad \mbox{for} \quad n = m\\
\end{array} \right.
\end{equation}
has the property
$$
{\bf a1 = 1 a = a.} \eqno(6a)
$$
The {\it reciprocal matrix} to ${\bf a}$, namely ${\bf a}^{- 1}$, is defined by\footnote{As
is known, ${\bf a}^{-1}$ is uniquely defined by (7) for {\it finite} square
matrices when the determinant $A$ of the matrix ${\bf a}$ is non--zero.
If $A = 0$ there is no matrix reciprocal to ${\bf a}$.}
\begin{equation}
{\bf a}^{-1} {\bf a} = {\bf aa}^{-1} = {\bf 1}.
\end{equation}
As {\it mean value} of a matrix ${\bf a}$ we denote that matrix
whose diagonal elements are the same as those of ${\bf a}$ whereas all
other elements vanish:
\begin{equation}
\bar{\bf a} = (\delta_{nm} a(nm)).
\end{equation}
The sum of these diagonal elements will be termed the
{\it diagonal sum} of the
matrix ${\bf a}$ and written as ${\bf D(a)}$, viz.
\begin{equation}
{\bf D(a)} = \sum \limits_n~ a(nn).
\end{equation}
From (3) it is easy to prove that if the diagonal sum of a product
${\bf y = x_1 x_2 \cdots x_m}$ be finite, then it is unchanged by cyclic rearrangement of the factors:
\begin{equation}
{\bf D(x_1 x_2 \cdots x_m) = D(x_r x_{r + 1} \cdots x_m x_1 x_2 \cdots x_{r
- 1}}).
\end{equation}
Clearly, it suffices to establish the validity of this rule for {\it two} factors.
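For two factors the statement may be read off directly from (3) and (9), interchanging the order of the (absolutely convergent) double summation:
$$
{\bf D(xy)} = \sum \limits_n \sum \limits_k x(nk) y(kn) = \sum \limits_k
\sum \limits_n y(kn) x(nk) = {\bf D(yx)}.
$$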
If the elements of the matrices ${\bf a}$ and ${\bf b}$ are functions of a parameter $t$, then
$$
\frac{\rm d}{{\rm d}t}~ \sum \limits_k~ a(nk) b(km) = \sum \limits_k~ \{
\dot a(nk) b(km) + a(nk) \dot b(km)\},
$$
or from the definition (3):
\begin{equation}
\frac{\rm d}{{\rm d}t}~ {\bf(ab)} = {\bf \dot a b + a \dot b.}
\end{equation}
Repeated application of (11) yields
$$
\frac{\rm d}{{\rm d}t}~ ({\bf x_1 x_2 \cdots x_n}) = {\bf \dot x_1 x_2} \cdots {\bf
x_n}
+ {\bf x_1 \dot x_2} \cdots {\bf x_n} + \cdots + {\bf x_1 x_2 \cdots \dot x_n.} \eqno(11')
$$
From the definitions (2) and (3) we can define {\it functions} of matrices. To begin with,
we consider as the most general function of this type, ${\bf f(x_1, x_2 \cdots
x_m)}$, one which can formally be represented as a sum of a finite or infinite number
of products of powers of the arguments ${\bf x_k}$, weighted by numerical coefficients.
Through the equations
\begin{equation}
\begin{array}{l}
{\bf f_1 (y_1, \cdots y_n;~ x_1, \cdots x_n}) = 0,\\
\hdotsfor 1\\
{\bf f_n(y_1, \cdots y_n;~ x_1, \cdots x_n}) = 0
\end{array}
\end{equation}
we can then also define functions ${\bf y_l(x_1,} \dots {\bf x_n})$; namely, in order to obtain
functions ${\bf y_l}$ having the above form and satisfying
equation (12), the
${\bf y_l}$ need only be set in the form of a series in increasing
power products of the
${\bf x_k}$ and the coefficients determined through
substitution in (12). It can be seen that one will always
derive as many equations
as there are unknowns. Naturally, the number of equations and
unknowns exceeds
that which would ensue from applying the method of undetermined coefficients in the
normal type of analysis incorporating {\it commutative} multiplication. In each of
the equations (12), upon substituting the series for the ${\bf y_l}$
and gathering
together like terms, one obtains not only a sum term $C'{\bf x_1 x_2}$
but also a term $C'' {\bf x_2 x_1}$ and thereby has to bring both
$C'$ and $C''$
to vanish (and not merely $C' + C''$). This is, however, made possible by the fact
that in the expansion of each of the ${\bf y_l}$, two terms
${\bf x_1 x_2}$
and ${\bf x_2 x_1}$ appear, with two available coefficients.
\subsection*{
2. Symbolic differentiation}
~~~~At this stage we have to examine in detail the process of {\it differentiation} of a
matrix function, which will later be employed frequently in calculation. One
should at the outset note that only in a few respects does this process display
similarity to that of differentiation in ordinary analysis. For example, the rules
for differentiation of a product or of a function of a function here no longer
apply in general. Only if all the matrices which occur commute with one
another can one apply all the rules of normal analysis to this differentiation.
Suppose
\begin{equation}
{\bf y} = \prod \limits^s_{m = 1}~ {\bf x}_{l_m} = {\bf x_{l_1} x_{l_2}
\dots x_{l_s}}.
\end{equation}
We define
\begin{equation}
\frac{\partial {\bf y}}{\partial{\bf x_k}} = \sum \limits^s_{r = 1}~ \delta_{l_rk}
\prod \limits^s_{m = r + 1}~ {\bf x_{l_m}} \prod \limits^{r - 1}_{m
= 1}~ {\bf x_{l_m}}, \qquad \left\{
\begin{array}{l}
\delta_{jk} = 0 \quad \mbox{for} \quad j \ne k,\\
\delta_{kk} = 1.
\end{array} \right.
\end{equation}
This rule may be expressed as follows: In the given product, one regards all
factors as written out individually (i.e., not as ${\bf x^3_1 x^2_2}$,
but as ${\bf x_1 x_1 x_1 x_2 x_2}$); one
then picks out any one factor ${\bf x_k}$ and builds the product of all the factors which
follow it, and thereafter of all those which precede it (in this sequence). The sum of all such
expressions is the differential coefficient of the product with respect to this ${\bf x_k}$.
The procedure may be illustrated by some examples:
$$
\begin{array}{ll}
{\bf y = x^n},& \frac{\displaystyle {\rm d}{\bf y}}{\displaystyle {\rm d}{\bf x}}
= n {\bf x}^{n -1}\\\\
{\bf y} = {\bf x}^n_1 {\bf x}_2^m,& \frac{\displaystyle \partial{\bf y}}{\displaystyle
\partial{\bf x_1}}
= {\bf x}_1^{n - 1} {\bf x}_2^m + {\bf x}_1^{n - 2} {\bf x}_2^m {\bf x}_1 + \dots + {\bf x}_2^m
{\bf x}_1^{n - 1},\\\\
{\bf y} = {\bf x}^2_1 {\bf x}_2 {\bf x}_1 {\bf x}_3,& \frac{\displaystyle \partial {\bf y}}{\displaystyle
\partial{\bf x}_1} = {\bf x}_1 {\bf x}_2 {\bf x}_1 {\bf x}_3 + {\bf x}_2 {\bf x}_1 {\bf x}_3 {\bf x}_1 + {\bf x}_3 {\bf x}_1^2 {\bf x}_2.
\end{array}
$$
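Rule (14) lends itself to a mechanical check. In the following sketch (our modern illustration, not part of the original text) a product is a tuple of factor labels, and the derivative with respect to ${\bf x_k}$ collects, for each occurrence of that factor, the factors after it followed by the factors before it:

```python
# Symbolic form of rule (14): differentiate a product (a word of
# factor labels) with respect to the factor labelled k.
def deriv(word, k):
    terms = []
    for r, factor in enumerate(word):
        if factor == k:
            # factors AFTER the chosen occurrence, then those BEFORE it
            terms.append(word[r + 1:] + word[:r])
    return terms

# y = x1^2 x2 x1 x3, written out factor by factor as in the text
y = ("x1", "x1", "x2", "x1", "x3")
for term in deriv(y, "x1"):
    print(" ".join(term))
# x1 x2 x1 x3
# x2 x1 x3 x1
# x3 x1 x1 x2
```

The three printed words agree with the third worked example above.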
If we further stipulate that
\begin{equation}
\frac{\partial({\bf y}_1 + {\bf y}_2)}{\partial{\bf x}_k} =
\frac{\partial{\bf y}_1}{\partial{\bf x}_k} + \frac{\partial{\bf y}_2}{\partial
{\bf x}_k},
\end{equation}
then the derivative $\partial{\bf y}/\partial{\bf x}$ is defined for the most
general analytical functions ${\bf y}$.
With the above definitions, together with that of the diagonal sum (9), there follows
the relation
\begin{equation}
\frac{\partial{\bf D(y)}}{\partial{\bf x}_k (nm)} = \frac{\partial{\bf
y}}{\partial{\bf x}_k} (mn),
\end{equation}
on the right--hand side of which stands the $mn$--component of the matrix $\partial{\bf
y}/\partial{\bf x}_k$. This
relation can also be used to define the derivative $\partial{\bf y}/\partial{\bf
x}_k$. In order to prove (16), it
obviously suffices to consider a function ${\bf y}$ having the form (13). From
(14) and (3) it follows that
\begin{equation}
\frac{\partial{\bf y}}{\partial{\bf x}_k}~(mn) = \sum \limits^s_{r =
1}~ \delta_{l_rk} \sum \limits_{\tau} \prod \limits^s_{p = r + 1}~ x_{l_p}
(\tau_p \tau_{p + 1}) \prod \limits^{r - 1}_{p = 1} x_{l_p} (\tau_p
\tau_{p + 1});
\end{equation}
$$
\tau_{r + 1} = m, \quad \tau_{s + 1} = \tau_1, \quad \tau_r = n.
$$
On the other hand, from (3) and (9) ensues
$$
\frac{\partial{\bf D(y)}}{\partial{\bf x}_k (mn)} = \sum \limits^s_{r =
1} \delta_{l_r k} \sum \limits_{\tau} \prod \limits^{r - 1}_{p = 1}~
x_{l_p} (\tau_p \tau_{p + 1}) \prod \limits^s_{p = r + 1} ~ x_{l_p}
(\tau_p \tau_{p + 1}); \eqno(17')
$$
$$
\tau_1 = \tau_{s + 1}, \quad \tau_r = n, \quad \tau_{r + 1} = m.
$$
Comparison of (17) with (17') yields (16).
We here pick out a fact which will later assume importance and which can be deduced
from the definition (14): {\it the partial derivatives of a
product are invariant
with respect to cyclic rearrangement of the factors.} Because of (16) this can also be inferred from (10).
To conclude this introductory section, some further
consideration is given to functions ${\bf g(pq)}$ of the two variables. For
\begin{equation}
{\bf y} = {\bf p}^s {\bf q}^r
\end{equation}
it follows from (14) that
$$
\frac{\partial{\bf y}}{\partial{\bf p}} = \sum \limits^{s - 1}_{l = 0}
{\bf p}^{s - 1 - l} {\bf q}^r {\bf p}^l, \quad
\frac{\partial{\bf y}}{\partial{\bf q}} = \sum \limits^{r - 1}_{j = 0}~
{\bf q}^{r - 1 - j} {\bf p}^s {\bf q}^j. \eqno(18')
$$
The most general function ${\bf g(pq)}$ to be considered is to be represented in accordance
with \S~1 by a linear aggregate of terms
\begin{equation}
{\bf z} = \prod \limits^k_{j = 1} ~ ({\bf p}^{s_j} {\bf q}^{r_j}).
\end{equation}
With the abbreviation
\begin{equation}
{\bf P}_l = \prod \limits^k_{j = l + 1}~ ({\bf p}^{s_j} {\bf q}^{r_j})
\prod \limits^{l - 1}_{j = 1} ~ ({\bf p}^{s_j} {\bf q}^{r_j}),
\end{equation}
one can write the derivatives as
\begin{equation}
\left.
\begin{array}{l}
\frac{\displaystyle \partial {\bf z}}{\displaystyle \partial {\bf p}} =
\sum \limits^k_{l = 1} \sum \limits^{s_l - 1}_{m = 0}~ {\bf p}^{s_l - 1
- m} {\bf q}^{r_l} {\bf P}_l {\bf p}^m,\\\\
\frac{\displaystyle \partial {\bf z}}{\displaystyle \partial {\bf q}} =
\sum \limits^k_{l = 1} \sum \limits^{r_l - 1}_{m = 0}~ {\bf q}^{r_l - 1
- m} {\bf P}_l {\bf p}^{s_l} {\bf q}^m.
\end{array} \right\}
\end{equation}
From these equations we find an important consequence. We consider the
matrices
\begin{equation}
{\bf d}_1 = {\bf q}~ \frac{\partial {\bf z}}{\partial {\bf q}} - \frac{\partial
{\bf z}}{\partial {\bf q}}~ {\bf q}, \quad {\bf d}_2 = {\bf p}~ \frac{\partial
{\bf z}}{\partial {\bf p}} - \frac{\partial {\bf z}}{\partial {\bf p}}~
{\bf p}.
\end{equation}
From (21) we have
$$
{\bf d}_1 = \sum \limits^k_{l = 1} ~ ({\bf q}^{r_l} {\bf P}_l {\bf p}^{s_l}
- {\bf P}_l {\bf p}^{s_l} {\bf q}^{r_l}),
$$
$$
{\bf d}_2 = \sum^k_{l = 1}~ ({\bf p}^{s_l} {\bf q}^{r_l} {\bf P}_l - {\bf
q}^{r_l} {\bf P}_l {\bf p}^{s_l}).
$$
and thus it follows that
$$
{\bf d}_1 + {\bf d}_2 = \sum \limits^k_{l = 1} ~ ({\bf p}^{s_l} {\bf q}^{r_l}
{\bf P}_l - {\bf P}_l {\bf p}^{s_l} {\bf q}^{r_l}).
$$
Herein the second member of each term cancels the first member of the
following, and the first and last member of the overall sum also cancel, so that
\begin{equation}
{\bf d}_1 + {\bf d}_2 = 0.
\end{equation}
Because of its linear character in ${\bf z}$, this relation holds not only for expressions ${\bf z}$ having the form (19), but indeed for arbitrary analytical
functions ${\bf g(pq)}$.\footnote{More generally, for a function of several variables ${\bf x}_1, {\bf x}_2, \dots,$
one has
$$
\sum \limits_r \left( {\bf x}_r~ \frac{\partial {\bf g}}{\partial {\bf
x}_r} - \frac{\partial {\bf g}}{\partial {\bf x}_r} ~ {\bf x}_r \right)
= 0.
$$
}
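Relation (23) can be verified numerically for a sample term of the form (19). The sketch below (our illustration, not part of the original text) takes ${\bf z = p^2 q^3}$ with two arbitrary non-commuting integer matrices, forming the derivatives by the cyclic rule (14):

```python
# Check of (23): q dz/dq - (dz/dq) q + p dz/dp - (dz/dp) p = 0
# for z = p^2 q^3, with dz/dx the cyclic derivative of rule (14).
def matmul(b, c):
    n = len(b)
    return [[sum(b[i][k] * c[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def deriv(word, k):
    return [word[r + 1:] + word[:r]
            for r, f in enumerate(word) if f == k]

def evalword(word, env, n):
    a = [[int(i == j) for j in range(n)] for i in range(n)]  # unit matrix
    for f in word:
        a = matmul(a, env[f])
    return a

def madd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

p = [[0, 1, 0], [2, 0, 3], [0, 4, 0]]
q = [[1, 0, 5], [0, 6, 0], [7, 0, 8]]
env, n = {"p": p, "q": q}, 3
z = ("p", "p", "q", "q", "q")        # z = p^2 q^3

def dz(k):                           # the matrix dz/dx_k
    s = [[0] * n for _ in range(n)]
    for t in deriv(z, k):
        s = madd(s, evalword(t, env, n))
    return s

def comm(a, b):                      # ab - ba
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

d1, d2 = comm(q, dz("q")), comm(p, dz("p"))
print(madd(d1, d2))                  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

The cancellation is exact (integer arithmetic), and holds for any choice of ${\bf p}$ and ${\bf q}$, since (23) is an identity in the free (non-commutative) algebra.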
In concluding this brief survey of matrix analysis, we establish the following rule:
{\it Every matrix equation}
$$
{\bf F(x}_1, {\bf x}_2, \dots {\bf x}_r) = 0
$$
{\it remains valid if in all the matrices ${\bf x}_j$ one and the same permutation of
all rows and columns is undertaken.} To this end, it suffices to show that for
two matrices ${\bf a,~ b}$ which are thereby transformed into
${\bf a', ~b'}$, the following
invariance conditions apply:
$$
{\bf a' + b' = (a + b)', \quad a'b' = (ab)',}
$$
wherein the right--hand sides denote those matrices which are
formed from ${\bf a + b}$ and ${\bf ab}$ respectively by such an
interchange.
We set forth this proof by replacing the procedure of permutation
by that of
multiplication with a suitable matrix.\footnote{The method of proof adopted here possesses the merit
of revealing the close connection of permutations with an important class of more
general transformations of matrices. The validity of the rule
in question can however also be established directly on
noting that in the
definitions of {\it equality}, as also of {\it addition}
and {\it multiplication}
of matrices, no use was made of order relationships between the rows
or the columns.}
We write a permutation as
$$
\left(
\begin{array}{ccccc}
0&1&2&3&\ldots\\
k_0&k_1&k_2&k_3 &\ldots
\end{array} \right) = \left(
\begin{array}{l}
n\\
k_n
\end{array} \right)
$$
and to this we assign a {\it permutation matrix,}
$$
{\bf p} = (p(nm)), \quad p(nm) = \left\{
\begin{array}{l}
1~ \mbox{when}~ m=k_n\\
0~ \mbox{otherwise}.
\end{array}
\right.
$$
The transposed matrix to ${\bf p}$ is
$$
{\bf \tilde p} = (\tilde p(nm)), \quad \tilde p(nm) = \left\{
\begin{array}{l}
1~ \mbox{when}~ n = k_m\\
0~ \mbox{otherwise.}
\end{array} \right.
$$
On multiplying the two together, one has
$$
{\bf p \tilde p} = (\sum \limits_k p(nk) \tilde p(km)) = (\delta_{nm}) = {\bf 1},
$$
since the two factors $p(nk)$ and $\tilde p(km)$ differ from zero
simultaneously only if
$k=k_n=k_m$, i.e., when $n=m$. Hence ${\bf \tilde p}$ is reciprocal to ${\bf p}$:
$$
{\bf \tilde p} = {\bf p}^{-1}.
$$
If now ${\bf a}$ be any given matrix, then
$$
{\bf pa} = (\sum \limits_k p(nk) a(km)) = (a(k_n, m))
$$
is a matrix which arises from the permutation $\left( \begin{array}{c}n\\
k_n \end{array} \right)$ of the rows of ${\bf a}$ and
equivalently
$$
{\bf ap}^{-1} = (\sum \limits_k a(nk) \tilde p(km)) = (a(n, k_m))
$$
is the matrix arising from permutation of the columns of ${\bf a}$.
One and the same
permutation applied both to the rows and the columns of ${\bf a}$
thus yields the matrix
$$
{\bf a'} = {\bf pap}^{-1}.
$$
Thence follows directly
$$
\begin{array}{ll}
{\bf a}' + {\bf b}' = {\bf p(a+b)p}^{-1} & = {\bf (a+b)'},\\
{\bf a'b'} = {\bf pabp}^{-1}&={\bf (ab)}'
\end{array}
$$
which proves our original contention.
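The permutation calculus just described is easily rehearsed numerically. The following sketch (our illustration, not part of the original text) builds the permutation matrix ${\bf p}$ and its transpose for a sample permutation $n \to k_n$, and confirms both ${\bf p \tilde p = 1}$ and that ${\bf a' = pap}^{-1}$ permutes rows and columns alike:

```python
# Permutation matrix p(nm) = 1 when m = k_n, its transpose,
# and the transform a' = p a p^{-1}.
def matmul(b, c):
    n = len(b)
    return [[sum(b[i][k] * c[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

k = [2, 0, 1]                        # the permutation n -> k_n
n = len(k)
p  = [[int(m == k[i]) for m in range(n)] for i in range(n)]
pt = [[int(i == k[m]) for m in range(n)] for i in range(n)]  # transpose of p

print(matmul(p, pt))                 # the unit matrix: pt is reciprocal to p

a  = [[10 * i + j for j in range(n)] for i in range(n)]
a1 = matmul(matmul(p, a), pt)        # a' = p a p^{-1}
# a'(nm) = a(k_n, k_m): rows and columns permuted together
print(a1)                            # [[22, 20, 21], [2, 0, 1], [12, 10, 11]]
```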
It is thus apparent that from matrix equations one can never
determine any given sequence or order of rank of the matrix elements.
Moreover, it is evident that a much more general rule applies, namely
that every matrix equation is invariant with respect to
transformations of the type
$$
{\bf a'} = {\bf bab}^{-1},
$$
where ${\bf b}$ denotes an {\it arbitrary} matrix. We shall see later that this does not
necessarily always apply to matrix differential equations.
\section*{
Chapter 2. Dynamics}
\subsection*{
3. The basic laws}
~~~~The dynamic system is to be described by the spatial
coordinate ${\bf q}$ and the momentum ${\bf p}$, these being represented by the matrices
\begin{equation}
{\bf q} = (q(nm)e^{2 \pi i \nu(nm)t}), \quad {\bf p} = (p(nm)e^{2 \pi i\nu(nm)t}).
\end{equation}
Here the $\nu (nm)$ denote the quantum-theoretical frequencies
associated with transitions between states described by the
{\it quantum numbers} $n$ and $m.$ The
matrices (24) are to be Hermitian, i.e., on transposition of the
matrices, each element is to go over into its complex conjugate
value, a condition which should apply for all real $t$. We thus have
\begin{equation}
q(nm) q(mn) = |q(nm)|^2
\end{equation}
and
\begin{equation}
\nu(nm) = - \nu(mn).
\end{equation}
If ${\bf q}$ be a {\it Cartesian} coordinate, then the expression (25) is
a measure of the {\it probabilities}\footnote{In this connection see
$\S 8$.} of the transitions $n \leftrightarrow m$.
Further, we shall require that
\begin{equation}
\nu(jk) + \nu(kl) + \nu(lj) = 0.
\end{equation}
This can be expressed together with (26) in the following manner:
there exist quantities $W_n$ such that
\begin{equation}
h \nu(nm) = W_n - W_m.
\end{equation}
From this, with equations (2), (3), it follows that a function ${\bf g(pq)}$
invariably again takes on the form
\begin{equation}
{\bf g}= (g(nm)e^{2 \pi i \nu(nm)t})
\end{equation}
and the matrix $(g(nm))$ therein results from identically the
same process applied
to the matrices $(q(nm)),~ (p(nm))$ as was employed to find ${\bf g}$
from ${\bf q, p.}$ For this reason we can henceforth abandon the
representation (24) in favour of the shorter notation
\begin{equation}
{\bf q} = (q(nm)), \quad {\bf p} = (p(nm)).
\end{equation}
For the {\it time derivative} of the matrix
${\bf g} = (g(nm)),$ recalling to mind
(24) or (29), we obtain the matrix
\begin{equation}
{\bf \dot g} = 2 \pi i(\nu(nm)g(nm)).
\end{equation}
If $\nu(nm) \ne 0$ when $n \ne m,$ a condition which we wish to
assume, then the formula
${\bf \dot g} =0$ denotes that ${\bf g}$ is a diagonal matrix with
$g(nm) = \delta_{nm}g(nn).$
A matrix differential equation ${\bf \dot g=a}$ is invariant with
respect to that process
in which the same permutation is carried out on rows and columns
of all the matrices and also upon the numbers $W_n$. In order to verify
this, consider the diagonal matrix
$$
{\bf W} = (\delta_{nm} W_n).
$$
Then
$$
{\bf Wg} = (\sum \limits_k \delta_{nk} W_n g(km)) = (W_ng(nm)),
$$
$$
{\bf gW} = (\sum \limits_kg(nk)\delta_{km}W_k) = (W_mg(nm)),
$$
i.e., according to (31),
$$
{\bf \dot g} = \frac{2 \pi i}{h} ((W_n - W_m)g(nm)) = \frac{2\pi i}{h} ({\bf
Wg - gW}).
$$
If now ${\bf p}$ be a permutation matrix, then the transform of
${\bf W,}$
$$
{\bf W'} = {\bf p W p}^{-1}=(\delta_{nm} W_{k_n})
$$
is the diagonal matrix with the permuted $W_n$ along the diagonal.
Thence one has
$$
{\bf p \dot g p}^{-1} = \frac{2 \pi i}{h} ({\bf W'g' - g'W'}) = {\bf \dot
g'},
$$
where ${\bf g'= pgp}^{-1}$ and ${\bf \dot g'}$ denotes the time
derivative of ${\bf g'}$ constructed in accordance with the rule (31)
with permuted $W_n.$
The rows and columns of ${\bf \dot g}$ thus experience the same
permutation as those of ${\bf g,}$ and hence our contention is vindicated.
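The componentwise relation used in this proof, namely that ${\bf Wg - gW}$ has elements $(W_n - W_m)g(nm)$, can be checked directly on finite matrices; a sketch of ours (not part of the original text), with arbitrarily chosen sample values $W_n$:

```python
# With W the diagonal matrix (delta_nm W_n), the matrix Wg - gW has
# components (W_n - W_m) g(nm), so g-dot = (2 pi i / h)(Wg - gW).
def matmul(b, c):
    n = len(b)
    return [[sum(b[i][k] * c[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

Wn = [3, 5, 11]                      # sample term values W_n
n = len(Wn)
W = [[Wn[i] if i == j else 0 for j in range(n)] for i in range(n)]
g = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

Wg, gW = matmul(W, g), matmul(g, W)
comm = [[Wg[i][j] - gW[i][j] for j in range(n)] for i in range(n)]
print(all(comm[i][j] == (Wn[i] - Wn[j]) * g[i][j]
          for i in range(n) for j in range(n)))  # True
```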
It is to be noted that a corresponding rule does {\it not} apply to
arbitrary transformations of the form ${\bf a' = bab}^{-1}$ since for
these ${\bf W'}$ is no longer a diagonal
matrix. Despite this difficulty, a thorough study of these general
transformations would seem to be called for, since it offers promise of
insight into the deeper
connections intrinsic to this new theory: we shall later revert to this
point.\footnote{Cf. the continuation of this work, to be published forthwith.}
In the case of a Hamilton function having the form
$$
{\bf H} = \frac{1}{2m} {\bf p^2} + {\bf U(q)}
$$
we shall assume, as did Heisenberg, that the {\it equations of motion}
are just of the same
form as in classical theory, so that using the notation of \S 2 we can write:
\begin{equation}
\left.
\begin{array}{l}
{\bf \dot q} = \frac{\displaystyle \partial {\bf H}}{\displaystyle \partial {\bf p}} =
\frac{\displaystyle 1}{\displaystyle m} {\bf p},\\\\
{\bf \dot p} = - \frac{\displaystyle \partial {\bf H}}{\displaystyle \partial
{\bf q}} = - \frac{\displaystyle \partial {\bf U}}{\displaystyle\partial
{\bf q}}.
\end{array}\right\}
\end{equation}
We now use correspondence considerations to try more generally to
elucidate the equations of motion belonging to an arbitrary Hamilton
function ${\bf H(pq)}$.
This is required from the standpoint of relativistic mechanics and in
particular for the treatment of electron motion under the influence of
magnetic fields. For in this latter case, the function ${\bf H}$
cannot in a Cartesian coordinate system any
longer be represented by the sum of two functions of which one
depends only on the momenta and the other on the coordinates.
Classically, equations of motion can be derived from the action principle
\begin{equation}
\int \limits^{t_1}_{t_0} L dt = \int \limits^{t_1}_{t_0} \{ p \dot q - H(pq)\}
dt = \mbox{extremum.}
\end{equation}
If we now envisage the Fourier expansion of $L$ substituted in (33)
and the time interval $t_1 - t_0$ taken sufficiently large, we find
that only the constant term
of $L$ supplies a contribution to the integral. The form which
the action principle
thence acquires suggests the following translation into quantum mechanics:
The diagonal sum ${\bf D(L)} = \sum \limits_k L(kk)$ {\it is to be made
an extremum}:
\begin{equation}
{\bf D(L) = D(p \dot q - H(pq))} = \mbox{extremum,}
\end{equation}
{\it namely, by suitable choice of ${\bf p}$ and ${\bf q,}$
with $\nu(nm)$ kept fixed.}
Thus, by setting the derivatives of ${\bf D(L)}$ with respect to the
elements of ${\bf p}$ and ${\bf q}$ equal to zero, one obtains the equations of motion
$$
2 \pi i \nu(nm) q(nm) = \frac{\partial {\bf D(H)}}{\partial p(mn)},
$$
$$
2 \pi i \nu(mn) p(nm) = \frac{\partial {\bf D(H)}}{\partial q(mn)}.
$$
From (26), (31) and (16) one observes that these equations of
motion can always be written in {\it canonical} form,
\begin{equation}
\left. \begin{array}{l}
{\bf \dot q} = \frac{\displaystyle \partial {\bf H}}{\displaystyle \partial
{\bf p}},\\\\
{\bf \dot p} = - \frac{\displaystyle \partial {\bf H}}{\displaystyle \partial {\bf q}}.
\end{array} \right\}
\end{equation}
For the quantization condition, Heisenberg employed a relation
proposed by Thomas\footnote{W. Thomas, Naturwiss. {\bf 13} (1925) 627.}
and Kuhn.\footnote{W. Kuhn, Zs. f. Phys. {\bf 33} (1925) 408.} The equation
$$
{\bf J} = \oint p dq = \int \limits^{1/\nu}_0 p \dot q dt
$$
of ``classical'' quantum theory can, on introducing the Fourier
expansions of $p$ and $q,$
$$
p = \sum \limits^{\infty}_{\tau = - \infty} p_{\tau}e^{2 \pi i
\nu \tau t}, \quad q = \sum \limits^{\infty}_{\tau = - \infty} q_{\tau} e^{2
\pi i \nu \tau t},
$$
be transformed into
\begin{equation}
1 = 2 \pi i \sum \limits^{\infty}_{\tau = - \infty} ~ \tau \frac{\partial}{\partial
{\bf J}} (q_{\tau} p_{- \tau}).
\end{equation}
If therein one has $p = m \dot q,$ one can express the $p_{\tau}$
in terms of $q_{\tau}$ and thence
obtain that classical equation which on transformation into a
difference equation
according to the principle of correspondence yields the formula of
Thomas and Kuhn. Since here the assumption that ${\bf p}= m\dot {\bf q}$ should
be avoided, we are obliged
to translate equation (36) directly into a difference equation.
The following expressions should correspond:
$$
\sum \limits^{\infty}_{\tau = - \infty} \tau \frac{\partial}{\partial {\bf
J}} (q_{\tau} p_{- \tau}) \quad \mbox{with}
$$
$$
\frac{1}{h} \sum \limits^{\infty}_{\tau = - \infty} \left( q(n + \tau, n)p(n,n+
\tau) - q(n,n - \tau)p(n - \tau, n)\right);
$$
where in the right-hand expression those $q(nm), ~p(nm)$ which take
on a negative index are to be set equal to zero. In this way we obtain
the quantization condition corresponding to (36) as
\begin{equation}
\sum \limits_k (p(nk)q(kn) - q(nk)p(kn)) = \frac{h}{2 \pi i}.
\end{equation}
This is a system of infinitely many equations, namely one for each value of $n.$
In particular, for ${\bf p}=m {\bf \dot q}$ this yields
$$
\sum \limits_k \nu(kn)|q(nk)|^2 = \frac{h}{8 \pi^2 m},
$$
which, as may easily be verified, agrees with Heisenberg's form of the
quantization condition, or with the Thomas-Kuhn equation. The formula
(37) has to be regarded as the appropriate generalization of this equation.
Incidentally one sees from (37) that the diagonal sum ${\bf D(pq)}$
necessarily becomes infinite. For otherwise one would have
${\bf D(pq) - D(qp)} = 0$ from (10),
whereas (37) leads to ${\bf D(pq)-D(qp)}=\infty.$ Thus the matrices
under consideration are never finite.\footnote{Further, they do not belong
to the class of ``bounded'' infinite matrices hitherto almost exclusively
investigated by mathematicians.}
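For the harmonic oscillator the matrices can be written down explicitly. The sketch below (our illustration, not part of the original text) uses units in which $\hbar = h/2\pi = m = \omega = 1$, with the standard elements $q(n, n+1) = q(n+1, n) = \sqrt{(n+1)/2}$; the diagonal of ${\bf pq - qp}$ then equals $h/2\pi i = -i\hbar$ in every row of the truncation except the last, where the discarded elements make themselves felt, in keeping with the remark that ${\bf D(pq)}$ is necessarily infinite:

```python
import math

N = 6    # truncation size; the true matrices are infinite
# Harmonic-oscillator matrices (hbar = m = omega = 1):
# q(n, n+1) = q(n+1, n) = sqrt((n+1)/2),
# p(n, n+1) = -p(n+1, n) = -i sqrt((n+1)/2).
q = [[math.sqrt(max(i, j) / 2) if abs(i - j) == 1 else 0.0
      for j in range(N)] for i in range(N)]
p = [[(1j if i > j else -1j) * math.sqrt(max(i, j) / 2)
      if abs(i - j) == 1 else 0.0 for j in range(N)] for i in range(N)]

def matmul(b, c):
    n = len(b)
    return [[sum(b[i][k] * c[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

pq, qp = matmul(p, q), matmul(q, p)
diag = [pq[i][i] - qp[i][i] for i in range(N)]
# approximately -1j in every row but the last, where the
# truncation produces a large positive-imaginary remainder
print(diag)
```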
\subsection*{
4. Consequences. Energy-conservation and frequency laws}
~~~~The content of the preceding paragraphs furnishes the basic
rules of the new quantum mechanics in their entirety. All other
laws of quantum mechanics,
whose general validity is to be verified, must be {\it derivable} from
these basic
tenets. As instances of such laws to be proved, the law of
energy conservation and the Bohr frequency condition primarily
enter into consideration. The law of
conservation of energy states that if ${\bf H}$ be the energy, then ${\bf
\dot H}=0,$ or that ${\bf H}$ is a
{\it diagonal matrix.} The diagonal elements $H(nn)$ of ${\bf H}$
are interpreted, according to
Heisenberg, as the {\it energies of the various states of the system}
and the Bohr frequency condition requires that
$$
h \nu(nm) =H(nn)-H(mm),
$$
or
$$
W_n =H(nn) + ~\mbox{const.}
$$
We consider the quantity
$$
{\bf d = pq - qp.}
$$
From (11), (35) one finds
$$
{\bf \dot d = \dot p q + p \dot q - \dot q p - q \dot p}
= {\bf q} \frac{\partial {\bf H}}{\partial {\bf q}} - \frac{\partial {\bf
H}}{\partial {\bf q}} {\bf q} + {\bf p} \frac{\partial {\bf H}}{\partial {\bf
p}} - \frac{\partial {\bf H}}{\partial {\bf p}} {\bf p}.
$$
Thus from (22), (23) it follows that ${\bf \dot d} =0$ and ${\bf d}$
is a diagonal matrix. The
diagonal elements of ${\bf d}$ are, however, specified just by the
quantum condition (27). Summarizing, we obtain the equation
\begin{equation}
{\bf pq - qp} = \frac{h}{2 \pi i} {\bf 1},
\end{equation}
on introducing the unit matrix ${\bf 1}$ defined by (6). We term the
equation (38) the
``stronger quantum condition'' and base all further conclusions upon it.
From the form of this equation, we deduce the following: If an
equation ($A$)
be derived from (38), then $(A)$ remains valid if ${\bf p}$ be
replaced by ${\bf q}$ and
simultaneously $h$ by $-h.$ For this reason one need for instance
derive only one of the following two equations from (38), which can
readily be performed by induction
\begin{equation}
{\bf p}^n {\bf q} = {\bf qp}^n + n \frac{h}{2 \pi i} {\bf p}^{n-1},
\end{equation}
$$
{\bf q}^n {\bf p} = {\bf pq}^n - n \frac{h}{2 \pi i} {\bf q}^{n-1}. \eqno(39')
$$
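(Editorial aside, not part of the original text.) Relations (39) and (39$'$) can be checked symbolically by representing ${\bf q}$ as multiplication by $x$ and ${\bf p}$ as $(h/2\pi i)\,d/dx$, for which ${\bf pq - qp} = (h/2\pi i){\bf 1}$ holds exactly; the sketch below performs this check for a sample exponent.

```python
import sympy as sp

x, c = sp.symbols('x c')      # c stands for h/(2*pi*i)

def P(f):                     # p acts as c * d/dx in this representation
    return c * sp.diff(f, x)

def Q(f):                     # q acts as multiplication by x
    return x * f

def power(op, f, k):          # apply an operator k times
    for _ in range(k):
        f = op(f)
    return f

f = x**5 + 3*x**2 + 1         # arbitrary test polynomial
nn = 4                        # sample exponent

# equation (39):  p^n q - q p^n = n (h/2 pi i) p^(n-1)
lhs = power(P, Q(f), nn) - Q(power(P, f, nn))
rhs = nn * c * power(P, f, nn - 1)
assert sp.simplify(lhs - rhs) == 0

# equation (39'): q^n p - p q^n = -n (h/2 pi i) q^(n-1)
lhs2 = power(Q, P(f), nn) - P(power(Q, f, nn))
rhs2 = -nn * c * power(Q, f, nn - 1)
assert sp.simplify(lhs2 - rhs2) == 0
```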
We shall now prove the energy-conservation and frequency laws, as expressed above, in the first instance for the case
$$
{\bf H = H_1(p) + H_2(q)}.
$$
From the statements of \S 1, it follows that we may formally
replace ${\bf H}_1({\bf p})$ and ${\bf H}_2({\bf q})$ by power expansions
$$
{\bf H}_1 = \sum \limits_s a_s {\bf p}^s, \quad {\bf H}_2 = \sum \limits_s
b_s {\bf q}^s.
$$
Formulae (39) and (39') indicate that
\begin{equation}
\left.
\begin{array}{l}
{\bf Hq - qH} = \frac{\displaystyle h}{\displaystyle 2 \pi i} \frac{\displaystyle \partial {\bf H}}{\displaystyle \partial {\bf p}},\\\\
{\bf Hp - pH} = - \frac{\displaystyle h}{\displaystyle 2 \pi i} \frac{\displaystyle \partial {\bf H}}{\displaystyle \partial {\bf q}}.
\end{array} \right\}
\end{equation}
Comparison with the equations of motion (35) yields
\begin{equation}
\left. \begin{array}{l}
{\bf \dot q} = \frac{\displaystyle 2 \pi i}{\displaystyle h} ({\bf Hq - qH}).\\\\
{\bf \dot p} = \frac{\displaystyle 2 \pi i}{\displaystyle h} ({\bf Hp - pH}).
\end{array} \right\}
\end{equation}
Denoting the matrix ${\bf Hg - gH}$ by $\biggl| \begin{array}{c} {\bf H}\\{\bf
g} \end{array} \biggr| $ for brevity, one has
\begin{equation}
\Biggl| \begin{array}{c}
{\bf H}\\
{\bf ab}
\end{array} \Biggr| = \Biggl| \begin{array}{c}
{\bf H}\\
{\bf a}
\end{array} \Biggr| {\bf b} + {\bf a} \Biggl| \begin{array}{c}
{\bf H}\\
{\bf b}
\end{array} \Biggr|;
\end{equation}
from which generally for ${\bf g=g(pq)}$ one may conclude that
\begin{equation}
{\bf \dot g} = \frac{2 \pi i}{h} \Biggl| \begin{array}{c}
{\bf H}\\
{\bf g}
\end{array} \Biggr| = \frac{2 \pi i}{h} ({\bf Hg - gH}).
\end{equation}
To establish this result, one need only conceive ${\bf \dot g}$
as expressed in function of ${\bf p, q}$
and ${\bf \dot p, \dot q}$ with the aid of (11), (11$'$), and
$\biggl| \begin{array}{c}
{\bf H}\\
{\bf g}
\end{array} \biggr|$ as evaluated by means of (42) in
function of ${\bf p, q}$ and $\biggl| \begin{array}{c} {\bf H}\\ {\bf p}
\end{array} \biggr|$, $\biggl| \begin{array}{c} {\bf H}\\ {\bf q}
\end{array} \biggr|$ followed by application of the relations (41).
In particular, if in (43) one sets ${\bf g=H,}$ one obtains
\begin{equation}
{\bf \dot H} = 0.
\end{equation}
Now that we have verified the energy-conservation law and recognized
the matrix ${\bf H}$ to be diagonal, equation (41) can be put into the
form
$$
h \nu(nm) q(nm) =(H(nn) - H(mm)) q(nm),
$$
$$
h \nu(nm)p(nm) = (H(nn) - H(mm)) p(nm),
$$
from which the frequency condition follows.
If we now go over to consideration of more general Hamilton functions
${\bf H}^{\ast}={\bf H}^{\ast} {\bf (pq),}$ it can easily be seen that
in general ${\bf \dot H}^{\ast}$ no longer vanishes
(examples such as ${\bf H}^{\ast} = {\bf p}^2 {\bf q}$ readily reveal
this). It can however be observed
that the Hamilton function ${\bf H} = \frac{1}{2} ({\bf p}^2 {\bf q} + {\bf
qp}^2)$ yields the same equations of motion
as ${\bf H}^{\ast}$ and that ${\bf \dot H}$ again vanishes. In
consequence we may express the energy-conservation and frequency
laws in the following way: To {\it each function }
${\bf H}^{\ast} = {\bf H}^{\ast} ({\bf pq})$ {\it there can be
assigned a function ${\bf H=H(pq)}$ such that as
Hamiltonians ${\bf H}^{\ast}$ and ${\bf H}$ yield the same equations of
motion and that for these equations of motion ${\bf H}$ assumes the role of an energy which is constant in time and which fulfils the frequency condition.}
On bearing in mind the considerations discussed above, it suffices
to show that the function ${\bf H}$ to be specified satisfies not only the conditions
\begin{equation}
\frac{\partial {\bf H}}{\partial {\bf p}} = \frac{\partial{\bf H}^{\ast}}{\partial
{\bf p}}, \quad \frac{\partial {\bf H}}{\partial {\bf q}} = \frac{\partial
{\bf H}^{\ast}}{\partial {\bf q}},
\end{equation}
but in addition satisfies equations (40). From \S 1, the matrix
${\bf H}^{\ast}$ is formally to
be represented as a sum of products of powers of ${\bf p}$ and ${\bf q.}$ Because of the
linearity of equations (40), (45) in ${\bf H, H}^{\ast}$ we have
simply to specify in ${\bf H}$ a counterpart to each individual
sum term in ${\bf H}^{\ast}$. Thus we need consider solely the case
\begin{equation}
{\bf H}^{\ast} = \prod \limits^k_{j=1} {\bf p}^{s_j}
{\bf q}^{r_j}.
\end{equation}
It follows from the remarks of \S 2 that equations (45) can be
satisfied by specifying ${\bf H}$ as a linear form of those products of
powers of ${\bf p, q}$ which arise
from ${\bf H}^{\ast}$ through cyclic interchange of the factors; herein
the sum of the coefficients must be held to unity. The question as
to how these coefficients are
to be chosen so that equations (40) may also be satisfied is less
easy to answer. It may at this juncture suffice to dispose of the
case $k=1$, namely
\begin{equation}
{\bf H}^{\ast} = {\bf p}^s {\bf q}^r.
\end{equation}
The formula (39) can be generalized\footnote{A different generalization is
furnished by the formulae
$$
{\bf p}^m {\bf q}^{n} = \sum \limits^{m,n}_{j=0} j! \left( \begin{array}{c}
m\\
j
\end{array} \right) \left( \begin{array}{c}
n\\
j
\end{array} \right) \left( \frac{\displaystyle h}{\displaystyle 2 \pi i}
\right)^j {\bf q}^{n-j} {\bf p}^{m-j},
$$
$$
{\bf q}^n {\bf p}^m = \sum \limits^{m,n}_{j=0} j! \left( \begin{array}{c}
m\\
j
\end{array} \right) \left( \begin{array}{c}
n\\
j
\end{array} \right) \left( - \frac{\displaystyle h}{\displaystyle 2 \pi i}
\right)^j {\bf p}^{m-j} {\bf q}^{n-j},
$$
where $j$ runs to the lesser of the two integers $m,n$.} to
\begin{equation}
{\bf p}^m {\bf q}^n - {\bf q}^n {\bf p}^m = m \frac{h}{2 \pi i} \sum \limits^{n-1}_{l=0}
{\bf q}^{n-1-l} {\bf p}^{m-1} {\bf q}^l.
\end{equation}
For $n=1$ this reverts to (39); in general (48) ensues from the fact that because of (39) one has
$$
{\bf p}^m {\bf q}^{n+1} - {\bf q}^{n+1}{\bf p}^m = ({\bf p}^m {\bf q}^n -
{\bf q}^{n} {\bf p}^m ){\bf q} + m \frac{h}{2 \pi i} {\bf q}^n {\bf p}^{m-1}.
$$
The new formula
$$
{\bf p}^m {\bf q}^n - {\bf q}^n {\bf p}^m = n \frac{h}{2 \pi i} \sum \limits^{m-1}_{j=0}
{\bf p}^{m-1-j} {\bf q}^{n-1} {\bf p}^j \eqno(48')
$$
is obtained on interchanging ${\bf p}$ and ${\bf q}$ and reversing the
sign of $h.$
Comparison with (48) yields
\begin{equation}
\frac{1}{s+1} \sum \limits^s_{l=0} {\bf p}^{s-l} {\bf q}^r {\bf p}^l = \frac{1}{r+1}
\sum \limits^r_{j=0} {\bf q}^{r-j} {\bf p}^s{\bf q}^j.
\end{equation}
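(Editorial aside, not part of the original text.) In the same differential representation as before, $q$ as multiplication by $x$ and $p$ as $(h/2\pi i)\,d/dx$, relations (48) and (49) can be verified symbolically for sample exponents:

```python
import sympy as sp

x, c = sp.symbols('x c')                       # c stands for h/(2*pi*i)

def P(f): return c * sp.diff(f, x)             # p as c * d/dx
def Q(f): return x * f                         # q as multiplication by x

def word(ops, f):                              # apply an operator product as written
    for op in reversed(ops):                   # (rightmost factor acts first)
        f = op(f)
    return f

f = x**4 + x                                   # arbitrary test polynomial
m, n = 3, 2                                    # sample exponents

# (48): p^m q^n - q^n p^m = m (h/2 pi i) sum_l q^(n-1-l) p^(m-1) q^l
lhs = word([P]*m + [Q]*n, f) - word([Q]*n + [P]*m, f)
rhs = m * c * sum(word([Q]*(n-1-l) + [P]*(m-1) + [Q]*l, f) for l in range(n))
assert sp.simplify(lhs - rhs) == 0

# (49): (1/(s+1)) sum_l p^(s-l) q^r p^l = (1/(r+1)) sum_j q^(r-j) p^s q^j
s, r = 2, 3
A = sum(word([P]*(s-l) + [Q]*r + [P]*l, f) for l in range(s + 1)) / (s + 1)
B = sum(word([Q]*(r-j) + [P]*s + [Q]*j, f) for j in range(r + 1)) / (r + 1)
assert sp.simplify(A - B) == 0
```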
We now assert: The matrix ${\bf H}$ belonging to ${\bf H}^{\ast}$ as
given by (47) is:
\begin{equation}
{\bf H} = \frac{1}{s+1} \sum \limits^s_{l=0} {\bf p}^{s-l} {\bf q}^r {\bf
p}^l.
\end{equation}
We need only prove equations (40), to which end we recall the
derivatives, (18$'$) \S 2.
From (50), we now obtain the relation
$$
{\bf Hp - pH} = \frac{1}{s+1} ({\bf q}^r {\bf p}^{s+1} - {\bf p}^{s+1} {\bf
q}^r),
$$
and according to (48) this is equivalent to the lower of equations (40).
Further, using (49) we find
$$
{\bf Hq - qH} = \frac{1}{r+1} ({\bf p}^s {\bf q}^{r+1} - {\bf q}^{r+1} {\bf
p}^s),
$$
and by (48$'$) this is equivalent to the upper of equations (40).
This completes the requisite proof.
Whereas in classical mechanics energy conservation $({\bf \dot H}=0)$ is directly apparent
from the canonical equations, the same law of energy conservation in
quantum mechanics, ${\bf \dot H}=0,$ lies, as one can see, more deeply hidden beneath the surface.
That its demonstrability from assumed postulates is far from being trivial will be
appreciated if, following more closely the classical method of proof, one sets out to
prove ${\bf H}$ to be constant simply by evaluating ${\bf \dot H.}$
To this end, one first has to express ${\bf \dot H}$ as
function of ${\bf p, q}$ and ${\bf \dot p, \dot q}$ with the aid of (11),
(11$'$), whereupon for ${\bf \dot p}$ and ${\bf \dot q}$ the
values $-\partial {\bf H}/ \partial {\bf q},~ \partial {\bf H}/ \partial
{\bf p}$ have to be introduced. This yields ${\bf \dot H}$ in function
of ${\bf p}$ and ${\bf q.}$ Equation (38)
or the formulae quoted in the footnote to equation (48) which were
derived from (38) permit this function to be converted into a sum of
terms of the type $a{\bf p}^s {\bf q}^r$ and one then
has to prove that the coefficient $a$ in each of such terms vanishes.
This calculation for
the most general case, as considered above along different lines,
becomes so exceedingly involved\footnote{For the case ${\bf H} =(1/2m){\bf
p}^2 + {\bf U(q)}$ it can immediately be carried out with the aid of
(39$'$).} that it seems hardly feasible. The fact that nonetheless
energy-conservation and frequency laws could be proved in so general a context would seem to
us to furnish strong grounds to hope that this theory embraces truly
deep-seated physical laws.
In conclusion, we append a result here which can easily be derived from the formulae
of this section, namely: {\it Equations (35), (37) can be replaced by (38) and (44) (with ${\bf H}$
representing the energy); the frequencies are thereby to be derived from the frequency condition.}
In the continuation to this paper, we shall examine the important applications to which this theorem gives rise.
\section*{
Chapter 3. Investigation of the Anharmonic Oscillator}
~~~~The anharmonic oscillator, having
\begin{equation}
{\bf H} = \frac{1}{2} {\bf p}^2 + \frac{1}{2} \omega^2_0 {\bf q}^2 + \frac{1}{3}
\lambda {\bf q}^3
\end{equation}
has already been considered in detail by Heisenberg. Nevertheless, its
investigation will here be renewed with the aim of determining the {\it most general} solution
of the fundamental equations for this case. If the basic equations of the present theory are
indeed complete and do not require to be supplemented any further, then
the absolute
values $|q(nm)|,~ |p(nm)|$ of the elements of the matrices ${\bf q}$
and ${\bf p}$ must {\it uniquely} be
determined by these equations, and thus it becomes important to check
this for the
example (51). On the other hand, it is to be expected that an
uncertainty will still persist
with respect to the phases $\phi_{nm}, \varphi_{nm}$ in the relations
$$
q(nm) = |q(nm)|e^{i \phi_{nm}},
$$
$$
p(nm) = |p(nm)|e^{i \varphi_{nm}}.
$$
For the statistical theory, e.g., of the interaction of quantised
atoms with external
radiation fields, it becomes of fundamental importance to ascertain the
precise degree of such uncertainty.
\subsection*{
5. Harmonic oscillator}
~~~~The starting point in our considerations is the theory of the
harmonic oscillator; for small
$\lambda$, one can regard the motion expressed by equation (51) as a perturbation of the normal harmonic oscillation having energy
\begin{equation}
{\bf H} = \frac{1}{2} {\bf p}^2 + \frac{1}{2} \omega^2_0 {\bf q}^2.
\end{equation}
Even for this simple problem it is necessary to supplement Heisenberg's analysis. This
latter employs correspondence considerations to arrive at significant deductions as to the
form of the solution: namely, since classically only a {\it single} harmonic component is
present, Heisenberg selects a matrix which represents transitions between adjacent states only, and which thus has the form
\begin{equation}
{\bf q} = \left(
\begin{array}{cccccc}
0&q(01)&0&0&0&\ldots\\
q(10)&0&q(12)&0&0& \ldots\\
0&q(21)&0&q(23)&0 &\ldots\\
\ldots&\ldots&\ldots&\ldots&\ldots&\ldots
\end{array} \right).
\end{equation}
We here strive to build up the entire theory independently, without
invoking assistance
from classical theory on the basis of the principle of correspondence.
We shall therefore
investigate whether the form of the matrix (53) cannot itself
be derived from the basic
formulae or, if this proves impossible, which additional postulates are required.
From what has been stated in \S 3 regarding the invariance with
respect to
permutation of rows and columns, one can see right away that the exact
form of
the matrix (53) can never be deduced from the fundamental equations, since if
rows and columns be subjected to the same permutation, the canonical
equations and the quantum condition remain invariant and thereby
one obtains a
new and apparently different solution. But all such solutions
naturally differ
only in the notation, i.e., in the way the elements are numbered.
We seek to
prove that through a mere renumbering of its elements, the solution
can always be brought into the form (53). The equation of motion
\begin{equation}
{\bf \ddot q} + \omega^2_0 {\bf q} = 0
\end{equation}
runs as follows for the elements:
\begin{equation}
(\nu^2(nm) - \nu^2_0)q(nm) = 0,
\end{equation}
where
$$
\omega_0 = 2 \pi \nu_0, \quad h \nu(nm) = W_n - W_m.
$$
From the stronger quantum condition
\begin{equation}
{\bf pq} - {\bf qp} = \frac{h}{2 \pi i} {\bf 1},
\end{equation}
it follows that for each $n$ there must exist a corresponding $n'$
such that $q(nn') \ne 0,$
since if there were a value of $n$ for which all $q(nn')$ were equal
to zero, then the
$n$th diagonal element of ${\bf pq-qp}$ would be zero, which
contradicts the quantum
condition. Hence equation (55) implies that there is always an $n'$ for which
$$
|W_n - W_{n'}| = h \nu_0.
$$
But since we have assumed in our basic principles that when $n \ne m$,
the energies
are always unequal $(W_n \ne W_m)$, it follows that at most two
such indices $n'$ and
$n''$ can exist, for the corresponding $W_{n'}, W_{n''}$ are solutions
of the quadratic equation
$$
(W_n - x)^2 = h^2 \nu_0^2;
$$
and if indeed {\it two} such indices $n', n''$ exist, it follows that
the corresponding frequencies must be related as:
\begin{equation}
\nu(nn') = - \nu(nn'').
\end{equation}
Now from (56) we get
\begin{equation}
\sum \limits_k \nu(kn)|q(nk)|^2 = \nu (n'n) \{|q(nn')|^2 -|q(nn'')|^2\} = h/8 \pi^2,
\end{equation}
and the energy (52) ensues as
$$
\begin{array}{ll}
H(nm)& = \frac{\displaystyle 1}{\displaystyle 2} \times 4 \pi^2 \sum \limits_k \{-\nu(nk) \nu(km) q(nk)q(km) + \nu^2_0 q(nk) q(km)\}\\\\
&=2 \pi^2 \sum \limits_k q(nk) q(km)\{\nu_0^2 - \nu(nk) \nu(km)\}.
\end{array}
$$
In particular, for $m=n$ we have
\begin{equation}
H(nn) = W_n = 4 \pi^2 \nu_0^2 (|q(nn')|^2 + |q(nn'')|^2).
\end{equation}
Moreover, we can now distinguish between three possible cases:
\begin{itemize}
\item[(a)] no $n''$ exists and one has $W_{n'} > W_n$;
\item[(b)] no $n''$ exists and one has $W_{n'} < W_{n}$;
\item[(c)] $n''$ exists.
\end{itemize}
In case (b) we now consider $n'$ in place of $n;$ to this there
belong at most two
indices $(n')'$ and $(n')''$ and of these, one has to equal $n.$
We thereby revert to one
of the cases (a) or (c) and can accordingly omit further
consideration of (b).
In case (a), $\nu(n'n) = + \nu_0$ and from (58) it follows that
\begin{equation}
\nu_0|q(nn')|^2 = h/8 \pi^2,
\end{equation}
and thus from (59) that
$$
W_n = H(nn) = 4 \pi^2 \nu_0^2|q(nn')|^2 = \frac{1}{2} \nu_0 h.
$$
Because of the assumption that $W_n \ne W_m$ for $n \ne m$ there is
thus at most {\it one} index $n=n_0$ for which the case (a) applies.
If such an $n_0$ exists, we can specify a series of numbers $n_0, n_1, n_2,
n_3, \ldots,$ such that
$(n_k)' = n_{k+1}$ and $W_{n_{k+1}} > W_{n_k}.$ Then invariably
$(n_{k+1})'' = n_k$. Hence for $k > 0$, equations (58) and (59) give
\begin{equation}
H(n_k n_k) = 4 \pi^2 \nu_0^2 \{ |q(n_k, n_{k+1})|^2 + |q(n_k, n_{k-1})|^2\},
\end{equation}
\begin{equation}
\frac{1}{2} h = 4 \pi^2 \nu_0\{|q(n_k, n_{k+1})|^2 - |q(n_k, n_{k-1})|^2
\}.
\end{equation}
From (60) and (62) it follows that
\begin{equation}
|q(n_k, n_{k+1})|^2 = \frac{h}{8 \pi^2 \nu_0} (k+1),
\end{equation}
and thence from (61) that
\begin{equation}
W_{n_k} = H(n_k, n_k) = \nu_0 h(k + \frac{1}{2}).
\end{equation}
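(Editorial aside, not part of the original text.) The normal form (53) with the amplitudes (63) and energies (64) can be checked numerically. In units with $\hbar = \omega_0 = 1$ (so that $h\nu_0 = 1$ and $h/8\pi^2\nu_0 = 1/2$), and with all phases set to zero, the energy matrix (52) built from these elements is diagonal with entries $k + 1/2$, apart from the truncation corner:

```python
import numpy as np

N = 10                                  # truncation size (illustrative)
k = np.arange(N - 1)
qup = np.sqrt((k + 1) / 2)              # |q(n_k, n_{k+1})| from (63), units hbar = omega_0 = 1
q = np.diag(qup, 1) + np.diag(qup, -1)  # Heisenberg's normal form (53), phases set to zero
p = 1j * (np.diag(qup, -1) - np.diag(qup, 1))   # p(nm) = i omega(nm) q(nm)

H = 0.5 * p @ p + 0.5 * q @ q           # energy matrix (52)

# H is diagonal, and H(kk) = k + 1/2 as in (64) -- apart from the truncation corner:
assert np.allclose(H - np.diag(np.diag(H)), 0)
assert np.allclose(np.diag(H)[:-1], np.arange(N - 1) + 0.5)
```

The off-diagonal elements of $\frac{1}{2}{\bf p}^2$ and $\frac{1}{2}\omega_0^2{\bf q}^2$ cancel exactly, which is the statement that ${\bf H}$ is a diagonal matrix.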
Now, we still have to check whether it be possible that there is no
value of $n$ for which case (a) applies. Beginning with an
arbitrary $n_0$ we can then build
$n_0' = n_1$ and $n_0'' = n_{-1}$ and with each of these latter write
$n_1'=n_2$, $n_1''=n_0$ and $n'_{-1} = n_0,~ n''_{-1} = n_{-2}$
etc. In this manner we obtain a series of numbers
$$
\ldots n_{-2},~ n_{-1},~ n_0,~ n_1,~ n_2, \ldots, \eqno(65)
$$
and equations (61),
(62) hold for every $k$ between $- \infty$ and $+ \infty$. But this
is impossible, since by (62) the quantities $x_k = |q(n_{k+1}, n_k)|^2$
form an equispaced series of numbers,
and since they are positive, there must be a least value. The relevant
index can
then again be designated as $n_0$ and we thereby revert to the
previous case -- thus here also, the formulae (63), (64) apply.
One can further see that every number $n$ must be contained within the
numbers $n_k$, since otherwise one could construct a new series (65)
proceeding from $n,$ and for this formula (60) would again hold. The
starting terms of both series would then have the same value $W_n = H(nn)$, which is not possible.
This proves that the indices $0,~ 1,~ 2,~ 3 \ldots$ can be rearranged
into a new sequence
$n_0, n_1, n_2, n_3 \ldots$ such that formulae (63), (64) apply:
with these new indices, the solution then takes on Heisenberg's form
(53). Hence this appears as the ``normal form'' of the general
solution. By virtue of (64), it possesses the property that
$$
W_{n_{k+1}} > W_{n_k}.
$$
If, inversely, one stipulates that $W_n=H(nn)$ should always increase
with $n,$ then it necessarily follows that $n_k = k$; this principle
thus uniquely establishes the
normal form of the solution. But thereby only the notation becomes fixed and the calculation more transparent: nothing new is conferred {\it physically.}
Therein lies the big difference between this and the previously
adopted semiclassical methods of determining the stationary states.
The classically calculated orbits merge into one another continuously;
consequently the quantum orbits selected at a later stage have a
particular
sequence right from the outset. The new mechanics presents itself as an
essentially discontinuous theory in that herein there is no question
of a sequence of quantum states defined by the physical process,
but rather of quantum numbers which are indeed no more
than distinguishing indices which can be ordered and normalized
according to any practical standpoint whatsoever (e.g., according to
increasing energy $W_n$).
\subsection*{
6. Anharmonic oscillator}
~~~~The equations of motion
\setcounter{equation}{65}
\begin{equation}
{\bf \ddot q} + \omega^2_0 {\bf q} + \lambda{\bf q}^2 = 0,
\end{equation}
together with the quantum condition yield the following system of equations for the elements:
\begin{equation}
\begin{array}{c}
(\omega_0^2 - \omega^2(nm))q(nm) + \lambda \sum \limits_k q(nk) q(km) = 0,\\
\sum \limits_k \omega(nk) q(nk) q(kn) = - h/4 \pi.
\end{array}
\end{equation}
We introduce series expansions
\begin{equation}
\begin{array}{l}
\omega(nm) = \omega^0(nm) + \lambda \omega^{(1)} (nm) + \lambda^2 \omega^{(2)}(nm)
+ \ldots\\
q(nm) = q^0(nm) + \lambda q^{(1)}(nm) + \lambda^2 q^{(2)}(nm) + \ldots
\end{array}
\end{equation}
in seeking the solution.
When $\lambda =0$, one has the case of the harmonic oscillator
considered in the previous
section; we write the solution (53) in the form
\begin{equation}
q^0(nm) = a_n \delta_{n, m-1} + \overline{a_m} \delta_{n-1,m},
\end{equation}
where the bar denotes the conjugate complex value. If one builds
the square or higher
powers of the matrix ${\bf q}^0=(q^0(nm))$, one arrives at matrices
of similar form, being composed of sums of terms
\begin{equation}
(\xi)^{(p)}_{nm} = \xi_n \delta_{n, m-p} + \overline{\xi_m} \delta_{n-p, m}.
\end{equation}
This prompts us to try a solution of the form
\begin{equation}
\begin{array}{c}
q^0(nm) = (a)^{(1)}_{nm},\\
q^{(1)}(nm) = (x)^0_{nm} + (x')^{(2)}_{nm},\\
q^{(2)}(nm) = (y)^{(1)}_{nm} + (y')^{(3)}_{nm},\\
\ldots \ldots \ldots \ldots \ldots \ldots \ldots \ldots \ldots
\end{array}
\end{equation}
in which odd and even values of the index $p$ always alternate. If one
actually inserts this in the approximation equations
\begin{equation}
\lambda: \left\{
\begin{array}{rl}
(\omega_0^2 &- \omega^0(nm)^2) q^{(1)}(nm) - 2 \omega^0(nm)\omega^{(1)}(nm)
q^0(nm)\\
&+ \sum \limits_k q^0(nk) q^0(km) = 0,\\
\sum \limits_k \{ \omega^0(nk)& (q^0(nk) q^{(1)}(kn) + q^{(1)}(nk) q^0(kn))\\
&+\omega^{(1)}(nk) q^0(nk) q^0(kn)\} = 0,
\end{array} \right\}
\end{equation}
\begin{equation}
\lambda^2: \left\{
\begin{array}{rl}
(\omega_0^2 &-\omega^0(nm)^2) q^{(2)}(nm) - 2 \omega^0(nm) \omega^{(1)}(nm)
q^{(1)}(nm)\\
&-(\omega^{(1)}(nm)^2 + 2 \omega^0(nm) \omega^{(2)}(nm)) q^0(nm)\\
&+\sum \limits_k (q^0(nk) q^{(1)}(km) + q^{(1)}(nk) q^0(km)) =0,\\\\
\sum \limits_k \{\omega^0(nk)& (q^0(nk) q^{(2)}(km) + q^{(1)}(nk) q^{(1)}(km)\\
&+q^{(2)}(nk) q^0(km)) + \omega^{(1)}(nk) (q^0(nk) q^{(1)}(km)\\
&+q^{(1)}(nk) q^0(km)) + \omega^{(2)}(nk) q^0(nk)q^0(km)\} = 0
\end{array} \right\}
\end{equation}
and notes the multiplication rule
\begin{equation}
\begin{array}{ll}
\sum \limits_k \Omega_{nkm}(\xi)^{(p)}_{nk} (\eta)^{(q)}_{km} &= \Omega_{n,
n+p, n+p+q} \xi_n \eta_{n+p} \delta_{n, m-p-q}\\
&+\Omega_{n,n+p,n+p-q} \xi_n \bar \eta_{n+p-q} \delta_{n,m-p+q}\\
&+\Omega_{n,n-p,n-p+q} \bar \xi_{n-p} \eta_{n-p} \delta_{n,m+p-q}\\
&+\Omega_{n,n-p,n-p-q} \bar \xi_{n-p} \bar \eta_{n-p-q} \delta_{n,m+p+q},
\end{array}
\end{equation}
one sees, in setting each of the factors of $\delta_{n,m-s}$ singly
to zero, that through the
substitution (71) all conditions can in fact be satisfied and that
higher terms in (71) would identically vanish.
In detail, the calculation yields the following:
The first of the equations (72) gives, after substitution of the expressions (71),
\begin{equation}
\left.
\begin{array}{ll}
2 \omega_0^2 x_n + |a_n|^2&+|a_{n-1}|^2 = 0,\\
-3 \omega_0^2 x'_n&+a_na_{n+1} = 0,\\
&\omega^{(1)}_{n,n-1} = 0,
\end{array} \right\}
\end{equation}
and the second is identically satisfied. One thus has
\begin{equation} \left\{
\begin{array}{ll}
x_n = - \frac{\displaystyle |a_n|^2 + |a_{n-1}|^2}{\displaystyle 2 \omega_0^2},\\\\
x'_n = \frac{\displaystyle a_na_{n+1}}{\displaystyle 3 \omega_0^2}.
\end{array} \right\}
\end{equation}
The first of the equations (73) yields
\begin{equation}
\left.
\begin{array}{r}
2 \omega_0 a_n \omega^{(2)}_{n,n+1} + 2a_nx_{n+1} + 2a_n x_n + \bar a_{n-1}
x'_{n-1} + \bar a_{n+1} x'_n = 0,\\
- 8 \omega^2_0 y'_n + a_n x'_{n+1} + a_{n+2} x'_n = 0,\\
\omega^{(1)}_{n,n-2} = 0,
\end{array} \right\}
\end{equation}
whereas the second equation is not identically satisfied, but
furnishes a relation from which $y_n$ can be determined:
\begin{equation}
\begin{array}{r}
a_n \bar y_n + \bar a_n y_n - a_{n-1} \bar y_{n-1} - \bar a_{n-1}
y_{n-1} + 2|x'_n|^2 - 2|x'_{n-2}|^2\\
- \frac{\displaystyle \omega^{(2)}_{n,n+1}}{\displaystyle \omega_0} |a_n|^2
- \frac{\displaystyle \omega^{(2)}_{n,n-1}}{\displaystyle \omega_0} |a_{n-1}|^2
= 0.
\end{array}
\end{equation}
The solution is:
\begin{equation}
\left.
\begin{array}{rl}
\omega^{(2)}_{n,n+1} &= \frac{\displaystyle 1}{\displaystyle 3 \omega_0^3}
(|a_{n+1}|^2 + |a_{n-1}|^2 + 3 |a_n|^2),\\
y'_n& = \frac{\displaystyle 1}{\displaystyle 12 \omega_0^4} a_n a_{n+1} a_{n+2}.
\end{array} \right\}
\end{equation}
Further, if for brevity one introduces
\begin{equation}
\eta_n = a_n \bar y_n + \bar a_n y_n,
\end{equation}
then the $\eta_n$ are determined by the equation
\begin{equation}
\eta_n - \eta_{n-1} = \frac{1}{\omega_0^4} (|a_n|^4 - |a_{n-1}|^4 + \frac{1}{9}
|a_n|^2 |a_{n+1}|^2 - \frac{1}{9} |a_{n-1}|^2 |a_{n-2}|^2).
\end{equation}
Expressions (76) and (79) show that the quantities $x_n, x'_n, y'_n$
can be expressed through
the solution of the zero-th order approximation $a_n$. Thus their
phases are determined by
those of the harmonic oscillator. For the quantities $y_n,$ the situation seems to be different,
since although $\eta_n$ can uniquely be determined from (81), $y_n$
cannot be obtained
absolutely from (80). It is probable that the next higher order of approximation gives rise to an auxiliary determining equation for $y_n.$ We have
to leave this question open here but
we should like to indicate its significance as a point of
principle in regard to the
completeness of the entire theory. All questions of statistics invariably depend finally
upon whether or not our supposition be valid that, of the phases of the $q(nm),$
{\it one} in each row (or each column) of the matrix remains
undetermined.
In conclusion we present the explicit formulae which are obtained by
substituting the solution of the harmonic oscillator found previously
(\S 5). In normal form, by (63), this runs as follows:
\begin{equation}
a_n = \sqrt{C(n+1)}e^{i \varphi_n}, \quad C=h/4 \pi \omega_0 = h/8 \pi^2
\nu_0.
\end{equation}
Thence,
using (76), (79), (81) one obtains
\begin{equation}
\left.
\begin{array}{l}
x_n = - \frac{\displaystyle C}{\displaystyle 2 \omega_0^2} (2n + 1),\\\\
x'_n = \frac{\displaystyle C}{\displaystyle 3 \omega_0^2} \sqrt{(n+1)(n+2)}
e^{i(\varphi_n + \varphi_{n+1})}\\\\
y'_n = \frac{\displaystyle \sqrt{C^3}}{\displaystyle 12 \omega_0^4} \sqrt{(n+1)(n+2)(n+3)}
e^{i(\varphi_n + \varphi_{n+1} + \varphi_{n+2})}
\end{array} \right\}
\end{equation}
\begin{equation}
\left.
\begin{array}{l}
\omega^{(1)}_{n,n-1} = 0, \quad \omega^{(1)}_{n,n-2} = 0,\\\\
\omega^{(2)}_{n,n-1} = - \frac{\displaystyle 5C}{\displaystyle 3 \omega_0^3}
n;
\end{array} \right\}
\end{equation}
that is,
$$
\eta_n - \eta_{n-1} = \frac{11C^2}{9 \omega_0^4} (2n+1),
$$
$$
\eta_n = a_n \bar y_n + \bar a_n y_n = \frac{11C^2}{9 \omega_0^4} (n+1)^2.
$$
If one sets $y_n = |y_n|e^{i \psi_n}$, then
\begin{equation}
|y_n| \cos (\varphi_n - \psi_n) = \frac{\eta_n}{2|a_n|} = \frac{11 \sqrt{C^3}}{18
\omega_0^4} \sqrt{(n+1)^3}.
\end{equation}
In this approximation, $y_n$ cannot be specified any more closely than this.
However, we should like to write out the final equations when one
makes the assumption that $\psi_n = \varphi_n$. These are as follows
(up to terms of higher than second order in $\lambda$):
\begin{equation} \left.
\begin{array}{l}
\omega(n,n-1) = \omega_0 - \lambda^2 \frac{\displaystyle 5C}{\displaystyle
3 \omega_0^3}
n + \ldots,\\\\
\omega(n,n-2) = 2 \omega_0 + \ldots;
\end{array} \right\}
\end{equation}
\begin{equation}
\left.
\begin{array}{rl}
q(n,n)&= - \lambda \frac{\displaystyle C}{\displaystyle \omega_0^2} (2n +
1) + \ldots,\\\\
q(n,n-1)&= \sqrt{Cn}e^{i \varphi_{n-1}} \left( 1 + \lambda^2 \frac{\displaystyle 11 Cn}{\displaystyle 18 \omega_0^4} + \ldots \right),\\\\
q(n,n-2)&= \lambda \frac{\displaystyle C}{\displaystyle 3 \omega_0^2} \sqrt{n(n-1)}
e^{i(\varphi_{n-1} + \varphi_{n-2})} + \ldots,\\\\
q(n,n-3)&= \lambda^2 \frac{\displaystyle \sqrt{C^3}}{\displaystyle 12 \omega_0^4}
\sqrt{n(n-1)(n-2)} e^{i(\varphi_{n-1} + \varphi_{n-2} + \varphi_{n-3})} +
\ldots
\end{array} \right\}
\end{equation}
We have also calculated the energy directly and derived the following formula;
\begin{equation}
W_n = h \nu_0 \left(n + \frac{1}{2} \right) - \lambda^2 \frac{5C^2}{3 \omega_0^2}
\left(n(n+1) + \frac{11}{30} \right) + \ldots
\end{equation}
The frequency condition is actually satisfied, since, remembering
(82), we have
$$
\begin{array}{ll}
W_n - W_{n-1} = h \nu_0 - \lambda^2 \frac{\displaystyle 10C^2}{\displaystyle 3 \omega_0^2} n + \ldots &=\frac{\displaystyle h}{\displaystyle 2 \pi} \omega(n,n-1),\\
W_n - W_{n-2} = 2 h \nu_0 + \ldots & = \frac{\displaystyle h}{\displaystyle 2 \pi} \omega(n,n-2).
\end{array}
$$
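(Editorial aside, not part of the original text.) Formula (88) can also be confronted with a direct numerical diagonalization of a truncated matrix Hamiltonian consistent with the equation of motion (66), namely $H = \frac{1}{2}p^2 + \frac{1}{2}\omega_0^2 q^2 + \frac{1}{3}\lambda q^3$. Units $\hbar = \omega_0 = 1$ give $h\nu_0 = 1$ and $C = 1/2$; the basis size $N$ and the value of $\lambda$ below are illustrative choices, and the constant $11/30$ is the value obtained from standard second-order perturbation theory.

```python
import numpy as np

N, lam = 120, 0.02                        # truncated basis size and a small lambda
k = np.arange(N - 1)
s = np.sqrt((k + 1) / 2)
q = np.diag(s, 1) + np.diag(s, -1)        # harmonic-oscillator position matrix
p = 1j * (np.diag(s, -1) - np.diag(s, 1)) # harmonic-oscillator momentum matrix

# H = p^2/2 + q^2/2 + (lam/3) q^3, consistent with q'' + q + lam q^2 = 0
H = (0.5 * (p @ p) + 0.5 * (q @ q) + (lam / 3) * (q @ q @ q)).real
W = np.sort(np.linalg.eigvalsh(H))        # H is real symmetric here

n = np.arange(4)
C = 0.5
W_pert = (n + 0.5) - lam**2 * (5 * C**2 / 3) * (n * (n + 1) + 11.0 / 30.0)
assert np.allclose(W[:4], W_pert, atol=1e-4)
```

For small $\lambda$ the low-lying eigenvalues of the truncated matrix agree with the second-order formula to well within the stated tolerance; higher orders in $\lambda$ account for the residual difference.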
With the formula (88) we can associate the observation that already
in terms of lowest
order there occurs a discrepancy from classical theory which can formally be removed
by the introduction of a ``half-integer'' quantum number $n' = n + 1/2$. This has already been
remarked by Heisenberg. Incidentally, our expressions $\omega(n,n-1)$
as given by (86)
agree {\it exactly} with the classical frequencies in all respects. For comparison, we note the classical energy to be\footnote{See M. Born, Atommechanik (Berlin,
1925), Chapter 4, \S 42, p. 294; one has to set $a=1/3$ in the formula (6)
in order to obtain agreement with the present treatment.}
$$
W_n^{(cl)} = h \nu_0 n - \lambda^2 \frac{5C^2}{3 \omega_0^2} n^2 + \ldots,
$$
and thus the classical frequency to be:
$$
\begin{array}{ll}
\nu_{cl}&= \frac{\displaystyle 1}{\displaystyle h} \frac{\displaystyle \partial W^{(cl)}_n}{\displaystyle \partial n} = \nu_0 - \lambda^2 \frac{\displaystyle 10C^2}{\displaystyle 3 h \omega_0^2} n + \ldots\\\\
&= \nu_{qu}(n,n-1) = \frac{\displaystyle 1}{\displaystyle h} (W^{(qu)}_n
- W^{(qu)}_{n-1}).
\end{array}
$$
We have, finally, checked that the expression (88) can also be derived from
the Kramers-Born perturbation formula (up to an additive constant).
\end{document}