Research Exposition, Arnold Mathematical Journal

Received: 5 July 2015 / Revised: 7 September 2015 / Accepted: 24 October 2015

Kepler’s Laws and Conic Sections

Alexander Givental Department of Mathematics UC Berkeley
Berkeley CA 94720 USA,
Center for Geometry and Physics of the Institute for Basic Science POSTECH Pohang Korea
givental@math.berkeley.edu

Abstract

The geometry of Kepler’s problem is elucidated by lifting the motion from the $(x,y)$-plane to the cone $r^{2}=x^{2}+y^{2}$.

Keywords
Kepler’s laws, Conic section

Introduction

In my senior year at Moscow High School No. 2, my astronomy term paper was about Kepler’s laws of planetary motion. (Astronomy teacher: N. V. Mart’yanova; geometry and algebra teacher: Z. M. Fotieva; calculus teacher: V. A. Senderov.) The intention was to derive the three famous empirical laws from the first principles of Newtonian gravitation. Assuming the 1st law, which stipulates that each orbit is a conic section with the center of attraction at its focus, I had no difficulty deriving the 2nd and 3rd laws (about conservation of sectorial velocity, and about the relation between the periods of revolution and the sizes of the orbits, respectively). However, the derivation of the 1st law per se resisted my efforts, apparently for lack of skill in solving ordinary differential equations.

Physics and math were taught in our school at an advanced level, and so I had no problems with differentiation of functions, and even with setting Kepler’s problem in the ODE form:

$\displaystyle\ddot{{\mathbf{r}}}=-k\frac{{\mathbf{r}}}{|{\mathbf{r}}|^{3}}.$

Here ${\mathbf{r}}$ denotes the radius-vector of the planet with respect to the center of attraction, and $k$ is the coefficient of proportionality in the Newton law “$F=ma$” between the acceleration and the “inverse-square radius.” Yet, solving such ODEs was beyond our syllabus.

The unfortunate deficiency was overcome in college, but the traditional solution of Kepler’s problem—by integrating the ODE in polar coordinates—left a residue of frustration. Namely, it remained unclear why, after all, the trajectories are conic sections. So, the high-school project became transformed into the question:

Where is the cone whose sections are the planetary orbits?

In this article, the answer to this question is presented among several other related achievements, including a derivation of Kepler’s 1st law for readers familiar with the cross-product of vectors and differentiation, but not necessarily integration or ODEs.

Thirty years ago, V. I. Arnold included an abstract of this article in his survey of classical mechanics written jointly with V. V. Kozlov and A. I. Neishtadt ([Arnold et al.1987]). Since then, I have given several presentations of this elementary topic (at a lunch seminar, a student colloquium, a math circle, and in some lecture courses), but it might still be worth appearing in print.

In fact, there is no lack of accounts of Kepler’s problem, including some of an elementary nature (e.g. [Arnold1990], [Goodstein and Goodstein1996], [Milnor1983]), though in those I have seen, I do not notice much overlap with ours. In any case, our treatment can hardly provide anything but one more geometric interpretation of what’s been revolving around for ages.

1 Conservation of angular momentum

The angular momentum vector is defined as the cross-product

$\displaystyle{\mathbf{M}}=m({\mathbf{r}}\times\dot{{\mathbf{r}}}),$

where $m$ is the mass of the planet. Differentiating, one finds

$\displaystyle\dot{{\mathbf{M}}}=m(\dot{{\mathbf{r}}}\times\dot{{\mathbf{r}}})+ m({\mathbf{r}}\times\ddot{{\mathbf{r}}})={\mathbf{0}},$

whenever the force field is central [i.e. $m\ddot{{\mathbf{r}}}={\mathbf{F}}({\mathbf{r}})$ is proportional to the radius-vector ${\mathbf{r}}$]. Thus, in a central force field, the angular momentum vector is conserved. In particular, the direction of ${\mathbf{M}}$ is conserved, that is, the motion of the planet occurs in the plane spanned by the radius-vector ${\mathbf{r}}(0)$ and the velocity vector $\dot{{\mathbf{r}}}(0)$ at the initial moment $t=0$. Thereby Kepler’s problem in 3D is reduced to Kepler’s problem in 2D.
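The conservation of ${\mathbf{M}}$ can also be observed numerically. The following is a minimal sketch, not part of the original argument: it integrates $\ddot{{\mathbf{r}}}=-k{\mathbf{r}}/|{\mathbf{r}}|^{3}$ in 3D with the velocity-Verlet scheme (our choice of integrator) and records the largest change in the components of ${\mathbf{M}}$. The function names, the initial data, and the step size are illustrative assumptions.

```python
# Velocity-Verlet integration of r'' = -k r / |r|^3 in 3D (pure Python).
# All names and numerical parameters here are illustrative assumptions.

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def accel(r, k):
    """Acceleration -k r / |r|^3 of the central inverse-square field."""
    d3 = (r[0] ** 2 + r[1] ** 2 + r[2] ** 2) ** 1.5
    return tuple(-k * c / d3 for c in r)


def angular_momentum_drift(r, v, k=1.0, m=1.0, dt=1e-3, steps=5000):
    """Largest componentwise change of M = m (r x v) along the run."""
    m0 = tuple(m * c for c in cross(r, v))
    a = accel(r, k)
    drift = 0.0
    for _ in range(steps):
        r = tuple(ri + dt * vi + 0.5 * dt * dt * ai
                  for ri, vi, ai in zip(r, v, a))
        a_new = accel(r, k)
        v = tuple(vi + 0.5 * dt * (ai + bi)
                  for vi, ai, bi in zip(v, a, a_new))
        a = a_new
        mt = tuple(m * c for c in cross(r, v))
        drift = max(drift, max(abs(p - q) for p, q in zip(mt, m0)))
    return drift


drift = angular_momentum_drift((1.0, 0.0, 0.0), (0.0, 0.9, 0.5))
print(drift)  # expected to be at the level of rounding errors
```

For a central force this particular scheme preserves ${\mathbf{r}}\times\dot{{\mathbf{r}}}$ step by step (up to rounding), which is why the drift stays tiny even though the positions themselves are only approximate.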

2 Kepler’s 2nd law

Kepler’s 2nd law describes the relative times the celestial body spends in various parts of its trajectory: in equal times, the radius-vector of the body sweeps out equal areas. In other words, the “sectorial velocity” is constant. This is true for motion in any central force field and follows from the conservation of the angular momentum vector ${\mathbf{M}}=m({\mathbf{r}}\times\dot{{\mathbf{r}}})$. In particular, the length $\mu:=m|{\mathbf{r}}\times\dot{{\mathbf{r}}}|$ of the angular momentum vector is conserved, and so is the sectorial velocity $|{\mathbf{r}}\times\dot{{\mathbf{r}}}|/2=\mu/2m$.
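As an illustration of the equal-areas statement, here is a small sketch under our own assumptions: a planar velocity-Verlet integration with $k=1$, in which the area swept by the radius-vector is approximated by the sum of thin triangles spanned by consecutive positions. The function name, the window length `chunk`, and the initial data are hypothetical choices.

```python
# Areas swept by the radius-vector over equal time windows (pure Python).
# Integrator, step size and initial data are illustrative assumptions.

def swept_areas(r, v, k=1.0, dt=1e-3, steps=4000, chunk=500):
    """Return the areas swept during consecutive windows of `chunk` steps."""

    def accel(p):
        d3 = (p[0] ** 2 + p[1] ** 2) ** 1.5
        return (-k * p[0] / d3, -k * p[1] / d3)

    a = accel(r)
    areas, window = [], 0.0
    for n in range(1, steps + 1):
        new_r = (r[0] + dt * v[0] + 0.5 * dt * dt * a[0],
                 r[1] + dt * v[1] + 0.5 * dt * dt * a[1])
        # Signed area of the triangle (origin, r, new_r).
        window += 0.5 * (r[0] * new_r[1] - r[1] * new_r[0])
        new_a = accel(new_r)
        v = (v[0] + 0.5 * dt * (a[0] + new_a[0]),
             v[1] + 0.5 * dt * (a[1] + new_a[1]))
        r, a = new_r, new_a
        if n % chunk == 0:
            areas.append(window)
            window = 0.0
    return areas


areas = swept_areas((1.0, 0.0), (0.0, 1.2))
print(areas)  # the entries should agree with each other
```

With these initial data $|{\mathbf{r}}\times\dot{{\mathbf{r}}}|=1.2$, so each window of duration $0.5$ should sweep an area close to $0.5\cdot 1.2/2=0.3$.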

3 The cone and the orbits.

In the 3-dimensional space with coordinates $(x,y,r)$, consider the cone given by the equation

$\displaystyle r^{2}=x^{2}+y^{2}.$

A purely geometric theorem below recasts Kepler’s 1st law as the claim that in Kepler’s 2D problem with the center of attraction located at the origin $(x,y)=(0,0)$, trajectories are the projections to the $(x,y)$-plane of plane sections of this cone.

Definition.

The geometric locus of points on the plane with the fixed ratio (called eccentricity and denoted $e$) between the distances to a given point (called focus) and a given line (called directrix) is a quadratic curve called ellipse, when $e<1$, parabola, when $e=1$, and hyperbola, when $e>1$.

This is one of the classical definitions of ellipses, parabolas, and hyperbolas. The case of circles corresponds to $e=0$ and the directrix “located at infinity”.

Theorem.

The projection of a conic section to the $(x,y)$-plane is a quadratic curve whose focus is the vertex of the cone, directrix is the line of intersection of the cutting plane with the plane $r=0$, and eccentricity is equal to the tangent of the angle between the planes.

Figure 1: An orbit’s focus, directrix, and eccentricity
Proof.

Since generators ($PO$, see Fig. 1) of the cone $r^{2}=x^{2}+y^{2}$ make $45^{\circ}$ with the plane $r=0$, the distance from a point ($P$) on the cone to its projection ($P^{\prime}$) to the plane is equal to the distance from the projection to the vertex of the cone ($PP^{\prime}=OP^{\prime}$), and for points of the same plane section is proportional to the distance ($QP^{\prime}$) from the projection to the line of intersection of the secting plane and the plane $r=0$, with the coefficient of proportionality equal to the slope ($\tan(\angle PQP^{\prime})$) of the secting plane. $\square$
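The theorem is easy to test numerically. In the sketch below (our own illustration, with hypothetical names), the cutting plane is taken in the form $r=h+sx$ with slope $0<s<1$ and height $h>0$ over the vertex, so that the directrix is the line $x=-h/s$ in the plane $r=0$. Points of the section are parametrized by the polar angle $\theta$ of their projections, and the ratio of the two distances is computed for each of them.

```python
import math


def focus_directrix_ratios(slope, height, n=12):
    """For the plane r = height + slope * x cutting the cone r^2 = x^2 + y^2,
    return |OP'| / dist(P', directrix) for n projected points P'.

    Assumes 0 < slope < 1 (elliptic section) and height > 0.
    """
    ratios = []
    for i in range(n):
        theta = 2 * math.pi * i / n
        # On the cone r = |OP'|, and x = r cos(theta), so
        # r = height + slope * r * cos(theta).
        r = height / (1 - slope * math.cos(theta))
        x, y = r * math.cos(theta), r * math.sin(theta)
        dist_focus = math.hypot(x, y)
        dist_directrix = abs(x + height / slope)  # directrix: x = -height/slope
        ratios.append(dist_focus / dist_directrix)
    return ratios


ratios = focus_directrix_ratios(slope=0.5, height=1.0)
print(ratios)  # every ratio should equal the slope, i.e. the eccentricity
```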

4 Kepler’s 1st law by differentiation.

Suppose that a point is moving on the surface of the cone $r^{2}=x^{2}+y^{2}$ in such a way that the projection of the point to the plane obeys the equation of motion $\ddot{{\mathbf{r}}}=-k{\mathbf{r}}/|{\mathbf{r}}|^{3}$. We represent the position of the point on the cone by its radius-vector ${\mathbf{R}}$ in space with respect to the origin shifted by distance $l$ along the axis of the cone (Fig. 2):

$\displaystyle{\mathbf{R}}:={\mathbf{r}}+r{\mathbf{e}}-l{\mathbf{e}},\quad\text {where}\ {\mathbf{e}}=(0,0,1).$

We choose the shift to be different for different trajectories and determined by the value of sectorial velocity: $l=\mu^{2}/km^{2}$. Since it is conserved, the same value of $l$ serves all points of one trajectory.

Proposition.

In Kepler’s problem on the plane, each trajectory (with the given value of sectorial velocity), when lifted to the cone $r^{2}=x^{2}+y^{2}$, obeys the equation of motion:

$\displaystyle\ddot{{\mathbf{R}}}=-k\frac{{\mathbf{R}}}{r^{3}}.$
Proof.

Differentiating ${\mathbf{R}}$ with respect to time, we find:

$\displaystyle\dot{{\mathbf{R}}}=\dot{{\mathbf{r}}}+\frac{\dot{{\mathbf{r}}} \cdot{\mathbf{r}}}{r}{\mathbf{e}}.$

Differentiating again, we get:

$\displaystyle\ddot{{\mathbf{R}}}=\ddot{{\mathbf{r}}}+\frac{\ddot{{\mathbf{r}}}\cdot{\mathbf{r}}}{r}{\mathbf{e}}+\frac{\dot{{\mathbf{r}}}\cdot\dot{{\mathbf{r}}}}{r}{\mathbf{e}}-\frac{(\dot{{\mathbf{r}}}\cdot{\mathbf{r}})^{2}}{r^{3}}{\mathbf{e}}.$

Taking into account the equation of motion $\ddot{{\mathbf{r}}}=-k{\mathbf{r}}/|{\mathbf{r}}|^{3}$, we have:

$\displaystyle\ddot{{\mathbf{R}}}=-k\frac{{\mathbf{r}}}{r^{3}}-\frac{k}{r^{2}}{\mathbf{e}}+\frac{(\dot{{\mathbf{r}}}\cdot\dot{{\mathbf{r}}})({\mathbf{r}}\cdot{\mathbf{r}})-(\dot{{\mathbf{r}}}\cdot{\mathbf{r}})^{2}}{r^{3}}{\mathbf{e}}.$

Note that

$\displaystyle(\dot{{\mathbf{r}}}\cdot\dot{{\mathbf{r}}})({\mathbf{r}}\cdot{\mathbf{r}})-(\dot{{\mathbf{r}}}\cdot{\mathbf{r}})^{2}=|{\mathbf{r}}\times\dot{{\mathbf{r}}}|^{2}=\frac{\mu^{2}}{m^{2}}=kl,$

four times the square of the sectorial velocity. Thus, at a fixed value of sectorial velocity, we obtain:

$\displaystyle\ddot{{\mathbf{R}}}=-k\frac{{\mathbf{r}}+r{\mathbf{e}}-l{\mathbf{e}}}{r^{3}}=-k\frac{{\mathbf{R}}}{r^{3}}.$

$\square$
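The ${\mathbf{e}}$-component of the proposition is a pointwise identity and can be checked at an arbitrary state of the planar motion. The sketch below is only an illustration with hypothetical names: it computes $\ddot{r}$ by differentiating $\dot{r}=(\dot{{\mathbf{r}}}\cdot{\mathbf{r}})/r$ once more, substitutes Newton's equation for $\ddot{{\mathbf{r}}}$, and compares the result with $-k(r-l)/r^{3}$, where $l=|{\mathbf{r}}\times\dot{{\mathbf{r}}}|^{2}/k$.

```python
import math


def lifted_acceleration_check(r, v, k=1.0):
    """Compare the e-component of R'' with -k (r - l) / r^3 at one instant.

    r, v: planar position and velocity as 2-tuples (arbitrary test data).
    Returns the pair (left-hand side, right-hand side).
    """
    dist = math.hypot(r[0], r[1])
    # Newton's equation in the plane: r'' = -k r / |r|^3.
    acc = (-k * r[0] / dist ** 3, -k * r[1] / dist ** 3)
    r_dot_v = r[0] * v[0] + r[1] * v[1]
    v_dot_v = v[0] ** 2 + v[1] ** 2
    acc_dot_r = acc[0] * r[0] + acc[1] * r[1]
    # Second derivative of the scalar r = |r|.
    lhs = (acc_dot_r + v_dot_v) / dist - r_dot_v ** 2 / dist ** 3
    # l = mu^2 / (k m^2) = |r x r'|^2 / k.
    l = (r[0] * v[1] - r[1] * v[0]) ** 2 / k
    rhs = -k * (dist - l) / dist ** 3
    return lhs, rhs


lhs, rhs = lifted_acceleration_check((0.7, -1.3), (0.4, 0.9), k=2.5)
print(lhs, rhs)  # the two numbers should coincide
```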

Corollary 1.

The fictitious angular momentum ${{\mathbf{N}}}:=m({\mathbf{R}}\times\dot{{\mathbf{R}}})$ is conserved.

Figure 2: Fictitious angular momentum
Proof.

$\dot{{{\mathbf{N}}}}=m(\dot{{\mathbf{R}}}\times\dot{{\mathbf{R}}})+m({\mathbf{R}}\times\ddot{{\mathbf{R}}})={\mathbf{0}}$, since $\ddot{{\mathbf{R}}}$ is proportional to ${\mathbf{R}}$. $\square$

In particular, the direction of vector ${{\mathbf{N}}}$ is conserved, and so the trajectory lies in the section of the cone by the plane passing through the point $l{\mathbf{e}}$ and spanned by vectors ${\mathbf{R}}(0)$ and $\dot{{\mathbf{R}}}(0)$ (perpendicular to ${{\mathbf{N}}}$). We have arrived at

Corollary 2.

(Kepler’s 1st law). When lifted from the plane to the cone, Keplerian trajectories become plane sections of the cone.

Corollary 3.

Keplerian trajectories with a fixed value of sectorial velocity correspond to the sections of the cone by planes passing through the same point $l{\mathbf{e}}$ on the axis, $l=\mu^{2}/km^{2}$.
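Corollaries 2 and 3 can be illustrated by a numerical experiment. The sketch below is built on our own assumptions (a velocity-Verlet integrator, $m=1$, hypothetical names and step size): it determines the plane $r-l=Ax+By$ through the point $l{\mathbf{e}}$ from the initial data ${\mathbf{R}}(0)$, $\dot{{\mathbf{R}}}(0)$ alone, integrates the planar motion, and reports how far the lifted trajectory ever strays from that plane.

```python
import math


def plane_residual(r, v, k=1.0, dt=1e-3, steps=20000):
    """Largest value of |r - l - A x - B y| along a numerically integrated
    planar Kepler trajectory lifted to the cone r^2 = x^2 + y^2."""
    x, y = r
    vx, vy = v
    det = x * vy - y * vx            # r x r' = mu / m, nonzero by assumption
    l = det * det / k                # l = mu^2 / (k m^2)
    dist = math.hypot(x, y)
    c1 = dist - l                    # plane contains R(0)
    c2 = (x * vx + y * vy) / dist    # plane contains the direction R'(0)
    A = (c1 * vy - c2 * y) / det
    B = (c2 * x - c1 * vx) / det

    def accel(px, py):
        d3 = (px * px + py * py) ** 1.5
        return -k * px / d3, -k * py / d3

    ax, ay = accel(x, y)
    worst = 0.0
    for _ in range(steps):
        x += dt * vx + 0.5 * dt * dt * ax
        y += dt * vy + 0.5 * dt * dt * ay
        nax, nay = accel(x, y)
        vx += 0.5 * dt * (ax + nax)
        vy += 0.5 * dt * (ay + nay)
        ax, ay = nax, nay
        worst = max(worst, abs(math.hypot(x, y) - l - A * x - B * y))
    return worst


residual = plane_residual((1.0, 0.0), (0.0, 1.2))
print(residual)  # small, limited only by the accuracy of the integrator
```

For the initial data used here $l=1.44$, $A=-0.44$ and $B=0$, i.e. the secting plane is $r=1.44-0.44x$. The residual is not exactly zero because the integration is approximate.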

Note that $l$ here is non-negative. To obtain negative values of $l$ one must assume that $k<0$, i.e. that the planet is repelled from the Sun by the central anti-gravity force inversely proportional to the square of the distance (as in Coulomb’s law for electric charges of the same sign). A plane crossing the axis of the cone at a point below the plane $r=0$ can intersect the part of the cone above this plane along one of the branches of a hyperbola, namely the branch unlinked with the axis. Thus, as a bonus, we obtain:

Corollary 4.

In Kepler’s anti-gravity problem, planets move along branches of hyperbolas with the Sun at the focus, namely along those branches which (being separated from the focus by the directrix) don’t go around the Sun.

Remark.

The equation of motion on the surface of the cone splits into two: the usual Newton equation for radius-vector ${\mathbf{r}}$ on the plane, and the scalar equation

$\displaystyle\ddot{r}=-\frac{k}{r^{2}}+\frac{\mu^{2}}{m^{2}r^{3}}.$

The latter is the effective Newton equation in polar coordinates at a fixed value of sectorial velocity. The traditional solution of Kepler’s problem consists in treating the effective equation as a mechanical system with one degree of freedom and integrating it using the effective energy conservation law.

5 Dandelin’s spheres

The definition of an ellipse as a quadratic curve with the eccentricity $<1$ competes with another definition of an ellipse as the locus of points on the plane with a fixed sum of distances to two given points called foci. The following famous geometric proof of the fact that closed conic sections are ellipses was invented by French mathematician G. P. Dandelin (1794–1847).

Into a conical cup (Fig. 3), place two balls so that they, being tangent to the cone, also touch the secting plane (one from above, the other from below). The claim is that the points $F$ and $G$ of tangency with the plane are the foci of the conic section. Indeed, given a point $A$ on the conic section, the segments $AF$ and $AG$ are tangent to the respective balls at the points $F$ and $G$, since the balls touch the plane at these points. The segments $AB$ and $AC$ of the generator $OA$ of the cone are also tangent to the balls, since the balls touch the cone at $B$ and $C$ respectively. Since all tangents to the same ball from the same point have the same length, one concludes that $AF=AB$, $AG=AC$, and hence $AF+AG=BC$. The latter is the distance along a generator of the cone between the parallel circles along which the cone touches the balls. This distance does not depend on the position of $A$ on the conic section.

Exercise.

Extend Dandelin’s construction to hyperbolic sections of the cone.

Figure 3: Dandelin’s spheres

6 Osculating paraboloids

Here we reconcile the two definitions of ellipses by locating the second focus of a Keplerian orbit.

Theorem.

Into the conical cup, inscribe a paraboloid of revolution so that it touches the secting plane. Then the projection of the point of tangency to the horizontal plane is the second focus of the projected conic section (the vertex of the cone being the first one).

Proof.

First, note that a paraboloid of revolution is the locus of points equidistant from its focus lying on the axis of revolution and the directrix plane perpendicular to it. Consequently, the plane tangent to the paraboloid at a given point is the locus of points equidistant from the focus and the foot of the perpendicular dropped from the given point to the directrix. (This shows that the direction normal to the tangent plane makes equal angles with the direction to the focus and the direction of the axis, which implies the famous optical property of the parabolic mirror to reflect all rays parallel to the axis into the focus.) To justify this claim, consider the plane of all points equidistant from the focus and the aforementioned foot of the perpendicular dropped from the given point. Any other point on this plane is farther from the foot (and hence from the focus) than from the directrix, and therefore lies outside the paraboloid. Thus, the plane has only one common point with the paraboloid, and hence touches it at this point.

Lemma.

Paraboloids of revolution inscribed into the cone $r^{2}=x^{2}+y^{2}$ have the plane $r=0$ as the directrix, and the center of the circle of tangency with the cone as the focus.

Indeed, at the circle of tangency, the tangent planes make $45^{\circ}$ with the direction of the axis, hence the center of the circle must be the focus, and hence the plane $r=0$ the directrix.

Now let $P$ be a point on the section of the cone by the plane tangent to the paraboloid at $F$ (Fig. 4), $P^{\prime}$ and $P^{\prime\prime}$ be the projections of $P$ to the horizontal planes through the focus $O^{\prime}$ of the paraboloid and the vertex $O$ of the cone respectively, and likewise $F^{\prime}$ and $F^{\prime\prime}$ be the projections of $F$. Then the right triangles $PP^{\prime}O^{\prime}$ and $F^{\prime\prime}P^{\prime\prime}P$ are congruent, since $PO^{\prime}=PF^{\prime\prime}$ (as was explained before the lemma), and $P^{\prime}O^{\prime}=P^{\prime\prime}O=P^{\prime\prime}P$ (as was already noted in Section 3). Therefore $P^{\prime\prime}F^{\prime\prime}=P^{\prime}P$ and so $P^{\prime\prime}O+P^{\prime\prime}F^{\prime\prime}=P^{\prime}P^{\prime\prime}$, the distance from the focus $O^{\prime}$ of the paraboloid to the directrix plane, and does not depend on the choice of a point $P$ on the conic section.

Figure 4: Osculating paraboloids

Corollary.

Elliptic Keplerian orbits with a fixed length of their major axis correspond to the sections of the cone by planes tangent to the same paraboloid of revolution inscribed into the cone (and the length is equal to the distance from the focus to the directrix of the paraboloid).

Indeed, the major axis has length equal to the sum of the distances to the foci.
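The theorem of this section and its corollary can be tested in coordinates. The sketch below rests on assumptions of ours that follow from the lemma: the inscribed paraboloid with focus $(0,0,c)$ and directrix plane $r=0$ is written as $r=(x^{2}+y^{2}+c^{2})/2c$, and the point of tangency $F$ is placed over $(x_{0},0)$ with $|x_{0}|<c$. The tangent plane at $F$ then has slope $x_{0}/c$ and crosses the axis at height $(c^{2}-x_{0}^{2})/2c$. All names in the code are hypothetical.

```python
import math


def focal_distance_sums(c, x0, n=12):
    """For the plane tangent to the paraboloid r = (x^2 + y^2 + c^2) / (2 c)
    over the point (x0, 0), return the sums of distances from n projected
    points of the conic section to the origin O and to the point (x0, 0).

    Assumes c > 0 and |x0| < c (elliptic section).
    """
    slope = x0 / c                           # slope of the tangent plane
    height = (c * c - x0 * x0) / (2 * c)     # where the plane meets the axis
    sums = []
    for i in range(n):
        theta = 2 * math.pi * i / n
        # Section of the cone by the plane r = height + slope * x.
        r = height / (1 - slope * math.cos(theta))
        x, y = r * math.cos(theta), r * math.sin(theta)
        sums.append(math.hypot(x, y) + math.hypot(x - x0, y))
    return sums


sums = focal_distance_sums(c=2.0, x0=0.8)
print(sums)  # every sum should equal c, the focus-to-directrix distance
```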

We will see that the same condition characterizes orbits with a fixed period of revolution, and a fixed value of total energy.

Exercise.

Extend the proof and the corollary to hyperbolic orbits.

7 Kepler’s 3rd law

The period $T$ of revolution can be found as the ratio of the area enclosed by the orbit to the sectorial velocity. The latter is $\mu/2m$, and the square of it $\mu^{2}/4m^{2}$ coincides with $kl/4$ (in our earlier notation). The area of an ellipse with semiaxes $a\geq b$ is equal to $\pi ab$. Combining, we find a geometric expression for the square of the period:

$\displaystyle T^{2}=\frac{4\pi^{2}}{k}\frac{a^{2}b^{2}}{l}.$

The positions of the planet farthest from and closest to the Sun are called the aphelion and perihelion respectively. Denote by $r_{1}$ and $r_{2}$ the respective distances to the Sun, and examine Fig. 5a showing the axial cross-section of the cone over the major axis of an elliptic orbit.

Figure 5: Arithmetic, geometric, and harmonic means
Proposition.

The major ($a$) and minor ($b$) semiaxes of an elliptic orbit, and the altitude ($l$) of the corresponding secting plane over the vertex of the cone are respectively the arithmetic, geometric, and harmonic means between the aphelion ($r_{1}$) and perihelion ($r_{2}$) distances:

$\displaystyle a=\frac{r_{1}+r_{2}}{2},b=\sqrt{r_{1}r_{2}},l=\frac{2}{1/r_{1}+1 /r_{2}}.$
Proof.

The first is obvious. The second is standard (see Fig. 5b): By the Pythagorean theorem, since the half-distance $f$ between the foci equals $(r_{1}-r_{2})/2$, we have $b^{2}=a^{2}-f^{2}=(r_{1}+r_{2})^{2}/4-(r_{1}-r_{2})^{2}/4=r_{1}r_{2}$. Lastly (see Fig. 5a), the area $r_{1}r_{2}$ of the right triangle $KOM$ equals the sum of the areas $lr_{1}/2$ and $lr_{2}/2$ of the triangles into which the angle bisector $OL$ divides the triangle $KOM$. Thus the length $l$ of the bisector equals $2r_{1}r_{2}/(r_{1}+r_{2})$. $\square$

We have $al=b^{2}$: the geometric mean between two quantities is the geometric mean between their arithmetic and harmonic means. Thus, $b^{2}/l=a$, and

$\displaystyle T^{2}=\frac{4\pi^{2}}{k}a^{3}.$

This is Kepler’s 3rd law: The squares of the periods are proportional to the cubes of the orbits’ major semiaxes.
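The arithmetic behind the 3rd law can be replayed in a few lines. The sketch below (an illustration with hypothetical names) starts from the two extreme distances $r_{1}$, $r_{2}$, forms the three means, and computes the period twice: as the area $\pi ab$ divided by the sectorial velocity $\sqrt{kl}/2$, and from the formula $T^{2}=4\pi^{2}a^{3}/k$.

```python
import math


def period_two_ways(r1, r2, k=1.0):
    """Return (T from area / sectorial velocity, T from the 3rd law)."""
    a = (r1 + r2) / 2                 # arithmetic mean: major semiaxis
    b = math.sqrt(r1 * r2)            # geometric mean: minor semiaxis
    l = 2 / (1 / r1 + 1 / r2)         # harmonic mean: altitude of the plane
    sectorial_velocity = math.sqrt(k * l) / 2
    t_area = math.pi * a * b / sectorial_velocity
    t_third_law = 2 * math.pi * math.sqrt(a ** 3 / k)
    return t_area, t_third_law


t_area, t_third_law = period_two_ways(r1=2.5, r2=1.0)
print(t_area, t_third_law)  # the two values should agree
```

The agreement is just the identity $al=b^{2}$ in disguise.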

8 The total energy

The sum of the kinetic and potential energy

$\displaystyle E:=m\frac{|\dot{{\mathbf{r}}}|^{2}}{2}-\frac{mk}{|{\mathbf{r}}|},$

is conserved, as is readily verified by differentiation.

On the other hand, at the aphelion and perihelion, where the velocity vector $\dot{{\mathbf{r}}}$ is perpendicular to ${\mathbf{r}}$, we have (from conservation of sectorial velocity) $|{\mathbf{r}}|\,|\dot{{\mathbf{r}}}|=\mu/m$. Eliminating the velocities, we obtain the following quadratic equation for $r=r_{1}$ and $r=r_{2}$:

$\displaystyle\frac{\mu^{2}}{2m}\left(\frac{1}{r}\right)^{2}-mk\left(\frac{1}{r }\right)-E=0.$

By Vieta’s theorem, we have

$\displaystyle\frac{2}{l}=\frac{1}{r_{1}}+\frac{1}{r_{2}}=\frac{2km^{2}}{\mu^{2 }},\quad\text{and}\quad r_{1}+r_{2}=-\frac{mk}{E}.$

The first equality shows (again) that at a fixed value of sectorial velocity the secting planes hit the axis of the cone at the same point. The second equality implies that elliptic orbits have negative total energy, and that the value of $E/m$ is determined by the major semiaxis of the orbit.
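For a concrete orbit, both Vieta relations can be checked directly. The sketch below is our illustration, with hypothetical names: it starts at the perihelion of a bound orbit (velocity perpendicular to the radius-vector, $E<0$), solves the quadratic equation for $1/r$, and returns the quantities entering the two equalities.

```python
import math


def apsides_from_perihelion(r_peri, v_peri, k=1.0, m=1.0):
    """Solve (mu^2 / 2m) u^2 - m k u - E = 0 for u = 1/r.

    Assumes the motion starts at the perihelion of a bound orbit (E < 0).
    Returns (r1, r2, E, mu) with r1 >= r2.
    """
    mu = m * r_peri * v_peri
    energy = m * v_peri ** 2 / 2 - m * k / r_peri
    qa, qb, qc = mu ** 2 / (2 * m), -m * k, -energy
    disc = math.sqrt(qb * qb - 4 * qa * qc)
    u_large = (-qb + disc) / (2 * qa)   # larger 1/r: the closer point
    u_small = (-qb - disc) / (2 * qa)   # smaller 1/r: the farther point
    return 1 / u_small, 1 / u_large, energy, mu


r1, r2, energy, mu = apsides_from_perihelion(r_peri=1.0, v_peri=1.2)
harmonic_mean = 2 / (1 / r1 + 1 / r2)
print(r1, r2)
print(harmonic_mean, mu ** 2)       # both should equal l (here k = m = 1)
print(r1 + r2, -1.0 / energy)       # both should equal -m k / E
```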

Corollary.

Orbits with a fixed value of the total energy (as well as orbits with a fixed period of revolution) correspond to sections of the cone by planes tangent to the same paraboloid of revolution inscribed into the cone.

9 The gravity of curvature

The Keplerian dynamics lifted from the plane to the cone $x^{2}+y^{2}=r^{2}$ can be considered as Lagrangian dynamics on the cone itself with potential energy $-mk/r$, where however the distance $r$ to the vertex of the cone, as well as the kinetic energy, are dictated by the metric on the cone obtained from the degenerate metric $(dx)^{2}+(dy)^{2}+0(dr)^{2}$ in 3-space, induced by the projection of the space to the plane. Consider now how the dynamics changes when the metric on the cone is induced from a more general metric of the form $(dx)^{2}+(dy)^{2}+\epsilon(dr)^{2}$, Euclidean for $\epsilon>0$ and Minkowski for $\epsilon<0$ (but ${>-}1$ so that the cone remains Euclidean). When cut along a generator, the cone can be developed isometrically to the plane. The development covers a sector which for $\epsilon\neq 0$ differs from the full angle $2\pi$ by the angular defect $\alpha$ (easily computed as $\alpha=2\pi(\sqrt{1+\epsilon}-1)$). The ratio $\alpha/2\pi$ can be interpreted as the amount of the Gaussian curvature of the cone accumulated at the vertex. This curvature has the following effect on the Keplerian dynamics (see Fig. 6): the usual Keplerian orbits on the development plane jump from one edge of the cut to the other and proceed along the trajectory rotated through the angle $\alpha$. This results in the rotation of the perihelion by $\alpha$ radians per revolution, somewhat similar to the rotation of the perihelion of Mercury explained by the general theory of relativity.

Figure 6: The effect of the angular defect

Figure 7: Two beads on the strings

10 An exercise on hidden symmetries

The integrability of Kepler’s problem is based on the law of conservation of angular momentum, which, in full agreement with E. Noether’s theorem, is due to the rotational symmetry (isotropy) of space. In the following problem, the isotropy is explicitly broken:

Two identical beads slide without friction along two perpendicular strings (Fig. 7) and interact with each other gravitationally, i.e. by central attracting force proportional to the inverse square of the distance between the beads. Describe the motion of this system (assuming that each bead crosses the other bead’s string without collision).

Added in proof
I. Boyadzhiev informed the author about two GeoGebra applets she designed to help visualise: (a) Dandelin’s spheres http://tube.geogebra.org/material/simple/id/1954689, and (b) the above theorems about conic sections http://tube.geogebra.org/material/simple/id/1725305.

References

  • [Arnold1990] Arnold, V.I.: Huygens and Barrow, Newton and Hooke: Pioneers in mathematical analysis and catastrophe theory from evolvents to quasicrystals. Birkhäuser, Basel (1990)
  • [Arnold et al.1987] Arnold, V.I., Kozlov, V.V., Neishtadt, A.I.: Mathematical aspects of classical and celestial mechanics. Springer, Dynamical systems-III (Encyclopaedia of Mathematical Sciences) (1987)
  • [Goodstein and Goodstein1996] Goodstein, D.L., Goodstein, J.R.: Feynman’s Lost Lecture: The Motion of Planets Around the Sun. W. W. Norton & Co., New York (1996)
  • [Milnor1983] Milnor, J.: On the geometry of the Kepler problem. Am. Math. Month. 90, 353–365 (1983)