
LECTURE 16: VARIATIONS OF LENGTH AND ENERGY

Although we defined geodesics as "self-parallel" curves, in the last several lectures we have seen that on Riemannian manifolds, geodesics are closely related to "length minimizing" curves:

• (Lecture 13) On any Riemannian manifold, in a small neighborhood of any point, geodesics are precisely the shortest curves connecting endpoints.
• (Lectures 14 and 15) On any complete Riemannian manifold, in each path-homotopy class, there exists a length minimizing curve and it is a geodesic.

On the other hand, we also know that there exist geodesics which are not length minimizing in the given path-homotopy class [e.g. closed geodesics on $S^m$]. In what follows we take a closer look at the relation between geodesics and the length functional.

1. Geodesics as critical points of energy functional

¶ The Euler-Lagrange equation.

For any $p, q \in M$, consider
\[ C_{pq} = \{\gamma : [a,b] \to M \mid \gamma \text{ is piecewise smooth and } \gamma(a) = p,\ \gamma(b) = q\}. \]
One may ask: what property distinguishes geodesics in $C_{pq}$ from other curves? One of the answers should be "length-minimizing", at least locally. Now let's attack this problem by studying the length functional directly. Recall that the length of a piecewise smooth curve $\gamma : [a,b] \to (M,g)$ is
\[ L(\gamma) = \mathrm{Length}(\gamma) = \int_a^b |\dot\gamma(t)|\,dt. \]
To find the minimum of such a functional, for simplicity let's first assume that $\gamma$ is inside a coordinate patch, and thus is given by a vector-valued function $x(t) = (x^1(t), \cdots, x^m(t))$.

Consider a very general question in variational analysis: Given a smooth function $f = f(t, x, \dot x)$, find all the minimizers of the functional
\[ I(x) = \int_a^b f(t, x(t), \dot x(t))\,dt \]
in the set of all smooth curves $x(t) = (x^1(t), \cdots, x^m(t))$ with fixed endpoints $x(a) = p$, $x(b) = q$.

Since this "space of smooth curves" is huge (namely, it is "infinite-dimensional"), one cannot directly apply the usual methods of calculus to find the minimizers.
Fortunately, there is a branch of mathematics called variational calculus that was invented to handle such

problems. The idea is to convert the one "infinite-dimensional" problem [in which we have infinitely many directions to move] into infinitely many "one-dimensional" problems [in which we fix one direction to move].

Here is how it works in this example: Since we are studying the functional on curves with fixed endpoints, we may fix any smooth map $y(t) = (y^1(t), \cdots, y^m(t))$ with $y(a) = y(b) = 0$ and consider the corresponding one-parameter family of curves of the form $x(t) + \varepsilon y(t)$. So if $x = x(t)$ is a minimizer of $I$, then we must have
\[
\begin{aligned}
0 = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0} I(x + \varepsilon y)
&= \frac{d}{d\varepsilon}\Big|_{\varepsilon=0} \int_a^b f(t, x + \varepsilon y, \dot x + \varepsilon \dot y)\,dt \\
&= \int_a^b \left( \frac{\partial f}{\partial x^k}(t, x, \dot x)\, y^k + \frac{\partial f}{\partial \dot x^k}(t, x, \dot x)\, \dot y^k \right) dt \\
&= \int_a^b \left( \frac{\partial f}{\partial x^k}(t, x, \dot x) - \frac{d}{dt} \frac{\partial f}{\partial \dot x^k}(t, x, \dot x) \right) y^k \, dt,
\end{aligned}
\]
where in the last step we integrated by parts and used $y(a) = y(b) = 0$ to drop the boundary term. Since $y$ is arbitrary, we see that if $x$ is a minimizer (or a critical point) of $I$, then
\[ \frac{\partial f}{\partial x^k}(t, x, \dot x) = \frac{d}{dt} \frac{\partial f}{\partial \dot x^k}(t, x, \dot x), \qquad 1 \le k \le m, \]
which is known as the Euler-Lagrange equation for the functional $I$.

¶ Arc length vs. energy.

We may apply the Euler-Lagrange equation above to the function
\[ f(t, x(t), \dot x(t)) = \left( \langle \dot x(t), \dot x(t) \rangle_{x(t)} \right)^{1/2} = \left( g_{ij}(x(t))\, \dot x^i(t)\, \dot x^j(t) \right)^{1/2}. \]
However, since there is a square root, the computation could be a bit messy. It turns out that there is a small trick that simplifies the computation a lot: instead of the length functional, we can work with the energy functional
\[ E(\gamma) = \frac{1}{2} \int_a^b |\dot\gamma(t)|^2 \, dt. \]
By the Cauchy-Schwarz inequality, for each piecewise smooth curve $\gamma$,
\[ L(\gamma)^2 = \left( \int_a^b |\dot\gamma(t)|\,dt \right)^2 \le \int_a^b 1^2 \, dt \int_a^b |\dot\gamma(t)|^2 \, dt = 2(b-a)E(\gamma), \]
with equality if and only if $|\dot\gamma(t)| \equiv$ constant. In particular we see that although the length functional $L(\gamma)$ is independent of the choice of parametrization, the energy functional does depend on the parametrization (and on the length of the interval $[a,b]$): among different parametrizations of $\gamma$ on a fixed interval $[a,b]$, $E(\gamma)$ is minimized by the "constant speed parametrization". As a consequence we can prove

Proposition 1.1.
A curve $\gamma : [a,b] \to M$ in $C_{pq}$ minimizes the energy functional $E$ if and only if it has constant speed and minimizes the length functional $L$.

Proof. Suppose $\gamma : [a,b] \to M$ minimizes $E$ but there exists $\gamma' \in C_{pq}$ such that $L(\gamma') < L(\gamma)$. Then for the "constant speed reparametrization" $\widetilde\gamma : [a,b] \to M$ of $\gamma'$,
\[ E(\widetilde\gamma) = \frac{1}{2(b-a)} L(\widetilde\gamma)^2 = \frac{1}{2(b-a)} L(\gamma')^2 < \frac{1}{2(b-a)} L(\gamma)^2 \le E(\gamma), \]
which is a contradiction. So any minimizer of $E$ must also minimize $L$. Moreover, such a minimizer has constant speed, since otherwise its constant speed reparametrization would have strictly smaller energy by the inequality above.

Conversely, if $\gamma : [a,b] \to M$ has constant speed and minimizes $L$, but there is another $\gamma' : [a,b] \to M$ in $C_{pq}$ with $E(\gamma') < E(\gamma)$, then
\[ L(\gamma') \le \sqrt{2(b-a)E(\gamma')} < \sqrt{2(b-a)E(\gamma)} = L(\gamma), \]
a contradiction. □

Since any piecewise smooth curve can be reparametrized to have constant speed, to minimize $L(\gamma)$ it is enough to minimize $E(\gamma)$, whose integrand is much simpler. Applying the Euler-Lagrange equation to
\[ f(t, x(t), \dot x(t)) = g_{ij}(x(t))\, \dot x^i(t)\, \dot x^j(t) \]
we get, for $1 \le k \le m$,
\[ \frac{\partial g_{ij}}{\partial x^k} \dot x^i \dot x^j = \frac{d}{dt}\left( g_{kj} \dot x^j \right) + \frac{d}{dt}\left( g_{ik} \dot x^i \right) = 2 \frac{\partial g_{kj}}{\partial x^i} \dot x^i \dot x^j + 2 g_{kj} \ddot x^j, \]
which, as we have seen in Lecture 13, is equivalent to the geodesic equation
\[ \ddot x^k + \Gamma^k_{ij} \dot x^i \dot x^j = 0, \qquad 1 \le k \le m. \]
Amazingly enough, in this way we get not only all the geodesics that are length minimizing curves in $C_{pq}$ and all the geodesics that are the length minimizing curves in each path-homotopy class of $C_{pq}$ ¹, but in fact we get ALL the geodesics connecting $p$ and $q$:

Theorem 1.2. For a Riemannian manifold $(M, g)$, a curve $\gamma : [a,b] \to M$ in $C_{pq}$ is a geodesic if and only if it satisfies the Euler-Lagrange equation of the energy functional $E(\gamma)$.

This gives another proof of the fact that any length minimizing curve is a geodesic, and also explains why there exist geodesics that are not length minimizing even in the given path-homotopy class: those curves are only "critical points" of $E$, which need not be minimizing among all nearby curves.
If one needs to find the minimizing geodesics, then as usual one can further calculate the second order derivative $\frac{d^2}{d\varepsilon^2}\big|_{\varepsilon=0} E(x + \varepsilon y)$ in a coordinate system, using which one can show that geodesics are always length-minimizing locally in a neighborhood.

¹ Although these curves are not length minimizing in $C_{pq}$, they are in fact length minimizing among "nearby curves", namely among curves of the form $x + \varepsilon y$ in the computation above for $\varepsilon$ small enough, since these curves are in the same path-homotopy class.
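The passage from the Euler-Lagrange equation of the energy to the geodesic equation can be spelled out as follows. This is only a sketch of the standard computation, not a reproduction of the argument from Lecture 13; it assumes the usual coordinate formula for the Christoffel symbols of the Levi-Civita connection and writes $g^{lk}$ for the inverse matrix of $g_{kl}$.

```latex
% Euler-Lagrange equation for f = g_{ij} \dot x^i \dot x^j, moved to one side:
\[
2 g_{kj} \ddot x^j
  + \left( 2 \frac{\partial g_{kj}}{\partial x^i}
         - \frac{\partial g_{ij}}{\partial x^k} \right) \dot x^i \dot x^j = 0 .
\]
% Since \dot x^i \dot x^j is symmetric in (i,j), symmetrize the first coefficient:
\[
2 \frac{\partial g_{kj}}{\partial x^i} \dot x^i \dot x^j
  = \left( \frac{\partial g_{kj}}{\partial x^i}
         + \frac{\partial g_{ki}}{\partial x^j} \right) \dot x^i \dot x^j .
\]
% Divide by 2 and contract with the inverse metric g^{lk}:
\[
\ddot x^l
  + \frac{1}{2} g^{lk} \left( \frac{\partial g_{kj}}{\partial x^i}
                            + \frac{\partial g_{ki}}{\partial x^j}
                            - \frac{\partial g_{ij}}{\partial x^k} \right)
    \dot x^i \dot x^j = 0 ,
\qquad\text{i.e.}\qquad
\ddot x^l + \Gamma^l_{ij} \dot x^i \dot x^j = 0 .
\]
```

The overall factor in front of the integrand (here $1$ instead of $\tfrac12$) does not affect the Euler-Lagrange equation, which is why the computation may be done with $g_{ij}\dot x^i \dot x^j$.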

2. Formulas for the first and second variations

The calculations above are thought-provoking but have the shortcoming that they are performed in a chart. In what follows we take a global approach to calculate the first and second derivatives, and also study variations which could be more general (without fixing endpoints) or more restrictive (geodesic variations).

¶ Variations.

For simplicity we start with smooth variations of a smooth curve:

Definition 2.1. Let $\gamma : [a,b] \to M$ be a smooth curve, and $\varepsilon > 0$.
(1) A smooth variation of $\gamma$ is a smooth map $f : [a,b] \times (-\varepsilon, \varepsilon) \to M$ so that $f(t, 0) = \gamma(t)$ for all $t \in [a,b]$. In what follows, we will also denote $\gamma_s(t) = f(t,s)$.
(2) A variation $f$ is called proper if for every $s \in (-\varepsilon, \varepsilon)$, $\gamma_s(a) = \gamma(a)$ and $\gamma_s(b) = \gamma(b)$.
(3) A variation is called a geodesic variation if each $\gamma_s$ is a geodesic.

For simplicity we denote $R = [a,b] \times (-\varepsilon, \varepsilon)$. Let $f : R \to M$ be a smooth variation of $\gamma$. Then $E = f^*TM$ is a vector bundle over $R$, on which we have an induced linear connection $\widetilde\nabla = f^*\nabla$ (where $\nabla$ is the Levi-Civita connection on $(M, g)$). To be rigorous, in what follows we will calculate via $\widetilde\nabla$, and refer to the appendix (Section 3) for the definition and properties of $\widetilde\nabla$.

The variation $f$ gives rise to two natural sections of $E$, namely,
\[ f_s(t,s) := (df)_{t,s}\Big(\frac{\partial}{\partial s}\Big) \in T_{f(t,s)}M = E_{t,s} \quad\text{and}\quad f_t(t,s) := (df)_{t,s}\Big(\frac{\partial}{\partial t}\Big) \in T_{f(t,s)}M = E_{t,s}, \]
where $\frac{\partial}{\partial s}$ and $\frac{\partial}{\partial t}$ are the coordinate vector fields on $R$. Note that by definition, $f_t(t_0, s_0) = \dot\gamma_{s_0}(t_0)$.

We are mainly interested in the restriction of the sections $f_s$ and $f_t$ to $s = 0$, which are in fact "vector fields along $\gamma$". Obviously we have
\[ \dot\gamma(t) = f_t(t, 0) = (df)_{t,0}\Big(\frac{\partial}{\partial t}\Big). \]

Definition 2.2. We will call
\[ V(t) := f_s(t, 0) = (df)_{t,0}\Big(\frac{\partial}{\partial s}\Big) \]
the variation field of the variation $f$.
Note that if $\gamma$ is an embedded curve and $f$ is an embedding, then both $\dot\gamma(t)$ and $V(t)$ can be regarded as vector fields on $M$ along $\gamma$ in a natural way, and the computations below can be carried out via $\nabla$ instead of $\widetilde\nabla$.
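As a concrete illustration of Definitions 2.1 and 2.2 (an example added here, not taken from the original notes), one may take $M = \mathbb{R}^2$ with the Euclidean metric, the segment $\gamma(t) = (t, 0)$ on $[0, \pi]$, and the hypothetical variation $f(t,s) = (t, s \sin t)$. The energy of each $\gamma_s$ can then be computed by hand:

```latex
% Variation of the segment \gamma(t) = (t,0), t \in [0,\pi], in the Euclidean plane
\[
f(t,s) = (t,\ s \sin t), \qquad
f_t(t,s) = (1,\ s \cos t), \qquad
f_s(t,s) = (0,\ \sin t).
\]
% Variation field; it vanishes at both endpoints, and f(0,s) = (0,0), f(\pi,s) = (\pi,0),
% so the variation is proper
\[
V(t) = f_s(t,0) = (0,\ \sin t), \qquad V(0) = V(\pi) = 0 .
\]
% Energy of \gamma_s, using \int_0^\pi \cos^2 t \, dt = \pi/2
\[
E(\gamma_s) = \frac12 \int_0^\pi \left( 1 + s^2 \cos^2 t \right) dt
            = \frac{\pi}{2} + \frac{\pi}{4} s^2 ,
\qquad
\frac{d}{ds}\Big|_{s=0} E(\gamma_s) = 0 .
\]
```

The vanishing derivative at $s = 0$ is consistent with Theorem 1.2: the straight segment is a geodesic of the Euclidean plane, hence a critical point of the energy.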

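¶ The first variation formula of energy for smooth variations.

Now we compute the variation of $E$ along a given variation (without fixing endpoints): Let $f(t,s)$ be a smooth variation of a smooth curve $\gamma : [a,b] \to M$. By Proposition 3.11 and Proposition 3.13, the derivative of $E(\gamma_s)$ is
\[ \frac{d}{ds} E(\gamma_s) = \frac12 \int_a^b \frac{\partial}{\partial s} \langle \dot\gamma_s(t), \dot\gamma_s(t) \rangle \, dt = \int_a^b \langle \widetilde\nabla_{\partial/\partial s} f_t, f_t \rangle \, dt = \int_a^b \langle \widetilde\nabla_{\partial/\partial t} f_s, f_t \rangle \, dt. \]
Applying metric compatibility (i.e. Proposition 3.11) again, we get
\[ \int_a^b \langle \widetilde\nabla_{\partial/\partial t} f_s, f_t \rangle \, dt = \int_a^b \frac{\partial}{\partial t} \langle f_s, f_t \rangle \, dt - \int_a^b \langle f_s, \widetilde\nabla_{\partial/\partial t} f_t \rangle \, dt = \langle f_s, f_t \rangle \Big|_{t=a}^{t=b} - \int_a^b \langle f_s, \widetilde\nabla_{\partial/\partial t} f_t \rangle \, dt. \]
So we get

Theorem 2.3 (The First Variation of Energy). Given any smooth variation $f(t,s)$ of a smooth curve $\gamma : [a,b] \to M$,
\[ \frac{d}{ds} E(\gamma_s) = \int_a^b \langle \widetilde\nabla_{\partial/\partial t} f_s, f_t \rangle \, dt = \langle f_s(t,s), f_t(t,s) \rangle \Big|_{t=a}^{t=b} - \int_a^b \langle f_s, \widetilde\nabla_{\partial/\partial t} f_t \rangle \, dt. \]
In particular,
\[ \frac{d}{ds}\Big|_{s=0} E(\gamma_s) = - \int_a^b \langle V(t), \nabla_{\dot\gamma(t)} \dot\gamma \rangle \, dt - \langle V(a), \dot\gamma(a) \rangle + \langle V(b), \dot\gamma(b) \rangle. \]
Consequently, if $f$ is a proper smooth variation, then
\[ \frac{d}{ds}\Big|_{s=0} E(\gamma_s) = - \int_a^b \langle V(t), \nabla_{\dot\gamma(t)} \dot\gamma \rangle \, dt. \]

Again we see that $\gamma$ is a geodesic (i.e. $\nabla_{\dot\gamma} \dot\gamma = 0$) if and only if $\gamma$ is a critical point of the energy functional $E$ among all proper variations.

¶ The first variation formula of length for smooth variations.

In the same way, one can calculate the first variation of the length. A trick to simplify the computation is the following observation, valid wherever $f_t \ne 0$:
\[ \frac{\partial}{\partial s} |\dot\gamma_s(t)| = \frac{\partial}{\partial s} \langle f_t, f_t \rangle^{1/2} = \frac12 \frac{1}{|f_t|} \frac{\partial}{\partial s} \langle f_t, f_t \rangle = \frac{1}{|f_t|} \langle \widetilde\nabla_{\partial/\partial t} f_s, f_t \rangle = \Big\langle \widetilde\nabla_{\partial/\partial t} f_s, \frac{f_t}{|f_t|} \Big\rangle. \]
Then following the same computation, one gets

Theorem 2.4 (The First Variation of Length). Let $f(t,s)$ be a smooth variation of a smooth curve $\gamma$ with $\dot\gamma$ nowhere vanishing. Then
\[ \frac{d}{ds}\Big|_{s=0} L(\gamma_s) = - \int_a^b \Big\langle V(t), \nabla_{\dot\gamma(t)} \frac{\dot\gamma}{|\dot\gamma|} \Big\rangle \, dt - \Big\langle V(a), \frac{\dot\gamma(a)}{|\dot\gamma(a)|} \Big\rangle + \Big\langle V(b), \frac{\dot\gamma(b)}{|\dot\gamma(b)|} \Big\rangle. \]
As an application we prove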

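Proposition 2.5. Let $S$ be a closed submanifold of $(M, g)$. Suppose $\gamma$ is a geodesic from $p \notin S$ to $q \in S$ with $L(\gamma) = d(p, S)$. Then $\gamma$ is perpendicular to $S$.

Proof. We may assume that $\gamma$ is parametrized by arc length on $[a,b] = [0,l]$, where $l = L(\gamma)$. For any $v \in T_qS$, take a curve $\sigma : (-\varepsilon, \varepsilon) \to S$ with $\sigma(0) = q$ and $\dot\sigma(0) = v$. Let $\gamma_s$ be a variation of $\gamma$ with $\gamma_s(0) = p$ and $\gamma_s(l) = \sigma(s)$. Then $V(0) = 0$ and $V(l) = v$. Since each $\gamma_s$ joins $p$ to a point of $S$, we have $L(\gamma_s) \ge d(p, S) = L(\gamma)$, so $s = 0$ is a minimum of $s \mapsto L(\gamma_s)$. As $\gamma$ is a unit speed geodesic, the integral term in the first variation formula vanishes, and we get
\[ 0 = \frac{d}{ds}\Big|_{s=0} L(\gamma_s) = \langle v, \dot\gamma(l) \rangle. \]
Since $v \in T_qS$ is arbitrary, the conclusion follows. □

¶ Piecewise smooth curve.

More generally, one can consider piecewise smooth curves $\gamma : [a,b] \to M$, i.e. there exists a subdivision
\[ a = t_0 < t_1 < t_2 < \cdots < t_k < t_{k+1} = b \]
such that $\gamma$ is smooth on each interval $[t_i, t_{i+1}]$. We shall consider piecewise smooth variations of $\gamma$, i.e. continuous maps $f : [a,b] \times (-\varepsilon, \varepsilon) \to M$ so that $f$ is smooth on $[t_i, t_{i+1}] \times (-\varepsilon, \varepsilon)$ for each $i$. Note that this implies

• for each $s \in (-\varepsilon, \varepsilon)$, the curve $t \mapsto \gamma_s(t) = f(t,s)$ is piecewise smooth,
• for each $t \in [a,b]$, the curve $s \mapsto f(t,s)$ is smooth (so $f_s$ is well-defined at the $t_i$'s).

Applying the previous theorems to each $[t_i, t_{i+1}] \times (-\varepsilon, \varepsilon)$, we get

Corollary 2.6. Let $f$ be a piecewise smooth variation of a curve $\gamma$. Then
\[ \frac{d}{ds}\Big|_{s=0} E(\gamma_s) = - \int_a^b \langle V(t), \nabla_{\dot\gamma} \dot\gamma \rangle \, dt - \langle V(a), \dot\gamma(a) \rangle + \langle V(b), \dot\gamma(b) \rangle - \sum_{i=1}^k \big\langle V(t_i), \dot\gamma(t_i^+) - \dot\gamma(t_i^-) \big\rangle \]
and
\[ \frac{d}{ds}\Big|_{s=0} L(\gamma_s) = - \int_a^b \Big\langle V(t), \nabla_{\dot\gamma} \frac{\dot\gamma}{|\dot\gamma|} \Big\rangle \, dt - \Big\langle V(a), \frac{\dot\gamma(a)}{|\dot\gamma(a)|} \Big\rangle + \Big\langle V(b), \frac{\dot\gamma(b)}{|\dot\gamma(b)|} \Big\rangle - \sum_{i=1}^k \Big\langle V(t_i), \frac{\dot\gamma(t_i^+)}{|\dot\gamma(t_i^+)|} - \frac{\dot\gamma(t_i^-)}{|\dot\gamma(t_i^-)|} \Big\rangle. \]

The local computations above imply that among smooth curves, geodesics are critical points of the energy functional. A natural question is: If a curve is only piecewise smooth, can it be a critical point of the energy functional? Of course, for $\gamma$ to be a critical point of the energy functional, it must be a geodesic when restricted to any subinterval where it is smooth, or in other words, it must be "piecewise geodesic".

Corollary 2.7.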
If a piecewise smooth curve $\gamma$ is a critical point of the energy functional among proper variations, then it is $C^1$ and thus a geodesic.

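Proof. We can first choose proper variations with variation fields satisfying $V(t_i) = 0$ for all $i$ and deduce that $\nabla_{\dot\gamma} \dot\gamma = 0$ at any smooth point of $\gamma$. In particular, the first term on the right hand side of the first variation formula vanishes. As a consequence, we have
\[ \sum_{i=1}^k \big\langle V(t_i), \dot\gamma(t_i^+) - \dot\gamma(t_i^-) \big\rangle = 0 \]
for the variation field $V$ of any proper variation. Then for each $i$ we can consider all such variation fields with $V(t_j) = 0$ for all $j \ne i$, and conclude that
\[ \big\langle V(t_i), \dot\gamma(t_i^+) - \dot\gamma(t_i^-) \big\rangle = 0 \]
for any $V(t_i) \in T_{\gamma(t_i)}M$. It follows that $\dot\gamma(t_i^+) = \dot\gamma(t_i^-)$ and thus $\gamma$ is $C^1$. By the uniqueness of geodesics with given initial position and velocity, the geodesic pieces then fit together into a single smooth geodesic. □

¶ The second variation formula of energy.

Finally we compute the second variation of energy. As in calculus, the second variation is mainly used near critical points, i.e. near geodesics. So we let $\gamma : [a,b] \to M$ be a geodesic, and $f(t,s)$ be a smooth variation of $\gamma$. According to Theorem 2.3, Proposition 3.11 and Proposition 3.13,
\[
\begin{aligned}
\frac{d^2}{ds^2} E(\gamma_s)
&= \int_a^b \frac{\partial}{\partial s} \langle \widetilde\nabla_{\partial/\partial t} f_s, f_t \rangle \, dt \\
&= \int_a^b \langle \widetilde\nabla_{\partial/\partial s} \widetilde\nabla_{\partial/\partial t} f_s, f_t \rangle \, dt + \int_a^b \langle \widetilde\nabla_{\partial/\partial t} f_s, \widetilde\nabla_{\partial/\partial s} f_t \rangle \, dt \\
&= \int_a^b \Big\langle \widetilde R\Big(\frac{\partial}{\partial s}, \frac{\partial}{\partial t}\Big) f_s, f_t \Big\rangle \, dt + \int_a^b \langle \widetilde\nabla_{\partial/\partial t} \widetilde\nabla_{\partial/\partial s} f_s, f_t \rangle \, dt + \int_a^b \langle \widetilde\nabla_{\partial/\partial t} f_s, \widetilde\nabla_{\partial/\partial t} f_s \rangle \, dt.
\end{aligned}
\]
The last two integrals both contain a derivative $\widetilde\nabla_{\partial/\partial t}$ that can be moved by Proposition 3.11. We may either apply Proposition 3.11 to the first one only, to get
\[
\begin{aligned}
\frac{d^2}{ds^2} E(\gamma_s)
&= \int_a^b \frac{\partial}{\partial t} \langle \widetilde\nabla_{\partial/\partial s} f_s, f_t \rangle \, dt + \int_a^b \Big( \Big\langle \widetilde R\Big(\frac{\partial}{\partial s}, \frac{\partial}{\partial t}\Big) f_s, f_t \Big\rangle - \langle \widetilde\nabla_{\partial/\partial s} f_s, \widetilde\nabla_{\partial/\partial t} f_t \rangle + \langle \widetilde\nabla_{\partial/\partial t} f_s, \widetilde\nabla_{\partial/\partial t} f_s \rangle \Big) dt \\
&= \langle \widetilde\nabla_{\partial/\partial s} f_s, f_t \rangle \Big|_{(a,s)}^{(b,s)} + \int_a^b \Big( \Big\langle \widetilde R\Big(\frac{\partial}{\partial s}, \frac{\partial}{\partial t}\Big) f_s, f_t \Big\rangle - \langle \widetilde\nabla_{\partial/\partial s} f_s, \widetilde\nabla_{\partial/\partial t} f_t \rangle + \langle \widetilde\nabla_{\partial/\partial t} f_s, \widetilde\nabla_{\partial/\partial t} f_s \rangle \Big) dt,
\end{aligned}
\]
or apply Proposition 3.11 to both, to get
\[
\begin{aligned}
\frac{d^2}{ds^2} E(\gamma_s)
&= \int_a^b \frac{\partial}{\partial t} \Big( \langle \widetilde\nabla_{\partial/\partial s} f_s, f_t \rangle + \langle f_s, \widetilde\nabla_{\partial/\partial t} f_s \rangle \Big) dt \\
&\quad + \int_a^b \Big( \Big\langle \widetilde R\Big(\frac{\partial}{\partial s}, \frac{\partial}{\partial t}\Big) f_s, f_t \Big\rangle - \langle \widetilde\nabla_{\partial/\partial s} f_s, \widetilde\nabla_{\partial/\partial t} f_t \rangle - \langle f_s, \widetilde\nabla_{\partial/\partial t} \widetilde\nabla_{\partial/\partial t} f_s \rangle \Big) dt \\
&= \Big( \langle \widetilde\nabla_{\partial/\partial s} f_s, f_t \rangle + \langle f_s, \widetilde\nabla_{\partial/\partial t} f_s \rangle \Big) \Big|_{(a,s)}^{(b,s)} \\
&\quad + \int_a^b \Big( \Big\langle \widetilde R\Big(\frac{\partial}{\partial s}, \frac{\partial}{\partial t}\Big) f_s, f_t \Big\rangle - \langle \widetilde\nabla_{\partial/\partial s} f_s, \widetilde\nabla_{\partial/\partial t} f_t \rangle - \langle f_s, \widetilde\nabla_{\partial/\partial t} \widetilde\nabla_{\partial/\partial t} f_s \rangle \Big) dt.
\end{aligned}
\]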

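Note that by Proposition 3.13(a),
\[ \Big\langle \widetilde R\Big(\frac{\partial}{\partial s}, \frac{\partial}{\partial t}\Big) f_s, f_t \Big\rangle \Big|_{(t,0)} = \langle R(V(t), \dot\gamma(t)) V(t), \dot\gamma(t) \rangle = \langle V, R(\dot\gamma, V) \dot\gamma \rangle (t). \]
Moreover, since $\gamma$ is a geodesic, $\widetilde\nabla_{\partial/\partial t} f_t = \nabla_{\dot\gamma} \dot\gamma = 0$ at $s = 0$, so the terms containing $\widetilde\nabla_{\partial/\partial t} f_t$ drop out. So by letting $s = 0$ in both formulas, we get

Theorem 2.8 (The Second Variation of Energy). Let $f(t,s)$ be a smooth variation of a geodesic $\gamma : [a,b] \to M$. Then
\[
\begin{aligned}
\frac{d^2}{ds^2}\Big|_{s=0} E(\gamma_s)
&= \langle \widetilde\nabla_{\partial/\partial s} f_s, \dot\gamma \rangle \Big|_a^b + \int_a^b \Big( \langle R(\dot\gamma, V) \dot\gamma, V \rangle + \langle \nabla_{\dot\gamma} V, \nabla_{\dot\gamma} V \rangle \Big) dt \\
&= \Big( \langle \widetilde\nabla_{\partial/\partial s} f_s, \dot\gamma \rangle + \langle V, \widetilde\nabla_{\dot\gamma} V \rangle \Big) \Big|_a^b - \int_a^b \big\langle V, \nabla_{\dot\gamma} \nabla_{\dot\gamma} V - R(\dot\gamma, V) \dot\gamma \big\rangle \, dt.
\end{aligned}
\]
In particular, if the variation is proper, then $f(a,s) \equiv \gamma(a)$ and $f(b,s) \equiv \gamma(b)$, so $f_s(a,s) = f_s(b,s) = 0$ for all $s$. Hence $V(a) = V(b) = 0$ and
\[ (\widetilde\nabla_{\partial/\partial s} f_s)\big|_{(a,s)} = 0, \qquad (\widetilde\nabla_{\partial/\partial s} f_s)\big|_{(b,s)} = 0, \]
so we get
\[ \frac{d^2}{ds^2}\Big|_{s=0} E(\gamma_s) = \int_a^b \Big( \langle R(\dot\gamma, V) \dot\gamma, V \rangle + \langle \nabla_{\dot\gamma} V, \nabla_{\dot\gamma} V \rangle \Big) dt = \int_a^b \big\langle V(t), -\nabla_{\dot\gamma} \nabla_{\dot\gamma} V(t) + R(\dot\gamma, V) \dot\gamma (t) \big\rangle \, dt. \]

Note that
\[ \langle R(\dot\gamma, V) \dot\gamma, V \rangle = -Rm(\dot\gamma, V, \dot\gamma, V) = -K(\dot\gamma, V) \, \mathrm{Area}(\dot\gamma, V), \]
so we see that if $(M, g)$ has non-positive sectional curvature, then for any proper variation of any geodesic,
\[ \frac{d^2}{ds^2}\Big|_{s=0} E(\gamma_s) \ge 0. \]
As a result, any geodesic in a space of non-positive sectional curvature is locally minimizing among nearby curves.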
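To see that the sign of the curvature term matters, here is a worked instance of the proper-variation formula of Theorem 2.8 on the unit round sphere $S^2$. This example is not part of the original notes. It assumes $K \equiv 1$, a unit speed geodesic $\gamma : [0, l] \to S^2$, a parallel unit normal field $N(t)$ along $\gamma$ (a name introduced here), the reading $\mathrm{Area}(\dot\gamma, V) = |\dot\gamma|^2 |V|^2 - \langle \dot\gamma, V \rangle^2$, and that $V$ is the variation field of some proper variation, for instance $f(t,s) = \exp_{\gamma(t)}(s V(t))$.

```latex
% Variation field vanishing at both endpoints, N parallel with |N| = 1 and N \perp \dot\gamma
\[
V(t) = \sin\!\Big(\frac{\pi t}{l}\Big) N(t), \qquad
\nabla_{\dot\gamma} V = \frac{\pi}{l} \cos\!\Big(\frac{\pi t}{l}\Big) N(t), \qquad
V(0) = V(l) = 0 .
\]
% Curvature term, using K = 1, |\dot\gamma| = 1 and V \perp \dot\gamma
\[
\langle R(\dot\gamma, V)\dot\gamma, V \rangle
  = -K \big( |\dot\gamma|^2 |V|^2 - \langle \dot\gamma, V \rangle^2 \big)
  = -\sin^2\!\Big(\frac{\pi t}{l}\Big) .
\]
% Second variation for the proper variation, using
% \int_0^l \cos^2(\pi t/l)\,dt = \int_0^l \sin^2(\pi t/l)\,dt = l/2
\[
\frac{d^2}{ds^2}\Big|_{s=0} E(\gamma_s)
  = \int_0^l \Big( \frac{\pi^2}{l^2} \cos^2\!\Big(\frac{\pi t}{l}\Big)
                 - \sin^2\!\Big(\frac{\pi t}{l}\Big) \Big) dt
  = \frac{\pi^2 - l^2}{2l} .
\]
```

Under these assumptions the second variation is negative as soon as $l > \pi$, so a great circle arc longer than $\pi$ is a geodesic that does not minimize energy among nearby curves with the same endpoints, in contrast with the non-positive curvature case.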

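3. Appendix: The induced connection (by Yulong Li)

¶ The pullback bundle.

Let $M, N$ be two smooth manifolds, $\nabla$ a connection on $M$ and $\varphi : N \to M$ a smooth map. Then we may pull back the tangent bundle $\pi : TM \to M$ over $M$ to a vector bundle $\pi' : E = \varphi^*(TM) \to N$ over $N$ [known as the pullback bundle], where
\[ E = \{ (x, v) \mid x \in N,\ v \in TM,\ \varphi(x) = \pi(v) \} \subset N \times TM. \]
In other words, we just set the fiber $E_x$ of $\pi' : E \to N$ to be the vector space $T_{\varphi(x)}M$. Denote by $\widetilde\varphi : E \to TM$ the induced bundle map that maps $(x, v) \in E$ to $v \in TM$. Then we have the following commutative diagram
\[
\begin{array}{ccc}
E & \xrightarrow{\ \widetilde\varphi\ } & TM \\
{\scriptstyle \pi'} \big\downarrow & & \big\downarrow {\scriptstyle \pi} \\
N & \xrightarrow{\ \varphi\ } & M
\end{array}
\qquad\qquad
\begin{array}{ccc}
(x, v) & \longmapsto & v \\
\big\downarrow & & \big\downarrow \\
x & \longmapsto & \varphi(x) = \pi(v)
\end{array}
\]

In this construction, there are two natural ways to obtain sections of $E$:

• For any section $V \in \Gamma^\infty(TM)$, one can define an assignment
\[ \widetilde\varphi^* V : N \to E, \qquad x \mapsto (x, V_{\varphi(x)}). \]
In this way any smooth vector field $V \in \Gamma^\infty(TM)$ gives rise to a smooth section $\widetilde\varphi^* V \in \Gamma^\infty(E)$ of $E$. [Note: if $E_i$ is a local frame of $TM$ near $\varphi(x)$, then $\widetilde\varphi^* E_i$ is a local frame of $E$ near $x$.]
• For any section $X \in \Gamma^\infty(TN)$, one can define an assignment
\[ d\varphi(X) : N \to E, \qquad x \mapsto (x, (d\varphi)_x(X_x)). \]
In this way any smooth vector field $X \in \Gamma^\infty(TN)$ gives rise to a smooth section $d\varphi(X) \in \Gamma^\infty(E)$ of $E$.

The two constructions are related as follows: For $X \in \Gamma^\infty(TN)$ and $V \in \Gamma^\infty(TM)$,
\[ d\varphi(X) = \widetilde\varphi^* V \iff d\varphi_x(X_x) = V_{\varphi(x)} \text{ for all } x \in N \iff X, V \text{ are } \varphi\text{-related}. \]
In manifold theory we have seen that if $X, V$ and $Y, W$ are $\varphi$-related, then $[X, Y]$ and $[V, W]$ are $\varphi$-related. So we get

Proposition 3.1. If $d\varphi(X) = \widetilde\varphi^* V$ and $d\varphi(Y) = \widetilde\varphi^* W$, then $d\varphi([X, Y]) = \widetilde\varphi^*([V, W])$.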
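Not every section of the form $d\varphi(X)$ can be written as $\widetilde\varphi^* V$ for a vector field $V$ on $M$. The following small example (added here for illustration, with a hypothetical figure-eight curve in the plane) shows what can go wrong when $\varphi$ is not injective.

```latex
% N = \mathbb{R}, M = \mathbb{R}^2, X = \partial/\partial t, and a figure-eight curve
\[
\varphi(t) = (\cos t,\ \sin 2t), \qquad
d\varphi_t\Big(\frac{\partial}{\partial t}\Big) = (-\sin t,\ 2\cos 2t) .
\]
% The curve passes through the origin twice, with two different velocities
\[
\varphi\Big(\frac{\pi}{2}\Big) = \varphi\Big(\frac{3\pi}{2}\Big) = (0, 0), \qquad
d\varphi_{\pi/2}\Big(\frac{\partial}{\partial t}\Big) = (-1, -2), \qquad
d\varphi_{3\pi/2}\Big(\frac{\partial}{\partial t}\Big) = (1, -2) .
\]
% A \varphi-related V would need V_{(0,0)} to equal both vectors, which is impossible
\[
V_{(0,0)} = (-1,-2) \quad\text{and}\quad V_{(0,0)} = (1,-2)
\quad\Longrightarrow\quad \text{no such } V \in \Gamma^\infty(T\mathbb{R}^2) \text{ exists.}
\]
```

So $d\varphi(\partial/\partial t)$ is a perfectly good section of $E = \varphi^*(T\mathbb{R}^2)$ that is not $\varphi$-related to any vector field on the target, which is why a formula for $d\varphi([X, Y])$ in terms of a local frame is useful.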

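To extend this proposition to more general vector fields on $N$, we let $E_i$ be a local frame of $TM$ near $\varphi(x)$; then $\widetilde\varphi^* E_i$ is a local frame of $E$ near $x$.

Proposition 3.2. If $X, Y \in \Gamma^\infty(TN)$ and
\[ d\varphi(X) = X^i \, \widetilde\varphi^* E_i, \qquad d\varphi(Y) = Y^j \, \widetilde\varphi^* E_j, \]
then
\[ d\varphi([X, Y]) = X(Y^j) \, \widetilde\varphi^* E_j - Y(X^i) \, \widetilde\varphi^* E_i + X^i Y^j \, \widetilde\varphi^*([E_i, E_j]). \]

Proof. For any $x \in N$ and any $f \in C^\infty(M)$,
\[ Y(\varphi^* f)(x) = (d\varphi_x)(Y_x) f = Y^j(x) (\widetilde\varphi^* E_j)_x (f) = Y^j(x) (E_j)_{\varphi(x)} (f) = \big( Y^j \varphi^*(E_j f) \big)(x), \]
so $Y(\varphi^* f) = Y^j \varphi^*(E_j f)$, and similarly for $X$. Applying this with $f$ replaced by $E_j f$ and $E_i f$, we get
\[ Y^j X_x\big( \varphi^*(E_j f) \big) - X^i Y_x\big( \varphi^*(E_i f) \big) = X^i Y^j \varphi^*(E_i E_j f - E_j E_i f) = X^i Y^j \varphi^*([E_i, E_j] f), \]
where all coefficient functions are evaluated at $x$. It follows that as vectors in $T_{\varphi(x)}M$ acting on $f \in C^\infty(M)$,
\[
\begin{aligned}
(d\varphi)_x([X, Y]_x) f
&= [X, Y]_x(\varphi^* f) \\
&= X_x\big( Y(\varphi^* f) \big) - Y_x\big( X(\varphi^* f) \big) \\
&= X_x\big( Y^j \varphi^*(E_j f) \big) - Y_x\big( X^i \varphi^*(E_i f) \big) \\
&= X_x(Y^j) \varphi^*(E_j f) + Y^j X_x\big( \varphi^*(E_j f) \big) - Y_x(X^i) \varphi^*(E_i f) - X^i Y_x\big( \varphi^*(E_i f) \big) \\
&= X_x(Y^j) \varphi^*(E_j f) - Y_x(X^i) \varphi^*(E_i f) + X^i(x) Y^j(x) \varphi^*([E_i, E_j] f).
\end{aligned}
\]
So we get, as sections in $\Gamma^\infty(E)$,
\[ d\varphi([X, Y]) = X(Y^j) \, \widetilde\varphi^* E_j - Y(X^i) \, \widetilde\varphi^* E_i + X^i Y^j \, \widetilde\varphi^*([E_i, E_j]). \]
□

¶ The induced connection on the pullback bundle.

Since each fiber $E_x = T_{\varphi(x)}M$, one may transplant structures on $TM$ to $E$. For example, if $(M, g)$ is a Riemannian manifold, then the metric structure on $TM$ gives rise to a metric structure on the pullback bundle $E$ in the natural way, namely one just endows each fiber $E_x = T_{\varphi(x)}M$ with the inner product $g_{\varphi(x)}$. Although it is not that obvious, we may also transplant linear connections on $TM$ to $E$:

Proposition 3.3. Given any linear connection $\nabla$ on $TM$, there exists a unique linear connection $\widetilde\nabla$ on $E$ satisfying
\[ \widetilde\nabla_u (\widetilde\varphi^* V) = \widetilde\varphi^* \big( \nabla_{(d\varphi)_x u} V \big) \tag{1} \]
for any $x \in N$, $u \in T_xN$ and $V \in \Gamma^\infty(TM)$.

Proof. We first prove the uniqueness. Assume $\widetilde\nabla$ exists. For any $x_0 \in N$ and any local frame $\{E_i\}_{i=1}^m$ around $\varphi(x_0)$, $\{\widetilde\varphi^* E_i\}_{i=1}^m$ is a local frame around $x_0$. Thus for any $s \in \Gamma^\infty(E)$, we may write
\[ s(x) = s^i(x) \, \widetilde\varphi^* E_i(x) \]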