MATH 5403 - Calculus of Variations, Section 001 - Fall 2019
TR 1:30-2:45 p.m., 809 PHSC
Instructor:
Nikola Petrov, 1101 PHSC, npetrov AT math.ou.edu
Office Hours:
Tue 9:30-10:30 p.m., Wed 12:00-1:00 p.m., or by appointment, in 1101 PHSC.
First day handout
Prerequisites:
4433 (Intro to Analysis I) or 3423 (Physical Math II) or 4163 (Intro to PDEs)
Course description:
I plan to cover Euler-Lagrange equations, Legendre transform, Hamilton’s equations,
Lagrange multipliers, Hamilton-Jacobi theory, conservation laws and Noether’s theorem,
second variation, conditions for strong and weak extrema. If time permits,
we may discuss applications to eigenvalue problems and/or outline some ideas
from optimal control theory (but certainly will not go into the latter in depth).
I will probably sacrifice some mathematical rigor in favor of discussing
more applications and examples. We will cover many interesting examples
from Mechanics, Optics, and Geometry (isoperimetric problems, curves of shortest length).
Text:
The main textbook for the class is
-
[GF] I. M. Gelfand, S. V. Fomin, Calculus of Variations, Dover, 1991.
We may use parts of the following books, freely available from the OU Libraries web-site for OU students:
-
[vB] B. van Brunt, The Calculus of Variations, Springer, 2004,
-
[RB] A. Rojo, A. Bloch, The Principle of Least Action, Cambridge University Press, 2018,
-
[K] H. Kielhöfer,
Calculus of Variations: An Introduction to the One-Dimensional Theory with Examples and Exercises,
Springer, 2018,
-
[FG] P. Freguglia, M. Giaquinta,
The Early Period of the Calculus of Variations,
Birkhäuser, 2016.
Homework:
Content of the lectures:
-
Lecture 1 (Tue, Aug 20):
Introduction - classical problems in Calculus of Variations:
-
elementary minimization problems from Calculus;
-
Queen Dido's isoperimetric problem;
-
Fermat's principle in optics: paths of light rays in geometric optics
are such that the "optical length" of a path,
i.e., the time light needs to get from a point A to a point B, is minimal;
-
the reflection law and Snell's law in optics
as consequences of Fermat's minimal-time principle;
-
Johann Bernoulli's brachistochrone challenge (1696);
-
minimizing the area of a soap film;
-
minimizing the potential energy of a cable hanging in the Earth's gravity field.
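Fermat's principle can be illustrated numerically; in the sketch below (with made-up data: endpoints A=(0,1), B=(1,−1) and speeds v1, v2 chosen arbitrarily) we minimize the travel time across a flat interface and check that the minimizer satisfies Snell's law:

```python
import math

# Sketch (made-up data): light travels from A = (0, 1) in a medium with speed
# v1 to B = (1, -1) in a medium with speed v2, crossing the interface y = 0
# at the point (x, 0).  Fermat's principle: the actual crossing point x
# minimizes the total travel time.
v1, v2 = 1.0, 0.5

def travel_time(x):
    return math.hypot(x, 1.0) / v1 + math.hypot(1.0 - x, 1.0) / v2

# ternary search for the minimizer of the (convex) travel time on [0, 1]
lo, hi = 0.0, 1.0
for _ in range(200):
    m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
    if travel_time(m1) < travel_time(m2):
        hi = m2
    else:
        lo = m1
x_star = (lo + hi) / 2

# Snell's law at the minimizer: sin(theta1)/v1 = sin(theta2)/v2
sin1 = x_star / math.hypot(x_star, 1.0)
sin2 = (1.0 - x_star) / math.hypot(1.0 - x_star, 1.0)
```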
-
Lecture 2 (Thu, Aug 22):
Preliminaries:
-
notations: continuous functions C[a,b],
continuously differentiable functions C1[a,b];
-
definition of a relative min/max of a function of one variable;
-
notation: "little o";
-
representing a differentiable function in the form
ƒ(x0+h)=ƒ(x0)+Lh+R1(h),
where R1(h)=o(h);
-
Taylor series for functions of one and several variables;
-
linear spaces, normed spaces;
-
norms in C[a,b] and C1[a,b].
-
Lecture 3 (Tue, Aug 27):
More preliminaries:
-
if a function is C2[a,b], then it can be written in the form
ƒ(x0+h)=ƒ(x0)+ƒ'(x0)h+(1/2)ƒ''(x0)h2+R2(h),
where R2(h)/h2→0 as h→0;
-
inspiration from Calculus: if a C2 function of one variable has a relative min (resp. max) at x0, then ƒ'(x0)=0
and ƒ''(x0)≥0, resp. ƒ''(x0)≤0;
-
definition of a relative min/max of a function from a normed space to R;
-
definition of a first variation of a functional:
δJ[h]=(d/dt)J[y+th]|t=0
(remark: Fréchet derivatives);
-
Theorem about differentiation under the integral;
-
examples of computing the first variation of a functional.
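The definition δJ[h]=(d/dt)J[y+th]|t=0 can be checked by machine; here is a sketch with a toy functional J[y]=∫01(y2+(y')2)dx and hand-picked y and h (my own choices, not necessarily the examples used in class):

```python
# Sketch with a toy functional (my own choices): J[y] = the integral over [0,1]
# of y^2 + (y')^2, whose first variation is the integral of 2*y*h + 2*y'*h'.
# We compare this with the directional derivative (d/dt) J[y+th] at t=0.

def integrate(f, n=4000):
    # composite trapezoid rule on [0, 1]
    h = 1.0 / n
    return h * (0.5 * f(0.0) + sum(f(i * h) for i in range(1, n)) + 0.5 * f(1.0))

y    = lambda x: x * x          # base function y(x) = x^2
yp   = lambda x: 2 * x          # y'(x)
eta  = lambda x: x * (1 - x)    # perturbation h(x), vanishes at the endpoints
etap = lambda x: 1 - 2 * x      # h'(x)

def J(t):
    # J[y + t*eta]
    return integrate(lambda x: (y(x) + t * eta(x)) ** 2 + (yp(x) + t * etap(x)) ** 2)

t = 1e-5
numeric  = (J(t) - J(-t)) / (2 * t)   # central difference in t
analytic = integrate(lambda x: 2 * y(x) * eta(x) + 2 * yp(x) * etap(x))
```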
-
Lecture 4 (Thu, Aug 29):
Preparations for the Euler-Lagrange equation:
-
functionals;
-
the action J[y] as a functional of a function y(x);
-
idea of obtaining the function y(x) from the condition
of extremizing the action over a given interval x∈[a,b]
for given values of y(a) and y(b);
-
admissible classes of functions that "perturb" the function y
- usually functions from C[a,b] or C1[a,b]
that vanish at the endpoints;
-
Theorem: if J:X→R has a relative min at some
Y∈X with respect to some class S of admissible functions,
then for each h∈S, δJ[h]=0;
-
expression for δJ[h]=0 when the functional
is an integral of a function ƒ(x,y(x),y'(x));
-
examples of different norms of functions: C0- and C1-norms
of ƒ(x)=(1/n)sin(nx) (n∈N);
-
weak relative minimum and strong relative minimum of a functional;
-
Lemma: integration by parts;
-
Lemma: existence of "test functions";
-
Fundamental Lemma of Calculus of Variations.
-
Lecture 5 (Tue, Sep 3):
The Euler-Lagrange equation:
-
Theorem: if the functional J has a relative minimum
at y∈X w.r.t. a space S of admissible functions,
then for all h∈S,
δJ[h]=0;
-
Theorem:
for y∈C1[a,b]
and ƒ:R3→R continuous on some open subset,
if the functional J[y] is given as an integral
of ƒ(x,y(x),y'(x))
over x in [a,b],
then δJ[h] is an integral over x in [a,b]
of ƒyh+ƒy'h';
-
example of a weak relative minimum that is not a strong relative minimum:
if J[y] is an integral over x∈[0,1]
of (y')2[1−(y')2]
with BCs y(0)=0, y(1)=0,
then the function Y(x)=0 is a weak relative minimum,
but the continuous (but not C1) "tent" function
connecting the points (0,0), (1/2,1), and (1,0) gives a smaller value of the functional;
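The computation behind this example is simple enough to verify directly; the sketch below (with the tent slopes ±2 hard-coded) confirms that the tent function beats Y(x)=0:

```python
# Check of the example: J[y] = integral over [0,1] of (y')^2 * [1 - (y')^2]
# with y(0) = y(1) = 0.  The integrand depends only on y', so a midpoint
# Riemann sum over the slope suffices.  The tent through (0,0), (1/2,1), (1,0)
# has slope +/-2, so the integrand equals 4*(1-4) = -12 everywhere.

def J(slope, n=1000):
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = slope((i + 0.5) * h)
        total += s * s * (1 - s * s) * h
    return total

J_flat = J(lambda x: 0.0)                        # the weak relative minimum Y(x) = 0
J_tent = J(lambda x: 2.0 if x < 0.5 else -2.0)   # the tent function
```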
-
Theorem (necessary condition for a weak minimum in C2):
under the usual assumptions on ƒ(x,y(x),y'(x)),
let J[y] be an integral over x∈[a,b]
of ƒ(x,y(x),y'(x)),
and suppose that J has a weak relative minimum
at Y∈C2[a,b]
with respect to admissible functions
S={h∈C1[a,b]:h(a)=h(b)=0},
then Y(x) satisfies the Euler-Lagrange (EL) equation
ƒy(x,Y(x),Y '(x))−(d/dx)[ƒy'(x,Y(x),Y '(x))]=0;
-
Question: what if the minimizer Y we are looking for
is in C1[a,b]
but not in C2[a,b]?
-
Example (the minimizer is in C1[a,b]
but not in C2[a,b]):
if J[y] is an integral over x∈[−1,1]
of y2(2x−y')2,
then Y(x)=0 for x∈[−1,0]
and Y(x)=x2 for x∈[0,1]
is a minimizer in C1[a,b]
but not in C2[a,b].
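One can verify this example directly: since the integrand y2(2x−y')2 is nonnegative, any function making it vanish identically is a global minimizer; a quick numerical check (sketch):

```python
# Check of the example: the integrand y^2 * (2x - y')^2 is nonnegative, and the
# proposed minimizer Y (C1 but not C2 at x = 0) makes it vanish identically,
# so J[Y] = 0 is the global minimum.

def Y(x):
    return 0.0 if x < 0 else x * x

def Yp(x):
    return 0.0 if x < 0 else 2 * x

n = 2000
h = 2.0 / n
J = sum(Y(x) ** 2 * (2 * x - Yp(x)) ** 2 * h
        for x in (-1.0 + (i + 0.5) * h for i in range(n)))
```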
-
Lecture 6 (Thu, Sep 5):
The Euler-Lagrange equation (cont.):
-
Lemma: if ƒ∈C1[a,b]
and ƒ'(x)=0 for all x∈[a,b],
then ƒ=const on [a,b]
(follows from the Mean Value Theorem);
-
The "Homework Lemma": if ƒ∈C[a,b],
ƒ(x)≥0 on [a,b],
and the integral of ƒ over [a,b] is zero,
then ƒ(x)=0 for all x∈[a,b];
-
Lemma (Du Bois-Reymond): if ƒ∈C[a,b]
and for every η∈C1[a,b]
with η(a)=η(b)=0
the integral of ƒ(x)η'(x) vanishes,
then ƒ is constant on [a,b];
-
Corollary: let ƒ be an R-valued C2 function on an
appropriate domain in R3,
y∈C1[a,b],
let J[y] be an integral over x∈[a,b]
of ƒ(x,y(x),y'(x)),
and suppose that J has a weak relative minimum
at Y∈C1[a,b]
with respect to admissible functions
S={h∈C1[a,b]:h(a)=h(b)=0},
then ƒy'(x,Y(x),Y '(x))
is a differentiable function of x on [a,b],
and for all x∈[a,b],
Y(x) satisfies the Euler-Lagrange (EL) equation
ƒy(x,Y(x),Y '(x))−(d/dx)[ƒy'(x,Y(x),Y '(x))]=0;
-
Fact: from the differentiability of
ƒy'(x,Y(x),Y '(x)),
it follows that the function Y(x) is C2
for all values of x for which
ƒy'y'(x,Y(x),Y '(x))≠0,
and the second derivative of Y is
Y ''(x)=[ƒy(x,y,y')−ƒy'x(x,y,y')−Y '(x)ƒy'y(x,y,y')]/ƒy'y'(x,y,y'),
where the right-hand side is evaluated at y=Y(x), y'=Y '(x)
(for a proof see O. Bolza, Lectures on the Calculus of Variations,
Dover, New York, 2018, pages 25-26).
Particular cases of the Euler-Lagrange equations:
-
Case 1: ƒ does not depend on y': the Euler-Lagrange equation is not a differential equation,
and usually the boundary conditions cannot be satisfied (because the solution of EL has no arbitrary constants);
examples:
-
if J[y] is integral of y2 over x∈[0,1]
with y(0)=0, y(1)=1, then the minimizer is the discontinuous function
Y(x)=0 for x∈[0,1) and Y(1)=1;
-
another example: if J[y] is integral of y over x∈[0,1]
with y(0)=0, y(1)=1, then a minimizer does not exist
because J[y] can be made as negative as desired;
-
Case 2: ƒ does not depend on y: the Euler-Lagrange equation is
(d/dx)[ƒy'(x,Y(x),Y '(x))]=0,
hence
ƒy'(x,Y(x),Y '(x))=const,
and one has to only integrate once more to obtain the solution;
example:
-
J[y] is integral of [1+(y')2]1/2/x over x∈[1,2]
with y(1)=0, y(2)=1 - the minimizer is the function
Y(x)=2−(5−x2)1/2
(which is an arc of the circle x2+(y−2)2=5).
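This extremal can be tested against the first integral ƒy'=const from Case 2; the snippet below (a sketch; the constant turns out to be 1/√5) samples ƒy' along Y(x):

```python
import math

# Sketch: for f(x,y,y') = sqrt(1 + (y')^2)/x (Case 2: f does not depend on y),
# f_{y'} = y'/(x * sqrt(1 + (y')^2)) must be constant along the extremal
# Y(x) = 2 - sqrt(5 - x^2); the constant turns out to be 1/sqrt(5).

def Yp(x):
    # derivative of Y(x) = 2 - sqrt(5 - x^2)
    return x / math.sqrt(5 - x * x)

def f_yprime(x):
    yp = Yp(x)
    return yp / (x * math.sqrt(1 + yp * yp))

values = [f_yprime(1 + 0.01 * i) for i in range(101)]   # sample x in [1, 2]
```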
-
Lecture 7 (Tue, Sep 10):
Particular cases of the Euler-Lagrange equations (cont.):
-
Case 3: ƒ does not depend on x:
the quantity E:=y'ƒy'−ƒ
evaluated at the solution Y(x) of the Euler-Lagrange equation
is independent of x;
examples:
-
minimizing the area of a surface of revolution (soap film)
formed by rotating the graph of the function y(x) about the x-axis:
the area is integral of 2πy[1+(y')2]1/2
over x∈[a,b] - the Euler-Lagrange equation may have
zero, one, or two solutions depending on the boundary conditions;
the minimizing surface is called "catenoid";
solving the problem for the "tangency" between the graphs
of the functions φ(z) and ψ(z):
if the curves are tangent to one another at z=z*,
then both their values and their slopes must be equal, i.e.,
φ(z*)=ψ(z*),
φ'(z*)=ψ'(z*);
Goldschmidt solution;
details about the solution of this problem can be found in
A. R. Forsyth, Calculus of Variations, Cambridge University Press, 1927,
pages 30-35;
-
minimizing the potential energy of a chain of linear mass density λ
hanging in the constant gravity field of the Earth:
the potential energy is integral of λgy[1+(y')2]1/2
over x∈[a,b] - the problem is the same as the catenoid above,
the solution curve is called "catenary".
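The conservation law E=y'ƒy'−ƒ=const of Case 3 is easy to test numerically on the catenary; in the sketch below I use the representative extremal y(x)=cosh(x) and the density ƒ=y[1+(y')2]1/2 with the constant factor (2π or λg) dropped:

```python
import math

# Sketch: along the catenary y(x) = cosh(x), with density f = y*sqrt(1+(y')^2)
# (the constant factor 2*pi or lambda*g dropped), the first integral of Case 3,
# E = y'*f_{y'} - f, should be independent of x (here E = -1).

def E(x):
    y, yp = math.cosh(x), math.sinh(x)
    f = y * math.sqrt(1 + yp * yp)
    f_yp = y * yp / math.sqrt(1 + yp * yp)   # partial of f with respect to y'
    return yp * f_yp - f

samples = [E(-1 + 0.05 * i) for i in range(41)]   # sample x in [-1, 1]
```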
-
Lecture 8 (Thu, Sep 12):
Particular cases of the Euler-Lagrange equations (cont.):
-
Case 3: ƒ does not depend on x (cont.): more examples:
-
brachistochrone problem (Johann Bernoulli, 1696): setting up the problem by using the expression for the arclength
and the conservation of energy to determine the speed of the mass, solution: a cycloid through the two points
(the largest cycloid as a global minimizer, and other possible cycloids as local minimizers).
Variational problems with multiple unknown functions:
-
derivation of the Euler-Lagrange equations for the unknown functions;
-
example: the central force two-body problem in the Lagrangian mechanics framework:
-
unknown functions: the polar coordinates r and θ as functions of time;
-
expression for the kinetic and the potential energy:
T=(m/2)[(r')2+r2(θ')2],
U=−α/r (α=const>0);
-
Euler-Lagrange equations, conservation of the angular momentum
M=mr2θ'=const
(follows from the fact that the Lagrangian L=T−U does not depend explicitly on θ);
-
using the angular momentum conservation to obtain a second-order ODE for r(t);
-
change of the independent variable from t to θ in the ODE,
change of the dependent variable from r to u=1/r in the ODE;
-
solving the ODE for u(θ);
-
geometric interpretation of the solution for r(θ) - it describes an ellipse,
expressing the eccentricity of the ellipse.
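The conservation of M along solutions can also be seen numerically; the sketch below (units with m=1, α=1, and an arbitrarily chosen elliptic initial condition) integrates the equations of motion with RK4 and checks that M stays constant:

```python
# Sketch (units with m = 1, alpha = 1, arbitrary elliptic initial condition):
# integrate the planar Kepler problem y'' = -y/|y|^3 with RK4 and check that
# the angular momentum M = y1*y2' - y1'*y2 stays constant along the orbit.

def deriv(s):
    y1, y2, v1, v2 = s
    r3 = (y1 * y1 + y2 * y2) ** 1.5
    return (v1, v2, -y1 / r3, -y2 / r3)

def rk4_step(s, h):
    k1 = deriv(s)
    k2 = deriv(tuple(a + 0.5 * h * b for a, b in zip(s, k1)))
    k3 = deriv(tuple(a + 0.5 * h * b for a, b in zip(s, k2)))
    k4 = deriv(tuple(a + h * b for a, b in zip(s, k3)))
    return tuple(a + h / 6 * (b + 2 * c + 2 * d + e)
                 for a, b, c, d, e in zip(s, k1, k2, k3, k4))

def M(s):
    y1, y2, v1, v2 = s
    return y1 * v2 - y2 * v1

state = (1.0, 0.0, 0.0, 1.2)   # elliptic: energy 0.72 - 1 < 0
M0 = M(state)
for _ in range(5000):          # integrate up to t = 10
    state = rk4_step(state, 0.002)
M_end = M(state)
```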
-
Lecture 9 (Tue, Sep 17):
Variational problems for a function of more than one independent variable:
-
setting up the problem for finding a weak relative minimum of a functional J
of a function u(x,y) of two independent variables fixed at the boundary
of the domain Ω⊂R2;
-
recalling Green's Theorem from Calculus; comparison with the Divergence Theorem;
-
Lemma generalizing the Fundamental Lemma of Calculus of Variations to functions of more than one independent variable;
-
Theorem: If a function u∈C2 is a weak relative minimum
of an action functional J with Lagrangian density F,
then it satisfies the Euler-Lagrange equation
∂F/∂u−(∂/∂x)(∂F/∂ux)−(∂/∂y)(∂F/∂uy)=0;
-
planar motion of a string in the Earth's gravity:
-
setting up the problem for the planar motion of a string of length l,
linear density λ, tension τ in the Earth's gravity field g=−gk,
assuming that the ends of the string are firmly attached;
-
the shape of the string at time t is described by z=u(x,t),
where x∈[0,l] and t∈[0,∞);
-
derivation of the expression for the linear density of the kinetic energy, λut2/2.
-
Lecture 10 (Thu, Sep 19):
Variational problems for a function of more than one independent variable (cont.):
-
planar motion of a string in the Earth's gravity (cont.):
-
derivation of the expression for the density of the gravitational potential energy, λgu;
-
derivation of the expression for the density of the elastic potential energy,
τ[(1+ux2)1/2−1]≈τux2/2;
-
setting up the variational problem for a vibrating string;
-
derivation of the Euler-Lagrange equation for the planar motion
of a string hanging in the gravity field;
-
discussion of the physical meaning of the solution
u(x,t)=ƒ(x−ct)+g(x+ct)
of the wave equation (without the gravity term) as waves propagating to the right and to the left with
speed c=(τ/λ)1/2;
-
setting up and solving the boundary value problem
for the shape of a hanging string attached to two points (at the same height);
-
finding the shape of the steady state of the string and its maximum sag.
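The d'Alembert solution mentioned above is easy to verify by finite differences; in this sketch the profiles ƒ, g and the speed c are arbitrary choices:

```python
import math

# Sketch (f, g, and c are arbitrary choices): for any C^2 profiles f and g,
# u(x,t) = f(x - c*t) + g(x + c*t) solves the wave equation u_tt = c^2 * u_xx;
# we verify this with second-order central differences at a few points.

c = 2.0
f = lambda s: math.exp(-s * s)
g = lambda s: math.sin(s)
u = lambda x, t: f(x - c * t) + g(x + c * t)

def u_tt(x, t, h=1e-4):
    return (u(x, t + h) - 2 * u(x, t) + u(x, t - h)) / (h * h)

def u_xx(x, t, h=1e-4):
    return (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)

residual = max(abs(u_tt(x, t) - c * c * u_xx(x, t))
               for x in (-1.0, 0.0, 0.5) for t in (0.0, 0.3))
```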
-
Lecture 11 (Tue, Sep 24):
A digression: Implicit Function Theorem:
-
motivation of the problem of defining locally a function of one variable
whose graph is a given curve in the plane
(not necessarily satisfying the "vertical line test");
-
geometric derivation of the expression for the derivative
y'(x) of the function y(x) defined implicitly
by the equation Φ(x,y)=0;
-
relation with the "implicit differentiation" from Calculus;
-
statement of the Implicit Function Theorem for a function
y:R→R defined implicitly
by Φ(x,y)=0, where Φ:R2→R
is a Ck function;
-
statement of the Implicit Function Theorem for a function
y:Rm→Rn defined implicitly
by Φ(x,y)=0, where Φ:Rn+m→Rn
is a Ck function.
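The implicit-differentiation formula y'(x)=−Φx/Φy can be sanity-checked on the unit circle (my standard example, not necessarily the one used in class):

```python
import math

# Check of y'(x) = -Phi_x/Phi_y on a standard example (my choice): the circle
# Phi(x,y) = x^2 + y^2 - 1 = 0, upper branch y(x) = sqrt(1 - x^2),
# where the formula gives y' = -(2x)/(2y) = -x/y.

def y(x):
    return math.sqrt(1 - x * x)

def yprime_implicit(x):
    return -x / y(x)

def yprime_numeric(x, h=1e-6):
    return (y(x + h) - y(x - h)) / (2 * h)

err = max(abs(yprime_implicit(x) - yprime_numeric(x))
          for x in (-0.5, 0.0, 0.3, 0.7))
```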
A digression: dimensional analysis in physics:
-
deriving the formula for the period of a mathematical pendulum from dimensional analysis
(up to an overall dimensionless constant and a function depending on a dimensionless quantity, namely,
the maximum angle θmax).
-
Lecture 12 (Thu, Sep 26):
Variation of a functional with movable end points:
-
setting up the problem, defining a distance between
the "old" function y and the "new" function y*
that takes into account that (1) y and y*
are defined on different domains, and (2) the endpoints move
from (x0,y0) to
(x0+δx0,y0+δy0)
and from (x1,y1) to
(x1+δx1,y1+δy1);
-
derivation of the variation of the functional in which only terms
linear in |y−y*|, |y'−y*'|,
|δx0|, |δy0|,
|δx1|, and |δy1| are kept;
-
generalization of the formula for the variation of the functional (in linear approximation)
for the case of several functions y1(x),...,yn(x);
-
derivation of the expressions for the boundary terms when the endpoints are constrained
to move on a curve (in the case of one function y(x))
- transversality conditions.
[GF, pages 54-57 of Sec. 13, 59-60 of Sec. 14]
-
Lecture 13 (Tue, Oct 1):
Variation of a functional with movable end points (cont.):
-
another point of view:
-
consider a fixed left end point (x0,y0)
and movable right end point (ξ,η)∈U,
where U is a small open set in R2
(assume that (ξ,η) is close enough to (x0,y0)
so that there is a unique minimizer connecting the two points),
-
let y(ξ,η)(x) be the minimizer of the functional J[y]
with boundary conditions
y(ξ,η)(x0)=y0,
y(ξ,η)(ξ)=η,
-
define the function τ:U→R by τ(ξ,η):=J[y(ξ,η)],
-
use the expression for δJ derived in Lecture 12 in order to find the partial derivatives
∂τ/∂ξ and ∂τ/∂η;
-
derivation of the transversality conditions in the particular case of Lagrangian of the form
L(x,y,y')=A(x,y)[1+(y')2]1/2 - orthogonality conditions;
-
physical interpretation of the orthogonality conditions - absence of tangential components of the force exerted on the constraint.
Broken extremals; the Weierstrass-Erdmann conditions:
-
an example of an action functional whose minimizer is piecewise smooth:
integral over [−1,1] of
L(x,y,y')=y2(1−y')2
with BCs y(−1)=0, y(1)=1:
the solution is
y(x)=0 for x∈[−1,0] and =x for x∈(0,1];
-
variation of a functional with fixed ends which has one point where the function y(x) has a corner;
-
Weierstrass-Erdmann conditions at the break.
[GF, pages 60-61 of Sec. 14, Sec. 15]
-
Lecture 14 (Thu, Oct 3):
Broken extremals; the Weierstrass-Erdmann conditions (cont.):
-
an example of a variational problem that admits extremals with arbitrarily many breaks:
J[y] is an integral of
L(x,y,y')=(y')2(1−y')2
over x∈[0,2] with boundary conditions
y(0)=0, y(2)=1.
[L. Elsgolc, Calculus of Variations, Dover, 2007, Ch. II, Sec. 4, Example 2 on page 94]
A digression: strange phenomena in quadratic equations:
the equation x−1=0 has one solution, x=1,
but the equation εx2+x−1=0
where ε>0 is very small has two solutions,
x1=(−1+(1+4ε)1/2)/(2ε)≈1−ε,
and
x2=(−1−(1+4ε)1/2)/(2ε)≈−1/ε.
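A quick numerical check of this singular-perturbation phenomenon (with ε=10−3 as an arbitrary small value):

```python
import math

# Quick check of the digression (eps = 1e-3 as an arbitrary small value):
# eps*x^2 + x - 1 = 0 has one root near the root x = 1 of x - 1 = 0,
# while the second root escapes to -infinity like -1/eps.

eps = 1e-3
disc = math.sqrt(1 + 4 * eps)
x1 = (-1 + disc) / (2 * eps)   # the "regular" root, approximately 1 - eps
x2 = (-1 - disc) / (2 * eps)   # the "singular" root, approximately -1/eps
```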
Canonical form of the Euler-Lagrange equations:
-
generalized momenta: pi:=∂F/∂yi';
-
idea: change variables from (x,y,y') to (x,y,p),
where the generalized velocities yi' are expressed
as functions of (x,y,p) from the system
pi=∂F/∂yi', i=1,...,n;
-
potential complications: one may not be able to express the yi'
as functions of (x,y,p) if the conditions in the Implicit Function Theorem
are not satisfied; an example:
F(x,y1,y2,y1',y2')=y1(y1')2+y1'y2'+[1/(4y1)](y2')2
(checking that the condition det(Fy1'y2')≠0 is violated);
-
a digression on change of variables
ξ=Ξ(x,y), η=Η(x,y)
with inverse change x=X(ξ,η), y=Y(ξ,η),
in a function ƒ(x,y) - the result is a new function
g(ξ,η) defined by
g(ξ,η):=ƒ(X(ξ,η),Y(ξ,η))
or, equivalently,
g(Ξ(x,y),Η(x,y))=ƒ(x,y);
-
an example illustrating that the partial derivatives depend on all variables:
the partial derivatives of ƒ(x,y)=x+2y are
ƒx=1, ƒy=2;
changing variables from (x,y) to (ξ,y), where
ξ=x+y, y=y - the new function is equal to
g(ξ,y)=ξ+y, so the partial derivatives of the new function are
gξ=1, gy=1 - how to resolve this "paradox"?
[GF, pages 67-68 of Sec. 16, Sec. 15]
-
Lecture 15 (Tue, Oct 8):
Canonical form of the Euler-Lagrange equations (cont.):
-
derivation of Hamilton's equations by direct differentiation of the Hamiltonian
H(x,y,p)=pV(x,y,p)−F(x,y,V(x,y,p)):
dy/dx=∂H/∂p,
dp/dx=−∂H/∂y,
-
generalization to the case of y=(y1,...,yn),
p=(p1,...,pn);
-
derivative of the Hamiltonian with respect to x: ∂H/∂x=−∂F/∂x,
so a system is autonomous exactly when the Hamiltonian does not depend explicitly on x;
-
a digression on the differential of function of many variables:
-
how to recognize that an expression
M(x,y)dx+N(x,y)dy
is a differential of a function?
-
exact equations,
-
exactness is a "fragile" property: while
2xeydx+x2eydy=0
is an exact equation,
2eydx+xeydy=0
(the above equation divided by x) is not;
-
solvability of exact equations in a simply connected domain;
-
invariance of the differential of a function with respect to change of variables;
-
an alternative derivation of Hamilton's equations ([GF], pages 69-70).
Description of a physical system within Hamilton's formalism:
-
standing assumption: the system is autonomous;
-
C={(y1,...,yn)} configuration space (n dimensional),
-
P={(y1,...,yn,p1,...,pn)}
phase space (2n dimensional),
-
set z=(y,p)∈P,
flow Φt:P→P:
t↦Φt(z0) solves Hamilton's equations with the initial condition
Φ0(z0)=z0;
-
semigroup property of the flow:
Φt○Φs=Φt+s;
-
corollaries of the semigroup property:
Φ0=IdP,
(Φt)−1=Φ−t;
-
different points of view:
-
Φ:(−ε,ε)×P→P (for some ε>0);
-
Φ:(−ε,ε)→Diff P, where
Diff P is the group of diffeomorphisms of P
(i.e., invertible smooth transformations of P with a smooth inverse;
the group operation is the composition ○ of two diffeomorphisms);
-
the mapping (R,+)→(Diff P,○):t↦Φt
is a group homomorphism (its image is an Abelian subgroup of Diff P)
(assuming that Φt exists and is a diffeomorphism for all t);
-
observable: a smooth function G:P→R;
-
time evolution of the observable: consider the composition
G○Φt:P→R.
-
Lecture 16 (Thu, Oct 10):
Description of a physical system within Hamilton's formalism (cont.):
-
writing Hamilton's equations as
dz/dx=J∇H, where
z=(y,p),
J is the (2n)×(2n) block matrix
with blocks 0, I in the first row and −I, 0 in the second,
where 0 is the zero n×n matrix
and I is the identity n×n matrix;
-
definition of the Poisson bracket {G,K}:P→R
of the observables G:P→R and K:P→R;
-
properties of the Poisson bracket: for observables G,K,M:P→R,
-
linearity: {αG+K,M}=α{G,M}+{K,M}
for α=const,
-
antisymmetry
{G,K}=−{K,G};
-
Jacobi identity:
{{G,K},M}+{{K,M},G}+{{M,G},K}=0,
-
Leibniz rule:
{GK,M}={G,M}K+G{K,M},
-
the R-linear space C∞(P) of smooth observables
together with an operation
{ , }:C∞(P)×C∞(P)→C∞(P)
satisfying the first three properties is called a Lie algebra;
-
the ring C∞(P) of smooth observables
together with an operation
{ , }:C∞(P)×C∞(P)→C∞(P)
satisfying all four properties above is called a Poisson algebra;
-
the time evolution of an observable G:P→R
is given by G○Φt:P→R,
and its rate of change is
d(G○Φt)/dt={G,H}○Φt;
-
a first integral (or constant of motion) of a system of ODEs is a function of the dependent variables
that does not change with time, i.e., G○Φt(z)=G(z)
for any t∈R, z∈P;
-
in an autonomous system, the antisymmetry of the Poisson bracket implies that
d(H○Φt)/dt={H,H}○Φt=0;
-
a function G:R×P→R
changes with time according to
(d/dt)G(t,Φt(z))=(∂G/∂t)(t,Φt(z))+{G,H}(t,Φt(z));
in particular, for a non-autonomous system (in which H=H(t,z)),
(d/dt)H(t,Φt(z))=(∂H/∂t)(t,Φt(z)).
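The evolution formula d(G○Φt)/dt={G,H}○Φt can be tested on the harmonic oscillator, whose flow is known in closed form (a sketch; the observable G=y2, the initial point, and the time are arbitrary choices):

```python
import math

# Sketch: harmonic oscillator H(y,p) = (y^2 + p^2)/2, whose flow Phi_t is the
# rotation y(t) = y0*cos(t) + p0*sin(t), p(t) = -y0*sin(t) + p0*cos(t).
# For the observable G(y,p) = y^2 one computes by hand
# {G,H} = (dG/dy)(dH/dp) - (dG/dp)(dH/dy) = 2*y*p,
# and we check d(G o Phi_t)/dt = {G,H} o Phi_t numerically.

def flow(z, t):
    y0, p0 = z
    return (y0 * math.cos(t) + p0 * math.sin(t),
            -y0 * math.sin(t) + p0 * math.cos(t))

G = lambda z: z[0] ** 2
bracket_GH = lambda z: 2 * z[0] * z[1]          # {G,H}, computed by hand
H = lambda z: (z[0] ** 2 + z[1] ** 2) / 2

z0, t, h = (0.7, -0.4), 0.9, 1e-6
lhs = (G(flow(z0, t + h)) - G(flow(z0, t - h))) / (2 * h)   # d/dt of G o Phi_t
rhs = bracket_GH(flow(z0, t))
```

The second assertion below illustrates the remark about autonomous systems: H itself is conserved along the flow since {H,H}=0.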
Legendre transform:
-
a definition of a convex function from R to R (by "convex" we mean "strictly convex");
if ƒ is convex and C2, then ƒ''>0;
-
let ƒ:R→R be a convex function,
for any p∈R define
ξ(p)=argmaxξ[pξ−ƒ(ξ)]
(the convexity of ƒ guarantees the uniqueness of ξ(p));
-
if ƒ is convex and C2, then ξ(p) is determined by
(d/dξ)[pξ−ƒ(ξ)]=0
and
(d2/dξ2)[pξ−ƒ(ξ)]>0,
i.e., ξ(p) is defined implicitly by p=ƒ'(ξ)
(and ƒ''(ξ)>0, which is automatically satisfied since ƒ is convex);
-
Hamiltonian
H(p):=[pξ−ƒ(ξ)]|ξ=ξ(p)=[pξ(p)−ƒ(ξ(p))];
-
Legendre transform: a transition from
(ξ,ƒ(ξ)) to (p,H(p));
-
example: if ƒ:[0,∞)→[0,∞) is defined as ƒ(ξ)=ξα/α,
then H(p)=pβ/β, where
1/α+1/β=1.
[GF], pages 71-73 of Sec. 18.
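The example above can be confirmed numerically: maximize pξ−ƒ(ξ) directly and compare with pβ/β (a sketch; the values of α and p are arbitrary choices):

```python
# Sketch (alpha and p are arbitrary choices): for f(xi) = xi^alpha/alpha with
# alpha > 1, the Legendre transform H(p) = max over xi of [p*xi - f(xi)]
# should equal p^beta/beta, where 1/alpha + 1/beta = 1.

alpha = 2.5
beta = alpha / (alpha - 1)          # conjugate exponent

def H_numeric(p):
    # ternary search for the maximum of the concave function xi -> p*xi - f(xi)
    g = lambda xi: p * xi - xi ** alpha / alpha
    lo, hi = 0.0, 10.0              # the maximizer p^(1/(alpha-1)) lies well inside
    for _ in range(200):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if g(m1) < g(m2):
            lo = m1
        else:
            hi = m2
    return g((lo + hi) / 2)

p = 1.7
H_exact = p ** beta / beta
H_num = H_numeric(p)
```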
-
Lecture 17 (Tue, Oct 15):
Legendre transform (cont.):
-
involutivity of the Legendre transform;
-
example (general Young's inequality):
-
let φ:[0,∞)→R be a smooth strictly increasing function
with φ(0)=0 and φ(ξ)→∞ as ξ→∞;
-
define ƒ(ξ) as integral of φ from 0 to ξ, then
ƒ is a strictly convex function;
-
we compute that the Legendre transform H(p) of ƒ is
integral of φ−1 from 0 to p;
-
using that H(p)=maxξ[ξp−ƒ(ξ)],
we obtain that pξ≤ƒ(ξ)+H(p);
-
geometric representation of the generalized Young's inequality;
-
for ƒ:[0,∞)→[0,∞) defined as ƒ(ξ)=ξα/α
for α>1, the example from the end of Lecture 16 implies the
classical Young's inequality,
pξ≤ξα/α+pβ/β,
where ξ>0, p>0, 1/α+1/β=1;
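A brute-force check of the classical Young inequality on a grid (sketch; α=3 is an arbitrary choice):

```python
# Brute-force check of the classical Young inequality (alpha = 3 arbitrary):
# p*xi <= xi^alpha/alpha + p^beta/beta for xi, p > 0, 1/alpha + 1/beta = 1,
# with equality exactly when p = xi^(alpha-1).

alpha = 3.0
beta = alpha / (alpha - 1)             # conjugate exponent

def gap(xi, p):
    # nonnegative iff the inequality holds at (xi, p)
    return xi ** alpha / alpha + p ** beta / beta - p * xi

grid = [0.1 * k for k in range(1, 40)]
min_gap = min(gap(xi, p) for xi in grid for p in grid)
eq_gap = gap(2.0, 2.0 ** (alpha - 1))  # the equality case p = xi^(alpha-1) at xi = 2
```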
-
variational principles for Hamilton's equations:
-
recall: if the action J[y] is an integral of F(x,y(x),y'(x))
over x from x0 to x1
subject to the boundary conditions
y(x0)=y0,
y(x1)=y1,
where we consider y'(x) as a derivative of y, i.e.,
y'(x)=(d/dx) y(x),
then the minimizer solves the Euler-Lagrange equation
Fy(x,y(x),y '(x))−(d/dx)[Fy'(x,y(x),y '(x))]=0;
-
one way of thinking:
perform a Legendre transform: define the generalized momentum
p:=Fy'(x,y,y'),
express y' from this relation: y'=V(x,y,p),
define the Hamiltonian by
H(x,y,p)=pV(x,y,p)−F(x,y,V(x,y,p));
here we think of the functions y(x) and p(x) as independent (i.e., unrelated);
using the Euler-Lagrange equations, we derive Hamilton's equations
dy/dx=∂H/∂p, dp/dx=−∂H/∂y
for the functions y(x) and p(x);
-
question: can one obtain Hamilton's equations as Euler-Lagrange equations for some action functional?
-
consider
H(x,y,p)=py'−F(x,y,y'),
but now think of y'(x) as the x-derivative of the function y(x);
rewrite this as
F(x,y,y')=py'−H(x,y,p),
where we think of the functions y(x) and p(x) as independent
and y'(x)=(d/dx)y(x);
define the action functional
J[y,p] of the functions y(x) and p(x)
(treated as independent functions)
as an integral of
F(x,y,y')=py'−H(x,y,p)
integrated over x from x0 to x1;
the Euler-Lagrange equations for the action J[y,p] are
(∂/∂y)[py'−H(x,y,p)]−(d/dx)(∂/∂y')[py'−H(x,y,p)]=0,
(∂/∂p)[py'−H(x,y,p)]−(d/dx)(∂/∂p')[py'−H(x,y,p)]=0;
we have
(∂/∂y)[py'−H(x,y,p)]=−∂H/∂y,
(∂/∂y')[py'−H(x,y,p)]=p,
(∂/∂p)[py'−H(x,y,p)]=y'−∂H/∂p,
(∂/∂p')[py'−H(x,y,p)]=0,
so the Euler-Lagrange equations for the functional J[y,p] become
−∂H/∂y−dp/dx=0,
y'−∂H/∂p=0,
which are exactly Hamilton's equations.
[GF], pages 73-75 of Sec. 18.
-
Lecture 18 (Thu, Oct 17):
Legendre transform (cont.):
-
a nice reference about linear spaces, Hilbert spaces, dual spaces, linear operators, optimization on functionals,
Legendre transform, constrained optimization, and many other topics is
D. G. Luenberger, Optimization by Vector Space Methods, Wiley, 1969.
Noether's Theorem:
-
transformation of the original independent and dependent variables, (x,y),
to new ones, (x*,y*);
-
definition of invariance of a functional under a transformation (x,y)→(x*,y*);
-
examples of functionals that are invariant or not invariant with respect of changes of variables;
-
local one-parameter groups of transformations:
x*=Φ(x,y;ε), y*=Ψ(x,y;ε);
group property;
-
generators of the one-parameter groups of transformations:
φ(x,y)=(d/dε)Φ(x,y;ε)|ε=0,
ψ(x,y)=(d/dε)Ψ(x,y;ε)|ε=0;
-
Noether's Theorem - statement and proof;
-
example: motion in R2 under a central force:
-
Lagrangian
L(x,y1,y2,y1',y2')=
(m/2)(y1'2+y2'2)−U(|y|),
where y=(y1,y2),
|y|=(y12+y22)1/2;
-
one-parameter group of rotations in R2:
Φ(x,y1,y2;ε)=x,
Ψ(x,y1,y2;ε)=(y1cos(ε)−y2sin(ε), y1sin(ε)+y2cos(ε));
-
computing the generators of the group action:
φ(x,y)=0,
ψ(x,y)=(−y2,y1);
-
proof of the invariance of the Lagrangian (and, hence, the action) under rotations;
-
computing the conserved quantity:
ψ1 ∂F/∂y1'+ψ2 ∂F/∂y2'=m(y1y2'−y1'y2)
[GF], Sec. 20
A very nice reference for Lie groups, conservations laws, and using symmetries to study/solve differential equations:
P. J. Olver, Applications of Lie Groups to Differential Equations, 2nd edition, Springer, 1993.
-
Lecture 19 (Tue, Oct 22):
Noether's Theorem (cont.):
-
example: motion in R2 under a central force (cont.):
-
interpretation of the conserved quantity:
ψ1 ∂F/∂y1'+ψ2 ∂F/∂y2'=m(y1y2'−y1'y2):
if (r,θ) are the polar coordinates in R2, then
θ'(t)=(y1y2'−y1'y2)/(y12+y22), so that
M:=m[y1(t)y2'(t)−y1'(t)y2(t)]=mr(t)2θ'(t) is the conserved angular momentum;
-
remarks about fundamental conservation laws of physics:
-
homogeneity of time: t*=t+ε (with ε a constant): implies conservation of energy;
-
homogeneity of space: r*=r+ρ (with ρ a constant vector): implies conservation of momentum;
-
isotropy of space: r*=Rφr
(with Rφ a rotation matrix): implies conservation of angular momentum;
[GF], Sec. 20
For more on the fundamental conservation laws in physics, see Sections 1-5, 6, 7, 9 of the book
L. D. Landau, E. M. Lifshitz, Mechanics, 3rd edition, Butterworth-Heinemann, 1976.
Constrained systems:
-
motivating examples:
-
Queen Dido's problem,
-
hanging chain;
-
terminology:
-
scleronomic constraints (not depending explicitly on t),
-
rheonomic constraints (depending explicitly on t);
-
example: Atwood machine;
-
terminology:
-
two-sided (equality) constraints,
-
one-sided (inequality) constraints, example: a cow tied to a silo;
-
a simple approach to constraints given by a function g(x,y)=0:
parameterize the constraint manifold and apply tools from elementary Calculus.
-
Lecture 20 (Thu, Oct 24):
Constrained systems (cont.):
-
geometric intuition: minimizing a function ƒ:R2→R
under the constraint g(x,y)=0,
where g:R2→R:
let C:={(x,y)∈R2:g(x,y)=0}
stand for the constraint manifold;
at the point (x0,y0)∈C
where the restriction ƒ|C:C→R has an extremum,
the level curves of ƒ are tangent to C, hence
∇ƒ(x0,y0) is collinear with ∇g(x0,y0),
i.e., ∇ƒ(x0,y0)=λ∇g(x0,y0)
for some λ∈R;
-
Lemma: for ƒ,g:R2→R C1 functions,
let C be the constraint manifold and ƒ|C:C→R
be the restriction of ƒ to C; if ƒ|C
has a local extremum at (x0,y0)∈C,
and ∇g(x0,y0)≠0,
then there exists λ∈R such that
∇ƒ(x0,y0)=λ∇g(x0,y0);
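The Lemma can be illustrated numerically on the standard finite-dimensional example ƒ(x,y)=x+y with g(x,y)=x2+y2−1 (my choice of example; the grid search below is a sketch, not an efficient method):

```python
import math

# Sketch of the Lemma in finite dimensions (my choice of example):
# minimize f(x,y) = x + y on the circle g(x,y) = x^2 + y^2 - 1 = 0.
# At a constrained extremum, grad f must be collinear with grad g.

n = 100_000
# parameterize the constraint circle and locate the minimum of f by grid search
theta_best = min((2 * math.pi * k / n for k in range(n)),
                 key=lambda t: math.cos(t) + math.sin(t))
x0, y0 = math.cos(theta_best), math.sin(theta_best)

# grad f = (1, 1), grad g = (2x, 2y); the cross product vanishes iff collinear
cross = 1 * (2 * y0) - 1 * (2 * x0)
lam = 1 / (2 * x0)    # the multiplier lambda read off from the first component
```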
-
a pathological example:
g(x,y)=(x2+y2)2−2(x2−y2)=0,
then the above lemma does not work at (0,0);
-
constrained optimization of a functional J[y]
(with initial condition at the initial and final points)
under a constraint K[y]=l
- a derivation and a practical recipe;
-
Queen Dido's problem, condition for existence of a solution.
-
Lecture 21 (Tue, Oct 29):
Constrained systems (cont.):
-
holonomic versus non-holonomic constraints,
an example of a non-holonomic constraint - a rolling penny;
-
holonomic constraints for functionals containing two functions
- the variations of the two functions are not independent because of the constraint;
introducing a Lagrange multiplier λ(x):
F(x,y,y')=ƒ(x,y,y')−λ(x)g(x,y)
(note that here the multiplier is a function, not a constant);
solving the Euler-Lagrange equations for F(x,y,y')
and the constraint equation to find the (n+1) unknown functions
y1(t), ..., yn(t), λ(t);
-
physical interpretation of the Lagrange multiplier: −λ(t)∇g(y)
is the force that the constraint exerts on the moving body to keep it moving in the constraint manifold
[Sections 5.1-5.4 of M. Kot, A First Course in Calculus of Variations, AMS, 2014]
-
Lecture 22 (Thu, Oct 31):
The second variation:
-
writing the total variation ΔJ of a functional J[y]
as δJ+(1/2)δ2J+(higher order terms);
-
deriving an expression for δ2J as an integral of
ƒyyη2+2ƒyy'ηη'+ƒy'y'(η')2=:Pη2+2Qηη'+R(η')2;
-
Second Variation Condition: if y*∈C1[a,b] is a relative minimum (maximum)
of J[y], then δ2J≥0 (resp. δ2J≤0)
for every weak perturbation η∈C1[a,b] satisfying
η(a)=0, η(b)=0;
-
Legendre's Necessary Condition (1788): if y*∈C1[a,b] is a relative extremum
of J[y], then ƒy'y'(x,y*(x),(y*)'(x))
must not change sign on [a,b] - statement and proof;
-
Legendre's attempt to formulate a sufficient condition, explanation of the flaw in Legendre's argument
(noticed by Lagrange in 1797);
-
examples of application of Legendre's Necessary Condition:
geodesics on the plane, minimal surfaces of revolution
[Sections 6.1, 6.2 of M. Kot, A First Course in Calculus of Variations, AMS, 2014]
-
Lecture 23 (Tue, Nov 5):
The second variation (cont.):
-
facts about the Riccati equation
y'=a(x)y2+b(x)y+c(x):
-
if we know one particular solution yp(x), we can find the general solution: set
z(x):=y(x)−yp(x),
then z(x) satisfies a Bernoulli equation which can be solved exactly;
-
every linear 2nd order homogeneous ODE can be turned into a Riccati equation;
turning the Riccati equation w'=−P+(Q+w)2/R
into a linear 2nd order homogeneous ODE: set
w(x)=:−Q−Ru'(x)/u(x)
under the assumption that u(x)≠0 ∀x∈[a,b]:
Jacobi equation, (Ru')'+(Q'−P)u=0;
-
rewriting the second variation as ε2 times the integral of
R(η'−ηu'/u)2;
-
a weak sufficiency condition:
if for y*(x) the first variation δJ=0, and for this function
-
R(x)>0 ∀x∈[a,b], and
-
the Jacobi equation has a solution u(x)≠0 ∀x∈[a,b],
then δ2J is positive definite
[i.e., δ2J>0 for any admissible weak variation η(x)
that is not identically 0];
-
Euler's identity for homogeneous functions;
-
noticing that the expression
2Ω(η,η'):=Pη2+2Qηη'+R(η')2
in the second variation is a homogeneous function of degree 2 to rewrite 2Ω(η,η') as
(∂Ω/∂η)η+(∂Ω/∂η')η'
-
integration by parts and recalling that η(a)=0=η(b) to rewrite the integral of 2Ω(η,η') as the integral of
[(∂Ω/∂η)−(d/dx)(∂Ω/∂η')]η;
-
using the concrete expression for 2Ω to rewrite
[(∂Ω/∂η)−(d/dx)(∂Ω/∂η')]η
as Ψ(η)η, where
Ψ(η):=−(Rη')'−(Q'−P)η
is equal to the left-hand side of the Jacobi equation multiplied by −1;
-
comparison: when we vary the function y(x) by εη(x)
in the action J[y] with Lagrangian ƒ(x,y,y'),
the first variation δJ is ε times an integral of
EL(η)η, where
EL(η):=∂ƒ/∂y−(d/dx)(∂ƒ/∂y')
is the left-hand side of the Euler-Lagrange equation
∂ƒ/∂y−(d/dx)(∂ƒ/∂y')=0;
when we vary the function y*(x) (which is a solution of the Euler-Lagrange equation) by εη(x)
in the so-called accessory (or secondary) Lagrangian
2Ω(η,η')=Pη2+2Qηη'+R(η')2
[where P, Q, and R are the second partial derivatives of
ƒ with respect to y and y' evaluated along the function y*(x)],
the variation can be written as ε2 times integral of Ψ(η)η,
where Ψ(η) is (−1) times the left-hand side of the Jacobi equation;
in this sense it is said that the Jacobi equation is the Euler-Lagrange equation for the
accessory variational problem.
[Section 6.3 of M. Kot, A First Course in Calculus of Variations, AMS, 2014]
Lecture 24 (Thu, Nov 7):
The second variation (cont.):
-
finding linearly independent solutions by differentiating the general solution
y*(x,α,β) of the Euler-Lagrange equation
with respect to the arbitrary constants α and β
(Jacobi's Theorem):
u1(x)=∂y*/∂α,
u2(x)=∂y*/∂β;
-
constructing the function
Δ(x,a)=u1(x)u2(a)−u2(x)u1(a)
(where x∈[a,b]),
which satisfies Jacobi's equation and the initial condition Δ(a,a)=0;
-
Jacobi's necessary condition:
if J[y] has a (weak) relative min or max at y*(x),
then Δ(x,a) does not vanish on x∈(a,b);
-
conjugate points: where Δ(x,a) vanishes for the first time for x>a;
-
finding conjugate points analytically;
Lecture 25 (Tue, Nov 12):
The second variation (cont.):
-
finding conjugate points geometrically:
-
definition of an envelope of a 1-parameter family of functions;
-
deriving an algorithm for locating an envelope of a 1-parameter family of functions;
-
example of finding an envelope: Mach cone behind a supersonic jet plane (in 2 dimensions);
-
locating conjugate points geometrically by looking for an envelope
of the 1-parameter family ("pencil") of extremals emanating
from the point (a,ya);
-
recapitulating the necessary and sufficient conditions for a weak min or max.
Lecture 26 (Thu, Nov 14):
The second variation (cont.):
-
Example: harmonic oscillator:
-
Euler-Lagrange equation on x∈[0,b]
with boundary conditions y(0)=0,
y(b)=yb, finding the extremals;
-
Jacobi equation, solutions (as partial derivatives
of the general solution of the Euler-Lagrange equation
with respect to the arbitrary integration constants);
-
constructing the function Δ(x,a),
determining the conjugate point: x=π;
-
proving that the strengthened Legendre condition is satisfied;
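The conjugate-point computation for the harmonic oscillator can be reproduced numerically: integrate the Jacobi equation u''+u=0 and march until Δ(x,0) first changes sign (a sketch; the Lagrangian ((y')2−y2)/2 with unit frequency is assumed):

```python
import math

# Sketch for the harmonic oscillator with Lagrangian ((y')^2 - y^2)/2
# (unit frequency assumed): the Jacobi equation is u'' + u = 0, and
# Delta(x,0) = u1(x)u2(0) - u2(x)u1(0) reduces to the solution with
# u(0) = 0, u'(0) = 1, i.e. sin(x); its first zero is the conjugate point pi.

def rk4(deriv, s, h):
    k1 = deriv(s)
    k2 = deriv(tuple(a + 0.5 * h * b for a, b in zip(s, k1)))
    k3 = deriv(tuple(a + 0.5 * h * b for a, b in zip(s, k2)))
    k4 = deriv(tuple(a + h * b for a, b in zip(s, k3)))
    return tuple(a + h / 6 * (b + 2 * c + 2 * d + e)
                 for a, b, c, d, e in zip(s, k1, k2, k3, k4))

jacobi = lambda s: (s[1], -s[0])    # u' = v, v' = -u

h, x, s = 1e-3, 0.0, (0.0, 1.0)     # u(0) = 0, u'(0) = 1
while True:
    prev, s = s, rk4(jacobi, s, h)
    x += h
    if x > 0.5 and prev[0] > 0 and s[0] <= 0:
        break                        # first sign change of Delta(x,0)
conjugate_point = x
```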
-
Example: catenoid:
-
Legendre condition;
-
looking for conjugate points, geometric approach:
for a given initial point, there is an envelope;
for each point below the envelope, there are no solutions;
for each point above the envelope, there are two solutions;
-
looking for conjugate points, analytic approach for symmetric boundary conditions:
finding the critical values of the parameters that separate the existence
and non-existence regions in the parameter space (by studying the tangency conditions);
-
for the case when two catenoids are possible, determining which one is stable.
Lecture 27 (Tue, Nov 19):
Preparations for studying conditions for a strong extremum:
-
recalling the Weierstrass-Erdmann conditions for broken extremals;
-
indicatrix;
-
geometric condition for allowed values for the one-sided slopes at the break:
existence of a tangent to the indicatrix that touches the indicatrix at two or more points;
-
Example: action that is an integral of
ƒ(x,y,y')=(y')2(y'+1)2
with boundary conditions y(0)=1, y(2)=0:
-
solving the Euler-Lagrange equation:
the weak extremals are y=mx+k;
-
looking for solutions with corners:
a double tangency to the indicatrix at y'=−1 and y'=0;
-
constructing infinitely many strong extremals that minimize the action,
while the weak extremal satisfying the boundary conditions gives a greater value;
-
Legendre condition;
-
Jacobi condition - no conjugate points;
-
remarks about approximating the weak solution
on the interval [0,5] with boundary conditions
y(0)=2, y(5)=1:
one can construct strong extremals
that are arbitrarily close to the weak extremal
in the C0-norm, but are at a finite distance
to the weak extremal in the C1-norm.
Lecture 28 (Thu, Nov 21)
Good to know:
the Greek alphabet,
some useful notations.