# Lecture Notes on

CALCULUS OF VARIATIONS

A. Salih
Department of Aerospace Engineering
Indian Institute of Space Science & Technology, Trivandrum, India
July 2013

## Chapter 1: Classical Variational Problems

## 1.1 Introduction
The calculus of variations deals with functionals, which are, to put it simply, functions of a function.
For example, its methods can be used to find an unknown function that
minimizes or maximizes a functional. Many of these methods were developed over two hundred years
ago by Euler (1707-1783), Lagrange (1736-1813), and others. The subject continues to supply
important techniques to many branches of engineering and physics.

## 1.2 Variational Problems

Let us begin our discussion with a few problems from physics that can be solved using the methods of
the calculus of variations.

## 1.2.1 The brachistochrone problem

Let $P(x_1, y_1)$ and $Q(x_2, y_2)$ be two points in a vertical plane. Consider a curved path connecting these
points, and let a particle slide down this path, without friction, under the influence of gravity.
The question is: what shape of curve allows the particle to complete the journey in
the shortest possible time? Clearly, the shortest path from P to Q is the straight line
that connects the two points. However, along the straight line the acceleration is constant and not
necessarily optimal. Naive guesses for the path's optimal shape, including a straight line, a circular
arc, a parabola, or a catenary, are wrong.
In order to calculate the optimal curve we set up a two-dimensional Cartesian coordinate system
on the vertical plane that contains the two points P and Q, as shown in figure 1.1. Our goal is to
find the path that minimizes the time it takes for an object to move from P to Q. From
figure 1.1 we see that at any point $c(x, y)$ on the curve $y(x)$, the gravitational force vector $F$ decomposes
into a component $F_t$ tangent to the curve and a component $F_n$ normal to the curve at $c$. The component $F_n$ does nothing to move
the particle along the path; only the component $F_t$ has any effect. The vector $F$ is constant at
each point on the curve ($F = mg$, where $m$ is the mass of the particle and $g$ is the gravitational
acceleration), but $F_n$ and $F_t$ depend on the steepness of the curve at $c$. The steeper the curve, the
larger $F_t$ is, and the faster the particle moves. So it is better for the path to be
steep close to P, so that the velocity of the object increases rapidly, and then to flatten towards Q.

Figure 1.1: A particle sliding down a curved path. (Labels: axes x and y, gravity g, curve y(x), point c(x, y), forces F, Ft and Fn.)

Such a curve is certainly longer than the straight line connecting the end points. But the extra
speed that the particle develops just after it is released will more than make up for the extra distance
that it must travel, and it will arrive at Q in less time than it takes along a straight line. The curve
along which the particle takes the least time to go from P to Q is called the Brachistochrone (from
the Greek words for shortest time). This famous problem, known as the Brachistochrone Problem,
was posed by Johann Bernoulli (1667-1748) in 1696. The problem was solved by Johann Bernoulli,
his older brother Jakob Bernoulli, Newton, and L'Hôpital.
Let us begin our own study of the problem by deriving a formula relating the choice of the curve
$y$ to the time required for a particle to fall from P to Q. The instantaneous velocity of the particle along
the curve is $v = ds/dt$, where $s$ denotes the arc-length. Therefore,

$$dt = \frac{ds}{v} = \frac{\sqrt{dx^2 + dy^2}}{v} = \frac{1}{v}\sqrt{1 + y'(x)^2}\,dx \tag{1.1}$$

Let $\tau$ be the time of descent from P to Q along the curve $y = y(x)$. Then,

$$\tau = \int_0^{\tau} dt = \int_0^{S} \frac{ds}{v} \tag{1.2}$$

where $S$ is the total arc-length of the curve. If the origin of the coordinate system is taken as the
starting point P, we have, using (1.1),

$$\tau = \int_0^{x_2} \frac{\sqrt{1 + y'(x)^2}}{v}\,dx \tag{1.3}$$
To obtain an expression for $v$ we use the fact that energy is conserved throughout the motion. Thus,
the total energy at any time $t$ must be the same as the total energy at time zero (corresponding to
location P), which we may take to be zero. With $y$ measured downward from P, so that the potential energy is $-mgy$, this reads

$$\frac{1}{2} m v^2 + m g(-y) = 0$$

Solving for $v$ gives $v = \sqrt{2gy}$. Therefore the time required for the particle to descend is

$$\tau[y] = \frac{1}{\sqrt{2g}} \int_0^{x_2} \sqrt{\frac{1 + y'(x)^2}{y(x)}}\,dx \tag{1.4}$$

where we have explicitly noted that $\tau$ depends on the curve $y(x)$. Notice that we use square brackets
for a functional, to signify the fact that its argument is a function. Equation (1.4) defines a functional.
To experiment with formula (1.4), first we suppose that Q is the point with coordinates (1, 1) and
normalize the acceleration of gravity to $g = 1/2$, so that the prefactor $1/\sqrt{2g}$ equals 1. Then the straight line segment joining P and Q lies
on the line $y(x) = x$ and we can compute the time of descent easily:

$$\tau = \int_0^1 \sqrt{\frac{2}{x}}\,dx = 2\sqrt{2} = 2.8284$$

If the curve is the circular arc with a vertical tangent at P (i.e., its center is at (1, 0) and its radius is
1), then

$$y(x) = \sqrt{1 - (x - 1)^2}$$

and the time of descent is

$$\tau = \int_0^1 \frac{1}{\sqrt[4]{(2x - x^2)^3}}\,dx = 2.6220$$

This is an improvement of about 7%, and shows that the shortest path does not yield the shortest
time. If the curve is the arc of the parabola with a vertical tangent at P, then $y(x) = \sqrt{x}$ and integrating
formula (1.4) numerically we obtain

$$\tau = \frac{1}{2} \int_0^1 \frac{\sqrt{1 + 4x}}{\sqrt[4]{x^3}}\,dx = 2.5872$$

which is slightly better than the circular arc. But is it the best result possible? That is, is this parabolic
arc the brachistochrone for points P(0, 0) and Q(1, 1)?
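The three descent times quoted above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `descent_time`, the midpoint rule, and the substitution $x = u^4$ (used to remove the integrable singularity of the integrand at $x = 0$) are choices made here for illustration. It assumes Q = (1, 1) and $g = 1/2$ as in the text.

```python
import math

def descent_time(y, yp, g=0.5, n=2000):
    """Approximate formula (1.4) on 0 <= x <= 1 with the midpoint rule.

    y and yp are the curve and its derivative. The substitution x = u**4
    (dx = 4 u**3 du) makes the transformed integrand smooth at u = 0.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h              # midpoint of the i-th subinterval in u
        x = u**4
        integrand = math.sqrt((1.0 + yp(x)**2) / y(x))
        total += integrand * 4.0 * u**3 * h
    return total / math.sqrt(2.0 * g)

# Straight line y = x
t_line = descent_time(lambda x: x, lambda x: 1.0)

# Circular arc y = sqrt(1 - (x - 1)^2), written as sqrt(x (2 - x))
# to avoid cancellation error near x = 0
t_circle = descent_time(lambda x: math.sqrt(x * (2.0 - x)),
                        lambda x: (1.0 - x) / math.sqrt(x * (2.0 - x)))

# Parabola y = sqrt(x)
t_parabola = descent_time(lambda x: math.sqrt(x),
                          lambda x: 0.5 / math.sqrt(x))

print(t_line, t_circle, t_parabola)
```

The printed values should agree with 2.8284, 2.6220 and 2.5872 to about four significant figures.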
Clearly it would be tedious to choose $y(x)$ one after another and look for the shortest time. In our
situation, for P(0, 0) and $Q(x_2, y_2)$ fixed, we have a collection F of candidate functions, namely all
those that are differentiable and whose graphs pass through both P and Q. To each element $y(x)$ of
F we associate a number according to formula (1.4). Thus there is a well-defined mapping J from the
set F of relevant functions to the set R of real numbers. Such a mapping from a set of functions to
a set of numbers is called a functional. The Brachistochrone Problem can thus be stated:
Find the function $y(x)$ that minimizes the functional

$$\tau = J[y] = \frac{1}{\sqrt{2g}} \int_0^{x_2} \sqrt{\frac{1 + y'(x)^2}{y(x)}}\,dx \tag{1.5}$$

subject to the conditions $y(0) = 0$ and $y(x_2) = y_2 > 0$.
The importance of the brachistochrone problem is that it directed attention
to the systematic study of problems of a certain type. These are problems in which a fixed rule (a
functional J) assigns a numerical value J[y] to each function y(x) in a particular set F of functions,
subject to constraints such as the endpoint conditions in the brachistochrone problem, and the goal
is to find the element y of F that either maximizes or minimizes J[y].

## 1.2.2 Minimum surface-area of revolution

Another problem which could be easily solved using the methods of calculus of variations is the
determination of the shape of a thin soap film supported by a pair of coaxial rings. The properties of
thin soap films (and of soap bubbles) were identified experimentally by the Belgian physicist Joseph
Antoine Ferdinand Plateau (1801-1883), who determined a number of empirical laws for the formation
of their surfaces. To honour his work, this application is called the Plateau problem. The dominant
force that allows the soap film to retain its shape is surface tension. The energy related to this process
is proportional to the magnitude of the surface tension and the total surface area of the film. The
film can therefore minimize its free energy by minimizing its area, subject to the requirement that it
will be attached to the two wire rings. Each cross-section of the resulting surface is a circle centered
on the x-axis. Thus the soap film problem can be stated as follows:
Given two points $P(x_1, y_1)$ and $Q(x_2, y_2)$ in the plane, not too far apart, find the curve $y(x)$ joining
P and Q such that the area $S$ of the surface of revolution about the x-axis is minimized. In other
words, minimize the area of revolution

$$S = \int_P^Q 2\pi y\,ds$$

where $ds$ is the arc-length element. Using the arc-length expression $ds = \sqrt{1 + y'(x)^2}\,dx$, this becomes

$$S = J[y] = 2\pi \int_{x_1}^{x_2} y(x)\sqrt{1 + y'(x)^2}\,dx \tag{1.6}$$
It may be noted that this problem has a discontinuous solution (discovered by Goldschmidt),
obtained by revolving the curve which is the union of three lines: the vertical line from P to the point
$(x_1, 0)$, the vertical line from Q to $(x_2, 0)$, and the segment of the x-axis from $x_1$ to $x_2$.

Figure 1.2: A surface of revolution of the curve y(x).

Figure 1.3: Goldschmidt solution for minimum surface area of revolution.

## 1.2.3 Fermat's principle of least time

Fermat's Principle of Least Time, first formulated by the seventeenth century French mathematician
Pierre de Fermat, provides a formulation of geometrical optics. The principle states that when a light
ray moves through an optical medium, it travels along a path that minimizes the travel time. In an
inhomogeneous planar optical medium in the xy-plane, the speed of light, $v(x, y)$, varies from point to
point, depending on the optical properties. Speed equals the time derivative of the distance travelled,
namely, the arc length of the curve $y = y(x)$ traced by the light ray. The ratio $n = c/v$ is called the
refractive index of the medium, where $c$ is the speed of light in vacuum. The time required to cover the
distance between two points $P(x_1, y_1)$ and $Q(x_2, y_2)$ is

$$\tau = \int_0^{\tau} dt = \int_0^{S} \frac{ds}{v(x, y)} = \int_{x_1}^{x_2} \frac{\sqrt{1 + y'(x)^2}}{v(x, y)}\,dx \tag{1.7}$$

where $S$ is the total arc-length of the curve. Fermat's principle states that, to travel from point P to
point Q, the light ray follows the curve $y = y(x)$ that minimizes the functional (1.7) subject to the
boundary conditions $y(x_1) = y_1$ and $y(x_2) = y_2$.
If the medium is homogeneous, e.g., a vacuum, then $v(x, y) = c$ is constant, and we have

$$\tau = \frac{1}{c} \int_{x_1}^{x_2} \sqrt{1 + y'(x)^2}\,dx$$

whose minimizer is the straight line connecting the points P and Q. In an inhomogeneous medium,
the path taken by the light ray is no longer evident, and we are in need of a systematic method for
solving the minimization problem. Indeed, all of the known laws of geometric optics, lens design,
focusing, refraction, etc., will be consequences of the geometric and analytic properties of solutions
to Fermat's minimization principle.

## 1.2.4 Principle of least action

Many physical principles may be formulated in terms of variational problems. Specifically, the least
action principle is an assertion about the nature of motion that provides an alternative approach to
mechanics, completely independent of Newton's laws.
The motion of a particle moving in the gravitational field of the Earth can be computed from the
principle of least action. The principle, a special case of the more general Hamilton's principle, has been
known for several hundred years, and was first proven by Euler. The principle states that a particle
under the influence of a gravitational field moves on a path along which the time integral of the kinetic energy is minimal.
As such, it is a variational problem for

$$E = 2\int E_k\,dt = \int m v^2\,dt$$

where $m$ and $v$ are the mass and velocity of the particle respectively. Since

$$v = \frac{ds}{dt} \qquad \text{and} \qquad ds = \sqrt{1 + y'(x)^2}\,dx$$

the functional to be minimized may be written as

$$E = m\int v\,ds = m\int v\,\sqrt{1 + y'(x)^2}\,dx \tag{1.8}$$

Since the gravitational field induces the particle's motion, its speed is related to its height by the relation, known from
elementary physics,

$$v^2 = u^2 - 2gy$$

where $u$ is an initial speed with yet undefined direction. Substituting into equation (1.8) yields the
functional

$$E = m\int \sqrt{u^2 - 2gy}\,\sqrt{1 + y'(x)^2}\,dx \tag{1.9}$$

These are some of the problems from physics and geometry that can be cast in variational form.

## 1.2.5 Chaplygin's problem

Consider an airplane performing a search by flying horizontally in a closed curve with a constant speed
$V_a$, in a region where strong uniform winds blow at a speed $V_w$. Show that the path that extremizes
the enclosed area of the search region for a given length of the closed curve is an ellipse with its major
axis perpendicular to the wind velocity and with an eccentricity equal to

$$e = \frac{V_w}{V_a}$$

## Chapter 2: Functionals and Their Variations

## 2.1 Functionals

As we have seen in the last chapter, there exists a great variety of physical problems that deal with
functionals, which are functions of a function. We are familiar with the definition of a function. A
function can be regarded as a rule that maps one number (or a set of numbers) to another value. For
example,

$$f(x) = x^2 + 2x$$

is a function, which maps $x = 2$ to $f(x) = 8$, and $x = 3$ to $f(x) = 15$, etc. On the other hand, a
functional is a mapping from a function (or a set of functions) to a value. That is, a functional is
a rule that assigns a real number to each function y(x) in a well-defined class. Like a function, a
functional is a rule, but its domain is some set of functions rather than a set of real numbers. We can
consider $F[y(x)]$ as a functional for fixed values of $x$. For example,

$$F[y(x)] = 3y^2 - y + 10, \qquad \text{where} \qquad y(x) = e^{x} + \cos x - x \qquad \text{for} \qquad x = \pi$$

is a functional. Another class of functionals has the form

$$J[y] = \int_a^b y(x)\,dx$$

Here $J$ gives the area under the curve $y = y(x)$. Hence $J$ is not a function of $x$ and its value will be a
number. However, this number depends on the particular form of $y(x)$ and hence $J[y]$ is a functional.
For $a = 0$ and $b = \pi$, the value of the functional when $y(x) = x$ is

$$J[y] = \int_0^{\pi} x\,dx = \frac{\pi^2}{2} \approx 4.93$$

and when $y(x) = \sin x$ it is

$$J[y] = \int_0^{\pi} \sin x\,dx = 2$$

Therefore the given functional $J[y]$ maps $y(x) = x$ to $\pi^2/2$ and maps $y(x) = \sin x$ to 2. Because an
integral maps a function to a number, a functional usually involves an integral. The following form
of functional often appears in the calculus of variations,

$$J[y] = \int_a^b F(x, y, y')\,dx \tag{2.1}$$

The fundamental problem of the calculus of variations is to find the extremum (maximum or minimum)
of the functional (2.1).
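The idea that a functional takes a whole function as its argument and returns a number can be made concrete in code. The sketch below is an illustration only: the name `J` follows the text, while the use of composite Simpson quadrature and the default limits $a = 0$, $b = \pi$ are assumptions made here to reproduce the area example of this section.

```python
import math

def J(y, a=0.0, b=math.pi, n=1000):
    """Area functional J[y] = integral of y(x) over [a, b].

    The argument y is itself a function; the return value is a number.
    Composite Simpson rule with n (even) subintervals.
    """
    h = (b - a) / n
    s = y(a) + y(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * y(a + i * h)
    return s * h / 3.0

print(J(lambda x: x))   # close to pi**2 / 2
print(J(math.sin))      # close to 2
```

Passing a different function to `J` changes the number returned, which is exactly the dependence that the square-bracket notation $J[y]$ is meant to express.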

## 2.2 First Variation of Functionals

Consider a function $y = f(x)$. When the independent variable $x$ changes to $x + \Delta x$, the dependent
variable $y$ changes to $f(x + \Delta x) = f(x) + \Delta f(x)$, where $\Delta f$ is the total change in the function. $\Delta f$ can
be computed by expanding $f(x + \Delta x)$ using Taylor series. Thus,

$$f(x + \Delta x) = f(x) + \frac{df}{dx}\Delta x + \frac{d^2 f}{dx^2}\frac{\Delta x^2}{2!} + \frac{d^3 f}{dx^3}\frac{\Delta x^3}{3!} + \dots$$

$$\Delta f \equiv f(x + \Delta x) - f(x) = \frac{df}{dx}\Delta x + \frac{d^2 f}{dx^2}\frac{\Delta x^2}{2!} + \frac{d^3 f}{dx^3}\frac{\Delta x^3}{3!} + \dots \tag{2.2}$$

By definition, the differential $df$ of the function $f(x)$ is how much $f$ changes if its argument, $x$,
changes by an infinitesimal amount $\Delta x$. That is

$$df = \lim_{\Delta x \to 0} \Delta f = \frac{df}{dx}\Delta x \tag{2.3}$$

Comparing (2.2) and (2.3), we see that the differential $df$ is the linear part of the total change $\Delta f$.
That is

$$\Delta f = df + \text{higher-order terms in } \Delta x \tag{2.4}$$
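The split of the total change into a linear part plus higher-order terms can be seen with a short numerical experiment. The sketch below is illustrative only; it reuses the function $f(x) = x^2 + 2x$ from section 2.1, and the evaluation point $x = 2$ and the step sizes are arbitrary choices made here.

```python
def f(x):
    return x**2 + 2 * x

def df_dx(x):
    # Exact derivative of f
    return 2 * x + 2

x = 2.0
for dx in (0.1, 0.01, 0.001):
    total_change = f(x + dx) - f(x)     # Delta f, the total change
    differential = df_dx(x) * dx        # df, the linear part of Delta f
    remainder = total_change - differential
    print(dx, total_change, differential, remainder)
```

For this quadratic function the remainder is exactly $\Delta x^2$ (up to rounding), so it shrinks much faster than the differential as $\Delta x$ decreases.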
In line with the definition of the differential of a function $f(x)$, we now introduce the concept of the
variation (or differential) of a functional $F[y(x)]$. Let $y(x)$ be changed to $y(x) + \delta y(x)$, where $\delta y(x)$ is
the vertical displacement of the curve $y(x)$. It is known as the variation of $y$ and is denoted by $\delta y$.
We introduce an alternative function of the form

$$Y(x) = y(x) + \delta y(x) \tag{2.5}$$

This is illustrated in figure 2.1, where $y(x)$ is shown in red and $Y(x)$ is shown in blue.
By definition, the total change in the functional is given by

$$\Delta F[y] = F[y(x) + \delta y(x)] - F[y(x)] = F[Y(x)] - F[y(x)] \tag{2.6}$$

If $\eta(x)$ is an arbitrary differentiable function that vanishes at the boundaries of the domain, i.e.,
$\eta(a) = 0$ and $\eta(b) = 0$, then the variation $\delta y(x)$ can be represented as

$$\delta y(x) = \epsilon\,\eta(x) \tag{2.7}$$

where $\epsilon$ is an arbitrary parameter independent of $x$. This definition enables us to write equation (2.5)
in the following form,

$$Y = y + \epsilon\eta \tag{2.8}$$

Figure 2.1: Plot of y(x) and a small variation from it.
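Now from (2.6), the total change in the functional $F$ is given by

$$\Delta F = F[y + \epsilon\eta] - F[y] \tag{2.9}$$

Expanding $F[Y]$ in a Taylor series about $y$,

$$F[Y] = F[y + \epsilon\eta] = F[y] + \frac{dF}{dy}\epsilon\eta + \frac{d^2F}{dy^2}\frac{\epsilon^2\eta^2}{2!} + \frac{d^3F}{dy^3}\frac{\epsilon^3\eta^3}{3!} + \dots \tag{2.10}$$

Rearranging equation (2.10) to obtain the change in the functional $F$:

$$\Delta F = F[y + \epsilon\eta] - F[y] = \frac{dF}{dy}\epsilon\eta + \text{higher-order terms} \tag{2.11}$$

By definition, the first variation of a functional $F[y]$, denoted by $\delta F$, is how much $F$ changes if its
argument, $y$, changes by an infinitesimal amount $\delta y$. Therefore,

$$\delta F = \lim_{\epsilon \to 0} \Delta F = \lim_{\epsilon \to 0}\left(F[y + \epsilon\eta] - F[y]\right) = \frac{dF}{dy}\epsilon\eta = \frac{dF}{dy}\delta y \tag{2.12}$$

which shows that $\delta F$ is given by the linear part of equation (2.11). Thus, the change in the functional
$F[y]$ and its first variation are related by the equation

$$\Delta F = \delta F + \text{higher-order terms} \tag{2.13}$$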

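Let us now define what is called the Gâteaux derivative or Gâteaux variation in the direction of $\eta(x)$.
It is denoted by $\delta F[y; \eta]$ and is defined as

$$\delta F[y; \eta] = \lim_{\epsilon \to 0} \frac{\Delta F}{\epsilon} = \lim_{\epsilon \to 0} \frac{F[y + \epsilon\eta] - F[y]}{\epsilon} = \left.\frac{d}{d\epsilon} F[y + \epsilon\eta]\right|_{\epsilon = 0} \tag{2.14}$$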

Note that the first variation and the Gâteaux variation are related through the parameter $\epsilon$, i.e.,
$\delta F^{(fv)} = \epsilon\,\delta F^{(gv)}$, where we have denoted the first variation by $\delta F^{(fv)}$ and the Gâteaux variation by $\delta F^{(gv)}$.
Unfortunately, in the literature, these two variations are denoted by the same symbol $\delta F$.
Let us look at the meaning of $\eta$ and $\epsilon$ geometrically. Since $y$ is the unknown function to be
found so as to extremize a functional, we want to see what happens to the functional $F[y]$ when we
perturb this function slightly. For this, we take another function $\eta$ and multiply it by a small number
$\epsilon$. We add $\epsilon\eta$ to $y$ and look at the value of $F[y + \epsilon\eta]$. That is, we look at the perturbed value of
the functional due to the perturbation $\epsilon\eta$. This is the shaded area shown in figure 2.2. Now as $\epsilon \to 0$,
we consider the limit of the shaded area divided by $\epsilon$. If this limit exists, it is called the
Gâteaux variation of $F[y]$ at $y$ for an arbitrary but fixed function $\eta$.

Figure 2.2: Plot of y(x) and its variation.

Note that choosing a different $\eta$ gives a different set of varied curves and hence a different
variation. Hence $\delta F[y; \eta]$ depends on which function $\eta$ is chosen to define the increment $\delta y$, and this
dependence is explicitly shown in the notation.

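## First variation of functional F[x, y, y', y'']

We now consider the first variation of the functional

$$F[x, y, y', y'']$$

for fixed values of $x$. If $y$ changes to $y + \epsilon\eta$, then $y'$ changes to $y' + \epsilon\eta'$ and $y''$ changes to $y'' + \epsilon\eta''$.
From equation (2.8), we have

$$Y = y + \epsilon\eta, \qquad Y' = y' + \epsilon\eta' \qquad \text{and} \qquad Y'' = y'' + \epsilon\eta''$$

The new value of the functional is then

$$F[x, Y, Y', Y''] = F[x, y + \epsilon\eta, y' + \epsilon\eta', y'' + \epsilon\eta'']$$

where $\epsilon\eta'$ is known as the variation of $y'$ and is denoted by $\delta y'$. Similarly, $\epsilon\eta''$ is known as the
variation of $y''$ and is denoted by $\delta y''$. The change in the functional $F$ is then defined as

$$\Delta F = F[x, y + \epsilon\eta, y' + \epsilon\eta', y'' + \epsilon\eta''] - F[x, y, y', y''] \tag{2.15}$$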

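Using Taylor series, we can expand the first term on the R.H.S. as

$$\begin{aligned}
F[x, y + \epsilon\eta, y' + \epsilon\eta', y'' + \epsilon\eta''] ={}& F[x, y, y', y''] + \left(\frac{\partial F}{\partial y}\eta + \frac{\partial F}{\partial y'}\eta' + \frac{\partial F}{\partial y''}\eta''\right)\epsilon \\
&+ \left(\frac{\partial^2 F}{\partial y^2}\eta^2 + \frac{\partial^2 F}{\partial y'^2}\eta'^2 + \frac{\partial^2 F}{\partial y''^2}\eta''^2 + 2\frac{\partial^2 F}{\partial y\,\partial y'}\eta\eta' + 2\frac{\partial^2 F}{\partial y\,\partial y''}\eta\eta'' + 2\frac{\partial^2 F}{\partial y'\,\partial y''}\eta'\eta''\right)\frac{\epsilon^2}{2!} + \dots
\end{aligned}$$

Rearranging the above Taylor series expansion, we obtain the change in the functional $F$:

$$\begin{aligned}
\Delta F ={}& \left(\frac{\partial F}{\partial y}\eta + \frac{\partial F}{\partial y'}\eta' + \frac{\partial F}{\partial y''}\eta''\right)\epsilon \\
&+ \left(\frac{\partial^2 F}{\partial y^2}\eta^2 + \frac{\partial^2 F}{\partial y'^2}\eta'^2 + \frac{\partial^2 F}{\partial y''^2}\eta''^2 + 2\frac{\partial^2 F}{\partial y\,\partial y'}\eta\eta' + 2\frac{\partial^2 F}{\partial y\,\partial y''}\eta\eta'' + 2\frac{\partial^2 F}{\partial y'\,\partial y''}\eta'\eta''\right)\frac{\epsilon^2}{2!} + \dots
\end{aligned}$$

In analogy with the definition of the differential of a function, the sum of the linear terms in $\Delta F$ is called the first
variation of the functional $F$. Therefore,

$$\delta F = \frac{\partial F}{\partial y}\epsilon\eta + \frac{\partial F}{\partial y'}\epsilon\eta' + \frac{\partial F}{\partial y''}\epsilon\eta'' \tag{2.16}$$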

Since

$$\delta y = \epsilon\eta, \qquad \delta y' = \epsilon\eta', \qquad \delta y'' = \epsilon\eta''$$

the variation of $F$ can be written as

$$\delta F = \frac{\partial F}{\partial y}\delta y + \frac{\partial F}{\partial y'}\delta y' + \frac{\partial F}{\partial y''}\delta y'' \tag{2.17}$$

Now, the total differential $dF$ of a function $F(x, y, y', y'')$, when $x$ is considered fixed, is given by

$$dF = \frac{\partial F}{\partial y}dy + \frac{\partial F}{\partial y'}dy' + \frac{\partial F}{\partial y''}dy''$$

Formula (2.17) for $\delta F$ has the same form as the above formula for $dF$. Thus the variation of $F$ is
given by the same formula as the differential of $F$, if $x$ is considered to be fixed.
It is to be noted that the differential of a function is the first-order approximation to the change in
that function along a particular curve, while the variation of a functional is the first-order approximation
to the change in the functional from one curve to another.
We mention here that the sum of terms in $\epsilon$ and $\epsilon^2$ is called the second variation of $F$ and the
sum of terms in $\epsilon$, $\epsilon^2$, and $\epsilon^3$ is called the third variation of $F$. However, when the term variation is
used alone, the first variation is meant.

## Some rules of variational calculus

The variational operator $\delta$ follows the rules of the differential operator $d$ of calculus. Let $F_1$ and $F_2$ be
any continuous and differentiable functionals. Then we have the following results:

$$\delta\left(F^n\right) = nF^{n-1}\,\delta F$$

$$\delta(F_1 + F_2) = \delta F_1 + \delta F_2$$

$$\delta(F_1 F_2) = F_1\,\delta F_2 + F_2\,\delta F_1$$

$$\delta\left(\frac{F_1}{F_2}\right) = \frac{F_2\,\delta F_1 - F_1\,\delta F_2}{F_2^2}$$

It is easy to show that the operators $\dfrac{d}{dx}$ and $\delta$ are commutative. The commutative property may
be written mathematically as

$$\frac{d}{dx}(\delta y) = \delta\left(\frac{dy}{dx}\right)$$

The proof is as follows:

$$\frac{d}{dx}(\delta y) = \frac{d}{dx}(\epsilon\eta) = \epsilon\frac{d\eta}{dx} = \epsilon\eta' = \delta y' = \delta\left(\frac{dy}{dx}\right)$$

That is, the differential of the variation of a function is identical to the variation of the differential of
the same function.
Another commutative property is the one that states that the variation of the integral of a functional $F$ is the same as the integral of the variation of the same functional, or mathematically

$$\delta\int F\,dx = \int \delta F\,dx$$

Note that the two integrals must be evaluated between the same two limits.

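## First variation of functional $\int_a^b F(x, y, y', y'')\,dx$

Next we consider the first variation of the functional defined by

$$J[y] = \int_a^b F(x, y, y', y'')\,dx$$

If $y$ changes to $Y = y + \epsilon\eta$, then $y'$ changes to $Y' = y' + \epsilon\eta'$ and $y''$ changes to $Y'' = y'' + \epsilon\eta''$. The
change in the functional, $\Delta J$, is given by

$$\Delta J = J[Y] - J[y] = J[y + \epsilon\eta] - J[y]$$

where

$$J[y + \epsilon\eta] = \int_a^b F[x, y + \epsilon\eta, y' + \epsilon\eta', y'' + \epsilon\eta'']\,dx \tag{2.18}$$

Therefore, the change in the functional is given by

$$\Delta J = \int_a^b F[x, y + \epsilon\eta, y' + \epsilon\eta', y'' + \epsilon\eta'']\,dx - \int_a^b F(x, y, y', y'')\,dx \tag{2.19}$$

As previously defined, the Gâteaux derivative or Gâteaux variation in the direction of $\eta(x)$ is given by

$$\delta J[y; \eta] = \lim_{\epsilon \to 0} \frac{\Delta J}{\epsilon} = \lim_{\epsilon \to 0} \frac{J[y + \epsilon\eta] - J[y]}{\epsilon} = \left.\frac{d}{d\epsilon} J[y + \epsilon\eta]\right|_{\epsilon = 0} \tag{2.20}$$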

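Example 2.1
Consider the functional

$$J[y] = \int_0^1 \left(x^2 - y^2 + y'^2\right) dx$$

with $y(0) = 0$ and $y(1) = 1$. Calculate $\Delta J$ and $\delta J[y; \eta]$ when $y(x) = x$ and $\eta(x) = x^2$.
We first evaluate $J[y]$,

$$J[y] = \int_0^1 \left(x^2 - y^2 + y'^2\right) dx = \int_0^1 \left(x^2 - x^2 + 1\right) dx = \int_0^1 dx = 1$$

The family of curves $y + \epsilon\eta$ is given by $x + \epsilon x^2$. We next evaluate $J$ on the family $y + \epsilon\eta$ to get

$$\begin{aligned}
J[y + \epsilon\eta] &= \int_0^1 \left[x^2 - (y + \epsilon\eta)^2 + (y' + \epsilon\eta')^2\right] dx \\
&= \int_0^1 \left[x^2 - (x + \epsilon x^2)^2 + (1 + 2\epsilon x)^2\right] dx \\
&= 1 + \frac{3}{2}\epsilon + \frac{17}{15}\epsilon^2
\end{aligned}$$

Hence, the change in the functional is

$$\Delta J = J[y + \epsilon\eta] - J[y] = \frac{3}{2}\epsilon + \frac{17}{15}\epsilon^2$$

The derivative of the functional with respect to $\epsilon$ is

$$\frac{d}{d\epsilon} J[y + \epsilon\eta] = \frac{3}{2} + \frac{34}{15}\epsilon$$

Evaluating this derivative at $\epsilon = 0$ gives the Gâteaux derivative

$$\delta J[y; \eta] = \left.\frac{d}{d\epsilon} J[y + \epsilon\eta]\right|_{\epsilon = 0} = \frac{3}{2}$$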
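The result of Example 2.1 can be cross-checked numerically by evaluating $J$ on the perturbed curve and differencing in $\epsilon$. The sketch below is one possible check, not part of the original notes; the helper `simpson`, the step size `h`, and the use of a central difference are assumptions made here.

```python
def simpson(g, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3.0

def J(eps):
    """J[y + eps*eta] for y = x and eta = x**2 in Example 2.1."""
    def integrand(x):
        Y = x + eps * x**2          # perturbed curve
        Yp = 1.0 + 2.0 * eps * x    # its derivative
        return x**2 - Y**2 + Yp**2
    return simpson(integrand, 0.0, 1.0)

h = 1e-3
gateaux = (J(h) - J(-h)) / (2 * h)   # central difference in eps at eps = 0
print(J(0.0), gateaux)
```

Because $J[y + \epsilon\eta]$ is quadratic in $\epsilon$, the central difference reproduces the derivative at $\epsilon = 0$ up to rounding, so the two printed numbers should be very close to 1 and 3/2.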


## Chapter 3: The Fundamental Problem

## 3.1 Introduction

A fundamental problem of the calculus of variations can be stated as follows: Given a functional $J$
and a well-defined set of functions $A$, determine which functions in $A$ afford a minimum (or maximum)
value to $J$. The word minimum can be interpreted as a local minimum or an absolute minimum, that is, a
minimum relative to all elements in $A$. The well-defined set $A$ is called the set of admissible functions.
These are the functions that compete in extremizing $J$. For example, the set of
admissible functions might be the set of all continuous functions on an interval $[a, b]$, or the set of all
continuously differentiable functions on $[a, b]$ satisfying conditions such as $f(a) = 0$.
Classical calculus of variations restricts itself to functionals that are defined by certain integrals
and to the determination of both necessary and sufficient conditions for extrema. The problem of
extremizing a functional $J$ over the set $A$ is called a variational problem. Several examples have already
been presented in section 1.2. To a certain degree the calculus of variations could be termed the
calculus of functionals. In the present discussion we restrict ourselves to an analysis of necessary
conditions for extrema. An elementary treatment of sufficient conditions can be found in Gelfand and
Fomin.
Even the preceding limited collection of examples of variational problems should convince the
reader of the tremendous practical utility of the calculus of variations. Let us now discuss the most
basic analytical techniques for solving such minimization problems. Let us concentrate on the simplest
class of variational problems, in which the unknown is a continuously differentiable scalar function, and
the functional to be minimized depends upon at most its second derivative. As already mentioned,
the basic minimization problem, then, is to determine a suitable function $y = y(x)$ that minimizes the
objective functional

$$J[y] = \int_a^b F(x, y, y', y'')\,dx, \qquad y \in A \tag{3.1}$$

where $F(x, y, y', y'')$ is some given function and $A$ is an admissible class of functions. The integrand $F$ is
known as the Lagrangian for the variational problem. We assume that the Lagrangian is continuously
differentiable in each of its four arguments $x$, $y$, $y'$, and $y''$.

We note here that all the problems discussed in section 1.2 have the functional in the form

$$J[y] = \int_a^b F(x, y, y')\,dx, \qquad y \in A \tag{3.2}$$

with Lagrangian

$$\begin{aligned}
F &= \sqrt{\frac{1 + y'^2}{y}} && \text{Brachistochrone Problem} \\
F &= y\sqrt{1 + y'^2} && \text{Minimum Surface-Area of Revolution} \\
F &= \sqrt{1 + y'^2} && \text{Fermat's Principle of Least Time (homogeneous medium)} \\
F &= \sqrt{u^2 - 2gy}\,\sqrt{1 + y'^2} && \text{Principle of Least Action}
\end{aligned}$$

## 3.2 Maxima and Minima

One of the central problems in calculus is to maximize or minimize a given real-valued function of
a single variable. If $f$ is a given function defined in an open interval $(a, b)$, then $f$ has a local minimum
at a point $x = x_0$ in $(a, b)$ if $f(x_0) < f(x)$ for all $x$ near $x_0$ on both sides of $x = x_0$. In other words,
$f$ has a local minimum at a point $x = x_0$ in $(a, b)$ if $f(x_0) < f(x)$ for all $x \neq x_0$ satisfying $|x - x_0| < \delta$ for
some $\delta > 0$. If $f$ has a local minimum at $x_0$ in $(a, b)$ and $f$ is differentiable in $(a, b)$, then it is well known
that

$$f'(x_0) = 0 \tag{3.3a}$$

Similar statements can be made if $f$ has a local maximum at $x_0$. The aforementioned condition (3.3a)
is called a necessary condition for a local minimum; that is, if $f$ has a local minimum at $x_0$, then
(3.3a) necessarily follows. Equation (3.3a) is not sufficient for a local minimum, however; that is, if
(3.3a) holds, it does not guarantee that $x_0$ provides an actual minimum. The following conditions are
sufficient for $f$ to have a local minimum at $x_0$:

$$f'(x_0) = 0 \qquad \text{and} \qquad f''(x_0) > 0 \tag{3.3b}$$

provided $f''$ exists. Again, similar conditions can be formulated for local maxima. If (3.3a) holds, we
say $f$ is stationary at $x_0$ and that $x_0$ is an extreme point for $f$.

## 3.2.1 Maxima and minima of functionals

Instead of extremizing functions as in calculus, the calculus of variations deals with extremizing functionals. The necessary condition for the functional $J[y]$ to have an extremum at $y(x) = \bar{y}(x)$ is that its
variation vanishes for $y = \bar{y}$. That is,

$$\delta J[\bar{y}; \eta] = 0 \tag{3.4}$$

for $y = \bar{y}$ and for all admissible variations $\eta$.
The fact that the condition (3.4) holds for all admissible variations $\eta$ often allows us to eliminate
$\eta$ from the condition and obtain an equation just in terms of $\bar{y}$, which can then be solved for $\bar{y}$.
Generally the equation for $\bar{y}$ is a differential equation. Since (3.4) is a necessary condition we are not
guaranteed that solutions $\bar{y}$ actually will provide a minimum. Therefore the solutions $\bar{y}$ to (3.4) are
called (local) extremals or stationary functions, and are the candidates for maxima and minima. If
$\delta J[\bar{y}; \eta] = 0$, we say $J$ is stationary at $\bar{y}$ in the direction $\eta$.
Based on the variations $\delta y$ and $\delta y'$, we distinguish between two cases: strong
extremum and weak extremum. A strong extremum occurs when $\delta y$ is small but $\delta y'$ is large,
while a weak extremum occurs when both $\delta y$ and $\delta y'$ are small.

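Example 3.1
Consider the functional

$$J[y] = \int_0^1 \left[1 + y'(x)^2\right] dx$$

with $y(0) = 0$ and $y(1) = 1$. Let $\bar{y}(x) = x$ and $\eta(x) = x(1 - x)$. The family of curves $\bar{y} + \epsilon\eta$ is given
by $x + \epsilon x(1 - x)$ and a few members are sketched in figure 3.1. We evaluate $J$ on the family $\bar{y} + \epsilon\eta$
to get

$$\begin{aligned}
J[\bar{y} + \epsilon\eta] &= \int_0^1 \left[1 + \left(\bar{y}'(x) + \epsilon\eta'(x)\right)^2\right] dx \\
&= \int_0^1 \left[1 + \left(1 + \epsilon(1 - 2x)\right)^2\right] dx \\
&= 2 + \frac{\epsilon^2}{3}
\end{aligned}$$

Figure 3.1: The one parameter family of curves $x + \epsilon x(1 - x)$, built from $\bar{y} = x$ and $\eta = x(1 - x)$.

Then the derivative of the functional with respect to $\epsilon$ is

$$\frac{d}{d\epsilon} J[\bar{y} + \epsilon\eta] = \frac{2}{3}\epsilon$$

Evaluating this derivative at $\epsilon = 0$ gives the Gâteaux derivative

$$\delta J[\bar{y}; \eta] = \left.\frac{d}{d\epsilon} J[\bar{y} + \epsilon\eta]\right|_{\epsilon = 0} = 0$$

Hence we conclude that the variation $\delta J[\bar{y}; \eta] = 0$ and $J$ is stationary at $\bar{y} = x$ in the direction $\eta = x(1 - x)$.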
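Example 3.2
Consider the functional

$$J[y] = \int_0^{2\pi} \left[1 + y'(x)^2\right] dx$$

with $y(0) = 0$ and $y(2\pi) = 2\pi$. Let $\bar{y}(x) = x$ and $\eta(x) = \sin x$. The family of curves $\bar{y} + \epsilon\eta$ is given by
$x + \epsilon\sin x$ and a few members are sketched in figure 3.2. We evaluate $J$ on the family $\bar{y} + \epsilon\eta$ to get

$$\begin{aligned}
J[\bar{y} + \epsilon\eta] &= \int_0^{2\pi} \left[1 + \left(\bar{y}'(x) + \epsilon\eta'(x)\right)^2\right] dx \\
&= \int_0^{2\pi} \left[1 + (1 + \epsilon\cos x)^2\right] dx \\
&= \pi\left(4 + \epsilon^2\right)
\end{aligned}$$

Figure 3.2: The one parameter family of curves $x + \epsilon\sin x$, built from $\bar{y} = x$ and $\eta = \sin x$.

Then the derivative of the functional with respect to $\epsilon$ is

$$\frac{d}{d\epsilon} J[\bar{y} + \epsilon\eta] = 2\pi\epsilon$$

Evaluating this derivative at $\epsilon = 0$ gives the Gâteaux derivative

$$\delta J[\bar{y}; \eta] = \left.\frac{d}{d\epsilon} J[\bar{y} + \epsilon\eta]\right|_{\epsilon = 0} = 0$$

Hence we conclude that the variation $\delta J[\bar{y}; \eta] = 0$ and $J$ is stationary at $\bar{y} = x$ in the direction $\eta = \sin x$.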

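Example 3.3
Consider the functional

$$J[y] = \int_0^1 \left[1 + y'(x)^2\right] dx$$

with $y(0) = 0$ and $y(1) = 1$. Let $\bar{y}(x) = x^2$ and $\eta(x) = x(1 - x)$. The family of curves $\bar{y} + \epsilon\eta$ is given
by $x^2 + \epsilon x(1 - x)$ and a few members are sketched in figure 3.3. We evaluate $J$ on the family $\bar{y} + \epsilon\eta$
to get

$$\begin{aligned}
J[\bar{y} + \epsilon\eta] &= \int_0^1 \left[1 + \left(\bar{y}'(x) + \epsilon\eta'(x)\right)^2\right] dx \\
&= \int_0^1 \left[1 + \left(2x + \epsilon(1 - 2x)\right)^2\right] dx \\
&= \frac{1}{3}\left(7 - 2\epsilon + \epsilon^2\right)
\end{aligned}$$

Figure 3.3: The one parameter family of curves $x^2 + \epsilon x(1 - x)$, built from $\bar{y} = x^2$ and $\eta = x(1 - x)$.

Then the derivative of the functional with respect to $\epsilon$ is

$$\frac{d}{d\epsilon} J[\bar{y} + \epsilon\eta] = \frac{2}{3}(-1 + \epsilon)$$

Evaluating this derivative at $\epsilon = 0$ gives the Gâteaux derivative

$$\delta J[\bar{y}; \eta] = \left.\frac{d}{d\epsilon} J[\bar{y} + \epsilon\eta]\right|_{\epsilon = 0} = -\frac{2}{3}$$

We conclude that the variation $\delta J[\bar{y}; \eta] \neq 0$ and hence $\bar{y} = x^2$ does not make $J[y]$ stationary in the direction
$\eta = x(1 - x)$.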
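The closed-form expressions for $J[\bar{y} + \epsilon\eta]$ obtained in Examples 3.1 to 3.3 can be compared with direct numerical integration. The following sketch is an illustration under stated assumptions: the helper `simpson`, the dictionary `examples`, and the sample values of $\epsilon$ are introduced here and do not appear in the notes.

```python
import math

def simpson(g, a, b, n=400):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3.0

def J(ybar_p, eta_p, a, b, eps):
    """J[ybar + eps*eta] for the integrand 1 + y'^2, given the derivatives."""
    return simpson(lambda x: 1.0 + (ybar_p(x) + eps * eta_p(x))**2, a, b)

# (ybar', eta', a, b, closed form in eps) for each example
examples = {
    "3.1": (lambda x: 1.0, lambda x: 1.0 - 2.0 * x, 0.0, 1.0,
            lambda e: 2.0 + e**2 / 3.0),
    "3.2": (lambda x: 1.0, math.cos, 0.0, 2.0 * math.pi,
            lambda e: math.pi * (4.0 + e**2)),
    "3.3": (lambda x: 2.0 * x, lambda x: 1.0 - 2.0 * x, 0.0, 1.0,
            lambda e: (7.0 - 2.0 * e + e**2) / 3.0),
}

for name, (yp, ep, a, b, closed) in examples.items():
    for eps in (-0.5, 0.0, 0.5):
        print(name, eps, J(yp, ep, a, b, eps), closed(eps))
```

In the first two examples the numerical values are smallest at $\epsilon = 0$, consistent with $J$ being stationary there; in Example 3.3 the value keeps decreasing as $\epsilon$ increases through zero, consistent with a nonzero Gâteaux derivative.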

## 3.3 The Simplest Problem

The simplest problem of the calculus of variations is to determine a function $y(x)$ for which the value of
the functional

$$J[y] = \int_a^b F(x, y, y')\,dx \tag{3.5}$$

is a minimum. Here $y \in C^2[a, b]$ (see footnote 1) and $F$ is a given function that is twice continuously differentiable
on $[a, b] \times \mathbb{R}^2$. In order to uniquely specify a minimizing function, we must impose suitable boundary
conditions. Any type of boundary conditions, including Dirichlet (essential) and Neumann (natural)
boundary conditions, may be prescribed. In the interests of brevity, we shall impose Dirichlet
boundary conditions of the form

$$y(a) = \alpha, \qquad y(b) = \beta$$

That is, the graphs of the admissible functions pass through the end points $(a, \alpha)$ and $(b, \beta)$.

Footnote 1: $C^2[a, b]$ is the set of all continuous functions on an interval $[a, b]$ whose second derivative is also continuous.
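We seek a necessary condition for the functional $J[y]$ to be a minimum. For this, we need to
compute the Gâteaux variation of $J$. Let $y(x)$ be a local minimum and $\eta(x)$ a twice continuously
differentiable function satisfying $\eta(a) = \eta(b) = 0$. Then $Y = y + \epsilon\eta$ is an admissible function and the
new functional becomes

$$J[Y] = \int_a^b F[x, Y, Y']\,dx = \int_a^b F[x, y + \epsilon\eta, y' + \epsilon\eta']\,dx \tag{3.6}$$

Differentiating with respect to $\epsilon$,

$$\begin{aligned}
\frac{d}{d\epsilon} J[Y] &= \frac{d}{d\epsilon}\int_a^b F[x, Y, Y']\,dx \\
&= \int_a^b \left(\frac{\partial F}{\partial Y}\frac{\partial Y}{\partial \epsilon} + \frac{\partial F}{\partial Y'}\frac{\partial Y'}{\partial \epsilon}\right) dx = \int_a^b \left(\frac{\partial F}{\partial Y}\eta + \frac{\partial F}{\partial Y'}\eta'\right) dx
\end{aligned}$$

Evaluating the above integral at $\epsilon = 0$, we obtain

$$\left.\frac{d}{d\epsilon} J[y + \epsilon\eta]\right|_{\epsilon = 0} = \int_a^b \left(\frac{\partial F}{\partial y}\eta + \frac{\partial F}{\partial y'}\eta'\right) dx \tag{3.7}$$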

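As we have seen earlier, the necessary condition for the functional $J[y]$ to have an extremum at $y$ is
that its variation vanishes for $y$. That is,

$$\delta J[y; \eta] = \left.\frac{d}{d\epsilon} J[y + \epsilon\eta]\right|_{\epsilon = 0} = 0 \tag{3.8}$$

Therefore, from (3.7) the necessary condition for the functional $J[y]$ to have an extremum at $y$ is given
by

$$\int_a^b \left(\frac{\partial F}{\partial y}\eta + \frac{\partial F}{\partial y'}\eta'\right) dx = 0 \tag{3.9}$$

for all $\eta \in C^2[a, b]$ with $\eta(a) = \eta(b) = 0$.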
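The integral on the left-hand side of (3.9) can be evaluated directly for the earlier examples, which gives an independent check of their Gâteaux derivatives. The sketch below is illustrative: the function `first_variation` and its argument names are hypothetical, the partial derivatives of each integrand are coded by hand, and Simpson's rule is used because it is exact for the cubic integrands that arise here.

```python
def simpson(g, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3.0

def first_variation(F_y, F_yp, y, yp, eta, etap, a, b):
    """Integral of (dF/dy * eta + dF/dy' * eta') over [a, b]."""
    def integrand(x):
        return (F_y(x, y(x), yp(x)) * eta(x)
                + F_yp(x, y(x), yp(x)) * etap(x))
    return simpson(integrand, a, b)

# Integrand F = 1 + y'^2 (Examples 3.1 and 3.3): dF/dy = 0, dF/dy' = 2 y'
F_y = lambda x, y, yp: 0.0
F_yp = lambda x, y, yp: 2.0 * yp
eta, etap = (lambda x: x * (1.0 - x)), (lambda x: 1.0 - 2.0 * x)

v31 = first_variation(F_y, F_yp, lambda x: x, lambda x: 1.0, eta, etap, 0.0, 1.0)
v33 = first_variation(F_y, F_yp, lambda x: x**2, lambda x: 2.0 * x, eta, etap, 0.0, 1.0)

# Integrand F = x^2 - y^2 + y'^2 (Example 2.1): dF/dy = -2 y, dF/dy' = 2 y'
v21 = first_variation(lambda x, y, yp: -2.0 * y, lambda x, y, yp: 2.0 * yp,
                      lambda x: x, lambda x: 1.0,
                      lambda x: x**2, lambda x: 2.0 * x, 0.0, 1.0)

print(v31, v33, v21)
```

The three values should match the Gâteaux derivatives found in the worked examples: 0 for Example 3.1, $-2/3$ for Example 3.3, and $3/2$ for Example 2.1.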

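## An alternate approach for the derivation of equation (3.9)

Since the first variation and the Gâteaux variation are linearly related through the parameter $\epsilon$, the Gâteaux
variation in equation (3.8) may be replaced by the first variation. Thus the necessary condition given
by equation (3.8) becomes

$$\delta J = \delta\int_a^b F(x, y, y')\,dx = \int_a^b \delta F\,dx = 0$$

Writing out $\delta F$ with $\delta y = \epsilon\eta$ and $\delta y' = \epsilon\eta'$,

$$\int_a^b \delta F\,dx = \int_a^b \left(\frac{\partial F}{\partial y}\delta y + \frac{\partial F}{\partial y'}\delta y'\right) dx = \epsilon\int_a^b \left(\frac{\partial F}{\partial y}\eta + \frac{\partial F}{\partial y'}\eta'\right) dx = 0$$

Since $\epsilon$ is an arbitrary nonzero parameter, this gives

$$\int_a^b \left(\frac{\partial F}{\partial y}\eta + \frac{\partial F}{\partial y'}\eta'\right) dx = 0$$

which is the same as (3.9).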

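Condition (3.9) is not useful as it stands for determining $y(x)$. Using the fact that it must hold
for all $\eta$, however, we can eliminate $\eta$ and $\eta'$ and thereby obtain a condition for $y$ alone. First
we integrate the second term in (3.9) by parts² to obtain

$$\int_a^b \underbrace{\frac{\partial F}{\partial y'}}_{u}\,\underbrace{\eta'}_{v'}\,dx = \left[\frac{\partial F}{\partial y'}\,\eta\right]_a^b - \int_a^b \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\eta\,dx$$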

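Thus, condition (3.9) can be written as

$$\int_a^b\left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right]\eta\,dx + \left[\frac{\partial F}{\partial y'}\,\eta\right]_a^b = 0 \tag{3.10}$$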

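Since $\eta(a) = \eta(b) = 0$, the last term on the left-hand side vanishes and thus the condition (3.10)
becomes

$$\int_a^b\left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right]\eta\,dx = 0 \tag{3.11}$$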
Since $\eta(x)$ is an arbitrary admissible function, equation (3.11) can hold only if the factor multiplying
$\eta$ in the integrand is identically zero (du Bois-Reymond lemma). Therefore, we have

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0$$

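² Integration by parts: $\int u v'\,dx = uv - \int u' v\,dx$.

In summary, if $y$ is an extremal of the functional

$$J[y] = \int_a^b F(x, y, y')\,dx$$

where $y \in C^2[a, b]$ and

$$y(a) = \alpha, \qquad y(b) = \beta$$

then $y$ must satisfy the equation

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0, \qquad x \in [a, b] \tag{3.12a}$$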

Equation (3.12a) is called the Euler-Lagrange equation or simply the Euler equation. There are two
important aspects of the derivation of the Euler-Lagrange equation that deserve close inspection.
First, it provides a necessary condition for a local minimum but not a sufficient one. It is analogous
to the derivative condition $f'(x) = 0$ in differential calculus. Therefore its solutions are not necessarily
local minima. It is a second-order ordinary differential equation with a solution that is required to
satisfy two conditions at the boundaries of the domain of solution. Such boundary value problems
may have no solution, a unique solution, or multiple solutions depending on the situation. A case
with multiple solutions implies that more than one path from point $(a, \alpha)$ to point $(b, \beta)$ satisfies
the Euler-Lagrange equation. However, not all of these paths will necessarily minimize the functional
$J[y]$. A second important aspect of the Euler-Lagrange equation is related to our assumption that the
curve $y(x) \in C^2[a, b]$. Indeed, our considerations focused only on such smooth functions. However,
the actual path that extremizes an integral might be one with a corner or a kink. Such paths are
not relevant for the use of the Euler-Lagrange equation in Newtonian mechanics. However, they are
often the true solutions in other problems in the calculus of variations, as we have seen in the case of
the physics of soap films.
It may be worthwhile to note that if $y$ is treated as the independent variable and $x$ as the dependent
variable, then the Euler-Lagrange equation (3.12a) takes the form

$$\frac{\partial F}{\partial x} - \frac{d}{dy}\left(\frac{\partial F}{\partial x'}\right) = 0, \qquad y \in [\alpha, \beta] \tag{3.12b}$$

where $x' = dx/dy$.

## 3.3.1 Essential and natural boundary conditions

In the derivation of the Euler-Lagrange equation, we used the conditions $\eta(a) = \eta(b) = 0$, which
means that the variations $\delta y(a) = \delta y(b) = 0$. These conditions are a consequence of our imposition
of fixed values of $y(x)$ at the endpoints $a$ and $b$. That is,

$$y(a) = \alpha, \qquad y(b) = \beta$$

where $\alpha$ and $\beta$ are constants. This is called the essential (or Dirichlet) boundary condition. In some
applications, we may need to apply other types of boundary conditions to the function $y(x)$.
If we still want the last term in equation (3.10) to vanish (so that we obtain the familiar
Euler-Lagrange equation), while allowing $\delta y(a)$ and $\delta y(b)$ to be non-zero, then we need to have

$$\left.\frac{\partial F}{\partial y'}\right|_{x=a} = 0, \qquad \left.\frac{\partial F}{\partial y'}\right|_{x=b} = 0$$

This is called a natural (or Neumann) boundary condition. A system may also have a natural boundary
condition at one end ($x = a$) and an essential boundary condition at the other end ($x = b$).


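## 3.3.2 Other forms of the Euler-Lagrange equation

The integrand $F$ in the Euler-Lagrange equation is a function of $x$, $y$, and $y'$. Therefore,

$$\frac{dF}{dx} = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\frac{dy}{dx} + \frac{\partial F}{\partial y'}\frac{dy'}{dx}$$

$$\frac{dF}{dx} = \frac{\partial F}{\partial x} + y'\frac{\partial F}{\partial y} + y''\frac{\partial F}{\partial y'} \tag{3.13}$$

But, we have

$$\frac{d}{dx}\left(y'\frac{\partial F}{\partial y'}\right) = y''\frac{\partial F}{\partial y'} + y'\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) \tag{3.14}$$

Subtracting (3.14) from (3.13), we have

$$\frac{dF}{dx} - \frac{d}{dx}\left(y'\frac{\partial F}{\partial y'}\right) = \frac{\partial F}{\partial x} + y'\frac{\partial F}{\partial y} - y'\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)$$

Rewriting the above equation gives

$$\frac{d}{dx}\left(F - y'\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial x} = y'\left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right]$$

By the Euler-Lagrange equation (3.12a) we see that the right-hand side of the above equation is zero.
Thus,

$$\frac{d}{dx}\left(F - y'\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial x} = 0 \tag{3.15}$$

Equation (3.15) is another useful form of the Euler-Lagrange equation.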

## 3.3.3 Special cases

Case I. Often in applications, the integrand $F$ does not depend explicitly on $x$ and the Euler-Lagrange
equation, in this case, takes a particularly nice form. Here we have $\partial F/\partial x = 0$ and the corresponding
form of the Euler-Lagrange equation (3.15) becomes

$$\frac{d}{dx}\left(F - y'\frac{\partial F}{\partial y'}\right) = 0$$

Integrating, we get the first integral of the Euler-Lagrange equation

$$F - y'\frac{\partial F}{\partial y'} = C \tag{3.16}$$

Thus, the extremizing function $y$ is obtained as the solution of a first-order differential equation (3.16)
involving $y$ and $y'$ only. This simplified form of the Euler-Lagrange equation (3.16) is known as the
Beltrami identity. The combination $F - y'F_{y'}$ that appears on the left of the Beltrami identity is
sometimes referred to as the Hamiltonian.
Case II. If $F$ is independent of $y$, then $\partial F/\partial y = 0$ and the Euler-Lagrange equation (3.12a)
becomes

$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0$$

Integrating, we get the first integral of the Euler-Lagrange equation as

$$\frac{\partial F}{\partial y'} = k \tag{3.17}$$

where $k$ is a constant. Note that equation (3.17) is a first-order differential equation involving $x$ and
$y'$.
Case III. If $F$ is independent of $y'$, then $\partial F/\partial y' = 0$ and the Euler-Lagrange equation
(3.12a) becomes

$$\frac{\partial F}{\partial y} = 0$$

Integrating, we get $F = F(x)$, a function of $x$ alone.

## 3.4 Advanced Variational Problems

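## 3.4.1 Variational problems with high-order derivatives

Here we will consider the problem of finding the function $y(x)$ that extremizes the integral

$$J[y] = \int_a^b F(x, y, y', y'')\,dx \tag{3.18}$$

with prescribed Dirichlet (essential) boundary conditions

$$y(a) = \alpha_1, \qquad y'(a) = \alpha_2, \qquad y(b) = \beta_1, \qquad y'(b) = \beta_2$$

Here $y \in C^4[a, b]$ and $F$ is a given function that is twice continuously differentiable on $[a, b] \times \mathbb{R}^3$.
The necessary condition for the functional $J[y]$ to be a minimum is that the function $y(x)$ satisfies
the following Euler-Lagrange equation

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + \frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) = 0 \tag{3.19}$$

Instead of the Dirichlet-type boundary conditions we may also prescribe Neumann-type (natural)
boundary conditions of the form

$$\left[\frac{\partial F}{\partial y'} - \frac{d}{dx}\left(\frac{\partial F}{\partial y''}\right)\right]_{x=a} = 0, \qquad \left.\frac{\partial F}{\partial y''}\right|_{x=a} = 0$$

$$\left[\frac{\partial F}{\partial y'} - \frac{d}{dx}\left(\frac{\partial F}{\partial y''}\right)\right]_{x=b} = 0, \qquad \left.\frac{\partial F}{\partial y''}\right|_{x=b} = 0$$

In general, when the functional contains higher derivatives, the function $y(x)$ which extremizes the functional

$$J[y] = \int_a^b F(x, y, y', y'', \ldots, y^{(n)})\,dx \tag{3.20}$$

must be a solution of the equation

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + \frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) - \cdots + (-1)^n\frac{d^n}{dx^n}\left(\frac{\partial F}{\partial y^{(n)}}\right) = 0 \tag{3.21}$$

Equation (3.21) is a differential equation of order $2n$ and is called the Euler-Poisson equation. The general
solution of this equation contains $2n$ arbitrary constants, which may be determined from the $2n$ boundary
conditions.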


## 3.4.2 Variational problems with several independent variables

If the extremal function $u$ is a function of two independent variables $x$ and $y$ and the functional to be
extremized is of the form

$$J[u] = \iint_R F(x, y, u, u_x, u_y)\,dx\,dy \tag{3.22}$$

then $u(x, y)$ must be a solution of the equation

$$\frac{\partial F}{\partial u} - \frac{\partial}{\partial x}\left(\frac{\partial F}{\partial u_x}\right) - \frac{\partial}{\partial y}\left(\frac{\partial F}{\partial u_y}\right) = 0 \tag{3.23}$$

This second-order partial differential equation that must be satisfied by the extremizing function $u(x, y)$
is called the Ostrogradsky equation after the Russian mathematician M. Ostrogradsky.

## 3.4.3 Variational problems involving several unknown functions

So far we have considered variational problems involving only one unknown function. We shall now
state the necessary conditions for the extremum of a functional whose value depends on several
functions of a single independent variable. Consider the functional

$$J[y_1, \ldots, y_n] = \int_a^b F(x, y_1, y_2, \ldots, y_n, y_1', y_2', \ldots, y_n')\,dx \tag{3.24}$$

where the functions $y_1, y_2, \ldots, y_n$ satisfy the prescribed essential boundary conditions of the form

$$y_1(a) = y_{1a}, \qquad y_2(a) = y_{2a}, \qquad \ldots, \qquad y_n(a) = y_{na}$$

$$y_1(b) = y_{1b}, \qquad y_2(b) = y_{2b}, \qquad \ldots, \qquad y_n(b) = y_{nb}$$

Here $y_i \in C^2[a, b]$ and $F$ is a given function that is twice continuously differentiable on $[a, b] \times \mathbb{R}^{2n}$. The
necessary condition for the functional $J$ to be a minimum is that the functions $y_i(x)$ $(i = 1, 2, \ldots, n)$
satisfy the following Euler-Lagrange equations

$$\frac{\partial F}{\partial y_i} - \frac{d}{dx}\left(\frac{\partial F}{\partial y_i'}\right) = 0 \qquad (i = 1, 2, \ldots, n) \tag{3.25}$$

Thus for an extremum of $J$, the necessary conditions are that the $n$ differential equations shown in
equation (3.25) should be satisfied. From the solutions of this system of differential equations, we can
determine the functions $y_i$ which make the integral (3.24) an extremum.

## 3.4.4 Variational problems in parametric form

So far we have considered functionals of functions given by an explicit equation of the form

$$y = y(x)$$

in the two-dimensional case with one independent variable, $x$. However, it is often more convenient to
consider functionals of functions given in parametric form. Thus, we consider here the case where the
function is given parametrically in the form

$$x = x(t), \qquad y = y(t) \tag{3.26}$$

Suppose that in the functional

$$\int_{x_1}^{x_2} F(x, y, y')\,dx \tag{3.27}$$

the function $y$ is given in the parametric form (3.26). Then (3.27) can be written as

$$\int_{t_1}^{t_2} F\left(x(t), y(t), \frac{\dot{y}(t)}{\dot{x}(t)}\right)\dot{x}(t)\,dt = \int_{t_1}^{t_2} \Phi(x, y, \dot{x}, \dot{y})\,dt \tag{3.28}$$

where the overdot denotes differentiation with respect to $t$. The functional appearing on the right-hand
side of (3.28) does not involve $t$ explicitly. It is possible to show that the value of the functional
given by (3.28) depends only on the function $y = y(x)$ defined by the parametric equations $x = x(t)$,
$y = y(t)$ and not on the particular choice of $x(t)$, $y(t)$ themselves.
Suppose we have a functional of the form

$$J[x, y] = \int_{t_1}^{t_2} \Phi(x, y, \dot{x}, \dot{y})\,dt \tag{3.29}$$

in which the function is parameterized. The variational problem of (3.29) leads to the pair of
Euler-Lagrange equations

$$\frac{\partial \Phi}{\partial x} - \frac{d}{dt}\left(\frac{\partial \Phi}{\partial \dot{x}}\right) = 0, \qquad \frac{\partial \Phi}{\partial y} - \frac{d}{dt}\left(\frac{\partial \Phi}{\partial \dot{y}}\right) = 0 \tag{3.30}$$

which must be equivalent to the single Euler-Lagrange equation

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0$$

corresponding to the variational problem for the original functional (3.27).

Chapter 4
Application: Standard Variational Problems
This section deals with several classical problems to illustrate the methodology. The problem of finding
the minimal path between two points in space will be addressed here.

## 4.1 Minimal Path Problems

Problems of determining shortest distances furnish a useful introduction to the theory of the calculus
of variations because the properties characterizing their solutions are familiar ones which illustrate
many of the general principles common to all of the problems suggested above.

## 4.1.1 Shortest distance

Let us begin with the simplest case of all, the problem of determining the shortest distance joining
two given points. Let $P(x_1, y_1)$ and $Q(x_2, y_2)$ be two fixed points in a plane. Then we want to find the
shortest distance between these two points. The length of the curve using the arc-length expression is

$$L = J[y(x)] = \int_P^Q ds = \int_{x_1}^{x_2} \sqrt{1 + y'(x)^2}\,dx$$

The variational problem is to find the plane curve whose length is shortest, i.e., to determine the
function $y(x)$ which minimizes the functional $J[y]$. The curve $y(x)$ which minimizes the functional $J[y]$
is determined by solving the Euler-Lagrange equation (3.12a)

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0$$

where

$$F = \sqrt{1 + y'(x)^2}$$

This is a special case in which $F$ is independent of $x$ and $y$. Then according to (3.17) the Euler-Lagrange equation reduces
to

$$\frac{\partial F}{\partial y'} = k$$

where $k$ is a constant. The derivative is

$$\frac{\partial F}{\partial y'} = \frac{1}{2}\,\frac{2y'}{\sqrt{1 + y'(x)^2}} = k$$

Therefore,

$$y' = k\sqrt{1 + y'^2}$$

Solving for $y'$, we obtain

$$y' = \sqrt{\frac{k^2}{1 - k^2}} = m$$

Integrating, $y = mx + c$, where the constants $m$ and $c$ are to be found using the boundary conditions
$y(x_1) = y_1$ and $y(x_2) = y_2$. Thus, the extremal is the straight line joining the two points $P(x_1, y_1)$ and $Q(x_2, y_2)$,

$$y = \frac{y_2 - y_1}{x_2 - x_1}\,x + \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1}$$
## 4.1.2 Brachistochrone problem

Let us now revisit the brachistochrone problem. In the brachistochrone problem, our aim is to find the
path that minimizes the time it takes for an object to slide from point $P$ to point $Q$ without friction.
For this, we need to find the function $y(x)$ that minimizes the time functional given by equation (1.5).
That is,

$$\tau = J[y] = \frac{1}{\sqrt{2g}}\int_0^{x_2} \sqrt{\frac{1 + y'(x)^2}{y(x)}}\,dx \tag{4.1}$$

subject to the conditions $y(0) = 0$ and $y(x_2) = y_2 > 0$. Here

$$F = \sqrt{\frac{1 + y'^2}{y}}$$

which is independent of $x$ and therefore we can apply the Beltrami identity (3.16)

$$F - y'\frac{\partial F}{\partial y'} = B$$

where $B$ is a constant. Now

$$\frac{\partial F}{\partial y'} = \frac{1}{\sqrt{y}}\,\frac{1}{2\sqrt{1 + y'^2}}\,2y'$$

so that the Beltrami identity becomes

$$\frac{\sqrt{1 + y'^2}}{\sqrt{y}} - \frac{y'^2}{\sqrt{y}\sqrt{1 + y'^2}} = B$$

$$\frac{\sqrt{1 + y'^2}\sqrt{1 + y'^2} - y'^2}{\sqrt{y}\sqrt{1 + y'^2}} = B$$

$$y\left(1 + y'^2\right) = C$$

where $C = 1/B^2$ is another constant. That is,

$$y\left[1 + \left(\frac{dy}{dx}\right)^2\right] = C$$

The solution to the brachistochrone problem is therefore the solution $y = y(x)$ of the above ordinary
differential equation. This is a well-known differential equation whose solution is called the cycloid.
A cycloid is the locus of a point fixed on the circumference of a circle as the circle rolls on a flat
horizontal surface. We can show that there is one and only one cycloid passing through points $P$ and
$Q$. Its equation is given in the parametric form

$$x(\theta) = a(\theta - \sin\theta), \qquad y(\theta) = a(1 - \cos\theta) \tag{4.2}$$

where $a = C/2$ is the radius of the rolling circle and $\theta$ is the angle of rotation. Using the condition
that the curve (cycloid) passes through $Q(x_2, y_2)$, the value of the constant $a$ can be determined.
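
As a numerical illustration of this result, the following sketch compares the descent time along the cycloid (4.2) with the descent time along the straight chute joining the same end points. The values of `g` and `a`, the choice of end point at $\theta = \pi$, and the function names are assumptions made here for illustration; $y$ is measured downward, as in (4.1).

```python
import math

g = 9.81   # gravitational acceleration (assumed value, m/s^2)
a = 1.0    # radius of the rolling circle (assumed)

def descent_time_cycloid(theta_end, n=20000):
    """Travel time along the cycloid (4.2) from theta = 0 to theta_end.

    Integrates dt = ds / v with v = sqrt(2 g y) by the midpoint rule in theta.
    """
    h = theta_end / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * h
        dx = a * (1.0 - math.cos(th))   # dx/dtheta
        dy = a * math.sin(th)           # dy/dtheta
        y = a * (1.0 - math.cos(th))    # depth below the starting point
        total += math.hypot(dx, dy) / math.sqrt(2.0 * g * y) * h
    return total

def descent_time_line(x2, y2):
    """Travel time down the straight chute from (0, 0) to (x2, y2), from rest."""
    length = math.hypot(x2, y2)
    accel = g * y2 / length             # constant acceleration along the incline
    return math.sqrt(2.0 * length / accel)

theta_end = math.pi                     # end point Q at the bottom of the arch
x2 = a * (theta_end - math.sin(theta_end))
y2 = a * (1.0 - math.cos(theta_end))
t_cycloid = descent_time_cycloid(theta_end)
t_line = descent_time_line(x2, y2)
print(t_cycloid, t_line)
```

Along the cycloid the integrand reduces to the constant $\sqrt{a/g}$, so the computed time agrees with $\theta_{\text{end}}\sqrt{a/g}$, and it is shorter than the time along the straight line.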

Figure 4.1: The cycloid acts as a brachistochrone.

Another remarkable characteristic of the brachistochrone is that when two particles at
rest are simultaneously released from two different points $M$ and $N$ of the curve, they will reach the
terminal point of the curve at the same time, provided the terminal point is the lowest point on the path.
Such a curve is called an isochrone or a tautochrone. This is also counterintuitive, since clearly they
have different geometric distances to cover; however, since they are moving under gravity and the
slope of the curve is different at the two locations, the particle starting from a higher location gathers
a greater speed than the particle starting at a lower location. Hence the brachistochrone problem
may also be posed with a specified terminal point and a variable starting point, leading to the class
of variational problems with open boundary.
Figure 4.2: The tautochrone

## 4.1.3 Minimum surface-area of revolution

We will now take up the problem of determining the shape of a thin soap film supported by a
pair of coaxial rings. This problem can be stated as follows: Given two points $P(x_1, y_1)$ and $Q(x_2, y_2)$,
not too far apart, in the plane, find the curve $y(x)$ joining $P$ and $Q$ so that the area $S$ of the surface of
revolution about the $x$-axis is minimized. For this, we need to find the function $y(x)$ that minimizes
the area functional given by equation (1.6). That is,

$$S = J[y] = 2\pi\int_{x_1}^{x_2} y(x)\sqrt{1 + y'(x)^2}\,dx \tag{4.3}$$

subject to the conditions $y(x_1) = y_1$ and $y(x_2) = y_2$. Here

$$F = y(x)\sqrt{1 + y'(x)^2}$$

which is independent of $x$ and therefore we can apply the Beltrami identity (3.16)

$$F - y'\frac{\partial F}{\partial y'} = a$$

where $a$ is a constant. Now

$$\frac{\partial F}{\partial y'} = y\,\frac{1}{2\sqrt{1 + y'^2}}\,2y'$$

Therefore the Beltrami identity becomes

$$y\sqrt{1 + y'^2} - \frac{y\,y'^2}{\sqrt{1 + y'^2}} = a$$

which on simplification yields

$$y = a\sqrt{1 + y'^2}$$

The solution to the minimization problem therefore reduces to solving the above differential equation.
Fortunately this nonlinear, first-order differential equation is elementary. We recast it as

$$\frac{dy}{dx} = y' = \frac{\sqrt{y^2 - a^2}}{a}$$

Separating the variables,

$$\frac{1}{a}\int dx + b = \int \frac{dy}{\sqrt{y^2 - a^2}}$$

which can be immediately solved to obtain

$$\frac{x}{a} + b = \int \frac{dy}{\sqrt{y^2 - a^2}}$$

where $b$ is the constant of integration. Substituting $y = a\cosh\theta$, so that $dy = a\sinh\theta\,d\theta$, and plugging
into the right-hand side of the above equation gives

$$\frac{x}{a} + b = \int \frac{a\sinh\theta\,d\theta}{a\sqrt{\cosh^2\theta - 1}}$$

i.e.,

$$\frac{x}{a} + b = \int d\theta = \theta = \cosh^{-1}\frac{y}{a}$$

Therefore, we have

$$y = a\cosh\left(\frac{x}{a} + b\right) \tag{4.4}$$

The constants $a$ and $b$ are determined using the end (boundary) conditions

$$y(x_1) = y_1 \quad\text{and}\quad y(x_2) = y_2$$

Equation (4.4) represents a catenary. The surface generated by rotation of the catenary is called a
catenoid.
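
A quick way to gain confidence in (4.4) is to substitute it back into the Beltrami identity. The sketch below does this numerically; the constants `a` and `b`, the sample points and the function names are illustrative assumptions. It evaluates $F - y'\,\partial F/\partial y'$ for $F = y\sqrt{1 + y'^2}$ along the catenary at a few values of $x$.

```python
import math

a, b = 1.5, -0.4   # illustrative constants in y = a*cosh(x/a + b)

def y(x):
    return a * math.cosh(x / a + b)

def dy(x):
    return math.sinh(x / a + b)   # derivative of a*cosh(x/a + b)

def beltrami(x):
    """F - y' * dF/dy' for F = y*sqrt(1 + y'^2), evaluated on the catenary."""
    root = math.sqrt(1.0 + dy(x) ** 2)
    F = y(x) * root
    dF_ddy = y(x) * dy(x) / root
    return F - dy(x) * dF_ddy

values = [beltrami(x) for x in (-1.0, 0.0, 0.7, 2.0)]
print(values)
```

The combination takes the same value, equal to the constant $a$, at every sample point, which is what the first integral requires.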

## 4.1.4 Fermat's principle of least time

As we have seen earlier, the time required by light to cover the distance between two points $P(x_1, y_1)$
and $Q(x_2, y_2)$ is given by (1.7). That is,

$$\tau = J[y] = \int_{x_1}^{x_2} \frac{\sqrt{1 + y'(x)^2}}{v(x, y)}\,dx$$

or in terms of the refractive index, $n(x, y)$, we can write

$$\tau = \frac{1}{c}\int_{x_1}^{x_2} n(x, y)\sqrt{1 + y'(x)^2}\,dx$$

Fermat's principle states that, to travel from point $P$ to point $Q$, the light ray follows the curve $y = y(x)$
that minimizes this functional subject to the boundary conditions $y(x_1) = y_1$ and $y(x_2) = y_2$.
Assuming the refractive index $n$ is a function of only $y$, the above functional becomes

$$\tau = \frac{1}{c}\int_{x_1}^{x_2} n(y)\sqrt{1 + y'^2}\,dx = \frac{1}{c}\int_{y_1}^{y_2} n(y)\sqrt{1 + x'^2}\,dy$$

We may treat $x$ or $y$ as the independent variable. Either should work, but treating $y$ as the independent
variable is more advantageous because the integrand then does not contain the dependent variable $x$.
When $y$ is the independent variable, the Euler-Lagrange equation is given by (3.12b). That is,

$$\frac{\partial F}{\partial x} - \frac{d}{dy}\left(\frac{\partial F}{\partial x'}\right) = 0, \qquad y \in [\alpha, \beta]$$

where the integrand $F$ is given by

$$F = n(y)\sqrt{1 + x'^2}$$

Since it does not contain $x$ explicitly, the Euler-Lagrange equation becomes

$$\frac{\partial F}{\partial x'} = a$$

where $a$ is a constant. It follows that

$$n(y)\,\frac{x'}{\sqrt{1 + x'^2}} = a$$

Solving for $x'$, we get

$$n^2 x'^2 = a^2\left(1 + x'^2\right)$$

$$\frac{dx}{dy} = x' = \frac{a}{\sqrt{n^2 - a^2}}$$

The solution to the minimization problem therefore reduces to solving the above differential equation.
The result varies depending on the particular model of the speed of light (or refractive index) in the
medium.
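Case I. If the refractive index is inversely proportional to the height $y$ in the medium ($n(y) = 1/y$), the
minimization problem becomes

$$\frac{dx}{dy} = \frac{a}{\sqrt{\dfrac{1}{y^2} - a^2}} = \frac{y}{\sqrt{b^2 - y^2}}$$

where $b = 1/a$. Integrating,

$$x + d = \int \frac{y\,dy}{\sqrt{b^2 - y^2}}$$

where $d$ is the constant of integration. Putting $b^2 - y^2 = z$, we have

$$x + d = -\frac{1}{2}\int \frac{dz}{\sqrt{z}} = -\sqrt{z} = -\sqrt{b^2 - y^2}$$

or

$$(x + d)^2 = b^2 - y^2 \quad\Longrightarrow\quad (x + d)^2 + y^2 = b^2$$

Hence the required path will be an arc of a circle.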

Case II. Suppose the refractive index increases linearly in the medium in such a way that

$$n(y) = n_0(1 + \alpha y)$$

where $n_0$ and $\alpha$ are constants. The minimization problem then becomes

$$\frac{dx}{dy} = \frac{a}{\sqrt{n_0^2(1 + \alpha y)^2 - a^2}} = \frac{b}{\sqrt{(1/\alpha + y)^2 - b^2}}$$

where $b = a/(n_0\alpha)$. Separate the variables and integrate to yield

$$x - x_0 = \int \frac{b\,dy}{\sqrt{(1/\alpha + y)^2 - b^2}}$$

where $x_0$ is the constant of integration. Let $u = 1/\alpha + y$; then the above equation becomes

$$x - x_0 = b\int \frac{du}{\sqrt{u^2 - b^2}}$$

$$x - x_0 = b\cosh^{-1}\frac{u}{b}$$

or

$$x - x_0 = b\cosh^{-1}\left(\frac{1/\alpha + y}{b}\right)$$

Therefore,

$$y = b\cosh\left[\frac{x - x_0}{b}\right] - \frac{1}{\alpha}$$

Hence the path that satisfies Fermat's principle of least time is a catenary.
Case III. In the case of an inhomogeneous optical medium consisting of two homogeneous media in
which the speed of light is piecewise constant, the result is the well-known Snell's law. Suppose that
the light travels from a point $P_1(x_1, y_1)$, with a constant speed $v_1$, in a homogeneous medium $M_1$ to a
point $P_2(x_2, y_2)$, with a constant speed $v_2$, in another homogeneous medium $M_2$. The two media are
separated by the line $y = y_0$. If the ray crosses this line at the point $(x, y_0)$, the time taken to travel the full path between $P_1$ and $P_2$ is given by

$$\tau(x) = \frac{\sqrt{(x - x_1)^2 + (y_0 - y_1)^2}}{v_1} + \frac{\sqrt{(x - x_2)^2 + (y_0 - y_2)^2}}{v_2}$$

Fermat's principle of least time says that the path taken by the light ray will be the one for which $x$
minimizes $\tau(x)$. From calculus, it follows that

$$\frac{d\tau}{dx} = 0$$

which gives

$$\frac{x - x_1}{v_1\sqrt{(x - x_1)^2 + (y_0 - y_1)^2}} + \frac{x - x_2}{v_2\sqrt{(x - x_2)^2 + (y_0 - y_2)^2}} = 0$$

The solution of this equation yields the $x$ location of the ray crossing the boundary, and produces the
well-known Snell's law

$$\frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2} \tag{4.5}$$

where the angles are measured with respect to the normal of the boundary between the two media.
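
The link between minimizing $\tau(x)$ and Snell's law can be seen on a concrete case. The sketch below is an illustration only: the coordinates of $P_1$ and $P_2$, the speeds `v1` and `v2`, and the use of bisection are assumptions made here. It locates the crossing point by solving $d\tau/dx = 0$ and then compares the two sides of (4.5).

```python
import math

# Hypothetical geometry: P1 above the interface y = y0, P2 below it.
x1, y1 = 0.0, 1.0
x2, y2 = 2.0, -1.5
y0 = 0.0
v1, v2 = 1.0, 0.6   # assumed light speeds in media M1 and M2

def travel_time(x):
    """tau(x): time from P1 to (x, y0) at speed v1, then on to P2 at speed v2."""
    return (math.hypot(x - x1, y0 - y1) / v1
            + math.hypot(x - x2, y0 - y2) / v2)

def dtau_dx(x):
    """Derivative of tau with respect to the crossing abscissa x."""
    return ((x - x1) / (v1 * math.hypot(x - x1, y0 - y1))
            + (x - x2) / (v2 * math.hypot(x - x2, y0 - y2)))

# tau is convex in x, so its derivative is increasing and changes sign
# exactly once between x1 and x2; bisection finds that root.
lo, hi = x1, x2
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if dtau_dx(mid) > 0.0:
        hi = mid
    else:
        lo = mid
x_cross = 0.5 * (lo + hi)

# Sines of the angles measured from the normal to the interface.
sin1 = (x_cross - x1) / math.hypot(x_cross - x1, y0 - y1)
sin2 = (x2 - x_cross) / math.hypot(x2 - x_cross, y0 - y2)
print(x_cross, sin1 / v1, sin2 / v2)
```

At the computed crossing point the two ratios $\sin\theta_1/v_1$ and $\sin\theta_2/v_2$ agree to rounding error, and moving the crossing point slightly to either side increases the travel time.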

[Figure: refraction of a light ray travelling from $P_1$ to $P_2$ across the interface $y = y_0$.]

## 4.1.5 Deflection of beam: variational formulation

Consider a simply supported beam subjected to concentrated moments at both ends. From
Euler-Bernoulli beam theory, the governing differential equation for the deflection $y$ can be derived. It is a
one-dimensional Poisson-type equation of the form

$$EI\frac{d^2y}{dx^2} - M(x) = 0 \tag{4.6}$$

with the fixed boundary conditions

$$y(0) = 0 \quad\text{and}\quad y(L) = 0$$

where $E$ is Young's modulus, $I$ is the second moment of area of the cross-section of the beam,
and $L$ is the span of the beam. The product $EI$, called the flexural rigidity, represents the resistance
offered by the beam to deflection, and $M(x)$ is the bending moment. In the present problem $M = M_0$
is a constant. Therefore,

$$EIy'' - M_0 = 0$$

This standard differential equation can be readily integrated to obtain the deflection curve. The
solution is given by

$$y(x) = \frac{M_0}{2EI}\,x(x - L) \tag{4.7}$$

Figure 4.4: Simply supported beam

This beam deflection problem can also be solved by using variational methods. To do this we
need to recast the problem as a variational problem using an appropriate variational statement. Here
we use the principle of minimum potential energy, which states that

"For conservative structural systems, of all the kinematically admissible deformations,
those corresponding to the stable equilibrium state have the minimum total potential
energy."

The potential energy of a structural system is the sum of the strain energy (SE) and the work potential
(WP). The potential energy of the beam under consideration is given by the following integral

$$\Pi(y) = \int_0^L\left[\frac{EI}{2}\left(\frac{dy}{dx}\right)^2 + M_0 y\right]dx \tag{4.8}$$

Here the Lagrangian $F$ is given by

$$F = \frac{EI}{2}\left(\frac{dy}{dx}\right)^2 + M_0 y = \frac{EI}{2}y'^2 + M_0 y$$

which is independent of $x$. To minimize $\Pi(y)$, we use the Euler-Lagrange equation (3.12a)

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0$$

We compute

$$\frac{\partial F}{\partial y} = M_0 \quad\text{and}\quad \frac{\partial F}{\partial y'} = EIy'$$

Substituting the above results in the Euler-Lagrange equation, we obtain

$$M_0 - \frac{d}{dx}\left(EIy'\right) = 0$$

Separating the variables and integrating,

$$EIy' = M_0 x + c_1$$

Integrating again,

$$EIy = M_0\frac{x^2}{2} + c_1 x + c_2$$

where the constants $c_1$ and $c_2$ are to be found using the boundary conditions $y(0) = 0$ and $y(L) = 0$. Thus,
we have

$$c_1 = -\frac{M_0 L}{2} \quad\text{and}\quad c_2 = 0$$

Substitution of these values in the general solution gives the equation of the deflection curve

$$y = \frac{M_0}{2EI}\,x(x - L)$$
## 4.2 Construction of Functionals from PDEs

We have noticed that the Euler-Lagrange equation produces the governing differential equation corresponding to a
given functional or variational principle. Here we seek the inverse procedure of constructing a
variational principle for a given differential equation, $\mathcal{L}(y) = 0$. The procedure for finding the functional
associated with the differential equation involves four basic steps:

1. Multiply the left-hand side of the differential equation $\mathcal{L}(y)$ by the variation $\delta y$ of the
dependent variable $y$ and integrate over the domain of the problem.
2. Use integration by parts to transfer the derivatives to the variation $\delta y$.
3. Express the boundary integrals in terms of the specified boundary conditions.
4. Bring the variational operator $\delta$ outside the integrals.
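The procedure is best illustrated with an example. We will take the problem of the deflection of a beam
governed by equation (4.6). Since the differential equation holds at all points within the
system, we can write

$$\left(EI\frac{d^2y}{dx^2} - M_0\right)\delta y = 0$$

where $\delta y$ is an arbitrary variation of $y$ with $\delta y|_{x=0} = 0$. Integrating over the domain of the problem,

$$\delta J = \int_0^L\left(EI\frac{d^2y}{dx^2} - M_0\right)\delta y\,dx = 0$$

$$\delta J = \int_0^L EI\frac{d^2y}{dx^2}\,\delta y\,dx - \int_0^L M_0\,\delta y\,dx$$

Now, the first integral on the right-hand side can be integrated by parts² by letting $u = \delta y$ and
$v' = EI\,d^2y/dx^2$. Thus

$$\delta J = \left[\delta y\,EI\frac{dy}{dx}\right]_0^L - \int_0^L EI\,\frac{d(\delta y)}{dx}\frac{dy}{dx}\,dx - \int_0^L M_0\,\delta y\,dx$$

The first term vanishes if we assume either the homogeneous Dirichlet or Neumann conditions at the
boundaries. That is,

$$y(0) = y(L) = 0 \;\Rightarrow\; \delta y(0) = \delta y(L) = 0 \qquad\text{or}\qquad \left.\frac{dy}{dx}\right|_{L} = \left.\frac{dy}{dx}\right|_{0} = 0$$

Hence, since $\dfrac{d(\delta y)}{dx}\dfrac{dy}{dx} = \dfrac{1}{2}\,\delta\left(\dfrac{dy}{dx}\right)^2$,

$$\delta J = -\delta\int_0^L \frac{EI}{2}\left(\frac{dy}{dx}\right)^2 dx - \int_0^L M_0\,\delta y\,dx$$

Therefore, apart from an overall sign that does not affect the condition $\delta J = 0$,

$$J[y] = \int_0^L\left[\frac{EI}{2}\left(\frac{dy}{dx}\right)^2 + M_0 y\right]dx$$

which is the potential energy functional (4.8).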

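² Integration by parts: $\int u v'\,dx = uv - \int u' v\,dx$.

Some standard differential equations and their functionals are given below.
If the differential equation is of the form

$$\frac{d^2\phi}{dx^2} + P(x)\phi + Q(x) = 0, \qquad x \in [a, b] \tag{4.9a}$$

the corresponding variational principle is given by

$$J[\phi] = \frac{1}{2}\int_a^b\left[\left(\frac{d\phi}{dx}\right)^2 - P(x)\phi^2 - 2Q(x)\phi\right]dx \tag{4.9b}$$

Similarly, if the differential equation is of the form

$$\nabla^2\phi + p^2\phi = q, \qquad \mathbf{x} \in D \tag{4.10a}$$

the corresponding variational principle is given by

$$J[\phi] = \frac{1}{2}\int_D\left[|\nabla\phi|^2 - p^2\phi^2 + 2q\phi\right]dD \tag{4.10b}$$

where $|\nabla\phi|^2 = \nabla\phi\cdot\nabla\phi$.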

Example 4.1
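Find the functional for the ordinary differential equation

$$\frac{d^2y}{dx^2} + 3y + x = 0, \qquad 0 < x < 1$$

subject to $y(0) = y(1) = 0$.

This equation is of the form (4.9a). Therefore, the corresponding functional is given by

$$J[y] = \frac{1}{2}\int_0^1\left[\left(\frac{dy}{dx}\right)^2 - 3y^2 - 2xy\right]dx$$

As a check we will use the Euler-Lagrange equation (3.12a)

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0$$

for the above functional to recover the original differential equation. That is, after cancelling the common factor $1/2$,

$$-6y - 2x - \frac{d}{dx}\left(2\frac{dy}{dx}\right) = 0$$

$$\frac{d^2y}{dx^2} + 3y + x = 0$$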

## 4.3 Rayleigh-Ritz Method

The Rayleigh-Ritz method is a direct method for minimizing a given functional. It is direct in the sense that
it yields a solution to the variational problem without solving the associated Euler-Lagrange equation.
It may be noted that, for most physical problems, the functional we get from the variational
principle is not always simple and thus the solution using the Euler-Lagrange equation will be difficult to obtain.
The Rayleigh-Ritz method is an approximate method where the given functional is directly minimized
without recourse to the associated Euler-Lagrange equation.
To illustrate the method let us consider the following functional

$$J[\phi] = \int_S F(x, y, \phi, \phi_x, \phi_y)\,dS \tag{4.11}$$

Our objective is to minimize this integral. In the Rayleigh-Ritz method, we select a linearly independent
set of functions called basis functions $u_n$ and construct an approximate solution to equation (4.11),
satisfying some prescribed boundary conditions. The solution is in the form of a finite series

$$\tilde{\phi} = u_0 + \sum_{n=1}^{N} a_n u_n \tag{4.12}$$

where $u_0$ meets the nonhomogeneous boundary conditions, if any, and the $u_n$ satisfy homogeneous
boundary conditions. The unknown coefficients $a_n$ are to be determined and $\tilde{\phi}$ is an approximation
to the exact solution $\phi$. Substitution of the approximate solution into equation (4.11) results in a
function of the $N$ coefficients $a_1, a_2, \ldots, a_N$. That is,

$$J(\tilde{\phi}) = J(a_1, a_2, \ldots, a_N)$$

The minimum of this function is obtained when its partial derivatives with respect to each coefficient
are zero. That is,

$$\frac{\partial J}{\partial a_1} = 0, \qquad \frac{\partial J}{\partial a_2} = 0, \qquad \ldots, \qquad \frac{\partial J}{\partial a_N} = 0$$

or

$$\frac{\partial J}{\partial a_n} = 0, \qquad n = 1, 2, \ldots, N \tag{4.13}$$

Thus we obtain a system of $N$ linear algebraic equations which can be solved to obtain the $a_n$. These $a_n$
are then substituted into the approximate solution (4.12). Now, if $\tilde{\phi} \to \phi$ as $N \to \infty$ in some sense,
then the procedure is said to converge to the exact solution.
The basis functions are selected to satisfy the prescribed boundary conditions of the problem.
$u_0$ is chosen to satisfy the inhomogeneous boundary conditions, while $u_n$ $(n = 1, 2, \ldots, N)$ are selected
to satisfy the homogeneous boundary conditions. It may be noted that $u_0 = 0$ if the prescribed
boundary conditions are all homogeneous (Dirichlet conditions). The Rayleigh-Ritz method has two
major limitations. First, the variational principle in equation (4.11) may not exist in some problems,
such as those governed by non-self-adjoint equations (odd-order derivatives). Second, it is difficult, if not impossible,
to find the functions $u_0$ satisfying the global boundary conditions for domains with complicated
geometries.

Example 4.2
Use the Rayleigh-Ritz method to solve the beam deflection problem given by the variational principle
(4.8):

$$\Pi[y] = \int_0^L\left[\frac{EI}{2}\left(\frac{dy}{dx}\right)^2 + M_0 y\right]dx$$

with the boundary conditions $y(0) = 0 = y(L)$. The exact solution of this minimization problem is

$$y(x) = \frac{M_0}{2EI}\,x(x - L)$$

We seek an approximate solution of the form

$$\tilde{y} = u_0 + \sum_{n=1}^{N} a_n u_n$$

where $u_0 = 0$. Some of the possible choices for the basis functions are polynomials of the form

$$\tilde{y} = \sum_{n=1}^{N} a_n x^n$$

and trigonometric functions of the form

$$\tilde{y} = \sum_{n=1}^{N} a_n \sin n\pi\lambda x$$

We will first explore the case of the trigonometric function with $N = 1$. That is, we have

$$\tilde{y} = a\sin\pi\lambda x$$

The assumed solution should satisfy both the boundary conditions. If we set $\lambda = 1/L$, we have a
solution which satisfies the boundary conditions. Thus, we have

$$\tilde{y} = a\sin\frac{\pi x}{L}$$

Here $a$ is the undetermined parameter to be found. We have to select $a$ such that the functional
$\Pi[y]$ is a minimum. Substituting the above approximate solution into the functional gives

$$\Pi(a) = \int_0^L\left[\frac{EI}{2}\left(\frac{a\pi}{L}\cos\frac{\pi x}{L}\right)^2 + M_0\,a\sin\frac{\pi x}{L}\right]dx$$

Evaluating the integral yields

$$\Pi(a) = \frac{EI\pi^2}{4L}\,a^2 + \frac{2M_0 L}{\pi}\,a$$

At this point observe that $\Pi(a)$ is an ordinary function of the unknown $a$. The function $\Pi(a)$ is
a minimum when

$$\frac{\partial\Pi}{\partial a} = 0 \;\Rightarrow\; 2\,\frac{EI\pi^2}{4L}\,a + \frac{2M_0 L}{\pi} = 0 \qquad\text{or}\qquad a = -\frac{4M_0 L^2}{\pi^3 EI}$$

Hence the approximate solution is

$$\tilde{y} = -\frac{4M_0 L^2}{\pi^3 EI}\sin\frac{\pi x}{L}$$
L
Polynomial approximation: Next, we will try the polynomial function with $N = 2$. That is, we
have

$$\tilde{y} = a_1 x + a_2 x^2$$

The assumed solution satisfies the boundary condition $y(0) = 0$. Application of the second boundary
condition yields

$$0 = a_1 L + a_2 L^2 \;\Rightarrow\; a_1 = -a_2 L$$

Hence the approximate solution which satisfies both the boundary conditions is given by

$$\tilde{y} = ax(x - L)$$

where we have dropped the subscript of $a$. Substituting the above approximate solution into the
functional gives

$$\begin{aligned}
\Pi(a) &= \int_0^L\left\{\frac{EI}{2}\left[a(2x - L)\right]^2 + M_0\,ax(x - L)\right\}dx \\
&= \frac{EI}{2}\int_0^L\left(4a^2x^2 - 4a^2Lx + a^2L^2\right)dx + M_0\int_0^L\left(ax^2 - aLx\right)dx \\
&= \frac{EI}{2}\left(\frac{4}{3}a^2L^3 - 2a^2L^3 + a^2L^3\right) + M_0\left(\frac{aL^3}{3} - \frac{aL^3}{2}\right) \\
&= \frac{EI}{2}\left(\frac{a^2L^3}{3}\right) - M_0\frac{aL^3}{6}
\end{aligned}$$

The function $\Pi(a)$ is a minimum when

$$\frac{\partial\Pi}{\partial a} = 0 \;\Rightarrow\; \frac{EI}{2}\,\frac{2aL^3}{3} - \frac{M_0L^3}{6} = 0 \qquad\text{or}\qquad a = \frac{M_0}{2EI}$$

Hence the approximate solution is

$$\tilde{y} = \frac{M_0}{2EI}\,x(x - L)$$

We see here that this is the exact solution of the problem. This has happened because the assumed
approximate solution was of the same form as the exact solution.
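
The one-term trigonometric approximation of Example 4.2 can also be reproduced numerically. In the sketch below, the unit values of `EI`, `M0` and `L`, the quadrature routine and the variable names are assumptions made for illustration. It evaluates $\Pi(a)$ by the midpoint rule, uses the fact that $\Pi$ is quadratic in $a$ to locate its minimum, and compares the resulting midspan deflection with the exact one.

```python
import math

EI, M0, L = 1.0, 1.0, 1.0   # illustrative values, not taken from the notes

def potential(a, n=4000):
    """Pi(a) for the trial function y = a*sin(pi*x/L), by the midpoint rule."""
    h = L / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        dy = a * math.pi / L * math.cos(math.pi * x / L)
        y = a * math.sin(math.pi * x / L)
        total += (0.5 * EI * dy ** 2 + M0 * y) * h
    return total

# Pi(a) is quadratic in a, so three samples determine it: Pi = p2*a^2 + p1*a.
p_plus, p_zero, p_minus = potential(1.0), potential(0.0), potential(-1.0)
p2 = 0.5 * (p_plus + p_minus) - p_zero
p1 = 0.5 * (p_plus - p_minus)
a_ritz = -p1 / (2.0 * p2)                      # stationary point of the quadratic

a_closed_form = -4.0 * M0 * L ** 2 / (math.pi ** 3 * EI)
mid_ritz = a_ritz * math.sin(math.pi / 2.0)    # trial deflection at x = L/2
mid_exact = M0 / (2.0 * EI) * (L / 2.0) * (L / 2.0 - L)
print(a_ritz, a_closed_form, mid_ritz, mid_exact)
```

The numerically determined coefficient matches the closed-form value $-4M_0L^2/(\pi^3 EI)$, and the midspan deflection of the one-term sine approximation differs from the exact value $-M_0L^2/(8EI)$ by roughly three percent.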

## 4.4 Problem with Variable End Points: Natural Boundary Conditions
To illustrate the problem with end points not fixed, let us consider the following problem: A river
with parallel straight banks $b$ units apart has stream velocity given by $\mathbf{V}(x, y) = v(x)\,\mathbf{j}$ (see figure).
Assuming that one of the banks is the $y$ axis and that the point $(0, 0)$ is the point of departure, what
path should a boat take to reach the opposite bank in the shortest possible time? Assume that the
speed of the boat in quiescent water is $c$, where $|c| > |v|$.
This problem differs from problems in earlier sections in that the right-hand endpoint, the point of
arrival on the line $x = b$, is not specified; it must be determined as part of the solution to the problem.
It can be shown that the time required for the boat to cross the river along a given path $y = y(x)$ is

$$J[y] = \int_0^b \frac{\sqrt{c^2\left(1 + y'^2\right) - v^2} - v\,y'}{c^2 - v^2}\,dx \tag{4.14}$$

Hence the variational problem is to minimize $J[y]$ subject to the conditions

$$y(0) = 0 \quad\text{and}\quad y(b)\ \text{unspecified}$$

Such a problem is called a free endpoint problem, and if $y(x)$ is an extremal, then a certain condition
must hold at $x = b$. Conditions of this type, called natural boundary conditions (Neumann type),
are discussed here.

[Figure: path $y = y(x)$ of the boat from $(0, 0)$ to the opposite bank $x = b$.]

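Let us consider the problem

$$J[Y] = \int_a^b F[x, Y, Y']\,dx = \int_a^b F[x,\, y + \varepsilon\eta,\, y' + \varepsilon\eta']\,dx \tag{4.15}$$

If $y(x)$ is not prescribed at the end points, then the end points are treated as variable in the $y$ direction. In
this case, the condition for the functional $J[y]$ in (4.15) to have a minimum is given by equation (3.10),
which is repeated below:

$$\int_a^b\left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right]\eta\,dx + \left[\frac{\partial F}{\partial y'}\,\eta\right]_a^b = 0 \tag{4.16}$$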
There are four possible cases in which the above condition can be met. These are shown
diagrammatically in figure 4.6.

Figure 4.6: The four possible cases of varying end points in the direction of y.

Case (i) In this case we have fixed conditions at both boundaries in the form

$$y(a) = \alpha, \qquad y(b) = \beta$$

Here $\eta$ should satisfy $\eta(a) = \eta(b) = 0$. The equation (4.16), in this case, then becomes

$$\int_a^b\left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right]\eta\,dx = 0$$

Since $\eta(x)$ is an arbitrary admissible function, the above equation is satisfied only when the factor
multiplying $\eta$ is zero. Thus, we have

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0 \tag{4.17}$$

This is just our standard Euler-Lagrange equation (3.12a).
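Case (ii) In this case only the left boundary has a fixed boundary condition; the condition at the right boundary is unspecified. Thus, we
have $\eta(a) = 0$ but $\eta(b) \neq 0$. The equation (4.16), in this case, then becomes

$$\int_a^b\left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right]\eta\,dx + \left[\frac{\partial F}{\partial y'}\,\eta\right]_{x=b} = 0$$

Also, since $\eta(x)$ is an arbitrary admissible function and $\eta(b)$ does not vanish, the above equation is
satisfied only if

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0$$

and

$$\left.\frac{\partial F}{\partial y'}\right|_{x=b} = 0 \tag{4.18}$$

If $y$ is not prescribed at an end point, as in the above example at $x = b$, then we require
$\partial F/\partial y' = 0$ at $x = b$. Such a condition is called a natural boundary condition.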

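Case (iii) In this case only the right boundary is fixed; the condition at the left boundary is unspecified.
Thus, we have $\eta(b) = 0$ but $\eta(a) \neq 0$. Equation (4.16) then becomes

$$\int_a^b \eta \left[ \frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) \right] dx - \left[ \eta\, \frac{\partial F}{\partial y'} \right]_{x=a} = 0$$

Since $\eta(x)$ is an arbitrary admissible function and $\eta(a)$ does not vanish, the above equation is
satisfied only if

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0 \quad \text{and} \quad \left. \frac{\partial F}{\partial y'} \right|_{x=a} = 0 \tag{4.19}$$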

Case (iv) If the boundary conditions at both a and b are not specified, we can choose an $\eta(x)$ which
vanishes at neither a nor b. Thus, in addition to the Euler-Lagrange equation (4.17), the necessary
conditions for the functional to become a minimum are

$$\left. \frac{\partial F}{\partial y'} \right|_{x=a} = 0 \quad \text{and} \quad \left. \frac{\partial F}{\partial y'} \right|_{x=b} = 0 \tag{4.20}$$

So when a condition is not specified at a boundary, we first find the general solution of the
Euler-Lagrange equation (4.17). The arbitrary constants in the solution are then determined by using the
natural boundary conditions given by (4.18), (4.19), or (4.20).


Example 4.3
Let us rework the problem of determining the shortest distance joining two given points. The functional
to minimize for this problem is given by
$$L[y(x)] = \int_{x_1}^{x_2} \sqrt{1 + y'(x)^2}\, dx$$

where $F = \sqrt{1 + y'(x)^2}$. We have seen that if the points are fixed in space, say A(x1, y1) and B(x2, y2),
the shortest curve joining the two points is the straight line

$$y = \frac{y_2 - y_1}{x_2 - x_1}\, x + \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1}$$

If the y values are not given at x1 and x2, we can still solve the Euler-Lagrange equation to get the
equation of the curve. As we have seen earlier, it is a straight line, y = mx + c. The natural
boundary conditions (4.20) now give

$$\left. \frac{\partial F}{\partial y'} \right|_{x=x_1} = 0 \quad \text{and} \quad \left. \frac{\partial F}{\partial y'} \right|_{x=x_2} = 0$$

Therefore, for the left boundary at x1,

$$\left. \frac{\partial F}{\partial y'} \right|_{x=x_1} = \frac{y'(x_1)}{\sqrt{1 + y'(x_1)^2}} = 0$$

which leads to the condition

$$y'(x_1) = \left. \frac{dy}{dx} \right|_{x_1} = 0$$

Since y = mx + c, we get m = 0 at x = x1. In a similar manner the condition on the right
boundary (at x2) is also m = 0. Hence y = c is the equation of the curve with shortest length; the
constant c remains arbitrary because neither end value is prescribed.
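The conclusion of Example 4.3 can be checked numerically. The sketch below is only an illustration, not part of the original notes: the helper names `line_length` and `natural_bc_residual` and the sample interval are assumptions. It compares the lengths of several lines y = mx + c over the same interval and evaluates the natural boundary term y'/sqrt(1 + y'^2) on each of them.

```python
import math

def line_length(m, x1, x2):
    """Arc length of y = m*x + c between x1 and x2 (it does not depend on c)."""
    return math.sqrt(1.0 + m * m) * (x2 - x1)

def natural_bc_residual(m):
    """Value of dF/dy' = y'/sqrt(1 + y'^2) on the line y = m*x + c."""
    return m / math.sqrt(1.0 + m * m)

# Hypothetical sample data: end abscissae and trial slopes.
x1, x2 = 0.0, 2.0
slopes = [-1.0, -0.5, 0.0, 0.5, 1.0]

lengths = {m: line_length(m, x1, x2) for m in slopes}
best = min(lengths, key=lengths.get)

for m in slopes:
    print(f"m = {m:5.2f}  length = {lengths[m]:.4f}  dF/dy' = {natural_bc_residual(m):7.4f}")
print("shortest trial line has slope", best)
```

Among the trial slopes, only the horizontal line makes the boundary term vanish, and it is also the shortest.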

Example 4.4
Find the differential equation and boundary conditions for the extremal of the variational problem

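$$J[y] = \int_0^b \left( p\, y'^2 - q\, y^2 \right) dx$$

with

$$y(0) = 0 \quad \text{and} \quad y(b) \ \text{unspecified}$$

where p = p(x) and q = q(x) are positive smooth functions on [0, b]. In this case

$$F = p\, y'^2 - q\, y^2$$

Therefore,

$$\frac{\partial F}{\partial y} = -2 q y \quad \text{and} \quad \frac{\partial F}{\partial y'} = 2 p y'$$

The Euler-Lagrange equation is

$$\frac{d}{dx}\left( p y' \right) + q y = 0, \qquad 0 \le x \le b$$

The natural boundary condition at x = b is given by (4.18), or

$$2 p(b)\, y'(b) = 0$$

Since p > 0 the boundary conditions are

$$y(0) = 0 \quad \text{and} \quad y'(b) = 0$$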

Example 4.5
We find the natural boundary condition at x = b for the river crossing problem defined by the functional
(4.14). In this case

$$F = \frac{\sqrt{c^2 (1 + y'^2) - v^2} - v y'}{c^2 - v^2}$$

Therefore,

$$\frac{\partial F}{\partial y'} = \frac{1}{c^2 - v^2} \left[ \frac{c^2 y'}{\left( c^2 (1 + y'^2) - v^2 \right)^{1/2}} - v \right]$$

The natural boundary condition (4.18) becomes

$$\frac{c^2 y'(b)}{\left[ c^2 (1 + y'(b)^2) - v(b)^2 \right]^{1/2}} - v(b) = 0$$

which, on simplification, yields

$$y'(b) = \frac{v(b)}{c}$$

Thus the slope at which the boat enters the bank at x = b is the ratio of the water speed at the bank to
the boat speed in still water.
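As a quick sanity check of Example 4.5, the sketch below evaluates the derivative of F with respect to y' at the slope y'(b) = v(b)/c. The function name `dF_dyprime` and the numerical values of c and v(b) are assumptions chosen only for illustration.

```python
import math

def dF_dyprime(yp, c, v):
    """dF/dy' for the river-crossing integrand of (4.14).

    yp : slope y' of the path
    c  : boat speed in still water (must exceed v)
    v  : water speed at the point considered
    """
    root = math.sqrt(c * c * (1.0 + yp * yp) - v * v)
    return (c * c * yp / root - v) / (c * c - v * v)

# Hypothetical values: boat speed 3.0, water speed at the bank 1.2.
c, v_b = 3.0, 1.2
slope = v_b / c

residual = dF_dyprime(slope, c, v_b)        # should vanish at the natural boundary slope
other = dF_dyprime(0.0, c, v_b)             # a path arriving with zero slope does not satisfy it
print("residual at y'(b) = v/c:", residual)
print("residual at y'(b) = 0  :", other)
```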

Chapter 5
Variational Problems with Constraints
5.1 Introduction
In the simplest variational problems in the last few sections, the class of admissible functions was
specified by conditions imposed on the end points of the curve. However, many applications of
calculus of variations lead to problems in which not only boundary conditions, but also conditions of
a quite different type, known as side conditions or constraints, are imposed on the admissible curves.
There are three types of variational problems with constraints: Isoperimetric problems, holonomic
problems, and non-holonomic problems. Their difference is in the way the constraints are specified.
In an isoperimetric problem the constraint is given in terms of an integral, in holonomic problems the
constraint is given in terms of a function which does not involve derivatives, and in non-holonomic
problems the constraints are given by differential equations.

## 5.2 Isoperimetric problem

Here we consider a class of problems called isoperimetric problems. Originally, the isoperimetric problem
referred to the determination of the shape of a closed curve with a given perimeter which encloses the
maximum area. More generally, a variational problem that has an integral constraint (side condition) is
known as an isoperimetric problem. The simplest isoperimetric problem consists of finding a function y(x)
which extremizes the functional

$$J[y] = \int_a^b F(x, y, y')\, dx \tag{5.1}$$

subject to the boundary conditions

$$y(a) = \alpha \qquad y(b) = \beta$$

and such that the functional

$$K[y] = \int_a^b G(x, y, y')\, dx \tag{5.2}$$

assumes a given prescribed value $\ell$. To solve this problem, we use the method of Lagrange multipliers.
Let J[y] have an extremum for y = y(x). Then, if y = y(x) is not an extremal of K[y], there exists a
constant $\lambda$ such that y = y(x) is an extremal of the functional

$$J[y] + \lambda K[y] = \int_a^b (F + \lambda G)\, dx$$

and y = y(x) satisfies the modified Euler-Lagrange equation

$$\left[ \frac{\partial F}{\partial y} - \frac{d}{dx}\left( \frac{\partial F}{\partial y'} \right) \right] + \lambda \left[ \frac{\partial G}{\partial y} - \frac{d}{dx}\left( \frac{\partial G}{\partial y'} \right) \right] = 0 \tag{5.3}$$

The general solution of equation (5.3) contains two constants of integration, say c1 and c2, and the
unknown Lagrange multiplier $\lambda$. These unknowns are determined using the two end conditions
$y(a) = \alpha$, $y(b) = \beta$ and the side condition (5.2).

Example 5.1
Consider a curve in the upper half-plane having a fixed length $\ell$ and passing through the points (-a, 0)
and (a, 0). Find the curve for which the area enclosed by the curve and the interval [-a, a] is a maximum.
This is an isoperimetric problem. Here we need to find y = y(x) for which the integral

$$J[y] = \int_{-a}^{a} y\, dx$$

is a maximum subject to the conditions

$$y(-a) = y(a) = 0 \quad \text{and} \quad K[y] = \int_{-a}^{a} \sqrt{1 + y'^2}\, dx = \ell$$

We first form the functional

$$J[y] + \lambda K[y] = \int_{-a}^{a} \left( y + \lambda \sqrt{1 + y'^2} \right) dx$$

and the corresponding Euler-Lagrange equation (5.3) becomes

$$1 - \lambda \frac{d}{dx}\left( \frac{y'}{\sqrt{1 + y'^2}} \right) = 0$$

which on integration gives

$$x - \frac{\lambda y'}{\sqrt{1 + y'^2}} = C_1$$

Integrating once more, we obtain

$$(x - C_1)^2 + (y - C_2)^2 = \lambda^2$$

This equation represents a family of circles. The constants C1, C2, and $\lambda$ are determined
from the conditions

$$y(-a) = y(a) = 0 \quad \text{and} \quad K[y] = \ell$$


Example 5.2
A rope of length $\ell$ with constant weight per unit length w hangs from two fixed points (x1, y1) and
(x2, y2) in the plane. Determine the shape of the hanging rope.
Let y(x) be an arbitrary configuration of the rope with the y axis adjusted so that y(x) > 0. A small
element of length ds at (x, y) has weight w ds and potential energy wy ds relative to y = 0. Therefore
the total potential energy of the rope hanging in the arbitrary configuration is given by the functional

$$J[y] = \int w y\, ds = \int_{x_1}^{x_2} w y \sqrt{1 + y'^2}\, dx$$

It is known that the rope will assume a shape that minimizes the potential energy. Thus we are faced
with minimizing the above functional subject to the isoperimetric condition

$$K[y] = \int_{x_1}^{x_2} \sqrt{1 + y'^2}\, dx = \ell$$

We now form the new functional

$$J + \lambda K = \int_{x_1}^{x_2} \left( w y \sqrt{1 + y'^2} + \lambda \sqrt{1 + y'^2} \right) dx$$

Since F and G are independent of x, the Euler-Lagrange equation (5.3) can be written as (see equation
(3.16))

$$\left( F - y' \frac{\partial F}{\partial y'} \right) + \lambda \left( G - y' \frac{\partial G}{\partial y'} \right) = C$$

Therefore, we have

$$\left( w y \sqrt{1 + y'^2} - y'\, \frac{w y}{2}\, \frac{2 y'}{\sqrt{1 + y'^2}} \right) + \lambda \left( \sqrt{1 + y'^2} - y'\, \frac{1}{2}\, \frac{2 y'}{\sqrt{1 + y'^2}} \right) = C$$

or

$$(w y + \lambda) \left( \sqrt{1 + y'^2} - \frac{y'^2}{\sqrt{1 + y'^2}} \right) = C$$

Solving for y' and separating variables yields

$$\frac{dy}{\sqrt{(w y + \lambda)^2 - C^2}} = \frac{1}{C}\, dx$$

Integrating the above equation, we obtain

$$\int \frac{dy}{\sqrt{(w y + \lambda)^2 - C^2}} = \frac{x}{C} + C_1$$

The left-hand side is a standard integral and can be evaluated by the formula

$$\int \frac{du}{\sqrt{u^2 - C^2}} = \cosh^{-1} \frac{u}{C}$$

Thus, we have

$$\frac{1}{w} \cosh^{-1} \frac{w y + \lambda}{C} = \frac{x}{C} + C_1$$

Solving for y, we get

$$y = -\frac{\lambda}{w} + \frac{C}{w} \cosh\left( \frac{w x}{C} + C_2 \right)$$

where $C_2 = w C_1$. The above equation represents a catenary. Therefore the shape of a hanging rope is a
catenary. The constants C, C2, and $\lambda$ may be determined from the endpoint conditions y(x1) = y1,
y(x2) = y2, and the side condition of fixed length $\ell$. In practice this calculation may be difficult.
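The catenary of Example 5.2 can be checked against the first integral of the modified Euler-Lagrange equation, which after simplification reads (wy + lambda)/sqrt(1 + y'^2) = C. The sketch below is illustrative only; the function names and the values chosen for w, lambda, C and C2 are assumptions, not data from the notes.

```python
import math

def catenary(x, w, lam, C, C2):
    """y = -lam/w + (C/w) cosh(w x / C + C2)."""
    return -lam / w + (C / w) * math.cosh(w * x / C + C2)

def catenary_slope(x, w, C, C2):
    """Derivative of the catenary with respect to x: sinh(w x / C + C2)."""
    return math.sinh(w * x / C + C2)

def first_integral(x, w, lam, C, C2):
    """(w y + lam) / sqrt(1 + y'^2); equals C along a solution."""
    y = catenary(x, w, lam, C, C2)
    yp = catenary_slope(x, w, C, C2)
    return (w * y + lam) / math.sqrt(1.0 + yp * yp)

# Hypothetical parameter values for illustration.
w, lam, C, C2 = 2.0, -1.5, 0.8, 0.3
samples = [first_integral(x, w, lam, C, C2) for x in (-1.0, -0.25, 0.0, 0.5, 1.0)]
print(samples)   # every entry should reproduce C
```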

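Example 5.3
Find the extremals of the functional

$$J[y] = \int_0^1 y^2\, dx$$

subject to the conditions

$$y(0) = y(1) = 0 \quad \text{and} \quad K[y] = \int_0^1 y'^2\, dx = 2$$

We first form the functional

$$J + \lambda K = \int_0^1 \left( y^2 + \lambda y'^2 \right) dx$$

The Euler-Lagrange equation (5.3) becomes

$$(2y) + \lambda \left[ -\frac{d}{dx}(2y') \right] = 0 \quad \Rightarrow \quad 2y - 2\lambda y'' = 0$$

or

$$\frac{d^2 y}{dx^2} - \frac{1}{\lambda}\, y = 0$$

This is a second-order linear differential equation with constant coefficients, of the general form

$$a \frac{d^2 y}{dx^2} + b \frac{dy}{dx} + c y = 0$$

The roots of the corresponding auxiliary equation are given by

$$r_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \qquad r_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}$$

We distinguish three cases according to the sign of the discriminant, $D = b^2 - 4ac$.
Case I: If D > 0, the general solution is given by

$$y = c_1 e^{r_1 x} + c_2 e^{r_2 x}$$

Case II: If D = 0, the auxiliary equation has only a single root, r = -b/2a. The general solution
is then given by

$$y = c_1 e^{rx} + c_2 x e^{rx}$$

Case III: If D < 0, the roots of the auxiliary equation are complex numbers. They are given by

$$r_1 = \alpha + i\beta \qquad r_2 = \alpha - i\beta$$

where $\alpha = -b/2a$ and $\beta = \sqrt{4ac - b^2}/2a$. The general solution is then given by

$$y = e^{\alpha x} \left( c_1 \cos \beta x + c_2 \sin \beta x \right)$$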

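For the present problem a = 1, b = 0 and $c = -1/\lambda$, so that $D = 4/\lambda$.
Case I: If $\lambda > 0$, we have D > 0. The roots of the auxiliary equation are

$$r_1 = \frac{1}{\sqrt{\lambda}} \qquad r_2 = -\frac{1}{\sqrt{\lambda}}$$

and the general solution becomes

$$y = c_1 e^{x/\sqrt{\lambda}} + c_2 e^{-x/\sqrt{\lambda}}$$

The constants c1 and c2 can be determined using the boundary conditions y(0) = y(1) = 0. Using the
condition y(0) = 0,

$$0 = c_1 + c_2 \quad \Rightarrow \quad c_1 = -c_2 = c$$

Therefore,

$$y = c \left( e^{x/\sqrt{\lambda}} - e^{-x/\sqrt{\lambda}} \right)$$

and using the condition y(1) = 0,

$$0 = c \left( e^{1/\sqrt{\lambda}} - e^{-1/\sqrt{\lambda}} \right)$$

Since $\lambda$ is assumed to be positive, the above equation is satisfied only when c = 0. Hence, we get the
trivial solution y = 0 for $\lambda > 0$.
Case II: If $\lambda \to \infty$, we have c = 0 and hence D = 0. So, we have the single root r = -b/2a = 0.
The general solution then becomes

$$y = c_1 + c_2 x$$

Using the boundary conditions, we get c1 = 0 = c2. Again we get the trivial solution y = 0 for this case.
Case III: If $\lambda < 0$, we have D < 0. In this case, the roots of the auxiliary equation are complex
numbers. They are given by

$$r_1 = \frac{i}{\sqrt{-\lambda}} \qquad r_2 = -\frac{i}{\sqrt{-\lambda}}$$

The general solution is then given by

$$y = c_1 \cos\left( x/\sqrt{-\lambda} \right) + c_2 \sin\left( x/\sqrt{-\lambda} \right)$$

Using the boundary condition y(0) = 0 gives c1 = 0. The second condition y(1) = 0 gives

$$0 = c_2 \sin\left( 1/\sqrt{-\lambda} \right)$$

Since c2 = 0 gives a trivial solution, we should have

$$\sin\left( 1/\sqrt{-\lambda} \right) = 0 \quad \Rightarrow \quad \frac{1}{\sqrt{-\lambda}} = n\pi, \qquad n = 1, 2, 3, \ldots$$

Therefore, the solution becomes

$$y = c_n \sin n\pi x, \qquad n = 1, 2, 3, \ldots$$

So we have here an infinite number of solutions and we can construct the most general solution as

$$y = \sum_{n=1}^{\infty} c_n \sin n\pi x$$

Taking the first mode, n = 1, the side condition gives

$$\int_0^1 y'^2\, dx = \frac{c_1^2 \pi^2}{2} = 2 \quad \Rightarrow \quad c_1 = \frac{2}{\pi}$$

and the curve is given by

$$y = \frac{2}{\pi} \sin \pi x$$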
Chapter 6
Principle of Least Action
6.1 Introduction
Most of the applications of the calculus of variations examined so far have been geometrical in nature.
We now explore applications of the calculus of variations to problems in classical mechanics. It is
assumed that the time evolution of a mechanical system is completely determined if its state is known
at some given instant. This is expressed by the fact that the dynamical variables satisfy a set of
differential equations (the equations of motion of the system) as functions of time along with initial
conditions. The method of classical dynamics, therefore, consists of listing the dynamical variables
and discovering the equations of motion that predict the system's evolution in time.
One method of obtaining the equations of motion is from a variational principle. This method is
based upon the idea that a system should evolve along the path of least resistance. Principles of this
sort have a long history. In the seventeenth century, Fermat's principle that light rays travel along the
path of shortest time was announced. For mechanical systems Maupertuis's principle of least action
stated that a system should evolve from one state to another in such a way that the action (a vaguely
defined term with the units energy × time) is smallest. Lagrange and Gauss were also advocates of
similar principles. In the early part of the nineteenth century, however, W. R. Hamilton stated what has
become an extremely powerful principle that can be generalized to formulate virtually all fundamental
laws of physics.

## 6.2 Principle of Least Action

The most general formulation of the law governing the motion of mechanical systems is the principle
of least action or Hamilton's principle. Consider a system of N particles¹ with no constraints imposed
on the system. Let the ith particle have a radius vector $\underline{r}_i$ (i = 1, 2, . . . , N) (for notational
convenience an underline is used to denote this vector). According to the principle of least action every
mechanical system is characterized by a definite function
$L(\underline{r}_1, \underline{r}_2, \ldots, \underline{r}_N, \dot{\underline{r}}_1, \dot{\underline{r}}_2, \ldots, \dot{\underline{r}}_N, t)$,
or briefly $L(\underline{r}, \dot{\underline{r}}, t)$, and the motion of the system is such that a certain condition is satisfied.

¹ A particle is a body whose dimensions may be neglected in describing its motion. The possibility of so doing
depends on the conditions of the problem concerned. For example, the planets may be regarded as particles in
considering their motion about the Sun, but not in considering their rotation about their axes. The position of a
particle in space is defined by its radius vector $\underline{r}$, whose components are its Cartesian coordinates x, y, z.

Let the system occupy positions defined by the set of radius vectors $\underline{r}^{(0)}$ at an instant t0 and
$\underline{r}^{(1)}$ at an instant t1. Then, according to the principle of least action, the system moves between these
positions in such a way that the integral

$$J = \int_{t_0}^{t_1} L(\underline{r}, \dot{\underline{r}}, t)\, dt \tag{6.1}$$

takes the least possible value. The function L is called the Lagrangian of the system concerned and
the integral (6.1) is called the action integral. Thus, the principle of least action may be stated as follows:
Among all the paths that a system of particles could take to go from an initial position
at t0 to a final position at t1, the paths that the particles actually take are the ones that
minimize the action integral (6.1).
Note that the Lagrangian contains only $\underline{r}$ and $\dot{\underline{r}}$, but not the higher derivatives of $\underline{r}$. This is
because the mechanical state of the system is completely defined when the positions and the velocities are
given.

## 6.2.1 Hamilton's principle of least action from Newton's law of motion

Hamilton's principle of least action for a mechanical system is equivalent to Newton's second
law of motion for the same system. This equivalence can be shown by developing Hamilton's
principle from Newton's second law.

Figure 6.1: Paths traced out by the radius vectors $\underline{r}$ and $\bar{\underline{r}}$ between t = t0 and t = t1

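For simplicity, we shall assume that the system has only one particle of mass m with radius vector
$\underline{r}$. If $\underline{F}$ is the force acting on the particle, Newton's second law of motion is given by

$$m \ddot{\underline{r}} - \underline{F} = 0 \tag{6.2}$$

Let $\underline{r}(t)$ be the resulting path within the time interval t0 < t < t1. Let us also consider another path
$\bar{\underline{r}}(t)$ associated with another law of motion, such that

$$\bar{\underline{r}}(t) = \underline{r}(t) + \delta\underline{r}(t), \qquad t_0 < t < t_1$$

satisfying the conditions

$$\delta\underline{r}(t_0) = 0 \qquad \delta\underline{r}(t_1) = 0 \tag{6.3}$$

where $\delta\underline{r}$ is the variation in $\underline{r}$, so that the varied path is given by $\bar{\underline{r}}(t)$. We take the dot product of
equation (6.2) with $\delta\underline{r}$ and integrate from t0 to t1:

$$\int_{t_0}^{t_1} \left( m \ddot{\underline{r}} \cdot \delta\underline{r} - \underline{F} \cdot \delta\underline{r} \right) dt = 0 \tag{6.4}$$

The kinetic energy of the particle is

$$T = \frac{1}{2} m \dot{\underline{r}}^2 = \frac{1}{2} m\, \dot{\underline{r}} \cdot \dot{\underline{r}}$$

The variational operator $\delta$ follows the rules of the differential operator d of calculus. Therefore the
variation in kinetic energy due to $\delta\underline{r}$ can be expressed as

$$\delta T = \delta\left( \frac{1}{2} m\, \dot{\underline{r}} \cdot \dot{\underline{r}} \right) = m\, \dot{\underline{r}} \cdot \delta\dot{\underline{r}}$$

By the product rule of differentiation, we have

$$\frac{d}{dt}\left( \dot{\underline{r}} \cdot \delta\underline{r} \right) = \ddot{\underline{r}} \cdot \delta\underline{r} + \dot{\underline{r}} \cdot \delta\dot{\underline{r}}$$

Using the above equation, the variation in kinetic energy can be written as

$$\delta T = \frac{d}{dt}\left( m\, \dot{\underline{r}} \cdot \delta\underline{r} \right) - m\, \ddot{\underline{r}} \cdot \delta\underline{r}$$

Integrating from t0 to t1,

$$\int_{t_0}^{t_1} \delta T\, dt = \int_{t_0}^{t_1} \frac{d}{dt}\left( m\, \dot{\underline{r}} \cdot \delta\underline{r} \right) dt - \int_{t_0}^{t_1} m\, \ddot{\underline{r}} \cdot \delta\underline{r}\, dt = \left[ m\, \dot{\underline{r}} \cdot \delta\underline{r} \right]_{t_0}^{t_1} - \int_{t_0}^{t_1} m\, \ddot{\underline{r}} \cdot \delta\underline{r}\, dt$$

By equation (6.3), the first term on the right-hand side of the above equation becomes zero. Therefore,

$$\int_{t_0}^{t_1} \delta T\, dt = -\int_{t_0}^{t_1} m\, \ddot{\underline{r}} \cdot \delta\underline{r}\, dt$$

Hence, equation (6.4) can be written as

$$\int_{t_0}^{t_1} \left( \delta T + \underline{F} \cdot \delta\underline{r} \right) dt = 0 \tag{6.5}$$

This is the most general form of Hamilton's principle for a single particle under a general force field.
It says that the path of motion is such that along it, the integral of the sum of the variation $\delta T$ of
the kinetic energy and $\underline{F} \cdot \delta\underline{r}$ must vanish for every varied path satisfying $\delta\underline{r}(t_0) = \delta\underline{r}(t_1) = 0$.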

If the force field is conservative, then there is a force potential $\phi(x, y, z)$ such that

$$\underline{F} = \nabla\phi$$

The term $\underline{F} \cdot \delta\underline{r}$ in equation (6.5) can then be written as

$$\underline{F} \cdot \delta\underline{r} = F_x\, \delta x + F_y\, \delta y + F_z\, \delta z = \frac{\partial\phi}{\partial x}\, \delta x + \frac{\partial\phi}{\partial y}\, \delta y + \frac{\partial\phi}{\partial z}\, \delta z = \delta\phi$$

Since the potential energy U is the negative of the force potential $\phi$, we may write

$$\underline{F} \cdot \delta\underline{r} = -\delta U$$

Hence, Hamilton's principle can be written as

$$\int_{t_0}^{t_1} (\delta T - \delta U)\, dt = \delta \int_{t_0}^{t_1} (T - U)\, dt = 0 \tag{6.6}$$

Defining

$$L = T - U \tag{6.7}$$

equation (6.6) can be written as

$$\delta \int_{t_0}^{t_1} L\, dt = 0 \tag{6.8}$$

The integral

$$J = \int_{t_0}^{t_1} L\, dt \tag{6.9}$$

is called the action integral. Here we identify L as the Lagrangian of the system. It is the difference
between the particle's kinetic energy, T, and its potential energy, U. Equation (6.8) states that the
variation of the action integral J is equal to zero. This is precisely the necessary condition for the
action integral to have a minimum. Thus, Hamilton's principle for a conservative system may be
stated as follows:
Among all the possible paths passing through two fixed points, corresponding to the
times t0 and t1, the true motion is performed on that path for which the action integral
is a minimum.
Note: The above derivation of Hamilton's principle can be easily extended to a system of particles by
summation and to a continuous system. It is equally valid for a general dynamical system consisting
of particles and rigid bodies.

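## 6.2.2 Newton's law from Hamilton's principle of least action

The statement of Hamilton's principle of least action is mathematically equivalent to the full set of
Euler-Lagrange equations, which give rise to the equations of motion that determine the actual paths
of the system.
Let the particle have a mass m and coordinates x, y, z. The kinetic energy of the particle is

$$T = \frac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right)$$

We assume that the particle has a potential energy U, which can be expressed in the form

$$U = U(t, x, y, z)$$

such that the negative of the partial derivative of the potential energy in a given direction gives the force in
that direction (see equation (C.12)). That is,

$$F_x = -\frac{\partial U}{\partial x} \qquad F_y = -\frac{\partial U}{\partial y} \qquad F_z = -\frac{\partial U}{\partial z}$$

If the functional (action integral) (6.9) has a minimum, the following Euler-Lagrange equations

$$\frac{\partial L}{\partial x} - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{x}} \right) = 0$$
$$\frac{\partial L}{\partial y} - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{y}} \right) = 0 \tag{6.10}$$
$$\frac{\partial L}{\partial z} - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{z}} \right) = 0$$

must be satisfied. Since L = T - U, and the potential energy U depends only on t, x, y, z, while the kinetic
energy T depends on the velocity components $\dot{x}$, $\dot{y}$, $\dot{z}$, we can write equation (6.10) in the form

$$-\frac{\partial U}{\partial x} - \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{x}} \right) = 0$$
$$-\frac{\partial U}{\partial y} - \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{y}} \right) = 0 \tag{6.11}$$
$$-\frac{\partial U}{\partial z} - \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{z}} \right) = 0$$

or

$$-\frac{\partial U}{\partial x} - \frac{d}{dt}(m\dot{x}) = 0$$
$$-\frac{\partial U}{\partial y} - \frac{d}{dt}(m\dot{y}) = 0 \tag{6.12}$$
$$-\frac{\partial U}{\partial z} - \frac{d}{dt}(m\dot{z}) = 0$$

Since the negative of the partial derivative of the potential energy in a given direction gives the force in that
direction, the system of equations (6.12) reduces to

$$m\ddot{x} = F_x \qquad m\ddot{y} = F_y \qquad m\ddot{z} = F_z \tag{6.13}$$

This is just Newton's second law of motion for a single particle.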

## 6.3 Generalized Coordinates

We know that in a Cartesian coordinate system, the location of a particle in three-dimensional space
is described by three space coordinates (x, y, z). Hence to describe the positions of N particles we require
3N coordinates. If the motion of a system of particles has constraints, it is not required
to give the actual coordinates of every particle to describe the configuration of the system. Suppose,
for instance, that a rigid rod is moving in a plane; then it is sufficient to specify the coordinates of
the center of mass and the angle that the rod makes with the x-axis. From these, the positions of all
points of the rod may be determined.
Consider a system of N particles with radius vectors $\underline{r}_1, \ldots, \underline{r}_N$ relative to a Cartesian coordinate
system, subject to p independent holonomic constraints

$$f_1(\underline{r}_1, \ldots, \underline{r}_N) = 0, \quad \ldots, \quad f_p(\underline{r}_1, \ldots, \underline{r}_N) = 0 \tag{6.14}$$

Due to the existence of the constraints, the 3N coordinates of the particles are not independent; therefore
the number of independent coordinates will be

$$n = 3N - p$$

Here n is the smallest possible number of variables required to describe the configuration of the system.
It is equal to the number of degrees of freedom of the system. For example, a system of two particles
at a fixed distance from each other has 6 - 1 = 5 degrees of freedom.
If the number of particles is large, the presence of constraints makes the determination of the
coordinates xi , yi , zi a difficult task. We shall attach to the n degrees of freedom a number of n
independent variables q1 , . . . , qn called generalized coordinates or Lagrangian variables. Similar to
Cartesian coordinates, generalized coordinates are assumed to be functions of time and completely
specify the state of the system at any instant. Further we assume that there are no relations among
the qi so that they may be regarded as independent.
Unlike Cartesian coordinates, the generalized coordinates do not necessarily have the dimension of
length. In general Lagrangian coordinates may be any suitable geometrical objects, like line segments,
arcs, angles, etc. The choice of generalized coordinates is somewhat arbitrary.
If the system is not subject to constraints, we can choose the generalized coordinates as the 3N
Cartesian coordinates of the particles themselves, but there are also other possible choices. For example, the
position of a particle can also be defined by its cylindrical coordinates $r, \varphi, z$ or its spherical coordinates
$r, \theta, \varphi$.
The 3N Cartesian coordinates $\underline{r}_i$ are then expressed in terms of q1, . . . , qn by

$$\underline{r}_1 = \underline{r}_1(q_1, \ldots, q_n, t) \equiv \underline{r}_1(q, t), \quad \ldots, \quad \underline{r}_N = \underline{r}_N(q_1, \ldots, q_n, t) \equiv \underline{r}_N(q, t) \tag{6.15}$$

## 6.3.1 Configuration Space

The set of radius-vectors r1 , . . . , rN define the so-called configuration of the system of particles (i.e.
the positions of all N particles), in the real space. If we choose q1 , . . . , qn as coordinates of an
imaginary n-dimensional space Rn , then to each set of values of the variables q1 , . . . , qn will correspond a
representative point in this space, known as the configuration space. In other words, any configuration
of a mechanical system can be represented by a single point in the configuration space Rn . Note that
the configuration space does not generally have an intuitive meaning, as does the Euclidean space
used in Newtonian mechanics; but the abstract notions of generalized coordinates and configuration
space are very useful in mechanics.
As the mechanical system changes its configuration with time, the configuration point traces
a curve in configuration space, called generalized trajectory. This is by no means any of the real
trajectories of particles, but describes the motion of the whole system. The generalized trajectory
can be conceived as a succession of representative points, each of them corresponding to a certain
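configuration of the system. The parametric equations of the generalized trajectory are given by

$$q_j = q_j(t) \qquad (j = 1, \ldots, n)$$

Once these equations are known, by means of equation (6.15), the motion of the particles in real space
can also be determined.
The time derivatives of q1, . . . , qn

$$\dot{q}_j = \frac{dq_j}{dt} \qquad (j = 1, \ldots, n)$$

are called generalized velocities. In view of (6.15), the generalized velocities are related to the real
velocities $\dot{\underline{r}}_i$ by

$$\underline{v}_i = \dot{\underline{r}}_i = \frac{d\underline{r}_i}{dt} = \sum_{j=1}^{n} \frac{\partial \underline{r}_i}{\partial q_j} \frac{dq_j}{dt} + \frac{\partial \underline{r}_i}{\partial t} \frac{dt}{dt} = \sum_{j=1}^{n} \frac{\partial \underline{r}_i}{\partial q_j}\, \dot{q}_j + \frac{\partial \underline{r}_i}{\partial t} \qquad (i = 1, \ldots, N) \tag{6.16}$$

The kinetic energy T of the system in terms of the generalized coordinates q1, . . . , qn, and the
generalized velocities $\dot{q}_1, \ldots, \dot{q}_n$ may be obtained as

$$T = \frac{1}{2} \sum_{i=1}^{N} m_i\, \underline{v}_i \cdot \underline{v}_i = \frac{1}{2} \sum_{i=1}^{N} m_i \left( \sum_{j=1}^{n} \frac{\partial \underline{r}_i}{\partial q_j}\, \dot{q}_j + \frac{\partial \underline{r}_i}{\partial t} \right) \cdot \left( \sum_{k=1}^{n} \frac{\partial \underline{r}_i}{\partial q_k}\, \dot{q}_k + \frac{\partial \underline{r}_i}{\partial t} \right)$$

If the constraints do not explicitly depend on time, the above equation for kinetic energy simplifies to

$$T = \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk}\, \dot{q}_j \dot{q}_k \tag{6.17}$$

where

$$a_{jk} = \sum_{i=1}^{N} m_i\, \frac{\partial \underline{r}_i}{\partial q_j} \cdot \frac{\partial \underline{r}_i}{\partial q_k} = \sum_{i=1}^{N} m_i \left( \frac{\partial x_i}{\partial q_j} \frac{\partial x_i}{\partial q_k} + \frac{\partial y_i}{\partial q_j} \frac{\partial y_i}{\partial q_k} + \frac{\partial z_i}{\partial q_j} \frac{\partial z_i}{\partial q_k} \right)$$

It can be shown that $T \ge 0$, and that T is zero only if all $\dot{q}_1, \ldots, \dot{q}_n$ are zero. For example, the kinetic
energy of a particle of mass m in spherical coordinates is

$$T = \frac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \sin^2\theta\, \dot{\varphi}^2 \right)$$

We can therefore conclude that, in general, the kinetic energy has the following functional dependence:

$$T = T(q, \dot{q}, t) \tag{6.18}$$

The potential energy U can also be written in generalized coordinates. U may be a function of
q1, . . . , qn, t, that is

$$U = U(q_1, \ldots, q_n, t)$$

or briefly

$$U = U(q, t) \tag{6.19}$$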

## 6.3.2 Hamilton's principle in generalized coordinates

Consider a conservative mechanical system described by generalized coordinates q1, . . . , qn. Then the
motion of the system from time t0 to t1 is such that the action integral

$$J = \int_{t_0}^{t_1} L(q, \dot{q}, t)\, dt \tag{6.20a}$$

is a minimum for the functions $q_j(t)$ which describe the actual time evolution of the system. Using the
variational operator, Hamilton's principle can be stated concisely as

$$\delta J = \delta \int_{t_0}^{t_1} L(q, \dot{q}, t)\, dt = 0 \tag{6.20b}$$

The Lagrangian L in (6.20) is given by

$$L(q, \dot{q}, t) = T(q, \dot{q}, t) - U(q, t) \tag{6.21}$$

where T and U are the kinetic energy and potential energy of the system respectively.
To find the paths $q_j(t)$ (j = 1, 2, . . . , n) that minimize the action integral we solve the associated
Euler-Lagrange equations. Here the action integral (6.20a) is similar to the functional (3.24), where the
paths $q_j(t)$ play the role of the functions $y_j(x)$, the Lagrangian L plays the role of the integrand F, and
the parameter t plays the role of x.
The necessary condition for the action functional J to be an extremum is that the Lagrangian L
satisfies the following Euler-Lagrange equations (3.25):

$$\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} = 0 \qquad (j = 1, \ldots, n) \tag{6.22}$$

For mechanical systems, the set of equations (6.22) is known as Lagrange's equations rather than
Euler-Lagrange equations. There is one Lagrange equation for each degree of freedom. They form a
set of second-order differential equations for the paths $q_j(t)$. The solutions to these equations, subject
to the boundary conditions at t = t0 and t = t1, are the paths $q_j(t)$ that minimize the action integral
(6.20a).

In the Lagrangian formulation, the motion of the system is described by energy considerations
rather than by forces. Further, since the Lagrangian L is purely a scalar, the form of the equation is
independent of the coordinate system. This formulation is also advantageous because
constraints can be handled in an easier way.
Since, for a conservative mechanical system, the Lagrangian is L = T - U, and the potential energy
U depends only on q, t while the kinetic energy T depends on the velocity components $\dot{q}$ and t, we can write
Lagrange's equation (6.22) in the form

$$\frac{d}{dt}\left( \frac{\partial T}{\partial \dot{q}_j} \right) + \frac{\partial U}{\partial q_j} = 0 \qquad (j = 1, \ldots, n) \tag{6.23}$$

Equation (6.23) gives Lagrange's equations for a conservative mechanical system with n degrees of
freedom.
In view of equation (3.16) (Beltrami identity), if the Lagrangian L is independent of time t, that
is, $\partial L / \partial t = 0$, then Lagrange's equations (6.22) admit the first integral

$$L - \sum_{j=1}^{n} \dot{q}_j\, \frac{\partial L}{\partial \dot{q}_j} = C \tag{6.24}$$

Thus, for a single degree of freedom, the extremizing function q is obtained as the solution of a
first-order differential equation involving q and $\dot{q}$ only.

## 6.4 Applications of Principle of Least Action

Harmonic oscillator
Consider the motion of a mass m attached to a spring in which the restoring force is F = -kx, where
x is the coordinate representing the positive displacement of the mass from equilibrium and k is the
spring constant. We assume there is no damping, so that the mass will vibrate indefinitely if displaced,
i.e., the system is conservative. Determine the differential equation of motion using Lagrange's
formulation and obtain the period of oscillation.

Figure 6.2: Linear harmonic oscillator

The kinetic energy is

$$T = \frac{1}{2} m \dot{x}^2$$

and the potential energy can be found from the relation

$$F = -kx = -\frac{\partial U}{\partial x}$$

Integrating to obtain the potential energy,

$$U = \frac{1}{2} k x^2 + c$$

where c can be set equal to zero by taking the equilibrium position as the reference level. The Lagrangian
is given by

$$L = T - U = \frac{1}{2} m \dot{x}^2 - \frac{1}{2} k x^2$$

By the principle of least action the motion takes place so that

$$J(x) = \int_0^{t_1} \left( \frac{1}{2} m \dot{x}^2 - \frac{1}{2} k x^2 \right) dt$$

is stationary. Hence Lagrange's equation (6.22) becomes

$$\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x} = 0$$

where

$$\frac{\partial L}{\partial x} = -kx \quad \text{and} \quad \frac{\partial L}{\partial \dot{x}} = m\dot{x}$$

Substituting into Lagrange's equation, we obtain

$$\frac{d}{dt}(m\dot{x}) + kx = 0$$

or

$$\ddot{x} + \frac{k}{m}\, x = 0 \tag{6.25}$$

Equation (6.25), which expresses Newton's second law that force equals mass times acceleration, is
the equation of simple harmonic motion. Its general solution is given by

$$x(t) = c_1 \sin \sqrt{\frac{k}{m}}\, t + c_2 \cos \sqrt{\frac{k}{m}}\, t$$

The constants c1 and c2 are determined from the initial conditions

$$x(0) = x_0 \quad \text{and} \quad \dot{x}(0) = 0$$

where x0 is the semi-amplitude of oscillation. So, we have c1 = 0 and c2 = x0. The solution is therefore

$$x = x_0 \cos \sqrt{\frac{k}{m}}\, t = x_0 \cos \omega t$$

where $\omega$ is the angular frequency, given by

$$\omega = \sqrt{k/m}$$

and the frequency is

$$f = \frac{\sqrt{k/m}}{2\pi}$$

and the period of oscillation is

$$T = 2\pi \sqrt{m/k}$$
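The statement that the action is stationary along the true motion can be illustrated numerically. The sketch below is not part of the original notes: the function `action`, the parameter values, and the trial perturbation eps*sin(pi t / t1), which vanishes at both end times, are all assumptions for illustration. It evaluates the action of the harmonic oscillator on the perturbed path x0 cos(omega t) + eps sin(pi t / t1) and estimates the first and second derivatives with respect to eps by central differences.

```python
import math

def action(eps, m=1.0, k=4.0, x0=1.0, t1=1.0, n=4000):
    """Action of x(t) = x0 cos(omega t) + eps sin(pi t / t1) over [0, t1] (midpoint rule)."""
    omega = math.sqrt(k / m)
    h = t1 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x = x0 * math.cos(omega * t) + eps * math.sin(math.pi * t / t1)
        xdot = (-x0 * omega * math.sin(omega * t)
                + eps * (math.pi / t1) * math.cos(math.pi * t / t1))
        total += (0.5 * m * xdot ** 2 - 0.5 * k * x ** 2) * h
    return total

eps = 1e-3
first_order = (action(eps) - action(-eps)) / (2 * eps)                      # ~ first variation
second_order = (action(eps) - 2 * action(0.0) + action(-eps)) / eps ** 2    # ~ second variation

print("first-order change :", first_order)
print("second-order change:", second_order)
```

With these sample values the first-order change is zero to within quadrature error, while the second-order change is positive, so for this perturbation the true path is a local minimum of the action.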

Simple pendulum
Consider a simple pendulum of length l and bob mass m suspended from a frictionless support by
an inextensible string as shown in figure 6.3. Using Lagrange's formulation, determine the period of
oscillation of the simple pendulum.

Figure 6.3: A simple pendulum

To describe the state at any time t, it is convenient to choose the angle $\theta$ between the rest position
and the deflected position as the generalized coordinate, i.e., $q = \theta$.
The kinetic energy of the pendulum is given by

$$T = \frac{1}{2} m v^2 = \frac{1}{2} m (l\dot{\theta})^2 = \frac{1}{2} m l^2 \dot{\theta}^2$$

The potential energy of the bob is mg times the height above its equilibrium position, i.e.,

$$U = mg(OA - OB) = mg(l - l\cos\theta) = mgl(1 - \cos\theta)$$

where the zero potential energy level has been taken as the rest position of the bob. The Lagrangian
is given by

$$L = T - U = \frac{1}{2} m l^2 \dot{\theta}^2 - mgl(1 - \cos\theta)$$

By the principle of least action the motion takes place so that

$$J(\theta) = \int_0^{t_1} \left[ \frac{1}{2} m l^2 \dot{\theta}^2 - mgl(1 - \cos\theta) \right] dt$$

is stationary. Hence Lagrange's equation (6.22) becomes

$$\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\theta}} \right) - \frac{\partial L}{\partial \theta} = 0$$

where

$$\frac{\partial L}{\partial \theta} = -mgl \sin\theta \quad \text{and} \quad \frac{\partial L}{\partial \dot{\theta}} = m l^2 \dot{\theta}$$

Substituting into Lagrange's equation, we obtain

$$\frac{d}{dt}\left( m l^2 \dot{\theta} \right) + mgl \sin\theta = 0$$

or

$$\ddot{\theta} + \frac{g}{l} \sin\theta = 0 \tag{6.26}$$

Equation (6.26) is the governing equation for the system. If the amplitude of oscillation is small
enough that $\sin\theta \approx \theta$, the equation becomes

$$\ddot{\theta} + \frac{g}{l}\, \theta = 0$$

This is the equation of simple harmonic motion, and its general solution is given by

$$\theta = c_1 \sin \sqrt{\frac{g}{l}}\, t + c_2 \cos \sqrt{\frac{g}{l}}\, t$$

The constants c1 and c2 are determined from the initial conditions

$$\theta(0) = \theta_0 \quad \text{and} \quad \dot{\theta}(0) = 0$$

where $\theta_0$ is the semi-amplitude of oscillation. So, we have c1 = 0 and $c_2 = \theta_0$. The solution is
therefore

$$\theta = \theta_0 \cos \sqrt{\frac{g}{l}}\, t = \theta_0 \cos \omega t$$

where $\omega$ is the angular frequency, given by

$$\omega = \sqrt{g/l}$$

and the frequency is

$$f = \frac{\sqrt{g/l}}{2\pi}$$

and the period of oscillation is

$$T = 2\pi \sqrt{l/g}$$
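The small-angle period can be compared with the period of the full nonlinear equation (6.26). The sketch below is an illustration under stated assumptions: it uses a hand-written fourth-order Runge-Kutta integrator (a numerical method not discussed in the notes), a hypothetical function name `pendulum_period`, and sample values g = 9.81 and l = 1. The pendulum is released from rest at a positive angle, and the period is taken as four times the time of the first passage through the rest position.

```python
import math

def pendulum_period(theta0, g=9.81, l=1.0, dt=1e-4):
    """Period of theta'' = -(g/l) sin(theta), released from rest at theta0 > 0."""
    def accel(theta):
        return -(g / l) * math.sin(theta)

    theta, omega, t = theta0, 0.0, 0.0
    while theta > 0.0:
        prev_theta, prev_t = theta, t
        # One classical RK4 step for the system (theta, omega).
        k1t, k1w = omega, accel(theta)
        k2t, k2w = omega + 0.5 * dt * k1w, accel(theta + 0.5 * dt * k1t)
        k3t, k3w = omega + 0.5 * dt * k2w, accel(theta + 0.5 * dt * k2t)
        k4t, k4w = omega + dt * k3w, accel(theta + dt * k3t)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6.0
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
        t += dt
    # Linear interpolation for the instant at which theta crosses zero (a quarter period).
    t_cross = prev_t + dt * prev_theta / (prev_theta - theta)
    return 4.0 * t_cross

T_small = 2.0 * math.pi * math.sqrt(1.0 / 9.81)   # small-angle formula T = 2 pi sqrt(l/g)
print("small-angle formula :", T_small)
print("theta0 = 0.05 rad   :", pendulum_period(0.05))
print("theta0 = 1.00 rad   :", pendulum_period(1.0))
```

For a small release angle the two periods agree closely, while for a large release angle the nonlinear period is noticeably longer, which shows the range of validity of the approximation $\sin\theta \approx \theta$.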

Compound pendulum
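A rigid body pivoted about a horizontal axis which does not coincide with the center of mass and able
to oscillate freely is called a compound pendulum. Let O be the point of suspension through
which the axis passes and A the center of mass. Let the mass of the pendulum be m, its moment
of inertia about the axis of rotation I, and the distance OA = l.

Figure 6.4: A compound pendulum

The kinetic energy of the pendulum is

$$T = \frac{1}{2} I \dot{\theta}^2$$

and the potential energy of the pendulum is

$$U = -mgl \cos\theta$$

where the zero potential energy level has been taken at a horizontal plane passing through the point
O. The Lagrangian is given by

$$L = T - U = \frac{1}{2} I \dot{\theta}^2 + mgl \cos\theta$$

By the principle of least action the motion takes place so that

$$J(\theta) = \int_0^{t_1} \left( \frac{1}{2} I \dot{\theta}^2 + mgl \cos\theta \right) dt$$

is stationary. Hence Lagrange's equation (6.22) becomes

$$\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\theta}} \right) - \frac{\partial L}{\partial \theta} = 0$$

where

$$\frac{\partial L}{\partial \theta} = -mgl \sin\theta \quad \text{and} \quad \frac{\partial L}{\partial \dot{\theta}} = I\dot{\theta}$$

Substituting into Lagrange's equation, we obtain

$$\frac{d}{dt}\left( I\dot{\theta} \right) + mgl \sin\theta = 0$$

or

$$\ddot{\theta} + \frac{mgl}{I} \sin\theta = 0 \tag{6.27}$$

Equation (6.27) is the governing equation for the compound pendulum. If the amplitude of oscillation
is small enough that $\sin\theta \approx \theta$, the equation becomes

$$\ddot{\theta} + \frac{mgl}{I}\, \theta = 0$$

This is the equation of simple harmonic motion, and its general solution is given by

$$\theta = c_1 \sin \sqrt{\frac{mgl}{I}}\, t + c_2 \cos \sqrt{\frac{mgl}{I}}\, t$$

and the period of oscillation is

$$T = 2\pi \sqrt{I/mgl}$$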

## Particle moving in a central force field

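Consider the planar motion of a mass $m$ that is attracted to the origin with a force inversely proportional to the square of the distance from the origin. For generalized coordinates we take the polar coordinates $(r, \theta)$.

Figure 6.5: The motion of a mass $m$ in the plane.

The kinetic energy is

$$\begin{aligned}
T &= \frac{1}{2} m v^2 = \frac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 \right) \\
  &= \frac{1}{2} m \left[ \left( \frac{d}{dt}(r\cos\theta) \right)^2 + \left( \frac{d}{dt}(r\sin\theta) \right)^2 \right] \\
  &= \frac{1}{2} m \left[ \left( \dot{r}\cos\theta - r\dot{\theta}\sin\theta \right)^2 + \left( \dot{r}\sin\theta + r\dot{\theta}\cos\theta \right)^2 \right] \\
  &= \frac{1}{2} m \left( \dot{r}^2 + r^2\dot{\theta}^2 \right)
\end{aligned}$$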

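The central force acting on the particle is given by

$$F = -\frac{k}{r^2}$$

where the negative sign accounts for the attractive nature of the force. Thus, the potential energy of the particle is

$$U = -\int_{\infty}^{r} F\,dr = \int_{\infty}^{r} \frac{k}{r^2}\,dr = -\frac{k}{r}$$

where the zero potential energy level has been taken at $r \to \infty$. The Lagrangian is given by

$$L = T - U = \frac{1}{2} m \left( \dot{r}^2 + r^2\dot{\theta}^2 \right) + \frac{k}{r}$$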

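By the principle of least action, the motion takes place so that

$$J(r, \theta) = \int_{0}^{t_1} \left[ \frac{m}{2}\left( \dot{r}^2 + r^2\dot{\theta}^2 \right) + \frac{k}{r} \right] dt$$

is stationary. Hence Lagrange's equations (6.22) become

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}}\right) - \frac{\partial L}{\partial \theta} = 0$$

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{r}}\right) - \frac{\partial L}{\partial r} = 0$$

where

$$\frac{\partial L}{\partial \theta} = 0, \qquad \frac{\partial L}{\partial r} = mr\dot{\theta}^2 - \frac{k}{r^2}, \qquad \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta}, \qquad \frac{\partial L}{\partial \dot{r}} = m\dot{r}$$

Plugging these into Lagrange's equations, we obtain

$$\frac{d}{dt}\left( mr^2\dot{\theta} \right) = 0$$

$$\frac{d}{dt}\left( m\dot{r} \right) - mr\dot{\theta}^2 + \frac{k}{r^2} = 0$$

or

$$mr^2\dot{\theta} = \text{const.}, \qquad m\ddot{r} - mr\dot{\theta}^2 + \frac{k}{r^2} = 0 \tag{6.28}$$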

This coupled pair of ordinary differential equations can be solved exactly to determine the path of a particle moving in a central force field. Note that the first equation tells us that the angular momentum ($mr^2\dot{\theta}$) of the particle is conserved during the motion. The term

$$mr\dot{\theta}^2 = \frac{mv^2}{r} \qquad (v = r\dot{\theta})$$

in the second equation represents the centrifugal force on the particle. This equation says that the net force in the radial direction is the sum of the centrally directed attraction ($-k/r^2$) and the centrifugal force ($mr\dot{\theta}^2$).
The case of a satellite travelling about a spherical earth is an example of a particle moving in a
central force field.
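The conservation statement can be illustrated numerically. The sketch below is a minimal example under stated assumptions: it rewrites (6.28) as the first-order system $\ddot{r} = r\dot{\theta}^2 - k/(mr^2)$, $\ddot{\theta} = -2\dot{r}\dot{\theta}/r$, integrates it with a fourth-order Runge-Kutta scheme, and records the angular momentum $mr^2\dot{\theta}$ and the total energy $T + U$ along the orbit. The parameter values, initial conditions and function name are arbitrary illustrative choices.

```python
def simulate_orbit(m=1.0, k=1.0, r0=1.0, rdot0=0.0, thetadot0=1.2,
                   dt=1e-3, steps=10000):
    """Integrate equations (6.28) and return lists of the angular
    momentum and the total energy at every time step."""
    def deriv(state):
        r, rdot, theta, thetadot = state
        rddot = r * thetadot**2 - k / (m * r**2)    # radial equation of (6.28)
        thetaddot = -2.0 * rdot * thetadot / r      # from d/dt (m r^2 thetadot) = 0
        return (rdot, rddot, thetadot, thetaddot)

    def shifted(state, slope, h):
        return tuple(s + h * d for s, d in zip(state, slope))

    state = (r0, rdot0, 0.0, thetadot0)
    momentum, energy = [], []
    for _ in range(steps + 1):
        r, rdot, _, thetadot = state
        momentum.append(m * r**2 * thetadot)
        energy.append(0.5 * m * (rdot**2 + r**2 * thetadot**2) - k / r)
        # Classical RK4 step
        k1 = deriv(state)
        k2 = deriv(shifted(state, k1, 0.5 * dt))
        k3 = deriv(shifted(state, k2, 0.5 * dt))
        k4 = deriv(shifted(state, k3, dt))
        state = tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return momentum, energy

if __name__ == "__main__":
    h, e = simulate_orbit()
    print(max(h) - min(h), max(e) - min(e))
```

Both printed spreads should be close to zero (limited only by integration error), in line with the conserved angular momentum noted above and with the conservative nature of the force.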

Double pendulum
A double pendulum is a pendulum with another pendulum attached to its end as shown in figure 6.6.
Using Lagrange's formulation, determine the equations of motion of the double pendulum.

Figure 6.6: A double pendulum

We use rectangular coordinates, with $(x_1, y_1)$ and $(x_2, y_2)$ as the coordinates of the two masses $m_1$ and $m_2$ respectively. Then, in terms of the independent (generalized) coordinates $\theta_1$ and $\theta_2$, we have (choosing $y$ positive down)

$$\begin{aligned}
x_1 &= l_1 \sin\theta_1 \\
y_1 &= l_1 \cos\theta_1 \\
x_2 &= l_1 \sin\theta_1 + l_2 \sin\theta_2 \\
y_2 &= l_1 \cos\theta_1 + l_2 \cos\theta_2
\end{aligned}$$

From these transformation equations, we obtain the following derivatives:

$$\begin{aligned}
\dot{x}_1 &= l_1 \dot{\theta}_1 \cos\theta_1 \\
\dot{y}_1 &= -l_1 \dot{\theta}_1 \sin\theta_1 \\
\dot{x}_2 &= l_1 \dot{\theta}_1 \cos\theta_1 + l_2 \dot{\theta}_2 \cos\theta_2 \\
\dot{y}_2 &= -l_1 \dot{\theta}_1 \sin\theta_1 - l_2 \dot{\theta}_2 \sin\theta_2
\end{aligned}$$
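The kinetic energy of the system is given by

$$\begin{aligned}
T = T_1 + T_2 &= \frac{1}{2} m_1 \left( \dot{x}_1^2 + \dot{y}_1^2 \right) + \frac{1}{2} m_2 \left( \dot{x}_2^2 + \dot{y}_2^2 \right) \\
&= \frac{1}{2} m_1 l_1^2 \dot{\theta}_1^2 + \frac{1}{2} m_2 \left[ l_1^2 \dot{\theta}_1^2 + l_2^2 \dot{\theta}_2^2 + 2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \cos(\theta_1 - \theta_2) \right]
\end{aligned}$$

The potential energy of the system above its equilibrium position (a distance $l_1 + l_2$ below the point of suspension) is

$$U = U_1 + U_2 = m_1 g \left( l_1 + l_2 - l_1 \cos\theta_1 \right) + m_2 g \left( l_1 + l_2 - l_1 \cos\theta_1 - l_2 \cos\theta_2 \right)$$

Then the Lagrangian is given by

$$\begin{aligned}
L = T - U ={} & \frac{1}{2} m_1 l_1^2 \dot{\theta}_1^2 + \frac{1}{2} m_2 \left[ l_1^2 \dot{\theta}_1^2 + l_2^2 \dot{\theta}_2^2 + 2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \cos(\theta_1 - \theta_2) \right] \\
& - m_1 g \left( l_1 + l_2 - l_1 \cos\theta_1 \right) - m_2 g \left( l_1 + l_2 - l_1 \cos\theta_1 - l_2 \cos\theta_2 \right)
\end{aligned}$$

By the principle of least action, the motion takes place so that

$$J(\theta_1, \theta_2) = \int_{0}^{t_1} L\,dt$$

is stationary. Hence Lagrange's equations become

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}_1}\right) - \frac{\partial L}{\partial \theta_1} = 0$$

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}_2}\right) - \frac{\partial L}{\partial \theta_2} = 0$$

where

$$\begin{aligned}
\frac{\partial L}{\partial \theta_1} &= -m_2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) - m_1 g l_1 \sin\theta_1 - m_2 g l_1 \sin\theta_1 \\
\frac{\partial L}{\partial \dot{\theta}_1} &= m_1 l_1^2 \dot{\theta}_1 + m_2 l_1^2 \dot{\theta}_1 + m_2 l_1 l_2 \dot{\theta}_2 \cos(\theta_1 - \theta_2) \\
\frac{\partial L}{\partial \theta_2} &= m_2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) - m_2 g l_2 \sin\theta_2 \\
\frac{\partial L}{\partial \dot{\theta}_2} &= m_2 l_2^2 \dot{\theta}_2 + m_2 l_1 l_2 \dot{\theta}_1 \cos(\theta_1 - \theta_2)
\end{aligned}$$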

## 6.4. APPLICATIONS OF PRINCIPLE OF LEAST ACTION


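Plugging these expressions into Lagrange's equations, we obtain after some manipulation

$$(m_1 + m_2) l_1^2 \ddot{\theta}_1 + m_2 l_1 l_2 \ddot{\theta}_2 \cos(\theta_1 - \theta_2) + m_2 l_1 l_2 \dot{\theta}_2^2 \sin(\theta_1 - \theta_2) = -(m_1 + m_2) g l_1 \sin\theta_1 \tag{6.29a}$$

$$l_2 \ddot{\theta}_2 + l_1 \ddot{\theta}_1 \cos(\theta_1 - \theta_2) - l_1 \dot{\theta}_1^2 \sin(\theta_1 - \theta_2) = -g \sin\theta_2 \tag{6.29b}$$

The system of equations (6.29) governs the double pendulum with masses $m_1$ and $m_2$ and lengths $l_1$ and $l_2$.
If $m_1 = m_2$, the system (6.29) assumes the form

$$2 l_1 \ddot{\theta}_1 + l_2 \ddot{\theta}_2 \cos(\theta_1 - \theta_2) + l_2 \dot{\theta}_2^2 \sin(\theta_1 - \theta_2) = -2g \sin\theta_1 \tag{6.30a}$$

$$l_2 \ddot{\theta}_2 + l_1 \ddot{\theta}_1 \cos(\theta_1 - \theta_2) - l_1 \dot{\theta}_1^2 \sin(\theta_1 - \theta_2) = -g \sin\theta_2 \tag{6.30b}$$

If $m_1 = m_2$ and $l_1 = l_2 = l$, the system (6.30) assumes the form

$$2 \ddot{\theta}_1 + \ddot{\theta}_2 \cos(\theta_1 - \theta_2) + \dot{\theta}_2^2 \sin(\theta_1 - \theta_2) = -2\frac{g}{l} \sin\theta_1 \tag{6.31a}$$

$$\ddot{\theta}_2 + \ddot{\theta}_1 \cos(\theta_1 - \theta_2) - \dot{\theta}_1^2 \sin(\theta_1 - \theta_2) = -\frac{g}{l} \sin\theta_2 \tag{6.31b}$$

If the amplitude of oscillation is small enough, then $\sin\theta \approx \theta$ and $\cos\theta \approx 1$, and neglecting the terms involving $\dot{\theta}^2$, the system (6.31) becomes

$$2 \ddot{\theta}_1 + \ddot{\theta}_2 = -2\frac{g}{l}\,\theta_1 \tag{6.32a}$$

$$\ddot{\theta}_1 + \ddot{\theta}_2 = -\frac{g}{l}\,\theta_2 \tag{6.32b}$$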
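The linearized system (6.32) can be taken one step further by looking for normal modes. This step is not carried out in the notes; the sketch below assumes trial solutions $\theta_i = A_i \cos\omega t$ and writes $\lambda = \omega^2 l / g$, so that (6.32) reduces to $(2\lambda - 2)A_1 + \lambda A_2 = 0$ and $\lambda A_1 + (\lambda - 1)A_2 = 0$, whose determinant condition is $\lambda^2 - 4\lambda + 2 = 0$. The function names are hypothetical.

```python
import math

def mode_frequencies(l, g=9.81):
    """Return (omega_slow, omega_fast) for the linearized system (6.32).

    With lam = omega^2 * l / g, the determinant condition
    lam^2 - 4*lam + 2 = 0 has the roots lam = 2 -/+ sqrt(2).
    """
    lam_slow = 2.0 - math.sqrt(2.0)
    lam_fast = 2.0 + math.sqrt(2.0)
    return math.sqrt(lam_slow * g / l), math.sqrt(lam_fast * g / l)

def amplitude_ratio(omega, l, g=9.81):
    """Ratio A2/A1 obtained from (6.32b): lam*A1 + (lam - 1)*A2 = 0."""
    lam = omega**2 * l / g
    return lam / (1.0 - lam)

if __name__ == "__main__":
    slow, fast = mode_frequencies(1.0)
    print(slow, amplitude_ratio(slow, 1.0))   # masses swing in phase
    print(fast, amplitude_ratio(fast, 1.0))   # masses swing out of phase
```

The two amplitude ratios have opposite signs, so in the slow mode both masses move in the same direction, while in the fast mode they move in opposite directions.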

Vibrating string
We now consider the transverse vibrations of an elastic string. We place the string along the x-axis, stretch it to length $\ell$ and fasten it at the ends $x = 0$ and $x = \ell$. We then distort the string (since it is elastic) by displacing it from its equilibrium position, release it, and allow it to vibrate. The problem is to find the deflection $u(x,t)$ at any point $x$ and at any time $t > 0$. The following assumptions are made:

Figure 6.7: Deflected string at time $t$.

(1) The mass of the string per unit length is constant, i.e., the string is homogeneous.
(2) The string performs small motions in the vertical plane, i.e., every particle of the string moves only in the vertical direction.
(3) There is no damping and the string will vibrate indefinitely if displaced, i.e., the system is conservative.
(4) The gravitational force is neglected in comparison with the initial tension in the string.

Since the string is fixed at the ends, the boundary conditions are

$$u(0,t) = u(\ell,t) = 0$$

If $\rho$ is the mass of the undeflected string per unit length, the kinetic energy of the string is

$$T = \frac{1}{2} \int_{0}^{\ell} \rho \left( \frac{\partial u}{\partial t} \right)^2 dx$$
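The potential energy is related to the elongation (stretching) of the string. The arc length of the elastic string is

$$S = \int_{0}^{\ell} \sqrt{1 + \left( \frac{\partial u}{\partial x} \right)^2}\,dx$$

and the elongation due to the transverse motion is

$$\Delta\ell = S - \ell = \int_{0}^{\ell} \sqrt{1 + \left( \frac{\partial u}{\partial x} \right)^2}\,dx - \ell$$

Because of the small elongation assumption,

$$\left| \frac{\partial u}{\partial x} \right| \ll 1$$

it is reasonable to approximate

$$\sqrt{1 + \left( \frac{\partial u}{\partial x} \right)^2} \approx 1 + \frac{1}{2} \left( \frac{\partial u}{\partial x} \right)^2$$

so that

$$\Delta\ell \approx \frac{1}{2} \int_{0}^{\ell} \left( \frac{\partial u}{\partial x} \right)^2 dx$$

and the potential energy contained in the elongated string is

$$U = F\,\Delta\ell = \frac{F}{2} \int_{0}^{\ell} \left( \frac{\partial u}{\partial x} \right)^2 dx$$

where $F$ is the tension in the string.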
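The Lagrangian is given by

$$\begin{aligned}
L = T - U &= \frac{1}{2} \int_{0}^{\ell} \rho \left( \frac{\partial u}{\partial t} \right)^2 dx - \frac{F}{2} \int_{0}^{\ell} \left( \frac{\partial u}{\partial x} \right)^2 dx \\
&= \frac{1}{2} \int_{0}^{\ell} \left[ \rho \left( \frac{\partial u}{\partial t} \right)^2 - F \left( \frac{\partial u}{\partial x} \right)^2 \right] dx \\
&= \frac{1}{2} \int_{0}^{\ell} \left[ \rho\,(u_t)^2 - F\,(u_x)^2 \right] dx
\end{aligned}$$

By the principle of least action, the motion takes place so that

$$J(u) = \frac{1}{2} \int_{0}^{t_1} \int_{0}^{\ell} \left[ \rho \left( \frac{\partial u}{\partial t} \right)^2 - F \left( \frac{\partial u}{\partial x} \right)^2 \right] dx\,dt$$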


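is stationary. Here, the unknown function $u$ is a function of the two independent coordinates $x$ and $t$. Further, the Lagrangian $L$ does not depend explicitly on $x$, $t$, or $u$. Thus, the appropriate Lagrange's equation is (3.23). Noting that $\partial L/\partial u = 0$, the Lagrange equation takes the form

$$\frac{\partial}{\partial t}\left( \frac{\partial L}{\partial u_t} \right) + \frac{\partial}{\partial x}\left( \frac{\partial L}{\partial u_x} \right) = 0$$

The partial derivatives can be evaluated using the Leibnitz rule of differentiation under the integral sign (footnote 2) as follows:

$$\frac{\partial L}{\partial u_t} = \frac{1}{2} \frac{\partial}{\partial u_t} \int_{0}^{\ell} \rho\,(u_t)^2\,dx = \frac{1}{2} \int_{0}^{\ell} \frac{\partial}{\partial u_t} \left[ \rho\,(u_t)^2 \right] dx = \int_{0}^{\ell} \rho\,u_t\,dx$$

and

$$\frac{\partial L}{\partial u_x} = -\frac{1}{2} \frac{\partial}{\partial u_x} \int_{0}^{\ell} F\,(u_x)^2\,dx = -\int_{0}^{\ell} F\,u_x\,dx$$

Substituting these into the Lagrange equation gives

$$\frac{\partial}{\partial t} \int_{0}^{\ell} \rho\,u_t\,dx - \frac{\partial}{\partial x} \int_{0}^{\ell} F\,u_x\,dx = 0$$

Using the Leibnitz rule again, we have

$$\int_{0}^{\ell} \left( \rho\,u_{tt} - F\,u_{xx} \right) dx = 0$$

Since this equation must hold for arbitrary limits of integration, we should have

$$\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2} \tag{6.33}$$

where the physical constant $F/\rho$ is denoted by $c^2$ to indicate that this constant is positive. This is the well-known one-dimensional wave equation.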
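Equation (6.33) can be spot-checked on a simple candidate solution. The sketch below assumes the standing wave $u(x,t) = \sin(n\pi x/\ell)\cos(n\pi c t/\ell)$, which is the familiar separation-of-variables mode and is not derived in these notes; the tension and density values are arbitrary. It evaluates $u_{tt} - c^2 u_{xx}$ with central finite differences and confirms the fixed-end boundary conditions.

```python
import math

def standing_wave(x, t, n=1, length=1.0, c=1.0):
    """Assumed trial solution sin(n*pi*x/length) * cos(n*pi*c*t/length)."""
    k = n * math.pi / length
    return math.sin(k * x) * math.cos(k * c * t)

def wave_residual(u, x, t, c, h=1e-4):
    """Central-difference estimate of u_tt - c^2 * u_xx at the point (x, t)."""
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h**2
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h**2
    return u_tt - c**2 * u_xx

if __name__ == "__main__":
    tension, density = 4.0, 1.0            # illustrative values of F and rho
    c = math.sqrt(tension / density)       # wave speed, c^2 = F / rho
    mode = lambda x, t: standing_wave(x, t, n=2, length=1.0, c=c)
    print(wave_residual(mode, 0.3, 0.7, c))    # small compared with u_tt itself
    print(mode(0.0, 0.7), mode(1.0, 0.7))      # both ends stay (numerically) at zero
```

The residual is not exactly zero because of truncation and rounding in the finite differences, but it is several orders of magnitude smaller than the individual second derivatives.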

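Note: The solution of the wave equation using the method of separation of variables can be included if needed.

Footnote 2 (Leibnitz rule):

$$\frac{d}{dt} \int_{a(t)}^{b(t)} u(x,t)\,dx = \int_{a(t)}^{b(t)} \frac{\partial u(x,t)}{\partial t}\,dx + u(b(t),t)\,\frac{db}{dt} - u(a(t),t)\,\frac{da}{dt}$$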


Appendix A
Solution of Brachistochrone Problem
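The Euler-Lagrange equation for the brachistochrone problem reduces to

$$y \left( 1 + y'^2 \right) = C$$

or

$$y = \frac{C}{1 + y'^2}$$

To solve this differential equation, we substitute $y' = \cot\theta$, where $\theta$ is a parameter. Then we have

$$y = \frac{C}{1 + \cot^2\theta} = C\sin^2\theta = \frac{C}{2}\left( 1 - \cos 2\theta \right)$$

Now $dx$ can be expressed as follows:

$$dx = \frac{dy}{y'} = \frac{\frac{C}{2}\left( 2\sin 2\theta \right) d\theta}{\cot\theta} = \frac{2C\sin\theta\cos\theta\,d\theta}{\cot\theta} = 2C\sin^2\theta\,d\theta$$

so that

$$dx = C\left( 1 - \cos 2\theta \right) d\theta$$

Integrating the above differential equation, we obtain

$$x = C\left( \theta - \frac{\sin 2\theta}{2} \right) + D$$

where the constant of integration $D$ is determined from the condition $y(0) = 0$, which gives $D = 0$. Putting $2\theta = \phi$, we can write

$$x = \frac{C}{2}\left( \phi - \sin\phi \right) \qquad \text{and} \qquad y = \frac{C}{2}\left( 1 - \cos\phi \right)$$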
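The parametric solution can be checked against the first integral $y(1 + y'^2) = C$. The sketch below is a numerical spot check, not a proof: it assumes the parametrization $x = \tfrac{C}{2}(\phi - \sin\phi)$, $y = \tfrac{C}{2}(1 - \cos\phi)$ with the parameter written as $\phi$, estimates the slope $y' = dy/dx$ by a central difference in $\phi$, and evaluates the left-hand side. The function names and the sample value of $C$ are illustrative.

```python
import math

def cycloid(phi, C=1.0):
    """Point (x, y) on the curve for parameter value phi."""
    x = 0.5 * C * (phi - math.sin(phi))
    y = 0.5 * C * (1.0 - math.cos(phi))
    return x, y

def first_integral(phi, C=1.0, h=1e-6):
    """Evaluate y * (1 + y'^2), with y' = dy/dx estimated numerically."""
    x_plus, y_plus = cycloid(phi + h, C)
    x_minus, y_minus = cycloid(phi - h, C)
    slope = (y_plus - y_minus) / (x_plus - x_minus)
    _, y = cycloid(phi, C)
    return y * (1.0 + slope**2)

if __name__ == "__main__":
    for phi in (0.5, 1.5, 3.0):
        print(phi, first_integral(phi, C=2.0))   # each value should be close to C
```

The check avoids $\phi = 0$, where both $dx$ and $dy$ vanish and the slope is unbounded.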


Appendix B
Method of Lagrange Multipliers
The method of Lagrange multipliers is a technique for finding a maximum or minimum of a function $F(x, y, z)$ subject to a constraint (also called a side condition) of the form $G(x, y, z) = 0$.
The geometric basis of the method is easiest to explain for functions of two variables. So we start by trying to find the extreme values of $F(x, y)$ subject to a constraint of the form $G(x, y) = 0$. In other words, we seek the extreme values of $F(x, y)$ when the point $(x, y)$ is restricted to lie on the level curve $G(x, y) = 0$. The figure below shows this curve together with several level curves $F(x, y) = c$, where $c$ is a constant. To maximize $F(x, y)$ subject to $G(x, y) = 0$ is to find the largest value of $c$ such that the level curve $F(x, y) = c$ intersects $G(x, y) = 0$. It appears from the figure that this happens when these curves just touch each other, that is, when they have a common tangent line. This means that the normal lines at the point $(x_0, y_0)$ where they touch are identical, so the gradient vectors are parallel. That is,

$$\nabla F(x_0, y_0) = \lambda\,\nabla G(x_0, y_0)$$

for some scalar $\lambda$. The scalar parameter $\lambda$ is called a Lagrange multiplier.

Figure B.1: The constraint curve $G(x, y) = 0$ and level curves of $F(x, y) = \text{constant}$.

The procedure based on the above equation is as follows. We have from the chain rule,

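$$dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy = 0, \qquad dG = \frac{\partial G}{\partial x}\,dx + \frac{\partial G}{\partial y}\,dy = 0$$

Multiplying the second equation by $\lambda$ and adding it to the first equation yields

$$\left( \frac{\partial F}{\partial x} + \lambda \frac{\partial G}{\partial x} \right) dx + \left( \frac{\partial F}{\partial y} + \lambda \frac{\partial G}{\partial y} \right) dy = 0$$

By choosing $\lambda$ to satisfy, for example,

$$\frac{\partial F}{\partial x} + \lambda \frac{\partial G}{\partial x} = 0,$$

we are left with

$$\frac{\partial F}{\partial y} + \lambda \frac{\partial G}{\partial y} = 0$$

As can be seen, the above two equations are the components of the vector equation

$$\nabla F + \lambda\,\nabla G = 0 \tag{B.1}$$

which is the parallel-gradient condition stated earlier with $\lambda$ replaced by $-\lambda$.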

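Thus, the maximum and minimum values of $F(x, y)$ subject to the constraint $G(x, y) = 0$ can be found by solving the following set of equations:

$$\begin{aligned}
\frac{\partial F}{\partial x} + \lambda \frac{\partial G}{\partial x} &= 0 \\
\frac{\partial F}{\partial y} + \lambda \frac{\partial G}{\partial y} &= 0 \\
G(x, y) &= 0
\end{aligned} \tag{B.2}$$

This is a system of three equations in the three unknowns $x$, $y$, and $\lambda$, but it is not necessary to find explicit values for $\lambda$.
If the function to be extremized, $F$, and the side condition $G$ are functions of three independent variables $x$, $y$, and $z$, the following system of equations is solved to obtain the minimum or maximum value of $F$:

$$\begin{aligned}
\frac{\partial F}{\partial x} + \lambda \frac{\partial G}{\partial x} &= 0 \\
\frac{\partial F}{\partial y} + \lambda \frac{\partial G}{\partial y} &= 0 \\
\frac{\partial F}{\partial z} + \lambda \frac{\partial G}{\partial z} &= 0 \\
G(x, y, z) &= 0
\end{aligned} \tag{B.3}$$

This is a system of four equations in the four unknowns $x$, $y$, $z$, and $\lambda$.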

Example B.1
A rectangular box without a lid is to be made from 27 m² of cardboard. Find the maximum volume of such a box.
Method I. We first solve this relatively simple problem without using a Lagrange multiplier. Let the length, width, and height of the box (in meters) be $x$, $y$, and $z$. Then the volume of the box is

$$V = xyz$$

We can express $V$ as a function of just the two variables $x$ and $y$ by using the fact that the area of the five sides of the box is

$$xy + 2yz + 2xz = 27$$

Solving this equation for $z$, we get

$$z = \frac{27 - xy}{2(x + y)}$$

so that

$$V = xy\,\frac{27 - xy}{2(x + y)} = \frac{27xy - x^2 y^2}{2(x + y)}$$

The partial derivatives of $V$ are

$$\frac{\partial V}{\partial x} = \frac{y^2 \left( 27 - 2xy - x^2 \right)}{2(x + y)^2}, \qquad \frac{\partial V}{\partial y} = \frac{x^2 \left( 27 - 2xy - y^2 \right)}{2(x + y)^2}$$

If $V$ is a maximum, then $\partial V/\partial x = \partial V/\partial y = 0$, but $x = 0$ or $y = 0$ gives $V = 0$, so we must solve the equations

$$27 - 2xy - x^2 = 0$$

$$27 - 2xy - y^2 = 0$$

These equations imply that $x = y$ (note that both $x$ and $y$ must be positive here). Putting $y = x$ in one of these equations, we get $27 - 3x^2 = 0$, which gives $x = 3$, $y = 3$, and $z = 1.5$. Thus the maximum volume occurs at $x = 3$, $y = 3$, and $z = 1.5$, so that the maximum volume of the box is 13.5 m³.
Method II. Here we wish to maximize

$$V = xyz$$

subject to the constraint

$$G(x, y, z) = xy + 2yz + 2xz - 27 = 0$$

Using the method of Lagrange multipliers, we look for values of $x$, $y$, $z$, and $\lambda$ such that

$$\frac{\partial V}{\partial x} + \lambda \frac{\partial G}{\partial x} = 0, \qquad \frac{\partial V}{\partial y} + \lambda \frac{\partial G}{\partial y} = 0, \qquad \frac{\partial V}{\partial z} + \lambda \frac{\partial G}{\partial z} = 0$$

$$xy + 2yz + 2xz = 27$$

which become

$$yz + \lambda (y + 2z) = 0 \tag{B.4}$$

$$xz + \lambda (x + 2z) = 0 \tag{B.5}$$

$$xy + \lambda (2y + 2x) = 0 \tag{B.6}$$

$$xy + 2yz + 2xz = 27 \tag{B.7}$$

To solve this system of equations in a convenient manner, we multiply equation (B.4) by $x$, (B.5) by $y$, and (B.6) by $z$, so that the left sides of these equations become identical. Thus, we have

$$xyz = -\lambda (xy + 2xz) \tag{B.8}$$

$$xyz = -\lambda (xy + 2yz) \tag{B.9}$$

$$xyz = -\lambda (2yz + 2xz) \tag{B.10}$$

Figure B.2: Illustration of Snell's law

We observe that $\lambda \neq 0$, because $\lambda = 0$ would imply $xy = yz = xz = 0$, and this would contradict equation (B.7). Therefore, from equations (B.8) and (B.9), we have $xz = yz$. Since $z$ cannot be zero, we have $x = y$. From equations (B.9) and (B.10), we have $y = 2z$. If we now put $x = y = 2z$ in equation (B.7), we get

$$12z^2 = 27$$

Since $x$, $y$, and $z$ are all positive, we therefore have $z = 1.5$ and so $x = 3$ and $y = 3$.
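The result of Example B.1 can be cross-checked by brute force. The sketch below is only a sanity check under stated assumptions: it eliminates $z$ through the area constraint, scans a uniform grid of $(x, y)$ values (the grid size and the upper bound of 6 m are arbitrary choices), and then evaluates the left sides of (B.4) to (B.6) at the analytical solution, with $\lambda$ obtained from (B.4). The function names are hypothetical.

```python
def height(x, y, area=27.0):
    """Height z that uses up the cardboard: xy + 2yz + 2xz = area."""
    return (area - x * y) / (2.0 * (x + y))

def best_box(area=27.0, n=600, side_max=6.0):
    """Grid search for the largest volume; returns (V, x, y, z)."""
    best = None
    for i in range(1, n + 1):
        x = side_max * i / n
        for j in range(1, n + 1):
            y = side_max * j / n
            z = height(x, y, area)
            if z <= 0.0:
                continue                      # not a physical box
            volume = x * y * z
            if best is None or volume > best[0]:
                best = (volume, x, y, z)
    return best

def multiplier_residuals(x, y, z):
    """Left sides of (B.4)-(B.6), with lambda taken from (B.4)."""
    lam = -y * z / (y + 2.0 * z)
    return (y * z + lam * (y + 2.0 * z),
            x * z + lam * (x + 2.0 * z),
            x * y + lam * (2.0 * y + 2.0 * x))

if __name__ == "__main__":
    print(best_box())
    print(multiplier_residuals(3.0, 3.0, 1.5))
```

The grid search is far less efficient than either analytical method, but it makes no use of derivatives and therefore gives an independent confirmation of the maximizer.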

Example B.2
Here we demonstrate how the method of Lagrange multipliers can be used to prove Snell's law. Consider an inhomogeneous optical medium consisting of two homogeneous media, so that the speed of light is piecewise constant. Suppose that light travels from a point $P_1(x_1, y_1)$, with a constant speed $v_1$ in a homogeneous medium $M_1$, to a point $P_2(x_2, y_2)$, with a constant speed $v_2$ in another homogeneous medium $M_2$. The two media are separated by the line $y = y_0$.
The time of transit of the light is given by the geometry as

$$T = \frac{y_1/\cos\theta_1}{v_1} + \frac{y_2/\cos\theta_2}{v_2}$$

and is subject to the geometrical constraint that

$$L = x_1 + x_2 = y_1 \tan\theta_1 + y_2 \tan\theta_2$$

Applying the condition (B.2), we have

$$\frac{\partial T}{\partial \theta_1} + \lambda \frac{\partial L}{\partial \theta_1} = \frac{y_1}{v_1} \sec\theta_1 \tan\theta_1 + \lambda\,y_1 \sec^2\theta_1 = 0$$

$$\frac{\partial T}{\partial \theta_2} + \lambda \frac{\partial L}{\partial \theta_2} = \frac{y_2}{v_2} \sec\theta_2 \tan\theta_2 + \lambda\,y_2 \sec^2\theta_2 = 0$$

These give as the only solution

$$\sin\theta_1 = -\lambda v_1, \qquad \sin\theta_2 = -\lambda v_2$$

or

$$\frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2}$$

where the angles are measured with respect to the normal of the boundary between the two media.
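Snell's law can also be recovered by minimizing the transit time directly. In the sketch below, $y_1$ and $y_2$ are read as the perpendicular distances of the two points from the interface and $L$ as their horizontal separation, as in the constraint above; the crossing point $x = y_1\tan\theta_1$ is found with a ternary search, which is a swapped-in numerical technique and not the multiplier argument of the example. All numerical values and function names are illustrative.

```python
import math

def transit_time(x, y1, y2, L, v1, v2):
    """Travel time when the ray crosses the interface at horizontal offset x."""
    return math.hypot(x, y1) / v1 + math.hypot(L - x, y2) / v2

def fastest_crossing(y1, y2, L, v1, v2, iterations=200):
    """Ternary search for the minimizer of the (convex) transit time on [0, L]."""
    lo, hi = 0.0, L
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if transit_time(m1, y1, y2, L, v1, v2) < transit_time(m2, y1, y2, L, v1, v2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def snell_ratios(y1, y2, L, v1, v2):
    """Return (sin(theta1)/v1, sin(theta2)/v2) at the fastest crossing point."""
    x = fastest_crossing(y1, y2, L, v1, v2)
    sin1 = x / math.hypot(x, y1)
    sin2 = (L - x) / math.hypot(L - x, y2)
    return sin1 / v1, sin2 / v2

if __name__ == "__main__":
    print(snell_ratios(y1=1.0, y2=2.0, L=3.0, v1=1.0, v2=1.5))
```

The two printed ratios agree to within the accuracy of the search, which is the statement $\sin\theta_1/v_1 = \sin\theta_2/v_2$.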


Appendix C
Work-Energy and Energy Conservation Theorems
C.1 Work-Energy Theorem
The kinetic energy of a particle of mass $m$, moving with a speed $v$, is defined as

$$T = \frac{1}{2} m v^2 \tag{C.1}$$

Let a particle move from point 1 to point 2 under the action of a force $\mathbf{F}$. The total work done on the particle by the force, as it moves from 1 to 2, is by definition the line integral

$$W_{12} = \int_{1}^{2} \mathbf{F} \cdot d\mathbf{s} \tag{C.2}$$

where $d\mathbf{s} = \mathbf{v}\,dt$ is the displacement vector along the particle's trajectory. Now, if the particle undergoes an infinitesimal displacement $d\mathbf{s}$ under the action of a force $\mathbf{F}$, the scalar product

$$dW = \mathbf{F} \cdot d\mathbf{s} \tag{C.3}$$

is the infinitesimal work done by the force $\mathbf{F}$ as the particle undergoes the displacement $d\mathbf{s}$ along its trajectory. We use Newton's second law of motion

$$\mathbf{F} = \frac{d(m\mathbf{v})}{dt}$$

in equation (C.3) to obtain

$$dW = \frac{d(m\mathbf{v})}{dt} \cdot \mathbf{v}\,dt = \frac{d}{dt}\left( \frac{1}{2} m\,\mathbf{v} \cdot \mathbf{v} \right) dt = d\left( \frac{1}{2} m v^2 \right)$$

Since the scalar quantity $\frac{1}{2} m v^2$ is the kinetic energy of the particle, it follows that

$$dW = dT \tag{C.4}$$


Equation (C.4) is the differential form of the work-energy theorem: the differential work of the resultant of the forces acting on a particle is equal, at any time, to the differential change in the kinetic energy of the particle. Integrating equation (C.3) between point 1 and point 2, corresponding to the velocities $v_1$ and $v_2$ of the particle, we get

$$W_{12} = \int_{1}^{2} \mathbf{F} \cdot d\mathbf{s} = \int_{1}^{2} \frac{d(m\mathbf{v})}{dt} \cdot \mathbf{v}\,dt = \int_{v_1}^{v_2} d\left( \frac{1}{2} m v^2 \right) = \frac{1}{2} m v_2^2 - \frac{1}{2} m v_1^2 = T_2 - T_1 \tag{C.5}$$

This is the work-energy theorem, which can be stated as follows: the work done by the force $\mathbf{F}$ acting on a particle as it moves from point 1 to point 2 along its trajectory is equal to the change in the kinetic energy ($T_2 - T_1$) of the particle during the given displacement.

## C.2 Energy Conservation Theorem

If there exists a scalar function $\phi(x, y, z, t)$ such that we can write

$$\mathbf{F} = \nabla\phi \tag{C.6}$$

we shall say that the vector field $\mathbf{F}$ is a potential field. The scalar function $\phi(x, y, z, t)$ is then called the potential function of the field. The vector field $\mathbf{F}$ is called conservative if $\phi$ does not explicitly depend on time. The potential function $\phi(x, y, z)$, in this case, is called the force potential.
It is easy to show that if the force field is conservative, the work done in moving the particle from 1 to 2 is independent of the path connecting 1 and 2. From equation (C.2), the total work done on the particle by the force $\mathbf{F}$ as it moves from 1 to 2 is given by

$$W_{12} = \int_{1}^{2} \mathbf{F} \cdot d\mathbf{s}$$

Since the force field is conservative, from equation (C.6), we can write

$$W_{12} = \int_{1}^{2} \mathbf{F} \cdot d\mathbf{s} = \int_{1}^{2} \nabla\phi \cdot d\mathbf{s} = \int_{1}^{2} \frac{d\phi}{ds}\,ds = \int_{1}^{2} d\phi = \phi_2 - \phi_1 \tag{C.7}$$

The total work done is equal to the difference in force potential, no matter how the particle moves from 1 to 2. We can also write the following differential relation:

$$dW = \mathbf{F} \cdot d\mathbf{s} = d\phi \tag{C.8}$$

If we now write $\phi(x, y, z) = -U(x, y, z)$ (inserting a minus sign for reasons of convention) and express the force as

$$\mathbf{F} = -\nabla U \tag{C.9}$$

then the scalar function $U$ is known as the potential energy of the particle. When $\mathbf{F}$ is expressed as in the above equation, the line integral of equation (C.2) becomes

$$W_{12} = U_1 - U_2 \tag{C.10}$$

The total work done is equal to the difference $U_1 - U_2$, no matter how the particle moves from 1 to 2.


It may be noted that the line integral of the field $\mathbf{F} = -\nabla U$ along a closed curve $C$ (called the circulation) is zero, as shown below:

$$\oint_C \mathbf{F} \cdot d\mathbf{s} = -\oint_C dU = 0$$

Comparing equations (C.5) and (C.10), we deduce that $T_1 + U_1 = T_2 + U_2$. This says that the quantity $T + U$ stays constant as the particle moves from point 1 to point 2. Since 1 and 2 are arbitrary points, we have obtained the statement of conservation of total mechanical energy

$$E = T + U = \text{constant} \tag{C.11}$$

Thus, the energy conservation theorem can be stated as follows: the total energy of a particle in a conservative force field is constant.
It is instructive to note that equation (C.6) does not uniquely determine the function $\phi$. We could as well use $\phi + c$ in place of $\phi$, where $c$ is a constant, since $\nabla(\phi + c) = \nabla\phi$. Hence, the choice for the zero level of $\phi$, and consequently of $U$, is arbitrary.
We can verify directly from equation (C.11) that the total energy is a constant of the motion. We have

$$\frac{dE}{dt} = \frac{dT}{dt} + \frac{dU}{dt}$$

The kinetic energy term can be written as

$$\frac{dT}{dt} = \frac{1}{2} m \frac{dv^2}{dt} = m \frac{d\mathbf{v}}{dt} \cdot \mathbf{v} = \mathbf{F} \cdot \mathbf{v}$$

The potential energy $U$ depends on time only through the changing position of the particle: $U = U(\mathbf{s}(t)) = U(x(t), y(t), z(t))$. We therefore have

$$\frac{dU}{dt} = \frac{\partial U}{\partial x}\frac{dx}{dt} + \frac{\partial U}{\partial y}\frac{dy}{dt} + \frac{\partial U}{\partial z}\frac{dz}{dt} = \nabla U \cdot \mathbf{v} = -\mathbf{F} \cdot \mathbf{v}$$

Thus we have

$$\frac{dE}{dt} = \mathbf{F} \cdot \mathbf{v} - \mathbf{F} \cdot \mathbf{v} = 0$$

This shows that the total energy of a particle moving in a conservative force field is constant during the motion.

## C.2.1 Force-potential energy relation

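Consider a conservative force

$$\mathbf{F} = F_x\,\mathbf{i} + F_y\,\mathbf{j} + F_z\,\mathbf{k}$$

We then have

$$\mathbf{F} = \nabla\phi = -\nabla U$$

Therefore, we obtain the following relations:

$$F_x = \frac{\partial \phi}{\partial x} = -\frac{\partial U}{\partial x}, \qquad F_y = \frac{\partial \phi}{\partial y} = -\frac{\partial U}{\partial y}, \qquad F_z = \frac{\partial \phi}{\partial z} = -\frac{\partial U}{\partial z} \tag{C.12}$$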


This shows that the partial derivative of the force potential in a given direction gives the force in that direction. An example of a force that derives from a potential is the gravitational force

$$\mathbf{F}_g = -\nabla U$$

which leads to the following equations:

$$m g_x = -\frac{\partial U}{\partial x}, \qquad m g_y = -\frac{\partial U}{\partial y}, \qquad m g_z = -\frac{\partial U}{\partial z} \tag{C.13}$$

Thus, the negative of the partial derivative of the potential energy in a given direction gives the gravitational force in that direction.
If the gravitational acceleration vector is given by

$$\mathbf{g} = g\,(0, 0, -1)$$

then we have

$$0 = -\frac{\partial U}{\partial x}, \qquad 0 = -\frac{\partial U}{\partial y}, \qquad -mg = -\frac{\partial U}{\partial z} \tag{C.14}$$

Integrating the last of the above equations, we obtain

$$U = mgz + f(x, y)$$

Setting $f(x, y) = 0$, the potential energy of the particle in a gravitational field is given by

$$U = mgz$$

where $g$ acts in the negative $z$ direction. The total mechanical energy $E$ is conserved when a particle moves under the action of the gravitational field.

## C.2.2 Non-conservative force

An example of a force that does not derive from a potential is the frictional force $\mathbf{F}_{fr} = -k\mathbf{v}$, where $k$ is the coefficient of friction. This force acts in the direction opposite to the particle's motion and is responsible for the drag force. The frictional force cannot be expressed as the gradient of a scalar function. This implies that in the presence of a frictional force, the total mechanical energy $E$ of a particle is not conserved. The reason is that friction causes the mechanical energy $E$ to transform into heat. Energy conservation as a whole, of course, still applies, i.e., the amount by which $E$ decreases matches the amount of heat dissipated into the environment.
It is instructive to note that the work-energy theorem given by equation (C.5) is always true, whether or not the force $\mathbf{F}$ derives from a potential.
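The contrast between the conservative and the non-conservative case can be illustrated with a short simulation. The sketch below is a minimal example under stated assumptions: a particle moves in the $x$-$z$ plane under gravity $\mathbf{g} = g(0, 0, -1)$ plus the frictional force $-k\mathbf{v}$, the equations of motion are integrated with a fourth-order Runge-Kutta scheme, and the mechanical energy $E = \frac{1}{2}mv^2 + mgz$ is recorded at every step. The initial velocity, step size and friction coefficient are arbitrary illustrative values.

```python
def mechanical_energy_history(k, m=1.0, g=9.81, dt=1e-3, steps=1000):
    """Return E = T + U at each step for a particle under gravity and
    the frictional force -k*v (k = 0 gives the conservative case)."""
    def deriv(state):
        x, z, vx, vz = state
        return (vx, vz, -k * vx / m, -g - k * vz / m)

    def energy(state):
        _, z, vx, vz = state
        return 0.5 * m * (vx**2 + vz**2) + m * g * z

    def shifted(state, slope, h):
        return tuple(s + h * d for s, d in zip(state, slope))

    state = (0.0, 0.0, 3.0, 4.0)     # illustrative initial position and velocity
    history = [energy(state)]
    for _ in range(steps):
        # Classical RK4 step
        s1 = deriv(state)
        s2 = deriv(shifted(state, s1, 0.5 * dt))
        s3 = deriv(shifted(state, s2, 0.5 * dt))
        s4 = deriv(shifted(state, s3, dt))
        state = tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, s1, s2, s3, s4))
        history.append(energy(state))
    return history

if __name__ == "__main__":
    free = mechanical_energy_history(k=0.0)
    damped = mechanical_energy_history(k=0.5)
    print(max(free) - min(free))     # essentially zero: E is conserved
    print(damped[0], damped[-1])     # E decreases when friction acts
```

Without friction the recorded energy stays at its initial value up to rounding error, while with friction it decreases at every step, consistent with $dE/dt = -k v^2 \le 0$.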
