Lagrange multipliers

From Encyclopedia of Mathematics
Latest revision as of 17:11, 22 June 2014

2020 Mathematics Subject Classification: Primary: 49-XX

The Lagrange multipliers are variables with the help of which one constructs a Lagrange function for investigating problems on conditional extrema.

Definition If $f, g_1, \ldots, g_m: \mathbb R^n \supset \Omega \to \mathbb R$ are given functions, a conditional extremal point of $f$ under the constraints $g_1, \ldots, g_m$ is an element $x^\star$ with the property that $f (x^\star)$ is the maximum (resp. minimum) value taken by $f$ on the set \begin{equation}\label{e:constrained_set} \Sigma :=\{y\in \Omega : g_i (y) = g_i (x^\star) =: b_i\quad \forall i \in \{ 1, \ldots , m\}\}\, . \end{equation} If instead $f(x^\star)$ is the maximum (resp. minimum) value taken by $f$ on $\Sigma \cap U$ for some neighborhood $U$ of $x^\star$, then $x^\star$ is called a local conditional extremal point.

The method of Lagrange multipliers gives necessary conditions for local conditional extremal points. More precisely, we have the following:

Theorem 1 Assume $\Omega$ is an open set and let $f, g_1, \ldots, g_m: \mathbb R^n \supset \Omega \to \mathbb R$ be $C^1$ functions. If $x^\star$ is a local conditional extremal point under the constraints $g_1, \ldots , g_m$, then the gradients $\nabla f (x^\star), \nabla g_1 (x^\star), \ldots , \nabla g_m (x^\star)$ are linearly dependent.

The conclusion above is in fact usually stated when $m<n$ and $\nabla g_1 (x^\star), \ldots, \nabla g_m (x^\star)$ are linearly independent, i.e. when $b = (b_1, \ldots , b_m)$ is a regular value of the function $g = (g_1, \ldots, g_m)$ (at least if we restrict $g$ to some neighborhood $V$ of $x^\star$). The necessary condition of Theorem 1 can then be translated into the identity \begin{equation}\label{e:lagrange_m} \nabla f (x^\star) = \lambda^\star_1 \nabla g_1 (x^\star) + \ldots + \lambda^\star_m \nabla g_m (x^\star)\, . \end{equation} The Lagrange multipliers are the real numbers $\lambda^\star_1, \ldots , \lambda^\star_m$ appearing in \eqref{e:lagrange_m}. Observe that, if we define the Lagrange function \begin{equation}\label{e:lagrange_f} F(x,\lambda) = f(x) + \sum_{i=1}^m\lambda_i (b_i - g_i(x))\, , \end{equation} then the conditions $x^\star\in \Sigma = \{g=b\}$ and \eqref{e:lagrange_m} are equivalent to the fact that $(x^\star, \lambda^\star)$ is a critical point of $F$. We can therefore summarize the discussion above in the following statement, which in fact can be easily seen to be equivalent to Theorem 1:
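The identity $\nabla f (x^\star) = \lambda^\star \nabla g (x^\star)$ can be checked numerically on a toy problem. The sketch below (plain Python with central finite differences; the example and all names are illustrative, not taken from the article) maximizes $f(x,y) = xy$ under the single constraint $x + y = 4$, whose conditional maximum is $x^\star = (2,2)$ with multiplier $\lambda^\star = 2$.

```python
# Numerical check of grad f(x*) = lambda* grad g(x*) for the toy problem
#   maximize f(x, y) = x*y  subject to  g(x, y) = x + y = 4,
# whose conditional maximum is x* = (2, 2) with multiplier lambda* = 2.

def grad(h, p, eps=1e-6):
    """Central-difference gradient of h at the point p."""
    g = []
    for i in range(len(p)):
        up = list(p); up[i] += eps
        dn = list(p); dn[i] -= eps
        g.append((h(up) - h(dn)) / (2 * eps))
    return g

f = lambda p: p[0] * p[1]          # objective
g = lambda p: p[0] + p[1]          # constraint function

x_star = [2.0, 2.0]                # conditional maximum for b = 4
lam_star = 2.0                     # Lagrange multiplier

gf = grad(f, x_star)               # ≈ (2, 2)
gg = grad(g, x_star)               # ≈ (1, 1)
residual = max(abs(gf[i] - lam_star * gg[i]) for i in range(2))
print(residual)                    # ≈ 0
```

The residual vanishes up to finite-difference error, confirming that the two gradients are parallel at the constrained optimum.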

Theorem 2 Assume that $m<n$, that $b$ is a regular value for $g$ and that $x^\star$ is a local conditional extremal point for $f$ under the constraint $g$ with $g (x^\star) = b$. Then there is $\lambda^\star\in \mathbb R^m$ such that $(x^\star, \lambda^\star)$ is a critical point of $F$.

Observe that under the hypothesis of the latter theorem, the set $\Sigma$ is a $C^1$ submanifold of dimension $n-m$. Indeed Theorem 2 is usually proved via the Implicit function theorem, reducing its statement to the usual necessary condition for unconstrained extrema of a differentiable function. Observe also that the coordinates of the point $x^\star = (x_1^\star,\dots,x_n^\star)$ together with the Lagrange multipliers $\lambda^\star = (\lambda_1^\star,\dots,\lambda_m^\star)$ give us $m+n$ real numbers which satisfy a system of $m+n$ equations: $m$ equations are indeed given by the constraint $g (x^\star) = b$ and $n$ by the identity \eqref{e:lagrange_m}.
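In practice this system of $m+n$ equations in the unknowns $(x, \lambda)$ is solved numerically. A minimal sketch (plain Python with a forward-difference Jacobian and Newton iteration; the example and all names are hypothetical, not from the article) solves the stationarity system for extremizing $f(x,y) = x + y$ on the circle $x^2 + y^2 = 1$, where the known maximum is $x = y = \lambda = 1/\sqrt 2$.

```python
# Solve the (m+n)-equation stationarity system for the toy problem
#   extremize f(x, y) = x + y  subject to  g(x, y) = x^2 + y^2 = 1.
# Unknowns: (x, y, lambda). Known maximizer: x = y = lambda = 1/sqrt(2).

def system(v):
    x, y, lam = v
    return [1 - 2 * lam * x,      # df/dx - lam * dg/dx = 0
            1 - 2 * lam * y,      # df/dy - lam * dg/dy = 0
            x * x + y * y - 1]    # the constraint g(x, y) = b

def newton(F, v, steps=25, eps=1e-7):
    n = len(v)
    for _ in range(steps):
        Fv = F(v)
        # forward-difference Jacobian of F at v
        J = [[0.0] * n for _ in range(n)]
        for j in range(n):
            w = list(v); w[j] += eps
            Fw = F(w)
            for i in range(n):
                J[i][j] = (Fw[i] - Fv[i]) / eps
        # solve J * d = -Fv by Gauss-Jordan elimination with partial pivoting
        A = [J[i][:] + [-Fv[i]] for i in range(n)]
        for c in range(n):
            p = max(range(c, n), key=lambda r: abs(A[r][c]))
            A[c], A[p] = A[p], A[c]
            for r in range(n):
                if r != c and A[c][c] != 0:
                    mfac = A[r][c] / A[c][c]
                    A[r] = [a - mfac * b for a, b in zip(A[r], A[c])]
        d = [A[i][n] / A[i][i] for i in range(n)]
        v = [vi + di for vi, di in zip(v, d)]
    return v

x, y, lam = newton(system, [1.0, 0.5, 1.0])
print(x, y, lam)   # each ≈ 1/sqrt(2) ≈ 0.70711
```

Starting the iteration near the other sign choice instead converges to the minimizer $x = y = \lambda = -1/\sqrt 2$, in line with the fact that the system encodes a necessary condition for both maxima and minima.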

The Lagrange multipliers $\lambda^\star_i$, $i=1,\dots,m$, have the following interpretation [Ha]. Suppose that $x^\star$ provides a relative extremum of the function $f$ under the constraints $g$ and set $z^\star = f(x^\star)$. The values of $x^\star$, $\lambda^\star$ and $z^\star$ depend on the value of $b$. Under suitable assumptions this dependence is $C^1$ in some $\varepsilon$-neighbourhood of $g (x^\star)$. In this case the partial derivatives of $z^\star$ with respect to the $b_i$ are equal to the corresponding Lagrange multipliers $\lambda_i^\star$: \begin{equation}\label{e:costs} \frac{\partial z^\star}{\partial b_i} = \lambda_i^\star,\quad i=1,\dots,m\, . \end{equation} In applied problems $z^\star$ is often interpreted as profit or cost, and the right-hand sides $b_i$ as amounts of certain resources. Then the absolute value of $\lambda_i^\star$ is the marginal cost per unit of the $i$-th resource: the numbers $\lambda_i^\star$ show how the optimal profit (or cost) changes if the amount of the $i$-th resource is increased by one unit. This interpretation of the Lagrange multipliers is very useful because it can be extended to the case of constraints in the form of inequalities.
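The sensitivity identity \eqref{e:costs} is easy to verify on the same kind of toy problem (a hypothetical example, not from the article): maximizing $f(x,y) = xy$ under $x + y = b$ gives $x = y = b/2$, hence $z^\star(b) = b^2/4$ and $\lambda^\star = b/2$, and a finite difference of $z^\star$ in $b$ recovers the multiplier.

```python
# Numerical check of the sensitivity identity dz*/db = lambda* for the toy
# problem  maximize f(x, y) = x*y  subject to  x + y = b.
# The conditional maximum is x = y = b/2, so z*(b) = b**2 / 4 and the
# multiplier is lambda*(b) = b/2.

def z_star(b):
    """Optimal value of f on the constraint set {x + y = b}."""
    return (b / 2.0) ** 2

b = 4.0
lam_star = b / 2.0                                  # multiplier at the optimum
h = 1e-6
dz_db = (z_star(b + h) - z_star(b - h)) / (2 * h)   # central difference
print(dz_db, lam_star)                              # both ≈ 2.0
```

This is the shadow-price reading of the multipliers: relaxing the constraint by one unit of the resource changes the optimal value by approximately $\lambda_i^\star$.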

In the calculus of variations suitable versions of the method of Lagrange multipliers have been developed in several infinite-dimensional settings, namely when the sought conditional extremal points are functions and both the cost to be minimized and the constraints are suitable functionals. In this case the vector of Lagrange multipliers might itself be infinite dimensional.

In the theory of optimal control and in the Pontryagin maximum principle, the Lagrange multipliers are usually called conjugate variables.

Comments

The same arguments as used above lead to the interpretation of the Lagrange multipliers $\lambda_i^\star$ as sensitivity coefficients (with respect to changes in the $b_i$).


References

[Bl] G.A. Bliss, "Lectures on the calculus of variations", Chicago Univ. Press (1947) MR0017881 Zbl 0036.34401
[Br] A.E. Bryson, Y.-C. Ho, "Applied optimal control", Blaisdell (1969) MR0446628
[Ha] G.F. Hadley, "Nonlinear and dynamic programming", Addison-Wesley (1964) MR0173543 Zbl 0179.24601
[Ro] R.T. Rockafellar, "Convex analysis", Princeton Univ. Press (1970) MR0274683 Zbl 0193.18401
How to Cite This Entry:
Lagrange multipliers. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Lagrange_multipliers&oldid=16141
This article was adapted from an original article by I.B. Vapnyarskii (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.