Univariate Functions and Optimization∗
Lester M.K. Kwong
Created: August 12, 2004
This Version: October 23, 2008

∗ Copyright © 2004 Lester M.K. Kwong. Department of Economics, Brock University, 500 Glenridge Ave., St. Catharines, Ontario, L2S 3A1, Canada. Email: [email protected]. Tel: +1 (905) 688-5550, Ext. 5137.

1 Limits and Derivatives

Given a function, we are often concerned with its value as the variable approaches arbitrarily close to a certain point. We use the notion of a limit to address this. For example, when we write

$$\lim_{x \to z} f(x) \tag{1}$$

we denote the value that $f(x)$ approaches as $x$ approaches $z$. The same limit can be written in alternative ways; for example, $\lim_{x \to z} f(x) = \lim_{\epsilon \to 0} f(z + \epsilon)$. When the limit of $f(x)$ as $x$ approaches some value $z$ is, say, $y$, we write $\lim_{x \to z} f(x) = y$. This is to say that the value of $f(x)$ gets closer and closer to $y$ as $x$ gets closer and closer to $z$. Note, however, that in discussing limits we are never concerned with the case $x = z$; we are always talking about the value of $f(x)$ as $x$ approaches $z$. In this sense, even if $f(x)$ is not defined at $x = z$, the limit as $x$ approaches $z$ may still exist.

We are often concerned only with the limit of a function as its argument approaches a point in the domain from a particular side. This may occur if points on the opposing side do not belong to the domain of the function. Therefore, we write $\lim_{x \to z^+} f(x)$ to denote the
limit of $f(x)$ as $x$ approaches $z$ from the right, that is, for $x > z$, and write $\lim_{x \to z^-} f(x)$ to denote the limit of $f(x)$ as $x$ approaches $z$ from the left, that is, for $x < z$. It then follows that $\lim_{x \to z} f(x) = y$ if and only if $\lim_{x \to z^+} f(x) = \lim_{x \to z^-} f(x) = y$.

There are many laws that one may rely on when computing limits. We state four here in no particular order. Let $f(x)$ and $h(x)$ be two functions whose limits exist at $z$. Then:
1. $\lim_{x \to z} (f(x) \pm h(x)) = \lim_{x \to z} f(x) \pm \lim_{x \to z} h(x)$
2. $\lim_{x \to z} (c f(x)) = c \lim_{x \to z} f(x)$, where $c \in \mathbb{R}$ is a constant
3. $\lim_{x \to z} (f(x) h(x)) = \lim_{x \to z} f(x) \cdot \lim_{x \to z} h(x)$
4. $\lim_{x \to z} \frac{f(x)}{h(x)} = \frac{\lim_{x \to z} f(x)}{\lim_{x \to z} h(x)}$ if $\lim_{x \to z} h(x) \neq 0$
In addition, there are two special limits that one should be aware of. Namely, if $c \in \mathbb{R}$ is a constant, then $\lim_{x \to z} c = c$ and $\lim_{x \to z} x = z$. It then follows from the third law that $\lim_{x \to z} x^n = z^n$ where $n$ is a positive integer.

Example 1 Let $f(x) = \frac{x^3 - 2x^2 + 5}{x^4 + 5x^3}$. Determine $\lim_{x \to 1} f(x)$. To solve for this limit we apply the above laws and special limits. Hence:

$$\begin{aligned}
\lim_{x \to 1} f(x) &= \lim_{x \to 1} \frac{x^3 - 2x^2 + 5}{x^4 + 5x^3} = \frac{\lim_{x \to 1} (x^3 - 2x^2 + 5)}{\lim_{x \to 1} (x^4 + 5x^3)} \\
&= \frac{\lim_{x \to 1} x^3 - 2 \lim_{x \to 1} x^2 + \lim_{x \to 1} 5}{\lim_{x \to 1} x^4 + 5 \lim_{x \to 1} x^3} \\
&= \frac{1 - 2 + 5}{1 + 5} = \frac{2}{3}
\end{aligned}$$
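The result of Example 1 can also be checked numerically. The following is a minimal Python sketch, not part of the original notes: it evaluates $f$ at points approaching $1$ from both sides, using exact rational arithmetic, and measures how far the values are from $2/3$. The names `f`, `steps`, `gaps_right`, and `gaps_left` are illustrative choices.

```python
from fractions import Fraction

def f(x):
    """The function from Example 1: (x^3 - 2x^2 + 5) / (x^4 + 5x^3)."""
    return (x**3 - 2 * x**2 + 5) / (x**4 + 5 * x**3)

# Approach z = 1 from both sides with shrinking step sizes h.
z = Fraction(1)
steps = [Fraction(1, 10**k) for k in range(1, 6)]
from_right = [f(z + h) for h in steps]
from_left = [f(z - h) for h in steps]

# Distance of each function value from the limit computed in Example 1.
limit = Fraction(2, 3)
gaps_right = [abs(v - limit) for v in from_right]
gaps_left = [abs(v - limit) for v in from_left]

for h, r, l in zip(steps, from_right, from_left):
    print(f"h = {float(h):.0e}: f(1+h) = {float(r):.8f}, f(1-h) = {float(l):.8f}")
```

Both lists of gaps shrink as the step size shrinks, which is what the two one-sided limits agreeing at $2/3$ would lead one to expect.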
Equipped with the knowledge of limits, one may now approach the question regarding the derivative of a function. Heuristically, the derivative provides us with the rate of change of the function value with respect to a variable at a specific point. In other words, it is the slope of the function at a point.
Consider a point $x$. The slope of a function at $x$ may be approximated by:

$$\frac{f(x + \epsilon) - f(x)}{x + \epsilon - x} \tag{2}$$

for some $\epsilon \in \mathbb{R}$ with $\epsilon \neq 0$, so long as $x + \epsilon$ is in the domain of $f$. It turns out that the slope, or the derivative, is the limit of the above approximation as $\epsilon \to 0$. More specifically, we write and define:

$$\frac{df(x)}{dx} = f'(x) = \lim_{\epsilon \to 0} \frac{f(x + \epsilon) - f(x)}{x + \epsilon - x} \tag{3}$$

From our discussion of limits, it is clear that the derivative at a point $x$ exists so long as this limit exists. This further implies that $f'(x)$ exists at $x$ if and only if both one-sided limits exist and:

$$\lim_{\epsilon \to 0^+} \frac{f(x + \epsilon) - f(x)}{x + \epsilon - x} = \lim_{\epsilon \to 0^-} \frac{f(x + \epsilon) - f(x)}{x + \epsilon - x} \tag{4}$$
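The approximation (2) and the two-sided condition (4) can be illustrated with a short numerical experiment. The Python sketch below is only an illustration under stated assumptions: the helper `difference_quotient` and the example functions `square` and `kinked` are hypothetical names introduced here, not part of the notes.

```python
def difference_quotient(f, x, eps):
    """Slope approximation (f(x + eps) - f(x)) / (x + eps - x), as in equation (2)."""
    return (f(x + eps) - f(x)) / eps

def square(x):
    return x * x

def kinked(x):
    """Hypothetical example with a kink at 0: slope 1 to the left, slope 2 to the right."""
    return max(x, 2 * x)

# For f(x) = x^2 the quotient equals 2x + eps, so it tends to 2x = 6 at x = 3.
smooth_quotients = [difference_quotient(square, 3.0, eps) for eps in (1e-1, 1e-3, 1e-5)]

# At the kink the one-sided quotients disagree, so condition (4) fails
# and the derivative does not exist at 0.
right = difference_quotient(kinked, 0.0, 1e-6)    # eps approaches 0 from above
left = difference_quotient(kinked, 0.0, -1e-6)    # eps approaches 0 from below

print(smooth_quotients, right, left)
```

A finite step size only approximates the limit, so the smooth case gets close to 6 without reaching it exactly.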
Some general rules for derivatives are provided below:
1. If $f(x) = ax^n$ then $f'(x) = anx^{n-1}$,
2. If $f(x) = \ln x$ then $f'(x) = 1/x$.

However, we often deal with more complex functions than those stated above, and we provide some more general rules below. Suppose $f(x)$ and $g(x)$ are functions whose derivatives exist. Then:
1. $(f(x) \pm g(x))' = f'(x) \pm g'(x)$,
2. $(f(x) g(x))' = f'(x) g(x) + f(x) g'(x)$,
3. $(f(x)/g(x))' = (f'(x) g(x) - f(x) g'(x))/(g(x))^2$, provided $g(x) \neq 0$,
4. $(f(g(x)))' = f'(g(x)) g'(x)$.
...........................................................................
1.1 Review Exercises
...........................................................................
Exercise 1 Evaluate the following limits as $x \to 1$.
(i) $f(x) = x^2 - 1$,
(ii) $f(x) = \frac{x^8 + 2 - 5x^3}{x^2 - 2}$,
(iii) $f(x) = 3x^3 - 3x^2$.

Exercise 2 Prove that the following limit does not exist:

$$\lim_{x \to 0} \frac{|x|}{x} \tag{5}$$
Exercise 3 Prove that if $f(x) = ax^n$ then $f'(x) = anx^{n-1}$ where $a$ is a constant and $n$ a positive integer.

Exercise 4 Determine the derivatives for the following functions:
(i) $f(x) = x^2 + x^6 - 3$
(ii) $f(x) = \frac{2x^3 - 7x^4}{x^4 + 1}$
(iii) $f(x) = (x^2 + 8x^3)^2$
(iv) $f(x) = \ln(x^2 - 5x + 2)$
(v) $f(x) = x^2 - 4 \ln x$
(vi) $f(x) = \ln(\ln(\ln x))$
(vii) $f(x) = x^{2x}$
(viii) $f(x) = e^{x^2}$
...........................................................................
2 Properties of Functions on $\mathbb{R}$
In this section, we provide some very useful theorems and definitions. Of particular interest is the notion of continuity of functions.

Definition 1 A function $f : \mathbb{R} \to \mathbb{R}$ is continuous at a point $x_0$ if $\forall \epsilon > 0$, $\exists \delta > 0$ such that $d(x, x_0) < \delta$ implies that $d(f(x), f(x_0)) < \epsilon$.

Note that $d(x, y) = |x - y|$ is the distance between $x$ and $y$. Naturally, a function is said to be continuous if it is continuous at every $x$ in its domain. The class of such functions is referred to as the $C^0$ space of functions. Likewise, the space $C^k$ refers to the class of functions which are $k$-times continuously differentiable.¹

¹ Sometimes, one may refer to a function as being smooth to imply the existence of sufficiently many continuous derivatives. The exact measure of smoothness is somewhat arbitrary and is simply a reflection of the requirements of the problem at hand. Hence, the degree of smoothness may range from $C^0$ to $C^\infty$ functions.

Aside from Definition 1, an alternative way to define continuity of a function at some point $x_0$ is to require that $\lim_{x \to x_0} f(x) = f(x_0)$. The following is then immediate.
Proposition 1 If a function $f : \mathbb{R} \to \mathbb{R}$ is differentiable at $x_0$ then it is continuous at $x_0$.

If $f$ and $g$ are continuous at $x_0$ then, from the definition of continuity as well as the properties of limits, the following holds:
1. $f \pm g$ is continuous at $x_0$.
2. $f \cdot g$ is continuous at $x_0$.
3. $f / g$ is continuous at $x_0$ provided that $g(x_0) \neq 0$.

Note that we have only defined a general notion of continuity. There are other forms of continuity, such as pointwise continuity and piecewise continuity, which are beyond the scope of this course.² However, we should note the existence of such alternative and generally weaker definitions.

² Interested readers should refer to any book on real analysis for a detailed treatment of continuity.

Aside from continuity, we may sometimes be concerned with the behavior of a function with respect to its argument. We begin by defining monotonicity.

Definition 2 A function $f$ is said to be strictly (weakly) monotone increasing if for all $x, z \in \mathbb{R}$, $x > z$ implies $f(x) > (\geq) f(z)$. Similarly, a function $f$ is said to be strictly (weakly) monotone decreasing if for all $x, z \in \mathbb{R}$, $x > z$ implies $f(x) < (\leq) f(z)$.

It follows that strictly monotone functions are "1 − 1" relations.³ Hence, a general observation is that strictly monotonic functions are invertible.⁴

³ For functions $f \in C^1$, $f' > (\geq) 0$ everywhere implies that $f$ is strictly (weakly) monotone increasing, and $f' < (\leq) 0$ everywhere implies that $f$ is strictly (weakly) monotone decreasing. The converse holds only in the weak form: $f(x) = x^3$ is strictly increasing although $f'(0) = 0$.

⁴ This, of course, hinges on the functions being "onto" relations. But this may be rectified at no cost simply by restricting the codomain to the range in defining the relevant function.

The inverse of a function, as discussed before, is denoted $f^{-1}(x)$ with the property that $f^{-1}(f(x)) = f(f^{-1}(x)) = x$. Hence, in a two-dimensional geometric interpretation, the graph of the inverse function is simply the reflection of the graph of $f$ across the 45° line. It also follows by the rules of differentiation that:

$$\frac{d f^{-1}(x)}{dx} = \frac{1}{f'(f^{-1}(x))}$$
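The inverse-derivative formula can be checked numerically. The sketch below is a hedged illustration rather than anything taken from the notes: it assumes the example $f(x) = x^3 + x$, which is strictly increasing on $\mathbb{R}$ and therefore invertible, and inverts it by bisection. The helper names `f_inverse` and `f_prime`, the bracket $[-100, 100]$, and the tolerances are all assumptions made for the example; the argument of the inverse is called `y` in the code.

```python
def f(x):
    """A strictly increasing function on R, hence invertible."""
    return x**3 + x

def f_prime(x):
    return 3 * x**2 + 1

def f_inverse(y, lo=-100.0, hi=100.0, tol=1e-12):
    """Invert f by bisection; relies on f being strictly increasing.

    The bracket [lo, hi] is an assumption and must contain the solution.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

y = 10.0                 # f(2) = 10, so the inverse at 10 should be close to 2
x = f_inverse(y)

# Left-hand side: numerical derivative of the inverse (central difference).
h = 1e-5
numeric = (f_inverse(y + h) - f_inverse(y - h)) / (2 * h)

# Right-hand side: 1 / f'(f^{-1}(y)) from the formula.
formula = 1 / f_prime(x)

print(x, numeric, formula)
```

The two values agree up to the error of the finite-difference approximation, and both are close to $1/f'(2) = 1/13$.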
Note that in the above expression for the derivative of the inverse of $f$, the existence of $f^{-1}$ for a continuous $f : D \to \mathbb{R}$ on an interval $D$ implies that $f$ is strictly monotonic, and hence $f' \geq 0$ for all $x \in D$ or $f' \leq 0$ for all $x \in D$. Hence, wherever $f'(f^{-1}(x)) \neq 0$, the existence of the derivative of $f^{-1}$ depends solely on the existence of the derivative of $f$.

In addressing the curvature of a function $f$, we define concave and convex functions below.

Definition 3 A function $f : D \to \mathbb{R}$ is said to be concave over the domain $D$ if for all $x, z \in D$, $f(\alpha x + (1 - \alpha) z) \geq \alpha f(x) + (1 - \alpha) f(z)$ for all $\alpha \in (0, 1)$.

Definition 4 A function $f : D \to \mathbb{R}$ is said to be convex over the domain $D$ if for all $x, z \in D$, $f(\alpha x + (1 - \alpha) z) \leq \alpha f(x) + (1 - \alpha) f(z)$ for all $\alpha \in (0, 1)$.

The geometric interpretation of a concave function may be obtained as follows. For a function defined over $D \subset \mathbb{R}$, take any two points in $D$ and connect the corresponding points on the graph of the function with a line segment; this segment should lie below (or on) the function. Analogously, for convex functions, this segment should lie above (or on) the function over this region.

As it turns out, the curvature of a function $f \in C^2$ is quite easily testable: if $f : D \to \mathbb{R}$ with $D \subset \mathbb{R}$ an interval, then $f$ is concave over $D$ if $f''(x) \leq 0$ for all $x \in D$. Analogously, $f$ is convex over $D$ if $f''(x) \geq 0$ for all $x \in D$. Hence, the second derivative test for curvature is quite useful since we shall be studying predominantly twice continuously differentiable functions.

The next very useful theorem is known as the mean value theorem:

Theorem 1 Suppose $f(x)$ is a continuously differentiable function over an interval $D \subset \mathbb{R}$. Then for all $x, z \in D$ with $x < z$, there exists some $s \in (x, z)$ such that:

$$f(z) - f(x) = (z - x) f'(s)$$

The geometric interpretation of the mean value theorem is quite simple. For a function $f \in C^1$, pick any two distinct points in the domain, say $x, z \in D$. This results in a pair of points $(x, f(x))$ and $(z, f(z))$ in $\mathbb{R}^2$. If we connect these two points in $\mathbb{R}^2$ with a line, the slope of this line is given by $(f(z) - f(x))/(z - x)$. The mean value theorem simply states that there must exist some point $s \in (x, z)$ at which $f'(s)$ equals this slope. This is illustrated in Figure 1.
Figure 1: Geometry of the Mean Value Theorem in $\mathbb{R}$ (plot of $f$ with the points $x$, $s$, and $z$ marked on the horizontal axis).
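The geometry described above can be made concrete with a small numerical example. The Python sketch below is an illustration under stated assumptions, not part of the notes: it takes the hypothetical example $f(x) = x^3$ on $[0, 2]$, computes the slope of the chord, and searches for a point $s$ where $f'(s)$ equals that slope. Bisection is used only because $f'$ happens to be increasing on this interval; the theorem itself guarantees existence of $s$ but gives no such search procedure.

```python
def f(x):
    return x**3

def f_prime(x):
    return 3 * x**2

x, z = 0.0, 2.0
chord_slope = (f(z) - f(x)) / (z - x)   # slope of the line through (x, f(x)) and (z, f(z))

# Find s in (x, z) with f'(s) = chord_slope by bisection.
# This works here because f' is increasing on [0, 2]; that is an
# assumption of this sketch, not part of the theorem.
lo, hi = x, z
while hi - lo > 1e-12:
    mid = (lo + hi) / 2
    if f_prime(mid) < chord_slope:
        lo = mid
    else:
        hi = mid
s = (lo + hi) / 2

print(chord_slope, s, f_prime(s))
```

Here the chord has slope 4, and solving $3s^2 = 4$ by hand gives $s = 2/\sqrt{3}$, which lies strictly between 0 and 2.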
The proof of the mean value theorem is quite straightforward. Provided that $f \in C^1$ over $D \subset \mathbb{R}$, pick any two points $x, z \in D$ with $z > x$ and consider the function:

$$g(k) = f(k) - \frac{f(z) - f(x)}{z - x}\, k$$

Clearly, $g(z) = g(x)$, and by the continuity of $f$ a maximum and a minimum of $g(k)$ must exist on $[x, z]$. Since $g(z) = g(x)$, at least one of these is attained at some point in the interior $(x, z)$ (if $g$ is constant on $[x, z]$, any interior point will do). Since the slope of the function vanishes at an interior maximum or minimum, denote such a point by $s$ so that:

$$g'(s) = f'(s) - \frac{f(z) - f(x)}{z - x} = 0$$

Rearranging this yields exactly the mean value theorem.

Definition 5 A function $f$ is said to be homogeneous of degree $k$ if for all $x$ in the domain, $f(\lambda x) = \lambda^k f(x)$ for all $\lambda > 0$.

The next theorem is known as Euler's theorem and deals with homogeneous functions.

Theorem 2 Suppose $f$ is a homogeneous function of degree $k$. Then:

$$f'(x)\,x = k f(x) \tag{6}$$
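As a quick check of equation (6), consider $f(x) = 3x^2$. This function is chosen here purely for illustration and does not appear in the notes. It is homogeneous of degree $k = 2$, and Euler's theorem can be verified directly:

```latex
\begin{align*}
f(\lambda x) &= 3(\lambda x)^2 = \lambda^2 \cdot 3x^2 = \lambda^2 f(x), \\
f'(x)\,x &= 6x \cdot x = 6x^2 = 2 \cdot 3x^2 = 2 f(x).
\end{align*}
```

The first line confirms the degree of homogeneity; the second confirms that $f'(x)\,x = k f(x)$ with $k = 2$.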
The proof of Euler's theorem is straightforward: differentiating both sides of the definition $f(\lambda x) = \lambda^k f(x)$ with respect to $\lambda$ gives

$$f'(\lambda x)\,x = k \lambda^{k-1} f(x)$$

Setting $\lambda = 1$ completes the proof.

Theorem 3 Suppose $f$ is a homogeneous function of degree $k$. Then $f'$ is a homogeneous function of degree $k - 1$.

The proof of this result is also straightforward: differentiating the definition of a homogeneous function with respect to $x$ gives $\lambda f'(\lambda x) = \lambda^k f'(x)$, that is, $f'(\lambda x) = \lambda^{k-1} f'(x)$.
...........................................................................
2.1 Review Exercises
...........................................................................
Exercise 5 Determine whether the following functions are monotone increasing:
(i) $f(x) = (x - 2)^2 + 4$
(ii) $f(x) = x^3$
(iii) $f(x) = x^2 - x$

Exercise 6 Consider two functions $f, g \in C^2$ such that both are globally concave. Is the composite function $f(g(x))$ also globally concave? You may assume that $f$ and $g$ are defined on $\mathbb{R}$. Provide necessary and sufficient conditions such that $f(g(x))$ is globally concave.

Exercise 7 Are the following functions globally concave?
1. $f(x) = 4x^2 - x^3$
2. $f(x) = \ln(x^2 - 2)$
3. $f(x) = e^x$

Exercise 8 Suppose $f(x)$ is a homogeneous function of degree $k$. Prove that $f$ must be of the form $f(x) = ax^k$ where $a$ is a constant.
...........................................................................
3 Optimization
In dealing with optimization problems we may sometimes be interested in the existence of optima. Hence, we introduce the Weierstrass theorem, which deals with the existence of a maximum and a minimum over a compact subset of $\mathbb{R}$. A maximum is here defined to be some $x \in \mathbb{R}$ such that $f(x) \geq f(z)$ for all $z \in \mathbb{R}$, or for all $z$ in the relevant domain. That is, no other value in the relevant domain of the function attains a higher function value than the function value at $x$. Similarly, a minimum is defined to be some $x \in \mathbb{R}$ such that $f(x) \leq f(z)$ for all $z \in \mathbb{R}$, or for all $z$ in the relevant domain.

Theorem 4 Let $f$ be a continuous real-valued function on $X$ and $S$ be a compact subset of $X$. Then $f : S \to \mathbb{R}$ achieves a maximum and a minimum in $S$.

It is straightforward to see why the Weierstrass theorem fails when the domain is not compact. For example, consider $f : (0, 1) \to \mathbb{R}$ with $f(x) = x$. Since $S = (0, 1)$ is open in $\mathbb{R}$, we know that $\forall x \in (0, 1)$, $\exists \epsilon > 0$ such that $B_\epsilon(x) \subset (0, 1)$. More specifically, for each $x \in (0, 1)$ there exists $\epsilon > 0$ so that $x \pm \epsilon \in (0, 1)$. Since $f(x + \epsilon) > f(x) \Leftrightarrow \epsilon > 0$, no maximum exists.⁵ Hence, the failure of the domain to be closed potentially introduces serious problems for the existence of a maximum (and also a minimum). Boundedness is also required: it is easy to conceive of a function whose domain $[0, \infty)$ is closed but which is strictly monotonically increasing over $\mathbb{R}_+$, so that no maximum exists.

⁵ For instances when the maximum is defined in the closure of the domain, we will refer to it as the supremum, denoted $\sup f(x)$. Naturally, if the domain is compact and $f$ is continuous then $\sup f(x) = \max f(x)$. For the minimum, an infimum is analogously defined.

Now, let us suppose we are faced with a function $f : D \to \mathbb{R}$, where $D \subset \mathbb{R}$. Then a maximization problem is denoted:

$$\max_{x} f(x)$$

where the above is read: choose $x$ such that $x$ maximizes the function $f(x)$.⁶ Similarly, a minimization problem is written:

$$\min_{x} f(x)$$

⁶ Obviously we are restricting our attention to $x \in D$.
The solution to this maximization problem, denoted by the set $S(f)$, is expressed as:

$$S(f) = \{x \in D : f(x) \geq f(z) \ \forall z \in D\}$$

or equivalently expressed as:

$$S(f) \equiv \arg\max_{z} f(z)$$
Then, for any given twice continuously differentiable function $f(x)$, the potential candidates for an interior solution to an optimization problem are given by the set of $x$ such that the first-order condition:

$$f'(x) = 0$$

is satisfied.⁷

⁷ Note that the restriction to twice continuously differentiable functions is for convenience. Candidacy of a point $x$ as a potential interior solution to a maximization problem simply requires the satisfaction of the first-order condition. Hence, $f \in C^1$ suffices.

The notion of an interior solution may be literally interpreted as an element that is not on the boundary of the domain. More formally, consider a function $f : D \to \mathbb{R}$ where $D \subset \mathbb{R}$. Define $\hat{D}$ to be the largest open subset of $D$. Then $x^* \in \arg\max_z f(z)$ is an interior solution if $x^* \in \hat{D}$. In the event that $x^* \notin \hat{D}$ (presuming that $x^* \in S(f)$ exists), $x^*$ must be on the boundary of $D$ or the closure of $\hat{D}$. For such instances, we will refer to $x^*$ as a corner solution. This is not to say, however, that the first-order condition is necessarily violated when $x^*$ is a corner solution. It is possible that the slope vanishes at the corner for a given function.

Example 2 Suppose we have $f : [-1, 0] \to \mathbb{R}$ with $f(x) = -x^2$. Now consider the problem $\max_x f(x)$. Then the set of potential solutions to the maximization problem consists of the points $\hat{x} \in [-1, 0]$ such that $f'(\hat{x}) = -2\hat{x} = 0$. Hence, we find that $\hat{x} = 0$ is a potential solution. Notice that this is in fact the global maximizer of $f(x)$ but that $\hat{x} \notin \hat{D}$ (although $\hat{x} \in D$). Hence, we have a corner solution which satisfies the necessary first-order condition for an interior maximum.

Denote by $X$ the set of values such that for all $x \in X$, $f'(x) = 0$. Then if $f''(x) < 0$, such an $x$ is deemed a local maximum, whereas if $f''(x) > 0$,
such an $x$ is a local minimum. If $f'' < 0$ for all $x$ in the domain of $f$, then the function is said to be globally concave and hence, if $X$ is nonempty, it must be a singleton representing the unique solution. In that case the first-order condition is also sufficient, and hence $x$ will be the solution to the maximization problem. If, however, there does not exist an $x$ such that $f'(x) = 0$, then potentially no solution to the problem exists.⁸ To illustrate this point, suppose we are faced with the function $f : \mathbb{R}_+ \to \mathbb{R}_+$ given by $f(x) = 2\sqrt{x}$. Then the first-order condition $f'(x) = x^{-1/2} = 0$ has no solution. Hence, no maximum to this problem exists, since $f'(x) > 0$ for all $x > 0$, which implies that the function is strictly increasing in $x$: for all $x \in \mathbb{R}_+$ there must exist some $z > x$ such that $f(z) > f(x)$. Alternatively, for the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = -x^2$, we see that $f'(x) = -2x$, which is equal to zero if and only if $x = 0$. Furthermore, $f''(x) = -2 < 0$ for all $x \in \mathbb{R}$ and hence the maximum of the function is attained only when $x = 0$.

For functions which are not strictly concave (for maximization problems) and not strictly convex (for minimization problems), there may exist multiple points which satisfy the first-order condition and hence are all candidates to be the solution. The determination of the solution may be conducted primitively simply by evaluating the function over the set of potential solutions $X$ along with the potential corners. This will then allow one to determine $S(f)$ quite easily.
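The procedure of evaluating the function over the candidate set $X$ together with the corners can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a general-purpose optimizer: the example $f(x) = x^4 - 2x^2$ on $D = [-2, 1.5]$ is hypothetical, the helper names `bisect_root` and `critical_points` are introduced here, and the zeros of $f'$ are located by scanning for sign changes on a grid, which can miss zeros where $f'$ touches zero without changing sign.

```python
def f(x):
    return x**4 - 2 * x**2

def f_prime(x):
    return 4 * x**3 - 4 * x

def bisect_root(g, lo, hi, tol=1e-12):
    """Root of g in [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    lo_negative = g(lo) < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        value = g(mid)
        if value == 0:
            return mid
        if (value < 0) == lo_negative:
            lo = mid      # same sign as the left end: root is to the right
        else:
            hi = mid      # sign changed: root is to the left
    return (lo + hi) / 2

def critical_points(g, a, b, pieces=1000):
    """Approximate the interior zeros of g on (a, b) by scanning for sign changes."""
    roots = []
    width = (b - a) / pieces
    for i in range(pieces):
        lo, hi = a + i * width, a + (i + 1) * width
        if g(lo) == 0 and a < lo:
            roots.append(lo)                      # grid point that is itself a zero
        elif g(lo) * g(hi) < 0:
            roots.append(bisect_root(g, lo, hi))  # zero strictly inside the piece
    return roots

a, b = -2.0, 1.5                                        # compact domain D = [a, b]
candidates = critical_points(f_prime, a, b) + [a, b]    # the set X plus the corners
best_value = max(f(x) for x in candidates)
solutions = [x for x in candidates if abs(f(x) - best_value) < 1e-9]

print(candidates, solutions, best_value)
```

In this example the first-order condition holds at three interior points, yet the maximum is attained at the left corner $x = -2$, which is why the corners have to be included among the candidates.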
...........................................................................
3.1 Review Exercises
...........................................................................
Exercise 9 Show that if $f \in C^2$ on $\mathbb{R}$ reaches an interior maximum at $x^*$, then for any $g : \mathbb{R} \to \mathbb{R}$ such that $g'(x) > 0$ $\forall x \in \mathbb{R}$, $x^*$ is also a maximum for $g(f(x))$. Is this also true if $g'(x) < 0$ $\forall x \in \mathbb{R}$?

Exercise 10 Maximize the function $f(x) = \frac{1}{3}x^3 - x$.
...........................................................................

⁸ More importantly, for a function $f \in C^2$ such that $f'' < 0$, the fact that $\nexists x$ such that $f'(x) = 0$ implies that $f'(x) > 0$ for all $x$ or $f'(x) < 0$ for all $x$. That is, $f$ is strictly monotonic.