Previous Up Next
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

16 Advances of Metric Spaces

16.1 The Stone–Weierstrass Theorem

Density in metric spaces is an important concept since it allows to approximate any element by an element from the dense set. Furthermore, we can extend a uniformly continuous function from a dense subset by continuity, see Ex. 62. Thus, it is convenient to have some supply of manageable dense subsets of common metric spaces.

A famous case of density is the Theorem of Stone–Weierstrass, which in the original and most known form says that any continuous function on a compact interval can be uniformly approximated by a sequence of polynomials. Polynomials have many nice properties which make this dense subset particularly useful: easy computation, derivation and integration, etc. Yet, we will prove here a more general version of the Stone–Weierstrass theorem that applies to general compact metric spaces.

Theorem 1 (Stone–Weierstrass) Suppose that X is a compact metric space and let C(X,ℝ) be the Banach space of real valued continuous functions on X with norm || · ||. Suppose that AC(X,ℝ) is a unital subalgebra of C(X,ℝ), i.e. Suppose furthermore that A separates points, i.e. for any two x,yX with xy there exists a function fA such that f(x) ≠ f(y). Then, A is dense in C(X,ℝ).

This is an interestingly sounding theorem: it states that a subset of C(X,ℝ) which is closed under algebraic operations and separates points automatically has topological property—it is dense. Its consequences are striking. Before we prove this theorem let’s look at some of them.

Corollary 2[Weierstrass approximation theorem] The space of polynomials ℝ[x] is dense in C([a,b],ℝ) for any compact interval [a,b] in the in the || · || norm.

In other words, any continuous function can be approximated with arbitrary accuracy by a polynomial.

Corollary 3 The space of polynomials ℝ[x1,…,xn] is dense in C(K,ℝ) for any compact subset K of n in the || · || norm.

This is the higher dimensional version of the above theorem and states that a continuous functions of n-variables can be approximated by polynomials in n variables.

Corollary 4 Let C(S1,ℝ) be the space of continuous functions on the unit circle, or, equivalently, the space of 2 π-periodic real valued functions on . Then the finite linear span of the set
  
 
m ∈ ℕ
 {1, sin(mx), cos(mx)}
is dense in C(S1,ℝ).

The Stone–Weierstrass theorem is actually a consequence of the following theorem by Stone. This is a good illustration to inventor’s paradox stated by Polya [].

Here is some notation first for two functions f and g:

f ∧ g =  min{f,g},
f ∨ g = max{f,g}

Note that of f and g are continuous, then so are fg and fg (demonstrate this!).

Theorem 5 (Stone’s Theorem) Let X be a compact metric space and suppose that there is a subset A of C(X,ℝ) such that Then, A is dense in C(X,ℝ) in the topology induced by the norm || · || (the uniform topology).
Proof. We need to prove that any function g can be approximated by elements in A. For each two points x,y choose a function fx,yA such that fx,y(x)=g(x) and fx,y(y)=g(y). Such a function exists by our hypothesis for every pair of points. Now, for an ε>0 the sets
  Ox,y={z ∈ X ∣  fx,y(z) < g(z) + ε}
are open and form a cover of X even if we fix x. This is because Ox,y contains both x and y together with some neighbourhoods of these points. Now find a finite subcover for each fixed x. That is there are finitely many points y1, …, yn such that Ox,yi is an open cover. Now define the function
  fx= fx,y1 ∧ … ∧ fx,yn.
By hypothesis fx is in A for any xX and it has the property that
  fx(z) < g(z) + ε
but now for all z. Moreover, fx(x) = g(x). Again, the sets
  Ox={z ∈ X ∣  fx(z) > g(z) − ε}
make an open cover and therefore there is a finite subcover. This means there are finitely many points x1,…,xk such that Oxi is an open cover of X. Now the function
  f = fx1 ∨ … ∨ fxk
is in A and satisfies
  g(x)−ε < f(x) < g(x) + ε
for all x, or in other words || fg || < ε. □

It may be not obvious why conditions of Stone’s theorem 5 are more general than in Thm. 1. This will be seen from the following proof. We employ what Polya called leading particular case [, § 4.4]—we will show that the particular algebra of polynomials on [0,1] approximate the particular function √x and then reduce the general situation to it.

Proof.[Sketch of proof of the Stone Weierstrass theorem 1.] First we observe that if B is the closure in C(X,ℝ) of A from Thm. 1, then B will also be a unital point separating subalgebra of C(X,ℝ) (exercise!).

Step 1: If f is non-negative and in B, so is √f. To see this note that it is enough to show this for 0≤ f < 1 because in case f ≠0 we can compute √f by

  
f
 =
2
|| f||
1
2 || f ||
f
.

Now the Taylor series ∑k=0an xk for √1−x

1−x
=  
k=0
ak  xk = 1−
1
2
x
1
8
x2
1
16
x3
5
128
x4
7
256
x5−…

converges absolutely and uniformly on any interval [0,1−δ). Therefore, the series

  
k=0
ak (1−f−δ)k

converges in the Banach space B because all the partial sums are actually in B as B is a subalgebra. The limit of this sequence is, of course, √f. If we let δ go to zero, we can see that also √fB. This works because

  | 
f(x) + δ
 − 
f(x)
 | = δ ( 
f(x) + δ
 + 
f(x)
)−1 ≤ 
δ
  

so that the approximation is uniform.

Step 2: Since | f | = √f2 we have that fB implies | f | ∈ B. Since moreover,

   f ∧ g = 
f + g
2
 − 
| f −g |
2
,
   f ∨ g = 
f + g
2
 + 
| f −g |
2

we conclude from this that B is closed under the operations ∧ and ∨.

Step 3: Assume that xy are points in X and assume that a,b are real numbers. Then, by assumption there is an element f in B such that

   f(x) ≠ f(y).

Since B is a subspace that contains the constant functions, the function

     
    g(z)
= a + (ba)
f(x)−f(z)
f(x)−f(y)
         
 
= 
bf(x)−af(y)
f(x)−f(y)
 −
ba
f(x)−f(y)
f(z)
         

is also in B and it satisfies g(x)=a and g(y)=b.

Final Step: As we can see all the conditions of Stone’s theorem are satisfied and therefore B is dense in C(X,ℝ). Since B is closed in C(X,ℝ) this means that B=C(X,ℝ). Thus, A is dense in C(X,ℝ). □

We can extend the result from real scalars to complex one through identities ℜ z = 1/2(z +z) and ℑ z = 1/2(zz).

Corollary 6[Stone–Weierstrass (complex version)] Suppose that X is a compact metric space and let C(X,ℂ) be the complex Banach space of complex valued continuous functions on X with norm || · ||. Suppose that AC(X,ℂ) is a unital *-subalgebra of C(X,ℂ), i.e. Suppose furthermore that A separates points, i.e. for any two x,yX with xy there exists a function fA such that f(x) ≠ f(y). Then, A is dense in C(X,ℂ).

Using this complex version of the theorem (or simply the Euler identity ei φ = cosφ +i sinφ) we obtain the complex version of Cor. 4:

Corollary 7 The linear span of the set { ei m ϕm ∈ ℤ} is dense in C(S1,ℂ).

Note that we need both positive and negative values of m in ei m ϕ, the set { ei m ϕm ∈ ℕ0} is not dense in C(S1,ℂ).

The Stone–Weierstrass also says something about the separability of certain Banach spaces. Remember what it means for a topological space to be separable.

Definition 8 (Separable Metric Space) A metric space X is called separable if there exists a countable dense subset of X.

In other words, a separable metric space consists of accumulation points of a single sequence.

Suppose K ⊂ ℝn is compact. Then the space of polynomials with real coefficients and n-variables is dense in the space of continuous functions C(K). Of course every polynomial with real coefficients may be approximated by one with rational coefficients. Thus the set of rational polynomials ℚ[x1,…,xn] is dense in C(K). However, the space of rational polynomials is a countable set. In this way one obtains

Corollary 9 Let K be a compact subset of n then the Banach space C(K) is separable.

The following statement shows that continuous functions make only a tiny fraction of all bounded functions:

Exercise 10 Let X be an infinite set, show that the space B(X) of bounded functions on X is not separable. (Hint: present a set of disjoint balls of radius 1/2 parametrised by all real numbers.)

16.2 Contraction mappings and fixed point theorems

16.2.1 The Banach fixed point theorem

An important tool in numerical Analysis, but also in constructions of solutions of differential equations are fixed point approximations. In order to understand this, suppose that (X,d) is a metric space and f: XX a self-map. Then a point xX is called fixed point of f if f(x) = x. For example the function cos defines a self-map on the interval [0,1], and by starting with x1=0 and inductively computing xn+1 = cosxn one converges to the value roughly 0.739085 which is a fixed point of cos, i.e. solves the equation cos(x) = x. Under certain conditions one can show that such sequences always converge to a fixed point. This is the statement of the Banach fixed point theorem (contraction mapping principle).

Definition 11 (Contraction Mapping) Let (X,d) be a metric space. Then a map f: XX is called contraction if there exists a constant C<1 such that
  d(f(x),f(y)) ≤ Cd(x,y).

Note that any contraction is (uniformly) continuous.

Theorem 12 (Banach Fixed Point Theorem) Suppose that f: XX is a contraction on a complete metric space (X,d). Then f has a unique fixed point y. Moreover, for any xX the sequence (xn) defined recursively by
  xn+1 = f(xn),   x1 = x,
converges to y.
Proof. Let us start with uniqueness. If x,y are both fixed points in X, then since f is a contraction:
  d(x,y) ≤ Cd(x,y)
for some constant C<1. Hence, d(x,y)=0 and therefore x=y.

To prove the remaining claims we start with any x in X and we will show that the sequence xn defined by x1=x and xn+1=f(xn) converges. Since f is continuous the limit of (xn) must be a fixed point. Since (X,d) is complete we only need to show that (xn) is Cauchy. To see this note that

  d(xn+1,xn) ≤ Cd(xn,xn−1)

and therefore inductively,

  d(xn+1,xn) ≤ Cn−1d(x2,x1).

By the triangle inequality we have for any n,m>0

  d(xN+m,xN) ≤ (CN−1 + CN + … CN+m−2) d(x2,x1) ≤ CN−1
1
1−C
d(x2,x1) .

Since C<1 this can be made arbitrarily small by choosing N large enough. □

Corollary 13 Suppose that (X,d) is a complete metric space and f: XX a map such that fn is a contraction for some n ∈ ℕ. Then f has a unique fixed point.
Proof. Since fn is a contraction it has a unique fixed point xX, i.e.
   
 
 
f ∘ f … ∘ f


n−times
(x) =x.
Now note that
fn(f(x)) =  fn ∘ f (x) =   fn+1(x) = f∘ fn(x) = f(fn(x))=f(x)
and therefore f(x) is also a fixed point of fn. By uniqueness we must have f(x)=x. □

The question arises how to show that a given map f is a contraction. In subsets of ℝm there is a simple criterion. Recall that an open set U ⊂ ℝ is called convex if for any two points x,yU the line { t x + (1−t) yt ∈ [0,1] } is contained in U.

Theorem 14 (Mean Value Inequality) Suppose that U ⊂ ℝm is an open set with convex closure U and let f: U → ℝm be a C1-function. Let d f be the total derivative (or Jacobian) understood as a function on U with values in m × m-matrices. Suppose that || df(x) || ≤ M for all xU. Then f: U → ℝm satisfies
 || f(x) −f(y) || ≤ M || x − y ||
for all x,yU.
Proof. Given x,yU let γ(t) = t x + (1−t )y. Then d/dt γ(t) = xy.
  f(x) − f(y) = 
1
0
d
dt
f(γ(t)) dt = 
1
0
 (df) · 
d γ
dt
(t) dt.
Using the triangle inequality (this can be used for Riemann integrals too because these are limits of finite sums), one gets
  || f(x) −f (y) || ≤ 
1
0
 ||  (df) · 
d γ
dt
(t) || dt ≤ M
1
0
 || x − y || dt = M || xy||.
By continuity this inequality extends to U. □
Example 15 Consider the map f: ℝ2B1(0)B1(0),   (x,y) ↦ (x2/4+y/3+1/3,y2/4−x/2). Then
  df = 







x
2
1
3
1
2
y
2








.
The operator norm || df|| can be estimated by the Hilbert–Schmidt norm. Recall || A ||HS = (tr(A* A))1/2, so we get
  || df|| ≤ || df||HS = (
1
4
(x2+y2) + 
1
4
 + 
1
9
)1/2 <1.
Therefore f is a contraction. We can find the fixed point by starting, for example, with the point (0,0) and iterating. We get iterations:
(0,0), (0.333333, 0.), (0.361111, −0.166667), 
 (0.310378, −0.173611), (0.299547, −0.147654), 
 (0.306547, −0.144323), (0.308719, −0.148066),
 (0.307805, −0.148878), (0.307393, −0.148361),
 (0.307502, −0.148194), (0.307575, −0.148261), 
 (0.307564, −0.148292), (0.307551, −0.148284), 
(0.307552, −0.148279), (0.307554, −0.148279).
  
Example 16 Put a map of the country of your current presence on the floor, there’s a point on the map that is touching the actual point it refers to!

16.2.2 Applications of fixed point theory: The Picard-Lindelöf Theorem

Let f: K → ℝ be a function on a compact rectangle of the form K=[T1,T2] × [L1,L2] in ℝ2. Consider the initial value problem (IVP)

dy
dt
 = f(t,y),   y(t0) = y0, (101)

where y: [T1,T2] → ℝ, ty(t) is a function. The function f and the initial value y0 ∈ [L1,L2], and t0 ∈ [T1,T2] are given and we are looking for a function y satisfying the above equations.


  
  
Figure 19: Vector fields and their integral curves from Ex. 1720.

Example 17 Let f(t,x)=x and y0 =1, t0=0. Then the initial value problem is
  
dy
dt
 = y,   y(0) =1.
We know from other courses that there is a unique solution y(t) = et, see Fig. 19 top-left.
Example 18 Let f(t,x)=x2 and y0 =1, t0=0. Then the initial value problem is
  
dy
dt
 = y2,   y(0) =1.
We know from other courses that there is a unique solution y(t) = 1/1−t which exists only on the interval (−∞,1), see Fig. 19 top-right.
Example 19 Let f(t,x)=x2t and y0 =1, t0=0. Then the initial value problem is
  
dy
dt
 = y2t,   y(0) =1.
One can show that there exists a solution for small |t|, however this solution cannot be expressed in terms of elementary functions, see Fig. 19 bottom-left.
Example 20 Let f(t,x)=x2/3 and y0 =0, t0=0. Then the initial value problem is
  
dy
dt
 = y
2
3
 
,   y(0) =0.
It has at least two solutions, namely y=0 and y=t3/27, see Fig. 19 bottom-right.

Hence, there are two fundamental questions here: existence and uniqueness of solutions. The following theorem is one of the basic results in the theorem of ordinary differential equation and establishes existence and uniqueness under rather general assumptions.

Theorem 21 (Picard–Lindelöf theorem) Suppose that f: [T1,T2] × [y0C,y0+C] → ℝ is a continuous function such that for some M>0 we have
 |f(t,y1) − f(t,y2)| ≤ M | y1y2|   (Lipschitz condition)
for all t ∈ [T1,T2], y1,y2 ∈ [y0C,y0+C]. Then, for any t0 ∈ [T1,T2] the initial value problem
dy
dt
(t) = f(t,y(t)),   y(t0) = y0,
has a unique solution y in C1[a,b], where [a,b] is the interval [t0R, t0+R] ∩ [T1,T2], where
R= ||f||−1C.
(The solution exists for all times t such that |tt0| ≤ R).
Remark 22 Note, that the Lipschitz condition implies uniform continuity and is significantly stronger requirement.
Proof. Using the fundamental theorem of calculus we can write the IVP as a fixed point equation F(y) = y for a map defined by
  F(y) (t) = y0 + 
t
t0
f(s,y(s)) ds.
This is a map that will send a continuous function yC[T1,T2] to a continuous function F(y) ∈ C[T1,T2]. As a metric space we take
X = C([a,b], [y0 −C, y0 +C])
that is, the set of continuous functions on [a,b] taking values in the interval [y0C, y0 +C]. This is a closed (why?) subset of the Banach space C[a,b] and is therefore a complete metric space.

First we show that F: XX, i.e. F maps X to itself. Indeed,

 | F(y)(t) − y0 | = 


t
t0
f(s,y(s)) ds


≤ R || f ||≤ C.

Next we show that FN is a contraction for N large enough and thus establish the existence of a unique fixed point. It is the place to use the Lipschitz condition. Observe that for two functions y, y′ ∈ X we have

    | F(y)(t) − F(y′)(t) |
=  


t
t0
f(s,y(s)) − f(s,y′(s)) ds


 
≤  
t
t0
 | f(s,y(s)) − f(s,y′(s)) | ds ≤ |tt0| M || y − y′ ||.
(102)

We did not assume that (tt0) MRM <1, so F will in general not be a contraction. There are several ways to resolve this situations. For example, we can argue in either of the following two manners:

  1. We use both the result and the method from (102) to compute distances for higher powers of F, starting from the squares:
         
      | F2(y)(t) − F2(y′)(t) |
    ≤ 
    t
    t0
     | f(s,F(y)(s)) − f(s,F(y′)(s)) |  ds
             
     
    ≤ 
    t
    t0
     |st0| · M · || F(y) − F(y′)||  ds
             
     
    ≤ 
    t
    t0
     |st0| · M2 · || y − y′ ||  ds
             
     
     = 
    |tt0|2
    2
    M2  || y − y′ ||,
             
    and iterating this gives for any natural N:
     || FN(y) − FN(y′) ||≤  
    |tt0|N
    N !
    MN  || y − y′ ||.
    Since the factorial will overgrow the respective power, for N large enough, FN is a contraction and we deduce the existence of a unique solution from Cor. 13. This solution is in C1 since it can be written as the integral of a continuous function.
  2. The inequality (102) shows existence and uniqueness of solution only in the space of functions C([t0r,t0+r], [y0C,y0+C]) where r < M−1 and therefore |tt0| M< 1 in (102). Now suppose we have two solutions y and y′. They coincide at t0. Application of (102) to other initial points where the solutions coincide shows that the set E={ x ∈ [a,b] ∣ y(x) = y′(x)} is open. It is also the pre-image of the closed set {0} under the continuous map yy′. So we have that E is a closed and open subset of [a,b] that is non-empty. It must therefore be [a,b]. Hence, we get y = y′, establishing uniqueness in the whole C[a,b].

Note that this not only gives uniqueness and existence, but also gives a constructive method to compute the solution by iterating the map F starting for example with the constant function y(t)=y0. The iteration

yn+1(t) = y0 + 
t
t0
f(s,yn(s)) ds

is called Picard iteration. It will converge to the solution uniformly. See Fig. 20 for an illustration of few first iterations for the exponent functions.


Figure 20: Few initial Picard iterations for the differential equation y′=y: constant f0, linear f1, quadratic f2, etc.

Remark 23 The proof also gives a bound on the solution, namely if the assumptions are satisfied one gets | y(t) − y0 | ≤ C for t ∈ [a,b].
Remark 24 The proof works in the same way if y takes values in m and therefore f : ℝ × ℝm ⊃ [T1,T2] × BC(0) → ℝm. In fact, the target space may even be a Banach space (the derivative for Banach space-valued functions appropriately defined). Higher order differential equations may be written as systems of first order equations and hence the theorem applies to these as well. For example y″(t) + y(t) =0, y(0) = 1, y′(0) = 0 can be written as
  
d
dt


y
w


= 

w
y


,   

y
w


(0) = 

1 
0 


.
So here the function f is f(t,(x1,x2)) = (x2, −x1).
Example 25 Consider the IVP
  
dy
dt
 = y2t +1,   y(0) = 1.
Hence, f(t,x) = x2t +1. If we take f to be defined on the square [−T,T] × [1−C,1+C] then we obtain ||f||= (1+C)2 T +1 (the value at the top-right corner). In this case the solution will exist up to time
  min



T, 
C
(1+C)2T +1




.
If we choose, for example C=2 and T=1/2 we get that a unique solution exists up to time | t | ≤ 4/11. This solution will then satisfy | y(t) −1 | ≤ 2 for | t | ≤ 4/11.

In fact one can show that the solution can be expressed in a complicated way in terms of the Airy-Bi-function and it blows up at t=1.

16.2.3 Applications of fixed point theory: Inverse and Implicit Function Theorems

It is an easy exercise in Analysis to show that if a function fC1[a,b] has nowhere vanishing derivative, then f is invertible on its image. To be more precise, f−1: Im(f) → [a,b] exists and has derivative (f′(x))−1 at the point y=f(x). In higher dimensions a statement like this can not be correct as the following counterexample shows. Let 0<a<b and define

f: [a,b] × ℝ → ℝ2,
 (r,θ) ↦ (r cosθ, r sinθ).

This maps has invertible derivative

f′(r,θ) = 

cosθr sinθ 
sinθr cosθ. 


,   detf′(r,θ) = r2 >0.

at any point, the map is however not injective, see Fig. 21 for a cartoon illustration of the difference between one- and two-dimensional cases. However, for any point we can restrict domain and co-domain, so that the restriction of the function is invertible. In such a case we say that f is locally invertible. This concept will be explained in more detail below.


Figure 21: Flat and spiral staircases: can we return to the same value going just in one way?

Definition 26 (Local Invertibility) Suppose U1, U2 ⊂ ℝm are open subsets of m. Then a map f: U1U2 is called locally invertible at xU1 if there exists an open neighbourhood U of x such that f |U : Uf(U) is invertible. The function f is said to be locally invertible it it is locally invertible at x for any xU1.

Often, say for differential equations, we need a map which preserves differentiability of functions in both directions.

Definition 27 (Diffeomorphism) Suppose U1, U2 ⊂ ℝm are open subsets of m. Then a map f: U1U2 is called Ck-diffeomorphism if fCk(U1,U2) and if there exists a gCk(U2,U1) such that
   f ∘ g = 1U2,   g ∘ f = 1U1,
where 1U1 and 1U2 are the identity maps on U1 and U2 respectively.

There is also a local version of the above definition.

Definition 28 (Local Diffeomorphism) Suppose U1, U2 ⊂ ℝm are open subsets of m. Then a map f: U1U2 is called a local-Ck- diffeomorphism at xU1 if there exists an open neighbourhood U of x such that f |U: Uf(U) is a Ck-diffeomorphism. It is called a local-Ck- diffeomorphism if it is a local diffeomorphism at any point xU1.

Not every invertible Ck-map is a diffeomorphism. An example is the function f(x) = x3 whose inverse g(x) = x1/3 fails to be differentiable.

Theorem 29 (Inverse Function Theorem) Let U ⊂ ℝm be an open subset and suppose that fCk(U,ℝm) such that f′(x) is invertible at every point xU. Then f is a local Ck-diffeomorphism.

Before we can prove this theorem we need a Lemma, which basically says that under the assumptions of the inverse function theorem an inverse function must be in C1. That is, differentiability is the leading particular case [, § 4.4] for the general case of k-differentiable functions.

Lemma 30 Suppose that fC1(U1,U2) is bijective with continuous inverse. Assume that the derivative of f is invertible at any point, then f is a C1-diffeomorphism, and g′(f(x)) = (f′(x))−1.
Proof. Denote the inverse of f by g: U2U1. The continuity of f and g imply that xnx0 if and only if f(xn) → f(x0). We will show that g is differentiable at the point y0 = f(x0). If y=f(x) is very close to y0 (so that the line interval between x and x0 is contained in U1) then, by the MVT there exists a ξ on this line such that yy0 = f(x) − f(x0) = f′(ξ) · (xx0). Therefore, g(y)−g(y0) = (f′(ξ))−1 · (yy0). If y tends to y0, then ξ will tend to x0, and therefore, by continuity of f′ the value of (f′(ξ))−1 will tend to (f′(x0))−1. Thus, the partial derivatives of g exist and are continuous, so gC1. Note that we have used here that matrix inversion is continuous. □

Now we can proceed with the general situation.

Proof.[Proof of the Inverse Function Theorem 29] Let x0U and let y0=f(x0). We need to show that there exists an open neighborhood U1 of f(x0) such that f: f−1(U1) → U1 is a Ck-diffeomorphism. As a first step we construct a continuous inverse. Since f′(x0)=A is an invertible m × m-matrix we can change coordinates x = A−1 y + x0, so that we can assume without loss of generality that f′(x0)= 1 and x0=0. Replacing f by fy0 we also assume w.l.o.g. that y0=0. Since f′(x) is continuous there exists an ε>0 such that || f′(x) − 1 || ≤ 1/2 for all xBε(0). This ε>0 can also be chosen such that Bε(0) ⊂ U. Thus, || xf(x) || ≤ 1/2|| x|| for all xBε(0) by MVT, and for each yBε/2(0) the map
     x ↦ x + y − f(x)
is a contraction on Bε(0). Indeed, by MVT again:
      ||x + y − f(x) − (x′ + y − f(x′))||= ||x − f(x) − (x′ − f(x′))|| 
 = ||(f′(ξ) − 1) (xx′)||
 
≤ 
1
2
 ||xx′||,
(103)
where ||·|| is the norm of vectors in ℝm. Consider the complete metric space X=C(Bε/2(0),Bε(0)) and define the map
     F: X → X,   u ↦ F(u),   F(u)(y) = u(y) + y − f(u(y)). 
By the above this map is well defined and it also is a contraction
     
      || F(u)(y) −F(v)(y) ||
= || u(y) − f(u(y)) − 
v(y) −f(v(y)) 
|| 
         
 
≤ 
1
2
 || u(y) − v(y) ||  
 [by (103)]       
 
≤  
1
2
 || u − v ||. 
         
Hence, there exists a unique fixed point g. This fixed point yields a continuous inverse g of f|U defined on U =Bε/2(0) ∩ f−1(Bε/2(0)). By the previous Lemma this implies that g is differentiable. Now simply note that g′ = (f′)−1g. Since matrix inversion is smooth and f′ is in Ck−1 this implies that for mk−1 we get the conclusion (gCm) (gCm+1). Hence, g is in Ck. □

The implicit function theorem is actually a rather simple consequence of the inverse function theorem. It gives a nice criterion for local solvability of equations in many variables.

Theorem 31 (Implicit Function Theorem) Let U1 ⊂ ℝn × ℝm and U2 ⊂ ℝm be open subsets and let
   F: U1 → U2,   (x1, …, xn, y1,…,ym) ↦ F(x1, …, xn, y1,…,ym)
be a Ck-map. Suppose that F(x0,y0)=0 for some point (x0,y0) ∈ U1 and that the m × m-matrix y F(x0,y0) is invertible. Then there exists an neighborhood U of (x0,y0) ∈ ℝn × ℝm, an open neighborhood V of x0 in n, and a Ck-function f: V → ℝm such that
   {  (x,y) ∈ U ∣ F(x,y) =0 } =   {  (x , f(x)) ∈ U ∣ x ∈ V }.
The function f has derivative
   f′(x0)=−(∂yF(x0,y0))−1 ∂x  F(x0,y0)
at x0.
Proof. This is proved by reducing it to the inverse function theorem. Just design the map
  G :  U1 →  ℝn × ℝm,   (x,y) ↦ (x, F(x,y))
and then note that
  G′(x0,y0) = 

10 
xF(x0,y0)yF(x0,y0) 


is invertible with inverse
 (G′(x0,y0))−1 = 

10 
−(∂yF(x0,y0))−1 ∂x  F(x0,y0)(∂yF(x0,y0))−1


.
By the inverse function theorem there exists a local inverse G−1: U3U4, where U3 is an open neighborhood of 0 and U4 an open neighborhood of (x0,y0). Now define f by (x,f(x)) = G−1(x,0). □
Example 32 Consider the system of equations
  x12 + x22 + y12 + y22 = 2,
  x1 + x23  + y1 + y23 =2.
We would like to know if this system implicitly determines functions y1(x1,x2) and y2(x1,x2) near the point (0,0,1,1), which solves the equation. For this one simply applies the implicit function theorem to
  F(x1,x2,y1,y2) = ( x12 + x22 + y12 + y22 − 2,  x1 + x23  + y1 + y23 − 2).
The derivatives are
  ∂xF = 

2 x12 x2
13 x22


,   ∂yF = 

2 y12 y2
13 y22


The values of these derivatives at the point (0,0,1,1) are
  ∂xF(0,0,1,1) = 

00 
10 


,   ∂yF(0,0,1,1) = 

22 
13  


The latter matrix is invertible and one computes
  −(∂yF(x0,y0))−1 ∂x  F(x0,y0)(0,0,1,1) = 

1/20 
−1/20 


.
We conclude that there is an implicitly defined function (y1,y2)= f(x1,x2) whose derivative at (0,0) is given by
  

1/20 
−1/20 


.
The geometric meaning is that near the point (0,0,1,1) the system defines a two-dimensional manifold that is locally given by the graph of a function. Its tangent plane is spanned by the vectors (1/2,0,1,0) and (−1/2,0,0,1).
Example 33 Consider the system of equations
  x2 + y2 + z2 = 1,
  x + yz + z3 =1.
This is the intersection of a sphere (drawn in light green on Figure 22) with some cubic surface defined by the second equation (drawn in light blue). The point (0,0,1) solves the equation and is pictured as a little orange dot. By the implicit function theorem the intersection is a smooth curve (drawn in red) near this point which can be parametrised by x coordinate. Indeed, we can express y and z along the curve as functions of x because the resulting matrix
(y,z)F(0,1)=

2y2z
zy+3z2
  



 


y=0,z=1
 =  

02 
13 
  

is invertible.

Figure 22: Example of the implicit theorem: the intersection (red) of the unit sphere (green) and a cubic surface (blue).

Exercise 34 Fig. 22 suggests that the intersection curve can be alternatively parametrised by the coordinates y and cannot by z (why?). Check these claims by verifying conditions of Thm. 31.

16.3 The Baire Category Theorem and Applications

We are going to see another example of an abstract result which has several non-trivial consequences for real analysis.

16.3.1 The Baire’s Categories

Let us first prove the following result and then discuss its meaning and name.

Theorem 35 (Baire’s category theorem) Let (X,d) be a complete metric space and Un a sequence of open dense sets. Then the intersection S=∩n Un is dense.
Proof. The proof is rather straightforward. We need to show that any ball Bε(x0) contains an element of S. Let us therefore fix x0 and ε>0. Since U1 is dense the intersection of Bε(x0) with U1 is non-trivial. Thus there exists a point x1Bε(x0) ∩ U1. Now choose ε1 < ε/2 so that Bε1(x1)Bε(x) ∩ U1 (note the closure of the ball). Since U2 is dense, the intersection Bε1(x1) ∩ U2Bε(x0) ∩ U1U2 is non-empty. Choose a point x2 and ε2 < ε1 /2 such that Bε2(x2)Bε1(x1) ∩ U2Bε(x0) ∩ U1U2. Continue inductively, to obtain a sequence xn such that
Bεn(xn)
 ⊂ Bεn−1(xn−1) ⋂ Un  ⊂  Bε(x0) ⋂ U1 ⋂ U2 ⋂ … ⋂ Un,
and εn < 2n ε. In particular, for any n>N we have
  xn ∈ B2Nε(xN),
which implies that xn is a Cauchy sequence. Hence xn has a limit x, by completeness of (X,d). Consequently, x is contained in the closed ball BεN(xN) for any N, and therefore it is contained in Bε(x0) ∩ (∩n Un), as claimed. □

Completeness is essential here. For example, the conclusion does not hold for the metric space ℚ: take bijection ψ: ℕ → ℚ, and consider the open dense sets

  Un = { ψ(1), ψ(1), …, ψ(n)}c = {ψ(n+1), ψ(n+2),… }.

The intersection ∩n Un is empty.

The following historic terminology, due to Baire, is in use.

Definition 36 (Baire’s categories) A subset Y of a metric space X is called
  1. nowhere dense if the interior of Y is empty;
  2. of first category if there is a sequence (Yk) of nowhere dense sets with Y = ∪k Yk;
  3. of second category if it is not of first category.

Example of nowhere dense sets are ℤ ⊂ ℝ, the circle in ℝ2, or the set { 1/nn ∈ ℕ } ⊂ ℝ. Note that the complement of a nowhere dense set is a dense open set.

Corollary 37 In a complete metric space the complement of a set of the first category is dense.
Proof. Follows from relations for complements
  Yc = (⋃kYk)c = ⋂kYkc ⊃  ⋂k
Yk
c
and the fact that Ykc is dense. □

The following corollary is also called Baire’s category theorem in some sources:

Corollary 38 A complete metric space is of second category in itself, or plainly speaking it is never the union of a countable number of nowhere dense sets.

The theorem is often used to show abstract existence results. Here is an example.

Theorem 39 There exists a function fC[0,1] that is nowhere differentiable.
Proof. For each n ∈ ℕ define
  Un = 



f ∈ C[0,1]  s.t.  sup






f(x+h)−f(x)
h



  over  0 < |h| ≤ 
1
n




> n, ∀ x ∈ [0,1]



.
We will show that the Un are open and dense. By the Category theorem their intersection is also dense.

Un is open: Let fUn. For each x ∈ [0,1] choose δx>0 such that

 sup






f(x+h)−f(x)
h



  over  0 < |h| ≤ 
1
n




> n + δx,

hence there is a hx < 1/n with




f(x+hx)−f(x)
hx



 > n + δx.

By continuity of f there is an open neighborhood Ix of x such that




f(y+hx)−f(y)
hx



> n + δx.

for all yIx. These Ix form an open cover. We choose a finite subcover (Ixk)k=1,…,N. Let δ= min{δx1, …, δxN} > 0 . Then, for yIxk:




f(y+hxk)−f(y)
hxk



 > n + δ.

Now let gBε(f), where ε>0 is chosen so that ε < 1/2 δ hxk for all k. Then by an ε/3-style argument:




g(y+hxk) − g(y)
hxk



 ≥  


f(y+hxk) − f(y)
hxk



 − 2 
|| fg||
hxk
 > n + δ − 2 ε hxk−1 >n,

and therefore gUn. We conclude that Un is open.

Un is dense: For each ε>0 and fC[0,1] choose a polynomial p such that || fp || < ε/2 and a sequence of continuous function gmC[0,1] such that || g ||< ε/2 and such that for all x ∈ [0,1]:

 sup



gm(x+h)−gm(x)
h
  over  0 < |h| ≤ 
1
n




> m

by using a “zigzag” function. Then, for large enough m we have p+gmUn. □

The above proof actually shows much more, namely that the set of nowhere differentiable functions is dense in C[0,1]. It is also useful to compare it with the construction of the continuous nowhere differentiable Weierstrass function and identify some common elements.

16.3.2 Banach–Steinhaus Uniform Boundedness Principle

Another consequence of the Baire Category theorem is the Banach–Steinhaus uniform boundedness principle. Recall that, if X and Y are normed spaces, T: XY is called a bounded operator if it is a bounded linear map.

Theorem 40 (Banach–Steinhaus Uniform Boundedness Principle) Let X be a Banach space and Y a normed space, and let (Tα)α ∈ I be a family of bounded operators Tα: XY. Suppose that
  ∀ x ∈ X: 
 
sup
α
 || Tαx || < ∞.
Then we have supα || Tα|| < ∞, i.e. the family Tα is bounded in the set B(X,Y) of bounded operators from X to Y.
Proof. Define Xn = {xX ∣ supα || Tαx || ≤ n }. By assumption X = ∪n Xn. Note that all the Xn are closed. By the Baire category theorem at least one of these sets must have non-empty interior, since otherwise the Banach space X would be a countable union of nowhere dense sets. Hence, there exists N ∈ ℕ, yXN, and ε>0 such that Bε(y)XN. Now XN is symmetric under reflections x↦ −x and convex. So we get the same statement for −y. Hence, xBε(0) implies
x = 
1
2

(x + y) + (xy) 
∈ 
1
2

XN + XN
⊂ XN. (104)
This means that || x || ≤ ε implies || Tαx || ≤ N, and therefore || Tα|| ≤ ε−1 N for all α ∈ I. □

Recall that the Fourier series of a C1-function on a circle (identified with 2 π-periodic functions) converges uniformly to the function. We will now show that a statement like that can not hold for continuous functions.

Corollary 41 There exist continuous periodic functions whose Fourier series do not converge point-wise.
Proof. We will show that there exists a continuous function whose Fourier series does not converge at x=0. Suppose by contradiction such functions would not exist, so we would have point-wise convergence of the Fourier series
  
1
2
a0 + 
m =1
am cos(mx) + bm sin(mx)
for every fC(S1) = Cper(ℝ). Here we identify continuous functions on the unit circle with continuous 2 π-periodic functions Cper(ℝ). Hence we have a map
  Tn : C(S1) → ℝ, f ↦   
1
2
a0  + 
n
m =1
am
by mapping the function f to the n-th partial sum of its Fourier series at x=0. This is a family of bounded operators Tn: C(S1) → ℝ and by assumption we have for every f that
  
 
sup
n
 | Tn(f) | < ∞.
By Banach–Steinhaus theorem we have supn || Tn || = supn, || f ||=1 | Tn(f) | < ∞. Now one computes the norm of the map
  Tn : C(S1) → ℝ,    f ↦ 
1
π
π
−π
f(x) 


1
2
+
n
k=1
 cos(kx) 


dx =  
1
2 π
  
π
−π
f(x) Dn(x) dx
where
  Dn(x) = 
sin


(n+
1
2
) x


sin


x
2



is the Dirichlet kernel , cf. Lem. 6. This norm equals 1/2 π ∫−ππ| Dn(x) | dx = 1/2 π ∫02 π | Dn(x) | dx (Exercise) which goes to ∞ as n → ∞. Indeed, using sin(x/2) ≤ x/2 and substituting we get
     
   
2 π
0
 | Dn(x) | dx
≥ 
0
|sin((n+
1
2
) x)|
x/2
dx  
  [since  sins ≤ s]       
 
= 
(2n+1)π
0
|sin(t)|
t
dt  
 
 [change of variables    t=(n+
1
2
) x]
       
 
≥ 
2n
k=0
(k+1) π
k π
|sint|
t
dt  
  [split integral into intervals]       
 
 ≥ 


2n
k=0
π
0
sint
(k+1)
dt  


  
  [since  t ≤ k+1  for  t∈ (k,k+1) ]       
 
 = 2 
2n
k=0
1
k+1
  
  [evaluating the integral],        
which is the harmonic series divergent as n → ∞. This gives a contradiction. □

Another corollary of the Banach–Steinhaus principle is an important continuity statement. Recall that of X and Y are normed spaces them so is the Cartesian product X × Y equipped with the norm || (x,y) || = ( || x ||X2 + || y ||Y2 )1/2. It is easy to see that a sequence (xn,yn) converges to (x,y) in this norm if and only if xnx and yny.

Theorem 42 Suppose that X, Y are Banach spaces and suppose that B: X × Y → ℝ is a bilinear form on X × Y that is separately continuous, i.e. B(·, y) is continuous on X for every yY and B(x,·) is continuous on Y for every xX. Then B is continuous.
Proof. Suppose that (xn,yn) is a sequence that converges to (x,y). First note that
  B(xnx,yny)= B(xn,yn) − B(xn,y) − B(x,yn) + B(x,y),
where B(xn,y) → B(x,y) as well as B(x,yn) → B(x,y). So it is sufficient to show that B(xnx,yny) → 0 or, equivalently, B(xn,yn) → 0 for any xn→ 0 and yn→ 0. Now. the linear mappings Tn(x)= B(x,yn): X → ℝ are bounded, by assumption. Since ||yn||→ 0 the sequence Tn(x)→ 0 and is bounded for every xX. Then, by the Banach–Steinhaus theorem there exists a constant C such that ||Tn|| ≤ C for all n. That is |Tn(x)| = B(x, yn) ≤ C ||x|| for all n and xX. Therefore, |B(xn,yn)| ≤ C ||xn|| → 0. □
Remark 43 Recall that already on 2 separate continuity does not imply joint continuity for any function. The standard example from Analysis is the function
  f(x,y) = 





xy
x2 + y2
(x,y) ≠ (0,0) 
0(x,y) = 0,
which is continuous in x or y separately but is not jointly continuous.

16.3.3 The open mapping theorem

Recall that for a continuous map the pre-image of any open set is open. This does of course not mean that the image of any open set is open (for example, sin: ℝ → ℝ has image [−1,1], which is not open). A map f: XY between metric space is called open if the image of every open set is open. If a map is invertible then it is open if and only if its inverse is continuous. We start with a simple observation for linear maps. We will denote open balls in normed spaces X and Y by BrX(x) and BsY(y) respectively, or simply BrX and BsY if they are centred at the origin.

Lemma 44 Let X and Y be normed spaces. Then a linear map T: XY is open if and only if there exists ε>0 such that BεY(0) ⊂ T (B1X(0)), i.e. the image of the unit ball contains a zero’s neighbourhood.
Proof. If the map T is open it clearly has this property. Suppose conversely, that BεY(0) ⊂ T (B1X(0)) for some ε>0. Then, by scaling, Bε δY(0) ⊂ T (BδX(0)) for any δ>0. Suppose that U is open. Suppose that yf(U), that is there exists xU such that y=f(x). Then there exists δ>0 with x + BδX(0) ⊂ U and therefore
    TU ⊃ TBδX(x) = { Tx } +  TBδX(0) ⊃  { y } +  Bδ εY(0) =Bδ εY(y) .
Theorem 45 (Open Mapping Theorem) Let T : XY be a continuous surjective linear operator between Banach spaces. Then T is open.
Proof. Since T is surjective we have Y = ∪n T BnX. Therefore trivially, Y = ∪n T BnX. By the Baire category theorem one of the T BnX must have an interior point. Rescaling implies that T B1X has an interior point y0. Since T B1X is symmetric under reflection y→ −y, the point −y0 must also be an interior point. Therefore, by convexity of T B1X there exists a δ>0 with BδYT B1X, cf. (104). By linearity this means Bδ 2nYT B2nX for any natural n.

We will show that T B1XT B2X, with the implication from above that BδYT B2X, which will complete the proof by the previous Lemma. So, let yTB1X be arbitrary. Then, there exists x1B1X such that yT x1Bδ/2YTB1/2X. Repeating this, there exists x2B1/2X such that yT x1T x2Bδ/4Y.

Continuing inductively, we obtain a sequence (xn) with the property that || xn || < 2n+1 and

y − 
n
k=1
Txn ∈ Bδ 21−nY. (105)

By completeness of X, the absolute convergent series ∑1n xn converges to an element xX of norm || x|| < 2. By linearity an continuity of T we get from (105) that y = T x. Thus yTB2. □

If the map T is also injective (and, therefore, bijective with the inverse T−1) we can quickly conclude continuity of T−1.

Corollary 46 Suppose that T: XY is a bijective bounded linear map between Banach spaces. Then T has a bounded inverse T−1.

It is not rare that we may have two different norms ||·|| and ||·||* on the same Banach space X. We say that ||·|| and ||·||* are equivalent if there are constants c>0 and C>0 such that:

c ||x|| ≤ ||x||* ≤ C ||x||     for all  x ∈ X. (106)
Exercise 47
  1. Check that (106) defines an equivalence relations on the set of all norms on X.
  2. If a sequence is Cauchy/convergent/bounded in a norm then it is also Cauchy/convergent/bounded in any equivalent norm.

The Cor. 46 implies that if the identity map (X,||·||)→ (X,||·||*) is bounded then both norms are equivalent.

Corollary 48 Let (X,||·||) be a Banach space and ||·||* be a norm on X in which X is complete. If ||·|| ≤ C ||·||* for some C>0 the norms are equivalent.

16.3.4 The closed graph theorem

Suppose that X, Y are Banach spaces and suppose that DX is a linear subspace (not necessarily closed). Now suppose that T : DY is a linear operator. Then the graph gr(T) is defined as the subset {(x,Tx) ∣ xD} ⊂ X × Y. This is a linear subspace in the Banach space X × Y, which can be equipped with the norm ||(x,y)||2 = ||x||X2 + || y||Y2. One often uses the equivalent norm ||(x,y)|| = ||x||X + || y||Y but the first choice makes sure that the product X × Y is also a Hilbert space if X and Y are Hilbert spaces. We will refer to T as an operator from X to Y with domain D.

Definition 49 The operator T is called closed if and only if its graph is a closed subset of X × Y.

It is easy to see that T is closed if an only if xnx and T xny imply that T xnT x. Note the difference with continuity of T!!!

If T is an operator T : DY then its graph is a subset of X × Y. If we close this subset the resulting set may fail to be the graph of an operator. If the closure is the graph as well, we say that T is closable and its closure is the operator whose graph is obtained by closing the graph of T.

Differential operators are often closed but not bounded. Let L2[a,b] be the Hilbert space obtained by abstract completion of (C[a,b],|| ·||2), cf. Prop. 60. Then D=C1[a,b] is a dense subspace in L2[a,b] and the operator d/dx: C1[a,b] → L2[a,b] is of the above type. This operator is not closed, however it is closable and its closure therefore defines a closed operator with dense domain. We have already seen that this operator is unbounded and therefore it cannot be continuous.

Of course, the map D → (x,Tx) is a bijection from D to gr(T). We can use the norm on gr(T) to define a norm on D, which is then

 || x ||D = 
|| x ||X2 + || Tx ||Y2
1
2
 
.

Obviously, T is closed if and only of D with norm ||·||D is a Banach space. We are now ready to state the closed graph theorem. It is easy to check that T continuously maps (D, || · ||D) to Y.

Theorem 50 (Closed Graph Theorem) Suppose that X and Y are Banach spaces and suppose that T: XY is closed. Then T is bounded.
Proof. Since in this case we have D=X with have two norms ||·||X and || · ||D on X that are both complete. Clearly,
  ||·||X ≤ ||·||D,
and by Cor. 48 the norms are therefore equivalent. Hence,
  || Tx ||Y ≤ || x ||D ≤ C || x ||X
for some constant C>0. □

16.4 Semi-norms and locally convex topological vector spaces

Definition 51 (Semi-Norm) Let X be a vector space, then a map p: X → ℝ is called semi-norm if
  1. p(x) ≥ 0 for all xX,
  2. px) = |λ| p(x), for all λ ∈ ℝ, xX,
  3. p(x+y) ≤ p(x) + p(y), for all x,yX.

An example of a semi-norm on C1[0,1] is p(f):=|| f ′ ||. If (pα)α is a family of semi-norms with the property that

 ( ∀ α ∈ I, pα(x) =0 )  x=0

then we say X with that family is a locally convex topological vector space. There is a topology (that is, a description of all open sets) on such a vector space, by declaring a subset UX to be open if and only if for every point xU and any index α ∈ I there exists ε>0 such that { ypα(yx) < ε } ⊂ U. The notion of convergence one gets is xnx if and only of pα(xnx) → 0 for all α. The topology of point-wise convergence on the space of functions S → ℝ is for example of this type, with the family of semi-norms given by (px)s xS, px(f) = | f(x) |.

Another example is the vector space C(ℝm) with the topology of uniform convergence of all derivatives on compact sets. Here the family of semi-norms pα,K is indexed by all multi-indices α ∈ ℕ0m and all compact subsets K ⊂ ℝ and is given by

pα, K(f) = 
 
sup
x ∈ K
 | ∂αf(x) |.

If the family of semi-norms is countable then this topology is actually coming from a metric (so the space is a metric space)

d(x,y) = 
k=1
1
2k
pk(xy)
1+pk(xy)
.

Such a metric space is called Frechet space. Note that C(ℝm) is a Frechet space because the family of semi-norms above can be replaced by a countable one by taking a countable exhaustion of ℝm by compact subsets.

site search by freefind advanced

Last modified: November 6, 2024.
Previous Up Next