Advances of Metric Spaces

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

16 Advances of Metric Spaces

16.1 The Stone–Weierstrass Theorem

Density in metric spaces is an important concept since it allows us to approximate any element by an element of the dense set. Furthermore, we can extend a uniformly continuous function from a dense subset by continuity, see Ex. 62. Thus, it is convenient to have a supply of manageable dense subsets of common metric spaces.

A famous case of density is the Stone–Weierstrass theorem, which in its original and best-known form says that any continuous function on a compact interval can be uniformly approximated by a sequence of polynomials. Polynomials have many nice properties which make this dense subset particularly useful: easy computation, differentiation and integration, etc. Here we will prove a more general version of the Stone–Weierstrass theorem that applies to general compact metric spaces.

Theorem 1 (Stone–Weierstrass) Suppose that X is a compact metric space and let C(X,ℝ) be the Banach space of real valued continuous functions on X with norm || · ||_∞. Suppose that A ⊂ C(X,ℝ) is a unital subalgebra of C(X,ℝ), i.e.

A is a linear subspace,
1 ∈ A,
A · A ⊂ A, or in other words f,g ∈ A implies that also f · g ∈ A.

Suppose furthermore that A separates points, i.e. for any two x,y ∈ X with x ≠ y there exists a function f ∈ A such that f(x) ≠ f(y). Then, A is dense in C(X,ℝ).

This is a remarkable theorem: it states that a subset of C(X,ℝ) which is closed under algebraic operations and separates points automatically has a topological property, namely it is dense. Its consequences are striking. Before we prove this theorem let’s look at some of them.

Corollary 2 [Weierstrass approximation theorem] The space of polynomials ℝ[x] is dense in C([a,b],ℝ) in the || · ||_∞ norm for any compact interval [a,b].

In other words, any continuous function can be approximated with arbitrary accuracy by a polynomial.
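One constructive way to see Corollary 2 at work on [0,1] is through Bernstein polynomials B_n f(x) = ∑_{k=0}^n f(k/n) C(n,k) x^k (1−x)^{n−k}. This is a classical construction and not the route taken in the proof below. The following Python sketch illustrates how the uniform error shrinks as the degree grows; the names bernstein and sup_error are ours, and the supremum is only estimated on a finite grid.

```python
from math import comb

def bernstein(f, n):
    """Return the degree-n Bernstein polynomial of f on [0, 1] as a callable."""
    values = [f(k / n) for k in range(n + 1)]
    def p(x):
        return sum(v * comb(n, k) * x**k * (1 - x)**(n - k)
                   for k, v in enumerate(values))
    return p

def sup_error(f, p, samples=1001):
    """Estimate ||f - p||_inf on [0, 1] from a uniform grid (a lower bound)."""
    return max(abs(f(i / (samples - 1)) - p(i / (samples - 1)))
               for i in range(samples))

f = lambda x: abs(x - 0.5)          # continuous, not differentiable at 1/2
errors = [sup_error(f, bernstein(f, n)) for n in (4, 16, 64, 256)]
```

For this f the estimated errors decrease with the degree, although slowly; the kink at 1/2 is where the approximation is worst.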

Corollary 3 The space of polynomials ℝ[x₁,…,x_n] is dense in C(K,ℝ) for any compact subset K of ℝⁿ in the || · ||_∞ norm.

This is the higher dimensional version of the above corollary and states that a continuous function of n variables on a compact set can be approximated by polynomials in n variables.

Corollary 4 Let C(S¹,ℝ) be the space of continuous functions on the unit circle, or, equivalently, the space of continuous 2π-periodic real valued functions on ℝ. Then the finite linear span of the set

⋃_{m ∈ ℕ} {1, sin(m x), cos(m x)}

is dense in C(S¹,ℝ).

The Stone–Weierstrass theorem is actually a consequence of the following theorem by Stone. This is a good illustration of the inventor’s paradox stated by Polya [13].

Here is some notation first for two functions f and g:

	f ∧ g = min{f,g},
	f ∨ g = max{f,g}

Note that if f and g are continuous, then so are f ∧ g and f ∨ g (demonstrate this!).

Theorem 5 (Stone’s Theorem) Let X be a compact metric space and suppose that there is a subset A of C(X,ℝ) such that

A is closed under the operations ∧ and ∨, this means f,g ∈ A implies f ∧ g ∈ A and f ∨ g ∈ A.
for any pair of points x ≠y and numbers a,b ∈ ℝ there is a function f ∈ A such that f(x)=a and f(y)=b.

Then, A is dense in C(X,ℝ) in the topology induced by the norm || · ||_∞ (the uniform topology).

Proof. We need to prove that any function g can be approximated by elements in A. For each two points x,y choose a function f_x,y ∈ A such that f_x,y(x)=g(x) and f_x,y(y)=g(y). Such a function exists by our hypothesis for every pair of points. Now, for an ε>0 the sets

O_x,y={z ∈ X ∣ f_x,y(z) < g(z) + ε}

are open and form a cover of X even if we fix x. This is because O_x,y contains both x and y together with some neighbourhoods of these points. Now find a finite subcover for each fixed x. That is, there are finitely many points y₁, …, y_n such that the sets O_{x,y₁}, …, O_{x,y_n} cover X. Now define the function

f _x= f_x,y₁ ∧ … ∧ f_{x,y_n}.

By hypothesis f_x is in A for any x ∈ X and it has the property that

f_x(z) < g(z) + ε

but now for all z. Moreover, f_x(x) = g(x). Again, the sets

O_x={z ∈ X ∣ f_x(z) > g(z) − ε}

form an open cover of X (each O_x contains x because f_x(x) = g(x)) and therefore there is a finite subcover. This means there are finitely many points x₁,…,x_k such that the sets O_{x₁}, …, O_{x_k} cover X. Now the function

f = f_x₁ ∨ … ∨ f_{x_k}

is in A and satisfies

g(x)−ε < f(x) < g(x) + ε

for all x, or in other words || f −g ||_∞ < ε. □

It may not be obvious why the conditions of Stone’s theorem 5 are more general than those of Thm. 1. This will be seen from the following proof. We employ what Polya called a leading particular case [14, § 4.4]: we will show that the particular algebra of polynomials on [0,1] approximates the particular function √x and then reduce the general situation to it.

Proof.[Sketch of proof of the Stone Weierstrass theorem 1.] First we observe that if B is the closure in C(X,ℝ) of A from Thm. 1, then B will also be a unital point separating subalgebra of C(X,ℝ) (exercise!).

Step 1: If f is non-negative and in B, so is √f. To see this note that it is enough to show this for 0 ≤ f < 1 because in case f ≠ 0 we can compute √f by

√f = √(2 || f ||_∞) · √( f / (2 || f ||_∞) ).

Now the Taylor series ∑_{k=0}^∞ a_k x^k for √(1−x),

√(1−x) = ∑_{k=0}^∞ a_k x^k = 1 − x/2 − x²/8 − x³/16 − 5x⁴/128 − 7x⁵/256 − …,

converges absolutely and uniformly on any interval [0,1−δ). Therefore, the series

∑_{k=0}^∞ a_k (1−f−δ)^k

converges in the Banach space B because all the partial sums are actually in B as B is a subalgebra. The limit of this series is, of course, √(f+δ). If we let δ go to zero, we can see that also √f ∈ B. This works because

| √(f(x) + δ) − √(f(x)) | = δ ( √(f(x) + δ) + √(f(x)) )⁻¹ ≤ √δ,

so that the approximation is uniform.
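The coefficients a_k in Step 1 can be generated from the binomial series by the recursion a_0 = 1, a_{k+1} = a_k (k − 1/2)/(k+1). The following Python sketch (function names are ours) reproduces the coefficients listed above and evaluates a partial sum at a sample point; it is meant as a numerical illustration and not as a part of the proof.

```python
from fractions import Fraction
from math import sqrt

def sqrt_series_coefficients(count):
    """First `count` Taylor coefficients a_k of sqrt(1 - x) at x = 0."""
    coeffs = [Fraction(1)]
    for k in range(count - 1):
        # a_{k+1} = a_k * (k - 1/2) / (k + 1), from the binomial series
        coeffs.append(coeffs[-1] * Fraction(2 * k - 1, 2) / (k + 1))
    return coeffs

def partial_sum(coeffs, x):
    """Evaluate sum_k a_k x^k by Horner's scheme."""
    total = 0.0
    for a in reversed(coeffs):
        total = total * x + float(a)
    return total

coeffs = sqrt_series_coefficients(60)
approx = partial_sum(coeffs, 0.5)     # compare with sqrt(1 - 0.5)
```

Replacing the number x by a function 1−f−δ with values in [0,1−δ] turns the same partial sums into elements of the algebra B.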

Step 2: Since | f | = √(f²) we have that f ∈ B implies | f | ∈ B. Since moreover,

	f ∧ g = (f + g)/2 − | f − g |/2,
	f ∨ g = (f + g)/2 + | f − g |/2,

we conclude from this that B is closed under the operations ∧ and ∨.

Step 3: Assume that x ≠ y are points in X and assume that a,b are real numbers. Then, by assumption there is an element f in B such that

f(x) ≠ f(y).

Since B is a subspace that contains the constant functions, the function

g(z) = a + (b−a) (f(x)−f(z))/(f(x)−f(y)) = (b f(x) − a f(y))/(f(x)−f(y)) − ((b−a)/(f(x)−f(y))) f(z)

is also in B and it satisfies g(x)=a and g(y)=b.

Final Step: As we can see all the conditions of Stone’s theorem are satisfied and therefore B is dense in C(X,ℝ). Since B is closed in C(X,ℝ) this means that B=C(X,ℝ). Thus, A is dense in C(X,ℝ). □

We can extend the result from real scalars to complex ones through the identities ℜ z = 1/2 (z + z̄) and ℑ z = 1/(2i) (z − z̄).

Corollary 6 [Stone–Weierstrass (complex version)] Suppose that X is a compact metric space and let C(X,ℂ) be the complex Banach space of complex valued continuous functions on X with norm || · ||_∞. Suppose that A ⊂ C(X,ℂ) is a unital *-subalgebra of C(X,ℂ), i.e.

A is a linear subspace,
1 ∈ A,
A · A ⊂ A, or in other words f,g ∈ A implies that also f · g ∈ A.
if f ∈ A then also f̄ ∈ A (the complex conjugate of f).

Suppose furthermore that A separates points, i.e. for any two x,y ∈ X with x ≠ y there exists a function f ∈ A such that f(x) ≠ f(y). Then, A is dense in C(X,ℂ).

Using this complex version of the theorem (or simply the Euler identity e^{i φ} = cosφ +i sinφ) we obtain the complex version of Cor. 4:

Corollary 7 The linear span of the set { e^{i m ϕ} ∣ m ∈ ℤ} is dense in C(S¹,ℂ).

Note that we need both positive and negative values of m in e^{i m ϕ}: the linear span of the set { e^{i m ϕ} ∣ m ∈ ℕ₀} is not dense in C(S¹,ℂ).

The Stone–Weierstrass theorem also says something about the separability of certain Banach spaces. Recall what it means for a metric space to be separable.

Definition 8 (Separable Metric Space) A metric space X is called separable if there exists a countable dense subset of X.

In other words, a separable metric space consists of accumulation points of a single sequence.

Suppose K ⊂ ℝⁿ is compact. Then the space of polynomials with real coefficients and n-variables is dense in the space of continuous functions C(K). Of course every polynomial with real coefficients may be approximated by one with rational coefficients. Thus the set of rational polynomials ℚ[x₁,…,x_n] is dense in C(K). However, the space of rational polynomials is a countable set. In this way one obtains

Corollary 9 Let K be a compact subset of ℝⁿ. Then the Banach space C(K) is separable.

The following statement shows that continuous functions make only a tiny fraction of all bounded functions:

Exercise 10 Let X be an infinite set, show that the space B(X) of bounded functions on X is not separable. (Hint: present a set of disjoint balls of radius 1/2 parametrised by all real numbers.)

16.2 Contraction mappings and fixed point theorems

16.2.1 The Banach fixed point theorem

Fixed point approximations are an important tool in numerical analysis, but also in the construction of solutions of differential equations. In order to understand this, suppose that (X,d) is a metric space and f: X → X a self-map. Then a point x ∈ X is called a fixed point of f if f(x) = x. For example, the function cos defines a self-map on the interval [0,1], and by starting with x₁=0 and inductively computing x_{n+1} = cos x_n one converges to the value of roughly 0.739085, which is a fixed point of cos, i.e. solves the equation cos(x) = x. Under certain conditions one can show that such sequences always converge to a fixed point. This is the statement of the Banach fixed point theorem (contraction mapping principle).
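The cosine iteration described above is easy to try out. The following Python sketch (the helper name iterate and the number of steps are our choices) applies cos one hundred times starting from 0 and records how well the result solves cos(x) = x.

```python
from math import cos

def iterate(f, x, steps):
    """Apply f repeatedly: returns f(f(...f(x)...)) with `steps` applications."""
    for _ in range(steps):
        x = f(x)
    return x

x_star = iterate(cos, 0.0, 100)
residual = abs(cos(x_star) - x_star)   # how far x_star is from being a fixed point
```

The value x_star agrees with 0.739085 to the digits shown, and the residual is at the level of rounding errors.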

Definition 11 (Contraction Mapping) Let (X,d) be a metric space. Then a map f: X → X is called a contraction if there exists a constant C<1 such that

d(f(x),f(y)) ≤ C d(x,y) for all x,y ∈ X.

Note that any contraction is (uniformly) continuous.

Theorem 12 (Banach Fixed Point Theorem) Suppose that f: X → X is a contraction on a complete metric space (X,d). Then f has a unique fixed point y. Moreover, for any x ∈ X the sequence (x_n) defined recursively by

x_n+1 = f(x_n), x₁ = x,

converges to y.

Proof. Let us start with uniqueness. If x,y are both fixed points in X, then since f is a contraction:

d(x,y) ≤ C d(x,y)

for some constant C<1. Hence, d(x,y)=0 and therefore x=y.

To prove the remaining claims we start with any x in X and we will show that the sequence x_n defined by x₁=x and x_n+1=f(x_n) converges. Since f is continuous the limit of (x_n) must be a fixed point. Since (X,d) is complete we only need to show that (x_n) is Cauchy. To see this note that

d(x_n+1,x_n) ≤ C d(x_n,x_n−1)

and therefore inductively,

d(x_n+1,x_n) ≤ Cⁿ⁻¹ d(x₂,x₁).

By the triangle inequality we have for any N,m>0

d(x_{N+m},x_N) ≤ (C^{N−1} + C^N + … + C^{N+m−2}) d(x₂,x₁) ≤ (C^{N−1}/(1−C)) d(x₂,x₁).

Since C<1 this can be made arbitrarily small by choosing N large enough. □
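The estimate from the proof can be checked numerically on the cosine example. On [0,1] the mean value theorem gives the contraction constant C = sin(1) < 1 for cos; this choice of C and the ranges of N and m below are our assumptions for the illustration.

```python
from math import cos, sin

C = sin(1.0)                 # Lipschitz constant of cos on [0, 1]; C < 1
xs = [0.0]                   # xs[n - 1] holds x_n, so xs[0] = x_1
for _ in range(60):
    xs.append(cos(xs[-1]))

def a_priori_bound(N):
    """Right-hand side C^(N-1) / (1 - C) * d(x_2, x_1) of the estimate."""
    return C ** (N - 1) / (1 - C) * abs(xs[1] - xs[0])

# Check d(x_{N+m}, x_N) <= bound(N) for a range of N and m.
checks = [abs(xs[N + m - 1] - xs[N - 1]) <= a_priori_bound(N)
          for N in range(1, 30) for m in range(1, 30)]
```

The bound depends only on the first step d(x₂,x₁) and on C, so it can be evaluated before the iteration is carried out.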

Corollary 13 Suppose that (X,d) is a complete metric space and f: X → X a map such that fⁿ is a contraction for some n ∈ ℕ. Then f has a unique fixed point.

Proof. Since fⁿ is a contraction it has a unique fixed point x ∈ X, i.e.


fⁿ(x) = (f ∘ f ∘ … ∘ f)(x) = x   (f composed n times).

Now note that

fⁿ(f(x)) = fⁿ ∘ f (x) = fⁿ⁺¹(x) = f∘ fⁿ(x) = f(fⁿ(x))=f(x)

and therefore f(x) is also a fixed point of fⁿ. By uniqueness we must have f(x)=x. □

The question arises how to show that a given map f is a contraction. In subsets of ℝ^m there is a simple criterion. Recall that a set U ⊂ ℝ^m is called convex if for any two points x,y ∈ U the line segment { t x + (1−t) y ∣ t ∈ [0,1] } is contained in U.

Theorem 14 (Mean Value Inequality) Suppose that U ⊂ ℝ^m is an open set with convex closure Ū and let f: Ū → ℝ^m be a C¹-function. Let d f be the total derivative (or Jacobian) understood as a function on U with values in m × m-matrices. Suppose that || df(x) || ≤ M for all x ∈ U. Then f: Ū → ℝ^m satisfies

|| f(x) −f(y) || ≤ M || x − y ||

for all x,y ∈ Ū.

Proof. Given x,y ∈ U let γ(t) = t x + (1−t) y. Then d/dt γ(t) = x − y and

f(x) − f(y) = ∫₀¹ d/dt f(γ(t)) d t = ∫₀¹ (d f)(γ(t)) · (dγ/dt)(t) d t.

Using the triangle inequality (this can be used for Riemann integrals too because these are limits of finite sums), one gets

|| f(x) − f(y) || ≤ ∫₀¹ || (d f)(γ(t)) · (dγ/dt)(t) || d t ≤ M ∫₀¹ || x − y || d t = M || x−y||.

By continuity this inequality extends to the closure Ū. □

Example 15 Consider the map f: ℝ² ⊃ B₁(0) → B₁(0), (x,y) ↦ (x²/4+y/3+1/3, y²/4−x/2). Then

df = ⎛ x/2    1/3 ⎞
     ⎝ −1/2   y/2 ⎠

The operator norm || df|| can be estimated by the Hilbert–Schmidt norm. Recall || A ||_HS = (tr(A^* A))^{1/2}, so we get

|| df|| ≤ || df||_HS = ( (x²+y²)/4 + 13/36 )^{1/2} < 1.

Therefore f is a contraction. We can find the fixed point by starting, for example, with the point (0,0) and iterating. We get iterations:

	(0,0), (0.333333, 0.), (0.361111, −0.166667),
	(0.310378, −0.173611), (0.299547, −0.147654),
	(0.306547, −0.144323), (0.308719, −0.148066),
	(0.307805, −0.148878), (0.307393, −0.148361),
	(0.307502, −0.148194), (0.307575, −0.148261),
	(0.307564, −0.148292), (0.307551, −0.148284),
	(0.307552, −0.148279), (0.307554, −0.148279).
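The iterations listed in Example 15 can be reproduced with a few lines of Python. The sketch below (function names and the number of steps are ours) implements the map of the example and collects the orbit of (0,0).

```python
def f(point):
    """The map (x, y) -> (x^2/4 + y/3 + 1/3, y^2/4 - x/2) of Example 15."""
    x, y = point
    return (x * x / 4 + y / 3 + 1 / 3, y * y / 4 - x / 2)

def orbit(start, steps):
    """Return [start, f(start), f(f(start)), ...] with `steps` applications."""
    points = [start]
    for _ in range(steps):
        points.append(f(points[-1]))
    return points

points = orbit((0.0, 0.0), 50)
fixed = points[-1]            # numerically a fixed point of f
```

The first entries of points match the list above, and fixed agrees with the last listed pair up to the displayed precision.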

Example 16 Put a map of the country you are currently in on the floor. Then there is a point on the map that touches the actual point it refers to!

16.2.2 Applications of fixed point theory: The Picard-Lindelöf Theorem

Let f: K → ℝ be a function on a compact rectangle of the form K=[T₁,T₂] × [L₁,L₂] in ℝ². Consider the initial value problem (IVP)

dy/dt = f(t,y), y(t₀) = y₀, (101)

where y: [T₁,T₂] → ℝ, t ↦ y(t) is a function. The function f and the initial value y₀ ∈ [L₁,L₂], and t₀ ∈ [T₁,T₂] are given and we are looking for a function y satisfying the above equations.

Figure 19: Vector fields and their integral curves from Ex. 17–20.

Example 17 Let f(t,x)=x and y₀ =1, t₀=0. Then the initial value problem is

dy/dt = y, y(0) =1.

We know from other courses that there is a unique solution y(t) = e^t, see Fig. 19 top-left.

Example 18 Let f(t,x)=x² and y₀ =1, t₀=0. Then the initial value problem is

dy/dt = y², y(0) =1.

We know from other courses that there is a unique solution y(t) = 1/(1−t) which exists only on the interval (−∞,1), see Fig. 19 top-right.

Example 19 Let f(t,x)=x²−t and y₀ =1, t₀=0. Then the initial value problem is

dy/dt = y²−t, y(0) =1.

One can show that there exists a solution for small |t|, however this solution cannot be expressed in terms of elementary functions, see Fig. 19 bottom-left.

Example 20 Let f(t,x)=x^{2/3} and y₀ =0, t₀=0. Then the initial value problem is

dy/dt = y^{2/3}, y(0) =0.

It has at least two solutions, namely y=0 and y=t³/27, see Fig. 19 bottom-right.

Hence, there are two fundamental questions here: existence and uniqueness of solutions. The following theorem is one of the basic results in the theory of ordinary differential equations and establishes existence and uniqueness under rather general assumptions.

Theorem 21 (Picard–Lindelöf theorem) Suppose that f: [T₁,T₂] × [y₀−C,y₀+C] → ℝ is a continuous function such that for some M>0 we have

|f(t,y₁) − f(t,y₂)| ≤ M | y₁−y₂| (Lipschitz condition)

for all t ∈ [T₁,T₂], y₁,y₂ ∈ [y₀−C,y₀+C]. Then, for any t₀ ∈ [T₁,T₂] the initial value problem

dy/dt (t) = f(t,y(t)), y(t₀) = y₀,

has a unique solution y in C¹[a,b], where [a,b] is the interval [t₀−R, t₀+R] ∩ [T₁,T₂], where

R= ||f||_∞⁻¹ C.

(The solution exists for all times t such that |t−t₀| ≤ R).

Remark 22 Note that the Lipschitz condition implies uniform continuity and is a significantly stronger requirement.

Proof. Using the Fundamental Theorem of Calculus we can write the IVP as a fixed point equation F(y) = y for a map defined by

F(y)(t) = y₀ + ∫_{t₀}^{t} f(s,y(s)) d s.

This is a map that will send a continuous function y ∈ C[T₁,T₂] to a continuous function F(y) ∈ C[T₁,T₂]. As a metric space we take

X = C([a,b], [y₀ −C, y₀ +C])

that is, the set of continuous functions on [a,b] taking values in the interval [y₀ −C, y₀ +C]. This is a closed (why?) subset of the Banach space C[a,b] and is therefore a complete metric space.

First, we show that F: X → X, i.e. F maps X to itself. Indeed,

| F(y)(t) − y₀ | = | ∫_{t₀}^{t} f(s,y(s)) d s | ≤ R || f ||_∞ ≤ C.

Next, we show that F^N is a contraction for N large enough and thus establish the existence of a unique fixed point. This is the place to use the Lipschitz condition. Observe that for two functions y, y′ ∈ X we have

	| F(y)(t) − F(y′)(t) | = | ∫_{t₀}^{t} ( f(s,y(s)) − f(s,y′(s)) ) d s |
	≤ ∫_{t₀}^{t} | f(s,y(s)) − f(s,y′(s)) | d s
	≤ |t−t₀| M || y − y′ ||_∞.   (102)

We did not assume that (t−t₀) M ≤ RM <1, so F will in general not be a contraction. There are several ways to resolve this situation. For example, we can argue in either of the following two manners:

We use both the result and the method from (102) to compute distances for higher powers of F, starting from the squares:

	| F²(y)(t) − F²(y′)(t) | ≤ ∫_{t₀}^{t} | f(s,F(y)(s)) − f(s,F(y′)(s)) | d s
	≤ ∫_{t₀}^{t} M · | F(y)(s) − F(y′)(s) | d s
	≤ ∫_{t₀}^{t} |s−t₀| · M² · || y − y′ ||_∞ d s
	= (|t−t₀|²/2) M² || y − y′ ||_∞,

and iterating this gives for any natural N:

| F^N(y)(t) − F^N(y′)(t) | ≤ (|t−t₀|^N / N!) M^N || y − y′ ||_∞, and hence || F^N(y) − F^N(y′) ||_∞ ≤ ((RM)^N / N!) || y − y′ ||_∞.

Since the factorial eventually outgrows the corresponding power, F^N is a contraction for N large enough, and we deduce the existence of a unique solution from Cor. 13. This solution is in C¹ since it can be written as the integral of a continuous function.

The inequality (102) shows existence and uniqueness of a solution only in the space of functions C([t₀−r,t₀+r], [y₀−C,y₀+C]) where r < M⁻¹ and therefore |t−t₀| M< 1 in (102). Now suppose we have two solutions y and y′. They coincide at t₀. Application of (102) to other initial points where the solutions coincide shows that the set E={ x ∈ [a,b] ∣ y(x) = y′(x)} is open. It is also the pre-image of the closed set {0} under the continuous map y−y′. So we have that E is a closed and open subset of [a,b] that is non-empty. It must therefore be [a,b]. Hence, we get y = y′, establishing uniqueness in the whole C[a,b].

□

Note that this not only gives uniqueness and existence, but also gives a constructive method to compute the solution by iterating the map F starting for example with the constant function y(t)=y₀. The iteration

y_{n+1}(t) = y₀ + ∫_{t₀}^{t} f(s,y_n(s)) d s

is called Picard iteration. It will converge to the solution uniformly. See Fig. 20 for an illustration of the first few iterations for the exponential function.

Figure 20: Few initial Picard iterations for the differential equation y′=y: constant f₀, linear f₁, quadratic f₂, etc.
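For the equation y′ = y with y(0) = 1 every Picard iterate is a polynomial, so the iteration can be carried out exactly on coefficient lists. The following Python sketch assumes this particular right-hand side f(t,y) = y and t₀ = 0; the representation of polynomials by lists and the function names are our choices.

```python
from fractions import Fraction

def picard_step(coeffs):
    """One Picard step for y' = y, y(0) = 1, acting on polynomial coefficients.

    coeffs[k] is the coefficient of t^k in y_n; the result represents
    y_{n+1}(t) = 1 + integral from 0 to t of y_n(s) ds.
    """
    integrated = [c / (k + 1) for k, c in enumerate(coeffs)]
    return [Fraction(1)] + integrated

def picard_iterates(steps):
    """Return [y_0, y_1, ..., y_steps], starting from the constant y_0 = 1."""
    iterates = [[Fraction(1)]]
    for _ in range(steps):
        iterates.append(picard_step(iterates[-1]))
    return iterates

iterates = picard_iterates(4)
```

The n-th iterate is the Taylor polynomial 1 + t + t²/2 + … + tⁿ/n! of the exponential function, which matches the constant, linear and quadratic curves of Fig. 20.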

Remark 23 The proof also gives a bound on the solution, namely if the assumptions are satisfied one gets | y(t) − y₀ | ≤ C for t ∈ [a,b].

Remark 24 The proof works in the same way if y takes values in ℝ^m and therefore f : ℝ × ℝ^m ⊃ [T₁,T₂] × B_C(0) → ℝ^m. In fact, the target space may even be a Banach space (the derivative for Banach space-valued functions appropriately defined). Higher order differential equations may be written as systems of first order equations and hence the theorem applies to these as well. For example y″(t) + y(t) =0, y(0) = 1, y′(0) = 0 can be written as

d/dt (y, y′) = (y′, −y),   (y, y′)(0) = (1, 0).

So here the function f is f(t,(x₁,x₂)) = (x₂, −x₁).

Example 25 Consider the IVP

dy/dt = y² t +1, y(0) = 1.

Hence, f(t,x) = x²t +1. If we take f to be defined on the square [−T,T] × [1−C,1+C] then we obtain ||f||_∞= (1+C)² T +1 (the value at the top-right corner). In this case the solution will exist up to time

min{ T, C / ((1+C)² T + 1) }.

If we choose, for example C=2 and T=1/2 we get that a unique solution exists up to time | t | ≤ 4/11. This solution will then satisfy | y(t) −1 | ≤ 2 for | t | ≤ 4/11.
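The arithmetic of Example 25 can be repeated for other choices of T and C. The following Python sketch (the function name existence_time is ours, and it assumes T ≥ 0 and C ≥ 0) evaluates the guaranteed existence time with exact fractions.

```python
from fractions import Fraction

def existence_time(T, C):
    """min{T, C / ||f||_inf} for f(t, x) = x^2 t + 1 on [-T, T] x [1-C, 1+C]."""
    sup_f = (1 + C) ** 2 * T + 1      # value at the corner (T, 1 + C)
    return min(T, C / sup_f)

radius = existence_time(Fraction(1, 2), Fraction(2))
```

With T = 1/2 and C = 2 this returns 4/11, the value quoted above. Trying other pairs (T, C) shows how the guaranteed interval depends on the chosen rectangle.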

In fact one can show that the solution can be expressed in a complicated way in terms of the Airy-Bi-function and it blows up at t=1.

16.2.3 Applications of fixed point theory: Inverse and Implicit Function Theorems

It is an easy exercise in Analysis to show that if a function f ∈ C¹[a,b] has nowhere vanishing derivative, then f is invertible on its image. To be more precise, f⁻¹: Im(f) → [a,b] exists and has derivative (f′(x))⁻¹ at the point y=f(x). In higher dimensions a statement like this can not be correct as the following counterexample shows. Let 0<a<b and define

	f: [a,b] × ℝ → ℝ²,
	(r,θ) ↦ (r cosθ, r sinθ).

This map has invertible derivative

f′(r,θ) = ⎛ cos θ   −r sin θ ⎞ ,   det f′(r,θ) = r > 0
          ⎝ sin θ    r cos θ ⎠

at any point. However, the map is not injective, see Fig. 21 for a cartoon illustration of the difference between the one- and two-dimensional cases. Nevertheless, for any point we can restrict the domain and co-domain so that the restriction of the function is invertible. In such a case we say that f is locally invertible. This concept will be explained in more detail below.

Figure 21: Flat and spiral staircases: can we return to the same value going just in one way?

Definition 26 (Local Invertibility) Suppose U₁, U₂ ⊂ ℝ^m are open subsets of ℝ^m. Then a map f: U₁ → U₂ is called locally invertible at x ∈ U₁ if there exists an open neighbourhood U of x such that f |_U : U → f(U) is invertible. The function f is said to be locally invertible if it is locally invertible at x for any x ∈ U₁.

Often, say for differential equations, we need a map which preserves differentiability of functions in both directions.

Definition 27 (Diffeomorphism) Suppose U₁, U₂ ⊂ ℝ^m are open subsets of ℝ^m. Then a map f: U₁ → U₂ is called a C^k-diffeomorphism if f ∈ C^k(U₁,U₂) and if there exists a g ∈ C^k(U₂,U₁) such that

f ∘ g = 1_U₂, g ∘ f = 1_U₁,

where 1_U₁ and 1_U₂ are the identity maps on U₁ and U₂ respectively.

There is also a local version of the above definition.

Definition 28 (Local Diffeomorphism) Suppose U₁, U₂ ⊂ ℝ^m are open subsets of ℝ^m. Then a map f: U₁ → U₂ is called a local-C^k- diffeomorphism at x ∈ U₁ if there exists an open neighbourhood U of x such that f |_U: U → f(U) is a C^k-diffeomorphism. It is called a local-C^k- diffeomorphism if it is a local diffeomorphism at any point x ∈ U₁.

Not every invertible C^k-map is a diffeomorphism. An example is the function f(x) = x³ whose inverse g(x) = x^{1/3} fails to be differentiable at 0.

Theorem 29 (Inverse Function Theorem) Let U ⊂ ℝ^m be an open subset and suppose that f ∈ C^k(U,ℝ^m) such that f′(x) is invertible at every point x ∈ U. Then f is a local C^k-diffeomorphism.

Before we can prove this theorem we need a Lemma, which basically says that under the assumptions of the inverse function theorem an inverse function must be in C¹. That is, differentiability is the leading particular case [14, § 4.4] for the general case of k-differentiable functions.

Lemma 30 Suppose that f ∈ C¹(U₁,U₂) is bijective with continuous inverse g. Assume that the derivative of f is invertible at any point. Then f is a C¹-diffeomorphism, and g′(f(x)) = (f′(x))⁻¹.

Proof. Denote the inverse of f by g: U₂ → U₁. The continuity of f and g implies that x_n → x₀ if and only if f(x_n) → f(x₀). We will show that g is differentiable at the point y₀ = f(x₀). If y=f(x) is very close to y₀ (so that the line segment between x and x₀ is contained in U₁) then, by the MVT, cf. Thm. 14, there exists a ξ on this segment such that y−y₀ = f(x) − f(x₀) = f′(ξ) · (x−x₀). Therefore, g(y)−g(y₀) = (f′(ξ))⁻¹ · (y−y₀). If y tends to y₀, then ξ will tend to x₀, and therefore, by continuity of f′ the value of (f′(ξ))⁻¹ will tend to (f′(x₀))⁻¹. Thus, the partial derivatives of g exist and are continuous, so g ∈ C¹. Note that we have used here that matrix inversion is continuous. □

Now we can proceed with the general situation.

Proof.[Proof of the Inverse Function Theorem 29] Let x₀ ∈ U and let y₀=f(x₀). We need to show that there exists an open neighborhood U₁ of f(x₀) such that f: f⁻¹(U₁) → U₁ is a C^k-diffeomorphism. As a first step we construct a continuous inverse. Since f′(x₀)=A is an invertible m × m-matrix we can change coordinates x = A⁻¹ y + x₀, so that we can assume without loss of generality that f′(x₀)= 1 and x₀=0. Replacing f by f−y₀ we also assume w.l.o.g. that y₀=0. Since f′(x) is continuous there exists an ε>0 such that || f′(x) − 1 || ≤ 1/2 for all x ∈ B_ε(0). This ε>0 can also be chosen such that B_ε(0) ⊂ U. Thus, || x−f(x) || ≤ 1/2|| x|| for all x ∈ B_ε(0) by MVT and for each y ∈ B_ε/2(0) the map

x ↦ x + y − f(x)

is a contraction on B_ε(0). Indeed, by MVT again:

	||x + y − f(x) − (x′ + y − f(x′))|| = ||x − f(x) − (x′ − f(x′))||
	= ||(f′(ξ) − 1) (x−x′)||
	≤ 1/2 ||x−x′||,   (103)

where ||·|| is the norm of vectors in ℝ^m. Consider the complete metric space X=C(B_ε/2(0),B_ε(0)) and define the map

F: X → X, u ↦ F(u), F(u)(y) = u(y) + y − f(u(y)).

By the above this map is well defined and it is also a contraction:

	|| F(u)(y) − F(v)(y) || = || u(y) − f(u(y)) − ( v(y) − f(v(y)) ) ||
	≤ 1/2 || u(y) − v(y) ||   [by (103)]
	≤ 1/2 || u − v ||_∞.

Hence, there exists a unique fixed point g. This fixed point yields a continuous inverse g of f|_U defined on U =B_ε/2(0) ∩ f⁻¹(B_ε/2(0)). By the previous Lemma this implies that g is differentiable. Now simply note that g′ = (f′)⁻¹ ∘ g. Since matrix inversion is smooth and f′ is in C^{k−1} this implies that for m ≤ k−1 we get the implication (g ∈ C^m) ⇒ (g ∈ C^{m+1}). Hence, g is in C^k. □
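For a fixed y the contraction used in the proof is the scalar or vector iteration x ↦ x + y − f(x), whose fixed point is the value g(y) of the inverse. The following one-dimensional Python sketch uses the sample map f(x) = x + x²/4, which is our choice: it satisfies f(0) = 0, f′(0) = 1 and |f′(x) − 1| ≤ 1/2 on [−1,1], so that ε = 1 works in the notation of the proof.

```python
def f(x):
    """Sample map with f(0) = 0, f'(0) = 1 and |f'(x) - 1| <= 1/2 on [-1, 1]."""
    return x + 0.25 * x * x

def local_inverse(y, steps=80):
    """Approximate g(y) with f(g(y)) = y by iterating x -> x + y - f(x)."""
    x = 0.0
    for _ in range(steps):
        x = x + y - f(x)
    return x

ys = [-0.4, -0.1, 0.0, 0.2, 0.45]          # points of the ball B_{1/2}(0)
residuals = [abs(f(local_inverse(y)) - y) for y in ys]
```

The residuals are at the level of rounding errors, so the iteration does recover the local inverse on the sampled points. The proof runs the same iteration simultaneously for all y, that is, in the function space X.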

The implicit function theorem is actually a rather simple consequence of the inverse function theorem. It gives a nice criterion for local solvability of equations in many variables.

Theorem 31 (Implicit Function Theorem) Let U₁ ⊂ ℝⁿ × ℝ^m and U₂ ⊂ ℝ^m be open subsets and let

F: U₁ → U₂, (x₁, …, x_n, y₁,…,y_m) ↦ F(x₁, …, x_n, y₁,…,y_m)

be a C^k-map. Suppose that F(x₀,y₀)=0 for some point (x₀,y₀) ∈ U₁ and that the m × m-matrix ∂_y F(x₀,y₀) is invertible. Then there exists an open neighborhood U of (x₀,y₀) in ℝⁿ × ℝ^m, an open neighborhood V of x₀ in ℝⁿ, and a C^k-function f: V → ℝ^m such that

{ (x,y) ∈ U ∣ F(x,y) =0 } = { (x , f(x)) ∈ U ∣ x ∈ V }.

The function f has derivative

f′(x₀)=−(∂_y F(x₀,y₀))⁻¹ ∂_x F(x₀,y₀)

at x₀.

Proof. This is proved by reducing it to the inverse function theorem. Consider the map

G : U₁ → ℝⁿ × ℝ^m, (x,y) ↦ (x, F(x,y))

and then note that

G′(x₀,y₀) = ⎛ 1              0             ⎞
            ⎝ ∂_x F(x₀,y₀)   ∂_y F(x₀,y₀)  ⎠

is invertible with inverse

(G′(x₀,y₀))⁻¹ = ⎛ 1                                 0                 ⎞
                ⎝ −(∂_y F(x₀,y₀))⁻¹ ∂_x F(x₀,y₀)    (∂_y F(x₀,y₀))⁻¹  ⎠

By the inverse function theorem there exists a local inverse G⁻¹: U₃ → U₄, where U₃ is an open neighborhood of 0 and U₄ an open neighborhood of (x₀,y₀). Now define f by (x,f(x)) = G⁻¹(x,0). □

Example 32 Consider the system of equations

	x₁² + x₂² + y₁² + y₂² = 2,
	x₁ + x₂³ + y₁ + y₂³ =2.

We would like to know if this system implicitly determines functions y₁(x₁,x₂) and y₂(x₁,x₂) near the point (0,0,1,1), which solves the equation. For this one simply applies the implicit function theorem to

F(x₁,x₂,y₁,y₂) = ( x₁² + x₂² + y₁² + y₂² − 2, x₁ + x₂³ + y₁ + y₂³ − 2).

The derivatives are

∂_xF = ⎛ 2 x₁   2 x₂  ⎞ ,   ∂_yF = ⎛ 2 y₁   2 y₂  ⎞
       ⎝ 1      3 x₂² ⎠            ⎝ 1      3 y₂² ⎠

The values of these derivatives at the point (0,0,1,1) are

∂_xF(0,0,1,1) = ⎛ 0   0 ⎞ ,   ∂_yF(0,0,1,1) = ⎛ 2   2 ⎞
                ⎝ 1   0 ⎠                     ⎝ 1   3 ⎠

The latter matrix is invertible and one computes

−(∂_y F(0,0,1,1))⁻¹ ∂_x F(0,0,1,1) = ⎛ 1/2    0 ⎞
                                     ⎝ −1/2   0 ⎠

We conclude that there is an implicitly defined function (y₁,y₂)= f(x₁,x₂) whose derivative at (0,0) is given by

⎛ 1/2    0 ⎞
⎝ −1/2   0 ⎠

The geometric meaning is that near the point (0,0,1,1) the system defines a two-dimensional manifold that is locally given by the graph of a function. Its tangent plane is spanned by the vectors (1,0,1/2,−1/2) and (0,1,0,0).
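The matrix computation of Example 32 can be verified with exact rational arithmetic. The Python sketch below uses two small helper functions for 2 × 2 matrices (their names are ours) and evaluates −(∂_y F)⁻¹ ∂_x F at the point (0,0,1,1).

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d)) with nonzero determinant."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return ((d / det, -b / det), (-c / det, a / det))

def multiply_2x2(p, q):
    """Matrix product of two 2x2 matrices."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

dF_dx = ((0, 0), (1, 0))      # partial_x F at (0, 0, 1, 1)
dF_dy = ((2, 2), (1, 3))      # partial_y F at (0, 0, 1, 1)

product = multiply_2x2(inverse_2x2(dF_dy), dF_dx)
derivative = tuple(tuple(-entry for entry in row) for row in product)
```

The result derivative has rows (1/2, 0) and (−1/2, 0), in agreement with the matrix computed in the example.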

Example 33 Consider the system of equations

	x² + y² + z² = 1,
	x + y z + z³ =1.

This is the intersection of a sphere (drawn in light green on Figure 22) with some cubic surface defined by the second equation (drawn in light blue). The point (0,0,1) solves the system and is pictured as a little orange dot. By the implicit function theorem the intersection is a smooth curve (drawn in red) near this point which can be parametrised by the x coordinate. Indeed, we can express y and z along the curve as functions of x because the resulting matrix

∂_(y,z)F(0,0,1) = ⎛ 2y   2z     ⎞ |_{y=0,z=1} = ⎛ 0   2 ⎞
                  ⎝ z    y+3z²  ⎠               ⎝ 1   3 ⎠

is invertible.

Figure 22: Example of the implicit function theorem: the intersection (red) of the unit sphere (green) and a cubic surface (blue).

Exercise 34 Fig. 22 suggests that the intersection curve can alternatively be parametrised by the coordinate y but not by z (why?). Check these claims by verifying the conditions of Thm. 31.

16.3 The Baire Category Theorem and Applications

We are going to see another example of an abstract result which has several non-trivial consequences for real analysis.

16.3.1 Baire’s Categories

Let us first prove the following result and then discuss its meaning and name. We may recognise some techniques similar to the proof of Thm. 68.

Theorem 35 (Baire’s category theorem) Let (X,d) be a complete metric space and U_n a sequence of open dense sets. Then the intersection S=∩_n U_n is dense.

Proof. The proof is rather straightforward. We need to show that any ball B_ε(x₀) contains an element of S. Let us therefore fix x₀ and ε>0. Since U₁ is dense the intersection of B_ε(x₀) with U₁ is non-empty. Thus there exists a point x₁ ∈ B_ε(x₀) ∩ U₁. Now choose ε₁ < ε/2 so that B̄_{ε₁}(x₁) ⊂ B_ε(x₀) ∩ U₁ (note the closure of the ball). Since U₂ is dense, the intersection B_{ε₁}(x₁) ∩ U₂ ⊂ B_ε(x₀) ∩ U₁ ∩ U₂ is non-empty. Choose a point x₂ in it and ε₂ < ε₁/2 such that B̄_{ε₂}(x₂) ⊂ B_{ε₁}(x₁) ∩ U₂ ⊂ B_ε(x₀) ∩ U₁ ∩ U₂. Continue inductively to obtain a sequence x_n such that

B̄_{ε_n}(x_n) ⊂ B_{ε_{n−1}}(x_{n−1}) ∩ U_n ⊂ B_ε(x₀) ∩ U₁ ∩ U₂ ∩ … ∩ U_n,

and ε_n < 2⁻ⁿ ε. In particular, for any n>N we have

x_n ∈ B_{2^{−N} ε}(x_N),

which implies that x_n is a Cauchy sequence. Hence x_n has a limit x, by completeness of (X,d). Consequently, x is contained in the closed ball B̄_{ε_N}(x_N) for any N, and therefore it is contained in B_ε(x₀) ∩ (∩_n U_n), as claimed. □

Completeness is essential here. For example, the conclusion does not hold for the metric space ℚ: take a bijection ψ: ℕ → ℚ, and consider the open dense sets

U_n = { ψ(1), ψ(2), …, ψ(n)}^c = {ψ(n+1), ψ(n+2),… }.

The intersection ∩_n U_n is empty.

The following historic terminology, due to Baire, is in use.

Definition 36 (Baire’s categories) A subset Y of a metric space X is called

nowhere dense if the interior of the closure Ȳ of Y is empty;
of first category if there is a sequence (Y_k) of nowhere dense sets with Y = ∪_k Y_k;
of second category if it is not of first category.

Examples of nowhere dense sets are ℤ ⊂ ℝ, the circle in ℝ², or the set { 1/n ∣ n ∈ ℕ } ⊂ ℝ. Note that the complement of the closure of a nowhere dense set is a dense open set.

Corollary 37 In a complete metric space the complement of a set of the first category is dense.

Proof. This follows from the relations for complements

Y^c = (⋃_k Y_k)^c = ⋂_k Y_k^c ⊃ ⋂_k (Ȳ_k)^c

and the fact that each (Ȳ_k)^c is open and dense, so that Thm. 35 applies. □

The following corollary is also called Baire’s category theorem in some sources:

Corollary 38 A complete metric space is of second category in itself, or plainly speaking it is never the union of a countable number of nowhere dense sets.

The theorem is often used to show abstract existence results. Here is an example.

Theorem 39 There exists a function f ∈ C[0,1] that is nowhere differentiable.

Proof. For each n ∈ ℕ define

U_n = { f ∈ C[0,1] s.t. sup{ | (f(x+h)−f(x))/h | over 0 < |h| ≤ 1/n } > n, ∀ x ∈ [0,1] }.

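We will show that the U_n are open and dense. By Baire’s category theorem (Thm. 35) their intersection is also dense, in particular non-empty. Any f in this intersection has unbounded difference quotients at every point x ∈ [0,1] and is therefore nowhere differentiable.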

U_n is open: Let f ∈ U_n. For each x ∈ [0,1] choose δ_x>0 such that

sup{ | (f(x+h)−f(x))/h | over 0 < |h| ≤ 1/n } > n + δ_x,

hence there is an h_x with 0 < |h_x| ≤ 1/n and

| (f(x+h_x)−f(x))/h_x | > n + δ_x.

By continuity of f there is an open neighborhood I_x of x such that

| (f(y+h_x)−f(y))/h_x | > n + δ_x

for all y ∈ I_x. These I_x form an open cover. We choose a finite subcover (I_{x_k})_{k=1,…,N}. Let δ = min{δ_{x₁}, …, δ_{x_N}} > 0. Then, for y ∈ I_{x_k}:

| (f(y+h_{x_k})−f(y))/h_{x_k} | > n + δ.

Now let g ∈ B_ε(f), where ε>0 is chosen so that ε < 1/2 δ |h_{x_k}| for all k. Then by an ε/3-style argument, for y ∈ I_{x_k}:

| (g(y+h_{x_k}) − g(y)) / h_{x_k} | ≥ | (f(y+h_{x_k}) − f(y)) / h_{x_k} | − 2 || f−g ||_∞ / |h_{x_k}| > n + δ − 2 ε |h_{x_k}|⁻¹ > n,

and therefore g ∈ U_n. We conclude that U_n is open.

U_n is dense: For each ε>0 and f ∈ C[0,1] choose a polynomial p such that || f − p ||_∞ < ε/2 and a sequence of continuous functions g_m ∈ C[0,1] such that || g_m ||_∞ < ε/2 and such that for all x ∈ [0,1]:

sup_{0 < |h| ≤ 1/n} | (g_m(x+h) − g_m(x)) / h | > m

by using a “zigzag” function. Then, for large enough m we have p+g_m ∈ U_n, and || f − (p+g_m) ||_∞ < ε.
(Exercise: why did we approximate f by a polynomial p instead of making the same claim about f+g_m itself?) □

The above proof actually shows much more, namely that the set of nowhere differentiable functions is dense in C[0,1]. It is also useful to compare it with the construction of the continuous nowhere differentiable Weierstrass function and identify some common elements.
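The zigzag perturbation from the density step can be made concrete. The following Python sketch is only an illustration under assumed choices that are not in the text: the polynomial p(x) = x², the values ε = 0.1 and n = 5, the slope 50, and the helper names `zigzag` and `max_abs_quotient`. It samples difference quotients on a finite grid of h, so it illustrates that p + g lies in U_n rather than proving it.

```python
def zigzag(x, amplitude, slope):
    """Triangle wave with values in [0, amplitude] and slopes +slope / -slope."""
    period = 2.0 * amplitude / slope
    t = (x / period) % 1.0
    return amplitude * (2.0 * t if t < 0.5 else 2.0 * (1.0 - t))


def max_abs_quotient(func, x, n, samples=1000):
    """Largest |func(x+h) - func(x)| / |h| over a grid of h with 0 < |h| <= 1/n."""
    best = 0.0
    for i in range(1, samples + 1):
        step = i / (samples * n)
        for h in (step, -step):
            if 0.0 <= x + h <= 1.0:  # stay inside [0, 1]
                best = max(best, abs(func(x + h) - func(x)) / abs(h))
    return best


# Hypothetical data: p(x) = x^2 has |p'| <= 2 on [0, 1]; we aim at U_5.
EPS, N = 0.1, 5
AMPLITUDE = EPS / 4   # sup |g| <= EPS / 4 < EPS / 2
SLOPE = 50.0          # steeper than N + max |p'| = 7


def p(x):
    return x * x


def g(x):
    return zigzag(x, AMPLITUDE, SLOPE)


def perturbed(x):
    return p(x) + g(x)


# Smallest sampled sup of difference quotients over a grid of base points x.
worst = min(max_abs_quotient(perturbed, k / 128, N) for k in range(129))
print(worst > N)
```

The slope of the zigzag has to beat n plus the largest slope of p, which is finite because p is a polynomial; this is the point the exercise above is asking about.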

16.3.2 Banach–Steinhaus Uniform Boundedness Principle

Another consequence of the Baire Category theorem is the Banach–Steinhaus uniform boundedness principle. Recall that, if X and Y are normed spaces, T: X → Y is called a bounded operator if it is a bounded linear map.

Theorem 40 (Banach–Steinhaus Uniform Boundedness Principle) Let X be a Banach space and Y a normed space, and let (T_α)_{α ∈ I} be a family of bounded operators T_α: X → Y. Suppose that

∀ x ∈ X: sup_{α ∈ I} || T_α x || < ∞.

Then we have sup_α || T_α|| < ∞, i.e. the family T_α is bounded in the set B(X,Y) of bounded operators from X to Y.

Proof. Define X_n = {x ∈ X ∣ sup_α || T_αx || ≤ n }. By assumption X = ∪_n X_n. Note that all the X_n are closed. By the Baire category theorem at least one of these sets must have non-empty interior, since otherwise the Banach space X would be a countable union of nowhere dense sets. Hence, there exist N ∈ ℕ, y ∈ X_N, and ε>0 such that B_ε(y) ⊂ X_N. Note that X_N is convex and symmetric under the reflection x ↦ −x. So we get the same statement for −y, that is B_ε(−y) ⊂ X_N. Hence, x ∈ B_ε(0) implies

x = 1/2 ( (x + y) + (x − y) ) ∈ 1/2 ( X_N + X_N ) ⊂ X_N. (104)

This means that || x || ≤ ε implies || T_αx || ≤ N, and therefore || T_α|| ≤ ε⁻¹ N for all α ∈ I. □
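Completeness of X cannot be dropped from the theorem. A standard illustration, which is not taken from the text, uses the space of finitely supported real sequences with the sup-norm (this space is not complete) together with the functionals T_n(x) = n·x_n. In the Python sketch below a finitely supported sequence is modelled as a finite list, and the names `T`, `pointwise_sup` and `unit_vector` are hypothetical.

```python
def T(n, x):
    """T_n(x) = n * x_n for a finitely supported sequence x (list, 1-based n)."""
    return n * x[n - 1] if n <= len(x) else 0.0


def pointwise_sup(x):
    """sup_n |T_n(x)|; finite because T_n(x) = 0 beyond the support of x."""
    return max((abs(T(n, x)) for n in range(1, len(x) + 1)), default=0.0)


def unit_vector(n):
    """The sequence e_n with a single 1 in position n; it has sup-norm 1."""
    return [0.0] * (n - 1) + [1.0]


# Pointwise bounded: every fixed x gives a finite supremum ...
print(pointwise_sup([1.0, -2.0, 0.5]))
# ... but |T_n(e_n)| = n, so the operator norms ||T_n|| are unbounded.
print([T(n, unit_vector(n)) for n in (1, 10, 100)])
```

Since |T_n(x)| ≤ n·||x||_∞ with equality for x = e_n, the norms ||T_n|| = n are not uniformly bounded even though the family is bounded at every point.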

Recall that the Fourier series of a C¹-function on a circle (identified with 2 π-periodic functions) converges uniformly to the function. We will now show that a statement like that cannot hold for all continuous functions.

Corollary 41 There exist continuous periodic functions whose Fourier series do not converge point-wise.

Proof. We will show that there exists a continuous function whose Fourier series does not converge at x=0. Suppose, for contradiction, that no such function exists, so that the Fourier series

a₀ + ∑_{m=1}^∞ ( a_m cos(m x) + b_m sin(m x) )

converges at x=0 for every f ∈ C(S¹) = C_per(ℝ). Here we identify continuous functions on the unit circle with continuous 2 π-periodic functions C_per(ℝ). Hence we have a map

T_n : C(S¹) → ℝ: f ↦ a₀ + ∑_{m=1}^n a_m

by mapping the function f to the n-th partial sum of its Fourier series at x=0. This is a family of bounded operators T_n: C(S¹) → ℝ and by assumption we have for every f that

sup_n | T_n(f) | < ∞.

By the Banach–Steinhaus theorem we have sup_n || T_n || = sup_{n, || f ||_∞=1} | T_n(f) | < ∞. Now one computes the norm of the map

T_n : C(S¹) → ℝ: f ↦ 1/(2π) ∫_{−π}^{π} f(x) ( 1 + 2 ∑_{k=1}^n cos(k x) ) d x = 1/(2π) ∫_{−π}^{π} f(x) D_n(x) d x

where

D_n(x) = sin( (n + 1/2) x ) / sin( x/2 )

is the Dirichlet kernel, cf. Lem. 6. This norm equals 1/(2π) ∫_{−π}^{π} | D_n(x) | d x = 1/(2π) ∫₀^{2π} | D_n(x) | d x (Exercise) which goes to ∞ as n → ∞. Indeed, using 0 ≤ sin(x/2) ≤ x/2 on [0, 2π] and substituting we get

1/(2π) ∫₀^{2π} | D_n(x) | d x
≥ 1/(2π) ∫₀^{2π} |sin((n + 1/2) x)| / (x/2) d x    [since sin s ≤ s for s ≥ 0]
= 1/π ∫₀^{(2n+1)π} |sin t| / t d t    [change of variables t = (n + 1/2) x]
= 1/π ∑_{k=0}^{2n} ∫_{kπ}^{(k+1)π} |sin t| / t d t    [split integral into intervals]
≥ 1/π ∑_{k=0}^{2n} | ∫_{kπ}^{(k+1)π} sin t / ((k+1)π) d t |    [since t ≤ (k+1)π for t ∈ (kπ, (k+1)π)]
= 2/π² ∑_{k=0}^{2n} 1/(k+1)    [evaluating the integral],

which is a partial sum of the harmonic series and therefore diverges as n → ∞. This gives a contradiction. □
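The growth of the norms || T_n || can also be observed numerically. The Python sketch below approximates 1/(2π) ∫₀^{2π} | D_n(x) | d x by a midpoint rule and compares it with a harmonic lower bound, here taken as 2/π² ∑_{k=0}^{2n} 1/(k+1); the constant in front of the sum is this sketch's reading of the estimate, and the function names and the number of quadrature steps are assumptions made for illustration.

```python
import math


def dirichlet_kernel(n, x):
    """D_n(x) = sin((n + 1/2) x) / sin(x / 2), for x not a multiple of 2*pi."""
    return math.sin((n + 0.5) * x) / math.sin(x / 2)


def operator_norm(n, steps=100_000):
    """Midpoint-rule approximation of (1 / (2*pi)) * int_0^{2*pi} |D_n(x)| dx."""
    h = 2 * math.pi / steps
    # Midpoints (i + 1/2) * h never hit the removable singularities 0 and 2*pi.
    total = sum(abs(dirichlet_kernel(n, (i + 0.5) * h)) for i in range(steps))
    return total * h / (2 * math.pi)


def harmonic_lower_bound(n):
    """(2 / pi^2) * sum_{k=0}^{2n} 1 / (k + 1)."""
    return 2 / math.pi ** 2 * sum(1 / (k + 1) for k in range(2 * n + 1))


for n in (1, 4, 16, 64):
    print(n, round(operator_norm(n), 3), round(harmonic_lower_bound(n), 3))
```

Both columns grow slowly with n, roughly like a multiple of log n, which is all the contradiction in the proof needs.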

Another corollary of the Banach–Steinhaus principle is an important continuity statement. Recall that if X and Y are normed spaces then so is the Cartesian product X × Y equipped with the norm || (x,y) || = ( || x ||_X² + || y ||_Y² )^{1/2}. It is easy to see that a sequence (x_n,y_n) converges to (x,y) in this norm if and only if x_n → x and y_n → y.

Theorem 42 Suppose that X, Y are Banach spaces and suppose that B: X × Y → ℝ is a bilinear form on X × Y that is separately continuous, i.e. B(·, y) is continuous on X for every y ∈ Y and B(x,·) is continuous on Y for every x ∈ X. Then B is continuous.

Proof. Suppose that (x_n,y_n) is a sequence that converges to (x,y). First note that

B(x_n−x,y_n−y)= B(x_n,y_n) − B(x_n,y) − B(x,y_n) + B(x,y),

where B(x_n,y) → B(x,y) as well as B(x,y_n) → B(x,y). So it is sufficient to show that B(x_n−x,y_n−y) → 0 or, equivalently, B(x′_n,y′_n) → 0 for any x′_n→ 0 and y′_n→ 0. Now, the linear mappings T_n(x)= B(x,y′_n): X → ℝ are bounded, by assumption. Since y′_n → 0 and B(x,·) is continuous, T_n(x) → 0 for every x ∈ X; in particular the sequence (T_n(x)) is bounded for every x ∈ X. Then, by the Banach–Steinhaus theorem there exists a constant C such that ||T_n|| ≤ C for all n. That is, |T_n(x)| = |B(x, y′_n)| ≤ C ||x|| for all n and x∈ X. Therefore, |B(x′_n,y′_n)| ≤ C ||x′_n|| → 0. □

Remark 43 Recall that already on ℝ² separate continuity does not imply joint continuity for general (non-bilinear) functions. The standard example from Analysis is the function

f(x,y) = x y / (x² + y²) if (x,y) ≠ (0,0), and f(x,y) = 0 if (x,y) = (0,0),

which is continuous in x or y separately but is not jointly continuous.
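The failure of joint continuity at the origin is easy to see numerically. The short Python sketch below evaluates the function of the remark along a coordinate axis, along a fixed horizontal line, and along the diagonal; the sample points are arbitrary choices made for illustration.

```python
def f(x, y):
    """x*y / (x^2 + y^2) away from the origin, and 0 at the origin."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)


points = [10.0 ** (-k) for k in range(1, 9)]

# On the x-axis the function vanishes, so x -> f(x, 0) is continuous at 0.
print([f(t, 0.0) for t in points])
# For fixed y = 0.3 the values f(x, 0.3) tend to f(0, 0.3) = 0 as x -> 0.
print([f(t, 0.3) for t in points])
# Along the diagonal the value is constantly 1/2, although f(0, 0) = 0.
print([f(t, t) for t in points])
```

The last line shows a sequence (t, t) → (0, 0) on which f stays at 1/2, so f cannot be continuous at the origin as a function of both variables.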

16.3.3 The open mapping theorem

Recall that for a continuous map the pre-image of any open set is open. This does of course not mean that the image of any open set is open (for example, sin: ℝ → ℝ has image [−1,1], which is not open). A map f: X → Y between metric spaces is called open if the image of every open set is open. If a map is invertible then it is open if and only if its inverse is continuous. We start with a simple observation for linear maps. We will denote open balls in normed spaces X and Y by B_r^X(x) and B_s^Y(y) respectively, or simply B_r^X and B_s^Y if they are centred at the origin.

Lemma 44 Let X and Y be normed spaces. Then a linear map T: X → Y is open if and only if there exists ε>0 such that B_ε^Y(0) ⊂ T (B₁^X(0)), i.e. the image of the unit ball contains a neighbourhood of zero.

Proof. If the map T is open it clearly has this property. Suppose conversely, that B_ε^Y(0) ⊂ T (B₁^X(0)) for some ε>0. Then, by scaling, B_{ε δ}^Y(0) ⊂ T (B_δ^X(0)) for any δ>0. Suppose that U is open. Suppose that y ∈ T(U), that is there exists x∈ U such that y=T(x). Then there exists δ>0 with x + B_δ^X(0) ⊂ U and therefore

T U ⊃ T B_δ^X(x) = { T x } + T B_δ^X(0) ⊃ { y } + B_{δ ε}^Y(0) = B_{δ ε}^Y(y).

Hence T(U) contains a ball around each of its points, that is, T(U) is open.

□
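In finite dimensions the ε of Lemma 44 can be written down explicitly. For an invertible matrix T on ℝ² with the Euclidean norm one admissible choice is ε = 1/|| T⁻¹ ||, because || T⁻¹ y || ≤ || T⁻¹ || || y || < 1 whenever || y || < ε. The Python sketch below checks this on random points; the matrix and all helper names are hypothetical and chosen only for illustration.

```python
import math
import random


def apply(matrix, v):
    """Matrix-vector product for a 2x2 matrix given as a pair of rows."""
    (a, b), (c, d) = matrix
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])


def inverse(matrix):
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def operator_norm(matrix):
    """Euclidean operator norm: square root of the largest eigenvalue of M^T M."""
    (a, b), (c, d) = matrix
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    return math.sqrt((p + r) / 2 + math.hypot((p - r) / 2, q))


def norm(v):
    return math.hypot(v[0], v[1])


T = ((2.0, 1.0), (0.0, 1.0))  # hypothetical invertible example
T_INV = inverse(T)
EPSILON = 1.0 / operator_norm(T_INV)


def random_point(radius, rng):
    """A random point of the open ball of the given radius around 0."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    length = 0.999 * radius * math.sqrt(rng.random())
    return (length * math.cos(angle), length * math.sin(angle))


rng = random.Random(0)
samples = [random_point(EPSILON, rng) for _ in range(1000)]
# Every sampled y with |y| < EPSILON has its preimage inside the unit ball.
print(all(norm(apply(T_INV, y)) < 1.0 for y in samples))
```

In infinite dimensions no such formula is available in advance, and the existence of ε is exactly what the open mapping theorem below provides.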

Theorem 45 (Open Mapping Theorem) Let T : X → Y be a continuous surjective linear operator between Banach spaces. Then T is open.

Proof. Since T is surjective we have Y = ∪_n T B_n^X. Therefore trivially, Y = ∪_n cl(T B_n^X), where cl denotes the closure in Y. By the Baire category theorem one of the closed sets cl(T B_n^X) must have an interior point. Rescaling implies that cl(T B₁^X) has an interior point y₀.

Since cl(T B₁^X) is symmetric under the reflection y ↦ −y, the point −y₀ must also be an interior point. Therefore, by convexity of cl(T B₁^X) there exists a δ>0 with B_δ^Y ⊂ cl(T B₁^X), cf. (104). By linearity this means B_{δ 2⁻ⁿ}^Y ⊂ cl(T B_{2⁻ⁿ}^X) for any natural n.

We will show that cl(T B₁^X) ⊂ T B₂^X, with the implication from above that B_δ^Y ⊂ T B₂^X, which will complete the proof by the previous Lemma. So, let y ∈ cl(T B₁^X) be arbitrary. Then, there exists x₁ ∈ B₁^X such that y − T x₁ ∈ B_{δ/2}^Y ⊂ cl(T B_{1/2}^X). Repeating this, there exists x₂ ∈ B_{1/2}^X such that y − T x₁ − T x₂ ∈ B_{δ/4}^Y ⊂ cl(T B_{1/4}^X).

Continuing inductively, we obtain a sequence (x_n) with the property that || x_n || < 2⁻ⁿ⁺¹ and

y − ∑_{k=1}^n T x_k ∈ B_{δ 2⁻ⁿ}^Y. (105)

By completeness of X, the absolutely convergent series ∑ x_n converges to an element x ∈ X of norm || x || < 2. By linearity and continuity of T we get from (105) that y = T x. Thus y ∈ T B₂^X. □

If the map T is also injective (and, therefore, bijective with the inverse T⁻¹) we can quickly conclude continuity of T⁻¹.

Corollary 46 Suppose that T: X → Y is a bijective bounded linear map between Banach spaces. Then T has a bounded inverse T⁻¹.

It is not rare that we may have two different norms ||·|| and ||·||_* on the same Banach space X. We say that ||·|| and ||·||_* are equivalent if there are constants c>0 and C>0 such that:

c ||x|| ≤ ||x||_* ≤ C ||x|| for all x ∈ X. (106)

Exercise 47

Check that (106) defines an equivalence relation on the set of all norms on X.
Show that if a sequence is Cauchy/convergent/bounded in a norm then it is also Cauchy/convergent/bounded in any equivalent norm.

Cor. 46 implies that if X is complete with respect to both norms and the identity map (X,||·||)→ (X,||·||_*) is bounded then both norms are equivalent.

Corollary 48 Let (X,||·||) be a Banach space and ||·||_* be a norm on X in which X is complete. If ||·|| ≤ C ||·||_* for some C>0 then the norms are equivalent.

In particular, any two norms on a finite dimensional vector space are equivalent.
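For the familiar norms on ℝⁿ the constants in (106) are explicit: || x ||_∞ ≤ || x ||₂ ≤ || x ||₁ ≤ n || x ||_∞. The Python sketch below checks this chain on random vectors; the dimension, the sample size, the rounding tolerance and the function names are assumptions made for illustration.

```python
import math
import random


def norm_1(x):
    return sum(abs(t) for t in x)


def norm_2(x):
    return math.sqrt(sum(t * t for t in x))


def norm_inf(x):
    return max(abs(t) for t in x)


def chain_holds(x, tol=1e-12):
    """Check norm_inf <= norm_2 <= norm_1 <= n * norm_inf up to rounding."""
    n = len(x)
    slack = 1.0 + tol
    return (norm_inf(x) <= norm_2(x) * slack
            and norm_2(x) <= norm_1(x) * slack
            and norm_1(x) <= n * norm_inf(x) * slack)


random.seed(1)
vectors = [[random.uniform(-10, 10) for _ in range(5)] for _ in range(1000)]
print(all(chain_holds(v) for v in vectors))
```

Reading the chain from both ends gives (106) for the pair || · ||_∞ and || · ||₁ with c = 1 and C = n, and the constant n is attained for the vector (1, …, 1).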

16.3.4 The closed graph theorem

Suppose that X, Y are Banach spaces and suppose that D ⊂ X is a linear subspace (not necessarily closed). Now suppose that T : D → Y is a linear operator. Then the graph gr(T) is defined as the subset {(x,Tx) ∣ x ∈ D} ⊂ X × Y. This is a linear subspace in the Banach space X × Y, which can be equipped with the norm ||(x,y)||² = ||x||_X² + || y||_Y². One often uses the equivalent norm ||(x,y)|| = ||x||_X + || y||_Y but the first choice makes sure that the product X × Y is also a Hilbert space if X and Y are Hilbert spaces. We will refer to T as an operator from X to Y with domain D.

Definition 49 The operator T is called closed if and only if its graph is a closed subset of X × Y.

It is easy to see that T is closed if and only if, for any sequence (x_n) in D, x_n → x and T x_n → y imply that x ∈ D and T x = y. Note the difference with continuity of T: here the convergence of T x_n is part of the assumption, not of the conclusion.

If T is an operator T : D → Y, where D is a subspace of X, then its graph is a subset of X × Y. If we close this subset the resulting set may fail to be the graph of an operator. If the closure is again the graph of an operator, we say that T is closable and its closure is the operator whose graph is obtained by closing the graph of T.

Differential operators are often closed but not bounded. Let L²[a,b] be the Hilbert space of functions such that (C[a,b],|| ·||₂) is its dense subspace, cf. Prop. 60. Then D=C¹[a,b] is a dense subspace in L²[a,b] and the operator d/dx: C¹[a,b] → L²[a,b] is of the above type. This operator is not closed, however it is closable and its closure therefore defines a closed operator with dense domain. We have already seen that this operator is unbounded and therefore it cannot be continuous.
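The unboundedness of d/dx with respect to the L²-norm can be seen on a concrete family. The Python sketch below takes [a,b] = [0,1] and f_k(x) = sin(kπx), for which || f_k′ ||₂ / || f_k ||₂ = kπ, and confirms this ratio with a midpoint rule; the interval, the family f_k and the function names are choices made for this illustration and do not come from the text.

```python
import math


def l2_norm(func, a=0.0, b=1.0, steps=20_000):
    """Midpoint-rule approximation of the L2 norm of func on [a, b]."""
    h = (b - a) / steps
    total = sum(func(a + (i + 0.5) * h) ** 2 for i in range(steps))
    return math.sqrt(total * h)


def derivative_ratio(k):
    """|| f_k' ||_2 / || f_k ||_2 for f_k(x) = sin(k * pi * x) on [0, 1]."""
    def f(x):
        return math.sin(k * math.pi * x)

    def df(x):
        return k * math.pi * math.cos(k * math.pi * x)

    return l2_norm(df) / l2_norm(f)


for k in (1, 10, 100):
    print(k, derivative_ratio(k), k * math.pi)
```

Since the ratio grows without bound in k, there is no constant C with || f′ ||₂ ≤ C || f ||₂ for all f ∈ C¹[0,1].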

Of course, the map x ↦ (x,Tx) is a bijection from D to gr(T). We can use the norm on gr(T) to define a norm on D, which is then

|| x ||_D = ( || x ||_X² + || T x ||_Y² )^{1/2}.

Obviously, T is closed if and only if D with norm ||·||_D is a Banach space. It is easy to check that T continuously maps (D, || · ||_D) to Y. We are now ready to state the closed graph theorem.

Theorem 50 (Closed Graph Theorem) Suppose that X and Y are Banach spaces and suppose that T: X → Y is closed. Then T is bounded.

Proof. Since in this case D = X, we have two norms ||·||_X and || · ||_D on X, and X is complete with respect to both (with respect to || · ||_D because T is closed). Clearly,

||·||_X ≤ ||·||_D,

and by Cor. 48 the norms are therefore equivalent. Hence,

|| T x ||_Y ≤ || x ||_D ≤ C || x ||_X

for some constant C>0. □

16.4 Semi-norms and locally convex topological vector spaces

Definition 51 (Semi-Norm) Let X be a vector space, then a map p: X → ℝ is called a semi-norm if

p(x) ≥ 0 for all x ∈ X,
p(λ x) = |λ| p(x), for all λ ∈ ℝ, x ∈ X,
p(x+y) ≤ p(x) + p(y), for all x,y ∈ X.

An example of a semi-norm on C¹[0,1] is p(f):=|| f ′ ||_∞. If (p_α)_{α ∈ I} is a family of semi-norms with the property that

( ∀ α ∈ I, p_α(x) = 0 ) ⇒ x = 0

then we say X with that family is a locally convex topological vector space. There is a topology (that is, a description of all open sets) on such a vector space, by declaring a subset U ⊂ X to be open if and only if for every point x ∈ U there exist finitely many indices α₁, …, α_k ∈ I and ε>0 such that { y ∣ p_{α_j}(y−x) < ε for j = 1, …, k } ⊂ U. The notion of convergence one gets is x_n → x if and only if p_α(x_n −x) → 0 for all α. The topology of point-wise convergence on the space of functions S → ℝ is for example of this type, with the family of semi-norms given by (p_x)_{x ∈ S}, p_x(f) = | f(x) |.

Another example is the vector space C^∞(ℝ^m) with the topology of uniform convergence of all derivatives on compact sets. Here the family of semi-norms p_{α,K} is indexed by all multi-indices α ∈ ℕ₀^m and all compact subsets K ⊂ ℝ^m and is given by

p_{α,K}(f) = sup_{x ∈ K} | ∂^α f(x) |.

If the family of semi-norms is countable then this topology is actually coming from a metric (so the space is a metric space)

d(x,y) = ∑_{k=1}^∞ 2^{−k} · p_k(x−y) / (1 + p_k(x−y)).

If such a metric space is complete, it is called a Fréchet space. Note that C^∞(ℝ^m) is a Fréchet space because the family of semi-norms above can be replaced by a countable one by taking a countable exhaustion of ℝ^m by compact subsets.
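The metric built from a countable family of semi-norms can be tried out on a small model. The Python sketch below uses vectors of a fixed finite length with the coordinate semi-norms p_k(x) = | x_k |, so the infinite series becomes a finite sum; this truncation, the choice of semi-norms and the names `frechet_metric` and `coordinate_seminorms` are assumptions made for illustration.

```python
import random


def frechet_metric(x, y, seminorms):
    """d(x, y) = sum_k 2^{-k} p_k(x - y) / (1 + p_k(x - y)) over a finite list."""
    diff = [a - b for a, b in zip(x, y)]
    total = 0.0
    for k, p in enumerate(seminorms, start=1):
        value = p(diff)
        total += 2.0 ** (-k) * value / (1.0 + value)
    return total


def coordinate_seminorms(count):
    """The semi-norms p_k(v) = |v_k| for k = 1, ..., count."""
    return [lambda v, i=i: abs(v[i]) for i in range(count)]


random.seed(2)
DIM = 8
P = coordinate_seminorms(DIM)
points = [[random.uniform(-100, 100) for _ in range(DIM)] for _ in range(50)]

# Spot-check the triangle inequality on random triples (up to rounding).
triples = [random.sample(points, 3) for _ in range(500)]
print(all(frechet_metric(x, z, P) <= frechet_metric(x, y, P)
          + frechet_metric(y, z, P) + 1e-12 for x, y, z in triples))
```

Each summand is smaller than 2^{−k}, so d is bounded by 1 however large the semi-norms are; the metric describes the same convergence as the semi-norms but carries no information about scale.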



Last modified: February 16, 2025.