This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Introduction to Functional Analysis
November 6, 2024
Abstract:
These are lecture notes for several courses on Functional Analysis at the School of Mathematics of the University of Leeds. They are based on the notes of Dr. Matt Daws, Prof. Jonathan R. Partington, Dr. David Salinger, and Prof. Alex Strohmaier used in previous years. Some sections are borrowed from textbooks that I have used since I was a student myself. However, all misprints, omissions, and errors are my responsibility alone. I am very grateful to Filipa Soares de Almeida, Eric Borgnet, and Pasc Gavruta for pointing out some of them. Please let me know if you find more.
The notes are also available for download in PDF.
The suggested textbooks are
[, , , ]. The other nice books
with many interesting problems are [, ].
Exercises with stars are not part of the mandatory material but are nevertheless worth knowing about. They are not necessarily difficult, so try to solve them!
Notations and Assumptions
ℤ+, ℝ+ denote the non-negative integers and reals.
x,y,z,… denote vectors.
λ,µ,ν,… denote scalars.
ℜ z, ℑ z stand for the real and imaginary parts of a complex number z.
Integrability conditions
In
this course, the functions we consider will be real or complex valued
functions defined on the real line which are locally Riemann
integrable. This means that they are Riemann integrable on any
finite closed interval [a,b]. (A complex valued function is
Riemann integrable iff its real and imaginary parts are
Riemann-integrable.) In practice, we shall be dealing mainly with
bounded functions that have only a finite number of points of
discontinuity in any finite interval. We can relax the boundedness
condition to allow improper Riemann integrals, but we then require the
integral of the absolute value of the function to converge.
We mention this right at the start to get it out of the way. There are many
fascinating subtleties connected with Fourier analysis, but those connected
with technical aspects of integration theory are beyond the scope of the
course. It turns out that one needs a “better” integral than the Riemann
integral: the Lebesgue integral. I commend the module Linear Analysis 1, which is available to MM students and includes an introduction to that topic
(or you could look it up in Real and Complex Analysis by Walter Rudin). Once one has the Lebesgue integral, one can start
thinking about the different classes of functions to which Fourier analysis
applies: the modern theory (not available to Fourier himself) can even go
beyond functions and deal with generalized functions (distributions) such as
the Dirac delta function which may be familiar to some of you from quantum
theory.
From now on, when we say “function”, we shall assume the conditions of the
first paragraph, unless anything is stated to the contrary.
0 Motivating Example: Fourier Series
0.1 Fourier series: basic notions
Before proceeding with the abstract theory, we consider a motivating
example: Fourier series.
0.1.1 2π-periodic functions
In this part of the course we deal with functions (as above) that are periodic.
We say a function $f:\mathbb{R}\to\mathbb{C}$ is periodic with period $T>0$ if $f(x+T)=f(x)$ for all $x\in\mathbb{R}$. For example, $\sin x$, $\cos x$, $e^{ix}$ $(=\cos x+i\sin x)$ are periodic with period $2\pi$. For $k\in\mathbb{R}\setminus\{0\}$, $\sin kx$, $\cos kx$, and $e^{ikx}$ are periodic with period $2\pi/|k|$. Constant functions are periodic with period $T$, for any $T>0$. We shall specialize to periodic functions with period $2\pi$: we call them $2\pi$-periodic functions, for short. Note that $\cos nx$, $\sin nx$ and $e^{inx}$ are $2\pi$-periodic for $n\in\mathbb{Z}$. (Of course these are also $2\pi/|n|$-periodic for $n\neq 0$.)
Any half-open interval of length T is a fundamental domain of a
periodic function f of period T. Once you know the values of f on the
fundamental domain, you know them everywhere, because any point x in ℝ can
be written uniquely as x=w+nT where n∈ ℤ and w is in the fundamental
domain. Thus f(x) = f(w+(n−1)T +T)=⋯ =f(w+T) =f(w).
For 2π-periodic functions, we shall usually take the fundamental domain to
be ]−π, π]. By abuse of language, we shall sometimes refer to [−π,
π] as the fundamental domain. We then have to be aware that f(π)=f(−π).
0.1.2 Integrating the complex exponential function
We shall need to calculate $\int_a^b e^{ikx}\,dx$, for $k\in\mathbb{R}$. Note first that when $k=0$, the integrand is the constant function 1, so the result is $b-a$. For non-zero $k$,
\[
\int_a^b e^{ikx}\,dx = \int_a^b (\cos kx+i\sin kx)\,dx = \frac{1}{k}\Big[\sin kx-i\cos kx\Big]_a^b = \frac{1}{ik}\Big[\cos kx+i\sin kx\Big]_a^b = \frac{1}{ik}\Big[e^{ikx}\Big]_a^b = \frac{1}{ik}\left(e^{ikb}-e^{ika}\right).
\]
Note that this is exactly the result you would have got by treating $i$ as a real constant and using the usual formula for integrating $e^{ax}$. Note also that the cases $k=0$ and $k\neq 0$ have to be treated separately: this is typical.
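The closed form can be checked numerically. The following is only a minimal sketch: the helper names `integrate_exp` and `closed_form` are our own, and the midpoint rule is just one convenient way to approximate the integral.

```python
import cmath


def integrate_exp(k, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of e^{ikx} over [a, b]."""
    h = (b - a) / steps
    total = 0j
    for j in range(steps):
        x = a + (j + 0.5) * h
        total += cmath.exp(1j * k * x)
    return total * h


def closed_form(k, a, b):
    """b - a for k = 0, and (e^{ikb} - e^{ika}) / (ik) otherwise."""
    if k == 0:
        return complex(b - a)
    return (cmath.exp(1j * k * b) - cmath.exp(1j * k * a)) / (1j * k)


# Compare the two values for a few real k, including the special case k = 0.
for k in (0, 1, 2.5, -3):
    difference = abs(integrate_exp(k, 0.0, 2.0) - closed_form(k, 0.0, 2.0))
    print(k, difference)
```

The printed differences should be tiny for every k, and the separate branch for k = 0 mirrors the case distinction made in the text.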
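Definition 1
Let $f:\mathbb{R}\to\mathbb{C}$ be a $2\pi$-periodic function which is Riemann integrable on $[-\pi,\pi]$. For each $n\in\mathbb{Z}$ we define the Fourier coefficient $\hat{f}(n)$ by
\[
\hat{f}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,e^{-inx}\,dx .
\]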
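Example 3
- $f(x)=c$; then $\hat{f}(0)=c$ and $\hat{f}(n)=0$ when $n\neq 0$.
- $f(x)=e^{ikx}$, where $k$ is an integer; then $\hat{f}(n)=\delta_{nk}$.
- $f$ is $2\pi$-periodic and $f(x)=x$ on $]-\pi,\pi]$. (Diagram)
Then $\hat{f}(0)=0$ and, for $n\neq 0$, integration by parts gives
\[
\hat{f}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} x e^{-inx}\,dx = \frac{1}{2\pi}\left[\frac{-x e^{-inx}}{in}\right]_{-\pi}^{\pi} + \frac{1}{in}\,\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-inx}\,dx = \frac{(-1)^n i}{n} .
\]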
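The coefficients of the sawtooth function can be checked numerically. This is only a sketch, assuming the usual normalisation $\hat{f}(n)=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(x)e^{-inx}\,dx$. The names `fourier_coefficient` and `sawtooth` are our own, and the midpoint rule stands in for the exact integral.

```python
import cmath
import math


def fourier_coefficient(f, n, steps=20000):
    """Midpoint-rule approximation of (1/2pi) * integral of f(x) e^{-inx} over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0j
    for j in range(steps):
        x = -math.pi + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)


def sawtooth(x):
    """f(x) = x on the fundamental domain ]-pi, pi]."""
    return x


# Compare with the value (-1)^n i / n obtained by integration by parts.
for n in range(-3, 4):
    expected = 0 if n == 0 else (-1) ** n * 1j / n
    print(n, fourier_coefficient(sawtooth, n), expected)
```

Note that the coefficients are purely imaginary and satisfy $\hat{f}(-n)=\overline{\hat{f}(n)}$, as one expects for a real-valued odd function.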
Proposition 4 (Linearity)
If $f$ and $g$ are $2\pi$-periodic functions and $c$ and $d$ are complex constants, then, for all $n\in\mathbb{Z}$,
\[
\widehat{(cf+dg)}(n) = c\hat{f}(n) + d\hat{g}(n) .
\]
Corollary 5
If $p(x)$ is a trigonometric polynomial, $p(x)=\sum_{n=-k}^{k} c_n e^{inx}$, then $\hat{p}(n)=c_n$ for $|n|\le k$ and $\hat{p}(n)=0$ for $|n|>k$.
This follows immediately from Ex. 3(2) and Prop. 4.
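Definition 7
$\sum_{n\in\mathbb{Z}}\hat{f}(n)e^{inx}$ is called the Fourier series of the $2\pi$-periodic function $f$.
For real-valued functions, the introduction of complex exponentials seems artificial: indeed they can be avoided as follows. We work with (1) in the case of a finite sum: then we can rearrange the sum as
\[
\begin{aligned}
\hat{f}(0) + \sum_{n>0}\left(\hat{f}(n)e^{inx}+\hat{f}(-n)e^{-inx}\right)
&= \hat{f}(0) + \sum_{n>0}\left[(\hat{f}(n)+\hat{f}(-n))\cos nx + i(\hat{f}(n)-\hat{f}(-n))\sin nx\right] \\
&= \frac{a_0}{2} + \sum_{n>0}(a_n\cos nx + b_n\sin nx).
\end{aligned}
\]
Here
\[
a_n = \hat{f}(n)+\hat{f}(-n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)(e^{-inx}+e^{inx})\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx
\]
for $n>0$ and
\[
b_n = i(\hat{f}(n)-\hat{f}(-n)) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx
\]
for $n>0$. Finally, $a_0=\frac{1}{\pi}\int_{-\pi}^{\pi}f(x)\,dx$, the constant chosen for consistency.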
The an and bn are also called Fourier coefficients: if it is necessary to
distinguish them, we may call them Fourier cosine and
sine coefficients,
respectively.
We note that if $f$ is real-valued, then the $a_n$ and $b_n$ are real numbers and so $\Re\hat{f}(-n)=\Re\hat{f}(n)$, $\Im\hat{f}(-n)=-\Im\hat{f}(n)$: thus $\hat{f}(-n)$ is the complex conjugate of $\hat{f}(n)$. Further, if $f$ is an even function then all the sine coefficients are 0 and if $f$ is an odd function, all the cosine coefficients are zero. We note further that the sine and cosine coefficients of the functions $\cos kx$ and $\sin kx$ themselves have a particularly simple form: $a_k=1$ in the first case and $b_k=1$ in the second. All the rest are zero.
For example, we should expect the $2\pi$-periodic function whose value on $]-\pi,\pi]$ is $x$ to have just sine coefficients: indeed this is the case: $a_n=0$ and $b_n=i(\hat{f}(n)-\hat{f}(-n))=(-1)^{n+1}\,2/n$ for $n>0$.
The above question can then be reformulated as “to what extent is $f(x)$ represented by the Fourier series $a_0/2+\sum_{n>0}(a_n\cos nx+b_n\sin nx)$?” For instance, how well does $\sum_{n>0}(-1)^{n+1}(2/n)\sin nx$ represent the $2\pi$-periodic sawtooth function $f$ whose value on $]-\pi,\pi]$ is given by $f(x)=x$? The easy points are $x=0$, $x=\pi$, where the terms are identically zero. This gives the ‘wrong’ value for $x=\pi$, but, if we look at the periodic function near $\pi$, we see that it jumps from $\pi$ to $-\pi$, so perhaps the mean of those values isn’t a bad value for the series to converge to. We could conclude that we had defined the function incorrectly to begin with and that its value at the points $(2n+1)\pi$ should have been zero anyway. In fact one can show (ref. ) that the Fourier series converges at all other points to the given values of $f$, but I shan’t include the proof in this course. The convergence is not at all uniform (it can’t be, because the partial sums are continuous functions, but the limit is discontinuous). In particular, taking $x=\pi/2$, we get the expansion
\[
\frac{\pi}{4} = 1-\frac{1}{3}+\frac{1}{5}-\frac{1}{7}+\cdots
\]
which can also be deduced from the Taylor series for $\tan^{-1}$.
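One can watch this convergence numerically. The sketch below evaluates partial sums of the sine series $\sum(-1)^{n+1}(2/n)\sin nx$ of the sawtooth function. The function name `sawtooth_partial_sum` is our own, and the chosen numbers of terms are arbitrary.

```python
import math


def sawtooth_partial_sum(x, terms):
    """Sum of (-1)^(n+1) * (2/n) * sin(nx) for n = 1, ..., terms."""
    return sum((-1) ** (n + 1) * (2.0 / n) * math.sin(n * x)
               for n in range(1, terms + 1))


# At x = pi/2 the sums approach f(pi/2) = pi/2, slowly;
# at x = pi every term vanishes, so the sums stay at 0 rather than pi.
for terms in (10, 100, 1000, 10000):
    print(terms,
          sawtooth_partial_sum(math.pi / 2, terms),
          sawtooth_partial_sum(math.pi, terms))
print("pi/2 =", math.pi / 2)
```

At $x=\pi/2$ only the odd terms survive, and the sum is twice the alternating series $1-\frac13+\frac15-\cdots$, which explains the slow convergence.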
0.2 The vibrating string
In this subsection we shall discuss the formal solutions of the wave
equation in a special case which Fourier dealt with in his work.
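We discuss the wave equation
\[
\frac{\partial^2 y}{\partial x^2} = \frac{1}{K^2}\,\frac{\partial^2 y}{\partial t^2} \tag{2}
\]
subject to the boundary conditions
\[
y(0,t) = y(\pi,t) = 0, \tag{3}
\]
for all $t\ge 0$, and the initial conditions
\[
y(x,0) = F(x), \qquad y_t(x,0) = 0 .
\]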
This is a mathematical model of a string on a musical instrument (guitar,
harp, violin) which is of length π and is plucked, i.e. held in the
shape F(x) and released at time t=0. The constant K depends on the
length, density and tension of the string. We shall derive the formal solution
(that is, a solution which assumes existence and ignores questions of
convergence or of domain of definition).
0.2.1 Separation of variables
We first look (as Fourier and others before him did) for solutions of the form
$y(x,t)=f(x)g(t)$. Feeding this into the wave equation (2) we get
\[
f''(x)\,g(t) = \frac{1}{K^2}\,f(x)\,g''(t)
\]
and so, dividing by $f(x)g(t)$, we have
\[
\frac{f''(x)}{f(x)} = \frac{1}{K^2}\,\frac{g''(t)}{g(t)} . \tag{4}
\]
The left-hand side is an expression in $x$ alone, the right-hand side in $t$ alone. The conclusion must be that they are both identically equal to the same constant $C$, say.
We have $f''(x)-Cf(x)=0$ subject to the condition $f(0)=f(\pi)=0$. Working through the method of solving linear second order differential equations tells you that the only non-zero solutions occur when $C=-n^2$ for some positive integer $n$ and the corresponding solutions, up to constant multiples, are $f(x)=\sin nx$.
Returning to equation (4) gives the equation $g''(t)+K^2n^2g(t)=0$ which has the general solution $g(t)=a_n\cos Knt+b_n\sin Knt$. Thus the solutions we get through separation of variables, using the boundary conditions but ignoring the initial conditions, are
\[
y_n(x,t) = \sin nx\,(a_n\cos Knt+b_n\sin Knt),
\]
for $n\ge 1$.
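As a sanity check, one can verify numerically that a separated solution satisfies the wave equation in the form $y_{xx}=K^{-2}y_{tt}$ (the form read off from the separated equation above) together with the boundary conditions. The following is only a sketch: the value of `K`, the step `h` and the helper names are our own choices, and central finite differences replace the exact derivatives.

```python
import math

K = 2.0  # hypothetical value of the constant K, chosen only for illustration


def y_n(n, a, b, x, t):
    """Separated solution sin(nx) * (a cos(Knt) + b sin(Knt))."""
    return math.sin(n * x) * (a * math.cos(K * n * t) + b * math.sin(K * n * t))


def second_difference(g, s, h=1e-4):
    """Central finite-difference approximation of g''(s)."""
    return (g(s + h) - 2 * g(s) + g(s - h)) / h ** 2


def residual(n, a, b, x, t):
    """Approximate value of y_xx - y_tt / K^2, which should be close to zero."""
    y_xx = second_difference(lambda s: y_n(n, a, b, s, t), x)
    y_tt = second_difference(lambda s: y_n(n, a, b, x, s), t)
    return y_xx - y_tt / K ** 2


print(residual(3, 1.0, 0.5, 0.7, 0.4))    # small, up to discretisation error
print(y_n(3, 1.0, 0.5, 0.0, 0.4))         # boundary value at x = 0
print(y_n(3, 1.0, 0.5, math.pi, 0.4))     # boundary value at x = pi, up to rounding
```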
0.2.2 Principle of Superposition
To get the general solution we just add together all the solutions we have got
so far, thus
\[
y(x,t) = \sum_{n=1}^{\infty}\sin nx\,(a_n\cos Knt+b_n\sin Knt) \tag{5}
\]
ignoring questions of convergence. (We can do this for a finite sum without
difficulty because we are dealing with a linear differential equation: the iffy
bit is to extend to an infinite sum.)
We now apply the initial condition $y(x,0)=F(x)$ (note $F$ has $F(0)=F(\pi)=0$). This gives
\[
F(x) = \sum_{n=1}^{\infty} a_n\sin nx .
\]
We apply the reflection trick: the right-hand side is a series of odd functions so if we extend $F$ to a function $G$ by reflection in the origin, giving
\[
G(x) := \begin{cases} F(x), & \text{if } 0\le x\le\pi; \\ -F(-x), & \text{if } -\pi<x<0, \end{cases}
\]
we have
\[
G(x) = \sum_{n=1}^{\infty} a_n\sin nx
\]
for $-\pi\le x\le\pi$.
If we multiply through by $\sin rx$ and integrate term by term, we get
\[
a_r = \frac{1}{\pi}\int_{-\pi}^{\pi} G(x)\sin rx\,dx ,
\]
so, assuming that this operation is valid, we find that the $a_n$ are precisely the sine coefficients of $G$.
(Those of you who took Real Analysis 2 last year may remember that a sufficient condition for integrating term by term is that the series which is integrated is itself uniformly convergent.)
If we now assume, further, that the right-hand side of (5) is differentiable (term by term) we differentiate with respect to $t$, and set $t=0$, to get
\[
0 = y_t(x,0) = \sum_{n=1}^{\infty} b_n K n\sin nx . \tag{6}
\]
This equation is solved by the choice $b_n=0$ for all $n$, so we have the following result.
Proposition 8 (Formal)
Assuming that the formal manipulations are valid, a solution of the differential equation (2) with the given boundary and initial conditions is
\[
y(x,t) = \sum_{n=1}^{\infty} a_n\sin nx\cos Knt , \tag{2.11}
\]
where the coefficients $a_n$ are the Fourier sine coefficients of the $2\pi$-periodic function $G$, defined on $]-\pi,\pi]$ by reflecting the graph of $F$ in the origin.
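To see the formal solution at work, here is a small numerical sketch for one particular initial shape. Everything specific in it is our own assumption: the triangular profile `F` (a string plucked at its midpoint), the value of `K`, the truncation at 100 terms, and the midpoint rule used for the coefficients. Since $G$ is odd, its sine coefficients can be computed as $a_n=\frac{2}{\pi}\int_0^{\pi}F(x)\sin nx\,dx$.

```python
import math

K = 1.0  # hypothetical value of the constant K


def F(x):
    """Hypothetical initial shape on [0, pi]: a string plucked at its midpoint."""
    return x if x <= math.pi / 2 else math.pi - x


def sine_coefficient(n, steps=2000):
    """Midpoint-rule approximation of (2/pi) * integral of F(x) sin(nx) over [0, pi]."""
    h = math.pi / steps
    total = sum(F((j + 0.5) * h) * math.sin(n * (j + 0.5) * h) for j in range(steps))
    return 2.0 / math.pi * total * h


# Truncate the series after 100 terms (an arbitrary cut-off).
COEFFS = [sine_coefficient(n) for n in range(1, 101)]


def y(x, t):
    """Truncated formal solution: sum of a_n sin(nx) cos(Knt)."""
    return sum(a * math.sin(n * x) * math.cos(K * n * t)
               for n, a in enumerate(COEFFS, start=1))


print(COEFFS[0], 4 / math.pi)   # a_1 for this shape is 4/pi
print(y(1.0, 0.0), F(1.0))      # at t = 0 the series reproduces the initial shape
print(y(0.0, 0.7))              # the end of the string stays fixed
```

For this particular shape the coefficients decay like $1/n^2$, so the truncated series converges uniformly and the formal manipulations cause no trouble; for rougher initial shapes this need not be so.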
0.3 Historic: Joseph Fourier
Joseph Fourier, Civil Servant,
Egyptologist, and mathematician, was born in 1768 in Auxerre, France,
son of a tailor. Debarred by birth from a career in the artillery, he
was preparing to become a Benedictine monk (in order to be a teacher)
when the French Revolution violently altered the course of history and
Fourier’s life. He became president of the local revolutionary
committee, was arrested during the Terror, but released at the fall of
Robespierre.
Fourier then became a pupil at the Ecole Normale (the teachers’ academy) in
Paris, studying under such great French mathematicians as Laplace and Lagrange.
He became a teacher at the Ecole Polytechnique (the military academy).
He was ordered to serve as a scientist under Napoleon in Egypt. In 1801,
Fourier returned to France to become Prefect of the Grenoble region. Among his
most notable achievements in that office were the draining of some 20 thousand
acres of swamps and the building of a new road across the Alps.
During that time he wrote an important survey of Egyptian history (“a
masterpiece and a turning point in the subject”).
In 1804 Fourier started the study of the theory of heat conduction, in the
course of which he systematically used the sine-and-cosine series which are
named after him. At the end of 1807, he submitted a memoir on this work to the
Academy of Sciences. The memoir proved controversial both in terms of his use of
Fourier series and of his derivation of the heat equation and was not accepted
at that stage. He was able to resubmit a revised version in 1811: this had
several important new features, including the introduction of the Fourier
transform. With this version of his memoir, he won the Academy’s prize in
mathematics. In 1817, Fourier was finally elected to the Academy of Sciences
and in 1822 his 1811 memoir was published as “Théorie de la Chaleur”.
For more details see Fourier Analysis by T.W. Körner, pp. 475-480, and for even more, see the biography by J. Herivel, Joseph Fourier: the man and the physicist.
What is Fourier analysis? The idea is to analyse functions
(into sine and cosines or, equivalently, complex exponentials) to find
the underlying frequencies, their strengths (and phases) and, where
possible, to see if they can be recombined (synthesis) into the
original function. The answers will depend on the original properties
of the functions, which often come from physics (heat, electronic or
sound waves). This course will give basically a mathematical
treatment and so will be interested in mathematical classes of
functions (continuity, differentiability properties).
1 Basics of Metric Spaces
1.1 Metric Spaces
1.1.1 Metric spaces: definition and examples
In Analysis and Calculus the definition of convergence was based on the notion of a distance between points. Namely, the standard distance between two real numbers is given by
\[
d(x,y) = |x-y| .
\]
Similarly, the distance between two points in the plane is given by
\[
d(x,y) = d((x_1,x_2),(y_1,y_2)) = \sqrt{(x_1-y_1)^2+(x_2-y_2)^2} .
\]
A metric space formalises this notion. This will give us the flexibility to talk about distances on
function spaces, for example, or introduce other notions of distance on spaces.
Definition 1 (Metric Space)
A metric space $(X,d)$ is a set $X$ together with a function $d: X\times X\to\mathbb{R}$ that satisfies the following properties:
- $d(x,y)\ge 0$, and $d(x,y)=0 \iff x=y$ (positive definite);
- $d(x,y)=d(y,x)$ (symmetric);
- $d(x,z)\le d(x,y)+d(y,z)$ (triangle inequality).
The function $d$ is called the metric. The word distance will be used interchangeably with the same meaning.
Example 2
- $X=\mathbb{R}$. The standard metric is given by $d_1(x,y)=|x-y|$. There are many other metrics on $\mathbb{R}$, for example
\[
d(x,y) = \begin{cases} |x-y|, & \text{if } |x-y|\le 1, \\ 1, & \text{if } |x-y|\ge 1. \end{cases}
\]
- Let $X$ be any set whatsoever, then we can define the discrete metric
\[
d_0(x,y) = \begin{cases} 0, & \text{if } x=y, \\ 1, & \text{if } x\neq y. \end{cases}
\]
- $X=\mathbb{R}^m$. The standard metric is the Euclidean metric: if $x=(x_1,x_2,\dots,x_m)$ and $y=(y_1,y_2,\dots,y_m)$ then
\[
d_2(x,y) = \sqrt{(x_1-y_1)^2+(x_2-y_2)^2+\dots+(x_m-y_m)^2} .
\]
This is linked to the inner product (scalar product), $x\cdot y=x_1y_1+x_2y_2+\dots+x_my_m$, since it is just $\sqrt{(x-y)\cdot(x-y)}$. We will study inner products more carefully later, so for the moment we won’t prove the (well-known) fact that it is indeed a metric. Other possible metrics include
\[
d_\infty(x,y) = \max\{|x_1-y_1|,|x_2-y_2|,\dots,|x_m-y_m|\}.
\]
Another metric on $\mathbb{R}^m$ comes from the generalisation of our first example:
\[
d_1(x,y) = |x_1-y_1|+|x_2-y_2|+\dots+|x_m-y_m| .
\]
These metrics $d_1$, $d_2$, $d_\infty$ are all translation-invariant (i.e., $d(x+z,y+z)=d(x,y)$), and positively homogeneous (i.e., $d(kx,ky)=|k|d(x,y)$), see Ex. 8 for further discussion.
- Take $X=C[a,b]$. Here are three metrics similar to the above ones:
\[
d_2(f,g) = \left(\int_a^b |f(x)-g(x)|^2\,dx\right)^{1/2} .
\]
Again, this is linked to the idea of an inner product, so we will delay proving that it is a metric.
\[
d_1(f,g) = \int_a^b |f(x)-g(x)|\,dx ,
\]
the area between two graphs;
\[
d_\infty(f,g) = \max\{|f(x)-g(x)|: a\le x\le b\},
\]
the maximum vertical separation between two graphs.
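The three axioms of a metric can be tested by brute force on a finite sample of points. Such a test can expose a function that is not a metric, but of course it proves nothing for the ones that pass. The sketch below is only an illustration: the function names, the sample points and the non-example `squared` are our own choices.

```python
import itertools


def truncated(x, y):
    """The metric on R equal to |x - y| when this is at most 1, and to 1 otherwise."""
    return min(abs(x - y), 1.0)


def discrete(x, y):
    """The discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0.0 if x == y else 1.0


def squared(x, y):
    """(x - y)^2: symmetric and positive definite, but NOT a metric."""
    return (x - y) ** 2


def violated_axiom(d, points, tol=1e-12):
    """Return a tuple describing the first violated axiom, or None if none is found."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            return ("positive definiteness", x, y)
        if d(x, y) != d(y, x):
            return ("symmetry", x, y)
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return ("triangle inequality", x, y, z)
    return None


SAMPLE = [-2.0, -0.5, 0.0, 0.3, 1.0, 2.5]
for d in (truncated, discrete, squared):
    print(d.__name__, violated_axiom(d, SAMPLE))
```

The squared distance fails the triangle inequality, which is one reason why the Euclidean metric carries a square root.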
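Example 3
On $C[0,1]$ take $f(x)=x$ and $g(x)=x^2$ and calculate
\[
\begin{aligned}
d_2(f,g) &= \left(\int_0^1 (x-x^2)^2\,dx\right)^{1/2} = \sqrt{\frac{1}{30}}, \\
d_1(f,g) &= \int_0^1 |x-x^2|\,dx = \frac{1}{6}, \\
d_\infty(f,g) &= \max_{0\le x\le 1}|x-x^2| = \frac{1}{4}.
\end{aligned}
\]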
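The three distances between $f(x)=x$ and $g(x)=x^2$ on $[0,1]$ can be approximated numerically and compared with the values $1/6$, $\sqrt{1/30}$ and $1/4$ obtained by direct integration and maximisation. This is a sketch only: the function name `distances` is ours, the integrals are replaced by midpoint sums, and the maximum is taken over the sample points rather than the whole interval.

```python
import math


def distances(f, g, a, b, steps=10000):
    """Approximate (d1, d2, d_inf) between f and g on [a, b] from midpoint samples."""
    h = (b - a) / steps
    diffs = [abs(f(a + (j + 0.5) * h) - g(a + (j + 0.5) * h)) for j in range(steps)]
    d1 = sum(diffs) * h
    d2 = math.sqrt(sum(v * v for v in diffs) * h)
    d_inf = max(diffs)  # maximum over the sample points only
    return d1, d2, d_inf


d1, d2, d_inf = distances(lambda x: x, lambda x: x * x, 0.0, 1.0)
print(d1, 1 / 6)
print(d2, math.sqrt(1 / 30))
print(d_inf, 1 / 4)
```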
Example 5
- The interval $[a,b]$ with $d(x,y)=|x-y|$ is a subspace of $\mathbb{R}$.
- The unit circle $\{(x_1,x_2)\in\mathbb{R}^2: x_1^2+x_2^2=1\}$ with $d_2(x,y)=\sqrt{(x_1-y_1)^2+(x_2-y_2)^2}$ is a subspace of $\mathbb{R}^2$.
- The space of polynomials $P$ is a metric space with any of the metrics inherited from $C[a,b]$ above.
Definition 6
A normed space $(V,\|\cdot\|)$ is a real vector space $V$ with a map $\|\cdot\|: V\to\mathbb{R}$ (called norm) satisfying
- $\|v\|\ge 0$, and ($\|v\|=0 \iff v=0$),
- $\|\lambda v\| = |\lambda|\,\|v\|$,
- $\|v+w\|\le\|v\|+\|w\|$.
Exercise 7
Prove that V is a metric space with metric d(v,w):=||v−w||.
Exercise 8
- Write norms $\|\cdot\|_1$, $\|\cdot\|_2$, $\|\cdot\|_\infty$ on $\mathbb{R}^m$ which produce the metrics $d_1$, $d_2$, $d_\infty$ from Ex. 2(3).
Hint: see (11) and (9) below.
- Show that the following are norms on the vector space $V=C[a,b]$:
\[
\|f\|_1 = \int_a^b |f(x)|\,dx, \qquad
\|f\|_2 = \left(\int_a^b |f(x)|^2\,dx\right)^{1/2}, \qquad
\|f\|_\infty = \max\{|f(x)|: a\le x\le b\}.
\]
Furthermore, these norms generate the respective metrics $d_1$, $d_2$ and $d_\infty$ from Ex. 2(4) as indicated in the previous exercise.
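Definition 9
An inner product space $(V,\langle\cdot,\cdot\rangle)$ is a real vector space $V$ with a map $\langle\cdot,\cdot\rangle: V\times V\to\mathbb{R}$ (called inner product) satisfying
- $\langle\lambda v,w\rangle = \lambda\langle v,w\rangle$,
- $\langle v_1+v_2,w\rangle = \langle v_1,w\rangle+\langle v_2,w\rangle$,
- $\langle v,w\rangle = \langle w,v\rangle$,
- $\langle v,v\rangle\ge 0$, and ($\langle v,v\rangle=0 \iff v=0$).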
Exercise 10
- Prove that the Cauchy–Schwarz inequality $|\langle v,w\rangle|^2\le\langle v,v\rangle\langle w,w\rangle$ holds.
Hint: start by considering the expression $\langle v+\lambda w,v+\lambda w\rangle\ge 0$ and analyse the discriminant of the quadratic expression for $\lambda$.
- Then prove that $V$ is a normed space with norm $\|v\| := \langle v,v\rangle^{1/2}$.
- Which of the above norms $\|\cdot\|_1$, $\|\cdot\|_2$, $\|\cdot\|_\infty$ from Ex. 8 can be obtained from an inner product as described in the previous item?
There is a natural name for the class of maps that preserve metrics:
Definition 11 (Isometry)
Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces. A map $\varphi: X\to Y$ is an isometry if
\[
d_Y(\varphi(x_1),\varphi(x_2)) = d_X(x_1,x_2) \quad\text{for all } x_1,x_2\in X.
\]
A metric space $(X,d_X)$ is isometric to a metric space $(Y,d_Y)$ if there is a bijective isometry between $X$ and $Y$.
1.1.2 Open and closed sets
Definition 12 (Open and closed balls)
Let $(X,d)$ be a metric space, let $x\in X$ and let $r>0$. The open ball centred at $x$, with radius $r$, is the set
\[
B_r(x) = \{y\in X: d(x,y)<r\},
\]
and the closed ball is the set
\[
\overline{B}_r(x) = \{y\in X: d(x,y)\le r\}.
\]
Trivial but useful observations are:
- $x\in B_r(x)\subset\overline{B}_r(x)$ for all $x\in X$ and $r>0$, so neither ball is empty and every point is covered by every ball centred at it;
- $\overline{B}_r(x)\subset B_{r+\varepsilon}(x)$ for all $x\in X$, $r>0$ and any $\varepsilon>0$, however small.
Note that in $\mathbb{R}$ with the usual metric the open ball is $B_r(x)=(x-r,x+r)$, an open interval, and the closed ball is $\overline{B}_r(x)=[x-r,x+r]$, a closed interval.
For the $d_2$ metric on $\mathbb{R}^2$, the unit ball, $B_1(0)$, is the disc centred at the origin, excluding the boundary. You may like to think about what you get for other metrics on $\mathbb{R}^2$. What are balls in the discrete metric, Ex. 2(2)?
Definition 13 (Open sets)
A subset $U$ of a metric space $(X,d)$ is said to be open, if for each point $x\in U$ there is an $r>0$ such that the open ball $B_r(x)$ is contained in $U$ (“room to swing a cat”).
Clearly $X$ itself is an open set, that is, the whole metric space is open in itself. The empty set $\emptyset$ is also considered to be open in a trivial way.
Proposition 15
Every “open ball” $B_r(x)$ is an open set.
Proof.
For if $y\in B_r(x)$, choose $\delta=r-d(x,y)$, which is positive because $d(x,y)<r$. We claim that $B_\delta(y)\subset B_r(x)$.
If $z\in B_\delta(y)$, i.e., $d(z,y)<\delta$, then by the triangle inequality
\[
d(z,x)\le d(z,y)+d(y,x) < \delta+d(x,y) = r.
\]
So $z\in B_r(x)$.
□
Definition 16 (Closed set)
A subset F of (X,d) is said to be closed, if its complement X ∖ F is open.
Note that closed does not mean “not open". In a metric space the sets ∅ and X are both
open and closed. In ℝ we have:
- (a,b) is open.
- [a,b] is closed, since its complement (−∞,a) ∪ (b,∞) is open.
- [a,b) is not open, since there is no open ball Br(a) contained in the set. Nor is it
closed, since its complement (−∞,a) ∪ [b,∞) isn’t open (no ball centred at b can be contained in
the complement).
Example 18
If we take the discrete metric, then each one-point set $\{x\}=B_{1/2}(x)$, so it is an open set. Hence every set $U$ is open, since for $x\in U$ we have $B_{1/2}(x)\subseteq U$. Hence, by taking complements, every set is also closed.
Theorem 19
In a metric space, every one-point set {x0} is closed.
Proof.
We need to show that the set U={x ∈ X: x ≠ x0} is open, so take a point x ∈ U.
Now d(x,x0)>0, and the ball Br(x) is contained in U for every 0<r< d(x,x0).
□
Theorem 20
Let $(U_\alpha)_{\alpha\in A}$ be any collection of open subsets of a metric space $(X,d)$ (not necessarily finite!). Then $\bigcup_{\alpha\in A}U_\alpha$ is open.
Let $U$ and $V$ be open subsets of a metric space $(X,d)$. Then $U\cap V$ is open. Hence (by induction) any finite intersection of open subsets is open.
Proof.
If $x\in\bigcup_{\alpha\in A}U_\alpha$ then there is an $\alpha$ with $x\in U_\alpha$. Now $U_\alpha$ is open, so $B_r(x)\subset U_\alpha$ for some $r>0$. Then $B_r(x)\subset\bigcup_{\alpha\in A}U_\alpha$ so the union is open.
If now $U$ and $V$ are open and $x\in U\cap V$, then there are $r>0$ and $s>0$ such that $B_r(x)\subset U$ and $B_s(x)\subset V$, since $U$ and $V$ are open. Then $B_t(x)\subset U\cap V$ if $t\le\min(r,s)$.
□
Thus the collection of open sets is preserved by arbitrary unions and finite intersections.
However, an arbitrary intersection of open sets is not always open; for example $(-1/n,1/n)$ is open for each $n=1,2,3,\dots$, but $\bigcap_{n=1}^{\infty}(-1/n,1/n)=\{0\}$, which is not an open set.
For closed sets we swap union and intersection.
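Theorem 22
Let $(F_\alpha)_{\alpha\in A}$ be any collection of closed subsets of a metric space $(X,d)$ (not necessarily finite!). Then $\bigcap_{\alpha\in A}F_\alpha$ is closed.
Let $F$ and $G$ be closed subsets of a metric space $(X,d)$. Then $F\cup G$ is closed. Hence (by induction) any finite union of closed subsets is closed.
Proof.
To prove this we recall de Morgan’s laws. We use the notation $S^c$ for the complement $X\setminus S$ of a set $S\subset X$.
\[
\begin{aligned}
x\notin\bigcup_\alpha A_\alpha &\iff x\notin A_\alpha \text{ for all } \alpha, &&\text{so } \Big(\bigcup_\alpha A_\alpha\Big)^c = \bigcap_\alpha A_\alpha^c; \\
x\notin\bigcap_\alpha A_\alpha &\iff x\notin A_\alpha \text{ for some } \alpha, &&\text{so } \Big(\bigcap_\alpha A_\alpha\Big)^c = \bigcup_\alpha A_\alpha^c.
\end{aligned}
\]
Write $U_\alpha=F_\alpha^c=X\setminus F_\alpha$, which is open. So $\bigcup_{\alpha\in A}U_\alpha$ is open by Theorem 20. Now, by de Morgan’s laws, $(\bigcap_{\alpha\in A}F_\alpha)^c=\bigcup_{\alpha\in A}F_\alpha^c$. This is just $\bigcup_{\alpha\in A}U_\alpha$. Since the complement of $\bigcap_{\alpha\in A}F_\alpha$ is open, the intersection itself is closed.
Similarly, the complement of $F\cup G$ is $F^c\cap G^c$, which is the intersection of two open sets and hence open by Theorem 20. Hence $F\cup G$ is closed.
□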
Infinite unions of closed sets need not be closed. An example is
\[
\bigcup_{n=1}^{\infty}\left[\tfrac{1}{n},\infty\right) = (0,\infty),
\]
which is open but not closed in $\mathbb{R}$ with the standard metric.
Definition 23 (Closure of a set)
The closure of $S$, written $\overline{S}$, is the smallest closed set containing $S$, that is, it is contained in all other closed sets containing $S$.
The above smallest closed set containing $S$ does exist, because we can define
\[
\overline{S} = \bigcap\{F: F\supset S \text{ and } F \text{ closed}\},
\]
the intersection of all closed sets containing $S$, which is closed by Theorem 22. There is at least one closed set containing $S$, namely $X$ itself.
Example 24
In the metric space ℝ the closure of S=[0,1) is [0,1]. This is closed, and there is nothing smaller that is closed and contains S.
Exercise 25
Give an example of an open ball $B_r(x)$ and the respective closed ball $\overline{B}_r(x)$ with the same centre and radius in a metric space $X$, such that $\overline{B}_r(x)$ is not the closure of $B_r(x)$. Note the slight inconsistency in our notation, which should not mislead us in future.
Definition 26 (Dense subset)
A subset $S\subset X$ is dense in $X$ if $\overline{S}=X$.
Theorem 27
The set ℚ of rationals is dense in ℝ, with the usual metric.
Proof.
Suppose that $F$ is a closed subset of $\mathbb{R}$ which contains $\mathbb{Q}$: we claim that $F=\mathbb{R}$.
For $U=\mathbb{R}\setminus F$ is open and contains no points of $\mathbb{Q}$. But an open set $U$ (unless it is empty) must contain an interval $B_r(x)$ for some $x\in U$, and hence a rational number within it.
Our only conclusion is that $U=\emptyset$ and $F=\mathbb{R}$, so that $\overline{\mathbb{Q}}=\mathbb{R}$.
□
Definition 28 (Neighbourhood)
We say that $V$ is a neighbourhood (nbh) of $x$ if there is an open set $U$ such that $x\in U\subseteq V$; this means that there exists $\delta>0$ such that $B_\delta(x)\subseteq V$. Thus, a set is open precisely when it is a neighbourhood of each of its points.
Example 29
The half-open interval [0,1) is a neighbourhood of every point in it except for 0.
Theorem 30
For a subset $S$ of a metric space $X$, we have $x\in\overline{S}$ iff $V\cap S\neq\emptyset$ for all nbhs $V$ of $x$ (i.e., all neighbourhoods of $x$ meet $S$).
Proof.
If there is a neighbourhood of $x$ that doesn’t meet $S$, then there is an open subset $U$ with $x\in U$ and $U\cap S=\emptyset$. But then $X\setminus U$ is a closed set containing $S$ and so $\overline{S}\subset X\setminus U$, and then $x\notin\overline{S}$ because $x\in U$.
Conversely, if every neighbourhood of $x$ does meet $S$, then $x\in\overline{S}$, as otherwise $X\setminus\overline{S}$ is an open neighbourhood of $x$ that doesn’t meet $S$.
□
Definition 31 (Interior)
The interior of $S$, $\operatorname{int}S$, is the largest open set contained in $S$, and can be written as
\[
\operatorname{int}S = \bigcup\{U: U\subset S \text{ and } U \text{ open}\},
\]
the union of all open sets contained in $S$. There is at least one open set within $S$, namely $\emptyset$.
We see that $S$ is open exactly when $S=\operatorname{int}S$, otherwise $\operatorname{int}S$ is smaller.
Example 32
-
In the metric space ℝ we have int[0,1)=(0,1); clearly this is open and there is no larger open set contained in [0,1).
- intℚ = ∅. For any non-empty open set must contain an interval Br(x) and then it contains
an irrational number, so isn’t contained in ℚ.
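Proposition 33
$\operatorname{int}S = X\setminus\overline{(X\setminus S)}$.
Proof.
By De Morgan’s laws,
\[
\begin{aligned}
\operatorname{int}S &= \bigcup\{U: U\subset S \text{ and } U \text{ open}\} \\
&= X\setminus\bigcap\{U^c: U\subset S \text{ and } U \text{ open}\} \\
&= X\setminus\bigcap\{F: F\supset(X\setminus S) \text{ and } F \text{ closed}\} \\
&= X\setminus\overline{(X\setminus S)}.
\end{aligned}
\]
This is because $U\subset S$ if and only if $U^c=(X\setminus U)\supset(X\setminus S)$. Also $F=U^c$ is closed precisely when $U$ is open. That is, there is a correspondence between open sets contained in $S$ and closed sets containing its complement.
□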
1.1.3 Convergence and continuity
Let (xn) be a sequence in a metric space (X,d), i.e., x1,x2,…. (Sometimes we may start counting at x0.)
Definition 34 (Convergence)
We say $x_n\to x$ (i.e., $x_n$ converges to $x$) if $d(x_n,x)\to 0$ as $n\to\infty$. In other words: $x_n\to x$ if for any $\varepsilon>0$ there exists $N\in\mathbb{N}$ such that for all $n>N$ we have $d(x,x_n)<\varepsilon$.
This is the usual notion of convergence if we think of points in $\mathbb{R}^d$ with the Euclidean metric.
Theorem 35
Let $(x_n)$ be a sequence in a metric space $(X,d)$. Then the following are equivalent:
- $x_n\to x$;
- for every open $U$ with $x\in U$, there exists an $N>0$ such that $x_n\in U$ for all $n>N$;
- for every $\varepsilon>0$ there exists an $N>0$ such that $x_n\in B_\varepsilon(x)$ for all $n>N$.
Proof.
1 ⇒ 2. If $x_n\to x$ and $x\in U$, then there is a ball $B_\varepsilon(x)\subset U$, since $U$ is open. But $x_n\to x$ so $d(x_n,x)<\varepsilon$ for $n$ sufficiently large, i.e., $x_n\in U$ for $n$ sufficiently large.
2 ⇒ 3 is obvious, since $B_\varepsilon(x)$ is an open set containing $x$.
Finally, 3 ⇒ 1. If the third condition holds for a given $\varepsilon>0$, then for large $n$ the inclusion $x_n\in B_\varepsilon(x)$ means $d(x_n,x)<\varepsilon$.
□
Theorem 36
Let $S$ be a subset of the metric space $X$. Then $x\in\overline{S}$ if and only if there is a sequence $(x_n)$ of points of $S$ with $x_n\to x$.
Proof.
If $x\in\overline{S}$, then for each $n$ we have $B_{1/n}(x)\cap S\neq\emptyset$ by Theorem 30. So choose $x_n\in B_{1/n}(x)\cap S$. Clearly $d(x_n,x)\to 0$, i.e., $x_n\to x$.
Conversely, if $x\notin\overline{S}$, then there is an open neighbourhood $U$ of $x$ with $U\cap S=\emptyset$. Now no sequence in $S$ can get into $U$ so it cannot converge to $x$.
□
This can also be phrased as follows, characterising closed sets in terms of sequences.
Corollary 37 (Closedness under taking limits)
A subset $Y\subset X$ of a metric space $(X,d)$ is closed if and only if for every sequence $(x_n)$ in $Y$ that is convergent in $X$ its limit is also in $Y$.
Hence, the closure $\overline{S}$ is obtained from $S$ by adding all possible limit points of sequences in $S$.
Example 38
- Take $(\mathbb{R}^2,d_1)$, where $d_1(x,y)=|x_1-y_1|+|x_2-y_2|$, with $x=(x_1,x_2)$ and $y=(y_1,y_2)$, and consider the sequence $\left(\frac{1}{n},\frac{2n+1}{n+1}\right)$. We guess its limit is $(0,2)$. To see if this is right, look at
\[
d_1\left(\left(\frac{1}{n},\frac{2n+1}{n+1}\right),(0,2)\right) = \left|\frac{1}{n}\right| + \left|\frac{2n+1}{n+1}-2\right| = \frac{1}{n}+\frac{1}{n+1} \to 0
\]
as $n\to\infty$. So the limit is $(0,2)$.
- In $C[0,1]$ let $f_n(t)=t^n$ and $f(t)=0$ for $0\le t\le 1$. Does $f_n\to f$, (a) in $d_1$, and (b) in $d_\infty$?
(a)
\[
d_1(f_n,f) = \int_0^1 t^n\,dt = \frac{1}{n+1} \to 0
\]
as $n\to\infty$. So $f_n\to f$ in $d_1$.
(b)
\[
d_\infty(f_n,f) = \max\{t^n: 0\le t\le 1\} = 1 \not\to 0
\]
as $n\to\infty$. So $f_n\not\to f$ in $d_\infty$.
Note: Say $g_n\to g$ pointwise on $[a,b]$ as $n\to\infty$ if $g_n(x)\to g(x)$ for all $x\in[a,b]$. If we define
\[
g(x) = \begin{cases} 0, & \text{for } 0\le x<1, \\ 1, & \text{for } x=1, \end{cases}
\]
then $f_n\to g$ pointwise on $[0,1]$. But $g\notin C[0,1]$, as it is not continuous at 1.
- Take the discrete metric
\[
d_0(x,y) = \begin{cases} 0, & \text{if } x=y, \\ 1, & \text{if } x\neq y. \end{cases}
\]
Then $x_n\to x \iff d_0(x_n,x)\to 0$. But since $d_0(x_n,x)=0$ or $1$, this happens if and only if $d_0(x_n,x)=0$ for $n$ sufficiently large. That is, there is an $n_0$ such that $x_n=x$ for all $n\ge n_0$.
All convergent sequences in this metric are eventually constant. So, for example, $d_0(1/n,0)\not\to 0$.
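The different behaviour of $f_n(t)=t^n$ in the two metrics is easy to observe numerically. The sketch below approximates the integral distance by a midpoint sum and the maximum distance by a maximum over a grid containing $t=1$. The function names are our own, and the distances are measured to the zero function.

```python
def d1_to_zero(n, steps=10000):
    """Midpoint-rule approximation of the integral of t^n over [0, 1]."""
    h = 1.0 / steps
    return sum(((j + 0.5) * h) ** n for j in range(steps)) * h


def d_inf_to_zero(n, samples=1001):
    """Maximum of t^n over an equally spaced grid on [0, 1] that includes t = 1."""
    return max((j / (samples - 1)) ** n for j in range(samples))


# The first distance behaves like 1/(n+1); the second does not decrease at all.
for n in (1, 10, 100):
    print(n, d1_to_zero(n), 1 / (n + 1), d_inf_to_zero(n))
```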
A result on convergence in $\mathbb{R}^m$.
Proposition 39
Take $\mathbb{R}^2$ with any of the metrics $d_1$, $d_2$ and $d_\infty$. Then a sequence $x_n=(a_n,b_n)$ converges to $x=(a,b)$ if and only if $a_n\to a$ and $b_n\to b$.
Proof.
A useful observation is that for any $x_n$ and $x$:
\[
d_1(x_n,x)\ge d_2(x_n,x)\ge d_\infty(x_n,x).
\]
If $a_n\to a$ and $b_n\to b$, then for any $\varepsilon>0$ there are $N_a$ and $N_b$ such that for $n>N_a$ we have $|a_n-a|<\varepsilon/2$ and for $n>N_b$ we have $|b_n-b|<\varepsilon/2$. Thus for any $n>N=\max(N_a,N_b)$:
\[
\varepsilon > |a_n-a|+|b_n-b| = d_1(x_n,x)\ge d_2(x_n,x)\ge d_\infty(x_n,x),
\]
which shows the convergence in all three metrics.
To show the opposite, assume WLOG that $a_n\not\to a$, that is, there exists $\varepsilon>0$ such that for any $N$ there exists $n>N$ such that $|a_n-a|>\varepsilon$. Then for such $n$:
\[
d_1(x_n,x)\ge d_2(x_n,x)\ge d_\infty(x_n,x) = \max\{|a_n-a|,|b_n-b|\}\ge|a_n-a|>\varepsilon,
\]
showing the divergence in all three metrics.
□
A similar result holds for $\mathbb{R}^m$ in general.
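The chain of inequalities $d_1\ge d_2\ge d_\infty$ used in the proof can be spot-checked on random points, and the sequence $(1/n,(2n+1)/(n+1))$ from Example 38 can be watched converging in all three metrics at once. This is only a sketch: the function names, the random sample and the small tolerance (which absorbs floating-point rounding) are our own choices.

```python
import math
import random


def d1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def d2(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def d_inf(x, y):
    return max(abs(a - b) for a, b in zip(x, y))


def chain_holds(x, y, tol=1e-12):
    """Check d1 >= d2 >= d_inf for one pair of points, up to rounding."""
    return d1(x, y) + tol >= d2(x, y) and d2(x, y) + tol >= d_inf(x, y)


random.seed(0)
PAIRS = [((random.uniform(-5, 5), random.uniform(-5, 5)),
          (random.uniform(-5, 5), random.uniform(-5, 5))) for _ in range(1000)]
print(all(chain_holds(x, y) for x, y in PAIRS))

LIMIT = (0.0, 2.0)
for n in (1, 10, 100, 1000):
    x_n = (1.0 / n, (2.0 * n + 1) / (n + 1))
    print(n, d1(x_n, LIMIT), d2(x_n, LIMIT), d_inf(x_n, LIMIT))
```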
Now let’s look at continuous functions again.
Theorem 40
If $f_n\to f$ in $(C[a,b],d_\infty)$, then $f_n\to f$ in $(C[a,b],d_1)$. Informally speaking, $d_\infty$ convergence is stronger than $d_1$ convergence.
Proof.
$d_\infty(f_n,f)=\max\{|f_n(x)-f(x)|: a\le x\le b\}\to 0$ as $n\to\infty$, so, given $\varepsilon>0$ there is an $N$ so that $d_\infty(f_n,f)<\varepsilon$ for $n\ge N$. It follows that if $n\ge N$ then
\[
d_1(f_n,f) = \int_a^b |f_n(x)-f(x)|\,dx \le \int_a^b \varepsilon\,dx = \varepsilon(b-a),
\]
so $d_1(f_n,f)\to 0$ as $n\to\infty$.
□
Now we look at continuous functions between general metric spaces.
Definition 42 (Continuity)
Let $f:(X,d_X)\to(Y,d_Y)$ be a map between metric spaces. We say that $f$ is continuous at $x\in X$ if for each $\varepsilon>0$ there is a $\delta_{\varepsilon,x}>0$ such that $d_Y(f(x'),f(x))<\varepsilon$ for all $x'\in X$ with $d_X(x',x)<\delta_{\varepsilon,x}$.
Another way of saying the same is that for every $\varepsilon>0$ there exists a $\delta>0$ such that
\[
f(B_\delta(x))\subset B_\varepsilon(f(x)).
\]
The map $f$ is continuous, if it is continuous at all points of $X$.
Theorem 43 (Sequential continuity)
For $f$ as above, $f$ is continuous at $a$ if and only if, whenever a sequence $x_n\to a$, then $f(x_n)\to f(a)$. In short, $f$ is continuous at $a$ if and only if $f$ commutes with the limit:
\[
f\left(\lim_{n\to\infty}x_n\right) = \lim_{n\to\infty}f(x_n) \tag{7}
\]
for any sequence $x_n\to a$.
Proof.
Same proof as in real analysis, more or less. If $f$ is continuous at $a$ and $x_n\to a$, then for each $\varepsilon>0$ we have a $\delta>0$ such that $d_Y(f(x),f(a))<\varepsilon$ whenever $d_X(x,a)<\delta$. Then there’s an $n_0$ with $d_X(x_n,a)<\delta$ for all $n\ge n_0$, and so $d_Y(f(x_n),f(a))<\varepsilon$ for all $n\ge n_0$. Thus $f(x_n)\to f(a)$.
Conversely, if $f$ is not continuous at $a$, then there is an $\varepsilon>0$ for which no $\delta$ will do, so for each $n$ we can find $x_n$ with $d_X(x_n,a)<1/n$, but $d_Y(f(x_n),f(a))\ge\varepsilon$. Then $x_n\to a$ but $f(x_n)\not\to f(a)$.
□
But there is a nicer way to define continuity. For a mapping $f:X\to Y$ and a set $U\subset Y$, let $f^{-1}(U)$ be the set, called the pre-image or inverse image of $U$,
\[
f^{-1}(U) = \{x\in X: f(x)\in U\}.
\]
This makes sense even if $f^{-1}$ is not defined as a function.
Theorem 44 (Continuity and open sets)
A function $f:X\to Y$ is continuous if and only if $f^{-1}(U)$ is open in $X$ for every open subset $U\subset Y$. In short: the inverse image of an open set is open.
Proof.
Suppose that $f$ is continuous, that $U\subset Y$ is open, and that $x_0\in f^{-1}(U)$, so $f(x_0)\in U$. Now there is a ball $B_\varepsilon(f(x_0))\subset U$, since $U$ is open, and then by continuity there is a $\delta>0$ such that $d_Y(f(x),f(x_0))<\varepsilon$ whenever $d_X(x,x_0)<\delta$. This means that for $d_X(x,x_0)<\delta$, $f(x)\in U$ and so $x\in f^{-1}(U)$. That is, $B_\delta(x_0)\subset f^{-1}(U)$, so $f^{-1}(U)$ is open.
Conversely, if the inverse image of an open set is open, and $x_0\in X$, let $\varepsilon>0$ be given. We know that $B_\varepsilon(f(x_0))$ is open, so $f^{-1}(B_\varepsilon(f(x_0)))$ is open, and contains $x_0$. So it contains some $B_\delta(x_0)$ with $\delta>0$. But now if $d_X(x,x_0)<\delta$, we have $x\in B_\delta(x_0)\subset f^{-1}(B_\varepsilon(f(x_0)))$ so $f(x)\in B_\varepsilon(f(x_0))$ and we have $d_Y(f(x),f(x_0))<\varepsilon$.
□
Example 46
Let X=ℝ with the discrete metric, and Y any metric space. Then all functions f: X → Y are continuous! Indeed, in either way:
- Because the inverse image of an open set is an open set, since all sets are open.
- Because whenever xn → x0 we have xn=x0 for n large, so obviously f(xn) → f(x0).
Exercise 47
Which functions from a metric space X to the discrete metric space are continuous?
Which functions from the discrete metric space to ℝ are continuous?
Proposition 48 Let X and Y be metric spaces.
-
A function f : X → Y is continuous if and only if f−1(F) is closed whenever F is a closed
subset of Y.
- If f: X → Y and g: Y → Z are continuous, then so is the composition g ∘ f: X → Z
defined by (g ∘ f)(x) = g(f(x)).
Proof.
- We can do this by complements, as if F is closed, then U=Fc is open, and f−1(F)=f−1(U)c
(a point is mapped into F if and only if it isn’t mapped into U).
Then f−1(F) is always closed when F is closed ⇐⇒ f−1(U) is always open when U is open.
- Take U ⊂ Z open; then (g ∘ f)−1(U) = f−1(g−1(U)); for these are the points which map under f into g−1(U) so that
they map under g ∘ f into U.
Now g−1(U) is open in Y, as
g is continuous, and then f−1(g−1(U)) is open in X since f is continuous.
□
In many cases we may need a stronger notion.
Definition 49 (Uniform continuity)
A function f: (X,dX) → (Y, dY) is called uniformly continuous if
for each ε>0 there exists δε>0 such that whenever x,x′∈ X satisfy
dX(x,x′)≤δε, we have that dY(f(x),f(x′))≤ε.
Note that here the same δε must work for all x∈ X. Thus any uniformly continuous function is continuous at every point. On the other hand, the function f(x)=1/x on (0,1) is continuous but not uniformly continuous.
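The failure of uniform continuity for f(x)=1/x on (0,1) can be made concrete. The sketch below is only an illustration: the helper name `witness` is hypothetical, and the threshold ε=1 is an arbitrary choice. For any proposed δ it returns two points of (0,1) closer than δ whose images differ by at least 1.

```python
def f(x):
    """The function f(x) = 1/x on the interval (0, 1)."""
    return 1.0 / x


def witness(delta):
    """Return x, x' in (0, 1) with |x - x'| < delta but |f(x) - f(x')| >= 1.

    We take x = min(delta, 1/2) and x' = x/2.  Then |x - x'| = x/2 < delta,
    while f(x') - f(x) = 2/x - 1/x = 1/x >= 2.
    """
    x = min(delta, 0.5)
    x_prime = x / 2
    return x, x_prime


for delta in (1.0, 0.1, 1e-3, 1e-6):
    x, x_prime = witness(delta)
    print(delta, abs(x - x_prime), abs(f(x) - f(x_prime)))
```

No single δ works for ε=1, because the pair produced by `witness` can always be pushed closer to 0.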
1.2 Useful properties of metric spaces
Metric spaces may or may not have some useful properties which we are discussing in the following subsections: completeness and compactness.
1.2.1 Cauchy sequences and completeness
Recall that if (X,d) is a metric space, then a sequence (xn) of elements of X converges to x∈ X if d(xn,x) → 0, i.e., if given ε>0 there exists N such that d(xn,x)< ε whenever n ≥ N. Thus, to show that a sequence is convergent from the definition we need to present its limit x which may not belong to the sequence (xn). It would be convenient to deduce convergence of (xn) just through its own properties without a reference to extraneous x. This is possible for complete metric spaces studied in this subsection.
Often we think of convergent sequences as ones where xn and xm are close together
when n and m are large. This is almost, but not quite, the same thing in a general metric space.
Definition 50 (Cauchy Sequence)
A sequence (xn) in a metric space (X,d) is a Cauchy sequence if for any ε>0 there
is an N such that d(xn,xm)<ε for all n, m ≥ N.
Example 51
Take xn=1/n in ℝ with the usual metric. Now d(xn,xm)=|1/n−1/m|.
Suppose that n and m are both at least as big as N; then d(xn,xm) ≤ 1/N.
Hence if ε>0 and we take N>1/ε, we have d(xn,xm)≤ 1/N <ε whenever n and m are both
≥ N.
In fact all convergent sequences are Cauchy sequences, by the following result.
Theorem 52
Suppose that (xn) is a convergent sequence in a metric space (X,d), i.e., there is a limit point x such that d(xn,x) → 0. Then (xn) is a Cauchy sequence.
Proof.
Take ε>0. Then there is an N such that d(xn,x)<ε/2 whenever n ≥ N. Now suppose both n ≥ N and m ≥ N. Then
d(xn,xm) ≤ d(xn,x)+d(x,xm) = d(xn,x)+d(xm,x) < ε/2+ε/2=ε,
and we are done.
□
Proposition 53
Every subsequence of a Cauchy sequence is a Cauchy sequence.
Proof.
If (xn) is Cauchy and (xnk) is a subsequence, then given ε>0 there
is an N such that d(xn,xm) < ε whenever n, m ≥ N. Now there is a K such that nk ≥ N
whenever k ≥ K. So d(xnk,xnl)<ε whenever k, l ≥ K.
□
Does every Cauchy sequence converge?
Example 54
-
(X,d)=ℚ, as a subspace of ℝ with the usual metric. Take x0=2 and
define xn+1=xn/2+1/xn. The sequence continues
3/2, 17/12, 577/408,… and indeed the sequence converges in ℝ as xn → x where x=x/2+1/x, i.e., x2=2.
But this isn’t in ℚ.
Thus (xn) is Cauchy in ℝ, since it converges to √2
when we think of it as a sequence in ℝ.
So it is Cauchy in ℚ, but doesn’t converge to a point of ℚ.
- Easier. Take (X,d)=(0,1). Then (1/n) is a Cauchy sequence in X (since it is Cauchy
in ℝ, as seen above), and has no limit in X.
In each case there are “points missing from X”.
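The first example can be checked with exact rational arithmetic. The following sketch (the function name `newton_sqrt2` is ours, not part of the notes) iterates xn+1=xn/2+1/xn starting from x0=2 and records xn²−2. Every term stays in ℚ, the error shrinks rapidly, and it is never exactly zero.

```python
from fractions import Fraction


def newton_sqrt2(steps):
    """Return [x0, x1, ..., x_steps] for x0 = 2, x_{n+1} = x_n/2 + 1/x_n."""
    x = Fraction(2)
    seq = [x]
    for _ in range(steps):
        x = x / 2 + 1 / x  # stays an exact rational number
        seq.append(x)
    return seq


seq = newton_sqrt2(4)
errors = [x * x - 2 for x in seq]  # exact values of x_n**2 - 2
for x, e in zip(seq, errors):
    print(x, e)
```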
Definition 55 (Completeness)
A metric space (X,d) is complete if every Cauchy sequence in X converges to a limit in X.
Theorem 56
The metric space ℝ is complete.
This is a result from the first year. Since its proof depends on the definition of ℝ we will not demonstrate it here.
Example 58
-
Open intervals in ℝ are not complete; closed intervals are complete.
-
What about C[a,b] with d1, d2 or d∞?
Following our consideration in Ex. 38.2, define fn in C[0,2] by
fn(x) = x^n for 0 ≤ x ≤ 1, and fn(x) = 1 for 1 ≤ x ≤ 2.
Then
d1(fn,fm) = ∫01 |x^n − x^m| dx = |1/(n+1) − 1/(m+1)| → 0 as n, m → ∞,
and hence (fn) is Cauchy in (C[0,2],d1). Does the sequence converge?
If there is an f ∈ C[0,2] with fn → f as n → ∞, then
∫02 |fn(x)−f(x)| dx → 0, so
∫01 and ∫12 both tend to zero. So fn → f in (C[0,1],d1), which means
that f(x)=0 on [0,1] (from an example we did earlier). Likewise, f=1 on [1,2],
which doesn’t give a continuous limit.
- Similarly, (C[a,b],d1) is incomplete in general. Also it is incomplete in the d2 metric,
as the same example shows (a similar calculation with squares of functions). We will see later that it is complete in the d∞ metric.
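The d1 computation for fn(x)=x^n on [0,1] (and fn=1 on [1,2]) can be carried out exactly. The sketch below uses the closed form of the integral; the function names are illustrative assumptions.

```python
from fractions import Fraction


def d1_distance(n, m):
    """Exact d1 distance between f_n and f_m on [0, 2].

    On [1, 2] both functions equal 1, so only [0, 1] contributes.  There
    x**n - x**m has a constant sign, hence the integral of |x**n - x**m|
    equals |1/(n+1) - 1/(m+1)|.
    """
    return abs(Fraction(1, n + 1) - Fraction(1, m + 1))


def d1_to_step(n):
    """d1 distance from f_n to the discontinuous step function that is
    0 on [0, 1) and 1 on [1, 2]: the integral of x**n over [0, 1]."""
    return Fraction(1, n + 1)


print(d1_distance(10, 1000), d1_to_step(1000))
```

The distances between the fn become small, and so does the distance to the step function, which is the only candidate limit and is not continuous.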
If a metric space (X,d) is not complete one can always pass to its abstract completion in the following sense.
Proposition 60 (Abstract completion)
Any metric space (X,d) is isometric to a dense subspace of a complete metric space, which is called the abstract completion of (X,d).
Proof.[Sketch of proof]
We describe a metric space (X′,d′) in which X is isometric to a dense subset. Consider the set of all Cauchy sequences in X.
We define an equivalence relation ∼ on this set by
(xn) ∼ (yn) ⇔ d(xn,yn) → 0.
The set X′ is defined to be the set of equivalence classes [(xn)]. It has a well defined metric given by
d′([(xn)],[(yn)]) := limn→∞ d(xn,yn).
One checks easily that this is a metric and is well defined (it does not depend on the chosen representative (xn) of [(xn)]).
Now there is an injective map X → X′ defined by sending x to the class of the constant sequence (x,x,x,…). This map is an isometry. We can therefore think of (X,d) as a subset of (X′,d′). This subset is dense because every Cauchy sequence can be approximated by a sequence of constant sequences.
So the only difficult bit in this construction is to show that (X′,d′) is complete. We will sketch the construction of a limit here. It turns out that it suffices to verify completeness on a dense set.
Lemma 61
Suppose that (X,d) is a metric space and let Y ⊂ X be a dense set with the property that every Cauchy sequence in Y has a limit in X. Then (X,d) is complete.
Proof.
Let (xn) be a Cauchy sequence in X. Now replace xn with another sequence yn in Y such that d(xn,yn)<1/n. Then, by the triangle inequality, yn
is again a Cauchy sequence and converges, by assumption, to some x ∈ X. Then also xn converges to x.
□
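Let us turn to the proof of completeness of X′. By Lemma 61, applied to the dense subset X ⊂ X′, it suffices to show that every Cauchy sequence in X has a limit in X′. Suppose that (xn) is a Cauchy sequence in X. Then, in X′ this sequence has the form
((x1,x1,…),(x2,x2,…),(x3,x3,…),…). This sequence has a limit in X′, namely, the class of (xn) itself.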
□
Exercise 62 (Extension by continuity)
Let (X,d) be a metric space and X1 be a dense subset of X. Let f: X1 → Y be a uniformly continuous function to a complete metric space (Y,d′).
Show that there is a unique function f′: X → Y which satisfies two properties:
- the restriction of f′ to X1 coincides with f, that is f′(x)=f(x) for all x∈ X1;
- f′ is continuous on X.
Furthermore, it can be shown that f′ is uniformly continuous on X.
We will call f′ the extension of f by continuity and will often keep the same letter f to denote f′.
There are many important consequences of Ex. 62, in particular the following.
Corollary 63
All abstract completions of a metric space (X,d) are isometric; in other words, the abstract completion is unique up to isometry.
1.2.2 Compactness
According to a dictionary, compact means closely and firmly united or packed together. For a metric space the meaning of “closely and firmly united” can be defined in several different forms (through open coverings or through convergent subsequences), and we will see that these interpretations are equivalent.
An open cover of a metric space (X,d) is a family of open sets (Uα)α ∈ I such that ∪α ∈ I Uα = X.
A subcover of a cover is given by a subset I′ ⊂ I of the index set such that
(Uα)α ∈ I′ is still a cover.
Definition 64 (Compactness)
A metric space (X,d) is called compact if every open cover has a finite subcover.
Informally: a space is compact if any infinite open covering is excessive and can be reduced to a finite one. An example of a compact set is [0,1]; examples of non-compact sets are the set of all reals and the open interval (0,1). The importance of this concept is clarified by Rem. 21.
Definition 65 (Sequential Compactness)
A metric space (X,d) is called sequentially compact if every sequence (xn)n ∈ ℕ
in X has a convergent subsequence.
The limit of a convergent subsequence is called an accumulation point of {xn}. It is instructive to compare the definitions:
x is the limit of {xn}: ∀ ε>0 ∃ N ∀ n>N: d(x, xn) < ε;
x is an accumulation point of {xn}: ∀ ε>0 ∀ N ∃ n>N: d(x, xn) < ε.
Therefore, x is not an accumulation point of {xn} if for some ε>0 and some N we have d(x, xn) ≥ ε for all subsequent n>N.
Informally: a space is sequentially compact if there is no room to place infinite number of points sufficiently apart from each other to avoid their condensation to a limit. Taking the sequence xn=n shows that the set of all reals is not sequentially compact.
On the other hand, we know from previous years that in a bounded closed set in ℝn every sequence
has a convergent subsequence. Therefore, bounded closed sets in ℝn
are sequentially compact.
Exercise 66
What are compact sets in a discrete metric space?
What are sequentially compact sets in a discrete metric space?
Lemma 67
Let (X,d) be a sequentially compact metric space. Then for every ε >0 there
exist finitely many points x1,…,xn such that {Bε(xi)∣ i=1,…,n}
is a cover.
Proof.
Suppose this were not the case. Then there would exist an ε>0 such that for any finite number of points x1,…,xn the collection of balls Bε(xi) does not cover X, i.e. ∪i=1n Bε(xi) ≠ X.
Starting with n=1 and then inductively adding points that are in the complement of ∪i=1n Bε(xi), we end up with an infinite sequence of points xi such that d(xi,xk) ≥ ε for all i ≠ k.
This sequence cannot have a Cauchy subsequence (required for convergence), in contradiction with the sequential compactness of X.
□
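The inductive construction in the proof of Lemma 67 can be imitated on a finite sample of points. The sketch below is an illustration under assumptions of our own: the sample lives in the plane with the Euclidean metric, and the name `greedy_eps_net` is hypothetical. A point becomes a new centre only if it lies at distance at least ε from all centres chosen so far, so the centres are ε-separated and their ε-balls cover the sample.

```python
import math


def greedy_eps_net(points, eps):
    """Pick centres greedily so that every point is within eps of a centre.

    A point is added as a centre exactly when it is not covered by the
    open eps-balls around the centres already chosen.
    """
    centres = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centres):
            centres.append(p)
    return centres


# A small grid in the unit square as a stand-in for a compact set.
grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
net = greedy_eps_net(grid, 0.25)
print(len(net), "centres cover", len(grid), "points")
```

In a sequentially compact space this process must stop after finitely many steps, since an infinite ε-separated sequence has no convergent subsequence.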
Theorem 68
A metric space (X,d) is compact if and only if it is sequentially compact.
Proof. We show the two directions separately.
Compactness implies sequential compactness:
Suppose that X is compact and let (xi)i ∈ ℕ be a sequence. We want to show that it has a convergent
subsequence. Suppose (xi) did not have a convergent subsequence. Then no point x is an accumulation point.
Therefore, for each x ∈ X there exists an ε(x)>0 such that xi ∈ Bε(x)(x) for only finitely many i ∈ ℕ. Since
(Bε(x)(x))x ∈ X is an open cover, it has a finite subcover, that is, X is covered by a finite number of balls, each containing xi for only finitely many indices i. This contradicts the fact that the sequence (xi) has infinitely many indices.
Sequential compactness implies compactness: This implication is quite tricky. The proof is again by contradiction. Let us assume our space is sequentially compact and
there exists a cover Uα that does not have a finite subcover.
By the above lemma there are finitely many points x1,…,xN1 such that B1(xi) is a cover.
Each of the balls B1(xi) is covered by Uα as well.
Since our cover does not have a finite subcover one of the balls B1(xi) does not have a finite subcover.
Denote the relevant point xi by z1.
Again there are finitely many points x′1,…,x′N2 such that
B1/2(x′i) is a cover of X.
The collection of sets B1(z1) ∩ B1/2(x′i), with i=1,…,N2, is also a covering of B1(z1). In the same way as before
there is at least one of the x′i (which we call z2) such that B1(z1) ∩ B1/2(z2) cannot be covered by finitely many of the Uα.
Continuing like this we construct a sequence of points zi such that none of the sets
B1(z1) ∩ B1/2(z2) ∩ … ∩ B1/N(zN)
can be covered by finitely many of the Uα.
By assumption the sequence (zi) has a convergent subsequence.
Say z is the limit of that subsequence. Since (Uα) is an open cover, the point z is contained in one of
the Uα, and since this Uα is open, a ball Bε(z) around z is contained in it for some ε>0.
Now we show that
there exists an N ∈ ℕ such that B1/N(zN) is a subset of Uα (this will be the desired contradiction!).
Indeed, choose N among the indices of the convergent subsequence and large enough so that d(zN,z) + 1/N<ε. Then x ∈ B1/N(zN) implies that
d(x,z) ≤ d(zN,z) + d(x,zN) < d(zN,z) + 1/N<ε. This means in particular that
B1(z1) ∩ B1/2(z2) ∩ … ∩ B1/N(zN)
is a subset of Uα. Thus, the set
B1(z1) ∩ … ∩ B1/N(zN)
is covered by the single element Uα. This is a contradiction, as we constructed
the sequence of balls in such a way that these sets cannot be covered by a finite number of the Uα.
□
Definition 69 (Boundedness)
A subset A ⊂ X of a metric space is called bounded if there exists x0 ∈ X and C>0 such that
for all x ∈ A we have d(x0,x) ≤ C.
Theorem 71
Suppose that A ⊂ X is a compact subset of a metric space. Then A is closed and bounded.
Proof.
First we show A is bounded. Choose any x0 ∈ X and note that the family of balls Bn(x0), indexed by n ∈ ℕ, is an open cover of A. Hence, there exists a finite subcover Bn1(x0),…,BnN(x0). Hence, A ⊂ BC(x0), where C = max{n1,…,nN}. Hence, A is bounded.
Next assume that (xk) is a sequence in A that converges in X. Since A is compact there exists a subsequence that converges in A. Hence, the limit of xk must also be in A. Therefore, A is closed.
□
The converse of this statement is not correct in general. It is however famously correct in ℝm.
Theorem 72 (Heine–Borel)
A subset K ⊂ ℝm is compact if and only if it is closed and bounded.
Proof.
We just need to combine the above statements.
We have already shown that compactness implies closedness and boundedness. If K is closed and bounded we know from Analysis that it is sequentially compact. Therefore it is compact.
□
As an illustration of further nice properties of compact spaces we mention the following result:
Exercise 73
-
Any continuous function on a compact set is bounded.
- Any continuous function f: K→ X from a compact space K to a metric space X is uniformly continuous.
2 Basics of Linear Spaces
A person is solely the concentration of an infinite set of
interrelations with another and others, and to separate a person
from these relations means to take away any real meaning of the
life.
Vl. Soloviev
The space around us can be described as a three-dimensional Euclidean
space. To single out a point of that space we need a fixed
frame of reference and three real numbers, which are the
coordinates of the point. Similarly, to describe a pair of
points from our space we could use six coordinates; for three
points, nine; and so on. This makes it reasonable to consider
Euclidean (linear) spaces of an arbitrary finite dimension, which are
studied in courses of linear algebra.
The basic properties of Euclidean spaces are determined by their
linear and metric structures. The linear
space (or vector space) structure allows us
to add and subtract vectors associated to points as well as
to multiply vectors by real or complex numbers (scalars).
The metric space structure assigns a
distance (a non-negative real number) to a pair of points or, equivalently, defines the
length of the vector defined by that
pair. A metric (or, more generally, a topology) is essential for the
definition of core analytical notions like limit or continuity.
The importance of linear and metric (topological) structures in
analysis is sometimes encoded in the formula:
Analysis = Algebra + Geometry.    (8)
On the other hand we could observe that many sets admit a sort of
linear and metric structures which are linked to each other. Just a
few among many other examples are:
- The set of convergent sequences;
- The set of continuous functions on [0,1].
It is a very mathematical way of
thinking to declare
such sets to be spaces and call their elements
points.
But shall we lose all information on a particular element (e.g. a
sequence {1/n}) if we represent it by a shapeless and size-less
“point” without any inner configuration? Surprisingly not: all
properties of an element could now be retrieved not from its inner
configuration but from interactions with other elements through linear
and metric structures. Such a “sociological” approach to
all kinds of mathematical objects was codified in abstract
category theory.
Another surprise is that starting from our three-dimensional Euclidean
space and walking far away along the road of abstraction to infinite-dimensional
Hilbert spaces we arrive at yet another picture
of the surrounding space, this time in the language of
quantum mechanics.
The distance from Manchester to Liverpool is 35 miles—just
about the mileage in the opposite direction!
A tourist guide
to England
2.1 Banach spaces (basic definitions only)
The following definition generalises the notion of distance
known from everyday life.
Definition 1
A metric (or distance function) d on a set M is a function d: M×M → ℝ+ from the
set of pairs to non-negative real numbers such that:
- d(x,y)≥0 for all x, y ∈ M, and d(x,y)=0 if and only if x=y.
- d(x,y)=d(y,x) for all x and y in M.
- d(x,y)+d(y,z)≥ d(x,z) for all x, y, and z in M (triangle inequality).
Exercise 2
Let M be the set of UK cities. Are the following functions
metrics on M?
-
d(A,B) is the price of 2nd class railway ticket from A
to B.
- d(A,B) is the off-peak driving time from A to B.
The following notion is a useful specialisation of a metric adapted to the
linear structure.
Definition 3
Let V be a (real or complex) vector space. A norm on V is a real-valued function, written ||x||, such that
- ||x||≥ 0 for all x∈ V, and ||x||=0 implies x=0.
- ||λ x|| = | λ | ||x|| for all scalars λ and vectors x.
- ||x+y||≤ ||x||+||y|| (triangle inequality).
A vector space with a norm is called a normed space.
The connection between norm and metric is as follows:
Proposition 4
If ||·|| is a norm on V, then it gives a metric on V by d(x,y)=||x−y||.
Figure 1: Triangle inequality in metric (a) and normed (b) spaces.
Proof.
It is a simple exercise to derive items 1–3 of Definition 1 from the corresponding items of Definition 3. For example, see Figure 1 to derive the triangle inequality.
□
Important notions known from real analysis
are limit and convergence. In particular, we usually wish to have enough limit
points for all “reasonable” sequences.
Definition 5
A sequence {xk} in a metric space (M,d) is a Cauchy sequence if for every є>0 there exists an integer n such that k, l>n implies that d(xk,xl)<є.
Definition 6
(M,d) is a complete metric space if every Cauchy sequence in M converges to a limit in M.
For example, the set of integers ℤ and reals
ℝ with the natural distance functions are complete
spaces, but the set of rationals ℚ is not. The complete
normed spaces
deserve a special name.
Definition 7
A Banach space is a complete normed space.
Exercise* 8
A convenient way to define a norm in a Banach space is as
follows. The unit ball U in a normed space B is the set of x such that ||x||≤ 1. Prove that:
- U is a convex set, i.e. for x, y∈ U and λ∈ [0,1] the point λ x +(1−λ)y is also in U.
- ||x||=inf{ λ∈ℝ+ ∣
λ−1x ∈ U}.
- U is closed if and only if the space is Banach.
Figure 2: Different unit balls (i), (ii), (iii) defining norms in ℝ2 from Example 9.
Example 9
Here are some examples of normed spaces.
- l2n is either ℝn or ℂn with norm defined by
||(x1,…,xn)||2 = √( |x1|² + |x2|² + ⋯ + |xn|² ).    (9)
- l1n is either ℝn or ℂn with norm defined by
||(x1,…,xn)||1 = |x1| + |x2| + ⋯ + |xn|.    (10)
- l∞n is either ℝn or ℂn with norm defined by
||(x1,…,xn)||∞ = max( |x1|, |x2|, ⋯, |xn| ).    (11)
-
Let X be a topological space, then Cb(X) is
the space of continuous bounded functions f:
X→ℂ with norm ||f||∞=supX
| f(x) |.
-
Let X be any set, then l∞(X) is the
space of all bounded (not necessarily continuous)
functions f: X→ℂ with norm
||f||∞=supX | f(x) |.
All these normed spaces are also complete and thus are Banach
spaces. Some more examples of both complete and incomplete spaces
shall appear later.
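As a small numerical illustration of the finite-dimensional norms (9)–(11), the sketch below evaluates them for vectors given as Python sequences of real or complex numbers and spot-checks the triangle inequality. The function names are our own choice.

```python
def norm1(x):
    """l_1 norm: sum of absolute values, cf. (10)."""
    return sum(abs(t) for t in x)


def norm2(x):
    """l_2 norm: square root of the sum of squared absolute values, cf. (9)."""
    return sum(abs(t) ** 2 for t in x) ** 0.5


def norm_inf(x):
    """l_infinity norm: largest absolute value, cf. (11)."""
    return max(abs(t) for t in x)


x = (3, -4)
y = (1 + 1j, 2)
s = tuple(a + b for a, b in zip(x, y))
for norm in (norm1, norm2, norm_inf):
    # triangle inequality ||x + y|| <= ||x|| + ||y|| (up to rounding)
    print(norm.__name__, norm(x), norm(s) <= norm(x) + norm(y) + 1e-12)
```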
—We need an extra space to accommodate this product!
A
manager to a shop assistant
2.2 Hilbert spaces
Although metric and norm capture important geometric information about
linear spaces, they are not sensitive enough to represent such
geometric characteristics as angles (particularly
orthogonality). To this end we need a further refinement.
It is known from courses of linear algebra that the scalar product
⟨ x,y ⟩= x1 y1 + ⋯ + xn yn is important in the space
ℝn and defines a norm by ||x||²=⟨ x,x ⟩. Here
is a suitable generalisation:
Definition 10
A scalar product (or inner product) on a real or complex
vector space V is a mapping V×V → ℂ, written ⟨ x,y ⟩, that satisfies:
- ⟨ x,x ⟩ ≥ 0 and ⟨ x,x ⟩ =0 implies x=0.
- ⟨ x,y ⟩ is the complex conjugate of ⟨ y,x ⟩ in complex spaces and
⟨ x,y ⟩ = ⟨ y,x ⟩ in real ones, for all x, y∈ V.
- ⟨ λ x,y ⟩=λ ⟨ x,y ⟩, for all x, y∈ V and scalar λ. (What is ⟨ x,λ y ⟩?)
- ⟨ x+y,z ⟩=⟨ x,z ⟩ + ⟨ y,z ⟩, for all x, y, and z∈ V. (What is ⟨ x, y+z ⟩?)
The last two properties of the scalar product are often encoded in the
phrase: “it is linear in the first variable if we fix the second
and anti-linear in the second if we fix the first”.
Definition 11
An inner product space V is a
real or complex vector space with a scalar product on it.
Example 12
Here are some examples of inner product spaces which demonstrate that the
expression ||x||=√⟨ x,x ⟩ defines a norm.
- The inner product for ℝn was defined in the beginning
of this section. The inner product for ℂn is given
by ⟨ x,y ⟩=∑1n xj ȳj. The norm
||x||=√∑1n | xj |2 makes it
l2n from Example 9.
- The extension for infinite vectors: let l2 be
l2={ sequences {xj}1∞ ∣ ∑j=1∞ |xj|² < ∞}.    (12)
Let us equip this set with the operations of term-wise addition and
multiplication by scalars; then l2 is closed under
them. Indeed, this follows from the triangle inequality
and properties of absolutely convergent series. From the standard
Cauchy–Bunyakovskii–Schwarz
inequality it follows that the series ∑1∞xjȳj
converges absolutely, and its sum is defined to be ⟨ x,y ⟩.
- Let Cb[a,b] be the space of continuous functions
on the interval [a,b]⊂ℝ. As we learn from
Example 9, it is a normed space
with the norm ||f||∞=sup[a,b]| f(x) |. We
could also define an inner product:
⟨ f,g ⟩ = ∫ab f(x)ḡ(x) d x  and  ||f||2 = ( ∫ab |f(x)|² d x )^{1/2}.    (13)
Now we state, probably, the most important inequality in analysis.
Theorem 13 (Cauchy–Schwarz–Bunyakovskii inequality)
For vectors x and y in an inner product space V let us
define ||x||=√⟨ x,x ⟩ and ||y||=√⟨ y,y ⟩; then we have
|⟨ x,y ⟩| ≤ ||x|| ||y||,    (14)
with equality if and only if x and y are scalar multiples of
each other.
Proof.
For simplicity we start from a real vector space.
Suppose we have two vectors u and v and want to define an inner product on
the two-dimensional vector space spanned by them. That is, we need to
know the value of ⟨ au+bv, cu+dv ⟩ for all possible scalars a, b, c, d.
By the linearity ⟨ au+bv, cu+dv
⟩ = ac⟨ u,u
⟩ + (bc+ad)⟨ u,v
⟩ + db⟨ v,v
⟩,
thus everything is defined as soon as we know three inner products
⟨ u,u
⟩, ⟨ u,v
⟩ and ⟨ v,v
⟩. First of all we need to demand ⟨ u,u
⟩ ≥ 0 and
⟨ v,v
⟩ ≥ 0.
Furthermore, they shall be such that ⟨ au+bv, au+bv
⟩ ≥ 0 for all
scalar a and b. If a=0, that is reduced to the previous case ⟨ v,v
⟩ ≥ 0.
If a is non-zero we note ⟨ au+bv, au+bv
⟩ = a2 ⟨ u+(b/a)v, u+(b/a)v
⟩
and letting λ = b/a we reduce our consideration to the quadratic expression
⟨ u+λ v, u+λ v
⟩ = λ 2⟨ v,v
⟩+2λ ⟨ u,v
⟩+⟨ u,u
⟩.
|
The graph of this function of λ is an upward parabola because ⟨ v,v ⟩ ≥ 0. Thus, it will be non-negative for all λ if its
lowest value is non-negative. From the theory of quadratic
expressions, the latter is achieved at λ =−⟨ u,v ⟩/⟨ v,v ⟩
and is equal to
(⟨ u,v ⟩²/⟨ v,v ⟩²) ⟨ v,v ⟩ − 2 (⟨ u,v ⟩/⟨ v,v ⟩) ⟨ u,v ⟩ + ⟨ u,u ⟩ = −⟨ u,v ⟩²/⟨ v,v ⟩ + ⟨ u,u ⟩.
If −⟨ u,v
⟩2/⟨ v,v
⟩+⟨ u,u
⟩ ≥ 0 then ⟨ v,v
⟩⟨ u,u
⟩ ≥ ⟨ u,v
⟩2.
Therefore, the Cauchy-Schwarz inequality is necessary and sufficient
condition for the non-negativity of the inner product defined by the
three values ⟨ u,u
⟩, ⟨ u,v
⟩ and ⟨ v,v
⟩.
After the previous discussion it is easy to get the result for a complex vector space as well. For any x, y∈ V and any t∈ℝ we have:
0 ≤ ⟨ x+t y,x+t y ⟩ = ⟨ x,x ⟩ + 2t ℜ ⟨ y,x ⟩ + t²⟨ y,y ⟩.
Thus, the discriminant of this quadratic expression in t is
non-positive: (ℜ ⟨ y,x ⟩)²−||x||²||y||²≤ 0,
that is | ℜ ⟨ x,y ⟩ |≤||x|| ||y||. Replacing y
by eiαy for an arbitrary α∈[−π,π] we get | ℜ
(eiα⟨ x,y ⟩) | ≤||x|| ||y||; choosing α so that eiα⟨ x,y ⟩ is real and non-negative, this
implies the desired inequality.
□
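The inequality (14) and its equality case can be spot-checked numerically in ℂn with the inner product ⟨ x,y ⟩=∑ xj ȳj. This is only a sanity check on sample vectors of our own choosing, not a proof; the helper names are assumptions.

```python
def inner(x, y):
    """Inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def norm(x):
    """Norm induced by the inner product; <x, x> is real and non-negative."""
    return abs(inner(x, x)) ** 0.5


x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
print(abs(inner(x, y)), "<=", norm(x) * norm(y))

# Equality case: y2 is a scalar multiple of x.
y2 = [2j * a for a in x]
print(abs(inner(x, y2)), "==", norm(x) * norm(y2))
```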
Corollary 14
Any inner product space is a normed space with norm
||
x||=√
⟨ x,x
⟩ (hence also a metric space,
Prop. 4).
Proof.
It suffices to check items 1–3 of Definition 3.
□
Again, complete inner product spaces deserve a special name.
Definition 15
A complete inner product space is called a Hilbert space.
The relations between spaces introduced so far are as follows:
Hilbert spaces | ⇒ | Banach spaces | ⇒ | Complete
metric spaces |
⇓ | | ⇓ | | ⇓ |
inner product spaces | ⇒ | normed spaces | ⇒ | metric spaces.
|
How can we tell if a given norm comes from an inner product?
Figure 3: To the parallelogram identity. |
Theorem 16 (Parallelogram identity)
In an inner product space H we have for all x and y∈ H (see Figure 3):
||x+y||² + ||x−y||² = 2||x||² + 2||y||².    (15)
Proof.
Just by linearity of inner product:
⟨ x+y,x+y
⟩+⟨ x−y,x−y
⟩=2⟨ x,x
⟩+2⟨ y,y
⟩,
|
because the cross terms cancel out.
□
Exercise 17
Show that (15) is also a sufficient condition for a norm to
arise from an inner product. Namely, for a norm on a complex Banach
space satisfying (15) the formula
⟨ x,y ⟩ = ¼ ( ||x+y||² − ||x−y||² + i||x+iy||² − i||x−iy||² )    (16)
 = ¼ ∑k=03 i^k ||x+i^k y||²
defines an inner product. What is a suitable formula for a real
Banach space?
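Both the parallelogram identity (15) and the polarization formula (16) can be tried out numerically in ℂn. In the sketch below (all function names are ours) the defect of (15) vanishes for the l2 norm but not for the l1 norm, and (16) applied to the l2 norm recovers ⟨ x,y ⟩=∑ xj ȳj on a sample pair of vectors.

```python
import math


def norm1(x):
    return sum(abs(t) for t in x)


def norm2(x):
    return math.sqrt(sum(abs(t) ** 2 for t in x))


def inner(x, y):
    """Standard inner product on C^n, conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def parallelogram_defect(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2; zero iff (15) holds."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2


def polarization(norm, x, y):
    """Right-hand side of (16): (1/4) * sum over u in {1, i, -1, -i} of u*||x+u*y||^2."""
    units = (1, 1j, -1, -1j)
    return 0.25 * sum(u * norm([a + u * b for a, b in zip(x, y)]) ** 2
                      for u in units)


e1, e2 = [1, 0], [0, 1]
print(parallelogram_defect(norm2, e1, e2), parallelogram_defect(norm1, e1, e2))
x, y = [1 + 1j, 2], [3, -1j]
print(polarization(norm2, x, y), inner(x, y))
```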
Divide and rule!
Old but still much used recipe
2.3 Subspaces
To study Hilbert spaces we may use the traditional mathematical
technique of analysis and synthesis: we split the
initial Hilbert space into smaller and probably simpler subsets,
investigate them separately, and then reconstruct the entire picture
from these parts.
As known from linear algebra, a linear subspace is a subset
of a linear space which inherits the linear structure,
i.e. the possibility to add vectors and multiply them by scalars. In this
course we also need subspaces to inherit the topological structure
(coming either from a norm or an inner product).
Definition 18
By a subspace of a normed space (or inner product space) we
mean a linear subspace with the same norm (inner product
respectively). We write X⊂ Y or X ⊆ Y.
Example 19
-
Cb(X) ⊂ l∞(X) where X
is a metric space.
- Any linear subspace of ℝn or ℂn
with any norm given in items 1–3 of Example 9.
- Let c00 be the space of finite sequences,
i.e. all sequences (xn)
for which there exists N with xn=0 for n>N. This is a
subspace of l2 since ∑1∞| xj |2
is a finite sum, so finite.
We also wish that both inherited structures (linear and
topological) should be in agreement, i.e. the subspace should be
complete. Such inheritance is linked to the property of being closed.
A subspace need not be closed. For example, the sequence
x=(1, 1/2, 1/3, 1/4, …) belongs to l2 because ∑ 1/k² < ∞,
and xn=(1, 1/2,…, 1/n, 0, 0,…)∈ c00 converges to x in l2. Thus x belongs to the closure of
c00 in l2, although x ∉ c00.
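The convergence of the truncations xn=(1, 1/2,…, 1/n, 0, 0,…) to x=(1, 1/2, 1/3, …) in l2 rests on the tail estimate ||x−xn||² = ∑k>n 1/k² < 1/n. A rough numerical check is sketched below; the tail is truncated at a finite cutoff (an assumption made only to keep the sum finite), so the computed value is a lower estimate of the true tail.

```python
def tail_sum(n, cutoff):
    """Partial tail sum_{k=n+1}^{cutoff} 1/k**2.

    This is a lower estimate of ||x - x_n||**2 in l_2; the full tail is
    bounded above by 1/n because 1/k**2 < 1/(k-1) - 1/k.
    """
    return sum(1.0 / (k * k) for k in range(n + 1, cutoff + 1))


for n in (10, 100, 1000):
    print(n, tail_sum(n, 10**5), 1 / n)
```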
Proposition 20
-
Any closed subspace of a Banach/Hilbert space is complete,
hence also a Banach/Hilbert space.
- Any complete subspace is closed.
- The closure of subspace is again a subspace.
Proof.
- This is true in any complete metric space X: any Cauchy sequence from
Y has a limit x ∈ X belonging to Ȳ, but if Y
is closed then x ∈ Y.
- Let Y be complete and x∈ Ȳ; then there is a
sequence xn→ x in Y and it is a Cauchy sequence.
Then completeness of Y implies x∈ Y.
- If x, y∈ Ȳ then there are xn and yn
in Y such that xn→ x and yn→ y.
From the triangle inequality:
||(xn+yn)−(x+y)|| ≤ ||xn−x|| + ||yn−y|| → 0,
so xn+yn→ x+y and x+y∈ Ȳ.
Similarly x∈Ȳ implies λ x ∈Ȳ for any
λ.
□
Hence c00 is an incomplete inner product space,
with inner product ⟨ x,y
⟩=∑1∞xk ȳk (this
is a finite sum!) as it is not closed in l2.
Figure 4: Jump function in (b) as an L2 limit of continuous functions from (a).
Similarly C[0,1] with the inner product norm
||f||=(∫01 | f(t) |2 dt)1/2 is
incomplete. Take the larger space X of functions continuous on
[0,1] except for a possible jump at 1/2 (i.e. left and
right limits exist but may be unequal, and
f(1/2)=limt→1/2+ f(t)).
Then the sequence of functions defined in
Figure 4(a) has the limit shown in
Figure 4(b) since:
| ⎪⎪
⎪⎪ | f−fn | ⎪⎪
⎪⎪ | = | |
| ⎪
⎪ | f−fn | ⎪
⎪ | 2 dt < | | → 0.
|
Obviously f belongs to the closure of C[0,1] in X but not to C[0,1] itself.
Exercise 21
Show alternatively that the sequence of functions fn from
Figure 4(a) is a Cauchy sequence in C[0,1]
but has no continuous limit.
Similarly the space C[a,b] is
incomplete for any a<b if equipped with the inner product
and the corresponding norm:
⟨ f,g ⟩ = ∫ab f(t)ḡ(t) d t,    (17)
||f||2 = ( ∫ab |f(t)|² d t )^{1/2}.    (18)
Definition 22
Define a Hilbert space L2[a,b] to be the
smallest complete inner product space containing the space
C[a,b] with the restriction of the inner product given
by (17).
It is practical to realise L2[a,b] as a certain space
of “functions” with the inner product defined via an integral. There are
several ways to do that and we mention just two:
- Elements of L2[a,b] are equivalence classes of
Cauchy sequences f(n) of functions from
C[a,b].
- Let integration be extended from the
Riemann definition
to the wider Lebesgue
integration (see
Section 13). Let L be the set of functions on [a,b] that are square
integrable in the Lebesgue sense, with a finite norm (18). Then
L2[a,b] is the quotient space of L with respect to
the equivalence relation f∼ g ⇔ ||f−g||2=0.
Example 23
Let the Cantor function on [0,1]
be defined as follows:
f(t) = 1 for t ∈ ℚ, and f(t) = 0 for t ∈ ℝ∖ℚ.
This function is not
integrable in the Riemann sense but
does have the Lebesgue integral. The latter however is equal
to 0,
and as an L2-function the Cantor function is
equivalent to the function identically equal to 0.
- The third possibility is to map L2(ℝ)
onto a space of “true” functions but with an additional
structure. For example, in quantum mechanics it is useful to
work with the Segal–Bargmann
space of
analytic functions on
ℂ with the inner product:
⟨ f1,f2 ⟩ = ∫ℂ f1(z) f̄2(z) e^{−|z|²} d z.
Theorem 24
The sequence space l2 is complete, hence a Hilbert space.
Proof.
Take a Cauchy sequence x(n)∈ l2, where x(n)=(x1(n), x2(n), x3(n), … ). Our proof
will have three steps: identify the limit x; show it is in l2; show x(n)→ x.
- If x(n) is a Cauchy sequence in l2 then
xk(n) is also a Cauchy sequence of numbers for any fixed
k:
|xk(n)−xk(m)| ≤ ( ∑j=1∞ |xj(n)−xj(m)|² )^{1/2} = ||x(n)−x(m)|| → 0.
Let xk be the limit of xk(n).
- For a given є>0 find n0 such that
||x(n)−x(m)||<є for all n,m>n0. For any
K and any n, m>n0:
∑k=1K |xk(n)−xk(m)|² ≤ ||x(n)−x(m)||² < є².
Let m→ ∞, then ∑k=1K
| xk(n)−xk |² ≤ є².
Let K→ ∞, then ∑k=1∞| xk(n)−xk |² ≤ є². Thus
x(n)−x∈l2 and, because l2 is a
linear space, x = x(n)−(x(n)−x) is also in
l2.
- We saw above that for any є >0 there is n0
such that ||x(n)−x||≤є for all n>n0. Thus
x(n)→ x.
Consequently
l2 is complete.
□
All good things are covered by a thick layer
of chocolate (well, if something is not yet–it certainly will)
2.4 Linear spans
As was explained in the introduction to Section 2, we
describe “internal” properties of a vector through its relations to
other vectors. For a detailed description we need sufficiently many
external reference points.
Let A be a subset (finite or infinite) of a normed space V. We
may wish to upgrade it to a linear subspace in order to make it
subject to our theory.
Definition 25
The linear span of A, written Lin(A), is
the intersection of all linear subspaces of V containing A,
i.e. the smallest subspace containing A, equivalently the set of all finite
linear combinations of elements of A.
The closed linear span of A, written CLin(A), is
the intersection of all closed linear subspaces of V containing A,
i.e. the smallest closed subspace containing A.
Exercise* 26
-
Show that if A is a subset of a finite-dimensional space then Lin(A)=CLin(A).
- Show that for an infinite A the spaces Lin(A) and
CLin(A) could be different. (Hint: use
Example 3.)
Proposition 27
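$\overline{\mathrm{Lin}(A)}=\mathrm{CLin}(A)$, where the bar denotes the closure.
Proof.
Clearly $\overline{\mathrm{Lin}(A)}$ is a closed subspace containing A, thus it
should contain CLin(A). Also Lin(A)⊂ CLin(A), thus
$\overline{\mathrm{Lin}(A)}\subset\overline{\mathrm{CLin}(A)}=\mathrm{CLin}(A)$. Therefore
$\overline{\mathrm{Lin}(A)}=\mathrm{CLin}(A)$.
□
Consequently CLin(A) is the set of all limit points of finite
linear combinations of elements of A.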
The following simple result will be used later many times without
comments.
Lemma 30 (about Inner Product Limit)
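Suppose H is an inner product space and sequences xn and
yn have limits x and y respectively. Then
⟨xn,yn⟩→⟨x,y⟩
or equivalently:
$$\lim_{n\to\infty}\langle x_n,y_n\rangle = \Big\langle \lim_{n\to\infty}x_n,\ \lim_{n\to\infty}y_n\Big\rangle.$$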
Proof.
Obviously, by the Cauchy–Schwarz inequality:
$$\begin{aligned}
|\langle x_n,y_n\rangle-\langle x,y\rangle| &= |\langle x_n-x,y_n\rangle+\langle x,y_n-y\rangle| \\
&\le |\langle x_n-x,y_n\rangle| + |\langle x,y_n-y\rangle| \\
&\le \|x_n-x\|\,\|y_n\| + \|x\|\,\|y_n-y\| \to 0,
\end{aligned}$$
since ||xn−x||→ 0, ||yn−y||→ 0,
and ||yn|| is bounded.
□
3 Orthogonality
Pythagoras is forever!
The catchphrase from TV commercial
of
Hilbert Spaces course
As was mentioned in the introduction, a Hilbert space is an analogue
of our 3D Euclidean space, and the theory of Hilbert spaces is similar to
plane or space geometry. One of the primary results of Euclidean
geometry, which still survives in the high school curriculum despite its continuous
nasty de-geometrisation, is Pythagoras’ theorem based on the notion of
orthogonality1.
So far we were concerned only with distances between points. Now we
would like to study angles between vectors and notably right
angles. Pythagoras’ theorem states that if the angle C in a
triangle is right then c²=a²+b², see
Figure 5.
Figure 5: The Pythagoras’ theorem c2=a2+b2 |
It is a very mathematical way
of thinking to turn this property of
right angles into their definition, which will work even in
infinite dimensional Hilbert spaces.
Look for a triangle, or
even for a right triangle
A
universal advice in solving problems from elementary geometry.
3.1 Orthogonal System in Hilbert Space
In inner product spaces it is even more convenient to give a definition of
orthogonality not from Pythagoras’ theorem but from an equivalent
property of inner product.
Definition 1
Two vectors x and y in an inner product space are
orthogonal
if ⟨
x,
y
⟩=0
,
written x ⊥
y. An orthogonal sequence (or
orthogonal system) en (finite
or infinite) is one in which en ⊥ em whenever n≠
m.
An orthonormal sequence (or
orthonormal system) en is
an orthogonal sequence with ||en||=1 for all n.
Exercise 2
-
Show that if x ⊥ x then x=0 and consequently x
⊥ y for any y∈ H.
-
Show that if all vectors of an orthogonal system are non-zero
then they are linearly independent.
Example 3 These are orthonormal sequences:
-
Basis vectors (1,0,0), (0,1,0), (0,0,1) in
ℝ3 or ℂ3.
-
Vectors en=(0,…,0,1,0,…) (with the only 1 on
the nth place) in l2. (Could you see a
similarity with the previous example?)
-
Functions en(t)=(2π)^{−1/2} e^{int},
n∈ℤ in C[0,2π]:
$$\langle e_n,e_m\rangle = \frac{1}{2\pi}\int_0^{2\pi} e^{int}e^{-imt}\,dt = \begin{cases} 1, & n=m;\\ 0, & n\neq m.\end{cases} \tag{19}$$
Exercise 4
Let A be a subset of an inner product space V and x⊥
y for any y∈
A. Prove that x⊥
z for all z∈
CLin(
A)
.
Theorem 5 (Pythagoras’)
If x ⊥ y then ||x+y||²=||x||²+||y||². Also if
e1, …, en is orthonormal then
$$\Big\|\sum_{k=1}^{n} a_k e_k\Big\|^2 = \Big\langle \sum_{k=1}^{n} a_k e_k,\ \sum_{k=1}^{n} a_k e_k\Big\rangle = \sum_{k=1}^{n} |a_k|^2.$$
Proof.
A one-line calculation.
□
The following theorem provides an important property of Hilbert spaces
which will be used many times. Recall, that a subset K of a linear
space V is convex if for all x,
y∈ K and λ∈ [0,1] the point λ x
+(1−λ)y is also in K. Particularly any subspace is convex
and any unit ball as well (see Exercise 1).
Theorem 6 (about the Nearest Point)
Let K be a non-empty convex closed subset of a Hilbert space
H. For any point x∈
H there is the unique point y∈
K
nearest to x.
Proof.
Let d=inf{d(x,y): y∈ K}, where d(x,y) is the distance
coming from the norm ||x||=√⟨ x,x⟩, and let
yn be a sequence of points in K such that lim n→∞ d(x,yn)=d. Then
yn is a Cauchy sequence. Indeed, from the parallelogram identity
for the parallelogram generated by vectors x−yn and x−ym
we have:
$$\|y_n-y_m\|^2 = 2\|x-y_n\|^2 + 2\|x-y_m\|^2 - \|2x-y_n-y_m\|^2.$$
Note that ||2x−yn−ym||²=4||x−(yn+ym)/2||²≥ 4d², since
(yn+ym)/2∈ K by its convexity. For
sufficiently large m and n we get ||x−ym||²≤ d²+є
and ||x−yn||²≤ d²+є, thus
||yn−ym||²≤ 4(d²+є)−4d²=4є, i.e. yn
is a Cauchy sequence.
Let y be the limit of yn, which exists by the completeness
of H, then y∈ K since K is closed. Then
d(x,y)=limn→ ∞d(x,yn)=d. This shows the
existence of the nearest point. Let y′ be another point in
K such that d(x,y′)=d, then the parallelogram identity
implies:
$$\|y-y'\|^2 = 2\|x-y\|^2 + 2\|x-y'\|^2 - \|2x-y-y'\|^2 \le 4d^2-4d^2 = 0.$$
This shows the uniqueness of the nearest point.
□
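The nearest point theorem can be illustrated on a concrete convex closed set. The following is a minimal sketch, assuming the Hilbert space is ℝ2 with the Euclidean norm and K is the closed unit ball; the helper names `project_to_unit_ball` and `parallelogram_gap` are illustrative and not part of the notes.

```python
import math

def norm(v):
    """Euclidean norm of a real vector given as a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))

def project_to_unit_ball(x):
    """Nearest point to x in the closed unit ball K = {y : ||y|| <= 1}."""
    r = norm(x)
    if r <= 1.0:
        return list(x)            # x already lies in K
    return [c / r for c in x]     # otherwise rescale x onto the unit sphere

def parallelogram_gap(u, v):
    """||u+v||^2 + ||u-v||^2 - 2||u||^2 - 2||v||^2 (zero for an inner product norm)."""
    s = [a + b for a, b in zip(u, v)]
    d = [a - b for a, b in zip(u, v)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(u) ** 2 - 2 * norm(v) ** 2

x = [3.0, 4.0]
y = project_to_unit_ball(x)
dist = norm([a - b for a, b in zip(x, y)])
print(y, dist)
```

Comparing `dist` with the distance from x to many sample points of the ball gives a quick sanity check that no other point of K is closer.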
Exercise* 7 The essential rôle of the parallelogram identity
in the above proof indicates that the theorem does not hold in a
general Banach space.
-
Show that in ℝ2 with either norm
||·||1 or ||·||∞ from
Example 9 the nearest point could
be non-unique;
- Could you construct an example (in a Banach space) where the
nearest point does not exist?
Liberte, Egalite, Fraternite!
A longstanding ideal
approximated in the real life by something completely different
3.2 Bessel’s inequality
In the case when the convex subset is a subspace we can characterise
the nearest point in terms of orthogonality.
Theorem 8 (on Perpendicular)
Let M be a subspace of a Hilbert space H and a point x∈
H be fixed. Then z∈
M is the nearest point to x if and
only if x−
z is orthogonal to any vector in M.
Figure 6: (i) A smaller distance for a non-perpendicular direction; and (ii) Best approximation from a subspace
Proof.
Let z be the nearest point to x, existing by the previous
Theorem. We claim that x−z is orthogonal to any vector in M;
otherwise there exists y∈ M such that ⟨ x−z,y⟩≠ 0. Then
$$\begin{aligned}
\|x-z-\epsilon y\|^2 &= \|x-z\|^2 - 2\epsilon\,\Re\langle x-z,y\rangle + \epsilon^2\|y\|^2 \\
&< \|x-z\|^2,
\end{aligned}$$
if є is chosen to be small enough and such
that є ℜ⟨ x−z,y⟩ is positive, see
Figure 6(i). Since z+єy∈ M, we get a contradiction with
the statement that z is the closest point to x.
On the other hand, if x−z is orthogonal to all vectors in M then in particular
(x−z)⊥ (z−y) for all y∈ M, see
Figure 6(ii). Since x−y=(x−z)+(z−y)
we get by Pythagoras’ theorem:
$$\|x-y\|^2 = \|x-z\|^2 + \|z-y\|^2.$$
So ||x−y||²≥ ||x−z||² and they are equal if and only if z=y.
□
Exercise 9
The above proof does not work if ⟨ x−z,y
⟩ is an imaginary
number, what to do in this case?
Consider now a basic case of approximation: let x∈ H be
fixed and e1, …, en be orthonormal and denote
H1=Lin{e1,…,en}. We could try to approximate x by a
vector y=λ1 e1+⋯ +λn en ∈ H1.
Corollary 10
The minimal value of ||x−y|| for
y∈ H1 is achieved when
y=∑1n⟨ x,ei
⟩ ei.
Proof.
Let $z=\sum_{i=1}^{n}\langle x,e_i\rangle e_i$, then
⟨ x−z,ei⟩=⟨ x,ei⟩−⟨ z,ei⟩=0 for all i. By the
previous Theorem z is the nearest point to x.
□
Figure 7: Best approximation by three trigonometric polynomials |
Example 11
-
In ℝ3 find the best approximation to x=(1,0,0)
from the plane V:{x1+x2+x3=0}. We take an orthonormal
basis e1=(2^{−1/2}, −2^{−1/2}, 0), e2=(6^{−1/2}, 6^{−1/2},
−2· 6^{−1/2}) of V (Check this!). Then:
$$z = \langle x,e_1\rangle e_1 + \langle x,e_2\rangle e_2 = \left(\tfrac12,-\tfrac12,0\right) + \left(\tfrac16,\tfrac16,-\tfrac13\right) = \left(\tfrac23,-\tfrac13,-\tfrac13\right).$$
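- In C[0,2π] what is the best approximation to
f(t)=t by functions a+be^{it}+ce^{−it}? Let
$$e_0=\frac{1}{\sqrt{2\pi}},\qquad e_1=\frac{1}{\sqrt{2\pi}}e^{it},\qquad e_{-1}=\frac{1}{\sqrt{2\pi}}e^{-it}.$$
We find:
$$\begin{aligned}
\langle f,e_0\rangle &= \int_0^{2\pi}\frac{t}{\sqrt{2\pi}}\,dt = \left[\frac{t^2}{2\sqrt{2\pi}}\right]_0^{2\pi} = \sqrt{2}\,\pi^{3/2};\\
\langle f,e_1\rangle &= \int_0^{2\pi}\frac{t\,e^{-it}}{\sqrt{2\pi}}\,dt = i\sqrt{2\pi};\\
\langle f,e_{-1}\rangle &= \int_0^{2\pi}\frac{t\,e^{it}}{\sqrt{2\pi}}\,dt = -i\sqrt{2\pi}\quad\text{(Why need we not check this one?)}
\end{aligned}$$
Then the best approximation is (see Figure 7):
$$\begin{aligned}
f_0(t) &= \langle f,e_0\rangle e_0+\langle f,e_1\rangle e_1+\langle f,e_{-1}\rangle e_{-1}\\
&= \frac{\sqrt2\,\pi^{3/2}}{\sqrt{2\pi}} + ie^{it} - ie^{-it} = \pi - 2\sin t.
\end{aligned}$$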
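Both parts of this example can be checked numerically. The sketch below is only an illustration under stated assumptions: vectors are plain Python lists, the integrals over [0,2π] are approximated by the midpoint rule, and the names `best_approximation` and `coefficient` are hypothetical helpers, not notation from the notes.

```python
import cmath
import math

def inner(u, v):
    """Real Euclidean inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def best_approximation(x, basis):
    """Return sum_i <x, e_i> e_i for an orthonormal family `basis`."""
    z = [0.0] * len(x)
    for e in basis:
        c = inner(x, e)
        z = [zi + c * ei for zi, ei in zip(z, e)]
    return z

# Part 1: the plane x1 + x2 + x3 = 0 in R^3.
s2, s6 = 1 / math.sqrt(2), 1 / math.sqrt(6)
e1, e2 = [s2, -s2, 0.0], [s6, s6, -2 * s6]
x = [1.0, 0.0, 0.0]
z = best_approximation(x, [e1, e2])
residual = [a - b for a, b in zip(x, z)]   # should be orthogonal to e1 and e2

# Part 2: <f, e_n> on [0, 2*pi] with e_n(t) = exp(int)/sqrt(2*pi), midpoint rule.
def coefficient(f, n, steps=4000):
    h = 2 * math.pi / steps
    mids = [(j + 0.5) * h for j in range(steps)]
    return sum(f(t) * cmath.exp(-1j * n * t) for t in mids) * h / math.sqrt(2 * math.pi)

c0 = coefficient(lambda t: t, 0)
c1 = coefficient(lambda t: t, 1)
print(z, c0, c1)
```

The residual x−z is proportional to (1,1,1), the normal of the plane, in agreement with the Theorem on Perpendicular.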
Corollary 12 (Bessel’s inequality)
If (ei) is orthonormal then for any x and any n
$$\|x\|^2 \ge \sum_{i=1}^{n} |\langle x,e_i\rangle|^2.$$
Proof.
Let $z=\sum_{i=1}^{n}\langle x,e_i\rangle e_i$, then x−z⊥ei for all
i, therefore by Exercise 4 x−z⊥z.
Hence:
$$\begin{aligned}
\|x\|^2 &= \|z\|^2 + \|x-z\|^2\\
&\ge \|z\|^2 = \sum_{i=1}^{n}|\langle x,e_i\rangle|^2.
\end{aligned}$$
□
—Did you say “rice and fish for them”?
A student question
3.3 The Riesz–Fischer theorem
When (ei) is orthonormal we call ⟨ x,en
⟩ the nth
Fourier coefficient of x (with
respect to (ei), naturally).
Theorem 13 (Riesz–Fischer)
Let (en), n=1,2,…, be an orthonormal sequence in a Hilbert
space H. Then $\sum_{n=1}^{\infty}\lambda_n e_n$ converges in H
if and only if $\sum_{n=1}^{\infty}|\lambda_n|^2<\infty$. In
this case $\|\sum_{n=1}^{\infty}\lambda_n e_n\|^2=\sum_{n=1}^{\infty}|\lambda_n|^2$.
Proof.
Necessity: Let $x_k=\sum_{n=1}^{k}\lambda_n e_n$ and $x=\lim_{k\to\infty}x_k$. So
⟨ x,en⟩=lim k→∞ ⟨ xk,en⟩=λn for all n. By
Bessel’s inequality for all k
$$\|x\|^2 \ge \sum_{n=1}^{k}|\langle x,e_n\rangle|^2 = \sum_{n=1}^{k}|\lambda_n|^2,$$
hence $\sum_{n=1}^{\infty}|\lambda_n|^2$ converges and the sum
is at most ||x||².
Sufficiency: Consider $\|x_k-x_m\|=\|\sum_{n=m+1}^{k}\lambda_n e_n\|=(\sum_{n=m+1}^{k}|\lambda_n|^2)^{1/2}$ for
k>m. Since $\sum_{n=1}^{\infty}|\lambda_n|^2$ converges,
xk is a Cauchy sequence in H and thus has a limit x. By
Pythagoras’ theorem
$\|x_k\|^2=\sum_{n=1}^{k}|\lambda_n|^2$, thus for
k→ ∞ we get
$\|x\|^2=\sum_{n=1}^{\infty}|\lambda_n|^2$ by the
Lemma about inner product limit.
□
Observation: the closed linear span
of an orthonormal sequence in any Hilbert space looks like
l2, i.e. l2 is a universal model for a
Hilbert space.
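By Bessel’s inequality and the
Riesz–Fischer theorem we know that the series
$\sum_{i=1}^{\infty}\langle x,e_i\rangle e_i$ converges for any x∈
H. What is its limit?
Let $y=x-\sum_{i=1}^{\infty}\langle x,e_i\rangle e_i$, then
$$\langle y,e_k\rangle = \langle x,e_k\rangle - \sum_{i=1}^{\infty}\langle x,e_i\rangle\langle e_i,e_k\rangle = \langle x,e_k\rangle-\langle x,e_k\rangle = 0 \quad\text{for all } k. \tag{20}$$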
Definition 14
An orthonormal sequence (ei) in a Hilbert space H is
complete
if the identities ⟨ y,ek⟩=0
for all k imply y=0. A complete orthonormal sequence is also called an orthonormal
basis in H.
Theorem 15 (on Orthonormal Basis)
Let ei be an orthonormal basis in a Hilbert space H. Then
for any x∈ H we have
$$x=\sum_{n=1}^{\infty}\langle x,e_n\rangle e_n \quad\text{and}\quad \|x\|^2=\sum_{n=1}^{\infty}|\langle x,e_n\rangle|^2.$$
There are constructive existence theorems in
mathematics.
An example of pure existence statement
3.4 Construction of Orthonormal Sequences
Natural questions are: Do orthonormal sequences always exist?
Could we construct them?
Theorem 16 (Gram–Schmidt)
Let (
xi)
be a sequence of linearly independent vectors in an
inner product space V. Then there exists orthonormal sequence
(
ei)
such that
Lin{x1,x2,…,xn}=Lin{e1,e2,…,en},
for all n.
|
Proof.
We give an explicit algorithm working by induction. The
base of induction: the first vector is e1=x1/||x1||. The
step of induction: let e1, e2, …, en be already
constructed as required. Let
$y_{n+1}=x_{n+1}-\sum_{i=1}^{n}\langle x_{n+1},e_i\rangle e_i$. Then by
(20) yn+1 ⊥ ei for i=1,…,n. We may put
en+1=yn+1/||yn+1||
because yn+1≠ 0 due to linear independence of the xk’s. Also
$$\begin{aligned}
\operatorname{Lin}\{e_1,e_2,\ldots,e_{n+1}\} &= \operatorname{Lin}\{e_1,e_2,\ldots,e_n,y_{n+1}\}\\
&= \operatorname{Lin}\{e_1,e_2,\ldots,e_n,x_{n+1}\}\\
&= \operatorname{Lin}\{x_1,x_2,\ldots,x_{n+1}\}.
\end{aligned}$$
So (ei) is an orthonormal sequence.
□
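The inductive construction in the proof translates directly into a short program. The following is a minimal sketch, assuming vectors in ℂn stored as Python lists and the standard inner product (linear in the first argument); the function name `gram_schmidt` and the tolerance `1e-12` used to detect linear dependence are illustrative choices, not part of the notes.

```python
import math

def inner(u, v):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalise linearly independent vectors, one induction step per vector."""
    basis = []
    for x in vectors:
        y = list(x)
        for e in basis:
            c = inner(x, e)                          # <x_{n+1}, e_i>
            y = [yi - c * ei for yi, ei in zip(y, e)]
        length = math.sqrt(inner(y, y).real)
        if length < 1e-12:                           # y_{n+1} = 0: dependence
            raise ValueError("vectors are linearly dependent")
        basis.append([yi / length for yi in y])
    return basis

E = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
gram = [[inner(a, b) for b in E] for a in E]         # should be the identity matrix
print(gram)
```

In floating point arithmetic this classical form can lose orthogonality for badly conditioned inputs; subtracting the projections from the running vector y instead of from x (the so-called modified Gram–Schmidt) is a common remedy.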
Example 17
Consider C[0,1] with the usual inner product (17)
and apply orthogonalisation to the sequence 1, x, x², ….
Because ||1||=1 then e1(x)=1.
The continuation could be presented by the table:
$$\begin{array}{l}
e_1(x)=1\\
y_2(x)=x-\langle x,1\rangle 1=x-\tfrac12,\quad \|y_2\|^2=\int_0^1 (x-\tfrac12)^2\,dx=\tfrac1{12},\quad e_2(x)=\sqrt{12}\,(x-\tfrac12)\\
y_3(x)=x^2-\langle x^2,1\rangle 1-\langle x^2,x-\tfrac12\rangle (x-\tfrac12)\cdot 12,\quad \ldots,\quad e_3=\dfrac{y_3}{\|y_3\|}\\
\ldots\ \ldots\ \ldots
\end{array}$$
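The table can be continued mechanically. Here is a hedged sketch that carries out the orthogonalisation of 1, x, x² on [0,1] in exact rational arithmetic; it represents a polynomial by its list of coefficients (index = power of x) and postpones the normalisation so that no square roots appear. The helper names are illustrative, and the code assumes the input polynomials have increasing degrees.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += Fraction(a) * Fraction(b)
    return r

def inner(p, q):
    """<p, q> = integral of p(x) q(x) over [0, 1] for real polynomials."""
    return sum(c / (k + 1) for k, c in enumerate(poly_mul(p, q)))

def orthogonalise(polys):
    """Gram-Schmidt without the final normalisation (all numbers stay rational)."""
    ys = []
    for p in polys:
        y = [Fraction(c) for c in p]
        for u in ys:
            coef = inner(p, u) / inner(u, u)          # <p, u> / ||u||^2
            padded = u + [Fraction(0)] * (len(y) - len(u))
            y = [a - coef * b for a, b in zip(y, padded)]
        ys.append(y)
    return ys

y1, y2, y3 = orthogonalise([[1], [0, 1], [0, 0, 1]])
print(y2, inner(y2, y2))   # coefficients of x - 1/2 and its squared norm
print(y3, inner(y3, y3))   # coefficients of y3 and its squared norm
```

Dividing each y by the square root of its squared norm gives the orthonormal e2, e3 of the table.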
Figure 8: The first five Legendre Pi and Chebyshev Ti polynomials
Example 18
Many famous sequences of orthogonal polynomials,
e.g. Chebyshev, Legendre, Laguerre, Hermite, can be obtained
by orthogonalisation of 1, x, x², … with various
inner products.
- Legendre polynomials in C[−1,1] with inner product
$$\langle f,g\rangle=\int_{-1}^{1} f(t)\,\overline{g(t)}\,dt. \tag{21}$$
- Chebyshev polynomials in C[−1,1] with inner product
$$\langle f,g\rangle=\int_{-1}^{1} f(t)\,\overline{g(t)}\,\frac{dt}{\sqrt{1-t^2}}. \tag{22}$$
- Laguerre polynomials in the space of polynomials P[0,∞) with inner product
$$\langle f,g\rangle=\int_{0}^{\infty} f(t)\,\overline{g(t)}\,e^{-t}\,dt.$$
See Figure 8 for the first five Legendre and Chebyshev
polynomials. Observe the difference caused by the different inner
products (21) and (22). On the other
hand note the similarity in oscillating behaviour with different
“frequencies”.
Another natural question is: When is an orthonormal sequence
complete?
Proposition 19
Let (
en)
be an orthonormal sequence in a Hilbert space
H. The following are equivalent:
-
(en) is an orthonormal basis.
- CLin((en))=H.
- ||x||2=∑1∞| ⟨ x,en
⟩ |2 for
all x∈ H.
Proof.
Clearly 1 implies 2 because
$x=\sum_{n=1}^{\infty}\langle x,e_n\rangle e_n$ belongs to CLin((en)) by
Theorem 15. The same theorem gives
$\|x\|^2=\sum_{n=1}^{\infty}|\langle x,e_n\rangle|^2$, that is
1 implies 3.
If (en) is not complete then there exists x∈ H
such that x≠ 0 and ⟨ x,ek
⟩=0 for all k, so
3 fails, consequently 3 implies
1.
Finally if ⟨ x,ek
⟩=0 for all k then
⟨ x,y
⟩=0 for all y∈Lin((en)) and moreover for all
y∈CLin((en)), by the
Lemma on
continuity of the inner product. But then
x∉CLin((en)) and 2 also
fails because ⟨ x,x
⟩=0 is not possible. Thus
2 implies 1.
□
Corollary 20
A separable Hilbert space (i.e. one
with a countable dense set) can be identified with either
l2n or l2, in other words it has an orthonormal
basis (en) (finite or infinite) such that
$$x=\sum_{n}\langle x,e_n\rangle e_n \quad\text{and}\quad \|x\|^2=\sum_{n}|\langle x,e_n\rangle|^2.$$
Proof.
Take a countable dense set (xk), then H=CLin((xk)),
delete all vectors which are linear combinations of preceding
vectors, orthonormalise the remaining set by Gram–Schmidt
and apply the previous proposition.
□
Most pleasant compliments are usually orthogonal to our real
qualities.
An advise based on observations
3.5 Orthogonal complements
Orthogonality allows us to split a Hilbert space into subspaces which will
be “independent from each other” as much as possible.
Definition 21
Let M be a subspace of an inner product space V. The
orthogonal
complement,
written
M⊥, of M is
M⊥={x∈ V: ⟨ x,m
⟩=0 ∀ m∈ M}.
|
Theorem 22
If M is a closed subspace of a Hilbert space H then
M⊥ is a closed subspace too (hence a Hilbert space too).
Proof.
Clearly
M⊥ is a subspace of
H because
x,
y∈
M⊥ implies
ax+
by∈
M⊥:
⟨ ax+by,m
⟩= a⟨ x,m
⟩+ b⟨ y,m
⟩=0.
|
Also if all
xn∈
M⊥ and
xn→
x then
x∈
M⊥ due to
inner product
limit Lemma.
□
Theorem 23
Let M be a closed subspace of a Hilbert space H. Then for any
x∈ H there exists the unique decomposition x=m+n with
m∈ M, n∈ M⊥ and ||x||²=||m||²+||n||². Thus H=M⊕ M⊥ and
(M⊥)⊥=M.
Proof.
For a given
x there exists the unique closest point
m in
M by the
Theorem on nearest
point and by the
Theorem on
perpendicular (
x−
m)⊥
y for all
y∈
M.
So x= m + (x−m)= m+n with m∈ M and n∈ M⊥. The
identity ||x||2=||m||2+||n||2 is just Pythagoras’
theorem and M∩ M⊥={0} because null vector is the
only vector orthogonal to itself.
Finally (M⊥)⊥=M. We have H=M⊕
M⊥=(M⊥)⊥⊕ M⊥ and clearly M⊂(M⊥)⊥. For any
x∈(M⊥)⊥ there is a decomposition x=m+n with
m∈ M and n∈ M⊥, but then n=x−m belongs to (M⊥)⊥ as well, so n is orthogonal to
itself and therefore is zero.
□
4 Duality of Linear Spaces
Everything has another side
An orthonormal basis allows us to reduce any question on a Hilbert space to a
question on sequences of numbers. This is a powerful but sometimes heavy
technique. Sometimes we need a smaller and faster tool to study
questions which are represented by a single number, for example to
demonstrate that two vectors are different it is enough to show that
they have unequal values of a single coordinate. In such cases linear
functionals are just what we need.
–Is it functional?
–Yes, it works!
4.1 Dual space of a normed space
Definition 1
A linear functional on a vector space
V is a linear mapping α:
V→ ℂ
(or
α:
V→ ℝ
in the real case), i.e.
α(ax+by)=aα(x)+bα(y), for all
x,y∈ V and a,b∈ℂ.
|
Exercise 2
Show that α(0) is necessarily 0.
We will not consider any functionals but linear, thus below
functional always means linear functional.
Example 3
-
Let V=ℂn and ck, k=1,…,n be
complex numbers. Then
α((x1,…,xn))=c1x1+⋯+cnxn is a linear
functional.
- On C[0,1] a functional is given by
α(f)=∫01 f(t) d t.
- On a Hilbert space H for any x∈ H a functional
αx is given by αx(y)=⟨ y,x
⟩.
Theorem 4
Let V be a normed space and α be a linear
functional on V. The following are equivalent:
-
α is continuous (at any point of V).
-
α is continuous at point 0.
-
sup{| α(x) |: ||x||≤ 1}< ∞,
i.e. α is a bounded linear
functional.
Proof.
Implication 1 ⇒ 2 is trivial.
Show 2 ⇒ 3. By the
definition of continuity: for any
є>0 there exists δ>0 such that
||v||<δ implies
| α(v)−α(0) |<є. Take є=1,
then | α((δ/2) x) |<1 for all x with
||x||≤ 1 because ||(δ/2) x||< δ. By the
linearity of α the inequality | α((δ/2) x) |<1
implies | α(x) |<2/δ<∞ for all
||x||≤ 1.
3 ⇒ 1. Let the mentioned supremum be M. For
any x, y∈ V such that x≠ y the vector
(x−y)/||x−y|| has norm 1. Thus | α
((x−y)/||x−y||) |≤ M. By the linearity of α this
implies that | α
(x)−α(y) |≤ M||x−y||. Thus α is continuous.
□
Definition 5
The dual space X* of a normed space
X is the set of continuous linear functionals on X. Define a
norm on it by
$$\|\alpha\| = \sup_{\|x\|\le 1} |\alpha(x)|. \tag{23}$$
Exercise 6
-
Show that the chain of inequalities:
$$\|\alpha\| \le \sup_{\|x\|=1}|\alpha(x)| \le \sup_{x\neq 0}\frac{|\alpha(x)|}{\|x\|} \le \|\alpha\|.$$
Deduce that any of the mentioned supremums delivers the norm of
α. Which of them would you prefer if you need to show
boundedness of α? Which of them is better to use if
boundedness of α is given?
- Show that | α(x) |≤
||α||·||x|| for all x∈ X, α ∈ X*.
An important observation is that linear functionals form a normed
space as follows:
Exercise 7
-
Show that X* is a linear space with natural (point-wise) operations.
- Show that (23) defines a norm on X*.
Furthermore, X* is always complete, regardless of properties of X!
Theorem 8
X* is a Banach space with the defined norm (even if X was
incomplete).
Proof.
Due to Exercise 7 we only need to show that
X* is complete. Let (αn) be a Cauchy sequence in
X*, then for any x∈ X the scalars αn(x) form a
Cauchy sequence, since
| αm(x)−αn(x) |≤||αm−αn||·||x||. Thus
the sequence has a limit and we define α by
α(x)=lim n→∞ αn(x). Clearly
α is a linear functional on X. We should show that it
is bounded and αn→ α. Given
є>0 there exists N such that
||αn−αm||<є for all n, m≥ N. If
||x||≤ 1 then | αn(x)−αm(x) |≤ є, let
m→∞ then | αn(x)−α(x) |≤ є, so
$$|\alpha(x)| \le |\alpha_n(x)| + \epsilon \le \|\alpha_n\| + \epsilon,$$
i.e. ||α|| is finite and ||αn−α||≤ є, thus αn→α.
□
Definition 9
The kernel of a linear functional α, written kerα, is the set
of all vectors x∈ X such that α(x)=0.
Exercise 10
Show that
-
kerα is a subspace of X.
- If α≢0 then obviously kerα ≠ X.
Furthermore, if X has at least two linearly independent vectors
then kerα ≠ {0}, thus kerα is a
proper subspace of X.
- If α is continuous then kerα is closed.
Study one and get any other for free!
Hilbert spaces sale
4.2 Self-duality of Hilbert space
Lemma 11 (Riesz–Fréchet)
Let H be a Hilbert space and α
a continuous linear
functional on H, then there exists the unique y∈
H such
that α(
x)=⟨
x,
y
⟩
for all x∈
H. Also
||α||
H*=||
y||
H.
Proof.
Uniqueness: if ⟨
x,
y
⟩=⟨
x,
y′
⟩ ⇔
⟨
x,
y−
y′
⟩=0 for all
x∈
H
then
y−
y′ is self-orthogonal and thus is zero
(Exercise
1).
Existence: we may assume that α≢0 (otherwise take
y=0), then M=kerα is a closed proper subspace of
H. Since H=M⊕ M⊥, there exists a non-zero z∈
M⊥, by scaling we could get α(z)=1. Then for any
x∈ H:
x=(x−α(x)z)+α(x)z, with
x−α(x)z∈ M, α(x)z∈ M⊥.
|
Because ⟨ x,z
⟩=α(x)⟨ z,z
⟩=α(x)||z||2
for any x∈ H we set y=z/||z||2.
Equality of the norms ||α||H*=||y||H follows from the
Cauchy–Bunyakovskii–Schwarz
inequality in the form | α(x) |≤ ||x||·||y|| and the
identity α(y/||y||)=||y||.
□
Example 12
On L2[0,1] let α(f)=⟨ f,t²⟩=∫01 f(t) t² d t. Then
$$\|\alpha\| = \|t^2\| = \left(\int_0^1 (t^2)^2\,dt\right)^{1/2} = \frac{1}{\sqrt5}.$$
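The value of the norm in this example can be probed on polynomials, where all integrals over [0,1] are exact. This is only a sketch under stated assumptions: test functions are polynomials given by coefficient lists (index = power of t), and the helper names `alpha`, `l2_norm` are illustrative. The ratio |α(f)|/||f|| should never exceed ||t²|| and should attain it at f=t².

```python
import math
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += Fraction(a) * Fraction(b)
    return r

def integral01(p):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(c / (k + 1) for k, c in enumerate(p))

def alpha(f):
    """alpha(f) = <f, t^2> = integral of f(t) t^2 over [0, 1]."""
    return integral01(poly_mul(f, [0, 0, 1]))

def l2_norm(f):
    return math.sqrt(integral01(poly_mul(f, f)))

norm_alpha = l2_norm([0, 0, 1])                      # ||t^2|| in L2[0,1]
samples = [[1], [0, 1], [1, -2, 3], [0, 0, 1]]       # 1, t, 1 - 2t + 3t^2, t^2
ratios = [abs(alpha(f)) / l2_norm(f) for f in samples]
print(norm_alpha, ratios)
```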
5 Fourier Analysis
All bases are equal, but some are more equal than others.
As we saw already any separable Hilbert space possesses an orthonormal
basis (infinitely many of them indeed). Are they equally good?
This depends on our purposes. For the solution of differential equations
which arose in mathematical physics (wave, heat, Laplace equations, etc.)
there is a preferred choice. The fundamental formula d/dx
e^{ax}=ae^{ax} reduces the derivative to a multiplication by
a. We could benefit from this observation if the orthonormal basis
is constructed out of exponentials. This helps to solve differential
equations as was demonstrated in
Subsection 0.2.
7.40pm Fourier series: Episode II
Today’s TV listing
5.1 Fourier series
Now we wish to address questions stated in
Remark 9. Let us consider the space
L2[−π,π]. As we saw in Example 3
there is an orthonormal sequence en(t)=(2π)−1/2eint in
L2[−π,π]. We will show that it is an orthonormal
basis, i.e.
$$f(t)\in L_2[-\pi,\pi] \;\Leftrightarrow\; f(t)=\sum_{k=-\infty}^{\infty}\langle f,e_k\rangle e_k(t),$$
with convergence in L2 norm. To do this we show that
CLin{ek:k∈ℤ}=L2[−π,π].
Let CP[−π,π] denote the continuous functions
f on [−π,π] such that f(π)=f(−π). We also define
f outside of the interval [−π,π] by periodicity.
Lemma 1
The space CP[−π,π]
is dense in
L2[−π,π]
.
Figure 9: A modification of continuous function to periodic |
Proof.
Let f∈ L2[−π,π].
Given є>0 there exists g∈ C[−π,π]
such that ||f−g||<є/2. From continuity of g on a
compact set it follows that there is M such that | g(t) |<M for all
t∈[−π,π].
We can now replace g by periodic g′,
which coincides with g on [−π,π−δ] for an arbitrary
δ>0 and has the same bounds: | g′(t) |<M,
see Figure 9. Then
$$\|g-g'\|_2^2 = \int_{\pi-\delta}^{\pi}|g(t)-g'(t)|^2\,dt \le (2M)^2\delta.$$
So if δ<є²/(4M)² then
||g−g′||<є/2 and ||f−g′||<є.
□
Now if we could show that CLin{ek: k ∈ ℤ} includes
CP[−π,π] then it also includes
L2[−π,π].
Notation 2
Let f∈ CP[−π,π], write
$$f_n=\sum_{k=-n}^{n}\langle f,e_k\rangle e_k,\quad\text{for } n=0,1,2,\ldots \tag{24}$$
for the partial sum of the Fourier series
for f.
We want to show that ||f−fn||2→ 0. To this end we
define the nth Fejér sum by the formula
$$F_n=\frac{f_0+f_1+\cdots+f_n}{n+1}, \tag{25}$$
and show that
$$\|F_n-f\|_\infty\to 0.$$
Then we conclude
$$\|F_n-f\|_2=\left(\int_{-\pi}^{\pi}|F_n(t)-f(t)|^2\,dt\right)^{1/2}\le (2\pi)^{1/2}\|F_n-f\|_\infty\to 0.$$
Since Fn∈Lin((en)) then f∈CLin((en)) and hence
$f=\sum_{k=-\infty}^{\infty}\langle f,e_k\rangle e_k$.
Exercise 4
Find an example illustrating the above Remark.
The summation method used in (25) is useful not
only in the context of Fourier series but for many other cases as
well. In such a wider framework the method is known as
Cesàro summation.
It took 19 years of his life to prove this theorem
5.2 Fejér’s theorem
Proposition 5 (Fejér, age 19)
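Let f∈ CP[−π,π]. Then
$$F_n(x)=\frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)K_n(x-t)\,dt, \tag{26}$$
where
$$K_n(t)=\frac{1}{n+1}\sum_{k=0}^{n}\sum_{m=-k}^{k}e^{imt} \tag{27}$$
is the Fejér kernel.
Proof.
From notation (24):
$$f_k(x)=\sum_{m=-k}^{k}\langle f,e_m\rangle e_m(x)=\sum_{m=-k}^{k}\int_{-\pi}^{\pi}f(t)\frac{e^{-imt}}{\sqrt{2\pi}}\,dt\,\frac{e^{imx}}{\sqrt{2\pi}}=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(t)\sum_{m=-k}^{k}e^{im(x-t)}\,dt.$$
Then from (25):
$$F_n(x)=\frac{1}{n+1}\sum_{k=0}^{n}f_k(x)=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(t)\,\frac{1}{n+1}\sum_{k=0}^{n}\sum_{m=-k}^{k}e^{im(x-t)}\,dt=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(t)K_n(x-t)\,dt,$$
which finishes the proof.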
□
Lemma 6
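The Fejér kernel is 2π-periodic, Kn(0)=n+1
and can be expressed as:
$$K_n(t)=\frac{1}{n+1}\,\frac{\sin^2\frac{(n+1)t}{2}}{\sin^2\frac{t}{2}},\qquad\text{for } t\notin 2\pi\mathbb{Z}. \tag{28}$$
|      |      | 1 |   |    |
|      | z^−1 | 1 | z |    |
| z^−2 | z^−1 | 1 | z | z² |
|  ⋮   |  ⋮   | ⋮ | ⋮ | ⋮  | ⋱
Table 1: Counting powers in rows and columns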
Proof.
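Let z=e^{it}, then:
$$K_n(t)=\frac{1}{n+1}\sum_{k=0}^{n}\left(z^{-k}+\cdots+z^{-1}+1+z+\cdots+z^{k}\right)=\frac{1}{n+1}\sum_{j=-n}^{n}(n+1-|j|)\,z^{j},$$
by switching from counting in rows to counting in columns in
Table 1.
Let w=e^{it/2}, i.e. z=w², then
$$\begin{aligned}
K_n(t)&=\frac{1}{n+1}\left(w^{-2n}+2w^{-2n+2}+\cdots+(n+1)+nw^{2}+\cdots+w^{2n}\right)\\
&=\frac{1}{n+1}\left(w^{-n}+w^{-n+2}+\cdots+w^{n-2}+w^{n}\right)^2 \qquad (29)\\
&=\frac{1}{n+1}\left(\frac{w^{-n-1}-w^{n+1}}{w^{-1}-w}\right)^2 \qquad\text{(Could you sum a geometric progression?)}\\
&=\frac{1}{n+1}\left(\frac{-2i\sin\frac{(n+1)t}{2}}{-2i\sin\frac{t}{2}}\right)^2,
\end{aligned}$$
if w≠ ± 1. For the value of Kn(0) we substitute w=1 into (29).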
□
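The agreement of the two expressions for the Fejér kernel is easy to test numerically. The sketch below assumes the standard forms of the kernel, namely the double sum K_n(t) = (n+1)^{-1} Σ_{k=0}^{n} Σ_{m=-k}^{k} e^{imt} and the closed form sin²((n+1)t/2) / ((n+1) sin²(t/2)); the function names are illustrative only.

```python
import cmath
import math

def fejer_double_sum(n, t):
    """K_n(t) as the averaged double sum of exponentials e^{imt}."""
    total = sum(cmath.exp(1j * m * t)
                for k in range(n + 1)
                for m in range(-k, k + 1))
    return total.real / (n + 1)

def fejer_closed_form(n, t):
    """K_n(t) = sin^2((n+1)t/2) / ((n+1) sin^2(t/2)); equals n+1 where sin(t/2) = 0."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:
        return float(n + 1)
    return math.sin((n + 1) * t / 2) ** 2 / ((n + 1) * s ** 2)

for n in (0, 3, 6):
    for t in (0.1, 1.0, 2.5):
        print(n, t, fejer_double_sum(n, t), fejer_closed_form(n, t))
```

The special case in `fejer_closed_form` mirrors the restriction w≠ ± 1 in the proof: at multiples of 2π the quotient is undefined and the value n+1 is taken from the sum.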
Figure 10: A family of Fejér kernels with the parameter m running from
0 to 9 is on the left picture. For a comparison
unregularised Fourier kernels are on the right picture. |
The first ten Fejér kernels are shown in
Figure 10; we can observe that:
Lemma 7
Fejér’s kernel has the following properties:
-
Kn(t)≥0 for all t∈ ℝ and
n∈ℕ.
-
∫−ππKn(t) d t=2π.
-
For any δ∈ (0,π)
$$\int_{-\pi}^{-\delta}K_n(t)\,dt+\int_{\delta}^{\pi}K_n(t)\,dt\to 0\quad\text{as } n\to\infty.$$
Proof.
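The first property immediately
follows from the explicit formula (28). In
contrast the second property is
easier to deduce from the expression with the double
sum (27):
$$\int_{-\pi}^{\pi}K_n(t)\,dt=\frac{1}{n+1}\sum_{k=0}^{n}\sum_{m=-k}^{k}\int_{-\pi}^{\pi}e^{imt}\,dt=\frac{1}{n+1}\sum_{k=0}^{n}2\pi=2\pi,$$
by the formula (19).
Finally if δ<| t |≤π then sin²(t/2)≥
sin²(δ/2)>0 by monotonicity of sine on [0,π/2], so:
$$0\le K_n(t)\le\frac{1}{(n+1)\sin^2(\delta/2)},$$
implying:
$$0\le\int_{\delta\le|t|\le\pi}K_n(t)\,dt\le\frac{2(\pi-\delta)}{(n+1)\sin^2(\delta/2)}\to 0\quad\text{as } n\to\infty.$$
Therefore the third property follows
from the squeeze rule.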
□
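The three properties can also be observed numerically. The following sketch assumes the closed form K_n(t) = sin²((n+1)t/2) / ((n+1) sin²(t/2)) of the Fejér kernel and approximates integrals by the midpoint rule; the names `midpoint_integral` and `tail_mass` and the choice δ=0.5 are illustrative assumptions.

```python
import math

def fejer(n, t):
    """Closed form of the Fejer kernel; the value n+1 is used where sin(t/2) = 0."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:
        return float(n + 1)
    return math.sin((n + 1) * t / 2) ** 2 / ((n + 1) * s ** 2)

def midpoint_integral(g, a, b, steps=20000):
    """Midpoint rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (j + 0.5) * h) for j in range(steps))

def tail_mass(n, delta):
    """Integral of K_n over delta <= |t| <= pi (the kernel is even)."""
    return 2 * midpoint_integral(lambda t: fejer(n, t), delta, math.pi)

total = midpoint_integral(lambda t: fejer(10, t), -math.pi, math.pi)
tails = [tail_mass(n, 0.5) for n in (5, 50, 500)]
print(total, tails)   # total should be close to 2*pi, tails should shrink
```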
Theorem 8 (Fejér Theorem)
Let f∈
CP[−π,π]
. Then its Fejér sums
Fn (25) converges in supremum norm to f
on [−π,π]
and hence in L2 norm as well.
Proof.
Idea of the proof: if in the formula (26)
t is a long way from x, Kn is small (see
Lemma 7 and Figure 10); for t near x, Kn is big
with total “weight” 2π, so the weighted average of f(t)
is near f(x).
Here are details. Using property 2 and
periodicity of f and Kn we could
express trivially
$$f(x)=f(x)\,\frac{1}{2\pi}\int_{x-\pi}^{x+\pi}K_n(x-t)\,dt=\frac{1}{2\pi}\int_{x-\pi}^{x+\pi}f(x)K_n(x-t)\,dt.$$
Similarly we rewrite (26) as
$$F_n(x)=\frac{1}{2\pi}\int_{x-\pi}^{x+\pi}f(t)K_n(x-t)\,dt,$$
then
$$\begin{aligned}
|f(x)-F_n(x)|&=\frac{1}{2\pi}\left|\int_{x-\pi}^{x+\pi}(f(x)-f(t))K_n(x-t)\,dt\right|\\
&\le \frac{1}{2\pi}\int_{x-\pi}^{x+\pi}|f(x)-f(t)|\,K_n(x-t)\,dt.
\end{aligned}$$
Given є>0 split [x−π,x+π] into three intervals:
I1=[x−π,x−δ], I2=[x−δ,x+δ],
I3=[x+δ,x+π], where δ is chosen such that
| f(t)−f(x) |<є/2 for t∈ I2, which is
possible by (uniform) continuity of f. So
$$\frac{1}{2\pi}\int_{I_2}|f(x)-f(t)|\,K_n(x-t)\,dt\le\frac{\epsilon}{2}\,\frac{1}{2\pi}\int_{I_2}K_n(x-t)\,dt<\frac{\epsilon}{2}.$$
And
$$\begin{aligned}
\frac{1}{2\pi}\int_{I_1\cup I_3}|f(x)-f(t)|\,K_n(x-t)\,dt&\le 2\|f\|_\infty\,\frac{1}{2\pi}\int_{I_1\cup I_3}K_n(x-t)\,dt\\
&=\frac{\|f\|_\infty}{\pi}\int_{\delta<|u|<\pi}K_n(u)\,du\\
&<\frac{\epsilon}{2},
\end{aligned}$$
if n is sufficiently large due to
property 3 of Kn. Hence
| f(x)−Fn(x) |<є for a large n independent
of x.
□
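Uniform convergence of Fejér sums can be watched on a simple example. The sketch below is an illustration under stated assumptions: f(t)=|t| is used as a test function from CP[−π,π], the coefficients c_k = (1/2π)∫ f(t)e^{−ikt} dt are approximated by the midpoint rule, and the Fejér sum is evaluated in the equivalent form Σ_{|k|≤n} (1−|k|/(n+1)) c_k e^{ikx}, which follows from averaging the partial sums. All function names are hypothetical.

```python
import cmath
import math

def fourier_coefficients(f, n, steps=2048):
    """Approximate c_k = (1/2pi) * integral of f(t) e^{-ikt} over [-pi, pi], |k| <= n."""
    h = 2 * math.pi / steps
    ts = [-math.pi + (j + 0.5) * h for j in range(steps)]
    return {k: sum(f(t) * cmath.exp(-1j * k * t) for t in ts) * h / (2 * math.pi)
            for k in range(-n, n + 1)}

def fejer_sum(c, n, x):
    """F_n(x) = sum over |k| <= n of (1 - |k|/(n+1)) c_k e^{ikx}."""
    return sum((1 - abs(k) / (n + 1)) * c[k] * cmath.exp(1j * k * x)
               for k in range(-n, n + 1)).real

def sup_error(f, n, samples=400):
    """Maximum of |f(x) - F_n(x)| over a uniform grid on [-pi, pi]."""
    c = fourier_coefficients(f, n)
    xs = [-math.pi + 2 * math.pi * j / samples for j in range(samples + 1)]
    return max(abs(f(x) - fejer_sum(c, n, x)) for x in xs)

errors = [sup_error(abs, n) for n in (4, 16, 64)]
print(errors)   # the supremum-norm error should decrease as n grows
```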
We almost finished the demonstration that en(t)=(2π)−1/2eint
is an orthonormal basis of L2[−π,π]:
Corollary 10 (Fourier series)
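Let f∈ L2[−π,π], with Fourier series
$$\sum_{n=-\infty}^{\infty}\langle f,e_n\rangle e_n=\sum_{n=-\infty}^{\infty}c_ne^{int}\quad\text{where}\quad c_n=\frac{\langle f,e_n\rangle}{\sqrt{2\pi}}=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(t)e^{-int}\,dt.$$
Then the series $\sum_{n=-\infty}^{\infty}\langle f,e_n\rangle e_n=\sum_{n=-\infty}^{\infty}c_ne^{int}$ converges in L2[−π,π]
to f, i.e.
$$\lim_{k\to\infty}\Big\|f-\sum_{n=-k}^{k}c_ne^{int}\Big\|_2=0.$$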
5.3 Parseval’s formula
The following result first appeared in the framework of
L2[−π,π] and only later was understood to be a
general property of inner product spaces.
Theorem 12 (Parseval’s formula)
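If f, g∈ L2[−π,π] have Fourier series
$f=\sum_{n=-\infty}^{\infty}c_ne^{int}$ and $g=\sum_{n=-\infty}^{\infty}d_ne^{int}$, then
$$\langle f,g\rangle=\int_{-\pi}^{\pi}f(t)\,\overline{g(t)}\,dt=2\pi\sum_{n=-\infty}^{\infty}c_n\overline{d_n}. \tag{30}$$
More generally if f and g are two vectors of a Hilbert space H with an
orthonormal basis (en), n∈ℤ, then
$$\langle f,g\rangle=\sum_{n=-\infty}^{\infty}c_n\overline{d_n},\quad\text{where } c_n=\langle f,e_n\rangle,\ d_n=\langle g,e_n\rangle$$
are the Fourier coefficients of f and g.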
Proof.
In fact we could just prove the second, more general,
statement; the first one is its particular realisation. Let
$f_n=\sum_{k=-n}^{n}c_ke_k$ and $g_n=\sum_{k=-n}^{n}d_ke_k$
be partial sums of the corresponding Fourier
series. Then from orthonormality of (en) and linearity of the
inner product:
$$\langle f_n,g_n\rangle=\Big\langle\sum_{k=-n}^{n}c_ke_k,\ \sum_{k=-n}^{n}d_ke_k\Big\rangle=\sum_{k=-n}^{n}c_k\overline{d_k}.$$
This formula together with the facts that fn→ f and gn→ g (following from
Corollary 10) and the
Lemma about continuity of the
inner product implies the assertion.
□
Corollary 13
An integrable function f belongs to L2[−π,π]
if and only if its Fourier series is convergent and then
$\|f\|^2=2\pi\sum_{k=-\infty}^{\infty}|c_k|^2$.
Heat and noise but not a fire?
Answer:
5.4 Some Application of Fourier Series
We are now going to provide a few examples which demonstrate the
importance of the Fourier series in many questions. The first two
(Example 16 and Theorem 17)
belong to pure mathematics and the last two are of a more applied
nature.
Example 16
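Let f(t)=t on [−π,π]. Then
$$\langle f,e_n\rangle=\int_{-\pi}^{\pi}\frac{t\,e^{-int}}{\sqrt{2\pi}}\,dt=\begin{cases}(-1)^n\dfrac{2\pi i}{n\sqrt{2\pi}}, & n\neq 0;\\[1ex] 0, & n=0\end{cases}\qquad\text{(check!)},$$
so $f(t)\sim\sum_{n\neq 0}(-1)^n\frac{i}{n}e^{int}$. By a direct
integration:
$$\|f\|_2^2=\int_{-\pi}^{\pi}t^2\,dt=\frac{2\pi^3}{3}.$$
On the other hand by the
previous Corollary:
$$\|f\|_2^2=2\pi\sum_{n\neq 0}\left|\frac{(-1)^n i}{n}\right|^2=4\pi\sum_{n=1}^{\infty}\frac{1}{n^2}.$$
Thus we get a beautiful formula
$$\sum_{n=1}^{\infty}\frac{1}{n^2}=\frac{\pi^2}{6}.$$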
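The identity this example leads to, Σ 1/n² = π²/6, together with the Parseval relation behind it, can be checked with partial sums. The sketch below assumes the coefficients c_n = (−1)^n i/n (n≠0) computed above and the value ||f||² = 2π³/3 obtained by direct integration; the cut-off 100000 is an arbitrary illustrative choice.

```python
import math

def parseval_side(cutoff):
    """2*pi * sum over 0 < |n| <= cutoff of |c_n|^2 with c_n = (-1)^n i / n."""
    return 2 * math.pi * sum(2 * abs((-1) ** n * 1j / n) ** 2
                             for n in range(1, cutoff + 1))

norm_squared = 2 * math.pi ** 3 / 3                  # integral of t^2 over [-pi, pi]
basel_partial = sum(1 / n ** 2 for n in range(1, 100001))

print(basel_partial, math.pi ** 2 / 6)
print(parseval_side(100000), norm_squared)
```

The partial sums approach the limit from below, with a remainder of order 1/cutoff, so the agreement improves only slowly.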
Here is another important result.
Theorem 17 (Weierstrass Approximation Theorem)
For any function f∈
C[
a,
b]
and any є>0
there exists a polynomial p such that
||
f−
p||
∞<є
.
Proof.
Change variable: t=2π(x−(a+b)/2)/(b−a); this maps
x∈[a,b] onto t∈[−π,π]. Let P denote
the subspace of polynomials in C[−π,π]. Then
e^{int}∈ $\overline{P}$ for any n∈ℤ since the Taylor series converges uniformly on
[−π,π]. Consequently $\overline{P}$ contains the closed linear
span (in the supremum norm) of e^{int}, n∈ℤ,
which is CP[−π,π] by the
Fejér theorem. Thus $\overline{P}$⊇ CP[−π,π] and we
extend that to non-periodic functions as follows (why could we not
make use of Lemma 1 here, by the way?).
For any f∈C[−π,π] let
λ=(f(π)−f(−π))/(2π) then f1(t)=f(t)−λ t∈
CP[−π,π] and could be approximated by a polynomial
p1(t) from the above discussion. Then f(t) is approximated
by the polynomial p(t)=p1(t)+λ t.
□
It is easy to see that the rôle of the exponentials e^{int} in the
above proof is rather modest: they can be replaced by any functions
which have a Taylor expansion. The real glory of the Fourier analysis
is demonstrated in the two following examples.
Figure 11: The dynamics of a heat equation: x is the coordinate on the rod, t is time, T is temperature.
Example 18
The modern history of the Fourier analysis starts from the works of
Fourier on the heat equation. As was mentioned in the introduction
to this part, the exceptional role of Fourier coefficients for
differential equations is explained by the simple formula
∂x e^{inx}=in e^{inx}. We shortly review a solution of
the heat equation to illustrate this.
Let us have a rod of the length 2π. The temperature at its
point x∈[−π,π] and a moment t∈[0,∞) is
described by a function u(t,x) on [0,∞)×[−π,π].
The mathematical equation describing the dynamics of the temperature
distribution is:
$$\frac{\partial u(t,x)}{\partial t}=\frac{\partial^2 u(t,x)}{\partial x^2}\quad\text{or, equivalently,}\quad \left(\partial_t-\partial_x^2\right)u(t,x)=0. \tag{32}$$
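For any fixed moment t0 the function u(t0,x) depends only
on x∈[−π,π] and according to
Corollary 10 could be represented by its Fourier series:
$$u(t_0,x)=\sum_{n=-\infty}^{\infty}\langle u,e_n\rangle e_n=\sum_{n=-\infty}^{\infty}c_n(t_0)e^{inx},$$
where
$$c_n(t_0)=\frac{\langle u,e_n\rangle}{\sqrt{2\pi}}=\frac{1}{2\pi}\int_{-\pi}^{\pi}u(t_0,x)e^{-inx}\,dx,$$
with Fourier coefficients cn(t0) depending on t0. We
substitute that decomposition into the heat equation (32)
to receive:
$$\begin{aligned}
\left(\partial_t-\partial_x^2\right)u(t,x)&=\left(\partial_t-\partial_x^2\right)\sum_{n=-\infty}^{\infty}c_n(t)e^{inx}\\
&=\sum_{n=-\infty}^{\infty}\left(\partial_t-\partial_x^2\right)c_n(t)e^{inx}\\
&=\sum_{n=-\infty}^{\infty}(c'_n(t)+n^2c_n(t))e^{inx}=0. \qquad (33)
\end{aligned}$$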
Since the functions e^{inx} form a basis the last
equation (33) holds if and only if
$$c'_n(t)+n^2c_n(t)=0\quad\text{for all } n \text{ and } t. \tag{34}$$
Equations from the system (34) have general
solutions of the form:
$$c_n(t)=c_n(0)e^{-n^2t}\quad\text{for all } t\in[0,\infty), \tag{35}$$
producing a general solution of the heat equation (32) in
the form:
$$u(t,x)=\sum_{n=-\infty}^{\infty}c_n(0)e^{-n^2t}e^{inx}=\sum_{n=-\infty}^{\infty}c_n(0)e^{-n^2t+inx}, \tag{36}$$
where the constants cn(0) could be defined from the initial
condition. For example, if it is known that the initial distribution
of temperature was u(0,x)=g(x) for a function
g(x)∈L2[−π,π] then cn(0) is the n-th
Fourier coefficient of g(x).
The general solution (36) helps produce both the
analytical study of the heat equation (32) and
numerical simulation. For example, from (36)
it obviously follows that
-
the temperature is rapidly relaxing toward the thermal equilibrium
with the temperature given by c0(0), however it never reaches it
within a finite time;
- the “higher frequencies” (bigger thermal gradients) have a
bigger speed of relaxation; etc.
An example of numerical simulation for the initial value problem
with g(x)=2cos(2x) + 1.5sin(x) is shown in Figure 11. It clearly illustrates our
above conclusions.
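A simulation of this kind can be sketched in a few lines. The code below is an illustration, not the program used for the figure: it assumes the initial profile g(x)=2cos(2x)+1.5sin(x) (reading the variable in the formula above as x), approximates the coefficients cn(0) by the midpoint rule, truncates the series to |n|≤5, and evolves each mode by the factor e^{−n²t}. The function names are hypothetical.

```python
import cmath
import math

def g(x):
    """Initial temperature distribution u(0, x)."""
    return 2 * math.cos(2 * x) + 1.5 * math.sin(x)

def initial_coefficients(profile, n_max, steps=1024):
    """c_n(0) = (1/2pi) * integral of profile(x) e^{-inx} over [-pi, pi], midpoint rule."""
    h = 2 * math.pi / steps
    xs = [-math.pi + (j + 0.5) * h for j in range(steps)]
    return {n: sum(profile(x) * cmath.exp(-1j * n * x) for x in xs) * h / (2 * math.pi)
            for n in range(-n_max, n_max + 1)}

def heat_solution(c0, t, x):
    """u(t, x) = sum_n c_n(0) exp(-n^2 t + inx), truncated to the stored modes."""
    return sum(c * cmath.exp(-n * n * t + 1j * n * x) for n, c in c0.items()).real

def exact_solution(t, x):
    """Closed form for this particular g: each harmonic decays at its own rate."""
    return 2 * math.exp(-4 * t) * math.cos(2 * x) + 1.5 * math.exp(-t) * math.sin(x)

c0 = initial_coefficients(g, n_max=5)
for t in (0.0, 0.3, 2.0):
    print(t, heat_solution(c0, t, 0.5), exact_solution(t, 0.5))
```

The cos(2x) component decays as e^{−4t} while the sin(x) component decays only as e^{−t}, which is the faster relaxation of higher frequencies noted above; here c0(0)=0, so the equilibrium temperature is zero.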
Figure 12: Two oscillations with unharmonious frequencies and the appearing dissonance (blue and green: pure harmonics; red: dissonance).
Figure 13: Graphics of G5 performed on different musical instruments. Samples are taken from Sound Library.
Figure 14: Fourier series for G5 performed on different musical instruments (same order and colour as on the previous Figure).
Figure 15: Limits of the Fourier analysis: different frequencies separated in time; panels (a), (b), (c).
Example 19
Among the oldest periodic functions in human culture are the acoustic
waves of musical tones. The mathematical theory of music (including
rudiments of the Fourier analysis!) is as old
as mathematics itself and was highly respected already in
Pythagoras’ school more than 2500 years
ago. The earliest observations are that
- The musical sounds are made of pure harmonics (see the blue
and green graphs in Figure 12); in our
language, the cos and sin functions form a basis;
- Not every two pure harmonics are compatible: to be so, their
frequencies should be in a simple ratio. Otherwise a dissonance
(the red graph in Figure 12) appears.
A musical tone, say G5, performed on different instruments clearly
has something in common and something different, see
Figure 13 for a comparison. The decomposition into
pure harmonics, i.e. finding the Fourier coefficients of the signal,
can provide a complete characterisation, see
Figure 14.
The Fourier analysis tells us that:
- All the sounds have the same base (i.e. the lowest) frequency,
which corresponds to the G5 tone, i.e. 788 Hz.
- The higher frequencies, which are necessarily multiples of
788 Hz to avoid dissonance, appear with different weights for
different instruments.
The Fourier analysis is very useful in signal processing and
is indeed a fundamental tool. However it is not universal and has
very serious limitations. Consider the simple case of the signals
plotted in Figure 15(a) and (b). They
are both made out of the same two pure harmonics:
- In the first signal the two harmonics (drawn in blue and
green) follow one after another in
time, see Figure 15(a);
- In the second signal they are blended in equal proportions over the whole interval,
see Figure 15(b).
These appear to be two very different signals. However the Fourier
transforms performed over the whole interval do not seem to be very
different, see Figure 15(c). Both
transforms (drawn in blue-green and pink) have two major peaks
corresponding to the pure frequencies. It is not easy to
extract the differences between the signals from their Fourier transforms
(yet this should be possible according to our study).
A better picture can be obtained if we use the windowed
Fourier transform, namely use
a sliding “window” of constant width instead of the
entire interval for the Fourier transform. An even better analysis
can be obtained by means of wavelets, already mentioned in
Remark 14 in
connection with Plancherel’s formula. Roughly, wavelets correspond to
a sliding window of a variable size: narrow for high frequencies and
wide for low.
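The effect described above can be reproduced on a toy discrete signal. The sketch below is an illustration only: the sample count `N` and the two frequencies `F1`, `F2` are hypothetical choices, and a plain discrete Fourier sum stands in for the Fourier transform of the figures.

```python
import cmath
import math

def dft_magnitude(samples, k):
    """Magnitude of the k-th discrete Fourier coefficient of `samples`."""
    n = len(samples)
    return abs(sum(s * cmath.exp(-2j * math.pi * k * j / n)
                   for j, s in enumerate(samples)))

# Hypothetical parameters: N samples, harmonics with F1 and F2 cycles per interval.
N, F1, F2 = 128, 8, 20
h1 = [math.sin(2 * math.pi * F1 * j / N) for j in range(N)]
h2 = [math.sin(2 * math.pi * F2 * j / N) for j in range(N)]

sequential = h1[:N // 2] + h2[N // 2:]              # harmonics one after another
blended = [0.5 * (a + b) for a, b in zip(h1, h2)]   # equal mix on the whole interval

# Transform over the whole interval: peak heights at F1 and F2.
whole_seq = (dft_magnitude(sequential, F1), dft_magnitude(sequential, F2))
whole_mix = (dft_magnitude(blended, F1), dft_magnitude(blended, F2))

# Windowed transform over the first half only; the window is half as long,
# so the same harmonics sit in bins F1 // 2 and F2 // 2.
half = N // 2
window_seq = (dft_magnitude(sequential[:half], F1 // 2),
              dft_magnitude(sequential[:half], F2 // 2))
window_mix = (dft_magnitude(blended[:half], F1 // 2),
              dft_magnitude(blended[:half], F2 // 2))

print("whole interval:", whole_seq, whole_mix)
print("first window:  ", window_seq, window_mix)
```

With these particular parameters the peak heights over the whole interval coincide for both signals, while the first window already separates them: the sequential signal shows only the first harmonic there, the blended one shows both.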
6 Operators
All the space’s a stage,
and all functionals and operators
merely players!
All our previous considerations were only a preparation of the stage and
now the main actors come forward to perform a play. Vector spaces
are not so interesting while we consider them statically; what really
makes them exciting is their transformations. The natural first
step is to consider transformations which respect both the linear
structure and the norm.
6.1 Linear operators
Definition 1
A linear operator T between two normed spaces X and Y is a mapping T: X→Y such that
T(λv + µu) = λT(v) + µT(u). The kernel ker T and the image Im T of a linear operator are defined by
\[ \ker T = \{x\in X : Tx = 0\}, \qquad \operatorname{Im} T = \{y\in Y : y = Tx \text{ for some } x\in X\}. \]
Exercise 2
Show that kernel of T is a linear subspace of X and image of
T is a linear subspace of Y.
As usual we are interested also in connections with the second
(topological) structure:
Definition 3
The norm of a linear operator is defined by:
\[ \|T\| = \sup\{ \|Tx\|_Y : \|x\|_X \le 1 \}. \tag{37} \]
T is a bounded linear
operator if
‖T‖ = sup{‖Tx‖ : ‖x‖ ≤ 1} < ∞.
Exercise 4
Show that ||Tx||≤ ||T||·||x|| for all x∈ X.
Example 5
Consider the following examples and determine kernel and images of
the mentioned operators.
-
On a normed space X define the zero
operator to a space Y by Z:
x→ 0 for all x∈ X. Its norm is 0.
-
On a normed space X define the identity
operator by IX:
x→ x for all x∈ X. Its norm is 1.
-
On a normed space X any linear functional defines a linear
operator from X to ℂ; its norm as an
operator is the same as its norm as a functional.
- The operators from $ℂ^n$ to $ℂ^m$
are given by m×n matrices, which act on vectors by
matrix multiplication. All linear operators on finite-dimensional
spaces are bounded.
-
On l2, let
S(x1,x2,…)=(0,x1,x2,…) be the right
shift operator. Clearly
||Sx||=||x|| for all x, so ||S||=1.
-
On L2[a,b], let w(t)∈ C[a,b]
and define the multiplication operator $M_w$ by $(M_w f)(t)=w(t)f(t)$. Now:
\[ \begin{aligned} \|M_w f\|^2 &= \int_a^b |w(t)|^2\, |f(t)|^2\, dt \\ &\le K^2 \int_a^b |f(t)|^2\, dt, \quad\text{where } K = \|w\|_\infty = \sup_{t\in[a,b]} |w(t)|, \end{aligned} \]
so $\|M_w\| \le K$.
Exercise 6
Show that for the multiplication operator there is in fact
equality of norms: $\|M_w\| = \|w\|_\infty$.
Theorem 7
Let T:
X →
Y be a linear operator. The following
conditions are equivalent:
-
T is continuous on X;
- T is continuous at the point 0.
- T is a bounded linear operator.
Proof.
The proof essentially follows that of the similar Theorem 4.
□
6.2 Orthoprojections
Here we will use the orthogonal
complement, see § 3.5, to introduce a class of linear
operators: orthogonal projections. Despite (or rather due to)
their extreme simplicity these operators are among the most frequently
used tools in the theory of Hilbert spaces.
Corollary 8 (of Thm. 23, about Orthoprojection)
Let M be a closed
linear subspace of a Hilbert space H. There is a linear map
$P_M$ from H onto M (the orthogonal
projection or orthoprojection) such that
\[ P_M^2 = P_M, \qquad \ker P_M = M^\perp, \qquad P_{M^\perp} = I - P_M. \tag{38} \]
Proof.
Let us define $P_M(x)=m$ where x=m+n is the decomposition from
the previous theorem. The linearity of this operator follows from
the fact that both M and $M^\perp$ are linear subspaces. Also
$P_M(m)=m$ for all m∈M and the image of $P_M$ is
M. Thus $P_M^2=P_M$. Also if $P_M(x)=0$ then x⊥M,
i.e. $\ker P_M=M^\perp$. Similarly $P_{M^\perp}(x)=n$ where
x=m+n and $P_M+P_{M^\perp}=I$.
□
Example 9
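Let $(e_n)$ be an orthonormal basis in a Hilbert space and let
S⊂ℕ be fixed. Let $M=\operatorname{CLin}\{e_n : n\in S\}$
and $M^\perp=\operatorname{CLin}\{e_n : n\in \mathbb{N}\setminus S\}$. Then
\[ P_M x = \sum_{n\in S} \langle x, e_n\rangle e_n, \qquad P_{M^\perp} x = \sum_{n\in \mathbb{N}\setminus S} \langle x, e_n\rangle e_n. \]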
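A finite-dimensional instance of this example is easy to test numerically. The sketch below assumes the standard basis of $ℂ^N$ as $(e_n)$, so that the projection simply keeps the coordinates with indices in S; the function name `orthoprojection` is hypothetical.

```python
def orthoprojection(x, S):
    """P_M x = sum of <x, e_n> e_n over n in S, for the standard basis of C^N."""
    return [xn if n in S else 0 for n, xn in enumerate(x)]

def inner(x, y):
    """Inner product on C^N, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

x = [1 + 2j, -3.0, 0.5j, 4.0]
S = {0, 2}
S_perp = set(range(len(x))) - S

p = orthoprojection(x, S)       # component in M
q = orthoprojection(x, S_perp)  # component in the orthogonal complement

print(orthoprojection(p, S) == p)             # P_M^2 = P_M
print([a + b for a, b in zip(p, q)] == x)     # P_M + P_{M^perp} = I
print(inner(p, q))                            # the two components are orthogonal
```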
6.3 B(H) as a Banach space (and even algebra)
Theorem 11
Let B(X,Y) be the space of bounded linear operators
from X to Y with the norm defined above. If Y is
complete, then B(X,Y) is a Banach space.
Proof.
The proof repeats the proof of Theorem 8,
which is a particular case of the present theorem for
Y=ℂ, see Example 3.
□
Theorem 12
Let T∈B(X,Y) and S∈B(Y,Z), where
X, Y, and Z are normed spaces. Then
ST∈B(X,Z) and ‖ST‖ ≤ ‖S‖ ‖T‖.
Proof.
Clearly (ST)x=S(Tx)∈Z, and
\[ \|STx\| \le \|S\|\,\|Tx\| \le \|S\|\,\|T\|\,\|x\|, \]
which implies the norm estimate if ‖x‖≤1.
□
Corollary 13
Let T∈B(X,X)=B(X),
where X is a
normed space. Then for any n≥1, $T^n∈B(X)$
and $\|T^n\| \le \|T\|^n$.
Proof.
This is an induction on n with the trivial base n=1 and the step
following from the previous theorem.
□
Definition 15
Let T∈B(X,Y). We say T is an
invertible operator if there exists S∈B(Y,X) such that
\[ ST = I_X \quad\text{and}\quad TS = I_Y. \]
Such an S is called the inverse
operator of T.
Exercise 16 Show that
- for an invertible operator T:X→Y we have ker T={0} and Im T=Y.
- the inverse operator is unique (if it exists at all).
(Assume existence of S and S′, then consider the operator
STS′.)
Example 17
We consider inverses to the operators from Example 5.
- The zero operator is never invertible, except for the pathological
spaces X=Y={0}.
- The identity operator $I_X$ is the inverse of itself.
- A linear functional is not invertible unless it is non-zero
and X is one dimensional.
- An operator $ℂ^n→ℂ^m$ is
invertible if and only if m=n and the corresponding square matrix
is non-singular, i.e. has non-zero determinant.
- The right shift S is not
invertible on $l_2$ (it is one-to-one but is not
onto). But the left shift operator
$T(x_1,x_2,…)=(x_2,x_3,…)$ is its left
inverse, i.e. TS=I but
ST≠I since ST(1,0,0,…)=(0,0,…). T is
not invertible either (it is onto but not one-to-one), however S
is its right inverse.
- The operator of
multiplication $M_w$ is invertible if and only if
$w^{-1}∈C[a,b]$, and the inverse is $M_{w^{-1}}$. For
example $M_{1+t}$ is invertible on $L_2[0,1]$ and
$M_t$ is not.
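The one-sided invertibility of the shifts can be checked directly on sequences with finitely many non-zero terms. In this minimal sketch such a sequence is represented by a tuple of its leading terms (all omitted terms are understood to be zero); the function names are hypothetical.

```python
def right_shift(x):
    """S(x1, x2, ...) = (0, x1, x2, ...)."""
    return (0,) + tuple(x)

def left_shift(x):
    """T(x1, x2, ...) = (x2, x3, ...)."""
    return tuple(x[1:])

x = (1, 0, 0)
print(left_shift(right_shift(x)))   # TS x = x, so T is a left inverse of S
print(right_shift(left_shift(x)))   # ST x = (0, 0, 0), so ST is not the identity
```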
6.4 Adjoints
Theorem 18
Let H and K be Hilbert spaces and T∈B(H,K).
Then there exists an operator T*∈B(K,H) such that
\[ \langle Th, k\rangle_K = \langle h, T^*k\rangle_H \quad\text{for all } h\in H,\ k\in K. \]
Such a T* is called the adjoint
operator of T. Also T**=T and ‖T*‖=‖T‖.
Proof.
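For any fixed k∈K the expression $h \mapsto \langle Th,k\rangle_K$ defines a
bounded linear functional on H. By the
Riesz–Fréchet lemma there is a
unique y∈H such that $\langle Th,k\rangle_K=\langle h,y\rangle_H$ for all
h∈H. Define T*k = y; then T* is linear:
\[ \begin{aligned} \langle h, T^*(\lambda_1 k_1+\lambda_2 k_2)\rangle_H &= \langle Th, \lambda_1 k_1+\lambda_2 k_2\rangle_K \\ &= \bar\lambda_1\langle Th,k_1\rangle_K + \bar\lambda_2\langle Th,k_2\rangle_K \\ &= \bar\lambda_1\langle h,T^*k_1\rangle_H + \bar\lambda_2\langle h,T^*k_2\rangle_H \\ &= \langle h, \lambda_1 T^*k_1+\lambda_2 T^*k_2\rangle_H. \end{aligned} \]
So $T^*(\lambda_1 k_1+\lambda_2 k_2)=\lambda_1 T^*k_1+\lambda_2 T^*k_2$.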
T** is defined by ⟨k,T**h⟩=⟨T*k,h⟩
and the identity
⟨T**h,k⟩=⟨h,T*k⟩=⟨Th,k⟩ for all h and k shows T**=T.
Also:
\[ \begin{aligned} \|T^*k\|^2 &= \langle T^*k, T^*k\rangle = \langle k, TT^*k\rangle \\ &\le \|k\|\cdot\|TT^*k\| \le \|k\|\cdot\|T\|\cdot\|T^*k\|, \end{aligned} \]
which implies ‖T*k‖ ≤ ‖T‖·‖k‖,
consequently ‖T*‖ ≤ ‖T‖. The opposite inequality
follows by applying this estimate to T* and using T**=T: ‖T‖=‖T**‖ ≤ ‖T*‖.
□
Exercise 19
-
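For operators $T_1$ and $T_2$ show that
\[ (T_1T_2)^* = T_2^*T_1^*, \qquad (T_1+T_2)^* = T_1^*+T_2^*, \qquad (\lambda T)^* = \bar\lambda T^*. \]
-
If A is an operator on a Hilbert space H then $(\ker A)^\perp = \overline{\operatorname{Im} A^*}$ (the closure of the image of A*).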
6.5 Hermitian, unitary and normal operators
Definition 20
An operator T: H→H is a Hermitian
operator or self-adjoint operator
if T=T*, i.e. ⟨Tx,y⟩=⟨x,Ty⟩
for all x, y∈H.
Example 21
-
On $l_2$ the adjoint S* to the
right shift operator
S is given by the left shift
S*=T, indeed:
\[ \begin{aligned} \langle Sx,y\rangle &= \langle (0,x_1,x_2,\ldots),(y_1,y_2,\ldots)\rangle \\ &= x_1\bar y_2 + x_2\bar y_3 + \cdots = \langle (x_1,x_2,\ldots),(y_2,y_3,\ldots)\rangle \\ &= \langle x,Ty\rangle. \end{aligned} \]
Thus S is not Hermitian.
- Let D be the diagonal operator on $l_2$
given by
\[ D(x_1,x_2,\ldots) = (\lambda_1 x_1, \lambda_2 x_2, \ldots), \]
where $(\lambda_k)$ is any bounded complex sequence. It is easy to
check that
$\|D\| = \|(\lambda_n)\|_\infty = \sup_k |\lambda_k|$
and
\[ D^*(x_1,x_2,\ldots) = (\bar\lambda_1 x_1, \bar\lambda_2 x_2, \ldots), \]
thus D is Hermitian if and only if $\lambda_k∈ℝ$
for all k.
- If T: $ℂ^n→ℂ^n$ is represented
by multiplication of a column vector by a matrix A, then
T* is multiplication by the matrix A*, the conjugate
transpose of A.
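The last item can be verified numerically. The following sketch, with hypothetical helper names and an arbitrarily chosen 2×2 matrix, checks the defining identity ⟨Ax,y⟩=⟨x,A*y⟩ for the conjugate transpose, using an inner product which is linear in the first argument as in the computation for the shift above.

```python
def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Multiply the matrix A (a list of rows) by the column vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of the matrix A."""
    rows, cols = len(A), len(A[0])
    return [[complex(A[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

A = [[1 + 1j, 2], [0, 3 - 2j]]   # an arbitrary example matrix
x = [1j, 2.0]
y = [1.0, -1 + 1j]

lhs = inner(apply(A, x), y)            # <Ax, y>
rhs = inner(x, apply(adjoint(A), y))   # <x, A*y>
print(lhs, rhs)
```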
Exercise 22
Show that for any bounded operator T the operators
$T_r=\frac{1}{2}(T+T^*)$, $T_i=\frac{1}{2i}(T-T^*)$, T*T and TT* are
Hermitian. Note that any operator is a linear combination of two
Hermitian operators: $T=T_r+iT_i$ (cf. z= ℜ z + i ℑ
z for z∈ℂ).
To appreciate the next Theorem the following exercise is useful:
Exercise 23
Let H be a Hilbert space. Show that
-
For x∈H we have ‖x‖ = sup{ |⟨x,y⟩| for all y∈H such
that ‖y‖=1 }.
- For T∈B(H) we have
\[ \|T\| = \sup\{\, |\langle Tx,y\rangle| \ \text{for all } x,y\in H \text{ such that } \|x\|=\|y\|=1 \,\}. \tag{39} \]
The next theorem says, that for a Hermitian operator T the
supremum in (39) may be taken over
the “diagonal” x=y only.
Theorem 24
Let T be a Hermitian operator on a Hilbert space. Then
\[ \|T\| = \sup_{\|x\|=1} |\langle Tx,x\rangle|. \]
Proof.
If Tx=0 for all x∈H, both sides of the identity are
0. So we suppose that there exists x∈H for which Tx≠0.
We see that $|\langle Tx,x\rangle| \le \|Tx\|\,\|x\| \le \|T\|\,\|x\|^2$, so $\sup_{\|x\|=1} |\langle Tx,x\rangle| \le \|T\|$.
To get the inequality the other way around, we first write
$s := \sup_{\|x\|=1} |\langle Tx,x\rangle|$. Then for any
x∈H, we have $|\langle Tx,x\rangle| \le s\|x\|^2$.
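We now consider
\[ \langle T(x+y),x+y\rangle = \langle Tx,x\rangle + \langle Tx,y\rangle + \langle Ty,x\rangle + \langle Ty,y\rangle = \langle Tx,x\rangle + 2\Re\langle Tx,y\rangle + \langle Ty,y\rangle \]
(because T being Hermitian gives $\langle Ty,x\rangle = \langle y,Tx\rangle = \overline{\langle Tx,y\rangle}$) and, similarly,
\[ \langle T(x-y),x-y\rangle = \langle Tx,x\rangle - 2\Re\langle Tx,y\rangle + \langle Ty,y\rangle. \]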
Subtracting gives
\[ \begin{aligned} 4\Re\langle Tx,y\rangle &= \langle T(x+y),x+y\rangle - \langle T(x-y),x-y\rangle \\ &\le s\left(\|x+y\|^2 + \|x-y\|^2\right) \\ &= 2s\left(\|x\|^2 + \|y\|^2\right), \end{aligned} \]
by the parallelogram identity.
Now, for x∈H such that Tx≠0, we put $y=\|Tx\|^{-1}\|x\|\,Tx$. Then ‖y‖=‖x‖ and when we
substitute into the previous inequality, we get
\[ 4\|Tx\|\,\|x\| = 4\Re\langle Tx,y\rangle \le 4s\|x\|^2. \]
So
‖Tx‖ ≤ s‖x‖ and it follows that ‖T‖ ≤ s, as required.
□
Definition 25
We say that U: H→H is a unitary
operator on a Hilbert space H if
$U^*=U^{-1}$, i.e. U*U=UU*=I.
Example 26
-
If $D: l_2→l_2$ is a
diagonal
operator such that $De_k=\lambda_k e_k$, then $D^*e_k=\bar\lambda_k e_k$ and
D is unitary if and only if $|\lambda_k|=1$ for all
k.
- The shift operator S
satisfies S*S=I but SS*≠I, thus S is
not unitary.
Theorem 27
For an operator U on a complex Hilbert space H the following are
equivalent:
-
U is unitary;
-
U is a surjection and an
isometry,
i.e. ‖Ux‖=‖x‖ for all x∈H;
-
U is a surjection and preserves the inner product,
i.e. ⟨ Ux,Uy
⟩=⟨ x,y
⟩ for all x, y∈ H.
Proof.
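1⇒2. Clearly
unitarity of an operator implies its invertibility and hence
surjectivity. Also
\[ \|Ux\|^2 = \langle Ux,Ux\rangle = \langle x,U^*Ux\rangle = \langle x,x\rangle = \|x\|^2. \]
2⇒3. We use
the polarisation identity (cf. polarisation in equation (16)):
\[ \begin{aligned} 4\langle Tx,y\rangle &= \langle T(x+y),x+y\rangle + i\langle T(x+iy),x+iy\rangle \\ &\quad - \langle T(x-y),x-y\rangle - i\langle T(x-iy),x-iy\rangle \\ &= \sum_{k=0}^{3} i^k \langle T(x+i^k y), x+i^k y\rangle. \end{aligned} \]
Take T=U*U and T=I, then
\[ \begin{aligned} 4\langle U^*Ux,y\rangle &= \sum_{k=0}^{3} i^k \langle U^*U(x+i^k y), x+i^k y\rangle \\ &= \sum_{k=0}^{3} i^k \langle U(x+i^k y), U(x+i^k y)\rangle \\ &= \sum_{k=0}^{3} i^k \langle x+i^k y, x+i^k y\rangle \\ &= 4\langle x,y\rangle. \end{aligned} \]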
3⇒1. Indeed
⟨U*Ux,y⟩=⟨x,y⟩ implies
⟨(U*U−I)x,y⟩=0 for all x, y∈H, then U*U=I. Since
U is surjective, for any y∈H there is x∈H such that
y=Ux. Then, using the already established
fact U*U=I, we get
\[ UU^*y = UU^*(Ux) = U(U^*U)x = Ux = y. \]
Thus we have UU*=I as well and U is unitary.
□
Definition 28
A normal operator T is one for which
T*T=
TT*.
Example 29
-
Any self-adjoint operator T is normal, since T*=T.
- Any unitary operator U is normal, since U*U=I=UU*.
- Any diagonal operator D
is normal, since $De_k=\lambda_k e_k$, $D^*e_k=\bar\lambda_k e_k$,
and $DD^*e_k=D^*De_k=|\lambda_k|^2 e_k$.
- The shift operator S is
not normal.
- A finite matrix is normal (as an operator on
l2n) if and only if it has an orthonormal basis
in which it is diagonal.
7 Spectral Theory
Beware of ghosts2 in this area!
As we saw, operators can be added and multiplied with each other; in some
sense they behave like numbers, but are much more complicated. In this
lecture we will associate to each operator a set of complex numbers
which reflects certain (unfortunately not all) properties of this operator.
The analogy between operators and numbers becomes even deeper
since we can construct functions of operators (called
functional calculus) in the way we build
numeric functions. The most important function of this sort is called the
resolvent (see Definition 5). The methods of
analytic functions are very powerful in operator theory and students
may wish to refresh their knowledge of complex analysis before this
part.
7.1 The spectrum of an operator on a Hilbert space
An eigenvalue of operator
T∈B(H) is a complex number λ such that there
exists a nonzero x∈ H, called
eigenvector with property
Tx=λ x, in other words x∈ker(T−λ I).
In finite
dimensions T−λ I is invertible if and only if λ
is not an eigenvalue.
In infinite dimensions it is not the same:
the right shift operator S is not
invertible but 0 is not its eigenvalue because Sx=0 implies
x=0 (check!).
Definition 1
The resolvent set ρ(T) of an operator T is the set
\[ \rho(T) = \{\lambda\in\mathbb{C} : T-\lambda I \text{ is invertible}\}. \]
The spectrum of an operator T∈B(H), denoted σ(T), is the complement of
the resolvent set ρ(T):
\[ \sigma(T) = \{\lambda\in\mathbb{C} : T-\lambda I \text{ is not invertible}\}. \]
Example 2
If H is finite dimensional then from the previous discussion it follows
that σ(T) is the set of eigenvalues of T for any T.
Even this example demonstrates that the spectrum does not provide a
complete description of an operator even in the finite-dimensional
case. For example, both operators in $ℂ^2$ given by the matrices
$\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ and
$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ have the single point spectrum {0}, however they are
rather different. The situation becomes even worse in
infinite dimensional spaces.
Theorem 3
The spectrum σ(T) of a bounded
operator T is a nonempty compact (i.e. closed and bounded)
subset of ℂ.
For the proof we will need several Lemmas.
Lemma 4
Let A∈B(H). If ‖A‖<1 then I−A is
invertible in B(H) and the inverse is given by the
Neumann series (C. Neumann, 1877):
\[ (I-A)^{-1} = I + A + A^2 + A^3 + \ldots = \sum_{k=0}^{\infty} A^k. \tag{40} \]
Proof.
Define the sequence of operators
$B_n = I + A + \cdots + A^n$, the partial sums
of the infinite series (40). It is a
Cauchy sequence, indeed:
\[ \begin{aligned} \|B_n - B_m\| &= \|A^{m+1} + A^{m+2} + \cdots + A^n\| \quad (\text{if } m<n) \\ &\le \|A^{m+1}\| + \|A^{m+2}\| + \cdots + \|A^n\| \\ &\le \|A\|^{m+1} + \|A\|^{m+2} + \cdots + \|A\|^n \\ &\le \frac{\|A\|^{m+1}}{1-\|A\|} < \epsilon \end{aligned} \]
for a large m. By the
completeness of B(H) there is a limit, say B, of the sequence
$B_n$. It is simple algebra to check that
$(I-A)B_n = B_n(I-A) = I - A^{n+1}$; passing to the limit in the norm
topology, where $A^{n+1}→0$ and $B_n→B$, we get:
\[ (I-A)B = B(I-A) = I \iff B = (I-A)^{-1}. \]
□
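The convergence of the partial sums $B_n$ is easy to observe for matrices. The sketch below is a finite-dimensional illustration only: the 2×2 matrix `A` is an arbitrary choice whose Frobenius norm (and hence operator norm) is less than 1, and all helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_partial_sum(A, terms):
    """B_n = I + A + ... + A^n with n = terms, cf. the series (40)."""
    total, power = identity(len(A)), identity(len(A))
    for _ in range(terms):
        power = matmul(power, A)
        total = matadd(total, power)
    return total

A = [[0.2, 0.3], [0.1, 0.4]]   # arbitrary example with norm < 1
B = neumann_partial_sum(A, 60)
I_minus_A = [[i - a for i, a in zip(ri, ra)] for ri, ra in zip(identity(2), A)]

# (I - A) B should be the identity up to rounding errors.
product = matmul(I_minus_A, B)
residual = max(abs(product[i][j] - identity(2)[i][j])
               for i in range(2) for j in range(2))
print(residual)
```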
Definition 5
The resolvent
of an operator T is the operator valued
function defined on the resolvent set by the formula:
\[ R(\lambda,T) = (T-\lambda I)^{-1}. \tag{41} \]
Corollary 6
-
If | λ |>||T|| then
λ∈ ρ(T), hence the spectrum is
bounded.
-
The resolvent set ρ(T) is open, i.e. for any λ∈ρ(T)
there exists є>0 such
that all µ with |λ−µ|<є are also
in ρ(T); consequently the spectrum is
closed.
Both statements together imply that the spectrum is compact.
Proof.
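- If |λ|>‖T‖ then $\|\lambda^{-1}T\|<1$
and the operator $T-\lambda I = -\lambda(I-\lambda^{-1}T)$
has the inverse
\[ R(\lambda,T) = (T-\lambda I)^{-1} = -\sum_{k=0}^{\infty} \lambda^{-k-1} T^k \tag{42} \]
by the
previous Lemma.
- Indeed:
\[ \begin{aligned} T-\mu I &= T-\lambda I + (\lambda-\mu)I \\ &= (T-\lambda I)\left(I + (\lambda-\mu)(T-\lambda I)^{-1}\right). \end{aligned} \]
The last line is an invertible operator because T−λI is
invertible by the assumption and $I+(\lambda-\mu)(T-\lambda I)^{-1}$
is invertible by the
previous Lemma, since
$\|(\lambda-\mu)(T-\lambda I)^{-1}\|<1$ if
$\epsilon < \|(T-\lambda I)^{-1}\|^{-1}$.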
□
Exercise 7
-
Prove the first resolvent
identity:
\[ R(\lambda,T) - R(\mu,T) = (\lambda-\mu) R(\lambda,T) R(\mu,T). \tag{43} \]
- Use the identity (43) to show that
$(T-\mu I)^{-1} → (T-\lambda I)^{-1}$ as µ→λ.
-
Use the identity (43) to show that
for z∈ρ(T) the complex derivative $\frac{d}{dz} R(z,T)$
of the resolvent R(z,T) is well defined, i.e. the resolvent is
an analytic operator valued function of z.
Lemma 8
The spectrum is non-empty.
Proof.
Let us assume the opposite, σ(T)=∅; then the resolvent
function R(λ,T) is well defined for all
λ∈ℂ. As can be seen from the Neumann
series (42), ‖R(λ,T)‖→0
as λ→∞. Thus for any vectors x, y∈H the function
f(λ)=⟨R(λ,T)x,y⟩ is an
analytic (see Exercise 3)
function tending to zero at infinity. Then by the Liouville theorem from
complex analysis f is identically zero for all x and y, i.e.
R(λ,T)=0, which is impossible. Thus the
spectrum is not empty.
□
Proof.[Proof of Theorem
3]
Spectrum is nonempty by Lemma
8 and compact
by Corollary
6.
□
7.2 The spectral radius formula
The following definition is of interest.
Definition 10
The spectral radius of T is
\[ r(T) = \sup\{ |\lambda| : \lambda\in\sigma(T) \}. \]
From Lemma 1 it immediately follows that
r(T)≤‖T‖. A more accurate estimate is given by the
following theorem.
Theorem 11
For a bounded operator T we have
\[ r(T) = \lim_{n\to\infty} \|T^n\|^{1/n}. \tag{44} \]
We start from the following general lemma:
Lemma 12
Let a sequence $(a_n)$ of positive real numbers satisfy the
inequalities $0\le a_{m+n}\le a_m+a_n$ for all m and
n. Then the limit $\lim_{n\to\infty}(a_n/n)$ exists
and is equal to $\inf_n (a_n/n)$.
Proof.
The statement follows from the observation that for any n and
m=nk+l with 0≤l<n we have $a_m \le ka_n + la_1$,
thus, for big m, we get $a_m/m \le a_n/n + la_1/m \le a_n/n + \epsilon$.
□
Proof.[Proof of Theorem
11]
The existence of the limit
$\lim_{n\to\infty}\|T^n\|^{1/n}$
in (44) follows from the previous Lemma applied to $a_n=\log\|T^n\|$,
since the inequality ‖ST‖≤‖S‖‖T‖ of Theorem 12 (Section 6.3) gives
$\log\|T^{n+m}\| \le \log\|T^n\| + \log\|T^m\|$. Now we
use some results from complex analysis. The Laurent series
for the resolvent R(λ,T) in the neighbourhood of infinity
is given by the Neumann series (42). The
radius of its convergence (which is equal, obviously, to r(T))
is, by the Hadamard theorem, exactly
$\lim_{n\to\infty}\|T^n\|^{1/n}$.
□
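Formula (44) can be observed numerically on a small matrix. The sketch below is an illustration under explicit assumptions: the triangular matrix `T` is an arbitrary example with the single eigenvalue 0.5, and the Frobenius norm is swapped in for the operator norm (in finite dimensions all norms are equivalent, so the limit of the n-th roots is the same). The helper names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius(A):
    """Frobenius norm, used here in place of the operator norm."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

def root_of_power_norm(T, n):
    """Compute ||T^n||^(1/n) for n >= 1."""
    power = T
    for _ in range(n - 1):
        power = matmul(power, T)
    return frobenius(power) ** (1.0 / n)

T = [[0.5, 1.0], [0.0, 0.5]]   # spectrum {0.5}, but the norm exceeds 1

for n in (1, 10, 50, 200):
    print(n, root_of_power_norm(T, n))
```

The printed values start above 1 and slowly decrease towards the spectral radius 0.5, showing that r(T) can be much smaller than ‖T‖ for a non-normal operator.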
Corollary 13
There exists λ∈σ(T) such that |λ|=r(T).
Proof.
Indeed, as is known from complex analysis, the boundary of the
convergence circle of a Laurent (or Taylor) series contains a
singular point; a singular point of the resolvent obviously
belongs to the spectrum.
□
Example 14
Let us consider the left shift
operator S*. For any λ∈ℂ such that |λ|<1
the vector $(1,\lambda,\lambda^2,\lambda^3,\ldots)$ is in $l_2$
and is an eigenvector of S* with eigenvalue λ, so the
open unit disk |λ|<1 is contained in σ(S*). On
the other hand the spectrum of S* is contained in the closed unit disk
|λ|≤1 since r(S*)≤‖S*‖=1. Because the spectrum is closed it should coincide with
the closed unit disk, since the open unit disk is dense in
it. In particular 1∈σ(S*), but it is easy to see that 1
is not an eigenvalue of S*.
Proposition 15
For any T∈B(H) the spectrum of the adjoint operator
is $\sigma(T^*) = \{\bar\lambda : \lambda\in\sigma(T)\}$.
Proof.
If (T−λI)V=V(T−λI)=I then by taking adjoints
$V^*(T^*-\bar\lambda I) = (T^*-\bar\lambda I)V^* = I$. So λ∈ρ(T)
implies $\bar\lambda\in\rho(T^*)$; using the property
T**=T we can invert the implication and get the statement
of the proposition.
□
Example 16
In continuation of Example 14, using the
previous Proposition we conclude that σ(S) is also the
closed unit disk, but S does not have eigenvalues at all!
7.3 Spectrum of Special Operators
Theorem 17
-
If U is a unitary operator then σ(U)⊆
{| z |=1}.
-
If T is Hermitian then σ(T)⊆ ℝ.
Proof.
- If |λ|>1 then $\|\lambda^{-1}U\|<1$ and
then $\lambda I-U=\lambda(I-\lambda^{-1}U)$ is invertible, thus
λ∉σ(U).
If |λ|<1 then ‖λU*‖<1 and
then λI−U=U(λU*−I) is invertible, thus
λ∉σ(U). The remaining set is exactly
{z: |z|=1}.
- Without loss of generality we
can assume that ‖T‖<1, otherwise we can multiply
T by a small real scalar. Let us consider the Cayley
transform, which maps the real axis to the unit circle:
\[ U = (T-iI)(T+iI)^{-1}. \]
Straightforward calculations show that U is unitary if
T is Hermitian. Let us take λ∉ℝ and
λ≠−i (this case can be checked directly by
Lemma 4). Then the Cayley transform
$\mu=(\lambda-i)(\lambda+i)^{-1}$ of
λ is not on the unit circle and thus the operator
\[ U-\mu I = (T-iI)(T+iI)^{-1} - (\lambda-i)(\lambda+i)^{-1} I = 2i(\lambda+i)^{-1}(T-\lambda I)(T+iI)^{-1} \]
is invertible, which implies invertibility of T−λI. So
λ∉σ(T).
□
The above reduction of a self-adjoint operator to a unitary one (it
can be done in the opposite direction as well!) is an important tool
which can be applied to other questions too, e.g. in the following
exercise.
Exercise 18
-
Show that an operator U: f(t) ↦ eitf(t) on
L2[0,2π] is unitary and has the
entire unit circle {| z |=1} as its spectrum .
- Find a self-adjoint operator T with the entire real line
as its spectrum.
8 Compactness
It is not easy to study
linear operators “in general” and there are many questions about
operators in Hilbert spaces raised many decades ago which are still
unanswered. Therefore it is reasonable to single out classes of
operators which have (relatively) simple properties. Such a class of
operators, closer to finite dimensional ones, will be studied here.
These operators are so compact that we can even fit them
in our course.
8.1 Compact operators
Let us recall some topological definitions and results.
Definition 1
A compact set in a metric space is defined
by the property that any covering of it by a family of open sets
contains a subcovering by a finite subfamily.
In the finite dimensional vector spaces $ℝ^n$ or
$ℂ^n$ there is the following equivalent description of
compactness (the equivalence of 1
and 2 is known as the Heine–Borel
theorem):
Theorem 2
If a set E in $ℝ^n$ or $ℂ^n$ has any of the
following properties then it has the other two as well:
- E is bounded and closed;
- E is compact;
- Any infinite subset of E has a limiting point belonging to
E.
Exercise* 3
Which equivalences from above are not true any more in the infinite
dimensional spaces?
Definition 4
Let X and Y be normed spaces. An operator T∈B(X,Y) is a
finite rank operator if Im T is a finite dimensional subspace of Y. T is a
compact operator if whenever
$(x_i)_1^\infty$ is a bounded sequence in X then its image
$(Tx_i)_1^\infty$ has a convergent subsequence in Y.
The set of finite rank operators is denoted by
F(X,Y) and the set of compact
operators by K(X,Y).
Exercise 5
Show that both F(X,Y) and K(X,Y) are
linear subspaces of B(X,Y).
We intend to show that F(X,Y)⊂K(X,Y).
Lemma 6
Let Z be a finite-dimensional normed space. Then there is a
number N and a mapping $S: l_2^N → Z$
which is invertible and such that S and $S^{-1}$ are bounded.
Proof.
The proof is given by an explicit construction. Let
N=dim Z and $z_1, z_2, \ldots, z_N$ be a basis in
Z. Let us define
$S: l_2^N → Z$ by
\[ S(a_1,a_2,\ldots,a_N) = \sum_{k=1}^{N} a_k z_k, \]
then we have an estimate of the norm:
\[ \begin{aligned} \|Sa\| &= \Big\|\sum_{k=1}^{N} a_k z_k\Big\| \le \sum_{k=1}^{N} |a_k|\,\|z_k\| \\ &\le \Big(\sum_{k=1}^{N} |a_k|^2\Big)^{1/2} \Big(\sum_{k=1}^{N} \|z_k\|^2\Big)^{1/2}. \end{aligned} \]
So $\|S\| \le (\sum_1^N \|z_k\|^2)^{1/2}$ and
S is continuous.
Clearly S has the trivial kernel, in particular ‖Sa‖>0
if ‖a‖=1. By the Heine–Borel
theorem the unit sphere in $l_2^N$ is compact,
consequently the continuous function $a \mapsto \|\sum_1^N a_k z_k\|$
attains its lower bound, which has to be positive. This means
there exists δ>0 such that ‖a‖=1 implies
‖Sa‖≥δ, or, equivalently, if ‖z‖<δ then
$\|S^{-1}z\|<1$. The latter means that $\|S^{-1}\| \le \delta^{-1}$,
i.e. $S^{-1}$ is bounded.
□
Corollary 7
For any two normed spaces X and Y we have
F(X,Y)⊂ K(X,Y).
Proof.
Let T∈F(X,Y). If $(x_n)_1^\infty$ is a bounded
sequence in X then $(Tx_n)_1^\infty ⊂ Z = \operatorname{Im} T$ is
also bounded. Let $S: l_2^N → Z$ be the map
constructed in the above Lemma.
The sequence $(S^{-1}Tx_n)_1^\infty$ is bounded in
$l_2^N$ and thus has a limiting point, say $a_0$. Then
$Sa_0$ is a limiting point of $(Tx_n)_1^\infty$, that is,
$(Tx_n)$ has a subsequence convergent to $Sa_0$.
□
There is a simple condition which determines
which diagonal operators are compact (in particular the
identity operator $I_X$ is
not compact if dim X=∞):
Proposition 8
Let T be a diagonal operator
given by the identities $Te_n=\lambda_n e_n$ for all n in a
basis $(e_n)$. Then T is compact if and only if $\lambda_n → 0$.
Figure 16: Distance between scales of orthonormal vectors
Proof.
If $\lambda_n \not\to 0$ then there exists a subsequence
$\lambda_{n_k}$ and δ>0 such that
$|\lambda_{n_k}|>\delta$ for all k. Now the sequence
$(e_{n_k})$ is bounded but its image
$Te_{n_k}=\lambda_{n_k}e_{n_k}$ has no convergent subsequence because for any
k≠l:
\[ \|\lambda_{n_k}e_{n_k} - \lambda_{n_l}e_{n_l}\| = \left(|\lambda_{n_k}|^2 + |\lambda_{n_l}|^2\right)^{1/2} \ge \sqrt{2}\,\delta, \]
i.e. no subsequence of $(Te_{n_k})$ is a Cauchy sequence, see
Figure 16.
For the converse, note that if $\lambda_n → 0$ then we
can define a finite rank operator $T_m$, m≥1, the
m-“truncation” of T, by:
\[ T_m e_n = \begin{cases} Te_n = \lambda_n e_n, & 1\le n\le m; \\ 0, & n>m. \end{cases} \tag{45} \]
Then obviously
\[ (T-T_m) e_n = \begin{cases} 0, & 1\le n\le m; \\ \lambda_n e_n, & n>m, \end{cases} \]
and
$\|T-T_m\| = \sup_{n>m} |\lambda_n| → 0$ if
m→∞. All $T_m$ are finite rank operators (so
are compact) and
T is also compact as their limit, by the
next Theorem.
□
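The estimate $\|T-T_m\|=\sup_{n>m}|\lambda_n|$ used in the proof can be tabulated for a concrete sequence. The sketch below assumes the hypothetical choice $\lambda_n=1/n$ and works with a finite section of the first 1000 eigenvalues, which is enough here because the sequence is decreasing.

```python
def truncation_error(lam, m):
    """sup of |lambda_n| over n > m: the norm of T - T_m for a diagonal T.

    `lam` lists lambda_1, lambda_2, ... (a finite section of the sequence).
    """
    return max((abs(l) for l in lam[m:]), default=0.0)

# Hypothetical example: lambda_n = 1/n, which tends to zero.
lam = [1.0 / n for n in range(1, 1001)]

for m in (1, 9, 99, 999):
    print(m, truncation_error(lam, m))   # equals 1/(m+1) for this sequence
```

For this sequence the error is 1/(m+1), so the finite rank truncations converge to T in norm, while for the identity operator (all $\lambda_n=1$) the same quantity stays equal to 1.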
Theorem 9
Let $T_m$ be a sequence of compact operators convergent to an
operator T in the norm topology (i.e. $\|T-T_m\| → 0$),
then T is compact itself. Equivalently
K(X,Y) is a closed subspace of B(X,Y).
Figure 17: The є/3 argument to estimate |f(x)−f(y)|.
T1x1(1) | T1x2(1) | T1x3(1) | … | T1xn(1) | … | → | a1 |
T2x1(2) | T2x2(2) | T2x3(2) | … | T2xn(2) | … | → | a2 |
T3x1(3) | T3x2(3) | T3x3(3) | … | T3xn(3) | … | → | a3 |
… | … | … | … | … | … | | |
Tnx1(n) | Tnx2(n) | Tnx3(n) | … | Tnxn(n) | … | → | an |
… | … | … | … | … | … | | ↓
|
| | | | | | ↘ | |
| | | | | | | a
|
Table 2: The “diagonal argument”. |
Proof.
Take a bounded sequence $(x_n)_1^\infty$; without loss of generality
we assume $\|x_n\|\le 1$ for all n. From compactness
- of $T_1$ ⇒ ∃ a subsequence $(x_n^{(1)})_1^\infty$ of $(x_n)_1^\infty$ s.t. $(T_1x_n^{(1)})_1^\infty$ is convergent;
- of $T_2$ ⇒ ∃ a subsequence $(x_n^{(2)})_1^\infty$ of $(x_n^{(1)})_1^\infty$ s.t. $(T_2x_n^{(2)})_1^\infty$ is convergent;
- of $T_3$ ⇒ ∃ a subsequence $(x_n^{(3)})_1^\infty$ of $(x_n^{(2)})_1^\infty$ s.t. $(T_3x_n^{(3)})_1^\infty$ is convergent;
- and so on.
Could we find a subsequence which converges for all $T_m$
simultaneously? The first guess “take the intersection of all the
above sequences $(x_n^{(k)})_1^\infty$” does not work because the
intersection could be empty. The way out is provided by the
diagonal argument (see Table 2):
the sequence $(T_m x_k^{(k)})_1^\infty$ is convergent for
all m, because at the latest after the term $x_m^{(m)}$ the sequence
$(x_k^{(k)})_1^\infty$ is a
subsequence of $(x_k^{(m)})_1^\infty$.
We claim that the subsequence $(Tx_k^{(k)})_1^\infty$ of
$(Tx_n)_1^\infty$ is convergent as well. We use here the
є/3 argument (see
Figure 17): for a given
є>0 choose p∈ℕ such that
$\|T-T_p\|<\epsilon/3$.
Because $(T_px_k^{(k)})$ is convergent it is a Cauchy sequence,
thus there exists $n_0>p$ such that
$\|T_px_k^{(k)}-T_px_l^{(l)}\|<\epsilon/3$ for all k, l>$n_0$. Then:
\[ \begin{aligned} \|Tx_k^{(k)} - Tx_l^{(l)}\| &= \|(Tx_k^{(k)} - T_px_k^{(k)}) + (T_px_k^{(k)} - T_px_l^{(l)}) + (T_px_l^{(l)} - Tx_l^{(l)})\| \\ &\le \|Tx_k^{(k)} - T_px_k^{(k)}\| + \|T_px_k^{(k)} - T_px_l^{(l)}\| + \|T_px_l^{(l)} - Tx_l^{(l)}\| \\ &\le \epsilon. \end{aligned} \]
Thus $(Tx_k^{(k)})$ is a Cauchy sequence, hence convergent in a complete space Y, and T is compact.
□
8.2 Hilbert–Schmidt operators
Definition 10
Let T: H→K be a bounded linear map between two
Hilbert spaces. Then T is said to be a Hilbert–Schmidt
operator if there exists an orthonormal
basis $(e_k)$ in H such that the series $\sum_{k=1}^\infty \|Te_k\|^2$ is convergent.
Example 11
-
Let T: $l_2→l_2$ be a
diagonal operator defined by
$Te_n=e_n/n$, for all n≥1. Then
$\sum \|Te_n\|^2 = \sum n^{-2} = \pi^2/6$ (see
Example 16) is finite.
- The identity operator
IH is not
a Hilbert–Schmidt operator, unless H is finite dimensional.
A relation to compact operator is as follows.
Theorem 12
All Hilbert–Schmidt operators are compact. (The opposite inclusion
is false, give a counterexample!)
Proof.
Let T∈B(H,K) have a convergent series
$\sum \|Te_n\|^2$ in an orthonormal basis $(e_n)_1^\infty$ of
H. We again (see (45)) define the
m-truncation of T by the formula
\[ T_m e_n = \begin{cases} Te_n, & 1\le n\le m; \\ 0, & n>m. \end{cases} \tag{46} \]
Then $T_m(\sum_1^\infty a_k e_k) = \sum_1^m a_k Te_k$ and each
$T_m$ is a finite
rank operator because its image is spanned by the finite set of
vectors $Te_1, \ldots, Te_m$. We claim that
$\|T-T_m\| → 0$. Indeed by linearity and the definition
of $T_m$:
\[ (T-T_m)\Big(\sum_{n=1}^{\infty} a_n e_n\Big) = \sum_{n=m+1}^{\infty} a_n (Te_n). \]
Thus:
\begin{align}
\Big\|(T-T_m)\Big(\sum_{n=1}^{\infty} a_n e_n\Big)\Big\| &= \Big\|\sum_{n=m+1}^{\infty} a_n (Te_n)\Big\| \tag{47} \\
&\le \sum_{n=m+1}^{\infty} |a_n|\,\|Te_n\| \notag \\
&\le \Big(\sum_{n=m+1}^{\infty} |a_n|^2\Big)^{1/2} \Big(\sum_{n=m+1}^{\infty} \|Te_n\|^2\Big)^{1/2} \notag \\
&\le \Big\|\sum_{n=1}^{\infty} a_n e_n\Big\| \Big(\sum_{n=m+1}^{\infty} \|Te_n\|^2\Big)^{1/2}, \tag{48}
\end{align}
so $\|T-T_m\| → 0$ and by the
previous Theorem T
is compact as a limit of compact operators.
□
Corollary 13 (from the above proof)
For a Hilbert–Schmidt operator
\[ \|T\| \le \Big(\sum_{n=1}^{\infty} \|Te_n\|^2\Big)^{1/2}. \]
Proof.
Just consider the difference of T and $T_0=0$ in
(47)–(48).
□
Example 14
An integral operator T on
$L_2[0,1]$
is defined by the formula:
\[ (Tf)(x) = \int_0^1 K(x,y) f(y)\, dy, \qquad f\in L_2[0,1], \tag{49} \]
where the function K, continuous on [0,1]×[0,1], is
called the kernel of the integral
operator.
Theorem 15
Integral operator (49) is Hilbert–Schmidt.
Proof.
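Let $(e_n)_{-\infty}^{\infty}$ be an orthonormal basis of
$L_2[0,1]$, e.g. $(e^{2\pi i nt})_{n\in\mathbb{Z}}$. Let us consider the kernel
$K_x(y)=K(x,y)$ as a function of the argument
y depending on the parameter x. Then:
\[ (Te_n)(x) = \int_0^1 K(x,y) e_n(y)\, dy = \int_0^1 K_x(y) e_n(y)\, dy = \langle K_x, \bar e_n\rangle. \]
So $\|Te_n\|^2 = \int_0^1 |\langle K_x, \bar e_n\rangle|^2\, dx$. Consequently:
\begin{align}
\sum_{n=-\infty}^{\infty} \|Te_n\|^2 &= \sum_{n=-\infty}^{\infty} \int_0^1 |\langle K_x, \bar e_n\rangle|^2\, dx \notag \\
&= \int_0^1 \sum_{n=-\infty}^{\infty} |\langle K_x, \bar e_n\rangle|^2\, dx \tag{50} \\
&= \int_0^1 \|K_x\|^2\, dx \notag \\
&= \int_0^1\!\!\int_0^1 |K(x,y)|^2\, dx\, dy < \infty. \notag
\end{align}
Exercise 16
Justify the exchange of summation and integration in (50).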
□
Definition 18
Define the Hilbert–Schmidt norm of a Hilbert–Schmidt
operator A by $\|A\|_{HS}^2 = \sum_{n=1}^{\infty} \|Ae_n\|^2$ (it is
independent of the choice of the orthonormal basis $(e_n)_1^\infty$,
see Question 27).
Exercise* 19
Show that the set of Hilbert–Schmidt operators with the above norm is a
Hilbert space and find an expression for the inner product.
Example 20
Let $K(x,y)=x-y$, then
\[ (Tf)(x)=\int_0^1 (x-y)f(y)\,dy = x\int_0^1 f(y)\,dy-\int_0^1 yf(y)\,dy \]
is a rank 2 operator. Furthermore:
\[ \|T\|_{HS}^2 = \int_0^1\!\!\int_0^1 (x-y)^2\,dx\,dy = \int_0^1 \Bigl[\frac{(x-y)^3}{3}\Bigr]_{x=0}^{1} dy \]
\[ = \int_0^1 \Bigl(\frac{(1-y)^3}{3}+\frac{y^3}{3}\Bigr) dy = \Bigl[-\frac{(1-y)^4}{12}+\frac{y^4}{12}\Bigr]_{0}^{1} = \frac{1}{6}. \]
On the other hand there is an orthonormal basis such that
\[ Tf=\frac{i}{\sqrt{12}}\langle f,e_1\rangle e_1-\frac{i}{\sqrt{12}}\langle f,e_2\rangle e_2, \]
and $\|T\|=1/\sqrt{12}$ and $\sum_1^2\|Te_k\|^2=1/6$, and we get $\|T\|\le\|T\|_{HS}$ in agreement with Corollary 13.
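The numbers in this example can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `hs_norm_squared` and `eigenvalues_2x2` are hypothetical, the double integral is approximated by the midpoint rule, and the matrix of $T$ on the span of $1$ and $x$ is computed by hand from $T1=x-1/2$ and $Tx=x/2-1/3$.

```python
import cmath

def hs_norm_squared(kernel, n=200):
    """Midpoint-rule approximation of the integral of |K(x,y)|^2 over [0,1]^2."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    return sum(abs(kernel(x, y)) ** 2 for x in pts for y in pts) * h * h

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

# Hilbert-Schmidt norm squared of the kernel K(x, y) = x - y.
hs2 = hs_norm_squared(lambda x, y: x - y)

# Matrix of T in the (non-orthonormal) basis {1, x}; eigenvalues do not
# depend on the choice of basis.
lam1, lam2 = eigenvalues_2x2(-0.5, -1.0 / 3.0, 1.0, 0.5)

print(hs2)                  # close to 1/6
print(abs(lam1), abs(lam2)) # both close to 1/sqrt(12)
```

Since the kernel is real and antisymmetric, $T$ is normal, so the moduli of its two non-zero eigenvalues coincide with its singular values; their squares add up to the Hilbert–Schmidt norm squared.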
9 Compact normal operators
Recall from Section 6.5 that an operator $T$ is normal if $TT^*=T^*T$; Hermitian ($T^*=T$) and unitary ($T^*=T^{-1}$) operators are normal.
9.1 Spectrum of normal operators
Theorem 1
Let $T\in B(H)$ be a normal operator. Then
- $\ker T=\ker T^*$, so $\ker(T-\lambda I)=\ker(T^*-\bar\lambda I)$ for all $\lambda\in\mathbb{C}$.
- Eigenvectors corresponding to distinct eigenvalues are orthogonal.
- $\|T\|=r(T)$.
Proof.
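- Obviously:
\[ x\in\ker T \iff \langle Tx,Tx\rangle=0 \iff \langle T^*Tx,x\rangle=0 \iff \langle TT^*x,x\rangle=0 \iff \langle T^*x,T^*x\rangle=0 \iff x\in\ker T^*. \]
The second part holds because normalities of $T$ and $T-\lambda I$ are equivalent.
- If $Tx=\lambda x$, $Ty=\mu y$ then from the previous statement $T^*y=\bar\mu y$. If $\lambda\ne\mu$ then the identity
\[ \lambda\langle x,y\rangle=\langle Tx,y\rangle=\langle x,T^*y\rangle=\langle x,\bar\mu y\rangle=\mu\langle x,y\rangle \]
implies $\langle x,y\rangle=0$.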
- Let $S=T^*T$, then $S$ is Hermitian (check!). Consequently, the inequality
\[ \|Sx\|^2=\langle Sx,Sx\rangle=\langle S^2x,x\rangle\le\|S^2\|\,\|x\|^2 \]
implies $\|S\|^2\le\|S^2\|$. But the opposite inequality follows from Theorem 12, thus we have the equality $\|S^2\|=\|S\|^2$ and more generally by induction: $\|S^{2^m}\|=\|S\|^{2^m}$ for all $m$.
Now we claim $\|S\|=\|T\|^2$. From Theorem 12 and 18 we get $\|S\|=\|T^*T\|\le\|T\|^2$. On the other hand if $\|x\|=1$ then
\[ \|T^*T\|\ge|\langle T^*Tx,x\rangle|=\langle Tx,Tx\rangle=\|Tx\|^2 \]
implies the opposite inequality $\|S\|\ge\|T\|^2$. Only now we use normality of $T$ to obtain $(T^{2^m})^*T^{2^m}=(T^*T)^{2^m}$ and get the equality
\[ \|T^{2^m}\|^2=\|(T^*T)^{2^m}\|=\|T^*T\|^{2^m}=\|T\|^{2^{m+1}}. \]
Thus:
\[ r(T)=\lim_{m\to\infty}\|T^{2^m}\|^{1/2^m}=\lim_{m\to\infty}\|T\|^{2^{m+1}/2^{m+1}}=\|T\| \]
by the spectral radius formula (44).
□
Example 2
It is easy to see that normality is important in 3; indeed, the non-normal operator $T$ given by the matrix $\begin{pmatrix}0&1\\0&0\end{pmatrix}$ in $\mathbb{C}^2$ has the one-point spectrum $\{0\}$, consequently $r(T)=0$ but $\|T\|=1$.
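The contrast between normal and non-normal operators can be seen already for $2\times2$ matrices. The sketch below is an illustration only; the function names are hypothetical, the operator norm is computed as the largest singular value, and the normal matrix with rows $(1,2)$ and $(-2,1)$ is an arbitrary choice of ours.

```python
import cmath
import math

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def spectral_radius(m):
    return max(abs(z) for z in eigenvalues_2x2(m))

def operator_norm(m):
    """Largest singular value: square root of the top eigenvalue of M* M."""
    (a, b), (c, d) = m
    a, b, c, d = complex(a), complex(b), complex(c), complex(d)
    gram = ((abs(a) ** 2 + abs(c) ** 2, a.conjugate() * b + c.conjugate() * d),
            (b.conjugate() * a + d.conjugate() * c, abs(b) ** 2 + abs(d) ** 2))
    return math.sqrt(max(z.real for z in eigenvalues_2x2(gram)))

nilpotent = ((0, 1), (0, 0))   # non-normal: r(T) = 0 but ||T|| = 1
normal = ((1, 2), (-2, 1))     # normal: r(T) = ||T|| = sqrt(5)

for name, m in (("nilpotent", nilpotent), ("normal", normal)):
    print(name, spectral_radius(m), operator_norm(m))
```

For the normal matrix both quantities agree, while for the nilpotent one the spectral radius vanishes although the norm does not.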
Lemma 3
Let T be a compact normal operator then
-
The set of eigenvalues of T is either finite or a
countable sequence tending to zero.
-
All the eigenspaces, i.e. ker(T−λ
I), are finite-dimensional for all λ≠ 0.
Proof.
- Let H0 be the closed linear span of eigenvectors of
T. Then T restricted to H0 is a diagonal compact
operator with the same set of eigenvalues λn as in
H. Then λn→ 0 from
Proposition 8 .
Exercise 5
Use the proof of
Proposition 8 to give a
direct demonstration.
Proof.[Solution]
Or straightforwardly assume the opposite: there exist a $\delta>0$ and infinitely many eigenvalues $\lambda_n$ such that $|\lambda_n|>\delta$. By the previous Theorem there is an orthonormal sequence $v_n$ of corresponding eigenvectors, $Tv_n=\lambda_n v_n$. Now the sequence $(v_n)$ is bounded but its image $Tv_n=\lambda_n v_n$ has no convergent subsequence because for any $k\ne l$:
\[ \|\lambda_k v_k-\lambda_l v_l\|=\bigl(|\lambda_k|^2+|\lambda_l|^2\bigr)^{1/2}\ge\sqrt{2}\,\delta, \]
i.e. $Tv_{n_k}$ is not a Cauchy sequence, see Figure 16.
□
- Similarly if H0=ker(T−λ I) is infinite
dimensional, then restriction of T on H0 is λ
I—which is non-compact by
Proposition 8. Alternatively
consider the infinite orthonormal sequence (vn),
Tvn=λ vn as in Exercise 5.
□
Lemma 6
Let T be a compact normal operator. Then all non-zero points
λ∈ σ(
T)
are eigenvalues and there exists an
eigenvalue of modulus ||
T||
.
Proof.
Assume without loss of generality that $T\ne 0$. Let $\lambda\in\sigma(T)$ be non-zero; without loss of generality (multiplying $T$ by a scalar) $\lambda=1$.
We claim that if $1$ is not an eigenvalue then there exists $\delta>0$ such that
\[ \|(I-T)x\|\ge\delta\|x\| \quad\text{for all } x\in H. \tag{51} \]
Otherwise there exists a sequence of vectors $(x_n)$ with unit norm such that $(I-T)x_n\to 0$. Then from the compactness of $T$, for a subsequence $(x_{n_k})$ there is $y\in H$ such that $Tx_{n_k}\to y$; then $x_{n_k}\to y$, implying $Ty=y$ and $y\ne 0$, i.e. $y$ is an eigenvector with eigenvalue $1$.
Now we claim $\operatorname{Im}(I-T)$ is closed, i.e. $y\in\overline{\operatorname{Im}(I-T)}$ implies $y\in\operatorname{Im}(I-T)$. Indeed, if $(I-T)x_n\to y$, then $(x_n)$ is bounded by (51) and there is a subsequence $(x_{n_k})$ such that $Tx_{n_k}\to z$, implying $x_{n_k}\to y+z$; then $(I-T)(z+y)=y$ by continuity of $I-T$.
Finally $I-T$ is injective, i.e. $\ker(I-T)=\{0\}$, by (51). By the property 1, $\ker(I-T^*)=\{0\}$ as well. But because always $\ker(I-T^*)=\operatorname{Im}(I-T)^\perp$ (by 2) we get surjectivity of $I-T$, i.e. $\operatorname{Im}(I-T)^\perp=\{0\}$.
Thus $(I-T)^{-1}$ exists and is bounded because (51) implies $\|y\|\ge\delta\|(I-T)^{-1}y\|$. Thus $1\notin\sigma(T)$.
The existence of an eigenvalue $\lambda$ such that $|\lambda|=\|T\|$ follows from a combination of Lemma 13 and Theorem 3.
□
9.2 Compact normal operators
Theorem 7 (The spectral theorem for compact normal operators)
Let $T$ be a compact normal operator on a Hilbert space $H$. Then there exists an orthonormal sequence $(e_n)$ of eigenvectors of $T$ and corresponding eigenvalues $(\lambda_n)$ such that:
\[ Tx=\sum_{n}\lambda_n\langle x,e_n\rangle e_n \quad\text{for all } x\in H. \tag{52} \]
If $(\lambda_n)$ is an infinite sequence it tends to zero.
Conversely, if $T$ is given by a formula (52) then it is compact and normal.
Proof.
Suppose $T\ne 0$. Then by the previous Theorem there exists an eigenvalue $\lambda_1$ such that $|\lambda_1|=\|T\|$ with a corresponding eigenvector $e_1$ of unit norm. Let $H_1=\operatorname{Lin}(e_1)^\perp$. If $x\in H_1$ then
\[ \langle Tx,e_1\rangle=\langle x,T^*e_1\rangle=\langle x,\bar\lambda_1e_1\rangle=\lambda_1\langle x,e_1\rangle=0, \tag{53} \]
thus $Tx\in H_1$ and similarly $T^*x\in H_1$. Write $T_1=T|_{H_1}$, which is again a normal compact operator with a norm not exceeding $\|T\|$. We could inductively repeat this procedure for $T_1$, obtaining a sequence of eigenvalues $\lambda_2$, $\lambda_3$, … with eigenvectors $e_2$, $e_3$, …. If $T_n=0$ for a finite $n$ then the theorem is already proved. Otherwise we have an infinite sequence $\lambda_n\to 0$. Let
\[ x=\sum_{k=1}^{n}\langle x,e_k\rangle e_k+y_n \quad\Rightarrow\quad \|x\|^2=\sum_{k=1}^{n}|\langle x,e_k\rangle|^2+\|y_n\|^2, \qquad y_n\in H_n, \]
from Pythagoras’s theorem. Then $\|y_n\|\le\|x\|$ and $\|Ty_n\|\le\|T_n\|\,\|y_n\|\le|\lambda_n|\,\|x\|\to 0$ by Lemma 3. Thus
\[ Tx=\lim_{n\to\infty}\Bigl(\sum_{k=1}^{n}\langle x,e_k\rangle Te_k+Ty_n\Bigr)=\sum_{n=1}^{\infty}\lambda_n\langle x,e_n\rangle e_n. \]
Conversely, if $Tx=\sum_1^\infty\lambda_n\langle x,e_n\rangle e_n$ then
\[ \langle Tx,y\rangle=\sum_{1}^{\infty}\lambda_n\langle x,e_n\rangle\langle e_n,y\rangle=\sum_{1}^{\infty}\langle x,e_n\rangle\,\lambda_n\overline{\langle y,e_n\rangle}, \]
thus $T^*y=\sum_1^\infty\bar\lambda_n\langle y,e_n\rangle e_n$. Then we get the normality of $T$: $T^*Tx=TT^*x=\sum_1^\infty|\lambda_n|^2\langle x,e_n\rangle e_n$. Also $T$ is compact because it is a uniform limit of the finite rank operators $T_nx=\sum_{k=1}^{n}\lambda_k\langle x,e_k\rangle e_k$.
□
Corollary 8
Let $T$ be a compact normal operator on a separable Hilbert space $H$. Then there exists an orthonormal basis $(g_k)$ such that
\[ Tx=\sum_{n=1}^{\infty}\lambda_n\langle x,g_n\rangle g_n, \]
and $\lambda_n$ are eigenvalues of $T$ including zeros.
Proof.
Let $(e_n)$ be the orthonormal sequence constructed in the proof of the previous Theorem. Then $x$ is perpendicular to all $e_n$ if and only if it is in the kernel of $T$. Let $(f_n)$ be any orthonormal basis of $\ker T$. Then the union of $(e_n)$ and $(f_n)$ is the orthonormal basis $(g_n)$ we have looked for.
□
Exercise 9
Finish all details in the above proof.
Corollary 10 (Singular value decomposition)
If $T$ is any compact operator on a separable Hilbert space then there exist orthonormal sequences $(e_k)$ and $(f_k)$ such that $Tx=\sum_k\mu_k\langle x,e_k\rangle f_k$, where $(\mu_k)$ is a sequence of positive numbers such that $\mu_k\to 0$ if it is an infinite sequence.
Proof.
The operator $T^*T$ is compact and Hermitian (hence normal). From the previous Corollary there is an orthonormal basis $(e_n)$ such that $T^*Tx=\sum_n\lambda_n\langle x,e_n\rangle e_n$ with $\lambda_n=\|Te_n\|^2\ge 0$. For those $n$ with $\lambda_n>0$ let $\mu_n=\|Te_n\|$ and $f_n=Te_n/\mu_n$. Then $(f_n)$ is an orthonormal sequence (check!) and
\[ Tx=\sum_n\langle x,e_n\rangle Te_n=\sum_n\langle x,e_n\rangle\mu_nf_n. \]
□
Corollary 11
A bounded operator in a Hilbert space is compact if and only if it is
a uniform limit of finite rank operators.
Proof.
Sufficiency follows from 9. Necessity: by the previous Corollary $Tx=\sum_n\langle x,e_n\rangle\mu_nf_n$, thus $T$ is a uniform limit of the operators $T_mx=\sum_{n=1}^{m}\langle x,e_n\rangle\mu_nf_n$, which are of finite rank.
□
10 Integral equations
In this lecture we will study the Fredholm equation defined as follows. Let the integral operator with a kernel $K(x,y)$ defined on $[a,b]\times[a,b]$ be defined as before:
\[ (T\varphi)(x)=\int_a^b K(x,y)\varphi(y)\,dy. \tag{54} \]
The Fredholm equations of the first and second kinds correspondingly are:
\[ T\varphi=f \qquad\text{and}\qquad \varphi-\lambda T\varphi=f, \tag{55} \]
for a function $f$ on $[a,b]$. A special case is given by the Volterra equation, where the integral operator (54) has a kernel with $K(x,y)=0$ for all $y>x$, so that it can be written as:
\[ (T\varphi)(x)=\int_a^x K(x,y)\varphi(y)\,dy. \tag{56} \]
We will consider integral operators with kernels $K$ such that $\int_a^b\int_a^b|K(x,y)|^2\,dx\,dy<\infty$; then by Theorem 15 $T$ is a Hilbert–Schmidt operator and in particular bounded.
As a reason to study Fredholm operators we will mention that solutions of differential equations in mathematical physics (notably heat and wave equations) require a decomposition of a function $f$ as a linear combination of functions $K(x,y)$ with “coefficients” $\varphi$. This is a continuous analogue of a discrete decomposition into Fourier series.
Using ideas from the proof of Lemma 4 we define the Neumann series for the resolvent:
\[ (I-\lambda T)^{-1}=I+\lambda T+\lambda^2T^2+\cdots, \tag{57} \]
which is valid for all $|\lambda| < \|T\|^{-1}$.
Example 1
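Solve the Volterra equation
\[ \varphi(x)-\lambda\int_0^x y\,\varphi(y)\,dy=x^2 \qquad\text{on } L_2[0,1]. \]
In this case $(I-\lambda T)\varphi=f$, with $f(x)=x^2$ and:
\[ K(x,y)=\begin{cases} y, & 0\le y\le x;\\ 0, & x < y\le 1.\end{cases} \]
Straightforward calculations show:
\[ (Tf)(x)=\int_0^x y\cdot y^2\,dy=\frac{x^4}{4},\qquad (T^2f)(x)=\int_0^x y\,\frac{y^4}{4}\,dy=\frac{x^6}{24}, \]
and generally by induction:
\[ (T^nf)(x)=\int_0^x y\,\frac{y^{2n}}{2^{n-1}n!}\,dy=\frac{x^{2n+2}}{2^n(n+1)!}. \]
Hence:
\[ \varphi(x)=\sum_{n=0}^{\infty}\lambda^n(T^nf)(x)=\sum_{n=0}^{\infty}\frac{\lambda^nx^{2n+2}}{2^n(n+1)!}=\frac{2}{\lambda}\bigl(e^{\lambda x^2/2}-1\bigr)\quad\text{for all }\lambda\in\mathbb{C}\setminus\{0\}, \]
because in this case $r(T)=0$. For the Fredholm equations this is not always the case, see Tutorial problem 29.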
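The Neumann series of this example can be summed numerically and compared with the closed form. The following sketch assumes the Volterra operator $(T\varphi)(x)=\int_0^x y\,\varphi(y)\,dy$ on $[0,1]$ and represents polynomials as dictionaries from exponents to coefficients; the names `apply_T`, `neumann_solution` and `closed_form` are hypothetical.

```python
import math

def apply_T(poly):
    """(T p)(x) = integral from 0 to x of y * p(y) dy, for p = {exponent: coeff}."""
    return {k + 2: c / (k + 2) for k, c in poly.items()}

def neumann_solution(lam, x, terms=30):
    """Partial sum of sum_n lam^n (T^n f)(x) with f(x) = x^2."""
    term = {2: 1.0}  # f(x) = x^2
    total = 0.0
    for n in range(terms):
        total += lam ** n * sum(c * x ** k for k, c in term.items())
        term = apply_T(term)
    return total

def closed_form(lam, x):
    """(2 / lam) * (exp(lam x^2 / 2) - 1), valid for lam != 0."""
    return 2.0 / lam * (math.exp(lam * x * x / 2.0) - 1.0)

for lam, x in ((3.0, 0.7), (-5.0, 1.0)):
    print(lam, x, neumann_solution(lam, x), closed_form(lam, x))
```

The partial sums agree with the closed form even for values of $\lambda$ well outside the disk $|\lambda| < \|T\|^{-1}$, which illustrates the remark that the series converges for every $\lambda$ when $r(T)=0$.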
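Among other integral operators there is an important subclass with separable kernel, namely a kernel which has the form:
\[ K(x,y)=\sum_{j=1}^{n}g_j(x)h_j(y). \tag{58} \]
In such a case:
\[ (T\varphi)(x)=\int_a^b\sum_{j=1}^{n}g_j(x)h_j(y)\varphi(y)\,dy=\sum_{j=1}^{n}g_j(x)\int_a^bh_j(y)\varphi(y)\,dy, \]
i.e. the image of $T$ is spanned by $g_1(x)$, …, $g_n(x)$ and is finite dimensional; consequently the solution of such an equation reduces to linear algebra.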
Example 2
Solve the Fredholm equation (actually find eigenvectors of $T$):
\[ \varphi(x)=\lambda\int_0^{2\pi}\cos(x+y)\,\varphi(y)\,dy=\lambda\int_0^{2\pi}(\cos x\cos y-\sin x\sin y)\,\varphi(y)\,dy. \]
Clearly $\varphi(x)$ should be a linear combination $\varphi(x)=A\cos x+B\sin x$ with coefficients $A$ and $B$ satisfying:
\[ A=\lambda\int_0^{2\pi}\cos y\,(A\cos y+B\sin y)\,dy, \qquad B=-\lambda\int_0^{2\pi}\sin y\,(A\cos y+B\sin y)\,dy. \]
Basic calculus implies $A=\lambda\pi A$ and $B=-\lambda\pi B$, and the only nonzero solutions are:
$\lambda=\pi^{-1}$, $A\ne 0$, $B=0$;
$\lambda=-\pi^{-1}$, $A=0$, $B\ne 0$.
We develop some Hilbert–Schmidt theory for integral operators.
Theorem 3
Suppose that $K(x,y)$ is a continuous function on $[a,b]\times[a,b]$ with $K(x,y)=\overline{K(y,x)}$, and the operator $T$ is defined by (54). Then
- $T$ is a self-adjoint Hilbert–Schmidt operator.
- All eigenvalues of $T$ are real and satisfy $\sum_n\lambda_n^2<\infty$.
- The eigenvectors $v_n$ of $T$ can be chosen as an orthonormal basis of $L_2[a,b]$, are continuous for nonzero $\lambda_n$, and
\[ T\varphi=\sum_{n=1}^{\infty}\lambda_n\langle\varphi,v_n\rangle v_n \quad\text{where}\quad \varphi=\sum_{n=1}^{\infty}\langle\varphi,v_n\rangle v_n. \]
Proof.
- The condition $K(x,y)=\overline{K(y,x)}$ implies the Hermitian property of $T$:
\[ \langle T\varphi,\psi\rangle=\int_a^b\Bigl(\int_a^bK(x,y)\varphi(y)\,dy\Bigr)\overline{\psi(x)}\,dx \]
\[ =\int_a^b\!\!\int_a^bK(x,y)\varphi(y)\overline{\psi(x)}\,dx\,dy \]
\[ =\int_a^b\varphi(y)\,\overline{\Bigl(\int_a^bK(y,x)\psi(x)\,dx\Bigr)}\,dy \]
\[ =\langle\varphi,T\psi\rangle. \]
The Hilbert–Schmidt property (and hence compactness) was proved in Theorem 15.
- The spectrum of $T$ is real as for any Hermitian operator, see Theorem 2, and finiteness of $\sum_n\lambda_n^2$ follows from the Hilbert–Schmidt property.
- The existence of an orthonormal basis consisting of eigenvectors $(v_n)$ of $T$ was proved in Corollary 8. If $\lambda_n\ne 0$ then:
\[ v_n(x_1)-v_n(x_2)=\lambda_n^{-1}\bigl((Tv_n)(x_1)-(Tv_n)(x_2)\bigr)=\frac{1}{\lambda_n}\int_a^b\bigl(K(x_1,y)-K(x_2,y)\bigr)v_n(y)\,dy \]
and by the Cauchy–Schwarz–Bunyakovskii inequality:
\[ |v_n(x_1)-v_n(x_2)|\le\frac{1}{|\lambda_n|}\,\|v_n\|_2\Bigl(\int_a^b|K(x_1,y)-K(x_2,y)|^2\,dy\Bigr)^{1/2}, \]
which tends to $0$ as $x_1\to x_2$ due to the (uniform) continuity of $K(x,y)$.
□
Theorem 4
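Let $T$ be as in the previous Theorem. Then if $\lambda\ne 0$ and $\lambda^{-1}\notin\sigma(T)$, the unique solution $\varphi$ of the Fredholm equation of the second kind $\varphi-\lambda T\varphi=f$ is
\[ \varphi=\sum_{n=1}^{\infty}\frac{\langle f,v_n\rangle}{1-\lambda\lambda_n}\,v_n. \tag{59} \]
Proof.
Let $\varphi=\sum_1^\infty a_nv_n$ where $a_n=\langle\varphi,v_n\rangle$, then
\[ \varphi-\lambda T\varphi=\sum_{n=1}^{\infty}a_n(1-\lambda\lambda_n)v_n=f=\sum_{n=1}^{\infty}\langle f,v_n\rangle v_n \]
if and only if $a_n=\langle f,v_n\rangle/(1-\lambda\lambda_n)$ for all $n$. Note $1-\lambda\lambda_n\ne 0$ since $\lambda^{-1}\notin\sigma(T)$.
Because $\lambda_n\to 0$ the factors $1/(1-\lambda\lambda_n)$ are bounded, so $\sum_1^\infty|a_n|^2$ converges by comparison with $\sum_1^\infty|\langle f,v_n\rangle|^2=\|f\|^2$; thus the solution exists and is unique by the Riesz–Fischer Theorem.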
□
See Exercise 30 for an example.
Theorem 5 (Fredholm alternative)
Let $T\in K(H)$ be compact normal and $\lambda\in\mathbb{C}\setminus\{0\}$. Consider the equations:
\[ \varphi-\lambda T\varphi=0, \tag{60} \]
\[ \varphi-\lambda T\varphi=f, \tag{61} \]
then either
- the only solution to (60) is $\varphi=0$ and (61) has a unique solution for any $f\in H$; or
- there exists a nonzero solution to (60) and (61) can be solved if and only if $f$ is orthogonal to all solutions to (60).
Proof.
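- If $\varphi=0$ is the only solution of (60), then $\lambda^{-1}$ is not an eigenvalue of $T$ and then by Lemma 6 it is not in the spectrum of $T$ either. Thus $I-\lambda T$ is invertible and the unique solution of (61) is given by $\varphi=(I-\lambda T)^{-1}f$.
- A nonzero solution to (60) means that $\lambda^{-1}\in\sigma(T)$. Let $(v_n)$ be an orthonormal basis of eigenvectors of $T$ for eigenvalues $(\lambda_n)$. By Lemma 2 only a finite number of $\lambda_n$ are equal to $\lambda^{-1}$, say they are $\lambda_1$, …, $\lambda_N$, then
\[ (I-\lambda T)\varphi=\sum_{n=1}^{\infty}(1-\lambda\lambda_n)\langle\varphi,v_n\rangle v_n=\sum_{n=N+1}^{\infty}(1-\lambda\lambda_n)\langle\varphi,v_n\rangle v_n. \]
If $f=\sum_1^\infty\langle f,v_n\rangle v_n$ then the identity $(I-\lambda T)\varphi=f$ is only possible if $\langle f,v_n\rangle=0$ for $1\le n\le N$. Conversely, from that condition we could give a solution
\[ \varphi=\sum_{n=N+1}^{\infty}\frac{\langle f,v_n\rangle}{1-\lambda\lambda_n}\,v_n+\varphi_0, \qquad\text{for any }\varphi_0\in\operatorname{Lin}(v_1,\ldots,v_N), \]
which is again in $H$ because $f\in H$ and $\lambda_n\to 0$.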
□
Example 6
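Let us consider
\[ (T\varphi)(x)=\int_0^1(2xy-x-y+1)\varphi(y)\,dy. \]
Because the kernel of $T$ is real and symmetric, $T=T^*$; the kernel is also separable:
\[ (T\varphi)(x)=x\int_0^1(2y-1)\varphi(y)\,dy+\int_0^1(-y+1)\varphi(y)\,dy, \]
and $T$ is of rank 2 with the image of $T$ spanned by $1$ and $x$. By direct calculations:
\[ T:1\mapsto\tfrac12, \qquad T:x\mapsto\tfrac16x+\tfrac16, \]
or $T$ is given by the matrix
\[ \begin{pmatrix}\tfrac12&\tfrac16\\[2pt]0&\tfrac16\end{pmatrix}. \]
According to linear algebra, the decomposition over eigenvectors is:
\[ \lambda_1=\tfrac12\ \text{with vector}\ \begin{pmatrix}1\\0\end{pmatrix}, \qquad \lambda_2=\tfrac16\ \text{with vector}\ \begin{pmatrix}-\tfrac12\\1\end{pmatrix}, \]
with normalisation $v_1(y)=1$, $v_2(y)=\sqrt{12}(y-1/2)$, and we complete it to an orthonormal basis $(v_n)$ of $L_2[0,1]$. Then
- If $\lambda\ne 2$ or $6$ then $(I-\lambda T)\varphi=f$ has a unique solution (cf. equation (59)):
\[ \varphi=\sum_{n=1}^{\infty}\frac{\langle f,v_n\rangle}{1-\lambda\lambda_n}\,v_n \]
\[ =\sum_{n=1}^{2}\frac{\langle f,v_n\rangle}{1-\lambda\lambda_n}\,v_n+\Bigl(f-\sum_{n=1}^{2}\langle f,v_n\rangle v_n\Bigr) \]
\[ =f+\sum_{n=1}^{2}\frac{\lambda\lambda_n}{1-\lambda\lambda_n}\langle f,v_n\rangle v_n. \]
- If $\lambda=2$ then the solutions exist provided $\langle f,v_1\rangle=0$ and are:
\[ \varphi=f+\frac{\lambda\lambda_2}{1-\lambda\lambda_2}\langle f,v_2\rangle v_2+Cv_1=f+\frac12\langle f,v_2\rangle v_2+Cv_1, \qquad C\in\mathbb{C}. \]
- If $\lambda=6$ then the solutions exist provided $\langle f,v_2\rangle=0$ and are:
\[ \varphi=f+\frac{\lambda\lambda_1}{1-\lambda\lambda_1}\langle f,v_1\rangle v_1+Cv_2=f-\frac32\langle f,v_1\rangle v_1+Cv_2, \qquad C\in\mathbb{C}. \]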
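The eigenvalues and the solution formula of this example can be verified numerically. The sketch below is an illustration under our own assumptions: integrals over $[0,1]$ are approximated by the midpoint rule, only real-valued functions are handled, the right-hand side $f(x)=x^2$ and the value $\lambda=1$ are arbitrary choices, and all function names are hypothetical.

```python
import math

N = 1000
H = 1.0 / N
PTS = [(i + 0.5) * H for i in range(N)]

def integrate(g):
    """Midpoint rule on [0, 1]."""
    return sum(g(y) for y in PTS) * H

def kernel(x, y):
    return 2 * x * y - x - y + 1

def apply_T(phi):
    """Return the function x -> integral of K(x, y) phi(y) dy."""
    return lambda x: integrate(lambda y: kernel(x, y) * phi(y))

def inner(f, g):
    """Inner product of real-valued functions on [0, 1]."""
    return integrate(lambda y: f(y) * g(y))

v1 = lambda y: 1.0
v2 = lambda y: math.sqrt(12) * (y - 0.5)
eigen = [(0.5, v1), (1.0 / 6.0, v2)]  # (eigenvalue, eigenfunction) pairs

def solve(f, lam):
    """phi = f + sum_n lam*mu_n / (1 - lam*mu_n) <f, v_n> v_n, for lam not 2 or 6."""
    coeffs = [lam * mu / (1 - lam * mu) * inner(f, v) for mu, v in eigen]
    return lambda x: f(x) + sum(c * v(x) for c, (_, v) in zip(coeffs, eigen))

f = lambda x: x * x
lam = 1.0
phi = solve(f, lam)
T_phi = apply_T(phi)
# Residual of the equation phi - lam * T phi = f at a few sample points.
residual = max(abs(phi(x) - lam * T_phi(x) - f(x)) for x in (0.0, 0.25, 0.5, 1.0))
print(residual)
```

The residual is at the level of the quadrature error, and applying `apply_T` to `v1` and `v2` reproduces the eigenvalues $1/2$ and $1/6$.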
11 Banach and Normed Spaces
We will work with either the field of real numbers ℝ or
the complex numbers ℂ. To avoid repetition, we use
K to denote either ℝ or ℂ.
11.1 Normed spaces
Recall, see Defn. 3, a norm on a
vector space V is a map ||·||:V→[0,∞) such
that
- ||u||=0 only when u=0;
- ||λ u|| = | λ |
||u|| for λ∈K and u∈ V;
- ||u+v|| ≤ ||u|| + ||v|| for u,v∈ V.
Note, that the second and third conditions imply that linear
operations—multiplication by a scalar and addition of vectors
respectively—are continuous in the topology defined by the norm.
A norm induces a metric, see
Defn. 1, on V by setting d(u,v)=||u−v||.
When V is complete, see Defn. 6, for this
metric, we say that V is a Banach space.
Theorem 1
Every finite-dimensional normed vector space is a Banach space.
We will use the following simple inequality:
Lemma 2 (Young’s inequality)
Let two real numbers $1 < p, q < \infty$ be related through $1/p+1/q=1$. Then
\[ |ab|\le\frac{|a|^p}{p}+\frac{|b|^q}{q} \tag{62} \]
for any complex $a$ and $b$.
Proof.[First proof: analytic]
Obviously, it is enough to prove the inequality for positive reals $a=|a|$ and $b=|b|$. If $p>1$ then $0 < 1/p < 1$. Consider the function $\varphi(t)=t^m-mt$ for $0 < m < 1$. From its derivative $\varphi'(t)=m(t^{m-1}-1)$ we find the only critical point $t=1$ on $[0,\infty)$, which is its maximum for $m=1/p < 1$. Thus write the inequality $\varphi(t)\le\varphi(1)$ for $t=a^p/b^q$ and $m=1/p$. After a transformation we get $a\cdot b^{-q/p}-1\le\frac1p(a^pb^{-q}-1)$, and multiplication by $b^q$ with rearrangements leads to the desired result.
□
Proof.[Second proof: geometric]
Consider the plane with coordinates $(x,y)$ and take the curve $y=x^{p-1}$, which is the same as $x=y^{q-1}$. Comparing areas on the figure, we see that $S_1+S_2\ge ab$ for any positive reals $a$ and $b$. Elementary integration shows:
\[ S_1=\int_0^ax^{p-1}\,dx=\frac{a^p}{p}, \qquad S_2=\int_0^by^{q-1}\,dy=\frac{b^q}{q}. \]
This finishes the demonstration.
□
Proposition 4 (Hölder’s Inequality)
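For $1 < p < \infty$, let $q\in(1,\infty)$ be such that $1/p+1/q=1$. For $n\ge 1$ and $u,v\in\mathbb{K}^n$, we have that
\[ \sum_{j=1}^{n}|u_jv_j|\le\Bigl(\sum_{j=1}^{n}|u_j|^p\Bigr)^{1/p}\Bigl(\sum_{j=1}^{n}|v_j|^q\Bigr)^{1/q}. \]
Proof.
For reasons that will become clear soon we use the notation $\|u\|_p=(\sum_{j=1}^{n}|u_j|^p)^{1/p}$ and $\|v\|_q=(\sum_{j=1}^{n}|v_j|^q)^{1/q}$ and, assuming both are non-zero (otherwise the inequality is trivial), define for $1\le i\le n$:
\[ a_i=\frac{u_i}{\|u\|_p} \qquad\text{and}\qquad b_i=\frac{v_i}{\|v\|_q}. \]
Summing up for $1\le i\le n$ all inequalities obtained from (62):
\[ \sum_{i=1}^{n}|a_ib_i|\le\sum_{i=1}^{n}\Bigl(\frac{|a_i|^p}{p}+\frac{|b_i|^q}{q}\Bigr)=\frac1p+\frac1q=1, \]
we get the result.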
□
Using Hölder inequality we can derive the following one:
Proposition 5 (Minkowski’s Inequality)
For $1 < p < \infty$ and $n\ge 1$, let $u,v\in\mathbb{K}^n$. Then
\[ \Bigl(\sum_{j=1}^{n}|u_j+v_j|^p\Bigr)^{1/p}\le\Bigl(\sum_{j=1}^{n}|u_j|^p\Bigr)^{1/p}+\Bigl(\sum_{j=1}^{n}|v_j|^p\Bigr)^{1/p}. \]
Proof.
For $p>1$ we have:
\[ \sum_{k=1}^{n}|u_k+v_k|^p\le\sum_{k=1}^{n}|u_k|\,|u_k+v_k|^{p-1}+\sum_{k=1}^{n}|v_k|\,|u_k+v_k|^{p-1}. \tag{63} \]
By Hölder inequality
\[ \sum_{k=1}^{n}|u_k|\,|u_k+v_k|^{p-1}\le\Bigl(\sum_{k=1}^{n}|u_k|^p\Bigr)^{1/p}\Bigl(\sum_{k=1}^{n}|u_k+v_k|^{q(p-1)}\Bigr)^{1/q}. \]
Adding a similar inequality for the second term in the right hand side of (63) and dividing by $(\sum_1^n|u_k+v_k|^{q(p-1)})^{1/q}$ (note that $q(p-1)=p$ and $1-1/q=1/p$) yields the result.
□
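Hölder's and Minkowski's inequalities are easy to test on concrete vectors. The following sketch is only an illustration; the helper names and the sample vectors are our own choices. It evaluates the gap between the two sides of each inequality, which should be non-negative up to rounding.

```python
def p_norm(u, p):
    """(sum_j |u_j|^p)^(1/p) for a finite sequence u."""
    return sum(abs(x) ** p for x in u) ** (1.0 / p)

def conjugate_exponent(p):
    """The q with 1/p + 1/q = 1, for p > 1."""
    return p / (p - 1.0)

def holder_gap(u, v, p):
    """||u||_p * ||v||_q - sum_j |u_j v_j|; non-negative by Hoelder."""
    q = conjugate_exponent(p)
    return p_norm(u, p) * p_norm(v, q) - sum(abs(a * b) for a, b in zip(u, v))

def minkowski_gap(u, v, p):
    """||u||_p + ||v||_p - ||u + v||_p; non-negative by Minkowski."""
    return p_norm(u, p) + p_norm(v, p) - p_norm([a + b for a, b in zip(u, v)], p)

u = [1.0, -2.0, 3.0 + 1j]
v = [0.5, 4.0, -1j]
for p in (1.5, 2.0, 3.0):
    print(p, holder_gap(u, v, p), minkowski_gap(u, v, p))

# Equality in Hoelder's inequality for p = 3 when v_j = |u_j|^(p-1).
w = [abs(a) ** 2.0 for a in u]
print(holder_gap(u, w, 3.0))
```

The last line prints a value at rounding level, matching the equality case $|v_j|^q=c\,|u_j|^p$.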
Minkowski’s inequality shows that for $1\le p < \infty$ (the case $p=1$ is easy) we can define a norm $\|\cdot\|_p$ on $\mathbb{K}^n$ by
\[ \|u\|_p=\Bigl(\sum_{j=1}^{n}|u_j|^p\Bigr)^{1/p} \qquad (u=(u_1,\ldots,u_n)\in\mathbb{K}^n). \]
See Figure 2 for an illustration of various norms of this type defined in $\mathbb{R}^2$.
We can define an infinite analogue of this. Let 1≤ p<∞,
let lp be the space of all
scalar sequences (xn) with ∑n | xn |p < ∞.
A careful use of Minkowski’s inequality shows that lp
is a vector space. Then lp becomes a normed space for
the ||·||p norm. Note also, that
l2 is the Hilbert space
introduced before in Example 2.
Recall that a Cauchy sequence, see Defn. 5, in a
normed space is bounded: if (xn) is Cauchy then we can find N
with ||xn−xm||<1 for all n,m≥ N. Then ||xn||
≤ ||xn−xN|| + ||xN|| < ||xN||+1 for n≥ N, so
in particular, ||xn|| ≤ max(
||x1||,||x2||,⋯,||xN−1||,||xN||+1).
Theorem 6
For 1≤
p<∞
, the space lp is a Banach space.
Proof.
We repeat the proof of Thm. 24 changing $2$ to $p$. Let $(x^{(n)})$ be a Cauchy sequence in $l_p$; we wish to show this converges to some vector in $l_p$.
For each $n$, $x^{(n)}\in l_p$ so is a sequence of scalars, say $(x_k^{(n)})_{k=1}^{\infty}$. As $(x^{(n)})$ is Cauchy, for each $\epsilon>0$ there exists $N_\epsilon$ so that $\|x^{(n)}-x^{(m)}\|_p\le\epsilon$ for $n,m\ge N_\epsilon$.
For $k$ fixed,
\[ |x_k^{(n)}-x_k^{(m)}|\le\Bigl(\sum_{j=1}^{\infty}|x_j^{(n)}-x_j^{(m)}|^p\Bigr)^{1/p}=\|x^{(n)}-x^{(m)}\|_p\le\epsilon, \]
when $n,m\ge N_\epsilon$. Thus the scalar sequence $(x_k^{(n)})_{n=1}^{\infty}$ is Cauchy in $\mathbb{K}$ and hence converges, to $x_k$ say. Let $x=(x_k)$, so that $x$ is a candidate for the limit of $(x^{(n)})$.
Firstly, we check that $x-x^{(n)}\in l_p$ for some $n$. Indeed, for a given $\epsilon>0$ find $n_0$ such that $\|x^{(n)}-x^{(m)}\|_p < \epsilon$ for all $n,m>n_0$. For any $K$ and $m$:
\[ \sum_{k=1}^{K}|x_k^{(n)}-x_k^{(m)}|^p\le\|x^{(n)}-x^{(m)}\|_p^p < \epsilon^p. \]
Let $m\to\infty$, then $\sum_{k=1}^{K}|x_k^{(n)}-x_k|^p\le\epsilon^p$. Let $K\to\infty$, then $\sum_{k=1}^{\infty}|x_k^{(n)}-x_k|^p\le\epsilon^p$. Thus $x^{(n)}-x\in l_p$ and, because $l_p$ is a linear space, $x=x^{(n)}-(x^{(n)}-x)$ is also in $l_p$.
Finally, we saw above that for any $\epsilon>0$ there is $n_0$ such that $\|x^{(n)}-x\|_p\le\epsilon$ for all $n>n_0$. Thus $x^{(n)}\to x$.
□
For $p=\infty$, there are two analogies to the $l_p$ spaces. First, we define $l_\infty$ to be the vector space of all bounded scalar sequences, with the sup-norm ($\|\cdot\|_\infty$-norm):
\[ \|(x_n)\|_\infty=\sup_{n\in\mathbb{N}}|x_n| \qquad ((x_n)\in l_\infty). \tag{64} \]
Second, we define $c_0$ to be the space of all scalar sequences $(x_n)$ which converge to $0$. We equip $c_0$ with the sup norm (64). This is defined, as if $x_n\to 0$, then $(x_n)$ is bounded. Hence $c_0$ is a subspace of $l_\infty$, and we can check (exercise!) that $c_0$ is closed.
Theorem 8
The spaces c0 and l∞ are Banach spaces.
Proof.
This is another variant of the previous proof of Thm. 6. We do the $l_\infty$ case.
Again, let $(x^{(n)})$ be a Cauchy sequence in $l_\infty$, and for each $n$, let $x^{(n)}=(x_k^{(n)})_{k=1}^{\infty}$. For $\epsilon>0$ we can find $N$ such that $\|x^{(n)}-x^{(m)}\|_\infty < \epsilon$ for $n,m\ge N$. Thus, for any $k$, we see that $|x_k^{(n)}-x_k^{(m)}| < \epsilon$ when $n,m\ge N$. So $(x_k^{(n)})_{n=1}^{\infty}$ is Cauchy, and hence converges, say to $x_k\in\mathbb{K}$. Let $x=(x_k)$.
Let $m\ge N$, so that for any $k$, we have that
\[ |x_k-x_k^{(m)}|=\lim_{n\to\infty}|x_k^{(n)}-x_k^{(m)}|\le\epsilon. \]
As $k$ was arbitrary, we see that $\sup_k|x_k-x_k^{(m)}|\le\epsilon$. So, firstly, this shows that $(x-x^{(m)})\in l_\infty$, and so also $x=(x-x^{(m)})+x^{(m)}\in l_\infty$. Secondly, we have shown that $\|x-x^{(m)}\|_\infty\le\epsilon$ when $m\ge N$, so $x^{(m)}\to x$ in norm.
□
Example 9
We can also consider a Banach space of functions $L_p[a,b]$ with the norm
\[ \|f\|_p=\Bigl(\int_a^b|f(t)|^p\,dt\Bigr)^{1/p}. \]
See the discussion after Defn. 22 for a realisation of such spaces.
11.2 Bounded linear operators
Recall what a linear map is, see Defn. 1. A linear map is often called an operator. A linear map $T:E\to F$ between normed spaces is bounded if there exists $M>0$ such that $\|T(x)\|\le M\|x\|$ for $x\in E$, see Defn. 3. We write $B(E,F)$ for the set of bounded operators from $E$ to $F$. For the natural operations, $B(E,F)$ is a vector space. We norm $B(E,F)$ by setting
\[ \|T\|=\sup\Bigl\{\frac{\|T(x)\|}{\|x\|}:x\in E,\ x\ne 0\Bigr\}. \tag{65} \]
Exercise 10
Show that
- The expression (65) is a norm in the sense of Defn. 3.
- We equivalently have
\[ \|T\|=\sup\{\|T(x)\|:x\in E,\ \|x\|\le 1\}=\sup\{\|T(x)\|:x\in E,\ \|x\|=1\}. \]
Proposition 11
For a linear map $T:E\to F$ between normed spaces, the following are equivalent:
- $T$ is continuous (for the metrics induced by the norms on $E$ and $F$);
- $T$ is continuous at $0$;
- $T$ is bounded.
Proof.
The proof essentially follows the proof of the similar Theorem 4. See also the discussion about the usefulness of this theorem there.
□
Theorem 12
Let E be a normed space, and let F be a Banach space. Then
B(E,F) is a Banach space.
Proof.
In essence, we follow the same three-step procedure as in Thms. 24, 6 and 8. Let $(T_n)$ be a Cauchy sequence in $B(E,F)$. For $x\in E$, check that $(T_n(x))$ is Cauchy in $F$, and hence converges to, say, $T(x)$, as $F$ is complete. Then check that $T:E\to F$ is linear, bounded, and that $\|T_n-T\|\to 0$.
□
We write B(E) for B(E,E). For normed spaces
E, F and G, and for T∈B(E,F) and
S∈B(F,G), we have that ST=S∘ T∈B(E,G) with ||ST|| ≤ ||S|| ||T||.
For $T\in B(E,F)$, if there exists $S\in B(F,E)$ with $ST=I_E$, the identity of $E$, and $TS=I_F$, then $T$ is said to be invertible, and we write $S=T^{-1}$. In this case, we say that $E$ and $F$ are isomorphic spaces, and that $T$ is an isomorphism.
If ||T(x)||=||x|| for each x∈ E, we say that T is
an isometry. If additionally T is an isomorphism, then
T is an isometric isomorphism, and we say that E and
F are isometrically isomorphic.
11.3 Dual Spaces
Let E be a normed vector space, and let E* (also written
E′) be B(E,K), the space of bounded
linear maps from E to K, which we call
functionals, or more correctly, bounded linear
functionals, see Defn. 1. Notice that as
K is complete, the above theorem shows that E* is
always a Banach space.
Theorem 13
Let $1 < p < \infty$, and again let $q$ be such that $1/p+1/q=1$. Then the map $l_q\to(l_p)^*: u\mapsto\varphi_u$ is an isometric isomorphism, where $\varphi_u$ is defined, for $u=(u_j)\in l_q$, by
\[ \varphi_u(x)=\sum_{j=1}^{\infty}u_jx_j \qquad (x=(x_j)\in l_p). \]
Proof.
By Hölder’s inequality, we see that
\[ |\varphi_u(x)|\le\sum_{j=1}^{\infty}|u_j|\,|x_j|\le\Bigl(\sum_{j=1}^{\infty}|u_j|^q\Bigr)^{1/q}\Bigl(\sum_{j=1}^{\infty}|x_j|^p\Bigr)^{1/p}=\|u\|_q\,\|x\|_p. \]
So the sum converges, and hence $\varphi_u$ is defined. Clearly $\varphi_u$ is linear, and the above estimate also shows that $\|\varphi_u\|\le\|u\|_q$. The map $u\mapsto\varphi_u$ is also clearly linear, and we’ve just shown that it is norm-decreasing.
Now let $\varphi\in(l_p)^*$. For each $n$, let $e_n=(0,\ldots,0,1,0,\ldots)$ with the $1$ in the $n$th position. Then, for $x=(x_n)\in l_p$,
\[ \Bigl\|x-\sum_{k=1}^{n}x_ke_k\Bigr\|_p=\Bigl(\sum_{k=n+1}^{\infty}|x_k|^p\Bigr)^{1/p}\to 0, \]
as $n\to\infty$. As $\varphi$ is continuous, we see that
\[ \varphi(x)=\lim_{n\to\infty}\sum_{k=1}^{n}\varphi(x_ke_k)=\sum_{k=1}^{\infty}x_k\varphi(e_k). \]
Let $u_k=\varphi(e_k)$ for each $k$. If $u=(u_k)\in l_q$ then we would have that $\varphi=\varphi_u$.
Let us fix $N\in\mathbb{N}$, and define
\[ x_k=\begin{cases}0, & \text{if } u_k=0 \text{ or } k>N;\\ \overline{u_k}\,|u_k|^{q-2}, & \text{if } u_k\ne 0 \text{ and } k\le N.\end{cases} \]
Then we see that
\[ \sum_{k=1}^{\infty}|x_k|^p=\sum_{k=1}^{N}|u_k|^{p(q-1)}=\sum_{k=1}^{N}|u_k|^q, \]
as $p(q-1)=q$. Then, by the previous paragraph,
\[ \varphi(x)=\sum_{k=1}^{\infty}x_ku_k=\sum_{k=1}^{N}|u_k|^q. \]
Hence
\[ \|\varphi\|\ge\frac{|\varphi(x)|}{\|x\|_p}=\Bigl(\sum_{k=1}^{N}|u_k|^q\Bigr)^{1-1/p}=\Bigl(\sum_{k=1}^{N}|u_k|^q\Bigr)^{1/q}. \]
By letting $N\to\infty$, it follows that $u\in l_q$ with $\|u\|_q\le\|\varphi\|$. So $\varphi=\varphi_u$ and $\|\varphi\|=\|\varphi_u\|\le\|u\|_q$. Hence every element of $(l_p)^*$ arises as $\varphi_u$ for some $u$, and also $\|\varphi_u\|=\|u\|_q$.
□
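The norming vector used in the proof can be tried out on a finite sequence. This is a minimal sketch with hypothetical helper names and an arbitrary sample vector; it builds $x_k=\overline{u_k}\,|u_k|^{q-2}$ and checks that $|\varphi_u(x)|/\|x\|_p$ reproduces $\|u\|_q$.

```python
def p_norm(u, p):
    """(sum_j |u_j|^p)^(1/p) for a finite sequence u."""
    return sum(abs(a) ** p for a in u) ** (1.0 / p)

def functional(u, x):
    """phi_u(x) = sum_j u_j x_j (bilinear pairing, no conjugation)."""
    return sum(a * b for a, b in zip(u, x))

def norming_vector(u, q):
    """x_k = conj(u_k) |u_k|^(q-2) where u_k != 0, and 0 elsewhere."""
    return [complex(a).conjugate() * abs(a) ** (q - 2) if a != 0 else 0j
            for a in u]

p = 3.0
q = p / (p - 1.0)             # conjugate exponent, here 1.5
u = [3.0, 0.0, -4.0j, 1 + 1j]  # a finitely supported element of l_q
x = norming_vector(u, q)

ratio = abs(functional(u, x)) / p_norm(x, p)
print(ratio, p_norm(u, q))
```

Both printed numbers agree up to rounding, which is the finite-dimensional shadow of the estimate $\|\varphi\|\ge\|u\|_q$.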
Loosely speaking, we say that lq = (lp)*, although we
should always be careful to keep in mind the exact map which gives
this.
Corollary 14 (Riesz–Frechet Self-duality Lemma 11)
$l_2$ is self-dual: $l_2=l_2^*$.
Similarly, we can show that $c_0^*=l_1$ and that $(l_1)^*=l_\infty$ (the implementing isometric isomorphism is given by the same summation formula).
11.4 Hahn–Banach Theorem
Mathematical induction is a well known method to prove statements
depending on a natural number. Mathematical induction is based
on the following property of natural numbers: any non-empty subset of
ℕ has a least element. This observation can be
generalised to the transfinite induction described as follows.
A poset is a set X with a relation ≼ such that
a≼ a for all a∈ X, if a≼ b and b≼
a then a=b, and if a≼ b and b≼ c, then
a≼ c. We say that (X,≼) is total if for
every a,b∈ X, either a≼ b or b≼ a. For a
subset S⊆ X, an element a∈ X is an upper
bound for S if s≼ a for every s∈ S. An element
a∈ X is maximal if whenever b∈ X is such that
a≼ b, then also b≼ a.
Then Zorn’s Lemma tells us that if X
is a non-empty poset such that every total subset has an upper bound,
then X has a maximal element. Really this is an axiom
which we have to assume, in addition to the usual axioms of
set-theory. Zorn’s Lemma is equivalent to the
axiom of choice and
Zermelo’s theorem.
Theorem 15 (Hahn–Banach Theorem)
Let $E$ be a normed vector space, and let $F\subseteq E$ be a subspace. Let $\varphi\in F^*$. Then there exists $\psi\in E^*$ with $\|\psi\|\le\|\varphi\|$ and $\psi(x)=\varphi(x)$ for each $x\in F$.
Proof.
We do the real case. An “extension” of $\varphi$ is a bounded linear map $\varphi_G:G\to\mathbb{R}$ such that $F\subseteq G\subseteq E$, $\varphi_G(x)=\varphi(x)$ for $x\in F$, and $\|\varphi_G\|\le\|\varphi\|$. We introduce a partial order on the pairs $(G,\varphi_G)$ of subspaces and functionals as follows: $(G_1,\varphi_{G_1})\preceq(G_2,\varphi_{G_2})$ if and only if $G_1\subseteq G_2$ and $\varphi_{G_1}(x)=\varphi_{G_2}(x)$ for all $x\in G_1$. A Zorn’s Lemma argument shows that a maximal extension $\varphi_G:G\to\mathbb{R}$ exists. We shall show that if $G\ne E$, then we can extend $\varphi_G$, a contradiction. For brevity we write $\varphi$ for $\varphi_G$ below.
Let $x\notin G$, so an extension $\varphi_1$ of $\varphi$ to the linear span of $G$ and $x$ must have the form
\[ \varphi_1(x'+ax)=\varphi(x')+a\alpha \qquad (x'\in G,\ a\in\mathbb{R}), \]
for some $\alpha\in\mathbb{R}$. Under this, $\varphi_1$ is linear and extends $\varphi$, but we also need to ensure that $\|\varphi_1\|\le\|\varphi\|$. That is, we need
\[ |\varphi(x')+a\alpha|\le\|\varphi\|\,\|x'+ax\| \qquad (x'\in G,\ a\in\mathbb{R}). \tag{66} \]
It is straightforward for $a=0$; otherwise, to simplify the proof, put $-ay=x'$ in (66) and divide both sides of the inequality by $|a|$. Thus we need to show that there exists $\alpha$ such that
\[ |\alpha-\varphi(y)|\le\|\varphi\|\,\|x-y\| \qquad\text{for all } y\in G, \]
or
\[ \varphi(y)-\|\varphi\|\,\|x-y\|\le\alpha\le\varphi(y)+\|\varphi\|\,\|x-y\|. \]
For any $y_1$ and $y_2$ in $G$ we have:
\[ \varphi(y_1)-\varphi(y_2)\le\|\varphi\|\,\|y_1-y_2\|\le\|\varphi\|\,(\|x-y_2\|+\|x-y_1\|). \]
Thus
\[ \varphi(y_1)-\|\varphi\|\,\|x-y_1\|\le\varphi(y_2)+\|\varphi\|\,\|x-y_2\|. \]
As $y_1$ and $y_2$ were arbitrary,
\[ \sup_{y\in G}\bigl(\varphi(y)-\|\varphi\|\,\|x-y\|\bigr)\le\inf_{y\in G}\bigl(\varphi(y)+\|\varphi\|\,\|x-y\|\bigr). \]
Hence we can choose $\alpha$ between the sup and the inf.
The complex case follows by “complexification”.
□
The Hahn-Banach theorem tells us that a functional from a subspace can
be extended to the whole space without increasing the norm. In
particular, extending a functional on a one-dimensional subspace
yields the following.
Corollary 16
Let $E$ be a normed vector space, and let $x\in E$. Then there exists $\varphi\in E^*$ with $\|\varphi\|=1$ and $\varphi(x)=\|x\|$.
Another useful result which can be proved by Hahn-Banach is the
following.
Corollary 17
Let $E$ be a normed vector space, and let $F$ be a subspace of $E$. For $x\in E$, the following are equivalent:
- $x\in\overline{F}$, the closure of $F$;
- for each $\varphi\in E^*$ with $\varphi(y)=0$ for each $y\in F$, we have that $\varphi(x)=0$.
Proof.
1⇒2 follows because we can find a sequence $(y_n)$ in $F$ with $y_n\to x$; then it’s immediate that $\varphi(x)=0$, because $\varphi$ is continuous. Conversely, we show that if 1 doesn’t hold then 2 doesn’t hold (that is, the contrapositive to 2⇒1).
So, $x\notin\overline{F}$. Define $\psi:\operatorname{lin}\{F,x\}\to\mathbb{K}$ by
\[ \psi(y+tx)=t \qquad (y\in F,\ t\in\mathbb{K}). \]
This is well-defined, for $y,y'\in F$ if $y+tx=y'+t'x$ then either $t=t'$, or otherwise $x=(t-t')^{-1}(y'-y)\in F$ which is a contradiction. The map $\psi$ is obviously linear, so we need to show that it is bounded. Towards a contradiction, suppose that $\psi$ is not bounded, so we can find a sequence $(y_n+t_nx)$ with $\|y_n+t_nx\|\le 1$ for each $n$, and yet $|\psi(y_n+t_nx)|=|t_n|\to\infty$. Then $\|t_n^{-1}y_n+x\|\le 1/|t_n|\to 0$, so that the sequence $(-t_n^{-1}y_n)$, which is in $F$, converges to $x$. So $x$ is in the closure of $F$, a contradiction. So $\psi$ is bounded. By the Hahn–Banach theorem, we can find some $\varphi\in E^*$ extending $\psi$. For $y\in F$, we have $\varphi(y)=\psi(y)=0$, while $\varphi(x)=\psi(x)=1$, so 2 doesn’t hold, as required.
□
We define E** = (E*)* to be the bidual of E, and define
J:E→ E** as follows. For x∈ E, J(x) should
be in E**, that is, a map E*→K. We
define this to be the map φ↦φ(x) for φ∈ E*.
We write this as
J(x)(φ) = φ(x) (x∈ E, φ∈ E*).
|
The Corollary 16 shows that J is an isometry;
when J is surjective (that is, when J is an isomorphism), we
say that E is reflexive. For example, lp is
reflexive for 1<p<∞. On the other hand c0 is
not reflexive.
11.5 C(X) Spaces
This section is not examinable. Standard facts about
topology will be used in later sections of the course.
All our topological spaces are assumed Hausdorff. Let X be
a compact space, and let CK(X) be the space of
continuous functions from X to K, with pointwise
operations, so that CK(X) is a vector space. We norm
$C_{\mathbb{K}}(X)$ by setting
\[ \|f\|_\infty=\sup_{x\in X}|f(x)| \qquad (f\in C_{\mathbb{K}}(X)). \]
Theorem 18
Let X be a compact space. Then CK(X) is a Banach
space.
Let $E$ be a vector space, and let $\|\cdot\|_{(1)}$ and $\|\cdot\|_{(2)}$ be norms on $E$. These norms are equivalent if there exists $m>0$ with
\[ m^{-1}\|x\|_{(2)}\le\|x\|_{(1)}\le m\|x\|_{(2)} \qquad (x\in E). \]
Theorem 19
Let E be a finite-dimensional vector space with basis
{e1,…,en}, so we can identify E with
Kn as vector spaces, and hence talk about the norm
||·||2 on E. If ||·|| is any norm on
E, then ||·|| and ||·||2 are equivalent.
Corollary 20
Let E be a finite-dimensional normed space. Then a subset
X⊆ E is compact if and only if it is closed and bounded.
Lemma 21
Let E be a normed vector space, and let F be a closed
subspace of E with E≠F. For 0<θ<1, we can find
x0∈ E with ||x0||≤1 and ||x0−y||>θ
for y∈ F.
Theorem 22
Let E be an infinite-dimensional normed vector space. Then the
closed unit ball of E, the set {x∈ E : ||x||≤
1}, is not compact.
Proof.
Use the above lemma to construct a sequence (xn) in the closed
unit ball of E with, say, ||xn−xm||≥1/2 for each
n≠m. Then (xn) can have no convergent subsequence, and
so the closed unit ball cannot be compact.
□
12 Measure Theory
The presentation in this section is close to [, , ].
12.1 Basic Measure Theory
The following object will be the cornerstone of our construction.
Definition 1
Let $X$ be a set. A σ-algebra $R$ on $X$ is a collection of subsets of $X$, written $R\subseteq 2^X$, such that
- $X\in R$;
- if $A,B\in R$, then $A\setminus B\in R$;
- if $(A_n)$ is any sequence in $R$, then $\cup_nA_n\in R$.
Note that in the third condition we admit any countable unions. The usage of “σ” in the names of σ-algebra and σ-ring is a reference to this. If we replace the condition by
- if $(A_n)_1^m$ is any finite family in $R$, then $\cup_{n=1}^{m}A_n\in R$;
then we obtain the definition of an algebra.
For a σ-algebra $R$ and $A,B\in R$, we have
\[ A\cap B=X\setminus\bigl(X\setminus(A\cap B)\bigr)=X\setminus\bigl((X\setminus A)\cup(X\setminus B)\bigr)\in R. \]
Similarly, $R$ is closed under taking (countably) infinite intersections.
If we drop the first condition from the definition of (σ-)algebra (but keep the above conclusion from it!) we get a (σ-)ring, that is a (σ-)ring is closed under (countable) unions, (countable) intersections and subtractions of sets.
Exercise 2
- Use the above comments to write in full the three missing definitions: of a set algebra, a set ring and a set σ-ring.
- Show that the empty set belongs to any non-empty ring.
Sets Ak are pairwise disjoint if An∩ Am=∅ for
n≠m. We denote the union of pairwise disjoint sets by
⊔, e.g. A ⊔ B ⊔ C.
It is easy to work with a vector space through its basis. For a ring
of sets the following notion works as a helpful “basis”.
Definition 3
A semiring S of sets is a collection such that
- it is closed under intersection;
- for A, B∈ S we have A∖ B=C1⊔ … ⊔ CN with Ck∈ S.
Again, any non-empty semiring contains the empty set.
Example 4
The following are semirings but not rings:
-
The collection of intervals [a,b) on the real line;
-
The collection of all rectangles { a≤ x < b, c≤ y <d
} on the plane.
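The two semiring axioms can be checked by hand for half-open intervals. The following is a minimal sketch, assuming an interval [a,b) with a<b is stored as a pair (a, b); the function names are illustrative and not part of the notes.

```python
from fractions import Fraction

def intersect(i, j):
    """Intersection of half-open intervals [a, b) and [c, d); None stands for the empty set."""
    a, b = i
    c, d = j
    lo, hi = max(a, c), min(b, d)
    return (lo, hi) if lo < hi else None

def difference(i, j):
    """Return [a, b) minus [c, d) as a list of pairwise disjoint half-open intervals."""
    a, b = i
    c, d = j
    pieces = []
    if a < min(b, c):          # part of [a, b) lying to the left of c
        pieces.append((a, min(b, c)))
    if max(a, d) < b:          # part of [a, b) lying to the right of d
        pieces.append((max(a, d), b))
    return pieces
```

The difference consists of at most two intervals, as the second axiom requires. The union of [0,1) and [2,3) is not an interval, which is why this collection is not a ring.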
As the intersection of a family of σ-algebras is again a
σ-algebra, and the power set 2X is a σ-algebra,
it follows that given any collection D⊆ 2X,
there is a σ-algebra R such that
D⊆R, such that if S
is any other σ-algebra, with
D⊆S, then
R⊆S. We call R the
σ-algebra generated by D.
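On a finite set X every algebra is automatically a σ-algebra, so the generated σ-algebra can be computed by brute-force closure. The sketch below assumes X is finite and uses the illustrative name generated_algebra; it is not meant as an efficient algorithm.

```python
from itertools import combinations

def generated_algebra(X, D):
    """Smallest family of subsets of the finite set X that contains D and X
    and is closed under set difference and union."""
    family = {frozenset(), frozenset(X)} | {frozenset(A) for A in D}
    changed = True
    while changed:
        changed = False
        # iterate over a snapshot, since new sets are added inside the loop
        for A, B in combinations(list(family), 2):
            for C in (A - B, B - A, A | B):
                if C not in family:
                    family.add(C)
                    changed = True
    return family
```

For X={1,2,3,4} and D={{1},{2}} the result consists of all unions of the "atoms" {1}, {2} and {3,4}.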
Exercise 5
Let S be a semiring. Show that
-
The collection of all finite disjoint unions ⊔k=1n Ak,
where Ak∈ S, is a ring. We call it the ring R(S)
generated by the semiring S.
- Any ring containing S contains R(S) as
well.
-
The collection of all finite (not necessarily disjoint!)
unions ∪k=1n Ak, where Ak∈ S, coincides with
R(S).
We introduce the symbols +∞, −∞, and treat these as being “extended
real numbers”, so −∞ < t < ∞ for t∈ℝ. We define
t+∞ = ∞, t∞ = ∞ if t>0 and so forth. We do not (and
cannot, in a consistent manner) define ∞ − ∞ or 0·∞.
Definition 6
A measure is a map µ: R→[0,∞] defined on a (semi-)ring (or σ-algebra) R, such that if A=⊔n An for A∈R and a finite subset (An) of R, then µ(A) = ∑n µ(An). This property is called additivity of a measure.
The additivity property of a measure is rather demanding. For example, let us consider the decomposition [0,1)=[0,1/2) ⊔ [1/2,1) = [0,1/3) ⊔ [1/3,2/3) ⊔ [2/3,1), then additivity puts measures of those five intervals into equations:
\[ \mu([0,\tfrac{1}{2})) + \mu([\tfrac{1}{2},1)) = \mu([0,1)) = \mu([0,\tfrac{1}{3})) + \mu([\tfrac{1}{3},\tfrac{2}{3})) + \mu([\tfrac{2}{3},1)). \]
Similar equations appear from any other (out of infinitely many) decomposition of [0,1), thus measures of various intervals are highly interconnected and very far from being arbitrary.
Exercise 7
Show that the following two conditions are equivalent:
-
µ(∅)=0.
- There is a set A∈R such that µ(A)<∞.
The first condition often (but not always) is included in the
definition of a measure.
In analysis we are interested in infinities and limits, thus the
following extension of additivity is very important.
Definition 8
In terms of the previous definition we say that µ is
countably additive (or σ-additive) if for any
countable infinite family (An) of pairwise disjoint sets from
R such that A=⊔n An∈R we have
µ(A) = ∑n µ(An). If the sum diverges, then
as it will be the sum of positive numbers, we can, without problem,
define it to be +∞.
Note that this property may be stated as a sort of continuity of an additive measure, cf. (7):
\[ \mu\Bigl( \lim_{n\to\infty} \bigsqcup_{k=1}^{n} A_k \Bigr) = \lim_{n\to\infty} \mu\Bigl( \bigsqcup_{k=1}^{n} A_k \Bigr). \]
Example 9
- Fix a point a∈ℝ and define a measure µ by the
condition µ(A)=1 if a∈ A and µ(A)=0 otherwise.
- For the ring obtained in Exercise 5 from the
semiring S of intervals [a,b) in Example 4, define
µ([a,b))=b−a on S. This is a measure, and we will show its
σ-additivity.
- For the ring obtained in Exercise 5 from the
semiring S of rectangles in Example 4, define
µ(V)=(b−a)(d−c) for the rectangle V={ a≤ x < b, c≤ y <d
} ∈ S. It will again be a σ-additive measure.
- Let X=ℕ and R=2ℕ, we define
µ(A)=0 if A is a finite subset of X=ℕ and
µ(A)=+∞ otherwise. Let An={n}, then
µ(An)=0 and µ(⊔n
An)=µ(ℕ)=+∞≠ ∑n µ(An)=0. Thus, this
measure is not σ-additive.
We will see further examples of measures which are not
σ-additive in Section 12.4.
Definition 10
A measure µ is finite if µ(A)<∞ for all A∈R. A measure µ is σ-finite if X is a union of a countable
number of sets Xk, such that for any A∈ R and any k∈
ℕ the intersection A∩ Xk is in R and µ(A∩ Xk)<∞.
Exercise 11
Modify the first measure from Example 9 to obtain
- a measure which is not finite, but is
σ-finite. (Hint: let the measure count the number of
integer points in a set).
- a measure which is not σ-finite. (Hint: assign
µ(A)=+∞ if a∈ A.)
Proposition 12
Let µ be a σ-additive measure on a σ-algebra R. Then:
- If A,B∈R with A⊆ B, then
µ(A)≤µ(B) [we call this property “monotonicity of a measure”];
- If A,B∈R with A⊆ B and µ(B)<∞, then
µ(B∖ A) = µ(B) − µ(A);
- If (An) is a sequence in R with A1 ⊆ A2 ⊆ A3 ⊆⋯, then
\[ \lim_{n\to\infty} \mu(A_n) = \mu\Bigl( \bigcup_n A_n \Bigr); \]
- If (An) is a sequence in R with A1 ⊇ A2 ⊇ A3 ⊇⋯, and µ(Am)<∞ for some m, then
\[ \lim_{n\to\infty} \mu(A_n) = \mu\Bigl( \bigcap_n A_n \Bigr). \qquad (67) \]
Proof.
The first two properties are easy to see. For the third statement,
define A=∪n An, B1=A1 and
Bn=An∖ An−1, n>1. Then
An=⊔_{k=1}^n Bk and A=⊔_{k=1}^∞ Bk. Using
the σ-additivity of the measure,
µ(A)=∑_{k=1}^∞ µ(Bk) and
µ(An)=∑_{k=1}^n µ(Bk). From the theorem in real
analysis that any monotonic sequence of real numbers converges
(recall that we admit +∞ as a limit value) we have
µ(A)=∑_{k=1}^∞ µ(Bk)=lim_{n→ ∞}
∑_{k=1}^n µ(Bk) = lim_{n→ ∞} µ(An). The
last statement can be shown similarly.
□
Exercise 13
Let a measure µ on ℕ be defined by µ(A)=0 for finite A and µ(A) = ∞ for infinite A. Check that µ is additive but not σ-additive.
Then give an example showing that µ does not satisfy the third property of Prop. 12.
12.2 Extension of Measures
From now on we consider only finite measures, an extension to
σ-finite measures will be done later.
Proposition 14
Any measure µ′ on a semiring S is uniquely extended to a measure µ on the generated ring R(S), see Ex. 5. If the initial measure was σ-additive, then the extension is σ-additive as well.
Proof.
If an extension exists, it must satisfy µ(A)=∑_{k=1}^n µ′(Ak) for A=⊔_{k=1}^n Ak, where Ak∈ S. We need to check two things for this definition:
- Consistency, i.e. independence of the value from the
presentation of A∈ R(S) as A=⊔_{k=1}^n
Ak, where Ak∈ S. For two different presentations
A=⊔_{j=1}^n Aj and A=⊔_{k=1}^m Bk define
Cjk=Aj∩ Bk, which are pair-wise disjoint. By the
additivity of µ′ we have µ′(Aj)=∑k µ′(Cjk)
and µ′(Bk)=∑j µ′(Cjk). Then
\[ \sum_{j=1}^{n} \mu'(A_j) = \sum_{j=1}^{n} \sum_{k=1}^{m} \mu'(C_{jk}) = \sum_{k=1}^{m} \sum_{j=1}^{n} \mu'(C_{jk}) = \sum_{k=1}^{m} \mu'(B_k). \]
- Additivity. For A=⊔_{k=1}^n Ak, where Ak∈
R(S), we can present Ak=⊔_{j=1}^{n(k)}
Cjk, Cjk∈ S. Thus A=⊔_{k=1}^n ⊔_{j=1}^{n(k)}
Cjk and:
\[ \mu(A) = \sum_{k=1}^{n} \sum_{j=1}^{n(k)} \mu'(C_{jk}) = \sum_{k=1}^{n} \mu(A_k). \]
Finally, we show the σ-additivity. For a set A=⊔_{k=1}^∞ Ak, where A and Ak∈ R(S), find
presentations A=⊔_{j=1}^n Bj, Bj∈ S, and Ak=⊔_{l=1}^{m(k)} Blk, Blk∈ S. Define
Cjlk=Bj ∩ Blk∈ S, then Bj=⊔_{k=1}^∞ ⊔_{l=1}^{m(k)} Cjlk and
Ak=⊔_{j=1}^n ⊔_{l=1}^{m(k)} Cjlk.
Then, from the σ-additivity of µ′:
\[ \mu(A) = \sum_{j=1}^{n} \mu'(B_j) = \sum_{j=1}^{n} \sum_{k=1}^{\infty} \sum_{l=1}^{m(k)} \mu'(C_{jlk}) = \sum_{k=1}^{\infty} \sum_{j=1}^{n} \sum_{l=1}^{m(k)} \mu'(C_{jlk}) = \sum_{k=1}^{\infty} \mu(A_k), \]
where we changed the summation order in series with non-negative
terms.
□
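For the semiring of half-open intervals the extension to R(S) can be made concrete. The sketch below is only an illustration under stated assumptions: a set of the ring is given as a list of pairs (a, b) standing for [a,b), possibly overlapping, and mu is any additive interval function (by default the length b−a). The names disjoint_form and ring_measure are hypothetical.

```python
from fractions import Fraction

def disjoint_form(intervals):
    """Rewrite a finite union of half-open intervals [a, b) as a disjoint union."""
    merged = []
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        if merged and a <= merged[-1][1]:      # overlaps or touches the previous piece
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def ring_measure(intervals, mu=lambda a, b: b - a):
    """Extend mu from intervals to finite unions by additivity over a disjoint form."""
    return sum(mu(a, b) for a, b in disjoint_form(intervals))
```

Two different presentations of the same set, for example [0,1)⊔[1,3) and [0,2)⊔[2,3), give the same value, which is the consistency property checked in the proof.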
In a similar way we can extend a measure from a semiring to the
corresponding σ-ring; however, it can be done even for a
larger family. The procedure recalls the famous story of
Baron Munchausen, who
saves himself from drowning in a swamp by pulling on his own
hair. Indeed, initially we know the measure only for elements of the semiring
S or their finite disjoint unions from R(S). For an arbitrary
set A we may assign a measure from an element of R(S) which
“approximates” A. But how do we measure such an approximation? Well,
to this end we use the measure on R(S) again (pulling on his own
hair)!
Coming back to exact definitions, we introduce the following notion.
Definition 15
Let S be a semiring of subsets in X, and µ be a measure defined on S.
An outer measure µ* on X is a map µ*: 2^X→[0,∞] defined by:
\[ \mu^*(A) = \inf \Bigl\{ \sum_k \mu(A_k) : A \subseteq \bigcup_k A_k,\ A_k \in S \Bigr\}. \]
Proposition 16
An outer measure has the following properties:
- µ*(∅)=0;
- if A⊆ B then µ*(A)≤µ*(B); this is called
monotonicity of the outer measure;
- if (An) is any sequence in 2^X, then
µ*(∪n An) ≤ ∑n µ*(An).
The final condition says that an outer measure is countably
sub-additive. Note that an outer measure may fail to be
a measure in the sense of Defn. 6 due to a lack of
additivity.
Example 17
The Lebesgue outer measure on ℝ is defined out of
the measure µ([a,b))=b−a from Example 9, that is, for A⊆ℝ, as
\[ \mu^*(A) = \inf \Bigl\{ \sum_{j=1}^{\infty} (b_j - a_j) : A \subseteq \bigcup_{j=1}^{\infty} [a_j, b_j) \Bigr\}. \]
We make this definition as, intuitively, the “length”, or measure,
of the interval [a,b) is (b−a). For example, for the outer Lebesgue measure we have µ*(A)=0 for
any countable set A, which follows, as clearly µ*({x})=0 for
any x∈ℝ.
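The claim about countable sets can be made explicit by covering the k-th point with an interval of length ε/2^{k+1}. The following sketch checks this for finitely many rational points with exact arithmetic; the helper names cover and total_length are illustrative only.

```python
from fractions import Fraction

def cover(points, eps):
    """Cover the k-th point x_k by [x_k, x_k + eps / 2**(k+1)), k = 1, 2, ..."""
    return [(x, x + eps / 2 ** (k + 1)) for k, x in enumerate(points, start=1)]

def total_length(intervals):
    """Sum of the lengths b - a of the intervals [a, b)."""
    return sum(b - a for a, b in intervals)
```

Since the lengths form a geometric series with sum ε/2, the total length stays below ε however many points are covered, so the infimum in the definition of µ* is zero.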
Lemma 18
Let a<b. Then µ*([a,b])=b−a.
Proof. For ε>0, as [a,b] ⊆ [a,b+ε), we have that µ*([a,b])≤ (b−a)+ε. As ε>0 was arbitrary, µ*([a,b]) ≤ b−a.
To show the opposite inequality we observe that
[a,b)⊂[a,b] and µ*([a,b)) =b−a (because [a,b) is
in the semiring), so µ*([a,b])≥ b−a by the monotonicity of the outer measure (Prop. 16).
□
Our next aim is to construct measures from outer measures. We use the
notation A▵ B=(A∪ B)∖ (A∩ B) for
symmetric difference of sets.
Definition 19
Given an outer measure µ* defined by a measure µ on a
semiring S, we define A⊆ X to be Lebesgue measurable
if for any ε>0 there
is a finite union B of elements in S (in other words:
B∈ R(S), see Ex. 5), such that µ*(A▵ B)<ε.
Figure 18: Approximating area by refined simple sets arrangements.
See Fig. 18 for an illustration of the concept of measurable sets.
Obviously all elements of S and R(S) are measurable.
Exercise 20
- Define a function of pairs of Lebesgue
measurable sets A and B as the outer measure
of the symmetric difference of A and B:
\[ d(A,B) = \mu^*(A \triangle B). \qquad (68) \]
Show that d is a metric on the collection of equivalence classes with respect to the equivalence relation: A∼ B if d(A,B)=0. Hint: to show the triangle inequality
use the inclusion:
\[ A \triangle B \subseteq (A \triangle C) \cup (C \triangle B). \]
- Let a sequence (εn)→ 0 be monotonically decreasing. For a Lebesgue measurable A there exists a sequence (An)⊂ R(S) such that d(A,An)< εn for each n. Show that (An) is a Cauchy sequence for the distance d (68).
An alternative definition of a measurable set is due to Carathéodory.
Definition 21
Given an outer measure µ*, we define E⊆ X to be
Carathéodory measurable if
\[ \mu^*(A) = \mu^*(A \cap E) + \mu^*(A \setminus E) \]
for any A⊆ X.
As µ* is sub-additive, this is equivalent to
\[ \mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \setminus E) \qquad (A \subseteq X), \]
as the other inequality is automatic.
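On a finite set the Carathéodory condition can be tested by brute force. The sketch below uses a standard toy outer measure on a three-point set (an assumption of this illustration: it satisfies the three properties of Prop. 16 but is not claimed to arise from a semiring as in Defn. 15). All names are hypothetical.

```python
from itertools import combinations

X = frozenset({0, 1, 2})

def subsets(S):
    """All subsets of the finite set S."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def outer(A):
    """Toy outer measure (assumption): 0 on the empty set, 2 on X, 1 on every other set."""
    if not A:
        return 0
    return 2 if A == X else 1

def is_caratheodory_measurable(E):
    """Check mu*(A) = mu*(A & E) + mu*(A - E) for every test set A."""
    return all(outer(A) == outer(A & E) + outer(A - E) for A in subsets(X))

measurable = [E for E in subsets(X) if is_caratheodory_measurable(E)]
```

Here only ∅ and X pass the test: for E={0} the test set A={0,1} gives 1 on the left and 1+1 on the right. This shows how sub-additivity alone can leave very few measurable sets.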
Exercise* 22
- Show that for a Lebesgue measurable set A and any ε>0 there exist two elements B1 and B2 of the ring R(S) such that B1⊂ A ⊂ B2 and µ(B2∖ B1) < ε, cf. the areas shaded in darker and lighter colours on Fig. 18.
Hint: For a set B∈ R(S) such that µ*(A▵ B)<ε/2 from Defn. 19 there exists C∈ R(S) such that C ⊃ A▵ B and µ(C) < µ*(A▵ B)+ε/2. Put B1 = B ∖ C and B2 = B ∪ C.
- Let µ(X)<∞. Show that A is Lebesgue measurable if and only if µ(X) = µ*(A)+µ*(X∖ A).
- Show that measurability by Lebesgue and by Carathéodory are equivalent.
Suppose now that the ring R(S) is an algebra (i.e., contains the
maximal element X). Then, the outer measure of any set is finite,
and the following theorem holds:
Theorem 23 (Lebesgue)
Let µ* be an outer measure on X defined by a semiring S, and let
L be the collection of all Lebesgue measurable sets for µ*. Then L is a
σ-algebra, and if µ′ is the restriction of µ* to L, then µ′ is a measure. Furthermore,
µ′ is σ-additive on L if µ is σ-additive on S.
Proof.[Sketch of proof]
Clearly, R(S)⊂ L. Now we show that µ*(A)=µ(A) for a set A∈ R(S). If
A⊂ ∪k Ak for Ak ∈ S, then µ(A)≤ ∑k µ(Ak); taking the infimum we get
µ(A)≤µ*(A). For the opposite inequality, any A∈ R(S) has a disjoint representation
A=⊔k Ak, Ak∈ S, thus µ*(A)≤ ∑k µ(Ak)=µ(A).
Now we will show that R(S) with the distance d (68) is an incomplete metric space, with the
measure µ being a uniformly continuous function. Measurable
sets make up the completion of R(S) (cf. Ex. 20), with µ being the continuation of
µ* to the completion by continuity, cf. Ex. 62.
Then, by the definition, Lebesgue measurable sets make up the closure
of R(S) with respect to this distance.
We can check that measurable sets form an algebra. To this end we
need to make estimations, say, of µ*((A1∩ A2)▵
(B1∩ B2)) in terms of µ*(Ai▵ Bi). A
demonstration for any finite number of sets is performed through
mathematical induction. The above two-set case provides both the
base and the step of the induction.
Now, we show that L is a σ-algebra. Let
Ak∈ L and A=∪k Ak. Then for any
ε>0 there exists Bk∈ R(S), such that
µ*(Ak▵ Bk)<ε/2^k. Define B=∪k
Bk. Then
\[ \Bigl( \bigcup_k A_k \Bigr) \triangle \Bigl( \bigcup_k B_k \Bigr) \subset \bigcup_k ( A_k \triangle B_k ) \quad \text{implies} \quad \mu^*(A \triangle B) < \varepsilon. \]
We cannot stop at this point since B=∪k Bk may not be in
R(S). Thus, define B′1=B1 and
B′k=Bk∖ ∪_{i=1}^{k−1} Bi, so the B′k are
pair-wise disjoint. Then B=⊔k B′k and
B′k∈R(S). From the convergence of the series ∑k µ(B′k) there
is N such that ∑_{k=N}^∞ µ(B′k)<ε. Let
B′=∪_{k=1}^N B′k, which is in R(S). Then
µ*(B▵ B′)≤ ε and, thus,
µ*(A▵ B′)≤ 2ε.
To check that µ* is a measure on L we use the following
Lemma 24
| µ*(A)−µ*(B) |≤ µ*(A▵ B), that is µ* is uniformly continuous in the metric d(A,B) (68).
Proof.[Proof of the Lemma]
Use inclusions A⊂ B∪(A▵ B) and
B⊂ A∪(A▵ B).
□
To show additivity take A1,2∈L, A=A1⊔
A2, B1,2∈R(S) and µ*(Ai▵
Bi)<ε. Then µ*(A▵(B1∪
B2))<2ε and | µ*(A) − µ*(B1∪
B2) |<2ε. Moreover µ*(B1∪ B2)=µ(B1∪
B2)=µ(B1)+µ(B2)−µ(B1∩ B2), but µ(B1∩
B2)=d(B1∩ B2,∅)=d(B1∩ B2,A1∩
A2)<2ε. Therefore
\[ \bigl| \mu^*(B_1 \cup B_2) - \mu(B_1) - \mu(B_2) \bigr| < 2\varepsilon. \]
Combining everything together we get (this is a sort of ε/3-argument):
\[ \begin{aligned}
\bigl| \mu^*(A) - \mu^*(A_1) - \mu^*(A_2) \bigr|
 &= \bigl| \mu^*(A) - \mu^*(B_1 \cup B_2) + \mu^*(B_1 \cup B_2) - (\mu(B_1) + \mu(B_2)) \\
 &\qquad + \mu(B_1) + \mu(B_2) - \mu^*(A_1) - \mu^*(A_2) \bigr| \\
 &\le \bigl| \mu^*(A) - \mu^*(B_1 \cup B_2) \bigr| + \bigl| \mu^*(B_1 \cup B_2) - (\mu(B_1) + \mu(B_2)) \bigr| \\
 &\qquad + \bigl| \mu(B_1) + \mu(B_2) - \mu^*(A_1) - \mu^*(A_2) \bigr| \\
 &\le 6\varepsilon.
\end{aligned} \]
Thus µ* is additive on L.
Check the countable additivity for A=⊔k Ak.
The inequality µ*(A)≤ ∑kµ*(Ak)
follows from countable sub-additivity. The opposite inequality is
the limiting case of the finite inequality
µ*(A)≥ µ*(⊔k=1N Ak)=∑k=1Nµ*(Ak)
following from monotonicity and additivity of µ*.
□
Corollary 25
Let E⊆ℝ
be open or closed. Then E is
Lebesgue measurable.
Proof.
As σ-algebras are closed under taking complements, we need
only show that open sets are Lebesgue measurable. For the latter we
will use a common trick, using the density and the countability of
the rationals.
Intervals (a,b) are Lebesgue measurable because they are countable unions of measurable half-open intervals from the semiring, e.g.:
\[ (0,1) = \bigcup_{n=2}^{\infty} \Bigl[ \frac{1}{n}, \frac{n-1}{n} \Bigr). \]
Now let U⊆ℝ be open. For each x∈ U,
there exists ax<bx with x∈(ax,bx)⊆ U. By
making ax slightly larger, and bx slightly smaller, we can
ensure that ax,bx∈ℚ. Thus U = ∪x (ax,
bx). Each interval is measurable, and there are at most a countable
number of them (endpoints make a countable set) thus U is the
countable (or finite) union of Lebesgue measurable sets, and hence
U is Lebesgue measurable itself.
□
We now extend the construction from finite measures to σ-finite ones. Let µ be a σ-additive and
σ-finite measure defined on a semiring in X=⊔k Xk,
such that the restriction of µ to every Xk is finite. Consider the
Lebesgue extension µk of µ defined within Xk. A set
A⊂ X is measurable if every intersection A∩ Xk is
µk-measurable. For such a measurable set A we define its
measure by the identity:
\[ \mu(A) = \sum_k \mu_k(A \cap X_k). \]
We call a measure µ defined on L complete if whenever E⊆ X is such that
there exists F∈L with µ(F)=0 and E⊆
F, we have that E∈L. Measures constructed from
outer measures by the above theorem are always complete. On the
example sheet, we saw how to form a complete measure from a given
measure. We call sets like E null sets: complete measures
are useful, because it is helpful to be able to say that null sets are
in our σ-algebra. Null sets can be quite complicated. For
the Lebesgue measure, all countable subsets of ℝ are
null, but then so is the Cantor set, which is uncountable.
Definition 26
If a property P(x) is true for all x except possibly for x∈
A, where µ(A)=0, we say that P(x) holds almost everywhere, or a.e.
12.3 Complex-Valued Measures and Charges
We start from the following observation.
Exercise 27
Let µ1 and µ2 be measures on the same
σ-algebra. Define µ1+µ2 and λµ1,
λ>0 by (µ1+µ2)(A)=µ1(A)+µ2(A) and
(λµ1)(A)=λ(µ1(A)). Then µ1+µ2 and
λµ1 are measures on the same σ-algebra as well.
In view of this, it will be helpful to extend the notion of a measure
to obtain a linear space.
Definition 28
Let X be a set, and R be a σ-ring. A
real- (complex-) valued function ν on R is called a
charge (or signed measure) if it is countably additive as follows: for any
Ak∈R the identity A=⊔k Ak implies that the
series ∑k ν(Ak) is absolutely convergent and has the sum
ν(A).
In the following “charge” means “real charge”.
Example 29
Any linear combination of σ-additive measures on
ℝ with real (complex) coefficients is a real (complex)
charge.
The opposite statement is also true:
Theorem 30
Any real (complex) charge ν has a representation
ν=µ1−µ2 (ν=µ1−µ2+iµ3−iµ4), where
µk are σ-additive measures.
To prove the theorem we need the following definition.
Definition 31
The variation of a charge ν on a set A is | ν |(A)=sup ∑k | ν(Ak) |,
where the supremum is taken over all disjoint splittings A=⊔k Ak.
Example 32
If ν=µ1−µ2, then | ν |(A)≤
µ1(A)+µ2(A). The inequality becomes an identity for
disjunctive measures on A (that is there is a partition
A=A1⊔ A2 such that µ2(A1)=µ1(A2)=0).
The relation of variation to charge is as follows:
Theorem 33
For any charge ν the function | ν | is
a σ-additive measure.
Finally to prove the Thm. 30 we use the
following
Proposition 34
For any charge ν the function | ν |−ν is
a σ-additive measure as well.
From the Thm. 30 we can deduce
Corollary 35
The collection of all charges on a σ-algebra
R is a linear space which is complete with respect to
the distance:
\[ d(\nu_1, \nu_2) = \sup_{A \in R} \bigl| \nu_1(A) - \nu_2(A) \bigr|. \]
The following result is also important:
Theorem 36 (Hahn Decomposition)
Let ν be a charge. There exist A, B∈ L, called
a Hahn decomposition of (X,ν), with A∩ B=∅,
A∪ B= X and such that for any E∈ L,
\[ \nu(A \cap E) \ge 0, \qquad \nu(B \cap E) \le 0. \]
This need not be unique.
Proof.[Sketch of proof] We only sketch this. We say that A∈ L is positive if
\[ \nu(E \cap A) \ge 0 \quad \text{for all } E \in L, \]
and similarly define what it means for a measurable set to be
negative.
Suppose that ν never takes the value −∞ (the other case follows
by considering the charge −ν).
Let β = infν(B0) where we take the infimum over all negative
sets B0. If β=−∞ then for each n, we can find a negative Bn
with ν(Bn)≤ −n. But then B=∪n Bn would be negative with
ν(B)≤ −n for any n, so that ν(B)=−∞ a contradiction.
So β>−∞ and so for each n we can find a negative Bn
with ν(Bn) < β+1/n.
Then we can show that B = ∪n Bn is negative, and argue that
ν(B) ≤ β. As B is negative, actually ν(B) = β.
There then follows a very tedious argument, by contradiction, to show that
A=X∖ B is a positive set. Then (A,B) is the required decomposition.
□
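For a charge on a finite set, given by point masses, a Hahn decomposition can be written down directly. The sketch below assumes hypothetical integer weights and illustrative function names; it also shows why the decomposition need not be unique, since a point of zero weight may be placed in either set.

```python
from itertools import combinations

weights = {"a": 2, "b": -3, "c": 0, "d": -1, "e": 5}   # hypothetical point masses

def nu(E):
    """Charge of a subset E of the finite set X = set(weights)."""
    return sum(weights[x] for x in E)

def hahn_decomposition():
    """Return (A, B): points of non-negative weight and points of negative weight."""
    A = frozenset(x for x, w in weights.items() if w >= 0)   # positive set
    B = frozenset(x for x, w in weights.items() if w < 0)    # negative set
    return A, B

def variation(E):
    """Variation |nu|(E); on a finite set the supremum is attained by splitting E into points."""
    return sum(abs(weights[x]) for x in E)

def subsets(S):
    """All subsets of the finite set S."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]
```

In this example | ν |(X)=ν(A)−ν(B), in agreement with the representation ν=µ1−µ2 from Thm. 30 with µ1 and µ2 concentrated on A and B respectively.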
12.4 Constructing Measures, Products
Consider the semiring S of intervals [a,b). There is a simple
description of all measures on it. For a measure µ define
\[ F_\mu(t) = \begin{cases} \mu([0,t)) & \text{if } t>0, \\ 0 & \text{if } t=0, \\ -\mu([t,0)) & \text{if } t<0. \end{cases} \qquad (69) \]
Fµ is monotonic and any monotonic function F defines a
measure µ on S by µ([a,b))=F(b)−F(a). The
correspondence is one-to-one with the additional assumption
F(0)=0.
Theorem 37
The above measure µ is σ-additive on S if and
only if F is continuous from the left: F(t−0)=F(t) for all
t∈ℝ.
Proof.
Necessity: F(t)−F(t−0)=lim_{ε→0} µ([t−ε,t))=µ(lim_{ε→0} [t−ε,t))=µ(∅)=0, by the continuity of a
σ-additive measure, see the fourth property in Prop. 12.
For sufficiency assume [a,b)=⊔k [ak,bk). The inequality
µ([a,b))≥ ∑k µ([ak,bk)) follows from additivity and
monotonicity. For the opposite inequality take δ>0 and δk>0
such that F(b)−F(b−δ)<ε and
F(ak)−F(ak−δk)<ε/2^k (use the left continuity of
F). Then the interval [a,b−δ] is covered by the intervals
(ak−δk,bk); due to the compactness of [a,b−δ] there
is a finite subcovering. Thus
µ([a,b−δ))≤∑_{j=1}^N
µ([akj−δkj,bkj)) and µ([a,b))≤∑_{j=1}^N
µ([akj,bkj))+2ε.
□
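The correspondence between monotonic functions F and measures on intervals, see (69), can be sketched as follows. The names measure_from and distribution are illustrative; the choice of the ceiling function is an assumption made here because it is continuous from the left and vanishes at 0.

```python
import math
from fractions import Fraction

def measure_from(F):
    """Return mu with mu([a, b)) = F(b) - F(a) for a non-decreasing F."""
    return lambda a, b: F(b) - F(a)

length = measure_from(lambda t: t)   # F(t) = t
count = measure_from(math.ceil)      # F(t) = ceil(t), continuous from the left

def distribution(mu):
    """Recover F_mu from mu as in (69)."""
    def F(t):
        if t > 0:
            return mu(0, t)
        if t < 0:
            return -mu(t, 0)
        return 0
    return F
```

With this choice count(a, b) is the number of integers in [a,b), and distribution(count) returns the ceiling function again, which illustrates that the correspondence is one-to-one once F(0)=0.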
Exercise 38
- Give an example of a function discontinuous from the left at 1 and
show that the resulting measure is additive but not σ-additive.
- Check that, if a function F is continuous at a point a,
then µ({a})=0.
Example 39
- Take F(t)=t, then the corresponding measure is the Lebesgue
measure on ℝ.
- Take F(t)=⌈t⌉, the smallest integer not less than t (a version of the integer part which is continuous from the left), then µ
counts the number of integers within the set.
- Define the Cantor function as follows: α(x)=1/2 on
(1/3,2/3); α(x)=1/4 on (1/9,2/9);
α(x)=3/4 on (7/9,8/9), and so forth. This function is
monotonic and can be continued to [0,1] by continuity; it is
known as the Cantor ladder. The resulting measure
has the following properties:
-
The measure of the entire interval is 1.
- Measure of every point is zero.
- The measure of the Cantor set is 1, while its Lebesgue measure is
0.
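The values of the Cantor function listed above can be reproduced from the ternary digits of x. This is a sketch under the assumption that x is given exactly (for example as a Fraction) in [0,1]; the name cantor and the truncation parameter depth are illustrative.

```python
from fractions import Fraction

def cantor(x, depth=64):
    """Approximate the Cantor function on [0, 1] from the ternary digits of x."""
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    value, step = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:          # x lies in the closure of a removed middle third
            return value + step
        if digit == 2:
            value += step
        step /= 2
    return value
```

Since the function is constant on each removed interval, the measure µ([a,b))=α(b)−α(a) of every such interval is zero, which is consistent with the whole unit mass sitting on the Cantor set.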
Another way to build measures is to take their product. In particular,
this allows us to extend various measures defined
through (69) on the real line to ℝn.
Definition 40
Let X and Y be spaces, and let S and T be semirings
on X and Y respectively. Then S×T is the semiring
consisting of { A×B : A∈ S, B∈ T } (“generalised
rectangles”). Let µ and ν be measures on S and
T respectively. Define the product measure µ×ν
on S×T by the rule (µ×ν)(A×B)=µ(A) ν(B).
Example 41
The measure µ(V)=(b−a)(d−c) from Example 9 on the semiring of half-open rectangles is the product of two
copies of the pre-Lebesgue measure µ([a,b))=b−a from Example 9 on the semiring of half-open intervals.
13 Integration
We now come to the main use of measure theory: to define a general theory
of integration.
13.1 Measurable functions
From now on, by a measure space we shall mean a triple
(X,L,µ), where X is a set, L is a
σ-algebra on X, and µ is a σ-additive
measure defined on L. We say that the members of
L are measurable, or
L-measurable, if necessary to avoid confusion.
Definition 1
A function f: X→ℝ is measurable if
\[ E_c(f) = \{ x \in X : f(x) < c \}, \quad \text{that is } E_c(f) = f^{-1}((-\infty, c)), \]
is in L (that is Ec(f) is a measurable set) for any c∈ℝ. A complex-valued function is measurable if its real and imaginary
parts are measurable.
Lemma 2
The following are equivalent:
-
A function f is measurable;
-
For any a<b the set f−1((a,b)) is measurable;
-
For any open set U⊂ ℝ the set f−1(U) is
measurable.
Proof.
To show 1 ⇒ 2 we note that
\[ f^{-1}((a,b)) = E_b(f) \setminus \Bigl( \bigcap_n E_{a+1/n}(f) \Bigr). \]
For 2 ⇒ 3 use that any open set U⊂ ℝ is a union of a countable
set of intervals (a,b), cf. the proof of Cor. 25.
The final implication 3 ⇒ 1 directly follows from the openness of (−∞,a).
□
Corollary 3
Let f: X → ℝ be measurable and g: ℝ → ℝ be continuous, then the
composition g(f(x)) is measurable.
Proof.
The preimage of the open set (−∞,c) under a continuous g is an open set, say U. The preimage of
U under f is measurable by Lem. 2. Thus, the preimage of (−∞,c) under the composition
g∘f is measurable, therefore g∘f is a measurable function.
□
Theorem 4
Let f,g:X→ℝ be measurable. Then af
(a∈ℝ), f+g, fg, max(f,g) and
min(f,g) are all measurable. That is measurable functions form
an algebra and this algebra is closed under convergence a.e.
Proof.
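Use Cor. 3 to show measurability of λf, | f | and f². The measurability of a sum
f1 + f2 follows from the relation
\[ E_c(f_1 + f_2) = \bigcup_{r \in \mathbb{Q}} \bigl( E_r(f_1) \cap E_{c-r}(f_2) \bigr). \]
Next use the following identities:
\[ f_1 f_2 = \frac{(f_1 + f_2)^2 - (f_1 - f_2)^2}{4}, \qquad \max(f_1, f_2) = \frac{(f_1 + f_2) + | f_1 - f_2 |}{2}. \]
If (fn) is a non-increasing sequence of measurable functions
converging to f, then Ec(f)=∪n Ec(fn).
Moreover any limit can be replaced by two monotonic limits:
\[ \lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \lim_{k\to\infty} \max \bigl( f_n(x), f_{n+1}(x), \ldots, f_{n+k}(x) \bigr). \qquad (70) \]
Finally if f1 is measurable and f2=f1 almost everywhere,
then f2 is measurable as well.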
□
We can define several types of convergence for measurable functions.
Definition 5
We say that a sequence (fn) of functions converges
- uniformly to f (notated fn⇉ f) if
\[ \sup_{x \in X} | f_n(x) - f(x) | \to 0; \]
- almost everywhere to f (notated fn→a.e. f) if
\[ f_n(x) \to f(x) \quad \text{for all } x \in X \setminus A, \quad \mu(A) = 0; \]
- in measure µ to f (notated fn→µ f) if for all ε>0
\[ \mu( \{ x \in X : | f_n(x) - f(x) | > \varepsilon \} ) \to 0. \qquad (71) \]
Clearly uniform convergence implies both convergence a.e. and convergence in
measure.
Theorem 6
On finite measures convergence a.e. implies convergence in measure.
Proof.
Define An(ε)={x∈ X: | fn(x)−f(x) |≥ ε}. Let Bn(ε)=∪_{k≥ n} Ak(ε). Clearly
Bn(ε)⊃ Bn+1(ε); let B(ε)=∩_{n=1}^∞ Bn(ε). If x∈ B(ε)
then fn(x)↛ f(x). Thus µ(B(ε))=0,
but µ(B(ε))=lim_{n→∞} µ(Bn(ε)), cf. (67). Since
An(ε)⊂ Bn(ε) we see that µ(An(ε))→ 0 as required for (71).
□
Note, that the construction of sets Bn(ε) is just
another implementation of the “two monotonic limits”
trick (70) for sets.
Exercise 7
Present examples of sequences (fn) and functions f such that:
- fn→µ f but not fn→a.e. f.
- fn→a.e. f but not fn⇉ f.
However we can slightly “fix” either the set or the sequence to
“upgrade” the convergence as shown in the following two theorems.
Theorem 8 (Egorov)
If fn→a.e. f on a finite measure
set X then for any σ>0 there is Eσ⊂ X
with µ(Eσ)<σ and fn⇉ f on X∖ Eσ.
Proof.
We use An(ε) and Bn(ε) from the proof
of Thm. 6. Observe that | f(x)−fk(x) |< ε uniformly for all
x ∈ X∖ Bn(ε) and k>n.
For every ε>0 we have seen
that µ(Bn(ε))→ 0, thus for each k there is
N(k) such that µ(B_{N(k)}(1/k))<σ/2^k. Put
Eσ=∪k B_{N(k)}(1/k).
□
Theorem 9
If fn→ µf then there is a subsequence
(nk) such that fnk→
a.e.f for k→ ∞.
Proof.
In the notations of the two previous proofs: for every natural k
take nk such that µ(A_{nk}(1/k))< 1/2^k, which is possible since µ(An(ε))→ 0. Define
Cm=∪_{k=m}^∞ A_{nk}(1/k) and C=∩m Cm. Then
µ(Cm)≤ ∑_{k=m}^∞ 1/2^k = 1/2^{m−1} and, thus, µ(C)=0 by (67). If x∉ C
then there is N such that x∉ A_{nk}(1/k) for all k>N. That means that
| f_{nk}(x)−f(x) |<1/k for all such k, i.e. f_{nk}(x)→ f(x).
Thus, we have point-wise convergence everywhere except on the zero-measure set C.
□
It is worth noting that we can use the last two theorems in succession
and upgrade convergence in measure to uniform convergence of a
subsequence on a subset.
Exercise 10
For your counterexamples from Exercise 7, find
- a subsequence f_{nk} of the sequence
from the first item which converges to f a.e.;
- a subset on which the sequence
from the second item converges uniformly.
Exercise 11
Read about Luzin’s C-property.
13.2 Lebesgue Integral
First we define a sort of “basis” for the space of integrable functions.
Definition 12
For A⊆ X, we define χA to be the indicator function of A, by
\[ \chi_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases} \]
Then, if χA is measurable, then χA−1( (1/2,3/2) ) = A
∈ L; conversely, if A∈L, then
X∖ A∈L, and we see that for any
U⊆ℝ open, χA−1(U) is either
∅, A, X∖ A, or X, all of which are
in L. So χA is measurable if and only if
A∈L.
Definition 13
A measurable function f:X→ℝ is
simple if it attains only a countable number of values.
Lemma 14
A function f: X→ℝ is simple if and only if
\[ f = \sum_{k=1}^{\infty} t_k \chi_{A_k} \qquad (72) \]
for some (tk)_{k=1}^∞⊆ℝ and Ak∈ L. That is, simple functions are
linear combinations of indicator functions of measurable sets. Moreover, in the above representation the sets Ak can be
pair-wise disjoint and all tk≠ 0 pair-wise different. In this case
the representation is unique.
Notice that it is now obvious that
Corollary 15
The collection of simple functions forms a vector space: this wasn’t
clear from the original definition.
Definition 16
A simple function in the form (72) with disjoint
Ak is called summable
if the following series converges:
\[ \sum_{k=1}^{\infty} | t_k | \, \mu(A_k) \quad \text{if } f \text{ has the above unique representation } f = \sum_{k=1}^{\infty} t_k \chi_{A_k}. \qquad (73) \]
It is another combinatorial exercise to show that this definition is
independent of the way we write f.
Definition 17
We define the integral of a simple function
f=∑k tk χAk (72) over a measurable set A by setting
\[ \int_A f \, d\mu = \sum_{k=1}^{\infty} t_k \, \mu(A_k \cap A). \]
Clearly the series converges for any simple summable function
f. Moreover
Lemma 18
The value of integral of a simple summable function is independent
from its representation by the sum of indicators (72). In particular, we can evaluate the integral taking the canonical representation over pair-wise
disjoint sets having pair-wise different values.
Proof.
This is another slightly tedious combinatorial exercise.
You need to prove that the integral of a simple function is
well-defined, in the sense that it is independent of the way we
choose to write the simple function.
□
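The independence of the integral from the representation can be checked on a small example. The sketch below is restricted, as an assumption, to finite sums of indicators of half-open intervals with µ being the length; a function is stored as a list of pairs (t, (a, b)), and the names are illustrative.

```python
from fractions import Fraction

def length(interval):
    """Length of the half-open interval [a, b); zero if the interval is empty."""
    a, b = interval
    return max(b - a, 0)

def integral(terms):
    """Integral over the real line of f = sum_k t_k * indicator([a_k, b_k)),
    computed as sum_k t_k * mu([a_k, b_k)) with mu the length."""
    return sum(t * length(i) for t, i in terms)

def value(terms, x):
    """Evaluate the simple function at the point x."""
    return sum(t for t, (a, b) in terms if a <= x < b)

overlapping = [(1, (0, 2)), (2, (1, 3))]             # non-disjoint representation
canonical = [(1, (0, 1)), (3, (1, 2)), (2, (2, 3))]  # disjoint sets, distinct values
```

Both lists describe the same function, and integral gives the same number for each of them.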
Exercise 19
Let f be the function on [0,1] which takes the value 1 at
all rational points and 0 everywhere else. Find the value of
the Lebesgue integral ∫[0,1] f dµ with respect to the
Lebesgue measure on [0,1]. Show that the Riemann upper
and lower sums for f converge to different values, so f is
not Riemann-integrable.
We will denote by S(X) the
collection of all simple summable functions on X.
Proposition 21
Let f, g: X→ ℝ be in S(X) (that is, simple summable), let a, b∈ ℝ and let A
be a measurable set. Then:
- ∫A (af+bg) dµ = a∫A f dµ + b∫A g dµ,
that is S(X) is a linear space;
- The correspondence f→ ∫A f dµ is a linear
functional on S(X);
- The correspondence A → ∫A f dµ is a charge;
- If f≤ g then ∫X f dµ ≤ ∫X g dµ,
that is the integral is monotonic;
- The function
\[ d_1(f,g) = \int_X | f(x) - g(x) | \, d\mu(x) \qquad (74) \]
has all properties of a metric (distance) on S(X), possibly
except separation, but see the next item.
- For f≥ 0 we have ∫X f dµ=0 if and only if
µ( { x∈ X : f(x)≠0 } ) = 0. Therefore for the function d1 (74):
\[ d_1(f,g) = 0 \quad \text{if and only if} \quad f = g \text{ a.e.} \]
- The integral is uniformly continuous with respect to the above metric d1 (74):
\[ \Bigl| \int_A f(x) \, d\mu(x) - \int_A g(x) \, d\mu(x) \Bigr| \le d_1(f,g). \]
Proof.
The proof is almost obvious; for example, Property 1 easily follows from Lem. 18.
We will outline Property 3 only. Let f be an indicator
function of a set B, then A→ ∫A
f dµ=µ(A∩ B) is a σ-additive measure (and
thus a charge). By Cor. 35 the
same is true for finite linear combinations of indicator functions
and their limits in the sense of the distance d1.
□
We can identify functions which have the same values a.e. Then
S(X) becomes a metric space with the distance
d1 (74). The space may be incomplete and we
may wish to look for its completion. However, if we simply try
to assign a limiting point to every Cauchy sequence in
S(X), then the resulting space becomes so huge that it
will be impossible to realise it as a space of functions on X.
Exercise 22
Use ideas of Ex. 1 to present a sequence of simple functions which has the Cauchy property in metric d1 (74) but does not have point-wise limits anywhere.
To reduce the number of Cauchy sequences in S(X) eligible
to have a limit, we shall impose an additional condition. A convenient
reduction to functions on X appears if we ask for both convergence
in the d1 metric and point-wise convergence on X a.e.
Definition 23
A function f is summable by a measure µ if there is a sequence
(fn)⊂ S(X) such that
- the sequence (fn) is a Cauchy sequence in
S(X);
- fn→a.e. f.
Clearly, if a function is summable, then any equivalent function is
summable as well. The set of equivalence classes of summable functions will be denoted by
L1(X).
Lemma 24
If the measure µ is finite then any bounded measurable
function is summable.
Proof.
Define Ekn(f)={x∈ X: k/n≤ f(x)< (k+1)/n} and
fn=∑k (k/n) χ_{Ekn} (note that the sum is
finite due to the boundedness of f).
Since | fn(x)−f(x) |<1/n we have uniform convergence
(thus convergence a.e.) and (fn) is a Cauchy sequence:
d1(fn,fm)=∫X | fn−fm | dµ≤
(1/n+1/m)µ(X).
□
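The approximating functions fn from the proof are obtained by rounding the values of f down to the nearest multiple of 1/n. A minimal sketch, assuming f returns exact rational values so that the comparison is not affected by rounding errors (the name simple_approximation is illustrative):

```python
import math
from fractions import Fraction

def simple_approximation(f, n):
    """Return f_n, which equals k/n on E_kn = {x : k/n <= f(x) < (k+1)/n}."""
    def f_n(x):
        k = math.floor(n * f(x))   # the unique integer k with k/n <= f(x) < (k+1)/n
        return Fraction(k, n)
    return f_n
```

For a bounded f only finitely many values k/n occur, so each fn is a simple function, and 0≤ f(x)−fn(x)<1/n at every point.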
Another simple result, which is useful on many occasions is as
follows.
Lemma 26
If the measure µ is finite and fn⇉ f then d1(fn,f)→ 0.
Corollary 27
For a convergent sequence fn→a.e. f, which admits the uniform bound | fn(x) |<M for all
n and x, we have d1(fn,f)→ 0.
Proof.
For any ε>0, by Egorov’s theorem 8 we can find E, such that
- µ(E)< ε/(4M); and
- from the uniform convergence on X∖ E there exists
N such that for any n>N we have
| f(x)−fn(x) |<ε/(2µ(X)).
Since | fn(x)−f(x) |≤ 2M almost everywhere, combining this we find that for
n>N, d1(fn,f)≤ 2M · ε/(4M) + µ(X) · ε/(2µ(X)) = ε.
□
Exercise 28
Convergence in the metric d1 and a.e. do not imply each other:
-
Give an example of fn→a.e.
f such that d1(fn ,f)↛0.
- Give an example of the sequence (fn) and function f
in L1(X)
such that d1(fn ,f)→ 0 but fn does not
converge to f a.e.
To build integral we need the following
Lemma 29
Let (fn) and (gn) be two Cauchy sequences in
S(X) with the same limit a.e., then d1(fn,gn)→ 0.
Proof.
Let φn=fn−gn, then this is a Cauchy sequence with zero limit a.e. Assume the opposite to the statement: there exist δ>0 and a sequence (nk) such that ∫X | φnk | d µ>δ. Rescaling and renumbering we can obtain ∫X | φn | d µ>1.
Take a quickly convergent subsequence using the Cauchy property: renumbering again, assume d1(φk,φk+1)≤ 1/2^{k+2}.
Since φ1 is simple, take the canonical presentation φ1=∑k tk χAk, then ∑k | tk | µ(Ak)=∫X | φ1 | d µ≥ 1. Thus, there exists N, such that ∑_{k=1}^{N} | tk | µ(Ak)≥ 3/4. Put A=⊔_{k=1}^{N} Ak and C=max_{1≤ k ≤ N} | tk |=max_{x∈ A} | φ1(x) |.
By Egorov’s Theorem 8 there is E⊂ A such that µ(E)<1/(4C) and φn⇉ 0 on B=A∖ E. Then
\[ \int_B |\varphi_1| \, d\mu = \int_A |\varphi_1| \, d\mu - \int_E |\varphi_1| \, d\mu \geq \frac{3}{4} - \frac{1}{4C}\cdot C = \frac{1}{2}. \]
By the triangle inequality for d1:
\[ \left| \int_B |\varphi_n| \, d\mu - \int_B |\varphi_{n+1}| \, d\mu \right| \leq d_1(\varphi_n,\varphi_{n+1}) \leq \frac{1}{2^{n+2}}, \]
we get
\[ \int_B |\varphi_n| \, d\mu \geq \int_B |\varphi_1| \, d\mu - \sum_{k=1}^{n-1} \left| \int_B |\varphi_k| \, d\mu - \int_B |\varphi_{k+1}| \, d\mu \right| \geq \frac{1}{2} - \sum_{k=1}^{n-1} \frac{1}{2^{k+2}} > \frac{1}{4}. \]
But this contradicts the fact ∫B | φn | d µ → 0, which follows from the uniform convergence φn⇉ 0 on B.
□
It follows from the Lemma that we can use any Cauchy sequence of simple functions for the extension of integral.
Corollary 30
The functional IA(f)=∫A f(x) d µ(x), defined for any A∈ L on the space of simple functions S(X), can be extended by continuity to a functional on L1(X,µ).
Definition 31
For an arbitrary summable f∈L1(X), we define the Lebesgue integral
\[ \int_A f \, d\mu = \lim_{n\to\infty} \int_A f_n \, d\mu, \]
where the Cauchy sequence fn of summable simple functions converges to f a.e.
Theorem 32
-
L1(X) is a linear space.
- For any measurable set A⊂ X the correspondence f↦
∫A f d µ is a linear functional on L1(X).
- For any f∈L1(X) the value ν(A)=∫A f
d µ is a charge.
- d1(f,g)=∫X | f−g | d µ is a distance on
L1(X).
Proof.
The proof follows from Prop. 21 and the continuity of the extension.
□
Summing up: we build L1(X) as a completion of
S(X) with respect to the distance d1 such that
elements of L1(X) are associated with (equivalence
classes of) measurable functions on X.
13.3 Properties of the Lebesgue Integral
The space L1 was defined from dual convergence—in
d1 metric and point-wise a.e. Can we get the continuity of the integral from
the convergence almost everywhere alone? No, in general. However,
we will state now some results on continuity of the integral under
convergence a.e. with some additional assumptions. Finally, we show
that L1(X) is closed in d1 metric.
Theorem 33 (Lebesgue on dominated convergence)
Let (fn) be a sequence of µ-summable functions on X, and let there be φ∈L1(X) such that | fn(x) |≤ φ(x) for all x∈X, n∈ℕ. If fn→a.e. f, then f∈L1(X) and for any measurable A:
\[ \lim_{n\to\infty} \int_A f_n \, d\mu = \int_A f \, d\mu. \]
Proof.
For any measurable A the expression ν(A)=∫A φ d µ defines a finite measure on X due to the non-negativity of φ and Thm. 32.
Lemma 34 (Change of variable)
If g is measurable and bounded then f=φg is µ-summable and for any µ-measurable set A we have
\[ \int_A f \, d\mu = \int_A g \, d\nu. \tag{75} \]
Proof.[Proof of the Lemma]
Let M be the set of all g such that the Lemma is true. M includes any indicator function g=χB of a measurable B:
\[ \int_A f \, d\mu = \int_A \varphi\chi_B \, d\mu = \int_{A\cap B} \varphi \, d\mu = \nu(A\cap B) = \int_A \chi_B \, d\nu = \int_A g \, d\nu. \]
Thus M contains also finite linear combinations of indicators. For any n∈ℕ and a bounded g, the two functions g−(x)=(1/n)[ng(x)] (with [·] the integer part) and g+(x)=g−(x)+1/n are finite linear combinations of indicators and are in M. Since g−(x)≤ g(x)≤ g+(x) we have
\[ \int_A g_- \, d\nu = \int_A \varphi g_- \, d\mu \leq \int_A \varphi g \, d\mu \leq \int_A \varphi g_+ \, d\mu = \int_A g_+ \, d\nu. \]
By the squeeze rule for n→ ∞ the middle term tends to ∫A g d ν, that is g∈M.
Note that formula (75) is a change of variable in the Lebesgue integral of the type: ∫f(sin x) cos x d x = ∫f(sin x) d (sin x).
□
For the proof of the theorem define:
\[ g_n(x) = \begin{cases} f_n(x)/\varphi(x), & \text{if } \varphi(x)\neq 0,\\ 0, & \text{if } \varphi(x)=0, \end{cases} \qquad g(x) = \begin{cases} f(x)/\varphi(x), & \text{if } \varphi(x)\neq 0,\\ 0, & \text{if } \varphi(x)=0. \end{cases} \]
Then gn is bounded by 1 and gn→a.e. g. To show the theorem it will be enough to show limn→ ∞∫A gn d ν=∫A g d ν. For uniformly bounded functions on a set of finite measure this can be derived from Egorov’s Thm. 8, see an example of this in the proof of Lemma 29.
□
Note, that in the above proof summability of φ was used to obtain the
finiteness of the measure ν, which is required for Egorov’s
Thm. 8.
Exercise 35
Give an example of fn→a.e. f such that ∫X fn d µ ≠ ∫X f d µ. For such an example, try to find a function φ such that | fn | ≤ φ for all n and check whether φ is summable.
Exercise 36 (Chebyshev’s inequality)
Show that: if f is non-negative and summable, then for any c>0
\[ \mu\{x\in X : f(x)>c\} < \frac{1}{c} \int_X f \, d\mu. \tag{76} \]
Theorem 37 (B. Levi’s, on monotone convergence)
Let (fn) be a monotonically increasing sequence of µ-summable functions on X. Define f(x)=limn→∞ fn(x) (allowing the value +∞).
- If all integrals ∫X fn d µ are bounded by the same value C<∞ then f is summable and ∫X f d µ=limn→∞∫X fn d µ.
- If limn→∞∫X fn d µ=+∞ then the function f is not summable.
Proof.
Replacing fn by fn−f1 and f by f−f1 we can assume fn≥ 0 and f≥ 0. Let E be the set where f is infinite, then E=⋂N⋃n ENn, where
ENn={x∈ X: fn(x)≥ N}.
By Chebyshev’s inequality (76) we have
\[ N\mu(E_{Nn}) \leq \int_{E_{Nn}} f_n \, d\mu \leq \int_X f_n \, d\mu \leq C, \]
then µ(ENn)≤ C/N. Thus µ(E)=limN→∞ limn→∞ µ(ENn)=0.
Thus f is finite a.e.
Lemma 38
Let f be a measurable non-negative function attaining only finite values. f is summable if and only if sup∫A f d µ<∞, where the supremum is taken over all finite-measure sets A such that f is bounded on A.
Proof.[Proof of the Lemma]
Necessity: if f is summable then for any set A⊂X we have ∫A f d µ≤ ∫X f d µ<∞, thus the supremum is finite.
Sufficiency: let sup∫A f d µ=M<∞, define B={x∈ X: f(x)=0} and Ak={x∈ X: 2^k≤ f(x)<2^{k+1}} for k∈ℤ; by (76) we have µ(Ak)<M/2^k and X=B⊔(⊔_{k∈ℤ} Ak). Define
\[ g(x) = \sum_{k\in\mathbb{Z}} 2^k \chi_{A_k}(x), \qquad f_n(x) = \begin{cases} f(x), & \text{if } x\in \bigsqcup_{k=-n}^{n} A_k,\\ 0, & \text{otherwise.} \end{cases} \]
Then g(x)≤ f(x) < 2g(x). Function g is a simple function, its summability follows from the estimate ∫_{⊔_{−n}^{n} Ak} g d µ≤∫_{⊔_{−n}^{n} Ak} f d µ≤ M which is valid for any n; taking n→ ∞ we get summability of g. Furthermore, fn →a.e. f and fn(x)≤ f(x) <2g(x), so we use the Lebesgue Thm. 33 on dominated convergence to obtain the conclusion.
□
Let A be a finite measure set such that f is bounded on A, then
\[ \int_A f \, d\mu = \lim_{n\to\infty} \int_A f_n \, d\mu \leq \lim_{n\to\infty} \int_X f_n \, d\mu \leq C. \]
This shows summability of f by the previous Lemma. The rest of the statement and (the contrapositive of) the second part follow from the Lebesgue Thm. 33 on dominated convergence.
□
Now we can extend this result dropping the monotonicity assumption.
Lemma 39 (Fatou)
If a sequence (fn) of µ-summable non-negative functions is such that:
- ∫X fn d µ≤ C for all n;
- fn →a.e. f,
then f is µ-summable and ∫X f d µ≤ C.
Proof. Let us replace the limit fn→ f by two monotonic limits. Define:
\[ g_{kn}(x) = \min(f_n(x),\ldots,f_{n+k}(x)), \qquad g_n(x) = \lim_{k\to\infty} g_{kn}(x). \]
Then gn is a non-decreasing sequence of functions and limn→ ∞ gn(x)=f(x) a.e. Since gn≤ fn, from monotonicity of the integral we get ∫X gn d µ≤ C for all n. Then Levi’s Thm. 37 implies that f is summable and ∫X f d µ≤ C.
□
Exercise 41
Give an example such that under the Fatou’s lemma condition we get
limn→∞∫X fn d µ ≠ ∫X f d µ.
Now we can show that L1(X) is complete:
Theorem 42
L1(X) is a Banach space.
Proof.
It is clear that the distance function d1 indeed defines a norm ||f||1=d1(f,0). We only need to demonstrate the completeness. We again utilise the three-step procedure from Rem. 7.
Take a Cauchy sequence (fn) and, passing to a subsequence if necessary, assume that it is quickly convergent, that is d1(fn,fn+1)≤ 1/2^n. Put φ1=f1 and φn=fn−fn−1 for n>1. Then fn=∑_{k=1}^{n} φk.
The sequence ψn(x)=∑_{k=1}^{n} | φk(x) | is monotonic, and the integrals ∫X ψn d µ are bounded by the same constant ||f1||1+1. Thus, by B. Levi’s Thm. 37 and its proof, ψn→ ψ for a summable function ψ which is finite a.e. Therefore, the series ∑φk(x) converges as well, for almost every x, to a value f(x) of a function f. But this means that fn →a.e. f (the first step is completed).
We also notice | fn(x) |≤| ψ(x) |. Thus by the Lebesgue Thm. 33 on dominated convergence f∈ L1(X) (the second step is completed).
Furthermore,
\[ 0 \leq \lim_{n\to\infty} \int_X |f_n - f| \, d\mu \leq \lim_{n\to\infty} \sum_{k=n+1}^{\infty} \|\varphi_k\|_1 = 0. \]
That is, fn→ f in the norm of L1(X). (That completes the third step and the whole proof.)
□
The next important property of the Lebesgue integral is its
absolute continuity.
Theorem 43 (Absolute continuity of Lebesgue integral)
Let f∈L1(X). Then for any ε>0 there is a δ>0 such that | ∫A f d µ |<ε if µ(A)<δ.
Proof.
If f is essentially bounded by M, then it is enough to set δ=ε/M. In general let:
\[ A_n = \{x\in X : n \leq |f(x)| < n+1\}, \qquad B_n = \bigsqcup_{k=0}^{n} A_k, \qquad C_n = X\setminus B_n. \]
Then ∫X | f | d µ=∑_{k=0}^{∞} ∫Ak | f | d µ, thus there is an N such that ∑_{k=N+1}^{∞} ∫Ak | f | d µ=∫CN | f | d µ<ε/2. Now put δ=ε/(2N+2), then for any A⊂X with µ(A)<δ:
\[ \left| \int_A f \, d\mu \right| \leq \int_A |f| \, d\mu = \int_{A\cap B_N} |f| \, d\mu + \int_{A\cap C_N} |f| \, d\mu < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
□
13.4 Integration on Product Measures
There is a well-known geometrical interpretation of an integral in calculus as the “area under the graph”. If we advance from “area” to a “measure” then the Lebesgue integral can be treated as a theory of measures of very special shapes created by graphs of functions. These shapes belong to the product spaces of the function domain and its range. We introduced product measures in Defn. 40, now we will study them in some detail using the Lebesgue integral. We start from the following
Theorem 44
Let X and Y be spaces, and let S and T be semirings on X and Y respectively and µ and ν be measures on S and T respectively. If µ and ν are σ-additive, then the product measure µ×ν from Defn. 40 is σ-additive as well.
Proof.
For any C=A×B∈S×T let us define fC(x)=χA(x)ν(B). Then
\[ (\mu\times\nu)(C) = \mu(A)\nu(B) = \int_X f_C \, d\mu. \]
If the same set C has a representation C=⊔k Ck for Ck∈S×T, then σ-additivity of ν implies fC=∑k fCk. By the Lebesgue theorem 33 on dominated convergence:
\[ \int_X f_C \, d\mu = \sum_k \int_X f_{C_k} \, d\mu. \]
Thus
\[ (\mu\times\nu)(C) = \sum_k (\mu\times\nu)(C_k). \]
□
The above correspondence C↦ fC can be extended to the ring R(S× T) generated by S× T by the formula:
\[ f_C = \sum_k f_{C_k}, \qquad \text{for } C=\bigsqcup_k C_k \in R(S\times T). \]
We have the uniform continuity of this correspondence:
\[ \| f_{C_1} - f_{C_2} \|_1 \leq (\mu\times\nu)(C_1 \triangle C_2) = d_1(C_1,C_2) \]
because from the representation C1=A1⊔ B and C2=A2⊔ B, where B=C1∩ C2, one can see that fC1−fC2=fA1−fA2 and fC1▵C2=fA1+fA2, together with | fA1−fA2 |≤ fA1+fA2 for non-negative functions.
Thus the map C↦ fC can be extended to a map from the σ-algebra L(X× Y) of µ×ν-measurable sets to L1(X) by the formula f_{limn Cn}=limn fCn.
Exercise 45
Describe topologies where two limits from the last formula are taken.
The following lemma provides the geometric interpretation of the
function fC as the size of the slice of the set C along
x=const.
Lemma 46
Let C∈L(X×Y). For almost every x∈X the set Cx={y∈Y: (x,y)∈C} is ν-measurable and ν(Cx)=fC(x).
Proof.
For sets from the ring R(S×T) it is true by the definition. If C(n) is a monotonic sequence of sets, then ν(limn Cx(n))=limn ν(Cx(n)) by σ-additivity of measures. Thus the property ν(Cx)=fC(x) is preserved by monotonic limits. The following result is of separate interest:
Lemma 47
Any measurable set can be obtained (up to a set of zero measure) from elementary sets by two monotonic limits.
Proof.[Proof of Lem. 47]
Let C be a measurable set, and choose Cn∈R(S×T) approximating C up to 2^{−n} in µ×ν. Let C′=∩_{n=1}^{∞}∪_{k=1}^{∞} Cn+k, then
\[ (\mu\times\nu)\Big( C\setminus \bigcup_{k=1}^{\infty} C_{n+k} \Big) = 0 \quad\text{and}\quad (\mu\times\nu)\Big( \bigcup_{k=1}^{\infty} C_{n+k}\setminus C \Big) \leq 2^{1-n}. \]
Then (µ×ν)(C′▵C)≤ 2^{1−n} for any n∈ℕ.
□
Coming back to Lem. 46 we notice that (in the above notations) fC=fC′ almost everywhere. Then:
\[ f_C(x) \overset{\text{a.e.}}{=} f_{C'}(x) = \nu(C'_x) = \nu(C_x). \]
□
The following theorem generalizes the meaning of the integral as “area under the graph”.
Theorem 48
Let µ and ν be σ-finite measures and C be a µ×ν-measurable subset of X×Y. We define Cx={y∈Y: (x,y)∈C}. Then for µ-almost every x∈X the set Cx is ν-measurable, the function fC(x)=ν(Cx) is µ-measurable and
\[ (\mu\times\nu)(C) = \int_X f_C(x) \, d\mu(x), \]
where both parts may have the value +∞.
Proof.
If C has a finite measure, then the statement is reduced to Lem. 46 and a passage to limit in (77).
If C has an infinite measure, then there exists a sequence of Cn⊂ C, such that ∪n Cn=C and (µ×ν)(Cn)→ ∞. Then fC(x)=limn fCn(x) and
\[ \int_X f_{C_n}(x) \, d\mu(x) = (\mu\times\nu)(C_n) \to +\infty. \]
Thus fC is measurable and non-summable.
□
This theorem justifies the well-known technique of calculating areas
(volumes) as integrals of lengths (areas) of the sections.
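As a small numerical illustration of the slicing technique, the sketch below takes C to be the unit disc in the plane with the Lebesgue measure on both axes (an illustrative choice, not an example from the notes). The slice Cx is the interval | y | ≤ √(1−x²), so ν(Cx)=2√(1−x²), and the integral of the slice lengths over [−1,1] should reproduce the area π. The integral is approximated by a plain midpoint rule, which is an assumption of this sketch and only a stand-in for the Lebesgue integral.

```python
import math

def slice_length(x):
    """nu(C_x) for the unit disc C: the slice is the interval |y| <= sqrt(1 - x^2)."""
    return 2.0 * math.sqrt(max(0.0, 1.0 - x * x))

def integrate_midpoint(g, a, b, steps):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

# Area of the disc as the integral of the lengths of its vertical sections.
area = integrate_midpoint(slice_length, -1.0, 1.0, 200_000)
```

The computed value of `area` agrees with π up to the discretisation error of the midpoint rule.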
Theorem 50 (Fubini)
Let f(x,y) be a summable function on the product of spaces (X,µ) and (Y,ν). Then:
- For µ-almost every x∈ X the function f(x,y) is summable on Y and fY(x)=∫Y f(x,y) d ν(y) is µ-summable on X.
- For ν-almost every y∈ Y the function f(x,y) is summable on X and fX(y)=∫X f(x,y) d µ(x) is ν-summable on Y.
- There are the identities:
\[ \int_{X\times Y} f(x,y) \, d(\mu\times\nu)(x,y) = \int_X \left( \int_Y f(x,y) \, d\nu(y) \right) d\mu(x) = \int_Y \left( \int_X f(x,y) \, d\mu(x) \right) d\nu(y). \tag{79} \]
- For non-negative functions the existence of any repeated integral in (79) implies summability of f on X× Y.
Proof.
From the decomposition f=f+−f− we can reduce our consideration to non-negative functions. Let us consider the product of three spaces (X,µ), (Y,ν), (ℝ,λ), with λ=dz being the Lebesgue measure on ℝ. Define
C={(x,y,z)∈ X× Y× ℝ: 0≤ z≤ f(x,y)}.
Using the relation (78) we get:
\[ C_{xy} = \{z\in\mathbb{R} : 0\leq z\leq f(x,y)\}, \qquad \lambda(C_{xy}) = f(x,y), \]
\[ C_x = \{(y,z)\in Y\times\mathbb{R} : 0\leq z\leq f(x,y)\}, \qquad (\nu\times\lambda)(C_x) = \int_Y f(x,y) \, d\nu(y). \]
The theorem follows from those relations.
□
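On finite sets with weighted counting measures every function is summable, and the identities (79) reduce to rearranging a finite double sum. The following sketch checks this bookkeeping exactly with rational arithmetic; the sets, the weights `mu`, `nu` and the function `f` are made up for illustration and carry no meaning beyond that.

```python
from fractions import Fraction

# Hypothetical finite measure spaces: mu({x}) and nu({y}) given as weights.
mu = {0: Fraction(1, 2), 1: Fraction(3, 2), 2: Fraction(2)}
nu = {"a": Fraction(1), "b": Fraction(1, 4)}

def f(x, y):
    """An arbitrary function on the product taking values of both signs."""
    return Fraction(x * x - 1) * (3 if y == "a" else -2)

# Integral over the product with respect to the product measure.
double_integral = sum(f(x, y) * mu[x] * nu[y] for x in mu for y in nu)

# Repeated integrals: first over Y then X, and first over X then Y.
x_outer = sum(sum(f(x, y) * nu[y] for y in nu) * mu[x] for x in mu)
y_outer = sum(sum(f(x, y) * mu[x] for x in mu) * nu[y] for y in nu)
```

All three values coincide exactly. The interesting content of the theorem lies in the infinite case, where summability of f is what licenses the rearrangement.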
Exercise 51
-
Show that the first three conclusions of the Fubini Theorem may
fail if f is not summable.
- Show that the fourth conclusion of the Fubini Theorem may
fail if f has values of different signs.
13.5 Absolute Continuity of Measures
Here, we consider another topic in the measure theory which benefits
from the integration theory.
Definition 52
Let X be a set with σ-algebra R and
σ-finite measure µ and finite charge ν on
R. The charge ν is absolutely continuous
with respect to µ if µ(A)=0 for A∈ R
implies ν(A)=0. Two charges ν1 and ν2 are
equivalent if two conditions
| ν1 |(A)=0 and | ν2 |(A)=0 are equivalent.
The above definition does not seem to justify the name “absolute
continuity”, but this will become clear from the following
important theorem.
Theorem 53 (Radon–Nikodym)
Any charge ν which is absolutely continuous with respect to a measure µ has the form
\[ \nu(A) = \int_A f \, d\mu, \]
where f is a function from L1. The function f∈L1 is uniquely defined by the charge ν.
Proof.[Sketch of the proof]
First we will assume that ν is a measure. Let D be the collection of measurable functions g:X→[0,∞) such that
\[ \int_E g \, d\mu \leq \nu(E) \qquad (E\in L). \]
Let α = sup_{g∈D} ∫X g d µ ≤ ν(X) < ∞. So we can find a sequence (gn) in D with ∫X gn d µ → α.
We define f0(x) = supn gn(x). We can show that
f0=∞ only on a set of µ-measure zero, so if we
adjust f0 on this set, we get a measurable function
f:X→[0,∞). There is now a long argument to show
that f is as required.
If ν is a charge, we can find f by applying the previous
operation to the measures ν+ and ν− (as it is easy to
verify that ν+,ν−⋘µ).
We show that f is essentially unique. If g is another
function inducing ν, then
\[ \int_E (f-g) \, d\mu = \nu(E) - \nu(E) = 0 \qquad (E\in L). \]
Let E = {x∈ X : f(x)−g(x)≥ 0}, so as f−g is
measurable, E∈L. Then ∫E (f−g) d µ =0 and
f−g≥0 on E, so by our result from integration theory, we
have that f−g=0 almost everywhere on E. Similarly, if F =
{x∈ X : f(x)−g(x)≤ 0}, then F∈L and
f−g=0 almost everywhere on F. As E∪ F=X, we conclude
that f=g almost everywhere.
□
Corollary 54
Let µ be a measure on X, ν be a finite charge, which
is absolutely continuous with respect to µ. For any
ε>0 there exists δ>0 such that
µ(A)<δ implies | ν |(A)<ε .
Proof.
By the Radon–Nikodym theorem there is a function f∈L1(X,µ) such that ν(A)=∫A f d µ. Then | ν |(A)=∫A | f | d µ and we get the statement from Theorem 43 on absolute continuity of the Lebesgue integral.
□
14 Functional Spaces
In this section we describe various Banach spaces of functions on
sets with measure.
14.1 Integrable Functions
Let (X,L,µ) be a measure space. For 1≤ p<∞, we define Lp(µ) to
be the space of measurable functions f:X→K
such that
\[ \int_X |f|^p \, d\mu < \infty. \]
We define ||·||p : Lp(µ)→[0,∞) by
\[ \|f\|_p = \left( \int_X |f|^p \, d\mu \right)^{1/p} \qquad (f\in L_p(\mu)). \]
Notice that if f=0 almost everywhere, then | f |^p=0 almost everywhere,
and so ||f||p=0. However, there can be non-zero functions such that f=0
almost everywhere. So ||·||p is not a norm on
Lp(µ).
Exercise 1
Find a measure space (X,µ) such that lp=Lp(µ), that is the space of sequences lp is a particular case of function spaces considered in this section. It also explains why the following proofs refer to Section 11 so often.
Lemma 2 (Integral Hölder inequality)
Let 1<p<∞, let q∈(1,∞) be such that 1/p + 1/q=1. For f∈Lp(µ) and g∈Lq(µ), we have that fg is summable, and
\[ \int_X |fg| \, d\mu \leq \|f\|_p \|g\|_q. \tag{80} \]
Proof.
Recall that we know from Lem. 2 that
\[ |ab| \leq \frac{|a|^p}{p} + \frac{|b|^q}{q} \qquad (a,b\in\mathbb{K}). \]
Now we follow the steps in the proof of Prop. 4. Define measurable functions a,b:X→K by setting
\[ a(x) = \frac{f(x)}{\|f\|_p}, \qquad b(x) = \frac{g(x)}{\|g\|_q} \qquad (x\in X). \]
So we have that
\[ |a(x)b(x)| \leq \frac{|f(x)|^p}{p\|f\|_p^p} + \frac{|g(x)|^q}{q\|g\|_q^q} \qquad (x\in X). \]
By integrating, we see that
\[ \int_X |ab| \, d\mu \leq \frac{1}{p\|f\|_p^p} \int_X |f|^p \, d\mu + \frac{1}{q\|g\|_q^q} \int_X |g|^q \, d\mu = \frac{1}{p} + \frac{1}{q} = 1. \]
Hence, by the definition of a and b,
\[ \int_X |fg| \, d\mu \leq \|f\|_p \|g\|_q, \]
as required.
□
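Inequality (80) can be sanity-checked on a finite measure space, where the integrals become weighted sums. The sketch below is only an illustration under stated assumptions: the point masses `weights`, the sample values of `f` and `g`, and the helper names are invented for the example, and a numerical check on random data is of course not a proof.

```python
import random

def lp_norm(values, weights, p):
    """(sum_i w_i |v_i|^p)^(1/p): the L_p norm for the measure mu({i}) = w_i."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

def integral_abs_product(f, g, weights):
    """Integral of |f g| with respect to the same weighted counting measure."""
    return sum(w * abs(a * b) for a, b, w in zip(f, g, weights))

random.seed(0)
p = 3.0
q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1

# Hypothetical data: 50 points with positive masses and arbitrary real values.
weights = [random.uniform(0.1, 2.0) for _ in range(50)]
f = [random.uniform(-5.0, 5.0) for _ in range(50)]
g = [random.uniform(-5.0, 5.0) for _ in range(50)]

lhs = integral_abs_product(f, g, weights)
rhs = lp_norm(f, weights, p) * lp_norm(g, weights, q)
```

Here `lhs` is the left-hand side of (80) and `rhs` the right-hand side; the former does not exceed the latter.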
Lemma 3
Let f,g∈Lp(µ) and let a∈K. Then:
- ||af||p = | a | ||f||p;
- || f+g ||p ≤ ||f||p + ||g||p.
In particular, Lp is a vector space.
Proof.
Part 1 is easy. For 2, we need a version of Minkowski’s Inequality, which will follow from the previous lemma. We essentially repeat the proof of Prop. 5.
Notice that the p=1 case is easy, so suppose that 1<p<∞. We have that
\[ \begin{aligned} \|f+g\|_p^p &= \int_X |f+g|^{p-1} |f+g| \, d\mu \\ &\leq \int_X |f+g|^{p-1} \left( |f| + |g| \right) d\mu \\ &= \int_X |f+g|^{p-1} |f| \, d\mu + \int_X |f+g|^{p-1} |g| \, d\mu. \end{aligned} \]
Applying the lemma, this is
\[ \leq \|f\|_p \left( \int_X |f+g|^{q(p-1)} \, d\mu \right)^{1/q} + \|g\|_p \left( \int_X |f+g|^{q(p-1)} \, d\mu \right)^{1/q}. \]
As q(p−1)=p, we see that
\[ \|f+g\|_p^p \leq \left( \|f\|_p + \|g\|_p \right) \|f+g\|_p^{p/q}. \]
As p−p/q = 1, we conclude that
\[ \|f+g\|_p \leq \|f\|_p + \|g\|_p, \]
as required.
In particular, if f,g∈ Lp(µ) then af+g∈ Lp(µ), showing that Lp(µ) is a vector space.
□
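The triangle inequality of Lemma 3 can be observed on the same kind of toy model: a finite set with positive point masses. Everything in the sketch below (the masses, the sample vectors, the list of exponents) is an assumption chosen for illustration.

```python
import random

def lp_norm(values, weights, p):
    """(sum_i w_i |v_i|^p)^(1/p) for the weighted counting measure mu({i}) = w_i."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

random.seed(1)
weights = [random.uniform(0.1, 2.0) for _ in range(40)]
f = [random.uniform(-3.0, 3.0) for _ in range(40)]
g = [random.uniform(-3.0, 3.0) for _ in range(40)]
f_plus_g = [a + b for a, b in zip(f, g)]

# For each exponent p store the pair (||f + g||_p, ||f||_p + ||g||_p).
minkowski = {
    p: (lp_norm(f_plus_g, weights, p),
        lp_norm(f, weights, p) + lp_norm(g, weights, p))
    for p in (1.0, 2.0, 3.5)
}
```

For every listed p the first entry of the pair is bounded by the second, as the lemma predicts.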
We define an equivalence relation ∼ on the space of measurable
functions by setting f∼ g if and only if f=g almost
everywhere. We can check that ∼ is an equivalence relation
(the slightly non-trivial part is that ∼ is transitive).
Proposition 4
For 1≤ p<∞, the collection of equivalence classes
Lp(µ) / ∼ is a vector space, and
||·||p is a well-defined norm on Lp(µ) /
∼.
Proof. We need to show that addition and scalar multiplication are well-defined on Lp(µ)/∼. Let a∈K and f1,f2,g1,g2∈Lp(µ) with f1∼f2 and g1∼g2. Then it is easy to see that af1+g1 ∼ af2+g2; but this is all that is required!
If f ∼ g then | f |^p = | g |^p almost
everywhere, and so ||f||p = ||g||p. So
||·||p is well-defined on equivalence classes. In
particular, if f∼ 0 then ||f||p=0. Conversely, if
||f||p=0 then ∫X | f |^p d µ=0, so as
| f |^p is a non-negative function, we must have that
| f |^p=0 almost everywhere. Hence f=0 almost
everywhere, so f∼ 0. That is,
\[ \{ f\in L_p(\mu) : f\sim 0 \} = \{ f\in L_p(\mu) : \|f\|_p = 0 \}. \]
It follows from the above lemma that this is a subspace of Lp(µ).
The above lemma now immediately shows that ||·||p is a
norm on Lp(µ)/∼.
□
Definition 5
We write Lp(µ) for the normed space of equivalence classes
(Lp(µ)/∼ , ||·||p), where Lp(µ) inside the quotient denotes the space of functions introduced at the beginning of this subsection.
We will abuse notation and continue to write members of
Lp(µ) as functions. Really they are equivalence
classes, and so care must be taken when dealing with
Lp(µ). For example, if f∈ Lp(µ),
it does not make sense to talk about the value of f at a
point.
Theorem 6
Let (fn) be a Cauchy sequence in Lp(µ). There
exists f∈ Lp(µ) with ||fn−f||p→
0. In fact, we can find a subsequence (nk) such that
fnk→ f pointwise, almost everywhere.
Proof. Consider first the case of a finite measure space X. We again follow the three-step scheme from Rem. 7. Let (fn) be a Cauchy sequence in Lp(µ). From the Hölder inequality (80) we see that ||fn−fm||1≤ ||fn−fm||p (µ(X))^{1/q}. Thus, (fn) is also a Cauchy sequence in L1(µ). Thus by Theorem 42 there is the limit function f∈L1(µ). Moreover, from the proof of that theorem we know that there is a subsequence (fnk) of (fn) convergent to f almost everywhere. Thus in the Cauchy sequence inequality
\[ \int_X |f_{n_k} - f_{n_m}|^p \, d\mu < \varepsilon \]
we can pass to the limit m→ ∞ by the Fatou Lemma 39 and conclude:
\[ \int_X |f_{n_k} - f|^p \, d\mu \leq \varepsilon. \]
So, fnk converges to f in Lp(µ), then fn converges to f in Lp(µ) as well.
For a σ-finite measure µ we represent X=⊔k
Xk with µ(Xk)<+∞ for all k. The restriction
(fn(k)) of a Cauchy sequence
(fn)⊂Lp(X,µ) to every Xk is a Cauchy
sequence in Lp(Xk,µ). By the previous paragraph
there is the limit f(k)∈ Lp(Xk,µ). Define a
function f∈Lp(X,µ) by the identities
f(x)=f(k)(x) if x∈ Xk. By the additivity of the integral, the
Cauchy condition on (fn)
can be written as:
\[ \int_X |f_n - f_m|^p \, d\mu = \sum_{k=1}^{\infty} \int_{X_k} |f_n^{(k)} - f_m^{(k)}|^p \, d\mu < \varepsilon. \]
It implies for any M:
\[ \sum_{k=1}^{M} \int_{X_k} |f_n^{(k)} - f_m^{(k)}|^p \, d\mu < \varepsilon. \]
In the last inequality we can pass to the limit m→ ∞:
\[ \sum_{k=1}^{M} \int_{X_k} |f_n^{(k)} - f^{(k)}|^p \, d\mu \leq \varepsilon. \]
Since the last inequality is independent of M we conclude:
\[ \int_X |f_n - f|^p \, d\mu = \sum_{k=1}^{\infty} \int_{X_k} |f_n^{(k)} - f^{(k)}|^p \, d\mu \leq \varepsilon. \]
Thus we conclude that fn→ f in Lp(X,µ).
□
Corollary 7
Lp(µ) is a Banach space.
Example 8
If p=2 then Lp(µ)=L2(µ) can be equipped with the inner product:
\[ \langle f, g \rangle = \int_X f \bar{g} \, d\mu. \]
The previous Corollary implies that L2(µ)
is a Hilbert space, see a preliminary discussion in Defn. 22.
Proposition 9
Let (X,L,µ) be a measure space, and let 1≤p<∞. We can define a map Φ:Lq(µ) → Lp(µ)* by setting Φ(f)=F, for f∈Lq(µ), 1/p+1/q=1, where
\[ F : L_p(\mu)\to\mathbb{K}, \qquad g \mapsto \int_X fg \, d\mu \qquad (g\in L_p(\mu)). \]
Proof.
This proof is very similar to the proof of Thm. 13. For f∈Lq(µ) and g∈Lp(µ), it follows by Hölder’s Inequality (80) that fg is summable, and
\[ \left| \int_X fg \, d\mu \right| \leq \int_X |fg| \, d\mu \leq \|f\|_q \|g\|_p. \]
Let f1,f2∈ Lq(µ) and g1,g2∈ Lp(µ) with f1∼ f2 and
g1∼ g2. Then f1g1 = f2g1 almost everywhere and
f2g1 = f2g2 almost everywhere, so f1g1 = f2g2 almost everywhere, and
hence
\[ \int_X f_1 g_1 \, d\mu = \int_X f_2 g_2 \, d\mu. \]
So Φ is well-defined.
Clearly Φ is linear, and we have shown that ||Φ(f)|| ≤ ||f||q.
Let f∈ Lq(µ) and define g:X→K by
\[ g(x) = \begin{cases} \overline{f(x)} \, |f(x)|^{q-2}, & \text{if } f(x)\neq 0,\\ 0, & \text{if } f(x)=0. \end{cases} \]
Then | g(x) | = | f(x) |^{q−1} for all x∈ X, and so
\[ \int_X |g|^p \, d\mu = \int_X |f|^{p(q-1)} \, d\mu = \int_X |f|^q \, d\mu, \]
so ||g||p = ||f||q^{q/p}, and so, in particular, g∈Lp(µ).
Let F=Φ(f), so that
\[ F(g) = \int_X fg \, d\mu = \int_X |f|^q \, d\mu = \|f\|_q^q. \]
Thus ||F|| ≥ ||f||q^q / ||g||p = ||f||q. So
we conclude that ||F|| = ||f||q, showing that Φ is
an isometry.
□
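The norming element used in the proof can be examined on a finite measure space with real-valued functions, where g(x)=sign(f(x)) | f(x) |^{q−1}. The sketch below assumes p=3, q=3/2 and a made-up four-point measure space; it evaluates F(g), ||f||q and ||g||p so that the three identities from the proof can be compared numerically.

```python
import math

def lp_norm(values, weights, p):
    """(sum_i w_i |v_i|^p)^(1/p) for the weighted counting measure mu({i}) = w_i."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

p, q = 3.0, 1.5  # conjugate exponents: 1/3 + 2/3 = 1

# Hypothetical measure space with four points and a real-valued f.
weights = [0.5, 1.0, 2.0, 0.25]
f = [1.0, -2.0, 0.0, 3.0]

# Real case of g = conj(f) |f|^(q-2): g = sign(f) |f|^(q-1), and g = 0 where f = 0.
g = [math.copysign(abs(v) ** (q - 1.0), v) if v != 0 else 0.0 for v in f]

F_of_g = sum(w * a * b for a, b, w in zip(f, g, weights))  # F(g) = integral of f g
norm_f_q = lp_norm(f, weights, q)
norm_g_p = lp_norm(g, weights, p)
```

Up to rounding, `F_of_g` equals `norm_f_q ** q`, `norm_g_p` equals `norm_f_q ** (q / p)`, and the quotient `F_of_g / norm_g_p` recovers `norm_f_q`, which is the lower bound for ||F|| obtained in the proof.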
Proposition 10
Let (X,L,µ) be a finite measure space, let 1≤p<∞, and let F∈Lp(µ)*. Then there exists f∈Lq(µ), 1/p+1/q=1, such that
\[ F(g) = \int_X fg \, d\mu \qquad (g\in L_p(\mu)). \]
Proof.[Sketch of the proof]
As µ(X)<∞, for E∈L, we have that ||χE||p = µ(E)^{1/p} < ∞. So χE∈Lp(µ), and hence we can define
\[ \nu(E) = F(\chi_E) \qquad (E\in L). \]
We proceed to show that ν is a signed (or complex) measure.
Then we can apply the Radon–Nikodym Theorem 53 to find a function f:X→K such that
\[ F(\chi_E) = \nu(E) = \int_E f \, d\mu \qquad (E\in L). \]
There is then a long argument to show that f∈
Lq(µ), which we skip here. Finally, we need to show that
\[ \int_X fg \, d\mu = F(g) \]
for all g∈ Lp(µ), and not just for
g=χE. That follows for simple functions with a finite set of
values by linearity of the Lebesgue integral and F. Then, it can
be extended by continuity to the entire space Lp(µ)
in view of Prop. 14 below.
□
Proposition 11
For 1<p<∞, we have that Lp(µ)* =
Lq(µ) isometrically, under the identification of the
above results.
Exercise 13
Let µ
be a measure on the real line.
-
Show that the space L∞(ℝ,µ) is either
finite-dimensional or non-separable.
- Show that for p≠ q neither
Lp(ℝ,µ) nor
Lq(ℝ,µ) contains the other space.
14.2 Dense Subspaces in Lp
We note that f∈Lp(X) if and only if | f |p
is summable, thus we can use all results from
Section 13 to investigate Lp(X).
Proposition 14
Let (X,L,µ) be a finite measure space, and let 1≤p<∞. Then the collection of simple bounded functions attaining only a finite number of values is dense in Lp(µ).
Proof. Let f∈Lp(µ), and suppose for now that f≥0. For each n∈ℕ, let
\[ f_n = \min\left( n, \; 2^{-n} \lfloor 2^n f \rfloor \right). \]
Then each fn is simple, fn ↑ f, and | fn−f |^p→0 pointwise. For each n, we have that
\[ 0 \leq f_n \leq f \implies 0 \leq f - f_n \leq f, \]
so that | f−fn |^p ≤ | f |^p for all n. As ∫| f |^p d µ<∞, we can apply the Dominated Convergence Theorem to see that
\[ \lim_{n\to\infty} \int_X |f_n - f|^p \, d\mu = 0, \]
that is, ||fn−f||p → 0.
The general case follows by taking positive and negative parts,
and if K=ℂ, by taking real and imaginary parts first.
□
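A standard way to produce simple functions increasing to a non-negative f is the dyadic truncation fn=min(n, 2^{−n}⌊2^n f⌋); it is assumed here as the construction, and the sample function f(x)=1/√x on a grid in (0,1] is likewise only an illustrative choice. The sketch tabulates fn on the grid so that monotonicity in n and the bound 0 ≤ fn ≤ f can be inspected.

```python
import math

def dyadic_approximation(value, n):
    """min(n, 2^-n * floor(2^n * value)) for a single value f(x) = value >= 0."""
    return min(float(n), math.floor(value * 2 ** n) / 2 ** n)

# Illustrative unbounded-near-zero function sampled on a grid in (0, 1].
f = lambda x: 1.0 / math.sqrt(x)
points = [k / 64 for k in range(1, 65)]

# table[n][i] is f_n evaluated at points[i].
table = {n: [dyadic_approximation(f(x), n) for x in points] for n in range(1, 12)}
```

Each fn takes only finitely many values (multiples of 2^{−n} not exceeding n), the rows of `table` increase with n, and once n exceeds the values of f on the grid the error f−fn is at most 2^{−n}.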
Corollary 15
Let µ be the Lebesgue measure on the real line.
The collection of simple bounded functions with compact supports
attaining only a finite number of values is dense in
Lp(ℝ,µ).
Proof.
Let f ∈ Lp(ℝ,µ). Since ∫ℝ | f |^p d µ = ∑_{k=−∞}^{∞} ∫[k,k+1) | f |^p d µ, there exists N such that
\[ \left( \sum_{k=-\infty}^{-N-1} + \sum_{k=N}^{\infty} \right) \int_{[k,k+1)} |f|^p \, d\mu < \varepsilon. \]
By the previous Proposition, the restriction of f to [−N,N] can be
ε-approximated by a simple bounded function f1
with support in [−N,N] attaining only a finite number of
values. Therefore f1 will also be a
(2ε)-approximation to f.
□
Definition 16
A function f:ℝ→ ℂ is
called a step function if it is a linear combination of a finite number of
indicator functions of half-open disjoint intervals: f=∑k ck χ[ak,bk).
The regularity of the Lebesgue measure allows us to make a stronger
version of Prop. 14.
Lemma 17
The space of step functions is dense in Lp(ℝ).
Proof.
By Prop. 14, for a given f∈Lp(ℝ) and ε>0 there exists a simple function f0=∑_{k=1}^{n} ck χAk such that ||f−f0||p<ε/2. Let M=||f0||∞ < ∞. By measurability of the set Ak there is Ck=⊔_{j=1}^{mk} [ajk,bjk), a disjoint finite union of half-open intervals, such that µ(Ck▵Ak)<ε/(2n^3 M). Since Ak and Aj are disjoint for k≠j we also obtain by the triangle inequality: µ(Cj ∩ Ak)<ε/(2n^3 M) and µ(Cj ∩ Ck)<2ε/(2n^3 M).
We define a step function
\[ f_1 = \sum_{k=1}^{n} c_k \chi_{C_k} = \sum_{k=1}^{n} \sum_{j=1}^{m_k} c_k \chi_{[a_{jk},b_{jk})}. \]
Clearly
\[ f_1(x) = c_k \quad \text{for all } x\in A_k \setminus \Big( (C_k\triangle A_k) \cup \bigcup_{j\neq k} C_j \Big). \]
Thus:
\[ \mu(\{x\in\mathbb{R} \mid f_0(x)\neq f_1(x)\}) \leq n\cdot n\cdot \frac{\varepsilon}{2n^3 M} = \frac{\varepsilon}{2nM}. \]
Then ||f0−f1||p≤ nM· ε/(2nM)=ε/2 because ||f1||∞<nM. Thus ||f−f1||p<ε.
□
Corollary 18
The collection of continuous functions belonging to
Lp(ℝ) is dense in Lp(ℝ).
Proof.
In view of Rem. 29 and the previous Lemma it is enough to show that the characteristic function of an interval [a,b] can be approximated by a continuous function in Lp(ℝ). The idea of such an approximation is illustrated by Fig. 4 and we skip the technical details.
□
We will establish denseness of the subspace of smooth functions in
§ 15.4.
Exercise 19
Show that every f∈L1(ℝ) is continuous on average, that is for any ε>0 there is δ>0 such that for all t such that | t |<δ we have:
\[ \int_{\mathbb{R}} |f(x) - f(x+t)| \, dx < \varepsilon. \tag{82} \]
Here is an alternative demonstration of a similar result; it
essentially encapsulates all the above separate statements.
Let ([0,1],L,µ) be the restriction of Lebesgue
measure to [0,1]. We often write Lp([0,1]) instead
of Lp(µ).
Proposition 20
For 1≤p<∞, we have that CK([0,1]) is dense in Lp([0,1]).
Proof. As [0,1] is a finite measure space, and each member of CK([0,1]) is bounded, it is easy to see that each f∈CK([0,1]) is such that ||f||p<∞. So it makes sense to regard CK([0,1]) as a subspace of Lp(µ). If CK([0,1]) is not dense in Lp(µ), then we can find a non-zero F∈Lp([0,1])* with F(f)=0 for each f∈CK([0,1]). This was a corollary of the Hahn-Banach theorem 15.
So there exists a non-zero g∈ Lq([0,1]) with
\[ \int_{[0,1]} fg \, d\mu = 0 \qquad (f\in C_{\mathbb{K}}([0,1])). \]
Let a<b in [0,1]. By approximating χ(a,b) by a
continuous function, we can show that ∫(a,b) g d µ = ∫
g χ(a,b) d µ = 0.
Suppose for now that K=ℝ. Let A = {
x∈[0,1] : g(x)≥0 } ∈ L. By the definition of
the Lebesgue (outer) measure, for ε>0, there exist
sequences (an) and (bn) with A ⊆ ∪n
(an,bn), and ∑n (bn−an) ≤ µ(A) + ε.
For each N, consider ∪_{n=1}^{N} (an,bn). If some
(ai,bi) overlaps (aj,bj), then we could just consider
the larger interval (min(ai,aj), max(bi,bj)). Formally by
an induction argument, we see that we can write ∪_{n=1}^{N}
(an,bn) as a finite union of some disjoint open
intervals, which, abusing notation, we still denote by (an,bn). By
linearity, it hence follows that for N∈ℕ, if we set
BN = ⊔_{n=1}^{N} (an,bn), then
\[ \int g\chi_{B_N} \, d\mu = \int g\chi_{(a_1,b_1)\sqcup\cdots\sqcup(a_N,b_N)} \, d\mu = 0. \]
Let B=∪n (an,bn), so A⊆ B and µ(B) ≤
∑n (bn−an) ≤ µ(A)+ε. We then have that
\[ \left| \int g\chi_{B_N} \, d\mu - \int g\chi_B \, d\mu \right| = \left| \int g\chi_{B\setminus B_N} \, d\mu \right|. \]
We now apply Hölder’s inequality to get
\[ \left| \int g\chi_{B\setminus B_N} \, d\mu \right| \leq \left( \int \chi_{B\setminus B_N} \, d\mu \right)^{1/p} \|g\|_q = \mu(B\setminus B_N)^{1/p} \|g\|_q \leq \left( \sum_{n=N+1}^{\infty} (b_n - a_n) \right)^{1/p} \|g\|_q. \]
We can make this arbitrarily small by making N large. Hence we conclude that
\[ \int g\chi_B \, d\mu = 0. \]
Then we apply Hölder’s inequality again to see that
\[ \left| \int g\chi_A \, d\mu \right| = \left| \int g\chi_A \, d\mu - \int g\chi_B \, d\mu \right| = \left| \int g\chi_{B\setminus A} \, d\mu \right| \leq \|g\|_q \, \mu(B\setminus A)^{1/p} \leq \|g\|_q \, \varepsilon^{1/p}. \]
As ε>0 was arbitrary, we see that ∫A g d µ=0.
As g is non-negative on A, we conclude that g=0 almost everywhere on A.
A similar argument applied to the set {x∈[0,1] : g(x)≤0}
allows us to conclude that g=0 almost everywhere. If
K=ℂ, then take real and imaginary parts.
□
14.3 Continuous functions
Let K be a compact (always assumed Hausdorff) topological space.
Definition 21
The Borel σ-algebra, B(K), on K, is the σ-algebra generated by the open sets in K (recall what this means from Section 11.5). A member of B(K) is a Borel set.
Notice that if f:K→𝕂 is a continuous function, then
clearly f is B(K)-measurable (the inverse image of
an open set will be open, and hence certainly Borel). So if
µ:B(K)→𝕂 is a finite real or
complex charge (for 𝕂=ℝ or
𝕂=ℂ respectively), then f will be
µ-summable (as f is bounded) and so we can define
\[ \varphi_\mu : C_{\mathbb{K}}(K)\to\mathbb{K}, \qquad \varphi_\mu(f) = \int_K f \, d\mu \qquad (f\in C_{\mathbb{K}}(K)). \]
Clearly φµ is linear. Suppose for now that µ is positive, so that
\[ |\varphi_\mu(f)| \leq \int_K |f| \, d\mu \leq \|f\|_\infty \, \mu(K) \qquad (f\in C_{\mathbb{K}}(K)). \]
So φµ∈ CK(K)* with ||φµ||≤ µ(K).
The aim of this section is to show that all of
CK(K)* arises in this way. First we need to define a
class of measures which are in a good agreement with the topological
structure.
Definition 22
A measure µ:B(K)→[0,∞) is regular if for each A∈B(K), we have
\[ \mu(A) = \sup\{ \mu(E) : E\subseteq A \text{ and } E \text{ is compact} \} = \inf\{ \mu(U) : A\subseteq U \text{ and } U \text{ is open} \}. \]
A charge ν=ν+−ν− is regular if ν+ and ν− are regular measures. A complex measure is regular if its real and imaginary parts are regular.
Note the similarity between this notion and definition of outer measure.
Example 23
-
Many common measures on the real line, e.g. the Lebesgue measure,
point measures, etc., are regular.
- An example of the measure µ on [0,1] which is not regular:
for any other subset A⊂[0,1].
- Another example of a σ-additive measure µ on [0,1]
which is not regular:
\[ \mu(A) = \begin{cases} 0, & \text{if } A \text{ is at most countable;} \\ +\infty, & \text{otherwise.} \end{cases} \]
The following subspace of the space of all simple functions is helpful.
As we are working only with compact spaces, for us, “compact” is the
same as “closed”.
Regular measures somehow interact “well” with the underlying
topology on K.
We let Mℝ(K) and Mℂ(K) be the
collection of all finite, regular real or complex charges (that is, signed or complex
measures) on B(K).
Exercise 24
Check that Mℝ(K) and Mℂ(K) are
real and complex vector spaces, respectively, for the obvious
definitions of addition and scalar multiplication.
Recall, Defn. 31, that for µ∈ MK(K) we define
the variation of µ
\[ \|\mu\| = \sup\Big\{ \sum_{n=1}^{\infty} |\mu(A_n)| \Big\}, \]
where the supremum is taken over all sequences (An) of pairwise
disjoint members of B(K), with ⊔n An=K. Such
(An) are called partitions.
Proposition 25
The variation ||·|| is a norm on MK(K).
Proof.
If µ=0 then clearly ||µ||=0. If ||µ||=0, then for A∈B(K), let A1=A, A2=K∖A and A3=A4=⋯=∅. Then (An) is a partition, and so
\[ 0 = \sum_{n=1}^{\infty} |\mu(A_n)| = |\mu(A)| + |\mu(K\setminus A)|. \]
Hence µ(A)=0, and so as A was arbitrary, we have that µ=0.
Clearly ||aµ|| = |a| ||µ|| for a∈𝕂 and µ∈ MK(K).
For µ,λ∈ MK(K) and a partition (An), we have that
\[ \sum_{n} |(\mu+\lambda)(A_n)| = \sum_{n} |\mu(A_n)+\lambda(A_n)| \le \sum_{n} |\mu(A_n)| + \sum_{n} |\lambda(A_n)| \le \|\mu\| + \|\lambda\|. \]
As (An) was arbitrary, we see that ||µ+λ|| ≤ ||µ|| + ||λ||.
□
To get a handle on the “regular” condition, we need to know a little
more about CK(K).
Theorem 26 (Urysohn’s Lemma)
Let K be a compact space, and let E, F be closed subsets of K with E∩F=∅. There exists f:K→[0,1] continuous with f(x)=1 for x∈E and f(x)=0 for x∈F (written f(E)={1} and f(F)={0}).
Proof.
See a book on (point set) topology.
□
Lemma 27
Let µ: B(K)→[0,∞) be a regular measure. Then for U⊆K open, we have
\[ \mu(U) = \sup\Big\{ \int_K f \, d\mu : f \in C_{\mathbb{R}}(K),\ 0 \le f \le \chi_U \Big\}. \]
Proof.
If 0≤f≤χU, then
\[ 0 = \int_K 0 \, d\mu \le \int_K f \, d\mu \le \int_K \chi_U \, d\mu = \mu(U). \]
Conversely, let F=K∖U, a closed set. Let E⊆U be closed. By Urysohn's Lemma 26, there exists f:K→[0,1] continuous with f(E)={1} and f(F)={0}. So χE ≤ f ≤ χU, and hence
\[ \mu(E) \le \int_K f \, d\mu \le \mu(U). \]
As µ is regular,
\[ \mu(U) = \sup\{\mu(E) : E \subseteq U \text{ closed}\} \le \sup\Big\{ \int_K f \, d\mu : 0 \le f \le \chi_U \Big\} \le \mu(U). \]
Hence we have equality throughout.
□
The next result tells that the variation coincides with the norm on
real charges viewed as linear functionals on Cℝ(K).
Lemma 28
Let µ∈Mℝ(K). Then
\[ \|\mu\| = \|\phi_\mu\| := \sup\Big\{ \Big| \int_K f \, d\mu \Big| : f \in C_{\mathbb{R}}(K),\ \|f\|_\infty \le 1 \Big\}. \]
Proof.
Let (A,B) be a Hahn decomposition (Thm. 36) for µ. For f∈Cℝ(K) with ||f||∞≤ 1, we have that
\[ \begin{aligned} \Big| \int_K f \, d\mu \Big| &\le \Big| \int_A f \, d\mu \Big| + \Big| \int_B f \, d\mu \Big| = \Big| \int_K f \, d\mu_+ \Big| + \Big| \int_K f \, d\mu_- \Big| \\ &\le \int_K |f| \, d\mu_+ + \int_K |f| \, d\mu_- \le \|f\|_\infty \big( \mu(A) - \mu(B) \big) \le \|f\|_\infty \, \|\mu\|, \end{aligned} \]
using the fact that µ(B)≤0 and that (A,B) is a partition of K.
Conversely, as µ is regular, for є>0, there exist
closed sets E and F with E⊆ A, F⊆ B,
and with µ+(E)> µ+(A)−є and
µ−(F)>µ−(B)−є. By Urysohn Lemma 26,
there exists f:K→[0,1] continuous with f(E)={1}
and f(F)={0}. Let g=2f−1, so g is continuous, g
takes values in [−1,1], and g(E)={1}, g(F)={−1}.
Then
\[ \begin{aligned} \int_K g \, d\mu &= \int_E 1 \, d\mu + \int_F -1 \, d\mu + \int_{K\setminus(E\cup F)} g \, d\mu \\ &= \mu(E) - \mu(F) + \int_{A\setminus E} g \, d\mu + \int_{B\setminus F} g \, d\mu. \end{aligned} \]
As E⊆ A, we have µ(E) = µ+(E), and as F⊆ B,
we have −µ(F)=µ−(F). So
\[ \begin{aligned} \int_K g \, d\mu &> \mu_+(A) - \epsilon + \mu_-(B) - \epsilon + \int_{A\setminus E} g \, d\mu + \int_{B\setminus F} g \, d\mu \\ &\ge |\mu(A)| + |\mu(B)| - 2\epsilon - |\mu(A\setminus E)| - |\mu(B\setminus F)| \\ &\ge |\mu(A)| + |\mu(B)| - 4\epsilon. \end{aligned} \]
As є>0 was arbitrary, we see that ||φµ|| ≥
| µ(A) |+| µ(B) |=||µ||.
□
Thus, we know that Mℝ(K) is
isometrically embedded in Cℝ(K)*.
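Lemma 28 can be checked by hand in the simplest situation. The following Python sketch is only an illustration, not part of the course material: it assumes that K is a finite set with the discrete topology (so K is compact Hausdorff and every function on K is continuous), and the names `weights`, `mu`, `variation`, `hahn_decomposition` and `functional_norm` are hypothetical helpers introduced here.

```python
from itertools import product

# Assumption: K = {"a", "b", "c"} is finite and discrete, so a charge on K
# is determined by its (hypothetical) point weights.
weights = {"a": 2.0, "b": -3.0, "c": 0.5}

def mu(subset):
    """Value of the charge on a subset of K."""
    return sum(weights[x] for x in subset)

def variation():
    """Variation norm: the partition of K into single points attains the sup."""
    return sum(abs(w) for w in weights.values())

def hahn_decomposition():
    """A carries the non-negative weights, B the negative ones."""
    A = {x for x, w in weights.items() if w >= 0}
    return A, set(weights) - A

def functional_norm():
    """Sup of |integral of f d mu| over real f with sup-norm at most 1.

    The integral is linear in f, so the sup over the cube [-1, 1]^K is
    attained at a vertex, i.e. at some f taking only the values +1 and -1.
    """
    points = list(weights)
    return max(
        abs(sum(s * weights[x] for s, x in zip(signs, points)))
        for signs in product((-1.0, 1.0), repeat=len(points))
    )

A, B = hahn_decomposition()
print(variation(), abs(mu(A)) + abs(mu(B)), functional_norm())
```

In this toy case the extremal function is g=1 on A and g=−1 on B, which plays the role of the function g=2f−1 built from Urysohn's Lemma in the proof.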
14.4 Riesz Representation Theorem
To facilitate an approach to the key point of this Subsection we will
require some more definitions.
Definition 29
A functional F on C(K) is positive if for any non-negative function f≥0 we have F(f)≥0.
Lemma 30
Any positive linear functional F on C(X) is continuous and
||F||=F(1), where 1 is the function
identically equal to 1 on X.
Proof.
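For any real-valued function f such that ||f||∞≤ 1 the
functions 1−f and 1+f are non-negative, thus
F(1)−F(f)=F(1−f)≥0 and F(1)+F(f)=F(1+f)≥0. Thus |F(f)|≤F(1), that
is F is bounded with ||F||≤F(1). Since ||1||∞=1, this bound is attained at f=1, and the norm of F is F(1).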
□
So for a positive functional you know the exact place where to spot
its norm, while a general linear functional can attain its norm at a
generic point (if any) of the unit ball in C(X). It is also
remarkable that any bounded linear functional can be represented by a
pair of positive ones.
Lemma 31
Let λ be a continuous linear functional on C(X). Then there are positive functionals λ+ and λ− on C(X), such that λ=λ+−λ−.
Proof.
First, for f∈Cℝ(K) with f≥0, we define
\[ \begin{aligned} \lambda_+(f) &= \sup\{\lambda(g) : g \in C_{\mathbb{R}}(K),\ 0 \le g \le f\} \ge 0, \\ \lambda_-(f) &= \lambda_+(f) - \lambda(f) = \sup\{\lambda(g) - \lambda(f) : g \in C_{\mathbb{R}}(K),\ 0 \le g \le f\} \\ &= \sup\{\lambda(h) : h \in C_{\mathbb{R}}(K),\ 0 \le h + f \le f\} \\ &= \sup\{\lambda(h) : h \in C_{\mathbb{R}}(K),\ -f \le h \le 0\} \ge 0. \end{aligned} \]
In a sense, this is similar to the Hahn decomposition (Thm. 36).
We can check that
\[ \lambda_+(tf) = t\lambda_+(f), \qquad \lambda_-(tf) = t\lambda_-(f) \qquad (t \ge 0,\ f \ge 0). \]
For f1,f2≥ 0, we have that
\[ \begin{aligned} \lambda_+(f_1+f_2) &= \sup\{\lambda(g) : 0 \le g \le f_1+f_2\} \\ &= \sup\{\lambda(g_1+g_2) : 0 \le g_1+g_2 \le f_1+f_2\} \\ &\ge \sup\{\lambda(g_1) + \lambda(g_2) : 0 \le g_1 \le f_1,\ 0 \le g_2 \le f_2\} \\ &= \lambda_+(f_1) + \lambda_+(f_2). \end{aligned} \]
Conversely, if 0≤ g≤ f1+f2, then set g1 =
min(g,f1), so 0≤ g1 ≤ f1. Let g2 = g−g1 so
g1≤ g implies that 0≤ g2. For x∈ K, if
g1(x)=g(x) then g2(x) = 0 ≤ f2(x); if
g1(x)=f1(x) then f1(x)≤ g(x) and so g2(x) =
g(x)−f1(x) ≤ f2(x). So 0 ≤ g2 ≤ f2, and g =
g1 + g2. So in the above displayed equation, we really have
equality throughout, and so λ+(f1+f2) = λ+(f1) +
λ+(f2). As λ is additive, it is now immediate
that λ−(f1+f2) = λ−(f1) + λ−(f2).
For f∈ Cℝ(K) we put f+(x)=max(f(x),0)
and f−(x)=−min(f(x),0). Then f±≥ 0 and
f=f+−f−. We define:
λ+(f) = λ+(f+) − λ+(f−),
λ−(f) = λ−(f+) − λ−(f−).
|
As when we were dealing with integration, we can check that
λ+ and λ− become linear functionals; by the
previous Lemma they are bounded.
□
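The decomposition of Lemma 31 becomes fully explicit on a finite space. The sketch below is an illustration under assumptions that are not in the notes: K is a finite discrete space with three points, a functional λ is given by hypothetical weights `w` through λ(f)=Σ w_i f(x_i), and the supremum defining λ+ is computed by brute force over the vertices of the box 0≤g≤f (a linear function on a box attains its maximum at a vertex).

```python
from itertools import product

def lam(w, f):
    """The functional lambda(f) = sum of w_i * f_i on a finite space."""
    return sum(wi * fi for wi, fi in zip(w, f))

def lam_plus(w, f):
    """sup{lambda(g) : 0 <= g <= f}, searched over the vertices g_i in {0, f_i}."""
    return max(lam(w, g) for g in product(*[(0.0, fi) for fi in f]))

def lam_minus(w, f):
    """lambda_-(f) = lambda_+(f) - lambda(f), as in the proof of Lemma 31."""
    return lam_plus(w, f) - lam(w, f)

w = [2.0, -3.0, 0.5]          # hypothetical weights of the functional
f1 = [1.0, 2.0, 4.0]          # two non-negative functions on K
f2 = [0.5, 1.0, 0.0]
f_sum = [a + b for a, b in zip(f1, f2)]

print(lam_plus(w, f1), lam_minus(w, f1), lam_plus(w, f_sum))
```

Here λ+ only sees the positive weights and λ− only the negative ones, which is the finite-dimensional shadow of the Hahn decomposition mentioned in the proof.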
Finally, we need a technical definition.
Definition 32
For f∈Cℝ(K), we define the support of f, written supp(f), to be the closure of the set {x∈K : f(x)≠0}.
Theorem 33 (Riesz Representation)
Let K be a compact (Hausdorff) space, and let λ∈CK(K)*. There exists a unique µ∈MK(K) such that
\[ \lambda(f) = \int_K f \, d\mu \qquad (f \in C_{\mathbb{K}}(K)). \]
Furthermore, ||λ|| = ||µ||.
Proof.
Let us show uniqueness. If µ1,µ2∈MK(K) both induce λ then µ = µ1−µ2 induces the zero functional on CK(K). So for f∈Cℝ(K),
\[ 0 = \Re \int_K f \, d\mu = \int_K f \, d\mu_r, \qquad 0 = \Im \int_K f \, d\mu = \int_K f \, d\mu_i, \]
where µr and µi are the real and imaginary parts of µ. So µr and µi both induce the zero functional on Cℝ(K). By Lemma 28, this means that ||µr|| = ||µi||=0, showing that µ = µr + iµi = 0, as required.
Existence is harder, and we shall only sketch it here. Firstly, we
shall suppose that 𝕂=ℝ and that λ
is positive.
Motivated by the above Lemmas 27
and 28, for U⊆ K open, we define
\[ \mu^*(U) = \sup\{\lambda(f) : f \in C_{\mathbb{R}}(K),\ 0 \le f \le \chi_U,\ \operatorname{supp}(f) \subseteq U\}. \]
For A⊆ K general, we define
\[ \mu^*(A) = \inf\{\mu^*(U) : U \subseteq K \text{ is open},\ A \subseteq U\}. \]
We then proceed to show that
- µ* is an outer measure: this requires a technical
topological lemma, where we make use of the support condition in
the definition.
- We then check that every open set is µ*-measurable.
- As B(K) is generated by open sets, and the
collection of µ*-measurable sets is a σ-algebra,
it follows that every member of B(K) is
µ*-measurable.
- By using results from Section 12, it
follows that if we let µ be the restriction of µ* to
B(K), then µ is a measure on
B(K).
- We then check that this measure is regular.
- Finally, we show that µ does induce the functional
λ. Arguably, it is this last step which is the hardest
(or least natural to prove).
If λ is not positive, then by
Lemma 31 represent it as
λ=λ+−λ− for positive λ±. As
λ+ and λ− are positive functionals, we can
find µ+ and µ− positive measures in
Mℝ(K) such that
\[ \lambda_+(f) = \int_K f \, d\mu_+, \qquad \lambda_-(f) = \int_K f \, d\mu_- \qquad (f \in C_{\mathbb{R}}(K)). \]
Then if µ = µ+ − µ−, we see that
\[ \lambda(f) = \lambda_+(f) - \lambda_-(f) = \int_K f \, d\mu \qquad (f \in C_{\mathbb{R}}(K)). \]
Finally, if 𝕂=ℂ, then we use the same
“complexification” trick from the proof of the Hahn-Banach
Theorem 15. Namely, let λ∈ Cℂ(K)*, and
define λr, λi∈ Cℝ(K)* by
\[ \lambda_r(f) = \Re \lambda(f), \qquad \lambda_i(f) = \Im \lambda(f) \qquad (f \in C_{\mathbb{R}}(K)). \]
These are both clearly ℝ-linear. Notice also that
| λr(f) | = | ℜ λ(f) | ≤
| λ(f) | ≤ ||λ|| ||f||∞, so
λr is bounded; similarly λi.
By the real version of the Riesz Representation Theorem, there exist
charges µr and µi such that
\[ \Re \lambda(f) = \lambda_r(f) = \int_K f \, d\mu_r, \qquad \Im \lambda(f) = \lambda_i(f) = \int_K f \, d\mu_i \qquad (f \in C_{\mathbb{R}}(K)). \]
Then let µ=µr+iµi, so for f∈ Cℂ(K),
\[ \begin{aligned} \int_K f \, d\mu &= \int_K \Re(f) \, d\mu_r + i \int_K \Im(f) \, d\mu_r + i \int_K \Re(f) \, d\mu_i - \int_K \Im(f) \, d\mu_i \\ &= \lambda_r(\Re(f)) + i\lambda_r(\Im(f)) + i\lambda_i(\Re(f)) - \lambda_i(\Im(f)) \\ &= \Re \lambda(\Re(f)) + i \Re \lambda(\Im(f)) + i \Im \lambda(\Re(f)) - \Im \lambda(\Im(f)) \\ &= \lambda(\Re(f) + i \Im(f)) = \lambda(f), \end{aligned} \]
as required.
□
Notice that we have not currently proved that ||µ|| = ||λ|| in
the case 𝕂=ℂ. See a textbook for this.
15 Fourier Transform
In this section we will briefly present the theory of the Fourier transform,
focusing on the commutative group approach. We mainly follow the footsteps
of [, Ch. IV].
15.1 Convolutions on Commutative Groups
Let G be a commutative group. We will use the + sign to denote
the group operation; respectively, the inverse of an element g∈ G will
be denoted −g. We assume that G has a Hausdorff topology such
that the operations (g1,g2)↦ g1+g2 and g↦ −g are
continuous maps. We also assume that the topology is locally
compact, that is, the neutral element of the group
has a neighbourhood with a compact closure.
Example 1
Our main examples will be as follows:
-
G=ℤ the group of integers with operation of
addition and the discrete
topology (each point is an open set).
- G=ℝ the group of real numbers with addition and
the topology defined by open intervals.
- G=T the group of Euclidean rotations of the unit circle
in ℝ² with the natural topology. Other
realisations of the same group:
- Unimodular complex numbers under multiplication.
- Factor group ℝ/ℤ, that is addition
of real numbers modulo 1.
There is an isomorphism between the two realisations given by
z=e^{2πit}, t∈[0,1), |z|=1.
We assume that G has a regular Borel measure which is invariant in
the following sense.
Definition 2
Let µ be a measure on a commutative group G; µ is called invariant (or a Haar measure) if for any measurable X and any g∈G the sets g+X and −X are also measurable and µ(X)=µ(g+X)=µ(−X).
Such an invariant measure exists if and only if the group is locally
compact; in this case the measure is uniquely defined up to a
constant factor.
Exercise 3
Check that in the above three cases invariant measures are:
-
G=ℤ, the invariant measure of X is equal to number of
elements in X.
- G=ℝ the invariant measure is the Lebesgue
measure.
- G=T the invariant measure coincides with the
Lebesgue measure.
Definition 4
A convolution of two functions on a commutative group G with an invariant measure µ is defined by:
\[ (f_1 * f_2)(x) = \int_G f_1(x-y) \, f_2(y) \, d\mu(y) = \int_G f_1(y) \, f_2(x-y) \, d\mu(y). \tag{83} \]
Theorem 5
If f1, f2∈L1(G,µ), then the integrals in (83) exist for almost every x∈G, the function f1*f2 is in L1(G,µ) and ||f1*f2||≤ ||f1||·||f2||.
Proof.
If f1, f2∈L1(G,µ) then by Fubini’s Thm. 50 the function φ(x,y)=f1(x)f2(y) (a pointwise product, not a convolution) is in L1(G×G, µ×µ) and ||φ||=||f1||·||f2||.
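Let us define a map τ: G× G → G× G such
that τ(x,y)=(x+y,y). It is measurable (sends Borel sets to
Borel sets) and preserves the measure µ×µ. Indeed, for
an elementary set C=A× B⊂ G× G we have:
\[ \begin{aligned} (\mu\times\mu)(\tau(C)) &= \int_{G\times G} \chi_{\tau(C)}(x,y) \, d\mu(x) \, d\mu(y) \\ &= \int_{G\times G} \chi_{C}(x-y,y) \, d\mu(x) \, d\mu(y) \\ &= \int_{G} \Big( \int_G \chi_{C}(x-y,y) \, d\mu(x) \Big) d\mu(y) \\ &= \int_{B} \mu(A+y) \, d\mu(y) = \mu(A) \times \mu(B) = (\mu\times\mu)(C). \end{aligned} \]
We used invariance of µ and Fubini’s
Thm. 50. Therefore we have an isometric isomorphism of
L1(G× G,µ× µ) into itself by the
formula:
\[ T\varphi(x,y) = \varphi(\tau^{-1}(x,y)) = \varphi(x-y,y). \]
If we apply this isomorphism to the above function
φ(x,y)=f1(x)f2(y) we obtain Tφ(x,y)=f1(x−y)f2(y), which is therefore in L1(G× G,µ× µ) with ||Tφ||=||f1||·||f2||. By Fubini’s Thm. 50 the integral over y in (83) exists for almost every x and defines a function of x in L1(G,µ) with norm at most ||Tφ||, which is the statement.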
□
Definition 6
Denote by S(k) the map S(k): f↦ k*f
which we will call convolution operator with the kernel k.
Corollary 7
If k∈L1(G) then the convolution S(k) is a
bounded linear operator on L1(G).
Theorem 8
Convolution is a commutative, associative and distributive operation. In particular S(f1)S(f2)=S(f2)S(f1)=S(f1*f2).
Proof.
Direct calculation using change of variables.
□
It follows from Thm. 5 that convolution is a
closed operation on L1(G) and has nice properties due
to Thm. 8. We fix this in the following definition.
Definition 9
L1(G) equipped with the operation of convolution is
called the convolution algebra L1(G).
The following operators are of special interest.
Definition 10
An operator of shift T(a) acts on functions by
T(a): f(x)↦ f(x+a).
Lemma 11
An operator of shift is an isometry of Lp(G), 1≤ p≤∞.
Theorem 12
Operators of shifts and convolutions commute:
T(a)(f1*f2)=T(a)f1*f2=f1*T(a)f2,
|
or
T(a)S(f)=S(f)T(a)=S(T(a)f).
|
Proof.
Just another calculation with a change of variables.
□
There is a useful relation between support of functions and their convolutions.
Lemma 14
For any f1, f2∈L1(G) we have:
\[ \operatorname{supp}(f_1 * f_2) \subset \operatorname{supp}(f_1) + \operatorname{supp}(f_2). \]
Proof.
If x∉supp(f1)+supp(f2) then for any
y∈supp(f2) we have
x−y∉supp(f1). Thus for such x the convolution is
the integral of a function which is identically zero.
□
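For the group G=ℤ with the counting measure, all the statements of this subsection can be tested on finitely supported sequences. The following Python sketch is an illustration only: sequences are stored as dictionaries from integers to values, and the helper names `convolve`, `norm1` and `shift` are assumptions introduced here, not notation from the notes.

```python
def convolve(f1, f2):
    """(f1*f2)(x) = sum over y of f1(x-y) f2(y) for finitely supported f1, f2 on Z."""
    out = {}
    for x1, a in f1.items():
        for x2, b in f2.items():
            # the pair (x1, x2) = (x - y, y) contributes to the point x = x1 + x2
            out[x1 + x2] = out.get(x1 + x2, 0) + a * b
    return out

def norm1(f):
    """L1 norm with respect to the counting measure on Z."""
    return sum(abs(v) for v in f.values())

def shift(f, a):
    """Shift operator (T(a)f)(x) = f(x + a)."""
    return {x - a: v for x, v in f.items()}

f1 = {0: 1, 1: -2, 3: 4}
f2 = {-1: 3, 2: 5}
conv = convolve(f1, f2)

print(conv == convolve(f2, f1))                    # commutativity (Thm. 8)
print(norm1(conv) <= norm1(f1) * norm1(f2))        # norm estimate (Thm. 5)
print(shift(conv, 2) == convolve(shift(f1, 2), f2))  # shifts commute (Thm. 12)
print(set(conv) <= {x + y for x in f1 for y in f2})  # supports (Lemma 14)
```

Integer values are used so that all comparisons are exact; with floating point values the equalities would have to be checked up to a tolerance.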
Exercise 15
Suppose that the function f1 is compactly supported and k times continuously differentiable on ℝ, and that the function f2 belongs to L1(ℝ). Prove that the convolution f1*f2 has continuous derivatives up to order k.
[Hint: Express the derivative d/dx as the limit of operators (T(h)−I)/h when h→ 0 and use Thm. 12.]
15.2 Characters of Commutative Groups
Our purpose is to map the commutative algebra of convolutions to a
commutative algebra of functions with point-wise multiplication. To
this end we first represent elements of the group as operators of
multiplication.
Definition 16
A character χ: G→T is a continuous homomorphism of an abelian topological group G to the group T of unimodular complex numbers under multiplication:
\[ \chi(x+y) = \chi(x)\,\chi(y), \qquad x, y \in G. \]
Note, that a character is an eigenfunction for a shift operator
T(a) with the eigenvalue χ(a). Furthermore, if a function
f on G is an eigenfunction for all shift operators T(a),
a∈ G then the collection of respective eigenvalues
λ(a) is a homomorphism of G to ℂ and
f(a)=α λ(a) for some
α∈ℂ. Moreover, if T(a) act by isometries on
the space containing f(a) then λ(a) is a homomorphism to
T.
Lemma 17
The product of two characters of a group is again a character of the
group. If χ is a character of G then
χ^{−1}=χ̄ (the complex conjugate of χ) is a character as well.
Proof.
Let χ1 and χ2 be characters of G. Then:
\[ \chi_1(g+h)\chi_2(g+h) = \chi_1(g)\chi_1(h)\chi_2(g)\chi_2(h) = (\chi_1(g)\chi_2(g))(\chi_1(h)\chi_2(h)) \in T. \]
□
Definition 18
The dual group Ĝ is the collection of all characters of G
with the operation of multiplication.
The dual group becomes a topological group with the topology of uniform
convergence on compacts: χn→χ if for any compact subset
K⊂ G and any ε>0 there is
N∈ℕ such that
| χn(x)−χ(x) |<ε for all x∈ K and n>N.
Exercise 19
Check that
-
The sequence fn(x)=x^n does not converge uniformly on
compacts if considered on [0,1]. However it does converge
uniformly on compacts if considered on (0,1).
- If X is a compact set then the topology of uniform
convergence on compacts and the topology of uniform convergence on
X coincide.
Example 20
If G=ℤ then any character χ is defined by its value χ(1), since
\[ \chi(n) = [\chi(1)]^n, \qquad n \in \mathbb{Z}. \tag{84} \]
Since χ(1) can be any number on T we see that ℤ̂ is parametrised by T.
Theorem 21
The group ℤ̂ is isomorphic to T.
Proof.
The correspondence from the above example is a group homomorphism. Indeed if χ_z is the character with χ_z(1)=z, then χ_{z1}χ_{z2}=χ_{z1 z2}. Since ℤ is discrete, every compact consists of a finite number of points, thus uniform convergence on compacts means point-wise convergence. The equation (84) shows that χ_{zn}→ χ_z if and only if χ_{zn}(1)→ χ_z(1), that is zn→z.
□
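The parametrisation of the characters of ℤ by points of T is easy to try out numerically. The sketch below is an illustration under the assumption χ_z(n)=z^n from Example 20; the helper names `character` and `close` and the sample points `z1`, `z2` are hypothetical.

```python
import cmath

def character(z):
    """Character of Z determined by chi(1) = z, assuming |z| = 1."""
    return lambda n: z ** n

def close(a, b, tol=1e-12):
    """Equality of complex numbers up to rounding errors."""
    return abs(a - b) < tol

z1 = cmath.exp(2j * cmath.pi * 0.3)
z2 = cmath.exp(2j * cmath.pi * 0.45)
chi1, chi2, chi12 = character(z1), character(z2), character(z1 * z2)

print(close(chi1(5 + (-3)), chi1(5) * chi1(-3)))   # homomorphism property
print(close(abs(chi1(7)), 1.0))                    # values lie on T
print(close(chi1(4) * chi2(4), chi12(4)))          # product of characters
print(close(1 / chi1(3), chi1(3).conjugate()))     # inverse is the conjugate
```

The third line is the identity χ_{z1}χ_{z2}=χ_{z1 z2} used in the proof of Thm. 21, and the last one is the second claim of Lemma 17.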
Theorem 22
The group T̂ is isomorphic to ℤ.
Proof.
For every n∈ℤ define a character of T by the identity
\[ \chi_n(z) = z^n, \qquad z \in T. \tag{85} \]
We will show in Cor. 26 that these are the only characters.
The isomorphism property is easy to establish. The topological
isomorphism follows from discreteness of T̂. Indeed, due to compactness of T, for n≠m:
\[ \max_{z \in T} |\chi_n(z) - \chi_m(z)|^2 = \max_{z \in T} |1 - z^{m-n}|^2 = 2^2 = 4. \]
Thus, any convergent sequence (nk) has to be constant for sufficiently
large k, which corresponds to the discrete topology on ℤ.
□
The last two Theorems are an illustration of the following general statement.
In particular, the Pontryagin’s duality tells that the collection of all
characters contains enough information to rebuild the initial group.
Theorem 25
The group ℝ̂ is isomorphic to ℝ.
Proof.
For λ∈ℝ define a character χ_λ∈ℝ̂ by the identity
\[ \chi_\lambda(x) = e^{2\pi i \lambda x}, \qquad x \in \mathbb{R}. \tag{87} \]
Moreover any smooth character of the group G=(ℝ, +) has the form (87). Indeed, let χ be a smooth character of ℝ. Put c=χ′(t)|_{t=0}∈ ℂ. Then χ′(t)=cχ(t) and χ(t)=e^{ct}. We also get c∈iℝ and any such c defines a character. Then the multiplication of characters is: χ1(t)χ2(t)=e^{c1 t}e^{c2 t}=e^{(c2+c1)t}. So we have a
group isomorphism.
For a generic character we can first apply the smoothing
technique and reduce to the above case.
Let us show the topological homeomorphism. If λn→
λ then χ_{λn}→ χ_λ
uniformly on any compact in ℝ from the explicit
formula of the character. Conversely, let
χ_{λn}→ χ_λ
uniformly on any interval. Then
χ_{λn−λ}(x)→ 1 uniformly on any
compact, in particular, on
[0,1]. But
Thus λn→ λ.
□
Corollary 26
Any character of the group T has the form (85).
Proof.
Let χ∈T̂, consider χ1(t)=χ(e^{2πit}) which is a character of ℝ. Thus
χ1(t)=e^{2πiλt} for some
λ∈ℝ. Since χ1(1)=1 then
λ=n∈ℤ. Thus χ1(t)=e^{2πint}, that is χ(z)=z^n for z=e^{2πit}.
□
We can unify the previous three Theorems into the following statement.
Theorem 28
Let G=ℝ^n× ℤ^k× T^l be the direct product of groups. Then the dual group is Ĝ=ℝ^n× T^k× ℤ^l.
15.3 Fourier Transform on Commutative Groups
Definition 29
Let G be a locally compact commutative group with an invariant measure µ. For any f∈L1(G) define the Fourier transform f̂ by
\[ \hat{f}(\chi) = \int_G f(x) \, \overline{\chi(x)} \, d\mu(x), \qquad \chi \in \hat{G}. \tag{88} \]
That is, the Fourier transform f̂ is a function on the dual
group Ĝ.
Example 30
- If G=ℤ, then f∈L1(ℤ) is a
two-sided summable sequence
(cn)n∈ℤ. Its Fourier transform is the
function f̂(z)=∑_{n=−∞}^{∞} cn z^n on
T. Sometimes f̂(z) is called the generating
function of the sequence (cn).
- If G=T, then the Fourier transform of
f∈L1(T) is its Fourier
coefficients, see
Section 5.1.
- If G=ℝ, the Fourier transform is also the
function on ℝ given by the Fourier integral:
\[ \hat{f}(\lambda) = \int_{\mathbb{R}} f(x) \, e^{-2\pi i \lambda x} \, dx. \tag{89} \]
The important properties of the Fourier transform are captured in the
following statement.
Theorem 31
Let G be a locally compact commutative group with an invariant measure µ. The Fourier transform maps functions from L1(G) to continuous bounded functions on Ĝ. Moreover, a convolution is transformed to point-wise multiplication:
\[ (f_1 * f_2)\hat{\ }(\chi) = \hat{f}_1(\chi) \cdot \hat{f}_2(\chi), \tag{90} \]
a shift operator T(a), a∈G, is transformed to multiplication by the character fa∈Ĝ:
\[ (T(a)f)\hat{\ }(\chi) = f_a(\chi) \cdot \hat{f}(\chi), \qquad f_a(\chi) = \chi(a), \tag{91} \]
and multiplication by a character χ∈Ĝ is transformed to the shift T(χ^{−1}):
\[ (\chi \cdot f)\hat{\ }(\chi_1) = T(\chi^{-1})\hat{f}(\chi_1) = \hat{f}(\chi^{-1}\chi_1). \tag{92} \]
Proof.
Let f∈L1(G). For any ε>0 there is a compact K⊂G such that ∫_{G∖K} |f| dµ<ε. If χn→ χ in Ĝ, then we have the uniform convergence of χn→ χ on K, so there is n(ε) such that for k>n(ε) we have |χk(x)−χ(x)| <ε for all x∈K. Then
\[ \begin{aligned} |\hat{f}(\chi_k) - \hat{f}(\chi)| &\le \int_K |f(x)| \, |\chi_k(x) - \chi(x)| \, d\mu(x) + \int_{G\setminus K} |f(x)| \, |\chi_k(x) - \chi(x)| \, d\mu(x) \\ &\le \varepsilon \|f\| + 2\varepsilon. \end{aligned} \]
Thus f̂ is continuous. Its boundedness follows from the
integral estimate |f̂(χ)|≤||f||. Algebraic
maps (90)–(92) can be
obtained by changes of variables under integration. For example,
using Fubini’s Thm. 50 and invariance of the measure:
\[ \begin{aligned} (f_1 * f_2)\hat{\ }(\chi) &= \int_G \int_G f_1(s) \, f_2(t-s) \, ds \, \overline{\chi(t)} \, dt \\ &= \int_G \int_G f_1(s) \, \overline{\chi(s)} \, f_2(t-s) \, \overline{\chi(t-s)} \, ds \, dt \\ &= \hat{f}_1(\chi) \hat{f}_2(\chi). \end{aligned} \]
□
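The identity (90) can be verified directly for G=ℤ, where the Fourier transform is the generating function from Example 30. The sketch below is an illustration under these assumptions: finitely supported sequences are stored as dictionaries, the convention f̂(z)=∑ c_n z^n of Example 30 is used, and the helper names `convolve`, `norm1` and `fourier` are hypothetical.

```python
import cmath

def convolve(f1, f2):
    """Convolution of finitely supported sequences on Z (counting measure)."""
    out = {}
    for x1, a in f1.items():
        for x2, b in f2.items():
            out[x1 + x2] = out.get(x1 + x2, 0) + a * b
    return out

def norm1(f):
    """L1 norm of a finitely supported sequence."""
    return sum(abs(v) for v in f.values())

def fourier(f):
    """Generating function z -> sum of c_n z^n, evaluated for z on the unit circle."""
    return lambda z: sum(c * z ** n for n, c in f.items())

f1 = {0: 1, 1: -2, 3: 4}
f2 = {-1: 3, 2: 5}
z = cmath.exp(2j * cmath.pi * 0.17)   # an arbitrary point of T

lhs = fourier(convolve(f1, f2))(z)
rhs = fourier(f1)(z) * fourier(f2)(z)
print(abs(lhs - rhs) < 1e-9, abs(fourier(f1)(z)) <= norm1(f1))
```

The second printed value illustrates the boundedness |f̂|≤||f|| used in the proof above.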
15.4 The Schwartz space of smooth rapidly decreasing functions
We say that a function f is rapidly decreasing if
lim_{x→ ±∞} | x^k f(x) |=0 for any
k∈ℕ.
Definition 32
The Schwartz space, denoted by S, or the space of rapidly decreasing functions on ℝ, is the space of infinitely differentiable functions such that:
\[ S = \Big\{ f \in C^\infty(\mathbb{R}) : \sup_{x \in \mathbb{R}} \big| x^\alpha f^{(\beta)}(x) \big| < \infty \quad \forall \alpha, \beta \in \mathbb{N} \Big\}. \tag{93} \]
Example 33
An example of a rapidly decreasing function is the Gaussian e^{−πx²}.
It is worth noticing that S⊂
Lp(ℝ) for any 1≤p<∞. Moreover,
S is dense in Lp(ℝ); for p=1 this can
be shown in the following steps (other values of p can be done
similarly but require some
more care).
First we will show that S is an ideal of the
convolution algebra L1(ℝ).
Exercise 34
For any g∈S and f∈L1(ℝ) with compact support their convolution f*g belongs to S. [Hint: smoothness follows from Ex. 15.]
Define the family of functions gt(x) for t>0 in S by
scaling the Gaussian:
Exercise 35
Show that gt(
x)
satisfies the following properties, cf. Lem 7:
-
gt(x)>0 for all x∈ℝ and t>0.
- ∫ℝ gt(x) d x=1 for all
t>0. [Hint: use the table integral ∫ℝ e−π
x2 d x=1.]
- For any ε>0 and
any δ>0 there exists T>0 such that for all positive
t< T we have:
It is easy to see that the above
properties 1–3
are not unique to the Gaussian and a wide class of functions have them. Such a
family of functions is known as an approximation of the
identity [] due to the next
property (94).
Exercise 36
-
Let f be a continuous function with compact support, then
\[ \lim_{t \to 0} \| f - g_t * f \|_1 = 0. \tag{94} \]
[Hint: use the proof of Thm. 8.]
- The Schwartz space S is dense in
L1(ℝ). [Hint: use
Prop. 20, Ex. 34 and
(94).]
15.5 Fourier Integral
We recall the formula (89):
Definition 37
We define the Fourier integral of a function f∈L1(ℝ) by
\[ \hat{f}(\lambda) = \int_{\mathbb{R}} f(x) \, e^{-2\pi i \lambda x} \, dx. \tag{95} \]
We already know that f̂ is a bounded continuous function on
ℝ, a further property is:
Lemma 38
If a sequence of functions (fn)⊂L1(ℝ) converges in the metric of L1(ℝ), then the sequence (f̂n) converges uniformly on the real line.
Proof.
This follows from the estimate:
\[ \big| \hat{f}_n(\lambda) - \hat{f}_m(\lambda) \big| \le \int_{\mathbb{R}} |f_n(x) - f_m(x)| \, dx. \]
□
Lemma 39
The Fourier integral f̂ of f∈L1(ℝ) has zero limits at −∞ and +∞.
Proof.
Take f the indicator function of [a,b]. Then
\[ \hat{f}(\lambda) = \frac{1}{-2\pi i \lambda} \big( e^{-2\pi i \lambda b} - e^{-2\pi i \lambda a} \big), \qquad \lambda \neq 0. \]
Thus lim_{λ→±∞} f̂(λ)=0. By continuity from the previous
Lemma this can be extended to the closure of step functions, which is the
space L1(ℝ) by Lem. 17.
□
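The first step of this proof, the Fourier integral of an indicator function, can be checked numerically. The Python sketch below is an illustration only: the closed form is obtained by integrating e^{−2πiλx} over [a,b], the midpoint rule is an assumed choice of quadrature, and the function names are hypothetical.

```python
import cmath
import math

def indicator_transform(a, b, lam):
    """Closed form of the Fourier integral of the indicator of [a, b], lam != 0."""
    return (cmath.exp(-2j * math.pi * lam * a)
            - cmath.exp(-2j * math.pi * lam * b)) / (2j * math.pi * lam)

def midpoint_transform(a, b, lam, n=20000):
    """The same integral approximated by the midpoint rule with n subintervals."""
    h = (b - a) / n
    return h * sum(cmath.exp(-2j * math.pi * lam * (a + (k + 0.5) * h))
                   for k in range(n))

# The closed form agrees with the quadrature ...
print(abs(indicator_transform(0.0, 1.0, 2.5) - midpoint_transform(0.0, 1.0, 2.5)))
# ... and it is bounded by 1/(pi*|lam|), hence tends to zero at infinity.
for lam in (1.5, 10.5, 100.5):
    print(lam, abs(indicator_transform(0.0, 1.0, lam)), 1 / (math.pi * lam))
```

The bound 1/(π|λ|) follows because the numerator of the closed form has modulus at most 2.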
Lemma 40
If f is absolutely continuous on every interval and f′∈L1(ℝ), then
\[ (f')\hat{\ }(\lambda) = 2\pi i \lambda \, \hat{f}(\lambda). \]
More generally:
\[ (f^{(k)})\hat{\ }(\lambda) = (2\pi i \lambda)^k \, \hat{f}(\lambda). \tag{96} \]
Proof.
A direct demonstration is based on integration by parts, which is
possible because of the assumptions of the Lemma.
It may also be interesting to mention that the operation of
differentiation D can be expressed through the shift operator
Ta:
\[ D = \lim_{\Delta t \to 0} \frac{T_{\Delta t} - I}{\Delta t}. \tag{97} \]
By the formula (91), the Fourier integral
transforms (1/Δt)(T_{Δt}− I) into
(1/Δt)(χ_λ(Δt)− 1).
Provided we can justify that the Fourier integral commutes with the
limit, the last operation is multiplication by χ′_λ(0)=2πiλ.
□
Corollary 41
If f^{(k)}∈L1(ℝ) then
that is, f̂ decreases at infinity faster than |λ|^{−k}.
Lemma 42
If f(x) and xf(x) are both in L1(ℝ), then f̂ is differentiable and
\[ \hat{f}' = (-2\pi i x f)\hat{\ }. \]
More generally
\[ \hat{f}^{(k)} = ((-2\pi i x)^k f)\hat{\ }. \tag{98} \]
Proof.
There are several strategies to prove this result, all having their
own merits:
- The most straightforward uses the differentiation under the
integration sign.
- We can use the intertwining property (92)
of the Fourier integral and the connection of derivative with
shifts (97).
- Using the inverse Fourier integral (see below), we regard this
Lemma as the dual to the Lemma 40.
□
Corollary 43
The Fourier transform of a smooth rapidly decreasing function is a
smooth rapidly decreasing function.
Corollary 44
The Fourier integral of the Gaussian e^{−πx²} is e^{−πλ²}.
Proof. []
Note that the Gaussian g(x)=e^{−πx²} is the unique (up to a factor) solution of the equation g′+2πx g=0. Then, by Lemmas 40 and 42, its Fourier transform satisfies the equation 2πiλ ĝ+iĝ′=0. Thus, ĝ=c·e^{−πλ²} with a constant factor c; its value 1 can be found from the classical integral ∫ℝ e^{−πx²} dx=1, which represents ĝ(0).
□
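Corollary 44 is easy to confirm numerically. The sketch below is an illustration and not the method of the proof: it approximates the Fourier integral (95) by the trapezoid rule on a truncated interval, where the cut-off `L`, the number of steps `n` and the function names are assumptions made for this example.

```python
import cmath
import math

def gaussian(x):
    """The Gaussian exp(-pi x^2)."""
    return math.exp(-math.pi * x * x)

def fourier_integral(f, lam, L=8.0, n=1600):
    """Trapezoid approximation of the integral of f(x) exp(-2 pi i lam x) over [-L, L]."""
    h = 2 * L / n
    total = 0j
    for k in range(n + 1):
        x = -L + k * h
        weight = 0.5 if k in (0, n) else 1.0   # end points count with weight 1/2
        total += weight * f(x) * cmath.exp(-2j * math.pi * lam * x)
    return total * h

for lam in (0.0, 0.5, 1.3):
    approx = fourier_integral(gaussian, lam)
    print(lam, abs(approx - math.exp(-math.pi * lam * lam)))
```

The truncation is harmless here because the Gaussian is far below machine precision outside [−8,8].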
The relations (96)
and (98) allow us to reduce many partial
differential equations to algebraic ones, see
§ 0.2 and 5.4. To convert
solutions of algebraic equations into solutions of the required differential equations
we need the inverse of the Fourier transform.
Definition 45
We define the inverse Fourier transform on L1(ℝ):
\[ \check{f}(\lambda) = \int_{\mathbb{R}} f(x) \, e^{2\pi i \lambda x} \, dx. \tag{99} \]
We can notice the formal correspondence
\[ \check{f}(\lambda) = \hat{f}(-\lambda) = \overline{\hat{\bar{f}}(\lambda)}, \]
which is a manifestation of the group duality
ℝ̂=ℝ for the real line. This immediately
generates analogous results from Lem. 38 to
Cor. 44 for the inverse Fourier transform.
Theorem 46
The Fourier integral and the inverse Fourier transform are inverse maps. That is, if g=f̂ then f=ǧ.
Proof.[Sketch of a proof]
The exact meaning of the statement depends on the spaces which we
consider as the domain and the range. Various variants and their
proofs can be found in the literature. For example,
in [, § IV.2.3], it is proven for the Schwartz space
S of smooth rapidly decreasing functions.
The outline of the proof is as follows. Using the
intertwining relations (96)
and (98), we conclude that the composition of the Fourier
integral and the inverse Fourier transform commutes both with the
operator of multiplication by x and with differentiation. Then we
need a result that any operator commuting with multiplication by
x is an operator of multiplication by a function
f. For this function, the
commutation with differentiation implies f′=0, that is
f=const. The value of this constant can be evaluated by
a Fourier transform on a single function, say the Gaussian e^{−πx²} from Cor. 44.
□
The above Theorem states that the Fourier integral is an invertible map. For
the Hilbert space L2(ℝ) we can show a
stronger property—its unitarity.
Theorem 47 (Plancherel identity)
The Fourier transform extends uniquely to a unitary map L2(ℝ)→L2(ℝ):
\[ \int_{\mathbb{R}} |f|^2 \, dx = \int_{\mathbb{R}} |\hat{f}|^2 \, d\lambda. \tag{100} \]
Proof.
The proof will be done in three steps: first we establish the
identity for smooth rapidly decreasing functions, then for
L2 functions with compact support and finally for any
L2 function.
- Take f1 and f2∈S to be smooth rapidly
decreasing functions and g1 and g2 be their Fourier
transforms. Then (using Fubini’s Thm. 50):
\[ \begin{aligned} \int_{\mathbb{R}} f_1(t) \, \overline{f_2(t)} \, dt &= \int_{\mathbb{R}} \int_{\mathbb{R}} g_1(\lambda) \, e^{2\pi i \lambda t} \, d\lambda \, \overline{f_2(t)} \, dt \\ &= \int_{\mathbb{R}} g_1(\lambda) \int_{\mathbb{R}} e^{2\pi i \lambda t} \, \overline{f_2(t)} \, dt \, d\lambda \\ &= \int_{\mathbb{R}} g_1(\lambda) \, \overline{g_2(\lambda)} \, d\lambda. \end{aligned} \]
Putting f1=f2=f (and therefore g1=g2=f̂) we get the
identity ∫| f |² d x=∫| f̂ |² d λ. The same identity (100) can be obtained from the
property (f1f2)^=f̂1*f̂2,
cf. (90), or explicitly:
\[ \int_{\mathbb{R}} f_1(x) \, f_2(x) \, e^{-2\pi i \lambda x} \, dx = \int_{\mathbb{R}} \hat{f}_1(t) \, \hat{f}_2(\lambda - t) \, dt. \]
Now, substitute λ=0 and f2=f̄1 (with its
corollary that f̂2(t) is the complex conjugate of f̂1(−t)) and
obtain (100).
- Next let f∈L2(ℝ) with a support in
(−a,a); then f∈L1(ℝ) as well, thus
the Fourier transform is well-defined. Let fn∈S
be a sequence with support in (−a,a) which converges to f
in L2 and thus in L1. The Fourier
transforms gn converge to g=f̂ uniformly and form a Cauchy
sequence in L2 due to the above identity. Thus
gn→ g in L2 and we can extend the
Plancherel identity by continuity to L2 functions
with compact support.
- The final bit is done for a general
f∈L2 using the sequence fn
of truncations of f to the interval (−n,n). For fn the
Plancherel identity is established above, and fn→ f
in L2(ℝ). We also build their Fourier
images gn and see that this is a Cauchy sequence in
L2(ℝ), so gn→ g.
If f∈L1∩L2 then the above g coincides with the ordinary Fourier transform on L1.
□
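The Plancherel identity (100) can also be observed numerically for a concrete Schwartz function. The following sketch is an illustration under assumptions not made in the notes: the test function f(x)=x e^{−πx²}, the trapezoid rule on truncated intervals, and all helper names and grid sizes are choices made for this example.

```python
import cmath
import math

def f(x):
    """A sample Schwartz function (assumed for this illustration)."""
    return x * math.exp(-math.pi * x * x)

def trapezoid(values, h):
    """Trapezoid rule for equally spaced samples with step h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def fourier_integral(func, lam, L=6.0, n=600):
    """Approximation of the integral of func(x) exp(-2 pi i lam x) over [-L, L]."""
    h = 2 * L / n
    xs = [-L + k * h for k in range(n + 1)]
    return trapezoid([func(x) * cmath.exp(-2j * math.pi * lam * x) for x in xs], h)

# Left-hand side of (100): integral of |f|^2 over x.
L, n = 6.0, 600
h = 2 * L / n
norm_f = trapezoid([abs(f(-L + k * h)) ** 2 for k in range(n + 1)], h)

# Right-hand side of (100): integral of |f^|^2 over lambda.
M, m = 4.0, 160
step = 2 * M / m
lams = [-M + j * step for j in range(m + 1)]
norm_fhat = trapezoid([abs(fourier_integral(f, lam)) ** 2 for lam in lams], step)

print(norm_f, norm_fhat)
```

For this particular f both integrals can be computed by hand and equal 1/(4√2 π), which gives an independent check of the quadrature.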
We note that the Plancherel identity and Parseval’s identity (30) are
cousins: they both state that the Fourier transform
L2(G)→ L2(Ĝ) is an isometry
for G=ℝ and G=T respectively. They may be
combined to state the unitarity of the Fourier transform on
L2(G) for the group G=ℝ^n×
ℤ^k× T^l,
cf. Thm. 28.
Proofs of the following statements are not examinable:
Thms. 23, 36,
53, 33, 46,
Props. 14, 20.
16 Advances of Metric Spaces
16.1 The Stone–Weierstrass Theorem
Density in metric spaces is an important concept since it allows us to approximate any element by an element from the dense set. Furthermore, we can extend a uniformly continuous function from a dense subset by continuity, see Ex. 62. Thus, it is convenient to have some supply of manageable dense subsets of common metric spaces.
A famous case of density is the Theorem of Stone–Weierstrass, which in the original and best known form says that any continuous function
on a compact interval can be uniformly approximated by a sequence of polynomials. Polynomials have many nice properties which make this dense subset particularly useful: easy computation, differentiation and integration, etc. Yet, we will prove here a more general version of the Stone–Weierstrass theorem that applies to general compact metric spaces.
Theorem 1 (Stone–Weierstrass)
Suppose that X is a compact metric space and let C(X,ℝ) be the Banach space of real valued continuous functions on X with norm || · ||∞. Suppose that A ⊂ C(X,ℝ) is a unital subalgebra of C(X,ℝ), i.e.
- A is a linear subspace,
- 1 ∈ A,
- A · A ⊂ A, or in other words f,g ∈ A implies that also f · g ∈ A.
Suppose furthermore that A separates points, i.e. for any two x, y ∈ X with x ≠ y there exists a function f ∈ A such that f(x) ≠ f(y). Then, A is dense in C(X,ℝ).
This is an interesting-sounding theorem: it states that a subset of C(X,ℝ) which is closed under algebraic operations
and separates points automatically has a topological property: it is dense. Its consequences are striking.
Before we prove this theorem let’s look at some of them.
Corollary 2 [Weierstrass approximation theorem]
The space of polynomials ℝ[x] is dense in C([a,b],ℝ) for any compact interval [a,b]
in the || · ||∞ norm.
In other words, any continuous function on [a,b] can be approximated with arbitrary accuracy
by a polynomial.
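Corollary 2 only asserts that approximating polynomials exist. One classical explicit choice, which is a different and constructive route and not the proof given in these notes, is the sequence of Bernstein polynomials B_n f(x)=∑_{k=0}^{n} f(k/n) C(n,k) x^k (1−x)^{n−k} on [0,1] (a general interval [a,b] reduces to [0,1] by an affine change of variable). The sketch below is an illustration; the sample function |x−1/2| and the helper names are assumptions made here.

```python
from math import comb

def bernstein(f, n):
    """The n-th Bernstein polynomial of f on [0, 1], returned as a function."""
    def poly(x):
        return sum(f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k)
                   for k in range(n + 1))
    return poly

def sup_error(f, p, samples=201):
    """Maximum of |f - p| over an equally spaced grid on [0, 1]."""
    return max(abs(f(i / (samples - 1)) - p(i / (samples - 1)))
               for i in range(samples))

def f(x):
    """A continuous, non-differentiable sample function."""
    return abs(x - 0.5)

err20 = sup_error(f, bernstein(f, 20))
err200 = sup_error(f, bernstein(f, 200))
print(err20, err200)
```

The error measured on the grid decreases as the degree grows, although slowly for this non-smooth function; the grid maximum is only a lower bound for the true uniform error.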
Corollary 3
The space of polynomials ℝ[x1,…,xn]
is dense in C(K,ℝ) for any compact subset K of ℝn in the || · ||∞ norm.
This is the higher dimensional version of the above theorem and states that a continuous
function of n variables can be approximated by polynomials in n variables.
Corollary 4
Let C(S¹,ℝ) be the space of continuous functions on the unit circle, or, equivalently, the space of 2π-periodic real valued functions on ℝ. Then the finite linear span
of the set
is dense in C(S¹,ℝ).
The Stone–Weierstrass theorem is actually a consequence of the following theorem
by Stone. This is a good illustration of the inventor’s paradox stated by Polya [].
Here is some notation first for two functions f and g:
\[ f \wedge g = \min\{f,g\}, \qquad f \vee g = \max\{f,g\}. \]
Note that if f and g are continuous, then so are f ∧ g and f ∨ g (demonstrate this!).
Theorem 5 (Stone’s Theorem)
Let X be a compact metric space and suppose that there is a subset A of C(X,ℝ) such that
- A is closed under the operations ∧ and ∨, this means
f,g ∈ A implies f ∧ g ∈ A and f ∨ g ∈ A.
- for any pair of points x ≠y and numbers a,b ∈ ℝ there is a function
f ∈ A such that f(x)=a and f(y)=b.
Then, A is dense in C(X,ℝ) in the topology induced by the norm || · ||∞ (the uniform topology).
Proof.
We need to prove that any function $g \in C(X,\mathbb{R})$ can be approximated by elements of $A$. For each two points $x$, $y$ choose a function $f_{x,y} \in A$ such that $f_{x,y}(x) = g(x)$ and $f_{x,y}(y) = g(y)$. Such a function exists by our hypothesis for every pair of points. Now, for an $\varepsilon > 0$ the sets
\[ O_{x,y} = \{ z \in X \mid f_{x,y}(z) < g(z) + \varepsilon \} \]
are open and form a cover of $X$ even if we fix $x$. This is because $O_{x,y}$ contains both $x$ and $y$ together with some neighbourhoods of these points. Now find a finite subcover for each fixed $x$. That is, there are finitely many points $y_1, \ldots, y_n$ such that the sets $O_{x,y_i}$ form an open cover. Now define the function
\[ f_x = f_{x,y_1} \wedge f_{x,y_2} \wedge \ldots \wedge f_{x,y_n}. \]
By hypothesis $f_x$ is in $A$ for any $x \in X$ and it has the property that
\[ f_x(z) < g(z) + \varepsilon, \]
but now for all $z$. Moreover, $f_x(x) = g(x)$.
Again, the sets
\[ O_x = \{ z \in X \mid f_x(z) > g(z) - \varepsilon \} \]
make an open cover and therefore there is a finite subcover. This means there are finitely many points $x_1, \ldots, x_k$ such that the sets $O_{x_i}$ form an open cover of $X$. Now the function
\[ f = f_{x_1} \vee f_{x_2} \vee \ldots \vee f_{x_k} \]
is in $A$ and satisfies
\[ g(x) - \varepsilon < f(x) < g(x) + \varepsilon \]
for all $x$, or in other words $\| f - g \|_\infty < \varepsilon$.
□
It may not be obvious why the conditions of Stone’s theorem 5 are more general than those of Thm. 1. This will be seen from the following proof. We employ what Polya called a leading particular case [, § 4.4]: we show that polynomials on [0,1] approximate the particular function √x and then reduce the general situation to it.
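Proof. [Sketch of proof of the Stone–Weierstrass theorem 1.]
First we observe that if $B$ is the closure in $C(X,\mathbb{R})$ of $A$ from Thm. 1, then $B$ will also be a unital point separating subalgebra of $C(X,\mathbb{R})$ (exercise!).
Step 1: If $f$ is non-negative and in $B$, so is $\sqrt{f}$.
To see this note that it is enough to show this for $0 \le f < 1$ because in case $f \neq 0$ we can compute $\sqrt{f}$ by
\[ \sqrt{f} = \sqrt{c}\, \sqrt{f/c} \quad \text{for any constant } c > \| f \|_\infty . \]
Now the Taylor series for $\sqrt{1-x}$,
\[ \sqrt{1-x} = \sum_{k=0}^{\infty} a_k x^k = 1 - \frac{1}{2} x - \frac{1}{8} x^2 - \frac{1}{16} x^3 - \frac{5}{128} x^4 - \frac{7}{256} x^5 - \ldots, \]
converges absolutely and uniformly on any interval $[0, 1-\delta]$ with $0 < \delta < 1$. Therefore, for $\delta > 0$ so small that $f + \delta \le 1$, the series
\[ \sum_{k=0}^{\infty} a_k (1 - f - \delta)^k \]
converges in the Banach space $B$ because all the partial sums are actually in $B$ as $B$ is a subalgebra, and $B$ is closed. The limit of this sequence is, of course, $\sqrt{f+\delta}$. If we let $\delta$ go to zero, we can see that also $\sqrt{f} \in B$. This works because
\[ \left| \sqrt{f+\delta} - \sqrt{f} \right| = \delta \left( \sqrt{f+\delta} + \sqrt{f} \right)^{-1} \le \sqrt{\delta} \]
so that the approximation is uniform.
Step 2:
Since $|f| = \sqrt{f^2}$ we have that $f \in B$ implies $|f| \in B$. Since moreover,
\[ f \vee g = \frac{f+g}{2} + \frac{|f-g|}{2}, \qquad f \wedge g = \frac{f+g}{2} - \frac{|f-g|}{2}, \]
we conclude from this that $B$ is closed under the operations $\wedge$ and $\vee$.
Step 3:
Assume that $x \neq y$ are points in $X$ and assume that $a, b$ are real numbers. Then, by assumption there is an element $f$ in $B$ such that
\[ f(x) \neq f(y). \]
Since $B$ is a subspace that contains the constant functions, the function
\[ g(z) = a + (b-a)\, \frac{f(z) - f(x)}{f(y) - f(x)} \]
is also in $B$ and it satisfies $g(x)=a$ and $g(y)=b$.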
Final Step: As we can see all the conditions of Stone’s theorem are satisfied and therefore B
is dense in C(X,ℝ). Since B is closed in C(X,ℝ) this means that B=C(X,ℝ). Thus, A is dense in C(X,ℝ).
□
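Step 1 of the sketch can be checked numerically for a single value of f. The following is a small illustration under stated assumptions: the helper names are hypothetical, the coefficients a_k are generated from the ratio a_k / a_{k−1} = (2k − 3)/(2k) of the binomial series for √(1−x), and the sample values 0.3 and δ = 0.01 are arbitrary.

```python
import math

def sqrt_series_coefficients(n_terms):
    """Taylor coefficients a_0, ..., a_{n_terms-1} of sqrt(1 - x) at x = 0."""
    coeffs = [1.0]
    for k in range(1, n_terms):
        # consecutive binomial-series coefficients satisfy a_k / a_{k-1} = (2k - 3) / (2k)
        coeffs.append(coeffs[-1] * (2 * k - 3) / (2 * k))
    return coeffs

def sqrt_via_series(value, delta, n_terms):
    """Partial sum of the series in powers of (1 - value - delta); approximates sqrt(value + delta)."""
    x = 1.0 - value - delta
    return sum(a * x**k for k, a in enumerate(sqrt_series_coefficients(n_terms)))

approx = sqrt_via_series(0.3, 0.01, 200)
gap_to_sqrt_f = abs(approx - math.sqrt(0.3))  # should not exceed sqrt(delta)
```

The partial sums only use sums and products, which is why the corresponding functions stay inside the subalgebra B.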
We can extend the result from real scalars to complex ones through the identities ℜ z = ½(z + z̄) and ℑ z = (z − z̄)/(2i), where z̄ denotes the complex conjugate of z.
Corollary 6 [Stone–Weierstrass (complex version)]
Suppose that X is a compact metric space and let C(X,ℂ) be the complex Banach space of complex valued continuous functions on X with norm || · ||∞. Suppose that A ⊂ C(X,ℂ) is a unital *-subalgebra of C(X,ℂ), i.e.
- A is a linear subspace,
- 1 ∈ A,
- A · A ⊂ A, or in other words f,g ∈ A implies that also f · g ∈ A.
- if f ∈ A then also the complex conjugate function f̄ ∈ A.
Suppose furthermore that A separates points, i.e. for any two x, y ∈ X with x ≠ y there exists a function f ∈ A such that f(x) ≠ f(y). Then, A is dense in C(X,ℂ).
Using this complex version of the theorem (or simply the Euler identity ei φ = cosφ +i sinφ) we obtain the complex version of Cor. 4:
Corollary 7
The linear span of the set { ei m ϕ ∣ m ∈ ℤ}
is dense in C(S1,ℂ).
Note that we need both positive and negative values of m in ei m ϕ, the set { ei m ϕ ∣ m ∈ ℕ0} is not dense in C(S1,ℂ).
The Stone–Weierstrass theorem also says something about the separability of certain Banach spaces.
Remember what it means for a topological space to be separable.
Definition 8 (Separable Metric Space)
A metric space X is called separable if there exists a countable dense subset of X.
In other words, a separable metric space consists of accumulation points of a single sequence.
Suppose K ⊂ ℝn is compact.
Then the space of polynomials with real coefficients and n-variables
is dense in the space of continuous functions C(K).
Of course every polynomial with real coefficients may be approximated by one with rational
coefficients. Thus the set of rational polynomials ℚ[x1,…,xn] is dense in C(K).
However, the space of rational polynomials is a countable set. In this way one obtains
Corollary 9
Let K be a compact subset of ℝn then the Banach space C(K) is separable.
The following statement shows that continuous functions make only a tiny fraction of all bounded functions:
Exercise 10
Let X be an infinite set, show that the space B(X) of bounded functions on X is not separable. (Hint: present a set of disjoint balls of radius 1/2 parametrised by all real numbers.)
16.2 Contraction mappings and fixed point theorems
16.2.1 The Banach fixed point theorem
Fixed point approximations are an important tool in numerical analysis, and also in constructions of solutions of differential equations.
In order to understand this, suppose that (X,d) is a metric space and f: X → X a self-map. Then a point x ∈ X is called a fixed point of f
if f(x) = x. For example the function cos defines a self-map on the interval [0,1], and by starting with x_1 = 0 and inductively computing x_{n+1} = cos x_n
one converges to the value roughly 0.739085 which is a fixed point of cos, i.e. solves the equation cos(x) = x.
Under certain conditions one can show that such sequences always converge to a fixed point. This is the statement of the Banach fixed point theorem (contraction mapping principle).
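The cosine iteration is easy to reproduce. The sketch below is only an illustration: the function name, the tolerance and the iteration cap are assumptions, not part of the notes.

```python
import math

def fixed_point_iterate(f, x, tol=1e-12, max_iter=1000):
    """Iterate x -> f(x) until two consecutive values differ by less than tol."""
    for step in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, step
        x = x_next
    raise RuntimeError("no convergence within max_iter steps")

# start from 0, as in the text
root, steps = fixed_point_iterate(math.cos, 0.0)
```

The stopping rule compares consecutive iterates, which is a practical heuristic rather than a rigorous error bound.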
Definition 11 (Contraction Mapping)
Let (X,d) be a metric space. Then a map f: X → X is called a contraction if there exists a constant C<1 such that
d(f(x), f(y)) ≤ C d(x,y) for all x, y ∈ X.
Note that any contraction is (uniformly) continuous.
Theorem 12 (Banach Fixed Point Theorem)
Suppose that f: X → X is a contraction on a complete metric space (X,d). Then f has a unique fixed point y. Moreover, for any x ∈ X the sequence (x_n) defined recursively by
x_1 = x,   x_{n+1} = f(x_n)
converges to y.
Proof.
Let us start with uniqueness. If $x$, $y$ are both fixed points in $X$, then since $f$ is a contraction:
\[ d(x,y) = d(f(x), f(y)) \le C\, d(x,y) \]
for some constant $C<1$. Hence, $d(x,y)=0$ and therefore $x=y$.
To prove the remaining claims we start with any $x$ in $X$ and we will show that the sequence $(x_n)$ defined by $x_1 = x$ and $x_{n+1} = f(x_n)$ converges.
Since $f$ is continuous the limit of $(x_n)$ must be a fixed point. Since $(X,d)$ is complete we only need to show that $(x_n)$ is Cauchy.
To see this note that
\[ d(x_{n+1}, x_n) \le C\, d(x_n, x_{n-1}) \]
and therefore inductively,
\[ d(x_{n+1}, x_n) \le C^{n-1} d(x_2, x_1). \]
By the triangle inequality we have for any $N, m > 0$
\[ d(x_{N+m}, x_N) \le \left( C^{N-1} + C^{N} + \ldots + C^{N+m-2} \right) d(x_2, x_1) \le \frac{C^{N-1}}{1-C}\, d(x_2, x_1). \]
Since $C<1$ this can be made arbitrarily small by choosing $N$ large enough.
□
Corollary 13
Suppose that $(X,d)$ is a complete metric space and $f: X \to X$ a map such that $f^n$ is a contraction for some $n \in \mathbb{N}$.
Then $f$ has a unique fixed point.
Proof.
Since $f^n$ is a contraction it has a unique fixed point $x \in X$, i.e.
\[ f^n(x) = x. \]
Now note that
\[ f^n(f(x)) = f^n \circ f (x) = f^{n+1}(x) = f \circ f^n (x) = f(f^n(x)) = f(x) \]
and therefore $f(x)$ is also a fixed point of $f^n$. By uniqueness we must have $f(x) = x$. Conversely, any fixed point of $f$ is a fixed point of $f^n$, so $f$ has no other fixed points.
□
The question arises how to show that a given map f is a contraction. In subsets of ℝ^m there is a simple criterion.
Recall that a set U ⊂ ℝ^m is called convex if for any two points x,y ∈ U the line segment { t x + (1−t) y ∣ t ∈ [0,1] } is contained in U.
Theorem 14 (Mean Value Inequality)
Suppose that $U \subset \mathbb{R}^m$ is an open set with convex closure $\overline{U}$ and let $f: \overline{U} \to \mathbb{R}^m$ be a $C^1$-function. Let $df$ be the total derivative (or Jacobian) understood as a function on $U$ with values in $m \times m$-matrices. Suppose that $\| df(x) \| \le M$ for all $x \in U$. Then $f: \overline{U} \to \mathbb{R}^m$ satisfies
\[ \| f(x) - f(y) \| \le M \| x - y \| \]
for all $x, y \in \overline{U}$.
Proof.
Given $x, y \in U$ let $\gamma(t) = t x + (1-t) y$. Then $\frac{d}{dt} \gamma(t) = x - y$ and
\[ f(x) - f(y) = \int_0^1 \frac{d}{dt} f(\gamma(t))\, dt = \int_0^1 (df) \cdot \gamma'(t)\, dt. \]
Using the triangle inequality (this can be used for Riemann integrals too because these are limits of finite sums), one gets
\[ \| f(x) - f(y) \| \le \int_0^1 \| (df) \cdot \gamma'(t) \|\, dt \le M \int_0^1 \| x - y \|\, dt = M \| x - y \|. \]
By continuity this inequality extends to $\overline{U}$.
□
Example 15
Consider the map $f: \mathbb{R}^2 \supset B_1(0) \to B_1(0)$, $(x,y) \mapsto (x^2/4 + y/3 + 1/3,\ y^2/4 - x/2)$.
Then
\[ df = \begin{pmatrix} x/2 & 1/3 \\ -1/2 & y/2 \end{pmatrix}. \]
The operator norm $\| df \|$ can be estimated by the Hilbert–Schmidt norm. Recall $\| A \|_{HS} = (\mathrm{tr}(A^* A))^{1/2}$, so we get
\[ \| df \| \le \| df \|_{HS} = \left( \tfrac{1}{4} (x^2 + y^2) + \tfrac{1}{9} + \tfrac{1}{4} \right)^{1/2} < 1. \]
Therefore f is a contraction. We can find the fixed point by starting, for example, with the point (0,0)
and iterating. We get iterations:
(0,0), (0.333333, 0.), (0.361111, −0.166667),
(0.310378, −0.173611), (0.299547, −0.147654),
(0.306547, −0.144323), (0.308719, −0.148066),
(0.307805, −0.148878), (0.307393, −0.148361),
(0.307502, −0.148194), (0.307575, −0.148261),
(0.307564, −0.148292), (0.307551, −0.148284),
(0.307552, −0.148279), (0.307554, −0.148279).
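The iterations of Example 15 can be reproduced with a few lines of code. This is a minimal sketch; the function names and the number of steps are illustrative assumptions.

```python
import math

def f(p):
    """The map of Example 15."""
    x, y = p
    return (x * x / 4 + y / 3 + 1 / 3, y * y / 4 - x / 2)

def hilbert_schmidt_norm_df(p):
    """Hilbert-Schmidt norm of the Jacobian [[x/2, 1/3], [-1/2, y/2]]."""
    x, y = p
    return math.sqrt((x / 2) ** 2 + (1 / 3) ** 2 + (1 / 2) ** 2 + (y / 2) ** 2)

def iterate(p, n_steps):
    """Return the orbit p, f(p), f(f(p)), ... with n_steps applications of f."""
    orbit = [p]
    for _ in range(n_steps):
        p = f(p)
        orbit.append(p)
    return orbit

orbit = iterate((0.0, 0.0), 60)
fixed = orbit[-1]  # numerical approximation of the fixed point
```

Printing the first entries of `orbit` with six decimals gives the list of iterations quoted in the example.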
Example 16
Put a map of the country you are currently in on the floor: there is a point on the map that is touching the actual point it refers to!
16.2.2 Applications of fixed point theory: The Picard-Lindelöf Theorem
Let $f: K \to \mathbb{R}$ be a function on a compact rectangle of the form $K = [T_1,T_2] \times [L_1,L_2]$ in $\mathbb{R}^2$.
Consider the initial value problem (IVP)
\[ \frac{dy}{dt} = f(t,y), \qquad y(t_0) = y_0, \tag{101} \]
where $y: [T_1,T_2] \to \mathbb{R}$, $t \mapsto y(t)$ is a function. The function $f$, the initial value $y_0 \in [L_1,L_2]$, and $t_0 \in [T_1,T_2]$ are given and we are looking for a function $y$ satisfying the above equations.
Example 17
Let $f(t,x) = x$ and $y_0 = 1$, $t_0 = 0$. Then the initial value problem is
\[ \frac{dy}{dt} = y, \qquad y(0) = 1. \]
We know from other courses that there is a unique solution $y(t) = e^t$, see Fig. 19 top-left.
Example 18
Let $f(t,x) = x^2$ and $y_0 = 1$, $t_0 = 0$. Then the initial value problem is
\[ \frac{dy}{dt} = y^2, \qquad y(0) = 1. \]
We know from other courses that there is a unique solution $y(t) = \frac{1}{1-t}$ which exists only on the interval $(-\infty, 1)$, see Fig. 19 top-right.
Example 19
Let $f(t,x) = x^2 - t$ and $y_0 = 1$, $t_0 = 0$. Then the initial value problem is
\[ \frac{dy}{dt} = y^2 - t, \qquad y(0) = 1. \]
One can show that there exists a solution for small $|t|$, however this solution cannot be expressed in terms of elementary functions, see Fig. 19 bottom-left.
Example 20
Let $f(t,x) = x^{2/3}$ and $y_0 = 0$, $t_0 = 0$. Then the initial value problem is
\[ \frac{dy}{dt} = y^{2/3}, \qquad y(0) = 0. \]
It has at least two solutions, namely $y = 0$ and $y = t^3/27$, see Fig. 19 bottom-right.
Hence, there are two fundamental questions here: existence and uniqueness of solutions. The following theorem is one of the basic results in the theory of ordinary differential equations and establishes existence and uniqueness under rather general assumptions.
Theorem 21 (Picard–Lindelöf theorem)
Suppose that $f: [T_1,T_2] \times [y_0 - C, y_0 + C] \to \mathbb{R}$ is a continuous function such that for some $M>0$ we have
\[ |f(t,y_1) - f(t,y_2)| \le M\, | y_1 - y_2 | \qquad \text{(Lipschitz condition)} \]
for all $t \in [T_1,T_2]$, $y_1, y_2 \in [y_0 - C, y_0 + C]$.
Then, for any $t_0 \in [T_1,T_2]$ the initial value problem
\[ \frac{dy}{dt}(t) = f(t,y(t)), \qquad y(t_0) = y_0, \]
has a unique solution $y$ in $C^1[a,b]$, where $[a,b]$ is the interval $[t_0 - R, t_0 + R] \cap [T_1,T_2]$, where
\[ R = \frac{C}{\| f \|_\infty}. \]
(The solution exists for all times $t$ such that $|t - t_0| \le R$.)
Proof.
Using the fundamental theorem of calculus we can write the IVP as a fixed point equation $F(y) = y$ for a map defined by
\[ F(y)(t) = y_0 + \int_{t_0}^{t} f(s,y(s))\, ds. \]
This is a map that will send a continuous function $y \in C[T_1,T_2]$ to a continuous function $F(y) \in C[T_1,T_2]$.
As a metric space we take
\[ X = C([a,b], [y_0 - C, y_0 + C]) \]
that is, the set of continuous functions on $[a,b]$ taking values in the interval $[y_0 - C, y_0 + C]$. This is a closed (why?) subset of the Banach space $C[a,b]$ and is therefore a complete metric space.
First we show that $F: X \to X$, i.e. $F$ maps $X$ to itself. Indeed,
\[ | F(y)(t) - y_0 | = \left| \int_{t_0}^{t} f(s,y(s))\, ds \right| \le R\, \| f \|_\infty \le C. \]
Next we show that $F^N$ is a contraction for $N$ large enough and thus establish the existence of a unique fixed point.
This is the place to use the Lipschitz condition. Observe that for two functions $y, y' \in X$ we have
\[ \begin{aligned} | F(y)(t) - F(y')(t) | &= \left| \int_{t_0}^{t} f(s,y(s)) - f(s,y'(s))\, ds \right| \\ &\le \left| \int_{t_0}^{t} | f(s,y(s)) - f(s,y'(s)) |\, ds \right| \le |t - t_0|\, M\, \| y - y' \|_\infty. \end{aligned} \tag{102} \]
We did not assume that $|t - t_0| M \le RM < 1$, so $F$ will in general not be a contraction. There are several ways to resolve this situation. For example, we can argue in either of the following two manners:
- We use both the result and the method from (102) to compute distances for higher powers of $F$, starting from the squares:
\[ \begin{aligned} | F^2(y)(t) - F^2(y')(t) | &\le \left| \int_{t_0}^{t} | f(s,F(y)(s)) - f(s,F(y')(s)) |\, ds \right| \\ &\le \left| \int_{t_0}^{t} M\, | F(y)(s) - F(y')(s) |\, ds \right| \\ &\le \left| \int_{t_0}^{t} |s - t_0| \cdot M^2 \cdot \| y - y' \|_\infty\, ds \right| \\ &= \frac{|t - t_0|^2}{2}\, M^2\, \| y - y' \|_\infty \le \frac{R^2}{2}\, M^2\, \| y - y' \|_\infty, \end{aligned} \]
and iterating this gives for any natural $N$:
\[ \| F^N(y) - F^N(y') \|_\infty \le \frac{R^N}{N!}\, M^N\, \| y - y' \|_\infty. \]
Since the factorial eventually outgrows the respective power, for $N$ large enough $F^N$ is a contraction and we deduce the existence of a unique solution from Cor. 13. This solution is in $C^1$ since it can be written as the integral of a continuous function.
- The inequality (102) shows existence and uniqueness of the solution only in the space of functions $C([t_0 - r, t_0 + r], [y_0 - C, y_0 + C])$ where $r < M^{-1}$ and therefore $|t - t_0| M < 1$ in (102). Now suppose we have two solutions $y$ and $y'$. They coincide at $t_0$. Application of (102) to other initial points where the solutions coincide shows that the set $E = \{ x \in [a,b] \mid y(x) = y'(x) \}$ is open. It is also the pre-image of the closed set $\{0\}$ under the continuous map $y - y'$. So we have that $E$ is a closed and open subset of $[a,b]$ that is non-empty. It must therefore be $[a,b]$. Hence, we get $y = y'$, establishing uniqueness in the whole $C[a,b]$.
□
Note that this not only gives uniqueness and existence, but also gives a constructive method to compute the solution by iterating the map $F$ starting for example with the constant function $y(t) = y_0$. The iteration
\[ y_{n+1}(t) = y_0 + \int_{t_0}^{t} f(s, y_n(s))\, ds \]
is called Picard iteration. It will converge to the solution uniformly. See Fig. 20 for an illustration of the first few iterations for the exponential function.
Figure 20: Few initial Picard iterations for the differential equation y′=y: constant f0, linear f1, quadratic f2, etc. |
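Picard iteration can also be carried out numerically. The sketch below is an illustration under assumptions not made in the notes: functions are sampled on a uniform grid, the integral is replaced by the trapezoidal rule, and the names `picard_iterates` and `n_grid` are hypothetical.

```python
import math

def picard_iterates(f, t0, y0, t_end, n_iter, n_grid=2000):
    """Return the grid and the Picard iterates y_0, ..., y_{n_iter} sampled on it."""
    h = (t_end - t0) / n_grid
    ts = [t0 + i * h for i in range(n_grid + 1)]
    y = [y0] * (n_grid + 1)          # constant starting function y(t) = y0
    history = [y]
    for _ in range(n_iter):
        g = [f(t, v) for t, v in zip(ts, y)]
        new, acc = [y0], 0.0
        for i in range(n_grid):
            acc += 0.5 * h * (g[i] + g[i + 1])   # trapezoidal rule on [t_i, t_{i+1}]
            new.append(y0 + acc)
        y = new
        history.append(y)
    return ts, history

# the equation y' = y, y(0) = 1 of Example 17 on [0, 1]
ts, history = picard_iterates(lambda t, y: y, 0.0, 1.0, 1.0, 15)
```

For this equation the exact iterates are the Taylor polynomials 1, 1 + t, 1 + t + t²/2, … of the exponential function, so the values at t = 1 approach e.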
Example 25
Consider the IVP
\[ \frac{dy}{dt} = t\, y^2 + 1, \qquad y(0) = 1. \]
Hence, $f(t,x) = x^2 t + 1$. If we take $f$ to be defined on the rectangle $[-T,T] \times [1-C, 1+C]$ then we obtain $\| f \|_\infty = (1+C)^2 T + 1$ (the value at the top-right corner). In this case the solution will exist up to time
\[ \min \left\{ T,\ \frac{C}{(1+C)^2 T + 1} \right\}. \]
If we choose, for example, $C = 2$ and $T = 1/2$ we get that a unique solution exists up to time $|t| \le 4/11$. This solution will then satisfy $|y(t) - 1| \le 2$ for $|t| \le 4/11$. In fact one can show that the solution can be expressed in a complicated way in terms of the Airy-Bi-function and it blows up at t=1.
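The arithmetic behind the bound 4/11 can be checked with exact fractions. This is a small sketch; the function name is a hypothetical helper and the formula is the one from the Picard–Lindelöf theorem, R = C/||f||∞, combined with the restriction |t| ≤ T.

```python
from fractions import Fraction

def existence_radius(T, C):
    """Guaranteed existence time for y' = t*y^2 + 1, y(0) = 1 on [-T, T] x [1 - C, 1 + C]."""
    sup_f = (1 + C) ** 2 * T + 1   # maximum of |t*y^2 + 1| on the rectangle
    return min(T, C / sup_f)

R = existence_radius(Fraction(1, 2), Fraction(2))
```

Trying other values of C and T shows the trade-off: a larger C allows larger deviations of y but also increases ||f||∞.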
16.2.3 Applications of fixed point theory: Inverse and Implicit Function Theorems
It is an easy exercise in Analysis to show that if a function f ∈ C1[a,b] has nowhere vanishing derivative, then f is invertible on its image. To be more precise,
f−1: Im(f) → [a,b] exists and has derivative (f′(x))−1 at the point y=f(x). In higher dimensions a statement like this can not be correct as the following counterexample shows.
Let $0 < a < b$ and define
\[ f: [a,b] \times \mathbb{R} \to \mathbb{R}^2, \qquad (r,\theta) \mapsto (r \cos\theta,\ r \sin\theta). \]
This map has invertible derivative
\[ f'(r,\theta) = \begin{pmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{pmatrix}, \qquad \det f'(r,\theta) = r > 0 \]
at any point. The map is however not injective, see Fig. 21 for a cartoon illustration of the difference between one- and two-dimensional cases. However, for any point we can restrict domain and co-domain, so that the restriction of the function is invertible.
In such a case we say that $f$ is locally invertible. This concept will be explained in more detail below.
Figure 21: Flat and spiral staircases: can we return to the same value going just in one way? |
Definition 26 (Local Invertibility)
Suppose U1, U2 ⊂ ℝm are open subsets of ℝm. Then a map f: U1 → U2 is called locally invertible at x ∈ U1 if there exists an open neighbourhood
U of x such that f |U : U → f(U) is invertible. The function f is said to be locally invertible if it is locally invertible at x for any x ∈ U1.
Often, say for differential equations, we need a map which preserves differentiability of functions in both directions.
Definition 27 (Diffeomorphism)
Suppose U1, U2 ⊂ ℝ^m are open subsets of ℝ^m. Then a map f: U1 → U2 is called a C^k-diffeomorphism if f ∈ C^k(U1,U2) and if there exists a g ∈ C^k(U2,U1) such that
f ∘ g = 1_{U2},   g ∘ f = 1_{U1},
where 1_{U1} and 1_{U2} are the identity maps on U1 and U2 respectively.
There is also a local version of the above definition.
Definition 28 (Local Diffeomorphism)
Suppose U1, U2 ⊂ ℝm are open subsets of ℝm. Then a map f: U1 → U2 is called a local-Ck- diffeomorphism at x ∈ U1 if there exists an open neighbourhood
U of x such that f |U: U → f(U) is a Ck-diffeomorphism. It is called a local-Ck- diffeomorphism if it is a local diffeomorphism at any point x ∈ U1.
Not every invertible C^k-map is a diffeomorphism. An example is the function f(x) = x³ whose inverse g(x) = x^{1/3} fails to be differentiable at 0.
Theorem 29 (Inverse Function Theorem)
Let U ⊂ ℝ^m be an open subset and suppose that f ∈ C^k(U,ℝ^m) is such that f′(x) is invertible at every point x ∈ U. Then f is a local C^k-diffeomorphism.
Before we can prove this theorem we need a Lemma, which basically says that under the assumptions of the inverse function theorem an inverse function must be in C1. That is, differentiability is the leading particular case [, § 4.4] for the general case of k-differentiable functions.
Lemma 30
Suppose that f ∈ C1(U1,U2) is bijective with continuous inverse. Assume that the derivative of f is invertible at any point, then f is a C1-diffeomorphism, and g′(f(x)) = (f′(x))−1.
Proof.
Denote the inverse of f by g: U2 → U1. The continuity of f and g imply that xn → x0 if and only if f(xn) → f(x0). We will show that g is differentiable at the point
y0 = f(x0). If y=f(x) is very close to y0 (so that the line interval between x and x0 is contained in U1) then, by the MVT there exists a ξ on this line such that
y−y0 = f(x) − f(x0) = f′(ξ) · (x−x0). Therefore, g(y)−g(y0) = (f′(ξ))−1 · (y−y0). If y tends to y0, then ξ will tend to x0, and therefore, by continuity of f′ the value of
(f′(ξ))−1 will tend to (f′(x0))−1. Thus, the partial derivatives of g exist and are continuous, so g ∈ C1. Note that we have used here that matrix inversion is continuous.
□
Now we can proceed with the general situation.
Proof. [Proof of the Inverse Function Theorem 29]
Let $x_0 \in U$ and let $y_0 = f(x_0)$. We need to show that there exists an open neighborhood $U_1$ of $f(x_0)$ such that $f: f^{-1}(U_1) \to U_1$ is a $C^k$-diffeomorphism. As a first step we construct a continuous inverse. Since $f'(x_0) = A$ is an invertible $m \times m$-matrix we can change coordinates $x = A^{-1} y + x_0$, so that we can assume without loss of generality that $f'(x_0) = 1$ and $x_0 = 0$. Replacing $f$ by $f - y_0$ we also assume w.l.o.g. that $y_0 = 0$.
Since $f'(x)$ is continuous there exists an $\varepsilon > 0$ such that $\| f'(x) - 1 \| \le 1/2$ for all $x \in B_\varepsilon(0)$. This $\varepsilon > 0$ can also be chosen such that $B_\varepsilon(0) \subset U$.
Thus, $\| x - f(x) \| \le \frac{1}{2} \| x \|$ for all $x \in B_\varepsilon(0)$ by MVT, and for each $y \in B_{\varepsilon/2}(0)$ the map
\[ x \mapsto x + y - f(x) \]
is a contraction on $B_\varepsilon(0)$. Indeed, by MVT again:
\[ \begin{aligned} \| x + y - f(x) - (x' + y - f(x')) \| &= \| x - f(x) - (x' - f(x')) \| \\ &= \| (f'(\xi) - 1)(x - x') \| \\ &\le \tfrac{1}{2} \| x - x' \|, \end{aligned} \tag{103} \]
where $\| \cdot \|$ is the norm of vectors in $\mathbb{R}^m$.
Consider the complete metric space $X = C(B_{\varepsilon/2}(0), B_\varepsilon(0))$ and define the map
\[ F: X \to X, \quad u \mapsto F(u), \quad F(u)(y) = u(y) + y - f(u(y)). \]
By the above this map is well defined and it also is a contraction:
\[ \begin{aligned} \| F(u)(y) - F(v)(y) \| &= \| u(y) - f(u(y)) - \left( v(y) - f(v(y)) \right) \| \\ &\le \tfrac{1}{2} \| u(y) - v(y) \| \qquad [\text{by (103)}] \\ &\le \tfrac{1}{2} \| u - v \|_\infty. \end{aligned} \]
Hence, there exists a unique fixed point $g$. This fixed point yields a continuous inverse $g$ of $f|_U$ defined on $U = B_{\varepsilon/2}(0) \cap f^{-1}(B_{\varepsilon/2}(0))$. By the previous Lemma this implies that $g$ is differentiable. Now simply note that $g' = (f')^{-1} \circ g$. Since matrix inversion is smooth and $f'$ is in $C^{k-1}$ this implies that for $j \le k-1$ we get the implication $(g \in C^j) \Rightarrow (g \in C^{j+1})$.
Hence, $g$ is in $C^k$.
□
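The contraction used in this proof also gives a practical way to evaluate a local inverse: for fixed y, iterate x ↦ x + y − f(x). The sketch below is an illustration only; the map f is a hypothetical example chosen so that f(0) = 0 and f′(0) = 1, and the step count is an arbitrary assumption.

```python
def f(p):
    """Sample map with f(0) = 0 and f'(0) equal to the identity matrix."""
    x1, x2 = p
    return (x1 + x2 * x2 / 4, x2 + x1 * x1 / 4)

def local_inverse(y, n_steps=60):
    """Approximate g(y) with f(g(y)) = y by iterating x -> x + y - f(x) from x = 0."""
    x = (0.0, 0.0)
    for _ in range(n_steps):
        fx = f(x)
        x = (x[0] + y[0] - fx[0], x[1] + y[1] - fx[1])
    return x

y = (0.1, -0.2)
x = local_inverse(y)   # point near the origin that f maps to y
```

For this f the matrix f′(x) − 1 has entries x2/2 and x1/2 off the diagonal, so its norm is at most 1/2 on the unit ball and the iteration converges quickly for small y.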
The implicit function theorem is actually a rather simple consequence of the inverse function theorem. It gives a nice criterion for local solvability of equations in many variables.
Theorem 31 (Implicit Function Theorem)
Let $U_1 \subset \mathbb{R}^n \times \mathbb{R}^m$ and $U_2 \subset \mathbb{R}^m$ be open subsets and let
\[ F: U_1 \to U_2, \quad (x_1, \ldots, x_n, y_1, \ldots, y_m) \mapsto F(x_1, \ldots, x_n, y_1, \ldots, y_m) \]
be a $C^k$-map. Suppose that $F(x_0,y_0) = 0$ for some point $(x_0,y_0) \in U_1$ and that the $m \times m$-matrix $\partial_y F(x_0,y_0)$ is invertible.
Then there exists an open neighborhood $U$ of $(x_0,y_0)$ in $\mathbb{R}^n \times \mathbb{R}^m$, an open neighborhood $V$ of $x_0$ in $\mathbb{R}^n$, and a $C^k$-function $f: V \to \mathbb{R}^m$ such that
\[ \{ (x,y) \in U \mid F(x,y) = 0 \} = \{ (x, f(x)) \in U \mid x \in V \}. \]
The function $f$ has derivative
\[ f'(x_0) = -(\partial_y F(x_0,y_0))^{-1}\, \partial_x F(x_0,y_0) \]
at $x_0$.
Proof.
This is proved by reducing it to the inverse function theorem. Just design the map
\[ G: U_1 \to \mathbb{R}^n \times \mathbb{R}^m, \quad (x,y) \mapsto (x, F(x,y)) \]
and then note that
\[ G'(x_0,y_0) = \begin{pmatrix} 1 & 0 \\ \partial_x F(x_0,y_0) & \partial_y F(x_0,y_0) \end{pmatrix} \]
is invertible with inverse
\[ (G'(x_0,y_0))^{-1} = \begin{pmatrix} 1 & 0 \\ -(\partial_y F(x_0,y_0))^{-1}\, \partial_x F(x_0,y_0) & (\partial_y F(x_0,y_0))^{-1} \end{pmatrix}. \]
By the inverse function theorem there exists a local inverse $G^{-1}: U_3 \to U_4$, where $U_3$ is an open neighborhood of $G(x_0,y_0) = (x_0, 0)$ and $U_4$ an open neighborhood of $(x_0,y_0)$.
Now define $f$ by $(x, f(x)) = G^{-1}(x, 0)$.
□
Example 32
Consider the system of equations
\[ x_1^2 + x_2^2 + y_1^2 + y_2^2 = 2, \qquad x_1 + x_2^3 + y_1 + y_2^3 = 2. \]
We would like to know if this system implicitly determines functions $y_1(x_1,x_2)$ and $y_2(x_1,x_2)$ near the point $(0,0,1,1)$, which solves the system.
For this one simply applies the implicit function theorem to
\[ F(x_1,x_2,y_1,y_2) = ( x_1^2 + x_2^2 + y_1^2 + y_2^2 - 2,\ x_1 + x_2^3 + y_1 + y_2^3 - 2 ). \]
The derivatives are
\[ \partial_x F = \begin{pmatrix} 2 x_1 & 2 x_2 \\ 1 & 3 x_2^2 \end{pmatrix}, \qquad \partial_y F = \begin{pmatrix} 2 y_1 & 2 y_2 \\ 1 & 3 y_2^2 \end{pmatrix}. \]
The values of these derivatives at the point $(0,0,1,1)$ are
\[ \partial_x F(0,0,1,1) = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad \partial_y F(0,0,1,1) = \begin{pmatrix} 2 & 2 \\ 1 & 3 \end{pmatrix}. \]
The latter matrix is invertible and one computes, at the point $(0,0,1,1)$,
\[ -(\partial_y F)^{-1}\, \partial_x F = \begin{pmatrix} 1/2 & 0 \\ -1/2 & 0 \end{pmatrix}. \]
We conclude that there is an implicitly defined function $(y_1,y_2) = f(x_1,x_2)$ whose derivative at $(0,0)$ is given by
\[ f'(0,0) = \begin{pmatrix} 1/2 & 0 \\ -1/2 & 0 \end{pmatrix}. \]
The geometric meaning is that near the point $(0,0,1,1)$ the system defines a two-dimensional manifold that is locally given by the graph of a function. Its tangent plane is spanned by the vectors $(1, 0, 1/2, -1/2)$ and $(0, 1, 0, 0)$, that is, by the vectors $(e_i, f'(0,0) e_i)$ for the standard basis vectors $e_1$, $e_2$.
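The matrix computation of Example 32 can be verified mechanically. This is a minimal sketch with hand-written 2×2 helpers (hypothetical names) so that it needs no external libraries.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul_2x2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def partial_x(x1, x2, y1, y2):
    # derivative of F with respect to (x1, x2)
    return [[2 * x1, 2 * x2], [1, 3 * x2 ** 2]]

def partial_y(x1, x2, y1, y2):
    # derivative of F with respect to (y1, y2)
    return [[2 * y1, 2 * y2], [1, 3 * y2 ** 2]]

point = (0, 0, 1, 1)
inv = inverse_2x2(partial_y(*point))
# derivative of the implicit function: -(d_y F)^{-1} d_x F
derivative = [[-v for v in row] for row in matmul_2x2(inv, partial_x(*point))]
```

The same helpers can be reused for other points on the solution set, as long as the matrix returned by `partial_y` stays invertible.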
Example 33
Consider the system of equations
\[ x^2 + y^2 + z^2 = 1, \qquad x + y z + z^3 = 1. \]
This is the intersection of a sphere (drawn in light green on Figure 22) with some cubic surface defined by the second equation (drawn in light blue).
The point $(0,0,1)$ solves the system and is pictured as a little orange dot. By the implicit function theorem the intersection is a smooth curve (drawn in red) near this point which can be parametrised by the $x$ coordinate. Indeed, with $F(x,y,z) = (x^2 + y^2 + z^2 - 1,\ x + y z + z^3 - 1)$,
we can express $y$ and $z$ along the curve as functions of $x$ because the resulting matrix
\[ \partial_{(y,z)} F(0,0,1) = \left. \begin{pmatrix} 2y & 2z \\ z & y + 3 z^2 \end{pmatrix} \right|_{y=0,\, z=1} = \begin{pmatrix} 0 & 2 \\ 1 & 3 \end{pmatrix} \]
is invertible.
Figure 22: Example of the implicit theorem: the intersection (red) of the unit sphere (green) and a cubic surface (blue). |
Exercise 34
Fig. 22 suggests that the intersection curve can be alternatively parametrised by the coordinate y but cannot be
parametrised by z (why?). Check these claims by verifying the conditions of Thm. 31.
16.3 The Baire Category Theorem and Applications
We are going to see another example of an abstract result which has several non-trivial consequences for real analysis.
16.3.1 The Baire’s Categories
Let us first prove the following result and then discuss its meaning and name.
Theorem 35 (Baire’s category theorem)
Let (X,d) be a complete metric space and Un a sequence of open dense sets. Then the intersection S=∩n Un is dense.
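Proof. The proof is rather straightforward.
We need to show that any ball $B_\varepsilon(x_0)$ contains an element of $S$. Let us therefore fix $x_0$ and $\varepsilon > 0$.
Since $U_1$ is dense the intersection of $B_\varepsilon(x_0)$ with $U_1$ is non-empty. Thus there exists a point $x_1 \in B_\varepsilon(x_0) \cap U_1$. Now choose $\varepsilon_1 < \varepsilon/2$ so that $\overline{B_{\varepsilon_1}(x_1)} \subset B_\varepsilon(x_0) \cap U_1$ (note the closure of the ball). Since $U_2$ is dense, the intersection $B_{\varepsilon_1}(x_1) \cap U_2 \subset B_\varepsilon(x_0) \cap U_1 \cap U_2$ is non-empty.
Choose a point $x_2$ in it and $\varepsilon_2 < \varepsilon_1/2$ such that $\overline{B_{\varepsilon_2}(x_2)} \subset B_{\varepsilon_1}(x_1) \cap U_2 \subset B_\varepsilon(x_0) \cap U_1 \cap U_2$.
Continue inductively, to obtain a sequence $x_n$ such that
\[ \overline{B_{\varepsilon_n}(x_n)} \subset B_{\varepsilon_{n-1}}(x_{n-1}) \cap U_n \subset B_\varepsilon(x_0) \cap U_1 \cap U_2 \cap \ldots \cap U_n, \]
and $\varepsilon_n < 2^{-n} \varepsilon$. In particular, for any $n > N$ we have
\[ x_n \in B_{\varepsilon_N}(x_N), \quad \text{that is} \quad d(x_n, x_N) < 2^{-N} \varepsilon, \]
which implies that $x_n$ is a Cauchy sequence. Hence $x_n$ has a limit $x$, by completeness of $(X,d)$. Consequently, $x$ is contained in the closed ball $\overline{B_{\varepsilon_N}(x_N)}$ for any $N$, and therefore it is contained in $B_\varepsilon(x_0) \cap (\cap_n U_n)$, as claimed.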
□
Completeness is essential here. For example, the conclusion does not hold for the metric space ℚ: take a bijection ψ: ℕ → ℚ, and consider the open dense sets
U_n = { ψ(1), ψ(2), …, ψ(n) }^c = { ψ(n+1), ψ(n+2), … }.
The intersection ∩_n U_n is empty.
The following historic terminology, due to Baire, is in use.
Definition 36 (Baire’s categories)
A subset $Y$ of a metric space $X$ is called
- nowhere dense if the interior of the closure $\overline{Y}$ is empty;
- of first category if there is a sequence $(Y_k)$ of nowhere dense sets with $Y = \cup_k Y_k$;
- of second category if it is not of first category.
Examples of nowhere dense sets are $\mathbb{Z} \subset \mathbb{R}$, the circle in $\mathbb{R}^2$, or the set $\{ 1/n \mid n \in \mathbb{N} \} \subset \mathbb{R}$.
Note that the complement of the closure of a nowhere dense set is a dense open set.
Corollary 37
In a complete metric space the complement of a set of the first category is dense.
Proof.
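This follows from the relations for complements
\[ Y^c = \Big( \bigcup_k Y_k \Big)^c = \bigcap_k Y_k^c \supset \bigcap_k \big( \overline{Y_k} \big)^c \]
and the fact that each $(\overline{Y_k})^c$ is open and dense, so that Theorem 35 applies.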
□
The following corollary is also called Baire’s category theorem in some sources:
Corollary 38
A complete metric space is of second category in itself, or plainly speaking it is never the union of a countable number of nowhere dense sets.
The theorem is often used to show abstract existence results. Here is an example.
Theorem 39
There exists a function f ∈ C[0,1] that is nowhere differentiable.
Proof.
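For each $n \in \mathbb{N}$ define
\[ U_n = \left\{ f \in C[0,1] \ \text{s.t.}\ \sup \left\{ \left| \frac{f(x+h) - f(x)}{h} \right| \ \text{over}\ 0 < |h| \le \frac{1}{n} \right\} > n, \ \forall x \in [0,1] \right\}. \]
We will show that the $U_n$ are open and dense. By the Category theorem their intersection is also dense. Any function in this intersection has unbounded difference quotients at every point and is therefore nowhere differentiable.
$U_n$ is open:
Let $f \in U_n$. For each $x \in [0,1]$ choose $\delta_x > 0$ such that
\[ \sup \left\{ \left| \frac{f(x+h) - f(x)}{h} \right| \ \text{over}\ 0 < |h| \le \frac{1}{n} \right\} > n + \delta_x, \]
hence there is an $h_x$ with $0 < |h_x| \le 1/n$ and
\[ \left| \frac{f(x+h_x) - f(x)}{h_x} \right| > n + \delta_x. \]
By continuity of $f$ there is an open neighborhood $I_x$ of $x$ such that
\[ \left| \frac{f(y+h_x) - f(y)}{h_x} \right| > n + \delta_x \]
for all $y \in I_x$. These $I_x$ form an open cover. We choose a finite subcover $(I_{x_k})_{k=1,\ldots,N}$.
Let $\delta = \min\{\delta_{x_1}, \ldots, \delta_{x_N}\} > 0$. Then, for $y \in I_{x_k}$:
\[ \left| \frac{f(y+h_{x_k}) - f(y)}{h_{x_k}} \right| > n + \delta. \]
Now let $g \in B_\varepsilon(f)$, where $\varepsilon > 0$ is chosen so that $\varepsilon < \frac{1}{2} \delta\, |h_{x_k}|$ for all $k$. Then by an $\varepsilon/3$-style argument:
\[ \left| \frac{g(y+h_{x_k}) - g(y)}{h_{x_k}} \right| \ge \left| \frac{f(y+h_{x_k}) - f(y)}{h_{x_k}} \right| - 2\, \frac{\varepsilon}{|h_{x_k}|} > n + \delta - 2 \varepsilon\, |h_{x_k}|^{-1} > n, \]
and therefore $g \in U_n$. We conclude that $U_n$ is open.
$U_n$ is dense:
For each $\varepsilon > 0$ and $f \in C[0,1]$ choose a polynomial $p$ such that $\| f - p \|_\infty < \varepsilon/2$ and a sequence of continuous functions $g_m \in C[0,1]$ such that $\| g_m \|_\infty < \varepsilon/2$ and such that for all $x \in [0,1]$:
\[ \sup \left\{ \left| \frac{g_m(x+h) - g_m(x)}{h} \right| \ \text{over}\ 0 < |h| \le \frac{1}{m} \right\} > m \]
by using a “zigzag” function.
Then, for large enough $m$ we have $p + g_m \in U_n$, and $\| f - (p + g_m) \|_\infty < \varepsilon$.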
□
The above proof actually shows much more, namely that the set of nowhere differentiable functions is dense in C[0,1]. It is also useful to compare it with the construction of the continuous nowhere differentiable Weierstrass function and identify some common elements.
16.3.2 Banach–Steinhaus Uniform Boundedness Principle
Another consequence of the Baire Category theorem is the Banach–Steinhaus uniform boundedness principle.
Recall that, if X and Y are normed spaces, T: X → Y is called a bounded operator if it is a bounded linear map.
Theorem 40 (Banach–Steinhaus Uniform Boundedness Principle)
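Let $X$ be a Banach space and $Y$ a normed space, and let $(T_\alpha)_{\alpha \in I}$ be a family of bounded operators $T_\alpha: X \to Y$.
Suppose that
\[ \sup_{\alpha \in I} \| T_\alpha x \| < \infty \quad \text{for every } x \in X. \]
Then we have $\sup_\alpha \| T_\alpha \| < \infty$, i.e. the family $T_\alpha$ is bounded in the set $B(X,Y)$ of bounded operators from $X$ to $Y$.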
Proof.
Define $X_n = \{ x \in X \mid \sup_\alpha \| T_\alpha x \| \le n \}$. By assumption $X = \cup_n X_n$. Note that all the $X_n$ are closed. By the Baire category theorem at least one of these sets must have non-empty interior, since otherwise the Banach space $X$ would be a countable union of nowhere dense sets. Hence, there exist $N \in \mathbb{N}$, $y \in X_N$, and $\varepsilon > 0$ such that $B_\varepsilon(y) \subset X_N$.
Now $X_N$ is symmetric under reflections $x \mapsto -x$ and convex. So we get the same statement for $-y$, that is $B_\varepsilon(-y) \subset X_N$. Hence, $x \in B_\varepsilon(0)$ implies
\[ x = \frac{1}{2} \big( (x + y) + (x - y) \big) \in \frac{1}{2} \big( X_N + X_N \big) \subset X_N. \tag{104} \]
Since $X_N$ is closed, this means that $\| x \| \le \varepsilon$ implies $\| T_\alpha x \| \le N$, and therefore $\| T_\alpha \| \le \varepsilon^{-1} N$ for all $\alpha \in I$.
□
Recall that the Fourier series of a C1-function on a circle (identified with 2 π-periodic functions) converges uniformly to the function. We will now show that a statement like that can not hold for continuous functions.
Corollary 41
There exist continuous periodic functions whose Fourier series do not converge point-wise.
Proof.
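We will show that there exists a continuous function whose Fourier series does not converge at $x = 0$.
Suppose, for a contradiction, that such functions do not exist, so we would have point-wise convergence of the Fourier series
\[ \frac{1}{2} a_0 + \sum_{m=1}^{\infty} \big( a_m \cos(m x) + b_m \sin(m x) \big) \]
for every $f \in C(S^1) = C_{per}(\mathbb{R})$. Here we identify continuous functions on the unit circle with continuous $2\pi$-periodic functions $C_{per}(\mathbb{R})$. Hence we have a map
\[ T_n: C(S^1) \to \mathbb{R}, \quad f \mapsto \frac{1}{2} a_0 + \sum_{m=1}^{n} a_m \]
by mapping the function $f$ to the $n$-th partial sum of its Fourier series at $x = 0$. This is a family of bounded operators $T_n: C(S^1) \to \mathbb{R}$ and by assumption we have for every $f$ that
\[ \sup_n | T_n(f) | < \infty. \]
By the Banach–Steinhaus theorem we have $\sup_n \| T_n \| = \sup_{n,\ \| f \|_\infty = 1} | T_n(f) | < \infty$.
Now one computes the norm of the map
\[ T_n: C(S^1) \to \mathbb{R}, \quad f \mapsto \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \left( \frac{1}{2} + \sum_{k=1}^{n} \cos(k x) \right) dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, D_n(x)\, dx \]
where
\[ D_n(x) = \sum_{k=-n}^{n} e^{i k x} = \frac{\sin((n + \frac{1}{2}) x)}{\sin(\frac{x}{2})} \]
is the Dirichlet kernel, cf. Lem. 6.
This norm equals $\frac{1}{2\pi} \int_{-\pi}^{\pi} | D_n(x) |\, dx = \frac{1}{2\pi} \int_{0}^{2\pi} | D_n(x) |\, dx$ (Exercise) which goes to $\infty$ as $n \to \infty$.
Indeed, using $\sin(x/2) \le x/2$ and substituting we get
\begin{align*}
\frac{1}{2\pi} \int_{-\pi}^{\pi} | D_n(x) |\, dx &= \frac{1}{\pi} \int_{0}^{\pi} \frac{| \sin((n + \frac{1}{2}) x) |}{\sin(\frac{x}{2})}\, dx \\
&\ge \frac{2}{\pi} \int_{0}^{\pi} \frac{| \sin((n + \frac{1}{2}) x) |}{x}\, dx && [\text{since } \sin s \le s] \\
&= \frac{2}{\pi} \int_{0}^{(n + \frac{1}{2}) \pi} \frac{| \sin t |}{t}\, dt && [\text{change of variables } t = (n + \tfrac{1}{2}) x] \\
&\ge \frac{2}{\pi} \sum_{k=1}^{n} \int_{(k-1)\pi}^{k\pi} \frac{| \sin t |}{t}\, dt && [\text{split integral into intervals}] \\
&\ge \frac{2}{\pi} \sum_{k=1}^{n} \frac{1}{k \pi} \int_{(k-1)\pi}^{k\pi} | \sin t |\, dt && [\text{since } t \le k\pi \text{ for } t \in ((k-1)\pi, k\pi)] \\
&= \frac{4}{\pi^2} \sum_{k=1}^{n} \frac{1}{k} && [\text{evaluating the integral}],
\end{align*}
which is a partial sum of the harmonic series and diverges as $n \to \infty$.
This gives a contradiction.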
□
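The growth of the norms ||T_n|| can be observed numerically. The sketch below is an illustration under assumptions: the integral of |D_n| is approximated by a midpoint rule, the lower bound (4/π²) Σ 1/k is the standard harmonic-sum estimate for these norms, and all function names are hypothetical.

```python
import math

def dirichlet_kernel(n, x):
    """D_n(x) = sin((n + 1/2) x) / sin(x / 2) for x not a multiple of 2*pi."""
    return math.sin((n + 0.5) * x) / math.sin(x / 2)

def lebesgue_constant(n, n_points=20000):
    """Midpoint-rule approximation of (1 / 2 pi) * integral of |D_n| over [-pi, pi]."""
    h = 2 * math.pi / n_points
    total = 0.0
    for i in range(n_points):
        x = -math.pi + (i + 0.5) * h   # midpoints avoid x = 0 when n_points is even
        total += abs(dirichlet_kernel(n, x))
    return total * h / (2 * math.pi)

def harmonic_lower_bound(n):
    """The bound (4 / pi^2) * (1 + 1/2 + ... + 1/n)."""
    return 4 / math.pi ** 2 * sum(1 / k for k in range(1, n + 1))

norms = {n: lebesgue_constant(n) for n in (1, 5, 20, 80)}
```

The computed values grow roughly like a multiple of log n, slowly but without bound, which is exactly what contradicts the uniform bound from the Banach–Steinhaus theorem.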
Another corollary of the Banach–Steinhaus principle is an important continuity statement. Recall that if X and Y are normed spaces then so is the Cartesian product
X × Y equipped with the norm || (x,y) || = ( || x ||_X² + || y ||_Y² )^{1/2}. It is easy to see that a sequence (x_n,y_n) converges to (x,y) in this norm if and only if
x_n → x and y_n → y.
Theorem 42
Suppose that X, Y are Banach spaces and suppose that B: X × Y → ℝ is a bilinear form on X × Y that is separately continuous, i.e.
B(·, y) is continuous on X for every y ∈ Y and B(x,·) is continuous on Y for every x ∈ X. Then B is continuous.
Proof.
Suppose that $(x_n, y_n)$ is a sequence that converges to $(x,y)$.
First note that
\[ B(x_n - x, y_n - y) = B(x_n, y_n) - B(x_n, y) - B(x, y_n) + B(x,y), \]
where $B(x_n, y) \to B(x,y)$ as well as $B(x, y_n) \to B(x,y)$.
So it is sufficient to show that $B(x_n - x, y_n - y) \to 0$ or, equivalently, $B(x'_n, y'_n) \to 0$ for any $x'_n \to 0$ and $y'_n \to 0$.
Now, the linear mappings $T_n(x) = B(x, y'_n): X \to \mathbb{R}$ are bounded, by assumption. Since $\| y'_n \| \to 0$ the sequence $T_n(x) \to 0$ and is therefore bounded for every $x \in X$. Then, by the Banach–Steinhaus theorem there exists a constant $C$ such that $\| T_n \| \le C$ for all $n$. That is, $| T_n(x) | = | B(x, y'_n) | \le C \| x \|$ for all $n$ and $x \in X$. Therefore, $| B(x'_n, y'_n) | \le C \| x'_n \| \to 0$.
□
16.3.3 The open mapping theorem
Recall that for a continuous map the pre-image of any open set is open. This does, of course, not mean that the image of any open set is open
(for example, sin: ℝ → ℝ has image [−1,1], which is not open). A map f: X → Y between metric spaces is called open if the image of every open set is open.
If a map is invertible then it is open if and only if its inverse is continuous.
We start with a simple observation for linear maps.
We will denote open balls in normed spaces X and Y by $B_r^X(x)$ and $B_s^Y(y)$ respectively, or simply $B_r^X$ and $B_s^Y$ if they are centred at the origin.
Lemma 44
Let X and Y be normed spaces. Then a linear map T: X → Y is open if and only if there exists ε>0 such that
$B_\varepsilon^Y(0) \subset T(B_1^X(0))$, i.e. the image of the unit ball contains a neighbourhood of zero.
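Proof.
If the map T is open it clearly has this property. Suppose conversely that $B_\varepsilon^Y(0) \subset T(B_1^X(0))$ for some ε>0. Then, by scaling, $B_{\varepsilon\delta}^Y(0) \subset T(B_\delta^X(0))$ for any δ>0.
Suppose that U ⊂ X is open and that y ∈ T(U), that is, there exists x ∈ U such that y = T x. Then there exists δ>0 with $x + B_\delta^X(0) \subset U$ and therefore
\[ T U \supset T B_\delta^X(x) = \{ T x \} + T B_\delta^X(0) \supset \{ y \} + B_{\delta\varepsilon}^Y(0) = B_{\delta\varepsilon}^Y(y). \]
Hence T U contains a ball around each of its points, so it is open.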
□
Theorem 45 (Open Mapping Theorem)
Let T : X → Y be a continuous surjective linear operator between Banach spaces. Then T is open.
Proof.
Since T is surjective we have $Y = \bigcup_n T B_n^X$. Therefore, trivially, $Y = \bigcup_n \overline{T B_n^X}$. By the Baire category theorem one of the closed sets $\overline{T B_n^X}$ must have an interior point. Rescaling implies that $\overline{T B_1^X}$ has an interior point y0. Since $\overline{T B_1^X}$ is symmetric under the reflection y → −y, the point −y0 must also be an interior point. Therefore, by convexity of $\overline{T B_1^X}$, there exists a δ>0 with $B_\delta^Y \subset \overline{T B_1^X}$, cf. (104).
By linearity this means $B_{\delta 2^{-n}}^Y \subset \overline{T B_{2^{-n}}^X}$ for any natural n.
We will show that $\overline{T B_1^X} \subset T B_2^X$, with the implication from above
that $B_\delta^Y \subset T B_2^X$, which will complete the proof by the previous Lemma.
So, let $y \in \overline{T B_1^X}$ be arbitrary.
Then there exists $x_1 \in B_1^X$ such that $y - T x_1 \in B_{\delta/2}^Y \subset \overline{T B_{1/2}^X}$. Repeating this, there exists $x_2 \in B_{1/2}^X$ such that
$y - T x_1 - T x_2 \in B_{\delta/4}^Y$.
Continuing inductively, we obtain a sequence (xn) with the property that
\[ \| x_n \| < 2^{-n+1} \quad \text{and} \quad y - \sum_{k=1}^{n} T x_k \in B_{\delta 2^{1-n}}^Y. \qquad (105) \]
By completeness of X, the absolutely convergent series $\sum_{n=1}^{\infty} x_n$ converges to an element x ∈ X of norm ||x|| < 2. By linearity and continuity of T we get from (105) that y = T x. Thus $y \in T B_2^X$.
□
If the map T is also injective (and, therefore, bijective with the inverse T−1) we can quickly conclude continuity of T−1.
Corollary 46
Suppose that T: X → Y is a bijective bounded linear map between Banach spaces. Then T has a bounded inverse T−1.
It is not rare that we may have two different norms ||·|| and ||·||* on the same Banach space X. We say that ||·|| and ||·||* are equivalent if there are constants c>0 and C>0 such that:
c ||x|| ≤ ||x||* ≤ C ||x|| for all x ∈ X.
(106) |
Exercise 47
- Check that (106) defines an equivalence relation on the set of all norms on X.
- If a sequence is Cauchy/convergent/bounded in a norm then it is also Cauchy/convergent/bounded in any equivalent norm.
Cor. 46 implies that if X is complete in both norms and the identity map (X,||·||)→ (X,||·||*) is bounded, then the two norms are equivalent.
Corollary 48
Let (X,||·||) be a Banach space and ||·||* be a norm on X in which X is complete. If ||·|| ≤ C ||·||* for some C>0, then the norms are equivalent.
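As a concrete finite-dimensional illustration of the two-sided estimate (106), the sketch below samples random vectors in ℝ^n and checks the standard inequalities ||x||_2 ≤ ||x||_1 ≤ √n ||x||_2, i.e. c = 1 and C = √n. The choice of these two norms and all function names are assumptions made for the example; they do not come from the notes.

```python
import math
import random

def norm1(x):
    """The l1 norm: sum of absolute values."""
    return sum(abs(t) for t in x)

def norm2(x):
    """The Euclidean norm."""
    return math.sqrt(sum(t * t for t in x))

def equivalence_ratios(dim, samples=1000, seed=0):
    """Smallest and largest observed ratio ||x||_1 / ||x||_2 over random nonzero vectors."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(samples):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if norm2(x) > 0:
            ratios.append(norm1(x) / norm2(x))
    return min(ratios), max(ratios)

if __name__ == "__main__":
    print(equivalence_ratios(5))  # both values lie between 1 and sqrt(5)
```

In infinite dimensions no such sampling argument is available, which is why the completeness assumptions in the corollary matter.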
16.3.4 The closed graph theorem
Suppose that X, Y are Banach spaces and suppose that D ⊂ X is a linear subspace (not necessarily closed).
Now suppose that T : D → Y is a linear operator. Then the graph gr(T) is defined as the subset {(x,Tx) ∣ x ∈ D} ⊂ X × Y.
This is a linear subspace in the Banach space X × Y, which can be equipped with the norm $\|(x,y)\|^2 = \|x\|_X^2 + \|y\|_Y^2$. One often uses the equivalent norm
$\|(x,y)\| = \|x\|_X + \|y\|_Y$ but the first choice makes sure that the product X × Y is also a Hilbert space if X and Y are Hilbert spaces.
We will refer to T as an operator from X to Y with domain D.
Definition 49
The operator T is called closed if and only if its graph is a closed subset of X × Y.
It is easy to see that T is closed if and only if xn ∈ D, xn → x and T xn → y imply that x ∈ D and T x = y. Note the difference from continuity of T: here the convergence of (T xn) is part of the hypothesis, not a conclusion.
If T is an operator T : D → Y then its graph is a subset of X × Y. If we close this subset the resulting set may fail to be the graph of an operator.
If the closure is the graph as well, we say that T is closable and its closure is the operator whose graph is obtained by closing the graph of T.
Differential operators are often closed but not bounded. Let L2[a,b] be the Hilbert space obtained by abstract completion of (C[a,b],|| ·||2), cf. Prop. 60.
Then D=C1[a,b] is a dense subspace in L2[a,b] and the operator d/dx: C1[a,b] → L2[a,b] is of the above type. This operator is not closed, however it is closable
and its closure therefore defines a closed operator with dense domain. We have already seen that this operator is unbounded and therefore it cannot be continuous.
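The unboundedness of d/dx can be made tangible with a small numerical experiment. The sketch below is only an illustration under assumptions of my own choosing: it takes the interval [0,1], the test functions f_n(x) = sin(nπx), and a midpoint rule for the L2 norm. For these functions the ratio ||f_n′||_2 / ||f_n||_2 equals nπ, so no constant C can satisfy ||f′||_2 ≤ C ||f||_2.

```python
import math

def l2_norm(func, a=0.0, b=1.0, steps=20000):
    """Midpoint-rule approximation of the L2[a, b] norm of a real function."""
    h = (b - a) / steps
    return math.sqrt(sum(func(a + (j + 0.5) * h) ** 2 for j in range(steps)) * h)

def derivative_ratio(n):
    """Ratio ||f_n'||_2 / ||f_n||_2 for f_n(x) = sin(n pi x) on [0, 1]."""
    f = lambda x: math.sin(n * math.pi * x)
    df = lambda x: n * math.pi * math.cos(n * math.pi * x)
    return l2_norm(df) / l2_norm(f)

if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, derivative_ratio(n), n * math.pi)
```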
Of course, the map x ↦ (x,Tx) is a bijection from D to gr(T). We can use the norm on gr(T) to define a norm on D, which is then
\[ \| x \|_D = \bigl( \| x \|_X^2 + \| T x \|_Y^2 \bigr)^{1/2}. \]
Obviously, T is closed if and only if D with the norm ||·||D is a Banach space. It is easy to check that
T continuously maps (D, || · ||D) to Y. We are now ready to state the closed graph theorem.
Theorem 50 (Closed Graph Theorem)
Suppose that X and Y are Banach spaces and suppose that T: X → Y is closed. Then T is bounded.
Proof.
Since in this case we have D=X, we have two norms ||·||X and ||·||D on X that are both complete. Clearly, ||x||X ≤ ||x||D, and by Cor. 48 the norms are therefore equivalent. Hence,
\[ \| T x \|_Y \le \| x \|_D \le C \| x \|_X \]
for some constant C>0.
□
16.4 Semi-norms and locally convex topological vector spaces
Definition 51 (Semi-Norm)
Let X be a vector space, then a map p: X → ℝ is called a semi-norm if
- p(x) ≥ 0 for all x ∈ X,
- p(λ x) = |λ| p(x), for all λ ∈ ℝ, x ∈ X,
- p(x+y) ≤ p(x) + p(y), for all x,y ∈ X.
An example of a semi-norm on C1[0,1] is p(f):=|| f ′ ||∞. If (pα)α ∈ I is a family of semi-norms with the property that
\[ \bigl( \forall \alpha \in I, \ p_\alpha(x) = 0 \bigr) \implies x = 0 \]
then we say X with that family is a locally convex topological vector space. There is a topology (that is, a description of all open sets) on such a vector space, by declaring a subset U ⊂ X to be open if and only if
for every point x ∈ U and any index α ∈ I there exists ε>0 such that { y ∣ pα(y−x) < ε } ⊂ U.
The notion of convergence one gets is xn → x if and only if pα(xn −x) → 0 for all α. The topology of point-wise convergence on the space of functions S → ℝ is for example of this type, with the family of semi-norms
given by (px)x ∈ S, px(f) = | f(x) |.
Another example is the vector space C∞(ℝm) with the topology of uniform convergence of all derivatives on compact sets. Here the family of semi-norms pα,K is indexed by all multi-indices α ∈ ℕ0m
and all compact subsets K ⊂ ℝm and is given by
\[ p_{\alpha,K}(f) = \sup_{x \in K} | \partial^\alpha f(x) |. \]
If the family of semi-norms is countable then this topology actually comes from a metric (so the space is a metric space).
Such a metric space, if it is complete, is called a Fréchet space. Note that C∞(ℝm) is a Fréchet space because the family of semi-norms above can be replaced by a countable one by taking a countable exhaustion of ℝm by compact subsets.
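One standard way to build a metric from a countable family of semi-norms (p_k) is d(x,y) = Σ_k 2^{-k} p_k(x−y)/(1+p_k(x−y)). The notes do not state which metric is meant, so this formula is an assumption. The sketch below evaluates it for a finite, purely illustrative family of coordinate semi-norms p_k(x) = |x_k| on vectors of length 4.

```python
def frechet_metric(x, y, seminorms):
    """d(x, y) = sum_k 2^(-k) * p_k(x - y) / (1 + p_k(x - y)) for a finite list of semi-norms."""
    diff = [a - b for a, b in zip(x, y)]
    total = 0.0
    for k, p in enumerate(seminorms, start=1):
        value = p(diff)
        total += 2.0 ** (-k) * value / (1.0 + value)
    return total

# Hypothetical example family: the k-th semi-norm reads off the k-th coordinate.
coordinate_seminorms = [lambda v, i=i: abs(v[i]) for i in range(4)]

if __name__ == "__main__":
    x = [1.0, 2.0, 0.0, -1.0]
    y = [0.0, 2.0, 3.0, 1.0]
    print(frechet_metric(x, y, coordinate_seminorms))
```

Each summand is bounded by 2^{-k}, so the series converges for any countable family, and d(xn, x) → 0 exactly when every p_k(xn − x) → 0.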
A Tutorial Problems
These are tutorial problems intended for self-assessment of the course
understanding.
A.1 Tutorial problems I
All spaces are complex, unless otherwise specified.
1 Show that ||f||=|f(0)|+sup|f′(t)| defines a norm
on C1[0,1], which is the space of (real) functions on [0,1] with
continuous derivative.
2 Show that the formula ⟨ (xn),(yn)⟩ =∑n=1∞xnyn/n2 defines
an inner product on l∞, the space of bounded (complex)
sequences. What norm does it produce?
3 Use the Cauchy–Schwarz inequality for a suitable inner product to prove that for all f ∈ C[0,1] the inequality
\[ \left| \int_0^1 f(x) \, x \, dx \right| \le C \left( \int_0^1 |f(x)|^2 \, dx \right)^{1/2} \]
holds for some constant C>0 (independent of f) and find the smallest possible C that holds for all functions f (hint: consider the cases of equality).
4
We define the following norm on l∞, the space of
bounded complex sequences:
Show that this norm makes l∞ into a Banach space (i.e., a complete
normed space).
5 Fix a vector (w1,…,wn) whose components are strictly positive real numbers, and define an inner product on ℂn by
\[ \langle x, y \rangle = \sum_{k=1}^{n} w_k x_k \overline{y_k}. \]
Show that this makes ℂn into a Hilbert space (i.e., a complete inner-product space).
A.2 Tutorial problems II
6 Show that the supremum norm on C[0,1] isn’t given by an
inner product, by finding a counterexample to the parallelogram law.
7 In l2 let e1=(1,0,0,…),
e2=(0,1,0,0,…), e3=(0,0,1,0,0,…), and so on.
Show that Lin (e1,e2,…)=c00, and that CLin (e1,e2,…)=l2. What is CLin (e2,e3,…)?
8 Let C[−1,1] have the standard L2 inner product, defined by
\[ \langle f, g \rangle = \int_{-1}^{1} f(t) \overline{g(t)} \, dt. \]
Show that the functions 1, t and t2−1/3 form an orthogonal (not orthonormal!) basis for the subspace P2 of polynomials of degree at most 2 and hence calculate the best L2-approximation of the function t4 by polynomials in P2.
9 Define an inner product on C[0,1] by
\[ \langle f, g \rangle = \int_0^1 \sqrt{t} \, f(t) \overline{g(t)} \, dt. \]
Use the Gram–Schmidt process to find the first 2 terms of an orthonormal sequence formed by orthonormalising the sequence 1, t, t2, ….
10 Consider the plane P in ℂ4 (usual inner product) spanned by the vectors (1,1,0,0) and (1,0,0,−1). Find orthonormal bases for P and P⊥, and verify directly that (P⊥)⊥=P.
A.3 Tutorial Problems III
11 Let a and b be arbitrary
real numbers with a < b. By using the fact that
the functions 1/√2πeinx, n ∈ ℤ,
are orthonormal in L2[0,2π], together with the change of variable
x=2π(t−a)/(b−a), find an orthonormal basis in L2[a,b]
of the form en(t)=α ei n λ t, n ∈ ℤ, for suitable real constants
α and λ.
12
For which real values of α
is the Fourier series
of a function in L2[−π,π]
?
13 Calculate the Fourier series of
f(
t)=
et on [−π,π]
and use Parseval’s
identity to deduce that
14 Using the fact that (
en)
is a complete
orthonormal system in L2[−π,π]
, where
en(
t)=exp(
int)/√
2π, show that
e0,
s1,
c1,
s2,
c2,…
is a complete orthonormal
system, where sn(
t)=sin
nt/√
π and
cn(
t)= cos
nt/√
π.
Show that
every L2[−π,π]
function f has a Fourier series
converging in
the L2 sense, and give
a formula for the coefficients.
15 Let C(T) be the space of continuous (complex) functions on the circle T={z ∈ ℂ: |z|=1} with the supremum norm. Show that, for any polynomial f(z) in C(T),
\[ \int_{|z|=1} f(z) \, dz = 0. \]
Deduce that the function $f(z) = \bar{z}$ is not the uniform limit of polynomials on the circle (i.e., Weierstrass’s approximation theorem doesn’t hold in this form).
A.4 Tutorial Problems IV
16 Define a linear functional on C[0,1] (continuous functions
on [0,1]) by α(f)=f(1/2). Show that α is
bounded if we give C[0,1] the supremum norm. Show that
α is not bounded if we use the L2 norm, because we can
find a sequence (fn) of continuous functions on [0,1] such
that ||fn||2 ≤ 1, but fn(1/2) → ∞.
17 The Hardy space H2 is the Hilbert space of all power series f(z)=∑n=0∞ an zn, such that ∑n=0∞ |an|2 < ∞, where the inner product is given by
\[ \Bigl\langle \sum_{n=0}^{\infty} a_n z^n, \sum_{n=0}^{\infty} b_n z^n \Bigr\rangle = \sum_{n=0}^{\infty} a_n \overline{b_n}. \]
Show that the sequence 1, z, z2, z3, … is an orthonormal basis for H2.
Fix w with |w|<1 and define a linear functional on H2 by α(f)=f(w). Write down a formula for the function g(z) ∈ H2 such that α(f)=⟨ f, g ⟩. What is ||α||?
18 The Volterra operator
V:
L2[0,1] →
L2[0,1]
is defined by
Use the Cauchy–Schwarz inequality to show that |(
Vf)(
x)| ≤
√
x||
f||
2 (hint: write (
Vf)(
x)=⟨
f,
Jx⟩
where
Jx is a function that you can write down explicitly).Deduce that ||Vf||22 ≤ 1/ 2||f||22, and hence ||V||
≤ 1/√2.
19 Find the adjoints of the following operators:-
A:l2 → l2, defined by
A(x1,x2,…)=(0,x1 / 1, x2/ 2, x3/ 3,
…);
and, on a general Hilbert space H:
- The rank-one operator R, defined by Rx=⟨ x,y ⟩
z, where y and z are fixed elements of H;
- The projection operator PM, defined by PM(m+n)=m,
where m ∈ M and n ∈ M⊥, and H=M ⊕ M⊥ as
usual.
20 Let U ∈ B(H) be a unitary operator. Show that (Uen) is an orthonormal basis of H whenever (en) is.
Let l2(ℤ) denote the Hilbert space of two-sided sequences (an)n=−∞∞ with
\[ \| (a_n) \|^2 = \sum_{n=-\infty}^{\infty} |a_n|^2 < \infty. \]
Show that the bilateral right shift, V: l2(ℤ) → l2(ℤ) defined by V((an))=(bn), where bn=an−1 for all n ∈ ℤ, is unitary, whereas the usual right shift S on l2=l2(ℕ) is not unitary.
A.5 Tutorial Problems V
21
Let f∈
C[−π,π]
and let
Mf be the multiplication operator on L2(−π,π)
,
given by (
Mfg)(
t)=
f(
t)
g(
t)
, for g ∈
L2(−π,π)
.
Find a function f′ ∈
C[−π,π]
such that
Mf*=
Mf′.Show that Mf is always a normal operator. When is it Hermitian? When is it
unitary?
22 Let T be any operator such that Tn=0
for some integer n (such
operators are called nilpotent). Show that
I−
T is invertible (hint: consider I+
T+
T2+…+
Tn−1).
Deduce that I−
T/λ
is invertible for any λ ≠ 0
.What is σ(T)? What is r(T)?
23
Let (λ
n)
be a fixed bounded sequence of complex numbers,
and define an operator on l2 by T((
xn))=((
yn))
, where
yn=λ
nxn for each n. Recall that T is a bounded operator and
||
T||=||(λ
n)||
∞. Let Λ={λ
1,λ
2,…}
.
Prove the following:
- Each λk is an eigenvalue of T, and hence is in σ(T).
- If $\lambda \notin \overline{\Lambda}$ (the closure of Λ), then the inverse of T−λ I exists (and is bounded).
Deduce that $\sigma(T) = \overline{\Lambda}$. Note that then any
non-empty compact set could be the spectrum of some bounded operator.
24
Let S be an isomorphism
between Hilbert spaces H and K,
that is, S:
H →
K is a linear bijection such that S and S−1 are bounded
operators.
Suppose that T ∈
B(
H)
.
Show that T and STS−1 have the same spectrum and the same eigenvalues
(if any). 25
Define an operator U:
l2(ℤ) →
L2(−π,π)
by
U((
an))=∑
n=−∞∞an eint/√
2π.
Show that U is a bijection and an isometry, i.e.,
that ||
Ux||=||
x||
for all x ∈
l2(ℤ)
.Let V be the bilateral right shift on l2(ℤ),
the unitary operator defined on Question 20.
Let f ∈ L2(−π,π). Show that
(UVU−1f)(t)=eitf(t), and hence, using Question 24, show that
σ(V)=T, the unit circle, but
that V has no eigenvalues.
A.6 Tutorial Problems VI
26
Show that K(
X)
is a closed linear subspace of B(
X)
, and that AT and TA
are compact whenever T ∈
K(
X)
and A ∈
B(
X)
. (This means that K(
X)
is a closed
ideal of B(
X)
.)
27
Let A be a Hilbert–Schmidt operator, and let (
en)
n≥ 1
and (
fm)
m≥ 1
be orthonormal bases of A.
By writing each Aen as Aen=∑
m=1∞⟨
Aen,
fm ⟩
fm,
show that
Deduce that the quantity ||
A||
HS2=∑
n=1∞||
Aen||
2 is
independent of the choice of orthonormal basis, and that ||
A||
HS=||
A*||
HS.
(||
A||
HS is called the Hilbert–Schmidt norm of A.)
28
-
Let T∈ K(H) be a compact operator. Using
Question 26,
show that T*T and TT* are compact
Hermitian operators.
- Let (en)n≥ 1 and (fn)n ≥ 1 be orthonormal
bases of a Hilbert space H,
let (αn)n ≥ 1 be any bounded complex sequence, and
let T ∈ B(H) be an operator
defined by
Prove that T is Hilbert–Schmidt precisely when
(αn) ∈ l2.
Show that
T is a compact operator if and only if αn → 0,
and in this case write down spectral decompositions for the
compact
Hermitian operators
T*T and TT*.
29 Solve the Fredholm integral equation φ−λ Tφ=f, where f(x)=x and
\[ (T\varphi)(x) = \int_0^1 x y^2 \varphi(y) \, dy \qquad (\varphi \in L_2(0,1)), \]
for small values of λ by means of the Neumann series.
For what values of λ does the series converge? Write down a solution which is valid for all λ apart from one exception. What is the exception?
30
Suppose that h is a 2π
-periodic L2(−π,π)
function with
Fourier series ∑
n=−∞∞an eint. Show that
each of the functions φ
k(
y)=
eiky, k ∈ ℤ
, is an
eigenvector of
the integral operator T on L2(−π,π)
defined by
and calculate the
corresponding eigenvalues.Now let h(t)=−log(2(1−cost)). Assuming, without proof, that h(t) has the
Fourier series
∑n ∈ ℤ, n ≠ 0 eint/|n|, use the Hilbert–Schmidt method to
solve the Fredholm equation
φ−λ Tφ=f, where f(t) has Fourier series ∑n=−∞∞cn eint and 1/λ ∉σ(T).
A.7 Tutorial Problems VII
31 Use the Gram–Schmidt algorithm to find an orthonormal basis for
the subspace X of L2(−1,1)
spanned by the functions t, t2 and
t4.Hence find the best L2(−1,1) approximation of the constant function
f(t)=1 by functions from X.
32 For n=1,2,…
let φ
n denote the linear
functional on l2 defined by
where x=(
x1,
x2,…) ∈
l2.
Use the Riesz–Fréchet theorem to calculate ||φ
n||
. 33 Let T be a bounded linear operator on a Hilbert space, and
suppose that T=
A+
iB, where
A and B are self-adjoint operators. Express T* in terms of
A and B, and hence solve for A and B in terms of T and T*.Deduce that every operator T can be written T=A+iB, where A
and B are self-adjoint, in a unique way.
Show that T is normal
if and only if AB=BA.
34 Let Pn be the subspace of L2(−π,π)
consisting of all
polynomials of degree at most n, and let Tn be the subspace
consisting of all trigonometric polynomials of the form f(
t)=∑
k=−nn ak eikt.
Calculate the spectrum of the differentiation operator D,
defined by (
Df)(
t)=
f′(
t)
, when
-
D is regarded as an operator on Pn, and
- D is regarded as an operator on Tn.
Note that both
Pn
and
Tn are finite-dimensional Hilbert spaces.
Show that Tn has an orthonormal basis of eigenvectors of D, whereas
Pn does not.
35 Use the Neumann series to solve the Volterra integral equation
φ−λ Tφ=f in L2[0,1], where λ∈ ℂ, f(t)= 1 for
all t, and
(Tφ)(x)=∫0x t2φ(t) d t. (You should be able to sum the
infinite series.)
B Solutions of Tutorial Problems
Solutions of the tutorial problems will be distributed in due time on paper.
B.1 Solutions of Tutorial Problems I
1 Clearly the norm is non-negative. If ||f||=0, then f is constantly zero (since then f(0)=0 and f′(t)=0 everywhere). Also
\[ \| \lambda f \| = |\lambda f(0)| + \sup |\lambda f'(t)| = |\lambda| \, |f(0)| + |\lambda| \sup |f'(t)| = |\lambda| \, \| f \|, \]
and
\[ \| f + g \| = |f(0) + g(0)| + \sup |f'(t) + g'(t)| \le |f(0)| + |g(0)| + \sup |f'(t)| + \sup |g'(t)| = \| f \| + \| g \|. \]
2 Clearly the sum is absolutely convergent (since ∑1/n2 < ∞), and checking the conditions: $\langle y,x \rangle = \overline{\langle x,y \rangle}$, ⟨λ x, y⟩ = λ ⟨ x,y ⟩, ⟨ x+y, z⟩=⟨ x,z⟩ + ⟨ y,z⟩, and ⟨ x,x ⟩ > 0 except when x=0, when ⟨ x,x⟩ =0, is fairly straightforward algebra.
Also ||(xn)|| = ⟨(xn),(xn)⟩1/2
= (∑n=1∞|xn|2/n2)1/2.
3 With the usual inner product
⟨ f,g⟩=∫f ḡ d x, we have that
|⟨ f,g⟩ | ≤ ||f|| ||g||, where
g is the function g(x)=x.
Now ||g||=1/√3 and this is the smallest possible
constant C, since we do have ⟨ g,g⟩ =||g|| ||g||.
4 I’ll omit the proof that this is a norm (but
if it gives any trouble, ask me).
The proof of completeness is a bit like the l2 proof, only simpler.
Suppose that (x(n)) is a Cauchy sequence
in l∞. Then, for each coordinate k,
|xk(n)−xk(m)| ≤ ||x(n) − x(m)||,
and so (xk(n)) is a
Cauchy sequence of complex numbers, converging to xk,
say. Also |xk(n)−xk(m)| < є for n and m greater
than or equal to Nє, say. Letting
m→∞ we get that
|xk(n)−xk| ≤ є for n ≥ Nє, so (x(n)−x) ∈
l∞ and hence x ∈ l∞. Also
||x(n)−x|| ≤ є for n ≥ Nє, and so
x(n) → x.
5 Clearly $\langle y,x \rangle = \sum_{k=1}^{n} w_k y_k \overline{x_k} = \overline{\langle x,y \rangle}$, and the properties ⟨x+y,z⟩=⟨x,z⟩ + ⟨y,z⟩ and ⟨λx,y⟩=λ⟨x,y⟩ are also straightforward to check.
The norm produced is
\[ \| x \| = \langle x,x \rangle^{1/2} = \Bigl( \sum_{k=1}^{n} w_k |x_k|^2 \Bigr)^{1/2}, \]
which is strictly positive unless each xk is zero. For the completeness, note that a Cauchy sequence (x(m)) has the property that, for each є>0 there is a number Mє with ||x(m)−x(p)||<є for m, p ≥ Mє. That is,
\[ \sum_{k=1}^{n} w_k |x_k^{(m)} - x_k^{(p)}|^2 < \epsilon^2. \qquad (107) \]
Thus $w_k |x_k^{(m)} - x_k^{(p)}|^2 < \epsilon^2$ for each k, which is enough to show that in the kth coordinate we have a Cauchy sequence of complex numbers.
Define a vector x ∈ ℂn by $x_k = \lim_{m \to \infty} x_k^{(m)}$ for each k. Now we have x(m) → x, since
\[ \sum_{k=1}^{n} w_k |x_k^{(m)} - x_k|^2 \le \epsilon^2 \]
for m ≥ Mє, as we see on letting p → ∞ in (107).
B.2 Solutions of Tutorial Problems II
6 Lin(e1, e2, …) = c00 because a sequence is a finite linear combination of the ei if and only if it has finitely many nonzero terms.
Taking the closure we get all of l2 since anything in l2 is the limit of a sequence in c00. This is so, because
\[ \| (x_1, x_2, \ldots) - (x_1, x_2, \ldots, x_N, 0, 0, \ldots) \|^2 = \sum_{n=N+1}^{\infty} |x_n|^2, \]
which tends to zero as N → ∞.
Finally CLin(e2, e3,…) is the same except that we only get sequences whose first term is zero, i.e. we get {x=(xn) ∈ l2: x1 = 0}.
7 Calculate ⟨1, t⟩, ⟨1, t2−1/3⟩ and ⟨t, t2−1/3⟩: they are all zero. Clearly the set is a basis for P2. Normalise the functions, to get e1(t)=1/√2, e2(t)=t√(3/2) and e3(t)=√(45/8)(t2−1/3), an orthonormal sequence. It now follows that, writing f(t)=t4, the best approximation is
\[ g(t) = \langle f,e_1 \rangle e_1 + \langle f,e_2 \rangle e_2 + \langle f,e_3 \rangle e_3 = (2/5)(1/2) + (0) t + (45/8)(16/105)(t^2 - 1/3) = -3/35 + (6/7) t^2. \]
As a cross-check, note that f−g is indeed orthogonal to g.
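The projection above can be double-checked with exact rational arithmetic. The following sketch represents real polynomials as coefficient lists (lowest degree first) and uses the standard L2[−1,1] inner product; the helper names are illustrative only.

```python
from fractions import Fraction

def inner(p, q):
    """Exact L2[-1, 1] inner product of real polynomials given as coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers integrate to zero over [-1, 1]
                total += Fraction(a) * Fraction(b) * Fraction(2, i + j + 1)
    return total

def project(f, orthogonal_basis):
    """Orthogonal projection of f onto the span of a pairwise orthogonal family."""
    degree = max(len(b) for b in orthogonal_basis)
    result = [Fraction(0)] * degree
    for b in orthogonal_basis:
        coeff = inner(f, b) / inner(b, b)
        for k, c in enumerate(b):
            result[k] += coeff * Fraction(c)
    return result

basis = [[1], [0, 1], [Fraction(-1, 3), 0, 1]]  # 1, t, t^2 - 1/3
best = project([0, 0, 0, 0, 1], basis)          # f(t) = t^4

if __name__ == "__main__":
    print(best)  # coefficients of the best approximation, lowest degree first
```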
8 ⟨1,1⟩ = 2/3, so e1(t)=√(3/2).
Now form f(t)=t−⟨t,e1⟩e1=t−⟨t,1⟩(3/2)=t−3/5. Since ⟨f,f⟩=8/175 we take
e2(t)=√(175/8)(t−3/5).
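The numbers in this solution can be reproduced exactly. The sketch below assumes the weighted inner product ⟨f,g⟩ = ∫_0^1 √t f(t) g(t) dt for real polynomials, which is the weight consistent with the values ⟨1,1⟩ = 2/3 and ⟨f,f⟩ = 8/175 quoted above; with it, ⟨t^i, t^j⟩ = 2/(2(i+j)+3) is rational. The routine performs Gram–Schmidt without the final normalisation and assumes the input polynomials have increasing degree.

```python
from fractions import Fraction

def inner(p, q):
    """Exact value of the integral over [0, 1] of sqrt(t) p(t) q(t) for coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            total += Fraction(a) * Fraction(b) * Fraction(2, 2 * (i + j) + 3)
    return total

def orthogonalise(vectors):
    """Gram-Schmidt without normalisation: returns (polynomial, squared norm) pairs."""
    result = []
    for v in vectors:
        w = [Fraction(c) for c in v]
        for u, norm_sq in result:
            coeff = inner(v, u) / norm_sq
            u_padded = u + [Fraction(0)] * (len(w) - len(u))
            w = [wc - coeff * uc for wc, uc in zip(w, u_padded)]
        result.append((w, inner(w, w)))
    return result

(first, first_norm_sq), (second, second_norm_sq) = orthogonalise([[1], [0, 1]])

if __name__ == "__main__":
    print(first_norm_sq, second, second_norm_sq)
```

Dividing each polynomial by the square root of its squared norm gives the orthonormal functions e1 and e2.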
9 The Gram–Schmidt process gives e1=(1,1,0,0)/√2, then y2=(1,0,0,−1)−(1,1,0,0)/2=(1/2,−1/2,0,−1), and then e2=y2/||y2||=(1,−1,0,−2)/√6.
The plane P⊥ consists of all vectors orthogonal to (1,1,0,0) and (1,0,0,−1), and hence is
{(x,y,z,w)∈ ℂ4: x+y=0, x−w=0},
with general solution (a,−a,b,a), and basis (1,−1,0,1) and (0,0,1,0), which are already orthogonal.
We can thus take e3=(1,−1,0,1)/√3 and e4=(0,0,1,0) as a basis for P⊥.
Finally, (P⊥)⊥ consists of all vectors orthogonal to (1,−1,0,1) and (0,0,1,0), namely
{(x,y,z,w)∈ ℂ4: x−y+w=0, z=0},
to which the general solution is (−a+b,b,0,a), with basis (−1,0,0,1) and (1,1,0,0). We are clearly back at P.
B.3 Solutions of Tutorial Problems III
10 The given change of variables takes t ∈ [a,b] to x ∈ [0,2π]. We know that
\[ \frac{1}{2\pi} \int_0^{2\pi} e^{inx} e^{-imx} \, dx = \delta_{nm}, \]
so we obtain
\[ \frac{1}{b-a} \int_a^b e^{in\lambda(t-a)} e^{-im\lambda(t-a)} \, dt = \delta_{nm}, \]
where λ = 2π/(b−a). Hence the functions en(t)=(1/√(b−a)) einλ t form an orthonormal set in L2[a,b]. They are even an orthonormal basis, since the same coordinate change shows that their closed linear span contains all f ∈ C[a,b] such that f(a)=f(b).
11 By the Riesz–Fischer theorem, ∑cnen converges to an L2 function iff ∑|cn|2 < ∞.
Here cn=nα√(2π) so we require ∑n2α < ∞, i.e. α<−1/2.
12 We calculate ⟨ f,en⟩ =
| | et e−int d t/ | √ | |
= (−1)n | | . |
Also, by Parseval’s identity
| |⟨ f,en⟩ |2
= ||f||22 = | | e2t d t
= (e2π − e−2π)/2. |
That is,
Thus
which gives the result.
13 To check that the sequence is orthonormal it is probably easiest to write
\[ s_n = \sqrt{2} \, (e_n - e_{-n})/2i \quad \text{and} \quad c_n = \sqrt{2} \, (e_n + e_{-n})/2, \]
and use the orthonormality of (en) to calculate ⟨sn, sm⟩, ⟨sn, cm⟩ and ⟨cn, cm⟩.
Since exp(± int)=cos(nt) ± i sin(nt)
and also cos(nt)=(exp(int)+exp(−int))/2 and
sin(nt)=(exp(int)−exp(−int))/2i,
the linear spans of (en) and
{e0,s1,c1,s2,c2,…} are the same.
Hence their closed linear spans are the same, so
{e0,s1,c1,s2,c2,…} also forms an o.n.b.
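As we have an orthonormal basis we have an expansion
\[ f = \langle f,e_0 \rangle e_0 + \sum_{n=1}^{\infty} \langle f,c_n \rangle c_n + \sum_{n=1}^{\infty} \langle f,s_n \rangle s_n \]
converging in L2.
Hence
\[ a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \, dt, \quad a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos nt \, dt \quad \text{and} \quad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin nt \, dt. \]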
14 Complex analysis method: Clearly there is a polynomial g such that f(z)=g′(z). By the fundamental theorem of the calculus, ∫g′ = 0 because the contour is closed. Or you could use Cauchy’s theorem.
Direct method:
\[ \int_0^{2\pi} f(e^{i\theta}) \, i e^{i\theta} \, d\theta = \sum_{k=0}^{n} a_k \int_0^{2\pi} i e^{i(k+1)\theta} \, d\theta = 0, \]
where f(z)=∑k=0n akzk.
Now
\[ \int_{|z|=1} \bar{z} \, dz = \int_{|z|=1} (1/z) \, dz = \int_0^{2\pi} e^{-i\theta} \, i e^{i\theta} \, d\theta = 2\pi i. \]
(Again, there are several ways of doing this, as above.)
But if fn → f uniformly then ∫fn → ∫f, which is impossible if fn are polynomials and $f(z) = \bar{z}$.
B.4 Solutions of Tutorial Problems IV
15 Since |α(
f)|=|
f(1/2)|≤ sup
[0,1]|
f(
x)|=||
f||
∞,
we have that ||α|| ≤ 1
(actually it equals 1) when
we use the supremum norm.
Suppose now we define fn (starting at n=2, say) to be zero except on [1/2−1/n, 1/2+1/n] and piecewise linear on
[1/2−1/n, 1/2] and [1/2, 1/2+1/n], with fn(1/2)=An,
where An is a positive number that we’ll choose in a
minute. (The graph is going to be a thin steep triangle.)
Now $\|f_n\|_2^2 \le (2/n) \times A_n^2$, since fn is
zero except on a set of length 2/n
and always at most An (we could work it out
exactly, but why bother?) So if we
choose An=√(n/2) we get ||fn||2 ≤ 1, and
α(fn)=An so
α(fn) → ∞, which means that α is
unbounded in the L2 norm.
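For the record, the norm that the solution declines to compute can be worked out exactly: on each half of its base the triangle is linear from 0 to A_n, so the integral of f_n² is 2·(1/n)·A_n²/3. The sketch below evaluates this under the choice A_n = √(n/2) (reading "√n/2" as the square root of n/2, which is what the bound (2/n)·A_n² ≤ 1 requires); the function name is illustrative.

```python
import math

def triangle_norm_and_value(n):
    """Return (||f_n||_2, f_n(1/2)) for the triangle with peak sqrt(n / 2) and base of length 2 / n."""
    peak = math.sqrt(n / 2.0)
    # Each linear half contributes the integral of (peak * n * s)^2 over 0 <= s <= 1/n,
    # which equals peak^2 / (3 n).
    norm_squared = 2.0 * peak ** 2 / (3.0 * n)
    return math.sqrt(norm_squared), peak

if __name__ == "__main__":
    for n in (2, 10, 1000):
        print(n, triangle_norm_and_value(n))
```

The norm stays at 1/√3 for every n while the value at 1/2 grows like √n, so the ratio |α(fn)| / ||fn||2 is unbounded.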
16 Clearly we get orthonormality: just compute ⟨zk, zl⟩. The fact that it is an orthonormal basis (i.e., complete) follows since the only function orthogonal to every zk has all its coefficients zero, so is the 0 function.
Now α(∑anzn)=∑an wn, and this is $\sum a_n \overline{b_n}$ only if we take $b_n = \overline{w}^n$ for each n.
Hence
\[ g(z) = \sum_{n=0}^{\infty} \overline{w}^n z^n = \frac{1}{1 - \overline{w} z}. \]
Finally ||α||=||g||, so compute the H2 norm of g to get
\[ \Bigl( \sum_{n=0}^{\infty} |w|^{2n} \Bigr)^{1/2} = \Bigl( \frac{1}{1 - |w|^2} \Bigr)^{1/2}. \]
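17 The function Jx must be χ[0,x], where
\[ \chi_{[0,x]}(t) = \begin{cases} 1 & \text{if } 0 \le t \le x, \\ 0 & \text{otherwise.} \end{cases} \]
It has L2 norm equal to the square root of $\int_0^x 1^2 \, dt$, i.e., √x. Hence
\[ |(Vf)(x)| \le \| \chi_{[0,x]} \|_2 \, \| f \|_2 = \sqrt{x} \, \| f \|_2. \]
Now integrate
\[ \int_0^1 |(Vf)(x)|^2 \, dx \le \int_0^1 x \, \| f \|_2^2 \, dx = \tfrac{1}{2} \| f \|_2^2, \]
as required.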
18 The strategy here is to solve the equation
⟨
Ax,
y⟩ = ⟨
x,
A* y⟩
, etc. (i) ⟨ Ax, y
⟩=x1y2/1+x2y3/2+….
This must be ⟨ x, A* y⟩, and so
A* y=(y2/1, y3/2, y4/3, …). |
(ii) ⟨ Rx, u ⟩ = ⟨ x,y ⟩ ⟨ z, u⟩
= ⟨ x, R* u ⟩, where R* u = ⟨ z,
u⟩ y = ⟨ u, z ⟩ y.
(iii) Let’s take two vectors m+n and m′+n′, with m,
m′∈ M and n, n′∈ M⊥. Then
⟨ PM(m+n), m′+n′ ⟩ = ⟨ m, m′+n′⟩=
⟨ m,m′⟩. |
This is the inner product ⟨ m+n,m′⟩ so
PM*(m′+n′)=m′, which means that PM*=PM
again.
19 Clearly
\[ \langle U e_n, U e_m \rangle = \langle e_n, e_m \rangle = \delta_{nm}, \]
since U preserves the inner product (see notes).
Also, if ⟨x, Uen⟩=0 for all n, then ⟨U*x, en⟩=0 for all n, so U*x=0, because (en) is an o.n.b., and so x=0. Hence (Uen) is an o.n.b.
To show that the bilateral right shift V is unitary, you
can check any of the equivalent definitions. It is
perhaps easiest just to observe that V is clearly
a surjection and that ||Vx||=||x|| for all x.
Alternatively, you can check that V* is the bilateral
left shift, i.e., V*=V−1, by using the identity
\[ \Bigl\langle \sum x_n e_n, V^* \sum y_m e_m \Bigr\rangle = \Bigl\langle V \sum x_n e_n, \sum y_m e_m \Bigr\rangle = \Bigl\langle \sum x_n e_{n+1}, \sum y_m e_m \Bigr\rangle = \sum x_n \overline{y_{n+1}}, \]
which tells you that
\[ V^* \sum y_m e_m = \sum y_{m+1} e_m. \]
We saw in the lectures that S is not unitary, since
SS* ≠ I.
B.5 Solutions of Tutorial Problems V
20 Use the definition of adjoint:
\[ \langle M_f g, h \rangle = \langle g, M_f^* h \rangle \]
for g, h ∈ L2(−π,π), which means that
\[ \langle g, M_f^* h \rangle = \int_{-\pi}^{\pi} f(t) g(t) \overline{h(t)} \, dt. \]
This is the inner product between g and the function taking values $\overline{f(t)} h(t)$, so that Mf* h = Mf′h, where $f'(t) = \overline{f(t)}$ (here f′ denotes the function asked for in the problem, not a derivative).
Clearly Mf Mf* g = Mf* Mf g, and is the function
whose value at t is $f(t) \overline{f(t)} g(t)$.
Now Mf is Hermitian if and only if
Mf=Mf*,
or f=f′. So f must be real-valued.
Also Mf is unitary if and only if Mf*=(Mf)−1,
which means that $f(t) \overline{f(t)} = 1$ for all t,
i.e. |f(t)|=1 for all t.
21 Calculate:
(I+T+…+Tn−1)(I−T) = (I−T)(I+T+…+Tn−1) = I−Tn = I.
So we have an inverse for I−T.
Of course T/λ is also nilpotent, so
I−T/λ is invertible, and so (multiplying by
λ, which is nonzero), we have λ I−T
invertible, and
λ ∉σ(T).
The spectrum is nonempty, so can only be {0}; indeed
it’s obvious that T is not invertible when
Tn=0. Hence r(T)=0 as well.
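The identity can be checked on a small example. The sketch below uses a strictly upper triangular 3×3 integer matrix (an arbitrary choice for illustration, so that T³ = 0) and plain list-of-rows matrices; the helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def identity(size):
    """Identity matrix of the given size."""
    return [[1 if i == j else 0 for j in range(size)] for i in range(size)]

def neumann_inverse(t):
    """Return I + T + T^2 + ... + T^(n-1) for an n x n matrix T (intended for nilpotent T)."""
    size = len(t)
    total = identity(size)
    power = identity(size)
    for _ in range(size - 1):
        power = mat_mul(power, t)
        total = [[total[i][j] + power[i][j] for j in range(size)] for i in range(size)]
    return total

# A strictly upper triangular 3 x 3 matrix, so T^3 = 0.
T = [[0, 2, 5], [0, 0, 3], [0, 0, 0]]
I_minus_T = [[identity(3)[i][j] - T[i][j] for j in range(3)] for i in range(3)]

if __name__ == "__main__":
    print(mat_mul(neumann_inverse(T), I_minus_T))  # the identity matrix
```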
22 (i) Since Tek=λk ek, where (en) is the usual orthonormal basis of l2, we see that λk is an eigenvalue, with eigenvector ek. Eigenvalues are always in the spectrum.
(ii) T−λ I takes (xn) to
((λn−λ)xn), and so its inverse must take
(yn) to (yn/(λn−λ)). This is a
bounded operator, since the sequence
(λn−λ)−1 is bounded when $\lambda \notin \overline{\Lambda}$.
Now σ(T) is a closed set. It contains Λ, so
contains $\overline{\Lambda}$.
Indeed $\sigma(T) = \overline{\Lambda}$,
as it contains no points outside $\overline{\Lambda}$, by (ii).
Thus any nonempty compact set is the spectrum
of some operator!
23 Note that STS−1−λ
I =
S(
T−λ
I)
S−1,
and so STS−1−λ
I is invertible if and only if
T−λ
I is invertible—indeed in that case its
inverse would be S(
T−λ
I)
−1S−1.
Hence σ(
T)=σ(
STS−1)
.Also, if Tu=λ u, then STS−1 (Su)=STu=λ
Su, and Su ≠ 0 if u ≠ 0. Thus any eigenvector
u of T corresponds to an eigenvector Su of
STS−1, and vice-versa.
24 The fact that U is a bijection and an isometry
follows from the fact that the functions
en(
t)=
eint/√
2π, n ∈ ℤ
form an
orthonormal basis of L2(−π,π)
(see notes), so
that a function f is in L2(−π,π)
if and
only if f(
t)=∑
−∞∞an en, where
(
an) ∈
l2, and also
||
f||
2=||(
an)||
2 (Parseval).Now UVU−1f= UVU−1 ∑n=−∞∞an en =
∑n=−∞∞an en+1, because V is the
shift.
But if f(t)=∑n=−∞∞an en(t), then the function
∑n=−∞∞an en+1(t) is just f(t)eit, since
en(t)eit=en+1(t) for all t.
Now we work with the operator T=UVU−1 = Me on
L2(−π,π), where e(t)=eit.
Using Question 21, we see that this is a unitary operator
and so σ(T) ⊆ T, but we can
argue more directly.
The operator
(T−λ I) is multiplication by eit−λ
and its inverse, if it exists, is multiplication by
hλ(t)=1/(eit−λ). For λ ∉T, hλ∈ C[−π,π] and so T−λ I
has a bounded inverse. However, if λ ∈ T,
then multiplication by hλ does not give a bounded
operator (indeed, Mhλ
e0=hλ/√2π, which is not even in L2).
Hence
σ(V)=σ(T)=T.
Also T has no eigenvalues, as, no matter
which λ ∈ ℂ we choose, there will be no
nonzero function f
such that f(t)eit=λ f(t) for all t.
Hence V has no eigenvalues either, by Question 24.
B.6 Solutions of Tutorial Problems VI
25 If T1 and T2 are compact, and (
xn)
is bounded,
then we can find a subsequence (
xn(k))
of (
xn)
such
that (
T1xn(k))
converges, and a further subsequence
(
xn(k(l)))
such that both (
T1xn(k(l)))
and
(
T2xn(k(l)))
converge. Then
((
a1T1+
a2T2)
xn(k(l)))
converges for any a1,
a2 ∈ ℂ
, so a1T1+
a2T2 is compact. Since the
norm limit of compact operators is compact, they form a
closed subspace.Given (xn) bounded, we can find a subsequence
(xn(k)) such that (Txn(k)) converges, and hence
so does (ATxn(k)), since A is continuous; hence
AT is compact. Also (Axn) is bounded so there is a
subsequence of (TAxn) that converges, and TA is
compact.
26 ∑
n=1∞||
Aen||
2 = ∑
n=1∞∑
m=1∞|⟨
Aen,
fm ⟩|
2, since (
fm)
is an o.n.b. This equals
or, summing
over n first, ∑
m=1∞||
A*fm||
2, since
(
en)
is also an o.n.b. Since the right hand side of
the displayed formula
doesn’t mention (
en)
it clearly
makes no difference if we replace (
en)
by a different
o.n.b. It is also clear that ||
A||
HS=||
A*||
HS as
the LHS is just ||
A||
HS2 and the RHS is
||
A*||
HS2. 27 (a) ⟨
T*Tx,
y⟩=⟨
Tx,
Ty⟩=⟨
x,
T*T y⟩
, so
T*T is Hermitian. Also ⟨ TT*x,y⟩=⟨ T*x,T*y⟩
= ⟨ x,T**T*y⟩=⟨ x,TT*y⟩, since T=T**,
and
hence TT* is Hermitian. Both are compact, since the
product of a compact operator and a bounded operator is
always compact (by Question 1).
(b) The point is that Ten=αnfn, and so
∑||Ten||2 = ∑|αn|2< ∞ if and
only if (αn) ∈ l2.
If αn → 0, then T is the limit of finite
rank operators Tmx=∑n=1m αn ⟨ x,en⟩
fn (cf. what we proved in the course about diagonal
operators), and if αn ¬→0, then, for
some δ>0,
||Ten(k)||=|αn(k)| ≥ δ, and
(Ten(k)) has no convergent subsequence—again, see
how we did this for diagonal operators.
⟨ Ten, fm ⟩ = ⟨ en, T* fm
⟩= | |
Hence T* maps fm to αmem,
so T*x=∑m=1∞αm⟨ x,fm⟩
em.
This gives
T*Tx= | | αn ⟨ x,en⟩
T*fn=
| | |αn|2 ⟨ x,en⟩ en, |
and
TT*x=
| | | ⟨ x,fm⟩
Tem=
| | |αm|2⟨ x,fm⟩ fm. |
28 The Neumann series is
\[ (I - \lambda T)^{-1} = I + \lambda T + \lambda^2 T^2 + \ldots, \]
valid for sufficiently small λ (e.g. |λ| ||T|| < 1).
Taking f(x)=x, we find that (Tf)(x)=∫01 xy3 dy=x/4, and in general (Tnf)(x)=x/4n.
The solution we obtain is φ=(I−λ T)−1f,
which gives
\[ \varphi(x) = x + \lambda x/4 + \lambda^2 x/16 + \ldots, \]
which converges to
\[ \varphi(x) = \frac{x}{1 - \lambda/4} = \frac{4x}{4 - \lambda} \]
at least for |λ| < 4. It is easily seen that
this solution is valid for all λ ≠ 4.
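The Neumann series for this equation can be summed numerically and compared with the closed form 4x/(4−λ), which is what the geometric series x(1 + λ/4 + λ²/16 + …) adds up to. The sketch below works with exact fractions, represents polynomials of degree at most one as [constant, linear] coefficient pairs, and uses the operator (Tφ)(x) = x∫_0^1 y²φ(y) dy from the problem; the value λ = 1/2 and the helper names are arbitrary choices for illustration.

```python
from fractions import Fraction

def apply_T(phi):
    """(T phi)(x) = x * integral over [0, 1] of y^2 phi(y), for phi given by coefficients."""
    moment = sum(Fraction(c) * Fraction(1, k + 3) for k, c in enumerate(phi))
    return [Fraction(0), moment]

def neumann_partial_sum(f, lam, terms):
    """Partial sum f + lam T f + ... + lam^(terms-1) T^(terms-1) f for f of degree at most 1."""
    total = [Fraction(0), Fraction(0)]
    current = [Fraction(c) for c in f] + [Fraction(0)] * (2 - len(f))
    for _ in range(terms):
        total = [t + c for t, c in zip(total, current)]
        current = [lam * c for c in apply_T(current)]
    return total

lam = Fraction(1, 2)
approx = neumann_partial_sum([0, 1], lam, 40)  # f(x) = x
closed_form = Fraction(4) / (4 - lam)          # coefficient of x in 4x / (4 - lam)

if __name__ == "__main__":
    print(float(approx[1]), float(closed_form))
```

Substituting the closed form back into φ − λTφ gives exactly x, which also confirms that the formula remains a solution outside the disc of convergence, for every λ ≠ 4.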
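29 Calculate
$$(T\varphi_k)(x)=\int_{-\pi}^{\pi}h(x-y)e^{iky}\,dy.$$
Make the change of variables $t=x-y$ to get
$$(T\varphi_k)(x)=\int_{-\pi}^{\pi}h(t)e^{ik(x-t)}\,dt=2\pi a_ke^{ikx},$$
using orthogonality and periodicity properties, so that $\varphi_k$ is an eigenvector with eigenvalue $2\pi a_k$.
Now $T$ is a Hilbert–Schmidt operator with an orthonormal basis of eigenvectors, namely $(e_k)=(\varphi_k/\sqrt{2\pi})$. We can now work with either the $(e_k)$ or the $(\varphi_k)$. If $\varphi$ has Fourier series $\sum_{n=-\infty}^{\infty}d_ne^{int}$, then $\varphi-\lambda T\varphi$ has Fourier series
$$\sum_{n=-\infty}^{\infty}d_n(1-\lambda\lambda_n)e^{int},$$
where
$$\lambda_n=\begin{cases}0 & \text{if } n=0,\\ 2\pi/|n| & \text{if } n\ne 0,\end{cases}$$
so the solution is
$$\varphi(t)=\sum_{n=-\infty}^{\infty}\frac{c_n}{1-\lambda\lambda_n}e^{int},$$
where $c_n$ denotes the $n$-th Fourier coefficient of the right-hand side $f$ and $\lambda\lambda_n\ne 1$ for all $n$.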
B.7 Solutions of Tutorial Problems VII
30 Take $e_1=t/\|t\|$, and $\|t\|^2=\int_{-1}^{1}t^2\,dt=2/3$, so $e_1(t)=\sqrt{3/2}\,t$.
Let $w_2(t)=t^2-\langle t^2,e_1\rangle e_1=t^2$, and normalize to get $e_2(t)=\sqrt{5/2}\,t^2$.
Let $w_3(t)=t^4-\langle t^4,e_1\rangle e_1-\langle t^4,e_2\rangle e_2=t^4-0-(2/7)(5/2)t^2=t^4-5t^2/7$. Now
$$\|w_3\|^2=\int_{-1}^{1}(t^8-10t^6/7+25t^4/49)\,dt=(2/9)-(20/49)+(10/49)=8/441,$$
so we take $e_3(t)=(21/\sqrt{8})(t^4-5t^2/7)$.
The best approximation in $X$ to $f$ is $g=\sum_{k=1}^{3}\langle f,e_k\rangle e_k$, giving
$$g(t)=0+(2/3)(5/2)t^2+(-8/105)(441/8)(t^4-5t^2/7),$$
which reduces to $g(t)=14t^2/3-21t^4/5$.
As a check, note that f−g is orthogonal to
each of the functions t, t2 and t4.
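The arithmetic in this solution can be reproduced exactly with rational numbers. The sketch below is a minimal illustration, assuming the inner product ⟨p,q⟩ = ∫ p(t)q(t) dt over [−1,1] and taking f(t)=1; the latter is an inference from the coefficients 2/3 and −8/105 appearing above, since the problem statement is not reproduced here. Polynomials are stored as dictionaries mapping degree to coefficient, and the helper names are hypothetical.

```python
from fractions import Fraction


def inner(p, q):
    """<p, q> = integral over [-1, 1] of p(t) q(t) dt for real polynomials."""
    total = Fraction(0)
    for i, a in p.items():
        for j, b in q.items():
            if (i + j) % 2 == 0:  # odd powers integrate to zero on [-1, 1]
                total += a * b * Fraction(2, i + j + 1)
    return total


def add_multiple(p, c, q):
    """Return the polynomial p + c*q, dropping zero coefficients."""
    out = dict(p)
    for k, b in q.items():
        out[k] = out.get(k, Fraction(0)) + c * b
    return {k: v for k, v in out.items() if v != 0}


def orthogonalize(vectors):
    """Gram-Schmidt without the normalisation step: returns w_1, w_2, ..."""
    ws = []
    for v in vectors:
        w = dict(v)
        for u in ws:
            w = add_multiple(w, -inner(v, u) / inner(u, u), u)
        ws.append(w)
    return ws


def project(f, ws):
    """Best approximation to f in the span of the orthogonal family ws."""
    g = {}
    for u in ws:
        g = add_multiple(g, inner(f, u) / inner(u, u), u)
    return g


one = Fraction(1)
w1, w2, w3 = orthogonalize([{1: one}, {2: one}, {4: one}])  # t, t^2, t^4
g = project({0: one}, [w1, w2, w3])                          # f(t) = 1 (assumed)
print(w3, inner(w3, w3), g)
```

Working with the unnormalised vectors w_k avoids square roots, so every quantity stays rational; the orthonormal e_k are recovered by dividing each w_k by its norm.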
31 We see that $\varphi_n(x)=\langle x,u_n\rangle$, where $u_n=(1,1,\dots,1,0,0,\dots)$, with $n$ nonzero terms. Now $\|\varphi_n\|=\|u_n\|=\sqrt{n}$.
32 To get the adjoint calculate
$$\langle(A+iB)x,y\rangle=\langle Ax,y\rangle+i\langle Bx,y\rangle=\langle x,Ay\rangle+i\langle x,By\rangle=\langle x,(A-iB)y\rangle,$$
so $T^*=A-iB$.
Now $A=(T+T^*)/2$ and $B=(T-T^*)/(2i)$ (very like the formulae for real and imaginary parts of a complex number).
Since these formulae do define self-adjoint operators $A$ and $B$, it is clear that every operator $T$ has a unique decomposition as $T=A+iB$.
Note
$$T^*T=(A-iB)(A+iB)=A^2-iBA+iAB+B^2$$
and
$$TT^*=(A+iB)(A-iB)=A^2+iBA-iAB+B^2,$$
so that $T^*T-TT^*=2i(AB-BA)$, and $T$ is normal if and only if $AB=BA$.
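The identities in this solution can be tested on matrices, which are bounded operators on a finite-dimensional Hilbert space. The following sketch is only an illustration under that assumption: it forms A=(T+T*)/2 and B=(T−T*)/(2i) for a sample 2×2 complex matrix, chosen arbitrarily, and compares T*T−TT* with 2i(AB−BA). All helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(X):
    """Conjugate transpose, i.e. the Hilbert-space adjoint of a matrix."""
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]


def lincomb(a, X, b, Y):
    """Entrywise a*X + b*Y."""
    n = len(X)
    return [[a * X[i][j] + b * Y[i][j] for j in range(n)] for i in range(n)]


def max_abs(X):
    """Largest absolute value of an entry (zero exactly when X = 0)."""
    return max(abs(entry) for row in X for entry in row)


def cartesian_parts(T):
    """Return (A, B) with A = (T + T*)/2 and B = (T - T*)/(2i)."""
    Ts = adjoint(T)
    A = lincomb(0.5, T, 0.5, Ts)
    B = lincomb(-0.5j, T, 0.5j, Ts)  # 1/(2i) = -i/2
    return A, B


def normality_defect(T):
    """Return the pair (T*T - TT*, 2i(AB - BA)); the two should coincide."""
    A, B = cartesian_parts(T)
    Ts = adjoint(T)
    lhs = lincomb(1, matmul(Ts, T), -1, matmul(T, Ts))
    rhs = lincomb(2j, matmul(A, B), -2j, matmul(B, A))
    return lhs, rhs


T = [[1 + 2j, 3 + 0j], [1j, 4 - 1j]]  # an arbitrary non-normal example
A, B = cartesian_parts(T)
lhs, rhs = normality_defect(T)
print(max_abs(lincomb(1, lhs, -1, rhs)))  # difference of the two sides
print(max_abs(lhs))                        # nonzero, so this T is not normal
```

For a diagonal matrix the parts A and B are diagonal and hence commute, so the same computation returns a zero defect, in line with the criterion AB=BA.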
33 All we need to do is look for eigenvalues, as the spaces are finite-dimensional.
(i) $Df=\lambda f$ is impossible unless $\lambda=0$, since the degree of $Df$ is lower than the degree of $f$. So $\sigma(D)=\{0\}$. Indeed $D^{n+1}=0$, so $D$ is nilpotent, which also implies that its spectrum is $\{0\}$, see earlier examples sheets. The only eigenvectors in $P_n$ are constant functions, so we do not get a basis of eigenvectors.
(ii) $D(e^{ikt})=ike^{ikt}$, so $\sigma(D)=\{0,\pm i,\pm 2i,\dots,\pm ni\}$. Now $D$ has $(2n+1)$ distinct eigenvalues, and $T_n$ has an orthonormal basis of eigenvectors, namely $(e^{ikt}/\sqrt{2\pi})_{k=-n}^{n}$.
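Part (i) can be made concrete by writing D as a matrix. The sketch below assumes the monomial basis 1, t, ..., t^n of P_n (a choice made here for illustration) and checks that the (n+1)-st power of the matrix vanishes while the n-th power does not. The function names are hypothetical.

```python
def differentiation_matrix(n):
    """Matrix of D = d/dt on polynomials of degree <= n in the basis 1, t, ..., t^n."""
    size = n + 1
    D = [[0] * size for _ in range(size)]
    for k in range(1, size):
        D[k - 1][k] = k  # column k holds the coordinates of D t^k = k t^(k-1)
    return D


def matmul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def matrix_power(M, p):
    """M raised to the non-negative integer power p."""
    size = len(M)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(p):
        result = matmul(result, M)
    return result


def is_zero(M):
    """True when every entry of M vanishes."""
    return all(entry == 0 for row in M for entry in row)


n = 4
D = differentiation_matrix(n)
print(is_zero(matrix_power(D, n)), is_zero(matrix_power(D, n + 1)))
```

The matrix is strictly upper triangular, so its only eigenvalue is 0, in agreement with σ(D)={0}.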
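34 We get $\varphi=(I-\lambda T)^{-1}f=f+\lambda Tf+\lambda^2T^2f+\dots$.
Now $(Tf)(x)=\int_0^x t^2\,dt=x^3/3$, $(T^2f)(x)=\int_0^x(t^5/3)\,dt=x^6/18$, ....
In general $(T^nf)(x)=x^{3n}/(3^n n!)$. Summing the series we find that
$$\varphi(x)=\sum_{n=0}^{\infty}\frac{\lambda^nx^{3n}}{3^n n!}=\exp(\lambda x^3/3),$$
and the series converges for all $\lambda\in\mathbb{C}$.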
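The iterates in this solution are easy to reproduce symbolically. The sketch below assumes, as the computation of Tf above suggests, that (Tp)(x) is the integral of t^2 p(t) from 0 to x and that f=1; both are inferences, since the problem statement is not reproduced here. It checks the formula for T^n f exactly and compares a partial sum of the Neumann series with exp(λx^3/3), the sum of the exponential series. The function names are illustrative.

```python
from fractions import Fraction
from math import exp, factorial


def apply_T(p):
    """(Tp)(x) = integral_0^x t^2 p(t) dt, for p stored as {degree: coefficient}."""
    return {k + 3: c / (k + 3) for k, c in p.items()}


def iterate(n):
    """Return T^n f for f = 1 (assumed right-hand side)."""
    p = {0: Fraction(1)}
    for _ in range(n):
        p = apply_T(p)
    return p


def neumann_sum(x, lam, terms):
    """Partial sum of sum_n lam**n (T^n f)(x)."""
    total = 0.0
    for n in range(terms):
        degree, coeff = next(iter(iterate(n).items()))  # T^n f is a monomial
        total += lam ** n * float(coeff) * x ** degree
    return total


x, lam = 1.2, -2.5
print(iterate(2))
print(neumann_sum(x, lam, 40), exp(lam * x ** 3 / 3))
```

Because of the factor n! in the denominator, the partial sums settle quickly for any λ, which is the numerical counterpart of convergence for all complex λ.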
C Course in the Nutshell
C.1 Some useful results and formulae (1)
1 A norm on a vector space, ||x||, satisfies
||x||≥ 0, ||x||=0 if and only if x=0, ||λ
x||=|λ| ||x||, and ||x+y|| ≤ ||x|| + ||y||
(triangle inequality). A norm defines a metric and a
complete normed space is called a Banach space.
2 An inner-product space is a vector space (usually complex) with a scalar product on it, $\langle x,y\rangle\in\mathbb{C}$, such that $\langle x,y\rangle=\overline{\langle y,x\rangle}$, $\langle\lambda x,y\rangle=\lambda\langle x,y\rangle$, $\langle x+y,z\rangle=\langle x,z\rangle+\langle y,z\rangle$, $\langle x,x\rangle\ge 0$ and $\langle x,x\rangle=0$ if and only if $x=0$. This defines a norm by $\|x\|^2=\langle x,x\rangle$. A complete inner-product space is called a Hilbert space. A Hilbert space is automatically a Banach space.
3 The Cauchy–Schwarz inequality. $|\langle x,y\rangle|\le\|x\|\,\|y\|$ with equality if and only if $x$ and $y$ are linearly dependent.
4 Some examples of Hilbert spaces. (i) Euclidean $\mathbb{C}^n$. (ii) $l_2$, sequences $(a_k)$ with $\|(a_k)\|_2^2=\sum|a_k|^2<\infty$. In both cases $\langle(a_k),(b_k)\rangle=\sum a_k\overline{b_k}$. (iii) $L_2[a,b]$, functions on $[a,b]$ with $\|f\|_2^2=\int_a^b|f(t)|^2\,dt<\infty$. Here $\langle f,g\rangle=\int_a^bf(t)\overline{g(t)}\,dt$. (iv) Any closed subspace of a Hilbert space.
5 Other examples of Banach spaces. (i) $C_b(X)$, continuous bounded functions on a topological space $X$. (ii) $l_\infty(X)$, all bounded functions on a set $X$. The supremum norms on $C_b(X)$ and $l_\infty(X)$ make them into Banach spaces. (iii) Any closed subspace of a Banach space.
6 On incomplete spaces. The inner-product ($L_2$) norm on $C[0,1]$ is incomplete. $c_{00}$ (sequences eventually zero), with the $l_2$ norm, is another incomplete i.p.s.
7 The parallelogram identity. $\|x+y\|^2+\|x-y\|^2=2\|x\|^2+2\|y\|^2$ in an inner-product space. It fails in general normed spaces.
8 On subspaces. Complete $\implies$ closed. The closure of a linear subspace is still a linear subspace. $\mathrm{Lin}(A)$ is the smallest subspace containing $A$ and $\mathrm{CLin}(A)$ is its closure, the smallest closed subspace containing $A$.
9 From now on we work in inner-product spaces.
10 The orthogonality. $x\perp y$ if $\langle x,y\rangle=0$. An orthogonal sequence has $\langle e_n,e_m\rangle=0$ for $n\ne m$. If all the vectors have norm 1 it is an orthonormal sequence (o.n.s.), e.g. $e_n=(0,\dots,0,1,0,0,\dots)\in l_2$ and $e_n(t)=(1/\sqrt{2\pi})e^{int}$ in $L_2(-\pi,\pi)$.
11 Pythagoras's theorem: if $x\perp y$ then $\|x+y\|^2=\|x\|^2+\|y\|^2$.
12 The best approximation to $x$ by a linear combination $\sum_{k=1}^n\lambda_ke_k$ is $\sum_{k=1}^n\langle x,e_k\rangle e_k$ if the $e_k$ are orthonormal. Note that $\langle x,e_k\rangle$ is the Fourier coefficient of $x$ w.r.t. $e_k$.
13 Bessel's inequality. $\|x\|^2\ge\sum_{k=1}^n|\langle x,e_k\rangle|^2$ if $e_1,\dots,e_n$ is an o.n.s.
14 Riesz–Fischer theorem. For an o.n.s. $(e_n)$ in a Hilbert space, $\sum\lambda_ne_n$ converges if and only if $\sum|\lambda_n|^2<\infty$; then $\|\sum\lambda_ne_n\|^2=\sum|\lambda_n|^2$.
15 A complete o.n.s. or orthonormal basis (o.n.b.) is an o.n.s. $(e_n)$ such that if $\langle y,e_n\rangle=0$ for all $n$ then $y=0$. In that case every vector is of the form $\sum\lambda_ne_n$ as in the R-F theorem. Equivalently: the closed linear span of the $(e_n)$ is the whole space.
16 Gram–Schmidt orthonormalization process. Start with $x_1,x_2,\dots$ linearly independent. Construct $e_1,e_2,\dots$ an o.n.s. by inductively setting $y_{n+1}=x_{n+1}-\sum_{k=1}^n\langle x_{n+1},e_k\rangle e_k$ and then normalizing $e_{n+1}=y_{n+1}/\|y_{n+1}\|$.
17 On orthogonal complements. $M^\perp$ is the set of all vectors orthogonal to everything in $M$. If $M$ is a closed linear subspace of a Hilbert space $H$ then $H=M\oplus M^\perp$. There is also a linear map, $P_M$, the projection from $H$ onto $M$ with kernel $M^\perp$.
18 Fourier series. Work in $L_2(-\pi,\pi)$ with o.n.s. $e_n(t)=(1/\sqrt{2\pi})e^{int}$. Let $CP(-\pi,\pi)$ be the continuous periodic functions, which are dense in $L_2$. For $f\in CP(-\pi,\pi)$ write $f_m=\sum_{n=-m}^m\langle f,e_n\rangle e_n$, $m\ge 0$. We wish to show that $\|f_m-f\|_2\to 0$, i.e., that $(e_n)$ is an o.n.b.
19 The Fejér kernel. For $f\in CP(-\pi,\pi)$ write $F_m=(f_0+\dots+f_m)/(m+1)$. Then $F_m(x)=(1/2\pi)\int_{-\pi}^{\pi}f(t)K_m(x-t)\,dt$ where $K_m(t)=(1/(m+1))\sum_{k=0}^m\sum_{n=-k}^ke^{int}$ is the Fejér kernel. Also $K_m(t)=(1/(m+1))\,[\sin^2((m+1)t/2)]/[\sin^2(t/2)]$.
20 Fejér's theorem. If $f\in CP(-\pi,\pi)$ then its Fejér sums tend uniformly to $f$ on $[-\pi,\pi]$ and hence in $L_2$ norm also. Hence $\mathrm{CLin}((e_n))\supseteq CP(-\pi,\pi)$ so must be all of $L_2(-\pi,\pi)$. Thus $(e_n)$ is an o.n.b.
21 Corollary. If $f\in L_2(-\pi,\pi)$ then $f(t)=\sum c_ne^{int}$ with convergence in $L_2$, where $c_n=(1/2\pi)\int_{-\pi}^{\pi}f(t)e^{-int}\,dt$.
22 Parseval's formula. If $f,g\in L_2(-\pi,\pi)$ have Fourier series $\sum c_ne^{int}$ and $\sum d_ne^{int}$ then $(1/2\pi)\langle f,g\rangle=\sum c_n\overline{d_n}$.
23 Weierstrass approximation theorem. The polynomials are dense in $C[a,b]$ for any $a<b$ (in the supremum norm).
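The two expressions for the Fejér kernel in item 19 can be compared numerically. The following minimal sketch evaluates the double sum of exponentials and the closed form with squared sines at a few sample points; it assumes t is not a multiple of 2π when the closed form is used, since the quotient is then of the form 0/0. The function names are illustrative.

```python
import cmath
import math


def fejer_kernel_sum(m, t):
    """K_m(t) = (1/(m+1)) * sum_{k=0}^{m} sum_{n=-k}^{k} exp(i n t)."""
    total = 0j
    for k in range(m + 1):
        for n in range(-k, k + 1):
            total += cmath.exp(1j * n * t)
    return (total / (m + 1)).real  # the imaginary parts cancel in pairs


def fejer_kernel_closed(m, t):
    """K_m(t) = sin^2((m+1)t/2) / ((m+1) sin^2(t/2)), for t not in 2*pi*Z."""
    return math.sin((m + 1) * t / 2) ** 2 / ((m + 1) * math.sin(t / 2) ** 2)


for m in (0, 1, 5):
    for t in (0.3, 2.5):
        print(m, t, fejer_kernel_sum(m, t), fejer_kernel_closed(m, t))
```

The closed form makes it evident that the kernel is non-negative, which is the property that distinguishes Fejér sums from the partial sums of the Fourier series and underlies the uniform convergence in item 20.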
C.2 Some useful results and formulae (2)
24 On dual spaces. A linear functional on a vector space $X$ is a linear mapping $\alpha:X\to\mathbb{C}$ (or to $\mathbb{R}$ in the real case), i.e., $\alpha(ax+by)=a\alpha(x)+b\alpha(y)$. When $X$ is a normed space, $\alpha$ is continuous if and only if it is bounded, i.e., $\sup\{|\alpha(x)|:\|x\|\le 1\}<\infty$. Then we define $\|\alpha\|$ to be this sup, and it is a norm on the space $X^*$ of bounded linear functionals, making $X^*$ into a Banach space.
25 Riesz–Fréchet theorem. If $\alpha:H\to\mathbb{C}$ is a bounded linear functional on a Hilbert space $H$, then there is a unique $y\in H$ such that $\alpha(x)=\langle x,y\rangle$ for all $x\in H$; also $\|\alpha\|=\|y\|$.
26 On linear operators. These are linear mappings $T:X\to Y$ between normed spaces. Defining $\|T\|=\sup\{\|T(x)\|:\|x\|\le 1\}$, when finite, makes the bounded (i.e., continuous) operators into a normed space, $B(X,Y)$. When $Y$ is complete, so is $B(X,Y)$. We get $\|Tx\|\le\|T\|\,\|x\|$, and, when we can compose operators, $\|ST\|\le\|S\|\,\|T\|$. Write $B(X)$ for $B(X,X)$, and for $T\in B(X)$, $\|T^n\|\le\|T\|^n$. Inverse $S=T^{-1}$ when $ST=TS=I$.
27 On adjoints. $T\in B(H,K)$ determines $T^*\in B(K,H)$ such that $\langle Th,k\rangle_K=\langle h,T^*k\rangle_H$ for all $h\in H$, $k\in K$. Also $\|T^*\|=\|T\|$ and $T^{**}=T$.
28 On unitary operators. Those $U\in B(H)$ for which $UU^*=U^*U=I$. Equivalently, $U$ is surjective and an isometry (and hence preserves the inner product).
Hermitian operator or self-adjoint operator. Those $T\in B(H)$ such that $T=T^*$.
On normal operators. Those $T\in B(H)$ such that $TT^*=T^*T$ (so including Hermitian and unitary operators).
29 On spectrum. $\sigma(T)=\{\lambda\in\mathbb{C}:(T-\lambda I)\text{ is not invertible in }B(X)\}$. Includes all eigenvalues $\lambda$ where $Tx=\lambda x$ for some $x\ne 0$, and often other things as well. On spectral radius: $r(T)=\sup\{|\lambda|:\lambda\in\sigma(T)\}$. Properties: $\sigma(T)$ is closed, bounded and nonempty. Proof: based on the fact that $(I-A)$ is invertible for $\|A\|<1$. This implies that $r(T)\le\|T\|$.
30 The spectral radius formula. $r(T)=\inf_{n\ge 1}\|T^n\|^{1/n}=\lim_{n\to\infty}\|T^n\|^{1/n}$.
Note that $\sigma(T^n)=\{\lambda^n:\lambda\in\sigma(T)\}$ and $\sigma(T^*)=\{\overline{\lambda}:\lambda\in\sigma(T)\}$. The spectrum of a unitary operator is contained in $\{|z|=1\}$, and the spectrum of a self-adjoint operator is real (proof by Cayley transform: $U=(T-iI)(T+iI)^{-1}$ is unitary).
31 On finite rank operators. $T\in F(X,Y)$ if $\mathrm{Im}\,T$ is finite-dimensional.
On compact operators. $T\in K(X,Y)$ if: whenever $(x_n)$ is bounded, then $(Tx_n)$ has a convergent subsequence. Now $F(X,Y)\subseteq K(X,Y)$ since bounded sequences in a finite-dimensional space have convergent subsequences (because when $Z$ is f.d., $Z$ is isomorphic to $l_2^n$, i.e., $\exists\,S:l_2^n\to Z$ with $S$, $S^{-1}$ bounded). Also limits of compact operators are compact, which shows that a diagonal operator $Tx=\sum\lambda_n\langle x,e_n\rangle e_n$ is compact iff $\lambda_n\to 0$.
32 Hilbert–Schmidt operators. $T$ is H–S when $\sum\|Te_n\|^2<\infty$ for some o.n.b. $(e_n)$. All such operators are compact: write them as a limit of finite rank operators $T_k$ with $T_k\sum_{n=1}^{\infty}a_ne_n=\sum_{n=1}^{k}a_n(Te_n)$. This class includes integral operators $T:L_2(a,b)\to L_2(a,b)$ of the form
$$(Tf)(x)=\int_a^bK(x,y)f(y)\,dy,$$
where $K$ is continuous on $[a,b]\times[a,b]$.
33 On spectral properties of normal operators. If $T$ is normal, then (i) $\ker T=\ker T^*$, so $Tx=\lambda x\implies T^*x=\overline{\lambda}x$; (ii) eigenvectors corresponding to distinct eigenvalues are orthogonal; (iii) $\|T\|=r(T)$.
If $T\in B(H)$ is compact normal, then its set of eigenvalues is either finite or a sequence tending to zero. The eigenspaces are finite-dimensional, except possibly for $\lambda=0$. All nonzero points of the spectrum are eigenvalues.
34 On spectral theorem for compact normal operators. There is an orthonormal sequence $(e_k)$ of eigenvectors of $T$, and eigenvalues $(\lambda_k)$, such that $Tx=\sum_k\lambda_k\langle x,e_k\rangle e_k$. If $(\lambda_k)$ is an infinite sequence, then it tends to 0. All operators of the above form are compact and normal.
Corollary. In the spectral theorem we can have the same formula with an orthonormal basis, adding in vectors from $\ker T$.
35 On general compact operators. We can write $Tx=\sum\mu_k\langle x,e_k\rangle f_k$, where $(e_k)$ and $(f_k)$ are orthonormal sequences and $(\mu_k)$ is either a finite sequence or an infinite sequence tending to 0. Hence $T\in B(H)$ is compact if and only if it is the norm limit of a sequence of finite-rank operators.
36 On integral equations. Fredholm equations on $L_2(a,b)$ are $T\varphi=f$ or $\varphi-\lambda T\varphi=f$, where $(T\varphi)(x)=\int_a^bK(x,y)\varphi(y)\,dy$. Volterra equations are similar, except that $T$ is now defined by $(T\varphi)(x)=\int_a^xK(x,y)\varphi(y)\,dy$.
37 Neumann series. $(I-\lambda T)^{-1}=I+\lambda T+\lambda^2T^2+\dots$, for $\|\lambda T\|<1$.
On separable kernel. $K(x,y)=\sum_{j=1}^ng_j(x)h_j(y)$. The image of $T$ (and hence its eigenvectors for $\lambda\ne 0$) lies in the space spanned by $g_1,\dots,g_n$.
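38 Hilbert–Schmidt theory. Suppose that $K\in C([a,b]\times[a,b])$ and $K(y,x)=\overline{K(x,y)}$. Then (in the Fredholm case) $T$ is a self-adjoint Hilbert–Schmidt operator and eigenvectors corresponding to nonzero eigenvalues are continuous functions. If $\lambda\ne 0$ and $1/\lambda\notin\sigma(T)$, then the solution of $\varphi-\lambda T\varphi=f$ is
$$\varphi=\sum_k\frac{\langle f,v_k\rangle}{1-\lambda\lambda_k}\,v_k,$$
where $(v_k)$ is an orthonormal basis of eigenvectors of $T$ with $Tv_k=\lambda_kv_k$.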
39
Fredholm alternative. Let T be compact and normal and
λ≠ 0. Consider the equations (i) φ−λ
Tφ=0 and (ii) φ−λ Tφ=f. Then EITHER (A) The
only solution of (i) is φ=0 and (ii) has a unique solution
for all f OR (B) (i) has nonzero solutions φ and (ii) can
be solved if and only if f is orthogonal to every solution of
(i).
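The spectral radius formula of item 30 can be watched in action on a small matrix. The sketch below is an illustration under two assumptions made here: it uses the Hilbert–Schmidt (Frobenius) norm in place of the operator norm, which is legitimate in finite dimensions because all norms there are equivalent and so give the same limit, and it uses an arbitrarily chosen upper triangular matrix whose only eigenvalue is 1/2. The function names are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def hs_norm(X):
    """Hilbert-Schmidt (Frobenius) norm: square root of the sum of |entries|^2."""
    return sum(abs(entry) ** 2 for row in X for entry in row) ** 0.5


def root_of_power_norm(T, n):
    """Return ||T^n||^(1/n) computed with the Hilbert-Schmidt norm."""
    size = len(T)
    power = [[float(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        power = matmul(power, T)
    return hs_norm(power) ** (1.0 / n)


# Upper triangular, so the eigenvalues are the diagonal entries: r(T) = 0.5.
T = [[0.5, 1.0], [0.0, 0.5]]
for n in (1, 10, 50, 200):
    print(n, root_of_power_norm(T, n))
```

Here the norm of T exceeds 1 while r(T)=1/2, so the example also shows that the inequality r(T) ≤ ||T|| of item 29 can be strict for operators that are not normal.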
D Supplementary Sections
D.1 Reminder from Complex Analysis
Analytic function theory is the most powerful tool in operator theory. Here we briefly recall a few facts of complex analysis used in this course. Consult any decent textbook on complex variables for a concise exposition. The only difference in our version is that we consider a function $f(z)$ of a complex variable $z$ taking values in an arbitrary normed space $V$ over the field $\mathbb{C}$. By direct inspection one can check that all standard proofs of the listed results work equally well in this more general case.
Definition 1
A function $f(z)$ of a complex variable $z$ taking values in a normed vector space $V$ is called differentiable at a point $z_0$ if the following limit (called the derivative of $f(z)$ at $z_0$) exists:
$$f'(z_0)=\lim_{\Delta z\to 0}\frac{f(z_0+\Delta z)-f(z_0)}{\Delta z}.$$
Definition 2
A function $f(z)$ is called holomorphic (or analytic) in an open set $\Omega\subset\mathbb{C}$ if it is differentiable at any point of $\Omega$.
Theorem 3 (Laurent Series)
Let a function $f(z)$ be analytic in the annulus $r<|z|<R$ for some real $r<R$; then it can be uniquely represented by the Laurent series:
$$f(z)=\sum_{k=-\infty}^{\infty}c_kz^k,\quad\text{for some } c_k\in V.\tag{109}$$
Theorem 4 (Cauchy–Hadamard)
The radii $r'$ and $R'$ ($r'<R'$) of convergence of the Laurent series (109) are given by
$$r'=\limsup_{n\to\infty}\|c_{-n}\|^{1/n}\quad\text{and}\quad\frac{1}{R'}=\limsup_{n\to\infty}\|c_n\|^{1/n}.\tag{110}$$
Last modified: November 6, 2024.
This document was translated from LATEX by
HEVEA.