3 Orthogonality
Pythagoras is forever!
The catchphrase from a TV commercial of the Hilbert Spaces course
As was mentioned in the introduction, a Hilbert space is an analogue of our 3D Euclidean space, and the theory of Hilbert spaces is similar to plane or space geometry. One of the primary results of Euclidean geometry, which still survives in the high school curriculum despite its continuous nasty de-geometrisation, is Pythagoras’ theorem, based on the notion of orthogonality.
So far we were concerned only with distances between points. Now we would like to study angles between vectors, and notably right angles. Pythagoras’ theorem states that if the angle C in a triangle is right then c2=a2+b2, see Figure 5.
Figure 5: Pythagoras’ theorem: c2=a2+b2
It is a very mathematical way
of thinking to turn this property of
right angles into their definition, which will work even in
infinite dimensional Hilbert spaces.
Look for a triangle, or even for a right triangle
A universal piece of advice for solving problems in elementary geometry
3.1 Orthogonal System in Hilbert Space
In inner product spaces it is even more convenient to define orthogonality not via Pythagoras’ theorem but via an equivalent property of the inner product.
Definition 1
Two vectors x and y in an inner product space are orthogonal if ⟨ x,y ⟩=0, written x ⊥ y.
An orthogonal sequence (or orthogonal system) (en) (finite or infinite) is one in which en ⊥ em whenever n≠ m.
An orthonormal sequence (or orthonormal system) (en) is an orthogonal sequence with ||en||=1 for all n.
Exercise 2
- Show that if x ⊥ x then x=0 and consequently x ⊥ y for any y∈ H.
- Show that if all vectors of an orthogonal system are non-zero then they are linearly independent.
Example 3 These are orthonormal sequences:
- Basis vectors (1,0,0), (0,1,0), (0,0,1) in ℝ3 or ℂ3.
- Vectors en=(0,…,0,1,0,…) (with the only 1 in the nth place) in l2. (Could you see a similarity with the previous example?)
- Functions en(t)=(2π)−1/2 eint, n∈ℤ in C[0,2π]:
⟨ en,em ⟩=(2π)−1 ∫02π eint e−imt dt = 1 if n=m and 0 if n≠ m.     (19)
Exercise 4
Let A be a subset of an inner product space V and x⊥ y for any y∈ A. Prove that x⊥ z for all z∈ CLin(A).
Theorem 5 (Pythagoras’)
If x ⊥ y then ||x+y||2=||x||2+||y||2. Also if e1, …, en is orthonormal then
||∑1n ak ek||2=⟨ ∑1n ak ek, ∑1n ak ek ⟩=∑1n | ak |2.
Proof.
A one-line calculation.
□
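Written out, the one-line calculation is simply the expansion of the inner product, using ⟨ x,y ⟩=⟨ y,x ⟩=0 for orthogonal vectors:

||x+y||2=⟨ x+y,x+y ⟩=||x||2+⟨ x,y ⟩+⟨ y,x ⟩+||y||2=||x||2+||y||2.

The second identity of the theorem follows in the same way: expanding ⟨ ∑1n ak ek, ∑1n ak ek ⟩ by linearity, the cross terms ak ām ⟨ ek,em ⟩ with k≠ m vanish, and the diagonal terms give | ak |2 ||ek||2=| ak |2.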
The following theorem provides an important property of Hilbert spaces which will be used many times. Recall that a subset K of a linear space V is convex if for all x, y∈ K and λ∈ [0,1] the point λ x+(1−λ)y is also in K. In particular, any subspace is convex, and so is any unit ball (see Exercise 1).
Theorem 6 (about the Nearest Point)
Let K be a non-empty convex closed subset of a Hilbert space H. For any point x∈ H there is a unique point y∈ K nearest to x.
Proof.
Let d=infy∈ K d(x,y), where d(x,y) is the distance coming from the norm ||x||=√⟨ x,x ⟩, and let (yn) be a sequence of points in K such that limn→ ∞d(x,yn)=d. Then (yn) is a Cauchy sequence. Indeed, from the parallelogram identity for the parallelogram generated by the vectors x−yn and x−ym we have:
||yn−ym||2=2||x−yn||2+2||x−ym||2−||2x−yn−ym||2.
Note that ||2x−yn−ym||2=4||x−(yn+ym)/2||2≥ 4d2 since (yn+ym)/2∈ K by convexity. For sufficiently large m and n we get ||x−ym||2≤ d2+є and ||x−yn||2≤ d2+є, thus ||yn−ym||2≤ 4(d2+є)−4d2=4є, i.e. (yn) is a Cauchy sequence.
Let y be the limit of (yn), which exists by the completeness of H; then y∈ K since K is closed. Then d(x,y)=limn→ ∞d(x,yn)=d. This shows the existence of the nearest point. Let y′ be another point in K such that d(x,y′)=d, then the parallelogram identity implies:
||y−y′||2=2||x−y||2+2||x−y′||2−||2x−y−y′||2≤ 4d2−4d2=0.
This shows the uniqueness of the nearest point.
□
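The nearest-point map of this theorem is easy to experiment with numerically. Below is a minimal Python/NumPy sketch (the choice of K as the closed Euclidean unit ball and all names are mine, not from the notes): the projection x/||x|| is compared against a large random sample of points of K.

import numpy as np

def project_to_unit_ball(x):
    """Nearest point to x in K = the closed unit ball of the Euclidean norm."""
    nx = np.linalg.norm(x)
    return x if nx <= 1 else x / nx

rng = np.random.default_rng(0)
x = np.array([3.0, 4.0])
y = project_to_unit_ball(x)                     # expected ~ (0.6, 0.8)

# no random point of K comes closer to x than the projection y does
samples = rng.uniform(-1, 1, size=(100000, 2))
samples = samples[np.linalg.norm(samples, axis=1) <= 1]
assert np.linalg.norm(x - y) <= np.linalg.norm(x - samples, axis=1).min() + 1e-12
print(y, np.linalg.norm(x - y))                 # ~ [0.6 0.8] and 4.0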
Exercise* 7 The essential rôle of the parallelogram identity in the above proof indicates that the theorem does not hold in a general Banach space.
- Show that in ℝ2 with either norm ||·||1 or ||·||∞ from Example 9 the nearest point could be non-unique (one possible configuration is sketched below);
- Could you construct an example (in a Banach space) when the nearest point does not exist?
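For the first part, here is one possible configuration (a sketch only, not the unique example): in (ℝ2, ||·||∞) every point of the closed convex segment K={(1,t): −1≤ t≤ 1} lies at sup-norm distance exactly 1 from the origin, so the nearest point is far from unique.

import numpy as np

# In (R^2, ||.||_inf) the segment K = {(1, t) : -1 <= t <= 1} is closed and convex,
# yet every point of K is at sup-distance exactly 1 from the origin.
ts = np.linspace(-1, 1, 9)
K = np.stack([np.ones_like(ts), ts], axis=1)
dists = np.max(np.abs(K - np.array([0.0, 0.0])), axis=1)   # ||.||_inf distances
print(dists)   # all equal to 1.0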
Liberté, Égalité, Fraternité!
A longstanding ideal approximated in real life by something completely different
3.2 Bessel’s inequality
For the case when a convex subset is a subspace we can characterise the nearest point in terms of orthogonality.
Theorem 8 (on Perpendicular)
Let M be a subspace of a Hilbert space H and a point x∈ H be fixed. Then z∈ M is the nearest point to x if and only if x−z is orthogonal to any vector in M.
Figure 6: (i) A smaller distance for a non-perpendicular direction; (ii) best approximation from a subspace
Proof.
Let z be the nearest point to x, which exists by the previous Theorem. We claim that x−z is orthogonal to any vector in M; otherwise there exists y∈ M such that ⟨ x−z,y ⟩≠ 0. Then
||x−z−є y||2=||x−z||2−2є ℜ⟨ x−z,y ⟩+є2 ||y||2 < ||x−z||2
if є is chosen to be small enough and such that є ℜ⟨ x−z,y ⟩ is positive, see Figure 6(i). Therefore we get a contradiction with the statement that z is the closest point to x.
On the other hand, if x−z is orthogonal to all vectors in M then in particular (x−z)⊥ (z−y) for all y∈ M, see Figure 6(ii). Since x−y=(x−z)+(z−y) we get by Pythagoras’ theorem:
||x−y||2=||x−z||2+||z−y||2.
So ||x−y||2≥ ||x−z||2 and they are equal if and only if z=y.
□
Exercise 9
The above proof does not work if ⟨ x−z,y ⟩ is an imaginary number; what should be done in this case?
Consider now a basic case of approximation: let x∈ H be
fixed and e1, …, en be orthonormal and denote
H1=Lin{e1,…,en}. We could try to approximate x by a
vector y=λ1 e1+⋯ +λn en ∈ H1.
Corollary 10
The minimal value of ||x−y|| for y∈ H1 is achieved when y=∑1n⟨ x,ei ⟩ ei.
Proof.
Let z=∑1n⟨ x,ei ⟩ ei, then ⟨ x−z,ei ⟩=⟨ x,ei ⟩−⟨ z,ei ⟩=0. Hence x−z is orthogonal to every vector in H1 (cf. Exercise 4), and by the previous Theorem z is the nearest point to x.
□
Figure 7: Best approximation by three trigonometric polynomials
Example 11
- In ℝ3 find the best approximation to (1,0,0) from the plane V:{x1+x2+x3=0}. We take an orthonormal basis e1=(2−1/2, −2−1/2,0), e2=(6−1/2, 6−1/2, −2· 6−1/2) of V (check this!). Then:
z=⟨ x,e1 ⟩e1+⟨ x,e2 ⟩e2=(1/2, −1/2, 0)+(1/6, 1/6, −1/3)=(2/3, −1/3, −1/3).
- In C[0,2π] what is the best approximation to f(t)=t by functions a+beit+ce−it? Let en(t)=(2π)−1/2 eint, n=0,±1, be the orthonormal functions from Example 3. We find:
⟨ f,e0 ⟩=(2π)−1/2 ∫02π t dt=(2π)−1/2 [t2/2]02π=√2 π3/2;
⟨ f,e1 ⟩=(2π)−1/2 ∫02π t e−it dt=i√(2π);
⟨ f,e−1 ⟩=(2π)−1/2 ∫02π t eit dt=−i√(2π) (Why do we not need to check this one?)
Then the best approximation is (see Figure 7):
f0(t)=⟨ f,e0 ⟩e0+⟨ f,e1 ⟩e1+⟨ f,e−1 ⟩e−1=π+i eit−i e−it=π−2sin t.
(Both parts of this example are verified numerically in the sketch below.)
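Both parts can be checked on a computer. Here is a Python/NumPy sketch (the discretisation of [0,2π] and the helper name coeff are mine, not from the notes):

import numpy as np

# Part 1: best approximation to x = (1,0,0) from the plane x1 + x2 + x3 = 0.
x = np.array([1.0, 0.0, 0.0])
e1 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
e2 = np.array([1.0, 1.0, -2.0]) / np.sqrt(6)
z = (x @ e1) * e1 + (x @ e2) * e2
print(np.round(z, 4))                      # [ 0.6667 -0.3333 -0.3333], i.e. (2/3, -1/3, -1/3)

# Part 2: Fourier coefficients of f(t) = t against e_n(t) = (2*pi)**(-1/2) * exp(i*n*t).
N = 200000
t = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dt = 2.0 * np.pi / N

def coeff(n):
    en = np.exp(1j * n * t) / np.sqrt(2.0 * np.pi)
    return np.sum(t * np.conj(en)) * dt    # approximates <f, e_n> = integral of f * conj(e_n)

print(np.round(coeff(0), 3))                            # ~ 7.875 = sqrt(2) * pi**1.5
print(np.round(coeff(1), 3), np.round(coeff(-1), 3))    # ~ i*sqrt(2*pi) and -i*sqrt(2*pi)
# Hence the best approximation is pi + i*e^{it} - i*e^{-it} = pi - 2*sin(t).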
Corollary 12 (Bessel’s inequality)
If (ei) is orthonormal then
||x||2≥ ∑i | ⟨ x,ei ⟩ |2.
Proof.
Let z=∑1n⟨ x,ei ⟩ ei, then x−z⊥ ei for all i, therefore by Exercise 4 x−z⊥ z. Hence:
||x||2=||z||2+||x−z||2
≥ ||z||2=∑1n | ⟨ x,ei ⟩ |2.
□
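Bessel’s inequality is easy to test numerically. A minimal Python/NumPy sketch (the random data and dimensions are my choice):

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=5)
q, _ = np.linalg.qr(rng.normal(size=(5, 2)))   # two orthonormal vectors e_1, e_2 (columns of q)
bessel_sum = np.sum((q.T @ x) ** 2)            # sum of |<x, e_i>|^2
print(bessel_sum, x @ x, bessel_sum <= x @ x)  # the inequality holds; equality iff x in Lin{e_1, e_2}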
—Did you say “rice and fish for them”?
A student question
3.3 The Riesz–Fischer theorem
When (en) is orthonormal we call ⟨ x,en ⟩ the nth Fourier coefficient of x (with respect to (en), naturally).
Theorem 13 (Riesz–Fischer)
Let (en)1∞ be an orthonormal sequence in a Hilbert space H. Then ∑1∞λn en converges in H if and only if ∑1∞| λn |2 < ∞. In this case ||∑1∞λn en||2=∑1∞| λn |2.
Proof.
Necessity: Let xk=∑1k λn en and x=limk→ ∞ xk. So ⟨ x,en ⟩=limk→ ∞⟨ xk,en ⟩=λn for all n. By Bessel’s inequality, for all k
||x||2≥ ∑1k | ⟨ x,en ⟩ |2=∑1k | λn |2,
hence the partial sums ∑1k | λn |2 are bounded, so ∑1∞| λn |2 converges and is at most ||x||2.
Sufficiency: Consider ||xk−xm||=||∑m+1k λn en||=(∑m+1k | λn |2)1/2 for k>m. Since ∑1∞| λn |2 converges, (xk) is a Cauchy sequence in H and thus has a limit x. By Pythagoras’ theorem ||xk||2=∑1k | λn |2, thus for k→ ∞ we get ||x||2=∑1∞| λn |2 by the Lemma about the inner product limit.
□
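The dichotomy in the theorem is easy to see inside l2 itself, where ||xk−xm||2=∑m+1k | λn |2. A small Python/NumPy sketch (the two sample sequences are mine): λn=1/n is square-summable and the partial sums are Cauchy, while λn=1/√n is not.

import numpy as np

def gap(coeffs, m, k):
    """||x_k - x_m|| in l^2 for partial sums of sum_n coeffs[n-1] * e_n (k > m)."""
    return np.sqrt(np.sum(coeffs[m:k] ** 2))

n = np.arange(1, 10**6 + 1)
for lam in (1.0 / n, 1.0 / np.sqrt(n)):        # sum |lambda_n|^2 finite vs infinite
    print([round(gap(lam, 10**j, 10**(j + 1)), 4) for j in range(2, 6)])
# first line tends to 0 (Cauchy partial sums); second stays near sqrt(ln 10) ~ 1.52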
Observation: the closed linear span
of an orthonormal sequence in any Hilbert space looks like
l2, i.e. l2 is a universal model for a
Hilbert space.
By Bessel’s inequality and the Riesz–Fischer theorem we know that the series ∑1∞⟨ x,ei ⟩ ei converges for any x∈ H. What is its limit?
Let y=x− ∑1∞⟨ x,ei ⟩ ei, then
⟨ y,ek ⟩=⟨ x,ek ⟩− ∑1∞⟨ x,ei ⟩⟨ ei,ek ⟩=⟨ x,ek ⟩−⟨ x,ek ⟩=0 for all k.     (20)
Definition 14
An orthonormal sequence (ei) in a Hilbert space H is complete if the identities ⟨ y,ek ⟩=0 for all k imply y=0. A complete orthonormal sequence is also called an orthonormal basis in H.
Theorem 15 (on Orthonormal Basis)
Let (en) be an orthonormal basis in a Hilbert space H. Then for any x∈ H we have
x=∑n⟨ x,en ⟩en and ||x||2=∑n| ⟨ x,en ⟩ |2.
There are constructive existence theorems in mathematics.
An example of a pure existence statement
3.4 Construction of Orthonormal Sequences
Natural questions are: Do orthonormal sequences always exist?
Could we construct them?
Theorem 16 (Gram–Schmidt)
Let (xi) be a sequence of linearly independent vectors in an inner product space V. Then there exists an orthonormal sequence (ei) such that
Lin{x1,x2,…,xn}=Lin{e1,e2,…,en}, for all n.
Proof.
We give an explicit algorithm working by induction. The base of induction: the first vector is e1=x1/||x1||. The step of induction: suppose e1, e2, …, en are already constructed as required. Let yn+1=xn+1−∑i=1n⟨ xn+1,ei ⟩ei. Then by (20) yn+1 ⊥ ei for i=1,…,n. We may put en+1=yn+1/||yn+1|| because yn+1≠ 0 due to the linear independence of the xk’s. Also
Lin{e1,e2,…,en+1}=Lin{e1,e2,…,yn+1}=Lin{e1,e2,…,xn+1}=Lin{x1,x2,…,xn+1}.
So (ei) is an orthonormal sequence with the required property.
□
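The proof is literally an algorithm. A direct transcription in Python/NumPy for vectors in ℝn or ℂn (a sketch; the tolerance and all names are mine):

import numpy as np

def gram_schmidt(vectors):
    """Orthonormalise a sequence of linearly independent vectors (classical Gram-Schmidt)."""
    basis = []
    for x in vectors:
        # y_{n+1} = x_{n+1} - sum_i <x_{n+1}, e_i> e_i   (inner product linear in the first slot)
        y = x - sum(np.vdot(e, x) * e for e in basis)
        norm = np.linalg.norm(y)
        if norm < 1e-12:
            raise ValueError("vectors are not linearly independent")
        basis.append(y / norm)
    return np.array(basis)

E = gram_schmidt([np.array(v, dtype=float) for v in ([1, 1, 0], [1, 0, 1], [0, 1, 1])])
print(np.round(E @ E.conj().T, 10))   # the identity matrix: the e_i are orthonormal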
Example 17
Consider C[0,1] with the usual inner product (17) and apply orthogonalisation to the sequence 1, x, x2, …. Because ||1||=1 we have e1(x)=1. The continuation could be presented by the table (checked symbolically in the sketch below):
e1(x)=1
y2(x)=x−⟨ x,1 ⟩1=x−1/2,   ||y2||2=∫01 (x−1/2)2 d x=1/12,   e2(x)=√12 (x−1/2)
y3(x)=x2−⟨ x2,1 ⟩1−⟨ x2,x−1/2 ⟩(x−1/2)· 12,   …,   e3(x)=√180 (x2−x+1/6)
… … …
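The computation in the table can be delegated to a computer algebra system. A sketch in Python with SymPy, assuming (17) is the real inner product ⟨ f,g ⟩=∫01 f(x)g(x)d x:

import sympy as sp

x = sp.symbols('x')
inner = lambda f, g: sp.integrate(f * g, (x, 0, 1))    # real case, so no conjugation needed

basis = []
for p in (sp.Integer(1), x, x**2):
    y = p - sum(inner(p, e) * e for e in basis)        # subtract projections <p, e> e
    basis.append(sp.simplify(y / sp.sqrt(inner(y, y))))
print(basis)
# output is equivalent to 1, sqrt(12)*(x - 1/2), sqrt(180)*(x**2 - x + 1/6)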
Figure 8: The first five Legendre Pi and Chebyshev Ti polynomials
Example 18
Many famous sequences of orthogonal polynomials, e.g. Chebyshev, Legendre, Laguerre, Hermite, can be obtained by orthogonalisation of 1, x, x2, … with various inner products.
- Legendre polynomials in C[−1,1] with the inner product
⟨ f,g ⟩=∫−11 f(t)g(t) dt;     (21)
- Chebyshev polynomials in C[−1,1] with the inner product
⟨ f,g ⟩=∫−11 f(t)g(t) (1−t2)−1/2 dt;     (22)
- Laguerre polynomials in the space of polynomials P[0,∞) with the inner product
⟨ f,g ⟩=∫0∞ f(t)g(t) e−t dt.
See Figure 8 for the first five Legendre and Chebyshev polynomials. Observe the difference caused by the different inner products (21) and (22). On the other hand note the similarity in oscillating behaviour with different “frequencies”.
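NumPy ships both families, so the polynomials behind Figure 8 can be printed or plotted directly. A sketch; note that NumPy’s conventional normalisation Pi(1)=Ti(1)=1 differs from the unit-norm polynomials produced by orthogonalisation only by constant factors.

import numpy as np
from numpy.polynomial import Chebyshev, Legendre, Polynomial

for i in range(5):
    P = Legendre.basis(i).convert(kind=Polynomial).coef    # P_i in the ordinary power basis
    T = Chebyshev.basis(i).convert(kind=Polynomial).coef   # T_i in the ordinary power basis
    print(i, np.round(P, 3), np.round(T, 3))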
Another natural question is: When is an orthonormal sequence
complete?
Proposition 19
Let (en) be an orthonormal sequence in a Hilbert space H. The following are equivalent:
1. (en) is an orthonormal basis.
2. CLin((en))=H.
3. ||x||2=∑1∞| ⟨ x,en ⟩ |2 for all x∈ H.
Proof.
Clearly 1 implies 2 because, by Theorem 15, x=∑1∞⟨ x,en ⟩en and hence x∈CLin((en)). The same theorem gives ||x||2=∑1∞| ⟨ x,en ⟩ |2, so 1 implies 3.
If (en) is not complete then there exists x∈ H such that x≠ 0 and ⟨ x,ek ⟩=0 for all k, so 3 fails; consequently 3 implies 1.
Finally, if ⟨ x,ek ⟩=0 for all k (with x≠ 0 as above) then ⟨ x,y ⟩=0 for all y∈Lin((en)) and moreover for all y∈CLin((en)), by the Lemma on continuity of the inner product. But then x∉CLin((en)), since ⟨ x,x ⟩=0 is not possible, and 2 also fails. Thus 2 implies 1.
□
Corollary 20
A separable Hilbert space (i.e. one with a countable dense set) can be identified with either l2n or l2, in other words it has an orthonormal basis (en) (finite or infinite) such that
x=∑n⟨ x,en ⟩en and ||x||2=∑n| ⟨ x,en ⟩ |2.
Proof.
Take a countable dense set (xk), then H=CLin((xk)). Delete every vector which is a linear combination of the preceding ones, apply Gram–Schmidt orthonormalisation to the remaining set and apply the previous proposition.
□
Most pleasant compliments are usually orthogonal to our real qualities.
A piece of advice based on observations
3.5 Orthogonal complements
Orthogonality allows us to split a Hilbert space into subspaces which are “independent of each other” as much as possible.
Definition 21
Let M be a subspace of an inner product space V. The orthogonal complement of M, written M⊥, is
M⊥={x∈ V: ⟨ x,m ⟩=0 ∀ m∈ M}.
Theorem 22
If M is a closed subspace of a Hilbert space H then
M⊥ is a closed subspace too (hence a Hilbert space too).
Proof.
Clearly M⊥ is a subspace of H because x, y∈ M⊥ implies ax+by∈ M⊥:
⟨ ax+by,m ⟩= a⟨ x,m ⟩+ b⟨ y,m ⟩=0.
Also if all xn∈ M⊥ and xn→ x then x∈ M⊥, due to the Lemma on the inner product limit.
□
Theorem 23
Let M be a closed subspace of a Hilbert space H. Then for any x∈ H there exists a unique decomposition x=m+n with m∈ M, n∈ M⊥, and ||x||2=||m||2+||n||2. Thus H=M⊕ M⊥ and (M⊥)⊥=M.
Proof.
For a given x there exists a unique closest point m in M by the Theorem on the nearest point, and by the Theorem on the perpendicular (x−m)⊥ y for all y∈ M.
So x= m+(x−m)= m+n with m∈ M and n∈ M⊥. The identity ||x||2=||m||2+||n||2 is just Pythagoras’ theorem, and the decomposition is unique since M∩ M⊥={0}: the null vector is the only vector orthogonal to itself.
Finally, (M⊥)⊥=M. We have H=M⊕ M⊥=(M⊥)⊥⊕ M⊥; for any x∈(M⊥)⊥ there is a decomposition x=m+n with m∈ M and n∈ M⊥, but then ⟨ n,n ⟩=⟨ x−m,n ⟩=0, since both x and m are orthogonal to n∈ M⊥, so n is orthogonal to itself and therefore is zero, that is x=m∈ M.
□
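In the finite-dimensional case the decomposition is the familiar orthogonal projection onto a column space. A minimal Python/NumPy sketch (the matrix A and all names are mine):

import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(5, 2))
Q, _ = np.linalg.qr(A)            # orthonormal basis of M = column space of A
x = rng.normal(size=5)

m = Q @ (Q.T @ x)                 # component of x in M (orthogonal projection)
n = x - m                         # component of x in M-perp
print(np.allclose(Q.T @ n, 0))                      # n is orthogonal to every vector of M
print(np.allclose(x @ x, m @ m + n @ n))            # Pythagoras: ||x||^2 = ||m||^2 + ||n||^2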