Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

3 Orthogonality

  Pythagoras is forever!

 The catchphrase from a TV commercial for the Hilbert Spaces course


As was mentioned in the introduction, Hilbert spaces are an analogue of our 3D Euclidean space, and the theory of Hilbert spaces is similar to plane or space geometry. One of the primary results of Euclidean geometry, which still survives in the high school curriculum despite its continuous nasty de-geometrisation, is Pythagoras’ theorem, based on the notion of orthogonality.

So far we were concerned only with distances between points. Now we would like to study angles between vectors, and notably right angles. Pythagoras’ theorem states that if the angle $C$ in a triangle is right then $c^2=a^2+b^2$, see Figure 5.


Figure 5: Pythagoras’ theorem $c^2=a^2+b^2$

It is a very mathematical way of thinking to turn this property of right angles into their definition, which will work even in infinite dimensional Hilbert spaces.

  Look for a triangle, or even for a right triangle

 Universal advice for solving problems in elementary geometry.


3.1 Orthogonal System in Hilbert Space

In inner product spaces it is even more convenient to define orthogonality not via Pythagoras’ theorem but via an equivalent property of the inner product.

Definition 1 Two vectors $x$ and $y$ in an inner product space are orthogonal if $\langle x,y\rangle=0$, written $x\perp y$.

An orthogonal sequence (or orthogonal system) $(e_n)$ (finite or infinite) is one in which $e_n\perp e_m$ whenever $n\neq m$.

An orthonormal sequence (or orthonormal system) $(e_n)$ is an orthogonal sequence with $\|e_n\|=1$ for all $n$.

Exercise 2
  1. Show that if $x\perp x$ then $x=0$, and consequently $x\perp y$ for any $y\in H$.
  2. Show that if all vectors of an orthogonal system are non-zero then they are linearly independent.
Example 3 These are orthonormal sequences:
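  1. Basis vectors (1,0,0), (0,1,0), (0,0,1) in $\mathbb{R}^3$ or $\mathbb{C}^3$.
  2. Vectors $e_n=(0,\ldots,0,1,0,\ldots)$ (with the only 1 in the $n$th place) in $\ell_2$. (Can you see the similarity with the previous example?)
  3. Functions $e_n(t)=\frac{1}{\sqrt{2\pi}}e^{int}$, $n\in\mathbb{Z}$, in $C[0,2\pi]$:
    \[ \langle e_n,e_m\rangle=\frac{1}{2\pi}\int_0^{2\pi}e^{int}e^{-imt}\,dt=\begin{cases}1, & n=m;\\ 0, & n\neq m.\end{cases} \tag{19} \]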
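The identity (19) can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `inner` and `e` and the choice of the midpoint rule with 2000 steps are assumptions made for illustration only.

```python
import cmath
import math

def inner(f, g, a=0.0, b=2 * math.pi, steps=2000):
    """Approximate <f,g> = integral_a^b f(t) conj(g(t)) dt by the midpoint rule."""
    h = (b - a) / steps
    total = 0j
    for k in range(steps):
        t = a + (k + 0.5) * h
        total += f(t) * g(t).conjugate()
    return total * h

def e(n):
    """The function e_n(t) = exp(i n t) / sqrt(2 pi)."""
    return lambda t: cmath.exp(1j * n * t) / math.sqrt(2 * math.pi)

# Table of |<e_n, e_m>| for n, m = -2, ..., 2: close to the identity matrix.
for n in range(-2, 3):
    print(n, [round(abs(inner(e(n), e(m))), 6) for m in range(-2, 3)])
```

The printed table should be close to the identity matrix: ones on the diagonal ($n=m$) and zeros elsewhere, up to rounding.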
Exercise 4 Let $A$ be a subset of an inner product space $V$ and $x\perp y$ for any $y\in A$. Prove that $x\perp z$ for all $z\in\mathrm{CLin}(A)$.
Theorem 5 (Pythagoras’) If $x\perp y$ then $\|x+y\|^2=\|x\|^2+\|y\|^2$. Also if $e_1,\ldots,e_n$ is orthonormal then
  \[ \Bigl\|\sum_{k=1}^n a_ke_k\Bigr\|^2=\Bigl\langle\sum_{k=1}^n a_ke_k,\sum_{k=1}^n a_ke_k\Bigr\rangle=\sum_{k=1}^n|a_k|^2. \]
Proof. A one-line calculation. □

The following theorem provides an important property of Hilbert spaces which will be used many times. Recall that a subset $K$ of a linear space $V$ is convex if for all $x,y\in K$ and $\lambda\in[0,1]$ the point $\lambda x+(1-\lambda)y$ is also in $K$. In particular, any subspace is convex and any unit ball as well (see Exercise 1).

Theorem 6 (about the Nearest Point) Let $K$ be a non-empty convex closed subset of a Hilbert space $H$. For any point $x\in H$ there is a unique point $y\in K$ nearest to $x$.
Proof. Let $d=\inf_{y\in K}d(x,y)$, where $d(x,y)$ is the distance coming from the norm $\|x\|=\sqrt{\langle x,x\rangle}$, and let $(y_n)$ be a sequence of points in $K$ such that $\lim_{n\to\infty}d(x,y_n)=d$. Then $(y_n)$ is a Cauchy sequence. Indeed, from the parallelogram identity for the parallelogram generated by the vectors $x-y_n$ and $x-y_m$ we have:
  \[ \|y_n-y_m\|^2=2\|x-y_n\|^2+2\|x-y_m\|^2-\|2x-y_n-y_m\|^2. \]
Note that $\|2x-y_n-y_m\|^2=4\|x-\frac{y_n+y_m}{2}\|^2\geq 4d^2$ since $\frac{y_n+y_m}{2}\in K$ by its convexity. For sufficiently large $m$ and $n$ we get $\|x-y_m\|^2\leq d^2+\epsilon$ and $\|x-y_n\|^2\leq d^2+\epsilon$, thus $\|y_n-y_m\|^2\leq 4(d^2+\epsilon)-4d^2=4\epsilon$, i.e. $(y_n)$ is a Cauchy sequence.

Let $y$ be the limit of $(y_n)$, which exists by the completeness of $H$; then $y\in K$ since $K$ is closed. Then $d(x,y)=\lim_{n\to\infty}d(x,y_n)=d$. This shows the existence of the nearest point. Let $y'$ be another point in $K$ such that $d(x,y')=d$; then the parallelogram identity implies:

  \[ \|y-y'\|^2=2\|x-y\|^2+2\|x-y'\|^2-\|2x-y-y'\|^2\leq 4d^2-4d^2=0. \]

This shows the uniqueness of the nearest point. □
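As an illustration of the theorem, take $K$ to be the closed unit ball of $\mathbb{R}^2$, which is convex and closed. The sketch below is not from the original notes: it assumes the well-known fact that for this particular $K$ the nearest point is obtained by radial scaling, and the function names are hypothetical. It then compares the resulting distance with the distances to sampled points of $K$.

```python
import math

def nearest_in_unit_ball(x):
    """Nearest point to x in the closed unit ball {y : ||y|| <= 1} of R^n."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= 1.0:
        return list(x)                # x already lies in K
    return [c / norm for c in x]      # radial scaling onto the unit sphere

def dist(x, y):
    """Euclidean distance ||x - y||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

x = [3.0, 4.0]
y = nearest_in_unit_ball(x)           # the point x / ||x||
d = dist(x, y)                        # equals ||x|| - 1 up to rounding

# Sample 360 points on the boundary of K: none of them is closer to x than y.
samples = [[math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360)]
           for k in range(360)]
assert all(dist(x, s) >= d - 1e-12 for s in samples)
```

The sampling only illustrates the statement; it is the parallelogram identity in the proof that guarantees uniqueness.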

Exercise* 7 The essential rôle of the parallelogram identity in the above proof indicates that the theorem does not hold in a general Banach space.
  1. Show that in $\mathbb{R}^2$ with either norm $\|\cdot\|_1$ or $\|\cdot\|_\infty$ from Example 9 the nearest point could be non-unique;
  2. Could you construct an example (in a Banach space) when the nearest point does not exist?

  Liberte, Egalite, Fraternite!

 A longstanding ideal approximated in the real life by something completely different


3.2 Bessel’s inequality

For the case when the convex subset is a subspace we can characterise the nearest point in terms of orthogonality.

Theorem 8 (on Perpendicular) Let $M$ be a subspace of a Hilbert space $H$ and let a point $x\in H$ be fixed. Then $z\in M$ is the nearest point to $x$ if and only if $x-z$ is orthogonal to any vector in $M$.

Figure 6: (i) A smaller distance for a non-perpendicular direction; and (ii) Best approximation from a subspace

Proof. Let $z$ be the nearest point to $x$, which exists by the previous Theorem. We claim that $x-z$ is orthogonal to any vector in $M$; otherwise there exists $y\in M$ such that $\langle x-z,y\rangle\neq 0$. Then $z+\epsilon y\in M$ and
    \[ \|x-z-\epsilon y\|^2=\|x-z\|^2-2\epsilon\,\Re\langle x-z,y\rangle+\epsilon^2\|y\|^2<\|x-z\|^2, \]
if $\epsilon$ is chosen to be small enough and such that $\epsilon\,\Re\langle x-z,y\rangle$ is positive, see Figure 6(i). Therefore we get a contradiction with the statement that $z$ is the closest point to $x$.

On the other hand, if $x-z$ is orthogonal to all vectors in $M$ then in particular $(x-z)\perp(z-y)$ for all $y\in M$, see Figure 6(ii). Since $x-y=(x-z)+(z-y)$ we get by Pythagoras’ theorem:

    \[ \|x-y\|^2=\|x-z\|^2+\|z-y\|^2. \]

So $\|x-y\|^2\geq\|x-z\|^2$ and they are equal if and only if $z=y$. □

Exercise 9 The above proof does not work if $\langle x-z,y\rangle$ is an imaginary number. What to do in this case?

Consider now a basic case of approximation: let $x\in H$ be fixed, let $e_1,\ldots,e_n$ be orthonormal and denote $H_1=\mathrm{Lin}\{e_1,\ldots,e_n\}$. We could try to approximate $x$ by a vector $y=\lambda_1e_1+\cdots+\lambda_ne_n\in H_1$.

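Corollary 10 The minimal value of $\|x-y\|$ for $y\in H_1$ is achieved when $y=\sum_{i=1}^n\langle x,e_i\rangle e_i$.
Proof. Let $z=\sum_{i=1}^n\langle x,e_i\rangle e_i$, then $\langle x-z,e_i\rangle=\langle x,e_i\rangle-\langle z,e_i\rangle=0$. By the previous Theorem $z$ is the nearest point to $x$. □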
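The formula of Corollary 10 translates directly into a computation in $\mathbb{C}^n$. The following is a minimal sketch under stated assumptions: the names `inner` and `best_approximation`, the sample vector `x` and the two-element orthonormal list `basis` are all chosen here for illustration and do not come from the notes.

```python
def inner(x, y):
    """Inner product <x,y> on C^n, linear in x and conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def best_approximation(x, basis):
    """Return z = sum_i <x,e_i> e_i for an orthonormal list `basis`."""
    z = [0j] * len(x)
    for e in basis:
        c = inner(x, e)                       # the coefficient <x,e_i>
        z = [zk + c * ek for zk, ek in zip(z, e)]
    return z

s = 2 ** -0.5
basis = [[s, s * 1j, 0], [0, 0, 1]]           # an orthonormal pair in C^3
x = [1, 2, 3]
z = best_approximation(x, basis)
residual = [a - b for a, b in zip(x, z)]

# By the Theorem on Perpendicular, x - z is orthogonal to every e_i.
print([abs(inner(residual, e)) for e in basis])
```

Both printed values should vanish up to rounding, which is exactly the orthogonality condition used in the proof.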

Figure 7: Best approximation by three trigonometric polynomials

Example 11
  1. In $\mathbb{R}^3$ find the best approximation to $(1,0,0)$ from the plane $V:\{x_1+x_2+x_3=0\}$. We take an orthonormal basis $e_1=(2^{-1/2},-2^{-1/2},0)$, $e_2=(6^{-1/2},6^{-1/2},-2\cdot 6^{-1/2})$ of $V$ (Check this!). Then:
    \[ z=\langle x,e_1\rangle e_1+\langle x,e_2\rangle e_2=\Bigl(\frac12,-\frac12,0\Bigr)+\Bigl(\frac16,\frac16,-\frac13\Bigr)=\Bigl(\frac23,-\frac13,-\frac13\Bigr). \]
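  2. In $C[0,2\pi]$ what is the best approximation to $f(t)=t$ by functions $a+be^{it}+ce^{-it}$? Let
    \[ e_0=\frac{1}{\sqrt{2\pi}},\qquad e_1=\frac{1}{\sqrt{2\pi}}e^{it},\qquad e_{-1}=\frac{1}{\sqrt{2\pi}}e^{-it}. \]
    We find:
    \[ \langle f,e_0\rangle=\int_0^{2\pi}\frac{t}{\sqrt{2\pi}}\,dt=\Bigl[\frac{t^2}{2}\frac{1}{\sqrt{2\pi}}\Bigr]_0^{2\pi}=\sqrt{2}\,\pi^{3/2}; \]
    \[ \langle f,e_1\rangle=\int_0^{2\pi}\frac{te^{-it}}{\sqrt{2\pi}}\,dt=i\sqrt{2\pi}\qquad\text{(Check this!)} \]
    \[ \langle f,e_{-1}\rangle=\int_0^{2\pi}\frac{te^{it}}{\sqrt{2\pi}}\,dt=-i\sqrt{2\pi}\qquad\text{(Why do we not need to check this one?)} \]
    Then the best approximation is (see Figure 7):
    \[ f_0(t)=\langle f,e_0\rangle e_0+\langle f,e_1\rangle e_1+\langle f,e_{-1}\rangle e_{-1}=\frac{\sqrt{2}\,\pi^{3/2}}{\sqrt{2\pi}}+ie^{it}-ie^{-it}=\pi-2\sin t. \]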
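The second part of Example 11 can be cross-checked by numerical integration. This is a rough sketch rather than the author's method: the midpoint rule, the step count and the helper names `inner`, `e` and `f0` are assumptions introduced here.

```python
import cmath
import math

def inner(f, g, steps=20000):
    """Midpoint-rule approximation of integral_0^{2 pi} f(t) conj(g(t)) dt."""
    h = 2 * math.pi / steps
    return sum(f((k + 0.5) * h) * complex(g((k + 0.5) * h)).conjugate()
               for k in range(steps)) * h

def e(n):
    """The function e_n(t) = exp(i n t) / sqrt(2 pi)."""
    return lambda t: cmath.exp(1j * n * t) / math.sqrt(2 * math.pi)

f = lambda t: t
coeff = {n: inner(f, e(n)) for n in (-1, 0, 1)}   # <f,e_n> for n = -1, 0, 1

def f0(t):
    """Best approximation sum_n <f,e_n> e_n(t) over n = -1, 0, 1."""
    return sum(coeff[n] * e(n)(t) for n in (-1, 0, 1))

print(coeff[0], math.sqrt(2) * math.pi ** 1.5)    # the two values nearly agree
print(f0(1.0), math.pi - 2 * math.sin(1.0))       # the two values nearly agree
```

The small discrepancies are quadrature errors; the imaginary part of `f0(1.0)` should be negligible because the coefficients at $e_1$ and $e_{-1}$ are complex conjugates of each other.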
Corollary 12 (Bessel’s inequality) If $(e_i)$ is orthonormal then
    \[ \|x\|^2\geq\sum_{i=1}^n|\langle x,e_i\rangle|^2. \]
Proof. Let $z=\sum_{i=1}^n\langle x,e_i\rangle e_i$, then $x-z\perp e_i$ for all $i$, therefore by Exercise 4 $x-z\perp z$. Hence:
    \[ \|x\|^2=\|z\|^2+\|x-z\|^2\geq\|z\|^2=\sum_{i=1}^n|\langle x,e_i\rangle|^2. \]
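Bessel’s inequality can be watched at work on $f(t)=t$ in $C[0,2\pi]$ with the orthonormal functions $e_n(t)=e^{int}/\sqrt{2\pi}$. The sketch below relies on two values that are not stated in the notes and should be treated as assumptions to verify: integration by parts gives $\langle f,e_n\rangle=i\sqrt{2\pi}/n$ for $n\neq 0$, and $\|f\|^2=\int_0^{2\pi}t^2\,dt=8\pi^3/3$. The function name `bessel_sum` is hypothetical.

```python
import math

norm_sq = 8 * math.pi ** 3 / 3            # ||f||^2 = integral_0^{2 pi} t^2 dt

def bessel_sum(N):
    """Sum of |<f,e_n>|^2 over n = -N, ..., N for f(t) = t."""
    total = 2 * math.pi ** 3              # |<f,e_0>|^2 = (sqrt(2) pi^{3/2})^2
    for n in range(1, N + 1):
        total += 2 * (2 * math.pi / n ** 2)   # equal terms for n and -n
    return total

for N in (1, 10, 100, 1000):
    print(N, bessel_sum(N), "<=", norm_sq)
```

The partial sums increase with $N$ and stay below $\|f\|^2$, as the corollary requires.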

  —Did you say “rice and fish for them”?

 A student question


3.3 The Riesz–Fischer theorem

When (ei) is orthonormal we call ⟨ x,en ⟩ the nth Fourier coefficient of x (with respect to (ei), naturally).

Theorem 13 (Riesz–Fischer) Let $(e_n)_1^\infty$ be an orthonormal sequence in a Hilbert space $H$. Then $\sum_1^\infty\lambda_ne_n$ converges in $H$ if and only if $\sum_1^\infty|\lambda_n|^2<\infty$. In this case $\|\sum_1^\infty\lambda_ne_n\|^2=\sum_1^\infty|\lambda_n|^2$.
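Proof. Necessity: Let $x_k=\sum_1^k\lambda_ne_n$ and $x=\lim_{k\to\infty}x_k$. So $\langle x,e_n\rangle=\lim_{k\to\infty}\langle x_k,e_n\rangle=\lambda_n$ for all $n$. By Bessel’s inequality, for all $k$
    \[ \|x\|^2\geq\sum_1^k|\langle x,e_n\rangle|^2=\sum_1^k|\lambda_n|^2, \]
hence the partial sums $\sum_1^k|\lambda_n|^2$ are bounded, so the series $\sum_1^\infty|\lambda_n|^2$ converges and its sum is at most $\|x\|^2$.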

Sufficiency: Consider $\|x_k-x_m\|=\|\sum_{m+1}^k\lambda_ne_n\|=(\sum_{m+1}^k|\lambda_n|^2)^{1/2}$ for $k>m$. Since $\sum_1^\infty|\lambda_n|^2$ converges, $(x_k)$ is a Cauchy sequence in $H$ and thus has a limit $x$. By Pythagoras’ theorem $\|x_k\|^2=\sum_1^k|\lambda_n|^2$, thus for $k\to\infty$ we get $\|x\|^2=\sum_1^\infty|\lambda_n|^2$ by the Lemma about inner product limit. □
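The Cauchy criterion used in the sufficiency part can be illustrated with two sample coefficient sequences. Both sequences and the name `tail_norm` are choices made here, not taken from the notes: $\lambda_n=1/n$ is square-summable, while $\lambda_n=1/\sqrt{n}$ is not.

```python
import math

def tail_norm(lam, m, k):
    """||x_k - x_m|| = (sum_{n=m+1}^{k} |lam(n)|^2)^{1/2} for k > m."""
    return math.sqrt(sum(abs(lam(n)) ** 2 for n in range(m + 1, k + 1)))

square_summable = lambda n: 1 / n             # sum of 1/n^2 converges
not_summable = lambda n: 1 / math.sqrt(n)     # sum of 1/n diverges

# Distances ||x_{2m} - x_m|| for growing m, one column per sequence.
for m in (10, 100, 1000):
    print(m, tail_norm(square_summable, m, 2 * m),
          tail_norm(not_summable, m, 2 * m))
```

In the first column the distances tend to zero, in agreement with $(x_k)$ being Cauchy. In the second they stay near $\sqrt{\ln 2}$, so the partial sums cannot converge.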

Observation: the closed linear span of an orthonormal sequence in any Hilbert space looks like l2, i.e. l2 is a universal model for a Hilbert space.

By Bessel’s inequality and the Riesz–Fischer theorem we know that the series $\sum_1^\infty\langle x,e_i\rangle e_i$ converges for any $x\in H$. What is its limit?

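Let $y=x-\sum_1^\infty\langle x,e_i\rangle e_i$, then

\[ \langle y,e_k\rangle=\langle x,e_k\rangle-\sum_{i=1}^\infty\langle x,e_i\rangle\langle e_i,e_k\rangle=\langle x,e_k\rangle-\langle x,e_k\rangle=0\quad\text{for all } k. \tag{20} \]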
Definition 14 An orthonormal sequence $(e_i)$ in a Hilbert space $H$ is complete if the identities $\langle y,e_k\rangle=0$ for all $k$ imply $y=0$.

A complete orthonormal sequence is also called an orthonormal basis in $H$.

Theorem 15 (on Orthonormal Basis) Let $(e_i)$ be an orthonormal basis in a Hilbert space $H$. Then for any $x\in H$ we have
    \[ x=\sum_{n=1}^\infty\langle x,e_n\rangle e_n\quad\text{and}\quad\|x\|^2=\sum_{n=1}^\infty|\langle x,e_n\rangle|^2. \]
Proof. By the Riesz–Fischer theorem, equation (20) and the definition of an orthonormal basis. □

  There are constructive existence theorems in mathematics.

 An example of pure existence statement


3.4 Construction of Orthonormal Sequences

Natural questions are: Do orthonormal sequences always exist? Could we construct them?

Theorem 16 (Gram–Schmidt) Let $(x_i)$ be a sequence of linearly independent vectors in an inner product space $V$. Then there exists an orthonormal sequence $(e_i)$ such that
  \[ \mathrm{Lin}\{x_1,x_2,\ldots,x_n\}=\mathrm{Lin}\{e_1,e_2,\ldots,e_n\}\quad\text{for all } n. \]
Proof. We give an explicit algorithm working by induction. The base of induction: the first vector is $e_1=x_1/\|x_1\|$. The step of induction: let $e_1,e_2,\ldots,e_n$ be already constructed as required. Let $y_{n+1}=x_{n+1}-\sum_{i=1}^n\langle x_{n+1},e_i\rangle e_i$. Then by (20) $y_{n+1}\perp e_i$ for $i=1,\ldots,n$. We may put $e_{n+1}=y_{n+1}/\|y_{n+1}\|$ because $y_{n+1}\neq 0$ due to the linear independence of the $x_k$’s. Also
    \[ \mathrm{Lin}\{e_1,e_2,\ldots,e_{n+1}\}=\mathrm{Lin}\{e_1,e_2,\ldots,y_{n+1}\}=\mathrm{Lin}\{e_1,e_2,\ldots,x_{n+1}\}=\mathrm{Lin}\{x_1,x_2,\ldots,x_{n+1}\}. \]
So $(e_i)$ is an orthonormal sequence. □
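The inductive construction in the proof can be written out for vectors in $\mathbb{C}^n$. This is a minimal sketch, assuming a plain list representation of vectors; the names `inner` and `gram_schmidt`, the sample input and the tolerance `1e-12` used to detect linear dependence are illustrative choices, not part of the notes.

```python
import math

def inner(x, y):
    """Inner product on C^n, conjugate-linear in the second argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Orthonormalise a list of linearly independent vectors."""
    basis = []
    for x in vectors:
        y = list(x)
        for e in basis:
            c = inner(x, e)                          # <x_{n+1}, e_i>
            y = [yk - c * ek for yk, ek in zip(y, e)]
        norm = math.sqrt(inner(y, y).real)
        if norm < 1e-12:                             # assumed tolerance
            raise ValueError("vectors are linearly dependent")
        basis.append([yk / norm for yk in y])        # e_{n+1} = y / ||y||
    return basis

es = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
gram = [[round(abs(inner(a, b)), 10) for b in es] for a in es]
print(gram)    # should be close to the 3x3 identity matrix
```

In floating-point arithmetic this classical form can lose orthogonality for nearly dependent inputs, so it is best read as a transcription of the proof rather than a robust numerical routine.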
Example 17 Consider $C[0,1]$ with the usual inner product (17) and apply orthogonalisation to the sequence $1,x,x^2,\ldots$. Because $\|1\|=1$ we have $e_1(x)=1$. The continuation could be presented by the table:
      \[ e_1(x)=1 \]
      \[ y_2(x)=x-\langle x,1\rangle 1=x-\frac12,\quad\|y_2\|^2=\int_0^1\Bigl(x-\frac12\Bigr)^2dx=\frac{1}{12},\quad e_2(x)=\sqrt{12}\Bigl(x-\frac12\Bigr) \]
      \[ y_3(x)=x^2-\langle x^2,1\rangle 1-\Bigl\langle x^2,x-\frac12\Bigr\rangle\Bigl(x-\frac12\Bigr)\cdot 12,\quad\ldots,\quad e_3=\frac{y_3}{\|y_3\|} \]
      …  …  …
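The table can be continued mechanically in exact rational arithmetic. The sketch below makes several assumptions of its own: polynomials are restricted to real coefficients and stored as coefficient lists, the helper names are hypothetical, and the unnormalised vectors $y_n$ are kept in order to avoid square roots.

```python
from fractions import Fraction

def inner(p, q):
    """<p,q> = integral_0^1 p(x) q(x) dx for real polynomials stored as
    coefficient lists [c0, c1, ...] meaning c0 + c1 x + ..."""
    return sum(Fraction(a) * Fraction(b) / (i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

def subtract(p, q, c):
    """Return the polynomial p - c*q."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a - c * b for a, b in zip(p, q)]

def orthogonalise(count):
    """Orthogonal (not normalised) y_1, ..., y_count built from 1, x, x^2, ..."""
    ys = []
    for n in range(count):
        x = [Fraction(0)] * n + [Fraction(1)]        # the monomial x^n
        y = x
        for prev in ys:
            # <x^n, e> e with e = prev/||prev|| equals (<x^n,prev>/<prev,prev>) prev
            y = subtract(y, prev, inner(x, prev) / inner(prev, prev))
        ys.append(y)
    return ys

ys = orthogonalise(3)
print(ys[1], inner(ys[1], ys[1]))   # coefficients of x - 1/2, and 1/12
print(ys[2], inner(ys[2], ys[2]))   # coefficients of x^2 - x + 1/6, and 1/180
```

Dividing each $y_n$ by $\|y_n\|$ recovers the orthonormal $e_n$, for instance $e_2(x)=\sqrt{12}\,(x-\frac12)$ as in the table.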

  
Figure 8: The first five Legendre $P_i$ and Chebyshev $T_i$ polynomials

Example 18 Many famous sequences of orthogonal polynomials, e.g. Chebyshev, Legendre, Laguerre, Hermite, can be obtained by orthogonalisation of $1,x,x^2,\ldots$ with various inner products.
  1. Legendre polynomials in $C[-1,1]$ with inner product
    \[ \langle f,g\rangle=\int_{-1}^{1}f(t)\overline{g(t)}\,dt. \tag{21} \]
  2. Chebyshev polynomials in $C[-1,1]$ with inner product
    \[ \langle f,g\rangle=\int_{-1}^{1}f(t)\overline{g(t)}\,\frac{dt}{\sqrt{1-t^2}}. \tag{22} \]
  3. Laguerre polynomials in the space of polynomials $P[0,\infty)$ with inner product
    \[ \langle f,g\rangle=\int_0^\infty f(t)\overline{g(t)}\,e^{-t}\,dt. \]
See Figure 8 for the first five Legendre and Chebyshev polynomials. Observe the difference caused by the different inner products (21) and (22). On the other hand, note the similarity in oscillating behaviour with different “frequencies”.

Another natural question is: When is an orthonormal sequence complete?

Proposition 19 Let $(e_n)$ be an orthonormal sequence in a Hilbert space $H$. The following are equivalent:
  1. $(e_n)$ is an orthonormal basis.
  2. $\mathrm{CLin}((e_n))=H$.
  3. $\|x\|^2=\sum_1^\infty|\langle x,e_n\rangle|^2$ for all $x\in H$.
Proof. Clearly 1 implies 2 because $x=\sum_1^\infty\langle x,e_n\rangle e_n$ lies in $\mathrm{CLin}((e_n))$ by Theorem 15. The same theorem gives $\|x\|^2=\sum_1^\infty|\langle x,e_n\rangle|^2$, so 1 implies 3.

If $(e_n)$ is not complete then there exists $x\in H$ such that $x\neq 0$ and $\langle x,e_k\rangle=0$ for all $k$, so 3 fails; consequently 3 implies 1.

Finally, if $\langle x,e_k\rangle=0$ for all $k$ then $\langle x,y\rangle=0$ for all $y\in\mathrm{Lin}((e_n))$ and moreover for all $y\in\mathrm{CLin}((e_n))$, by the Lemma on continuity of the inner product. But then $x\notin\mathrm{CLin}((e_n))$ and 2 also fails, because $\langle x,x\rangle=0$ is not possible for $x\neq 0$. Thus 2 implies 1. □

Corollary 20 A separable Hilbert space (i.e. one with a countable dense set) can be identified with either $\ell_2^n$ or $\ell_2$; in other words it has an orthonormal basis $(e_n)$ (finite or infinite) such that
    \[ x=\sum_{n=1}^\infty\langle x,e_n\rangle e_n\quad\text{and}\quad\|x\|^2=\sum_{n=1}^\infty|\langle x,e_n\rangle|^2. \]
Proof. Take a countable dense set $(x_k)$, then $H=\mathrm{CLin}((x_k))$. Delete all vectors which are linear combinations of preceding vectors, orthonormalise the remaining set by Gram–Schmidt and apply the previous proposition. □

  Most pleasant compliments are usually orthogonal to our real qualities.

 Advice based on observations


3.5 Orthogonal complements

Orthogonality allows us to split a Hilbert space into subspaces which will be “independent from each other” as much as possible.

Definition 21 Let $M$ be a subspace of an inner product space $V$. The orthogonal complement of $M$, written $M^\perp$, is
\[ M^\perp=\{x\in V:\ \langle x,m\rangle=0\ \ \forall m\in M\}. \]
Theorem 22 If $M$ is a closed subspace of a Hilbert space $H$ then $M^\perp$ is a closed subspace too (hence a Hilbert space too).
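Proof. Clearly $M^\perp$ is a subspace of $H$ because $x,y\in M^\perp$ implies $ax+by\in M^\perp$:
    \[ \langle ax+by,m\rangle=a\langle x,m\rangle+b\langle y,m\rangle=0. \]
Also if all $x_n\in M^\perp$ and $x_n\to x$ then $x\in M^\perp$ due to the inner product limit Lemma. □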
Theorem 23 Let $M$ be a closed subspace of a Hilbert space $H$. Then for any $x\in H$ there exists a unique decomposition $x=m+n$ with $m\in M$, $n\in M^\perp$ and $\|x\|^2=\|m\|^2+\|n\|^2$. Thus $H=M\oplus M^\perp$ and $(M^\perp)^\perp=M$.
Proof. For a given $x$ there exists a unique closest point $m$ in $M$ by the Theorem on nearest point, and by the Theorem on perpendicular $(x-m)\perp y$ for all $y\in M$.

So $x=m+(x-m)=m+n$ with $m\in M$ and $n\in M^\perp$. The identity $\|x\|^2=\|m\|^2+\|n\|^2$ is just Pythagoras’ theorem, and $M\cap M^\perp=\{0\}$ because the null vector is the only vector orthogonal to itself.

Finally $(M^\perp)^\perp=M$. We have $H=M\oplus M^\perp=(M^\perp)^\perp\oplus M^\perp$; for any $x\in(M^\perp)^\perp$ there is a decomposition $x=m+n$ with $m\in M$ and $n\in M^\perp$, but then $n=x-m$ lies in $(M^\perp)^\perp$ as well as in $M^\perp$, so it is orthogonal to itself and therefore is zero. □
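The decomposition $x=m+n$ can be computed explicitly for the plane $M=\{x_1+x_2+x_3=0\}$ in $\mathbb{R}^3$ from Example 11, whose orthogonal complement is the line spanned by $(1,1,1)$. The sketch below is an illustration under that assumption, with hypothetical helper names; it finds $n$ by projecting onto $M^\perp$ and sets $m=x-n$.

```python
from fractions import Fraction

def dot(x, y):
    """Real inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def decompose(x, normal):
    """Split x = m + n with n in Lin{normal} and m orthogonal to `normal`."""
    c = Fraction(dot(x, normal), dot(normal, normal))
    n = [c * u for u in normal]            # component in the complement of M
    m = [a - b for a, b in zip(x, n)]      # component in M
    return m, n

m, n = decompose([1, 0, 0], [1, 1, 1])
print(m, n)
print(dot(m, n), dot(m, m) + dot(n, n))    # orthogonality and ||m||^2 + ||n||^2
```

Here $m=(\frac23,-\frac13,-\frac13)$ agrees with the best approximation found in Example 11, and $\|m\|^2+\|n\|^2=1=\|x\|^2$.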


Last modified: November 6, 2024.