This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Lecture 1 Erlangen Programme: Preview

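The simplest objects with non-commutative (but still associative) multiplication may be 2×2 matrices with real entries. The subset of matrices of determinant one is closed under multiplication (since det(AB) = det A · det B), contains the identity matrix, and contains the inverse of each of its elements.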

In other words, these matrices form a group, the SL₂(ℝ) group [240], one of the two most important Lie groups in analysis. The other group is the Heisenberg group [138]. By contrast, the ax+b group, which is used to build wavelets, is only a subgroup of SL₂(ℝ): see the numerator in (1) below.

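The simplest non-linear transforms of the real line, the linear-fractional or Möbius maps, may also be associated with 2×2 matrices, cf. [26, Ch. 13], [262, Ch. 3], [10]:

\[
  g: x \mapsto g\cdot x = \frac{ax+b}{cx+d}, \quad\text{where}\quad
  g = \begin{pmatrix} a & b \\ c & d \end{pmatrix},\ x \in \mathbb{R}. \tag{1}
\]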

An enjoyable calculation shows that the composition of two transforms (1) with different matrices g₁ and g₂ is again a Möbius transform with a matrix equal to the product g₁g₂. In other words, (1) is a (left) action of SL₂(ℝ).
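The composition property can be checked on a concrete pair of matrices. The following is a minimal sketch in exact rational arithmetic; the helper names `mobius` and `matmul` and the sample matrices are illustrative choices, not part of the text.

```python
from fractions import Fraction

def mobius(g, x):
    """Apply the Moebius map (1): x -> (a*x + b) / (c*x + d)."""
    (a, b), (c, d) = g
    return (a * x + b) / (c * x + d)

def matmul(g1, g2):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = g1
    (p, q), (r, s) = g2
    return ((a * p + b * r, a * q + b * s),
            (c * p + d * r, c * q + d * s))

g1 = ((2, 1), (1, 1))    # determinant one
g2 = ((3, 2), (4, 3))    # determinant one
x = Fraction(5, 7)

composed = mobius(g1, mobius(g2, x))        # apply g2 first, then g1
via_product = mobius(matmul(g1, g2), x)     # apply the product g1 g2 once
print(composed, via_product)
```

Both routes give the same rational number, which is what makes (1) a left action.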

According to F. Klein’s Erlangen programme (which was influenced by S. Lie), any geometry deals with invariant properties under a certain transitive group action. For example, we may ask: What kinds of geometry are related to the SL₂(ℝ) action (1)?

The Erlangen programme probably has the highest ratio of praise to actual use among mathematical theories, not only due to the large numerator but also due to the undeservedly small denominator. As we shall see below, Klein’s approach provides some surprising conclusions even for such over-studied objects as circles.

1.1 Make a Guess in Three Attempts

It is easy to see that the SL₂(ℝ) action (1) also makes sense as a map of complex numbers z = x + iy, i² = −1, assuming the denominator is non-zero. Moreover, if y > 0, then g·z has a positive imaginary part as well, i.e. (1) defines a map from the upper half-plane to itself.
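The preservation of the upper half-plane follows from the standard identity Im(g·z) = Im(z)/|cz+d|² for matrices of determinant one. A minimal numerical sketch, with an arbitrarily chosen matrix and point:

```python
def mobius(g, z):
    """Moebius map (1) applied to a complex number z."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

g = ((2, 1), (1, 1))          # determinant one
z = complex(0.5, 2.0)         # a point of the upper half-plane
w = mobius(g, z)

# Im(g.z) = Im(z) / |c z + d|^2 when det g = 1
expected_imag = z.imag / abs(g[1][0] * z + g[1][1]) ** 2
print(w.imag, expected_imag)
```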

However, there is no need to be restricted to the traditional route of complex numbers only. The less well-known dual and double numbers, see [339, Suppl. C] and Appendix B.1, also have the form z = x + ιy but with different assumptions for the imaginary unit ι: ι² = 0 or ι² = 1, respectively. We will write ε and є instead of ι within dual and double numbers, respectively. Although the arithmetic of dual and double numbers is different from that of complex numbers, e.g. they have divisors of zero, we are still able to define their transforms by (1) in most cases.
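The three number systems differ only in the value of ι², so their multiplication can be sketched with one small class. The class name `Hyper` and its field `sigma` (holding ι²) are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hyper:
    """Number x + iota*y with iota**2 = sigma, where sigma is -1, 0 or 1."""
    x: float
    y: float
    sigma: int

    def __mul__(self, other):
        # (x1 + iota y1)(x2 + iota y2)
        #   = (x1 x2 + sigma y1 y2) + iota (x1 y2 + y1 x2)
        return Hyper(self.x * other.x + self.sigma * self.y * other.y,
                     self.x * other.y + self.y * other.x,
                     self.sigma)

eps = Hyper(0, 1, 0)                              # dual unit, eps**2 = 0
double_prod = Hyper(1, 1, 1) * Hyper(1, -1, 1)    # (1 + iota)(1 - iota) in double numbers
print(eps * eps, double_prod)
```

Both products vanish although no factor is zero, which exhibits the divisors of zero mentioned above.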

Three possible values −1, 0 and 1 of σ := ι² will be referred to here as elliptic, parabolic and hyperbolic cases, respectively. We repeatedly meet such a division of various mathematical objects into three classes. They are named after the historically first example, the classification of conic sections. However, the pattern persistently reproduces itself in many different areas: equations, quadratic forms, metrics, manifolds, operators, etc. We will abbreviate this separation as the EPH classification. The common origin of this fundamental division of any family with one parameter can be seen from the simple picture of a coordinate line split by zero into negative and positive half-axes:

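[Figure: a coordinate line split by zero into the negative half-axis, the point zero and the positive half-axis, corresponding to the elliptic, parabolic and hyperbolic cases.] (2)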

Connections between different objects admitting EPH classification are not limited to this common source. There are many deep results linking, for example, the ellipticity of quadratic forms, metrics and operators, e.g. the Atiyah–Singer index theorem. On the other hand, there are still many white spots, empty cells, obscure gaps and missing connections between some subjects.

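To understand the action (1) in all EPH cases we use the Iwasawa decomposition [240, § III.1] of SL₂(ℝ) = ANK into three one-dimensional subgroups A, N and K:

\[
  \begin{pmatrix} a & b \\ c & d \end{pmatrix}
  = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha^{-1} \end{pmatrix}
    \begin{pmatrix} 1 & \nu \\ 0 & 1 \end{pmatrix}
    \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}. \tag{3}
\]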

Figure 1.1: Actions of the subgroups A and N by Möbius transformations. Transverse thin lines are images of the vertical axis, grey arrows show the derived action.

Subgroups A and N act in (1) irrespective of the value of σ: A makes a dilation by α², i.e. z ↦ α²z, and N shifts points horizontally by ν, i.e. z ↦ z + ν. This is illustrated by Fig. 1.1.
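The factors in (3) can be found explicitly by multiplying out the right-hand side: the bottom row of the product is (sin φ, cos φ)/α, and ac + bd equals ν. A minimal sketch under the assumption α > 0; the function names `iwasawa` and `compose` are illustrative.

```python
import math

def iwasawa(g):
    """Return (alpha, nu, phi) with g = A(alpha) N(nu) K(phi) as in (3), alpha > 0."""
    (a, b), (c, d) = g
    alpha = 1.0 / math.hypot(c, d)   # bottom row of g is (sin phi, cos phi) / alpha
    phi = math.atan2(c, d)
    nu = a * c + b * d               # obtained by multiplying out (3)
    return alpha, nu, phi

def compose(alpha, nu, phi):
    """Multiply the three factors of (3) back together."""
    cs, sn = math.cos(phi), math.sin(phi)
    return ((alpha * (cs + nu * sn), alpha * (nu * cs - sn)),
            (sn / alpha, cs / alpha))

g = ((2.0, 1.0), (1.0, 1.0))         # determinant one
alpha, nu, phi = iwasawa(g)
rebuilt = compose(alpha, nu, phi)
print(alpha, nu, phi, rebuilt)
```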


Figure 1.2: Action of the subgroup K. The corresponding orbits are circles, parabolas and hyperbolas shown by thick lines. Transverse thin lines are images of the vertical axis, grey arrows show the derived action.

By contrast, the action of the third matrix from the subgroup K sharply depends on σ, see Fig. 1.2. In elliptic, parabolic and hyperbolic cases, K-orbits are circles, parabolas and (equilateral) hyperbolas, respectively. Thin transverse lines in Fig. 1.2 join points of orbits for the same values of φ and grey arrows represent “local velocities”, i.e. vector fields of derived representations.
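A direct calculation suggests that the quantity (u² − σv² + 1)/v is constant along every K-orbit, so each orbit satisfies an equation of the form u² − σv² − 2nv + 1 = 0: a circle, parabola or hyperbola depending on σ. The sketch below checks this numerically; the helper names and the starting point are assumptions made for illustration.

```python
import math

def hyper_div(p, q, r, s, sigma):
    """(p + iota q) / (r + iota s) where iota**2 = sigma."""
    norm = r * r - sigma * s * s          # (r + iota s)(r - iota s)
    return ((p * r - sigma * q * s) / norm, (q * r - p * s) / norm)

def k_action(phi, u, v, sigma):
    """Image of the point u + iota v under the K-factor of (3) acting by (1)."""
    cs, sn = math.cos(phi), math.sin(phi)
    return hyper_div(cs * u - sn, cs * v, sn * u + cs, sn * v, sigma)

def orbit_invariant(u, v, sigma):
    """(u^2 - sigma v^2 + 1) / v, expected to be constant along a K-orbit."""
    return (u * u - sigma * v * v + 1.0) / v

for sigma in (-1, 0, 1):
    values = [orbit_invariant(*k_action(phi, 0.0, 2.0, sigma), sigma)
              for phi in (0.0, 0.1, 0.2, 0.3)]
    print(sigma, values)
```

The angles are kept small so that the hyperbolic denominator stays away from zero.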

Definition 1 The common name cycle [339] is used to denote circles, parabolas and hyperbolas (as well as straight lines as their limits) in the respective EPH case.

Figure 1.3: K-orbits as conic sections. Circles are sections by the plane EE′, parabolas are sections by PP′ and hyperbolas are sections by HH′. Points on the same generator of the cone correspond to the same value of φ.

It is well known that any cycle is a conic section and an interesting observation is that corresponding K-orbits are, in fact, sections of the same two-sided right-angle cone, see Fig. 1.3. Moreover, each straight line generating the cone, see Fig. 1.3(b), is crossing corresponding EPH K-orbits at points with the same value of parameter φ from (3). In other words, all three types of orbits are generated by the rotations of this generator along the cone.

K-orbits are K-invariant in a trivial way. Moreover, since actions of both A and N for any σ are extremely “shape-preserving”, we find natural invariant objects of the Möbius map:


Figure 1.4: Decomposition of an arbitrary Möbius transformation g into a product.

Theorem 2 The family of all cycles from Definition 1 is invariant under the action (1).

Proof. We will show that, for a given g ∈ SL₂(ℝ) and a cycle C, its image gC is again a cycle. Figure 1.4 provides an illustration with C being a circle, but our reasoning works in all EPH cases.

For a fixed C, there is a unique pair of transformations $g_n$ from the subgroup N and $g_a \in A$ such that the cycle $g_a g_n C$ is exactly a K-orbit. This will be shown later in Exercise 7.

We make a decomposition of $g (g_a g_n)^{-1}$ into a product similar to (3):

\[
  g (g_a g_n)^{-1} = g_a' g_n' g_k'.
\]

Since $g_a g_n C$ is a K-orbit, we have $g_k' (g_a g_n C) = g_a g_n C$. Then:

\[
  gC = g (g_a g_n)^{-1} g_a g_n C = g_a' g_n' g_k' g_a g_n C
     = g_a' g_n' g_k' (g_a g_n C) = g_a' g_n' g_a g_n C.
\]

All transformations from subgroups A and N preserve the shape of any cycle in an obvious way. Therefore, the last expression $g_a' g_n' g_a g_n C$ represents a cycle and our proof is finished.


According to Erlangen ideology, we should now study the invariant properties of cycles.

1.2 Covariance of FSCc

Figure 1.3 suggests that we may obtain a unified treatment of cycles in all EPH cases by consideration of higher-dimensional spaces. The standard mathematical method is to declare objects under investigation (cycles in our case, functions in functional analysis, etc.) to be simply points of some larger space. This space should be equipped with an appropriate structure to hold externally the information which previously described the inner properties of our objects.

A generic cycle is the set of points (u,v) ∈ ℝ² defined for all values of σ by the equation

\[
  k(u^2 - \sigma v^2) - 2lu - 2nv + m = 0. \tag{4}
\]

This equation (and the corresponding cycle) is defined by a point (k, l, n, m) from a projective space ℙ³, since, for a scaling factor λ ≠ 0, the point (λk, λl, λn, λm) defines an equation equivalent to (4). We call ℙ³ the cycle space and refer to the initial ℝ² as the point space.
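To make the projective nature of (4) concrete, the sketch below evaluates its left-hand side for one quadruple in all three EPH cases and for a rescaled quadruple. The function name `cycle_value` is an illustrative choice.

```python
def cycle_value(cycle, sigma, u, v):
    """Left-hand side of (4) for the cycle (k, l, n, m) at the point (u, v)."""
    k, l, n, m = cycle
    return k * (u * u - sigma * v * v) - 2 * l * u - 2 * n * v + m

unit_cycle = (1, 0, 0, -1)                      # u^2 - sigma v^2 = 1
scaled = tuple(3 * t for t in unit_cycle)       # same cycle, another representative

print(cycle_value(unit_cycle, -1, 0, 1))        # point (0, 1) on the unit circle
print(cycle_value(scaled, -1, 0, 1))            # still zero after scaling
print(cycle_value(unit_cycle, 1, 1, 0))         # point (1, 0) on the hyperbola
```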

In order to obtain a connection with the Möbius action (1), we arrange numbers (k, l, n, m) into the matrix

\[
  C^s_{\sigma_c} = \begin{pmatrix} l + \iota_c s n & -m \\ k & -l + \iota_c s n \end{pmatrix}, \tag{5}
\]

with a new hypercomplex unit $\iota_c$ and an additional parameter s, usually equal to ±1. The values of $\sigma_c := \iota_c^2$ are −1, 0 or 1 independent of the value of σ. The matrix (5) is the cornerstone of an extended Fillmore–Springer–Cnops construction (FSCc) [300, 65, 100].

The significance of FSCc in the Erlangen framework is provided by the following result:

Theorem 3 [300, § 6.e] The image $S^s_{\sigma_c}$ of a cycle $C^s_{\sigma_c}$ under transformation (1) with g ∈ SL₂(ℝ) is given by similarity of the matrix (5):
\[
  S^s_{\sigma_c} = g\, C^s_{\sigma_c}\, g^{-1}. \tag{6}
\]
In other words, FSCc (5) intertwines Möbius action (1) on cycles with linear map (6).

There are several ways to prove (6): either by a brute-force calculation (which can, fortunately, be performed by a CAS) in Section 4.3, or through the related orthogonality of cycles [65], see the end of Section 1.4.
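A single numerical instance of (6) can be checked without a CAS. The sketch below is restricted to the elliptic case σ = σ_c = −1 with s = 1, and it assumes that the matrix (5) has the entries l + ι_c s n, −m in the first row and k, −l + ι_c s n in the second. All function names are illustrative.

```python
def matmul(p, q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return ((p[0][0] * q[0][0] + p[0][1] * q[1][0], p[0][0] * q[0][1] + p[0][1] * q[1][1]),
            (p[1][0] * q[0][0] + p[1][1] * q[1][0], p[1][0] * q[0][1] + p[1][1] * q[1][1]))

def fscc(k, l, n, m):
    """Matrix (5) with iota_c = 1j (sigma_c = -1) and s = 1 (assumed form)."""
    return ((l + 1j * n, -m), (k, -l + 1j * n))

def unpack(c):
    """Read (k, l, n, m) back from a matrix of the form (5)."""
    return (c[1][0].real, c[0][0].real, c[0][0].imag, -c[0][1].real)

def transform(g, cycle):
    """Similarity (6): g C g^{-1} for g with determinant one."""
    (a, b), (c, d) = g
    g_inv = ((d, -b), (-c, a))
    return unpack(matmul(matmul(g, fscc(*cycle)), g_inv))

def cycle_value(cycle, z):
    """Left-hand side of (4) with sigma = -1 at the point z = u + iv."""
    k, l, n, m = cycle
    return k * abs(z) ** 2 - 2 * l * z.real - 2 * n * z.imag + m

g = ((2, 1), (1, 1))
circle = (1, 1, 2, 4)              # centre (1, 2), radius 1
z = 1 + 3j                         # a point on that circle
image_cycle = transform(g, circle)
w = (2 * z + 1) / (z + 1)          # g . z by (1)
print(image_cycle, cycle_value(image_cycle, w))
```

The image of the sample point lies on the cycle produced by the similarity, as (6) predicts.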

The important observation here is that our extended version of FSCc (5) uses an imaginary unit $\iota_c$ which is not related to the unit ι that defines the appearance of cycles on the plane. In other words, any EPH type of geometry in the cycle space ℙ³ admits drawing of cycles in the point space ℝ² as circles, parabolas or hyperbolas. We may think of points of ℙ³ as ideal cycles while their depictions on ℝ² are only their shadows on the wall of Plato’s cave.


Figure 1.5: Cycle implementations, centres and foci. (a) Different EPH implementations of the same cycles defined by quadruples of numbers. (b) Centres and foci of two parabolas with the same focal length.

Figure 1.5(a) shows the same cycles drawn in different EPH styles. We note the first-order contact between the circle, parabola and hyperbola at the intersection points with the real line. Informally, we can say that EPH realisations of a cycle look the same in a vicinity of the real line. This is not surprising, since cycles are invariants of the hypercomplex Möbius transformations, which are extensions of the SL₂(ℝ)-action (1) on the real line.

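Points $c_{e,p,h} = (l/k,\ -\sigma_c n/k)$ on Fig. 1.5(a) are the respective e/p/h-centres of drawn cycles. They are related to each other through several identities:

\[
  c_e = \bar{c}_h, \qquad c_p = \tfrac{1}{2}(c_e + c_h), \tag{7}
\]

where the bar denotes the reflection (u, v) ↦ (u, −v) in the real axis.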

Figure 1.5(b) presents two cycles drawn as parabolas. They have the same focal length, n/(2k), and, thus, their e-centres are on the same level. In other words, concentric parabolas are obtained by a vertical shift, not by scaling, as an analogy with circles or hyperbolas may suggest.

Figure 1.5(b) also presents points, called e/p/h-foci,

\[
  f_{e,p,h} = \left( \frac{l}{k},\ \frac{\det C^s_{\sigma_c}}{2nk} \right), \tag{8}
\]

which are independent of the sign of s. If a cycle is depicted as a parabola, then the h-focus, p-focus and e-focus are, correspondingly, the geometrical focus of the parabola, its vertex, and the point on the directrix nearest to the vertex.
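The geometric meaning of the three foci can be checked on a single parabola. The sketch assumes that (5) has the entries l + ι_c s n, −m, k, −l + ι_c s n, so that det C = −l² + σ_c n² + mk for s = ±1; the function names are illustrative.

```python
def det_fscc(cycle, sigma_c):
    """Determinant of the matrix (5) for s = +1 or -1 (s enters only through s**2)."""
    k, l, n, m = cycle
    return -l * l + sigma_c * n * n + m * k

def centre(cycle, sigma_c):
    """The sigma_c-centre (l/k, -sigma_c n/k)."""
    k, l, n, m = cycle
    return (l / k, -sigma_c * n / k)

def focus(cycle, sigma_c):
    """The sigma_c-focus given by formula (8)."""
    k, l, n, m = cycle
    return (l / k, det_fscc(cycle, sigma_c) / (2 * n * k))

# With sigma = 0 the quadruple (1, 1, 2, 5) gives u^2 - 2u - 4v + 5 = 0,
# i.e. the parabola v = (u - 1)^2 / 4 + 1 with vertex (1, 1) and focal length 1.
parabola = (1, 1, 2, 5)
print(focus(parabola, 1), focus(parabola, 0), focus(parabola, -1))
```

For this parabola the h-focus is the geometric focus (1, 2), the p-focus is the vertex (1, 1) and the e-focus is the point (1, 0) on the directrix v = 0.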

As we will see (cf. Theorems 5 and 7), all three centres and three foci are useful attributes of a cycle, even if it is drawn as a circle.

1.3 Invariants: Algebraic and Geometric

We use known algebraic invariants of matrices to build appropriate geometric invariants of cycles. It is yet another demonstration that any division of mathematics into subjects is only illusory.

For 2×2 matrices (and, therefore, cycles), there are only two essentially different invariants under similarity (6) (and, therefore, under Möbius action (1)): the trace and the determinant. The latter was already used in (8) to define a cycle’s foci. However, due to the projective nature of the cycle space ℙ³, the absolute values of the trace or determinant are irrelevant, unless they are zero.

Alternatively, we may have a special arrangement for the normalisation of quadruples (k, l, n, m). For example, if k ≠ 0, we may normalise the quadruple to (1, l/k, n/k, m/k), which highlights the cycle’s centre. Moreover, in this case, $-\det C^s_{\sigma_c}$ is equal to the square of the cycle’s radius, cf. Section 1.6. Another normalisation, $\det C^s_{\sigma_c} = \pm 1$, is used in [163] to obtain a good condition for touching circles.

We still have important characterisations even with non-normalised cycles. For example, invariant classes (for different $\sigma_c$) of cycles are defined by the condition $\det C^s_{\sigma_c} = 0$. Such a class is parameterised only by two real numbers and, as such, is easily attached to a certain point of ℝ². For example, the cycle $C^s_{\sigma_c}$ with $\det C^s_{\sigma_c} = 0$, $\sigma_c = -1$, drawn elliptically, represents just a point (l/k, n/k), i.e. an (elliptic) zero-radius circle. The same condition with $\sigma_c = 1$ in hyperbolic drawing produces a null-cone originating at the h-centre (l/k, −n/k) of the cycle, as follows from (4) by completing the squares:

\[
  \left(u - \frac{l}{k}\right)^2 - \left(v + \frac{n}{k}\right)^2 = 0,
\]

i.e. a zero-radius cycle in a hyperbolic metric.


Figure 1.6: Different σ-implementations of the same σc-zero-radius cycles. The corresponding foci belong to the real axis.

In general, for every concept there are at least nine possibilities: three EPH cases in the cycle space times three EPH realisations in the point space. These nine cases for “zero-radius” cycles are shown in Fig. 1.6. For example, p-zero-radius cycles in any implementation touch the real axis, that is they are horocycles.

This “touching” property is a manifestation of the boundary effect in the upper-half plane geometry. The famous question about hearing a drum’s shape has a counterpart: Can we see/feel the boundary from inside a domain?

Both orthogonality relations described below are “boundary-aware” as well. After all, this is not surprising, since the SL₂(ℝ) action on the upper half-plane was obtained as an extension of its action (1) on the boundary.

According to the categorical viewpoint, internal properties of objects are of minor importance in comparison to their relations with other objects from the same class. As an example, we may give the proof of Theorem 3 described at the end of the next section. Thus, from now on, we will look for invariant relations between two or more cycles.

1.4 Joint Invariants: Orthogonality

The most expected relation between cycles is based on the following Möbius-invariant “inner product”, built from a trace of the product of two cycles as matrices:

\[
  \langle C^s_{\sigma_c}, S^s_{\sigma_c} \rangle
  = -\operatorname{tr}\bigl( C^s_{\sigma_c}\, \overline{S^s_{\sigma_c}} \bigr). \tag{9}
\]

Here, $\overline{S^s_{\sigma_c}}$ means the complex conjugation of elements of the matrix $S^s_{\sigma_c}$. Notably, an inner product of this type is used, for example, in the GNS construction to make a Hilbert space out of a C*-algebra. The next standard move is given by the following definition:

Definition 4 Two cycles are called $\sigma_c$-orthogonal if $\langle C^s_{\sigma_c}, S^s_{\sigma_c} \rangle = 0$.

For the case of $\sigma_c \sigma = 1$, i.e. when the geometries of the cycle and point spaces are both either elliptic or hyperbolic, such an orthogonality is standard, defined in terms of angles between tangent lines in the intersection points of two cycles. However, in the remaining seven (9−2) cases, the innocent-looking Definition 4 brings unexpected relations.
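In the elliptic case the product (9) can be evaluated with ordinary complex matrices. The sketch below assumes σ_c = −1, s = 1 and the entries l + ι_c s n, −m, k, −l + ι_c s n for the matrix (5); under these assumptions the product for two circles equals d² − r₁² − r₂², where d is the distance between the centres, so it vanishes exactly for circles meeting at a right angle. The function names are illustrative.

```python
def fscc(k, l, n, m):
    """Matrix (5) with iota_c = 1j (sigma_c = -1) and s = 1 (assumed form)."""
    return ((l + 1j * n, -m), (k, -l + 1j * n))

def inner(c1, c2):
    """<C, S> = -tr(C * conj(S)) as in (9)."""
    p = fscc(*c1)
    q = tuple(tuple(complex(e).conjugate() for e in row) for row in fscc(*c2))
    trace = (p[0][0] * q[0][0] + p[0][1] * q[1][0]
             + p[1][0] * q[0][1] + p[1][1] * q[1][1])
    return -trace.real

def circle(cx, cy, r):
    """Cycle (k, l, n, m) of the circle with centre (cx, cy) and radius r."""
    return (1, cx, cy, cx * cx + cy * cy - r * r)

first = circle(0, 0, 3)
second = circle(5, 0, 4)        # 5^2 = 3^2 + 4^2, so the circles are orthogonal
third = circle(1, 2, 3)
print(inner(first, second), inner(third, third))
```

The second printed value is −18 = 2 det C for the circle of radius 3, in agreement with −det C being the squared radius.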


    
Figure 1.7: Orthogonality of the first kind in the elliptic point space. Each picture presents two groups (green and blue) of cycles which are orthogonal to the red cycle $C^s_{\sigma_c}$. Point b belongs to $C^s_{\sigma_c}$ and the family of blue cycles passing through b is orthogonal to $C^s_{\sigma_c}$. They all also intersect at the point d which is the inverse of b in $C^s_{\sigma_c}$. Any orthogonality is reduced to the usual orthogonality with a new (“ghost”) cycle (shown by the dashed line), which may or may not coincide with $C^s_{\sigma_c}$. For any point a on the “ghost” cycle, the orthogonality is reduced to the local notion in terms of tangent lines at the intersection point. Consequently, such a point a is always the inverse of itself.

Elliptic (in the point space) realisations of Definition 4, i.e. σ = −1, are shown in Fig. 1.7. Figure 1.7(a) corresponds to the elliptic cycle space, i.e. $\sigma_c = -1$. The orthogonality between the red circle and any circle from the blue or green families is given in the usual Euclidean sense. Figure 1.7(b) (parabolic in the cycle space) and Fig. 1.7(c) (hyperbolic) show the non-local nature of the orthogonality. There are analogous pictures in parabolic and hyperbolic point spaces, see Section 6.1.

This orthogonality may still be expressed in the traditional sense if we associate with the red circle the corresponding “ghost” circle, which is shown by the dashed lines in Fig. 1.7. To describe the ghost cycle, we need the Heaviside function χ(σ):

\[
  \chi(t) = \begin{cases} 1, & t \ge 0; \\ -1, & t < 0. \end{cases} \tag{10}
\]

Theorem 5 A cycle is $\sigma_c$-orthogonal to cycle $C^s_{\sigma_c}$ if it is orthogonal in the usual sense to the σ-realisation of “ghost” cycle $G^s_{\sigma_c}$, which is defined by the following two conditions:
  1. The χ(σ)-centre of $G^s_{\sigma_c}$ coincides with the $\sigma_c$-centre of $C^s_{\sigma_c}$.
  2. Cycles $G^s_{\sigma_c}$ and $C^s_{\sigma_c}$ have the same roots. Moreover, $\det G^1_{\sigma} = \det C^{\chi(\sigma_c)}_{\sigma}$.
The above connection between various centres of cycles illustrates their relevance to our approach.

One can easily check the following orthogonality properties of the zero-radius cycles defined in the previous section:

  1. Due to the identity $\langle C^s_{\sigma_c}, C^s_{\sigma_c} \rangle = 2 \det C^s_{\sigma_c}$, zero-radius cycles are self-orthogonal (isotropic).
  2. A cycle $C^s_{\sigma_c}$ is σ-orthogonal to a zero-radius cycle $Z^s_{\sigma_c}$ if and only if $C^s_{\sigma_c}$ passes through the σ-centre of $Z^s_{\sigma_c}$.

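Proof. [Sketch of proof of Theorem 3] The validity of Theorem 3 for a zero-radius cycle

\[
  Z^s_{\sigma_c} = \begin{pmatrix} z & -z\bar{z} \\ 1 & -\bar{z} \end{pmatrix}
  = \frac{1}{2} \begin{pmatrix} z & z \\ 1 & 1 \end{pmatrix}
    \begin{pmatrix} 1 & -\bar{z} \\ 1 & -\bar{z} \end{pmatrix}
\]

with the centre z = x + ιy is straightforward. This implies the result for a generic cycle with the help of Möbius invariance of the product (9) (and, thus, the orthogonality) and the above relation between the orthogonality and the incidence (the second property in the preceding list). See Exercise 7 for details.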


1.5 Higher-order Joint Invariants: Focal Orthogonality

With our appetite already whetted we may wish to build more joint invariants. Indeed, for any polynomial $p(x_1, x_2, \ldots, x_n)$ of several non-commuting variables, one may define an invariant joint disposition of n cycles ${}^{j}C^s_{\sigma_c}$ by the condition

\[
  \operatorname{tr} p\bigl({}^{1}C^s_{\sigma_c}, {}^{2}C^s_{\sigma_c}, \ldots, {}^{n}C^s_{\sigma_c}\bigr) = 0.
\]

However, it is preferable to keep some geometrical meaning of constructed notions.

An interesting observation is that, in the matrix similarity of cycles (6), one may replace element g ∈ SL₂(ℝ) by an arbitrary matrix corresponding to another cycle. More precisely, the product $C^s_{\sigma_c} S^s_{\sigma_c} C^s_{\sigma_c}$ is again a matrix of the form (5) and, thus, may be associated with a cycle. This cycle may be considered as the reflection of $S^s_{\sigma_c}$ in $C^s_{\sigma_c}$.

Definition 6 A cycle $C^s_{\sigma_c}$ is f-orthogonal to a cycle $S^s_{\sigma_c}$ if the reflection of $S^s_{\sigma_c}$ in $C^s_{\sigma_c}$ is orthogonal (in the sense of Definition 4) to the real line. Analytically, this is defined by
\[
  \operatorname{tr}\bigl( C^s_{\sigma_c} S^s_{\sigma_c} C^s_{\sigma_c} R^s_{\sigma_c} \bigr) = 0, \tag{11}
\]
where $R^s_{\sigma_c}$ is the matrix of the real line.

Due to invariance of all components in the above definition, f-orthogonality is a Möbius-invariant condition. Clearly, this is not a symmetric relation: if $C^s_{\sigma_c}$ is f-orthogonal to $S^s_{\sigma_c}$, then $S^s_{\sigma_c}$ is not necessarily f-orthogonal to $C^s_{\sigma_c}$.


    
Figure 1.8: Focal orthogonality for circles. To highlight both similarities and distinctions with ordinary orthogonality, we use the same notations as in Fig. 1.7.

Figure 1.8 illustrates f-orthogonality in the elliptic point space. In contrast to Fig. 1.7, it is not a local notion at the intersection points of cycles for all σc. However, it may again be clarified in terms of the appropriate f-ghost cycle, cf. Theorem 5.

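Theorem 7 A cycle is f-orthogonal to a cycle $C^s_{\sigma_c}$ if it is orthogonal in the traditional sense to its f-ghost cycle $S^{\sigma_c}_{\sigma_c} = C^{\chi(\sigma)}_{\sigma_c} R^{\sigma_c}_{\sigma_c} C^{\chi(\sigma)}_{\sigma_c}$, which is the reflection of the real line $R^{\sigma_c}_{\sigma_c}$ in $C^{\chi(\sigma)}_{\sigma_c}$, where χ is the Heaviside function (10). Moreover:
  1. The χ(σ)-centre of $S^{\sigma_c}_{\sigma_c}$ coincides with the $\sigma_c$-focus of $C^s_{\sigma_c}$. Consequently, all lines f-orthogonal to $C^s_{\sigma_c}$ pass through the respective focus.
  2. Cycles $C^s_{\sigma_c}$ and $S^{\sigma_c}_{\sigma_c}$ have the same roots.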

Note the above intriguing interplay between the cycle’s centres and foci. Although f-orthogonality may look exotic, it will appear again at the end of the next section.

Of course, it is possible to define other interesting higher-order joint invariants of two or even more cycles.

1.6 Distance, Length and Perpendicularity

Geometry in the plain meaning of the word deals with distances and lengths. Can we obtain them from cycles?


Figure 1.9: Radius and distance for parabolas. (a) The square of the parabolic diameter is the square of the distance between roots if they are real (z1 and z2). Otherwise, it is the negative square of the distance between the adjoint roots (z3 and z4). (b) Distance is the extremum of diameters in elliptic (z1 and z2) and parabolic (z3 and z4) cases.

We already mentioned that, for circles normalised by the condition k = 1, the value $-\det C^s_{\sigma_c} = -\tfrac{1}{2}\langle C^s_{\sigma_c}, C^s_{\sigma_c} \rangle$ produces the square of the traditional circle radius. Thus, we may keep it as the definition of the square of the $\sigma_c$-radius for any cycle. However, we then need to accept that, in the parabolic case, the diameter (twice the radius) is the (Euclidean) distance between the (real) roots of the parabola, see Fig. 1.9(a).

Having already defined the radii of circles, we may use them for other measurements in several different ways. For example, the following variational definition may be used:

Definition 8 The distance between two points is the extremum of diameters of all cycles passing through both points—see Fig. 1.9(b).

If $\sigma_c = \sigma$, this definition gives, in all EPH cases, the following expression for a distance $d_{e,p,h}(u,v)$ between the endpoints of any vector w = u + ιv:

\[
  d_{e,p,h}(u,v)^2 = (u + \iota v)(u - \iota v) = u^2 - \sigma v^2. \tag{12}
\]

The parabolic distance $d_p^2 = u^2$, see Fig. 1.9(b), sits algebraically between $d_e$ and $d_h$ according to the general principle (2) and is widely accepted [339]. However, one may be unsatisfied by its degeneracy.

An alternative measurement is motivated by the fact that a circle is the set of points equidistant from its centre. However, there are now several choices for the “centre”: it may be either one of the three centres (7) or one of the three foci (8).

Definition 9 The length of a directed interval AB is the radius of the cycle with its centre (denoted by $l_c(AB)$) or focus (denoted by $l_f(AB)$) at the point A which passes through B.

This definition is less common and has some unusual properties like non-symmetry: $l_f(AB) \neq l_f(BA)$. However, it comfortably fits the Erlangen programme due to its SL₂(ℝ)-conformal invariance:

Theorem 10 ([191]) Let l denote either the EPH distances (12) or any length from Definition 9. Then, for fixed $y, y' \in \mathbb{R}^\sigma$, the limit

\[
  \lim_{t \to 0} \frac{l(g\cdot y,\ g\cdot(y + t y'))}{l(y,\ y + t y')}
\]

(where g ∈ SL₂(ℝ)) exists and its value depends only on y and g and is independent of y′.
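In the familiar elliptic case with the Euclidean length l(z₁, z₂) = |z₁ − z₂|, the limit is the standard conformal factor 1/|cy+d|², and its independence of the direction y′ can be observed numerically. A minimal sketch covering only this complex-number case; the function names and sample values are illustrative.

```python
def mobius(g, z):
    """Moebius map (1) applied to a complex number z."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def length_ratio(g, y, direction, t):
    """l(g.y, g.(y + t y')) / l(y, y + t y') for the Euclidean length l."""
    moved = y + t * direction
    return abs(mobius(g, y) - mobius(g, moved)) / abs(y - moved)

g = ((2, 1), (1, 1))
y = complex(0.5, 2.0)
limit = 1 / abs(g[1][0] * y + g[1][1]) ** 2      # 1 / |c y + d|^2

for direction in (1, 1j, 1 + 1j):
    print(direction, length_ratio(g, y, direction, 1e-6), limit)
```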

Figure 1.10: The perpendicular as the shortest route to a line.

We may return from lengths to angles noting that, in the Euclidean space, a perpendicular is the shortest route from a point to a line—see Fig. 1.10.

Definition 11 Let l be a length or distance. We say that a vector AB is l-perpendicular to a vector CD if the function $l(AB + \varepsilon\, CD)$ of a variable ε has a local extremum at ε = 0.

A pleasant surprise is that $l_f$-perpendicularity obtained using the length from focus (Definition 9) coincides with f-orthogonality (already defined in Section 1.5), as follows from Theorem 7(1).

All these studies are waiting to be generalised to higher dimensions. Quaternions and Clifford algebras provide a suitable language for this [191, 273].

1.7 The Erlangen Programme at Large

As we already mentioned, the division of mathematics into separate areas is purely imaginary. Therefore, it is unnatural to limit the Erlangen programme only to “geometry”. We may continue to look for SL₂(ℝ)-invariant objects in other related fields. For example, transform (1) generates the unitary representation on the space L²(ℝ):

\[
  g: f(x) \mapsto \frac{1}{(cx+d)}\, f\!\left( \frac{ax+b}{cx+d} \right). \tag{13}
\]

The above transformations have two invariant subspaces within L²(ℝ): the Hardy space and its orthogonal complement. These spaces are of enormous importance in harmonic analysis. Similar transformations

\[
  g: f(z) \mapsto \frac{1}{(cz+d)^m}\, f\!\left( \frac{az+b}{cz+d} \right), \tag{14}
\]

for m=2, 3, …, can be defined on square-integrable functions in the upper half-plane. The respective invariant subspaces are weighted Bergman spaces of complex-valued analytic and poly-analytic functions.
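The weight 1/(cx+d) in (13) is tied to the elementary identity g·x − g·y = (x − y)/((cx+d)(cy+d)) for matrices of determinant one. Letting y tend to x shows that the derivative of the map (1) is 1/(cx+d)², so the squared weight is exactly the Jacobian of the substitution, which is why (13) preserves the L²(ℝ) norm. The identity can be checked exactly in rational arithmetic; the function names and sample values below are illustrative.

```python
from fractions import Fraction

def mobius(g, x):
    """Moebius map (1): x -> (a*x + b) / (c*x + d)."""
    (a, b), (c, d) = g
    return (a * x + b) / (c * x + d)

def weight(g, x):
    """The factor 1 / (c x + d) appearing in (13)."""
    (_, _), (c, d) = g
    return 1 / (c * x + d)

g = ((2, 1), (1, 1))                    # determinant one
x, y = Fraction(1, 3), Fraction(5, 2)

lhs = mobius(g, x) - mobius(g, y)
rhs = (x - y) * weight(g, x) * weight(g, y)
print(lhs, rhs)
```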

Transformations (14) produce the discrete series representations of SL₂(ℝ), cf. [240, § IX.2]. Consequently, all main objects of complex analysis can be obtained in terms of invariants of these representations. For example:

It would be an omission to limit this construction only to the discrete series, complex numbers and the subgroup K. Two other series of representations (principal and complementary, see [240, § VI.6]) are related to important special functions [333] and differential operators [254]. These series have unitary realisations in dual and double numbers [170, 194]. Their relations with subgroups N′ and A′ will be shown in Section 3.3.4.

Moving further, we may observe that transform (1) is also defined for an element x in any algebra A with a unit 1 as soon as (cx+d1)∈A has an inverse. If A is equipped with a topology, e.g. is a Banach algebra, then we may study a functional calculus for element x [168] in this way. It is defined as an intertwining operator between the representations (13) or (14) in spaces of analytic functions and similar representations in left A-modules.

In the spirit of the Erlangen programme, such functional calculus is still a geometry, since it is dealing with invariant properties under a group action. However, even for the simplest non-normal operator, e.g. a Jordan block of length k, the obtained space is not like a space of points but is, rather, a space of k-th jets [182]. Such non-point behaviour is often attributed to non-commutative geometry and the Erlangen programme provides an important insight on this popular topic [176].

Of course, there is no reason to limit the Erlangen programme to the group SL₂(ℝ) only: other groups may be more suitable in different situations. For example, the Heisenberg group and its hypercomplex representations are useful in quantum mechanics [199, 196]. However, SL₂(ℝ) still possesses large unexplored potential and is a good object to start with.


Last modified: October 28, 2024.