Combinatorial species: more than counting!

Since their introduction by Joyal [1] in 1981, combinatorial species have been exploited in a manifold of ways. Originally, they were conceived as a categorification of powerseries by Joyal: a combinatorial species is a collection of sets, one for each natural numbers, whose cardinalities make up the coefficients of a powerseries. In this way, one can think about powerseries in a richer context, and deduce enumerative results in very elegant ways. For example, Joyal provides in [1] a very short and celebrated proof of Cayley’s theorem that the number of labelled trees on n vertices is n^{n-2}.

Let us now introduce the objects of our interest formally.

In simple terms, a (combinatorial) species is a sequence of sets X= (X_0,X_1,\ldots), each one with an action of the corresponding symmetric group S_n. In particular, X_0 and X_1 are really just sets. If Y is another species, a morphism of species from X to Y is a collection of equivariant functions f_n : X_n \longrightarrow Y_n. In this way, there is a category of species, which is simply a direct product of the various categories of S_n-sets.

Because it will prove quite useful in upcoming discussions, we will think of a combinatorial species as functor X from the category of finite sets and bijections, which we denote \mathsf{FSet}^\times, to the category of sets, \mathsf{Set}.  Thus, to each finite set I we will assign another set X(I), which we call the X-structures on I, and to each bijection \sigma : I \longrightarrow J we will assign a bijection X(\sigma) : X(I) \longrightarrow X(J). We usually say that (each element of) I is given a structure of species X by means of X(I), and think of X(\sigma) as a rule of relabelling X-structures, and if z\in X(I), we say X(\sigma)(z) is obtained by transporting z along \sigma. Naturally, morphisms are now natural transformations. We will denote this category by \mathsf{Sp}.

To connect our two definitions of species, we observe that a skeleton of \mathsf{FSet}^\times consists of the finite sets [n] = \{ 1,\ldots, n-1\} for n a natural number, whose automorphism groups are precisely the symmetric groups S_n. Thus every species X is determined, up to isomorphism, by the sets X([n])=X_n and their respective S_n-actions —this is exactly our simpler description of species.

Combinatorial species are a useful way of packing up our informal notion of a “collection of labelled combinatorial structures of kind” and how such structures are “relabelled”. Moreover, as we noted before, species project themselves onto powerseries, in the following concrete way. If X is a combinatorial species, we define its cardinality to be the (formal) powerseries X(z) = \sum x_n \dfrac{z^n}{n!} where x_n is the cardinality of X_n. The cardinality of X only stores what its name alludes to: the cardinals of the various sets that determine X, but it does not remember, for example, the action of the various symmetrics groups. Later on we will see more refined assignments that do this, at least partially.

To illustrate, there is a species of linear orders that assigns to every finite set I the collection of bijections \ell : [n] \longrightarrow I where n = \# I. We think of f as a word \ell_1\ldots \ell_n, and we assign to every bijection \sigma : I \longrightarrow J the map L(I) \longrightarrow L(J) obtained by postcomposition with \sigma. This effectively just relabels a linear order on I to one in J according to \sigma. The reader should convince him or herself that, once we identify L_n with S_n in the obvious way, the action of S_n on L_n is the regular action of the symmetric group on itself. It is immediate that L(z) = (1-z)^{-1}.

There is a similar species to L, the species of B bijections, that assigns to every finite set I its collection of bijections, and assigns a bijection \sigma : I \longrightarrow J the map B(I) \longrightarrow B(J) obtained by conjugating by \sigma. Once again, B(z) = (1-z)^{-1}. However, one should note that B and L are not isomorphic species: the action of S_n on B_n is not isomorphic to that of S_n on L_n, since one action is the regular one, yet the other is by conjugation, and the orbits of the actions are not equinumerous.

This motivates us to assign to every species X another species that remembers the automorphisms of each X-structure. Concretely, let \widetilde{X} denote the species that assigns to each finite set I the collection of pairs (z,\sigma) where z\in X(I) and \sigma is a bijection on I such that X(\sigma)(z)=z —this is called an automorphism of zThe reader should ellucidate the way bijections should act on such pairs.

Let us now find the cardinality of \widetilde{X}. The cardinal of \widetilde{X}_n is the sum of the cardinals of the set of fixed points of \sigma as \sigma runs through S_n, and by Burnside’s lemma, this is exactly n! \widetilde{x}_n, where \widetilde{x}_n is the cardinal of the orbit space of X_n. Thus \widetilde{X}(z)= \sum \widetilde{x}_n z^n. One now observes at once that \widetilde{B}(z) and \widetilde{L}(z) differ, and we can conclude —once again— that such species are not isomorphic.

One can endow the category of species with a coproduct and a product, simply by using the coproduct and product of sets. Concretely, if X and Y are species, their sum is the species X+Y that assigns to a finite set I the set X(I) \sqcup Y(I), and where bijections act in the obvious manner. Similarly, the Hadamard product X\times Y assigns to a finite set I the set X(I)\times Y(I) and, again, is acted upon by bijections in the obvious way. Is is clear that the cardinality of a sum of species is the sum of their cardinalities —we will not concern ourselves, however, with the Hadamard product of powerseries.

There is yet a third operation one can perform on species that acts as the usual formal multiplication of powerseries, the Cauchy product, which is central to essentially all the modern algebraic treatment of species we will consider. The Cauchy product X\otimes Y of two species assigns to a finite set I the set \bigsqcup X(S)\times Y(I), the sum running through all pairs F=(S,T) of subsets of I that are disjoint and whose union is I: we call this a decomposition of I into two blocks, and denote this situation by F\models_2 I. The reader can imagine what a decomposition of I into q blocks is. If \sigma: I\longrightarrow J is a bijection and (S,T) is a decomposition, there are induced bijections \sigma_S : X(S) \longrightarrow X(\sigma(S)) and \sigma_T which collect to form a bijection (X\otimes Y)(\sigma) : (X\otimes Y)(I)\longrightarrow (X\otimes Y)(J).

It is worth to remark than a structure of type X\otimes Y is exactly a pair (z,w) where z and w are “complementary” X-and Y-structures, respectively. It is precisely this interpretation that motivates our definition. If s is an integer, we will denote the product of X with itself s times by X^s. The reader should note that we can identify X^s(I) with the collection decompositions F of I into s blocks, where each block is given a structure of species X. This is the same as a function f : I \longrightarrow s where each fibre is given a structure of species X.

As our notation suggests, the Cauchy product is a tensor product on \mathsf{Sp}, that is bilinear with respect to addition of species, so this category is monoidal (with an obvious symmetry that interchanges factors) and bicartesian. Moreover, there is the so desired equality that (X\otimes Y)(z) = X(z)\cdot Y(z). Let us prove this:

The coefficient of z^n/n! in X(z)\cdot Y(z) is \sum\limits_{k=0}^n \binom{n}{k} x_k y_{n-k}, while the cardinality of (X\otimes Y)_n is that of the union of X(S)\times Y([n] \smallsetminus S) as S runs through subsets of [n]. Now each of these sets has cardinality x_ky_{n-k} if S has k elements, and being the case there are \binom{n}{k} of such k-subsets of [n], we deduce this number is \sum\limits_{k=0}^n \binom{n}{k} x_k y_{n-k}, as expected. Of course, the Cauchy product is so constructed for this to happen!

The advantages of having enriched the category \mathsf{Sp} in such a way that our cardinality map \mathsf{Sp} \longrightarrow \mathbb{C}[[z]] respects both products and is additive is that we can now perform combinatorial arguments in \mathsf{Sp} that give useful identities in \mathbb{C}[[z]], which we can in turn reinterpret as enumerative statements. Let us close this posts by giving an example of such phenomenon.

For a fixed integer s, the set L^s(I) consists of functions $\latex f : I \longrightarrow [s]$ where each fibre is linearly ordered. Because we already computed that L(z) = (1-z)^{-1}, we now compute that L^s(z) = (1-z)^{-s}. Equating coefficients by means of the binomial series, we see there are s(s+1)\cdots (s+n-1) such functions if \#I =n.

To wrap things up, we have constructed ourselves a cocartesian symmetric monoidal category \mathsf{Sp} of (set valued) combinatorial species, and a cardinality functor from this category to the discrete category of powerseries that respects sums and products. In upcoming posts, we will shift our attention to module valued combinatorial species, where we can do some algebra.

Add (16/04) In [2] the reader can find a very nice example of counting with combinatorial species.

[1] A. Joyal, Une théorie combinatoire des séries formelles, Advances in Mathematics, Volume 42, Issue 1, October 1981, Pages 1-82.

[2] Andy Hardt, Pete McNeely, Tung Phan, Justin M. Troyka, Combinatorial species and graph enumeration, 39 pages, 16 figures, senior comprehensive project at Carleton College.


Cohomology of combinatorial species

On March 31st I presented my thesis, titled “Cohomology of combinatorial species”, in partial fulfillment of my Licenciatura in Mathematics at the Universidad de Buenos Aires. You can find a rather final copy here (the introduction is in Spanish, but the thesis is written in English) and a playlist with the recording of the viva voce exam here. The viva voce was delivered in Spanish, I may add a translation in the future.

Chapters 1 is brief and useful to those unfamiliar with species, Chapter 2 is mostly optional and in Chapter 3 we introduce the cohomology of combinatorial species. The main results of the thesis are presented in Chapter 4, and a final Chapter 5 contains some ideas and problems for future work.

After an almost year long period of inactivity (which I used to write the thesis, of course) I will try to post again, probably to present combinatorial species and the universe that surrounds them. Ideally, we will borrow from Joyal’s foundational paper, some of the comprehensive book by Labelle, Leroux and Bergeron, and the expository material of Marcelo Aguiar and Swapneel Mahajan, freely available here, along with the contents of my thesis, which continue the programme set out by Aguiar and Mahajan to compute the cohomology of combinatorial species, which in particular includes computing certain “combinatorial deformations” of Hopf algebras in species.

So, we might seek out to cover:

  1. The category of combinatorial species.
  2. Counting with species and their associated powerseries.
  3. Operations on species and monoidal structures on species.
  4. Algebras and coalgebras in species.
  5. Some more monoidal nonsense.
  6. Hopf monoids in species, and their deformations.
  7. Fock functors and combinatorial Hopf algebras.
  8. Deformations as cocycles, and cohomology of species.
  9. Calculations of the cohomology of species.
  10. Some homological algebra.
  11. The combinatorial complex of a species: a new method of calculation.
  12. Using the combinatorial complex.
  13. Characters of Hopf monoids, and classical invariants.

N.B.: This should take a while.



The Steinitz replacement theorem for modules and uniform dimension.

What follows is adapted from T.Y. Lam’s book Lectures on modules and rings.

In this previous post we talked about dependence relations and how they encompass many particular ideas of what it means for some algebraic object to be spanned by a subset of elements. We also noted that the Steinitz replacement axiom fails in general rings. This post shows how to generalise our definitions so that an analogous replacement theorem holds. But first, some definitions!

Given a ring R and an R-module M, call a submodule N essential if it has nontrivial intersection with every nontrivial submodule of M. Of course, an essential submodule must be nonzero.  Call M uniform if every pair of nonzero submodules of M have nontrivial intersection. Equivalently, every submodule of M is essential.

Exercise. If V is a vector space, a submodule is uniform if and only if it is one dimensional.

It is an idea of A. Goldie (after whom Goldie rings are so called) that in order to measure the size of a module, one can try to see how many direct sums of uniform submodules one can fit inside M. After all, the dimension of a vector space is precisely this number. One calls this the uniform dimension of the given module. Is this well defined in the general case?

The answer is affirmative, and it is the generalized version of the Steinitz replacement theorem:

Theorem. Assume U=U_1\oplus  \cdots U_m and V = V_1\oplus \cdots \oplus V_n are essential submodules of a module M and the U_i,V_j are uniform. Then n=m.

Proof We can safely assume that n\geqslant m. The claim is \hat U=U_2\oplus\cdots \oplus U_m intersects some V_j trivially. Else \hat U\cap V_j would be essential in V_j, and the direct sum of such intersections (which is \hat U\cap V), would be essential in V. It would follow then, by transitivity, that $latex  \hat U\cap V$ is essential in M, so \hat U is, which is impossible. By relabelling we may assume then that \hat U\cap V_1=0. Let U'=\hat U\oplus V_1, so that U'\cap U_1 is nontrivial (why?). Since U_1 is uniform, (U'\cap U_1)\oplus U_2\oplus\cdots\oplus U_m is essential in U and thus in M. Since this is contained in U', U' itself is essential in M. We have thus replaced U_1 by V_1. After m steps, we obtain that V_1\oplus\cdots \oplus V_m is essential in M, which forces m=n, as desired. \blacktriangleleft

Combining this with the following lemma one obtains our definition of uniform dimension: we say M has uniform dimension n if there is an essential submodule that is a direct sum of n uniform submodules of M.

Lemma. Suppose that N=N_1\oplus\cdots N_k is a direct sum of uniform submodules of a given module M and that k is maximum among the possible number of uniform direct sumands of uniform submodules one can fit inside M. Then N is an essential submodule of M.

Proof This is not completely straightforward, but it is once one proves that the uniform dimension of M is the supremum of the number of direct sumands one can fit inside M. If this N weren’t essential, there would exist some nonzero N' such that $N\cap N’=\varnothing$, and then M would contain N_1\oplus\cdots N_k\oplus N, which violates maximality. \blacktriangleleft

In order to prove that the uniform dimension of M is the supremum of the number of direct sumands one can fit inside M, we prove:

Proposition. If M has finite uniform dimension n, then any direct sum of k nonzero submodule in M has k\leqslant n.

Proof. Let V be an essential submodule of M that is a direct sum of n uniform submodules, and assume N = N_1\oplus \cdots \oplus N_k\subseteq M. Then the N_i'=N_i\cap V are nonzero and form a direct sum inside V, so we might as well assume that V=M. The same argument as in the proof of the replacement theorem gives a relabeling of the V_i such that V_1\cap\hat N=0. We may then project M/V_1\to \hat V and obtain an embedding of \hat N in $\hat V$. By induction, k-1\leqslant n-1, so k\leqslant n. \blacktriangleleft

Proposition. A module has infinite uniform dimension if and only if it contains an infinite direct summand.

Proof. One direction is evident from the previous proposition. Suppose now that M does not contain any infinite direct sum. Then every nonzero submodule of M contains a uniform submodule. Indeed, if this fails for some N nonzero, then in particular N is not uniform, so N= A_1\oplus B_1 a nonzero direct sum decomposition. Then B_1 is not uniform, and it may be decomposed. We then inductively construct an infinite direct sum, which cannot be. We may thus pick a uniform submodule V_1\subseteq M. If V_1 is not essential then we may obtain a V_2 such that V_1\oplus V_2\subseteq M, and by picking some uniform submodule of V_2 we may assume V_2 is uniform. Continuing, this process must terminate, arriving at some V_1\oplus\cdots \oplus V_n\subseteq M that is essential and a direct sum of uniform submodules. \blacktriangleleft

Corollary. The uniform dimension of a module is the supremum of the number of direct sumands it contains.

Facts. Some happy, some sad:

  1. Any artinian of noetherian module has finite uniform dimension.
  2. If a module has finite composition length n, it has uniform dimension at most n, with equality if and only if this module is semisimple.
  3. Sadly, finitely generated modules do not necessarily have finite dimension. The direct product \prod \Bbb Z is cyclic over itself, but contains a direct sum of ideals, one for each copy of the integers.
  4. Dimension is additive (meaning the dimension of a direct sum is the direct sum of the dimensions.)
  5. Dimension is monotone, that is, the dimension of any submodule of M is at most that of M. Equality is attained if this submodule is essential.
  6. Sadly, dimension is not additive for exact sequences: as an example, \Bbb Z/p^n\Bbb Z has uniform dimension 1, so the obvious exact sequence does the job.
  7. If M is a torsion-free module over a commutative domain D with field of fractions k; then the dimension of M is precisely \dim_k k\otimes_D M). More generally, the uniform dimension of the torsion-free quotient of M is this dimension.



You could have discovered the Künneth formulas.

In what follows we will work in the category of modules or complexes over a fixed ring R. In particular the \rm Tor and \otimes bifunctors are understood to be taken relative to R.

Given two chain complexes C,C' their tensor product is the complex C\otimes C' with \bigoplus\limits_{i+j=q} C_i\otimes C_j in degree q and differential given by d(x\otimes y)=dx\otimes y+(-1)^{|x|}x\otimes dy. This is a chain complex, as the reader might check. A natural question is: can one deduce anything about the homology of C\otimes C' given information about the homology of C and C'?

There is a canonical map H(C)\otimes H(C')\to H(C\otimes C') that sends \left[c\right]\otimes \left[c'\right] to \left[c\otimes c'\right]. The reader should check this is well defined, that is: if c,c' are cycles then their tensor product is a cycle in the tensor product complex and if any of the two is a boundary so is their tensor product. It would be marvellous if this were an isomorphism, but (not surprisingly) it is not always the case. We will show that this map is always injective and identify its cokernel under mild hypothesis on the complex C. Our work will show that this map is always an isomorphism when R=k is a field, for example. As a final application, we will prove (as promised some time ago) that the canonical d.g. Koszul algebra over a field is acyclic.

In what follows, I will try to show how the hypotheses of the Künneth formulas can be recovered by looking at information obtained of H(C\otimes M) from H(C) and M through the usual long exact sequences of homology and the \rm Tor functor. In some sense, the hypotheses are weak enough so that the obvious steps give us useful information, eventually giving us the Künneth formulas.

Associated to any complex C we have subcomplexes with trivial differential Z and B where Z_p=\ker \partial_p and B_p =\partial C_{p+1}. Thus, we always have a short exact sequence (of complexes) 0\to Z\to C\to B\to 0 where the first map is the inclusion and the second is the differential of C. Since a complex with zero differential is its own homology, the long exact homology sequence associated with this particular exact sequence is

\cdots \to B_p\to Z_p\to H_p(C)\to B_{p-1}\to Z_{p-1}\to \cdots

and the connecting morphism H_{p+1}(B)=B_p \to H_p(Z)=Z_p is (by the diagram chase in the snake lemma) the inclusion. Similarly the map from H_p(Z)=Z_p \to H_p(C) is the projection map (this should be immediate) and the induced map H_p(C)\to H_p(B)=B_{p-1} is zero since it is induced by the differential. This means this long exact sequence breaks into the canonical short exact sequences 0\to B_p\to Z_p\to H_p(C)\to 0

If we could obtain a similar short exact sequence involving H_p(C\otimes M), we would be able to surround it by terms involving Z_p\otimes M and B_p\otimes M, and their quotient is H_p(C)\otimes M, so we would be able to get somewhere. If we assume that B is made up of flat modules, then the tensored sequence 0\to Z\otimes M\to C\otimes M\to B\otimes M \to 0 would also be exact, which is what we want. Indeed

Lemma If 0\to A\to B\to C\to 0 is an exact sequence of modules (over any ring) and of C is a flat module, then for any module M the tensored sequence is exact.

Proof The long exact sequence of the \rm Tor functor give an exact sequence

\cdots\to {\rm Tor}_1(C,M)\to A\otimes M\to B\otimes M\to C\otimes M\to 0

and {\rm Tor}_1(C,M)=0 since C is flat. \blacktriangleleft

Following the above, we now have a long exact sequence

\cdots \to B_p\otimes M \stackrel{i\otimes 1}\to Z_p\otimes M\to H_p(C\otimes M)\to B_{p-1}\otimes M \to Z_{p-1}\otimes M \to \cdots

Where the boundary map B_p\otimes M \to Z_p\otimes M is induced by the inclusion tensor the identity of M by the same reasoning, let us denote it by \iota \otimes 1. Note then that since the image of \iota\otimes 1 is just B_p\otimes M, we have an induced injective map from the quotient Z_p\otimes M/B_p\otimes M=H_p(C)\otimes M to H_p(C\otimes M). Since the map Z_p\otimes M\to H_p(C\otimes M) was induced by the projection Z_p\to H_p(C), we see this map is precisely the canonical map we defined above. By definition of exactness, we have a short exact sequence (where we just identified {\rm coker}\,(\iota \otimes 1) above!)

0\to {\rm coker}\,(\iota\otimes 1) \to H_p(C)\otimes M\to \ker(\iota\otimes 1)\to 0

Note that the cokernel is that of the map \iota\otimes 1 in degree p, while the kernel is that of the map \iota\otimes 1 in degree p-1. It remains to identify \ker(\iota \otimes 1). Now we just said we have short exact sequence

0\to B_p\to Z_p\to H_p(C)\to 0

where the first map is the inclusion i and the second map is the projection. If we use the definition of \rm Tor, we obtain a long exact sequence

\cdots\to {\rm Tor}_1(B_p,M)\to {\rm Tor}_1(Z_p,M)\to { \rm Tor}_1(H_p(C),M)\to B_p\otimes M \to Z_p\otimes \to H_p(C)\otimes M\to 0

Now B_p is already assumed to be flat, so if we also assume that Z_p is flat we get {\rm Tor}_1(Z_p,M)=0 and obtain an exact sequence

\cdots \to {\rm Tor}_1(H_p(C),M) \to B_p\otimes M \to Z_p\otimes M \to H_p(C)\otimes M\to 0

or, what is the same, {\rm Tor}\;_1(H_p(C),M) is the kernel of 1\otimes i in degree p! All in all, we obtain the so desired exact sequence

0\to H_p(C)\otimes M \to H_p(C)\otimes M\to { \rm Tor}_1(H_{p-1}(C),M)\to 0

where the first arrow is the canonical map, assuming that Z_p,B_p are flat for any p. Note that this is the same as assuming that any two of the three complexes Z,C,B are composed of flat modules, by the long exact sequence of the \rm Tor functor. In particular if C is made up projectives and R is hereditary this always work, more particularly if C is made up of free modules and R is a PID. One can show that in this last case the short exact sequence splits, which shows that homology with coefficients H(C;M) is always determined by M and H(C).

We want to move on to the general case, thus let C' be any other complex, and retain the hypotheses made on C. By the same reasoning as above, we have exact sequences for any p,q

0\to Z_p \otimes C_q'\to C_p \otimes C_q'\to B_{p-1} \otimes C_q'\to 0

which assemble to give an exact sequence of complexes

0\to Z \otimes C'\to C\otimes C'\to B \otimes C'\to 0

Since both Z,B have zero differential, the differential in the product complex send x\otimes c' to \pm x\otimes \partial'c', and in particular acts only vertically. This means that Z\otimes C' is a direct sum of subcomplexes Z'^j with (Z'^j)_p = Z_j\otimes C_{p-j}' and (B'^j)_p= B_{j-1}\otimes C_{p-j}' with differential acting only on the right terms. It follows the homology of this two complexes is the direct sum of the homology of the complexes just defined. We now use the

Lemma If M is a flat module then for any complex C, H(C)\otimes M is canonically isomorphic to H(C\otimes M) via the canonical map.

This is true, and the proof generalizes to “(additive) exact functors commute with homology”. This is not too immediate but one should convince itself to be true given that exact functors preserve kernels and cokernels and homology is the cokernel of a kernel. I won’t include the proof here, hoping the reader can convince itself of this fact or take a leap of faith.

At any rate, given the decompositions above we get that under our canonical map

H_p(Z\otimes C')=\bigoplus_i H(Z'^i)\to \bigoplus_{i+j=p} Z_i\otimes H_j(C')

H_p(B\otimes C')=\bigoplus_i H(B'^i)\to \bigoplus_{i+j=p} B_i\otimes H_j(C')

As in our first proof, the long exact sequence associated to

0\to Z \otimes C'\to C\otimes C'\to B \otimes C'\to 0

(after carrying the identifications using the above isomorphisms) has connecting morphism the sum i+j=p of  (-1)^j \iota_i\otimes 1 where \iota_i is the inclusion B_i\to Z_i. Hence again we have a short exact sequence obtained from this long exact sequence

\bigoplus\limits_{i+j=p} {\rm coker}\,((-1)^j \iota_i\otimes 1)\to H_p(C\otimes C') \to \bigoplus\limits_{i+j=p-1} \ker( (-1)^j \iota_i\otimes 1)

and it suffices we identify the sums. Again, the cokernels are easily seen to be H_i(C)\otimes H_j(C') and now using the exact sequences

0\to B_i\stackrel{\iota}\to Z_i\to H_i \to 0

and the fact that B_i,Z_i are flat, we get again that the kernel of (-1)^j \iota_i\otimes 1 is {\rm Tor}\,(H_i(C),H_j(C')). If we denote by {\rm Tor}_1\,(C,C') the total complex of the bicomplex associated to the \rm Tor bifunctor (just as we defined the tensor product of two complex as the total complex of the bicomplex of C_i\otimes C_j) we get the

Künneth formula Suppose C is a complex with flat cycles and boundaries and C' is any complex. There is a (functorial) short exact sequence

0\to (H(C)\otimes H(C'))_p\to H_p(C\otimes C')\to ({\rm Tor}_1^R(H(C),H(C')))_{p-1}

where the first arrow is the canonical map defined above.

We’re assuming C,C' to be complexes of modules to the correct sides so that the tensor products make sense, of course. Note that if C is a complex of left R-modules, H(C) is a complex of R-modules under the induced action on cycles. In particular if R=k is a field, the above sequence gives our canonical map is an isomorphism, for the \rm Tor terms always vanish.

For any field k the complex of k-modules \mathcal K(X) : 0\to k[X]\stackrel{X}\to k[X]\to 0 has homology k in degree 0 and zero homology elsewhere. We can form the tensor product (over k) of an arbitrary number of copies of \mathcal K(X) and obtain a complex \mathcal K(X_1,\ldots,X_n) which we will call the Koszul complex of degree n. By the Künneth formula, this complex has homology k in degree 0 and trivial homology elsewhere.

For example, using the canonical isomorphism k[X_1]\otimes k[X_2]\to k[X_1,X_2], one can see that \mathcal K(X_1,X_2) is a complex that has k[X_1,X_2] in degree 2 with differential

p \to (-X_2 p,X_1 p)

a direct sum k[X_1,X_2]\oplus k[X_1,X_2] in degree 1 with differential

(p,q)\to Xp+yQ

and k[X_1,X_2] in degree 0 with zero differential. One can better understand the Koszul differential by working inductively, and after some careful work one can see that \mathcal K(X_1,X_2,\ldots,X_n) is a finite complex of length n; which has binom nm copies of the polynomial algebra A=k[X_1,\ldots,X_n] in degree m, which we can index by basis elements e_I for I an m-subset of \{1,2,\ldots,n\} in increasing order. The differential acts A-linearly defined on the basis vectors by sending e_I \to \sum (-1)^j x_{i_j} e_{I-i_j} where I=\{i_1,\ldots,i_m\}. In the above example for n=2, the complex is

0\to Ae_{12}\to Ae_1\oplus Ae_2\to Ae_\varnothing \to 0

and \partial e_{12}=x_1e_2-x_2e_1, \partial e_i=x_i, \partial e_\varnothing =0.

The Koszul complex of degree n is precisely the complex we used in this previous post to obtain a resolution of A/(f) for f a homogeneous polynomial of degree at least 2.

The method of acyclic models.

Consider a nonnegative chain complex of projective modules P, and any nonnegative complex C which is acyclic in positive degrees, that is H_q(C)=0 if q>0. It is a classic result that any module morphism P_0\to C_0 extends to  a morphism of complexes P \to C and uniquely so, up to chain homotopy.

This is intensively used in homological algebra, for example in the theory of derived functors when showing that if one fixes a collection of projective resolutions to define the left derived functors L_\ast F of a functor F, any other choice of collection of resolutions gives a sequence functors that is naturally isomorphic to L_\ast F. They are in fact isomorphic as \delta-functors.

Fix two abelian categories \mathcal A,B. A homological \delta-functor T:\mathcal A\to \mathcal B is a collection of functors T_n:\mathcal A\to \mathcal B equipped with morphisms \delta_n;T_n(C)\to T_{n-1}(A) defined for each short exact sequence 0\to A\to B\to C\to 0 that is natural when one views T as a functor from short exact sequences in \mathcal A to long exact sequences in \mathcal B. A morphism of \delta-functors is obtained by specifying natural transformations at each degree that commute with the connecting morphism.

The basic result regarding \delta-functors asserts that the left (right) derived functors L_nF of a right (left) functor F are universal (when the source category has enough projectives) in the following sense: any natural transformation T_0\to L_0F extends uniquely to a morphism of \delta-functors. In particular the derived functors of a functor are unique up to natural isomorphism. For a right derived functors, the same holds, except we want to consider natural transformations R^0 G\to T^0.

There is yet another result, which is perhaps more similar to what we want to get to. Suppose T,H:\mathcal A\to\mathcal B are \delta-functors and \mathcal A has enough projectives, such that H_n(P)=0 if P is projective and n>0. Then any natural transformation \tau_0 :T_0\to H_0 extends to a morphism of \delta-functors. This is in particular true for derived functors.

There is an analogous result when dealing with functors from a suitable category to the category of modules over a ring. A category with models is just a category \mathscr C and a collection of objects \mathscr M =\{M_j:j\in J\} of \mathscr C, which we call models. A functor F:(\mathscr C,\mathscr M)\to {}_R\rm mod is said to be free on \mathscr C with models \mathscr M if it factors in the following way way as a functor to \rm Set and the free functor {\rm Set}\to {}_R\rm mod: there is a collection of elements m_j\in F(M_j) such that F(X) is free with basis the set \{ F(f)(m_j): f:M_j\to X, j\in J\}. Thus to know F(X) it suffices to look at the action of F arrows from the models to X in \mathscr C and a fixed collection of elements in the images of the models. In this sense, the objects M_j model F. A functor F from a category with models \mathscr M to the category of chain complexes over some ring R is free with models \mathscr M if it is free in each degree.

This definition might seem a bit farfetched at first, but it is motivated in the most elementary examples. Consider the singular complex functor \Delta from the category of topological spaces to the category of (free) chain complexes over \Bbb Z. This assigns to a space X the complex \Delta(X) with \Delta_q(X) the free abelian group with basis the singular simplices \sigma:\Delta^q\to X and assigns to a continuous function f:X\to Y the function that sends \sigma\in\Delta(X) to f\sigma\in\Delta(Y). In particular, if we take the singular simplex \Delta^q and the identity simplex \xi_q\in \Delta_q(\Delta^q), we see that each \Delta_q (as a functor from topological spaces to abelian groups) is free with only one model \{\Delta^q\}. If we take now K to be an (abstract) simplicial complex, and consider the poset of subcomplexes of K ordered by inclusion we have the simplicial complex functor C from this category again to the category of chain complexes over \Bbb Z. Since for a simplex  s\in K there is an arrow s\to L, L a subcomplex if and only if s is a simplex of K, one can check again that C is free with models the simplices of K.

Given a category with models (\mathcal C,\mathcal M) a functor G from (\mathcal C,\mathcal M)  to chain complexes is said to be acyclic if G(M) has trivial homology in positive degrees for every model M. We say G is nonnegative if it is the zero functor in negative degrees. The main result gathering these ideas together is usually known as “the method of acyclic models” and is due to S. Maclane and S. Eilenberg:

Theorem Let F be a nonnegative free functor on a category with models (\mathcal C,\mathcal M) and suppose G is acyclic on (\mathcal C,\mathcal M). Then given any natural transformation \phi:H_0F\to H_0G there exists a natural chain map \tau : F\to G that induces \phi and \tau is unique up to natural homotopy. That is, given \tau,\tau' that induce the same natural transformation H_0F\to H_0G, there exists a natural homotopy D:\tau\simeq \tau' from F\to G.

Proof For each object X we need to define \tau(X):F(X)\to G(X), so that for an arrow f:X\to Y we have \tau_q(X)F_q(f) = G_q(f)\tau_q(Y) and such that each \tau(X) is a chain morphism. Now since F(X) is free with basis F(f)(m_j), the last equation can be written as

\tau_q(X)(\sum \eta_{ij} F_q(f_i)(m_j))=\sum \eta_{ij} G_q(f_i)(\tau_q(M_j)(m_j))

Note that for each q we are (implicitly) using an indexing set J_q and the M_j,m_j depend on q, but we’re avoiding unnecessary notation that might otherwise clutter the proof. At any rate, \tau is determined by \tau_q(M_j)(m_j) for q\geqslant 0 and j\in J_q. We will define \tau by induction so that \tau is a chain map inducing \phi. When q=0, define \tau_0(M_j)(m_j) to be any element of G_0(M_j) such that the homology class of \tau_0(M_j)(m_j) equals \phi(M_j)({\rm cls}(m_j)). This then defines \tau_0(X) for every X using the last equation. Suppose we have defined \tau_i for i<q,q>0. It suffices then we define \tau_q so that

\partial \tau(M_j)(m_j)=\tau_{q-1}(M_j)(\partial m_j)

Now the right hand side of this equation is a cycle in G_{q-1}(M_j), and since we have assumed such complex is acyclic in positive degrees, we can choose \tau(M_j)(g_j) to satisfy the last equation. This inductively defines \tau, so that it is a natural chain map F\to G.

The construction of D is completely analogous, bearing in mind we want to have D natural, satisfying

\partial_{q+1}D_q(M_j)(m_j)=\tau_q(M_j)(m_j)-\tau_q'(M_j)(m_j)-D_{q-1}(M_j)(\partial_q m_j)

and the right hand side of this equation is a cycle in G_q, q>0\blacktriangleleft

Note that the singular and simplicial functors are never acyclic, since they are augmented, and the cyclic factor of the augmentation map can be thought as generated by the empty simplex in both cases. We can thus define free and acyclic functors to the category of augmented chain complexes, and now both the singular and simplicial chain functors are both free and acyclic (but with different categories as a source!). This the theorem that will work for us

Theorem Let F,G be functors from a category with models (\mathcal C,\mathcal M) to the category of augmented chain complexes such that F is free and G is acyclic. Then there exist natural chain maps preserving augmentation and any two are naturally homotopic.

Proof It suffices we prove there exist chain maps preserving the augmentation, since the rest follows from the previous theorem. We know that since G is acyclic, the augmentation map \varepsilon' induces an isomorphism of H_0G(M_j) with R, and hence we can pick for each basis element g_j, j\in J_0 a cycle z_j\in H_0G(M_j) such that \varepsilon'(z_j)=\varepsilon(g_j). Extending linearly, we obtain a natural transformation H_0(F)\to H_0(G), as desired. \blacktriangleleft

Using the method of acyclic models we can (with some aid) prove a variety of classical theorems in algebraic topology, including:

1. Homotopic continuous functions f,g:X\to Y induce homotopic chain maps \Delta(f),\Delta(g):\Delta(X)\to \Delta(Y): to do so, it suffices to show that for any space X, the functions $h_0,h_1:X\to X\times I$ that embed X on the top or bottom face of the cylinder induce chain homotopic maps \Delta(X)\to \Delta(X\times I). This is true since we already know \Delta(X) is a free functor to augmented complexes and \Delta'(X)=\Delta(X\times I) is acyclic: one has to show say, that convex subsets of Euclidean space have trivial reduced homology, and this can be strengthened: their singular complexes are contractible. This means that the two natural transformations induced by h_0,h_1 are naturally homotopic, and we’re done.

2. Eilenberg-Zilber theorem: the singular complexes \Delta(X\times Y) and \Delta(X)\otimes \Delta(Y) are naturally chain equivalent for any pair of spaces X,Y: indeed, both bifunctors are free with models the pairs of simplices. This is not completely straightforward but a little thinking reveals it to be true. If one is brave enough, the inductive definition above should give an explicit chain map. Both functors are also acyclic. \Delta^q\times \Delta^p is contractible for any choice of p,q, and the fact that the tensor product of two acyclic free complexes is acyclic follows from the Künneth formulas.

3. Existence of diagonal approximations (Alexander-Whitney maps), that is there exist a natural transformation \Delta(X)\to \Delta(X)\otimes \Delta(X): this is immediate since \Delta(X)\otimes \Delta(X) is acyclic (we need Künneth, once again{}^1). One such map is the Alexander-Whitney map, but again one can construct a map oneself by following the proof of the method of acyclic models. Note that one can obtain a diagonal approximation by using the induced arrow by the diagonal map and the Eilenberg-Zilber theorem, however.

In fact, the first theorem mentioned is a special case of the method of acyclic models, with the exception we need to replace projective by free: consider a category with one element and one identity arrow, with one model this element, and regard a chain complex as a functor from this point to chain complexes. The reader is encouraged to fill in the details.

1: not really. We can use that the tensor product of an acyclic complex with any flat complex is acyclic. Indeed, with an appropriate indexing the rows of the associated double complex are acyclic, so the acyclic assembly lemma gives the result.

Dependence relations: a case of good abstraction.

The following is based on the exposition of Jacobson in his classical text Basic Algebra (vol. II).

As it happens often in mathematics, going one step further in our level of abstraction may do wonders, and reveal more general facts that are widely applicable. I think this is the case of dependence relations. In every linear algebra course, we learn what it means for a subset of vectors to be linearly (in)dependent, in every field theory course we learn what it means for a subset of a field extension to be algebraically dependent over the base field. Let us try to look at this definitions with more care.

Fix a vector space V, and a subset S of V. Let us say that v\prec S (read “v is dependent on S) if v\in \langle S\rangle. This defines a relation from V to the subsets of V. We can immediately deduce some of its properties. First, if v\in S, then evidently v\prec S. Secondly, if v\prec S, then in fact there is a finite subset F of S such that v\prec F: if v=\sum\limits_{i=1}^n \lambda_iv_i with v_i\in S, then certainly v\in \langle v_1,\ldots,v_n\rangle. We also have a transitive-like property: if v\prec S and every s\in S is dependent on T, then v\prec T, that is, is S spans x and T spans S, then certainly T spans x. Finally, there is a rather nontrivial property: suppose that x\prec S, but that for some y\in S, x\not\prec S\smallsetminus y, that is, x is spanned by S, but not by S\smallsetminus y. I claim then that y must be spanned by S\smallsetminus y \cup x. Indeed, the hypothesis means that x=\sum \lambda_i v_i and some, say v_1, must be y (else we wouldn’t have x\not\prec S\smallsetminus y). Since we can safely assume all the v_j\neq v_1, reordering the equation gives that y\prec S\smallsetminus y\cup x, as desired. This is called the Steinitz replacement axiom.

We have just verified that the relation “S spans x” is a dependence relation on any vector space V. To summarize,

  1. If x\in S, then x \prec S (read: every set is self-dependent)
  2. If x\prec S, then x\prec F for some finite subset F of S (read: dependence is afforded by finitely many elements)
  3. If x\prec S and S\prec T (i.e. every element of S is dependent on T), then x\prec T.
  4. (Steinitz replacement axiom) If x\prec S and x\not\prec S\smallsetminus y, then y\prec S\smallsetminus y\cup x.

We all know the definition of an independent set of vectors in terms of linear equations. But we can also define a set of vectors S to be (linearly) independent if whenever x\in S, it follows that x\not\prec S\smallsetminus x, that is, x cannot be spanned by the remaning vectors. You can easily check both definitions coincide. Given a dependence relation \prec on a set $X$, we shall say a subset S is independent if x\not\prec S\smallsetminus x for every x\in S. Note, for example, that the empty set is always independent. We shall say a subset B is a base if it is independent and every element of X is dependent on B. In terms of our linear dependence relation on vector spaces, this means that B is (linearly) independent, and spans V.

Let us now see that given any abstract dependence relation, we can find a base for it, and that in fact, any two bases have the same cardinality. We will use Zorn’s lemma, of couse. But first we need an auxiliary result, which is pretty intuitive: suppose that S is an independent set, and that x is independent of S (i.e. x\not\prec S). Then S\cup x is independent: indeed, were it not we would find y\in S different from x (why?) such that y \prec S\smallsetminus y \cup x. Since y\not\prec S\smallsetminus y = (S\smallsetminus y \cup x )\smallsetminus x, the replacement axiom gives that x\prec S, contrary to the hypothesis. (It is instructive to carry the proof in the case of vector spaces to see exactly what the replacement axiom does).

There are two more dependence relations that are of interest. First, the relations of algebraic dependence: fix a field extension E/F and call a subset S algebraically dependent (over F), if there are s_1,\ldots,s_n\in S and a nonzero polynomial f in F[X_1,\ldots,X_n]  such that f(s_1,\ldots,s_n)=0. A mildly nontrivial matter is to show that a set S is algebraically dependent over F if there is some s\in S such that s is algebraic over F(S\smallsetminus s). This is how we want to define our dependence relation: say that x\prec S if x is algebraic over F(S). Thus an independent set S is (by the very observation above) and algebraically independent subset of the extension: for no s\in S it follows that s is algebraic over F(S\smallsetminus s). It is evident that the relation just defined satisfies the axioms above, so it is a dependence relation.

The above gives us an idea on how to try and obtain a basis for a set X. Start with the empty set X_0=\varnothing, which is always independent. If X is dependent on X_0, we’re done. Else, there is some x\in X that is independent from the empty set, and we can adjoin it, and obtain a bigger independent set by the above, X_1=\{x\}. If X is depedent on X_1, we’re done. Else…

Of course, we all know the argument above is not going to do the trick every time (though it will, say, in the case of finite dimensional vector spaces!), and we have to use Zorn’s lemma. If we consider the collection of independent subsets of X, the fact that dependence is afforded by a finite subset means the union of a chain of independent set is independent, and thus we obtain a maximal independent subset, call it B. The lemma we proved implies X must be dependent on B: if there were some x independent of B, B\cup x would be independent, contrary to the maximality of B. This means B is indeed a basis. This already proves, for example, that every vector space admits a basis, and that every field extension E/F admits a transcendency base: a subset B that is algebraically independent over F, and such that E is algebraic over B. Note that in both cases, B might be empty: if V is the zero vector space, it admits the empty base, and if E/F is algebraic, it admits an empty transcendency base!

To conclude, we prove any two bases B,C have the same cardinality. As a preliminary comment, note that if B\subseteq C are bases, we must forcefully have B=C: indeed, if there is some c\in C not in B, we know that c is dependent on B, and since B\subseteq C\smallsetminus c, we get that c\prec C\smallsetminus c, contrary to independence. This means, in particular, that if any base is empty, so is any other, so we can assume that both B,C are nonempty. Assume first that B is finite, and write B=\{x_1,\ldots,x_n\}. We will show by induction on k that for each k=1,\ldots,n+1 we can find y_i\in C such that \{y_1,\ldots,y_{k-1},x_k,\ldots,x_n\} remains a base. When we reach $k=n+1$ we will obtain \{y_1,\ldots,y_n\} a basis inside of C, which by the first comment forces C=\{y_1,\ldots,y_n\} and proves that |B|=|C|=n.

Consider B'=\{x_2,\ldots,x_n\}. Then there is some y_1\in C that is independent of B', for if every element of C is dependent on $B’$, this also holds for B, in particular for x_1, contrary to B being a base. By the replacement axiom, x_1 is dependent on B_1=\{y,x_2,\ldots,x_n\}, and since y_1 was taken independent of the independent set B', B_1 is independent. Since B is dependent on B_1, B_1 is a base. We can repeat the process: remove x_2, obtain y_2\in C that is independent of B_1\smallsetminus x_2, replace, and obtain B_2=\{y_1,y_2,x_3,\ldots,x_n\}, eventually obtaining the desired base consisting of n elements of C, and proving the claim for finite bases. In particular, we have shown that if one base is finite, so is every other.

Since the same argument above applies when C is finite, we may assume both bases to be infinite. For every y\in C, there is a finite subset of B, call it F_y, such that y\prec F_y. We have thus a map y\mapsto F_y which shows that |C| \geqslant |\{F_y:y\in C\}|. Since every F_y is finite and C is infinite, \aleph_0 |C|=|C|\geqslant |\bigcup F_y|. The F_y must union up to B, else C is dependent on a proper subset of B, call it B', and so C being a base, gives B itself dependent on a proper subset of itself, which cannot be. Thus |C|\geqslant |B|, and by symmetry |B|\geqslant |C|, which gives the desired equality.

As a final note, there is yet a third dependence relation one can consider, which generalizes that for vector spaces: given a semisimple module M, consider the collection X of its simple submodules, and say that a simple submodule N is dependent on a set of simple submodules N_i if N is in the span of the N_i. A base is now a subcollection of simple submodules that are in direct sum and genererate M, and the conclusion is that the cardinality of such collection is unique. In fact, given two such decompositions there is a bijection that preserves the isomorphism class!

ADD A friend pointed out how the above fails for general rings (as opposed to fields), i.e. that modules don’t generally admit bases with respect to the “spans” relation. The problem is that the Steinitz replacement axiom fails (the reader should note that the first three axioms always hold, so the Steinitz axiom must be the problem!). As a counterexample, take \Bbb Z and consider the set S=\{3,4\} which spans \Bbb Z, so that in particular 6\prec \{3,4\}, and of course 6\not\prec \{4\}. However, 3 is not in the span of \{4,6\}, for it is not even.

Why are simple groups simple?

Recall that a group G is said to be simple if it contains precisely two normal subgroups, and that a group H is solvable if it admits a normal series 1\lhd H_1\lhd \cdots\lhd H_r=H such that each quotient H_i/H_{i-1} is abelian. Every subgroup of a solvable group, and every quotient of a solvable group is solvable. It should be clear that if a group is simple nonabelian, it is not solvable. The title of the post, albeit a bit foolish, attemps to convey the following idea: groups are simple because they admit exactly two normal subgroups, but the moral reason behind this varies wildly between each family of simple groups. For example, the most trivial family of simple groups is \mathscr F_1=\{C_p:p\text{ a prime } \}, and the reason they are simple is no mystery: any subgroup must have index a divisor of p, which is either 1 or p. It follows every C_p (a cyclic group of prime order) admits exactly two subgroups, hence exactly two normal subgroups. In this post we will look at two other infinite families of finite simple groups: the family of alternating groups \mathscr F_2 = \{A_n: n\geqslant 5\} and the family of projective special (or unimodular) groups \mathscr F_3 = \{{\rm PSL}(n,F):F \text{ a field and } (n,|F|)\neq (2,3),(2,2)\}.

The following is based on the exposition in Jacobson’s Basic Algebra I, which I encourage anyone (acquainted or not with undergraduate algebra) to look at.

Simple groups have been objects of interest in mathematics all the way back to Galois, who constructed the Galois group G_f of a polynomial f over  a field F (modernly defined as {\rm Gal}(E/F) for E a splitting field of f over F) as a certain permutation group of its roots. Galois proved his celebrated theorem that the equation f(x)=0 is solvable by radicals over F if and only if the Galois group G_f is solvable. If f is of degree n, then one can consider G_f as a subgroup of S_n, in particular if  one considers a general equation of degree n over F(t_1,\ldots,t_n) (i.e. we’re solving the equation generically in the coefficients t_i)

x^n -t_1 x^{n-1}+\cdots+(-1)^n t_n=0

then it can be shown the corresponding Galois group is S_n. We will show now that A_n\lhd S_n is a normal subgroup (for it has index 2) which is simple if n\geqslant 5 — we shall call n the degree of the alternating group. This nomenclature arises from group actions: a group action is said to be of degree d if it is defined on a set of precisely d elements, and the alternating group acts canonically on the n-set of letters \{1,2,\ldots,n\}. Since A_n is simple, it follows that S_n is not solvable for n\geqslant 5 (why?) and settles the famous theorem (Galois’ theorem notwithstanding) that the general equation of degree n is not salvable by radicals if n\geqslant 5. Let us prove a lemma, first:

Lemma 0 The alternating groups of degree at least five are generated by the three cycles, and all three cycles are conjugate.

Proof Since S_n is generated by the transpositions (ij) it follows A_n is generated by the double transpositions (ij)(kl), so it suffices to show every double tranposition is a product of three cycles. The reader can verify that (ij)(kl) = (ilk)(ijk) if i,j,k,l are distinct and that (ij)(ik)=(ikj), which proves the first claim. Since \sigma (ijk)\sigma^{-1} = (\sigma(i)\sigma(j)\sigma(k)) for any permutation \sigma, it follows easily that every three cycle is conjugate to (123) in S_n. If \sigma is even, we’re done. Else, since n\geqslant 5 we can choose l,m different from \sigma(i),\sigma(j),\sigma(k) and take \tau= (lm)\sigma, for \tau and \sigma (ijk)\sigma^{-1} commute. \blacktriangleleft.

We can now prove the

Theorem (Galois) The alternating groups of degree at least five are all simple.

Proof  We will use the previous lemma, of course. Indeed, we will show that every nontrivial normal subgroup of A_n with n\geqslant 5 contains a three cycle, which proves the claim. To this end, we note that if N\lhd A_n is nontrivial it contains a nontrivial permutation that fixes some letter (show that for any permutation \sigma that fixes no letter there exists a permutation \tau such that [\sigma,\tau] fixes some letter). We claim that if we pick such nontrivial permutation that is not a three cycle, we may conjugate it to obtain a permutation with more fixed points. It will follow that any permutation with a maximum number of fixed points is a three cycle. Indeed, let \sigma be nontrivial with fixed points, but not a three cycle. Using the cycle decomposition in S_n we can write \sigma as a permutation of the form \sigma_1= (123\cdots)\cdots or \sigma_2 = (12)(34)\cdots. Let \tau = (345). We then have {}^\tau \sigma_1 =(124\cdots)\cdots and {}^\tau \sigma_2 = (12)(45)\cdots and in either case {}^\tau \sigma \neq \sigma so that \sigma'= [\tau,\sigma]=\tau\sigma\tau^{-1}\sigma^{-1}\neq 1. Any letter >5 is fixed by \tau so if it is fixed by \sigma it is fixed by \sigma'. The reader can verify that in the first case \sigma' fixes 2, and since \sigma_1 moves 1,2,3,4,5 we obtained more fixed points. In the second case \sigma' fixed 1 and 2, and again we have more fixed points. This proves the theorem. \blacktriangleleft

We now move on to a more geometric situation, and consider the projective special groups. Since these are not as popular as the alternating groups, let us take a moment to recall how they are defined and dwell on them a bit further. To do so we’ll have to recall, in turn, a series of definitions and results in basic group theory.

Definition 1 We say that a group G is perfect if it is equal to its commutator subgroup G', generated by all commutators [h,g]=hgh^{-1}g^{-1} with g,h\in G. A group G is said to act on a set X if there is a group homomorphism G\longrightarrow {\rm Aut}(X). The kernel of this homomorphism is said the be the kernel of the action (it is of course a normal subgroup of G). We shall say the action is faithful if the group morphism G\longrightarrow X is injective.

Definition 2  We usually denote by gx the image of x\in X under the automorphism associated to g\in G and by Gx the subset \{gx:g\in G\} and call it the G-orbit of x\in X. We denote by G_x the subset \{g\in G: gx=x\} of elements that fix x and call it the stabilizer of the point x in G. It is not hard to see the collection of distinct orbits of a fixed action partition X, and that stabilizers are always subgroups. It is not  hard to see, either, that |G:G_x|=|Gx|. There is an explicit bijection: map gG_x to gx\in Gx, and check this is a bijection. It is in fact a G-set morphism, but we won’t be caring about this here.

Definition 3 We say a partition \mathscr B of X is G-stable if for every g\in G and B\in\mathscr B, gB\in \mathscr B. In words: the action permutes the blocks of the partition. Evidently the trivial partition \{X\} and the singleton partition \{\{x\},x\in X\} are stable, as so is the partition under orbits. We say that an action is primitive if X admits only two stable partitions, else we call the action imprimitive. In particular any primitive action on a set with more than one element must be transitive.

We leave it to the reader to verify that a transitive action is imprimitive if and only if there exists a subset A of X of at least two elements such that for any g\in G either gA=A or gA\cap A=\varnothing. One direction is easy, for the converse note that gA,g\in A and their complement form a partition that is stable.

Definition 4 Given a group G acting on a set X, we have a canonical action on the product X\times X by g(x,y)=(gx,gy). We say that a group action is two-fold transitive if G acts transitively on X\setminus \Delta X under this canonical action. In words: for every pair of distinct pairs x\neq x', and y\neq y' there exists g\in G such that gx=y and gx'=y'.

Having all the ingredients to do so, we may as well present our main tool to prove that the family \mathscr F_3 consists also of simple groups:

Theorem (Iwasawa, 1941) Suppose a perfect group G acts primitively on a set X, and that for some x\in X the stabilizer G_x contains a normal abelian subgroup A whose conjugates generate G. Let K be the kernel of the action. Then the quotient group G/K is simple.

To prove this, we aid ourselves in the following.

(1) Any two-fold transitive action is primitive.

(2) If a group G acts primitively on a set X and H\lhd G is not contained in the kernel of the action, H acts transitively on X.

(3) If a subgroup H acts transitively on a set X, then G= HG_x for any x\in X.


(1) Let A be a proper subset of at least two elements x,y. By two-fold transitivity there is some g such that gx=x and gy\notin A, so that gA\cap A is nonempty and gA\neq A.

(2) Our set is partitioned into H-orbits, and since H is normal we have Hg=gH for any g\in G, which implies the partition is G-stable. Since the partition is not that by singletons by hypothesis, we must have only one H-orbit.

(3) This is rather easy, and is left as an exercise.  \blacktriangleleft

We now prove Iwasawa’s criterion. Let H>K be normal. By (1), H acts transitively on X. Let G_x be the stabilizer in the statement of the theorem. By (3) above, we know that G=HG_x. Now HA is normal in HG_x=G and since the conjugates of A generate G, it must be that G=HA. It follows that G/H=HA/H\simeq A/A\cap H is abelian, so that H\geqslant G'=G. \blacktriangleleft.

Given a field F, let P_{n-1}(F) denote the collection of one dimensional subspaces of F^{n}, which we call the n-1-dimensional projective space over F. The linear group {\rm GL}(n,F) of (invertible) matrices of size n over F acts on the projective space by M(Fx)=F(Mx) for Fx a one dimensional subspace. We will content ourselves with finite dimensional vector spaces, as above. In the following series of lemmas we will gather some information on the special linear groups that will allow us to apply Iwasawa’s criterion to the previously defined action of {\rm SL}(n,F) on the projective space. Some minor details and calculations are omitted. The reader is strongly encouraged to carry them out in detail!

Lema 1The kernel of the action just defined (for any n!) is the subgroup of scalar matrices \lambda 1 with \lambda \in F which is the center Z of {\rm GL}(n,F). The quotient group thus obtained is called the projective linear group {\rm PGL}(n,F)={\rm GL}(n,F)/Z, and the quotient associated to the restriction of this action to one of the special linear group {\rm SL}(n,F) is called the special projective group {\rm PSL}(n,F)={\rm SL}(n,F)/(Z\cap {\rm SL}(n,F)).

Proof It is clear any scalar matrix is in the kernel of the action. Suppose now that T fixes every one dimensional subspace. For each x we can write Tx=\lambda_x x and T(\mu x)= \mu Tx=\lambda_x(\mu x), hence T is diagonal if n=1. If n\geqslant 2 fix a basis (e_1,\ldots,e_n) and note that T(e_i+e_j)=Te_i+Te_j= \lambda_{e_i+e_j}e_i+\lambda_{e_i+e_j}e_j=\lambda_{e_i}e_i+\lambda_{e_j}e_j so that T=\lambda 1 with \lambda =\lambda_{e_1}. We have simply proved that every invertible linear transformation which has every vector as an eigenvector must be a scalar multiple of the identity. \blacktriangleleft

Lema 2 The special linear group is generated by the elementary matrices T_{ij}(\lambda)=1+\lambda e_{ij} , i \neq j.

Proof We can prove a stronger result, namely that if D is a PID then {\rm SL}(n,D) is generated by the elementary matrices. This follows from the Smith Normal Form for matrices over principal ideal domains and the Whitehead lemma. Indeed, given A unimodular we may find P,Q invertible products of elementary matrices plus permutation matrices such that PAQ is diagonal. If p_{ij} is the permutation matrix that interchanges row (or column) i and j, p_{ij}=T_{ij}(1)T_{ji}(-1)T_{ij}(1)(1-2e_{ii}), so we can assume we have only elementary matrices and matrices of the form e^i= 1-2e_{ii}. But T_{ij}(\lambda) e^i = e^i T_{ij} (\lambda)^{-1} and T_{ji}(\lambda)e^i=e^i T_{ji}(\lambda)^{-1}, and e^i commutes with every other elementary matrix. It follows we may gather the e^i and using they are involutions obtain P'AQ' a diagonal matrix with P',Q' products of elementary matrices, and hence simply show diagonal unimodular matrices are product of elementary matrices. The Whitehead lemma now says that {\rm diag}(d^{-1},d) is elementary, so we may multiply to the right to replace the first nonzero diagonal element by 1, and iterate to obtain {\rm diag}(1,d_1d_2,d_3,\ldots,d_n)\to {\rm diag}(1,\ldots,1,d_1\dots d_n)=1 for d_1\cdots d_n=1.

Lemma 3 Any special linear group is perfect unless it is {\rm SL}(2,2) or {\rm SL}(2,3). 

Proof If n\geqslant 3, we can use the Steinberg relations to write T_{ij}(\lambda) = [T_{ik}(\lambda),T_{kj}(1)], so we can assume that n=2. In this case [{\rm diag}(d,d^{-1}),T_{12}(\gamma)]=T_{12}(\gamma(d^2-1)), and then |F|>3 gives some d\neq 0 such that d^2\neq 1. Taking transposes and recalling the Whitehead lemma gives what we wanted. \blacktriangleleft

Lemma 4 The special linear group acts doubly transitively on the projective space if n\geqslant 2.

Proof We want to show that given two pairs of different lines, equivalently two pairs v_1,v_2, w_1,w_2 of linearly independent vectors, there is a unimodular matrix that takes v_i to w_i. If n>2 we can always extend the w_i to a basis w_1,w_2,\ldots,w_n, write v_i = \sum a_{ij}w_j and subsequently extend the 2\times n matrix obtain one of determinant 1. If n=2 then the 2\times 2 matrix (a_{ij}) has nonzero determinant d, and we may then consider the matrix associated to the linear transformation that sends v_1\mapsto w_1 and v_2\mapsto d^{-1}w_2 in our basis, which has determinant 1. \blacktriangleleft

Lemma 5 The stabilizer of e_1 in the special linear group contains a normal abelian subgroup whose conjugates generate the whole special group.

Proof The stabilizer of e_1 consists of matrices of the form

\begin{pmatrix} 1 & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}

We can consider the group morphism that sends such matrix to the submatrix of i,j>1. This has kernel A the matrices of the form

\begin{pmatrix} 1 & 0 & \cdots & 0 \\ a_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & 0 & \cdots & 1 \end{pmatrix}

which the reader can verify is abelian. Evidently any matrix T_{12}(\lambda) is contained in A, and the fact that S T_{12}(\lambda) S^{-1} = T_{21}(-\lambda) with S=\begin{pmatrix} 0&1\\ -1 &0\end{pmatrix} extended with the identity if necessary shows the conjugates of A contain any matrix T_{21}(\lambda). Finally, the relation [T_{jk}(1),T_{ij}(\rho)]= T_{ik}(\rho) allows us to obtain the remaining elementary matrices from these two. \blacktriangleleft

The reader is encouraged to check that {\rm PSL}(2,2)\simeq S_3 and that {\rm PSL}(2,3)\simeq A_4 are not simple. In fact there is a solvable series 1 \lhd C_2 \lhd V_4\lhd A_4 \lhd S_4: this is intimately related to the fact we can solve any quartic equation f(x)=0 by radicals and that there is a canonical procedure using what is called a resolvent cubic g of f (related to the Klein four group V_4) and the discriminant of the polynomial (which allows us to figure out the Galois group G_f by knowing that of G_g).

In an upcoming post we will hopefully dwell into the Mathieu groups, M_{10},M_{11}, M_{12},M_{22},M_{23} and M_{24}, which were studied by Émile L. Mathieu.