SymPy Group Theory (GSoC 2017) gsoc, sympy, computational group theory

Presentations of Permutation Groups

The work on finding finite presentations of permutation groups was somewhat delayed last week because I’ve come across a bug in the reidemester_presentation code while testing the one step version of the presetation algorithm (which is essentially finished). Turned out it was caused by an unnecessary cyclic reduction on one line in the main elimination technique. See PR for the explanation.

Another delay was because I realised that for the multi-step version, it would be useful to have the homomorphism PR merged. Mostly this is because of the generator_product method from the PermutationGroup class - this takes an arbitrary permutation in the group and returns a list of the strong generators such that the product of the elements of the list is equal to this permutation. But the homomorphism functionality that’s already there (the case of the homomorphism from an FpGroup to a PermutationGroup) would be helpful too. So I switched by attention to getting that PR ready for merging.

So this week I’ll try to add the multistep version and will also start implementing homomorphisms from PermutationGroups and to FpGroups, as outlined in the plan. Also, at some point when the rewriting PR is merged, I could try out the potentially improved random element generator for FpGroups that I sketched a couple of weeks ago. This is not officially part of the plan and not in any way urgent, but could probably give better random elements which would be useful for the order() method and the kernel calculation for homomorphisms. Though I might not actually get around to it this week.

The Rewriting System class

I sent the PR with the rewriting system implementation on Wednesday night, i.e. by Thursday as planned. It could have been earlier if I hadn’t spent one day experimenting with making the elements of FpGroup different from the underlying FreeGroup. At the moment they are the same so that, for instance:

>>> F, a, b = free_group('a, b')
>>> G = FpGroup(F, [a**2, a*b*a**-1*b**-1])
>>> a in F
>>> a in G

Theoretically, since F and G are different groups (not even isomorphic), it would make sense for their elements to be represented by different objects, e.g. the generators for G could be a_1, b_1 so that G and F are connected by the injective homomorphism sending a_1 and b_1 to a and b respectively. Though, for convenience, it could be allowed to refer to the elements of an FpGroup by the names of the underlying free group elements (and performing the conversion to a_1 and b_1 backstage before passing them onto FpGroup’s methods).

Now, if this were done, it would be possible to check for equality of two elements of FpGroup using the == syntax. E.g., if the generators of G were made distinct from a, b and called a_1, b_1, then we could have a_1*b_1 == b_1*a_1 (while, of course, a*b == b*a is not true because a and b are the elements of a FreeGroup). The latter was the main motivation for trying this out but in the end, after additionally discussing things with my mentor, I left everything as it is and implemented a couple new FpGroup methods instead. So, for example, to check if a*b and b*a are equal in G, one would call G.equals(a*b, b*a). The PR has more details.

At the moment I’m in the middle of writing the function for finding finite presentations of permutation groups. I’m slightly modifying the coset enumeration code to allow it to return an incomplete CosetTable (instead of raising a ValueError if, say, the maximum number of cosets is exceeded) and to start running with a partially filled table (which could be passed as a keyword argument) rather than with a new, empty instance. This is necessary as finding the presentation requires setting some of the coset table entries manually and adding relators to the future presentation while examining the undefined table entries, after which coset enumeration could be used to make more deductions. Messing with coset enumeration is the only complicated part (as far as I can see): the rest is pretty much sorted by now. I suppose, the only other thing is deciding when to find the presentation in one step and when to use the multi-step version (finding the presentation of a subgroup first and then the whole presentation based on that). The latter can be more efficient for larger groups. But I could think about that later, maybe run some tests, once the coset enumeration is set up properly.

The Plan (Part II)

At the beginning of this week I finished writing the first draft of the homomorphism class, at least as much of it as is possible without the rewriting systems for FpGroups. Homomorphisms from FpGroups to PermutationGroups are fully implemented though I’m sure corrections will be made as the PR is reviewed. I mentioned that I wasn’t planning on sending the PR until after the rewriting systems but there is quite a bit there and a lot of it can be reviewed (and even merged) now.

It took a bit longer than I expected because the rewriting for permutation groups wasn’t quite what was necessary for homomorphisms. I knew that there was a function returning the unique normal form of a permutation in the group (the coset_factor() method) and had assumed that that would be enough but the normal form wasn’t in terms of the generators, nor was there an easy way to rewrite it in terms of the generators. So I had to modify the schreier_sims() method that finds the strong generating set and the elements that make up the normal form returned by coset_factor() to keep track of which generators in what order are multiplied to obtain them. Then I wrote a method that would write any permutation in the group as a product of the strong generators and, since I had the information about the make-up of the strong generators, this could easily be rewritten in terms of the original generators. Doing this was a bit fiddly and took a while. Though I suppose it could be counted as part of the work on rewriting systems.

The rewriting systems for FpGroups are coming along. At the moment, it is a new class RewritingSystem which FpGroups will probably have an attribute for. I’ve almost finished implementing the Knuth-Bendix completion algorithm and that was most of the work. Though there is also the matter of reduction. At the moment I’m using the eliminate_words() method from the FreeGroupElement class - it takes a dictionary of substitutions (in this case a dictionary of reduction rules) and substitutes all of them into a given word, if possible. But that isn’t very efficient for long words and large reduction dictionaries since eliminate_words() will scan the word from the start every time it tries to perform a new substitution. The approach I’ve seen described in papers on rewriting systems is to construct a finite automaton to perform the reduction and modify it every time a new rule is added. This would somewhat increase the time taken to initiate the system and to make it confluent but would also make word reductions much quicker. However, it could also take a while to actually implement. I’m unsure if it’s worth doing it now - perhaps, I could add it later, when the rest of my project is done.

Anyway, I was going to use this post to outline a schedule for this month’s work, like I did in the first post. Looking back at the Timeline I sketched in my proposal, I seem to have given myself 2 weeks for implementing the rewriting systems. This estimate wasn’t unreasonable, considering the unexpected matter of rewriting in permutation groups I had to deal with in the first half of the week and the reading I had to do to get a better idea of Knuth-Bendix. But if I don’t implement finite automatons for reductions right now, I think I should be able to send the rewriting PR by Thursday (probably earlier). With this in mind, my plan for the month is roughly as follows.

The Plan (continued)

  • 3 July - 6 July: finish the RewritingSystem class.
  • 6 July - 16 July: implement finite presentations for PermutationGroups (this should be plenty of time considering that I made a start on this during the application period)
  • 17 July - 23 July: complete the homomorphism implementation to allow homomorphisms from PermutationGroups and to FpGroups. Implement the permutation presentation for finite FpGroups.
  • 24 July - 30 July: Set up the use of PermutationGroup methods for FpGroups via homomorphisms.

So by the end of the month, the homomorphism functionality should be complete, so should be the switching between FpGroup and PermutationGroup presentations (where it is possible, i.e. finite FpGroups). Making PermutationGroup methods work for FpGroups properly could take more time but if everything goes more or less according to plan, I should have most of August to try and implement Sylow subgroups for PermutationGroups (there is a chapter on it in “The Handbook of Computational Group Theory”), look into the construction of finite automatons for word reduction or automatically grouping subwords of group words, like (a*b)**2 instead of a*b*a*b, or anything else project-related really. That’s the optimistic scenario of course, provided no sudden terrible complications arise which can never be guaranteed.

The homomorphism class

At the beginning of the week, all of my previous PRs got merged. This is good. Also, the homomorphism class is now mostly written with the exception of the kernel computation. The kernel is the only tricky bit. I don’t think there is any general method of computing it, but there is a way for finite groups. Suppose we have a homomorphism phi from a finite group G. If g is an element of G and phi.invert(h) returns an element of the preimage of h under phi, then g*phi.invert(phi(g))**-1 is an element of the kernel (note that this is not necessarily the identity as the preimage of phi(g) can contain other elements beside g). With this in mind, we could start by defining K to be the trivial subgroup of G. If K.order()*phi.image().order() == G.order(), then we are done. If not, generate some random elements until g*phi.invert(phi(g))**-1 is not the identity, and redefine K to be the subgroup generated by g*phi.invert(phi(g))**-1. Continue adding generators like this until K.order()*phi.image().order() == G.order() or finite G this will always terminate (or rather almost always considering that the elements are generated randomly - though it would only not terminate if at least one of the elements of G is never generated which is practically impossible).

This is how I am going to implement it. I haven’t already because I’ve been thinking about some of the details of the implementation. One thing that could be problematic is multiple calls to reidemeister_presentation() to compute the presentation of K at each step. As I’ve discovered when I was implementing the search for finite index subgroups last week, this can be very inefficient if the index of K is large (which it could well be for a small number of generators at the start). After giving it some thought, I realised that actually we could completely avoid finding the presentation because K’s coset table would be enough to calculate the order and check if an element belongs to it. Assuming G is finite, K.order() can be calculated as the order of G divided by the length of the coset table so the knowledge of the generators is enough. And for membership testing, all that’s necessary is to check if a given element stabilises K with the respect to the action on the cosets described by the coset table. And that should be a huge improvement over the obvious calls to the subgroup() method.

Actually, FpGroups doesn’t currently have a __contains__() method so one can’t check if a word w is in a group G using w in G. This is easy to correct. What I was wondering was if we wanted to be able to check if words over G belong to a subgroup of G created with the subgroup() method - that wouldn’t be possible directly because subgroup() returns a group on a different set of generators but it wouldn’t be too unreasonable to have SymPy do this:

>>> F, a, b = free_group('a, b')
>>> G = FpGroup(F, [a**3, b**2, (a*b)**2])
>>> K = G.subgroup([a])
>>> a**2 in K

I asked Kalevi about it earlier today and they said that it would be preferable to treat K in this case as a different group that happens to be linked to G through an injective homomorphism (essentially the canonical inclusion map). If we call this homomorphism phi, then the user can check if an element of G belongs to the subgroup represented by K like so: a**2 in phi.image(). Here phi.image() wouldn’t be an instance of FpGroup but rather of a new class FpSubgroup that I wrote today - it is a way to represent a subgroup of a group while keeping the same generators as in the original group. It’s only attributes are generators, parent (the original group) and C (the coset table of the original group by the subgroup) and it has a __contains__, an order() and a to_FpGroup() methods (the names are self-explanatory). For finite parent groups, the order is calculated as I described above for the kernel and for the infinite ones redeimeister_presentation() has to be called. The injective homomorphism from an instance of FpGroup returned by subgroup() would need to be worked out during the running of reidemeister_presentation() when the schreier generators are defined - this is still to be done.

Another thing I implemented this week was an improved method for generating random elements of finitely presented groups. However, when I tried it out to see how random the elements looked, I got huge words even for small groups so I didn’t send a PR with it yet. Once I implement rewriting systems, these words could be reduced to much shorter ones. Speaking of rewriting systems, I think it could be good if the rewriting was applied automatically much like x*x**-1 is removed for FreeGroupElements. Though I suppose this could be too inefficient some times - this would need testing. This is what I’ll be working on this week.

The order method

Last week and some of this one I was working on changing the order() method of FpGroups. Currently SymPy attempts to perform coset enumeration on the trivial subgroup and, if it terminates, the order of the group is the length of the coset table. A somewhat better way, at least theoretically, is to try and find a subgroup of finite index and compute the order of this subgroup separately. The function I’ve implemented only looks for a finite index subgroup generated by a subset of the group’s generators with a pseudo-random element thrown in (this can sometimes give smaller index subgroups and make the computation faster). The PR is here.

The idea is to split the list of generators (with a random element) into two halves and try coset enumeration on one of the halves. To make sure this doesn’t go on for too long, it is necessary to limit the number of cosets that the coset enumeration algorithm is allowed to define. (Currently, the only way to set the maximum number of cosets is by changing the class variable CosetTable.coset_table_max_limit which is very large (4096000) by default - in the PR, I added a keyword argument to all functions relevant to coset enumeration so that the maximum can be set when calling the function.) If the coset enumeration fails (because the maximum number of cosets was exceeded), try the other half. If this doesn’t succeed, double the maximum number of cosets and try again. Once (if) a suitable subgroup is found, the order of the group is just the index times the order of the subgroup. The latter is computed in the same way by having order() call itself recursively.

The implementation wasn’t hard in itself but I did notice that finding the subgroup’s presentation was taking far too long in certain cases (specifically when the subgroup’s index wasn’t small enough) and spent a while trying to think of a way around it. I think that for cyclic subgroups, there is a way to calculate the order during coset enumeration without having to find the presentation explicitly but I couldn’t quite work out how to do that. Perhaps, I will eventually find a way and implement it. For now, I left it as it is. For large groups, coset enumeration will take a long time anyway and at least the new way will be faster in some cases and may also be able to tell if a group is infinite (while coset enumeration on the trivial subgroup wouldn’t terminate at all).

Now I am going to actually start working on homomorphisms which will be the main and largest part of my project. I’ll begin by writing the GroupHomomorphism class in this coming week. This won’t actually become a PR for a while but it is easier to do it first because I still have exams in the next several days. After that I’ll implement the Knuth-Bendix algorithm for rewriting systems (I might make a start on this later this week as I’ll have more time once the exams are over). Then I’ll send a PR with rewriting systems and once that’s merged, the GroupHomomorphism class one, because it would depend on rewriting systems.