SymPy Group Theory (GSoC 2017)

Tags: gsoc, sympy, computational group theory

Delays

The rewriting PR hasn’t been merged yet. At the beginning of last week, there was still some work to be done on it. For example, the RewritingSystem’s check_confluence() method (which, unsurprisingly, checks whether the system is confluent) would add new rewriting rules as it ran, which is definitely not something a check should do. Another addition, suggested by my mentor, was the attribute _max_exceeded: it is set to True when the Knuth-Bendix completion method has already been attempted but exceeded the maximum number of rules allowed to be added to the system. In that case, trying again would be a waste of time. However, if the maximum number of rules is increased, _max_exceeded is reset so that the completion method can be run again.
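To make that last point concrete, here is a minimal sketch of how the flag can work. The method names mirror the ones mentioned above, but the bodies are simplified placeholders rather than the real implementation in the PR.

```python
# Minimal sketch of the _max_exceeded pattern; simplified from the
# actual RewritingSystem in the PR.
class RewritingSystem:
    def __init__(self, maxeqns=32767):
        self.maxeqns = maxeqns        # cap on the number of rules
        self._max_exceeded = False    # True once completion hit the cap

    def set_max(self, n):
        # Raising the cap makes another completion attempt worthwhile,
        # so the flag is reset.
        if n > self.maxeqns:
            self._max_exceeded = False
        self.maxeqns = n

    def make_confluent(self):
        if self._max_exceeded:
            # A previous run already exceeded the cap; retrying with
            # the same limit would be a waste of time.
            return False
        # ... Knuth-Bendix completion would go here; if it adds more
        # than self.maxeqns rules, it sets self._max_exceeded = True.
```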

Then I worked on the multi-step version of the presentation algorithm (which became possible once the homomorphism PR was merged). After writing it up, making sure it worked and testing it against the single-step one, I noticed that the multi-step version ran several times faster for several of the groups I was using, yet was suddenly much slower for a group of higher order. This was unexpected and I spent a while trying to work out why.

The reason was this: the function uses coset_enumeration_c to fill in a certain coset table. coset_enumeration_c stores the deductions it makes while filling in the table so that it can process them later. However, if there are too many deductions, it’s not practical to go through all of them, so the algorithm uses the look_ahead method, which allows it to continue without examining the deduction stack, which is simply emptied. The critical size of the deduction stack was 500 by default. For large groups, the single-step version would greatly exceed that most of the time, so look_ahead was run instead, speeding it up considerably, while the multi-step version worked with a smaller coset table whose number of deductions was large but under 500, which made it slow. So it looked like processing close to 500 deductions was inefficient, and I reduced the limit to 100 to see what would happen. What happened was that both versions became faster, and the multi-step version was now consistently faster than the single-step one. I ran the coset table tests and they passed too, so the reduction in the maximum number of deductions didn’t seem to affect anything negatively. I pushed the multi-step version into the presentation PR, but its review hasn’t started yet because another problem came up.
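For illustration, here is a simplified, self-contained sketch of that cutoff. scan and look_ahead are stand-in stubs for the real coset table routines; only the threshold logic reflects what is described above.

```python
# Simplified sketch of the deduction-stack cutoff in coset
# enumeration; scan() and look_ahead() are stubs, not the real
# SymPy routines.
MAX_STACK_SIZE = 100  # was 500 by default; 100 proved faster in practice

def scan(table, coset, gen):
    # Stand-in for the routine that checks a single (coset, generator)
    # deduction against the relators.
    pass

def look_ahead(table):
    # Stand-in for the full-table scan used once the deduction stack
    # is abandoned.
    pass

def process_deductions(table, deduction_stack):
    if len(deduction_stack) >= MAX_STACK_SIZE:
        # Too many pending deductions to process one by one:
        # discard the stack and scan the whole table instead.
        deduction_stack.clear()
        look_ahead(table)
    else:
        while deduction_stack:
            coset, gen = deduction_stack.pop()
            scan(table, coset, gen)
```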

The homomorphism tests from the merged PR would occasionally time out, so that needed investigating. It turned out that the __contains__ method of the FpSubgroup class couldn’t handle group elements in conjugate form, i.e. something of the form r**-1*w*r for some words r, w. It would get into an infinite loop, and the tests only occasionally timed out because the method is used during kernel computation, which is randomised, so the conjugate case wasn’t always hit. A couple of days were spent thinking about the best way to eliminate the problem, whether this sort of problem could only be caused by conjugate elements, and finally extending the code accordingly. You can follow what I did in this PR. This is still being discussed, though, and it’s unclear whether some other solution will be called for.
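For reference, this is the shape of element that triggered the loop. The snippet just builds such a word with SymPy’s free groups to show the form; it doesn’t reproduce the FpSubgroup issue itself.

```python
from sympy.combinatorics.free_groups import free_group

# Construct a word in the conjugate form r**-1*w*r, the kind of
# element FpSubgroup.__contains__ couldn't handle.
F, r, w = free_group("r, w")
element = r**-1 * w * r
print(element)  # r**-1*w*r
```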

So because of all this, I couldn’t work on the other homomorphism cases (though I did sketch out some things that I can use once the dependent PRs are merged). If the FpSubgroup issue is resolved soon and the rewriting PR is merged, I will implement homomorphisms to FpGroups. The presentation PR will probably need more reviewing, and I can’t implement PermutationGroup domains without it. In the meantime, I’m considering looking into an alternative method of computing homomorphisms specifically for the PermutationGroup-to-PermutationGroup case, since I’ve seen a section on it in the Handbook. Or I could start working on Sylow subgroups, which is another thing I was thinking of doing. Neither of these depends on any of the unmerged PRs, so it would be a good use of time.