My early research revolved around addressing open questions left by Knuth in The Art of Computer Programming, Volume 3: Sorting and Searching, which was published during my first year as a graduate student.
A reviewer for one of the several journal articles based on research in my thesis commented, “I hope that this is the last paper we see on Quicksort.” I have a feeling that the reviewer was Tony Hoare, who might have been justified in thinking that his original 1960 paper covered everything interesting about Quicksort. But hundreds of papers have been published on the algorithm since that time, and quite a bit of useful knowledge has been discovered. As the method of choice (still) in practical sorting applications, the algorithm is scrutinized carefully, and new variants are being discovered and tested to this day.
Equal keys. One contribution of my thesis was a study of the performance of Quicksort when equal keys are present, building on work on BSTs by Burge. It did not seem of much practical interest at the time, beyond two observations: quadratic running time can easily be avoided when large numbers of equal keys are present, simply by stopping the partitioning pointers on keys equal to the partitioning element during partitioning, and things can be further improved, making Quicksort linear in many cases, by three-way partitioning (doing the extra work to put all keys equal to the partitioning element into position).
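The pointer-stopping idea is easy to see in code. Here is a minimal Python sketch (illustrative, not the thesis code): both scans stop on keys equal to the partitioning element, so equal keys are split evenly between the two subfiles and an all-equal file is partitioned in the middle rather than degenerating to quadratic time.

```python
def partition(a, lo, hi):
    """Hoare-style partition that stops BOTH scanning pointers on keys
    equal to the partitioning element, so equal keys are split evenly
    between the two subfiles."""
    v = a[lo]                          # partitioning element
    i, j = lo, hi + 1
    while True:
        i += 1
        while i < hi and a[i] < v:     # stops when a[i] >= v, equals included
            i += 1
        j -= 1
        while a[j] > v:                # stops when a[j] <= v; a[lo] == v is a sentinel
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]          # put v into its final position
    return j

def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        j = partition(a, lo, hi)
        quicksort(a, lo, j - 1)
        quicksort(a, j + 1, hi)
```

Because the scans stop on equals, a file of 500 identical keys is split near the middle at every level, giving logarithmic depth instead of quadratic running time.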
Not all practitioners followed this advice. Jon Bentley and Doug McIlroy at Bell Labs discovered early on that the Unix implementation of Quicksort went quadratic for large numbers of equal keys, and decided that it was worthwhile to do three-way partitioning. I was working with them on the third edition of Algorithms at the time, and Bentley and I realized that three-way partitioning is within a constant factor of optimal for any set of key values (and conjectured that the constant is 1). We also realized that it is easy to adapt to sorting records with keys spanning multiple fields, where it is again the method of choice because applications typically have large numbers of equal keys in some fields.
Eventually, in the fourth edition of Algorithms, we decided to recommend three-way partitioning using an ancient algorithm due to Dijkstra (his solution to the Dutch national flag problem). More recently, a three-way partitioning scheme using two partitioning elements has attracted some attention. I studied this one in my thesis, but judged it to be not worth the effort. Recent studies on modern machines show that it can be recommended over normal Quicksort, particularly because it also handles equal keys well.
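Dijkstra's three-way partitioning can be sketched as follows (a minimal Python rendering in the spirit of the fourth edition's recommendation, with names of my own choosing): a single pass maintains regions of keys less than, equal to, and greater than the partitioning element, and the equal keys never enter a recursive call.

```python
def quicksort3(a, lo=0, hi=None):
    """Quicksort with Dijkstra's three-way partitioning: one pass puts
    every key equal to the partitioning element into final position,
    so large runs of equal keys cost only linear time."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    v = a[lo]
    # invariant: a[lo..lt-1] < v, a[lt..i-1] == v, a[gt+1..hi] > v
    lt, i, gt = lo, lo + 1, hi
    while i <= gt:
        if a[i] < v:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > v:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1
    quicksort3(a, lo, lt - 1)      # keys < v
    quicksort3(a, gt + 1, hi)      # keys > v; equal keys are already in place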
Quicksort with Equal Keys. SIAM J. on Computing 6, 2, 1977.
Quicksort is Optimal. Slides for talk at KnuthFest, Stanford University, January, 2002.
Multiway Quicksort (with J. Bentley). Dr. Dobb's Journal, November, 1998.
Algorithms, Third Edition, in C, Parts 1–4: Fundamentals, Data Structures, Sorting, and Searching. Addison-Wesley, Reading, MA, 1998, pp. 327–329.
Algorithms, Fourth Edition (with K. Wayne). Addison-Wesley, Reading, MA, 2011, pp. 298–301.
Solved. In 2016, Sebastian Wild proved the Bentley-Sedgewick conjecture about asymptotic optimality.
Strings. But the real advantage of three-way partitioning comes when sorting with string keys. The simple string-sorting algorithm based on three-way partitioning that Bentley and I developed is the method of choice in many practical applications.
Fast Algorithms for Sorting and Searching Strings (with J. Bentley). Proc. 8th Symposium on Discrete Algorithms, 1997.
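The algorithm can be sketched as follows (a minimal Python rendering, not the published code): partition on the d-th character using three-way partitioning, and advance to character d+1 only in the middle (equal) part, so shared prefixes are examined just once.

```python
def sort_strings(a):
    """Three-way string quicksort: sort a list of strings in place."""
    _sort(a, 0, len(a) - 1, 0)

def _char_at(s, d):
    return ord(s[d]) if d < len(s) else -1   # -1 marks end of string

def _sort(a, lo, hi, d):
    if lo >= hi:
        return
    v = _char_at(a[lo], d)                   # partition on the d-th character
    lt, i, gt = lo, lo + 1, hi
    while i <= gt:
        c = _char_at(a[i], d)
        if c < v:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif c > v:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1
    _sort(a, lo, lt - 1, d)                  # strings with a smaller d-th char
    if v >= 0:
        _sort(a, lt, gt, d + 1)              # same d-th char: move on to d+1
    _sort(a, gt + 1, hi, d)                  # strings with a larger d-th char
```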
One of my most original pieces of research involved a connection between a little-understood sorting algorithm, Shellsort, and a classic problem in number theory known as the Frobenius problem.
This broke new ground and led to several papers, two Ph.D. theses on Shellsort and variants, and a number of practical improvements. At the time, the problem of whether there exist O(N log N) sorting networks was still open (resolved in 1983 by Ajtai, Komlós, and Szemerédi), so this research seemed a possible path to finding practical O(N log N) sorting networks.
Many later papers have studied this problem, and the central question about worst-case performance is resolved—there is no increment sequence for which the worst-case number of compares used by Shellsort is O(N log N)—but the question of whether there exists an increment sequence that would make Shellsort compete with Quicksort in practice still remains open.
A New Upper Bound for Shellsort. Journal of Algorithms 7, 1986. Invited paper at 1982 IEEE International Conference on Information Theory.
Analysis of Shellsort and Related Algorithms. Invited paper at 1996 European Symposium on Algorithms.
Solved. Poonen developed a lower bound that proves that there is no sorting network in the Shellsort family for which the worst-case number of compares is O(N log N).
Solved. Goodrich developed randomized O(N log N) Las Vegas and Monte Carlo variations of Shellsort.
Open (theory). Find an increment sequence where the average number of compares used by the standard Shellsort algorithm to sort N elements is O(N log N).
Open (practice). Find an increment sequence for which Shellsort sorts random files of 1 billion elements with fewer than 50 billion compares on average.
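A candidate increment sequence can be evaluated empirically with a compare-counting harness along these lines (a Python sketch; the function names are mine, and the 3x+1 sequence included as a baseline is Knuth's, not a proposed answer to the open problem):

```python
def shellsort_compares(a, increments):
    """Shellsort a with the given increment sequence (descending, ending
    in 1), returning the number of key compares used."""
    count, n = 0, len(a)
    for h in increments:
        for i in range(h, n):          # insertion sort with stride h
            j = i
            while j >= h:
                count += 1             # one compare: a[j] vs a[j - h]
                if a[j] < a[j - h]:
                    a[j], a[j - h] = a[j - h], a[j]
                    j -= h
                else:
                    break
    return count

def knuth_increments(n):
    """Knuth's 3x+1 sequence 1, 4, 13, 40, 121, ... (a baseline only)."""
    h, seq = 1, []
    while h < n:
        seq.append(h)
        h = 3 * h + 1
    return seq[::-1]
```

Running such a harness on random files of increasing size gives an empirical average-compare count per element, which is how a candidate sequence would be measured against the 50-billion-compare target.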
Knuth left the analysis of Heapsort almost completely open. The worst-case performance is within a factor of two of optimal, which is good enough for most people. But settling for this result falls short of the high standard that he set for precise, if not exact, formulas describing the best, worst, and average cases of other leading sorting algorithms.
Best case. Tackling the average case seemed much too difficult, so my first idea was to try to understand what the best case might be. Maybe the average case could be trapped between the best and the worst. Which permutation of N items causes Heapsort to use the fewest compares? A simple argument based on Floyd’s variant of Heapsort shows that the number of compares cannot be above 1/2 N lg N, but could it be linear? If so, Floyd’s variant would be optimal.
There are many fewer possible heaps than permutations, so I wrote a program to generate all possible heaps to look for the answer. I could see that there was a growing gap, but the size of the heaps I could generate was still too small to see a pattern.
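The experiment can be reproduced in miniature (a Python sketch, certainly not the original program): enumerate every heap on N distinct keys, run the sortdown phase of Heapsort with a compare counter, and record the extremes. Only tiny N are feasible this way, which was exactly the limitation.

```python
from itertools import permutations

def is_heap(a):
    """Check the max-heap condition, root at index 0."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))

def sortdown_compares(a):
    """Run the sortdown phase of Heapsort on the max-heap a (in place),
    returning the number of key compares used."""
    count, n = 0, len(a)

    def sink(k, m):
        nonlocal count
        while 2 * k + 1 < m:
            j = 2 * k + 1
            if j + 1 < m:
                count += 1
                if a[j] < a[j + 1]:
                    j += 1                 # pick the larger child
            count += 1
            if a[k] >= a[j]:
                break
            a[k], a[j] = a[j], a[k]
            k = j

    while n > 1:
        a[0], a[n - 1] = a[n - 1], a[0]    # exchange the max with the last key
        n -= 1
        sink(0, n)
    return count

def best_and_worst(n):
    """Exhaustive search over every heap on n distinct keys (tiny n only)."""
    counts = [sortdown_compares(list(p))
              for p in permutations(range(n)) if is_heap(list(p))]
    return min(counts), max(counts)
```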
Fortunately, I spent a couple of summers consulting at IDA, where I learned to program the CRAY-1, by far the biggest and fastest computer available. I coded up my Heapsort program (in assembly language, I think) and was able to see the best case in some good-sized heaps, which gave me the insight needed to come up with a construction proving that Floyd’s method is not asymptotically optimal in the worst case.
Average case. Using standard techniques seemed hopeless from the start, but I realized that my CRAY-1 runs showed that most heaps seemed to require ~N lg N compares, so I wondered whether some sort of counting argument might work. My student Russel Schaffer found such an argument, an amazingly simple one that completed our knowledge of the basic facts about Heapsort.
The Analysis of Heapsort (with R. Schaffer). Journal of Algorithms, 1993.