Robert Ramey Software Development is a company dedicated to producing and maintaining the World's Fastest General Purpose sorting program for all computing environments.
So far this goal has been achieved for all Intel platforms. In 1994, the Postman's Sort-tm- was integrated as a component in SQLBase from Gupta Software and Princeton Softech .
Many years ago I, like most programmers, believed that there was a theoretical limit to how fast a sorting program could be. Through work unrelated to sorting I came to question that premise. This led me to go back to the original reference books that we all studied in college. It turned out that the well known theorms on the limits of sorting speed only applied to methods which used comparison of two records at a time. Other methods had been proposed but interest was limited because they used much more memory that the 16KB that most computers then had available and were much more difficult to generalize.
This lead me to submit a proposal to a technical journal for an article about a general purpose linear time sorting program. The proposal was rejected as it was "well known" that it was not possible to construct such a program. It became clear that the only way to convince anyone that such a program was possible would be to write and demonstrate it. Practical problems such as efficient file and memory management just about sank the project but eventually a program was produced and results were published in the C/C++ User's Journal . This was the first version of the Postman's Sort-tm-.
The results of this effort indicated that the Postman's Sort-tm- was about twice as fast as Quick Sort for typical cases. As files got bigger the advantage of the Postman's Sort-tm- became more pronounced. More work was done to handle pathalogical cases, command line syntax, and all the other things that are required to turn a great idea into a great product.
By 1997, the product was enjoying modest success. That year an article which showed the postman's sort to be about twice as fast for sorting of files that could fit in memory and and about the same as for files sorted from a temporary work file.
Not bad - but not what I really wanted to hear. It seemed that cpu speeds were starting to improve to the point where the main bottleneck became the time required to read and write the files. This diminished any comparative advantage the Postman's sort had over other programs. I wasn't sure what to do about this. Reworking code to be faster wouldn't help. And I didn't see any way to speed up the I/O.
News wasn't all discourageing however. In 1998, I was contacted by Microsoft Research . to supply a copy of the "Postman's sort" for some sorting bench marks they were conducting. The Postman's sort won a medal for this performance. It's currently hanging on my wall.
That sorting benchmark has become an anual event hosted by Jim Gray which provides a sort of history of recent computer evolution.
Still I felt that things were stalling on the I/O side. After some thought I made some changes at the algorithmic level with an eye to diminish random access I/O. Also I implemented async I/O for the platforms which support it. After spending some time adjusting to the Windows/NT async I/O API. I managed to get sorting speeds up to where I always thought they should be for large files.
As I write this (2005), my tests show that Postman's sort is about 3 times faster than the sort that ships on Windows/XP Professional accross a wide range of file sizes. The largest file I tested was 17 GB.
I am now embarked upon the task of convincing the world that there is a better way to sort. It won't be easy.