Table of Contents
Previous Go to Robert Ramey Software Development Home Page next

7. Windows PSORT.DLL

The Postman's sort for windows has been encapsulated in a dynamic link library so that it can be seamlessly incorporated in any windows application. This DLL is named PSORT.DLL for Windows 3.1. DLLs for 32 bit versions of windows are PSORTNT.DLL and PSORTMI.DLL . The application interface is identical for both. Applications written in accordance with the following guidelines can be compiled for Windows 3.1 or Windows/NT/95 with the appropriate compiler.

The comments below that refer to DS register, MakeProcInstance, FreeProcInstance, and exported functions can be ignored if the application is destined to run only under 32 bit Windows system.

The Postman's Sort Dynamic Link library can be used in a variety of ways:

a) As a simple self contained sorting function.

b) It can use application defined callback functions to read and/or write records.

c) It can use application defined callback functions to filter and/or reformat records that read/written by PSORT.DLL functions before/after they are sorted. This permits the use of the PSORT.DLL command line to specify files as well as using the PSORT.DLL file input/output functions while still permitting the application to alter records before or after the sort. The file input/output functions used by PSORT.DLL are much faster than those found in most standard libraries.

d) It can be executed as a process parallel to the application's. This is referred to as a co-routine. This permits several sorts to be active simultaneously within the same application. An example of this might be a data base join program which interleaves records from several sorted input files.

The following sample applications illustrate how the PSORT.DLL functions are used to implement the above tasks. Complete source code for these sample applications is included with the Postman's Sort Developer's kit. PSORTW

7.1. PSORTW

The simplest program is the general purpose stand alone sorting utility PSORTW. It is just a shell around the PSORT.DLL which does all the sorting work. Stripped to its essentials, PSORTW looks like this:

PSORTYIELD a) YieldToWindows() while(PeekMessage(... DispatchMessage(... PSORTDISPLAY b) OSDisplay(.. ... PsortMain(CommandLine) OSDisplayPtr = MakeProcInstance(OSDisplay,...); c) OSYieldPtr = MakeProcInstance(OSYieldPtr,...); hPsort = PsortCreate( d) OSDisplayPtr, OSYieldPtr ); ExitCode = PsortStart( e) hPsort, (PSORTREADPTR)NULL, f) (PSORTWRITEPTR)NULL, CommandLine g) ); PsortDestroy(hPsort); h) FreeProcInstance(OSYieldPtr); i) FreeProcInstance(OSDisplayPtr); return ExitCode

a) This function is called periodically by PSORT.DLL permit the application to yield to windows or to process messages in its own message queue. This permits psort to execute a time consuming sort in the "background" while the operator continues with other tasks. The requirements for this function are described in detail in the next chapter under the PSORTYIELD api callback.

b) This function is called by psort when there is a message to be displayed. This will normally be an error message. A copyright message that is displayed during psort startup may by suppressed with the proper command line switch. More information on this function is found in the next chapter under the PSORTDISPLAY api callback.

c) Addresses of functions to be called by Psort MAY be generated with the MakeProcInstance Windows API call. This returns the address of a small temporary function which loads the applications DS register in the AX register before transfering control the the application callback function. When compiled with the appropriate command line switches, callback functions will copy the value in the AX register to the DS register upon entering the function. Thus callback functions will be able to access global and static variables declared in the application program. When compiled for Windows/NT/95, MakeProcInstance calls are not necessary and result in no operations.

d) This function allocates a sorting handle. All subsequent Psort API calls must include this handle as a parameter. This permits multiple sorting instances to be executed simultaneously.

e) This function initiates the sorting. When finished it will return an exit code. An exit code equal to zero indicates that the sorting terminated sucessfully. A non-zero exit code indicates that sorting was unable to complete.

f) The next two parameters are used to indicate the callback function that psort should use to read records to be sorted and write the sorted output. In this example, the input file(s) and the output file are to be specified on the psort command line so neither read nor write callback functions need be specified. NULL is assigned to these parameters to indicate this.

g) This parameter is a string containing the psort command line as described earlier in this manual.

h) This function releases the resources associated with the specified psort handle. It should be called as soon as possible after sorting is completed.

i) These FreeProcInstance calls release the memory reserved by the above MakeProcInstance calls if such calls are used. This should be done as soon as it will no longer be necessary to call OSDisplay and OSYield. Under Windows/NT/95 or the Win32s subsytem, FreeProcInstance is not necessary and will result in no code being generated if used. Windows 3.1 Prologues/Epilogues

7.2. Windows 3.1 Prologues/Epilogues

The usage of PSORT.DLL in this case is straight forward. The issue where most problems will occur is in the protyping, compilation, and usage of callback functions. PSORT.DLL executes with a separate data segment. PSORT.DLL restores the DS and SS registers to valid values before calling user supplied callback functions. Thus the issues of prologues/epilogues can usually be ignored. It is never necessary to invoke MakeProcInstance to create a callback function pointer. Using MakeProcInstance/FreeProcInstance does no harm and may be convenient to maintain source code compatiblity with other programs. As this is almost always a source of confusion, the Windows prologue/epilogue issues are explained below.

When a user function compiled with the application is called from a DLL the DS register must be set to that of the application when the callback function is invoked and restored to its previous value when the function returns. Several methods are used by Windows applications to accomplish this:

a) Load DS <- SS on function entry and restore on function return. This is generated by: Explicitly declaring the function as EXPORT Using special compile time switches. With Microsoft C/C++ 7.00 these switches are /GA /G2 /GEs. With Symantec C/C++ these switches are -WA.

b) Load DS <- DGROUP on function entry and restore on function return. This is generated by declaring the function with the __loadds key word. It can also be specified for all functions in a source file by using appropriate command line switches. For Microsoft C/C++ 7.00 this means /A?u while for Symantec C/C++ this means -m?u. This will work fine and callback function pointers need not be created with MakeProcInstance. The disadvantage of this method is that only one instance of the application can be executed at a time.

c) Load DS <- AX on entry to functions declared __export and restore on return from functions declared __export. This is generated by the following procedure:

i) Declare the callback function with the __export keyword. The definition of CALLBACK in the psort api header file psortw.h includes this. Note that this is redefinition of the standard windows definition of CALLBACK. If this method is to be used, this definition of CALLBACK should be used for all functions in the module. This will usually mean including the psortw.h file ahead of the header file which contains the function declarations for this module.

ii) Compile with appropriate command line switches. For Microsoft C/C++ 7.00 these are /GA /G2 /GEa while for Symantec C/C++ they are -Wea. The disadvantage of using this method is that a callback function can only be called through a function pointer created with the MakeProcInstance Windows API call. Calling the function directly from the application will normally result in undefined behavior as the DS register will be loaded from the AX register which will generally not contain the correct value.

d) Load DS <- AX on entry to all far functions and restore on return from all far functions. This is generated by compilation with appropriate command line switches. For Microsoft C/C++ 7.00 these are /Gw while for Symantec C/C++ use -W. The disadvantage of using this method is that function prologues and epilogues for setting and restoring the DS register are executed many times when it is not necessary thereby wasting execution time. However, this is the most foolproof method and recommended while debugging the application.

Since PSORT.DLL sets DS and SS registers to the original values before the users callback functions are called and retores those required by the DLL after the users callback function returns, The whole issue of callback functions can be ignored. In fact, this will result in the simplest and fastest program. Even so, any other scheme can be used on order to maintain compatibility with other source code modules or any other reason. ADDROUTW

7.3. ADDROUTW

The next example is the ADDROUT.C program whose source code is included in the developer's kit. The function of this program is to read a file and generate another file consisting of record addresses. The general method is to append the record address of each record as it is read and remove all but the record address of each record as it is written. The record addresses in the output file are in order of the sorting key. This illustrates how an application can massage records as they are read and/or written by PSORT.DLL . The interesting part of ADDROUT.C looks like the following:

local int PsortMain(CommandLine) OSDisplayPtr = MakeProcInstance(OSDisplay ...) hPsort = PsortCreate( a) OSDisplayPtr, (PSORTYIELDPTR)NULL ); ReadRecordPtr = MakeProcInstance(ReadRecord ...) b) WriteRecordPtr = MakeProcInstance(WriteRecord ...) i = PsortStart( hPsort, ReadRecordPtr, c) WriteRecordPtr, CommandLine ); FreeProcInstance(ReadRecordPtr); FreeProcInstance(WriteRecordPtr); PsortDestroy(hPsort); FreeProcInstance((FARPROC)OSDisplayPtr); return i; PSORTREAD ReadRecordRout(NewSize, NewAddress) d) if(--ReadCount == 0){ ReadCount = 1000; YieldToWindows(); } if(PsortRead(hPsort, &Size, &Address)) e) return 1; if(Size == 0){ f) *NewSize = 0; return 0; } _fmemcpy(Record, Address, Size); g) _fmemcpy(Record+Size, &DiskAddress, sizeof(DiskAddress) ); DiskAddress += Size; *NewAddress = Record; h) *NewSize = Size+sizeof(DiskAddress); return 0; PSORTWRITE WriteRecordRout(Size, RecordAddress) if(--WriteCount == 0){ WriteCount = 1000; YieldToWindows(); } return PsortWrite( i) hPsort, sizeof(long), RecordAddress + Size - sizeof(long) );

a) In this example the function parameter used to specify the windows yield callback function pointer has been loaded with NULL. PSORT.DLL will not call an application yield function. Since this application intervenes in the processing of every record there is plenty of opportunity to yield on its own.

b) Optionally, create function pointers for record read and write functions. The requirements for these functions are discussed in the following chapter under the PSORTREAD and PSORTWRITE function prototypes.

c) Sorting is initiated specifying that application functions should be used for record input and output.

d) The record input function is called when PSORT.DLL requires an input record. Parameters passed are the address which should be loaded with the record size and the address which should be loaded with the record address. In this example, YieldToWindows is called after each 1000 records is read.

e) This program uses the facilities of the PSORT.DLL to acquire the next input record. It could just as well have opened file and read the records using its own input/output code. However, in this case it is convenient to allow PSORT.DLL to read the data from the file(s) specified in the psort command line. Also the file input/output functions used by PSORT.DLL a generally much faster than those found in most libraries.

f) If PsortRead returns with 0 loaded in record size, there are no more records to be read. All that need be done is to return to the caller (In this case another part of PSORT.DLL) with 0 loaded in record size and NULL loaded in the record address. A value of 0 is returned to indicate that no error condition was encountered.

g) Here the address of the input record is appended to the record read to make a new record. Note that the input/output functions used in PSORT.DLL do binary input/output. That is, text records are terminated by a 0x0d and 0x0a on Windows systems.

h) The results of the modified read operation are copied into the data areas whose addresses where passed as parameters to the application record read function. The function returns indicating that no error was encountered.

i) The applications write callback function modifies the size and address of the data to be written and calls the appropriate record write function in the PSORT.DLL It could just as well have used its own output such as writing its output to the Windows display but in these case it was more convenient to exploit the command line processing and file output functions included in PSORT.DLL . JOINW

7.4. JOINW

The next example is the JOINW.C program whose source code is included in the developer's kit. This program reads two files sorts each one according to specified keys and joins the records with equal keys to build new records that are written to output. both sorts are executed concurrently with the application so that no intermediate files are seen by the application program. The joined records are written almost immediately after the last record of each file is read by its respective sorting instance. This illustrates PSORT.DLL functioning as a co-routine in parallel with the application as well as multiple concurrent sorting instances. Here is an outline of the relevant parts of JOINW.C

typedef struct { a) PSORTHANDLE hPsort; HWND hWnd; LPSTR CmndLine; PSORTDISPLAYPTR OSDisplayPtr; size_t Size; LPSTR Ptr; } R; R R1, R2; PSORTDISPLAY OSD1(Message) b) OSDisplay( R1.hWnd, Message ); WinMain(... Fileout = fopen(CmndLine, "w"); c) if(Fileout == (FILE *)NULL) return FALSE; R1.OSDisplayPtr = MakeProcInstance(OSD1 ... d) PsortOpen(&R1 e) nCmdShow, "First File", hInstCurrent, CmndLine ); ExitCode = PsortStart( f) R1.hPsort, NULL, PSORTSEND, R1.CmndLine ); if(ExitCode){ g) (*(R1.OSDisplayPtr)) h) ("Error detected ... "); goto windup; } ... Repeat the above for file #2 i) ... MainLoop(); j) windup: PsortClose(&R1); PsortClose(&R2); if(Fileout != (FILE *)NULL) fclose(Fileout); WinExit(); return ExitCode; MainLoop(){ PsortRetrieve( k) R1.hPsort, &(R1.Size), &(R1.Ptr) ); if(R1.Ptr == (LPSTR)NULL) return; ... Same as above for R2 ... for(;;){ int i; while(PeekMessage(... l) DispatchMessage(... } for(i = 1000;i > 0; ++i){ m) switch( n) RecordCompare(&R1, &R2) ){ case -1: PsortRetrieve( R1->hPsort, &(R1.Size), &(R1.Ptr) ); if(R1.Ptr == (LPSTR)NULL) return; break; case 0: RecordOutput(&R1, &R2); case +1: PsortRetrieve( R2.hP->hPsort, &(R2.Size), &(R2.Ptr) ); if(R2.Ptr == (LPSTR)NULL) return; break; } } } PsortOpen( &R, nCmdShow, Title, hInst ) CommandLineInit(... if(CmdLine == (LPSTR)NULL){ (*(R->OSDisplayPtr)) ("No sort command specified"); goto windup; } R->hWnd = o) InitPsortWindow( hInst, nCmdShow, Title ); R->hPsort = PsortCreate( p) R->OSDisplayPtr, (PSORTYIELDPTR)NULL ); PsortClose(&R) PsortDestroy(R->hPsort); q) FreeProcInstance(R->OSDisplayPtr);

a) There is one of these structures for each input file to be sorted. For each file to be sorted there is a separate sorting handle, command line, message display window, display function pointer and record size and address buffer pointers.

b) The display function pointer for a file calls a more general display function using the window handle as a parameter. That is for each file/sorting instance there is a message display function which displays the message on the proper window.

c) Output will be to a file using common operations. PSORT.DLL functions will not be used for writing the joined records to a file.

d) Optionally, create the message display function pointer for the first file. Assign it to the structure that will store other information related to the first file.

e) Initialize the structure associated with the first file. This will included creating handles for the sorting instance and message display windows as well as processing the command line information corresponding to the first file.

f) Initiate sorting on the first file. This invocation of PsortStart differs from our previous examples in that the record write callback function parameter contains the manifest constant PSORTSEND. This indicates that whenever a sorted record is ready, execution will be suspended pending a PsortRetrieve call by the application.

g) PsortStart will return almost immediately with an exit code indicating whether or not an error condition was encountered. This statement checks the exit code aborts the program if an error was encountered. Otherwise, execution continues on in the normal manner until a PsortRetrieve function call is executed. At that time the sorting process is re-activated and will until a record is ready to be sent to the application. When a record is ready to be sent to the application, the PsortRetrieve returns with the parameters filled with the size of the record and its address. If an error is encountered at any time in the sorting process, control reverts to this statement with a non-zero exit code. That is, if an error is encountered during sorting, this statement will be executed twice; the first time with an exit code equal to zero and the second time with a non-zero exit code.

h) Remember that exported functions must be called through pointers created with MakeProcInstance. They cannot generally be called directly unless Smart Callbacks are being used.

i) The same process is repeated for file number two. At this point there are two instances of the sorting process in existence. This is why calls to the psort api must include the correct handle of the sorting instance to be addressed.

j) At this point both sorting instances are suspended. Now the main loop function can be called. Note that we cannot return from the current function before calling the main loop. This is because it is possible that an error in one of the sorting processes would return control to the statement following its corresponding PsortStart and the stack will be restored accordingly. If the stack is shortened and control is then returned to PsortStart, the return address from the current function will have been lost. This is exactly the same as when setjmp/longjmp or CATCH/THROW are used.

k) To initialize the main loop, a record is loaded from each file. This is done through the PsortRetrieve call. When this call is executed sorting progresses until the first record is ready to be sent. At this point PsortRetrieve returns with the parameters filled with the size and address of the record read. One of the unique features of the algorithm used by the postman's sort is the fact that the first sorted records are ready for output long before the whole file is sorted. In fact they are ready almost immediately as soon as the last record from the input file is read. Thus even though the postman's sort is typically 2 or 3 times faster than alternative programs, response times for operations such as this data base join are typically one tenth times required by traditional methods.

l) Empty the Windows message queue.

m) Read 1000 records before we go back and empty the Windows message queue.

n) Compare the two records in the file buffers. If they are not equal, read the next record from the file whose current key is "less than" the other one and loop. If they are equal, write a joined record to file output and read the next record from the second file and loop. When one of the input files is exhausted, we're done.

o) Create the window to be used for messages sent by this sorting process and save its handle.

p) Create the sorting context to be used for this file and save its handle.

q) Release resources corresponding to the indicated sorting handle. Notes on PSORTNT.DLL

7.5. Notes on PSORTNT.DLL

The Postman's Sort Developer Kit for windows includes the PSORTNT.DLL . The Windows/NT/95 program PSORTWNT.EXE is built from the PSORTW.C source file just as PSORTW.EXE file is. In order to build this on a Windows 3.1 system one needs a true 32 bit compiler.

If the application is to be compiled only in 32 bit mode, the usage of MakeProcInstance/FreeProcInstance for function callbacks as described above is not necessary. Also, comments of related to function prologues and epilogues and the loading/saving of the DS register are not relevant and can be ignored. However, applications which do not follow these conventions can only be compiled in 32 bit mode. Applications designed for 16 bit mode can be compiled for 32 bit mode from the same source code. The sample applications include can compiled for 16 or 32 bit mode without modification.

Under Windows 3.1 with Win32s, PSORTNT.DLL will not support multiple concurrent sorting. Under Windows/NT/95, PSORTNT.DLL will not support multiple concurrent sorting within the same application. That is, That is, programs such as the JOINW.C example will not function when used with PSORTNT.DLL. PSORTNT.DLL will sort files only up to 2GB in size. Any and all of these limitations can be overcome by using PSORTMI.DLL instead of PSORTNT.DLL . This version supports multiple instances, multi-tasking and file sizes beyond 2GB. To use PSORTMI.DLL just change its name to PSORTNT.DLL and proceed as before. PSORTMI.DLL may be slightly slower than PSORTNT.DLL and might not work with developement environments which are not provided by Microsoft. PSORTMI.DLL requires that the file MSVCRT.DLL be installed on the target system.