CS735/CS835 Programming Assignment 6 (due Sun Dec 14)


This assignment concerns the parallel implementation of a linear system solver using both threads and message passing.

Write a program to parallelize the serial program (gaussian.c) found in ~cs735/public/gaussian. This program should execute on the Galaxy cluster. However, the code executing on each node of the cluster should be multithreaded.

You should use the Marcel threads library to support multithreaded execution. An implementation of the generic threads interface using Marcel is available on antares.cs.unh.edu in /usr/users/pjh/pi. This directory also includes the Pi program and a Makefile, which indicates where the Marcel include files (/usr/local/marcel/include) and library file (/usr/local/marcel/lib/LINUX) are located. Unfortunately, there is no English-language documentation for Marcel. Instead, you can consult the include files or the example programs in /usr/local/marcel/examples.

The distributed source code contains a serial solver based upon Gaussian elimination and back substitution. This serial program includes code for initializing the matrix to be solved and for checking the results. As in program 1, the performance of these two steps is not important. You are only responsible for improving the performance of the system solver itself. However, since you are executing in a distributed-memory environment, you may find it most convenient to initialize the matrix and check the results in parallel as well.
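
For orientation, the elimination phase can be sketched as follows. This is only a minimal illustration, not the code in gaussian.c: the function name, the row-major storage of the matrix in a single array, and the absence of pivoting are all assumptions made here, so check the distributed source for the layout and pivoting strategy it actually uses.

```c
#include <assert.h>

/* Hypothetical sketch: reduce the n x n system a*x = b to upper-triangular
 * form in place. The matrix is assumed to be stored row-major in a[n*n].
 * No pivoting is done, so every diagonal entry met along the way must be
 * nonzero. */
void forward_eliminate(int n, double *a, double *b)
{
    assert(n > 0);
    for (int k = 0; k < n - 1; k++) {          /* pivot row */
        for (int i = k + 1; i < n; i++) {      /* rows below the pivot */
            double factor = a[i * n + k] / a[k * n + k];
            a[i * n + k] = 0.0;                /* entry is eliminated */
            for (int j = k + 1; j < n; j++)
                a[i * n + j] -= factor * a[k * n + j];
            b[i] -= factor * b[k];
        }
    }
}
```

For a fixed pivot row k, the updates to the rows below it are independent of one another, which is why the rows are the natural unit of work to hand out.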

To receive 75% of the credit: Provide parallel implementations of both the Gaussian elimination phase and the back substitution phase. Distribute the rows of the matrix among the processors, then further distribute each processor's rows among the threads running on that processor.
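
One way to picture the two-level distribution is the sketch below, which computes which rows a given thread on a given processor would own. It assumes a contiguous block distribution at both levels; the type and function names are hypothetical, and nothing here requires you to use blocks rather than some other mapping of rows.

```c
#include <assert.h>

/* Half-open range [begin, end) of row indices. */
typedef struct {
    int begin;
    int end;
} row_range;

/* Split `whole` into nparts contiguous blocks whose sizes differ by at
 * most one row, and return block number `part` (0-based). */
row_range block_range(row_range whole, int nparts, int part)
{
    assert(nparts > 0 && part >= 0 && part < nparts);
    int n = whole.end - whole.begin;
    int base = n / nparts;     /* rows every block receives */
    int extra = n % nparts;    /* the first `extra` blocks get one more */
    row_range r;
    r.begin = whole.begin + part * base + (part < extra ? part : extra);
    r.end = r.begin + base + (part < extra ? 1 : 0);
    return r;
}

/* Rows owned by thread `thread` of processor `proc`: first split all rows
 * among the processors, then split that processor's block among its threads. */
row_range thread_rows(int nrows, int nprocs, int proc, int nthreads, int thread)
{
    row_range all = {0, nrows};
    row_range mine = block_range(all, nprocs, proc);
    return block_range(mine, nthreads, thread);
}
```

Keep in mind that with contiguous blocks, the owners of the low-numbered rows run out of work as elimination proceeds. A cyclic assignment of rows is a common alternative that may balance the load better.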

Since we know there is little available parallelism in the back substitution phase, you do not need to use multiple threads in that phase.
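
The serial sketch below shows where the lack of parallelism comes from: each unknown depends on all of the unknowns computed before it. As with any illustration of this kind, the function name and the row-major storage are assumptions and need not match gaussian.c.

```c
#include <assert.h>

/* Hypothetical sketch: solve u*x = b, where u is an n x n upper-triangular
 * matrix assumed to be stored row-major in u[n*n] with nonzero diagonal
 * entries. */
void back_substitute(int n, const double *u, const double *b, double *x)
{
    assert(n > 0);
    for (int i = n - 1; i >= 0; i--) {    /* last unknown first */
        double sum = b[i];
        for (int j = i + 1; j < n; j++)   /* needs every x[j] with j > i */
            sum -= u[i * n + j] * x[j];
        x[i] = sum / u[i * n + i];
    }
}
```

The outer loop must run in order, and in a row distribution each x[i] has to reach the owners of the rows above it before they can finish, so communication tends to dominate this phase.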

To receive 85% of the credit: Evaluate the performance of your implementation. Execute on 1, 4 and 8 processors using 1, 2 and 4 threads on each processor. Use a system of 512 equations for the experiments. Report your running times, compute speedup numbers, and analyze any trends that you recognize in the data you collect. Place your findings in the "flat" ASCII file named FINAL-REPORT. Also document in this report the basic design of your parallel program and clearly indicate what level (75%, 85% or 100%) of the assignment you have completed.
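
Speedup is conventionally the serial running time divided by the parallel running time. The helpers below are a small sketch of that calculation; the function names are hypothetical, and the efficiency figure assumes that every thread could in principle run on its own CPU, which may not hold on the cluster nodes.

```c
#include <assert.h>

/* Speedup: time of the serial run divided by time of the parallel run.
 * Both times must be in the same unit. */
double speedup(double t_serial, double t_parallel)
{
    assert(t_parallel > 0.0);
    return t_serial / t_parallel;
}

/* Efficiency: speedup per unit of execution, counting each thread on each
 * processor as one unit (an assumption; see the note above). */
double efficiency(double t_serial, double t_parallel, int nprocs, int nthreads)
{
    assert(nprocs > 0 && nthreads > 0);
    return speedup(t_serial, t_parallel) / (double)(nprocs * nthreads);
}
```

Whichever baseline you divide by (the original serial program, or your parallel program run on 1 processor with 1 thread), state it in your report, since the two can give different speedup numbers.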

Place your parallel implementation in the file pgauss.c.

To receive 100% of the credit: Include in the FINAL-REPORT document a discussion of how you think message-passing libraries (such as MPI or NXS) should be constructed to better support multithreaded computations at each node. Try to build on the experience you had in completing this assignment. What features should NXS have had that would have made your task easier? Why?

Submitting your assignment: To receive full credit for the assignment, you must turn in your files prior to 8am on Monday December 15. Late submissions will be accepted at a penalty of 5% per day, up to one week late. To turn in this assignment, type (on agate.cs.unh.edu):

   ~cs735/bin/submit prog6 FINAL-REPORT pgauss.c

NOTE: You must submit your assignment for grading from agate.cs.unh.edu. Submissions can be checked using ~cs735/bin/scheck. For example, to check your assignment:

   ~cs735/bin/scheck prog6

The scheck program can only be executed on agate.cs.unh.edu.

Remember: as always, you are expected to do your own work on this assignment.

Also: you should adequately document and structure your programs.


Last modified on November 23, 1997.

Comments and questions should be directed to pjh@cs.unh.edu