Abstract
To parallelize an application program for a distributed memory architecture, we can represent the program's parallelism as a precedence task graph, schedule the tasks onto the given physical processors, and then distribute the program and data accordingly. In this chapter, we discuss program partitioning techniques for constructing task graphs and present several static scheduling algorithms that account for the overhead of inter-processor communication. Finally, we give an overview of PYRROS, a software system that uses scheduling algorithms to generate parallel code for distributed memory parallel machines.
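
To make the idea of communication-aware static scheduling concrete, the following is a minimal sketch of a greedy list scheduler for a weighted task graph. It is not the PYRROS algorithm or any specific algorithm from this chapter. The function name `schedule`, the priority rule (longest remaining path), and the cost model (an edge's communication cost is paid only when its two tasks run on different processors) are assumptions chosen for illustration.

```python
from collections import defaultdict


def schedule(weights, edges, num_procs):
    """Greedy list scheduling of a task DAG (illustrative sketch only).

    weights: {task: computation cost}
    edges:   {(u, v): communication cost}, assumed to be paid only
             when u and v are placed on different processors
    Returns ({task: (processor, start, finish)}, makespan).
    """
    succs, preds = defaultdict(list), defaultdict(list)
    for (u, v) in edges:
        succs[u].append(v)
        preds[v].append(u)

    # Topological order via Kahn's algorithm; also detects cycles.
    indeg = {t: len(preds[t]) for t in weights}
    ready = [t for t in weights if indeg[t] == 0]
    order = []
    while ready:
        t = ready.pop()
        order.append(t)
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    if len(order) != len(weights):
        raise ValueError("task graph has a cycle")

    # Priority: longest path from a task to an exit task,
    # counting both computation and communication costs.
    level = {}
    for t in reversed(order):
        level[t] = weights[t] + max(
            (edges[(t, s)] + level[s] for s in succs[t]), default=0)

    proc_free = [0] * num_procs   # time at which each processor is idle
    placed = {}                   # task -> (processor, start, finish)
    indeg = {t: len(preds[t]) for t in weights}
    ready = [t for t in weights if indeg[t] == 0]
    while ready:
        # Take the ready task with the highest priority.
        ready.sort(key=lambda t: (-level[t], str(t)))
        t = ready.pop(0)
        best = None
        for p in range(num_procs):
            # Data from a predecessor on another processor arrives late.
            arrival = max(
                (placed[u][2] + (0 if placed[u][0] == p else edges[(u, t)])
                 for u in preds[t]), default=0)
            start = max(proc_free[p], arrival)
            if best is None or start < best[1]:
                best = (p, start)
        p, start = best
        placed[t] = (p, start, start + weights[t])
        proc_free[p] = start + weights[t]
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)

    makespan = max((f for _, _, f in placed.values()), default=0)
    return placed, makespan


# Hypothetical fork-join graph: a -> {b, c} -> d.
example_weights = {"a": 1, "b": 2, "c": 2, "d": 1}
cheap = {("a", "b"): 1, ("a", "c"): 1, ("b", "d"): 1, ("c", "d"): 1}
costly = {edge: 10 for edge in cheap}
print(schedule(example_weights, cheap, 2))
print(schedule(example_weights, costly, 2))
```

In this example, raising the communication cost makes the scheduler keep every task on one processor, since waiting for remote data would cost more than running the tasks sequentially. A greedy rule like this is only a heuristic and gives no optimality guarantee.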