Scientific paper
2001-02-21
Computer Science
Distributed, Parallel, and Cluster Computing
12 pages, Workshop on Clusters and Computational Grids for Scientific Computing, Sept. 24-27, 2000, Le Chateau de Faverges de
Parallel jobs are different from sequential jobs and require a different type of process management. We present here a process management system for parallel programs such as those written using MPI. A primary goal of the system, which we call MPD (for multipurpose daemon), is to be scalable. By this we mean that startup of interactive parallel jobs comprising thousands of processes is quick, that signals can be quickly delivered to processes, and that stdin, stdout, and stderr are managed intuitively. Our primary target is parallel machines made up of clusters of SMPs, but the system is also useful in more tightly integrated environments. We describe how MPD enables much faster startup and better runtime management of parallel jobs. We show how close control of stdio can support the easy implementation of a number of convenient system utilities, even a parallel debugger. We describe a simple but general interface that can be used to separate any process manager from a parallel library, which we use to keep MPD separate from MPICH.
Butler Ralph
Gropp William
Lusk Ewing
Components and Interfaces of a Process Management System for Parallel Programs