Parallel and Distributed Systems Lab (PDSL)
Wide-Scale Distributed Computing (Grids)
As wide-area networks have grown and their performance has improved, the scale of distributed systems has also grown. At the same time, the scale of high-end computational problems that scientists and engineers are seeking to solve has increased significantly. This combination of events has led to the creation of a new breed of wide-scale distributed systems designed for solving complex computational problems. Such systems have come to be known as computational "Grids" (by analogy with the national power grid). The sheer scale of grid systems, combined with the fact that their constituent systems are typically highly heterogeneous and autonomous, makes for many implementation challenges. Nevertheless, the need for grids (or similar systems) is undeniable, as many of the most important computational problems facing our society cannot be solved by lesser computing systems.
Specific sub-areas of interest
Historically, research in Grids within the PDSL has focused primarily on strategies for managing the resources from which a grid is composed: discovering what resources are available, deciding to whom they should be assigned and when, and monitoring and controlling the ongoing use of those resources. Our earliest work in grids followed shortly after the initial implementation of the first version of the Globus toolkit. Mr. Rajendra Singh, an MSc student supervised by Dr. Graham, designed and prototyped a system for monitoring the load on workstations. On request, the system could use each machine's historical use over various timescales to predict which machines would likely be available for a dynamically constructed cluster. A generic framework for resource management based on the "Resource Containers" concept was also designed by Drs. Graham and Eskicioglu in collaboration with Dr. Maheswaran, now of McGill University.
More recent work in grid resource management is being done by Mr. Pranith Kadaru, who, under the supervision of Dr. Graham, is working on a peer-to-peer (P2P) scheme for building collections of widely distributed machines suitable for use as compute clusters. Mr. Purnachander Parupally, also supervised by Dr. Graham, is exploring the usefulness of an option-like mechanism for managing risk in advanced resource allocation systems.
In addition to the work on resource management, Mr. Vishnu Narayanasami, an MSc student supervised by Dr. Graham, is developing a system for predictive data staging in High Performance Computing (HPC) systems, and Mr. Rajendra Singh, now a PhD student under Dr. Graham's supervision, is creating a system for the partial checkpointing of MPI jobs.
Our Publications on Grids
Related Work on Grids
For More Information Contact: