Scheduling
In many fields of science, engineering, and medicine, data is being collected and generated at an increasing rate. This is mainly due to high-resolution measurements made possible by advanced sensor technologies, and to large-scale simulations enabled by inexpensive, high-performance computing on commodity PC clusters. As a result, scientific research is increasingly data-driven. We envision that, to support the data management and processing requirements of data-driven science, research institutions and supercomputing centers will have to do more than host, manage, and provide access to computing resources: they will also have to become high-end, Grid-enabled data centers for scientific data.
The objective of this project is to develop scheduling technologies that provide efficient access to shared resources on large-scale storage systems with multi-level storage hierarchies. In this project, we design, implement, and evaluate scheduling approaches for job schedulers that control access to and sharing of resources, and that support high throughput and interactive response times for many simultaneous data analysis and computing jobs on large storage systems. We will:
1) Develop models for scheduling of component-based applications. Task graphs are commonly used to describe the structure of compute-intensive applications. We will extend task graphs to include semantic information about interactions between components, such as pipelined vs. non-pipelined stages.
2) Develop techniques for scheduling and execution of application components. We will examine methods that take into account the locations of datasets, resource availability, and the performance characteristics of different levels of storage.
3) Develop methods for scheduling multiple jobs on the system. We will investigate approaches for space sharing and time sharing of resources, and develop techniques for scheduling multiple moldable jobs and for achieving response time guarantees.
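To make the first two goals concrete, the following is a minimal Python sketch, not the project's actual software. It assumes a task graph whose edges carry a hypothetical `pipelined` flag, groups tasks joined by pipelined edges (these would have to run concurrently, while non-pipelined edges only impose ordering), and uses a simple greedy rule that places a task on the node already holding the most of its input data. The names `TaskGraph`, `co_schedule_groups`, and `pick_node`, as well as the example tasks and file sizes, are invented for illustration.

```python
from collections import defaultdict, deque


class TaskGraph:
    """Directed acyclic task graph whose edges carry an interaction label."""

    def __init__(self):
        self.tasks = set()
        # edges[producer] -> list of (consumer, pipelined) pairs (hypothetical layout)
        self.edges = defaultdict(list)

    def add_edge(self, producer, consumer, pipelined):
        self.tasks.update((producer, consumer))
        self.edges[producer].append((consumer, pipelined))

    def topological_order(self):
        """Kahn's algorithm; raises ValueError if the graph has a cycle."""
        indegree = {t: 0 for t in self.tasks}
        for consumers in self.edges.values():
            for consumer, _ in consumers:
                indegree[consumer] += 1
        ready = deque(sorted(t for t, d in indegree.items() if d == 0))
        order = []
        while ready:
            task = ready.popleft()
            order.append(task)
            for consumer, _ in self.edges.get(task, []):
                indegree[consumer] -= 1
                if indegree[consumer] == 0:
                    ready.append(consumer)
        if len(order) != len(self.tasks):
            raise ValueError("task graph contains a cycle")
        return order

    def co_schedule_groups(self):
        """Union tasks joined by pipelined edges; each group must run concurrently."""
        parent = {t: t for t in self.tasks}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        for producer, consumers in self.edges.items():
            for consumer, pipelined in consumers:
                if pipelined:
                    parent[find(producer)] = find(consumer)
        groups = defaultdict(list)
        for t in self.tasks:
            groups[find(t)].append(t)
        return sorted(sorted(g) for g in groups.values())


def pick_node(input_sizes, node_cache):
    """Greedy locality heuristic: choose the node holding the most input bytes."""
    def local_bytes(node):
        return sum(size for f, size in input_sizes.items() if f in node_cache[node])
    # Iterate nodes in sorted order so that ties break deterministically.
    return max(sorted(node_cache), key=local_bytes)


graph = TaskGraph()
graph.add_edge("read", "filter", pipelined=True)        # streamed stage
graph.add_edge("filter", "aggregate", pipelined=False)  # needs complete output
graph.add_edge("aggregate", "render", pipelined=True)

print(graph.topological_order())
print(graph.co_schedule_groups())
print(pick_node({"a.dat": 100, "b.dat": 50},
                {"node1": {"b.dat"}, "node2": {"a.dat"}}))
```

A real scheduler would also have to weigh resource availability and the differing performance of each storage level, which this sketch ignores.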
Project Researchers
Umit Catalyurek, Ph.D.
Tahsin Kurc, Ph.D.
Joel Saltz, M.D., Ph.D.
Gaurav Khanna
Naga Vydyanathan
P. Sadayappan, Ph.D.
Project Funding Participation
SOFTWARE: Job Scheduling for Data Centers with Multi-level Storage Systems
Project Publications
Naga Vydyanathan, S. Krishnamoorthy, G. Sabin, Umit V. Catalyurek, Tahsin M. Kurc, P. Sadayappan, Joel H. Saltz, "Locality Conscious Processor Allocation and Scheduling for Mixed Parallel Applications", Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006: pp. 1-10.
Naga Vydyanathan, S. Krishnamoorthy, G. Sabin, Umit V. Catalyurek, Tahsin M. Kurc, P. Sadayappan, Joel H. Saltz, "An Integrated Approach for Processor Allocation and Scheduling of Mixed-Parallel Applications", Proceedings of the 2006 International Conference on Parallel Processing (ICPP-06), 2006: pp. 443-450.
Gaurav Khanna, Umit V. Catalyurek, Tahsin M. Kurc, P. Sadayappan, Joel H. Saltz, "A Data Locality Aware Online Scheduling Approach for I/O-Intensive Jobs with File Sharing", Proceedings of the 12th International Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP 2006), 2006.
Gaurav Khanna, Naga Vydyanathan, Umit V. Catalyurek, Tahsin M. Kurc, S. Krishnamoorthy, P. Sadayappan, Joel H. Saltz, "Task Scheduling and File Replication for Data-Intensive Jobs with Batch-shared I/O", Proceedings of the 15th IEEE International Symposium on High-Performance Distributed Computing (HPDC-15), 2006: pp. 241-252.
Naga Vydyanathan, Gaurav Khanna, Umit V. Catalyurek, Tahsin M. Kurc, Joel H. Saltz, P. Sadayappan, "Scheduling of Tasks with Batch-Shared I/O on Heterogeneous Systems", Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS), 2006.