Ümit V. Çatalyürek

OSU | citeseer | DBLP
Home
Partitioning | Middleware | Scheduling | Coloring | Clustering | Semantic Graphs
Advising | ECE 5362 | ECE 662 | ECE 694 | ECE 864 | BMI 731
2014 | 2013 | 2012 | 2011 | 2010 | 2009 | 2008 | 2007 | 2006 | 2005 | 2004 | 2003 | 2002 | 2001 | 2000 and earlier
Invited
PaToH | Zoltan | DataCutter | STORM | MSSG
Family | Hobbies
Research | Hardware&Game | Bargain | Others

Research

Below is a short list of some of the research projects I have been working, and a very brief description of them. For more up to date information please visit HPC Lab website.

Partitioning:

The partitioning work has started with my PhD studies. I have introduced computational hypergraph models for modeling applications with irregular data dependencies. Hypergraphs provides more generalized abstractions than graphs, hence they are much more flexible to model complex problems. Today we are using hypergraphs for workload partitioning in parallel processing, restructuring sparse matrices and computing fill-reducing ordering in scientific computing, scheduling of batch-shared I/O tasks in Grid Computing as well as in automatic management of memory hierarchies in global address space parallel programming.

Selected Publications:

  • On two-dimensional sparse matrix partitioning: Models, Methods, and a Recipe . U. Catalyurek, C. Aykanat and B. Ucar, Technical Report, No. OSUBMI_TR_2008_n04, The Ohio State University, Department of Biomedical Informatics, 2008.
  • Hypergraph-based Dynamic Load Balancing for Adaptive Scientific Computations. U.V. Catalyurek, E.G. Boman, K.D. Devine, D. Bozdag, R. Heaphy, L.A. Riesen. Proceedings of 21st IEEE International Parallel & Distributed Processing Symposium (IPDPS’07), 2007. Won best paper award in the Algorithms track.
  • Parallel Hypergraph Partitioning for Scientific Computing. K. D. Devine, E. G. Boman, R. T. Heaphy, R, H. Bisseling, and U. V. Catalyurek. Proceedings of 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS’06), 2006.
  • Hypergraph Partitioning for Automatic Memory Hierarchy Management. S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, A. Rountev and P. Sadayappan. Proceedings of SC2006 High Performance Computing, Networking, and Storage Conference, Nov 2006, to appear.
  • A Hypergraph Partitioning Based Approach for Scheduling of Tasks with Batch-shared I/O. G. Khanna, N. Vydyanathan, T. Kurc, U. Catalyurek, P. Wyckoff, J. Saltz, and P. Sadayappan. Proceedings of the Cluster Computing and Grid 2005 (CCGrid05).
  • Permuting Sparse Rectangular Matrices into Block-Diagonal Form. C. Aykanat, A. Pinar and U. V. Catalyurek. SIAM Journal of Scientific Computing, Vol. 25, No. 6, pp. 1860-1879, 2004.
  • A Hypergraph-partitioning Approach for Coarse-Grain Decomposition. U. Catalyurek and C. Aykanat. Proceedings of the 2001 ACM/IEEE SC2001, Denver, CO, November 2001.
  • A Fine-Grain Hypergraph Model for 2D Decomposition of Sparse Matrices. U. Catalyurek and C. Aykanat. Proceedings of International Parallel and Distributed Processing Symposium (IPDPS), 8th International Workshop on Solving Irregularly structured Problems in Parallel (Irregular 2001), San Francisco, April 2001.
  • Hypergraph-Partitioning Based Decomposition for Parallel Sparse-Matrix Vector Multiplication. U. Catalyurek and C. Aykanat. IEEE Transaction on Parallel and Distributed Systems, Vol. 10, No. 7, pp. 673-693, 1999.
  • Decomposing irregularly sparse matrices for parallel matrix-vector multiplications. U. Catalyurek and C. Aykanat. Lecture Notes in Computer Science, vol. 1117, pp. 75-86, 1996.

Software: PaToH and Zoltan.

top, complete list of publications

Middleware for Data-Intensive Applications:

In many fields of science, engineering, and medicine, research is increasingly becoming more and more data-intensive. Simulation and computation have become new tools of scientific discovery. Scientists are able to run simulations that almost match the complexity of the real world. In addition to the computational power of the systems, advances in sensor technologies have enabled capturing of high-throughput data, such as gene micro-arrays. Storage capacity has shown an even faster trend than processing power, continuing almost to triple every eighteen months. Hence, vast amounts of data are being generated and collected at an increasing rate. Development of applications that are able to process and analyze these data, however, is becoming extremely difficult due both to the ever-growing complexity of the datasets and the complications involved in new, high-performance computing platforms. We design and develop middleware tools that will make it easier to store, retrieve and analyze large, distributed datasets.

Selected Publications:

  • Biomedical Image Analysis on a Cooperative Cluster of GPUs and Multicores. T. Hartley, U. Catalyurek, A. Ruiz, M. Ujaldon, F. Igual, R. Mayo, Proceedings of 22nd ACM International Conference on Supercomputing (ICS’08), Jun 2008.
  • Dynamic Generation of Accident Progression Event Trees. 2. A. Hakobyan, K. Metzroth, T. Aldemir, R. Denning, S. Dunagan, D. Kunsman, B. Rutt, and U. Catalyurek, Nuclear Engineering and Design, 2008, to appear.
  • Large-scale Biomedical Image Analysis in Grid Environments . V. S. Kumar, B. Rutt, T. Kurc, U. Catalyurek, T. Pan, S. Chow, S. Lamont, M. Martone, J. Saltz. IEEE Transactions on Information Technology in Biomedicine, Vol. 12, No. 2, pp. 154-161, Mar 2008.
  • Performance vs. Accuracy Trade-offs for Large-scale Image Analysis. V. S. Kumar, T. M. Kurc, J. Kong, U. V. Catalyurek, M. N. Gurcan, J. H. Saltz. Proceedings of the IEEE Cluster 2007, 2007.
  • Servicing Seismic and Oil Reservoir Simulation Data. S. Narayanan, T. Kurc, U. Catalyurek, and J. Saltz. VLDB Workshop on Data Management in Grids, Sep 2005.
  • Data-driven Scientific Applications in the Grid. S. Narayanan, T. Kurc, U. Catalyurek, J. Saltz. Parallel Processing Letters, Vol. 13, No. 2, pp. 245-271, 2003.
  • Applying Database Support for Large Scale Data Driven Science in Distributed Environments. S. Narayanan, U. Catalyurek, T. Kurc, X. Zhang and J. Saltz. Proceedings of the Fourth International Workshop on Grid Computing (Grid 2003), pp. 141-148, 2003.
  • Processing large-scale multidimensional data in parallel and distributed environments. M. Beynon, C. Chang, U. Catalyurek, T. Kurc, A. Sussman, H. Andrade, R. Ferreira, and J. Saltz. Parallel Computing, Vol. 28, No. 5, pp. 827-859, Special issue on Data Intensive Computing, May 2002.
  • Distributed Processing of Very Large Datasets with DataCutter. M. D. Beynon, T. Kurc, U. Catalyurek, C. Chang, A. Sussman, and J. Saltz. Parallel Computing, Vol. 27, No. 11, pp. 1457-1478, October 2001.

Software: DataCutter and STORM.

top, complete list of publications

Scheduling:

We have been working on various types of scheduling problems. In more traditional DAG scheduling, we are developing novel duplication-based algorithms. We are also interested in DAG scheduling problems that combines task and data parallelism Another scheduling problem we are investigating is the efficient execution of a batch of data-intensive tasks exhibiting batch-shared I/O behavior.

  • An Integrated Approach to Locality Conscious Processor Allocation and Scheduling of Mixed Parallel Applications. N. Vydyanathan, S. Krishnamoorthy, G. Sabin, U. Catalyurek, T. Kurc, P. Sadayappan, and J. Saltz, IEEE Transaction on Parallel and Distributed Systems, to appear.
  • Using Overlays For Efficient Data Transfer Over Shared Wide-Area Networks. G. Khanna, U. Catalyurek, T.Kurc, R. Kettimuthu, P. Sadayappan, I. Foster, and J. Saltz, Proceedings of SC2008 High Performance Computing, Networking, and Storage Conference, Nov 2008. Acceptance rate 21%
  • Scheduling File Transfers for Data-Intensive Jobs on Heterogeneous Clusters. G. Khanna, U. Catalyurek, T.Kurc, P. Sadayappan, and J. Saltz. Proceedings of Euro-Par, Aug 2007. Acceptance rate 27%.
  • Toward Optimizing Latency under Throughput Constraints for Application Workflows on Clusters. N. Vydyanathan, U. Catalyurek, T. Kurc, J. Saltz, and P. Sadayappan. Proceedings of Euro-Par, Aug 2007. Acceptance rate 27%.
  • Locality Conscious Processor Allocation and Scheduling for Mixed Parallel Applications. N. Vydyanathan, S. Krishnamoorthy, G. Sabin, U. Catalyurek, T. Kurc, P. Sadayappan, and Joel Saltz. Proceedings of 2006 IEEE International Conference on Cluster Computing, Sep 2006.
  • Task Scheduling and File Replication for Data-Intensive Jobs with Batch-shared I/O. G. Khanna, N. Vydyanathan, U. Catalyurek, T.Kurc, S. Krishnamoorthy, P. Sadayappan, and J. Saltz. Proceedings of the 15th IEEE International Symposium on High-Performance Distributed Computing (HPDC-15), pp. 241-252, Paris, France, Jun 2006.
  • A Task Duplication Based Bottom-Up Scheduling Algorithm for Heterogeneous Environments. D. Bozdag, U. Catalyurek, and F. Ozguner. Proceedings of 20th International Parallel and Distributed Processing Symposium (IPDPS), The 15th Heterogeneous Computing Workshop (HCW 2006), 2006.
  • A Task Duplication Based Scheduling Algorithm Using Partial Schedules. D. Bozdag, F. Ozguner, E. Ekici, and U. Catalyurek. Proceedings of the 2005 International Conference on Parallel Processing (ICPP-05).
  • A Hypergraph Partitioning Based Approach for Scheduling of Tasks with Batch-shared I/O. G. Khanna, N. Vydyanathan, T. Kurc, U. Catalyurek, P. Wyckoff, J. Saltz, and P. Sadayappan. Proceedings of the Cluster Computing and Grid 2005 (CCGrid05).

top, complete list of publications

Coloring:

One of our recent projects is development of distributed memory graph coloring algorithms. In many parallel scientific computing applications coloring is used to identify independent tasks that can be performed concurrently. In such cases graph is typically distributed across processors and hence the coloring needs to be computed in parallel. We have started with distance-1 and distance-2 coloring algorithms, and currently we are investigating how we can scale this algorithms to machines with tens of thousands of processors.

Selected Publications:

  • A Framework for Scalable Greedy Coloring on Distributed Memory Parallel Computers. D. Bozdag, A. Gebremedhin, F. Manne, E. G. Boman, U. V. Catalyurek, Journal of Parallel and Distributed Computing, Vol. 68, No. 4, pp. 515-535, 2008.
  • A Parallel Distance-2 Graph Coloring Algorithm for Distributed Memory Computers. D. Bozdag, U. Catalyurek, A. H.  Gebremedhin, F. Manne, E. G. Boman, and F. Ozguner. Proceedings of the 2005 International Conference on High Performance Computing and Communications (HPCC-05), Sep 2005.
  • An Efficient Parallel Graph Coloring Algorithm for Distributed Memory Computers. E. G. Boman, D. Bozdag, U. Catalyurek, A. H.  Gebremedhin, and F. Manne. Proceedings of Euro-Par, Sep 2005.

Software: Zoltan.

top, complete list of publications

Clustering:

This is yet another recent project. As a first step we are investigating possible approaches to improve the ways to isolate functional modules in interactions graphs that exhibits scale-free property.

  • Improving Functional Modularity in Protein-Protein Interactions Graphs Using Hub-induced Subgraphs. D. Ucar, S. Asur, U. Catalyurek, and S. Parthasarathy. In Proceedings of the European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), Sep 2006.

top, complete list of publications

Semantic Graphs:

Developing scalable algorithms for relationship analysis on social network graphs (also known as semantic graphs) has been identified as an important need by the Department of Homeland Security. Unlike graphs from most scientific applications, in a scale-free graph most of the vertices are connected only to a small number of other vertices, while a few vertices, known as hubs, are connected to a large number of other vertices. This extreme irregularity requires novel data structures and databases to analyze and store these large graphs. We have been working on developing a middleware that will support storage, retrieval and processing of those large graphs. Additionally, we are developing new out-of-core algorithms for search-based relationship analysis.

Selected Publications:

  • A Scalable Distributed Parallel Breadth-First Search Algorithm on BlueGene/L. A. Yoo, E. Chow, K. Henderson, W. McLendon, B. Hendrickson and U. Catalyurek Proceedings of SC2005 High Performance Computing, Networking, and Storage Conference, Nov 2005, Gordon Bell Finalist.
  • MSSG: A Framework for Massive Scale Semantic Graphs. T. Hartley, U. Catalyurek, F. Özgüner, A. Yoo, S. Kohn and K. Henderson. Proceedings of 2006 IEEE International Conference on Cluster Computing, Sep 2006.

top, complete list of publications

©2006 Umit Catalyurek. Last Update: July 20th, 2006