IEEE International Parallel & Distributed Processing Symposium


Technical Committee on
Parallel Processing

IPDPS 2009 Advance Program

 

Please visit the IPDPS website regularly for updates, since there may be schedule revisions. Authors with corrections should contact info@ipdps.org. Note that paper numbers are listed for easy reference.

IPDPS 2009 Advance Program Abstracts
Abstracts for contributed papers and all workshops have been compiled so that authors can check accuracy and visitors to this website can preview the papers to be presented at the conference. Full proceedings of the conference will be published on a CD-ROM pocketed in a program book to be distributed to registrants at the conference.

Click here to view abstracts in advance (pdf)

MONDAY - 25 May 2009
WORKSHOPS
all day*

* See each individual workshop program for schedule details

HCW Heterogeneity in Computing Workshop
RAW Reconfigurable Architectures Workshop
HIPS Workshop on High-Level Parallel Programming Models & Supportive Environments
JAVAPDC Workshop on Java and Components for Parallelism, Distribution and Concurrency
NIDISC Workshop on Nature Inspired Distributed Computing
HiCOMB Workshop on High Performance Computational Biology
APDCM Advances in Parallel and Distributed Computing Models
CAC Communication Architecture for Clusters
HPPAC High-Performance, Power-Aware Computing
HPGC High Performance Grid Computing
SMTPS Workshop on System Management Techniques, Processes, and Services
Commercial Tutorial
6:30 PM

Commercial Tutorial

John Goodhue - SiCortex CTO
Topic: SiCortex High-Productivity, Low-Power Computers

• Read the abstract for this talk

TUESDAY - 26 May 2009
Opening & Keynote
8:00 AM -
9:30 AM

KEYNOTE SPEAKER
Chair: Horst Simon

Wen-mei W. Hwu
University of Illinois, Urbana-Champaign, USA
Topic: Many-core Parallel Computing: Can compilers and tools do the heavy lifting?

• Read the abstract for this keynote

Morning Break 9:30 AM - 10:00 AM
Parallel Sessions
1, 2, 3 & 4

10:00 AM -
12:00 PM

SESSION 1
Algorithms - Scheduling I

Chair: Cynthia Phillips

1569160561
On Scheduling Dags to Maximize Area
Gennaro Cordasco (University of Salerno, IT); Arnold Rosenberg (Colorado State University, US)

1569163451
Efficient Scheduling of Task Graph Collections on Heterogeneous Resources
Matthieu Gallet (École normale supérieure de Lyon, FR); Loris Marchal (CNRS, FR); Frédéric Vivien (INRIA, FR)

1569163449
Static Strategies for Worksharing with Unrecoverable Interruptions
Anne Benoit (ENS Lyon, FR); Yves Robert (École Normale Supérieure de Lyon, FR); Arnold Rosenberg (Colorado State University, US); Frédéric Vivien (INRIA, FR)

1569163249
On the Complexity of Mapping Pipelined Filtering Services on Heterogeneous Platforms
Anne Benoit (ENS Lyon, FR); Fanny Dufossé (LIP, ENS Lyon, FR); Yves Robert (École Normale Supérieure de Lyon, FR)

SESSION 2
Applications - Biological Applications

Chair: David Konerding

1569163673
Sequence Alignment with GPU: Performance and Design Challenges
Gregory Striemer (University of Arizona, US); Ali Akoglu (University of Arizona, US)

1569163707
Evaluating the Use of GPUs for Life Science Applications
John Paul Walters (University at Buffalo, US); Vidyananth Balu (University at Buffalo, US); Suryaprakash Kompalli (University at Buffalo, US); Vipin Chaudhary (University at Buffalo, SUNY, US)

1569163711
Improving MPI-HMMER's Scalability With Parallel I/O
John Paul Walters (University at Buffalo, US); Rohan Darole (University at Buffalo, US); Vipin Chaudhary (University at Buffalo, SUNY, US)

1569160563
Accelerating Leukocyte Tracking using CUDA: A Case Study in Leveraging Manycore Coprocessors
Michael Boyer (University of Virginia, US); David Tarjan (University of Virginia, US); Scott Acton (University of Virginia, US); Kevin Skadron (University of Virginia, US)

SESSION 3
Architecture - Memory Hierarchy and Transactional Memory

Chair: Per Stenstrom

1569163127
Efficient Shared Cache Management through Sharing-Aware Replacement and Streaming-Aware Insertion Policy
Yu Chen (Tsinghua University, CN); Wenlong Li (Intel Corp, CN); Changkyu Kim (Intel Corp, US); Zhizhong Tang (Tsinghua University, CN)

1569163069
Core-aware Memory Access Scheduling Schemes
Zhibin Fang (Illinois Institute of Technology, US); Xian-He Sun (Illinois Institute of Technology, US); Yong Chen (Illinois Institute of Technology, US); Surendra Byna (Illinois Institute of Technology, US)

1569162979
Using Hardware Transactional Memory for Data Race Detection
Shantanu Gupta (University of Michigan, US); Florin Sultan (NEC Laboratories America, US); Srihari Cadambi (NEC Laboratories America, Inc, US); Franjo Ivancic (NEC Laboratories America, Inc., US); Martin Roetteler (NEC Labs America, Inc., US)

1569163214
Speculation-Based Conflict Resolution in Hardware Transactional Memory
Rubén Titos (University of Murcia, ES); Manuel Acacio (Universidad de Murcia, ES); José M. García (University of Murcia, ES)

SESSION 4
Software - Fault Tolerance and Runtime Systems

Chair: DK Panda

1569162863
Compiler-Enhanced Incremental Checkpointing for OpenMP Applications
Greg Bronevetsky (Lawrence Livermore National Laboratory, US); Keshav Pingali (U. Texas at Austin, US); Daniel Marques (University of Texas, US); Radu Rugina (Cornell University, US); Sally McKee (Chalmers University of Technology, SE)

1569163631
DMTCP: Transparent Checkpointing for Cluster Computations and the Desktop
Jason Ansel (MIT, US); Kapil Arya (Northeastern University, US); Gene Cooperman (Northeastern University, US)

1569163367
Elastic Scaling of Data Parallel Operators in Stream Processing
Scott Schneider (Virginia Tech, US); Henrique Andrade (IBM T. J. Watson Research Center, US); Bugra Gedik (IBM T. J. Watson Research Center, US); Alain Biem (IBM Research, US); Kun-Lung Wu (IBM T. J. Watson Research Center, US)

1569163247
Scalable RDMA performance in PGAS languages
Montse Farreras (Universitat Politècnica de Catalunya (UPC), ES); George Almasi (IBM T.J. Watson Research Center, US); Calin Cascaval (IBM T.J. Watson Research Center, US); Toni Cortes (Technical University of Catalonia, ES)

Parallel Sessions
5, 6, 7 & 8

2:00 PM -
4:00 PM

SESSION 5
Algorithms - Resource Management

Chair: Michele Flammini

1569163559
Singular Value Decomposition on GPU using CUDA
Sheetal Lahabar (International Institute of Information Technology, IN)

1569163675
Coupled Placement in Modern Data Centers
Madhukar Korupolu (IBM Almaden Research Center, US); Aameek Singh (IBM Almaden Research Center, US); Bhuvan Bamba (Georgia Institute of Technology, US)

1569163475
An Upload Bandwidth Threshold for Peer-to-Peer Video-on-Demand Scalability
Yacine Boufkhad (Paris Diderot University, FR); Fabien Mathieu (Orange Labs, FR); Fabien de Montgolfier (Université Paris 7, FR); Diego Perino (Orange Labs, FR); Laurent Viennot (INRIA, FR)

1569163331
Competitive Buffer Management with Packet Dependencies
Alex Kesselman (Google, US); Boaz Patt-Shamir (Tel Aviv University, IL); Gabriel Scalosub (University of Toronto, CA)

SESSION 6
Applications - System Software and Applications

Chair: Leonid Oliker

1569162947
Annotation-Based Empirical Performance Tuning Using Orio
Albert Hartono (Ohio State University, US); Boyana Norris (Argonne National Laboratory, US); Ponnuswamy Sadayappan (Ohio State University, US)

1569163271
Automatic detection of parallel applications computation phases
Juan Gonzalez Garcia (Universitat Politecnica de Catalunya, ES); Judit Gimenez (Universitat Politecnica de Catalunya, ES); Jesus Labarta (Technical University of Catalonia, ES)

1569163465
Handling OS Jitter in Multicore Multithreaded Systems
Pradipta De (IBM Research, New Delhi, India, IN); Vijay Mann (IBM Research, New Delhi, India, IN); Umang Mittal (Indian Institute of Technology, New Delhi, India, IN)

1569162919
Building A Parallel Pipelined External Memory Algorithm Library
Andreas Beckmann (Goethe-Universität Frankfurt am Main, DE); Roman Dementiev (Universität Karlsruhe, DE); Johannes Singler (Universität Karlsruhe, DE)

SESSION 7
Architecture - Power Efficiency and Process Variability

Chair: Grigorios Magklis

1569162151
On reducing misspeculations on a pipelined scheduler
Ruben Gran Tejero (University of Zaragoza, ES); Enric Morancho (Universitat Politècnica de Catalunya, ES); Angel Olive (UPC-dac, ES); Jose Maria Llaberia (Universidad Politecnica de Cataluña, ES)

1569163229
Efficient Microarchitecture Policies for Accurately Adapting to Power Constraints
Juan Cebrián (University of Murcia, ES); Juan Aragón (University of Murcia, SPAIN, ES); José M. García (University of Murcia, ES); Pavlos Petoumenos (University of Patras, GR); Stefanos Kaxiras (University of Patras, GR)

1569163689
An On/Off Link Activation Method for Low-Power Ethernet in PC Clusters
Michihiro Koibuchi (National Institute of Informatics, JP); Tomohiro Otsuka (Keio University, JP); Hiroki Matsutani (Keio University, JP); Hideharu Amano (Keio University, JP)

1569163477
A new mechanism to deal with process variability in NoC links
Carles Hernandez (Technical University of Valencia, ES); Federico Silla (Technical University of Valencia, ES); Vicente Santonja (Universidad Politecnica de Valencia, ES); Jose Duato (Universidad Politecnica de Valencia, ES)

SESSION 8
Software - Data Parallel Programming Frameworks

Chair: Michael Gerndt

1569163403
A framework for efficient and scalable execution of domain-specific templates on GPUs
Narayanan Sundaram (University of California, Berkeley, US); Anand Raghunathan (NEC-Labs America, US); Srimat Chakradhar (NEC Research Labs, US)

1569163621
CellMR: A Framework for Supporting MapReduce on Asymmetric Cell-Based Clusters
M. Mustafa Rafique (Virginia Tech, US); Benjamin Rose (Virginia Tech, US); Ali Butt (Virginia Tech, US); Dimitrios Nikolopoulos (Virginia Tech, US)

1569161963
A Cross-Input Adaptive Framework for GPU Programs Optimizations
Yixun Liu (The College of William and Mary, US); Eddy Zhang (College of William and Mary, US); Xipeng Shen (The College of William and Mary, US)

1569163669
Message Passing on Data-Parallel Architectures
Jeffery Stuart (University of California, Davis, US); John Owens (University of California, Davis, US)

Afternoon Break 4:00 PM - 4:30 PM
Parallel Sessions
9, 10, 11 & 12

4:30 PM -
6:30 PM

SESSION 9
Algorithms - Scheduling II

Chair: Boaz Patt-Shamir

1569162705
Online time constrained scheduling with penalties
Nicolas Thibault (University of Evry, FR)

1569163131
Minimizing Total Busy Time in Parallel Scheduling with Application to Optical Networks
Michele Flammini (University of L'Aquila, IT); Tami Tamir (Efi Arazi School of Computer Science and Engineering, IL); Gianpiero Monaco (Università di L'Aquila, IT); Luca Moscardelli (Università di L'Aquila, IT); Hadas Shachnai (Technion, IL); Mordechai Shalom (Technion, Israel Institute of Technology, IL); Shmuel Zaks (Technion, IL)

1569163241
Energy Minimization for Periodic Real-Time Tasks on Heterogeneous Processing Units
Jian-Jia Chen (ETH Zurich, CH); Andreas Schranzhofer (TIK ETH Zurich, CH); Lothar Thiele (ETH Zurich, CH)

1569163369
Multi-Users Scheduling in Parallel Systems
Erik Saule (Institut Polytechnique Grenoble, FR); Denis Trystram (Univ. of Grenoble, FR)

SESSION 10
Applications - Graph and String Applications

Chair: Robert Farber

1569163439
Input-independent, Scalable and Fast String Matching on the Cray XMT
Oreste Villa (PNNL, US); Daniel Chavarria (Pacific Northwest National Laboratory, US); Kristyn Maschhoff (Cray, Inc., US)

1569163605
Compact Graph Representations and Parallel Connectivity Algorithms for Massive Dynamic Network Analysis
Kamesh Madduri (Lawrence Berkeley National Laboratory, US); David Bader (Georgia Institute of Technology, US)

1569163701
Transitive Closure on the Cell Broadband Engine: A study on Self-Scheduling in a Multicore Processor
Sudhir Vinjamuri (University of Southern California, US); Viktor Prasanna (University of Southern California, US)

1569163719
Parallel Short Sequence Mapping for High Throughput Genome Sequencing
Doruk Bozdag (The Ohio State University, US); Catalin Barbacioru (Applied Biosystems, US); Umit Catalyurek (The Ohio State University, US)

SESSION 11
Architecture - Networks and Interconnects

Chair: Jose Manuel Garcia

1569163545
TupleQ: Fully-Asynchronous and Zero-Copy MPI over InfiniBand
Matthew Koop (The Ohio State University, US); Jaidev Sridhar (The Ohio State University, US); Dhabaleswar Panda (The Ohio State University, US)

1569163759
Disjoint-Path Routing: Efficient Communication for Streaming Applications
DaeHo Seo (Purdue University, US); Mithuna Thottethodi (Purdue University, US)

1569163011
Performance Analysis of Optical Packet Switches Enhanced with Electronic Buffering
Zhenghao Zhang (Florida State University, US); Yuanyuan Yang (Stony Brook University, US)

1569163359
An Approach for Matching Communication Patterns in Parallel Applications
Yong-Meng Teo (National University of Singapore, SG)

SESSION 12
Software - I/O and File Systems

Chair: Dimitrios Nikolopoulos

1569163507
Adaptable, Metadata Rich IO Methods for Portable High Performance IO
Jay Lofstead (Georgia Institute of Technology, US); Fang Zheng (Georgia Tech, US); Scott Klasky (Oak Ridge National Laboratory, US); Karsten Schwan (Georgia Tech, US)

1569163295
Small File Access in Parallel File Systems
Philip Carns (Argonne National Laboratory, US); Sam Lang (Argonne National Laboratory, US); Robert Ross (Argonne National Laboratory, US); Murali Vilayannur (Vmware Inc., US); Julian Kunkel (University of Heidelberg, DE); Thomas Ludwig (University of Heidelberg, DE)

1569163691
Making Resonance a Common Case: A High-Performance Implementation of Collective I/O on Parallel File System
Xuechen Zhang (Wayne State University, US); Song Jiang (Wayne State University, US); Kei Davis (Los Alamos National Laboratory, US)

1569162661
Design, Implementation, and Evaluation of Transparent pNFS on Lustre
Weikuan Yu (Oak Ridge National Laboratory, US); Oleg Drokin (Sun Microsystems Inc., US); Jeffrey Vetter (Oak Ridge National Laboratory, US)

Symposium Tutorial
6:30 PM -
10:00 PM

Symposium Tutorial

Title: Tools for Scalable Performance Analysis on Petascale Systems

Presenters: I-Hsin Chung, S.R. Seelam (IBM T.J. Watson, USA); B. Mohr (Research Center Juelich, Germany); J. Labarta (UPC Barcelona, Spain)

• Read the abstract for this talk

WEDNESDAY - 27 May 2009
Keynote Session
8:30 AM -
9:30 AM

KEYNOTE SPEAKER
Chair: Christian Scheideler, TU Munich

Nir Shavit
Tel Aviv University, Israel
Topic: Software Transactional Memory: Where do we come from? What are we? Where are we going?

• Read the abstract for this keynote

Morning Break 9:30 AM - 10:00 AM
Best Papers - Plenary
10:00 AM -
12:00 PM

Best Papers - Plenary

Chair: Per Stenstrom

1569163497
Crash Fault Detection in Celerating Environments
Srikanth Sastry (Texas A&M University, US); Scott Pike (Texas A&M University, US); Jennifer Welch (Texas A&M University, US)

1569163189
HPCC RandomAccess Benchmark for Next Generation Supercomputers
Vikas Aggarwal (IBM India Research Lab, IN); Yogish Sabharwal (IBM India Research Lab, IN); Rahul Garg (IBM India Research Lab, IN); Philip Heidelberger (IBM Research, US)

1569163687
Exploring the Multi-GPU Design Space
Dana Schaa (Northeastern University, US); David Kaeli (Northeastern University, US)

1569163125
Accommodating Bursts in Distributed Stream Processing Systems
Yannis Drougas (University of California, Riverside, US); Vana Kalogeraki (University of California, Riverside, US)

Parallel Sessions
13, 14, 15 & 16
2:00 PM -
4:00 PM

SESSION 13
Algorithms - General Theory

Chair: Jennifer Welch

1569162929
Combinatorial Properties for Efficient Communication in Distributed Networks with Local Interactions
Sotiris Nikoletseas (University of Patras and Computer Technology Institute, GR); Christoforos Raptopoulos (U. of Patras, GR); Paul Spirakis (University of Patras, GR)

1569163413
Remote-Spanners: What to Know beyond Neighbors
Laurent Viennot (INRIA, FR); Philippe Jacquet (INRIA, FR)

1569163147
A Fusion-based Approach for Tolerating Faults in Finite State Machines
Vinit Ogale (University of Texas at Austin, US); Bharath Balasubramanian (University of Texas at Austin, US); Vijay Garg (IBM India Research Lab, IN)

1569163611
The Weak Mutual Exclusion Problem
Paolo Romano (Inesc-ID, PT); Luis Rodrigues (Inesc-ID/IST, PT); Nuno Carvalho (Inesc-ID/IST, PT)

SESSION 14
Applications - Data Intensive Applications

Chair: Wolfgang Nagel

1569162207
Best-Effort Parallel Execution for Recognition and Mining Applications
Jiayuan Meng (University of Virginia, US); Anand Raghunathan (NEC-Labs America, US); Srimat Chakradhar (NEC Research Labs, US)

1569163495
Multi-Dimensional Characterization of Temporal Data Mining on Graphics Processors
Jeremy Archuleta (Virginia Tech, US); Yong Cao (Virginia Tech, US); Wu-chun Feng (Virginia Tech, US); Tom Scogland (Virginia Tech, US)

1569163649
A Partition-based Approach to Support Streaming Updates over Persistent Data in an Active Data Warehouse
Abhirup Chakraborty (University of Waterloo, CA); Ajit Singh (University of Waterloo, CA)

1569163727
Architectural Implications for Spatial Object Association Algorithms
Vijay Kumar (The Ohio State University, US); Tahsin Kurc (Emory University, US); Joel Saltz (Emory University, US); Ghaleb Abdulla (Lawrence Livermore National Laboratory, US); Scott Kohn (Lawrence Livermore National Laboratory, US); Celeste Matarazzo (Lawrence Livermore National Laboratory, US)

SESSION 15
Architecture - Emerging Architectures and Performance Modeling

Chair: Josep Torrellas

1569163213
vCUDA: GPU Accelerated High Performance Computing in Virtual Machines
Hao Chen (Hunan University, CN)

1569163531
Understanding the Design Trade-offs among Current Multicore Systems for Numerical Computations
Seunghwa Kang (Georgia Institute of Technology, US); David Bader (Georgia Institute of Technology, US); Richard Vuduc (Georgia Institute of Technology, US)

1569163307
Parallel Data-Locality Aware Stencil Computations on Modern Micro-Architectures
Matthias Christen (University of Basel, CH); Olaf Schenk (University of Basel, CH); Esra Neufeld (IT'IS Foundation, ETH Zurich, CH); Peter Messmer (Tech-X Corporation, US); Helmar Burkhart (University of Basel, CH)

1569163549
Performance Projection of HPC Applications Using SPEC CFP2006 Benchmarks
Sameh Sharkawi (Texas A&M University, US); Don DeSota (IBM, US); Raj Panda (IBM, US); Rajeev Indukuru (IBM, US); Stephen Stevens (IBM, US); Valerie Taylor (Texas A&M University, US); Xingfu Wu (Texas A&M University, US)

SESSION 16
Software - Distributed Systems, Scheduling and Memory Management

Chair: Greg Bronevetsky

1569163745
Work-First and Help-First Scheduling Policies for Async-Finish Task Parallelism
Yi Guo (Rice University, US); Rajkishore Barik (Rice University, US); Raghavan Raman (Rice University, US); Vivek Sarkar (Rice University, US)

1569163333
Autonomic management of non-functional concerns in distributed and parallel application programming
Marco Aldinucci (University of Pisa, IT); Marco Danelutto (University of Pisa, IT); Peter Kilpatrick (Queen's University of Belfast, UK)

1569163725
Scheduling Resizable Parallel Applications
Rajesh Sudarsan (Virginia Tech, US); Calvin Ribbens (Virginia Tech, US)

1569162787
Helgrind+: An Efficient Dynamic Race Detector
Ali Jannesari (University of Karlsruhe, DE); Kaibin Bao (University of Karlsruhe, DE); Victor Pankratius (University of Karlsruhe, DE); Walter Tichy (University of Karlsruhe, DE)

Symposium Panel
4:30 PM -
6:30 PM

 

Symposium Panel

Topic: How to Build a Useful Thousand-Core System?
Moderator: Josep Torrellas, University of Illinois, Urbana-Champaign

• Read the abstract for this talk

Panelists:
• Laxmikant Kale, University of Illinois at Urbana-Champaign
• Jesus Labarta, Supercomputing Center, Universitat Politecnica de Catalunya, Barcelona
• Keshav Pingali, University of Texas at Austin
• Per Stenstrom, Chalmers University

Evening

Symposium Buffet Dinner

Details to be announced

THURSDAY - 28 May 2009
Keynote Session
8:30 AM - 9:30 AM

KEYNOTE SPEAKER
Chair: Frank Mueller

Leonid Oliker
Lawrence Berkeley National Laboratory, USA
Title: Green Flash: Designing an energy efficient climate supercomputer

• Read the abstract for this keynote

Morning Break 9:30 AM - 10:00 AM
1:00 PM - 6:00 PM

2009 TCPP PhD FORUM

Selected Poster Presentations
PhD Forum presenters will be available during the Forum session to discuss their work. Posters may also be viewed during the TCPP Reception Thursday evening.

Parallel Sessions
17, 18, 19 & 20
10:00 AM -
12:00 PM

SESSION 17
Algorithms - Wireless Networks

Chair: Geppino Pucci

1569155847
Sensor Network Connectivity with Multiple Directional Antennae of a Given Angular Sum
Evangelos Kranakis (Carleton University, CA); Danny Krizanc (Wesleyan University, US); Binay Bhattacharya (Professor, CA); Yuzhuang Hu (Simon Fraser University, CA); Qiaosheng Shi (Simon Fraser University, CA)

1569163013
Unit Disk Graph and Physical Interference Model: Putting Pieces Together
Emmanuelle Lebhar (CNRS (France) and CMM-University of Chile (Chile), FR); Zvi Lotker (Ben Gurion University, Beer Sheva, IL)

1569160599
Path-Robust Multi-Channel Wireless Networks
Arnold Rosenberg (Colorado State University, US)

1569161299
Information Spreading in Stationary Markovian Evolving Graphs
Andrea Clementi (Universita' di Roma "Tor Vergata", IT); Angelo Monti (University of Rome, "La Sapienza", IT); Francesco Pasquale (University of Rome "Tor Vergata", IT); Riccardo Silvestri (University of Rome, "La Sapienza", IT)

SESSION 18
Applications I - Cluster/Grid/P2P Computing

Chair: Anne Elster

1569161939
Multiple Priority Customer Service Guarantees in Cluster Computing
Kaiqi Xiong (North Carolina State University, US)

1569163009
Treat-Before-Trick: Free-riding Prevention for BitTorrent-like Peer-to-Peer Networks
Kyuyong Shin (North Carolina State University, US); Douglas S. Reeves (North Carolina State University, US); Injong Rhee (North Carolina State University, US)

1569163237
A Resource Allocation Approach for Supporting Time-Critical Applications in Grid Environments
Qian Zhu (The Ohio State University, US); Gagan Agrawal (The Ohio State University, US)

SESSION 19
Applications II - Multicore

Chair: Dan Katz

1569163049
High-Order Stencil Computations on Multicore Clusters
Liu Peng; Richard Seymour; Ken-ichi Nomura; Rajiv K. Kalia; Aiichiro Nakano; Priya Vashishta (University of Southern California, US); Alexander Loddoch; Michael Netzband; William R. Volz; Chap C. Wong (Chevron ETC, US)

1569163597
Dynamic Iterations for the Solution of Ordinary Differential Equations on Multicore Processors
Ashok Srinivasan (Florida State University, US); Yanan Yu (Florida State University, US)

1569162909
Efficient Large-Scale Model Checking
Kees Verstoep (Vrije Universiteit, NL); Henri Bal (Vrije Universiteit, NL); Jiri Barnat (Masaryk University, CZ); Lubos Brim (Masaryk University, CZ)

SESSION 20
Software - Parallel Compilers and Languages

Chair: Frank Mueller

1569163521
Scalable Autotuning Framework for Compiler Optimization
Ananta Tiwari (University of Maryland at College Park, US); Chun Chen (USC/ISI, US); Jacqueline Chame (USC/ISI, US); Mary Hall (USC/ISI, US); Jeffrey Hollingsworth (University of Maryland, US)

1569163317
Taking the heat off transactions: dynamic selection of pessimistic concurrency control
Nehir Sonmez (Universitat Politècnica de Catalunya, ES); Adrian Cristal (Barcelona Supercomputing Center, ES); Tim Harris (Microsoft Research, UK); Osman Unsal (Barcelona Supercomputing Center, ES); Mateo Valero (Universidad Politécnica de Cataluña, ES)

1569162025
Packer: an Innovative Space-Time-Efficient Parallel Garbage Collection Algorithm Based on Virtual Spaces
Shaoshan Liu (University of California, Irvine, US); Ligang Wang (Intel China Research Center, CN); Jean-Luc Gaudiot (University of California, US); Xiao-Feng Li (Middleware Products Division, Software and Solutions Group, Intel Corp, CN)

1569163297
Concurrent SSA for General Barrier-Synchronized Parallel Programs
Harshit Shah (Tata Institute of Fundamental Research, IN); R. K. Shyamasundar (Tata Institute of Fundamental Research, IN); Pradeep Varma (IBM India Research Laboratory, IN)

Parallel Sessions
21, 22 & 23

2:00 PM -
4:00 PM

SESSION 21
Algorithms - Self-Stabilization

Chair: Christian Scheideler

1569163515
Optimal Deterministic Self-stabilizing Vertex Coloring in Unidirectional Anonymous Networks
Samuel Bernard (Universite Paris 6, FR); Stephane Devismes (VERIMAG Grenoble, FR); Maria Gradinariu (University Paris 6, FR); Sebastien Tixeuil (Univ. Pierre & Marie Curie, FR)

1569163437
Self-stabilizing minimum-degree spanning tree within one from the optimal degree
Lelia Blin (IBISC-University of Evry Val d'Essonne, FR); Maria Gradinariu (University Paris 6, FR); Stephane Rovedakis (Université d'Evry (Laboratoire IBISC), FR)

1569162731
A snap-stabilizing point-to-point communication protocol in message-switched networks
Alain Cournier (Université de Picardie Jules Verne, FR); Swan Dubois (Université Pierre et Marie Curie, INRIA Rocquencourt, FR); Vincent Villain (University of Picardie Jules Verne, FR)

1569163279
An Asynchronous Leader Election Algorithm for Dynamic Networks
Jennifer Welch (Texas A&M University, US); Jennifer Walter (Vassar College, US)

SESSION 22
Applications - Scientific Applications

Chair: Kuang Jin Oh

1569162659
A Metascalable Computing Framework for Large Spatiotemporal-Scale Atomistic Simulations
Ken-ichi Nomura (University of Southern California, US); Richard Seymour (University of Southern California, US); Weiqiang Wang (University of Southern California, US); Rajiv Kalia (University of Southern California, US); Aiichiro Nakano (University of Southern California, US); Priya Vashishta (University of Southern California, US); Fuyuki Shimojo (Kumamoto University, JP); Lin Yang (Lawrence Livermore National Laboratory, US)

1569163085
Scaling Challenges for Massively Parallel AMR Applications
Brian Van Straalen (Lawrence Berkeley National Laboratory, US); Terry Ligocki (Lawrence Berkeley National Laboratory, US); John Shalf (Lawrence Berkeley National Laboratory, US); Noel Keen (Lawrence Berkeley National Laboratory, US); Woo-Sun Yang (Cray Inc., US)

1569163399
Parallel Accelerated Cartesian Expansions for Particle Dynamics Simulations
Melapudi Vikram (Michigan State University, US); Andrew Baczewski (Michigan State University, US); B. Shanker (Michigan State University, US); Srinivas Aluru (Iowa State University, US)

1569163501
Parallel Implementation of Irregular Terrain Model on IBM Cell Broadband Engine
Yang Song (University of Arizona, US); Jeffrey Rudin (Mercury Computer Systems, US); Ali Akoglu (University of Arizona, US)

SESSION 23
Software - Communications Systems

Chair: Jeff Hollingsworth

1569163699
Phaser Accumulators: a New Reduction Construct for Dynamic Parallelism
Jun Shirako (Rice University, US); David Peixotto (Rice University, US); Vivek Sarkar (Rice University, US); William Scherer III (Rice University, US)

1569162931
NewMadeleine: An Efficient Support for High-Performance Networks in MPICH2
Guillaume Mercier (INRIA-Labri, Université Bordeaux 1, FR); Francois Trahay (Université Bordeaux 1, FR); Darius Buntinas (Argonne National Laboratory, US); Elisabeth Brunet (Université Bordeaux 1, FR)

1569163723
Scaling Communication Intensive Applications on BlueGene/P Using One-Sided Communication and Overlap
Rajesh Nishtala (UC Berkeley, US); Paul Hargrove (Lawrence Berkeley National Laboratory, US); Dan Bonachea (UC Berkeley, US); Katherine Yelick (UC Berkeley, US)

1569163409
Dynamic High-Level Scripting in Parallel Applications
Filippo Gioachin (University of Illinois at Urbana-Champaign, US); Laxmikant Kale (University of Illinois at Urbana-Champaign, US)

Afternoon Break 4:00 PM - 4:30 PM
Parallel Sessions
24 & 25

4:30 PM -
6:30 PM

SESSION 24
Algorithms - Network Algorithms

Chair: Leszek Gasieniec

1569163753
Map Construction and Exploration by Mobile Agents Scattered in a Dangerous Network
Paola Flocchini (University of Ottawa, CA); Matthew Kellett (Defence R&D Canada - Ottawa, CA); Peter Mason (Defence Research & Development Canada, CA); Nicola Santoro (Carleton University, CA)

1569162729
A General Approach to Toroidal Mesh Decontamination with Local Immunity
Fabrizio Luccio (University of Pisa, IT); Linda Pagli (Universita' di Pisa, IT)

1569163061
On the Tradeoff Between Playback Delay and Buffer Space in Streaming
Alix L. H. Chow (University of Southern California, US); Leana Golubchik (USC, US); Samir Khuller (University of Maryland at College Park, US); Yuan Yao (University of Southern California, US)

SESSION 25
Applications - Sorting and FFTs

Chair: Terry Ligocki

1569162721
A Performance Model for Fast Fourier Transform on Multi-core Architecture
Yan Li (IBM China Research Lab, CN); Li Zhao (Academy of Mathematics and Systems Science, Chinese Academy of Sciences, CN); Haibo Lin (IBM China Research Lab, CN); Alex Chow (IBM Corp., US); Jeffery R. Diamond (IBM, US)

1569162951
Designing Efficient Sorting Algorithms for Manycore GPUs
Nadathur Satish (University of California, Berkeley, US); Mark Harris (NVIDIA, AU); Michael Garland (NVIDIA Corporation, US)

1569163021
Minimizing Startup Costs for Performance-Critical Threading
Clint Whaley (University of Texas at San Antonio, US); Anthony Castaldo (University of Texas at San Antonio, US)

TCPP Membership Meeting & Reception
6:30 PM -
8:30 PM

TCPP Invited Speaker

 

Michael Garland
NVIDIA
Topic: Parallel Computing on Manycore GPUs

• Read the abstract for this talk

FRIDAY - 29 May 2009
WORKSHOPS
all day*

* See each individual workshop program for schedule details

PDSEC Workshop on Parallel and Distributed Scientific and Engineering Computing
PMEO Performance Modeling, Evaluation, and Optimisation of Ubiquitous Computing and Networked Systems
DPDNS Dependable Parallel, Distributed and Network-Centric Systems
SSN International Workshop on Security in Systems and Networks
HOTP2P International Workshop on Hot Topics in Peer-to-Peer Systems
PCGRID Workshop on Large-Scale, Volatile Desktop Grids
MTAAP Workshop on Multi-Threaded Architectures and Applications
PDCoF Workshop on Parallel and Distributed Computing in Finance
LSPP Workshop on Large-Scale Parallel Processing
JSSPP Workshop on Job Scheduling Strategies for Parallel Processing


MONDAY COMMERCIAL TUTORIAL
Monday, May 25, 2009, 6:30 PM

Presenter:
John Goodhue, CTO
SiCortex

Topic:
SiCortex High-Productivity, Low-Power Computers

Abstract:
To work efficiently, clusters for high performance computing require a balance among compute, memory, inter-node communication, and I/O. Fast communication among one thousand multicore nodes requires short wire paths and power-efficient CPUs tightly integrated with memory, communication, and I/O controllers. The tutorial describes the characteristics of a six-thousand-core cluster that puts all of these elements on a single chip, dramatically reducing cost and power consumption while increasing reliability and performance compared to commodity clusters.

Bio:
Mr. Goodhue has worked in the computer and communications technology business for more than 25 years. He began his career as an engineer at BBN, where he held positions ranging from software engineer to Vice President of Engineering in both its high performance computing businesses. Mr. Goodhue is also a co-founder of several startup businesses, including Dash Strauss and Goodhue, a compliance consulting firm, and Lightstream Inc., a networking company that was acquired by Cisco Systems in 1995. At Cisco, Mr. Goodhue served as Director of Engineering in the ATM and Core Router business units and as General Manager of Cisco's broadband aggregation business unit. Mr. Goodhue has a B.S. in Computer Science from the Massachusetts Institute of Technology (MIT).

 

TUESDAY KEYNOTE ABSTRACT
Tuesday, May 26, 2009, 8:30 AM - 9:30 AM

Keynote:
Wen-mei W. Hwu, University of Illinois at Urbana-Champaign

Title:
Many-core Parallel Computing – Can compilers and tools do the heavy lifting?

Abstract:
Modern GPUs such as the NVIDIA GeForce GTX280, ATI Radeon 4860, and the upcoming Intel Larrabee are massively parallel, many-core processors. Today, application developers for these many-core chips are reporting 10X-100X speedup over sequential code on traditional microprocessors. According to the semiconductor industry roadmap, these processors could scale up to over 1,000X speedup over single cores by the end of the year 2016. Such a dramatic performance difference between parallel and sequential execution will motivate an increasing number of developers to parallelize their applications. Today, an application programmer has to understand the desirable parallel programming idioms, manually work around potential hardware performance pitfalls, and restructure their application design in order to achieve their performance objectives on many-core processors. Although many researchers have given up on parallelizing compilers, I will show evidence that by systematically incorporating high-level application design knowledge into the source code, a new generation of compilers and tools can take over the heavy lifting in developing and tuning parallel applications. I will also discuss roadblocks whose removal will require innovations from the entire research community.

Bio:
Wen-mei W. Hwu is a Professor and holds the Sanders-AMD Endowed Chair in the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign. His research interests are in the area of architecture, implementation, and software for high performance computer systems. He is the director of the IMPACT research group (www.crhc.uiuc.edu/Impact). For his contributions in research and teaching, he received the ACM SIGARCH Maurice Wilkes Award, the ACM Grace Murray Hopper Award, the Tau Beta Pi Daniel C. Drucker Eminent Faculty Award, and the ISCA Most Influential Paper Award. He is a fellow of IEEE and ACM. Hwu serves on the Executive Committee of the MARCO/DARPA C2S2 (www.c2s2.org) and GSRC (www.gigascale.org) Focus Research Centers. He leads the GSRC Concurrent Systems Theme. He co-directs the new $18M UIUC Intel/Microsoft Universal Parallel Computing Research Center with Marc Snir and serves as one of the principal investigators of the $208M NSF Blue Waters Petascale computer project. Dr. Hwu received his Ph.D. degree in Computer Science from the University of California, Berkeley.


TUESDAY SYMPOSIUM TUTORIAL ABSTRACT
Tuesday, May 26, 2009, 6:30 PM

Title:
Tools for Scalable Performance Analysis on Petascale Systems

Presenters:
I-Hsin Chung, S.R. Seelam (IBM T.J. Watson, USA)
B. Mohr (Research Center Juelich, Germany)
J. Labarta (UPC Barcelona, Spain)

Brief:
Tools are becoming increasingly important for efficiently utilizing the computing power available in contemporary large-scale systems. The drastic increase in the size and complexity of systems requires tools to be scalable while producing meaningful and easily digestible information that helps the user pinpoint problems at scale. The goal of this tutorial is to introduce state-of-the-art performance tools from three different organizations to a diverse audience. Together these tools provide a broad spectrum of capabilities necessary to analyze the performance of scientific and engineering applications on a variety of large- and small-scale systems. The tutorial will consist of one-hour presentations on each of the three tools.

Presenters will provide demonstrations of real-world examples and, depending on available hardware, will allow users to gain hands-on experience with demonstration codes on large-scale systems. This tutorial should have broad appeal to a large community, as its content is suited for performance analysis on small-scale server systems as well as Petascale systems.

 

WEDNESDAY KEYNOTE ABSTRACT
Wednesday, May 27, 2009, 8:30 AM - 9:30 AM

Keynote:
Nir Shavit
Computer Science Department
Tel-Aviv University, Israel

Title:
Software Transactional Memory: Where do we come from? What are we? Where are we going?

Abstract:
The transactional memory programming paradigm is gaining momentum as the approach of choice for replacing locks in concurrent programming. Combining sequences of concurrent operations into atomic transactions seems to promise a great reduction in the complexity of both programming and verification, by making parts of the code appear to be sequential without the need to program fine-grained locks. Software transactional memory offers to deliver a transactional programming environment without the need for costly modifications in processor design. However, the story of software transactional memory reminds one of garbage collection in its time: performance is improving, and the semantics are becoming clearer, yet there is still a long road ahead, a road strewn with stones below and crows hovering above, predicting its demise. This talk will try to take a sober look at software transactional memory, its history, the state of research today, and what we can expect to achieve in the foreseeable future.

Bio:
Nir Shavit received a B.A. and M.Sc. from the Technion and a Ph.D. from the Hebrew University, all in Computer Science. He was a Postdoctoral Researcher at IBM Almaden Research Center, Stanford University, and MIT, and a Visiting Professor at MIT. He joined the Computer Science Department at Tel-Aviv University in 1992 and was at various times a Member of Technical Staff at Sun Microsystems Laboratories. Prof. Shavit is the recipient of the Israeli Industry Research Prize in 1993 and the ACM/EATCS Gödel Prize in Theoretical Computer Science in 2004. His research interests include software aspects of Multiprocessor Synchronization, the design and implementation of Concurrent Data-Structures, and the Theoretical Foundations of Asynchronous Computability. He designed (together with his students) the first Software Transactional Memory system, and has been involved in the design of several of today's state-of-the-art STMs.

 

WEDNESDAY PANEL ABSTRACT
Wednesday, May 27, 2009, 4:30 PM - 6:30 PM

Topic:
How to Build a Useful Thousand-Core System?

Moderator:
Josep Torrellas, University of Illinois, Urbana-Champaign

Panelists:
• Laxmikant Kale, University of Illinois at Urbana-Champaign
• Jesus Labarta, Supercomputing Center, Universitat Politecnica de Catalunya, Barcelona
• Keshav Pingali, University of Texas at Austin
• Per Stenstrom, Chalmers University

Abstract:
Current hardware roadmaps call for doubling the number of on-chip cores approximately every two years. If this trend materializes, in at most a decade and a half, we will reach one thousand cores. This scenario has mind-boggling consequences for the IPDPS research community. There are many questions to answer. For example, at the architecture level, how are we going to power these chips and provide the required bandwidth? At the software level, how are we going to manage possibly-heterogeneous resources with low overhead, efficiently compile for these machines, and provide programmer-friendly programming models? At the application level, what kinds of applications and algorithms will we use? This panel will provide an opportunity for the conference attendees to discuss all of these topics.

 

THURSDAY KEYNOTE ABSTRACT
Thursday, May 28, 2009, 8:30 AM - 9:30 AM

Keynote:
Leonid Oliker
Computational Research Division
Lawrence Berkeley National Laboratory, Berkeley, USA

Title:
Green Flash: Designing an energy efficient climate supercomputer

Abstract:
It is clear from both the cooling demands and the electricity costs that the growth in scientific computing capabilities of the last few decades is not sustainable unless fundamentally new ideas are brought to bear. In this talk we propose a novel approach to supercomputing design that leverages the sophisticated tool chains of the consumer electronics marketplace. We analyze our framework in the context of high-resolution global climate change simulations, an application with multi-trillion dollar ramifications for the world's economies. A key aspect of our methodology is hardware-software co-tuning, which utilizes fast and accurate FPGA-based architectural emulation. This enables the design of future exaflop-class supercomputing systems to be defined by scientific requirements instead of constraining science to the machine configurations. Our talk will provide detailed design requirements for kilometer-scale, global cloud-system-resolving climate models and point the way toward Green Flash: an application-targeted exascale machine that could be efficiently implemented using mainstream embedded design processes. Overall, we believe that our proposed approach can provide a quantum leap in hardware and energy utilization, and may significantly impact the design of the next generation of HPC systems.

Bio:
Lenny Oliker is a Computer Scientist in the Future Technologies Group at Lawrence Berkeley National Laboratory. He received bachelor's degrees in Computer Engineering and Finance from the University of Pennsylvania, and performed both his doctoral and postdoctoral work at NASA Ames Research Center. Lenny has co-authored over 60 technical articles, and has received four best paper awards, including IPDPS 2007 and 2008. His research interests include HPC characterization, multi-core auto-tuning, and power-efficient computing.

 

THURSDAY TCPP INVITED SPEAKER
Thursday, May 28, 2009, 6:30 PM - 8:30 PM

Invited Speaker:
Michael Garland
NVIDIA

Title:
Parallel Computing on Manycore GPUs

Abstract:
The ongoing evolution of single-core sequential processors into manycore parallel processors is the most significant trend in modern chip architecture. Parallelism, rather than improved single-thread performance, has become the primary force driving higher computational throughput. At the leading edge of this class of massively parallel chip architectures is the modern GPU (graphics processing unit). Modern NVIDIA GPUs are fully programmable processors, delivering a peak computational throughput of up to 1 TFLOPS across 30K co-resident threads, which is a level of parallel computation that was once the preserve of supercomputers. Programming such massively parallel processors presents many interesting challenges. In this talk, I will explore the essential architectural characteristics of manycore processors in general, and the GPU in particular. I will introduce CUDA, NVIDIA's architecture for scalable parallel programming. Finally, I will examine the impact these architectures have on algorithm design, sketching some techniques for implementing common parallel algorithms for CUDA-capable processors.

Bio:
Michael Garland is a research scientist at NVIDIA. Dr. Garland holds B.S. and Ph.D. degrees in Computer Science from Carnegie Mellon University, and is an adjunct professor in the Department of Computer Science of the University of Illinois at Urbana-Champaign. He has published numerous articles in leading conferences and journals on a range of topics including surface simplification, remeshing, texture synthesis, novice-friendly modeling, free-form animation, scientific visualization, graph mining, and visualizing complex graphs. His current research interests include computer graphics and visualization, geometric algorithms, and parallel algorithms and programming models.