IPDPS 2011 Advance Program

Please visit the IPDPS website regularly for updates, since there may be schedule revisions. Authors who have corrections should send email to contact@ipdps.org giving full details. Note that paper numbers are listed for easy reference.

Abstracts of Contributed Papers
Abstracts for regular conference papers have been compiled to allow authors to check accuracy and so that visitors to this website may preview the papers to be presented at the conference. Abstracts for all workshops and the PhD Forum will be posted to their respective web pages in March. Full proceedings of the conference will be published on a CD-ROM pocketed in a program book to be distributed to registrants at the conference.

View contributed paper abstracts in advance (pdf)

MONDAY - 16 May 2011

DAYS • Monday • Tuesday • Wednesday • Thursday • Friday

WORKSHOPS
all day*

* See each individual workshop program for schedule details

Monday Workshop NOTE: HIPS workshop has moved to Friday
HCW	Heterogeneity in Computing Workshop
RAW	Reconfigurable Architectures Workshop
NIDISC	Workshop on Nature Inspired Distributed Computing
HiCOMB	Workshop on High Performance Computational Biology
APDCM	Advances in Parallel and Distributed Computing Models
CASS	Communication Architecture for Scalable Systems
HPPAC	High-Performance, Power-Aware Computing
HPGC	High-Performance Grid and Cloud Computing Workshop
SMTPS	Workshop on System Management Techniques, Processes, and Services
DataCloud	International Workshop on Data-Intensive Computing in the Clouds
EduPar	First NSF/TCPP Workshop on Parallel and Distributed Computing Education

IPDPS Celebrates 25 Years

Reception & Dinner Party

Start 6:30 PM

It's A Party!
Looking Back Through 25 Years of Meetings on Parallel and Distributed Computing

Party Host: H.J. Siegel

TUESDAY - 17 May 2011

Opening Session
8:00 AM - 8:30 AM

Opening Session
Chair: Alan Sussman

Keynote Session
8:30 AM - 9:30 AM

Keynote Speech

Speaker: Peter Sanders, Karlsruhe Institute of Technology
Keynote Session Chair: Olivier Beaumont

Algorithm Engineering for Scalable Parallel External Sorting

Abstract: The talk describes algorithm engineering (AE) as a methodology for algorithmic research where design, analysis, implementation and experimental evaluation of algorithms form a feedback cycle driving the development of efficient algorithms. Additional important components of the methodology include realistic models, algorithm libraries, and collections of realistic benchmark instances. We use one main example throughout this paper: sorting huge data sets using many multi-core processors and disks. The described system broke records for the GraySort and MinuteSort sorting benchmarks and helped with the record for the JouleSort benchmark.

Morning Break 9:30-10:00

Parallel Technical Sessions 1, 2, 3, & 4
10:00 AM - 12:00 PM

SESSION 1:
Resource Management
Chair: Peter Sanders

Power-aware replica placement and update strategies in tree networks
Anne Benoit (ENS Lyon, France); Paul Renaud-Goud (LIP, ENS Lyon, France); Yves Robert (ENS Lyon, France)

Minimum Cost Resource Allocation for meeting job requirements
Venkatesan T Chakaravarthy (IBM Research (India), India); Sambuddha Roy (IBM Research - India, India); Yogish Sabharwal (IBM Research - India, India); Amit Kumar (IIT Delhi, India); Gyana Parija (IBM Research, India)

Power and Performance Management in Priority-type Cluster Computing Systems
Kaiqi Xiong (North Carolina State University, USA)

Willow: A Control System For Energy And Thermal Adaptive Computing
Krishna Kant (National Science Foundation, USA); Muthukumar Murugan (University of Minnesota, USA); David Du (University of Minnesota, USA)

SESSION 2:
Communication & I/O Optimization
Chair: Jeff Hollingsworth

Communication-Avoiding QR Decomposition for GPUs
Michael Anderson (University of California, Berkeley, USA); Grey Ballard (UC Berkeley, USA); James Demmel (University of California at Berkeley, USA); Kurt Keutzer (UC, Berkeley, USA)

Overlapping Computation and Communication for Advection on Hybrid Parallel Computers
James B White (National Center for Atmospheric Research, USA); Jack Dongarra (University of Tennessee, Knoxville, USA)

VisIO: Enabling Interactive Visualization of Ultra-Scale, Time Series Data via High-Bandwidth Distributed I/O Systems
Christopher Mitchell (University of Central Florida, USA); James Ahrens (Los Alamos National Laboratory, USA); Jun Wang (University of Central Florida, USA)

Architectural constraints to attain 1 Exaflop/s on three scientific application classes
Abhinav Bhatele (University of Illinois at Urbana-Champaign, USA); Pritish Jetley (University of Illinois at Urbana-Champaign, USA); Hormozd Gahvari (University of Illinois at Urbana-Champaign, USA); Lukasz Wesolowski (University of Illinois at Urbana-Champaign, USA); William D Gropp (University of Illinois at Urbana-Champaign, USA); Laxmikant V. Kale (University of Illinois at Urbana-Champaign, USA)

SESSION 3:
Hardware-Software Interaction
Chair: Huiyang Zhou

A Novel Power management for CMP Systems in Data-intensive Environment
Pengju Shang (University of Central Florida, USA); Jun Wang (University of Central Florida, USA)

Characterization of System Services and Their Performance Impact in Multicore Nodes
Seetharami R Seelam (IBM Research, USA); Liana L Fong (IBM TJ Watson Research Center, USA); John Divirgilio (IBM, USA); Brian F. Veale (IBM, USA); John Lewars (IBM Systems and Technology Group, USA); Kevin Gildea (IBM, USA)

Automatic Recognition of Performance Idioms in Scientific Applications
Jiahua He (University of California, San Diego, USA); Allan Snavely (University of California, San Diego, USA); Rob F Van der Wijngaart (Intel Corporation, USA); Michael Frumkin (Google Inc., USA)

Iso-energy-efficiency: An approach to power-constrained parallel computation
Shuaiwen Song (Virginia Tech, USA); Chun-Yi Su (Virginia Tech, USA); Rong Ge (Marquette University, USA); Abhinav Vishnu (Pacific Northwest National Laboratory, USA); Kirk Cameron (Virginia Tech, USA)

SESSION 4:
Runtime Systems
Chair: Pavan Balaji

A Study of Speculative Distributed Scheduling on the Cell/B.E.
Pieter Bellens (Barcelona Supercomputing Center, Spain); Josep M. Perez (Barcelona Supercomputing Center, Spain); Rosa M. Badia (Barcelona Supercomputing Center, Spain); Jesús Labarta (Barcelona Supercomputing Center, Spain)

Exploiting Data Similarity to Reduce Memory Footprints
Susmit Biswas (Lawrence Livermore National Laboratory, USA); Bronis R. de Supinski (Lawrence Livermore National Laboratory, USA); Martin Schulz (Lawrence Livermore National Laboratory, USA); Diana Franklin (University of California, Santa Barbara, USA); Tim Sherwood (University of California, Santa Barbara, USA); Fred Chong (University of California, Santa Barbara, USA)

The Evaluation of an Effective Out-of-core Run-Time System in the Context of Parallel Mesh Generation
Andriy Kot (College of William and Mary, USA); Andrey N Chernikov (College of William and Mary, USA); Nikos Chrisochoides (College of William and Mary, USA)

Enriching 3-D video games on multicores
Romain Cledat (Georgia Institute of Technology, USA); Tushar Kumar (Georgia Institute of Technology, USA); Jaswanth Sreeram (Georgia Institute of Technology, USA); Santosh Pande (Georgia Institute of Technology, USA)

PhD Forum
12:00 PM (Noon) Start

PhD Forum Posters

Posters will be on display from noon Tuesday to the end-of-day on Wednesday.

See PhD Forum page for list of student authors.

Parallel Technical Sessions 5, 6, 7, & 8
1:00 PM - 2:30 PM

SESSION 5:
Routing and Communication
Chair: Loris Marchal

On Nonblocking Folded-Clos Networks in Computer Communication Environments
Xin Yuan (Florida State University, USA)

vFtree - A Fat-tree Routing Algorithm using Virtual Lanes to Alleviate Congestion
Wei Lin Guay (Simula Research Laboratory, Norway); Bartosz Bogdanski (Simula Research Laboratory, Norway); Sven-Arne Reinemo (Simula Research Laboratory, Norway); Olav Lysne (Simula Research Laboratory, Norway); Tor Skeie (Simula Research Lab, Norway)

Measuring Temporal Lags in Delay-Tolerant Networks
Arnaud Casteigts (University of Ottawa, Canada); Paola Flocchini (University of Ottawa, Canada); Bernard Mans (Macquarie University, Australia); Nicola Santoro (Carleton University, Canada)

SESSION 6:
Self Stabilization and Security
Chair: Kaiqi Xiong

A Lightweight Method for Automated Design of Convergence
Ali Ebnenasir (Michigan Technological University, USA); Aly Farahat (Michigan Technological University, USA)

Snap-Stabilizing Committee Coordination
Borzoo Bonakdarpour (University of Waterloo, Canada); Stéphane Devismes (Université Joseph Fourier, France); Franck Petit (UPMC Paris6, Sorbonne Universités, France)

SC-OA: A Secure and Efficient Scheme for Origin Authentication of Interdomain Routing in Cloud Computing Networks
Z. Le (Jiangxi University of Finance and Economics, China); N. Xiong (Georgia State University, USA); B. Yang (Jiangxi University of Finance and Economics, China); Yuzhi Zhou (Tsinghua University, Beijing, China)

SESSION 7:
Numerical Algorithms
Chair: Ümit V. Çatalyürek

Automatic Library Generation for BLAS3 on GPUs
Huimin Cui (Institute of Computing Technology, P.R. China); Lei Wang (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China); Jingling Xue (University of New South Wales, Australia); Yang Yang (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China); Xiaobing Feng (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China)

Redesign of Higher-Level Matrix Algorithms for Multicore and Distributed Architectures and Applications in Quantum Monte Carlo Simulation
Che-Rung Lee (National Tsing Hua University, Taiwan); Zhaojun Bai (University of California, Davis, USA)

Challenges of Scaling Algebraic Multigrid across Modern Multicore Architectures
Allison Baker (Lawrence Livermore National Laboratory, USA); Todd Gamblin (Lawrence Livermore National Laboratory, USA); Martin Schulz (Lawrence Livermore National Laboratory, USA); Ulrike Yang (Lawrence Livermore National Laboratory, USA)

SESSION 8:
Reliability and Security
Chair: Marian Vajtersic

Hauberk: Lightweight Silent Data Corruption Error Detector for GPGPU
Keun Soo Yim (University of Illinois at Urbana-Champaign, USA); Cuong Pham (University of Illinois at Urbana-Champaign, USA); Mushfiq Saleheen (University of Illinois at Urbana Champaign, USA); Zbigniew Kalbarczyk (University of Illinois at Urbana Champaign, USA); Ravishankar Iyer (University of Illinois at Urbana-Champaign, USA)

A Performance and Area Efficient Architecture for Intrusion Detection Systems
Govind Sreekar Shenoy (Universitat Politecnica de Catalunya, Spain); Jordi Tubella (Universitat Politecnica de Catalunya, Spain); Antonio Gonzalez (Intel and UPC, Spain)

Time-Ordered Event Traces: A New Debugging Primitive for Concurrency Bugs
Martin Dimitrov (University of Central Florida, USA); Huiyang Zhou (North Carolina State University, USA)

Afternoon Break 2:30 PM - 3:00 PM

Parallel Technical Sessions 9, 10, 11, & 12
3:00 PM - 5:00 PM

SESSION 9:
Wireless and Sensor Networks
Chair: Venkat Chakravarthy

Singlehop Collaborative Feedback Primitives for Threshold Querying in Wireless Sensor Networks
Murat Demirbas (University at Buffalo, SUNY, USA); Serafettin Tasci (SUNY Buffalo, USA); Hanifi Gunes (SUNY Buffalo, USA); Atri Rudra (University at Buffalo, USA)

Completely Distributed Particle Filters for Target Tracking in Sensor Networks
Bo Jiang (Virginia Tech, USA); Binoy Ravindran (Virginia Tech, USA)

Maintaining Connectivity in 3D Wireless Sensor Networks using Directional Antennae
Evangelos Kranakis (Carleton University, Canada); Danny Krizanc (Wesleyan University, USA); Ashish Modi (IIT, India); Oscar Morales-Ponce (Carleton University, Canada)

Distributed Fine-grained Access Control in Wireless Sensor Networks
Sushmita Ruj (University of Ottawa, Canada); Amiya Nayak (SITE, University of Ottawa, Canada); Ivan Stojmenovic (University of Ottawa, Canada)

SESSION 10:
GPU Acceleration
Chair: Jack Dongarra

Design of MILC lattice QCD application for GPU clusters
Guochun Shi (University of Illinois at Urbana-Champaign, USA); Steven Gottlieb (Indiana University, USA); Aaron Torok (Indiana University, USA); Volodymyr Kindratenko (National Center for Supercomputing Applications, USA)

Multifrontal Factorization of Sparse SPD Matrices on GPUs
Thomas George (IBM Research India, India); Vaibhav Saxena (IBM Research - India, New Delhi, India); Anshul Gupta (IBM T.J. Watson Research Center, USA); Amik Singh (Indian Institute of Technology, Roorkee, India); Anamitra Roy Choudhury (IBM Research - India, India)

Large-Scale Semantic Concept Detection on Manycore Platforms for Multimedia Mining
Mamadou Diao (Georgia Institute of Technology, USA); Chrysostomos Nicopoulos (University of Cyprus, Cyprus); Jongman Kim (Georgia Institute of Technology, USA)

Efficient GPU implementation for Particle in Cell Algorithm
Rejith Joseph (University of Florida, USA); Girish Ravunnikutty (University of Florida, USA); Sanjay Ranka (University of Florida, USA); Eduardo D'Azevedo (ORNL, USA); Scott Klasky (Oak Ridge National Laboratory, USA)

SESSION 11:
Multiprocessing and Concurrency
Chair: Kirk Cameron

Hardware-based Job Queue Management for Manycore Architectures and OpenMP Environments
Junghee Lee (Georgia Institute of Technology, USA); Chrysostomos Nicopoulos (University of Cyprus, Cyprus); Yongjae Lee (Georgia Institute of Technology, USA); Hyunggyu Lee (Georgia Institute of Technology, USA); Jongman Kim (Georgia Institute of Technology, USA)

HK-NUCA: Boosting Data Searches in Dynamic Non-Uniform Cache Architectures for Chip Multiprocessors
Javier Lira (Universitat Politècnica de Catalunya, Spain); Carlos Molina (Universitat Rovira i Virgili, Spain); Antonio Gonzalez (Intel and UPC, Spain)

Power Token Balancing: Adapting CMPs to Power Constraints for Parallel Multithreaded Workloads
Juan M. Cebrián (University of Murcia, Spain); Juan L. Aragón (University of Murcia, Spain); Stefanos Kaxiras (University of Patras, Greece)

A Very Fast Simulator For Exploring The Many-Core Future
Olivier Certner (INRIA, France); Zheng Li (INRIA, France); Arun Raman (Princeton University, USA); Olivier Temam (INRIA Futurs, France)

SESSION 12:
Compilers
Chair: Mitsuhisa Sato

Variable Granularity Access Tracking Scheme for Improving the Performance of Software Transactional Memory
Sandya Mannarswamy (Hewlett Packard India, India); Govindarajan Ramaswamy (Indian Institute of Science, India)

Automated architecture-aware mapping of streaming applications onto GPUs
Andrei Hagiescu (National University of Singapore, Singapore); Huynh Phung Huynh (A*STAR Institute of High Performance Computing, Singapore); Weng Fai Wong (National University of Singapore, Singapore); Rick Siow Mong Goh (A*STAR Institute of High Performance Computing, Singapore)

Automatic Loop Tiling for Direct Memory Access
Haibo Lin (IBM Research - China, P.R. China); Tao Liu (IBM Research - China, P.R. China); Lakshminarayanan Renganarayana (IBM T. J. Watson Research Center, USA); Huoding Li (IBM GCG Systems and Technology Lab, P.R. China); Tong Chen (IBM T. J. Watson Research Center, USA); Kevin O'Brien (IBM T. J. Watson Research Center, USA); Ling Shao (IBM Research, P.R. China)

Tolerant Value Speculation in Coarse-Grain Streaming Computations
Nathaniel Azuelos (Technion, Israel); Idit Keidar (Technion, Israel); Ayal Zaks (IBM Haifa Research Lab, Israel)

Late Afternoon Break 5:00 PM - 5:30 PM

Special 25th IPDPS Panel
5:30 PM - 7:30 PM

Tuesday 25th Year Panel: LOOKING BACK

Moderator:
Yves Robert, Ecole Normale Supérieure de Lyon, France

Panelists:

William (Bill) Dally, Stanford & NVIDIA
Jack Dongarra, University of Tennessee & Oak Ridge National Laboratory
Satoshi Matsuoka, Tokyo Institute of Technology, Japan
Arnold L. Rosenberg, Northeastern University & Colorado State University
Rob Schreiber, HP Labs, Palo Alto
Uzi Vishkin, University of Maryland

Abstract: The 25th year of IPDPS gives us the opportunity to look back and (to attempt) to assess what has gone wrong, what has gone well, and what came as a surprise, in the field of parallel and distributed processing. The panel members will give a few examples of striking events that took place in their area (covering Algorithms/ Applications/ Architectures/ Software). They will also give a short statement on how they would summarize the evolution of the field as a whole over the last 25 years.

Evening Tutorial
8:00 PM – 10:00 PM

Topic: Parallel Programming Using the Global Arrays Toolkit: Now and into the Future

Presenters from: Pacific Northwest National Lab
More information

WEDNESDAY - 18 May 2011

Keynote Session
8:00 AM - 9:00 AM

Keynote Speech

Speaker: Jack Dongarra, University of Tennessee & Oak Ridge National Laboratory
Keynote Session Chair: Leonid Oliker

Architecture-aware Algorithms and Software for Peta and Exascale Computing

Abstract: In this talk we examine how high performance computing has changed over the last 10 years and look toward the future in terms of trends. These changes have had and will continue to have a major impact on our software. Some of the software and algorithm challenges have already been encountered, such as management of communication and memory hierarchies through a combination of compile-time and run-time techniques, but the increased scale of computation, depth of memory hierarchies, range of latencies, and increased run-time environment variability will make these problems much harder. We will look at five areas of research that will have an important impact on the development of software and algorithms. We will focus on the following themes:

Redesign of software to fit multicore and hybrid architectures
Automatically tuned application software
Exploiting mixed precision for performance
The importance of fault tolerance
Communication avoiding algorithms

Morning Break 9:00 AM - 9:30 AM

Parallel Technical Sessions 13, 14, 15, & 16
9:30 AM - 11:30 AM

SESSION 13:
Distributed Algorithms and Models
Chair: Bo Hong

Adding a referee to an interconnection network: What can(not) be computed in one round.
Florent Becker (LIFO, Universite d’Orleans, France); Martin Matamala (DIM, Universidad de Chile, Chile); Nicolas Nisse (INRIA, I3S, CNRS, Université de Nice Sophia, France); Ivan Rapaport (DIM, CMM, Universidad de Chile, Chile); Karol Suchan (Facultad de Ingeniería y Ciencias, Universidad Adolfo Ibañez, Chile); Ioan Todinca (LIFO, Universite d’Orleans, France)

Improved Algorithms for the Distributed Trigger Counting Problem
Venkatesan T Chakaravarthy (IBM Research (India), India); Anamitra Roy Choudhury (IBM Research - India, India); Yogish Sabharwal (IBM Research - India, India)

The Weighted Byzantine Agreement Problem
Vijay Garg (The University of Texas at Austin, USA); John Bridgman (The University of Texas at Austin, USA)

Leveraging Social Networks to Combat Collusion in Reputation Systems for Peer-to-Peer Networks
Ze Li (Clemson University, USA); Haiying Shen (Clemson University, USA); Karan Sapra (Clemson University, USA)

SESSION 14:
Parallel Graph and Particle Algorithms
Chair: Beth Plale

Computing Strongly Connected Components in Parallel on CUDA
Jiri Barnat (Masaryk University, Czech Republic); Petr Bauch (Masaryk University, Czech Republic); Lubos Brim (Masaryk University, Czech Republic); Milan Ceska (Masaryk University, Czech Republic)

On optimal tree traversals for sparse matrix factorization
Mathias Jacquelin (ENS Lyon, France); Loris Marchal (CNRS, France); Yves Robert (ENS Lyon, France); Bora Ucar (CNRS, France)

Fast Community Detection Algorithm With GPUs and Multi-core Architectures
Jyothish Soman (IIIT-Hyderabad, India); Ankur Narang (IBM India Research Labs, New Delhi, India)

A Study of Parallel Particle Tracing for Steady-State and Time-Varying Flow Fields
Tom Peterka (Argonne National Laboratory, USA); Robert Ross (Argonne National Laboratory, USA); Boonthanome Nouanesengsey (The Ohio State University, USA); Teng-Yok Lee (The Ohio State University, USA); Han-Wei Shen (The Ohio State University, USA); Wesley Kendall (University of Tennessee at Knoxville, USA); Jian Huang (University of Tennessee at Knoxville, USA)

SESSION 15:
Distributed Systems and Networks
Chair: Yong Chen

Critical Bubble Scheme: An Efficient Implementation of Globally-aware Network Flow Control
Lizhong Chen (University of Southern California, USA); Ruisheng Wang (University of Southern California, USA); Timothy M. Pinkston (University of Southern California, USA)

A Scalable Reverse Lookup Scheme using Group-based Shifted Declustering Layout
Junyao Zhang (University of Central Florida, USA); Pengju Shang (University of Central Florida, USA); Jun Wang (University of Central Florida, USA)

Deadlock-Free Oblivious Routing for Arbitrary Topologies
Jens Domke (TU Dresden, Germany); Torsten Hoefler (University of Illinois at Urbana-Champaign, USA); Wolfgang E. Nagel (Technische Universitaet Dresden, Germany)

RDMA Capable iWARP over Datagrams
Ryan E Grant (Queen's University, Canada); Mohammad J Rashti (Queen's University, Canada); Ahmad Afsahi (Queen's University, Canada); Pavan Balaji (Argonne National Laboratory, USA)

SESSION 16:
Programming Environments and Tools
Chair: Ali R. Butt

Reconciling Sampling and Direct Instrumentation for Unintrusive Call-Path Profiling of MPI Programs
Zoltan Szebenyi (Jülich Supercomputing Centre, Germany); Todd Gamblin (Lawrence Livermore National Laboratory, USA); Martin Schulz (Lawrence Livermore National Laboratory, USA); Bronis R. de Supinski (Lawrence Livermore National Laboratory, USA); Felix Wolf (German Research School for Simulation Sciences, Germany); Brian J.N. Wylie (Jülich Supercomputing Centre, Germany)

A Practical Approach for Performance Analysis of Shared Memory Programs
Bogdan Marius Tudor (National University of Singapore, Singapore); Yong Meng Teo (National University of Singapore, Singapore)

Single Node On-Line Simulation of MPI Applications with SMPI
Pierre-Nicolas Clauss (Nancy University, France); Mark Lee Stillwell (INRIA, France); Stéphane Genaud (University of Strasbourg, France); Frederic Suter (CC IN2P3, France); Henri Casanova (University of Hawaii at Manoa, USA); Martin Quinson (LORIA, UMR 7503 (CNRS, INPL, INRIA, Nancy2, UHP), France)

Patus: A Code Generation and Autotuning Framework For Parallel Iterative Stencil Computations on Modern Microarchitectures
Matthias Christen (University of Basel, Switzerland); Olaf Schenk (University of Basel, Switzerland); Helmar Burkhart (University of Basel, Switzerland)

Special NSF-SEES Presentation
11:30 AM - 12:15 PM

NSF Science, Engineering and Education for Sustainability (SEES) Initiative

Presenter: Krishna Kant, National Science Foundation
For more information (pdf)

Parallel Technical Sessions 17, 18, 19, & 20
1:00 PM - 3:00 PM

SESSION 17:
Parallel Algorithms
Chair: Anne Benoit

A New Data Layout For Set Intersection on GPUs
Rasmus Amossen (IT University of Copenhagen, Denmark); Rasmus Pagh (University of Copenhagen, Denmark)

Partitioning Spatially Located Computations using Rectangles
Erik Saule (The Ohio State University, USA); Erdeniz O. Bas (The Ohio State University, USA); Umit V. Catalyurek (The Ohio State University, USA)

Reduced-Bandwidth Multithreaded Algorithms for Sparse-Matrix Vector Multiplication
Aydin Buluc (Lawrence Berkeley National Laboratory, USA); Samuel W. Williams (Lawrence Berkeley National Laboratory, USA); Leonid Oliker (LBNL, USA); James Demmel (University of California at Berkeley, USA)

SESSION 18:
Distributed Systems
Chair: Manish Parashar

GRAL: A Grouping Algorithm to Optimize Application Placement in Wireless Embedded Systems
Nikos Tziritas (University of Thessaly, Greece); Thanasis Loukopoulos (Technological Educational Institute of Lamia, Greece); Spyros Lalis (University of Thessaly, Greece); Petros Lampsas (Technological Educational Institute of Lamia, Greece)

Vitis: A Gossip-based Hybrid Overlay for Internet-scale Publish/Subscribe Enabling Rendezvous Routing in Unstructured Overlay Networks
Fatemeh Rahimian (KTH - Royal Institute of Technology, Sweden); Sarunas Girdzijauskas (Swedish Institute of Computer Science (SICS), Sweden); Amir Hossein Payberah (KTH - Royal Institute of Technology, Sweden); Seif Haridi (KTH - The Royal Institute of Technology, Sweden)

Moving the Code to the Data - Dynamic Code Deployment using ActiveSpaces
Ciprian Docan (Rutgers, The State University of New Jersey, USA); Manish Parashar (Rutgers, The State University of New Jersey, USA); Julian Cummings (CACR, USA); Scott Klasky (Oak Ridge National Laboratory, USA)

High performance scalable and expressive modeling environment to study mobile malware in large dynamic networks
Karthik Channakeshava (Virginia Tech, USA); Keith Bisset (Virginia Tech, USA); Anil Vullikanti (Virginia Tech, USA); Madhav Marathe (Virginia Tech, USA); Shrirang Yardi (NVIDIA, USA)

SESSION 19:
Storage Systems and Memory
Chair: Alok Choudhary

H-Code: A Hybrid MDS Array Code to Optimize Partial Stripe Writes in RAID-6
Chentao Wu (Virginia Commonwealth University, USA); Shenggang Wan (Huazhong University of Science and Technology, P.R. China); Xubin He (Virginia Commonwealth University, USA); Qiang Cao (Huazhong University of Science and Technology, P.R. China); Changsheng Xie (Huazhong University of Science and Technology, P.R. China)

LACIO: A New Collective I/O Strategy for Parallel I/O Systems
Yong Chen (Oak Ridge National Laboratory, USA); Xian-He Sun (Illinois Institute of Technology, USA); Rajeev Thakur (Argonne National Laboratory, USA); Philip C. Roth (Oak Ridge National Laboratory, USA); William D Gropp (University of Illinois at Urbana-Champaign, USA)

Using Shared Memory to Accelerate MapReduce on Graphics Processing Units
Feng Ji (North Carolina State University, USA); Xiaosong Ma (NC State University, USA)

Unified Signatures for Improving Performance in Transactional Memory
Woojin Choi (University of Southern California/Information Sciences Institute, USA); Jeffrey Draper (University of Southern California/Information Sciences Institute, USA)

SESSION 20:
Operating Systems and Resource Management
Chair: Martin Schulz

Reducing fragmentation on torus-connected supercomputers
Wei Tang (Illinois Institute of Technology, USA); Zhiling Lan (Illinois Institute of Technology, USA); Narayan Desai (Argonne National Laboratory, USA); Daniel Buettner (Argonne National Laboratory, USA); Yongen Yu (Illinois Institute of Technology, USA)

Co-Analysis of RAS Log and Job Log on Blue Gene/P
Ziming Zheng (Illinois Institute of Technology, USA); Li Yu (Illinois Institute of Technology, USA); Wei Tang (Illinois Institute of Technology, USA); Zhiling Lan (Illinois Institute of Technology, USA); Rinku Gupta (Argonne National Laboratory, USA); Narayan Desai (Argonne National Laboratory, USA); Susan Coghlan (Argonne National Laboratory, USA); Daniel Buettner (Argonne National Laboratory, USA)

A Quantitative Analysis of OS Noise
Alessandro Morari (Barcelona Supercomputing Center, Spain); Roberto Gioiosa (Barcelona Supercomputing Center, Spain); Robert Wisniewski (IBM Research, USA); Francisco J. Cazorla (Barcelona Supercomputing Center, Spain); Mateo Valero (Universidad Politécnica de Cataluña, Spain)

Decal: Transparent Checkpointing and Process Migration of OpenCL Applications
Hiroyuki Takizawa (Tohoku University, Japan); Kentaro Koyama (Tohoku University, Japan); Katsuto Sato (Tohoku University, Japan); Kazuhiko Komatsu (Tohoku University, Japan); Hiroaki Kobayashi (Tohoku University, Japan)

Afternoon Break 3:00 PM - 3:30 PM

Special 25th IPDPS Panel
3:30 PM - 5:30 PM

Wednesday 25th Year Panel: WHAT'S AHEAD

Moderator:
Per Stenström, Chalmers University of Technology, Sweden

Panelists:

Doug Burger, Microsoft Research
Wen-mei Hwu, University of Illinois, Urbana-Champaign
Vipin Kumar, University of Minnesota
Kunle Olukotun, Stanford University
David Padua, University of Illinois, Urbana-Champaign
Burton Smith, Microsoft

Abstract: Parallel computing has become ubiquitous, spanning challenging computational problems in science, business-driven computing, and mobile computing. The scope has widened dramatically over the last decade. This panel will debate and speculate on how the parallel computing landscape is expected to change in the years to come. Areas of focus will include:

Computing platforms: How will we be able to maintain the performance growth of the past, and what will be the major challenges in the next 10 years and beyond? What technical barriers are anticipated and what disruptive technologies are around the corner?
Software: How will software infrastructures evolve to meet performance requirements in the next 10 years and beyond? How will we ever be able to hide the obstacles of parallelism from the masses, and what is the road forward towards that?
Algorithms: What will be the major computational problems to tackle in the next 10 years and beyond? What are the most challenging algorithmic problems to solve?
Applications: What will be the next wave of grand challenge problems to focus on in the next 10 years and beyond? What will be the major performance driving applications in the general and mobile computing domains?

PhD Forum
5:30 PM – 7:00 PM

PhD Forum
Students will be available at their posters for questions and comments

See PhD Forum page for list of student authors

Reception
6:00 PM

Banquet
7:00 PM

Festivities continue to celebrate
25 years of people and places and parallelism

Banquet Host: Arnold Rosenberg

THURSDAY - 19 May 2011

Keynote Session
8:00 AM - 9:00 AM

Keynote Speech

Speaker: William (Bill) Dally, Stanford/NVIDIA
Keynote Session Chair: Dimitrios Nikolopoulos

Power, Programmability, and Granularity: The Challenges of ExaScale Computing

Abstract: Reaching an ExaScale computer by the end of the decade, and enabling the continued performance scaling of smaller systems, requires significant research breakthroughs in three key areas: power efficiency, programmability, and execution granularity. To build an ExaScale machine in a power budget of 20 MW requires a 200-fold improvement in energy per instruction: from 2 nJ to 10 pJ. Only 4x is expected from improved technology. The remaining 50x must come from improvements in architecture and circuits. To program a machine of this scale requires more productive parallel programming environments that make parallel programming as easy as sequential programming is today. Finally, problem size and memory size constraints prevent the continued use of weak scaling, requiring these machines to extract parallelism at very fine granularity, down to the level of a few instructions. This talk will discuss these challenges and current approaches to address them.
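
The energy figures quoted in the abstract above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the variable names are hypothetical, and the last step assumes (the abstract does not say this) that the entire 20 MW budget is spent on executing instructions.

```python
# Figures quoted in the keynote abstract, kept as integers to avoid rounding noise.
current_energy_pj = 2000   # 2 nJ per instruction, expressed in picojoules
target_energy_pj = 10      # 10 pJ per instruction
technology_gain = 4        # improvement expected from process technology alone

# Overall improvement needed in energy per instruction.
required_gain = current_energy_pj // target_energy_pj

# Share that must come from architecture and circuits.
architecture_gain = required_gain // technology_gain

# Assumption (not in the abstract): all 20 MW goes into instruction execution.
power_budget_w = 20_000_000
picojoules_per_joule = 10**12
instructions_per_second = power_budget_w * picojoules_per_joule // target_energy_pj

print(required_gain, architecture_gain, instructions_per_second)
```

Under that assumption the budget supports on the order of 10^18 instructions per second, which is consistent with the exascale target named in the talk title.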

Morning Break 9:00 AM - 9:30 AM

PLENARY SESSION:
Best Papers
9:30 AM - 11:30 AM

SESSION: Best Papers

Online Adaptive Code Generation and Tuning
Ananta N Tiwari (University of Maryland at College Park, USA); Jeffrey K. Hollingsworth (University of Maryland, USA)

GLocks: Efficient Support for Highly-Contended Locks in Many-Core CMPs
José Luis Abellán (University of Murcia, Spain); Juan Fernández (University of Murcia, Spain); Manuel E. Acacio (Universidad de Murcia, Spain)

Profiling Heterogeneous Multi-GPU Systems to Accelerate Cortically Inspired Learning Algorithms
Andrew Nere (University of Wisconsin - Madison, USA); Atif Hashmi (University of Wisconsin - Madison, USA); Mikko Lipasti (University of Wisconsin - Madison, USA)

PHAST: Hardware-Accelerated Shortest Path Trees
Daniel Delling (Microsoft Research Silicon Valley, USA); Andrew Goldberg (Microsoft Research Silicon Valley, USA); Andreas Nowatzyk (Microsoft Research Silicon Valley, USA); Renato Werneck (Microsoft Research Silicon Valley, USA)

Parallel Technical
Sessions 21, 22, 23, & 24
1:00 PM - 3:00 PM

SESSION 21:
Numerical Algorithms
Chair: Denis Trystram

QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators
Emmanuel Agullo (INRIA / LaBRI, France); Cédric Augonnet (LaBRI / University of Bordeaux / INRIA Bordeaux Sud-Ouest, France); Jack Dongarra (University of Tennessee, Knoxville, USA); Mathieu Faverge (University of Tennessee, USA); Hatem Ltaief (University of Tennessee, USA); Samuel Thibault (LaBRI, University of Bordeaux 1, France); Stanimire Tomov (University of Tennessee, USA)

Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices using Tile Algorithms on Multicore Architectures
Piotr Luszczek (University of Tennessee, USA); Hatem Ltaief (University of Tennessee, USA); Jack Dongarra (University of Tennessee, Knoxville, USA)

An Auto-tuned Method for Solving Large Tridiagonal Systems on the GPU
Andrew Davidson (University of California, Davis, USA); Yao Zhang (University of California, Davis, USA); John D. Owens (University of California, Davis, USA)

A communication-avoiding, hybrid-parallel, rank-revealing orthogonalization method
Mark Hoemmen (Sandia National Laboratories, USA)

SESSION 22:
Fault Tolerance
Chair: Frank Mueller

Flease - Lease Coordination Without a Lock Server
Björn Kolbeck (Zuse Institute Berlin, Germany); Mikael Högqvist (Zuse Institute Berlin, Germany); Jan Stender (Zuse Institute Berlin, Germany); Felix Hupfeld (Google GmbH Zurich, Switzerland)

Uncoordinated Checkpointing Without Domino Effect for Send-Deterministic MPI Applications
Amina Guermouche (University of Paris South 11, France); Thomas Ropars (INRIA, France); Elisabeth Brunet (Télécom SudParis, France); Marc Snir (University of Illinois at Urbana Champaign, USA); Franck Cappello (INRIA and University of Illinois at Urbana Champaign, France)

Minimal Obstructions for the Coordinated Attack Problem and Beyond
Tristan Fevat (Aix-Marseille Université, France); Emmanuel Godard (Pims, Cnrs Umi, France)

Scheduling Parallel Iterative Applications on Volatile Resources
Henri Casanova (University of Hawaii at Manoa, USA); Fanny Dufossé (LIP, ENS Lyon, France); Yves Robert (ENS Lyon, France); Frederic Vivien (INRIA, France)

SESSION 23:
Resource Utilization
Chair: Yuzhong Sun

Shared Resource Monitoring and Throughput Optimization in Cloud-Computing Datacenters
Jaideep Moses (Intel Corp., USA); Ravishankar Iyer (Intel Corp, USA); Ramesh Illikkal (Intel Corporation, USA); Sadagopan Srinivasan (Intel, USA); Konstantinos Aisopos (Princeton, USA)

The Impact of Soft Resource Allocation on n-Tier Application Scalability
Qingyang Wang (Georgia Institute of Technology, USA); Simon Malkowski (Georgia Institute of Technology, USA); Deepal Jayasinghe (Georgia Institute of Technology, USA); Pengcheng Xiong (Georgia Institute of Technology, USA); Calton Pu (Georgia Institute of Technology, USA); Yasuhiko Kanemasa (Fujitsu Laboratories Ltd, Japan); Motoyuki Kawaba (Fujitsu Laboratories Ltd, Japan); Lilian Harada (Fujitsu Laboratories Ltd, Japan)

Profiling Directed NUMA Optimisation on Linux Systems: A Case Study of the Gaussian Computational Chemistry Code
Rui Yang (University of Wollongong, Australia); Joseph Antony (Australian National University, Australia); Alistair P Rendell (Australian National University, Australia); Danny Robson (Australian National University, Australia); Peter E Strazdins (Australian National University, Australia)

Model-Driven SIMD Code Generation for a Multi-Resolution Tensor Kernel
Kevin Stock (The Ohio State University, USA); Thomas Henretty (The Ohio State University, USA); Iyyappa Murugandi (The Ohio State University, USA); Ponnuswamy Sadayappan (Ohio State University, USA); Robert Harrison (ORNL, USA)

SESSION 24:
Parallel Programming Models and Languages
Chair: Bronis R. de Supinski

Multi-GPU MapReduce on GPU Clusters
Jeffery Stuart (University of California, Davis, USA); John D. Owens (University of California, Davis, USA)

X10 as a parallel language for scientific computation: practice and experience
Josh Milthorpe (Australian National University, Australia); V. Ganesh (Australian National University, Australia); Alistair P Rendell (Australian National University, Australia); David Grove (IBM Research, USA)

Implementation and Performance Evaluation of the HPC Challenge Benchmarks in Coarray Fortran 2.0
Guohua Jin (Rice University, USA); John Mellor-Crummey (Rice University, USA); Laksono Adhianto (Rice University, USA); William Scherer III (Rice University, USA); Chaoran Yang (Rice University, USA)

Communication Optimizations for Distributed-Memory X10 Programs
Rajkishore Barik (Rice University, USA); Jisheng Zhao (Rice University, USA); David Grove (IBM Research, USA); Igor Peshansky (IBM T.J. Watson Research Center, USA); Zoran Budimlić (Rice University, USA); Vivek Sarkar (Rice University, USA)

Afternoon Break 3:00 PM - 3:30 PM

Parallel Technical
Sessions 25, 26, 27, & 28
3:30 PM - 5:30 PM

SESSION 25:
Algorithms for Distributed Computing
Chair: Henri Casanova

I/O-optimal Algorithms for Orthogonal Problems for Private-Cache Chip Multiprocessors
Deepak Ajwani (MADALGO, University of Aarhus, Denmark); Nodari Sitchinava (MADALGO, University of Aarhus, Denmark); Norbert Zeh (Dalhousie University, Canada)

A Fast Algorithm for Constructing Inverted Files on Heterogeneous Platforms
Zheng Wei (University of Maryland, USA); Joseph JaJa (University of Maryland, College Park, USA)

Graph Partitioning with Natural Cuts
Daniel Delling (Microsoft Research Silicon Valley, USA); Andrew Goldberg (Microsoft Research Silicon Valley, USA); Ilya Razenshteyn (Lomonosov Moscow State University, Russia); Renato Werneck (Microsoft Research Silicon Valley, USA)

Reader Activation Scheduling in Multi-Reader RFID Systems: A Study of General Case
Shao-Jie Tang (Illinois Institute of Technology, USA); Cheng Wang (Tongji University, Shanghai, P.R. China); Xiang-Yang Li (Illinois Institute of Technology, USA); Changjun Jiang (Tongji University, Shanghai, P.R. China)

SESSION 26:
Scheduling
Chair: Zhihui Du

Efficient Parallel Scheduling of Malleable Tasks
Peter Sanders (University of Karlsruhe, Germany); Jochen Speck (KIT, Germany)

Offline Scheduling of Multi-threaded Request Streams on a Caching Server
Veronika Rehn-Sonigo (University of Franche-Comté, France); Denis Trystram (University of Grenoble, France); Frédéric Wagner (INPG, France); Haifeng Xu (Zhejiang University, P.R. China); Guochuan Zhang (Zhejiang University, P.R. China)

Tight Analysis of Relaxed Multi-Organization Scheduling Algorithms
Daniel Cordeiro (Grenoble University, France); Pierre-François Dutot (Grenoble University, France); Gregory Mounié (Institut National Polytechnique de Grenoble, France); Denis Trystram (University of Grenoble, France)

Scheduling Functional Heterogeneous Systems with Utilization Balancing
Yuxiong He (Microsoft Research, USA); Jie Liu (Microsoft Research, USA); Hongyang Sun (Nanyang Technological University, Singapore)

SESSION 27:
Computational Biology and Simulations
Chair: Philip Roth

Smith-Waterman Alignment of Huge Sequences with GPU in Linear Space
Edans Flavius de Oliveira Sandes (University of Brasilia, Brazil); Alba Cristina Magalhaes Alves de Melo (University of Brasilia (UnB), Brazil)

Accelerating Protein Sequence Search in a Heterogeneous Computing System
Shucai Xiao (Virginia Tech, USA); Heshan Lin (Virginia Tech, USA); Wu-chun Feng (Virginia Tech, USA)

Parallel Metagenomic Sequence Clustering via Sketching and Quasi-clique Enumeration on Map-reduce Clouds
Xiao Yang (Iowa State University, USA); Jaroslaw Zola (Iowa State University, USA); Srinivas Aluru (Iowa State University, USA)

Large-scale lattice gas Monte Carlo simulations for the generalized Ising model
Tobias C. Kerscher (Technische Universität Hamburg-Harburg, Germany); Stefan Müller (Technische Universität Hamburg-Harburg, Germany); Quinn Snell (Brigham Young University, USA); Gus Hart (Brigham Young University, USA)

SESSION 28:
Cloud Computing
Chair: Ron Brightwell

CATCH: A Cloud-based Adaptive Data Transfer Service for HPC
Henry Monti (Virginia Tech, USA); Ali R Butt (Virginia Tech, USA); Sudharshan S Vazhkudai (Oak Ridge National Laboratory, USA)

A Scalable and Elastic Publish/Subscribe Service
Ming Li (IBM T.J. Watson Research Center, USA); Fan Ye (IBM T.J. Watson Research Center, USA); Minkyong Kim (IBM T.J. Watson Research Center, USA); Han Chen (IBM T.J. Watson Research Center, USA); Hui Lei (IBM Research, USA)

CABdedupe: A Causality-based Deduplication Performance Booster for Cloud Backup Services
Yujuan Tan (Huazhong University of Science and Technology, P.R. China); Hong Jiang (University of Nebraska-Lincoln, USA); Dan Feng (Huazhong University of Science and Technology, P.R. China); Lei Tian (Huazhong University of Science and Technology, P.R. China); Zhichao Yan (Huazhong University of Science and Technology, P.R. China)

DryadOpt: Branch-and-Bound on Distributed Data-Parallel Execution Engines
Mihai Budiu (Microsoft Research Silicon Valley, USA); Daniel Delling (Microsoft Research Silicon Valley, USA); Renato Werneck (Microsoft Research Silicon Valley, USA)

FRIDAY - 20 May 2011

WORKSHOPS
all day*

* See each individual workshop program for schedule details

Friday Workshop
HIPS	Workshop on High-Level Parallel Programming Models & Supportive Environments
PDSEC	Workshop on Parallel and Distributed Scientific and Engineering Computing
DPDNS	Dependable Parallel, Distributed and Network-Centric Systems
HOTP2P	International Workshop on Hot Topics in Peer-to-Peer Systems
MTAAP	Workshop on Multi-Threaded Architectures and Applications
LSPP	Workshop on Large-Scale Parallel Processing
PCGRID	Workshop on Desktop Grids and Volunteer Computing Systems
PCO	Parallel Computing and Optimization
DCPM	Future Approaches to Data Centric Programming for Exascale