IPDPS 2014 Advance Program

Attn Session Chairs
Please check the session and time assigned to you in the Advance Program below. If you cannot make the session, please contact the Program Chair to arrange a substitute.

Abstracts of Contributed Papers
Abstracts for regular conference papers have been compiled so that authors can check them for accuracy and visitors to this website can preview the papers to be presented at the conference. Full proceedings will be published on a CD-ROM pocketed in the program book distributed to registrants at the conference.

View contributed paper abstracts in advance (pdf)


MONDAY - 19 May 2014


* See each individual workshop program for schedule details


HCW Heterogeneity in Computing Workshop
RAW Reconfigurable Architectures Workshop
HIPS Workshop on High-Level Parallel Programming Models & Supportive Environments
NIDISC Workshop on Nature Inspired Distributed Computing
HiCOMB Workshop on High Performance Computational Biology
APDCM Advances in Parallel and Distributed Computing Models
HPPAC High-Performance, Power-Aware Computing
HPGC High-Performance Grid and Cloud Computing Workshop
ASHES Accelerators and Hybrid Exascale Systems
PLC Programming Models, Languages and Compilers Workshop for Manycore and Heterogeneous Architectures
EduPar NSF/TCPP Workshop on Parallel and Distributed Computing Education
GABB Graph Algorithms Building Blocks [new]
Light Reception
6:00 PM – 8:00 PM

IPDPS 2014 Welcome Reception & TCPP Annual Meeting

TUESDAY - 20 May 2014


Opening Session
8:00 AM - 8:30 AM

Opening Session: TBA

Keynote Session
8:30 AM - 9:30 AM

Keynote Speech


Session Chair: Manish Parashar


Yutong Lu
National University of Defense Technology, China
Scalability-Centric HPC System Design


Abstract: Since scalability is one of the major challenges for advanced HPC systems in the post-petascale and exascale era, innovative integrated technology designs are needed for new architectures as well as their associated software stacks. We need to explore the capability of the CPU, accelerator, interconnect, and I/O storage system, up to the whole system. This talk will discuss scalability-centric HPC system hardware and software design as it relates to computation, communication, data processing, and fault tolerance. Experiences from the design and implementation of the Tianhe systems will also be presented, along with some investigations into architecture and software design for next-generation HPC systems. In general, a co-design approach should be followed throughout research and development activities to deliver a whole system for scalable computing that supports large-scale domain applications efficiently.


Bio: Professor Yutong Lu is the Director of the System Software Laboratory, School of Computer Science, National University of Defense Technology (NUDT), Changsha, China. She is also a professor in the State Key Laboratory of High Performance Computing, China. She received her B.S., M.S., and Ph.D. degrees from NUDT. Her extensive research and development experience has spanned several generations of domestic supercomputers in China. During this period, Prof. Lu was the Director Designer for the Tianhe-1A and Tianhe-2 systems, which were internationally recognized as the top-ranked supercomputing systems worldwide in November 2010 and June 2013, respectively. Her continuing research interests include parallel operating systems (OS), high-speed communications, global file systems, and advanced programming environments with MPI.

PhD Forum
Starts at 9:30 AM

PhD Forum Posters

See poster details here

Morning Break 9:30 AM - 10:00 AM

Parallel Technical Sessions 1, 2, 3, & 4

10:00 AM - 12:00 PM

Algorithms for Resource Management and Awareness


Session Chair: Helen Karatza


Cost-Optimal Execution of Boolean Query Trees with Shared Streams
Dounia Zaidouni (ENS Lyon, France); Yves Robert (ENS Lyon, France); Frederic Vivien (INRIA, France); Henri Casanova (University of Hawaii at Manoa, USA); Lipyeow Lim (University of Hawaii, USA)


It's About Time: On Optimal Virtual Network Embeddings under Temporal Flexibilities
Matthias Rost (T-Labs / TU Berlin, Germany); Stefan Schmid (T-Labs & TU Berlin, Germany); Anja Feldmann (TU-Berlin, Germany)


Exploiting Geometric Partitioning in Task Mapping for Parallel Computers
Mehmet Deveci (The Ohio State University, USA); Sivasankaran Rajamanickam (Sandia National Laboratories, USA); Vitus Leung (Sandia National Labs, USA); Kevin T Pedretti (Sandia National Laboratories, USA); Stephen L Olivier (Sandia National Laboratories, USA); David P. Bunde (Knox College, USA); Umit V. Catalyurek (The Ohio State University, USA); Karen D Devine (Sandia National Laboratories, USA)


Communication-efficient Distributed Variance Monitoring and Outlier Detection for Multivariate Time Series
Moshe Gabel (Technion - Israel Institute of Technology, Israel); Daniel Keren (Haifa University, Israel); Assaf Schuster (Technion - Israel Institute of Technology, Israel)



Big Data Processing


Session Chair: Yan Solihin


MobiStreams: A Reliable Distributed Stream Processing System for Mobile Devices
Huayong Wang (MIT, USA); Li-Shiuan Peh (MIT, USA)


MapReuse: Reusing Computation in an In-Memory MapReduce System
Devesh Tiwari (Oak Ridge National Laboratory, USA); Yan Solihin (North Carolina State University, USA)


PAGE: A Framework for Easy Parallelization of Genomic Applications
Mucahid Kutlu (The Ohio State University, USA); Gagan Agrawal (The Ohio State University, USA)


Pythia: Faster Big Data in Motion through Predictive Software-Defined Network Optimization at Runtime
Marcelo Neves (PUCRS, Brazil); Kostas Michael Katrinis (IBM, Ireland); Hubertus Franke (IBM, USA); César A. F. De Rose (PUCRS, Brazil)





Session Chair: Carole Wu


A Case for a Flexible Scalar Unit in SIMT Architecture
Yi Yang (NEC Laboratories America, Inc., USA); Ping Xiang (North Carolina State University, USA); Michael Mantor (AMD, USA); Norman Rubin (AMD Inc., USA); Lisa Hsu (AMD, USA); Qunfeng Dong (University of Science and Technology of China, P.R. China); Huiyang Zhou (North Carolina State University, USA)


Scalar Waving: Improving the Efficiency of SIMD Execution on GPUs
Ayse Yilmazer (Northeastern University, USA); Zhongliang Chen (Northeastern University, USA); David Kaeli (Northeastern University, USA)


Power and Performance Characterization and Modeling of GPU-accelerated Systems
Yuki Abe (Kyushu University, Japan); Hiroshi Sasaki (Kyushu University, Japan); Shinpei Kato (Nagoya University, Japan); Koji Inoue (Kyushu University, Japan); Masato Edahiro (Nagoya University, Japan); Martin Peres (University of Bordeaux, LaBRI, France)


Energy Efficient HPC on Embedded SoCs: Optimization Techniques for Mali GPU
Ivan Grasso (University of Innsbruck, Austria); Petar Radojkovic (Barcelona Supercomputing Center, Spain); Nikola Rajovic (Barcelona Supercomputing Center, Spain); Isaac Gelado (Barcelona Supercomputing Center, Spain); Alex Ramirez (UPC, BSC, Spain)



I/O, Storage, and Networking


Session Chair: Kathryn Mohror


Bursting the Cloud Data Bubble: Towards Transparent Storage Elasticity in IaaS Clouds
Bogdan Nicolae (IBM Research, Ireland); Pierre Riteau (University of Chicago, USA); Kate Keahey (Argonne National Lab, USA)


Scibox: Online Sharing of Scientific Data via the Cloud
Jian Huang (Georgia Institute of Technology, USA); Xuechen Zhang (Georgia Institute of Technology, USA); Greg Eisenhauer (Georgia Institute of Technology, USA); Karsten Schwan (Georgia Tech, USA); Matthew Wolf (Georgia Institute of Technology, USA); Stephane Ethier (PPPL, USA); Scott Klasky (Oak Ridge National Laboratory, USA)


CALCioM: Mitigating I/O Interference in HPC Systems through Cross-Application Coordination
Matthieu Dorier (ENS Cachan/IRISA, France); Gabriel Antoniu (INRIA Rennes - Bretagne Atlantique, France); Robert Ross (Argonne National Laboratory, USA); Dries Kimpe (Argonne National Laboratory, USA); Shadi Ibrahim (INRIA Rennes, France)


Active Measurement of the Impact of Network Switch Utilization on Application Performance
Marc Casas (Barcelona Supercomputing Center, Spain); Greg Bronevetsky (Lawrence Livermore National Laboratory, USA)

Parallel Technical Sessions 5, 6, 7, & 8
1:30 PM - 3:30 PM

Multi-core Algorithms


Session Chair: Kamesh Madduri


Multi-Resource Real-Time Reader/Writer Locks for Multiprocessors
Bryan Ward (The University of North Carolina at Chapel Hill, USA); Jim Anderson (Unc-Cs, USA)


Remote Invalidation: Optimizing the Critical Path of Memory Transactions
Ahmed Hassan (Virginia Tech, USA); Roberto Palmieri (Virginia Tech, USA); Binoy Ravindran (Virginia Tech, USA)


Revisiting Asynchronous Linear Solvers: Provable Convergence Rate Through Randomization
Haim Avron (IBM Research, USA); Alex Druinsky (Tel-Aviv University, Israel); Anshul Gupta (IBM TJ Watson Research Center, USA)


Accelerating MPI Collective Communications through Hierarchical Algorithms without Sacrificing Inter-Node Communication Flexibility
Benjamin S Parsons (Purdue University, USA); Vijay S Pai (Purdue University, USA)



Computational Biology


Session Chair: Srinivas Aluru


Enabling in-situ data analysis for large protein folding trajectory datasets
Boyu Zhang (University of Delaware, USA); Trilce Estrada (University of Delaware, USA); Pietro Cicotti (University of California, San Diego, USA); Michela Taufer (University of Delaware, USA)


Overcoming the Limitations Posed by TCRβ Repertoire Modeling through a GPU-based In-Silico DNA Recombination Algorithm
Gregory Striemer (University of Arizona, USA); Harsha Krovi (University of Colorado, USA); Ali Akoglu (University of Arizona, USA); Benjamin Vincent (University of North Carolina-Chapel Hill, USA); Ben Hopson (The University of Edinburgh, United Kingdom); Jeffrey Frelinger (University of Arizona, USA); Adam Buntzman (University of Arizona, USA)


Parallel Mutual Information Based Construction of Whole-Genome Networks on the Intel(R) Xeon Phi(TM) Coprocessor
Sanchit Misra (Intel Corporation, India); Kiran Pamnany (Intel Corporation, India); Srinivas Aluru (Georgia Institute of Technology, USA)


cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on a GPU
Jing Zhang (Virginia Tech, USA); Hao Wang (Virginia Tech, USA); Heshan Lin (Virginia Tech, USA); Wu-chun Feng (Virginia Tech, USA)



Interconnection Network


Session Chair: Huiyang Zhou


Skywalk: a Topology for HPC Networks with Low-delay Switches
Ikki Fujiwara (National Institute of Informatics, Japan); Michihiro Koibuchi (National Institute of Informatics, Japan); Hiroki Matsutani (Keio University, Japan); Henri Casanova (University of Hawaii at Manoa, USA)


LFTI: A New Performance Metric for Assessing Interconnect Designs for Extreme-Scale HPC Systems
Xin Yuan (Florida State University, USA); Santosh K Mahapatra (Florida State University, USA); Scott Pakin (Los Alamos National Laboratory, USA); Michael Lang (Los Alamos National Laboratory, USA)

An Improved Router Design for Reliable On-Chip Networks
Pavan Poluri (University of Arizona, USA); Ahmed Louri (University of Arizona, USA)


Energy-Efficient Time-Division Multiplexed Hybrid-Switched NoC for Heterogeneous Multicore Systems
Jieming Yin (University of Minnesota, USA); Pingqiang Zhou (University of Minnesota, USA); Sachin Sapatnekar (University of Minnesota, USA); Antonia Zhai (University of Minnesota, Minneapolis, USA)



System-Level Resource Management


Session Chair: Wei Tang


Heterogeneity-aware Workload Placement and Migration in Distributed Sustainable Datacenters
Dazhao Cheng (University of Colorado at Colorado Springs, USA); Changjun Jiang (Tongji University, P.R. China); Xiaobo Zhou (University of Colorado at Colorado Springs, USA)


Online server and workload management for joint optimization of electricity cost and carbon footprint across data centers
Zahra Abbasi (Arizona State University, USA); Madhurima Pore (Arizona State University, USA); Sandeep Gupta (Arizona State University, USA)


Cost-efficient and Resilient Job Life-cycle Management on Hybrid Clouds
Hsuan-Yi Chu (University of Southern California, USA); Yogesh Simmhan (Indian Institute of Science, India)


A Coprocessor Sharing-Aware Scheduler for Xeon Phi-based Compute Clusters
Giuseppe Coviello (NEC Laboratories America, Inc., USA); Srihari Cadambi (NEC Laboratories America, Inc, USA); Srimat Chakradhar (NEC Research Labs, USA)

Afternoon Break 3:30 PM - 4:00 PM

Parallel Technical Sessions 9, 10, 11, & 12

4:00 PM - 6:00 PM

GPU Algorithms


Session Chair: Fredrik Manne


Work-Efficient Parallel GPU Methods for Single Source Shortest Paths
Andrew Davidson (University of California, Davis, USA); Sean Baxter (NVIDIA, USA); Michael Garland (Nvidia, USA); John D. Owens (University of California, Davis, USA)


Efficient Multi-GPU Computation of All-Pairs Shortest Paths
Hristo Djidjev (Los Alamos National Laboratory, USA); Rumen Andonov (University of Rennes, France); Guillaume Chapuis (University of Rennes, France); Dominique Lavenier (CNRS-IRISA, France); Sunil Thulasidasan (Los Alamos National Laboratory, USA)


An Efficient GPU General Sparse Matrix-Matrix Multiplication for Irregular Data
Weifeng Liu (University of Copenhagen, Denmark); Brian Vinter (University of Copenhagen, Denmark)


Improving the Performance of CA-GMRES on Multicores with Multiple GPUs
Ichitaro Yamazaki (University of Tennessee, USA); Hartwig Anzt (University of Tennessee, USA); Stanimire Tomov (University of Tennessee, USA); Mark Hoemmen (Sandia National Laboratories, USA);  Jack Dongarra (University of Tennessee, Knoxville, USA)



Graph and Network Processing


Session Chair: Padma Raghavan


How Well do Graph-Processing Platforms Perform? An Empirical Performance Evaluation and Analysis
Yong Guo (TU Delft, The Netherlands); Marcin Biczak (TU Delft, The Netherlands); Ana Lucia Varbanescu (University of Amsterdam, The Netherlands); Alexandru Iosup (Delft University of Technology, The Netherlands); Claudio Martella (VU University Amsterdam, The Netherlands); Theodore L. Willke (Intel Corporation, USA)


Complex Network Analysis using Parallel Approximate Motif Counting
George Slota (The Pennsylvania State University, USA); Kamesh Madduri (The Pennsylvania State University, USA)


Finding Motifs in Biological Sequences Using the Micron Automata Processor
Indranil Roy (Georgia Institute of Technology, USA); Srinivas Aluru (Georgia Institute of Technology, USA)


Traversing Trillions of Edges in Real-time: Graph Exploration on Large-Scale Parallel Machines
Fabio Checconi (IBM TJ Watson Research Center, USA); Fabrizio Petrini (IBM TJ Watson Research Center, USA)



Modeling, Simulation, and Reliability


Session Chair: Devesh Tiwari


TBPoint: Reducing Simulation Time for Large Scale GPGPU Kernels
Jen-Cheng Huang (Georgia Institute of Technology, USA); Lifeng Nai (Georgia Institute of Technology, USA); Hyesoon Kim (Georgia Tech, USA); Hsien-Hsin Lee (Georgia Institute of Technology, USA)


Algorithmic time, energy, and power on candidate HPC compute building blocks
Jee Choi (Georgia Institute of Technology, USA); Marat Dukhan (Georgia Institute of Technology, USA); Xing Liu (Georgia Institute of Technology, USA); Richard W Vuduc (Georgia Institute of Technology, USA)


Characterization of Impact of Transient Faults and Detection of Data Corruption Errors in Large-Scale N-Body Programs Using Graphics Processing Units
Keun Soo Yim (Google, Inc., USA)


Analytically Modeling Application Execution for Software-Hardware Co-Design
Jichi Guo (University of Colorado at Colorado Springs, USA); Jiayuan Meng (Argonne National Laboratory, USA); Qing Yi (University of Colorado at Colorado Springs, USA); Vitali Morozov (Argonne National Laboratory, USA); Kalyan Kumaran (Argonne National Laboratory, USA)



Accelerator Application Development and Optimization


Session Chair: Torsten Hoefler


Interactive Program Debugging and Optimization for Directive-Based, Efficient GPU Computing
Seyong Lee (Oak Ridge National Laboratory, USA); Dong Li (Oak Ridge National Laboratory, USA); Jeffrey S Vetter (Oak Ridge National Laboratory, USA)


Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment
Azzam Haidar (University of Tennessee, USA); Chongxiao Cao (University of Tennessee Knoxville, USA); Jack Dongarra (University of Tennessee, Knoxville, USA); Piotr Luszczek (University of Tennessee, USA); Stanimire Tomov (University of Tennessee, USA); Asim YarKhan (UTK, USA); Khairul Kabir (University of Tennessee, USA)


Nitro: A Framework for Adaptive Code Variant Tuning
Saurav Muralidharan (University of Utah, USA); Manu Shantharam (University of Utah, USA); Mary Hall (University of Utah, USA); Michael Garland (NVIDIA Corporation, USA); Bryan Catanzaro (NVIDIA Corporation, USA)

Industry Talk
6:00 PM – 7:00 PM



Session Chair: Kalyana Chadal



The Exascale Architecture



Richard Graham, Senior Solutions Architect
Mellanox Technologies, Inc.

Read abstract and bio

WEDNESDAY - 21 May 2014


Keynote Session
8:30 AM – 9:30 AM

Keynote Speech


Session Chair: Viktor Prasanna


Peter Kogge
University of Notre Dame
Reading the Tea-Leaves: How Architecture Has Evolved at the High End


Abstract: The 2008 DARPA Exascale study was one of the first in-depth attempts to project ahead key characteristics for high-end massively parallel systems on the basis of technology trends, architectures, and computational kernels, and identified four major challenges for future systems designs. It focused on a single benchmark, Linpack, and identified two distinct classes of architectures: “heavyweight” and “lightweight.” This talk is a continuation of a series of updates to that study, and includes not only the most recent technology projections but also several new benchmarks for which significant multi-year data exists, and new classes of architectures that have emerged since then. The talk will address changes in characteristics (both before and after the seminal year of 2004, when multi-core took over), and how those characteristics are likely to project into the future. A series of vignettes on specific features will provide insight into areas where current design trends are becoming over- or under-balanced. Special attention is given to both computational energy and memory.


Bio: Dr. Peter Kogge currently holds the Ted McCourtney Chair in Computer Science and Engineering at the University of Notre Dame, and is a founder and Chief Scientist of Emu Solutions, Inc. Before coming to Notre Dame, Dr. Kogge spent 25 years at IBM’s former Federal Systems Division where he was appointed an IBM Fellow, and was responsible for the development and deployment of multiple leading edge architectures, including the world’s second multi-threaded machine (also the first parallel processor to fly in space), in retrospect the world’s first multi-core chip, and several of the world’s first working demonstrations of “Processing-In-Memory” architecture. His Ph.D. work in the early ’70s on what are now called parallel prefix algorithms led to what is known now as the “Kogge-Stone” adder. His text on pipelining in 1981 is often cited as the first formal text on this architectural technique. He led the DARPA Exascale study group in 2008 and was lead on the resulting report. He is also an IEEE Fellow and in 2012 received the Seymour Cray award for achievements in computer engineering.

Morning Break 9:30 AM - 10:00 AM

Parallel Technical Sessions 13, 14, 15, 16 & 17
10:00 AM - 12:00 PM

Combinatorial Algorithms


Session Chair: Aydin Buluc


New Effective Multithreaded Matching Algorithms
Fredrik Manne (University of Bergen, Norway); Mahantesh Halappanavar (Pacific Northwest National Laboratory, USA)


A medium-grain method for fast 2D bipartitioning of sparse matrices
Daniel M. Pelt (CWI, The Netherlands); Rob H. Bisseling (Utrecht University, The Netherlands)


Randomized matching heuristics with quality guarantees on shared memory parallel computers
Fanny Dufosse (LAAS-CNRS, France); Kamer Kaya (The Ohio State University, USA); Bora Uçar (CNRS, France)


BFS and Coloring-based Parallel Algorithms for Strongly Connected Components and Related Problems
George Slota (The Pennsylvania State University, USA); Sivasankaran Rajamanickam (Sandia National Laboratories, USA); Kamesh Madduri (The Pennsylvania State University, USA)



Large Scale Scientific Applications


Session Chair: Edmond Chow


Large-scale Hydrodynamic Brownian Simulations on Multicore and Manycore Architectures
Xing Liu (Georgia Institute of Technology, USA); Edmond Chow (Georgia Institute of Technology, USA)


Enabling Scalable Parallelization of Sampling-Based Motion Planning Algorithms Through Load Balancing
Adam Fidel (Texas A&M University, USA); Sam Jacobs (Texas A&M University, USA); Shishir Sharma (Texas A&M University, USA); Lawrence Rauchwerger (Texas A&M University, USA); Nancy Amato (Texas A&M University, USA)


Petascale Application of a Coupled CPU-GPU Algorithm for Simulation and Analysis of Multiphase Flow Solutions in Porous Medium Systems
James McClure (Virginia Tech, USA); Hao Wang (Virginia Tech, USA); Jan F Prins (University of North Carolina at Chapel Hill, USA); Cass Miller (University of North Carolina at Chapel Hill, USA); Wu-chun Feng (Virginia Tech, USA)


A Spatio-Temporal Coupling Method to Reduce the Time-to-Solution of Cardiovascular Simulations
Amanda Randles (Lawrence Livermore National Laboratory, USA); Efthimios Kaxiras (Harvard University, USA)



Multicore and Transactional Memory


Session Chair: Padma Raghavan


Mitigating the Mismatch between the Coherence Protocol and Conflict Detection in Hardware Transactional Memory
Lihang Zhao (University of Southern California/ Information Sciences Institute, USA); Lizhong Chen (University of Southern California, USA); Jeffrey Draper (University of Southern California/ Information Sciences Institute, USA)


Performance and Energy Analysis of the Restricted Transactional Memory Implementation on Haswell
Bhavishya Goel (Chalmers University of Technology, Sweden); Rubén Titos (University of Murcia, Spain); Anurag Negi (Chalmers University of Technology, Sweden); Sally A. McKee (Chalmers University of Technology, Sweden); Per Stenstrom (Chalmers University of Technology, Sweden)


Runtime-Guided Cache Coherence Optimizations in Multi-core Architectures
Madhavan Manivannan (Chalmers University of Technology, Sweden); Per Stenstrom (Chalmers University of Technology, Sweden)


High Performance Alltoall and Allgather designs for InfiniBand MIC Clusters
Akshay Venkatesh (Ohio State University, USA); Sreeram Potluri (The Ohio State University, USA); Raghunath Rajachandrasekar (The Ohio State University, USA); Miao Luo (The Ohio State University, USA); Khaled Hamidouche (The Ohio State University, USA); Dhabaleswar Panda (The Ohio State University, USA)



HPC Operating Systems and Runtime Systems


Session Chair: Ron Brightwell


HPMMAP: Lightweight Memory Management for Commodity Operating Systems
Brian Kocoloski (University of Pittsburgh, USA); John R Lange (University of Pittsburgh, USA)


Victim Selection and Distributed Work Stealing Performance: A Case Study
Swann Perarnau (RIKEN AICS, Japan); Mitsuhisa Sato (University of Tsukuba, Japan)


Power-efficient Multiple Producer-Consumer
Ramy Medhat (University of Waterloo, Canada); Borzoo Bonakdarpour (University of Waterloo, Canada); Sebastian Fischmeister (University of Waterloo, Canada)


Efficient Data Race Detection for C/C++ Programs Using Dynamic Granularity
Young Wn Song (Arizona State University, USA); Yann-Hang Lee (Arizona State University, USA)



Algorithms for Distributed Computing


Session Chair: Yinglong Xia


Improved Time Bounds for Linearizable Implementations of Abstract Data Types
Jiaqi Wang (Texas A&M University, USA); Edward Talmage (Texas A&M University, USA); Hyunyoung Lee (Texas A&M University, USA); Jennifer Welch (Texas A&M University, USA)


DEX: Self-healing Expanders
Gopal Pandurangan (Nanyang Technological University, Singapore); Peter Robinson (Nanyang Technological University, Singapore); Amitabh Trehan (Technion - Israel Institute of Technology, Israel)


Fair Maximal Independent Sets
Jeremy Fineman (Georgetown University, USA); Calvin Newport (Georgetown University, USA); Micah Sherr (Georgetown University, USA); Tonghe Wang (Georgetown University, USA)

Parallel Technical Sessions 18, 19, 20 & 21
1:30 PM - 3:30 PM

Milestones at the Petascale


Session Chair: Abhinav Bhatele


Balancing CPU-GPU Collaborative High-order CFD Simulations on the TianHe-1A Supercomputer
Chuanfu Xu (National University of Defense Technology, P.R. China)


Shedding Light On Lithium/Air Batteries Using Million Threads On the BG/Q Supercomputer
Valery Weber (IBM Research - Zurich, Switzerland); Costas Bekas (IBM Zurich Research Laboratory, Switzerland); Teodoro Laino (IBM Research - Zurich, Switzerland); Alessandro Curioni (IBM Zurich Research Laboratory, Switzerland); Adam Bertsch (Lawrence Livermore National Laboratory, USA); Scott Futral (Lawrence Livermore National Laboratory, USA)


Enabling and Scaling a global shallow-water atmospheric model on Tianhe-2
Wei Xue (Tsinghua University, P.R. China); Chao Yang (Institute of Software, Chinese Academy of Sciences, P.R. China); Haohuan Fu (Tsinghua University, P.R. China); Xinliang Wang (Tsinghua University, P.R. China); Yangtong Xu (Tsinghua University, P.R. China); Lin Gan (Tsinghua University, P.R. China)

Overcoming the Scalability Challenges of Epidemic Simulations on Blue Waters
Jae-seung Yeom (Virginia Tech, USA); Abhinav Bhatele (Lawrence Livermore National Laboratory, USA); Keith R Bisset (Virginia Tech, USA); Eric Bohm (University of Illinois at Urbana-Champaign, USA); Abhishek Gupta (University of Illinois at Urbana-Champaign, USA); Laxmikant V. Kale (University of Illinois at Urbana-Champaign, USA); Madhav Marathe (Virginia Tech, USA); Dimitrios S. Nikolopoulos (Queen's University of Belfast, United Kingdom); Martin Schulz (Lawrence Livermore National Laboratory, USA); Lukasz Wesolowski (University of Illinois at Urbana-Champaign, USA)



Storage and Reliability


Session Chair: D.K. Panda


POD: Performance Oriented I/O Deduplication for Primary Storage Systems in the Cloud
Bo Mao (Xiamen University, P.R. China); Hong Jiang (University of Nebraska at Lincoln, USA); Suzhen Wu (Xiamen University, P.R. China); Lei Tian (University of Nebraska-Lincoln, USA)


Pipelined Compaction for the LSM-tree
Zigang Zhang (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China); Yinliang Yue (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China); Bingsheng He (Nanyang Technological University, Singapore); Jin Xiong (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China); Mingyu Chen (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China); Lixin Zhang (Chinese Academy of Sciences, P.R. China); Ninghui Sun (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China)


EDM: an Endurance-aware Data Migration Scheme for Load Balancing in SSD Storage Clusters
Jiaxin Ou (University of Tsinghua, P.R. China); Youyou Lu (University of Tsinghua, P.R. China); Jiwu Shu (Tsinghua University, P.R. China); Letian Yi (Tsinghua University, P.R. China); Wei Wang (Tsinghua, P.R. China)



Map/Reduce and Big Data


Session Chair: Fabrizio Petrini


Characterization and Optimization of Memory-Resident MapReduce on HPC Systems
Yandong Wang (Auburn University, USA); Robin Goldstone (Lawrence Livermore National Laboratory, USA); Weikuan Yu (Auburn University, USA); Teng Wang (Auburn University, USA); Yizheng Jiao (Auburn University, USA)


MIC-SVM: Designing A Highly Efficient Support Vector Machine For Advanced Modern Multi-Core and Many-Core Architectures
Yang You (Tsinghua University, P.R. China); Shuaiwen Song (Pacific Northwest National Laboratory, USA); Haohuan Fu (Tsinghua University, P.R. China); Andres Marquez (Pacific Northwest National Lab, USA); Guangwen Yang (Tsinghua University, P.R. China); Kevin Barker (Pacific Northwest National Laboratory, USA); Kirk Cameron (Virginia Tech, USA); Maryam Dehnavi (MIT, USA); Amanda Randles (Lawrence Livermore National Laboratory, USA)


BigKernel -- High Performance CPU-GPU Communication Pipelining for "Big Data"-style Applications
Reza Mokhtari (University of Toronto, Canada); Michael Stumm (University of Toronto, Canada)

DataMPI: Extending MPI to Hadoop-like Big Data Computing
Xiaoyi Lu (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China); Fan Liang (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China); Bing Wang (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China); Li Zha (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China); Zhiwei Xu (Institute of Computing Technology, Chinese Academy of Sciences, P.R. China)



Network Algorithms


Session Chair: Frédéric Vivien


An Efficient Method for Stream Semantics over RDMA
Patrick MacArthur (University of New Hampshire, USA); Robert D. Russell (University of New Hampshire, USA)


Collaborative Network Configuration in Hybrid Electrical/optical Data Center Networks
Zhiyang Guo (Stony Brook University, USA); Yuanyuan Yang (Stony Brook University, USA)


Optimizing Bandwidth Allocation in Flex-Grid Optical Networks with Application to Scheduling
Hadas Shachnai (Technion, Israel); Ariella Voloshin (Technion, Israel); Shmuel Zaks (Technion, Israel)


Balancing On-Chip Network Latency in Multi-Application Mapping for Chip-Multiprocessors
Di Zhu (University of Southern California, USA); Lizhong Chen (University of Southern California, USA); Siyu Yue (University of Southern California, USA); Timothy M. Pinkston (University of Southern California, USA); Massoud Pedram (University of Southern California, USA)

Afternoon Break 3:30 PM - 4:00 PM

Symposium Panel
4:00 PM - 6:00 PM

Symposium Town Hall Meeting


Meeting Facilitators:
Viktor Prasanna
David Bader
Srinivas Aluru
Manish Parashar
Arnold Rosenberg
… and others

PhD Forum & Pre-Banquet Reception

6:00 PM – 7:00 PM

PhD Forum

Presenters will be available to discuss their posters with attendees at the reception held in the poster area.


7:00 PM


THURSDAY - 22 May 2014


Keynote Session
8:30 AM - 9:30 AM

Keynote Speech


Session Chair: David Bader


Joshua Bloom
University of California, Berkeley
Astrophysical Applications of Machine Learning at Scale and Under Duress


Abstract: The universe is teeming with change on timescales from billions of years to milliseconds. A major goal of modern synoptic imaging surveys is to categorize this change over the entire sky to infer the diverse physical origins of variability. However, event discovery is only the beginning in the quest to extract the deepest insights: expensive followup resources (telescopes and people) are required, often in a time constrained environment. Viewing discovery and scientific insight through a resource-maximization lens, I discuss how machine learning is being applied to some modern astrophysics challenges. Here, the surfacing of parallelized feature engineering and machine learning into production-quality (scalable and fault tolerant) frameworks is the frontier for our field.


Bio: Dr. Joshua Bloom is an astronomy professor at the University of California, Berkeley, where he teaches high-energy astrophysics and Python for data scientists. He has published over 250 refereed articles, largely on time-domain transient events and telescope/insight automation. His book on gamma-ray bursts, a technical introduction for physical scientists, was published recently by Princeton University Press. Josh has been awarded the Pierce Prize from the American Astronomical Society; he is also a former Sloan Fellow, Junior Fellow at the Harvard Society, and Hertz Foundation Fellow. He holds a PhD from Caltech and degrees from Harvard and Cambridge University.

Morning Break 9:30 AM - 10:00 AM

Best Papers

10:00 AM - 12:00 PM

Session Best Papers


Scalable Single Source Shortest Path algorithms for Massively Parallel Systems
Venkatesan T Chakaravarthy (IBM Research - India, India); Fabio Checconi (IBM TJ Watson Research Center, USA); Fabrizio Petrini (IBM TJ Watson Research Center, USA); Yogish Sabharwal (IBM Research - India, India)

A New Scalable Parallel Algorithm for Fock Matrix Construction
Xing Liu (Georgia Institute of Technology, USA); Aftab Patel (Georgia Institute of Technology, USA); Edmond Chow (Georgia Institute of Technology, USA)


ReDHiP: Recalibrating Deep Hierarchy Prediction for Energy Efficiency
Xun Li (Facebook); Diana Franklin (University of California, Santa Barbara, USA); Ricardo Bianchini (Rutgers University, USA); Fred Chong (University of California, Santa Barbara, USA)


F2C2-STM: Flux-based Feedback-driven Concurrency Control for STMs
Kaushik Ravichandran (Georgia Institute of Technology, USA); Santosh Pande (Georgia Institute of Technology, USA)

Parallel Technical Sessions 22, 23, 24 & 25
1:30 PM - 3:30 PM

Performance Characterization and Optimization


Session Chair: Piotr Luszczek


Identifying code phases using piece-wise linear regressions
Harald Servat (Universitat Politècnica de Catalunya - Barcelona Supercomputing Center, Spain); German Llort (Barcelona Supercomputing Center, Spain); Juan Gonzalez (Universitat Politècnica de Catalunya, Spain); Judit Gimenez (Barcelona Supercomputing Center - Universitat Politècnica de Catalunya, Spain); Jesús Labarta (Barcelona Supercomputing Center, Spain)


Auto-Tuning Dedispersion for Many-Core Accelerators
Alessio Sclocco (Vrije Universiteit Amsterdam, The Netherlands); Henri Bal (Vrije Universiteit, The Netherlands); Jason Hessels (ASTRON, The Netherlands); Joeri van Leeuwen (ASTRON, The Netherlands); Rob V van Nieuwpoort (ASTRON, The Netherlands)


RCMP: Enabling Efficient Recomputation Based Failure Resilience for Big Data Analytics
Florin Dinu (Rice University, USA); T. S. Eugene Ng (Rice University, USA)


A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU
Tingxing "Tim" Dong (University of Tennessee, Knoxville, USA); Veselin Dobrev (Lawrence Livermore National Lab, USA); Tzanio Kolev (Lawrence Livermore National Lab, USA); Robert Rieben (Lawrence Livermore National Lab, USA); Stanimire Tomov (University of Tennessee, USA); Jack Dongarra (University of Tennessee, Knoxville, USA)



Multithreading and Concurrency


Session Chair: Bronis De Supinski


Using Multiple Threads to Accelerate Single Thread Performance
Zehra Sura (IBM Research, USA); Kevin O'Brien (IBM Research, USA); Jose Brunheroto (IBM Research, USA)


Active Measurement of Memory Resource Consumption
Marc Casas (Barcelona Supercomputing Center, Spain); Greg Bronevetsky (Lawrence Livermore National Laboratory, USA)


Locating Parallelization Potential in Object-Oriented Data Structures
Korbinian Molitorisz (Karlsruhe Institute of Technology (KIT), Germany); Thomas Karcher (Karlsruhe Institute of Technology (KIT), Germany); Alexander Bieles (Karlsruhe Institute of Technology (KIT), Germany); Walter F. Tichy (University Karlsruhe, Germany)



Numerical Algorithms


Session Chair: Mathias Jacquelin


An Accelerated Recursive Doubling Algorithm for Block Tridiagonal Systems
Sudip Seal (Oak Ridge National Laboratory, USA)


Designing LU-QR hybrid solvers for performance and stability
Mathieu Faverge (IPB, France); Julien Herrmann (ENS Lyon, France); Julien Langou (University of Colorado Denver, USA); Yves Robert (ENS Lyon, France); Bradley Lowery (University of Colorado Denver, USA); Jack Dongarra (University of Tennessee, Knoxville, USA)


Effectively exploiting parallel scale for all problem sizes in LU factorization
Md Rakib Hasan (Louisiana State University, USA); Clint Whaley (Louisiana State University, USA)


Anatomy of High-Performance Many-Threaded Matrix Multiplication
Tyler M Smith (University of Texas at Austin, USA); Robert van de Geijn (U. Texas at Austin, USA); Mikhail Smelyanskiy (Intel, USA); Jeff Hammond (Argonne National Laboratory, USA); Field G Van Zee (University of Texas at Austin, USA)



Performance Impacts of Hardware Acceleration


Session Chair: Kamesh Madduri


Comparative Performance Analysis of Intel Xeon Phi, GPU, and CPU: A Case Study from Microscopy Image Analysis
George Teodoro (Emory University, USA); Tahsin Kurc (Emory University, USA); Jun Kong (Emory University, USA); Lee Cooper (Emory University, USA); Joel Saltz (Emory University, USA)


A Framework for Lattice QCD Calculations on GPUs
Frank Winter (Thomas Jefferson National Accelerator Facility, USA); Mike Clark (NVIDIA Corporation, USA); Robert Edwards (Thomas Jefferson National Accelerator Facility, USA); Balint Joo (Thomas Jefferson National Accelerator Facility, USA)


Improving Communication Performance and Scalability of Native Applications on Intel(R) Xeon Phi(TM) Coprocessor Clusters
Karthikeyan Vaidyanathan (Intel Corporation, India); Kiran Pamnany (Intel Corporation, India); Dhiraj D Kalamkar (Intel Corporation, India); Alexander Heinecke (Technische Universität München, Germany); Mikhail Smelyanskiy (Intel, USA); Jongsoo Park (Intel Corporation, USA); Daehyun Kim (Intel Corporation, USA); Aniruddha Shet (Intel Corporation, India); Bharat Kaul (Intel Corporation, India); Balint Joo (Thomas Jefferson National Accelerator Facility, USA); Pradeep Dubey (Intel Corporation, USA)

Computational Co-design of a Multiscale Plasma Application: A Process and Initial Results
Joshua Payne (Los Alamos National Laboratory, USA); Dana Knoll (Los Alamos National Laboratory, USA); Allen McPherson (Los Alamos National Laboratory, USA); William Taitano (Los Alamos National Laboratory, USA); Luis Chacón (Los Alamos National Laboratory, USA); Guangye Chen (Los Alamos National Laboratory, USA); Scott Pakin (Los Alamos National Laboratory, USA)

Afternoon Break 3:30 PM - 4:00 PM

Parallel Technical Sessions 26, 27, 28 & 29
4:00 PM - 6:00 PM

Programming Models and Tools


Session Chair: John Shalf


UPC++: A PGAS Extension for C++
Yili Zheng (Lawrence Berkeley National Laboratory, USA); Amir Kamil (Lawrence Berkeley National Lab, USA); Michael Driscoll (UC Berkeley, USA); Hongzhang Shan (LBL, USA); Katherine Yelick (University of California at Berkeley, USA)


A Performance Evaluation of One-Sided and Two-Sided Communication Paradigms on Relaxed-Ordering Interconnect
Khaled Z Ibrahim (Lawrence Berkeley National Laboratory, USA); Paul H. Hargrove (Lawrence Berkeley National Laboratory, USA); Costin Iancu (Lawrence Berkeley National Laboratory, USA); Katherine Yelick (University of California at Berkeley, USA)


Scaling Irregular Applications through Data Aggregation and Software Multithreading
Alessandro Morari (Pacific Northwest National Laboratory, USA); Oreste Villa (NVIDIA, USA); Antonino Tumeo (Pacific Northwest National Laboratory, USA); Daniel Gerardo Chavarria (Pacific Northwest National Laboratory, USA); Mateo Valero (Universidad Politécnica de Cataluña, Spain)


Generalizing Run-time Tiling with the Loop Chain Abstraction
Michelle Strout (Colorado State University, USA); Fabio Luporini (Imperial College London, United Kingdom); Christopher Krieger (Colorado State University, USA); Carlo Bertolli (Imperial College London, United Kingdom); Gheorghe-teodor Bercea (Imperial College London, United Kingdom); Catherine Olschanowsky (Colorado State University, USA); J. Ram Ramanujam (Louisiana State University, USA); Paul H J Kelly (Imperial College London, United Kingdom)



Algorithms for High Performance Computing


Session Chair: Bora Uçar


s-step Krylov Subspace Methods as Bottom Solvers for Geometric Multigrid
Samuel W. Williams (Lawrence Berkeley National Laboratory, USA); Erin Carson (University of California at Berkeley, USA); Michael Lijewski (Lawrence Berkeley National Laboratory, USA); Nicholas Knight (University of California at Berkeley, USA); Ann Almgren (Lawrence Berkeley National Laboratory, USA); Brian Van Straalen (Lawrence Berkeley National Laboratory, USA); James Demmel (University of California at Berkeley, USA)


Reconstructing Householder Vectors from Tall-Skinny QR
Grey Ballard (Sandia National Laboratories, USA); James Demmel (University of California at Berkeley, USA); Laura Grigori (INRIA, France); Mathias Jacquelin (Lawrence Berkeley National Laboratory, USA); Hong Diep Nguyen (UC Berkeley, USA); Edgar Solomonik (University of California at Berkeley, USA)


Peta-scale General Solver for Semidefinite Programming Problems with over Two Million Constraints
Katsuki Fujisawa (Chuo University, Japan); Toshio Endo (Tokyo Institute of Technology, Japan); Yuichiro Yasui (Chuo University, Japan); Hitoshi Sato (Tokyo Institute of Technology, Japan); Naoki Matsuzawa (University of Tokyo, Japan); Satoshi Matsuoka (Tokyo Institute of Technology, Japan); Hayato Waki (Kyushu University, Japan)


Optimization of Multi-level Checkpoint Model for Large Scale HPC Applications
Sheng Di (INRIA, France); Mohamed Slim Bouguerra (Joint Lab INRIA and UIUC, USA); Leonardo Bautista-gomez (Argonne National Laboratory, USA); Franck Cappello (INRIA and University of Illinois at Urbana Champaign, France)



Scalable Algorithms


Session Chair: Mark Hoemmen


Evaluating the Impact of SDC on the GMRES Iterative Solver
James Elliott (North Carolina State University, USA); Mark Hoemmen (Sandia National Laboratories, USA); Frank Mueller (NCSU, USA)


A Multi-Core Parallel Branch-and-Bound Algorithm Using Factorial Number System
Mohand Mezmaz (University of Mons, Belgium); Rudi Leroy (INRIA, France); Nouredine Melab (CNRS/LIFL - Université Lille 1, France); Daniel Tuyttens (University of Mons, Belgium)


Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations
Hasan Metin Aktulga (Lawrence Berkeley National Laboratory, USA); Samuel W. Williams (Lawrence Berkeley National Laboratory, USA); Aydin Buluc (Lawrence Berkeley National Laboratory, USA); Chao Yang (Lawrence Berkeley National Lab, USA)



Resilience and Reliability


Session Chair: David Bernholdt


Fault Tolerant Messaging Interface for Fast and Transparent Recovery
Kento Sato (Tokyo Institute of Technology, USA); Adam Moody (Lawrence Livermore National Lab, USA); Kathryn Mohror (Lawrence Livermore National Laboratory, USA); Todd Gamblin (Lawrence Livermore National Laboratory, USA); Bronis R. de Supinski (Lawrence Livermore National Laboratory, USA); Naoya Maruyama (RIKEN AICS, Japan); Satoshi Matsuoka (Tokyo Institute of Technology, Japan)


Designing Bit-Reproducible Portable High-Performance Applications
Andrea Arteaga (Swiss Federal Institute of Technology, Switzerland); Oliver Fuhrer (MeteoSwiss, Switzerland); Torsten Hoefler (ETH Zurich, Switzerland)


F-SEFI: A Fine-grained Soft Error Fault Injection Tool for Profiling Application Vulnerability
Qiang Guan (University of North Texas, USA); Nathan DeBardeleben (Los Alamos National Laboratory, USA); Sean Blanchard (Los Alamos National Laboratory, USA); Song Fu (University of North Texas, USA)

FRIDAY - 23 May 2014


* See each individual
workshop programs
for schedule details


PDSEC Workshop on Parallel and Distributed Scientific and Engineering Computing
DPDNS Dependable Parallel, Distributed and Network-Centric Systems
MTAAP Workshop on Multi-Threaded Architectures and Applications
LSPP Workshop on Large-Scale Parallel Processing
PCO Parallel Computing and Optimization
ParLearning Parallel and Distributed Computing for Machine Learning and Inference Problems
HPDIC High Performance Data Intensive Computing

Workflow Models, Systems, Services and Applications in the Cloud

Now part of HPGC

JSSPP Workshop on Job Scheduling Strategies for Parallel Processing

Virtual Prototyping of Parallel and Embedded Systems

Now part of RAW

CHIUW Chapel Implementers and Users Workshop [new]




March 25th Deadline for Advance Registration

Registration Details
