
Sponsors


IN COOPERATION WITH

ACM

ACM SIGARCH

and

TCCA

TCDP

IPDPS 2024 Advance Program

Please visit the IPDPS website regularly for updates, since there may be schedule revisions.

Authors who have corrections should send email to contact@ipdps.org giving full details.

MONDAY - 27 May 2024

DAYS • Monday • Tuesday • Wednesday • Thursday • Friday

MONDAY
Workshops

 

ALL DAY

 

See each individual
workshop program
for schedule details

1

HCW

Heterogeneity in Computing Workshop

2

RAW

Reconfigurable Architectures Workshop

3

APDCM

Advances in Parallel and Distributed Computational Models

4

AsHES

Accelerators and Hybrid Emerging Systems

5

EduPar

NSF/TCPP Workshop on Parallel and Distributed Computing Education

6

ESSA

Extreme-Scale Storage and Analysis

7

GrAPL

Graphs, Architectures, Programming, and Learning

8

HiCOMB

High Performance Computational Biology

9

PAISE

Parallel AI and Systems for the Edge

Reception
6:00 PM - 7:30 PM

IPDPS - TCPP Welcome Reception


TUESDAY - 28 May 2024

Opening Session
8:15 AM - 8:30 AM

Opening Session

Keynote Session
8:30 AM - 9:30 AM

Keynote

Session Chair: TBA

 

Franck Cappello
Argonne National Laboratory

 

To be announced

Morning Break 9:30 AM - 10:00 AM

All Day

Main Conference Poster-Accept Papers

 

See listing here. Posters on Display in Ballroom Foyer

Parallel Technical
Sessions 1A & 1B

10:00 AM - 12:00 PM

Session 1A: Numerical Linear Algebra


Session Chair: TBA

  • PckSpMM: Towards Optimizing SpMM with Packing Strategies in Graph Neural Networks
    Zhengding Hu, Jingwei Sun, Zhongyang Li, Guangzhong Sun (University of Science and Technology of China)

  • VNEC: A Vectorized Non-Empty Column Format for SpMV on CPUs 
    Luhan Wang, Haipeng Jia, Lei Xu,  Cunyang Wei (Institute of Computing Technology, Chinese Academy of Sciences); Kun Li (Microsoft Research); Xianmeng Jiang, Yunquan Zhang (Institute of Computing Technology, Chinese Academy of Sciences)

  • Improving Performance of s-step GMRES by Two-step Block Orthogonalization     
    Ichitaro Yamazaki (SNL);  Andrew J. Higgins (Temple University); Erik Boman (SNL); Daniel B. Szyld (Temple University)

  • Alternative Basis Matrix Multiplication is Fast and Stable
    Oded Schwartz (The Hebrew University of Jerusalem); Sivan Toledo (Tel Aviv University); Noa Vaknin (The Hebrew University of Jerusalem);  Gal Wiernik (Tel Aviv University)

  • Fast multiplication of random dense matrices with sparse matrices
    Tianyu Liang, Riley Murray, Aydin Buluc, James Demmel (UC Berkeley)

  • A Cholesky QR Type Algorithm for Computing Tall-Skinny QR Factorization with Column Pivoting
    Takeshi Fukaya (Hokkaido University); Yuji Nakatsukasa (University of Oxford); Yusaku Yamamoto (The University of Electro-Communications)


Session 1B: Containers and Serverless Computing


Session Chair: TBA

  • CKSM: An Efficient Memory Deduplication Method for Container-based Cloud Computing Systems          
    Yunfei Gu, Yihui Lu, Chentao Wu, Jie Li, Minyi Guo (Shanghai Jiao Tong University)

  • Tackling Cold Start in Serverless Computing with Multi-Layer Container Reuse      
    Amelie Chi Zhou (Hong Kong Baptist University); Rongzheng Huang (Shenzhen University); Zhoubin Ke (Shenzhen University); Yusen Li (Nankai University); Yi Wang, Rui Mao (Shenzhen University)

  • PALDIA: Enabling SLO-Compliant and Cost-Effective Serverless Computing on Heterogeneous Hardware
    Vivek M. Bhasi, Aakash Sharma, Shruti Mohanty, Mahmut Taylan Kandemir, Chita R. Das (The Pennsylvania State University)

  • Application-Attuned Memory Management for Containerized HPC Workflows
    Moiz Arif, Avinash Maurya, M. Mustafa Rafique (Rochester Institute of Technology); Dimitrios S. Nikolopoulos (Virginia Tech); Ali R. Butt (Rochester Institute of Technology)

  • FEDGE: An Interference-Aware QoS Prediction Framework for Black-Box Scenario in IaaS Clouds with Domain Generalization
    Yunlong Cheng, Xiuqi Huang, Zifeng Liu, Jiadong Chen, Xiaofeng Gao (Shanghai Jiao Tong University); Zhen Fang, Yongqiang Yang (Huawei)

  • Software Resource Disaggregation for HPC with Serverless Computing
    Marcin Copik, Marcin Chrapek (ETH Zürich); Larissa Schmid (Karlsruhe Institute of Technology); Alexandru Calotoiu, Torsten Hoefler (ETH Zürich)                     

12:00 PM – 1:30 PM

Lunch & PhD Program

Parallel Technical
Sessions 2A & 2B

1:30 PM – 2:30 PM

Session 2A: Algorithms on Trees

 

Session Chair: TBA

  • AMST: Accelerating Large-Scale Graph Minimum Spanning Tree Computation on FPGA
    Haishuang Fan, Rui Meng, Qichu Sun, Jingya Wu, Xiaowei Li, Guihai Yan (State Key Laboratory of Processors, Institute of Computing Technology, Chinese Academy of Sciences)

  • Wait-free trees supporting asymptotically efficient range queries
    Ilya Kokorin (ITMO University); Victor Yudov (ITMO University); Vitaly Aksenov (City, University of London); Dan Alistarh (ISTA)

  • Low-Depth Spatial Tree Algorithms
    Yves Baumann, Tal Ben-Nun, Maciej Besta, Lukas Gianinazzi, Torsten Hoefler, Piotr Luczynski (ETH Zurich)


Session 2B: Federated and Distributed Learning

 

Session Chair: TBA

  • QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices
    Juntao Zhao, Borui Wan (The University of Hong Kong); Yanghua Peng, Haibin Lin, Yibo Zhu (ByteDance Inc.); Chuan Wu (The University of Hong Kong)

  • Enhancing the Generalization of Personalized Federated Learning with Multi-head Model and Ensemble Voting
    Van An Le (National Institute of Advanced Industrial Science and Technology, Japan); Nam Duong Tran, Phuong Nam Nguyen, Thanh Hung Nguyen, Phi Le Nguyen (Hanoi University of Science and Technology, Vietnam); Truong Thao Nguyen (National Institute of Advanced Industrial Science and Technology, Japan); Yusheng Ji (National Institute of Informatics, Japan)

  • UniFaaS: Programming across Distributed Cyberinfrastructure with Federated Function Serving
    Yifei Li (Southern University of Science and Technology); Ryan Chard (Argonne National Laboratory); Yadu Babuji, Kyle Chard (University of Chicago); Ian Foster (Argonne National Laboratory); Zhuozhao Li (Southern University of Science and Technology)

Parallel Technical
Sessions 3A & 3B

2:30 PM – 4:10 PM

Session 3A: Applications I

 

Session Chair: TBA

  • Scalable and Differentiable Simulator for Quantum Computational Chemistry
    Zhiqian Xu (Institute of Computing Technology, Chinese Academy of Sciences); Honghui Shang, Yi Fan, Xiongzhi Zeng (University of Science and Technology of China); Yunquan Zhang (Institute of Computing Technology, Chinese Academy of Sciences); Chu Guo (Hunan Normal University)

  • Picasso: Memory-Efficient Graph Coloring Using Palettes With Applications in Quantum Computing
    S.M. Ferdous (Pacific Northwest National Laboratory); Reece Neff (North Carolina State University); Bo Peng, Salman Shuvo, Marco Minutoli, Sayak Mukherjee, Karol Kowalski (Pacific Northwest National Laboratory); Michela Becchi (North Carolina State University); Mahantesh Halappanavar (Pacific Northwest National Laboratory)

  • Optimizing and Scaling the 3D Reconstruction of Single-Particle Imaging
    Wu-chun Feng (Virginia Tech); Vinay Ramakrishnaiah (Los Alamos National Laboratory); Christine Sweeney (Los Alamos National Laboratory); Niteya Shah (Virginia Tech); Jeffrey Donatelli (Lawrence Berkeley National Laboratory)

  • Parallel Approximations for High-Dimensional Multivariate Normal Probability Computation in Confidence Region Detection Applications
    Xiran Zhang, Sameh Abdulah (King Abdullah University of Science and Technology); Jian Cao (University of Houston); Hatem Ltaief,  Ying Sun, Marc G. Genton, David E. Keyes (King Abdullah University of Science and Technology)

  • Enabling High-Performance Physical Based Rendering on New Sunway Supercomputer
    Zeyu Song, Lin Gan, Shengye Xiang, Yinuo Wang (Tsinghua University); Xiaohui Duan (Shandong University); Guangwen Yang (Tsinghua University)


Session 3B: Scheduling I

 

Session Chair: TBA

  • CoCG: Fine-grained Cloud Game Co-location on Heterogeneous Platform  
    Taolei Wang, Chao Li, Jing Wang, Cheng Xu, Xiaofeng Hou, Minyi Guo (Shanghai Jiao Tong University)

  • Adaptive Task-Oriented Resource Allocation for Large Dynamic Workflows on Opportunistic Resources
    Thanh Son Phung, Douglas Thain (University of Notre Dame)

  • nOS-V: Co-Executing HPC Applications Using System-Wide Task Scheduling
    David Álvarez, Kevin Sala, Vicenç Beltran (Barcelona Supercomputing Center)

  • Cross-System Analysis of Job Characterization and Scheduling in Large-Scale Computing Clusters
    Di Zhang, Monish Soundar Raj  (University of North Carolina at Charlotte); Bing Xie (Microsoft); Sheng Di (ANL); Dong Dai (University of North Carolina at Charlotte)

  • Interpretable Analysis of GPU-Accelerated Cluster Traces: Discovering Association Patterns for Operational Insights
    Baolin Li (Northeastern University); Siddharth Samsi (MIT); Vijay Gadepally (MIT Lincoln Laboratory);  Devesh Tiwari (Northeastern University)

Late Afternoon Break 4:10 PM – 4:40 PM

PLENARY Session:
Best Papers
4:40 PM - 6:40 PM

Best Paper Nominees

 

Session Chair: TBA

  • CloverLeaf on Intel Multi-Core CPUs: A Case Study in Write-Allocate Evasion
    Jan Laukemann, Thomas Gruber, Georg Hager (University of Erlangen-Nuremberg); Dossay Oryspayev (Brookhaven National Laboratory); Gerhard Wellein (Erlangen National High Performance Computing Center)

  • ARGO: An Auto-Tuning Runtime System for Scalable GNN Training on Multi-Core Processor
    Yi-chien Lin (University of Southern California); Yuyang Chen (Tsinghua University); Sameh Gobriel, Nilesh Jain, Gopi Krishna Jha (Intel); Viktor Prasanna (University of Southern California)

  • Accelerating Lossy and Lossless Compression on Emerging BlueField DPU Architectures
    Yuke Li, Arjun Kashyap, Weicong Chen (University of California, Merced); Yanfei Guo (Argonne National Laboratory); Xiaoyi Lu (University of California, Merced)

  • Performance-Portable Multiphase Flow Solutions with Discontinuous Galerkin Methods
    Tobias Flynn (University of Warwick); Robert Manson-Sawko (IBM-Research Europe); Gihan Mudalige (University of Warwick)


WEDNESDAY - 29 May 2024

All Day

Main Conference Poster-Accept Papers

 

See listing here. Posters on Display in Ballroom Foyer

Parallel Technical
Sessions
4A & 4B

8:30 AM – 10:30 AM

Session 4A: Applications II

 

Session Chair: TBA

  • Optimized GPU implementation of grid refinement in lattice Boltzmann method
    Ahmed H. Mahmoud (Autodesk Research and University of California, Davis); Hesam Salehipour, Massimiliano Meneghin (Autodesk Research)

  • Alya towards Exascale: Optimal OpenACC Performance of the Navier-Stokes Finite Element Assembly on GPUs
    Dominik Ernst (FAU Erlangen-Nürnberg); Herbert Owen (Barcelona Supercomputing Center); Thomas Gruber (FAU Erlangen-Nürnberg); Oriol Lehmkuhl, Guillaume Houzeaux, Lucas Gasparino (Barcelona Supercomputing Center); Gerhard Wellein (FAU Erlangen-Nürnberg)

  • CliZ: Optimizing Lossy Compression for Climate Datasets with Adaptive Fine-tuned Data Prediction
    Zizhe Jian (University of California, Riverside); Sheng Di (Argonne National Laboratory);  Jinyang Liu (University of California, Riverside); Kai Zhao (Florida State University); Xin Liang (University of Kentucky); Haiying Xu (NCAR); Robert Underwood (Argonne National Laboratory);  Shixun Wu, Zizhong Chen (University of California, Riverside); Franck Cappello (Argonne National Laboratory)

  • Automating GPU Scalability for Complex Scientific Models: Phonon Boltzmann Transport Equation
    Eric Heisler (University of Utah); Siddharth Saurav (The Ohio State University); Aadesh Deshmukh (University of Utah); Sandip Mazumder (The Ohio State University); Hari Sundar (University of Utah)

  • A distributed-memory parallel algorithm for discretized integral equations using Julia
    Tianyu Liang (The University of California, Berkeley); Chao Chen (North Carolina State University); Per-Gunnar Martinsson, George Biros (The University of Texas at Austin)

  • Exploiting long vectors with a CFD code: a co-design show case
    Marc Blancafort, Roger Ferrer,  Guillaume Houzeaux, Marta Garcia-Gasulla, Filippo Mantovani (Barcelona Supercomputing Center)       


Session 4B: I/O and Storage Systems

 

Session Chair: TBA

  • Capturing Periodic I/O Using Frequency Techniques
    Ahmad Tarraf (Technical University of Darmstadt); Alexis Bandet, Francieli Boito (Inria, University of Bordeaux); Guillaume Pallez (Inria); Felix Wolf (Technical University of Darmstadt)

  • To Store or Not to Store: a graph theoretical approach for Dataset Versioning
    Anxin Guo (Northwestern University); Jingwei Li (Columbia University); Pattara Sukprasert (Databricks); Samir Khuller (Northwestern University); Amol Deshpande (University of Maryland); Koyel Mukherjee (Adobe Research)

  • TunIO: An AI-powered Framework for Optimizing HPC I/O
    Neeraj Rajesh, Keith Bateman (Illinois Institute of Technology); Jean Luca Bez (Lawrence Berkeley National Laboratory); Suren Byna (Ohio State University); Anthony Kougkas, Xian-He Sun (Illinois Institute of Technology)

  • A2FL: Autonomous and Adaptive File Layout in HPC through Real-time Access Pattern Analysis
    Dong Kyu Sung (Seoul National University); Yongseok Son (Chung-Ang University); Alex Sim, Kesheng Wu (Lawrence Berkeley National Laboratory); Suren Byna (The Ohio State University); Houjun Tang (Lawrence Berkeley National Laboratory); Hyeonsang Eom (Seoul National University); Changjong Kim, Sunggon Kim (Seoul National University of Science and Technology)

  • NVMe-oPF: Designing Efficient Priority Schemes for NVMe-over-Fabrics with Multi-Tenancy Support
    Darren Ng, Andrew Lin, Arjun Kashyap (University of California, Merced); Guanpeng Li (University of Iowa); Xiaoyi Lu (University of California, Merced)

  • Drilling Down I/O Bottlenecks with Cross-layer I/O Profile Exploration
    Hammad Ather (University of Oregon); Jean Luca Bez (Lawrence Berkeley National Laboratory); Yankun Xia, Suren Byna (The Ohio State University)

Morning Break 10:30 AM - 11:00 AM

Keynote Session
11:00 AM – 12:00 PM

Keynote

Session Chair: TBA

 

Peng Wu
Meta

 

To be announced

12:00 PM – 1:30 PM

Lunch & PhD Program

Parallel Technical
Sessions
5A & 5B

1:30 PM – 2:30 PM

Session 5A: Performance

 

Session Chair: TBA

  • CachedArrays: Optimizing Data Movement for Heterogeneous Memory Systems
    Mark Hildebrand, Jason Lowe-Power, Venkatesh Akella (UC Davis)

  • Comparative Study of Large Language Model Architectures on HPC
    Junqi Yin, Avishek Bose, Guojing Cong, Isaac Lyngaas (Oak Ridge National Laboratory); Quentin Anthony (Ohio State University)

  • Predicting Cross-Architecture Performance of Parallel Programs
    Daniel Nichols, Alexander Movsesyan (University of Maryland);  Jae-seung Yeom, Abhik Sarkar, Daniel Milroy, Tapasya Patki (Lawrence Livermore National Laboratory); Abhinav Bhatele (University of Maryland)


Session 5B: Resilience

 

Session Chair: TBA

  • DRUTO: Upper-Bounding Silent Data Corruption Vulnerability in GPU Applications
    Md Hasanur Rahman (University of Iowa); Sheng Di (Argonne National Laboratory); Shengjian Guo (Amazon Web Services); Xiaoyi Lu (University of California, Merced); Guanpeng Li (University of Iowa); Franck Cappello (Argonne National Laboratory)

  • MPI Errors Detection using GNN Embedding and Vector Embedding over LLVM IR
    Jad El Karchi (Inria); Hanze Chen, Ali Tehrani, Ali Jannesari (Iowa State University); Mihail Popov, Emmanuelle Saillard (Inria)

  • A Parallel Partial Merge Repair Algorithm for Multi-block Failures for Erasure Storage Systems
    Shuaipeng Zhang (Harbin Institute of Technology, Shenzhen); Shiyi Li (Harbin Institute of Technology, Shenzhen); Chentao Wu (Shanghai Jiao Tong University); Ruobin Wu (Harbin Institute of Technology, Shenzhen); Saiqin Long (Jinan University); Wen Xia (Harbin Institute of Technology, Shenzhen)

Parallel Technical
Sessions
6A & 6B

2:30 PM – 4:10 PM

Session 6A: Accelerators

 

Session Chair: TBA

  • Harmonica: Hybrid Accelerator to Overcome Imperfections of Mixed-signal DNN Accelerators
    Payman Behnam, Uday Kamal (Georgia Institute of Technology); Ali Shafiee (Meta); Alexey Tumanov, Saibal Mukhopadhyay (Georgia Institute of Technology)

  • IPU-EpiDet: Identifying Gene Interactions on Massively Parallel Graph-Based AI Accelerators
    Ricardo Nobre, Aleksandar Ilic (INESC-ID); Sergio Santander-Jiménez (University of Extremadura (UNEX)); Leonel Sousa (INESC-ID)

  • DEFCON: Deformable Convolutions Leveraging Interval Search and GPU Texture Hardware
    Malith Jayaweera, Yanyu Li (Northeastern University); Bin Ren (William & Mary); David Kaeli (Northeastern University); Yanzhi Wang (Northeastern University)

  • Benchmarking and Dissecting the Nvidia Hopper GPU Architecture
    Weile Luo, Ruibo Fan, Zeyu Li, Dayou Du (The Hong Kong University of Science and Technology/ Guangzhou); Qiang Wang (Harbin Institute of Technology, Shenzhen); Xiaowen Chu (The Hong Kong University of Science and Technology/ Guangzhou)

  • Exploration of Trade-offs Between General-Purpose and Specialized Processing Elements in HPC-Oriented CGRA
    Emanuele Del Sozzo (RIKEN Center for Computational Science);  Xinyuan Wang (University of Toronto); Boma Adhi, Carlos Cortes (RIKEN Center for Computational Science); Jason Anderson (University of Toronto); Kentaro Sano (RIKEN Center for Computational Science)


Session 6B: Scheduling II

 

Session Chair: TBA

  • Hadar: Heterogeneity-Aware Optimization-Based Online Scheduling for Deep Learning Clusters
    Abeda Sultana (University of Louisiana at Lafayette); Fei Xu (East China Normal University); Xu Yuan, Li Chen, Nian-Feng Tzeng (University of Louisiana at Lafayette)

  • Fast Abort-freedom for Deterministic Transactions
    Chen Chen (University of Illinois at Chicago); Xingbo Wu (Microsoft Research); Wenshao Zhong, Jakob Eriksson (University of Illinois at Chicago)

  • SYNPA: SMT Performance Analysis and Allocation of Threads to Cores in ARM Processors
    Marta Navarro, Josué Feliu, Salvador Petit, María E. Gómez (Universitat Politècnica de València); Victor Lixin (HiSilicon); Julio Sahuquillo (Universitat Politècnica de València)

  • SWEEP: Adaptive Task Scheduling for Exploring Energy Performance Trade-offs
    Jing Chen, Madhavan Manivannan, Bhavishya Goel, Miquel Pericàs (Chalmers University of Technology)

  • Automatic Task Parallelization of Dataflow Graphs in ML/DL Models
    Srinjoy Das, Lawrence Rauchwerger (University of Illinois at Urbana Champaign)

Afternoon Break 4:10 PM - 4:40 PM

Conference Poster Session
4:40 PM

Details to be announced

5:30 PM

PhD Forum - Students at posters

6:30 PM - 7:30 PM

Pre-Banquet Reception

7:30 PM

Banquet (Paper and Poster Awards)


THURSDAY - 30 May 2024

All Day

Main Conference Poster-Accept Papers

 

See listing here. Posters on Display in Ballroom Foyer

Parallel Technical
Sessions
7A & 7B

8:30 AM – 10:30 AM

Session 7A: Message Passing and Communication

 

Session Chair: TBA

  • Adaptive Prefetching for Fine-grain Communication in PGAS Programs
    Thomas B. Rolinger (NVIDIA); Alan Sussman (University of Maryland)

  • An Optimized Error-controlled MPI Collective Framework Integrated with Lossy Compression
    Jiajun Huang (University of California, Riverside); Sheng Di (Argonne National Laboratory); Xiaodong Yu (Stevens Institute of Technology); Yujia Zhai (University of California, Riverside); Zhaorui Zhang (The Hong Kong Polytechnic University); Jinyang Liu (University of California, Riverside); Xiaoyi Lu (University of California, Merced); Ken Raffenetti, Hui Zhou (Argonne National Laboratory); Kai Zhao (Florida State University); Zizhong Chen (University of California, Riverside); Franck Cappello, Yanfei Guo (Argonne National Laboratory); Rajeev Thakur (Argonne National Laboratory)

  • MUSE: A Runtime Incrementally Reconfigurable Network Adapting to HPC Real-Time Traffic
    Zijian Li, Zixuan Chen, Yiying Tang,  Xin Ai, Yuanyi Zhu, Zhigao Zhao, Jiang Shao (Fudan University); Guowei Liu (Tsinghua University); Sen Liu (Fudan University); Bin Liu (Tsinghua University); Yang Xu (Fudan University)

  • Fast Policy Convergence for Traffic Engineering with Proactive Distributed Message-Passing
    Zicheng Wang, Zirui Zhuang, Jingyu Wang, Qi Qi, Haifeng Sun,  Jianxin Liao (Beijing University of Posts and Telecommunications)

  • The Self-adaptive and Topology-aware MPI Bcast leveraging Collective offload on Tianhe Express Interconnect
    Chongshan Liang, Yi Dai (NUDT); Jun Xia (Nanhu Lab); Jinbo Xu, Jintao Peng, Weixia Xu, Ming Xie, Jie Liu, Zhiquan Lai, Sheng Ma, Qi Zhu (NUDT)

  • HINT: Designing Cache-Efficient MPI_Alltoall using Hybrid Memory Copy Ordering and Non-Temporal Instructions
    Bharath Ramesh, Nick Contini, Nawras Alnaasan, Kaushik Kandadi Suresh, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda (The Ohio State University)

 

Session 7B: Communication Subsystems

 

Session Chair: TBA

  • Flexible NVMe Request Routing for Virtual Machines
    Tu Dinh Ngoc, Boris Teabe, Georges Da Costa, Daniel Hagimont (IRIT, Université de Toulouse, CNRS, Toulouse INP, UT3)

  • HA-CSD: Host and SSD Coordinated Compression for Capacity and Performance
    Xiang Chen (Huazhong University of Science and Technology); Tao Lu, Jiapin Wang (DapuStor); Yu Zhong (Huazhong University of Science and Technology); Guangchun Xie (DapuStor); Xueming Cao, Yuanpeng Ma, Bing Si, Feng Ding, Ying Yang, Yunxing Huang  (DapuStor); Yafei Yang, You Zhou, Fei Wu (Huazhong University of Science and Technology)

  • Graph Analytics on Jellyfish Topology
    Md Nahid Newaz (Oakland University); Sayan Ghosh,  Joshua Suetterlein, Nathan Tallent (Pacific Northwest National Laboratory); Md Atiqul Mollah (Cornelis Networks); Hua Ming (Oakland University)

  • TEEMO: Temperature Aware Energy Efficient Multi-Retention STT-RAM Cache Architecture
    Sukarn Agarwal (IIT Mandi); Shounak Chakraborty, Magnus Sjalander (Norwegian University of Science and Technology)

  • LockillerTM: Enhancing Performance Lower Bounds in Best-Effort Hardware Transactional Memory
    Li Wan, Chao Fu, Qiang Li, Jun Han (Fudan University)

  • Attention, Distillation, and Tabularization: Towards Practical Neural Network-Based Prefetching
    Pengmiao Zhang, Neelesh Gupta (University of Southern California); Rajgopal Kannan (DEVCOM Army Research Lab); Viktor Prasanna (University of Southern California)

Morning Break 10:30 AM - 11:00 AM

Keynote Session
11:00 AM – 12:00 PM

Keynote

Session Chair: TBA

 

Kunle Olukotun
Stanford University

 

To be announced

12:00 PM – 1:30 PM

Lunch & PhD Program

Parallel Technical
Sessions
8A & 8B

1:30 PM – 2:50 PM

Session 8A: Graph and MoE Learning

 

Session Chair: TBA

  • Aurora: A Versatile and Flexible Accelerator for Generic Graph Neural Networks
    Jiaqi Yang (George Washington University); Hao Zheng (University of Central Florida); Ahmed Louri (George Washington University)

  • cuKE: An Efficient Code Generator for Score Function Computation in Knowledge Graph Embedding
    Lihan Hu (The University of Iowa); Jing Li (Nvidia); Peng Jiang (The University of Iowa)

  • Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference
    Jinghan Yao, Quentin Anthony, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda (The Ohio State University)

  • TASER: Temporal Adaptive Sampling for Fast and Accurate Dynamic Graph Representation Learning
    Gangda Deng, Hongkuan Zhou (University of Southern California); Hanqing Zeng, Yinglong Xia, Christopher Leung, Jianbo Li (Meta); Rajgopal Kannan (DEVCOM US Army Research Lab); Viktor Prasanna (University of Southern California)

 

Session 8B: Performance Optimization

 

Session Chair: TBA

  • OpenFFT-SME: An Efficient Outer Product Pattern FFT Library on ARM SME CPUs
    Ruge Zhang, Haipeng Jia, Yunquan Zhang (Institute of Computing Technology, Chinese Academy of Sciences); Baicheng Yan, Penghao Ma, Long Wang (Huawei Technologies Co. Ltd); Wenxuan Zhao (Institute of Computing Technology, Chinese Academy of Sciences)

  • Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures
    Evangelos Georganas, Kirill Voronin, Abhisek Kundu (Intel Corporation); Antonio Noack (Friedrich Schiller Universität Jena); Dhiraj Kalamkar (Intel Corporation); Alexander Breuer (Friedrich Schiller Universität Jena); Alexander Heinecke, Hans Pabst (Intel Corporation)

  • Optimizing General Matrix Multiplications on Modern Multi-core DSPs
    Kainan Yu, Xinxin Qi, Peng Zhang, Jianbin Fang, Dezun Dong, Ruibo Wang, Tao Tang, Chun Huang, Yonggang Che (National University of Defense Technology); Zheng Wang (Northwest University)

  • Machine-Learning-Driven Runtime Optimization of BLAS Level 3 on Modern Multi-Core Systems
    Yufan Xia (The Chinese University of Hong Kong); Giuseppe Maria Junior Barca (The University Of Melbourne)

Afternoon Break 2:50 PM - 3:30 PM

Parallel Technical
Sessions
9A & 9B

3:30 PM – 4:50 PM

Session 9A: Distributed Algorithms

 

Session Chair: TBA

  • Time-Color Tradeoff on Uniform Circle Formation by Asynchronous Robots
    Debasish Pattanayak (Carleton University); Gokarna Sharma (Kent State University)

  • LightDAG: A Low-latency DAG-based BFT Consensus through Lightweight Broadcast
    Xiaohai Dai, Guanxiong Wang, Jiang Xiao, Zhengxuan Guo (Huazhong University of Science and Technology); Rui Hao (Nanjing University); Xia Xie (Hainan University); Hai Jin (Huazhong University of Science and Technology)

  • MAAD: A Distributed Anomaly Detection Architecture for Microservices Systems
    Rongyuan Tan, Zhuozhao Li (Southern University of Science and Technology)

  • OneShot: View-Adapting Streamlined BFT Protocols with Trusted Execution Environments
    Jeremie Decouchant (Delft University of Technology); David Kozhaya (ABB Research); Vincent Rahli (University of Birmingham); Jiangshan Yu (Monash University)


Session 9B: Graph Algorithms

 

Session Chair: TBA

  • Practically Tackling Memory Bottlenecks of Graph-Processing Workloads
    Alexandre Valentin Jamet (Universitat Politecnica de Catalunya); Georgios Vavouliotis (Huawei Zurich Research Center); Daniel A. Jiménez (Texas A&M University); Lluc Alvarez (Barcelona Supercomputing Center); Marc Casas (Universitat Politecnica de Catalunya)

  • GCSM: GPU-Accelerated Continuous Subgraph Matching for Large Graphs
    Yihua Wei, Peng Jiang (The University of Iowa)

  • Parallel Derandomization for Coloring
    Sam Coy,  Artur Czumaj (University of Warwick); Peter Davies (Durham University);  Gopinath Mishra (National University of Singapore)

  • A Comparative Study of Intersection-Based Triangle Counting Algorithms on GPUs
    Jiangbo Li, Zichen Xu (The Nanchang University); Minh Pham, Yicheng Tu (University of South Florida); Qi Zhou (City University of Macau)

 

Main Conference Closing Session

 

Details to be announced


FRIDAY - 31 May 2024

FRIDAY
Workshops

 

ALL DAY


See each individual
workshop program
for schedule details

 

10

CGRA4HPC

Coarse-Grained Reconfigurable Architectures for High-Performance Computing

11

HIPS

High-level Parallel Programming Models and Supportive Environments

12

iWAPT

International Workshop on Automatic Performance Tuning

13

JSSPP

Job Scheduling Strategies for Parallel Processing

14

ParSocial

Parallel and Distributed Processing for Computational Social Systems

15

PDCO

Parallel / Distributed Combinatorics and Optimization

16

PDSEC

Parallel and Distributed Scientific and Engineering Computing

17

Q-CASA

Quantum Computing Algorithms, Systems, and Applications

 
