|
IPDPS 2024 Advance Program |
Please visit the IPDPS website regularly for updates, since there may be schedule revisions.
Authors who have corrections should send email to contact@ipdps.org giving full details.
MONDAY - 27 May 2024
DAYS • Monday • Tuesday • Wednesday • Thursday • Friday |
MONDAY
Workshops
ALL DAY
See each individual
workshop program
for schedule details |
|
Reception
6:00 PM - 7:30 PM |
IPDPS - TCPP Welcome Reception |
TUESDAY - 28 May 2024
Opening Session
8:15 AM - 8:30 AM |
Opening Session |
Keynote Session
8:30 AM - 9:30 AM |
Keynote
Session Chair: TBA
Franck Cappello
Argonne National Laboratory
To be announced |
Morning Break 9:30 AM - 10:00 AM |
All Day |
Main Conference Poster-Accept Papers
See listing here. Posters on Display in Ballroom Foyer |
Parallel Technical
Sessions 1A & 1B
10:00 AM - 12:00 PM |
Session 1A: Numerical Linear Algebra
Session Chair: TBA
-
PckSpMM: Towards Optimizing SpMM with Packing Strategies in Graph Neural Networks
Zhengding Hu, Jingwei Sun, Zhongyang Li, Guangzhong Sun (University of Science and Technology of China)
-
VNEC: A Vectorized Non-Empty Column Format for SpMV on CPUs
Luhan Wang, Haipeng Jia, Lei Xu, Cunyang Wei (Institute of Computing Technology, Chinese Academy of Sciences); Kun Li (Microsoft Research); Xianmeng Jiang, Yunquan Zhang (Institute of Computing Technology, Chinese Academy of Sciences)
-
Improving Performance of s-step GMRES by Two-step Block Orthogonalization
Ichitaro Yamazaki (SNL); Andrew J. Higgins (Temple University); Erik Boman (SNL); Daniel B. Szyld (Temple University)
-
Alternative Basis Matrix Multiplication is Fast and Stable
Oded Schwartz (The Hebrew University of Jerusalem); Sivan Toledo (Tel Aviv University); Noa Vaknin (The Hebrew University of Jerusalem); Gal Wiernik (Tel Aviv University)
-
Fast multiplication of random dense matrices with sparse matrices
Tianyu Liang, Riley Murray, Aydin Buluc, James Demmel (UC Berkeley)
-
A Cholesky QR Type Algorithm for Computing Tall-Skinny QR Factorization with Column Pivoting
Takeshi Fukaya (Hokkaido University); Yuji Nakatsukasa (University of Oxford); Yusaku Yamamoto (The University of Electro-Communications)
Session 1B: Containers and Serverless Computing
Session Chair: TBA
-
CKSM: An Efficient Memory Deduplication Method for Container-based Cloud Computing Systems
Yunfei Gu, Yihui Lu, Chentao Wu, Jie Li, Minyi Guo (Shanghai Jiao Tong University)
-
Tackling Cold Start in Serverless Computing with Multi-Layer Container Reuse
Amelie Chi Zhou (Hong Kong Baptist University); Rongzheng Huang (Shenzhen University); Zhoubin Ke (Shenzhen University); Yusen Li (Nankai University); Yi Wang, Rui Mao (Shenzhen University)
-
PALDIA: Enabling SLO-Compliant and Cost-Effective Serverless Computing on Heterogeneous Hardware
Vivek M. Bhasi, Aakash Sharma, Shruti Mohanty, Mahmut Taylan Kandemir, Chita R. Das (The Pennsylvania State University)
-
Application-Attuned Memory Management for Containerized HPC Workflows
Moiz Arif, Avinash Maurya, M. Mustafa Rafique (Rochester Institute of Technology); Dimitrios S. Nikolopoulos (Virginia Tech); Ali R. Butt (Rochester Institute of Technology)
-
FEDGE: An Interference-Aware QoS Prediction Framework for Black-Box Scenario in IaaS Clouds with Domain Generalization
Yunlong Cheng, Xiuqi Huang, Zifeng Liu, Jiadong Chen, Xiaofeng Gao (Shanghai Jiao Tong University); Zhen Fang, Yongqiang Yang (Huawei)
-
Software Resource Disaggregation for HPC with Serverless Computing
Marcin Copik, Marcin Chrapek (ETH Zürich); Larissa Schmid (Karlsruhe Institute of Technology); Alexandru Calotoiu, Torsten Hoefler (ETH Zürich)
|
12:00 PM – 1:30 PM |
Lunch & PhD Program |
Parallel Technical
Sessions 2A & 2B
1:30 PM – 2:30 PM |
Session 2A: Algorithms on Trees
Session Chair: TBA
-
AMST: Accelerating Large-Scale Graph Minimum Spanning Tree Computation on FPGA
Haishuang Fan, Rui Meng, Qichu Sun, Jingya Wu, Xiaowei Li, Guihai Yan (State Key Laboratory of Processors, Institute of Computing Technology, Chinese Academy of Sciences)
-
Wait-free trees supporting asymptotically efficient range queries
Ilya Kokorin (ITMO University); Victor Yudov (ITMO University); Vitaly Aksenov (City, University of London); Dan Alistarh (ISTA)
-
Low-Depth Spatial Tree Algorithms
Yves Baumann, Tal Ben-Nun, Maciej Besta, Lukas Gianinazzi, Torsten Hoefler, Piotr Luczynski (ETH Zurich)
Session 2B: Federated and Distributed Learning
Session Chair: TBA
-
QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices
Juntao Zhao, Borui Wan (The University of Hong Kong); Yanghua Peng, Haibin Lin, Yibo Zhu (ByteDance Inc.); Chuan Wu (The University of Hong Kong)
-
Enhancing the Generalization of Personalized Federated Learning with Multi-head Model and Ensemble Voting
Van An Le (National Institute of Advanced Industrial Science and Technology, Japan); Nam Duong Tran, Phuong Nam Nguyen, Thanh Hung Nguyen, Phi Le Nguyen (Hanoi University of Science and Technology, Vietnam); Truong Thao Nguyen (National Institute of Advanced Industrial Science and Technology, Japan); Yusheng Ji (National Institute of Informatics, Japan)
-
UniFaaS: Programming across Distributed Cyberinfrastructure with Federated Function Serving
Yifei Li (Southern University of Science and Technology); Ryan Chard (Argonne National Laboratory); Yadu Babuji, Kyle Chard (University of Chicago); Ian Foster (Argonne National Laboratory); Zhuozhao Li (Southern University of Science and Technology)
|
Parallel Technical
Sessions 3A & 3B
2:30 PM – 4:10 PM |
Session 3A: Applications I
Session Chair: TBA
-
Scalable and Differentiable Simulator for Quantum Computational Chemistry
Zhiqian Xu (Institute of Computing Technology, Chinese Academy of Sciences); Honghui Shang, Yi Fan, Xiongzhi Zeng (University of Science and Technology of China); Yunquan Zhang (Institute of Computing Technology, Chinese Academy of Sciences); Chu Guo (Hunan Normal University)
-
Picasso: Memory-Efficient Graph Coloring Using Palettes With Applications in Quantum Computing
S.M. Ferdous (Pacific Northwest National Laboratory); Reece Neff (North Carolina State University); Bo Peng, Salman Shuvo, Marco Minutoli, Sayak Mukherjee, Karol Kowalski (Pacific Northwest National Laboratory); Michela Becchi (North Carolina State University); Mahantesh Halappanavar (Pacific Northwest National Laboratory)
-
Optimizing and Scaling the 3D Reconstruction of Single-Particle Imaging
Wu-chun Feng (Virginia Tech); Vinay Ramakrishnaiah (Los Alamos National Laboratory); Christine Sweeney (Los Alamos National Laboratory); Niteya Shah (Virginia Tech); Jeffrey Donatelli (Lawrence Berkeley National Laboratory)
-
Parallel Approximations for High-Dimensional Multivariate Normal Probability Computation in Confidence Region Detection Applications
Xiran Zhang, Sameh Abdulah (King Abdullah University of Science and Technology); Jian Cao (University of Houston); Hatem Ltaief, Ying Sun, Marc G. Genton, David E. Keyes (King Abdullah University of Science and Technology)
-
Enabling High-Performance Physical Based Rendering on New Sunway Supercomputer
Zeyu Song, Lin Gan, Shengye Xiang, Yinuo Wang (Tsinghua University); Xiaohui Duan (Shandong University); Guangwen Yang (Tsinghua University)
Session 3B: Scheduling I
Session Chair: TBA
-
CoCG: Fine-grained Cloud Game Co-location on Heterogeneous Platform
Taolei Wang, Chao Li, Jing Wang, Cheng Xu, Xiaofeng Hou, Minyi Guo (Shanghai Jiao Tong University)
-
Adaptive Task-Oriented Resource Allocation for Large Dynamic Workflows on Opportunistic Resources
Thanh Son Phung, Douglas Thain (University of Notre Dame)
-
nOS-V: Co-Executing HPC Applications Using System-Wide Task Scheduling
David Álvarez, Kevin Sala, Vicenç Beltran (Barcelona Supercomputing Center)
-
Cross-System Analysis of Job Characterization and Scheduling in Large-Scale Computing Clusters
Di Zhang, Monish Soundar Raj (University of North Carolina at Charlotte); Bing Xie (Microsoft); Sheng Di (ANL); Dong Dai (University of North Carolina at Charlotte)
-
Interpretable Analysis of GPU-Accelerated Cluster Traces: Discovering Association Patterns for Operational Insights
Baolin Li (Northeastern University); Siddharth Samsi (MIT); Vijay Gadepally (MIT Lincoln Laboratory); Devesh Tiwari (Northeastern University)
|
Late Afternoon Break 4:10 PM – 4:40 PM |
PLENARY Session:
Best Papers
4:40 PM - 6:40 PM |
Best Paper Nominees
Session Chair: TBA
-
CloverLeaf on Intel Multi-Core CPUs: A Case Study in Write-Allocate Evasion
Jan Laukemann, Thomas Gruber, Georg Hager (University of Erlangen-Nuremberg); Dossay Oryspayev (Brookhaven National Laboratory); Gerhard Wellein (Erlangen National High Performance Computing Center)
-
ARGO: An Auto-Tuning Runtime System for Scalable GNN Training on Multi-Core Processor
Yi-chien Lin (University of Southern California); Yuyang Chen (Tsinghua University); Sameh Gobriel, Nilesh Jain, Gopi Krishna Jha (Intel); Viktor Prasanna (University of Southern California)
-
Accelerating Lossy and Lossless Compression on Emerging BlueField DPU Architectures
Yuke Li, Arjun Kashyap, Weicong Chen (University of California, Merced); Yanfei Guo (Argonne National Laboratory); Xiaoyi Lu (University of California, Merced)
-
Performance-Portable Multiphase Flow Solutions with Discontinuous Galerkin Methods
Tobias Flynn (University of Warwick); Robert Manson-Sawko (IBM Research Europe); Gihan Mudalige (University of Warwick)
|
WEDNESDAY - 29 May 2024
All Day |
Main Conference Poster-Accept Papers
See listing here. Posters on Display in Ballroom Foyer |
Parallel Technical
Sessions 4A & 4B
8:30 AM – 10:30 AM |
Session 4A: Applications II
Session Chair: TBA
-
Optimized GPU implementation of grid refinement in lattice Boltzmann method
Ahmed H. Mahmoud (Autodesk Research and University of California, Davis); Hesam Salehipour, Massimiliano Meneghin (Autodesk Research)
-
Alya towards Exascale: Optimal OpenACC Performance of the Navier-Stokes Finite Element Assembly on GPUs
Dominik Ernst (FAU Erlangen-Nürnberg); Herbert Owen (Barcelona Supercomputing Center); Thomas Gruber (FAU Erlangen-Nürnberg); Oriol Lemkuhl, Guillaume Houzeaux, Lucas Gasparino (Barcelona Supercomputing Center); Gerhard Wellein (FAU Erlangen-Nürnberg)
-
CliZ: Optimizing Lossy Compression for Climate Datasets with Adaptive Fine-tuned Data Prediction
Zizhe Jian (University of California, Riverside); Sheng Di (Argonne National Laboratory); Jinyang Liu (University of California, Riverside); Kai Zhao (Florida State University); Xin Liang (University of Kentucky); Haiying Xu (NCAR); Robert Underwood (Argonne National Laboratory); Shixun Wu, Zizhong Chen (University of California, Riverside); Franck Cappello (Argonne National Laboratory)
-
Automating GPU Scalability for Complex Scientific Models: Phonon Boltzmann Transport Equation
Eric Heisler (University of Utah); Siddharth Saurav (The Ohio State University); Aadesh Deshmukh (University of Utah); Sandip Mazumder (The Ohio State University); Hari Sundar (University of Utah)
-
A distributed-memory parallel algorithm for discretized integral equations using Julia
Tianyu Liang (The University of California, Berkeley); Chao Chen (North Carolina State University); Per-Gunnar Martinsson, George Biros (The University of Texas at Austin)
-
Exploiting long vectors with a CFD code: a co-design show case
Marc Blancafort, Roger Ferrer, Guillaume Houzeaux, Marta Garcia-Gasulla, Filippo Mantovani (Barcelona Supercomputing Center)
Session 4B: I/O and Storage Systems
Session Chair: TBA
-
Capturing Periodic I/O Using Frequency Techniques
Ahmad Tarraf (Technical University of Darmstadt); Alexis Bandet, Francieli Boito (Inria, University of Bordeaux); Guillaume Pallez (Inria); Felix Wolf (Technical University of Darmstadt)
-
To Store or Not to Store: a graph theoretical approach for Dataset Versioning
Anxin Guo (Northwestern University); Jingwei Li (Columbia University); Pattara Sukprasert (Databricks); Samir Khuller (Northwestern University); Amol Deshpande (University of Maryland); Koyel Mukherjee (Adobe Research)
-
TunIO: An AI-powered Framework for Optimizing HPC I/O
Neeraj Rajesh, Keith Bateman (Illinois Institute of Technology); Jean Luca Bez (Lawrence Berkeley National Laboratory); Suren Byna (The Ohio State University); Anthony Kougkas, Xian-He Sun (Illinois Institute of Technology)
-
A2FL: Autonomous and Adaptive File Layout in HPC through Real-time Access Pattern Analysis
Dong Kyu Sung (Seoul National University); Yongseok Son (Chung-Ang University); Alex Sim, Kesheng Wu (Lawrence Berkeley National Laboratory); Suren Byna (The Ohio State University); Houjun Tang (Lawrence Berkeley National Laboratory); Hyeonsang Eom (Seoul National University); Changjong Kim, Sunggon Kim (Seoul National University of Science and Technology)
-
NVMe-oPF: Designing Efficient Priority Schemes for NVMe-over-Fabrics with Multi-Tenancy Support
Darren Ng, Andrew Lin, Arjun Kashyap (University of California, Merced); Guanpeng Li (University of Iowa); Xiaoyi Lu (University of California, Merced)
-
Drilling Down I/O Bottlenecks with Cross-layer I/O Profile Exploration
Hammad Ather (University of Oregon); Jean Luca Bez (Lawrence Berkeley National Laboratory); Yankun Xia, Suren Byna (The Ohio State University)
|
Morning Break 10:30 AM - 11:00 AM |
Keynote Session
11:00 AM – 12:00 PM |
Keynote
Session Chair: TBA
Peng Wu
Meta
To be announced |
12:00 PM – 1:30 PM |
Lunch & PhD Program |
Parallel Technical
Sessions 5A & 5B
1:30 PM – 2:30 PM |
Session 5A: Performance
Session Chair: TBA
-
CachedArrays: Optimizing Data Movement for Heterogeneous Memory Systems
Mark Hildebrand, Jason Lowe-Power, Venkatesh Akella (UC Davis)
-
Comparative Study of Large Language Model Architectures on HPC
Junqi Yin, Avishek Bose, Guojing Cong, Isaac Lyngaas (Oak Ridge National Laboratory); Quentin Anthony (Ohio State University)
-
Predicting Cross-Architecture Performance of Parallel Programs
Daniel Nichols, Alexander Movsesyan (University of Maryland); Jae-seung Yeom, Abhik Sarkar, Daniel Milroy, Tapasya Patki (Lawrence Livermore National Laboratory); Abhinav Bhatele (University of Maryland)
Session 5B: Resilience
Session Chair: TBA
-
DRUTO: Upper-Bounding Silent Data Corruption Vulnerability in GPU Applications
Md Hasanur Rahman (University of Iowa); Sheng Di (Argonne National Laboratory); Shengjian Guo (Amazon Web Services); Xiaoyi Lu (University of California, Merced); Guanpeng Li (University of Iowa); Franck Cappello (Argonne National Laboratory)
-
MPI Errors Detection using GNN Embedding and Vector Embedding over LLVM IR
Jad El Karchi (Inria); Hanze Chen, Ali Tehrani, Ali Jannesari (Iowa State University); Mihail Popov, Emmanuelle Saillard (Inria)
-
A Parallel Partial Merge Repair Algorithm for Multi-block Failures for Erasure Storage Systems
Shuaipeng Zhang (Harbin Institute of Technology, Shenzhen); Shiyi Li (Harbin Institute of Technology, Shenzhen); Chentao Wu (Shanghai Jiao Tong University); Ruobin Wu (Harbin Institute of Technology, Shenzhen); Saiqin Long (Jinan University); Wen Xia (Harbin Institute of Technology, Shenzhen)
|
Parallel Technical
Sessions 6A & 6B
2:30 PM – 4:10 PM |
Session 6A: Accelerators
Session Chair: TBA
-
Harmonica: Hybrid Accelerator to Overcome Imperfections of Mixed-signal DNN Accelerators
Payman Behnam, Uday Kamal (Georgia Institute of Technology); Ali Shafiee (Meta); Alexey Tumanov, Saibal Mukhopadhyay (Georgia Institute of Technology)
-
IPU-EpiDet: Identifying Gene Interactions on Massively Parallel Graph-Based AI Accelerators
Ricardo Nobre, Aleksandar Ilic (INESC-ID); Sergio Santander-Jiménez (University of Extremadura (UNEX)); Leonel Sousa (INESC-ID)
-
DEFCON: Deformable Convolutions Leveraging Interval Search and GPU Texture Hardware
Malith Jayaweera, Yanyu Li (Northeastern University); Bin Ren (William & Mary); David Kaeli (Northeastern University); Yanzhi Wang (Northeastern University)
-
Benchmarking and Dissecting the Nvidia Hopper GPU Architecture
Weile Luo, Ruibo Fan, Zeyu Li, Dayou Du (The Hong Kong University of Science and Technology/ Guangzhou); Qiang Wang (Harbin Institute of Technology, Shenzhen); Xiaowen Chu (The Hong Kong University of Science and Technology/ Guangzhou)
-
Exploration of Trade-offs Between General-Purpose and Specialized Processing Elements in HPC-Oriented CGRA
Emanuele Del Sozzo (RIKEN Center for Computational Science); Xinyuan Wang (University of Toronto); Boma Adhi, Carlos Cortes (RIKEN Center for Computational Science); Jason Anderson (University of Toronto); Kentaro Sano (RIKEN Center for Computational Science)
Session 6B: Scheduling II
Session Chair: TBA
-
Hadar: Heterogeneity-Aware Optimization-Based Online Scheduling for Deep Learning Clusters
Abeda Sultana (University of Louisiana at Lafayette); Fei Xu (East China Normal University); Xu Yuan, Li Chen, Nian-Feng Tzeng (University of Louisiana at Lafayette)
-
Fast Abort-freedom for Deterministic Transactions
Chen Chen (University of Illinois at Chicago); Xingbo Wu (Microsoft Research); Wenshao Zhong, Jakob Eriksson (University of Illinois at Chicago)
-
SYNPA: SMT Performance Analysis and Allocation of Threads to Cores in ARM Processors
Marta Navarro, Josué Feliu, Salvador Petit, María E. Gómez (Universitat Politècnica de València); Victor Lixin (HiSilicon); Julio Sahuquillo (Universitat Politècnica de València)
-
SWEEP: Adaptive Task Scheduling for Exploring Energy Performance Trade-offs
Jing Chen, Madhavan Manivannan, Bhavishya Goel, Miquel Pericàs (Chalmers University of Technology)
-
Automatic Task Parallelization of Dataflow Graphs in ML/DL Models
Srinjoy Das, Lawrence Rauchwerger (University of Illinois at Urbana Champaign)
|
Afternoon Break 4:10 PM - 4:40 PM |
Conference Poster Session
4:40 PM |
Details to be announced |
5:30 PM |
PhD Forum - Students at posters |
6:30 PM - 7:30 PM |
Pre-Banquet Reception |
7:30 PM |
Banquet (Paper and Poster Awards) |
THURSDAY - 30 May 2024
All Day |
Main Conference Poster-Accept Papers
See listing here. Posters on Display in Ballroom Foyer |
Parallel Technical
Sessions 7A & 7B
8:30 AM – 10:30 AM |
Session 7A: Message Passing and Communication
Session Chair: TBA
-
Adaptive Prefetching for Fine-grain Communication in PGAS Programs
Thomas B. Rolinger (NVIDIA); Alan Sussman (University of Maryland)
-
An Optimized Error-controlled MPI Collective Framework Integrated with Lossy Compression
Jiajun Huang (University of California, Riverside); Sheng Di (Argonne National Laboratory); Xiaodong Yu (Stevens Institute of Technology); Yujia Zhai (University of California, Riverside); Zhaorui Zhang (The Hong Kong Polytechnic University); Jinyang Liu (University of California, Riverside); Xiaoyi Lu (University of California, Merced); Ken Raffenetti, Hui Zhou (Argonne National Laboratory); Kai Zhao (Florida State University); Zizhong Chen (University of California, Riverside); Franck Cappello, Yanfei Guo (Argonne National Laboratory); Rajeev Thakur (Argonne National Laboratory)
-
MUSE: A Runtime Incrementally Reconfigurable Network Adapting to HPC Real-Time Traffic
Zijian Li, Zixuan Chen, Yiying Tang, Xin Ai, Yuanyi Zhu, Zhigao Zhao, Jiang Shao (Fudan University); Guowei Liu (Tsinghua University); Sen Liu (Fudan University); Bin Liu (Tsinghua University); Yang Xu (Fudan University)
-
Fast Policy Convergence for Traffic Engineering with Proactive Distributed Message-Passing
Zicheng Wang, Zirui Zhuang, Jingyu Wang, Qi Qi, Haifeng Sun, Jianxin Liao (Beijing University of Posts and Telecommunications)
-
The Self-adaptive and Topology-aware MPI Bcast leveraging Collective offload on Tianhe Express Interconnect
Weixia Xu, Jintao Peng, Jinbo Xu, Yi Dai, Chongshan Liang (NUDT); Jun Xia (Nanhu Lab); Ming Xie, Jie Liu, Zhiquan Lai, Sheng Ma, Qi Zhu (NUDT)
-
HINT: Designing Cache-Efficient MPI_Alltoall using Hybrid Memory Copy Ordering and Non-Temporal Instructions
Bharath Ramesh, Nick Contini, Nawras Alnaasan, Kaushik Kandadi Suresh, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda (The Ohio State University)
Session 7B: Communication Subsystems
Session Chair: TBA
-
Flexible NVMe Request Routing for Virtual Machines
Tu Dinh Ngoc, Boris Teabe, Georges Da Costa, Daniel Hagimont (IRIT, Université de Toulouse, CNRS, Toulouse INP, UT3)
-
HA-CSD: Host and SSD Coordinated Compression for Capacity and Performance
Xiang Chen (Huazhong University of Science and Technology); Tao Lu, Jiapin Wang (DapuStor); Yu Zhong (Huazhong University of Science and Technology); Guangchun Xie (DapuStor); Xueming Cao, Yuanpeng Ma, Bing Si, Feng Ding, Ying Yang, Yunxing Huang (DapuStor); Yafei Yang, You Zhou, Fei Wu (Huazhong University of Science and Technology)
-
Graph Analytics on Jellyfish Topology
Md Nahid Newaz (Oakland University); Sayan Ghosh, Joshua Suetterlein, Nathan Tallent (Pacific Northwest National Laboratory); Md Atiqul Mollah (Cornelis Networks); Hua Ming (Oakland University)
-
TEEMO: Temperature Aware Energy Efficient Multi-Retention STT-RAM Cache Architecture
Sukarn Agarwal (IIT Mandi); Shounak Chakraborty, Magnus Sjalander (Norwegian University of Science and Technology)
-
LockillerTM: Enhancing Performance Lower Bounds in Best-Effort Hardware Transactional Memory
Li Wan, Chao Fu, Qiang Li, Jun Han (Fudan University)
-
Attention, Distillation, and Tabularization: Towards Practical Neural Network-Based Prefetching
Pengmiao Zhang, Neelesh Gupta (University of Southern California); Rajgopal Kannan (DEVCOM Army Research Lab); Viktor Prasanna (University of Southern California)
|
Morning Break 10:30 AM - 11:00 AM |
Keynote Session
11:00 AM – 12:00 PM |
Keynote
Session Chair: TBA
Kunle Olukotun
Stanford University
To be announced |
12:00 PM – 1:30 PM |
Lunch & PhD Program |
Parallel Technical
Sessions 8A & 8B
1:30 PM – 2:50 PM |
Session 8A: Graph and MoE Learning
Session Chair: TBA
-
Aurora: A Versatile and Flexible Accelerator for Generic Graph Neural Networks
Jiaqi Yang (George Washington University); Hao Zheng (University of Central Florida); Ahmed Louri (George Washington University)
-
cuKE: An Efficient Code Generator for Score Function Computation in Knowledge Graph Embedding
Lihan Hu (The University of Iowa); Jing Li (Nvidia); Peng Jiang (The University of Iowa)
-
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference
Jinghan Yao, Quentin Anthony, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda (The Ohio State University)
-
TASER: Temporal Adaptive Sampling for Fast and Accurate Dynamic Graph Representation Learning
Gangda Deng, Hongkuan Zhou (University of Southern California); Hanqing Zeng, Yinglong Xia, Christopher Leung, Jianbo Li (Meta); Rajgopal Kannan (DEVCOM US Army Research Lab); Viktor Prasanna (University of Southern California)
Session 8B: Performance Optimization
Session Chair: TBA
-
OpenFFT-SME: An Efficient Outer Product Pattern FFT Library on ARM SME CPUs
Ruge Zhang, Haipeng Jia, Yunquan Zhang (Institute of Computing Technology, Chinese Academy of Sciences); Baicheng Yan, Penghao Ma, Long Wang (Huawei Technologies Co. Ltd); Wenxuan Zhao (Institute of Computing Technology, Chinese Academy of Sciences)
-
Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures
Evangelos Georganas, Kirill Voronin, Abhisek Kundu (Intel Corporation); Antonio Noack (Friedrich Schiller Universität Jena); Dhiraj Kalamkar (Intel Corporation); Alexander Breuer (Friedrich Schiller Universität Jena); Alexander Heinecke, Hans Pabst (Intel Corporation)
-
Optimizing General Matrix Multiplications on Modern Multi-core DSPs
Kainan Yu, Xinxin Qi, Peng Zhang, Jianbin Fang, Dezun Dong, Ruibo Wang, Tao Tang, Chun Huang, Yonggang Che (National University of Defense Technology); Zheng Wang (Northwest University)
-
Machine-Learning-Driven Runtime Optimization of BLAS Level 3 on Modern Multi-Core Systems
Yufan Xia (The Chinese University of Hong Kong); Giuseppe Maria Junior Barca (The University of Melbourne)
|
Afternoon Break 2:50 PM - 3:30 PM |
Parallel Technical
Sessions 9A & 9B
3:30 PM – 4:50 PM |
Session 9A: Distributed Algorithms
Session Chair: TBA
-
Time-Color Tradeoff on Uniform Circle Formation by Asynchronous Robots
Debasish Pattanayak (Carleton University); Gokarna Sharma (Kent State University)
-
LightDAG: A Low-latency DAG-based BFT Consensus through Lightweight Broadcast
Xiaohai Dai, Guanxiong Wang, Jiang Xiao, Zhengxuan Guo (Huazhong University of Science and Technology); Rui Hao (Nanjing University); Xia Xie (Hainan University); Hai Jin (Huazhong University of Science and Technology)
-
MAAD: A Distributed Anomaly Detection Architecture for Microservices Systems
Rongyuan Tan, Zhuozhao Li (Southern University of Science and Technology)
-
OneShot: View-Adapting Streamlined BFT Protocols with Trusted Execution Environments
Jeremie Decouchant (Delft University of Technology); David Kozhaya (ABB Research); Vincent Rahli (University of Birmingham); Jiangshan Yu (Monash University)
Session 9B: Graph Algorithms
Session Chair: TBA
-
Practically Tackling Memory Bottlenecks of Graph-Processing Workloads
Alexandre Valentin Jamet (Universitat Politecnica de Catalunya); Georgios Vavouliotis (Huawei Zurich Research Center); Daniel A. Jiménez (Texas A&M University); Lluc Alvarez (Barcelona Supercomputing Center); Marc Casas (Universitat Politecnica de Catalunya)
-
GCSM: GPU-Accelerated Continuous Subgraph Matching for Large Graphs
Yihua Wei, Peng Jiang (The University of Iowa)
-
Parallel Derandomization for Coloring
Sam Coy, Artur Czumaj (University of Warwick); Peter Davies (Durham University); Gopinath Mishra (National University of Singapore)
-
A Comparative Study of Intersection-Based Triangle Counting Algorithms on GPUs
Jiangbo Li, Zichen Xu (The Nanchang University); Minh Pham, Yicheng Tu (University of South Florida); Qi Zhou (City University of Macau)
|
|
Main Conference Closing Session
Details to be announced |
FRIDAY - 31 May 2024
FRIDAY
Workshops
ALL DAY
See each individual
workshop program
for schedule details
|