Automated Partitioning of CUDA Kernels for Multi-GPU Systems
- Wednesday, 4 September 2024, 13:00
- INF 368, R.531
- Lorenz Braun
Address
INF 368
R.531
Organizer
Dean
Event Type
Doctoral Examination
This work shows the feasibility of automated partitioning of CUDA kernels for multi-GPU systems. The problem is approached by modeling the compute graph of selected applications. With the help of a simulator, models are derived that predict well-performing partitionings. Accurate prediction of GPU kernel runtime is supported by a new compiler-assisted method for profiling GPU kernels. The profiler's GPU-independent metrics are used to develop a methodology for predicting kernel runtime and power usage. With this methodology, the execution time and power usage of kernels from four GPU benchmark suites are modeled on five different GPUs.