Automated Partitioning of CUDA Kernels for Multi-GPU Systems

  • Wednesday, 4 September 2024, 13:00
  • INF 368, R.531
  • Lorenz Braun

This work demonstrates the feasibility of automatically partitioning CUDA kernels for multi-GPU systems. The problem is approached by modeling the compute graphs of selected applications; with the help of a simulator, models are derived that predict performant partitionings. Accurately predicting GPU kernel runtime is supported by a new compiler-assisted method for profiling GPU kernels. The profiler's GPU-independent metrics are used to develop a methodology for predicting kernel runtime and power usage. Using this methodology, four GPU benchmark suites are employed to model the runtime and power usage of GPU kernels on five different GPUs.
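As a rough illustration only, the sketch below shows how GPU-independent kernel metrics could be fitted against measured runtimes with a simple least-squares model and then used to predict the runtime of an unseen kernel on a given GPU. The feature names, the numbers, and the linear model itself are assumptions made for this sketch; they are not the methodology presented in the talk.

```python
# Illustrative sketch only: a linear least-squares model mapping hypothetical
# GPU-independent kernel metrics to measured runtimes on one target GPU.
import numpy as np

# Each row: [fp32 instructions, global loads, global stores, synchronizations]
# counted per kernel launch (hypothetical GPU-independent metrics).
metrics = np.array([
    [1.2e9, 4.0e7, 2.0e7, 1.0e3],
    [3.5e8, 9.0e6, 4.5e6, 2.0e2],
    [2.1e9, 7.5e7, 3.8e7, 5.0e3],
    [8.0e8, 2.5e7, 1.2e7, 8.0e2],
])
# Measured kernel runtimes in milliseconds on the target GPU (made-up values).
runtime_ms = np.array([4.8, 1.3, 8.9, 3.1])

# Fit runtime ~ metrics @ w + b by ordinary least squares.
X = np.hstack([metrics, np.ones((metrics.shape[0], 1))])  # append bias column
w, *_ = np.linalg.lstsq(X, runtime_ms, rcond=None)

# Predict the runtime of an unseen kernel from its metrics alone.
new_kernel = np.array([1.0e9, 3.0e7, 1.5e7, 9.0e2, 1.0])  # last entry = bias
print(f"predicted runtime: {new_kernel @ w:.2f} ms")
```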