GPU-Cluster Computing

Summary

Relatively inexpensive scientific PC clusters have become very popular in recent years and already dominate the TOP500 list of the fastest computers. Each node of such a cluster can be enhanced with a powerful graphics card, forming a GPU-cluster. This increases the peak performance of the system enormously without putting much strain on space or cooling constraints. However, programming parallel computers is already a demanding task. Including GPU components in the nodes of a parallel computer not only requires a different programming model for these devices but also creates a heterogeneous hardware system. This project explores the efficient utilization of such heterogeneous systems for scientific computing.

The focus is on both high performance and high productivity. We enhance the FEM solver package FEAST with GPU functionality in a minimally invasive fashion, with less than 1% of the code base being affected [1]. Moreover, applications based on FEAST can benefit from the GPU acceleration without any code changes. We explore the large-scale scalability [2] and the practical benefits and limits [3] of this approach in detail.
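
The idea of accelerating a solver package without touching application code can be illustrated with a small backend-dispatch sketch. This is not FEAST's actual interface: the names `jacobi_host`, `jacobi_coprocessor`, `BACKENDS` and `solve_subdomain` are hypothetical, the model problem (1D Poisson with a damped Jacobi iteration) is chosen only for brevity, and the "co-processor" backend simply delegates to the host routine so that the sketch runs on any machine.

```python
from typing import Callable, Dict, List

Vector = List[float]
LocalSmoother = Callable[[Vector, Vector, int], Vector]


def jacobi_host(x: Vector, b: Vector, sweeps: int) -> Vector:
    """Damped Jacobi sweeps for the 1D Poisson stencil [-1, 2, -1] on the host."""
    omega = 2.0 / 3.0  # damping factor
    n = len(x)
    for _ in range(sweeps):
        new = x[:]
        for i in range(n):
            left = x[i - 1] if i > 0 else 0.0       # zero Dirichlet boundary
            right = x[i + 1] if i < n - 1 else 0.0  # zero Dirichlet boundary
            residual = b[i] - (2.0 * x[i] - left - right)
            new[i] = x[i] + omega * residual / 2.0  # divide by the diagonal entry
        x = new
    return x


def jacobi_coprocessor(x: Vector, b: Vector, sweeps: int) -> Vector:
    """Hypothetical stand-in for an accelerated backend with the same interface.

    A real backend would upload x and b to the device, run the sweeps there,
    and download the result. Here it delegates to the host routine.
    """
    return jacobi_host(x, b, sweeps)


# The only place that knows which backends exist (assumed design, not FEAST's).
BACKENDS: Dict[str, LocalSmoother] = {
    "host": jacobi_host,
    "coprocessor": jacobi_coprocessor,
}


def solve_subdomain(b: Vector, backend: str = "host", sweeps: int = 200) -> Vector:
    """Application-facing entry point; its signature is independent of the backend."""
    smoother = BACKENDS[backend]
    return smoother([0.0] * len(b), b, sweeps)
```

In this sketch the application only ever calls `solve_subdomain`; switching the `backend` argument (which could equally come from a configuration file) changes where the local work runs, while the calling code stays the same.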

Figures

Bandwidth in a typical GPU-node. Algorithms executing on the GPU-cluster must be able to tolerate the enormous discrepancy between the bandwidth on the co-processor board and the bandwidth from board to board, which has to pass through the main memory of the hosts.
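
A back-of-the-envelope model makes this discrepancy concrete. The sketch below assumes a simple store-and-forward path (device to host, host to host, host to device) with no overlap between the stages. The function names are hypothetical, and the bandwidth values in the usage note are round placeholders, not measurements from the figure.

```python
from typing import List


def transfer_time(num_bytes: float, bandwidths_gb_s: List[float]) -> float:
    """Seconds needed to push num_bytes through consecutive links, one after another.

    Assumes store-and-forward: each link is traversed completely before the
    next one starts, so the per-link times add up.
    """
    return sum(num_bytes / (bw * 1e9) for bw in bandwidths_gb_s)


def effective_bandwidth(bandwidths_gb_s: List[float]) -> float:
    """Effective bandwidth in GB/s of links traversed in sequence (harmonic combination)."""
    return 1.0 / sum(1.0 / bw for bw in bandwidths_gb_s)


if __name__ == "__main__":
    # Placeholder values in GB/s, chosen only to illustrate the calculation.
    on_board = 50.0                 # assumed bandwidth on the co-processor board
    board_to_board = [2.0, 1.0, 2.0]  # assumed: device->host, host->host, host->device

    path_bw = effective_bandwidth(board_to_board)
    print(f"effective board-to-board bandwidth: {path_bw} GB/s")
    print(f"on-board / board-to-board ratio: {on_board / path_bw}")
    print(f"time to move 1 GB between boards: {transfer_time(1e9, board_to_board)} s")
```

Under these assumptions the slowest link dominates the path, which is why algorithms benefit from keeping data on the board for as long as possible and exchanging only small amounts between nodes.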

Displacements and von Mises stress of an object under load, computed with FeastSolid on a heterogeneous 16-node cluster using GPUs as scientific co-processors; no code changes, equal accuracy, 2.6x speedup.

Publications

  1. Dominik Göddeke, Hilmar Wobker, Robert Strzodka, Jamaludin Mohd-Yusof, Patrick S. McCormick and Stefan Turek
    Co-processor acceleration of an unmodified parallel solid mechanics code with FEASTGPU
    Int. J. Comput. Sci. Eng., 4(4), 254–269, 2009
    @article{DBLP:journals/ijcse/GoddekeWSMMT09,
      author = {G{\"{o}}ddeke, Dominik and Wobker, Hilmar and Strzodka, Robert and Mohd{-}Yusof, Jamaludin and McCormick, Patrick S. and Turek, Stefan},
      title = {Co-processor acceleration of an unmodified parallel solid mechanics
                        code with {FEASTGPU}},
      journal = {Int. J. Comput. Sci. Eng.},
      volume = {4},
      number = {4},
      pages = {254--269},
      year = {2009},
      url = {https://doi.org/10.1504/IJCSE.2009.029162},
      doi = {10.1504/IJCSE.2009.029162},
      timestamp = {Mon, 05 Feb 2024 00:00:00 +0100},
    }
    
  2. Dominik Göddeke, Robert Strzodka, Jamaludin Mohd-Yusof, Patrick S. McCormick, Hilmar Wobker, Christian Becker and Stefan Turek
    Using GPUs to improve multigrid solver performance on a cluster
    Int. J. Comput. Sci. Eng., 4(1), 36–55, 2008
    @article{DBLP:journals/ijcse/GoddekeSMMWBT08,
      author = {G{\"{o}}ddeke, Dominik and Strzodka, Robert and Mohd{-}Yusof, Jamaludin and McCormick, Patrick S. and Wobker, Hilmar and Becker, Christian and Turek, Stefan},
      title = {Using GPUs to improve multigrid solver performance on a cluster},
      journal = {Int. J. Comput. Sci. Eng.},
      volume = {4},
      number = {1},
      pages = {36--55},
      year = {2008},
      url = {https://doi.org/10.1504/IJCSE.2008.021111},
      doi = {10.1504/IJCSE.2008.021111},
      timestamp = {Mon, 05 Feb 2024 00:00:00 +0100},
    }
    
  3. Dominik Göddeke, Robert Strzodka, Jamaludin Mohd-Yusof, Patrick S. McCormick, Sven H. M. Buijssen, Matthias Grajewski and Stefan Turek
    Exploring weak scalability for FEM calculations on a GPU-enhanced cluster
    Parallel Comput., 33(10-11), 685–699, 2007
    @article{DBLP:journals/pc/GoddekeSMMBGT07,
      author = {G{\"{o}}ddeke, Dominik and Strzodka, Robert and Mohd{-}Yusof, Jamaludin and McCormick, Patrick S. and Buijssen, Sven H. M. and Grajewski, Matthias and Turek, Stefan},
      title = {Exploring weak scalability for {FEM} calculations on a GPU-enhanced
                        cluster},
      journal = {Parallel Comput.},
      volume = {33},
      number = {10-11},
      pages = {685--699},
      year = {2007},
      url = {https://doi.org/10.1016/j.parco.2007.09.002},
      doi = {10.1016/J.PARCO.2007.09.002},
      timestamp = {Mon, 05 Feb 2024 00:00:00 +0100},
    }
    
