In practice, neither the solvers with the best numerical convergence nor those with the best parallel efficiency give the fastest solution of PDE problems. The fastest solvers require a delicate balance between numerical and hardware characteristics. By balancing both aspects, we can parallelize strong sequential preconditioners with large parallel speedup and hardly any loss in numerical performance. In this way, GMG can also solve ill-conditioned systems.
Mixed Precision Methods
★Acceleration of unmodified legacy code on GPU-clusters★ A single GPU already offers two levels of parallelism, but, as with CPUs, the demand for higher performance and larger problem sizes leads to the use of GPU-clusters, in which every cluster node is equipped with GPUs. This adds intra-node and inter-node parallelism as two further levels. The main challenges for these heterogeneous systems are the enormous discrepancy in bandwidth between the two finer and the two coarser levels of parallelism, and their integration into legacy code.
★Double accuracy with single precision GPUs and FPGAs★ To obtain a result of high accuracy it is not necessary to compute all intermediate results with high precision. Mixed precision methods apply high precision computations only where necessary and save space or time without decreasing the accuracy of the final solution.
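One common way to realize this idea is iterative refinement, sketched below under several assumptions: the expensive solve runs in single precision, while the residual and the solution update are kept in double precision. The function names `poisson_1d` and `mixed_precision_solve`, the 1D Poisson test matrix, and the tolerance are illustrative choices and are not taken from the project described here; NumPy on a CPU stands in for the single precision GPU or FPGA device.

```python
import numpy as np


def poisson_1d(n):
    """Tridiagonal 1D Poisson matrix (2 on the diagonal, -1 next to it), in float64."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)


def mixed_precision_solve(A, b, tol=1e-12, max_iter=50):
    """Iterative refinement: float32 inner solves, float64 residuals and updates.

    Returns the approximate solution and the number of correction steps used.
    """
    A32 = A.astype(np.float32)            # low precision copy used by the inner solver
    x = np.zeros_like(b, dtype=np.float64)
    b_norm = np.linalg.norm(b)
    for iteration in range(max_iter):
        r = b - A @ x                     # residual in double precision
        if np.linalg.norm(r) <= tol * b_norm:
            return x, iteration
        # Correction equation A d = r solved in single precision.
        # (Illustrative only: a real code would factorize A32 once and reuse it.)
        d = np.linalg.solve(A32, r.astype(np.float32))
        x += d.astype(np.float64)         # accumulate the update in double precision
    return x, max_iter


if __name__ == "__main__":
    n = 50
    A = poisson_1d(n)
    x_true = 1.0 / np.arange(1, n + 1)
    b = A @ x_true

    x_mixed, steps = mixed_precision_solve(A, b)
    x_single = np.linalg.solve(A.astype(np.float32), b.astype(np.float32))

    def rel_err(x):
        return np.linalg.norm(x - x_true) / np.linalg.norm(x_true)

    print("correction steps:", steps)
    print("relative error, single precision only:", rel_err(x_single))
    print("relative error, mixed precision:      ", rel_err(x_mixed))
```

In this sketch only the cheap operations (one matrix-vector product and one vector update per step) run in double precision, so most of the arithmetic and memory traffic stays in single precision. The approach relies on the system being well enough conditioned for the single precision solve to deliver a useful correction.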