On solving separable block tridiagonal linear systems using a GPU implementation of radix-4 PSCR method
Myllykoski, M., Rossi, T., & Toivanen, J. (2018). On solving separable block tridiagonal linear systems using a GPU implementation of radix-4 PSCR method. Journal of Parallel and Distributed Computing, 115(May), 56-66. https://doi.org/10.1016/j.jpdc.2018.01.004
Published in: Journal of Parallel and Distributed Computing
© 2018 Elsevier Inc. This is a final draft version of an article whose final and definitive form has been published by Elsevier Inc. Published in this repository with the kind permission of the publisher.
The partial solution variant of the cyclic reduction (PSCR) method is a direct solver that can be applied to certain types of separable block tridiagonal linear systems. Such linear systems arise, e.g., from the Poisson and the Helmholtz equations discretized with bilinear finite elements. Furthermore, the separability of the linear system entails that the discretization domain has to be rectangular and the discretization mesh orthogonal. A generalized graphics processing unit (GPU) implementation of the PSCR method is presented. The numerical results indicate up to 24-fold speedups when compared to an equivalent CPU implementation that utilizes a single CPU core. The attained floating point performance is analyzed using the roofline performance analysis model, and the resulting models show that it is mainly limited by the off-chip memory bandwidth and the effectiveness of the tridiagonal solver used to solve the arising tridiagonal subproblems. The performance is accelerated using off-line autotuning techniques. ...
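The roofline analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes the basic form of the roofline model, in which attainable performance is bounded by the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity. The function names and the device figures below are hypothetical placeholders chosen for illustration; they are not taken from the article.

```python
def roofline_bound(peak_gflops: float, bandwidth_gbs: float,
                   intensity_flop_per_byte: float) -> float:
    """Upper bound on attainable GFLOP/s under the basic roofline model."""
    if min(peak_gflops, bandwidth_gbs, intensity_flop_per_byte) < 0:
        raise ValueError("inputs must be non-negative")
    # A kernel is limited either by compute or by memory traffic.
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


def ridge_point(peak_gflops: float, bandwidth_gbs: float) -> float:
    """Arithmetic intensity (FLOP/byte) at which the two limits meet."""
    return peak_gflops / bandwidth_gbs


# Hypothetical device figures (assumptions, not values from the article).
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 200.0

if __name__ == "__main__":
    ridge = ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS)
    for intensity in (0.25, 1.0, 5.0, 20.0):
        bound = roofline_bound(PEAK_GFLOPS, BANDWIDTH_GBS, intensity)
        regime = "memory-bound" if intensity < ridge else "compute-bound"
        print(f"{intensity:6.2f} FLOP/byte: at most {bound:7.1f} GFLOP/s ({regime})")
```

In this simplified picture, a kernel whose arithmetic intensity lies below the ridge point is limited by off-chip memory bandwidth, which is the regime the abstract reports as the main limitation.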
Related funder(s): Research Council of Finland
Funding program(s): Academy Project, AoF
Additional information about funding: The research of the first author was supported by the Academy of Finland [grant number 252549]; the Jyväskylä Doctoral Program in Computing and Mathematical Sciences; and the Foundation of Nokia Corporation (Project number 201510310). The research of the third author was supported by the Academy of Finland [grant numbers 252549, 295897].
Showing items with similar title or keywords.
Mattila, Keijo (University of Jyväskylä, 2010)
Lyapunov quantities and limit cycles in two-dimensional dynamical systems: analytical methods, symbolic computation and visualization. Kuznetsova, Olga (University of Jyväskylä, 2011)
Myllykoski, Mirko (University of Jyväskylä, 2015)
Kuznetsov, Nikolay V. (University of Jyväskylä, 2008)
Efficient Bayesian generalized linear models with time-varying coefficients: The walker package in R. Helske, Jouni (Elsevier BV, 2022). The R package walker extends standard Bayesian general linear models to the case where the effects of the explanatory variables can vary in time. This allows, for example, to model the effects of interventions such as ...