On solving separable block tridiagonal linear systems using a GPU implementation of radix-4 PSCR method
Myllykoski, M., Rossi, T., & Toivanen, J. (2018). On solving separable block tridiagonal linear systems using a GPU implementation of radix-4 PSCR method. Journal of Parallel and Distributed Computing, 115(May), 56-66. https://doi.org/10.1016/j.jpdc.2018.01.004
Published in
Journal of Parallel and Distributed Computing
Date
2018
Copyright
© 2018 Elsevier Inc. This is a final draft version of an article whose final and definitive form has been published by Elsevier Inc. Published in this repository with the kind permission of the publisher.
The partial solution variant of the cyclic reduction (PSCR) method is a direct solver that can be applied to certain types of separable block tridiagonal linear systems. Such linear systems arise, e.g., from the Poisson and the Helmholtz equations discretized with bilinear finite elements. Furthermore, the separability of the linear system entails that the discretization domain has to be rectangular and the discretization mesh orthogonal. A generalized graphics processing unit (GPU) implementation of the PSCR method is presented. The numerical results indicate up to 24-fold speedups when compared to an equivalent CPU implementation that utilizes a single CPU core. The attained floating-point performance is analyzed using the roofline performance analysis model, and the resulting models show that it is mainly limited by the off-chip memory bandwidth and the effectiveness of the tridiagonal solver used to solve the arising tridiagonal subproblems. The performance is improved using off-line autotuning techniques.
...
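The roofline model mentioned in the abstract bounds attainable floating-point performance by the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity. The following is a minimal sketch of that bound, not the authors' analysis: the function name `roofline_bound` and all device and kernel figures are hypothetical and chosen only for illustration.

```python
def roofline_bound(peak_gflops, bandwidth_gbs, flops, bytes_moved):
    """Return (arithmetic intensity, attainable GFLOP/s) under the basic roofline model."""
    # Arithmetic intensity: floating-point operations per byte of off-chip traffic.
    intensity = flops / bytes_moved
    # The kernel cannot exceed either the compute roof or the memory roof.
    attainable = min(peak_gflops, bandwidth_gbs * intensity)
    return intensity, attainable


# Hypothetical figures (assumptions, not taken from the article):
# a device with 1000 GFLOP/s peak and 200 GB/s off-chip bandwidth,
# and a kernel performing 8e9 FLOP while moving 32e9 bytes.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 200.0

intensity, attainable = roofline_bound(PEAK_GFLOPS, BANDWIDTH_GBS, 8.0e9, 32.0e9)
memory_bound = attainable < PEAK_GFLOPS

print(f"intensity = {intensity:.2f} FLOP/byte")
print(f"attainable = {attainable:.1f} GFLOP/s, memory-bound: {memory_bound}")
```

With a low arithmetic intensity such as the one assumed here, the bandwidth term is the smaller of the two, which is the situation the abstract describes as being limited by off-chip memory bandwidth.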
Publisher
Academic Press
ISSN
0743-7315
Publication in research information system
https://converis.jyu.fi/converis/portal/detail/Publication/27908455
Funder(s)
Academy of Finland
Funding program(s)
Academy Project, AoF
Additional information about funding
The research of the first author was supported by the Academy of Finland [grant number 252549]; the Jyväskylä Doctoral Program in Computing and Mathematical Sciences; and the Foundation of Nokia Corporation (Project number 201510310). The research of the third author was supported by the Academy of Finland [grant numbers 252549, 295897].
Related items
Showing items with a similar title or keywords.
- Implementation techniques for the lattice Boltzmann method
  Mattila, Keijo (University of Jyväskylä, 2010)
- On GPU-accelerated fast direct solvers and their applications in image denoising
  Myllykoski, Mirko (University of Jyväskylä, 2015)
- Fast Poisson solvers for graphics processing units
  Myllykoski, Mirko; Rossi, Tuomo; Toivanen, Jari (Springer, 2013)
  Two block cyclic reduction linear system solvers are considered and implemented using the OpenCL framework. The topics of interest include a simplified scalar cyclic reduction tridiagonal system solver and the impact ...
- Lyapunov quantities and limit cycles in two-dimensional dynamical systems : analytical methods, symbolic computation and visualization
  Kuznetsova, Olga (University of Jyväskylä, 2011)
- Stability and oscillation of dynamical systems : theory and applications
  Kuznetsov, Nikolay V. (University of Jyväskylä, 2008)
Unless otherwise specified, publicly available JYX metadata (excluding abstracts) may be freely reused under the CC0 license.