The open archive for STFC research publications

Full Record Details

Persistent URL http://purl.org/net/epubs/work/63053
Record Status Checked
Record Id 63053
Title A fast triangular solve on GPUs
Abstract The level 2 BLAS operation trsv performs a dense triangular solve and is often used in the solve phase of a direct solver following a matrix factorization. With the advent of manycore architectures, this memory-bound kernel is becoming increasingly important, particularly for sparse direct solvers used in optimization applications. In this paper, a high-performance implementation of the triangular solve is developed through a careful analysis of theoretical and practical bounds on the possible performance. This implementation outperforms CUBLAS by a factor of 5 to 15.
Organisation CSE-NAG, STFC, SCI-COMP
Funding Information
Related Research Object(s): 66081, 62933
Licence Information:
Language English (EN)
Type | Details | URI(s) | Local file(s) | Year
Report | RAL Preprints RAL-P-2012-002. 2012. | | RAL-P-2012-002.pdf | 2012
Journal Article | SIAM J Sci Comput 35, no. 3 (2013): C303-C322. | doi:10.1137/12088358X | 88358.pdf | 2013
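
For readers unfamiliar with the operation named in the abstract, the following is a minimal sequential sketch of a dense lower-triangular solve by forward substitution. It is a plain CPU reference in Python, not the GPU implementation described in the paper; the function name `trsv_lower` and the list-of-rows matrix layout are assumptions made for illustration only.

```python
def trsv_lower(L, b):
    """Solve L x = b, where L is a dense lower-triangular matrix given as a list of rows.

    Illustrative reference only (hypothetical helper, not the paper's GPU kernel).
    Assumes every diagonal entry L[i][i] is nonzero.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # Start from the right-hand side and subtract the contributions
        # of the unknowns x[0..i-1] that have already been computed.
        s = b[i]
        for j in range(i):
            s -= L[i][j] * x[j]
        x[i] = s / L[i][i]
    return x


if __name__ == "__main__":
    L = [[2.0, 0.0, 0.0],
         [1.0, 3.0, 0.0],
         [4.0, 5.0, 6.0]]
    b = [2.0, 7.0, 32.0]
    print(trsv_lower(L, b))
```

Each stored matrix entry is read once and used for roughly one multiply and one subtract, which is why the abstract describes the kernel as memory-bound. Each unknown also depends on all earlier ones, so the loop over `i` cannot be parallelized naively.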