The open archive for STFC research publications

Full Record Details

Persistent URL http://purl.org/net/epubs/work/12240374
Record Status Checked
Record Id 12240374
Title Pivoting for Matrix Factorizations on Manycore Architectures
Abstract The use of direct methods for the solution of linear systems through matrix factorizations (e.g. Gaussian elimination) is a staple of many numerical methods. Of particular interest in optimization is the sparse symmetric indefinite factorization used at the core of many interior-point methods. In the numerical analysis of such methods, the major technique for ensuring stability is known as pivoting. This involves the use of row and column interchanges to avoid division by small numbers. Whilst there are are alternatives in the form of FFT structured scalings, these are unsuitable for use in the sparse case as they cause the problem to become fully dense. Meanwhile, the efficient use of the growing number of cores and length of vector units on commodity CPUs is becoming more important. Today's manycore accelerators offer a means to evaluate algorithms for future CPU architectures. Whilst high parallel efficiencies have been obtained using unpivoted factorizations on GPUs (e.g.\ MAGMA), the same cannot be said for factorizations that employ pivoting. There are two non-trivial problems that arise in the implementation of pivoted factorizations on such architectures that do not arise in their unpivoted cousins. The first problem is that as the number of processors increases, the communication costs of finding pivot candidates with traditional algorithms becomes asymptotically dominant. In particular, the latency of communicating pivoting decisions can limit the number of processors that can be effectively used. A number of communication-avoiding algorithms have been proposed to reduce the number of messages from $O(n^2)$ to $O(n\log n)$. These methods include variants of tournament pivoting, a posterori threshold pivoting and subset pivoting. The second problem arises after pivoting decisions have been made. The pivoting decisions may require the permutation of matrix rows and columns. Given that to exploit maximum parallelism, these rows and columns may have been updated to different degrees, a method to effect these permutations in an efficient yet maintainable fashion is not readily apparent, and little work has been published on this to date. Indeed, the received wisdom on the GPU is that it cannot be done efficiently. In this talk we describe our practical experience implementing different variants of a symmetric indefinite solver using a range of mechanisms to tackle these problems on manycore accelerators. We also briefly discuss their possible translation to current and future desktop architectures.
Organisation STFC , SCI-COMP , SCI-COMP-CM
Funding Information
Related Research Object(s):
Licence Information:
Language English (EN)
Type Details URI(s) Local file(s) Year
Presentation Presented at 4th IMA Conference on Numerical Linear Algebra and Optimisation , Birmingham, UK, 3-5 Sep 2014. manycore_pivot.pdf 2014