The open archive for STFC research publications

Full Record Details

Persistent URL http://purl.org/net/epubs/work/65073
Record Status Checked
Record Id 65073
Title Parallel Performance of Fast Fourier Transform Routines in PRACE
Abstract The Fast Fourier Transform (FFT) is one of the most widely used and useful algorithms in engineering and scientific applications, and its analysis and performance on large-scale computing platforms are therefore of much importance to a range of research fields. In computational fluid dynamics applications, fast and efficient FFTs enable ever larger direct numerical simulations and large-eddy simulations, in which Reynolds numbers can approach those found in reality. Under the European Community's Seventh Framework Programme, the PRACE [1] `Tier-0' systems [2,3], with parallel computing environments providing a great deal of processing power (either through large numbers of CPU cores or the provision of computational accelerators such as GPUs), have been made available to high-end computing researchers and code developers. Recently, high-end computing resources (IBM Blue Gene/Q) have also been made available to researchers in the UK through the Hartree Centre at STFC Daresbury Laboratory [4]. This paper analyses parallel three-dimensional FFT performance on these high-end resources using routines from the numerical libraries FFTW [5], FFTE [6] and DAFT [7]. The FFT implementations investigated range from pure MPI versions to hybrid MPI-OpenMP approaches that can exploit simultaneous multithreading features on multicore architectures. Alternative three-dimensional data distributions, such as slab, pencil and block, are also investigated to assess their impact on parallel performance. The paper extends previous work by testing the various FFT methods on the large datasets often used in simulations involving the High-Performance Solver for Turbulence and Aeroacoustic Research (HiPSTAR), developed at the University of Southampton, UK [8]. The paper presents, compares and analyses performance results from benchmark runs undertaken on the three architectures listed above.
The authors conclude that although new implementations and techniques can now extend performance scalability to several thousand cores, parallel scalability is ultimately limited by the all-to-all nature of the underlying communications.
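As a single-process illustration (not code from the paper), the slab decomposition mentioned in the abstract can be sketched with NumPy: each "rank" holds a slab of x-planes and transforms its y-z planes locally, then a global transpose (the MPI all-to-all step that the authors identify as the scalability limit) redistributes the data so the remaining 1D transforms can be done locally. The function name and structure below are hypothetical, chosen only to mirror the three stages described.

```python
import numpy as np

def fft3d_slab(data, n_ranks):
    """Slab-decomposed 3D FFT sketch (serial emulation of the parallel steps).

    In a real MPI code each slab lives on a different rank and the
    recombination stage is an MPI_Alltoall; here everything runs in
    one process so the result can be checked against np.fft.fftn.
    """
    nx, ny, nz = data.shape
    assert nx % n_ranks == 0, "slab count must divide the first dimension"

    # Stage 1: each "rank" owns nx/n_ranks x-planes and performs
    # 2D FFTs over its local y-z planes.
    slabs = np.split(data, n_ranks, axis=0)
    slabs = [np.fft.fftn(s, axes=(1, 2)) for s in slabs]

    # Stage 2: global transpose -- the all-to-all communication step
    # in a distributed implementation.
    recombined = np.concatenate(slabs, axis=0)

    # Stage 3: 1D FFTs along the remaining (x) axis, which are local
    # again after the transpose; done in one call here for brevity.
    return np.fft.fft(recombined, axis=0)
```

Because the 3D FFT is separable, the three stages reproduce a direct `np.fft.fftn(data)` exactly; the pencil and block distributions discussed in the paper differ only in how the axes are partitioned and how many transpose steps are needed.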
Language English (EN)
Type Paper In Conference Proceedings
Details In Third International Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering (PARENG2013), Pecs, Hungary, 25-27 March 2013. Computational Technology Resources, ISSN 1759-3433. The third in a series of conferences concerned with new developments and applications of high performance computing (including parallel, distributed, grid and cloud computing) in engineering. doi:10.4203/ccp.101.14
Year 2013