The open archive for STFC research publications

Full Record Details

Persistent URL http://purl.org/net/epubs/work/12153111
Record Status Checked
Record Id 12153111
Title Strategies for I/O-bound applications at scale
Abstract In the environmental sciences, the useful product of running a simulation code is usually one or more files containing four-dimensional 'history data' - time series of various three-dimensional fields on the model domain (e.g. temperature, pressure, etc.). Writing this data to disk is increasingly seen as the major bottleneck in running these types of simulation, and the situation is getting worse: as machine core counts increase, scientists take the opportunity to run higher-resolution models, and thus each 'frame' of a time series consists of more data. In this work we investigate the I/O performance of a typical I/O-bound code, the Weather Research and Forecasting (WRF) Model. As machine manufacturers continue the push towards exascale, minimising power consumption is a major requirement. We are therefore seeing a proliferation of relatively low-power processing cores, often sharing connections to high-performance, parallel file systems. IBM's BlueGene Q is typical of this approach, with a few hundred cores sharing a separate I/O node; actual numbers depend upon machine configuration. It is also the case that the number of processing cores per node is increasing faster than the quantity of memory on a compute node. WRF has a well-defined API for its I/O layer, which has enabled several different implementations of that layer. In the default implementation, all data is gathered to the master process before being written via the standard, serial NetCDF library. Although simple, this approach is limited by the amount of memory available to the master process and makes poor use of any parallel file system. In contrast, in the layer based on pNetCDF (parallel NetCDF), every process writes its own part of the data. In addition to these layers, WRF has the ability to reserve a set of processes for performing parallel I/O. Once data is offloaded to these processes, the computation is able to proceed without blocking on the I/O operations themselves.
All of these layers generate files in NetCDF, a popular format within the scientific community. However, WRF is also capable of outputting data in the alternative GRIB2 format, which results in considerably smaller data files. We investigate the performance of all of these options on the BlueGene Q architecture, and consider the implications as I/O-bound applications continue to be pushed towards the exascale.
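The contrast the abstract draws between the default layer (gather everything to the master, then write serially) and the pNetCDF-based layer (every process writes its own part) can be illustrated with a minimal sketch. This is not WRF code and uses neither MPI nor any NetCDF library: the "ranks" are simulated in a single Python process, the output is raw binary rather than NetCDF, and every name here (`slab_bounds`, `write_gathered`, `write_per_rank`, `demo`) is hypothetical. It assumes a simple one-dimensional block decomposition of a 2-D field by rows, which may differ from the decomposition WRF actually uses.

```python
import os
import tempfile
from array import array


def slab_bounds(n_rows, n_ranks, rank):
    """Rows [start, stop) owned by `rank` in a 1-D block decomposition."""
    base, extra = divmod(n_rows, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def write_gathered(path, slabs):
    """Gather-style: the master concatenates every slab, then writes once.

    Returns the number of bytes the master has to hold in memory.
    """
    full = array("d")
    for slab in slabs:
        full.extend(slab)
    with open(path, "wb") as f:
        full.tofile(f)
    return len(full) * full.itemsize


def write_per_rank(path, slabs, n_rows, n_cols):
    """Parallel-style: each rank writes only its own slab at its own offset.

    Returns the largest number of bytes any single rank has to hold.
    """
    itemsize = array("d").itemsize
    n_ranks = len(slabs)
    # Pre-size the shared file so every rank can seek to its offset.
    with open(path, "wb") as f:
        f.truncate(n_rows * n_cols * itemsize)
    peak = 0
    for rank, slab in enumerate(slabs):  # stands in for concurrent ranks
        start, _ = slab_bounds(n_rows, n_ranks, rank)
        with open(path, "r+b") as f:
            f.seek(start * n_cols * itemsize)
            slab.tofile(f)
        peak = max(peak, len(slab) * itemsize)
    return peak


def demo(n_rows=10, n_cols=4, n_ranks=3):
    """Write the same field both ways; compare memory need and file contents."""
    field = array("d", (float(i) for i in range(n_rows * n_cols)))
    slabs = []
    for rank in range(n_ranks):
        start, stop = slab_bounds(n_rows, n_ranks, rank)
        slabs.append(field[start * n_cols:stop * n_cols])
    with tempfile.TemporaryDirectory() as tmp:
        gathered = os.path.join(tmp, "gathered.bin")
        per_rank = os.path.join(tmp, "per_rank.bin")
        master_bytes = write_gathered(gathered, slabs)
        rank_bytes = write_per_rank(per_rank, slabs, n_rows, n_cols)
        with open(gathered, "rb") as fa, open(per_rank, "rb") as fb:
            identical = fa.read() == fb.read()
    return master_bytes, rank_bytes, identical


if __name__ == "__main__":
    print(demo())
```

Both routes produce the same bytes on disk, but in the gather-style route the master must buffer the whole field, whereas in the per-rank route no process holds more than its own slab. That is the memory limitation the abstract attributes to the default layer; the sketch says nothing about actual throughput on a parallel file system.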
Organisation STFC, HC
Funding Information
Related Research Object(s):
Licence Information:
Language English (EN)
Type Paper In Conference Proceedings
Details In Exascale Applications and Software Conference 2013 (EASC2013), Edinburgh, UK, 9-11 Apr 2013, (2013).
URI(s)
Local file(s)
Year 2013