ePubs
The open archive for STFC research publications
Full Record Details
Persistent URL
http://purl.org/net/epubs/work/53443
Record Status
Checked
Record Id
53443
Title
dCSE Fluidity-ICOM: high performance computing driven software development for next-generation modelling of the World's oceans
Contributors
Xiaohu Guo (STFC Daresbury Lab.), M Ashworth (STFC Daresbury Lab.), G Gorman (Imperial College London)
Abstract
During the course of this dCSE project, Fluidity-ICOM has been transformed from a code that was primarily used on institution-level clusters, with typically 64 tasks per simulation, into a highly performing, scalable code which can be run efficiently on 4096 cores of the current HECToR hardware (Cray XT4 Phase 2a). Fluidity-ICOM has been parallelised with MPI and optimised for HECToR alongside continual in-depth performance analysis. The following list highlights the major developments:
- The matrix assembly code has been optimised, including blocking. Fluidity-ICOM now supports block-CSR for the assembly and solves of vector fields and DG fields.
- Interleaved I/O has been implemented for the vtu output. Performance analysis has been carried out with the gyre test case; so far no performance improvement has been observed. The parallel I/O strategy has not yet been applied to the mesh file output, as the final file format has not yet been decided.
- An optimal renumbering method for parallel linear solver performance has been implemented (provided via the PETSc interface; see the sketch following this abstract). In general, Reverse Cuthill-McKee is recommended for the best performance.
- Fluidity-ICOM has relatively complex dependencies on third-party software; several modules were created so that HECToR users can easily set up the software environment and install Fluidity-ICOM on HECToR.
- The differentially heated rotating annulus benchmark was used to evaluate the scalability of mesh adaptivity. A scalability analysis of both the parallel mesh optimisation algorithm and of the complete GFD model was performed, allowing the performance of the parallel mesh optimisation method to be evaluated in the context of a "real" application.
Extensive profiling has been performed with several benchmark test cases using CrayPAT and VampirTrace:
- Auto-profiling proved not very useful for large test cases, but its MPI statistics remain very useful for them; they also helped to identify the problems with surface labelling which cause large overhead for CrayPAT. There are still ongoing issues with PETSc instrumentation.
- The GNU version of VampirTrace proved useful for tracing the mesh adaptivity part, and several interesting results were obtained.
- Profiling real-world applications has proved to be a big challenge. It required a considerable understanding of the profiling tools and extensive knowledge of the software itself. Manual instrumentation was introduced in order to focus on specific sections of the code. Determining a suitable way to reduce the profiling data size without losing fine-grained detail was critical for successful profiling. Inevitably this procedure involved much experimentation, requiring large numbers of profiling runs.
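As an illustration of the renumbering item in the abstract, the following standalone C sketch shows how a Reverse Cuthill-McKee ordering can be requested and applied through the PETSc interface. It is not taken from the Fluidity-ICOM sources; the small tridiagonal matrix is a hypothetical stand-in for the assembled system, and the ordering calls are the point.

/* Minimal sketch: obtain a Reverse Cuthill-McKee ordering for a sparse
 * matrix via PETSc and permute the matrix with it. The toy 1-D Laplacian
 * below is only a stand-in for a real assembled system. */
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A, Aperm;
  IS             rperm, cperm;
  PetscInt       i, n = 8;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  /* Assemble a small tridiagonal matrix as a stand-in for the real assembly. */
  ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 3, NULL, &A); CHKERRQ(ierr);
  for (i = 0; i < n; i++) {
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES); CHKERRQ(ierr);
    if (i > 0)     { ierr = MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES); CHKERRQ(ierr); }
    if (i < n - 1) { ierr = MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES); CHKERRQ(ierr); }
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

  /* Reverse Cuthill-McKee row/column permutations, then reorder the matrix. */
  ierr = MatGetOrdering(A, MATORDERINGRCM, &rperm, &cperm); CHKERRQ(ierr);
  ierr = MatPermute(A, rperm, cperm, &Aperm); CHKERRQ(ierr);
  ierr = MatView(Aperm, PETSC_VIEWER_STDOUT_SELF); CHKERRQ(ierr);

  ierr = ISDestroy(&rperm); CHKERRQ(ierr);
  ierr = ISDestroy(&cperm); CHKERRQ(ierr);
  ierr = MatDestroy(&Aperm); CHKERRQ(ierr);
  ierr = MatDestroy(&A); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

For factorisation-based preconditioners the same ordering can also be selected at run time with PETSc's -pc_factor_mat_ordering_type rcm option, without changing the application code.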
Organisation
CSE, CSE-HEC, STFC
Keywords
Natural environment, Unstructured Mesh, Adaptivity, Profiling, Mesh Optimisation, Fluidity-ICOM, Parallel I/O
Funding Information
Related Research Object(s):
Licence Information:
Language
English (EN)
Type
Report
Details
2010.
URI(s)
http://www.hector…rts/fluidity-icom01/
Local file(s)
fluidity-icom01(2).pdf
Year
2010