High Performance Computing needs High Performance Data Management
Abstract Parallel environmental models are currently some of the most demanding codes we have. They push the available machines to their limits and still need more resources. However, the bottleneck in running these codes is not so much the code performance as the data handling strategies employed. To address the identified environmental research challenges, researchers need access to a wide range of observational data and model output, covering the human, physico-chemical and biological components of the Earth system. These data should be recognised as a highly valuable resource, but data from existing data centres are frequently under-used because of the difficulties of accessing them and of assessing whether they contain anything of interest or relevance. Furthermore, data exploration is still in its early stages. The problem is growing rapidly more serious, as new computer and observation technologies encourage the production of more data in shorter time spans. For an efficient working environment, mechanisms have to be put in place to support scientific work more effectively. The High Performance Computing Initiative Centre at CLRC, Daresbury Laboratory (UK) has set up a project to investigate this important issue (DAMP - Data Management in Climate Research Project). Although this project specifically targets problems in climate research, many of its findings are readily applicable to other scientific disciplines. Besides analysing existing solutions, the project will also look for new, more flexible and portable approaches.
Miscellaneous Computing 1999 - Grand Challenges in Computer Simulation, pp. 331-336, Adrian Tentner (Ed.), The Society for Computer Simulation International (SCS), San Diego, California, USA (April 1999). http://www.dl.ac.…blications/hpc99.htm
