Optimisation

Revision as of 14:39, 21 April 2015 by MarkRichardson (talk | contribs)

JWCRP Project Improve Computational Efficiency of UKCA in UKMO UM

A 33-month project has been funded by the Joint Weather and Climate Research Programme (JWCRP) between UKMO, UoLeeds and NCAS. The plan is to analyse components of UKCA and implement revisions that improve its computational efficiency. The work divides naturally into two areas: aerosols and chemistry. As of March 2015, developments within the interface to the aerosol sub-system are preparing it to process columns of atmosphere, which allows better use of cache. These changes also allow OpenMP to be extended into UKCA. Later analysis will address the chemical solver and build on the experience already gained from the investigation into the backward Euler method as a replacement for the Newton-Raphson technique.

The material below describes earlier work, prior to 2014.


Model Optimisation

The cost of the Stratosphere-Troposphere chemistry scheme in UKCA using the Newton-Raphson solver with the on-line photolysis scheme Fast-jX relative to the climate model HadGEM3-A is as follows:

Model                                  PEs    OpenMP threads   Elapsed time (s)
HadGEM3-A                              8x16   1                3798
HadGEM3-A + StratTrop(N-R) + Fast-jX   8x16   1                15602
HadGEM3-A                              8x16   2                2328
HadGEM3-A + StratTrop(N-R) + Fast-jX   8x16   2                11730


On an 8x16 PE configuration with 1 OpenMP thread, on the Monsoon facility or the Met Office's Power6 IBM (hpc1e/1f), UKCA and Fast-jX together add about 310% to the cost of HadGEM3-A. With 2 OpenMP threads (now standard in HadGEM3-A runs) and no OpenMP compiler directives in UKCA, the relative cost of UKCA is even higher, adding approximately 400% to the cost of HadGEM3-A. Adding aerosol chemistry and UKCA-MODE aerosols will make it more costly still. There is therefore a clear need for optimisation.

As part of the HadGEM3-ES development project, currently being led by F. O'Connor, there are plans to do some optimisation work on UKCA. In particular, a more complete assessment of the model cost will be carried out, and the potential speedup from simple code re-writing, load balancing, and the use of OpenMP, dedicated I/O servers and maths libraries will be explored. Some scientific optimisation, such as removing unwanted reactions (or species) or adjusting the chemistry to improve convergence, should also be considered. The use of an alternative solver, such as a Rosenbrock solver, may also be investigated.
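The percentages quoted above follow directly from the elapsed times in the table. The short Python sketch below reproduces that arithmetic; the function names and the shortened label "HadGEM3-A + UKCA" (standing for HadGEM3-A + StratTrop(N-R) + Fast-jX) are illustrative choices, not part of the UM or UKCA.

```python
# Elapsed times in seconds, copied from the table above (8x16 PEs).
# Keys are (model label, number of OpenMP threads); labels are illustrative.
elapsed = {
    ("HadGEM3-A", 1): 3798,
    ("HadGEM3-A + UKCA", 1): 15602,
    ("HadGEM3-A", 2): 2328,
    ("HadGEM3-A + UKCA", 2): 11730,
}


def added_cost_percent(base, coupled):
    """Extra cost of the coupled run as a percentage of the base run."""
    return 100.0 * (coupled - base) / base


def thread_speedup(one_thread, two_threads):
    """Speedup obtained by going from 1 to 2 OpenMP threads."""
    return one_thread / two_threads


for threads in (1, 2):
    base = elapsed[("HadGEM3-A", threads)]
    coupled = elapsed[("HadGEM3-A + UKCA", threads)]
    print(f"{threads} thread(s): UKCA + Fast-jX add "
          f"{added_cost_percent(base, coupled):.0f}%")

for model in ("HadGEM3-A", "HadGEM3-A + UKCA"):
    s = thread_speedup(elapsed[(model, 1)], elapsed[(model, 2)])
    print(f"{model}: speedup from 2 threads = {s:.2f}")
```

The second loop shows why the relative cost rises with 2 threads: the base model speeds up by a factor of about 1.63, whereas the coupled run, with no OpenMP directives in UKCA, speeds up by only about 1.33.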

Updates (8 June 2012)

  • The High Performance Computing (HPC) team at the Met Office have now been provided with 3 jobs: HadGEM3-A, HadGEM3-A+StratTrop+Fast-jX, and HadGEM3-A+StratTrop+Achem+Fast-jX+MODE, which run on the Met Office Power7 IBM (hpc2e).
  • Calls to Dr Hook in UKCA do not work - Andy Malcolm to investigate and fix.
  • Based on crude timings, chemistry scales super-linearly with the number of processors.
  • Based on crude timings, load balancing appears to be worse for the chemistry than for Fast-jX.
  • Further profiling and Dr Hook timing remain to be done.
  • Current advice for hpc2e is to run on a PE configuration of 16x16 with 1 OpenMP thread.
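
The scaling and load-balancing remarks above rest on crude timings. As a rough illustration of how such timings are commonly turned into numbers, the Python sketch below computes a parallel efficiency (values above 1 indicate super-linear scaling) and a load-imbalance factor (maximum per-PE time divided by the mean). The function names and all timing values are invented for illustration; they are not measurements from UKCA or the UM.

```python
def parallel_efficiency(t_ref, n_ref, t, n):
    """Efficiency of a run on n PEs relative to a reference run on n_ref PEs.

    A value of 1.0 is ideal linear scaling; a value above 1.0 is super-linear.
    """
    return (t_ref * n_ref) / (t * n)


def load_imbalance(per_pe_times):
    """Ratio of the slowest PE's time to the mean time (1.0 = perfectly balanced)."""
    mean = sum(per_pe_times) / len(per_pe_times)
    return max(per_pe_times) / mean


# Hypothetical timings (seconds), invented purely for this example.
chem_eff = parallel_efficiency(t_ref=100.0, n_ref=128, t=40.0, n=256)
print(f"chemistry efficiency going from 128 to 256 PEs: {chem_eff:.2f}")

chemistry_per_pe = [8.0, 12.0, 20.0, 8.0]   # uneven work across PEs
photolysis_per_pe = [10.0, 11.0, 12.0, 11.0]  # more even work across PEs
print(f"chemistry imbalance:  {load_imbalance(chemistry_per_pe):.2f}")
print(f"photolysis imbalance: {load_imbalance(photolysis_per_pe):.2f}")
```

With real per-PE timers (for example from Dr Hook, once the calls in UKCA work), the same two ratios would give a first quantitative check on the observations listed above.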