Met Office Virtual Machine
Revision as of 13:52, 13 November 2017
This page describes how to get UKCA at GA7 vn10.6+ working on the Met Office Virtual Machine.
Download and set-up the Virtual Machine
Prerequisites
This makes use of VirtualBox and Vagrant. You should use the most recent versions, if possible, and install them before proceeding.
You will also need an account on the Met Office Science Repository Service (known as the SRS, or MOSRS). You will need to make sure that you have access to at least the following projects:
- um
- JULES
- SOCRATES
- roses-u
You can request access by emailing cms-support@ncas.ac.uk.
You should only do this on a local machine that has enough memory and is running Ubuntu 16.04 (ideally 16.04.2). Currently the machines that can be used are:
- celsius
Set-up VirtualBox and Vagrant to use /scratch
You MUST change the VirtualBox & Vagrant settings to ensure that the guest machine disk is not placed in your /home
directory.
The names of the example directories given below can be changed, but they must be placed somewhere in /scratch/[YOUR CRSID]
Note: because you will be putting the virtual disk images on /scratch
, you will only be able to use the VM from one machine, e.g. celsius, brewer etc. You will not ever be running from these directories - they are just used to hold the VM disk image. You will be running from the metomi-vms
directory that you will learn about in the next section.
Change the VirtualBox preferences
To do this for VirtualBox, open it from the command line by typing
virtualbox
then click File -> Preferences and, under the General settings, change the Default Machine Folder to e.g. /scratch/[YOUR CRSID]/VirtualBox_VMs
.
You should not use VirtualBox to make the VM; this will be done using Vagrant.
Change the Vagrant home directory
To ensure that the vagrant boxes are on scratch you need to set the $VAGRANT_HOME
environment variable. To do this, put the following at the end of your .profile
:
export VAGRANT_HOME=/scratch/[YOUR CRSID]/VirtualBox_VMs/.vagrant.d
This will then put it in the same directory as you have set for VirtualBox.
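As a concrete sketch of what the setting expands to (spqr1 is a made-up CRSID; substitute your own):

```shell
# Sketch only: "spqr1" is a hypothetical CRSID - replace it with your own.
CRSID=spqr1
export VAGRANT_HOME=/scratch/$CRSID/VirtualBox_VMs/.vagrant.d
echo "$VAGRANT_HOME"   # prints /scratch/spqr1/VirtualBox_VMs/.vagrant.d
```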
Set-up the VM
The Rose/Cylc VM that this is based on can be found on GitHub here:
You should download it, e.g. using git:
git clone https://github.com/metomi/metomi-vms.git
The current (as of 2017-01-19) recommended OS is Ubuntu 16.04, and this is the default, although this also works on Ubuntu 15.10. 16.04 is preferred as it is the long-term support release.
For running UKCA you need a minimum of 6GB of RAM, with a recommended amount of 8GB (to be able to compile and run with rigorous compiler settings). You may also want to set up a shared directory with the host filesystem.
For macOS or GNU/Linux hosts, you don't need to run the full desktop environment, so you can delete desktop from the config.vm.provision
line, and also comment out the v.gui = true
line.
If you wish to mount a directory, you need to add a block, like the following, at the end of your Vagrantfile:
Vagrant.configure("2") do |config|
  # other config here
  config.vm.synced_folder "/path/on/host/machine", "/path/on/VM"
end
The /path/on/VM
will be created. I suggest something like /mnt/Shared
.
Start the VM
When you have your Vagrantfile how you like it, on macOS or Linux, in the metomi-vms
directory, you should type
vagrant up
You should then get something similar to the following:
Bringing machine 'metomi-vm-ubuntu-1604' up with 'virtualbox' provider...
==> metomi-vm-ubuntu-1604: Box 'bento/ubuntu-16.04' could not be found. Attempting to find and install...
    metomi-vm-ubuntu-1604: Box Provider: virtualbox
    metomi-vm-ubuntu-1604: Box Version: >= 0
==> metomi-vm-ubuntu-1604: Loading metadata for box 'bento/ubuntu-16.04'
    metomi-vm-ubuntu-1604: URL: https://atlas.hashicorp.com/bento/ubuntu-16.04
==> metomi-vm-ubuntu-1604: Adding box 'bento/ubuntu-16.04' (v2.3.1) for provider: virtualbox
    metomi-vm-ubuntu-1604: Downloading: https://atlas.hashicorp.com/bento/boxes/ubuntu-16.04/versions/2.3.1/providers/virtualbox.box
The OS will then be downloaded and installed. This may take some time, possibly around 45 minutes depending on the speed of your internet connection.
When it is finished you will see the message:
==> metomi-vm-ubuntu-1604: Finished provisioning at Thu Jan 19 11:09:03 UTC 2017 (started at Thu Jan 19 10:53:43 UTC 2017)
==> metomi-vm-ubuntu-1604:
==> metomi-vm-ubuntu-1604: Please run vagrant ssh to connect.
You should then use
vagrant ssh
to connect. The first time you do, you will be prompted for your MOSRS password and username, e.g.:
Met Office Science Repository Service password:
Met Office Science Repository Service username: lukeabraham
Subversion password cached
Rosie password cached
Stopping the VM
To stop the VM you must first log-out, e.g. by using Ctrl-d
, then type vagrant halt
. You should then see
==> metomi-vm-ubuntu-1604: Attempting graceful shutdown of VM...
Set-up the UM
Start the VM using vagrant up
in the metomi-vms
directory. Now you can install and set-up the UM. Further details are here:
But you can just follow the instructions below to get started.
Currently only UM vn10.4 and above can be used on the VM.
2017-03-02: There is an issue with Cylc v7, so to use the UM properly you need to run the following command to install Cylc 6:
- sudo install-cylc-6
This issue should be resolved in a few weeks.
Essentially, you should run the following commands, in this order:
- sudo install-um-extras
- um-setup
- install-um-data
- install-rose-meta
It can take several minutes to complete each of the above steps.
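The steps above can be wrapped in a small shell sketch that runs them in order and fails loudly if a command is missing (the command names come from the VM install scripts and only exist inside the VM; only install-um-extras needs sudo):

```shell
# Sketch: run the four UM set-up steps in order.
# These commands only exist inside the VM, so check for each one first.
missing=0
for cmd in install-um-extras um-setup install-um-data install-rose-meta; do
    if command -v "$cmd" >/dev/null 2>&1; then
        echo "running $cmd"
        case "$cmd" in
            install-um-extras) sudo "$cmd" ;;   # only this step needs root
            *) "$cmd" ;;
        esac
    else
        echo "cannot find $cmd - are you inside the VM?"
        missing=$((missing+1))
    fi
done
```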
You should then make a ticket on MOSRS, with a milestone of Not for Builds, to use to make a branch to put on your VM to run the required rose stem groups and also to make prebuilds. For example, I made a vn10.6 branch called vn10.6_prebuilds, as this gives a nice naming convention once these are made. However, you should not use my branch, but instead make your own, e.g.
fcm branch-create --type=DEV --ticket=N prebuilds fcm:um.x_tr@vn10.6
where N is your ticket number, and prebuilds is the name of the branch you will be making (it will end up being called vn10.6_prebuilds
), and then do e.g.
fcm checkout https://code.metoffice.gov.uk/svn/um/main/branches/dev/lukeabraham/vn10.6_prebuilds
Note: you should NOT use this branch though; please make your own. You should not make any changes to it, so that you know that the code within it is identical to the trunk at the UM version you are using (e.g. vn10.6).
You should then run the following command from inside the top-level directory of your new branch:
rose stem --group=vm_install -S CENTRAL_INSTALL=true -S UKCA=true --group=install_source
Note: if you are working at vn10.7 or higher you should instead use this command to install the mule
utilities that have replaced cumf, pumf etc.
rose stem --group=vm_install_mule,vm_install_ctldata -S CENTRAL_INSTALL=true -S UKCA=true --group=install_source
It might take about 10-15 minutes to do this step.
Upgrading to a new version
When a new version of the UM comes out, you will need to repeat the following steps to be able to run again:
- um-setup
- install-rose-meta
rose stem --group=vm_install -S CENTRAL_INSTALL=true -S UKCA=true --group=install_source
on a new branch that you have made (you may have problems running this command on a direct checkout of the trunk).
Prebuilds
To reduce compile time, you can use prebuilds, where the UM source code is already compiled and only the files that you have changed are compiled from scratch. This can greatly speed up compilation. To make them, run the following command from inside the top-level directory of your new branch (which has no changes in it, so is effectively the same source code as the vn10.x trunk):
rose stem --group=fcm_make --name=vn10.6_prebuilds -S MAKE_PREBUILDS=true
Further information on pre-builds can be found in UMDPX04. It can take about an hour or so to make the prebuilds.
Testing the UM
I would advise using a different branch from the one that you used to make the prebuilds, to prevent anything being accidentally overwritten, e.g.
fcm branch-create --type=DEV --ticket=N vm_testing fcm:um.x_tr@vn10.6
fcm checkout https://code.metoffice.gov.uk/svn/um/main/branches/dev/lukeabraham/vn10.6_vm_testing
You can then run the rose stem test suggested on the MOSRS page, e.g. cd into the top level directory of your new branch and run the following:
rose stem -O offline --group=vm_n48_eg_omp_noios -S INTEGRATION_TESTING=true
Setting-up UKCA
Required input files
There are a number of missing ancillary and netCDF emissions files that need to be copied into the VM. These can be installed by running the install-ukca-data
script provided as part of the VM install scripts. It will extract the required files from JASMIN and put them into the appropriate directories.
vn10.6 example suite: u-ai906
When you open rosie go, add u as a data source and then search for u-ai906. This suite is a 3-hour run with some standard STASH included. It will take ~12 minutes for a fresh compile (~50s for a recompile), ~20s for reconfiguration, and ~12 minutes for the atmosphere step. You should right-click on this suite and click copy (and not checkout).
The STASH output is:
This is just a representative set covering various bits of UKCA code. Feel free to add your own as needed to your own suite.
Output is in
$HOME/cylc-run/[SUITE-ID]/work/1/atmos/atmosa.pa19810901_00
- U-ai906 O3mlev.png: UKCA Ozone on model levels (y-z)
- U-ai906 O3plev.png: UKCA Ozone on pressure levels (y-log(z))
- U-ai906 O3 20m.png: UKCA Ozone at 20m
- U-ai906 O3 1000hPa.png: UKCA Ozone at 1000hPa
Known Issues
MOSRS ticket #2442 identified and fixed an issue with the gnu compiler (used on the VM) with the UM as a whole and with UKCA specifically. Ticket #2536 fixed an issue with the pressure-level output of the UKCA tracers.
If you are using vn10.6 you should include the following branch to fix the compiler problems:
branches/dev/lukeabraham/vn10.6_UKCA_gnu@31624
and the following branch to fix the pressure level issue:
branches/dev/lukeabraham/vn10.6_ukca_plev_tr_bug@30736
At vn10.4, vn10.5, and vn10.6 you should add the following compiler flag
-fdefault-double-8
to the fcflags_overrides
variable in the Advanced compilation panel in Rose, which will correct the compiler problem.
Additional Set-up
Xconv
You will need to download Xconv (xconv1.93
) from here:
You can download this by
wget http://cms.ncas.ac.uk/documents/xconv/_downloads/xconv1.93_linux_x86_64.tar.gz
Download it to your $HOME/bin on the VM, cd into this directory, tar -zxvf
the tar-ball, and then
ln -s xconv1.93 convsh1.93
ln -s xconv1.93 xconv
ln -s xconv1.93 convsh
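The linking step can also be scripted; a sketch (BINDIR is introduced here only so the commands can be tried outside the VM; on the VM it would simply be $HOME/bin, with xconv1.93 the executable unpacked from the tar-ball):

```shell
# Sketch: create the convenience links next to the unpacked xconv1.93
# executable. BINDIR is a stand-in for $HOME/bin on the VM.
BINDIR=${BINDIR:-$HOME/bin}
mkdir -p "$BINDIR"
cd "$BINDIR"
for link in convsh1.93 xconv convsh; do
    ln -sf xconv1.93 "$link"   # -f so re-running is harmless
done
```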
Iris
There is an install-iris
script provided, but you will need to set up modules yourself to be able to use it properly. The anaconda install breaks Rose if put in your PATH; however, there is now an alias
conda
which will open a new terminal with all the anaconda python packages in its $PATH. This will allow you to use Rose in one terminal and Iris in another.
I have found that a handy way to use python is to run ipython
with the following arguments:
ipython --pylab --logfile=ipython-`date +\"%Y%m%d-%H%M%S\"`.py
I have aliased this to
pylab
in my .bashrc
. The
--pylab
sets up a MATLAB-like environment (numpy, scipy, and matplotlib all loaded using standard shortcuts), and
--logfile=ipython-`date +\"%Y%m%d-%H%M%S\"`.py
means that all commands are saved to a file of the format
ipython-YYYYMMDD-HHMMSS.py
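A quick way to check what the date command inside the --logfile argument expands to (the exact timestamp will differ, but the shape is fixed):

```shell
# Sketch: show the logfile name the --logfile argument generates;
# %Y%m%d-%H%M%S gives e.g. 20171113-135200.
name="ipython-$(date +"%Y%m%d-%H%M%S").py"
echo "$name"
```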
Debugging
Some handy commands for debugging UM jobs on the VM are below.
Get memory usage relative to print statements
You can diagnose the memory of the running UM job by doing the following:
- Change the job to run 1x1 rather than 1x2 (change the values of UM_ATM_NPROCX and/or UM_ATM_NPROCY in the suite.rc file); this prevents any confusion in the following output from top
- Turn on flushing of the print buffer
- Change the run command to be /usr/bin/time -v --output=/home/vagrant/um-atmos.time um-atmos instead of um-atmos
- In an xterm window run the following command and pipe it to a file:
top -b -d1 -u vagrant | grep --line-buffered um-atmos
- In another xterm window run the following command and pipe this to another file:
tail -n 0 -qF atmos.fort6.pe0
- You can combine these two files together using
tail -n 0 -qF
to give the memory usage of the um-atmos executable at the same time as the print statements from the model
This can give you output like:
ATMOS_PHYSICS1:calling cosp_init
12154 vagrant  22  2 3642600 1.306g 15088 R  96.6 16.7 0:03.64 um-atmos.exe
12154 vagrant  22  2 4167928 1.657g 15088 R  95.9 21.3 0:03.74 um-atmos.exe
12154 vagrant  22  2 7713796 2.059g 15088 R 100.0 26.4 0:03.85 um-atmos.exe
12154 vagrant  22  2 7713796 2.461g 15088 R  95.4 31.6 0:03.95 um-atmos.exe
12154 vagrant  22  2 7713796 2.863g 15088 R  95.6 36.7 0:04.05 um-atmos.exe
12154 vagrant  22  2 7713796 3.260g 15088 R 100.0 41.8 0:04.16 um-atmos.exe
12154 vagrant  22  2 7713796 3.663g 15088 R 100.0 47.0 0:04.27 um-atmos.exe
12154 vagrant  22  2 7713796 4.061g 15088 R  96.5 52.1 0:04.37 um-atmos.exe
12154 vagrant  22  2 7713796 4.470g 15088 R 100.0 57.3 0:04.48 um-atmos.exe
12154 vagrant  22  2 7713796 4.872g 15088 R  96.3 62.5 0:04.58 um-atmos.exe
ATMOS_PHYSICS1:left cosp_init
Using the time command will give the following output:
Command being timed: "um-atmos"
User time (seconds): 518.43
System time (seconds): 84.94
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 10:04.59
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 4260268
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 684122
Voluntary context switches: 272816
Involuntary context switches: 266901
Swaps: 0
File system inputs: 0
File system outputs: 1435448
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Where the line
Maximum resident set size (kbytes): 4260268
above gives the maximum memory used by the job. It is also helpful to run 1x1 here, as if you run 1x2 the value for this will actually be ~50% of the maximum value.
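A one-liner can pull that figure out of the time report (um-atmos.time is the --output file named above; here a sample line stands in for the real file, and the awk pattern is just a sketch):

```shell
# Sketch: extract the peak memory (in kB) from a /usr/bin/time -v report.
# A sample line stands in for the real um-atmos.time file.
echo "        Maximum resident set size (kbytes): 4260268" > sample.time
awk -F': *' '/Maximum resident set size/ {print $2}' sample.time   # prints 4260268
```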
Written by Luke Abraham 13th November 2017