Version 6 (modified by jpolcher, 8 years ago)


Benchmark of ORCHIDEE using 3 different drivers for off-line simulations


These benchmarks were done over two domains in order to sample differently sized problems. ORCHIDEE is always configured in the same way and the same executable is used in both sets of benchmarks.

                         Euro-Mediterranean   Global
Length of experiment     1 year               1 year
nb lon. x nb lat.        164 x 168            720 x 360
Number of land points    15909                94742
Size of history file     41M                  225M

The version of the model used is the last version of ORCHIDEE-DRIVER before the merge (May 2016). The same version and configuration of the model are used throughout; only the way the forcing data (radiation fluxes, atmospheric conditions and precipitation types) are provided to ORCHIDEE differs.

The IO was configured to minimise the output while still using IOIPSL. The sizes of the output files are given in the table above.

The routing is activated. It is relevant here as it generates some MPI exchanges.

Computer and compiler details

A dedicated node of Climserv was used (merlin5). Thus the simulations were not in competition for memory access with other applications.

The model was compiled with PGF 2013 and OpenMPI 1.6.5. The following modules were loaded for the execution:

  • module load pgi/2013
  • module load openmpi/1.6.5-pgf2013
  • module load netcdf4/
  • module load oasis3-mct-nc43/2.0-pgf2013


The three drivers benchmarked here can be characterised as follows:

  • Old driver: In this historical driver every CPU accesses the netCDF file which contains the forcing data, does the time interpolation and provides the result to ORCHIDEE.
  • New driver: Only the root processor reads the data, and only at the beginning of the simulation. Each processor then gets its fraction of the grid points and does the temporal interpolation before passing the data to ORCHIDEE.
  • OASIS driver: This is a totally different approach where one processor is dedicated to reading the forcing data and doing the time interpolation. OASIS is then used to distribute the data at each time step (every 900 seconds) to the n-1 processors which run ORCHIDEE.
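
The "read on root, then give each processor its fraction of the grid points" idea of the new driver can be illustrated with a small sketch. This is not the ORCHIDEE code: the function names are hypothetical, the even block split is an assumption (the actual domain decomposition is not described here), and a plain Python list stands in for the MPI scatter.

```python
def split_land_points(n_points, n_procs):
    """Return one (start, count) pair per process.

    Assumption: contiguous blocks, with the remainder spread over the
    first ranks so that block sizes differ by at most one point.
    """
    base, extra = divmod(n_points, n_procs)
    chunks = []
    start = 0
    for rank in range(n_procs):
        count = base + (1 if rank < extra else 0)
        chunks.append((start, count))
        start += count
    return chunks


def scatter_from_root(forcing, n_procs):
    """Mimic the new driver: the root holds the whole forcing array and
    hands each rank only its own slice (a stand-in for an MPI scatter)."""
    return [forcing[start:start + count]
            for start, count in split_land_points(len(forcing), n_procs)]


# 15909 is the Euro-Mediterranean land-point count from the table above;
# the choice of 8 processes is arbitrary.
chunks = split_land_points(15909, 8)
pieces = scatter_from_root(list(range(15909)), 8)
```

In the old driver, by contrast, each process would open the forcing file itself and read its own slice, which is where the extra file accesses come from.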

The new driver and the OASIS mode use much more complex time interpolations. These can generate more operations and thus require more CPU time.

Real Time

The real time (as well as the user and system times) was measured with the time command, which wrapped the mpirun call used to run the model.
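
For readers who want to reproduce this kind of measurement from a script, the following is a minimal sketch of what the time command reports, assuming a Unix system. The helper name time_command is hypothetical, and the command being timed is a placeholder rather than the mpirun invocation used for the benchmark.

```python
import os
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd and return (real, user, system) times in seconds.

    User and system times come from the accounting of waited-for child
    processes, which is also what the shell's time command reports.
    """
    before = os.times()
    wall_start = time.perf_counter()
    subprocess.run(cmd, check=True)
    real = time.perf_counter() - wall_start
    after = os.times()
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


# Placeholder command; for a benchmark this would be the mpirun call.
real, user, system = time_command([sys.executable, "-c", "sum(range(100000))"])
print(f"real {real:.2f}s  user {user:.2f}s  sys {system:.2f}s")
```

Because the user and system times are accumulated over the child processes, they can exceed the real time when several processes run in parallel.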

The graphic only shows the impact of parallel processing on the model, but it already hints that the fastest driver for the Euro-Mediterranean region is the old one. The OASIS coupling between the driver and ORCHIDEE is slower than the subroutine call used for the new and old drivers.

User Time

The results for the real time are confirmed and better illustrated by the user time returned by UNIX's time command.


The speed-up was computed relative to the 2 processor case as this is the only common reference case. The OASIS-driver cannot be used with only 1 CPU.
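
A speed-up relative to the 2 processor case can be computed as sketched below. The definition S(n) = T(2) / T(n), with an ideal value of n/2, is an assumption, since the exact formula and the timing it was applied to are not stated here. The function names are hypothetical and the timings are placeholders, not benchmark results.

```python
def speedup_relative_to(times, ref_procs=2):
    """Map each processor count n to T(ref_procs) / T(n)."""
    t_ref = times[ref_procs]
    return {n: t_ref / t for n, t in sorted(times.items())}


def parallel_efficiency(speedups, ref_procs=2):
    """Divide each speed-up by its ideal value n / ref_procs."""
    return {n: s / (n / ref_procs) for n, s in speedups.items()}


# Placeholder timings in seconds, NOT measurements from this benchmark.
example_times = {2: 1000.0, 4: 520.0, 8: 290.0}
speedups = speedup_relative_to(example_times)
efficiencies = parallel_efficiency(speedups)
```

With this definition the reference case has a speed-up of 1 by construction, and an efficiency below 1 at larger processor counts indicates overheads such as communication or file access.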

It shows that the better speed-up is achieved by the new driver. To understand this, recall that the new driver reads the forcing on only one processor and then scatters the information to the other processors using MPI. In the old driver, all processors read the netCDF file to obtain the data they need and thus generate more system calls.

Two indicators suggest that this can explain the better speed-up of the new driver.

The above figure displays the CPU time taken by MPI. It is clear that, as the number of CPUs increases, the new driver spends more time in data transfer. In this graphic, the MPI exchanges needed for the forcing data are not counted for the OASIS driver, as they are part of OASIS and not ORCHIDEE.

Another hint of the slowdown caused by multiple processors accessing the same netCDF file for the forcing can be seen in the system time returned by UNIX's time command.

As we increase the number of CPUs in the old driver, the system time increases while in the new driver this time remains flat.
