wiki:DevelopmentActivities/Branches/ORCHIDEE-MICT-IMBALANCE-P/SimulationTimes

Version 32 (modified by ajornet, 8 years ago)


Performance

Basic Performance Report

attachment:performance_mict_albert_jornet_150616.pdf

MICT V6 (3344 + PFT interpolation) Module computing time

Perf_MICT_options

Mict R3359 (allinea map)

This profile was produced with Allinea MAP on the Curie machine: 16 MPI cores, a 1-month simulation at 0.5 degree resolution, with IOIPSL. All components were compiled in production (fast) mode.

########## ########## ########## ########## ########## ########## ########## ##########
Execution Sum Up
########## ########## ########## ########## ########## ########## ########## ##########
Jobid     : 4569560
Jobname   : M65_test
User      : p529jorn
Account   : gen6328@standard
Limits    : time = 4:10:00 , memory/task = 4000 Mo
Date      : submit = 15/04/2016 09:42:37 , start = 15/04/2016 09:52:13
Execution : partition = standard , QoS = normal , Comment = (null)
Resources : ntasks = 16 , cpus/task = 1 , ncpus = 16 , nodes = 1
   Nodes=curie4179 CPU_IDs=0-15 Mem=64000

Memory / step
--------------
                        Resident Size (Mo)                     Virtual Size (Go)
JobID          Max     (Node:Task)       AveTask    Max  (Node:Task)            AveTask
-----------    ------------------------  -------    --------------------------  -------

Accounting / step
------------------

JobID          JobName             Ntasks  Ncpus Nnodes     Layout       Elapsed   Ratio      CPusage    Eff  State
------------   ------------        ------  ----- ------     -------      -------   -----      -------    ---  -----
4569560       M65_test                  -     16      1           -     00:55:53     100            -      -  -
########## ########## ########## ########## ########## ########## ########## ##########

Screenshots:

  • Main: Allinea map screenshot main window for Orchidee MICT
  • MPI: Allinea map screenshot MPI window for Orchidee MICT
  • Memory: Allinea map screenshot Memory window for Orchidee MICT
  • IO: Allinea map screenshot Input/Output window for Orchidee MICT
  • CPU Time: Allinea map screenshot CPU time window for Orchidee MICT
  • CPI: Allinea map screenshot CPI time window for Orchidee MICT

Click the link below to download the profiling file:

attachment:orchidee_ol_16p_1t_2016-04-15_09-52.map

Trunk vs MICT Comparison 11/04/2016

  • Date 11/04/2016
  • ADA Machine
  • IOIPSL production mode
  • Orchidee production mode
  • 1Y
  • 16 cores
  • Forcing:
    • 1 Degree
    • 3H

Considerations:

  • MICT is at the same level of modifications as trunk revision 3346
  • MICT uses parallel interpolation in the aggregate_2D subroutine

Overview

Orchidee vs trunk profiling

Subroutines are placed in 4 different groups described below:

  • ioipsl: all subroutines related to IOIPSL library
  • Top orchidee: subroutines >1% of computing time
  • Interpolation: interpolation time by aggregate_2D subroutine
  • other orchidee: remaining subroutines from orchidee
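The grouping above can be sketched in a few lines of Python. This is only an illustration: the symbol prefixes used to recognise IOIPSL routines (`mathelp_mp_`, `histcom_mp_`), the order in which the rules are applied, and the helper names `classify` and `group_totals` are assumptions, not the script that produced the figure. The sample percentages are taken from the MICT R3359 gprof profile on this page.

```python
# Hypothetical classifier for gprof symbols; prefixes and rule order are assumed.
IOIPSL_PREFIXES = ("mathelp_mp_", "histcom_mp_")
INTERPOLATION_NAMES = ("interpol_help_mp_aggregate_2d_",)


def classify(name: str, percent: float) -> str:
    """Return the group of one subroutine given its share of computing time."""
    if name.startswith(IOIPSL_PREFIXES):
        return "ioipsl"
    if name in INTERPOLATION_NAMES:
        return "interpolation"
    if percent > 1.0:
        return "top orchidee"
    return "other orchidee"


def group_totals(rows):
    """Sum the percentages of (name, percent) pairs per group."""
    totals = {}
    for name, percent in rows:
        group = classify(name, percent)
        totals[group] = totals.get(group, 0.0) + percent
    return totals


# A few rows from the MICT R3359 flat profile (name, % time).
sample = [
    ("mathelp_mp_ma_fuscat_r21_", 25.66),
    ("histcom_mp_histwrite_real_", 9.18),
    ("interpol_help_mp_aggregate_2d_", 1.57),
    ("hydrol_mp_hydrol_soil_", 3.81),
    ("thermosoil_mp_thermosoil_readjust_", 0.77),
]
totals = group_totals(sample)
for group, share in sorted(totals.items()):
    print(f"{group}: {share:.2f}%")
```

The IOIPSL and interpolation rules are checked before the 1% threshold so that a large IO routine is not counted as "top orchidee".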

Mict R3359 (gprof)

This profiling test was done with the gprof tool:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ks/call  Ks/call  name    
 25.66   1383.92  1383.92  2245127     0.00     0.00  mathelp_mp_ma_fuscat_r21_
  9.62   1902.84   518.92  3835809     0.00     0.00  mathelp_mp_moycum_index_
  9.18   2398.02   495.18  3835826     0.00     0.00  histcom_mp_histwrite_real_
  5.96   2719.41   321.39    17524     0.00     0.00  thermosoil_mp_thermosoil_cond_pft_
  3.81   2924.90   205.49    17520     0.00     0.00  hydrol_mp_hydrol_soil_
  3.62   3119.87   194.97   420480     0.00     0.00  hydrol_mp_hydrol_soil_coef_
  3.59   3313.39   193.52    17524     0.00     0.00  thermosoil_mp_thermosoil_getdiff_
  3.11   3481.04   167.65      365     0.00     0.00  stomate_wet_ch4_pt_ter_wet2_mp_ch4_wet_flux_density_wet2_
  3.05   3645.33   164.29      365     0.00     0.00  stomate_wet_ch4_pt_ter_wet1_mp_ch4_wet_flux_density_wet1_
  2.92   3803.03   157.70      365     0.00     0.00  stomate_wet_ch4_pt_ter_wet3_mp_ch4_wet_flux_density_wet3_
  2.86   3957.34   154.31      365     0.00     0.00  stomate_wet_ch4_pt_ter_0_mp_ch4_wet_flux_density_0_
  2.74   4105.24   147.90      365     0.00     0.00  stomate_wet_ch4_pt_ter_wet4_mp_ch4_wet_flux_density_wet4_
  2.67   4249.50   144.26    17522     0.00     0.00  thermosoil_mp_thermosoil_coef_
  1.63   4337.37    87.87    17520     0.00     0.00  hydrol_mp_hydrol_diag_soil_
  1.59   4423.39    86.02  2666157     0.00     0.00  mod_orchidee_omp_transfert_mp_gather_omp_r1_
  1.57   4507.82    84.43       55     0.00     0.00  interpol_help_mp_aggregate_2d_
  1.37   4581.90    74.08    17520     0.00     0.00  diffuco_mp_diffuco_trans_co2_
  1.36   4655.06    73.16    17520     0.00     0.00  stomate_mp_stomate_main_
  1.22   4720.59    65.53    17520     0.00     0.00  stomate_permafrost_soilcarbon_mp_microactem_
  1.06   4777.86    57.27    17520     0.00     0.00  hydrol_mp_hydrol_main_
  0.96   4829.85    51.99  1602027     0.00     0.00  mathelp_mp_ma_fuscat_r11_
  0.77   4871.20    41.35    17522     0.00     0.00  thermosoil_mp_thermosoil_readjust_
  0.74   4911.35    40.15  2664512     0.00     0.00  mod_orchidee_omp_transfert_mp_gather_omp_i1_

Total Simulation time: 5358 seconds

IO: mathelp + histcom = 25.66 + 9.62 + 9.18 = ~45%

Trunk R3346

This profiling test was done with the gprof tool:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ks/call  Ks/call  name    
 22.26    441.54   441.54        7     0.06     0.06  interpol_help_mp_aggregate_2d_
 14.52    729.66   288.12  2171415     0.00     0.00  histcom_mp_histwrite_real_
 13.26    992.66   263.00    17520     0.00     0.00  hydrol_mp_hydrol_soil_
 10.28   1196.56   203.90   773813     0.00     0.00  mathelp_mp_ma_fuscat_r21_
  5.07   1297.17   100.61  2171397     0.00     0.00  mathelp_mp_moycum_index_
  4.16   1379.77    82.60    17520     0.00     0.00  diffuco_mp_diffuco_trans_co2_
  3.81   1455.34    75.57    17520     0.00     0.00  hydrol_mp_hydrol_diag_soil_
  3.67   1528.21    72.87   157680     0.00     0.00  hydrol_mp_hydrol_soil_coef_
  2.29   1573.69    45.48  1400412     0.00     0.00  mathelp_mp_ma_fuscat_r11_
  2.27   1618.76    45.07    17520     0.00     0.00  hydrol_mp_hydrol_main_
  1.86   1655.66    36.90    17521     0.00     0.00  thermosoil_mp_thermosoil_getdiff_
  1.46   1684.63    28.97    17521     0.00     0.00  thermosoil_mp_thermosoil_humlev_
  0.99   1704.17    19.54   157680     0.00     0.00  hydrol_mp_hydrol_soil_tridiag_
  0.94   1722.82    18.65    17520     0.00     0.00  stomate_litter_mp_littercalc_
  0.92   1740.99    18.17    17520     0.00     0.00  hydrol_mp_hydrol_split_soil_
  0.86   1758.10    17.11    17520     0.00     0.00  stomate_mp_stomate_main_
  0.81   1774.07    15.98  1133588     0.00     0.00  mod_orchidee_omp_transfert_mp_gather_omp_r1_

Total Simulation time: 1956 seconds

IO: mathelp + histcom = 14.52 + 10.28 + 5.07 = ~30%
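To compare the two profiles in absolute terms rather than percentages, the self seconds of the three IO routines can be summed for each run. The sketch below copies the values from the two flat profiles above. The dictionary layout and the helper `io_seconds` are illustrative only, and it assumes the two runs are directly comparable (same machine, forcing and length).

```python
# Self seconds of the IO routines, copied from the two gprof flat profiles.
IO_SELF_SECONDS = {
    "MICT R3359": {
        "mathelp_mp_ma_fuscat_r21_": 1383.92,
        "mathelp_mp_moycum_index_": 518.92,
        "histcom_mp_histwrite_real_": 495.18,
    },
    "Trunk R3346": {
        "histcom_mp_histwrite_real_": 288.12,
        "mathelp_mp_ma_fuscat_r21_": 203.90,
        "mathelp_mp_moycum_index_": 100.61,
    },
}
# Total simulation times reported under each profile.
TOTAL_SECONDS = {"MICT R3359": 5358.0, "Trunk R3346": 1956.0}


def io_seconds(run: str) -> float:
    """Sum the self time of the IO routines for one run."""
    return sum(IO_SELF_SECONDS[run].values())


total_ratio = TOTAL_SECONDS["MICT R3359"] / TOTAL_SECONDS["Trunk R3346"]
io_ratio = io_seconds("MICT R3359") / io_seconds("Trunk R3346")
print(f"total run time ratio (MICT / trunk): {total_ratio:.2f}")
print(f"IO self time ratio (MICT / trunk):   {io_ratio:.2f}")
```

With these numbers MICT takes about 2.7 times longer than trunk overall, while the time spent in the three IO routines grows by a factor of about 4.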

Trunk vs MICT Comparison 18/02/2016

18/02/2016: trunk revision 2916 and MICT revision 3161 were considered equivalent.

The same run.def file is used to compare both developments.

The simulations were carried out under the following conditions:

  • 1 Year
  • Global
  • CRU-NCEP v5.3.2 (6 hourly)
  • CURIE
  • IO library: IOIPSL
  • Compilation mode IOIPSL: production
  • Compilation mode Orchidee: production

Mict R3527

Time table:

N procs    8                    16       32       64       128      256
-------    -----------------    -----    -----    -----    -----    -----
0.5 deg    >16h39 (322 days)    11h09    7h06     4h50     3h31     2h47
1 deg      4h10                 2h14     1h20     52min    37min    30min
2 deg      -                    -        -        -        -        -

Mict R3161

Time table:

N procs    4               8                    16       32       64       128
-------    ------------    -----------------    -----    -----    -----    -----
0.5 deg    Memory error    >16h39 (322 days)    13h00    8h46     6h35     5h38
1 deg      6h37            4h20                 2h36     1h45     1h21     1h08
2 deg      1h40            56min                35min    24min    19min    16min

Note: the 0.5 deg run on 4 processes did not start because of its memory requirements. The 0.5 deg run on 8 processes could not finish within the maximum wall time allowed by the HPC; it stopped at simulation day 322. Both values can be extrapolated.
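As a rough illustration of such an extrapolation for the 8-process case, the sketch below scales the elapsed time linearly to a full year. It assumes that 16h39 is the wall time reached at day 322, that every simulated day costs the same, that the year has 365 days, and that start-up cost is negligible. The helper names are hypothetical.

```python
def parse_hm(text: str) -> int:
    """Convert a duration such as '16h39' to minutes."""
    hours, minutes = text.split("h")
    return int(hours) * 60 + int(minutes)


def format_hm(minutes: float) -> str:
    """Convert minutes back to the 'XhYY' notation, rounded to the minute."""
    whole = round(minutes)
    return f"{whole // 60}h{whole % 60:02d}"


def extrapolate_full_year(elapsed: str, days_done: int, days_total: int = 365) -> float:
    """Linear extrapolation of the elapsed time (in minutes) to days_total."""
    return parse_hm(elapsed) * days_total / days_done


estimate = extrapolate_full_year("16h39", 322)
print(format_hm(estimate))
```

Under these assumptions the full year would need a little under 19 hours on 8 processes. The 4-process case cannot be estimated this way because the run never started.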

Trunk R2916

The same simulations with the same options were carried out, with the following results:

N procs    4       8        16       32      64      128
-------    ----    -----    -----    ----    ----    ----
0.5 deg    8h38    5h31     3h26     2h23    1h48    1h31
1 deg      2h07    1h17     47min    32min   25min   21min
2 deg      38min   19min    11min    8min    6min    5min
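The scaling behaviour behind these timings can be summarised as speedup and parallel efficiency relative to the smallest process count. The sketch below does this for the Trunk R2916 0.5 deg row. The helper names are illustrative, and taking the 4-process run as the baseline is a choice made here, not something stated on this page.

```python
# Elapsed times of the Trunk R2916 0.5 deg runs, keyed by number of processes.
TRUNK_R2916_05DEG = {
    4: "8h38", 8: "5h31", 16: "3h26", 32: "2h23", 64: "1h48", 128: "1h31",
}


def to_minutes(text: str) -> int:
    """Convert a duration written as 'XhYY' to minutes."""
    hours, minutes = text.split("h")
    return int(hours) * 60 + int(minutes)


def scaling(times: dict) -> list:
    """Return (procs, speedup, efficiency) relative to the smallest run."""
    base_procs = min(times)
    base_minutes = to_minutes(times[base_procs])
    rows = []
    for procs in sorted(times):
        speedup = base_minutes / to_minutes(times[procs])
        efficiency = speedup / (procs / base_procs)
        rows.append((procs, speedup, efficiency))
    return rows


rows = scaling(TRUNK_R2916_05DEG)
for procs, speedup, efficiency in rows:
    print(f"{procs:4d} procs  speedup {speedup:5.2f}  efficiency {efficiency:6.1%}")
```

An efficiency of 100% would mean that doubling the number of processes halves the elapsed time; values well below that at high process counts indicate that adding cores brings little benefit.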
