ticket/1721/General (diff) – NEMO

Changes between Version 1 and Version 2 of ticket/1721/General


Timestamp: 2016-04-29T10:38:10+02:00 (8 years ago)
Author: frrh
== Running eORCA1 on a single Cray XC40 PE ==

=== Background notes ===
Higher resolutions (e.g. eORCA025) cannot run on a single node, so we concentrate solely on eORCA1 for the purposes of this work.

Tim Graham has successfully run a gyre ORCA2 configuration on a single PE.

Here we aim to go up a level.

=== Procedures and observations ===

It turns out that if we take an eORCA1 NEMO-CICE GO6-type configuration and use a 1x1 decomposition with no separate XIOS server PEs, the
job fails with a very clear message from icbinit (iceberg initialisation) that it is not possible to run with 1 PE in the X direction.

Note: we have to adjust the aprun command line options manually to get what we need, because the logic in the existing
controls (in suite.rc) breaks down when running on 1x1 (and possibly all odd numbers) of PEs.

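As an illustration of the manual adjustment mentioned in the note above, a launch line for a given decomposition could be assembled roughly as follows. This is a minimal sketch and not the suite.rc logic: the function name, the executable names and the default arguments are all hypothetical. It relies only on the standard aprun options `-n` (total PEs) and `-N` (PEs per node) and on the `:` separator used for MPMD launches.

```python
def build_aprun_command(jpni, jpnj, xios_pes=0,
                        nemo_exe="./nemo.exe", xios_exe="./xios_server.exe",
                        pes_per_node=None):
    """Return an aprun argument list for a jpni x jpnj NEMO decomposition.

    All names and defaults here are illustrative assumptions, not taken
    from the actual suite.
    """
    if jpni < 1 or jpnj < 1 or xios_pes < 0:
        raise ValueError("NEMO PE counts must be >= 1 and XIOS PEs >= 0")

    nemo_pes = jpni * jpnj
    cmd = ["aprun", "-n", str(nemo_pes)]
    if pes_per_node is not None:
        # Never ask for more PEs per node than the job has in total.
        cmd += ["-N", str(min(pes_per_node, nemo_pes))]
    cmd.append(nemo_exe)

    if xios_pes > 0:
        # MPMD launch: a second executable follows the ':' separator.
        cmd += [":", "-n", str(xios_pes), xios_exe]
    return cmd


# Single-PE case with no separate XIOS servers.
print(" ".join(build_aprun_command(1, 1)))  # aprun -n 1 ./nemo.exe
```

Building the command as a list keeps the 1x1 case free of any per-node or per-NUMA-region arithmetic, which is where automatic logic could plausibly go wrong for odd PE counts; that cause is a guess, not something established here.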

The reasons for that are not clear: why would the iceberg code be any different from the main NEMO (or CICE) code? That seems odd, but I don't
propose to pursue it.

So try switching off icebergs...

Well, this seems to submit and start running, but it times out with no suggestion that it has got very far (no time.step file etc.). It's not
clear whether things are failing somewhere in the NEMO code, in CICE or in the IO.

Try extending the job time limit and shortening the total model run to 6 hours...

This seems to abort in XIOS with an allocation problem.

How about we add separate XIOS procs back to the job?

Setting up to run with 8 separate XIOS processes in detached mode... that fails too.

It seems that regardless of how long we give the model to run, it reads (at least some of) the NEMO namelists and then just hangs.

Tim says this smacks of an XIOS problem.

He suggests switching off XIOS (in the external libraries control of fcm_make_ocean) and deactivating key_iomput.

So I do this and set the run to go for 6 hours (8 x 45 min timesteps in this case).

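The run length quoted above is simple arithmetic, but it is easy to get wrong when the timestep changes. A small sketch of the check, assuming a hypothetical helper name and taking the 6 hour and 45 minute figures from the text:

```python
def timesteps_for_run(run_hours, timestep_minutes):
    """Number of whole model timesteps in a run of the given length."""
    run_minutes = run_hours * 60
    steps, remainder = divmod(run_minutes, timestep_minutes)
    if remainder:
        raise ValueError("run length is not a whole number of timesteps")
    return steps


n_steps = timesteps_for_run(6, 45)
print(n_steps)  # 8
```

Raising an error on a non-integral step count, rather than rounding, avoids silently running a slightly different length from the one intended.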

This actually seems to work, and sure enough at the end of the run we have a single NEMO restart file for the whole domain.

So it seems we can run eORCA1 on a single PE, with the caveats that:

   * we must turn off icebergs
   * we must not compile with key_iomput
   * we must remove XIOS completely from the compilation (i.e. not merely leave it to run in attached mode).

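The three caveats above could be captured as a pre-flight check before submitting a single-PE job. The following is only a sketch: the function, its arguments and the way the settings are represented are hypothetical, and nothing like it exists in the suite. Only the key name key_iomput comes from the text.

```python
def single_pe_issues(cpp_keys, icebergs_on, xios_linked):
    """List the ways a setup would violate the single-PE caveats.

    cpp_keys    -- set of CPP keys used for the NEMO build
    icebergs_on -- True if icebergs are switched on in the run
    xios_linked -- True if XIOS is still part of the compilation
    """
    issues = []
    if icebergs_on:
        issues.append("icebergs must be turned off")
    if "key_iomput" in cpp_keys:
        issues.append("key_iomput must not be compiled in")
    if xios_linked:
        issues.append("XIOS must be removed from the compilation")
    return issues


# A setup matching the working run: no issues reported.
# "key_other" is a placeholder, not a real key from the configuration.
print(single_pe_issues({"key_other"}, icebergs_on=False, xios_linked=False))  # []
```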

So not exactly an unqualified success, but better than we might have hoped, and it potentially gives us something
to work with should we need it.