New URL for NEMO forge!   http://forge.nemo-ocean.eu

In March 2022, along with the NEMO 4.2 release, code development moved to a self-hosted GitLab.
This forge is now archived and remains online for historical reference.

Discarding land-only regions when using IO servers

First, a description of the problem:

When stand-alone io servers are used (i.e. key_iomput is defined AND using_servers = .true. in xmlio_server.def), ocean processing regions are assigned to the available io servers in MPI rank order. For example, consider a (jpni=8) x (jpnj=4) decomposition using 4 io servers. The 32 ocean regions are allocated MPI ranks 0 to 31 within the ocean communicator, starting with the bottom left-hand region at rank 0 and proceeding left to right, then bottom to top, until the top right-hand region is assigned rank 31. Each io server is assigned 8 ocean regions to collate output from: ranks 0-7 are associated with the first io server, ranks 8-15 with the second, and so on. Each io server is therefore responsible for a complete horizontal strip of ocean regions, and the output files produced are equivalent to those for a (jpni=1) x (jpnj=4) decomposition.

          ocean ranks:                    assigned io server:
     24 25 26 27 28 29 30 31           03 03 03 03 03 03 03 03
     16 17 18 19 20 21 22 23           02 02 02 02 02 02 02 02
     08 09 10 11 12 13 14 15           01 01 01 01 01 01 01 01
     00 01 02 03 04 05 06 07           00 00 00 00 00 00 00 00
 Assignment of ocean regions to io servers for an 8x4 decomposition using 4 io servers
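The rank-order assignment shown in the table above can be sketched in a few lines of Python. This is an illustration only, not the NEMO or xmlio_server code: the function names are hypothetical, and the rule used when the ranks do not divide evenly (the first servers take one extra rank each) is an assumption inferred from the tables on this page.

```python
def assign_servers(n_ranks, n_servers):
    """Return a list mapping each ocean rank to an io server index.

    Ranks are handed out in contiguous blocks, in rank order. When the
    division is uneven, the first (n_ranks % n_servers) servers take one
    extra rank each. That tie-breaking rule is an assumption inferred
    from the tables on this page, not taken from the NEMO source.
    """
    base, extra = divmod(n_ranks, n_servers)
    owners = []
    for server in range(n_servers):
        size = base + (1 if server < extra else 0)
        owners.extend([server] * size)
    return owners


def layout_rows(jpni, jpnj, owners):
    """Format the io server of every region, top row first, as in the tables."""
    rows = []
    for j in reversed(range(jpnj)):
        row = owners[j * jpni:(j + 1) * jpni]
        rows.append(" ".join(f"{server:02d}" for server in row))
    return rows


if __name__ == "__main__":
    # 8x4 decomposition, 4 io servers: one complete row per server.
    for line in layout_rows(8, 4, assign_servers(32, 4)):
        print(line)
```

Calling `assign_servers(32, 3)` instead gives blocks of 11, 11 and 10 ranks, which no longer line up with the rows of the decomposition.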

The geographical region handled by each io server is determined by the minimum and maximum longitudes and latitudes of all the ocean regions assigned to it. Using a number of io servers that is neither a multiple nor an integer factor of jpnj is therefore not recommended. For example, using 3 io servers with the 8x4 decomposition would lead to:

          ocean ranks:                    assigned io server:
     24 25 26 27 28 29 30 31           02 02 02 02 02 02 02 02
     16 17 18 19 20 21 22 23           01 01 01 01 01 01 02 02
     08 09 10 11 12 13 14 15           00 00 00 01 01 01 01 01
     00 01 02 03 04 05 06 07           00 00 00 00 00 00 00 00
 Assignment of ocean regions to io servers for an 8x4 decomposition using 3 io servers

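Here the bounding rectangles of the io servers overlap: the first io server's rectangle spans the two lowest rows of regions and the second's spans the second and third rows, so both cover the whole of the second row. The code copes with this and no data are lost, but storage is used redundantly and more complex algorithms are required to stitch the io server files together into global datasets.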
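The overlap can be checked with a small sketch that works in region-index space rather than in longitude and latitude (a simplifying assumption; the real bounding areas are geographical). The function names are hypothetical, and the uneven-split rule is inferred from the tables on this page rather than from the NEMO source.

```python
from itertools import combinations


def balanced_owners(n_ranks, n_servers):
    """Contiguous blocks of ranks per server; the first servers take any extras.

    The tie-breaking rule is an assumption inferred from the tables above.
    """
    base, extra = divmod(n_ranks, n_servers)
    owners = []
    for server in range(n_servers):
        owners.extend([server] * (base + (1 if server < extra else 0)))
    return owners


def bounding_boxes(jpni, jpnj, n_servers):
    """Return {server: (imin, imax, jmin, jmax)} in region-index space."""
    boxes = {}
    for rank, server in enumerate(balanced_owners(jpni * jpnj, n_servers)):
        j, i = divmod(rank, jpni)  # rank order: left to right, bottom to top
        if server not in boxes:
            boxes[server] = [i, i, j, j]
        else:
            box = boxes[server]
            box[0], box[1] = min(box[0], i), max(box[1], i)
            box[2], box[3] = min(box[2], j), max(box[3], j)
    return {server: tuple(box) for server, box in boxes.items()}


def overlapping_pairs(boxes):
    """List the pairs of servers whose bounding boxes share at least one region."""
    pairs = []
    for (a, box_a), (b, box_b) in combinations(sorted(boxes.items()), 2):
        overlap_i = box_a[0] <= box_b[1] and box_b[0] <= box_a[1]
        overlap_j = box_a[2] <= box_b[3] and box_b[2] <= box_a[3]
        if overlap_i and overlap_j:
            pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    print(overlapping_pairs(bounding_boxes(8, 4, 3)))
    print(overlapping_pairs(bounding_boxes(8, 4, 4)))
```

With 3 io servers the sketch reports overlaps between servers 0 and 1 and between servers 1 and 2; with 4 io servers it reports none.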

With a fully populated decomposition, this situation is easily avoided by ensuring that the number of io servers is either an integer factor of jpnj or a multiple of jpnj that also divides the total number of regions (jpni x jpnj). Thus, for the 8x4 decomposition, 1, 2, 4, 8, 16 and 32 are all valid choices (although 32 would be achieved more efficiently by not using stand-alone servers at all).
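As a rough aid, the rule can be written as a short check. The sketch below encodes one reading of it that reproduces the list given above for the 8x4 case: a count is accepted if it divides jpnj (each server takes whole rows) or if it is a multiple of jpnj that splits every row into equal pieces. The function name is hypothetical and the second condition is an interpretation, not something taken from the NEMO source.

```python
def sensible_server_counts(jpni, jpnj):
    """List io server counts that keep each server's area rectangular.

    Assumed reading of the rule:
      * whole rows: n_servers divides jpnj, or
      * split rows: n_servers is a multiple of jpnj and the number of
        servers per row (n_servers // jpnj) divides jpni.
    """
    counts = []
    for n_servers in range(1, jpni * jpnj + 1):
        whole_rows = jpnj % n_servers == 0
        split_rows = n_servers % jpnj == 0 and jpni % (n_servers // jpnj) == 0
        if whole_rows or split_rows:
            counts.append(n_servers)
    return counts


if __name__ == "__main__":
    print(sensible_server_counts(8, 4))
```

For jpni=8 and jpnj=4 this returns 1, 2, 4, 8, 16 and 32, matching the choices listed above.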

Problems arise, however, when land-only regions are discarded. The MPI rank assigned to each ocean region is then no longer simply related to its geographical position. Take the following example, where two land-only regions have been discarded, leaving 30 active regions:

          ocean ranks:                    assigned io server:
     22 23 24 25 26 27 28 29           02 03 03 03 03 03 03 03
     14 15 16 17 18 19 20 21           01 01 02 02 02 02 02 02
     07 08 ** 09 10 11 12 13           00 01 ** 01 01 01 01 01
     00 01 02 ** 03 04 05 06           00 00 00 ** 00 00 00 00
 Assignment of ocean regions to io servers for an 8x4 decomposition with two land-only regions (**) using 4 io servers

The elimination of land-only regions results in irregular areas being assigned to io servers and, in general, this cannot be solved by a judicious choice of the number of io servers.
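The table above can be reproduced with a short sketch. It numbers only the active regions, in the same left-to-right, bottom-to-top order, and then assigns contiguous blocks of those ranks to the io servers. The function names, the (column, row) coordinates of the two land-only regions and the uneven-split rule (the first servers take one extra rank) are assumptions read off the table, not taken from the NEMO source.

```python
def balanced_owners(n_ranks, n_servers):
    """Contiguous blocks of ranks per server; the first servers take any extras."""
    base, extra = divmod(n_ranks, n_servers)
    owners = []
    for server in range(n_servers):
        owners.extend([server] * (base + (1 if server < extra else 0)))
    return owners


def actual_ranks(jpni, jpnj, land):
    """Map each active (i, j) region to its rank, skipping land-only regions."""
    ranks = {}
    next_rank = 0
    for j in range(jpnj):          # bottom to top
        for i in range(jpni):      # left to right
            if (i, j) not in land:
                ranks[(i, j)] = next_rank
                next_rank += 1
    return ranks


def servers_by_actual_rank(jpni, jpnj, land, n_servers):
    """Return {(i, j): server} when servers are assigned in actual rank order."""
    ranks = actual_ranks(jpni, jpnj, land)
    owners = balanced_owners(len(ranks), n_servers)
    return {cell: owners[rank] for cell, rank in ranks.items()}


if __name__ == "__main__":
    land = {(3, 0), (2, 1)}  # (column, row) of the two regions marked ** above
    servers = servers_by_actual_rank(8, 4, land, 4)
    for j in reversed(range(4)):
        print(" ".join(f"{servers[(i, j)]:02d}" if (i, j) in servers else "**"
                       for i in range(8)))
```

Because ranks 7 and 22 now sit at the start of a row, the first and third io servers each reach into the row above.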

The solution

The solution provided by changeset [2462] and outlined in ticket #776 assigns ocean regions to io servers according to the MPI rank they would have had if land-only regions had not been discarded. This ensures that the general rule for choosing a sensible number of io servers still applies. The only catch is that an ocean region's "virtual" rank can only be determined after the MPI start-up phase, by which time the assignment has already happened. Starting a new decomposition is therefore a two-step process. The first run following any change in the decomposition exits gracefully after producing the layout.dat file, which now contains a new column holding the "virtual" rank of each processor. Subsequent runs of the same code use this information to assign ocean processes to io servers, with the results one would intuitively expect.
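The effect of assigning by "virtual" rank can be illustrated with the 8x4 example and its two land-only regions. The sketch below is not the code from changeset [2462]: the function name and the land coordinates are hypothetical, the virtual ranks are computed in memory instead of being read from layout.dat, and the uneven-split rule is an assumption inferred from the tables on this page.

```python
def servers_by_virtual_rank(jpni, jpnj, land, n_servers):
    """Return {(i, j): server} for active regions, assigned by virtual rank.

    The virtual rank of a region is the rank it would have in the fully
    populated decomposition: j * jpni + i. Servers are assigned as if all
    jpni * jpnj regions were present, and land-only regions are then skipped.
    """
    n_full = jpni * jpnj
    base, extra = divmod(n_full, n_servers)
    owners = []
    for server in range(n_servers):
        # Assumed tie-break: the first servers take one extra rank each.
        owners.extend([server] * (base + (1 if server < extra else 0)))

    assignment = {}
    for j in range(jpnj):
        for i in range(jpni):
            if (i, j) in land:
                continue  # discarded region: no process, nothing to assign
            virtual_rank = j * jpni + i
            assignment[(i, j)] = owners[virtual_rank]
    return assignment


if __name__ == "__main__":
    land = {(3, 0), (2, 1)}  # (column, row) of the two land-only regions
    servers = servers_by_virtual_rank(8, 4, land, 4)
    for j in reversed(range(4)):
        print(" ".join(f"{servers[(i, j)]:02d}" if (i, j) in servers else "**"
                       for i in range(8)))
```

In this sketch every active region in row j is handled by io server j, so each server again covers a complete horizontal strip even though only 30 processes are running.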

Last modified on 2010-12-08T17:24:24+01:00