Changes between Version 14 and Version 15 of 2011WP/2011Stream2/DynamicMemory
- Timestamp:
- 2011-02-09T17:27:55+01:00 (13 years ago)
2011WP/2011Stream2/DynamicMemory
= Discussion of coding approach/style for dynamic memory =

Last edited [[Timestamp]]

…

== S2-1 : Andrew P. : opening of the discussion ==
As a basis for the discussion, here's how I've currently coded NEMO (v.3.2) to use dynamic memory. I've used !''...!'' to indicate places where I've left out chunks of code for clarity/brevity.

…

----
== S2-2 : Richard H. comments ==
I agree with the sentiment about dropping 'key_mpp_dyndist' in favour of only supporting the dynamic memory code. (The proliferation of cpp keys makes maintenance and development difficult in the long term, and implies the need to test model developments using equivalent configurations under both the static and the dynamic versions.)

…

Aborting using MPI_COMM_WORLD is particularly pertinent to coupled (OASIS based) models (otherwise things just tend to dangle).

----
== S2-3 : Gurvan M. comments ==
 * Definitely, we have to make a complete break from the static-memory version. The key_mpp_dyndist should disappear. We all agreed on that at the developer committee.

…

 * issue of work-space or local arrays:

In my opinion, we can simply return to what was done in earlier versions of OPA (v1.0 to v6.0 !!). Declare and allocate, once and for all, four 3D work arrays and four 2D work arrays. Then use them as workspace in the subroutines. I say four, as it was sufficient in those releases.
Currently, more may be required, and with the Griffies operator and the merge of the TRA and TRC routines some 4D local arrays have appeared.

We can check in the code the maximum number of 4D, 3D and 2D arrays required in order to decide the exact number. It should not be that large.

Note that such a technique is already used in some modules. For example, in zdftke I use the fact that the "after" fields (ua, va, ta, sa) are only used in the momentum and tracer parts, so that in the computation of the physics they can be treated as workspace.

So what I suggest is a new module wrk_nemo (_nemo since it will probably be used in OPA, LIM, CICE, TOP...):

{{{
…
END MODULE wrk_nemo
}}}
Then, your example of the dia_ptr routine becomes:

{{{
…
...
}}}
Note that in this example, I have already introduced a 'USE oce, vt => ua' ... since dia_ptr is a diagnostic, the "after" arrays are available as work space.

…

----
== S2-4 : Italo E. comments ==
Hi all, I have just a couple of comments.

Re the opa_partition routine and the policy for choosing the "best" partition, I suggest setting jpni and jpnj such that the local subdomain is as "square" as possible. Indeed the "best" performance, with the current domain decomposition, is reached when the local subdomain has a square shape.
I suggest modifying opa_partition as follows:

{{{
...
…
...
}}}
Re the allocation of work arrays. Sharing work arrays among different routines lets us save a significant amount of memory, so the idea of a module such as wrk_nemo could be useful. However, the use of those arrays has several drawbacks: 1. the code could be less readable; 2. when I write a new routine that calls some other existing routine, I must be sure that the two will not use the same work arrays.

Some actions can be adopted in order to reduce the danger of such work arrays, but I would avoid using routine arguments for passing work arrays. Typically the use of work arrays is strictly tied to the particular implementation of the routine; on the other hand, the routine prototype should be as stable as possible during the refinement/optimization/modification of the routine implementation. Maintenance of the code becomes very heavy if updating the implementation of one routine also implies modifying its prototype.
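
To make the prototype-stability point concrete, here is a minimal sketch (not taken from the NEMO source) of a routine whose work array is purely an implementation detail. The names smooth_mod, smooth_2d, pfld and zwrk are hypothetical, and the example assumes a compiler that accepts local allocatable arrays with the SAVE attribute (Fortran 95 or later). The caller only ever sees `smooth_2d( pfld )`; adding or removing work arrays inside the routine would not change that interface.

{{{
MODULE smooth_mod
   ! Hypothetical module, for illustration only (not part of NEMO)
   IMPLICIT NONE
   PRIVATE
   PUBLIC :: smooth_2d

CONTAINS

   SUBROUTINE smooth_2d( pfld )
      ! Replace each interior point by the mean of its four neighbours.
      REAL, INTENT(inout) :: pfld(:,:)
      ! Work array kept between calls; it never appears in the prototype
      REAL, ALLOCATABLE, SAVE :: zwrk(:,:)
      INTEGER :: ji, jj, ipi, ipj

      ipi = SIZE( pfld, 1 )
      ipj = SIZE( pfld, 2 )

      ! (Re)allocate only if the work array is missing or has the wrong shape
      IF( ALLOCATED( zwrk ) ) THEN
         IF( ANY( SHAPE( zwrk ) /= SHAPE( pfld ) ) )   DEALLOCATE( zwrk )
      ENDIF
      IF( .NOT. ALLOCATED( zwrk ) )   ALLOCATE( zwrk(ipi,ipj) )

      zwrk(:,:) = pfld(:,:)            ! copy so that updates do not feed back
      DO jj = 2, ipj-1
         DO ji = 2, ipi-1
            pfld(ji,jj) = 0.25 * (  zwrk(ji-1,jj) + zwrk(ji+1,jj)   &
               &                  + zwrk(ji,jj-1) + zwrk(ji,jj+1)  )
         END DO
      END DO
   END SUBROUTINE smooth_2d

END MODULE smooth_mod

PROGRAM demo
   USE smooth_mod
   IMPLICIT NONE
   REAL :: zfld(4,4)

   zfld(:,:) = 0.0
   zfld(2,2) = 4.0
   CALL smooth_2d( zfld )              ! first call allocates the work array
   CALL smooth_2d( zfld )              ! second call reuses it
   PRINT *, zfld(2:3,2:3)
END PROGRAM demo
}}}
The trade-off in this sketch is that the memory held by zwrk is never shared with other routines, which is the saving a common work-space module is meant to provide.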

For those routines, at lower level, I suggest declaring their work arrays locally as allocatable arrays with the SAVE attribute.

----
== S2-5 : Andrew P's follow-up comments ==
I like Gurvan's suggestion of a module containing globally-accessible work-space arrays. We could add some error-checking functionality to this by having an 'in_use' flag for each work-space array in the module. Before using a work-space array, a developer should check that the appropriate flag is .FALSE. and, if it is, set it to .TRUE. while they are using it. Once they are done using the array, the flag should be set back to .FALSE.

…

----
== S2-6 : Marie-Alice Foujols' comments ==
As this modification will impact all the code, I suggest using a script so that the modification can easily be redone. It will be useful for NEMO users who need to compare the old parts of their own copy of the code with the new one. If this script is distributed, they could use it to change their code and to easily incorporate their modifications into the new version. For the same reason, I also suggest avoiding cosmetic changes (moving comments, line splitting, ...): this reduces the time users need to compare their own copy with the new version of NEMO including dynamic allocation.

…

Hope this helps.

----
== S2-7 : Andy Porter's 3rd set of comments ==
I can appreciate that an almost global change like this will be difficult for users who have locally modified versions. Ideally the source-code revision-control system would facilitate applying the changes to a locally-modified version/branch, i.e. one that isn't in the official repository. Unfortunately I don't think subversion has this functionality (although I'd be very pleased to learn otherwise).
Certainly I'll do my best to avoid unnecessary cosmetic changes. However, while I can imagine that scripting the change of module arrays from static to dynamic might be possible, I don't think the same can be said of the work-space arrays, and they account for a lot of the code changes.

In fact, I'm discovering that some routines have an awful lot of workspace arrays, e.g.:

{{{
…
IF( kt == nit000 ) THEN !* initialisation
}}}
I make that 21 2D workspace arrays! Should the global workspace module contain that many or should we make some of these into module-wide arrays?

Do people want jpk to be treated like jpi and jpj and have it become a run-time variable, or is it OK to leave it as a compile-time parameter? My thinking is that one doesn't change the no. of levels in a model lightly and it has no bearing on the MPP domain decomposition.

----
== S2-8 : Gurvan's 2nd comments ==
I don't think having many 2D work arrays is a problem. 21 2D arrays are still much smaller than a single 3D array (jpk is usually between 30 and 70).[[BR]]As a starting point, I prefer the solution in which we define as many 2D and 3D allocatable work arrays as necessary in the worst case.[[BR]]As a first step this will be much simpler.
In a second stage, if the large number of work arrays is needed only by a few modules that are not systematically used, then we can decide to systematically allocate only, let's say, 10 work arrays and allocate the additional ones in those modules (obviously testing beforehand whether they are already allocated).

For jpk, it is true that jpk will not be changed at run-time, BUT with AGRIF the mother and child can have a different jpk (this is a new feature planned to be introduced this year). Therefore jpk MUST be considered as a run-time variable together with jpi and jpj.

About the computation of jpni, jpnj at run-time or in the namelist... the problem I have in mind is the suppression of land-only processors. For the moment the user gives the i and j processor cutting AND the number of processors actually used (jpnij). It is unclear to me how this can be chosen at run-time...

----
== S2-x : XXX' comments ==
----