Changes between Version 17 and Version 18 of 2020WP/ENHANCE-10_acc_fix_traqsr
- Timestamp: 2020-05-20T16:00:23+02:00
2020WP/ENHANCE-10_acc_fix_traqsr
v17  v18
118  118    Part 1 has even greater inefficiencies though since it repeats some computationally expensive calculations in a 3D loop even though much of it is 2D in nature. In fact the complexity of the calculations can be reduced further by operating in log-space.
119  119
120       - After trying a few options and iterating at the preview stage (see earlier investigations below), the final solution choose was:
121       -
     120  + After trying a few options and iterating at the preview stage (see earlier investigations below), the final solution chosen is given below. Its performance, compared with the original and a minimum memory option (see earlier investigations) is shown here:
     121  +
     122  + [[Image(percent_cpu_qsr.4.png, 530px)]]
     123  + [[Image(rankqsr.4.png, 530px)]]
     124  +
     125  + where the information is taken from the timing.output report of a 64 timestep, ORCA2_ICE_PISCES test using the SETTE test harness. Rank refers to the position of the tra_qsr routine in the list of routines sorted by CPU time (most expensive first). A higher rank, therefore, indicates improved performance relative to the rest of the code.
     126  +
     127  + The final code:
122  128    {{{#!f
123  129    CASE( np_RGB , np_RGBc ) !== R-G-B fluxes ==!
…    …
785  791    !
786  792    }}}
787       - [[Image(percent_cpu_qsr.3.png )]]
788       - [[Image(rankqsr.3.png )]]
     793  + [[Image(percent_cpu_qsr.3.png, 530px)]]
     794  + [[Image(rankqsr.3.png, 530px)]]
789  795    ''...''
790  796