#1351 closed Bug (fixed)
Problems with AMM12 SETTE tests with -O3 optimisation (ifort compiler)
Reported by: | acc | Owned by: | nemo |
---|---|---|---|
Priority: | low | Milestone: | |
Component: | OCE | Version: | v3.6 |
Severity: | | Keywords: | |
Cc: | | | |
Description
My initial attempts to run the AMM12 SETTE tests with a v3.6 trunk (rev 4673) failed when using the ifort compiler (v14) and a -O3 optimisation level. The failure was CFL breaches after 12 time steps. The tests ran successfully at -O2 and below for AMM12 and at -O3 for all other tests (except AGRIF).
I eventually tracked the cause to this block at line 291 in dynspg_ts.F90 (key_vectopt_loop is not defined):

   DO jk = 1, jpkm1
#if defined key_vectopt_loop
      DO jj = 1, 1         !Vector opt. => forced unrolling
         DO ji = 1, jpij
#else
      DO jj = 1, jpj
         DO ji = 1, jpi
#endif
            zu_frc(ji,jj) = zu_frc(ji,jj) + fse3u_n(ji,jj,jk) * ua(ji,jj,jk) * umask(ji,jj,jk)
            zv_frc(ji,jj) = zv_frc(ji,jj) + fse3v_n(ji,jj,jk) * va(ji,jj,jk) * vmask(ji,jj,jk)
         END DO
      END DO
   END DO

which looks harmless, but adding a compiler directive to suppress loop fusion enables a successful SETTE test at -O3 optimisation, i.e.:

   DO jk = 1, jpkm1
!DIR$ NOFUSION
#if defined key_vectopt_loop
      DO jj = 1, 1         !Vector opt. => forced unrolling
         DO ji = 1, jpij
#else
      DO jj = 1, jpj
         DO ji = 1, jpi
#endif
            zu_frc(ji,jj) = zu_frc(ji,jj) + fse3u_n(ji,jj,jk) * ua(ji,jj,jk) * umask(ji,jj,jk)
            zv_frc(ji,jj) = zv_frc(ji,jj) + fse3v_n(ji,jj,jk) * va(ji,jj,jk) * vmask(ji,jj,jk)
         END DO
      END DO
   END DO

Does anyone have a clue what might be happening here?
Commit History (1)
Changeset | Author | Time | ChangeLog |
---|---|---|---|
4687 | acc | 2014-06-24T17:22:03+02:00 | #1351 alternative loop structure to fix errors in dynspg_ts.F90 when compiling with -O3 and the ifort compiler. Without this change the AMM12 SETTE tests fail after 12 timesteps. Also included a single line efficiency change in domzgr.F90 and improvements to sette scripts and local NOCS files. |
Change History (5)
comment:1 Changed 10 years ago by acc
comment:2 Changed 10 years ago by jchanut
We found exactly the same solution. It is weird, however.
It may depend on which ifort version you use.
No reason not to do this change.
comment:3 Changed 10 years ago by acc
- Resolution set to fixed
- Status changed from new to closed
Change submitted at revision #4687
comment:4 Changed 10 years ago by smasson
Do you use the compilation option -fp-model precise?
comment:5 Changed 10 years ago by acc
Yes, although on this machine '-fp-model source' makes more sense; otherwise you get a lot of warnings:
ifort: command line warning #10212: -fp-model precise evaluates in source precision with Fortran.
Both give the same error after 12 time steps with the original code.
Alternatively, if I replace the inner loops and reduce the block to:
Then the tests are successful at -O3 without any compiler directives. Can anyone see any reason not to make this change permanent?
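
The reduced block itself is not reproduced in the ticket text above. As a rough illustration only, assuming the replacement swaps the inner jj/ji loops for whole-array sections (an assumption, not confirmed here), a standalone sketch comparing the two forms might look like the following. The program name, the small grid sizes, the random fill and the plain array e3u_n (standing in for the fse3u_n macro) are all hypothetical and not taken from dynspg_ts.F90.

```fortran
program frc_sum_sketch
   ! Standalone sketch: compares an explicit nested-loop vertical sum with an
   ! assumed array-section form. All names and sizes here are illustrative.
   implicit none
   integer, parameter :: wp = selected_real_kind(12, 307)   ! double precision
   ! Hypothetical small grid; real AMM12 dimensions are much larger
   integer, parameter :: jpi = 6, jpj = 5, jpk = 4, jpkm1 = jpk - 1
   ! e3u_n is a plain array standing in for the fse3u_n macro (assumption)
   real(wp) :: e3u_n(jpi,jpj,jpk), ua(jpi,jpj,jpk), umask(jpi,jpj,jpk)
   real(wp) :: zu_loop(jpi,jpj), zu_arr(jpi,jpj)
   integer  :: ji, jj, jk

   ! Arbitrary test data
   call random_number(e3u_n)
   call random_number(ua)
   umask(:,:,:) = 1.0_wp
   umask(1,:,:) = 0.0_wp        ! mask the first column as "land"

   zu_loop(:,:) = 0.0_wp
   zu_arr(:,:)  = 0.0_wp

   ! Explicit nested loops, following the block quoted in the description
   do jk = 1, jpkm1
      do jj = 1, jpj
         do ji = 1, jpi
            zu_loop(ji,jj) = zu_loop(ji,jj) + e3u_n(ji,jj,jk) * ua(ji,jj,jk) * umask(ji,jj,jk)
         end do
      end do
   end do

   ! Assumed reduced form: only the jk loop remains, inner loops become array sections
   do jk = 1, jpkm1
      zu_arr(:,:) = zu_arr(:,:) + e3u_n(:,:,jk) * ua(:,:,jk) * umask(:,:,jk)
   end do

   ! The two accumulations are mathematically identical
   print *, 'max abs difference: ', maxval(abs(zu_loop - zu_arr))
end program frc_sum_sketch
```

Building this at -O2 and at -O3 and comparing the printed difference is one way to check whether a given compiler treats the two forms differently, although a toy case this small may well not reproduce the AMM12 failure.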