Opened 5 years ago

Closed 5 years ago

Last modified 3 years ago

#1530 closed Bug (fixed)

Failed compilation of AGRIF configuration in SETTE on ADA

Reported by: bouttier
Owned by: nemo
Priority: normal
Milestone:
Component: AGRIF
Version: release-3.6
Severity: minor
Keywords:
Cc:

Description (last modified by nemo)

After my fix to storng.F90 to compile ORCA2_LIM with AGRIF (see ticket #1527), this configuration still does not compile on the ADA computer (see the X64_ADA arch file, revision 5366). The build ends with this error:

fcm_internal compile:F nemo /linkhome/rech/egi/regi906/NEMO_STANDARD/trunk/NEMOGCM/CONFIG/AGRIF/BLD/ppsrc/nemo/sbcisf.f90 sbcisf.o
mpiifort -c -cpp -o sbcisf.o -I/linkhome/rech/egi/regi906/NEMO_STANDARD/trunk/NEMOGCM/CONFIG/AGRIF/BLD/inc -DCPP_PARA -i4 -r8 -O3 -xAVX -fp-model precise -I/linkhome/rech/egi/regi906/NEMO_STANDARD/xios-1.0//inc -I/not/yet/defined/build/lib/mct -I/not/yet/defined/build/lib/psmile.MPI1 -I/smplocal/pub/NetCDF/4.1.3/mpi/include -c /linkhome/rech/egi/regi906/NEMO_STANDARD/trunk/NEMOGCM/CONFIG/AGRIF/BLD/ppsrc/nemo/sbcisf.f90
ifort: command line warning #10212: -fp-model precise evaluates in source precision with Fortran.
010101_13220

catastrophic error: **Internal compiler error: internal abort** Please report this error along with the circumstances in which it occurred in a Software Problem Report.  Note: File and line given may not be explicit cause of this error.
compilation aborted for /linkhome/rech/egi/regi906/NEMO_STANDARD/trunk/NEMOGCM/CONFIG/AGRIF/BLD/ppsrc/nemo/sbcisf.f90 (code 1)
fcm_internal compile failed (256)
gmake: *** [sbcisf.o] Error 1
gmake -f /linkhome/rech/egi/regi906/NEMO_STANDARD/trunk/NEMOGCM/CONFIG/AGRIF/BLD/Makefile -j 1 all failed (2) at /gpfs4l/smphome/rech/egi/regi906/NEMO_STANDARD/trunk/NEMOGCM/EXTERNAL/fcm/bin/../lib/Fcm/Build.pm line 597

I have also tested revision 5301 (successfully compiled by Pierre Mathiot, as indicated in ticket #1527, but on which computer?) and I get the same error.

Commit History (2)

Changeset  Author   Time                       ChangeLog
5541       jchanut  2015-07-02T17:24:34+02:00  change eos_fzp function into a subroutine (needed for AGRIF), #1530
5540       jchanut  2015-07-02T17:11:23+02:00  Change eos_fzp function into a subroutine (needed for AGRIF), #1530

Change History (9)

comment:1 Changed 5 years ago by mathiot

I re-tested ORCA2+AGRIF at revision 5366 on the BAS local cluster:

[piethi@node009 ~]$ cat /proc/cpuinfo
vendor_id   : AuthenticAMD
model name  : AMD Opteron(tm) Processor 6136
cpu MHz     : 2400.043
cache size  : 512 KB

The flags used to compile are as close as possible to the ADA flags:

%FCFLAGS         -DCPP_PARA -i4 -r8 -O3 -fp-model precise
%FFLAGS       -DCPP_PARA -i4 -r8 -O3 -fp-model precise

(I am able to compile but not to start NEMO with the option -xAVX)

On this machine with these flags, I am able to compile and run ORCA2+AGRIF:

piethi@bslscihub-ws2:.../NEMO_3.6_headref5366/NEMOGCM/SETTE$ ./sette.sh
export AGRIFUSE="10"
export BAS_MATLAB_QUEUE="compute.q"
export BATCH_COMMAND_PAR="qsub"
...
fcm_internal compile:F nemo /data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/OPAFILES/ppsrc/nemo/sbcisf.f90 sbcisf.f90
/data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/TOOLS/COMPILE/agrifpp.sh  sbcisf.f90 /data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/OPAFILES/inc  /data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/OPAFILES/ppsrc/nemo/sbcisf.f90
fcm_internal compile:F nemo /data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/OPAFILES/ppsrc/nemo/sbcrnf.f90 sbcrnf.f90
...
touch /data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/BLD/flags/FFLAGS__nemo__geo2ocean.flags
touch /data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/BLD/flags/FFLAGS__nemo__sbcisf.flags
touch /data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/BLD/flags/FFLAGS__nemo__zdfbfr.flags
...
mpif90 -c -cpp -o sbcisf.o -I/data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/BLD/inc -DCPP_PARA -i4 -r8 -O3 -fp-model precise -I/data/scihub-users/piethi/WORKDIR/XIOS-1.0/inc -I/not/yet/defined/build/lib/mct -I/not/yet/defined/build/lib/psmile.MPI1 -I/cluster-packages/apps/netcdf/intel/64/4.2/include -c /data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/BLD/ppsrc/nemo/sbcisf.f90
mpif90 -c -cpp -o zdfddm.o -I/data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/BLD/inc -DCPP_PARA -i4 -r8 -O3 -fp-model precise -I/data/scihub-users/piethi/WORKDIR/XIOS-1.0/inc -I/not/yet/defined/build/lib/mct -I/not/yet/defined/build/lib/psmile.MPI1 -I/cluster-packages/apps/netcdf/intel/64/4.2/include -c /data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/BLD/ppsrc/nemo/zdfddm.f90
...
ar: creating /data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/BLD/tmp/lib__fcm__nemo.a
mpif90 -o nemo.exe /data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/BLD/obj/nemo.o -L/data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/BLD/lib -l__fcm__nemo -lstdc++ -L/data/scihub-users/piethi/WORKDIR/XIOS-1.0/lib -lxios -L/not/yet/defined/lib -L/cluster-packages/apps/netcdf/intel/64/4.2/lib -lnetcdff -lnetcdf -L/cluster-packages/apps/hdf5/intel/64/1.8.8/lib -lhdf5hl_fortran -lhdf5_hl -lhdf5_fortran -lhdf5
/cm/shared/apps/intel/composerxe-2011.3.174/compiler/lib/intel64/libimf.so: warning: warning: feupdateenv is not implemented and will always fail
->Make: 396 seconds
->TOTAL: 413 seconds
Build command finished on Fri Jun  5 14:22:39 2015.
/data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG
looking for tarfile ORCA2_LIM_nemo_v3.6.tar and directory /data/scihub-users/piethi/WORKDIR/FORCING/ORCA2_LIM_nemo_v3.6
/data/scihub-users/piethi/WORKDIR/NEMO_3.6_headref5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/SHORT/run_job.sh is ready
Your job 57772 ("sette") has been submitted
...
piethi@bslscihub-ws2:.../NEMO_3.6_headref5366/NEMOGCM/SETTE$ ls NEMO_VALIDATION/WORCA2AGUL_1_2/mpif90_scihub/5366/SHORT/
1_ocean.output  1_output.namelist.dyn  1_solver.stat  ocean.output  output.namelist.dyn  output.namelist.ice  solver.stat
piethi@bslscihub-ws2:.../NEMO_3.6_headref5366/NEMOGCM/SETTE$ ./sette_rpt
GYRE        restartability  passed
GYRE        reproducibility passed
ORCA2_LIM_AGRIF runability passed

To reproduce the compilation error reported by Pierre-Antoine, I tried on ARCHER, but I was unable to compile even the first file. The process was killed because it took too much time (more than 45 minutes).

mathiot@eslogin008:.../NEMO_3.6_head5366/NEMOGCM/SETTE$ time ./sette.sh
...
->Generate Fortran interface: 0 second
->Make: start
fcm_internal compile:F nemo /work/n02/n02/mathiot/NEMO_3.6_head5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/OPAFILES/ppsrc/nemo/lib_cray.f90 lib_cray.f90
/work/n02/n02/mathiot/NEMO_3.6_head5366/NEMOGCM/TOOLS/COMPILE/agrifpp.sh  lib_cray.f90 /work/n02/n02/mathiot/NEMO_3.6_head5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/OPAFILES/inc  /work/n02/n02/mathiot/NEMO_3.6_head5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/OPAFILES/ppsrc/nemo/lib_cray.f90

Message from root@eslogin008 on <no tty> at 14:51 ...
Application /work/n02/n02/mathiot/NEMO_3.6_head5366/NEMOGCM/CONFIG/ORCA2AGUL_1_2/OPAFILES/conv terminated after 00:47:30 cputime exceeded interactive limit
EOF
...
real  47m47.806s
user  0m17.749s
sys   47m17.557s
Version 1, edited 3 years ago by nemo

comment:2 Changed 5 years ago by bouttier

I get the same result on my local machine (macport_osx arch file): compilation of the lib_cray file takes too much time.

comment:3 Changed 5 years ago by jchanut

I get a quite similar compilation issue with AGRIF (ifort 15 compiler, whatever the optimization level). It seems to be related to the call to the function eos_fzp. The first occurrence is indeed in sbcisf.F90. Commenting out the call to eos_fzp allows that routine to compile successfully.
The same problem arises in other routines wherever eos_fzp is called.
Pierre-Antoine, could you check whether it works for you too?

comment:4 Changed 5 years ago by bouttier

I confirm that the problem seems to come from the call to the eos_fzp function. With that call commented out, the compilation succeeds.

comment:5 Changed 5 years ago by bouttier

I think that the problem comes from the construct FUNCTION f(x) RESULT(y) under an INTERFACE in eosbn2 (function eos_fzp). Apparently, the AGRIF conv tool does not handle that construct correctly. Moreover, it seems that conv does not correctly handle functions that do not return a scalar. I have made preliminary tests turning those functions into subroutines, and the code then compiles successfully.

I need to perform more tests before proposing a fix.
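
A minimal sketch of the construct described above and of the subroutine rewrite. All names (fzp_sketch, fzp_fun, fzp_sub) are hypothetical and the coefficient is a placeholder; this is not the eosbn2 code or its freezing-point formula. It does not reproduce the conv failure; it only illustrates the two forms and checks that they give the same values:

```fortran
! Hypothetical sketch: names and coefficient are illustrative assumptions,
! not taken from eosbn2.
MODULE fzp_sketch
   IMPLICIT NONE
   PRIVATE
   INTEGER , PARAMETER :: wp   = KIND(1.0d0)
   REAL(wp), PARAMETER :: coef = -0.0575_wp   ! placeholder coefficient

   ! Function form: a generic name resolving to functions that use RESULT,
   ! one of which returns an array rather than a scalar.
   INTERFACE fzp_fun
      MODULE PROCEDURE fzp_fun_0d, fzp_fun_2d
   END INTERFACE

   ! Subroutine form: same generic pattern, the result becomes an argument.
   INTERFACE fzp_sub
      MODULE PROCEDURE fzp_sub_0d, fzp_sub_2d
   END INTERFACE

   PUBLIC :: wp, fzp_fun, fzp_sub
CONTAINS
   FUNCTION fzp_fun_0d( psal ) RESULT( ptf )
      REAL(wp), INTENT(in) :: psal
      REAL(wp)             :: ptf
      ptf = coef * psal
   END FUNCTION fzp_fun_0d

   FUNCTION fzp_fun_2d( psal ) RESULT( ptf )
      REAL(wp), DIMENSION(:,:), INTENT(in)           :: psal
      REAL(wp), DIMENSION(SIZE(psal,1),SIZE(psal,2)) :: ptf
      ptf(:,:) = coef * psal(:,:)
   END FUNCTION fzp_fun_2d

   SUBROUTINE fzp_sub_0d( psal, ptf )
      REAL(wp), INTENT(in ) :: psal
      REAL(wp), INTENT(out) :: ptf
      ptf = coef * psal
   END SUBROUTINE fzp_sub_0d

   SUBROUTINE fzp_sub_2d( psal, ptf )
      REAL(wp), DIMENSION(:,:), INTENT(in ) :: psal
      REAL(wp), DIMENSION(:,:), INTENT(out) :: ptf
      ptf(:,:) = coef * psal(:,:)
   END SUBROUTINE fzp_sub_2d
END MODULE fzp_sketch

PROGRAM demo
   USE fzp_sketch
   IMPLICIT NONE
   REAL(wp) :: zs(2,3), ztf_fun(2,3), ztf_sub(2,3)
   zs(:,:) = 35.0_wp
   ztf_fun(:,:) = fzp_fun( zs )      ! function form: used inside an expression
   CALL fzp_sub( zs, ztf_sub )       ! subroutine form: result is an argument
   IF( ANY( ztf_fun /= ztf_sub ) ) STOP 1
   PRINT *, 'function and subroutine forms agree'
END PROGRAM demo
```

The practical cost of the subroutine form is at the call sites: a call that was embedded in an expression has to become a separate CALL that writes into a variable, which is then used in the expression.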

comment:6 Changed 5 years ago by jchanut

  • Version changed from trunk to nemo_v3_6_STABLE

comment:7 Changed 5 years ago by jchanut

eos_fzp has been changed into a subroutine, which enables AGRIF compilation.
All SETTE tests passed successfully at revision 5540, and the change was reported into the trunk (rev. 5541).

Note for ifort users (version 15.0.2):
I found it necessary to use a moderate level of optimization (O1) to run the ORCA2_Agulhas test. I did not look closely at the problem, but it seems to come from what the compiler does in the geo2ocean routine (lowering the optimization level there does the job and produces results identical to the unoptimized case). It's certainly worth a ticket.

Confirmation by others (Pierre-Antoine) needed in order to close this ticket.

http://forge.ipsl.jussieu.fr/nemo/changeset/5540/branches/2015/nemo_v3_6_STABLE

comment:8 Changed 5 years ago by bouttier

  • Resolution set to fixed
  • Status changed from new to closed

The fix works. See the commits referencing #1530.

comment:9 Changed 3 years ago by nemo

  • Description modified (diff)
  • Severity set to minor