source: trunk/libIGCM/AA_monitoring @ 315

Last change on this file since 315 was 315, checked in by mafoipsl, 14 years ago

Add mail when job aborts in mercure front-end header. Correct SendMail function on mercurex9.

  • Property licence set to
    The following licence information concerns ONLY the libIGCM tools
    ==================================================================

    Copyright © Centre National de la Recherche Scientifique CNRS
    Commissariat à l'Énergie Atomique CEA

    libIGCM : Library for Portable Models Computation of IGCM Group.

    IGCM Group is the French IPSL Global Climate Model Group.

    This library is a set of shell scripts and functions whose purpose is
    the management of the initialization, the launch, the transfer of
    output files, the post-processing and the monitoring of data produced
    by any numerical program on any platform.

    This software is governed by the CeCILL license under French law and
    abiding by the rules of distribution of free software. You can use,
    modify and/ or redistribute the software under the terms of the CeCILL
    license as circulated by CEA, CNRS and INRIA at the following URL
    "http://www.cecill.info".

    As a counterpart to the access to the source code and rights to copy,
    modify and redistribute granted by the license, users are provided only
    with a limited warranty and the software's author, the holder of the
    economic rights, and the successive licensors have only limited
    liability.

    In this respect, the user's attention is drawn to the risks associated
    with loading, using, modifying and/or developing or reproducing the
    software by the user in light of its specific status of free software,
    that may mean that it is complicated to manipulate, and that also
    therefore means that it is reserved for developers and experienced
    professionals having in-depth computer knowledge. Users are therefore
    encouraged to load and test the software's suitability as regards their
    requirements in conditions enabling the security of their systems and/or
    data to be ensured and, more generally, to use and operate it in the
    same conditions as regards security.

    The fact that you are presently reading this means that you have had
    knowledge of the CeCILL license and that you accept its terms.
  • Property svn:keywords set to Date Author Revision
File size: 10.0 KB
#-Q- cesium #!/bin/ksh
#-Q- cesium ######################
#-Q- cesium ## CESIUM   CEA ##
#-Q- cesium ######################
#-Q- cesium #MSUB -r MONITORING     # Job name
#-Q- cesium #MSUB -N 1              # Node reservation
#-Q- cesium #MSUB -n 1              # Process reservation
#-Q- cesium #MSUB -T 86400          # Elapsed time limit of the job
#-Q- cesium #MSUB -E "-j o"
#-Q- cesium #MSUB -E "-S /bin/ksh"
#-Q- platine #!/usr/bin/ksh
#-Q- platine ##################
#-Q- platine ## PLATINE   CEA ##
#-Q- platine ##################
#-Q- platine #BSUB -J MONITORING             # Job name
#-Q- platine #BSUB -N                        # Send a message at the end of the job
#-Q- platine #BSUB -n 1                      # Number of processors reserved for the job
#-Q- platine #BSUB -W 1:00                   # Time limit
#-Q- platine #BSUB -q post                   # Submit to the post queue
#-Q- sx8brodie #!/bin/ksh
#-Q- sx8brodie #######################
#-Q- sx8brodie ## SX8BRODIE   IDRIS ##
#-Q- sx8brodie #######################
#-Q- sx8brodie # Maximum elapsed time of a request, hh:mm:ss
#-Q- sx8brodie # @ wall_clock_limit = 20:00:00
#-Q- sx8brodie # LoadLeveler job name
#-Q- sx8brodie # @ job_name   = MONITORING
#-Q- sx8brodie # Standard output file of the job
#-Q- sx8brodie # @ output     = $(job_name).$(jobid)
#-Q- sx8brodie # Standard error file of the job
#-Q- sx8brodie # @ error      =  $(job_name).$(jobid)
#-Q- sx8brodie # Receive a mail if the elapsed time limit is exceeded (or on any other problem)
#-Q- sx8brodie # @ notification = error
#-Q- sx8brodie # @ environment  = $POST_DIR ; $SUBMIT_DIR ; $libIGCM ; $libIGCM_SX ; $R_INIT ; $R_BC ; $StandAlone ; $RESOL_ATM ; $RESOL_OCE ; $RESOL_ICE ; $RESOL_MBG ; $RESOL_SRF ; $R_SAVE ; $config_UserChoices_JobName ; $config_UserChoices_TagName ; $YEARS ; $MASTER
#-Q- sx8brodie # @ queue
#-Q- aix6 #!/bin/ksh
#-Q- aix6 #######################
#-Q- aix6 ##   VARGAS   IDRIS  ##
#-Q- aix6 #######################
#-Q- aix6 # Maximum elapsed time of a request, hh:mm:ss
#-Q- aix6 # @ wall_clock_limit = 20:00:00
#-Q- aix6 # LoadLeveler job name
#-Q- aix6 # @ job_name   = MONITORING
#-Q- aix6 # Standard output file of the job
#-Q- aix6 # @ output     = $(job_name).$(jobid)
#-Q- aix6 # Standard error file of the job
#-Q- aix6 # @ error      =  $(job_name).$(jobid)
#-Q- aix6 # Receive a mail if the elapsed time limit is exceeded (or on any other problem)
#-Q- aix6 # @ notification = error
#-Q- aix6 # @ environment  = $POST_DIR ; $SUBMIT_DIR ; $libIGCM ; $libIGCM_SX ; $R_INIT ; $R_BC ; $StandAlone ; $RESOL_ATM ; $RESOL_OCE ; $RESOL_ICE ; $RESOL_MBG ; $RESOL_SRF ; $R_SAVE ; $config_UserChoices_JobName ; $config_UserChoices_TagName ; $YEARS ; $MASTER
#-Q- aix6 # @ queue
#-Q- sx8mercure #!/bin/ksh
#-Q- sx8mercure ######################
#-Q- sx8mercure ## SX8MERCURE   CEA ##
#-Q- sx8mercure ######################
#-Q- sx8mercure #PBS -N  MONITORING          # Job name
#-Q- sx8mercure #PBS -j o                    # Merge stdout and stderr
#-Q- sx8mercure #PBS -S /usr/bin/ksh         # Submission shell
#-Q- sx8mercure #PBS -l memsz_job=1gb        # Memory limit: 1 GB
#-Q- sx8mercure #PBS -l cputim_job=1:00:00   # CPU time limit: 1 hour
#-Q- sx8mercure #PBS -q scalaire
#-Q- sx9mercure #!/bin/ksh
#-Q- sx9mercure ######################
#-Q- sx9mercure ## SX9MERCURE   CEA ##
#-Q- sx9mercure ######################
#-Q- sx9mercure #PBS -N  MONITORING          # Job name
#-Q- sx9mercure #PBS -m a                    # Send a mail if the job aborts
#-Q- sx9mercure #PBS -j o                    # Merge stdout and stderr
#-Q- sx9mercure #PBS -S /usr/bin/ksh         # Submission shell
#-Q- sx9mercure #PBS -l memsz_job=1gb        # Memory limit: 1 GB
#-Q- sx9mercure #PBS -l cputim_job=1:00:00   # CPU time limit: 1 hour
#-Q- sx9mercure #PBS -q scalaire
#-Q- titane #!/bin/ksh
#-Q- titane ######################
#-Q- titane ## TITANE   CEA ##
#-Q- titane ######################
#-Q- titane #MSUB -r MONITORING     # Job name
#-Q- titane #MSUB -N 1              # Node reservation
#-Q- titane #MSUB -n 1              # Process reservation
#-Q- titane #MSUB -T 86400          # Elapsed time limit of the job
#-Q- titane #MSUB -E "-j o"
#-Q- titane #MSUB -E "-S /bin/ksh"
#-Q- titane ##MSUB -e nco.out        # Standard error
#-Q- titane ##MSUB -o nco.out        # Standard output
#-Q- lxiv8 ######################
#-Q- lxiv8 ## OBELIX      LSCE ##
#-Q- lxiv8 ######################
#-Q- lxiv8 #PBS -N MONITORING
#-Q- lxiv8 #PBS -m a
#-Q- lxiv8 #PBS -j oe
#-Q- lxiv8 #PBS -q medium
#-Q- lxiv8 #PBS -o MONITORING.$$
#-Q- lxiv8 #PBS -S /bin/ksh
#-Q- default #!/bin/ksh
#-Q- default ##################
#-Q- default ## DEFAULT HOST ##
#-Q- default ##################

# $Date$
# $Author$
# $Revision$
# IPSL (2006)
#  This software is governed by the CeCILL licence see libIGCM/libIGCM_CeCILL.LIC

#set -eu
#set -vx

date

#-Q- sx8brodie export OMP_NUM_THREADS=1
#-Q- aix6 export OMP_NUM_THREADS=1

########################################################################

#D- Flag to determine whether this job runs in standalone mode
#D- Default : value from AA_job if any
StandAlone=${StandAlone:=true}

#D- Low level debug : to bypass lib test checks and stack construction
#D- Default : value from AA_job if any
libIGCM=${libIGCM:=::modipsl::/libIGCM}
# WARNING for StandAlone use: to run this script on some machines,
# you must check the MirrorlibIGCM variable in the sys library.
# If this variable is true, you must use the libIGCM_POST path instead
# of your running libIGCM directory.

######################################################################

. ${libIGCM}/libIGCM_debug/libIGCM_debug.ksh
    ( ${DEBUG_debug} ) && IGCM_debug_Check
. ${libIGCM}/libIGCM_card/libIGCM_card.ksh
    ( ${DEBUG_debug} ) && IGCM_card_Check
. ${libIGCM}/libIGCM_date/libIGCM_date.ksh
    ( ${DEBUG_debug} ) && IGCM_date_Check
#-------
. ${libIGCM}/libIGCM_sys/libIGCM_sys.ksh

######################################################################

#set -vx

#===========================================
RUN_DIR=${RUN_DIR_PATH}
IGCM_sys_MkdirWork ${RUN_DIR}
IGCM_sys_Cd ${RUN_DIR}

if [ ${StandAlone} = true ] ; then
    CARD_DIR=${SUBMIT_DIR}
else
    CARD_DIR=${RUN_DIR}/$( basename ${SUBMIT_DIR} )
    IGCM_sys_Get_Master ${SUBMIT_DIR} ${RUN_DIR}
fi

#
# First of all, read the UserChoices and ListOfComponents sections of config.card
#
IGCM_card_DefineArrayFromSection       ${CARD_DIR}/config.card UserChoices
typeset option
for option in ${config_UserChoices[*]} ; do
    IGCM_card_DefineVariableFromOption ${CARD_DIR}/config.card UserChoices ${option}
done
IGCM_card_DefineArrayFromSection       ${CARD_DIR}/config.card ListOfComponents

#==================================
#R_SAVE : Job output directory
# When both SpaceName and ExperimentName are set, use the structured layout
# and keep only the part of JobName after its last underscore.
if [ -n "${config_UserChoices_SpaceName}" ] && [ -n "${config_UserChoices_ExperimentName}" ] ; then
    FreeName=$( echo ${config_UserChoices_JobName} | sed 's/.*_//' )
    R_SAVE=${R_OUT}/${config_UserChoices_TagName}/${config_UserChoices_SpaceName}/${config_UserChoices_ExperimentName}/${FreeName}
    R_DODS=${config_UserChoices_TagName}/${config_UserChoices_SpaceName}/${config_UserChoices_ExperimentName}/${FreeName}
else
    R_SAVE=${R_OUT}/${config_UserChoices_TagName}/${config_UserChoices_JobName}
    R_DODS=${config_UserChoices_TagName}/${config_UserChoices_JobName}
fi
#
IGCM_sys_TestDirArchive ${R_SAVE}/MONITORING
if [ $? = 0 ] ; then
    IGCM_debug_Print 1 "Get MONITORING directory from archive"
    IGCM_sys_Get_Dir ${R_SAVE}/MONITORING ${RUN_DIR}
else
    IGCM_debug_Print 1 "MONITORING first pass. Nothing has been done before"
fi
# --------------------------------------------
# Insert your commands between III...III lines
# and specify the produced directories to save
# IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII

for comp in ${config_ListOfComponents[*]} ; do
    IGCM_debug_Print 1 "################## Component: ${comp} ######################"
    liste_file_monitoring=""
    IGCM_card_DefineArrayFromOption ${CARD_DIR}/config.card ListOfComponents ${comp}
    # Indirect lookup: first element of the array config_ListOfComponents_<comp>
    eval compname=\${config_ListOfComponents_${comp}[0]}                > /dev/null 2>&1

    # A configuration file in ${CARD_DIR}/POST takes precedence over the one in ${FER_ATLAS}
    PATH_monitoring_file=""
    eval monitoring_file=monitoring01_${compname}_\${RESOL_${comp}}.cfg > /dev/null 2>&1
    if [[ -d ${CARD_DIR}/POST && -f ${CARD_DIR}/POST/monitoring01_${compname}.cfg ]] ; then
        PATH_monitoring_file=${CARD_DIR}/POST/monitoring01_${compname}.cfg
    elif [ -f ${FER_ATLAS}/${monitoring_file} ] ; then
        PATH_monitoring_file=${FER_ATLAS}/${monitoring_file}
    else
        IGCM_debug_Print 1 "No monitoring file found for this component. Was expecting ${monitoring_file}"
        IGCM_debug_Print 1 "Skipping to the next component"
        continue
    fi
    #
    if [ X${PATH_monitoring_file} != X"" ] ; then
        IGCM_debug_Print 1 "Monitoring file used : ${PATH_monitoring_file}"
        IGCM_debug_Print 1 "Determine which files we need."
        # First call prints the listing to the log; second call (-q) captures the file list
        . monitoring01 -l2 --listcommand 'IGCM_sys_RshArchive ls' ${PATH_monitoring_file} ${R_SAVE}/${comp}/Analyse/TS_MO
        liste_file_monitoring=$( . monitoring01 -q -l2 --listcommand 'IGCM_sys_RshArchive ls' ${PATH_monitoring_file} ${R_SAVE}/${comp}/Analyse/TS_MO )
    fi
    #
    if [ ! "X${liste_file_monitoring}" = X ] ; then
        IGCM_sys_Get /l liste_file_monitoring ${RUN_DIR}
        IGCM_debug_Print 1 "monitoring01 -c ${CARD_DIR} -p ${comp} --time -t \"${config_UserChoices_JobName} monitoring\" -o ${RUN_DIR}/MONITORING ${PATH_monitoring_file} ."
        IGCM_debug_Print 1 "monitoring01 starts ................................................."
        monitoring01 -c ${CARD_DIR} -p ${comp} --time -t "${config_UserChoices_JobName} monitoring" -o ${RUN_DIR}/MONITORING ${PATH_monitoring_file} .
    else
        IGCM_debug_Print 1 "No time series detected by this command :"
        IGCM_debug_Print 1 "monitoring01 -l2 --listcommand 'IGCM_sys_RshArchive ls' ${PATH_monitoring_file} ${R_SAVE}/${comp}/Analyse/TS_MO"
        . monitoring01 -l2 --listcommand 'IGCM_sys_RshArchive ls' ${PATH_monitoring_file} ${R_SAVE}/${comp}/Analyse/TS_MO
    fi
    #
done

# IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII

# Save files
IGCM_sys_Put_Dir MONITORING ${R_SAVE}

# Dods copy
IGCM_sys_Put_Dods MONITORING

# Clean RUN_DIR_PATH (necessary for cesium)
IGCM_sys_RmRunDir -Rf ${RUN_DIR_PATH}
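
The R_SAVE / R_DODS selection in the script above can be tried in isolation. The following is a minimal sketch, not part of libIGCM: `build_save_paths` is a hypothetical helper that takes plain arguments in place of `R_OUT` and the `config_UserChoices_*` variables, and the sample values (`/arch`, `IPSLCM5A`, `DEVT`, `pdControl`, `v3_piControl2`) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not a libIGCM function): mirrors the
# R_SAVE / R_DODS branch of the script with plain arguments.
# Usage: build_save_paths R_OUT TagName JobName [SpaceName ExperimentName]
build_save_paths() {
    r_out=$1
    tag=$2
    job=$3
    space=${4:-}
    experiment=${5:-}
    if [ -n "${space}" ] && [ -n "${experiment}" ]; then
        # Same sed expression as the script: drop everything up to and
        # including the last underscore of the job name.
        free_name=$(printf '%s\n' "${job}" | sed 's/.*_//')
        r_dods="${tag}/${space}/${experiment}/${free_name}"
    else
        # Flat layout: the full job name is kept.
        r_dods="${tag}/${job}"
    fi
    printf 'R_SAVE=%s\n' "${r_out}/${r_dods}"
    printf 'R_DODS=%s\n' "${r_dods}"
}

# Structured layout (SpaceName and ExperimentName both given):
build_save_paths /arch IPSLCM5A v3_piControl2 DEVT pdControl
# R_SAVE=/arch/IPSLCM5A/DEVT/pdControl/piControl2
# R_DODS=IPSLCM5A/DEVT/pdControl/piControl2

# Flat layout (no SpaceName / ExperimentName):
build_save_paths /arch IPSLCM5A v3_piControl2
# R_SAVE=/arch/IPSLCM5A/v3_piControl2
# R_DODS=IPSLCM5A/v3_piControl2
```

R_DODS is always R_SAVE without the `R_OUT` prefix, which is why the sketch builds the relative path once and derives both values from it.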