== Commands for finding the memory used by jobs ==

 * While the job is running, use "Qstat -r". For example:
{{{
$ Qstat -r -u rgpi001
                                    Dispatch  Data Stack   Rss
Step Id             Owner   Class   Date       Avg   Avg   Max Cpu Used
------------------- ------- ------- -------- ----- ----- ----- ---------
vargas043.964451.0  rgpi001 c32t4   05 08:24   0.0   0.0   0.8 894:55:56
}}}
 The "Rss Max" column reads 0.8 GiB (maximum resident set size, normally equal to data + stack).

 * While the job is running, use "llq -x -l". For example:
{{{
llq -j vargas043.964451.0 -x -l
}}}
 prints pages and pages of information, including:
{{{
Step maxrss: 826588 (in KiB)
}}}
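 The two readings can be cross-checked by converting the maxrss value from KiB to GiB. The sketch below is only an illustration: the function name `kib_to_gib` is hypothetical and is not part of LoadLeveler or of the site tools.

```shell
# Hypothetical helper: convert a size in KiB to GiB (1 GiB = 1048576 KiB),
# rounded to two decimals.
kib_to_gib() {
    awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib / 1048576 }'
}

# maxrss value reported by llq for the job above
kib_to_gib 826588
```

 This prints 0.79, which matches the 0.8 GiB shown by Qstat after rounding to one decimal.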
 * To get information at the end of the run, use "hpccount". Use it like "time", as a prefix to the executable. The overhead is negligible.

 Example for a sequential executable:
{{{
module load hpccount
hpccount ce0l
}}}
 prints a summary at the end of the run, including "Maximum resident set size":
{{{
hpccount v3.2.1 (IHPCT v2.2.0) summary

######## Resource Usage Statistics ########

Total amount of time in user mode : 101.501224 seconds
Total amount of time in system mode : 0.084285 seconds
Maximum resident set size : 289888 Kbytes
Average shared memory use in text segment : 185030 Kbytes*sec
Average unshared memory use in data segment : 23379240 Kbytes*sec
Number of page faults without I/O activity : 72046
Number of page faults with I/O activity : 464
Number of times process was swapped out : 0
Number of times file system performed INPUT : 0
Number of times file system performed OUTPUT : 0
Number of IPC messages sent : 0
Number of IPC messages received : 0
Number of signals delivered : 0
Number of voluntary context switches : 102
Number of involuntary context switches : 184

####### End of Resource Statistics ########

Execution time (wall clock time) : 103.605896331836 seconds

PM_FPU_1FLOP (FPU executed one flop instruction ) : 14242086741
PM_FPU_FMA (FPU executed multiply-add instruction) : 906129309
PM_FPU_FSQRT_FDIV (FPU executed FSQRT or FDIV instruction) : 56844327
PM_FPU_FLOP (FPU executed 1FLOP, FMA, FSQRT or FDIV instruction) : 15205060377
PM_RUN_INST_CMPL (Run instructions completed) : 108472281030
PM_RUN_CYC (Run cycles) : 478031532649

Utilization rate : 98.085 %
Instructions per run cycle : 0.227
Total floating point operations : 16111.190 M
Flop rate (flops / WCT) : 155.505 Mflop/s
Flops / user time : 158.540 Mflop/s
Algebraic floating point operations : 16054.345 M
Algebraic flop rate (flops / WCT) : 154.956 Mflop/s
Algebraic flops / user time : 157.980 Mflop/s
FMA percentage : 11.248 %
% of peak performan
}}}
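 When only the peak memory is of interest, the relevant line can be pulled out of the summary with awk. This is a minimal sketch, assuming the summary uses the "Maximum resident set size : N Kbytes" layout shown in the sample; the function name `max_rss_kbytes` is hypothetical.

```shell
# Hypothetical helper: read an hpccount summary on stdin and print the
# peak resident set size in Kbytes (first matching line only).
max_rss_kbytes() {
    awk -F':' '/Maximum resident set size/ {
        split($2, f, " ")   # f[1] = number, f[2] = unit
        print f[1]
        exit
    }'
}

# Example using the line from the sample summary
printf '%s\n' 'Maximum resident set size : 289888 Kbytes' | max_rss_kbytes
```

 This prints 289888. If the summary has been saved to a file, the function can read it on standard input in the same way.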
 * Example for a parallel executable:
{{{
export HPM_ASC_OUTPUT=yes
export HPM_AGGREGATE=average.so
poe hpccount -o hpccount_out -u -n gcm -procs 4 -stdoutmode 0
}}}
 creates a file "hpccount_out_vargas....hpm" containing the information averaged over the MPI processes.