#445 closed Bug (fixed)
Performance of NEMO 3.1
Reported by: | ebehrens | Owned by: | nemo |
---|---|---|---|
Priority: | high | Milestone: | |
Component: | OCE | Version: | v3.1 |
Severity: | | Keywords: | MPI |
Cc: |
Description
NEMO 3.1 on a NEC SX9 delivers only half the performance of NEMO 2.3. This can be attributed to the communication along the north-fold boundary condition. We have performed the following tests:
- ORCA05 on 4x1 PE under NEMO 2.3
- ORCA05 on 4x1 PE under NEMO 3.1 (with IOF)
- ORCA05 on 4x1 PE under NEMO 3.1 jperio=1 (with IOF)
Commit History (0)
(No commits)
Attachments (6)
Change History (13)
Changed 15 years ago by ebehrens
Changed 15 years ago by ebehrens
Changed 15 years ago by ebehrens
Changed 15 years ago by ebehrens
comment:1 follow-up: ↓ 2 Changed 15 years ago by rblod
Hi
- it would be nice to have the global MPI information for both runs with NEMO_3.1 too
- running 3.1 without IOF would also help to clarify the situation
- finally, on NEC computers we usually run with a decomposition that cuts only along latitude, both for vectorization and to get rid of the north-fold condition
Rachid
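The last point can be illustrated with a minimal sketch. It assumes a strongly simplified fold in which global column `i` of the northernmost row is paired with the mirrored column `n_cols - 1 - i` (the real NEMO fold has pivot offsets that are ignored here), and a plain block decomposition in the i direction. The function names and the grid width are hypothetical and are not taken from the NEMO code.

```python
def column_owner(i, n_cols, jpni):
    """Index (0..jpni-1) of the process column that owns global column i."""
    chunk = -(-n_cols // jpni)  # ceiling division: columns per process
    return i // chunk


def north_fold_partners(jpni, n_cols):
    """For each process on the northern row, collect the other processes
    that hold its mirrored (fold) columns.

    Assumption: simplified fold, column i is paired with n_cols - 1 - i.
    """
    partners = {p: set() for p in range(jpni)}
    for i in range(n_cols):
        owner = column_owner(i, n_cols, jpni)
        mirror_owner = column_owner(n_cols - 1 - i, n_cols, jpni)
        if mirror_owner != owner:
            # The fold partner lives on another process: a message is needed.
            partners[owner].add(mirror_owner)
    return partners


# 4x1: four processes along i, each one needs data from another process.
print(north_fold_partners(jpni=4, n_cols=16))
# 1x4: a single process along i, the fold is entirely local.
print(north_fold_partners(jpni=1, n_cols=16))
```

Under these assumptions, a 4x1 layout makes every process exchange fold data with another one, whereas in a 1x4 layout only the northernmost process touches the fold and all of its partners are local.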
Changed 15 years ago by ebehrens
Changed 15 years ago by ebehrens
comment:2 in reply to: ↑ 1 ; follow-up: ↓ 3 Changed 15 years ago by ebehrens
Replying to rblod:
> Hi
> - it would be nice to have the global MPI information for both runs with NEMO_3.1 too
> - running 3.1 without IOF would also help to clarify the situation
> - finally, on NEC computers we usually run with a decomposition that cuts only along latitude, both for vectorization and to get rid of the north-fold condition
>
> Rachid
Hi
- MPI information: see attachments
- the comparison between IOF and offline is under way
Erik
comment:3 in reply to: ↑ 2 Changed 15 years ago by ebehrens
Hi
- concerning "cutting only along latitude" (1x4) and IOF (extra halos): we have problems and will submit a separate ticket after verifying the results
Erik
comment:4 Changed 15 years ago by rblod
Hi
I just noticed in a previous mail from Markus that he was using the noopt_ieee option for compilation. If that is the case in your benchmark, this option should not be used, as it can slow down the code drastically.
Rachid
comment:5 Changed 15 years ago by ebehrens
- Resolution set to fixed
- Status changed from new to closed
comment:6 Changed 5 years ago by andmirek
In 10659:
comment:7 Changed 5 years ago by andmirek
In 10660:
Ftrace output for the performed experiments