#2001 closed Task (worksforme)
OpenMP in MO eORCA012
Reported by: | andmirek | Owned by: | systeam |
---|---|---|---|
Priority: | low | Milestone: | Unscheduled |
Component: | OCE | Version: | v3.6 |
Severity: | minor | Keywords: | OPA v3.6 |
Cc: |
Description
Context
This ticket investigates the performance of the branch with OpenMP directives in the Met Office
eORCA012 configuration (NEMO 3.6).
Implementation plan
Add OpenMP directives and investigate performance. The configuration is based on MO suite mi-ao921@70665.
See the Met Office twiki for details: http://www-twiki/Main/GO6eORCA12TechnicalImprovements
Commit History (3)
Changeset | Author | Time | ChangeLog |
---|---|---|---|
9616 | andmirek | 2018-05-22T11:09:09+02:00 | #2001 few additional changes |
9176 | andmirek | 2018-01-04T13:30:03+01:00 | #2001: OMP directives |
9175 | andmirek | 2018-01-04T13:26:40+01:00 | #2001: OMP for eORCA12 |
Change History (7)
comment:1 Changed 7 years ago by andmirek
comment:2 Changed 7 years ago by andmirek
comment:3 Changed 7 years ago by andmirek
In 9176:
comment:4 Changed 7 years ago by andmirek
The branch at 9176 was used for the tests. It gives the same solution regardless of the number of OMP threads for a 1-day simulation.
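Thread-count independence is not automatic, because floating-point addition is not associative: a sum split into per-thread partial sums can differ from the serial sum. The sketch below only illustrates that general effect in Python. It is not taken from the branch, and the helper `chunked_sum` is a hypothetical stand-in for how a parallel reduction might group terms.

```python
import math


def chunked_sum(values, n_chunks):
    """Add values in n_chunks contiguous partial sums, then combine them.

    Hypothetical imitation of a parallel reduction; explicit loops are used
    so that every addition is a plain left-to-right floating-point add.
    """
    size = math.ceil(len(values) / n_chunks)
    total = 0.0
    for start in range(0, len(values), size):
        partial = 0.0
        for v in values[start:start + size]:
            partial += v
        total += partial
    return total


# Values chosen so that rounding depends on the grouping of the additions.
values = [1.0e16, 1.0, -1.0e16, 1.0]
serial = chunked_sum(values, 1)       # one "thread"
two_threads = chunked_sum(values, 2)  # two "threads"
print(serial, two_threads, serial == two_threads)
```

The two groupings give different results for the same input, which is why bit-identical output across thread counts has to be checked rather than assumed.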
comment:5 Changed 7 years ago by andmirek
- Resolution set to worksforme
- Status changed from new to closed
Results:
MPI decomposition | OMP threads | Number of CPUs | Execution time [s] | Comments |
---|---|---|---|---|
36x30 | 1 | 742 | 10600 | Model failed when writing the restart file because of memory limits. The execution time is approximate. |
36x30 | 2 | 1484 | 4957 | |
36x30 | 3 | 2226 | 4080 | |
36x30 | 6 | 4452 | 3223 | |
40x40 | 1 | 1057 | 6019 | |
40x40 | 2 | 2114 | 3924 | |
40x40 | 3 | 3171 | 3044 | |
40x40 | 6 | 6342 | 2557 | |
50x50 | 1 | 1609 | 3953 | |
50x50 | 2 | 3218 | 2591 | |
50x50 | 3 | 4827 | 2092 | |
50x50 | 6 | 9654 | 1729 | |
70x70 | 1 | 3029 | 2247 | |
70x70 | 2 | 6058 | 1488 | |
70x70 | 3 | 9087 | 1355 | |
128x108 | 1 | 7693 | 1207 | |
128x108 | 2 | 15386 | 929 | |
Based on this branch (results are the same regardless of the number of threads; not all routines have OMP directives; and for some loops OMP directives were not inserted because they would change the results), using 1 OMP thread gives the best performance. There is a range of 5000-10000 processors where using 2 OMP threads has a chance of being as efficient as an MPI-only job. However, because land suppression was used (which reduces the number of processors by ~30%), it is not possible to set up different configurations that use exactly the same number of processors.
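One way to compare runs that use different processor counts is to look at the total cost, CPUs multiplied by execution time. The sketch below computes that figure in CPU-hours for every row of the table above. The cost metric and the helper `cpu_hours` are assumptions introduced here for illustration; the ticket itself does not define how efficiency was judged.

```python
# (MPI decomposition, OMP threads, CPUs, execution time [s]), copied from the table.
# The 36x30 single-thread time is approximate (that run failed writing the restart).
runs = [
    ("36x30", 1, 742, 10600),
    ("36x30", 2, 1484, 4957),
    ("36x30", 3, 2226, 4080),
    ("36x30", 6, 4452, 3223),
    ("40x40", 1, 1057, 6019),
    ("40x40", 2, 2114, 3924),
    ("40x40", 3, 3171, 3044),
    ("40x40", 6, 6342, 2557),
    ("50x50", 1, 1609, 3953),
    ("50x50", 2, 3218, 2591),
    ("50x50", 3, 4827, 2092),
    ("50x50", 6, 9654, 1729),
    ("70x70", 1, 3029, 2247),
    ("70x70", 2, 6058, 1488),
    ("70x70", 3, 9087, 1355),
    ("128x108", 1, 7693, 1207),
    ("128x108", 2, 15386, 929),
]


def cpu_hours(cpus, seconds):
    """Total cost of a run in CPU-hours (hypothetical metric, not from the ticket)."""
    return cpus * seconds / 3600.0


# List the runs by increasing CPU count so neighbouring configurations can be compared.
for decomp, threads, cpus, seconds in sorted(runs, key=lambda r: r[2]):
    print(f"{decomp:>8}  {threads} thread(s)  {cpus:>6} CPUs  "
          f"{seconds:>6} s  {cpu_hours(cpus, seconds):8.1f} CPU-h")
```

Sorting by CPU count places, for example, the 70x70 run with 2 threads (6058 CPUs) next to the 128x108 run with 1 thread (7693 CPUs), which is the nearest like-for-like comparison the data allows in the 5000-10000 processor range.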
comment:6 Changed 7 years ago by andmirek
In 9616:
comment:7 Changed 3 years ago by nemo
- Keywords OPA v3.6 added
In 9175: