Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#2001 closed Task (worksforme)

OpenMP in MO eORCA012

Reported by: andmirek
Owned by:
Priority: low
Milestone: Unscheduled
Component: OCE
Version: release-3.6
Severity: minor
Keywords:
Cc:
Review: failed
MP ready?: no
Progress: Unspecified

Description

Context

This ticket investigates the performance of the branch with OpenMP directives in the Met Office
eORCA012 configuration (NEMO 3.6).

Implementation plan

Add OpenMP directives and investigate performance. The configuration is based on MO suite mi-ao921@70665.
See the Met Office twiki for details: http://www-twiki/Main/GO6eORCA12TechnicalImprovements

Commit History (3)

Changeset  Author    Time                       ChangeLog
9616       andmirek  2018-05-22T11:09:09+02:00  #2001 few additionale changes
9176       andmirek  2018-01-04T13:30:03+01:00  #2001: OMP directives
9175       andmirek  2018-01-04T13:26:40+01:00  #2001: OMP for eORCA12

Change History (6)

comment:1 Changed 3 years ago by andmirek

In 9175:

#2001: OMP for eORCA12

comment:3 Changed 3 years ago by andmirek

In 9176:

#2001: OMP directives

comment:4 Changed 3 years ago by andmirek

The branch at changeset 9176 was used for the tests. For a 1-day simulation it gives the same solution regardless of the number of OMP threads.

comment:5 Changed 3 years ago by andmirek

  • Resolution set to worksforme
  • Status changed from new to closed

Results:

RUN      OMP threads  Nr. of CPU  Execution time [s]  Comments
36x30    1            742         10600               Model failed when writing the restart because of memory. Execution time is an approximation.
36x30    2            1484        4957
36x30    3            2226        4080
36x30    6            4452        3223
40x40    1            1057        6019
40x40    2            2114        3924
40x40    3            3171        3044
40x40    6            6342        2557
50x50    1            1609        3953
50x50    2            3218        2591
50x50    3            4827        2092
50x50    6            9654        1729
70x70    1            3029        2247
70x70    2            6058        1488
70x70    3            9087        1355
128x108  1            7693        1207
128x108  2            15386       929
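One way to read the timings above is to convert each run to a cost in CPU-seconds (Nr. of CPU multiplied by execution time) and to a threading efficiency relative to the 1-thread run of the same decomposition. The sketch below is only an illustration of that arithmetic; the names `RUNS`, `cpu_seconds` and `thread_efficiency` are made up here and are not part of the branch, and the efficiency definition (t1 / (threads * tN)) is an assumption about what is a useful metric.

```python
# Timings copied from the results table:
# (decomposition, OMP threads, Nr. of CPU, execution time in seconds).
RUNS = [
    ("36x30", 1, 742, 10600),  # approximate: run failed while writing the restart
    ("36x30", 2, 1484, 4957),
    ("36x30", 3, 2226, 4080),
    ("36x30", 6, 4452, 3223),
    ("40x40", 1, 1057, 6019),
    ("40x40", 2, 2114, 3924),
    ("40x40", 3, 3171, 3044),
    ("40x40", 6, 6342, 2557),
    ("50x50", 1, 1609, 3953),
    ("50x50", 2, 3218, 2591),
    ("50x50", 3, 4827, 2092),
    ("50x50", 6, 9654, 1729),
    ("70x70", 1, 3029, 2247),
    ("70x70", 2, 6058, 1488),
    ("70x70", 3, 9087, 1355),
    ("128x108", 1, 7693, 1207),
    ("128x108", 2, 15386, 929),
]


def cpu_seconds(cpus, seconds):
    """Total cost of a run: number of CPUs times wall-clock time."""
    return cpus * seconds


def thread_efficiency(runs):
    """Map (decomposition, threads) -> t1 / (threads * tN).

    t1 is the 1-thread time of the same decomposition; 1.0 would mean
    that adding threads scales perfectly.
    """
    baseline = {decomp: secs for decomp, threads, _, secs in runs if threads == 1}
    return {
        (decomp, threads): baseline[decomp] / (threads * secs)
        for decomp, threads, _, secs in runs
    }


if __name__ == "__main__":
    eff = thread_efficiency(RUNS)
    print(f"{'RUN':>8} {'thr':>3} {'CPU':>6} {'t[s]':>6} {'CPU-s':>10} {'eff':>5}")
    for decomp, threads, cpus, secs in RUNS:
        cost = cpu_seconds(cpus, secs)
        print(f"{decomp:>8} {threads:>3} {cpus:>6} {secs:>6} {cost:>10} "
              f"{eff[(decomp, threads)]:>5.2f}")
```

With these numbers, 70x70 with 2 threads (6058 CPUs) costs about 9.0e6 CPU-seconds, against about 9.3e6 for the MPI-only 128x108 run (7693 CPUs). The 36x30 baseline is approximate, so efficiencies for that decomposition should be treated with caution.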

Based on this branch, using 1 OMP thread gives the best performance. This comes with caveats: results are the same regardless of the number of threads, not all routines have OMP directives, and for some loops directives were not inserted because they would change the results. There is a range of roughly 5000-10000 processors where using 2 OMP threads has a chance of being as efficient as an MPI-only job. However, because land suppression was used (which reduces the number of processors by ~30%), it is impossible to run different configurations on exactly the same number of processors.
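The ticket does not say which loops would have changed results under OpenMP. A common cause in general is a reduction: each thread accumulates a partial sum, and floating-point addition is not associative, so the grouping of terms affects the rounded result. The sketch below only illustrates that effect in plain Python under that assumption; `running_sum` and `chunked_sum` are hypothetical helpers, not code from the branch, and the "threads" are simulated by splitting the list into chunks.

```python
def running_sum(values):
    """Plain left-to-right accumulation, as a serial loop would do."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Sum in n_chunks partial sums, then combine the partials.

    This mimics a per-thread reduction: each simulated "thread" sums
    its own contiguous chunk, and the partial results are added last.
    """
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [
        running_sum(values[i:i + size]) for i in range(0, len(values), size)
    ]
    return running_sum(partials)


# The exact sum is 1.0, but each 0.5 is lost when added to +/-1.0e16.
values = [1.0e16, 0.5, -1.0e16, 0.5]

one_thread = chunked_sum(values, 1)   # ((1e16 + 0.5) - 1e16) + 0.5
two_threads = chunked_sum(values, 2)  # (1e16 + 0.5) + (-1e16 + 0.5)

print(one_thread, two_threads)
```

The two groupings give different answers for the same data, which is the kind of thread-count dependence that keeping such loops serial avoids.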

comment:6 Changed 3 years ago by andmirek

In 9616:

#2001 few additionale changes
