Opened 8 years ago

Closed 8 years ago

#1121 closed Bug (fixed)

OBC in parallel

Reported by: julienjouanno
Owned by: epico
Priority: low
Milestone:
Component: OCE
Version: trunk
Severity:
Keywords:
Cc: epico
Branch review:
MP ready?:
Task progress:

Description

It seems there is a problem with OBC when running in parallel.

It works fine with a small number of processes (1, 2, 4): the solver.stat files are then strictly identical. With 8 processes it depends on the domain decomposition (2x4 works but 4x2 does not). With a higher number of processes it no longer works and fails with MPI errors at the first or second time step.

This problem seems to be solved (I ran a full year!) when the current mppobc subroutine is replaced with the mppobc from nemo_3.4.

Commit History (2)

Changeset  Author  Time                       ChangeLog
4047       epico   2013-10-01T06:59:38+02:00  bug fix in mppobc. ticket #1121
4017       epico   2013-09-09T15:15:27+02:00  Bug fixed in mppobc routine. see Ticket #1121

Change History (2)

comment:1 (in reply to the description) changed 8 years ago by epico

  • Cc epico added
  • Owner changed from NEMO team to epico
  • Status changed from new to assigned

Dear Julien

We have tested the OBC again in our configuration with NEMO v3.5 and did not find any problems with up to 32 processes.

Could you provide us with your configuration/settings, or more details on the error?

best regards
Italo Epicoco


comment:2 (in reply to comment 1) changed 8 years ago by epico

  • Resolution set to fixed
  • Status changed from assigned to closed

The bug concerns the communication pattern among the processes along the open boundary in the mppobc routine: it arises when some of the processes on the boundary have no points to send.

The bug has been fixed.

The fix has been tested with the NATL025 configuration provided by julien, using 64 processes, with a 6-month run and a restartability check (3 months + 3 months).
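To illustrate the failure mode described above, here is a minimal sketch in plain Python (no MPI, and not NEMO code; the function names, rank layout, and point counts are all hypothetical). It contrasts a fragile pattern, where only ranks that own boundary points take part in the exchange, with a robust one where every rank participates even with a zero-length contribution:

```python
# Hypothetical sketch of the zero-points pitfall in a boundary exchange.
# "points_per_rank" models the boundary points owned by each MPI rank;
# an empty list means that rank's subdomain touches the boundary but
# holds no boundary points after decomposition.

def naive_gather(points_per_rank):
    # Fragile pattern: only ranks that own points post a message. The
    # root expects one message per boundary rank, so any rank with zero
    # points leaves the pattern mismatched (an MPI error or hang).
    senders = [r for r, pts in enumerate(points_per_rank) if pts]
    if len(senders) != len(points_per_rank):
        raise RuntimeError("rank(s) with no boundary points never sent")
    return [p for pts in points_per_rank for p in pts]

def robust_gather(points_per_rank):
    # Robust pattern: every rank participates unconditionally, first
    # exchanging its point count (possibly zero). Zero-length sends are
    # legal no-ops, so the message pattern always matches.
    counts = [len(pts) for pts in points_per_rank]
    data = []
    for pts in points_per_rank:
        data.extend(pts)  # empty lists contribute nothing, harmlessly
    return counts, data
```

With a decomposition where rank 1 owns no boundary points, `naive_gather([[1, 2], [], [3]])` raises, while `robust_gather` returns the counts `[2, 0, 1]` plus the gathered data. This mirrors why the failure appeared only for some process counts and decompositions: it depends on whether the domain splitting leaves any boundary rank empty.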

