wiki:2020WP/HPC-10_mcastril_HPDAonlineDiagGPU

Version 2 (modified by mcastril, 4 years ago)


Name and subject of the action

The PI is responsible for closely following the progress of the action, and especially for contacting the NEMO project manager if the preview (or review) takes longer than the expected 2 weeks.

Summary

Action	HPC-10_mcastril_HPDAonlineDiagGPU
PI(S)	Miguel Castrillo
Digest	High Performance GPU Diagnostics Online, 2nd phase. After the dia_hsb diagnostic was successfully ported into a toy model with a 50x speedup, this task will focus on implementing the rest of the diagnostics and improving the data transfer between CPU and GPU.
Dependencies	HPC-04_MCastrillo_HPDAonlineDiagGPU (completed)
Branch	source:/NEMO/branches/{YEAR}/dev_r{REV}_{ACTION_NAME}
Previewer(s)	Italo Epicoco
Reviewer(s)	Italo Epicoco
Ticket	#XXXX

Description

High performance data analytics solutions aimed at the online diagnostics of the NEMO model will be explored as complementary components in the model diagnostics software ecosystem. Online techniques leveraging fast (low-latency and real-time) data analytics approaches (e.g. on fat nodes) will be evaluated in real cluster environments. In particular, an interface between NEMO and the High Performance Data Analytics (HPDA) framework will be designed and implemented for online diagnostics.

The rationale of this activity is to improve the computational performance of NEMO by executing the diagnostics computations on GPUs.

Implementation

As a first step, the portability of NEMO diagnostic calculations to GPUs has been analyzed, exploring how to adapt these regions from the current MPI implementation to the CUDA paradigm. A toy model has been created for preliminary tests, which were done using the dia_hsb diagnostic. The diagnostic code itself runs 50x faster on the GPU than on a single CPU, but the data transfer to and from the GPU is the main bottleneck.
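
To illustrate why the transfer dominates, the following sketch computes the effective speedup once the copy time is added to the GPU kernel time. Only the 50x kernel speedup comes from the measurements above; the function name `effective_speedup` and the timings in the example are hypothetical and are not taken from the toy model.

```python
def effective_speedup(t_cpu, kernel_speedup, t_transfer):
    """Speedup of an offloaded diagnostic once CPU<->GPU copies are counted.

    t_cpu          -- time of the diagnostic on the CPU (seconds)
    kernel_speedup -- how much faster the kernel alone runs on the GPU
    t_transfer     -- total time spent copying data to and from the GPU
    """
    t_gpu_kernel = t_cpu / kernel_speedup
    return t_cpu / (t_gpu_kernel + t_transfer)


if __name__ == "__main__":
    # Illustrative numbers only: a 1 s diagnostic, the 50x kernel speedup,
    # and an assumed 0.2 s spent on transfers (not a measured value).
    print(effective_speedup(1.0, 50.0, 0.2))
```

Under these assumed numbers the kernel takes 0.02 s while the copies take 0.2 s, so the overall gain is far below 50x. This is the motivation for hiding or reducing the transfers rather than optimizing the kernel further.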

We are working on an asynchronous strategy to hide all communication between the CPU and the GPU. We also plan to increase the efficiency of the overall solution by mitigating the impact of the offloaded data and by extending our approach to the rest of the diagnostics.
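
The potential benefit of an asynchronous strategy can be estimated with a small timing model. The sketch below is not CUDA code and does not describe the actual implementation of this action: it assumes the data is split into equal chunks, and that host-to-device copies, kernels, and device-to-host copies can each run concurrently with the other two stages (in CUDA this kind of overlap is commonly obtained with `cudaMemcpyAsync` on several streams and page-locked host memory). The function names and the stage times are hypothetical.

```python
def serial_time(n_chunks, t_h2d, t_kernel, t_d2h):
    """Total time when each chunk is copied in, processed, and copied out in turn."""
    return n_chunks * (t_h2d + t_kernel + t_d2h)


def pipelined_time(n_chunks, t_h2d, t_kernel, t_d2h):
    """Total time when the three stages overlap across chunks.

    Each stage (upload, kernel, download) handles one chunk at a time,
    and a chunk enters a stage only after it has left the previous one.
    """
    stages = (t_h2d, t_kernel, t_d2h)
    stage_free_at = [0.0] * len(stages)  # when each stage finished its latest chunk
    for _ in range(n_chunks):
        chunk_ready_at = 0.0  # when this chunk left the previous stage
        for s, cost in enumerate(stages):
            start = max(chunk_ready_at, stage_free_at[s])
            stage_free_at[s] = start + cost
            chunk_ready_at = stage_free_at[s]
    return stage_free_at[-1]


if __name__ == "__main__":
    # Illustrative stage times per chunk, in arbitrary units (not measurements).
    print(serial_time(4, 2.0, 1.0, 2.0))
    print(pipelined_time(4, 2.0, 1.0, 2.0))
```

In this model the pipelined time approaches the number of chunks multiplied by the slowest stage. Overlap therefore hides the kernel time behind the copies, but when the copies themselves are the slowest stage, reducing the volume of offloaded data remains necessary.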

Documentation updates

Preview

...

Tests

...

Review

...
