- Timestamp:
- 07/13/18 14:18:28 (6 years ago)
- Location:
- XIOS/dev/branch_openmp/Note
- Files:
-
- 5 edited
XIOS/dev/branch_openmp/Note/rapport ESIWACE.aux
r1552 r1560 15 15 \newlabel{fig:sendrecv}{{6}{5}} 16 16 \citation{ep:2018} 17 \@writefile{lof}{\contentsline {figure}{\numberline {7}{\ignorespaces }}{6}}17 \@writefile{lof}{\contentsline {figure}{\numberline {7}{\ignorespaces \input "rapport ESIWACE"-1.cpt\relax }}{6}} 18 18 \newlabel{fig:bcast}{{7}{6}} 19 \@writefile{lof}{\contentsline {figure}{\numberline {8}{\ignorespaces }}{6}}19 \@writefile{lof}{\contentsline {figure}{\numberline {8}{\ignorespaces \input "rapport ESIWACE"-2.cpt\relax }}{6}} 20 20 \newlabel{fig:allreduce}{{8}{6}} 21 \citation{ep:2018} 21 22 \citation{ep:2018} 22 23 \bibstyle{plain} 23 24 \bibdata{reference} 24 25 \bibcite{ep:2018}{1} 26 \@writefile{toc}{\contentsline {section}{\numberline {3}The multi-threaded XIOS and performance results}{7}} 27 \@writefile{toc}{\contentsline {subsection}{\numberline {3.1}LMDZ work-flow}{7}} 25 28 \bibcite{Dinan:2013}{2} 26 29 \bibcite{Sridharan:2014}{3} 27 \@writefile{toc}{\contentsline {s ection}{\numberline {3}The multi-threaded XIOS and performance results}{7}}28 \@writefile{toc}{\contentsline {section}{\numberline {4}Future works for XIOS}{ 7}}30 \@writefile{toc}{\contentsline {subsection}{\numberline {3.2}CMIP6 work-flow}{8}} 31 \@writefile{toc}{\contentsline {section}{\numberline {4}Future works for XIOS}{8}} -
XIOS/dev/branch_openmp/Note/rapport ESIWACE.log
r1552 r1560 1 This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex 2017.8.24) 27 JUN 2018 15:101 This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex 2017.8.24) 13 JUL 2018 12:17 2 2 entering extended mode 3 3 restricted \write18 enabled. … … 499 499 \verbatim@in@stream=\read1 500 500 ) 501 (/usr/share/texlive/texmf-dist/tex/latex/cprotect/cprotect.sty 502 Package: cprotect 2011/01/27 v1.0e (Bruno Le Floch) 503 504 (/usr/share/texlive/texmf-dist/tex/latex/base/ifthen.sty 505 Package: ifthen 2014/09/29 v1.1c Standard LaTeX ifthen package (DPC) 506 ) 507 (/usr/share/texlive/texmf-dist/tex/latex/bigfoot/suffix.sty 508 Package: suffix 2006/07/15 1.5a Variant command support 509 ) 510 \CPT@WriteOut=\write3 511 \c@CPT@WriteCount=\count109 512 \c@CPT@numB=\count110 513 \CPT@commandatend@toks=\toks27 514 ) 501 515 (./rapport ESIWACE.aux) 502 516 \openout1 = `"rapport ESIWACE.aux"'. 503 517 504 LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 1 7.505 LaTeX Font Info: ... okay on input line 1 7.506 LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 1 7.507 LaTeX Font Info: ... okay on input line 1 7.508 LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 1 7.509 LaTeX Font Info: ... okay on input line 1 7.510 LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 1 7.511 LaTeX Font Info: ... okay on input line 1 7.512 LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 1 7.513 LaTeX Font Info: ... okay on input line 1 7.514 LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 1 7.515 LaTeX Font Info: ... okay on input line 1 7.518 LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 18. 519 LaTeX Font Info: ... okay on input line 18. 520 LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 18. 521 LaTeX Font Info: ... okay on input line 18. 
522 LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 18. 523 LaTeX Font Info: ... okay on input line 18. 524 LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 18. 525 LaTeX Font Info: ... okay on input line 18. 526 LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 18. 527 LaTeX Font Info: ... okay on input line 18. 528 LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 18. 529 LaTeX Font Info: ... okay on input line 18. 516 530 517 531 (/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii 518 532 [Loading MPS to PDF converter (version 2006.09.02).] 519 \scratchcounter=\count1 09533 \scratchcounter=\count111 520 534 \scratchdimen=\dimen120 521 535 \scratchbox=\box30 522 \nofMPsegments=\count11 0523 \nofMParguments=\count11 1524 \everyMPshowfont=\toks2 7525 \MPscratchCnt=\count11 2536 \nofMPsegments=\count112 537 \nofMParguments=\count113 538 \everyMPshowfont=\toks28 539 \MPscratchCnt=\count114 526 540 \MPscratchDim=\dimen121 527 \MPnumerator=\count11 3528 \makeMPintoPDFobject=\count11 4529 \everyMPtoPDFconversion=\toks2 8541 \MPnumerator=\count115 542 \makeMPintoPDFobject=\count116 543 \everyMPtoPDFconversion=\toks29 530 544 ) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/pdftexcmds.sty 531 545 Package: pdftexcmds 2011/11/29 v0.20 Utility functions of pdfTeX for LuaTeX (HO … … 576 590 e 577 591 )) 578 \c@lstlisting=\count11 5592 \c@lstlisting=\count117 579 593 580 594 <Charge1.png, id=1, 412.54124pt x 228.10219pt> 581 595 File: Charge1.png Graphic file (type png) 582 596 <use Charge1.png> 583 Package pdftex.def Info: Charge1.png used on input line 3 3.597 Package pdftex.def Info: Charge1.png used on input line 34. 584 598 (pdftex.def) Requested size: 165.01357pt x 91.23924pt. 585 599 … … 587 601 File: Charge2.png Graphic file (type png) 588 602 <use Charge2.png> 589 Package pdftex.def Info: Charge2.png used on input line 3 4.603 Package pdftex.def Info: Charge2.png used on input line 35. 
590 604 (pdftex.def) Requested size: 165.25446pt x 91.05858pt. 591 605 [1 … … 596 610 File: domain.pdf Graphic file (type pdf) 597 611 <use domain.pdf> 598 Package pdftex.def Info: domain.pdf used on input line 6 0.612 Package pdftex.def Info: domain.pdf used on input line 64. 599 613 (pdftex.def) Requested size: 236.1567pt x 71.13055pt. 600 614 … … 602 616 File: omp.pdf Graphic file (type pdf) 603 617 <use omp.pdf> 604 Package pdftex.def Info: omp.pdf used on input line 68.618 Package pdftex.def Info: omp.pdf used on input line 78. 605 619 (pdftex.def) Requested size: 291.64784pt x 126.32893pt. 606 620 [2 <./domain.pdf>] … … 608 622 File: scheme.png Graphic file (type png) 609 623 <use scheme.png> 610 Package pdftex.def Info: scheme.png used on input line 86.624 Package pdftex.def Info: scheme.png used on input line 102. 611 625 (pdftex.def) Requested size: 266.18977pt x 207.17032pt. 612 626 [3 <./omp.pdf>] … … 614 628 File: tag.png Graphic file (type png) 615 629 <use tag.png> 616 Package pdftex.def Info: tag.png used on input line 1 40.630 Package pdftex.def Info: tag.png used on input line 156. 617 631 (pdftex.def) Requested size: 301.11966pt x 41.35376pt. 618 632 [4 <./scheme.png (PNG copy)>] <sendrecv.png, id=41, 829.0975pt x 694.595pt> 619 633 File: sendrecv.png Graphic file (type png) 620 634 <use sendrecv.png> 621 Package pdftex.def Info: sendrecv.png used on input line 1 49.635 Package pdftex.def Info: sendrecv.png used on input line 165. 622 636 (pdftex.def) Requested size: 331.63313pt x 277.83307pt. 623 637 … … 626 640 File: bcast.png Graphic file (type png) 627 641 <use bcast.png> 628 Package pdftex.def Info: bcast.png used on input line 1 73.642 Package pdftex.def Info: bcast.png used on input line 188. 629 643 (pdftex.def) Requested size: 153.87605pt x 165.62003pt. 630 631 <allreduce.png, id=46, 743.77875pt x 552.0625pt> 644 \openout3 = `"rapport ESIWACE-1.cpt"'. 
645 646 647 (./rapport ESIWACE-1.cpt) <allreduce.png, id=46, 743.77875pt x 552.0625pt> 632 648 File: allreduce.png Graphic file (type png) 633 <use allreduce.png> 634 Package pdftex.def Info: allreduce.png used on input line 185. 649 650 <use allreduce.png> 651 Package pdftex.def Info: allreduce.png used on input line 200. 635 652 (pdftex.def) Requested size: 223.13535pt x 165.62003pt. 636 [6 <./bcast.png (PNG copy)> <./allreduce.png (PNG copy)>] (./rapport ESIWACE.b 637 bl 653 \openout3 = `"rapport ESIWACE-2.cpt"'. 654 655 (./rapport ESIWACE-2.cpt) [6 <./bcast.png (PNG copy)> <./allreduce.png (PNG co 656 py)>] (./rapport ESIWACE.bbl 638 657 Underfull \hbox (badness 3354) in paragraph at lines 4--9 639 658 []\OT1/cmr/m/n/10 XIOS de-vel-op-per group. Note for XIOS End-points. Tech-ni … … 647 666 [] 648 667 649 ) [7] (./rapport ESIWACE.aux) )668 [7]) [8] (./rapport ESIWACE.aux) ) 650 669 Here is how much of TeX's memory you used: 651 4 637strings out of 494953652 6 1251string characters out of 6180977653 13 7489words of memory out of 5000000654 7 849 multiletter control sequences out of 15000+600000655 9 090 words of font info for 34fonts, out of 8000000 for 9000670 4795 strings out of 494953 671 63636 string characters out of 6180977 672 139394 words of memory out of 5000000 673 7989 multiletter control sequences out of 15000+600000 674 9397 words of font info for 35 fonts, out of 8000000 for 9000 656 675 14 hyphenation exceptions out of 8191 657 41i,8n,35p,1270b, 264s stack positions out of 5000i,500n,10000p,200000b,80000s658 </usr/share/texlive/texmf-dist/fonts/type1/pu bli659 c/amsfonts/cm/cmbx12.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsf 660 onts/cm/cmr10.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm 661 /cm r12.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr17.662 pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr6.pfb></us 663 
r/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr7.pfb></usr/share/ 664 texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr8.pfb></usr/share/texlive/ 665 texmf-dist/fonts/type1/public/amsfonts/cm/cmti10.pfb></usr/share/texlive/texmf- 666 dist/fonts/type1/public/amsfonts/cm/cmtt10.pfb></usr/share/texlive/texmf-dist/f 667 onts/type1/public/amsfonts/cm/cmtt8.pfb>668 Output written on "rapport ESIWACE.pdf" ( 7 pages, 269577bytes).676 41i,8n,35p,1270b,618s stack positions out of 5000i,500n,10000p,200000b,80000s 677 </usr/share/texlive/texmf-dist/fonts/type1/pu 678 blic/amsfonts/cm/cmbx12.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/a 679 msfonts/cm/cmr10.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts 680 /cm/cmr12.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr 681 17.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr6.pfb>< 682 /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr7.pfb></usr/sha 683 re/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr8.pfb></usr/share/texli 684 ve/texmf-dist/fonts/type1/public/amsfonts/cm/cmti10.pfb></usr/share/texlive/tex 685 mf-dist/fonts/type1/public/amsfonts/cm/cmtt10.pfb></usr/share/texlive/texmf-dis 686 t/fonts/type1/public/amsfonts/cm/cmtt8.pfb> 687 Output written on "rapport ESIWACE.pdf" (8 pages, 273883 bytes). 669 688 PDF statistics: 670 8 6PDF objects out of 1000 (max. 8388607)671 5 7compressed objects within 1 object stream689 89 PDF objects out of 1000 (max. 8388607) 690 59 compressed objects within 1 object stream 672 691 0 named destinations out of 1000 (max. 500000) 673 692 46 words of extra memory for PDF output out of 10000 (max. 10000000) -
XIOS/dev/branch_openmp/Note/rapport ESIWACE.tex
r1552 r1560 7 7 \usepackage{url} 8 8 \usepackage{verbatim} 9 \usepackage{cprotect} 9 10 10 11 % Title Page … … 47 48 project develops a new dynamical core for LMD-Z, the atmospheric general circulation model (GCM) part of IPSL-CM Earth System Model. 48 49 \url{http://www.lmd.polytechnique.fr/~dubos/DYNAMICO/}} all use XIOS as the output back end. M\'et\'eoFrance and MetOffice also choose XIOS 49 to man ege the I/O for their models.50 to manage the I/O for their models. 50 51 51 52 … … 54 55 Although XIOS copes well with many models, there is one potential optimization in XIOS which needs to be investigated: making XIOS thread-friendly. 55 56 56 This topic comes along with the configuration of the climate models. Take LMDZ as example, it is designed with the 2-level parallelization scheme. To be more specific, LMDZ uses the domain decomposition method in which each sub-domain is associated with one MPI process. Inside of the sub-domain, the model also uses OpenMP derivatives to accelerate the computation. We can imagine that the sub-domain be divided into sub-sub-domain and is managed by threads. 57 This topic comes along with the configuration of the climate models. Take LMDZ as example, it is designed with the 2-level parallelization 58 scheme. To be more specific, LMDZ uses the domain decomposition method in which each sub-domain is associated with one MPI process. Inside 59 of the sub-domain, the model also uses OpenMP derivatives to accelerate the computation. We can imagine that the sub-domain be divided into 60 sub-sub-domain and is managed by threads. 57 61 58 62 \begin{figure}[ht] … … 62 66 \end{figure} 63 67 64 As we know, each sub-domain, or in another word, each MPI process is a XIOS client. The data exchange between client and XIOS servers is handled by MPI communications. In order to write an output field, all threads must gather the data to the master thread who acts as MPI process in order to call MPI routines. 
There are two disadvantages about this method : first, we have to spend time on gathering information to the master thread which not only increases the memory use, but also implies an OpenMP barrier; second, while the master thread calls MPI routine, other threads are in the idle state thus a waster of computing resources. What we want obtain with the thread-friendly XIOS is that all threads can act like MPI processes. They can call directly the MPI routine thus no waste in memory nor in computing resources as shown in Figure \ref{fig:omp}. 68 As we know, each sub-domain, or in another word, each MPI process is a XIOS client. The data exchange between client and XIOS servers is 69 handled by MPI communications. In order to write an output field, all threads must gather the data to the master thread who acts as MPI 70 process in order to call MPI routines. There are two disadvantages about this method : first, we have to spend time on gathering information 71 to the master thread which not only increases the memory use, but also implies an OpenMP barrier; second, while the master thread calls MPI 72 routine, other threads are in the idle state thus a waster of computing resources. What we want obtain with the thread-friendly XIOS is that 73 all threads can act like MPI processes. They can call directly the MPI routine thus no waste in memory nor in computing resources as shown 74 in Figure \ref{fig:omp}. 65 75 66 76 \begin{figure}[ht] … … 71 81 \end{figure} 72 82 73 There are two ways to make XIOS thread-friendly. First of all, change the structure of XIOS which demands a lot of modification is the XIOS library. Knowing that XIOS is about 100 000 lines of code, this method will be very time consuming. What's more, the modification will be local to XIOS. If we want to optimize an other code to be thread-friendly, we have to redo the modifications. The second choice is to add an extra interface to MPI in order to manage the threads. 
When a thread want to call an MPI routine inside XIOS, it will first pass the interface, in which the communication information will be analyzed before the MPI routine is invoked. With this method, we only need to modify a very small part of XIOS in order to make it work. What is more interesting is that the interface we created can be adjusted to suit other MPI based libraries. 83 There are two ways to make XIOS thread-friendly. First of all, change the structure of XIOS which demands a lot of modification is the XIOS 84 library. Knowing that XIOS is about 100 000 lines of code, this method will be very time consuming. What's more, the modification will be 85 local to XIOS. If we want to optimize an other code to be thread-friendly, we have to redo the modifications. The second choice is to add an 86 extra interface to MPI in order to manage the threads. When a thread want to call an MPI routine inside XIOS, it will first pass the 87 interface, in which the communication information will be analyzed before the MPI routine is invoked. With this method, we only need to 88 modify a very small part of XIOS in order to make it work. What is more interesting is that the interface we created can be adjusted to suit 89 other MPI based libraries. 74 90 75 91 … … 163 179 data, execution of the MPI function by all master/root threads, distribution or arrangement of the resulting data among threads. 164 180 165 %The most representative functions of the collective communications are \verb|MPI_Gather| and \verb|MPI_Bcast|.166 181 167 182 For example, if we want to perform a broadcast operation, only 2 steps are needed (\textit{c.f.} Figure \ref{fig:bcast}). 
Firstly, the root … … 172 187 \centering 173 188 \includegraphics[scale=0.3]{bcast.png} 174 \c aption{}189 \cprotect\caption{\verb|MPI_Bcast|} 175 190 \label{fig:bcast} 176 191 \end{figure} … … 184 199 \centering 185 200 \includegraphics[scale=0.3]{allreduce.png} 186 \c aption{}201 \cprotect\caption{\verb|MPI_Allreduce|} 187 202 \label{fig:allreduce} 188 203 \end{figure} 189 204 190 205 Other MPI routines, such as \verb|MPI_Wait|, \verb|MPI_Intercomm_create| \textit{etc.}, can be found in the technique report of the 191 endpoint interface .206 endpoint interface \cite{ep:2018}. 192 207 193 208 \section{The multi-threaded XIOS and performance results} 194 209 195 210 The development of endpoint interface for thread-friendly XIOS library took about one year and a half. The main difficulty is the 196 co-existence of MPI processes and OpenMP threads. All MPI classes must be redefined in the endpoint interface along with all the routines. 197 The development is now available on the forge server: \url{http://forge.ipsl.jussieu.fr/ioserver/browser/XIOS/dev/branch_openmp}. One 198 technique report is also available in which one can find more detail about how endpoint works and how the routines are implemented 199 \cite{ep:2018}. We must note that the thread-friendly XIOS library is still in the phase of optimization. It will be released in the 200 future with a stable version. 201 202 All the functionalities of XIOS is reserved in its thread-friendly version. Single threaded code can work successfully with the new 203 version of XIOS. For multi-threaded models, some modifications are needed in order to work with the multi-threaded XIOS library. Detail can 204 be found in our technique report \cite{ep:2018}. 211 co-existence of MPI processes and OpenMP threads. One essential requirement for using the endpoint interface is that the underlying MPI 212 implementation must support the level-3 of thread support which is \verb|MPI_THREAD_MULTIPLE|. 
This means that if the MPI process is 213 multi-threaded, multiple threads may call MPI at once with no restrictions. Another importance aspect to be mentioned is that in XIOS, we 214 have variables with \verb|static| attribute. It means that inside of an MPI process, threads share the static variable. In order to use 215 correctly the endpoint interface, these static variables have to be defined as \verb|threadprivate| to limit the visibility to thread. 216 217 To develop the endpoint interface, we redefined all MPI classes along with all the MPI routines that are used in XIOS library. The current 218 version of the interface includes about 7000 lines of code and is now available on the forge server: 219 \url{http://forge.ipsl.jussieu.fr/ioserver/browser/XIOS/dev/branch_openmp}. One technique report is also available in which one can find 220 more detail about how endpoint works and how the routines are implemented \cite{ep:2018}. We must note that the thread-friendly XIOS 221 library is still in the phase of optimization. It will be released in the future with a stable version. 222 223 All the functionalities of XIOS is reserved in its thread-friendly XIOS library. Single threaded code can work successfully under the 224 endpoint interface with the new version of XIOS. For multi-threaded models, some modifications are needed in order to work with the 225 multi-threaded XIOS library. For example, the MPI initialization has be to modified to require the \verb|MPI_THREAD_MULTIPLE| 226 support. Each thread should have its own data set. What's most important is that the OpenMP master region in which the master thread calls 227 XIOS routines should be erased in order that every threads can call XIOS routines simultaneously. More detail can be found in our technique 228 report \cite{ep:2018}. 205 229 206 230 Even though the multi-threaded XIOS library is not fully accomplished and further optimization in ongoing. 
We have already done some tests 207 231 to see the potential of the endpoint framework. We take LMDZ as the target model and have tested with several work-flow charges. 232 233 \subsection{LMDZ work-flow} 234 235 In the LMDZ work-flow, we have a daily output file. We have up to 413 two-dimension variables and 187 three-dimension variables. According 236 to user's need, we can change the ``output\_level'' key argument in the \verb|xml| file to select the desired variables to be written. In 237 our 238 tests, we choose to set ``output\_level=2'' for a light output, and ``output\_level=11'' for a full output. We run the LMDZ code for 239 one, two, and three-month simulations using 12 MPI client processes and 1 server process. Each client process includes 8 OpenMP threads 240 which gives us 92 XIOS clients in total. 241 242 \subsection{CMIP6 work-flow} 208 243 209 244 \begin{comment} -
XIOS/dev/branch_openmp/Note/rapport ESIWACE.tex.backup
r1552 r1560 6 6 \usepackage{amsmath} 7 7 \usepackage{url} 8 \usepackage{verbatim} 8 9 9 10 % Title Page … … 46 47 project develops a new dynamical core for LMD-Z, the atmospheric general circulation model (GCM) part of IPSL-CM Earth System Model. 47 48 \url{http://www.lmd.polytechnique.fr/~dubos/DYNAMICO/}} all use XIOS as the output back end. M\'et\'eoFrance and MetOffice also choose XIOS 48 to man ege the I/O for their models.49 to manage the I/O for their models. 49 50 50 51 … … 149 150 \caption{This figure shows the classic pattern of a P2P communication with the endpoint interface. Thread/endpoint rank 0 sends a message 150 151 to thread/endpoint rank 3 with tag=1. The underlying MPI function called by the sender is indeed a send for MPI rank of 1 151 and tag=65537. From the receiver's point of view, the endpoint 3 is actually rece ving a message from MPI rank 0 with152 and tag=65537. From the receiver's point of view, the endpoint 3 is actually receiving a message from MPI rank 0 with 152 153 tag=65537.} 153 154 \label{fig:sendrecv} … … 177 178 Figure \ref{fig:allreduce} illustrates how the \verb|MPI_Allreduce| function is proceeded in the endpoint interface. First of all, We 178 179 perform a intra-process ``allreduce'' operation: source data is reduced from slave threads to the master thread via local memory transfer. 179 Next, al mmaster threads call the classic \verb|MPI_Allreduce| routine. Finally, all master threads send the updated reduced data to its180 Next, all master threads call the classic \verb|MPI_Allreduce| routine. Finally, all master threads send the updated reduced data to its 180 181 slaves via local memory transfer. 181 182 … … 190 191 endpoint interface. 191 192 192 \section{The multi-threaded XIOS and perform ce results}193 \section{The multi-threaded XIOS and performance results} 193 194 194 195 The development of endpoint interface for thread-friendly XIOS library took about one year and a half. 
The main difficulty is the 195 co-exist ance of MPI processes and OpenMP threads. All MPI classes must be redefined in the endpoint interface along with all the routines.196 co-existence of MPI processes and OpenMP threads. All MPI classes must be redefined in the endpoint interface along with all the routines. 196 197 The development is now available on the forge server: \url{http://forge.ipsl.jussieu.fr/ioserver/browser/XIOS/dev/branch_openmp}. One 197 198 technique report is also available in which one can find more detail about how endpoint works and how the routines are implemented … … 199 200 future with a stable version. 200 201 201 All the func ionalities of XIOS is reserved in its thread-friendly version. Single threaded code can work successfully with the new202 All the functionalities of XIOS is reserved in its thread-friendly version. Single threaded code can work successfully with the new 202 203 version of XIOS. For multi-threaded models, some modifications are needed in order to work with the multi-threaded XIOS library. Detail can 203 204 be found in our technique report \cite{ep:2018}. 204 205 205 Even though the multi-threaded 206 206 Even though the multi-threaded XIOS library is not fully accomplished and further optimization in ongoing. We have already done some tests 207 to see the potential of the endpoint framework. We take LMDZ as the target model and have tested with several work-flow charges. 208 209 \subsection{LMDZ work-flow} 210 211 In the LMDZ work-flow, we have a daily output file. We have up to 413 two-dimension variables and 187 three-dimension variables. According 212 to user's need, we can change the ``output\_level'' key argument in the xml file to select the desired variables to be written. 213 214 In our tests, we choose to set ``output\_level=2'' for a light output, and ``output\_level=11'' for a full output. 
215 216 \subsection{CMIP6 work-flow} 217 218 \begin{comment} 207 219 \section{Performance of LMDZ using EP\_XIOS} 208 220 … … 241 253 histmth with daily output 242 254 243 \section{Perspectives of EP\_XIOS} 255 \end{comment} 256 257 258 \section{Future works for XIOS} 244 259 245 260
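The three-step MPI_Allreduce scheme described in the note above (reduction from the slave threads to the master thread, a classic allreduce among the master threads, then a copy of the result back to the slaves) can be illustrated with a small simulation. This is only a sketch in plain Python, with no MPI or OpenMP involved: processes and threads are modelled as nested lists, and the name endpoint_allreduce and its arguments are hypothetical, not part of XIOS or of the endpoint interface.

```python
from functools import reduce
import operator


def endpoint_allreduce(data, op=operator.add):
    """Simulate the endpoint-style allreduce.

    data[p][t] is the value held by thread t of MPI process p
    (thread 0 plays the role of the master thread).
    Returns a nested list of the same shape in which every
    thread holds the globally reduced value.
    """
    # Step 1: intra-process reduction. Each process reduces the values
    # of its threads into its master thread (local memory transfer).
    per_master = [reduce(op, threads) for threads in data]

    # Step 2: the master threads perform the classic allreduce.
    # Here it is simulated by reducing over all master values.
    global_value = reduce(op, per_master)

    # Step 3: each master hands the reduced value back to its threads
    # (again a local memory transfer).
    return [[global_value for _ in threads] for threads in data]


if __name__ == "__main__":
    # 3 simulated MPI processes with 2 threads each.
    values = [[1, 2], [3, 4], [5, 6]]
    print(endpoint_allreduce(values))
    print(endpoint_allreduce(values, op=max))
```

In this model only step 2 would involve real MPI traffic; steps 1 and 3 stay inside a process, which is the point of routing the call through the master threads.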