Timestamp:
07/13/18 14:18:28
Author:
yushan
Message:

report update

Location:
XIOS/dev/branch_openmp/Note
Files:
5 edited

  • XIOS/dev/branch_openmp/Note/rapport ESIWACE.aux

    r1552 r1560  
    1515\newlabel{fig:sendrecv}{{6}{5}} 
    1616\citation{ep:2018} 
    17 \@writefile{lof}{\contentsline {figure}{\numberline {7}{\ignorespaces }}{6}} 
     17\@writefile{lof}{\contentsline {figure}{\numberline {7}{\ignorespaces \input "rapport ESIWACE"-1.cpt\relax }}{6}} 
    1818\newlabel{fig:bcast}{{7}{6}} 
    19 \@writefile{lof}{\contentsline {figure}{\numberline {8}{\ignorespaces }}{6}} 
     19\@writefile{lof}{\contentsline {figure}{\numberline {8}{\ignorespaces \input "rapport ESIWACE"-2.cpt\relax }}{6}} 
    2020\newlabel{fig:allreduce}{{8}{6}} 
     21\citation{ep:2018} 
    2122\citation{ep:2018} 
    2223\bibstyle{plain} 
    2324\bibdata{reference} 
    2425\bibcite{ep:2018}{1} 
     26\@writefile{toc}{\contentsline {section}{\numberline {3}The multi-threaded XIOS and performance results}{7}} 
     27\@writefile{toc}{\contentsline {subsection}{\numberline {3.1}LMDZ work-flow}{7}} 
    2528\bibcite{Dinan:2013}{2} 
    2629\bibcite{Sridharan:2014}{3} 
    27 \@writefile{toc}{\contentsline {section}{\numberline {3}The multi-threaded XIOS and performance results}{7}} 
    28 \@writefile{toc}{\contentsline {section}{\numberline {4}Future works for XIOS}{7}} 
     30\@writefile{toc}{\contentsline {subsection}{\numberline {3.2}CMIP6 work-flow}{8}} 
     31\@writefile{toc}{\contentsline {section}{\numberline {4}Future works for XIOS}{8}} 
  • XIOS/dev/branch_openmp/Note/rapport ESIWACE.log

    r1552 r1560  
    1 This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex 2017.8.24)  27 JUN 2018 15:10 
     1This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex 2017.8.24)  13 JUL 2018 12:17 
    22entering extended mode 
    33 restricted \write18 enabled. 
     
    499499\verbatim@in@stream=\read1 
    500500) 
     501(/usr/share/texlive/texmf-dist/tex/latex/cprotect/cprotect.sty 
     502Package: cprotect 2011/01/27 v1.0e (Bruno Le Floch) 
     503 
     504(/usr/share/texlive/texmf-dist/tex/latex/base/ifthen.sty 
     505Package: ifthen 2014/09/29 v1.1c Standard LaTeX ifthen package (DPC) 
     506) 
     507(/usr/share/texlive/texmf-dist/tex/latex/bigfoot/suffix.sty 
     508Package: suffix 2006/07/15 1.5a Variant command support 
     509) 
     510\CPT@WriteOut=\write3 
     511\c@CPT@WriteCount=\count109 
     512\c@CPT@numB=\count110 
     513\CPT@commandatend@toks=\toks27 
     514) 
    501515(./rapport ESIWACE.aux) 
    502516\openout1 = `"rapport ESIWACE.aux"'. 
    503517 
    504 LaTeX Font Info:    Checking defaults for OML/cmm/m/it on input line 17. 
    505 LaTeX Font Info:    ... okay on input line 17. 
    506 LaTeX Font Info:    Checking defaults for T1/cmr/m/n on input line 17. 
    507 LaTeX Font Info:    ... okay on input line 17. 
    508 LaTeX Font Info:    Checking defaults for OT1/cmr/m/n on input line 17. 
    509 LaTeX Font Info:    ... okay on input line 17. 
    510 LaTeX Font Info:    Checking defaults for OMS/cmsy/m/n on input line 17. 
    511 LaTeX Font Info:    ... okay on input line 17. 
    512 LaTeX Font Info:    Checking defaults for OMX/cmex/m/n on input line 17. 
    513 LaTeX Font Info:    ... okay on input line 17. 
    514 LaTeX Font Info:    Checking defaults for U/cmr/m/n on input line 17. 
    515 LaTeX Font Info:    ... okay on input line 17. 
     518LaTeX Font Info:    Checking defaults for OML/cmm/m/it on input line 18. 
     519LaTeX Font Info:    ... okay on input line 18. 
     520LaTeX Font Info:    Checking defaults for T1/cmr/m/n on input line 18. 
     521LaTeX Font Info:    ... okay on input line 18. 
     522LaTeX Font Info:    Checking defaults for OT1/cmr/m/n on input line 18. 
     523LaTeX Font Info:    ... okay on input line 18. 
     524LaTeX Font Info:    Checking defaults for OMS/cmsy/m/n on input line 18. 
     525LaTeX Font Info:    ... okay on input line 18. 
     526LaTeX Font Info:    Checking defaults for OMX/cmex/m/n on input line 18. 
     527LaTeX Font Info:    ... okay on input line 18. 
     528LaTeX Font Info:    Checking defaults for U/cmr/m/n on input line 18. 
     529LaTeX Font Info:    ... okay on input line 18. 
    516530 
    517531(/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii 
    518532[Loading MPS to PDF converter (version 2006.09.02).] 
    519 \scratchcounter=\count109 
     533\scratchcounter=\count111 
    520534\scratchdimen=\dimen120 
    521535\scratchbox=\box30 
    522 \nofMPsegments=\count110 
    523 \nofMParguments=\count111 
    524 \everyMPshowfont=\toks27 
    525 \MPscratchCnt=\count112 
     536\nofMPsegments=\count112 
     537\nofMParguments=\count113 
     538\everyMPshowfont=\toks28 
     539\MPscratchCnt=\count114 
    526540\MPscratchDim=\dimen121 
    527 \MPnumerator=\count113 
    528 \makeMPintoPDFobject=\count114 
    529 \everyMPtoPDFconversion=\toks28 
     541\MPnumerator=\count115 
     542\makeMPintoPDFobject=\count116 
     543\everyMPtoPDFconversion=\toks29 
    530544) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/pdftexcmds.sty 
    531545Package: pdftexcmds 2011/11/29 v0.20 Utility functions of pdfTeX for LuaTeX (HO 
     
    576590e 
    577591)) 
    578 \c@lstlisting=\count115 
     592\c@lstlisting=\count117 
    579593 
    580594<Charge1.png, id=1, 412.54124pt x 228.10219pt> 
    581595File: Charge1.png Graphic file (type png) 
    582596 <use Charge1.png> 
    583 Package pdftex.def Info: Charge1.png used on input line 33. 
     597Package pdftex.def Info: Charge1.png used on input line 34. 
    584598(pdftex.def)             Requested size: 165.01357pt x 91.23924pt. 
    585599 
     
    587601File: Charge2.png Graphic file (type png) 
    588602 <use Charge2.png> 
    589 Package pdftex.def Info: Charge2.png used on input line 34. 
     603Package pdftex.def Info: Charge2.png used on input line 35. 
    590604(pdftex.def)             Requested size: 165.25446pt x 91.05858pt. 
    591605 [1 
     
    596610File: domain.pdf Graphic file (type pdf) 
    597611 <use domain.pdf> 
    598 Package pdftex.def Info: domain.pdf used on input line 60. 
     612Package pdftex.def Info: domain.pdf used on input line 64. 
    599613(pdftex.def)             Requested size: 236.1567pt x 71.13055pt. 
    600614 
     
    602616File: omp.pdf Graphic file (type pdf) 
    603617 <use omp.pdf> 
    604 Package pdftex.def Info: omp.pdf used on input line 68. 
     618Package pdftex.def Info: omp.pdf used on input line 78. 
    605619(pdftex.def)             Requested size: 291.64784pt x 126.32893pt. 
    606620 [2 <./domain.pdf>] 
     
    608622File: scheme.png Graphic file (type png) 
    609623 <use scheme.png> 
    610 Package pdftex.def Info: scheme.png used on input line 86. 
     624Package pdftex.def Info: scheme.png used on input line 102. 
    611625(pdftex.def)             Requested size: 266.18977pt x 207.17032pt. 
    612626 [3 <./omp.pdf>] 
     
    614628File: tag.png Graphic file (type png) 
    615629 <use tag.png> 
    616 Package pdftex.def Info: tag.png used on input line 140. 
     630Package pdftex.def Info: tag.png used on input line 156. 
    617631(pdftex.def)             Requested size: 301.11966pt x 41.35376pt. 
    618632 [4 <./scheme.png (PNG copy)>] <sendrecv.png, id=41, 829.0975pt x 694.595pt> 
    619633File: sendrecv.png Graphic file (type png) 
    620634 <use sendrecv.png> 
    621 Package pdftex.def Info: sendrecv.png used on input line 149. 
     635Package pdftex.def Info: sendrecv.png used on input line 165. 
    622636(pdftex.def)             Requested size: 331.63313pt x 277.83307pt. 
    623637 
     
    626640File: bcast.png Graphic file (type png) 
    627641 <use bcast.png> 
    628 Package pdftex.def Info: bcast.png used on input line 173. 
     642Package pdftex.def Info: bcast.png used on input line 188. 
    629643(pdftex.def)             Requested size: 153.87605pt x 165.62003pt. 
    630  
    631 <allreduce.png, id=46, 743.77875pt x 552.0625pt> 
     644\openout3 = `"rapport ESIWACE-1.cpt"'. 
     645 
     646 
     647(./rapport ESIWACE-1.cpt) <allreduce.png, id=46, 743.77875pt x 552.0625pt> 
    632648File: allreduce.png Graphic file (type png) 
    633  <use allreduce.png> 
    634 Package pdftex.def Info: allreduce.png used on input line 185. 
     649 
     650<use allreduce.png> 
     651Package pdftex.def Info: allreduce.png used on input line 200. 
    635652(pdftex.def)             Requested size: 223.13535pt x 165.62003pt. 
    636  [6 <./bcast.png (PNG copy)> <./allreduce.png (PNG copy)>] (./rapport ESIWACE.b 
    637 bl 
     653\openout3 = `"rapport ESIWACE-2.cpt"'. 
     654 
     655 (./rapport ESIWACE-2.cpt) [6 <./bcast.png (PNG copy)> <./allreduce.png (PNG co 
     656py)>] (./rapport ESIWACE.bbl 
    638657Underfull \hbox (badness 3354) in paragraph at lines 4--9 
    639658[]\OT1/cmr/m/n/10 XIOS de-vel-op-per group.  Note for XIOS End-points.  Tech-ni 
     
    647666 [] 
    648667 
    649 ) [7] (./rapport ESIWACE.aux) )  
     668[7]) [8] (./rapport ESIWACE.aux) )  
    650669Here is how much of TeX's memory you used: 
    651  4637 strings out of 494953 
    652  61251 string characters out of 6180977 
    653  137489 words of memory out of 5000000 
    654  7849 multiletter control sequences out of 15000+600000 
    655  9090 words of font info for 34 fonts, out of 8000000 for 9000 
     670 4795 strings out of 494953 
     671 63636 string characters out of 6180977 
     672 139394 words of memory out of 5000000 
     673 7989 multiletter control sequences out of 15000+600000 
     674 9397 words of font info for 35 fonts, out of 8000000 for 9000 
    656675 14 hyphenation exceptions out of 8191 
    657  41i,8n,35p,1270b,264s stack positions out of 5000i,500n,10000p,200000b,80000s 
    658 </usr/share/texlive/texmf-dist/fonts/type1/publi 
    659 c/amsfonts/cm/cmbx12.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsf 
    660 onts/cm/cmr10.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm 
    661 /cmr12.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr17. 
    662 pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr6.pfb></us 
    663 r/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr7.pfb></usr/share/ 
    664 texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr8.pfb></usr/share/texlive/ 
    665 texmf-dist/fonts/type1/public/amsfonts/cm/cmti10.pfb></usr/share/texlive/texmf- 
    666 dist/fonts/type1/public/amsfonts/cm/cmtt10.pfb></usr/share/texlive/texmf-dist/f 
    667 onts/type1/public/amsfonts/cm/cmtt8.pfb> 
    668 Output written on "rapport ESIWACE.pdf" (7 pages, 269577 bytes). 
     676 41i,8n,35p,1270b,618s stack positions out of 5000i,500n,10000p,200000b,80000s 
     677</usr/share/texlive/texmf-dist/fonts/type1/pu 
     678blic/amsfonts/cm/cmbx12.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/a 
     679msfonts/cm/cmr10.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts 
     680/cm/cmr12.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr 
     68117.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr6.pfb>< 
     682/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr7.pfb></usr/sha 
     683re/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr8.pfb></usr/share/texli 
     684ve/texmf-dist/fonts/type1/public/amsfonts/cm/cmti10.pfb></usr/share/texlive/tex 
     685mf-dist/fonts/type1/public/amsfonts/cm/cmtt10.pfb></usr/share/texlive/texmf-dis 
     686t/fonts/type1/public/amsfonts/cm/cmtt8.pfb> 
     687Output written on "rapport ESIWACE.pdf" (8 pages, 273883 bytes). 
    669688PDF statistics: 
    670  86 PDF objects out of 1000 (max. 8388607) 
    671  57 compressed objects within 1 object stream 
     689 89 PDF objects out of 1000 (max. 8388607) 
     690 59 compressed objects within 1 object stream 
    672691 0 named destinations out of 1000 (max. 500000) 
    673692 46 words of extra memory for PDF output out of 10000 (max. 10000000) 
  • XIOS/dev/branch_openmp/Note/rapport ESIWACE.tex

    r1552 r1560  
    77\usepackage{url} 
    88\usepackage{verbatim} 
     9\usepackage{cprotect} 
    910 
    1011% Title Page 
     
    4748project develops a new dynamical core for LMD-Z, the atmospheric general circulation model (GCM) part of IPSL-CM Earth System Model.  
    4849\url{http://www.lmd.polytechnique.fr/~dubos/DYNAMICO/}} all use XIOS as the output back end. M\'et\'eoFrance and MetOffice also choose XIOS  
    49 to manege the I/O for their models. 
     50to manage the I/O for their models. 
    5051 
    5152 
     
    5455Although XIOS copes well with many models, there is one potential optimization in XIOS which needs to be investigated: making XIOS thread-friendly. 
    5556 
    56 This topic comes along with the configuration of the climate models. Take LMDZ as example, it is designed with the 2-level parallelization scheme. To be more specific, LMDZ uses the domain decomposition method in which each sub-domain is associated with one MPI process. Inside of the sub-domain, the model also uses OpenMP derivatives to accelerate the computation. We can imagine that the sub-domain be divided into sub-sub-domain and is managed by threads.  
     57This topic comes along with the configuration of the climate models. Take LMDZ as example, it is designed with the 2-level parallelization  
     58scheme. To be more specific, LMDZ uses the domain decomposition method in which each sub-domain is associated with one MPI process. Inside  
     59of the sub-domain, the model also uses OpenMP derivatives to accelerate the computation. We can imagine that the sub-domain be divided into  
     60sub-sub-domain and is managed by threads.  
    5761 
    5862\begin{figure}[ht] 
     
    6266\end{figure} 
    6367 
    64 As we know, each sub-domain, or in another word, each MPI process is a XIOS client. The data exchange between client and XIOS servers is handled by MPI communications. In order to write an output field, all threads must gather the data to the master thread who acts as MPI process in order to call MPI routines. There are two disadvantages about this method : first, we have to spend time on gathering information to the master thread which not only increases the memory use, but also implies an OpenMP barrier; second, while the master thread calls MPI routine, other threads are in the idle state thus a waster of computing resources. What we want obtain with the thread-friendly XIOS is that all threads can act like MPI processes. They can call directly the MPI routine thus no waste in memory nor in computing resources as shown in Figure \ref{fig:omp}. 
     68As we know, each sub-domain, or in another word, each MPI process is a XIOS client. The data exchange between client and XIOS servers is  
     69handled by MPI communications. In order to write an output field, all threads must gather the data to the master thread who acts as MPI  
     70process in order to call MPI routines. There are two disadvantages about this method : first, we have to spend time on gathering information  
     71to the master thread which not only increases the memory use, but also implies an OpenMP barrier; second, while the master thread calls MPI  
     72routine, other threads are in the idle state thus a waster of computing resources. What we want obtain with the thread-friendly XIOS is that  
     73all threads can act like MPI processes. They can call directly the MPI routine thus no waste in memory nor in computing resources as shown  
     74in Figure \ref{fig:omp}. 
    6575 
    6676\begin{figure}[ht] 
     
    7181\end{figure} 
    7282 
    73 There are two ways to make XIOS thread-friendly. First of all, change the structure of XIOS which demands a lot of modification is the XIOS library. Knowing that XIOS is about 100 000 lines of code, this method will be very time consuming. What's more, the modification will be local to XIOS. If we want to optimize an other code to be thread-friendly, we have to redo the modifications. The second choice is to add an extra interface to MPI in order to manage the threads. When a thread want to call an MPI routine inside XIOS, it will first pass the interface, in which the communication information will be analyzed before the MPI routine is invoked. With this method, we only need to modify a very small part of XIOS in order to make it work. What is more interesting is that the interface we created can be adjusted to suit other MPI based libraries. 
     83There are two ways to make XIOS thread-friendly. First of all, change the structure of XIOS which demands a lot of modification is the XIOS  
     84library. Knowing that XIOS is about 100 000 lines of code, this method will be very time consuming. What's more, the modification will be  
     85local to XIOS. If we want to optimize an other code to be thread-friendly, we have to redo the modifications. The second choice is to add an  
     86extra interface to MPI in order to manage the threads. When a thread want to call an MPI routine inside XIOS, it will first pass the  
     87interface, in which the communication information will be analyzed before the MPI routine is invoked. With this method, we only need to  
     88modify a very small part of XIOS in order to make it work. What is more interesting is that the interface we created can be adjusted to suit  
     89other MPI based libraries. 
    7490 
    7591 
     
    163179data, execution of the MPI function by all master/root threads, distribution or arrangement of the resulting data among threads.  
    164180 
    165 %The most representative functions of the collective communications are \verb|MPI_Gather| and \verb|MPI_Bcast|. 
    166181 
    167182For example, if we want to perform a broadcast operation, only 2 steps are needed (\textit{c.f.} Figure \ref{fig:bcast}). Firstly, the root  
     
    172187\centering 
    173188\includegraphics[scale=0.3]{bcast.png}  
    174 \caption{} 
     189\cprotect\caption{\verb|MPI_Bcast|} 
    175190\label{fig:bcast} 
    176191\end{figure} 
     
    184199\centering 
    185200\includegraphics[scale=0.3]{allreduce.png}  
    186 \caption{} 
     201\cprotect\caption{\verb|MPI_Allreduce|} 
    187202\label{fig:allreduce} 
    188203\end{figure} 
    189204 
    190205Other MPI routines, such as \verb|MPI_Wait|, \verb|MPI_Intercomm_create| \textit{etc.}, can be found in the technique report of the  
    191 endpoint interface. 
     206endpoint interface \cite{ep:2018}. 
    192207 
    193208\section{The multi-threaded XIOS and performance results} 
    194209 
    195210The development of endpoint interface for thread-friendly XIOS library took about one year and a half. The main difficulty is the  
    196 co-existence of MPI processes and OpenMP threads. All MPI classes must be redefined in the endpoint interface along with all the routines.  
    197 The development is now available on the forge server: \url{http://forge.ipsl.jussieu.fr/ioserver/browser/XIOS/dev/branch_openmp}. One  
    198 technique report is also available in which one can find more detail about how endpoint works and how the routines are implemented  
    199 \cite{ep:2018}. We must note that the thread-friendly XIOS library is still in the phase of optimization. It will be released in the  
    200 future with a stable version. 
    201  
    202 All the functionalities of XIOS is reserved in its thread-friendly version. Single threaded code can work successfully with the new  
    203 version of XIOS. For multi-threaded models, some modifications are needed in order to work with the multi-threaded XIOS library. Detail can  
    204 be found in our technique report \cite{ep:2018}. 
     211co-existence of MPI processes and OpenMP threads. One essential requirement for using the endpoint interface is that the underlying MPI  
     212implementation must support the level-3 of thread support which is \verb|MPI_THREAD_MULTIPLE|. This means that if the MPI process is  
     213multi-threaded, multiple threads may call MPI at once with no restrictions. Another important aspect to be mentioned is that in XIOS, we  
     214have variables with \verb|static| attribute. It means that inside of an MPI process, threads share the static variable. In order to use  
     215correctly the endpoint interface, these static variables have to be defined as \verb|threadprivate| to limit the visibility to thread.   
     216 
     217To develop the endpoint interface, we redefined all MPI classes along with all the MPI routines that are used in XIOS library. The current  
     218version of the interface includes about 7000 lines of code and is now available on the forge server:  
     219\url{http://forge.ipsl.jussieu.fr/ioserver/browser/XIOS/dev/branch_openmp}. One technique report is also available in which one can find  
     220more detail about how endpoint works and how the routines are implemented \cite{ep:2018}. We must note that the thread-friendly XIOS  
     221library is still in the phase of optimization. It will be released in the future with a stable version. 
     222 
     223All the functionalities of XIOS are preserved in the thread-friendly XIOS library. Single threaded code can work successfully under the  
     224endpoint interface with the new version of XIOS. For multi-threaded models, some modifications are needed in order to work with the  
     225multi-threaded XIOS library. For example, the MPI initialization has to be modified to require the \verb|MPI_THREAD_MULTIPLE|  
     226support. Each thread should have its own data set. What's most important is that the OpenMP master region in which the master thread calls  
     227XIOS routines should be erased in order that every thread can call XIOS routines simultaneously. More detail can be found in our technique  
     228report \cite{ep:2018}. 
    205229 
    206230Even though the multi-threaded XIOS library is not fully accomplished and further optimization in ongoing. We have already done some tests  
    207231to see the potential of the endpoint framework. We take LMDZ as the target model and have tested with several work-flow charges.  
     232 
     233\subsection{LMDZ work-flow} 
     234 
     235In the LMDZ work-flow, we have a daily output file. We have up to 413 two-dimension variables and 187 three-dimension variables. According  
     236to user's need, we can change the ``output\_level'' key argument in the \verb|xml| file to select the desired variables to be written. In  
     237our  
     238tests, we choose to set ``output\_level=2'' for a light output, and ``output\_level=11'' for a full output. We run the LMDZ code for  
     239one, two, and three-month simulations using 12 MPI client processes and 1 server process. Each client process includes 8 OpenMP threads  
     240which gives us 96 XIOS clients in total.  
     241 
     242\subsection{CMIP6 work-flow} 
    208243 
    209244\begin{comment} 
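
The paragraph added in this revision states that `static` variables in XIOS are shared by all threads of an MPI process and have to be declared `threadprivate`. The sketch below is only an analogy, not XIOS code: it uses Python's `threading.local` in place of the OpenMP `threadprivate` directive, and the names `shared_state`, `private_state` and `worker` are hypothetical. It contrasts one copy per process with one copy per thread, using the 8 threads per client process mentioned in the test setup.

```python
import threading

NUM_THREADS = 8  # 8 OpenMP threads per client process, as in the LMDZ test setup

# Analogue of a plain static variable: a single copy shared by every thread.
shared_state = {"owner": None}
# Analogue of a threadprivate variable: each thread sees its own copy.
private_state = threading.local()

seen_shared = [None] * NUM_THREADS
seen_private = [None] * NUM_THREADS
barrier = threading.Barrier(NUM_THREADS)


def worker(thread_id):
    shared_state["owner"] = thread_id   # threads overwrite each other
    private_state.owner = thread_id     # no interference between threads
    barrier.wait()                      # all writes are done before any read
    seen_shared[thread_id] = shared_state["owner"]
    seen_private[thread_id] = private_state.owner


threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the run, every entry of `seen_shared` holds the same value (whichever thread wrote last), while `seen_private` holds each thread's own identifier. This is the behaviour the endpoint interface relies on once the static variables are made thread-private.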
  • XIOS/dev/branch_openmp/Note/rapport ESIWACE.tex.backup

    r1552 r1560  
    66\usepackage{amsmath} 
    77\usepackage{url} 
     8\usepackage{verbatim} 
    89 
    910% Title Page 
     
    4647project develops a new dynamical core for LMD-Z, the atmospheric general circulation model (GCM) part of IPSL-CM Earth System Model.  
    4748\url{http://www.lmd.polytechnique.fr/~dubos/DYNAMICO/}} all use XIOS as the output back end. M\'et\'eoFrance and MetOffice also choose XIOS  
    48 to manege the I/O for their models. 
     49to manage the I/O for their models. 
    4950 
    5051 
     
    149150\caption{This figure shows the classic pattern of a P2P communication with the endpoint interface. Thread/endpoint rank 0 sends a message  
    150151to thread/endpoint rank 3 with tag=1. The underlying MPI function called by the sender is indeed a send for MPI rank of 1  
    151 and tag=65537. From the receiver's point of view, the endpoint 3 is actually receving a message from MPI rank 0 with  
     152and tag=65537. From the receiver's point of view, the endpoint 3 is actually receiving a message from MPI rank 0 with  
    152153tag=65537.} 
    153154\label{fig:sendrecv} 
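
The caption above gives one concrete translation: endpoint 0 sending to endpoint 3 with tag=1 becomes an MPI send to rank 1 with tag=65537. The sketch below shows one encoding that reproduces these numbers. It is an assumption, not the documented layout of the endpoint interface: it supposes 2 endpoints per MPI process and a tag built as `user_tag << 16 | source_local_rank << 8 | destination_local_rank`. All function names are hypothetical.

```python
THREADS_PER_PROCESS = 2  # assumption: endpoint 3 then lives on MPI rank 1
LOCAL_RANK_BITS = 8      # assumption: 8 bits for each local thread rank


def split_endpoint(ep_rank):
    """Map a global endpoint rank to (MPI rank, local thread rank)."""
    return divmod(ep_rank, THREADS_PER_PROCESS)


def encode_tag(user_tag, src_local, dst_local):
    """Pack the user tag and both local ranks into one MPI tag."""
    return (user_tag << (2 * LOCAL_RANK_BITS)) | (src_local << LOCAL_RANK_BITS) | dst_local


def decode_tag(mpi_tag):
    """Recover (user tag, source local rank, destination local rank)."""
    mask = (1 << LOCAL_RANK_BITS) - 1
    user_tag = mpi_tag >> (2 * LOCAL_RANK_BITS)
    src_local = (mpi_tag >> LOCAL_RANK_BITS) & mask
    dst_local = mpi_tag & mask
    return user_tag, src_local, dst_local


def translate_send(src_ep, dst_ep, user_tag):
    """Return (source MPI rank, destination MPI rank, MPI tag) for an endpoint send."""
    src_mpi, src_local = split_endpoint(src_ep)
    dst_mpi, dst_local = split_endpoint(dst_ep)
    return src_mpi, dst_mpi, encode_tag(user_tag, src_local, dst_local)


src_mpi, dst_mpi, mpi_tag = translate_send(0, 3, 1)
```

Under these assumptions the call yields MPI rank 0 sending to MPI rank 1 with tag 65537, which matches the caption. The receiver can use `decode_tag` to route the message to the right thread.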
     
    177178Figure \ref{fig:allreduce} illustrates how the \verb|MPI_Allreduce| function is proceeded in the endpoint interface. First of all, We  
    178179perform a intra-process ``allreduce'' operation: source data is reduced from slave threads to the master thread via local memory transfer.  
    179 Next, alm master threads call the classic \verb|MPI_Allreduce| routine. Finally, all master threads send the updated reduced data to its  
     180Next, all master threads call the classic \verb|MPI_Allreduce| routine. Finally, all master threads send the updated reduced data to its  
    180181slaves via local memory transfer.  
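
The three steps described above can be illustrated with a small simulation. This is a sketch without MPI or OpenMP: the processes and threads are plain nested lists, the reduction is a sum, and the function name `endpoint_allreduce` is hypothetical.

```python
def endpoint_allreduce(values_per_process):
    """Simulate the three-step allreduce (sum) over threads grouped by process.

    values_per_process[p][t] is the contribution of thread t in process p.
    """
    # Step 1: intra-process reduction from the threads to the master thread.
    partial_sums = [sum(thread_values) for thread_values in values_per_process]

    # Step 2: the master threads perform the classic MPI_Allreduce
    # (simulated here as a global sum that every master obtains).
    global_sum = sum(partial_sums)

    # Step 3: each master hands the reduced value back to its own threads.
    return [[global_sum] * len(thread_values) for thread_values in values_per_process]


result = endpoint_allreduce([[1, 2], [3, 4], [5]])
```

Every thread of every process ends up with the same reduced value, while only one MPI call per process is issued in step 2.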
    181182 
     
    190191endpoint interface. 
    191192 
    192 \section{The multi-threaded XIOS and performce results} 
     193\section{The multi-threaded XIOS and performance results} 
    193194 
    194195The development of endpoint interface for thread-friendly XIOS library took about one year and a half. The main difficulty is the  
    195 co-existance of MPI processes and OpenMP threads. All MPI classes must be redefined in the endpoint interface along with all the routines.  
     196co-existence of MPI processes and OpenMP threads. All MPI classes must be redefined in the endpoint interface along with all the routines.  
    196197The development is now available on the forge server: \url{http://forge.ipsl.jussieu.fr/ioserver/browser/XIOS/dev/branch_openmp}. One  
    197198technique report is also available in which one can find more detail about how endpoint works and how the routines are implemented  
     
    199200future with a stable version. 
    200201 
    201 All the funcionalities of XIOS is reserved in its thread-friendly version. Single threaded code can work successfully with the new  
     202All the functionalities of XIOS is reserved in its thread-friendly version. Single threaded code can work successfully with the new  
    202203version of XIOS. For multi-threaded models, some modifications are needed in order to work with the multi-threaded XIOS library. Detail can  
    203204be found in our technique report \cite{ep:2018}. 
    204205 
    205 Even though the multi-threaded 
    206  
     206Even though the multi-threaded XIOS library is not fully accomplished and further optimization in ongoing. We have already done some tests  
     207to see the potential of the endpoint framework. We take LMDZ as the target model and have tested with several work-flow charges.  
     208 
     209\subsection{LMDZ work-flow} 
     210 
     211In the LMDZ work-flow, we have a daily output file. We have up to 413 two-dimension variables and 187 three-dimension variables. According  
     212to user's need, we can change the ``output\_level'' key argument in the xml file to select the desired variables to be written. 
     213 
     214In our tests, we choose to set ``output\_level=2'' for a light output, and ``output\_level=11'' for a full output. 
     215 
     216\subsection{CMIP6 work-flow} 
     217 
     218\begin{comment} 
    207219\section{Performance of LMDZ using EP\_XIOS} 
    208220 
     
    241253histmth with daily output 
    242254 
    243 \section{Perspectives of EP\_XIOS} 
     255\end{comment} 
     256 
     257 
     258\section{Future works for XIOS} 
    244259 
    245260 