
Version 7 (modified by alanso, 5 years ago)


How to check whether two (netcdf) files are identical

cdo diffv

Rather than comparing plots, it is faster and more precise to check whether two netcdf files (e.g. a history or restart file produced by two model versions) are numerically identical. The following command works on asterix and obelix:

cdo diffv   path_file_1   path_file_2 > output_file_name.txt

ADVANTAGE: the output file tells you which fields are different. Be aware, though, that this method works best for smaller netcdf files. If your history file is more than a few megabytes, the output text file may reach many hundreds of megabytes. In that case, the md5sum command may be a better option.

DISADVANTAGE: only works for netcdf files
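
When the full report would be too large to keep, one option is to cap it. The following is a minimal sketch, assuming that the first lines of the report are enough to see which fields differ. The function name capped_diffv, its arguments and the default of 1000 lines are hypothetical and not part of cdo.

```shell
# capped_diffv FILE1 FILE2 OUT [MAX_LINES]
# Hypothetical helper: run "cdo diffv" on two files, but keep only the
# first MAX_LINES lines of the report (default 1000) in OUT.
capped_diffv() {
    max_lines=${4:-1000}
    # head stops reading after max_lines lines, so OUT stays small
    # even if cdo would otherwise print a very long report.
    cdo diffv "$1" "$2" | head -n "$max_lines" > "$3"
}

# Usage (placeholder paths, as in the command above):
#   capped_diffv path_file_1 path_file_2 output_file_name.txt 200
```

A truncated report no longer lists every difference, so use it only to get a first impression of which fields are affected.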

md5sum

If you expect the files to be identical (bit by bit), you can use

md5sum path_file

The command prints a checksum for the file. Run the same command on the second file: the files are exactly the same only if both checksums are identical.

ADVANTAGE: works for all files.

DISADVANTAGE: you only know whether the files are identical or not. If not, you have no idea which fields are different.
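
The two md5sum calls and the comparison can be combined in one step. This is a minimal sketch, not an established tool: the function name same_md5 and its exit codes are hypothetical, and it only assumes that md5sum prints the checksum as the first field of its output.

```shell
# same_md5 FILE1 FILE2
# Hypothetical helper. Exit status:
#   0 = both files have the same MD5 checksum
#   1 = the checksums differ
#   2 = one of the files cannot be read
same_md5() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    # Reading from stdin keeps the file name out of the md5sum output;
    # cut keeps only the checksum field.
    sum1=$(md5sum < "$1" | cut -d ' ' -f 1)
    sum2=$(md5sum < "$2" | cut -d ' ' -f 1)
    [ "$sum1" = "$sum2" ]
}

# Usage (placeholder paths):
#   if same_md5 path_file_1 path_file_2; then echo "identical"; else echo "different or unreadable"; fi
```

The unreadable-file check matters: without it, two missing files would both give an empty checksum and be reported as identical.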

Matlab

The matlab function nccmp is able to compare all variables contained in two netcdf files. The original version can be found here: https://fr.mathworks.com/matlabcentral/fileexchange/47857-comparing-two-netcdf-files. I have made some small modifications so that the information produced by the script is written to a file instead of being printed to the screen. The updated version can be found here: /ccc/work/cont003/dofoco/dofoco/SCRIPTS/debug

Run the function by typing:

NCCMP(ncfile1,ncfile2,tolerance,forceCompare)

tolerance sets how much variation is allowed in a variable between the two files. We want identical files, so put [] here.

forceCompare can be set to true or false.

True - write every occurrence of a difference in a variable (specifically, all the indices) to the file all_diff.txt.

False - for each variable that differs, write only that it differs and the first occurrence of the difference to the file first_diff.txt.

For a global simulation, the True option can produce a large file, and the information may be hard to process if there are many differences between the compared files. In addition, the True option can make the script much slower. For a small simulation, however, the True option can be very useful.
