WARNING

Some modifications have been made to this page since the meeting at the C3S General Assembly in Toulouse.
More details about these changes, and the final netCDF examples following the guide, are expected to be made available around 21 April 2017.

...

During these first stages of the proof-of-concept phase of the C3S seasonal forecast activity, we have been working to define a standard for data provision in netCDF. This standard is described below.

The proposal is constrained by the CF convention, and we have also tried not to diverge from specifications coming from other well-established communities: SPECS and CMIP5/6.

Additionally, ACDD has also been taken into account when defining the metadata related to data discovery.

Hence, the following links are valuable sources of information that have informed the definition of this proposal:

...

CMIP6 Data Request: MIP variables search

ACDD convention

Encoding Guide for netCDF files conforming to C3S-0.1

Global attributes

The following properties are intended to provide information about where the data came from and what has been done to it. This information is mainly for the benefit of human readers and data discovery mechanisms. The attribute values are all character strings. When an attribute appears both globally and as a variable attribute, it is the variable’s version which has precedence.
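
The precedence rule can be illustrated with a small lookup. This is only a sketch: the function name `effective_attribute` and the variable-level comment value are hypothetical, and attributes are modelled as plain dictionaries rather than read from a netCDF file.

```python
def effective_attribute(name, global_attrs, variable_attrs):
    """Return the value a reader should use for attribute `name`.

    The variable's own attribute takes precedence over the global one;
    None is returned when the attribute is defined at neither level.
    """
    if name in variable_attrs:
        return variable_attrs[name]
    return global_attrs.get(name)


global_attrs = {"comment": "Run by CMCC at CINECA", "frequency": "day"}
variable_attrs = {"comment": "variable-specific note"}  # hypothetical value

print(effective_attribute("comment", global_attrs, variable_attrs))    # variable-level value
print(effective_attribute("frequency", global_attrs, variable_attrs))  # falls back to global
```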

 

Attribute Name | Value | Examples | Comment

Conventions

CF_convention_string  C3S-0.1 [Other convention] :...

"CF-1.6 C3S-0.1"

Multiple conventions may be included (separated by blank spaces)

title

"<short institution name> seasonal forecast model output prepared for C3S"

CF: Free text

ACDD (highly recommended)

"ECMWF seasonal forecast model output prepared for C3S"
A short phrase or sentence describing the dataset. In many discovery systems, the title will be displayed in the results list from a search, and should therefore be human-readable and reasonable to display in a list of such names.
<short institution name> is the first element of the comma-separated list of values of the corresponding "institution" attribute
references

URIs (such as a URL or DOI) for papers or other references. A valid DOI is recommended.

CF: Free text

"doi:10.5194/gmd-8-1509-2015"
Published or web-based references that describe the data or methods used to produce it.
source

The following template should be followed in constructing this string: '<model_id> : atmosphere: <model_name> (<technical_name>, <resolution_and_levels>); ocean: <model_name> (<technical_name>, <resolution_and_levels>); sea ice: <model_name> (<technical_name>); land: <model_name> (<technical_name>)'

Additional explanatory information may follow the required information.

NOTE that for some models, it may not make much sense to include all these components.

The first portion of the string, “model_id”, should be built using the following template:

"model_name-vYYYYMMDD" where YYYYMMDD is the release date of that version of the model (the date when it was first used)

"System4-v20111101: atmos: IFS (CY36R4, TL255L91); ocean: NEMOv3.0 (ORCA1_z42, 1x1L42); Local modifications in NEMOv3.0 (dynamic memory, flexible output, surface flux forcing, closure of fresh water budget). FLAKE lake model "

The method of production of the original data. If it was model-generated, source should name the model and its version, as specifically as is useful.

It is a character string fully identifying the model and version used to generate the output. It should include information concerning the component models.

Note that information about changes in the individual components with respect to the "official" releases should be included (e.g. a different bathymetry).

The "source" attribute should include as much information as possible, not just to identify the model but also to brief the user about it.

institute_id

Controlled Vocabulary:

"ecmf" for ECMWF
"egrr" for Met Office
"lfpw" for Météo-France
"edzw" for DWD
"cmcc" for CMCC

"edzw"

Standardized four-character identifier of the institution that produced the data.
NOTE: all the values are abbreviations taken from the WMO/GRIB "originating centre" table, except CMCC (not available there)
institution

Controlled Vocabulary:

"ECMWF, European Centre for Medium-Range Weather Forecasts, Reading, United Kingdom"

"Met Office, Exeter, United Kingdom"

"Météo-France, Toulouse, France"

"DWD, Deutscher Wetterdienst, Offenbach, Germany"

"CMCC, Centro Euro-Mediterraneo sui Cambiamenti Climatici, Bologna, Italy"

CF: Free text

 

"Météo-France, Toulouse, France"

Specifies where the original data was produced. The name of the institution principally responsible for originating this data.

NOTE: The first element of the comma-separated list of values will be used as a shortened version of this attribute in some of the other global attributes ('summary', 'title')

contact

Copernicus User Support URI should be used
http://copernicus-support.ecmwf.int

CF: Free text

"http://copernicus-support.ecmwf.int"

 

project

"C3S Seasonal Forecast" should be used

CF: Free text

 

"C3S Seasonal Forecast" 
creation_date

SPECS: YYYY-MM-DDThh:mm:ss<zone>

ISO 8601:2004 extended format

"2011-06-24T02:53:46Z"

The date on which this version of the data was created. Modification of values implies a new version, hence this would be assigned the date of the most recent modification of values. Metadata changes are not considered when assigning the creation_date.

NOTE: ACDD 1.3 names this attribute "date_created". The name "creation_date" has been used following the SPECS convention.

comment

Free text
  • "Produced by University of Hamburg for DWD at ECMWF HPC facilities"
  • "Run by CMCC at CINECA"
Miscellaneous information about the data, not captured elsewhere.
forecast_type

Controlled Vocabulary

"forecast" or "hindcast"

"forecast"

To identify the type of data



modeling_realm

Controlled Vocabulary

"atmos", "ocean", "land", "landIce", "seaIce", "aerosol", "atmosChem", "ocnBgchem"

"seaIce"

A string that indicates the high-level modelling component that is particularly relevant to the variable encoded
Controlled vocabulary taken from SPECS


Value depends on the variable (see "global attributes" column in variables tables)

frequency

Controlled Vocabulary

"mon", "day", "12hr", "6hr", "fix"

"day"

A string indicating the interval between individual time-samples.
Controlled vocabulary taken from SPECS (extended to include "12hr" data)

Value depends on the variable (see "global attributes" column in variables tables)

level_type

Controlled Vocabulary

"surface", "pressure", "soil"

"pressure"

A string indicating the type of level on which the variable is defined

Value depends on the variable (see "global attributes" column in variables tables)

history

Empty string

""

To prevent common netCDF tools from polluting this attribute, it must be set to an empty string.

 

commit

timestamp + URL of a commit in a CVS repository

"2017-04-01T13:48:25Z https://software.ecmwf.int/stash/projects/C3SS/repos/ecmf/System4_v20111101"

This attribute is intended to keep track of the tools/scripts used to post-process the data output from the model.

Ideally it should contain the link to a repository containing the specific set of tools and scripts needed to reproduce the same data from the model output. It is highly desirable to have that traceability information.

As a surrogate, when the above is not feasible, it should include the timestamp followed by a URL pointing to the C3S documentation repository of the corresponding model version (properly labelled with the <model_id> introduced in the "source" attribute).

summary

Controlled Vocabulary:
"Seasonal Forecast data produced by <short institution name> as its contribution to the seasonal forecast activity of the Copernicus Climate Change Service (C3S). The data has global coverage with a 1-degree horizontal resolution and spans for around 6 months since the start date"

ACDD (highly recommended)

"Seasonal Forecast data produced by DWD as its contribution to the seasonal forecast activity of the Copernicus Climate Change Service (C3S). The data has global coverage with a 1-degree horizontal resolution and spans for around 6 months since the start date"

A short paragraph describing the dataset

 

<short institution name> is the first element of the comma-separated list of values of the corresponding "institution" attribute

keywords

Fixed string

"Seasonal Forecasts, C3S, ECMWF, Copernicus, Climate Change, Climate Services, Earth Science Services, Environmental Advisories, Climate Advisories"

ACDD (highly recommended)

 

A comma-separated list of key words and phrases.

NOTE: This attribute is likely to be modified in the future, once the contents of the Thesaurus for CDS faceting have been defined

forecast_reference_time

SPECS: YYYY-MM-DDThh:mm:ssZ

NOTE: This is ISO 8601:2004 extended format, but time zone is required to be UTC

"2011-06-01T00:00:00Z"

Time of the analysis from which the forecast was made


Introduced as a global attribute to keep compatibility with SPECS
(note that this works well for the SPECS data structure, i.e. one variable per start time per file)
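
As an illustration of how the templates and controlled vocabularies in the table above fit together, the following sketch assembles a subset of the global attributes as a plain dictionary. The helper names (`build_global_attributes`, `build_source`, `iso_utc`) are hypothetical and not part of the C3S specification, the "Conventions" value is simply the example from the table, and free-text attributes (references, comment, keywords, commit) are left out. The punctuation of the "source" string follows the System4 example rather than the template.

```python
from datetime import date, datetime, timezone

# Controlled vocabularies transcribed from the table above.
INSTITUTIONS = {
    "ecmf": "ECMWF, European Centre for Medium-Range Weather Forecasts, Reading, United Kingdom",
    "egrr": "Met Office, Exeter, United Kingdom",
    "lfpw": "Météo-France, Toulouse, France",
    "edzw": "DWD, Deutscher Wetterdienst, Offenbach, Germany",
    "cmcc": "CMCC, Centro Euro-Mediterraneo sui Cambiamenti Climatici, Bologna, Italy",
}
VOCABULARIES = {
    "forecast_type": {"forecast", "hindcast"},
    "modeling_realm": {"atmos", "ocean", "land", "landIce", "seaIce",
                       "aerosol", "atmosChem", "ocnBgchem"},
    "frequency": {"mon", "day", "12hr", "6hr", "fix"},
    "level_type": {"surface", "pressure", "soil"},
}
SUMMARY_TEMPLATE = (
    "Seasonal Forecast data produced by {short} as its contribution to the "
    "seasonal forecast activity of the Copernicus Climate Change Service (C3S). "
    "The data has global coverage with a 1-degree horizontal resolution and "
    "spans for around 6 months since the start date"
)


def iso_utc(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ssZ (UTC)."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_source(model_name, release, components):
    """Build "<model_id>: <component>: <description>; ..." from (label, description) pairs."""
    model_id = f"{model_name}-v{release:%Y%m%d}"
    parts = "; ".join(f"{label}: {description}" for label, description in components)
    return f"{model_id}: {parts}"


def build_global_attributes(institute_id, start, created, **controlled):
    """Return a dict of global attributes, rejecting values outside the vocabularies."""
    if institute_id not in INSTITUTIONS:
        raise ValueError(f"unknown institute_id: {institute_id!r}")
    for name, allowed in VOCABULARIES.items():
        if controlled.get(name) not in allowed:
            raise ValueError(f"{name}={controlled.get(name)!r} is not one of {sorted(allowed)}")
    institution = INSTITUTIONS[institute_id]
    short = institution.split(",")[0].strip()  # first element of the comma-separated list
    attrs = {
        "Conventions": "CF-1.6 C3S-0.1",
        "title": f"{short} seasonal forecast model output prepared for C3S",
        "institute_id": institute_id,
        "institution": institution,
        "contact": "http://copernicus-support.ecmwf.int",
        "project": "C3S Seasonal Forecast",
        "creation_date": iso_utc(created),
        "forecast_reference_time": iso_utc(start),
        "history": "",  # deliberately empty
        "summary": SUMMARY_TEMPLATE.format(short=short),
    }
    attrs.update({name: controlled[name] for name in VOCABULARIES})
    return attrs


attrs = build_global_attributes(
    "edzw",
    start=datetime(2011, 6, 1, tzinfo=timezone.utc),
    created=datetime(2011, 6, 24, 2, 53, 46, tzinfo=timezone.utc),
    forecast_type="forecast", modeling_realm="atmos",
    frequency="day", level_type="surface",
)
print(attrs["title"])
print(build_source("System4", date(2011, 11, 1),
                   [("atmosphere", "IFS (CY36R4, TL255L91)"),
                    ("ocean", "NEMOv3.0 (ORCA1_z42, 1x1L42)")]))
```

The resulting dictionary could then be written to a file with whichever netCDF library is in use; that step is intentionally not shown here.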

...

NOTE about the horizontal coordinates: The regridding procedure used to provide the data on the 1-degree grid must take into account that the full definition of the grid cells is given by the cell boundaries (lat_bnds, lon_bnds)
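
To make the role of the boundaries concrete, the sketch below derives lat_bnds and lon_bnds for a regular 1-degree grid. The cell-centre positions (-89.5 to 89.5 and 0.5 to 359.5) are an assumption made for illustration only, since this note does not fix where the centres lie, and `cell_bounds` is a hypothetical helper.

```python
def cell_bounds(centres, spacing):
    """Return (lower, upper) boundary pairs for regularly spaced cell centres."""
    half = spacing / 2.0
    return [(centre - half, centre + half) for centre in centres]


# Assumed cell centres of a global 1-degree grid (illustrative only).
lat = [-89.5 + i for i in range(180)]
lon = [0.5 + i for i in range(360)]

lat_bnds = cell_bounds(lat, 1.0)
lon_bnds = cell_bounds(lon, 1.0)

# Adjacent cells share an edge, so the cells tile the globe without gaps.
contiguous = all(lat_bnds[i][1] == lat_bnds[i + 1][0] for i in range(len(lat) - 1))
print(lat_bnds[0], lat_bnds[-1], lon_bnds[0], lon_bnds[-1], contiguous)
```

A regridding method that conserves area-integrated quantities needs these cell edges, not just the centre coordinates.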

...

reftime
  Type: double
  Dimension Names: N/A
  Axis: N/A
  standard_name: forecast_reference_time
  long_name: "Start date of the forecast"
  calendar: gregorian
  units: UDUNITS time units, e.g. "hours since YYYY-MM-DD hh:mm:ss TZhh:TZmm"
  bounds: N/A
  Notes: In SPECS it is only given as a "global_attribute". It has been additionally introduced here as a coordinate variable to ease future netCDF management (e.g. file merging).

leadtime
  Type: double
  Dimension Names: leadtime
  Axis: N/A
  standard_name: forecast_period
  long_name: "Time elapsed since the start of the forecast"
  calendar: N/A
  units: SPECS: days. C3S: requested units can be relaxed to equivalent time units.
  bounds: leadtime_bnds
  Notes: The interval of time between the forecast reference time and the valid time. Boundaries are not needed when this time coordinate is used for instantaneous values (note that "time:point" is used as cell_method in those cases). When boundaries are required, the value of the coordinate must be in the centre of the corresponding time cell boundaries.

time
  Type: double
  Dimension Names: leadtime
  Axis: N/A
  standard_name: time
  long_name: "Verification time of the forecast"
  calendar: gregorian
  units: SPECS: "days since 1850-01-01". C3S: requested units can be relaxed to equivalent time units.
  bounds: time_bnds
  Notes: Time for which the forecast is valid. Boundaries are not needed when this time coordinate is used for instantaneous values (note that "time:point" is used as cell_method in those cases). When boundaries are required, the value of the coordinate must be in the centre of the corresponding time cell boundaries.
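
The "centre of the cell boundaries" requirement can be sketched as follows for daily means. The choice of hours as the unit and the three-day length are assumptions for illustration, and `centre_of` is a hypothetical helper.

```python
def centre_of(bounds):
    """Return the midpoint of a (lower, upper) time cell."""
    lower, upper = bounds
    return lower + (upper - lower) / 2.0


# Assumed example: three daily-mean cells, expressed in hours since the start date.
leadtime_bnds = [(24.0 * day, 24.0 * (day + 1)) for day in range(3)]
leadtime = [centre_of(cell) for cell in leadtime_bnds]

print(leadtime_bnds)  # [(0.0, 24.0), (24.0, 48.0), (48.0, 72.0)]
print(leadtime)       # [12.0, 36.0, 60.0]
```

For instantaneous values no boundaries are written, so the coordinate value is simply the instant itself.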

 

...

NOTE: Even though the requested time steps differ among the variables (6h, 12h, 24h), just one set of time axes has been defined, as that is sufficient under the requirement of "one variable per file"


Warning

"leadtime" has been selected as dimension (instead of "time") for both time and leadtime. That means "leadtime" is the coordinate and "time" is an auxiliary coordinate

  • This diverges from SPECS (where "time" was the name of both the dimension and the coordinate, and "leadtime" was an auxiliary coordinate)
  • This choice was made because
    1. reftime and leadtime are the relevant (let's say "orthogonal") coordinates in the relationship time = reftime + leadtime
    2. it has some advantages when merging netCDF files ("leadtime" can easily be shared by different variables in a merged file, while "time" cannot)
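
The relationship time = reftime + leadtime, and the reason "leadtime" is easier to share than "time", can be sketched with the standard library. The start dates and the 6-hourly steps are illustrative assumptions, and `valid_times` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def valid_times(reftime, leadtime_hours):
    """Apply time = reftime + leadtime for every lead time (given in hours)."""
    return [reftime + timedelta(hours=hours) for hours in leadtime_hours]


# Assumed example: the same lead times used by two different start dates.
leadtime_hours = [6, 12, 18, 24]
june_start = datetime(2011, 6, 1, tzinfo=timezone.utc)
july_start = datetime(2011, 7, 1, tzinfo=timezone.utc)

june_times = valid_times(june_start, leadtime_hours)
july_times = valid_times(july_start, leadtime_hours)

# The leadtime axis is identical for both start dates; the time values are not.
print(june_times[-1].isoformat())  # 2011-06-02T00:00:00+00:00
print(july_times[-1].isoformat())  # 2011-07-02T00:00:00+00:00
```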

 

 

Cell boundaries

As described in section 7.1, "Cell Boundaries", of the CF convention.

...

attribute name | value | comment
grid_mapping | hcrs |
_FillValue | 1.0e20 | These attributes must be included only when the variable has missing values; they must not be present otherwise.
missing_value | 1.0e20 |

Conditional attributes

The following attributes may be included in the attribute list for a given variable if the conditions specified are fulfilled:

attribute name | value | comment
_FillValue | Set by netCDF library | These attributes must be included ONLY when the variable has missing values. If missing values are NOT present in the actual data field values, the attributes should NOT be included as variable attributes.
missing_value | Set by netCDF library |
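
The condition can be sketched as a check on the data field before any attribute is attached. The helper name `missing_value_attributes` is hypothetical, the field is modelled as a flat list of numbers, and the sentinel 1.0e20 is used purely for illustration; the sentinel is a parameter so that whatever value applies can be passed in.

```python
def missing_value_attributes(values, fill_value):
    """Return the attributes to attach, or an empty dict when nothing is missing."""
    has_missing = any(value == fill_value for value in values)
    if not has_missing:
        return {}
    return {"_FillValue": fill_value, "missing_value": fill_value}


FILL = 1.0e20  # illustrative sentinel
complete_field = [271.3, 272.8, 270.1]
masked_field = [271.3, FILL, 270.1]

print(missing_value_attributes(complete_field, FILL))  # {}
print(missing_value_attributes(masked_field, FILL))
```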

Candidate attributes

The following attributes may be included in the attribute list for a given variable at a later date as this standard evolves:

attribute name | value | comment
valid_max | TBD | see below
valid_min | TBD | see below

Info: valid range attributes

Both "valid_min" and "valid_max" attributes should be included as variable attributes, but a set of sensible values for each variable still needs to be provided.

In the meantime, it is proposed not to include them, bearing in mind that they will become required at some point in the future.

...