Introduction
(include an explanation about previous conventions.... SPECS, CF, ACDD)
include link to standard names, and list of variables...
Encoding Guide
Global attributes
The following properties are intended to provide information about where the data came from and what has been done to it. This information is mainly for the benefit of human readers and data discovery mechanisms. The attribute values are all character strings. When an attribute appears both globally and as a variable attribute, the variable’s version has precedence.
Attribute Name | Value | Examples | Comment |
---|---|---|---|
Conventions | CF convention string [Other convention] :... | "CF-1.6" "CF-1.6 C3S-0.1" | Multiple conventions may be included (separated by blank spaces) |
title | A controlled vocabulary will be provided CF: Free text ACDD (highly recommended) | "IPSL-CM5A-LR model output prepared for CMIP5 RCP4.5" | A short phrase or sentence describing the dataset. In many discovery systems, the title will be displayed in the results list from a search, and therefore should be human readable and reasonable to display in a list of such names |
references | URIs (such as a URL or DOI) for papers or other references. A valid doi is recommended CF: Free text | "doi:10.5194/gmd-8-1509-2015" | Published or web-based references that describe the data or methods used to produce it. |
source | A methodology to build this attribute will be provided | | The method of production of the original data. If it was model-generated, source should name the model and its version, as specifically as could be useful. |
institution | A controlled vocabulary will be provided CF: Free text | "Met Office" | Specifies where the original data was produced. The name of the institution principally responsible for originating this data. |
contact | Copernicus User Support URI should be used CF: Free text | "http://copernicus-support.ecmwf.int" | |
project | "C3S Seasonal Forecast" should be used CF: Free text | "C3S Seasonal Forecast" | |
creation_date | SPECS: YYYY-MM-DDThh:mm:ss<zone> ISO 8601:2004 extended format | "2011-06-24T02:53:46Z" | The date on which this version of the data was created. Modification of values implies a new version, hence this would be assigned the date of the most recent values modification. Metadata changes are not considered when assigning the creation_date NOTE: The ACDD 1.3 names this attribute as |
comment | Free text | | Miscellaneous information about the data, not captured elsewhere. |
forecast_type | "forecast" or "hindcast" | "forecast" | To identify the type of data |
history | Each line should begin with a timestamp indicating the date and time of day when the program was executed CF: Free Text | | To record relevant information, such as the command history which led to this file being produced. Provides an audit trail for modifications to the original data. |
commit, iso_lineage or lineage | Free text (ISO Lineage model 19115-2) | "Produced using CDS Toolbox v1.0" | Trace of the tools/scripts used. Paco: include information about the versioning of the software used to create the data. Antonio S. Cofino Gonzalez: We need more implementation examples on this. This could be achieved in the EQC WP, where metadata is part of their activities (i.e. WP4@QA4SEAS). ISO 19115-2 defines a lineage model where this is considered. TBD. |
summary | The content will be provided ACDD (highly recommended): Text, defined phrase | A short paragraph describing the dataset | |
keywords | The content will be provided ACDD (highly recommended) : text, controlled vocabulary | A comma separated list of key words and phrases. | |
forecast_reference_time | SPECS: YYYY-MM-DDThh:mm:ssZ NOTE: This is ISO 8601:2004 extended format, but time zone is required to be UTC | "2011-06-01T00:00:00Z" | time of the analysis from which the forecast was made |
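Some of the constraints in the table above can be checked mechanically. The following is a minimal Python sketch, not part of the convention: the function name `check_global_attributes` and the choice of which attributes are treated as mandatory (`REQUIRED`) are assumptions made for illustration only.

```python
import re
from datetime import datetime

# forecast_reference_time: ISO 8601 extended format with the time zone fixed to UTC ("Z").
UTC_TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$")

# Assumption: the table does not state which attributes are mandatory;
# this subset is hypothetical and used only to illustrate the check.
REQUIRED = ("Conventions", "title", "institution", "forecast_type",
            "forecast_reference_time")


def check_global_attributes(attrs):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for name in REQUIRED:
        if name not in attrs:
            problems.append(f"missing attribute: {name}")
    # The attribute values are all character strings.
    for name, value in attrs.items():
        if not isinstance(value, str):
            problems.append(f"{name} must be a character string")
    forecast_type = attrs.get("forecast_type")
    if forecast_type is not None and forecast_type not in ("forecast", "hindcast"):
        problems.append("forecast_type must be 'forecast' or 'hindcast'")
    reference_time = attrs.get("forecast_reference_time")
    if isinstance(reference_time, str):
        if not UTC_TIMESTAMP.match(reference_time):
            problems.append("forecast_reference_time must look like YYYY-MM-DDThh:mm:ssZ")
        else:
            try:
                # Rejects impossible dates such as month 13.
                datetime.strptime(reference_time, "%Y-%m-%dT%H:%M:%SZ")
            except ValueError:
                problems.append("forecast_reference_time is not a valid date")
    return problems


example = {
    "Conventions": "CF-1.6 C3S-0.1",
    "title": "IPSL-CM5A-LR model output prepared for CMIP5 RCP4.5",
    "institution": "Met Office",
    "forecast_type": "forecast",
    "forecast_reference_time": "2011-06-01T00:00:00Z",
}
print(check_global_attributes(example))
```

The example values are taken from the Examples column of the table; a real checker would read them from the file's global attributes.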
Spatial Coordinates
Type (CMIP5) | Coordinate Name (CMIP5) | Dimension Names (CMIP5) | Axis | standard_name | long_name (CMIP5) | units (CF canonical units) | positive | valid_min (CMIP5) | valid_max (CMIP5) | bounds | Notes |
---|---|---|---|---|---|---|---|---|---|---|---|
double | lat | lat | Y | latitude | latitude | degrees_north | N/A | -90. | 90. | lat_bounds | Values (1x1deg grid) prescribed: [-90. , -89. , ..., 0., ... 90.] |
double | lon | lon | X | longitude | longitude | degrees_east | N/A | 0. | 360. | lon_bounds | Values (1x1deg grid) prescribed: dimension lon=360 [0. , 1. , ..., 358., 359.] |
double | plev | plev | Z | air_pressure | pressure | Pa | down | N/A | N/A | bounds? | This is also referred to as isobaric level by some tools [925., 850., 700., 500., 400., 300., 200., 100., 50., 30., 10.] |
double | depth | depth | Z | depth | depth | m | down | N/A | N/A | depth_bounds | Only used for soil model levels NOTE: Number and depth of levels is not prescribed by C3S |
double | height | height | Z | height | height | m | up or down | CMIP5: 2mtemp: 1. | CMIP5: 2mtemp: 10. | Used for single level fields (height, soil,SST) e.g. 2 m (for Temperature) | |
C3S: string | realization | C3S: realization_dim CF: a different name is needed for dim/variable | E | realization | realization | 1 | N/A | N/A | N/A | Members are not a physical quantity. Realization is a discrete coordinate and the members are its categorical values (ordered or non-ordered) |
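The prescribed latitude and longitude values for the 1x1 degree grid can be generated rather than typed by hand. The sketch below is illustrative only; the helper names `regular_axis` and `within_valid_range` are hypothetical.

```python
def regular_axis(start, stop, step=1.0):
    """Return evenly spaced values from start to stop, both inclusive."""
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]


def within_valid_range(values, valid_min, valid_max):
    """True when every value lies inside [valid_min, valid_max]."""
    return all(valid_min <= v <= valid_max for v in values)


# Values as prescribed in the table for the 1x1 degree grid.
lat = regular_axis(-90.0, 90.0)   # -90., -89., ..., 90.
lon = regular_axis(0.0, 359.0)    # 0., 1., ..., 359.

# valid_min / valid_max as listed in the table.
print(len(lat), within_valid_range(lat, -90.0, 90.0))
print(len(lon), within_valid_range(lon, 0.0, 360.0))
```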
Time Coordinates
Type | Coordinate Name | Dimension Names | Axis | standard_name | long_name | calendar | units | bounds | Notes |
---|---|---|---|---|---|---|---|---|---|
double | reftime | N/A | T | forecast_reference_time | "Start date of the forecast" | gregorian | UDUNITS time units e.g. "hours since YYYY-MM-DD hh:mm:ss TZhh:TZmm" | bounds? | In SPECS it was a "global_attribute" It has been additionally introduced here as a coordinate variable to ease future netCDF management (e.g. file merging) |
double | leadtime | time | N/A | forecast_period | "Time elapsed since the start of the forecast" | N/A | SPECS: days | bounds? | The interval of time between the forecast reference time and the valid time |
double | time | time | T | time | "Verification time of the forecast" | gregorian | SPECS: "days since 1850-01-01" C3S: requested units can be relaxed to equivalent time units | time_bounds | Time for which the forecast is valid |
NOTE: Definitions for "leadtime" and "time" have been taken from SPECS. The introduction of "reftime" as a variable has been adapted from SPECS global attribute description for the forecast reference time.
NOTE: Even though there are different requested time steps among the variables (6h, 12h, 24h), just one set of time axes has been defined, as that would be enough when applying the requirement of "one variable per file"
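The relationship between the three time coordinates (valid time = forecast reference time + lead time) can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, lead times are taken in days as in SPECS, and Python's `datetime` uses the proleptic Gregorian calendar, which agrees with the "gregorian" calendar for the modern dates used here.

```python
from datetime import datetime, timedelta, timezone

# Epoch taken from the SPECS units string "days since 1850-01-01".
EPOCH = datetime(1850, 1, 1, tzinfo=timezone.utc)


def days_since_epoch(moment):
    """Encode a datetime as a number of days since the epoch."""
    return (moment - EPOCH) / timedelta(days=1)


def valid_time(reftime, leadtime_days):
    """Valid (verification) time = forecast reference time + lead time."""
    return reftime + timedelta(days=leadtime_days)


reftime = datetime(2011, 6, 1, tzinfo=timezone.utc)   # start date of the forecast
leadtimes = [1.0, 2.0, 3.0]                            # forecast_period, in days

# Values of the "time" coordinate for this start date.
times = [days_since_epoch(valid_time(reftime, d)) for d in leadtimes]
print(times)
```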
Cell boundaries
As described in section 7.1 Cell Boundaries of CF convention.
To represent cells we add the attribute bounds to the appropriate coordinate variable(s). The value of bounds is the name of the variable that contains the vertices of the cell boundaries. We refer to this type of variable as a "boundary variable." A boundary variable will have one more dimension than its associated coordinate or auxiliary coordinate variable. The additional dimension should be the most rapidly varying one, and its size is the maximum number of cell vertices. Since a boundary variable is considered to be part of a coordinate variable’s metadata, it is not necessary to provide it with attributes such as long_name and units.
Bounds Name | Dimensions | Comments |
---|---|---|
time_bounds | time,bounds | e.g. [0,24] is that convention always valid? |
lat_bounds | lat, bounds | Values (1x1deg grid) prescribed: [-90., -89.], [-89., -88.], ... [89., 90.] |
lon_bounds | lon, bounds | Values (1x1deg grid) prescribed: [0., 1.], [1., 2.], ... [359., 360.] |
depth_bounds | depth,bounds | Should define the full vertical extent of the soil model layers |
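Contiguous boundary pairs like the latitude and longitude ones in the table can be built from a list of cell edges. The following Python sketch is illustrative; `contiguous_bounds` is a hypothetical helper, and the result is a plain list of `[lower, upper]` pairs, matching the `(coordinate, bounds)` dimension order with the two-element bounds dimension varying fastest.

```python
def contiguous_bounds(edges):
    """Pair consecutive edge values into [lower, upper] cells."""
    return [[lower, upper] for lower, upper in zip(edges[:-1], edges[1:])]


# Cell edges for 1-degree cells.
lat_edges = [float(v) for v in range(-90, 91)]   # -90., -89., ..., 90.
lon_edges = [float(v) for v in range(0, 361)]    # 0., 1., ..., 360.

lat_bounds = contiguous_bounds(lat_edges)
lon_bounds = contiguous_bounds(lon_edges)

print(len(lat_bounds), lat_bounds[0], lat_bounds[-1])
print(len(lon_bounds), lon_bounds[0], lon_bounds[-1])
```

Building the pairs from edges yields one cell fewer than the number of edges, so 180 latitude cells and 360 longitude cells in this sketch.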
Grid mapping
As described in section 5.6 Grid Mappings and Projections of CF convention.
A grid_mapping_name of latitude_longitude may be used to specify the ellipsoid and prime meridian:
char latitude_longitude ;
latitude_longitude:grid_mapping_name = "latitude_longitude" ;
NOTE: Wouldn't a different name be more appropriate? (e.g. CRS as in CF examples)
Variables
NOTE: The coordinates attribute should list the auxiliary coordinate(s) first, followed by all the other coordinates. Open question: should reftime and leadtime be included as well?
NOTE: Open question: should the data type be double or real?
Static Fields
attributes | |||||||
name (CMIP5) | dimensions | standard_name | long_name (CMIP5) | units | coordinates | grid_mapping | NOTES |
---|---|---|---|---|---|---|---|
sftlf | lat,lon | land_area_fraction | "Land Area Fraction" | 1 | "lat lon" | latitude_longitude | |
orog | lat,lon | surface_altitude | "Surface Altitude" | m | "lat lon" | latitude_longitude |
Surface Fields (defined at a given height level)
attributes | ||||||||
name (CMIP5) | dimensions | standard_name | long_name (CMIP5) | units | coordinates | cell_methods | grid_mapping | NOTES |
---|---|---|---|---|---|---|---|---|
tas | time,lat,lon | air_temperature | "Near-Surface Air Temperature" | K | "height time lat lon" | "time: point" | latitude_longitude | height is usually 2m |
tasmax | time,lat,lon | air_temperature | "Daily Maximum Near-Surface Air Temperature" | K | "height time lat lon" | "time: maximum (interval: <value> <unit>)" C3S: required. CF: interval is optional | latitude_longitude | height is usually 2m. C3S: the interval is required to have a value <= 3 hours |
tasmin | time,lat,lon | air_temperature | "Daily Minimum Near-Surface Air Temperature" | K | "height time lat lon" | "time: minimum (interval: <value> <unit>)" C3S: required. CF: interval is optional | latitude_longitude | height is usually 2m. C3S: the interval is required to have a value <= 3 hours |
| | time,lat,lon | dew_point_temperature | | K | "height time lat lon" | "time: point" C3S: required CF: recommended | latitude_longitude | height is usually 2m |
uas | time,lat,lon | x_wind | Eastward Near-Surface Wind | m s-1 | "height time lat lon" | "time: point" C3S: required CF: recommended | latitude_longitude | height is usually 10m |
vas | time,lat,lon | y_wind | Northward Near-Surface Wind | m s-1 | "height time lat lon" | "time: point" C3S: required CF: recommended | latitude_longitude | height is usually 10m |
| | time,lat,lon | wind_speed_of_gust | | m s-1 | "height time lat lon" | "time: maximum (interval: <value> <unit>)" C3S: required. CF: interval is optional | latitude_longitude | height is usually 10m. C3S: the interval is required to have a value <= 3 hours |
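The C3S requirement on the cell_methods interval (a value of at most 3 hours for maximum/minimum fields) can be checked with a small parser. The Python sketch below is illustrative and handles only the simple single-method form shown in the table; the function names and the list of recognised unit spellings are assumptions.

```python
import re

# Matches e.g. "time: point" or "time: maximum (interval: 3 hours)".
CELL_METHOD = re.compile(
    r"^(?P<axis>\w+): (?P<method>\w+)"
    r"(?: \(interval: (?P<value>\d+(?:\.\d+)?) (?P<unit>\w+)\))?$"
)

# Assumption: only these unit spellings are handled in this sketch.
HOURS_PER_UNIT = {"hours": 1.0, "hour": 1.0, "h": 1.0,
                  "minutes": 1.0 / 60.0, "days": 24.0}


def interval_hours(cell_methods):
    """Return the interval in hours, or None when no interval is given."""
    match = CELL_METHOD.match(cell_methods)
    if match is None:
        raise ValueError(f"unsupported cell_methods string: {cell_methods!r}")
    if match.group("value") is None:
        return None
    unit = match.group("unit")
    if unit not in HOURS_PER_UNIT:
        raise ValueError(f"unsupported interval unit: {unit!r}")
    return float(match.group("value")) * HOURS_PER_UNIT[unit]


def meets_c3s_interval(cell_methods, max_hours=3.0):
    """True when an interval is present and does not exceed max_hours."""
    hours = interval_hours(cell_methods)
    return hours is not None and hours <= max_hours


print(meets_c3s_interval("time: maximum (interval: 3 hours)"))
print(meets_c3s_interval("time: minimum (interval: 6 hours)"))
```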
Surface Fields (not defined at a height level)
attributes | ||||||||
name (CMIP5) | dimensions | standard_name | long_name (CMIP5) | units | coordinates | cell_methods | grid_mapping | NOTES |
---|---|---|---|---|---|---|---|---|
| | time,lat,lon | air_pressure_at_sea_level | | Pa | | "time: point" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | cloud_area_fraction | | 1 | | "time: point" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | soil_temperature | | K | | "time: point" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | sea_surface_temperature | | K | | "time: point" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | sea_ice_temperature | | K | | "time: point" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | sea_ice_area_fraction | | 1 | | "time: point" C3S: required CF: recommended | latitude_longitude | |
| | time,depth,lat,lon | mass_content_of_water_in_soil_layer | | kg m-2 | | "time: point" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | lwe_thickness_of_surface_snow_amount | | m | | "time: point" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | snow_density | | kg m-3 | | "time: point" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | lwe_thickness_of_stratiform_precipitation_amount | | m | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | lwe_thickness_of_convective_precipitation_amount | | m | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | lwe_thickness_of_precipitation_amount | | m | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | lwe_thickness_of_snowfall_amount | | m | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | integral_of_surface_downward_sensible_heat_flux_wrt_time | | W s m-2 | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | integral_of_surface_downward_latent_heat_flux_wrt_time | | W s m-2 | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | integral_of_surface_downwelling_shortwave_flux_in_air_wrt_time | | W s m-2 | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | integral_of_surface_net_downward_shortwave_flux_wrt_time | | W s m-2 | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | integral_of_surface_net_downward_longwave_flux_wrt_time | | W s m-2 | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | integral_of_toa_net_downward_shortwave_flux_wrt_time | | W s m-2 | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | integral_of_toa_net_downward_longwave_flux_wrt_time | | W s m-2 | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | integral_of_toa_incoming_shortwave_flux_wrt_time | | W s m-2 | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | integral_of_surface_downward_eastward_stress_wrt_time | | Pa s | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | integral_of_surface_downward_northward_stress_wrt_time | | Pa s | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | lwe_thickness_of_water_evaporation_amount | | m | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | runoff_amount | | kg m-2 | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | surface_runoff_amount | | kg m-2 | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
| | time,lat,lon | subsurface_runoff_amount | | kg m-2 | | "time: sum" C3S: required CF: recommended | latitude_longitude | |
Pressure Level Fields
attributes | ||||||||
name (CMIP5) | dimensions | standard_name | long_name (CMIP5) | units | coordinates | cell_methods | grid_mapping | NOTES |
---|---|---|---|---|---|---|---|---|
| | time,lat,lon | air_pressure_at_sea_level | | Pa | | "time: point" C3S: required CF: recommended | latitude_longitude | |
Additional Questions to be addressed
Question | Discussion | Decision |
---|---|---|
File format to be used? | Francisco Doblas-Reyes NetCDF4? With or without compression? Kevin Marsh netCDF4 classic model (with deflate =6 suggested by Pierre-Antoine) | |
File naming? | Kevin Marsh Pierre-Antoine Bretonniere proposed following the SPECS convention | |
forecast/hindcast matching and labelling | ||
File size recommendation (maximum size)? | Kevin Marsh Pierre-Antoine Bretonniere suggested 4GB recommended maximum size | Kevin Marsh recommend 4GB Max Size for data files |
Versioning of data files? | ||
DOI | Kevin Marsh DOI likely to be assigned at dataset level | Kevin Marsh DOI likely to be assigned at dataset level |
Variable short names to be specified? | Kevin Marsh Antonio S. Cofino Gonzalez suggested following CMIP5 short names | Kevin Marsh follow CMIP5 short names |
Coordinate short names to be specified? | Kevin Marsh Antonio S. Cofino Gonzalez suggested following CMIP5 coordinate short names | Kevin Marsh follow CMIP5 coordinate short names |
Extension to include ocean data for C3S? | Kevin Marsh yes, but not in the initial convention release | Kevin Marsh Not considered in initial release |
Grids, resolution etc to be specified? | Kevin Marsh Antonio S. Cofino Gonzalez agreed 1 degree grid specified with valid max/min, but actual grid points not specified | Kevin Marsh 1 degree grid specified with valid max/min, but actual grid points not specified |
MARS attributes to be specified? | Kevin Marsh These will be added by C3S, rather than data provider | Kevin Marsh These will be added by C3S |
standard name request/assignment process? | Kevin Marsh requested via standard name mailing list. Note that this process can take some considerable time. | Kevin Marsh requested via standard name mailing list |
Discussion about time coordinates
NOTE: The SPECS approach (2 1D time coordinates) has been chosen for the "providers" convention
The encoding of multiple time coordinates requires particular consideration. An explicit example of the structure is given below.
Example of encoding data with multiple time axes
double forecast_reference_time(forecast_reference_time) ;
    forecast_reference_time:bounds = "forecast_reference_time_bnds" ;
    forecast_reference_time:units = "hours since 1970-01-01 00:00:00" ;
    forecast_reference_time:standard_name = "forecast_reference_time" ;
    forecast_reference_time:calendar = "gregorian" ;
double leadtime(leadtime) ;
    leadtime:bounds = "leadtime_bnds" ;
    leadtime:units = "hours" ;
    leadtime:standard_name = "forecast_period" ;
    leadtime:calendar = "gregorian" ;
double time(forecast_reference_time,leadtime) ;
    time:axis = "T" ;
    time:bounds = "time_bnds" ;
    time:units = "hours since 1970-01-01 00:00:00" ;
    time:standard_name = "time" ;
float temp(forecast_reference_time,leadtime,pressure,latitude,longitude) ;
    temp:units = "K" ;
    temp:standard_name = "air_temperature" ;
    temp:coordinates = "time" ;
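How the two-dimensional time(forecast_reference_time, leadtime) variable in the structure above would be filled can be sketched in Python. The start dates and lead times below are made up for illustration, and the helper name `hours_since_epoch` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Epoch from the units string "hours since 1970-01-01 00:00:00".
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def hours_since_epoch(moment):
    """Encode a datetime as a number of hours since the epoch."""
    return (moment - EPOCH) / timedelta(hours=1)


# Illustrative values only: two start dates and three lead times (in hours).
reference_times = [
    datetime(2011, 5, 1, tzinfo=timezone.utc),
    datetime(2011, 6, 1, tzinfo=timezone.utc),
]
leadtime = [24.0, 48.0, 72.0]

# 1D coordinate variable forecast_reference_time(forecast_reference_time).
forecast_reference_time = [hours_since_epoch(t) for t in reference_times]

# 2D auxiliary coordinate time(forecast_reference_time, leadtime):
# each element is the reference time plus the lead time.
time = [[ref + lead for lead in leadtime] for ref in forecast_reference_time]

for row in time:
    print(row)
```

Each row of `time` corresponds to one start date and each column to one lead time, which is why the data variable lists `time` in its coordinates attribute rather than using it as a dimension.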
Francisco Doblas-Reyes: I interpret this as the time coordinates being a hypercube, where there could be missing data; this won't be consistent with the CMIP files; I still find this confusing unless a discussion about what to do with the missing data is undertaken.
Eduardo Penabad: Wouldn't that be solved by clarifying that different variables within the same file could potentially have different time coordinates/dimensions?
Francisco Doblas-Reyes Not sure. If to simplify you assume one variable only and this variable has in one file data for two start dates, one with three forecast time steps and another one with only two, the time dimensions will be forecast_reference_time=2, leadtime=3, but one of the values of temp() will have missing values, unless I haven't understood the model.
Antonio S. Cofino Gonzalez: discussion on multi-time dimension data
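The scenario raised in the discussion above (one variable, two start dates, one with three forecast time steps and one with only two) can be made concrete with a short Python sketch. The temperature values, the start date labels and the fill value are invented for illustration, and `to_hypercube` is a hypothetical helper; the sketch only shows where the missing value would appear, not how the convention should handle it.

```python
# Assumption: a fill value marks missing data; 1.0e20 is used here only as an example.
FILL_VALUE = 1.0e20

# Illustrative data: forecast values per start date (made-up numbers).
series_by_start = {
    "2011-05-01": [280.1, 280.4, 280.9],   # three forecast time steps
    "2011-06-01": [281.0, 281.3],          # only two forecast time steps
}


def to_hypercube(series, fill):
    """Pad every series to the longest one so the data form a full rectangle."""
    starts = sorted(series)
    width = max(len(values) for values in series.values())
    cube = [series[s] + [fill] * (width - len(series[s])) for s in starts]
    return starts, cube


starts, temp = to_hypercube(series_by_start, FILL_VALUE)

# Dimensions: forecast_reference_time=2, leadtime=3; one element is missing.
print(len(temp), len(temp[0]))
print(sum(value == FILL_VALUE for row in temp for value in row))
```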