Standards and Conventions
File formatting
File Structure
File Naming
Metadata
Summary
NetCDF Dataset Design Overview
Component | Specification | Example | Notes |
---|---|---|---|
File Structure | Single output variable per file | egrr_enfh_atmos_day_plev_201601A_19950417-19950418_ta_r3.nc | |
Directory Structure | ec:/copernicus/c3s/<activity>/<institute>/<stream>/<modeling realm>/<frequency>/<production date and start date identifier>/<data year>/<data month>/<data day>/<variable MARS name>/<ensemble member>/ | ec:/copernicus/c3s/seasonal/egrr/enfh/atmos/month/201601A/1995/04/17/ta/r1/ | |
File Format | netCDF 4 | | |
File Names | <institute>_<stream>_<modeling realm>_<frequency>_<level>_<production date and start date identifier>_<data year><data month><data day>[-<data year><data month><data day>]_<variable MARS name>_<ensemble member>.nc | egrr_enfh_atmos_day_sfc_201601A_19950417-19950418_ta_r3.nc egrr_enfh_atmos_month_plev_201601A_199504-199505_ta_r3.nc | "201601A" is a placeholder until a format for representing the model version, production year and start date is determined. One option: egrr_enfh_atmos_month_plev_P2016_M1A_S19950401_199504-199505_ta_r3.nc, where P = production year, M = model version, S = start date. Alternatively, the filename could be something like: egrr_enfh_atmos_month_plev_S19950401_199504-199505_ta_r3p20160101m411.nc |
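The file-name and directory patterns in the table above can be sketched in Python as follows. This is a minimal illustration, not part of the convention: the function names and the regular expression are hypothetical, it assumes that no component contains an underscore and that dates are written as YYYYMM or YYYYMMDD, and it only covers the "201601A" placeholder form of the identifier.

```python
import re

# Assumptions (not part of the convention): components contain no underscore,
# dates are YYYYMM or YYYYMMDD, and the identifier uses the placeholder form.
FILENAME_RE = re.compile(
    r"^(?P<institute>[^_]+)_(?P<stream>[^_]+)_(?P<realm>[^_]+)"
    r"_(?P<frequency>[^_]+)_(?P<level>[^_]+)_(?P<production_id>[^_]+)"
    r"_(?P<start>\d{6}(?:\d{2})?)(?:-(?P<end>\d{6}(?:\d{2})?))?"
    r"_(?P<variable>[^_]+)_(?P<member>[^_.]+)\.nc$"
)


def parse_filename(name):
    """Split a file name into its components; 'end' is None if absent."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"file name does not follow the pattern: {name!r}")
    return match.groupdict()


def build_filename(institute, stream, realm, frequency, level,
                   production_id, start, end, variable, member):
    """Assemble a file name; pass end=None for a single date."""
    period = start if end is None else f"{start}-{end}"
    parts = [institute, stream, realm, frequency, level,
             production_id, period, variable, member]
    return "_".join(parts) + ".nc"


def build_directory(activity, institute, stream, realm, frequency,
                    production_id, year, month, day, variable, member,
                    root="ec:/copernicus/c3s"):
    """Assemble the directory path, including the trailing slash."""
    parts = [root, activity, institute, stream, realm, frequency,
             production_id, year, month, day, variable, member]
    return "/".join(parts) + "/"


if __name__ == "__main__":
    example = "egrr_enfh_atmos_day_sfc_201601A_19950417-19950418_ta_r3.nc"
    print(parse_filename(example))
```

Parsing and building are kept as separate functions so that a file name can be checked against the pattern without constructing it first.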
Additional Questions to be addressed
Question | Discussion | Decision |
---|---|---|
File format to be used? | Francisco Doblas-Reyes: NetCDF4? With or without compression? Kevin Marsh: netCDF4 classic model (with deflate = 6, as suggested by Pierre-Antoine) | |
File naming | Kevin Marsh: Pierre-Antoine Bretonniere proposed following the SPECS convention | |
forecast/hindcast matching and labelling | ||
File size recommendation (maximum size)? | Kevin Marsh: Pierre-Antoine Bretonniere suggested 4 GB as the recommended maximum size | Kevin Marsh: recommend 4 GB maximum size for data files |
Versioning of data files? | ||
DOI | Kevin Marsh: DOI likely to be assigned at dataset level | Kevin Marsh: DOI likely to be assigned at dataset level |
Variable short names to be specified? | Kevin Marsh: Antonio S. Cofino Gonzalez suggested following the CMIP5 short names | Kevin Marsh: follow CMIP5 short names |
Coordinate short names to be specified? | Kevin Marsh: Antonio S. Cofino Gonzalez suggested following the CMIP5 coordinate short names | Kevin Marsh: follow CMIP5 coordinate short names |
Extension to include ocean data for C3S? | Kevin Marsh: yes, but not in the initial convention release | Kevin Marsh: not considered in the initial release |
Grids, resolution etc. to be specified? | Kevin Marsh: Antonio S. Cofino Gonzalez agreed that a 1 degree grid is specified with valid max/min, but the actual grid points are not specified | Kevin Marsh: 1 degree grid specified with valid max/min, but actual grid points not specified |
MARS attributes to be specified? | Kevin Marsh: these will be added by C3S, rather than by the data provider | Kevin Marsh: these will be added by C3S |
Standard name request/assignment process? | Kevin Marsh: requested via the standard name mailing list. Note that this process can take considerable time. | Kevin Marsh: requested via the standard name mailing list |
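The recommended maximum file size from the table above could be checked before delivery with a sketch like the one below. The function name is hypothetical, and reading "4GB" as 4 GiB (4 * 1024**3 bytes) is an assumption; the table does not say which definition is intended.

```python
import os

# Assumption: "4GB" is read as 4 GiB; the table does not specify this.
RECOMMENDED_MAX_BYTES = 4 * 1024**3


def exceeds_recommended_size(path, limit_bytes=RECOMMENDED_MAX_BYTES):
    """Return True if the file at 'path' is larger than the limit."""
    return os.path.getsize(path) > limit_bytes
```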
Discussion about time coordinates
NOTE: The SPECS approach (two 1D time coordinates) has been chosen for the "providers" convention.
...
The encoding of multiple time coordinates requires particular consideration. An explicit example of the structure is given below.
Example of encoding data with multiple time axes
double forecast_reference_time(forecast_reference_time) ;
forecast_reference_time:bounds = "forecast_reference_time_bnds" ;
forecast_reference_time:units = "hours since 1970-01-01 00:00:00" ;
forecast_reference_time:standard_name = "forecast_reference_time" ;
forecast_reference_time:calendar = "gregorian" ;
...
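The values stored in forecast_reference_time follow from its units attribute, "hours since 1970-01-01 00:00:00". A minimal sketch of that conversion, using only the Python standard library, is given below. The function names are hypothetical, and the sketch assumes naive datetimes in UTC and dates for which Python's proleptic Gregorian calendar agrees with the "gregorian" calendar attribute (any date after 1582).

```python
from datetime import datetime, timedelta

# Reference epoch taken from the units attribute in the example above.
EPOCH = datetime(1970, 1, 1)


def hours_since_epoch(moment):
    """Encode a naive UTC datetime as hours since 1970-01-01 00:00:00."""
    return (moment - EPOCH) / timedelta(hours=1)


def decode_hours(hours):
    """Decode hours since 1970-01-01 00:00:00 back to a datetime."""
    return EPOCH + timedelta(hours=hours)


if __name__ == "__main__":
    # Start date 1995-04-01, as in the "S19950401" file-name example.
    start = datetime(1995, 4, 1)
    encoded = hours_since_epoch(start)
    print(encoded, decode_hours(encoded))
```

In practice a CF-aware library would normally handle this encoding; the sketch only shows what the stored numbers mean.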