Contributors: Hendrik Boogaard (WAGENINGEN ENVIRONMENTAL RESEARCH), Allard de Wit (WAGENINGEN ENVIRONMENTAL RESEARCH), Jenny Lazebnik (WAGENINGEN ENVIRONMENTAL RESEARCH), Jonathan Schubert (METEOGROUP), Gerald van der Grijn (METEOGROUP)
History of Modifications
Acronyms
Scope of the document
This document provides an overview of the AgERA5 product, the underlying data sets, and the underlying algorithms and workflow. The AgERA5 dataset provides daily, agronomically relevant meteorological data for the period 1979 to present at a spatial resolution of 0.1°.
Executive summary
The AgERA5 dataset provides daily surface meteorological data for the period 1979 to present at a spatial resolution of 0.1°. The service is based on the fifth generation of ECMWF atmospheric re-analyses of the global climate, better known as ERA5. AgERA5 'connects' users in the agricultural domain to the new ERA5 data set. It includes daily aggregates of agronomically relevant variables, tuned to local day definitions and adapted to the finer topography, finer land use pattern and finer land-sea delineation of the ECMWF HRES operational model. The variables cover temperature, precipitation, snow depth, humidity, cloud cover and radiation.
Product description
The following text applies to AgERA5 version 1.0.
Introduction
Climate forcing data are used in analysis and agro-environmental modelling to study aspects of productivity and externalities of agriculture (e.g. Toreti et al., 2019; Glotter et al., 2016; De Wit et al., 2010). In this service we start from the hourly ECMWF ERA5 model data and convert them into meaningful input for these analyses and models. This involves processing a large amount of data. Acquisition and pre-processing of ERA5 data, both archive and near real-time (NRT) data, is a large and specialized job. It requires a heavy investment from users such as technical policymakers, information agencies, NGOs, commodity traders, agri-businesses and insurance providers. The complexity of the task and the effort required may even be a barrier to using the data at all.
This service is based on the original hourly deterministic ECMWF ERA5 data, at surface level and available at a spatial resolution of 30 km (~0.28125°). Data were aggregated to daily time steps and corrected towards a finer topography at a 0.1° spatial resolution. Aggregated data at daily time steps follow a local time zone definition and include a number of major agronomic parameters. The correction to the 0.1° grid was realized by applying grid and variable-specific regression equations to an ERA5 data set interpolated to the 0.1° grid. The equations were trained on operational ECMWF HRES model data at a 0.1° resolution. The final data set is referred to as AgERA5. AgERA5 will save potential users money and stimulate businesses to use such a high-quality data set. It also avoids a possible proliferation of different data sets originating from the basic hourly ERA5 data set.
Variable definitions
AgERA5 includes 22 agronomically relevant variables (see Table 3-1).
Table 3-1: List of variables in the AgERA5 data set
Short name | Long name | Unit | Aggregation | AGROVOC URI |
Cloud_Cover_Mean | Total cloud cover (00-00LT) | (0 - 1) | Mean | |
Dew_Point_Temperature_2m_Mean | 2 meter dewpoint temperature (00-00LT) | K | Mean | |
Preciptation_Flux | Total precipitation (00-00LT) | mm d-1 | Sum | |
Preciptation_Rain_Duration_Fraction | Precipitation type duration - rain (00-00LT) | - | Count | |
Preciptation_Solid_Duration_Fraction | Precipitation type duration - solid fraction (no hail) composed of: precipitation types freezing rain (3), snow (5), wet snow (6), mixture of rain and snow (7) and ice pellets (8) (00-00LT) | - | Count | |
Relative_Humidity_2m_06h | Relative humidity at 06LT | % | - | |
Relative_Humidity_2m_09h | Relative humidity at 09LT | % | - | |
Relative_Humidity_2m_12h | Relative humidity at 12LT | % | - | |
Relative_Humidity_2m_15h | Relative humidity at 15LT | % | - | |
Relative_Humidity_2m_18h | Relative humidity at 18LT | % | - | |
Snow_Thickness_LWE_Mean | Snow liquid water equivalent (00-00LT) | cm of liquid water equivalent | Mean | |
Snow_Thickness_Mean | Snow depth (00-00LT) | cm snow | Mean | |
Solar_Radiation_Flux | Surface solar radiation downwards (00-00LT) | J m-2 d-1 | Sum | |
Temperature_Air_2m_Max_24h | Maximum air temperature at 2 meter (00-00LT) | K | Maximum | |
Temperature_Air_2m_Max_Day_Time | Maximum air temperature at 2 meter (06-18LT) | K | Maximum | |
Temperature_Air_2m_Mean_24h | 2 meter air temperature (00-00LT) | K | Mean | |
Temperature_Air_2m_Mean_Day_Time | 2 meter air temperature (06-18LT) | K | Mean | |
Temperature_Air_2m_Mean_Night_Time | 2 meter air temperature (18-06LT) | K | Mean | |
Temperature_Air_2m_Min_24h | Minimum air temperature at 2 meter (00-00LT) | K | Minimum | |
Temperature_Air_2m_Min_Night_Time | Minimum air temperature at 2 meter (18-06LT) | K | Minimum | |
Vapour_Pressure_Mean | Vapour pressure (00-00LT) | hPa | Mean | |
Wind_Speed_10m_Mean | 10 meter wind component (00-00LT) | m s-1 | Mean | |
Input data used
The ERA5 data set is the main input data set (see https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5). ERA5 provides hourly estimates of a large number of atmospheric, land and oceanic climate variables. The data cover the earth on a 30 km grid and resolve the atmosphere using 137 levels from the surface up to a height of 80 km. ERA5 includes information about uncertainties for all variables at reduced spatial and temporal resolutions.
For the archive, the years 1979 to present were available during the project. Note that two versions of ERA5 are available through the CDS:
- interpolated to a 0.25° grid
- original ERA5 model level data (reanalysis-era5-complete)
The latter version was used in this project.
ERA5 has a wide list of variables. See the ERA5 data documentation (https://software.ecmwf.int/wiki/display/CKB/ERA5+data+documentation), especially the tables:
- 2: surface, instantaneous (averages)
- 3: surface, accumulations
- 4: surface, minimum/maximum
The following table shows the variables used for the AgERA5 product.
Table 3-2: Essential variables used for the AgERA5 product
Variable name | Unit | Short | Reference | Group |
Snow density | kg m-3 | rsn | table 2 | INST1 |
Snow depth | m of water | sd | table 2 | INST1 |
10 metre U wind component | m s-1 | u10 | table 2 | INST1 |
10 metre V wind component | m s-1 | v10 | table 2 | INST1 |
Total cloud cover | (0 - 1) | tcc | table 2 | INST1 |
2 metre temperature | K | 2t | table 2 | INST1 |
2 metre dewpoint temperature | K | 2d | table 2 | INST1 |
Surface solar radiation downwards | J m-2 | ssrd | table 3 | ACCMNMX |
Total precipitation | m | tp | table 3 | ACCMNMX |
Precipitation type | code table | ptype | table 2 | INST2 |
Maximum temperature at 2 metres since | K | mx2t | table 5 | ACCMNMX |
Minimum temperature at 2 metres since | K | mn2t | table 5 | ACCMNMX |
Data of the HRES model were needed as a training data set to derive the bias correction. HRES data is not part of the C3S catalogue and was accessed through the contract (C3S422Lot1WEnR).
Algorithms used
The workflow includes:
- 0) Retrieving original hourly data of ERA5 from the CDS
- 1) Nearest Neighbor interpolation to 0.1° grid (ECMWF HRES grid)
- 2) Temporal aggregation and calculation of additional variables
- 3) Apply location, variable and seasonal specific bias correction plus sea mask
The workflow is further described in Chapter 4 (workflow) and Chapter 5 (develop bias corrections).
Workflow
The AgERA5 workflow includes (see Figure 4-1):
- 0) Retrieving original hourly data of ERA5 from the CDS
- 1) Nearest Neighbor interpolation to 0.1° grid (ECMWF HRES grid)
- 2) Temporal aggregation and calculation of additional variables
- 3) Apply location, variable and seasonal specific bias correction plus sea mask
Step 0: Retrieving hourly data
The original ERA5 data are stored in the MARS archive and were retrieved via the CDS (version: reanalysis-era5-complete) and prepared for further processing (see also section 3.3). ERA5 is originally calculated in T639 spectral space and on an N320 Gaussian grid. This corresponds most closely to a 0.28125° grid, and therefore this grid definition was used in the download.
Step 1: NN interpolation to 0.1° grid
Downloaded data were interpolated to a 0.1° grid which is close to the current HRES resolution. To preserve variability and extremes in the original data the Nearest Neighbor (NN) technique was applied.
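The following is a minimal sketch of nearest neighbour regridding between two regular latitude-longitude grids, assuming NumPy is available. The function names (`nearest_index`, `nn_regrid`) and the toy grids are hypothetical and are not taken from the AgERA5 processing chain; longitude wrap-around at the dateline is ignored for brevity.

```python
import numpy as np

def nearest_index(src_coords, dst_coords):
    """For each target coordinate, return the index of the nearest source coordinate."""
    src = np.asarray(src_coords, dtype=float)
    dst = np.asarray(dst_coords, dtype=float)
    # Absolute distance for every (target, source) pair, then the closest source per target.
    return np.abs(dst[:, None] - src[None, :]).argmin(axis=1)

def nn_regrid(field, src_lat, src_lon, dst_lat, dst_lon):
    """Nearest neighbour regridding of a 2-D (lat, lon) field between regular grids."""
    rows = nearest_index(src_lat, dst_lat)
    cols = nearest_index(src_lon, dst_lon)
    return np.asarray(field)[np.ix_(rows, cols)]

# Toy example: a 2 x 2 field on a 0.28125 degree grid mapped onto a 0.1 degree grid.
src_lat = np.array([0.0, 0.28125])
src_lon = np.array([0.0, 0.28125])
coarse = np.array([[1.0, 2.0], [3.0, 4.0]])
dst_lat = np.array([0.0, 0.1, 0.2, 0.3])
dst_lon = np.array([0.0, 0.1, 0.2, 0.3])
fine = nn_regrid(coarse, src_lat, src_lon, dst_lat, dst_lon)
```

Because every target cell copies one source value, no new values are created, which is why this technique preserves the variability and extremes of the original field.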
Step 2: Temporal aggregation and additional variables
Next, hourly data were aggregated to daily values by applying variable- and longitude-specific aggregation schemes. These schemes compute agronomically relevant weather variables that honor local time (LT), e.g. maximum temperature over daytime and minimum temperature over nighttime. The data therefore comply with the local calendar day definitions and aggregation schemes used by NMIs. Examples of such aggregation schemes, used to aggregate 3-hourly ERA-Interim data, can be found via the following URL: http://marswiki.jrc.ec.europa.eu/agri4castwiki/index.php/Meteorological_data_from_ECMWF_models.
In contrast to the study at the above URL, the number of longitudinal aggregation zones was increased from three to eight. Each zone was assigned a longitude range for which a specific aggregation scheme was defined. See Appendix I for the zone definition and Appendix II for the aggregation schemes.
Figure 4-1: Overview of the different processing steps in the whole workflow
An example: the ERA5 archive includes the maximum temperature of the previous hour. The 24 values of maximum temperature can be used to:
- Derive the maximum temperature over day time by taking the maximum of the 12 hourly maximum temperature values occurring during the local day time (e.g. London between 06 and 18 UTC).
- Derive the maximum temperature over 24 hours by taking the maximum of the 24 hourly maximum temperature values occurring during the local day (e.g. London between 00 UTC day X and 00 UTC day X + 1).
Similar aggregation can be done for minimum temperature, but then taking the minimum over a range of hourly values. Most other elements were aggregated as the mean or sum over the 24 hours of the local day. To obtain the set of 24 hours for a certain zone, hourly ERA5 data are needed for day X, and possibly for day X - 1 or X + 1. Which days are needed depends on the zone (longitude range).
In case of precipitation type (rain, snow) the aggregation to a daily time step can be done type specific, thus counting the hours that the type appeared.
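The aggregation described above can be sketched as follows. This is an illustration under stated assumptions, not the scheme of Appendix II: the hourly series is assumed to hold 72 values covering UTC days X - 1, X and X + 1, element k is assumed to describe the hour starting k hours after 00 UTC on day X - 1, and the local time is assumed to be UTC plus a whole-hour offset. The rain code 1 is assumed from the ECMWF precipitation type code table; the solid codes are those listed in Table 3-1. All names are hypothetical.

```python
import numpy as np

SOLID_CODES = (3, 5, 6, 7, 8)  # freezing rain, snow, wet snow, rain/snow mixture, ice pellets
RAIN_CODE = 1                  # assumed ECMWF precipitation type code for rain

def local_window(utc_offset_hours, start_lt, end_lt):
    """Slice of the 72-hour series covering start_lt..end_lt local time on day X."""
    base = 24 - utc_offset_hours  # index of 00 LT on day X
    return slice(base + start_lt, base + end_lt)

def aggregate_day(hourly_tmax, hourly_ptype, utc_offset_hours):
    """Daily aggregates for local day X from hourly maxima and precipitation types."""
    tmax = np.asarray(hourly_tmax, dtype=float)
    ptype = np.asarray(hourly_ptype)
    day = local_window(utc_offset_hours, 0, 24)       # 00-00 LT
    daytime = local_window(utc_offset_hours, 6, 18)   # 06-18 LT
    return {
        "tmax_24h": float(tmax[day].max()),
        "tmax_daytime": float(tmax[daytime].max()),
        "rain_hours": int((ptype[day] == RAIN_CODE).sum()),
        "solid_hours": int(np.isin(ptype[day], SOLID_CODES).sum()),
    }

# Toy series: temperature rises by one unit per hour; 6 hours of rain, then 2 hours of snow.
tmax_series = np.arange(72.0)
ptype_series = np.zeros(72, dtype=int)
ptype_series[24:30] = RAIN_CODE
ptype_series[30:32] = 5
zone_utc = aggregate_day(tmax_series, ptype_series, utc_offset_hours=0)
zone_plus3 = aggregate_day(tmax_series, ptype_series, utc_offset_hours=3)
```

With a positive offset the local day starts earlier in UTC, so part of day X - 1 is used; with a negative offset part of day X + 1 is used.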
The applied aggregation zone definitions work very well with the local time zones of Western and Eastern Europe and mostly for the North American continent. For Asia there is a shift of 2-3 hours between the actual local time definition and the definition in our study. The only extreme mismatch of the local time definitions will happen eastward of the dateline in zone E4. Fortunately, the affected areas (Pacific islands and the very western coast of Alaska) are, from an agricultural perspective, not particularly significant.
The following conversions were done:
- unit conversion of precipitation (tp): m d-1 -> mm d-1
- unit conversion of snow (sd; liquid water equivalent): m -> cm
In addition, the following variables were calculated:
- 10 m wind speed (m s-1) from the 10 m u (10u) and 10 m v (10v) wind components: sqrt(10u^2 + 10v^2)
- snow depth (cm) from snow density (rsn) and snow depth in liquid water equivalent (sd): (sd / rsn) * 1000 * 100
- partial water vapour pressure (hPa) from 2 m dewpoint temperature (d2m, in °C; Priestley and Taylor, 1972): 10 * 0.6108 * exp((17.27 * d2m) / (d2m + 237.3))
- relative humidity (%) from 2 m temperature (t2m) and 2 m dewpoint temperature (d2m), both in °C: 100 * (exp((17.27 * d2m) / (237.3 + d2m)) / exp((17.27 * t2m) / (237.3 + t2m)))
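The derived variables listed above can be written as small functions. This is a sketch that follows the formulas as given; the function names are hypothetical, and temperatures are assumed to be converted from K to °C before the vapour pressure and humidity formulas are applied, because the constants 17.27 and 237.3 belong to the Celsius form of the equation.

```python
import math

def kelvin_to_celsius(temp_k):
    """Convert a temperature from K to degrees Celsius."""
    return temp_k - 273.15

def wind_speed(u10, v10):
    """10 m wind speed (m s-1) from the u and v wind components."""
    return math.hypot(u10, v10)

def snow_depth_cm(sd_m_water, rsn):
    """Snow depth (cm) from liquid water equivalent (m) and snow density (kg m-3)."""
    return sd_m_water / rsn * 1000.0 * 100.0

def vapour_pressure_hpa(d2m_c):
    """Partial water vapour pressure (hPa) from dewpoint temperature (deg C)."""
    return 10.0 * 0.6108 * math.exp(17.27 * d2m_c / (d2m_c + 237.3))

def relative_humidity_pct(t2m_c, d2m_c):
    """Relative humidity (%) from air and dewpoint temperature (deg C)."""
    e_actual = math.exp(17.27 * d2m_c / (237.3 + d2m_c))
    e_saturation = math.exp(17.27 * t2m_c / (237.3 + t2m_c))
    return 100.0 * e_actual / e_saturation

# Example with ERA5-style inputs in K.
t2m_c = kelvin_to_celsius(293.15)   # 20 deg C
d2m_c = kelvin_to_celsius(283.15)   # 10 deg C
rh = relative_humidity_pct(t2m_c, d2m_c)
vp = vapour_pressure_hpa(d2m_c)
```

Relative humidity reaches 100% when the dewpoint equals the air temperature and decreases as the gap between the two widens.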
The temporal aggregation and calculation of additional variables lead to the final list of variables presented in Table 3-1.
The variables in the dataset answer the needs of the most common crop models (working at a daily time step) and their regional implementations and, in addition, the needs of users inventoried at the first stage of the project.
Step 3: Bias correction of data at 0.1° grid
A location-, variable- and season-specific bias correction towards the HRES operational model was applied. In this way the finer topography, finer land use pattern and finer land-sea delineation of the HRES operational model are reflected in the downscaled ERA5 data. In effect, the ERA5 data set is tuned to the detailed topography of the HRES operational model, which also leads to more consistent time series between ERA5 and the HRES operational model.
For each grid cell and all variables, except precipitation and snow related variables, a linear equation is applied:
\[ Y_{i,j}^{ERA-5,corr} = \alpha_{i,j}Y_{i,j}^{ERA-5} + \beta_{i,j} + [T_{i,j}] \]
in which \( Y_{i,j}^{ERA-5} \) is the ERA5 NN-interpolated variable (e.g. temperature, wind) for grid box [i,j], \( Y_{i,j}^{ERA-5,corr} \) is the ERA5 NN-interpolated and bias corrected variable for grid box [i,j], and \( \alpha_{i,j} \) and \( \beta_{i,j} \) are correction coefficients (hereinafter referred to as slope and intercept, respectively).
The parameter \( T_{i,j} \) accounts for an additional seasonal correction and reads:
\[ T_{i,j} = \gamma_{1,i,j}T_{1} + \gamma_{2,i,j}T_{2} + \gamma_{3,i,j}T_{3} + \gamma_{4,i,j}T_{4} \]
in which \( T_{1} \) to \( T_{4} \) are sinusoidal time functions with a period of one year and \( \gamma_{1,i,j} \) to \( \gamma_{4,i,j} \) are the respective coefficients.
The correction towards the HRES operational model is very relevant for users that do near real time monitoring of growing conditions and agricultural production. Note that the final ERA5 product will become available with a time lag of one week, including the temporary ERA5 line. For monitoring systems like JRC's Monitoring Agricultural ResourceS (MARS) such a time lag is too large, and therefore data in such systems have to be completed with data from the HRES operational model. When combining data of two datasets originating from different resolutions, biases might be introduced that negatively affect the monitoring performance. This can be avoided by correcting ERA5 towards the HRES operational model. Similar reasoning applies to forecast products like the ENS forecasts (15/30 day ensemble forecasts). This product can also be downscaled and bias corrected towards the HRES operational model. In this way more or less consistent time series are obtained, linking reanalysis, HRES and ENS data all around a common 'HRES' reference.
Some remarks:
- To improve the timeliness of the foreseen service, the preliminary ERA5 product, ERA5t, also needs to be processed. We hereby assume that the bias correction algorithms, which are based on ERA5 data, can also be applied to ERA5t data.
- Specifically for users that need to link ERA5 to HRES for NRT monitoring purposes, the following issue is relevant. The merge with the HRES operational model would need an additional service relying on specific data contracts with ECMWF. In addition, the HRES operational model data must be processed in a similar way (daily aggregation, possibly elevation corrections etc.) as the ERA5 data.
- Note that the HRES model is constantly improving (improved model physics, increased spatial resolution etc.). Therefore, with each additional HRES model upgrade, the established statistical relationship between ERA5 and HRES will become less valid. Over time, this may lead to jumps in the time series as the bias correction is correcting for aspects that changed in the HRES model. In such a case, users that link ERA5 to HRES need to be warned and eventually the bias correction needs to be updated.
During the processing only the 'land' locations at the surface level (topographical elevation) were maintained using the HRES land-sea mask. This mask includes the area fraction of land within each 0.1° grid cell. As threshold, the fraction 0.05 has been selected: above it is land, below it is sea (see Figure 4-2).
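A minimal sketch of the land selection described above, assuming NumPy and a land fraction field on the same 0.1° grid as the data. The function name and the use of NaN for sea cells are assumptions made for illustration. The text does not state how a fraction of exactly 0.05 is treated; here it is counted as sea.

```python
import numpy as np

LAND_FRACTION_THRESHOLD = 0.05  # threshold stated in the text

def mask_sea(field, land_fraction, threshold=LAND_FRACTION_THRESHOLD):
    """Keep values on land cells and set sea cells to NaN (NaN as fill is an assumption)."""
    values = np.asarray(field, dtype=float)
    is_land = np.asarray(land_fraction, dtype=float) > threshold
    return np.where(is_land, values, np.nan)

# Toy 2 x 2 example: two sea cells (fractions 0.0 and 0.04) and two land cells.
temperature = np.array([[280.0, 281.0], [282.0, 283.0]])
land_fraction = np.array([[0.0, 0.04], [0.06, 1.0]])
masked = mask_sea(temperature, land_fraction)
```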
Figure 4-2: Selection of land 0.1° grid cells: the area fraction of land within a 0.1° grid cell (top) and the selection of land grid cells after applying the threshold of 0.05 area fraction (bottom)
Develop bias corrections
Step 3, as described in the previous chapter, covers the bias correction towards the HRES grid (0.1° grid). The grid and variable-specific regression equations are trained on operational ECMWF HRES model data.
The approach, to develop the equations, consists of the following main steps:
- Interpolate the data towards a 0.1° grid (see step 1)
- Aggregate hourly model data to daily variables (see step 2)
- Train statistical correction equations for each variable and grid point
- Apply the trained equations to the ERA5 data set
The development of the equations (using the HRES operational model as training set) is a one-off action and has been documented in a separate document named “C3S422Lot1.WEnR.DS2_Downscaling and bias correction v1.7.pdf”. This section provides a summary of this work.
The input data:
- ECMWF ERA5 reanalysis (grid1: 0.28125° x 0.28125°)
- ECMWF HRES (grid2: 0.10° x 0.10°)
Both data sets are covering the globe, including land and sea grid boxes.
Originally, ERA5 data are available as hourly fields, while HRES has a temporal resolution of 3 hours. For both models, a set of 12 base parameters (see Table 3-2) was retrieved from the ECMWF MARS archive covering a period of two years. These base parameters with 1-hourly/3-hourly resolution were then aggregated to 22 (derived) daily parameters over 8 different longitudinal bands (see section 4.3; note that the schemes given in Appendix II only apply to ERA5, the schemes for HRES data are available on request). Note that the ERA5 data were first interpolated towards the 0.1° grid using the NN technique (see section 4.2) before applying the aggregation to days.
To train the regression equations, a data set of 2-3 years is desired. Both ERA5 and HRES need to be available for this period. Based on the recent HRES model upgrades outlined in the separate report, the period between 2016-04-01 and 2018-03-31 was chosen as the training period for the final bias correction equations. Most importantly, this period does not include any horizontal grid or resolution changes. Also, data of both models were available through ECMWF's MARS archive at the moment the bias correction analysis took place. Therefore, the generated equations correct ERA5 data towards a mixture of HRES model cycles (41r2, 43r1 and 43r3).
The equations were derived by means of multiple linear regression.
Not all daily aggregated elements (see Table 3-1) are suited to correction by this method. For instance, for most parts of the world the snow parameters have too few snow cases to build a robust correction statistic. Similar issues are expected with the precipitation parameters (sum and type) in arid regions.
The MOS (Model Output Statistics) routine was used to carry out a multiple linear regression between the ECMWF HRES data and the NN-interpolated ERA5 data for each grid cell. The outcome is a linear equation (in this case demonstrated for the ERA5 data set):
\[ Y_{i,j}^{ERA-5,corr} = \alpha_{i,j}Y_{i,j}^{ERA-5} + \beta_{i,j} + [T_{i,j}] \]
in which \( Y_{i,j}^{ERA-5} \) is the ERA5 NN-interpolated variable (e.g. temperature, wind) for grid box [i,j], \( Y_{i,j}^{ERA-5,corr} \) is the ERA5 NN-interpolated and bias corrected variable for grid box [i,j], and \( \alpha_{i,j} \) and \( \beta_{i,j} \) are correction coefficients (hereinafter referred to as slope and intercept, respectively).
The parameter \( T_{i,j} \) accounts for an additional seasonal correction and reads:
\[ T_{i,j} = \gamma_{1,i,j}T_{1} + \gamma_{2,i,j}T_{2} + \gamma_{3,i,j}T_{3} + \gamma_{4,i,j}T_{4} \]
in which \( T_{1} \) to \( T_{4} \) are sinusoidal time functions with a period of one year, and \( \gamma_{1,i,j} \) to \( \gamma_{4,i,j} \) are the respective coefficients. The sinusoidal time functions that were used read:
\[ T_{1} = 100\sin \left(2\pi \frac{day-21}{365} \right) \]
\[ T_{2} = 100\sin \left(2\pi \frac{day-81}{365} \right) \]
With the combination of the above sine functions and coefficients, any grid-specific time correction function can be constructed. To achieve this, it is enough to use only the 2 best sinusoidal time functions of the 4 available for each grid point in the final equation.
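To illustrate the form of the correction equation, the sketch below fits and applies it for a single grid cell. It is not the MOS routine used in the project: it assumes ordinary least squares via NumPy, uses only the two sinusoidal functions written out above (without any selection of the "2 best of 4"), and runs on synthetic data generated for the example. All function and variable names are hypothetical.

```python
import numpy as np

def seasonal_terms(day_of_year):
    """Sinusoidal time functions T1 and T2 as written out in the text."""
    day = np.asarray(day_of_year, dtype=float)
    t1 = 100.0 * np.sin(2.0 * np.pi * (day - 21.0) / 365.0)
    t2 = 100.0 * np.sin(2.0 * np.pi * (day - 81.0) / 365.0)
    return t1, t2

def design_matrix(era5, day_of_year):
    """Columns: ERA5 value (slope), constant (intercept), T1 and T2 (seasonal terms)."""
    values = np.asarray(era5, dtype=float)
    t1, t2 = seasonal_terms(day_of_year)
    return np.column_stack([values, np.ones_like(values), t1, t2])

def fit_cell(era5, hres, day_of_year):
    """Least-squares estimate of (alpha, beta, gamma1, gamma2) for one grid cell."""
    target = np.asarray(hres, dtype=float)
    coeffs, *_ = np.linalg.lstsq(design_matrix(era5, day_of_year), target, rcond=None)
    return coeffs

def apply_cell(era5, day_of_year, coeffs):
    """Bias corrected ERA5 values for one grid cell."""
    return design_matrix(era5, day_of_year) @ coeffs

# Synthetic two-year training set for one cell (values invented for the example).
rng = np.random.default_rng(0)
days = np.tile(np.arange(1, 366), 2)
era5 = 280.0 + 10.0 * np.sin(2.0 * np.pi * days / 365.0) + rng.normal(0.0, 2.0, days.size)
t1, t2 = seasonal_terms(days)
true_coeffs = np.array([0.9, 25.0, 0.01, -0.02])
hres = true_coeffs[0] * era5 + true_coeffs[1] + true_coeffs[2] * t1 + true_coeffs[3] * t2

coeffs = fit_cell(era5, hres, days)
corrected = apply_cell(era5, days, coeffs)
```

In this noise-free toy case the fit recovers the coefficients used to generate the data; with real model output the residual error is what the MAE, RMSE and R-squared metrics quantify.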
The objects created by the bias correction application are twofold. The trained regression equation of a particular parameter was written to a NetCDF file, having the slope, the intercept and each of the seasonal cycle coefficients stored as a normal NetCDF parameter. The evaluation metrics were handled similarly. For analysis purposes the MAE, RMSE and R-squared were calculated and stored in a second NetCDF file.
A detailed analysis of the significance of the bias correction can be found in document “C3S422Lot1.WEnR.DS2_Downscaling and bias correction v1.7.pdf”.
Table 5-1 summarizes how the ERA5 improves (in terms of MAE for the main elements) after applying the bias correction.
Table 5-1: MAE (HRES-ERA5corrected) and MAE improvement of different bias corrected variables. The MAE improvements indicate the added value through the bias correction. All metrics were calculated for different regions and for subsets of grid points meeting certain conditions. E.g. “Land & above 800m” only uses grid points being located on land and above 800m. “Coasts & Lakes” subsets all grid points with a land fraction between 10% and 90%.
Variable | Region | Land: MAE | Land: MAE Impr | Land & below 800m: MAE | Land & below 800m: MAE Impr | Land & above 800m: MAE | Land & above 800m: MAE Impr | Coasts & Lakes: MAE | Coasts & Lakes: MAE Impr |
2t_davg [K] | Africa | 0.44 | 40% | 0.42 | 36% | 0.47 | 48% | 0.36 | 50% |
2t_davg | Asia | 0.72 | 36% | 0.67 | 27% | 0.86 | 48% | 0.66 | 32% |
2t_davg | Australia | 0.43 | 42% | 0.43 | 35% | 0.37 | 83% | 0.30 | 49% |
2t_davg | Europe | 0.51 | 36% | 0.47 | 30% | 0.75 | 55% | 0.45 | 38% |
2t_davg | N-America | 0.71 | 31% | 0.67 | 25% | 0.85 | 41% | 0.68 | 28% |
2t_davg | S-America | 0.45 | 50% | 0.42 | 41% | 0.61 | 65% | 0.38 | 48% |
2d_davg [K] | Africa | 0.76 | 38% | 0.77 | 38% | 0.76 | 39% | 0.55 | 46% |
2d_davg | Asia | 0.90 | 29% | 0.81 | 25% | 1.09 | 35% | 0.73 | 28% |
2d_davg | Australia | 0.57 | 34% | 0.57 | 28% | 0.43 | 78% | 0.36 | 43% |
2d_davg | Europe | 0.58 | 28% | 0.55 | 22% | 0.81 | 46% | 0.54 | 27% |
2d_davg | N-America | 0.80 | 23% | 0.73 | 18% | 0.97 | 32% | 0.70 | 21% |
2d_davg | S-America | 0.54 | 42% | 0.44 | 37% | 0.99 | 50% | 0.41 | 40% |
ff_davg [m/s] | Africa | 0.27 | 25% | 0.26 | 22% | 0.28 | 32% | 0.33 | 47% |
ff_davg | Asia | 0.29 | 28% | 0.27 | 24% | 0.34 | 35% | 0.36 | 35% |
ff_davg | Australia | 0.24 | 31% | 0.25 | 30% | 0.22 | 41% | 0.31 | 53% |
ff_davg | Europe | 0.25 | 31% | 0.24 | 31% | 0.32 | 33% | 0.33 | 48% |
ff_davg | N-America | 0.29 | 28% | 0.28 | 26% | 0.33 | 31% | 0.33 | 34% |
ff_davg | S-America | 0.23 | 30% | 0.22 | 26% | 0.27 | 42% | 0.32 | 51% |
tcc_davg [0-1] | Africa | 0.08 | 3% | 0.08 | 2% | 0.08 | 4% | 0.08 | 5% |
tcc_davg | Asia | 0.07 | 0% | 0.07 | -2% | 0.08 | 4% | 0.08 | -2% |
tcc_davg | Australia | 0.06 | -1% | 0.06 | -1% | 0.06 | 5% | 0.07 | 2% |
tcc_davg | Europe | 0.07 | -1% | 0.07 | -1% | 0.07 | 2% | 0.07 | -1% |
tcc_davg | N-America | 0.08 | 0% | 0.08 | -1% | 0.07 | 2% | 0.08 | -1% |
tcc_davg | S-America | 0.07 | 4% | 0.07 | 3% | 0.07 | 8% | 0.07 | 5% |
ssrd_dsumdiff [J/m2d] | Africa | 1055575 | 7% | 1030480 | 7% | 1118699 | 8% | 1151300 | 13% |
ssrd_dsumdiff | Asia | 872717 | 4% | 836249 | 3% | 958997 | 7% | 899084 | 5% |
ssrd_dsumdiff | Australia | 1205911 | 6% | 1177253 | 6% | 1772895 | 14% | 1497494 | 12% |
ssrd_dsumdiff | Europe | 832226 | 2% | 815116 | 2% | 951428 | 5% | 782759 | 4% |
ssrd_dsumdiff | N-America | 899054 | 4% | 902781 | 3% | 888809 | 6% | 916596 | 4% |
ssrd_dsumdiff | S-America | 1427243 | 9% | 1448626 | 9% | 1328043 | 13% | 1316248 | 11% |
The MAE indicates the error of the corrected data (HRES-ERA5corrected), while the MAE improvement compares the error of the corrected data with that of the uncorrected ERA5 data. All metrics were aggregated for different regions and certain subsets of grid points. Overall, the temperature, humidity and wind speed variables benefit most from the correction. The MAE is reduced by 30% to 60% in the majority of cases. Grid points located in mountainous areas or along coasts and lakes are improved most. This is not surprising, as these are the areas where the largest systematic differences between ERA5 and HRES can be expected. Not only are the relative improvements quite large, the absolute MAE values after the correction are also small. For example, the MAE for the 24h mean of the 2m temperature (2t_davg) over land is at most 0.72 K for all continents, and at most 0.51 K for 4 of the 6 continents.
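The two metrics can be sketched as follows. The exact definition of the MAE improvement is not given in this document; the sketch assumes it is the relative reduction of the MAE against HRES, expressed as a percentage. Function names and sample values are hypothetical.

```python
import numpy as np

def mae(reference, estimate):
    """Mean absolute error between a reference series and an estimate."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    return float(np.mean(np.abs(ref - est)))

def mae_improvement_pct(hres, era5_raw, era5_corrected):
    """Relative MAE reduction (%) of corrected versus uncorrected ERA5 (assumed definition)."""
    mae_raw = mae(hres, era5_raw)
    mae_corrected = mae(hres, era5_corrected)
    return 100.0 * (mae_raw - mae_corrected) / mae_raw

# Invented sample values for one grid cell (K).
hres = [280.0, 281.0, 282.0, 283.0]
era5_raw = [281.0, 282.0, 283.0, 284.0]        # MAE of 1.0 K against HRES
era5_corrected = [280.5, 280.5, 282.5, 282.5]  # MAE of 0.5 K against HRES
improvement = mae_improvement_pct(hres, era5_raw, era5_corrected)
```

Under this definition a negative value, as seen for some cloud cover entries in Table 5-1, means the corrected data are slightly further from HRES than the uncorrected data.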
For the solar radiation flux (ssrd_dsumdiff) the MAE improvement is solid and ranges between 2% and 14%, depending on the region and subset. The results of element "24h mean cloud cover" (tcc_davg) are mixed. For most grid points the correction doesn't add any value. The MAE improvement of the majority of all grid points (land and below 800m) is between -2% and +4%, and therefore near zero. Only for grid points above 800m we can observe a small but clear improvement (2% - 8%).
The following conclusions were drawn from the evaluation study:
- The selected bias correction method has its largest benefits in mountainous areas, at coast lines and at lakes.
- Seasonal correction on top of the simple bias correction further improves the accuracy of the derived correction equations.
- The approach works remarkably well for 3 out of the 4 groups of variables: temperature parameters, humidity parameters and wind speed. For these, the averaged relative reduction of MAE is between 30% and 60%.
- The correction models for solar radiation flux reach a MAE improvement of 2% to 14%.
- For cloud cover the correction has only a minor effect for most of the grid points. However, mountainous regions still benefit from the correction with a MAE improvement of 2%-8%.
Appendix I Longitudinal aggregation zones
Longitudinal aggregation zones are defined around central longitudes. The first zone is centered on zero longitude (London). This zone stretches from 22.5° west to 22.5° east. The next zone is centered on 45° east and stretches from 22.5° east to 67.5° east, and so on. This definition works very well with the local time zone configuration of Western and Eastern Europe and mostly with the American continent. For Asia there will be a shift of 2-3 hours between the real local time definition and our definition. The only extreme mismatch of the local time definitions will happen eastward of the dateline in zone E4. Fortunately, the affected areas (islands in the Pacific and the very western coast of Alaska) are, from an agricultural perspective, of limited interest.
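The zone definition above can be expressed as a small lookup. This is a sketch under assumptions: eight zones of 45° each, a zone offset of 3 hours per zone relative to UTC (45° of longitude corresponds to 3 hours of solar time), and zone boundaries assigned to the zone on their eastern side. The function names are hypothetical, and the zone centered on 180° is taken to be the zone that the text calls E4.

```python
import math

ZONE_WIDTH_DEG = 45.0
HOURS_PER_ZONE = 3  # assumed offset step between neighbouring zones

def zone_index(lon_deg):
    """Zone number: 0 is centered on 0 deg, 1 on 45 deg E, -1 on 45 deg W, 4 on 180 deg."""
    lon = ((lon_deg + 180.0) % 360.0) - 180.0  # normalise to [-180, 180)
    idx = math.floor((lon + ZONE_WIDTH_DEG / 2.0) / ZONE_WIDTH_DEG)
    # Longitudes between 180 deg and 157.5 deg W belong to the zone centered on 180 deg.
    return 4 if idx == -4 else idx

def zone_utc_offset_hours(lon_deg):
    """Assumed local time offset (hours relative to UTC) for the zone of a longitude."""
    return HOURS_PER_ZONE * zone_index(lon_deg)

london = zone_utc_offset_hours(0.0)          # zone centered on 0 deg
moscow = zone_utc_offset_hours(37.6)         # zone centered on 45 deg E
western_alaska = zone_utc_offset_hours(-165.0)  # east of the dateline, zone centered on 180 deg
```

The last example shows the mismatch mentioned in the text: a location just east of the dateline is aggregated with a +12 hour offset although its solar time is close to -11 hours.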
Appendix II Aggregation schemes
Some remarks:
- An "hour box" in the top row always represents the hour on the left border of the box.
- Variables 2t, 2d, ff, tcc, sd, rsn, vp, ptype and rh are all instantaneous values. To align with HRES (only available with a 3-hour time step), the period 03-00 was selected: 8 values are aggregated (03, 06, 09, 12, 15, 18, 21, 00).
- Variables mn2t, mx2t, ssrd and tp summarize the condition over 1 hour (sum, min, max, type).
References
Toreti, A. Maiorano, G. De Sanctis, H. Webber, A.C. Ruane, D. Fumagalli, A. Ceglar, S. Niemeyer, Zampieri, 2019. Using reanalysis in crop monitoring and forecasting systems. Agricultural Systems, Volume 168, pp. 144-15.
Glotter, M.J., Ruane, A.C., Moyer, E.J., Elliott, J.W., 2016. Evaluating the sensitivity of agricultural model performance to different climate inputs. Appl. Meteorol. Climatol., 55, pp. 579-594.
Wit, A.J.W. de, Baruth, B., Boogaard, H., Diepen, K. van, Kraalingen, D.W.G. van, Micale, F., Roller, J.A. te, Supit, I., Wijngaart, R. van der, 2010. Using ERA-INTERIM for regional crop yield forecasting in Europe. Climate Research 44 (2010)1. - ISSN 0936-577X - p. 41 - 53.
https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5
https://software.ecmwf.int/wiki/display/CKB/ERA5+data+documentation
http://marswiki.jrc.ec.europa.eu/agri4castwiki/index.php/Meteorological_data_from_ECMWF_models
https://confluence.ecmwf.int/display/CKB/ERA5%3A+What+is+the+spatial+reference