Background on EFAS medium-range forecasting
The quality of EFAS medium-range forecasts (out to a lead time of 10 days) has been evaluated to give users additional information to aid decision-making. The evaluation method is described in Harrigan et al. (2020) and summarised below. Results are provided on the following page: EFAS v4.0 medium range forecast skill. They are also summarised as a new headline score, shown as a "medium-range forecast skill" layer on the EFAS web map viewer: EFAS medium-range forecast skill product.
Headline medium-range forecast skill score
The headline medium-range ensemble forecast skill score is the maximum lead time (in days), up to 10 days ahead, at which the Continuous Ranked Probability Skill Score (CRPSS) is greater than 0.5 when compared with a simple persistence benchmark forecast, using the EFAS historical Forced simulation (sfo) as proxy observations. Forecast skill is calculated using river discharge reforecasts for a set of past dates, based on a configuration as close as possible to the operational setting. ECMWF-ENS medium and extended range reforecasts are used; they are run twice per week for the past 20 years with 11 ensemble members.
Scores are shown on the EFAS map viewer in the 'Medium-range forecast skill' layer under the 'Evaluation' menu (an example of the layer is shown here: EFAS medium-range forecast skill product). For each of the n=2651 fixed reporting points, the maximum lead time at which the CRPSS is greater than 0.5 is given; stations with darker purple circles have skill above this threshold at longer lead times. The category "0-1", marked as light pink circles, represents stations where the CRPSS is not above the 0.5 threshold at any lead time, or is above it for less than one day. Note: this does not mean that a station has no skill. A forecast has no skill relative to the persistence benchmark forecast only when CRPSS ≤ 0.
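As an illustration of how such a headline value can be derived, the sketch below takes one CRPSS value per lead-time day and returns the largest lead time whose CRPSS exceeds 0.5. It is a minimal sketch under stated assumptions, not the EFAS implementation: the function name `headline_lead_time`, the use of daily CRPSS values, the choice to return 0 when no lead time qualifies, and the example numbers are all hypothetical.

```python
import numpy as np

def headline_lead_time(crpss_by_day, threshold=0.5):
    """Largest lead time (days) with CRPSS > threshold, or 0 if there is none.

    Assumption: crpss_by_day[0] is the CRPSS at a lead time of 1 day,
    crpss_by_day[1] at 2 days, and so on (up to 10 days).
    """
    crpss = np.asarray(crpss_by_day, dtype=float)
    lead_days = np.arange(1, crpss.size + 1)
    skilful = crpss > threshold
    if not skilful.any():
        return 0  # assumed to correspond to the "0-1" category
    return int(lead_days[skilful].max())

# Hypothetical CRPSS values for lead times of 1 to 10 days at one station.
crpss_example = [0.92, 0.85, 0.74, 0.63, 0.55, 0.48, 0.41, 0.33, 0.27, 0.20]
print(headline_lead_time(crpss_example))  # 5
```

In this made-up example the CRPSS stays positive at all lead times, so the forecast is still more skilful than persistence at day 10 even though the headline score is 5 days.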
Method
Reforecasts
ECMWF-ENS medium and extended range reforecasts are generated every Monday and Thursday, for the same date in each of the past 20 years, with 11 ensemble members out to a lead time of 46 days. For the EFAS medium-range forecast skill evaluation, ECMWF-ENS reforecasts run over a year-long reference period (for example, January to December 2019) are used. These are then forced through the EFAS hydrological modelling chain to produce 20 years of river discharge reforecasts, twice weekly, for 11 ensemble members. A schematic of the ECMWF-ENS reforecast configuration is given in Figure 1 for the reference period January to December 2019. In total, there are 2080 start dates in an EFAS medium-range reforecast set (52 weeks × 2 per week × 20 years). EFAS medium-range river discharge reforecasts are run for each river cell with an upstream area > 350 km² at a 6-hourly time step out to a lead time of 46 days, with 11 ensemble members each.
Figure 1: ECMWF-ENS reforecast configuration schematic for the reference period January to December 2019.
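The count of start dates can be checked by enumeration. The sketch below lists every Monday and Thursday in the 2019 reference period and maps each one to the same calendar date in the preceding 20 years. The function name `reforecast_start_dates` is hypothetical, and the assumption that "the past 20 years" means 1999 to 2018 for a 2019 reference period is inferred from the description above rather than taken from the EFAS code.

```python
from datetime import date, timedelta

def reforecast_start_dates(reference_year=2019, n_years=20):
    """Reforecast start dates: each Monday and Thursday of the reference
    year, repeated for the same calendar date in the previous n_years.

    Assumption: the reference year is not a leap year, so no start date
    can fall on 29 February (true for 2019).
    """
    starts = []
    day = date(reference_year, 1, 1)
    while day.year == reference_year:
        if day.weekday() in (0, 3):  # Monday is 0, Thursday is 3
            for back in range(1, n_years + 1):
                starts.append(day.replace(year=reference_year - back))
        day += timedelta(days=1)
    return starts

starts = reforecast_start_dates()
print(len(starts))               # 2080
print(min(starts), max(starts))  # 1999-01-03 2018-12-30

# Number of 6-hourly time steps in one 46-day reforecast.
steps_per_reforecast = 46 * 24 // 6
print(steps_per_reforecast)      # 184
```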
Benchmark forecast
A widely used benchmark forecast for short to medium-range forecast skill evaluation is a hydrological persistence forecast (Alfieri et al., 2014; Pappenberger et al., 2015). Here, the 6-hourly river discharge value of the EFAS historical Forced simulation (sfo) from the time step immediately before reforecast initialisation is used for all lead times out to 10 days ahead. For example, for the reforecast initialised at 00 UTC on 3 January 1999, the mean 6-hourly river discharge value from 18 UTC on 2 January 1999 to 00 UTC on 3 January 1999 is extracted from the EFAS historical Forced simulation (sfo) and persisted for all lead times.
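The persistence benchmark can be sketched in a few lines. The example below assumes that the sfo series is available as a one-dimensional array of 6-hourly mean discharge values and that the index of the first forecast window is known; the function name `persistence_benchmark`, the indexing convention and the discharge values are hypothetical and only illustrate the idea.

```python
import numpy as np

def persistence_benchmark(sfo_6h, first_forecast_window, n_steps=40):
    """Single-valued persistence forecast.

    Assumptions: sfo_6h[k] is the mean discharge (m3/s) of the k-th 6-hour
    window, and first_forecast_window is the index of the window that starts
    at the reforecast initialisation time. The value of the window just
    before it is repeated for n_steps (40 steps of 6 h = 10 days).
    """
    if first_forecast_window < 1:
        raise ValueError("need at least one sfo window before initialisation")
    last_value = float(sfo_6h[first_forecast_window - 1])
    return np.full(n_steps, last_value)

# Hypothetical sfo values; the forecast starts at window index 3,
# so the value of window 2 (115.0) is persisted.
sfo_example = np.array([120.0, 118.0, 115.0, 130.0, 142.0])
benchmark = persistence_benchmark(sfo_example, first_forecast_window=3)
print(benchmark[:4])  # [115. 115. 115. 115.]
```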
Proxy observations
Forecast skill is evaluated against the EFAS historical Forced simulation (sfo), as a proxy for river discharge observations, for n=2651 fixed reporting point stations across the EFAS domain. Using sfo instead of in situ river discharge observations has two advantages: forecast skill can be determined independently of the hydrological model error, and sfo has complete spatial and temporal coverage, so forecast skill can be determined across the full EFAS domain. Users must be aware that the key assumption of the proxy observation approach is that the performance of the EFAS hydrological model, on which sfo is based, is reasonably good for the station of interest. If the hydrological model performance is poor, particular care must be taken when interpreting forecast skill scores.
Skill score
The ensemble forecast performance is evaluated using the Continuous Ranked Probability Score (CRPS) (Hersbach, 2000), one of the most widely used headline scores for probabilistic forecasts. The CRPS compares the continuous cumulative distribution of an ensemble forecast with the distribution of the observations. It has an optimum value of 0 and measures the error in the same units as the variable of interest (here river discharge in m³ s⁻¹). It collapses to the mean absolute error for deterministic forecasts (as is the case here for the single-valued persistence benchmark forecast). To calculate forecast skill, the CRPS is expressed as a skill score, the CRPSS, which measures the improvement over a benchmark forecast and is given by:
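To make the score concrete, the sketch below computes the CRPS of a small ensemble against a single observation using the common sample form (mean absolute error of the members minus half the mean absolute difference between member pairs). This is one standard estimator and is an assumption here; the text does not state which estimator the EFAS evaluation uses. The function name `crps_ensemble` and the discharge values are hypothetical.

```python
import numpy as np

def crps_ensemble(members, obs):
    """Sample CRPS of an ensemble for one observation (same units as input).

    Assumption: the ensemble is treated as an empirical distribution with
    equally weighted members.
    """
    x = np.asarray(members, dtype=float)
    mean_abs_error = np.mean(np.abs(x - obs))
    mean_pair_diff = np.mean(np.abs(x[:, None] - x[None, :]))
    return mean_abs_error - 0.5 * mean_pair_diff

# Three-member ensemble centred on the observation (values in m3/s).
print(crps_ensemble([8.0, 10.0, 12.0], 10.0))  # about 0.444 (4/9)

# A single-valued forecast has no spread, so the CRPS is the absolute error.
print(crps_ensemble([13.0], 10.0))  # 3.0
```

The second call shows the collapse to the absolute error for a deterministic forecast, which is why the same score can be applied to the single-valued persistence benchmark.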
\[ \mathrm{CRPSS} = 1 - \frac{\mathrm{CRPS}_{fc}}{\mathrm{CRPS}_{bench}} \]
where CRPS_fc is the CRPS of the EFAS forecast and CRPS_bench is the CRPS of the benchmark forecast. A CRPSS value of 1 indicates a perfect forecast, CRPSS > 0 shows forecasts more skilful than the benchmark, CRPSS = 0 shows forecasts are only as accurate as the benchmark, and CRPSS < 0 warns that forecasts are less skilful than the benchmark forecast. The headline EFAS medium-range forecast skill score uses a CRPSS threshold of 0.5 in the summary layer in the EFAS web map viewer; this can be interpreted as the EFAS forecast having 50% less error than the benchmark forecast.
The CRPSS is calculated for EFAS medium-range reforecasts against a single-valued persistence benchmark forecast and verified against EFAS sfo river discharge simulations as proxy observations. CRPSS headline scores are then mapped on the EFAS map viewer, and CRPSS and CRPS time-series plots are produced for each fixed reporting point station.
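The interpretation of the CRPSS values can be checked with a few numbers. The sketch below is a direct transcription of the formula; the function name `crpss` and the CRPS values are hypothetical, and the guard against a zero benchmark CRPS is an added assumption, since the text does not say how that case is handled.

```python
def crpss(crps_forecast, crps_benchmark):
    """Skill score 1 - CRPS_fc / CRPS_bench."""
    if crps_benchmark == 0:
        # Assumed handling: the ratio is undefined for a perfect benchmark.
        raise ValueError("benchmark CRPS is zero; CRPSS is undefined")
    return 1.0 - crps_forecast / crps_benchmark

# Hypothetical mean CRPS values in m3/s, with a benchmark CRPS of 10.
print(crpss(0.0, 10.0))   # 1.0  perfect forecast
print(crpss(5.0, 10.0))   # 0.5  half the benchmark error (headline threshold)
print(crpss(10.0, 10.0))  # 0.0  only as accurate as the benchmark
print(crpss(15.0, 10.0))  # -0.5 less skilful than the benchmark
```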
References
Alfieri, L., Pappenberger, F., Wetterhall, F., Haiden, T., Richardson, D., Salamon, P., 2014. Evaluation of ensemble streamflow predictions in Europe. Journal of Hydrology 517, 913–922.
Hersbach, H., 2000. Decomposition of the Continuous Ranked Probability Score for Ensemble Prediction Systems. Weather and Forecasting 15, 559–570.
Pappenberger, F., Ramos, M.H., Cloke, H.L., Wetterhall, F., Alfieri, L., Bogner, K., Mueller, A., Salamon, P., 2015. How do I know if my forecasts are better? Using benchmarks in hydrological ensemble prediction. Journal of Hydrology 522, 697–713.