Clustering - ENS medium-range

Weather scenarios

Clustering of ensemble members emphasises large-scale developments, so the 500hPa and 1000hPa geopotential forecast fields are used for daily weather scenarios.  The area considered covers Europe and its immediate surroundings, including the northeast Atlantic.

Fig8.1.3.1-1: Area considered for clustering purposes.

The norm used to compare the solutions of the ensemble members is the root mean square (RMS) difference between their forecast fields within this area.

Clustering is performed over four predefined lead-time windows:

Clustering in this way, rather than on individual forecast days, has the advantage that temporal continuity and synoptic consistency are more likely to be retained.  The clustering is flow-dependent and is not based on pre-defined regimes.  Since all members are regarded as equally likely, the number of members in each cluster can be taken as a measure of its probability.
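As an illustration of the last point, the short Python sketch below converts cluster membership counts into probabilities.  The function name `cluster_probabilities` and the 10-member assignment list are hypothetical and chosen only for this example; the operational ensemble is larger.

```python
from collections import Counter


def cluster_probabilities(assignments):
    """Return {cluster label: fraction of members}, treating all members as equally likely."""
    counts = Counter(assignments)
    total = len(assignments)
    return {label: n / total for label, n in counts.items()}


# Hypothetical cluster labels for a 10-member ensemble (illustration only).
assignments = [1, 1, 2, 1, 3, 2, 1, 1, 2, 1]
probabilities = cluster_probabilities(assignments)
# Cluster 1 holds 6 of 10 members, cluster 2 holds 3 and cluster 3 holds 1.
```

With these made-up labels the clusters receive probabilities of 0.6, 0.3 and 0.1 respectively.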

To ensure synoptic consistency, each individual ensemble member must belong to the same cluster throughout the lead-time window.  For two members to be assigned to the same cluster, they must display similar synoptic 500hPa development over the whole time window.  However, because of the RMS norm, weak gradients in 500hPa forecasts can lead to synoptically rather different members being assigned to the same cluster.  The clustering scheme is designed to create no more than six clusters.
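The weak-gradient caveat can be seen in a small numerical sketch.  The Python below is not the operational scheme; it simply assumes that the distance between two members is the RMS difference taken over every grid point and every time step of the window.  The function name `window_rms_distance` and all field values are hypothetical.

```python
import math


def window_rms_distance(member_a, member_b):
    """RMS difference between two members over all time steps and grid points.

    Each member is a list of time steps; each time step is a flat list of
    500hPa values at the grid points of the clustering area.
    """
    squared_sum = 0.0
    count = 0
    for field_a, field_b in zip(member_a, member_b):
        for a, b in zip(field_a, field_b):
            squared_sum += (a - b) ** 2
            count += 1
    return math.sqrt(squared_sum / count)


# Hypothetical anomalies (arbitrary units): two time steps, four grid points.
# Strong-gradient pair: the same pattern, offset by a constant amount.
strong_a = [[-20.0, -10.0, 10.0, 20.0], [-18.0, -8.0, 12.0, 22.0]]
strong_b = [[-16.0, -6.0, 14.0, 24.0], [-14.0, -4.0, 16.0, 26.0]]
# Weak-gradient pair: opposite patterns, but with small amplitudes.
weak_a = [[-1.0, -0.5, 0.5, 1.0], [-1.0, -0.5, 0.5, 1.0]]
weak_b = [[1.0, 0.5, -0.5, -1.0], [1.0, 0.5, -0.5, -1.0]]

d_strong = window_rms_distance(strong_a, strong_b)  # similar patterns
d_weak = window_rms_distance(weak_a, weak_b)        # opposite patterns
```

In this made-up case the two weak-gradient members have opposite patterns yet lie closer together in the RMS sense than the two strong-gradient members that share a pattern, so an RMS-based scheme could group them together.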

A cluster is represented not by the mean of its members but by its most representative member (MRM or cluster scenario).  This is selected by a pattern-matching algorithm that minimises the RMS difference between the “centre of gravity” of the cluster and each member.  The most representative member symbolises the cluster, but it should not be used as a deterministic forecast, nor should it be seen as a substitute for the cluster average (cluster averages are not currently available as web charts but are available for download from MARS).
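The selection of a most representative member can be sketched as follows.  This is an illustration under stated assumptions, not the operational algorithm: the “centre of gravity” is assumed here to be the point-wise mean of the cluster members, and the function names and field values are hypothetical.

```python
import math


def rms_difference(field_a, field_b):
    """RMS difference between two fields given as flat lists of grid-point values."""
    squared = sum((a - b) ** 2 for a, b in zip(field_a, field_b))
    return math.sqrt(squared / len(field_a))


def centre_of_gravity(members):
    """Point-wise mean of the member fields (an assumption made for this sketch)."""
    n = len(members)
    return [sum(values) / n for values in zip(*members)]


def most_representative_member(members):
    """Index of the member with the smallest RMS difference from the centre."""
    centre = centre_of_gravity(members)
    distances = [rms_difference(member, centre) for member in members]
    return min(range(len(members)), key=distances.__getitem__)


# Hypothetical cluster of three members on a three-point grid.
cluster = [
    [540.0, 552.0, 564.0],
    [544.0, 556.0, 568.0],
    [551.0, 563.0, 575.0],
]
mrm_index = most_representative_member(cluster)  # -> 1
```

Here the centre of gravity is [545, 557, 569], which matches none of the members; the second member is the closest to it and is therefore selected.  The chosen scenario is always a real member, which is why it differs from the cluster average.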

The number of cluster scenarios is related to the characteristics of the ensemble forecast distribution. 

Note:

The clusters are intended to give an overview of the ensemble forecast information and should not be over-interpreted.  The differences between two clusters will be mainly related to genuine differences in the 500hPa flow pattern, in particular if the differences are large scale.  

The 1000hPa clusters correspond to the clusters of the 500hPa fields; clustering is not done on the 1000hPa fields.  Each 1000hPa cluster therefore has the same population of members and the same most representative member as its 500hPa counterpart.  Major differences might be due to the relation between the flow at 500hPa and that at 1000hPa – fairly similar 500hPa patterns might be linked to quite different 1000hPa patterns.

 

Fig8.1.3.1-2: The most representative 500hPa members selected to describe the clustering of the forecast DT 00UTC 12 March 2017, T+120 to T+168 hours.  Here there are 3 clusters (one per row).  The most representative member, or cluster scenario, is the member of the cluster that has the minimum RMS difference from the “centre of gravity” of the cluster.  On 500hPa cluster scenario charts, shading denotes the 500hPa height anomaly of the ensemble member height field (as contoured) from the long-term climatological average.

In Fig8.1.3.1-2 there are three clusters:

Differences can be seen in the depth and timing of the upper trough near Scotland at T+120 and the building of the following upper ridge towards southwest Britain.  However, overall the differences between the three clusters do not look to be particularly large on this occasion.

The web site includes cluster products equivalent to Fig8.1.3.1-2 for each of the four predefined lead-time windows.  For additional information, the 1000hPa geopotential fields are also provided for each ensemble scenario to show the corresponding near-surface evolution.  Users should note that for these the clustering has been made on the 500hPa fields, not the 1000hPa fields.  Whilst the user should not treat the most representative members as deterministic solutions, it can nonetheless be helpful to examine the details of the evolution in such members, to see how a particular scenario can plausibly arise and evolve.  One good way to do this is to use the cyclone database products presented by the extra-tropical cyclone diagrams, specifically the animations, at 12 hour intervals, of synoptic patterns for individual members.

Weather Regimes

After day 10, it is desirable to place the daily clustering in the context of the large-scale flow and to allow the investigation of regime changes.  Use of weather regimes indicates differences between most representative members (MRMs or cluster scenarios) in terms of the large-scale flow and provides information about the possible transitions between regimes during the forecast.

So, after clustering by weather scenario, each most representative member is attributed to one of four large-scale climatological weather regimes.  These have been evaluated over an area covering Europe and the north Atlantic, which is considerably larger than the area used for the weather scenario clustering (see above and Fig8.1.3.1-1).

Fig8.1.3.1-3: The Euro-Atlantic area considered for the computation of four weather regimes (NAO+, NAO-, BL, AR) derived from reanalysis of 500hPa geopotential height.

Large-scale climatological weather regimes have been computed from 29 years of reanalysis of 500hPa height data (ERA-Interim and ERA-40) within the weather regime analysis area.  The reanalysis results were clustered, using the same clustering algorithm as for the ensemble weather scenarios, to produce four fixed climatological regimes.  These are the positive and negative phases of the North Atlantic Oscillation (NAO+ and NAO-), blocking (BL) and Atlantic ridge (AR).


Fig8.1.3.1-4:  The large scale climatological regimes for 500hPa heights computed for the cold season (October to April).


Fig8.1.3.1-5:  The large scale climatological regimes for 500hPa heights computed for the warm season (May to September).

The climatological regimes have been computed from 29 years of reanalysis of 500hPa height data (ERA-Interim and ERA-40) using the same clustering algorithm as for the ensemble scenarios.  Geopotential heights are shown in units of tenths of a metre; colours show anomalies from the mean 500hPa height derived over the reanalysis period.


Each most representative member is attributed to one of the four large-scale climatological weather regimes by a pattern-matching algorithm, which assigns it to the closest regime (by minimising the RMS difference).  This attribution is indicated by the colour of the frame surrounding each cluster scenario (Blue: NAO+; Green: NAO-; Red: BL; Purple: AR).  The climatological weather regime refers only to the displayed most representative member.
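The attribution step can be sketched in a few lines of Python.  This is a minimal illustration, not the operational code: the regime patterns and the member anomaly are made-up four-point fields, and the names `attribute_regime` and `FRAME_COLOURS` are hypothetical.  Only the regime labels and frame colours are taken from the text above.

```python
import math


def rms_difference(field_a, field_b):
    """RMS difference between two fields given as flat lists of grid-point values."""
    squared = sum((a - b) ** 2 for a, b in zip(field_a, field_b))
    return math.sqrt(squared / len(field_a))


def attribute_regime(member_anomaly, regime_patterns):
    """Name of the regime whose pattern has the smallest RMS difference from the member."""
    return min(
        regime_patterns,
        key=lambda name: rms_difference(member_anomaly, regime_patterns[name]),
    )


# Frame colours as listed in the text.
FRAME_COLOURS = {"NAO+": "blue", "NAO-": "green", "BL": "red", "AR": "purple"}

# Hypothetical regime anomaly patterns on a four-point grid (illustration only).
regime_patterns = {
    "NAO+": [-8.0, -4.0, 4.0, 8.0],
    "NAO-": [8.0, 4.0, -4.0, -8.0],
    "BL": [-2.0, 6.0, 6.0, -2.0],
    "AR": [6.0, 2.0, -6.0, 2.0],
}

# Hypothetical 500hPa anomaly of a most representative member.
mrm_anomaly = [-1.0, 5.0, 7.0, -3.0]

regime = attribute_regime(mrm_anomaly, regime_patterns)  # -> "BL"
frame_colour = FRAME_COLOURS[regime]                     # -> "red"
```

A member is always assigned to one of the four regimes, even when it is not particularly close to any of them, so the frame colour indicates the nearest regime rather than a confident match.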


Fig8.1.3.1-6: As Fig8.1.3.1-2 but referring to the forecast DT 00UTC 05 March 2017, T+264 to T+360 hours.  The colour of the frame surrounding each most representative member indicates the large-scale climatological weather regime to which it has been attributed.  On 500hPa plots such as these, shading denotes anomalies relative to climatology (as in Fig8.1.3.1-2).

Flow-dependent skill

It has been found: 

Additional Sources of Information

(Note: In older material there may be references to issues that have subsequently been addressed)