Heikki Järvinen, Sami Saarinen, Per Undén
- Please note: Observation Data Selection was previously known as "blacklisting". The terminology in this user guide was updated in line with policies to avoid unconscious bias.
This guide describes the observation data selection language as well as its usage in the IFS.
...
In the operational suite on the Cray computer, the denylist was basically a list of undesired stations to be excluded from the analysis in operations, and usually in research experiments too, based on monthly monitoring by the Operations Department. The technique for observation data selection has been streamlined as part of the migration of operational codes from Cray to Fujitsu.
A new observation data selection format has been introduced that allows a great deal more flexibility in decision making on the use of observations. The observation data selection process now consists of two parts: a data selection part and a monthly monitoring part. The data selection part contains information about which variables will be used in the assimilation, and it should be amended only rarely, except in experimentation. The monthly monitoring part, on the other hand, will be updated fairly frequently as a result of data monitoring. The former automatic ship denylist is no longer supported.
This guide comprehensively describes the format of the observation data selection language developed at ECMWF during the migration project in 1995-96, based on an initial idea by Mats Hamrud.
2 The observation data selection language
Observation data selection now works in the IFS context as follows. One edits a data selection file, which is written in a specific format. That file is then converted into a subroutine (C language) using the observation data selection compiler. The subroutine is then compiled and linked into the executable. This external routine is called from the IFS with a list of arguments in the observation screening run. IFS then receives a few flags telling whether to reject or accept this station or variable for assimilation. The following example will clarify the concepts used in the denylist.
if (OBSTYP = synop) then
  if VARIAB in (u10m, v10m)
     and LSMASK = land
     and abs(LAT) < 25 then
    fail(constant);
  endif;
endif;
...
There are several patterns in this single denylist rule; in the following they will be called:
...
Variables get their values from IFS. These are compared against the keywords or values given in the denylist. If the denylist rule is true, the fail-function takes action, activating denial flags and returning to the calling routine in IFS. Note that the observation data selection language is case insensitive and no column orientation is required.
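The evaluation described above can be mimicked in a few lines of Python. This is only an illustrative sketch, not the C code that the compiler generates: the function name `screen`, the dictionary layout and the returned tuple are assumptions made for this example, while the numeric keyword values are taken from the constant listing in Section 5.1.

```python
# Keyword values as listed in the converted denylist of Section 5.1.
SYNOP, U10M, V10M, LAND = 1, 41, 42, 1
CONSTANT = 2  # fail code for constant denial


def screen(obs):
    """Return (fail_code, seriousness) if the example rule fires, else None.

    `obs` is a hypothetical dict holding the variables IFS would pass in.
    """
    if obs["obstyp"] == SYNOP:
        if (obs["variab"] in (U10M, V10M)
                and obs["lsmask"] == LAND
                and abs(obs["lat"]) < 25):
            # Corresponds to fail(constant); seriousness defaults to one.
            return (CONSTANT, 1.0)
    return None


tropical_land_wind = {"obstyp": 1, "variab": 41, "lsmask": 1, "lat": -12.5}
midlatitude_wind = {"obstyp": 1, "variab": 41, "lsmask": 1, "lat": 48.0}
print(screen(tropical_land_wind))
print(screen(midlatitude_wind))
```

Only the first observation satisfies all three conditions, so only it is denied; the second falls outside the latitude band and passes through untouched.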
...
The up-to-date list of variables related to observation header and model fields can be found on HPCF in the external file of our denylist (for instance /home/rd/rdx/data/37r3/an/external_bl_mon_monit.b for CY37R3).
Variable | Meaning | Possible values |
obstyp | observation type | Keyword (as listed below) |
statid | station id | Right justified 8 character string |
codtyp | code type | Integer value as defined in IFS |
instrm | instrument type | Integer value as defined in IFS |
date | date | Packed integer YYMMDD |
time | time | Packed integer HHMMSS |
lat | latitude | Real value in degrees (-90<=LAT<=90) |
lon | longitude | Real value in degrees (-180<LON<=180) |
stalt | station altitude | Real value in metres |
line_sat | line position atovs | Integer value |
retr_type | retrieval type | Integer value |
qi_fc | EUMETSAT Quality Indicators: with forecast dependence | |
rff | CIMSS Quality Indicator: Recursive Filter Flag | |
qi_nofc | EUMETSAT Quality Indicators: without forecast dependence | |
sensor | satellite sensor indicator (for RTTOV) | Integer value |
fov | field of view number | Integer value |
satza | satellite zenith angle | Real value (in degrees) |
nandat | analysis date | Packed integer YYMMDD |
nantim | analysis time | Packed integer HHMMSS |
soe | solar elevation | Real value |
qr | quality of retrieval | |
clc | cloud cover | |
cp | cloud top pressure | |
pt | product type | Integer value |
sonde_type | sonde type | Integer value |
specific | amsua=clwp on sea | |
gen_centre | Generating centre | Integer value (WMO defined) |
gen_subcentre | Generating sub-centre | Integer value (WMO defined) |
datastream | Data stream (see datastream in odb) | Integer value |
ifs_cycle | six-digit IFS cycle, e.g. 331001 for CY33R1.001 | 6 digit integer value |
retrsource | retrieval source | Integer value |
surftype | surface type indicator | |
sza | solar zenith angle | Real Value |
reportype | MARS reportype | Integer value for MARS archiving |
solar_hour | solar hour | Real value |
satellite_identifier | satellite identifier | Integer value |
station_identifier | station identifier (for some conventional only) | Integer value (similar to statid but for integer values only) |
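The packed integer encoding of date, time, nandat and nantim can be illustrated with a short Python sketch. It assumes the packed values are plain decimal encodings, as the YYMMDD and HHMMSS notation suggests; the helper names `unpack_date` and `unpack_time` are hypothetical and not part of the language.

```python
def unpack_date(packed):
    """Split a packed YYMMDD integer into (yy, mm, dd)."""
    return packed // 10000, (packed // 100) % 100, packed % 100


def unpack_time(packed):
    """Split a packed HHMMSS integer into (hh, mm, ss)."""
    return packed // 10000, (packed // 100) % 100, packed % 100


print(unpack_date(960115))
print(unpack_time(50000))
# Because the encoding is monotonic within a day, a time window can be
# tested directly on the packed values, e.g. 11:00:00 to 13:00:00:
print(110000 <= 113000 <= 130000)
```

Leading zeros are dropped in the packed form, so 05:00:00 appears as 50000; this is why rules can compare TIME against bounds such as 50000 and 70000.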
2.1.2 Model/first guess characteristics
Variable | Meaning | Possible values |
modps | model surface pressure | Real Value |
modts | model surface temperature | Real Value |
modt2m | model 2 metre temperature | Real Value |
modtop | model top level pressure (hPa) | Real Value |
sea_ice | model sea-ice fraction | Real Value |
2.1.3 Observation characteristics
External variables (SPECIAL, i.e. related to obs. body entry only)
Variable | Meaning | Possible values |
variab | variable name (varno in ODB) | Integer value |
vert_co | type of vert. coord. | Integer value |
press | pressure (hPa) | Real value |
press_rl | ref. level press. (hPa) | Real value |
ppcode | synop press. code | Integer value |
obs_value | observed value | Real value |
fg_departure | first guess depart. | Real value |
obs_error | observation error | Real value |
fg_error | first guess error | Real value |
winchan_dep | window chan dep | Real value |
obs_t | Obs temperature at same level, for R/S only. | Real value |
elevation | Radar elevation | Real value |
winchan_dep2 | alternative window chan dep | Real value |
tausfc | Surface transmittance for AIRS screening. | Real value |
csr_pclear | percentage of clear pixel (GEOS) | Real value |
2.2 Keywords
Keywords are fixed values against which certain variables are compared. They should be consistent with the IFS definitions. The keywords currently defined in the denylist (in its external file) are listed below. Adding new keywords is straightforward.
Variable | Keyword |
OBSTYP | synop, airep, satob, dribu, temp, pilot, satem, paob, scatt, limb, gbrad (or integer values as defined in IFS) |
CODTYP | rtovs, tovs, ssmi, meris, am_profiler, jp_profiler, eu_profiler, templand, tempship, dropsonde, reo3, metar, pgps, radar_rr, rad1c, satem500, satem250 (or integer values as defined in IFS) |
SENSOR | hirs, msu, ssu, amsua, amsub, ssmi_sensor, vtpr1, vtpr2, tmi, ssmis, airs, mhs, iasi, amsre, meteosat, msg, geosimg, mtsatimg, windsat, mwts, iras, mwri, envisat |
INSTRM | mipas, gome, gomos, sciamachy, seviry, gome2, omi, toms, sbuv, auramls, iasi_reo3, modis_sensor, mopitt |
VARIAB | u, v, z, dz, rh, q, pwc, rh2m, t, td, t2m, td2m, ts, ptend, w, ww, vv, ch, cm, cl, nh, nn, hshs, c, ns, s, e, tgtg, spsp1, spsp2, rs, eses, is, trtr, rr, jj, vs, ds, hwhw, dwdw, gclg, rhlc, rhmc, n, snra, ps, dd, ff, rawbt, rawra, satcl, scatss, du, dv, u10m, v10m, rhlay, auxil, cllqw, ambigv, ambigu, apdss, ro_bangk, rrefl, o3, hlos, no2, so2, co, hcho, go3, co2, ch4, aod, rao, od, rfltnc, lnprc |
LSMASK | sea, land |
RLMASK | tovsland |
PPCODE | psealev, pstalev, g850hpa, g700hpa, p500gpm, p1000gpm, p2000gpm, p3000gpm, p4000gpm, g900hpa, g500hpa |
VERT_CO | pressure, height, tovs_cha, sca |
RETR_TYP for TOVS | cloudy, partly_cloudy, clear |
RETR_TYP for Satob | wvcl, ir, vis, wv, comb_spec_channels, wvmw, wvcl1, wvcl2, wvcl3, ir1, ir2, ir3, vis1, vis2, vis3, wvmw1, wvmw2, wvmw3 |
SONDE_TYPE for radiosondes | st_avk_mrz, st_rs80_usa, st_rs80, st_rs90, st_viz |
DATASTREAM | ears, pacrars, dbmodis |
ODB constants | rmdi, ndmi (real values as defined in ODB) |
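Since keywords are simply named constants, their role can be sketched in Python as a lookup table. The OBSTYP values below are those shown in the constant listing of Section 5.1; the `resolve` helper is a hypothetical name used for illustration, and the lower-casing mirrors the case insensitivity of the language.

```python
# OBSTYP keyword values as listed in the converted denylist of Section 5.1.
OBSTYP_KEYWORDS = {
    "synop": 1, "airep": 2, "satob": 3, "dribu": 4, "temp": 5,
    "pilot": 6, "satem": 7, "paob": 8, "scatt": 9,
}


def resolve(keyword, table=OBSTYP_KEYWORDS):
    """Return the integer behind a keyword, ignoring case."""
    return table[keyword.lower()]


print(resolve("SYNOP"))
print(resolve("Temp") == resolve("temp"))
```

A rule such as `OBSTYP = synop` therefore amounts to an integer comparison, which is also why integer values as defined in IFS can be used in place of the keywords.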
2.3 Statements and operators
...
The IF-statement syntax (note the semicolon (;) after each statement):
Syntax | Meaning |
if (condition) then | IF-test with optional ELIF/ELSE-blocks. Nested IF-tests are valid in every statement. Every IF-THEN or IF-THEN-ELSE must match an ENDIF. The condition can be any logical or arithmetic operation. |
2.3.2 List of the simple operators
...
A list of operators that are currently defined in the observation data selection language:
2.3.3 List of more complex operators
...
2.4 Built-in functions
The observation data selection language also contains some built-in functions. They are listed below:
...
In addition, there is one special function to test whether a point is within a circular area on the Earth (e.g. to deny Meteosat SATOBs if they are too far away):
...
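As a rough illustration of the kind of test such a function performs, the following Python sketch checks whether a point lies within a given great-circle distance of a centre. Everything in it is an assumption of the sketch rather than the definition of the built-in: the name `within_circle`, the argument order, the haversine formula, the kilometre units and the mean Earth radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def within_circle(lat, lon, centre_lat, centre_lon, radius_km):
    """True if (lat, lon) lies within radius_km of the centre.

    All angles are in degrees; the distance is computed with the
    haversine formula on a spherical Earth.
    """
    phi1, phi2 = math.radians(lat), math.radians(centre_lat)
    dphi = phi2 - phi1
    dlam = math.radians(centre_lon - lon)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    distance = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
    return distance <= radius_km


# A point a quarter of the way round the equator from the centre is
# roughly 10 000 km away, so it is outside a 5000 km circle.
print(within_circle(0.0, 90.0, 0.0, 0.0, 5000.0))
```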
The fail()-function takes a variable number of arguments. If no arguments are given, the first argument is assumed to contain the keyword monthly, i.e. rejection occurs in the monthly monitoring part of the data selection file. If the second argument (the seriousness of the denial) is omitted, then seriousness is assumed to be equal to one.
Arguments in the fail(arg1, arg2)-function are:
Argument#1 (arg1) | Meaning |
monthly | monthly monitoring (default) |
constant | constant denial |
experimental | experimental denial |
use_emiskf_only | emiskf denial |
Argument#2 (arg2) | Meaning |
level | Level of seriousness of denial |
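The default handling of the two arguments can be sketched in Python with default parameter values. This is only a model of the documented defaults: the returned dictionary is an assumption of the sketch, and the numeric fail codes are those shown in the constant listing of Section 5.1.

```python
# Fail codes as listed in the converted denylist of Section 5.1.
MONTHLY, CONSTANT, EXPERIMENTAL = 1, 2, 3


def fail(kind=MONTHLY, seriousness=1.0):
    """Mimic the documented defaults of fail(arg1, arg2)."""
    return {"kind": kind, "seriousness": seriousness}


print(fail())                   # monthly monitoring, seriousness one
print(fail(CONSTANT))           # constant denial, seriousness one
print(fail(EXPERIMENTAL, 0.5))  # experimental denial, reduced seriousness
```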
When a call to the fail()-function occurs, the control is returned immediately to the calling application. Normally the application is the IFS, which will get the following (Fortran) variables updated:
Variable | Type | Meaning |
NCMBLI | Integer | Denial indicator: 0 = not denied (default) |
ZCMCCC | Real | Seriousness of the denial: 0 = default if not denied; ... denied (i.e. NCMBLI > 0); ... denial (optional) |
FEEDBACK | Integer | Feedback vector telling which variable(s) caused the denial to occur: 0 = ... denylist line number where the fail()-function took action |
ZCMCCC can take a range of values: combined with other quality control information, a value less than one may still lead to the use of this variable in the assimilation. This option of non-strict data denial increases the flexibility of the use of observations.
...
Variables must be declared if data will be passed from an application (like IFS) into the denylist. This is normally done through an external-declaration (see 4.2 or 5.1). Also, selected variables can be protected by defining them as constants.
Additional or local variables can be defined anywhere in the code, even within an IF-THEN-ELSE-ENDIF block (except in the IF-condition). However, any attempt to use undeclared or uninitialized variables will cause the data selection compilation to fail.
The simplest variable declaration is an assignment operation.
3 Operational and experimental use of the denylist
3.1 Location of data selection files
3.2 Some guidelines
Please do not place any station identifiers into the data selection part of the denylist. Instead, have them in the monthly monitoring part. This way we can keep changes to the data selection part to a minimum and make e.g. re-analysis much easier.
After any modifications to the denylist, please remember to recompile (preferably on a workstation) to check for syntax errors.
4 Creating a new data selection file
Data selection compilation is fully controlled by the script called blcomp. It has the following capabilities:
- Optionally convert from an old ASCII denylist format to a new format
- Check the syntax of a given denylist
- Create a C-language file (C_code.c) tailored for observation processing
- C-compile the C-file to create a linkable object
...
blcomp [-aAcCdDefiILmMnoOpSx8] data_selection_file.b (or data_selection_file.B)
where the flags are as follows:
The new DENYLIST-file must have either suffix ".b" or ".B". In the latter case the C-preprocessor /lib/cpp will be run in front of the BL-compiler, mainly to resolve any #include-statements.
For pure syntax checking of the new DENYLIST-file, give:
blcomp data_selection_file.b
or
blcomp data_selection_file.B
Running blcomp without arguments prints the usage. If this fails, check the setting of your PATH environment variable.
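The suffix rule described above can be summarised in a small Python sketch. It does not call blcomp and is not part of it; the function name `preprocessing_needed` is hypothetical and only encodes the documented behaviour that ".B" triggers the C-preprocessor, ".b" does not, and any other suffix is invalid.

```python
from pathlib import Path


def preprocessing_needed(filename):
    """Return True for '.B' (preprocessor runs first), False for '.b'.

    Raises ValueError for any other suffix, since one of the two
    suffixes is mandatory.
    """
    suffix = Path(filename).suffix
    if suffix == ".B":
        return True
    if suffix == ".b":
        return False
    raise ValueError(f"{filename!r}: suffix must be '.b' or '.B'")


print(preprocessing_needed("data_selection_file.B"))
print(preprocessing_needed("data_selection_file.b"))
```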
4.2 Conversion from old to new denylist
Conversion from old to new and syntax checking of the new DENYLIST-file can be accomplished in the following way:
blcomp -o old_text_data_selection_file newfile.b
or
blcomp -o old_text_data_selection_file newfile.B
Here, the input file is old_text_data_selection_file, and the output file is newfile.b (or newfile.B) in the new denylist format.
While converting from old to new format, the suffix (.b or .B) of the new data selection file plays an important role. First of all, there MUST always be a suffix. When the suffix is .b, a single data selection file (here: newfile.b) will be created with all external declarations (e.g. variable declarations) and monthly monitoring rules (a portion of the data selection that normally does not change during a one-month period) inlined.
...
4.3 C-code generation
To enable fast denylist handling, the data selection file is always converted into an object file (.o) meant to be linked with the (Fortran) application (like IFS) in conjunction with the observation data selection object library (normally libbl95.a).
Once a data selection file (either with .b or .B suffix) is available, it can be converted to the C-language file C_code.c and compiled to an object for maximum performance. This can be done as follows:
blcomp -c data_selection_file.b
or
blcomp -c data_selection_file.B
4.4 Linking with an application
A Fortran application (IFS) interfaces with the observation data selection via two subroutines:
- BLACKBOX_INIT
- BLACKBOX
The former initialises the list of variables used by the application; the latter handles all the interfacing with the data selection file.
To link an application with the observation data selection software, one needs not only the C_code.o object file, but also the data selection library libbl95.a. The linking command is normally:
...
The exact location of the observation data selection library can be found via the command:
...
If no data selection part is needed, one can combine the conversion from old to new denylist and the object code generation described above:
blcomp -c -o old_text_data_selection_file newfile.b
or
blcomp -c -o old_text_data_selection_file newfile.B
4.6 User interface
It is always recommended to (cold-)compile a modified denylist on a workstation to check for syntax errors. If any errors are detected, the blcomp-command attempts to open an editor session and jump directly to the line where the (first) error occurred.
Sometimes this facility is not desirable; it can be disabled by using the -i flag in the blcomp-command.
5 Examples
The data selection file is normally about 1 000 lines long. In order not to confuse readers, we explain here with very short examples what can be done with the observation data selection language.
5.1 A simple example
A fraction of an old denylist (old) looks as follows:
...
!
! Written by an automatic conversion program, version 3
!
!
! File converted from the file "old"
!
! FAILCODE :
const monthly = 1;
const constant = 2;
const experimental = 3;
const allowlist = 4;
! OBSTYP :
const synop = 1;
const airep = 2;
const satob = 3;
const dribu = 4;
const temp = 5;
const pilot = 6;
const satem = 7;
const paob = 8;
const scatt = 9;
! CODTYP : none
! INSTRM : none
! VARIAB :
const u = 3;
const v = 4;
const z = 1;
const dz = 57;
const rh = 29;
const q = 7;
const pwc = 9;
const rh2m = 58;
const t = 2;
const td = 59;
const t2m = 39;
const td2m = 40;
const ts = 11;
const ptend = 30;
const w = 60;
const ww = 61;
const vv = 62;
const ch = 63;
const cm = 64;
const cl = 65;
const nh = 66;
const nn = 67;
const hshs = 68;
const c = 69;
const ns = 70;
const s = 71;
const e = 72;
const tgtg = 73;
const spsp1 = 74;
const spsp2 = 75;
const rs = 76;
const eses = 77;
const is = 78;
const trtr = 79;
const rr = 80;
const jj = 81;
const vs = 82;
const ds = 83;
const hwhw = 84;
const pwpw = 85;
const dwdw = 86;
const gclg = 87;
const rhlc = 88;
const rhmc = 89;
const rhhc = 90;
const n = 91;
const snra = 92;
const ps = 110;
const dd = 111;
const ff = 112;
const rawbt = 119;
const rawra = 120;
const satcl = 121;
const scatss = 122;
const du = 5;
const dv = 6;
const u10m = 41;
const v10m = 42;
const rhlay = 19;
const auxil = 200;
const cllqw = 123;
const scatdd = 124;
const scatff = 125;
! LSMASK :
const sea = 0;
const land = 1;
! PPCODE :
const psealev = 0;
const pstalev = 1;
const g850hpa = 2;
const g700hpa = 3;
const p500gpm = 4;
const p1000gpm = 5;
const p2000gpm = 6;
const p3000gpm = 7;
const p4000gpm = 8;
const g900hpa = 9;
const g1000hpa = 10;
const g500hpa = 11;
! VERT_CO:
const pressure = 1;
const height = 2;
const tovs_cha = 3;
const scat_cha = 4;
...
And finally the actual monthly monitoring rules in the new denylist format:
if ( OBSTYP = synop ) then
if VARIAB in ( z, ps )
and STATID = " 3ELC"
then fail(); endif;
if VARIAB in ( z, ps, u10m, v10m )
and STATID = " ELBX3"
then fail(); endif;
return; endif;
if ( OBSTYP = airep ) then
if (VARIAB = t)
and STATID in ( " N503US", " UAL...")
then fail(); endif;
return; endif;
if ( OBSTYP = satob ) then
if STATID in ( " 0//", " 024")
then fail(); endif;
return; endif;
if ( OBSTYP = dribu ) then
if VARIAB in ( z, ps, u, v )
and STATID = " 46527"
then fail(); endif;
return; endif;
if ( OBSTYP = temp ) then
if (VARIAB = z)
and STATID = " ERES"
then fail(); endif;
return; endif;
if ( OBSTYP = pilot ) then
if VARIAB in ( u, v )
and STATID = " 08221"
then fail(); endif;
return; endif;
if ( OBSTYP = satem ) then
if STATID = " 201"
then fail(); endif;
return; endif;
5.2 A more complex example
The observation data selection compiler will generate quite compact and readable code from the following excerpt:
...
The constant definitions are no different from the previous example. The monthly monitoring rules in the new denylist format become:
if ( OBSTYP = synop ) then
if VARIAB in ( z, ps )
and STATID in ( " ATQM", " ATRK", " ATSR", " C6BB", " C6QK")
then fail(); endif;
return; endif;
if ( OBSTYP = airep ) then
if ( 50 >= PRESS >= 10 )
and STATID = " AN..."
then fail(); endif;
if ( ( LAT < -90 or LAT > 90 ) or ( -80 < LON < -40 ) )
and STATID = " NWA74"
then fail(); endif;
return; endif;
if ( OBSTYP = satob ) then
if ( ( LAT < -50 or LAT > 50 ) or ( -170 < LON < 90 ) )
and STATID = " 104"
then fail(); endif;
if ( ( LAT < -50 or LAT > 50 ) or ( LON < -50 or LON > 50 ) )
and ( 1000 >= PRESS >= 401 )
and STATID = " 035"
then fail(); endif;
return; endif;
if ( OBSTYP = temp ) then
if (VARIAB = z)
and ( 100 >= PRESS >= 10 )
and ( 110000 <= TIME <= 130000 )
and STATID = " 20674"
then fail(); endif;
if VARIAB in ( u, v )
and ( 50000 <= TIME <= 70000 )
and STATID = " 40179"
then fail(); endif;
return; endif;
if ( OBSTYP = pilot ) then
if VARIAB in ( u, v )
and ( 50000 <= TIME <= 70000 )
and STATID = " 40179"
then fail(); endif;
return; endif;
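The range tests in these rules, such as ( 100 >= PRESS >= 10 ), read like chained comparisons, which Python happens to support with the same notation. The sketch below renders the first TEMP rule in Python for illustration only: the function name, the dictionary layout and the stripping of padding from the right-justified station identifier are assumptions of the sketch, and the keyword values come from the constant listing in Section 5.1.

```python
TEMP, Z = 5, 1  # keyword values from the constant listing in Section 5.1


def temp_rule_fires(obs):
    """Python rendering of the first TEMP rule; `obs` is a hypothetical dict."""
    return (obs["obstyp"] == TEMP
            and obs["variab"] == Z
            and 100 >= obs["press"] >= 10        # pressure between 10 and 100 hPa
            and 110000 <= obs["time"] <= 130000  # packed HHMMSS, 11:00:00 to 13:00:00
            and obs["statid"].strip() == "20674")


noon_report = {"obstyp": 5, "variab": 1, "press": 50.0,
               "time": 120000, "statid": "   20674"}
afternoon_report = dict(noon_report, time=140000)
print(temp_rule_fires(noon_report))
print(temp_rule_fires(afternoon_report))
```

The noon report matches every condition and would be denied; the afternoon report differs only in its time, falls outside the window, and is left alone.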
...
- Never remove or redefine existing variables. That will make re-running earlier cases virtually impossible.
- Add the new variable in the SQL requests black_rob*.sql. If the new variable is not in hdr or body but in some data-specific tables (e.g. sat, or conv), you need to modify *only* those requests that are relevant for those data and have access to these tables.
- Add a variable to the IFS source code in obs_preproc/blinit.F90.
- Increase the number of defined variables in obs_preproc/blinit.F90.
- External declaration must be done into the external-file.
- Before starting to use the new variable, initialize it properly in obs_preproc/black.F90. If the new variable is not in hdr or body but in some data-specific tables (e.g. sat, or conv):
- make sure the variable is always initialized, and
- put some logic in place (e.g. IF (IOBTYP == NSYNOP)...) in order to populate, only when appropriate, the variable with values from the sql.
- The new variable can now be added into the denylist. If keywords are associated with it, declare them in the external-file as well.
...