General information

Service options available

Option 2: Member State ecFlow suites monitored by ECMWF:

  • Suitable for more complex applications comprising several tasks with interdependencies between them.
  • The suites will be developed according to the technical guidelines described in this document.
  • To be requested by the TAC representative of the relevant Member State.
  • Monitored by ECMWF.

Option 3: Member State ecFlow suites managed by ECMWF:

  • Further enhancement of Option 2.
  • Requires an ecFlow suite, which has usually been developed under Option 2 of the Framework.
  • Application developed, tested and maintained by the Member State.
  • It must be possible to test the application using ECMWF pre-operational (e-suite) data.
  • Member State suite handed over to ECMWF.
  • Member State responsible for the migration of the application, e.g. when supercomputer changes.
  • Monitored by ECMWF.
  • ECMWF will provide first-level on-call support, while second-level support would be provided by the Member State.
  • To be requested by the TAC representative of the relevant Member State.

General characteristics of Member State time-critical work

The main characteristics of Member State time-critical work are:

  1. The work needs to be executed reliably, according to a predetermined and agreed schedule.
  2. It runs regularly, in most cases on a daily basis, but could be executed on a weekly, monthly or ad hoc basis.
  3. It must have an official owner who is responsible for its development and maintenance.

Systems that can be used at ECMWF to execute time-critical work

Within this Framework, Member State users can use the Atos HPCF and ECS services (HPCF resources are needed to use the HPCF / HPC service). In general, users should minimise the number of systems they use. The HPC service is recommended for all computationally intensive time-critical Option 2 and Option 3 activity, e.g. running a numerical model. The ECGATE (ECS) service should be used only for work that is not excessively computationally intensive, such as graphical post-processing of output before it is transferred to the Member State. Member State time-critical work may also need to use additional systems outside ECMWF after processing at ECMWF has completed, for example to run other models using data produced by the work at ECMWF. This document does not provide guidelines on how to run work that does not use ECMWF computing systems.

How to request this service and information required

Every registered user of ECMWF computing systems can run work under Option 1 of the Framework and no formal request is required. Note that access to ECMWF real-time operational data is restricted. Users interested in running this kind of work should refer to the User guide to simple time-critical jobs - time-critical job submission under ECaccess.

To run work under either Option 2 or Option 3 of the Framework, the TAC representative of your Member State should send a formal request to the Director of the Forecasting Department at ECMWF.

Before submitting a request, we advise you to contact ECMWF to discuss the time-critical work you intend to run at ECMWF.  You will be assigned a User Services contact point who will be able to advise you on the technical aspects of your time-critical work and how best to implement it within the guidelines described here.

Your official request will need to provide the following information:

  1. Description of the main tasks.
  2. The systems needed to run these tasks.
  3. Technical characteristics of the main tasks running on HPCF: number of processors required, memory needed, CPU/elapsed time needed, size of the input and output files, any software/library dependencies, system billing units (SBU) needed (if applicable).
  4. A detailed description of the data flow, in particular describing which data are required before processing can start. This description must state any dependency on data that are not produced by any of the ECMWF models and include from where these data can be obtained and their availability time.
  5. A proposed time schedule for the main tasks, stating in particular when it is desirable to have the output of this work available 'to the customers'.

ECMWF will consider your request and reply officially within three months taking into account, in particular, the resources required for the implementation of your request.

Technical guidelines for setting up time-critical work

Basic information

Option 2

As this work will be monitored by ECMWF staff (User Services during the development phase; the ECMWF 24x7 Shift Staff, once your work is fully implemented), the only practical option is to implement time-critical work using a suite under ecFlow, ECMWF's monitoring and scheduling software package. The suite must be developed according to the technical guidelines provided in this document. General documentation, training course material, etc, on ecFlow can be found at ecFlow home. No on-call support will be provided by ECMWF staff but the ECMWF Shift Staff can contact the relevant Member State suite support person, if this is clearly requested in the suite 'manual' pages.

Option 3

In this case, the Member State ecFlow suite will be managed by ECMWF. The suite will usually be based either on a suite developed previously under Option 2 of the Framework or on a similar suite already used to run ECMWF operational work. The suite will be run using the ECMWF operational User ID and will be managed by ECMWF Production Section staff. The suite will generally be developed following similar guidelines to Option 2 suites. The main technical differences are that Option 3 suites will have higher batch scheduling priority than Option 2 work and will also benefit from second-level on-call support from the ECMWF Production Section staff.

Before implementing your ecFlow suite

You are advised to discuss the requirements of your work with your User Services technical contact before you start any implementation. An initial implementation should be tested under your normal Member State User ID, using the file systems normally available to you and standard batch job queues, etc, following the technical guidelines given in this document. You should proceed with the final implementation only after your official request has been approved by ECMWF.

User ID used to run the work

A specific User ID, also referred to as an "application identifier", will be created to run a particular suite as Option 2. Such User IDs start with a "z", followed by two or three characters. A responsible person should be nominated for every "application identifier" User ID. A limited number of other registered users can also be authorised to access this User ID, and a mechanism to allow such access under strict control will be available. The person associated with the User ID and the other authorised users are responsible for all changes made to the files owned by the User ID. The User ID will be registered with a specific Identity and Access Management role and entitlement ("timecrit"), which allows access to restricted batch classes and restricted file systems on the ECMWF HPC.

General ecFlow suite guidelines

As mentioned previously, and unless otherwise previously agreed, Option 2 work must be implemented as an ecFlow suite.

The ecFlow environment is not set up by default for users on the Atos HPCF and ECGATE systems. Users must load the ecFlow environment with:

$ module load ecflow


ECMWF will create a "ready-to-go" ecFlow server running on an independent Virtual Machine outside the HPCF. See also Using ecFlow for further information about using ecFlow on the Atos HPCF and ECGATE systems.

Port number and ecFlow server

The ecFlow port number for the suite will be the default 3141 and the host will be the Virtual Machine which will have a hostname of the form ecflow-tc2-$USER-001.  The ecFlow server will run on the Virtual Machine and will be started at system boot time. You should not need to SSH onto the Virtual Machine itself unless there is a problem.

If the ecFlow server dies for some reason, it should be restarted automatically. ECMWF 24x7 Shift Staff also monitor the ecFlow servers used to run Member State time-critical activity and should take action if a server dies. If the server has not restarted or appears unresponsive, then you may try to restart it manually with:

$ ssh $ECF_HOST sudo systemctl restart ecflow-server
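
The host naming convention and the manual restart step can be wrapped in a small helper. The following is a minimal sketch, not a script provided by ECMWF: the function names ecf_host_for, run and ensure_server are hypothetical, and the ECF_CLIENT and DRY_RUN variables exist only so that the logic can be tried without a server. It assumes that "ecflow_client --ping" returns a non-zero status when the server cannot be reached.

```shell
#!/bin/bash
# Sketch only: helper names and the ECF_CLIENT / DRY_RUN switches are hypothetical.

# Build the Virtual Machine hostname for a given User ID.
ecf_host_for() {
    echo "ecflow-tc2-${1}-001"
}

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Ping the server; if it does not answer, try the documented manual restart.
#   $1: User ID, $2: port (defaults to 3141)
ensure_server() {
    host="$(ecf_host_for "$1")"
    port="${2:-3141}"
    if ${ECF_CLIENT:-ecflow_client} --host="$host" --port="$port" --ping >/dev/null 2>&1; then
        echo "server on ${host}:${port} is responding"
    else
        echo "server on ${host}:${port} not responding, restarting"
        run ssh "$host" sudo systemctl restart ecflow-server
    fi
}
```

Calling "DRY_RUN=1 ensure_server $USER" prints the restart command instead of executing it when the ping fails, which is a safe way to check the host name before relying on the helper.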

Housekeeping the ecFlow server log file

Depending on your activity with ecFlow, the ecFlow log file (stored in /home/$USER/ecflow_server/ecflow-tc2-$USER-001.log) will grow steadily. We recommend that you install either a cron job or an administration task in your suite to clean these ecFlow log files. This can be achieved with the ecflow_client command.

ecflow_client log command help
$ ecflow_client --help log  

log
---

Get,clear,flush or create a new log file.
The user must ensure that a valid path is specified.
Specifying '--log=get' with a large number of lines from the server,
can consume a lot of **memory**. The log file can be a very large file,
hence we use a default of 100 lines, optionally the number of lines can be specified.
 arg1 = [ get | clear | flush | new | path ]
  get -   Outputs the log file to standard out.
          defaults to return the last 100 lines
          The second argument can specify how many lines to return
  clear - Clear the log file of its contents.
  flush - Flush and close the log file. (only temporary) next time
          server writes to log, it will be opened again. Hence it best
          to halt the server first
  new -   Flush and close the existing log file, and start using the
          the path defined for ECF_LOG. By changing this variable
          a new log file path can be used
          Alternatively an explicit path can also be provided
          in which case ECF_LOG is also updated
  path -  Returns the path name to the existing log file
 arg2 = [ new_path | optional last n lines ]
         if get specified can specify lines to get. Value must be convertible to an integer
         Otherwise if arg1 is 'new' then the second argument must be a path
Usage:
  --log=get                        # Write the last 100 lines of the log file to standard out
  --log=get 200                    # Write the last 200 lines of the log file to standard out
  --log=clear                      # Clear the log file. The log is now empty
  --log=flush                      # Flush and close log file, next request will re-open log file
  --log=new /path/to/new/log/file  # Close and flush log file, and create a new log file, updates ECF_LOG
  --log=new                        # Close and flush log file, and create a new log file using ECF_LOG variable


The client reads in the following environment variables. These are read by user and child command

|----------|----------|------------|-------------------------------------------------------------------|
| Name     |  Type    | Required   | Description                                                       |
|----------|----------|------------|-------------------------------------------------------------------|
| ECF_HOST | <string> | Mandatory* | The host name of the main server. defaults to 'localhost'         |
| ECF_PORT |  <int>   | Mandatory* | The TCP/IP port to call on the server. Must be unique to a server |
| ECF_SSL  |  <any>   | Optional*  | Enable encrypted comms with SSL enabled server.                   |
|----------|----------|------------|-------------------------------------------------------------------|

* The host and port must be specified in order for the client to communicate with the server, this can 
  be done by setting ECF_HOST, ECF_PORT or by specifying --host=<host> --port=<int> on the command line
 


For example, to empty the log file, use:

ecflow_client --port=%ECF_PORT% --host=%ECF_HOST% --log=clear

Your crontab should be installed on either ecs-cron or hpc-cron. For information about using crontabs on the Atos HPCF and ECGATE services, please see Cron service.
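
As an illustration of such a housekeeping job, the sketch below wraps the --log=clear call shown above in a small script and gives, in a comment, a crontab entry that would run it weekly. The script name, its location and the schedule are assumptions made for the example. The host name and port follow the conventions described in the section on the port number and ecFlow server. In a real cron job the ecFlow environment would also have to be loaded first.

```shell
#!/bin/bash
# clean_ecflow_log.sh (hypothetical name): empty the ecFlow server log.
#
# Possible crontab entry, every Monday at 03:00 (schedule is an assumption):
#   0 3 * * 1 $HOME/bin/clean_ecflow_log.sh

# Print the ecflow_client command that clears the log.
#   $1: User ID, $2: port (defaults to 3141)
clear_log_cmd() {
    echo "ecflow_client --host=ecflow-tc2-${1}-001 --port=${2:-3141} --log=clear"
}

# Execute the command, or only print it when DRY_RUN=1 (hypothetical switch).
clean_ecflow_log() {
    cmd="$(clear_log_cmd "$@")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}
```

If the old log content needs to be kept, the same pattern could be used with --log=new and a dated path instead of --log=clear.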

Access to the job output files

Job output files should be stored on the Lustre $TCWORK file system.  As this file system cannot be directly accessed from the ecFlow Virtual Machine, we recommend a simple log server (implemented as a Perl script) is used to access the output files of jobs running on the HPCF. This log server requires another port number, which will have the format "35000+<UID>", where <UID> is the numeric uid of the User ID used to run the work. The log server should be run on the hpc-log node of the Atos HPCF. The ecflow_logserver.sh command should be used to start the log server on the HPCF. The syntax of the ecflow_logserver.sh command is:

$ ecflow_logserver.sh -h
Usage: /usr/bin/ecflow_logserver.sh [-d <dir>] [-m <map>] [-l <logfile>] [-h]
       -d <dir>     specify the directory name where files will be served
                    from - default is $HOME
       -m <map>     gives mapping between local directory and directory
                    where ecflow server runs - default is <dir>:<dir>
       -l <logfile> logserver log file - default is $SCRATCH/log/logfile
       -h           print this help page
Example:
       start_logserver.sh -d %ECF_OUT% -m %ECF_HOME%:%ECF_OUT% -l logserver.log
 

The mapping can consist of a succession of mappings. Each individual mapping will first give the directory name on the ecFlow server, followed by the directory name on the HPC system, as in the following example:

-m <dir_ecflow_vm>:<dir1_hpc>:<dir_ecflow_vm>:<dir2_hpc>

We recommend that you implement a cron job or define an administration task in your suite to check the presence of the log server process. The above script ecflow_logserver.sh can be used for this purpose. The logserver should be started on hpc-log.
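
The port rule and the command-line options described above can be sketched as follows. The function names are hypothetical, and the directories passed to logserver_cmd are placeholders to be replaced by your own paths. Only the options listed in the help text of ecflow_logserver.sh are used.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical.

# Log server port for a numeric uid: 35000 + uid.
# Without an argument, the uid of the current user is used.
logserver_port() {
    uid="${1:-$(id -u)}"
    echo $((35000 + uid))
}

# Print an ecflow_logserver.sh command line.
#   $1: directory as seen on the ecFlow Virtual Machine
#   $2: directory on the HPCF from which files are served
#   $3: log file of the log server itself
logserver_cmd() {
    echo "ecflow_logserver.sh -d $2 -m $1:$2 -l $3"
}
```

For example, a User ID with numeric uid 1234 would use port 36234. The printed command is intended to be run on hpc-log, either from a cron job or from an administration task in the suite.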

Managing ecFlow tasks

Your jobs will be managed by ecFlow. Three main actions on the ecFlow tasks are required: one to submit, one to check and one to kill a task. These three actions are defined respectively through the ecFlow variables ECF_JOB_CMD, ECF_STATUS_CMD and ECF_KILL_CMD. You can use any script to take these actions on your tasks, but we recommend the troika tool provided by ECMWF. The "troika" command is installed on your ecFlow Virtual Machine and can be used to submit, check or kill a task:

troika command help
$ troika -h
usage: troika [-h] [-V] [-v] [-q] [-l LOGFILE] [-c CONFIG] [-n] action ...

Submit, monitor and kill jobs on remote systems

positional arguments:
  action                perform this action, see `troika <action> --help` for details
    submit              submit a new job
    monitor             monitor a submitted job
    kill                kill a submitted job
    check-connection    check whether the connection works
    list-sites          list available sites

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -v, --verbose         increase verbosity level (can be repeated)
  -q, --quiet           decrease verbosity level (can be repeated)
  -l LOGFILE, --logfile LOGFILE
                        save log output to this file
  -c CONFIG, --config CONFIG
                        path to the configuration file
  -n, --dryrun          if true, do not execute, just report

environment variables:
  TROIKA_CONFIG_FILE    path to the default configuration file


We recommend you set your ecFlow variables to use troika as follows:

ECF_JOB_CMD="troika -vv submit -u %USER% -o %ECF_JOBOUT% %HOST% %ECF_JOB%"

ECF_KILL_CMD="troika kill -u %USER% %HOST% %ECF_JOB%"

ECF_STATUS_CMD="troika monitor -u %USER% %HOST% %ECF_JOB%"
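
One way to apply these settings is to define them once at suite level, so that every task inherits them. The sketch below prints a suite definition fragment doing that. The function name, the suite name and the USER and HOST values are hypothetical placeholders, and the fragment contains no families or tasks.

```shell
#!/bin/bash
# Sketch only: prints a suite definition fragment. The function name and the
# values passed to it are hypothetical.

#   $1: suite name, $2: User ID, $3: target host or site known to troika
troika_suite_header() {
    cat <<EOF
suite $1
  edit USER '$2'
  edit HOST '$3'
  edit ECF_JOB_CMD 'troika -vv submit -u %USER% -o %ECF_JOBOUT% %HOST% %ECF_JOB%'
  edit ECF_KILL_CMD 'troika kill -u %USER% %HOST% %ECF_JOB%'
  edit ECF_STATUS_CMD 'troika monitor -u %USER% %HOST% %ECF_JOB%'
endsuite
EOF
}
```

Keeping USER and HOST as ecFlow variables means that the same three commands continue to work when the suite is pointed at a different User ID or system.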

Access protection with ecFlow

Access to your ecFlow server can be controlled using an ecf.list file in $ECF_HOME.

We recommend that you set up and use this file, mainly to allow ECMWF staff to monitor your suite and to prevent unintentional access by other users. A sample file is available in ~usx/time_critical/ecf.list.

ecFlow suite design recommendations

Some key points to keep in mind when designing your suite:

  1. The suite should easily run in a different configuration. It is therefore vital to allow for easy changes of configuration. Possible changes could include:
    1. Running on a different HPCF system.
    2. Running the main task on fewer or more CPUs, with fewer or more threads (if relevant).
    3. Using a different file system.
    4. Using a different data set, e.g. ECMWF e-suite or own e-suite.
    5. Using a different "model" version.
    6. Using a different ecFlow server.
    7. Using a different User ID and different queues, e.g. for testing and development purposes.

      The worst that could happen is that you lose everything and need to restart from scratch. Although this is very unlikely, you should keep safe copies of your libraries, executables and other constant data files.

      To achieve flexibility in the configuration of your suite, we recommend that you have one core suite and define ecFlow variables for all those changes of configuration you want to cater for. See variable definitions in suite definition file ~usx/time_critical/sample_suite.def.



  2. It is also important to document clearly the procedures for any changes to the configuration, if these may need to be carried out by, for example, ECMWF's 24x7 Shift Staff.

  3. All tasks that are part of the critical path, i.e. that will produce the final "products" to be used by you, have to run in the safest environment:

    1. All time-critical tasks should run on the HPCF system.
    2. Your time-critical tasks should not use the Data Handling System (DHS), including ECFS and MARS. The data should be available online, on the HPCF, either in a private file system or accessed from the Fields Data Base (FDB) with MARS. If some data must be stored in MARS or ECFS, do not make time-critical tasks dependent on these archive tasks, but keep them independent. See the sample ecFlow definition in ~usx/time_critical/sample_suite.def.
    3. Do not use cross-mounted file systems. Always use file systems local to the HPC.
    4. To exchange data between remote systems, we recommend the use of rsync.

  4. The suite manual ('man') pages should include specific and clear instructions for ECMWF Shift Staff. An example man page is available from ~usx/time_critical/suite/man_page. Manual pages should include the following information:
    1. A description of the task.
    2. The dependencies on other tasks.
    3. What to do in case of failure.
    4. Whom to contact in case of failure, how and when.

      If you require an email to be sent to contacts whenever a suite task aborts then this can be included in the task ERROR function so that the email is sent automatically.



  5. The ecFlow "late tasks" functionality is useful for drawing the ECMWF Shift Staff's attention to possible problems in the running of your suite.

    Use the "late tasks" functionality sparingly.  Set it for a few key tasks only, with appropriately selected warning thresholds. If the functionality is used too frequently or if an alarm is triggered every day, it is likely that no one will pay attention to it.


  6. The suite should be self-cleaning. Disk management should be very strict and is your responsibility. All data no longer needed should be removed. The ecFlow jobs and job output files, if kept, should be archived in ECFS and then removed from local disks.

  7. Your suite definition will loop over many dates, e.g. to cover one year. Depending on the relation between your suite and the operational activity at ECMWF, you will trigger (start) your suite in one of the following ways:
    1. If your suite depends on the ECMWF operational suite, you will set up a time-critical job under ECaccess (see Option 1) which will simply set a first dummy task in your suite to complete. Alternatively, you could resume the suite, which would be reset to "suspended" after completing a cycle. See sample job in ~usx/time_critical/suite/trigger_suite.cmd.
    2. If your suite has no dependencies on the ECMWF operational activity, we suggest that you define a time in your suite definition file at which to start the first task in your suite.
    3. If your suite has no dependencies on the ECMWF operational activity, but has dependencies on external events, we suggest that you also define a time when to start the first task in your suite, and that you check for your external dependency in this first task.
    4. The cycling from day to day will usually happen by defining a time when the last task in the suite will run. This last task should run sufficiently far in advance of the start of the next run. Setting this time will allow you to watch the previous run of the suite up until the last task has run. See the sample suite definition in ~usx/time_critical/sample_suite.def.
      Note that if one task of your suite remains in aborted status, this will NOT prevent the last task from running at the given time, but your suite will not be able to cycle through to the next run, e.g. for the next day. Several options are available to overcome this problem. If the task that failed is not in the critical path, you can give instructions to the ECMWF Shift Staff to set the aborted task to complete. Another option is to build an administrative task that checks before each run that all tasks are set to complete and forces your suite to cycle through to the next run.
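
The two ways of triggering a suite from a time-critical ECaccess job, described in the first case of point 7, can be sketched as below. This is not the content of the sample trigger job: the function names and node paths are hypothetical, and the functions only print the ecflow_client commands so that they can be inspected before use.

```shell
#!/bin/bash
# Sketch only: function names and node paths are hypothetical.

# Command that sets a dummy first task to complete, which releases the suite.
#   $1: ecFlow host, $2: path of the dummy task, $3: port (defaults to 3141)
trigger_by_complete_cmd() {
    echo "ecflow_client --host=$1 --port=${3:-3141} --force=complete $2"
}

# Alternative: command that resumes a suite left in "suspended" state.
#   $1: ecFlow host, $2: suite path, $3: port (defaults to 3141)
trigger_by_resume_cmd() {
    echo "ecflow_client --host=$1 --port=${3:-3141} --resume=$2"
}
```

For instance, trigger_by_complete_cmd "$ECF_HOST" /mysuite/start prints the command for a suite whose first task is called start, where both names are placeholders.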

One key point in the successful communication between the jobs running on the HPCF systems and your ecFlow server is the error handling. We recommend the use of a trap, as illustrated in the sample suite in ~usx/time_critical/include/head.h. The shell script run by your batch job should also use the "set -ue" options.
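
The idea behind the trap can be sketched as follows. This is a simplified illustration and not the content of head.h: the names ECF_CLIENT and run_task are hypothetical, and by default the client calls are only printed so that the sketch can be run without a server. A real job must also export the ECF_* variables that ecflow_client child commands need, which the sketch omits.

```shell
#!/bin/bash
# Sketch only, not the contents of head.h. ECF_CLIENT and run_task are
# hypothetical names. By default the client calls are only printed;
# in a real job, set ECF_CLIENT=ecflow_client.
ECF_CLIENT="${ECF_CLIENT:-echo ecflow_client}"

# Run a task body ("$@") in a subshell with "set -ue" and report the
# outcome to the ecFlow server from an EXIT trap.
run_task() {
    (
        set -ue
        finish() {
            rc=$?                      # status that caused the exit
            trap - EXIT                # avoid re-entering the trap
            if [ "$rc" -eq 0 ]; then
                $ECF_CLIENT --complete
            else
                $ECF_CLIENT --abort="exit status $rc"
            fi
            exit "$rc"
        }
        trap finish EXIT
        $ECF_CLIENT --init="$$"        # tell the server the job has started
        "$@"
    )
}
```

Because the report is sent from the EXIT trap, the server is told about a failure whether the task body returns a non-zero status or stops on an unset variable. Without it, a failed job could leave its task shown as active.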

Sample suite

A sample suite illustrating the previous recommendations is available in ~usx/time_critical/sample_suite.def.

File systems

File systems have been set up on the HPCF clusters for the User ID which will be used to run the time-critical applications: they are called /ec/ws1 and /ec/ws2 on the current Atos HPC system. These file systems are quota controlled, so you will need to provide your User Services technical contact with an estimate of the total size and number of files which you need to keep on them.

These file systems should be used to hold both your binaries/libraries and your input and output data.

No select/delete process will run on these file systems, and you will be required to regularly remove any files no longer needed as part of your suite.

You will also be required to keep safe backup copies of your binaries, etc., in ECFS. We recommend including a task at the beginning of your suite, not to be run every day, that will restore your environment in case of emergency ("restart from scratch" functionality).

If there is a need for a file system with different characteristics (e.g. to hold files safely online for several days), these requirements can be discussed with your User Services technical contact and a file system with the required functionalities can be made available.

Batch job queues

Specific batch job queues have been set up on the HPCF clusters, with access restricted to the UIDs authorised to run Option 2 work. These queues are called "tf", used for sequential or fractional work (work using less than half of one node), and "tp", used for parallel work (work using more than 1 node). These are the queues that should be used to run any time-critical work on the Atos HPCF. If there are any non-time-critical tasks in your suite (e.g. archiving tasks), these can use the nf and np queues available to all HPC users.

Archiving tasks should always use the nf queue and not be included as part of your parallel work.

When you develop or test a new version of your time-critical suite, we advise you to use the standard nf and np queues available to all users. In this way, your time-critical activity will not be delayed by this testing or development work.
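
The queue rules above can be summarised in a small helper, for example to set a queue variable per task when generating a suite. This is a sketch with hypothetical function names. The batch directive printed by queue_directive rests on an assumption that is not stated in this document, namely that the batch system is Slurm and that the queue is selected with --qos.

```shell
#!/bin/bash
# Sketch only: function names and argument conventions are hypothetical.

# Pick a queue from the rules above.
#   $1: "tc" for time-critical work, anything else for other work
#   $2: "parallel" for parallel work, anything else for sequential/fractional
choose_queue() {
    if [ "$1" = "tc" ]; then prefix="t"; else prefix="n"; fi
    if [ "$2" = "parallel" ]; then
        echo "${prefix}p"
    else
        echo "${prefix}f"
    fi
}

# Print a batch directive for the chosen queue. ASSUMPTION: the batch system
# is Slurm and the queue is selected with --qos.
queue_directive() {
    echo "#SBATCH --qos=$(choose_queue "$1" "$2")"
}
```

With this convention, an archiving task would call choose_queue archive sequential and get nf, while a test run of a parallel model task would call choose_queue dev parallel and get np.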

Data required by your work

Your work will normally require some "input" data before processing can start. The following possibilities have been considered:

  1. Your work requires input data which is produced by any of the ECMWF models. In this case, a specific dissemination stream can be set up to send the required data to the HPCF. ECPDS allows "local" dissemination to a specific User ID (the User ID used to run the time-critical work), so that only this recipient User ID can see the data. It otherwise works like the standard dissemination to remote sites. This "local" dissemination is the recommended option. The recipient User ID is responsible for the regular clean-up of the received data.

    If produced by ECMWF, your required data will also be available in the FDB and will remain online for a limited amount of time, which varies by model. You can access these data using the usual "mars" command. If your suite requires data which may no longer be in the FDB by the time they are needed, it must retrieve them before they are removed from the FDB and store them temporarily in one of your disk storage areas.

    Under no circumstances should any of your time-critical suite tasks depend on data only available from the Data Handling System (MARS archive or ECFS). Beware that using a keyword value of "ALL" in any mars request will automatically redirect it to the MARS archive (DHS). We also recommend that you do not abbreviate MARS verbs, parameters or values in your mars requests. If too short, an abbreviation may become ambiguous when a new verb, parameter or value name is added to the MARS language.

  2. Your work requires input data which is available at ECMWF but not produced by an ECMWF model. For example, your work requires observations normally available on the GTS e.g. if you are interested in running some data assimilation work at ECMWF. In such a case you can obtain the required observations from /ec/vol/msbackup/ on the Atos HPCF where they are stored by a regular extraction task which runs as part of the ECMWF operational suite. For any other data you may need for your time-critical activity and which is available at ECMWF, please ask your User Services technical contact.

  3. Your work requires input data which is neither produced by any of the ECMWF models nor available at ECMWF. You will then be responsible for setting up the required "acquisition" tasks and establishing their level of time criticality. For example, your suite may need some additional observations which improve the quality of your assimilation, but your work can also run without them if their arrival at ECMWF is delayed or fails. Please see the section "Data transfers" for advice on how to transfer incoming data.
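
For the first case above, the advice on mars requests (no "ALL" values, no shortened verbs or keywords) can be illustrated with a request generator. This is a sketch: the function name is hypothetical, and every keyword value is an example only and does not describe any particular data set you may be entitled to access.

```shell
#!/bin/bash
# Sketch only: the function name and every keyword value are hypothetical examples.

# Print a MARS request with the verb and keywords written in full and
# explicit value lists instead of "ALL".
#   $1: date (YYYYMMDD), $2: base time (HH), $3: target file
fdb_request() {
    cat <<EOF
retrieve,
    class    = od,
    stream   = oper,
    expver   = 1,
    type     = fc,
    levtype  = sfc,
    param    = 2t,
    date     = $1,
    time     = $2,
    step     = 0/6/12,
    target   = "$3"
EOF
}
```

The printed request could then be passed to the mars command on standard input. Listing the steps explicitly, rather than asking for all of them, keeps the request within the data that are online in the FDB.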

Data transfers

Outgoing data - sending data from ECMWF

We recommend the use of the ECMWF Product Dissemination System (ECPDS) to disseminate data produced by Option 2 suites to remote sites. This option is more robust than ectrans or other file transfer methods such as sftp or rsync. Your User Services technical contact can provide more information about this.

Otherwise, the ectrans command can be used to send data to remote sites. The ectrans command allows for a retry of failed transfers from the spool.

We recommend that ectrans remote associations are set up on your local ECaccess gateway. If this is not available, you can set up remote associations on the ECaccess gateway at ECMWF (boaccess.ecmwf.int).

Note that, by default, ectrans transfers are asynchronous; the successful completion of the ectrans command does not mean your file has been transferred successfully. You may want to use the option "-put" to request synchronous transfers.

Incoming data - sending data to ECMWF

We recommend ectrans (option -get) to upload some data from a remote site to ECMWF. Other options, including the use of ECPDS in acquisition mode, may be considered in specific situations. Please discuss your requirements with your User Services technical contact.

Transferring data between systems at ECMWF

The ECGATE (ECS) and the HPCF (HPC) systems share the same file systems, so there should be no need to transfer data between them. If you need to transfer data to other systems at ECMWF, then we recommend that you use the rsync command. We remind you not to use the DHS (MARS or ECFS) for any tasks in the critical path.

Scheduling of work

The User IDs authorised to run Option 2 work are given a higher batch job scheduling priority than normal Member State work. All Option 2 User IDs are given the same priority. The Centre’s core operational activities will always be given higher priority than Member State time-critical activities. If problems affecting ECMWF's core activity suites arise or long system sessions are needed, the Member State suites and jobs may be delayed or possibly abandoned. Member State users should consider setting up suitable backup procedures for such eventualities.

Backup procedure

The User IDs authorised to run Option 2  work have access to all Atos HPCF complexes and you are advised to implement your suite so it can run on any of the clusters in case a specific cluster is unavailable for an extended period of time.

The two separate HPCF time-critical file systems (currently only the /ec/ws1 and /ec/ws2 file systems) should be kept regularly synchronised using utilities such as "rsync".

It should be possible to change the HPCF cluster used by the suite by making a simple change of ecFlow variable (variable SCHOST in the sample suite).

Similarly, it is desirable to be able to change the file system used by the suite by changing an ecFlow variable (variable STHOST in the sample suite). Users may also wish to consider more sophisticated backup procedures, such as the regular creation of backup products based on previous runs of the suite.
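
The synchronisation and the switch of cluster or file system could be scripted along the following lines. This is a sketch: the function names, directory layout and suite path are hypothetical, and both functions only print commands. The ecflow_client --alter syntax is quoted from memory of the client help and should be checked with "ecflow_client --help alter" before use.

```shell
#!/bin/bash
# Sketch only: function names, directories and the suite path are hypothetical.

# Command that mirrors a directory from one time-critical file system to the
# other. Note that --delete removes files at the destination that no longer
# exist at the source.
#   $1: source directory, $2: destination directory
sync_cmd() {
    echo "rsync -a --delete $1/ $2/"
}

# Command that changes an ecFlow variable on a suite, e.g. SCHOST or STHOST.
#   $1: variable name, $2: new value, $3: suite path
switch_cmd() {
    echo "ecflow_client --alter=change variable $1 $2 $3"
}
```

The synchronisation command would typically be run by a non-time-critical task at the end of each cycle, so that a switch of STHOST finds an up-to-date copy of the suite's files on the other file system.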

Data archiving

Users wishing to set up Member State time-critical suites at the Centre should carefully consider their requirements regarding the long term storage of the products of their suite.

In particular, they should consider if they want to archive their time-critical application’s output data in MARS. The same recommendation applies to users wishing to consider the storage of their suite’s output in the on-line FDB (ECMWF’s Fields Data Base).  In both cases, users are advised to discuss requirements with their User Services technical contact.

For most users we recommend that their time-critical application’s output data is stored in the ECFS system, if this is required.

Note that no time-critical task in your suite should depend on the completion of an archiving task.

Please also note that possible users of your suite’s products should be advised not to depend on the availability of such products in any part of the DHS system (both ECFS and MARS archive), as this service can be unavailable for several hours.

Making changes to the suite

Once your Option 2 suite has been approved and declared to be running in time-critical mode, we recommend that you do not change it directly for new developments. Instead, define a similar suite to run in parallel with the time-critical one and test any changes under that suite first. Only once the changes have been tested in the parallel suite should you implement them in the time-critical suite. When you make important changes to your suite, we recommend that you inform ECMWF via the Support Portal.

At ECMWF, we will set up the appropriate information channels to keep you aware of changes that may affect your time-critical activity. The most appropriate tool is a mailing list.

Feedback

We welcome any comments on this document and the framework for time-critical applications. In particular, please let us know if any of the general-purpose scripts mentioned above do not meet your requirements, and we will try to incorporate the changes needed. All feedback can be provided via the ECMWF Support Portal.