A zombie is a running job that fails authentication when communicating with the ecflow_server.
How are zombies created?
There is a wide variety of reasons why a zombie is created.
The most common causes are due to user action:
- The node tree is deleted, replaced or reloaded whilst jobs are running
- A task is rerun whilst in a submitted or active state
- A job is forced to a new state, e.g. complete
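The user actions listed above share one effect: the server's record of the task no longer matches the job that is still running. The following is a toy model of that idea, not ecFlow's implementation. The `Task` class, its `password` field and the `authenticate` helper are hypothetical names chosen for illustration, and the rule that every submission receives a fresh credential is an assumption made only for this sketch.

```python
import itertools
from dataclasses import dataclass

# Hypothetical credential source: a new value for every submission.
_counter = itertools.count(1)


@dataclass
class Task:
    """Hypothetical server-side record of a task (illustration only)."""
    state: str = "queued"
    password: str = ""

    def submit(self) -> str:
        # Assumption: each submission hands the job a fresh credential.
        self.password = f"pass-{next(_counter)}"
        self.state = "submitted"
        return self.password

    def force_complete(self) -> None:
        # A user forces the state while a job may still be running.
        self.state = "complete"


def authenticate(task: Task, job_password: str) -> bool:
    """Accept a child command only if the server still expects a job
    for this task and the job presents the current credential."""
    return task.state in ("submitted", "active") and job_password == task.password


task = Task()
first_job = task.submit()
assert authenticate(task, first_job)  # a normal job is accepted

# The task is rerun while the first job is still running.
second_job = task.submit()
first_is_zombie = not authenticate(task, first_job)

# The state is forced to complete while the second job is still running.
task.force_complete()
second_is_zombie = not authenticate(task, second_job)

print(first_is_zombie, second_is_zombie)  # True True
```

In both cases the job itself is healthy; it becomes a zombie only because the server's view of the task changed underneath it.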
Rarer causes might be:
How can zombies be handled?
The default behaviour for the init, complete, abort and wait child commands is to block the job; for the event, label and meter child commands it is to fob. (This applies from version 4.0.4; previously, all zombie child commands were blocked.)
This is done for a period of 24 hours. (This period is configurable; see ECF_TIMEOUT on ecflow_client.)
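The defaults described above can be summarised as a small lookup table. This is a sketch for illustration only: `DEFAULT_ZOMBIE_ACTION`, `DEFAULT_BLOCK_PERIOD_SECONDS` and `default_action` are hypothetical names, not part of ecFlow, and expressing the 24-hour period in seconds is an assumption made for this example.

```python
# Default response to a zombie's child command (from version 4.0.4).
DEFAULT_ZOMBIE_ACTION = {
    "init": "block",
    "complete": "block",
    "abort": "block",
    "wait": "block",
    "event": "fob",
    "label": "fob",
    "meter": "fob",
}

# Default period for which blocking is applied: 24 hours.
# Assumption: expressed here in seconds purely for illustration.
DEFAULT_BLOCK_PERIOD_SECONDS = 24 * 60 * 60


def default_action(child_command: str) -> str:
    """Return the default zombie action for a child command name."""
    return DEFAULT_ZOMBIE_ACTION[child_command]


blocked = sorted(c for c, a in DEFAULT_ZOMBIE_ACTION.items() if a == "block")
print(blocked)                       # ['abort', 'complete', 'init', 'wait']
print(default_action("meter"))       # fob
print(DEFAULT_BLOCK_PERIOD_SECONDS)  # 86400
```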
The jobs can also be configured, so that if the server denies the communication, then
ecflowview provides a dialog which lists all the zombies and the actions that can be taken. These include:
Of the four actions above, only Rescue will allow the child command to change the state of the node tree.
What to do
- Create a zombie by starting a task, and setting it to complete immediately via ecflowview
- Inspect the log file; it will show you how the zombie has arisen.
- Inspect the zombie dialog in ecflow_ui (right mouse button selection on the host node)
- Experiment with the different actions on the zombie