...
ecflow_client --alter=add zombie "ecf:adopt:complete:" /suiteZ
Semi-Automated
Sometimes zombies arise for more obscure reasons. For example, the job sends an --init message to the server while the server is busy (e.g. processing jobs). By the time the server makes the task active and replies to the client/job, the ecflow_client has timed out, so it sends the same message again. This time the server treats the child command as a zombie, since the task is already active. Hence we get these false zombies.
These scenarios are very rare, but tend to occur in the following situations:
...
...
To diagnose these cases, look at the server log file. Typically you will see the same child command (--init or --complete) logged two or more times for one task; the second and any later occurrences are treated as zombies.
To get around this issue, you can add the variable ECF_NONSTRICT_ZOMBIES, which reduces these false zombies.
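The log check described above can be partly scripted. The following is only a rough sketch: the function name find_repeated_child_cmds is hypothetical, and it assumes that each child command is logged on a single line containing "chd:<command> <task path>". Check a few lines of your own log first and adjust the pattern if the format differs.

```shell
# Hypothetical helper: print each "chd:<command> <task path>" pair that occurs
# more than once in the given log file.
# ASSUMPTION: child commands are logged as "chd:init <path>" / "chd:complete <path>".
find_repeated_child_cmds() {
    log_file="$1"
    # Extract only the command/path pair, then keep pairs that appear at least twice.
    grep -oE 'chd:(init|complete) [^ ]+' "$log_file" | sort | uniq -d
}
```

Calling find_repeated_child_cmds with the path to the server log prints one line per repeated pair. A task that was legitimately re-run will also be listed, so compare the timestamps in the log before concluding that a repeat is a false zombie.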
ecflow_client --alter=add variable ECF_NONSTRICT_ZOMBIES 1 / # adds the variable at the root/server level, and hence affects all suites on the server
ecflow_client --alter=add variable ECF_NONSTRICT_ZOMBIES 1 /suiteX # adds the variable at the suite level, and hence only affects this suite