...
Manual
When zombies arise, they can be handled manually in ecflow_ui (see Zombie) or via the command-line interface:
...
It is also possible to ask ecflow_server to make the same response automatically. However, this should be considered very carefully beforehand, since an automated response could mask a serious underlying problem.
The automated response can be defined statically, using the Python or text interface, or dynamically (add/remove) via the alter command:
- Python interface (see ecflow.ZombieAttr)
- text interface (see Definition file Grammar)
zombie ::= "zombie" >> `zombie_type` >> ":" >> !(`client_side_action` | `server_side_action`) >> ":" >> *`child` >> ":" >> !`zombie_life_time`
zombie_type ::= "user" | "ecf" | "path" | "ecf_pid" | "ecf_passwd" | "ecf_pid_passwd"
child ::= "init" | "event" | "meter" | "label" | "wait" | "abort" | "complete" | "queue"
client_side_action ::= "fob" | "fail" | "block"
server_side_action ::= "adopt" | "delete" | "kill"
zombie_life_time ::= unsigned integer (default: user(300), ecf(3600), path(900)); the server poll timer runs every 60 seconds, hence this is the effective minimum value
Where:
ecf_pid - PID mismatch, password matches. Job scheduled twice. Check the submitter.
ecf_pid_passwd - Both PID and password mismatch. Re-queue and submit of the active job?
ecf_passwd - Password mismatch, PID matches. Has the system recycled the PID, or has the job file been hacked?
ecf - Two init commands, or the task is complete or aborted but receives another child command.
user - Created by user action.
path - Task not found. Nodes were replaced whilst jobs were running.
- alter command (dynamic)
ecflow_client --alter add zombie <zombie-attribute> <path>
ecflow_client --alter delete zombie <ecf | path | user> <path>
Note, however, that the effect is only seen when the child command next attempts to communicate with the server.
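Since a malformed `<zombie-attribute>` is easy to type, it can help to check the string against the grammar given earlier before passing it to `--alter`. The following is a minimal sketch in plain Python, not part of ecFlow: the names `ZombieSpec` and `parse_zombie_attribute` are hypothetical, the leading `zombie` keyword is assumed to be omitted, and multiple child commands are assumed to be comma-separated (the grammar above does not state the separator).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Token sets copied from the grammar in this section.
ZOMBIE_TYPES = {"user", "ecf", "path", "ecf_pid", "ecf_passwd", "ecf_pid_passwd"}
CHILD_CMDS = {"init", "event", "meter", "label", "wait", "abort", "complete", "queue"}
CLIENT_ACTIONS = {"fob", "fail", "block"}
SERVER_ACTIONS = {"adopt", "delete", "kill"}
# Defaults are only stated for these three types (presumably seconds).
DEFAULT_LIFETIME = {"user": 300, "ecf": 3600, "path": 900}


@dataclass(frozen=True)
class ZombieSpec:
    """Hypothetical container for one parsed zombie attribute."""
    zombie_type: str
    action: Optional[str]        # None when the action field is empty
    children: Tuple[str, ...]    # empty tuple when no child commands are listed
    lifetime: Optional[int]      # None when empty and no default is documented


def parse_zombie_attribute(text: str) -> ZombieSpec:
    """Validate 'type:action:children:lifetime' against the grammar."""
    fields = [f.strip() for f in text.strip().split(":")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 ':'-separated fields, got {len(fields)}")
    ztype, action, children_field, lifetime_field = fields

    if ztype not in ZOMBIE_TYPES:
        raise ValueError(f"unknown zombie type: {ztype!r}")
    if action and action not in CLIENT_ACTIONS | SERVER_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")

    # Assumption: child commands are comma-separated.
    children = tuple(c.strip() for c in children_field.split(",") if c.strip())
    unknown = [c for c in children if c not in CHILD_CMDS]
    if unknown:
        raise ValueError(f"unknown child command(s): {unknown}")

    if lifetime_field:
        if not (lifetime_field.isascii() and lifetime_field.isdigit()):
            raise ValueError(f"lifetime must be an unsigned integer: {lifetime_field!r}")
        lifetime = int(lifetime_field)
    else:
        lifetime = DEFAULT_LIFETIME.get(ztype)

    return ZombieSpec(ztype, action or None, children, lifetime)


if __name__ == "__main__":
    print(parse_zombie_attribute("ecf:fob:label:300"))
```

The optional fields (`!` in the grammar) are allowed to be empty, so a string such as `path:::` is accepted and picks up the documented default lifetime for `path`.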
The zombie attribute is inherited in the same manner as Variable inheritance.
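To illustrate what inheritance means here, the sketch below models a nearest-ancestor lookup: a node uses its own zombie attribute for a given type if it has one, otherwise the closest enclosing family or suite that defines it. This is an assumption about how the lookup behaves, based on the comparison with variable inheritance; the `Node` class and `find_zombie` method are hypothetical and are not the ecFlow API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a suite, family or task."""
    name: str
    parent: Optional["Node"] = None
    # Maps a zombie type (e.g. "ecf") to its attribute string.
    zombies: Dict[str, str] = field(default_factory=dict)

    def find_zombie(self, zombie_type: str) -> Optional[str]:
        """Walk from this node towards the suite; the first match wins."""
        node: Optional[Node] = self
        while node is not None:
            if zombie_type in node.zombies:
                return node.zombies[zombie_type]
            node = node.parent
        return None


# Suite s1 defines an attribute; t1 inherits it, t2 overrides it locally.
s1 = Node("s1", zombies={"ecf": "ecf:fob:label:"})
f1 = Node("f1", parent=s1)
t1 = Node("t1", parent=f1)
t2 = Node("t2", parent=f1, zombies={"ecf": "ecf:fail::"})

if __name__ == "__main__":
    print(t1.find_zombie("ecf"))
    print(t2.find_zombie("ecf"))
```

Under this model, an attribute added at suite level applies to every task beneath it unless a node closer to the task defines one of the same type.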
Example: for tasks under suite “s1”, add a zombie attribute such that child label commands (i.e. ecflow_client --label) never block the job (not strictly needed, as this is the default behaviour):
...
...