...
python interface (see ecflow.ZombieAttr)
text interface (see Definition file Grammar)
zombie             ::= "zombie" >> `zombie_type` >> ":" >> !(`client_side_action` | `server_side_action`) >> ":" >> *`child` >> ":" >> !`zombie_life_time`
zombie_type        ::= "user" | "ecf" | "path" | "ecf_pid" | "ecf_passwd" | "ecf_pid_passwd"
child              ::= "init" | "event" | "meter" | "label" | "wait" | "abort" | "complete"
client_side_action ::= "fob" | "fail" | "block"
server_side_action ::= "adopt" | "delete" | "kill"
zombie_life_time   ::= unsigned integer (default: user(300), ecf(3600), path(900)). The server poll timer runs every 60 seconds, hence this is the effective minimum value.
Where:
ecf_pid - PID mismatch, password matches. Job scheduled twice. Check the submitter.
ecf_pid_passwd - Both PID and password mismatch. Re-queue and submit of an active job?
ecf_passwd - Password mismatch, PID matches. Has the system recycled the PID, or has the job file been hacked?
ecf - Two init commands, or the task is complete or aborted but receives another child command.
ecf_user - Created by user action.
ecf_path - Task not found. Nodes replaced whilst jobs were running.
- --alter command (dynamic)
ecflow_client --alter add zombie <zombie-attribute> <path>
ecflow_client --alter delete zombie <ecf | path | user> <path>
Note, however, that the effect will only be seen when the child command makes its next attempt to communicate with the server.
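The text form of the attribute can be checked before it is handed to `--alter` or written into a definition file. The following is a minimal sketch, not part of ecFlow: `parse_zombie_attr` is a hypothetical helper that follows the grammar quoted above, and it assumes that child commands are comma-separated, as in strings such as `ecf:fob:label,event,meter:`. It takes the part after the `zombie` keyword.

```python
# Hypothetical helper, not an ecFlow API: validates the
# "<type>:<action>:<children>:<lifetime>" string from the grammar above.
ZOMBIE_TYPES = {"user", "ecf", "path", "ecf_pid", "ecf_passwd", "ecf_pid_passwd"}
CHILD_CMDS = {"init", "event", "meter", "label", "wait", "abort", "complete"}
CLIENT_ACTIONS = {"fob", "fail", "block"}
SERVER_ACTIONS = {"adopt", "delete", "kill"}


def parse_zombie_attr(text):
    """Split a zombie attribute string into its fields, or raise ValueError."""
    fields = text.strip().split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(fields)}")
    zombie_type, action, children, lifetime = fields

    if zombie_type not in ZOMBIE_TYPES:
        raise ValueError(f"unknown zombie type: {zombie_type!r}")
    # The action is optional in the grammar, so an empty field is accepted.
    if action and action not in CLIENT_ACTIONS | SERVER_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")

    # Assumption: children are comma-separated; an empty list means "all".
    child_list = [c for c in children.split(",") if c]
    unknown = [c for c in child_list if c not in CHILD_CMDS]
    if unknown:
        raise ValueError(f"unknown child command(s): {unknown}")

    # The lifetime is an optional unsigned integer (seconds).
    if lifetime and not (lifetime.isascii() and lifetime.isdigit()):
        raise ValueError(f"lifetime must be an unsigned integer: {lifetime!r}")

    return {
        "type": zombie_type,
        "action": action or None,
        "children": child_list,
        "lifetime": int(lifetime) if lifetime else None,
    }
```

An omitted lifetime is returned as `None` rather than filled in, since the defaults are applied by the server and are only listed above for three of the types.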
...
Example: For tasks under suite "s1", add a zombie attribute such that a job issuing the child commands (event, meter, label) never blocks. (This is not strictly needed, as it is the default behaviour from release 4.0.5 onwards.)
python
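import ecflow
from ecflow import ChildCmdType, ZombieAttr, ZombieType, ZombieUserActionType

s1 = ecflow.Suite('s1')
# Child commands to which the zombie attribute applies
child_list = [ChildCmdType.label, ChildCmdType.event, ChildCmdType.meter]
# ecf zombies issuing these commands are fobbed off; zombie lifetime is 300 seconds
zombie_attr = ZombieAttr(ZombieType.ecf, child_list, ZombieUserActionType.fob, 300)
s1.add_zombie(zombie_attr)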
text
suite s1
  zombie ecf:fob:label,event,meter:
- alter
ecflow_client --alter=add zombie "ecf:fob:label,event,meter:" /s1
...
python
import ecflow
from ecflow import ZombieAttr, ZombieType, ZombieUserActionType

with ecflow.Suite('s1') as s1:
    with s1.add_family("critical") as crit:
        child_list = []  # empty child list means apply to all child commands
        # Add one 'fail' attribute per zombie type
        for zombie_type in (ZombieType.ecf, ZombieType.path, ZombieType.user,
                            ZombieType.ecf_pid, ZombieType.ecf_passwd,
                            ZombieType.ecf_pid_passwd):
            crit.add_zombie(ZombieAttr(zombie_type, child_list, ZombieUserActionType.fail, 300))
text
suite s1
  family critical
    zombie ecf:fail::
    zombie path:fail::
    zombie user:fail::
    zombie ecf_pid:fail::
    zombie ecf_passwd:fail::
    zombie ecf_pid_passwd:fail::
- alter
ecflow_client --alter=add zombie "ecf:fail::" /s1
ecflow_client --alter=add zombie "path:fail::" /s1
ecflow_client --alter=add zombie "user:fail::" /s1
ecflow_client --alter=add zombie "ecf_pid:fail::" /s1
ecflow_client --alter=add zombie "ecf_passwd:fail::" /s1
ecflow_client --alter=add zombie "ecf_pid_passwd:fail::" /s1
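The six commands above differ only in the zombie type, so they can be generated rather than typed. The following is a minimal sketch: `alter_add_zombie_commands` is a hypothetical helper (not part of ecFlow) that only builds the argument lists and does not contact a server. It assumes `ecflow_client` is on the PATH if the lists are later passed to `subprocess.run`.

```python
import shlex

# Zombie types, in the order used by the commands above.
ZOMBIE_TYPES = ("ecf", "path", "user", "ecf_pid", "ecf_passwd", "ecf_pid_passwd")


def alter_add_zombie_commands(node_path, action="fail"):
    """Build one 'ecflow_client --alter=add zombie ...' argv list per zombie type.

    The child list and lifetime fields are left empty, as in the commands above.
    """
    return [
        ["ecflow_client", "--alter=add", "zombie", f"{ztype}:{action}::", node_path]
        for ztype in ZOMBIE_TYPES
    ]


# Print the commands in a shell-safe form for review; nothing is executed.
for argv in alter_add_zombie_commands("/s1"):
    print(shlex.join(argv))
```

Keeping each command as a list of arguments avoids shell quoting problems with the colons in the attribute string. Each list can be run with `subprocess.run(argv, check=True)` once a server is available.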
Here are some further examples of using --alter:
...