aborted | When the ECF_JOB_CMD fails, or the job file sends an ecflow_client --abort child command, the task is placed into the aborted state. | ||||||||||||||||||||||||||||||||||||||||||||
active | If job creation was successful and the job file has started, the ecflow_client --init child command is received by the ecflow_server and the task is placed into the active state. | ||||||||||||||||||||||||||||||||||||||||||||
autocancel | autocancel is a way to automatically delete a node that has completed. The deletion may be delayed by an amount of time in hours and minutes, or expressed in days. Any node may have a single autocancel attribute. If the auto-cancelled node is referenced in the trigger expression of other nodes, it may leave those nodes waiting. This can be solved by making sure the trigger expression also checks for the unknown state, i.e.:
    trigger node_to_cancel == complete or node_to_cancel == unknown
This guards against 'node_to_cancel' being undefined or deleted. For python see ecflow.Autocancel and ecflow.Node.add_autocancel. For text BNF see autocancel | ||||||||||||||||||||||||||||||||||||||||||||
check point | The check point file is like the suite definition, but includes all the state information. It is periodically saved by the ecflow_server. It can be used to recover the state of the node tree should the server die or the machine crash. By default, when an ecflow_server is started it will look for the check point file and load it. The default check point file name is <host>.<port>.ecf.check. This can be overridden by the ECF_CHECK environment variable. | ||||||||||||||||||||||||||||||||||||||||||||
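The naming rule above can be sketched in a few lines of Python. This is only an illustration, not ecFlow's own code: the helper name `checkpoint_file_name` and its parameters are hypothetical, and it assumes that a non-empty ECF_CHECK value replaces the default name as-is.

```python
import os


def checkpoint_file_name(host, port, environ=None):
    """Return the check point file name a server on host:port would use.

    Assumption (not taken from ecFlow's source): a non-empty ECF_CHECK
    value simply replaces the default '<host>.<port>.ecf.check' name.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("ECF_CHECK")
    if override:
        return override
    return "{}.{}.ecf.check".format(host, port)
```

Passing an explicit dictionary instead of the real environment keeps the sketch easy to try out without changing any shell settings.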
child command | Child commands (or task requests) are called from within the ecf script files. The table also includes the default action (from version 4.0.4) if the child command is part of a zombie. They include:
The following environment variables must be set for the child commands: ECF_NODE, ECF_NAME, ECF_PASS and ECF_RID. See ecflow_client. | ||||||||||||||||||||||||||||||||||||||||||||
clock | A clock is an attribute of a suite. A gain can be specified to offset from the given date. The hybrid and real clocks always run in phase with the system clock (UTC in UNIX) but can have any offset from the system clock. The clock can be:
time, day, date and cron dependencies work a little differently under the clocks. If the ecflow_server is shutdown or halted, job scheduling is suspended. If this suspension is left for a period of time, it can affect task submission under hybrid and real clocks. In particular it will affect tasks with time, today or cron dependencies.
For python see ecflow.Clock and ecflow.Suite.add_clock . For text BNF see clock | ||||||||||||||||||||||||||||||||||||||||||||
complete | The node can be set to complete:
| ||||||||||||||||||||||||||||||||||||||||||||
complete expression | Forces a node to be complete if the expression evaluates to true, without running any of the nodes. This allows you to have tasks in the suite which are run only if others fail. In practice the node would also need to have a trigger. For python see ecflow.Expression and ecflow.Node.add_complete | ||||||||||||||||||||||||||||||||||||||||||||
cron | Like time, cron defines a time dependency for a node, but it will be repeated indefinitely:
    cron 11:00
    cron 10:00 22:00 00:30   # <start> <finish> <increment>
When the node becomes complete it will be queued immediately. This means that the suite will never complete, and the output is not directly accessible through ecflowview. If a task aborts, the ecflow_server will not schedule it again. If the time the job takes to complete is longer than the interval, a time "slot" is missed, e.g. for
    cron 10:00 20:00 01:00
if the 10:00 run takes more than an hour, the 11:00 run will never occur. If the cron defines months, days of the month, or week days, or a single time slot, then it relies on a day change. Hence if a hybrid clock is defined, the node will be set to complete at the beginning of the suite, without running the corresponding job. Otherwise under a hybrid clock the suite would never complete. For python see ecflow.Cron and ecflow.Node.add_cron. For text BNF see cron | ||||||||||||||||||||||||||||||||||||||||||||
date | This defines a date dependency for a node. There can be multiple date dependencies. The European format is used for dates, which is dd.mm.yyyy, as in 31.12.2007. Any of the three number fields can be expressed with a wildcard * to mean any valid value. Thus, 01.*.* means the first day of every month of every year. If a hybrid clock is defined, any node held by a date dependency will be set to complete at the beginning of the suite, without running the corresponding job. Otherwise under a hybrid clock the suite would never complete. For python see ecflow.Date and ecflow.Node.add_date. For text BNF see date | ||||||||||||||||||||||||||||||||||||||||||||
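To make the wildcard rule concrete, here is a minimal sketch of how a pattern such as 01.*.* could be checked against a calendar date. It is an illustration only and not how the ecflow_server is implemented; the function name `date_matches` is hypothetical.

```python
import datetime


def date_matches(pattern, day):
    """Return True if `day` satisfies a 'dd.mm.yyyy' pattern.

    Each of the three fields may be '*' to accept any value.
    """
    fields = pattern.split(".")
    if len(fields) != 3:
        raise ValueError("expected dd.mm.yyyy, got %r" % pattern)
    actual = (day.day, day.month, day.year)
    # A field matches when it is a wildcard or numerically equal.
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))
```

For example, `date_matches("01.*.*", datetime.date(2007, 3, 1))` holds for the first day of any month, while a fully specified pattern such as 31.12.2007 matches exactly one day.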
day | This defines a day dependency for a node. There can be multiple day dependencies. If a hybrid clock is defined, any node held by a day dependency will be set to complete at the beginning of the suite , without running the corresponding job. Otherwise under a hybrid clock the suite would never complete . For python see: ecflow.Day and ecflow.Node.add_day . For text BNF see day | ||||||||||||||||||||||||||||||||||||||||||||
defstatus | Defines the default status to be assigned to a task/family node when the begin command is issued. By default a node is queued when you use begin on a suite. defstatus is useful for preventing suites from running automatically once begun, or for setting tasks complete so they can be run selectively. For python see ecflow.DState and ecflow.Node.add_defstatus. For text BNF see defstatus | ||||||||||||||||||||||||||||||||||||||||||||
dependencies | Dependencies are attributes of a node that can suppress/hold a task from taking part in job creation. They include trigger, date, day, time, today, cron, complete expression, inlimit and limit. A task that is dependent cannot be started as long as some dependency is holding it or any of its parent nodes. The ecflow_server will check the dependencies every minute, during normal scheduling, and when any child command causes a state change in the suite definition. | ||||||||||||||||||||||||||||||||||||||||||||
directives | Directives start with a % character. This is referred to as the ECF_MICRO character. Directives are used in two main contexts.
Directives are expanded during pre-processing . Examples include:
| ||||||||||||||||||||||||||||||||||||||||||||
ecf file location algorithm | ecflow_server and job creation checking uses the following algorithm to locate the ‘.ecf’ file corresponding to a task :
| ||||||||||||||||||||||||||||||||||||||||||||
ecf script | The ecFlow script refers to an '.ecf' file. The script file is transformed into the job file by the job creation process. The base name of the script file must match its corresponding task, i.e. t1.ecf corresponds to the task named 't1'. If the script is placed in the ECF_FILES directory, it may be re-used by multiple tasks belonging to different families, provided the task name matches. The ecFlow script is similar to a UNIX shell script. The differences, however, include the addition of "C"-like pre-processing directives and ecFlow variables. The script must also include calls to the init and complete child commands so that the ecflow_server is aware when the job starts (i.e. changes state to active) and finishes (i.e. changes state to complete). | ||||||||||||||||||||||||||||||||||||||||||||
ECF_DUMMY_TASK | This is a user variable that can be added to a task to indicate that there is no associated ecf script file. If this variable is added to a suite or family then all child tasks are treated as dummy. This stops the server from reporting an error during job creation.
    edit ECF_DUMMY_TASK ''
 | ||||||||||||||||||||||||||||||||||||||||||||
ECF_JOB | This is a generated variable. It defines the path name location of the job file. The variable is composed as: ECF_HOME/ECF_NAME.job<ECF_TRYNO> | ||||||||||||||||||||||||||||||||||||||||||||
ECF_JOBOUT | This is a generated variable. It defines the path name for the job output file. The variable is composed as follows. If ECF_OUT is specified: ECF_OUT/ECF_NAME.ECF_TRYNO, otherwise: ECF_HOME/ECF_NAME.ECF_TRYNO | ||||||||||||||||||||||||||||||||||||||||||||
ECF_MICRO | This is a suite and generated variable. The default value is %. This variable is used in variable substitution during command invocation, and as the default directive character during pre-processing. It can be overridden, but must be replaced by a single character. | ||||||||||||||||||||||||||||||||||||||||||||
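As a rough sketch of how the ECF_JOB and ECF_JOBOUT paths are composed from the other variables, the helpers below join the pieces as described above. The function names `ecf_job` and `ecf_jobout` are hypothetical, and the example values (a home directory and a task path such as /s1/f1/t1) are assumptions for illustration.

```python
def _join(base, ecf_name):
    """Join a base directory and a node path without doubling the slash."""
    return base.rstrip("/") + "/" + ecf_name.lstrip("/")


def ecf_job(ecf_home, ecf_name, ecf_tryno):
    """Compose ECF_HOME/ECF_NAME.job<ECF_TRYNO>."""
    return "{}.job{}".format(_join(ecf_home, ecf_name), ecf_tryno)


def ecf_jobout(ecf_home, ecf_name, ecf_tryno, ecf_out=None):
    """Compose the job output path, preferring ECF_OUT when it is set."""
    base = ecf_out if ecf_out else ecf_home
    return "{}.{}".format(_join(base, ecf_name), ecf_tryno)
```

With a home of /home/ecf, a task path of /s1/f1/t1 and try number 1, this gives /home/ecf/s1/f1/t1.job1 for the job file and /home/ecf/s1/f1/t1.1 for its output.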
ECF_NAME | This is a generated variable . It defines the path name of the task. | ||||||||||||||||||||||||||||||||||||||||||||
ECF_NONSTRICT_ZOMBIES | When the server is heavily overloaded, or when it runs on a virtual machine where the scripts/jobs are not local, the server can spend a lot of time in job generation. This can affect clients that try to communicate with the server: the client calls in the jobs time out, resulting in zombies. When a child command such as 'ecflow_client --init <process_id>' is called in the job, the server receives the request and logs it, but by the time the server replies to the job's ecflow_client, the client has already timed out. The job then sends the same message again, and this time the server treats the request as a zombie. The net result is that the job suspends, since the server treats it as a zombie. The default behaviour for zombies is to block. However, when ECF_NONSTRICT_ZOMBIES is added as a variable to defs/suite/family/task, the server behaves as follows:
This variable can be added/removed using the alter functionality. The following example adds the variable at the server level, and hence affects all suites:
    ecflow_client --alter add variable ECF_NONSTRICT_ZOMBIES 1 /
 | ||||||||||||||||||||||||||||||||||||||||||||
ECF_SCRIPT | This is a generated variable. It defines the path name for the ecf script. | ||||||||||||||||||||||||||||||||||||||||||||
ECF_TRYNO | This is a generated variable that is used in file name generation. It represents the current try number for the task. After begin it is set to 1. The number is advanced if the job is re-run. It is reset to 1 after a re-queue. It is used in output and job file numbering (i.e. it avoids overwriting the job file output during multiple re-runs). | ||||||||||||||||||||||||||||||||||||||||||||
ECF_OUT | This is a user/suite variable that specifies a directory PATH. It controls the location of job output (stdout and stderr of the process) on a remote file system. It provides an alternate location for the job and cmd output files. If it exists, it is used as a base for ECF_JOBOUT, but it is also used by ecflow to search for the output when asked by ecflowview/CLI. If the output is in ECF_HOME/ECF_NAME.ECF_TRYNO it is returned, otherwise ECF_OUT/ECF_NAME.ECF_TRYNO is used.
The user must ensure that all the directories exist, including suite/family. If this is not done, you may well find that the task remains stuck in a submitted state. At ECMWF our submission scripts ensure that the directories exist. | ||||||||||||||||||||||||||||||||||||||||||||
ecFlow | The ECMWF work flow manager: a general-purpose application designed to schedule a large number of computer processes in a heterogeneous environment. It helps with the design, submission and monitoring of computer jobs in both the research and operations departments. | ||||||||||||||||||||||||||||||||||||||||||||
ecflow_client | This executable is a command line program; it is used for all communication with the ecflow_server . To see the full range of commands that can be sent to the ecflow_server type the following in a UNIX shell:
This functionality is also provided by the Client Server API. The following variables affect the execution of ecflow_client. Since the ecf script can call ecflow_client (i.e. child command), some are typically set in an include header, e.g. head.h.
| ||||||||||||||||||||||||||||||||||||||||||||
ecflow_server | This executable is the server. It is responsible for scheduling the jobs and responding to ecflow_client requests. Multiple servers can be run on the same machine/host provided each is assigned a unique port number. The server records all requests in the log file. The server will periodically (see ECF_CHECKINTERVAL) write out a check point file. The following environment variables control the execution of the server and may be set before the server is started. ecflow_server will start happily without any of these variables being set, since all of them have default values.
The server can be in several states. The default when first started is halted. See server states | ||||||||||||||||||||||||||||||||||||||||||||
ecflowview | The ecflowview executable is the GUI-based client that is used to visualise and monitor the hierarchical structure of the suite definition.
| ||||||||||||||||||||||||||||||||||||||||||||
event | The purpose of an event is to signal partial completion of a task and to be able to trigger another job which is waiting for this partial completion. Only tasks can have events, and they can be considered as an attribute of a task. There can be many events and they are displayed as nodes. The event is updated by placing the --event child command in an ecf script. An event has a number and possibly a name. If it is only defined as a number, its name is the text representation of the number without leading zeroes. For python see ecflow.Event and ecflow.Node.add_event. For text BNF see event. If the event child command results in a zombie, then the default action is for the server to fob; this allows the ecflow_client command to exit normally (i.e. without any errors). This default can be overridden by using a zombie attribute. Events can be referenced in trigger and complete expressions. | ||||||||||||||||||||||||||||||||||||||||||||
extern | This allows an external node to be used in a trigger expression. All nodes in triggers must be known to the ecflow_server by the end of the load command. No cross-suite dependencies are allowed unless the names of tasks outside the suite are declared as external. An external trigger reference is considered unknown if it is not defined when the trigger is evaluated. You are strongly advised to avoid cross-suite dependencies. Families and suites that depend on one another should be placed in a single suite. If you think you need cross-suite dependencies, you should consider merging the suites together and having each as a top-level family in the merged suite. For BNF see extern | ||||||||||||||||||||||||||||||||||||||||||||
family | A family is an organisational entity that is used to provide hierarchy and grouping. It consists of a collection of tasks and families. Typically you place tasks that are related to each other inside the same family, analogous to the way you create directories to contain related files. It serves as an intermediate node in a suite definition. For python see ecflow.Family. For BNF see family | ||||||||||||||||||||||||||||||||||||||||||||
halted | An ecflow_server state. See server states | ||||||||||||||||||||||||||||||||||||||||||||
hybrid clock | A hybrid clock is a complex notion: the date and time are not connected. The date has a fixed value during the complete execution of the suite . This will be mainly used in cases where the suite does not complete in less than 24 hours. This guarantees that all tasks of this suite are using the same date . On the other hand, the time follows the time of the machine. Hence the date never changes unless specifically altered or unless the suite restarts, either automatically or from a begin command. Under a hybrid clock any node held by a date , day or cron dependency will be set to complete at the beginning of the suite. (i.e without its job ever running). Otherwise the suite would never complete . | ||||||||||||||||||||||||||||||||||||||||||||
inlimit | The inlimit works in conjunction with limit / ecflow.Limit for providing simple load management inlimit is added to the node that needs to be limited. For python see ecflow.InLimit and ecflow.Node.add_inlimit . For text BNF see inlimit | ||||||||||||||||||||||||||||||||||||||||||||
job creation | Job creation or task invocation can be initiated manually via ecflowview but also by the ecflow_server during scheduling when a task (and all of its parent node s) is free of its dependencies . The process of job creation includes:
The steps above transform an ecf script into a job file that can be submitted by performing variable substitution on the ECF_JOB_CMD variable and invoking the command. The running jobs communicate back to the ecflow_server by calling child commands. This causes status changes on the nodes in the ecflow_server, and flags can be set to indicate various events. If a task is to be treated as a dummy task (i.e. it is used as a scheduling task) and is not meant to be run, then a variable named ECF_DUMMY_TASK can be added:
    task.add_variable("ECF_DUMMY_TASK", "")
 | ||||||||||||||||||||||||||||||||||||||||||||
job file | The job file is created by the ecflow_server during job creation, using the ECF_TRYNO variable. It is derived from the ecf script after expanding the pre-processing directives. It has the form <task name>.job<ECF_TRYNO>, i.e. t1.job1. Note that job creation checking will create a job file with an extension of zero, i.e. '.job0'. See ecflow.Defs.check_job_creation. When the job is run, the output file has the ECF_TRYNO as the extension, i.e. t1.1, where 't1' represents the task name and '1' the ECF_TRYNO. | ||||||||||||||||||||||||||||||||||||||||||||
label | A label has a name and a value and is a way of displaying information in ecflowview. By placing label child commands in the ecf script, the user can be informed about progress in ecflowview. If the label child command results in a zombie, then the default action is for the server to fob; this allows the ecflow_client command to exit normally (i.e. without any errors). This default can be overridden by using a zombie attribute. For python see ecflow.Label and ecflow.Node.add_label. For text BNF see label | ||||||||||||||||||||||||||||||||||||||||||||
late | Define a tag for a node to be late. Suites cannot be late, but you can define a late tag for submitted in a suite, to be inherited by the families and tasks. When a node is classified as being late, the only action ecflow_server takes is to set a flag. ecflowview will display these alongside the node name as an icon (and optionally pop up a window). For python see ecflow.Late and ecflow.Node.add_late . For text BNF see late | ||||||||||||||||||||||||||||||||||||||||||||
limit | Limits provide simple load management by limiting the number of tasks submitted by a specific ecflow_server. Typically you either define limits at suite level, or define a separate suite to hold limits so that they can be used by multiple suites. The limit max value can be changed on the command line:
    ecflow_client --alter change limit_max <limit-name> <new-limit-value> <path-to-limit>
    ecflow_client --alter change limit_max limit 2 /suite
It can also be changed in python:
    #!/usr/bin/env python2.7
    import ecflow
    try:
        ci = ecflow.Client()
        # Set the maximum of the limit named "limit" on /suite to 2
        ci.alter("/suite", "change", "limit_max", "limit", "2")
    except RuntimeError as e:
        print("Failed: " + str(e))
For python see ecflow.Limit and ecflow.Node.add_limit. For BNF see limit and inlimit | ||||||||||||||||||||||||||||||||||||||||||||
manual page | Manual pages are part of the ecf script. This is to ensure that the manual page is updated when the ecf script is updated. The manual page is a very important operational tool, allowing you to view a description of a task and possibly describing solutions to common problems. Pre-processing can be used to extract the manual page from the script file, and it is visible in ecflowview. The manual page is the text contained within the %manual and %end directives. It can be seen using the manual button in ecflowview. The text in the manual page is not included in the job file. There can be multiple manual sections in the same ecf script file. When viewed they are simply concatenated. It is good practice to modify the manual pages when the script changes. The manual page may contain %include directives. | ||||||||||||||||||||||||||||||||||||||||||||
meter | The purpose of a meter is to signal proportional completion of a task and to be able to trigger another job which is waiting on this proportional completion. The meter is updated by placing the --meter child command in an ecf script. For python see ecflow.Meter and ecflow.Node.add_meter. For text BNF see meter. If the meter child command results in a zombie, then the default action is for the server to fob; this allows the ecflow_client command to exit normally (i.e. without any errors). This default can be overridden by using a zombie attribute. Meters can be referenced in trigger and complete expressions. | ||||||||||||||||||||||||||||||||||||||||||||
node | suite, family and task form a hierarchy, where a suite serves as the root of the hierarchy, the family provides the intermediate nodes, and the tasks provide the leaves. Collectively suite, family and task can be referred to as nodes. For python see ecflow.Node. | ||||||||||||||||||||||||||||||||||||||||||||
pre-processing | Pre-processing takes place during job creation and acts on directives specified in ecf script file. This involves:
| ||||||||||||||||||||||||||||||||||||||||||||
queued | After the begin command, tasks without a defstatus are placed into the queued state. | ||||||||||||||||||||||||||||||||||||||||||||
real clock | A suite using a real clock will have its clock matching the clock of the machine. Hence the date advances by one day at midnight. | ||||||||||||||||||||||||||||||||||||||||||||
repeat | Repeats provide looping functionality. There can only be a single repeat on a node .
The repeat VARIABLE can be used in trigger and complete expressions. If a "repeat date" VARIABLE is used in a trigger expression, then date arithmetic is used when the expression uses addition and subtraction, i.e.:
    import ecflow
    defs = ecflow.Defs()
    s1 = defs.add_suite("s1")
    t1 = s1.add_task("t1").add_repeat(ecflow.RepeatDate("YMD", 20090101, 20091231, 1))
    t2 = s1.add_task("t2").add_trigger("t1:YMD - 1 eq 20081231")
    assert t2.evaluate_trigger(), "Expected trigger to evaluate. 20090101 - 1 == 20081231"
For python see ecflow.Node.add_repeat, ecflow.Repeat, ecflow.RepeatDate, ecflow.RepeatEnumerated, ecflow.RepeatInteger, ecflow.RepeatDay. For text BNF see repeat | ||||||||||||||||||||||||||||||||||||||||||||
running | An ecflow_server state. See server states | ||||||||||||||||||||||||||||||||||||||||||||
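The date arithmetic mentioned above differs from plain integer arithmetic: 20090101 - 1 is 20081231, not 20090100. A minimal standalone sketch of that behaviour, using only the standard library, is shown here. The helper name `ymd_add_days` is hypothetical and the code is not taken from ecFlow.

```python
import datetime


def ymd_add_days(ymd, days):
    """Shift a yyyymmdd integer by a number of days using calendar rules."""
    date = datetime.datetime.strptime(str(ymd), "%Y%m%d").date()
    shifted = date + datetime.timedelta(days=days)
    # Convert back to the yyyymmdd integer form used by repeat date.
    return int(shifted.strftime("%Y%m%d"))
```

This mirrors the trigger `t1:YMD - 1 eq 20081231`: the subtraction crosses the year boundary as a calendar operation.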
scheduling | The ecflow_server is responsible for task scheduling. It will check dependencies in the suite definition every minute. If these dependencies are free, the ecflow_server will submit the task. See job creation . | ||||||||||||||||||||||||||||||||||||||||||||
server states | The following tables reflect the ecflow_server capabilities in the different states | ||||||||||||||||||||||||||||||||||||||||||||
shutdown | An ecflow_server state. See server states | ||||||||||||||||||||||||||||||||||||||||||||
status | Each node in the suite definition has a status, which reflects the state of the node. In ecflowview the background colour of the text reflects the status. Task statuses are: unknown, queued, submitted, active, complete, aborted and suspended. ecflow_server statuses are: shutdown, halted and running; these are shown on the root node in ecflowview. | ||||||||||||||||||||||||||||||||||||||||||||
submitted | When the task dependencies are resolved/free the ecflow_server places the task into a submitted state. However if the ECF_JOB_CMD fails, the task is placed into the aborted state | ||||||||||||||||||||||||||||||||||||||||||||
suite | A suite is an organisational entity. It serves as the root node in a suite definition. It should be used to hold a set of jobs that achieve a common function. It can be used to hold user variables that are common to all of its children. Only a suite node can have a clock. It is a collection of families, variables, a repeat and a single clock definition. For a complete list of attributes look at the BNF for suite. For python see ecflow.Suite. | ||||||||||||||||||||||||||||||||||||||||||||
suite definition | The suite definition is the hierarchical node tree. It describes how your tasks run and interact. It can be built up using:
Once the definition is built, it can be loaded into the ecflow_server and started. It can be monitored by ecflowview. | ||||||||||||||||||||||||||||||||||||||||||||
suspended | A node state. A node can be placed into the suspended state via a defstatus or via ecflowview. A suspended node, including any of its children, cannot take part in scheduling until the node is resumed. | ||||||||||||||||||||||||||||||||||||||||||||
task | A task represents a job that needs to be carried out. It serves as a leaf node in a suite definition Only tasks can be submitted. A job inside a task ecf script should generally be re-entrant so that no harm is done by rerunning it, since a task may be automatically submitted more than once if it aborts. For python see ecflow.Task . For text BNF see task | ||||||||||||||||||||||||||||||||||||||||||||
time | This defines a time dependency for a node. Time is expressed in the format [h]h:mm. Only numeric values are allowed. There can be multiple time dependencies for a node, but overlapping times may cause unexpected results. To define a series of times, specify the start time, end time and a time increment. If the start time begins with '+', times are relative to the beginning of the suite or, in repeated families, relative to the beginning of the repeated family. If the time the job takes to complete is longer than the interval, a time 'slot' is missed, e.g. for
    time 10:00 20:00 01:00
if the 10:00 run takes more than an hour, the 11:00 run will never occur. For python see ecflow.Time and ecflow.Node.add_time. For BNF see time | ||||||||||||||||||||||||||||||||||||||||||||
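The series form and the missed-slot behaviour described above can be sketched as follows. This is an illustration rather than the server's scheduling code: the function names are hypothetical, relative ('+') times are not handled, and treating a slot that coincides exactly with the finish time as still usable is an assumption.

```python
def to_minutes(hhmm):
    """Convert '[h]h:mm' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(total):
    """Convert minutes since midnight back to 'hh:mm'."""
    return "{:02d}:{:02d}".format(total // 60, total % 60)


def time_slots(start, finish, increment):
    """Expand '<start> <finish> <increment>' into the list of time slots."""
    step = to_minutes(increment)
    if step <= 0:
        raise ValueError("increment must be positive")
    return [to_hhmm(m) for m in range(to_minutes(start), to_minutes(finish) + 1, step)]


def next_slot(slots, finished_at):
    """Return the first slot not earlier than the time the job finished.

    Slots that passed while the job was still running are skipped.
    Assumption: a slot equal to the finish time is still usable.
    """
    done = to_minutes(finished_at)
    for slot in slots:
        if to_minutes(slot) >= done:
            return slot
    return None
```

For `time 10:00 20:00 01:00`, a 10:00 run that finishes at 11:30 has missed the 11:00 slot, so `next_slot` returns 12:00.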
today | Like time, but if the suite's begin time is past the time given for the "today" command, the node is free to run (as far as the time dependency is concerned). For example:
    task x
      today 10:00
If we begin or re-queue the suite at 9.00 am, then the task is held until 10.00 am. However, if we begin or re-queue the suite at 11.00 am, the task is run immediately. Now let's look at time:
    task x
      time 10:00
If we begin or re-queue the suite at 9.00 am, then the task is held until 10.00 am. If we begin or re-queue the suite at 11.00 am, the task is still held. If the time the job takes to complete is longer than the interval, a "slot" is missed, e.g. for
    today 10:00 20:00 01:00
if the 10:00 run takes more than an hour, the 11:00 run will never occur. For python see ecflow.Today. For text BNF see today | ||||||||||||||||||||||||||||||||||||||||||||
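The difference between today and time at begin or re-queue can be summarised in a small sketch. It only models a single time value and only the moment of begin/re-queue; the function name `held_at_begin` is hypothetical, and treating a begin at exactly the given time as free is an assumption.

```python
def to_minutes(hhmm):
    """Convert '[h]h:mm' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def held_at_begin(kind, dependency, begin):
    """Return True if the node is still held right after begin/re-queue.

    kind       -- "today" or "time"
    dependency -- the single time given in the definition, e.g. "10:00"
    begin      -- time of day at which the suite is begun or re-queued
    """
    dep, now = to_minutes(dependency), to_minutes(begin)
    if kind == "today":
        # Free as soon as the begin time is at or past the given time.
        return now < dep
    if kind == "time":
        # Held unless the clock is exactly at the given time
        # (assumption for the equal case), even if that time has passed.
        return now != dep
    raise ValueError("kind must be 'today' or 'time'")
```

With a dependency of 10:00, both kinds hold the task when the suite begins at 9:00, but only `time` still holds it when the suite begins at 11:00.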
trigger | A trigger defines a dependency for a task or family. There can be only one trigger dependency per node, but that can be a complex boolean expression of the status of several nodes. Triggers should be avoided on suites. A node with a trigger can only be activated when its trigger has expired. A trigger holds the node as long as the trigger's expression evaluates to false. Trigger evaluation occurs whenever a child command communicates with the server, i.e. whenever there is a state change in the suite definition. The keywords in trigger expressions are: unknown, suspended, complete, queued, submitted, active, aborted, and clear and set for event status. Triggers can also reference node attributes like event, meter, variable, repeat and generated variables. Trigger evaluation for node attributes uses integer arithmetic:
Here are some examples:
    suite trigger_suite
      task a
        event EVENT
        meter METER 1 100 50
        edit VAR_INT 12
        edit VAR_STRING "captain scarlett"   # This is not convertible to an integer, if referenced will use '0'
      family f1
        edit SLEEP 2
        repeat string NAME a b c d e f       # This has values: a(0), b(1), c(2), d(3), e(4), f(5) i.e. index
        family f2
          repeat integer VALUE 5 10          # This has values: 5,6,7,8,9,10
          family f3
            repeat enumerated red green blue # red(0), green(1), blue(2)
            task t1
              repeat date DATE 19991230 20000102   # This has values: 19991230,19991231,20000101,20000102
          endfamily
        endfamily
      endfamily
      family f2
        task event_meter
          trigger /trigger_suite/a:EVENT == set and /trigger_suite/a:METER >= 30
        task variable
          trigger /trigger_suite/a:VAR_INT >= 12 and /trigger_suite/a:VAR_STRING == 0
        task repeat_string
          trigger /trigger_suite/f1:NAME >= 4
        task repeat_integer
          trigger /trigger_suite/f1/f2:VALUE >= 7
        task repeat_date
          trigger /trigger_suite/f1/f2/f3/t1:DATE >= 19991231
        task repeat_date2
          # Using plus/minus on a repeat DATE will use date arithmetic.
          # Since the starting value of DATE is 19991230, this task will run straight away.
          trigger /trigger_suite/f1/f2/f3/t1:DATE - 1 == 19991229
      endfamily
    endsuite
What happens when we have multiple node attributes of the same name referenced in trigger expressions?
    task foo
      event blah
      meter blah 0 200 50
      edit blah 10
    task bar
      trigger foo:blah >= 0
In this case ecFlow will use the following precedence: Hence in the example above the expression 'foo:blah >= 0' will reference the event. For python see ecflow.Expression and ecflow.Node.add_trigger | ||||||||||||||||||||||||||||||||||||||||||||
unknown | This is the default node status when a suite definition is loaded into the ecflow_server | ||||||||||||||||||||||||||||||||||||||||||||
user commands | User commands are any client to server requests that are not child command s. | ||||||||||||||||||||||||||||||||||||||||||||
variable | ecFlow makes heavy use of variables. There are several kinds of variables:
Variables can be referenced in trigger and complete expressions. The value part of the variable should be convertible to an integer, otherwise a default value of 0 is used. For python see ecflow.Node.add_variable. For BNF see variable | ||||||||||||||||||||||||||||||||||||||||||||
variable inheritance | When a variable is needed at job creation time, it is first sought in the task itself. If it is not found in the task , it is sought from the task’s parent and so on, up through the node levels until found. For any node , there are two places to look for variables. Suite definition variables are looked for first, and then any generated variables. | ||||||||||||||||||||||||||||||||||||||||||||
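The lookup order described above can be sketched with a tiny node class. This is a simplified model, not the ecflow.Node class: the class name `SketchNode`, its attributes and the sample values are hypothetical and exist only to show the search order (user variables before generated variables on each node, then the parent).

```python
class SketchNode:
    """Minimal stand-in for a suite, family or task holding variables."""

    def __init__(self, name, parent=None, user=None, generated=None):
        self.name = name
        self.parent = parent
        self.user = dict(user or {})            # suite definition (user) variables
        self.generated = dict(generated or {})  # generated variables

    def find_variable(self, key):
        """Search this node, then each parent, until the variable is found."""
        node = self
        while node is not None:
            if key in node.user:        # user variables are looked for first
                return node.user[key]
            if key in node.generated:   # then any generated variables
                return node.generated[key]
            node = node.parent
        return None


# Example hierarchy: suite s1 -> family f1 -> task t1
suite = SketchNode("s1", user={"SLEEP": "10", "HOST": "hpc"})
family = SketchNode("f1", parent=suite, user={"SLEEP": "5"})
task = SketchNode("t1", parent=family, generated={"ECF_TRYNO": "1"})
```

Here the task inherits SLEEP from the nearest node that defines it (the family), and HOST from the suite.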
variable substitution | Takes place during pre-processing or command invocation (i.e. ECF_JOB_CMD, ECF_KILL_CMD, etc.). It involves searching each line of the ecf script file or command for the ECF_MICRO character, typically '%'. The text between two % characters defines a variable, i.e. %VAR%. This variable is searched for in the suite definition. First the suite definition variables (sometimes referred to as user variables) are searched, then the repeat variable name, and finally the generated variables. If no variable is found then the same search pattern is repeated up the node tree. The text between the % characters is replaced by the value of the variable. If the micro characters are not paired, an error message is written to the log file and the task is placed into the aborted state. If the variable is not found in the suite definition during pre-processing, then job creation fails, an error message is written to the log file, and the task is placed into the aborted state. To avoid this, variables in the ecf script can be defined as %VAR:replacement%. This is similar to %VAR%, but if VAR is not found in the suite definition then 'replacement' is used. | ||||||||||||||||||||||||||||||||||||||||||||
virtual clock | Like a real clock, except that when the ecflow_server is suspended (i.e. shutdown or halted), the suite's clock is also suspended. Hence it will honour relative times in cron, today and time dependencies. It is possible to have a combination of hybrid/real and virtual. It is most useful when we want complete adherence to time-related dependencies at the expense of being out of sync with system time. | ||||||||||||||||||||||||||||||||||||||||||||
zombie | Zombies are running jobs that fail authentication when communicating with the ecflow_server. Child commands (init, event, meter, label, abort, complete) are placed in the ecf script file and are used to communicate with the ecflow_server. The ecflow_server authenticates each connection attempt made by the child command. Authentication can fail for a number of reasons:
When authentication fails the job is considered to be a zombie. The ecflow_server will keep a note of the zombie for a period of time, before it is automatically removed. However, the removed zombie may well re-appear (this is because each child command will continue attempting to contact the ecflow_server for 24 hours; this is configurable, see ECF_TIMEOUT on ecflow_client). For python see ecflow.ZombieAttr and ecflow.ZombieUserActionType. There are several types of zombies, see zombie type and ecflow.ZombieType | ||||||||||||||||||||||||||||||||||||||||||||
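A minimal sketch of the substitution rules above, for a single line of text, might look like this. It is not ecFlow's pre-processor: the function name `substitute` is hypothetical, the variables are passed in as a flat dictionary rather than searched up the node tree, and escaping a literal micro character is not covered.

```python
def substitute(line, variables, micro="%"):
    """Replace %VAR% and %VAR:replacement% in one line of text.

    Raises ValueError for an unpaired micro character and KeyError when a
    variable is missing and no replacement was given.
    """
    parts = line.split(micro)
    # Paired micro characters always split the line into an odd number of parts.
    if len(parts) % 2 == 0:
        raise ValueError("unpaired micro character in: " + line)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            out.append(part)  # literal text outside a micro pair
            continue
        name, sep, replacement = part.partition(":")
        if name in variables:
            out.append(str(variables[name]))
        elif sep:
            out.append(replacement)  # %VAR:replacement% fallback
        else:
            raise KeyError("variable not found: " + name)
    return "".join(out)
```

In this sketch the two exceptions stand in for the cases where the server would log an error and place the task into the aborted state.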
zombie attribute | The zombie attribute defines how a zombie should be handled in an automated fashion. Very careful consideration should be taken before this attribute is added as it may hide a genuine problem. It can be added to any node . But is best defined at the suite or family level. If there is no zombie attribute the default behaviour for init,complete,wait and abort child command s, is to block, whereas for label, event, meter the default behaviour is to fob. (from version 4.0.4, previously all child command s blocked). To add a zombie attribute in python, please see: ecflow.ZombieAttr | ||||||||||||||||||||||||||||||||||||||||||||
zombie type | See zombie and class ecflow.ZombieAttr for further information. There are several types of zombies:
The type of the zombie is not fixed and may change. |