The environment is set up by calling
module load ecflow
When the module command is not available on a platform, the original use command sets the PATH and PYTHONPATH variables
use ecflow
A specific version can be selected with
module unload ecflow; module load ecflow/4.0.2
The server can be started with
ecflow_start.sh
The client command can be called to display its self-contained documentation
ecflow_client --help
and the graphical interface is started with
# line below shall add localhost as part of
# the ecflowview->Servers list
grep localhost $HOME/.ecflowrc/servers || echo \
  "localhost $(uname -n) $((1500 + $(id -g)))" \
  >> $HOME/.ecflowrc/servers
# start the GUI:
ecflowview
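The same registration can be sketched in Python, for instance when the GUI configuration is prepared by a setup script. This is only an illustration: the function names servers_entry and register_localhost are hypothetical, socket.gethostname() and os.getgid() stand in for uname -n and id -g, and the port rule (1500 plus the group id) simply mirrors the shell command above.

```python
import os
import socket


def servers_entry(hostname, group_id, base_port=1500):
    """Build one line for the ecflowview servers file: name, host, port."""
    return "localhost %s %d" % (hostname, base_port + group_id)


def register_localhost(servers_file):
    """Append the localhost entry unless the file already mentions it."""
    if os.path.exists(servers_file):
        with open(servers_file) as handle:
            if "localhost" in handle.read():
                return False  # already registered, nothing to do
    directory = os.path.dirname(servers_file)
    if directory and not os.path.isdir(directory):
        os.makedirs(directory)
    entry = servers_entry(socket.gethostname(), os.getgid())
    with open(servers_file, "a") as handle:
        handle.write(entry + "\n")
    return True
```

Calling register_localhost(os.path.expanduser("~/.ecflowrc/servers")) would then behave like the grep/echo pair, returning False when an entry is already present.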
The server administration directory is $HOME/ecflow_server/, which contains the server log file and the checkpoint file (a binary snapshot of the server content). It is defined as the variable ECF_HOME on the top node.
Once the server and its GUI are started, click on Servers->localhost to connect for the first time.
The next step is to load a suite into the server. The following Python script can be used to define the suite, to expand it into a file, and to load it into the server, as shown in the ecflowview snapshot below.
Clicking on each task, we can check that the task wrapper script is present (ECF_FILES defined properly), then Edit it, preprocess it (ECF_INCLUDE defined properly, no micro character on its own), and submit it, as a task or as an alias.
Consider Options->CloseOnApply/Submit when multiple aliases must be sent from the same task in a short time.
When the task does not reach the active status, check that the ECF_OUT directory exists on the remote host, check that the expected rsh or ssh connection no longer asks for a password, or query the queueing system, as the directive may not be valid yet (user account, queue).
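The first two checks can be scripted. The sketch below is one possible way to do it with ssh, not part of ecFlow itself: the function names are hypothetical, and the host and directory passed in are whatever the suite defines. It relies on the ssh option BatchMode=yes, which makes ssh fail instead of prompting for a password, and on ssh returning 255 for its own errors.

```python
import subprocess


def remote_dir_check_command(host, directory):
    """Return the ssh command that tests for a directory without prompting."""
    # BatchMode=yes: fail rather than ask for a password
    return ["ssh", "-o", "BatchMode=yes", host, "test", "-d", directory]


def diagnose_remote_output(host, directory):
    """Classify the outcome of the remote check as a short message."""
    code = subprocess.call(remote_dir_check_command(host, directory))
    if code == 0:
        return "ok"
    if code == 255:
        # ssh reports its own failures (no connection, password required) as 255
        return "ssh connection failed"
    return "directory missing"
```

For a hypothetical host hpc and an ECF_OUT of /scratch/out, diagnose_remote_output("hpc", "/scratch/out") tells which of the two checks to look at first.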
When the submit family is working for the expected remote host(s), it is time to fill the main family with relevant tasks. Enjoy!
(Expanded) Definition file:
It consists of the following keywords:
- suite family task endsuite endfamily endtask
- autocancel clock complete cron date day defstatus edit event extern inlimit label late limit meter repeat time today trigger
Compared with SMS, the keywords automigrate, autorestore, text and owner have been dropped.
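To make the nesting of these keywords concrete, here is a small sketch that renders a tree of nodes as definition text. It is not the ecFlow API: the render function and the (kind, name, attributes, children) tuple layout are hypothetical, chosen only to show how suite/family/task open a block, how attributes sit inside it, and how the matching end keyword closes it.

```python
def render(node, indent=0):
    """Render a (kind, name, attributes, children) tuple as definition lines."""
    kind, name, attributes, children = node
    pad = "  " * indent
    lines = ["%s%s %s" % (pad, kind, name)]
    for attribute in attributes:
        # attributes (clock, edit, trigger, ...) belong to the enclosing node
        lines.append("%s  %s" % (pad, attribute))
    for child in children:
        lines.extend(render(child, indent + 1))
    # every node is closed with endsuite / endfamily / endtask
    lines.append("%send%s" % (pad, kind))
    return lines


suite = ("suite", "test", ["clock real"], [
    ("family", "fam", ["edit VAR value"], [
        ("task", "t%d" % i, [], []) for i in range(2)
    ]),
])
text = "\n".join(render(suite))
```

Printing text gives a ten-line definition starting with "suite test" and ending with "endsuite".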
autocancel
for a node to be deleted automatically
autocancel +01:00 # cancel one hour after complete
autocancel 10 # cancel 10 days after complete
autocancel 0 # cancel immediately after being complete
clock
clock real # hybrid may be used in test mode
complete
for a node, to be recursively forced complete from a condition
complete t1:1 or t1==complete
cron
to run a task regularly; the task is requeued as soon as complete is received
i.e. no trigger that waits for this task to be complete should be used
the task can only become complete through an inherited defstatus or complete attribute
cron 23:00 # at next 23:00
cron 10:00 20:00 01:00 # every hour from 10am to 8pm
date
date 25.12.2012
date 01.*.*
day
day monday # sunday,monday,tuesday,wednesday,thursday,friday,saturday
defstatus
defstatus complete # unknown,suspended,complete,queued,submitted,active,aborted,shutdown,halted
edit
to attach a variable definition to a node
edit variable value
# variables to be found and replaced in a task wrapper
edit COMMAND "echo OK" # %COMMAND:sleep 1%
edit TRIGGER "t1:1 or t1==complete" # ecflow_client --wait "%TRIGGER:1==1%"
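The comments above show the %NAME:default% form used in task wrappers. The sketch below imitates that replacement in plain Python to show the lookup order (variable value first, then the inline default). It is a simplified model, not ecFlow's preprocessor: the substitute function is hypothetical, includes and other directives are ignored, and treating %% as a literal micro character is an assumption.

```python
import re

# %NAME%, %NAME:default% or %% (assumed to be a literal micro character)
_VARIABLE = re.compile(r"%(?:([A-Za-z_][A-Za-z0-9_]*)(?::([^%]*))?)?%")


def substitute(text, variables):
    """Replace %NAME% and %NAME:default% using the variables dictionary."""
    def replace(match):
        name, default = match.group(1), match.group(2)
        if name is None:
            return "%"  # "%%" collapses to a single "%"
        if name in variables:
            return variables[name]  # a value set with edit wins
        if default is not None:
            return default  # otherwise fall back to the inline default
        raise KeyError("no value or default for %s" % name)
    return _VARIABLE.sub(replace, text)
```

With no COMMAND variable defined, substitute("%COMMAND:sleep 1%", {}) falls back to "sleep 1"; once edit COMMAND "echo OK" is in place, the same wrapper line becomes "echo OK".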
event
event 1
event ready
extern
extern /path/to/a/external/node # in order to allow its use in a trigger/complete
inlimit
registers the node and its children with a limit
inlimit /limits:hpc
inlimit /suite/limits:hpc
inlimit /suite/limits:hpc 10
label
label name "default message"
late
late -s +00:15 -a 20:00 -c +02:00
limit
limit hpc 500
meter
meter name -1 100 90 # 90 is threshold (optional)
repeat
repeat is incremented when all nodes below are complete
an aborted task DOES prevent the repeat from incrementing
an Operator/Analyst/dedicated task can help carry on
repeat day step [ENDDATE] # only for suites
repeat integer VARIABLE start end [step]
repeat enumerated VARIABLE first [second [third ...]]
repeat string VARIABLE str1 [str2 ...]
repeat date VARIABLE yyyymmdd yyyymmdd [delta]
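As an illustration of the repeat date form, the sketch below lists the values VARIABLE would take. The function name is hypothetical, and two assumptions are made: the end date is included, and delta counts days.

```python
from datetime import datetime, timedelta


def repeat_date_values(start, end, delta=1):
    """List the yyyymmdd values from start to end (inclusive) every delta days."""
    if delta < 1:
        raise ValueError("this sketch only handles a positive delta")
    current = datetime.strptime(str(start), "%Y%m%d")
    last = datetime.strptime(str(end), "%Y%m%d")
    values = []
    while current <= last:
        values.append(int(current.strftime("%Y%m%d")))
        current += timedelta(days=delta)  # calendar arithmetic handles month ends
    return values
```

For example, repeat_date_values(20121230, 20130102) steps across the year boundary one day at a time, which is why a date repeat is preferable to an integer repeat for dates.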
time
the task becomes complete ONLY when the time range is over
it is better not to use such a task in a trigger expression
time 23:00 # at next 23:00
time 10:00 20:00 01:00 # every hour from 10am to 8pm
time +00:01 # one minute after the begin suite
time +00:10 01:00 00:05 # 10-60 min after begin every 5 min
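The three-field form (start, end, step) used by time, and in the same way by cron and today, can be read as a list of slots. The sketch below expands it; the function name is hypothetical, the end slot is assumed to be included, and for the relative form (leading +) the result is a list of offsets from the begin of the suite rather than clock times.

```python
def expand_time_series(start, end, step):
    """Expand hh:mm start/end/step strings into the list of hh:mm slots."""
    def minutes(value):
        hours, mins = value.lstrip("+").split(":")
        return int(hours) * 60 + int(mins)

    increment = minutes(step)
    if increment <= 0:
        raise ValueError("step must be positive")
    slots = []
    current = minutes(start)
    while current <= minutes(end):  # end slot assumed inclusive
        slots.append("%02d:%02d" % divmod(current, 60))
        current += increment
    return slots
```

Under these assumptions, expand_time_series("10:00", "20:00", "01:00") yields eleven slots, from 10:00 to 20:00.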
today
with this attribute, a task starts straight away when loaded/replaced after the given time
whereas the time attribute would make it wait until the next day
today 3:00 # today at 3:00
today 10:00 20:00 01:00 # every hour from 10am to 8pm
trigger
for a task to wait for the right condition (step/meter/status/variable(int)) before starting
As soon as the definition file grows beyond a few hundred lines, or even earlier when obvious repeated patterns appear in the suite definition, a language like Python can be used to generate it. At the Centre, a Python module is used in both research and operations to reduce verbosity in suite definition scripts: /home/ma/emos/def/o/def/ecf.py
#!/usr/bin/env python
import os, pwd, sys
sys.path.append('/home/ma/emos/def/o/def')
# in ipython: import ecf; help(ecf.<tab>)
from ecf import *

defs = Defs()

def fill():
    # functions can generate tasks/families
    out = []
    for i in range(0, 10):
        out.append(Task("t%d" % i))
    return out

top = Suite("test").add(
    Family("fam").add(
        Task("example").add(
            Variables(var="value",
                      v2="another variable"),
        ),
        fill(),
    )
)

if __name__ == "__main__":
    # numeric user id of the current user
    uid = pwd.getpwnam(pwd.getpwuid(os.getuid())[0]).pw_uid
    client = Client("localhost@%s" % uid)
    path = "/test"
    # load the suite, replacing any existing /test node
    client.replace(path, top)