The task wrapper file does not normally need many changes if the task designer has stuck to the KISS principle and focused on the functional aspect of the task.
# Define each SMS variable on a node as a reference to its ecFlow counterpart,
# so that wrappers still using %SMSxxx% can be run unchanged.
vars="SMSRID SMSTRYNO SMSNAME SMSSCRIPT SMSJOB SMSJOBOUT SMSDATE SMSTIME SMSCLOCK SMSKILLCMD SMSURLCMD SMSURLBASE SMSURL SMSPASS SMSNODE SMSCMD SMSKILL SMSCHECK SMSCHECKOLD SMSSTATUSCMD SMSCHECKCMD SMSOUT SMSTRIES"
for var in $vars; do
  case $var in
    SMSCMD)         ecf=ECF_JOB_CMD;;
    SMSKILL*)       ecf=ECF_KILL_CMD;;
    SMSSTATUS*)     ecf=ECF_STATUS_CMD;;
    SMSCHECK*CMD*)  ecf=ECF_CHECK_CMD;;
    SMSURL*CMD*)    ecf=ECF_URL_CMD;;
    *)              ecf=$(echo $var | sed -e 's:SMS:ECF_:');;
  esac
  node=/       # node=path_to_suite_or_family
  add=add      # add=change
  ecflow_client --alter $add variable $var "%$ecf%" $node
done
The file name is changed, ending with .ecf instead of .sms:
- simply copy or link the original file from .sms into .ecf
- alternatively, define a variable ECF_EXTN in the definition file:
    edit ECF_EXTN .sms
  This requests that the ecFlow server use .sms wrappers as the task template.
In some cases, no files will need translation (no SMS variables, no CDP calls).
smsmicro is replaced with ecf_micro, when needed:
SMS | ecFlow | location |
---|---|---|
SMSMICRO | ECF_MICRO | definition file |
%smsmicro | %ecf_micro | script .ecf .h |
In ECMWF Operations, in the main branch, amongst 1394 files, only 43 use SMS system variables, i.e. variables whose name starts with SMS. Among all the suites MetApps is in charge of, amongst 3738 files, 216 are affected. Extracting these variables, we have:
============
%SMS in .sms
============
SMSCHECK
SMSCHECKOLD
SMSDATE
SMSFILES
SMSHOME
SMSHOST
SMSINCLUDE
SMSJOBOUT
SMSLOG
SMSNAME
SMSNODE
SMSTRYNO
SMSURLBASE
SMS_PROG
============
Similarly, we can identify all scripts that call the CDP text client.
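The extraction described above can be reproduced with standard tools. The following is a minimal sketch, assuming wrappers end in .sms and headers in .h; the function names are hypothetical, and the grep options -o and -h are common extensions rather than strict POSIX.

```shell
#!/bin/sh
# Files (wrappers and headers) that reference SMS system variables.
list_sms_variable_users() {
  find "$1" -type f \( -name "*.sms" -o -name "*.h" \) \
    -exec grep -l '%SMS[A-Z_]*' {} +
}

# Distinct SMS system variables referenced below a directory.
list_sms_variables() {
  find "$1" -type f \( -name "*.sms" -o -name "*.h" \) \
    -exec grep -oh '%SMS[A-Z_]*' {} + | sort -u
}

# Files that call the CDP text client ("cdp" as a whole word).
list_cdp_callers() {
  find "$1" -type f \( -name "*.sms" -o -name "*.h" \) \
    -exec grep -lE '(^|[^A-Za-z0-9_])cdp([^A-Za-z0-9_]|$)' {} +
}
```

Running list_sms_variables on the root of a suite's script directory should give a list comparable to the table above, and list_cdp_callers gives the scripts that need their CDP calls reviewed by hand.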
It is a good design principle to create tasks that are independent of SMS system variables. Only tasks with “advanced use” are affected: for example, SMSTRYNO was used to make a job aware of its instance number, enabling verbose output in case of a rerun.
One-step translation consists of running the scripts through a filter that can be used both for expanded SMS definition files and for task wrappers:
> sed -f sms2ecf-min.sed X.sms > X.ecf
#!/bin/sed -f
/^ *action */d
/^ *edit ECF_DATE */d
s:SMSNAME:ECF_NAME:g
s:SMSNODE:ECF_NODE:g
s:SMSPASS:ECF_PASS:g
s:SMS_PROG:ECF_PORT:g
s:SMSINCLUDE:ECF_INCLUDE:g
s:SMSFILES:ECF_FILES:g
s:SMSTRYNO:ECF_TRYNO:g
s:SMSTRIES:ECF_TRIES:g
s:SMSHOME:ECF_HOME:g
s:SMSRID:ECF_RID:g
s:SMSJOB:ECF_JOB:g
s:SMSJOBOUT:ECF_JOBOUT:g
s:SMSOUT:ECF_OUT:g
s:SMSCHECKOLD:ECF_CHECKOLD:g
s:SMSCHECK:ECF_CHECK:g
s:SMSLOG:ECF_LOG:g
s:SMSLISTS:ECF_LISTS:g
s:SMSPASSWD:ECF_PASSWD:g
s:SMSSERVERS:ECF_SERVERS:g
s:SMSMICRO:ECF_MICRO:g
s:SMSPID:ECF_PID:g
s:SMSHOST:ECF_HOST:g
s:SMSDATE:ECF_DATE:g
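# longer names first, so that SMSURL and SMSKILL do not clobber SMSURLCMD and SMSKILLCMD
s:SMSURLCMD:ECF_URL_CMD:g
s:SMSURLBASE:ECF_URLBASE:g
s:SMSURL:ECF_URL:g
s:SMSCMD:ECF_JOB_CMD:g
s:SMSKILLCMD:ECF_KILL_CMD:g
s:SMSKILL:ECF_KILL_CMD:g
s:SMSSTATUSCMD:ECF_STATUS_CMD:g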
s:SMSWEBACCESS:ECF_WEBACCESS:g
s:SMS_VERS:ECF_VERS:g
s:SMS_VERSION:ECF_VERSION:g
/edit ECF_INCLUDE/ {
s:/include:/include_ecf:g
}
/edit ECF_INCLUDE/ {
s:_prod:_prod_ecf:g
}
/edit ECF_FILES/ {
s:_prod:_prod_ecf:g
}
s:smshostfile:ecf_hostfile:g
s:sms_hosts:ecf_hosts:g
Applying such a filter to all SMS tasks can be automated:
#!/bin/ksh
files=$(find . -type f -name "*.sms")   ## all sms wrappers
for f in $files ; do
ecf=$(basename $f .sms).ecf ## ecf task name (created in the current directory)
sed -f sms2ecf-min.sed $f > $ecf ## translate
## nothing changed: replace the copy with a link to the sms wrapper
diff $f $ecf > /dev/null && rm $ecf && ln -sf $f $ecf
done
Symbolic links between SMS wrappers can be preserved:
#!/bin/ksh
files=$(find . -type l -name "*.sms")   ## all sms wrappers that are links
for f in $files ; do
ecf=$(basename $f .sms).ecf ## ecf task name
link=$(readlink $f)
dir=$(dirname $f); cd $dir
ln -sf $link $ecf
cd -
done
Special attention is needed for the following variables, whose new names do not follow the simple SMS to ECF_ prefix pattern:
SMS | ecFlow |
---|---|
SMSCMD | ECF_JOB_CMD |
SMSKILL | ECF_KILL_CMD |
SMSSTATUSCMD | ECF_STATUS_CMD |
SMSURLCMD | ECF_URL_CMD |
It is not a good idea to systematically replace SMS with ECF_. For example, we use the variables NO_SMS and LSMSSIG, which are not related to SMS.
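The difference can be seen on a single line. This is a minimal sketch: the sample line is made up for illustration, and the anchored pattern only covers three variable names, where a real filter would list them all as in sms2ecf-min.sed.

```shell
#!/bin/sh
# Sample line mixing unrelated names (NO_SMS, LSMSSIG) with SMS system variables.
line='NO_SMS=1 LSMSSIG=2 name=%SMSNAME% try=%SMSTRYNO%'

# Blanket replacement: also rewrites NO_SMS and LSMSSIG.
naive=$(printf '%s\n' "$line" | sed -e 's:SMS:ECF_:g')

# Anchored replacement: only known variable names, and only when SMS
# is not preceded by a letter, digit or underscore.
safe=$(printf '%s\n' "$line" |
  sed -E 's:(^|[^A-Za-z0-9_])SMS(NAME|NODE|TRYNO):\1ECF_\2:g')

echo "naive: $naive"
echo "safe:  $safe"
```

The blanket version turns NO_SMS into NO_ECF_ and LSMSSIG into LECF_SIG, while the anchored version leaves both untouched and still translates %SMSNAME% and %SMSTRYNO%.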
If we want to run the same job using both SMS and ecFlow, %SMSXXX% may be replaced with shell variables ECF_XXX. Then in a header file, we will define ECF_XXX=%SMSXXX:0% for SMS mode and ECF_XXX=%ECF_XXX:0% for ecFlow mode.
All tasks calling CDP directly must be treated carefully, and text client commands replaced with their ecFlow counterparts. They may force complete a family or a task, requeue a job or change a variable value:
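As an illustration of this dual-mode idea, the sketch below uses ECF_TRYNO as the ECF_XXX variable. The expand function is a hypothetical stand-in for the preprocessing normally done by the SMS or ecFlow server, so that the example can be run on its own; the task body and the values passed in are made up.

```shell
#!/bin/sh
# Header line for each mode, as described above (ECF_TRYNO plays the role of ECF_XXX).
sms_header='ECF_TRYNO=%SMSTRYNO:0%'
ecf_header='ECF_TRYNO=%ECF_TRYNO:0%'

# Task body shared by both modes: it only ever sees the shell variable.
body='if [ "$ECF_TRYNO" -gt 1 ]; then echo "rerun: verbose"; else echo "first try"; fi'

# Hypothetical stand-in for server preprocessing: replace %NAME:default%
# with a given value. The real substitution is done by SMS or ecFlow.
expand() { sed -e "s/%$1:[^%]*%/$2/g"; }

# usage: run_job <header line> <variable name> <value>
run_job() {
  printf '%s\n%s\n' "$1" "$body" | expand "$2" "$3" | sh
}

run_job "$sms_header" SMSTRYNO 1    # SMS mode, first attempt
run_job "$ecf_header" ECF_TRYNO 2   # ecFlow mode, second attempt
```

Only the header differs between the two modes; the task body is untouched, which is the point of moving the scheduler variables behind shell variables.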
#!/usr/bin/env cdp
cdp << EOF
define ERROR {
if(rc==0) then exit 1; endif
}
set SMS_PROG %SMS_PROG%
login %SMSNODE% %USER% 1 ; ERROR
suites -s %SUITE%
loop task ( $missing ) do
force -r complete /%SUITE%/%FAMILY%/tc\$task ; ERROR
endloop
exit
EOF
The ECF_PORT variable lets us tell whether a job is running under ecFlow control: %ECF_PORT:0% falls back to 0 when ECF_PORT is not defined, so the same script can keep its CDP branch for SMS:
#!/bin/ksh
if [ %ECF_PORT:0% -gt 0 ] ; then
for task in $missing; do
ecflow_client --force complete recursive /%SUITE%/%FAMILY%/tc$task
done
else
cdp << EOF
define ERROR {
if(rc==0) then exit 1; endif
}
set SMS_PROG %SMS_PROG%
login %SMSNODE% %USER% 1 ; ERROR
suites -s %SUITE%
loop task ( $missing ) do
force -r complete /%SUITE%/%FAMILY%/tc\$task ; ERROR
endloop
exit
EOF
fi
SMS child commands may also be called directly in a few SMS task wrappers. These should again be replaced with their ecFlow equivalents.
There is no single right way to choose the language of a task wrapper. It is simple to design a task whose language is pure Python or pure Perl. We tend to use ksh scripting for task templates for the following reasons:
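Where editing every wrapper is not practical, one option is to keep the SMS command names and define them as shell functions that forward to ecflow_client. This is only a sketch: it assumes the wrappers use just the child commands listed here, that the job header exports the environment variables ecflow_client child commands rely on (such as ECF_NAME, ECF_PASS and ECF_PORT), and ECF_CLIENT is a hypothetical variable added so the client can be swapped for testing.

```shell
#!/bin/sh
# Client to call; override ECF_CLIENT for a dry run (hypothetical variable).
ECF_CLIENT=${ECF_CLIENT:-ecflow_client}

# SMS child command names forwarded to the ecflow_client child commands.
smsinit()     { $ECF_CLIENT --init="$1"; }             # job start, with process id
smscomplete() { $ECF_CLIENT --complete; }              # job end
smsabort()    { $ECF_CLIENT --abort="${1:-trap}"; }    # job failure, optional reason
smsevent()    { $ECF_CLIENT --event="$1"; }            # set an event
smsmeter()    { $ECF_CLIENT --meter="$1" "$2"; }       # set a meter value
smslabel()    { name=$1; shift; $ECF_CLIENT --label="$name" "$@"; }  # set a label
```

Such definitions could live in a header included by the wrappers, so that the task bodies stay unchanged until they are migrated properly.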
- trap ERROR 0: to catch an early exit from the script and call the ERROR function when that happens
- set -e: to raise an error if a command exit status is not 0
- set -u: to prevent undefined variable usage
- set -x: to display each command before execution
- PS4 variable: to allow time stamping and to evaluate each line's run time
- trap: to redirect internal/external signal reception to an ERROR function
Task headers can be used to factor out what is shared among multiple tasks (head.h, tail.h, trap.h, rcp.h, qsub.h).
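The protections listed above can be tried out in isolation. The following is a minimal sketch, not the content of the actual head.h or trap.h: run_task_body is a hypothetical helper that runs a one-line task body in a child shell, and the ERROR function only prints a message where a real header would notify the server. set -x and PS4 are left out to keep the output short.

```shell
#!/bin/sh
# usage: run_task_body '<shell commands>'
run_task_body() {
  sh -c '
    set -e                  # abort on the first failing command
    set -u                  # abort on use of an undefined variable
    ERROR() {
      set +e                # do not fail inside the handler itself
      trap - 0              # clear the exit trap to avoid recursion
      echo "ERROR: task body failed"   # a real header would notify the server here
      exit 1
    }
    trap ERROR 0            # any exit before the end is treated as an error
    trap ERROR 1 2 3 15     # same handler for HUP, INT, QUIT and TERM
    eval "$1"               # the task body
    trap - 0                # normal end: disarm the exit trap
    echo "task complete"
  ' task "$1"
}
```

Calling run_task_body 'true' reaches the end and disarms the trap, whereas run_task_body 'false' exits early because of set -e, which fires the exit trap and runs ERROR.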