To start a job, the ecflow_server uses the content of the ECF_JOB_CMD variable. By modifying this variable, it is possible to control where and how a job file will run. The command should be used in conjunction with the variables ECF_JOB and ECF_JOBOUT. The ECF_JOB variable contains the job file path, and ECF_JOBOUT contains the path of a file where the output of the job will be written. The default command is:
Note |
---|
ECF_JOB_CMD = %ECF_JOB% 1> %ECF_JOBOUT% 2>&1 & |
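To make the role of ECF_JOB and ECF_JOBOUT concrete, here is a minimal Python sketch of how %NAME% placeholders in the default command could be expanded into a shell command line. It is an illustration only, not the server's actual substitution code: the helper `expand_variables` and the two paths are hypothetical, and the real ecflow_server resolves variables by its own rules.

```python
import re


def expand_variables(command, variables):
    """Replace each %NAME% in command with variables[NAME] (illustrative only)."""
    def lookup(match):
        name = match.group(1)
        # Leave unknown names untouched so a missing variable stays visible
        return variables.get(name, match.group(0))
    return re.sub(r"%(\w+)%", lookup, command)


# Hypothetical paths, chosen only for this illustration
variables = {
    "ECF_JOB": "/home/user/course/test/f5/t1.job1",
    "ECF_JOBOUT": "/home/user/course/test/f5/t1.1",
}

default_cmd = "%ECF_JOB% 1> %ECF_JOBOUT% 2>&1 &"
expanded = expand_variables(default_cmd, variables)
print(expanded)
```

With these assumed values, the job file is run in the background and both its standard output and standard error are redirected to the ECF_JOBOUT file.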
Let us run the tasks on a remote machine. For that, we could use the UNIX command ssh.
We would like the name of the host to be defined by a variable called HOST.
We assume that all the files are visible on all the hosts, e.g. using NFS.
In the examples below, replace the string ?????? with a host name of your choice.
Note |
---|
The environment of a task running on a remote host can be different from that of a task running locally. It depends on how your system is set up. The head.h should already be using the correct PATH, to allow child commands to be used. > export PATH=$PATH:/usr/local/apps/ecflow/%ECF_VERSION%/bin |
Using ssh requires your public key to be available on the destination machine.
Check whether you can log on to the remote machine through ssh without being asked for a password.
If you need to enter a password, you will need to add your public key on the destination machine. To do this, issue the following commands:
Code Block |
---|
language | bash |
---|
title | no password for ssh connection |
---|
|
REMOTE_HOST=?????? # change me
ssh $USER@$REMOTE_HOST mkdir -p \$HOME/.ssh # if you are prompted for a password use your Training password that was provided
cat $HOME/.ssh/id_rsa.pub || ssh-keygen -t rsa -b 2048
cat $HOME/.ssh/id_rsa.pub | ssh $USER@$REMOTE_HOST 'cat >> $HOME/.ssh/authorized_keys'
|
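The passwordless check described above can also be scripted. The sketch below is one possible way to do it from Python, assuming an OpenSSH client is installed; the function names are hypothetical. It relies on the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password.

```python
import subprocess


def build_ssh_check_command(host, user):
    """Build an ssh command that runs 'true' remotely and never prompts."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # fail rather than ask for a password
        "-o", "ConnectTimeout=5",   # do not hang on an unreachable host
        "{}@{}".format(user, host),
        "true",
    ]


def can_login_without_password(host, user):
    """Return True if key-based ssh login to host succeeds."""
    try:
        result = subprocess.run(
            build_ssh_check_command(host, user),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # ssh client not found or not executable
        return False
    return result.returncode == 0


# Show the command that would be run; the host and user names are placeholders
print(" ".join(build_ssh_check_command("remote-host", "me")))
```

Calling `can_login_without_password(host, user)` with your own host and user name returns False when the public key is not yet installed on the destination machine.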
If you use rsh rather than ssh, you may experience other problems caused by standard UNIX issues.
Make sure that the file $HOME/.rhosts contains a line with your user ID and
the machine where your server is running.
Modify the family f5 so that all its tasks will run on another machine in the classroom.
Text
Code Block |
---|
# Definition of the suite test
suite test
edit ECF_INCLUDE "$HOME/course"
edit ECF_HOME "$HOME/course"
limit l1 2
family f5
edit HOST ??????
edit ECF_OUT /tmp/$USER
edit ECF_JOB_CMD "ssh %HOST% 'mkdir -p %ECF_OUT%/%SUITE%/%FAMILY% && %ECF_JOB% > %ECF_JOBOUT% 2>&1 &'"
inlimit l1
edit SLEEP 20
task t1
task t2
task t3
task t4
task t5
task t6
task t7
task t8
task t9
endfamily
endsuite |
If your login shell is csh, you should define ECF_JOB_CMD as:
Code Block |
---|
edit ECF_JOB_CMD "ssh %HOST% 'mkdir -p %ECF_OUT%/%SUITE%/%FAMILY%; %ECF_JOB% >& %ECF_JOBOUT%'" |
Python
In Python, modify the function create_family_f5() created on the earlier page to add HOST, ECF_OUT, ECF_LOGHOST, ECF_LOGPORT, and ECF_JOB_CMD:
Code Block |
---|
language | py |
---|
title | $HOME/course/test.py |
---|
|
import os
from ecflow import Defs,Suite,Family,Task,Edit,Trigger,Complete,Event,Meter,Time,Day,Date,Label, \
                   RepeatString,RepeatInteger,RepeatDate,InLimit,Limit

def create_family_f5() :
    # All tasks in f5 run on the remote HOST, limited by l1
    return Family("f5",
            InLimit("l1"),
            Edit(SLEEP=20,
                 HOST='??????',                            # change me
                 ECF_OUT='/tmp/%s' % os.getenv("USER"),    # same as 'edit ECF_OUT /tmp/$USER'
                 ECF_LOGHOST='%HOST%',                     # assumes the log server runs on HOST
                 ECF_LOGPORT='?????',                      # port=$((35000 + ...
                 ECF_JOB_CMD="ssh %HOST% 'mkdir -p %ECF_OUT%/%SUITE%/%FAMILY%; %ECF_JOB% > %ECF_JOBOUT% 2>&1 &'"),
            [ Task('t{}'.format(i)) for i in range(1,10) ] )

print("Creating suite definition")
home = os.path.join(os.getenv("HOME"), "course")
defs = Defs(
        Suite("test",
            Edit(ECF_INCLUDE=home,ECF_HOME=home),
            Limit("l1",2),
            create_family_f5()))
print(defs)

print("Checking job creation: .ecf -> .job0")
print(defs.check_job_creation())

print("Checking trigger expressions")
assert len(defs.check()) == 0,defs.check()

print("Saving definition to file 'test.def'")
defs.save_as_defs("test.def") |
Logserver
We can view the output on the remote machine (class??) by using a log server.
This assumes you have defined variables ECF_LOGHOST and ECF_LOGPORT in your definition.
Launch the log server on a remote machine:
Code Block |
---|
|
ssh $USER@class01 /usr/local/apps/ecflow/5.5.1/bin/ecflow_logserver.sh -d /tmp/$USER -m /tmp/$USER:/tmp/$USER |
What to do
- Modify the PATH environment variable in head.h
- Change the suite definition
- Replace the suite definition
- It may not work immediately. Have a look in the file $HOME/course/host.port.ecf.log to see why.
- Add a uname -n to your ECF script to see what machine the task is running on.
- What do you need to do in order to have the task /test/f5/t9 run on another machine? Try your solution.
- Create a log server, to access the remote output
...