To start a job, the ecflow_server uses the content of the ECF_JOB_CMD variable.
By modifying this variable, you can control where and how a job file is run.
The command should be used in conjunction with the variables ECF_JOB and ECF_JOBOUT.
ECF_JOB contains the path of the job file, and ECF_JOBOUT contains
the path of the file to which the job's output is written. The default command is:


Note
 ECF_JOB_CMD = %ECF_JOB% 1> %ECF_JOBOUT% 2>&1 &
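
To make the substitution concrete, here is a minimal Python sketch of how the %VAR% placeholders in the default command could expand into a shell command line. It is a simplified model and not the server's actual implementation; the function name `expand` and the example paths are hypothetical.

```python
import re

def expand(template: str, variables: dict) -> str:
    """Replace each %NAME% in template with variables[NAME] (simplified model)."""
    return re.sub(r"%(\w+)%", lambda m: variables[m.group(1)], template)

# Hypothetical values for one task; the real paths depend on your ECF_HOME.
variables = {
    "ECF_JOB": "/home/user/course/test/f5/t1.job1",
    "ECF_JOBOUT": "/home/user/course/test/f5/t1.1",
}

command = expand("%ECF_JOB% 1> %ECF_JOBOUT% 2>&1 &", variables)
print(command)
```

The expanded string is what gets handed to the shell: the job file is executed in the background, with both standard output and standard error redirected to the ECF_JOBOUT file.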


Let us run the tasks on a remote machine. For that, we can use the UNIX command ssh.
We would like the host name to be defined by a variable called HOST.
We assume that all the files are visible on all the hosts, e.g. using NFS.

In the examples below, replace the string ?????? with a host name of your choice.


Note
The environment of a task running on a remote host can differ from that of a task running locally, depending on how your system is set up.
The head.h file should already set the correct PATH so that child commands can be found.
If not, add the following line to your head.h file before the call to ecflow_client --init:

> export PATH=$PATH:/usr/local/apps/ecflow/%ECF_VERSION%/bin


Using ssh requires your public key to be available on the destination machine.
Check whether you can log on to the remote machine through ssh without being asked for a password.
If you are asked for a password, you need to add your public key to the destination machine. To do this, issue the following commands:


Code Block
languagebash
titleno password for ssh connection
REMOTE_HOST=??????  # change me
ssh $USER@$REMOTE_HOST mkdir -p \$HOME/.ssh      # if you are prompted for a password use your Training password that was provided
cat $HOME/.ssh/id_rsa.pub || ssh-keygen -t rsa -b 2048
cat $HOME/.ssh/id_rsa.pub | ssh $USER@$REMOTE_HOST 'cat >> $HOME/.ssh/authorized_keys'
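
Once the key is installed, the passwordless login can be checked from a script. The sketch below is one possible way to do it, assuming an OpenSSH client whose `BatchMode` option makes ssh fail rather than prompt for a password. The function names are hypothetical and not part of ecFlow.

```python
import subprocess

def batch_ssh_command(host: str, user: str) -> list:
    """Build an ssh command that runs 'true' remotely and never prompts."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # fail instead of asking for a password
        "-o", "ConnectTimeout=5",   # do not hang on an unreachable host
        f"{user}@{host}",
        "true",
    ]

def can_login_without_password(host: str, user: str) -> bool:
    """Return True if the key-based login succeeds (exit status 0)."""
    result = subprocess.run(batch_ssh_command(host, user), capture_output=True)
    return result.returncode == 0

# Example call (host and user are placeholders for your own values):
# can_login_without_password("??????", "your_user")
```

If the function returns False, the job submission through ECF_JOB_CMD would most likely fail as well, because the server cannot answer a password prompt.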


Modify the family f5 so that all its tasks will run on another machine in the classroom.

    

Text

Code Block
# Definition of the suite test
suite test
 edit ECF_INCLUDE "$HOME/course"
 edit ECF_HOME    "$HOME/course"
 limit l1 2

 family f5
     edit HOST ??????
     edit ECF_OUT /tmp/$USER
     edit ECF_JOB_CMD "ssh %HOST% 'mkdir -p %ECF_OUT%/%SUITE%/%FAMILY% && %ECF_JOB% > %ECF_JOBOUT% 2>&1 &'"
     inlimit l1
     edit SLEEP 20
     task t1
     task t2
     task t3
     task t4
     task t5
     task t6
     task t7
     task t8
     task t9
 endfamily
endsuite

If your login shell is csh, you should define ECF_JOB_CMD as:

Code Block
edit ECF_JOB_CMD "ssh %HOST% 'mkdir -p %ECF_OUT%/%SUITE%/%FAMILY%; %ECF_JOB% >& %ECF_JOBOUT%'"
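
The two variants differ only in the redirection syntax and the command separator. As an illustration, the following Python helper (hypothetical, not part of the ecFlow API) assembles either string, which can be convenient when the suite definition is generated from Python.

```python
def remote_job_cmd(login_shell: str = "sh") -> str:
    """Return an ECF_JOB_CMD string for a remote host, by login shell family."""
    mkdir = "mkdir -p %ECF_OUT%/%SUITE%/%FAMILY%"
    if login_shell == "csh":
        # csh redirects stdout and stderr together with '>&'
        run = "%ECF_JOB% >& %ECF_JOBOUT%"
        separator = "; "
    else:
        # Bourne-style shells use '> file 2>&1'; '&' puts the job in the background
        run = "%ECF_JOB% > %ECF_JOBOUT% 2>&1 &"
        separator = " && "
    return "ssh %HOST% '" + mkdir + separator + run + "'"

print(remote_job_cmd("sh"))
print(remote_job_cmd("csh"))
```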

Python

In Python, modify the function create_family_f5() created on the earlier page to add HOST, ECF_OUT, ECF_LOGHOST, ECF_LOGPORT and ECF_JOB_CMD:

Code Block
languagepy
title$HOME/course/test.py
import os
from ecflow import Defs,Suite,Family,Task,Edit,Trigger,Complete,Event,Meter,Time,Day,Date,Label, \
                   RepeatString,RepeatInteger,RepeatDate,InLimit,Limit
         
def create_family_f5() :
    return Family("f5",
            InLimit("l1"),
            Edit(SLEEP=20,
                 HOST='??????',
                 ECF_OUT = '/tmp/%s' % os.getenv("USER"),
                 ECF_LOGHOST='%HOST%',
                 ECF_LOGPORT='?????',  # port=$((35000 + $(id -u))) run this on the command line
                 ECF_JOB_CMD="ssh %HOST% 'mkdir -p %ECF_OUT%/%SUITE%/%FAMILY%; %ECF_JOB% > %ECF_JOBOUT% 2>&1 &'"),
            [ Task('t{}'.format(i)) for i in range(1,10) ] )
     
print("Creating suite definition")  
home = os.path.join(os.getenv("HOME"),"course")
defs = Defs(
        Suite("test",
            Edit(ECF_INCLUDE=home,ECF_HOME=home),
            Limit("l1",2),
            create_family_f5()))
print(defs)
 
print("Checking job creation: .ecf -> .job0") 
print(defs.check_job_creation())
 
print("Checking trigger expressions")
assert len(defs.check()) == 0,defs.check()
 
print("Saving definition to file 'test.def'")
defs.save_as_defs("test.def")
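
The comment beside ECF_LOGPORT in the listing suggests deriving the port as 35000 plus your numeric user ID. A small sketch of the same arithmetic in Python, assuming a Unix system where `os.getuid()` is available (the function name is hypothetical):

```python
import os

LOGPORT_BASE = 35000  # base value taken from the comment in the listing

def logport_for_uid(uid: int) -> int:
    """Mirror the shell expression $((35000 + $(id -u)))."""
    return LOGPORT_BASE + uid

# Port for the current user; pass it to Edit(...) as a string if needed.
print(logport_for_uid(os.getuid()))
```

Using a per-user port is a convention to keep several users on the same machine from colliding. Check that the resulting number is a valid, unused port on your system.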

Logserver

We can view the output on the remote machine (class??) by using a log server.

This assumes you have defined the variables ECF_LOGHOST and ECF_LOGPORT in your definition.

Launch the log server on a remote machine:

Code Block
languagebash
ssh $USER@class01 /usr/local/apps/ecflow/5.5.1/bin/ecflow_logserver.sh -d /tmp/$USER -m /tmp/$USER:/tmp/$USER 

What to do

  1. Modify the PATH environment variable in head.h
  2. Change the suite definition
  3. Replace the suite definition
  4. It may not work immediately. Have a look in the file $HOME/course/host.port.ecf.log to see why.
  5. Add a uname -n to your ECF script to see what machine the task is running on.
  6. What do you need to do in order to have the task /test/f5/t9 run on another machine? Try your solution.
  7. Create a log server, to access the remote output
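
Item 6 relies on variable inheritance: a variable defined on a node is visible to its children, and a child may redefine it. The toy model below illustrates that lookup order with plain dictionaries. It is not the ecFlow API and ignores server-level variables; the host names hostA and hostB are placeholders.

```python
def lookup(name: str, path: str, variables: dict) -> str:
    """Search the node at `path` first, then each parent up to the suite."""
    parts = path.strip("/").split("/")
    while parts:
        node = "/" + "/".join(parts)
        if name in variables.get(node, {}):
            return variables[node][name]
        parts.pop()  # move up to the parent node
    raise KeyError(name)

# Variables keyed by node path (hypothetical values).
variables = {
    "/test/f5": {"HOST": "hostA"},
    "/test/f5/t9": {"HOST": "hostB"},
}

print(lookup("HOST", "/test/f5/t1", variables))  # inherited from the family
print(lookup("HOST", "/test/f5/t9", variables))  # defined on the task itself
```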
