
In the real world, suites can have several thousand tasks, and not all of these tasks are required all the time.

A server holding an extremely large number of tasks can suffer from performance issues:

  • The server writes to the checkpoint file periodically. With a very large number of tasks, this disk I/O can interfere with job scheduling.
  • Clients such as the GUI (ecflow_ui) are also adversely affected: they need more memory and the interactive experience becomes slow.
  • Network traffic increases, because larger definitions have to be transferred between server and clients.

This is where autoarchive becomes useful.

autoarchive example
autoarchive +01:00 # archive one hour after the node completes
autoarchive 01:00  # archive at 1 am after the node completes
autoarchive 10     # archive 10 days after the node completes
autoarchive 0      # archive immediately after the node completes; can take up to a minute
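
The four forms above can be read as a rule for computing when a completed node becomes due for archiving. The following is a minimal sketch of that reading in plain Python, not the server's implementation. The function name archive_due is hypothetical, and treating "01:00" as the first 01:00 that follows completion is an assumption based on the comment above; the server's own evaluation may differ.

```python
from datetime import datetime, timedelta


def archive_due(spec, completed):
    """Return the time at which a node that completed at `completed`
    becomes due for archiving, for one autoarchive argument `spec`.

    Hypothetical helper: it mirrors the comments in the listing above.
    """
    spec = spec.strip()
    if spec.startswith("+"):
        # Relative form "+hh:mm": offset from the completion time.
        hours, minutes = map(int, spec[1:].split(":"))
        return completed + timedelta(hours=hours, minutes=minutes)
    if ":" in spec:
        # Absolute form "hh:mm": assumed to mean the first such
        # time of day that follows completion.
        hours, minutes = map(int, spec.split(":"))
        due = completed.replace(hour=hours, minute=minutes,
                                second=0, microsecond=0)
        if due <= completed:
            due += timedelta(days=1)
        return due
    # Plain integer: number of days after completion (0 means at once).
    return completed + timedelta(days=int(spec))


completed = datetime(2024, 1, 10, 14, 30)
for spec in ("+01:00", "01:00", "10", "0"):
    print(spec, "->", archive_due(spec, completed))
```

Whatever time is computed this way, archiving only happens the next time the server evaluates the node, which is why even "autoarchive 0" can take up to a minute.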

autoarchive writes a portion of the definition to disk.

  • It archives suite or family nodes only if they have child nodes (otherwise it does nothing).
  • It saves the suite/family node to disk, and then removes the in-memory child nodes from the definition.
  • It reduces the time taken to checkpoint and reduces network bandwidth.
  • If an archived node is re-queued or begun, its child nodes are restored automatically.
  • The nodes are saved to ECF_HOME/<host>.<port>.ECF_NAME.check, where '/' has been replaced with ':' in ECF_NAME.
  • Care must be taken if you have triggers that reference the archived nodes.
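
The file naming rule in the list above can be sketched in a few lines of Python. This is an illustration only: the helper name archive_file_path and the host, port and directory values are hypothetical, and it assumes that ECF_NAME stands for the absolute path of the archived node.

```python
def archive_file_path(ecf_home, host, port, node_path):
    """Build the archive file name described above:
    ECF_HOME/<host>.<port>.ECF_NAME.check, with '/' replaced by ':'.

    Hypothetical helper; node_path is assumed to play the role of ECF_NAME.
    """
    flattened = node_path.replace("/", ":")
    return "{}/{}.{}.{}.check".format(ecf_home, host, port, flattened)


# Example values are made up for illustration.
print(archive_file_path("/home/user/course", "localhost", 3141, "/test/lf1"))
# /home/user/course/localhost.3141.:test:lf1.check
```

Knowing the expected file name makes it easier to confirm on disk that a family really was archived.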

Use ecflow_client --archive to archive nodes manually:

  • ecflow_client --archive=/s1             # archive suite /s1
  • ecflow_client --archive=/s1/f1 /s2      # archive family /s1/f1 and suite /s2
  • ecflow_client --archive=force /s1 /s2   # archive suites /s1 and /s2 even if they have active tasks

Archived nodes can also be restored automatically with autorestore, but the restore is only applied when the node holding the autorestore completes.

To restore archived nodes manually, use:

  • ecflow_client --restore=/s1/f1    # restore family /s1/f1
  • ecflow_client --restore=/s1 /s2   # restore suites /s1 and /s2
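
When archiving and restoring are driven from a script, it can help to build the command lines in one place. The sketch below only assembles argument lists that match the manual archive and restore commands shown above; the function names archive_args and restore_args are hypothetical, and nothing is executed.

```python
def archive_args(paths, force=False):
    """Return an argument list for 'ecflow_client --archive'.

    Hypothetical helper. With force=True the first value is the literal
    'force', followed by all node paths, as in the examples above.
    """
    paths = list(paths)
    if not paths:
        raise ValueError("at least one node path is required")
    if force:
        return ["ecflow_client", "--archive=force"] + paths
    return ["ecflow_client", "--archive=" + paths[0]] + paths[1:]


def restore_args(paths):
    """Return an argument list for 'ecflow_client --restore'."""
    paths = list(paths)
    if not paths:
        raise ValueError("at least one node path is required")
    return ["ecflow_client", "--restore=" + paths[0]] + paths[1:]


print(" ".join(archive_args(["/s1/f1", "/s2"])))
print(" ".join(archive_args(["/s1", "/s2"], force=True)))
print(" ".join(restore_args(["/s1", "/s2"])))
```

A list built this way can be passed to subprocess.run(args, check=True) on a machine where ecflow_client is installed and can reach the server.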

Text

Let us modify the suite definition file again. To avoid waiting, this exercise archives immediately.

# Definition of the suite test.
suite test
 edit ECF_INCLUDE "$HOME/course"
 edit ECF_HOME    "$HOME/course"
 edit SLEEP 20
 family lf1
     autoarchive 0
     task t1 ;  task t2 ; task t3 ; task t4; task t5 ; task t6; task t7; task t8 ; task t9
 endfamily
 family lf2
     autoarchive 0
     task t1 ;  task t2 ; task t3 ; task t4; task t5 ; task t6; task t7; task t8 ; task t9
 endfamily
 family lf3
     autoarchive 0
     task t1 ;  task t2 ; task t3 ; task t4; task t5 ; task t6; task t7; task t8 ; task t9
 endfamily
 family restore
    trigger ./lf1<flag>archived and ./lf2<flag>archived and ./lf3<flag>archived 
    task t1
       autorestore ../lf1 ../lf2 ../lf3   # restore when t1 completes
 endfamily
endsuite

Python

$HOME/course/test.py
import os
from ecflow import Defs,Suite,Family,Task,Edit,Trigger,Complete,Event,Meter,Time,Day,Date,Label, \
                   RepeatString,RepeatInteger,RepeatDate,InLimit,Limit,Autoarchive,Autorestore
         
def create_family(name) :
    return Family(name, 
                  Autoarchive(0),
                  [ Task('t{}'.format(i)) for i in range(1,10) ] )

def create_family_restore() :
    return Family("restore",
                 Trigger("./lf1<flag>archived and ./lf2<flag>archived and ./lf3<flag>archived"),
                 Task('t1', Autorestore(["../lf1","../lf2","../lf3"])))
     
print("Creating suite definition") 
home = os.path.join(os.getenv("HOME"),"course")
defs = Defs(
        Suite("test",
            Edit(ECF_INCLUDE=home,ECF_HOME=home,SLEEP=20),
            create_family("lf1"),create_family("lf2"),create_family("lf3"),
            create_family_restore()
        )
      )
print(defs)
 
print("Checking job creation: .ecf -> .job0") 
print(defs.check_job_creation())
 
print("Checking trigger expressions and inlimits")
assert len(defs.check()) == 0,defs.check()
 
print("Saving definition to file 'test.def'")
defs.save_as_defs("test.def")

What to do

  1. Type in the changes, then copy the script directory for the new families: cp -r f5 lf1; cp -r f5 lf2; cp -r f5 lf3
  2. Replace the suite definition.
  3. Run the suite. In ecflow_ui you should see the nodes being archived and then restored.
  4. Experiment with archive and restore in ecflow_ui.
  5. Experiment with archive and restore from the command line.
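
The directory copies in step 1 can also be done from Python, which is convenient if test.py is extended to prepare the scripts as well. This is a minimal sketch: the helper name copy_script_dirs is hypothetical, and it assumes you pass the directory that already contains f5 from the earlier exercises. Unlike cp -r, it skips a target that already exists instead of copying into it.

```python
import shutil
from pathlib import Path


def copy_script_dirs(base_dir, source="f5", targets=("lf1", "lf2", "lf3")):
    """Copy base_dir/source to each missing base_dir/<target>.

    Hypothetical helper equivalent to 'cp -r f5 lf1; cp -r f5 lf2; cp -r f5 lf3'
    run inside base_dir. Returns the list of directories it created.
    """
    base = Path(base_dir)
    created = []
    for name in targets:
        dest = base / name
        if dest.exists():
            continue  # leave existing directories untouched
        shutil.copytree(base / source, dest)
        created.append(dest)
    return created
```

Call it once before loading the definition, for example copy_script_dirs(some_dir), where some_dir is the directory holding f5.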

Autoarchive(0) can take up to one minute to take effect, because the server has a one-minute resolution.

