In the previous exercise we showed how ecflow provides simple load management.
The limits can still allow several hundred jobs to be submitted at once. This can cause problems:
- Excessive disk I/O during job generation
- Server busy with job generation and slow to respond to the GUI
- Overloaded queueing systems such as PBS/SLURM
Hence we need load management that can limit the number of submissions. A job takes a limit token when it is submitted, and the token is released as soon as the job becomes active.
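The difference between an ordinary limit and a limit on submission can be sketched with a small simulation. This is plain Python, not ecflow code: the function simulate, its parameters, and the tick-based timing are assumptions made only for illustration. A job holds a token while submitted; with an ordinary limit it keeps the token while active, whereas with a submission limit it gives the token back on becoming active.

```python
def simulate(n_jobs, limit, submit_ticks, run_ticks, submission_only):
    """Toy model (not ecflow): return (peak submitted jobs, peak active jobs)."""
    queued = n_jobs
    submitted = []  # remaining ticks each submitted job waits before becoming active
    active = []     # remaining ticks each active job still has to run
    peak_submitted = peak_active = 0

    while queued or submitted or active:
        # Count the tokens currently held.
        if submission_only:
            in_use = len(submitted)                # released once the job is active
        else:
            in_use = len(submitted) + len(active)  # held until the job completes
        start = min(max(limit - in_use, 0), queued)
        queued -= start
        submitted.extend([submit_ticks] * start)

        peak_submitted = max(peak_submitted, len(submitted))
        peak_active = max(peak_active, len(active))

        # Advance one tick: running jobs progress, finished ones drop out.
        active = [t - 1 for t in active if t > 1]
        still_submitted = []
        for t in submitted:
            if t > 1:
                still_submitted.append(t - 1)
            else:
                active.append(run_ticks)  # job becomes active
        submitted = still_submitted

    return peak_submitted, peak_active


# Nine jobs, a limit of 2, each job running for 20 ticks.
print(simulate(9, 2, 1, 20, submission_only=True))   # (2, 9)
print(simulate(9, 2, 1, 20, submission_only=False))  # (2, 2)
```

In this toy model the submission limit never lets more than two jobs sit in the submitted state, yet all nine jobs can end up running together, while the ordinary limit caps the total at two.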
Here is the simple illustration:
Text
Let us modify our suite definition file:
    # Definition of the suite test.
    suite test
       edit ECF_INCLUDE "$HOME/course"
       edit ECF_HOME "$HOME/course"
       limit l1 2
       family f5
          inlimit -s l1
          edit SLEEP 20
          task t1
          task t2
          task t3
          task t4
          task t5
          task t6
          task t7
          task t8
          task t9
       endfamily
    endsuite
Python
$HOME/course/test.py
    import os
    from ecflow import Defs,Suite,Family,Task,Edit,Trigger,Complete,Event,Meter,Time,Day,Date,Label, \
                       RepeatString,RepeatInteger,RepeatDate,InLimit,Limit

    def create_family_f5() :
        return Family("f5",
                    InLimit("l1"),
                    Edit(SLEEP=20),
                    [ Task('t{}'.format(i)) for i in range(1,10) ] )

    print("Creating suite definition")
    home = os.path.join(os.getenv("HOME"),"course")
    defs = Defs(
            Suite("test",
                Edit(ECF_INCLUDE=home,ECF_HOME=home),
                Limit("l1",2),
                create_family_f5()))
    print(defs)

    print("Checking job creation: .ecf -> .job0")
    print(defs.check_job_creation())

    print("Checking trigger expressions and inlimits")
    assert len(defs.check()) == 0,defs.check()

    print("Saving definition to file 'test.def'")
    defs.save_as_defs("test.def")
What to do
- Make the changes to the suite definition
- Replace the suite definition
- In ecflow_ui, observe the triggers of the limit l1
- Open the Info panel for l1
- Change the value of the limit
- Open the Why? panel for one of the queued tasks of /test/f5
Introduce an error in the limits and make sure that it is trapped, e.g. change the Limit so that its name no longer matches the InLimit:
Check InLimit/Limit references
    Limit("unknown",2)
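To see what trapping such an error amounts to, here is a minimal sketch of a reference check. It is not ecflow's implementation: unresolved_inlimits and its two arguments are hypothetical, and the sketch assumes that an inlimit given without a path is resolved by searching for a limit of that name on the node itself and then on each parent in turn.

```python
def unresolved_inlimits(limits_by_node, inlimits):
    """Hypothetical check: list the inlimits that match no limit up the node tree.

    limits_by_node: dict mapping a node path to the set of limit names defined there
    inlimits:       list of (node_path, limit_name) pairs
    """
    errors = []
    for node_path, name in inlimits:
        path = node_path
        found = False
        while path:
            if name in limits_by_node.get(path, set()):
                found = True
                break
            # Move to the parent node: "/test/f5" -> "/test" -> ""
            path = path.rsplit("/", 1)[0]
        if not found:
            errors.append("inlimit {} on {} has no matching limit".format(name, node_path))
    return errors


# The suite as defined above: limit l1 on /test, inlimit l1 on /test/f5.
print(unresolved_inlimits({"/test": {"l1"}}, [("/test/f5", "l1")]))

# The deliberate error: the limit is renamed to "unknown".
print(unresolved_inlimits({"/test": {"unknown"}}, [("/test/f5", "l1")]))
```

The first call returns an empty list and the second reports the dangling reference. In test.py the corresponding safeguard is the assert on defs.check(), which should fail once the limit has been renamed.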