In the previous exercise we showed how ecflow provides simple load management.
The limits can still allow several hundred jobs to be submitted at once. This can cause problems:
- Excessive disk I/O during job generation
- A server that is busy generating jobs and slow to respond to the GUI
- Overloaded queueing systems such as PBS/SLURM
Hence we need load management that can limit the number of submissions. With a submission limit (the -s option of inlimit), a job holds a limit token only while it is in the submitted state. When the job becomes active, the token is released.
With a limit of 2, we can therefore have more than 2 active jobs, since we only control the number in the submitted state.
If we remove the -s, we can have at most two active jobs running at one time.
This allows the suite to be configured according to the load that the disk I/O and the queueing system can sustain.
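The difference between the two modes can be sketched with a toy discrete-time model. This is only an illustration under simplifying assumptions, not how the ecFlow server is implemented: the function `simulate`, its parameters, and the fixed tick counts are all hypothetical. Each job spends `submit_ticks` in the submitted state and then `active_ticks` in the active state.

```python
def simulate(n_tasks=9, limit=2, submit_ticks=1, active_ticks=3, limit_submission=True):
    """Toy model of a limit; returns (peak_submitted, peak_active).

    limit_submission=True  mimics 'inlimit -s': a token is held only while submitted.
    limit_submission=False mimics a plain inlimit: a token is held until the job ends.
    """
    queued = n_tasks
    submitted = []  # remaining ticks in the submitted state, one entry per job
    active = []     # remaining ticks in the active state, one entry per job
    peak_submitted = peak_active = 0

    while queued or submitted or active:
        # Count the tokens currently consumed from the limit.
        in_use = len(submitted) if limit_submission else len(submitted) + len(active)

        # Submit queued jobs while tokens are free.
        while queued and in_use < limit:
            queued -= 1
            submitted.append(submit_ticks)
            in_use += 1

        peak_submitted = max(peak_submitted, len(submitted))
        peak_active = max(peak_active, len(active))

        # Advance one tick: active jobs finish, submitted jobs become active.
        active = [t - 1 for t in active if t > 1]
        still_submitted = []
        for t in submitted:
            if t > 1:
                still_submitted.append(t - 1)
            else:
                active.append(active_ticks)
        submitted = still_submitted

    return peak_submitted, peak_active


print(simulate(limit_submission=True))   # (2, 6): never more than 2 submitted, up to 6 active
print(simulate(limit_submission=False))  # (2, 2): never more than 2 submitted or active
```

In this model the submission limit caps the pressure on job generation and the queueing system, while the number of running jobs depends on how long each job stays active.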
Here is a simple illustration that modifies the previous example:
Text
Let us modify our suite definition file:
# Definition of the suite test.
suite test
   edit ECF_INCLUDE "$HOME/course"
   edit ECF_HOME    "$HOME/course"
   limit l1 2
   family f5
      inlimit -s l1   # by default consume 1 token from the limit l1
      edit SLEEP 20
      task t1
      task t2
      task t3
      task t4
      task t5
      task t6
      task t7
      task t8
      task t9
   endfamily
endsuite
Python
import os
from ecflow import Defs,Suite,Family,Task,Edit,Trigger,Complete,Event,Meter,Time,Day,Date,Label, \
                   RepeatString,RepeatInteger,RepeatDate,InLimit,Limit

def create_family_f5() :
    return Family("f5",
            # limit_name(l1),limit_path(""),no_of_tokens_to_consume(1),limit node(False), limit submission(True)
            InLimit("l1","",1,False,True),
            Edit(SLEEP=20),
            [ Task('t{}'.format(i)) for i in range(1,10) ] )

print("Creating suite definition")
home = os.path.join(os.getenv("HOME"),"course")
defs = Defs(
        Suite("test",
            Edit(ECF_INCLUDE=home,ECF_HOME=home),
            Limit("l1",2),
            create_family_f5()))
print(defs)

print("Checking job creation: .ecf -> .job0")
print(defs.check_job_creation())

print("Checking trigger expressions and inlimits")
assert len(defs.check()) == 0,defs.check()

print("Saving definition to file 'test.def'")
defs.save_as_defs("test.def")
What to do
- Edit the suite definition to apply the changes
- Replace the suite definition
- In ecflow_ui, observe the triggers of the limit l1
- Open the Info panel for l1
- Change the value of the limit
- Open the Why? panel for one of the queued tasks of /test/f5
- Introduce an error in the limits and make sure this error is trapped, i.e. change the Limit.
Check InLimit/Limit references:
    Limit("unknown",2)
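To see what the check reports without stopping on the assert, the last step can be tried in isolation. The helper below is a minimal sketch: the name `limit_reference_errors` is hypothetical, and it uses only the constructors and the `defs.check()` call that already appear in the Python definition above. It assumes that `check()` returns an empty string when all InLimit/Limit references resolve.

```python
def limit_reference_errors(limit_name):
    """Build family f5 against a limit called `limit_name` and return
    whatever Defs.check() reports (assumed empty when nothing is wrong)."""
    # Imported here so the helper can be defined even where ecflow is not installed.
    from ecflow import Defs, Suite, Family, Task, InLimit, Limit

    defs = Defs(
        Suite("test",
              Limit(limit_name, 2),                      # the limit that is defined
              Family("f5",
                     InLimit("l1", "", 1, False, True),  # the limit that is referenced
                     [Task("t{}".format(i)) for i in range(1, 10)])))
    return defs.check()
```

Calling `limit_reference_errors("l1")` should give an empty result, while `limit_reference_errors("unknown")` should give a message about the unresolved reference to l1, which is the error the assert in the suite definition script is meant to trap.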