...
Persistent interactive job with ecinteractive
To facilitate that task, we provide the ecinteractive tool. Its main features are the following:
- Only one interactive job is allowed at a time
- Your job keeps on running after you exit the interactive shell, so you can reattach to it any time or open multiple interactive shells within the same job.
- You may open a basic graphical desktop for X11 applications.
- You may open a Jupyter Lab instance and connect to it through your browser.
- By default it will submit to AA, but you can choose what complex (platform) to use.
- You can run ecinteractive from any Atos HPCF complex or from the Red Hat Linux VDI. You may also copy the script to your end user device and use it from there. It should work from Linux, macOS, or WSL under Windows, and requires the Teleport tsh client to be installed.
No Format |
---|
$ ecinteractive -h
Usage : /usr/local/bin/ecinteractive [options] [--]
-d|desktop Submits a vnc job (default is interactive ssh job)
-j|jupyter Submits a jupyter job (default is interactive ssh job)
More Options:
-h|help Display this message
-v|version Display script version
-p|platform Platform (default aa. Choices: aa, ab, ac, ad)
-u|user ECMWF User (default user)
-A|account Project account
-c|cpus Number of CPUs (default 2)
-m|memory Requested Memory (default 8G)
-s|tmpdirsize Requested TMPDIR size (default 3 GB)
-t|time Wall clock limit (default 06:00:00)
-k|kill Cancel any running interactive job
-q|query Check running job
-o|output Output file for the interactive job (default /dev/null)
-x set -x |
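The resource flags above can be combined freely. As a sketch of how they compose (the flag names are taken from the help output above; the wrapper function and the example values are illustrative, not part of ecinteractive itself):

```shell
#!/bin/bash
# Sketch: assemble an ecinteractive command line from the options documented
# in the help output above. The flags (-p, -c, -m, -t) are real; the wrapper
# function and the example values are hypothetical illustrations.
build_ecinteractive_cmd() {
    local cpus="$1" mem="$2" wall="$3" platform="${4:-aa}"   # aa is the documented default complex
    echo "ecinteractive -p $platform -c $cpus -m $mem -t $wall"
}

# Example: 8 CPUs, 32 GB of memory, 8 hours, on the default complex
build_ecinteractive_cmd 8 32G 08:00:00
```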
Creating an interactive job
You can get an interactive shell running on an allocated node within the Atos HPCF just by calling ecinteractive. With no options, it uses the following default settings:
...
If you need more resources, you may use the ecinteractive options when creating the job. For example, to get a shell with 4 CPUs and 16 GB of memory for 12 hours:
No Format |
---|
[user@aa6-100 ~]$ ecinteractive -c4 -m 16G -t 12:00:00
Submitted batch job 10225018
Waiting 5 seconds for the job to be ready...
Using interactive job:
CLUSTER JOBID STATE EXEC_HOST TIME_LIMIT TIME_LEFT MAX_CPUS MIN_MEMORY TRES_PER_NODE
aa 10225018 RUNNING aa6-104 12:00:00 11:59:55 4 16G ssdtmp:3G
To cancel the job:
/usr/local/bin/ecinteractive -k
Last login: Mon Dec 13 09:39:09 2021
[ECMWF-INFO-z_ecmwf_local.sh] /usr/bin/bash INTERACTIVE on aa6-104 at 20211213_093914.794, PID: 1736962, JOBID: 10225018
[ECMWF-INFO-z_ecmwf_local.sh] $SCRATCH=/ec/res4/scratch/user
[ECMWF-INFO-z_ecmwf_local.sh] $PERM=/ec/res4/perm/user
[ECMWF-INFO-z_ecmwf_local.sh] $HPCPERM=/ec/res4/hpcperm/user
[ECMWF-INFO-z_ecmwf_local.sh] $TMPDIR=/etc/ecmwf/ssd/ssd1/tmpdirs/user.10225018
[ECMWF-INFO-z_ecmwf_local.sh] $SCRATCHDIR=/ec/res4/scratchdir/user/8/10225018
[ECMWF-INFO-z_ecmwf_local.sh] $_EC_ORIG_TMPDIR=N/A
[ECMWF-INFO-z_ecmwf_local.sh] $_EC_ORIG_SCRATCHDIR=N/A
[ECMWF-INFO-z_ecmwf_local.sh] Job 10225018 time left: 11:59:54
[user@aa6-104 ~]$ |
Note |
---|
If you log out, the job continues to run until it is explicitly cancelled or reaches the time limit. |
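Note in the session banner above that $TMPDIR and $SCRATCHDIR are created per job: their paths end in the Slurm job ID, so their contents are tied to the lifetime of the interactive job. A minimal sketch, assuming the `<user>.<jobid>` basename layout shown in the banner, of recovering the job ID from such a path:

```shell
# Sketch: extract the Slurm job ID from a per-job TMPDIR path.
# Assumes the "<user>.<jobid>" basename layout shown in the banner above;
# the path below is the example value from that session.
tmpdir="/etc/ecmwf/ssd/ssd1/tmpdirs/user.10225018"
jobid="${tmpdir##*.}"   # strip everything up to and including the last dot
echo "$jobid"
```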
Reattaching to an existing interactive job
Once you have an interactive job running, you may reattach to it, or open several shells within that job, by calling ecinteractive again.
Note |
---|
If you have a job already running, ecinteractive will always attach you to that one regardless of the resource options you pass. If you wish to run a job with different settings, you will have to cancel it first. |
No Format |
---|
[user@aa6-100 ~]$ ecinteractive
Using interactive job:
CLUSTER JOBID STATE EXEC_HOST TIME_LIMIT TIME_LEFT MAX_CPUS MIN_MEMORY TRES_PER_NODE
aa 10225018 RUNNING aa6-104 12:00:00 11:57:56 4 16G ssdtmp:3G
WARNING: Your existing job 10225018 may have a different setup than requested. Cancel the existing job and rerun if you with to run with different setup
To cancel the job:
/usr/local/bin/ecinteractive -k
Last login: Mon Dec 13 09:39:14 2021 from aa6-100.bullx
[ECMWF-INFO-z_ecmwf_local.sh] /usr/bin/bash INTERACTIVE on aa6-104 at 20211213_094114.197, PID: 1742608, JOBID: 10225018
[ECMWF-INFO-z_ecmwf_local.sh] $SCRATCH=/ec/res4/scratch/user
[ECMWF-INFO-z_ecmwf_local.sh] $PERM=/ec/res4/perm/user
[ECMWF-INFO-z_ecmwf_local.sh] $HPCPERM=/ec/res4/hpcperm/user
[ECMWF-INFO-z_ecmwf_local.sh] $TMPDIR=/etc/ecmwf/ssd/ssd1/tmpdirs/user.10225018
[ECMWF-INFO-z_ecmwf_local.sh] $SCRATCHDIR=/ec/res4/scratchdir/user/8/10225018
[ECMWF-INFO-z_ecmwf_local.sh] $_EC_ORIG_TMPDIR=N/A
[ECMWF-INFO-z_ecmwf_local.sh] $_EC_ORIG_SCRATCHDIR=N/A
[ECMWF-INFO-z_ecmwf_local.sh] Job 10225018 time left: 11:57:54
[user@aa6-104 ~]$ |
Checking the status of a running interactive job
You may query ecinteractive for existing interactive jobs, and you can do so from within or outside the job. This may be useful to see how much time is left.
No Format |
---|
[user@aa6-100 ~]$ ecinteractive -q
CLUSTER JOBID STATE EXEC_HOST TIME_LIMIT TIME_LEFT MAX_CPUS MIN_MEMORY TRES_PER_NODE
aa 10225018 RUNNING aa6-104 12:00:00 11:55:40 4 16G ssdtmp:3G |
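Because `ecinteractive -q` prints a simple table, the remaining time can also be picked out in a script. A minimal sketch, assuming the column layout shown above (TIME_LEFT is the sixth column of the data row); the here-string stands in for a live `ecinteractive -q` call:

```shell
# Sketch: extract TIME_LEFT from `ecinteractive -q` output.
# The sample text is the query output shown above; in a real script you
# would pipe `ecinteractive -q` into awk instead.
query_output='CLUSTER JOBID STATE EXEC_HOST TIME_LIMIT TIME_LEFT MAX_CPUS MIN_MEMORY TRES_PER_NODE
aa 10225018 RUNNING aa6-104 12:00:00 11:55:40 4 16G ssdtmp:3G'

# NR==2 selects the data row; $6 is the TIME_LEFT column
time_left=$(echo "$query_output" | awk 'NR==2 {print $6}')
echo "$time_left"
```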
Killing/Cancelling a running interactive job
Logging out of your interactive shells spawned through ecinteractive will not cancel the job. If you have finished working with it, you should cancel it with:
No Format |
---|
[user@aa6-100 ~]$ ecinteractive -k
cancelling job 10225018...
CLUSTER JOBID STATE EXEC_HOST TIME_LIMIT TIME_LEFT MAX_CPUS MIN_MEMORY TRES_PER_NODE
aa 10225018 RUNNING aa6-104 12:00:00 11:55:34 4 16G ssdtmp:3G
Cancel job_id=10225018 name=user-ecinteractive partition=inter [y/n]? y
Connection to aa-login closed. |
Opening graphical applications within your interactive job
If you need to run graphical applications, you can do so through standard X11 forwarding.
- If running it from an Atos HPCF login node, make sure you have connected there with ssh -X and that you have a working X11 server on your end user device (e.g. XQuartz on macOS; MobaXterm, Xming or similar on Windows)
- If running it from the Red Hat Linux VDI, it should work out of the box
- If running it from your end user device, make sure you have a working X11 server on it (e.g. XQuartz on macOS; MobaXterm, Xming or similar on Windows)
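In all three cases, X11 forwarding only works if a DISPLAY is set in your session. A quick generic check (this is standard X11 behaviour, not specific to ecinteractive):

```shell
# Quick check that an X11 display is available before launching a GUI app.
# $DISPLAY is set by `ssh -X` (or by your local X server); this is generic
# X11 behaviour, not an ecinteractive feature.
check_x11() {
    if [ -n "$DISPLAY" ]; then
        echo "X11 display available: $DISPLAY"
    else
        echo "No DISPLAY set - reconnect with 'ssh -X' or start an X server" >&2
        return 1
    fi
}

check_x11 || true
```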
Alternatively, you may use ecinteractive to open a basic window manager on the allocated interactive node; it will then open a VNC client on your end user device to connect to the desktop running on that node:
No Format |
---|
[user@aa6-100 ~]$ ecinteractive -d
Submitted batch job 10225277
Waiting 5 seconds for the job to be ready...
Using interactive job:
CLUSTER JOBID STATE EXEC_HOST TIME_LIMIT TIME_LEFT MAX_CPUS MIN_MEMORY TRES_PER_NODE
aa 10225277 RUNNING aa6-104 6:00:00 5:59:55 2 8G ssdtmp:3G
To cancel the job:
/usr/local/bin/ecinteractive -k
Attaching to vnc session...
To manually re-attach:
vncviewer -passwd ~/.vnc/passwd aa6-104:9598 |
Opening a Jupyter Lab instance
You can also use ecinteractive to open up a Jupyter Lab instance very easily:
...