...
However, note that they are time constrained: once the job has reached its time limit, they will be closed.
Main Features
The main features of this ecinteractive tool are the following:
...
No Format |
---|
$ ecinteractive -h
Usage : /usr/local/bin/ecinteractive [options] [--]
-d|desktop Submits a vnc job (default is interactive ssh job)
-j|jupyter Submits a jupyter job (default is interactive ssh job)
-J|jupyters Submits a jupyter job with HTTPS support (default is interactive ssh job)
More Options:
-h|help Display this message
-v|version Display script version
-p|platform Platform (default aa. Choices: aa, ab, ac, ad, ecs)
-u|user ECMWF User (default user)
-A|account Project account
-c|cpus Number of CPUs (default 2)
-m|memory Requested Memory (default 8G)
-s|tmpdirsize Requested TMPDIR size (default 3 GB)
-t|time Wall clock limit (default 12:00:00)
-k|kill Cancel any running interactive job
-q|query Check running job
-Q|quiet Silent mode
-o|output Output file for the interactive job (default /dev/null)
-x set -x |
Creating an interactive job
Note |
---|
title | Before you start: Set up your SSH key-based authentication |
---|
|
For ecinteractive to work properly, passwordless ssh must be configured between Atos HPCF nodes. See HPC2020: How to connect for more information on how to set it up. |
You can get an interactive shell running on an allocated node within the Atos HPCF by simply calling ecinteractive. By default it uses the following settings:
Cpus | 2 |
---|---|
Memory | 8 GB |
Time | 12 hours |
TMPDIR size | 3 GB |
If you need more resources, you may use the ecinteractive options when creating the job. For example, to get a shell with 4 cpus and 16 GB of memory for 12 hours:
...
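As a hedged sketch of the example described above, the short options listed in the usage message (-c, -m and -t) could be combined as follows. The helper name eci_command is hypothetical and only prints the command line, so that it can be inspected before running it on the HPCF; the 16G and 12:00:00 value formats are assumed from the defaults shown in the help text.

```shell
# Hypothetical helper (not part of ecinteractive): print the command line
# for a session with the given CPUs, memory and wall clock limit.
eci_command() {
    printf 'ecinteractive -c %s -m %s -t %s\n' "$1" "$2" "$3"
}

# 4 CPUs, 16 GB of memory, 12 hours
eci_command 4 16G 12:00:00
```

Running the printed command on an Atos HPCF login node would then request the session.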
Note |
---|
If you log out, the job continues to run until it is explicitly cancelled or reaches its time limit. |
The maximum resources you can request for your interactive session are those described for ni (or ei for ecs users) in HPC2020: Batch system.
Reattaching to an existing interactive job
...
No Format |
---|
[user@aa6-100 ~]$ ecinteractive
Using interactive job:
CLUSTER JOBID STATE EXEC_HOST TIME_LIMIT TIME_LEFT MAX_CPUS MIN_MEMORY TRES_PER_NODE
aa 10225018 RUNNING aa6-104 12:00:00 11:57:56 4 16G ssdtmp:3G
WARNING: Your existing job 10225018 may have a different setup than requested. Cancel the existing job and rerun if you wish to run with different setup
To cancel the job:
/usr/local/bin/ecinteractive -k
Last login: Mon Dec 13 09:39:14 2021 from aa6-100.bullx
[ECMWF-INFO-z_ecmwf_local.sh] /usr/bin/bash INTERACTIVE on aa6-104 at 20211213_094114.197, PID: 1742608, JOBID: 10225018
[ECMWF-INFO-z_ecmwf_local.sh] $SCRATCH=/ec/res4/scratch/user
[ECMWF-INFO-z_ecmwf_local.sh] $PERM=/ec/res4/perm/user
[ECMWF-INFO-z_ecmwf_local.sh] $HPCPERM=/ec/res4/hpcperm/user
[ECMWF-INFO-z_ecmwf_local.sh] $TMPDIR=/etc/ecmwf/ssd/ssd1/tmpdirs/user.10225018
[ECMWF-INFO-z_ecmwf_local.sh] $SCRATCHDIR=/ec/res4/scratchdir/user/8/10225018
[ECMWF-INFO-z_ecmwf_local.sh] $_EC_ORIG_TMPDIR=N/A
[ECMWF-INFO-z_ecmwf_local.sh] $_EC_ORIG_SCRATCHDIR=N/A
[ECMWF-INFO-z_ecmwf_local.sh] Job 10225018 time left: 11:57:54
[user@aa6-104 ~]$ |
Note |
---|
title | Race conditions possible |
---|
|
If you start ecinteractive on several terminals in quick succession while no interactive job is running yet, more than one interactive job may be submitted. If that happens, it is best to cancel all of them and rerun a single ecinteractive, waiting for it to be ready before opening other parallel sessions: No Format |
---|
for j in $(ecsqueue -ho "%i" -u $USER -q ni); do ecscancel $j; done |
|
Checking the status of a running interactive job
You may query ecinteractive for existing interactive jobs, and you can do so from within or outside the job. This can be useful to see how much time is left.
No Format |
---|
[user@aa6-100 ~]$ ecinteractive -q
CLUSTER JOBID STATE EXEC_HOST TIME_LIMIT TIME_LEFT MAX_CPUS MIN_MEMORY TRES_PER_NODE
aa 10225018 RUNNING aa6-104 12:00:00 11:55:40 4 16G ssdtmp:3G |
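If you only need the remaining time, for instance in a script, the TIME_LEFT column can be picked out of the query output. The following is a minimal sketch that assumes the column layout shown above; the function name time_left is hypothetical, and the sample text is copied from this page so that the snippet can be tried anywhere.

```shell
# Sample text copied from the query output above; in a real session it would
# come from: ecinteractive -q
sample_output='CLUSTER JOBID STATE EXEC_HOST TIME_LIMIT TIME_LEFT MAX_CPUS MIN_MEMORY TRES_PER_NODE
aa 10225018 RUNNING aa6-104 12:00:00 11:55:40 4 16G ssdtmp:3G'

# Locate the TIME_LEFT column by its header name, then print that field
# for every job row that follows.
time_left() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "TIME_LEFT") col = i; next }
         col { print $col }'
}

printf '%s\n' "$sample_output" | time_left
```

Looking the column up by name rather than by position keeps the sketch working if columns are added or reordered, as long as the header line is still present.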
...
No Format |
---|
[user@aa6-100 ~]$ ecinteractive -d
Submitted batch job 10225277
Waiting 5 seconds for the job to be ready...
Using interactive job:
CLUSTER JOBID STATE EXEC_HOST TIME_LIMIT TIME_LEFT MAX_CPUS MIN_MEMORY TRES_PER_NODE
aa 10225277 RUNNING aa6-104 6:00:00 5:59:55 2 8G ssdtmp:3G
To cancel the job:
/usr/local/bin/ecinteractive -k
Attaching to vnc session...
To manually re-attach:
vncviewer -passwd ~/.vnc/passwd aa6-104:9598 |
Excerpt |
---|
Opening a Jupyter Lab instance |
...
Show If |
---|
|
Tip |
---|
| You may find it more convenient to use ECMWF's Jupyterhub instead of ecinteractive to run Jupyter on HPCF or ECS. Only your browser is required to access this service. |
|
You can use ecinteractive to open up a Jupyter Lab instance on the HPCF. The application runs on the node allocated for the job, and you can conveniently interact with it from your browser. When running from VDI or your end user device, ecinteractive will try to open it in a new tab automatically. Alternatively you may manually open the URL provided to connect to your Jupyter Lab session. No Format |
---|
[user@aa6-100 ~]$ ecinteractive -j
Using interactive job:
CLUSTER JOBID STATE EXEC_HOST TIME_LIMIT TIME_LEFT MAX_CPUS MIN_MEMORY TRES_PER_NODE
aa 10225277 RUNNING aa6-104 6:00:00 5:58:07 2 8G ssdtmp:3G
To cancel the job:
/usr/local/bin/ecinteractive -k
Attaching to Jupyterlab session...
To manually re-attach go to http://aa6-104.ecmwf.int:33698/?token=b1624da17308654986b1fd66ef82b9274401ea8982f3b747 |
To use your own conda environment as a kernel for Jupyter notebook, you need to have ipykernel installed in that conda environment before starting the ecinteractive job. ipykernel can be installed with: No Format |
---|
[user@aa6-100 ~]$ conda activate {myEnv}
[user@aa6-100 ~]$ conda install ipykernel
[user@aa6-100 ~]$ python3 -m ipykernel install --user --name={myEnv} |
The same is true if you want to make your own Python virtual environment visible in Jupyterlab: No Format |
---|
[user@aa6-100 ~]$ source {myEnv}/bin/activate
[user@aa6-100 ~]$ pip3 install ipykernel
[user@aa6-100 ~]$ python3 -m ipykernel install --user --name={myEnv} |
To remove your personal kernels from Jupyterlab once you no longer need them, run: No Format |
---|
jupyter kernelspec uninstall {myEnv} |
|
...
Expand |
---|
title | How to connect to the Jupyter Lab instance from a web browser on your end user device |
---|
| - Authenticate with Teleport using jump.ecmwf.int as described in the Teleport SSH Access page.
- Copy the ecinteractive script from the HPC to your local machine:
Code Block |
---|
| scp username@hpc-login:/usr/local/bin/ecinteractive . |
- Run the ecinteractive script on your local machine:
Code Block |
---|
| ./ecinteractive -u username -j |
Use "-u" to provide your ECMWF username if the username on your local machine does not match the one on the Atos HPC.
The ecinteractive job will start and a new tab should open automatically in your web browser, attached to the Jupyter server running on the HPC. If it does not, you will get a link to paste manually into a web browser; it only works on your local machine and should look like this: http://localhost:.....
To kill your ecinteractive job from the local machine use:
Code Block |
---|
| ./ecinteractive -p hpc -k -u username |
|
HTTPS access
If you wish to run Jupyter Lab on HTTPS instead of plain HTTP, you may use the -J option in ecinteractive. In that case, a personal SSL certificate is created under ~/.ssl the first time, and is used to encrypt the HTTP traffic between your browser and the compute node. In order to avoid browser security warnings, you may fetch the ~/.ssl/selfCA.crt certificate from the HPCF and import it into your browser as a trusted Certificate Authority. This is only needed once.
Customising your jupyter version and environment
By default, ecinteractive will start the jupyterlab coming from the default version of python 3. If you wish to customise the version of python or jupyterlab, or simply want to tailor its environment in your ecinteractive session, create the following file in your Atos HPCF HOME: No Format |
---|
~/.ecinteractive/jupyter_setup.sh |
Then add to it the commands needed to set up the environment so that the jupyter and node commands can be found in the path. The following is equivalent to the default behaviour: No Format |
---|
module load python3 node |
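The steps above can be sketched as a short shell snippet, to be run on the Atos HPCF. It creates the directory if needed and writes the default commands into the setup file. Note that it overwrites any existing ~/.ecinteractive/jupyter_setup.sh, and that the variable name setup_file is only used for illustration.

```shell
# Path of the setup file, as given in the text above.
setup_file="$HOME/.ecinteractive/jupyter_setup.sh"

# Create the ~/.ecinteractive directory if it does not exist yet.
mkdir -p "$(dirname "$setup_file")"

# Write the commands that reproduce the default behaviour
# (this replaces any previous content of the file).
cat > "$setup_file" <<'EOF'
module load python3 node
EOF

# Show the result for a quick check.
cat "$setup_file"
```

You can then edit the file to load different modules or activate your own environment.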
Examples of contents for ~/.ecinteractive/jupyter_setup.sh: Code Block |
---|
language | bash |
---|
title | Using the newest Python |
---|
| module load python3/new node/new |
Code Block |
---|
language | bash |
---|
title | Using jupyter from a custom conda environment |
---|
| module load conda
conda activate myjupyterenv |
|