If you wish to run interactively but are constrained by the limits on CPUs, CPU time or memory, you may run a small interactive job requesting the resources you need.

By doing that, you will get a dedicated allocation of CPUs and memory to run your application interactively.

Using srun directly

If you have a single script or command you wish to run interactively, one way to do this through the batch system is with a direct call to srun from within a session on the login node. It feels as if you were running locally, but the command actually executes in a job with dedicated resources:

$ cat myscript.sh 
#!/bin/bash
echo "This is my super script"
echo "Doing some heavy work on $HOSTNAME..."
$ ./myscript.sh 
This is my super script
Doing some heavy work on at1-11...
$ srun ./myscript.sh 
This is my super script
Doing some heavy work on at1-105...

In that example the submitted job runs with the default settings (default QoS, just 1 CPU and default memory). You can of course pass additional options to srun to customise the resources allocated to this interactive job. For example, to run with 4 CPUs, 12 GB of memory and a limit of 6 hours:

$ srun -c 4 --mem=12G -t 06:00:00 ./myscript.sh

Check man srun for a complete list of options.
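Inside such an allocation, a script can verify that it really is running under SLURM on a compute node rather than on the login node. A minimal sketch, relying only on the standard SLURM behaviour of exporting SLURM_JOB_ID to every job step (the function name is illustrative, not part of any site tooling):

```shell
#!/bin/bash
# slurm_context: report whether this shell is inside a SLURM allocation.
# SLURM exports SLURM_JOB_ID to every job step; on a login node it is unset.
slurm_context() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "inside SLURM job ${SLURM_JOB_ID}"
    else
        echo "outside SLURM"
    fi
}

slurm_context
```

Adding such a guard to a script like myscript.sh makes it easy to confirm that `srun ./myscript.sh` executed on a compute node and not locally.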

Persistent interactive job with ecinteractive

However, you may want an interactive session that persists, so you can detach from it and reattach later. To facilitate that task, we provide the ecinteractive tool:

$ ecinteractive -h
Usage :  /usr/local/bin/ecinteractive [options] [--]

  -d|desktop     Submits a vnc job (default is interactive ssh job)

  More Options:
  -h|help        Display this message
  -v|version     Display script version
  -A|account     Project account
  -c|cpus        Number of CPUs (default 2)
  -m|memory      Requested Memory (default 4G)
  -t|time        Wall clock limit (default 06:00:00)
  -r|reservation Submit the job into a SLURM reservation
  -g|cgroups     Launch cgroups watcher
  -k|kill        scancel the running job (if any). To cancel vnc jobs, use together with -d
  -x             set -x

Main features

Getting a shell with 4 CPUs and 16 GB of memory for 12 hours

[user@at1-11 ~]$ ecinteractive -k
cancelling job...
    JOBID       NAME  USER   QOS    STATE       TIME TIME_LIMIT NODES      FEATURES NODELIST(REASON)
    63769 user-ecint  user    ni  RUNNING       0:56   12:00:00     1        (null) at1-103
[user@at1-11 ~]$ ecinteractive -c 4 -m 16G -t 12:00:00

Interactive batch job is launched with following resources:
  Maximum run time (hours:min:sec): 12:00:00
  Maximum memory (MB): 16G
  Number of cores/threads: 4
Submitted batch job 63770
Found 1 interactive job running on at1-103 ... attaching to it
To manually re-attach:
        ssh at1-103
To cancel the job on tems:
        /usr/local/bin/ecinteractive -c 4 -m 16G -t 12:00:00 -k
[ECMWF-INFO-z_ecmwf_local.sh] /usr/bin/bash INTERACTIVE on at1-103 at 20210319_174542.052, PID: 428874, JOBID: 63770
[ECMWF-INFO-z_ecmwf_local.sh] $SCRATCH=/lus/pfs1/scratch/user
[ECMWF-INFO-z_ecmwf_local.sh] $PERM=/perm/user
[ECMWF-INFO-z_ecmwf_local.sh] $HPCPERM=/lus/pfs1/hpcperm/user
[ECMWF-INFO-z_ecmwf_local.sh] $TMPDIR=/etc/ecmwf/ssd/ssd1/tmpdirs/user.63770
[ECMWF-INFO-z_ecmwf_local.sh] $SCRATCHDIR=/lus/pfs1/scratchdir/user/0/63770
[ECMWF-INFO-z_ecmwf_local.sh] $_EC_ORIG_TMPDIR=N/A
[ECMWF-INFO-z_ecmwf_local.sh] $_EC_ORIG_SCRATCHDIR=N/A
[user@at1-103 ~]$ 

Reattaching to an existing interactive job

[user@at1-11 ~]$ ecinteractive -c 4 -m 16G -t 12:00:00
Found 1 interactive job running on at1-103 ... attaching to it
To manually re-attach:
        ssh at1-103
To cancel the job on tems:
        /usr/local/bin/ecinteractive -c 4 -m 16G -t 12:00:00 -k
[ECMWF-INFO-z_ecmwf_local.sh] /usr/bin/bash INTERACTIVE on at1-103 at 20210319_174956.074, PID: 429252, JOBID: 63770
[ECMWF-INFO-z_ecmwf_local.sh] $SCRATCH=/lus/pfs1/scratch/user
[ECMWF-INFO-z_ecmwf_local.sh] $PERM=/perm/user
[ECMWF-INFO-z_ecmwf_local.sh] $HPCPERM=/lus/pfs1/hpcperm/user
[ECMWF-INFO-z_ecmwf_local.sh] $TMPDIR=/etc/ecmwf/ssd/ssd1/tmpdirs/user.63770
[ECMWF-INFO-z_ecmwf_local.sh] $SCRATCHDIR=/lus/pfs1/scratchdir/user/0/63770
[ECMWF-INFO-z_ecmwf_local.sh] $_EC_ORIG_TMPDIR=N/A
[ECMWF-INFO-z_ecmwf_local.sh] $_EC_ORIG_SCRATCHDIR=N/A
[user@at1-103 ~]$ 

Killing a running interactive job

[user@at1-11 ~]$ ecinteractive -k
cancelling job...
    JOBID       NAME  USER   QOS    STATE       TIME TIME_LIMIT NODES      FEATURES NODELIST(REASON)
    63770 user-ecint  user    ni  RUNNING       5:31   12:00:00     1        (null) at1-103
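ecinteractive -k is a convenience wrapper; the same cleanup can be done with the standard SLURM commands if you know or look up the job id. The job id below is taken from the transcript above purely for illustration:

```shell
# List your running jobs; the ecinteractive job appears with a
# name of the form <user>-ecint, as in the output above:
squeue -u $USER

# Cancel it by job id (63770 is the example job shown above):
scancel 63770
```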

Opening a graphical desktop within your interactive job

[user@at1-11 ~]$ ecinteractive -c 4 -m 16G -t 12:00:00 -d

Interactive batch job is launched with following resources:
  Maximum run time (hours:min:sec): 12:00:00
  Maximum memory (MB): 16G
  Number of cores/threads: 4
Submitted batch job 63771
A vnc session job is running on tems node at1-103 - this tool will re-attach to it.
To manually re-attach:
        vncviewer -passwd ~/.vnc/passwd at1-103:9598
To cancel the job on tems:
        /usr/local/bin/ecinteractive -c 4 -m 16G -t 12:00:00 -d -k

TigerVNC Viewer 64-bit v1.10.1
Built on: 2020-10-06 13:51
Copyright (C) 1999-2019 TigerVNC Team and many others (see README.rst)
See https://www.tigervnc.org for information on TigerVNC.

Fri Mar 19 17:52:35 2021
 DecodeManager: Detected 256 CPU core(s)
 DecodeManager: Creating 4 decoder thread(s)
 CConn:       Connected to host at1-103 port 9598
 CConnection: Server supports RFB protocol version 3.8
 CConnection: Using RFB protocol version 3.8
 CConnection: Choosing security type VeNCrypt(19)
 CVeNCrypt:   Choosing security type VncAuth (2)
 CConn:       Using pixel format depth 24 (32bpp) little-endian rgb888
 CConnection: Enabling continuous updates