For transfers to ECMWF, we recommend using rsync which will transfer the files over an ssh connection. For that, you will need to have Teleport configured with the appropriate settings in your ssh config file.

Any file transfer tool that supports SSH and the ProxyJump feature should work, such as the command line tools sftp or scp. Alternatively, you may use the Linux Virtual Desktop and its folder sharing capabilities to copy local files to your ECMWF HOME or PERM.

For transfers from ECMWF to external sites, you may use tools such as ectrans, ftp, scp, sftp or rsync, depending on the protocol used on the external site.


Transfers to ECMWF

No filesystems from other platforms are cross-mounted, so you will need to copy over what you need.

First of all, make sure you have set up password-less authentication as described in HPC2020: How to connect before proceeding.

For transfers, we recommend using rsync which will transfer the files over an ssh connection.

Note

All examples are done with the generic HPCF login node hpc-login, but you should use ecs-login if you don't have access to the full HPCF service.

...

Please note that the examples below assume that you have a valid teleport session established. 
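Before starting a long transfer, it can help to confirm that the session works without any interactive prompt. The following is a minimal sketch, assuming your ssh config already resolves hpc-login; the function name check_login is hypothetical and not part of any ECMWF tooling.

```shell
# check_login HOST: succeed only if a non-interactive ssh login to HOST works.
# BatchMode=yes makes ssh fail instead of prompting for a password, and
# ConnectTimeout bounds how long the connection attempt may hang.
check_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Example (not run here):
#   check_login hpc-login && echo "ready to transfer"
```

If the check fails, re-establishing the Teleport session is the first thing to try.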

Transferring a directory tree between your computer and the Atos HPCF with rsync

This is the recommended and most versatile option. You may initiate the transfer from your computer for the standard set of filesystems:

No Format
user@yourlaptop:~> rsync -avz mydataset hpc-login:/scratch/user/

This command can be run multiple times, since only new or modified files will be transferred.

Tip

You may add the --delete option if you also wish to delete files on the destination that have been removed from the source.
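Because --delete removes files on the destination, it may be worth previewing the effect first. The sketch below wraps the two steps in small functions; the names mirror_preview and mirror_apply are hypothetical, and the paths in the example comments are the ones used in this section.

```shell
# mirror_preview SRC DEST: list what a mirrored transfer would change,
# including deletions, without modifying anything (-n is --dry-run).
mirror_preview() {
    rsync -avzn --delete "$1" "$2"
}

# mirror_apply SRC DEST: perform the same transfer for real.
mirror_apply() {
    rsync -avz --delete "$1" "$2"
}

# Example (not run here):
#   mirror_preview mydataset hpc-login:/scratch/user/
#   mirror_apply   mydataset hpc-login:/scratch/user/
```

In the preview output, files that would be removed from the destination are reported on lines starting with "deleting".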

Transferring a single file between your computer and the Atos HPCF with scp

You may initiate the transfer from your computer:

No Format
user@yourlaptop:~> scp myfile hpc-login:/scratch/user/

Note that running the command multiple times will always overwrite the file on the destination.
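If overwriting is not wanted, one option is to check for the file on the remote side before copying. This is only a sketch: put_if_absent is a hypothetical helper, and it assumes paths without spaces or shell metacharacters, since the remote path is interpreted by the remote shell.

```shell
# put_if_absent FILE HOST DIR: copy FILE to HOST:DIR only when no file of
# the same name already exists there.
put_if_absent() {
    file=$1 host=$2 dir=$3
    name=$(basename "$file")
    # "test -e" runs on the remote host and succeeds if the path exists.
    if ssh "$host" test -e "$dir/$name"; then
        echo "skipping: $host:$dir/$name already exists" >&2
        return 1
    fi
    scp "$file" "$host:$dir/"
}

# Example (not run here):
#   put_if_absent myfile hpc-login /scratch/user
```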

Transferring a single file between your computer and the Atos HPCF with sftp

You may initiate the transfer from your computer:

No Format
user@yourlaptop:~> sftp hpc-login
sftp> cd /scratch/user
sftp> put myfile

Note that running the command multiple times will always overwrite the file on the destination.

Transferring files between your computer and the Atos HPCF or ECFS domains via ECaccess gateway

For a data transfer from an Internet site to ECMWF, invoke a connection to the ECaccess gateway in Bologna:

No Format
user@yourlaptop:~> sftp user@boaccess.ecmwf.int
sftp> cd ECSCRATCH
sftp> put MyFile


Advanced: High Performance Transfers with bbcp

You may also use a specialised tool called bbcp for best transfer rates. It is available on Atos HPCF. This tool is not as flexible as rsync when it comes to updating existing or partial copies, but it should be quicker when doing a one-off transfer.

For example, to transfer a directory called mydataset from your computer to your SCRATCH on the Atos HPCF, you could run:

No Format
user@yourlaptop:~> bbcp -rp -s 10 -P 2 mydataset hpc-login:/scratch/user/

The options above would perform a recursive copy, preserving file permissions and using 10 parallel streams. It would also report progress every 2 seconds.

You may find all the details in the bbcp official documentation.

Note
Copying soft links

bbcp will not copy soft links, so if you are copying an entire directory structure that contains them, you may need to copy them over at a second stage with another tool like rsync.
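To see whether a directory tree is affected, the soft links can be listed before the transfer. The sketch below uses a hypothetical helper, list_softlinks; the follow-up rsync in the comments reuses the example paths from this section.

```shell
# list_softlinks DIR: print every symbolic link below DIR, one per line.
list_softlinks() {
    find "$1" -type l
}

# Example (not run here): after the bbcp transfer, a follow-up rsync can
# pick up the links, since -a copies symbolic links as links.
#   list_softlinks mydataset
#   rsync -av mydataset hpc-login:/scratch/user/
```

Provided the first copy preserved sizes and modification times, the second pass with rsync should skip the regular files that are already in place.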

Fetching data from ECMWF to your computer

Transferring a directory tree between the Atos HPCF and your computer with rsync

This is the recommended and most versatile option. You may initiate the transfer from your computer for the standard set of filesystems:

No Format
user@yourlaptop:~> rsync -avz hpc-login:/scratch/user/mydataset .

This command can be run multiple times, since only new or modified files will be transferred.

Tip

You may add the --delete option if you also wish to delete files on the destination that have been removed from the source.

Transferring a single file between the Atos HPCF and your computer with scp

You may initiate the transfer from your computer:

No Format
user@yourlaptop:~> scp hpc-login:/scratch/user/myfile .

Note that running the command multiple times will always overwrite the file on the destination.

Transferring a single file between the Atos HPCF and your computer with sftp

You may initiate the transfer from your computer:

No Format
user@yourlaptop:~> sftp hpc-login
sftp> cd /scratch/user
sftp> get myfile

Note that running the command multiple times will always overwrite the file on the destination.

Transferring files between the Atos HPCF or ECFS domains and your computer via ECaccess gateway

For a data transfer from ECMWF to an Internet site, invoke a connection to the ECaccess gateway in Bologna:

No Format
user@yourlaptop:~> sftp user@boaccess.ecmwf.int
sftp> cd ECHOST/hpc-login/hpcperm/user
sftp> get MyFile

Pushing data from ECMWF to external sites

There are various methods for transferring files from ECMWF to remote sites. Which is best suited for your requirements depends on the configuration of the remote system. If you are not sure what it is, please ask your Computing Representative or system administrator.

In general you have the following options:

ectrans

The ectrans command, part of the ECaccess framework, allows files to be transferred securely and unattended, as it does not require a password to be specified for the remote host: the ECaccess gateway performs the security checking. Ideally, an ECaccess gateway should be installed at the remote site to which you want to transfer; if there is no remote site ECaccess gateway, you can use ectrans via the ECMWF gateway (ecaccess.ecmwf.int), provided that the destination site is accessible from the ECMWF gateway via (s)ftp.

Before you can make use of ectrans, you need to declare an ECtrans association (msuser) for the storage/retrieval of the remote file. This can be done either with the ECaccess tool ecaccess-association-put or through the ECaccess web interface of the gateway you want to transfer to (or at http://ecaccess.ecmwf.int, if no gateway is installed at the remote site). For every msuser declaration, the hostname and the login username and password are requested and stored on the gateway in encrypted form. After these preliminaries you should be able to use ectrans as described in its help page:

No Format
usage: ectrans [-gateway name] -remote msuser@[destination] \
          [-get|-put] -source [ec:|ectmp:]filename [args ...] (*)
        ectrans -check requestID (*)

 -gateway  {arg} - access gateway name (default (**): ecaccess.ecmwf.int)
 -remote   {arg} - access method (default (**): *none*)
 -source   {arg} - source file name
 -target   {arg} - target file name (default: same as -source)
 -mailto   {arg} - target email address (default: current user)
 -lifetime {arg} - lifetime of the file in the spool (default: 1w) (***) (****)
 -delay    {arg} - transmission delay (default: immediate transfer) (***) (****)
 -at       {arg} - transmission date (default: immediate transfer) (****)
 -format   {arg} - define the date format as used with -at (default: yyyyMMddHHmmss)
 -retryCnt {arg} - define the number of retries (default: async=144, sync=0)
 -retryFrq {arg} - define the frequency of retries (default: async=10m, sync=1m) (***)
 -maxTime  {arg} - define the maximum transfer duration (default: 12h) (***)
 -priority {arg} - transmission priority 0-99 (default: 99) (****)
 -put            - interactive/synchronous transfer (no spool)
 -get            - interactive/synchronous pull (rather than push) file
 -onsuccess      - mail sent on successful transfer
 -onfailure      - mail sent when transfer has failed
 -onretry        - mail sent when transfer is retried
 -keep           - keep the request in the spool till expiration (****) (*****)
 -remove         - always remove the request from the spool (****) (*****)
 -reject         - if existing target file (default)
 -append         - if existing target file
 -resume         - if existing target file
 -overwrite      - if existing target file
 -verbose        - verbose mode on
 -version        - print version number
 -help           - this message

    (*) If successful, a requestID is returned, which can be used in
          check requests. Exit code is 0 on success and >0 otherwise.
   (**) The default values depend on the GATEWAY or REMOTE environment
          variables.
  (***) Duration in weeks, days, hours, minutes or seconds (e.g. 1w|2d).
 (****) These options are only relevant when the spool is used. The spool
          is no used during interactive transfers (-get and -put options).
(*****) By default, successful requests are removed from the spool and
          failed requests are kept in the spool till expiration.

Ectrans can be regarded as an extended ftp, which offers e.g. direct access to ECFS files, restart facility for failed transfers etc. For more details please refer to the ECaccess documentation or reach us via the ECMWF Support Portal.
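As an illustration of the options listed in the help page, the sketch below queues a file and then checks the request. It rests on assumptions: send_with_ectrans is a hypothetical wrapper, my_msuser@my_destination stands in for an association you have already declared, and the requestID is assumed to be printed on standard output when the request is accepted.

```shell
# send_with_ectrans REMOTE FILE: queue FILE for transfer through the
# association REMOTE and print the status of the resulting request.
send_with_ectrans() {
    # Assumption: on success ectrans prints the requestID on standard output.
    # -onfailure asks for a mail if the transfer fails.
    request_id=$(ectrans -remote "$1" -source "$2" -onfailure) || return 1
    ectrans -check "$request_id"
}

# Example (not run here; the association name is a placeholder):
#   send_with_ectrans my_msuser@my_destination myfile
```

The wrapper relies on the documented exit code (0 on success, greater than 0 otherwise) to stop before the check when the request is rejected.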

rsync/sftp/scp

The most popular SSH-based transfer commands, such as rsync, sftp and scp, are fully supported and can connect directly to remote sites over the Internet. No proxies are needed, but the only supported remote ports are 22 (the SSH standard) and the non-standard 2222.

ftp and lftp

FTP (File Transfer Protocol) can be used without a proxy to connect to remote servers on the standard port 21 and the non-standard port 2121. You may use the classic ftp client, or lftp after loading the appropriate module:

No Format
module load lftp
lftp yourftpserver

When transferring data, make sure you are using the right transfer type (binary/ASCII). Attempting to transfer a binary file as ASCII will result in a corrupted binary file.


ftp.ecmwf.int service

There is no connectivity from Atos HPCF to the ftp.ecmwf.int service in Reading. The new ftp service in Bologna is now available. See FTP Service - Internal users to provide files for external access.

Using environment variables in remote rsync paths

You can use the $HOME environment variable to refer to your files in the rsync command, provided it is protected by single quotes so that the variable is expanded on the remote host.

For example, to copy a directory from $HOME on ecgate to $HOME on Atos HPCF, initiating the transfer from ecgate, use:

No Format
rsync -av $HOME/mydataset 'hpc-login:$HOME/'

However, the $SCRATCH and $PERM environment variables can only be used to refer to your files on the local host from which you initiate the rsync command, as they are not set for the rsync command on the remote host.

An alternative option is to use the $HOME variable with pattern substitution. For example, to copy a directory tree from $SCRATCH on ecgate to $SCRATCH on Atos HPCF, running the command on ecgate, use:

No Format
user@ecgb11:~> rsync -av $SCRATCH/mydataset 'hpc-login:${HOME/home/scratch}/'

or, from $PERM on ecgate to $PERM on Atos HPCF, running the command on the latter, use:

No Format
user@aa6-100:~> rsync -av 'ecgate:${HOME/home/perm}/mydataset' $PERM/

The single quotes are needed here to ensure the $HOME variable takes its value on the remote rather than the local host.
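The pattern substitution can be tried locally to see what the remote shell would produce. This is a small sketch in which example_home is a stand-in for the remote value of $HOME, not a real path; it runs bash explicitly because the ${var/pattern/replacement} form is a bash feature and is not part of POSIX sh.

```shell
# Replace the first occurrence of "home" in the path with "scratch".
# example_home is a placeholder for the value of $HOME on the remote host.
remote_scratch=$(bash -c 'example_home=/home/user; echo "${example_home/home/scratch}"')
echo "$remote_scratch"   # prints /scratch/user
```

The same expansion only happens on the remote host if the login shell there supports this syntax.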


Transferring files from local Reading-based workstation disks /var/tmp or /hugetmp

Direct transfers from your workstation to the spaces in Bologna are discouraged, since they would not use the dedicated network link between the two data centres and transfer rates will be poor. Instead, you may force the transfers to jump through ecgate for best results. For example, to transfer a directory and its contents under your local disk onto the PERM space in Bologna:

No Format
user@leap42-workstation:~> rsync -av -e "ssh -o ProxyCommand='ssh -q -x -W %h:%p ecgate'" /var/tmp/user/mydataset hpc-login:/perm/user/mydataset
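On ssh clients that support ProxyJump, the same hop can be written more compactly with the -J option. The sketch below is an alternative spelling of the command above rather than a documented ECMWF recipe; rsync_via is a hypothetical helper name.

```shell
# rsync_via JUMP SRC DEST: run rsync over ssh, hopping through JUMP first.
# "ssh -J" is the command-line form of the ProxyJump option (OpenSSH 7.3+).
rsync_via() {
    jump=$1
    shift
    rsync -av -e "ssh -J $jump" "$@"
}

# Example (not run here):
#   rsync_via ecgate /var/tmp/user/mydataset hpc-login:/perm/user/mydataset
```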

Transferring files from Reading-based workstation filesystems $HOME and $PERM

These filesystems are also available on ecgate and lxc, so we recommend running the transfers from either of those systems rather than directly from the workstation. See the example above.