ECFS is ECMWF's File Storage system. ECFS is a file-oriented client-server application, providing facilities to archive and retrieve files between your local workstation or server HPC/ECS and the Data Handling System (DHS), which is based on HPSS (High Performance Storage System). The system is non-transparent: you must issue explicit commands to store files into ECFS, or to retrieve them back into local (client) storage.

...

The domain names shown above (ec:, ectmp:) are used in the various ECFS commands to indicate which domain you are working with.

...

Files transferred between a client and ECFS storage can be up to 128 GiB in size.

ECFS file names can contain the following characters: letters (A-Z,a-z), numbers (0-9), underscores (_), commas (,), periods (.) and plus and minus signs (+, -).
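
The character rule above can be checked locally before a file is archived. The following is a minimal sketch, assuming the rule is applied to one name component at a time (no directory separators); is_valid_ecfs_name is a hypothetical helper written for this illustration, not an ECFS command.

```shell
# Hypothetical helper (not part of ECFS): succeed only if the name uses
# the characters listed above: letters, digits, and _ , . + -
# Assumes a C/POSIX-like locale for the A-Z and a-z ranges.
is_valid_ecfs_name() {
  case "$1" in
    "") return 1 ;;                     # empty names are rejected
    *[!A-Za-z0-9_,.+-]*) return 1 ;;    # any other character is rejected
    *) return 0 ;;
  esac
}

# Example names are made up for the demonstration.
for name in "run_2020-10-01.tar.gz" "bad name" "a*b"; do
  if is_valid_ecfs_name "$name"; then
    echo "ok:      $name"
  else
    echo "invalid: $name"
  fi
done
```

Renaming a file locally before calling ecp is usually simpler than working around a rejected name afterwards.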

...

ECaccess allows users to transfer files between their remote host and an ECFS domain at ECMWF.

ECFS commands

The Unix style of file interface has been adopted by ECFS:

...

ecat Copy files between a domain and STDIN/STDOUT
ecd Change the current ECFS working directory
ecfs_status Get status on ECFS usage (on server only)
ecfsdir Archive or retrieve a complete UNIX directory as one ECFS file
echgrp Change the group ownership of an ECFS file or directory
echmod Change the permissions of an ECFS file or directory
ecp Copy files across ECFS domains, including UNIX
els List ECFS files
emkdir Create empty ECFS directories
emove Rename files or directories within an ECFS domain
emv Move files across ECFS domains, including UNIX
epwd Display the current ECFS working directory for the relevant domain
ermdir Remove empty ECFS directories
erm Remove ECFS files
etest Check file types and compare file attributes
etouch Change file timestamps
eumask Change the current ECFS umask

File name 'globbing'

File name globbing refers to the ability of Unix shells to allow users to specify a single pattern that expands to a list of file names that match that pattern. The typical case is the '*' character: A specification of, for instance, file* will expand to a list of all local files whose names, such as file1 or file_X, match that pattern.

...

It is strongly recommended to use an explicit domain name (ec: or ectmp:) to specify ECFS files; rather than a potentially failing 'els file*' you should use 'els domain:file*' (in csh, 'els "domain:file*"').
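
The effect of local expansion can be seen without ECFS at all. The sketch below uses echo as a stand-in for an ECFS command such as els, in a scratch directory with two made-up files. It only illustrates how a Bourne-type shell treats the two forms; it does not show how ECFS itself resolves the pattern.

```shell
# Work in a scratch directory so the demonstration touches no real files.
start_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file1 file_X

# No domain, no quotes: the local shell expands the pattern to the local
# file names before the command ever sees it.
unquoted=$(echo file*)

# Domain prefix and quotes: the pattern reaches the command exactly as typed.
quoted=$(echo "ec:file*")

echo "unquoted -> $unquoted"
echo "quoted   -> $quoted"

# Clean up.
cd "$start_dir" || exit 1
rm -rf "$demo_dir"
```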

Backup support

By default files saved in ECFS will not have a backup copy created.

...

NOTE: Files are removed from ECFS with a soft-delete: files will still be kept for some time (currently 30 days) during which it will be possible, on request, to un-delete any file that was removed/deleted by mistake. After that period any removal will become permanent, irrespective of the existence of backup copies.

ECFS usage monitoring

The ecfs_status command will give you the most recent status on ECFS usage for your default project account. For more information please call ecfs_status -h.

...

In addition, you might want to use the new ecfs_filter tool to help analyse and/or delete your ECFS files, or use the web-based ECFS Data Management tool (web login with your computer user-ID required).

Using ECFS in scripts

All ECFS commands specifying a single operation return the value 0 if successful, and an error code > 0 otherwise. However, as with Unix, ECFS commands may specify a number of operations, either due to multiple arguments or to wildcard expressions. In such cases, ECFS will always attempt to carry out all operations, even if some intervening operations are not successful. If all such operations for a command are successful, the return code will be 0; if some of the operations are not successful, the return code will be 1; if a fatal error is incurred, the entire command is discontinued and an error number > 1 is returned with an explanatory message.
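
The return-code convention for multi-operation commands can be turned into a small helper for log messages. This is a sketch only: ecfs_rc_label is a hypothetical function name, and the ecp call in the comment uses made-up file names.

```shell
# Hypothetical helper: map the return code of a multi-operation ECFS
# command to a short label, following the convention described above.
ecfs_rc_label() {
  case "$1" in
    0) echo "all operations succeeded" ;;
    1) echo "some operations failed" ;;
    *) echo "fatal error, command discontinued" ;;
  esac
}

# Typical use after a multi-file command (commented out so that the
# sketch also runs where ECFS is not installed):
#   ecp file1 file2 file3 ec:
#   rc=$?
rc=1   # stand-in value for illustration
echo "ECFS call finished: $(ecfs_rc_label "$rc") (RC=$rc)"
```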

Optimising ECFS read access

If you want to extract large amounts of data from ECFS, we recommend transferring the files in the order in which they were written to tape. This can be done with the '--order=tape' option of the ecp and emv commands. This option is also available for the els command.

Listing files - example

ecd ec:<ecfs_path_name>
els -l --order=tape ec:2020100100/f*

Note that the path name given to 'els' should be relative. This command lists the selected files together with the tape volume they are written to and their position on that tape. Files on disk have no tape information. E.g.:

# ECFS files on disk
2020100100/file1
2020100100/file2
# files on tapes
2020100100/file3 volser:J12551 fileno:144 offset:450852437
2020100100/file4 volser:J12551 fileno:144 offset:505435733
2020100100/file5 volser:J12805 fileno:223 offset:29688824476
2020100100/file6 volser:J12805 fileno:223 offset:42359779994
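
A listing in this form can be summarised with standard tools, for example to see how many tape volumes a retrieval will touch. The sketch below is an illustration only: it embeds the sample output shown above instead of calling els, and it assumes that the listing format is exactly as in that sample.

```shell
# Sample listing copied from the output shown above.
listing=$(cat <<'EOF'
# ECFS files on disk
2020100100/file1
2020100100/file2
# files on tapes
2020100100/file3 volser:J12551 fileno:144 offset:450852437
2020100100/file4 volser:J12551 fileno:144 offset:505435733
2020100100/file5 volser:J12805 fileno:223 offset:29688824476
2020100100/file6 volser:J12805 fileno:223 offset:42359779994
EOF
)

# Count files per tape volume; lines without a volser field are on disk.
summary=$(printf '%s\n' "$listing" | awk '
  /^#/ || NF == 0 { next }               # skip comment and blank lines
  {
    vol = "disk"
    for (i = 2; i <= NF; i++)
      if ($i ~ /^volser:/) vol = substr($i, 8)   # text after "volser:"
    count[vol]++
  }
  END { for (v in count) print v, count[v] }
' | LC_ALL=C sort)

echo "$summary"
```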


Getting files

ecd ec:<ecfs_path_name>

cat > sourcelist << eof
ec:2020100100/f*
ec:2020100106/f*
ec:2020100112/f*
ec:2020100118/f*
...
eof

ecp --order=tape -F sourcelist <local_directory>

Note that you have to use the '-F sourcelist' option.
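
When the source list follows a regular pattern, it can be generated instead of typed by hand. The following sketch writes one pattern per six-hourly cycle to a temporary file; the date, the cycle hours and the variable names are assumptions chosen to match the example above.

```shell
# Hypothetical values: the date and the cycle hours are examples only.
day=20201001
sourcelist=$(mktemp)

# One pattern per cycle, in the same form as the hand-written list above.
for hour in 00 06 12 18; do
  printf 'ec:%s%s/f*\n' "$day" "$hour"
done > "$sourcelist"

cat "$sourcelist"

# The list would then be passed to ecp as shown above (commented out so
# that the sketch also runs where ECFS is not installed):
#   ecp --order=tape -F "$sourcelist" <local_directory>
```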

Error handling

The following techniques are suggested for trapping ECFS error codes when running batch scripts in the Korn shell environment:

...

set +e
ecp nofile ec:
RC=$?
set -e
if [ $RC -gt 0 ]
  then
  echo " ECFS call exited with RC= $RC"
fi
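
An alternative that leaves 'set -e' switched on is to capture the return code in an '||' branch, since a command on the left of '||' does not trigger an exit. This is a sketch of a general shell idiom, not an ECFS-specific feature; failing_call is a stand-in function so that the example runs without ECFS.

```shell
# Stand-in for a failing ECFS call; in a real script this would be the
# ECFS command itself, e.g. 'ecp nofile ec:'.
failing_call() { return 2; }

set -e                     # abort on unhandled errors, as in the example above
RC=0
failing_call || RC=$?      # the '||' branch keeps 'set -e' from aborting
if [ "$RC" -gt 0 ]; then
  echo "ECFS call exited with RC=$RC"
fi
set +e                     # restore the default for the rest of this sketch
```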

DOs and DON'Ts

  • DON'T archive many small files separately. ECFS is most efficient at handling a small number of large files.
  • DO tar and compress (or gzip) many small files into one large file when archiving (say) a directory, or use the ecfsdir command.
  • DON'T copy in/out the same files frequently.
  • If you wish to archive files for a short period only (less than 90 days) DO store them in the ectmp: domain. Then they will be automatically deleted after 90 days without any further action from you.
  • DO check for a local copy before getting the ECFS version of a file:

    #!/bin/ksh
    # Fetch the file from ECFS only if no readable local copy exists.
    if [ ! -r "$SCRATCH/myfile" ]
    then
      ecp ec:myfile "$SCRATCH/."
    fi
    


  • If you need to store a large number (> 5000) of files in ECFS, DO contact the ECMWF Support Portal in advance to discuss the most efficient way to store and retrieve them.
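
The advice above on packing many small files into one large file can be sketched as follows. The directory and file names are made up for the demonstration, and the ecp call is left as a comment so that the example also runs where ECFS is not installed.

```shell
# Build a throwaway directory of small files for the demonstration.
work=$(mktemp -d)
mkdir "$work/results"
for i in 1 2 3; do
  echo "sample $i" > "$work/results/part$i.txt"
done

# Pack and compress the whole directory into a single archive file.
archive="$work/results.tar.gz"
tar -cf - -C "$work" results | gzip > "$archive"

# Archive the single file instead of the many small ones
# ('ec:results.tar.gz' is a hypothetical target name):
#   ecp "$archive" ec:results.tar.gz

# List the archive contents as a check.
gzip -dc "$archive" | tar -tf -
```

The ecfsdir command mentioned above serves the same purpose for a complete directory without a separate tar step.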

...