ECFS is ECMWF's File Storage system. ECFS is a file-oriented client-server application that provides facilities to archive and retrieve entire files between HPC/ECS and the Data Handling System (DHS), which is based on HPSS (High Performance Storage System). The system is non-transparent: you must issue explicit commands to store files into ECFS, or to retrieve them back into local (client) storage.
ECFS files are stored in domains. Currently two domains are available to users.
ec: | The permanent domain where files are stored indefinitely. This is the default domain. |
ectmp: | The temporary domain where files are stored for 90 days, after which they are automatically deleted. Note that once a file has been deleted it CANNOT be recovered. |
The domain names shown above (ec:, ectmp:) are used in the various ECFS commands to indicate which domain you are working with.
Note that, as an alternative, the ectmp: domain can be referenced by ec:/TMP. Thus ectmp:/uid/newdir and ec:/TMP/uid/newdir are equivalent:
...
ECFS supports the transport of files between various clients and the ECFS storage system. However, the transport of files between clients is not supported.
ECFS is available on AIX, Linux and other platforms, and supports C-, Korn- and bash-shell environments.
Files transferred between a client and ECFS storage can be up to 128 GiB in size.
ECFS file names can contain the following characters: letters (A-Z,a-z), numbers (0-9), underscores (_), commas (,), periods (.) and plus and minus signs (+, -).
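The character rule above can be checked locally before a file is archived. The following is a minimal sketch, not part of ECFS: the function name is_valid_ecfs_name is hypothetical, and it tests a single name component only (no directory separators).

```shell
# Hypothetical helper (not an ECFS command): succeed only if the given
# name consists solely of the characters listed above:
# letters, digits, underscore, comma, period, plus and minus.
is_valid_ecfs_name() {
    case "$1" in
        "")                  return 1 ;;  # empty names are rejected
        *[!A-Za-z0-9_,.+-]*) return 1 ;;  # contains a character outside the set
        *)                   return 0 ;;
    esac
}

# Example: warn about a local file name before trying to archive it.
name="my results.dat"
if ! is_valid_ecfs_name "$name"; then
    echo "'$name' contains characters not allowed in ECFS file names" >&2
fi
```

Running the check with LC_ALL=C keeps the A-Z and a-z ranges limited to plain ASCII letters.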
If an ECFS command fails due to a recoverable error (HPSS down, network problems, etc.), the ECFS client will retry the command until it succeeds.
ECaccess allows users to transfer files between their remote host and an ECFS domain at ECMWF.
ECFS commands
The Unix style of file interface has been adopted by ECFS:
...
All available ECFS commands are described in more detail in the relevant man pages:
...
| Copy files between a domain and STDIN/STDOUT |
ecd | ... | Change the current ECFS working directory |
ecfs_status | ... | Get status on ECFS usage (on server only) |
ecfsdir | ... | Archive or retrieve a complete UNIX directory as one ECFS file |
echgrp | ... | Change the group ownership of an ECFS file or directory |
echmod | ... | Change the permissions of an ECFS file or directory |
ecp | ... | Copy files across ECFS domains, including UNIX |
els | ... | List ECFS files |
emkdir | ... | Create empty ECFS directories |
emove | ... | Rename files or directories within an ECFS domain |
emv | ... | Move files across ECFS domains, including UNIX |
epwd | ... | Display the current ECFS working directory for the relevant domain |
ermdir | ... | Remove empty ECFS directories |
erm | ... | Remove ECFS files |
etest | ... | Check file types and compare file attributes |
etouch | ... | Change file timestamps |
eumask | ... | Change the current ECFS umask |
File name 'globbing'
File name globbing refers to the ability of Unix shells to let users specify a single pattern that expands to a list of file names matching that pattern. The typical case is the '*' character: a specification of, for instance, file* will expand to a list of all local files whose names, such as file1 or file_X, match that pattern.
With client version 2, globbing is left to the calling shell, so a command usually passes the expanded set of files to the client. (Should a user have disabled such expansion, the client receives just the character '*'.)
It is strongly recommended to use an explicit domain name (ec: or ectmp:) to specify ECFS files; rather than a potentially failing 'els file*' you should use 'els domain:file*' (in csh, 'els "domain:file*"').
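The difference between the two forms can be seen without ECFS at all. The sketch below uses show_args, a hypothetical stand-in for an ECFS command that simply prints the arguments it receives, and a scratch directory created only for the demonstration.

```shell
# show_args is a hypothetical stand-in for an ECFS command such as els:
# it prints each argument it receives on its own line.
show_args() {
    printf '%s\n' "$@"
}

# Scratch directory holding two local files that match file*.
demo_dir=$(mktemp -d)
touch "$demo_dir/file1" "$demo_dir/file_X"

# Unqualified pattern: the local shell expands it against local files,
# so the command sees the two local names instead of the pattern.
( cd "$demo_dir" && show_args file* )

# Quoted, domain-qualified pattern: passed to the command unchanged.
( cd "$demo_dir" && show_args "ec:file*" )
```

With the unqualified pattern the command never sees the '*', which is why the result depends on what happens to be in the local directory.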
Backup support
By default files saved in ECFS will not have a backup copy created.
...
The existence of backup copies will be indicated by the first character of the els -l output:
- '-' for files with no secondary copy;
- 'b' for files with a secondary copy.
e.g.
br--r----- 1 uid group 510 Nov 19 2012 essential_data
-r--r----- 1 uid group 510 Nov 19 2012 unimportant_data
|
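A script can test that first character directly. This is a minimal sketch that assumes the listing lines look like the example above; the function name has_backup is hypothetical.

```shell
# Hypothetical helper: succeed if a line of 'els -l' output describes a
# file with a secondary (backup) copy, i.e. the line starts with 'b'.
has_backup() {
    case "$1" in
        b*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Example using the two sample lines shown above.
for line in \
    'br--r----- 1 uid group 510 Nov 19 2012 essential_data' \
    '-r--r----- 1 uid group 510 Nov 19 2012 unimportant_data'
do
    if has_backup "$line"; then
        echo "backup:    $line"
    else
        echo "no backup: $line"
    fi
done
```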
NOTE: Files are removed from ECFS with a soft-delete: files will still be kept for some time (currently 30 days) during which it will be possible, on request, to un-delete any file that was removed/deleted by mistake. After that period any removal will become permanent, irrespective of the existence of backup copies.
...
ECFS usage monitoring
The ecfs_status command will give you the most recent status on ECFS usage for your default project account. For more information please call ecfs_status -h.
To get an overview of your ECFS usage, you can also refer to your audit files ec:ecfs_audit and/or ectmp:ecfs_audit.tmp, which are created once per month and contain a complete list of your files in each ECFS domain.
Each ECFS user has an ecfs_audit file, placed in each of their ECFS domains. These files contain a list of all files you own in the relevant ECFS domain, excluding any backup copies. To list the audit file in your ec: domain use the following command:
Small audit files can be inspected quickly using the ecat command, e.g.
To examine the audit file in more detail you might want to copy it to your disk space in $HOME, $PERM or $SCRATCH, e.g.:
For a complete list of available ECFS commands please refer to the ECFS user documentation or read the relevant man pages on ecgate. The ecfs_audit file lists your files (no directories) in the format
where the columns have the following meaning:
The audit file's creation date in the format "today= YYYY-MM-DD" is stated in its first line. At the bottom the total of your ECFS content is reported in terms of number and (binary) volume. The very last line gives the number of directories and the number of files not accessed for 18 months.
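Once an audit file has been copied to local disk, its creation date can be read from the first line. The sketch below assumes only what is stated above, namely that the first line contains "today= YYYY-MM-DD"; the rest of that line's layout is not known here, and the function name audit_date is hypothetical.

```shell
# Hypothetical helper: print the creation date found in the first line of
# a local copy of an audit file. Prints nothing if no date is found.
audit_date() {
    sed -n '1s/.*today= *\([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\).*/\1/p' "$1"
}

# Example with a made-up local file standing in for a copied audit file.
sample=$(mktemp)
printf 'today= 2024-01-15\n' > "$sample"
echo "Audit created on: $(audit_date "$sample")"
rm -f "$sample"
```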
In addition, you might want to use the new ecfs_filter tool to help analyse and/or delete your ECFS files or use the web based ECFS Data Management tool (web login with your computer user-ID required).
Using ECFS in scripts
All ECFS commands specifying a single operation return the value 0 if successful, and an error code > 0 otherwise. However, as with Unix, ECFS commands may specify a number of operations, either due to multiple arguments or to wildcard expressions. In such cases, ECFS will always attempt to carry out all operations, even if some intervening operations are not successful. If all such operations for a command are successful, the return code will be 0; if some of the operations are not successful, the return code will be 1; if fatal errors are incurred, the entire command is discontinued, and the error number > 1 is returned with an explanatory message.
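The return-code convention above can be turned into a readable message in scripts. This is a sketch under the rules just described; describe_rc and run_and_report are hypothetical helper names, and the wording of the messages is illustrative only.

```shell
# Hypothetical helper: translate a return code into a short description,
# following the convention described above.
describe_rc() {
    case "$1" in
        0) echo "all operations succeeded" ;;
        1) echo "at least one operation failed" ;;
        *) echo "fatal error, command discontinued" ;;
    esac
}

# Hypothetical wrapper: run any command, report its return code on stderr
# and hand the same code back to the caller. The '|| rc=$?' form keeps the
# wrapper usable in scripts that run with 'set -e'.
run_and_report() {
    rc=0
    "$@" || rc=$?
    echo "$1: $(describe_rc "$rc") (RC=$rc)" >&2
    return "$rc"
}

# Usage on a system with the ECFS client installed (not run here):
#   run_and_report ecp file1 file2 ec:
```

Note that for a command with a single operation, any value greater than 0 is simply that operation's error code.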
Optimising ECFS read access
If you want to extract large amounts of data from ECFS, we recommend transferring the files in the order in which they are written to tape. This can be done with the '--order=tape' option of the ecp and emv commands. This option is also available for the els command.
Listing files - example
ecd ec:<ecfs_path_name>
els -l --order=tape ec:2020010100/f*
|
Note that the path name given to 'els' should be relative. This command shows the selected files with the tape number they are written to, together with their position on tape. Files on disk have no tape information. E.g.:
# ECFS files on disk
2020100100/file1
# files on tapes
|
Getting files
ecd ec:<ecfs_path_name>
cat > sourcelist << eof
...
eof
ecp --order=tape -F sourcelist <local_directory>
|
Note that you have to use the '-F sourcelist' option.
Error handling
The following techniques are suggested for trapping ECFS error codes when running batch scripts in the Korn shell environment:
Set a trap for the entire script:
trap " echo ECFS call exited with RC= \$? " ERR
|
or catch any errors on each call:
set +e
ecp nofile ec:
RC=$?
set -e
if [ $RC -gt 0 ]
then
echo " ECFS call exited with RC= $RC"
fi
|
DOs and DON'Ts
- DON'T archive many small files separately. ECFS is most efficient at handling a small number of large files.
- DO tar and compress (or gzip) many small files into one large file when archiving (say) a directory, or use the ecfsdir command.
- DON'T copy the same files in and out frequently.
- If you wish to archive files for a short period only (less than 90 days), DO store them in the ectmp: domain. They will then be deleted automatically after 90 days without any further action from you.
- DO check for the existence of a local copy before getting the ECFS version of a file:
    #!/bin/ksh
    if [ ! -r "$SCRATCH/myfile" ]
    then
      ecp ec:myfile "$SCRATCH/."
    fi
- If you need to store a large number (> 5000) of files in ECFS, DO contact the ECMWF Support Portal in advance to discuss the most efficient way to store and retrieve them.
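The advice to bundle many small files can be followed with standard tools before calling ecp. The following is a minimal sketch: bundle_dir is a hypothetical helper name, the paths in the usage comment are made up, and the ecfsdir command mentioned above may be the more convenient route.

```shell
# Hypothetical helper: pack one directory into a single gzip-compressed
# tar file so that only one large file needs to be archived.
#   $1 = directory to pack, $2 = tar file to create
bundle_dir() {
    src_dir=$1
    archive=$2
    # -C changes to the parent directory so that the archive holds
    # relative paths (e.g. run1/...) rather than absolute ones.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Usage on a system with the ECFS client installed (not run here):
#   bundle_dir "$SCRATCH/run1" "$SCRATCH/run1.tar.gz" && ecp "$SCRATCH/run1.tar.gz" ec:
```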
...