HPC storage

The following file systems are available from the supercomputers of the HPAC Platform. Usually there are

  • a home directory (for source code, binaries, libraries, applications, small datasets),
  • a scratch and/or work file system (temporary storage for larger datasets, automated clean-up),
  • an archive system (for long-term storage of results), and
  • a file system used as software repository for pre-installed software.

The file systems enforce different quotas per user and/or per project or group.

JUQUEEN & JURECA

All user file systems on the HPC systems (e.g. JUQUEEN, JURECA, DEEP) and on the Data Access System (JUDAC) are provided via Multi-Cluster GPFS from the HPC file server JUST.

The storage locations assigned to each user are made available through shell environment variables (see below). The user's directory in each file system is shared across all systems to which the user has been granted access, so it is recommended to organize the data in architecture-specific subdirectories.
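
A minimal sketch of this (the subdirectory names are examples only, not a site convention) could look as follows:

  # Show the storage locations assigned to the current user
  echo "HOME: $HOME"
  echo "WORK: $WORK"
  echo "ARCH: $ARCH"

  # Create architecture-specific subdirectories, e.g. one per system
  mkdir -p $WORK/juqueen $WORK/jureca
  mkdir -p $HOME/juqueen/bin $HOME/jureca/bin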

The following file systems are available; for each one, the nodes from which it can be accessed (login nodes, compute nodes, front-end nodes (FENs), I/O nodes (IONs)) are listed:

$HOME
  Usable space: 1.8 PB total, 10 TB quota
  Backup: yes
  Access from: login nodes, compute nodes, FENs, IONs
  Full path to the user's home directory inside GPFS; intended for source code, binaries, libraries, and applications.

$WORK
  Usable space: 5.3 PB total, 30 TB quota
  Backup: no
  Access from: login nodes, compute nodes, FENs, IONs
  Full path to the user's standard scratch directory inside GPFS; temporary storage for applications with large size and I/O demands. Data are deleted automatically (files after 90 days, based on modification and access date; empty directories after 3 days).

$DATA
  Usable space: 2.7 PB total
  Backup: yes
  Access from: login nodes, compute nodes, FENs, IONs
  Full path to the user's project data directory inside GPFS (limited availability); storage for large projects in collaboration with JSC. The required space must be applied for explicitly.

$ARCH
  Usable space: 1.2 PB total
  Backup: yes
  Access from: login nodes, FENs
  Full path to the user's archive directory inside GPFS; storage for all files not in use for a longer time. Data are migrated to tape storage by TSM-HSM. There is no hard disk space limit for $ARCH, but if more than 100 TB is needed, please contact the supercomputing support at JSC (sc@fz-juelich.de) to discuss optimal data processing, particularly with regard to the end of the project.

/usr/local
  Usable space: 53 TB total
  Backup: yes
  Access from: login nodes, compute nodes, FENs, IONs
  Software repository (used via $PATH, $LD_LIBRARY_PATH, ...).
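
Because files in $WORK are deleted automatically after 90 days and $ARCH data are migrated to tape, it can be useful to check for files that are about to expire and to bundle results into a single archive file before moving them to $ARCH (one large file is generally handled better by tape migration than many small ones). A minimal sketch, with a placeholder directory name:

  # List files in $WORK that have not been accessed for more than 80 days
  # (candidates for the automatic 90-day clean-up)
  find $WORK -type f -atime +80

  # Pack a result directory into one archive file in $ARCH
  # ("results_2017" is a placeholder name)
  tar czf $ARCH/results_2017.tar.gz -C $WORK results_2017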

More information: http://www.fz-juelich.de/ias/jsc/EN/Expertise/Datamanagement/OnlineStorage/JUST/Filesystems/JUST_filesystems_node.html

Piz Daint

CSCS supports different file systems, whose specifications are summarized in the table below:

              /scratch (Daint)  /users          /project            /store
Type          Lustre            GPFS            GPFS                GPFS
Quota         None              10 GB per user  5 TB per group      As per contract
Expiration    30 days           None            End of the project  As per contract
Data Backup   None              Active          Active              Active
Access Speed  Fast              Slow            Medium              Slow
Capacity      2.7 PB            86 TB           3.2 PB              2.6 PB
  • All users have a personal folder under the /scratch file system.
  • Each user has a personal home folder under the /users file system.
  • Production project proposals can request group storage under the /project file system.
  • Contractual partners can buy dedicated storage under the /store file system.
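
Since files on /scratch expire after 30 days, results worth keeping should be copied to /project or /store in time. A minimal sketch (the paths below are placeholders, not the actual CSCS directory layout):

  # Copy finished results from the personal scratch folder to project storage
  rsync -av /scratch/$USER/run42/ /project/myproject/$USER/run42/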

More information: http://user.cscs.ch/storage/index.html

MareNostrum 4

These are the GPFS filesystems available from all nodes:

/apps
  Used to store applications and libraries.

/gpfs/home
  This filesystem contains the home directories of all users; when you log in, you start in your home directory by default. Every user has their own home directory for storing their own sources and personal data. A default quota is enforced on all users to limit the amount of data stored there. Running jobs from this filesystem is highly discouraged; please run your jobs from your group's /gpfs/projects or /gpfs/scratch instead.

/gpfs/projects
  In addition to the home directory, there is a directory in /gpfs/projects for each group of users. For instance, the group bsc01 has a /gpfs/projects/bsc01 directory ready to use. This space is intended for storing data that needs to be shared between the users of the same group or project. A quota per group is enforced depending on the space assigned by the Access Committee. It is the project manager's responsibility to determine and coordinate the use of this space and how it is distributed or shared between the project's users.

/gpfs/scratch
  Each user has a directory under /gpfs/scratch. It is intended for storing temporary files of your jobs during their execution. A quota per group is enforced depending on the space assigned.

The quotas can be queried using the quota command.
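
For example, to display current usage and limits in human-readable units:

  quota -s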

More information: http://www.bsc.es/support/MareNostrum4-ug.pdf

FERMI

FERMI is completely consistent with the general CINECA infrastructure (see the section “Data Storage and File Systems“).

$HOME
  Usable space: 50 GB
  Backup: yes
  This file system is permanent, user specific, and system specific.
  It is a local area where you are placed after the login procedure. It is where system and user applications store their dot files and dot directories (.nwchemrc, .ssh, ...) and where users keep initialization files specific to the system (.cshrc, .profile, ...). There is a $HOME area for each username on the machine.
  This area is intended for storing programs and small personal data. Files are never deleted from this area, and they are protected by daily backups. The retention of the files is tied to the life of the username: data are preserved as long as the username remains active.

$CINECA_SCRATCH
  Usable space: no quota (900 TB in total)
  Backup: no
  This file system is temporary, user specific, and system specific.
  It is a local temporary storage area, like $WORK, intended for temporary files from batch applications. The main differences from $WORK are that it is user specific (not project specific) and that it can be used for sharing data with people outside your project. By default, file access is open to everyone.
  A periodic cleaning procedure may be applied to this area, with a normal retention time of 30 days: files not accessed for more than 30 days are removed daily by an automatic procedure.
  $CINECA_SCRATCH does not have any disk quota. However, it is strongly recommended to keep the occupancy of this area low in order to prevent it from filling up.

$WORK
  Usable space: 1 TB (or as required by the project)
  Backup: no
  This file system is permanent, project specific, and system specific.
  It is a scratch area for collaborative work within a given project. The retention of the files is tied to the life of the project: files in $WORK are kept for up to 6 months after the project ends and are then deleted.
  This area is intended for hosting large working data files, since it offers the high bandwidth of a parallel file system. It performs very well when I/O is done on large blocks of data, but it is not well suited for frequent, small I/O operations. This is the main area for keeping scratch files produced by batch processing.

$TAPE
  Backup: -
  This file system is permanent, user specific, and accessible from all CINECA systems.
  It is an archive area intended for saving personal data on magnetic media. The list of files is kept on disk, while the file content is moved automatically to tape using LTFS technology. This archive space is not created by default for all users; you have to request it, specifying the maximum space required.
  This filesystem is mounted on the login nodes of FERMI and GALILEO and on all nodes of PICO. The retention of the files is tied to the life of the username: data are preserved as long as the username remains active.
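
Because file access on $CINECA_SCRATCH is open to everyone by default, it may be worth adjusting permissions per directory (the directory names below are placeholders):

  # Keep a directory private
  chmod 700 $CINECA_SCRATCH/private_data

  # Share a directory read-only with users outside the project
  chmod -R a+rX $CINECA_SCRATCH/shared_results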

More information: http://www.hpc.cineca.it/content/ibm-fermi-user-guide#disk

Pico

PICO is completely consistent with the general CINECA infrastructure (see “Data Storage and File Systems“).

$HOME
  Usable space: 50 GB
  Backup: yes
  This file system is permanent, user specific, and system specific.
  It is a local area where you are placed after the login procedure. It is where system and user applications store their dot files and dot directories (.nwchemrc, .ssh, ...) and where users keep initialization files specific to the system (.cshrc, .profile, ...). There is a $HOME area for each username on the machine.
  This area is intended for storing programs and small personal data. Files are never deleted from this area, and they are protected by daily backups. The retention of the files is tied to the life of the username: data are preserved as long as the username remains active.

$CINECA_SCRATCH
  Usable space: no quota (300 TB in total)
  Backup: no
  This file system is temporary, user specific, and system specific.
  It is a local temporary storage area, like $WORK, intended for temporary files from batch applications. The main differences from $WORK are that it is user specific (not project specific) and that it can be used for sharing data with people outside your project. By default, file access is open to everyone.
  A periodic cleaning procedure may be applied to this area, with a normal retention time of 30 days: files not accessed for more than 30 days are removed daily by an automatic procedure.
  $CINECA_SCRATCH does not have any disk quota. However, it is strongly recommended to keep the occupancy of this area low in order to prevent it from filling up.

$WORK
  Usable space: 15 TB quota (or as required by the project); 1500 TB in total
  Backup: no
  This file system is permanent, project specific, and system specific.
  It is a scratch area for collaborative work within a given project. The retention of the files is tied to the life of the project: files in $WORK are kept for up to 6 months after the project ends and are then deleted.
  This area is intended for hosting large working data files, since it offers the high bandwidth of a parallel file system. It performs very well when I/O is done on large blocks of data, but it is not well suited for frequent, small I/O operations. This is the main area for keeping scratch files produced by batch processing.

$TAPE
  Backup: -
  This file system is permanent, user specific, and accessible from all CINECA systems.
  It is an archive area intended for saving personal data on magnetic media. The list of files is kept on disk, while the file content is moved automatically to tape using LTFS technology. This archive space is not created by default for all users; you have to request it, specifying the maximum space required.
  This filesystem is mounted on the login nodes of FERMI and GALILEO and on all nodes of PICO. The retention of the files is tied to the life of the username: data are preserved as long as the username remains active.

More information: http://www.hpc.cineca.it/content/pico-user-guide#disks