How-To: Data Access and Efficient I/O

We have developed a series of guides for computational scientists interested in efficient data access and storage, interactive analysis, and leveraging emerging storage solutions for HPC. We will continue to expand this document as new results and capabilities become available, and to improve the quality of existing articles. See below for a list of known issues and items left to fill in.

In these guides we provide an overview of the technologies, methods, and concepts that emerged in SGA2-T7.2.3 “Key Technologies for Fast Data Access and Interactive Analysis”. They are meant as a high-level guide to adapting applications with high demands on storage performance and to the options offered in the HBP.

Of particular interest is the interaction with large-scale but slow storage and with fast local storage, and the optimisation of applications for these technologies. Some of these concepts are high-level and almost universally applicable, like Shared Memory and HDF5, while others fill very small but important niches, like DSS and UDJ. We further show how to interact with the archival storage provided in the HBP, in the form of OpenStack SWIFT object stores.

The guide is available in the Collaboratory 2.0, currently only to HBP members. To get access to this guide, please contact hbp-sp7-coord@fz-juelich.de.

Known issues:

  • Collab 2.0 has no support for Python Notebooks, which makes
    demonstrating Knowledge Graph / SWIFT integration difficult.
  • We have no systems running BeeOND available, therefore the case study
    section is missing.
  • Performance comparisons on alternative storage solutions are stalled
    on administrative issues on our test cluster.