High Performance Analytics and Computing Platform v2

The High Performance Analytics and Computing (HPAC) Platform developed new services during the last year, improved existing ones and brought them all into an operational state. At the end of SGA1 (March 2018), the HPAC Platform thus offers the following new capabilities to users:

Enablement of the PRACE network

HPAC sites provide appropriate services and endpoints (e.g. UFTPD server and client nodes) that can communicate over the PRACE network, in addition to services and endpoints on the public network.

Deployment of high-performance data transfer service

The UNICORE File Transfer Protocol (UFTP) is a high-performance data transfer library based on FTP, fully integrated with the UNICORE authentication mechanisms and accessible via the UNICORE REST API. To give users the ability to move datasets between the sites that are part of the data federation, a UFTP-based transfer service was deployed and configured at the four HPAC sites: BSC, CINECA, ETHZ-CSCS and JUELICH-JSC.

Currently UFTP is available via two different interfaces:

  • The UNICORE command line client (ucc), a full-featured client for the UNICORE middleware, able to start both direct data transfers (local machine to server) and third-party transfers (server to server);
  • Jupyter notebooks available in the Collaboratory web portal, which enable users to integrate data transfer steps inside more complex workflows.
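As an illustration of the second interface, the sketch below assembles the JSON body a notebook might send to the UNICORE REST API to request a third-party (server-to-server) UFTP transfer. The endpoint and field names are illustrative assumptions, not the exact UNICORE schema.

```python
import json

# Hypothetical base URL of a site's UNICORE REST API.
BASE = "https://hpac-site.example.org:8080/SITE/rest/core"

def build_transfer_request(source_file, target_url):
    """Assemble the JSON body for a server-to-server transfer.
    Field names are illustrative, not the exact UNICORE schema."""
    return {
        "file": source_file,   # path on the source storage
        "target": target_url,  # storage URL at the remote HPAC site
        "protocol": "UFTP",    # request the high-performance protocol
    }

body = build_transfer_request(
    "/datasets/run42/results.h5",
    "https://remote-site.example.org:8080/SITE/rest/core/storages/HOME/files/results.h5",
)
print(json.dumps(body, indent=2))
```

In a real notebook this body would be POSTed to the source site's storage resource with the user's OIDC token; the transfer then proceeds directly between the two sites over the PRACE network.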

Additional monitoring information

The HPAC monitoring service (hosted in the HBP Collaboratory; login required) provides the following information:

  • Status of UNICORE services
  • Network reachability
  • Scheduled maintenance

User creation workflow operational

Over 100 user accounts have been created for the HPAC Platform. Ad hoc APIs have been developed to provide uniform access to user management procedures across all HPAC sites.

OpenStack (IaaS) service operational

Users now have access to a production, state-of-the-art Infrastructure-as-a-Service (IaaS) provided by OpenStack. The system is integrated with the ETHZ-CSCS infrastructure and will be fully integrated with the HPAC Authentication and Authorisation Infrastructure. The OpenStack APIs offer a flexible tool for creating on-demand virtual computing infrastructures, which also enables interaction with the traditional HPC systems (see “Lightweight virtualisation service operational” below for further details).

At the end of SGA1, 58 virtual machines (VMs) are running for the Collaboratory team, and another 56 VMs for the Neurorobotics Platform (SP10).
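Provisioning such VMs programmatically boils down to a call against the OpenStack Compute (Nova) REST API. The sketch below assembles the request body for creating one server; the endpoint and all IDs are placeholders, and a real request additionally needs a Keystone authentication token in the X-Auth-Token header.

```python
import json

# Hypothetical Nova endpoint of the HPAC OpenStack deployment.
NOVA_ENDPOINT = "https://openstack.example.org:8774/v2.1"

def build_server_request(name, image_id, flavor_id, network_id):
    """Assemble the JSON body accepted by Nova's POST /servers call."""
    return {
        "server": {
            "name": name,
            "imageRef": image_id,    # Glance image UUID
            "flavorRef": flavor_id,  # flavour (vCPU/RAM/disk profile) ID
            "networks": [{"uuid": network_id}],
        }
    }

req = build_server_request("collab-worker-01", "img-1234", "flv-4cpu-8g", "net-5678")
print(json.dumps(req))
```

The same body shape is what higher-level tools such as the openstacksdk or Heat templates generate under the hood.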

Object Storage service operational

The HPAC Platform now offers an Object Storage service provided by OpenStack (SWIFT), integrated with the ETHZ-CSCS GPFS storage infrastructure. Object Storage dramatically improves data accessibility and customisation from the Collaboratory, and serves as a key component for the HBP Archival Storage. The current usage of the service is about 8 TB.
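Objects are stored and retrieved through Swift's simple HTTP interface: a PUT to /v1/{account}/{container}/{object}. The sketch below prepares (but does not send) such an upload request; the endpoint, account and token values are placeholders, obtained in practice from Keystone and the service catalogue.

```python
from urllib.request import Request

# Hypothetical Swift endpoint and project account of the HPAC deployment.
SWIFT_ENDPOINT = "https://object.example.org/v1/AUTH_hbp"

def build_upload(container, object_name, data, token):
    """Prepare (but do not send) a PUT request storing `data` as an object."""
    url = f"{SWIFT_ENDPOINT}/{container}/{object_name}"
    return Request(url, data=data, method="PUT",
                   headers={"X-Auth-Token": token,
                            "Content-Type": "application/octet-stream"})

req = build_upload("simulation-results", "run42/spikes.dat", b"\x00\x01", "tok-abc")
print(req.get_method(), req.full_url)
```

Because every object is addressed by a plain URL, notebooks and external services can fetch results without shell access to the HPC sites, which is what makes the service attractive for the Collaboratory and for archival workflows.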

Lightweight virtualisation service operational

In a joint effort between the Neurorobotics Platform and the HPAC Platform, all current components of the NRP have been containerised into functional Docker images: the front-end server, the back-end server and the Gazebo server. Moreover, two workflows have been enabled using these Docker images:

  • Full container deployment on OpenStack, intended to allow the NRP to run and manage workshops; this use case has workloads with low performance requirements but higher user-management overhead.
  • The front-end and back-end servers run on OpenStack, while the robotic simulation is executed by the Gazebo server, which is launched on the test and development system (TDS) of the supercomputer Piz Daint.
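The two workflows differ only in where the containers are started. As a minimal sketch, the helper below builds `docker run` argument lists such as an orchestration script might pass to subprocess.run(); the image names, ports and environment variables are illustrative assumptions, since the actual NRP images are project-internal.

```python
# Build `docker run` argument lists for the two NRP deployment workflows.
# Image names, ports and variables below are illustrative placeholders.
def docker_run_cmd(image, name, ports=None, env=None):
    """Assemble a detached `docker run` command as an argument list."""
    cmd = ["docker", "run", "-d", "--name", name]
    for host, cont in (ports or {}).items():
        cmd += ["-p", f"{host}:{cont}"]       # host:container port mapping
    for key, val in (env or {}).items():
        cmd += ["-e", f"{key}={val}"]          # environment for the container
    cmd.append(image)
    return cmd

# Workflow 1: all three components as containers on OpenStack VMs.
frontend = docker_run_cmd("nrp/frontend:latest", "nrp-frontend", ports={8080: 80})
backend = docker_run_cmd("nrp/backend:latest", "nrp-backend",
                         env={"GAZEBO_HOST": "gazebo"})
# Workflow 2: only front end and back end here; Gazebo is launched
# separately on the Piz Daint TDS and reached via GAZEBO_HOST.
print(frontend)
```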

Jupyter notebook support by HPAC middleware operational

In collaboration with the Collaboratory team, the Python client library for the UNICORE middleware has been extended. Among other improvements, a new FUSE driver allows users to mount HPC storage into Jupyter notebooks running in the Collaboratory.
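Under the hood, browsing HPC storage from a notebook amounts to GET requests against the UNICORE REST API's storage resource, which the FUSE driver wraps in a file-system view. The sketch below prepares such a request; the endpoint path and header layout are illustrative assumptions rather than the exact client code.

```python
from urllib.request import Request

# Hypothetical UNICORE REST API base URL of an HPC site.
CORE = "https://hpc-site.example.org:8080/SITE/rest/core"

def list_files_request(storage_id, path, token):
    """Prepare a request listing `path` on the given UNICORE storage.
    The URL layout is illustrative, not the exact UNICORE schema."""
    url = f"{CORE}/storages/{storage_id}/files{path}"
    return Request(url, headers={"Authorization": f"Bearer {token}",
                                 "Accept": "application/json"})

req = list_files_request("HOME", "/projects/run42", "tok-xyz")
print(req.full_url)
```

The FUSE driver translates file-system operations (readdir, open, read) into calls of this kind, so notebook code can use ordinary paths instead of explicit REST requests.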

User support ticket system operational

HPAC has implemented a dedicated ticket system for tracking HPAC-specific issues and user enquiries. BSC has been actively involved in the daily assessment of incoming tickets, including 1st-level support and, when applicable, dispatching tickets to the appropriate HPC site for resolution. Another activity during the last year was improving the integration between the HPAC ticketing system and the local ticketing systems used by each HPC site for 2nd-level support. In particular, the integration with the OTRS system used at JSC was customised, which required changes in the RT code to properly propagate messages.

The portfolio of services and user adoption of the HPAC infrastructure have increased significantly during the last twelve months, reflected by more than 160 processed requests, a six-fold increase over the previous twelve months. The reasons for the strong increase are, on the one hand, the significantly increased usage of the Platform as compared to previous years and, on the other hand, the establishment of the HPAC-internal policy to handle all user requests via the ticketing system.

Arbor library for performance-portable neural network simulators

Arbor is a performance-portable software library for simulators of networks of multi-compartment neurons. During the last year, performance portability was completed for the three main target HPC architectures available in the HBP: Intel x86 CPUs (AVX2 and AVX512), Intel KNL (AVX512) and NVIDIA GPUs (CUDA). The source code was released publicly on GitHub with an open source BSD license, along with documentation on Read the Docs.
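To give a flavour of the numerical kernel that such simulators vectorise for CPUs and GPUs, the toy sketch below (not Arbor's API; all constants are arbitrary) advances the membrane potential of each compartment of a one-dimensional cable, combining a leak term with axial coupling to neighbouring compartments.

```python
# Toy per-compartment update for a passive 1D cable (illustrative only,
# not Arbor's API). Each compartment relaxes toward the resting potential
# and exchanges current with its neighbours along the cable.
def step(v, dt=0.025, g_leak=0.3, e_rest=-65.0, g_axial=0.1):
    """Advance membrane potentials `v` (mV) by one explicit time step."""
    n = len(v)
    new_v = []
    for i in range(n):
        leak = g_leak * (e_rest - v[i])          # leak toward rest
        axial = 0.0
        if i > 0:
            axial += g_axial * (v[i - 1] - v[i])  # current from left neighbour
        if i < n - 1:
            axial += g_axial * (v[i + 1] - v[i])  # current from right neighbour
        new_v.append(v[i] + dt * (leak + axial))
    return new_v

v = [-65.0, -65.0, -40.0, -65.0]  # one depolarised compartment
v = step(v)
print(v)
```

The independence of the per-compartment work is what makes this loop amenable to AVX2/AVX512 vectorisation on x86 and KNL, and to one-thread-per-compartment execution on CUDA GPUs.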

The impact of our work

The ultimate objective of the HPAC Platform is to provide an advanced data and computing infrastructure that enables scientific research; considerable effort has therefore been put into engaging with other SPs. This work proved fruitful: at least two other HBP Platforms and the Collaboratory framework are now served by the HPAC Platform, and we collaborate closely with several others. In particular, the consolidation of Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) has made it possible to integrate the following activities:

  • Neurorobotics Platform (NRP; SP10): Considerable effort has been put into the containerisation of the NRP application to make its deployment simpler and portable. The NRP architecture has been partly redefined in order to enable access to ETHZ-CSCS supercomputers through the HPAC IaaS.
  • Neuroinformatics Platform (SP5): Scientists are now heavily using the Object Storage service and taking advantage of its versatility. Within the first six months of the service being available, 13 TB of data have already been stored.
  • Jupyter notebook support allowed workflows of the Brain Simulation Platform, for example CDP2 key result 1 “Single Cell Model Builder”, CDP2 key result 2 “Multi-scale validation” or CDP2 key result 3 “In silico microcircuit experimentation”, to rely on the HPAC middleware for accessing supercomputing resources.
  • The Collaboratory is now fully hosted on the HPAC IaaS. The current system is designed for production and provides advanced levels of reliability, better performance, greatly increased scalability and proximity to data storage. The Collaboratory will also benefit from every upgrade and new service introduced in HPAC.