The main purpose of the European Databases of Seismogenic Faults (EDSF) installation is to publish datasets through the open standards developed by the Open Geospatial Consortium (OGC) and to host a large part of the data behind those datasets. Read the extended report that illustrates the main characteristics of the IT infrastructure, called SEISMOFAULTS.EU, which serves these purposes. The front end of the infrastructure is the EDSF portal. However, a few other portals and websites are hosted on SEISMOFAULTS.EU to provide user interfaces for activities related to sibling projects.

If you are in a hurry, scroll down this web page for a summary.

Implementation

Hardware

SEISMOFAULTS.EU is hosted on three dedicated physical servers belonging to the INGV IT infrastructure:

  • One Hewlett Packard Enterprise (HPE) ProLiant BL460c (Gen9) equipped with two Intel® Xeon® E5-2640 v4 CPUs @ 2.40 GHz (10 cores, 20 threads each); 128 GB of 2133 MHz RAM per CPU; two 300 GB HPE HDDs in RAID 1.

This server is entirely dedicated to development and testing.

  • Two HP ProLiant DL560 (Gen10), each equipped with four Intel® Xeon® Gold 5118 CPUs @ 2.30 GHz (12 cores, 24 threads each); 64 GB of 2400 MHz RAM per CPU; four 300 GB HDDs in RAID 5 and four 1920 GB SSDs in RAID 5.

In addition, 50 TB of storage on the Storage Area Network (SAN) of the Centro Servizi Informativi (CSI), at the INGV premises in Rome, is dedicated to our installation.

The ProLiant BL460c and one of the DL560 are hosted in CSI's data centre; the other DL560 is hosted in the INGV premises in Bologna.

Software

Following the recommendations of Agenzia per l’Italia Digitale (AGID) contained in the “Linee guida acquisizione e riuso software PA”, SEISMOFAULTS.EU relies almost exclusively on open-source software.

We installed the XenServer hypervisor on each physical server to create and manage virtual machines (VMs). All the VMs run Debian Linux except for one, which runs MS Windows Server 2016 for applications devoted to activities not directly related to data distribution (figure below).

The web portals and websites are managed through Joomla! or WordPress Content Management System (CMS).

The OGC web services are published through the GeoServer software, while the data management in the backend is done with PostGIS.
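As an illustration of how a client might address such a service, the sketch below builds a standard WFS 2.0.0 GetFeature request URL. The endpoint and layer name are hypothetical placeholders, not the actual SEISMOFAULTS.EU addresses; only the query parameters (service, version, request, typeNames, count, outputFormat) follow the OGC WFS conventions that GeoServer accepts.

```python
from urllib.parse import urlencode


def build_wfs_getfeature_url(base_url, type_name, max_features=10):
    """Return a WFS 2.0.0 GetFeature URL asking for GeoJSON output."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,      # layer to query, as workspace:layer
        "count": max_features,       # upper limit on returned features
        "outputFormat": "application/json",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer name, for illustration only.
url = build_wfs_getfeature_url("https://example.org/geoserver/wfs", "demo:faults", 5)
print(url)
```

The resulting URL can be pasted into a browser or passed to any HTTP client or GIS software that supports WFS.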

All services and websites are containerized with Docker.

Figure: SEISMOFAULTS.EU Virtual Machines and their primary tasks

Docker containers are a virtualization technology that runs applications or services in isolated, self-contained environments. Each environment includes everything the application or service needs to run: dependencies, libraries, and configuration files.

Despite some similarities, containers differ from VMs in several ways:

  1. They share the kernel of the host operating system instead of each running a full guest OS, making them more efficient and less resource-intensive than VMs;
  2. They can be created, deleted, and run faster, making them ideal for microservices implementation and applications that need few resources and great scalability;
  3. They are portable: once created and customized, they can be deployed and run on almost every platform that can run the Docker runtime engine;
  4. They can be easily deployed in the cloud or mixed environments;
  5. They offer good isolation. A (carefully configured) container is an environment isolated from the host OS and other containers, so a malfunction or compromise is unlikely to affect other containers or the host OS.

The typical file system organization of a “dockerized” service inside SEISMOFAULTS.EU can be summarized as follows:

  • A main directory “to hold them all” with a name reminiscent of the name of the service. It might also be a Git repository (see below);
  • A docker-compose.yml file containing details about the services, networks, and volumes for setting up the application’s environment. It is used to create, deploy, and manage service-related containers;
  • A directory containing volumes. Volumes are a way to store and manage data generated and used by Docker containers. They exist outside of the Docker container and can persist data even if the container is destroyed or moved. They are typically used to store data and share it between multiple containers or with the host OS (see figure below).
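The layout described above can be checked with a few lines of Python. This is a minimal sketch, not a tool used on SEISMOFAULTS.EU: the directory name "volumes" and the function name are assumptions made for illustration, while docker-compose.yml and .git are the conventional names used by Docker Compose and Git.

```python
from pathlib import Path


def check_service_dir(service_dir):
    """Report which of the expected items exist in a service directory."""
    root = Path(service_dir)
    return {
        # Compose file describing services, networks, and volumes.
        "compose_file": (root / "docker-compose.yml").is_file(),
        # Directory holding persistent data (the name is an assumption).
        "volumes_dir": (root / "volumes").is_dir(),
        # Present only when the service directory is also a Git repository.
        "git_repo": (root / ".git").is_dir(),
    }
```

Calling check_service_dir on a service directory returns a dictionary of booleans, which makes missing pieces easy to spot before deploying.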

We recommend the official documentation for a full understanding of how Docker works.

Figure: Files and directories organization for managing a dockerized service in SEISMOFAULTS.EU

Figure: Docker containers scheme on SEISMOFAULTS.EU

Several websites and web services run on the same VM. Every website or web service includes its own web server (Nginx or Apache), web application server (e.g., Apache Tomcat), and related RDBMS.

A (dockerized) reverse proxy routes network traffic, forwarding each incoming connection to the container hosting the service requested by the user and returning the response to the client.
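The routing decision a reverse proxy makes can be sketched as a lookup from the requested host name to the container that serves it. The host names, container names, and ports below are hypothetical, and a real proxy also handles TLS, headers, and error cases that this sketch ignores.

```python
# Hypothetical host-to-container routing table; all names are illustrative.
ROUTES = {
    "portal.example.org": "http://portal-web:80",
    "services.example.org": "http://geoserver:8080",
}


def pick_upstream(host_header, routes=ROUTES):
    """Return the upstream URL for a request's Host header, or None if unknown."""
    # Drop an optional ":port" suffix and normalise case.
    # IPv6 literals are not handled in this sketch.
    host = host_header.split(":", 1)[0].strip().lower()
    return routes.get(host)
```

For example, pick_upstream("services.example.org:443") selects the hypothetical GeoServer container, while an unknown host yields None and would typically be answered with an error page.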

Git

Service-related directories are managed through Git.

Git is a version control system designed to handle software development projects. It is a distributed version control system, meaning each user can have a local copy of the repository, making it easier to collaborate with other users.

Every Docker project directory is also a Git repository. In this way, the Git system tracks every configuration change and synchronizes the sysadmin workstation and the server running the services straightforwardly, often automatically.

Security

All VMs run only the services they strictly need, to limit unnecessary port exposure to the internet as far as possible.

Some VMs are available only from the institutional intranet and accessible only through a Virtual Private Network (VPN). VMs that publish web services are protected by a few simple firewall rules.

Software updates

The OS and application software updates are performed by scripts that are launched periodically and automatically using standard Unix tools. The system administrator reviews the updates manually every week.

The system administrator manually curates Docker image updates, which are generally performed when image distributors release security patches.

Backup

System backups are performed at several independent levels:

  • VM snapshots are periodically made through the Hypervisor user interface;
  • Several scripts are executed nightly to perform a backup of the main directories of the VM OSs.

Backups are moved to disk partitions not normally used by the OS and periodically copied to a Network Attached Storage (NAS).
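A nightly directory backup of this kind could look like the following sketch. It is not the script used on SEISMOFAULTS.EU: the function name, the archive naming scheme, and the choice of a single gzip-compressed tar archive are assumptions made for illustration.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def backup_directories(sources, dest_dir, now=None):
    """Write one gzip-compressed tar archive containing every source directory."""
    now = now or datetime.now()
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Timestamped name so successive nightly runs never overwrite each other.
    archive = dest / f"backup-{now:%Y%m%d-%H%M%S}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for src in map(Path, sources):
            # Store each directory under its own name, not its absolute path.
            tar.add(src, arcname=src.name)
    return archive
```

A scheduler such as cron could call this function every night, and a separate job could then copy the resulting archives to the NAS.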

Monitoring

Monitoring is the process of collecting, analyzing, and displaying performance data used to measure the performance of an IT system. The goal is to ensure the system’s availability, performance, and security. We monitor the SEISMOFAULTS.EU installation to measure the performance of applications, databases, networks, and servers. Performance metrics such as latency, throughput, availability, and utilization provide a clear picture of the system's health and can identify potential areas of improvement to take corrective action when needed. Monitoring can also provide insight into the security of the system. It can detect unauthorized access attempts, malware, and other malicious activity.

The SEISMOFAULTS.EU infrastructure monitoring is performed at multiple levels. The first level is performed by the VM hypervisor, which notifies the system administrator if a VM shows anomalous behaviour, such as excessive disk or CPU activity for an extended time, memory saturation, or abnormal network traffic (figure below).
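The idea of alerting only when a metric stays high "for an extended time" can be sketched as a check for consecutive readings above a threshold. This is a generic illustration with hypothetical names and values, not a description of how the hypervisor implements its alerts.

```python
def sustained_breach(samples, threshold, min_consecutive):
    """Return True if readings exceed `threshold` for `min_consecutive` samples in a row."""
    run = 0
    for value in samples:
        # Extend the current run on a breach, reset it otherwise.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

With CPU readings in percent, sustained_breach(readings, 90, 3) ignores short spikes and reports only three or more consecutive readings above 90.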

The second level is performed by Nagios, software specifically designed for this task and configured on a dedicated VM.

Nagios monitors the VMs both at the OS level (processes, CPU load, RAM usage, disk partition usage, logged users, etc.) and at the process level (services and websites). It is configured to notify the system administrator via email or Telegram in case of malfunction (figures below).
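Nagios checks are carried out by small plugins that report a status through their exit code (0 for OK, 1 for WARNING, 2 for CRITICAL) and a one-line message. The sketch below shows what a disk usage check might look like in that style; the thresholds and function names are assumptions, not the configuration used on SEISMOFAULTS.EU.

```python
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL = 0, 1, 2


def classify_usage(percent_used, warn=80.0, crit=90.0):
    """Map a disk usage percentage to a Nagios status code and message."""
    if percent_used >= crit:
        return CRITICAL, f"DISK CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"DISK WARNING - {percent_used:.1f}% used"
    return OK, f"DISK OK - {percent_used:.1f}% used"


def check_disk(path="/"):
    """Measure usage of the partition holding `path` and classify it."""
    usage = shutil.disk_usage(path)
    return classify_usage(100.0 * usage.used / usage.total)
```

A complete plugin would print the message and terminate with the returned code as its exit status, which Nagios then turns into a notification when the status is not OK.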

Together with the measures described in the Security section, software updates, backups, and monitoring are part of the security administration policy for the systems.

Figure: Hypervisor alert system for a server belonging to the SEISMOFAULTS.EU installation

Figure: Nagios interface with some of the monitored systems

Figure: Nagios interface showing specific service performance

Access Statistics

"SEISMOFAULTS.EU" uses two approaches to monitor access and report usage statistics for websites and web services because of the different technologies underlying their operation.

Website access statistics are created through the Google Analytics platform, whereas web services access statistics are created using AWStats software. Google Analytics works through a JavaScript snippet that “intercepts” the connections to every web page of a website and stores data in the Google Cloud for various analyses (figure below). AWStats, instead, is a Perl script installed on the same VMs that publish the services. It analyzes the logs generated by the web application server that runs GeoServer and produces a web page with a set of statistics as output (figure below).
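To give an idea of what log-based statistics involve, the following sketch counts successful requests per OGC service in access-log lines written in the common log format. It is a simplified illustration and not how AWStats works internally; the regular expression and function name are assumptions.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Matches the request and status fields of a common-log-format line.
REQUEST_RE = re.compile(r'"(?:GET|POST) (?P<target>\S+) HTTP/[\d.]+" (?P<status>\d{3})')


def count_ogc_requests(lines):
    """Count successful (2xx) requests per OGC service name (WMS, WFS, ...)."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match or not match["status"].startswith("2"):
            continue
        query = parse_qs(urlsplit(match["target"]).query)
        # OGC parameter names are case-insensitive, so normalise the keys.
        params = {key.lower(): values[0] for key, values in query.items()}
        service = params.get("service")
        if service:
            counts[service.upper()] += 1
    return counts
```

Feeding the function the lines of an access log returns a Counter keyed by service name, which could then be aggregated by day or by dataset.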

Figure: AWStats interface with access statistics to a web service (EDSF13 in this example)

Acknowledgements

We thank all the INGV staff of Centro Servizi Informativi, particularly Giovanni Scarpato, Stefano Cacciaguerra, Pietro Ficeli, Manuela Sbarra, Gianpaolo Sensale, Diego Sorrentino, Stefano Vazzoler, Francesco Zanolin for their continuous IT support; the EPOS Integrated Core Services staff of the EPOS office in Rome: Daniele Bailo, Kety Giuliacci, Rossana Paciello, and Valerio Vinciarelli for their effort in EPOS/SEISMOFAULTS.EU cooperation; and our colleagues Valentino Lauciani for suggestions on the Docker configuration, Matteo Quintiliani for managing the INGV GitLab installation, and Giorgio Maria De Santis, Mario Locati, and Gabriele Tarabusi for the fruitful exchange of views.

The development of the SEISMOFAULTS.EU infrastructure benefitted from the funding of H2020 projects EPOS IP (grant agreement No. 676564) and SERA (grant agreement No. 730900), the JRU EPOS ITALIA Piano di Attività 2021-2024 supported by the Italian Ministry of University and Research (MUR), and the DPC-INGV Agreement 2012-2021 (Annex A) and 2022-2024 (Annex A).