How electronic medical information archives will help diagnose diseases more effectively



According to IDC forecasts, by 2025 the total amount of data stored by healthcare organizations will grow to 2.3 zettabytes, with medical images of various kinds accounting for up to 80-90% of the storage capacity in use. The following example illustrates why efficient storage of medical images matters.

In the diagnostic department of a hospital in Tucson, Arizona, a mammography examination (diagnosis of breast cancer based on X-ray, ultrasound, and MRI) produces up to 40 X-ray images of the two mammary glands, as well as two to five biopsies. In 80% of cases, older patient images taken two or more years earlier are used to interpret the results of a mammogram, and in difficult cases images taken 10 years ago may be needed. To retrieve old images from the archive quickly, the hospital uses a digital PACS (Picture Archiving and Communication System).

The use of old images stored in PACS significantly reduces the risk of error in diagnosing malignant tumors and spares patients with benign tumors from repeat mammograms, or even biopsies, performed only to confirm that a tumor is not malignant. Comparison with old images also reduces the risk of misinterpreting images of malignant tumors and makes it possible to prescribe the appropriate treatment and additional tests quickly.

Features of long-term storage of medical image archives


What would the ideal storage system for PACS look like? Obviously, given the large size of medical images, it must be highly scalable, so that it can store each patient's examination results accumulated over several decades. The second requirement is fast search and retrieval of data from the archive, without which old images cannot be used to interpret the results of new examinations.

Finally, when storing medical images, the possibility of data leakage, loss, and accidental or intentional deletion or corruption must be eliminated entirely. A distinctive feature of PACS storage is its relatively modest I/O performance requirements: examination results are written to the archive only once and never change, only a limited number of users issue retrieval requests, and a patient's data is usually requested no more than once a year.
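The write-once, read-rarely pattern described above can be sketched in a few lines. The class below is purely illustrative (it is not Scality's API): it enforces WORM-style semantics, where an object key can be written exactly once and never overwritten or deleted.

```python
class WriteOnceArchive:
    """Minimal sketch of write-once (WORM-style) archive semantics,
    matching the access pattern of a PACS image archive: images are
    written once, never modified, and only occasionally read back."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        """Store an object; reject any attempt to overwrite an existing key."""
        if key in self._objects:
            raise PermissionError(f"object {key!r} already exists; archive is immutable")
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        """Retrieve an object by key (the rare read path)."""
        return self._objects[key]


archive = WriteOnceArchive()
archive.put("patient-42/mammogram-2020.dcm", b"\x00DICM...")
assert archive.get("patient-42/mammogram-2020.dcm").startswith(b"\x00DICM")
```

In a real archive these guarantees are enforced at the storage layer rather than in application code, but the contract, one write, no updates, no deletes, is the same.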



Traditional enterprise-class storage systems are unsuitable for PACS primarily because of their high cost per unit of storage, which is largely driven by the high performance needed for transactional applications but unnecessary for medical image archives; cheaper entry-level storage systems, in turn, lack the scalability that PACS archives require.

Perhaps the best solution for storing medical images


The mainstream solution for storing medical images and other unstructured content in healthcare organizations is file and object storage with high scalability and a low cost per gigabyte. One of the leaders in this segment is Scality, with its software-defined storage product Scality RING, first released in 2010. It is a scale-out solution with a peer-to-peer, shared-nothing distributed architecture deployed on standard x86 servers. Scality RING supports the S3 and Swift object access protocols, a simple HTTP API, and file access. Last year, Scality doubled the number of installations of its systems in healthcare.

Scality RING software is deployed on a cluster of at least three storage nodes and implements a set of intelligent data access services, along with data protection and system management. At the top level are scalable data access services (connectors), which expose data to applications via the SMB, NFS, and S3 protocols, plus a supervisor for centralized management and status monitoring. Connectors are usually installed directly on the storage nodes, but they can also be deployed on dedicated servers.

The middle layer of Scality RING is a distributed virtual file system with several data protection mechanisms, self-healing processes, and management and monitoring services. The bottom layer of the stack is the distributed storage layer, formed by virtual storage nodes and I/O daemons that abstract the physical storage servers and disk interfaces.

The heart of the storage layer is a scalable distributed key-value object store based on a second-generation peer-to-peer routing protocol. This routing enables efficient horizontal scaling and lookup across a very large number of nodes. The storage services run on all servers with sufficient compute power and disk capacity. The servers (nodes) running Scality RING software are connected by a standard IP network fabric, for example 10/25/40/100 Gigabit Ethernet.
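The idea behind such key-space routing can be illustrated with a toy consistent-hash ring. This is a simplified sketch of the general technique, not Scality's actual protocol: each node owns an arc of a circular key space, and a key is routed to the first node at or after the key's hashed position.

```python
import bisect
import hashlib


def _pos(value: str) -> int:
    """Map a string to a position on the ring (a 64-bit integer keyspace)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring: a key is owned by the first node whose
    position is at or after the key's position, wrapping around at the end."""

    def __init__(self, nodes):
        self._ring = sorted((_pos(n), n) for n in nodes)

    def node_for(self, key: str) -> str:
        positions = [p for p, _ in self._ring]
        i = bisect.bisect_right(positions, _pos(key)) % len(self._ring)
        return self._ring[i][1]


ring = HashRing([f"node{i}" for i in range(6)])
owner = ring.node_for("patient-12345/mammogram-2020-01.dcm")
print("object routed to:", owner)
```

The practical benefit, which a real system like RING exploits at much larger scale, is that adding or removing a node relocates only the keys on the affected arc rather than reshuffling the entire data set.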

Scality RING includes the following software components: connector servers, the distributed MESA DBMS for storing metadata, storage nodes, I/O daemons, and a web-based management portal. MESA provides the object indexing and metadata management used at the Scality Scale-Out File System (SOFS) abstraction layer.

Scality RING connectors give applications access to the data stored on the servers. They support many data access protocols, including the Amazon Web Services (AWS) S3 object protocol based on Representational State Transfer (REST), as well as the NFS, SMB, and FUSE file protocols. A single application can use multiple RING connectors simultaneously when it needs to scale I/O horizontally or serve many users in parallel.

A storage node is a virtual process responsible for the objects that fall within its allocated portion of the RING's distributed key space. Storage node daemons (so-called biziods) guarantee the immutability of data stored on disk in a low-level local file system. Six virtual storage nodes are deployed per physical server (host). Each biziod is an instance of a low-level process that handles I/O operations on a specific physical disk and maintains the mapping of object keys to the addresses of objects on that disk.

To achieve high availability of object storage (up to 14 nines), Scality RING replaces classic RAID with data protection mechanisms optimized for distributed systems, including local and geo-distributed replication and erasure coding, which can be combined within a single system. For small objects (up to 60 KB), replication is the more cost-effective protection; for large objects, erasure coding avoids replicating large data sets. Replication uses six Class of Service (CoS) levels, from 0 to 5, which determine how many replicas of an object are kept, with all replicas stored on different disks.

With erasure coding, the Reed-Solomon error correction scheme is used: instead of storing several replicas of an object, the object is split into data chunks that are written together with parity chunks. These chunks are distributed across RING nodes, and the data can be reconstructed from them when one or more nodes fail. High fault tolerance is also ensured by the shared-nothing architecture, which has no master ("main") node whose failure could bring down the entire system.
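The recovery principle can be demonstrated with the simplest possible erasure code: a single XOR parity chunk, which survives the loss of any one chunk. Real Reed-Solomon coding generalizes this with Galois-field arithmetic so that multiple parity chunks can tolerate multiple simultaneous losses; the sketch below only illustrates the basic idea.

```python
from functools import reduce


def _xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, k: int):
    """Split data into k equal chunks plus one XOR parity chunk."""
    assert len(data) % k == 0, "data must divide evenly into k chunks"
    size = len(data) // k
    chunks = [data[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(_xor, chunks)
    return chunks, parity


def rebuild(chunks, parity, lost: int) -> bytes:
    """Recover the chunk at index `lost` by XOR-ing the survivors with parity."""
    survivors = [c for i, c in enumerate(chunks) if i != lost] + [parity]
    return reduce(_xor, survivors)


data = b"DICOM-image-bytes-0123456789abcd"  # 32 bytes, splits into 4 chunks
chunks, parity = split_with_parity(data, 4)
restored = rebuild(chunks, parity, lost=2)  # simulate losing chunk 2 (a failed node)
assert restored == chunks[2]
```

In a distributed system each chunk lands on a different node, so this reconstruction is exactly what happens when a node fails: the survivors and the parity are read back and the missing piece is recomputed.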

Scality RING Ecosystem


Although Scality RING can be deployed on any standard x86 server, Gartner notes in its Magic Quadrant for Distributed File Systems and Object Storage that implementing it requires careful hardware selection and detailed project design, as well as deep immersion of the customer's IT specialists in Scality technology.

Since 2014, Hewlett Packard Enterprise, a strategic partner of Scality, has offered two server models from the HPE Apollo 4000 Gen10 series, developed specifically for big data workloads, as a joint platform for Scality RING software-defined object storage: the HPE Apollo 4200, which provides very high storage density (up to 392 TB in a 2U enclosure holding 28 full-size LFF disks or 54 2.5-inch SFF drives), and the HPE Apollo 4510, designed for hyperscale capacities on a 4U chassis (68 full-size disks per chassis, more than 9 PB in a standard 42U server rack).

Both HPE Apollo 4000 Gen10 models allow flexible configuration of the disk subsystem to meet specific node performance and capacity requirements, and they support the HPE iLO 5 remote management tools familiar to HPE ProLiant users, which help deploy a large number of Scality RING storage nodes quickly and manage them effectively.

The joint HPE and Scality solution is positioned as a global repository for unstructured data (including archives), where high throughput and capacity matter far more than minimal access latency. It scales to several thousand storage and access nodes, holding hundreds of petabytes of data and trillions of objects in a single namespace.

For additional protection of stored information, various backup software packages can be used: Scality has certified its storage for compatibility with Veeam, Commvault, Micro Focus Data Protector, Cloudera, MapR, and WekaIO products, while its use in healthcare as a medical image archive is backed by certified compatibility with PACS systems from Fujifilm, GE Healthcare, Philips, and several other vendors.

Scality RING case studies in healthcare


Currently, in France alone, more than a dozen major hospitals use Scality RING installations of 400 TB to 6 PB to store archives of medical images. For example, a large joint HPE and Scality installation runs at Assistance Publique Hôpitaux de Marseille (AP-HM), which unites four Marseille hospitals with 3,400 beds and is the third largest hospital group in France. AP-HM hospitals employ 2,000 doctors and 8,500 other medical staff.

Until 2011, AP-HM used EMC Centera to store PACS images. By then, the total volume of medical images had reached 60 TB, but because they were replicated for data protection they occupied 120 TB of capacity, and the hospitals' PACS generated another 20 TB of new images each year. In 2011, AP-HM replaced Centera with a NAS system; by 2017 the image volume had grown to 320 TB and the growth rate had doubled to 40 TB per year. With the NAS going out of warranty and its capacity no longer sufficient for the rapidly growing data volumes, AP-HM management decided to replace the storage once again.

The new storage system had to be compatible with all applications used in the hospitals, including support for the CIFS and NFS file protocols, scaling beyond several petabytes, and reliable data protection and security. AP-HM chose HPE and Scality RING and built a RING cluster distributed across three data centers, comprising six HPE Apollo 4510 storage servers and two HPE ProLiant DL360 servers that host the MESA metadata databases. The main applications are the Carestream PACS system, which records the images obtained from radiological examinations, and a genomics archive. Medical images and other data are backed up with Commvault software.

Availability of Scality RING in Russia


HPE and Scality have accumulated extensive experience in joint projects, and certified engineers from HPE's Moscow office are ready to help customers implement Scality RING in Russian medical institutions.

Under the Factory Express program, HPE supplies Apollo 4000 Gen10 servers preconfigured to customer requirements for rapid deployment of a Scality RING cluster, and also offers a Reference Architecture for these server systems in RING clusters. Since May of last year, HPE has been shipping the Base Bundle starter pack of Apollo 4200 Gen10 servers with an initial capacity of 240 TB and preinstalled Scality RING software; to deploy it, you only need to specify the network speed and the required storage capacity.

Additional information about Scality RING, including a look at its interface and examples of integration with backup software, will be available at an HPE technical webinar on June 17. Registration: bit.ly/3bA9HP7
