White Paper

EMC BUSINESS CONTINUITY SOLUTION FOR GE HEALTHCARE CENTRICITY PACS-IW ENABLED BY EMC RECOVERPOINT AND VMWARE WITH SITE RECOVERY MANAGER

Applied Technology

EMC GLOBAL SOLUTIONS

Abstract

This white paper provides an overview of a GE Healthcare Centricity PACS-IW environment built on servers with EMC® CLARiiON® storage. It describes the tested configuration environment and identifies the key results of testing. This white paper covers EMC RecoverPoint®, EMC Cluster Enabler, VMware® ESX®, and CLARiiON storage.

April 2011

Copyright © 2011 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

VMware, ESX, and VMware vCenter are registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners.

Part Number H8187


Table of Contents

Executive summary
  Key solution benefits
  Business case
  Solution overview
Introduction
  Overview
  Purpose
  Audience
  Terminology
Physical environment
  Technology overview
    EMC RecoverPoint
    EMC RecoverPoint/Cluster Enabler
    GE Healthcare Centricity PACS-IW
  The physical environment
  Configuration
    Microsoft cluster configuration
    Setting up Microsoft cluster to use MNS with FSW
    Centricity PACS-IW viewer
  Test and validation
    Primary cluster node failure
    Test 1: Single server failure at the primary site
    Test 2: Total server failure at the primary site
    Test 3: Primary storage failure
Physical to virtual conversion
Virtualized environment
  Technology overview
    VMware vCenter SRM
    VMware vCenter SRM with EMC RecoverPoint SRA
  Hardware and software components
    VMware vCenter SRM
  Configuration
    EMC VMware
    EMC RecoverPoint
    Microsoft cluster
    Configure VMware vCenter SRM
  Test and validation
    Primary ESX node failure
    Primary site failure
    Observations of the failover process
Conclusion
  Summary
  Findings
  Next steps
References
  Product documentation


Executive summary

Healthcare providers are continuing to invest in radiology and cardiology Picture Archiving and Communication Systems (PACS) to predict, diagnose, treat, and monitor disease. With each new imaging modality technology advancement, healthcare provider IT organizations are challenged to meet increasing demands for high availability and scalability of systems and networks, security, and automated tools to simplify the complexity of managing this expanding imaging environment.

EMC offers several key infrastructure components to economically deliver highly available applications to healthcare IT consumers. These components, when combined with clinical and business application software from EMC partners, deliver high availability and protection of information to enable continuous hospital operations. The EMC Business Continuity Solution for GE Healthcare Centricity PACS-IW Enabled by EMC RecoverPoint and VMware with Site Recovery Manager is an example of this integration and collaboration between EMC and GE Healthcare to provide scalable, repeatable, and validated solutions.

GE Healthcare Centricity PACS-IW is a web-based solution for all modalities and healthcare environments and offers a wide range of functions at high speed, based on the latest international standards for small and medium-size hospitals and imaging centers. For the first time, GE PACS-IW users can deploy a solution that is integrated with their application and provides automated failover and recovery processes that deliver repeatable recovery point objectives (RPOs) and recovery time objectives (RTOs).

Key solution benefits

This solution provides important benefits, including:

• Built on tiered-networked storage platforms that are highly scalable and are able to support the needs of small and medium-sized hospitals up to the largest academic hospitals and Integrated Delivery Networks (IDNs)

• Delivers five 9s of availability (99.999% uptime), reliability, scalability, and performance needed by the healthcare enterprise

• Provides a standard and highly automated failover and recovery solution for all GE PACS-IW users, regardless of physical or virtual deployment. For the workloads tested in this solution, failover was completed in less than 8 minutes in a virtualized environment and in less than 3 minutes in a physical environment

• Server virtualization to dynamically map computing resources to the healthcare enterprise. This lowers IT costs by treating the data center as a single pool of processing

• Information management and protection software to meet RPOs and RTOs for business continuity and disaster recovery plans

• With RecoverPoint and Site Recovery Manager tools from EMC and VMware, you can manage your recovery in a much more granular fashion, down to the transaction level

• Security offerings for patient records, confidential emails, and other communications. Active archiving components to provide secure and rapid recall of historical images and meet regulatory requirements for records retention

This information infrastructure, built for reliability, supports both short-term access and long-term archiving and can be integrated with cardiology PACS, radiology PACS, and other clinical and business systems to automate workflow and streamline operations.
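To put the five 9s figure from the benefits list in concrete terms, the following minimal sketch converts an availability percentage into a yearly downtime budget. It is illustrative arithmetic only, not part of the tested solution; the function name and the 365-day year are assumptions made for the example.

```python
# Illustrative arithmetic only: converts an availability percentage into
# the maximum downtime it permits over one (non-leap) year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Hypothetical comparison of common availability targets.
    for pct in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% availability allows about {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, 99.999% availability corresponds to roughly 5.26 minutes of downtime per year.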

Business case

This white paper provides a technical architecture for current and prospective customers of GE Healthcare Centricity PACS-IW who are responsible for providing an IT infrastructure for clinical applications. EMC enterprise business continuity solutions ensure that your applications and data are available during planned and unplanned outages. The solution provides tiered storage platforms, virtualization capabilities, software, and services that ensure high availability and robust data protection, and fit your budgetary and technology constraints.

Solution overview

The PACS-IW solution leverages Microsoft Cluster Services with EMC Cluster Enabler to maintain business continuity for the application and EMC RecoverPoint replication technology for the storage array.

Cluster Enabler (CE) for Microsoft Failover Clusters is a software extension of failover cluster functionality. Cluster Enabler allows Windows Server 2003 and 2008 (including R2) Enterprise and Datacenter editions running Microsoft Failover Clusters to operate across multiple connected storage arrays in geographically distributed clusters. In Windows Server 2003, the failover clusters are called server clusters and use Microsoft Cluster Server (MSCS). Each cluster node is connected through a storage network to the supported storage arrays.

Cluster Enabler expands the range of cluster storage and management capabilities while ensuring full business continuance protection. A Fibre Channel connection from each cluster node is made to its own storage array. Two connected storage arrays provide automatic failover of mirrored volumes during a Microsoft failover cluster node failover.

Cluster Enabler protects data from storage, system, and site failures, 24 hours a day, 7 days a week, 365 days a year. While Cluster Enabler protects against node failures at either primary or secondary sites, it also allows for scheduled maintenance by allowing cluster resources to fail over and fail back between sites.


Introduction

Overview

This white paper provides an overview of the configuration and testing steps performed on two GE Healthcare Centricity PACS-IW environments. The first environment is built on physical servers with EMC® RecoverPoint/CE. The second is built on VMware® ESX® with Site Recovery Manager on EMC RecoverPoint. Both environments utilize CLARiiON® storage. For each environment this document provides the architectural overview and the configuration steps required, and outlines the key results of testing and validation.

Purpose

This white paper documents the requirements for the PACS-IW application to be fully functional for business continuity. It describes an environment built to demonstrate the interoperability and functionality of PACS-IW within a physical and virtualized environment using RecoverPoint. The white paper covers possible failure events, such as application, server, SAN, and array failure.

Audience

This document is intended for the EMC Global Practice, Product Engineering, Field Council, Global Solutions, Marketing, and Demo teams, EMC Training and Services organizations, as well as current and prospective GE Centricity PACS-IW customers who are responsible for providing an IT infrastructure for clinical applications.


Terminology

Table 1 defines terms used in this document.

Table 1  Terminology

Term     Definition
MNS/FSW  Majority Node Set/FileShare Witness
MSCS     Microsoft Cluster Services
PACS     Picture Archiving and Communication System
CRR      Continuous remote replication
RPCE     RecoverPoint Cluster Enabler
RPO      Recovery point objective. RPO is the point in time (prior to an outage) that systems and data must be restored to.
RTO      Recovery time objective. RTO is the period of time after an outage in which the systems and data must be restored to the predetermined RPO.
SAN      Storage area network. A high-speed special-purpose network connecting various sorts of storage devices with servers, typically to support a larger network of users.
SRA      Storage Replication Adapter
SRM      Site Recovery Manager


Physical environment

Technology overview

The architecture diagram in Figure 1 illustrates the configuration for the GE PACS-IW implementation using physical servers and RecoverPoint replication technology. Each site contains nodes participating in two Microsoft clusters. A third node, called the Majority Node Set, acts as the witness for each of the clusters.

Figure 1  RecoverPoint/CE architecture

Note: On each site the physical main controller servers have USB dongles attached, enabling licensing for PACS-IW.

Note: Some sites may have only one database and controller cluster node per site, but for the purpose of demonstrating local site resiliency, two nodes per site were configured, allowing automated lateral failure of either a database or controller node.


The following sections identify and briefly describe the technology and components used in the validated physical and virtualized desktop infrastructure environment.

EMC RecoverPoint

EMC RecoverPoint allows continuous data protection and continuous remote replication for on-demand protection and recovery to any point in time. RecoverPoint's advanced capabilities include policy-based management, application integration, and bandwidth reduction. RecoverPoint enables recovery of data at a local or remote site to any point in time, and ensures continuous replication to a remote site without impacting performance.

The EMC RecoverPoint family provides cost-effective, local continuous data protection (CDP) and continuous remote replication (CRR) solutions that allow for any-point-in-time data recovery.

• Continuous data protection (CDP): RecoverPoint maintains only a local replica, referred to as a CDP replica. The data transport is performed synchronously.

• Continuous remote replication (CRR): RecoverPoint maintains only a remote replica, referred to as a CRR replica. The data transport is performed asynchronously.

In CRR configurations, data is transferred between two sites over Fibre Channel or a WAN. In this configuration, the RPAs, storage, and splitters exist at both the local and the remote site.

EMC RecoverPoint/Cluster Enabler

EMC RecoverPoint/Cluster Enabler (CE) enables geographically dispersed Microsoft Failover Clusters to replicate their data using RecoverPoint/SE CRR or RecoverPoint CRR. Geographically dispersed clusters offer increased levels of high availability, disaster recovery, and automation over non-clustered solutions. RecoverPoint/CE works seamlessly with applications designed to take advantage of Failover Clusters, such as Exchange and SQL Server, in Microsoft Windows 2003 and 2008 environments.

RecoverPoint supports consistency groups, which ensure write-order consistency across related mirrored volumes. This is common practice for transactional databases such as Microsoft SQL Server, where log volumes and database volumes must be kept logically consistent with one another.


GE Healthcare Centricity PACS-IW

GE Healthcare Centricity PACS-IW is a standards-based, single user interface, web-based PACS. It handles computed tomography (CT), magnetic resonance (MR), ultrasound (US), nuclear medicine (NM), computerized radiography (XA), PET/CT scan (PT), and many other kinds of exams. Centricity PACS-IW uses native tools for paperless operations (document scanning), print pages (key images), industry-standard IHE Portable Data Imaging (PDI) CD burning, and referring physician access for imaging results.

Web-based PACS can be configured in different ways to fulfill customers’ requirements regarding data safety and system uptime. It can also be adapted to fit into the current IT environment, managing multi-site installations, bandwidth limitations, and security requirements:

• Multi-site powered, for scalable growth

• Maximum speed with cross-site streaming of images

• Scalable for expanding businesses

• Protect patient data transfer using encrypted protocols

The Centricity PACS-IW licensing is controlled via USB dongle.

Table 2  PACS-IW cluster details

Term               Main database cluster                   Main controller cluster
Cluster Name       GE-DB-CLU                               GE-MC-CLU
Member Nodes       GE-DB1, GE-DB2, GE-DRDB1 and GE-DRDB2   GE-MC1, GE-MC2, GE-DRMC1 and GE-DRMC2
Majority Node Set  GE-DB-MNS                               GE-MC-MNS


The physical environment

The environment consisted of two Microsoft clusters spread across two sites. The hardware and software components are listed below.

Hardware components

• Gen 4 RecoverPoint appliances x 4
• 4 Intel servers with quad-core Intel Xeon X5570 processors, 2.93 GHz CPU, 144 MB memory
• CLARiiON CX4-120 x 2
• Network switch x 2
• FC switch x 4
• GE PACS-IW licensing USB dongle x 2

Software components

• RecoverPoint 3.3.SP1.P1(k.119)
• Cluster Enabler Base 4.1.0.143
• Cluster Enabler Plugin 4.1.0.6
• CLARiiON splitter (RecoverPoint Enabler on CX4-120)
• Unisphere™ on CLARiiON – FLARE® 30
• Solutions Enabler 7.2.0
• Solutions Enabler Base License
• PowerPath® 5.3
• Microsoft SQL Server 2005 SP2
• GE Centricity PACS-IW 3.7.3

Storage

The storage configuration consisted of two EMC CX4-120 arrays using Unisphere (FLARE 30), one array per site. All server and storage connections are Fibre Channel (FC). The primary fabric consisted of two FC switches and the secondary fabric consisted of two FC switches.

EMC RecoverPoint

Note: CLARiiON CX4 array-based splitting technology is leveraged for the white paper. This array-based splitter carries out write splitting inside each CX4 storage processor. There are three types of replication: local replication using CDP, remote replication using CRR, and both local and remote replication using a combination of CDP and CRR called concurrent local and remote. This paper focuses only on the CRR product.

For RecoverPoint testing, the main database cluster had two nodes per site, with Windows 2003 SP2 x64, and each node had SQL 2005 SP2 installed. Consistency groups were configured to replicate the database and log devices between sites.


The main controller cluster had two nodes per site, with Windows 2003 SP2 x86, and each node had Centricity PACS-IW software installed. Consistency groups were configured to replicate the data devices between sites. Access to the database and controller is achieved via the cluster virtual IP.

EMC RecoverPoint/Cluster Enabler

RecoverPoint/Cluster Enabler (RecoverPoint/CE) was installed on each of the cluster nodes. It integrates Microsoft clustering with RecoverPoint and enables automated failover across sites in the event of a failure. RecoverPoint/CE supports the quorum model types in Table 3.

Table 3  Supported model types

Microsoft Windows Server  Model type
2003                      Majority Node Set (MNS); MNS with File Share Witness (FSW)
2008                      Node Majority; Node and File Share Majority

For the purpose of the GE PACS-IW use case, the MNS with FSW quorum model was selected. An MNS quorum cluster can only run when the majority of the cluster nodes are available; a two-node MNS quorum cluster is unable to sustain the failure of any cluster node. This is because the majority of a two-node cluster is two. To sustain the failure of any one node in an MNS quorum cluster, you must have at least three devices that can be considered as available. The MNS with FSW cluster model is recommended for clusters with special configurations. It works in a similar way to Node and Disk Majority, but instead of a witness disk, this cluster uses a file share witness. If you use Node and File Share Majority, at least one of the available cluster nodes must contain a current copy of the cluster configuration before you can start the cluster. Otherwise, you must force the starting of the cluster through a particular node.
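The majority rule described above can be sketched as simple vote arithmetic. This is a simplified model, assuming that every cluster node and the witness each contribute exactly one vote; the function names are hypothetical and are not part of Microsoft clustering or Cluster Enabler.

```python
# Simplified quorum model: every node and the witness carry one vote each
# (an assumption for illustration; real cluster behavior has more detail).

def votes_needed(total_votes: int) -> int:
    """Strict majority of the configured votes."""
    return total_votes // 2 + 1


def has_quorum(total_votes: int, votes_available: int) -> bool:
    """True if enough votes remain for the cluster to keep running."""
    return votes_available >= votes_needed(total_votes)


def tolerable_failures(total_votes: int) -> int:
    """How many votes can be lost before quorum is lost."""
    return total_votes - votes_needed(total_votes)


if __name__ == "__main__":
    # Two-node MNS cluster: the majority of two is two, so no failure is tolerated.
    print("2 votes:", votes_needed(2), "needed,", tolerable_failures(2), "tolerated")
    # Two nodes plus a witness: three votes, so one failure is tolerated.
    print("3 votes:", votes_needed(3), "needed,", tolerable_failures(3), "tolerated")
    # Four nodes (two per site) plus a witness: losing a whole site leaves 3 of 5 votes.
    print("site loss with 5 votes keeps quorum:", has_quorum(5, 3))
```

The five-vote case mirrors the layout in Table 2, where each cluster has four member nodes and one MNS witness, under the one-vote-per-member assumption stated above.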


Configuration

Microsoft cluster configuration

The Microsoft clustering configurations are shown in Figure 2 for both the database and controller nodes.

Figure 2  Main controller and main database clusters


Setting up Microsoft cluster to use MNS with FSW

Many Windows 2003 cluster configurations are set up with a shared quorum disk. To satisfy the Cluster Enabler requirements, the Microsoft cluster was reconfigured to use MNS with FSW. This entails adding another node to the Microsoft cluster to act as the witness for the cluster; it is recommended that this node be located at a tertiary site.

1. Add a new resource called "MajorityNodeSet" to the cluster group for each cluster, as described in the following Microsoft article: http://technet.microsoft.com/en-us/library/cc783784(WS.10).aspx

2. On the MNS node, create a new folder and update the file share permissions to add the cluster service account, allowing "Change" and "Read" permissions for the file share. Microsoft Windows documentation provides instructions on changing file share permissions.

3. Update the cluster so that it references the newly created file share witness. The following Microsoft article describes how to define the MNS with FSW: http://support.microsoft.com/kb/921181

4. Do not add the MNS node to the cluster until Cluster Enabler has been configured, because the MNS node does not need Cluster Enabler installed. If the node is added to the cluster before Cluster Enabler has been configured, the RP/CE configuration will fail.

Note: It is recommended that the MNS nodes reside at a tertiary site, as per Microsoft recommendations.


Centricity PACS-IW viewer

The GE Centricity PACS-IW viewer provides access to patient images and reports.

Interface login

To verify the functionality of the PACS-IW viewer, browse to the virtual IP for the main controller cluster (http://xxx.xxx.xxx.xxx) and log in with a user account.

Figure 3  Centricity PACS-IW login screen

Patient study list

This displays the list of patient studies associated with the specific logged-in user.

Figure 4  User-specific patient study list


Patient study retrieval

When the case study is being retrieved, a progress bar may be visible on the viewer screen.

Progress bar explanation

Purple bar

The technical influencer is primarily the database server. A slowly growing purple bar indicates issues with the database query for study descriptors. Possible reasons for this progress bar are:

• Underperforming hard disks hosting the database files, such as SATA or NAS

• Network latency greater than 10 ms

• Database has not been re-indexed for a long time

Light green bar

The technical influencer is mainly the primary archive and network conditions. A slowly growing light green bar indicates issues with the image retrieval from the primary archive. Possible reasons for this progress bar are:

• Underperforming primary archive; the required throughput is 200 MB/s (NAS should not be used as the primary archive; the preferred primary archive is direct-attached storage (DAS) or SAN with SAS or FC disks)

• Network connection between the node and controller of less than 1 Gb/s

• WAN connection between the client and node with low bandwidth (less than 4 Mb/s)

Dark green bar

The technical influencer is primarily the client. A slowly growing dark green bar indicates issues with the performance of the client workstation: decompression is too slow. Possible reasons for this progress bar are:

• CPU is at the limit of its performance. Radiologists should have workstations with 2 x quad-core CPU and 4 GB memory.

• Too many applications are running at the same time, consuming memory; the workstation starts paging to the page file on hard disk.

• In Windows XP, the 4 GB memory address range needs to be switched on in the boot.ini file.

• Client hard disk is too slow; the required hard disk throughput for workstations with a high workload is 120 MB/s.
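The numeric thresholds in the three progress bar lists above can be collected into a simple triage checklist. The following is a minimal sketch, not a GE or EMC tool: the `Measurements` class, its field names, and the `triage` function are hypothetical, and only the threshold values come from the lists above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Measurements:
    """Hypothetical inputs; None means the value was not measured."""
    db_network_latency_ms: Optional[float] = None
    archive_throughput_mb_per_s: Optional[float] = None      # megabytes/s
    node_controller_link_gbit_per_s: Optional[float] = None  # gigabits/s
    client_wan_mbit_per_s: Optional[float] = None            # megabits/s
    client_disk_throughput_mb_per_s: Optional[float] = None  # megabytes/s
    client_memory_gb: Optional[float] = None


def triage(m: Measurements) -> List[str]:
    """Return the progress bar causes suggested by the measured values."""
    findings: List[str] = []

    def check(value: Optional[float], is_bad: Callable[[float], bool], message: str) -> None:
        # Skip values that were not measured.
        if value is not None and is_bad(value):
            findings.append(message)

    check(m.db_network_latency_ms, lambda v: v > 10,
          "purple: network latency greater than 10 ms")
    check(m.archive_throughput_mb_per_s, lambda v: v < 200,
          "light green: primary archive throughput below 200 MB/s")
    check(m.node_controller_link_gbit_per_s, lambda v: v < 1,
          "light green: node-to-controller link below 1 Gb/s")
    check(m.client_wan_mbit_per_s, lambda v: v < 4,
          "light green: client WAN bandwidth below 4 Mb/s")
    check(m.client_disk_throughput_mb_per_s, lambda v: v < 120,
          "dark green: client disk throughput below 120 MB/s")
    check(m.client_memory_gb, lambda v: v < 4,
          "dark green: client memory below 4 GB")
    return findings


if __name__ == "__main__":
    # Example values are made up for illustration.
    sample = Measurements(db_network_latency_ms=15, client_wan_mbit_per_s=2)
    for finding in triage(sample):
        print(finding)
```

Causes without a numeric threshold in the lists, such as a database that has not been re-indexed, are left out of this sketch and still need manual review.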


Sample case study

Figure 5 shows typical images viewed by healthcare professionals in the PACS-IW viewer.

Figure 5  Patient images – viewed through the Centricity PACS-IW viewer


Test and validation

Primary cluster node failure

In the event of a server failure for any node owning the cluster resources at the time, EMC Cluster Enabler allows all disk-based resources to automatically fail over between sites. The following images demonstrate failover in the event of a node failure on the main database cluster. A node failure can be simulated by stopping the cluster service, shutting down the operating system, or removing power from the node. Prior to the simulated node failure, the resources are owned by node GE-DB1, as seen in Cluster Administrator in Figure 6.

Figure 6  Owner of resources


Figure 7 shows a view of the cluster from Cluster Enabler Manager.

Figure 7  Cluster status

The SQLServer group needs to be given control by Cluster Enabler. This is configured using the policy options on the RecoverPoint GUI, as shown in Figure 8.

Figure 8  RecoverPoint policy options


The consistency group contains the replicated LUNs, as shown on the Replication Sets tab in Figure 9. The tab shows both the local and remote LUNs; each replicated pair is defined under a replication set within the consistency group.

Figure 9  RecoverPoint Replication Sets tab


Test 1: Single server failure at the primary site

The GE-DB1 node is taken offline (using restart of the server/shutdown/stopping cluster service).

Figure 10  GE-DB2 taking ownership of cluster resources

Once Microsoft clustering detects that the active node has gone offline or its resources are failing, it automatically moves the resources to the secondary node available in the cluster. In Figure 10 you can see the GE-DB2 node taking ownership of the cluster resources on the failure of node GE-DB1. The resources automatically move over to GE-DB2, which is a node at the primary site. This node already has visibility to the local storage, so no additional tasks on the array, or with RecoverPoint, need to be performed. This is a typical local two-node cluster on shared local storage.

Observations of node failure

Main database node failure

When a main database node fails, all resources move to the secondary node in the cluster. The cluster resources go offline temporarily while they are moved from node to node. The following observations were made:

• Centricity PACS-IW Viewer users are not logged out of the application.

• PACS-IW users are not able to open patient studies until all cluster resources have moved to the secondary node.

• The message shown in Figure 11 is displayed if users attempt to access the patient study list while the failover is in progress.

Figure 11  Study loading problem – main controller

• Once the resources are fully online on the secondary node, all access to the patient studies is available again.

• In the PACS-IW use case environment, the failover was observed to take approximately 3 minutes.

Main controller node failure

When a main controller node fails, all resources move to the secondary node in the cluster, and the cluster resources go offline temporarily while they are moved from node to node. The following observations were made:

• Centricity PACS-IW Viewer users are automatically logged out of the application.

Note: This is because the Tomcat cluster service needs to be restarted on the secondary node.

• PACS-IW users can log in again once the Tomcat service comes online on the secondary node, and can then access the patient studies list. Typically, the Tomcat resource restarts within a few seconds.

• PACS-IW users are not able to open patient studies until all cluster resources have moved to the secondary node.

• The Study loading problem message box is displayed if users attempt to access a patient study while the failover is in progress. (This is expected behavior.)

• Once the resources are fully online on the secondary node, all access to the patient studies is available again.

Figure 12

Resources are online

• In the PACS-IW use case environment, the failover was observed to take approximately 3 minutes.
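The failover times reported above can be derived from a simple availability probe log. The following is a minimal sketch, assuming a hypothetical probe that records whether a patient study could be opened every 10 seconds; the probe, its interval, and the sample data are illustrative assumptions and are not part of the tested environment.

```python
from datetime import datetime, timedelta


def longest_outage(samples):
    """Return the longest continuous outage as a timedelta.

    samples: list of (timestamp, ok) tuples sorted by time. An outage starts
    at the first failed probe and ends at the next successful probe. An
    outage that is still open at the end of the log is not counted.
    """
    longest = timedelta(0)
    outage_start = None
    for timestamp, ok in samples:
        if not ok and outage_start is None:
            outage_start = timestamp            # first failed probe
        elif ok and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None                 # service is back
    return longest


# Hypothetical probe log: one probe every 10 seconds, failing for probes 6..23.
start = datetime(2011, 4, 1, 10, 0, 0)
samples = [(start + timedelta(seconds=10 * i), not (6 <= i < 24)) for i in range(40)]

print(longest_outage(samples))  # 0:03:00
```

With a 10-second probe interval the measured value is accurate only to within one interval, which is adequate for comparing results against an "approximately 3 minutes" observation.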


Test 2: Total server failure at the primary site

In this scenario both primary nodes in the cluster are lost. The GE-DB2 node is now taken offline.

Figure 13

Primary nodes lost

1. Cluster Enabler manages the failover of the cluster to the secondary site from a Microsoft clustering perspective. In this case the resources move to GE-DRDB2, which was the next node defined in the preferred owner list for the cluster.


Figure 14 shows the status of the consistency group in RecoverPoint that is replicating the primary LUNs to the secondary array.

Note: The sites are called PRODUCTION and DR respectively, and the data flow is from PRODUCTION to DR.

Figure 14

Data flow from PRODUCTION to DR


2. When site-to-site failover is initiated due to server loss, RecoverPoint automatically enables access to the LUNs at the secondary site. Figure 15 shows the sequence on the RecoverPoint GUI during the failure. Here RecoverPoint is in the process of changing the replication direction; it points to the latest available image on the secondary site and grants the remote nodes access to the target LUNs.

Figure 15

Replication direction changing


3. The direction of the replication has been reversed and is now flowing from DR to PRODUCTION, as shown in Figure 16. The group performs a quick initialization to ensure that data is fully synchronized between the sites.

Figure 16

Replication direction is reversed


4. Hosts at the target site now have access to the LUNs on the secondary storage. The DR site is now shown as being the production source, as shown in Figure 17.

Figure 17

DR site is the production source


5. Now all disk resources are available and hosts are online at the secondary site. Site failover is complete.

Figure 18

Hosts online at secondary site
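The sequence in steps 1 to 5 can be pictured as two decisions: Microsoft clustering picks the next available node from the preferred owner list, and the consistency group reverses its replication direction when that node is at the other site. The sketch below is a simplified model under stated assumptions, not RecoverPoint or Cluster Enabler code. The ordering of the preferred owner list and the node-to-site mapping are assumptions made for illustration; only the node names and site names come from the tested environment.

```python
from dataclasses import dataclass


def next_owner(preferred_owners, online):
    """Return the first node in the preferred owner list that is online."""
    for node in preferred_owners:
        if node in online:
            return node
    return None


@dataclass
class ConsistencyGroup:
    source: str = "PRODUCTION"
    target: str = "DR"

    def reverse(self):
        # The former target site becomes the production source (steps 3 and 4).
        self.source, self.target = self.target, self.source


# Assumed ordering and site mapping, for illustration only.
preferred = ["GE-DB1", "GE-DB2", "GE-DRDB2"]
site_of = {"GE-DB1": "PRODUCTION", "GE-DB2": "PRODUCTION", "GE-DRDB2": "DR"}

group = ConsistencyGroup()
owner = next_owner(preferred, online={"GE-DRDB2"})  # both primary nodes are lost

# Reverse replication only when the new owner is at the current target site.
if owner is not None and site_of[owner] == group.target:
    group.reverse()

print(owner, group.source, "->", group.target)  # GE-DRDB2 DR -> PRODUCTION
```

If GE-DB2 were still online, `next_owner` would return it and the replication direction would stay unchanged, which matches the behavior observed in Test 1.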


Observations of total server failure

Total primary server failure

When a main database node fails, all resources move to the secondary node in the cluster, and the cluster resources go offline temporarily while they are moved from node to node. The following observations were made:

• Centricity PACS-IW Viewer users are not logged out of the application.

• PACS-IW users are not able to open patient studies until all cluster resources have moved to the secondary node.

• The Study loading problem message box is displayed if users attempt to access a patient study while the failover is in progress. (This is expected behavior.)

• Once the resources are fully online on the secondary node, all access to the patient studies is available again.

• In the PACS-IW use case environment, the failover was observed to take approximately 3 minutes.

Main controller node failure

When a main controller node fails, all resources move to the secondary node in the cluster, and the cluster resources go offline temporarily while they are moved from node to node. The following observations were made:

• Centricity PACS-IW Viewer users are automatically logged out of the application.

Note: This is because the Tomcat cluster service needs to be restarted on the secondary node.

• PACS-IW users can log in again once the Tomcat service comes online on the secondary node, and can then access the patient studies list. Typically, the Tomcat resource restarts within a few seconds.

• PACS-IW users are not able to open patient studies until all cluster resources have moved to the secondary node.

• The Study loading problem message box is displayed if users attempt to access a patient study while the failover is in progress.

• Once the resources are fully online on the secondary node, all access to the patient studies is available again.

Figure 19

Resource fully online

• In the PACS-IW use case environment, the failover was observed to take approximately 3 minutes.


Test 3: Primary storage failure

This section describes the procedure used to simulate a primary storage failure.

Note: In this example the main database cluster (SQL) is used.

Figure 20

Main database cluster

Storage failure is simulated by shutting down the CLARiiON ports on the switch in the fabric. The storage failure is shown on the RecoverPoint GUI, where the RPAs and Storage icons display error symbols due to access problems, as shown in Figure 21.

Figure 21

Errors on the RPAs and Storage icons


In the event of a storage failure the failover to the remote site is not an automated process.

Note: The consistency group(s) need to have their policy settings changed.

1. On the RecoverPoint GUI, on the Policy tab for the group, the option “Group is in maintenance mode. It is managed by RecoverPoint, CE can only monitor” must be selected and applied.

Figure 22

Policy tab

2. On the Status tab select “Enable Image Access” from the popup menu.

Figure 23

Status tab


3. Select the option “Select the latest image”.

Figure 24

Enable Image Access window

4. Select the option “Logged access (physical)”.

Figure 25

Image access mode window

5. Click Finish.

Figure 26

Summary window


6. Repeat steps 1 to 5 for all relevant consistency groups.

7. Image access is now enabled and hosts can access the secondary storage, as shown in Figure 27.

Figure 27

Hosts can access secondary storage

At this point the DR nodes are ready and the application is available. Although the example in Figure 27 shows the primary storage failure for the database nodes, the failure also affects the main controller nodes. As a result, users have to log in to the PACS-IW application again because of the Tomcat resource failover process.


Figure 28

PACS-IW Study list

The failover to the DR site needs to be completed once the application is verified to be online.

8. Specify the secondary site as the production site by selecting “Failover to Remote” from the popup menu.

Figure 29

RecoverPoint failover to remote site


Figure 30

Start replication transfer from remote

If the remote array is not available at this point the start transfer operation will not happen. Once the array is confirmed to be back online, the transfer will resume automatically. Until that point, red Xs will remain on the Storage and RPAs icons on the RecoverPoint GUI.

9. On the Policy tab change the policy settings back to “Group is managed by CE, RecoverPoint can only monitor”.

Figure 31

Policy configuration


10. Repeat steps 7 to 9, as required, for each of the consistency groups.

11. On recovery of the primary site from a site/array disaster, the consistency groups need to re-synchronize. RecoverPoint reads the source and target volumes and performs a checksum compare of the data, and any differences are sent to the target LUN. This is referred to as a Full Sweep.

12. Once complete the status will show replication from the secondary site to the primary site.

Figure 32

Replication from secondary to primary site
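The Full Sweep described in step 11 can be illustrated with a block-by-block checksum comparison. The following is a conceptual sketch only: the block size, the use of SHA-256, and the in-memory byte buffers standing in for volumes are all assumptions, and RecoverPoint's actual implementation is not described in this paper.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def full_sweep(source: bytes, target: bytearray):
    """Compare per-block checksums and copy only the blocks that differ.

    Returns the indexes of the blocks that were sent to the target.
    """
    if len(source) != len(target):
        raise ValueError("source and target volumes must be the same size")
    resent = []
    for offset in range(0, len(source), BLOCK_SIZE):
        src_block = source[offset:offset + BLOCK_SIZE]
        tgt_block = bytes(target[offset:offset + BLOCK_SIZE])
        # Compare checksums rather than the raw data.
        if hashlib.sha256(src_block).digest() != hashlib.sha256(tgt_block).digest():
            target[offset:offset + len(src_block)] = src_block
            resent.append(offset // BLOCK_SIZE)
    return resent


# After failover the secondary site is the source and the recovered
# primary site is the target.
source = b"AAAABBBBCCCCDDDD"
target = bytearray(b"AAAAxxxxCCCCyyyy")

print(full_sweep(source, target))  # [1, 3]
print(bytes(target) == source)     # True
```

Comparing checksums means that only the blocks that changed while the primary array was unavailable are transferred, rather than the full contents of every volume.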

Observations of a primary storage failure

In the case of an unplanned primary storage failure, failover is not an automated process. The storage administrator must manage the failover process and recover the application by presenting its data to the secondary site. In planned storage failure scenarios a full recovery time of approximately 3 minutes was observed.
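Because the storage failover is manual and must be repeated for every consistency group, it can help to track the procedure as an ordered checklist. The sketch below models the order of the GUI actions from the Test 3 procedure; it does not call RecoverPoint in any way. The step labels paraphrase the GUI options, and the second group name, MainController, is a hypothetical name used only for illustration.

```python
# Steps 1 to 5 of the procedure: enable access to the image at the DR site.
ENABLE_ACCESS_STEPS = (
    "policy: maintenance mode (managed by RecoverPoint, CE can only monitor)",
    "status: Enable Image Access",
    "select the latest image",
    "select Logged access (physical)",
    "finish",
)
# Steps 8 and 9: complete the failover once the application is verified online.
COMPLETE_FAILOVER_STEPS = (
    "status: Failover to Remote",
    "policy: managed by CE (RecoverPoint can only monitor)",
)
STEPS = ENABLE_ACCESS_STEPS + COMPLETE_FAILOVER_STEPS


class FailoverChecklist:
    """Tracks one consistency group and rejects out-of-order steps."""

    def __init__(self, group_name):
        self.group_name = group_name
        self.completed = []

    def next_step(self):
        if len(self.completed) == len(STEPS):
            return None
        return STEPS[len(self.completed)]

    def complete(self, step):
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"{self.group_name}: expected {expected!r}, got {step!r}")
        self.completed.append(step)

    @property
    def done(self):
        return self.next_step() is None


# "MainController" is a hypothetical group name.
checklists = [FailoverChecklist(name) for name in ("SQLServer", "MainController")]

for checklist in checklists:            # steps 1 to 5 for every group
    for step in ENABLE_ACCESS_STEPS:
        checklist.complete(step)
# ... the application is verified to be online at this point ...
for checklist in checklists:            # steps 8 and 9 for every group
    for step in COMPLETE_FAILOVER_STEPS:
        checklist.complete(step)

print(all(checklist.done for checklist in checklists))  # True
```

The checklist raises an error if, for example, "Failover to Remote" is attempted before image access has been enabled for that group.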


Physical to virtual conversion

The steps below provide an overview of the conversion of the environment from physical to virtual.

Figure 33

Physical to virtual architecture

1. RecoverPoint/CE configuration was removed from the physical environment. To achieve physical to virtual conversion, two ESX servers (4.1.0-0.0.260247) were added to both sites.

2. At the primary site new virtual machines were created for both the database and controllers.

3. For the database cluster the new virtual machines were added to the physical cluster, extending the existing cluster by two nodes.

4. The installed SQL 2005 SP2 was pushed out to the new virtual machines in the cluster.

5. Cluster resources were moved to the new virtual machine(s) and the PACS-IW software was validated to be functional.

6. The remaining physical nodes were evicted from the cluster, leaving the database configuration entirely on the virtualized environment. The same step was then performed on the main controller cluster: the new virtual machines were added to the existing cluster.

7. In order to install the GE PACS-IW software the new nodes need to own the cluster resources, so each virtual machine node was installed in turn. With cluster resources existing on the new virtual machine(s), the PACS-IW software was validated to be functional.

8. The remaining physical nodes were then evicted from the cluster, leaving the main controller cluster configuration entirely on the virtualized environment.
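The conversion above follows a rolling pattern: extend the cluster with virtual nodes, move the resources, then evict the physical nodes. The following is a minimal sketch of that ordering as a simulation. It models cluster membership as a plain list and does not use any Microsoft clustering API; choosing the first virtual node as the new owner is an assumption made for illustration.

```python
def migrate_cluster(physical_nodes, virtual_nodes):
    """Simulate the rolling physical-to-virtual migration of one cluster.

    Returns the final membership, the final resource owner, and a log of
    the actions taken in order.
    """
    members = list(physical_nodes)
    owner = members[0]               # resources start on a physical node
    log = []

    # Extend the existing cluster with the new virtual machines.
    for vm in virtual_nodes:
        members.append(vm)
        log.append(("add", vm))

    # Move the cluster resources to a virtual node before evicting anything.
    owner = virtual_nodes[0]
    log.append(("move_resources", owner))

    # Evict the physical nodes; the resource owner must never be evicted.
    for node in physical_nodes:
        assert node != owner, "cannot evict the node that owns the resources"
        members.remove(node)
        log.append(("evict", node))

    return members, owner, log


members, owner, log = migrate_cluster(["GE-DB1", "GE-DB2"], ["GE-DB3", "GE-DB4"])
print(members, owner)  # ['GE-DB3', 'GE-DB4'] GE-DB3
```

Because the resources are moved before any eviction, the cluster keeps an owner for its resources throughout the conversion.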


Virtualized environment

Technology overview

The architecture diagram in Figure 34 illustrates the configuration for PACS-IW implementation using ESX servers and RecoverPoint replication technology. Site A contains virtual machines participating in a Microsoft cluster for both the database and controller. Replication between sites is configured using RecoverPoint, and VMware Site Recovery Manager (SRM) is used to fail over virtual machines between sites.

Figure 34

VMware and RecoverPoint architecture


VMware vCenter SRM

VMware vCenter™ SRM delivers advanced capabilities for disaster recovery management, non-disruptive testing, and automated failover. VMware vCenter SRM can manage failover from production datacenters to disaster recovery sites, as well as failover between two sites with active workloads. Multiple sites can even recover into a single shared recovery site. SRM can also help with planned datacenter failovers such as datacenter migrations.

Figure 35

Site Recovery Manager plug-in within vCenter


VMware vCenter SRM with EMC RecoverPoint SRA

EMC RecoverPoint is a comprehensive replication and data protection solution designed and built from scalable and highly available hardware appliances and software modules. RecoverPoint provides local data protection (CDP), remote data protection (CRR), and concurrent local and remote (CLR) data protection support for heterogeneous storage, server, and network environments. The EMC RecoverPoint Storage Replication Adapter (SRA) for VMware vCenter SRM (the RecoverPoint Adapter) is a software package that allows VMware vCenter SRM to implement disaster recovery using EMC RecoverPoint. The Adapter supports SRM functions, such as failover and failover testing, using RecoverPoint as the replication engine.

Hardware and software components

The following sections identify, and briefly describe, the technology and components used in the validated virtualized environment. The environment consists of virtual machines configured with Microsoft clustering for both the controller and the database, forming two Microsoft clusters spread across two sites. The hardware and software components are listed below.

Hardware

• Gen 4 RecoverPoint Appliances x 4
• 4 servers with quad-core Intel Xeon X5570 processors, 2.93 GHz CPU, 144 MB memory
• CLARiiON CX4-120 x 2
• Network switch x 2
• FC switch x 4
• GE PACS-IW licensing USB dongle x 2

Software

• RecoverPoint 3.3.SP1.P1(k.119)
• ESX version 4.1.0-0.0.260247
• VMware vCenter Site Recovery Manager version 4.1
• EMC RecoverPoint Storage Replication Adapter 1.0.3
• CLARiiON splitter (RecoverPoint Enabler on CX4-120)
• Unisphere on CLARiiON – FLARE 30
• PowerPath/VE version 5.3
• Microsoft SQL 2005 SP2
• GE Centricity PACS-IW 3.7.3


Storage

The storage configuration consisted of two EMC CX4-120 arrays using Unisphere (FLARE 30), one array per site. All server and storage connections are Fibre Channel (FC). The primary fabric consisted of two FC switches and the secondary fabric consisted of two FC switches.

AnywhereUSB Hub2

The Centricity PACS-IW licensing is controlled via USB dongle. On each site an AnywhereUSB Hub2 containing a PACS-IW USB dongle is configured and made available to the main controller virtual machines, enabling licensing for the PACS-IW application, as shown in Figure 36.

Figure 36

AnywhereUSB Hub2 configuration

VMware vCenter SRM

Site-to-site failover and failback is achieved using VMware SRM. The RecoverPoint SRA was installed within vCenter allowing integration between VMware and RecoverPoint. Protection groups and recovery plans were created from within vCenter.


Configuration

EMC VMware

EMC VMware was configured with two ESX servers per site. For the purposes of the use case virtual machines were created, configured, and brought into the Microsoft cluster; refer to the Physical to virtual conversion section. Alternatively, when performing a new initial GE PACS-IW installation in a fully virtualized environment, the virtual machines can be created on each ESX server, as per GE PACS-IW documentation guidelines.

EMC RecoverPoint

EMC RecoverPoint was configured with two appliances per site in a continuous remote replication (CRR) setup. The RecoverPoint CLARiiON splitter was installed on each of the CLARiiON arrays. Consistency groups were configured to provide CRR protection of the database and controller LUNs. VMware datastores for the database and controller were also protected using RecoverPoint.

Microsoft cluster

The Microsoft cluster now consists of two virtual nodes and the MNS/FSW node, as shown in Figure 37.

Figure 37

Microsoft cluster
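The role of the MNS/FSW node can be illustrated with a simple majority calculation. This sketch assumes that each of the two virtual nodes and the witness carries exactly one vote; that vote weighting is an assumption for illustration and is not stated in this paper.

```python
def has_quorum(total_votes: int, votes_online: int) -> bool:
    """A majority-based cluster keeps quorum while more than half the votes are online."""
    return votes_online > total_votes // 2


TOTAL_VOTES = 3  # two virtual nodes plus the MNS/FSW node, one vote each (assumed)

print(has_quorum(TOTAL_VOTES, 3))  # True: everything is online
print(has_quorum(TOTAL_VOTES, 2))  # True: one node is lost, the witness still votes
print(has_quorum(TOTAL_VOTES, 1))  # False: a single surviving vote is not a majority
```

With only two voting nodes and no witness, losing either node would leave one vote out of two, which is not a majority. The third vote lets the cluster survive the loss of a single node.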


Figure 38 and Figure 39 show the virtual machines balanced across the ESX servers. However, these ESX servers were not set up as a High Availability (HA) or Distributed Resource Scheduler (DRS) cluster, because SRM does not currently support migration of DRS or HA settings.

Figure 38

Virtual machines on one ESX server

Figure 39

Virtual machines on another ESX server

Within the vCenter Plug-in Manager interface ensure that the SRM plug-in is enabled.

Figure 40

vCenter Plug-in Manager


Configure VMware vCenter SRM

To configure VMware vCenter SRM refer to the Summary in Figure 41. The local site consists of the primary vCenter server and the paired site consists of the secondary vCenter server.

Figure 41

Site Recovery Manager

1. On the primary site create the protection groups, as shown in Figure 42.

Figure 42

Protection groups


2. Enter the name of the protection group, as shown in Figure 43.

Figure 43

Create Protection Group - Name

3. Select the datastore group, as shown in Figure 44. In this example the datastore group for the GE-DB3 and GE-DB4 virtual machines is used. The procedure is the same for the GE-MC3 and GE-MC4 virtual machines.

Figure 44

Create Protection Group – Datastore Group


4. Select a placeholder to store the virtual machines. In this example, the placeholder for the database virtual machines is selected. When choosing a placeholder for the main controller virtual machines select a separate datastore.

Figure 45

Select placeholder

On the secondary virtual center, you can see that the recovery machines have been created, as shown in Figure 46.

Figure 46

Recovery machines


5. On the secondary site virtual center, create the recovery plan for the protection groups that were created at the primary site, as shown in Figure 47.

Figure 47

Recovery plan creation

6. Enter the recovery plan name and description, as shown in Figure 48.

Figure 48

Create Recovery Plan - Name


7. Select the protection groups that were created on the primary site, as shown in Figure 49.

Figure 49

Create Recovery Plan – Protection Groups


8. On the next screen click Finish, as shown in Figure 50.

Figure 50

Create Recovery Plan – Suspend Local Virtual Machines


9. Run the recovery plan on the Secondary Site Virtual Center server. You can run a “Test Recover Plan” operation to ensure the plan works correctly.

Figure 51

Site Recovery – Failover to DR

10. Once the test recovery has completed successfully, click Continue to clean up and return to a ready state.

Figure 52

Site Recovery – Continue
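The relationship between protection groups and the recovery plan configured in the steps above can be summarized as a small data model. This is a planning sketch only and does not use the SRM API. The virtual machine names come from the tested environment; the group names, datastore names, and plan name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionGroup:
    name: str
    datastore_group: str
    placeholder_datastore: str
    vms: tuple


@dataclass
class RecoveryPlan:
    name: str
    protection_groups: list

    def validate(self, required_vms):
        """Return a list of problems; an empty list means the plan looks complete."""
        problems = []
        covered = {vm for group in self.protection_groups for vm in group.vms}
        missing = sorted(set(required_vms) - covered)
        if missing:
            problems.append("unprotected virtual machines: " + ", ".join(missing))
        # Step 4 calls for a separate placeholder datastore per group.
        placeholders = [g.placeholder_datastore for g in self.protection_groups]
        if len(set(placeholders)) != len(placeholders):
            problems.append("protection groups share a placeholder datastore")
        return problems


# Group, datastore, and plan names below are hypothetical.
database = ProtectionGroup("PG-Database", "DS-DB", "Placeholder-DB", ("GE-DB3", "GE-DB4"))
controller = ProtectionGroup("PG-Controller", "DS-MC", "Placeholder-MC", ("GE-MC3", "GE-MC4"))
plan = RecoveryPlan("Failover to DR", [database, controller])

print(plan.validate(["GE-DB3", "GE-DB4", "GE-MC3", "GE-MC4"]))  # []
```

A check of this kind can be run before the test recovery in step 9 to catch a virtual machine that was left out of every protection group.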


Test and validation

Primary ESX node failure

In the event of an ESX server failure at the local site, we allow Microsoft clustering to control the movement of the cluster resources for both database and controller virtual machines. The resources automatically move over from GE-DB3 to GE-DB4 and GE-MC3 to GE-MC4 at the primary site. This node already has visibility to the local storage so no additional tasks on the array, or with RecoverPoint, need to be performed at this time. This functions as a typical local two-node cluster on shared local storage.

Primary site failure

In this scenario the user is required to intervene and run the recovery plan already configured within Site Recovery in vCenter. This allows SRM to fail over the virtual machines from the primary to the secondary site.

1. Log in to vCenter at the remote site.

2. Click the Run Recovery Plan button, as shown in Figure 53.

Figure 53

Run Recovery Plan


Figure 54

Running the recovery plan – Failover to DR

3. Failover to the remote site in progress.

Figure 55

Test the application

4. Once the recovery plan has completed, the virtual machines for both the main database and controller will have come online at the secondary site.


Figure 56

Secondary site

5. The Cluster Administrator window shows that the GE-DB cluster is online, as shown in Figure 57.

Figure 57

Cluster is online


6. The Integrad Control Panel shows that all main controller resources are online, as shown in Figure 58.

Figure 58

Main controller resources are online

7. The Cluster Administrator window shows that the main controller cluster is online, as shown in Figure 59.

Figure 59

Main controller cluster is online

8. Log in to the PACS-IW application and confirm that the application is functioning as expected and that users can access their patient studies.


Observations of the failover process

When configured, the failover process took approximately 8 minutes for all resources to come online and the application to be available, based on the workloads tested in this solution.

To execute a failback procedure from the secondary site to the primary site, the existing virtual machines on the primary site need to be removed from the inventory within the virtual center. The failback process then needs to be configured in reverse order.


Conclusion

Summary

For the first time, GE PACS-IW users are able to deploy an integrated solution leveraging EMC capabilities for automated failover and recovery processes that can deliver repeatable RPO and RTO.

Findings

Testing showed how:

• A standard, repeatable, and highly automated failover and recovery solution can be created for all GE PACS-IW users, regardless of whether the deployment is physical or virtual

• An RTO of less than 8 minutes in a virtualized environment and less than 3 minutes in a physical environment can be achieved through this solution

• Server virtualization results in a lowering of IT costs by treating the data center as a single pool of processing

• With RecoverPoint and SRM software tools from EMC and VMware, Centricity PACS-IW users can manage recovery in a much more granular fashion, down to the transaction level

Next steps

To learn more about this and other solutions, contact your EMC Account Manager or visit www.EMC.com/healthcare.


References

Product documentation

For additional information on the products discussed in this white paper, see the following:

• EMC RecoverPoint/Cluster Enabler – A Detailed Review
• EMC RecoverPoint/Cluster Enabler Plug-in Version 4.0 Product Guide
• EMC RecoverPoint Replicating VMware Technical Notes
• Improving VMware Disaster Recovery with EMC RecoverPoint – Applied Technology
• GE Healthcare Centricity PACS-IW Complete Installation Guide
• GE Healthcare Centricity PACS-IW User’s Manual
