XTREMIO DATA PROTECTION (XDP) “LIGHTS OUT” SSD FAILURES WITH NO PERFORMANCE IMPACT
ABSTRACT
XtremIO Data Protection (XDP) is an exclusive technology found only on XtremIO all-flash arrays. Its primary function is to protect data from SSD failures. However, XDP also has several unique and important benefits in performance, consistency, flash endurance, data integrity, and simplicity of operation. Detailed information about how XDP works can be found in the EMC White Paper, “Flash Specific Data Protection”. Here we present one of XDP’s unique advantages – the ability to maintain performance levels in the face of multiple SSD failures, while allowing failed devices to remain in place using XDP’s “hot space” feature until a convenient replacement window can be arranged.
Copyright © 2013 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. EMC2, EMC, the EMC logo, and the RSA logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries. VMware is a registered trademark of VMware, Inc. in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners. © Copyright 2013 EMC Corporation. All rights reserved. Published in the USA. 02/13 White Paper
XDP: CONSISTENT PERFORMANCE UNDER MULTIPLE FAILURES
TABLE OF CONTENTS
INTRODUCTION
XDP LIGHTS-OUT RESILIENCY ENABLED BY “HOT SPACE”
TEST SETUP
PROCEDURE
VMWARE PERFORMANCE
ADDING SSDS TO THE XDP REDUNDANCY GROUP
CONCLUSION
HOW TO LEARN MORE
FLASH IMPLICATIONS IN ENTERPRISE STORAGE ARRAY DESIGNS
INTRODUCTION
XtremIO Data Protection (XDP) is an exclusive technology found only on XtremIO all-flash arrays. Its primary function is to protect data from SSD failures. However, XDP also has several unique and important benefits in performance, consistency, flash endurance, data integrity, and simplicity of operation. Detailed information about how XDP works can be found in the EMC White Paper, “Flash Specific Data Protection”. Here we present one of XDP’s unique advantages – the ability to maintain performance levels in the face of multiple SSD failures, while allowing failed devices to remain in place using XDP’s “hot space” feature until a convenient replacement window can be arranged.
XDP LIGHTS-OUT RESILIENCY ENABLED BY “HOT SPACE”
While traditional RAID algorithms maintain unused devices (hot spares) which are only utilized in the event of a failure, XDP instead manages evenly distributed “hot space” in the array with no dedicated spare devices. The XtremIO array always maintains enough hot space to guarantee a rebuild can take place. However, if the administrator desires, failed devices can be left in place in the array and the hot space capability can be leveraged to protect against subsequent failures. This is ideal when maintenance windows cannot be scheduled or when the array is in a remote data center that is not easily reached. Should a subsequent failure occur, XDP’s hot space feature will simply use the remaining free capacity in the array to perform the required recovery operation. This can be done for as many as five failed SSDs per X-Brick, or until no free space remains in the array.
The example that follows was performed by an actual XtremIO customer (who wished to remain anonymous) in their test environment and illustrates the power of XDP, the hot space capability, and the resiliency and performance consistency of the XtremIO array.
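The hot space rule described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not XDP's implementation: the class name ToyArray, the method fail_ssd, the per-rebuild capacity cost, and the way reserved and free space are accounted for are all hypothetical. Only the limit of five failed SSDs per X-Brick and the "until no free space remains" condition come from the text.

```python
from dataclasses import dataclass, field

# Limit stated in the paper for XIOS v2.20.
MAX_FAILED_PER_XBRICK = 5


@dataclass
class ToyArray:
    """Hypothetical accounting model of hot space; not the real XDP logic."""
    free_tb: float                     # free capacity usable as hot space
    reserved_rebuilds: int = 1         # assumption: rebuilds covered by reserved space
    failed_per_brick: list = field(default_factory=lambda: [0, 0])  # two X-Bricks

    def fail_ssd(self, brick: int, rebuild_cost_tb: float) -> bool:
        """Return True if the array can absorb one more SSD failure on `brick`."""
        if self.failed_per_brick[brick] >= MAX_FAILED_PER_XBRICK:
            return False               # per-X-Brick limit reached
        if self.reserved_rebuilds > 0:
            self.reserved_rebuilds -= 1        # use the reserved space first
        elif self.free_tb >= rebuild_cost_tb:
            self.free_tb -= rebuild_cost_tb    # otherwise claim free capacity
        else:
            return False               # no free space left for a rebuild
        self.failed_per_brick[brick] += 1
        return True


if __name__ == "__main__":
    array = ToyArray(free_tb=10.0)
    # rebuild_cost_tb=0.3 is an illustrative value, not a product specification.
    print([array.fail_ssd(0, 0.3) for _ in range(6)])
```

In this sketch the sixth failure on the same X-Brick is rejected because of the per-X-Brick limit, and a failure is also rejected when neither reserved nor free capacity can cover the rebuild.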
TEST SETUP
Goal: Observe XtremIO’s performance while pulling an SSD in each X-Brick up to the limit of the XIOS v2.20 software, currently five SSDs per X-Brick. This test will be performed under heavy system load to observe any degradation during and after device rebuilds.
IOmeter workload:
• 48 workers (2 VMs per host, 8 workers per VM, 8GB RAM)
• 70% write, 30% read
• 100% random
• 4K block size
• 4K offset
• 16 outstanding I/Os
System setup: XtremIO cluster consisting of two X-Bricks
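The workload parameters above imply a few derived figures that help when reading the results. The following is a rough sketch of that arithmetic. It assumes the 16 outstanding I/Os apply per worker and that "4K" means 4096 bytes; the function and constant names are illustrative only.

```python
# Workload parameters as listed in the test setup.
WORKERS_TOTAL = 48
WORKERS_PER_VM = 8
VMS_PER_HOST = 2
OUTSTANDING_PER_WORKER = 16   # assumption: the setting applies per worker
WRITE_PERCENT = 70
BLOCK_SIZE_BYTES = 4 * 1024   # assumption: "4K" means 4096 bytes

vms = WORKERS_TOTAL // WORKERS_PER_VM                    # VMs at full load
hosts = vms // VMS_PER_HOST                              # hosts running those VMs
total_outstanding = WORKERS_TOTAL * OUTSTANDING_PER_WORKER


def describe(iops: int) -> dict:
    """Split an IOPS figure into writes/reads and convert it to bandwidth."""
    writes = iops * WRITE_PERCENT // 100
    return {
        "write_iops": writes,
        "read_iops": iops - writes,
        "mb_per_s": iops * BLOCK_SIZE_BYTES / 1_000_000,
    }


if __name__ == "__main__":
    print(vms, hosts, total_outstanding)
    print(describe(330_000))
```

The six VMs implied by 48 workers match the procedure, which starts with four IOmeter VMs and later adds two more.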
PROCEDURE
1. Begin with a healthy array and pull one drive on each X-Brick while running 4 IOmeter VMs. Steady-state IOPS were about 250K before the SSDs were ejected. Here we can see the performance during the rebuild process.
Figure callouts: Note that physical capacity in the array is 13.5TB. Rebuild progress meter shown by the green bar.
2. After the rebuild finished, two more VMs running the same IOmeter workload were added. This increased IOPS to 330K.
Figure callout: Added two more VMs with workload.
3. Two additional SSDs are pulled and IOPS average 240K during the rebuild process. Four SSDs are now removed. Two rebuilds are occurring and IOPS are still averaging 240K.
Figure callouts: Two rebuilds were occurring during this drop and completed in ~5 minutes. Rebuild progress for two additional SSDs that were pulled.
4. After the rebuild process finished, performance returned to 330K IOPS. Note that the array’s physical capacity has dropped from 13.5TB to 12.9TB, as the “hot space” feature uses available capacity in the array to perform the rebuilds. XDP always reserves space for a single rebuild to take place. This reserved space is not exposed as usable capacity in the array.
Figure callouts: Performance has returned to 330K IOPS, even with four failed SSDs. Hot Space feature of XDP has claimed some of the array’s free capacity to perform the additional rebuilds.
5. A fifth and sixth SSD are pulled. IOPS are again maintained at around 240K during the rebuild. When the rebuilds complete, performance returns to 330K IOPS.
Figure callouts: Two rebuilds were occurring during this drop and completed in ~9 minutes. Performance has returned to 330K IOPS, even with six failed SSDs! Hot Space feature of XDP has claimed some of the array’s free capacity to perform the additional rebuilds.
6. A seventh and eighth SSD are pulled. Same effect as above.
Figure callouts: Two rebuilds were occurring during this drop and completed in ~9 minutes. Now eight SSDs are failed! Array maintains performance! XDP has taken additional Hot Space for the 7th and 8th rebuilds.
7. After the seventh and eighth rebuilds finished, IOPS returned to 330K.
Figure callouts: Two rebuilds are taking place. Eight failed SSDs – no impact to array performance. Rebuilds seven and eight take 9 minutes.
8. The ninth and tenth SSDs are pulled. As in previous steps, IOPS were maintained at 240K during the rebuilds and returned to 330K after the rebuilds completed.
Figure callouts: Ten failed SSDs – no impact to array performance! XDP’s Hot Space uses available free capacity to perform the rebuilds. Rebuilds nine and ten take 10 minutes.
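The figures reported across the procedure can be collected in one place to show how much of the healthy-array performance was retained during rebuilds. The sketch below only restates numbers from the steps above. The pairing of rebuild durations with steps follows the order of the screenshot callouts and is an assumption where a callout does not name the SSDs explicitly; all names are illustrative.

```python
# IOPS figures as reported in the procedure (rounded by the source).
HEALTHY_IOPS = 330_000   # six IOmeter VMs, no rebuild in progress
REBUILD_IOPS = 240_000   # average while two rebuilds were running

# (total SSDs failed after the step, reported rebuild duration in minutes)
# Assumption: durations are matched to steps by callout order.
REBUILD_STEPS = [(4, 5), (6, 9), (8, 9), (10, 10)]


def retained_percent(during_rebuild: int, healthy: int) -> float:
    """Share of healthy-array IOPS still delivered during a rebuild, in percent."""
    return round(100 * during_rebuild / healthy, 1)


def summarize() -> list:
    """Build one summary line per rebuild step."""
    pct = retained_percent(REBUILD_IOPS, HEALTHY_IOPS)
    return [
        f"{failed:>2} SSDs failed: ~{minutes} min rebuild at {pct}% of healthy IOPS"
        for failed, minutes in REBUILD_STEPS
    ]


if __name__ == "__main__":
    for line in summarize():
        print(line)
```

By this arithmetic the array delivered roughly 73% of its healthy IOPS while two rebuilds were running, and the reported rebuild windows lasted between about 5 and 10 minutes.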
VMWARE PERFORMANCE
With ten SSDs removed, ESXtop data was collected to analyze the array’s performance and latency as seen through the hypervisor. While in some cases latency increased compared to a completely healthy system, it was still extremely low overall.
ADDING SSDs TO THE XDP REDUNDANCY GROUP
Putting SSDs back into service is simply a matter of plugging them back into the disk enclosure. This was done one drive at a time per X-Brick. As XDP brings the SSDs back into service, IOPS temporarily drop to 260K and then recover to full performance once XDP is about 50% of the way through adding the SSDs back to the array.
Figure callout: Only a brief (~1 min) drop in performance as the SSDs are brought back into service in the array.
Adding the remaining eight SSDs to the array results in the same performance pattern as above for each add. Note that with the SSDs back in service, XDP’s hot space feature has given back the usable capacity in the array and it has returned to 13.5TB. Also note that each SSD addition process takes only a couple of minutes.
Figure callouts: Array is nearly back to complete health. Another minute and it will be 100%. Hot Space has returned all usable capacity. Each SSD addition takes only about a minute.
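The capacity figures reported in this test (13.5TB healthy, 12.9TB with four failed SSDs, and 13.5TB again once the SSDs were re-added) allow a rough estimate of how much free capacity hot space claimed. This is a back-of-the-envelope sketch, not a statement about XDP internals. It assumes the first pair of rebuilds was covered by reserved space, so only the second pair drew on usable capacity; the paper does not confirm that split, and all names are illustrative.

```python
# Capacity figures reported in the test.
HEALTHY_CAPACITY_TB = 13.5
CAPACITY_AFTER_FOUR_FAILURES_TB = 12.9

# Assumption: the first two rebuilds used reserved space, so only the
# third and fourth rebuilds consumed usable capacity.
REBUILDS_FROM_FREE_SPACE = 2


def hot_space_consumed_tb() -> float:
    """Usable capacity claimed by hot space after four failures."""
    return round(HEALTHY_CAPACITY_TB - CAPACITY_AFTER_FOUR_FAILURES_TB, 2)


def per_rebuild_tb() -> float:
    """Estimated capacity claimed per rebuild, under the assumption above."""
    return round(hot_space_consumed_tb() / REBUILDS_FROM_FREE_SPACE, 2)


def consumed_percent() -> float:
    """Claimed capacity as a share of the healthy array's physical capacity."""
    return round(100 * hot_space_consumed_tb() / HEALTHY_CAPACITY_TB, 1)


if __name__ == "__main__":
    print(hot_space_consumed_tb(), per_rebuild_tb(), consumed_percent())
```

Under these assumptions the two additional rebuilds claimed about 0.6TB in total, or roughly 4.4% of the array's physical capacity, all of which was returned once the SSDs were back in service.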
CONCLUSION
XtremIO’s flash-specific data protection algorithm, XDP, imparts new capabilities to the XtremIO array that are not possible on other storage systems, even on other all-flash arrays. The flexibility offered by “hot space” to leave failed SSDs in place without jeopardizing data on the array or application performance gives administrators new levels of security, comfort, and convenience. SSD failures no longer represent an emergency situation mandating immediate action, and XtremIO arrays in remote data centers can safely be operated with failed SSDs in place until such time as replacements can be conveniently scheduled.
HOW TO LEARN MORE
For a detailed presentation explaining XtremIO’s storage array capabilities and how it substantially improves performance, operational efficiency, ease-of-use, and total cost of ownership, please contact XtremIO at [email protected]. We will schedule a private briefing in person or via web meeting. XtremIO has benefits in many environments, but is particularly effective for virtual server, virtual desktop, and database applications.
EMC2, EMC, the EMC logo, XtremIO and the XtremIO logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries. VMware is a registered trademark of VMware, Inc., in the United States and other jurisdictions. © Copyright 2013 EMC Corporation. All rights reserved. Published in the USA. 10/13 EMC White Paper H12450 EMC believes the information in this document is accurate as of its publication date. The information is subject to change without notice.
CONTACT US To learn more about how EMC products, services, and solutions can help solve your business and IT challenges, contact your local representative or authorized reseller—or visit us at www.EMC.com.