big data - Bitpipe

E-Guide

How the media and entertainment industry is backing up ‘big data’

Big data is a term often used in the storage industry today. As companies accumulate more and more data, they must find more efficient ways to manage and store it. But the media and entertainment industry has been managing this big data for years. Read this SearchStorage.com E-Guide to see what technologies and strategies they use to cope with their big data issues.

Sponsored By:

SearchStorage.com E-Guide How the media and entertainment industry is backing up ‘big data’


Table of Contents
Lessons learned, applied in the media and entertainment data storage industry
Resources from Atempo


Lessons learned, applied in the media and entertainment data storage industry
By Terri McClure

Managing big data is all the talk across the IT landscape nowadays, but it’s a topic the media and entertainment industry has been dealing with longer than just about anyone. The last five years have seen a mass migration to digital media formats for both audio and video capture, production and delivery, not just for new content but also for older, celluloid content that is at risk of deterioration and loss.

Digital file formats are the "gift that keeps on giving" in the media and entertainment vertical industry. Vendors need to support file formats that can be delivered via many different endpoint devices. Take Pandora Radio, for instance. Pandora has more than 75 million registered users in the United States, and the company’s personalized radio stations play on more than 200 devices, including PCs, smart phones, iPads, and in-home connected devices such as televisions, Blu-ray players, table-top radios and digital media players. Delivering audio across so many endpoint devices means that Pandora needs to store data in different formats tailored to play on these devices, drastically multiplying its storage requirements.

And it’s not just the number of files and the multiplicative effect of storing multiple formats that create a challenge in media and entertainment: file formats are also growing in size and density. Just as higher pixel density in digital photography increases the size of our photos, supporting new high definition audio and video formats increases the size of a finished movie or song. For example, the shift from standard definition to high definition video increased storage requirements by a factor of six. And with technologies like 3D video, storage requirements can double because there are essentially two copies of the movie made, one for each eye.

Depending on video format and compression, a finished two-hour movie can range from 1.5 GB to 8 or 10 GB in size, and the raw footage from multiple takes, special effects and CGI edits can easily consume a petabyte of capacity if filmed in 3D. Additionally, soundtracks in multiple languages and the outtake and commentary segments produced add to the overall storage requirements.
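The multipliers described above compound, which a short calculation makes concrete. The following is a minimal sketch, not a sizing tool: the factor of six for high definition and the doubling for 3D come from the figures above, while the function name, the 1.5 GB standard definition baseline and the count of four delivery formats are assumptions chosen purely for illustration.

```python
# Factors quoted in the article.
HD_FACTOR = 6        # standard definition -> high definition
THREE_D_FACTOR = 2   # one copy of the movie per eye


def estimate_storage_gb(sd_size_gb, high_definition=False,
                        stereoscopic_3d=False, delivery_formats=1):
    """Rough storage estimate for one finished title (hypothetical helper).

    sd_size_gb       -- assumed size of the title in standard definition
    delivery_formats -- assumed number of device-specific copies kept
    """
    size = sd_size_gb
    if high_definition:
        size *= HD_FACTOR
    if stereoscopic_3d:
        size *= THREE_D_FACTOR
    # Each delivery format is stored as a separate full copy.
    return size * delivery_formats


# Assumed example: a 1.5 GB SD title, in HD and 3D, kept in 4 formats.
print(estimate_storage_gb(1.5, high_definition=True,
                          stereoscopic_3d=True, delivery_formats=4))
```

Treating every delivery format as a full-size copy is a simplification; in practice formats for small devices are likely to be smaller than the master, so the estimate is an upper bound under these assumptions.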


The media and entertainment industry has benefited from the digitization of media: once data is cataloged onto spinning disk, it can be retrieved almost immediately in response to world events. With the passing of a celebrity like Liz Taylor or Michael Jackson, the television station that can produce the footage for the lifetime retrospective has a competitive advantage when it comes to drawing advertisers and can charge more money for air time.

Digital media is also playing a big role in professional sports. Most major league baseball teams record and catalog every play: who was on the field, who is pitching, the pitch count, batter count, who is on which base and what pitch is thrown. Later analysis can be used to calculate statistical probabilities or just to study videos for "tells" that may indicate whether a pitcher will throw to first to hold a runner or make a pitch. The same goes for other professional sports, such as football and basketball.

So it comes as no surprise that the media and entertainment market has come under so much scrutiny by the big storage vendors. The major vendors are attacking the market from their positions of strength, with disk-based clustered NAS solutions, while a whole slew of tape vendors are seeing a second wind thanks to the streaming performance of tape and its cost effectiveness as an archive medium. Even cloud storage is playing a key role; the media and entertainment industry is an early adopter of cloud storage technology for enabling collaborative workflows and content distribution. We’ll tackle these technologies one at a time, starting with clustered NAS.

Why clustered NAS?

Media and entertainment post-production work is largely done via file-based workflows so that the editing team can share access to the raw footage. A number of clustered file systems excel at the streaming performance needed for the heavy sequential workloads in these environments, such as Quantum Corp.’s StorNext, FalconStor Software’s HyperFS and IBM GPFS. There are also a number of integrated systems from vendors like BlueArc Corp., DataDirect, EMC Isilon, NetApp Inc. and Panasas Inc. And solutions like the Avere Systems Inc. scale-out NAS services platform can add scale-out manageability and performance, as well as edge caching, in front of legacy scale-up NAS.


These systems are seeing success in this market because most traditional scale-up NAS solutions simply cannot meet the throughput or capacity demands of streaming and storing very large files. They are designed to serve up a high number of small files and lots of file IO, rather than a smaller number of large files that require high file streaming throughput. Clustered NAS systems are designed to scale out into a multi-node architecture by adding processing power and network connections to increase throughput in line with additional capacity. Many nodes can be deployed to work in parallel to stream large files over all of the available processors and network connections, providing very high throughput rates. With most systems, as nodes are added, they are used and managed as a single system, absorbing new capacity as it is added and automatically load balancing across the cluster. These systems can typically scale into multi-petabyte capacities to meet the burgeoning storage demands of today’s media and entertainment industry.

Archiving takes center stage

In the media and entertainment industry, the term archive often applies to the working data set used in post production. The term deep archive is used for long term storage of finished products. The active project often leverages clustered NAS for performance, but once a project is completed, it is moved to a long term storage medium such as dense disk-based storage, tape or cloud storage.

It is not just the finished project that needs to be archived. Over time, sequels or outtake reels are produced, or the anniversary of an iconic movie saga like Star Wars comes along and requires easy access to raw footage that can be woven together for a compelling story or a peek behind the scenes. The value of all of the footage is immeasurable: it simply cannot be recreated, but it can be leveraged for value time and again. The volume of raw footage that needs to be archived for a single project can be massive.

Multi-tier archive solutions and dense disk-based object storage archives are often used to manage copies across the tiers of storage, but they don’t back them up, because there are simply not enough hours in the day to perform a traditional backup of these massive rich media data repositories. For graphics-intensive productions such as animated movies, many of these companies would rather back up the raw files, which tend to be


smaller in size and more suitable for deduplication than the rendered files or the final product, because they know that in a worst-case scenario the rendered files could be recovered by resubmitting the raw files to the rendering farms. But losing the rendered files would still cause a disruption, which is why archiving the rendered files is a good solution for those types of companies.

Finally, it’s almost impossible to recreate many of these images. If a nature documentary crew is out shooting a hawk diving for its prey, the camera operators don’t necessarily get to say "hey, let’s retake that sequence." The hawk just doesn’t take orders like that. And if that take is lost, it takes time and money to go shoot it again.

From a functional perspective, and because of the file types used in media and entertainment, end users find that newer disk-based solutions with deduplication technology really don’t help them at all. Tape is still a big player for them because the sheer size of the files lets them stream the data efficiently. Plus, storing tape on a shelf is more cost effective than keeping the data on spinning disk.

The cloudification of media and entertainment

Since conventional data protection methods do not always apply because of the sheer volume of data in media and entertainment, cloud storage is gaining some popularity. Cloud storage solutions, like those offered by Nirvanix Inc. and Amazon, can offer cost-effective multi-site replication to provide an added layer of data protection. Cloud storage can also offer a cost-effective long term archive strategy that keeps the massive amounts of raw footage available and accessible for easy reuse. While more expensive than storing tapes, cloud storage carries the advantage of faster access to archive footage.

Cloud storage can also help on the content distribution front. Both Amazon and Nirvanix allow creation of customer buckets (Amazon) or child accounts (Nirvanix) to which content creators can upload files specifically for their customers to access and download. This is invaluable in collaborative workflows where a CGI house may need to work on a portion of a film and can have near immediate access to the required files.


The bigger truth

The next wave of storage wars will be over rich media content, not just because of the massive advances in media and entertainment technology but also because of the entire wave of the "consumerization of IT". There is more multimedia and user generated content making its way into IT and being delivered out via any number of endpoint access devices. And that generates a ton of storage revenue opportunity. But make no mistake: the media and entertainment market is not your typical IT shop. The buyers are often line-of-business managers concerned with cost and time to market, meaning performance and efficiency are at the forefront. And traditional backup policies and methods don’t apply. The working data sets are too big to be backed up in a typical backup window, and a restore would be a costly nightmare if files needed to be rebuilt from incremental backups.
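The backup window problem can be made concrete with simple arithmetic. The sketch below assumes a one-petabyte working set (the scale mentioned earlier for raw 3D footage), a sustained throughput of 1,000 MB/s and an eight-hour nightly window; the throughput and window figures, like the function names, are hypothetical values for illustration and are not taken from the article.

```python
def full_backup_hours(dataset_tb, throughput_mb_per_s):
    """Hours needed to stream a full copy of the data set.

    Uses decimal units (1 TB = 1,000,000 MB). Both arguments are
    assumed inputs; real throughput varies with hardware and file mix.
    """
    dataset_mb = dataset_tb * 1_000_000
    seconds = dataset_mb / throughput_mb_per_s
    return seconds / 3600


def fits_window(dataset_tb, throughput_mb_per_s, window_hours):
    """True if a full backup completes within the backup window."""
    return full_backup_hours(dataset_tb, throughput_mb_per_s) <= window_hours


# Assumed example: 1 PB (1,000 TB) at 1,000 MB/s, 8-hour window.
hours = full_backup_hours(1000, 1000)
print(f"{hours:.1f} hours, about {hours / 24:.1f} days")
print("fits 8-hour window:", fits_window(1000, 1000, 8))
```

Under these assumptions a single full pass takes on the order of days rather than hours, which is consistent with the point that such data sets cannot be handled by a conventional nightly backup.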


PROTECTING AND PRESERVING CONTENT IN ALL STAGES OF DIGITAL MEDIA WORKFLOWS

Preserve with Atempo Digital Archive
Atempo Digital Archive allows you to migrate content from primary storage to near-line and deep archival storage: disk, tape and cloud. With Atempo Digital Archive, you can maximize storage capacity today while preserving content long-term to monetize in the future.

Protect with Atempo Time Navigator
Atempo Time Navigator provides enterprise-class data protection for complex, heterogeneous environments, with high performance and proven scalability to the petabyte level.

www.atempo.com


Resources from Atempo

White Paper: The Digital Archive's New Leading Role in Media Workflows
Customer Story: How REELZCHANNEL Delivers More Content with Less Effort
Customer Story: How the National Film Board of Canada Preserves 70 Years of Film History

About Atempo

Atempo enables organizations to preserve and protect digital information simply and effectively, across any infrastructure, on any platform, over long periods of time. Atempo's archiving solutions deliver policy-based and workflow-driven management of rich media files, e-mail and other digital assets to maximize the efficiency and performance of storage systems and reduce long-term storage costs. Atempo's fully-integrated software portfolio also includes backup and recovery of heterogeneous servers, workstations and laptops throughout the enterprise, from the data center to remote offices. Atempo serves thousands of customers around the world through a sales and support network of over 200 resellers and partners.
