The Ediscovery.com Pulse Benchmarks
Insightful ediscovery metrics that aren’t on a “need-to-know” basis
December 2015
Ever wonder if your ediscovery project was “normal”? How do your collection sets, filtering rates and production volumes compare to other ediscovery projects? At Kroll Ontrack, we handle thousands of matters each year and track thousands of nuanced metrics associated with those matters. After aggregating and normalizing this data, it became clear to us that the trends within these metrics would be most valuable in the hands of ediscovery practitioners. The ediscovery.com Pulse Benchmarks are a set of trended data showing key developments in the ediscovery market. The insights provided do not come at the expense of divulging any client-specific or company confidential data. Our goal is to break new ground by arming you with this powerful information and helping you better plan and execute your ediscovery projects.
Source data is on the rise as big cases get bigger.
2013 and 2014 saw an increase in the amount of source gigabytes (GBs) — the average number of GBs collected prior to filtering and processing per project.
Even though parties are collecting data more diligently and custodian counts per matter continue to decline, big data is driving up data volumes per custodian, resulting in increased data volumes per case.
Source Gigabytes - The average number of gigabytes collected prior to filtering and processing.
Types of source media fluctuate as ediscovery processing becomes standardized.
Source data for ediscovery processing can be transmitted in many formats.
As ediscovery processing has become more standardized, the percentage of media types used for data transmission has fluctuated, trending away from data being transmitted on tapes, hard drives and CDs/DVDs and toward more data being transmitted via electronic file transfer (also known as FTP).
Media Types - The percentages of different media types received for processing.
Project level deduplication on the rise.
Deduplication is the process of comparing documents based on characteristics and removing duplicate records from the data set. Deduplication can be applied across the entire data set (project level deduplication, also known as global deduplication) or within a subset of data (such as a single custodian's documents, also known as custodian deduplication).
[Chart: Custodian % versus Project %; labeled values 64% and 36%]
In efforts to continue to drive down review set volumes, project level deduplication selections are on the rise.
Deduplication Selection - The percentage of projects choosing custodian level deduplication versus project level deduplication.
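The difference between the two deduplication scopes described above can be illustrated with a minimal Python sketch. This is not Kroll Ontrack's method: the `Document` type, the function names and the choice of a SHA-256 content hash as the comparison "characteristic" are all assumptions made for illustration.

```python
import hashlib
from typing import NamedTuple


class Document(NamedTuple):
    # Hypothetical record type for illustration only.
    doc_id: str
    custodian: str
    content: bytes


def fingerprint(doc: Document) -> str:
    # Assumption: two documents are duplicates when the SHA-256 hash
    # of their raw content matches.
    return hashlib.sha256(doc.content).hexdigest()


def dedupe_project(docs):
    """Project level (global): keep the first copy seen anywhere in the data set."""
    seen = set()
    kept = []
    for doc in docs:
        key = fingerprint(doc)
        if key not in seen:
            seen.add(key)
            kept.append(doc)
    return kept


def dedupe_custodian(docs):
    """Custodian level: keep the first copy seen within each custodian's data."""
    seen = set()
    kept = []
    for doc in docs:
        key = (doc.custodian, fingerprint(doc))
        if key not in seen:
            seen.add(key)
            kept.append(doc)
    return kept


docs = [
    Document("D1", "alice", b"Q3 forecast"),
    Document("D2", "alice", b"Q3 forecast"),  # duplicate within one custodian
    Document("D3", "bob", b"Q3 forecast"),    # duplicate across custodians
    Document("D4", "bob", b"Board minutes"),
]

print([d.doc_id for d in dedupe_custodian(docs)])  # ['D1', 'D3', 'D4']
print([d.doc_id for d in dedupe_project(docs)])    # ['D1', 'D4']
```

In this toy data set, custodian level deduplication keeps one copy of the shared file per custodian, while project level deduplication keeps a single copy overall, which is why the project level option yields the smaller review set.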
Non-essential custodians excluded faster and more efficiently.
The average number of custodians collected per project has declined significantly in recent years. In no small part, this trend is likely due to the fact that cost-conscious litigants are leveraging new collection methods and advanced pre-filtering technologies, combined with more effective custodian interviews.
Custodians Per Project - The average number of custodians in a review database.
Ratio of email to loose files stabilizes.
While the volume of unstructured data and the complexity of data stores proliferate, email still remains the dominant data form for ediscovery. After a decline in 2013, the percentage of email processed rose to previously recorded levels.
Data Type - The average percentage of email processed, before filtering.
Better technology equals fewer reviewers on every project.
There is simply no escaping the old adage that time is money when it comes to ediscovery, especially when it comes to document review. The empowering impact of modern filtering and review technologies, like predictive coding, is likely a significant driving force behind this trend.
Number of Reviewers Per Project - The average number of reviewers on a single ediscovery document review.
Better filtering significantly decreases production volumes.
The average number of gigabytes contained in a production set has decreased drastically from 2008 (~100GBs produced) to 2014 (~20GBs produced). This strongly supports the notion that parties are becoming smarter about what they produce.
Produced Gigabytes - The average number of gigabytes contained in a production set.
Parties review fewer documents before production.
As parties increasingly leverage technology to reduce review set volumes, the variance between pages reviewed and pages produced drastically narrows.
Review Pages vs. Production Pages - The average number of pages in the review database compared to the average number of pages in the production set.
The December 2015 update includes new data from projects initiated in 2014. This interval of time allows for a more accurate view of the metrics, providing ample time for projects to progress from collection to production through the EDRM.
800.347.6105 | 952.937.5161 www.ediscovery.com
Copyright © 2015 Kroll Ontrack Inc. All Rights Reserved. Kroll Ontrack, Ontrack and other Kroll Ontrack brand and product names referred to herein are trademarks or registered trademarks of Kroll Ontrack Inc. and/or its parent company, Kroll Inc., in the United States and/or other countries. All other brand and product names are trademarks or registered trademarks of their respective owners. P1215