Briefing Notes: 5 Questions about Big Data for Attorneys and E-Discovery Professionals
I recently provided a briefing to a group of e-discovery professionals about Big Data and why it matters to them, and I thought there might be some value in sharing my notes.
1. What is Big Data?
- Gartner: Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision-making.
- McKinsey: ‘Big data’ refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. This definition is intentionally subjective . . .
- It is subjective, but has definable elements:
  - The data itself: large, unstructured information
  - The infrastructure: “Internet scale” in the enterprise
  - The analysis: Asking questions using very large data sets
2. Why Does Big Data Matter to E-Discovery Professionals?
- Data scientists and technologists do not understand the risk side of information
- You need to be at the table to educate them on:
  - The legal and business value of deleting information
  - The privacy requirements and implications
  - E-Discovery implications of too much data
- The technologies of Big Data may process and manipulate information in a way that affects its accessibility and evidentiary value – you need to be aware of this and guide your clients appropriately
3. Does Big Data offer value to the legal community?
- Performing sophisticated analysis on large pools of data is not exclusive to any particular industry – there is no reason it could not be applied to the legal community (and already is being used in some limited ways)
- Relatively speaking, most law firms do not generate massive amounts of data in their day-to-day operations
- In e-discovery, the technology innovations of Big Data could be helpful in very large cases to help with storage and processing tasks
4. What are some examples of Big Data in action?
- President Obama’s data-driven election campaign.
- An online travel company showing more expensive travel options to those who used higher-priced Macintosh computers to access its website.
- Tracking unreported side effects of drugs using search data (Journal of American Medical Association). Also Google Flu Trends: tracking the spread of the flu using search trends.
- NYPD Compstat.
- Fraud Detection: Targeting $3.5 trillion in fraud from banking, healthcare, utilities, and government.
- The City of New York finding those responsible for dumping cooking oil and grease into the sewers by analysing data from the Business Integrity Commission, a city agency that certifies that all local restaurants have a carting service to haul away their grease. With a few quick calculations, comparing restaurants that did not have a carter with geo-spatial data on the sewers, analysts generated a list of statistically likely suspects and tracked down dumpers with a 95% success rate.
5. What professional and career opportunities does Big Data represent for e-discovery professionals?
- Organizations need people who understand the risk side of the equation and who can provide practical guidance
- Your clients may have Big Data projects that right now, today, are creating unmonitored, unmitigated risk; you need to be able to help them identify and manage that risk
- Big Data focuses on unstructured information, i.e., the documents, email messages and other information that the e-discovery community knows well. These same skills and techniques can be very useful to business-led Big Data projects.