Here is the fourth video in our five-part series where I asked 30 Information Governance experts the same question, then produced a 5-minute video of their responses. As you watch the series, it is very interesting to see the common threads that weave through the answers, depending on the role and the type of organization the interviewee comes from.
In the third (3 minute) video of our Information Governance video series, we ask 30 IG experts, “What are the biggest benefits of Information Governance?”
Once again, I want to thank all of my interviewees, who did an amazing job answering these questions under pressure.
- Janet B. Heins – Director, Collaboration and IG, Biogen Idec (LinkedIn)
- Stephen Cohen – Records Manager, MetLife (LinkedIn)
- Randy Moeller – Global Records Management and Governance, Procter & Gamble (Twitter)
- Patrick Cunningham, CRM, FAI – Senior Director, Information Governance, Fortune 500 Electronics Manufacturer (Blog)
- Laurence Hart – CIO, AIIM International (Blog)
- Galina Datskovsky – Chair of the Board, ARMA International (LinkedIn)
- Darren Lee – VP Governance, Proofpoint (LinkedIn)
- Arlyce J. Vogel, CRM – Corporate Information Management Project Manager, Large Utility (LinkedIn)
- Robert Smallwood – Executive Director, E-Records Institute & Partner, IMERGE Consulting (LinkedIn)
- Tamir Sigal – VP of Marketing, RSD (Twitter)
- Bassam Zarkout – Chief Technology Officer, RSD (Twitter)
- Robert F. Williams – President, Cohasset Associates, Inc. (Website)
- Conni Christensen – Founding Partner, Synercon Management Consulting (Website)
- Chris Perram, MBA – Owner, Perram Corporation (Website)
- Amir Jaibaji – VP, Product Management, StoredIQ (LinkedIn)
- George Dunn – President, Cre8 Independent Consultants (LinkedIn)
- Tod Chernikoff, CRM (Twitter)
- James Morganstern – Business Development Executive, Integro (LinkedIn)
- Stephen Ludlow – Program Manager, E-Discovery and IG Solutions, OpenText (Twitter)
- Francis Lambert – CEO, Records Technologies (LinkedIn)
- John Montana – CEO, Montana and Associates (LinkedIn)
- Craig Rheinhardt – Director, ECM Product Strategy and Market Development, IBM (LinkedIn)
- Gordon Rapkin – CEO, Archive Systems (LinkedIn)
- Keith D. Davis, MBA, CRM – RIM Program Office, Fortune 15 Healthcare Company (LinkedIn)
- Tom Reding, CRM – Principal, Information Governance, EMC (Twitter)
- Jill Hearn – Principal Product Marketing Manager, EMC SourceOne (LinkedIn)
- Matt Hillery – CTO, Fontis International (Website)
- Gordon E.J. Hoke, CRM – Independent IG Consultant (Twitter)
- Eugene Stakhov – Senior Solution Architect, Lighthouse ECM Group, LLC (LinkedIn)
- Beth E. Chiaiese – Director of Loss Prevention, Foley & Lardner LLP (Website)
- Sue Trombley – Director Consulting, Iron Mountain (Blog)
A few weeks ago, I mentioned that I was working on a new feature article for Law Technology News about how making more and more data “easily accessible” is both essential for Big Data to fulfill its promise and a huge risk to privacy, intellectual property, and so on.
The promise of Big Data is based on a central assumption: that information will be easily, quickly, and cheaply available, on a grand scale. The plumbing of Big Data — the technology infrastructure — is designed to bring internet scale to enterprise data. Some of the surprising insights that data scientists hope to gain from Big Data analytics come from correlating information from disparate sources, in a context that was never imagined when the information was first created — such as correlating the type of computer used to book a trip with how much a traveler is willing to pay for a hotel room. Or using prescription drug history to screen health insurance applicants.
The problem of protecting privacy, intellectual property, and other rights will only grow more complex as our ability to access and process information becomes more sophisticated.
I also write about how these issues came to the forefront in the wake of the shooting tragedy at Sandy Hook Elementary School in Newtown, CT, and explore emerging technology that allows electronic content to “self-destruct.”
The article has now been published, and you can read it here (free registration required).
I was also interviewed about the article by Monica Bay, Editor-In-Chief of LTN, on Law Technology Now. You can listen to our discussion on the embedded podcast below.
Author: Barclay T. Blair
In late 2012, I was honored to provide a feature editorial for Law Technology News, a fine publication helmed by Monica Bay. You can read it online here (with free registration), or you can read it in full below.
Girding for Battle: A clash is brewing between Big Data and e-discovery
When was the last time you sat at your computer and deleted old files? Yesterday? Never? Don’t remember? Before today’s ubiquitous search engines, there was practical value in being a filer rather than a piler — it was difficult to find a document in a filing cabinet without an index.
Today’s sophisticated search engines obviate the need to manually index. Search technology is wonderful if we know what we are looking for, but is it an information management panacea? Information is growing at an astonishing rate, so much so that the numbers used to communicate growth projections are now so huge that they are almost meaningless.
Until recently, this unfettered growth was generally viewed as hazardous. It drives up storage costs, makes it difficult to find the wheat among the chaff, and increases electronic data discovery risk and cost, the argument goes. The resulting mantra: “We need to categorize it, control it, and clean it up!” Companies have spent decades paralyzed by a near inability to adapt modernist paper records management programs to decidedly postmodern information systems. Today, no part of the organization (including IT) exerts centralized command-and-control over data, and we have yet to find an easy replacement for the file clerk.

Enter Big Data, where uncontrollable information growth is no longer viewed as evil, or even a necessary evil. In the Big Data world, system administrators now treat bursting databases and file shares not as a shameful secret shared sotto voce in committee meetings, but as something to brag about. In Big Data, information has no downside. It is exalted in Davos, where the World Economic Forum recently “declared data a new class of economic asset, like currency or gold.” It has been profiled by The New York Times. Proponents call it “the new oil,” proclaiming that it presents the biggest opportunity since the dawn of the internet.
So why does Big Data matter to the legal community? Because it heralds a new battle, over a single question: Should we keep the information we create forever, or should we throw some of it away? The answer used to be simple: it was not feasible to keep everything. The cost was too high, the effort too great. Overburdened systems fail. Information overload reduces productivity. Data must be migrated from old to new systems, with great difficulty and expense.
The chance that you might have a smoking gun buried in the data creates too high a risk of liability. After all, if we learned one lesson from the seminal EDD cases metastasizing from the bankruptcies of Enron (Andersen v. U.S., 544 U.S. 696, 704 (2005)) and Sunbeam (Coleman (Parent) Holdings, Inc. v. Morgan Stanley & Co., Inc., No. 502003CA005045XXOCAI (Fla. Cir. Ct., March 1, 2005)), it is that data skeletons in the closet can be spooky.
But Big Data changes the calculus. The software used by Google and Yahoo to index the internet is open source, called Apache Hadoop. This brings internet scale and speed to just about any organization, and it can be run on cheap, off-the-shelf disk drives. Tools to analyze the data (some first commercialized in EDD) are accessible and powerful, promising profound new business and societal insights drawn from the vast pools of data. The fundamental promise of Big Data is that it enables insights into business (and the world) that were not possible before. Proponents see Big Data creating a better world, one fulfilling the promise of the internet itself.
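To make the Hadoop model concrete, here is a minimal Python sketch of the MapReduce pattern at its core: a word count in which a map step emits key-value pairs and a reduce step aggregates them. The function names and sample documents are illustrative; a real Hadoop job distributes these same two steps across a cluster.

```python
from collections import defaultdict

def map_phase(documents):
    """Emit (word, 1) pairs -- in Hadoop, this step runs in parallel across nodes."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    """Sum counts per key -- Hadoop shuffles pairs to reducers grouped by key."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["big data is big", "data has risk"]
print(reduce_phase(map_phase(docs)))  # e.g. {'big': 2, 'data': 2, ...}
```

The point of the pattern is that neither phase needs to see the whole data set at once, which is what lets commodity hardware scale to internet-sized corpora.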
But Big Data advocates downplay the downsides of data, and specifically, the EDD challenges. In the near-Nirvana contemplated by some Big Data proponents, all data is good and more data is better. In EDD, the opposite is usually true.
A recent study by the Pew Research Center about the future of Big Data was positive overall, but acknowledged concerns related to privacy, social control, misinformation, civil rights abuses, and the possibility of simply being overwhelmed by the deluge of data. Within legal, the burden of finding, processing, and producing Big Data in EDD is a foreign concept to most Big Data advocates. Perhaps this is because the Big Data enthusiasm cycle has not yet reached the “trough of disillusionment” where the hype faces the reality of corporate culture and complex legal and compliance requirements.
Records management doctrines specify that organizations should clearly define the business or legal purpose of a piece of information when created. That analysis determines whether, for how long, and in what form the data should be kept. Records retention schedules are intended to provide a measure of defensibility against spoliation claims, as they evince an intent to delete a record based on a proactive and standardized calculation of its value, rather than a reactive determination based on fears about bad evidence. Many organizations have attempted to play records management catch up in advance of pending litigation and have paid the price.
Big Data advocates argue that the economies of scale now make it feasible and desirable to capture and store information that currently has no clear or definable business value. Although large organizations have long collected and analyzed data (using business intelligence software), proponents argue that Big Data is different. They posit that cheaper storage and technical innovations make it easier and faster than ever before to analyze that data, eliminating the need to identify the business purpose of data before it is collected and retained.
With Big Data, no rigid “schema” or organizational approach is necessary before capturing content (unlike in a traditional database). Data professionals can now (or in the future) ask open-ended questions of the data, including questions that may not be germane now but may be critical in an unpredictable future.
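This “schema-on-read” idea can be illustrated with a small Python sketch: heterogeneous records are captured as-is, and structure is imposed only at query time. The records and the query are invented for illustration, echoing the hotel-booking example earlier in the article.

```python
import json

# Records captured with no predefined schema -- fields vary freely per record.
raw = [
    '{"user": "a1", "device": "tablet", "hotel_rate": 240}',
    '{"user": "b2", "device": "laptop"}',
    '{"user": "c3", "device": "tablet", "hotel_rate": 310}',
]

records = [json.loads(line) for line in raw]

# An ad hoc question asked long after capture: what do tablet users pay?
tablet_rates = [r["hotel_rate"] for r in records
                if r.get("device") == "tablet" and "hotel_rate" in r]
print(sum(tablet_rates) / len(tablet_rates))  # average rate for tablet bookings
```

Nothing about the question had to be anticipated when the data was written, which is precisely what records management doctrine assumes you will do.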
As a result, more data will be kept longer, in a manner that is unmoored from records management tenets. Without a doubt, this philosophy will complicate the governance and e-discovery of data.
So, when was the last time you sat down in front of your computer and deleted old files? In the world of Big Data, this is not only unnecessary, it’s undesirable. And it’s a waste of time.
Should we keep everything forever? Absolutely not. Too much information still has a downside. It is a liability, as well as an asset. Information has risk. Information has real, unavoidable legal and regulatory requirements. Information has a bite that Big Data proponents ignore at their peril.
But the good news: The same tools and infrastructure that empower the potentially profound insights of Big Data can and should be employed to help organizations make informed decisions about data retention. A vast amount of unstructured data in many organizations (over half, according to some studies) is duplicate, outdated, transitory junk that has no business value. Getting rid of this information en masse, without dragging every employee into the process, is now possible.
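One common way such tools find duplicates at scale is content hashing: files with identical bytes produce identical digests, so copies can be flagged without anyone reading a single document. A minimal sketch, with made-up file paths and contents:

```python
import hashlib
from collections import defaultdict

def find_duplicates(files):
    """Group files by the SHA-256 of their content; any group > 1 is duplicates."""
    by_digest = defaultdict(list)
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]

# Hypothetical file share: two copies of the same memo under different names.
share = {
    "q3/memo.docx": b"quarterly results memo",
    "archive/memo_final.docx": b"quarterly results memo",
    "q3/budget.xlsx": b"budget figures",
}
print(find_duplicates(share))  # one group containing both memo copies
```

Production deduplication tools add near-duplicate detection and chain-of-custody logging, but the underlying principle is this simple.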
E-discovery is the place where the cost of information management myopia becomes painfully visible, which is why EDD has consistently driven innovation in handling and understanding vast amounts of data. However, even with these innovations, the risk and cost of information in EDD is undeniable, and is correlated to the overall volume of information in the organization.
These are the contours of the coming battle between Big Data and e-discovery. It is a philosophical and cultural battle. It is the responsibility of EDD and information governance attorneys and practitioners to gird themselves for this battle. Learn about Big Data, and inform the discussion and decisions in your organization.
Reprinted with permission from Legal Technology News. Further duplication prohibited.
Common Big Data Use Cases
- Sentiment analysis. Analyzing sentiment on social media networks in order to improve marketing campaigns and customer service programs.
- Fraud detection. Analyzing transactions for patterns and events that may indicate fraud (familiar to anyone who has received a phone call from their credit card company when first using the card outside their home country).
- Retail pricing optimization. Setting the price of a product based on sophisticated analysis of purchasing patterns, customer demographics, and geographic demand variations.
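As a rough illustration of the first use case, lexicon-based sentiment scoring can be sketched in a few lines of Python. The word lists and posts are invented, and production systems use far richer statistical models, but the core idea is the same: count positive and negative signals per message.

```python
POSITIVE = {"love", "great", "excellent", "happy"}
NEGATIVE = {"hate", "awful", "broken", "angry"}

def sentiment(text):
    """Score a post: +1 for each positive word, -1 for each negative word."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

posts = ["I love the new app, great update", "This update is awful and broken"]
for p in posts:
    print(p, "->", sentiment(p))
```

Aggregated over millions of posts, even a crude score like this can reveal a trend in how customers feel about a product launch.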
Who should you talk to?
Big Data projects are likely being planned right now in your organization or your client’s organization. Here are some people and places to pay attention to:
- Marketing and customer service. A common real-world application of Big Data techniques is social media sentiment analysis. These programs are typically driven by marketing or customer service groups.
- IT: Information Security. The IT professionals responsible for information security may already be collecting and analyzing log files from the hundreds or thousands of devices across the company that generate them. This may not technically be a Big Data project yet, but find out what their plans are for correlation with other data sources that may give rise to privacy and other concerns.
- Data scientists and analysts. If your organization is currently hiring data scientists or analysts, there is a good chance that Big Data projects are ongoing. Find out who these people are and learn about their plans. Not only is their work typically very interesting, it may also have serious legal and regulatory implications related to retention, privacy, and e-discovery.
Chart: Drawing the Battle Lines
| Factor | Big Data | Information Governance, E-Discovery |
| --- | --- | --- |
| Primary motivation | Business value | Legal risk |
| Prevalent attitude towards information | More data is an opportunity | More data is expensive and risky |
| Information type focused on | Databases, moving towards unstructured information | Documents, email, and unstructured information, moving towards databases |
| Bleeding edge analysis | How much is a piece of data worth? | Is this a future smoking gun? |
| Biggest potential downside | Unintended consequences of analysis (e.g., civil rights violations); cost in litigation | Throwing away documents that in aggregate reveal valuable business insight |
Author: Barclay T. Blair
I recently completed a webinar about defensible deletion with Anthony Diana of Mayer Brown, Katey Wood of Enterprise Strategy Group, and Stephen Stewart of Nuix. We had a good discussion focused on the role of in-house counsel in supporting and driving efforts to get rid of unnecessary data. You can check out the recording here.
The blurb for the webinar is below:
For many years, organizations have kept all their data, often beyond the mandated retention period. But with data volumes growing to hundreds of terabytes – or even petabytes – this is no longer an option. The financial and time costs of maintaining storage systems for so much data are prohibitive. In addition, much of this data is unknown, posing significant business risks and adding to the time and expense of discovery or investigation exercises. Defensible deletion allows organizations to identify, categorize and manage all their data across multiple geographical locations, applications and storage and archive systems. With this knowledge, an organization can delete any data that has no business value or legal hold requirements. Deleting unneeded data allows organizations to reduce storage management costs, speed up discovery and investigations, switch off obsolete storage systems and tame the Big Data beast.
Join information governance thought leaders for a step-by-step guide to developing and implementing a defensible deletion program for your organization.
This session will discuss how you can:
- Make content-driven decisions to identify which data you can delete and which you must retain
- Create sound document retention, deletion and archiving policies
- Select a knowledgeable external counsel who can work with you to create and implement a defensible deletion process
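The content-driven decisions described above can be reduced to a simple rule: an item is deletable only when it is past its retention period and not subject to a legal hold. A toy sketch under assumed retention policies (the document types and periods here are invented for illustration):

```python
from datetime import date, timedelta

# Assumed retention schedule, in years per document type.
RETENTION_YEARS = {"invoice": 7, "email": 3, "draft": 1}

def is_deletable(doc_type, created, on_legal_hold, today=None):
    """Deletable only if past its retention period and not on a legal hold."""
    today = today or date.today()
    if on_legal_hold:
        return False  # legal holds always trump the retention schedule
    years = RETENTION_YEARS.get(doc_type, 10)  # conservative default
    keep_until = created + timedelta(days=365 * years)
    return today > keep_until

print(is_deletable("draft", date(2005, 1, 1), on_legal_hold=False,
                   today=date(2012, 1, 1)))   # past retention -> True
print(is_deletable("invoice", date(2005, 1, 1), on_legal_hold=True,
                   today=date(2012, 1, 1)))   # legal hold blocks deletion -> False
```

The hard part in practice is not the rule itself but classifying millions of items accurately enough to apply it defensibly.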
Author: Barclay T. Blair
I recently met the former CFO of Radian6, a social media monitoring company that was incubated and grown in New Brunswick, Canada, and we had a great discussion about the role that this kind of technology can play in developing, advancing, and protecting a brand. He is the former CFO only because the company was recently acquired by Salesforce – an illuminating acquisition.
For those of you new to this technology, the basic idea is that it allows organizations to monitor social media such as Twitter and Facebook for discussion of a company, its products, its marketing campaigns, and so on. Analytical tools and techniques help companies measure “sentiment” (i.e., do people like or hate your latest product?) and other trends that might be meaningful for sales, marketing, customer support, and other purposes.
I recently had a personal encounter with social media monitoring. Last week I did a speaking engagement at an OpenText user group meeting. It was hosted by McDonald’s Corp. at their executive training facility, called “Hamburger University.” No, Hamburger University isn’t just a street name or euphemism for the training center – that is its actual name (see photo below for proof).
Hamburger University is part of McDonald’s campus-style headquarters west of Chicago. I stayed “on campus” at a Hyatt connected by a bridge over an idyllic pond (“Lake Fred”) to the training facility. It’s a beautiful location – 80 acres of lakes and trees and prairie-style architecture.
It was a great event – I presented an “Introduction to Information Governance,” and there were some detailed case studies from OpenText customers like Hyatt about how they are using their digital asset management, social media management, and other products.
Anyway, the morning of the event, I tweeted the following:
Shortly thereafter, the Hyatt tweeted the following, and I replied:
And the McDonald’s Corp tweeted the following, and I replied:
This is social media monitoring software at work, which is cool to see. But there is something even more remarkable happening here, as my friend and enterprise software marketing guru Sean Wilcox (EVP at IT.com) pointed out:
“Such a great way to connect more deeply with customers, partners etc. No wonder Oracle and SFDC are buying companies that help them do this. However, note that, even with all the automation, an individual with a sense of humor created the spark that made this memorable. Praise to the person who replied about Hamburgler.”
This is a great insight: although the technology enabled the interaction, it was the human spark that made it memorable.
We are working on a number of social media governance projects, so this lesson is useful. In some ways, enterprise social media has painted itself into a corner. On the one hand, a key part of its value proposition is its supposed “authenticity,” i.e., its ability to enable “real conversations” among employees, partners, and customers. On the other hand, creating and maintaining authenticity takes a lot of time and effort – too much for many leaders, subject matter experts, and other folks with real jobs who might be tapped to represent your brand online. Clearly, any hope of sustaining a new kind of interaction between a brand and its market through social media cannot happen without automation that helps your people find the time to be focused, authentic, and memorable.