Category: Predictive Coding

Announcing The Information Governance Initiative

Three years ago, I sat down in a conference room in Washington, D.C., with some really smart people, and we quickly realized that we shared a vision for a consortium and think tank devoted to advancing Information Governance. Each of us had seen the incredible value that better information governance could create for our respective clients, but had also witnessed the consequences of information failure first-hand. Without a way for IG practitioners to share their experience across disciplines, it seemed unlikely that the promise of information governance would be fulfilled. Today, thanks to the support of like-minded individuals and organizations, this vision has been realized.

I am so pleased to announce the launch of the Information Governance Initiative (IGI), a cross-disciplinary consortium and think tank focused on advancing information governance. The IGI will publish research, benchmarking surveys, and guidance for practitioners on its website at www.IGInitiative.com. The research will be freely available, and the group will also be providing an online community designed to foster discussion and networking among practitioners.

I am founder and executive director, and it would be great if you would join us.

Our Mission

I believe information can be a positive transformative force in the world – improving business, government, and the lives of people in all walks of life. But I also believe that these benefits are not automatic, and in fact will only be the result of sustained, proactive efforts to understand and manage information in a better way. I believe that there is a need for like-minded people to come together and find this better way. A forum for ideas, facts, and techniques. An initiative that pushes the market forward and builds information literacy.

That’s why we created the Information Governance Initiative – and why we want you to be a part of it.

Who We Are

I founded the IGI along with Bennett B. Borden. I am the executive director, Bennett is the organization’s chair, and Jason R. Baron is co-chair. Jay Brudz is general counsel.

The IGI Advisory Board is composed of members drawn from the disciplines that own the facets of information governance, including information security, data science and analytics, e-discovery, business management, IT management, compliance, business intelligence, records management, finance and audit, privacy, and risk management. We are also developing a Corporate Council made up of practitioners working in IG. Contact us if you are interested in participating in the Corporate Council.

At launch, IGI Advisory Board members include Courtney Ingraffia Barton, senior counsel, global privacy at Hilton Worldwide, Inc.; Julie Colgan, president of ARMA International; Leigh Isaacs, VP of the information governance Peer Group at ILTA; and Richard Stiennon, chief research analyst at IT-Harvest and well-known cybersecurity expert. Additional board members are being added on an ongoing basis.

Our Supporters

The IGI is launching with broad support from leading providers of information governance products and services, including:

We are also partnering with a variety of organizations to bring IG stakeholders from different disciplines together to work on the information governance problem. For example, we have partnered with The CFO Alliance, a community of over 4,000 senior finance professionals, to bring the IG conversation to the finance community. ARMA International has appointed a representative to the IGI Advisory Board, and the two organizations plan on working together to advance the adoption of information governance. In addition, the IGI will be presenting several sessions on information governance at the Managing Electronic Records conference in Chicago, May 19-21, 2014.

Get Involved in the IGI

Members of the leadership team are speaking about information governance at nine different sessions during the LegalTech NY 2014 conference, February 4-6. If you are there, come see us and also visit our Charter Supporters in the exhibit halls.

Learn how you can get involved in the IGI at www.IGInitiative.com.

I also invite organizations interested in supporting the advancement of Information Governance to contact me at 646 450 4468 or barclay.blair@IGIniative.com.

2014 ABA Information Governance, E-Discovery and Digital Evidence Conference

Next week don’t miss the 2014 American Bar Association Information Governance, Electronic Discovery and Digital Evidence National Institute at Stetson’s Tampa Law Center in Tampa, Florida, on January 28-31, 2014. I spoke at this event last year, and was supposed to speak again this year, but had a conflict so I will unfortunately not be there. Unfortunate for me, at least. The attendees will probably be fine without me.

This is an event star-studded with e-discovery and information governance luminaries and judges. It is a casual setting, with lots of opportunities to chat with real decision-makers (i.e., judges) and experts who are mapping the future of information governance. Plus, Tampa is a pretty nice place to escape to this time of year. Unless you are from Tampa, in which case, well, you get to sleep in your own bed. And don’t forget to go to Bern’s (take the tour, it is worth it).

Click here for more information and to register.

Live Information Governance Trends Webinar On January 23rd, 2014

Trends Driving Information Governance Strategies in 2014

In 2013, many organizations successfully launched information governance initiatives, and saw positive progress from those efforts in attaining executive sponsorship, engaging key stakeholders, and executing pilot projects. As we enter 2014, new challenges emerge as organizations look for demonstrable business value amidst unrelenting challenges of information growth, regulatory compliance complexity, and legal discovery.

Join me and Robert A. Cruz as we assess these challenges and discuss what we can expect for Information Governance in 2014. The live webinar, presented by Proofpoint, is on January 23 at 11 AM PST / 2 PM EST.

Register for the webinar here. 


Please Support Our Information Governance Survey

Take the survey now.

We have worked with the eDJ Group in the past to survey the market about Information Governance attitudes and practices, and I am pleased to be working together on a new survey. This time we have an additional partner – ARMA International – which is excellent.

Our new survey asks some of the same questions we asked previously so that we can track year-over-year changes, but we are also digging into some new areas like big data and predictive coding. Please take a moment to complete the survey. We will be releasing the results publicly, and this kind of data is good for all of us as we try to move the information governance ball down the field (unlike the NY Giants this year –  what the heck?).

Check out the results of our previous surveys to get a flavor of the kind of insight that we expect to get from the survey.

Take the survey now.

Quiz Time: Get a Free Book

I have been doing some research into data remediation and I came across this interesting model that I think fits pretty well. But it is not from the information governance or even the IT world. The first person to tell me in the comments precisely where this model originates will get a copy of my latest book.

Remediation

Briefing Notes: 5 Questions about Big Data for Attorneys and E-Discovery Professionals

I recently provided a briefing to a group of e-discovery professionals about Big Data and why it matters to them, and I thought there might be some value in sharing my notes.

1. What is Big Data?

  • Gartner: Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision-making.

  • McKinsey: ‘Big data’ refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. This definition is intentionally subjective . . .

  • It is subjective, but has definable elements

    • The data itself: large, unstructured information

    • The infrastructure: “Internet scale” in the enterprise

    • The analysis: Asking questions using very large data sets

2. Why Does Big Data Matter to E-Discovery Professionals?

  • Data scientists and technologists do not understand the risk side of information

  • You need to be at the table to educate them on:

    • The legal and business value of deleting information

    • The privacy requirements and implications

    • E-Discovery implications of too much data

  • The technologies of Big Data may process and manipulate information in a way that affects its accessibility and evidentiary value – you need to be aware of this and guide your clients appropriately

3. Does Big Data offer value to the legal community?

  • Performing sophisticated analysis on large pools of data is not exclusive to any particular industry –  there is no reason it could not be applied to the legal community (and already is being used in some limited ways)

  • Relatively speaking, most law firms do not generate massive amounts of data in their day-to-day operations

  • In e-discovery, the technology innovations of Big Data could be helpful in very large cases to help with storage and processing tasks

4. What are some examples of Big Data in action?

  • President Obama’s data-driven election campaign.

  • An online travel company showing more expensive travel options to those who used higher-priced Macintosh computers to access its website.

  • Tracking unreported side effects of drugs using search data (Journal of American Medical Association). Also Google Flu Trends: tracking the spread of the flu using search trends.

  • NYPD Compstat.

  • Fraud Detection: Targeting $3.5 trillion in fraud from banking, healthcare, utilities, and government.

  • The City of New York finding those responsible for dumping cooking oil and grease into the sewers by analyzing data from the Business Integrity Commission, a city agency that certifies that all local restaurants have a carting service to haul away their grease. With a few quick calculations, comparing restaurants that did not have a carter with geo-spatial data on the sewers, the city generated a list of statistically likely suspects and tracked down dumpers with a 95% success rate.

5. What professional and career opportunities does Big Data represent for e-discovery professionals?

  • Organizations need people who understand the risk side of the equation and who can provide practical guidance

  • Your clients may have Big Data projects that right now, today, are creating unmonitored, unmitigated risk; you need to be able to help them identify and manage that risk

  • Big Data focuses on unstructured information, i.e., the documents, email messages and other information that the e-discovery community knows well. These same skills and techniques can be very useful to business-led Big Data projects.

Response to NARA’s Capstone Email Bulletin

On June 6, 2013, the US National Archives and Records Administration published a call for comments on its draft Bulletin regarding a proposed “Capstone” approach to email retention at federal agencies.  NARA was having technical problems with its comment system when I tried to submit my comments, so based on their instructions I have submitted my comments to them directly by email, and I am also posting them here. 

You can find the request for comment and the draft Bulletin on NARA’s website.

Feedback on NARA’s Capstone Email Records Management Bulletin

As requested, I am providing comments on the “Capstone” approach to email management outlined in the June 6, 2013 draft NARA Bulletin provided above. Thank you for the opportunity to provide input on this important issue.

I am the founder and principal of an information governance consulting firm based in New York. Since 2001 I have advised many organizations and government agencies on the development and implementation of email retention strategies.

Based on my experience and research, I believe that most organizations currently fall into one of two email records management camps.

The first camp does very little. While they may impose mailbox size limitations, they provide sparse guidance to employees who are forced to delete messages to meet these quotas. Consequently, business records are likely lost – especially if no storage space is allocated for retention of records that simply happen to reside in the email system.  Others allow – or turn a blind eye to – the practice of employees exporting email messages out of the corporate email system so they can be tucked away in shared drives, thumb drives, or taken home for “safekeeping.” This practice results in an effective loss of management control over records found in the email system, and can greatly increase collection costs and increase spoliation risk in e-discovery.

The second camp “manages” email, but treats all email messages equally, regardless of their content. Some – seeking to minimize the cost and potential risk of email – automatically purge all email older than 30, 60, or 90 days. In the absence of a method to capture email messages containing record content, records are surely lost – violating laws that require retention of specified records, regardless of their form. Others – perhaps inspired by SEC Rules 17a-3 and 17a-4 and the email archiving software industry that those Rules singlehandedly created – capture a copy of all messages sent and received and keep them in a separate archive for a fixed period of time. This approach ignores the reality that such an archive will undoubtedly contain both trivial content and critical business records. From a compliance perspective, this may be just fine if you are a broker-dealer subject to these unique, email-specific Rules, but is less fine if you are, like most of the business world, subject to retention rules that do not exempt email or treat it in a special way, but rather require identification and retention of business records regardless of the form they take.
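To make the blanket-purge problem concrete, here is a minimal Python sketch. The messages, dates, and the `is_record` flag are invented for illustration; in practice nobody has labeled the messages in advance, which is exactly the difficulty. The sketch only shows that an age-based rule never looks at content.

```python
from datetime import date, timedelta

def blanket_purge(messages, today, max_age_days):
    """Split messages into (kept, purged) using age alone, ignoring content."""
    cutoff = today - timedelta(days=max_age_days)
    kept = [m for m in messages if m["sent"] >= cutoff]
    purged = [m for m in messages if m["sent"] < cutoff]
    return kept, purged

# Hypothetical mailbox; "is_record" stands in for a judgment a reviewer would make.
messages = [
    {"subject": "Signed vendor contract", "sent": date(2013, 1, 15), "is_record": True},
    {"subject": "Lunch on Friday?", "sent": date(2013, 1, 16), "is_record": False},
    {"subject": "Meeting room change", "sent": date(2013, 6, 1), "is_record": False},
]

kept, purged = blanket_purge(messages, today=date(2013, 6, 10), max_age_days=90)

# Records that the age-only rule deleted along with the trivial messages.
records_lost = [m["subject"] for m in purged if m["is_record"]]
print(records_lost)
```

The archive-everything approach is the mirror image: set the purge window to the full archive period and the same function keeps the lunch invitation for exactly as long as the contract.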

There are of course other approaches to email retention, one of which is outlined in your draft Bulletin. As I understand it, Capstone is a role-based method that uses the role of the email creator/recipient as a predictor of the content of that user’s account. In the past I have advocated such an approach to clients as a pragmatic method for improving otherwise nascent email records management practices.

NARA should certainly be commended for embracing such pragmatism, and in recognizing that complex user classification systems are often impractical and lightly adopted.

However, I would like to share two additional ideas that may be helpful as NARA finalizes its guidance.

First, while a knowledge worker’s role can certainly be a predictor of an email message’s content, our research has shown us the limits of this approach. We have assessed role-based approaches at client organizations by analyzing actual email accounts sampled from a range of user roles. We have then estimated the percentage of email content that would require retention under the client’s own retention rules. Across a range of users we have found as little as 5% and as much as 95% record content. There is certainly some correlation between the percentage of record content and the role of the user, but it is not always categorical. For example, some users are mostly information processors, and thus may have an extremely high percentage of email records in their inboxes.

Consider, for example, a claims processor who receives a partially completed claims form attached to an email message, opens that form and completes it using information they possess, and then sends the completed form to an employee who represents the next link in the processing chain. This scenario is very common, even in large organizations. Assuming that these completed claim forms are records, and that they are not otherwise captured in a content management system, this user’s email account is quite important from a records management perspective.

However, a Capstone system based solely on seniority (i.e., “officials at or near the top of an agency,” as described in the Bulletin) may miss this important account and result in such records disappearing as “temporary” records. Conversely, senior officials may have a relatively low percentage of record content in their email system when they use other systems to communicate their decisions, document those decisions formally, or otherwise use other official or formal systems to complete their work. Capture and permanent retention of their entire account, then, would result in retention of largely trivial content.

These issues can in part be addressed by careful examination of the way email is used by each agency and its users, as mentioned in the Bulletin.
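The kind of assessment described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the role names, the sample counts (chosen to echo the 5% to 95% range mentioned earlier), and the 50% threshold are hypothetical, not figures from any client engagement or from the Bulletin.

```python
from collections import defaultdict

def record_share_by_role(samples):
    """samples: iterable of (role, is_record) pairs from a manual review of sampled email."""
    totals = defaultdict(int)
    records = defaultdict(int)
    for role, is_record in samples:
        totals[role] += 1
        records[role] += int(is_record)
    return {role: records[role] / totals[role] for role in totals}

def capstone_mismatches(shares, capstone_roles, threshold=0.5):
    """Compare measured record share with a seniority-based designation.

    Returns (missed, over_kept): roles with a high record share that are not
    designated, and designated roles with a low record share.
    """
    missed = sorted(r for r, s in shares.items() if s >= threshold and r not in capstone_roles)
    over_kept = sorted(r for r, s in shares.items() if s < threshold and r in capstone_roles)
    return missed, over_kept

# Invented review results: 20 sampled messages per role.
samples = (
    [("claims processor", True)] * 19 + [("claims processor", False)] * 1
    + [("agency head", True)] * 2 + [("agency head", False)] * 18
)

shares = record_share_by_role(samples)
missed, over_kept = capstone_mismatches(shares, capstone_roles={"agency head"})
print(shares)
print("High record share but not designated:", missed)
print("Designated but low record share:", over_kept)
```

A real assessment would need a defensible sample size per role and a consistent definition of "record" drawn from the agency's own schedule; the point of the sketch is only that the comparison itself is simple once the sample has been reviewed.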

Second, I wonder if NARA is turning away from a content-based approach to record identification and retention too soon – in fact, just at the time in history when technology to enable semi-automated, content-based approaches is becoming widely available. Our clients are currently evaluating and implementing technology from OpenText and Recommind (there are other providers in the market as well) that marries human and machine intelligence to remove the classification burden from the user. Such systems are by no means trivial to implement and configure, but I believe that they point the way forward for email records management. The effectiveness of automated statistical methods for content classification has been demonstrated in the intensely observed world of US civil litigation, a demonstration that I believe provides a foundation for its application to the records management problem.

Further, while the Capstone method would seem – as noted in your Memo – to foster compliance with the “OMB/NARA M-12-18 Managing Government Records Directive” requirement to “manage both permanent and temporary email records in an accessible electronic format,” I wonder to what extent it addresses the spirit of Section A3 of the Directive to “investigate and stimulate applied research in automated technology to reduce the burden of records management responsibilities”?
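For readers unfamiliar with what "statistical methods for content classification" means in practice, the following is a toy Python sketch. It uses a plain naive Bayes text classifier, chosen here only because it is short; it is not how OpenText, Recommind, or any other commercial product works, and the four training messages are invented. The idea it illustrates is the pairing of human and machine: a person labels a small sample, and the model applies that judgment to messages no one has reviewed.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train(labeled_messages):
    """labeled_messages: list of (text, label) pairs labeled by a human reviewer."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training messages
    for text, label in labeled_messages:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocabulary

def classify(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one smoothing so an unseen word/label pair does not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

# Invented training sample, labeled by hand.
model = train([
    ("completed claim form attached for approval", "record"),
    ("signed contract amendment attached", "record"),
    ("lunch on friday anyone", "non-record"),
    ("parking garage closed friday", "non-record"),
])

print(classify(model, "claim form attached"))
print(classify(model, "lunch friday"))
```

Production systems add much more (larger training sets, sampling to measure accuracy, review of uncertain items), which is why they are not trivial to implement, but the division of labor is the same.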

Once again, thank you for the opportunity to provide feedback on this important Bulletin. I am confident that NARA will continue to provide leadership as federal agencies continue this critical transition.