Early this year I was lucky enough (thanks to a great sponsor) to carve out some significant research and writing time to answer a complicated (and maybe even complex) set of questions: what does unstructured information really cost? How do we answer this question? Which kinds of costs should be included in the answer? Can we use this answer to drive desirable Information Governance behaviors?
I looked at existing models for structured data, studied the emerging Big Data market, talked to clients and experts, and developed some answers to these questions that I think are actually pretty novel. You can download the entire paper here now (at the website of Nuix, the sponsor), and you can also follow along here as I discuss some of the key ideas and findings over the next few weeks.
A PowerPoint slide (with notes) is available for download here: IG PowerPoint Slide of the Day from Barclay T Blair-10 Factors Driving Unstructured Information Cost. If you do use it, I would appreciate you letting me know how and where.
Unstructured information is ubiquitous. It is typically not the product of a single-purpose business application. It often has no clearly defined owner. It is endlessly duplicated and transmitted across the organization. Determining where and how unstructured information generates cost is difficult.
However, it is possible. Our research shows that there are at least ten key factors that drive the total cost of owning unstructured information. These ten factors identify where organizations typically spend money throughout the lifecycle of managing unstructured information. These factors are listed in Figure 1, along with examples of elements that typically increase cost (“Cost Drivers,” on the left-hand side) and elements that typically reduce cost (“Cost Reducers,” on the right-hand side).
- E-Discovery. Finding, processing, and producing information to support lawsuits, investigations and audits. Unstructured information is typically the most common target in e-discovery, and a poorly managed information environment can add millions of dollars in cost to large lawsuits. Simply reviewing a gigabyte of information for litigation can cost $14,000.[i]
- Disposition. Getting rid of information that no longer has value to the business because it is duplicate or out of date. In poorly managed information environments, just “separating the wheat from the chaff” can cost large organizations millions of dollars. For enterprises with frequent litigation, the possibility of throwing away the wrong piece of information only increases risk and cost. Better management and smart information governance tools drive costs down.
- Classification and Organization. Keeping unstructured information organized so that employees can use it. Organization is also necessary so that management rules supporting privacy, privilege, confidentiality, retention, and other requirements can be applied.
- Digitization and Automation. Many business processes continue to be a combination of digital, automated steps and paper-based, manual steps. Automating and digitizing these processes requires investment, but also can drive significant returns. For example, studies have shown that automating Accounts Payable “can reduce invoice processing costs by 90 percent.”[ii]
- Storage and Network Infrastructure. The cost of the devices, networks, software, and labor required to store unstructured information. Although the cost of the baseline commodity (i.e., a gigabyte of storage space) continues to fall, for most organizations overall volume growth and complexity mean that storage budgets go up each year. For example, between 2000 and 2010, organizations more than doubled the amount they spent on storage-related software even though the cost of raw hard drive space dropped by a factor of almost 100.[iii]
- Information Search, Access, and Collaboration. The cost of hardware, software, and services designed to ensure that information is available to those who need it, when they need it. This typically includes enterprise content management systems, enterprise search, case management, and the infrastructure necessary to support employee access and use of these systems.
- Migration. The cost of moving unstructured information from outdated systems to current systems. In poorly managed information environments, the cost of migration can be very high – so high that some organizations maintain legacy systems long after they are no longer supported by the vendor just to avoid (more likely, to simply defer) the migration cost and complexity.
- Policy Management and Compliance. The cost of developing, implementing, enforcing, and maintaining information governance policies on unstructured information. Good policies, consistently enforced, will drive down the total cost of owning unstructured information.
- Discovering and Structuring Business Processes. The cost of identifying, improving, and routinizing business processes that are currently ad hoc and disorganized. Typical examples include contract management and accounts receivable as well as revenue-related activities such as sales and customer support. Moving from informal, email- and document-based processes to fixed workflows drives down cost.
- Knowledge Capture and Transfer. The cost of capturing critical business knowledge held at the department and employee level and putting that information in a form that enables other employees and parts of the organization to benefit from it. Examples include intranets and their more contemporary cousins such as wikis, blogs, and enterprise social media platforms.
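For readers who want to start tallying these factors for their own organization, here is a minimal Python sketch of how the ten factors above could be added up. It is an illustration, not the model from the paper: the names `CostFactor`, `review_cost`, and `total_cost` are hypothetical, and treating the $14,000 review figure as a flat per-gigabyte rate is a simplifying assumption.

```python
from enum import Enum
from typing import Dict


class CostFactor(Enum):
    """The ten cost factors listed above (hypothetical identifiers)."""
    E_DISCOVERY = "E-Discovery"
    DISPOSITION = "Disposition"
    CLASSIFICATION = "Classification and Organization"
    DIGITIZATION = "Digitization and Automation"
    STORAGE = "Storage and Network Infrastructure"
    SEARCH_ACCESS = "Information Search, Access, and Collaboration"
    MIGRATION = "Migration"
    POLICY = "Policy Management and Compliance"
    PROCESS = "Discovering and Structuring Business Processes"
    KNOWLEDGE = "Knowledge Capture and Transfer"


# Litigation review cost per gigabyte, from the RAND figure cited above.
# Assumption: used here as a flat rate; actual costs vary by matter.
REVIEW_COST_PER_GB = 14_000


def review_cost(gigabytes: float) -> float:
    """Estimate the cost of reviewing a volume of data for litigation."""
    return gigabytes * REVIEW_COST_PER_GB


def total_cost(costs: Dict[CostFactor, float]) -> float:
    """Sum per-factor estimates, refusing to run if any factor is missing."""
    missing = [f.value for f in CostFactor if f not in costs]
    if missing:
        raise ValueError("No estimate for: " + ", ".join(missing))
    return sum(costs.values())
```

Under the flat-rate assumption, `review_cost(10)` works out to $140,000 for ten gigabytes. The `total_cost` function insists on an estimate for every factor so that a forgotten category (migration is a common one) shows up as an error rather than as a silently low total.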
[i] Nicholas M. Pace, Laura Zakaras, “Where the Money Goes: Understanding Litigant Expenditures for Producing Electronic Discovery,” RAND Institute for Civil Justice, 2012. Online at http://www.rand.org/content/dam/rand/pubs/monographs/2012/RAND_MG1208.pdf
[ii] “A Detailed Guide to Imaging and Workflow ROI,” The Accounts Payable Network, 2010.