“Data growth is the biggest data center hardware infrastructure challenge for large enterprises, according to a new survey by research firm Gartner Inc.”
Data growth remains IT’s biggest challenge, Gartner says, Computerworld, November 2, 2010
Building on the theme I started with last week’s PowerPoint slide (which has been viewed hundreds of times in a few days, thanks mostly to a posting on an Australian records management listserv [thanks, Andrew]), I thought I would share another slide that I use to tell the information governance story.
We have all seen the studies that attempt to quantify the amount of information on the planet. The first one that I was aware of (and used extensively in my first book, Information Nation) was the “How Much Information” study done at the University of California, Berkeley in 2000. The latest one I know of is an IDC study published in May 2010. I dig into the numbers behind this study in this week’s PowerPoint slide.
There are three “stacks” in this slide, each representing a different dimension of the study, with the relevance to information governance increasing from left to right. Let’s start with the first stack.
The first stack shows the expected overall growth of digital information on the planet between 2009 and 2020. The study projects that this will grow by 44 times, from 0.8 Zettabytes to 35 Zettabytes (1 Zettabyte = 1 trillion Gigabytes). Although I find the scale of these numbers impressive, and intellectually know that this is an incredible amount of information, the numbers are almost too big to be meaningful. Even attempts to analogize these numbers, like “a stack of books from here to the moon,” don’t really help me. Perhaps this could form the basis of a successful Zen Buddhist koan: “a story, dialogue, question, or statement, the meaning of which cannot be understood by rational thinking but may be accessible through intuition.” (Wikipedia) What is the sound of one hand clapping? How big is a Zettabyte?
However, moving to the middle stack of hard drives, we get to some numbers that mean something to me. According to the study, the number of individual files or “containers” of data will grow at a faster rate than the overall raw volume of data. In fact, it will grow 67 times in the same period, or roughly 50% more than the 44-fold growth in overall volume.
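For readers who want to check the arithmetic, here is a minimal Python sketch using only the figures quoted above. The annual rates it prints are my own back-of-the-envelope assumption: they treat growth as steady compounding over the 11 years from 2009 to 2020, which the study itself does not claim. The function and variable names are purely illustrative.

```python
# Figures quoted from the study as cited in this post.
START_YEAR, END_YEAR = 2009, 2020
VOLUME_START_ZB = 0.8      # Zettabytes in 2009
VOLUME_END_ZB = 35.0       # Zettabytes projected for 2020
FILE_GROWTH_FACTOR = 67.0  # growth in number of files/"containers"


def growth_factor(start, end):
    """How many times larger the end value is than the start value."""
    return end / start


def annual_rate(factor, years):
    """Implied yearly growth rate, ASSUMING steady compounding."""
    return factor ** (1.0 / years) - 1.0


years = END_YEAR - START_YEAR
volume_factor = growth_factor(VOLUME_START_ZB, VOLUME_END_ZB)

# How much faster the file count grows than the raw volume.
relative_excess = FILE_GROWTH_FACTOR / volume_factor - 1.0

print(f"Volume growth factor: {volume_factor:.2f}x")
print(f"File growth factor:   {FILE_GROWTH_FACTOR:.0f}x")
print(f"Files outpace volume by {relative_excess:.0%}")
print(f"Implied annual volume growth: {annual_rate(volume_factor, years):.1%}")
print(f"Implied annual file growth:   {annual_rate(FILE_GROWTH_FACTOR, years):.1%}")
```

The useful takeaway is the ratio between the two factors rather than the absolute sizes: the file count pulls ahead of raw volume by about half again over the period.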
Aha, now we are getting somewhere. The problem of unstructured data (or at least, “not well structured” data) is at the core of the information governance problem. All of my clients have the same three problems at or near the top of their problem list: 1) email, 2) unstructured files in shared drives and C drives, and 3) backup tapes. According to this study, these kinds of problems are going to get at least 67 times worse over the next decade. Now, in the fog of all this data growth, the information governance problem really starts to take shape.
The final stack, on the right, takes us even further in understanding how the information governance problem is growing faster than the problem of data volume itself. As we complete the transition from paper to digital, the kinds of data we are creating and the kind of management they require are changing. According to the study, the amount of data requiring some type of information governance (i.e., for “privacy, compliance, custodial protection, confidentiality, or absolute lock down” purposes) will nearly double by 2020. Moreover, the portion requiring the highest levels of information governance control will grow 100 times. Furthermore, when viewed in terms of files rather than absolute volume, over 90% of files will require some kind of information governance.
This is the heart of the information governance problem: not only is overall data volume growing at an astonishing rate, but the number of individual pieces of data we have to manage is growing at a faster rate, and the amount of data that we have to manage and control in a special way is growing even faster.
Email me if you would like the original PowerPoint file. (btblair at vialumina.com)