I was reading yet another article about leveraging unstructured text data, like the contact reports, notes and comments in our databases. For most of us in this profession, this is our equivalent of internally generated big data.
Boris Evelson, the author of this article, suggested that most organizations only utilize 35% of their available structured data to inform decision making and a mere 25% of unstructured data.
I have to wonder about that. How can I even start to get closer to those numbers?
His article is a succinct how-to for getting started with text analytics, and one of the first steps is to define the use cases for extracting intelligence. What in the world is a use case? It’s just a scenario that describes where to locate the data and the context of the information you’re looking to find. For every question you have, outline where the answer might be and what it might look like in that location. In the best case, you’d also identify what it wouldn’t look like, as in similar data that isn’t relevant to the question at hand.
For example, if you worked in a college or university setting and wanted to extract information from contact reports about constituents who participated in fraternities or sororities, that would be a description of a possible use case. The output of the search would be the constituent ID number and the name of the fraternity/sorority. Other possibilities could be constituents who have told you that they have a second home or vacation home, or constituents who have grandchildren. If your organization has a tendency to bury relevant lifestyle or affinity details in a big text blob, anything goes when you mine your text data.
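To make the fraternity/sorority use case a little more concrete, here is a minimal sketch in Python. It is only an illustration, not a method from the article: the sample contact reports, the constituent IDs, the `find_greek_orgs` function and the exclusion list are all made up for this example. It assumes that an organization name shows up in the notes as two or three capitalized Greek letter names in a row, and it treats Phi Beta Kappa (an academic honor society) as an example of what the answer wouldn't look like.

```python
import re

# Names of the Greek letters, used to build the search pattern.
GREEK_LETTERS = [
    "Alpha", "Beta", "Gamma", "Delta", "Epsilon", "Zeta", "Eta", "Theta",
    "Iota", "Kappa", "Lambda", "Mu", "Nu", "Xi", "Omicron", "Pi", "Rho",
    "Sigma", "Tau", "Upsilon", "Phi", "Chi", "Psi", "Omega",
]
letter = "(?:" + "|".join(GREEK_LETTERS) + ")"

# Assumption: an organization is written as two or three capitalized
# Greek letter names in a row, e.g. "Sigma Chi" or "Kappa Kappa Gamma".
ORG_PATTERN = re.compile(rf"\b{letter}(?:\s+{letter}){{1,2}}\b")

# What the answer would NOT look like: similar text that is not relevant.
# Illustrative only; a real list would be longer.
EXCLUDE = {"Phi Beta Kappa"}

# Hypothetical contact reports as (constituent ID, free-text note) pairs.
reports = [
    (1001, "Met at homecoming. She was in Kappa Kappa Gamma and still sees her sisters."),
    (1002, "Graduated Phi Beta Kappa; no Greek life mentioned."),
    (1003, "He pledged Sigma Chi sophomore year. Has a vacation home in Maine."),
]


def find_greek_orgs(reports):
    """Return sorted (constituent_id, organization) pairs found in the notes."""
    hits = set()
    for constituent_id, text in reports:
        for match in ORG_PATTERN.finditer(text):
            # Collapse any odd spacing inside the matched name.
            org = " ".join(match.group(0).split())
            if org not in EXCLUDE:
                hits.add((constituent_id, org))
    return sorted(hits)


for constituent_id, org in find_greek_orgs(reports):
    print(constituent_id, org)
```

With these sample notes, the search returns constituent 1001 with Kappa Kappa Gamma and constituent 1003 with Sigma Chi, and it skips the honor society mention for constituent 1002. Real contact reports are messier (abbreviations, nicknames, misspellings), so a pattern like this is a starting point for a use case rather than a finished search.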
There are all kinds of use cases that get very complicated, but why not start simple, right? To get familiar with the concepts and terminology, take a look at this article by clicking on the image above.