The transition into the information age has brought a massive proliferation of data, but fortunately there has also been rapid innovation in the tools used to analyze it. In the past, most large-scale data analysis solutions focused on metrics that are easily measured – the number of visitors to a webpage, what products visitors purchase, how many ‘likes’ certain posts get. However, focusing only on things that are easy to measure can mean missing the most important data.

New methodologies in data analysis seek to change that by tapping the potential of a much wider selection of data sources. One of these techniques is text analytics, also known as text mining: a way of transforming raw, unstructured text into structured data that can then be measured and analyzed scientifically.

Text analytics seeks to quantify sprawling masses of text, such as product reviews, customer service interactions, or comments on a product page, and turn them into measurable data, identifying the “who,” “what,” “when,” “where,” and “why,” as well as the emotional tone of conversations.

Tasks included in text analysis 

  • Categorizing information

  • Counting the number of times subjects are mentioned

  • Identifying the sentiments of text

  • Summarizing documents

  • Statistically analyzing blocks of text

  • Extracting concepts and themes

  • Drawing connections between different hyperlinked web pages

  • Identifying the relationships between entities in the text
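
To make a couple of these tasks concrete, here is a minimal sketch of counting subject mentions and scoring sentiment. It assumes a tiny hand-made word list rather than a trained model; the sample reviews, the word lists, and the function names are all invented for illustration and do not come from any particular product.

```python
import re
from collections import Counter

# Hypothetical sample data and word lists, invented for illustration only.
REVIEWS = [
    "The shampoo is great and my hair feels great in humidity.",
    "Terrible conditioner, my hair got worse.",
    "Good shampoo, but the bottle is bad.",
]
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"terrible", "worse", "bad"}
SUBJECTS = {"shampoo", "conditioner", "hair"}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def analyze(text):
    """Turn one piece of raw text into a structured record."""
    words = tokenize(text)
    # Count how often each subject of interest is mentioned.
    mentions = Counter(w for w in words if w in SUBJECTS)
    # Crude sentiment score: positive words minus negative words.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        sentiment = "positive"
    elif score < 0:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    return {"mentions": dict(mentions), "score": score, "sentiment": sentiment}


def total_mentions(texts):
    """Count subject mentions across a whole collection of texts."""
    totals = Counter()
    for text in texts:
        totals.update(w for w in tokenize(text) if w in SUBJECTS)
    return totals


if __name__ == "__main__":
    for review in REVIEWS:
        print(analyze(review))
    print(total_mentions(REVIEWS))
```

Real systems generally rely on much larger vocabularies and statistical models, but the principle is the same: free-form text goes in, and countable, comparable records come out.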

The importance of text analytics is highlighted by its use by major companies. Facebook recently released ‘Topic Data’, a system to anonymously analyze comments and posts about subjects relevant to specific products.

On the Topic Data page, Facebook gives the example of a company selling hair de-frizzing products that harvests data from users’ posts about how humidity affects their hair.

IBM also recently purchased AlchemyAPI to augment the analytics of its Watson platform, and Microsoft recently purchased the text analytics company Equivio. In addition, all email providers use text analytics in their anti-spam filters, and while these never seem to be perfect, their increasing rate of correctly identifying spam highlights the effectiveness of text analytics.

Other practical uses of text analytics

  • Identifying consumer attitude towards brands and products

  • Checking for plagiarism

  • Supporting the ‘electronic discovery’ process in legal investigations

  • Determining automatic advertisement placements

  • Monitoring online conversations for national security

  • Indexing large publication databases in academic and scientific fields

Thus, text analytics can be valuable for everyone from small businesses to multinational corporations.

Because text analytics can be a complicated field, companies can benefit from outside help in the form of a technology consultant with expertise in this area. A good technology consulting firm can advise on the most appropriate software and help organizations get the most value from its use. Since it is such a new and diverse field, we still do not know all the potential uses of text analytics, and businesses could be surprised by the innovative ways in which it could help them.

While it is difficult to say for certain, most estimates say that more than 80% of data is in the form of text. This suggests that there is enormous commercial potential in the field of text mining. While text mining was originally developed by intelligence agencies during the Second World War, it is only in recent years that the technology has truly begun to come into its own. And because of its complexity, it is a field with huge potential for growth as machines learn to read more and more like their human counterparts.

In the end, we can only guess at how effective the technique will become, but its potential is truly revolutionary, and we are already beginning to realize it through the many diverse uses of text analytics available today.