Rundown on text analytics

Text analytics turns text into data using applications and algorithms, then analyses that data with natural language processing and statistical methods. Text data mining (or text mining), by comparison, is the process of deriving high-quality information from text, and it has matured to incorporate algorithms from natural language processing and machine learning. You can think of text mining as a subset of text analytics.

SAS Enterprise Miner and SAS Text Miner are two of the products available that help businesses perform text analytics and get meaning from their information, and they are the software we will be taking a closer look at.

The ability to pick out meaning from unsorted text, and turn it into an understanding of trends and relevant phrases in social media, is crucial for informing decision making in marketing and social media management.

First taste of analysis: A Reflection

My first experience in industry with text analysis was a small demonstration involving sentiment analysis, which categorises a document as positive, negative, or neutral based on the comments made in the text. Going through the sample text manually to find these types of comments gave me a clearer picture of what text analytics aims to achieve and what it can be used for. Sentiment analysis is one of the many functions available in SAS Text Miner, which can perform a much faster and more accurate sentiment analysis than my manual attempt, as well as process far more text at once.
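To make the idea of sorting a document into positive, negative, or neutral a little more concrete, here is a minimal sketch in Python. It is not how SAS Text Miner works internally; it simply counts words from two tiny, made-up word lists, which is roughly what I was doing by hand. The word lists and the function name classify_sentiment are my own assumptions for illustration.

```python
import re

# Hypothetical, very small word lists for illustration only.
# Real sentiment lexicons are far larger and more nuanced.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "terrible", "awful"}


def classify_sentiment(document: str) -> str:
    """Label a document by comparing counts of positive and negative words."""
    # Lower-case the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", document.lower())
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    score = positive_hits - negative_hits
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(classify_sentiment("I love this product, it is great"))
    print(classify_sentiment("The parcel arrived on Tuesday"))
```

A word count like this misses negation ("not good") and sarcasm, which is part of why dedicated tools are worth learning.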

Thoughts on SAS Text Miner

Interface-wise, SAS Text Miner doesn’t look like the most intuitive software to figure out and use. Beyond this interface difficulty, however, the software can achieve quite a few things. My first test of SAS was simple and mainly involved importing new data sources and updating libraries. We then played around with our sample data in a text mining training exercise, using text parsing and text filters to see what turned up! It looked quite a bit like the screenshot below.

A screenshot from SAS Text Miner
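For anyone without access to SAS, the general idea behind text parsing and text filtering can be sketched in a few lines of Python. This is only a rough analogy, not a reproduction of the SAS nodes: parsing here means splitting text into terms, and filtering means dropping common or very short terms. The stop word list, the min_length parameter, and the function names are assumptions of mine.

```python
import re
from collections import Counter

# Hypothetical stop word list for illustration; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "to", "of", "in", "it"}


def parse_terms(document):
    """Parsing step: split a document into lower-case alphabetic terms."""
    return re.findall(r"[a-z]+", document.lower())


def filter_terms(terms, min_length=3):
    """Filtering step: drop stop words and terms shorter than min_length."""
    return [t for t in terms if t not in STOP_WORDS and len(t) >= min_length]


def term_counts(documents):
    """Count how often each surviving term appears across all documents."""
    counts = Counter()
    for document in documents:
        counts.update(filter_terms(parse_terms(document)))
    return counts


if __name__ == "__main__":
    sample = ["The game was great", "A great game in the dark"]
    for term, count in term_counts(sample).most_common():
        print(term, count)
```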

However, it was quite clear from this experience that we had barely scratched the surface of what SAS could do, and of the different kinds of things that could be found in, and done with, content such as social media posts.

Uses of text analytics today – a real example

Text mining is also widely used by businesses to analyse text retrieved from social media platforms such as Twitter. It lets businesses see trends and insights that they might not have picked up without it.

One example of text mining in use was SAS going through unstructured Twitter data from the 2013 Super Bowl, a game between the Baltimore Ravens and the San Francisco 49ers.

SAS collected approximately 4.85 million tweets covering the time before and during the game, which captured a broad spread of discussion. This is where SAS comes in handy, and the process of making meaning out of all this data was fascinating.

First off, the Twitter data was cleaned by removing irrelevant tweets, identifying abbreviations and misspellings (such as different names for the teams involved), and focusing the analysis. This procedure helped to eliminate noise and discover emerging topics. One topic that this analysis focused on was the Super Bowl blackout and the tweets that went viral.
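The cleaning steps described above can be sketched roughly as follows. This is my own simplified Python illustration, not the procedure SAS actually ran: the alias table, the list of relevant terms, and the function names are all hypothetical, since the original analysis does not list which name variants or keywords were used.

```python
import re

# Hypothetical alias table mapping team name variants to one canonical name.
TEAM_ALIASES = {
    "niners": "49ers",
    "sf49ers": "49ers",
    "ravensnation": "ravens",
}

# Hypothetical keywords used to decide whether a tweet is relevant.
RELEVANT_TERMS = {"49ers", "ravens", "superbowl", "blackout"}


def normalise(tweet):
    """Lower-case a tweet, split it into tokens, and replace known aliases."""
    tokens = re.findall(r"[a-z0-9]+", tweet.lower())
    return [TEAM_ALIASES.get(token, token) for token in tokens]


def clean_tweets(tweets):
    """Keep only tweets that mention at least one relevant term."""
    cleaned = []
    for tweet in tweets:
        tokens = normalise(tweet)
        if RELEVANT_TERMS.intersection(tokens):
            cleaned.append(tokens)
    return cleaned


if __name__ == "__main__":
    sample = ["Go #Niners!!", "What is for dinner?", "Lights out at the SuperBowl"]
    for tokens in clean_tweets(sample):
        print(tokens)
```

Note that this sketch only matches single tokens, so "Super Bowl" written as two words would slip through; a real pipeline would need phrase handling and spelling correction as well.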

A network diagram with the blue dots representing influential Twitter users/authors, and orange showing topics

This analysis revealed several behaviours among Twitter users. A wide variety of topics was discussed (as seen in the picture above), with some relevant to the game and others, such as Obama, not. What was also interesting was that some companies took the blackout as an opportunity to market themselves. This helped them to gain retweets, spread the word about their brand, and get advertising coverage during the Super Bowl that they would not normally be able to get, due to the high costs of conventional advertising in that time slot.

The Tweets that took advantage of the blackout…

… and the impact of these strategic marketing Tweets as seen in SAS

Overall, this example of text analysis in use gives a deeper insight into the types of data that can be extracted, how it can be used in social media, and the different trends that it can pick up. Even something that may seem irrelevant, such as a blackout, produced positive marketing results for companies such as Oreo that successfully took advantage of the situation. Whilst learning to use software like SAS Text Miner to pull out information like that in the example above may not initially be the easiest task, in the long run this understanding can bring significant benefits to businesses.

What are your thoughts on the use of SAS to analyse text?