Customer Retention and Preventing Customer Churn with Algorithms

Customer churn, also known as customer attrition or customer drop-off, is the process by which an individual who was formerly a repeat or subscription user stops using a product or service, for any number of reasons. Customer retention is an important, even fundamental, aspect of Customer Relationship Management. With media subscription services and Software as a Service becoming increasingly common methods of delivering content and products, it is pivotal to quantify the customer churn rate and to scrutinise when customers drop off and why.

Our previous blog post broadly covered the concept of sentiment analysis and opinion mining. It identified the specific area of opinion that is typically the focus of such methods and began to introduce why this would be of interest to marketers. This post will go into more detail about how organisations can start thinking broadly about retaining customers using a range of customer attrition analysis models.

What is Customer Churn? 

Although the actual effect depends on the industry, it is practically a truism that retaining an existing customer costs much less than acquiring a new one. Studies have shown that, once all factors such as the costs of offers and marketing are counted in, acquiring a new customer can be between 5x and 25x more expensive than acting to retain an existing one.

Apart from the loss of recurring revenue, a churned customer can incur a number of other costs, such as deactivation costs. Additionally, there are several soft costs, such as the erosion of brand value by vocal dissatisfied customers, reduced cross-selling opportunities and a potential domino effect wherein other paying clients subsequently become more inclined to drop off.

The effects of not focusing on customer churn prediction and prevention are particularly acute in industries such as telecommunications, where there are many potential competitors, the barriers to switching provider can be minimal and a churning customer incurs infrastructural offboarding and onboarding costs. 

A basic form of managing customer churn is reactive, in which a customer is only engaged when they take the step of cancelling their subscription. The outcomes of this method are clearly inferior to those of earlier intervention, but it is nonetheless better than taking no action. Moving beyond this is the proactive approach, which seeks to anticipate and engage with customers before they take the deliberate step of dropping.

Although there is no single action that prevents churn once at-risk customers have been identified, it can often be as simple as reaching out with targeted communication or an offer or discount of some kind. The majority of the challenge is in accurate identification.

Customer attrition analysis and retention strategies prevent a business from becoming like a revolving door.

Types of Customer Attrition

Not every customer drops off in the same way or for the same reasons. Sometimes a customer is lost without any real intent to leave, in a situation where the company itself is more at fault. Generally, the explanations can be divided into the following:

Involuntary/Passive – The contract is terminated by the company itself. This can be due to a strategic product change or discontinuation, or to a customer being cut off for a violation of terms. Termination of an account after a long period of inactivity or failure to pay also falls under this category.

Incidental/Rotational – This is an active choice on the customer’s behalf to move to a competitor for reasons outside the company’s control. It can be due to an individual moving geographically or moving to a different financial bracket. Additionally, a company’s offerings can be left behind by technological progress or by a decline in an ancillary third-party product. These may follow trends based on larger social or economic factors, and savvy marketers can keep abreast of these developments and sometimes pre-empt them.

Deliberate/Active – In this case a customer is actively dissatisfied and decides to drop or switch to a competitor as a result. The reasons for this will be subjective and difficult for marketers to quantify.

There is a potentially significant commercial impact in preventing even a small percentage of customer attrition. According to one study, a 5% improvement in customer retention can lead to an increase in profit of between 25% and 85%.

It is clearly in any company’s interest to be making efforts at every one of the mentioned levels in order to prevent customer churn.  
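To get a feel for why small changes in churn matter, here is a minimal sketch of how a churn rate can be quantified and what it implies for how long a customer stays. It is not the calculation behind the study quoted above. The function names and the figures are made up for illustration, and the lifetime estimate assumes a constant monthly churn rate, in which case the average lifetime is simply one divided by that rate.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Share of the customers present at the start of a period who left during it."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def expected_lifetime_months(monthly_churn: float) -> float:
    """Average customer lifetime in months, assuming a constant monthly churn rate."""
    if not 0 < monthly_churn <= 1:
        raise ValueError("monthly_churn must be in (0, 1]")
    return 1 / monthly_churn


# Hypothetical figures, for illustration only.
before = churn_rate(customers_at_start=1000, customers_lost=50)
after = churn_rate(customers_at_start=1000, customers_lost=40)

for label, rate in (("before", before), ("after", after)):
    months = expected_lifetime_months(rate)
    print(f"{label}: monthly churn {rate:.1%}, expected lifetime {months:.0f} months")
```

Under these assumptions, trimming monthly churn by a single percentage point extends the average subscription by several months of recurring revenue.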

At the first, Involuntary, level, solid product strategy, robust support and customer road-mapping are required to reduce customer drop-off. At the Incidental level, relatively simple rule-based and machine learning models can be used to identify consistent patterns in customer behaviour which indicate that a customer is moving into a “likely to drop” state. The final, Deliberate, level is the most challenging, since it falls within the realm of sentiment, affect and subjective opinion. This is where opinion mining algorithms come into play and can be used to navigate the typically difficult terrain of unstructured data and customer sentiment in order to highlight customers in need of intervention.

Machine Learning for Customer Churn Prediction

As customers use a subscription-based service they generate large amounts of structured data and metadata. From demographic data to usage and transaction data, this information is relatively straightforward and efficient to collect and analyse, even in a privacy-conscious way. The idea underpinning this approach is that customers on the verge of dropping exhibit certain common “churn signals”. These can be identified in the structured data and metadata that they generate through the use of their account.
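As a rough illustration of what looking for churn signals in structured data can mean in practice, the sketch below applies a few hand-written rules to account activity records. Everything in it is an assumption made for the example: the field names, the three rules and their thresholds are hypothetical and would need to be tuned against real data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AccountActivity:
    """Hypothetical per-customer summary built from usage and billing data."""
    customer_id: str
    days_since_last_login: int
    sessions_last_30d: int
    sessions_prev_30d: int
    failed_payments_90d: int


def churn_signals(a: AccountActivity) -> List[str]:
    """Return the (illustrative) churn signals that this account triggers."""
    signals = []
    if a.days_since_last_login > 14:
        signals.append("inactive for over two weeks")
    if a.sessions_prev_30d > 0 and a.sessions_last_30d < 0.5 * a.sessions_prev_30d:
        signals.append("usage more than halved month on month")
    if a.failed_payments_90d >= 2:
        signals.append("repeated failed payments")
    return signals


accounts = [
    AccountActivity("A-001", 2, 22, 20, 0),
    AccountActivity("A-002", 21, 3, 18, 0),
    AccountActivity("A-003", 5, 9, 10, 2),
]

# Keep only the accounts that trigger at least one signal.
at_risk = {}
for account in accounts:
    signals = churn_signals(account)
    if signals:
        at_risk[account.customer_id] = signals

for customer_id, signals in at_risk.items():
    print(customer_id, "->", "; ".join(signals))
```

Rules like these are easy to explain to a retention team, which is one reason they are a common starting point before moving on to learned models.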

One paper estimates that only around 20% of useful business insights that can be acted on in preventing customer churn can be gained from this kind of structured data. The remaining 80% falls within the much more analytically challenging unstructured data realm of natural language. However, given the high impact of even the slightest gains in customer retention, companies are rightfully interested in getting as much insight as they can from this structured data. 

A comparison study of the different machine learning approaches that are typically deployed for customer attrition analysis claims that the best method is a boosted Support Vector Machine model, which had an accuracy score of 95% and an F-score of 84%. Conversely, the worst performing machine learning methods for the task were the Naïve Bayes and Logistic Regression models. We have covered the different types of machine learning algorithms in a previous blog post.
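Accuracy and F-score can diverge considerably on churn data, because churners are usually a small minority of the customer base. The sketch below computes both from a confusion matrix. The counts are invented for illustration and are not taken from the study; the function name is likewise just a convenient label.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F-score from confusion matrix counts.

    tp: churners correctly flagged      fp: loyal customers wrongly flagged
    fn: churners the model missed       tn: loyal customers correctly ignored
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Hypothetical test set: 1,000 customers, of whom 100 actually churned.
model = classification_metrics(tp=80, fp=10, fn=20, tn=890)

# A "model" that never predicts churn at all, for comparison.
never_churn = classification_metrics(tp=0, fp=0, fn=100, tn=900)

for name, metrics in (("model", model), ("never churn", never_churn)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

In this made-up case, a model that never flags anyone still reaches 90% accuracy while being useless for retention, which is why churn studies tend to report an F-score alongside accuracy.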

The research suggests that achieving the best results in machine learning churn prediction relies on using a large training dataset as well as implementing boosting algorithms, the use of which is associated with a substantial increase in F-score. The specific algorithm (or, more accurately, meta-algorithm) used in this case was AdaBoost, which refines the model over successive rounds. In each round it places more weight on the training examples that the previous rounds misclassified, so that later weak learners concentrate on the hard cases. When the weak learners are simple, such as single-feature decision stumps, the ensemble also ends up relying mostly on the features that have proved most relevant, which limits the computing resources spent on less relevant ones.
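To make the idea of boosting more concrete, here is a minimal from-scratch sketch of AdaBoost using one-feature decision stumps as weak learners. It is not the setup used in the study (which boosted Support Vector Machines), and the tiny dataset, its two features and the number of rounds are all invented for the example. After each round the customers that the latest stump got wrong are given more weight, and because each stump tests a single feature, the final ensemble leans on whichever features turn out to be most informative.

```python
import math


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] > thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best


def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] > thr else -polarity


def adaboost(X, y, rounds):
    """Train an ensemble of (alpha, stump) pairs; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start with equal weight on every customer
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # say of this stump in the vote
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified customers, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical data: [days since last login, support tickets this quarter].
X = [[2, 0], [5, 1], [3, 4], [30, 0], [25, 1], [4, 0], [6, 5], [1, 1]]
y = [-1, -1, 1, 1, 1, -1, 1, -1]  # +1 = churned, -1 = stayed

model = adaboost(X, y, rounds=3)
for alpha, (_, feature, thr, polarity) in model:
    print(f"feature {feature} > {thr} (polarity {polarity:+d}), weight {alpha:.2f}")
print("new customer [28, 0]:", predict(model, [28, 0]))
```

No single threshold on either feature separates the churners here, but a weighted vote of three stumps does, which is the essence of combining weak learners into a stronger one.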

One set of case studies highlights two companies, working with two different datasets, which successfully predicted churn using decision tree learning models. The first applied customer service logs, work order details and contractual details to pull a list of clients that were predicted to churn, with an accuracy level of 89%. The second parsed the activity logs of customers, using data such as number of searches, number of downloads and number of link follows.

Of the 20 clients that eventually ended up dropping, 16 were accurately predicted by the model.

Sentiment Analysis for Customer Churn Prediction 

Attempting to identify customers that drop deliberately and whose intent to do so cannot be predicted with structured data requires processing and analysis of the relevant unstructured data. There are two potential approaches to this, the first being semantic analysis of the customer text or call transcript and the other being voice tone analysis which aims to pick up on emotional cues in the customer’s voice during a call. 

Semantic Analysis

One analysis of this topic calls this space the Voice of the Customer (VOC) and defines it as “call centre calls, emails, questionnaires, web reviews and SMS”. It notes that the existing research on this topic is still quite sparse and experimental, but highlights cases in which unstructured call centre information applied to a churn prediction model led to an increase in its predictive power.

Interestingly, the actual study covered in the article fails to reproduce these effects on customer churn prediction when implementing a simple sentiment polarity score (positive, negative, neutral). However, when sentiment was graded on a scale, there was an increase in predictive power. Additionally, the study mentions that these results are possibly linked to the peculiarities of sentiment expression in Japanese culture, from which the dataset was derived. This emphasises the nuance that is involved in making predictions based on unstructured VOC data. At its core, any such project will ultimately be sociological.
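The difference between a polarity label and a graded score is easy to see with a toy example. The sketch below uses a tiny hand-made lexicon, which is purely hypothetical and far too small for real use, and is not the method used in the study. It only shows what information is lost when sentiment is collapsed to positive, negative or neutral.

```python
import re

# Hypothetical lexicon: word -> sentiment strength from -2 (strong negative)
# to +2 (strong positive). A real lexicon would be far larger.
LEXICON = {
    "great": 2, "love": 2, "good": 1, "fine": 1,
    "slow": -1, "confusing": -1,
    "terrible": -2, "useless": -2, "cancel": -2,
}


def graded_score(text: str) -> float:
    """Mean strength of the lexicon words found in the text (0.0 if none)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [LEXICON[token] for token in tokens if token in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def polarity(text: str) -> str:
    """Collapse the graded score to a three-way polarity label."""
    score = graded_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


mild = "The new dashboard is a bit slow"
severe = "Terrible support, this is useless and I want to cancel"

for message in (mild, severe):
    print(f"{polarity(message):>8}  {graded_score(message):+.1f}  {message}")
```

Both messages receive the same polarity label, yet only the second looks like a customer about to leave. A graded score keeps that distinction available to a churn model.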

Analysing Tweet Sentiment to Retain Customers

Opinion mining within this Voice of the Customer space is highly nuanced and fertile ground for all sorts of experimentation in determining what works and what doesn’t for attrition prediction models. A different study applied the most prominent sentiment analysis algorithms to Tweets, to see if any useful data could be obtained. The results were lacklustre, with an overall classification accuracy of 61%, emphasising that not every popular and publicly available space where customers share their opinions is necessarily useful for algorithmic sentiment analysis.

The challenges of applying sentiment analysis to natural language are listed by one article as the existence of “noise, sentimental drifts, vocabulary changes, idioms, irony, jargon abbreviations and domain specific terminology”. However, it also presents a range of recent research aimed at advancing this field, including data clarification through the use of emoticons and sarcasm detection algorithms. Tests of opinion extraction on numerous real-world datasets showed that the Maximum Entropy Classifier algorithm performed best in the task, while Regression Models were very much inadequate. Predictably, the bigger the dataset and the longer each text, the more accurate the analysis.
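For readers unfamiliar with the term, the sketch below shows roughly what a maximum entropy classifier looks like for two classes, where it takes the form of a logistic model over word-presence features trained by gradient ascent on the log-likelihood. This is a toy written for illustration and not the implementation evaluated in the article. The six training messages, the learning rate and the number of epochs are all assumptions.

```python
import math
import re


def tokenize(text):
    """Set of lowercase words in the text (word-presence features)."""
    return set(re.findall(r"[a-z']+", text.lower()))


def train_maxent(examples, epochs=300, lr=0.5):
    """Fit word weights and a bias by batch gradient ascent on the log-likelihood."""
    weights, bias = {}, 0.0
    n = len(examples)
    for _ in range(epochs):
        grad, grad_bias = {}, 0.0
        for text, label in examples:
            words = tokenize(text)
            z = bias + sum(weights.get(word, 0.0) for word in words)
            p = 1.0 / (1.0 + math.exp(-z))  # P(positive | text)
            error = label - p
            grad_bias += error
            for word in words:
                grad[word] = grad.get(word, 0.0) + error
        bias += lr * grad_bias / n
        for word, g in grad.items():
            weights[word] = weights.get(word, 0.0) + lr * g / n
    return weights, bias


def prob_positive(model, text):
    """Probability that the text is positive under the trained model."""
    weights, bias = model
    z = bias + sum(weights.get(word, 0.0) for word in tokenize(text))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical labelled messages: 1 = positive, 0 = negative.
TRAIN = [
    ("love this service", 1),
    ("great support thanks", 1),
    ("really happy with the app", 1),
    ("terrible service want to cancel", 0),
    ("app is slow and broken", 0),
    ("not happy very disappointed", 0),
]

model = train_maxent(TRAIN)
for message in ("love the great support", "slow and terrible"):
    print(f"{prob_positive(model, message):.2f}  {message}")
```

With so little data the model simply memorises which words appeared in which class, which echoes the point above that larger datasets and longer texts give more reliable results.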

Voice Tone Analysis

The other method of deriving sentiment insight from unstructured voice data is to analyse the intonation of the customer rather than the content of what they’re saying. However, even more so than semantic analysis, this is a very new field of inquiry, the potential of which is only beginning to be explored.

Newly developed tools are being deployed in call centres to identify sentiment based on vocal cues. The aim is to lessen the burden on staff of noticing when customers are displaying signs of agitation and frustration and therefore require increased attention and engagement to keep them from dropping.

However, the technology is double-edged, with one of its main applications being to monitor the performance of the workers taking the call. The aim is to ensure that the person taking the call is hitting the emotional cues deemed appropriate in keeping an agitated customer appeased. One examination of this technology in practice highlighted that although the company claims that implementing the technology increased customer satisfaction by 13%, workers who were receiving automated cues designed to help them prevent churn sometimes struggled to understand how to interpret them, especially the notification that implied a lack of “empathy” on their end. 

The development of this kind of capacity is dependent on utilising very large data sets. Another company in this space incorporated over 18 years of research and data on 60,000 recorded subjects, extracted, decoded and fed through machine learning models to generate potential real-time insights on any voice. 

As this blog post has emphasised, preventing customer churn is a vital part of customer retention and should be an important focus for any company dealing in subscription-based products, especially those concerned with reducing SaaS churn. Applying sentiment analysis to this problem is a relatively novel, but promising and constantly developing topic.