Behavioral segmentation is the practice of segmenting your customers based on what they do. I wrote previously about the basics of behavioral segmentation. During the last year, I’ve been putting it into practice through a customer segmentation project with a CASC business partner. Today I’d like to share with you some of the lessons I’ve learned through that experience.
Not Just an Analytics Exercise
Behavioral segmentation has the advantage of revealing segments of similarly behaved consumers that you may not have thought of previously. It undoubtedly requires a lot of data crunching. However, it is important to remember that behavioral segmentation is not purely an exercise in numbers. If data crunching is all that you do, you risk creating segments that are either based on spurious relationships or not very actionable from a target marketing perspective.
Successful behavioral segmentation should be a collaborative exercise between your analytics team and those who have a good grasp of your business and your customers. The latter are most likely found in your marketing or sales department. The process should be iterative, going back and forth between analytics and marketing. The analytics team should start by learning from the marketing team the purpose of the segmentation exercise, the behaviors that have been observed, and the capability of marketing to act on segmentation insights. With this information in hand, the analytics team can produce an initial set of behavioral segments from customer data.
This initial segmentation scheme should be presented to the marketing team, both to make sense of the results and to see if meaningful actions can be taken to target each segment. The input from the marketing team is then fed back into the next round of data crunching to adjust the segmentation focus and approach. This process is repeated until both sides are satisfied that they have a meaningful set of segments that can be implemented in practice.
Data Quality is Really Important
The former CEO of Reddit, Ellen Pao, caused a stir in January when she stated that web analytics is fake. Although her statement may seem radical, the underlying message was actually quite reasonable. Her point was that user metrics in web analytics are often inaccurate and are therefore of limited value. This garbage-in, garbage-out principle applies to all data analytics situations, including behavioral segmentation.
Before you dive into behavioral segmentation, first verify the quality and accuracy of your customer data. Are numbers free of errors? Do you see duplicate records? Are there significant omissions in a systematic pattern? Simple summary statistics are often insufficient in spotting these issues. You won’t see the problems until you really dig deep into the data. I have no doubt that there are a few businesses out there that may maintain squeaky clean data. But in my experience, data cleaning and quality verification are usually the most time-consuming part of an analytics project. So, although discovering new customer segments is exciting, curb your enthusiasm and check your data first.
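As a rough illustration of the checks above, here is a minimal sketch in plain Python. The field names (`customer_id`, `channel`, `order_size`), the toy records, and both helper functions are hypothetical and exist only for this example. The sketch counts duplicate customer IDs, measures the overall share of missing values, and then breaks the missing share down by group, which is one simple way to see whether omissions follow a systematic pattern.

```python
from collections import Counter


def audit_records(records, key, fields):
    """Return duplicated key values and the share of missing values per field."""
    key_counts = Counter(r[key] for r in records)
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)
    missing_share = {
        f: sum(1 for r in records if r.get(f) in (None, "")) / len(records)
        for f in fields
    }
    return duplicates, missing_share


def missing_share_by_group(records, group_field, field):
    """Share of records with a missing `field` inside each group."""
    totals, missing = Counter(), Counter()
    for r in records:
        group = r[group_field]
        totals[group] += 1
        if r.get(field) in (None, ""):
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


# Hypothetical toy data: one duplicated customer, and order sizes that are
# missing only for store purchases.
customers = [
    {"customer_id": 1, "channel": "web", "order_size": 42.0},
    {"customer_id": 2, "channel": "web", "order_size": 18.5},
    {"customer_id": 2, "channel": "web", "order_size": 18.5},
    {"customer_id": 3, "channel": "store", "order_size": None},
    {"customer_id": 4, "channel": "store", "order_size": None},
    {"customer_id": 5, "channel": "store", "order_size": 7.25},
]

duplicates, missing_share = audit_records(customers, "customer_id", ["order_size"])
by_channel = missing_share_by_group(customers, "channel", "order_size")

print("Duplicated customer IDs:", duplicates)
print("Overall missing share:", missing_share)
print("Missing share by channel:", by_channel)
```

In this toy data the overall missing share looks moderate, but the breakdown shows that every gap sits in one channel. That is the kind of systematic omission a single summary statistic can hide.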
Determine What to Include
In most situations, there are many different ways of describing customers’ behaviors. In a retail setting, for example, the descriptors may include purchase frequency, order size, basket composition, brand choices, coupon usage, and time of purchase, to name just a few. Demographic and psychographic profiles may also be mixed in with pure behavioral data. The results of a segmentation analysis can be vastly different depending on which of these inputs you feed into your analysis.
Which ones should you include? Some might answer that we should just include all available information. But this is not always wise or even feasible. Depending on the size of the data (both the number of variables and the number of observations), the computational cost may be too high to justify the inclusion of all possible information. Including unimportant information can also add noise that obscures the segments you are trying to find.
Here are a few considerations I would like to offer:
- Select variables that show meaningful variation. If your customers all behave very similarly on something, say, purchase frequency, including that information is not going to help you identify different pockets of customers.
- Try different combinations of variables and observe their contribution to the discovered clusters. The ones that do not help much in setting the different segments apart from each other can be dropped. The resulting sizes of the segments can also offer a clue as to the desirability of the solution.
- Choose information whose quality you can trust. This follows directly from the earlier discussion of data quality.
- Pay attention to the scale of the variables. Some clustering algorithms, such as k-means clustering, are sensitive to the scaling of the variables. If you have a variable (e.g., income) that ranges from $10,000 to over $1 million, it will overshadow a variable on a much smaller scale (e.g., order size) that ranges from a few dollars to a few hundred dollars. If you are using a scale-sensitive algorithm, be sure to transform your variables, for example by standardizing them, so that their scales are more comparable with each other.
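Two of the considerations above, screening out variables with little variation and putting the rest on comparable scales, can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the toy values, the variable names, and the `MIN_CV` cutoff on the coefficient of variation. Z-score standardization is only one of several reasonable transformations; a log transform may suit a heavily skewed variable such as income better.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation relative to the mean: a unit-free measure of spread."""
    m = mean(values)
    return pstdev(values) / abs(m) if m else float("inf")


def standardize(values):
    """Z-score transform: subtract the mean, divide by the standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical toy data, one list of values per variable.
customers = {
    "income": [25_000, 60_000, 120_000, 1_000_000],
    "order_size": [12.0, 45.0, 150.0, 300.0],
    "purchase_frequency": [4, 4, 4, 4],  # identical for everyone
}

MIN_CV = 0.05  # assumed cutoff; tune it for your own data

# Drop variables that barely vary across customers.
kept = {
    name: values
    for name, values in customers.items()
    if coefficient_of_variation(values) >= MIN_CV
}

# Rescale the remaining variables so that none dominates a distance calculation.
scaled = {name: standardize(values) for name, values in kept.items()}

for name, values in scaled.items():
    print(name, [round(v, 2) for v in values])
```

After the transform, income and order size both have a mean of zero and a standard deviation of one, so a distance-based algorithm such as k-means weighs them equally instead of letting income dominate.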
Mutually Exclusive or Overlapping Segments?
Customer segments are often defined as mutually exclusive categories. That is, a consumer can belong to one segment only. This does not have to be the case. For your business, it may be perfectly reasonable or even helpful to allow one customer to belong to two or more segments. Overlapping segmentation schemes are more complex to understand and manage, however, and may require a higher level of organizational sophistication, especially in terms of customer messaging. That sophistication helps you avoid sending inconsistent messages to customers who sit in the overlap between segments.
Your choice of segmentation scheme will affect your choice of clustering algorithm. While k-means clustering and hierarchical clustering are great for creating mutually exclusive segments, methods that produce soft assignments, such as expectation-maximization (EM) clustering, are needed to allow overlapping segment memberships. More broadly, it is important to recognize that each clustering algorithm has its strengths and weaknesses. One size does not fit all in this case. Your choice of methods should be driven by your objectives and the nature of your data.
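To make the difference between exclusive and overlapping memberships concrete, here is a minimal sketch of EM clustering for a two-segment, one-variable Gaussian mixture, written in plain Python. The data (monthly order counts), the function names, the crude initialization, and the fixed iteration count are all assumptions for illustration. In practice you would more likely use a library implementation, such as the `GaussianMixture` class in scikit-learn, on several variables at once.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def memberships(xs, weights, mus, sigmas):
    """E-step: probability that each customer belongs to each segment."""
    rows = []
    for x in xs:
        dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
        total = sum(dens)
        rows.append([d / total for d in dens])
    return rows


def fit_mixture(xs, iterations=50):
    """EM for a two-component, one-dimensional Gaussian mixture."""
    lo, hi = min(xs), max(xs)
    mus = [lo, hi]                    # crude start: means at the extremes
    sigmas = [(hi - lo) / 4] * 2
    weights = [0.5, 0.5]
    for _ in range(iterations):
        resp = memberships(xs, weights, mus, sigmas)
        # M-step: re-estimate each segment from its weighted members.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(xs)
            mus[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            variance = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigmas[k] = max(math.sqrt(variance), 1e-3)  # guard against collapse
    return weights, mus, sigmas


# Hypothetical data: number of orders per month for 14 customers.
monthly_orders = [1, 2, 2, 3, 3, 4, 5, 6, 7, 8, 8, 9, 9, 10]

weights, mus, sigmas = fit_mixture(monthly_orders)
soft = memberships(monthly_orders, weights, mus, sigmas)

# A hard, mutually exclusive assignment just keeps the most likely segment.
hard = [row.index(max(row)) for row in soft]

for x, row, label in zip(monthly_orders, soft, hard):
    print(f"orders={x:2d}  P(low)={row[0]:.2f}  P(high)={row[1]:.2f}  hard={label}")
```

Each customer gets a probability of belonging to each segment rather than a single label. Whether you collapse those probabilities into one label or let customers in the middle count toward both segments is a business decision, not a statistical one.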
Data science is context-bound. Truly useful data analytics, including behavioral segmentation, requires a combination of data skills and business acumen. The issues discussed here are by no means exhaustive. But they should serve as a good starting point for making sure your segmentation exercise yields valid and useful results.
Find the information helpful? Please share it with others. Thank you for reading!