Inside RFM Segmentation Modeling by John Miglautsch

Many people have contacted me this month about the idea that it is possible to expand on RFM segmentation (Recency, Frequency and Monetary). We have lightly touched on RFM modeling and suggested that it is only a first step in building segmentation models. Perhaps it is best that we begin by elaborating on RFM, illustrate some of its shortcomings, then outline how we can expand the number of interesting analysis variables.

When I mentioned the increasing interest in RFM, one statistical friend of mine replied, “Isn’t that sort of rediscovering ’70s technology?” Though RFM has been around for decades, it has not been widely applied. We went from the booming ’70s through the growth ’80s and into the downsizing ’90s. Most of us were so busy trying to keep up with the explosion that we didn’t really worry about exactly how to fine-tune.

Basic RFM Scoring

To build a simple RFM score, take a few thousand customers (or more if you have them) and their orders and put them into dBASE. Next, query all the orders less than 3 months old. With a join, you can link the orders to customers and use CALC_AS to give them a 5. We typically give <3 mos. a 5, 3-6 mos. a 4, 6-12 mos. a 3, 12-24 mos. a 2 and >24 mos. a 1. (A little tip: if you start at the low end, you can be sure that you have all the customers properly scored even if they have several orders).
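The recency bands above can be sketched in a few lines of Python instead of dBASE queries. This is only an illustration, not the procedure we run: the function names are made up for the example, and the handling of exact boundaries (an order exactly 3 months old scores 4 here) is an assumption, since the bands as stated overlap at their edges. Taking each customer's most recent order gives the same result as the "start at the low end" tip, where later passes overwrite earlier scores.

```python
from datetime import date

# Recency bands from the text: (upper bound in months, exclusive) -> score.
# Treating each upper bound as exclusive is an assumption.
RECENCY_BANDS = [(3, 5), (6, 4), (12, 3), (24, 2)]

def months_between(earlier, later):
    """Whole months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the last month is not yet complete
    return months

def recency_score(last_order, as_of):
    """Score one customer from the date of the most recent order."""
    age = months_between(last_order, as_of)
    for upper, score in RECENCY_BANDS:
        if age < upper:
            return score
    return 1  # 24 months or older

def score_customers(orders, as_of):
    """orders: iterable of (customer_id, order_date) pairs.
    Returns {customer_id: recency score}, using each customer's latest order."""
    latest = {}
    for cust, order_date in orders:
        if cust not in latest or order_date > latest[cust]:
            latest[cust] = order_date
    return {cust: recency_score(d, as_of) for cust, d in latest.items()}
```

A customer with several orders is scored once, on the newest order, so an old order can never pull a recent buyer down into a lower cell.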

For frequency and monetary, it is best to first build fields with each customer’s life-to-date history. Query and Calc_sum the orders and dollars for each customer. We use an interesting scoring technique I learned from John Wirth, President of Woodworker’s Supply. We total all the orders and sort the customers by number of orders so our best customer comes first. We give that customer a 5 in frequency score. We then subtract that customer’s number of orders from the total and move on to the next customer. When the running total falls to 80% of the original, we start giving customers a 4, and so on down in steps of 20%. This gives us five breaks (actually six, because non-buyers get 0) in the number of orders, focusing our F and M on those customers who are most profitable. If modeling is based on uncovering the “80/20” rule, this method forces you to concentrate more on the top 20.
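As a rough sketch of this running-total technique, the Python below walks the customers from best to worst and steps the score down each time another 20% of the total has been used up. The function name is hypothetical, and two details are assumptions rather than part of the method as described: ties in order count are broken by customer ID, and a customer is scored on the total remaining before that customer's own orders are subtracted. The same function can score monetary if dollars are passed as whole cents.

```python
def cumulative_share_scores(totals, bands=5):
    """totals: {customer_id: life-to-date orders (or cents)}, integers.
    Returns {customer_id: score}; non-buyers score 0."""
    grand_total = sum(v for v in totals.values() if v > 0)
    scores = {}
    remaining = grand_total
    # Best customer first; ties broken by ID (an assumption).
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], str(kv[0])))
    for cust, value in ranked:
        if value <= 0:
            scores[cust] = 0  # non-buyers get 0
            continue
        # Integer ceiling of bands * remaining / grand_total:
        # more than 80% left -> 5, exactly 80% down to above 60% -> 4, etc.
        scores[cust] = -(-bands * remaining // grand_total)
        remaining -= value
    return scores
```

Because each score covers a fifth of the orders rather than a fifth of the customers, the top cells hold few names and the bottom cell holds many. Note that customers with the same order count can land on opposite sides of a break; whether to keep tied customers together is a design choice this sketch does not make.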

If you want to validate the impact of RFM on your file, use the above techniques but score the file using only orders placed up to six months ago. Once your file is scored, look at the orders placed since then by the customers in each cell. You will be looking at customers as they would have been selected if you had used RFM. Of course, you will find that those in the highest cells dramatically outperform the rest.
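The validation step can be sketched as a simple tabulation. The example assumes you already have a score for each customer built only from orders before the cutoff, plus a list of customer IDs who ordered after it; the function name and the input shapes are invented for the illustration.

```python
from collections import defaultdict

def response_by_cell(scores, holdout_buyers):
    """scores: {customer_id: cell}, built from orders BEFORE the cutoff.
    holdout_buyers: iterable of customer_ids with an order AFTER the cutoff
    (repeats allowed). Returns {cell: (customers, buyers, response_rate)}."""
    buyers = set(holdout_buyers)
    counts = defaultdict(lambda: [0, 0])  # cell -> [customers, buyers]
    for cust, cell in scores.items():
        counts[cell][0] += 1
        if cust in buyers:
            counts[cell][1] += 1
    # Customers first seen after the cutoff have no score and are ignored.
    return {cell: (n, b, b / n)
            for cell, (n, b) in sorted(counts.items(), reverse=True)}
```

Reading the result from the top cell down shows how quickly response falls off, which is the comparison described above.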

RFM Weaknesses

When we began building complex models, we talked to a seasoned Ph.D. about how to tell whether what we were doing was right. He replied, “If you don’t see recency at the top, you better recheck.” This statement illustrates both the strength and weakness of RFM.

RFM tends to be applied in a vacuum, ignoring other important information. I ran a successful business-to-business catalog for several years using only recency. We felt that if someone hadn’t responded in four years, they weren’t worth mailing. Our file was small and this approach was working. Looking back, I wish I had considered additional information.

If you have broad product lines and sales vary through the year, then you probably have very different customers buying in different seasons. If we ask, “Who will likely buy?” recency always wins. If, instead, we ask, “Who is likely to buy ‘X’?” we will probably get a very different answer.

It is intuitively obvious that men are different from women, young different from old, etc. It is also plain that customers who buy one type of product do not necessarily buy all your other products. Further, if the products suggest a lifestyle (like woodworking, fishing, needlepoint or other interests), then when you examine who is likely to buy, recency can almost disappear!

The fundamental issue is not whether RFM works, but whether we are asking the correct question. I would contend that building one model for travel buyers when some go to the Yukon and others to Cancun is odd indeed. My statistics friend (who banked on recency) assured me that it is always better to build a model based on your specific offer. “Of course! If you want modelling to work, you should almost never have just ONE model,” he replied. But because of the cost, if a mailer has a model, they probably have only ONE.

One of the great advantages of RFM is its simplicity. This is also its great weakness. Some score frequency by dividing their customers into equal groups of, say, 20%. This gives you one cell that covers the top 20% of your customers (probably 80% of your sales) and four cells for the bottom 20% of your sales. Above, we suggested a fix for this, but that fix in turn creates a bottom cell that may include 50% of your customers. It is very common to find that over half of your customers have ordered once, spent under $100 and haven’t ordered in two or more years.

It is not difficult to identify and contact your best customers… they are the ones who call the most already. If you do nothing more than put offers in your outgoing packages and trigger an additional follow-up mailing one month after the shipment, you will be getting to the right people. You don’t need a model for this! Virtually all of the articles on RFM imply that segmentation will more than pay for itself by saving wasted circulation. RFM must focus on your best customers. The problem is not who your best customers are, but which of your not-so-hot customers are worth mailing. We should also note that RFM will have little value in prospecting. Obviously, if you are looking for new customers, they will not have any RFM values.

Expanding RFM

Two obvious solutions arise: first, include product category in your analysis. Second, add information external to buying behavior (geodemographic information on the type of neighborhood a customer lives in).

There are many ways to add product (“P” in RFMP). The simplest is to trigger a flag the first time a customer buys something in a category. In addition, you can track the sales history by category and score it like monetary. Many use a hierarchical structure, which makes it easy to select names; this is the worst for modeling. If you are mailing a special offer, you can now break those poor customers (low RFM scores) into those who have purchased or not purchased the relevant product categories.
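To make the category idea concrete, here is a small Python sketch that keeps sales history by category for each customer, then splits the weak RFM cells by whether the customer has ever bought in the category being offered. Everything named here is an assumption for the example: the function names, the single combined RFM score per customer, and the cutoff for what counts as a weak cell.

```python
from collections import defaultdict

def category_history(order_lines):
    """order_lines: iterable of (customer_id, category, dollars).
    Returns {customer_id: {category: life-to-date dollars}}."""
    history = defaultdict(lambda: defaultdict(float))
    for cust, category, dollars in order_lines:
        history[cust][category] += dollars
    return history

def split_weak_cells(rfm_scores, history, category, max_score):
    """Among customers scoring max_score or below, separate those who
    have bought in `category` from those who have not."""
    bought, not_bought = [], []
    for cust, score in sorted(rfm_scores.items()):
        if score > max_score:
            continue  # strong cells are mailed anyway
        if history.get(cust, {}).get(category, 0) > 0:
            bought.append(cust)  # category flag is set
        else:
            not_bought.append(cust)
    return bought, not_bought
```

Any dollars above zero in a category act as the simple flag; the dollar totals themselves could be scored the same way as monetary.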

Neighborhood information also allows you to break up the weak cells. Even if all you knew was who lived in an urban, suburban or rural setting with high, medium or low income… you could obviously improve your ability to mail into the low cells.

Conclusion

RFM can easily lead to smaller and smaller customer mailings. Short term, profits improve; long term, the warehouse gets too big. It is important to look beyond the simple question of “Who is most likely to respond?” Additional variables help us identify the possibility of mailable customers within poor RFM cells.

John Miglautsch is President of Miglautsch Marketing, Inc. He can be reached at john@migmar.com.