Blog: A 3-step model to calculate the lifetime value of prospects

Customer lifetime value of your prospects

Klaas van der Veen

Imagine you are considering acquiring a prospect. Your decision would most likely depend on whether this prospect will add value to your organization. This can be determined by measuring the customer lifetime value (CLV), which represents the net present value of all future earnings from a customer. However, calculating the CLV of a prospect is challenging, since no purchase behaviour data is available for someone who is not yet a customer.

One way to deal with this is to use the data of your current customers to estimate the value of prospects, which follows a three-step process. First, you calculate the net profit of your customers, focusing on their first 3-5 active years with your organisation. Second, you create customer groups which are significantly different from each other by clustering them based on purchase behaviour. Each cluster can be profiled by using sociodemographic data (e.g. age, gender, etc.). Finally, the CLV of each cluster can be determined by projecting the historical purchase behaviour of these clusters into the future (assuming they were acquired today). Comparing the cluster profiles with your prospect enables you to estimate the prospect's CLV and helps you make the right acquisition decision.

This blog shows you how to carry out the steps described above and use the outcomes to estimate the CLV of your prospects. The case presented in this blog is based on our experience at a retailer (B2C), characterized by customers making multiple purchases per year in a non-contractual setting. First I will discuss the definition of CLV and the data required for calculating it. Subsequently, the case will be explained by following the three steps: (1) calculate net profit of existing customers, (2) create customer clusters and profiles and (3) calculate the net present value per cluster.

CLV definition

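CLV is usually defined as the net present value of all future earnings of a customer (Rust et al. 2000). CLV can be mathematically defined as follows:

$$\mathrm{CLV}_i = \sum_{t=1}^{T} \frac{\text{Net profit}_{i,t}}{(1 + d)^{t - 0{,}5}} - \text{Acquisition costs}_i$$

d = Discount rate
i = Customer (or cluster)
t = Period (e.g. year)
T = Number of periods considered (e.g. 3 to 5 years)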

The formula above is best explained by using an example. The table below shows the purchase behaviour of a fictional customer, who is expected to spend € 325,00 during the next three years. Besides the cost of goods sold, the organization makes additional costs regarding shipping (products ordered online) and returns (service provided for returning products). Additionally, the organization invested € 25,00 to acquire this customer and the discount rate is fixed at 10%.

Table 1: Example CLV Calculation

                     Year 1      Year 2      Year 3
Revenue              € 100,00    € 150,00    € 75,00
Cost of goods sold   € 70,00     € 105,00    € 52,50
Shipping costs       € 5,00      € 10,00     € 5,00
Return costs         € 2,50      € 2,50      € 2,50
Net profit           € 22,50     € 32,50     € 15,00

Acquisition costs: € 25,00
Discount rate: 10%

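Using this data and the formula we can calculate the CLV of this customer (amounts in euros):

$$\mathrm{CLV} = \frac{22{,}50}{1{,}10^{0{,}5}} + \frac{32{,}50}{1{,}10^{1{,}5}} + \frac{15{,}00}{1{,}10^{2{,}5}} - 25{,}00 \approx 21{,}45 + 28{,}17 + 11{,}82 - 25{,}00 = 36{,}44$$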

The expected CLV of this customer is € 36,44. Since the CLV is greater than € 0,00, it pays off to acquire this customer. In fact, the organization can invest up to € 36,44 + € 25,00 = € 61,44 to acquire this customer without making a loss. The calculation shows that profits earned today are worth more than profits earned in a more distant future; this is the effect of the discount factor. Partly for this reason, and because expected profits become more uncertain in the longer term, CLV is often based on a maximum period of 3 to 5 years. Note that the retention probability is not modelled separately: in our case, this effect is already embedded in the purchase behaviour of the clusters. Finally, customers typically spend throughout the whole year. Therefore, we assume (for simplicity) that revenues are earned in the middle of the year, using the exponent t – 0,5 rather than t – 1 (beginning of the year) or t (end of the year).

The key to a comprehensive and successful calculation of CLV is using all the relevant data. In the example above, only the costs of the product, shipping and returns were considered. However, other factors are likely to have an impact on the lifetime value of your customers. For instance, some customers might only buy during sale periods (discounts) or have high service usage levels (e.g. call centre). In addition, a complete CLV calculation also considers overhead costs (e.g. buildings, personnel, distribution, etc.).
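The example can also be checked with a few lines of code. The sketch below is a minimal illustration in Python; the function name `clv` and its `timing` parameter are assumptions made for this example, not part of the original analysis. It discounts each year's net profit from Table 1 with the mid-year exponent and subtracts the acquisition costs.

```python
def clv(net_profits, discount_rate, acquisition_cost, timing=0.5):
    """Net present value of yearly net profits minus the acquisition cost.

    timing=0.5 discounts each year's profit from mid-year (exponent t - 0.5);
    timing=1.0 would assume the beginning of the year, timing=0.0 the end.
    """
    present_value = sum(
        profit / (1 + discount_rate) ** (t - timing)
        for t, profit in enumerate(net_profits, start=1)
    )
    return present_value - acquisition_cost


# Net profit per year from Table 1: revenue minus cost of goods sold,
# shipping costs and return costs.
net_profits = [22.50, 32.50, 15.00]

example_clv = clv(net_profits, discount_rate=0.10, acquisition_cost=25.00)

# Highest acquisition investment at which the customer does not lose money.
break_even_acquisition_cost = example_clv + 25.00

print(round(example_clv, 2))                  # 36.44
print(round(break_even_acquisition_cost, 2))  # 61.44
```

Changing `timing` to 1.0 or 0.0 shows how much the choice between beginning-of-year and end-of-year discounting moves the outcome.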

Step 1: Calculate net profit of existing customers

Below is an overview of our analysis set-up. First, we defined several customer groups (cohorts) based on the start of their relationship with the organization (e.g. customers starting in the first quarter of 2012). Second, an analytical view per cohort is constructed, containing all purchase behaviour data over time (e.g. average order value in year 1, year 2, etc.). This also includes the calculation of net profit, calculated as margin minus discounts and any other variable costs.
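As a rough illustration of this set-up, the Python sketch below assigns customers to a quarterly cohort based on their first order and sums net profit per relationship year. The order records, their field layout and the helper names are hypothetical; the definition of a relationship year as a 12-month window counted from the first order is also an assumption made for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records:
# (customer_id, order_date, margin, discount, other_variable_costs)
orders = [
    ("A", date(2012, 1, 15), 30.00, 0.00, 7.50),
    ("A", date(2012, 11, 3), 45.00, 5.00, 12.50),
    ("A", date(2013, 4, 20), 22.50, 0.00, 7.50),
    ("B", date(2012, 2, 9), 15.00, 10.00, 5.00),
    ("C", date(2012, 5, 30), 60.00, 0.00, 10.00),
]

# The first order date marks the start of the relationship.
first_order = {}
for customer, order_date, *_ in orders:
    if customer not in first_order or order_date < first_order[customer]:
        first_order[customer] = order_date


def cohort_label(start):
    """Cohort = year and quarter in which the relationship started."""
    return f"{start.year}-Q{(start.month - 1) // 3 + 1}"


def relationship_year(start, order_date):
    """1 for orders in the first 12 months after the start, 2 for the next 12, etc."""
    months = (order_date.year - start.year) * 12 + (order_date.month - start.month)
    if order_date.day < start.day:
        months -= 1  # the current month is not yet complete
    return months // 12 + 1


# Net profit per (cohort, customer, relationship year):
# margin minus discounts and any other variable costs.
net_profit = defaultdict(float)
for customer, order_date, margin, discount, variable_costs in orders:
    start = first_order[customer]
    key = (cohort_label(start), customer, relationship_year(start, order_date))
    net_profit[key] += margin - discount - variable_costs

for key in sorted(net_profit):
    print(key, net_profit[key])
```

The same grouping keys can carry other purchase behaviour measures, such as the number of orders or the average order value per year.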


Comparing multiple cohorts has several advantages. First, building clusters based on purchase behaviour for all cohorts enables us to verify if these clusters are stable over time. Second, purchase behaviour (e.g. order frequency) of clusters may change over time. This trend can be used to predict the purchase behaviour of future customers.

Step 2: Create customer clusters and profiles

After finishing the first step you should have a clear view of the purchase behaviour of every individual customer. Since our aim is to identify those prospects with the highest potential value, it is useful to group customers with similar behaviour. A common way to do this is to perform a cluster analysis, which is available in most analytics software packages. The picture below shows the IBM SPSS Modeler output of the TwoStep cluster node. This method first compresses the input data into a manageable set of subclusters, after which it uses a hierarchical clustering method to progressively merge the subclusters into larger and larger clusters. The advantage of this method is that it can handle large datasets efficiently.

When performing a cluster analysis, it is very important to continuously evaluate the results and do some sanity checks. Clusters should be significantly different from each other, but also be interpretable (making sense to the business). Furthermore, the settings of a cluster analysis (e.g. distance measure, clustering criterion, how to manage outliers, etc.) can have a considerable impact on the outcomes, not to mention the choice of input variables.

In this case, we used the number of orders and average order value as input variables for the cluster analysis. Based on these variables the algorithm found three distinct groups, as shown in the picture below.

Creating clusters of customers with specific purchase behaviour (and lifetime value, as we will see in step 3) is one thing, but how does this help you identify those prospects with the highest potential when there is no behavioural data? This is where customer profiling becomes useful. Given the customers within clusters, one can match sociodemographic data (using internal and/or external data) and find the characteristics which are over- or underrepresented for each cluster (e.g. age, income, household type, etc.). This can be done by comparing the means of a characteristic between clusters. To check if the differences are significant, an F-value is calculated (see also the table below), where an importance of 1 denotes significance.
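To give an idea of what such a clustering does with these two input variables, here is a small Python sketch. Note that it does not reproduce the TwoStep method used in our case: it swaps in a plain k-means with a deterministic start, the customer data is made up, and the number of clusters is fixed to three to mirror the three groups mentioned above. The variables are standardized first so that average order value (in euros) does not dominate the number of orders.

```python
import statistics

# Hypothetical customers: (number of orders per year, average order value in euros)
customers = [
    (1, 20.0), (2, 25.0), (1, 30.0), (2, 22.0),      # few orders, low value
    (8, 28.0), (9, 24.0), (10, 30.0), (8, 26.0),     # many orders, low value
    (2, 120.0), (3, 140.0), (2, 130.0), (3, 125.0),  # few orders, high value
]


def standardize(rows):
    """Rescale each variable to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    stdevs = [statistics.pstdev(c) for c in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stdevs)) for row in rows]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=20):
    """Plain k-means (a stand-in for TwoStep, not the same algorithm)."""
    # Deterministic start: the first point, then repeatedly the point that is
    # farthest from the centres chosen so far.
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(squared_distance(p, c) for c in centres)))
    labels = []
    for _ in range(iterations):
        # Assign every customer to the nearest centre.
        labels = [min(range(k), key=lambda c: squared_distance(p, centres[c])) for p in points]
        # Move every centre to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centre if a cluster ends up empty
                centres[c] = tuple(statistics.mean(dim) for dim in zip(*members))
    return labels


labels = k_means(standardize(customers), k=3)

# Describe each cluster by its average purchase behaviour.
for cluster in sorted(set(labels)):
    members = [c for c, label in zip(customers, labels) if label == cluster]
    orders = statistics.mean(m[0] for m in members)
    value = statistics.mean(m[1] for m in members)
    print(cluster, len(members), round(orders, 1), round(value, 1))
```

With real data the groups are rarely this clean, which is why the evaluation and sanity checks described above matter.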
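The F-value behind such a comparison of means can be sketched in a few lines of Python. The ages per cluster below are hypothetical and the function name `f_value` is an assumption for this example; it computes the standard one-way ANOVA statistic, i.e. the variance between the cluster means divided by the variance within the clusters. Turning the F-value into a significance level requires the F distribution, which is left out of this sketch.

```python
import statistics


def f_value(groups):
    """One-way ANOVA F statistic for a list of groups of observations."""
    values = [v for group in groups for v in group]
    grand_mean = statistics.mean(values)
    k, n = len(groups), len(values)
    # Variation of the group means around the overall mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (v - statistics.mean(group)) ** 2 for group in groups for v in group
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical ages of customers in three clusters.
age_by_cluster = {
    "cluster 1": [24, 27, 25, 28],
    "cluster 2": [35, 38, 36, 39],
    "cluster 3": [52, 55, 50, 51],
}

f = f_value(list(age_by_cluster.values()))
print(round(f, 2))
```

A large F-value means the clusters differ much more from each other in age than customers within the same cluster do, which makes age a useful characteristic for profiling.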

Table 2: Example of compare means calculation

The resulting cluster profiles can be used to target the most relevant channels or events (and thus audiences) to reach the prospects with the highest potential.

Step 3: Calculate the net present value per cluster

In the last step, we combine the insights from the previous steps to make a projection of the customer lifetime value for a given customer group (acquired today or soon). This exercise involves the following activities:

  • Forecast the purchase behaviour of clusters for the next 3-5 years by comparing the cohorts and extending these trends to future periods.
  • Allocate overhead costs to clusters via the total number of customers or visitors. In this way, overhead costs are not only allocated to customers who make a purchase.
  • Finally, calculate the net present value of all future profits per cluster by using the discount rate.
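The last two activities can be sketched as follows in Python. All figures, cluster names and variable names are hypothetical, and the forecast of net profit per customer is simply taken as given here (it would come from the trend analysis in the first activity). Overhead is spread evenly over all customers, and profits are discounted with the same mid-year exponent as in the CLV example.

```python
# Hypothetical forecast of net profit per customer per year, before overhead,
# for a lifetime of four years.
forecast = {
    "high value": [60.0, 55.0, 50.0, 45.0],
    "frequent buyers": [30.0, 28.0, 26.0, 24.0],
    "low value": [8.0, 6.0, 5.0, 4.0],
}

# Hypothetical cluster sizes and yearly overhead costs.
customers_per_cluster = {"high value": 1000, "frequent buyers": 3000, "low value": 6000}
overhead_per_year = 100_000.0
discount_rate = 0.10

# Allocate overhead via the total number of customers, so every customer
# carries the same share regardless of how much he or she buys.
total_customers = sum(customers_per_cluster.values())
overhead_per_customer = overhead_per_year / total_customers


def npv(cash_flows, rate):
    """Net present value with mid-year discounting (exponent t - 0.5)."""
    return sum(cf / (1 + rate) ** (t - 0.5) for t, cf in enumerate(cash_flows, start=1))


npv_per_cluster = {
    name: npv([profit - overhead_per_customer for profit in profits], discount_rate)
    for name, profits in forecast.items()
}

for name, value in npv_per_cluster.items():
    print(f"{name}: {value:.2f} per customer")
```

In this made-up example the low value cluster ends up with a negative net present value once overhead is allocated, which is the kind of insight that feeds the acquisition decision.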

At the end of this step you can merge all previous results to obtain a complete picture of your customer groups (purchase behaviour, profile, and net present value). Combining purchase behaviour and NPV in our case provides the following insights, based on a customer lifetime of four years.

Clearly, customers with a high average order value are likely to be high value customers. On the other hand, customers with a low average order value and a low order frequency are likely to be non-profitable customers.


This blog showed you how to calculate the CLV of prospects (and how to identify them) in only three steps. The essential factor in this process is collecting all relevant data and knowing how to use it in the CLV calculation. The insights from this analysis enable managers to optimize their acquisition strategy. For example, you now know which customer groups to target based on their profile and you are better able to determine the size of the acquisition offer.
