SQL Code for Lifetime Value Model

The first part of building the code to pull the data for the lifetime value model is to create an SQL script that will get the historical data we need to train our model on.

The SQL Code select 6 fields from the Contoso Retail Datawarehouse:
The full SQL code is at the bottom of the page

First Section of Code (WITH)

USE ContosoRetailDW;

WITH FirstTransactionDates AS (
    SELECT customerkey,MIN([DateKey]) AS first_transaction_date
    FROM [dbo].[FactOnlineSales]
    GROUP BY customerkey)
	,
	Orders AS (SELECT 
		  [CustomerKey] as CustomerKey, 
		  [SalesOrderNumber] as SalesOrderNumber,
		  ROW_NUMBER() OVER (PARTITION BY [CustomerKey] ORDER BY [SalesOrderNumber] ASC) AS OrderNumber,
		  SUM([SalesAmount]) as SalesAmount
		  FROM
		  [dbo].[FactOnlineSales]
		  GROUP BY [CustomerKey], [SalesOrderNumber] 
)


The code starts off with a subquery after the database is selected in the USE statement.

2 Sub-queries are then created in the WITH clause. This makes the code easier to manage rather than using sub-queries on sub-queries in the main code body.

The first sub-query: FirstTransactionDates creates a temporary table of customers and their first transaction date from the FactOnlineSales table

The second-subquery: Orders creates a temporary table of customers and their orders with the sales ordernumber field ranking the orders from 1 upwards with 1 being the first order. I will create a separate post explaining the line:
ROW_NUMBER() OVER (PARTITION BY [CustomerKey] ORDER BY [SalesOrderNumber] ASC) AS OrderNumber,


Second Section of Code

SELECT 
--selection of CustomerKey from DimCustomer
c.CustomerKey as CustomerKey, 

-- Calculate Age as difference between [BirthDate] selected from DimCustomer and today (using the GETDATE() function), then divide by 365
DATEDIFF(DAY, [BirthDate], GETDATE())/365 as Age, 

--Gender selected as the [gender] field from DimCustomer and converted to 0 and 1 for m and f respectively
CASE 
WHEN [gender] = 'm' THEN 0
WHEN [gender] = 'f' THEN 1
END as Gender,

--YearlyIncome selected from DimCustomer table
c.YearlyIncome as YearlyIncome, 

--FirstPurchaseAmount from sub-query a
a.FirstPurchaseAmount as FirstPurchaseAmount,

LifetimeSales as sub-query b
b.first12months_salesamount as LifetimeSales

-- select from DimCustomer with left join on.... tables to follow...
FROM dbo.DimCustomer c LEFT OUTER JOIN


As you can see from the above, there are 6 field selections:
1. The CustomerKey from the DimCustomer table (alias c)
2. Age calculated as the difference in days between the customers birthdate ([BirthDate] and todays date (GETDATE()). This is then divided by 365 to get the age in years.
3. The Gender which uses the [gender] filed from the DimCustomer table and swaps ‘m’ for 0 and ‘f’ for 1, so we have numeric data for the model.
4. The YearlyIncome field from the DimCustomer table.

Then the last 2 fields FirstPurchaseAmount and first12months_salesamount which are calculated using sub-queries.

Third Section of Code (Sub-query a)



First Sub-query a, which is the DimCustomer table joins to on ON c.CustomerKey = a.CustomerKey
The SELECT query select, CustomerKey and SalesAmount from the ORDERS table.
The ORDERS table is a temporary table created with a sub-query in the WITH clause at the start of the code. The ORDERS table is filtered by OrderNumber = 1 to return the sum of sales for each customer on their first order. This gives us the FirstPurchaseAmount field.

The Fourth Section of code is sub-query B

LEFT OUTER JOIN

-- Build First 12 Month Sales Table b
(
SELECT 
    f.customerkey,
    SUM(CASE WHEN DATEDIFF(month, ft.first_transaction_date, f.DateKey) < 12 THEN f.salesamount 
	ELSE 0 
	END) AS first12months_salesamount

FROM [dbo].[FactOnlineSales] f
LEFT OUTER JOIN FirstTransactionDates ft ON 
f.customerkey = ft.customerkey
GROUP BY f.CustomerKey
) as b
ON c.CustomerKey = b.CustomerKey

WHERE c.CustomerType = 'Person'


The fourth section of code shows the LEFT OUTER JOIN of the DimCustomer table to sub-query b.
Sub-query b gets us the first 12 month sales amount of customers, which will be our dependent variable in our final Model.

The SELECT code starts by selecting the CustomerKey from the FactOnlineSales table.

The second line selects the first_transaction_date from the FirstTransactionDates sub-query created in the WITH clause and the f.DateKey from the FactOnlineSales table and looks at the difference in months between the 2. If the DateKey (transaction date is less than 12 months), then the sales amount is summed as first12months_salesamount.
b.first12months_salesamount as LifetimeSales

SUM(CASE WHEN DATEDIFF(month, ft.first_transaction_date, f.DateKey) < 12 THEN f.salesamount
ELSE 0
END) AS first12months_salesamount


This is the field used in the main SELECT query in code section 2.
b.first12months_salesamount as LifetimeSales

Full SQL Code

USE ContosoRetailDW;

WITH FirstTransactionDates AS (
    SELECT customerkey,MIN([DateKey]) AS first_transaction_date
    FROM [dbo].[FactOnlineSales]
    GROUP BY customerkey)
	,
	Orders AS (SELECT 
		  [CustomerKey] as CustomerKey, 
		  [SalesOrderNumber] as SalesOrderNumber,
		  ROW_NUMBER() OVER (PARTITION BY [CustomerKey] ORDER BY [SalesOrderNumber] ASC) AS OrderNumber,
		  SUM([SalesAmount]) as SalesAmount
		  FROM
		  [dbo].[FactOnlineSales]
		  GROUP BY [CustomerKey], [SalesOrderNumber] 
)

SELECT 
c.CustomerKey as CustomerKey, 
DATEDIFF(DAY, [BirthDate], GETDATE())/365 as Age, 
CASE 
WHEN [gender] = 'm' THEN 0
WHEN [gender] = 'f' THEN 1
END as Gender,
c.YearlyIncome as YearlyIncome, 
a.FirstPurchaseAmount as FirstPurchaseAmount,
b.first12months_salesamount as LifetimeSales

FROM 
dbo.DimCustomer c

--Build FirstPurchaseAmount table a
LEFT OUTER JOIN
(
SELECT CustomerKey, SalesAmount as FirstPurchaseAmount
FROM ORDERS
WHERE OrderNumber = 1
) a
ON c.CustomerKey = a.CustomerKey

LEFT OUTER JOIN

-- Build First 12 Month Sales Table b
(
SELECT 
    f.customerkey,
    SUM(CASE WHEN DATEDIFF(month, ft.first_transaction_date, f.DateKey) < 12 THEN f.salesamount 
	ELSE 0 
	END) AS first12months_salesamount

FROM [dbo].[FactOnlineSales] f
LEFT OUTER JOIN FirstTransactionDates ft ON 
f.customerkey = ft.customerkey
GROUP BY f.CustomerKey
) as b
ON c.CustomerKey = b.CustomerKey

WHERE c.CustomerType = 'Person'


Comments

5 responses to “SQL Code for Lifetime Value Model”

  1. Baddiehubs Nice post. I learn something totally new and challenging on websites

  2. BYU Cougars naturally like your web site however you need to take a look at the spelling on several of your posts. A number of them are rife with spelling problems and I find it very bothersome to tell the truth on the other hand I will surely come again again.

  3. Mating Press very informative articles or reviews at this time.

  4. Tech Learner Very well presented. Every quote was awesome and thanks for sharing the content. Keep sharing and keep motivating others.

  5. Houzzmagazine This is my first time pay a quick visit at here and i am really happy to read everthing at one place

Leave a Reply

Your email address will not be published. Required fields are marked *