
Case Study: Market Basket Analysis

Ronald Hochreiter

VO Datenbanksysteme, UK Datawarehousing
December 13, 2006
Market Basket Analysis

Case study based on:


P. Giudici. Applied Data Mining. Wiley. 2003. Chapter 7.

Base of analysis. Market consumer behaviour.

Objective. Determine which products are bought together. Applications:
reorganizing the supermarket layout, planning promotional campaigns
(products that are bought together should not be promoted at the
same time), ...

Techniques. Association rules (if condition, then result).


E.g. if Beer, then Soletti.

Market Basket Analysis - Data Description

• 75 days between January 2nd and April 21st, 2001.

• Southern Italy

• 7301 loyalty cards

• Average expenditure per transaction: EUR 28.27

• Analysis limited to 20 categories (items), only food products

Market Basket Analysis - Raw Data

Market Basket Analysis - Datawarehouse

1. Create a new table in some database

CREATE TABLE transaction (
code BIGINT ZEROFILL,
product VARCHAR(50)
)

2. Convert the input data into SQL INSERT statements, or use a data
import function (phpMyAdmin; this did not work at @univie.ac.at).

3. Easy data analysis possible with simple SQL queries.
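
The three steps above can be sketched end to end. This is a minimal illustration only: it uses Python's built-in sqlite3 module instead of MySQL, the sample rows are invented, and the table name is quoted because TRANSACTION is a keyword in SQLite. SQLite has no ZEROFILL, so a plain INTEGER stands in for the card number.

```python
import sqlite3

# Invented sample rows (card number, product); not taken from the case study data.
rows = [
    (1000000000017, "beer"),
    (1000000000017, "coke"),
    (1000000000017, "beer"),
    (1000000000042, "milk"),
]

conn = sqlite3.connect(":memory:")

# Step 1: create the table (INTEGER replaces BIGINT ZEROFILL in SQLite).
conn.execute('CREATE TABLE "transaction" (code INTEGER, product VARCHAR(50))')

# Step 2: load the data with parameterized INSERT statements.
conn.executemany('INSERT INTO "transaction" (code, product) VALUES (?, ?)', rows)

# Step 3: a simple analysis query, here the number of items per loyalty card.
per_card = conn.execute(
    'SELECT code, count(*) AS counter FROM "transaction" '
    "GROUP BY code ORDER BY counter DESC"
).fetchall()
print(per_card)
```

Parameterized inserts avoid building SQL strings by hand, which is usually the safer choice when the input file contains free-text product names.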

Market Basket Analysis - SQL Data Analysis

Problem with the simple import: code is the loyalty card number, so

SELECT code, count(*) AS counter
FROM transaction
GROUP BY code
ORDER BY counter DESC

counts items per card and does not distinguish between purchases! Solution:

CREATE TABLE transaction2 (
code BIGINT(13) ZEROFILL,
purchase MEDIUMINT UNSIGNED,
product VARCHAR(50)
)

A custom conversion script is necessary!
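
The slides do not show the raw format or the conversion script, so the following is only a hypothetical sketch. It assumes that each raw record carries the card number, a purchase day and a product, and that all items bought on one card on one day form one purchase; the function name assign_purchase_ids and the sample records are invented for illustration.

```python
def assign_purchase_ids(records):
    """Turn (code, day, product) records into (code, purchase, product) rows.

    Assumption (not from the slides): one card on one day is one purchase.
    Purchase ids are handed out in order of first appearance, starting at 1.
    """
    purchase_ids = {}
    converted = []
    for code, day, product in records:
        key = (code, day)
        if key not in purchase_ids:
            purchase_ids[key] = len(purchase_ids) + 1
        converted.append((code, purchase_ids[key], product))
    return converted


# Invented sample records.
raw = [
    (17, "2001-01-02", "beer"),
    (17, "2001-01-02", "coke"),
    (42, "2001-01-02", "milk"),
    (17, "2001-01-05", "beer"),
]
converted = assign_purchase_ids(raw)
for row in converted:
    print(row)
```

Any such splitting rule is a heuristic: if a card is used twice on the same day, both visits end up in one purchase, which is one way oversized purchases can arise.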


Market Basket Analysis - SQL Data Analysis

Main problem solved; however, there may still be trouble with the data:

SELECT code, purchase, count(*) AS counter
FROM transaction2
GROUP BY code, purchase
ORDER BY counter DESC

Some purchases have not been split up correctly: one purchase contains
131 items, followed by 66, 50, 37, 34, 33, ...

Quick (and dirty) solution: remove all purchases in which more than
30 items were bought together?

Or analyze the large purchases in more detail?
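
The quick-and-dirty option could look like the sketch below. It again uses sqlite3 with invented rows; the constant MAX_ITEMS is an illustrative name for the 30-item limit mentioned above. The oversized purchases are listed first, so they can still be inspected before they are deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transaction2 (code INTEGER, purchase INTEGER, product VARCHAR(50))"
)

# Invented data: two normal purchases and one with 31 items.
rows = [(17, 1, "beer"), (17, 1, "coke"), (42, 2, "milk")] + [(99, 3, "pasta")] * 31
conn.executemany("INSERT INTO transaction2 VALUES (?, ?, ?)", rows)

MAX_ITEMS = 30  # threshold from the slide

# List the purchases above the threshold.
suspicious = conn.execute(
    "SELECT purchase, count(*) AS counter FROM transaction2 "
    "GROUP BY purchase HAVING count(*) > ? ORDER BY counter DESC",
    (MAX_ITEMS,),
).fetchall()
print(suspicious)

# Remove them. MySQL may require wrapping the subquery in a derived table.
conn.execute(
    "DELETE FROM transaction2 WHERE purchase IN ("
    "SELECT purchase FROM transaction2 GROUP BY purchase HAVING count(*) > ?)",
    (MAX_ITEMS,),
)
remaining = conn.execute("SELECT count(*) FROM transaction2").fetchone()[0]
print(remaining)
```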


Market Basket Analysis - SQL Data Analysis

Analyze purchase 22340 (with 131 items):

SELECT product, count(*) AS counter
FROM transaction2
WHERE purchase = 22340
GROUP BY product
ORDER BY counter DESC

Market Basket Analysis - Buy-Together Analysis

CREATE TABLE analysis (
purchase MEDIUMINT UNSIGNED,
product1 TINYINT(1) UNSIGNED,
product2 TINYINT(1) UNSIGNED,
...
product20 TINYINT(1) UNSIGNED
)

or name the columns after the products directly; the product names
can be listed with a SQL SELECT DISTINCT query:

SELECT DISTINCT product
FROM transaction
ORDER BY product

Market Basket Analysis - Buy-Together Analysis

CREATE TABLE analysis (
purchase MEDIUMINT UNSIGNED,
beer TINYINT(1) UNSIGNED,
biscuits TINYINT(1) UNSIGNED,
...
yoghurt TINYINT(1) UNSIGNED
)
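
The slides do not say how the analysis table is filled. One possible approach, sketched here with sqlite3 and invented rows, is conditional aggregation: one 0/1 indicator column per product, one row per purchase. The column list is generated from the SELECT DISTINCT result; since product names end up in the SQL text, they are checked to be purely alphabetic first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transaction2 (code INTEGER, purchase INTEGER, product VARCHAR(50))"
)
# Invented rows; purchase 1 contains beer twice.
conn.executemany(
    "INSERT INTO transaction2 VALUES (?, ?, ?)",
    [(17, 1, "beer"), (17, 1, "coke"), (17, 1, "beer"), (42, 2, "yoghurt")],
)

products = [
    r[0]
    for r in conn.execute("SELECT DISTINCT product FROM transaction2 ORDER BY product")
]
# Product names become column names, so reject anything that is not plain letters.
if not all(p.isalpha() for p in products):
    raise ValueError("unexpected characters in product name")

# One indicator column per product: 1 if the purchase contains it, else 0.
columns = ", ".join(
    f"MAX(CASE WHEN product = '{p}' THEN 1 ELSE 0 END) AS {p}" for p in products
)
conn.execute(
    f"CREATE TABLE analysis AS SELECT purchase, {columns} "
    "FROM transaction2 GROUP BY purchase"
)
analysis = conn.execute("SELECT * FROM analysis ORDER BY purchase").fetchall()
print(products)
print(analysis)
```

With such a table, buy-together questions reduce to conditions like beer = 1 AND coke = 1, at the cost of a fixed column set.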

Now we can list and count the purchases that contain a certain product:

SELECT purchase, count(product)
FROM transaction2
WHERE product = 'beer'
GROUP BY purchase
ORDER BY purchase

Market Basket Analysis - Buy-Together Analysis

List/count the purchases that contain two products together (a simple
inner join of the table with itself (!), one alias per product):

SELECT a.purchase, count( a.product ), count( b.product )
FROM transaction2 a, transaction2 b
WHERE a.product = 'beer'
AND b.product = 'coke'
AND a.purchase = b.purchase
GROUP BY a.purchase
ORDER BY a.purchase
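
The same self-join idea can be extended from one fixed pair to all product pairs at once. The sketch below is an illustration with sqlite3 and invented rows, not part of the original case study. The condition a.product < b.product keeps each unordered pair once, and count(DISTINCT ...) makes sure a purchase is counted once even if it contains the same product several times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transaction2 (code INTEGER, purchase INTEGER, product VARCHAR(50))"
)
# Invented rows: three purchases.
conn.executemany(
    "INSERT INTO transaction2 VALUES (?, ?, ?)",
    [
        (17, 1, "beer"), (17, 1, "coke"), (17, 1, "beer"),
        (42, 2, "beer"), (42, 2, "coke"), (42, 2, "milk"),
        (17, 3, "milk"),
    ],
)

# Number of purchases in which each pair of products occurs together.
pairs = conn.execute(
    "SELECT a.product, b.product, count(DISTINCT a.purchase) AS purchases "
    "FROM transaction2 a JOIN transaction2 b "
    "ON a.purchase = b.purchase AND a.product < b.product "
    "GROUP BY a.product, b.product "
    "ORDER BY purchases DESC, a.product, b.product"
).fetchall()
for pair in pairs:
    print(pair)
```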

Market Basket Analysis - Association Rules

$\mathrm{support}\{A \to B\} = \frac{N_{A \to B}}{N}$

Example. support {coke → beer} = ?

$N_{A \to B}$ =

SELECT count(DISTINCT a.purchase)
FROM transaction2 a, transaction2 b
WHERE a.product = 'beer' AND b.product = 'coke'
AND a.purchase = b.purchase

= 1119 and $N$ =

SELECT count(DISTINCT purchase)
FROM transaction2

= 46263, i.e. support {coke → beer} = 1119 / 46263 ≈ 0.0242
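
As a small worked example of the support definition, the sketch below computes both counts on invented data with sqlite3. It counts distinct purchases, so that each purchase contributes at most once to the numerator and exactly once to the denominator, even when a product appears several times in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transaction2 (code INTEGER, purchase INTEGER, product VARCHAR(50))"
)
# Invented rows: four purchases, two of them contain both coke and beer.
conn.executemany(
    "INSERT INTO transaction2 VALUES (?, ?, ?)",
    [
        (17, 1, "beer"), (17, 1, "coke"),
        (42, 2, "coke"),
        (17, 3, "beer"), (17, 3, "coke"), (17, 3, "coke"),
        (99, 4, "milk"),
    ],
)

# Purchases containing both products.
n_ab = conn.execute(
    "SELECT count(DISTINCT a.purchase) FROM transaction2 a, transaction2 b "
    "WHERE a.product = 'beer' AND b.product = 'coke' "
    "AND a.purchase = b.purchase"
).fetchone()[0]

# All purchases.
n = conn.execute("SELECT count(DISTINCT purchase) FROM transaction2").fetchone()[0]

support = n_ab / n
print(n_ab, n, support)
```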

Market Basket Analysis - Association Rules

$\mathrm{confidence}\{A \to B\} = \frac{N_{A \to B}}{N_A}$

Example. confidence {coke → beer} = ?

$N_{A \to B}$ = 1119 (see support) and $N_A$ =

SELECT count(DISTINCT purchase)
FROM transaction2
WHERE product = 'coke'

= 4956, i.e. confidence {coke → beer} = 1119 / 4956 ≈ 0.226
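
To check the arithmetic, the two ratios can be recomputed from the counts quoted on the slides. The function name rule_metrics is chosen for this sketch only.

```python
def rule_metrics(n_ab, n_a, n):
    """Return (support, confidence) of a rule A -> B.

    n_ab: purchases containing both A and B
    n_a:  purchases containing A
    n:    all purchases
    """
    return n_ab / n, n_ab / n_a


# Counts quoted on the slides for the rule coke -> beer.
support, confidence = rule_metrics(n_ab=1119, n_a=4956, n=46263)
print(f"support    = {support:.4f}")
print(f"confidence = {confidence:.3f}")
```

Support is symmetric in A and B, whereas confidence is not: confidence {beer → coke} would divide the same 1119 by the number of purchases containing beer, which is not given here.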
