This article is about market basket analysis, how to perform it using SQL, and how to present the result in the form of a real-time dashboard.
For the purpose of the article, we use a retail dataset of a global superstore: Superstore Sales. This is. We will work with data for 2014.
If you find this article useful, please give it a boost by sharing it on your social media.
Index:
Application of Market Basket Analysis
Market Basket Analysis Dashboard
Application of Market Basket Analysis
Market basket analysis plays a crucial role in the analysis of retail businesses. The contents of the shopping carts of the customers may provide valuable information and help design promotions, ensure better stock management, identify the drivers of the purchase, build cross-sale models, and boost sales.
The most commonly used technique for market basket analysis is association analysis, which generates association rules.
Although market basket analysis and association rules originate from the analysis of retail transactions they may be applied to other business domains. For instance:
- Electronic medical record (EMR) systems may suggest to the physicians the possible need for a back MRI for a patient who has had this procedure in the past; or remind the physician when prescribing antibiotics also to prescribe probiotics.
- Which other parts of a website are explored by the visitors that came from a particular landing page?
- If a passenger entered the Mall of the Emirates metro station and exited the same station two hours later paying the minimum fare, was there system abuse?
Market basket analysis should not be limited to the application of association analysis.
It should also not be limited to the items in the shopping cart.
Since e-commerce with its user registration, as well as loyalty cards implemented by retail stores, allow customer identification and tracking of his or her purchases over time, the customer characteristics must be definitely considered.
- Why does a particular customer always buy oil filters but never buy engine oil?
And you should not miss the store entity.
- Why is this beautiful reproduction of the Sagrada Familia a bestseller in Barcelona’s Eixample, but not a good seller in El Poblenou?
However, the most important entity to watch is the order or the transaction. It includes the following characteristics in our dataset: order_id, customer_id, product_id, order_date, ship_mode, shipping_cost, order_priority, ship_date, sales, discount, and profit.
Market Basket Analysis Dashboard
Market basket analysis dashboard may be used to make decisions about product promotions, visual merchandising, store segmentation, customer targeting, and more.
To build the market basket analysis dashboard we use the Tepegraph platform. We connect it to the dataset and use the platform to fetch the data with SQL queries and visualize the result.
We start with some basic measures that do not require association rules.
Customer Characteristics:
Proportion of customers and average purchase size of the customers who purchase the most popular products (100 best-selling items).
To find out the proportion we use the following query:
WITH orders AS ( SELECT * FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 ), best_products AS ( SELECT product_id, SUM(quantity) AS quantity FROM orders GROUP BY product_id ORDER BY quantity DESC LIMIT 100 ), customers AS ( SELECT customer_id FROM orders GROUP BY customer_id ), selected_customers AS ( SELECT customer_id FROM orders WHERE product_id IN (SELECT product_id FROM best_products) GROUP BY customer_id ), result AS ( SELECT COUNT(customer_id) AS yes, ROUND((SELECT COUNT(customer_id) FROM customers),0) AS total, ROUND(((SELECT COUNT(customer_id) FROM customers) - COUNT(customer_id)),0) AS no FROM selected_customers ), pie AS ( SELECT yes AS values, 'Yes' as labels FROM result UNION SELECT no AS values, 'No' as labels FROM result ) SELECT * FROM pie
For the average purchase size:
WITH orders AS ( SELECT * FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 ), best_products AS ( SELECT product_id, SUM(quantity) AS quantity FROM orders GROUP BY product_id ORDER BY quantity DESC LIMIT 100 ), customers AS ( SELECT customer_id, SUM(sales) AS sales FROM orders GROUP BY customer_id ), selected_customers AS ( SELECT customer_id, SUM(sales) AS sales FROM orders WHERE product_id IN (SELECT product_id FROM best_products) GROUP BY customer_id ), result AS ( SELECT ROUND(AVG(sales),2) AS avg_order_size FROM selected_customers ) SELECT * FROM result
Proportion of customers and average purchase size of customers who purchase the least popular products (100 worst selling items).
To find out the proportion we use this query:
WITH orders AS ( SELECT * FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 ), worst_products AS ( SELECT product_id, SUM(quantity) AS quantity FROM orders GROUP BY product_id ORDER BY quantity LIMIT 100 ), customers AS ( SELECT customer_id FROM orders GROUP BY customer_id ), selected_customers AS ( SELECT customer_id FROM orders WHERE product_id IN (SELECT product_id FROM worst_products) GROUP BY customer_id ), result AS ( SELECT COUNT(customer_id) AS yes, ROUND((SELECT COUNT(customer_id) FROM customers),0) AS total, ROUND(((SELECT COUNT(customer_id) FROM customers) - COUNT(customer_id)),0) AS no FROM selected_customers ), pie AS ( SELECT yes AS values, 'Yes' as labels FROM result UNION SELECT no AS values, 'No' as labels FROM result ) SELECT * FROM pie
For the average purchase size:
WITH orders AS ( SELECT * FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 ), worst_products AS ( SELECT product_id, SUM(quantity) AS quantity FROM orders GROUP BY product_id ORDER BY quantity LIMIT 100 ), customers AS ( SELECT customer_id, SUM(sales) AS sales FROM orders GROUP BY customer_id ), selected_customers AS ( SELECT customer_id, SUM(sales) AS sales FROM orders WHERE product_id IN (SELECT product_id FROM worst_products) GROUP BY customer_id ), result AS ( SELECT ROUND(AVG(sales),2) AS avg_order_size FROM selected_customers ) SELECT * FROM result
To visualize the proportions I used the Pie Chart, and for the average order size – just a Scaled-up Number. As a result, I got such a visual block:
- Would replacing some of those worst-selling items with better-selling ones threaten the customer relationship?
The breadth of the customer relationship (the number of unique items ever purchased) by the depth of the relationship (the number of orders) for customers who purchased more than one item:
WITH t AS ( SELECT customer_id, COUNT(DISTINCT product_id) AS unique_items, COUNT(DISTINCT order_id) AS unique_orders FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date) = 2014 GROUP BY customer_id ORDER BY customer_id ) SELECT customer_id, unique_items, unique_orders FROM t WHERE unique_items > 1
The result of the query is visualized using a scatter. The concentration of the data points represents the number of customers who have that particular breadth and depth in their relationship. The largest concentration is on the lower left, and the smallest is on the higher up to the right.
Quite a large number of customers in our case are buying little and rear, although better customers tend to purchase a greater diversity of goods and more often.
- Who are those customers on the higher right?
- What products do they buy?
- How can we get more customers of this kind?
Orders per customer distribution 1Y:
SELECT customer_name, COUNT(order_id) AS TotalOrders FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 GROUP BY customer_name
The average number of orders per customer per month for the last month:
SELECT COUNT(DISTINCT order_id) * 1.0 / NULLIF(COUNT(DISTINCT customer_id), 0) AS orders_per_customer FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 AND EXTRACT(MONTH FROM order_date) = 12
Change in the average number of orders per customer per month over the last year:
WITH t AS ( SELECT ( date_part('year', order_date) * 12 + date_part('month', order_date) ) AS year_month, order_id, customer_id FROM superstore_orders ) SELECT year_month, COUNT(DISTINCT order_id) * 1.0 / NULLIF(COUNT(DISTINCT customer_id), 0) AS orders_per_customer FROM t GROUP BY year_month ORDER BY year_month DESC LIMIT 12
Order Characteristics:
Unique items per order distribution 1Y:
SELECT order_id, COUNT(DISTINCT product_id) AS uniqueitems FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 GROUP BY order_id
The average number of unique items per order per month for the last month:
WITH t AS ( SELECT order_id, COUNT(DISTINCT product_id) as product_count, date_part('year', order_date) * 12 + date_part('month', order_date) as year_month FROM superstore_orders GROUP BY order_id, year_month ) SELECT year_month, COUNT(order_id) as order_count, SUM(product_count) AS product_count, SUM(product_count) / COUNT(order_id) as products_per_order FROM t WHERE year_month = (SELECT MAX(year_month) FROM t) GROUP BY year_month
Change in the average number of unique items per order over the last year:
WITH t AS ( SELECT order_id, COUNT(DISTINCT product_id) as product_count, date_part('year', order_date) * 12 + date_part('month', order_date) as year_month FROM superstore_orders GROUP BY order_id, year_month ) SELECT year_month, COUNT(order_id) as order_count, SUM(product_count) AS product_count, SUM(product_count) / COUNT(order_id) as products_per_order FROM t GROUP BY year_month order by year_month DESC LIMIT 12
Average order size for the last month:
SELECT AVG(sales) FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 AND EXTRACT(MONTH FROM order_date) = 12
Changes in average order size over the last year:
WITH t AS ( SELECT ( date_part('year', order_date) * 12 + date_part('month', order_date) ) AS year_month, sales FROM superstore_orders ) SELECT year_month, AVG(sales) as avg_sales FROM t GROUP BY year_month ORDER BY year_month DESC LIMIT 12
Average order size by market 1Y (you may opt for any other important characteristics):
SELECT market, AVG(sales) FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date) = 2014 GROUP BY market
The above visualizations give some insight into the business. We may see a great number of customers buy only one product per order.
- How to improve cross-selling in the sales process?
Item Popularity:
It can be helpful to find out the most common item found in a one-item order. (As we don’t have such orders in the dataset, we go for two-item orders).
WITH orders AS ( SELECT * FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 ), selected_orders AS ( SELECT order_id, SUM(quantity) AS total_quantity FROM orders GROUP BY order_id HAVING SUM(quantity) = 2 ), selected_products AS ( SELECT product_name, SUM(quantity) AS quantity FROM orders WHERE order_id IN (SELECT order_id FROM selected_orders) GROUP BY product_name ) SELECT * FROM selected_products ORDER BY quantity DESC LIMIT 3
What is the most common item found in a multi-item order?
WITH orders AS ( SELECT * FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 ), selected_orders AS ( SELECT order_id, SUM(quantity) AS total_quantity FROM orders GROUP BY order_id HAVING SUM(quantity) > 2 ), selected_products AS ( SELECT product_name, SUM(quantity) AS quantity FROM orders WHERE order_id IN (SELECT order_id FROM selected_orders) GROUP BY product_name ) SELECT * FROM selected_products ORDER BY quantity DESC LIMIT 3
And the most common item found among customers who repeated purchasers:
WITH orders AS ( SELECT * FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 ), selected_customers AS ( SELECT customer_id, COUNT(order_id) AS total FROM orders GROUP BY customer_id HAVING COUNT(order_id) > 10 ), selected_products AS ( SELECT product_name, SUM(quantity) AS quantity FROM orders WHERE customer_id IN (SELECT customer_id FROM selected_customers) GROUP BY product_name ) SELECT * FROM selected_products ORDER BY quantity DESC LIMIT 3
These visualizations may promote ideas for growing customer relationships.
Store:
For the offline stores, it might be good to check how the popularity changes across the different stores. Different stores may have different sales models due to local management decisions, local advertising, demographics, regional trends, and even the region’s climate.
Time
Time is one of the most important attributes of market basket data. It may point to the exact marketing conditions at the time of the sale (promotion, special campaign, PR event, etc.)
Numerous questions about customer behavior can be answered by looking at sales over time.
- What is the best-selling item in this quarter? What about the previous quarter?
- How did a particular event affect the sales?
Association Analysis
When building association rules it is important to choose the proper item set. Figure 8 demonstrates partial product hierarchies in the supermarket. You may see how the level of detail increases from top to bottom.
For the stakeholders of a chain of ice cream parlors, it might be interesting not only the combination of the ice cream flavors, but also toppings as well as types: such as cup, stick, cone, etc., and even whether the ice cream eaten in the ice cream parlor or taken away.
But when analyzing a large supermarket, probably the initial level of detail for the Ice Cream will be Frozen Desserts. Because as the number of items used in the analysis increases, the number of combinations to consider will grow very quickly.
After the initial analysis, we can repeat rule generation to hone in on more specific items. And Because the analysis focuses on more specific items, we use only the subset of the transactions that contain those items.
Market basket analysis works better when the items appear in approximately the same number of transactions. This prevents rules from being dominated by the most common items. That is why we may roll up rare items to higher levels in the hierarchy, so they become more frequent. Such generalized items also help find rules with sufficient support. There are more transactions supporting a given rule at higher levels of the hierarchy.
But because some items are generalized does not mean that all items must be moved to the same level up. For instance, in the example shown in Figure 8, we have chosen to generalize all the types of ice creams as Frozen Desserts for the analysis done for a large supermarket, however, expensive items in the same supermarket, such as home appliances, may remain at a low level.
- What is more important: brand, size, or flavor of tea?
- Will we consider different volumes of shampoo as one product?
The dataset that we are using to build our dashboard contains three levels of product hierarchies:
For the purposes of the dashboard that is designed to work in real-time, we choose sub_category,
which hopefully, will be able to find some useful rules, or at least, promote the ideas that you can confirm with more deep analysis.
So, we select the following transactions:
SELECT order_id AS order, sub_category AS items, quantity FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014
There were almost 30,000 transactions in 2014.
We may group these transactions to see which items are purchased together.
SELECT order_id AS orders, string_agg(DISTINCT sub_category, ', ') AS items, SUM(quantity) FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date)= 2014 GROUP BY order_id
Even in this fragment of the data, you may see that items “Art” and “Binders” are often sold together.
We may count how many times each item was bought.
WITH orders AS ( SELECT order_id, sub_category AS item FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date) = 2014 ), total_orders AS ( SELECT COUNT(DISTINCT order_id) AS value FROM orders ), item_orders AS ( SELECT item, COUNT(*) AS order_count, (SELECT value FROM total_orders) AS total_order_count FROM orders GROUP BY item ) SELECT * FROM item_orders ORDER BY order_count DESC LIMIT 100
Let’s associate items that are bought together and then count the number of such transactions.
WITH orders AS ( SELECT order_id, sub_category AS item FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date) = 2014 ), total_orders AS ( SELECT COUNT(DISTINCT order_id) AS value FROM orders ), associated_orders AS ( SELECT o1.item AS item1, o2.item AS item2, COUNT(*) AS order_count, (SELECT value FROM total_orders) AS total_order_count FROM orders AS o1 INNER JOIN orders AS o2 ON o1.order_id = o2.order_id AND o1.item < o2.item GROUP BY item1, item2 ) SELECT * FROM associated_orders ORDER BY order_count DESC LIMIT 150
As we can see in Figure 13, there are 136 combinations that might be considered. Such a combination of items is called an item set. We do not necessarily consider the quantity of the items in a particular transaction. Item sets are interesting themselves.
- Why are those two items purchased together?
Since we analyze online sales where customers purchase relatively few items at one time, two-item sets might be sufficient. We will not waste our efforts searching for the rules with three, four, or more items. However, in some cases, such as with supermarkets, where the average transaction is much larger, more complex rules will be useful.
“Art” and “Binders” were bought together 1,688 times, and are more likely to be purchased together than any other two items. The combination of “Art” and “Fasteners” appears only 680 times. Does it mean that the rule “If Art, then Binders” is better than “If Art, then Fasteners”? We may learn from Figure 12 that the total sales of the Fasteners are only 1588, while the total sales of the “Binders” are 4048.
So, how good is a particular rule?
To find this out we use the following methods: support, confidence, and lift.
Support
The most common way to measure support is to calculate the proportion of transactions that contain all the items in the rule.
For the rule “If Art, then Binders” the Support = Number of transactions when “Art” and “Binders” are bought together / Total number of transactions.
1688 / 7794= 0.2166 or 21.65%
Let’s check Support for all the combinations:
WITH orders AS ( SELECT order_id, sub_category AS item FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date) = 2014 ), total_orders AS ( SELECT COUNT(DISTINCT order_id) AS value FROM orders ), associated_orders AS ( SELECT o1.item AS item1, o2.item AS item2, COUNT(*) AS order_count, (SELECT value FROM total_orders) AS total_order_count FROM orders AS o1 INNER JOIN orders AS o2 ON o1.order_id = o2.order_id AND o1.item < o2.item GROUP BY item1, item2 ), associated_support AS ( SELECT item1, item2, order_count, total_order_count, ROUND((order_count::numeric / total_order_count), 5) AS support FROM associated_orders ) SELECT * FROM associated_support ORDER BY support DESC LIMIT 100
Confidence
Confidence helps to understand how good a rule is at predicting what is on the right-hand side.
It is the ratio of the number of transactions with all the items in the rule to the number of transactions with just the items on the left-hand side. Alternatively, we may count the ratio of all the items in the rule support to the support of the item on the left-hand side.
For the rule “If Art, then Binders” the Confidence = Support (Number of transactions when “Art” and “Binders” are bought together) / Support (Number of transactions when “Art” is bought).
0.21658 / 0.41827 = 0.51782 or 51.78%
It means that, when “Art” appears in an order, there is a 51.78% chance that “Binders” also appears in it.
Let’s check Confidence for all the combinations:
WITH orders AS ( SELECT order_id, sub_category AS item FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date) = 2014 ), total_orders AS ( SELECT COUNT(DISTINCT order_id) AS value FROM orders ), item_orders AS ( SELECT item, COUNT(*) AS order_count, (SELECT value FROM total_orders) AS total_order_count FROM orders GROUP BY item ), item_support AS ( SELECT item, order_count, total_order_count, ROUND((order_count::numeric / total_order_count), 5) AS support FROM item_orders ), associated_orders AS ( SELECT o1.item AS item1, o2.item AS item2, COUNT(*) AS order_count, (SELECT value FROM total_orders) AS total_order_count FROM orders AS o1 INNER JOIN orders AS o2 ON o1.order_id = o2.order_id AND o1.item < o2.item GROUP BY item1, item2 ), associated_support AS ( SELECT item1, item2, order_count, total_order_count, ROUND((order_count::numeric / total_order_count), 5) AS support FROM associated_orders ), associated_confidence AS ( SELECT tas.item1, tas.item2, tas.support AS associated_support, tis1.support AS item1_support, ROUND(tas.support / tis1.support, 5) as confidence FROM associated_support AS tas INNER JOIN item_support AS tis1 ON tis1.item = tas.item1 ) SELECT * FROM associated_confidence ORDER BY associated_support DESC LIMIT 100
Lift
Lift measures the power of the rule by comparing the full rule to randomly guessing the right-hand side.
If the item on the right-hand side already is very common, the rule is not telling us anything.
For the rule “If Art, then Binders” the Lift = Support (Number of transactions when “Art” and “Binders” are bought together) / (Support (Number of transactions when “Art” is bought) * Support (Number of transactions when “Binders” are bought)).
0.21658 / (0.41827 * 0.51937) = 0.99698
Lift greater than 1 means that the product on the left-hand side increases the chance of buying the product on the right-hand side.
When lift is less than 1, the rule is doing worse than informed guessing. So is true for the rule “If Art, then Binders”.
Let’s find out Lift for all the rules:
WITH orders AS ( SELECT order_id, sub_category AS item FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date) = 2014 ), total_orders AS ( SELECT COUNT(DISTINCT order_id) AS value FROM orders ), item_orders AS ( SELECT item, COUNT(*) AS order_count, (SELECT value FROM total_orders) AS total_order_count FROM orders GROUP BY item ), item_support AS ( SELECT item, order_count, total_order_count, ROUND((order_count::numeric / total_order_count), 5) AS support FROM item_orders ), associated_orders AS ( SELECT o1.item AS item1, o2.item AS item2, COUNT(*) AS order_count, (SELECT value FROM total_orders) AS total_order_count FROM orders AS o1 INNER JOIN orders AS o2 ON o1.order_id = o2.order_id AND o1.item < o2.item GROUP BY item1, item2 ), associated_support AS ( SELECT item1, item2, order_count, total_order_count, ROUND((order_count::numeric / total_order_count), 5) AS support FROM associated_orders ), associated_lift AS ( SELECT tas.item1, tas.item2, tas.support AS associated_support, tis1.support AS item1_support, tis2.support AS item2_support, ROUND(tas.support / (tis1.support * tis2.support), 5) as lift FROM associated_support AS tas INNER JOIN item_support AS tis1 ON tis1.item = tas.item1 INNER JOIN item_support AS tis2 ON tis2.item = tas.item2 ) SELECT * FROM associated_lift ORDER BY associated_support DESC LIMIT 100
For the dashboard I opted to display only the rules with support greater than 5%, confidence greater than 20% and lift greater than 1.
WITH orders AS ( SELECT order_id, sub_category AS item FROM superstore_orders WHERE EXTRACT(YEAR FROM order_date) = 2014 ), total_orders AS ( SELECT COUNT(DISTINCT order_id) AS value FROM orders ), item_orders AS ( SELECT item, COUNT(*) AS order_count, (SELECT value FROM total_orders) AS total_order_count FROM orders GROUP BY item ), item_support AS ( SELECT item, order_count, total_order_count, ROUND((order_count::numeric / total_order_count), 5) AS support FROM item_orders ), associated_orders AS ( SELECT o1.item AS item1, o2.item AS item2, COUNT(*) AS order_count, (SELECT value FROM total_orders) AS total_order_count FROM orders AS o1 INNER JOIN orders AS o2 ON o1.order_id = o2.order_id AND o1.item < o2.item GROUP BY item1, item2 ), associated_support AS ( SELECT item1, item2, order_count, total_order_count, ROUND((order_count::numeric / total_order_count), 5) AS support FROM associated_orders ), associated_confidence AS ( SELECT tas.item1, tas.item2, tas.support AS associated_support, tis1.support AS item1_support, ROUND(tas.support / tis1.support, 5) as confidence FROM associated_support AS tas INNER JOIN item_support AS tis1 ON tis1.item = tas.item1 ), associated_lift AS ( SELECT tas.item1, tas.item2, tas.support AS associated_support, tis1.support AS item1_support, tis2.support AS item2_support, ROUND(tas.support / (tis1.support * tis2.support), 5) as lift FROM associated_support AS tas INNER JOIN item_support AS tis1 ON tis1.item = tas.item1 INNER JOIN item_support AS tis2 ON tis2.item = tas.item2 ), summary AS ( SELECT tao.item1, tao.item2, tao.order_count, tas.support AS support, tas.support * 100 AS support_pct, tac.confidence AS confidence, tal.lift AS lift FROM associated_orders AS tao INNER JOIN associated_support AS tas ON tas.item1 = tao.item1 AND tas.item2 = tao.item2 INNER JOIN associated_confidence AS tac ON tac.item1 = tao.item1 AND tac.item2 = tao.item2 INNER JOIN associated_lift AS tal ON tal.item1 = tao.item1 AND tal.item2 = tao.item2 ) SELECT concat(item1, ' + ', item2) AS items, * FROM summary WHERE support > 0.05 AND confidence > 0.2 AND lift > 1 ORDER BY order_count DESC LIMIT 100
Good and bad findings
Association analysis generates association rules automatically. It is a type of undirected data mining that finds patterns in the data where the target is not specified beforehand. Therefore it is our task to decide whether the patterns make sense.
Good findings are actionable. They help make decisions, such as creating promotional discounts on certain product bundles to improve cross-selling.
Beware of trivial and inexplicable rules.
Trivial rules reproduce common knowledge about the business.
- The customer who has bought an extended warranty is likely to buy an expensive household appliance.
Inexplicable rules may not have explanations and recommendations for action. They just don’t make sense.
- The bookstore’s most popular product among American Express credit card shoppers is Barack Obama’s Promised Land.
Conclusion
In this article, we have briefly discussed market basket analysis and association rules.
We went through the process of analyzing data at three levels: an item, an order, and a customer who, if identified, connects all the orders together.
Also, we have described how to select the right level of detail in product hierarchies, and how many items to include in the association rule.
We ran an association analysis with SQL and described how to evaluate the rules using such measures as support, confidence, and lift.
And finally, we built a visual dashboard that represents our findings.
Hope this was helpful!