Project Data Analysis for B2B Retail: Customer Analytics Report

Mentor: Trisna Yulia Junita — Data Scientist at PT. BUMA

Nilda Nurmala
6 min readJan 14, 2021

My third project after completing a course in DQLab about fundamental SQL. I have completed the modules of fundamental SQL using the SELECT, FUNCTION, GROUP BY, INNER JOIN, UNION and Having. This project is provided by DQLab and uses MariaDB.

In this project, xyz.com will review the company’s performance during the last quarter. A data analyst is asked to provide data and analysis regarding the current condition of the company.

List of Tables

The tables that will be used in this project are:

Figure 1. List of tables
  • orders_1 table: Contains data related to sales transactions for the 1st quarter (Jan — Mar 2004)
  • orders_2 table: Contains data related to sales transactions for the 2nd quarter (Apr — Jun 2004)
  • customer table: Contains profile data of consumers who have registered as xyz.com customers

Table Details

Before starting to compile an SQL query and make an analysis of the query results, the first thing to do is familiar with the table that will be used to determine which column is related to the problem to be analyzed and which data manipulation process needs to be done for these columns.

SELECT * FROM orders_1 LIMIT 5;
Figure 2. Output result for orders_1 table
SELECT * FROM orders_2 LIMIT 5;
Figure 3. Output result for orders_2 table
SELECT * FROM customer LIMIT 5;
Figure 4. Output result for customer table

Syntax

The syntax used in this project:
• SELECT is used to select data from the database
• DISTINCT is used to select only distinct (different) values or no duplicate values
• WHERE is used to filter data or extract only data that fulfill a specified condition
• ORDER BY is used to sort the result set in ascending or descending order
• COUNT is used to count the number of rows
• SUM is used to sum of a numeric columns
• BETWEEN is used to selects values within a given range
• AS is used to give temporary names to a table or a column in the table
• QUARTER is used to returns the quarter of the year for a given date value
• UNION is used to combine the result-set of two or more SELECT statements vertically
• GROUP BY is used to group rows that have the same value into summary rows
• DESC is used to sort data in descending order

Sales and Revenue Growth

SQL command to get total sales and revenue for 1st quarter (Jan, Feb, Mar) and 2nd quarter (Apr, May, Jun):

SELECT quarter,
SUM(quantity) AS total_penjualan,
SUM(quantity*priceeach) AS revenue
FROM (
SELECT
orderNumber,
status,
quantity,
priceeach,
‘1’ AS quarter
FROM orders_1
WHERE status = ‘Shipped’
UNION
SELECT
orderNumber,
status,
quantity,
priceeach,
‘2’ AS quarter
FROM orders_2
WHERE status = ‘Shipped’
) AS tabel_a
GROUP BY quarter;
Figure 5. Output result for total sales and revenue

Based on Figure 5, the company’s performance decreased from 1st quarter to 2nd quarter where total sales decreased 22% and revenue decreased 24%.

Sales and Revenue by Product Category

SQL command to get total sales and revenue by product category up to 2nd quarter:

SELECT *
FROM (
SELECT
categoryID,
COUNT(DISTINCT orderNumber) AS total_order,
SUM(quantity) AS total_penjualan
FROM (
SELECT
productCode,
orderNumber,
quantity,
status,
LEFT(productCode,3) AS categoryID
FROM orders_1
WHERE status = “Shipped”
UNION
SELECT
productCode,
orderNumber,
quantity,
status,
LEFT(productCode,3) AS categoryID
FROM orders_2
WHERE status = “Shipped”
) AS tabel_c
GROUP BY categoryID) tabel_d
ORDER BY total_order DESC;
Figure 6. Output result for total sales and revenue by product category

Based on Figure 6, the most ordered products are S18 category products which were ordered 49 times with 5290 items sold and S24 category products ordered 45 times with 4082 items sold. So, these two products contribute around 48% of total orders and 60% of total sales.

Customer Analytics

SQL command to get the number of registered consumers in 1st quarter (Jan, Feb, Mar) and 2nd quarter (Apr, May, Jun):

SELECT quarter, 
COUNT(DISTINCT customerID) AS total_customers
FROM (
SELECT
customerID,
createDate,
QUARTER(createDate) AS quarter
FROM customer
WHERE createDate BETWEEN ‘2004–01–01’ AND ‘2004–06–30’
) AS tabel_b
GROUP BY quarter;
Figure 7. Output result for the number of registered consumers

Based on Figure 7, the number of new consumers has decreased from 1st quarter to 2nd quarter.

Customer Transactions

SQL command to get the number of registered customers in 1st quarter (Jan, Feb, Mar) and 2nd quarter (Apr, May, Jun) who have made transaction:

SELECT quarter, 
COUNT(DISTINCT customerID) AS total_customers
FROM (
SELECT customerID,
createDate,
QUARTER(createDate) AS quarter
FROM customer
WHERE createDate BETWEEN ‘2004–01–01’ AND ‘2004–06–30’
) AS tabel_b
WHERE customerID IN (
SELECT DISTINCT customerID
FROM orders_1
UNION
SELECT DISTINCT customerID
FROM orders_2
)
GROUP BY quarter;
Figure 8. Output result for the number of registered customer who made transaction

Based on Figure 8, the number of customers who have made transactions has decreased from 1st quarter to 2nd quarter. In 1st quarter, customers who have made transactions are around 58% of total customers registered in 1st quarter and in 2nd quarter, customers who have made transactions are about 54% of total registered customers in 2nd quarter. The total number of new customers who have ordered is only about 56%.

Retention Rate Konsumen

SQL command to get customer retention rate in 2nd quarter (Apr, May, Jun):

#Menghitung total unik customers yang transaksi di quarter_1
SELECT COUNT(DISTINCT customerID) as total_customers FROM orders_1;
#output = 25
SELECT “1” AS quarter,
(COUNT(DISTINCT customerID)/25)*100 AS Q2
FROM orders_1
WHERE customerID IN (
SELECT DISTINCT customerID AS Q2
FROM orders_2
);
Figure 9. Output result for customer retention rate

Based on Figure 9, customers who have made transactions in 1st quarter and returned to transactions in 2nd quarter are 24%.

Conclusion

Based on the data from SQL queries, it can be concluded that:

  1. The performance of xyz.com decreased significantly in the 2nd quarter, as seen from the sales and revenue which decreased by 20% and 24%
  2. The number of new customers is also not too good and slightly decreased compared to the previous quarter
  3. Products in the S18 and S24 categories contribute around 48% of total orders and 60% of total sales, so xyz.com should focus on product development for the S18 and S24 categories
  4. The interest of new customers to shop at xyz.com is still lacking, only about 56% have made transactions. It is suggested that the product team need to study on consumer behavior and make product improvements, so that the conversion rate (register to transaction) can increase
  5. Xyz.com’s customer retention rate is also low, only 24%, which means that many customers who have made transactions in 1st quarter do not repeat order in 2nd quarter (no repeat orders)
  6. The company have negative growth in the second quarter and if it wants to achieve the target and positive growth in the third quarter, it needs to make a lot of improvements both in terms of products and in the marketing business. Low retention rates and conversion rates can be an initial diagnosis that consumers are not interested/dissatisfied/disappointed in shopping at xyz.com.

--

--