Project Data Analyst for Retail: Sales Performance Report by DQLab

Mentor: Nelda Ampulembang Parenta — Senior Data Analyst at Logisly

Nilda Nurmala
7 min readDec 17, 2020

Another project I worked on after completing a course in DQLab about fundamental SQL. I have completed the modules of fundamental SQL use SELECT statement, FUNCTION, GROUP BY, INNER JOIN, UNION and Having. This project is provided by DQLab and uses MariaDB.

1.1 Dataset Brief

The dataset used contains transactions from 2009 to 2012 with a total of raw data is 5500, including order status which is divided into order finished, order returned and order cancelled.

The dataset that will be used in this project contains the following data:
1. OrderID
2. Order Status
3. Customer
4. Order Date
5. Order Quantity
6. Sales
7. Discount %
8. Discount
9. Product Category
10. Product Sub-Category

The table name used in this project is dqlab_sales_store.

1.2 Project Instructions

DQLab store management wants to know:
1. Overall performance from DQLab Store
a. Overall performance from DQLab Store for the number of orders and total sales orders finished
b. Overall performance from DQLab Store by product subcategory to compare between 2011 and 2012
2. Promotion Effectiveness and Efficiency
a. Calculate the burn rate of promotions by year
b. Calculate the burn rate of promotions by product subcategory
3. Customer Analyst
a. Analyze the number of customers who make transactions by year
b. Analyze the number of new customers who make transactions by year

2 Syntax

The syntax used in this project:
• SELECT is used to select data from the database
• DISTINCT is used to select only distinct (different) values or no duplicate values
• WHERE is used to filter data or extract only data that fulfill a specified condition
• AND is used to filter data based on more than one condition and displays a data if all conditions are fulfilled
• ORDER BY is used to sort the result set in ascending or descending order
• COUNT is used to count the number of rows
• SUM is used to sum of a numeric columns
• MIN is used to select the smallest value of the selected column
• AS is used to give temporary names to a table or a column in the table
• YEAR is used to filter years in dates
• GROUP BY is used to group rows that have the same value into summary rows
• ROUND is used to return the rounded value of a decimal value
• DESC is used to sort data in descending order

3 DQLab Store Overall Performance
In this section we will analyze the overall performance of the DQlab store.

3.1 Overall Performance by Year

SQL command to get total sales (sales) and number of orders (number_of_order) from 2009 to 2012 (years).

SELECT 
YEAR(order_date) AS years,
SUM(sales) AS sales,
COUNT(order_quantity) AS number_of_order
FROM dqlab_sales_store
WHERE order_status = 'Order Finished'
GROUP BY YEAR(order_date);
Figure 1. Output Results for Overall Performance by Year

Based on Figure 1, total sales each year tends to increase, but in 2012 total sales were not greater than total sales in 2009 and number of orders finished each year tended to increase but not significantly.

3.2 Overall Performance by Product Subcategory

SQL command to get total sales by product subcategory (product_sub_category) in 2011 and 2012 (years).

SELECT 
YEAR(order_date) AS years,
product_sub_category,
SUM(sales) AS sales
FROM dqlab_sales_store
WHERE order_status = ‘Order Finished’
AND YEAR(order_date) > 2010
GROUP BY years, product_sub_category
ORDER BY years, sales DESC;
Figure 2. Output Results for Overall Performance by Product Subcategory in 2011–2012

Based on Figure 2, there is an increase and decrease in total sales for each product subcategory in one year. The highest total sales in 2011 and 2012 were still dominated by product subcategories that tended to be the same, it is chairs and chairmats, office machines, and tables, while the lowest total sales are labels, scissors, rulers and trimmers, and rubber bands.

4 DQLab Store Promotion Effectiveness and Efficiency
In this section we will analyze the effectiveness and efficiency of the promotions that have been carried out so far. The effectiveness and efficiency of the promotions will be analyzed based on the burn rate by comparing the total value of the promotions to the total sales earned. DQLab hopes that the burn rate remains at the maximum 4.5%.

Burn rate = (total discount/total sales) x 100

4. 1 Promotion Effectiveness and Efficiency by Years

SQL command to get total sales (sales), total discount (promotion_value) and percentage burn rate (burn_rate_percentage) by year (years).

SELECT 
YEAR(order_date) AS years,
SUM(sales) as sales,
SUM(discount_value) AS promotion_value,
ROUND((SUM(discount_value)/SUM(sales))*100,2) AS burn_rate_percentage
FROM dqlab_sales_store
WHERE order_status = ‘Order Finished’
GROUP BY years;
Figure 3. Output Results for Promotion Effectiveness and Efficiency by Year

Based on Figure 3, every year the percentage of burn rate tends to increase, which means that promotion costs continue to increase every year. The impact of increased promotion costs is to increase in total sales, but the promotion that has increased sales cannot be said to be effective and efficient because the percentage of burn rate exceeds the maximum expected rate at 4.5%.

4.2 Promotion Effectiveness and Efficiency by Product Subcategory

SQL command to get total sales (sales), total discount (promotion_value) and percentage burn rate (burn_rate_percentage) by product subcategory and in 2012.

SELECT 
YEAR(order_date) as years,
product_sub_category,
product_category,
SUM(sales) AS sales,
SUM(discount_value) AS promotion_value,
ROUND((SUM(discount_value)/SUM(sales))*100,2) AS burn_rate_percentage
FROM dqlab_sales_store
WHERE order_status = 'Order Finished'
AND YEAR(order_date) = 2012
GROUP BY
years,
product_sub_category,
product_category
ORDER by sales DESC;
Figure 4. Output Results for Promotion Effectiveness and Efficiency by Product Subcategories in 2012

Based on Figure 4, in 2012 there were only a few product subcategories with a burn rate percentage lower than 4.5%. Meanwhile, several product subcategories with a burn rate greater than 4.5% had low sales rates.

5 Customer Analyst
In this section, we will analyze DQLab store customers.

5.1 Customers Per Year

SQL command to get the number of customers (number_of_customer) who make transactions by year (years).

SELECT 
YEAR(order_date) AS years,
COUNT(DISTINCT customer) AS number_of_customer
FROM dqlab_sales_store
WHERE order_status = ‘Order Finished’
GROUP BY years
ORDER BY years;
Figure 5. Output Results for Consumers Who Make Transactions by Year

Based on Figure 5, every year the number of customers who make transactions tends to increase but not significantly.

5.2 New Customers Per Year

SQL command to get the number of new customers (number_of_new_customer) who make transactions by year (years).

SELECT YEAR(date_first_transaction) AS years,
COUNT(DISTINCT customer) AS number_of_new_customer
FROM (
SELECT customer,
MIN(order_date) AS date_first_transaction
FROM dqlab_sales_store
WHERE order_status = ‘Order Finished’
GROUP BY customer) AS table_a
GROUP BY years
ORDER BY years;
Figure 6. Output Results New Consumers Who Make Transactions by Year

Based on Figure 6, the number of new customers who make transactions each year decreases. This indicates that the sales received so far have come from existing customers.

6 Discussion

The performance of DQlab stores tends to increase every year, but not significantly. This can be seen from the high number of finished orders (number_of_order) (Figure 2) which does not determine the high total sales, as seen in 2010 where the number of finished orders was higher than in 2011 but the total sales were lower than in 2011. This can be due to differences in sales based on product subcategories.

Based on Table 1, sales growth based on product subcategories has fluctuated every year. In 2010, sales based on product subcategories decreased from the previous year. In the next year, sales based on product subcategories began to increase. And in 2012, sales based on product subcategories continued to increase, there were only a few product subcategories that sales decreased. The change in sales may occur due to the effect of the promotion.

Table 1. Table of Sales Growth by Product Subcategories

The effect of the promotion can be seen from the percentage of burn rate. The expected maximum percentage of burn rate is 4.5%, but every year the percentage of burn rate is greater than 4.5% (Figure 6). It can be said that the promotion is still less effective and efficient. However, if the percentage of burn rate is connected to sales growth (Table 1), it can be seen that promotion has a positive impact, that is increasing total sales. So it can be said, promotion still less effective and efficient only in a few product subcategories.

Besides promotion, other factors that can affect sales are from the customer side. The number of customers who make transactions annually tends to be stable (Figure 5), while the number of new customers who make transactions from year to year is decreasing (Figure 6). This indicates that existing customers make more transactions, which causes store sales to be stagnant.

7 Conclusion

DQLab store during 2009 to 2012 had performance that increased every year but not too significantly. Overall, promotion activities are quite good but still not effective and efficient based on product subcategories. In addition, there are more existing customers who make transactions than new customers, so store performance tends to be stagnant.

--

--