Custom Search
 


How to generate Cumulative Sum (running total) by MySQL - Part 1


Cumulative sum, also known as running total or Year to Date (YTD) total, is a sum of data that is related to a time sequence such as daily, monthly, or yearly. But running total could also be calculated if the dataset does not have a time sequence related column.

For example, below is a list of daily sales and running total.

For calculating running total in SQL, at a high level, the concept includes (1) rank data within a GROUP, (2) RANK BY a column, and (3) calculate running total by using SUM.

For the daily sales illustrated above, the GROUP is the entire Northwind Trader company. The RANK BY column is OrderDate. SUM is applied to DailySales. To translate the calculation in plain language, we can say that we want to RANK daily total sales BY order date within Northwind Trader and calculate running total for each order date.

It may not look too obvious what the GROUP is. To better understand the concept, consider we have 2 Northwind Traders - Northwind A and Northwind B. So for each Northwind company, we want to RANK daily total sales BY order date within each Northwind Trader and calculate running total for each order date. Then the running total is calculated within each Northwind - Northwind A and Northwind B.

To generate a running total (or cumulative sum), please consider the following guidelines.

  1. Use self join to join the table (or base dataset) to itself.
  2. Find a naturally available RANK value column that sequentially increments over time. If such a value does not exist, use RANK(), DENSE_RANK(), or ROW_NUMBER() function to create a RANK value.
  3. The rank value must be unique within the GROUP.
  4. In the self join clause, the RANK value in the left table (t1) is always coded greater than or equal to the RANK value in the right table (t2), e.g. t1.RankValue >= t2.RankValue.
  5. The running total is calculated by summing the total column in the right table (t2).

Now let's use two practical examples to see how to generate running totals.

Practice 1: rank within a single GROUP which is the entire Northwind company

1.1 Create the basic dataset.

The result of the basic dataset is the table we want to calculate the running total for.

-- Basic set
select b.OrderID,
	b.OrderDate,
	round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Daily_Sales
from Order_Details as a
inner join Orders b on a.OrderID = b.OrderID
group by b.OrderDate
order by b.OrderDate;

480 records returned.

1.2 Generate running total by self join

To calculate running total, join the above query to itself. Please note that, the join column must be a column that can uniquely rank every record by a time sequence. Here we used t1.OrderDate >= t2.OrderDate. If we use t1.OrderID >= t2.OrderID, it'll produce the same result because OrderID is sequentially incremented as time goes by.

select DATE_FORMAT(t1.OrderDate, "%M %d %Y") as OrderDate,
	t1.Daily_Sales,
	sum(t2.Daily_Sales) as RunningTotal
from
(
	select b.OrderID,
		b.OrderDate,
		round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Daily_Sales
	from Order_Details as a
	inner join Orders b on a.OrderID = b.OrderID
    group by b.OrderDate
) as t1
inner join -- this is self join
(
	select b.OrderID,
		b.OrderDate,
		round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Daily_Sales
	from Order_Details as a
	inner join Orders b on a.OrderID = b.OrderID
    group by b.OrderDate
) as t2
on t1.OrderDate >= t2.OrderDate
-- this should also work 
-- on t1.OrderID >= t2.OrderID
group by t1.OrderDate
order by t1.OrderDate;

480 records returned.

Below is equivalent version as above.

To simplify the two queries above, we can create a physical table by using the basic dataset, then work out the running total from this new table.

-- Create a new table from the basic set.
create table NorthwindSales
as
select b.OrderID,
	b.OrderDate,
	round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Daily_Sales
from Order_Details as a
inner join Orders b on a.OrderID = b.OrderID
group by b.OrderDate
order by b.OrderDate;

-- Work out the running total from the new table.
select DATE_FORMAT(t1.OrderDate, "%M %d %Y") as OrderDate,
	t1.Daily_Sales,
	sum(t2.Daily_Sales) as RunningTotal
from NorthwindSales as t1
inner join NorthwindSales as t2
on t1.OrderDate>=t2.OrderDate
group by t1.OrderDate
order by t1.OrderDate;

Practice 2: rank within multiple GROUPs which are the product categories

2.1 Create the basic dataset.

-- Basic set
select c.CategoryName,
	year(b.OrderDate) as OrderYear,
	round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as YearlySales
from Order_Details as a
inner join Orders b on a.OrderID = b.OrderID
inner join Products as p on a.ProductID=p.ProductID
inner join Categories as c on c.CategoryID=p.CategoryID
group by c.CategoryName,year(b.OrderDate)
order by c.CategoryName, OrderYear;

24 records returned.

2.2 Generate running total

Within each product category, calculate running total for each order year. Please note that OrderYear is a naturally available rank value that orders YearlSales for each year, so we code it in the join clause like this t1.OrderYear >= t2.OrderYear. If we need to calculate a running total where a naturally available rank value does not exist, we'll have to use MySQL RANK(), DENSE_RANK(), or ROW_NUMBER() function to create the rank value.

select t1.CategoryName,
	t1.OrderYear,
	t1.Yearly_Sales,
	sum(t2.Yearly_Sales) as RunningTotal
from
(
	select c.CategoryName,
		year(b.OrderDate) as OrderYear,
		round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Yearly_Sales
	from Order_Details as a
	inner join Orders b on a.OrderID = b.OrderID
	inner join Products as p on a.ProductID=p.ProductID
	inner join Categories as c on c.CategoryID=p.CategoryID
	group by c.CategoryName,year(b.OrderDate)	
) as t1
inner join 
(
	select c.CategoryName,
		year(b.OrderDate) as OrderYear,
		round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Yearly_Sales
	from Order_Details as a
	inner join Orders b on a.OrderID = b.OrderID
	inner join Products as p on a.ProductID=p.ProductID
	inner join Categories as c on c.CategoryID=p.CategoryID
	group by c.CategoryName,year(b.OrderDate)
) as t2
on t1.CategoryName=t2.CategoryName and t1.OrderYear >= t2.OrderYear
group by t1.CategoryName,t1.OrderYear
order by t1.CategoryName,t1.OrderYear;

24 records returned.

Below is equivalent version as above.

-- This is equivalent version but create a new table for the basic dataset
create table NorwindSales2
as
select c.CategoryName,
	year(b.OrderDate) as OrderYear,
	round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as YearlySales
from Order_Details as a
inner join Orders b on a.OrderID = b.OrderID
inner join Products as p on a.ProductID=p.ProductID
inner join Categories as c on c.CategoryID=p.CategoryID
group by c.CategoryName,year(b.OrderDate)
order by c.CategoryName, OrderYear;

-- Calculate running total
select t1.CategoryName,
	t1.OrderYear,
	t1.YearlySales,
    	sum(t2.YearlySales) as RunningTotal
from NorwindSales2 as t1
inner join NorwindSales2 as t2
on t1.CategoryName=t2.CategoryName and t1.OrderYear >= t2.OrderYear
group by t1.CategoryName,t1.OrderYear
order by t1.CategoryName,t1.OrderYear;

Happy Coding!



Other tutorials in this category

1. MySQL Northwind Queries - Part 1

2. MySQL Northwind Queries - Part 2

3. MySQL Northwind Queries - Part 3

4. How to Work with Two Unrelated Values

5. How to Fill Gaps in Sales Data

6. How to Calculate Totals, Subtotals and Grand Total

7. How to Work with NULL Values

8. How to fill down empty cells with values from a previous non-empty row

9. Use RANK function to update a previous record within a group or partition

10. Two ways to add a unique number or ID to each row

11. 3 ways to get Top N rows from MySQL

12. How to generate Cumulative Sum (running total) by MySQL - Part 2

Back to Tutorial Index Page


Copyright © 2024 GeeksEngine.com. All Rights Reserved.

This website is hosted by HostGator.

No portion may be reproduced without my written permission. Software and hardware names mentioned on this site are registered trademarks of their respective companies. Should any right be infringed, it is totally unintentional. Drop me an email and I will promptly and gladly rectify it.

 
Home | Feedback | Terms of Use | Privacy Policy