Custom Search
 


How to fill down empty cells with values from a previous non-empty row


It's a common data transformation task that you need to fill blank cells with values from a previous non-empty row. To show how to tackle this task, we have created a simple example below to visualise it.

Here is the table before filling empty cells.

After filling empty cells, the table looks like this.

This seems to be an extremely easy task by hand, however, to accomplish it by using SQL is not so straightforward. After completing this task, the main SQL skills you'll be able to take away from this tutorial are how to use GROUP BY clause, MAX function, and self table JOINs (join a table with itself).

First, we need to create our sales_data table and then populate it with some data.

-- Create the table
CREATE TABLE sales_data
(
	ID int(10) unsigned NOT NULL,
    CustomerID varchar(5) NULL,
    CompanyName varchar(40) NULL,
    MonthYear varchar(10) NULL,
    Sales double unsigned NULL  
);

Insert data to the table.

-- Insert data to the table
insert into sales_data values(1,'QUICK','QUICK Cozinha','Oct-2017',1208.32),
(2,'EASTC','Eastern Connection','Jan-2018',320.99),
(3,'ISLAT','Island Trading','Nov-2016',403.20)
,(4,'','','Apr-2016',749.06)
,(5,'','','Feb-2018',660.56)
,(6,'CONSH','Consolidated Holdings','Oct-2017',1640.11)
,(7,'','','Nov-2016',480.37)
,(8,'','','Feb-2018',547.20)
,(9,'','','Dev-2016',599.22)
,(10,'','','May-2017',1668.87);

To make our illustration simple enough to understand, we divide it into 6 steps.

Step 1:
Create a table by extracting sales data where CustomerID is empty.

create table sales_data_step1
as
select * from sales_data where CustomerID = '';

Result:

Step 2:
Create a table by extracting sales data where CustomerID is NOT empty.

create table sales_data_step2
as
select * from sales_data where CustomerID <> '';

Result:

Step 3:
Create a table by joining data from Step 1 and Step 2 where ID from Step 1 is greater than ID from Step 2.

create table sales_data_step3
as
select t1.ID as ID1, t2.ID as ID2, t1.CustomerID, t1.CompanyName, t1.MonthYear, t1.Sales
from sales_data_step1 as t1
join sales_data_step2 as t2 on t1.ID > t2.ID
order by t1.ID, t2.ID desc;

Note that the result has been sorted by t1.ID in ascending order and t2.ID in descending order. This has effectively listed the ID of non-empty row as the first record (see data in ID2 column) within each group of ID1. For example, for all records where ID1 equals to 4, the max ID2 is 3. Thus in Step 4 below, ID 3 is the row ID for filling the empty cells for record where ID = 4 (and also ID = 5).

Result:

Step 4:
Create a table by self joining table dada from Step 3 and pick max ID2 from the second table for each group of IDs in the first table. The max IDs are the row to be filled down for blanks.

create table sales_data_step4
as
select t1.ID1 as ID, max(t2.ID2) as MaxID, t1.MonthYear, t1.Sales
from sales_data_step3 as t1
join sales_data_step3 as t2 on t1.ID1 = t2.ID1
group by t1.ID1;

Records in column MaxID are the IDs we are going to extra from sales_data to fill the empty rows where their IDs are in ID column.

Result:

Step 5:
Create a table by filling blanks with values from the previous non-blank row.

create table sales_data_step5
as
select y.ID, x.CustomerID, x.CompanyName, y.MonthYear, y.Sales 
from sales_data as x
join sales_data_step4 as y on x.ID = y.MaxID;

Blank cells are filled by previous non-blank cells.

Result:

Step 6:
Vertically append the non-blank rows and the blank filled rows together.

select * from sales_data where CustomerID <> ''
union
select * from sales_data_step5
order by ID;

Result:

We can consolidate the 6 steps into one complex SQL script below.

select * from sales_data where CustomerID <> ''
union
select y.ID, x.CustomerID, x.CompanyName, y.MonthYear, y.Sales 
from sales_data as x
join
(
	select t1.ID,max(t2.ID) as MaxID, t1.MonthYear, t1.Sales
	from (select * from sales_data where CustomerID = '') as t1
	join (select * from sales_data where CustomerID <> '') as t2 on t1.ID > t2.ID
	group by t1.ID
) as y on x.ID = y.MaxID
order by ID;

Happy Coding!



Other tutorials in this category

1. MySQL Northwind Queries - Part 1

2. MySQL Northwind Queries - Part 2

3. MySQL Northwind Queries - Part 3

4. How to Work with Two Unrelated Values

5. How to Fill Gaps in Sales Data

6. How to Calculate Totals, Subtotals and Grand Total

7. How to Work with NULL Values

8. Use RANK function to update a previous record within a group or partition

9. Two ways to add a unique number or ID to each row

10. 3 ways to get Top N rows from MySQL

11. How to generate Cumulative Sum (running total) by MySQL - Part 1

12. How to generate Cumulative Sum (running total) by MySQL - Part 2

Back to Tutorial Index Page


Copyright © 2024 GeeksEngine.com. All Rights Reserved.

This website is hosted by HostGator.

No portion may be reproduced without my written permission. Software and hardware names mentioned on this site are registered trademarks of their respective companies. Should any right be infringed, it is totally unintentional. Drop me an email and I will promptly and gladly rectify it.

 
Home | Feedback | Terms of Use | Privacy Policy