Custom Search
 


Use RANK function to update a previous record within a group or partition


This tutorial is customized from a practical case where MySQL ranking function RANK(), DENSE_RANK(), or ROW_NUMBER() is used in a multi-steps solution to update a previous record within a group or partition.

The scenario:

Our Patient dimension table (Patient_Dim) stores and tracks current and historical patient data over time. ID is a surrogate key so it's auto assigned and unique. PatientID is the business key which is unique for each patient. ValidFrom and ValidTo track the range of date and time when the patient record is updated. When a patient's address and/or phone change, it automatically triggers that the current record is closed and a new record is created, so the new record becomes the current record where 9999-01-01 is saved to ValidTo column (9999-01-01 can be used as a flag for current record).

This mechanism should guarantee that the ValidTo and the next record's ValidFrom of the same patient should always be 1 second apart, but overtime some records may become corrupted and the two datetime values are not 1 second apart. Our task is to find those problematic records and fix them.

As usual, let's create our Patient_Dim and add some sample data.

CREATE TABLE Patient_Dim
(
  ID int unique NOT NULL,
  PatientID int NOT NULL,
  PatientName varchar(40) NOT NULL,
  Address varchar(100) NOT NULL,
  Phone varchar(24) NOT NULL,
  ValidFrom datetime NOT NULL,
  ValidTo datetime NOT NULL
);

-- Add sample data
insert into Patient_Dim values
(1, 178, 'Simon Anders', '1 Obere St', '030-0074321', '2011-04-03 12:04:20','2014-05-20 01:00:46'),
(2, 178, 'Simon Anders', '1 Obere St', '016295178', '2014-05-20 01:35:26','2015-11-24 01:22:27'),
(3, 178, 'Simon Anders', '25 Berliner', '016295178', '2015-11-24 02:02:21','9999-01-01 00:00:00'),
(4, 225, 'Roland Mendel', '5 Kirch St', '0452-076545', '2015-11-24 01:22:28','2016-05-13 00:26:30'),
(5, 225, 'Roland Mendel', '5 Kirch St', '584962557', '2016-05-13 00:26:31','9999-01-01 00:00:00'),
(6, 366, 'Mario Pontes', '22 Ludovico', '035-640230', '2017-06-15 03:55:39','2018-12-13 03:55:38'),
(7, 366, 'Mario Pontes', '8 Joe Rd', '2967542', '2018-12-13 03:58:17','9999-01-01 00:00:00');

Here is what our patient data look like. As you can see, for patient Simon Andres' record where ID=1, its ValidTo should be 2014-05-20 01:35:25 which is 1 second before the ValidFrom of his next record, but the value currently is 2014-05-20 01:00:46 in the table.

ValidTo and ValidFrom is not 1 second apart.

Our task is to fix the values in ValidTo column to make it 1 second less than the ValidFrom of the next record (identified by ID + 1) for the same patient.

Below is the script to do the fix.

-- Step 1: create a temporary table to hold the fixed data.
create table temp_ValidTo_Fixed
as    
select y.ID,
        x.PatientID,
        x.PatientName,
        y.ValidTo as Old_ValidTo,
        date_add(x.ValidFrom, INTERVAL -1 SECOND) as New_ValidTo
from
(
    select ID,
        PatientID,
        PatientName,
        ValidFrom,
        RANK() OVER (PARTITION BY PatientID ORDER BY ID) as rank_val
    from Patient_Dim
) as x
join 
(
    select ID,
        PatientID,
        PatientName,
        ValidTo,
        RANK() OVER (PARTITION BY PatientID ORDER BY ID)+1 as rank_val
    from Patient_Dim
    where year(ValidTo) != 9999
) as y on x.PatientID=y.PatientID and x.rank_val=y.rank_val
where x.ValidFrom - y.ValidTo != 1;

-- Step 2: fix incorrect ValidTo by using cross table update from the temporary table.
update Patient_Dim as a
inner join temp_ValidTo_Fixed as b on b.ID = a.ID
set a.ValidTo = b.New_ValidTo;

Now let's analyze the script.

In Step 1, there are 2 subqueries in the FROM clause. The first sub query assigns a rank value to each patient based on their ID.

-- First subquery
select ID, 
    PatientID, 
    PatientName,         
    ValidFrom,
    RANK() OVER (PARTITION BY PatientID ORDER BY ID) as rank_val
from Patient_Dim;

7 records returned:

The second subquery does this: (1) selects the records where the year part of ValidTo is not 9999 because these are the records we'd like to check for their value in ValidTo column; (2) assigns a rank value plus 1 to each patient based on their ID.

-- Second subquery
select ID, 
	PatientID, 
	PatientName, 
	ValidTo,
	RANK() OVER (PARTITION BY PatientID ORDER BY ID) + 1 as rank_val
from Patient_Dim
where year(ValidTo) != 9999;

4 records returned:

The last task we do in Step 1 is to join the two subqueries based on PatientID and rank value for records where the ValidFrom and ValidTo is not 1 second apart. Note that in the second subquery, rank value plus 1 literally aligns next record's ValidFrom with previous record's ValidTo for the same patient side-by-side into a single row. This effectively makes it possible to compare ValidFrom and ValidTo of two adjacent records. Then in the SELECT part, we created a new ValidTo column which is ValidFrom minus 1 second.

Finally, in Step 2, a cross-table update is used by joining on ID columns and ValidTo is updated to the new ValidTo in the temporary table.

update Patient_Dim as a
inner join temp_ValidTo_Fixed as b on b.ID = a.ID
set a.ValidTo = b.New_ValidTo;

Here is what the patient dimension looks like after fixing the ValidTo column.

7 records returned:

Please note that it makes no difference if RANK function is replaced by DENSE_RANK or ROW_NUMBER because the ranking is based on ID values which are always unique.

Happy Coding!



Other tutorials in this category

1. MySQL Northwind Queries - Part 1

2. MySQL Northwind Queries - Part 2

3. MySQL Northwind Queries - Part 3

4. How to Work with Two Unrelated Values

5. How to Fill Gaps in Sales Data

6. How to Calculate Totals, Subtotals and Grand Total

7. How to Work with NULL Values

8. How to fill down empty cells with values from a previous non-empty row

9. Two ways to add a unique number or ID to each row

10. 3 ways to get Top N rows from MySQL

11. How to generate Cumulative Sum (running total) by MySQL - Part 1

12. How to generate Cumulative Sum (running total) by MySQL - Part 2

Back to Tutorial Index Page


Copyright © 2024 GeeksEngine.com. All Rights Reserved.

This website is hosted by HostGator.

No portion may be reproduced without my written permission. Software and hardware names mentioned on this site are registered trademarks of their respective companies. Should any right be infringed, it is totally unintentional. Drop me an email and I will promptly and gladly rectify it.

 
Home | Feedback | Terms of Use | Privacy Policy