Tracking historical data in Database

Posted in Blogs and articles on Aug 27, 2009 at 06:30 IST (over 4 years ago). Subscribe to this post Bookmark and Share Email
Showing comments 1 to 6 of total 6 on page 1 of 1
Post reply
« Previous1Next »

robinalex
Rank: 1
 
Data modification is easy transaction to handle when there is no need to track the changes. But, there are cases where we need to do so.
In this article I am writing about various kinds of data tracking mechanisms.
 
I am using Employee address change transaction to explain these methods.
 
The business case is:
Employee is effective from 1/1/2009 with an address 14810 SE 26th Drive
On 7/15/2009 address is changing to 22212 NE 55th St
On 8/26/2009 address is changing to 14810 SE 26th Drive
 
1. History table for a Data Table
 
In this method, a history table is created for the data table for which the history needs to be tracked. The history table will have same fields; in addition to those, it will have a date/time filed for the change date.
 
Here is the data representation for the business case:
 
Table: M1_Emp_Contact
 
Emp_ID
Emp_Adr
123456
14810 SE 26th Drive
 
Table: M1_Emp_Contact_Hist
 
Ch_Date
Emp_ID
Emp_Adr
7/15/2009
123456
22212 NE 55th St
8/26/2009
123456
15478 SE 7th PL
 
Getting latest data:
 
select emp_adr
 from m1_emp_contact
 where emp_id = '123456'
 
Getting data on a date (e.g. 7/20/2009):
 
select emp_adr
 from m1_emp_contact_hist a
 where a.emp_id = '123456'
   and a.ch_date =
      (select min(b.ch_date)
         from m1_emp_contact_hist b
        where a.emp_id = b.emp_id
          and b.ch_date > '7/20/2009')
 
The query tries to get the first change after 7/20/2009. If the query returns some value, that is the address on 7/20/2009.
If no address returned by the query, then data on 7/20/2009 is same as the latest value (previous query)
 
Note: The address on 8/25/2009 is 15478 SE 7th PL and address on 8/26/2009 is 14810 SE 26th Drive, which works fine with the mentioned logic.
 
2. Single History-Log table for all the tables
 
In this method, instead of having a history table for each table, there will be only one history table which can track field value changes for all the required tables.
 
Here is the data representation for the business case:
 
Table: M2_Emp_Contact
 
Emp_ID
Emp_Adr
123456
14810 SE 26th Drive
 
Table: M2_Data_Chg_Log
 
Ch_Date
Ch_Table
Ch_Key
Ch_Field
Old_Value
New_Value
7/15/2009
m2_emp_contact
123456
emp_adr
22212 NE 55th St
15478 SE 7th PL
8/26/2009
m2_emp_contact
123456
emp_adr
15478 SE 7th PL
14810 SE 26th Drive
 
The new value filed is optional.
 
The retrieval of data in this case is same as previous one with additional conditions to point field and table in the history table.
 
Getting latest data:
 
select emp_adr
 from m2_emp_contact
 where emp_id = '123456'
 
Getting data on a date (e.g. 7/20/2009):
 
select old_value
 from m2_data_chg_log a
 where a.ch_key = '123456'
   and a.ch_table = 'm2_emp_contact'
   and a.Ch_Field = 'emp_adr'
   and a.ch_date =
      (select min(b.ch_date)
         from m2_data_chg_log b
        where b.ch_key = a.ch_key
          and b.ch_table = a.ch_table
          and b.Ch_Field = a.Ch_Field
          and b.ch_date > '7/20/2009')
 
The query tries to get the first change after 7/20/2009. If the query returns some value, that is the address on 7/20/2009.
If no address returned by the query, then data on 7/20/2009 is same as the latest value (previous query)
 
3. Effective date and Expiration date
 
In this case, there won’t be any separate table for tracking history. We track the history in data table itself.
 
Here is the data representation for the business case:
 
Table: M3_Emp_Contact
 
Emp_ID
Effective_Date
Expiration_Date
Emp_Adr
123456
1/1/2009
7/15/2009
22212 NE 55th St
123456
7/15/2009
8/26/2009
15478 SE 7th PL
123456
8/26/2009
12/31/9999
14810 SE 26th Drive
 
Getting latest data:
 
select emp_adr
 from m3_emp_contact
 where emp_id = '123456'
   and expiration_date = '12/31/9999'
 
Getting data on a date (e.g. 7/20/2009):
 
select emp_adr
 from m3_emp_contact
 where emp_id = '123456'
   and effective_date <= '7/20/2009'
   and expiration_date > '7/20/2009'
 
Note: as of 8/26/2009 valid data is 14810 SE 26th Drive
 
4. Transaction number
 
This is kind of upgraded version to the previous one. This will be easier to handle if transaction date has no dependability with the system date (when the user has ability to process transaction on any date regardless of the actual time of transaction)
 
Here is the data representation for the business case:
 
Table: M4_Emp_Contact
 
Emp_ID
Trans_Number
Emp_Adr
123456
1
22212 NE 55th St
123456
2
15478 SE 7th PL
123456
3
14810 SE 26th Drive
 
Table: M4_Transactions
 
Object_ID
Obj_Key
Transaction_Date
Trans_Number
Employee
123456
1/1/2009
1
Employee
123456
7/15/2009
2
Employee
123456
8/26/2009
3
 
Transaction table is a generic table, which keeps track of transaction information.
 
Getting latest data:
 
select emp_adr
 from m4_emp_contact a
 where emp_id = '123456'
   and a.Trans_Number =
      (select max(b.Trans_Number)
         from m4_emp_contact b
        where a.Emp_ID = b.Emp_ID)
 
Getting data on a date (e.g. 7/20/2009):
 
select emp_adr
 from M4_Emp_Contact a
 where a.Emp_ID = '123456'
   and a.Trans_Number =
      (select max(b.Trans_Number)
         from M4_Transactions b
        where a.Emp_ID = b.Obj_Key
          and b.Object_ID = 'Employee'
          and b.Transaction_Date < '7/20/2009')
 
Transaction independent of system date
 
As, I mentioned before, let us see what advantage method 4 has over method 3.
 
Here is the additional step to business case:
Employee found mistake in the data entry when he received the acknowledgement email. The employee wants the address to be effective from 8/20/2009 and the address value to be 14820 SE 26th Drive (Typo by operator in building number 14810).
 
The correction will be, as simple as inserting two rows in Method 4.
 
Table: M4_Emp_Contact
 
Emp_ID
Trans_Number
Emp_Adr
123456
1
22212 NE 55th St
123456
2
15478 SE 7th PL
123456
3
14810 SE 26th Drive
123456
4
14820 SE 26th Drive
 
Table: M4_Transactions
 
Object_ID
Obj_Key
Transaction_Date
Trans_Number
Employee
123456
1/1/2009
1
Employee
123456
7/15/2009
2
Employee
123456
8/26/2009
3
Employee
123456
8/20/2009
4
 
In method 3 we can handle the situation, but the transaction is not straight forward like method 4.
Let’s see how we could do this with method 3:
 
Table: M3_Emp_Contact
 
Emp_ID
Effective_Date
Expiration_Date
Emp_Adr
123456
1/1/2009
7/15/2009
22212 NE 55th St
123456
7/15/2009
8/20/2009
15478 SE 7th PL
123456
8/26/2009
8/26/2009
14810 SE 26th Drive
123456
8/20/2009
12/31/9999
14820 SE 26th Drive
 
In this case we need to do 2 updates and 1 insert.
In addition, we have to include the condition effective_date <> expiration_date in all select queries on this table.
 
These are just basic methods, you can combine some of these and add more fields (like timestamps of transactions) to make more efficient history tracking according to your data needs.
Posted by robinalex on Thursday, August 27, 2009, 6:30 am
  • Currently 0.00/5

0 votes

Thank this userFlag this comment

kris
Rank: 14
For the point 3. Effective date and Expiration date. Is it possible to have an expiration date for the present address?

I feel to have a flag field in the table to indicate whether an address is current address or an old address. This makes the query process more simple and standard.
Posted by kris on Monday, October 5, 2009, 1:18 pm
  • Currently 0.00/5

0 votes

Thank this userFlag this comment

robinalex
Rank: 1

With an existing field we can track the present address. So, having an additional flag filed lead to data redundancy; currently an infinite date is set as expiration date. If you feel that, having a date value to that filed is illogical, you can have null value assigned.. But, getting data for certain date becomes difficult; you need to write one logic for present and another for past… BTW These methods for tracking data for any past date… For example, if want to know “to which address” of the client a communication has sent on certain date, you can track it from date..

It would be great if you could explain more about the flag filed standard which you mentioned, and how it simplifies the things?

Posted by robinalex on Tuesday, October 6, 2009, 9:42 am
  • Currently 0.00/5

0 votes

Thank this userFlag this comment

arvind9n...
Rank: 49
Dont forget to add indexes on the tables. The data will become very huge in couple of days. I worked with similar history tables and within 15 days data became more than 15 MB.
Posted by arvind9november on Tuesday, October 6, 2009, 1:45 pm
  • Currently 0.00/5

0 votes

Thank this userFlag this comment

asigh
Rank: 2
I can suggest the following..

1. Make sure you have and index column -- that will auto increment.
2. Put a Trigger on the date changed field. As soon as that changes you can fire a SP and update your history table.
3. Carry the index over to the history table for tracking.
4. Depends how much 'history' you want to carry over. You want to store the previous address too. If yes, then just keep a separate table that holds all the addresses that an index held with the date stamp.
Posted by asigh on Wednesday, October 7, 2009, 9:41 pm
  • Currently 0.00/5

0 votes

Thank this userFlag this comment

neonevin
Rank: 51

Most of the ERPs today are using the data tracking mechanism. The simplest of these i ve seen is using a
combination of effective dated and effective status logics,

Effective dated logic will help you to filter out latest transactions (using max(EFFDT) in ur where codition) as well as track changes to data. (but normally all changes to data will not be tracked)
Effeective status will help you to identify relevant data (i.e EFF_STATUS is active or inactive)

Posted by neonevin on Monday, October 19, 2009, 8:37 pm
  • Currently 0.00/5

0 votes

Thank this userFlag this comment
Pages: « Previous1Next »

Post your comment (No registration required)

  Add my comment  

TechieDesi Community

Not signed in (Sign-in or Register)
Be a true TechieDesi!
Top 10 Users
Spread the word
Invite your friends
Fan stuff
Help us improve
Need Help
FAQ's
Search tips
Found a bug? Report!
Feeds and letters
Subscribe via RSS
Archives
Subscribe to newsletter
Unsubscribe e-mail
Miscellaneous
Privacy policy
Visit rootnerve
About us
About us
Support the development
Official Blog
Advertise with us
Careers
Copyright (c) 2008, TechieDesi.com. All rights reserved | About us | Do-Not-Disturb registry | Powered by rootnerve | Page rendered in 0.015 seconds