Tuesday, 25 March 2014

How to remove duplicates when the source is a flat file, OR how to send distinct records to one target and duplicates to another

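There are several ways to eliminate duplicate records when the source is a flat file, where the Source Qualifier's Select Distinct option (which works only for relational sources) is not available. One is a Sorter with the Distinct option, another is an Aggregator with group by on ALL COLUMNS, and a third is the variable port method in an Expression transformation.
The steps below use the variable port method, which also lets you send the duplicates to a separate target.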
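As a rough illustration outside Informatica, the first two options both amount to treating the whole row as the key and keeping one copy of each. The Python sketch below shows that idea; the function name `distinct_rows` and the sample data are hypothetical, and unlike a Sorter it keeps rows in first-seen order instead of sorting them.

```python
import csv
import io

def distinct_rows(lines):
    """Return the rows of a comma-delimited flat file with exact duplicates removed.

    The whole row is the key, which is the same idea as a Sorter with
    Distinct or an Aggregator grouped by all columns.
    """
    seen = set()
    unique = []
    for row in csv.reader(lines):
        key = tuple(row)          # every column takes part in the comparison
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Hypothetical sample data standing in for the flat file.
sample = io.StringIO("101,SMITH\n102,ALLEN\n101,SMITH\n")
print(distinct_rows(sample))
# prints [['101', 'SMITH'], ['102', 'ALLEN']]
```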
Step 1:
Drag and drop your source and targets into the Mapping Designer.
Step 2: 
Sorter
Sort the data on the port you are checking for duplicates, so that records with the same value arrive one after another.
EMPNO     Key (Check)
Step 3:
Expression
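Create three ports, in this order (variable ports are evaluated top to bottom, so the comparison must come before OLD_EMPNO is overwritten).
EMPNO_COMPARE (Variable Port) = IIF(EMPNO = OLD_EMPNO, 1, 0)
OLD_EMPNO (Variable Port) = EMPNO
DUP_RECORD (Output Port) = EMPNO_COMPARE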
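To see why the port order matters, here is a small Python sketch of what the Expression does row by row on sorted input. It is only an analogy, not Informatica code: the function name `flag_duplicates` is hypothetical, and `None` stands in for "no previous row yet".

```python
def flag_duplicates(empnos):
    """Flag each value of a sorted list: 0 for a first occurrence, 1 for a repeat."""
    old_empno = None              # plays the role of the OLD_EMPNO variable port
    flags = []
    for empno in empnos:
        # EMPNO_COMPARE: compare with the previous row BEFORE updating it
        empno_compare = 1 if empno == old_empno else 0
        # OLD_EMPNO: remember the current value for the next row
        old_empno = empno
        # DUP_RECORD: the value passed on to the next transformation
        flags.append(empno_compare)
    return flags

print(flag_duplicates([7369, 7369, 7499, 7521, 7521, 7521]))
# prints [0, 1, 0, 0, 1, 1]
```

If the two assignments inside the loop were swapped, every row would compare equal to itself and be flagged as a duplicate.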
Step 4:
Router
Create two Groups
New_Record: DUP_RECORD = 0
Dup_Record: DUP_RECORD = 1
Step 5:
Connect the New_Record group ports to the distinct target and the Dup_Record group ports to the duplicate target.
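The whole flow (sort, compare with the previous row, route) can be sketched in Python as follows. This is an illustration under assumptions, not the mapping itself: `route_by_key` and the sample rows are hypothetical, and the two returned lists stand in for the two targets.

```python
def route_by_key(rows, key):
    """Split rows into (distinct, duplicates) based on the value of one key column."""
    ordered = sorted(rows, key=lambda r: r[key])   # Sorter step
    distinct, duplicates = [], []
    previous = object()                            # sentinel: equals no real key value
    for row in ordered:
        if row[key] == previous:                   # same key as the previous row
            duplicates.append(row)                 # Dup_Record group
        else:
            distinct.append(row)                   # New_Record group
        previous = row[key]
    return distinct, duplicates

# Hypothetical sample rows.
rows = [
    {"EMPNO": 7499, "ENAME": "ALLEN"},
    {"EMPNO": 7369, "ENAME": "SMITH"},
    {"EMPNO": 7499, "ENAME": "ALLEN"},
]
new_records, dup_records = route_by_key(rows, "EMPNO")
print([r["EMPNO"] for r in new_records])   # prints [7369, 7499]
print([r["EMPNO"] for r in dup_records])   # prints [7499]
```

Keeping only the first returned list corresponds to using a Filter instead of a Router.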


If you want only the distinct records, use a Filter transformation in place of the Router transformation, with the condition DUP_RECORD = 0.
Thanks
Yours, Hari