RE: Using sql to delete duplicate records -- MIDRANGE-L

Thanks Rob
That seems to have done the trick

This is where I originally found my sql script
http://iprodeveloper.com/database/deleting-records-duplicate-data-revisited

Alan Shore
E-mail : ASHORE@xxxxxxxx
Phone [O] : (631) 200-5019
Phone [C] : (631) 880-8640
'If you're going through hell, keep going.'
Winston Churchill

-----Original Message-----
From: MIDRANGE-L [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of rob@xxxxxxxxx
Sent: Monday, October 05, 2015 1:08 PM
To: Midrange Systems Technical Discussion
Subject: Re: Using sql to delete duplicate records

DDS and DDL is irrelavant. Either can have, or not have, a primary key constraint. DDS is just more likely because the odds of running into legacy software that has no primary key, or the old codger who seems to remember some old situation back on a 360 or 1401 having issues with primary keys goes up.
The terminology may be what is throwing Matt. What many applications called primary key was just a key on a logical over what should have been the primary key but they often didn't even specify UNIQUE.

CREATE TABLE QTEMP.DUMMY (MYCHAR CHAR (5 )) INSERT INTO QTEMP.DUMMY (MYCHAR) VALUES('A') with nc INSERT INTO QTEMP.DUMMY (MYCHAR) VALUES('A') with nc INSERT INTO QTEMP.DUMMY (MYCHAR) VALUES('B') with nc INSERT INTO QTEMP.DUMMY (MYCHAR) VALUES('C') with nc INSERT INTO QTEMP.DUMMY (MYCHAR) VALUES('D') with nc INSERT INTO QTEMP.DUMMY (MYCHAR) VALUES('D') with nc INSERT INTO QTEMP.DUMMY (MYCHAR) VALUES('D') with nc

select mychar, count(*), min(rrn(d))
from qtemp.dummy d
group by mychar
having count(*)>1
order by mychar
MYCHAR COUNT ( * ) MIN ( RRN ( D ) )
A 2 1
D 3 5

http://centerfieldtechnology.com/PDFs/JulyAugust07.pdf
delete from qtemp.dummy d
where rrn(d) in
(select rrn(b)
from qtemp.dummy b,
(select a.mychar, min(rrn(a)) minrrn
from qtemp.dummy a
group by a.mychar) as c
where b.mychar = c.mychar and rrn(b) > c.minrrn) with nc
3 rows deleted from DUMMY in QTEMP.

select mychar, rrn(d)
from qtemp.dummy d

MYCHAR RRN ( D )
A 1
B 3
C 4
D 5

BTW, I searched archive.midrange.com with your subject line, for this answer.

Rob Berendt
--
IBM Certified System Administrator - IBM i 6.1 Group Dekko Dept 1600 Mail to: 2505 Dekko Drive
Garrett, IN 46738
Ship to: Dock 108
6928N 400E
Kendallville, IN 46755
http://www.dekko.com

From: Alan Shore <ashore@xxxxxxxx>
To: "midrange-l@xxxxxxxxxxxx" <midrange-l@xxxxxxxxxxxx>
Date: 10/05/2015 11:07 AM
Subject: Using sql to delete duplicate records
Sent by: "MIDRANGE-L" <midrange-l-bounces@xxxxxxxxxxxx>

Hi everyone
Before I forget - we are on V5r4

This is driving me nuts (I know - not that far to drive)
I used the following sql to delete duplicate records in a file
delete from file01 A where rrn(A) in
(select RRN(B) from file01 B, file01 C where
b.IITEM = c.IITEM and RRN(B) > RRN(C))

and received a display the nnn records had been deleted
Don't ask me why - but I ran the same statement again - expecting zero
records deleted
But it gave me a smaller number than before
Which surprised me
So I ran it again
And received an even smaller number

I ran this statement multiple time, receiving a smaller number each time

I then ran a RGZPFM to remove all deleted records and ran the sql
statement again
Only to be informed that a certain number had been deleted
I ran it again and received notification tha a lesser number had been
deleted

Obviously I am doing something wrong - but I cannot see what it is
Does anyone else see what I am doing wrong?

Alan Shore
E-mail : ASHORE@xxxxxxxx
Phone [O] : (631) 200-5019
Phone [C] : (631) 880-8640
'If you're going through hell, keep going.'
Winston Churchill