On 21-Jul-2011 09:37 , Alan Shore wrote:
<<SNIP>>
I have a file that contains (among other things) 21 occurrences of
the same size field that can contain similar data.
For example
<<ed: addendum: V5r4
3-byte alpha fields, containing either blanks or numbers>>
Key f1 f2 f3 f4 f5 f6 ... f21
00001 360 160 202 400 118
00002 160
00003 400 202 600
00004 160 118
00005 202
00006 400
00007 111
Using record key 00001 as a starting point, I need to group all
records that have any of the non-blank values in any of the
occurrences, EXCLUDING those records that have OTHER non-blank
values.
Using the above example:
Keys 00002, 00004, 00005, 00006 would be grouped with 00001
Key 00003 would not (value 600 NOT in Key 00001)
Key 00007 would not (value 111 NOT in Key 00001)
Hopefully, what I am asking makes sense, AND is feasible.
As always, any and all suggestions are welcome.
Since Key=1 is merely "a starting point", presumably there is already
some means to find whatever that key value is. Thus also seems
specifying that key value as the remaining selection should easily be
effected by a variable versus hard-coded in selection as a literal in a
permanent object like a VIEW.
The current physical layout of the data values in columns of matching
data-type\attributes in each row, suggests that for the more logical set
processing effected by the SQL, those values would best be converted to
values in a single column across multiple rows; i.e. SQL set logic is
geared toward rows, not columns. That can be accomplished by defining a
"pivot query". The pivot query can be encapsulated in a VIEW, leaving
the selection of the key value as part of the eventual SELECT queries
which can utilize that logical view. FWiW, the pivoting of only the
data where key=1 should perform very fast; the given sample set
producing only up to 7*21 rows would also perform very well.
As rows instead, the sets of those values can be easily processed
using EXCEPTION JOIN or equivalent logic. The exception logic includes
the rows where any value exists, that is not a subset of the chosen-as
superset [the superset row in the physical layout]. Selecting the
exceptions means negative logic in selection on the key values of the
original file can omit those rows. So any key that includes any one or
up to all of the same values as the superset will be selected as a valid
subset, because any key that has at least one value not in the superset
will be omitted [using NOT on the IN predicate].
That said, while the VIEW in the below script both is sufficiently
dynamic to pivot the data and runs quickly, I did not test a two-step
query, the first step to build the selection for [or the entire] dynamic
query that performs the selection against each column for all values of
the key=1; i.e. the method described in the following message\link, and
that query by itself runs very quickly. The query which generates that
select also runs very quickly, and is included at the bottom of the script.
http://archive.midrange.com/midrange-l/201107/msg00559.html
<code>
create table qtemp/kf (k dec, f1 char(3) default ' '
, f2 char(3) default ' ', f3 char(3) default ' '
, f4 char(3) default ' ', f5 char(3) default ' '
, f6 char(3) default ' ', f7 char(3) default ' '
)
;
insert into qtemp/kf (k,f1,f2,f3,f4,f5,f6) values
(1, 360, 160, 202, 400, 118, default)
,(2, 160, default, default, default, default, default)
,(3, 400, 202, 600, default, default, default)
,(4, 160, 118, default, default, default, default)
,(5, 202, default, default, default, default, default)
,(6, 400, default, default, default, default, default)
,(7, 111, default, default, default, default, default)
;
create unique index qtemp/kfi on qtemp/kf (k)
;
-- following includes c=blank rows for later IN predicate
-- selection. Included because union DISTINCT is defaulted
-- and no WHERE clause to omit them
create view qtemp/kfpivotx (k,c) as
( select k, f1 from kf
union select k, f2 from kf
union select k, f3 from kf
union select k, f4 from kf
union select k, f5 from kf
union select k, f6 from kf
union select k, f7 from kf
/* ...
union select k, f21 from kf */
)
;
-- similar concept to that already reviewed by the OP
-- but identical select as subselect versus literals
-- Joe proposed similar using a CTE to get k=1 row data
select * from kf
where f1 in (select c from kfpivotx where k=1)
and f2 in (select c from kfpivotx where k=1)
and f3 in (select c from kfpivotx where k=1)
and f4 in (select c from kfpivotx where k=1)
and f5 in (select c from kfpivotx where k=1)
and f6 in (select c from kfpivotx where k=1)
and f7 in (select c from kfpivotx where k=1)
/* ...
and f21 in (select c from kfpivotx where k=1) */
/* and k<>1 */
;
-- following excludes any c=blank rows for use by the
-- EXCEPTION JOIN selecting only non-blank values
create view qtemp/kfpivot (k,c) as
( select k, f1 from kf where f1 <>''
union all select k, f2 from kf where f2 <>''
union all select k, f3 from kf where f3 <>''
union all select k, f4 from kf where f4 <>''
union all select k, f5 from kf where f5 <>''
union all select k, f6 from kf where f6 <>''
union all select k, f7 from kf where f7 <>''
/* ...
union all select k, f21 from kf where f21<>'' */
)
;
-- next two queries select the key values from the original
-- PF that are not the exceptions to the subset requirement
select * from kf
where k not in (select l.k
from kfpivot l exception join
(select * from kfpivot where k=1) r
on l.c=r.c )
/* and k<>1 */
;
select * from kf
where k not in (select l.k
from kfpivot l left join
(select * from kfpivot where k=1) r
on l.c=r.c
where r.c is null)
/* and k<>1 */
;
-- build the dynamic query as IN tests against each column per:
--
http://archive.midrange.com/midrange-l/201107/msg00559.html
-- fetch that into a variable, prepare, declare, and open
with
inpred (ip) as
(select replace(
' IN ( "' concat f1 concat '","' concat
f2 concat '","' concat f3 concat '","' concat
f4 concat '","' concat f5 concat '","' concat
f6 concat '","' concat f7 concat /* '","' concat
...
f20 concat '","' concat f21 concat */ '" ) '
, '"', '''' )
from kf
where k=1
)
select 'select * from kf where f1 ' concat ip
concat 'and f2 ' concat ip concat 'and f3 ' concat ip
concat 'and f4 ' concat ip concat 'and f5 ' concat ip
concat 'and f6 ' concat ip concat 'and f7 ' concat ip
/* ...
concat 'and f20' concat ip concat 'and f21' concat ip */
from inpred
</code>
Regards, Chuck
As an Amazon Associate we earn from qualifying purchases.