| 
 | 
Dennis,
How is your category/code info stored?
Just for grins (sanity optional!), I created a table (codefile) with
two
fields: category and code.
The following query should identify category "pairs" that contain the
same
set of codes.
with catcount as (
select category,count(*) count
from codefile
group by category),
catcomp as (
select a.category cat1, b.category cat2
from catcount a
join catcount b on a.count=b.count
where a.category<>b.category),
catcheck as (
select cat1, cat2, a.code code1, b.code code2
from catcomp
join codefile a on a.category=cat1
left outer join codefile b on b.category=cat2 and
b.code=a.code)
select distinct cat1,cat2
from catcheck a
where not exists (
select *
from catcheck b
where b.cat1=a.cat1 and b.cat2=a.cat2 and b.code2 is null)
Each CTE builds on the previous one. "catcount" is a list of categories
with
the count of the number of codes assigned to them. "catcomp" then pulls
the
category pairs that have the same count. That lets us ignore the
category
pairs that can't be duplicates. "catcheck" generates records for each
code
in the first category of the category pair and includes the matching
code
from the second category of the category pair. The left outer join
means
that the matching code will be null if the second category doesn't have
that
code. The actual query pulls from catcheck for any category that
doesn't
have any nulls for that matching code.
This query will report each category pair twice; since if A=B, then
B=A.
Have fun!
Richard
-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx
[mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of Dennis
Sent: Friday, June 03, 2011 12:40 PM
To: midrange-l@xxxxxxxxxxxx
Subject: Identifying unique sets of combinations
We have a table as part of our HR system that is used to categorize
days
off. For example, "S" (for sick) is in the SCKAC, USEDS, VACCK and 14
other
categories (I know, I know). V(acation) is in 7 categories, and so on.
These things are categorized in such a way because they intersect so
much.
For example, a Carry-over vacation day and a Vacation day each have the
same
effect in terms of Vacation Balance, but very different effects when it
comes to calculating next year's new balance.
In all, there are 110 absence codes that are grouped into 55 different
categories. In an effort to simplify all of this, I've been asked to
identify categories that share common members, with no exceptions. (In
other words, if categories AAA, ABB and ACC all contain V and C, and
none of
them contain anything else, they are exact duplicates.)
By the same token, If V and G are in 5 categories, but a sixth category
lacks G, the powers that be would like this noted also.
I'm sure there's a way to get to this answer and still maintain what's
left
of my sanity (many of you realize that there's little left), but I have
not
yet landed on it. I thought SQL might get me to the answer most
readily,
but try as I might I haven't found a workable solution yet; my CTEs
quickly
get too complex for me to follow. (See prior note on sanity.) Pivot
table
doesn't seem quite right (though it may help); counts are not
definitive
enough...
So I thought I'd come to this group; see if you've had a similar
situation,
and might be willing to share how you arrived at a solution.
Thanks!
--
Sent from my Galaxy tablet phone with with K-9 Mail. Please excuse my
brevity.
--
As an Amazon Associate we earn from qualifying purchases.
This mailing list archive is Copyright 1997-2025 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
Operating expenses for this site are earned using the Amazon Associate program and Google Adsense.