RE: Database design issue: seems really bad, yet I'm having trouble coming up with anything better -- MIDRANGE-L

The chart in that article depicts more pro's of the surrogate keys. They are not visible to the end user.

My general database design principle is that I should be able to locate a unique record in ANY table by just ONE column. And that column should never be seen by normal end users.

I have found that this design ideal leads to much simpler SQL and table relationships and is usually the design you see in commercial products.

-----Original Message-----
From: Charles Wilt [mailto:charles.wilt@xxxxxxxxx]
Sent: Thursday, December 11, 2014 10:18 AM
To: Midrange Systems Technical Discussion
Subject: Re: Database design issue: seems really bad, yet I'm having trouble coming up with anything better

Matt,

Natural vs surrogate (vs artificial [see http://www.informationweek.com/software/information-management/celko-on-sql-natural-artificial-and-surrogate-keys-explained/d/d-id/1059246?
])

is pretty much a religious war :)

But even if the rest of the DB is using non-natural keys, I'd argue for a composite "natural" key for the association tables. Since you're dealing with composite artificial keys; The normal downsides of "natural" and the perceived upsides of artificial keys don't really apply

Granted the meal FK specification is a little messier, since you have two fields. But you only have to do that once.

On Thu, Dec 11, 2014 at 10:18 AM, Matt Olson <Matt.Olson@xxxxxxxx> wrote:

In my mind I always pick a single surrogate key for all tables. I
stay away from composite keys at all cost.

It is an extra column but it gives you these advantages:

1. All child tables foreign keys only need a single column to
reference the primary key.
2. Very easy to update the natural keys in table without needing to
update every child table with foreign keys 3. Smaller primary/foreign
key indexes. This can make the database run faster, for example when a
record is deleted in a parent table, the child tables need to be
searched to make sure this will not create orphans.
Narrow indexes are faster to scan.
4. ORM's (object relational mappers) always work much more beautifully
with surrogate keys.
5. Natural Keys especially composite keys make writing code a pain.
When you need to join 4 tables the "where clause" will be much longer
(and easier to mess up) than when single surrogate keys were used.
6. Query performance is improved as the database is able to search
attributes to locate the surrogate key and join all child tables by a
single numeric key (sort of related to #5 and #3)

Only downside is the following:
1. you will have more indexes because you most likely will also want
to index whatever natural (aka composite) keys that exist in the data.

Now consider you want to design "Meals" whereby you want to take a
selection of protein to starch and protein to vegetable mappings.
Using the mapping tables as I defined them is much better way to go
because you can just reference the single surrogate keys in those
mapping tables. It would be a nightmare to have to reference those composite keys.

Matt

-----Original Message-----
From: Charles Wilt [mailto:charles.wilt@xxxxxxxxx]
Sent: Thursday, December 11, 2014 8:58 AM
To: Midrange Systems Technical Discussion
Subject: Re: Database design issue: seems really bad, yet I'm having
trouble coming up with anything better

James,

It sounds like a standard associative entity (ie. junction or map) to
me used to store a many to many relationship.
https://en.wikipedia.org/wiki/Many-to-many_(data_model)

There usually isn't much too them compared to the master tables whose
relationship they store.

There really isn't a better way to handle them in the DB. I'd used
Matt's model, except that there wouldn't be a ProteinStarchMapID or
ProteinVegtableMapID. I'd use a composite PK made up of the two FK to
the corresponding two master tables.

ProteinStarchMap File:
ProteinID, StarchID

ProteinVegtableMap File:
ProteinID, VegtableID

Personally, I'd store it sparsely. The existence of a record means
the starch/vegetable is compatible. Otherwise it is not.
Alternatively you could add a compatible flag and store every possible combination.

From a maintenance standpoint, it's easy select a given protein and
generate a list of all possible starches (from the Starch Master) with
a flag showing rather or not a record exists in ProteinStarchMap. The
uses just goes through and checks or unchecks the boxes.

Assuming you have need of a "Weekly Menu" that stores the schedule of
selected meals. The Menu (or MenuMap) table would simply be
ScheduledDate ProteinID StarchID VegtableID

Using the associative tables, it's real easy to make sure only valid
meals are stored by defining a FK constraint...
foriegn key (proteinID, starchID) references ProteinStarchMap
(proteinID,
starchID)
foriegn key (proteinID, vegtableID) references ProteinVegtableMap
(proteinID, vegtableID)

Lastly, have you ever looked at Object Role Modeling?
http://www.ormfoundation.org/

I've found it useful for DB design & validation. Particularly when
interacting with the non-technical.

On Wed, Dec 10, 2014 at 9:09 PM, James H. H. Lampert <
jamesl@xxxxxxxxxxxxxxxxx> wrote:

On 12/10/14 5:27 PM, Evan Harris wrote:

A mapping file seems the logical way to go about it, and it seems
like it would naturally lend itself to some kind of caching
approach, which might just overcome your bottleneck concerns.

The problem with the mapping file approach is, as I said, the
overhead if it's done within the application's framework: its record
structures have, before a single user-defined field is added:

Fields for creation date, creation time, and creation user Fields
for last-update date, time, user, job, and program An automatically
generated ID field A flag field.

That totals 86 bytes of overhead, per record, in addition to
whatever overhead is inherent to DB2/400.

The "protein" code is 3 characters. The "vegetable" or "starch" code
is

8.

86 bytes of overhead for 11 bytes of mapping data, in a file that
would have to have a separate record for each valid combination of
"protein" and "vegetable," and the same, in another file, for each
valid combination of "protein" and "starch." Something about that
strikes me as being not much of an improvement over the initial
hierarchical solution that seemed so wrong.

Putting space for up to 10 "protein" codes, space-delimited, in each
"starch" and "vegetable" record, would add 39 to 41 characters
(depending on whether we need leading and/or trailing spaces for the
entire field) to each "starch" and "vegetable" record. With 20
"protein" codes per record, it would be 79-81 characters. Then, the
search would work out to nothing more than, say, looking for all
"starch" records for which the "protein list" field contains the
code

for the desired "protein."

--
JHHL
--
This is the Midrange Systems Technical Discussion (MIDRANGE-L)
mailing list To post a message email: MIDRANGE-L@xxxxxxxxxxxx To
subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx Before posting, please
take a moment to review the archives at
http://archive.midrange.com/midrange-l.

--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing
list To post a message email: MIDRANGE-L@xxxxxxxxxxxx To subscribe,
unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx Before posting, please take
a moment to review the archives at http://archive.midrange.com/midrange-l.

--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing
list To post a message email: MIDRANGE-L@xxxxxxxxxxxx To subscribe,
unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx Before posting, please take
a moment to review the archives at
http://archive.midrange.com/midrange-l.

--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list To post a message email: MIDRANGE-L@xxxxxxxxxxxx To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx Before posting, please take a moment to review the archives at http://archive.midrange.com/midrange-l.