Partitioning this data solves the wrong problem.
Traditionally, different members were used to keep people out of each
others' data. That is not the intent of partitioning. In SQL
partitioning, all the partitioned data is available in a single query;
one does not do anything "special" to access the data. Traditional
multi-member data is pre-filtered. In partitioning, one adds additional
logic in the WHERE clause to filter the data. Also, the assumption is
that SQL data is accessed using (embedded) SQL, not native I/O.
Partitioning is a means to break down a large table into smaller chunks
for performance and maintenance reasons. A side effect of partitioning
in i5/OS is that it uses multi-member files, and to an incorrect
conclusion that you use partitioning the same way you would a
multi-member file.
Usually partitioning is implemented in conjunction with data
warehousing; however, partitioning is valid for general SQL tables,
especially in large databases.
Take for example a high volume transaction table. A business requirement
is that the current month and two previous months must be accessed
quickly, and older data can be accessed more slowly.
Partitioning allows you to have a "current" set of data, and
"historical/archive" data, both of which are immediately available for
querying in a single SELECT, but are stored in different physical (in
terms of file) locations.
In effect, you have a small "current" storage, and large "history"
storage. The key and index requirements on the smaller storage segment
can be more easily managed (quicker AP rebuilds and statistics
gathering), and the data accessed more quickly for current data.
The historical segment still has proper indexes and keys, but the
understanding that querying the data takes longer (more like data
warehouse than transactional).
Other database systems often partition data into separate storage spaces
or file groups. This allows the data to be physically segregated (such
as slower drives or near-line storage) yet available.
Take for example, SQL Server, which implements partitioning as file
groups and different file spaces in the file system. Backing up the
"current" set of data occurs quickly and generally not noticeable to the
users; it can also occur on its own schedule. Backing up the archive
data can occur less frequently since it changes less. Also the speed of
archive backup does not pose a serious impact to the business as it
would on the current data.
The side effect of DB2 for i5/OS I mentioned earlier, about storing
partitioned data in multi-member files, is correct for i5/OS due to
single level store and database implementation. The multiple members
aren't meant to separate the data in the same way as traditional
multi-member files have been used. For example, one doesn't partition
(in the sense of SQL partitioning) work in process invoice batches;
rather a batch posting or status flag should be used.
The "problem" with i5/OS partitioning implementation is that you cannot
just backup the "current" member. SAVOBJ requires you to backup the
entire file, with all current and archive members. Perhaps under the
covers the save routine backs up a member at a time, but you can't
explicitly save just the current data. As a consequence, you must back
up all the data (current and archive) each time the object is saved. You
could break the object into a "current" file and "archive" file, but
then you've lost the rationale for partitioning in the first place.
I wanted to clarify that partitioning uses the same implementation as
multi-member files, but for different reasons. Users should not design a
database and application to use partitioning to achieve the same goals
as they might have previously with multi-member files.
Loyd Goodbar
Business Systems
BorgWarner Shared Services
662-473-5713
-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx
[mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of rob@xxxxxxxxx
Sent: Monday, August 18, 2008 8:08 AM
To: Midrange Systems Technical Discussion
Subject: Re: DDL Files - Tables and keys
From what I remembered,
1 - It wasn't cheap.
2 - People who use multimembers tend to want to have control over the
name
of the name of the member and do not want to put a column in the table
like MemberName and have that column in every row.
With DB2 MultiSystem you tell it how it should determine what member to
place the data. You may say stuff like if year of date field is 2008
then
use member 2008 or some such thing. If you are writing an invoice batch
setup then you would say put the name of the batch as a column in the
table (or field in the file) and use that to determine the member name.
Seems to me if you're going through all that, then why use members
instead
of the extra column in the first place?
Rob Berendt
As an Amazon Associate we earn from qualifying purchases.