Re: Next Identity Key Value -- MIDRANGE-L

On 12-Oct-2015 12:47 -0600, rob wrote:

On 12-Oct-2015 12:28 -0600, Birgitta Hauser wrote:

Alan Campin on Monday, 12.10 2015 18:24 wrote:

Is there a way to get the next identity key that will be used
for a given table? [...]

Try:

Select NEXT_IDENTITY_VALUE
From QSYS2.SYSPARTITIONSTAT
Where Table_Schema = 'YOURSCHEMA'
and Table_Name = 'YOURTABLE'
;

Identity values can vary by partition (aka member)?

The run-time next-to-be-assigned-identity-value is an attribute of the DataSpace; that can be verified by using Start Service Tools (STRSST) to obtain a [formatted] dump of such a Dataspace. And that value is retrievable by an MI instruction, I do not recall, if by either of MATDS or MATDSA; regardless, the MI is protected\restricted vs publicly available instruction(s), and the value that is presented is almost surely retrieved from the Materialize Dataspace [Attributes] instruction as performed by privileged OS code invoked as an effect of querying the SYSPSTAT VIEW. Each member has a Dataspace. Therefore by implication, the identity value is inherently maintained per-member. The typical TABLE is non-partitioned, and thus has only one member, thus has only one dataspace. And DDS Physical File (PF) has no IDENTITY support, so member-specific identities would not be something relevant; not even possible.

So if I use DB2 multisystem and I create a partitioned table using
OrderNumber as an identity column and partition the table by
fulfillment center would there be a possibility of the identity
column not being unique because it could use order number 5 for two
different fulfillment centers?

By default the *database-generated* next-identity-value is the assigned increment\offset from the prior value, and thus would naturally be unique. However note that identity-values have always had the possibility of not being unique, when not also enforced with a CONSTRAINT. There must be a PRIMARY KEY [or UNIQUE KEY] constraint on the IDENTITY column to ensure the values are maintained unique.

Thus the ALWAYS GENERATED value [one not originating from a bogus ALTER ... RESTART] would, for a single-member TABLE, effect generation of unique [non-duplicating] values; i.e. the database generates the consistently unique value, according to the creation-time attributes of the IDENTITY column. That same effect would be expected for the data across partitioned members, just as occurs with one member. So, no, I do *not* expect that the two partitions would get the same non-unique generated value; not without some errant user intervention like OVERRIDING SYSTEM VALUE, or ALTER ... RESTART, or perhaps instead by the choice of the CYCLE vs NO CYCLE attribute.

But I have never created a locally nor distributed partitioning of a TABLE with IDENTITY, and thus have never seen the effects [nor inferred the implementation nor seen any docs describing the effects]. However *I speculate* as a side effect of the implementation, whereby the Datspace determines the next-identity, the expectation could be the assignment of a range of [for the specified number of cached] identity values, is specific _for each member_ [aka partition]. That would be instead of, forcing the assignment of a generated-value within each member to defer [probably] to the primary\first member, which would increase contention on that sole member as arbiter; i.e. perhaps, so as to limit the contention, each member tracks separately its own next-identity-value, but that the database maintains additionally the means to maintain the same distinctness as with one member. Thus perhaps for an IDENTITY defined with increment-by-one:start-with-one:cache-three, the following three-partition TABLE could start with the following:

P_TABLE
P_1 -> next_identity_value=1 [also cached =2, =3]
P_2 -> next_identity_value=4 [also cached =5, =6]
P_3 -> next_identity_value=7 [also cached =8, =9]

If that were the implementation, contention for the location of a single identity-value is avoided minimally for one value and maximally for the cached number of values, for each member. So until the cached values are used-up for any one member, the generate-next need not refer to [and update] a shared storage location serving as some composite next_identity, from which to assign the specific value(s) for the specific member [aka partition] into which row-data is being inserted. Such an implementation would however, upon using-up the values for any one member, refer to the necessarily separate storage location that tracks the overall\composite next-identity-start-value. Thus for whenever any one member /runs out/ of [un]cached values, conspicuously, with the NO CACHE attribute, the contention would be unavoidable except for the first next-value.

Of course someone with DB2 MultiSystem installed could verify whether the value in the SYSPSTAT VIEW is always identical for each partition, or if the value is instead distinct for each member\partition; just create a table with an IDENTITY column having the data partitioned locally [e.g. by-range]:

CREATE TABLE QGPL.TEST_PARTITION
( ORDERNUMBER for ORDNBR BIGINT NOT NULL
GENERATED ALWAYS AS IDENTITY
( START WITH 11 INCREMENT BY 1
CACHE 10 NO CYCLE
)
, FULFILLMENT_CENTER for FFLCTR SMALLINT NOT NULL
, [... other fields ...]
, CONSTRAINT QGPL.TEST_PARTITION_PK_ORDNBR
PRIMARY KEY (ORDNBR)
)
PARTITION BY RANGE (FFLCTR)
( PARTITION FFLCTR01 STARTING 01 ENDING 01 INCLUSIVE
, PARTITION FFLCTR02 STARTING 02 ENDING 02 INCLUSIVE
, PARTITION FFLCTR03 STARTING 03 ENDING 03 INCLUSIVE
)

Then review the results of the following query suggested by Birgitta:

Select NEXT_IDENTITY_VALUE
From QSYS2.SYSPARTITIONSTAT
Where Table_Schema = 'QGPL'
and Table_Name = 'TEST_PARTITION'

Or, unlike members, partitions enforce uniqueness across partitions?

Again, aside from the generally distinct generated values, uniqueness enforcement must be established by the creator of the TABLE, by having assigned a CONSTRAINT to ensure the UNIQUE effect; that is the same, irrespective the TABLE being partitioned.

Or was next_identity_value just added to syspartitionstat just
because it was a newer view and its developer was more open to mods
and new column suggestions at the time?

Generally the work is done by design, not by whimsy. Though indeed, with just a cursory review, the information seems to be exposed from the wrong place, possibly as a side effect of the implementation, rather than the vagaries of the developer attitudes; i.e. one might expect the information for NEXT_IDENTITY_VALUE to be from SYSTABLESTAT vs from SYSPARTITIONSTAT, just like there other data-related attributes that are presented [as consolidated for the composite TABLE] with the former VIEW despite being presented [distinctly for each member-as-sub-component of the TABLE] in the latter VIEW.

But perhaps for accuracy, to reflect the actual results that should be expected for the next value [quite possibly due to implementation; see _implementation_ below], there is not one specific value for which consolidation would be accurate. That is to suggest, perhaps any generated value is indeed specific to the partition, rather than any one value that could be known\common for the overall TABLE; IOW, that an insert into partition-one would effect a different generated-identity-value than an insert into partition-two, therefore there is no one-specific-value scoped to the TABLE.

FWiW, in the case of a common DB2 SQL catalog VIEW, the same details likely could be seen in other variants. For example, in DB2 LUW [for v10.5], there is the NEXTCACHEFIRSTVALUE column of the SYSCAT.COLIDENTATTRIBUTES catalog view; the data is irrespective of PARTITION, existing only for the TABLE-level\scope.

But again, the _implementation_ for the DB2 for i is that Members implement Partitions, and that the Dataspace implements row-data, such that a NEXT_IDENTITY_VALUE necessarily would be an attribute of the Data, and derived from the Dataspace. The catalog VIEW that exposes most of the data-related attributes, those attributes from the Members that are retrieved from the member(s) and dataspace(s), is the SYSPSTAT [aka SYSPARTITIONSTAT]. That suggests that, essentially, if the next-identity-value were available\exposed via the database APIs, then the information would need to be extracted by use of the Retrieve Member Description (QUSRMBRD) API rather than by use of the Retrieve File Description (QDBRTVFD) API. So perhaps the implementation necessitates the representation of the identity-value as being known only on a "per-partition" basis.