On 13-Aug-2015 10:47 -0600, Stone, Joel wrote:
I have a filePROD with 10 million rows, and a fileFIX with 100 rows
(both files are identical columns).
I want to join the two files, and update filePROD with fields from
fileFIX where the primary key matches.
Is that a PRIMARY KEY CONSTRAINT or simply a keyed access path? The
former has a larger Page Size preferred by the SQL Query Engine (SQE)
Unfortunately SQL on v5r4 reads thru the entire 10 million records
to locate the 100 to join on.
Running SQE or the Classic Query Engine (CQE)? Presumably the
latter; read the debug messages for a description about the
implementation and what AccPth(s) were used, or not, and if not, then
for what reason(s) they were unused for implementation of the query.
Is it possible to convince SQL to read sequentially thru the tiny
fileFIX with 100 records (instead of the giant file)?
Probably could, irrespective of the engine being used. But given the
key-matching request, the likely scenario is that the AccPth will be
utilized; there would be little reason to build an access path over the
sequentially read data given one already exists. And reading via the
keyed access path for such a small number of rows is not really an
issue, especially given index-only-access should be possible [but
possibly not done if the overall key size is much greater than the one
column]. Seems the long-running query is a side effect of failing to
use an index on the larger file; i.e. what the query does with the
smaller file is probably immaterial.
The following SQL ran for an hour - if I could force it to read the
fileFIX first, it should only take seconds.
Thanks in advance!
select * from filePROD
( select docid
where filePROD.docid = fileFIX.docid
If the query ran that long, seems FILEPROD is getting read
sequentially.? Again, the debug messages would assist. Do the data
type and lengths actually match across the columns and the access paths?
Not that there should be any negative consequence, but given the
subquery requires no data, selecting a literal versus a column is best;
the optimizer should understand that, and implement appropriately, but
no reason to give any false requirements.
A clause asking to OPTIMIZE FOR 10 ROWS should encourage the use of
an index over sequential access. As well might an ORDER BY
filePROD.docid such that the matching Sort Sequence between the request
and the access path could recognize the indexed data selected would
already be in the required order for the result set.