On 26-Jul-2010 08:10, Loyd Goodbar wrote:
I created an index in our test environment to support a SELECT
DISTINCT query. The file has 4 million rows in test, and 8
million rows in production. The file is DDS-defined.
  A lot of rows for DISTINCT processing, but how many unique values 
across that index?  Generally there is little value, often a greater 
cost, in using an INDEX which is random I\O, when sequential I\O can 
be done instead much more efficiently over larger data sets.
The index is in test, but Visual Explain says a table scan will
be performed, with the reason "No indexes exist".
  Reported "No index" was found, or "No worthwhile index" was 
found, where the latter intends to suggest no index compatible with 
the goal of optimizing the query was found.  Then each index not 
valuable for achieving a better implementation [for optimized 
access] should have been disregarded for a stated reason.
The query asks for the same columns as the index, and in the same
order. DSPDBR reports the index is associated with the base file.
I'm using the iNav "Run SQL Scripts".
  Hmmm.  One would think an "index only" implementation might 
eliminate any preference of sequential over random, given either no 
ORDER BY or run-time collation request that matches the INDEX.
Here is the index and query:
  create index prod_mod.iml10026 on prod_mod.imp1001
    (i1azcd asc
   , i1date asc
   , i1time asc
   , i1badt asc);
  select distinct
     i1azcd as location
, i1date as observationdate
   , i1time as observationtime
   , i1badt as costdate
  from prod_mod.imp1001;
The query will be utilized in embedded SQL in an RPGLE program.
Is there anything I can do to influence how the query performs,
to say "use this index"?  This would be similar to SQL Server
"hints".
  I agree the INDEX seems good for the query; for lack of an ORDER 
BY, an index is as good as [actually, for index-only, better than] 
sequential.
  Some possible hints are to:  Use an ORDER BY where the run-time 
collation matches those established in the INDEX.  Adjust the "Allow 
copy data" setting, and\or specify the desired OPTIMIZE clause.
 Is this a CQE vs. SQE issue?
  I believe DISTINCT processing implementations for the two query 
engines are vastly different, so that probably is a very important 
issue.  Did VE say which?
Are there any gotchas between running the query in Run SQL
Scripts versus an embedded SQLRPGLE, other than naming convention
and date/time formats? All is on V5R4.
  The client connection is with an ASCII CCSID, but I do not think 
that is an issue.  I believe that the query runs on the server in 
EBCDIC, and then only translates into ASCII, what data goes back to 
the client.  That is, I do not believe DISTINCT has any special 
meaning that would require translation of the data to determine if 
the row data was actually distinct according to the target CCSID; 
something that would be relevant only if multiple EBCDIC character 
values translated into the same ASCII character value, but depending 
on design, the effect could be considered a presentation issue 
rather than a functional issue.
  The interactive signon effects a subsystem routing program and 
initial program that may cause the interactive environment to be 
different than work entering the system by another means.  Also the 
defaults for some run-time optimization variations may be unique to 
the chosen environment.
  The defaults for OPTIMIZE and "allow copy data" may be different 
between the embedded and from iNav.  This is similar to how STRSQL 
defaults to something like OPTIMIZE FOR FIRST 20 ROWS for output to 
the display, but embedded or other interfaces might default to 
OPTIMIZE FOR ALL ROWS.  Testing in any environment to verify the 
statement gives the expected output is safer than testing to verify 
that the optimization of the statement is as desired; i.e. the 
expected output remains the same irrespective of optimization, but 
the optimization is specific to whatever is the run-time environment.
Regards, Chuck
As an Amazon Associate we earn from qualifying purchases.