MIDRANGE dot COM Mailing List Archive



Home » MIDRANGE-L » September 2013

Re: SQL question



fixed

On 09 Sep 2013 09:19, Buck Calabro wrote:
<<SNIP>>

SELECT odobnm, odldat, odltim, 0 as total
FROM myfile
where odldat = 090713 or odldat = 090813
union
select 'TOTAL', 9999999, 0, count(*) as total
from myfile
where odldat = 090713 or odldat = 090813
order by 2

A few comments about such a query; merely for edification:

Using a derived table expression in the form of a CTE limits the specification of the predicates in the WHERE clause to just that one subquery. Mostly beneficial for maintenance, if for example someone changes the query statement source [for what is selected], they need to make the change in just one place. That also allows for limiting the number of [duplicated] parameter markers if the query gets converted to use them.

The ODLDAT is CHAR(6). The values of that column can be blank. The data in that second column of the result-set would be implicitly cast to numeric due to the literal value 9999999 in the second subquery. While the selection [theoretically] ensures the values are not blank, one can not be sure that all selection will always occur before evaluating any other expressions\effects of the query, and mapping a blank to a numeric value is an error. The third column of the result-set will also be affected by an implicit cast to numeric per use of the numeric literal value zero being selected in the second subquery. While the ODLTIM is also CHAR(6), and can also be blanks, the value should never be blank when ODLDAT is non-blank. By reason of that same casting effect, the second and third columns will appear as numeric values in reporting; the character string 'MMDDYY' is probably preferable for readability, such that the non-delimited value "90,813" is probably more difficult to be inferred\read as an apparent date value as contrasted with ability of the reader to see the non-delimited value "090813" as an apparent date value. Thus the second subquery would best use character literals vs numeric literals for the second and third columns.

However, instead of using character literal values for those columns of the result set, probably better to use the NULL value cast to the expected or otherwise compatible character type. The database NULL value collates higher than any possible character string value [more so than a string of all digits-nine], thus the ORDER BY is still satisfied.

The literal numeric zero in the first subquery is non-essential in the appearance of the report, esp. given the first column is borrowed for presentation of a label [identifying the row as a summary row], so it could be omitted from the results or given a non-value. Three possibilities I think would improve the above query:
- have the first query select the NULL value cast to a numeric to be compatible with the numeric aggregate in the second subquery. Because the detail rows have no /total/ per se, having the value zero is a bit deceptive, and might better be the NULL value.
- select an empty string for the /total/ column and select the COUNT aggregate cast to character in the second subquery; e.g. char(count(*)), although included as a column, a right-justified value might be more preferable.
- borrow the time column [like the name column was borrowed to provide a label] for the use of the /total/ by casting the COUNT aggregate result to a character result consistent or otherwise compatible with the ODLTIM column; e.g. char(count(*))

The IN predicate may be a better choice [subjective; I am surprised though, that nobody ever mentioned it], over using two equal predicates with the OR logical operator.

While we /know/ the contrived row is distinct, the query would best have used UNION ALL vs UNION [UNION is synonymous with the UNION DISTINCT]. If there were really a row matching the contrived row [effectively impossible per use of the COUNT aggregate], then a row would be /lost/ due to distinct processing. However the real issue for the choice of UNION in that query, is performance; i.e. there is likely no desire to effect distinct processing, so explicitly coding UNION ALL ensures no distinct processing occurs.






Return to Archive home page | Return to MIDRANGE.COM home page

This mailing list archive is Copyright 1997-2014 by MIDRANGE dot COM and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available here. If you have questions about this, please contact