On 15-Oct-2015 15:28 -0500, Art Duarte wrote:
I need help with improving/simplifying this SQL statement:
SELECT
substr(idept,1,1) as lineSeq, dpdesc, min(dhloc) as location
, sum(dhsell*dhqty) + Sum(aSell * aQty) as Sales_Amt
, sum(dhcost*dhqty) + Sum(aCost * aQty) as Cost_Amt
, (cast( ( sum(dhsell*dhqty) + Sum(aSell * aQty) )
- ( sum(dhcost*dhqty) + Sum(aCost * aQty) ) as double)
/ cast( sum(dhcost*dhqty) + Sum(aCost * aQty) as double)
) as Margin
from dbslsdlg left join KCX340D.dbprodpf on dhino = ino
Left Join DBSLSDXL1 b on aINo = ino
Left join dbdeptpf on substr(idept,1,1) = deptno
where dhdate>= 20150701 and dhdate<= 20150730
AND aDate>= 20150701 and aDate<= 20150730
group by substr(idept,1,1), dpdesc order by 1, dpdesc;
Appreciate your help!
Consider, that one concern for improvement and\or simplification may
be achieved with the ability to more easily read and understand what the
query does. The first step might be to reformat the query to make the
SELECT query look /prettier/ for readability; e.g. the following might
be a tad clearer than the given [though I already reformatted that
quoted query, a bit, to make the request fit in the confines of my email
client]?:
SELECT
substr(idept, 1, 1) as lineSeq
, dpdesc
, min(dhloc) as location
, sum(dhsell * dhqty) + Sum(aSell * aQty) as Sales_Amt
, sum(dhcost * dhqty) + Sum(aCost * aQty) as Cost_Amt
, ( cast( ( sum(dhsell*dhqty) + Sum(aSell * aQty) )
- ( sum(dhcost*dhqty) + Sum(aCost * aQty) )
as double)
/ cast( sum(dhcost*dhqty) + Sum(aCost * aQty) as double)
) as Margin
from dbslsdlg
left join KCX340D.dbprodpf
on dhino = ino
Left Join DBSLSDXL1 b
on aINo = ino
Left join dbdeptpf
on substr(idept, 1, 1) = deptno
where dhdate>= 20150701
and dhdate<= 20150730
AND aDate>= 20150701
and aDate<= 20150730
group by
substr(idept,1,1)
, dpdesc
order by
1
, dpdesc
;
A second step after beautifying\reformatting, would be to add
comments to the query.
A third recommendation for improvement would be to add a Correlation
name to each table-reference, and then, qualify the columns with their
respective correlation identifier. Or make the chosen correlation names
conspicuous, as they might relate to a common prefix of the column
names, and then just omit them as qualifiers, because the prefix
stands-out in place of the qualifier; e.g. if the naming of columns
dhdate and dhino suggest a common prefix, then name that table-reference
with an AS DH so all references to the columns prefixed with DH are
almost as conspicuous unqualified [dhDate] as they would be qualified
[DH.dhDate]. The recommendation is especially important given there is
a library qualifier on one table, because if the library name ever
changes, then the qualifiers change as well... which can be a pain if
any were specified elsewhere in the query.
Limited to only that information [as given, with no DDL nor
cardinality of the data], effectively the only thing that can be done
safely [to avoid possibly making incorrect assumptions] is to pare the
statement having used alternate syntax to effect what is already given
or by encapsulation of some of the earlier portion of the query into a
derived-table, into a VIEW [, or into a summary\materialized query
table]. Any legitimate feedback for /improvements/ are likely to be
limited by the ability of the reader to infer much about the query and
the data and the environment(s) the query will be run.
So for example, two limited ideas of reformulating and reformatting
the request [though untested]:
Perhaps using a Common Table Expression (CTE) as the derived-table:
with
sumCTE
( lineseq , dpdesc , location
, sales_amt, cost_amt, margin
) as
( select
substr(idept, 1, 1)
, dpdesc
, min(dhloc)
, sum(dhsell*dhqty) + Sum(aSell * aQty)
, sum(dhcost*dhqty) + Sum(aCost * aQty)
, ( double( ( sum(dhsell*dhqty) + Sum(aSell * aQty) )
- ( sum(dhcost*dhqty) + Sum(aCost * aQty) )
)
/ double( sum(dhcost*dhqty) + Sum(aCost * aQty) )
)
, dhdate
, adate
from dbslsdlg
left join KCX340D.dbprodpf
on dhino = ino
Left Join DBSLSDXL1 b
on aINo = ino
Left join dbdeptpf
on substr(idept, 1, 1) = deptno
where dhdate between 20150701 and 20150730
AND aDate between 20150701 and 20150730
)
SELECT
lineSeq
, dpdesc
, location
, Sales_Amt
, Cost_Amt
, Margin
from sumCTE
group by
lineseq
, dpdesc
order by
lineseq
, dpdesc
;
Or perhaps using a VIEW, though the constant values for the RANGE
predicates is probably best done with variables; perhaps encapsulated
instead as a User Defined Table Function (UDTF), or as stored procedure,
or as ??:
create view sumVIEW
( lineseq , dpdesc , location
, sales_amt, cost_amt, margin
) as
select
substr(idept, 1, 1)
, dpdesc
, min(dhloc)
, sum(dhsell * dhqty) + Sum(aSell * aQty)
, sum(dhcost * dhqty) + Sum(aCost * aQty)
, ( double( ( sum(dhsell*dhqty) + Sum(aSell * aQty) )
- ( sum(dhcost*dhqty) + Sum(aCost * aQty) )
)
/ double( sum(dhcost*dhqty) + Sum(aCost * aQty) )
)
from dbslsdlg
left join KCX340D.dbprodpf
on dhino = ino
Left Join DBSLSDXL1 b
on aINo = ino
Left join dbdeptpf
on substr(idept, 1, 1) = deptno
where dhdate between 20150701 and 20150730
AND aDate between 20150701 and 20150730
;
SELECT
lineSeq
, dpdesc
, location
, Sales_Amt
, Cost_Amt
, Margin
from sumVIEW
group by
lineseq
, dpdesc
order by
lineseq
, dpdesc
As an Amazon Associate we earn from qualifying purchases.