|
See if EVI's make a difference. It appears that you have a finite number
of each type of thing, some would have to be built with 4 byte EVI's, but
City Zip, and State could definitely be built with smaller EVI's, and
possibly that would make a difference.
Mark Murphy
STAR BASE Consulting, Inc.
mmurphy@xxxxxxxxxxxxxxx
From: Charles Wilt <charles.wilt@xxxxxxxxx>
To: Midrange Systems Technical Discussion <midrange-l@xxxxxxxxxxxx>
Date: 04/30/2010 10:34 AM
Subject: DB Schema suitable for searching
Sent by: midrange-l-bounces@xxxxxxxxxxxx
All,
I've got an application that's design to search open invoices in order
to allow users to match a payment to an invoice, when there's no
remittance document included.
The users will search parts of the address, name, amount, ect.
Currently, the app builds a special table daily of customer/contract
data for just those customers who have open invoices. Here's the DDS:
A R FARSRCHP TEXT( 'AR Search.Physical table' )
A ARSKEY 10P TEXT( 'AR Search Key' )
A LOCNBR 5A TEXT( 'Location Number' )
A SENTID 5P TEXT( 'Search Entity ID' )
A SENTTP 1A TEXT( 'Search Entity Type' )
A SENTNM 25A TEXT( 'Search Entity Name' )
A ADDR1 25A TEXT( 'Address Line 1' )
A ADDR2 25A TEXT( 'Address Line 2' )
A ACITY 25A TEXT( 'City' )
A STPR 3A TEXT( 'State-Province' )
A ZIP 9A TEXT( 'Zip' )
A PHONE 17P TEXT( 'Phone' )
A INVOIC 6P TEXT( 'Invoice Number' )
A INVDAT L TEXT( 'Invoice Date' )
A INVAMT 9P 2 TEXT( 'Invoice Amount' )
A AMTDUE 9P 2 TEXT( 'Invoice remaining amt due' )
ARSKEY - AR Search Key is just a surrogate primary key.
SENTTP - Search Entity Type tells me if the SENTID and the name and
address in this record is a
Customer Delivery Address, Customer Mailing,
Contract Address, Billing Address
SENTID - Search Entity ID is the customer or contract number
This means every open invoice appears in this table 4 times, once for
each SENTTP.
The app uses SQL to search for possible matches given the criteria
entered by the user. Here's some example statements:
Select * from ARSRCHP Where SENTNM like ? Order by INVDAT desc, INVOIC,
SENTTP
Select * from ARSRCHP Where INVOIC = ? Order by INVDAT desc, INVOIC,
SENTTP
Select * from ARSRCHP Where INVOIC = ? and (abs(INVAMT - ?) < ?
or abs(AMTDUE - ?) < ?)
Order by INVDAT desc, INVOIC, SENTTP
Select * from ARSRCHP Where SENTNM like ? and INVOIC = ? Order by
INVDAT desc, INVOIC, SENTTP
Select * from ARSRCHP Where ZIP like ? Order by INVDAT desc, INVOIC,
SENTTP
Select * from ARSRCHP Where digits(PHONE) like ? Order by INVDAT desc,
INVOIC, SENTTP
Select * from ARSRCHP Where (ADDR1 like ? or ADDR2 like ?) Order by
INVDAT desc, INVOIC, SENTTP
This app is having performance issues, specifically resource use.
So I got called in to take a look.
First off, there were no indexes, so I looked at the advised indexes
and the SQL plan cache and created some that I though would be useful;
and those indexes have been useful, but not as much as would have
liked. Doing some more digging, it seems to me that the
design/implimentation of the ARSRCHP table is part of the problem.
There's a lot of duplicated data inside it, some numbers:
3,045,000 records in ARSRCHP
1,946,000 distinct records (excluding ARSKEY, SENTID, SENTTP)
761,000 Distinct Invoices
336,000 Distinct ADDR1
243,470 Distinct Phone #
232,000 Distinct Names
20,000 Distinct Zip Codes
12,000 Distinct Cities
Given the use of LIKE in the queries, I believe that means that lots
of full index scans (best case) or full table scans (worse case) being
done that are looking at the same value multiple times.
Now, I do recall some initial discussions between myself and the
developer where we discussed building the ARSRCHP table or just
searching over our normal invoice, customer, contract, and address
tables. I didn't initially feel that ARSRCHP was necessary, but he
felt having ARSRCHP would be better due to the size of our actual
tables. I went along with that and I still believe he's probably
correct.
However, from looking at the production data and the production
queries, I'm thinking that instead of a single flattened ARSRCHP
table, he should have used basically a star schema like you'd use in a
data warehouse.
I believe a Star Schema would provide better performance for the
searches our users are doing.
Does any have any experience with a similar app or any thoughts or
comments?
Thanks in advance,
Charles Wilt
--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing
list
To post a message email: MIDRANGE-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.
--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.
As an Amazon Associate we earn from qualifying purchases.
This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
Operating expenses for this site are earned using the Amazon Associate program and Google Adsense.