× The internal search function is temporarily non-functional. The current search engine is no longer viable and we are researching alternatives.
As a stop gap measure, we are using Google's custom search engine service.
If you know of an easy to use, open source, search engine ... please contact support@midrange.com.



Thanks Andrew

Alan Shore
Programmer/Analyst, Direct Response
E:AShore@xxxxxxxx
P:(631) 200-5019
C:(631) 880-8640
"If you're going through Hell, keep going" - Winston Churchill


-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of ALopez@xxxxxxxxxx
Sent: Thursday, March 08, 2012 10:44 AM
To: midrange-l@xxxxxxxxxxxx
Subject: Re: Question about whether or not to create an Encoded Vector Index (EVI)

I have just started reading about Encoded Vector Index (or EVI) One of
the points that seems to be stressed is cardinality as far as the key
is concerned.
The states of the USA is good cardinality as there are only 50 of them.
Timestamps are bad, because you can have up to a kajillion of them.
How do you determine what is good and what is bad?
For instance, a date field (or at least a numeric field which
represents a date), would that be considered a bad choice for
cardinality.
Cardinality - I hope I have the correct word

I've created them over fields with a much higher number of values. If you look at the parameters for creation you will see that it can support
65,535 distinct values with a 4 byte code. Just be aware that the number of possible values allowed will be based on the number of distinct values in the file, unless you purposely force it to reserve the larger 4 byte code size. This is one reason it's handy to remove the EVIs before large updates to the underlying file--you risk exceeding the number of distinct values for which the EVI was created.

Even if you don't exceed that number, it is still better to remove and add back the EVIs when doing large updates. For a datamart we build on a nightly basis, it is MUCH faster to remove the EVIs, rebuild the datamart (a single, large table), then add back the EVIs than to update the file while the EVIs are present.

As far as your date field goes, will you have more than 65,535 possible values? I know what the standard IBM recommendation based on 50 state codes, but with our little datamart we saw considerable speed improvement when we added an EVI over the zip codes used in our sales history. Way more than 50 fields, but still less than the 65k number.


Andrew Lopez
Systems Analyst



Phone:  803-714-2037


Email: ALopez@xxxxxxxxxx
Please consider the environment before printing this e-mail.


This message and any attachments should be treated as proprietary to the sender and confidential to the identified recipients and should not be disclosed to or used by anyone other than the intended recipient unless pre arranged with the sender. If you are not an addressee of this communication, have received this e-mail by transmission error of the sender, recipient or due to another originator by an error in transmission, you are hereby notified that any disclosure, copying, use, distribution, or taking of any action in reliance on the contents of this information is prohibited. In any such event, please notify the sender immediately by contacting Spirax Sarco Inc., 803 714 2000 or reply to this e-mail and then delete it from your system. Spirax Sarco Inc.
accepts no responsibility for software viruses and all recipients should check for viruses before opening any attachments.

Disclaimer: This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any errors or omissions in the contents of this message, which arise as a result of e-mail transmission. If verification is required please request a hard-copy version. Any views or opinions presented are solely those of the author and do not necessarily represent those of the company.

As an Amazon Associate we earn from qualifying purchases.

This thread ...

Replies:

Follow On AppleNews
Return to Archive home page | Return to MIDRANGE.COM home page

This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].

Operating expenses for this site are earned using the Amazon Associate program and Google Adsense.