


Jeff Crosby wrote:

I have never come upon a complete foolproof set of rules to do this right.


Even humans can't do it right.  How can anyone decide whether it's
Macdonell or MacDonell?  Only the owner of the name knows for sure.

Converting mixed case to upper case is a many-to-one function, and it's
impossible to round-trip through a many-to-one function.  (A similar
issue comes up if you store only the last 6 digits of a date and try to
guess the first 2.  No matter how clever you try to be, you'll get it
wrong sometimes.  Birthdate of 99/12/23?  Must be 1999; that
107-year-old must be dead by now.)
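
To make the many-to-one point concrete, here is a minimal Python sketch.
The function names and the windowing rule (pick the most recent year
that is not in the future) are assumptions made up for illustration;
they are not taken from any particular system.

```python
from datetime import date


def to_upper(name: str) -> str:
    """Upper-casing discards the original capitalization."""
    return name.upper()


def guess_century(yy: int, mm: int, dd: int, today: date) -> date:
    """Hypothetical windowing rule: choose the most recent year ending
    in yy whose date is not after `today`."""
    year = today.year - today.year % 100 + yy
    if (year, mm, dd) > (today.year, today.month, today.day):
        year -= 100
    return date(year, mm, dd)


# Two different names collapse to the same upper-case string, so no
# routine can tell from the upper-case form which one was stored.
same = to_upper("Macdonell") == to_upper("MacDonell")

# The windowing rule always answers 1999 here, so a person actually
# born in 1899 is silently given the wrong birth year.
guessed = guess_century(99, 12, 23, date(2007, 1, 15))
```

Any other guessing rule just moves the errors around: the information
needed to get every case right is no longer in the stored value.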

The only way to get this right is not to lose the mixed-case form in the
first place.  If it's not feasible just to store the name in mixed case
and upper-case it for comparisons, you could store the mixed-case
version of the name somewhere else for the problem cases.  To determine
the problem cases, use a trivial conversion routine that lowercases the
name and then uppercases the first character of each word; if you don't
get correct round-tripping from mixed to upper back to mixed, you have
a problem case.  Then you could store the mixed-case version somewhere
else (add a varying-length field that is set to '' if the trivial
conversion works, or add a flag to say "look for mixed-case-name in
other file").  Probably only a tiny fraction will be problem cases.
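
The round-trip check described above could be sketched roughly like
this in Python.  The function names and the record layout (a dict with
name_upper and name_mixed keys) are hypothetical, chosen only for
illustration; the "trivial conversion" here splits words on spaces only.

```python
def trivial_mixed_case(upper_name: str) -> str:
    """Trivial conversion: first character of each space-separated
    word in upper case, the rest in lower case."""
    return " ".join(w[:1].upper() + w[1:].lower()
                    for w in upper_name.split(" "))


def is_problem_case(mixed_name: str) -> bool:
    """True if mixed -> upper -> mixed does not give the name back."""
    return trivial_mixed_case(mixed_name.upper()) != mixed_name


def make_record(mixed_name: str) -> dict:
    """Store the upper-case name always; store the mixed-case form
    only when the trivial conversion cannot reproduce it."""
    return {
        "name_upper": mixed_name.upper(),
        "name_mixed": mixed_name if is_problem_case(mixed_name) else "",
    }


def display_name(record: dict) -> str:
    """Use the stored mixed-case form if present, else convert."""
    return record["name_mixed"] or trivial_mixed_case(record["name_upper"])


names = ["Jeff Crosby", "Macdonell", "MacDonell", "O'Brien", "van Gogh"]
records = [make_record(n) for n in names]
restored = [display_name(r) for r in records]
```

With this layout, "Jeff Crosby" and "Macdonell" need no extra storage,
while "MacDonell", "O'Brien" and "van Gogh" are flagged as problem cases
and keep their original spelling in name_mixed.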



This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited.
