Actually, soundex is supported directly in SQL. For performance reasons, you'd want to create a "Search Key" file that holds the soundex values. I'd probably add a few fields to describe the file, field, and ordinal value, so that this one index can look into many files and fields.

Looking for John Smith? Is that JOHN or JON? SMITH or SMYTHE? Soundex or metaphone renders the word to a phonetic hash value. In soundex, JOHN is J500 (Metaphone: JN) and SMITH is S530 (SM0). I personally prefer metaphone since the output is generally human readable, so I find it easier to work with.

I translated my Metaphone procedure from Pascal (re-published in Joe Celko's "SQL for Smarties") into RPG, then later into a procedure.

      * METAPHONE-------------------------
     D Metaphone       PR             6a
     D  WordIn                      128a   Value

     P Metaphone       B
      * Procedure Interface -----------------------------------
     D*........................................................
     D* Metaphone - An implementation adapted from source code
     D*   published in Joe Celko's "SQL for smarties".  Source
     D*   translated to RPGIV from Pascal implementation written
     D*   by Terry Smithwick (1991).
     D*
     D* Usage: Metaphone( WordIn Char 128 ) returns: Char 6
     D*
     D*   Eval $MetaCode = Metaphone($CityName)
     D*        {KLFLNT}              {CLEVELAND}
     D*........................................................
     D Metaphone       PI             6a
     D  WordIn                      128a   Value
      * Local data --------------------------------------------
     D Soundex         S              6a
     D i               S              5i 0
     D l               S              5i 0
     D n               S              5i 0
     D silent          S               n
     D new             S               n
     D last            S              1a
     D this            S              1a
     D next            S              1a
     D nnext           S              1a
     D m               S              6a
     D d               S              2a
      *
     D VowelData       DS
     D                                1a   Inz('A')
     D                                1a   Inz('E')
     D                                1a   Inz('I')
     D                                1a   Inz('O')
     D                                1a   Inz('U')
     D  VowelSet                      1a   Dim(5) Overlay(VowelData)
      *
     D FrontVData      DS
     D                                1a   Inz('E')
     D                                1a   Inz('I')
     D                                1a   Inz('Y')
     D  FrontVSet                     1a   Dim(3) Overlay(FrontVData)
      *
     D VarSonData      DS                                                       Variable Sound
     D                                1a   Inz('C')
     D                                1a   Inz('S')
     D                                1a   Inz('T')
     D                                1a   Inz('G')
     D  VarSonSet                     1a   Dim(4) Overlay(VarSonData)
      * Begin Metaphone Soundex Function ----------------------
     C                   If        WordIn = ''
     C                   Return    ' '
     C                   EndIf
     C* If first 2 letters are in the list of two-letter consonants, drop the leading character.
     C                   Eval      d = %subst(WordIn:1:2)
     C                   If        d = 'KN' or
     C                             d = 'GN' or
     C                             d = 'PN' or
     C                             d = 'PF' or
     C                             d = 'AE' or
     C                             d = 'WR'
     C                   Eval      WordIn = %subst(WordIn:2:%len(WordIn)-1)
     C                   EndIf
     C* If first letter is X, change it to S.
     C                   If        %subst(WordIn:1:1)='X'
     C                   Eval      %subst(WordIn:1:1)='S'
     C                   EndIf
     C* If first letters are WH, change it to W.
     C                   If        d = 'WH'
     C                   Eval      WordIn = 'W'+%subst(WordIn:3:%len(WordIn)-2)
     C                   EndIf
     C
     C* Initialize values for looping
     C                   Eval      l = %len(%trim(WordIn))
     C                   Eval      m = ''
     C                   Eval      new = *On
     C                   Eval      n = 1
     C
     C                   DoW       %len(%trim(m)) <= 6 and n <= l
     C
     C                   If        n > 1
     C                   Eval      last = %subst(WordIn:n-1:1)
     C                   Else
     C                   Eval      last = x'00'
     C                   EndIf
     C
     C                   Eval      this = %subst(WordIn:n:1)
     C
     C                   If        n < l
     C                   Eval      next = %subst(WordIn:n+1:1)
     C                   Else
     C                   Eval      next = x'00'
     C                   EndIf
     C
     C                   If        n+1 < l
     C                   Eval      nnext = %subst(WordIn:n+2:1)
     C                   Else
     C                   Eval      nnext = x'00'
     C                   EndIf
     C
     C                   If        n > 1 and (this=last) and this<>'C'
     C                   Eval      n = n + 1
     C                   Iter
     C                   EndIf
     C
     C                   If        new
     C
     C     this          Lookup    VowelSet                               10
     C                   If        *In10 and n = 1
     C                   Eval      m = this
     C                   EndIf
     C
     C                   Select
     C                   When      this = 'B'                                   --------------B
     C                   If        not ((n=l) and last='M')                     -MB is silent
     C                   Eval      m = %trim(m) + 'B'
     C                   EndIf
     C
     C                   When      this = 'C'                                   --------------C
     C     next          Lookup    FrontVSet                              10
     C                   If        not ((last='S') and *In10)                   -SC(E,I,Y) silent
     C
     C                   If        next = 'I' and nnext = 'A'                   -CIA- = X
     C                   Eval      m = %trim(m) + 'X'
     C                   Else
     C                   If        *In10
     C                   Eval      m = %trim(m) + 'S'                           -c(e,i,y) = S
     C                   Else
     C
     C                   If        next = 'H' and last = 'S'                    -SCH
     C                   Eval      m = %trim(m) + 'K'                           CH = K
     C                   Else
     C                   If        next = 'H'                                   -CH-
     C     nnext         Lookup    VowelSet                               10
     C                   If        n=1 and n+2<=l and not *In10
     C                   Eval      m = %trim(m) + 'K'
     C                   Else
     C                   Eval      m = %trim(m) + 'X'
     C                   EndIf
     C                   Else
     C                   Eval      m = %trim(m) + 'K'                           C = K
     C                   EndIf
     C                   EndIf
     C                   EndIf
     C                   EndIf
     C                   EndIf
     C
     C                   When      this = 'D'                                   --------------D
     C     nnext         Lookup    FrontVSet                              10
     C                   If        next = 'G' and *In10
     C                   Eval      m = %trim(m) + 'J'
     C                   Else
     C                   Eval      m = %trim(m) + 'T'
     C                   EndIf
     C
     C                   When      this = 'G'                                   --------------G
     C     nnext         Lookup    VowelSet                               10
     C                   Eval      silent = (next='H') and *In10                -GHx
     C                   If        (n>1) and (((n+1)=l) or
     C                             ((next='N') and (nnext='E') and              -GNED
     C                             (%subst(WordIn:n+3:1)='D') and
     C                             ((n+3)=l)) or (last='I') and                 -IGN
     C                             (next='N'))
     C                   Eval      silent = *On
     C                   EndIf
     C     next          Lookup    FrontVSet                              10
     C                   If        (n>1) and (last='D') and *In10               -DG(E,I,Y)
     C                   Eval      silent = *On
     C                   EndIf
     C
     C                   If        not silent
     C     next          Lookup    FrontVSet                              10
     C                   If        *In10
     C                   Eval      m = %trim(m) + 'J'
     C                   Else
     C                   Eval      m = %trim(m) + 'K'
     C                   EndIf
     C                   EndIf
     C
     C                   When      this = 'H'                                   --------------H
     C     last          Lookup    VarSonSet                              10
     C     next          Lookup    VowelSet                               11
     C                   If        not ((n=l) or *In10) and *In11
     C                   Eval      m = %trim(m) + 'H'
     C                   EndIf
     C
     C                   When      this = 'F' or                                --------------F
     C                             this = 'J' or                                              J
     C                             this = 'L' or                                              L
     C                             this = 'M' or                                              M
     C                             this = 'N' or                                              N
     C                             this = 'R'                                                 R
     C                   Eval      m = %trim(m) + this
     C
     C                   When      this = 'K'                                   --------------K
     C                   If        last <> 'C'
     C                   Eval      m = %trim(m) + 'K'
     C                   EndIf
     C
     C                   When      this = 'P'                                   --------------P
     C                   If        next = 'H'
     C                   Eval      m = %trim(m) + 'F'
     C                   Add       1             n
     C                   Else
     C                   Eval      m = %trim(m) + 'P'
     C                   EndIf
     C
     C                   When      this = 'Q'                                   --------------Q
     C                   Eval      m = %trim(m) + 'K'
     C
     C                   When      this = 'S'                                   --------------S
     C                   If        (next = 'H') or
     C                             ((n>1) and (next='I') and
     C                             (nnext='O' or nnext='A'))
     C                   Eval      m = %trim(m) + 'X'
     C                   Else
     C                   Eval      m = %trim(m) + 'S'
     C                   EndIf
     C
     C                   When      this = 'T'                                   --------------T
     C                   If        (n=1) and (next='H') and (nnext='O')         THO-
     C                   Eval      m = %trim(m) + 'T'
     C                   Else
     C                   If        (n>1) and (next='I') and                     -TI(O,A)-
     C                             (nnext='O' or nnext='A')
     C                   Eval      m = %trim(m) + 'X'
     C                   Else
     C                   If        (next='H')                                   -TH- = 0
     C                   Eval      m = %trim(m) + '0'
     C                   Else
     C                   If        not ((next='C') and (nnext='H'))             -TCH
     C                   Eval      m = %trim(m) + 'T'
     C                   EndIf
     C                   EndIf
     C                   EndIf
     C                   EndIf
     C
     C                   When      this = 'V'                                   --------------V
     C                   Eval      m = %trim(m) + 'F'
     C
     C                   When      this = 'W' or                                --------------W
     C                             this = 'Y'                                                 Y
     C     next          Lookup    VowelSet                               10
     C                   If        *In10
     C                   Eval      m = %trim(m) + this
     C                   EndIf
     C
     C                   When      this = 'X'                                   --------------X
     C                   Eval      m = %trim(m) + 'KS'
     C
     C                   When      this = 'Z'                                   --------------Z
     C                   Eval      m = %trim(m) + 'S'
     C
     C                   EndSL
     C
     C                   EndIf
     C                   Eval      n = n + 1
     C                   EndDo
     C                   Return    m
     P Metaphone       E

Eric DeLong
Sally Beauty Company
MIS-Project Manager (BSG)
940-898-7863 or ext. 1863

-----Original Message-----
From: Booth Martin [mailto:booth@xxxxxxxxxxxx]
Sent: Friday, September 10, 2004 12:46 PM
To: Midrange Systems Technical Discussion
Subject: RE: fast search or scan for words in selected field

Well, how about this scenario then? Would this be a successful alternative?

Could a file of the search words be made, as is presently being done?
Then use SQL to search that base? An advantage of SQL then would be to allow searches on matching two or more of the keywords?

But the discussions on Soundex, etc. seem like a real winner though. Where would we learn more about that?

---------------------------------
Booth Martin
http://www.martinvt.com
---------------------------------

-------Original Message-------
From: Midrange Systems Technical Discussion
Date: 09/10/04 12:34:46
To: Midrange Systems Technical Discussion
Subject: RE: fast search or scan for words in selected field

I'm with Joe on this one. I can't see how indices would help with substring searches. Then again, just because I can't comprehend it doesn't mean it wouldn't.

Rob Berendt
--
Group Dekko Services, LLC
Dept 01.073
PO Box 2000
Dock 108
6928N 400E
Kendallville, IN 46755
http://www.dekko.com

--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.
This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].