|
The original link was from Midrange Guru at: http://www.midrangeserver.com/mgo/mgo083002-story02.html David Morris >>> mallen@xxxxxxxx 7/18/2003 11:03:44 AM >>> See below, I had saved it but can't find original link A Fuzzy Search Algorithm Most inquiry programs require a user to enter the exact value or exact beginning portion of a value to be located. Unfortunately such programs do not find the desired data if the user doesn't remember the exact spelling of the search term. I have an algorithm that allows searching by any portion of a database field, even if the user misspells the search argument. I assign a matching rate based on the difference in the number of positions. If a certain character is found in the same position of both the search string and the database field, I assign a matching rate of 100. If the two are one position away from each other, the rate is 90. Here is the complete scale: Absolute difference between character positions Matching rate 0 100 1 90 2 80 3 70 4 60 5 50 6 40 7 30 8 20 9 10 10 or more 0 Consider, for example, a file that consists of two fields: a part number and part description. This file contains three records, as per this example: Part Number Part Description 111 WIFE 222 STOP 333 KNIFE These are the calculated matching rates. WIFE 200 STOP 0 KNIFE 300 -- Victor Pisman Victor's source code is shown below. It consists of three objects--an input item master file, an output file to hold the results of the search, and an RPG program to perform the search. -- Ted Output file IMOUT A REF(IMIN) A R FM$OUT A XXITNO R REFFLD(IMITNO) A XXDESC R REFFLD(IMDESC) A XXRATE 3 0 A K XXRATE <font face='times new roman',serif">Program IM * Fuzzy search algorithm * written by Victor Pisman * FIMIN IF E DISK FIMOUT O E DISK A * E AAA 15 1 E BBB 15 3 0 * I 'ABCDEFGHIJKLMNOPQRST-C UC I 'UVWXYZ' * I 'abcdefghijklmnopqrst-C LC I 'uvwxyz' * C *ENTRY PLIST C PARM ##PARM 15 * C LC:UC XLATE##PARM ##PARM * C READ IMIN 99 C *IN99 DOWEQ*OFF C EXSR SEARCH C Z-ADDTTRATE XXRATE C MOVELIMITNO XXITNO C MOVELIMDESC XXDESC C WRITEFM$OUT C READ IMIN 99 C ENDDO * C MOVE *ON *INLR C*********** C SEARCH BEGSR *********** * Get the length of the search argument C MOVEL##PARM ENTRY 15 P C ' ' CHEKRENTRY L 20 * FILL THE ARRAY AAA WITH THE INPUT FIELD VALUE. C CLEARAAA C Z-ADD1 I 50 C 1 DO L I C 1 SUBSTENTRY:I AAA,I C ENDDO * * FILL THE ARRAY BBB. C CLEARBBB C Z-ADD*ZERO BBBTOT C *LIKE DEFN BBB BBBTOT+ 1 C Z-ADD1 I * * CHECK FOR EXACT MATCH FIRST. C ENTRY:L SCAN IMDESC:1 50 C *IN50 IFEQ *OFF NO EXACT MATCH * C 1 DO L I C LC:UC XLATEIMDESC XXNAME C *LIKE DEFN IMDESC XXNAME C AAA,I SCAN XXNAME RATE 30 50 C *IN50 IFEQ *ON C I SUB RATE BBB,I C ADD BBB,I BBBTOT C ELSE C Z-ADD99 BBB,I C ENDIF C ENDDO * C ENDIF EXACT MATCH * * CALCULATE THE NUMBER OF LEADING BLANKS X IN THE INPUT FIELD. C L IFGT *ZERO C BBBTOT DIV L X 30H C ENDIF * * CALCULATE THE ABSOLUTE VALUE OF INDIVIDUAL DIFFERENCES. C Z-ADD1 I C 1 DO L I C BBB,I IFNE 99 C SUB X BBB,I C ENDIF C BBB,I IFLT *ZERO C MULT -1 BBB,I C ENDIF * C BBB,I IFLT 10 C MULT -10 BBB,I C ADD 100 BBB,I C ELSE C Z-ADD*ZERO BBB,I C ENDIF * C ENDDO * CALCULATE TOTAL MATCHING RATE C XFOOTBBB TTRATE 50 * C ENDSR -----Original Message----- From: midrange-l-bounces@xxxxxxxxxxxx [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of jamesl@xxxxxxxxxxx Sent: Friday, July 18, 2003 12:35 PM To: midrange-l@xxxxxxxxxxxx Subject: Fuzzy string matching? Would anybody happen to know where I might find a "fuzzy" string match algorithm that either is, or can be, implemented in ILE C or ILE RPG? Something that could recognize slightly misspelled words? -- J.Lampert
As an Amazon Associate we earn from qualifying purchases.
This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
Operating expenses for this site are earned using the Amazon Associate program and Google Adsense.