fuzzy
Synopsis
cheia.load "fuzzy"
match_strength = fuzzy.match("text to search", "text to find")
Description
This module provides a function for performing fuzzy substring
matching on a text string. It works by determining the percentage of
the ‘n-grams’ (consecutive substrings of length n, for
certain values of n) of the search term which are found in the
text to search.
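The idea can be sketched in a few lines of Lua. This is an illustration only, not the module's implementation: the names ngram_match and count_ngrams are hypothetical, and the choice of n-gram lengths 2 and 3 is an assumption, so the scores it produces will generally differ from those of fuzzy.match.

```lua
-- Illustrative sketch only: NOT the fuzzy module's actual code.
-- Assumption: n-gram lengths 2 and 3 by default; the real module may
-- use other lengths, so its scores will generally differ.

-- Count how many n-grams of `term` (of length n) occur anywhere in `body`.
-- Returns two numbers: n-grams found, n-grams examined.
local function count_ngrams(body, term, n)
  local found, total = 0, 0
  for i = 1, string.len(term) - n + 1 do
    local gram = string.sub(term, i, i + n - 1)
    total = total + 1
    -- plain = true turns off pattern matching, so characters such as
    -- "%" or "." inside the n-gram are compared literally
    if string.find(body, gram, 1, true) then
      found = found + 1
    end
  end
  return found, total
end

-- Percentage of the term's n-grams (n = min_n .. max_n) found in body.
function ngram_match(body, term, min_n, max_n)
  min_n = min_n or 2
  max_n = max_n or 3
  local found, total = 0, 0
  for n = min_n, max_n do
    local f, t = count_ngrams(body, term, n)
    found = found + f
    total = total + t
  end
  if total == 0 then
    return 0  -- term is shorter than the smallest n-gram
  end
  return 100 * found / total
end

print(ngram_match("full moon", "moon"))     -- every n-gram of "moon" occurs
print(ngram_match("full moon", "garbage"))  -- no n-gram of "garbage" occurs
```

Because only the n-grams of the term are counted, the score is not symmetric: swapping the two arguments generally gives a different result.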
Examples
We can see that the match strength decreases as the term diverges from
the original.
print(fuzzy.match("full moon", "moon"))
» 100
print(fuzzy.match("full moon", "moone"))
» 70.588233947754
print(fuzzy.match("full moon", "fll mon"))
» 50
print(fuzzy.match("full moon", "flul moon"))
» 50
print(fuzzy.match("full moon", "monday"))
» 18.181818008423
print(fuzzy.match("full moon", "garbage"))
» 0
But note this case:
print(fuzzy.match("full moon", "Moon"))
» 58.333332061768
fuzzy.match is case-sensitive, so you will usually want to apply
string.lower to both inputs first to remove case distinctions.
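One way to do this is with a small wrapper. The helper below is hypothetical and not part of the module; it takes the match function as an argument so that it works with fuzzy.match or with any function of the same shape.

```lua
-- Hypothetical helper (not part of the fuzzy module): lower-cases both
-- inputs before passing them to a match function such as fuzzy.match.
function match_ignore_case(match_fn, body, term)
  return match_fn(string.lower(body), string.lower(term))
end

-- Usage, assuming the module has been loaded with cheia.load "fuzzy":
--   print(match_ignore_case(fuzzy.match, "full moon", "Moon"))
```

Called this way, "Moon" is compared as "moon", so the result should equal that of the lower-case "moon" example above.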
Reference
fuzzy.match(body, term)
    Performs a fuzzy search for term within body and returns the
    percentage match strength.
    Parameters:
        body : string
            The body of text to search.
        term : string
            The term to search for.
    Returns:
        number
            The match strength as a percentage.
Issues
The function makes no attempt to localise matches: n-grams of the term
are counted wherever they occur in the body. In practice it is therefore
likely to be useful only for comparing strings of similar length.
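A possible workaround for longer bodies, sketched here under assumptions and not provided by the module, is to score fixed-width windows of the body and keep the best one. The name best_window_match and the padding parameter are hypothetical; the match function is passed in, so fuzzy.match can be supplied once the module is loaded.

```lua
-- Hypothetical workaround (not part of the fuzzy module): slide a window
-- roughly the size of the term across the body and keep the best score.
-- `padding` (an assumption, default 2) widens the window so that terms
-- with a few extra characters still fit.
function best_window_match(match_fn, body, term, padding)
  padding = padding or 2
  local width = string.len(term) + padding
  local body_len = string.len(body)
  if body_len <= width then
    -- body is already about the size of the term: compare directly
    return match_fn(body, term), 1
  end
  local best, best_pos = -1, 1
  for start = 1, body_len - width + 1 do
    local window = string.sub(body, start, start + width - 1)
    local score = match_fn(window, term)
    if score > best then
      best, best_pos = score, start  -- earliest window wins ties
    end
  end
  return best, best_pos
end

-- Usage, assuming the module has been loaded with cheia.load "fuzzy":
--   score, pos = best_window_match(fuzzy.match, "the full moon rises", "moon")
```

This calls the match function once per window, so the cost grows with the length of the body.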
See also
The c't article from which this module was derived,
http://www.heise.de/ct/english/97/04/386/.
Revision history
Added in LuaCheia 5.0.
Credits
Based on code from an article by Reinhard Rapp published in c't 4/97.
Adaptation for LuaCheia by Martin Spernau.
Documentation for LuaCheia by Jamie Webb.