LuaCheia Reference Manual

fuzzy

Synopsis

cheia.load "fuzzy"

match_strength = fuzzy.match("text to search", "text to find")

Description

This module provides a single function, fuzzy.match, for fuzzy substring matching on a text string. It works by determining the percentage of the search term's 'n-grams' (consecutive substrings of length n, for certain values of n) that are also found in the text being searched.
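The idea can be illustrated with a minimal sketch in plain Lua. This is not the module's implementation: the function name ngram_match and the default n-gram lengths of 2 and 3 are assumptions made for illustration, and the real module may choose and weight its n-grams differently.

```lua
-- Hypothetical helper, not part of LuaCheia: a simplified n-gram matcher.
-- "sizes" is an assumed list of n-gram lengths; fuzzy.match may use others.
function ngram_match(body, term, sizes)
  sizes = sizes or {2, 3}
  local total, found = 0, 0
  for _, n in ipairs(sizes) do
    -- Slide a window of length n across the search term.
    for i = 1, string.len(term) - n + 1 do
      local gram = string.sub(term, i, i + n - 1)
      total = total + 1
      -- Plain (non-pattern) search, so characters such as "%" are literal.
      if string.find(body, gram, 1, true) then
        found = found + 1
      end
    end
  end
  -- A term shorter than every n-gram length yields no n-grams at all.
  if total == 0 then
    return 0
  end
  return 100 * found / total
end
```

Under this sketch, ngram_match("full moon", "moon") scores 100, because every 2-gram and 3-gram of "moon" occurs in "full moon". Other inputs will generally not reproduce the figures returned by fuzzy.match.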

Examples

The match strength decreases as the term diverges from the text being searched:

print(fuzzy.match("full moon", "moon")) » 100
print(fuzzy.match("full moon", "moone")) » 70.588233947754
print(fuzzy.match("full moon", "fll mon")) » 50
print(fuzzy.match("full moon", "flul moon")) » 50
print(fuzzy.match("full moon", "monday")) » 18.181818008423
print(fuzzy.match("full moon", "garbage")) » 0

But note this case:

print(fuzzy.match("full moon", "Moon")) » 58.333332061768

fuzzy.match is case-sensitive, so you will usually want to pass both inputs through string.lower first to remove case distinctions.
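One way to apply this is a small wrapper that lowercases both arguments before matching. The name match_nocase and its optional third parameter are hypothetical and not part of the module. The sketch assumes that cheia.load "fuzzy" has already made fuzzy.match available whenever no matching function is passed in.

```lua
-- Hypothetical wrapper, not part of the fuzzy module.
-- match_fn is an assumed optional parameter; it defaults to fuzzy.match,
-- which must already have been loaded with cheia.load "fuzzy".
function match_nocase(body, term, match_fn)
  match_fn = match_fn or fuzzy.match
  -- Lowercase both inputs so that "Moon" and "moon" compare equal.
  return match_fn(string.lower(body), string.lower(term))
end
```

With the module loaded, match_nocase("full moon", "Moon") passes the same arguments to fuzzy.match as the first example above.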

Reference

fuzzy.match(body, term)

Performs a fuzzy search for term within body and returns the percentage match strength.

Parameters:
     body : string
          The body of text to search.
     term : string
          The term to search for.

Returns:
     number
          The match strength as a percentage.

Issues

fuzzy.match makes no attempt to localise matches within the body, so in practice it is likely to be useful only for comparing strings of similar length.

See also

The c't article from which this module was derived, http://www.heise.de/ct/english/97/04/386/.

Revision history

Added in LuaCheia 5.0.

Credits

Based on code from an article by Reinhard Rapp published in c't 4/97.

Adaptation for LuaCheia by Martin Spernau.

Documentation for LuaCheia by Jamie Webb.
