Description
Returns an integer similarity score from 0 to 100 between two string values, where 100 is the strongest match and 0 is no similarity. Use it to filter rows by approximate match, deduplicate records with inconsistent spelling, or join on fields that do not match exactly.
The function supports three comparison methods. Pick the one that fits the data:
| Method | When to use |
|---|
levenshtein (default) | General-purpose edit-distance similarity. Good for typos and minor spelling variations. |
jaro_winkler | Favors strings that share a common prefix or have transposed characters. Good for names and short identifiers. |
soundex | English-language phonetic match. Returns 100 if both values sound alike, otherwise 0. Good for matching names spelled differently. |
Matching is case-sensitive for levenshtein and jaro_winkler. Wrap the inputs in LOWER() for case-insensitive comparison. soundex is inherently case-insensitive.
Syntax
FUZZY_MATCH(string1, string2[, method])
Arguments
| Argument | Description |
|---|
string1 | Required. The first value to compare. Non-string values are converted to their string representation. |
string2 | Required. The second value to compare. Non-string values are converted to their string representation. |
method | Optional. One of levenshtein (default), jaro_winkler, or soundex. Case-insensitive. |
Examples
Keep rows whose company name is a close match for Acme Inc:
FILTER rows BY FUZZY_MATCH(company, 'Acme Inc') >= 85;
FUZZY_MATCH('Apple', 'Aple') returns 80.
FUZZY_MATCH('Apple', 'Apple') returns 100.
FUZZY_MATCH('Robert', 'Rupert', 'jaro_winkler') returns 85.
FUZZY_MATCH('Smith', 'Smyth', 'soundex') returns 100.
FUZZY_MATCH('Smith', 'Jones', 'soundex') returns 0.
FUZZY_MATCH(LOWER(name1), LOWER(name2)) performs a case-insensitive Levenshtein comparison.
Return value datatype
Integer (0–100).
Notes
- The
levenshtein score is the edit distance normalized by the longer string’s length, computed as round(100 * (1 - distance / maxLen)).
soundex is designed for English and Latin-script letters. Accented characters such as é are normalized to their base form before encoding, so 'José' matches 'Jose'. Inputs in non-Latin scripts (such as Cyrillic or CJK) cannot be phonetically encoded and return null.
- Identical inputs always score 100 for every method.
- The function raises an error if fewer than two arguments are supplied or if an unknown method is requested.
Impact of null value
Returns null if either string1 or string2 is null.Last modified on June 25, 2026