|
| 1 | +# Levenshtein Distance |
| 2 | + |
| 3 | +The Levenshtein distance is a string metric for measuring the |
| 4 | +difference between two sequences. Informally, the Levenshtein |
| 5 | +distance between two words is the minimum number of |
| 6 | +single-character edits (insertions, deletions or substitutions) |
| 7 | +required to change one word into the other. |
| 8 | + |
| 9 | +## Definition |
| 10 | + |
| 11 | +Mathematically, the Levenshtein distance between two strings |
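| 12 | +`a` and `b` (of length `|a|` and `|b|` respectively) is given by |
| 13 | +$\operatorname{lev}_{a,b}(|a|, |b|)$, where |
| 14 | + |
| 15 | +$$\operatorname{lev}_{a,b}(i, j) = \begin{cases} |
| 16 | +\max(i, j) & \text{if } \min(i, j) = 0, \\ |
| 17 | +\min \begin{cases} |
| 18 | +\operatorname{lev}_{a,b}(i - 1, j) + 1 \\ |
| 19 | +\operatorname{lev}_{a,b}(i, j - 1) + 1 \\ |
| 20 | +\operatorname{lev}_{a,b}(i - 1, j - 1) + 1_{(a_i \neq b_j)} |
| 21 | +\end{cases} & \text{otherwise,} |
| 22 | +\end{cases}$$ |
| 23 | + |
| 24 | +where $1_{(a_i \neq b_j)}$ is the indicator function, equal to `0` when $a_i = b_j$ and equal to `1` otherwise, and $\operatorname{lev}_{a,b}(i, j)$ |
| 25 | +is the distance between the first `i` characters of `a` and the first `j` characters of `b`. |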
| 26 | + |
| 27 | +Note that the first element in the minimum corresponds to |
| 28 | +deletion (from `a` to `b`), the second to insertion and |
| 29 | +the third to match or mismatch, depending on whether the |
| 30 | +respective symbols are the same. |
| 31 | + |
| 32 | +## Example |
| 33 | + |
| 34 | +For example, the Levenshtein distance between `kitten` and |
| 35 | +`sitting` is `3`, since the following three edits change one |
| 36 | +into the other, and there is no way to do it with fewer than |
| 37 | +three edits: |
| 38 | + |
| 39 | +1. **k**itten → **s**itten (substitution of "s" for "k") |
| 40 | +2. sitt**e**n → sitt**i**n (substitution of "i" for "e") |
| 41 | +3. sittin → sittin**g** (insertion of "g" at the end). |
| 42 | + |
| 43 | +## References |
| 44 | + |
| 45 | +- [Wikipedia](https://en.wikipedia.org/wiki/Levenshtein_distance) |
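
The recurrence in the Definition section can be evaluated bottom-up, one row of the distance table at a time. The following is a minimal Python sketch of that idea, not code from this README: the function name `levenshtein` and the two-row layout are assumptions chosen for illustration. The three arguments of `min` follow the order given in the Definition (deletion, insertion, match or mismatch).

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein distance between strings a and b."""
    # previous[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b. For i-1 = 0 that distance is j.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        # Distance between the first i characters of a and the empty prefix of b.
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1  # indicator: 0 on match, 1 otherwise
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or mismatch
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Keeping only two rows uses memory proportional to the length of `b` rather than the full table, at the cost of not being able to recover the edit sequence itself.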