Edit distance is the minimum number of single-character operations (insertion, deletion, substitution) needed to transform one string into another; it is computed exactly by dynamic programming in $O(nm)$ time.

Definition

The edit distance (or Levenshtein distance) between a string $X$ of length $n$ and a string $Y$ of length $m$ is the minimum total cost of a sequence of edit operations that transforms $X$ into $Y$:

  • Insertion: insert a character into $X$
  • Deletion: delete a character from $X$
  • Substitution: replace a character in $X$ with a different character

In the standard Levenshtein metric, insertion and deletion cost 1; substitution costs 2 (a substitution is modelled as a delete + insert). Some sources (including many lab/exam questions) use uniform costs where substitution = 1. Always check which convention a question uses — the same pair of strings will produce a different distance under the two conventions.

| Convention        | ins | del | sub | Used by                                 |
|-------------------|-----|-----|-----|-----------------------------------------|
| Levenshtein (J&M) | 1   | 1   | 2   | Jurafsky & Martin textbook              |
| Uniform           | 1   | 1   | 1   | Most lab/exam problems, other textbooks |

**Always check the convention before filling the table.**

The diagonal cost is the only thing that differs between the two. A mismatch on the diagonal costs +2 under Levenshtein and +1 under uniform — the same pair of strings will give a different final distance depending on which you use.

  • The cat → cut worked example below uses Levenshtein (sub = 2)
  • The leda → deal worked example below uses uniform (sub = 1)

Example: INTENTION → EXECUTION

I N T E * N T I O N
| | | | | | | | | |
* E X E C U T I O N

Edit sequence (cost 8): delete `I` (1), substitute `N→E` (2), substitute `T→X` (2), insert `C` (1), substitute `N→U` (2) — the remaining positions match, giving total cost $1+2+2+1+2 = 8$ under Levenshtein.

Dynamic Programming Algorithm

Define $D(i, j)$ = edit distance between the prefixes $X[1..i]$ and $Y[1..j]$.

Base cases:

$$D(i, 0) = i \qquad D(0, j) = j$$

(Transforming any string to the empty string costs $i$ deletions; building $Y[1..j]$ from the empty string costs $j$ insertions.)

Recurrence:

$$D(i, j) = \min \begin{cases} D(i-1, j) + 1 & \text{(deletion)} \\ D(i, j-1) + 1 & \text{(insertion)} \\ D(i-1, j-1) + \begin{cases} 0 & \text{if } X[i] = Y[j] \\ 2 & \text{if } X[i] \neq Y[j] \end{cases} & \text{(match / substitution)} \end{cases}$$

The answer is $D(n, m)$.

**Complexity**: Time $O(nm)$, Space $O(nm)$ (can be reduced to $O(\min(n,m))$ if only the distance is needed, not the alignment).

### Worked example: `cat` → `cut`

$X$ = `cat` (rows), $Y$ = `cut` (columns). Levenshtein costs: sub = 2, ins/del = 1.

**Key reminder**: $D(i,j)$ = edit distance between the first $i$ chars of $X$ and the first $j$ chars of $Y$. You are building up the answer for the full strings from answers for all their prefixes.

**Step 1 — borders (base cases)**

Row 0: $D(0,j)=j$ — insert $j$ characters to build $Y[1..j]$ from nothing. Column 0: $D(i,0)=i$ — delete $i$ characters to reduce $X[1..i]$ to nothing.

|      | `""`  | `c` | `u` | `t` |
| ---- | ----- | --- | --- | --- |
| `""` | **0** | 1   | 2   | 3   |
| `c`  | 1     | ?   | ?   | ?   |
| `a`  | 2     | ?   | ?   | ?   |
| `t`  | 3     | ?   | ?   | ?   |

**Step 2 — fill each interior cell**

For each cell, take the min of three options: **above+1** (delete), **left+1** (insert), **diagonal+0 or 2** (match/sub).
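The base cases and recurrence translate directly into a table-filling loop. A minimal Python sketch (the function name `edit_distance` and its `sub_cost` parameter are my own; `sub_cost=2` gives the Levenshtein/J&M convention, `sub_cost=1` the uniform one):

```python
def edit_distance(x: str, y: str, sub_cost: int = 2) -> int:
    """Minimum edit distance D(n, m) via the full DP table.

    Insertion and deletion cost 1; a mismatch on the diagonal costs
    sub_cost (2 = Levenshtein/J&M convention, 1 = uniform convention).
    """
    n, m = len(x), len(y)
    # D[i][j] = edit distance between x[:i] and y[:j]
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):          # column 0: i deletions
        D[i][0] = i
    for j in range(1, m + 1):          # row 0: j insertions
        D[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = 0 if x[i - 1] == y[j - 1] else sub_cost
            D[i][j] = min(D[i - 1][j] + 1,          # deletion (above)
                          D[i][j - 1] + 1,          # insertion (left)
                          D[i - 1][j - 1] + diag)   # match / substitution
    return D[n][m]
```

With `sub_cost=2` this returns 2 for `cat → cut` and 8 for `intention → execution`; with `sub_cost=1` it returns 3 for `leda → deal`, matching the worked examples.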
$D(1,1)$: `c` vs `c` — **match** → diagonal + 0 = $D(0,0)+0 = \mathbf{0}$

$D(1,2)$: `c` vs `u` — mismatch → min(above: $1+1=2$, left: $\mathbf{0+1=1}$, diag: $1+2=3$) = **1**

$D(1,3)$: `c` vs `t` — mismatch → min(above: $2+1=3$, left: $\mathbf{1+1=2}$, diag: $1+2=3$) = **2**

$D(2,1)$: `a` vs `c` — mismatch → min(above: $\mathbf{0+1=1}$, left: $2+1=3$, diag: $1+2=3$) = **1**

$D(2,2)$: `a` vs `u` — mismatch → min(above: $1+1=2$, left: $1+1=2$, diag: $0+2=\mathbf{2}$) = **2**

$D(2,3)$: `a` vs `t` — mismatch → min(above: $2+1=3$, left: $2+1=3$, diag: $1+2=3$) = **3**

$D(3,1)$: `t` vs `c` — mismatch → min(above: $\mathbf{1+1=2}$, left: $3+1=4$, diag: $2+2=4$) = **2**

$D(3,2)$: `t` vs `u` — mismatch → min(above: $2+1=3$, left: $2+1=3$, diag: $1+2=3$) = **3**

$D(3,3)$: `t` vs `t` — **match** → diagonal + 0 = $D(2,2)+0 = \mathbf{2}$

**Completed table:**

| | `""` | `c` | `u` | `t` |
|---|---|---|---|---|
| `""` | 0 | 1 | 2 | 3 |
| `c` | 1 | **0** | 1 | 2 |
| `a` | 2 | 1 | 2 | 3 |
| `t` | 3 | 2 | 3 | **2** ← answer |

$D(3,3) = 2$: one substitution `a → u`, cost 2. ✓

### Visual summary: `SIT` → `SAT` (backtrace path in bold)

|       | **""** | **S** | **A** | **T** |
|-------|--------|-------|-------|-------|
| **""**| **0**  | 1     | 2     | 3     |
| **S** | 1      | **0** | 1     | 2     |
| **I** | 2      | 1     | **2** | 3     |
| **T** | 3      | 2     | 3     | **2** |

Reading the bold diagonal path top-left → bottom-right:

- (0,0)→(1,1): S = S → match, +0
- (1,1)→(2,2): I ≠ A → substitution, +2
- (2,2)→(3,3): T = T → match, +0

**Edit distance = 2.** Alignment: `S I T` → `S A T` (one substitution).

### Worked example: `leda` → `deal` (uniform costs, sub = 1)

This example uses the uniform cost variant (ins = del = sub = 1). $X$ = `leda` (rows), $Y$ = `deal` (columns).

**Base cases**: row 0 = 0, 1, 2, 3, 4; column 0 = 0, 1, 2, 3, 4.
**Interior cells** (each = min of above+1, left+1, diagonal+0 or 1):

$D(1,1)$: `l` vs `d` — mismatch → min(above: $1+1=2$, left: $1+1=2$, diag: $0+1=\mathbf{1}$) = **1**

$D(1,2)$: `l` vs `e` — mismatch → min(above: $2+1=3$, left: $\mathbf{1+1=2}$, diag: $1+1=2$) = **2**

$D(1,3)$: `l` vs `a` — mismatch → min(above: $3+1=4$, left: $\mathbf{2+1=3}$, diag: $2+1=3$) = **3**

$D(1,4)$: `l` vs `l` — **match** → min(above: $4+1=5$, left: $3+1=4$, diag: $3+0=\mathbf{3}$) = **3**

$D(2,1)$: `e` vs `d` — mismatch → min(above: $\mathbf{1+1=2}$, left: $2+1=3$, diag: $1+1=2$) = **2**

$D(2,2)$: `e` vs `e` — **match** → min(above: $2+1=3$, left: $2+1=3$, diag: $1+0=\mathbf{1}$) = **1**

$D(2,3)$: `e` vs `a` — mismatch → min(above: $3+1=4$, left: $\mathbf{1+1=2}$, diag: $2+1=3$) = **2**

$D(2,4)$: `e` vs `l` — mismatch → min(above: $3+1=4$, left: $\mathbf{2+1=3}$, diag: $3+1=4$) = **3**

$D(3,1)$: `d` vs `d` — **match** → min(above: $2+1=3$, left: $3+1=4$, diag: $2+0=\mathbf{2}$) = **2**

$D(3,2)$: `d` vs `e` — mismatch → min(above: $\mathbf{1+1=2}$, left: $2+1=3$, diag: $2+1=3$) = **2**

$D(3,3)$: `d` vs `a` — mismatch → min(above: $2+1=3$, left: $2+1=3$, diag: $\mathbf{1+1=2}$) = **2**

$D(3,4)$: `d` vs `l` — mismatch → min(above: $3+1=4$, left: $\mathbf{2+1=3}$, diag: $2+1=3$) = **3**

$D(4,1)$: `a` vs `d` — mismatch → min(above: $\mathbf{2+1=3}$, left: $4+1=5$, diag: $3+1=4$) = **3**

$D(4,2)$: `a` vs `e` — mismatch → min(above: $\mathbf{2+1=3}$, left: $3+1=4$, diag: $2+1=3$) = **3**

$D(4,3)$: `a` vs `a` — **match** → min(above: $2+1=3$, left: $3+1=4$, diag: $2+0=\mathbf{2}$) = **2**

$D(4,4)$: `a` vs `l` — mismatch → min(above: $3+1=4$, left: $\mathbf{2+1=3}$, diag: $2+1=3$) = **3**

**Completed table (sub = 1):**

| | `""` | `d` | `e` | `a` | `l` |
|---|---|---|---|---|---|
| `""` | 0 | 1 | 2 | 3 | 4 |
| `l` | 1 | 1 | 2 | 3 | 3 |
| `e` | 2 | 2 | 1 | 2 | 3 |
| `d` | 3 | 2 | 2 | 2 | 3 |
| `a` | 4 | 3 | 3 | 2 | **3** ← answer |

**One optimal edit sequence** (total cost 3):

1. Substitute `l → d` (cost 1): `leda → deda`
2. Delete the second `d` (cost 1): `deda → dea`
3. Insert `l` at end (cost 1): `dea → deal`

> [!tip] Notice that `leda` and `deal` contain exactly the same four characters — yet the edit distance is 3, not 0. Edit distance measures the cost of transforming one string into another in order, not just character-set overlap.

## Backtrace

To recover the **alignment** (the actual edit sequence), store a pointer $\text{ptr}(i, j)$ alongside each $D(i, j)$:

- $\leftarrow$ (LEFT): came from $D(i, j-1)$ — insertion
- $\uparrow$ (UP): came from $D(i-1, j)$ — deletion
- $\nwarrow$ (DIAGONAL): came from $D(i-1, j-1)$ — match or substitution

Trace back from $D(n, m)$ to $D(0, 0)$. The path reads out the edit sequence in reverse; reversing gives the alignment. Backtrace complexity: $O(n + m)$.

### Backtrace on `cat` → `cut`

At each cell, record which neighbour gave the minimum. Arrows below show the winning direction:

| | `""` | `c` | `u` | `t` |
|---|---|---|---|---|
| `""` | 0 | ←1 | ←2 | ←3 |
| `c` | ↑1 | ↖**0** | ←1 | ←2 |
| `a` | ↑2 | ↑1 | ↖2 | ↖3 |
| `t` | ↑3 | ↑2 | ↑3 | ↖**2** |

Trace from $D(3,3)$: diagonal → $D(2,2)$ → diagonal → $D(1,1)$ → diagonal → $D(0,0)$. Reading the diagonals forward: `c`=`c` (match), `a`≠`u` (substitute), `t`=`t` (match). Edit sequence: **substitute `a` → `u`**.

> [!tip] TIP — Three-step checklist for exam problems
>
> 1. **Write the strings as headers** — $X$ on rows, $Y$ on columns. Fill the `""` row and column with 0, 1, 2, … borders first.
> 2. **Fill left-to-right, top-to-bottom.** Each cell = min(above+1, left+1, diagonal+0 or 2). Check whether the characters match before computing the diagonal cost.
> 3. **Answer = bottom-right cell.** Backtrace = follow the arrows back to (0,0); diagonal steps are matches or substitutions, up = deletion, left = insertion.

## Weighted Edit Distance

The standard costs (ins = 1, del = 1, sub = 2) treat all errors as equally likely.
Real applications benefit from **weighted** costs:

$$D(i, j) = \min \begin{cases} D(i-1, j) + \text{del}[X[i]] \\ D(i, j-1) + \text{ins}[Y[j]] \\ D(i-1, j-1) + \text{sub}[X[i], Y[j]] \end{cases}$$

Motivation for different weights:

- **Spelling correction**: confusable character pairs (e.g., `m/n`, `b/d`) should have lower substitution cost than unconfusable pairs. Learned from spelling confusion matrices.
- **Keyboard proximity**: adjacent keys on a QWERTY keyboard (e.g., `s` and `d`) are more likely to be confused by a typist than distant keys.
- **OCR correction**: characters that look alike visually (`0/O`, `l/1`) should have lower substitution cost.

## Applications

- **Spell checking**: find the dictionary word with minimum edit distance to the misspelled word.
- **Machine translation evaluation**: metrics such as TER (translation edit rate) and word error rate apply edit distance at the word level.
- **Computational biology**: sequence alignment (DNA, protein) uses a variant where costs reflect mutation probabilities.
- **Coreference and deduplication**: determining whether two name strings refer to the same entity.

## Related

- [[text-normalization]] — edit distance powers spell-checking, which is part of text normalization
- [[tokenization]] — edit distance can compare token sequences, not just character sequences

## Active Recall

> [!question]- State the recurrence relation for edit distance and explain what each of the three cases represents in terms of edit operations.
>
> $D(i,j) = \min\{D(i-1,j)+1,\ D(i,j-1)+1,\ D(i-1,j-1)+c\}$ where $c=0$ if $X[i]=Y[j]$ else $2$. The first case extends an alignment of $X[1..i{-}1]$ with $Y[1..j]$ by deleting $X[i]$ from $X$ (cost 1). The second extends an alignment of $X[1..i]$ with $Y[1..j{-}1]$ by inserting $Y[j]$ into $X$ (cost 1). The third aligns $X[1..i{-}1]$ with $Y[1..j{-}1]$ and then either matches $X[i]$ to $Y[j]$ for free, or substitutes (cost 2).
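The weighted recurrence from the Weighted Edit Distance section can be sketched the same way as the unweighted DP. Here the per-character cost functions and the toy confusability table are illustrative assumptions, not values learned from a real confusion matrix:

```python
def weighted_edit_distance(x, y, ins, dele, sub):
    """Weighted edit distance with per-character cost functions.

    ins(c) / dele(c) give the insertion / deletion cost of character c;
    sub(a, b) gives the substitution cost (0 when a == b).
    """
    n, m = len(x), len(y)
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):                      # column 0: deletions
        D[i][0] = D[i - 1][0] + dele(x[i - 1])
    for j in range(1, m + 1):                      # row 0: insertions
        D[0][j] = D[0][j - 1] + ins(y[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(D[i - 1][j] + dele(x[i - 1]),
                          D[i][j - 1] + ins(y[j - 1]),
                          D[i - 1][j - 1] + sub(x[i - 1], y[j - 1]))
    return D[n][m]

# Toy keyboard-proximity table (an assumption for illustration):
# s/d are adjacent on QWERTY, so confusing them is cheap.
CONFUSABLE = {("s", "d"), ("d", "s")}
sub = lambda a, b: 0.0 if a == b else (0.4 if (a, b) in CONFUSABLE else 1.0)
```

With this toy table, `weighted_edit_distance("sat", "dat", lambda c: 1.0, lambda c: 1.0, sub)` prefers the cheap `s → d` substitution and returns 0.4 rather than 1.0.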
> [!question]- How does the backtrace recover the actual alignment from the edit distance table, and what is its time complexity?
>
> Each cell stores a pointer indicating which predecessor cell it came from: LEFT (insertion), UP (deletion), or DIAGONAL (match/sub). Starting from $D(n,m)$, follow pointers back to $D(0,0)$. Each step reads off one edit operation. The path length is at most $n+m$ steps (the Manhattan distance in the grid), so backtrace is $O(n+m)$. Reversing the collected operations gives the alignment in forward order.

> [!question]- Why might you want weighted edit distance instead of uniform costs, and what data would you need to set the weights?
>
> Uniform costs assume all errors are equally likely, but in real applications some errors are far more common than others. For spell correction, character pairs that look or sound similar should have lower substitution cost (so the algorithm favours them as corrections). You would need a **confusion matrix** — empirical frequencies of which characters get confused for which, collected from a corpus of misspellings paired with their corrections, or from keyboard/OCR error logs.

> [!question]- Edit distance is $O(nm)$ in time and space. In what situation would this be prohibitively slow, and what simplification is available when you only need the distance (not the alignment)?
>
> For very long strings — e.g., comparing genomic sequences of millions of base pairs — $O(nm)$ space is prohibitive (the full table cannot fit in memory). When you only need the final distance $D(n,m)$ and not the alignment, you can use the two-row (or two-column) DP variant: at any point you only need the current and previous row, reducing space to $O(\min(n,m))$. Recovering the alignment still requires $O(nm)$ space (you must store the pointer table) or a divide-and-conquer approach.

> [!question]- Which of the following best describes the actions counted by minimum edit distance?
> A) Counting the number of words to add, delete, or replace in a text
> B) Calculating the number of characters that differ, without considering order
> C) Determining the minimum number of character insertions, deletions, and substitutions to change one string into another
> D) Measuring the total number of characters in two strings and finding the difference
>
> **C** is correct. Minimum edit distance operates at the character level and finds the minimum-cost sequence of insertions, deletions, and substitutions to transform one string into the other. **A** confuses character-level with word-level operations. **B** ignores order and position, which edit distance explicitly considers via the DP alignment. **D** just computes a length difference, which ignores the actual content of the strings entirely.

> [!question]- Using uniform costs (ins = del = sub = 1), is `drive` closer to `brief` or to `divers`? What is the distance to each?
>
> `drive → brief` = **3**, and `drive → divers` = **3**. They are equally close. This illustrates that edit distance comparisons can tie — you cannot always determine the closer word just by looking at length difference or shared characters. The strings share different subsets of characters with `drive`: `brief` is the same length but shares only `r`, `i`, `e`; `divers` is one character longer but shares `d`, `i`, `v`, `e`, `r` in a compatible order. Working through both DP tables is the only reliable way to settle such comparisons.
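The tie in the last question can be checked mechanically. A minimal uniform-cost sketch (function name is my own) that also demonstrates the two-row space reduction mentioned in the $O(nm)$ question above:

```python
def uniform_edit_distance(x: str, y: str) -> int:
    """Edit distance with ins = del = sub = 1, keeping only two rows.

    Space is O(len(y)) instead of O(len(x) * len(y)); only the final
    distance is available, not the alignment (no pointer table).
    """
    n, m = len(x), len(y)
    prev = list(range(m + 1))              # row i-1 of the DP table
    for i in range(1, n + 1):
        curr = [i] + [0] * m               # column 0 = i deletions
        for j in range(1, m + 1):
            diag = 0 if x[i - 1] == y[j - 1] else 1
            curr[j] = min(prev[j] + 1,     # deletion (above)
                          curr[j - 1] + 1, # insertion (left)
                          prev[j - 1] + diag)
        prev = curr
    return prev[m]
```

Running it on the pair from the question: `uniform_edit_distance("drive", "brief")` and `uniform_edit_distance("drive", "divers")` both return 3, confirming the tie.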