Skip to content

Files

Latest commit

 

History

History
116 lines (82 loc) · 2.61 KB

072._edit_distance.md

File metadata and controls

116 lines (82 loc) · 2.61 KB

###72. Edit Distance

题目: https://leetcode.com/problems/edit-distance/

难度:

Hard

可以做的操作:

  • insert
  • delete
  • replace

动归典型,原来也是有wikipedia page的算法

https://en.wikipedia.org/wiki/Edit_distance#Common_algorithm

https://en.wikipedia.org/wiki/Levenshtein_distance

看wikipedia 这解释

		/ max(i,j) 	if min(i,j) = 0
			
				   / dp[i-1][j] + 1     word1[i]不在word2[0...j]中,所以删除
 dp[i][j] -  min  -- dp[i][j-1] + 1     insertion
 				   \ dp[i-1][j-1] + 1/0  word[i]与word[j]是否相等

上面的就不用解释了,min分别对应:删除、插入、以及替代(1/0取决 word1[i] == word2[j] ),反正也是tabular类型,画表来解决问题。

用wikipedia上的伪码改造

function LevenshteinDistance(char s[1..m], char t[1..n]):
  // for all i and j, d[i,j] will hold the Levenshtein distance between
  // the first i characters of s and the first j characters of t
  // note that d has (m+1)*(n+1) values
  declare int d[0..m, 0..n]
 
  set each element in d to zero
 
  // source prefixes can be transformed into empty string by
  // dropping all characters
  for i from 1 to m:
      d[i, 0] := i
 
  // target prefixes can be reached from empty source prefix
  // by inserting every character
  for j from 1 to n:
      d[0, j] := j
 
  for j from 1 to n:
      for i from 1 to m:
          if s[i] = t[j]:
            substitutionCost := 0
          else:
            substitutionCost := 1
          d[i, j] := minimum(d[i-1, j] + 1,                   // deletion
                             d[i, j-1] + 1,                   // insertion
                             d[i-1, j-1] + substitutionCost)  // substitution
 
  return d[m, n]

对应的例子表格图

		k	i	t	t	e	n
	0	1	2	3	4	5	6
s	1	1	2	3	4	5	6
i	2	2	1	2	3	4	5
t	3	3	2	1	2	3	4
t	4	4	3	2	1	2	3
i	5	5	4	3	2	2	3
n	6	6	5	4	3	3	2
g	7	7	6	5	4	4	3

AC代码

class Solution(object):
    def minDistance(self, word1, word2):
        """
        :type word1: str
        :type word2: str
        :rtype: int
        """
        m,n = len(word1), len(word2)
        dp = [[0 for i in range(m+1)] for j in range(n+1)]

        for i in range(1,m+1):
            dp[0][i] = i

        for j in range(1,n+1):
            dp[j][0] = j

        for j in range(1,n+1):
            for i in range(1,m+1):
                cost = 0 if word1[i-1] == word2[j-1] else 1
                dp[j][i] = min(dp[j-1][i] + 1, dp[j][i-1] + 1, dp[j-1][i-1] + cost)

        return dp[n][m]

貌似还有提升版本,但是比较明显即使有伪码,一开始也出错于下标.

升级版 to be learned.