当前位置:高等教育资讯网  >  中国高校课件下载中心  >  大学文库  >  浏览文档

南京大学:《计算机问题求解》课程教学资源(PPT课件讲稿)串匹配

资源类别:文库,文档格式:PPTX,文档页数:47,文件大小:2.95MB,团购合买
点击下载完整版文档(PPTX)

计算机问题求解-论题4-5 。串匹配 2019年3月27日

计算机问题求解 – 论题4-5 - 串匹配 2019年3月27日

问题1: 什么是string-matching problem,直观上?形式上?

string-matching problem

几乎每天都在用。。 Introduction to Algorithms,Third Edition 1007/1313 Knuth-Morris-Pratt 1/2 Tigure 32.2 The strmg-matehing aigonmms m ts cnaprer and men preprocessig and matcmng times. Except for the naive brute-force algorithm.which we review in Section 32.1. each string-matching algorithm in this chapter performs some preprocessing based on the pattern and then finds all valid shifts;we call this latter phase"matching." Figure 32.2 shows the preprocessing and matching times for each of the algorithms in this chapter.The total running time of each algorithm is the sum of the prepro- cessing and matching times.Section 32.2 presents an interesting string-matching algorithm.due to Rabin and Karp.Although the ((n-m +1)m)worst-case running time of this algorithm is no better than that of the naive method,it works much better on average and in practice.It also generalizes nicely to other pattern- matching problems.Section 32.3 then describes a string-matching algorithm that begins by constructing a finite automaton specifically designed to search for occur- rences of the given pattern P in a text.This algorithm takes O(m preprocess- ing time,but only (n)matching time.Section 32.4 presents the similar,but much cleverer.Knuth-Morris-Pratt (or KMP)algorithm:it has the same (n)matching time,and it reduces the preprocessing time to only (m). 出 Notation and terminology We denote by E*(read"sigma-star")the set of all finite-length strings formed using characters from the alphabet In this chapter,we consider only finite- length strings.The zero-length empty string,denoted 8,also belongs to 'The

几乎每天都在用

text T a b c ab a a b a b a 5=3 a a we say that pattern P occurs with shift s in text T (or,equivalently,that pattern P occurs beginning at position s+1 in text T)if 0≤s≤n-m and T5+l.s+m=P[l.m(that is,.ifT5+j】=P[Ujl,for 1 j<m).If P occurs with shift s in T,then we call s a valid shift;otherwise, we call s an invalid shift.The string-matching problem is the problem of finding all valid shifts with which a given pattern P occurs in a given text T

text T a b a b a a b a b a 5=3 pattern P b a a Worst Case Algorithm Preprocessing time Matching time Naive 0 O((n-m+1)m) Rabin-Karp ⊙(m) O(n-m+1)m)) Finite automaton O(m∑) ⊙(n) Knuth-Morris-Pratt ⊙(m) Θ(n) n:the length of text T m:the length of pattern the alphabet set,consisiting ofall possible characters

𝑛:𝑡ℎ𝑒 𝑙𝑒𝑛𝑔𝑡ℎ 𝑜𝑓 𝑡𝑒𝑥𝑡 𝑇 𝑚:𝑡ℎ𝑒 𝑙𝑒𝑛𝑔𝑡ℎ 𝑜𝑓 𝑝𝑎𝑡𝑡𝑒𝑟𝑛 𝛴:the alphabet set , consisiting of all possible characters Worst Case

text T a b a b a a b a b a =3 pattern P b a a Algorithm Preprocessing time Matching time Naive 0 O(n-m+1)m) Rabin-Karp ⊙(m) O(n-m+1)m) Finite automaton O(1m∑) Θ(n) Knuth-Morris-Pratt Θ(1m) Θ(n) n:the length of text T m:the length of pattern the alphabet set,consisiting ofall possible characters

𝑛:𝑡ℎ𝑒 𝑙𝑒𝑛𝑔𝑡ℎ 𝑜𝑓 𝑡𝑒𝑥𝑡 𝑇 𝑚:𝑡ℎ𝑒 𝑙𝑒𝑛𝑔𝑡ℎ 𝑜𝑓 𝑝𝑎𝑡𝑡𝑒𝑟𝑛 𝛴:the alphabet set , consisiting of all possible characters

最容易想到的算法 a a a b a a a b a a a b a a a b s=0 5=3 a a b b a a b NAIVE-STRING-MATCHER(T.P) 问题2: 1 n T.length 2 m =P.length 最坏情况下复杂度? 3 fors 0to n-m 0(n-m+1)m) 4 if P[1..ml==T[s +1..s+ml 5 print"Pattern occurs with shift"s 问题3:什么时候出现最坏情况?

最容易想到的算法 𝑶 𝒏 − 𝒎 + 𝟏 𝒎

问题4:你能否说说naive.算法有什么问 题,为什么有可能改进? a a b a a a a b S=0 S=3 a a b a a b NAIVE-STRING-MATCHER(T,P) 逐位单字符比较 1 n T.length The naive string-matcher is 2 m P.length inefficient because it entirely 3 fors 0to n-m ignores information gained if P[1..m ==Ts +1..s+m about the text for one value print“Pattern occurs with shift”s of s when it considers other values of s

The naive string-matcher is inefficient because it entirely ignores information gained about the text for one value of s when it considers other values of s

text T a b a b a a b a b a =3 pattern P b a a Algorithm Preprocessing time Matching time Naive 0 O(n-m+1)m) Rabin-Karp ⊙(m) O(n-m+1)m) Finite automaton O(m∑) ⊙(n) Knuth-Morris-Pratt Θ(1m) ©(n) n:the length of text T m:the length of pattern the alphabet set,consisiting ofall possible characters

𝑛:𝑡ℎ𝑒 𝑙𝑒𝑛𝑔𝑡ℎ 𝑜𝑓 𝑡𝑒𝑥𝑡 𝑇 𝑚:𝑡ℎ𝑒 𝑙𝑒𝑛𝑔𝑡ℎ 𝑜𝑓 𝑝𝑎𝑡𝑡𝑒𝑟𝑛 𝛴:the alphabet set , consisiting of all possible characters

问题5: Rabin-Karp算法的基本思想 是什么? For expository purposes,let us assume that∑={0,1,2,.,9},so that each character is a decimal digit.(In the general case,we can assume that each charac- ter is a digit in radix-d notation,where d=We can then view a string ofk consecutive characters as representing a length-k decimal number.The character string 31415 thus corresponds to the decimal number 31,415

问题5: Rabin-Karp算法的基本思想 是什么?

点击下载完整版文档(PPTX)VIP每日下载上限内不扣除下载券和下载次数;
按次数下载不扣除下载券;
24小时内重复下载只扣除一次;
顺序:VIP每日次数-->可用次数-->下载券;
共47页,可试读16页,点击继续阅读 ↓↓
相关文档

关于我们|帮助中心|下载说明|相关软件|意见反馈|联系我们

Copyright © 2008-现在 cucdc.com 高等教育资讯网 版权所有