Compiler Principles (《编译原理》) course teaching resources: Chapter 3, regular expressions are commonly used for text matching

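Resource category: library document; format: PPT; pages: 23; file size: 327.5 KB
I. Regular expressions are commonly used for text matching: 1. searching for strings; 2. replacing strings; 3. recognizing the input as a sequence of tokens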

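Key points of this section
• Regular expressions are commonly used for text matching:
  – searching for strings
  – replacing strings
  – recognizing the input as a sequence of tokens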

Applications of regular expressions
• Use #1: Text-processing the web
  – The Web is full of data, but it is in text form for humans to read
• Screen scraping
  – extracting the data you want from screen output
  – these days, the output format is HTML
• Examples:
  – extract the tour schedule of your favorite bands from Ticketmaster
  – web sites as web services: convert an address to geo coordinates
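
The screen-scraping idea can be sketched in Python, one of the two languages this tutorial covers. The HTML fragment, its class names, and the function name extract_schedule are all made up for illustration; real pages differ and usually need a more robust approach than a single pattern.

```python
import re

# Hypothetical HTML fragment standing in for a scraped tour-schedule page.
html = """
<ul class="tour">
  <li><span class="date">2024-05-01</span> <span class="city">Berlin</span></li>
  <li><span class="date">2024-05-03</span> <span class="city">Paris</span></li>
</ul>
"""

# Capture the text inside the date span and the city span of each entry.
# The non-greedy .*? stops at the first closing tag it meets.
row = re.compile(
    r'<span class="date">(.*?)</span>\s*<span class="city">(.*?)</span>'
)

def extract_schedule(page):
    """Return a list of (date, city) tuples found in the page text."""
    return row.findall(page)

schedule = extract_schedule(html)
for date, city in schedule:
    print(date, city)
```

With two capture groups, findall returns one tuple per match, which is a convenient shape for tabular data pulled out of markup.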

Applications of regular expressions
• Use #2: Text processing in general
  – a spectrum of uses, from small to big
• Small fixes:
  – replacing "ugly quotes" with “smart quotes”
  – converting files between operating systems
• Bigger tasks:
  – spell checking formatted documents (HTML): the text must be extracted first
  – pretty printing code: find comments, etc., and add format directives
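
The two small fixes can be sketched with re.sub in Python. The function names are illustrative, and the quote rule is deliberately naive: it assumes straight double quotes come in balanced pairs, which real text does not guarantee.

```python
import re

def smarten_quotes(text):
    """Replace each straight-quoted span "..." with curly quotes."""
    return re.sub(
        r'"([^"]*)"',
        lambda m: "\u201c" + m.group(1) + "\u201d",
        text,
    )

def to_unix_newlines(text):
    """Convert Windows (CRLF) and classic Mac (CR) line endings to LF."""
    # CRLF is listed first so that it is replaced as a unit.
    return re.sub(r"\r\n|\r", "\n", text)

print(smarten_quotes('He said "hi" and "bye".'))
print(repr(to_unix_newlines("a\r\nb\rc\n")))
```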

Applications of regular expressions
• Use #3: Program processing, especially on the web
• On the Web, procedure calls = HTTP requests
  – "procedure arguments" are passed as strings
  – argument extraction can be done with regular expressions
• Other uses:
  – extract the components of an email address
  – obfuscation: to obfuscate all JS functions except those called from HTML-embedded scripts, scan the web page for the names of functions called from HTML, so that they are not obfuscated
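
A minimal Python sketch of argument extraction and of splitting an email address follows. The URL, the function names, and the patterns are assumptions for illustration; the query parser does no URL decoding, and the email pattern accepts only a simplified user@domain shape, not every valid address.

```python
import re

def extract_args(url):
    """Return the query-string arguments of a URL as a dict of strings."""
    query = url.partition("?")[2]          # text after the first "?"
    # Each argument is name=value; pairs are separated by "&".
    return dict(re.findall(r"([^&=]+)=([^&]*)", query))

def split_email(address):
    """Split a simple address into (user, domain), or return None."""
    m = re.fullmatch(r"([^@\s]+)@([^@\s]+\.[^@\s]+)", address)
    return m.groups() if m else None

print(extract_args("http://example.com/geo?street=Main+St&zip=94720"))
print(split_email("alice@example.com"))
print(split_email("not an email"))
```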

Regular Expression Tutorial
• Focus on two languages:
  – JavaScript
  – Python
• A key rule common to both: given a string and a regex, find the first position in the string where a match is possible (except for Python's match() function, which must match at the beginning of the string).
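
The leftmost-position rule, and the exception for match(), can be observed directly in Python. The pattern and variable names below are chosen only for illustration.

```python
import re

text = "string"

# search() scans left to right and reports the leftmost position where
# any alternative can match: "tr" at index 1 wins over "in" at index 3,
# even though "in" is written first in the pattern.
found = re.search(r"in|tr", text)
print(found.group(0), found.start())    # tr 1

# match() only tries position 0, so the same pattern fails here.
anchored = re.match(r"in|tr", text)
print(anchored)                         # None

# match() succeeds when the pattern can match at the very beginning.
prefix = re.match(r"st", text)
print(prefix.group(0), prefix.start())  # st 0
```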

String search: from simple to regexp (JavaScript)
• Basic search methods, for String objects:
  – "string".indexOf("rin") → 2
  – "string".indexOf(new RegExp("r*n")) → -1 (indexOf does not interpret regular expressions)
  – "string".search(new RegExp("r*n")) → 4
  – "string".search(new RegExp("r.*n")) → 2
  – "string".search(/r.*n/) → 2 (equivalent to the previous line)
  – "string".match(/tri|str/) → ["str"]
  – "string".match(/ri|st/g) → ["st", "ri"]
  – "string".match(/tri|str/g) → ["str"]
• See (js.htm)

String search: from simple to regexp (JavaScript)
• indexOf
  – Syntax: object.indexOf(searchValue[, fromIndex])
  – When called on a String object, this method returns the index of the first occurrence of the searchValue argument, starting the search at the fromIndex argument.
• search
  – Syntax: object.search(regexp)
  – This method searches the string for a match of the regular expression and returns the index of the first match, or -1 if there is none.
• RegExp
  – Syntax: new RegExp("pattern"[, "flags"]), or the literal form myReg = /pattern/flags

String search: from simple to regexp (JavaScript)
• match
  – Syntax: object.match(regexp)
  – This method matches the specified regular expression against a string.
  – If one or more matches are made, an array containing the matches is returned; each entry in the array is a copy of a matched substring. If no match is made, null is returned.
  – To find all matches, include the 'g' (global) flag in the regular expression; without it, only the first match is reported. To perform a case-insensitive match, include the 'i' (ignore case) flag.
• Characters consumed by one match are not reused for another match.
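
The same behaviour can be reproduced in Python, the tutorial's other language, as a rough analogue rather than a description of JavaScript itself: re.findall stands in for the 'g' flag and re.IGNORECASE for the 'i' flag. The sample text is made up for illustration.

```python
import re

text = "string STRING"

# findall returns every non-overlapping match, scanning left to right.
# Only "str" is found: once "str" has consumed indices 0-2, "tri" (which
# would start at index 1) can no longer be matched.
all_matches = re.findall(r"tri|str", text)
print(all_matches)   # ['str']

# Ignoring case also picks up the upper-case copy.
any_case = re.findall(r"tri|str", text, flags=re.IGNORECASE)
print(any_case)      # ['str', 'STR']

# finditer exposes where each match starts and ends (end is exclusive).
spans = [m.span() for m in re.finditer(r"ri|st", text)]
print(spans)         # [(0, 2), (2, 4)]
```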

Same for Python
• Basic search functions of the re module, applied to strings:
  – re.match(r"tri|rin", "string") → no match
  – re.search(r"tri|rin", "string").group(0) → 'tri'
  – re.compile(r"tri|str").findall("string") → ['str']
  – re.compile(r"ri|st").findall("string") → ['st', 'ri']
  – re.search(r"(tr)|(in)", "string").groups() → ('tr', None)
• Note: match() expects the match to start at index 0.
• The r prefix marks a raw string literal.
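
The calls listed on this slide can be run as a short script. The helper first_match is an illustrative addition, not part of the re module: because re.search and re.match return None when nothing matches, calling .group() directly on a failed search raises an AttributeError.

```python
import re

s = "string"

print(re.match(r"tri|rin", s))               # None
print(re.search(r"tri|rin", s).group(0))     # tri
print(re.compile(r"tri|str").findall(s))     # ['str']
print(re.compile(r"ri|st").findall(s))       # ['st', 'ri']
print(re.search(r"(tr)|(in)", s).groups())   # ('tr', None)

def first_match(pattern, text):
    """Return the first matched substring, or None if there is no match."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

print(first_match(r"xyz", s))                # None
```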

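Python regular expressions
• Supported: ".", "*", "+", "?", "|", "[ ]", "\"
• "^": matches the start of the string
• "$": matches the end of the string
• {m}: exactly m repetitions
• {m,n}: from m to n repetitions
• *?, +?, ??, {m,n}?: same meaning as the quantifier before the trailing "?", but they switch from greedy (longest) matching to minimal (shortest) matching
• Example: when a pattern such as <.*> is matched against "<H1>title</H1>", the greedy match covers the entire string, whereas the minimal version <.*?> matches only "<H1>"
• (...): matches whatever regular expression is inside the parentheses; commonly used for grouping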
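
The operators listed above can be tried out with a short Python sketch. The HTML-like input string and the specific patterns are assumptions chosen to show greedy versus minimal matching, counted repetition, anchors, and grouping.

```python
import re

html = "<H1>title</H1>"

# Greedy: .* takes as much as possible, so the match spans the whole string.
greedy = re.search(r"<.*>", html).group(0)
# Minimal: .*? takes as little as possible, stopping at the first ">".
lazy = re.search(r"<.*?>", html).group(0)
print(greedy)   # <H1>title</H1>
print(lazy)     # <H1>

# {m} and {m,n}: counted repetition, greedy unless followed by "?".
exactly_three = re.fullmatch(r"a{3}", "aaa") is not None
two_to_three = re.findall(r"a{2,3}", "aaaaa")           # ['aaa', 'aa']
shortest = re.search(r"a{2,3}?", "aaaaa").group(0)      # 'aa'

# ^ and $ anchor the pattern to the start and end of the string.
starts = re.search(r"^str", "string") is not None
ends = re.search(r"ing$", "string") is not None

# (...) groups sub-patterns and captures what they matched.
tag, body = re.search(r"<(\w+)>(.*?)</\w+>", html).groups()
print(tag, body)   # H1 title
```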
