正在加载图片...
A More Complete Correct Algorithm PROCEDURE SPIDER(G,{SEEDS } Initialize COLLECTION <big fil STACK Initialize VISITED <big hash-t 用disk-based heap结构实现 For every ROOT in SEEDS Initialize STACK <stack data structure> Let STACK push (ROOT,STACK) While STACK is not empty, Do URLcurr :pop (STACK) Until URLcurr is not in VISITED insert-hash (URLcurr,VISITED) PAGE look-up (URLcurr) STORE (<URLcur:PAGE>,COLLECTION) For every URL:in PAGE, Push(URL生,STACK) Return COLLECTIONA More Complete Correct Algorithm PROCEDURE SPIDER4(G, {SEEDS}) Initialize COLLECTION <big file of URL-page pairs> Initialize VISITED <big hash-table> For every ROOT in SEEDS Initialize STACK <stack data structure> Let STACK := push(ROOT, STACK) While STACK is not empty, Do URLcurr := pop(STACK) Until URLcurr is not in COLLECTION insert-hash(URLcurr, VISITED) PAGE := look-up(URLcurr) STORE(<URLcurr, PAGE>, COLLECTION) For every URLi in PAGE, push(URLi, STACK) Return COLLECTION Until URLcurr is not in VISITED STACK 用disk-based heap结构实现
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有