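This article shows, by example, how to print the crawl tree of a Scrapy spider in Python. It is shared here for reference. The details are as follows:
The code below reads a Scrapy log and prints the crawled pages as an indented tree, with each page listed under the page that referred to it, so the crawl structure is visible at a glance. Calling it is simple: pass the log file name on the command line or pipe the log to standard input.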
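#!/usr/bin/env python
import fileinput
import re
from collections import defaultdict


def print_urls(allurls, referer, indent=0):
    # Print every URL reached from `referer`, then recurse into its children.
    for url in allurls[referer]:
        print(' ' * indent + url)
        if url in allurls:
            print_urls(allurls, url, indent + 2)


def main():
    # Matches log lines such as: <GET http://example.com/> (referer: None)
    log_re = re.compile(r'<GET (.*?)> \(referer: (.*?)\)')
    allurls = defaultdict(list)  # referer -> URLs fetched from it
    for line in fileinput.input():  # files named on the command line, else stdin
        m = log_re.search(line)
        if m:
            url, ref = m.groups()
            allurls[ref].append(url)
    print_urls(allurls, 'None')  # start URLs are logged with referer "None"


if __name__ == '__main__':
    main()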
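To make the idea concrete, here is a minimal sketch of the same technique applied to a few in-memory log lines instead of a file. The sample lines, the example.com URLs, and the helper names build_tree and render are assumptions made up for illustration, and the exact prefix of a real Scrapy log line may differ between versions. Only the "<GET url> (referer: url)" part is relied on.

```python
import re
from collections import defaultdict

# Pattern for the "<GET url> (referer: url)" part of a crawl log line.
LOG_RE = re.compile(r'<GET (.*?)> \(referer: (.*?)\)')

# Hypothetical log lines, invented for this example.
SAMPLE_LOG = """\
2015-04-08 10:00:01 [scrapy] DEBUG: Crawled (200) <GET http://example.com/> (referer: None)
2015-04-08 10:00:02 [scrapy] DEBUG: Crawled (200) <GET http://example.com/a> (referer: http://example.com/)
2015-04-08 10:00:03 [scrapy] DEBUG: Crawled (200) <GET http://example.com/b> (referer: http://example.com/)
2015-04-08 10:00:04 [scrapy] DEBUG: Crawled (200) <GET http://example.com/a/1> (referer: http://example.com/a)
"""


def build_tree(lines):
    """Map each referer to the list of URLs that were fetched from it."""
    children = defaultdict(list)
    for line in lines:
        m = LOG_RE.search(line)
        if m:
            url, ref = m.groups()
            children[ref].append(url)
    return children


def render(children, referer='None', indent=0):
    """Return the tree below `referer` as a list of indented lines."""
    out = []
    for url in children.get(referer, []):
        out.append(' ' * indent + url)
        out.extend(render(children, url, indent + 2))
    return out


tree_lines = render(build_tree(SAMPLE_LOG.splitlines()))
print('\n'.join(tree_lines))
# Prints:
# http://example.com/
#   http://example.com/a
#     http://example.com/a/1
#   http://example.com/b
```

Returning a list of lines instead of printing inside the recursion keeps the output easy to inspect or write elsewhere. Against a real log, the script form would be run as something like "python print_tree.py scrapy.log", where both file names are hypothetical.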
We hope this article is helpful for your Python programming.