

Example: parsing JavaScript pages with Scrapy in Python


Scrapy downloads raw HTML and does not execute JavaScript, so this example pairs a CrawlSpider with Selenium RC: Scrapy discovers the article links, and Selenium loads each page in a real browser so its JavaScript runs before anything is scraped. The code is as follows:
import time

from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from selenium import selenium


class MySpider(CrawlSpider):
    name = 'cnbeta'
    allowed_domains = ['cnbeta.com']
    start_urls = ['http://www.cnbeta.com']  # must lie inside allowed_domains

    rules = (
        # Follow links matching '/articles/*.htm' and hand each page to
        # parse_page; follow=True keeps extracting links from those pages too.
        Rule(SgmlLinkExtractor(allow=(r'/articles/.*\.htm', )),
             callback='parse_page', follow=True),
    )

    def __init__(self):
        CrawlSpider.__init__(self)
        self.verificationErrors = []
        # Connect to a Selenium RC server on localhost:4444 driving Firefox.
        # The server must already be running (java -jar selenium-server.jar).
        self.selenium = selenium("localhost", 4444, "*firefox",
                                 "http://www.cnbeta.com")
        self.selenium.start()

    def __del__(self):
        self.selenium.stop()
        print self.verificationErrors

    def parse_page(self, response):
        self.log('Hi, this is an item page! %s' % response.url)

        # Re-open the URL in the Selenium-driven browser so the page's
        # JavaScript actually executes before we scrape anything.
        sel = self.selenium
        sel.open(response.url)
        sel.wait_for_page_to_load("30000")

        # Give asynchronous JavaScript a moment to finish rendering.
        time.sleep(2.5)
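The original snippet stops at the sleep, before anything is read back from the rendered page. A minimal sketch of how parse_page could continue, assuming Selenium RC's get_html_source() call and a hypothetical WebproxyItem with 'url' and 'body' fields (the original code imported Selector and WebproxyItem without using them, so the field names here are guesses, not the author's actual item):

    def parse_page(self, response):
        sel = self.selenium
        sel.open(response.url)
        sel.wait_for_page_to_load("30000")
        time.sleep(2.5)

        # The browser has now executed the page's JavaScript; read back the
        # rendered DOM and parse it with Scrapy's own selectors.
        from scrapy.selector import Selector
        html = sel.get_html_source()
        rendered = Selector(text=html)
        self.log('Rendered title: %s' % rendered.xpath('//title/text()').extract())

        # WebproxyItem and its 'url'/'body' fields are assumptions taken from
        # the unused import in the original snippet.
        from webproxy.items import WebproxyItem
        item = WebproxyItem()
        item['url'] = response.url
        item['body'] = html
        return item

To try it, start the Selenium RC server first and then run the spider with scrapy crawl cnbeta. Note that this is a historical technique: Selenium RC and SgmlLinkExtractor have long been deprecated, and current projects would reach for Selenium WebDriver or a headless-rendering service such as scrapy-splash instead.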
