I. Introduction to Scrapy
Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
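Official homepage: http://www.scrapy.org/
II. Install Python 2.7
Download: http://www.python.org/ftp/python/2.7.3/python-2.7.3.msi
1) Install Python
Installation directory: D:/Python27
2) Add the environment variable
Details omitted. The setting is under System Properties -> Advanced -> Environment Variables -> System Variables -> Path -> Edit; append D:/Python27 and D:/Python27/Scripts to Path.
3) Verify the environment variable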
T:/>set Path
Path=C:/WINDOWS/system32;C:/WINDOWS;C:/WINDOWS/System32/Wbem;D:/Rational/common;D:/Rational/ClearCase/bin;D:/Python27;D:/Python27/Scripts
PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH
4) Verify Python
T:/>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
T:/>
III. Install Twisted
Twisted is an event-driven networking engine written in Python and licensed under the open source MIT license.
1) Install setuptools
Download, build, install, upgrade, and uninstall Python packages -- easily!
Official homepage: http://pypi.python.org/pypi/setuptools
Download: http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11.win32-py2.7.exe
Installation: omitted
2) Install zope.interface
Official homepage: http://pypi.python.org/pypi/zope.interface/
Download: http://pypi.python.org/packages/2.7/z/zope.interface/zope.interface-4.0.1-py2.7-win32.egg
Installation:
T:/>d:
D:/>cd D:/Python27/Scripts
D:/Python27/Scripts>easy_install.exe zope.interface-4.0.1-py2.7-win32.egg
Processing zope.interface-4.0.1-py2.7-win32.egg
creating d:/python27/lib/site-packages/zope.interface-4.0.1-py2.7-win32.egg
Extracting zope.interface-4.0.1-py2.7-win32.egg to d:/python27/lib/site-packages
Adding zope.interface 4.0.1 to easy-install.pth file
Installed d:/python27/lib/site-packages/zope.interface-4.0.1-py2.7-win32.egg
Processing dependencies for zope.interface==4.0.1
Finished processing dependencies for zope.interface==4.0.1
D:/Python27/Scripts>
Verify the installation:
D:/Python27/Scripts>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import zope.interface
>>>
3) Install Twisted
Official homepage: http://twistedmatrix.com/trac/wiki/TwistedProject
Download: http://pypi.python.org/packages/2.7/T/Twisted/Twisted-12.1.0.win32-py2.7.msi
Installation: omitted
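The manual `import` check used for zope.interface can be repeated for every package in this section. The following is a minimal sketch and not part of the original walkthrough: the helper name `check_modules` is hypothetical, and the import names `setuptools`, `zope.interface`, and `twisted` are assumed to match the packages installed above. It only reports whether each module imports; it does not check versions.

```python
import importlib


def check_modules(names):
    """Return a dict mapping each module name to True if it imports, else False."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            # Raised when the module (or a parent package) is not installed.
            results[name] = False
    return results


if __name__ == "__main__":
    # Assumed import names for the packages installed in this section.
    wanted = ["setuptools", "zope.interface", "twisted"]
    for name, ok in sorted(check_modules(wanted).items()):
        print("%-16s %s" % (name, "OK" if ok else "MISSING"))
```

The same helper could be called later with `w3lib`, `libxml2`, and `OpenSSL`, which are the import names used in the verification steps of the following sections.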
IV. Install w3lib
Official homepage: http://pypi.python.org/pypi/w3lib
Download: http://pypi.python.org/packages/source/w/w3lib/w3lib-1.2.tar.gz
Extraction: omitted
Installation:
T:/w3lib-1.2>python setup.py install
running install
running build
running build_py
creating build
creating build/lib
creating build/lib/w3lib
copying w3lib/encoding.py -> build/lib/w3lib
copying w3lib/form.py -> build/lib/w3lib
copying w3lib/html.py -> build/lib/w3lib
copying w3lib/http.py -> build/lib/w3lib
copying w3lib/url.py -> build/lib/w3lib
copying w3lib/util.py -> build/lib/w3lib
copying w3lib/__init__.py -> build/lib/w3lib
running install_lib
creating D:/Python27/Lib/site-packages/w3lib
copying build/lib/w3lib/encoding.py -> D:/Python27/Lib/site-packages/w3lib
copying build/lib/w3lib/form.py -> D:/Python27/Lib/site-packages/w3lib
copying build/lib/w3lib/html.py -> D:/Python27/Lib/site-packages/w3lib
copying build/lib/w3lib/http.py -> D:/Python27/Lib/site-packages/w3lib
copying build/lib/w3lib/url.py -> D:/Python27/Lib/site-packages/w3lib
copying build/lib/w3lib/util.py -> D:/Python27/Lib/site-packages/w3lib
copying build/lib/w3lib/__init__.py -> D:/Python27/Lib/site-packages/w3lib
byte-compiling D:/Python27/Lib/site-packages/w3lib/encoding.py to encoding.pyc
byte-compiling D:/Python27/Lib/site-packages/w3lib/form.py to form.pyc
byte-compiling D:/Python27/Lib/site-packages/w3lib/html.py to html.pyc
byte-compiling D:/Python27/Lib/site-packages/w3lib/http.py to http.pyc
byte-compiling D:/Python27/Lib/site-packages/w3lib/url.py to url.pyc
byte-compiling D:/Python27/Lib/site-packages/w3lib/util.py to util.pyc
byte-compiling D:/Python27/Lib/site-packages/w3lib/__init__.py to __init__.pyc
running install_egg_info
Writing D:/Python27/Lib/site-packages/w3lib-1.2-py2.7.egg-info
T:/w3lib-1.2>
Verify the installation:
T:/>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import w3lib
>>>
V. Install libxml2
Official homepage: http://users.skynet.be/sbi/libxml-python/
Download: http://users.skynet.be/sbi/libxml-python/binaries/libxml2-python-2.7.7.win32-py2.7.exe
Installation: omitted
Verify the installation:
T:/>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import libxml2
>>>
VI. Install pyOpenSSL
Official homepage: http://pypi.python.org/pypi/pyOpenSSL
Download: http://pypi.python.org/packages/2.7/p/pyOpenSSL/pyOpenSSL-0.13.winxp32-py2.7.msi
Installation: omitted
Verify the installation:
T:/>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import OpenSSL
>>>
VII. Install Scrapy
Official homepage: http://scrapy.org/
Download: http://pypi.python.org/packages/source/S/Scrapy/Scrapy-0.14.4.tar.gz
Extraction: omitted
Installation:
T:/Scrapy-0.14.4>python setup.py install
……
Installing easy_install-2.7-script.py script to D:/Python27/Scripts
Installing easy_install-2.7.exe script to D:/Python27/Scripts
Installing easy_install-2.7.exe.manifest script to D:/Python27/Scripts
Using d:/python27/lib/site-packages
Finished processing dependencies for Scrapy==0.14.4
T:/Scrapy-0.14.4>
Verify the installation:
T:/>scrapy
Scrapy 0.14.4 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command
T:/>
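If the `scrapy` command is not recognized at this point, the usual cause is that the Scripts directory is missing from Path. The lookup that the command prompt performs can be approximated as follows. This is a simplified sketch under stated assumptions, not the exact Windows algorithm: the function name `find_on_path` is hypothetical, the `exists` parameter is there only so the logic can be tried without touching the real file system, and the current-directory search that cmd.exe also performs is ignored.

```python
import os


def find_on_path(program, path, pathext, exists=os.path.isfile):
    """Return the first candidate file for `program` found via `path`, or None.

    `path` and `pathext` are semicolon-separated strings, in the form
    printed by `set Path` on Windows.
    """
    directories = [d for d in path.split(";") if d]
    # Try the bare name first, then the name with each PATHEXT extension.
    extensions = [""] + [e for e in pathext.split(";") if e]
    for directory in directories:
        for ext in extensions:
            candidate = directory.rstrip("/\\") + "/" + program + ext
            if exists(candidate):
                return candidate
    return None


if __name__ == "__main__":
    # Reads the live environment; prints None when nothing matches
    # (for example on a non-Windows system, where PATH is colon-separated).
    print(find_on_path("scrapy",
                       os.environ.get("PATH", ""),
                       os.environ.get("PATHEXT", "")))
```

A result of `None` on the machine described here would suggest revisiting the Path setting from section II before reinstalling anything.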