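I had a small project that logs in to Taobao Alliance (Alimama) automatically and scrapes data. I chose Python because I had seen similar Python code on GitHub. This was my first time writing a real program in Python, and I was struck by how "simple" it is. Along the way I did run into trouble with encoding and with migrating the environment under version 2.7, but fortunately I got all of it solved.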
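Back to the point. The first problem in scraping Taobao Alliance data is logging in. In the past you would usually get stuck on CAPTCHAs; now that QR-code login is supported, it is actually simpler. Below is the Python login code: it fetches and prints the QR code, then keeps checking the scan status, and requests a new QR code if the current one expires. (Focus on the logic; some common methods are wrapped in helpers, so this is not guaranteed to run as is.)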
import sys
import time

import utils  # the project's own helper module (fetchAPI, printQRCode)


def getQRCode(enableCmdQR):
    # Request a new login QR code and print it for scanning.
    payload = {'_ksTS': str(time.time()), 'from': 'alimama'}
    qrCodeObj = utils.fetchAPI('https://qrlogin.taobao.com/qrcodelogin/generateQRCode4Login.do', payload, "json", None, True, True)
    print(qrCodeObj)
    utils.printQRCode('http:' + qrCodeObj['url'], enableCmdQR)
    lgToken = qrCodeObj['lgToken']
    return lgToken


def login(enableCmdQR=False):
    lgToken = getQRCode(enableCmdQR)
    code = 0
    successLoginURL = ""
    while code != 10006:
        payload = {'lgToken': lgToken,
                   'defaulturl': 'http%3A%2F%2Flogin.taobao.com%2Fmember%2Ftaobaoke%2Flogin.htm%3Fis_login%3D1&_ksTS=' + str(time.time())}
        rObj = utils.fetchAPI('https://qrlogin.taobao.com/qrcodelogin/qrcodeLoginCheck.do', payload, "json", True, False)
        code = int(rObj['code'])
        if 10000 == code:
            # Not scanned yet ("please scan the QR code to log in").
            # Fall through to the sleep below instead of polling in a tight loop.
            pass
        elif 10001 == code:
            print("QR code scanned, please confirm the login")
        elif 10004 == code:
            print("QR code expired, please scan the new one")
            # Fetch a fresh token and keep polling. Calling login() again here
            # would leave this loop running with the stale token.
            lgToken = getQRCode(enableCmdQR)
        elif 10006 == code:
            successLoginURL = rObj["url"]
            print("Login succeeded, redirecting")
        else:
            print("Unknown error, exiting")
            sys.exit(1)
        time.sleep(5)
    print("Login succeeded, redirecting to: " + successLoginURL)
    # Follow the two redirects by hand so the cookies set along the way are captured.
    r = utils.fetchAPI(successLoginURL, None, "raw", True, False, True)
    utils.fetchAPI(r.headers['Location'], None, "raw", True, True, False)
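The polling logic in login() can also be looked at on its own, separate from the network calls. What follows is only a minimal sketch, not the project's actual code: `poll_qr_login`, `check_status` and `new_token` are hypothetical names, and the two callables stand in for the `utils.fetchAPI` requests. The status codes are the ones used in login(). On expiry the sketch asks for a new token and keeps polling, which is one possible way to handle that case.

```python
import time

# Status codes as they appear in login().
WAITING, SCANNED, EXPIRED, CONFIRMED = 10000, 10001, 10004, 10006


def poll_qr_login(check_status, new_token, interval=5, sleep=time.sleep):
    """Poll until the QR code is confirmed, then return the redirect URL.

    check_status(token) -> dict with 'code' and, on success, 'url' (hypothetical)
    new_token()         -> a fresh lgToken string (hypothetical)
    """
    token = new_token()
    while True:
        result = check_status(token)
        code = int(result['code'])
        if code == CONFIRMED:
            return result['url']
        if code == EXPIRED:
            # The old QR code is dead: get a new token and keep polling with it.
            token = new_token()
        elif code not in (WAITING, SCANNED):
            raise RuntimeError('unexpected status code: %d' % code)
        sleep(interval)


# Dry run with canned replies instead of real HTTP requests.
replies = iter([
    {'code': '10000'},
    {'code': '10004'},
    {'code': '10001'},
    {'code': '10006', 'url': 'https://example.invalid/ok'},
])
tokens = iter(['token-1', 'token-2'])
url = poll_qr_login(lambda token: next(replies), lambda: next(tokens),
                    sleep=lambda seconds: None)
print(url)
```

Passing `sleep` in as a parameter keeps the loop testable without real waiting; in real use the default `time.sleep` with a 5 second interval matches the original code.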
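With login solved, the next problem is persisting the login state. Python's Requests library is very powerful, and in simple cases you can use requests.Session directly to keep a session. But many operations in this project are asynchronous, so the cookies have to be stored and read back explicitly, using pickle to serialize and deserialize the object. By default, saving cookies updates the stored set incrementally.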
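import pickle
import sys

import requests

import config  # the project's own settings module; COOKIE_FILE is the cookie file path


def save_cookies(cookies, overWrite=False):
    try:
        currentCookie = requests.utils.dict_from_cookiejar(cookies)
        if len(currentCookie) < 1:
            return
        # Read the old cookies before the file is opened for writing, which truncates it.
        oldCookie = requests.utils.dict_from_cookiejar(load_cookies())
        if not overWrite:
            # Incremental update: new values replace old ones with the same name.
            cookieDict = dict(oldCookie, **currentCookie)
        else:
            cookieDict = currentCookie
        with open(config.COOKIE_FILE, 'wb') as f:  # pickle data is binary
            pickle.dump(cookieDict, f)
        print('Saved cookie')
        print(cookieDict)
    except Exception:
        print('Save cookies failed: %s' % sys.exc_info()[0])
        sys.exit(99)


def load_cookies():
    try:
        with open(config.COOKIE_FILE, 'rb') as f:
            cookies = requests.utils.cookiejar_from_dict(pickle.load(f))
    except Exception:
        # No readable cookie file yet: return an empty cookie jar.
        cookies = requests.utils.cookiejar_from_dict({})
    return cookies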
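To make the difference between incremental and overwrite saving concrete, here is a small sketch that uses only the standard library and plain dicts, which is the form the cookies take once `requests.utils.dict_from_cookiejar` has converted them. The function names, the cookie names and the temporary file path are all made up for illustration and are not part of the project.

```python
import os
import pickle
import tempfile


def merge_cookies(old, new, overwrite=False):
    """Return the dict to store; on a name clash the new value wins."""
    if overwrite:
        return dict(new)
    merged = dict(old)
    merged.update(new)
    return merged


def load_cookie_dict(path):
    try:
        with open(path, 'rb') as f:
            return pickle.load(f)
    except (IOError, OSError, EOFError, pickle.UnpicklingError):
        return {}  # no usable file yet: start with no cookies


def save_cookie_dict(path, new, overwrite=False):
    old = load_cookie_dict(path)  # read first, opening with 'wb' truncates
    with open(path, 'wb') as f:   # pickle data is binary
        pickle.dump(merge_cookies(old, new, overwrite), f)


# Made-up cookie names, stored in a throwaway temp directory.
path = os.path.join(tempfile.mkdtemp(), 'cookies.pkl')
save_cookie_dict(path, {'token': 'abc', 'lang': 'zh'})
save_cookie_dict(path, {'token': 'xyz'})                  # incremental: 'lang' is kept
print(load_cookie_dict(path))
save_cookie_dict(path, {'token': 'new'}, overwrite=True)  # overwrite: 'lang' is dropped
print(load_cookie_dict(path))
```

The incremental mode suits the asynchronous case described above, where each response may carry only some of the cookies and the rest should survive.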