Using a Python crawler to get a residential community's coordinates and structured address

2020-02-16 00:26:21
Source: reposted
Contributed by: a site user

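This article shares working code for a Python crawler that fetches a residential community's latitude, longitude, and address, for your reference. The details are as follows.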

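Given a community's name, the Baidu API can return the community's address along with its latitude and longitude. However, the addresses in the API's responses do not follow a consistent format. A first pass can therefore crawl by community name to obtain the coordinates, and a second pass can reverse-geocode those coordinates into a structured address. In addition, if a community name has the form '...號', appending '院' after the '號' before crawling gives more accurate results. This version of the program is also easier to reuse: you only need to pass it a DataFrame (with 'name' and 'city' columns) and wait for the result. The program is finished; how well it performs remains to be seen in day-to-day work.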
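The name clean-up described above can be sketched as follows. This is a minimal illustration rather than the author's code: the helper name `add_courtyard_suffix` and the sample names are hypothetical, and matching both the traditional '號' and the simplified '号' is an assumption about what the input data may contain.

```python
def add_courtyard_suffix(name: str) -> str:
    """Append '院' to a community name that ends with '號' (or simplified '号').

    Names that do not end with that character are returned unchanged.
    """
    # Assumption: the name may be written in traditional or simplified Chinese.
    if name.endswith(('號', '号')):
        return name + '院'
    return name


# Hypothetical sample names, for illustration only.
names = ['中關村南大街5號', '陽光花園']
cleaned = [add_courtyard_suffix(n) for n in names]
print(cleaned)
```

With a DataFrame, the same helper could be applied to the 'name' column (for example through `Series.apply`) before the frame is handed to the crawler.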

import re
from urllib import request
import urllib.parse as urp

import numpy as np
import pandas as pd


class GetAddressInfo:
    def __init__(self, df):
        # The input must be a DataFrame with 'city' and 'name' columns.
        assert isinstance(df, pd.DataFrame) and ('city' in df.columns) and ('name' in df.columns), \
            'The dataframe is not valid'
        self.__data__ = df

    def get_address(self):
        # Result columns: community longitude, latitude, and address.
        self.__data__['小區經度'] = np.nan
        self.__data__['小區緯度'] = np.nan
        self.__data__['小區地址'] = None  # object column, since it will hold strings
        for i in self.__data__.index:
            lat, lng, address = self.__get_neigbour_address__(
                self.__data__.loc[i, 'name'], self.__data__.loc[i, 'city'])
            self.__data__.loc[i, '小區緯度'] = lat
            self.__data__.loc[i, '小區經度'] = lng
            self.__data__.loc[i, '小區地址'] = address
        return self.__data__

    def __lat__(self, res):
        # Take the first "lat" value in the raw response text; 0 on failure.
        try:
            return pd.to_numeric(re.findall('"lat":(.*)', res)[0].split(',')[0])
        except Exception:
            return 0

    def __lng__(self, res):
        # Take the first "lng" value in the raw response text; 0 on failure.
        try:
            return pd.to_numeric(re.findall('"lng":(.*)', res)[0])
        except Exception:
            return 0

    def __address__(self, res):
        # Take the first "address" value in the raw response text.
        try:
            return re.findall('"address":"(.*)",', res)[0]
        except Exception:
            return 'None'

    def __get_neigbour_address__(self, name, city):
        my_ak = 'YOUR_AK'  # replace with your own ak
        query = urp.quote(name)
        tag = urp.quote('住宅區')  # tag value meaning "residential area"
        try:
            url = ('http://api.map.baidu.com/place/v2/search?query=' + query
                   + '&tag=' + tag + '&region=' + urp.quote(city)
                   + '&output=json&ak=' + my_ak)
            req = request.urlopen(url)
            res = req.read().decode()
            lat = self.__lat__(res)
            lng = self.__lng__(res)
            address = self.__address__(res)
            return lat, lng, address
        except Exception:
            return 0, 0, 'None'


class ReverseGetAddress:
    def __init__(self, data):
        # Expects the coordinate columns produced by GetAddressInfo.get_address().
        assert ('小區緯度' in data.columns) and ('小區經度' in data.columns) and ('name' in data.columns), \
            'The DataFrame is not valid'
        self.__data__ = data

    def __get_address1__(self, url):
        # Fetch one reverse-geocoding URL and pull out the first address field.
        try:
            req = request.urlopen(url)
            res = req.read().decode()
            address = re.findall('address":"(.*?)"', res)[0]
            return address
        except Exception:
            return 'None1'

    def __to_string__(self, arr):
        return str(arr)

    def __get_address2__(self):
        # Build one request URL per row as a Series ("lat,lng" order).
        my_ak = 'YOUR_AK'  # replace with your own ak
        base_url1 = 'http://api.map.baidu.com/geocoder/v2/?callback=renderReverse'
        base_url2 = '&location='
        base_url3 = '&pois=0&radius=1&output=json&pois=1&ak='
        url = base_url1 + base_url2 + self.__data__['小區緯度'].apply(self.__to_string__) + ',' \
            + self.__data__['小區經度'].apply(self.__to_string__) + base_url3 + my_ak
        return url

    def get_address(self):
        url = self.__get_address2__()
        self.__data__['小區地址'] = url.apply(self.__get_address1__)
        return self.__data__
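The code above extracts "lat", "lng", and "address" from the raw response text with regular expressions. As a possible alternative, the response could be parsed with the standard json module. The sketch below is not part of the original program: the function name `parse_place_response` is hypothetical, and the sample string is made-up data that only assumes a 'results' list whose items carry a 'location' object and an 'address' field.

```python
import json


def parse_place_response(res: str):
    """Return (lat, lng, address) from the first result, or (0, 0, 'None').

    The fallback tuple mirrors the defaults used by the regex-based version.
    """
    try:
        first = json.loads(res)['results'][0]
        location = first['location']
        return float(location['lat']), float(location['lng']), first['address']
    except (ValueError, KeyError, IndexError, TypeError):
        # Malformed JSON, an empty result list, or missing fields.
        return 0, 0, 'None'


# Made-up sample response, shaped according to the assumption stated above.
sample = ('{"status": 0, "results": [{"name": "Example Community", '
          '"location": {"lat": 39.9, "lng": 116.4}, '
          '"address": "Example Road 1"}]}')
print(parse_place_response(sample))
```

Parsing the whole document avoids depending on how the response happens to be laid out across lines, which the greedy `(.*)` patterns are sensitive to.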