【Python】Scraping Qunar (去哪儿) attraction reviews with the requests library

The result is as shown in the figure:

Same idea as the Lvmama (驴妈妈) scraper: change the sightId in the request parameters and the same script can crawl the reviews of any other attraction.

from gevent import monkey

monkey.patch_all()  # patch the standard library before anything else is imported

import gevent
import openpyxl
import requests
import time

finishPage = 0  # not used in this single-threaded version
allList = []    # collects one [author, date, score, image count, text] row per review
page = 2        # not used; the page number is passed to comment() instead

def comment(sightId, page):
    url = "https://piao.qunar.com/ticket/detailLight/sightCommentList.json"
    params = {
        "sightId": str(sightId),
        "index": str(page),
        "page": str(page),
        "pageSize": "10",
        "tagType": "0",
    }
    headers = {
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36",
    }
    res = requests.get(url=url, headers=headers, params=params, timeout=5)
    # Check whether the server returned real data: when the second character of the
    # body is "r" the response is treated as an anti-crawler/error page and re-requested.
    while res.text[1] == "r":
        res = requests.get(url=url, headers=headers, params=params, timeout=5)
    results = res.json()["data"]
    for result in results["commentList"]:
        # reviewer ID
        author = result["author"]
        # review date
        publishedDate = result["date"]
        # overall score
        score = result["score"]
        # number of attached photos
        imgNum = len(result["imgs"])
        # review text
        text = result["content"]
        commentList = [author, publishedDate, score, imgNum, text]
        allList.append(commentList)
        print(commentList)
    time.sleep(5)  # pause between pages to keep the request rate polite
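
The res.text[1] == "r" test above is a very crude way to decide whether the server answered with real data. Below is a minimal alternative sketch that only assumes what the original code already relies on (a successful response is JSON carrying a "data" key); the helper name and retry count are made up for illustration, not part of the original script:

import requests

def fetch_comment_page(sight_id, page, max_retries=3):
    # Hypothetical helper: fetch one page of comments and retry a few times
    # when the body is not the JSON payload comment() expects.
    url = "https://piao.qunar.com/ticket/detailLight/sightCommentList.json"
    params = {"sightId": str(sight_id), "index": str(page),
              "page": str(page), "pageSize": "10", "tagType": "0"}
    headers = {"user-agent": "Mozilla/5.0"}
    for _ in range(max_retries):
        try:
            res = requests.get(url, headers=headers, params=params, timeout=5)
            payload = res.json()
        except (requests.RequestException, ValueError):
            continue  # network error or non-JSON body: try again
        if isinstance(payload, dict) and payload.get("data"):
            return payload["data"]  # same structure the original indexes with ["data"]
    return None  # caller decides what to do when every attempt fails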

def storage(name, reviewsList):
    # column headers: reviewer ID, review date, overall score, image count, review text
    header = ['评论者ID', '评论日期', '总评分', '图片数量', '文本评论']
    wb = openpyxl.Workbook()
    sheet = wb.active
    sheet.title = "commentInfo"
    sheet.append(header)
    for reviewList in reviewsList:
        sheet.append(reviewList)
    # the 存储/ directory must already exist, otherwise save() raises FileNotFoundError
    wb.save("存储/去哪儿 " + name + "'s " + 'comment.xlsx')
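
Once the script has run, the export can be sanity-checked by reading the workbook back with openpyxl; the path below simply mirrors the one built by storage() for the "windows of the world" run, so adjust it if the file lands elsewhere:

import openpyxl

# Re-open the exported workbook and print the first few rows as a sanity check.
wb = openpyxl.load_workbook("存储/去哪儿 windows of the world's comment.xlsx")
sheet = wb["commentInfo"]
for row in sheet.iter_rows(min_row=1, max_row=5, values_only=True):
    print(row)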

if __name__ == "__main__":
    taskList = []
    for i in range(1, 301):
        try:
            # task = gevent.spawn(comment, 191026, i)
            # taskList.append(task)
            # gevent.joinall(taskList)
            comment(191026, i)  # 191026 is the sightId of Windows of the World
        except Exception:
            pass  # skip pages that fail to download or parse
    storage("windows of the world", allList)
    print(len(allList))
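
The commented-out lines show that a gevent version was planned. Here is a rough sketch of that idea, reusing comment(), storage() and allList from the script above and batching the pages so the site is not hit with 300 concurrent requests; the batch size of 10 is an arbitrary choice, not something the original code specifies:

from gevent import monkey
monkey.patch_all()
import gevent

BATCH = 10  # pages per batch; small batches keep the request rate modest
for start in range(1, 301, BATCH):
    tasks = [gevent.spawn(comment, 191026, page)
             for page in range(start, start + BATCH)]
    gevent.joinall(tasks)  # wait for the whole batch before starting the next
storage("windows of the world", allList)

Swapping 191026 for another attraction's sightId is the only change needed to reuse the whole pipeline, which is the point made at the top of the post.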
