Scraping QQ Music comments with Selenium, two ways

21fanqie · posted on 2020-12-30 14:51:10

Here are two ways to scrape QQ Music comments with Selenium:
1. Selenium + BeautifulSoup:

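from selenium import webdriver  # import the webdriver module from the selenium package
import time
from bs4 import BeautifulSoup

driver = webdriver.Chrome()  # Note the capital "C" in Chrome. This sets Chrome as the engine and opens a real Chrome window.
'''
Chrome is set as the engine and the result is assigned to the variable driver.
driver is the browser instance. You will see it throughout the rest of the script,
which makes sense: this instance is what we control to do the work for us.
'''
driver.get('https://y.qq.com/n/yqq/song/000xdZuV2LcQ19.html')
# get(URL) is a webdriver method; its job is to open the page at the given URL.
time.sleep(3)
# time.sleep(3) waits three seconds, because the browser needs some time to load the page.
pageSource = driver.page_source
jiexi = BeautifulSoup(pageSource, 'html.parser')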
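#-----------------find each comment-------------------
Large_partition = jiexi.find_all(class_="comment__list_item c_b_normal js_cmt_li")
for i in Large_partition:
    node = i.find(class_="c_tx_normal comment__text js_hot_text")
    if node is None:  # skip an item without comment text instead of aborting the whole loop
        continue
    print(node.text, '\n')
driver.quit()  # the page source is already parsed, so the browser can be closed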
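
The parsing step can be tried on its own, without opening a browser. Below is a minimal sketch that runs the same kind of class lookup on a small hand-written HTML string. `SAMPLE_HTML` and `extract_comments` are made up for illustration, and the real QQ Music markup may well differ from this sample. It checks each item separately, so one item without a text node does not stop the rest from being read.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML that only mimics the class names used in the script above.
SAMPLE_HTML = """
<ul>
  <li class="comment__list_item c_b_normal js_cmt_li">
    <p class="c_tx_normal comment__text js_hot_text">Great song!</p>
  </li>
  <li class="comment__list_item c_b_normal js_cmt_li">
    <!-- an item with no comment text -->
  </li>
  <li class="comment__list_item c_b_normal js_cmt_li">
    <p class="c_tx_normal comment__text js_hot_text">  On repeat all day.  </p>
  </li>
</ul>
"""


def extract_comments(html):
    """Return the text of every comment in html, skipping items without text."""
    soup = BeautifulSoup(html, "html.parser")
    comments = []
    # Passing a single class name matches any element that carries that class,
    # regardless of the other classes or their order.
    for item in soup.find_all(class_="js_cmt_li"):
        text_node = item.find(class_="js_hot_text")
        if text_node is None:
            continue  # skip this item only and keep going
        comments.append(text_node.get_text(strip=True))
    return comments


if __name__ == "__main__":
    for comment in extract_comments(SAMPLE_HTML):
        print(comment, "\n")
```

To use it on the live page, pass `driver.page_source` instead of `SAMPLE_HTML`.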

2. Selenium on its own:

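from selenium import webdriver  # import the webdriver module from the selenium package
from selenium.webdriver.common.by import By  # locator strategies for find_element / find_elements
import time

driver = webdriver.Chrome()

driver.get('https://y.qq.com/n/yqq/song/000xdZuV2LcQ19.html')
time.sleep(3)

# Click the element with class 'comment__show_all' five times, pausing two seconds after each click
for i in range(5):
    more = driver.find_element(By.CLASS_NAME, 'comment__show_all')
    more.click()
    time.sleep(2)

comments = driver.find_element(By.CLASS_NAME, 'js_hot_list').find_elements(By.CLASS_NAME, 'js_cmt_li')  # locate the comments by class name
print(len(comments))  # print the number of comments found

for ii in comments:
    small = ii.find_element(By.TAG_NAME, 'p')
    print('Comment: %s\n ---\n' % small.text)
driver.quit()  # close the browser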
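
Both scripts pause with fixed `time.sleep` calls, which can be too short on a slow connection and wasteful on a fast one. One alternative is to poll until the content actually shows up. The helper below is only a sketch of that idea in plain Python; `wait_until` is a name made up for this post, not part of Selenium (Selenium ships its own `WebDriverWait` class for the same purpose).

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in for a page that becomes ready on the third check.
    checks = []

    def page_ready():
        checks.append(1)
        return len(checks) >= 3

    wait_until(page_ready, timeout=5.0, interval=0.1)
    print("ready after", len(checks), "checks")
```

With a real driver, the condition could be something like `lambda: driver.find_elements(By.CLASS_NAME, 'js_cmt_li')`, since `find_elements` returns an empty list until matching elements exist.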
