Scraping Bilibili popular video information with Python
Date: 2024-06-05 08:03:36
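Scraping information about popular videos on Bilibili with Python requires the requests and BeautifulSoup libraries. The rough steps are:
1. Send a request to fetch the page source
2. Parse the page source and extract the video information
3. Store the data
The concrete implementation can follow these steps: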
1. Import the requests and BeautifulSoup libraries
```python
import requests
from bs4 import BeautifulSoup
```
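2. Send a request to fetch the page source
```python
url = 'https://www.bilibili.com/v/popular/rank/all'
# Send a browser-like User-Agent instead of the default one used by requests.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
html = response.text
```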
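3. Parse the page source and extract the video information
```python
soup = BeautifulSoup(html, 'html.parser')
# The class names below depend on the page's markup; adjust them if nothing matches.
video_list = soup.find_all('li', class_='rank-item')
for video in video_list:
    title = video.find('a', class_='title').text.strip()
    author = video.find('a', class_='name').text.strip()
    # find_all returns a list, so it must be indexed. Assumption: the first
    # span holds the play count and the second holds the danmaku count.
    spans = video.find('span', class_='data-box').find_all('span')
    play = spans[0].text.strip()
    danmu = spans[1].text.strip()
    print('Title:', title)
    print('Author:', author)
    print('Plays:', play)
    print('Danmaku:', danmu)
```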
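The parsing loop assumes that every `find` call matches. BeautifulSoup's `find` returns `None` when nothing matches, so a missing tag raises `AttributeError`. The counts are also scraped as display strings, which on Bilibili are often abbreviated, for example `123.4万`. The following is a minimal sketch of two helpers for these cases. The names `safe_text` and `parse_count` are hypothetical, and the code assumes that 万 (10,000) and 亿 (100,000,000) are the only unit suffixes that appear.

```python
from decimal import Decimal, InvalidOperation

# Multipliers for the unit suffixes assumed to appear in display counts.
UNITS = {'万': 10_000, '亿': 100_000_000}


def safe_text(node, default=''):
    """Return the stripped text of a tag, or `default` if the tag is None."""
    if node is None:
        return default
    return node.text.strip()


def parse_count(text):
    """Convert a display string such as '123.4万' to an int, or None if it cannot be parsed."""
    text = text.strip()
    multiplier = 1
    if text and text[-1] in UNITS:
        multiplier = UNITS[text[-1]]
        text = text[:-1]
    try:
        # Decimal avoids the rounding error a float multiplication could introduce.
        return int(Decimal(text) * multiplier)
    except (InvalidOperation, ValueError, OverflowError):
        return None
```

For example, `safe_text(video.find('a', class_='name'))` can stand in for `video.find('a', class_='name').text.strip()`, and the scraped count string can be passed through `parse_count` before it is stored.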
4. Store the data
The data can be saved to a local file or to a database. Here it is saved to a local file as an example:
```python
with open('bilibili.txt', 'w', encoding='utf-8') as f:
    for video in video_list:
        title = video.find('a', class_='title').text.strip()
        author = video.find('a', class_='name').text.strip()
        # Assumption: the first span is the play count, the second the danmaku count.
        spans = video.find('span', class_='data-box').find_all('span')
        play = spans[0].text.strip()
        danmu = spans[1].text.strip()
        f.write('Title: {}\nAuthor: {}\nPlays: {}\nDanmaku: {}\n\n'.format(title, author, play, danmu))
```
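For the database option, the following is a minimal sketch using Python's built-in `sqlite3` module. The table name `videos`, its columns, and the `save_videos` helper are assumptions made for illustration. The rows are expected to be `(title, author, play, danmu)` tuples collected during parsing.

```python
import sqlite3


def save_videos(conn, rows):
    """Insert (title, author, play, danmu) tuples into a `videos` table."""
    conn.execute(
        'CREATE TABLE IF NOT EXISTS videos ('
        'title TEXT, author TEXT, play TEXT, danmu TEXT)'
    )
    # Parameterized query: sqlite3 binds the values, so they are never
    # formatted into the SQL string itself.
    conn.executemany('INSERT INTO videos VALUES (?, ?, ?, ?)', rows)
    conn.commit()


# Usage (the file name is a hypothetical choice):
#   conn = sqlite3.connect('bilibili.db')
#   save_videos(conn, rows)
#   conn.close()
```

Keeping the counts as text stores them exactly as they were scraped. They can be converted to integers before insertion if numeric sorting is needed.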