Give me Python crawler code that fetches the Baidu homepage

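### Example Python crawler for the Baidu homepage

Below is example code that fetches the content of the Baidu homepage in Python, built on the `requests` and `BeautifulSoup` libraries[^4].

```python
# Import the required modules
import requests
from bs4 import BeautifulSoup

# Set request headers so the request looks like it comes from a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
}

# Target URL
url = "http://www.baidu.com"

# Send the HTTP request; the timeout keeps the script from hanging indefinitely
resp = requests.get(url, headers=headers, timeout=10)

# Check the response status code
if resp.status_code == 200:
    # Use the encoding detected from the body, in case the headers declare no charset
    resp.encoding = resp.apparent_encoding

    # Parse the response body as HTML
    soup = BeautifulSoup(resp.text, "html.parser")

    # Print the page title (soup.title is None if the page has no <title>)
    title = soup.title.string if soup.title else None
    print("Baidu homepage title:", title)

    # Save the page content to a local file
    with open("baidu_homepage.html", "w", encoding="utf-8") as fp:
        fp.write(resp.text)
    print("Page content saved to baidu_homepage.html")
else:
    print(f"Request failed, status code: {resp.status_code}")
```

#### Notes

1. The `requests` library sends the HTTP request and retrieves the content of the Baidu homepage.
2. The `User-Agent` header imitates a browser, which reduces the chance that the server rejects the request.
3. `BeautifulSoup` parses the HTML and extracts the page title.
4. The retrieved page content is saved to a local file.

If you want a simpler implementation with no third-party dependencies, the crawl can be done with the `urllib` library alone[^1].

```python
# Import the urllib module
import urllib.request

# Target URL
url = "http://www.baidu.com/"

# Send the HTTP request and read the response body
with urllib.request.urlopen(url, timeout=10) as response:
    html_content = response.read().decode("utf-8")

# Print the page content
print(html_content)

# Save the page content to a local file
with open("baidu_homepage_urllib.html", "w", encoding="utf-8") as fp:
    fp.write(html_content)
print("Page content saved to baidu_homepage_urllib.html")
```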
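The `urllib` version sends no `User-Agent` header and does not handle failed requests. The following is a minimal sketch of one way to add both while staying inside the standard library. The names `build_request`, `extract_title`, `fetch_html`, and `TitleParser`, the shortened `User-Agent` string, and the 10-second timeout are assumptions made for illustration and are not part of the original answer.

```python
import urllib.request
from html.parser import HTMLParser

# Assumed illustrative values; adjust them to your needs.
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
TIMEOUT_SECONDS = 10


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        text = "".join(self._parts).strip()
        return text or None


def build_request(url):
    """Create a request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def extract_title(html):
    """Return the page title, or None if the HTML has no non-empty <title>."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title


def fetch_html(url):
    """Return the decoded page text, or None if the request fails."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=TIMEOUT_SECONDS) as response:
            # Prefer the charset declared by the server; fall back to UTF-8.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except OSError as exc:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        print(f"Request failed: {exc}")
        return None
```

To use it, call `fetch_html("http://www.baidu.com/")` and, if the result is not `None`, pass it to `extract_title`. Keeping the parsing in a separate function means it can be checked against a saved HTML string without touching the network.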

Disclosure: parts of this article were generated with AI assistance (AIGC) and are for reference only.
