How to create a file when scraping a web page with Python

小億
2023-09-15 02:10:30

To fetch a web page in Python and save its content to a file, follow these steps:

  1. Import the required library:
     import requests
  2. Send a GET request to fetch the page content:
     url = "https://www.example.com"  # replace with the URL of the page to fetch
     response = requests.get(url)
  3. Check the response status code to make sure the request succeeded:
     if response.status_code == 200:
         pass  # continue processing the response content here
     else:
         print("Request failed")
  4. Create the file and write the page content to it:
     file_path = "output.html"  # replace with the path and name of the file to create
     with open(file_path, "w", encoding="utf-8") as file:
         file.write(response.text)
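
The open() call in the last step fails with FileNotFoundError if the target folder does not exist yet. A minimal sketch of one way to handle that, using only the standard library, is shown here. The helper name save_page is hypothetical and not part of the original steps; it takes the page text (for example response.text) and a path, and creates any missing parent folders before writing.

```python
from pathlib import Path


def save_page(text: str, file_path: str) -> Path:
    """Write text to file_path as UTF-8, creating missing parent folders.

    save_page is an illustrative helper name, not part of the original steps.
    """
    path = Path(file_path)
    # Create intermediate directories; do nothing if they already exist.
    path.parent.mkdir(parents=True, exist_ok=True)
    # write_text opens the file, writes the text, and closes the file.
    # An existing file at this path is overwritten.
    path.write_text(text, encoding="utf-8")
    return path
```

With this helper, the last step could be written as save_page(response.text, "pages/output.html"), where "pages/output.html" is just an example path.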

Complete code example:

import requests

url = "https://www.example.com"  # replace with the URL of the page to fetch
response = requests.get(url)
if response.status_code == 200:
    file_path = "output.html"  # replace with the path and name of the file to create
    with open(file_path, "w", encoding="utf-8") as file:
        file.write(response.text)
    print("File created successfully")
else:
    print("Request failed")

This code fetches the content of the specified web page and saves it to a file named "output.html". You can change the file path and name as needed.
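
When saving several pages, it can be convenient to derive the file name from the URL instead of hard-coding it. The following is a rough sketch of that idea using the standard library's urllib.parse. The function name filename_from_url and the fallback name "index.html" are assumptions made for illustration, and percent-encoded characters in the URL are left as they are.

```python
from urllib.parse import urlparse


def filename_from_url(url: str, default: str = "index.html") -> str:
    """Return the last path segment of url, or default when there is none.

    Both the function name and the default value are illustrative choices.
    """
    # urlparse splits the URL; .path excludes the query string and fragment.
    path = urlparse(url).path
    # Take everything after the final "/"; this is empty for "" or a
    # path that ends with "/".
    name = path.rsplit("/", 1)[-1]
    return name or default
```

The result can then be used as file_path in the example above, for instance file_path = filename_from_url(url).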
