How to implement a Python crawler with the requests library

Published: 2020-12-18 14:28:11  Source: 億速云  Views: 166  Author: Leah  Category: Development

How do you implement a Python crawler with the requests library? Many people without experience get stuck on this, so this article collects the common cases and shows how to handle each one.

The requests library

Install with pip:
pip install requests

Basic requests

import requests

req = requests.get("https://www.baidu.com/")
req = requests.post("https://www.baidu.com/")
req = requests.put("https://www.baidu.com/")
req = requests.delete("https://www.baidu.com/")
req = requests.head("https://www.baidu.com/")
req = requests.options("https://www.baidu.com/")
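Each helper above differs only in the HTTP method it sends. As a rough illustration, the sketch below builds the same requests with requests.Request and prepare() so the method and URL can be inspected without any network traffic. The names URL, METHODS and prepared are made up for this example.

```python
import requests

URL = "https://www.baidu.com/"

# Build, but do not send, one request per HTTP method.
# requests.get(URL), requests.post(URL), ... send the same method/URL pairs.
METHODS = ["GET", "POST", "PUT", "DELETE", "HEAD", "OPTIONS"]
prepared = {m: requests.Request(m, URL).prepare() for m in METHODS}

for name, req in prepared.items():
    print(name, "->", req.method, req.url)
```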

1. GET requests

Query parameters are passed as a dictionary through params; requests URL-encodes them and appends them to the URL:

import requests
from fake_useragent import UserAgent  # library that generates User-Agent strings

headers = {"User-Agent": UserAgent().random}  # pick a random User-Agent
url = "https://www.baidu.com/s"  # target URL
params = {
    "wd": "豆瓣"  # query string appended to the URL
}

requests.get(url, headers=headers, params=params)
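To see what the params dictionary turns into, the following sketch prepares the same GET request without sending it and prints the final URL. It leaves out the User-Agent header to stay dependency-free; the variable name prepared is chosen for this example.

```python
import requests

url = "https://www.baidu.com/s"
params = {"wd": "豆瓣"}

# Prepare the request without sending it, to inspect the final URL.
prepared = requests.Request("GET", url, params=params).prepare()
print(prepared.url)  # https://www.baidu.com/s?wd=%E8%B1%86%E7%93%A3
```

The non-ASCII value is percent-encoded as UTF-8, so there is no need to encode it by hand.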


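The call returns a Response object whose printed form shows only the status code, so to get the page content, read its text attribute:

# GET request
import requests
from fake_useragent import UserAgent

headers = {"User-Agent": UserAgent().random}
url = "https://www.baidu.com/s"
params = {
    "wd": "豆瓣"
}

response = requests.get(url, headers=headers, params=params)
response.text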
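In a crawler it usually helps to check the status code before trusting the body. A minimal sketch, assuming a fixed User-Agent string in place of fake_useragent; fetch_text and HEADERS are hypothetical names introduced for this example:

```python
import requests

# Hypothetical fixed User-Agent, used here instead of fake_useragent
# so the sketch has no extra dependency.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; example-crawler)"}


def fetch_text(url, params=None, timeout=10):
    """Return the decoded body of a GET request, or raise on HTTP errors."""
    response = requests.get(url, headers=HEADERS, params=params, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx
    return response.text


# Example call (needs network access):
# html = fetch_text("https://www.baidu.com/s", params={"wd": "豆瓣"})
```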

2. POST requests

Form fields are also passed as a dictionary, through data; a JSON body can be sent with the json parameter instead:

import requests
from fake_useragent import UserAgent

headers = {"User-Agent": UserAgent().random}

url = "https://www.baidu.cn/index/login/login"  # URL of the login endpoint
params = {
    "user": "1351351335",  # account
    "password": "123456"  # password
}

response = requests.post(url, headers=headers, data=params)
response.text


This example needs a real login page. I picked an arbitrary URL and did not actually log in, so the response is not a logged-in page. To test a real login, try the code against an actual login page.
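The difference between data and json is easiest to see on a prepared request. The sketch below reuses the placeholder login URL and credentials from the example and inspects the body and Content-Type that would be sent, without contacting any server; form_req and json_req are names made up here.

```python
import requests

url = "https://www.baidu.cn/index/login/login"  # placeholder URL from the text
payload = {"user": "1351351335", "password": "123456"}

# data= sends a URL-encoded form body.
form_req = requests.Request("POST", url, data=payload).prepare()
# json= serializes the dictionary and sets a JSON Content-Type.
json_req = requests.Request("POST", url, json=payload).prepare()

print(form_req.headers["Content-Type"], form_req.body)
print(json_req.headers["Content-Type"], json_req.body)
```

Which one a site expects depends on its login endpoint, so check the request the browser sends before choosing.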

3. IP proxies

Crawlers often use a proxy to avoid having their IP banned, and requests supports this through the proxies parameter.

# IP proxy

import requests
from fake_useragent import UserAgent

headers = {"User-Agent": UserAgent().random}
url = "http://httpbin.org/get"  # returns the IP the request came from

proxies = {
    "http": "http://yonghuming:123456@192.168.1.1:8088"  # http://username:password@IP:port
    # "http": "https://182.145.31.211:4224"  # or just IP:port
}

requests.get(url, headers=headers, proxies=proxies)

Proxy IPs can be found on Kuaidaili (快代理) or purchased.
http://httpbin.org/get returns information about your current request, including the IP it came from:
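Keys in the proxies dictionary are matched against the URL scheme, so a dictionary with only an "http" key does not cover https:// URLs. The sketch below uses requests.utils.select_proxy to show which entry is picked for a given URL. The proxy address is the placeholder from the example above, and environment proxy settings, which requests may also apply when sending, are not considered here.

```python
from requests.utils import select_proxy

# Placeholder proxy address taken from the example above.
proxies = {"http": "http://yonghuming:123456@192.168.1.1:8088"}

# select_proxy returns the proxy that matches a URL, or None.
print(select_proxy("http://httpbin.org/get", proxies))
print(select_proxy("https://httpbin.org/get", proxies))

# Adding an "https" key routes HTTPS traffic through the proxy as well.
proxies["https"] = proxies["http"]
print(select_proxy("https://httpbin.org/get", proxies))
```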


4. Setting a timeout

The timeout parameter sets how long to wait, in seconds; if no response arrives within that time, requests raises a timeout exception.

# set a timeout
import requests

requests.get("http://baidu.com/", timeout=0.1)
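A crawler normally wants to survive a timeout rather than crash. A minimal sketch, assuming it is acceptable to skip a page that times out; get_or_none is a hypothetical helper name, and the default (connect, read) timeout values are arbitrary:

```python
import requests


def get_or_none(url, timeout=(3.05, 10)):
    """Return the response, or None if the request timed out.

    timeout may be a single number or a (connect, read) tuple in seconds.
    """
    try:
        return requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout as exc:
        # Covers both ConnectTimeout and ReadTimeout.
        print(f"request to {url} timed out: {exc}")
        return None


# Example call (needs network access):
# response = get_or_none("http://baidu.com/", timeout=0.1)
```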


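5. Certificate problems (SSLError)

When a site's certificate cannot be verified, requests raises an SSLError. Passing verify=False skips SSL verification:

import requests
from fake_useragent import UserAgent  # library that generates User-Agent strings

url = "https://www.12306.cn/index/"  # URL of a site with a certificate problem
headers = {"User-Agent": UserAgent().random}  # pick a random User-Agent

requests.packages.urllib3.disable_warnings()  # suppress the insecure-request warning
response = requests.get(url, verify=False, headers=headers)
response.encoding = "utf-8"  # decode as UTF-8 so Chinese text displays correctly
response.text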
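Turning verification off hides the error but also removes the protection. An alternative, sketched here under the assumption that you have the site's CA certificate as a PEM file, is to point verify at that file. make_session and the bundle path are hypothetical.

```python
import requests


def make_session(ca_bundle=None):
    """Return a Session that verifies certificates.

    ca_bundle is an optional path to a PEM file of trusted CA certificates
    (hypothetical; supply your own file).
    """
    session = requests.Session()
    if ca_bundle is not None:
        # Verify against this bundle instead of the default CA store.
        session.verify = ca_bundle
    return session


# Example call (needs network access and a real bundle file):
# session = make_session("/path/to/ca-bundle.pem")
# response = session.get("https://www.12306.cn/index/")
```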


6. Saving cookies automatically with a session

import requests
from fake_useragent import UserAgent

headers = {"User-Agent": UserAgent().chrome}
login_url = "https://www.baidu.cn/index/login/login"  # URL of the login endpoint
params = {
    "user": "yonghuming",  # username
    "password": "123456"  # password
}
session = requests.Session()  # keeps cookies between requests

# Use session instead of requests directly
response = session.post(login_url, headers=headers, data=params)

info_url = "https://www.baidu.cn/index/user.html"  # page that is available after logging in
resp = session.get(info_url, headers=headers)
resp.text

Because the URL above is not a real page that requires an account and password, the response is not a logged-in page.


I also tried this against a Zhihuishu (智慧樹) page:

# cookie

import requests
from fake_useragent import UserAgent

headers = {"User-Agent": UserAgent().chrome}
login_url = "https://passport.zhihuishu.com/login?service=https://onlineservice.zhihuishu.com/login/gologin"  # URL of the login page
params = {
    "user": "12121212",  # username
    "password": "123456"  # password
}
session = requests.Session()  # keeps cookies between requests

# Use session instead of requests directly
response = session.post(login_url, headers=headers, data=params)

info_url = "https://onlne5.zhhuishu.com/onlinWeb.html#/stdetInex"  # page that is available after logging in
resp = session.get(info_url, headers=headers)
resp.encoding = "utf-8"
resp.text
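What the session does between the two calls can be shown offline. In the sketch below a cookie is placed in session.cookies by hand, standing in for the Set-Cookie header a real login would return, and a follow-up request is prepared through the same session. The cookie name, value and example.com URL are made up.

```python
import requests

session = requests.Session()

# Simulate what a successful login would do: the server's Set-Cookie
# header ends up in session.cookies. The name and value are made up.
session.cookies.set("sessionid", "abc123", domain="example.com", path="/")

# Prepare (do not send) a follow-up request through the same session.
request = requests.Request("GET", "https://example.com/user.html")
prepared = session.prepare_request(request)

print(prepared.headers.get("Cookie"))
```

The stored cookie is attached to the prepared request automatically, which is why the second call in the examples above can reach a page that requires login.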


7. Reading the response

Code	Meaning
resp.json()	Response body parsed as JSON
resp.text	Response body as a string
resp.content	Response body as bytes
resp.headers	Response headers
resp.url	URL of the response
resp.encoding	Encoding used to decode resp.text
resp.request.headers	Request headers that were sent
resp.cookies	Cookies set by the response
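The attributes in the table can be gathered in one place when debugging a crawler. A minimal sketch; summarize is a hypothetical helper, and checking the Content-Type header before calling resp.json() is a design choice made here, since resp.json() raises an error on a body that is not JSON.

```python
import requests


def summarize(resp):
    """Collect the most commonly used fields of a requests.Response."""
    content_type = resp.headers.get("Content-Type", "")
    return {
        "status": resp.status_code,
        "url": resp.url,
        "encoding": resp.encoding,
        "size_bytes": len(resp.content),
        "cookies": resp.cookies.get_dict(),
        # Only parse JSON when the server says the body is JSON.
        "json": resp.json() if "json" in content_type else None,
    }


# Example call (needs network access):
# print(summarize(requests.get("http://httpbin.org/get", timeout=10)))
```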

That covers the basics of implementing a Python crawler with the requests library. For more on related topics, see the 億速云 industry news channel. Thanks for reading!
