您好,登錄后才能下訂單哦!
本篇內(nèi)容介紹了“Python如何實(shí)現(xiàn)郵件自動(dòng)下載”的有關(guān)知識(shí),在實(shí)際案例的操作過程中,不少人都會(huì)遇到這樣的困境,接下來就讓小編帶領(lǐng)大家學(xué)習(xí)一下如何處理這些情況吧!希望大家仔細(xì)閱讀,能夠?qū)W有所成!
開始碼代碼之前,我們先來了解一下三種郵件服務(wù)協(xié)議:
1、SMTP協(xié)議
SMTP(Simple Mail Transfer Protocol),即簡單郵件傳輸協(xié)議。相當(dāng)于中轉(zhuǎn)站,將郵件發(fā)送到客戶端。
2、POP3協(xié)議
POP3(Post Office Protocol 3),即郵局協(xié)議的第3個(gè)版本,是電子郵件的第一個(gè)離線協(xié)議標(biāo)準(zhǔn)。該協(xié)議把郵件下載到本地計(jì)算機(jī),不與服務(wù)器同步,缺點(diǎn)是更易丟失郵件或多次下載相同的郵件。
3、IMAP協(xié)議
IMAP(Internet Mail Access Protocol),即交互式郵件存取協(xié)議。該協(xié)議連接遠(yuǎn)程郵箱直接操作,與服務(wù)器內(nèi)容同步。
然后介紹一下email包
這個(gè)包的中心組件是代表電子郵件消息的“對(duì)象模型”。 應(yīng)用程序主要通過在 message 子模塊中定義的對(duì)象模型接口與這個(gè)包進(jìn)行交互。 應(yīng)用程序可以使用此 API 來詢問有關(guān)現(xiàn)有電子郵件的問題、構(gòu)造新的電子郵件,或者添加或移除自身也使用相同對(duì)象模型接口的電子郵件子組件。 也就是說,遵循電子郵件消息及其 MIME 子組件的性質(zhì),電子郵件對(duì)象模型是所有提供 EmailMessage API 的對(duì)象所構(gòu)成的樹狀結(jié)構(gòu)。
接下來我們通過具體的代碼實(shí)現(xiàn)一個(gè)登錄郵箱客戶端,下載郵件,解析郵件附件內(nèi)容的功能。
首先我們需要定義一個(gè)郵件解析的類,該類需要三個(gè)變量:
1、郵箱所屬的imap服務(wù)地址;
2、郵箱賬號(hào);
3、郵箱密碼【注:不同郵箱需要不同的安全策略,例如qq郵箱需要短信驗(yàn)證,獲取登錄授權(quán)碼,而不是明文密碼去登錄遠(yuǎn)程客戶端】
class Email_parse: def __init__(self,remote_server_url,email_url,password): # imap服務(wù)地址 self.remote_server_url = remote_server_url # 郵箱賬號(hào) self.email_url = email_url # 郵箱密碼 self.password = password
然后定義類中入口函數(shù),登錄遠(yuǎn)程,默認(rèn)獲取第一頁所有的郵件。我們獲取郵件的主題,并打印出來【不同郵件主題的編碼可能不同,二進(jìn)制需要轉(zhuǎn)碼才能正確顯示】
def main_parse_Email(self): """入口函數(shù),登錄imap服務(wù)""" server = imaplib.IMAP4_SSL(self.remote_server_url, 993) server.login(self.email_url, self.password) server.select('INBOX') status,data = server.search(None,"ALL") if status != 'OK': raise Exception('read email error') emailids = data[0].split() mail_counts = len(emailids) print("count:",mail_counts) # 郵件的遍歷是按時(shí)間從后往前,這里我們選擇最新的一封郵件 for i in range(mail_counts - 1, mail_counts - 2, -1): status, edata = server.fetch(emailids[i], '(RFC822)') msg = email.message_from_bytes(edata[0][1]) #獲取郵件主題title subject = email.header.decode_header(msg.get('subject')) if type(subject[-1][0]) == bytes: title = subject[-1][0].decode(str(subject[-1][1])) elif type(subject[-1][0]) == str: title = subject[-1][0] print("title:", title)
其中,msg變量保存的就是郵件的主體,接下來因?yàn)闀?huì)重復(fù)用到msg和tilte,我們將構(gòu)造一個(gè)類函數(shù)返回msg和title。
def get_email_title(msg): subject = email.header.decode_header(msg.get('subject')) if type(subject[-1][0]) == bytes: title = subject[-1][0].decode(str(subject[-1][1])) elif type(subject[-1][0]) == str: title = subject[-1][0] print("title:", title) return title
解析郵件,我們分為兩部分,郵件正文【HTML】和附件【xlsx等】,判斷有附件,我們就保存到固定的路徑下。表格的解析不再贅述了,pandas之類的包足以搞定。
def get_att(msg): """獲取附件并下載""" filename = Email_parse.get_email_name(msg) for part in msg.walk(): file_name = part.get_param("name") if file_name: data = part.get_payload(decode=True) if data != None: att_file = open('./src/' + filename, 'wb') att_file.write(data) att_file.close() else: pass
郵件正文內(nèi)容,我們直接解析html,將文本內(nèi)容直接保存到.txt文件中,方便讀取。
def get_text_from_HTML(msg): """獲取郵件中的html""" filename = Email_parse.get_email_name(msg) current_title = Email_parse.get_email_title(msg) print("filename:",filename,type(filename)) for part in msg.walk(): if not part.is_multipart(): result = part.get_payload(decode=True) result = result.decode('gbk') f = open(f'./src/{current_title}.txt','w') f.write(result) f.close() return result
完整代碼如下:
import email import imaplib from email.header import decode_header import pandas as pd import datetime class Email_parse: def __init__(self,remote_server_url,email_url,password): self.remote_server_url = remote_server_url self.email_url = email_url self.password = password def get_att(msg): filename = Email_parse.get_email_name(msg) for part in msg.walk(): file_name = part.get_param("name") if file_name: data = part.get_payload(decode=True) if data != None: att_file = open('./src/' + filename, 'wb') att_file.write(data) att_file.close() else: pass def get_email_title(msg): subject = email.header.decode_header(msg.get('subject')) if type(subject[-1][0]) == bytes: title = subject[-1][0].decode(str(subject[-1][1])) elif type(subject[-1][0]) == str: title = subject[-1][0] print("title:", title) return title def get_email_name(msg): for part in msg.walk(): file_name = part.get_param("name") if file_name: h = email.header.Header(file_name) dh = email.header.decode_header(h) filename = dh[0][0] if dh[0][1]: value, charset = decode_header(str(filename, dh[0][1]))[0] if charset: filename = value.decode(charset) print("附件名稱:", filename) return filename def main_parse_Email(self): server = imaplib.IMAP4_SSL(self.remote_server_url, 993) server.login(self.email_url, self.password) server.select('INBOX') status,data = server.search(None,"ALL") if status != 'OK': raise Exception('read email error') emailids = data[0].split() mail_counts = len(emailids) print("count:",mail_counts) for i in range(mail_counts - 1, mail_counts - 2, -1): status, edata = server.fetch(emailids[i], '(RFC822)') msg = email.message_from_bytes(edata[0][1]) subject = email.header.decode_header(msg.get('subject')) if type(subject[-1][0]) == bytes: title = subject[-1][0].decode(str(subject[-1][1])) elif type(subject[-1][0]) == str: title = subject[-1][0] print("title:", title) Email_parse.get_att(msg) Email_parse.get_text_from_HTML(msg) def get_text_from_HTML(msg): filename = Email_parse.get_email_name(msg) current_title = Email_parse.get_email_title(msg) print("filename:",filename,type(filename)) for part in msg.walk(): if not part.is_multipart(): result = part.get_payload(decode=True) result = result.decode('gbk') f = open(f'./src/{current_title}.txt','w') f.write(result) f.close() return result if __name__ == "__main__": remote_server_url = 'imap.qq.com' email_url = "*********@qq.com" password = "**********" demo = Email_parse(remote_server_url,email_url,password) demo.main_parse_Email()
運(yùn)行結(jié)果:
“Python如何實(shí)現(xiàn)郵件自動(dòng)下載”的內(nèi)容就介紹到這里了,感謝大家的閱讀。如果想了解更多行業(yè)相關(guān)的知識(shí)可以關(guān)注億速云網(wǎng)站,小編將為大家輸出更多高質(zhì)量的實(shí)用文章!
免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng),如果涉及侵權(quán)請(qǐng)聯(lián)系站長郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。