This article walks through common pandas operations: aggregation, transforms, pivot tables, cross-tabulation, concatenation, merging, and combine_first.
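Aggregations commonly applied to columns. Pass a list to apply the same functions to every selected column, or a dict to choose functions per column ('xx' and 'xx2' are placeholder column names):

import numpy as np
import pandas as pd

data[['counts', 'ches_name']].agg([np.mean, np.std])
data.agg({'xx': np.mean, 'xx2': [np.sum, np.std]})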
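The two calling styles of agg described above can be tried on a small table. This is a minimal sketch: the DataFrame contents are made up for illustration, and string function names ("mean", "sum", "std") are used instead of the NumPy functions, which is an assumption about what recent pandas versions prefer.

```python
import pandas as pd

# Hypothetical order-detail data; only the column names follow the article.
data = pd.DataFrame({
    "order_id": [137, 137, 165, 165],
    "counts": [1, 4, 1, 2],
    "amounts": [6.0, 1.0, 26.0, 35.0],
})

# A list applies the same functions to every selected column.
summary = data[["counts", "amounts"]].agg(["mean", "std"])

# A dict chooses the functions column by column.
# Cells for functions not requested on a column are NaN.
per_column = data.agg({"counts": ["mean"], "amounts": ["sum", "std"]})

print(summary)
print(per_column)
```

The list form returns one row per function and one column per input column; the dict form returns the union of all requested functions as rows.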
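transform applies a function element-wise and returns a result with the same shape as its input. For example, to multiply every value in a column by 2:

Option 1: pass a lambda.

data['counts'].transform(lambda x: x*2)

Option 2: define the rule in a named function and pass that function.

def transform_func(values):
    """Custom function that defines the rule applied to the data."""
    return values*2

data['counts'].transform(transform_func)  # one-dimensional (a single column)
data1 = data.groupby(by='品牌')['銷售額'].transform(transform_func)  # custom transform applied within each group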
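To make the difference between a plain transform and a grouped transform concrete, here is a small sketch. The data values and the share_of_brand column are hypothetical; the column names '品牌' (brand) and '銷售額' (sales) follow the article.

```python
import pandas as pd

# Hypothetical sales data.
data = pd.DataFrame({
    "品牌": ["A", "A", "B", "B"],
    "銷售額": [10, 30, 5, 15],
})

def double(values):
    """Rule applied to the data: multiply by 2."""
    return values * 2

# Plain transform: same length as the input column.
doubled = data["銷售額"].transform(double)

# Grouped transform: the group result is broadcast back to every row,
# so the output still lines up with the original rows.
group_mean = data.groupby("品牌")["銷售額"].transform("mean")

# A typical use: each row's share of its brand total.
data["share_of_brand"] = data["銷售額"] / data.groupby("品牌")["銷售額"].transform("sum")

print(doubled.tolist())
print(group_mean.tolist())
print(data)
```

Unlike agg, which collapses each group to one row, the grouped transform keeps the original index, which is why its result can be assigned straight back as a new column.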
Pivot tables. Parameters in the pivot_table source signature:

def pivot_table(
    data,                # DataFrame to operate on
    values=None,         # columns to aggregate and display
    index=None,          # row grouping keys; a list or array (an array must have the same length as the data)
    columns=None,        # column grouping keys
    aggfunc="mean",      # aggregation function, default "mean"
    fill_value=None,     # value used to replace NaN in the result
    margins=False,       # totals switch, default False
    dropna=True,
    margins_name="All",  # name of the totals row/column; can be changed
    observed=False,
Grouping on rows with index:

pd.pivot_table(data, index=['order_id', 'dishes_name'], aggfunc=[np.mean, np.sum], values=['add_inprice', 'counts'])

                                  mean                sum
                           add_inprice counts add_inprice counts
order_id dishes_name
137      農夫山泉NFC果汁100%            0      1           0      1
         涼拌菠菜                      0      1           0      1
         番茄燉牛腩\r\n                 0      1           0      1
         白飯/小碗                     0      4           0      4
         西瓜胡蘿卜沙拉                  0      1           0      1
...                                ...    ...         ...    ...
1323     番茄燉秋葵                    0      1           0      1
         芝士燴波士頓龍蝦                0      1           0      1
         芹黃鱔絲                      0      1           0      1
         蒜蓉生蠔                      0      1           0      1
         谷稻小莊                      0      1           0      1

[2778 rows x 4 columns]
Grouping on columns with columns. The result is effectively the transpose of grouping the same keys on rows:

pd.pivot_table(data, columns=['order_id', 'amounts'], aggfunc=[np.mean, np.sum], values=['add_inprice', 'counts'])

             mean                                 ...  sum
order_id      137                            165  ... 1323
amounts         1    6   26   27   35   99     9  ...   39  49  58  65  78  80  175
add_inprice   0.0  0.0  0.0  0.0  0.0  0.0   0.0  ...    0   0   0   0   0   0    0
counts        4.0  1.0  1.0  1.0  1.0  1.0   1.5  ...    1   1   1   1   1   1    1

[2 rows x 4956 columns]
Filling empty cells and adding totals:

# aggfunc       aggregation function
# fill_value    what to show for empty cells; the default leaves them as NaN
# margins       add totals; off by default
# margins_name  label of the totals row/column, default "All"
# index and columns are set so that the call matches the output shown below
# (orders as rows, dishes as columns)
pd.pivot_table(data, index=['order_id'], columns='dishes_name', values='counts', aggfunc=np.sum, fill_value=0, margins=True, margins_name='總')

dishes_name  42度海之藍  北冰洋汽水  38度劍南春  50度古井貢酒  ...  黃油曲奇餅干  黃花菜炒木耳  黑米戀上葡萄     總
order_id                                              ...
137                0        0        0         0  ...         0         0         0     9
165                0        0        1         0  ...         0         1         0    21
166                0        0        0         0  ...         0         0         0     7
171                0        0        0         0  ...         0         0         0    10
177                0        0        0         0  ...         0         0         0     4
...              ...      ...      ...       ...  ...       ...       ...       ...   ...
1314               0        0        1         0  ...         0         0         0    12
1317               0        0        0         0  ...         0         0         0    18
1319               0        0        0         0  ...         0         0         0     9
1323               0        0        1         0  ...         0         0         0    15
總                 5       45        6         5  ...         5        15        18  3088
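The effect of fill_value, margins, and margins_name is easier to verify on a table small enough to check by hand. The sketch below uses made-up order data (the dish names and quantities are hypothetical) and the string "sum" as the aggregation function.

```python
import pandas as pd

# Hypothetical order details: order 165 contains "rice" twice.
data = pd.DataFrame({
    "order_id": [137, 137, 165, 165, 165],
    "dishes_name": ["rice", "juice", "rice", "salad", "rice"],
    "counts": [4, 1, 1, 1, 2],
})

table = pd.pivot_table(
    data,
    index="order_id",        # one row per order
    columns="dishes_name",   # one column per dish
    values="counts",
    aggfunc="sum",
    fill_value=0,            # dishes missing from an order show 0, not NaN
    margins=True,            # add a totals row and a totals column
    margins_name="Total",
)

print(table)
```

Order 137 has no salad, so that cell is filled with 0, and the bottom-right cell holds the grand total of counts (9 in this data).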
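Cross-tabulation. Parameters in the crosstab source signature:

def crosstab(
    index,                       # row grouping keys
    columns,                     # column grouping keys
    values=None,                 # values to aggregate
    rownames=None,               # names for the row keys
    colnames=None,               # names for the column keys
    aggfunc=None,                # aggregation function
    margins=False,               # add totals
    margins_name: str = "All",   # name of the totals row/column
    dropna: bool = True,
    normalize=False,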
# index and columns are set so that the call matches the output shown below
# (orders as rows, dishes as columns)
pd.crosstab(index=data['order_id'], columns=data['dishes_name'], values=data['counts'], aggfunc=np.sum)

dishes_name  42度海之藍  北冰洋汽水  38度劍南春  ...  黃油曲奇餅干  黃花菜炒木耳  黑米戀上葡萄
order_id                                  ...
137              NaN      NaN      NaN  ...       NaN       NaN       NaN
165              NaN      NaN      1.0  ...       NaN       1.0       NaN
166              NaN      NaN      NaN  ...       NaN       NaN       NaN
171              NaN      NaN      NaN  ...       NaN       NaN       NaN
177              NaN      NaN      NaN  ...       NaN       NaN       NaN
...              ...      ...      ...  ...       ...       ...       ...
1309             NaN      NaN      NaN  ...       NaN       NaN       NaN
1314             NaN      NaN      1.0  ...       NaN       NaN       NaN
1317             NaN      NaN      NaN  ...       NaN       NaN       NaN
1319             NaN      NaN      NaN  ...       NaN       NaN       NaN
1323             NaN      NaN      1.0  ...       NaN       NaN       NaN

[278 rows x 156 columns]
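crosstab takes the grouping arrays directly instead of a DataFrame plus column names. A minimal sketch of its three common modes follows; the data is hypothetical and "sum" is passed as a string.

```python
import pandas as pd

# Hypothetical order details.
data = pd.DataFrame({
    "order_id": [137, 137, 165, 165, 165],
    "dishes_name": ["rice", "juice", "rice", "salad", "rice"],
    "counts": [4, 1, 1, 1, 2],
})

# Without values/aggfunc, crosstab counts how often each pair occurs.
freq = pd.crosstab(index=data["order_id"], columns=data["dishes_name"])

# With values and aggfunc it aggregates instead; missing pairs are NaN.
qty = pd.crosstab(
    index=data["order_id"],
    columns=data["dishes_name"],
    values=data["counts"],
    aggfunc="sum",
)

# normalize="index" turns each row of the frequency table into proportions.
row_share = pd.crosstab(data["order_id"], data["dishes_name"], normalize="index")

print(freq)
print(qty)
print(row_share)
```

The NaN cells in qty are the same kind of empty cells seen in the output above: crosstab has no fill_value parameter, so a fillna(0) on the result is one way to replace them.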
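Concatenation with concat:
axis = 0: stack the tables vertically (rows are appended).
axis = 1: join the tables horizontally, aligning rows by index.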
Function signature from the source:

def concat(
    objs: Union[Iterable["NDFrame"], Mapping[Label, "NDFrame"]],  # the DataFrames to concatenate
    axis=0,                         # direction of concatenation
    join="outer",                   # outer join by default
    ignore_index: bool = False,     # discard the original index and renumber from 0
    keys=None,
    levels=None,
    names=None,
    verify_integrity: bool = False,
    sort: bool = False,
    copy: bool = True,
left = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K3'],
                     'key2': ['K0', 'K1', 'K0', 'K1'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})
right = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                      'key2': ['K0', 'K0', 'K0', 'K0'],
                      'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3']})

# join='inner' keeps only the labels shared by both tables; the default is 'outer'
pd.concat((left, right), axis=0, join='inner')
pd.concat((left, right), axis=1, join='inner')
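The interaction between axis and join can be checked on two tiny frames. This is a sketch with hypothetical two-row tables, smaller than the left and right frames above so the shapes are easy to follow.

```python
import pandas as pd

# Hypothetical frames sharing only the "key1" column.
left = pd.DataFrame({"key1": ["K0", "K1"], "A": ["A0", "A1"]})
right = pd.DataFrame({"key1": ["K0", "K2"], "C": ["C0", "C2"]})

# axis=0, outer (default): rows are stacked, columns are the union,
# and cells a table does not have become NaN.
rows_outer = pd.concat([left, right], axis=0, ignore_index=True)

# axis=0, inner: rows are stacked, only the shared columns are kept.
rows_inner = pd.concat([left, right], axis=0, join="inner", ignore_index=True)

# axis=1: tables are placed side by side, rows aligned on the index.
# Column names are not deduplicated, so "key1" appears twice.
cols = pd.concat([left, right], axis=1)

print(rows_outer)
print(rows_inner)
print(cols)
```

Note that concat never matches rows on key values; it only aligns on index or column labels. Matching on key values is what merge is for.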
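Merging with merge. Function signature from the source:

def merge(
    left,                        # left table
    right,                       # right table
    how: str = "inner",          # join type, inner by default
    on=None,                     # key column(s); must exist in both tables
    left_on=None,                # key column(s) in the left table
    right_on=None,               # key column(s) in the right table
    left_index: bool = False,
    right_index: bool = False,
    sort: bool = False,
    suffixes=("_x", "_y"),
    copy: bool = True,
    indicator: bool = False,
    validate=None,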
(1) Both tables contain the same key column

on: the join key, a column present in both tables.
how: the join type, inner by default.
    outer: return all rows from both tables
    inner: return only rows whose keys match in both tables
    left: keep every row of the left table
    right: keep every row of the right table

pd.merge(left, right, on='key1', how='outer')

  key1 key2_x    A    B key2_y    C    D
0   K0     K0   A0   B0     K0   C0   D0
1   K0     K1   A1   B1     K0   C0   D0
2   K1     K0   A2   B2     K0   C1   D1
3   K1     K0   A2   B2     K0   C2   D2
4   K3     K1   A3   B3    NaN  NaN  NaN
5   K2    NaN  NaN  NaN     K0   C3   D3

Joining on several shared key columns:

pd.merge(left, right, on=['key1', 'key2'], how='outer')

  key1 key2    A    B    C    D
0   K0   K0   A0   B0   C0   D0
1   K0   K1   A1   B1  NaN  NaN
2   K1   K0   A2   B2   C1   D1
3   K1   K0   A2   B2   C2   D2
4   K3   K1   A3   B3  NaN  NaN
5   K2   K0  NaN  NaN   C3   D3
(2) The join key is a different column in each table

left_on: the key column in the left table.
right_on: the key column in the right table.

pd.merge(left, right, left_on='key1', right_on='key2', how='outer')

  key1_x key2_x   A   B key1_y key2_y    C    D
0     K0     K0  A0  B0     K0     K0   C0   D0
1     K0     K0  A0  B0     K1     K0   C1   D1
2     K0     K0  A0  B0     K1     K0   C2   D2
3     K0     K0  A0  B0     K2     K0   C3   D3
4     K0     K1  A1  B1     K0     K0   C0   D0
5     K0     K1  A1  B1     K1     K0   C1   D1
6     K0     K1  A1  B1     K1     K0   C2   D2
7     K0     K1  A1  B1     K2     K0   C3   D3
8     K1     K0  A2  B2    NaN    NaN  NaN  NaN
9     K3     K1  A3  B3    NaN    NaN  NaN  NaN
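When an outer join produces NaN rows like the ones above, the indicator parameter from the merge signature helps show which side each row came from. The following sketch uses two hypothetical tables (orders and menu) whose key columns have different names.

```python
import pandas as pd

# Hypothetical tables: the key is "dish_code" on the left and "code" on the right.
orders = pd.DataFrame({"order_id": [1, 2, 3], "dish_code": ["a", "b", "x"]})
menu = pd.DataFrame({"code": ["a", "b", "c"], "price": [10, 20, 30]})

# Default inner join: only keys present in both tables survive.
inner = pd.merge(orders, menu, left_on="dish_code", right_on="code")

# Outer join with indicator=True adds a "_merge" column whose value is
# "both", "left_only", or "right_only" for each row.
outer = pd.merge(
    orders, menu,
    left_on="dish_code", right_on="code",
    how="outer", indicator=True,
)

print(inner)
print(outer)
```

Filtering on the _merge column is a convenient way to list unmatched keys, for example orders that reference a dish missing from the menu. The row order of an outer join can differ between pandas versions, so it is safer not to rely on it.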
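(3) Renaming columns

# inplace=True modifies left itself instead of returning a new DataFrame
left.rename(columns={'key1': 'key11111'}, inplace=True)
print(left)

The printed column header becomes:

  key11111 key2 A B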
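(4) Overlapping merge: fill the gaps in an incomplete table from a second table, using df1.combine_first(df2)

The pattern is primary_table.combine_first(secondary_table).

import numpy as np
import pandas as pd

dict1 = {'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9],
         'System': ['W10', 'w10', np.nan, 'w10', np.nan, np.nan, 'w7', 'w7', 'w8']}
dict2 = {'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9],
         'System': [np.nan, np.nan, 'w7', 'w7', 'w7', 'w7', 'w8', np.nan, np.nan]}
df1 = pd.DataFrame(dict1)
df2 = pd.DataFrame(dict2)
print(df1, df2)
# The table the method is called on is the primary one: its missing values
# are filled from the other table, and its existing values are left unchanged.
print(df1.combine_first(df2))

   ID System
0   1    W10
1   2    w10
2   3     w7
3   4    w10
4   5     w7
5   6     w7
6   7     w7
7   8     w7
8   9     w8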
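combine_first aligns the two tables on their index (and column labels), not on the ID column. The example above works because both frames share the default index 0 to 8. When the tables cover different IDs, one option is to make ID the index first. This sketch uses hypothetical data to show that.

```python
import numpy as np
import pandas as pd

# Hypothetical tables that cover different IDs.
primary = pd.DataFrame(
    {"ID": [1, 2, 3], "System": ["w10", np.nan, np.nan]}
).set_index("ID")
secondary = pd.DataFrame(
    {"ID": [2, 3, 4], "System": ["w7", np.nan, "w8"]}
).set_index("ID")

# Values present in primary win; gaps are filled from secondary;
# rows that exist only in secondary are added.
combined = primary.combine_first(secondary)

print(combined)
```

ID 3 stays NaN because neither table has a value for it, and ID 4 appears in the result even though the primary table has no such row.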