您好,登錄后才能下訂單哦!
今天小編給大家分享一下如何使用Python創(chuàng)建一個(gè)瀑布圖的相關(guān)知識(shí)點(diǎn),內(nèi)容詳細(xì),邏輯清晰,相信大部分人都還太了解這方面的知識(shí),所以分享這篇文章給大家參考一下,希望大家閱讀完這篇文章后有所收獲,下面我們一起來(lái)了解一下吧。
創(chuàng)建圖表
首先,執(zhí)行標(biāo)準(zhǔn)的輸入,并確保IPython能顯示matplot圖。
import numpy as np import pandas as pd import matplotlib.pyplot as plt
%matplotlib inline
設(shè)置我們想畫(huà)出瀑布圖的數(shù)據(jù),并將其加載到數(shù)據(jù)幀(DataFrame)中。
數(shù)據(jù)需要以你的起始值開(kāi)始,但是你需要給出最終的總數(shù)。我們將在下面計(jì)算它。
index = ['sales','returns','credit fees','rebates','late charges','shipping'] data = {'amount': [350000,-30000,-7500,-25000,95000,-7000]} trans = pd.DataFrame(data=data,index=index)
我使用了IPython中便捷的display函數(shù)來(lái)更簡(jiǎn)單地控制我要顯示的內(nèi)容。
from IPython.display import display
display(trans)
首先,我們得到累積和。
display(trans.amount.cumsum()) sales 350000 returns 320000 credit fees 312500 rebates 287500 late charges 382500 shipping 375500 Name: amount, dtype: int64
這看起來(lái)不錯(cuò),但我們需要將一個(gè)地方的數(shù)據(jù)轉(zhuǎn)移到右邊。
blank=trans.amount.cumsum().shift(1).fillna(0)
display(blank)
sales 0 returns 350000 credit fees 320000 rebates 312500 late charges 287500 shipping 382500 Name: amount, dtype: float64
我們需要向trans和blank數(shù)據(jù)幀中添加一個(gè)凈總量。
total = trans.sum().amount trans.loc["net"] = total blank.loc["net"] = total display(trans) display(blank)
sales 0 returns 350000 credit fees 320000 rebates 312500 late charges 287500 shipping 382500 net 375500 Name: amount, dtype: float64
創(chuàng)建我們用來(lái)顯示變化的步驟。
step = blank.reset_index(drop=True).repeat(3).shift(-1)
step[1::3] = np.nan
display(step)
0 0 0 NaN 0 350000 1 350000 1 NaN 1 320000 2 320000 2 NaN 2 312500 3 312500 3 NaN 3 287500 4 287500 4 NaN 4 382500 5 382500 5 NaN 5 375500 6 375500 6 NaN 6 NaN Name: amount, dtype: float64
對(duì)于“net”行,為了不使堆疊加倍,我們需要確保blank值為0。
blank.loc["net"] = 0
然后,將其畫(huà)圖,看一下什么樣子。
my_plot = trans.plot(kind='bar', stacked=True, bottom=blank,legend=None, title="2014 Sales Waterfall")
my_plot.plot(step.index, step.values,'k')
看起來(lái)相當(dāng)不錯(cuò),但是讓我們?cè)囍袷交痀軸,以使其更具有可讀性。為此,我們使用FuncFormatter和一些Python2.7+的語(yǔ)法來(lái)截?cái)嘈?shù)并向格式中添加一個(gè)逗號(hào)。
def money(x, pos):
'The two args are the value and tick position'
return "${:,.0f}".format(x)
from matplotlib.ticker import FuncFormatter formatter = FuncFormatter(money)
然后,將其組合在一起。
my_plot = trans.plot(kind='bar', stacked=True, bottom=blank,legend=None, title="2014 Sales Waterfall")
my_plot.plot(step.index, step.values,'k')
my_plot.set_xlabel("Transaction Types")
my_plot.yaxis.set_major_formatter(formatter)
完整腳本
基本圖形能夠正常工作,但是我想添加一些標(biāo)簽,并做一些小的格式修改。下面是我最終的腳本:
import numpy as np import pandas as pd import matplotlib.pyplot as plt from matplotlib.ticker import FuncFormatter #Use python 2.7+ syntax to format currency def money(x, pos): 'The two args are the value and tick position' return "${:,.0f}".format(x) formatter = FuncFormatter(money) #Data to plot. Do not include a total, it will be calculated index = ['sales','returns','credit fees','rebates','late charges','shipping'] data = {'amount': [350000,-30000,-7500,-25000,95000,-7000]} #Store data and create a blank series to use for the waterfall trans = pd.DataFrame(data=data,index=index) blank = trans.amount.cumsum().shift(1).fillna(0) #Get the net total number for the final element in the waterfall total = trans.sum().amount trans.loc["net"]= total blank.loc["net"] = total #The steps graphically show the levels as well as used for label placement step = blank.reset_index(drop=True).repeat(3).shift(-1) step[1::3] = np.nan #When plotting the last element, we want to show the full bar, #Set the blank to 0 blank.loc["net"] = 0 #Plot and label my_plot = trans.plot(kind='bar', stacked=True, bottom=blank,legend=None, figsize=(10, 5), title="2014 Sales Waterfall") my_plot.plot(step.index, step.values,'k') my_plot.set_xlabel("Transaction Types") #Format the axis for dollars my_plot.yaxis.set_major_formatter(formatter) #Get the y-axis position for the labels y_height = trans.amount.cumsum().shift(1).fillna(0) #Get an offset so labels don't sit right on top of the bar max = trans.max() neg_offset = max / 25 pos_offset = max / 50 plot_offset = int(max / 15) #Start label loop loop = 0 for index, row in trans.iterrows(): # For the last item in the list, we don't want to double count if row['amount'] == total: y = y_height[loop] else: y = y_height[loop] + row['amount'] # Determine if we want a neg or pos offset if row['amount'] > 0: y += pos_offset else: y -= neg_offset my_plot.annotate("{:,.0f}".format(row['amount']),(loop,y),ha="center") loop+=1 #Scale up the y axis so there is room for the labels my_plot.set_ylim(0,blank.max()+int(plot_offset)) #Rotate the labels my_plot.set_xticklabels(trans.index,rotation=0) my_plot.get_figure().savefig("waterfall.png",dpi=200,bbox_inches='tight')
以上就是“如何使用Python創(chuàng)建一個(gè)瀑布圖”這篇文章的所有內(nèi)容,感謝各位的閱讀!相信大家閱讀完這篇文章都有很大的收獲,小編每天都會(huì)為大家更新不同的知識(shí),如果還想學(xué)習(xí)更多的知識(shí),請(qǐng)關(guān)注億速云行業(yè)資訊頻道。
免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng),如果涉及侵權(quán)請(qǐng)聯(lián)系站長(zhǎng)郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。