How to process time series data with pandas.DataFrame

Published: 2023-02-23 10:38:47  Source: Yisu Cloud (億速云)  Views: 115  Author: iii  Category: Development

This article explains how to process time series data with pandas.DataFrame. The methods covered are simple, quick to apply, and practical.

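When the index of a pandas.DataFrame or pandas.Series has the datetime64[ns] type, it is treated as a DatetimeIndex, and the various functions for processing time series data become available. You can select rows by year or month and extract a period with a slice, which is convenient when working with data that carries date and time information.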

How to set an existing column as the DatetimeIndex

Take a pandas.DataFrame with the default 0-based index and a column of date strings.

import pandas as pd

df = pd.read_csv('./data/26/sample_date.csv')
print(df)
#           date  val_1  val_2
# 0   2017-11-01     65     76
# 1   2017-11-07     26     66
# 2   2017-11-18     47     47
# 3   2017-11-27     20     38
# 4   2017-12-05     65     85
# 5   2017-12-12      4     29
# 6   2017-12-22     31     54
# 7   2017-12-29     21      8
# 8   2018-01-03     98     76
# 9   2018-01-08     48     64
# 10  2018-01-19     18     48
# 11  2018-01-23     86     70

print(type(df.index))
# <class 'pandas.core.indexes.range.RangeIndex'>

print(df['date'].dtype)
# object
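The CSV file itself is not included with the article. As a minimal sketch, the same table can be rebuilt in memory from the values printed above. The `csv_text` variable and the use of `io.StringIO` are assumptions made here for illustration and are not part of the original example.

```python
import io

import pandas as pd

# Hypothetical stand-in for ./data/26/sample_date.csv, copied from the printed output.
csv_text = """date,val_1,val_2
2017-11-01,65,76
2017-11-07,26,66
2017-11-18,47,47
2017-11-27,20,38
2017-12-05,65,85
2017-12-12,4,29
2017-12-22,31,54
2017-12-29,21,8
2018-01-03,98,76
2018-01-08,48,64
2018-01-19,18,48
2018-01-23,86,70
"""

df = pd.read_csv(io.StringIO(csv_text))

# Without parse_dates, the index is a RangeIndex and the date column holds plain strings.
print(type(df.index))
print(df.shape)
```

Reading from a string this way makes it possible to follow the remaining steps without the original file.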

Apply to_datetime() to the column of date strings to convert it to the datetime64[ns] type.

df['date'] = pd.to_datetime(df['date'])
print(df['date'].dtype)
# datetime64[ns]
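Real data often contains a few entries that do not parse. The sketch below shows one way to handle that, using the `format` and `errors` arguments of `pd.to_datetime`; the sample values, including the deliberately malformed one, are invented for illustration.

```python
import pandas as pd

# Hypothetical sample values; the last entry is deliberately malformed.
dates = pd.Series(['2017-11-01', '2017-11-07', 'not a date'])

# An explicit format avoids guessing, and errors='coerce' turns
# unparseable entries into NaT instead of raising an exception.
converted = pd.to_datetime(dates, format='%Y-%m-%d', errors='coerce')

print(converted.dtype)
print(converted.isna().tolist())  # [False, False, True]
```

Rows that came back as NaT can then be inspected or dropped before the column is used as an index.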

Use the set_index() method to set the datetime64[ns] column as the index.

The index is now a DatetimeIndex. Each element of the index is a Timestamp.

df.set_index('date', inplace=True)
print(df)
#             val_1  val_2
# date                    
# 2017-11-01     65     76
# 2017-11-07     26     66
# 2017-11-18     47     47
# 2017-11-27     20     38
# 2017-12-05     65     85
# 2017-12-12      4     29
# 2017-12-22     31     54
# 2017-12-29     21      8
# 2018-01-03     98     76
# 2018-01-08     48     64
# 2018-01-19     18     48
# 2018-01-23     86     70

print(type(df.index))
# <class 'pandas.core.indexes.datetimes.DatetimeIndex'>

print(df.index[0])
print(type(df.index[0]))
# 2017-11-01 00:00:00
# <class 'pandas._libs.tslibs.timestamps.Timestamp'>
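Because the index elements are Timestamp objects, their date components can be read directly. The following is a small sketch on a three-row frame built from values in the table above; the reduced frame is an assumption made to keep the example short.

```python
import pandas as pd

# Hypothetical three-row frame using values from the table above.
df = pd.DataFrame(
    {'val_1': [65, 65, 98]},
    index=pd.to_datetime(['2017-11-01', '2017-12-05', '2018-01-03']),
)

# The index exposes vectorized date components.
print(df.index.year.tolist())   # [2017, 2017, 2018]
print(df.index.month.tolist())  # [11, 12, 1]

# A single element is a Timestamp with scalar attributes.
first = df.index[0]
print(first.year, first.month, first.day)  # 2017 11 1

# Components can drive boolean row filters, here every December row.
december = df[df.index.month == 12]
print(december)
```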

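You can select rows by year or month, and extract a period with a slice. Use .loc for selection by a single date string: df['2018'] is treated as a column lookup and raises KeyError in pandas 2.0 and later.

print(df.loc['2018'])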
#             val_1  val_2
# date                    
# 2018-01-03     98     76
# 2018-01-08     48     64
# 2018-01-19     18     48
# 2018-01-23     86     70

print(df.loc['2017-11'])
#             val_1  val_2
# date                    
# 2017-11-01     65     76
# 2017-11-07     26     66
# 2017-11-18     47     47
# 2017-11-27     20     38

print(df.loc['2017-12-15':'2018-01-15'])
#             val_1  val_2
# date                    
# 2017-12-22     31     54
# 2017-12-29     21      8
# 2018-01-03     98     76
# 2018-01-08     48     64
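One detail worth checking when slicing by date is that label slices include both endpoints. The sketch below illustrates this with `.loc` on a five-row frame taken from the table above; the reduced frame and the variable names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical five-row frame using values from the table above.
df = pd.DataFrame(
    {'val_1': [31, 21, 98, 48, 18]},
    index=pd.to_datetime(
        ['2017-12-22', '2017-12-29', '2018-01-03', '2018-01-08', '2018-01-19']
    ),
)

# Partial-string selection: every row in 2018.
rows_2018 = df.loc['2018']

# Unlike positional slices, label slices include both endpoints.
window = df.loc['2017-12-29':'2018-01-08']

# A month on either end of the slice covers that whole month.
months = df.loc['2017-12':'2018-01']

print(len(rows_2018), len(window), len(months))  # 3 3 5
```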

Rows can also be specified with date strings in various formats.

print(df.loc['01/19/2018', 'val_1'])
# 18

print(df.loc['20180103', 'val_2'])
# 76
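The different string formats work because each one is parsed to the same Timestamp. A short sketch of that equivalence follows, including one ambiguous case; the variable names are chosen here for illustration.

```python
import pandas as pd

# Each of these strings is parsed to the same point in time.
a = pd.Timestamp('2018-01-19')
b = pd.Timestamp('01/19/2018')
c = pd.Timestamp('20180119')
print(a == b == c)  # True

# Slash-separated dates are read month-first by default,
# so this is January 2, not February 1.
ambiguous = pd.Timestamp('01/02/2018')
print(ambiguous.month, ambiguous.day)  # 1 2
```

For data that mixes conventions, an unambiguous format such as YYYY-MM-DD is the safer choice.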

How to specify a DatetimeIndex when reading a CSV file

If the original data is a CSV file, the DatetimeIndex can be specified when the file is read with read_csv().

Pass the name (or 0-based position) of the date and time column to use as the index in the index_col argument, and set parse_dates to True.

df = pd.read_csv('./data/26/sample_date.csv', index_col='date', parse_dates=True)
print(df)
#             val_1  val_2
# date
# 2017-11-01     65     76
# 2017-11-07     26     66
# 2017-11-18     47     47
# 2017-11-27     20     38
# 2017-12-05     65     85
# 2017-12-12      4     29
# 2017-12-22     31     54
# 2017-12-29     21      8
# 2018-01-03     98     76
# 2018-01-08     48     64
# 2018-01-19     18     48
# 2018-01-23     86     70

print(type(df.index))
# <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
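The index column can be given either by name or by position. As a minimal sketch, the two calls below read the same in-memory CSV both ways; `csv_text` is a hypothetical three-row excerpt of the sample data, used because the original file is not available.

```python
import io

import pandas as pd

# Hypothetical three-row excerpt of the sample data.
csv_text = "date,val_1,val_2\n2017-11-01,65,76\n2017-12-05,65,85\n2018-01-03,98,76\n"

# index_col accepts a column name...
by_name = pd.read_csv(io.StringIO(csv_text), index_col='date', parse_dates=True)

# ...or a 0-based column position.
by_number = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

print(by_name.equals(by_number))                      # True
print(isinstance(by_number.index, pd.DatetimeIndex))  # True
```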

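If the date strings in the CSV file use a non-standard format, pass a parser defined as a lambda expression to the date_parser argument of read_csv(). Note that date_parser is deprecated since pandas 2.0; on those versions, pass the format string to the date_format argument instead.

parser = lambda date: pd.to_datetime(date, format='%Y年%m月%d日')

# For pandas < 2.0. On pandas 2.0+, replace date_parser=parser with date_format='%Y年%m月%d日'.
df_jp = pd.read_csv('./data/26/sample_date_cn.csv', index_col='date', parse_dates=True, date_parser=parser)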
print(df_jp)
#             val_1  val_2
# date
# 2017-11-01     65     76
# 2017-11-07     26     66
# 2017-11-18     47     47
# 2017-11-27     20     38
# 2017-12-05     65     85
# 2017-12-12      4     29
# 2017-12-22     31     54
# 2017-12-29     21      8
# 2018-01-03     98     76
# 2018-01-08     48     64
# 2018-01-19     18     48
# 2018-01-23     86     70

print(type(df_jp.index))
# <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
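An approach that does not depend on the read_csv() date arguments is to read the column as strings and convert it afterwards. The sketch below assumes a CSV whose dates use the `%Y年%m月%d日` format; `csv_text`, `df_raw`, and `df_parsed` are hypothetical names, and the three rows are an excerpt of the sample data.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV whose dates use the %Y年%m月%d日 format.
csv_text = (
    "date,val_1,val_2\n"
    "2017年11月01日,65,76\n"
    "2017年12月05日,65,85\n"
    "2018年01月03日,98,76\n"
)

# Read the date column as plain strings first.
df_raw = pd.read_csv(io.StringIO(csv_text))

# Convert with an explicit format, then promote the column to the index.
df_raw['date'] = pd.to_datetime(df_raw['date'], format='%Y年%m月%d日')
df_parsed = df_raw.set_index('date')

print(type(df_parsed.index))
print(df_parsed.loc['2018'])
```

This takes two extra lines but behaves the same way across pandas versions.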

About pandas.Series

This may not be a common pattern in practice, but suppose the index of a pandas.Series consists of date strings.

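# The squeeze=True argument of read_csv() was removed in pandas 2.0; call .squeeze('columns') on the result instead.
s = pd.read_csv('./data/26/sample_date.csv', index_col=0, usecols=[0, 1]).squeeze('columns')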
print(s)
# date
# 2017-11-01    65
# 2017-11-07    26
# 2017-11-18    47
# 2017-11-27    20
# 2017-12-05    65
# 2017-12-12     4
# 2017-12-22    31
# 2017-12-29    21
# 2018-01-03    98
# 2018-01-08    48
# 2018-01-19    18
# 2018-01-23    86
# Name: val_1, dtype: int64

print(type(s))
print(type(s.index))
# <class 'pandas.core.series.Series'>
# <class 'pandas.core.indexes.base.Index'>

To convert this index to a DatetimeIndex, overwrite the index attribute with the result of passing the index to to_datetime().

s.index = pd.to_datetime(s.index)
print(s)
# date
# 2017-11-01    65
# 2017-11-07    26
# 2017-11-18    47
# 2017-11-27    20
# 2017-12-05    65
# 2017-12-12     4
# 2017-12-22    31
# 2017-12-29    21
# 2018-01-03    98
# 2018-01-08    48
# 2018-01-19    18
# 2018-01-23    86
# Name: val_1, dtype: int64

print(type(s))
print(type(s.index))
# <class 'pandas.core.series.Series'>
# <class 'pandas.core.indexes.datetimes.DatetimeIndex'>

print(s['2017-12-15':'2018-01-15'])
# date
# 2017-12-22    31
# 2017-12-29    21
# 2018-01-03    98
# 2018-01-08    48
# Name: val_1, dtype: int64
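Assigning to `s.index` modifies the Series in place. If the original should stay untouched, one alternative is `Series.set_axis`, which returns a new Series. The sketch below uses a hypothetical three-element Series built from values in the output above.

```python
import pandas as pd

# Hypothetical three-element Series with date strings as the index.
s = pd.Series(
    [31, 21, 98],
    index=['2017-12-22', '2017-12-29', '2018-01-03'],
    name='val_1',
)

# set_axis returns a new Series; s keeps its string index.
s2 = s.set_axis(pd.to_datetime(s.index))

print(type(s.index))
print(type(s2.index))

# Date-based selection now works on the converted copy.
print(s2.loc['2017-12'].sum())  # 52
```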
