This article walks through how to use a CNN in Python to classify time-series data, using single-lead ECG recordings as the example. The steps are laid out in detail below, from loading and segmenting the data to building and training the network.
The dataset used is CPSC2020.

The training data consists of 10 single-lead ECG recordings collected from patients with arrhythmia, each lasting about 24 hours.

After downloading, the TrainingSet directory contains two folders, data and ref, each holding 10 .mat files.

The data folder stores the signal files: each .mat file contains an n*1 array.

The ref folder stores the labels: each .mat file contains a struct with two n*1 arrays, S_ref and V_ref, giving the sample positions of the S and V beats respectively.

The sampling rate is 400 Hz.

S: supraventricular premature beat (SPB);

V: premature ventricular contraction (PVC);
Let's look at the first 1000 ECG samples:
```python
import numpy as np
import scipy.io as scio
from matplotlib import pyplot as plt

datafile = 'E:/Wendy/Desktop/TrainingSet/data/A04.mat'  # sampling rate 400 Hz
data = scio.loadmat(datafile)   # dict
sig = data['ecg']               # shape (n, 1)
sig = np.reshape(sig, (-1))     # flatten to a 1-D vector of shape (n,)
print(sig)

sigPlot = sig[:1000]            # first 1000 samples
fig = plt.figure(figsize=(20, 10), dpi=400)
plt.plot(sigPlot)
plt.show()
```
Output: a plot of the first 1000 ECG samples (figure omitted).
Convert the label data to 1-D vectors:
```python
import numpy as np
import scipy.io as scio

datafile = 'E:/Wendy/Desktop/TrainingSet/ref/R04.mat'  # sampling rate 400 Hz
data = scio.loadmat(datafile)
label = data['ref'][0][0]
S_ref = np.reshape(label[0], (-1))  # flatten to a 1-D vector
V_ref = np.reshape(label[1], (-1))  # flatten to a 1-D vector
```
Split the data into 5-second segments.

Rationale: a premature beat is related to both the preceding and the following beats. At an average heart rate of 72 bpm, one beat lasts 60/72 s, so five beats take 60/72 * 5 ≈ 4.1667 s. Since 4.1667 s is inconvenient to work with, 5 s segments are used instead: 5 s * 400 Hz = 2000 samples per segment.

Label definitions: 0: other; 1: V_ref; 2: S_ref.
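As a toy illustration of this labeling rule (the event positions below are made up for the example, not real CPSC2020 annotations):

```python
# Toy labeling demo with made-up event positions (not real CPSC2020 data).
# A segment is labeled 2 if any S_ref position falls inside it, 1 if any
# V_ref position does, and 0 otherwise - mirroring the rule above, with
# S checked first so a segment containing both kinds of event gets label 2.
segLen = 2000
S_ref = [2500]        # an S event inside the second segment
V_ref = [150, 6200]   # V events inside the first and fourth segments

def label_segment(i, S_ref, V_ref, segLen):
    seg = range((i - 1) * segLen, i * segLen)
    if set(S_ref) & set(seg):
        return 2
    elif set(V_ref) & set(seg):
        return 1
    return 0

labels = [label_segment(i, S_ref, V_ref, segLen) for i in range(1, 5)]
print(labels)  # [1, 2, 0, 1]
```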
```python
a = len(sig)
Fs = 400          # sampling rate, 400 Hz
segLen = 5 * Fs   # 2000 samples per segment
num = int(a / segLen)
print(num)
```
Output:
17650
Here Fs is the sampling rate, segLen the segment length, and num the number of segments.
Next, pair the data with its labels:
```python
all_data = []
all_label = []
i = 1
while i < num + 1:
    all_data.append(np.array(sig[(i - 1) * segLen:i * segLen]))
    # label: 2 if an S event falls in this segment, 1 for a V event, else 0
    if set(S_ref) & set(range((i - 1) * segLen, i * segLen)):
        all_label.append(2)
    elif set(V_ref) & set(range((i - 1) * segLen, i * segLen)):
        all_label.append(1)
    else:
        all_label.append(0)
    i = i + 1

type(all_data)   # list
type(all_label)  # list
print((np.array(all_data)).shape)   # (17650, 2000): 17650 segments of 2000 samples
print((np.array(all_label)).shape)
```
Output:
(17650, 2000)
(17650,)
There are 17650 segments, each 2000 samples long.
Save the data as a dictionary:
```python
import pickle

res = {'data': all_data, 'label': all_label}  # dict
with open('./cpsc2020.pkl', 'wb') as fout:    # save the result as cpsc2020.pkl
    pickle.dump(res, fout)
```
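A quick sanity check of the pickle round trip (using toy data and a temporary file here, rather than the real cpsc2020.pkl):

```python
# Pickle round-trip sanity check with toy data and a temporary file
# (the script above writes ./cpsc2020.pkl instead).
import os
import pickle
import tempfile

toy = {'data': [[0.1, 0.2], [0.3, 0.4]], 'label': [0, 1]}

path = os.path.join(tempfile.mkdtemp(), 'toy.pkl')
with open(path, 'wb') as fout:
    pickle.dump(toy, fout)

with open(path, 'rb') as fin:
    loaded = pickle.load(fin)

print(loaded['label'])  # [0, 1]
```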
Normalize the data and encode the labels, split the data into a 90% training set and a 10% test set, shuffle the training set, and expand the data to 2-D:
```python
import pickle
from collections import Counter

import numpy as np
from sklearn.model_selection import train_test_split

def read_data_physionet():
    """only N, V, S"""
    # read pkl
    with open('./cpsc2020.pkl', 'rb') as fin:
        res = pickle.load(fin)  # load the dataset

    # normalize each segment to zero mean and unit variance
    all_data = res['data']
    for i in range(len(all_data)):
        tmp_data = all_data[i]
        tmp_std = np.std(tmp_data)    # standard deviation of the segment
        tmp_mean = np.mean(tmp_data)  # mean of the segment
        if tmp_std == 0:              # segments i = 1239-1271 are all zeros
            tmp_std = 1
        all_data[i] = (tmp_data - tmp_mean) / tmp_std  # normalize

    # label encoding (all_data aliases res['data'], so the normalized
    # segments are picked up again when rebuilding the lists below)
    all_data = []
    all_label = []
    for i in range(len(res['label'])):
        if res['label'][i] == 1:
            all_label.append(1)
            all_data.append(res['data'][i])
        elif res['label'][i] == 2:
            all_label.append(2)
            all_data.append(res['data'][i])
        else:
            all_label.append(0)
            all_data.append(res['data'][i])
    all_label = np.array(all_label)
    all_data = np.array(all_data)

    # split into a 90% training set and a 10% test set
    X_train, X_test, Y_train, Y_test = train_test_split(
        all_data, all_label, test_size=0.1, random_state=15)
    print('Counts of other (0), PVC (1), SPB (2) in the training and test sets:')
    print(Counter(Y_train), Counter(Y_test))

    # shuffle the training set
    shuffle_pid = np.random.permutation(Y_train.shape[0])
    X_train = X_train[shuffle_pid]
    Y_train = Y_train[shuffle_pid]

    # expand to shape (n, 1, 2000): one input channel per segment
    X_train = np.expand_dims(X_train, 1)
    X_test = np.expand_dims(X_test, 1)
    return X_train, X_test, Y_train, Y_test

X_train, X_test, Y_train, Y_test = read_data_physionet()
```
Output:

Counts of other (0), PVC (1), SPB (2) in the training and test sets:
Counter({1: 8741, 0: 4605, 2: 2539}) Counter({1: 1012, 0: 478, 2: 275})
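The per-segment z-score normalization used inside `read_data_physionet` (with the guard for all-zero segments) can be isolated as a small helper; a sketch on toy data:

```python
import numpy as np

def zscore(seg):
    """Normalize one segment to zero mean, unit variance; guard std == 0."""
    std = np.std(seg)
    if std == 0:   # e.g. the all-zero segments around index 1239-1271
        std = 1
    return (seg - np.mean(seg)) / std

seg = np.array([1.0, 2.0, 3.0, 4.0])
norm = zscore(seg)
print(norm.mean(), norm.std())   # ~0.0, ~1.0

flat = np.zeros(5)
print(zscore(flat))              # stays all zeros instead of dividing by 0
```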
Build a custom dataset:
```python
# Build the MyDataset data structure.
# A single signal has shape 1 * 2000.
import torch
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self, data, label):
        self.data = data
        self.label = label

    # convert the numpy arrays to Tensors
    def __getitem__(self, index):
        return (torch.tensor(self.data[index], dtype=torch.float),
                torch.tensor(self.label[index], dtype=torch.long))

    def __len__(self):
        return len(self.data)
```
Build the CNN:
```python
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Sequential(       # input shape (batch, 1, 2000)
            nn.Conv1d(
                in_channels=1,
                out_channels=16,
                kernel_size=5,
                stride=1,
                padding=2,
            ),                            # output shape (batch, 16, 2000)
            nn.Dropout(0.2),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=5),  # max over windows of 5: (batch, 16, 2000/5=400)
        )
        self.conv2 = nn.Sequential(       # input shape (batch, 16, 400)
            nn.Conv1d(16, 32, 5, 1, 2),   # output shape (batch, 32, 400)
            nn.Dropout(0.2),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=5),  # output shape (batch, 32, 400/5=80)
        )
        self.out = nn.Linear(32 * 80, 3)  # fully connected layer, 3 output classes

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size(0), -1)         # flatten to (batch, 32 * 80)
        output = self.out(x)
        return output, x

cnn = CNN()
print(cnn)
```
Output:
```
CNN(
  (conv1): Sequential(
    (0): Conv1d(1, 16, kernel_size=(5,), stride=(1,), padding=(2,))
    (1): Dropout(p=0.2, inplace=False)
    (2): ReLU()
    (3): MaxPool1d(kernel_size=5, stride=5, padding=0, dilation=1, ceil_mode=False)
  )
  (conv2): Sequential(
    (0): Conv1d(16, 32, kernel_size=(5,), stride=(1,), padding=(2,))
    (1): Dropout(p=0.2, inplace=False)
    (2): ReLU()
    (3): MaxPool1d(kernel_size=5, stride=5, padding=0, dilation=1, ceil_mode=False)
  )
  (out): Linear(in_features=2560, out_features=3, bias=True)
)
```
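The `in_features=2560` of the final linear layer follows from the pooling arithmetic: each same-padded convolution preserves the length 2000, and each MaxPool1d(5) divides it by 5. A small check of that arithmetic (the helper function is just for illustration):

```python
# Verify the flattened feature size: 2000 -> pool/5 -> 400 -> pool/5 -> 80,
# then 32 channels * 80 samples = 2560, matching in_features above.
def flattened_size(seq_len=2000, pool=5, out_channels=32, n_blocks=2):
    for _ in range(n_blocks):
        # Conv1d with kernel 5, stride 1, padding 2 keeps the length;
        # MaxPool1d(5) then divides it by 5
        seq_len = seq_len // pool
    return out_channels * seq_len

print(flattened_size())  # 2560
```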
The optimizer used is Adam, and the loss function is cross-entropy.

(Code omitted.)
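The article omits the training code; below is a minimal sketch of what such a loop typically looks like with Adam and `nn.CrossEntropyLoss`. The tiny stand-in model, random toy data, and hyperparameters here are illustrative assumptions, not the author's exact code; in practice `model` would be the `CNN` above, fed by `DataLoader`s built from `MyDataset`:

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

def train(model, loader, n_epoch=50, lr=1e-3):
    """Minimal Adam + cross-entropy training loop; returns per-epoch losses."""
    optimizer = optim.Adam(model.parameters(), lr=lr)
    loss_func = nn.CrossEntropyLoss()
    model.train()
    losses = []
    for epoch in range(n_epoch):
        epoch_loss = 0.0
        for x, y in loader:
            output, _ = model(x)       # the CNN above returns (logits, features)
            loss = loss_func(output, y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        losses.append(epoch_loss / len(loader))
    return losses

# Tiny stand-in model and random toy data, just so the sketch runs end to end.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(2000, 3)

    def forward(self, x):
        x = x.view(x.size(0), -1)
        return self.fc(x), x

X = torch.randn(8, 1, 2000)
Y = torch.randint(0, 3, (8,))
loader = DataLoader(TensorDataset(X, Y), batch_size=4, shuffle=True)
losses = train(TinyNet(), loader, n_epoch=2)
print(len(losses))  # 2
```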
Results after training for 50 epochs: (training-result figure omitted).