Worked examples of LN (LayerNorm), ReLU, and its variants in PyTorch

Published: 2021-05-27 10:28:03 | Source: Yisu Cloud | Views: 698 | Author: Xiaoxin | Category: Development

This article walks through examples of LN (LayerNorm), ReLU, and the ReLU variants in PyTorch, and shows the output each operation produces.

The goal is to see how data changes after PyTorch's LayerNorm normalization, and how the normalized data changes again after ReLU, PReLU, and LeakyReLU.

import torch
import torch.nn as nn


class Model(nn.Module):
    def __init__(self):
        super().__init__()
        # Normalize over the last dimension (10 elements); eps=0 adds nothing to the variance
        self.LN = nn.LayerNorm(10, eps=0, elementwise_affine=True)
        # Learnable negative slope, initialized to 0.25
        self.PRelu = nn.PReLU(init=0.25)
        self.Relu = nn.ReLU()
        # Fixed negative slope of 0.01
        self.LeakyReLU = nn.LeakyReLU(negative_slope=0.01, inplace=False)

    def forward(self, x):
        out = self.LN(x)
        print("LN:", out)
        # Each activation is applied to the same LayerNorm output so the results can be compared
        out1 = self.Relu(out)
        print("Relu:", out1)
        out2 = self.PRelu(out)
        print("PRelu:", out2)
        out3 = self.LeakyReLU(out)
        print("LeakyRelu:", out3)
        return out


tensor = torch.tensor([-0.9, 0.1, 0, -0.1, 0.9, -0.4, 0.9, -0.5, 0.8, 0.1])
net = Model()
print(tensor)
net(tensor)

Output:

tensor([-0.9000, 0.1000, 0.0000, -0.1000, 0.9000, -0.4000, 0.9000, -0.5000,
         0.8000, 0.1000])
LN: tensor([-1.6906, 0.0171, -0.1537, -0.3245, 1.3833, -0.8368, 1.3833, -1.0076,
         1.2125, 0.0171], grad_fn=<NativeLayerNormBackward>)
Relu: tensor([0.0000, 0.0171, 0.0000, 0.0000, 1.3833, 0.0000, 1.3833, 0.0000, 1.2125,
        0.0171], grad_fn=<ReluBackward0>)
PRelu: tensor([-0.4227, 0.0171, -0.0384, -0.0811, 1.3833, -0.2092, 1.3833, -0.2519,
         1.2125, 0.0171], grad_fn=<PreluBackward>)
LeakyRelu: tensor([-0.0169, 0.0171, -0.0015, -0.0032, 1.3833, -0.0084, 1.3833, -0.0101,
         1.2125, 0.0171], grad_fn=<LeakyReluBackward0>)

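As the output shows, LayerNorm does not rescale the data into the range 0 to 1. It standardizes the data: over the normalized dimension it subtracts the mean and divides by the standard deviation (the square root of the biased variance plus eps), so the output has mean 0 and variance 1. Here the input mean is 0.09 and the standard deviation is about 0.5856, so -0.9 maps to (-0.9 - 0.09) / 0.5856 ≈ -1.6906. The output is not forced to follow a Gaussian distribution, and equal input values always map to equal output values. The learnable affine parameters start at weight 1 and bias 0, so they have no effect in this example.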
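To make the LayerNorm step concrete, the following minimal sketch recomputes the LN output from the example by hand and compares it with nn.LayerNorm. It reuses the input tensor and the eps=0 setting from the example; the variable names (manual, out) are only illustrative.

```python
import torch
import torch.nn as nn

x = torch.tensor([-0.9, 0.1, 0, -0.1, 0.9, -0.4, 0.9, -0.5, 0.8, 0.1])

# Same configuration as the example: normalize over 10 elements, eps=0
ln = nn.LayerNorm(10, eps=0, elementwise_affine=True)
out = ln(x)

# Manual computation: subtract the mean, divide by the biased standard deviation
mean = x.mean()
var = x.var(unbiased=False)
manual = (x - mean) / torch.sqrt(var)

print(out)
print(manual)                     # first element is approximately -1.6906
print(out.mean().item())          # approximately 0
print(out.var(unbiased=False).item())  # approximately 1
```

The two tensors agree up to floating-point error, which suggests that with the default affine parameters LayerNorm is a plain standardization over the normalized dimension.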

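ReLU simply sets every value below 0 to 0. For positive inputs its gradient is exactly 1, which reduces the risk of vanishing gradients.

PReLU keeps part of each negative value by multiplying it by a learnable slope whose initial value is given by init (0.25 here, so -1.6906 becomes -0.4227).

LeakyReLU also keeps part of each negative value, but with a much smaller, fixed slope given by negative_slope (0.01 here, so -1.6906 becomes -0.0169).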
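The three activations differ only in how they treat negative inputs. The sketch below applies the functional forms to a few values taken from the LN output and checks them against hand-written versions. The slopes 0.25 and 0.01 match the example; the short input tensor and the manual_* names are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

# A few values from the LN output in the example
x = torch.tensor([-1.6906, 0.0171, -0.1537, 1.3833])

relu = F.relu(x)
leaky = F.leaky_relu(x, negative_slope=0.01)
# F.prelu takes the slope as a tensor; 0.25 is the init value used in the example
prelu = F.prelu(x, torch.tensor([0.25]))

# The same operations written out by hand
manual_relu = torch.clamp(x, min=0)
manual_leaky = torch.where(x >= 0, x, 0.01 * x)
manual_prelu = torch.where(x >= 0, x, 0.25 * x)

print(relu)
print(prelu)
print(leaky)
```

In a trained network the PReLU slope would move away from 0.25, whereas the LeakyReLU slope stays at the value passed to negative_slope.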

Supplement: normalization layers in PyTorch (BatchNorm, LayerNorm, InstanceNorm, GroupNorm)

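How BN, LN, IN, and GN differ conceptually:

BatchNorm: normalizes along the batch direction, computing the mean over N, H, and W for each channel. Its main drawback is sensitivity to batch size: the mean and variance are computed on one batch at a time, so with a very small batch they do not represent the overall data distribution well.

LayerNorm: normalizes along the channel direction, computing the mean over C, H, and W for each sample; it is especially effective for RNNs.

InstanceNorm: normalizes within a single channel of a single sample, computing the mean over H*W; it is used in style transfer. In image stylization the result depends mainly on the individual image instance, so normalizing over the whole batch is unsuitable and normalization is done over H and W instead. This can speed up convergence and keeps image instances independent of each other.

GroupNorm: splits the channels into groups and normalizes within each group, computing the mean over (C//G), H, and W; the result does not depend on batch size and is not constrained by it.

SwitchableNorm combines BN, LN, and IN with learnable weights, letting the network learn which normalization to use.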
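The four layers apply the same standardization and differ only in the axes over which the statistics are computed. The sketch below spells out those axes for an (N, C, H, W) tensor and compares each manual result with the corresponding PyTorch module. The tensor sizes, the group count G, and the helper normalize are assumptions chosen for illustration; affine parameters are disabled so only the normalization itself is compared.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N, C, H, W, G = 4, 6, 5, 5, 3
x = torch.randn(N, C, H, W)


def normalize(t, dims, eps=1e-5):
    """Standardize t using the mean and biased variance taken over dims."""
    mean = t.mean(dim=dims, keepdim=True)
    var = t.var(dim=dims, unbiased=False, keepdim=True)
    return (t - mean) / torch.sqrt(var + eps)


bn_manual = normalize(x, (0, 2, 3))   # BatchNorm: over N, H, W per channel
ln_manual = normalize(x, (1, 2, 3))   # LayerNorm: over C, H, W per sample
in_manual = normalize(x, (2, 3))      # InstanceNorm: over H, W per sample and channel
# GroupNorm: over (C // G), H, W per sample and group
gn_manual = normalize(x.reshape(N, G, C // G, H, W), (2, 3, 4)).reshape(N, C, H, W)

bn = nn.BatchNorm2d(C, affine=False)  # training mode by default, so batch statistics are used
ln = nn.LayerNorm([C, H, W], elementwise_affine=False)
inn = nn.InstanceNorm2d(C)
gn = nn.GroupNorm(G, C, affine=False)

for name, manual, layer in [("BN", bn_manual, bn), ("LN", ln_manual, ln),
                            ("IN", in_manual, inn), ("GN", gn_manual, gn)]:
    print(name, torch.allclose(manual, layer(x), atol=1e-5))
```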


1 BatchNorm

torch.nn.BatchNorm1d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
torch.nn.BatchNorm3d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

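Parameters:

num_features: the number of features of the expected input, whose size is 'batch_size x num_features [x width]'.

eps: a value added to the denominator for numerical stability (so that it cannot reach or approach 0). Default: 1e-5.

momentum: the momentum used to update the running mean and running variance. Default: 0.1.

affine: a boolean; when True, the layer has learnable affine parameters.

track_running_stats: a boolean; when True, the layer tracks the running mean and variance during training and uses them for normalization in evaluation mode.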

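Formula:

$$y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$$

Here E[x] and Var[x] are computed per channel over the mini-batch, and γ and β are the learnable affine parameters (present when affine is True).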
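The roles of momentum and track_running_stats are easier to see in a small experiment. The sketch below runs one training-mode forward pass through nn.BatchNorm1d and checks the running statistics against the update rule running = (1 - momentum) * running + momentum * batch_statistic, then checks that evaluation mode normalizes with those running statistics. The input data and the expected_* names are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(3, momentum=0.1)  # track_running_stats=True by default
x = torch.randn(8, 3) * 2 + 5         # hypothetical batch: 8 samples, 3 features

bn.train()
_ = bn(x)  # one forward pass in training mode updates the running statistics

# running_mean starts at 0 and running_var at 1;
# the running variance is updated with the unbiased batch variance
expected_mean = 0.9 * torch.zeros(3) + 0.1 * x.mean(dim=0)
expected_var = 0.9 * torch.ones(3) + 0.1 * x.var(dim=0, unbiased=True)
print(bn.running_mean, expected_mean)
print(bn.running_var, expected_var)

# In evaluation mode the layer normalizes with the running statistics instead of the batch
bn.eval()
y_eval = bn(x)
manual_eval = (x - bn.running_mean) / torch.sqrt(bn.running_var + bn.eps)
print(torch.allclose(y_eval, manual_eval, atol=1e-5))
```

After a single step the running statistics are still far from the batch statistics, which is why evaluation-mode output differs from training-mode output early in training.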

2 GroupNorm

torch.nn.GroupNorm(num_groups, num_channels, eps=1e-05, affine=True)

Parameters:

num_groups: the number of groups to split the channels into.

num_channels: the number of channels of the expected input; it must be divisible by num_groups.

eps: a value added to the denominator for numerical stability (so that it cannot reach or approach 0). Default: 1e-5.

affine: a boolean; when True, the layer has learnable per-channel affine parameters.

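Formula:

$$y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$$

Here E[x] and Var[x] are computed separately for each group of each sample, and γ and β are the learnable per-channel affine parameters (present when affine is True).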
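The effect of num_groups can be seen at its two extremes. As a sketch, with a single group GroupNorm normalizes over C, H, and W like LayerNorm, and with one group per channel it normalizes over H and W like InstanceNorm. The tensor sizes are assumptions, and affine parameters are disabled so only the normalization is compared.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N, C, H, W = 2, 6, 4, 4
x = torch.randn(N, C, H, W)

# One group: statistics over all of C, H, W for each sample
gn_one = nn.GroupNorm(1, C, affine=False)
ln = nn.LayerNorm([C, H, W], elementwise_affine=False)
same_as_ln = torch.allclose(gn_one(x), ln(x), atol=1e-5)

# One group per channel: statistics over H, W for each sample and channel
gn_all = nn.GroupNorm(C, C, affine=False)
inn = nn.InstanceNorm2d(C)
same_as_in = torch.allclose(gn_all(x), inn(x), atol=1e-5)

print(same_as_ln, same_as_in)
```

Values of num_groups between 1 and C sit between these two behaviors, as long as num_channels is divisible by num_groups.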

3 InstanceNorm

torch.nn.InstanceNorm1d(num_features, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
torch.nn.InstanceNorm2d(num_features, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
torch.nn.InstanceNorm3d(num_features, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)

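Parameters:

num_features: the number of features of the expected input, whose size is 'batch_size x num_features [x width]'.

eps: a value added to the denominator for numerical stability (so that it cannot reach or approach 0). Default: 1e-5.

momentum: the momentum used to update the running mean and running variance. Default: 0.1.

affine: a boolean; when True, the layer has learnable affine parameters. Default: False.

track_running_stats: a boolean; when True, the layer tracks the running mean and variance during training and uses them in evaluation mode. Default: False.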

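Formula:

$$y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$$

Here E[x] and Var[x] are computed separately for each channel of each sample, and γ and β are learnable only when affine is True.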
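Two consequences of the InstanceNorm defaults can be checked directly: with affine=False the layer has no learnable parameters, and because statistics are computed per sample, the result for one sample does not depend on the rest of the batch. The sketch below assumes a small random (N, C, H, W) input; all names are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3, 8, 8)
inn = nn.InstanceNorm2d(3)  # defaults: affine=False, track_running_stats=False

# No learnable parameters with affine=False
n_params = sum(p.numel() for p in inn.parameters())

# The first sample gives the same result alone as inside the batch
batch_out = inn(x)
single_out = inn(x[:1])
independent = torch.allclose(batch_out[:1], single_out, atol=1e-6)

# Every (sample, channel) feature map is standardized on its own
per_map_mean = batch_out.mean(dim=(2, 3))
per_map_var = batch_out.var(dim=(2, 3), unbiased=False)

print(n_params, independent)
print(per_map_mean.abs().max().item(), per_map_var.mean().item())
```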

4 LayerNorm

torch.nn.LayerNorm(normalized_shape, eps=1e-05, elementwise_affine=True)

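Parameters:

normalized_shape: the shape of the trailing input dimensions to normalize over, for an expected input of size

[* × normalized_shape[0] × normalized_shape[1] × … × normalized_shape[-1]]

If a single integer is given, normalization is done over the last dimension, which must have that size.

eps: a value added to the denominator for numerical stability (so that it cannot reach or approach 0). Default: 1e-5.

elementwise_affine: a boolean; when True, the layer has learnable per-element affine parameters with shape normalized_shape.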

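Formula:

$$y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$$

Here E[x] and Var[x] are computed over the trailing dimensions given by normalized_shape, and γ and β are the learnable per-element affine parameters (present when elementwise_affine is True).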
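How normalized_shape selects the normalized dimensions can be illustrated on a sequence-shaped tensor. The sketch below assumes a hypothetical (batch, seq_len, features) input of size (2, 5, 16) and compares normalizing over the last dimension with normalizing over the last two.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 5, 16)  # hypothetical (batch, seq_len, features) input

ln_last = nn.LayerNorm(16)        # normalize over the last dimension only
ln_last2 = nn.LayerNorm([5, 16])  # normalize over the last two dimensions

out_last = ln_last(x)
out_last2 = ln_last2(x)

# The affine parameters have the same shape as normalized_shape
weight_shape_1 = tuple(ln_last.weight.shape)
weight_shape_2 = tuple(ln_last2.weight.shape)
print(weight_shape_1, weight_shape_2)

# Mean is approximately 0 over exactly the normalized dimensions
mean_last = out_last.mean(dim=-1)         # one value per (batch, position)
mean_last2 = out_last2.mean(dim=(1, 2))   # one value per batch element
print(mean_last.abs().max().item(), mean_last2.abs().max().item())
```

Normalizing over the last dimension only is the usual choice for sequence models, since each position is then standardized independently of sequence length.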

5 LocalResponseNorm

torch.nn.LocalResponseNorm(size, alpha=0.0001, beta=0.75, k=1.0)

Parameters:

size: the number of neighboring channels used for normalization.

alpha: the multiplicative factor. Default: 0.0001.

beta: the exponent. Default: 0.75.

k: the additive factor. Default: 1.

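Formula:

$$b_c = a_c \left( k + \frac{\alpha}{n} \sum_{c'=\max(0,\, c-n/2)}^{\min(N-1,\, c+n/2)} a_{c'}^2 \right)^{-\beta}$$

Here a_c is the input at channel c, b_c is the output, n is size, and N is the number of channels.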
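To show how size, alpha, beta, and k interact, the sketch below writes local response normalization as an explicit loop over channels and compares it with nn.LocalResponseNorm. It assumes an odd size (3 here) so that the channel window is symmetric; the input sizes and the helper manual_lrn are illustrative, not part of PyTorch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 6, 4, 4)
size, alpha, beta, k = 3, 1e-4, 0.75, 1.0
lrn = nn.LocalResponseNorm(size, alpha=alpha, beta=beta, k=k)


def manual_lrn(a, size, alpha, beta, k):
    """Channel-wise LRN written as a loop; assumes an odd window size."""
    n_channels = a.shape[1]
    squared = a.pow(2)
    out = torch.empty_like(a)
    half = size // 2
    for c in range(n_channels):
        lo = max(0, c - half)
        hi = min(n_channels - 1, c + half)
        # Sum of squares over the neighboring channels, clipped at the edges
        window_sum = squared[:, lo:hi + 1].sum(dim=1)
        out[:, c] = a[:, c] / (k + alpha / size * window_sum).pow(beta)
    return out


lrn_matches = torch.allclose(lrn(x), manual_lrn(x, size, alpha, beta, k), atol=1e-6)
print(lrn_matches)
```

Unlike the other layers in this article, LocalResponseNorm has no learnable parameters and does not subtract a mean; it only divides each value by a term built from its neighboring channels.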
