NLTK can represent a word alignment between two sentences with the AlignedSent and Alignment classes in nltk.translate. Here is an example:
from nltk.translate import AlignedSent, Alignment

# Source (English) and target (French) tokens
src_words = ['I', 'saw', 'the', 'man']
tgt_words = ['Je', 'ai', 'vu', 'l', 'homme']

# Each pair is (source index, target index); 'saw' maps to both 'ai' and 'vu'
alignment = Alignment([(0, 0), (1, 1), (1, 2), (2, 3), (3, 4)])
aligned_sent = AlignedSent(src_words, tgt_words, alignment)

print(aligned_sent.words)      # source tokens
print(aligned_sent.mots)       # target tokens
print(aligned_sent.alignment)  # the stored alignment
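In this example, we build an Alignment from a list of (source index, target index) pairs, pass it to AlignedSent together with the source and target word lists, and then print the stored words and alignment. AlignedSent does not compute the alignment itself; it only stores the one you supply. To define a different alignment, pass a different set of index pairs to Alignment.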
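
To make the meaning of the index pairs concrete, here is a minimal sketch in plain Python that resolves each pair into the words it links. It does not call NLTK; the helper name aligned_word_pairs is hypothetical and used only for illustration, and the pairs assume the sentence pair from the example with 'saw' linked to both 'ai' and 'vu'.

```python
def aligned_word_pairs(src_words, tgt_words, pairs):
    """Turn (source index, target index) pairs into (source word, target word) pairs.

    Hypothetical helper for illustration; not part of NLTK.
    """
    resolved = []
    for i, j in sorted(pairs):
        # Reject pairs that point outside either sentence
        if not (0 <= i < len(src_words)) or not (0 <= j < len(tgt_words)):
            raise IndexError(f"alignment pair ({i}, {j}) is out of range")
        resolved.append((src_words[i], tgt_words[j]))
    return resolved


src_words = ['I', 'saw', 'the', 'man']
tgt_words = ['Je', 'ai', 'vu', 'l', 'homme']
pairs = [(0, 0), (1, 1), (1, 2), (2, 3), (3, 4)]

for src, tgt in aligned_word_pairs(src_words, tgt_words, pairs):
    print(f"{src} -> {tgt}")
```

Because one source index may appear in several pairs, a single source word can link to more than one target word, which is how the two-word French verb form is covered here.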