Voiceprint recognition: how to implement Kaldi callhome diarization

Published: 2021-12-08 13:41:49  Source: 億速云 (Yisu Cloud)  Views: 189  Author: iii  Category: Big Data

This article explains how to implement speaker diarization with the Kaldi callhome diarization recipe. Many people get stuck on this in day-to-day work, so the steps below collect a simple, practical procedure drawn from various references.

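callhome diarization is the Kaldi recipe dedicated to speaker diarization: it clusters the segments of a mixed recording so that each segment is attributed to one speaker.

It is worth learning to read the command demos in Kaldi yourself.

My own steps were as follows: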

steps/segmentation/detect_speech_activity.sh --cmd 'run.pl' --nj 1 --mfcc-config ./conf/mfcc_hires.conf --extra-left-context 79 --extra-right-context 21 --extra-left-context-initial 0 --extra-right-context-final 0 --frames-per-chunk 150 data/ljj exp/segmentation_1a/tdnn_stats_asr_sad_1a exp/mfcc_hires exp/segmentation_sad_snr/nnet_tdnn_j_ljj data/ljj
 
steps/make_mfcc.sh --mfcc-config conf/mfcc.conf --nj 1 --cmd "run.pl" --write-utt2num-frames true data/ljj_seg exp/make_mfcc mfcc 

utils/fix_data_dir.sh data/ljj_seg
 
 #  Sliding-window cepstral mean normalization (CMN)
 local/nnet3/xvector/prepare_feats.sh --nj 1 --cmd "run.pl" data/ljj_seg data/ljj_seg_cmn exp/ljj_seg_cmn
 
 cp data/ljj_seg/segments data/ljj_seg_cmn/
 
 utils/fix_data_dir.sh data/ljj_seg_cmn
 
 diarization/nnet3/xvector/extract_xvectors.sh --cmd "run.pl"  --nj 1 --window 1.5 --period 0.75 --apply-cmn false --min-segment 0.5 exp/xvector_nnet_1a  data/ljj_seg_cmn exp/xvectors_ljj_seg
 
 diarization/nnet3/xvector/score_plda.sh --cmd "run.pl --mem 4G" --nj 1 --target-energy 0.9  exp/xvector_nnet_1a/xvectors_callhome1 exp/xvectors_ljj_seg exp/xvectors_ljj_seg/plda_scores
 
 diarization/cluster.sh --cmd "run.pl --mem 4G" --nj 1 --reco2num-spk data/ljj_seg/reco2num_spk exp/xvectors_ljj_seg/plda_scores exp/xvectors_ljj_seg/plda_scores_num_speakers
 #  If you know how many speakers are in the recording, create data/ljj_seg/reco2num_spk and pass it with --reco2num-spk (as above)
 
 diarization/cluster.sh --cmd "run.pl --mem 4G" --nj 1 --threshold 0 exp/xvectors_ljj_seg/plda_scores exp/xvectors_ljj_seg/plda_scores_threshold_0
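
The --reco2num-spk option used above points at a plain-text file with one "<recording-id> <num-speakers>" pair per line. A minimal Python sketch of generating it from the segments file is shown here. The helper names recording_ids and reco2num_spk_text are hypothetical, and the speaker count of 2 is an assumption made for illustration, not something the recipe infers.

```python
from pathlib import Path


def recording_ids(segments_text: str) -> list[str]:
    """Return the unique recording IDs of a Kaldi segments file, in order.

    Each segments line is: <segment-id> <recording-id> <start> <end>
    """
    seen: dict[str, None] = {}
    for line in segments_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            seen.setdefault(fields[1], None)
    return list(seen)


def reco2num_spk_text(reco_ids: list[str], num_speakers: int) -> str:
    """Build the file contents: one '<recording-id> <num-speakers>' per line."""
    return "".join(f"{reco} {num_speakers}\n" for reco in reco_ids)


if __name__ == "__main__":
    seg_path = Path("data/ljj_seg/segments")
    if seg_path.exists():
        ids = recording_ids(seg_path.read_text())
        # Assumption: every recording contains exactly two speakers.
        Path("data/ljj_seg/reco2num_spk").write_text(reco2num_spk_text(ids, 2))
```

If recordings have different speaker counts, the file would need a per-recording value instead of a single constant.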
 
 
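The output is in RTTM format: column 2 is the file name, column 3 is the channel, column 4 is the segment start time in seconds, column 5 is the segment duration in seconds (how long the segment lasts from that start time), and column 8 is the speaker label assigned to the segment.
Output when the number of speakers is known: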
SPEAKER 18642259056-liujinjie.wav 0   0.000   4.510 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0   4.530   1.660 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0   6.210   4.880 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  11.090   1.660 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  12.800   2.130 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  14.950   4.400 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  19.390   1.810 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  21.220   5.220 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  26.440   4.410 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  30.850   2.480 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  33.340   5.120 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  38.460   5.990 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  44.480   3.910 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  48.460   3.460 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  52.060   5.420 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  57.530   5.030 <NA> <NA> 1 <NA> <NA>


Output when the number of speakers is unknown:
SPEAKER 18642259056-liujinjie.wav 0   0.000   4.510 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0   4.530   1.660 <NA> <NA> 3 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0   6.210   4.880 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  11.090   1.660 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  12.800   2.130 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  14.950   4.400 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  19.390   1.810 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  21.220   5.220 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  26.440   4.410 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  30.850   2.480 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  33.340   5.120 <NA> <NA> 2 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  38.460   5.990 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  44.480   3.910 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  48.460   3.460 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  52.060   5.420 <NA> <NA> 1 <NA> <NA>
SPEAKER 18642259056-liujinjie.wav 0  57.530   5.030 <NA> <NA> 1 <NA> <NA>
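
To compare the two outputs, it can help to read the RTTM lines programmatically and total the speech time per label. The following is a small sketch under the column layout seen in the listings above; the names Turn, parse_rttm and seconds_per_speaker are made up for this example and are not part of Kaldi.

```python
from collections import defaultdict
from typing import NamedTuple


class Turn(NamedTuple):
    recording: str   # column 2: file name
    start: float     # column 4: start time in seconds
    duration: float  # column 5: duration in seconds
    speaker: str     # column 8: speaker label


def parse_rttm(text: str) -> list[Turn]:
    """Parse SPEAKER lines of an RTTM file, skipping anything else."""
    turns = []
    for line in text.splitlines():
        f = line.split()
        if len(f) < 8 or f[0] != "SPEAKER":
            continue
        turns.append(Turn(f[1], float(f[3]), float(f[4]), f[7]))
    return turns


def seconds_per_speaker(turns: list[Turn]) -> dict[str, float]:
    """Sum the segment durations for each speaker label."""
    totals: dict[str, float] = defaultdict(float)
    for turn in turns:
        totals[turn.speaker] += turn.duration
    return dict(totals)


if __name__ == "__main__":
    sample = (
        "SPEAKER 18642259056-liujinjie.wav 0   0.000   4.510 <NA> <NA> 1 <NA> <NA>\n"
        "SPEAKER 18642259056-liujinjie.wav 0   4.530   1.660 <NA> <NA> 2 <NA> <NA>\n"
    )
    for speaker, total in seconds_per_speaker(parse_rttm(sample)).items():
        print(f"speaker {speaker}: {total:.3f} s")
```

A label that accounts for only a second or two of speech, such as label 3 in the second listing, is a hint that threshold clustering may have split off a spurious speaker.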
 
The next step is to use pydub to splice each speaker's speech segments together.
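
The original does not show the pydub code, so the following is only a sketch of one way the splicing could be done, not the author's script. It assumes the RTTM text is available as a string and that the source audio is a WAV file; the function names spans_by_speaker and splice_speakers and the output naming scheme are hypothetical.

```python
from collections import defaultdict


def spans_by_speaker(rttm_text: str) -> dict[str, list[tuple[int, int]]]:
    """Map each speaker label to its (start_ms, end_ms) spans, in file order."""
    spans: dict[str, list[tuple[int, int]]] = defaultdict(list)
    for line in rttm_text.splitlines():
        f = line.split()
        if len(f) < 8 or f[0] != "SPEAKER":
            continue
        start, duration = float(f[3]), float(f[4])
        spans[f[7]].append((round(start * 1000), round((start + duration) * 1000)))
    return dict(spans)


def splice_speakers(wav_path: str, rttm_text: str, out_prefix: str) -> list[str]:
    """Write one WAV per speaker containing that speaker's segments back to back."""
    # Imported here so the parsing helper above works without pydub installed.
    from pydub import AudioSegment

    audio = AudioSegment.from_wav(wav_path)
    written = []
    for speaker, spans in spans_by_speaker(rttm_text).items():
        combined = AudioSegment.empty()
        for start_ms, end_ms in spans:
            combined += audio[start_ms:end_ms]  # pydub slices in milliseconds
        out_path = f"{out_prefix}_spk{speaker}.wav"  # hypothetical naming scheme
        combined.export(out_path, format="wav")
        written.append(out_path)
    return written
```

Calling splice_speakers with the recording path, the contents of the RTTM file and an output prefix would then produce one file per label.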

That concludes this walkthrough of implementing speaker diarization with the Kaldi callhome diarization recipe. Theory sticks best when paired with practice, so try the steps yourself. For more on related topics, keep following the 億速云 site.
