
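How to Implement Automatic Clustering in Solr

Published: 2021-12-22 17:06:51 Source: Yisu Cloud Views: 151 Author: iii Category: Cloud Computing

This article explains how to implement automatic clustering in Solr. The explanation is short and straightforward, so let's work through it step by step.

Solr implements its clustering feature with Carrot2, which can automatically group retrieved content into categories.

(Figure: Carrot2 clustering example.)

To enable clustering in Solr, first copy dist/solr-clustering-4.2.0.jar from the Solr distribution into \solr\contrib\analysis-extras\lib. Then open solrconfig.xml and add the following configuration: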

<searchComponent name="clustering"
                 enable="${solr.clustering.enabled:true}"
                 class="solr.clustering.ClusteringComponent">
  <lst name="engine">
    <str name="name">default</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
    <str name="LingoClusteringAlgorithm.desiredClusterCountBase">30</str>
    <str name="LingoClusteringAlgorithm.clusterMergingThreshold">0.70</str>
    <str name="LingoClusteringAlgorithm.scoreWeight">0</str>
    <str name="LingoClusteringAlgorithm.labelAssigner">org.carrot2.clustering.lingo.SimpleLabelAssigner</str>
    <str name="LingoClusteringAlgorithm.phraseLabelBoost">1.5</str>
    <str name="LingoClusteringAlgorithm.phraseLengthPenaltyStart">8</str>
    <str name="LingoClusteringAlgorithm.phraseLengthPenaltyStop">8</str>
    <str name="TermDocumentMatrixReducer.factorizationQuality">HIGH</str>
    <!--
      org.carrot2.matrix.factorization.PartialSingularValueDecompositionFactory
      org.carrot2.matrix.factorization.NonnegativeMatrixFactorizationEDFactory
      org.carrot2.matrix.factorization.NonnegativeMatrixFactorizationKLFactory
      org.carrot2.matrix.factorization.LocalNonnegativeMatrixFactorizationFactory
      org.carrot2.matrix.factorization.KMeansMatrixFactorizationFactory
    -->
    <str name="TermDocumentMatrixReducer.factorizationFactory">org.carrot2.matrix.factorization.NonnegativeMatrixFactorizationEDFactory</str>
    <str name="TermDocumentMatrixBuilder.maximumMatrixSize">37500</str>
    <str name="TermDocumentMatrixBuilder.titleWordsBoost">2.0</str>
    <str name="TermDocumentMatrixBuilder.maxWordDf">0.9</str>
    <str name="TermDocumentMatrixBuilder.termWeighting">org.carrot2.text.vsm.TfTermWeighting</str>
    <str name="MultilingualClustering.defaultLanguage">CHINESE_SIMPLIFIED</str>
    <str name="MultilingualClustering.languageAggregationStrategy">org.carrot2.text.clustering.MultilingualClustering.LanguageAggregationStrategy.FLATTEN_MAJOR_LANGUAGE</str>
    <str name="GenitiveLabelFilter.enabled">true</str>
    <str name="StopWordLabelFilter.enabled">true</str>
    <str name="NumericLabelFilter.enabled">true</str>
    <str name="QueryLabelFilter.enabled">true</str>
    <str name="MinLengthLabelFilter.enabled">true</str>
    <str name="StopLabelFilter.enabled">true</str>
    <str name="CompleteLabelFilter.enabled">true</str>
    <str name="CompleteLabelFilter.labelOverrideThreshold">0.65</str>
    <str name="DocumentAssigner.exactPhraseAssignment">false</str>
    <str name="DocumentAssigner.minClusterSize">2</str>
    <str name="merge-resources">true</str>
    <str name="CaseNormalizer.dfThreshold">1</str>
    <str name="PhraseExtractor.dfThreshold">1</str>
    <str name="carrot.lexicalResourcesDir">clustering/carrot2</str>
    <str name="SolrDocumentSource.solrIdFieldName">id</str>
  </lst>
</searchComponent>
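With this many engine parameters, a quick sanity check before restarting Solr can help. The following is a minimal sketch, not part of the original setup: it parses a trimmed copy of the component with Python's standard xml.etree.ElementTree and flags values that carry stray leading or trailing whitespace, an easy slip when pasting long class names. The helper names engine_params and values_with_stray_whitespace are hypothetical, and the trailing space in the sample is deliberate.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of a clustering searchComponent. The trailing space in the
# last value is deliberate, to show what the check catches.
SEARCH_COMPONENT = """
<searchComponent name="clustering"
                 enable="${solr.clustering.enabled:true}"
                 class="solr.clustering.ClusteringComponent">
  <lst name="engine">
    <str name="name">default</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
    <str name="MultilingualClustering.defaultLanguage">CHINESE_SIMPLIFIED</str>
    <str name="MultilingualClustering.languageAggregationStrategy">org.carrot2.text.clustering.MultilingualClustering.LanguageAggregationStrategy.FLATTEN_MAJOR_LANGUAGE </str>
  </lst>
</searchComponent>
"""


def engine_params(component_xml):
    """Return {parameter name: raw value} for the <lst name="engine"> block."""
    root = ET.fromstring(component_xml)
    engine = root.find("lst[@name='engine']")
    if engine is None:
        raise ValueError("no <lst name='engine'> element found")
    return {child.get("name"): (child.text or "") for child in engine}


def values_with_stray_whitespace(params):
    """Return the names whose value has leading or trailing whitespace."""
    return [name for name, value in params.items() if value != value.strip()]


params = engine_params(SEARCH_COMPONENT)
print("engine name:", params["name"])
print("suspicious values:", values_with_stray_whitespace(params))
```

The check only looks at whitespace; it does not verify that the parameter names or class names are ones Carrot2 actually accepts.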

With the clustering component configured, next configure the requestHandler:

<requestHandler name="/clustering"
                startup="lazy"
                enable="${solr.clustering.enabled:true}"
                class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <bool name="clustering">true</bool>
    <str name="clustering.engine">default</str>
    <bool name="clustering.results">true</bool>
    <str name="carrot.title">category_s</str>
    <str name="carrot.snippet">content</str>
    <str name="carrot.url">path</str>
    <str name="carrot.produceSummary">true</str>
  </lst>
  <arr name="last-components">
    <str>clustering</str>
  </arr>
</requestHandler>

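Two parameters need attention: carrot.title and carrot.snippet name the fields that the clustering computation compares, and both fields must be defined with stored="true". carrot.title is weighted more heavily than carrot.snippet. If only one field is used for the computation, carrot.snippet can be removed (removed entirely, not left with an empty value). Once this is set up, you can query with the following URL: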

http://localhost:8080/skyCore/clustering?q=*%3A*&wt=xml&indent=true
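To consume the result from a program rather than a browser, something like the following sketch may be useful. It makes several assumptions that are not in the original: it requests wt=json instead of wt=xml, it assumes the response carries a top-level "clusters" list whose entries have "labels" and "docs" keys, and the sample response, labels, and document ids are made up for illustration. The helper names clustering_url and summarize_clusters are hypothetical.

```python
from urllib.parse import urlencode


def clustering_url(base, query="*:*"):
    """Build a request URL for the /clustering handler, asking for JSON."""
    params = {"q": query, "wt": "json", "indent": "true"}
    return base.rstrip("/") + "/clustering?" + urlencode(params)


def summarize_clusters(response):
    """Return (label, document count) pairs from a parsed response dict.

    Assumes each cluster has a "labels" list and a "docs" list of ids.
    """
    summary = []
    for cluster in response.get("clusters", []):
        label = ", ".join(cluster.get("labels", []))
        summary.append((label, len(cluster.get("docs", []))))
    return summary


# Made-up response for illustration only; real labels depend on the index.
sample_response = {
    "response": {"numFound": 5, "docs": []},
    "clusters": [
        {"labels": ["Cloud Computing"], "score": 2.5, "docs": ["1", "2", "3"]},
        {"labels": ["Other Topics"], "score": 0.0, "docs": ["4", "5"]},
    ],
}

url = clustering_url("http://localhost:8080/skyCore")
print(url)
for label, count in summarize_clusters(sample_response):
    print(label, count)
```

In a real client the URL would be fetched (for example with urllib.request.urlopen) and the body decoded with json.loads before being passed to summarize_clusters; the sketch leaves the network call out so it runs without a Solr instance.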

Thanks for reading. That covers how to implement automatic clustering in Solr; how well it works for your own data still needs to be verified in practice.
