# Create the plugin directory, then download and unpack IK v7.4.2
sudo mkdir -p /usr/share/elasticsearch/plugins/ik/
# cd is a shell builtin, so it is run without sudo
cd /usr/share/elasticsearch/plugins/ik/
sudo wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.4.2/elasticsearch-analysis-ik-7.4.2.zip
sudo unzip elasticsearch-analysis-ik-7.4.2.zip
sudo rm elasticsearch-analysis-ik-7.4.2.zip
# Replace the bundled main dictionary with main.txt from sc-dictionary, keeping the original as a backup
cd /usr/share/elasticsearch/plugins/ik/config
sudo wget https://github.com/samejack/sc-dictionary/raw/master/main.txt
sudo mv main.dic main.dic.old
sudo mv main.txt main.dic
Configure the extension dictionary by editing the IKAnalyzer.cfg.xml configuration file:
cd /usr/share/elasticsearch/plugins/ik/config
sudo vi IKAnalyzer.cfg.xml
Set the ext_dict entry to main.dic:
<entry key="ext_dict">main.dic</entry>
Restart Elasticsearch:
sudo service elasticsearch restart
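The vi edit above can also be done non-interactively, which helps if you repeat this setup on several machines. The following is only a minimal sketch: it assumes the stock IKAnalyzer.cfg.xml contains an empty ext_dict entry, and it runs against a scratch copy whose content is an illustrative sample rather than the plugin's actual file. The variable names (workdir, cfg, ext_dict_line) are made up for the example.

```shell
#!/bin/sh
# Work on a scratch copy so nothing under /usr/share/elasticsearch is touched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cfg="$workdir/IKAnalyzer.cfg.xml"

# Assumed shape of the stock config file (illustrative sample only).
cat > "$cfg" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <entry key="ext_dict"></entry>
    <entry key="ext_stopwords"></entry>
</properties>
EOF

# Keep a backup, then replace whatever is inside the ext_dict entry with main.dic.
cp "$cfg" "$cfg.bak"
sed 's|<entry key="ext_dict">[^<]*</entry>|<entry key="ext_dict">main.dic</entry>|' "$cfg.bak" > "$cfg"

# Show the edited line to confirm the change.
ext_dict_line=$(grep 'key="ext_dict"' "$cfg")
echo "$ext_dict_line"
```

To apply it for real, point cfg at /usr/share/elasticsearch/plugins/ik/config/IKAnalyzer.cfg.xml, run the cp and sed steps with sudo, and restart Elasticsearch afterwards so the dictionary is reloaded.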
If everything went well, let's test IK:
curl -XGET http://localhost:9200/_analyze -H 'Content-Type:application/json' -d'
{
  "text": "後悔莫及的人家",
  "analyzer": "ik_smart"
}'
You should get a result like this:
{
  "tokens" : [
    {
      "token" : "後悔莫及",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "的",
      "start_offset" : 4,
      "end_offset" : 5,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "人家",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 2
    }
  ]
}
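With longer input text the raw JSON gets hard to scan. As a rough sketch, the token strings can be pulled out with grep and sed. The sample below is a hand-trimmed copy of the response shown above, and the variable names (response, tokens) are hypothetical. In practice you would pipe the curl output in instead, and a real JSON tool such as jq would be more robust than pattern matching.

```shell
#!/bin/sh
# Sample _analyze response, trimmed from the output shown above.
response='{
  "tokens" : [
    { "token" : "後悔莫及", "start_offset" : 0, "end_offset" : 4, "type" : "CN_WORD", "position" : 0 },
    { "token" : "的", "start_offset" : 4, "end_offset" : 5, "type" : "CN_CHAR", "position" : 1 },
    { "token" : "人家", "start_offset" : 5, "end_offset" : 7, "type" : "CN_WORD", "position" : 2 }
  ]
}'

# Grab each "token" : "..." pair (spaces around the colon are optional),
# then strip everything except the value itself.
tokens=$(printf '%s\n' "$response" \
  | grep -o '"token" *: *"[^"]*"' \
  | sed 's/^"token" *: *"//; s/"$//')

# Print one token per line.
printf '%s\n' "$tokens"
```

For the sample response this prints the three tokens, one per line, which makes it easy to compare how ik_smart and ik_max_word split the same sentence.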
Next, create a new index named test and try out IK tokenization on it:
curl -XPUT http://localhost:9200/test
curl -XPOST http://localhost:9200/test/_mapping -H 'Content-Type:application/json' -d'
{
  "properties": {
    "content": {
      "type": "text",
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_smart"
    }
  }
}'
curl -XPOST http://localhost:9200/test/_create/1 -H 'Content-Type:application/json' -d'
{"content":"曾經有一份真摯的感情放在我面前.我沒有珍惜.等到失去的時候才後悔莫及,塵世間最痛苦的事莫過於此.你的劍在我的咽喉上割下去吧!不要再猶豫了!如果上天能夠給我一個再來一次的機會,我一定會對那個女孩子說三個字\"我愛你\",如果非要在這份愛上加一個期限的話,我希望是一萬年。"}'
curl -XPOST http://localhost:9200/test/_search -H 'Content-Type:application/json' -d'
{
  "query": {"match": {"content": "如果"}},
  "highlight" : {
    "pre_tags" : ["<tag1>", "<tag2>"],
    "post_tags" : ["</tag1>", "</tag2>"],
    "fields" : {
      "content" : {}
    }
  }
}'
The result should look like this:
{
  "took" : 292,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.40037507,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.40037507,
        "_source" : {
          "content" : """曾經有一份真摯的感情放在我面前.我沒有珍惜.等到失去的時候才後悔莫及,塵世間最痛苦的事莫過於此.你的劍在我的咽喉上割下去吧!不要再猶豫了!如果上天能夠給我一個再來一次的機會,我一定會對那個女孩子說三個字"我愛你",如果非要在這份愛上加一個期限的話,我希望是一萬年。"""
        },
        "highlight" : {
          "content" : [
            """<tag1>如果</tag1>上天能夠給我一個再來一次的機會,我一定會對那個女孩子說三個字"我愛你",<tag1>如果</tag1>非要在這份愛上加一個期限的話,我希望是一萬年。"""
          ]
        }
      }
    ]
  }
}
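A quick way to confirm that every occurrence of the query term was highlighted is to count the tagged spans in the returned fragment. This is just a sketch: the fragment is copied from the response above, the variable names (fragment, matches, count) are hypothetical, and it assumes the tags are the tag1 pair seen in that response.

```shell
#!/bin/sh
# Highlight fragment copied from the search response above.
fragment='<tag1>如果</tag1>上天能夠給我一個再來一次的機會,我一定會對那個女孩子說三個字"我愛你",<tag1>如果</tag1>非要在這份愛上加一個期限的話,我希望是一萬年。'

# grep -o prints each highlighted span on its own line.
matches=$(printf '%s\n' "$fragment" | grep -o '<tag1>[^<]*</tag1>')

# Count the spans; tr strips the padding some wc implementations add.
count=$(printf '%s\n' "$fragment" | grep -o '<tag1>[^<]*</tag1>' | wc -l | tr -d ' ')

printf '%s\n' "$matches"
echo "highlighted spans: $count"
```

For this fragment the count is 2, matching the two places where 如果 appears in the highlighted part of the document.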