IT序号网

自定义中文全文索引

developer 2021年06月14日 数据库 361 0

一、中文分词插件

NEO4J中文全文索引,分词组件使用IKAnalyzer。为了支持高版本LUCENE,IKAnalyzer需要做一些调整。

IKAnalyzer-3.0 旧版本实现参考

ELASTICSEARCH-IKAnlyzer 高版本实现参考

1、分词组件的调整

调整之后的分词组件 casia.isiteam.zdr.wltea

// 调整之后的实现 
public final class IKAnalyzer extends Analyzer { 
<span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">// 默认细粒度切分 true-智能切分 false-细粒度切分</span></span></span></span></span> 
<span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">private</span></span></span></span></span> Configuration configuration <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">Configuration</span><span class="token punctuation">(</span><span class="token boolean"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">false</span></span></span></span></span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
 
<span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">/** 
 * IK分词器Lucene  Analyzer接口实现类 
 * &lt;p&gt; 
 * 默认细粒度切分算法 
 */</span></span></span></span></span> 
<span class="token keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword">public</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> </span></span></span></span><span class="token function"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title">IKAnalyzer</span></span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">(</span></span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">)</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> </span></span></span></span><span class="token punctuation">{</span> 
<span class="token punctuation">}</span> 
 
<span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">/** 
 * IK分词器Lucene Analyzer接口实现类 
 * 
 * </span></span></span><span class="hljs-doctag"><span class="hljs-comment"><span class="hljs-doctag"><span class="hljs-comment"><span class="hljs-doctag"><span class="hljs-comment"><span class="hljs-doctag">@param</span></span></span></span></span></span></span><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"> configuration IK配置 
 */</span></span></span></span></span> 
<span class="token keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword">public</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> </span></span></span></span><span class="token function"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title">IKAnalyzer</span></span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">(</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">Configuration configuration</span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">)</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> </span></span></span></span><span class="token punctuation">{</span> 
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">super</span></span></span></span></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">this</span></span></span></span></span><span class="token punctuation">.</span>configuration <span class="token operator">=</span> configuration<span class="token punctuation">;</span> 
<span class="token punctuation">}</span> 
 
<span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">/** 
 * 重载Analyzer接口,构造分词组件 
 */</span></span></span></span></span> 
<span class="token annotation punctuation"><span class="hljs-meta"><span class="hljs-meta"><span class="hljs-meta"><span class="hljs-meta">@Override</span></span></span></span></span> 
<span class="token keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword">protected</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> TokenStreamComponents </span></span></span></span><span class="token function"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title">createComponents</span></span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">(</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">String fieldName</span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">)</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> </span></span></span></span><span class="token punctuation">{</span> 
    Tokenizer _IKTokenizer <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">IKTokenizer</span><span class="token punctuation">(</span>configuration<span class="token punctuation">)</span><span class="token punctuation">;</span> 
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">return</span></span></span></span></span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">TokenStreamComponents</span><span class="token punctuation">(</span>_IKTokenizer<span class="token punctuation">)</span><span class="token punctuation">;</span> 
<span class="token punctuation">}</span> 

}

2、分词测试

自定义分词函数

RETURN zdr.index.iKAnalyzer('复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?',true) AS words 

   
    

在这里插入图片描述

 /** * @param text:待分词文本 * @param useSmart:true 用智能分词,false 细粒度分词 * @return * @Description: TODO(支持中英文本分词) */ 
    @UserFunction(name = "zdr.index.iKAnalyzer") 
    @Description("Fulltext index iKAnalyzer - RETURN zdr.index.iKAnalyzer({text},true) AS words") 
    public List<String> iKAnalyzer(@Name("text") String text, @Name("useSmart") boolean useSmart) { 
    PropertyConfigurator<span class="token punctuation">.</span><span class="token function">configureAndWatch</span><span class="token punctuation">(</span><span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"dic"</span></span></span></span></span> <span class="token operator">+</span> File<span class="token punctuation">.</span>separator <span class="token operator">+</span> <span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"log4j.properties"</span></span></span></span></span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
    Configuration cfg <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">Configuration</span><span class="token punctuation">(</span>useSmart<span class="token punctuation">)</span><span class="token punctuation">;</span> 
 
    StringReader input <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">StringReader</span><span class="token punctuation">(</span>text<span class="token punctuation">.</span><span class="token function">trim</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
    IKSegmenter ikSegmenter <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">IKSegmenter</span><span class="token punctuation">(</span>input<span class="token punctuation">,</span> cfg<span class="token punctuation">)</span><span class="token punctuation">;</span> 
 
    List<span class="token generics function"><span class="token punctuation">&lt;</span>String<span class="token punctuation">&gt;</span></span> results <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">ArrayList</span><span class="token operator">&lt;</span><span class="token operator">&gt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">try</span></span></span></span></span> <span class="token punctuation">{</span> 
        <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">for</span></span></span></span></span> <span class="token punctuation">(</span>Lexeme lexeme <span class="token operator">=</span> ikSegmenter<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> lexeme <span class="token operator">!=</span> <span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">null</span></span></span></span><span class="token punctuation">;</span> lexeme <span class="token operator">=</span> ikSegmenter<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">{</span> 
            results<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>lexeme<span class="token punctuation">.</span><span class="token function">getLexemeText</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
        <span class="token punctuation">}</span> 
    <span class="token punctuation">}</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">catch</span></span></span></span></span> <span class="token punctuation">(</span><span class="token class-name">IOException</span> e<span class="token punctuation">)</span> <span class="token punctuation">{</span> 
        e<span class="token punctuation">.</span><span class="token function">printStackTrace</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
    <span class="token punctuation">}</span> 
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">return</span></span></span></span></span> results<span class="token punctuation">;</span> 
<span class="token punctuation">}</span> 

二、样例数据准备

# 构造样例数据 
MERGE (a:Loc {name:'A'}) SET a.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?' 
MERGE (b:Loc {name:'B'}) SET b.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?' 
MERGE (c:Loc {name:'C'}) SET c.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?' 
MERGE (d:Loc {name:'D'}) SET d.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?' 
MERGE (e:Loc {name:'E'}) SET e.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?' 
MERGE (f:Loc {name:'F'}) SET f.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?' 
MERGE (a)-[:ROAD {cost:50}]->(b) 
MERGE (a)-[:ROAD {cost:50}]->(c) 
MERGE (a)-[:ROAD {cost:100}]->(d) 
MERGE (b)-[:ROAD {cost:40}]->(d) 
MERGE (c)-[:ROAD {cost:40}]->(d) 
MERGE (c)-[:ROAD {cost:80}]->(e) 
MERGE (d)-[:ROAD {cost:30}]->(e) 
MERGE (d)-[:ROAD {cost:80}]->(f) 
MERGE (e)-[:ROAD {cost:40}]->(f); 

   
    

三、通过中文全文分词组件创建节点索引

自定义创建索引过程

CALL zdr.index.addChineseFulltextIndex('IKAnalyzer', 'Loc', ['description']) YIELD message RETURN message 

   
    

在这里插入图片描述

 @Procedure(value = "zdr.index.addChineseFulltextIndex", mode = Mode.WRITE) 
    @Description("CALL zdr.index.addChineseFulltextIndex(String indexName, String labelName, List<String> propKeys) YIELD message RETURN message," + 
            "为一个标签下的所有节点的指定属性添加索引") 
    public Stream<NodeIndexMessage> addChineseFulltextIndex(@Name("indexName") String indexName, 
                                                            @Name("labelName") String labelName, @Name("properties") List<String> propKeys) { 
        Label label = Label.label(labelName); 
    List<span class="token generics function"><span class="token punctuation">&lt;</span>NodeIndexMessage<span class="token punctuation">&gt;</span></span> output <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">ArrayList</span><span class="token operator">&lt;</span><span class="token operator">&gt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> 

// // 按照标签找到该标签下的所有节点
ResourceIterator<Node> nodes = db.findNodes(label);
System.out.println("nodes:" + nodes.toString());

    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">int</span></span></span></span></span> nodesSize <span class="token operator">=</span> <span class="token number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number">0</span></span></span></span></span><span class="token punctuation">;</span> 
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">int</span></span></span></span></span> propertiesSize <span class="token operator">=</span> <span class="token number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number">0</span></span></span></span></span><span class="token punctuation">;</span> 
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">while</span></span></span></span></span> <span class="token punctuation">(</span>nodes<span class="token punctuation">.</span><span class="token function">hasNext</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">{</span> 
        nodesSize<span class="token operator">++</span><span class="token punctuation">;</span> 
        Node node <span class="token operator">=</span> nodes<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
        System<span class="token punctuation">.</span>out<span class="token punctuation">.</span><span class="token function">println</span><span class="token punctuation">(</span><span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"current nodes:"</span></span></span></span></span> <span class="token operator">+</span> node<span class="token punctuation">.</span><span class="token function">toString</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
 
        <span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">// 每个节点上需要添加索引的属性</span></span></span></span></span> 
        Set<span class="token operator">&lt;</span>Map<span class="token punctuation">.</span>Entry<span class="token generics function"><span class="token punctuation">&lt;</span>String<span class="token punctuation">,</span> Object<span class="token punctuation">&gt;</span></span><span class="token operator">&gt;</span> properties <span class="token operator">=</span> node<span class="token punctuation">.</span><span class="token function">getProperties</span><span class="token punctuation">(</span>propKeys<span class="token punctuation">.</span><span class="token function">toArray</span><span class="token punctuation">(</span><span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">String</span><span class="token punctuation">[</span><span class="token number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number">0</span></span></span></span></span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">entrySet</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
        System<span class="token punctuation">.</span>out<span class="token punctuation">.</span><span class="token function">println</span><span class="token punctuation">(</span><span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"current node properties"</span></span></span></span></span> <span class="token operator">+</span> properties<span class="token punctuation">)</span><span class="token punctuation">;</span> 
 
        <span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">// 查询该节点是否已有索引,有的话删除</span></span></span></span></span> 
        Index<span class="token generics function"><span class="token punctuation">&lt;</span>Node<span class="token punctuation">&gt;</span></span> index <span class="token operator">=</span> db<span class="token punctuation">.</span><span class="token function">index</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">forNodes</span><span class="token punctuation">(</span>indexName<span class="token punctuation">,</span> FULL_INDEX_CONFIG<span class="token punctuation">)</span><span class="token punctuation">;</span> 
        System<span class="token punctuation">.</span>out<span class="token punctuation">.</span><span class="token function">println</span><span class="token punctuation">(</span><span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"current node index"</span></span></span></span></span> <span class="token operator">+</span> index<span class="token punctuation">)</span><span class="token punctuation">;</span> 
        index<span class="token punctuation">.</span><span class="token function">remove</span><span class="token punctuation">(</span>node<span class="token punctuation">)</span><span class="token punctuation">;</span> 
 
        <span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">// 为了该节点的每个需要添加索引的属性添加全文索引</span></span></span></span></span> 
        <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">for</span></span></span></span></span> <span class="token punctuation">(</span>Map<span class="token punctuation">.</span>Entry<span class="token generics function"><span class="token punctuation">&lt;</span>String<span class="token punctuation">,</span> Object<span class="token punctuation">&gt;</span></span> property <span class="token operator">:</span> properties<span class="token punctuation">)</span> <span class="token punctuation">{</span> 
            propertiesSize<span class="token operator">++</span><span class="token punctuation">;</span> 
            index<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>node<span class="token punctuation">,</span> property<span class="token punctuation">.</span><span class="token function">getKey</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> property<span class="token punctuation">.</span><span class="token function">getValue</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
        <span class="token punctuation">}</span> 
    <span class="token punctuation">}</span> 
 
    String message <span class="token operator">=</span> <span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"IndexName:"</span></span></span></span></span> <span class="token operator">+</span> indexName <span class="token operator">+</span> <span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">",LabelName:"</span></span></span></span></span> <span class="token operator">+</span> labelName <span class="token operator">+</span> <span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">",NodesSize:"</span></span></span></span></span> <span class="token operator">+</span> nodesSize <span class="token operator">+</span> <span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">",PropertiesSize:"</span></span></span></span></span> <span class="token operator">+</span> propertiesSize<span class="token punctuation">;</span> 
    NodeIndexMessage indexMessage <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">NodeIndexMessage</span><span class="token punctuation">(</span>message<span class="token punctuation">)</span><span class="token punctuation">;</span> 
    output<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>indexMessage<span class="token punctuation">)</span><span class="token punctuation">;</span> 
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">return</span></span></span></span></span> output<span class="token punctuation">.</span><span class="token function">stream</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> 
<span class="token punctuation">}</span> 

四、中文分词索引查询

自定义查询索引过程

CALL zdr.index.chineseFulltextIndexSearch('IKAnalyzer', 'description:吖啶基氨基甲烷磺酰甲氧基苯胺', 100) YIELD node RETURN node 

   
    

在这里插入图片描述

CALL zdr.index.chineseFulltextIndexSearch('IKAnalyzer', 'description:复联* AND year:1999', 100) YIELD node,weight RETURN node.name,node.year,node.description,weight 

   
    

在这里插入图片描述

  @Procedure(value = "zdr.index.chineseFulltextIndexSearch", mode = Mode.WRITE) 
    @Description("CALL zdr.index.chineseFulltextIndexSearch(String indexName, String query, long limit) YIELD node RETURN node," + 
            "执行LUCENE全文检索,返回前{limit个结果}") 
    public Stream<ChineseHit> chineseFulltextIndexSearch(@Name("indexName") String indexName, 
                                                         @Name("query") String query, @Name("limit") long limit) { 
        if (!db.index().existsForNodes(indexName)) { 
            log.debug("如果索引不存在则跳过本次查询:`%s`", indexName); 
            return Stream.empty(); 
        } 
        return db.index() 
                .forNodes(indexName, FULL_INDEX_CONFIG) 
                .query(new QueryContext(query).sortByScore().top((int) limit)) 
                .stream() 
                .map(ChineseHit::new);  // provider 
    } 

   
    

【跨标签类型检索】使用addChineseFulltextIndex给标签下节点属性添加的索引,默认可以使用chineseFulltextIndexSearch合并检索出来

// 增加一个非Loc标签的节点,然后使用检索 
CREATE (n:LocProvince {name:'P'})  SET n.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!' RETURN n 
// 节点增加索引(索引名与已有相同) 
CALL zdr.index.addChineseFulltextIndex('IKAnalyzer', 'LocProvince', ['description','year']) YIELD message RETURN message 
// 通过属性检索节点 
CALL zdr.index.chineseFulltextIndexSearch('IKAnalyzer', 'description:复联', 100) YIELD node,weight RETURN node 

   
    

在这里插入图片描述

五、总结

上述NEO4J中文全文索引解决方法,索引不会自动更新,修改节点属性以及新增节点时都需要重新建立索引。
NEO4J默认索引实现参考:neo4j-lucene-index
在这里插入图片描述

原文地址:https://blog.csdn.net/superman_xxx/article/details/89204456

评论关闭
IT序号网

微信公众号号:IT虾米 (左侧二维码扫一扫)欢迎添加!