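1. Getting Started with ES

Introduction

https://www.elastic.co/guide/en/elasticsearch/reference/7.5/index.html

Elasticsearch is a real-time distributed search and analytics engine. It lets you work with large volumes of data at high speed, and it is used for full-text search, structured search, analytics, and any combination of the three.

  • Elasticsearch is not only for large enterprises; it has also helped startups such as Datadog and Klout turn early ideas into scalable products.

  • It runs on a laptop and also scales out to hundreds of servers handling petabytes of data.

  • Elasticsearch is an open-source search engine built on Apache Lucene™. Lucene is arguably the most advanced, best-performing, and most full-featured search engine library available, open source or proprietary. But Lucene is only a library. To use it you have to write Java and integrate it directly into your application. Worse, Lucene is very complex, and you need a solid grounding in information retrieval to understand how it works.

  • Elasticsearch is written in Java and uses Lucene internally for all indexing and searching, but its goal is to hide Lucene's complexity behind a simple RESTful API so that full-text search becomes easy.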

ELK

Ready to use

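Use cases

Wikipedia, Stack Overflow, GitHub, online stores

Wikipedia uses Elasticsearch for full-text search with highlighted keywords, plus search-as-you-type and did-you-mean suggestions.
Stack Overflow combines full-text search with geolocation queries and uses more-like-this to find related questions and answers.
GitHub uses Elasticsearch to search 130 billion lines of code.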

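Comparison with Solr

1. ES works essentially out of the box (unzip and run). Solr takes slightly more effort to install.
2. Solr relies on ZooKeeper for distributed coordination, while Elasticsearch has coordination built in.
3. Solr accepts more data formats, such as JSON, XML, and CSV, while Elasticsearch only accepts JSON.
4. Solr ships more features officially. Elasticsearch concentrates on core features and leaves most advanced ones to plugins; for example, a graphical interface requires Kibana.
5. Solr queries quickly but updates its index slowly (inserts and deletes are slow), so it suits query-heavy applications such as e-commerce.

  • ES builds its index quickly, so newly written data becomes searchable in near real time, at the cost of somewhat slower plain queries. It is used for search at sites such as Facebook and Sina.
  • Solr is a strong solution for traditional search applications, while Elasticsearch is better suited to newer real-time search applications.

6. Solr is more mature and has a larger, more established community of users, developers, and contributors. Elasticsearch has fewer developers and maintainers, changes quickly between releases, and costs more to learn.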

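Installation

9200: HTTP

9300: TCP (transport between nodes)

Requirements: JDK 1.8, Elasticsearch, a client, and a GUI tool

https://www.elastic.co/cn/elasticsearch/

Memory settings

Edit config/jvm.options. The default heap is 1 GB; lower it on a small machine:

-Xms256m

Start ES

elasticsearch.bat

Visualization plugin

https://github.com/mobz/elasticsearch-head

npm install

npm run start

Fixing cross-origin (CORS) errors: add these lines to config/elasticsearch.yml so that elasticsearch-head can connect

http.cors.enabled: true
http.cors.allow-origin: "*"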

Installing with Docker

(1) Pull Elasticsearch and Kibana

# the two versions must match
docker pull elasticsearch:7.6.2
docker pull kibana:7.6.2

(2) Configure

mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >/mydata/elasticsearch/config/elasticsearch.yml
chmod -R 777 /mydata/elasticsearch/

(3) Start Elasticsearch

docker run --name elasticsearch  -m 300M --memory-swap -1 -p 9200:9200 -p 9300:9300 \
-e  "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx128m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v  /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.6.2 

Restart Elasticsearch automatically when Docker starts

docker update elasticsearch --restart=always

(4) Start Kibana:

docker run --name kibana -m 300M --memory-swap -1 -e ELASTICSEARCH_HOSTS=http://106.75.103.68:9200 -p 5601:5601 -d kibana:7.6.2

Restart Kibana automatically when Docker starts

docker update kibana  --restart=always

Kibana

https://www.elastic.co/cn/downloads/kibana

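Kibana is a free and open user interface that lets you visualize your Elasticsearch data and navigate the Elastic Stack. You can do anything from tracking query load to understanding how requests flow through your applications.

Start it with /bin/kibana

Port: 5601

Config file: config/kibana.yml

i18n.locale: "zh-CN"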

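Basic concepts

1. Index

2. Field types (mapping)

3. Documents

Everything is JSON. Behind the scenes each index is divided into shards, and each shard can move between servers in the cluster.

Relational DB      Elasticsearch
database           index
table              type
row                document
column             field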

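Documents (data)

A document is one record. Elasticsearch is document-oriented: the document is the smallest unit that can be indexed and searched.

Key properties

  • Self-contained: a document holds both its fields and their values (key: value)
  • Hierarchical (JSON)
  • Flexible structure: no predefined schema is required, and new fields can be added dynamically

Types (tables)

A type is a logical container for documents, just as a table is a container for rows in a relational database. The definition of the fields in a type is called a mapping; for example, name is mapped as a string. Mapping types are deprecated in 7.x, where an index holds the single type _doc.

Documents are said to be schemaless: they do not need to contain every field defined in the mapping. When a document introduces a new field, Elasticsearch adds it to the mapping automatically, but it has to guess the type. If the value is 18, Elasticsearch maps the field as an integer type (long). The guess can be wrong, so the safest approach is to define the mapping up front. In that respect it ends up like a relational database: define the fields first, then use them.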
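
The type guessing described above can be imitated in a few lines of plain Java. This is only a rough sketch of the idea, not Elasticsearch code: the class `DynamicMappingSketch` and the method `guessType` are made-up names, and the date check is reduced to a single `yyyy-MM-dd` pattern as a simplifying assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DynamicMappingSketch {
    // Pick a field type from the Java value that a JSON parser would produce.
    static String guessType(Object value) {
        if (value instanceof Boolean) return "boolean";
        if (value instanceof Integer || value instanceof Long) return "long";
        if (value instanceof Float || value instanceof Double) return "float";
        if (value instanceof String) {
            // Simplification: only the yyyy-MM-dd shape is treated as a date here.
            String s = (String) value;
            return s.matches("\\d{4}-\\d{2}-\\d{2}") ? "date" : "text";
        }
        return "object";
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", "明月");
        doc.put("age", 18);            // a JSON number
        doc.put("ageAsString", "18");  // the same value sent as a string
        doc.put("birthday", "2020-01-01");
        doc.forEach((field, value) -> System.out.println(field + " -> " + guessType(value)));
    }
}
```

The `ageAsString` line shows how a guess goes wrong: a number sent in quotes is mapped as text, which is why an explicit mapping is safer.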

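Indices (databases)

An index is a container for mapping types and is a very large collection of documents. It stores the fields of its mapping types along with other settings, and its data is stored across shards. Here is how shards work.

A cluster has at least one node, and a node is one ES process. A node can hold several indices. A new index gets 5 primary shards by default before 7.0 and 1 primary shard from 7.0 on, and each primary shard has one replica.

A primary shard and its replica are never placed on the same node, so losing one node does not lose data. Each shard is a Lucene index: a directory of files that contains an inverted index. The inverted index lets ES find which documents contain a given term without scanning every document.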
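
How a document ends up on one particular shard can be sketched as "hash the routing value, then take it modulo the number of primary shards". The code below is an illustration under assumptions: Elasticsearch uses its own hash function, and `String.hashCode` stands in for it here; `ShardRoutingSketch` and `shardFor` are made-up names.

```java
class ShardRoutingSketch {
    // The routing value is the document id unless a custom routing is given.
    // floorMod keeps the result in [0, primaryShards) even for negative hashes.
    static int shardFor(String routing, int primaryShards) {
        return Math.floorMod(routing.hashCode(), primaryShards);
    }

    public static void main(String[] args) {
        int primaryShards = 5;
        for (int id = 1; id <= 5; id++) {
            String docId = String.valueOf(id);
            System.out.println("doc " + docId + " -> shard " + shardFor(docId, primaryShards));
        }
    }
}
```

Because the result depends on the shard count, the number of primary shards is fixed when the index is created; changing it would send existing ids to different shards.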

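Inverted index

Lucene's inverted index is the underlying structure, and it is well suited to fast full-text search.

Documents (data) are split into terms at index time, and each term points to the documents that contain it.

To find documents with a given tag, only the tag column of the index has to be consulted; the documents themselves are not scanned.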
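
A toy version of the structure makes the idea concrete. The sketch below is not Lucene: it splits on whitespace instead of using a real analyzer, and `InvertedIndexSketch` with its `add` and `search` methods is a made-up name for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class InvertedIndexSketch {
    // term -> ids of the documents that contain it, kept sorted
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // Index time: split the text into terms and record the document id under each term.
    void add(int docId, String text) {
        for (String term : text.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Search time: one map lookup, no scan over the documents.
    List<Integer> search(String term) {
        return new ArrayList<>(postings.getOrDefault(term.toLowerCase(), new TreeSet<>()));
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "study every day");
        index.add(2, "good good study");
        index.add(3, "day day up");
        System.out.println("study -> " + index.search("study"));
        System.out.println("day   -> " + index.search("day"));
    }
}
```

A search for a term touches only that term's posting list, which is the "only look at one column" behaviour the notes describe.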

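Analyzers

Tokenizing: splitting a sentence into individual keywords

IK analyzer plugin

The plugin version must match the Elasticsearch version

IK provides two Chinese analyzers: ik_smart (coarsest split, fewest tokens) and ik_max_word (finest-grained split, exhausting the dictionary)

https://github.com/medcl/elasticsearch-analysis-ik

Windows

Unzip the plugin into the es plugins directory, then check that it is listed:

G:\elasticsearch-7.10.0-windows-x86_64\elasticsearch-7.10.0\bin>elasticsearch-plugin list
elasticsearch-analysis-ik-7.10.0

Linux

wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.6.2/elasticsearch-analysis-ik-7.6.2.zip

unzip elasticsearch-analysis-ik-7.6.2.zip -d /server/elasticsearch/plugins/ik # unzip into an ik directory, or unzip elsewhere and mv it into plugins

# cd /usr/share/elasticsearch/bin

# elasticsearch-plugin list
# ik in the output means the install worked
# restart the container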

Test

// coarsest split
GET _analyze
{
  "analyzer": "ik_smart",
  "text": "明月复苏"
}
// finest-grained split
GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "明月复苏"
}
// standard analyzer
GET _analyze
{
  "analyzer": "standard",
  "text": "明月复苏"
}
// keyword analyzer: the text is not split
GET _analyze
{
  "analyzer": "keyword",
  "text": "明月复苏"
}

Result

{
  "tokens" : [
    {
      "token" : "明月",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "复苏",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 1
    }
  ]
}

明月复苏 was split into two tokens, 明月 and 复苏.
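
Why the phrase is split comes down to the dictionary: a dictionary-based segmenter can only emit words it knows. The sketch below uses forward maximum matching, a deliberately simple technique that is swapped in for illustration; IK's real algorithm is more elaborate. `DictSegmentSketch` and `segment` are made-up names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DictSegmentSketch {
    // Forward maximum matching: at each position take the longest dictionary word,
    // falling back to a single character when nothing matches.
    static List<String> segment(String text, Set<String> dict) {
        int maxLen = 1;
        for (String word : dict) maxLen = Math.max(maxLen, word.length());
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + maxLen);
            while (end > i + 1 && !dict.contains(text.substring(i, end))) end--;
            tokens.add(text.substring(i, end));
            i = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("明月", "复苏"));
        System.out.println(segment("明月复苏", dict)); // two tokens
        dict.add("明月复苏");                           // extend the dictionary
        System.out.println(segment("明月复苏", dict)); // one token
    }
}
```

Adding the phrase to the dictionary is what changes the output, and that is the effect a custom IK dictionary entry is meant to have.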

Extending the dictionary

Words missing from the dictionary have to be added by hand

Edit IKAnalyzer.cfg.xml in /usr/share/elasticsearch/plugins/ik/config

1. Local dictionary

IKAnalyzer.cfg.xml

<entry key="ext_dict">ming.dic</entry>

Add ming.dic to the same directory, containing:

明月复苏

2. Remote dictionary

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- local extension dictionary -->
    <entry key="ext_dict"></entry>
     <!-- local extension stopword dictionary -->
    <entry key="ext_stopwords"></entry>
    <!-- remote extension dictionary; here nginx serves the word list -->
    <entry key="remote_ext_dict">http://106.75.103.68/es/fenci.txt</entry> 
    <!-- remote extension stopword dictionary -->
    <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>

REST style

Index operations

# close an index
POST /my_index/_close
# open an index
POST /my_index/_open 

PUT index/data

PUT

PUT /index_name/[type_name, defaults to _doc]/document_id # like http://localhost/database/table/id
{request body}
PUT database1 
{
  "mappings": {
    "table1": { // type level: on 7.x drop it or add ?include_type_name=true
      "properties": {
        "message": {
          "type": "keyword", // text fields do not accept doc_values
          "index": false, // not searchable
          "doc_values": false // redundant data: no aggregation or sorting needed
        }
      }
    }
  }
}


PUT /database/table/1
{
  "name": "明月复苏啊",
  "age": 3
}

The response reports created on the first write and updated afterwards:

{
    "result": "created/updated"
}

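Field types

  • String
    • text, keyword (not analyzed; exact match only)
  • Numeric
    • long, integer, short, byte, double, float, half_float, scaled_float
  • Date
    • date
  • Binary
    • binary
  • nested (prevents flattening and is required for arrays of objects; without it the objects in the array are flattened into class.name = ["1", "2"])
  • ...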

Creating an index with a mapping (PUT)

PUT /ming1
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "age": {
        "type": "long"
      },
      "birthday": {
        "type": "date"
      }
    }
  }
}

A field with no explicit type gets a default type from dynamic mapping

Reading the mapping (GET)

GET ming1

Updating (POST)


POST /ming1/_update/1
{
    "doc": {
        "name": "name1"
    }
}

Deleting (DELETE)

DELETE ming1

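Other

GET _cat/nodes #  list all nodes; * marks the master node
GET _cat/health   #  cluster health
    the first two columns are timestamps
    cluster: cluster name
    status: green means healthy; yellow means every primary shard is allocated but at least one replica is missing, so the data is still complete; red means some primary shards are unavailable and data may have been lost.
    node.total: number of nodes online
    node.data: number of data nodes online
    shards (active_shards): number of active shards
    pri (active_primary_shards): number of active primary shards; with one replica per primary, shards is normally twice pri.
    relo (relocating_shards): shards being relocated, normally 0
    init (initializing_shards): shards being initialized, normally 0
    unassign (unassigned_shards): unassigned shards, normally 0
    pending_tasks: queued tasks such as shard relocation, normally 0
    max_task_wait_time: longest time a task has been waiting
    active_shards_percent: percentage of active shards, normally 100%
GET _cat/master #  show the master node
GET _cat/indices?v  #  list all indices (databases)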
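
The column list above maps directly onto the text that `_cat/health?v` prints: one header line and one value line. As a small sketch, the two lines can be zipped into a map and the status interpreted. The sample row is invented for illustration, and `CatHealthSketch`, `parse`, and `describe` are made-up names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CatHealthSketch {
    // Pair each header column with the value in the same position.
    static Map<String, String> parse(String header, String row) {
        String[] keys = header.trim().split("\\s+");
        String[] values = row.trim().split("\\s+");
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i < Math.min(keys.length, values.length); i++) {
            out.put(keys[i], values[i]);
        }
        return out;
    }

    // Meaning of the three status colours, as described in the notes.
    static String describe(String status) {
        switch (status) {
            case "green":  return "all primary and replica shards are allocated";
            case "yellow": return "all primaries allocated, at least one replica missing";
            case "red":    return "some primary shards are unavailable";
            default:       return "unknown status";
        }
    }

    public static void main(String[] args) {
        String header = "epoch timestamp cluster status node.total node.data shards pri relo init "
                + "unassign pending_tasks max_task_wait_time active_shards_percent";
        // Invented sample: a single-node cluster whose replicas cannot be allocated.
        String row = "1600000000 12:26:40 elasticsearch yellow 1 1 6 6 0 0 3 0 - 66.7%";
        Map<String, String> health = parse(header, row);
        System.out.println(health.get("status") + ": " + describe(health.get("status")));
        System.out.println("unassigned shards: " + health.get("unassign"));
    }
}
```

A single-node setup like the Docker one in these notes typically shows yellow, because a replica is never placed on the same node as its primary.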

Document operations

put

PUT replaces the whole document: any field left out of the request body is removed

PUT /mingyue/user/3
{
  "name": "明月",
  "age": 3,
  "desc": "法外狂徒啦",
  "tags": ["帅哥", "直男", "交友"]
}

get

get mingyue/user/3

post

More flexible

doc

POST /ming/type1/
{ // no id is given, so one is generated automatically

}
// _update
POST /mingyue/user/3/_update // partial update with doc; if the data is identical nothing changes, not even the version number
{
  "doc": {
    "name": "明月2"  
  }
}
// without _update the document is overwritten
PUT /mingyue/user/3
{
  "name": "明月2",
  "age": 3,
  "desc": "法外狂徒啦",
  "tags": ["帅哥", "直男", "交友"]
}

Search

A GET by id returns:

{
    "_index": "database",
    "_type": "table",
    "_id": "1",
    "_version": 1,
    "_seq_no": 0,  // used for optimistic concurrency control
    "_primary_term": 1,
    "found": true,
    "_source": {
        "name": "mingyue"
    }
}

Simple search

GET /mingyue/user/1

GET /mingyue/user/_search?q=name:明月

Complex search

  • query: the query clause; boost adjusts a clause's relative score

  • _source: which fields to return

  • sort: ordering

  • from, size: paging (like LIMIT in MySQL)

get mingyue/user/_search
{
  "query": {
    "match": {
      "name": "三"
    }
  },
  "_source": ["name", "desc"],
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ],
  "from": 0,
  "size": 11
}


// a hit only needs to contain one of the analyzed terms
get mingyue/user/_search
{
  "query": {
    "match": {
      "tags": "交友 明月"
    }
  }
}

Combining conditions

  • must: the clause has to match

  • should: optional; a match raises the score

  • must_not: the clause must not match

get mingyue/user/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "明月"
          }
        },
        {
          "match": {
            "age": 3
          }
        }
      ]
    }
  }
}

filter

Filters the results (range with gt/lt, and so on) without contributing to the score

get mingyue/user/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "age": {
              "gte": 3,
              "lte": 20
            }
          }
        },
        {
          "term": {
            "catalogId": "225"
          }
        },
        {
          "terms": {
            "brandId": [
              "1",
              "2",
              "9"
            ]
          }
        }
      ]
    }
  }
}

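Exact lookup

Looks up the exact term in the inverted index

Use match for text fields and term for non-text fields

  • match: the query string is run through the analyzer and the resulting terms are looked up; querying name.keyword requires the whole value to match
  • match_all: matches every document
  • match_phrase: treats the query as one phrase
  • multi_match: matches the keywords against several fields (OR)
  • term: the query is not analyzed and is looked up exactly; not recommended for full-text fields
  • keyword: a field type whose value is not run through an analyzer

Several alternatives can be combined with should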
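
The difference between term and match is easy to see on a toy model. In the sketch below, `analyze` imitates what the standard analyzer does with Chinese text (one token per character); `term` looks the query up as-is, while `match` analyzes the query first. All names are made up, and the model ignores scoring entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TermVsMatchSketch {
    // Stand-in for the standard analyzer on Chinese text: one token per character.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (char c : text.toCharArray()) {
            if (!Character.isWhitespace(c)) tokens.add(String.valueOf(c));
        }
        return tokens;
    }

    // term: the query string is compared with the indexed tokens unchanged.
    static boolean term(List<String> indexedTokens, String query) {
        return indexedTokens.contains(query);
    }

    // match: the query string is analyzed first; any matching token is enough.
    static boolean match(List<String> indexedTokens, String query) {
        for (String token : analyze(query)) {
            if (indexedTokens.contains(token)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> textField = analyze("明月复苏");               // text field: analyzed
        List<String> keywordField = Arrays.asList("明月复苏");      // keyword field: stored whole
        System.out.println("term  on text:    " + term(textField, "明月"));
        System.out.println("match on text:    " + match(textField, "明月"));
        System.out.println("term  on keyword: " + term(keywordField, "明月复苏"));
    }
}
```

The term query on the text field misses because no single indexed token equals 明月, which is the reason the notes advise match for text fields and term for keyword-like fields.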

get mingyue/user/_search
{
  "query": {
    "match": {
      "name.keyword": "明月"  //  the whole value must be equal
    }
  }
}

get mingyue/user/_search
{
  "query": {
    "term": {
      "name": "明月"
    }
  }
}

get bank/_search
{
  "query": {
    "match_phrase": { // phrase match
        "address": "mill"
    }
  }
}

get mingyue/user/_search
{
  "query": {
    "multi_match": { // match against several fields
        "query": "mill movico", // the query string is analyzed into terms
        "fields": ["address", "city"]
    }
  }
}

Highlighting

get mingyue/user/_search
{
  "query": {
    "match": {
      "name": "明月"
    }
  },
  "highlight": {
    "pre_tags": "<p class='key' style='color:red'>",
    "post_tags": "</p>", 
    "fields": {
      "name": {}
    }
  }
}

Result

"hits" : [
      {
        "_index" : "mingyue",
        "_type" : "user",
        "_id" : "3",
        "_score" : 1.7563686,
        "_source" : {
          "name" : "明月2",
          "age" : 3,
          "desc" : "法外狂徒啦",
          "tags" : [
            "帅哥",
            "直男",
            "交友"
          ]
        },
        "highlight" : {
          "name" : [
            "<p class='key' style='color:red'>明</p><p class='key' style='color:red'>月</p>2"
          ]
        }
      }
    ]

Aggregations

"aggs":{
    "aggs_name (a name for this aggregation, shown in the result)":{
        "AGG_TYPE (the kind of aggregation: avg, terms, ...)":{}
     }
}

terms buckets documents by distinct value; avg computes an average
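
For readers who know Java streams, terms and avg correspond closely to `groupingBy` with `counting` and to `averagingDouble`, and a sub-aggregation is a collector nested inside the grouping. The sketch below runs on an in-memory list with invented data; `AggSketch` and its methods are made-up names, and buckets are ordered by key here, whereas Elasticsearch orders terms buckets by document count by default.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AggSketch {
    static class Account {
        final int age;
        final double balance;
        Account(int age, double balance) { this.age = age; this.balance = balance; }
    }

    // terms on age: one bucket per distinct age, holding the document count
    static Map<Integer, Long> termsAgg(List<Account> docs) {
        return docs.stream()
                .collect(Collectors.groupingBy(a -> a.age, TreeMap::new, Collectors.counting()));
    }

    // avg on balance across all documents
    static double avgAgg(List<Account> docs) {
        return docs.stream().mapToDouble(a -> a.balance).average().orElse(0);
    }

    // terms on age with an avg sub-aggregation on balance inside each bucket
    static Map<Integer, Double> termsThenAvg(List<Account> docs) {
        return docs.stream()
                .collect(Collectors.groupingBy(a -> a.age, TreeMap::new,
                        Collectors.averagingDouble(a -> a.balance)));
    }

    public static void main(String[] args) {
        List<Account> docs = Arrays.asList(
                new Account(38, 1000), new Account(38, 3000),
                new Account(28, 5000), new Account(32, 7000));
        System.out.println("terms(age):            " + termsAgg(docs));
        System.out.println("avg(balance):          " + avgAgg(docs));
        System.out.println("terms(age)>avg(balance): " + termsThenAvg(docs));
    }
}
```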

Example 1

Find the age distribution, average age, and average balance of everyone whose address contains mill, without returning the matching documents themselves

GET bank/_search
{
  "query": {
    "match": {
      "address": "Mill"
    }
  },
  "aggs": {
    "ageAgg": {
      "terms": { // one bucket per distinct value
        "field": "age",
        "size": 10
      }
    },
    "ageAvg": {
      "avg": { // average
        "field": "age"
      }
    },
    "balanceAvg": {
      "avg": {
        "field": "balance"
      }
    }
  },
  "size": 0
}

Result

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "ageAgg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 38,
          "doc_count" : 2
        },
        {
          "key" : 28,
          "doc_count" : 1
        },
        {
          "key" : 32,
          "doc_count" : 1
        }
      ]
    },
    "ageAvg" : {
      "value" : 34.0
    },
    "balanceAvg" : {
      "value" : 25208.0
    }
  }
}

Example 2

Bucket by age and compute the average balance of the people in each age bucket

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageAgg": {
      "terms": {
        "field": "age",
        "size": 100
      },
      "aggs": {
        "ageAvg": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}

Result

{
  "took" : 49,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "ageAgg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 31,
          "doc_count" : 61,
          "ageAvg" : {
            "value" : 28312.918032786885
          }
        },
        {
          "key" : 39,
          "doc_count" : 60,
          "ageAvg" : {
            "value" : 25269.583333333332
          }
        },
        {
          "key" : 26,
          "doc_count" : 59,
          "ageAvg" : {
            "value" : 23194.813559322032
          }
        }
      ]
    }
  }
}

Example 3

Get the full age distribution and, for each age bucket, the average balance of M, the average balance of F, and the overall average balance

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageAgg": { // bucket by age
      "terms": {
        "field": "age",
        "size": 100
      },
      "aggs": {  //  two sub-aggregations inside each age bucket
        "genderAgg": { // 1. by gender
          "terms": {
            "field": "gender.keyword" // gender has to match exactly
          },
          "aggs": { 
            "balanceAvg": { // average balance per gender
              "avg": {
                "field": "balance"
              }
            }
          }
        },
        "ageBalanceAvg": { // 2. average balance for the whole age bucket
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}

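Return value

hits: information about the index and the matching documents

  • total number of results
  • the documents that were found
  • score: how well each document matches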

delete

  • DELETE index_name/type_name/document_id

  • Delete by query (note that the method is POST):

POST index_name/type_name/_delete_by_query
{
    "query":{
        "term":{
            "_id":100000100
        }
    }
}
  • Delete every document (the method is POST; this removes the data but keeps the index and its mapping)
POST /testindex/testtype/_delete_by_query?pretty
{
    "query": {
        "match_all": {
        }
    }
}

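Bulk API

The operations in a bulk request are independent of each other: if one fails, the others still run.

The bulk API executes the actions in order. If a single action fails for any reason, processing continues with the remaining actions. The response reports a status for each action, in the same order they were sent, so you can check whether a particular action failed.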
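
The request body for `_bulk` is newline-delimited JSON: one action line, optionally followed by one source line, each terminated by a newline. As a minimal sketch, the helpers below build such a body by hand and pick out failed positions from a list of per-item HTTP statuses. `BulkBodySketch` and its methods are made-up names, and the status list is invented; a real client would read the statuses from the response items.

```java
import java.util.ArrayList;
import java.util.List;

class BulkBodySketch {
    // index action: an action line followed by the document source
    static String indexAction(String id, String sourceJson) {
        return "{\"index\":{\"_id\":\"" + id + "\"}}\n" + sourceJson + "\n";
    }

    // delete action: an action line only, no source line
    static String deleteAction(String id) {
        return "{\"delete\":{\"_id\":\"" + id + "\"}}\n";
    }

    // Positions of the actions that failed; the others were still applied.
    static List<Integer> failedPositions(int[] itemStatuses) {
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < itemStatuses.length; i++) {
            if (itemStatuses[i] >= 300) failed.add(i);
        }
        return failed;
    }

    public static void main(String[] args) {
        String body = indexAction("1", "{\"name\":\"John Doe\"}")
                + indexAction("2", "{\"name\":\"John Doe\"}")
                + deleteAction("3");
        System.out.print(body);
        // Invented statuses: created, version conflict, ok
        System.out.println("failed positions: " + failedPositions(new int[]{201, 409, 200}));
    }
}
```

Because every action carries its own status, a caller checks each item rather than treating the whole request as passed or failed.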

Example 1: several documents in one request

// index overwrites an existing document; create does not
POST customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"John Doe"}
{"index":{"_id":"2"}}
{"name":"John Doe"}

Example 2: bulk actions that each name their own index

POST /_bulk
{"delete":{"_index":"website","_type":"blog","_id":"123"}}
{"create":{"_index":"website","_type":"blog","_id":"123"}}
{"title":"my first blog post"}
{"index":{"_index":"website","_type":"blog"}}
{"title":"my second blog post"}
{"update":{"_index":"website","_type":"blog","_id":"123"}}
{"doc":{"title":"my updated blog post"}}

Test data: https://github.com/elastic/elasticsearch/blob/master/docs/src/test/resources/accounts.json

POST /bank/account/_bulk with that data as the body loads it for testing

Mapping

GET /bank/_mapping

Type reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html

binary

  • Binary value encoded as a Base64 string.

boolean

  • true and false values.

Keywords

  • The keyword family, including keyword, constant_keyword, and wildcard.
  • Not analyzed; matched exactly

Numbers

  • Numeric types, such as long and double, used to express amounts.

Dates

  • Date types, including date and date_nanos.

alias

  • Defines an alias for an existing field.

text

nested

  • Without nested, an array of objects is flattened into per-field arrays such as student.name = []; declare the field as nested to keep each object intact
  • A nested field cannot be queried directly as Person.name; wrap the query in nested:
{
    "nested": {
        "path": "attrs",
        "query": {
            "bool": {
                "must": [
                    {
                        "term": {
                            "attrs.attrId": {
                                "value": "8"
                            }
                        }
                    },
                    {
                        "terms": {
                            "attrs.attrValue": [
                                "英特尔"
                            ]
                        }
                    }
                ]
            }
        }
    }
 }

Managing mappings

Create

put /my_index
{
    "mappings": {
        "properties": {
            "age": {
                "type" : "integer",
                "index": true // the default; with false the field cannot be searched
            }
        }
    }
}

Add a new field

put /my_index/_mapping

An existing field cannot be changed; create a new index with the right mapping and migrate the data

Migrate:

POST _reindex
{
    "source": {
        "index":  "bank",
        "type": "account"
    },
    "dest": {
        "index": "newbank" // index names must be lowercase
    }
}

Other operations

Data migration

get product/_mapping
put gulimall_product
{
    // paste the mappings returned for the old product index here
    // remove every index: false and doc_values: false; those settings save disk space and speed up indexing, but they stop the field from being searched
}

// migrate the data
post _reindex
{
  "source": {
    "index": "product"
  },
  "dest": {
    "index": "gulimall_product"
  }
}

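2. Integrating with Spring Boot

https://www.elastic.co/guide/en/elasticsearch/client

Getting started

Client libraries for ES

Sending the HTTP requests yourself also works

9300: TCP

  • spring-data-elasticsearch: transport-api.jar
    • the jar differs between Spring Boot versions and may not match the ES version
    • not recommended on ES 7.x, removed in 8

9200: HTTP

  • JestClient: unofficial, slow to update
  • RestTemplate: plain HTTP requests; you have to wrap the ES operations yourself
  • HttpClient: same as above
  • Elasticsearch-Rest-Client: official, wraps the ES operations, easy to use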

Dependencies

The client version must match the ES server version

Spring Boot pins its own ES version; override the property so that it matches your server
<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.10.0</elasticsearch.version>
</properties>

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>${elasticsearch.version}</version>
</dependency>

<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.62</version>
</dependency>

Documents travel as JSON, so a JSON library is used:

JSON.toJSONString(obj)

JSON.parseObject(str, Map.class)

Creating the project

Empty project; set the JDK, the compiler level, and the JavaScript language level (ES6)

The dependency versions must match

<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.10.0</elasticsearch.version>
</properties>

https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-getting-started-initialization.html

Initialization example: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.x/java-rest-high-getting-started-initialization.html

Configuration class

import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticSearchConfig {
    // request options shared by every call
    public static final RequestOptions COMMON_OPTIONS;
    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
        /*builder.addHeader("Authorization", "Bearer " + TOKEN);
        builder.setHttpAsyncResponseConsumerFactory(
                new HttpAsyncResponseConsumerFactory
                        .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024));*/
        COMMON_OPTIONS = builder.build();
    }
    @Bean
    public RestHighLevelClient esRestClient() {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("106.75.103.68", 9200, "http")
                )
        );
        return client;
    }
}

To set request headers

https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.x/java-rest-low-usage-requests.html#java-rest-low-usage-request-options

// TOKEN is a credential constant that you define yourself
private static final RequestOptions COMMON_OPTIONS;
static {
    RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
    builder.addHeader("Authorization", "Bearer " + TOKEN); 
    // 30 MB response buffer; 30 * 1024 * 1024 * 1024 would overflow int
    builder.setHttpAsyncResponseConsumerFactory(           
        new HttpAsyncResponseConsumerFactory
            .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024));
    COMMON_OPTIONS = builder.build();
}

API tests

Main API

// search
@Test
void search() throws IOException {
    // 1. build the search request
    SearchRequest searchRequest = new SearchRequest();
    searchRequest.indices("bank"); // index to search
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder(); // builder for the search body
    //        sourceBuilder.from()
    sourceBuilder.query( QueryBuilders.matchQuery("address", "mill") ); // query clause
    TermsAggregationBuilder ageAgg = AggregationBuilders.terms("ageAgg").field("age").size(10); // bucket by age
    sourceBuilder.aggregation(ageAgg);
    AvgAggregationBuilder balanceAvg = AggregationBuilders.avg("balanceAvg").field("balance"); // average over all hits
    sourceBuilder.aggregation(balanceAvg);
    System.out.println(sourceBuilder.toString());
    searchRequest.source(sourceBuilder); // attach the body to the request

    // 2. run the search
    SearchResponse searchResponse = client.search(searchRequest, ElasticSearchConfig.COMMON_OPTIONS);

    // 3. read the result
    System.out.println(searchResponse.toString());
    // 3.1 matching documents
    SearchHits hits = searchResponse.getHits();
    SearchHit[] searchHits = hits.getHits();
    for (SearchHit searchHit : searchHits) {
        //            searchHit.getIndex();searchHit.getType();
        String source = searchHit.getSourceAsString();
        Account account = JSON.parseObject(source, Account.class);
        System.out.println(account);
    }
    // 3.2 aggregation results
    Aggregations aggregations = searchResponse.getAggregations();
    /*for (Aggregation aggregation : aggregations.asList()) {
            aggregation.
        }*/
    Terms ageAgg1 = aggregations.get("ageAgg");
    for (Terms.Bucket bucket : ageAgg1.getBuckets()) {
        String keyAsString = bucket.getKeyAsString();
        System.out.println("age: " + keyAsString + "==>>" + bucket.getDocCount());
    }
    Avg balanceAvg1 = aggregations.get("balanceAvg");
    System.out.println("average balance: " + balanceAvg1.getValue());

}

// save or update
@Test
void saveOrUpdate() throws IOException {
    IndexRequest indexRequest = new IndexRequest("users");
    indexRequest.id("1");
    //      option 1:  indexRequest.source("username", "ming", "age", 18);
    User user = new User();
    user.setName("ming");
    user.setAge(18);
    user.setGender("男");
    indexRequest.source(JSON.toJSONString(user), XContentType.JSON);// document body plus its content type
    IndexResponse index = client.index(indexRequest, ElasticSearchConfig.COMMON_OPTIONS); // send the request
    System.out.println(index);
}

Detailed API

import com.alibaba.fastjson.JSON;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.FetchSourceContext;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.test.context.SpringBootTest;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

@SpringBootTest
class EsApiApplicationTests {
    @Autowired
    @Qualifier("restHighLevelClient") // select the bean by id (the @Bean method name)
    private RestHighLevelClient client;

    // create an index
    @Test
    void testCreateIndex() throws IOException {
        //  1. build the create-index request
        CreateIndexRequest request = new CreateIndexRequest("mingyue_index");
        //  2. have the client execute it
        CreateIndexResponse createIndexResponse = client.indices().create(request, RequestOptions.DEFAULT);
        System.out.println(createIndexResponse);
    }

    //  check whether an index exists
    @Test
    void testExistIndex() throws IOException  {
        GetIndexRequest request = new GetIndexRequest("mingyue_index");
        boolean exist = client.indices().exists(request, RequestOptions.DEFAULT);
        System.out.println(exist);
    }

    // delete an index
    @Test
    void testDeleteIndex()  throws IOException{
        DeleteIndexRequest request = new DeleteIndexRequest("mingyue_index");
        AcknowledgedResponse delete = client.indices().delete(request, RequestOptions.DEFAULT);
        System.out.println(delete.isAcknowledged());
    }

    // add a document
    @Test
    void testAddDocument() throws IOException {
        // the object to store
        User user = new User("明月", 3);
        // build the request
        IndexRequest request = new IndexRequest("ming_index");
        // request settings
        request.id("1");
        request.timeout(TimeValue.timeValueSeconds(1));
        request.timeout("1s"); // same timeout, string form
        // attach the document
        request.source(JSON.toJSONString(user), XContentType.JSON);
        // send the request and read the response
        IndexResponse indexResponse = client.index(request, RequestOptions.DEFAULT);
        System.out.println(indexResponse.toString());
        System.out.println(indexResponse.status()); // status of the operation, CREATED on first write
    }

    // check whether a document exists
    @Test
    void testIsExists() throws IOException {
        GetRequest getRequest = new GetRequest("ming_index", "1");
        // skip fetching _source; only existence matters
        getRequest.fetchSourceContext(new FetchSourceContext(false));
        getRequest.storedFields("_none_");
        boolean exists = client.exists(getRequest, RequestOptions.DEFAULT);
        System.out.println(exists);
    }

    // get a document
    @Test
    void testGetDoc() throws Exception {
        GetRequest getRequest = new GetRequest("ming_index", "1");
        GetResponse getResponse = client.get(getRequest, RequestOptions.DEFAULT);
        System.out.println(getResponse.getSourceAsString());
        System.out.println(getResponse);
    }

    // update a document
    @Test
    void testUpdateDoc() throws IOException {
        UpdateRequest updateRequest = new UpdateRequest("ming_index", "1");
        updateRequest.timeout("1s");
        User user = new User("明月陎", 18);
        updateRequest.doc(JSON.toJSONString(user), XContentType.JSON);
        UpdateResponse updateResponse;
        updateResponse = client.update(updateRequest, RequestOptions.DEFAULT);
        System.out.println(updateResponse.status());
    }

    // delete a document
    @Test
    void testDeleteDoc() throws IOException {
        DeleteRequest deleteRequest = new DeleteRequest("ming_index", "1");
        deleteRequest.timeout("1s");
        DeleteResponse deleteResponse = client.delete(deleteRequest, RequestOptions.DEFAULT);
        System.out.println(deleteResponse.status());
    }

    // bulk insert
    @Test
    void testBulkRequest() throws IOException {
        BulkRequest bulkRequest = new BulkRequest();
        bulkRequest.timeout("10s");
        List<User> list = new ArrayList<>();
        list.add(new User("ming1", 2));
        list.add(new User("ming12", 2));
        list.add(new User("ming13", 2));
        list.add(new User("ming2", 2));
        list.add(new User("ming4", 2));
        // queue one index action per object
        for(int i = 0; i < list.size(); i++) {
            bulkRequest.add(new IndexRequest("ming_index")
                .id("" + (i + 1)) // without an id a random one is generated
                .source(JSON.toJSONString(list.get(i)), XContentType.JSON)
            );
        }
        BulkResponse bulkResponse = client.bulk(bulkRequest, RequestOptions.DEFAULT);
        System.out.println(bulkResponse.hasFailures());
    }

    // search
    /**
     *  SearchRequest        the search request
     *  SearchSourceBuilder  builds the search body
     *  HighlightBuilder     highlighting
     *  TermQueryBuilder     exact (term) query
     *  MatchAllQueryBuilder
     *  xxx QueryBuilder
      */


    @Test
    void testSearch() throws IOException {
        // the search request
        SearchRequest searchRequest = new SearchRequest("ming_index");
        // build the search body
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        // query builders come from the QueryBuilders utility class
        //QueryBuilders.termQuery()
//        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("name", "ming");
        MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("name", "ming");
//        MatchAllQueryBuilder matchAllQueryBuilder = QueryBuilders.matchAllQuery();
        sourceBuilder.query(matchQueryBuilder);

//        sourceBuilder.from();
//        sourceBuilder.size();
        sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
        // attach the body to the request

        searchRequest.source(sourceBuilder);

        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        System.out.println(JSON.toJSONString(searchResponse.getHits()));
        for (SearchHit docFields : searchResponse.getHits().getHits()) {
            System.out.println(docFields.getSourceAsMap());
        }

    }

}

Crawler

Dependencies

jsoup parses the web pages

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.2</version>
</dependency>

Configuration

@Configuration
public class ElasticSearchConfig {
    // bean id = method name, bean class = return type
    @Bean
    public RestHighLevelClient restHighLevelClient() {

        return new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("localhost", 9200, "http"))
        );
    }
}

Implementation

@Service
public class ContentService {
    @Autowired
    private RestHighLevelClient restHighLevelClient;
    // crawl the pages for a keyword and store the results in the ES index
    public Boolean parseContent(String keywords) throws Exception {
        List<Content> contents = new HtmlParseUtil().parseJD(keywords);
        BulkRequest bulkRequest = new BulkRequest();
        bulkRequest.timeout("2m");
        for(int i = 0; i < contents.size(); i++) {
            bulkRequest.add(
                    new IndexRequest("jd_goods")
                        .source(JSON.toJSONString(contents.get(i)), XContentType.JSON)
            );

        }
        BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
//        System.out.println();
//        System.out.println(bulkResponse.status());
        return !bulkResponse.hasFailures();
    }

    public List<Map<String, Object>> searchPage(String keyword, int pageNo, int pageSize) throws Exception{
        if (pageNo <= 1 ) {
            pageNo = 1;
        }
        SearchRequest searchRequest = new SearchRequest("jd_goods");
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        // paging
        sourceBuilder.from(pageSize * (pageNo-1));
        sourceBuilder.size(pageSize);
        //  query clause: a term query, so the keyword is not analyzed and must equal an indexed token
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", keyword);
        // attach the query
        sourceBuilder.query(termQueryBuilder);
        sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
        // highlighting
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.field("title");
        highlightBuilder.requireFieldMatch(false); // also highlight fields the query did not target
        highlightBuilder.preTags("<span style='color:red;'>");
        highlightBuilder.postTags("</span>");
        sourceBuilder.highlighter(highlightBuilder);

        // execute
        searchRequest.source(sourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        // read the result
        ArrayList<Map<String, Object>> list = new ArrayList<>();
        for (SearchHit docFields : searchResponse.getHits().getHits()) {
            // highlighted fragments for the title field
            Map<String, HighlightField> highlightFields = docFields.getHighlightFields();
            HighlightField title = highlightFields.get("title");
            Map<String, Object> sourceAsMap = docFields.getSourceAsMap();// the original document
            // replace the plain title with the highlighted one
            if(title != null) {
                Text[] fragments = title.fragments();
                StringBuilder newTitle = new StringBuilder();
                for (Text fragment : fragments) {
                    newTitle.append(fragment);
                }
                sourceAsMap.put("title", newTitle.toString());
            }
            list.add( sourceAsMap );
        }
        return list;
    }
}

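Gulimall: home page search

DSL

Request parameters sent by the search page:

keyword = 小米
& sort = saleCount_desc/asc // sort field and direction
& hasStock = 0/1 // 1 = in stock, 0 = out of stock
& skuPrice = 400_1900 // price range, min_max
& brandId = 1 & brandId = 2 // repeatable (list)
& catalog3Id = 1
& attrs = 1_3G:4G:5G & attrs = 2_骁龙845 & attrs = 4_高清屏 // repeatable (list), format attrId_value1:value2
& pageNum = 1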
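
Each attrs value packs an attribute id and its accepted values into one string. A minimal sketch of how such a value could be split, using only the JDK; the class name AttrFilterSketch is hypothetical and not part of the project:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not taken from the project.
class AttrFilterSketch {
    final long attrId;
    final List<String> values;

    AttrFilterSketch(long attrId, List<String> values) {
        this.attrId = attrId;
        this.values = values;
    }

    // Parses one "attrs" value of the form <attrId>_<value1>:<value2>...
    static AttrFilterSketch parse(String raw) {
        // Split on the FIRST underscore only, so values may contain "_"
        int sep = raw.indexOf('_');
        if (sep <= 0 || sep == raw.length() - 1) {
            throw new IllegalArgumentException("expected <attrId>_<v1>:<v2>, got: " + raw);
        }
        long id = Long.parseLong(raw.substring(0, sep));
        List<String> values = Arrays.asList(raw.substring(sep + 1).split(":"));
        return new AttrFilterSketch(id, values);
    }

    public static void main(String[] args) {
        AttrFilterSketch f = parse("1_3G:4G:5G");
        System.out.println(f.attrId + " -> " + f.values);
    }
}
```

Splitting on the first underscore rather than with split("_") is a design choice of this sketch: it keeps a value such as HD_screen intact.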

Final query

GET gulimall_product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "skuTitle": "新手机"
          }
        }
      ],
      "filter": [
        {
          "term": {
            "catalogId": "225"
          }
        },
        {
          "terms": {
            "brandId": [
              "1",
              "2",
              "9"
            ]
          }
        },
        {
          "nested": {
            "path": "attrs",
            "query": {
              "bool": {
                "must": [
                  {
                    "term": {
                      "attrs.attrId": {
                        "value": "8"
                      }
                    }
                  },
                  {
                    "terms": {
                      "attrs.attrValue": [
                        "英特尔"
                      ]
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "term": {
            "hasStock": {
              "value": "false"
            }
          }
        },
        {
          "range": {
            "skuPrice": {
              "gte": -1,
              "lte": 1000
            }
          }
        }
      ]
    }
  },
  "sort": {
    "skuPrice": {
      "order": "desc"
    }
  },
  "from": 0,
  "size": 1,
  "highlight": {
    "fields": {
      "skuTitle": {}
    },
    "pre_tags": "<b style='color:red'>",
    "post_tags": "</b>"
  },
  "aggs": {
    "brand_agg": {
      "terms": {
        "field": "brandId",
        "size": 10
      },
      "aggs": {
        "brand_name_agg": {
          "terms": {
            "field": "brandName",
            "size": 10
          }
        },
        "brand_img_agg": {
          "terms": {
            "field": "brandImg",
            "size": 10
          }
        }
      }
    },
    "catalog_agg": {
      "terms": {
        "field": "catalogId",
        "size": 10
      },
      "aggs": {
        "catalog_name_agg": {
          "terms": {
            "field": "catalogName",
            "size": 10
          }
        }
      }
    },
    "attr_agg": {
      "nested": {
        "path": "attrs"
      },
      "aggs": {
        "attr_id_agg": {
          "terms": {
            "field": "attrs.attrId",
            "size": 10
          },
          "aggs": {
            "attr_name_agg": {
              "terms": {
                "field": "attrs.attrName",
                "size": 10
              }
            },
            "attr_value_agg": {
              "terms": {
                "field": "attrs.attrValue",
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}

Query analysis

Full-text matching (match) plus filters (filter): nested attributes, category, brand, price range (range), stock

Sorting (sort)

Paging (from, size)

Highlighting (highlight)
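
For paging, the page number sent by the front end has to be converted into a from offset, and the total hit count into a page count. A minimal sketch of that arithmetic, assuming 1-based page numbers; the class and method names are hypothetical:

```java
// Hypothetical helper for illustration only; not taken from the project.
class PageMathSketch {

    // "from" offset for a 1-based page number; pages below 1 are treated as page 1
    static int fromOffset(int pageNum, int pageSize) {
        int safePage = Math.max(pageNum, 1);
        return (safePage - 1) * pageSize;
    }

    // Ceiling division: number of pages needed to show totalHits documents
    static int totalPages(long totalHits, int pageSize) {
        return (int) ((totalHits + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(3, 16));
        System.out.println(totalPages(33, 16));
    }
}
```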

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "skuTitle": "手机"
                    }
                }
            ],
            "filter": [
                {
                    "term": {
                        "catalogId": "225"
                    }
                },
                {
                    "terms": {
                        "brandId": ["1", "2", "9"]
                    }
                },
                {
                    "nested": {
                        "path": "attrs",
                        "query": {
                            "term": {
                                "attrs.attrId": "8"
                            }
                        }
                    }
                },
                {
                    "range": {
                        "skuPrice": {
                            "gte": 0,
                            "lte": 1000
                        }
                    }
                }
            ]
        }
    },
    "sort": {
        "skuPrice": {
            "order": "asc"
        }
    },
    "from": 0,
    "size": 100,
    "highlight": {
        "fields": {
            "skuTitle": {}
        },
        "pre_tags": "",
        "post_tags": ""
    }
}
GET product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "skuTitle": "新手机"
          }
        }
      ],
      "filter": [
        {
          "term": {
            "catalogId": "225"
          }
        },
        {
          "terms": {
            "brandId": [
              "1",
              "2",
              "9"
            ]
          }
        },
        {
          "nested": {
            "path": "attrs",
            "query": {
              "bool": {
                "must": [
                  {
                    "term": {
                      "attrs.attrId": {
                        "value": "8"
                      }
                    }
                  },
                  {
                    "terms": {
                      "attrs.attrValue": [
                        "英特尔"
                      ]
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "term": {
            "hasStock": {
              "value": "false"
            }
          }
        },
        {
          "range": {
            "skuPrice": {
              "gte": -1,
              "lte": 1000
            }
          }
        }
      ]
    }
  },
  "sort": {
    "skuPrice": {
      "order": "desc"
    }
  },
  "from": 0,
  "size": 111,
  "highlight": {
    "fields": {
      "skuTitle": {}
    },
    "pre_tags": "<b style='color:red'>",
    "post_tags": "</b>"
  }
}

Aggregation analysis

Final version
GET gulimall_product/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "brand_agg": {
      "terms": {
        "field": "brandId",
        "size": 10
      },
      "aggs": {
        "brand_name_agg": {
          "terms": {
            "field": "brandName",
            "size": 10
          }
        },
        "brand_img_agg": {
          "terms": {
            "field": "brandImg",
            "size": 10
          }
        }
      }
    },
    "catalog_agg": {
      "terms": {
        "field": "catalogId",
        "size": 10
      },
      "aggs": {
        "catalog_name_agg": {
          "terms": {
            "field": "catalogName",
            "size": 10
          }
        }
      }
    },
    "attr_agg": {
      "nested": {
        "path": "attrs"
      },
      "aggs": {
        "attr_id_agg": {
          "terms": {
            "field": "attrs.attrId",
            "size": 10
          },
          "aggs": {
            "attr_name_agg": {
              "terms": {
                "field": "attrs.attrName",
                "size": 10
              }
            },
            "attr_value_agg": {
              "terms": {
                "field": "attrs.attrValue",
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}
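Aggregation fails on the old product index
Cause: the fields are mapped as non-searchable with doc values disabled, for example:
"brandName" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        }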

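Data migration:

GET product/_mapping
PUT gulimall_product
{
    // Paste the mappings returned for the old product index here,
    // with every "index": false and "doc_values": false removed.
    // Setting both to false saves disk space and speeds up indexing,
    // but the field can then be neither searched nor aggregated.
}

// Copy the data into the new index
POST _reindex
{
  "source": {
    "index": "product"
  },
  "dest": {
    "index": "gulimall_product"
  }
}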
nested

Response

{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 3.2580965,
    "hits" : [
      {
        "_index" : "gulimall_product",
        "_type" : "_doc",
        "_id" : "82",
        "_score" : 3.2580965,
        "_source" : {
          "attrs" : [
            {
              "attrId" : 7,
              "attrName" : "机身长度",
              "attrValue" : "158.3mm"
            },
            {
              "attrId" : 8,
              "attrName" : "CPU品牌",
              "attrValue" : "英特尔"
            }
          ],
          "brandId" : 6,
          "brandImg" : "https://ming-mall.oss-cn-beijing.aliyuncs.com/2021/10/31/8f88b9d0-1c5c-4b3b-8b8e-096b37bc4fc1_key_login.png",
          "brandName" : "苹果",
          "catalogId" : 225,
          "catalogName" : "手机",
          "hasStock" : false,
          "hotScore" : 0,
          "saleCount" : 0,
          "skuId" : 82,
          "skuImg" : "https://ming-mall.oss-cn-beijing.aliyuncs.com/2021/10/31/a4a6e721-7de0-4eb6-8490-add13221a7b7_key_login.png",
          "skuPrice" : 4444.0,
          "skuTitle" : "iphone11 aaa aaa 8G 16",
          "spuId" : 39
        },
        "highlight" : {
          "skuTitle" : [
            "<b style='color:red'>iphone11</b> aaa aaa 8G 16"
          ]
        }
      }
    ]
  },
  "aggregations" : {
    "brandAgg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 6,
          "doc_count" : 2,
          "brandImgAgg" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "https://ming-mall.oss-cn-beijing.aliyuncs.com/2021/10/31/8f88b9d0-1c5c-4b3b-8b8e-096b37bc4fc1_key_login.png",
                "doc_count" : 2
              }
            ]
          },
          "brandNameAgg" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "苹果",
                "doc_count" : 2
              }
            ]
          }
        }
      ]
    },
    "catalogAgg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 225,
          "doc_count" : 2,
          "catalogNameAgg" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "手机",
                "doc_count" : 2
              }
            ]
          }
        }
      ]
    },
    "attrs" : {
      "doc_count" : 66,
      "attrIdAgg" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : 8,
            "doc_count" : 64,
            "attrNameAgg" : {
              "doc_count_error_upper_bound" : 0,
              "sum_other_doc_count" : 0,
              "buckets" : [
                {
                  "key" : "CPU品牌",
                  "doc_count" : 64
                }
              ]
            },
            "attrValueAgg" : {
              "doc_count_error_upper_bound" : 0,
              "sum_other_doc_count" : 0,
              "buckets" : [
                {
                  "key" : "东西",
                  "doc_count" : 36
                },
                {
                  "key" : "英特尔",
                  "doc_count" : 28
                }
              ]
            }
          },
          {
            "key" : 7,
            "doc_count" : 2,
            "attrNameAgg" : {
              "doc_count_error_upper_bound" : 0,
              "sum_other_doc_count" : 0,
              "buckets" : [
                {
                  "key" : "机身长度",
                  "doc_count" : 2
                }
              ]
            },
            "attrValueAgg" : {
              "doc_count_error_upper_bound" : 0,
              "sum_other_doc_count" : 0,
              "buckets" : [
                {
                  "key" : "158.3mm",
                  "doc_count" : 2
                }
              ]
            }
          }
        ]
      }
    }
  }
}

Request parameters and result model

https://www.bilibili.com/video/BV1np4y1C7Yf?p=177&spm_id_from=pageDriver

Param

@Data
public class SearchParam {
    // Full-text keyword sent by the page
    private String keyword;

    // Brand ids, multiple selection allowed
    private List<Long> brandId;

    // Third-level category id
    private Long catalog3Id;

    // Sort condition, e.g. sort=saleCount_desc/asc (the part before "_" is used as the ES field name)
    private String sort;

    // Stock filter: 1 = in stock, 0 = out of stock
    private Integer hasStock;

    // Price range, e.g. 20_500
    private String skuPrice;

    // Attribute filters, e.g. attrs=1_安卓:其他&attrs=2_6寸:5寸
    private List<String> attrs;

    // Page number
    private Integer pageNum = 1;

    // Raw query string of the request
    private String _queryString;

}

SearchResult

@Data
public class SearchResult {
    // All products found by the query
    private List<SkuEsModel> product;

    // Current page number
    private Integer pageNum;

    // Total number of hits
    private Long total;

    // Total number of pages
    private Integer totalPages;

    // Page numbers to render in the pager
    private List<Integer> pageNavs;

    // All brands that appear in the current result
    private List<BrandVo> brands;

    // All attributes that appear in the current result
    private List<AttrVo> attrs;

    // All categories that appear in the current result
    private List<CatalogVo> catalogs;


    //=========================== everything above is returned to the page ============================//


    /* Breadcrumb navigation data */
    private List<NavVo> navs;

    @Data
    public static class NavVo {
        private String navName;
        private String navValue;
        private String link;
    }

    @Data
    @AllArgsConstructor
    public static class BrandVo {
        private Long brandId;
        private String brandName;
        private String brandImg;
    }

    @Data
    @AllArgsConstructor
    public static class AttrVo {
        private Long attrId;
        private String attrName;
        private List<String> attrValue;
    }

    @Data
    @AllArgsConstructor
    public static class CatalogVo {
        private Long catalogId;
        private String catalogName;
    }
}

Main logic

// controller
@GetMapping(value = {"/search.html","/"})
public String getSearchPage(SearchParam searchParam, Model model, HttpServletRequest request) {
    // Keep the raw query string; it is needed later to build the breadcrumb links
    searchParam.set_queryString(request.getQueryString());
    SearchResult result = searchService.getSearchResult(searchParam);
    model.addAttribute("result", result);
    return "search";
}


// service
public SearchResult getSearchResult(SearchParam searchParam) {
    SearchResult searchResult = null;
    // Build the search request from the request parameters
    SearchRequest request = bulidSearchRequest( searchParam );
    try {
        SearchResponse searchResponse = restHighLevelClient.search( request, GulimallElasticSearchConfig.COMMON_OPTIONS );
        // Convert the ES response into the result model
        searchResult = bulidSearchResult( searchParam, searchResponse );
    } catch (IOException e) {
        // Note: on failure this method returns null, so the caller has to handle that case
        e.printStackTrace();
    }
    return searchResult;
}

Building the query

private SearchRequest bulidSearchRequest(SearchParam searchParam) {
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    //1. Build the bool query
    BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
    //1.1 bool must: full-text match on the title
    if (!StringUtils.isEmpty(searchParam.getKeyword())) {
        boolQueryBuilder.must(QueryBuilders.matchQuery("skuTitle", searchParam.getKeyword()));
    }

    //1.2 bool filter
    //1.2.1 catalog
    if (searchParam.getCatalog3Id() != null) {
        boolQueryBuilder.filter(QueryBuilders.termQuery("catalogId", searchParam.getCatalog3Id()));
    }
    //1.2.2 brand
    if (searchParam.getBrandId() != null && searchParam.getBrandId().size() > 0) {
        boolQueryBuilder.filter(QueryBuilders.termsQuery("brandId", searchParam.getBrandId()));
    }
    //1.2.3 hasStock
    if (searchParam.getHasStock() != null) {
        boolQueryBuilder.filter(QueryBuilders.termQuery("hasStock", searchParam.getHasStock() == 1));
    }
    //1.2.4 priceRange, e.g. 400_1900, _6000 or 400_
    if (!StringUtils.isEmpty(searchParam.getSkuPrice())) {
        RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("skuPrice");
        String[] prices = searchParam.getSkuPrice().split("_");
        if (prices.length == 1) {
            // "400_" splits into ["400"]: lower bound only
            rangeQueryBuilder.gte(Integer.parseInt(prices[0]));
        } else if (prices.length == 2) {
            // "_6000" splits into ["", "6000"]: upper bound only
            if (!prices[0].isEmpty()) {
                rangeQueryBuilder.gte(Integer.parseInt(prices[0]));
            }
            rangeQueryBuilder.lte(Integer.parseInt(prices[1]));
        }
        boolQueryBuilder.filter(rangeQueryBuilder);
    }
    //1.2.5 attrs (nested)
    // e.g. attrs=1_5寸:8寸&attrs=2_16G:8G
    List<String> attrs = searchParam.getAttrs();
    if (attrs != null && attrs.size() > 0) {
        attrs.forEach(attr -> {
            // One nested query per selected attribute: attrId and attrValue must match
            // inside the same nested object. Putting several attributes into a single
            // nested query would require one object to carry several attrIds at once.
            BoolQueryBuilder queryBuilder = new BoolQueryBuilder();
            String[] attrSplit = attr.split("_");
            queryBuilder.must(QueryBuilders.termQuery("attrs.attrId", attrSplit[0]));
            String[] attrValues = attrSplit[1].split(":");
            queryBuilder.must(QueryBuilders.termsQuery("attrs.attrValue", attrValues));
            NestedQueryBuilder nestedQueryBuilder = QueryBuilders.nestedQuery("attrs", queryBuilder, ScoreMode.None);
            boolQueryBuilder.filter(nestedQueryBuilder);
        });
    }
    //1. bool query complete
    searchSourceBuilder.query(boolQueryBuilder);

    //2. sort, e.g. sort=saleCount_desc/asc
    if (!StringUtils.isEmpty(searchParam.getSort())) {
        String[] sortSplit = searchParam.getSort().split("_");
        searchSourceBuilder.sort(sortSplit[0], sortSplit[1].equalsIgnoreCase("asc") ? SortOrder.ASC : SortOrder.DESC);
    }

    //3. Paging
    searchSourceBuilder.from((searchParam.getPageNum() - 1) * EsConstant.PRODUCT_PAGESIZE);
    searchSourceBuilder.size(EsConstant.PRODUCT_PAGESIZE);

    //4. Highlighting, only meaningful when a keyword was given
    if (!StringUtils.isEmpty(searchParam.getKeyword())) {
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.field("skuTitle");
        highlightBuilder.preTags("<b style='color:red'>");
        highlightBuilder.postTags("</b>");
        searchSourceBuilder.highlighter(highlightBuilder);
    }

    //5. Aggregations
    //5.1 Aggregate by brand, with name and image as sub-aggregations
    TermsAggregationBuilder brandAgg = AggregationBuilders.terms("brandAgg").field("brandId");
    TermsAggregationBuilder brandNameAgg = AggregationBuilders.terms("brandNameAgg").field("brandName");
    TermsAggregationBuilder brandImgAgg = AggregationBuilders.terms("brandImgAgg").field("brandImg");
    brandAgg.subAggregation(brandNameAgg);
    brandAgg.subAggregation(brandImgAgg);
    searchSourceBuilder.aggregation(brandAgg);

    //5.2 Aggregate by catalog
    TermsAggregationBuilder catalogAgg = AggregationBuilders.terms("catalogAgg").field("catalogId");
    TermsAggregationBuilder catalogNameAgg = AggregationBuilders.terms("catalogNameAgg").field("catalogName");
    catalogAgg.subAggregation(catalogNameAgg);
    searchSourceBuilder.aggregation(catalogAgg);

    //5.3 Aggregate by attrs (nested aggregation named "attrs" on path "attrs")
    NestedAggregationBuilder nestedAggregationBuilder = new NestedAggregationBuilder("attrs", "attrs");
    // Aggregate by attrId
    TermsAggregationBuilder attrIdAgg = AggregationBuilders.terms("attrIdAgg").field("attrs.attrId");
    // Within each attrId bucket, aggregate by attrName and attrValue
    TermsAggregationBuilder attrNameAgg = AggregationBuilders.terms("attrNameAgg").field("attrs.attrName");
    TermsAggregationBuilder attrValueAgg = AggregationBuilders.terms("attrValueAgg").field("attrs.attrValue");
    attrIdAgg.subAggregation(attrNameAgg);
    attrIdAgg.subAggregation(attrValueAgg);

    nestedAggregationBuilder.subAggregation(attrIdAgg);
    searchSourceBuilder.aggregation(nestedAggregationBuilder);

    log.debug("Generated DSL {}", searchSourceBuilder.toString());

    SearchRequest request = new SearchRequest(new String[]{EsConstant.PRODUCT_INDEX}, searchSourceBuilder);
    return request;
}
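
The skuPrice handling depends on how String.split treats leading and trailing empty strings, which is easy to get wrong. A minimal stand-alone sketch of the same parsing, assuming integer prices; the class PriceRangeSketch is hypothetical and uses split("_", -1) so that both open-ended forms produce two parts:

```java
// Hypothetical helper for illustration only; not taken from the project.
class PriceRangeSketch {
    final Integer min; // null means "no lower bound"
    final Integer max; // null means "no upper bound"

    PriceRangeSketch(Integer min, Integer max) {
        this.min = min;
        this.max = max;
    }

    // Accepts "400_1900", "_6000" (upper bound only) and "400_" (lower bound only)
    static PriceRangeSketch parse(String raw) {
        // A limit of -1 keeps trailing empty strings: "400_" -> ["400", ""]
        String[] parts = raw.split("_", -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected <min>_<max>, got: " + raw);
        }
        Integer min = parts[0].isEmpty() ? null : Integer.valueOf(parts[0]);
        Integer max = parts[1].isEmpty() ? null : Integer.valueOf(parts[1]);
        return new PriceRangeSketch(min, max);
    }

    public static void main(String[] args) {
        PriceRangeSketch r = parse("_6000");
        System.out.println("min=" + r.min + ", max=" + r.max);
    }
}
```

With the bounds as nullable values, the caller would apply gte only when min is set and lte only when max is set.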

Parsing the response

private SearchResult bulidSearchResult(SearchParam searchParam, SearchResponse searchResponse) {
    SearchResult result = new SearchResult();
    SearchHits hits = searchResponse.getHits();
    //1. Wrap the products that were found
    if (hits.getHits() != null && hits.getHits().length > 0) {
        List<SkuEsModel> skuEsModels = new ArrayList<>();
        for (SearchHit hit : hits) {
            String sourceAsString = hit.getSourceAsString();
            SkuEsModel skuEsModel = JSON.parseObject(sourceAsString, SkuEsModel.class);
            // Replace the title with its highlighted version when there is one
            if (!StringUtils.isEmpty(searchParam.getKeyword())) {
                HighlightField skuTitle = hit.getHighlightFields().get("skuTitle");
                if (skuTitle != null && skuTitle.getFragments().length > 0) {
                    skuEsModel.setSkuTitle(skuTitle.getFragments()[0].string());
                }
            }
            skuEsModels.add(skuEsModel);
        }
        result.setProduct(skuEsModels);
    }

    //2. Paging information
    //2.1 Current page number
    result.setPageNum(searchParam.getPageNum());
    //2.2 Total number of hits
    long total = hits.getTotalHits().value;
    result.setTotal(total);
    //2.3 Total number of pages (rounded up)
    Integer totalPages = (int)total % EsConstant.PRODUCT_PAGESIZE == 0 ?
            (int)total / EsConstant.PRODUCT_PAGESIZE : (int)total / EsConstant.PRODUCT_PAGESIZE + 1;
    result.setTotalPages(totalPages);
    List<Integer> pageNavs = new ArrayList<>();
    for (int i = 1; i <= totalPages; i++) {
        pageNavs.add(i);
    }
    result.setPageNavs(pageNavs);

    //3. Brands that appear in the result
    List<SearchResult.BrandVo> brandVos = new ArrayList<>();
    Aggregations aggregations = searchResponse.getAggregations();
    // ParsedLongTerms holds the result of a terms aggregation whose keys are long values
    ParsedLongTerms brandAgg = aggregations.get("brandAgg");
    for (Terms.Bucket bucket : brandAgg.getBuckets()) {
        //3.1 Brand id
        Long brandId = bucket.getKeyAsNumber().longValue();

        Aggregations subBrandAggs = bucket.getAggregations();
        //3.2 Brand image
        ParsedStringTerms brandImgAgg = subBrandAggs.get("brandImgAgg");
        String brandImg = brandImgAgg.getBuckets().get(0).getKeyAsString();
        //3.3 Brand name
        Terms brandNameAgg = subBrandAggs.get("brandNameAgg");
        String brandName = brandNameAgg.getBuckets().get(0).getKeyAsString();
        SearchResult.BrandVo brandVo = new SearchResult.BrandVo(brandId, brandName, brandImg);
        brandVos.add(brandVo);
    }
    result.setBrands(brandVos);

    //4. Categories that appear in the result
    List<SearchResult.CatalogVo> catalogVos = new ArrayList<>();
    ParsedLongTerms catalogAgg = aggregations.get("catalogAgg");
    for (Terms.Bucket bucket : catalogAgg.getBuckets()) {
        //4.1 Category id
        Long catalogId = bucket.getKeyAsNumber().longValue();
        Aggregations subcatalogAggs = bucket.getAggregations();
        //4.2 Category name
        ParsedStringTerms catalogNameAgg = subcatalogAggs.get("catalogNameAgg");
        String catalogName = catalogNameAgg.getBuckets().get(0).getKeyAsString();
        SearchResult.CatalogVo catalogVo = new SearchResult.CatalogVo(catalogId, catalogName);
        catalogVos.add(catalogVo);
    }
    result.setCatalogs(catalogVos);

    //5. Attributes that appear in the result
    List<SearchResult.AttrVo> attrVos = new ArrayList<>();
    // ParsedNested holds the result of a nested aggregation
    ParsedNested parsedNested = aggregations.get("attrs");
    ParsedLongTerms attrIdAgg = parsedNested.getAggregations().get("attrIdAgg");
    for (Terms.Bucket bucket : attrIdAgg.getBuckets()) {
        //5.1 Attribute id
        Long attrId = bucket.getKeyAsNumber().longValue();

        Aggregations subAttrAgg = bucket.getAggregations();
        //5.2 Attribute name
        ParsedStringTerms attrNameAgg = subAttrAgg.get("attrNameAgg");
        String attrName = attrNameAgg.getBuckets().get(0).getKeyAsString();
        //5.3 Attribute values
        ParsedStringTerms attrValueAgg = subAttrAgg.get("attrValueAgg");
        List<String> attrValues = new ArrayList<>();
        for (Terms.Bucket attrValueAggBucket : attrValueAgg.getBuckets()) {
            attrValues.add(attrValueAggBucket.getKeyAsString());
        }
        SearchResult.AttrVo attrVo = new SearchResult.AttrVo(attrId, attrName, attrValues);
        attrVos.add(attrVo);
    }
    result.setAttrs(attrVos);

    //6. Build the breadcrumb navigation from the selected attribute filters
    List<String> attrs = searchParam.getAttrs();
    if (attrs != null && attrs.size() > 0) {
        List<SearchResult.NavVo> navVos = attrs.stream().map(attr -> {
            String[] split = attr.split("_");
            SearchResult.NavVo navVo = new SearchResult.NavVo();
            //6.1 Attribute value
            navVo.setNavValue(split[1]);
            //6.2 Look up the attribute name through the product service
            try {
                R r = productFeignService.info(Long.parseLong(split[0]));
                if (r.getCode() == 0) {
                    AttrResponseVo attrResponseVo = JSON.parseObject(JSON.toJSONString(r.get("attr")), new TypeReference<AttrResponseVo>() {
                    });
                    navVo.setNavName(attrResponseVo.getAttrName());
                }
            } catch (Exception e) {
                log.error("Remote call to the product service for attribute info failed", e);
            }
            //6.3 Link that removes this filter from the current query.
            // Note: getQueryString() returns the URL-encoded form, so this replacement only
            // matches when attr contains no characters that get encoded.
            String queryString = searchParam.get_queryString();
            String replace = queryString.replace("&attrs=" + attr, "").replace("attrs=" + attr+"&", "").replace("attrs=" + attr, "");
            navVo.setLink("http://search.gulimall.com/search.html" + (replace.isEmpty()?"":"?"+replace));
            return navVo;
        }).collect(Collectors.toList());
        result.setNavs(navVos);
    }
    return result;
}
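
The breadcrumb link is the current query string with one attrs filter removed. String replacement on the raw query string misses values that arrive URL-encoded, so a more tolerant approach is to compare decoded key/value pairs. A minimal sketch using only the JDK; the class and method names are hypothetical:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not taken from the project.
class QueryStringSketch {

    // Returns queryString without the pair whose key is `name` and whose decoded value is `value`
    static String removeParam(String queryString, String name, String value) {
        if (queryString == null || queryString.isEmpty()) {
            return "";
        }
        List<String> kept = new ArrayList<>();
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            if (key.equals(name) && decode(rawValue).equals(value)) {
                continue; // drop exactly this filter, keep everything else untouched
            }
            kept.add(pair);
        }
        return String.join("&", kept);
    }

    private static String decode(String s) {
        try {
            // Handles both %XX escapes and "+" for spaces
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(removeParam("keyword=a&attrs=1_3G%3A4G&pageNum=2", "attrs", "1_3G:4G"));
    }
}
```

Because the remaining pairs are copied verbatim, the rest of the link keeps whatever encoding the browser sent.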

3. Common operations

Importing and exporting data

https://github.com/elasticsearch-dump/elasticsearch-dump

npm install elasticdump -g
elasticdump
# Copy an index from one cluster to another
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=http://staging.es.com:9200/my_index \
  --type=data
# Export to JSON: mapping and data go to separate files so the second export does not collide with the first
elasticdump --input=http://112.124.15.81:9200/gulimall_product --output=F:\laji\dumpsearch\gulimall_elasticsearch_mapping.json --type=mapping
elasticdump --input=http://112.124.15.81:9200/gulimall_product --output=F:\laji\dumpsearch\gulimall_elasticsearch_data.json --type=data
# Import the data file
elasticdump --input=F:\laji\dumpsearch\gulimall_elasticsearch_data.json --output=http://112.124.15.81:9200 --type=data

ELK

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.1.2-linux-x86_64.tar.gz
wget https://artifacts.elastic.co/downloads/logstash/logstash-8.1.2-linux-x86_64.tar.gz
wget https://artifacts.elastic.co/downloads/kibana/kibana-8.1.2-linux-x86_64.tar.gz

tar -zxvf

# Note: the paths below are for 7.10.2 unpacked under /usr/local; adjust them to the version you downloaded
cd /usr/local/elasticsearch-7.10.2/config/
vim elasticsearch.yml
node.name: node-1
path.data: /usr/local/elasticsearch-7.10.2/data
path.logs: /usr/local/elasticsearch-7.10.2/logs
network.host: 127.0.0.1
http.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["127.0.0.1"]
cluster.initial_master_nodes: ["node-1"]
# Create a dedicated es user (Elasticsearch refuses to start as root)
useradd es
chown -R es:es /usr/local/elasticsearch-7.10.2
su - es
/usr/local/elasticsearch-7.10.2/bin/elasticsearch -d 
# Check that port 9200 responds

# logstash
tar -zxvf logstash-7.10.2.tar.gz -C /usr/local
# Create the pipeline configuration file
cd /usr/local/logstash-7.10.2/bin
vim logstash-elasticsearch.conf
input {
  file {
    path => "/home/ruoyi/logs/sys-*.log"
    start_position => beginning
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
      negate => true
      auto_flush_interval => 3
      what => previous
    }
  }
}

filter {
  if [path] =~ "info" {
    mutate { replace => { type => "sys-info" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  } else if [path] =~ "error" {
    mutate { replace => { type => "sys-error" } }
  } else {
    mutate { replace => { type => "random_logs" } }
  }
}

output {
  elasticsearch {
    hosts => '120.78.129.95:9200'
  }
  stdout { codec => rubydebug }
}

# Run from the bin directory, where the configuration file was created
./logstash -f logstash-elasticsearch.conf
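
The multiline codec settings (negate => true, what => previous) mean that any line not starting with a timestamp is appended to the previous event, which keeps a Java stack trace together with its log line. A minimal sketch of that grouping rule in plain Java, as an illustration of the behaviour rather than of Logstash internals; the class MultilineSketch is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it mimics the grouping rule, not Logstash itself.
class MultilineSketch {
    // Same regular expression as the `pattern` setting in the configuration
    private static final Pattern EVENT_START =
            Pattern.compile("^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}");

    // Lines that do NOT match the pattern are appended to the previous event
    static List<String> group(List<String> lines) {
        List<String> events = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            boolean startsEvent = EVENT_START.matcher(line).find();
            if (startsEvent || current == null) {
                if (current != null) {
                    events.add(current.toString());
                }
                current = new StringBuilder(line);
            } else {
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            events.add(current.toString());
        }
        return events;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("2022-04-01 10:00:00 ERROR boom");
        lines.add("java.lang.IllegalStateException: boom");
        lines.add("2022-04-01 10:00:01 INFO ok");
        System.out.println(group(lines).size());
    }
}
```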


# ---------- kibana
# Edit the configuration
cd /usr/local/kibana-7.10.2/config
vim kibana.yml
server.port: 5601 
server.host: "0.0.0.0" 
elasticsearch.hosts: ["http://120.78.129.95:9200"] 
kibana.index: ".kibana"
# Give the es user ownership of the Kibana directory
chown -R es:es /usr/local/kibana-7.10.2/
# Start Kibana
# Switch to the es user first
su - es
# Start in the background
/usr/local/kibana-7.10.2/bin/kibana &