I. Getting Started with ES
Introduction
https://www.elastic.co/guide/en/elasticsearch/reference/7.5/index.html
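Elasticsearch is a real-time, distributed search and analytics engine. It lets you work with large volumes of data at a speed that was not practical before. It is used for full-text search, structured search, analytics, and any combination of the three.
Elasticsearch is not only for large enterprises: it has also helped startups such as Datadog and Klout turn early ideas into scalable solutions.
It can run on your laptop, or across hundreds of servers handling petabytes of data.
Elasticsearch is an open-source search engine built on Apache Lucene(TM). Lucene is arguably the most advanced, best-performing, and most fully featured search engine library to date, open source or proprietary. But Lucene is only a library. To use it you must work in Java and integrate it directly into your application. Worse, Lucene is very complex, and you need a solid grounding in information retrieval to understand how it works.
Elasticsearch is written in Java and uses Lucene at its core for all indexing and search, but its goal is to hide Lucene's complexity behind a simple RESTful API so that full-text search becomes easy.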
ELK (Elasticsearch, Logstash, Kibana)
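Use cases
Wikipedia, Stack Overflow, GitHub, online stores
Wikipedia uses Elasticsearch for full-text search with highlighted keywords, plus search suggestions such as search-as-you-type and did-you-mean corrections.
Stack Overflow combines full-text search with geolocation queries, and uses more-like-this to find related questions and answers.
GitHub uses Elasticsearch to search 130 billion lines of code.
Comparison with Solr
1. ES works almost out of the box (unzip and run), which makes it very simple. Solr is slightly more involved to install.
2. Solr relies on ZooKeeper for distributed coordination, whereas Elasticsearch has distributed coordination built in.
3. Solr supports more data formats, such as JSON, XML, and CSV, whereas Elasticsearch only supports JSON.
4. Solr ships more features officially, whereas Elasticsearch focuses on core functionality and leaves many advanced features to third-party plugins; for example, a graphical interface requires Kibana.
5. Solr is fast at queries but slow at updating the index (inserts and deletes), so it suits query-heavy applications such as e-commerce.
- ES is fast at building indexes (at the cost of plain query speed), so it is fast at real-time search; it is used for search at Facebook, Sina, and similar sites.
- Solr is a strong solution for traditional search applications, while Elasticsearch fits newer real-time search applications better.
6. Solr is more mature, with a larger and more established community of users, developers, and contributors, whereas Elasticsearch has fewer developers and maintainers, changes quickly, and costs more to learn and use.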
Installation
9200:HTTP
9300:TCP
JDK 1.8, the Elasticsearch client, a UI tool
https://www.elastic.co/cn/elasticsearch/
Memory settings
jvm.options
-Xms256m (the default is 1g)
Starting ES
elasticsearch.bat
Visualization plugin
https://github.com/mobz/elasticsearch-head
npm install
npm run start
Fixing CORS (in elasticsearch.yml)
http.cors.enabled: true
http.cors.allow-origin: "*"
Installing with Docker
(1) Pull Elasticsearch and Kibana
# the versions must match
docker pull elasticsearch:7.6.2
docker pull kibana:7.6.2
(2) Configuration
mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >/mydata/elasticsearch/config/elasticsearch.yml
chmod -R 777 /mydata/elasticsearch/
(3) Start Elasticsearch
docker run --name elasticsearch -m 300M --memory-swap -1 -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx128m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.6.2
Restart elasticsearch automatically on boot
docker update elasticsearch --restart=always
(4) Start Kibana:
docker run --name kibana -m 300M --memory-swap -1 -e ELASTICSEARCH_HOSTS=http://106.75.103.68:9200 -p 5601:5601 -d kibana:7.6.2
Restart kibana automatically on boot
docker update kibana --restart=always
Kibana
https://www.elastic.co/cn/downloads/kibana
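Kibana is a free and open user interface that lets you visualize your Elasticsearch data and navigate the Elastic Stack. You can do anything from tracking query load to understanding how requests flow through your application.
/bin/kibana
Port 5601
config: kibana.yml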
i18n.locale: "zh-cn"
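Basic concepts
1. Index
2. Field types (mapping)
3. Documents
Everything is JSON. Behind the scenes, each index is split into multiple shards, and each shard can be moved between servers in the cluster.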
Relational DB | Elasticsearch |
---|---|
Database | Index |
Table | Type (types) |
Row | Document (documents) |
Column | Field (fields) |
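Documents (data)
Documents are the individual records. Elasticsearch is document-oriented: the document is the smallest unit that can be indexed and searched.
Key properties
- Self-contained: a document holds both the fields and their values (key: value).
- Hierarchical (JSON).
- Flexible structure: documents do not depend on a predefined schema, and new fields can be added dynamically.
Types (tables)
A type is a logical container for documents, just as a table is a container for rows in a relational database. The definition of the fields within a type is called a mapping; for example, name is mapped to a string type.
We say documents are schemaless: they do not need to have every field defined in the mapping. When a new field appears,
Elasticsearch automatically adds it to the mapping. Since it does not know the field's type, it guesses: if the value is 18, Elasticsearch treats it as an integer type. The guess can be wrong, so the safest approach is to define the mappings you need in advance, which ends up in the same place as a relational database: define the fields first, then use them.
Indices (databases)
An index is a container for mapping types; in Elasticsearch an index is a very large collection of documents. The index stores the fields of its mapping types and other settings, and is then stored across shards. Let's look at how shards work.
A cluster has at least one node, and a node is one ES process. A node can hold multiple indices. By default a new index gets 5 primary shards before 7.0 (1 primary shard from 7.0 on), and each primary shard has one replica.
A primary shard and its replica are never placed on the same node, so data is not lost if a node fails. A shard is a Lucene index: a directory of files containing an inverted index. The inverted index lets ES know which documents contain a given keyword without scanning every document.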
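Inverted index
ES uses Lucene's inverted index as its underlying structure, which is well suited to fast full-text search.
Documents (data) are split into terms.
A lookup then only needs to check the tag (term) column.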
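A minimal sketch of the idea, assuming a toy analyzer that only lower-cases and splits on whitespace (the class and method names are made up for illustration; this is not how Lucene is implemented):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

public class InvertedIndexSketch {
    // Maps each term to the sorted set of document ids that contain it.
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Toy "analysis": lower-case and split on whitespace.
    // Real analyzers (standard, ik_smart, ...) are far more involved.
    public void add(int docId, String text) {
        for (String term : text.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Looks up one term directly, without scanning any document.
    public SortedSet<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "study every day");
        index.add(2, "good good study");
        System.out.println(index.search("study")); // [1, 2]
        System.out.println(index.search("good"));  // [2]
    }
}
```

The lookup cost depends on the number of distinct terms, not on the number of documents, which is why full-text search over an inverted index stays fast.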
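Analyzers
Tokenization: splitting a sentence into individual keywords
IK analyzer plugin
The plugin version must match the ES version
Chinese analyzer: IK provides two tokenization algorithms, ik_smart (fewest splits) and ik_max_word (finest-grained splitting, exhausting the dictionary)
https://github.com/medcl/elasticsearch-analysis-ik
Windows
Place it in the ES plugins directory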
G:\elasticsearch-7.10.0-windows-x86_64\elasticsearch-7.10.0\bin>elasticsearch-plugin list
elasticsearch-analysis-ik-7.10.0
Linux
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.6.2/elasticsearch-analysis-ik-7.6.2.zip
unzip elasticsearch-analysis-ik-7.6.2.zip -d /server/elasticsearch/plugins/ik # unzip into an ik directory; mv it under plugins if needed
# cd /usr/share/elasticsearch/bin
# elasticsearch-plugin list
# ik appears in the list on success
# restart the container
Test
// fewest splits
GET _analyze
{
"analyzer": "ik_smart",
"text": "明月复苏"
}
// most splits
GET _analyze
{
"analyzer": "ik_max_word",
"text": "明月复苏"
}
// standard analyzer
GET _analyze
{
"analyzer": "standard",
"text": "明月复苏"
}
// not split at all
GET _analyze
{
"analyzer": "keyword",
"text": "明月复苏"
}
Result
{
"tokens" : [
{
"token" : "明月",
"start_offset" : 0,
"end_offset" : 2,
"type" : "CN_WORD",
"position" : 0
},
{
"token" : "复苏",
"start_offset" : 2,
"end_offset" : 4,
"type" : "CN_WORD",
"position" : 1
}
]
}
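明月复苏 was split apart.
Extending the dictionary
Words missing from the dictionary have to be added by hand,
in /usr/share/elasticsearch/plugins/ik/config,
in the file IKAnalyzer.cfg.xml
1. Local
IKAnalyzer.cfg.xml
<entry key="ext_dict">ming.dic</entry>
Add ming.dic in that directory, containing:
明月复苏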
2. Remote
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<!-- configure your own extension dictionary here -->
<entry key="ext_dict"></entry>
<!-- configure your own extension stopword dictionary here -->
<entry key="ext_stopwords"></entry>
<!-- configure a remote extension dictionary here; nginx serves the word list -->
<entry key="remote_ext_dict">http://106.75.103.68/es/fenci.txt</entry>
<!-- configure a remote extension stopword dictionary here -->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
REST style
Index operations
# close an index
POST /my_index/_close
# open an index
POST /my_index/_open
PUT index/data
PUT
PUT /index_name/[type_name, defaults to _doc]/document_id # analogous to http://localhost/database/table/id
{request body}
PUT database1
{
"mappings": {
"table1": {
"properties": {
"message": {
"type": "text",
"index": false, // not indexed, so not searchable
"doc_values": false // redundant data that needs no aggregation or sorting
}
}
}
}
}
PUT /database/table/1
{
"name": "明月复苏啊",
"age": 3
}
{
"result": "create/update"
}
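Field types
- Strings
- text, keyword (no full-text matching)
- Numeric types
- long, integer, short, byte, double, float, half_float, scaled_float
- Dates
- date
- Binary
- binary
- nested (avoids flattening; use it for arrays of objects. Without it, the objects in the array are flattened into class.name = ["1", "2"])
- ...
PUT index mapping rules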
PUT /ming1
{
"mappings": {
"properties": {
"name": {
"type": "text"
},
"age": {
"type": "long"
},
"birthday": {
"type": "date"
}
}
}
}
Fields with no explicit type get a default type assigned by ES
GET mapping information
GET ming1
POST update
POST /ming1/type1/1/_update
{
"doc": {
"name": "name1"
}
}
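DELETE
DELETE ming1
Other
GET _cat/nodes # list all nodes; * marks the master node
GET _cat/health # cluster health
The first two columns are timestamps
cluster: the cluster name
status: cluster state. green means healthy; yellow means all primary shards are allocated but at least one replica is missing, so the data is still complete; red means some primary shards are unavailable and data may have been lost.
node.total: total number of nodes online
node.data: number of data nodes online
shards (active_shards): number of active shards
pri (active_primary_shards): number of active primary shards. With one replica per primary, shards is normally twice pri.
relo (relocating_shards): shards being relocated; normally 0
init (initializing_shards): shards being initialized; normally 0
unassign (unassigned_shards): unassigned shards; normally 0
pending_tasks: pending tasks, such as shard relocation; normally 0
max_task_wait_time: the longest time a task has been waiting
active_shards_percent: percentage of active shards; normally 100%
GET _cat/master # show the master node
GET _cat/indices?v # list all indices (databases)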
Document operations
put
PUT replaces the whole document, so any field left out of the request body is lost
PUT /mingyue/user/3
{
"name": "明月",
"age": 3,
"desc": "法外狂徒啦",
"tags": ["帅哥", "直男", "交友"]
}
get
get mingyue/user/3
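post
More flexible
doc
POST /ming/type1/ // the id is generated automatically
{
}
// _update
POST /mingyue/user/3/_update // partial update via doc; if the data is identical, nothing changes, including the version number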
{
"doc": {
"name": "明月2"
}
}
// without _update, the whole document is overwritten
PUT /mingyue/user/3
{
"name": "明月2",
"age": 3,
"desc": "法外狂徒啦",
"tags": ["帅哥", "直男", "交友"]
}
Search
{
"_index": "database",
"_type": "table",
"_id": "1",
"_version": 1,
"_seq_no": 0, // used for optimistic concurrency control
"_primary_term": 1,
"found": true,
"_source": {
"name": "mingyue"
}
}
Simple search
GET /mingyue/user/1
GET /mingyue/user/_search?q=myname:明月
Complex search
query: the query clauses; boost adjusts relative scoring
_source: which fields to return
sort: ordering
from, size: pagination (like MySQL LIMIT)
get mingyue/user/_search
{
"query": {
"match": {
"name": "三"
}
},
"_source": ["name", "desc"],
"sort": [
{
"age": {
"order": "asc"
}
}
],
"from": 0,
"size": 11
}
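The from value is a zero-based offset, so a 1-based page number has to be converted first. A minimal sketch, assuming 1-based page numbers (the class and method names are made up for illustration):

```java
public class PageOffset {
    // Converts a 1-based page number into the zero-based "from" offset.
    // Page numbers below 1 are treated as page 1.
    public static int from(int pageNo, int pageSize) {
        int safePage = Math.max(pageNo, 1);
        return (safePage - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(from(1, 11)); // 0
        System.out.println(from(3, 11)); // 22
    }
}
```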
// a hit on any one analyzed term is enough
get mingyue/user/_search
{
"query": {
"match": {
"tags": "交友 明月"
}
}
}
Multi-condition queries
must: has to match
should: optional, raises the score
must_not: must not match
get mingyue/user/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "明月"
}
},
{
"match": {
"age": 3
}
}
]
}
}
}
filter
Filters results (gt/lt and so on) without contributing to the score
get mingyue/user/_search
{
"query": {
"bool": {
"filter": [
{
"range": {
"age": {
"gte": 3,
"lte": 20
}
}
},
{
"term": {
"catalogId": "225"
}
},
{
"terms": {
"brandId": [
"1",
"2",
"9"
]
}
}
]
}
}
}
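Exact lookup
Looks up the exact terms stored in the inverted index
Use match for text fields and term for non-text fields
- match: the query string is run through the analyzer, and the resulting terms are searched; name.keyword matches the whole value
- match_all
- match_phrase: treats the keywords as one phrase
- multi_match: matches the keywords across several fields (OR)
- term: no analysis, exact lookup; not recommended for full-text search
- keyword: a field type that is not processed by the analyzer
Use should to query for several values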
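The difference between term and match can be sketched as follows. This is a toy model, assuming an analyzer that only lower-cases and splits on whitespace; all names are made up for illustration:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TermVsMatchSketch {
    // Stand-in for an analyzer: lower-case and split on whitespace.
    static List<String> analyze(String text) {
        return Arrays.asList(text.toLowerCase().trim().split("\\s+"));
    }

    // term: the query string is compared with the indexed tokens as-is.
    static boolean term(Set<String> indexedTokens, String query) {
        return indexedTokens.contains(query);
    }

    // match: the query string is analyzed first; any shared token is a hit.
    static boolean match(Set<String> indexedTokens, String query) {
        for (String token : analyze(query)) {
            if (indexedTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Tokens stored for a text field holding "198 Mill Lane".
        Set<String> tokens = new HashSet<>(analyze("198 Mill Lane"));
        System.out.println(term(tokens, "Mill"));         // false
        System.out.println(match(tokens, "Mill"));        // true
        System.out.println(match(tokens, "mill movico")); // true
    }
}
```

In this model term misses on a text field because the stored tokens were lower-cased at index time while the query string was not analyzed.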
get mingyue/user/_search
{
"query": {
"match": {
"name.keyword": "明月" // the whole value must be equal
}
}
}
get mingyue/user/_search
{
"query": {
"term": {
"name": "明月"
}
}
}
{
"query": {
"match_phrase": { // phrase match
"address": "mill"
}
}
}
get mingyue/user/_search
{
"query": {
"multi_match": { // match across several fields
"query": "mill movico", // analyzed into separate terms
"fields": ["address", "city"]
}
}
}
Highlighting
get mingyue/user/_search
{
"query": {
"match": {
"name": "明月"
}
},
"highlight": {
"pre_tags": "<p class='key' style='color:red'>",
"post_tags": "</p>",
"fields": {
"name": {}
}
}
}
"hits" : [
{
"_index" : "mingyue",
"_type" : "user",
"_id" : "3",
"_score" : 1.7563686,
"_source" : {
"name" : "明月2",
"age" : 3,
"desc" : "法外狂徒啦",
"tags" : [
"帅哥",
"直男",
"交友"
]
},
"highlight" : {
"name" : [
"<p class='key' style='color:red'>明</p><p class='key' style='color:red'>月</p>2"
]
}
}
]
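Aggregations
"aggs":{
"aggs_name: a name for this aggregation, used to label it in the result":{
"AGG_TYPE: the aggregation type (avg, terms, ...)":{}
}
},
terms: buckets by distinct value; avg: average
Example 1
Find the age distribution and the average age of everyone whose address contains mill, without returning the matching documents themselves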
GET bank/_search
{
"query": {
"match": {
"address": "Mill"
}
},
"aggs": {
"ageAgg": {
"terms": { // one bucket per distinct value
"field": "age",
"size": 10
}
},
"ageAvg": {
"avg": { // average
"field": "age"
}
},
"balanceAvg": {
"avg": {
"field": "balance"
}
}
},
"size": 0
}
Result
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"ageAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 38,
"doc_count" : 2
},
{
"key" : 28,
"doc_count" : 1
},
{
"key" : 32,
"doc_count" : 1
}
]
},
"ageAvg" : {
"value" : 34.0
},
"balanceAvg" : {
"value" : 25208.0
}
}
}
Example 2
Aggregate by age, and compute the average balance for each age bucket
GET bank/_search
{
"query": {
"match_all": {}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 100
},
"aggs": {
"ageAvg": {
"avg": {
"field": "balance"
}
}
}
}
},
"size": 0
}
Result
{
"took" : 49,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"ageAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 31,
"doc_count" : 61,
"ageAvg" : {
"value" : 28312.918032786885
}
},
{
"key" : 39,
"doc_count" : 60,
"ageAvg" : {
"value" : 25269.583333333332
}
},
{
"key" : 26,
"doc_count" : 59,
"ageAvg" : {
"value" : 23194.813559322032
}
}
]
}
}
}
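Example 3
List the age distribution and, within each age bucket, the average balance for M, the average balance for F, and the overall average balance
GET bank/_search
{
"query": {
"match_all": {}
},
"aggs": {
"ageAgg": { // bucket by age
"terms": {
"field": "age",
"size": 100
},
"aggs": { // two sub-aggregations inside each age bucket
"genderAgg": { // 1. by gender
"terms": {
"field": "gender.keyword" // gender must match exactly
},
"aggs": {
"balanceAvg": { // average balance per gender
"avg": {
"field": "balance"
}
}
}
},
"ageBalanceAvg": { // 2. average balance for the whole age bucket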
"avg": {
"field": "balance"
}
}
}
}
},
"size": 0
}
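Response
hits: index and document information
- total number of matches
- the matching documents themselves
- score: how well each document matches
delete
DELETE index_name/type_name/document_id
Delete by query (note that the request method is POST):
POST index_name/type_name/_delete_by_query
{
"query":{
"term":{
"_id":100000100
}
}
}
- Delete all data (note that the request method is POST; only the data is deleted, the mapping is kept):
POST /testindex/testtype/_delete_by_query?pretty
{
"query": {
"match_all": {
}
}
}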
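Bulk API
In a bulk request, when one operation fails the others still run; the operations are independent of each other.
The bulk API executes all actions in order. If a single action fails for any reason, it continues with the remaining actions. When the bulk API returns, it reports the status of every action, in the order they were sent, so you can check whether a particular action failed.
Example 1: indexing several documents
// index overwrites an existing document; create does not
POST customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"John Doe"}
{"index":{"_id":"2"}}
{"name":"John Doe"}
Example 2: bulk operations through the top-level /_bulk endpoint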
POST /_bulk
{"delete":{"_index":"website","_type":"blog","_id":"123"}}
{"create":{"_index":"website","_type":"blog","_id":"123"}}
{"title":"my first blog post"}
{"index":{"_index":"website","_type":"blog"}}
{"title":"my second blog post"}
{"update":{"_index":"website","_type":"blog","_id":"123"}}
{"doc":{"title":"my updated blog post"}}
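The request body is newline-delimited: one action line, then (for index, create, and update) one data line, and the last line must also end with a newline. A minimal sketch that builds such a body for index actions, assuming the ids need no JSON escaping and each document is already a single-line JSON string (the helper name is made up for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BulkBodySketch {
    // Builds a bulk body of index actions: an action line followed by a
    // source line for every document, each terminated by '\n'.
    // Assumption: ids contain no characters that need JSON escaping.
    public static String indexActions(Map<String, String> docsById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : docsById.entrySet()) {
            body.append("{\"index\":{\"_id\":\"").append(e.getKey()).append("\"}}\n");
            body.append(e.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>(); // keeps insertion order
        docs.put("1", "{\"name\":\"John Doe\"}");
        docs.put("2", "{\"name\":\"Jane Doe\"}");
        System.out.print(indexActions(docs));
    }
}
```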
Test data: https://github.com/elastic/elasticsearch/blob/master/docs/src/test/resources/accounts.json
POST /bank/account/_bulk
followed by the data, to load it for testing
Mapping
GET /bank/_mapping
Type details: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html
binary
- Binary value encoded as a Base64 string.
boolean
- true and false values.
Keywords
- The keyword family, including keyword, constant_keyword, and wildcard.
- Not analyzed; exact match only
Numbers
- Numeric types, such as long and double, used to express amounts.
Dates
- Date types, including date and date_nanos.
alias
- Defines an alias for an existing field.
text
nested
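- Without nested, an array of objects is flattened into arrays such as student.name = []; declare the field as nested to keep each object intact
- When querying, Person.name cannot be used directly; wrap the query in nested: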
{
"nested": {
"path": "attrs",
"query": {
"bool": {
"must": [
{
"term": {
"attrs.attrId": {
"value": "8"
}
}
},
{
"terms": {
"attrs.attrValue": [
"英特尔"
]
}
}
]
}
}
}
}
Settings
Creating
put /my_index
{
"mappings": {
"properties": {
"age": {
"type" : "integer",
"index": true // searchable by default; with false the field cannot be searched
}
}
}
}
Adding a new field
put /my_index/_mapping
An existing field cannot be changed; create a new index and migrate the data
Migration:
POST _reindex
{
"source": {
"index": "bank",
"type": "account"
},
"dest": {
"index": "newbank"
}
}
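Other operations
Data migration
get product/_mapping
put gulimall_product
{
// paste the mapping returned for the old product index
// remove every index: false and doc_values: false; false saves disk space and speeds up indexing, but the field can then no longer be searched or aggregated
}
// migrate the data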
post _reindex
{
"source": {
"index": "product"
},
"dest": {
"index": "gulimall_product"
}
}
II. Integrating with Spring Boot
https://www.elastic.co/guide/en/elasticsearch/client
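Getting started
Client libraries for ES
Sending raw HTTP requests yourself also works
9300:TCP
- spring-data-elasticsearch: transport-api.jar
- the jar differs between Spring Boot versions and may not match the ES version
- deprecated in ES 7.x and removed in 8
9200:HTTP
- JestClient: unofficial, slow to update
- RestTemplate: sends plain HTTP requests; you have to wrap the operations yourself
- HttpClient: same as above
- Elasticsearch-Rest-Client: official, wraps the ES operations, simple to use
Dependencies
The client version must match the ES version
Spring Boot manages the ES version, so set the managed version property explicitly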
<properties>
<java.version>1.8</java.version>
<elasticsearch.version>7.10.0</elasticsearch.version>
</properties>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.6.2</version>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.62</version>
</dependency>
Data is sent as JSON:
JSON.toJSONString(obj)
JSON.parseObject(Str, Map.class)
Creating the project
Empty project; set the JDK, the compiler level, and ES6
The dependency versions must match
<properties>
<java.version>1.8</java.version>
<elasticsearch.version>7.10.0</elasticsearch.version>
</properties>
Configuration class
@Configuration
public class ElasticSearchConfig {
public static final RequestOptions COMMON_OPTIONS;
static {
RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
/*builder.addHeader("Authorization", "Bearer " + TOKEN);
builder.setHttpAsyncResponseConsumerFactory(
new HttpAsyncResponseConsumerFactory
.HeapBufferedResponseConsumerFactory(30 * 1024 * 1024 * 1024));*/
COMMON_OPTIONS = builder.build();
}
@Bean
public RestHighLevelClient esRestClient() {
RestHighLevelClient client = new RestHighLevelClient(
RestClient.builder(
new HttpHost("106.75.103.68", 9200, "http")
)
);
return client;
}
}
To set request headers:
private static final RequestOptions COMMON_OPTIONS;
static {
RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
builder.addHeader("Authorization", "Bearer " + TOKEN);
// 30 MB response buffer; 30 * 1024 * 1024 * 1024 would overflow int
builder.setHttpAsyncResponseConsumerFactory(
new HttpAsyncResponseConsumerFactory
.HeapBufferedResponseConsumerFactory(30 * 1024 * 1024));
COMMON_OPTIONS = builder.build();
}
Testing the API
Main API
// search
@Test
void search() throws IOException {
// 1. Create the search request
SearchRequest searchRequest = new SearchRequest();
searchRequest.indices("bank"); // index to search
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder(); // builder for the query body
// sourceBuilder.from()
sourceBuilder.query( QueryBuilders.matchQuery("address", "mill") ); // query clause
TermsAggregationBuilder ageAgg = AggregationBuilders.terms("ageAgg").field("age").size(10); // bucket by age
sourceBuilder.aggregation(ageAgg);
AvgAggregationBuilder balanceAvg = AggregationBuilders.avg("balanceAvg").field("balance"); // average over all hits
sourceBuilder.aggregation(balanceAvg);
System.out.println(sourceBuilder.toString());
searchRequest.source(sourceBuilder); // attach the query to the request
// 2. Run the search
SearchResponse searchResponse = client.search(searchRequest, ElasticSearchConfig.COMMON_OPTIONS);
// 3. Process the result
System.out.println(searchResponse.toString());
// 3.1 Matching documents
SearchHits hits = searchResponse.getHits();
SearchHit[] searchHits = hits.getHits();
for (SearchHit searchHit : searchHits) {
// searchHit.getIndex();searchHit.getType();
String source = searchHit.getSourceAsString();
Account account = JSON.parseObject(source, Account.class);
System.out.println(account);
}
// 3.2 Aggregation results
Aggregations aggregations = searchResponse.getAggregations();
/*for (Aggregation aggregation : aggregations.asList()) {
aggregation.
}*/
Terms ageAgg1 = aggregations.get("ageAgg");
for (Terms.Bucket bucket : ageAgg1.getBuckets()) {
String keyAsString = bucket.getKeyAsString();
System.out.println("age: " + keyAsString + " ==>> " + bucket.getDocCount());
}
Avg balanceAvg1 = aggregations.get("balanceAvg");
System.out.println("average balance: " + balanceAvg1.getValue());
}
// index a document (create or update)
@Test
void getOrUpdate() throws IOException {
IndexRequest indexRequest = new IndexRequest("users");
indexRequest.id("1");
// alternative: indexRequest.source("username", "ming", "age", 18);
User user = new User();
user.setName("ming");
user.setAge(18);
user.setGender("男");
indexRequest.source(JSON.toJSONString(user), XContentType.JSON); // request body: the JSON string and its content type
IndexResponse index = client.index(indexRequest, ElasticSearchConfig.COMMON_OPTIONS); // send the request
System.out.println(index);
}
Detailed API
class EsApiApplicationTests {
@Autowired
@Qualifier("restHighLevelClient") // bean id, i.e. the @Bean method name
private RestHighLevelClient client;
// create an index
@Test
void testCreateIndex() throws IOException {
// 1. Build the create-index request
CreateIndexRequest request = new CreateIndexRequest("mingyue_index");
// 2. Execute it with the client
CreateIndexResponse createIndexResponse = client.indices().create(request, RequestOptions.DEFAULT);
System.out.println(createIndexResponse);
}
// check whether the index exists
@Test
void testExistIndex() throws IOException {
GetIndexRequest request = new GetIndexRequest("mingyue_index");
boolean exist = client.indices().exists(request, RequestOptions.DEFAULT);
System.out.println(exist);
}
// delete the index
@Test
void testDeleteIndex() throws IOException{
DeleteIndexRequest request = new DeleteIndexRequest("mingyue_index");
AcknowledgedResponse delete = client.indices().delete(request, RequestOptions.DEFAULT);
System.out.println(delete.isAcknowledged());
}
// add a document
@Test
void testAddDocumnet() throws IOException {
// create the object
User user = new User("明月", 3);
// create the request
IndexRequest request = new IndexRequest("ming_index");
// options
request.id("1");
request.timeout(TimeValue.timeValueSeconds(1));
request.timeout("1s"); // equivalent string form
// attach the data
request.source(JSON.toJSONString(user), XContentType.JSON);
// send the request and read the result
IndexResponse indexResponse = client.index(request, RequestOptions.DEFAULT);
System.out.println(indexResponse.toString());
System.out.println(indexResponse.status()); // status of the call, e.g. CREATED
}
// check whether a document exists
@Test
void testIsExists() throws IOException {
GetRequest getRequest = new GetRequest("ming_index", "1");
// skip fetching _source
getRequest.fetchSourceContext(new FetchSourceContext(false));
getRequest.storedFields("_none_");
boolean exists = client.exists(getRequest, RequestOptions.DEFAULT);
System.out.println(exists);
}
// fetch a document
@Test
void testGetDoc() throws Exception {
GetRequest getRequest = new GetRequest("ming_index", "1");
GetResponse getResponse = client.get(getRequest, RequestOptions.DEFAULT);
System.out.println(getResponse.getSourceAsString());
System.out.println(getResponse);
}
// update a document
@Test
void testUpdateDoc() throws IOException {
UpdateRequest updateRequest = new UpdateRequest("ming_index", "1");
updateRequest.timeout("1s");
User user = new User("明月陎", 18);
updateRequest.doc(JSON.toJSONString(user), XContentType.JSON);
UpdateResponse updateResponse;
updateResponse = client.update(updateRequest, RequestOptions.DEFAULT);
System.out.println(updateResponse.status());
}
// delete a document
@Test
void testDeleteDoc() throws IOException {
DeleteRequest deleteRequest = new DeleteRequest("ming_index", "1");
deleteRequest.timeout("1s");
DeleteResponse deleteResponse = client.delete(deleteRequest, RequestOptions.DEFAULT);
System.out.println(deleteResponse.status());
}
// special case: bulk insert
@Test
void testBulkRequest() throws IOException {
BulkRequest bulkRequest = new BulkRequest();
bulkRequest.timeout("10s");
List<User> list = new ArrayList<>();
list.add(new User("ming1", 2));
list.add(new User("ming12", 2));
list.add(new User("ming13", 2));
list.add(new User("ming2", 2));
list.add(new User("ming4", 2));
// add one index action per item
for(int i = 0; i < list.size(); i++) {
bulkRequest.add(new IndexRequest("ming_index")
.id("" + (i + 1)) // without an id, one is generated
.source(JSON.toJSONString(list.get(i)), XContentType.JSON)
);
}
BulkResponse bulkResponse = client.bulk(bulkRequest, RequestOptions.DEFAULT);
System.out.println(bulkResponse.hasFailures());
}
// search
/**
* SearchRequest: the search request
* SearchSourceBuilder: builds the query body
* HighlightBuilder: highlighting
* TermQueryBuilder: exact term query
* MatchAllQueryBuilder
* xxx QueryBuilder
*/
@Test
void testSearch() throws IOException {
// the search request
SearchRequest searchRequest = new SearchRequest("ming_index");
// build the search body
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// query builders are available through the QueryBuilders helper
//QueryBuilders.termQuery()
// TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("name", "ming");
MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("name", "ming");
// MatchAllQueryBuilder matchAllQueryBuilder = QueryBuilders.matchAllQuery();
sourceBuilder.query(matchQueryBuilder);
// sourceBuilder.from();
// sourceBuilder.size();
sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
// attach the source builder
searchRequest.source(sourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
System.out.println(JSON.toJSONString(searchResponse.getHits()));
for (SearchHit docFields : searchResponse.getHits().getHits()) {
System.out.println(docFields.getSourceAsMap());
}
}
}
Crawler
Dependencies
The jsoup package parses web pages
<dependency>
<groupId>org.jsoup</groupId>
<artifactId>jsoup</artifactId>
<version>1.10.2</version>
</dependency>
Configuration
@Configuration
public class ElasticSearchConfig {
// the bean id is the method name; the bean class is the return type
@Bean
public RestHighLevelClient restHighLevelClient() {
return new RestHighLevelClient(
RestClient.builder(
new HttpHost("localhost", 9200, "http"))
);
}
}
Implementation
@Service
public class ContentService {
@Autowired
private RestHighLevelClient restHighLevelClient;
// index the parsed content into ES
public Boolean parseContent(String keywords) throws Exception {
List<Content> contents = new HtmlParseUtil().parseJD(keywords);
BulkRequest bulkRequest = new BulkRequest();
bulkRequest.timeout("2m");
for(int i = 0; i < contents.size(); i++) {
bulkRequest.add(
new IndexRequest("jd_goods")
.source(JSON.toJSONString(contents.get(i)), XContentType.JSON)
);
}
BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
// System.out.println();
// System.out.println(bulkResponse.status());
return !bulkResponse.hasFailures();
}
public List<Map<String, Object>> searchPage(String keyword, int pageNo, int pageSize) throws Exception{
if (pageNo <= 1 ) {
pageNo = 1;
}
SearchRequest searchRequest = new SearchRequest("jd_goods");
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// pagination
sourceBuilder.from(pageSize * (pageNo-1));
sourceBuilder.size(pageSize);
// query clause
TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", keyword);
// attach the query
sourceBuilder.query(termQueryBuilder);
sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
// highlighting
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.field("title");
highlightBuilder.requireFieldMatch(false); // highlight the field even if the query did not match on it
highlightBuilder.preTags("<span style='color:red;'>");
highlightBuilder.postTags("</span>");
sourceBuilder.highlighter(highlightBuilder);
// execute
searchRequest.source(sourceBuilder);
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
// parse the result
ArrayList<Map<String, Object>> list = new ArrayList<>();
for (SearchHit docFields : searchResponse.getHits().getHits()) {
// read the highlighted field
Map<String, HighlightField> highlightFields = docFields.getHighlightFields();
HighlightField title = highlightFields.get("title");
Map<String, Object> sourceAsMap = docFields.getSourceAsMap(); // the original source
// replace the title with its highlighted version
if(title != null) {
Text[] fragments = title.fragments();
StringBuilder newTitle = new StringBuilder();
for (Text fragment : fragments) {
newTitle.append(fragment);
}
sourceAsMap.put("title", newTitle.toString());
}
list.add( sourceAsMap );
}
return list;
}
}
Gulimall: home page search
dsl
keyword = 小米
& sort = saleCount_desc/asc // sort field and direction
& hasStock = 0/1
& skuPrice = 400_1900 // min_max price
& brandId = 1& brandId =2 // multi-valued
& catalog3Id=1
& attrs = 1_3G:4G:5G & attrs = 2_骁龙845 & attrs = 4_高清屏 // multi-valued
& pageNum = 1
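A minimal sketch of how the skuPrice and attrs values might be split before they are turned into range and nested filters. The class and method names are hypothetical, and handling of open-ended ranges such as "_1900" or "400_" is an assumption not stated in these notes:

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class SearchParamParsing {
    // "400_1900" -> [400, 1900]. Assumption: "_1900" and "400_" are
    // open-ended ranges, returned with null on the missing side.
    public static Long[] parsePriceRange(String skuPrice) {
        String[] parts = skuPrice.split("_", -1);
        Long min = parts[0].isEmpty() ? null : Long.valueOf(parts[0]);
        Long max = parts.length < 2 || parts[1].isEmpty() ? null : Long.valueOf(parts[1]);
        return new Long[] {min, max};
    }

    // "1_3G:4G:5G" -> attrId "1" with values [3G, 4G, 5G].
    // Assumes the value is well-formed and contains an underscore.
    public static Map.Entry<String, List<String>> parseAttr(String attr) {
        int sep = attr.indexOf('_');
        String attrId = attr.substring(0, sep);
        List<String> values = Arrays.asList(attr.substring(sep + 1).split(":"));
        return new AbstractMap.SimpleEntry<>(attrId, values);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parsePriceRange("400_1900"))); // [400, 1900]
        System.out.println(parseAttr("1_3G:4G:5G"));                      // 1=[3G, 4G, 5G]
    }
}
```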
Final query
get gulimall_product/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"skuTitle": "新手机"
}
}
],
"filter": [
{
"term": {
"catalogId": "225"
}
},
{
"terms": {
"brandId": [
"1",
"2",
"9"
]
}
},
{
"nested": {
"path": "attrs",
"query": {
"bool": {
"must": [
{
"term": {
"attrs.attrId": {
"value": "8"
}
}
},
{
"terms": {
"attrs.attrValue": [
"英特尔"
]
}
}
]
}
}
}
},
{
"term": {
"hasStock": {
"value": "false"
}
}
},
{
"range": {
"skuPrice": {
"gte": -1,
"lte": 1000
}
}
}
]
}
},
"sort": {
"skuPrice": {
"order": "desc"
}
},
"from": 0,
"size": 1,
"highlight": {
"fields": {
"skuTitle":{}
},
"pre_tags": "<b style='color:red'>",
"post_tags": "</b>"
}
,
"aggs": {
"brand_agg": {
"terms": {
"field": "brandId",
"size": 10
},
"aggs": {
"brand_name_agg": {
"terms": {
"field": "brandName",
"size": 10
}
},
"brand_img_agg": {
"terms": {
"field": "brandImg",
"size": 10
}
}
}
},
"catelog_agg": {
"terms": {
"field": "catalogId",
"size": 10
},
"aggs": {
"catelog_name_agg": {
"terms": {
"field": "catalogName",
"size": 10
}
}
}
},
"attr_agg": {
"nested": {
"path": "attrs"
},
"aggs": {
"attr_id_agg": {
"terms": {
"field": "attrs.attrId",
"size": 10
},
"aggs": {
"attr_name_agg": {
"terms": {
"field": "attrs.attrName",
"size": 10
}
},
"attr_value_age": {
"terms": {
"field": "attrs.attrValue",
"size": 10
}
}
}
}
}
}
}
}
Query breakdown
Fuzzy match (match); filters (nested attributes, category, brand, price range, stock)
Sorting (sort)
Pagination (from, size)
Highlighting (highlight)
{
"query": {
"bool": {
"must":[
{
"match": {
"skuTitle": "手机"
}
}
],
"filter": [
{
"term": {
"catalogId": "225"
}
},
{
"terms":[],
},
{
"nested": {
"path":"person.name",
"query": {
}
}
},
{
"range": {
"skuPrice": {
"gte": 0,
"lte": 1000
}
}
}
]
}
},
"sort": {
"skuPrice": {
"order": "asc"
}
},
"from": 0,
"size": 100,
"highlight": {
"fields": {
"skuTitle": {}
},
"pre_tags": "",
"post_tags": ""
}
}
get product/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"skuTitle": "新手机"
}
}
],
"filter": [
{
"term": {
"catalogId": "225"
}
},
{
"terms": {
"brandId": [
"1",
"2",
"9"
]
}
},
{
"nested": {
"path": "attrs",
"query": {
"bool": {
"must": [
{
"term": {
"attrs.attrId": {
"value": "8"
}
}
},
{
"terms": {
"attrs.attrValue": [
"英特尔"
]
}
}
]
}
}
}
},
{
"term": {
"hasStock": {
"value": "false"
}
}
},
{
"range": {
"skuPrice": {
"gte": -1,
"lte": 1000
}
}
}
]
}
},
"sort": {
"skuPrice": {
"order": "desc"
}
},
"from": 0,
"size": 111,
"highlight": {
"fields": {
"skuTitle":{}
},
"pre_tags": "<b style='color:red'>",
"post_tags": "</b>"
}
}
Aggregation breakdown
Final
get gulimall_product/_search
{
"query": {
"match_all": {}
},
"aggs": {
"brand_agg": {
"terms": {
"field": "brandId",
"size": 10
},
"aggs": {
"brand_name_agg": {
"terms": {
"field": "brandName",
"size": 10
}
},
"brand_img_agg": {
"terms": {
"field": "brandImg",
"size": 10
}
}
}
},
"catelog_agg": {
"terms": {
"field": "catalogId",
"size": 10
},
"aggs": {
"catelog_name_agg": {
"terms": {
"field": "catalogName",
"size": 10
}
}
}
},
"attr_agg": {
"nested": {
"path": "attrs"
},
"aggs": {
"attr_id_agg": {
"terms": {
"field": "attrs.attrId",
"size": 10
},
"aggs": {
"attr_name_agg": {
"terms": {
"field": "attrs.attrName",
"size": 10
}
},
"attr_value_age": {
"terms": {
"field": "attrs.attrValue",
"size": 10
}
}
}
}
}
}
}
}
Aggregation fails
on fields mapped as non-searchable:
"brandName" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
}
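Data migration:
get product/_mapping
put gulimall_product
{
// paste the mapping returned for the old product index
// remove every index: false and doc_values: false; false saves disk space and speeds up indexing, but the field can then no longer be searched or aggregated
}
// migrate the data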
post _reindex
{
"source": {
"index": "product"
},
"dest": {
"index": "gulimall_product"
}
}
nested
resp
{
"took" : 8,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 3.2580965,
"hits" : [
{
"_index" : "gulimall_product",
"_type" : "_doc",
"_id" : "82",
"_score" : 3.2580965,
"_source" : {
"attrs" : [
{
"attrId" : 7,
"attrName" : "机身长度",
"attrValue" : "158.3mm"
},
{
"attrId" : 8,
"attrName" : "CPU品牌",
"attrValue" : "英特尔"
}
],
"brandId" : 6,
"brandImg" : "https://ming-mall.oss-cn-beijing.aliyuncs.com/2021/10/31/8f88b9d0-1c5c-4b3b-8b8e-096b37bc4fc1_key_login.png",
"brandName" : "苹果",
"catalogId" : 225,
"catalogName" : "手机",
"hasStock" : false,
"hotScore" : 0,
"saleCount" : 0,
"skuId" : 82,
"skuImg" : "https://ming-mall.oss-cn-beijing.aliyuncs.com/2021/10/31/a4a6e721-7de0-4eb6-8490-add13221a7b7_key_login.png",
"skuPrice" : 4444.0,
"skuTitle" : "iphone11 aaa aaa 8G 16",
"spuId" : 39
},
"highlight" : {
"skuTitle" : [
"<b style='color:red'>iphone11</b> aaa aaa 8G 16"
]
}
}
]
},
"aggregations" : {
"brandAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 6,
"doc_count" : 2,
"brandImgAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "https://ming-mall.oss-cn-beijing.aliyuncs.com/2021/10/31/8f88b9d0-1c5c-4b3b-8b8e-096b37bc4fc1_key_login.png",
"doc_count" : 2
}
]
},
"brandNameAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "苹果",
"doc_count" : 2
}
]
}
}
]
},
"catalogAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 225,
"doc_count" : 2,
"catalogNameAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "手机",
"doc_count" : 2
}
]
}
}
]
},
"attrs" : {
"doc_count" : 66,
"attrIdAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 8,
"doc_count" : 64,
"attrNameAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "CPU品牌",
"doc_count" : 64
}
]
},
"attrValueAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "东西",
"doc_count" : 36
},
{
"key" : "英特尔",
"doc_count" : 28
}
]
}
},
{
"key" : 7,
"doc_count" : 2,
"attrNameAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "机身长度",
"doc_count" : 2
}
]
},
"attrValueAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "158.3mm",
"doc_count" : 2
}
]
}
}
]
}
}
}
}
Request params and response
https://www.bilibili.com/video/BV1np4y1C7Yf?p=177&spm_id_from=pageDriver
Param
@Data
public class SearchParam {
//full-text keyword passed from the page
private String keyword;
//brand ids, multiple allowed
private List<Long> brandId;
//level-3 category id
private Long catalog3Id;
//sort: sort=price_desc/asc
private String sort;
//only show items in stock
private Integer hasStock;
//price range, e.g. 20_500
private String skuPrice;
//attribute filters: attrs=1_安卓:其他&attrs=2_6寸:5寸
private List<String> attrs;
//page number
private Integer pageNum = 1;
//the raw query string
private String _queryString;
}
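The sort value packs the field and the direction into one string. A minimal sketch of splitting it, assuming the part after the last underscore is the direction and that anything other than desc falls back to asc (the class and method names are hypothetical):

```java
public class SortParamSketch {
    // "saleCount_desc" -> ["saleCount", "desc"].
    // Assumes the value contains an underscore; unknown directions become "asc".
    public static String[] parse(String sort) {
        int sep = sort.lastIndexOf('_');
        String field = sort.substring(0, sep);
        String order = sort.substring(sep + 1).equalsIgnoreCase("desc") ? "desc" : "asc";
        return new String[] {field, order};
    }

    public static void main(String[] args) {
        String[] parsed = parse("saleCount_desc");
        System.out.println(parsed[0] + " " + parsed[1]); // saleCount desc
    }
}
```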
SearchResult
@Data
public class SearchResult {
//all matching products
private List<SkuEsModel> product;
//current page number
private Integer pageNum;
//total number of records
private Long total;
//total number of pages
private Integer totalPages;
//page numbers for the pager
private List<Integer> pageNavs;
// brands that appear in the current result
private List<BrandVo> brands;
// attributes that appear in the current result
private List<AttrVo> attrs;
// categories that appear in the current result
private List<CatalogVo> catalogs;
//=== everything above is returned to the page ===//
/* breadcrumb navigation data */
private List<NavVo> navs;
@Data
public static class NavVo {
private String navName;
private String navValue;
private String link;
}
@Data
@AllArgsConstructor
public static class BrandVo {
private Long brandId;
private String brandName;
private String brandImg;
}
@Data
@AllArgsConstructor
public static class AttrVo {
private Long attrId;
private String attrName;
private List<String> attrValue;
}
@Data
@AllArgsConstructor
public static class CatalogVo {
private Long catalogId;
private String catalogName;
}
}
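The totalPages and pageNavs fields are both derived from total and the page size. A minimal sketch of that arithmetic follows, assuming a page size of 16 purely for the demo (the real value lives in EsConstant.PRODUCT_PAGESIZE, which these notes do not show). PageMathDemo is a hypothetical class name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing how totalPages and pageNavs relate to total.
class PageMathDemo {

    // Ceiling division without floating point; pageSize must be positive.
    static int totalPages(long total, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (int) ((total + pageSize - 1) / pageSize);
    }

    // Page numbers 1..totalPages, i.e. the list the pager iterates over.
    static List<Integer> pageNavs(int totalPages) {
        List<Integer> navs = new ArrayList<>();
        for (int i = 1; i <= totalPages; i++) {
            navs.add(i);
        }
        return navs;
    }

    public static void main(String[] args) {
        int pages = totalPages(33, 16); // 33 records, assumed page size 16
        System.out.println(pages + " " + pageNavs(pages)); // prints "3 [1, 2, 3]"
    }
}
```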
Main logic
// controller
@GetMapping(value = {"/search.html", "/"})
public String getSearchPage(SearchParam searchParam, Model model, HttpServletRequest request) {
    // Keep the raw query string; it is used later to build breadcrumb links
    searchParam.set_queryString(request.getQueryString());
    SearchResult result = searchService.getSearchResult(searchParam);
    model.addAttribute("result", result);
    return "search";
}

// service
public SearchResult getSearchResult(SearchParam searchParam) {
    SearchResult searchResult = null;
    // Build the search request from the request parameters
    SearchRequest request = bulidSearchRequest(searchParam);
    try {
        SearchResponse searchResponse = restHighLevelClient.search(request, GulimallElasticSearchConfig.COMMON_OPTIONS);
        // Wrap the ES response into the result object
        searchResult = bulidSearchResult(searchParam, searchResponse);
    } catch (IOException e) {
        // Note: the method returns null when the search fails
        log.error("Elasticsearch search failed", e);
    }
    return searchResult;
}
Building the query
private SearchRequest bulidSearchRequest(SearchParam searchParam) {
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    //1. Build the bool query
    BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
    //1.1 bool must: full-text match on the title
    if (!StringUtils.isEmpty(searchParam.getKeyword())) {
        boolQueryBuilder.must(QueryBuilders.matchQuery("skuTitle", searchParam.getKeyword()));
    }
    //1.2 bool filter
    //1.2.1 catalog
    if (searchParam.getCatalog3Id() != null) {
        boolQueryBuilder.filter(QueryBuilders.termQuery("catalogId", searchParam.getCatalog3Id()));
    }
    //1.2.2 brand
    if (searchParam.getBrandId() != null && searchParam.getBrandId().size() > 0) {
        boolQueryBuilder.filter(QueryBuilders.termsQuery("brandId", searchParam.getBrandId()));
    }
    //1.2.3 hasStock
    if (searchParam.getHasStock() != null) {
        boolQueryBuilder.filter(QueryBuilders.termQuery("hasStock", searchParam.getHasStock() == 1));
    }
    //1.2.4 priceRange: "min_max", "min_" or "_max"
    if (!StringUtils.isEmpty(searchParam.getSkuPrice())) {
        RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("skuPrice");
        String[] prices = searchParam.getSkuPrice().split("_");
        if (prices.length == 1) {
            // "500_" is split into ["500"] because trailing empty strings are dropped
            if (searchParam.getSkuPrice().startsWith("_")) {
                rangeQueryBuilder.lte(Integer.parseInt(prices[0]));
            } else {
                rangeQueryBuilder.gte(Integer.parseInt(prices[0]));
            }
        } else if (prices.length == 2) {
            // "_6000" is split into ["", "6000"]
            if (!prices[0].isEmpty()) {
                rangeQueryBuilder.gte(Integer.parseInt(prices[0]));
            }
            rangeQueryBuilder.lte(Integer.parseInt(prices[1]));
        }
        boolQueryBuilder.filter(rangeQueryBuilder);
    }
    //1.2.5 attrs (nested), e.g. attrs=1_5寸:8寸&attrs=2_16G:8G
    List<String> attrs = searchParam.getAttrs();
    if (attrs != null && attrs.size() > 0) {
        for (String attr : attrs) {
            String[] attrSplit = attr.split("_");
            String[] attrValues = attrSplit[1].split(":");
            // One nested query per attribute: attrId and attrValue must match inside
            // the same nested object, and one nested object holds only one attrId.
            // Sharing a single bool across all attributes would require one object
            // to carry every attrId at once, which never matches.
            BoolQueryBuilder queryBuilder = new BoolQueryBuilder();
            queryBuilder.must(QueryBuilders.termQuery("attrs.attrId", attrSplit[0]));
            queryBuilder.must(QueryBuilders.termsQuery("attrs.attrValue", attrValues));
            boolQueryBuilder.filter(QueryBuilders.nestedQuery("attrs", queryBuilder, ScoreMode.None));
        }
    }
    //1. The bool query is complete
    searchSourceBuilder.query(boolQueryBuilder);
    //2. Sort, e.g. sort=saleCount_desc/asc
    if (!StringUtils.isEmpty(searchParam.getSort())) {
        String[] sortSplit = searchParam.getSort().split("_");
        searchSourceBuilder.sort(sortSplit[0], sortSplit[1].equalsIgnoreCase("asc") ? SortOrder.ASC : SortOrder.DESC);
    }
    //3. Paging
    searchSourceBuilder.from((searchParam.getPageNum() - 1) * EsConstant.PRODUCT_PAGESIZE);
    searchSourceBuilder.size(EsConstant.PRODUCT_PAGESIZE);
    //4. Highlighting (only meaningful when there is a keyword)
    if (!StringUtils.isEmpty(searchParam.getKeyword())) {
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.field("skuTitle");
        highlightBuilder.preTags("<b style='color:red'>");
        highlightBuilder.postTags("</b>");
        searchSourceBuilder.highlighter(highlightBuilder);
    }
    //5. Aggregations
    //5.1 Aggregate by brand, with brand name and image as sub-aggregations
    TermsAggregationBuilder brandAgg = AggregationBuilders.terms("brandAgg").field("brandId");
    TermsAggregationBuilder brandNameAgg = AggregationBuilders.terms("brandNameAgg").field("brandName");
    TermsAggregationBuilder brandImgAgg = AggregationBuilders.terms("brandImgAgg").field("brandImg");
    brandAgg.subAggregation(brandNameAgg);
    brandAgg.subAggregation(brandImgAgg);
    searchSourceBuilder.aggregation(brandAgg);
    //5.2 Aggregate by catalog
    TermsAggregationBuilder catalogAgg = AggregationBuilders.terms("catalogAgg").field("catalogId");
    TermsAggregationBuilder catalogNameAgg = AggregationBuilders.terms("catalogNameAgg").field("catalogName");
    catalogAgg.subAggregation(catalogNameAgg);
    searchSourceBuilder.aggregation(catalogAgg);
    //5.3 Aggregate by attrs (nested)
    NestedAggregationBuilder nestedAggregationBuilder = new NestedAggregationBuilder("attrs", "attrs");
    // Aggregate by attrId
    TermsAggregationBuilder attrIdAgg = AggregationBuilders.terms("attrIdAgg").field("attrs.attrId");
    // Within each attrId bucket, aggregate by attrName and attrValue
    TermsAggregationBuilder attrNameAgg = AggregationBuilders.terms("attrNameAgg").field("attrs.attrName");
    TermsAggregationBuilder attrValueAgg = AggregationBuilders.terms("attrValueAgg").field("attrs.attrValue");
    attrIdAgg.subAggregation(attrNameAgg);
    attrIdAgg.subAggregation(attrValueAgg);
    nestedAggregationBuilder.subAggregation(attrIdAgg);
    searchSourceBuilder.aggregation(nestedAggregationBuilder);
    log.debug("Built DSL: {}", searchSourceBuilder.toString());
    SearchRequest request = new SearchRequest(new String[]{EsConstant.PRODUCT_INDEX}, searchSourceBuilder);
    return request;
}
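The price-range handling above depends on how String.split treats empty strings: with the default limit, trailing empty strings are dropped, so "500_" yields one element while "_6000" yields two. As an alternative sketch (not the project's code), passing a limit of -1 keeps both sides and makes the three forms uniform. PriceRangeDemo is a hypothetical class name.

```java
// Hypothetical helper: parses "min_max", "min_" or "_max" into two bounds.
class PriceRangeDemo {

    // Returns {min, max}; a null entry means that side is unbounded.
    static Integer[] parse(String skuPrice) {
        // Limit -1 keeps trailing empty strings, so "500_" becomes ["500", ""].
        String[] parts = skuPrice.split("_", -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected min_max, got: " + skuPrice);
        }
        Integer min = parts[0].isEmpty() ? null : Integer.valueOf(parts[0]);
        Integer max = parts[1].isEmpty() ? null : Integer.valueOf(parts[1]);
        return new Integer[] {min, max};
    }

    public static void main(String[] args) {
        for (String s : new String[] {"20_500", "_6000", "500_"}) {
            Integer[] range = parse(s);
            System.out.println(s + " -> min=" + range[0] + ", max=" + range[1]);
        }
    }
}
```

With the bounds in hand, the range query would only call gte or lte for the side that is not null.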
Building the response result
private SearchResult bulidSearchResult(SearchParam searchParam, SearchResponse searchResponse) {
    SearchResult result = new SearchResult();
    SearchHits hits = searchResponse.getHits();
    //1. Wrap the products that were found
    if (hits.getHits() != null && hits.getHits().length > 0) {
        List<SkuEsModel> skuEsModels = new ArrayList<>();
        for (SearchHit hit : hits) {
            String sourceAsString = hit.getSourceAsString();
            SkuEsModel skuEsModel = JSON.parseObject(sourceAsString, SkuEsModel.class);
            // Replace the title with its highlighted version when one is available
            if (!StringUtils.isEmpty(searchParam.getKeyword())) {
                HighlightField skuTitle = hit.getHighlightFields().get("skuTitle");
                if (skuTitle != null && skuTitle.getFragments().length > 0) {
                    String highLight = skuTitle.getFragments()[0].string();
                    skuEsModel.setSkuTitle(highLight);
                }
            }
            skuEsModels.add(skuEsModel);
        }
        result.setProduct(skuEsModels);
    }
    //2. Paging information
    //2.1 Current page number
    result.setPageNum(searchParam.getPageNum());
    //2.2 Total number of records
    long total = hits.getTotalHits().value;
    result.setTotal(total);
    //2.3 Total number of pages (rounded up)
    Integer totalPages = (int) total % EsConstant.PRODUCT_PAGESIZE == 0 ?
            (int) total / EsConstant.PRODUCT_PAGESIZE : (int) total / EsConstant.PRODUCT_PAGESIZE + 1;
    result.setTotalPages(totalPages);
    List<Integer> pageNavs = new ArrayList<>();
    for (int i = 1; i <= totalPages; i++) {
        pageNavs.add(i);
    }
    result.setPageNavs(pageNavs);
    //3. Brands that appear in the result
    List<SearchResult.BrandVo> brandVos = new ArrayList<>();
    Aggregations aggregations = searchResponse.getAggregations();
    // ParsedLongTerms holds the result of a terms aggregation whose keys are long values
    ParsedLongTerms brandAgg = aggregations.get("brandAgg");
    for (Terms.Bucket bucket : brandAgg.getBuckets()) {
        //3.1 Brand id
        Long brandId = bucket.getKeyAsNumber().longValue();
        Aggregations subBrandAggs = bucket.getAggregations();
        //3.2 Brand image
        ParsedStringTerms brandImgAgg = subBrandAggs.get("brandImgAgg");
        String brandImg = brandImgAgg.getBuckets().get(0).getKeyAsString();
        //3.3 Brand name
        Terms brandNameAgg = subBrandAggs.get("brandNameAgg");
        String brandName = brandNameAgg.getBuckets().get(0).getKeyAsString();
        SearchResult.BrandVo brandVo = new SearchResult.BrandVo(brandId, brandName, brandImg);
        brandVos.add(brandVo);
    }
    result.setBrands(brandVos);
    //4. Categories that appear in the result
    List<SearchResult.CatalogVo> catalogVos = new ArrayList<>();
    ParsedLongTerms catalogAgg = aggregations.get("catalogAgg");
    for (Terms.Bucket bucket : catalogAgg.getBuckets()) {
        //4.1 Category id
        Long catalogId = bucket.getKeyAsNumber().longValue();
        Aggregations subcatalogAggs = bucket.getAggregations();
        //4.2 Category name
        ParsedStringTerms catalogNameAgg = subcatalogAggs.get("catalogNameAgg");
        String catalogName = catalogNameAgg.getBuckets().get(0).getKeyAsString();
        SearchResult.CatalogVo catalogVo = new SearchResult.CatalogVo(catalogId, catalogName);
        catalogVos.add(catalogVo);
    }
    result.setCatalogs(catalogVos);
    //5. Attributes that appear in the result
    List<SearchResult.AttrVo> attrVos = new ArrayList<>();
    // ParsedNested holds the result of the nested aggregation
    ParsedNested parsedNested = aggregations.get("attrs");
    ParsedLongTerms attrIdAgg = parsedNested.getAggregations().get("attrIdAgg");
    for (Terms.Bucket bucket : attrIdAgg.getBuckets()) {
        //5.1 Attribute id
        Long attrId = bucket.getKeyAsNumber().longValue();
        Aggregations subAttrAgg = bucket.getAggregations();
        //5.2 Attribute name
        ParsedStringTerms attrNameAgg = subAttrAgg.get("attrNameAgg");
        String attrName = attrNameAgg.getBuckets().get(0).getKeyAsString();
        //5.3 Attribute values
        ParsedStringTerms attrValueAgg = subAttrAgg.get("attrValueAgg");
        List<String> attrValues = new ArrayList<>();
        for (Terms.Bucket attrValueAggBucket : attrValueAgg.getBuckets()) {
            String attrValue = attrValueAggBucket.getKeyAsString();
            attrValues.add(attrValue);
        }
        SearchResult.AttrVo attrVo = new SearchResult.AttrVo(attrId, attrName, attrValues);
        attrVos.add(attrVo);
    }
    result.setAttrs(attrVos);
    //6. Build the breadcrumb navigation
    List<String> attrs = searchParam.getAttrs();
    if (attrs != null && attrs.size() > 0) {
        List<SearchResult.NavVo> navVos = attrs.stream().map(attr -> {
            String[] split = attr.split("_");
            SearchResult.NavVo navVo = new SearchResult.NavVo();
            //6.1 Set the attribute value
            navVo.setNavValue(split[1]);
            //6.2 Look up and set the attribute name
            try {
                R r = productFeignService.info(Long.parseLong(split[0]));
                if (r.getCode() == 0) {
                    AttrResponseVo attrResponseVo = JSON.parseObject(JSON.toJSONString(r.get("attr")), new TypeReference<AttrResponseVo>() {
                    });
                    navVo.setNavName(attrResponseVo.getAttrName());
                }
            } catch (Exception e) {
                log.error("Remote call to the product service for the attribute failed", e);
            }
            //6.3 Set the breadcrumb link: the current query with this attribute removed.
            // getQueryString() is not URL-decoded, so this replace only matches
            // when attr appears in the query string exactly as written here.
            String queryString = searchParam.get_queryString();
            String replace = queryString.replace("&attrs=" + attr, "").replace("attrs=" + attr + "&", "").replace("attrs=" + attr, "");
            navVo.setLink("http://search.gulimall.com/search.html" + (replace.isEmpty() ? "" : "?" + replace));
            return navVo;
        }).collect(Collectors.toList());
        result.setNavs(navVos);
    }
    return result;
}
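The breadcrumb link in step 6.3 is the current query string with one attrs parameter removed. Two things can trip up plain substring replacement: the raw query string may be percent-encoded, and a value can be a prefix of a longer value. The sketch below is one possible alternative that walks the parameters one by one and compares decoded values. BreadcrumbLinkDemo and removeParam are hypothetical names, not part of the project.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: removes one name=value pair from a raw query string.
class BreadcrumbLinkDemo {

    static String removeParam(String queryString, String name, String value) {
        List<String> kept = new ArrayList<>();
        for (String pair : queryString.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String raw = eq < 0 ? "" : pair.substring(eq + 1);
            // Compare the decoded value so that "%3A" and ":" are treated alike.
            if (key.equals(name) && decode(raw).equals(value)) {
                continue; // this is the parameter to drop
            }
            kept.add(pair); // keep every other pair exactly as it was sent
        }
        return String.join("&", kept);
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String query = "keyword=phone&attrs=1_5in%3A8in&attrs=2_16G";
        System.out.println(removeParam(query, "attrs", "1_5in:8in"));
    }
}
```

Because whole parameters are compared, removing attrs=2_16G leaves a parameter such as attrs=2_16GB untouched.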
III. Common operations
Importing and exporting data
https://github.com/elasticsearch-dump/elasticsearch-dump
npm install elasticdump -g
elasticdump
# Copy an index from one cluster to another
elasticdump \
--input=http://production.es.com:9200/my_index \
--output=http://staging.es.com:9200/my_index \
--type=data
# Export to JSON (use separate files for the mapping and the data)
elasticdump --input=http://112.124.15.81:9200/gulimall_product --output=F:\laji\dumpsearch\gulimall_elasticsearch_mapping.json --type=mapping
elasticdump --input=http://112.124.15.81:9200/gulimall_product --output=F:\laji\dumpsearch\gulimall_elasticsearch_data.json --type=data
# Import
elasticdump --input=F:\laji\dumpsearch\gulimall_elasticsearch_data.json --output=http://112.124.15.81:9200 --type=data
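ELK
# Note: these URLs fetch 8.1.2, while the paths and settings below use 7.10.2; pick one version and adjust the paths to match
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.1.2-linux-x86_64.tar.gz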
wget https://artifacts.elastic.co/downloads/logstash/logstash-8.1.2-linux-x86_64.tar.gz
wget https://artifacts.elastic.co/downloads/kibana/kibana-8.1.2-linux-x86_64.tar.gz
tar -zxvf
cd /usr/local/elasticsearch-7.10.2/config/
vim elasticsearch.yml
node.name: node-1
path.data: /usr/local/elasticsearch-7.10.2/data
path.logs: /usr/local/elasticsearch-7.10.2/logs
network.host: 127.0.0.1
http.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["127.0.0.1"]
cluster.initial_master_nodes: ["node-1"]
# Create a dedicated es user (Elasticsearch refuses to start as root)
useradd es
chown -R es:es /usr/local/elasticsearch-7.10.2
su - es
/usr/local/elasticsearch-7.10.2/bin/elasticsearch -d
# Check that port 9200 responds
# logstash
tar -zxvf logstash-7.10.2.tar.gz -C /usr/local
# Create a new config file
cd /usr/local/logstash-7.10.2/bin
vim logstash-elasticsearch.conf
input {
  file {
    path => "/home/ruoyi/logs/sys-*.log"
    start_position => beginning
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
      negate => true
      auto_flush_interval => 3
      what => previous
    }
  }
}
filter {
  if [path] =~ "info" {
    mutate { replace => { type => "sys-info" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  } else if [path] =~ "error" {
    mutate { replace => { type => "sys-error" } }
  } else {
    mutate { replace => { type => "random_logs" } }
  }
}
output {
  elasticsearch {
    hosts => '120.78.129.95:9200'
  }
  stdout { codec => rubydebug }
}
# Start Logstash from the bin directory entered above, where the config file was created
./logstash -f logstash-elasticsearch.conf
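The multiline codec settings above (pattern, negate => true, what => previous) mean that any line that does not start with a timestamp is appended to the previous event, which keeps Java stack traces together with the log line that produced them. The sketch below imitates only that grouping rule in plain Java to make it concrete. It is not how Logstash is implemented, and MultilineDemo and the sample log lines are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of the grouping rule used in the config above.
class MultilineDemo {

    // Same pattern as in the Logstash config: a line starting with a timestamp.
    private static final Pattern EVENT_START =
            Pattern.compile("^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}");

    static List<String> group(List<String> lines) {
        List<String> events = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            boolean startsEvent = EVENT_START.matcher(line).find();
            if (startsEvent || current == null) {
                // A timestamped line closes the previous event and opens a new one.
                if (current != null) {
                    events.add(current.toString());
                }
                current = new StringBuilder(line);
            } else {
                // A non-matching line belongs to the previous event.
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            events.add(current.toString());
        }
        return events;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "2022-04-01 10:00:00 INFO start",
                "2022-04-01 10:00:01 ERROR boom",
                "java.lang.IllegalStateException: boom",
                "\tat com.example.Foo.bar(Foo.java:1)",
                "2022-04-01 10:00:02 INFO done");
        System.out.println(group(lines).size()); // prints 3
    }
}
```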
# ---------- kibana
# Edit the configuration
cd /usr/local/kibana-7.10.2/config
vim kibana.yml
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.hosts: ["http://120.78.129.95:9200"]
kibana.index: ".kibana"
# Give the es user ownership of the Kibana directory
chown -R es:es /usr/local/kibana-7.10.2/
# Start
# Switch to the es user first
su - es
# Run in the background
/usr/local/kibana-7.10.2/bin/kibana &