Bulk-importing data into ES
- Bulk
- ES provides an API called bulk for performing batch operations
- Bulk import data
{"index": {"_index": "book", "_type": "_doc", "_id": 1}}
{"name": "权力的游戏"}
{"index": {"_index": "book", "_type": "_doc", "_id": 2}}
{"name": "疯狂的石头"}
- POST the file to _bulk (@name refers to a file called name in the current directory)
curl -X POST "localhost:9200/_bulk" -H 'Content-Type: application/json' --data-binary @name
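The bulk body is newline-delimited JSON: one action line followed by one source line per document, and the body has to end with a newline. As a minimal sketch, the file posted above could be generated as follows. The helper name `build_bulk_body` is hypothetical, it only builds the string rather than sending it, and it leaves out `_type`, which is deprecated in ES 7.x.

```python
import json

def build_bulk_body(index, docs):
    """Build an NDJSON body for the _bulk API from (doc_id, source) pairs.

    Hypothetical helper for illustration; it only builds the string.
    """
    lines = []
    for doc_id, source in docs:
        # Action line: index the following source line into `index` under `doc_id`.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself (keep non-ASCII text readable).
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

body = build_bulk_body("book", [(1, {"name": "权力的游戏"}), (2, {"name": "疯狂的石头"})])
print(body, end="")
```

Writing `body` to a file (UTF-8 encoded) and passing it with `--data-binary` keeps the newlines intact, which plain `-d` would not.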
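Term-level queries in ES
Introduction
- Term-level queries
- These queries are typically used on structured data such as number, date, and keyword fields, rather than on text fields.
- In other words, a full-text query first analyzes the text into terms before searching, whereas a term-level query looks up the exact term in the field's inverted index without any analysis. Term-level queries are therefore generally used on numeric, date, and similar fields.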
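The difference between the two query families can be sketched with a toy inverted index. This is a simplified model, not how ES is implemented: `analyze` is a rough stand-in for the standard analyzer (lowercase, split on non-alphanumerics), and the sample names are made up.

```python
import re
from collections import defaultdict

def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def build_index(docs, analyzed):
    """Map each indexed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        # text fields are analyzed; keyword fields keep the whole value as one term
        terms = analyze(value) if analyzed else [value]
        for term in terms:
            index[term].add(doc_id)
    return index

def term_lookup(index, term):
    """Term-level query: exact lookup, the query input is not analyzed."""
    return sorted(index.get(term, set()))

def match_lookup(index, query):
    """Full-text query: analyze the query first, then look up each token."""
    hits = set()
    for tok in analyze(query):
        hits |= index.get(tok, set())
    return sorted(hits)

docs = {1: "LeBron James", 2: "James Harden"}  # hypothetical sample values
text_index = build_index(docs, analyzed=True)      # behaves like a text field
keyword_index = build_index(docs, analyzed=False)  # behaves like a keyword field

print(term_lookup(text_index, "James"))            # [] because the indexed term is "james"
print(match_lookup(text_index, "James"))           # [1, 2]
print(term_lookup(keyword_index, "James Harden"))  # [2]
```

This is why a term query against a text field often returns nothing when the input contains capitals or several words.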
Preparation
- Delete the nba index
- Create the nba index
PUT nba
{
"mappings": {
"properties": {
"birthDay": {
"type": "date"
},
"birthDayStr": {
"type": "keyword"
},
"age": {
"type": "integer"
},
"code": {
"type": "text"
},
"country": {
"type": "text"
},
"countryEn": {
"type": "text"
},
"displayAffiliation": {
"type": "text"
},
"displayName": {
"type": "text"
},
"displayNameEn": {
"type": "text"
},
"draft": {
"type": "long"
},
"heightValue": {
"type": "float"
},
"jerseyNo": {
"type": "text"
},
"playYear": {
"type": "long"
},
"playerId": {
"type": "keyword"
},
"position": {
"type": "text"
},
"schoolType": {
"type": "text"
},
"teamCity": {
"type": "text"
},
"teamCityEn": {
"type": "text"
},
"teamConference": {
"type": "keyword"
},
"teamConferenceEn": {
"type": "keyword"
},
"teamName": {
"type": "keyword"
},
"teamNameEn": {
"type": "keyword"
},
"weight": {
"type": "text"
}
}
}
}
- Bulk import the data (the player file)
Link: https://pan.baidu.com/s/13Uahu1FxKiY6nfRYeY4Myw
Extraction code: t2qb
Term Query: exact-match query (find players whose jersey number is 23)
POST nba/_search
{
"query": {
"term": {
"jerseyNo": "23"
}
}
}
Exists Query: finds documents that have a non-null value in the given field (find players whose team name is not empty)
POST nba/_search
{
"query": {
"exists": {
"field": "teamNameEn"
}
}
}
Prefix Query: finds documents containing a term with the given prefix (find players whose team name starts with Rock)
POST nba/_search
{
"query": {
"prefix": {
"teamNameEn": "Rock"
}
}
}
Wildcard Query: supports wildcards, where * matches any sequence of characters and ? matches any single character (find Rockets players)
POST nba/_search
{
"query": {
"wildcard": {
"teamNameEn": "Ro*s"
}
}
}
Regexp Query: regular expression query (find Rockets players)
POST nba/_search
{
"query": {
"regexp": {
"teamNameEn": "Ro.*s"
}
}
}
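Both the wildcard and the regexp pattern must match the whole indexed term, not just part of it. The sketch below imitates that with Python's `fnmatch` and `re.fullmatch`. It is only an approximation: Python's pattern syntaxes are not identical to Lucene's (for example, `fnmatch` also understands `[...]`), but for these two simple patterns the behavior is the same. The term list is made up.

```python
import fnmatch
import re

team_names = ["Rockets", "Raptors", "Lakers", "Rock"]  # hypothetical keyword terms

# Wildcard: * matches any run of characters, ? exactly one; the whole term must match.
wildcard_hits = [t for t in team_names if fnmatch.fnmatchcase(t, "Ro*s")]

# Regexp: the pattern is anchored, so it also has to cover the whole term.
regexp_hits = [t for t in team_names if re.fullmatch(r"Ro.*s", t)]

print(wildcard_hits)  # ['Rockets']
print(regexp_hits)    # ['Rockets']
```

"Rock" is rejected by both patterns because it does not end in "s", which shows that a prefix match alone is not enough.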
Ids Query (find the players with ids 1 and 2)
POST nba/_search
{
"query": {
"ids": {
"values": [
1,
2
]
}
}
}
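Range queries in ES
Finds documents whose value in the given field (date, number, or string) falls within the given range.
- Find players who have played in the NBA for 2 to 10 years (inclusive)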
POST nba/_search
{
"query": {
"range": {
"playYear": {
"gte": 2,
"lte": 10
}
}
}
}
- Find players born between 1980 and 1999
POST nba/_search
{
"query": {
"range": {
"birthDay": {
"gte": "01/01/1980",
"lte": "31/12/1999",
"format": "dd/MM/yyyy||yyyy"
}
}
}
}
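The `||` in the format string lists alternative patterns that are tried in order, so a bound may be written either as a full date or as a bare year. A minimal sketch of that fallback, assuming the two Java-style patterns map to the Python `strptime` formats below. How ES fills in a missing month or day for an upper bound is not modelled here.

```python
from datetime import datetime

# Assumed Python equivalents of the two patterns in "dd/MM/yyyy||yyyy".
FORMATS = ["%d/%m/%Y", "%Y"]

def parse_with_fallback(value, formats=FORMATS):
    """Try each format in order and return the first successful parse."""
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # this pattern did not fit, try the next one
    raise ValueError(f"{value!r} matches none of {formats}")

print(parse_with_fallback("01/01/1980"))  # 1980-01-01 00:00:00
print(parse_with_fallback("1999"))        # 1999-01-01 00:00:00
```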
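Boolean queries in ES
Bool query clauses
type | description
must | The clause must appear in matching documents and contributes to the score
filter | The clause must appear in matching documents, but does not affect the score
must_not | The clause must not appear in matching documents
should | The clause should appear in matching documents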
must (find players named James)
POST /nba/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"displayNameEn": "james"
}
}
]
}
}
}
filter: same effect as must, but without scoring (find players named James)
POST /nba/_search
{
"query": {
"bool": {
"filter": [
{
"match": {
"displayNameEn": "james"
}
}
]
}
}
}
must_not (find Western Conference players named James)
POST /nba/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"displayNameEn": "james"
}
}
],
"must_not": [
{
"term": {
"teamConferenceEn": {
"value": "Eastern"
}
}
}
]
}
}
}
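should (find Western Conference players named James, preferring those who have played 11 to 20 years)
- Documents that do not match the should clause are still returned; only their score differs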
POST /nba/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"displayNameEn": "james"
}
}
],
"must_not": [
{
"term": {
"teamConferenceEn": {
"value": "Eastern"
}
}
}
],
"should": [
{
"range": {
"playYear": {
"gte": 11,
"lte": 20
}
}
}
]
}
}
}
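- With minimum_should_match set to 1, the should clause becomes mandatory: only Western Conference players named James who have played 11 to 20 years are returned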
POST /nba/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"displayNameEn": "james"
}
}
],
"must_not": [
{
"term": {
"teamConferenceEn": {
"value": "Eastern"
}
}
}
],
"should": [
{
"range": {
"playYear": {
"gte": 11,
"lte": 20
}
}
}
],
"minimum_should_match": 1
}
}
}
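How the four clause types combine can be sketched with a toy evaluator. This is a simplified model and not ES's scoring: each clause is a plain Python predicate, the "score" is just a count of matching must and should clauses, and the player rows are made up.

```python
def bool_query(docs, must=(), filter_=(), must_not=(), should=(), minimum_should_match=None):
    """Toy model of bool semantics: clauses are predicates, the score is a match count."""
    if minimum_should_match is None:
        # should is optional unless it is the only positive clause type present.
        minimum_should_match = 1 if should and not (must or filter_) else 0
    hits = []
    for doc in docs:
        required = list(must) + list(filter_)
        if not all(clause(doc) for clause in required):
            continue
        if any(clause(doc) for clause in must_not):
            continue
        should_hits = sum(1 for clause in should if clause(doc))
        if should_hits < minimum_should_match:
            continue
        # Only must and should add to the score; filter and must_not never do.
        hits.append((doc["name"], len(must) + should_hits))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

def is_james(doc):
    return "james" in doc["name"].lower()

def is_eastern(doc):
    return doc["conference"] == "Eastern"

def played_11_to_20(doc):
    return 11 <= doc["playYear"] <= 20

players = [  # hypothetical sample documents
    {"name": "James A", "conference": "Western", "playYear": 15},
    {"name": "James B", "conference": "Western", "playYear": 3},
    {"name": "James C", "conference": "Eastern", "playYear": 12},
    {"name": "Curry", "conference": "Western", "playYear": 12},
]

optional = bool_query(players, must=[is_james], must_not=[is_eastern], should=[played_11_to_20])
required = bool_query(players, must=[is_james], must_not=[is_eastern],
                      should=[played_11_to_20], minimum_should_match=1)
print(optional)  # [('James A', 2), ('James B', 1)]
print(required)  # [('James A', 2)]
```

With the optional should clause, "James B" is still returned but ranks lower; with `minimum_should_match=1` he drops out.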
Sorting in ES
- Rockets players sorted by years played, from most to fewest
POST nba/_search
{
"query": {
"match": {
"teamNameEn": "Rockets"
}
},
"sort": [
{
"playYear": {
"order": "desc"
}
}
]
}
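- Rockets players sorted by years played, from most to fewest; players with the same number of years are then sorted by height, from tallest to shortest
POST nba/_search
{
"query": {
"match": {
"teamNameEn": "Rockets"
}
},
"sort": [
{
"playYear": {
"order": "desc"
}
},
{
"heightValue": {
"order": "desc"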
}
}
]
}
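Entries in the sort array are applied left to right: a later entry only breaks ties left by the earlier ones. A minimal sketch with made-up rows, using descending height as the tie-breaker as the description states:

```python
players = [  # hypothetical sample rows
    {"name": "A", "playYear": 10, "heightValue": 1.98},
    {"name": "B", "playYear": 10, "heightValue": 2.06},
    {"name": "C", "playYear": 12, "heightValue": 1.91},
]

# Primary key: playYear descending; tie-breaker: heightValue descending.
ranked = sorted(players, key=lambda p: (-p["playYear"], -p["heightValue"]))
print([p["name"] for p in ranked])  # ['C', 'B', 'A']
```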
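Metric aggregations in ES
What is aggregation analysis in ES
- Aggregation is an important database feature: it computes aggregate values over the data set returned by a query, such as the maximum, minimum, sum, or average of a field (or of a computed expression). As both a search engine and a data store, ES offers powerful aggregation capabilities as well.
- Aggregations that compute metrics such as max, min, sum, and avg over a data set are called metric aggregations in ES.
- Besides aggregate functions, relational databases can also group query results with group by and then compute metrics per group. In ES this grouping is called bucket aggregation.
max min sum avg
- Compute the average age of Rockets players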
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"avgAge": {
"avg": {
"field": "age"
}
}
},
"size": 0
}
value_count: counts the non-null values of a field
- Count the Rockets players whose years played (playYear) is not empty
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"countPlayerYear": {
"value_count": {
"field": "playYear"
}
}
},
"size": 0
}
- Count how many players the Rockets have
POST nba/_count
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
}
}
Cardinality: counts distinct values
- Count the distinct ages among Rockets players
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"countAge": {
"cardinality": {
"field": "age"
}
}
},
"size": 0
}
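The three counts above answer different questions: `_count` counts documents, `value_count` counts values present in a field, and `cardinality` counts distinct values. A small sketch on made-up documents, where `None` stands for a missing field. Note that ES computes cardinality approximately on large data sets, whereas this sketch is exact.

```python
players = [  # hypothetical sample documents; None stands for a missing field
    {"age": 25, "playYear": 3},
    {"age": 25, "playYear": None},
    {"age": 30, "playYear": 8},
]

# What the _count endpoint reports: the number of matching documents.
doc_count = len(players)
# value_count on playYear: only documents that actually have a value are counted.
value_count = sum(1 for p in players if p["playYear"] is not None)
# cardinality on age: the number of distinct values.
cardinality = len({p["age"] for p in players if p["age"] is not None})

print(doc_count, value_count, cardinality)  # 3 2 2
```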
stats: returns five values: count, max, min, avg, and sum
- Get the age stats of Rockets players
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"statsAge": {
"stats": {
"field": "age"
}
}
},
"size": 0
}
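Extended stats: adds four more results to stats: sum of squares, variance, standard deviation, and the interval of the mean plus/minus two standard deviations
- Get the extended age stats of Rockets players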
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"extendStatsAge": {
"extended_stats": {
"field": "age"
}
}
},
"size": 0
}
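The values returned by stats and extended_stats can be reproduced by hand, which helps when checking a response. A minimal sketch on made-up ages, assuming the variance is the population variance and the bounds use two standard deviations:

```python
import math

ages = [21, 25, 25, 29, 30]  # hypothetical ages

count = len(ages)
total = sum(ages)
avg = total / count
stats = {"count": count, "min": min(ages), "max": max(ages), "avg": avg, "sum": total}

sum_of_squares = sum(a * a for a in ages)
variance = sum_of_squares / count - avg * avg  # population variance
std_deviation = math.sqrt(variance)
extended = {
    **stats,
    "sum_of_squares": sum_of_squares,
    "variance": variance,
    "std_deviation": std_deviation,
    # Interval of the mean plus/minus two standard deviations.
    "std_deviation_bounds": {
        "upper": avg + 2 * std_deviation,
        "lower": avg - 2 * std_deviation,
    },
}
print(stats)
print(extended)
```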
Percentiles: returns the values at the given percentiles; by default at [ 1, 5, 25, 50, 75, 95, 99 ]
- Get the age percentiles of Rockets players
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"percentAge": {
"percentiles": {
"field": "age"
}
}
},
"size": 0
}
- Get the age percentiles of Rockets players (with specified percentiles)
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"percentAge": {
"percentiles": {
"field": "age",
"percents": [
20,
50,
75
]
}
}
},
"size": 0
}
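A percentile is the value below which a given share of the observations fall. The sketch below computes it exactly with linear interpolation between neighbouring ranks. This is one common definition chosen for illustration; ES estimates percentiles with an approximate algorithm, so results on real data can differ slightly. The ages are made up.

```python
def percentile(values, p):
    """Percentile with linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100  # fractional position in the sorted list
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)

ages = [20, 22, 24, 26, 28]  # hypothetical ages
result = {p: percentile(ages, p) for p in (20, 50, 75)}
print(result)
```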
Bucket aggregations in ES
Terms Aggregation: groups documents by the terms of a field
- Group Rockets players by age
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"aggsAge": {
"terms": {
"field": "age",
"size": 10
}
}
},
"size": 0
}
order: sorting the buckets
- Group Rockets players by age, with buckets sorted by age from highest to lowest (by key)
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"aggsAge": {
"terms": {
"field": "age",
"size": 10,
"order": {
"_key": "desc"
}
}
}
},
"size": 0
}
- Group Rockets players by age, with buckets sorted by document count from highest to lowest (by document count)
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"aggsAge": {
"terms": {
"field": "age",
"size": 10,
"order": {
"_count": "desc"
}
}
}
},
"size": 0
}
- Group players by team, with buckets sorted by each team's average player age (by a sub-aggregation metric)
POST /nba/_search
{
"aggs": {
"aggsTeamName": {
"terms": {
"field": "teamNameEn",
"size": 30,
"order": {
"avgAge": "desc"
}
},
"aggs": {
"avgAge": {
"avg": {
"field": "age"
}
}
}
}
},
"size": 0
}
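A terms aggregation with a nested metric is conceptually a group by followed by a per-group computation, and the order clause then sorts the buckets by that computed value. A rough sketch on made-up documents (the bucket dictionaries only imitate the shape of an ES response):

```python
from collections import defaultdict

players = [  # hypothetical sample documents
    {"teamNameEn": "Rockets", "age": 30},
    {"teamNameEn": "Rockets", "age": 26},
    {"teamNameEn": "Lakers", "age": 34},
    {"teamNameEn": "Lakers", "age": 28},
    {"teamNameEn": "Warriors", "age": 25},
]

# One bucket per distinct term of teamNameEn.
ages_by_team = defaultdict(list)
for p in players:
    ages_by_team[p["teamNameEn"]].append(p["age"])

# The sub-aggregation avgAge is computed inside each bucket.
buckets = [
    {"key": team, "doc_count": len(ages), "avgAge": sum(ages) / len(ages)}
    for team, ages in ages_by_team.items()
]
# Equivalent of "order": {"avgAge": "desc"}, then keep at most `size` buckets.
buckets.sort(key=lambda b: b["avgAge"], reverse=True)
size = 30
for bucket in buckets[:size]:
    print(bucket)
```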
Filtering buckets
- Lakers and Rockets buckets sorted by team average age (list of exact values)
POST /nba/_search
{
"aggs": {
"aggsTeamName": {
"terms": {
"field": "teamNameEn",
"include": [
"Lakers",
"Rockets",
"Warriors"
],
"exclude": [
"Warriors"
],
"size": 30,
"order": {
"avgAge": "desc"
}
},
"aggs": {
"avgAge": {
"avg": {
"field": "age"
}
}
}
}
},
"size": 0
}
- Lakers and Rockets buckets sorted by team average age (values matched by regular expression)
POST /nba/_search
{
"aggs": {
"aggsTeamName": {
"terms": {
"field": "teamNameEn",
"include": "Lakers|Ro.*|Warriors.*",
"exclude": "Warriors",
"size": 30,
"order": {
"avgAge": "desc"
}
},
"aggs": {
"avgAge": {
"avg": {
"field": "age"
}
}
}
}
},
"size": 0
}
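When both include and exclude are given, include decides which terms are allowed and exclude then takes precedence, which is why Warriors does not appear in the result. A minimal sketch of that rule using Python's `re.fullmatch` as a stand-in for the whole-term regular expression match (the helper name and term list are made up):

```python
import re

def filter_terms(terms, include, exclude):
    """Keep terms that fully match `include`, then drop those that fully match `exclude`."""
    return [
        t for t in terms
        if re.fullmatch(include, t) and not re.fullmatch(exclude, t)
    ]

teams = ["Lakers", "Rockets", "Warriors", "Raptors"]  # hypothetical terms
kept = filter_terms(teams, "Lakers|Ro.*|Warriors.*", "Warriors")
print(kept)  # ['Lakers', 'Rockets']
```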
Range Aggregation: groups by value ranges
- Group NBA players by age: under 20, 20 to 35, and 35 or older
POST /nba/_search
{
"aggs": {
"ageRange": {
"range": {
"field": "age",
"ranges": [
{
"to": 20
},
{
"from": 20,
"to": 35
},
{
"from": 35
}
]
}
}
},
"size": 0
}
- Group NBA players by age: under 20, 20 to 35, and 35 or older (with custom bucket keys)
POST /nba/_search
{
"aggs": {
"ageRange": {
"range": {
"field": "age",
"ranges": [
{
"to": 20,
"key": "A"
},
{
"from": 20,
"to": 35,
"key": "B"
},
{
"from": 35,
"key": "C"
}
]
}
}
},
"size": 0
}
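In a range aggregation the from value is inclusive and the to value is exclusive, so a 20-year-old lands in the second bucket and a 35-year-old in the third. A small sketch of that boundary rule on made-up ages (the function name is hypothetical):

```python
def range_buckets(values, ranges):
    """Count values per range; `from` is inclusive and `to` is exclusive."""
    counts = {r["key"]: 0 for r in ranges}
    for v in values:
        for r in ranges:
            lower_ok = "from" not in r or v >= r["from"]
            upper_ok = "to" not in r or v < r["to"]
            if lower_ok and upper_ok:
                counts[r["key"]] += 1
    return counts

ranges = [
    {"to": 20, "key": "A"},
    {"from": 20, "to": 35, "key": "B"},
    {"from": 35, "key": "C"},
]
ages = [19, 20, 34, 35, 40]  # hypothetical ages
counts = range_buckets(ages, ranges)
print(counts)  # {'A': 1, 'B': 2, 'C': 2}
```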
Date Range Aggregation: groups by date ranges
- Group NBA players by birth month and year
POST /nba/_search
{
"aggs": {
"birthDayRange": {
"date_range": {
"field": "birthDay",
"format": "MM-yyyy",
"ranges": [
{
"to": "01-1989"
},
{
"from": "01-1989",
"to": "01-1999"
},
{
"from": "01-1999",
"to": "01-2009"
},
{
"from": "01-2009"
}
]
}
}
},
"size": 0
}
Date Histogram Aggregation: time-based histogram
- Aggregates by day, month, year, and so on. Supported intervals include year (1y), quarter (1q), month (1M), week (1w), day (1d), hour (1h), minute (1m), and second (1s)
- Group NBA players by birth year
POST /nba/_search
{
"aggs": {
"birthday_aggs": {
"date_histogram": {
"field": "birthDay",
"format": "yyyy",
"calendar_interval": "year"
}
}
},
"size": 0
}
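A yearly date histogram truncates each date to the start of its calendar year and counts the documents per year. A minimal sketch on made-up birthdays, using the year number as the bucket key:

```python
from collections import Counter
from datetime import date

birthdays = [  # hypothetical sample values
    date(1984, 12, 30),
    date(1989, 8, 26),
    date(1989, 3, 14),
    date(1999, 2, 28),
]

# Truncate each date to its calendar year and count documents per year.
per_year = Counter(d.year for d in birthdays)
for year in sorted(per_year):
    print(year, per_year[year])
```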
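query_string queries in ES
Introduction
- With query_string, if you are familiar with Lucene query syntax, you can write the query directly as a Lucene query string. When ES receives the request, its query parser parses the string and generates the corresponding query.
Querying a single field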
POST /nba/_search
{
"query": {
"query_string": {
"default_field": "displayNameEn",
"query": "james OR curry"
}
},
"size": 100
}
POST /nba/_search
{
"query": {
"query_string": {
"default_field": "displayNameEn",
"query": "james AND harden"
}
},
"size": 100
}
Querying multiple fields
POST /nba/_search
{
"query": {
"query_string": {
"fields": [
"displayNameEn",
"teamNameEn"
],
"query": "James AND Rockets"
}
},
"size": 100
}
Reference: personal blog (cyz)