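Overview
Today we'll look at aggregation queries in Elasticsearch (ES). ES groups aggregations into three families: bucket, metrics, and pipeline. Each family contains a dozen or more aggregation types, so we'll walk through a few examples of each; interested readers can consult the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.10/search-aggregations.html. This article is written against ES 7.10.2.
Metrics aggregations
The commonly used metrics aggregations come down to these: Avg, Min, Max, Sum, Cardinality, and Percentile ranks. Let's look at the syntax.
Avg, Min, Max, and Sum all work the same way; you only need to swap the aggregation name. Suppose we have a log index with the following mapping: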
{
"mappings": {
"properties": {
"routePath": {
"type":"keyword"
},
"serverCode": {
"type":"keyword"
},
"taskTime": {
"type":"long"
},
"requestMsg": {
"type":"text"
},
"responseMsg": {
"type":"text"
}
}
}
}
Say we want the average, minimum, and maximum latency of a particular endpoint over the past month. The DSL can be written like this:
GET /log-2023-02/_search
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"term": {
"routePath": "/user/getUserInfo"
}
}
]
}
},
"aggs": {
"avg": {
"avg": {
"field": "taskTime"
}
}
}
}
Response: (output not preserved)
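Since Avg, Min, Max, and Sum differ only in the aggregation name, the request body can be generated rather than hand-written. The following is a minimal sketch, not part of the original post: `build_metrics_query` and the `<metric>_<field>` naming of the aggregations are hypothetical choices made for illustration, while the field names come from the mapping above.

```python
import json

def build_metrics_query(route_path, field="taskTime",
                        metrics=("avg", "min", "max", "sum")):
    """Build a search body that computes several single-value metrics at once.

    Hypothetical helper: each aggregation is named "<metric>_<field>".
    """
    return {
        "size": 0,  # we only want aggregation results, not hits
        "query": {"bool": {"filter": [{"term": {"routePath": route_path}}]}},
        "aggs": {f"{m}_{field}": {m: {"field": field}} for m in metrics},
    }

body = build_metrics_query("/user/getUserInfo")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a `_search` request. If all of these numbers are always needed together, the `stats` aggregation is also worth a look.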
Next, deduplication: counting the distinct endpoint paths:
{
"size": 0,
"aggs": {
"cardinality": {
"cardinality": {
"field": "routePath"
}
}
}
}
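Note that cardinality is approximate. Under the hood it uses a HyperLogLog-based algorithm (HyperLogLog++) that counts distinct values from hashes of the data, which is where the error comes from; on millions of values the error stays within about 5%. The precision_threshold parameter trades memory for accuracy: its maximum is 40000, and the larger the value, the more memory the aggregation uses. When the number of distinct values is below the threshold, the count is expected to be close to exact, so for fewer than 40,000 distinct values, setting the maximum gives near-exact results.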
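To see why hash-based counting is approximate, here is a simplified textbook HyperLogLog in Python. It is only a sketch for intuition: it is not the HyperLogLog++ implementation ES uses, and the function name, the SHA-1 hash, and the register count `p=12` are all assumptions made for this example.

```python
import hashlib
import math

def hll_estimate(values, p=12):
    """Estimate the number of distinct values with 2**p registers (p >= 7)."""
    m = 1 << p
    registers = [0] * m
    for value in values:
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")      # take a 64-bit hash
        index = h >> (64 - p)                      # first p bits pick a register
        rest = h & ((1 << (64 - p)) - 1)           # remaining 64 - p bits
        rank = (64 - p) - rest.bit_length() + 1    # leading zeros in rest, plus 1
        registers[index] = max(registers[index], rank)
    alpha = 0.7213 / (1 + 1.079 / m)               # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        return m * math.log(m / zeros)             # linear counting for small sets
    return raw

# 300,000 made-up paths, of which 100,000 are distinct.
paths = [f"/api/item/{i % 100000}" for i in range(300000)]
estimate = hll_estimate(paths)
exact = len(set(paths))
print(exact, round(estimate))
```

The estimate lands near the exact count but not on it, and raising `p` (more registers, more memory) tightens it. That is the same memory-for-accuracy trade that precision_threshold exposes.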
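Next is Percentile ranks, another commonly used aggregation. Its results are also approximate, but that doesn't get in the way of analyzing the overall picture. For example, to gauge overall system performance we can ask what percentage of requests completed within each of several latency thresholds: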
{
"size": 0,
"aggs": {
"rate": {
"percentile_ranks": {
"field": "taskTime",
"values": [
20,
40,
50,
60
]
}
}
}
}
Result: (output not preserved)
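To make the meaning of percentile_ranks concrete, here is an exact local computation on a small list. This is an illustrative sketch: the latencies are invented, `percentile_ranks` is a hypothetical helper, and ES itself computes these values approximately, so its numbers can differ slightly.

```python
def percentile_ranks(observations, thresholds):
    """Exact percentile rank: percentage of observations <= each threshold."""
    total = len(observations)
    if total == 0:
        return {t: None for t in thresholds}
    return {
        t: 100.0 * sum(1 for x in observations if x <= t) / total
        for t in thresholds
    }

task_times = [12, 18, 20, 25, 33, 41, 47, 52, 58, 75]  # made-up latencies
ranks = percentile_ranks(task_times, [20, 40, 50, 60])
print(ranks)  # {20: 30.0, 40: 50.0, 50: 70.0, 60: 90.0}
```

Read the output as: 30% of requests finished within 20, 50% within 40, and so on.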
Bucket aggregations
The bucket aggregations we use most are grouping (terms), histogram, range, and date-based bucketing. Let's start with terms. Suppose we want the call count for each endpoint:
{
"size": 0,
"aggs": {
"term": {
"terms": {
"field": "routePath"
}
}
}
}
Response:
"aggregations": {
"term": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "/user/getUserInfo",
"doc_count": 5
},
{
"key": "/user/addUser",
"doc_count": 1
},
{
"key": "/user/updateMobile",
"doc_count": 1
},
{
"key": "/user/updateUser",
"doc_count": 1
}
]
}
}
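Conceptually, a terms aggregation is a group-by with a count. The sketch below reproduces the response above on a local list; `terms_buckets` is a hypothetical helper and the input list is reconstructed from the doc counts shown. Ordering by descending count with ties broken by key is an assumption that happens to match the response.

```python
from collections import Counter

def terms_buckets(values, size=10):
    """Group identical values and return the top `size` buckets by count."""
    counts = Counter(values)
    # Sort by doc_count descending, then by key ascending for ties.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered[:size]]

route_paths = (
    ["/user/getUserInfo"] * 5
    + ["/user/updateUser", "/user/addUser", "/user/updateMobile"]
)
buckets = terms_buckets(route_paths)
for bucket in buckets:
    print(bucket)
```

A local count like this is exact. On a multi-shard index ES merges per-shard top lists, which is why the response carries doc_count_error_upper_bound.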
Now a histogram query that buckets endpoint latency with an interval of 1:
{
"size": 0,
"aggs": {
"histogram": {
"histogram": {
"field": "taskTime",
"interval": 1
}
}
}
}
Result:
"aggregations": {
"histogram": {
"buckets": [
{
"key": 20.0,
"doc_count": 2
},
{
"key": 21.0,
"doc_count": 0
},
{
"key": 22.0,
"doc_count": 0
}
]
}
}
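The bucket key of a histogram is the value rounded down to a multiple of the interval, and empty buckets between the smallest and largest key are returned with a count of 0. A minimal local sketch of that rule follows; `histogram_buckets` is a hypothetical helper and the latencies are made up so that the first buckets match the result above.

```python
import math
from collections import Counter

def histogram_buckets(values, interval, offset=0.0):
    """Bucket values by key = floor((v - offset) / interval) * interval + offset."""
    if not values:
        return []
    # Count by integer bucket index to avoid float-equality problems.
    indices = [math.floor((v - offset) / interval) for v in values]
    counts = Counter(indices)
    return [
        {"key": float(i * interval + offset), "doc_count": counts.get(i, 0)}
        for i in range(min(indices), max(indices) + 1)  # include empty buckets
    ]

task_times = [20, 20, 23]  # made-up latencies
buckets = histogram_buckets(task_times, interval=1)
for bucket in buckets:
    print(bucket)
```

With an interval of 1 on latency data this can produce many near-empty buckets; a wider interval such as 10 usually gives a more readable distribution.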
Count endpoint calls per day and present them as a histogram. In 7.x the plain interval parameter is deprecated, so the query uses calendar_interval:
{
"size": 0,
"aggs": {
"date_histogram": {
"date_histogram": {
"field": "requestTime",
"calendar_interval": "day"
}
}
}
}
Query result:
"aggregations": {
"date_histogram": {
"buckets": [
{
"key_as_string": "2023-02-01T00:00:00.000Z",
"key": 1675209600000,
"doc_count": 1
},
{
"key_as_string": "2023-02-02T00:00:00.000Z",
"key": 1675296000000,
"doc_count": 1
},
{
"key_as_string": "2023-02-03T00:00:00.000Z",
"key": 1675382400000,
"doc_count": 1
}
]
}
}
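The key in each bucket is the start of the day in epoch milliseconds (UTC), and key_as_string is the same instant formatted as a date. The sketch below reproduces that shape locally. It is an assumption-laden illustration: `daily_buckets` is a hypothetical helper, the timestamps are invented, and unlike ES it does not fill in empty days.

```python
from collections import Counter
from datetime import datetime, timezone

def daily_buckets(timestamps):
    """Count timezone-aware datetimes per UTC calendar day."""
    counts = Counter()
    for ts in timestamps:
        day = ts.astimezone(timezone.utc).replace(
            hour=0, minute=0, second=0, microsecond=0
        )
        counts[day] += 1
    return [
        {
            "key_as_string": day.strftime("%Y-%m-%dT%H:%M:%S.000Z"),
            "key": int(day.timestamp() * 1000),  # epoch milliseconds
            "doc_count": n,
        }
        for day, n in sorted(counts.items())
    ]

request_times = [  # made-up request timestamps
    datetime(2023, 2, 1, 10, 30, tzinfo=timezone.utc),
    datetime(2023, 2, 2, 8, 0, tzinfo=timezone.utc),
    datetime(2023, 2, 3, 23, 59, tzinfo=timezone.utc),
]
buckets = daily_buckets(request_times)
for bucket in buckets:
    print(bucket)
```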
Pipeline aggregations
A pipeline aggregation runs a further aggregation over the results of a bucket aggregation. Test data, in _bulk request format:
{ "create" : { "_index" : "employees" } }
{ "name" : "Emma","age":32,"job":"Product Manager","gender":"female","salary":35000 }
{ "create" : { "_index" : "employees" } }
{ "name" : "Underwood","age":41,"job":"Dev Manager","gender":"male","salary": 50000}
{ "create" : { "_index" : "employees" } }
{ "name" : "Tran","age":25,"job":"Web Designer","gender":"male","salary":18000 }
{ "create" : { "_index" : "employees" } }
{ "name" : "Rivera","age":26,"job":"Web Designer","gender":"female","salary": 22000}
{ "create" : { "_index" : "employees" } }
{ "name" : "Rose","age":25,"job":"QA","gender":"female","salary":18000 }
{ "create" : { "_index" : "employees" } }
{ "name" : "Lucy","age":31,"job":"QA","gender":"female","salary": 25000}
{ "create" : { "_index" : "employees" } }
{ "name" : "Byrd","age":27,"job":"QA","gender":"male","salary":20000 }
{ "create" : { "_index" : "employees" } }
{ "name" : "Foster","age":27,"job":"Java Programmer","gender":"male","salary": 20000}
{ "create" : { "_index" : "employees" } }
{ "name" : "Gregory","age":32,"job":"Java Programmer","gender":"male","salary":22000 }
{ "create" : { "_index" : "employees" } }
{ "name" : "Bryant","age":20,"job":"Java Programmer","gender":"male","salary": 9000}
{ "create" : { "_index" : "employees" } }
{ "name" : "Jenny","age":36,"job":"Java Programmer","gender":"female","salary":38000 }
{ "create" : { "_index" : "employees" } }
{ "name" : "Mcdonald","age":31,"job":"Java Programmer","gender":"male","salary": 32000}
{ "create" : { "_index" : "employees" } }
{ "name" : "Jonthna","age":30,"job":"Java Programmer","gender":"female","salary":30000 }
{ "create" : { "_index" : "employees" } }
{ "name" : "Marshall","age":32,"job":"Javascript Programmer","gender":"male","salary": 25000}
{ "create" : { "_index" : "employees" } }
{ "name" : "King","age":33,"job":"Java Programmer","gender":"male","salary":28000 }
{ "create" : { "_index" : "employees" } }
{ "name" : "Mccarthy","age":21,"job":"Javascript Programmer","gender":"male","salary": 16000}
{ "create" : { "_index" : "employees" } }
{ "name" : "Goodwin","age":25,"job":"Javascript Programmer","gender":"male","salary": 16000}
{ "create" : { "_index" : "employees" } }
{ "name" : "Catherine","age":29,"job":"Javascript Programmer","gender":"female","salary": 20000}
{ "create" : { "_index" : "employees" } }
{ "name" : "Boone","age":30,"job":"DBA","gender":"male","salary": 30000}
{ "create" : { "_index" : "employees" } }
{ "name" : "Kathy","age":29,"job":"DBA","gender":"female","salary": 20000}
Suppose we want the job with the lowest average salary. A terms aggregation buckets employees by job and computes avg_salary per bucket; min_bucket then runs over the jobs buckets and picks the minimum avg_salary:
{
"size": 0,
"aggs": {
"jobs": {
"terms": {
"field": "job.keyword",
"size": 10
},
"aggs": {
"avg_salary": {
"avg": {
"field": "salary"
}
}
}
},
"min_salary_by_job":{
"min_bucket": {
"buckets_path": "jobs>avg_salary"
}
}
}
}
Result (only some of the jobs buckets are shown):
"aggregations": {
"jobs": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Java Programmer",
"doc_count": 7,
"avg_salary": {
"value": 25571.428571428572
}
},
{
"key": "Javascript Programmer",
"doc_count": 4,
"avg_salary": {
"value": 19250.0
}
},
{
"key": "DBA",
"doc_count": 2,
"avg_salary": {
"value": 25000.0
}
},
{
"key": "Product Manager",
"doc_count": 1,
"avg_salary": {
"value": 35000.0
}
}
]
},
"min_salary_by_job": {
"value": 19250.0,
"keys": [
"Javascript Programmer"
]
}
}
There are also avg_bucket, which averages the bucket results; max_bucket, which takes their maximum; percentiles_bucket, which computes percentiles over them; and so on.
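The two-stage logic of a sibling pipeline aggregation (first bucket and average, then aggregate over the bucket results) can be checked locally against the 20 sample employees from this section. This is a sketch for intuition, not how ES executes it; the helper name `avg_salary_by_job` and the variable names are made up for illustration.

```python
from collections import defaultdict

# (job, salary) pairs copied from the sample employees data.
employees = [
    ("Product Manager", 35000), ("Dev Manager", 50000),
    ("Web Designer", 18000), ("Web Designer", 22000),
    ("QA", 18000), ("QA", 25000), ("QA", 20000),
    ("Java Programmer", 20000), ("Java Programmer", 22000),
    ("Java Programmer", 9000), ("Java Programmer", 38000),
    ("Java Programmer", 32000), ("Java Programmer", 30000),
    ("Javascript Programmer", 25000), ("Java Programmer", 28000),
    ("Javascript Programmer", 16000), ("Javascript Programmer", 16000),
    ("Javascript Programmer", 20000), ("DBA", 30000), ("DBA", 20000),
]

def avg_salary_by_job(rows):
    """Stage 1: bucket by job and average the salary in each bucket."""
    groups = defaultdict(list)
    for job, salary in rows:
        groups[job].append(salary)
    return {job: sum(s) / len(s) for job, s in groups.items()}

averages = avg_salary_by_job(employees)

# Stage 2: aggregate over the per-bucket averages.
min_bucket = min(averages.items(), key=lambda kv: kv[1])
max_bucket = max(averages.items(), key=lambda kv: kv[1])
avg_bucket = sum(averages.values()) / len(averages)

print(min_bucket)  # ('Javascript Programmer', 19250.0)
print(max_bucket)  # ('Dev Manager', 50000.0)
print(round(avg_bucket, 2))
```

The minimum agrees with the min_salary_by_job value in the response above.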
Summary
That covers most of the commonly used aggregations. ES supports far more than these; interested readers can explore the rest of the syntax in the official reference: https://www.elastic.co/guide/en/elasticsearch/reference/7.10/index.html
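Other articles in the Elasticsearch series
- Elasticsearch series 1: Using index templates
- Elasticsearch series 2: What to do when queries are still slow at volume after introducing index templates
- Elasticsearch series 3: Common query syntax
- Elasticsearch series 4: Routine cluster operations
- Elasticsearch series 5: Cluster backup and restore
- Elasticsearch series 6: Reindexing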