博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
【07】把 Elasticsearch 当数据库使:聚合后排序
阅读量:6306 次
发布时间:2019-06-22

本文共 5188 字,大约阅读时间需要 17 分钟。

使用 可以用 SQL 进行 elasticsearch 的查询。有的时候分桶聚合之后会产生很多的桶,我们只对其中部分的桶关心。最简单的办法就是排序之后然后取前几位的结果。

ORDER BY _term

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200SELECT ipo_year, COUNT(*) FROM symbol GROUP BY ipo_year ORDER BY ipo_year LIMIT 2EOF
{"COUNT(*)": 4, "ipo_year": 1972}{"COUNT(*)": 1, "ipo_year": 1973}

Elasticsearch

{  "aggs": {    "ipo_year": {      "terms": {        "field": "ipo_year",         "order": [          {            "_term": "asc"          }        ],         "size": 2      },       "aggs": {}    }  },   "size": 0}

因为 ipo_year 是 GROUP BY 的字段,所以按这个排序用_term指代。

{  "hits": {    "hits": [],     "total": 6714,     "max_score": 0.0  },   "_shards": {    "successful": 1,     "failed": 0,     "total": 1  },   "took": 3,   "aggregations": {    "ipo_year": {      "buckets": [        {          "key": 1972,           "doc_count": 4        },         {          "key": 1973,           "doc_count": 1        }      ],       "sum_other_doc_count": 2893,       "doc_count_error_upper_bound": 0    }  },   "timed_out": false}

ORDER BY _count

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200SELECT ipo_year, COUNT(*) AS ipo_count FROM symbol GROUP BY ipo_year ORDER BY ipo_count LIMIT 2EOF
{"ipo_count": 1, "ipo_year": 1973}{"ipo_count": 2, "ipo_year": 1980}

Elasticsearch

{  "aggs": {    "ipo_year": {      "terms": {        "field": "ipo_year",         "order": [          {            "_count": "asc"          }        ],         "size": 2      },       "aggs": {}    }  },   "size": 0}
{  "hits": {    "hits": [],     "total": 6714,     "max_score": 0.0  },   "_shards": {    "successful": 1,     "failed": 0,     "total": 1  },   "took": 2,   "aggregations": {    "ipo_year": {      "buckets": [        {          "key": 1973,           "doc_count": 1        },         {          "key": 1980,           "doc_count": 2        }      ],       "sum_other_doc_count": 2895,       "doc_count_error_upper_bound": -1    }  },   "timed_out": false}

ORDER BY 指标

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200    SELECT ipo_year, MAX(market_cap) AS max_market_cap FROM symbol     GROUP BY ipo_year ORDER BY max_market_cap LIMIT 2EOF
{"max_market_cap": 826830000.0, "ipo_year": 1982}{"max_market_cap": 847180000.0, "ipo_year": 2016}

Elasticsearch

{  "aggs": {    "ipo_year": {      "terms": {        "field": "ipo_year",         "order": [          {            "max_market_cap": "asc"          }        ],         "size": 2      },       "aggs": {        "max_market_cap": {          "max": {            "field": "market_cap"          }        }      }    }  },   "size": 0}
{  "hits": {    "hits": [],     "total": 6714,     "max_score": 0.0  },   "_shards": {    "successful": 1,     "failed": 0,     "total": 1  },   "took": 20,   "aggregations": {    "ipo_year": {      "buckets": [        {          "max_market_cap": {            "value": 826830000.0          },           "key": 1982,           "doc_count": 4        },         {          "max_market_cap": {            "value": 847180000.0          },           "key": 2016,           "doc_count": 6        }      ],       "sum_other_doc_count": 2888,       "doc_count_error_upper_bound": -1    }  },   "timed_out": false}

HISTOGRAM 和 ORDER BY

除了 terms aggregation,其他 aggregation 也支持 order by 但是并不完善。比如 histogram aggregation 支持 sort 但是并不支持 size (也就是可以ORDER BY 但是不能 LIMIT)。官方有计划增加一个通用的支持 LIMIT 的方式,不过还没有实现:

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200    SELECT ipo_year_range, MAX(market_cap) AS max_market_cap FROM symbol     GROUP BY histogram(ipo_year, 10) AS ipo_year_range ORDER BY ipo_year_range EOF
{"ipo_year_range": 1970, "max_market_cap": 18370000000.0}{"ipo_year_range": 1980, "max_market_cap": 522690000000.0}{"ipo_year_range": 1990, "max_market_cap": 230940000000.0}{"ipo_year_range": 2000, "max_market_cap": 470490000000.0}{"ipo_year_range": 2010, "max_market_cap": 287470000000.0}

Elasticsearch

{  "aggs": {    "ipo_year_range": {      "aggs": {        "max_market_cap": {          "max": {            "field": "market_cap"          }        }      },       "histogram": {        "field": "ipo_year",         "interval": 10,         "order": {          "_key": "asc"        }      }    }  },   "size": 0}
{  "hits": {    "hits": [],     "total": 6714,     "max_score": 0.0  },   "_shards": {    "successful": 1,     "failed": 0,     "total": 1  },   "took": 2,   "aggregations": {    "ipo_year_range": {      "buckets": [        {          "max_market_cap": {            "value": 18370000000.0          },           "key": 1970,           "doc_count": 5        },         {          "max_market_cap": {            "value": 522690000000.0          },           "key": 1980,           "doc_count": 155        },         {          "max_market_cap": {            "value": 230940000000.0          },           "key": 1990,           "doc_count": 598        },         {          "max_market_cap": {            "value": 470490000000.0          },           "key": 2000,           "doc_count": 745        },         {          "max_market_cap": {            "value": 287470000000.0          },           "key": 2010,           "doc_count": 1395        }      ]    }  },   "timed_out": false}

转载地址:http://aqixa.baihongyu.com/

你可能感兴趣的文章
Vue--实例方法 / 生命周期
查看>>
【剑指offer纪念版】-- 面试题目录
查看>>
在iphone下遇到的一些兼容性问题
查看>>
mybatis(5) -自定义sql拦截统计执行耗时
查看>>
U盘中了蠕虫病毒后清除蠕虫并恢复文件的方法
查看>>
Linux 文件的权限设置
查看>>
04、navicat无法连接阿里云mysql
查看>>
查看JVM使用的什么垃圾收集器
查看>>
讲述下 :LVM逻辑卷管理遇到的问题
查看>>
jquery实用博客
查看>>
ORACLE日期时间函数大全(一)
查看>>
wpf通用窗体模板
查看>>
快速切换到主线程更新UI的几种方法
查看>>
jsp页面
查看>>
JS中ajax请求提交form表单
查看>>
list集合常用方法
查看>>
shell中的数组
查看>>
linux的usr目录的全称是什么?
查看>>
Java 8 新特性
查看>>
Kibana功能一览
查看>>