统计Elasticsearch中月每天索引量的脚本

432次阅读

共计 3103 个字符，预计需要花费 8 分钟才能阅读完成。

随着业务量的不断上升，最近一段时间需要对生产环境中的 elasticsearch 集群中的历史索引数据做迁移，而在做迁移前需要对被迁移的 elasticsearch 索引数据做统计用于迁移后的验证统计，所以就写了一个脚本用于 es 数据中查询历史索引的量生成报表文件，而在其中有使用过 jq 工具用于取数，jq 的介绍可以见本文最后。

#!/bin/bash
#es_count_report.sh
#used for elasticsearch monthly numbers index
#you must install jq and curl
#writer jim
#2017.09.19
#Position parameter judgment
datetime=$(date +”%Y%m”)

if [$# -lt 1];then
echo “Please enter the date ‘year month'”
echo “ex> $0 ${datetime}”
exit 1
fi

if [$# -gt 1]; then
echo “The input host address are too much”
echo “ex> $0 ${datetime}”
exit 1
fi
# 这里在 elasticsearch 中取数时有用到 curl 和 jq
rpm -qa | grep jq && rpm -qa | grep curl
if [$? -ne 0];then
yum -y install jq curl
fi

es_ip=”192.168.2.200″
es_port=”9200″
monthtime=$1
#elasticsearch 的相关信息及传入的时间
data_index=”data-${monthtime}”
index_name_all=$(curl -s “http://${es_ip}:${es_port}/_cat/indices?v” | grep ${data_index} | awk ‘{print $3}’ | sort)
report_file=”$(pwd)/index_num_”${monthtime}”.txt”
cat /dev/null > $report_file
# 至空生成一个新文件用于记录
for i in $index_name_all
do
index_num=$(curl -s -XGET “http://${es_ip}:${es_port}/${i}/poll/_search/?pretty” -d ‘{“_source”:true,”size”: 0}’|jq ‘.hits.total’) && echo “$i:$index_num” >> $report_file
done

总之在平时可以根据 elasticsearch 的 api 接口实现各种不同的数据统计

JSON 是前端编程经常用到的格式，对于 PHP 或者 Python，解析 JSON 都不是什么大事，尤其是 PHP 的 json_encode 和 json_decode, 干的相当的漂亮。Linux 下也有处理处理 JSON 的神器：jq。
对于 JSON 格式而言，jq 就像 sed/awk/grep 这些神器一样的方便，而也，jq 没有乱七八糟的依赖，只需要一个 binary 文件 jq，就足矣。下面我们看下 jq 的使用。
1 格式化 JSON

manu@manu:~/code/php/json$ cat json_raw.txt
{“name”:“Google”,“location”:{“street”:“1600 Amphitheatre Parkway”,“city”:“Mountain View”,“state”:“California”,“country”:“US”},“employees”:[{“name”:“Michael”,“division”:“Engineering”},{“name”:“Laura”,“division”:“HR”},{“name”:“Elise”,“division”:“Marketing”}]}

上面的 JSON 是 PHP json_encode 之后，echo 出来的字符串，很明显，可读性太差。前一阵子写文档，需要将前后段 JSON 写入文档，我当时是用是网上的 JSON 格式化工具做的。事实上，jq 就可以检查 JSON 的合法性，并把 JSON 格式化成更友好更可读的格式：

cat json_raw.txt | jq .

看到上图，将一团乱麻的 JSON 格式化成个更可读的形式。其实背后另外检查了 JSON 的合法性。如果 JSON 不合法，jq . 会报错。我故意写个错误的 JSON：

manu@manu:~/code/php/json$ cat json_err.txt
{“name”:“Google”,“location”:{“street”:“1600 Amphitheatre Parkway”,“city”:“Mountain View”,“state”:“California”,“country”:“US”},“employees”:[{“name”:“Michael”,“division”:“Engineering”}{“name”:“Laura”,“division”:“HR”},{“name”:“Elise”,“division”:“Marketing”}]}