Getting error rate using one elastic query | Pipeline Aggregations

Yatin Gupta
2 min read · Jul 8, 2022

The ELK stack is a widely used monitoring tool. We can also use it to monitor API performance by pushing metrics such as status code, API endpoint, and response time.

With basic Elastic queries, we can easily fetch documents or simple metrics such as the number of 5xx responses returned by our APIs.

Example: Elastic query to fetch documents from the last 24 hours
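
A minimal sketch of such a query, written as a Python dict that serialises to the JSON body of a `_search` request. The field name `timestamp` is an assumption; substitute whatever date field your documents carry.

```python
import json


def last_24h_query(timestamp_field="timestamp"):
    """Build a search body matching documents from the last 24 hours.

    `timestamp_field` is a hypothetical field name used for illustration.
    """
    return {
        "query": {
            "range": {
                # Date math: "now-24h" is evaluated by Elasticsearch at query time.
                timestamp_field: {"gte": "now-24h", "lte": "now"}
            }
        }
    }


# Print the JSON that would be sent as the request body.
print(json.dumps(last_24h_query(), indent=2))
```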

Use Case: Fetching the error rate of each API over the last 24 hours from Elastic

Data Pushed to elastic: API endpoint, status code, timestamp of an API call
Error Rate of an API: 5xx count of API / Total count of API
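
As a quick illustration of the arithmetic, taking the error rate as the share of calls that returned a 5xx status, here is a small sketch. The function name and the zero-call guard are my own additions for illustration.

```python
def error_rate(error_count, total_count):
    """Fraction of API calls that returned a 5xx status."""
    if total_count == 0:
        # No calls recorded: treat the error rate as zero rather than dividing by zero.
        return 0.0
    return error_count / total_count


# Example with made-up numbers: 5 failed calls out of 200.
print(error_rate(5, 200))
```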

Now, we have to write one Elastic query to fetch the error rate of each API in the last 24 hours. This involves filtering the data to the last 24 hours, bucketing by API, fetching the total count and the 5xx count for each bucket, and finally dividing the 5xx count by the total count for each API.

We will use the concept of Pipeline Aggregations here.

"Pipeline aggregations work on the outputs produced from other aggregations rather than from document sets, adding information to the output tree."

In other words, each step works on the output of the previous step.
Elastic queries can even do arithmetic on those outputs using a “Painless” script.

Steps:

  1. Filter for the last 24 hours
  2. Create buckets or aggregations for each endpoint
  3. Aggregate all documents inside the buckets created above for total count and 5xx count
  4. Finally, calculate the error rate for each of the buckets created above
  5. [Bonus] Sort in descending order of error rate

This is how the final query looks
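
The original query is not reproduced here, so the following is a sketch of one way to implement the five steps, again as a Python dict that serialises to a `_search` request body. The field names (`timestamp`, `endpoint`, `status_code`) and the aggregation names (`per_endpoint`, `error_count`, `error_rate`, `sort_by_error_rate`) are assumptions chosen for illustration, and `endpoint` is assumed to be a keyword field.

```python
import json


def error_rate_query(timestamp_field="timestamp",
                     endpoint_field="endpoint",
                     status_field="status_code",
                     max_endpoints=100):
    """Build a search body that returns the 5xx error rate per endpoint.

    All field names are hypothetical defaults; adjust them to your mapping.
    """
    return {
        "size": 0,  # only the aggregations are needed, not the hits
        # Step 1: filter for the last 24 hours
        "query": {
            "range": {timestamp_field: {"gte": "now-24h", "lte": "now"}}
        },
        "aggs": {
            # Step 2: one bucket per endpoint
            "per_endpoint": {
                "terms": {"field": endpoint_field, "size": max_endpoints},
                "aggs": {
                    # Step 3: count 5xx documents inside each bucket;
                    # the bucket's own doc count serves as the total
                    "error_count": {
                        "filter": {
                            "range": {status_field: {"gte": 500, "lte": 599}}
                        }
                    },
                    # Step 4: pipeline aggregation dividing errors by total
                    "error_rate": {
                        "bucket_script": {
                            "buckets_path": {
                                "errors": "error_count>_count",
                                "total": "_count",
                            },
                            "script": "params.errors / params.total",
                        }
                    },
                    # Step 5 (bonus): sort buckets by error rate, descending
                    "sort_by_error_rate": {
                        "bucket_sort": {
                            "sort": [{"error_rate": {"order": "desc"}}]
                        }
                    },
                },
            }
        },
    }


print(json.dumps(error_rate_query(), indent=2))
```

Note that `bucket_sort` only reorders the buckets that the `terms` aggregation returns, so `max_endpoints` should be at least the number of distinct endpoints if every API is to be ranked.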

The response will have aggregations like this
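
The original response is not reproduced here. As a sketch, assuming the query names its terms aggregation `per_endpoint` and its pipeline aggregation `error_rate` (both hypothetical names), each bucket carries the endpoint as `key`, the total as `doc_count`, and the computed rate under `error_rate.value`. The sample below uses made-up endpoints and numbers purely to show the shape.

```python
def extract_error_rates(response, agg_name="per_endpoint", rate_name="error_rate"):
    """Return (endpoint, error_rate) pairs in the order the buckets arrive."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(bucket["key"], bucket[rate_name]["value"]) for bucket in buckets]


# Illustrative response fragment with invented values, not real output.
sample_response = {
    "aggregations": {
        "per_endpoint": {
            "buckets": [
                {
                    "key": "/checkout",
                    "doc_count": 200,
                    "error_count": {"doc_count": 10},
                    "error_rate": {"value": 0.05},
                },
                {
                    "key": "/login",
                    "doc_count": 1000,
                    "error_count": {"doc_count": 5},
                    "error_rate": {"value": 0.005},
                },
            ]
        }
    }
}

for endpoint, rate in extract_error_rates(sample_response):
    print(f"{endpoint}: {rate:.2%}")
```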

This is useful in scenarios where we don't want to write custom scripts on top of Elastic data.
For example, using OpenSearch from AWS, we can push the error rate, custom metrics, etc. directly to Slack/SNS when any API breaches our threshold, or for daily reporting. The best part is that all of this can be done from OpenSearch without writing any custom code or spinning up a Lambda function or EC2 instance!

There might be a more efficient way to do the same thing in Elastic, but for now this is what I was able to figure out.

Thanks for reading!

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html
