Optimizing Elasticsearch – Part 2: Index Lifecycle Management

In the previous blog post, “Optimizing Elasticsearch for security log collection – Part 1: reducing the number of shards”, we saw one way to recover a cluster suffering from the “too many shards syndrome” by merging indices that were too small. In this article, we’ll see how we can rely on the latest Elasticsearch features to keep control over the number and size of our indices and shards. We will also use the available features to manage our indices throughout their lifecycle – from creation to deletion.

Index Lifecycle Management

Index Lifecycle Management (ILM) is a new feature introduced in Elasticsearch 6.7.0. It is part of X-Pack and free to use under the basic license[1].

This feature mainly aims at managing indices for time series data: it allows us to define the different stages and actions an index goes through, from ingestion to deletion. It automatically handles the rollover of indices based on the requirements we define (time and/or size).

This feature replaces Curator, the component that previously took care of managing this.

Quick overview

Index Lifecycle Management defines four different stages in which indices can reside:

  • Hot: indices in the hot stage are sensitive indices used for data ingestion and usually bound to a strong SLA. Ingest and search performance of hot indices is important. To meet these performance requirements, hot indices are often stored on SSD storage to speed up search and ingestion.
  • Warm: indices in the warm stage are usually not used for ingestion anymore, but they’re still likely to be queried, albeit less often than hot indices. Search performance is still important, whereas ingestion only happens on hot indices.
  • Cold: indices in the cold stage are unlikely to be queried, but it’s too early to delete the data; for example, in the case of security monitoring, we might still need old data to support a security incident investigation. Cold indices need to remain on disk, but their presence should not weigh on the overall cluster performance. Cold indices are usually frozen; we’ll see why and how later.
  • Delete: indices in the delete stage are about to be deleted. As simple as that.

An index will start its life in the hot stage, then move to warm, cold and finally the delete stage; this is the index lifecycle. Next, we need to define the conditions for an index to move from one stage to another. The set of conditions defining the lifecycle of an index is called an index policy.

To make the whole rollover mechanism work efficiently, we’ll need to define:

  • An index policy specifying which conditions are required to move from one stage to another;
  • A template mapping specifying the initial settings of the indices newly created by the rollover process;
  • An ingestion alias pointing to the freshly created hot index used for ingestion.

Of course, as two indices cannot have the same name, we need a way to keep ingesting real-time data into the new index created by the rollover mechanism. We’ll use aliases for that purpose.

Index policy

Before going into the index policy definition, we need to define our SLA, use cases and requirements. Every index in every environment can have its own requirements in terms of performance and retention time. Here, we will use very simple assumptions and rules that try to keep the cluster health and performance at their best, for example by avoiding the “too many shards” cluster syndrome.

We consider this default policy applicable to a single node cluster or a three node cluster. A bigger cluster will require a more complete policy, specifying which nodes host the shards of a given stage. We can think of a cluster where SSD servers host the hot shards and spinning disks host the others. The SSD servers will then mainly be used for ingestion and fast searches, which does make sense in larger environments. For our environment, we’ll only assume we have a small cluster of 3 nodes with homogeneous hardware. In this case, there is no need to specify a shard allocation strategy [2]; a sketch of what such an allocation-aware policy could look like is shown at the end of this section. The only thing we need in our scenario is a very simple and efficient index policy:

  • We want to keep an index for log ingestion until it reaches a size of 90GB or becomes older than 14 days.[3] Those ingestion indices will have 3 shards, so every shard will have a maximum size of 30GB, which is quite a good trade-off between performance and overhead[4]. Only after reaching either the size (90GB) or the age (14 days) requirement will the index be rolled over and start progressing through the rest of its lifecycle.
  • A hot index older than 14 days will enter the warm phase. As warm indices won’t be used to ingest logs anymore, we can safely merge their segments together to speed up searches a bit. This has the side effect of marking your warm indices as read-only [5]. At this point we also have the possibility to reduce the number of shards [6]; the result is the same as re-indexing with fewer shards. As the number of shards of the destination index must be a factor of the number of shards of the source index, we decided not to shrink our shards. However, it can be a very interesting operation to perform in big environments where requirements call for it. Yet another interesting option is to increase or decrease the number of replicas; we’ll make sure we have no replicas for our warm indices in order to free up some disk space.
  • A warm index older than 30 days will be considered cold. In our strategy, we will freeze all indices older than 30 days, freeing up memory for the in-memory shards of hot and warm indices [7]. Searching frozen indices has a significant impact on Elasticsearch performance; as this is not likely to happen often in most use cases, we consider this acceptable.
  • A cold index older than 210 days will be marked to be deleted.

An important aspect to take into account here is how Elasticsearch evaluates the index age. By default, Elasticsearch computes the age based on the creation date. However, if the index has been rolled over (meaning it has been replaced by a new index for ingestion), it uses the rollover date to compute the index’s age. As a result, in our configuration, an index can enter the cold phase between 30 and 44 days after its creation, depending on when it was rolled over. This guarantees that data younger than 30 days will never enter the cold phase and that data younger than 210 days will never be deleted. This complexity is actually required, as a single index contains data over a window of at most 14 days in our case.

Lifecycle of indices, assuming they are rolled over daily because of the size requirement (90GB).

Every lifecycle policy is registered under a specific name; we will use this to assign different policies to indices that ingest data at different speeds or that have different requirements and SLAs.

In Elasticsearch, this policy is applied with the following request (in this case, registering the policy under the name “brofilter”):

PUT _ilm/policy/brofilter
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "14d",
            "max_size": "90G"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "14d",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          },
          "allocate": {
            "number_of_replicas": 0
          },
          "set_priority": {
            "priority": 50
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "freeze": {}
        }
      },
      "delete": {
        "min_age": "210d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
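Once registered, the policy can be checked at any time to make sure Elasticsearch stored what we expect:

GET _ilm/policy/brofilter

The response contains the stored policy definition (together with its version), which is handy when iterating on the conditions.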

With such a strategy, you will note that:

  • Indices that reach 90GB but are younger than 14 days will remain in the hot phase; they will just be rolled over so a new index takes over ingestion, keeping an acceptable shard size.
  • Indices will be deleted 210 days after they have been rolled over. We may therefore keep data up to 224 days old if an index contains data over a window of 14 days.
  • The priority is used when recovering shards, for example after a node restart: the higher the priority, the sooner the index’s shards are recovered.
  • The initial number of shards and replicas for newly created indices is set in the index template.
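For the larger, tiered clusters mentioned earlier, where dedicated nodes host warm data, the warm phase could additionally relocate shards through the allocate action. The sketch below is only an illustration under assumptions that are not part of our setup: the policy name brofilter-tiered and the node attribute box_type (which would have to be defined as node.attr.box_type in elasticsearch.yml on each node) are hypothetical.

PUT _ilm/policy/brofilter-tiered
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "14d",
        "actions": {
          "allocate": {
            "require": {
              "box_type": "warm"
            },
            "number_of_replicas": 0
          }
        }
      }
    }
  }
}

Our homogeneous 3 node cluster doesn’t need this, which is why the brofilter policy above only adjusts replicas in the warm phase.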

Template mapping

A template mapping is used to give a default configuration to newly created indices matching a specific pattern. We will use this feature to give hot indices their initial settings. We will specify in the template mapping:

  • The initial number of shards;
  • The initial number of replicas;
  • The applied lifecycle policy;
  • The alias used for ingestion.

You need to take into account that the number of shards and replicas can be changed during the lifecycle of the index; these settings actually target indices in their hot stage.

The mapping is then very simple and can be set as follows:

PUT _template/brofilter
{
  "index_patterns": [
    "logstash-eagleeye-brofilter-*"
  ],
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "lifecycle.name": "brofilter",
      "lifecycle.rollover_alias": "logstash-eagleeye-brofilter"
    }
  }
}

This states that all newly created indices matching logstash-eagleeye-brofilter-* will:

  • Have 3 shards, as we want in our hot stage;
  • Have 1 replica, which is particularly useful in a 3 node cluster;
  • Follow the brofilter policy previously defined;
  • Roll over through the ingestion alias named logstash-eagleeye-brofilter.

We didn’t provide any field mapping, but this would definitely be the place and the time to do that in your own environment!
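Before creating the first index, it doesn’t hurt to verify that the template has been stored as intended:

GET _template/brofilter

The response should echo the index pattern, the number of shards and replicas, and the two lifecycle settings defined above.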

Creating the first index

As the settings are now committed into Elasticsearch, the very first index can be created manually, together with its ingestion alias, in order to launch the machinery.

PUT logstash-eagleeye-brofilter-000001
{
  "aliases": {
    "logstash-eagleeye-brofiler": {
      "is_write_index": true
    }
  }
}

This will trigger the creation of the very first index, logstash-eagleeye-brofilter-000001, linked to the ingestion alias logstash-eagleeye-brofilter. This alias name must be used in the Logstash configuration in order to make sure we always ingest into the current hot index. Not using the ingestion alias to ingest logs would undo the added value of the rollover strategy.
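To double-check that the alias points to the expected write index, the alias can simply be inspected (using the names from our example):

GET _alias/logstash-eagleeye-brofilter

The response should list logstash-eagleeye-brofilter-000001 with "is_write_index": true.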

From now on, the first index exists and will follow the index policy previously defined. The template mapping will automatically be applied to every new index created by the rollover, and the ingestion alias will automatically point its writes to the newly created index. We don’t have to do any of this manually; we just keep ingesting into the ingestion alias, and all the rest is managed by Elasticsearch.

By default, every new index is created with an incremented numeric suffix. We have decided that the index prefix will be the same as the ingestion alias, so new indices will go from logstash-eagleeye-brofilter-000001 to logstash-eagleeye-brofilter-000002 and so on. Note that it is possible to modify this behaviour and use a timestamp in the name instead [8]; for our example, we decided not to use this.
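For completeness, if you do prefer the date-based naming from [8], the very first index could be bootstrapped with a date math name instead; this is only a sketch of that alternative (the URL-encoded name below stands for <logstash-eagleeye-brofilter-{now/d}-000001>), not something we use in our setup:

PUT %3Clogstash-eagleeye-brofilter-%7Bnow%2Fd%7D-000001%3E
{
  "aliases": {
    "logstash-eagleeye-brofilter": {
      "is_write_index": true
    }
  }
}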

Because every new ingestion alias needs to be explicitly created before ingestion, and because index roll-over is automatically managed by Elasticsearch, we can safely disable automatic index creation.
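Disabling automatic index creation is done through the action.auto_create_index cluster setting. A minimal sketch, assuming no other component of your stack relies on indices being created on the fly:

PUT _cluster/settings
{
  "persistent": {
    "action.auto_create_index": "false"
  }
}

Instead of false, this setting also accepts a comma-separated list of allowed index patterns if some indices still need to be auto-created.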

Troubleshooting issues with Index Lifecycle Management

Each piece of software comes with its own caveats and potential headaches. Elasticsearch Index Lifecycle Management is no exception to the rule, and we can still run into issues from time to time. To troubleshoot these issues, the “explain” API can be used to get insights into the lifecycle state of an index:

GET logstash-eagleeye*/_ilm/explain

That command returns the lifecycle status of every index matching logstash-eagleeye*.
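Keep in mind that ILM only evaluates policies periodically, so phase transitions are not instantaneous. When testing a policy, it can help to temporarily lower the poll interval; the 1 minute value below is just an example for a test environment and should not be kept in production:

PUT _cluster/settings
{
  "transient": {
    "indices.lifecycle.poll_interval": "1m"
  }
}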

In our own environment, we faced an issue where the lifecycle.rollover_alias defined in the template mapping was not the same as the alias defined when creating the first index. The index lifecycle policy was then unable to roll over and bind the ingestion alias to the new index. In order to fix this, we had to do the following:

  • Modify the lifecycle settings of the index in error (see the sketch after this list);
  • Modify the template mapping to avoid the issue in the future;
  • Trigger the index policy again.
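A sketch of that first step, assuming the correct alias already exists on the index but the index setting points to the wrong name, is to update the index’s rollover alias:

PUT logstash-eagleeye-brofilter-000001/_settings
{
  "index.lifecycle.rollover_alias": "logstash-eagleeye-brofilter"
}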

Once the issue is solved, you have to tell Index Lifecycle Management to process the index again by running the following command:

POST logstash-eagleeye-brofilter-000001/_ilm/retry

We hope this blog post shed some light on how indices can be managed within Elasticsearch and on the principles that govern these concepts.

In the next blog post we will explain how you can deploy an Index Lifecycle Policy through Ansible for an automatic way of managing all of this easily – stay tuned!


[1] https://www.elastic.co/guide/en/elasticsearch/reference/6.7/index-lifecycle-management.html

[2] https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-allocation.html

[3] Specifying a maximum age is very important if we want to remove an index after a certain time. If we don’t specify a maximum age, we risk deleting recent data if the same hot index has been used to ingest data for a long time.

[4] Mainly to avoid the “too many shards syndrome” as much as possible.

[5] “Force merge should only be called against read-only indices. Running force merge against a read-write index can cause very large segments to be produced (>5Gb per segment), and the merge policy will never consider it for merging again until it mostly consists of deleted docs. This can cause very large segments to remain in the shards.” – https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-forcemerge.html

[6] In the warm stage, you can specify the shrink action to reduce the number of shards of your index. https://www.elastic.co/guide/en/elasticsearch/reference/current/_actions.html#ilm-shrink-action

[7] https://www.elastic.co/guide/en/elasticsearch/reference/master/frozen-indices.html

[8] https://www.elastic.co/guide/en/elasticsearch/reference/current/date-math-index-names.html

Optimizing Elasticsearch for security log collection – part 1: reducing the number of shards

Nowadays, log collection for security monitoring is about indexing, searching and data lakes; this is why at NVISO we use Elasticsearch for our threat hunting activities. Collecting, aggregating and searching data at very high speed is challenging in big environments, especially when the flow is bigger than expected. At NVISO, we are constantly seeking enhancements and optimizations to increase our performance, and we especially try to take maximum benefit of the latest proven technology available.

In this and the following blog posts, we will show you the performance issues we suffered when using Elasticsearch for log collection, and how we use the latest features of Elasticsearch to overcome these performance concerns.

Our concerns

Many Elasticsearch environments suffer from the “too many shards” syndrome. Every shard is maintained in memory, and the more shards you have, the more memory you need. Handling too many shards in memory decreases performance because the overhead implied by the Java garbage collector becomes too high.

By default, every index in Elasticsearch is created with 5 primary shards. This can be fine for some environments, but it can be a performance nightmare if that number doesn’t fit your use cases, which is often the case.

The following Elasticsearch log lines are a sign of a “too many shards” syndrome:

[2018-01-31T18:30:21,491][INFO ][o.e.m.j.JvmGcMonitorService] [server1] [gc][6178] overhead, spent [382ms] collecting in the last [1s]
[2018-01-31T18:30:22,507][WARN ][o.e.m.j.JvmGcMonitorService] [server1] [gc][6179] overhead, spent [539ms] collecting in the last [1s]
[2018-01-31T18:35:13,640][INFO ][o.e.m.j.JvmGcMonitorService] [server1] [gc][6467] overhead, spent [361ms] collecting in the last [1s]
[2018-01-31T18:35:14,656][INFO ][o.e.m.j.JvmGcMonitorService] [server1] [gc][6468] overhead, spent [503ms] collecting in the last [1s]
[2018-01-31T18:35:15,672][INFO ][o.e.m.j.JvmGcMonitorService] [server1] [gc][6469] overhead, spent [427ms] collecting in the last [1s]

Reference: Overhead and heap issues (Elastic.co) – https://discuss.elastic.co/t/overhead-and-heap-issues/117888

As time series data grows indefinitely, we cannot use a single index for all the data we want to ingest. At some point, the index becomes too large to be queried and we eventually need to create a new one. For years, the best practice was to create a new index every day for new incoming data. With such a strategy, we have faced clusters showing very bad performance because there were too many indices with too many shards.

We used to work mostly with a single machine for simplicity, and it did work, as the amount of data we ingested together with our retention time did not justify adding more hardware. To stay in line with the best practices dictated by Elasticsearch, we decided in the past to deploy four different instances of Elasticsearch on the same machine, in the form of Docker containers. We thought that adding more instances until the memory was full would increase the overall performance. However, we were wrong[1].

Not all of our nodes had the same role: one was a coordinating node, whereas all the others took the other roles. The coordinating node consumed 10GB of memory whereas the others consumed 32GB where possible, making the whole cluster consume around 106GB of memory. Moreover, in some environments where 106GB of memory wasn’t an option, we reduced the Java heap space down to 8GB. This way, we unknowingly forgot to leave any memory available for the file system cache.

Our Logstash configuration managed the rollover of indices by appending the current timestamp at the end of the index names. Indices older than 30 days were closed and indices older than 180 days were deleted; on average, we had ten different index types rolling over. Our cluster had to manage 30 days x 10 indices x 5 shards per index = 1500 shards, sometimes with only 8GB of memory per node, and most of the shards were sometimes less than 10MB big.

The above setup resulted at times in severe performance and memory consumption issues, so something had to change. All signals were pointing in the direction of a cluster making use of too many shards.
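A quick way to see how many shards a cluster is actually handling is to combine the cluster health and cat shards APIs:

GET _cluster/health
GET _cat/shards?v

The first request reports, among other things, active_primary_shards and active_shards; the second lists every shard with its size and the node hosting it.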

Rescuing our cluster

To remediate these issues, we decided the following:

  • Use only one Elasticsearch instance per host to give memory to the file system cache.
  • Merge indices from days to months to reduce the number of indices and shards, facilitating short term resolution.
  • Apply a rollover policy to keep indices and shards number under control, to facilitate long term resolution.
  • Integrate this strategy with an Ansible playbook.

This first blog post only covers the merging of indices and the reduction of the number of Elasticsearch instances (scale-in). We will talk about the short-term resolutions used to rescue our suffering cluster.

Reducing the number of shards

It is not an easy task to reduce the number of shards in a cluster; once an index has been created, you have to re-index its data to change the number of shards.

To remediate this issue, we decided to merge all indices within a month to a single index:

POST _reindex
{
  "source": {
    "index": "logstash-eagleeye-suricatafilter-smtp-2019.03.*"
  },
  "dest": {
    "index": "logstash-eagleeye-suricatafilter-smtp-2019.03"
  },
  "script": {
    "inline": "ctx.remove(\"_id\")"
  }
}

We did this operation for every index type (we have 10 in total). We used the default number of shards; however, it can be smart to reduce the number of shards to one in environments with a lot of index types.
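Re-indexing a whole month of data can take a while. One possible approach, shown here only as a sketch, is to run the re-index asynchronously and follow it through the tasks API:

POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "logstash-eagleeye-suricatafilter-smtp-2019.03.*"
  },
  "dest": {
    "index": "logstash-eagleeye-suricatafilter-smtp-2019.03"
  }
}

GET _tasks?detailed=true&actions=*reindex

The first request returns a task id that can also be queried directly, and the task listing shows the progress of every running re-index. The same script as in the synchronous example can of course be added to the request body.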

When the re-indexing was done, it was time to free up some shards by deleting all the old daily indices:

DELETE logstash-eagleeye-suricatafilter-smtp-2019.03.*

This operation can be performed safely, as we know the data from those indices is now indexed in the new index logstash-eagleeye-suricatafilter-smtp-2019.03. Feel free to use the _count API to double check before deleting!
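For example, comparing the document count of the old daily indices with the one of the new monthly index:

GET logstash-eagleeye-suricatafilter-smtp-2019.03.*/_count
GET logstash-eagleeye-suricatafilter-smtp-2019.03/_count

Both requests should return the same total once the re-indexing has fully completed.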

By doing so, we potentially deleted 30 x 10 x 5 = 1500 shards and created 1 x 10 x 5 = 50 shards: our cluster went from 1500 shards down to 50 by deleting 1450 shards. Our new indices were still only around 1GB each, so we didn’t suffer from any performance issue caused by an index that was too big. The operation was worth it, but it implied a temporary performance decrease, as re-indexing data is not free in terms of resource consumption.

Reducing the number of Elasticsearch instances

We wasted a lot of memory in our environment by running four different instances of Elasticsearch on the same machine. There is no performance gain in running several instances instead of one on the same hardware, as long as you keep the same number of shards for your indices. It is better to run a single instance with 32GB of heap space than four instances with 8GB each.

In our cluster, we had three data nodes and one coordinating node. The plan was to decommission two data nodes and delete the coordinating node, ending up with a single instance of Elasticsearch. To do this, we simply excluded two data nodes from the cluster:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.exclude._name": "esnode2,esnode3",
    "cluster.routing.rebalance.enable": "none"
  }
}

This operation moved all the shards from esnode2 and esnode3 to the other available data nodes. As we only had three data nodes, all the shards went to the first Elasticsearch instance, esnode1. Like the re-indexing, this operation takes quite a long time, especially as we kept ingesting data in the meantime. When it was done, we could safely delete the two data nodes, which no longer contained any shards. The coordinating node was also safe to delete, as it didn’t contain any data.
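The progress of the migration can be followed while it runs, for example with:

GET _cluster/health
GET _cat/shards?v&h=index,shard,prirep,state,node

The cluster health shows the number of relocating_shards, and the cat shards output shows on which node every shard currently lives; once esnode2 and esnode3 no longer appear in it, they can safely be removed.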

So, we went from a three data node cluster, each node with 8GB of heap space and a total of 1500 shards of around 10MB each, to a healthy single node cluster with 32GB of heap space and a total of 50 shards of less than 1GB each. This is a nice improvement in terms of memory consumption, especially on systems where we had three instances of 32GB and were able to save 42GB of host memory.

We can keep running like this and roll over the indices every month, but it won’t keep things fully under control. What if, at some point in time, we ingest more data than expected, filling indices to more than 200GB? Can we still enhance our index management strategy and try to reach the best Elasticsearch performance?

In the next blog post, we will show you how to use the latest Elasticsearch features to define and apply a good index lifecycle management policy.

Until next time, and happy clustering!

[1] Elasticsearch strongly relies on the file system cache to reach its performance level. You need some spare memory for the file system cache.