How I manage Elasticsearch indexes in Kubernetes
Introduction⌗
Managing Elasticsearch indexes can be a challenge when you’re in a small team, especially when the index templates change from time to time. How does new data end up using the new index templates, who creates the new indexes, and how is all of that done via code?
At the company I work for, at the time of writing, we send data via our API into Elasticsearch. When our data structure needs a change, we update it as code and magic happens. Allow me to show you how.
Setup⌗
Prerequisites⌗
Before you start, you need the following:
- GitHub or any other git hosting service
- Kubernetes
- Argo Workflows installed on Kubernetes (docs)
- Argo Events installed on Kubernetes (docs)
- Flux installed on Kubernetes (docs)
- Elasticsearch installed on Kubernetes (via the ECK operator)
Setting up Elasticsearch templates⌗
We’ve created a GitHub repository, `elastic-mappings`, that contains both the scripts and the templates to use within Elasticsearch. Our structure looks similar to:
.
├── README.md
├── components
│   ├── defaults.json
│   └── ip.json
├── index_templates
│   ├── development
│   │   └── our-index.json
│   └── production
│       └── our-index.json
├── lifecycles
│   └── lc-14-days.json
├── pipelines
│   └── enrichment.json
└── scripts
    ├── components.sh
    ├── deploy.sh
    ├── index_templates.sh
    ├── lifecycles.sh
    └── pipelines.sh
| Folder | Description |
|---|---|
| components | The index template components |
| index_templates | The definitions of the index templates; they consume the template components |
| lifecycles | The definitions of the lifecycle policies |
| pipelines | The Elasticsearch ingest pipeline definitions |
| scripts | Scripts that call the Elasticsearch API to PUT the templates |
All the templates are plain JSON, written exactly as you would write them in the Kibana Dev Tools console.
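For example, `index_templates/production/our-index.json` could look something like the sketch below; the index pattern, priority and settings are made up for this example, but `composed_of` is how an index template consumes the component templates from the `components` folder:

```json
{
  "index_patterns": ["our-index-*"],
  "composed_of": ["defaults", "ip"],
  "priority": 200,
  "template": {
    "settings": {
      "index.lifecycle.name": "lc-14-days"
    }
  }
}
```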
The scripts are just simple `curl` commands, with a `deploy.sh` that gets called from Argo Workflows to orchestrate all of the scripts.
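As an illustration, `index_templates.sh` could look roughly like this; the environment variables, the authentication and the exact flags are assumptions for the sketch, not the real script:

```bash
#!/usr/bin/env bash
# Sketch: PUT every index template JSON file for the current environment
# into Elasticsearch. ES_URL, ES_USER, ES_PASSWORD and ENVIRONMENT are
# assumed to be provided by the Workflow (e.g. from a Secret).
set -euo pipefail

for file in "index_templates/${ENVIRONMENT}"/*.json; do
  name="$(basename "${file}" .json)"
  echo "Applying index template ${name}"
  curl --fail --silent --show-error \
    -u "${ES_USER}:${ES_PASSWORD}" \
    -X PUT "${ES_URL}/_index_template/${name}" \
    -H 'Content-Type: application/json' \
    --data-binary "@${file}"
done
```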
Setting up Argo Events⌗
Argo Events will receive a webhook call from Flux (configured later in this post), and this call will trigger an Argo Workflow to be deployed.
The files below will create the `Sensor`, `EventSource` and the required `WorkflowTemplates`. I’ve left out the `ServiceAccount`, `RoleBindings` and `ClusterRoleBindings`, as they can differ per environment.
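A minimal sketch of the `EventSource` and `Sensor` follows; the `EventSource` name, namespace, port and endpoint match the webhook URL used below, while the `ServiceAccount` name and the `deploy-mappings` `WorkflowTemplate` are assumptions for this example:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: event
  namespace: elastic-mappings
spec:
  service:
    ports:
      - port: 12000
        targetPort: 12000
  webhook:
    # Exposes POST /event via the event-eventsource-svc service
    event:
      port: "12000"
      endpoint: /event
      method: POST
---
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: elastic-mappings
  namespace: elastic-mappings
spec:
  template:
    # The ServiceAccount (left out here) needs permission to submit Workflows
    serviceAccountName: operate-workflow-sa
  dependencies:
    - name: flux-event
      eventSourceName: event
      eventName: event
  triggers:
    - template:
        name: deploy-mappings
        argoWorkflow:
          operation: submit
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: elastic-mappings-
              spec:
                # deploy-mappings is an assumed WorkflowTemplate that runs deploy.sh
                workflowTemplateRef:
                  name: deploy-mappings
```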
And in turn, when a `POST` call is made on the http://event-eventsource-svc.elastic-mappings.svc.cluster.local:12000/event URL, the `Sensor` will deploy a `Workflow`. At the very end there is a commit into a `ConfigMap` that hosts the version number(s) of the index(es). This is how our API (via a Kubernetes informer) knows which index to pick.
Setting up Flux⌗
To let Flux call a generic webhook, you need to create an `Alert` and a `Provider` of the `notification.toolkit.fluxcd.io` custom resources.
The `Alert` will be triggered upon changes to the specified `GitRepository` and uses the `Provider` to send out the notification.
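A minimal sketch of both resources, assuming the `GitRepository` for this repository is called `elastic-mappings` and lives in the `flux-system` namespace; the names are assumptions, and the notification API version depends on your Flux installation:

```yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta1
kind: Provider
metadata:
  name: elastic-mappings
  namespace: flux-system
spec:
  # A generic provider simply POSTs the event to the given address,
  # here the Argo Events webhook from the previous section
  type: generic
  address: http://event-eventsource-svc.elastic-mappings.svc.cluster.local:12000/event
---
apiVersion: notification.toolkit.fluxcd.io/v1beta1
kind: Alert
metadata:
  name: elastic-mappings
  namespace: flux-system
spec:
  providerRef:
    name: elastic-mappings
  eventSeverity: info
  eventSources:
    # Fires on changes to the GitRepository that tracks elastic-mappings
    - kind: GitRepository
      name: elastic-mappings
```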