Elasticsearch Monitoring and Management Plugins

30.3.2014 | 11 minutes reading time

Elasticsearch offers a highly useful plugin mechanism as a standard way of extending its core. Plugins enable developers to add new functionality, e.g., a custom analyzer, or to provide alternatives to existing functionality, like swapping in another transport module implementation. Additionally, plugins may contain static content which Elasticsearch then serves via its HTTP server. In the latter category, there are quite a few plugins that offer a graphical front-end for selected parts of the Elasticsearch REST API, e.g., for monitoring, managing cluster and index state, or querying. In this article, I will introduce five such plugins, all of which have served me well in the past and which I think Elasticsearch users should be aware of. Presented in alphabetical order, they are: Bigdesk, Head, HQ, Kopf, and Paramedic.

Before we proceed, why am I writing this article at all? Isn’t each of the plugins introduced well on its respective website? Yes, that’s true, and that’s exactly where you should look for further information about these plugins. My motivation is a different one: I want to highlight the strong points of each plugin, to help you get started and make up your mind about which plugin to use for a particular task. I will also discuss some limitations and caveats that I came across. Naturally, this is all my personal experience and opinion, and not meant to be a shoot-out: In fact, I recommend installing and using all the plugins described here when setting up an Elasticsearch server. I certainly do.

The plugins are evaluated in their most recent versions: Bigdesk 2.4.0, Kopf 0.5.5, and for the other three plugins their GitHub project state as of March 29, 2014. The platform used is Elasticsearch 1.0.1.

Bigdesk

The Bigdesk plugin displays graphs about almost everything you can think of. I am exaggerating of course, but it really is a lot, ranging from OS metrics to Elasticsearch thread pool and cache sizes for each node. It definitely offers many more graphs than any of the other plugins. If you like charts, this one is for you.

A small excerpt of the set of graphs Bigdesk offers for each cluster node.

It is nice that the graphs can be customized a bit: first, by selecting the time range that the x-axis should represent, and second, by choosing the refresh (sampling) interval at which data is read from the Elasticsearch REST API. When choosing the time range, one caveat is that the x-axis never gets wider than the time range for which data has already accumulated at the client. Thus, when you increase the time range, say, from one to ten minutes, the x-axis will not resize immediately but will grow slowly as more data is read from the server. Only nine minutes later will it have reached the desired ten-minute size. This is a feature of course, but you need to be aware of it to avoid confusion.
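
To picture why the axis grows only gradually, here is a minimal Python sketch of a client-side sample buffer. It is not Bigdesk's actual code; the class name `SampleWindow` and its methods are made up for illustration, and I assume one sample every five seconds.

```python
from collections import deque


class SampleWindow:
    """Hypothetical client-side buffer of (timestamp, value) samples for one chart."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.samples = deque()

    def add(self, timestamp, value):
        """Store a new sample and drop everything older than the window."""
        self.samples.append((timestamp, value))
        self._trim(timestamp)

    def resize(self, window_seconds, now):
        """Change the window; enlarging it cannot bring back discarded samples."""
        self.window_seconds = window_seconds
        self._trim(now)

    def _trim(self, now):
        while self.samples and self.samples[0][0] < now - self.window_seconds:
            self.samples.popleft()

    def visible_span(self):
        """Seconds actually covered by the buffered data (the drawable x-axis)."""
        if len(self.samples) < 2:
            return 0
        return self.samples[-1][0] - self.samples[0][0]


window = SampleWindow(60)
for t in range(0, 121, 5):           # two minutes of samples, one-minute window
    window.add(t, 0.0)
window.resize(600, now=120)          # switch to ten minutes
span_right_after_resize = window.visible_span()
for t in range(125, 661, 5):         # nine more minutes of samples
    window.add(t, 0.0)
span_nine_minutes_later = window.visible_span()
```

Right after the resize the buffer still covers only the one minute it kept, and it takes nine further minutes of sampling until the full ten minutes are available.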

Bigdesk cluster diagram.

Bigdesk also offers a cluster diagram showing nodes and indexes as circles, nested and laid out to represent their physical size. The diagram is tagged experimental, and in its current state I don’t find it particularly useful. On mouseover, the circles jump at you but nothing happens when you click them. Nevertheless, I think the diagram has a lot of potential if combined with some way to drill down into the nodes and indices.

Head

The cluster overview provided by Head.

The Head plugin was the first of its kind, with the GitHub repository dating back to 2010. It offers quite a few features, the most prominent one being a very easily accessible cluster overview on its start page. The overview shows the cluster status, its nodes, indexes, the distribution of primary shards and replicas to the nodes, index aliases and sizes, and more. This overview page is probably the single most-used feature of all the plugins discussed here.
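
If you want the headline numbers of such an overview without a browser, the cluster health endpoint (`GET /_cluster/health`) is a good starting point. The following Python sketch is my own illustration and says nothing about which API calls Head itself makes; the function names and the base URL `http://localhost:9200` are assumptions.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url="http://localhost:9200"):
    """Read the cluster health document (assumes a node reachable at base_url)."""
    with urlopen(base_url + "/_cluster/health") as response:
        return json.load(response)


def summarize_health(health):
    """Condense a cluster health response into a one-line summary."""
    return (
        f"{health['cluster_name']}: {health['status']}, "
        f"{health['number_of_nodes']} nodes, "
        f"{health['active_primary_shards']} primaries, "
        f"{health['active_shards']} active shards, "
        f"{health['unassigned_shards']} unassigned"
    )


# Hand-written sample response, so the summary can be tried without a cluster.
sample = {
    "cluster_name": "demo",
    "status": "yellow",
    "number_of_nodes": 1,
    "active_primary_shards": 5,
    "active_shards": 5,
    "unassigned_shards": 5,
}
summary = summarize_health(sample)
```

With a running node, `summarize_health(fetch_health())` would produce the same kind of line for the live cluster.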

But Head offers more: a data browser with a simple search function, a graphical query builder where you can click together search queries by selecting index, available fields, query type, and query text, plus a JSON editor for formulating any HTTP request to be sent to the Elasticsearch server. The latter in particular provides some nice convenience features like a request history, repeated execution, or transforming the response JSON by applying a custom JavaScript function.

This is all pretty nice, but I have to say that the data browser got me confused a couple of times. First, it doesn’t support nested objects that declare fields at subpaths with the same name, e.g., “path1.id” and “path2.id”. For such objects, the fields will be missing in the field overview column, being replaced by a single “id” field. In the table, there will also be a single “id” column, and you may find data from both original fields in it, depending on which one exists in a particular document (and their order). Consequently, when you enter a search value for the “id” field, you don’t even know what precisely will be searched.
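
The effect can be reproduced with a few lines of Python. This is only a sketch of the behavior I observed, not Head's implementation; both function names are made up. The first function keys columns by leaf field name only, the second keeps the full path.

```python
def leaf_columns(document):
    """Flatten a nested document into columns keyed by the leaf field name only."""
    columns = {}

    def walk(node):
        for key, value in node.items():
            if isinstance(value, dict):
                walk(value)
            else:
                # A later field with the same leaf name silently replaces the earlier one.
                columns[key] = value

    walk(document)
    return columns


def path_columns(document, prefix=""):
    """Flatten a nested document into columns keyed by the full dotted path."""
    columns = {}
    for key, value in document.items():
        path = prefix + key
        if isinstance(value, dict):
            columns.update(path_columns(value, path + "."))
        else:
            columns[path] = value
    return columns


both = {"path1": {"id": "a-1"}, "path2": {"id": "b-2"}}
only_first = {"path1": {"id": "a-1"}}

collapsed_both = leaf_columns(both)          # one "id" column, value from path2
collapsed_first = leaf_columns(only_first)   # one "id" column, value from path1
distinct = path_columns(both)                # "path1.id" and "path2.id" kept apart
```

With leaf names as keys, the single "id" column holds a value from either path depending on the document, which is exactly what makes searching on "id" ambiguous.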

A second issue is that with the data browser you might easily see data you are not supposed to see. By default, it shows all documents, and thus simply by opening the browser you might see textual data that belongs to your customers and you don’t want to see in the first place. Of course this is not a bug, but it means I always have to remember not to open the data browser on a production system.

HQ

The HQ plugin is a more recent one, and it easily has the sexiest website. HQ is a general-purpose plugin that offers various features, ranging from cluster, node, and index level metrics to query support for search and other REST calls. While many of the features can be found in one of the other plugins as well, I am impressed by the effort that HQ puts into making life easy for the user. In the settings section, it allows you to specify different refresh intervals for different pages, making it possible to enable fine-grained sampling for certain data of interest while effectively disabling others. Also, HQ contains useful tooltips and explanations throughout, leaving few doubts about what’s possible and what’s not.

The HQ node diagnostics, alerting that the JVM is swapping

What I like best about HQ (actually, in my view one of the top features across all plugins) is the node diagnostics page which shows a table with metrics and statistics for each cluster node. What makes the feature special is that HQ applies thresholds to judge whether the values are acceptable, require a closer look, or are alarming, and then colors the table cells accordingly to make the information easily digestible. Furthermore, on mouseover it explains exactly how a certain table cell value is computed, which thresholds are applied, and why HQ thinks something is alarming. This is a unique feature which tremendously helps to understand important aspects of an Elasticsearch cluster.
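
The principle is easy to sketch in Python. The thresholds below (75 and 90 percent heap usage) are numbers I picked for illustration, not the ones HQ applies, and the function names are hypothetical.

```python
def classify(value, warn_at, alarm_at):
    """Map a metric value to a status, given two ascending thresholds."""
    if value >= alarm_at:
        return "alarm"
    if value >= warn_at:
        return "warning"
    return "ok"


def explain(metric, value, warn_at, alarm_at):
    """Produce a tooltip-style text stating the value, the status, and the thresholds."""
    status = classify(value, warn_at, alarm_at)
    return f"{metric} = {value}: {status} (warning from {warn_at}, alarm from {alarm_at})"


# Assumed thresholds for heap usage in percent; purely illustrative.
tooltip = explain("heap used %", 93, warn_at=75, alarm_at=90)
```

The status would drive the cell color, and the explanation text is the kind of thing shown on mouseover.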

With regard to querying and REST API support, I was surprised to find some limitations for which I don’t really see a reason. For example, when selecting an index in the query UI, the plugin doesn’t filter the fields accordingly but still offers the fields of all indexes. Also, it is not possible to set up combined queries like a bool query. You might think that you can achieve that via the JSON editor offered for REST API access, but it turns out that it has a fixed “all” scope, so you cannot search on a specific index only. Unless these limitations are removed, I prefer the similar functionality offered by the Kopf plugin.
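
For reference, this is roughly what an index-scoped bool query looks like when you build the request yourself. The Python helper `build_search`, the index name `articles`, and the field names are assumptions for the sake of the example; only the `/_search` and `/{index}/_search` paths and the `bool` query structure come from the Elasticsearch REST API.

```python
import json


def build_search(index=None, must=(), must_not=()):
    """Return (path, body) for a bool query, scoped to one index if given."""
    path = f"/{index}/_search" if index else "/_search"
    body = {"query": {"bool": {"must": list(must), "must_not": list(must_not)}}}
    return path, json.dumps(body)


# Hypothetical index and fields, for illustration only.
path, body = build_search(
    index="articles",
    must=[{"match": {"title": "elasticsearch"}}],
    must_not=[{"term": {"status": "draft"}}],
)
all_path, _ = build_search(must=[{"match_all": {}}])
```

The body would be sent as a POST to the returned path; leaving out the index yields the "all" scope that the HQ editor is fixed to.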

Kopf

The cluster overview page provided by Kopf.

The Kopf plugin was released only half a year ago and is the most active of the projects discussed here. Apparently, it was inspired by the Head plugin, and it surely is no coincidence that it is called “Kopf” (the German word for “head”). The plugin looks great and has a cluster overview page which I like even more than the one offered by Head. Especially for large clusters, the various filters offered are highly useful: nodes by name and type (client, data, master-eligible), indexes by name and state, and an option to hide special indexes created internally by Elasticsearch.
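
As a rough illustration of the node filter, here is a small Python sketch. It is not Kopf's code; the dictionary layout and function names are made up. I assume the usual convention that a node is master-eligible and holds data unless configured otherwise, and that a node which does neither acts as a client.

```python
def node_types(node):
    """Derive the node types from hypothetical 'master' and 'data' flags."""
    types = set()
    if node.get("master", True):
        types.add("master-eligible")
    if node.get("data", True):
        types.add("data")
    if not types:
        types.add("client")
    return types


def filter_nodes(nodes, name_part="", types=None):
    """Keep nodes whose name contains name_part and which have one of the given types."""
    wanted = set(types) if types else {"client", "data", "master-eligible"}
    return [
        node["name"]
        for node in nodes
        if name_part.lower() in node["name"].lower() and node_types(node) & wanted
    ]


# Made-up node list for illustration.
nodes = [
    {"name": "search-1", "master": True, "data": True},
    {"name": "search-2", "master": False, "data": True},
    {"name": "gateway-1", "master": False, "data": False},
]
clients = filter_nodes(nodes, types=["client"])
search_masters = filter_nodes(nodes, name_part="search", types=["master-eligible"])
```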

The JSON editor Kopf offers for REST API requests.

Kopf provides access to lots of Elasticsearch API features; in that regard it is easily the most powerful of the plugins discussed here. Some features that I particularly like (and which are not offered in similar fashion by any of the other plugins): An analysis feature that allows you to see the tokens generated for your document fields or your self-defined analyzers. Simple forms enabling you to try advanced features like index aliases, percolation, warmup queries, or snapshot/restore. Many, many text fields where you can dynamically change a particular cluster, node, or index setting, with helpful tooltips explaining what the particular field is about. And, finally, a JSON editor for arbitrary REST calls, which checks the entered JSON for validity as you type and even formats it once you submit the request.
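
The validate-then-format behavior of such an editor can be approximated with Python's standard `json` module. This is a sketch of the idea only; the function name `check_json` is hypothetical and Kopf's editor is of course implemented differently.

```python
import json


def check_json(text):
    """Return (True, pretty-printed JSON) or (False, error position and message)."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, json.dumps(parsed, indent=2)


valid, formatted = check_json('{"query": {"match_all": {}}}')
broken, message = check_json('{"query": }')
```

A valid request body comes back indented and ready to send, while an invalid one yields the position of the first problem.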

Overall, Kopf is truly awesome, and I only have a few minor quibbles: In the cluster overview, it is easy to miss that some elements can be clicked, so take your time to make sure you actually discover everything that it offers. Also, while it is nice that many tooltips are provided, I ran into some that were empty, which is disappointing because you only notice once you hover the mouse over the information icon. Yet, things have improved considerably compared to a few weeks ago, when most tooltips were still empty, so let’s consider this a non-issue.

Paramedic

Paramedic horizon graphs for a four-node cluster.

The main feature of Paramedic is a set of graphs that display load metrics for each cluster node. While this is a pretty common feature, what makes Paramedic special is that instead of line charts (which you find in almost every tool and in other plugins like Bigdesk) the data are displayed as horizon graphs. A horizon graph uses a certain coloring scheme in order to display time series in a more compact way than line charts do. This makes it possible to organize lots of horizon graphs on the same page and still enable users to spot interesting events quickly. As a consequence of the compact representation, the y-axis cannot be used to view absolute numbers (like in a line chart), but this is no problem as you can easily get those numbers by hovering the mouse over the graph. A nice explanation and illustration of the concept of horizon graphs is given in this article. Once I got to grips with horizon graphs, after some initial puzzlement, I started to like them a lot.
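
To make the idea concrete, here is a minimal Python sketch of the banding step behind a horizon graph. It only shows how a single value is split into layers and is not taken from Paramedic or any charting library; the function name and the band parameters are my own.

```python
def horizon_bands(value, band_height, num_bands):
    """Split a non-negative value into the fill height of each stacked band."""
    bands = []
    remaining = max(value, 0)
    for _ in range(num_bands):
        fill = min(remaining, band_height)
        bands.append(fill)
        remaining -= fill
    return bands


# A CPU load of 70 % with four bands of 25 % each.
bands = horizon_bands(70, band_height=25, num_bands=4)
```

Here the first two bands are completely filled, the third is filled to 20 of 25, and the fourth stays empty. All bands are drawn on top of each other in the same strip, with a darker shade for each higher band, so the chart needs only a quarter of the height of the equivalent line chart.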

Note that Paramedic doesn’t implement horizon graph support by itself but instead uses Cubism.js. So if you are interested in adding horizon graphs to your JavaScript application as well, you don’t need to reinvent the wheel.

Unfortunately, Paramedic currently offers only six metrics, each of which is displayed in its own block for all cluster nodes. So if you have a cluster with ten nodes, you will first see CPU usage for all the nodes, followed by JVM heap occupancy for all nodes, etc. Thus, there are two things on my wish list for Paramedic to be really useful: First, I would like to have a way to configure which metrics are displayed. Second, I would like to be able to rearrange them by node so that I can better view the state of individual nodes (and maybe spot some correlation between multiple graphs belonging to the same node). Ideally combined with a node filter.
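
The rearrangement I have in mind is a simple regrouping of the same data, sketched here in Python. The data layout, the metric and node names, and the function `by_node` are all invented for illustration and say nothing about how Paramedic stores its series.

```python
def by_node(by_metric, nodes=None):
    """Regroup {metric: {node: series}} into {node: {metric: series}}.

    If nodes is given, only those nodes are kept (the node filter).
    """
    regrouped = {}
    for metric, per_node in by_metric.items():
        for node, series in per_node.items():
            if nodes is not None and node not in nodes:
                continue
            regrouped.setdefault(node, {})[metric] = series
    return regrouped


# Made-up sample data: two metrics, two nodes.
by_metric = {
    "cpu": {"node-1": [10, 20], "node-2": [30, 40]},
    "heap": {"node-1": [50, 55], "node-2": [60, 65]},
}
per_node = by_node(by_metric)
only_first = by_node(by_metric, nodes={"node-1"})
```

Grouped this way, all graphs for one node sit next to each other, which makes correlations between, say, CPU usage and heap occupancy on the same node easier to spot.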

Let me add that Paramedic provides some more features I didn’t mention so far, like displaying information about cluster nodes and the distribution of shards among them. However, the way in which they are presented doesn’t appeal to me, and I much prefer to use analogous functionality offered by the Kopf plugin.

Conclusion

All five plugins are a useful addition to your Elasticsearch toolbox, and it is good to know their strong points. I hope this overview helps you to get started and motivates you to explore them more closely. As you proceed, I’m sure you will form your own opinion and identify your own highlights in the broad set of features they offer.

You may have noticed that I didn’t include Marvel in the comparison. While Marvel is a great product, discussing it in the same article just didn’t feel right to me. First of all, Marvel is a commercial tool (though free for use in development). Second, in contrast to the plugins discussed here, it is built into the Elasticsearch core and also stores data in an Elasticsearch index, which obviously enables lots of advanced features that the other plugins cannot offer. I’m sure there will be a separate article on Marvel some time soon on this blog.

Blog author

Patrick Peschlow