In this post, we introduce the Elastic Stack, also referred to as the ELK Stack – Elasticsearch, Logstash and Kibana – and provide step-by-step instructions on how to set up the tools that comprise the stack to automate the collection and visualisation of system logs in a DevOps, cloud-native app architecture.
What is the Elastic Stack (formerly known as the ELK Stack)?
The Elastic Stack is the evolution of the ELK Stack, whose name is an abbreviation of the three core open-source tools the DevOps stack is built around: Elasticsearch, Logstash and Kibana.
Elasticsearch is a search and analytics engine.
Logstash is a server-side data processing pipeline. It collects multiple simultaneous data flows from different sources, parses each event, identifies named fields to build structure, and transforms the events into a common format for more powerful analysis and greater business value.
Kibana visualises data in Elasticsearch through charts and tables.
The Elastic Stack extends the ELK Stack with the addition of Beats, an open-source platform of lightweight data shippers that, among other things, allows users to tail files. All four of the open-source projects that make up the Elastic Stack – Elasticsearch, Logstash, Kibana and Beats – are developed by Elastic.
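As a purely illustrative sketch (the paths and hosts are placeholder assumptions, and the exact keys vary between Filebeat versions), a minimal filebeat.yml for Filebeat – the Beats family member that has since superseded the Logstash Forwarder used later in this tutorial – looks something like this:

filebeat.inputs:
- type: log
  paths:
    - /var/log/*.log
output.elasticsearch:
  hosts: ["localhost:9200"]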
Elastic describes the evolution of the ELK to Elastic Stack as
“…the same open source products that they were familiar with, only better integrated, more powerful, more user-friendly and with lots and lots of potential.”
This video by Coding Explained goes deeper into the technical details of the Elastic Stack’s component tools and how they can be configured to work together as a powerful and flexible log management system.
Elastic Stack Use Cases
Elastic has published a number of use cases of how the Elastic Stack has been utilised by well-known brands to solve a variety of technical and, ultimately, business problems. Two examples include:
HappyFresh solves search latency problem with App Search on Elastic Cloud
HappyFresh, a market-leading grocery shopping and delivery platform in Indonesia, Malaysia, and Thailand, was losing customers before checkout because latency issues with its apps’ product search function were causing user frustration. A switch to Elastic’s App Search on Elastic Cloud halved search latency on its online and mobile e-commerce portals while handling a 300% increase in search traffic during the Covid-19 pandemic.
The company is convinced its legacy solution would have failed when, as shoppers moved online during the pandemic, overall app traffic multiplied tenfold and search traffic threefold.
Adobe image search capabilities built on the Elastic Stack
Several of Adobe’s products feature image search functionality powered by Elastic Stack, with Adobe Stock a prime example. Adobe’s DevOps engineers manage the following Elastic Stack deployment:
18 production clusters hosting over 10 billion documents, with a live ingestion rate of about 6,000 documents per second. To better support search in Adobe Lightroom, for example, they made the switch from Amazon’s Elasticsearch Service to their own self-managed clusters, moving over nearly 3.5B documents. Much of Adobe’s content is non-textual — images, videos, Photoshop files, and the like — but also includes standard enterprise document types, especially PDFs. The Elastic Stack, along with custom-built Elasticsearch plugins, helps drive the following content search experiences:
Search based on computer vision and metadata
Deep textual and hybrid content search
Video and richer format search
Enterprise search
Discovery and recommendations
How to build a centralised logging system using the Elastic Stack
Sometimes we need to search through logs for particular lines across several servers; without centralised logging, this means logging in to each server in turn and repeating the same search commands.
Suppose we are an email hosting company with 3 MX servers. A customer files a complaint about a missing message which he sent to his wife at a particular time.
If we had set up a centralised logging system using the Elastic Stack, we would be able to find that message in the logs in a couple of clicks. And there are countless other reasons why organisations need a centralised logging system to keep track of what is happening across their IT infrastructure. While exceptions are not uncommon, it is fair to say the Elastic Stack is our go-to at K&C when our teams are building log management systems.
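As a minimal sketch of what that looks like in practice – assuming the syslog_message field set up later in this tutorial and a hypothetical recipient address – a query against Elasticsearch could be as simple as:

curl -XGET 'http://localhost:9200/logstash-*/_search?q=syslog_message:%22wife@example.com%22&pretty'

One query against the central index replaces three SSH sessions and three rounds of grep.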
Elastic Stack alternatives for log management
Of course, while effective and popular, the Elastic Stack is not the only choice a software development team has for log management. Datadog, Splunk, Graylog, and Papertrail are the most popular alternatives and competitors to the components of the Elastic stack. Each of these tools has its technological strengths and weaknesses and different charging models, so could be preferred to the traditional ELK tools in particular circumstances.
But in the majority of circumstances, the Elastic Stack is our choice for a log management system. And this is exactly, step by step, how we would set up a centralised logging system:
Our centralised logging system stack
- CentOS 7: The Linux distribution we use as the server operating system in this tutorial
- Logstash: Server-based part for processing incoming logs
- Elasticsearch: For storing logs
- Kibana: Web interface for searching through and visualizing the logs
- Logstash Forwarder: An agent installed on the servers we want to collect logs from, which ships their logs to the Logstash server
- Beats: Platform for lightweight data shippers
Step-by-step tutorial to setting up a centralised log management system with ELK
We will install the first three components on our collection server, and Logstash Forwarder on the servers we want to collect logs from. The resulting data flow is: Logstash Forwarder ships logs over SSL to Logstash (port 5000), Logstash parses them and stores them in Elasticsearch (port 9200), and Kibana (port 5601) reads from Elasticsearch, with Nginx in front of Kibana as a reverse proxy.
Install Java 8
Java is needed for Logstash and Elasticsearch. We are going to install Oracle JRE 8 (the download command below accepts the Oracle licence via a cookie header).
cd /opt
sudo wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "https://download.oracle.com/otn-pub/java/jdk/8u40-b25/jre-8u40-linux-x64.tar.gz"
Unpack
sudo tar xvf jre-8*.tar.gz
Set the owner to root:
sudo chown -R root: jre1.8*
Create symlinks using alternatives:
sudo alternatives --install /usr/bin/java java /opt/jre1.8*/bin/java 1
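At this point you can verify that Java is picked up correctly:

java -version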
Delete the downloaded archive
sudo rm /opt/jre-8*.tar.gz
Install Elasticsearch
Import Elasticsearch public GPG key:
sudo rpm --import http://packages.elasticsearch.org/GPG-KEY-elasticsearch
Create and edit the repository file for Elasticsearch:
sudo vi /etc/yum.repos.d/elasticsearch.repo
[elasticsearch-1.4]
name=Elasticsearch repository for 1.4.x packages
baseurl=http://packages.elasticsearch.org/elasticsearch/1.4/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1
Install Elasticsearch
sudo yum -y install elasticsearch-1.4.4
Modify the configuration file:
sudo vi /etc/elasticsearch/elasticsearch.yml
Close access to Elasticsearch from the outside by setting:
network.host: localhost
Run Elasticsearch:
sudo systemctl start elasticsearch.service
And add it to the autorun:
sudo systemctl enable elasticsearch.service
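A quick way to verify Elasticsearch is running is to query it locally on its default port 9200:

curl -XGET 'http://localhost:9200'

It should answer with a short JSON document including the cluster name and version number.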
Install Kibana:
Download and unpack Kibana 4:
cd ~; wget https://download.elasticsearch.org/kibana/kibana/kibana-4.0.1-linux-x64.tar.gz; tar xvf kibana-*.tar.gz
Edit the configuration file:
vi ~/kibana-4*/config/kibana.yml
In the Kibana configuration file, find the line that sets the host and replace the default IP (0.0.0.0) with localhost:
host: "localhost"
This parameter means Kibana will be accessible only locally. That is fine, as we will use Nginx as a reverse proxy to grant access from the outside.
Create the installation directory:
sudo mkdir -p /opt/kibana
And relocate the unpacked files there:
sudo cp -R ~/kibana-4*/* /opt/kibana/
Kibana can be run directly as /opt/kibana/bin/kibana, but we will run it as a service. Create a systemd unit file for Kibana:
sudo vi /etc/systemd/system/kibana4.service
[Service]
ExecStart=/opt/kibana/bin/kibana
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=kibana4
User=root
Group=root
Environment=NODE_ENV=production

[Install]
WantedBy=multi-user.target
Now, run it and add it to the autorun
sudo systemctl start kibana4
sudo systemctl enable kibana4
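To confirm Kibana started and is listening on its default port 5601 (the port Nginx will proxy to below):

sudo systemctl status kibana4
curl -I http://localhost:5601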
Install the EPEL repository (Nginx is not in the base CentOS repositories):
sudo yum -y install epel-release
Install Nginx
sudo yum -y install nginx httpd-tools
Using htpasswd, create a user and a password
sudo htpasswd -c /etc/nginx/htpasswd.users kibanaadmin
Now, edit the main configuration file nginx.conf:
sudo vi /etc/nginx/nginx.conf
Find and delete the whole server {} block, so that only these two lines remain at the end of the file:
include /etc/nginx/conf.d/*.conf;
}
Now, create an Nginx configuration file for Kibana 4 in the conf.d directory included above (the file name is up to you; /etc/nginx/conf.d/kibana.conf is a reasonable choice) with the following server block:
server {
    listen 80;
    server_name example.com;
    auth_basic "Restricted Access";
    auth_basic_user_file /etc/nginx/htpasswd.users;
    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}
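Before starting Nginx, you can check the configuration for syntax errors:

sudo nginx -t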
Run Nginx:
sudo systemctl start nginx
sudo systemctl enable nginx
Now, Kibana is accessible at http://FQDN/ (we have not set up TLS, so access is plain HTTP protected by basic auth)
Install Logstash:
Create the repository file for Logstash:
sudo vi /etc/yum.repos.d/logstash.repo

[logstash-1.5]
name=logstash repository for 1.5.x packages
baseurl=http://packages.elasticsearch.org/logstash/1.5/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1
Save and exit
Install Logstash:
sudo yum -y install logstash
Generate SSL certificates
Generate a self-signed certificate that the forwarders will use to verify the Logstash server’s authenticity (replace logstash_server_fqdn in the command below with your Logstash server’s actual FQDN):
cd /etc/pki/tls
sudo openssl req -subj '/CN=logstash_server_fqdn/' -x509 -days 3650 -batch -nodes -newkey rsa:2048 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt
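To double-check the certificate you have just generated, you can inspect its subject and validity dates:

sudo openssl x509 -in certs/logstash-forwarder.crt -noout -subject -dates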
The file logstash-forwarder.crt should be copied to every server that will send logs to the Logstash server
Configure Logstash:
The configuration files for Logstash are written in a JSON-like format and are located in /etc/logstash/conf.d. A configuration consists of three sections: inputs, filters, and outputs.
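The three files we are about to create each fill in one part of the same conceptual skeleton; Logstash concatenates the files in conf.d in alphabetical order, which is why the file names below are prefixed 01-, 10- and 30-:

input {
  # where events come from (here, the lumberjack listener)
}
filter {
  # how events are parsed and enriched (here, grok for syslog)
}
output {
  # where events go (here, Elasticsearch)
}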
Create file 01-lumberjack-input.conf and set up “lumberjack” input (the protocol used by Logstash and Logstash Forwarder to communicate)
sudo vi /etc/logstash/conf.d/01-lumberjack-input.conf

input {
  lumberjack {
    port => 5000
    type => "logs"
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}
Save and exit. This configures a lumberjack input that listens on TCP port 5000 and uses the certificate and key we generated earlier.
Now, create a file named 10-syslog.conf with the filter settings for syslog messages:
sudo vi /etc/logstash/conf.d/10-syslog.conf

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
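To see what the grok filter does, consider a made-up but typical syslog line:

Mar 12 14:02:17 mx1 postfix/smtpd[2132]: connect from unknown[192.168.0.10]

The pattern splits it into named fields roughly as follows: syslog_timestamp "Mar 12 14:02:17", syslog_hostname "mx1", syslog_program "postfix/smtpd", syslog_pid "2132" and syslog_message "connect from unknown[192.168.0.10]". These are exactly the fields you will later search and filter on in Kibana.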
Save and exit
Create the last file 30-lumberjack-output.conf:
sudo vi /etc/logstash/conf.d/30-lumberjack-output.conf

output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
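Before restarting, it is worth validating the combined configuration. In Logstash 1.x this can be done with the --configtest flag (the /opt/logstash path assumes the standard RPM install location):

sudo /opt/logstash/bin/logstash agent --configtest -f /etc/logstash/conf.d/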
Restart Logstash:
sudo service logstash restart
Now that Logstash is set up, let’s move on to Logstash Forwarder
Set up Logstash Forwarder
Copy the SSL certificate to the server where Logstash Forwarder will work
scp /etc/pki/tls/certs/logstash-forwarder.crt user@server_private_IP:/tmp
Import the Elasticsearch GPG key (this time on the client server):
sudo rpm --import http://packages.elasticsearch.org/GPG-KEY-elasticsearch
Create the repository configuration file:
sudo vi /etc/yum.repos.d/logstash-forwarder.repo
Add the following contents:
[logstash-forwarder]
name=logstash-forwarder repository
baseurl=http://packages.elasticsearch.org/logstashforwarder/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1
Install Logstash Forwarder
sudo yum -y install logstash-forwarder
Copy the certificate to the required location:
sudo cp /tmp/logstash-forwarder.crt /etc/pki/tls/certs/
Let’s get to setting it up:
sudo vi /etc/logstash-forwarder.conf
"servers": [ "logstash_server_private_IP:5000" ], "timeout": 15, "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt" Между квадратнымы скобками вставляем { "paths": [ "/var/log/messages", "/var/log/secure" ], "fields": { "type": "syslog" } }
Add Logstash Forwarder to the autorun and start it:
sudo chkconfig logstash-forwarder on
sudo service logstash-forwarder restart
Now, Logstash Forwarder will send logs to your Logstash server.
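To confirm that events are arriving, you can query Elasticsearch for the logstash-* indices that Logstash writes to by default:

curl -XGET 'http://localhost:9200/logstash-*/_search?pretty'

If the output contains syslog events from your client server, the pipeline is working end to end.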
Log in to Kibana, open the Dashboard, and enjoy the view.
Conclusion
Search and data management are becoming increasingly critical components of software systems and technology-driven businesses. Making sure search functions perform optimally in customer-facing apps, such as e-commerce apps, is less a competitive advantage than a must-have. The need for technology stacks like the Elastic Stack will only grow in the future, especially as new forms of search, like voice search, gain more traction.
If your next software development project, or future iterations of an evolving application, relies on Elastic Stack expertise, and you could benefit from a custom-built dedicated team or from augmenting an existing team, please do get in touch!