The Elastic Stack (ELK): how to set up centralized logging with Logstash, Elasticsearch & Kibana

An introduction to the Elastic Stack as part of a DevOps architecture and step-by-step guide to setting up the collection and visualization of system logs

In this post, we will introduce the Elastic Stack, also referred to as the ELK stack – Elasticsearch, Logstash and Kibana – and provide step-by-step instructions on how to set up the three tools that comprise the stack to automate the collection and visualization of system logs in a DevOps, cloud-native app architecture.

What is the Elastic Stack (formerly known as the ELK Stack)?

The Elastic Stack is the evolution of the ELK Stack, whose name is an abbreviation of the three core open-source tools the DevOps stack is built around: Elasticsearch, Logstash and Kibana.

Elasticsearch is a search and analytics engine.

Logstash is a server-side data processing pipeline. It ingests data simultaneously from multiple sources, parses each event, identifies named fields to build structure, and transforms events into a common format for more powerful analysis and business value.
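To give a feel for what that looks like in practice, here is a minimal, purely illustrative pipeline definition (not part of the setup later in this post) that reads lines from stdin, parses them with a built-in grok pattern and prints the structured events to the console:

# Minimal illustrative Logstash pipeline
input { stdin { } }
filter {
  # Parse each line as a standard Apache combined log entry
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
}
output {
  # Print the resulting structured event to the console
  stdout { codec => rubydebug }
}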

Kibana visualises data in Elasticsearch through charts and tables.

The Elastic Stack extends the ELK Stack with the addition of Beats, an open-source platform for lightweight data shippers that forward data (for example, by tailing log files) to Logstash or Elasticsearch. All four of the open-source projects that make up the Elastic Stack (Elasticsearch, Logstash, Kibana and Beats) are developed by Elastic.

Elastic describes the evolution of the ELK to Elastic Stack as

“…the same open source products that they were familiar with, only better integrated, more powerful, more user-friendly and with lots and lots of potential.”

This video by Coding Explained goes deeper into the technical details of the Elastic Stack's component tools and how they can be configured to work together as a powerful and flexible log management system.

Elastic Stack Use Cases

Elastic has published a number of use cases of how the Elastic Stack has been utilised by well-known brands to solve a variety of technical and, ultimately, business problems. Two examples include:

HappyFresh solves search latency problem with App Search on Elastic Cloud

HappyFresh, a market-leading grocery shopping and delivery platform in Indonesia, Malaysia, and Thailand, was losing customers before checkout because latency issues with its apps’ product search function were frustrating users. A switch to Elastic’s App Search on Elastic Cloud halved search latency on its online and mobile e-commerce portals while handling a 300% increase in search traffic during the Covid-19 pandemic.

The company is convinced its legacy solution would have failed had it still been in use when, as shoppers moved online during the pandemic, traffic across its apps multiplied tenfold and search traffic threefold.

Adobe image search capabilities built on the Elastic Stack

Several of Adobe’s products feature image search functionality powered by Elastic Stack, with Adobe Stock a prime example. Adobe’s DevOps engineers manage the following Elastic Stack deployment:

18 production clusters hosting over 10 billion documents, with a live ingestion rate of about 6,000 documents per second. To better support search in Adobe Lightroom, for example, they made the switch from Amazon’s Elasticsearch Service to their own self-managed clusters, moving over nearly 3.5B documents. Much of Adobe’s content is non-textual — images, videos, Photoshop files, and the like — but also includes standard enterprise document types, especially PDFs. The Elastic Stack, along with custom-built Elasticsearch plugins, helps drive the following content search experiences:

  • Search based on computer vision and metadata

  • Deep textual and hybrid content search

  • Video and richer format search

  • Enterprise search

  • Discovery and recommendations

Diagram: Adobe’s in-house Elastic Stack deployment

How to build a centralised logging system using the Elastic Stack

Sometimes we need to search the logs for particular lines across several servers; without centralized logging, that means logging in to each server and repeating the same search commands on each one.

Suppose we are an email hosting company with three MX servers. A customer complains that a message he sent to his wife at a particular time never arrived.

If we had set up a centralised logging system using the Elastic Stack, we would be able to find that message in the logs in a couple of clicks. And there are countless other reasons why organisations need a centralised logging system to keep track of what is happening across their IT infrastructure. While exceptions are not uncommon, it is fair to say the Elastic Stack is our go-to at K&C when our teams are building log management systems.

Elastic Stack alternatives for log management

Of course, while effective and popular, the Elastic Stack is not the only choice a software development team has for log management. Datadog, Splunk, Graylog, and Papertrail are the most popular alternatives and competitors to the components of the Elastic stack. Each of these tools has its technological strengths and weaknesses and different charging models, so could be preferred to the traditional ELK tools in particular circumstances.

But in a majority of circumstances, the Elastic stack is our choice for a log management system. And this is exactly, step-by-step, how we would set up a centralized logging system:

Our centralized logging system stack

  • CentOS 7: the Linux distribution we use as the operating system on all servers
  • Logstash: the server-side component that processes incoming logs
  • Elasticsearch: stores the logs
  • Kibana: web interface for searching through and visualizing the logs
  • Logstash Forwarder: installed as an agent on each server we collect logs from, to ship logs to the Logstash server
  • Beats: platform for lightweight data shippers

Step-by-step tutorial to setting up a centralised log management system with ELK

We will install the first three components on our collection server, and Logstash Forwarder on the servers we want to collect logs from.

Install Java 8

Java is needed for Logstash and Elasticsearch. We are going to install Oracle JRE 8.

cd /opt
sudo wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "https://download.oracle.com/otn-pub/java/jdk/8u40-b25/jre-8u40-linux-x64.tar.gz"

Unpack

sudo tar xvf jre-8*.tar.gz

Grant the necessary rights:

sudo chown -R root: jre1.8*

Create symlinks using alternatives:

sudo alternatives --install /usr/bin/java java /opt/jre1.8*/bin/java 1
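If the symlink was created correctly, a quick check should report the installed version:

java -version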

Delete the downloaded archive

sudo rm /opt/jre-8*.tar.gz

Install Elasticsearch

Import Elasticsearch public GPG key:

sudo rpm --import http://packages.elasticsearch.org/GPG-KEY-elasticsearch

Create and edit the repository file for Elasticsearch:

sudo vi /etc/yum.repos.d/elasticsearch.repo
[elasticsearch-1.4]
name=Elasticsearch repository for 1.4.x packages
baseurl=http://packages.elasticsearch.org/elasticsearch/1.4/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1

Install Elasticsearch

sudo yum -y install elasticsearch-1.4.4

Modify the configuration file:

sudo vi /etc/elasticsearch/elasticsearch.yml

Restrict outside access to Elasticsearch by setting:

network.host: localhost

Run Elasticsearch:

sudo systemctl start elasticsearch.service

And enable it to start on boot:

sudo systemctl enable elasticsearch.service
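As a quick sanity check (assuming the service started cleanly), Elasticsearch should answer on its default port with a small JSON document describing the node:

curl -XGET 'http://localhost:9200'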

Install Kibana:

Download and unpack Kibana 4:

cd ~; wget https://download.elasticsearch.org/kibana/kibana/kibana-4.0.1-linux-x64.tar.gz; tar xvf kibana-*.tar.gz

Edit the configuration file:

vi ~/kibana-4*/config/kibana.yml

In the Kibana configuration file, find the line that sets the host and replace the default IP address (0.0.0.0) with localhost:

host: "localhost"

This setting makes Kibana accessible only locally, which is what we want, since we will use Nginx as a reverse proxy to grant access from outside.

Create the installation directory:

sudo mkdir -p /opt/kibana

And relocate the unpacked files there:

sudo cp -R ~/kibana-4*/* /opt/kibana/

Kibana can be launched directly as /opt/kibana/bin/kibana, but we will run it as a service. Create a systemd unit file for Kibana:

sudo vi /etc/systemd/system/kibana4.service
[Service]
ExecStart=/opt/kibana/bin/kibana
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=kibana4
User=root
Group=root
Environment=NODE_ENV=production
 
[Install]
WantedBy=multi-user.target

Now, start it and enable it on boot:

sudo systemctl start kibana4
sudo systemctl enable kibana4
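To confirm Kibana is up (it listens on port 5601 unless changed in kibana.yml), a quick check:

sudo systemctl status kibana4
curl -I http://localhost:5601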

Install the EPEL repository (Nginx is installed from it):

sudo yum -y install epel-release

Install Nginx

sudo yum -y install nginx httpd-tools

Using htpasswd, create a user and a password

sudo htpasswd -c /etc/nginx/htpasswd.users kibanaadmin

Now, edit the main configuration file nginx.conf:

sudo vi /etc/nginx/nginx.conf

Find and delete the whole server {} block. These two lines should remain at the end of the file:

    include /etc/nginx/conf.d/*.conf;
}

Now, create an Nginx configuration file for Kibana in /etc/nginx/conf.d/ (for example, kibana.conf), replacing example.com with your server’s FQDN:

server {
    listen 80;
 
    server_name example.com;
 
    auth_basic "Restricted Access";
    auth_basic_user_file /etc/nginx/htpasswd.users;
 
    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;        
    }
}
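Before starting Nginx, it is worth validating the configuration syntax:

sudo nginx -t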

Run Nginx:

sudo systemctl start nginx
sudo systemctl enable nginx
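If SELinux is enforcing on the host (the CentOS 7 default), Nginx may be blocked from proxying to Kibana on port 5601; allowing httpd network connections usually resolves this:

sudo setsebool -P httpd_can_network_connect 1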

Now, Kibana is accessible at http://FQDN/ (log in with the kibanaadmin credentials created above).

Install Logstash:

Create the repository file for Logstash:

sudo vi /etc/yum.repos.d/logstash.repo
 
[logstash-1.5]
name=logstash repository for 1.5.x packages
baseurl=http://packages.elasticsearch.org/logstash/1.5/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1

Save and exit

Install Logstash:

sudo yum -y install logstash

Generate SSL certificates

Generate an SSL certificate and key pair that Logstash Forwarder will use to verify the identity of the Logstash server. Replace logstash_server_fqdn in the command below with your Logstash server’s FQDN:

cd /etc/pki/tls
sudo openssl req -subj '/CN=logstash_server_fqdn/' -x509 -days 3650 -batch -nodes -newkey rsa:2048 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt
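Optionally, verify the subject and validity period of the freshly generated certificate:

openssl x509 -in certs/logstash-forwarder.crt -noout -subject -dates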

The file logstash-forwarder.crt should be copied to every server that will send logs to the Logstash server.

Configure Logstash:

The Logstash configuration files are written in a JSON-like format and are located in /etc/logstash/conf.d. A configuration consists of three sections: inputs, filters, and outputs.

Create the file 01-lumberjack-input.conf and set up the “lumberjack” input (the protocol used by Logstash and Logstash Forwarder to communicate):

sudo vi /etc/logstash/conf.d/01-lumberjack-input.conf
 
input {
  lumberjack {
    port => 5000
    type => "logs"
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}

Save and exit. This configures the lumberjack input to listen on TCP port 5000 and to use the SSL certificate and key we generated earlier.

Now, create a file named 10-syslog.conf and add the filter settings for syslog messages:

sudo vi /etc/logstash/conf.d/10-syslog.conf
 
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:[%{POSINT:syslog_pid}])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

Save and exit
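To illustrate what this filter does, a hypothetical syslog line like the one below would be broken into named fields roughly as shown:

# Raw line shipped by Logstash Forwarder (illustrative example):
#   Apr  3 10:15:01 mx1 postfix/smtp[2345]: ABC123: status=sent
# Fields extracted by the grok filter above (approximately):
#   syslog_timestamp => "Apr  3 10:15:01"
#   syslog_hostname  => "mx1"
#   syslog_program   => "postfix/smtp"
#   syslog_pid       => "2345"
#   syslog_message   => "ABC123: status=sent"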

Create the last file 30-lumberjack-output.conf:

sudo vi /etc/logstash/conf.d/30-lumberjack-output.conf
 
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
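Before restarting, the whole configuration directory can be checked for syntax errors (assuming the default RPM install path for the Logstash binary):

sudo /opt/logstash/bin/logstash --configtest -f /etc/logstash/conf.d/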

Restart Logstash:

sudo service logstash restart

Now that Logstash is set up, we move on to Logstash Forwarder.

Set up Logstash Forwarder

Copy the SSL certificate to each server where Logstash Forwarder will run:

scp /etc/pki/tls/certs/logstash-forwarder.crt user@server_private_IP:/tmp

Import the Elasticsearch GPG key on the client server:

sudo rpm --import http://packages.elasticsearch.org/GPG-KEY-elasticsearch

Create the repository configuration file for Logstash Forwarder:

sudo vi /etc/yum.repos.d/logstash-forwarder.repo

Add the following:

[logstash-forwarder]
name=logstash-forwarder repository
baseurl=http://packages.elasticsearch.org/logstashforwarder/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1

Install Logstash Forwarder

sudo yum -y install logstash-forwarder

Copy the certificates to the required location:

sudo cp /tmp/logstash-forwarder.crt /etc/pki/tls/certs/

Now configure it. In the "network" section of the file, point Logstash Forwarder at the Logstash server and the copied certificate:

sudo vi /etc/logstash-forwarder.conf

"servers": [ "logstash_server_private_IP:5000" ],
"timeout": 15,
"ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"

Then, between the square brackets of the "files" section, insert:
 
  {
      "paths": [
        "/var/log/messages",
        "/var/log/secure"
       ],
      "fields": { "type": "syslog" }
    }
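For reference, the assembled /etc/logstash-forwarder.conf should end up looking roughly like this (with your own Logstash server address substituted):

{
  "network": {
    "servers": [ "logstash_server_private_IP:5000" ],
    "timeout": 15,
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"
  },
  "files": [
    {
      "paths": [
        "/var/log/messages",
        "/var/log/secure"
      ],
      "fields": { "type": "syslog" }
    }
  ]
}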

Enable Logstash Forwarder to start on boot and start it:

sudo chkconfig logstash-forwarder on
sudo service logstash-forwarder restart

Now, Logstash Forwarder will send logs to your Logstash server.
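A quick way to confirm events are arriving is to ask Elasticsearch (on the Logstash server) for a document from the default logstash-* indices:

curl -XGET 'http://localhost:9200/logstash-*/_search?pretty&size=1'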

Open Kibana in your browser, log in, open the Dashboard, and enjoy the view.

Conclusion

Search and data management are becoming increasingly key components of software systems and technology-driven businesses. Making sure search performs optimally in customer-facing apps, such as e-commerce apps, is less a competitive advantage than a must-have. The need for technology stacks like the Elastic Stack will only grow, especially as new forms of search, like voice search, gain traction.

If your next software development project, or future iterations of an evolving application, rely on Elastic Stack expertise and you could benefit from a custom-built dedicated team or augmentation of an existing team with the Elastic Stack, please do get in touch!
