Ceph-Ansible for a quick and error-free installation of Ceph Storage Clusters

Ceph-Ansible use cases and a step-by-step technical guide to deploying Ceph Storage Clusters in Kubernetes

In this post of our DevOps consulting series, we follow on from our earlier comparative analysis of Ceph vs. NFS as alternative Kubernetes data storage solutions with a guide to Ceph-Ansible. We’ll examine how Ceph-Ansible can be used for quick, error-free Ceph Storage Cluster deployment within a DevOps architecture. We’ll also look at particular use cases and the strengths and drawbacks of this tooling compared to the alternatives, and conclude with a detailed step-by-step guide to installing a Ceph Cluster using Ceph-Ansible for a Kubernetes environment.

What Is Ceph Data Storage And How And When Is It Used?

The exponential rate of growth in the data storage needs of modern organisations has resulted in the need for effective Big Data storage solutions. The Ceph storage tool has stepped up to meet that challenge.

Ceph is an open-source software project initiated by Red Hat. It is used to enable scalable object, block and file-based storage under a single system. Ceph Storage Clusters are paired with the CRUSH (Controlled Replication Under Scalable Hashing) algorithm and run on commodity hardware. CRUSH manages the optimal distribution of data across a Ceph cluster while also allowing that data to be retrieved efficiently.

Ceph Storage Clusters contain a large amount of data. The use of subclusters beneath a cluster breaks that data up into more manageable and relevant chunks. For the data to be organised according to the appropriate lineage, subclusters must be properly configured as part of a ‘parent’ cluster. CRUSH, as a scalable hashing algorithm, divides a large data set neatly across the appropriate clusters and subclusters, allowing for optimised retrieval. Ceph’s role in big data storage thus combines storage optimisation with simple data access and retrieval.

A Ceph-Ansible use case – hybrid cloud infrastructure using Kubernetes

Modern hybrid software architectures that combine bare metal with cloud solutions (e.g. AWS, Google Cloud), use a containerisation tool like Docker and rely on orchestrators like Docker Swarm, Kubernetes or Rancher often encounter a problem:

How and where should application data be stored so that it is accessible from anywhere in the infrastructure, regardless of where the Docker container running the application is located?

Ceph helps resolve this problem by providing a distributed storage system with high availability and scalability. In Kubernetes-based architectures, for example, Ceph can back K8s PersistentVolumes via CephFS and K8s PersistentVolumeClaims via RBD (Ceph Block Device). Ceph is also often used in big data processing and storage solutions because of its particularly strong horizontal scalability.
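
As a rough illustration of the RBD path, a StorageClass based on the in-tree kubernetes.io/rbd provisioner and a PersistentVolumeClaim bound to it could look like the sketch below. The monitor addresses, the kube pool and the secret names are placeholders for values from your own cluster, the admin and client keys must already exist as Kubernetes secrets, and dynamic provisioning with the in-tree driver requires the rbd client to be available to kube-controller-manager (an external provisioner or ceph-csi is a common alternative).

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
provisioner: kubernetes.io/rbd          # in-tree RBD provisioner
parameters:
  monitors: 192.168.2.1:6789,192.168.2.2:6789,192.168.2.3:6789   # placeholder monitor addresses
  adminId: admin
  adminSecretName: ceph-admin-secret    # hypothetical secret holding the admin key
  adminSecretNamespace: kube-system
  pool: kube                            # hypothetical pool dedicated to Kubernetes
  userId: kube
  userSecretName: ceph-kube-secret      # hypothetical secret holding the client.kube key
  fsType: ext4
  imageFormat: "2"
  imageFeatures: layering
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce                     # an RBD image attaches to one node at a time
  storageClassName: ceph-rbd
  resources:
    requests:
      storage: 10Gi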

Ceph-Ansible is a collection of playbooks for Ansible, Red Hat’s automation engine for provisioning and configuration management. It is widely considered to be the most flexible way to install and manage a significant Ceph Storage Cluster. Some engineers shy away from Ceph-Ansible as it isn’t necessarily the easiest way to install and manage Ceph storage. But it isn’t overly difficult with the right know-how, and the production-grade clusters that result mean the extra effort is often more than worthwhile.

Weaknesses of Ceph Storage

The downside to architecture solutions based on Ceph is that they require relatively high levels of redundancy across servers and/or virtual machines. So while Ceph is an effective big data and Kubernetes storage solution, it is not a cheap one.

Ceph Storage Clusters should also not be used for critical data as they do not offer high levels of security.

Deploying Ceph in Kubernetes using Ceph-Ansible

Let’s see how to deploy Ceph using Ceph-Ansible for future use in Kubernetes as block devices (PersistentVolumeClaims – RBD).

For our test bench we will use:

1x virtual server with Ansible
    192.168.1.2 for external traffic
    10.0.1.4 for internal traffic
3x servers for Ceph, each with 3 free HDDs for OSDs
    192.168.2.1, 192.168.2.2, 192.168.2.3 for external traffic
    10.0.1.1, 10.0.1.2, 10.0.1.3 for internal traffic

Grafana and Ceph Dashboard for visualization of the Ceph Storage Cluster will also be installed on one of the servers.

The internal network 10.0.1.0/24 is configured on all 4 servers; we will use it for internal Ceph traffic.

Step-by-step instructions for preparing the server with Ansible

First, generate SSH keys on the Ansible server and deploy them to all the Ceph servers.
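
A minimal way to do this from the Ansible server, assuming you can initially log in to the Ceph servers with a password (adjust the user if you are not deploying as root):

ssh-keygen -t rsa -b 4096               # accept the defaults to create ~/.ssh/id_rsa
for host in 192.168.2.1 192.168.2.2 192.168.2.3; do
  ssh-copy-id root@"$host"              # copy the public key to each Ceph server
done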

Download the repository:

git clone https://github.com/ceph/ceph-ansible
cd ceph-ansible

Now switch to the branch we need, in accordance with the table below. It is also worth noting that different branches require different Ansible versions.

The stable-* branches have been checked by QE and rarely receive corrections during their life cycle:

Branch        Supported Ceph releases      Required Ansible version
stable-3.0    jewel, luminous              2.4
stable-3.1    luminous, mimic              2.4
stable-3.2    luminous, mimic              2.6
stable-4.0    nautilus                     2.8

We will use the nautilus release in this example:

git checkout stable-4.0

Install all the necessary dependencies

pip install -r requirements.txt
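
Since the branch table above ties stable-4.0 to Ansible 2.8, it is worth confirming which version ended up installed (a quick sanity check, not part of the official procedure):

ansible --version    # should report a 2.8.x release for the stable-4.0 branch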

Copy the example config files:

cp site.yml.sample site.yml
cp group_vars/all.yml.sample group_vars/all.yml
cp group_vars/mons.yml.sample group_vars/mons.yml
cp group_vars/osds.yml.sample group_vars/osds.yml

Create an inventory file (here called inventory_hosts) describing all our servers:

[mons]
192.168.2.1
192.168.2.2
192.168.2.3
 
[osds]
192.168.2.1
192.168.2.2
192.168.2.3
 
[mgrs]
192.168.2.1
 
[grafana-server]
192.168.2.1
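
Before moving on, an optional check that Ansible can reach every host in the inventory:

ansible -i inventory_hosts all -m ping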

Next, we bring the main site.yml file to the following form:

---
# Defines deployment design and assigns role to server groups

- hosts:
  - mons
  - osds

  gather_facts: false
  any_errors_fatal: true
  become: true

  tags: always

  vars:
    delegate_facts_host: True

  pre_tasks:
    # If we can't get python2 installed before any module is used we will fail
    # so just try what we can to get it installed

    - import_tasks: raw_install_python.yml

    - name: gather facts
      setup:
      when: not delegate_facts_host | bool

    - name: gather and delegate facts
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: True
      with_items: "{{ groups['all'] }}"
      run_once: true
      when: delegate_facts_host | bool

    - name: install required packages for fedora > 23
      raw: sudo dnf -y install python2-dnf libselinux-python ntp
      register: result
      when:
        - ansible_distribution == 'Fedora'
        - ansible_distribution_major_version|int >= 23
      until: result is succeeded

    - name: check if it is atomic host
      stat:
        path: /run/ostree-booted
      register: stat_ostree
      tags: always

    - name: set_fact is_atomic
      set_fact:
        is_atomic: '{{ stat_ostree.stat.exists }}'
      tags: always

  tasks:
    - import_role:
        name: ceph-defaults
    - import_role:
        name: ceph-facts
    - import_role:
        name: ceph-validate
    - import_role:
        name: ceph-infra

- hosts: mons
  gather_facts: false
  become: True
  any_errors_fatal: true
  pre_tasks:
    - name: set ceph monitor install 'In Progress'
      run_once: true
      set_stats:
        data:
          installer_phase_ceph_mon:
            status: "In Progress"
            start: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"

  tasks:
    - import_role:
        name: ceph-defaults
      tags: ['ceph_update_config']
    - import_role:
        name: ceph-facts
      tags: ['ceph_update_config']
    - import_role:
        name: ceph-handler
    - import_role:
        name: ceph-common
    - import_role:
        name: ceph-config
      tags: ['ceph_update_config']
    - import_role:
        name: ceph-mon
    - import_role:
        name: ceph-mgr
      when: groups.get(mgr_group_name, []) | length == 0

  post_tasks:
    - name: set ceph monitor install 'Complete'
      run_once: true
      set_stats:
        data:
          installer_phase_ceph_mon:
            status: "Complete"
            end: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"

- hosts: mgrs
  gather_facts: false
  become: True
  any_errors_fatal: true
  pre_tasks:
    - name: set ceph manager install 'In Progress'
      run_once: true
      set_stats:
        data:
          installer_phase_ceph_mgr:
            status: "In Progress"
            start: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"

  tasks:
    - import_role:
        name: ceph-defaults
      tags: ['ceph_update_config']
    - import_role:
        name: ceph-facts
      tags: ['ceph_update_config']
    - import_role:
        name: ceph-handler
    - import_role:
        name: ceph-common
    - import_role:
        name: ceph-config
      tags: ['ceph_update_config']
    - import_role:
        name: ceph-mgr

  post_tasks:
    - name: set ceph manager install 'Complete'
      run_once: true
      set_stats:
        data:
          installer_phase_ceph_mgr:
            status: "Complete"
            end: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"

- hosts: osds
  gather_facts: false
  become: True
  any_errors_fatal: true
  pre_tasks:
    - name: set ceph osd install 'In Progress'
      run_once: true
      set_stats:
        data:
          installer_phase_ceph_osd:
            status: "In Progress"
            start: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"

  tasks:
    - import_role:
        name: ceph-defaults
      tags: ['ceph_update_config']
    - import_role:
        name: ceph-facts
      tags: ['ceph_update_config']
    - import_role:
        name: ceph-handler
    - import_role:
        name: ceph-common
    - import_role:
        name: ceph-config
      tags: ['ceph_update_config']
    - import_role:
        name: ceph-osd

  post_tasks:
    - name: set ceph osd install 'Complete'
      run_once: true
      set_stats:
        data:
          installer_phase_ceph_osd:
            status: "Complete"
            end: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"

- hosts: mons
  gather_facts: false
  become: True
  any_errors_fatal: true
  tasks:
    - import_role:
        name: ceph-defaults
    - name: get ceph status from the first monitor
      command: ceph --cluster {{ cluster }} -s
      register: ceph_status
      changed_when: false
      delegate_to: "{{ groups[mon_group_name][0] }}"
      run_once: true

    - name: "show ceph status for cluster {{ cluster }}"
      debug:
        msg: "{{ ceph_status.stdout_lines }}"
      delegate_to: "{{ groups[mon_group_name][0] }}"
      run_once: true
      when: not ceph_status.failed

- import_playbook: infrastructure-playbooks/dashboard.yml
  when:
    - dashboard_enabled | bool
    - groups.get(grafana_server_group_name, []) | length > 0
    - ansible_os_family in ['RedHat', 'Suse']

Next, we edit the group_vars/all.yml file. Here we set important parameters such as the Ceph release for the future cluster, the internal subnet, the monitor interface, the journal size and much more. For our example, the variables are configured as follows:

ceph_origin: repository
ceph_repository: community
ceph_stable_release: nautilus
monitor_interface: eth0
journal_size: 5120
#public_network: 0.0.0.0/0 - leave commented
cluster_network: 10.0.1.0/24 #specify the network for internal traffic
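
The dashboard and Grafana deployment is controlled by variables in this same file. If you want to set your own credentials instead of the defaults, group_vars/all.yml.sample exposes settings along these lines (the values shown are illustrative):

dashboard_enabled: true
dashboard_admin_user: admin
dashboard_admin_password: admin      # example value; set your own
grafana_admin_user: admin
grafana_admin_password: admin        # example value; set your own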

Next, edit the variables file responsible for configuring the OSDs, group_vars/osds.yml.

You can have the OSDs on each server discovered and set up fully automatically by specifying the variable:

osd_auto_discovery: true

However, we explicitly specify the drives of our servers:

devices:
- /dev/sdb
- /dev/sdc
- /dev/sdd
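
Before the run, it is worth confirming on each Ceph server that these disks are present and empty, since ceph-volume will turn them into bluestore OSDs (a manual check, not part of the playbooks):

lsblk /dev/sdb /dev/sdc /dev/sdd    # the disks should show no partitions or mounts
# leftover signatures can be wiped if necessary, e.g.:
# wipefs -a /dev/sdb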

Preparation is complete and you are now ready to run ansible-playbook:

ansible-playbook site.yml -i inventory_hosts

The approximate deployment time is 10 minutes. On successful execution, the console will end with output like the following:

INSTALLER STATUS ****************************************************************
Install Ceph Monitor           : Complete (0:02:48)
Install Ceph Manager           : Complete (0:01:14)
Install Ceph OSD               : Complete (0:01:29)
Install Ceph Dashboard         : Complete (0:00:38)
Install Ceph Grafana           : Complete (0:01:08)
Install Ceph Node Exporter     : Complete (0:01:31)

Thursday 22 August 2019  10:20:20 +0200 (0:00:00.064)       0:09:50.539 *********
=================================================================================
ceph-common : install redhat ceph packages ------------------------------- 79.72s
ceph-container-engine : install container package ------------------------ 37.51s
ceph-mgr : install ceph-mgr packages on RedHat or SUSE ------------------- 35.18s
ceph-osd : use ceph-volume lvm batch to create bluestore osds ------------ 29.40s
ceph-grafana : wait for grafana to start --------------------------------- 19.38s
ceph-config : generate ceph configuration file: ceph.conf ---------------- 12.03s
ceph-grafana : install ceph-grafana-dashboards package on RedHat or SUSE -- 9.11s
ceph-common : install centos dependencies --------------------------------- 8.23s
ceph-validate : validate provided configuration --------------------------- 7.23s
ceph-mgr : wait for all mgr to be up -------------------------------------- 6.72s
ceph-mon : fetch ceph initial keys ---------------------------------------- 5.33s
ceph-dashboard : set or update dashboard admin username and password ------ 5.25s
ceph-facts : set_fact fsid from ceph_current_status ----------------------- 4.30s
check for python ---------------------------------------------------------- 4.02s
ceph-mon : waiting for the monitor(s) to form the quorum... --------------- 3.94s
ceph-facts : create a local fetch directory if it does not exist ---------- 3.75s
ceph-facts : set_fact devices generate device list when osd_auto_discovery- 3.41s
gather and delegate facts ------------------------------------------------- 3.37s
ceph-osd : apply operating system tuning ---------------------------------- 2.91s
ceph-container-engine : start container service --------------------------- 2.89s

We have successfully installed the following components:

Ceph Monitor
Ceph Manager
Ceph OSD
Ceph Dashboard
Ceph Grafana
Ceph Node Exporter

To access the dashboard web UI, use the link:

http://192.168.2.1:8443/

Default login and password – admin / admin

We have successfully completed the basic setup of a Ceph distributed storage system. If the Storage Cluster needs to be subsequently expanded, just add a new server to the inventory and you’re good to go!
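
To tie this back to the Kubernetes use case, you can verify the cluster from one of the monitor nodes and create a dedicated pool plus a client key for RBD volumes. The pool name kube, the placement-group count and the capability profiles below are illustrative choices rather than anything ceph-ansible configures for you:

ceph -s                                    # overall status; should report HEALTH_OK
ceph osd pool create kube 64               # example pool with 64 placement groups
ceph osd pool application enable kube rbd  # tag the pool for RBD usage
ceph auth get-or-create client.kube mon 'profile rbd' osd 'profile rbd pool=kube'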
