CephFS vs NFS For Docker Cluster Data Storage?

DevOps | Updated on October 9, 2020

CephFS vs. NFS Is a Question Our DevOps Team Regularly Encounters When Building a Docker Cluster On A Bare-Metal Server. This Is How They Answer The Question

When K&C’s DevOps engineers build a Docker cluster on a physical (bare-metal) server, the CephFS vs NFS question often arises: which of the two storage systems should hold the persistent data that must be available to all of the cluster’s servers? Without such shared storage, much of the point of running Docker containers in a cluster is lost, because only with it can the cluster function in high availability mode. Moreover, an application placed on one of the cluster’s worker nodes should keep its access to the data storage and continue working if one of the servers in the data storage cluster drops out, is lost, or becomes unavailable.

NFS or CephFS?

It is clear that a persistent storage solution is necessary. The question then becomes whether CephFS or NFS is the optimal solution. NFS (Network File System) is one of the most commonly used data storage systems that meet our minimum requirements. It provides transparent access to files and file systems on a server, and it enables any client application that can work with a local file to work with a file on NFS without any modification to the program.
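A minimal sketch of what "without any program modification" means in practice. The helper names and the `/mnt/nfs/shared` mount point are hypothetical, and a temporary local directory stands in for the NFS mount so the example runs anywhere; the code uses only ordinary file APIs, which is the point.

```python
import tempfile
from pathlib import Path


def save_report(directory: Path, name: str, text: str) -> Path:
    """Write a text file with ordinary file APIs; nothing here is NFS-specific."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / name
    target.write_text(text, encoding="utf-8")
    return target


def load_report(path: Path) -> str:
    """Read the file back, again with no knowledge of where it is stored."""
    return path.read_text(encoding="utf-8")


if __name__ == "__main__":
    # On a cluster node this could be an NFS mount such as Path("/mnt/nfs/shared")
    # (hypothetical path). A temporary local directory stands in for it here.
    with tempfile.TemporaryDirectory() as shared:
        path = save_report(Path(shared) / "reports", "status.txt", "worker-1 ok\n")
        print(load_report(path), end="")
```

Pointing `save_report` at a directory on an NFS mount instead of the temporary directory should require no change to either function.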

NFS Server Scheme

From the scheme above, you can see that the NFS server contains data which is available to every server in the cluster. The given scheme works well for projects involving modest data volumes and without the requirement for high-speed input/output.

What issues can you face when working with NFS?

Potential NFS Problem #1 

The whole load goes to the hard drive on the NFS server, which every other server in the cluster calls on to perform read/write operations.

Potential NFS Problem #2

There is a single endpoint to the server holding the data. If that data server drops out, our application is likely to fail along with it.

What is CEPH?

CEPH is one of the most advanced and popular distributed file and object storage systems. It is an open-source, software-defined storage system backed by the open-source specialist Red Hat.

Key CEPH Features

  • no single point of entry;
  • easily scalable to petabytes;
  • stores and replicates data;
  • responsible for load balancing;
  • guarantees accessibility and system robustness;
  • free (however, its developers might supply fee-based support);
  • no need for special equipment (the system can be deployed at any data center).

How CEPH Works As A Data Storage Solution

CEPH keeps and provides data for clients in the following ways:

1) RADOS – as an object.

2) RBD – as a block device.

3) CephFS – as files, through a POSIX-compliant file system.

Access to the distributed storage of RADOS objects is given with the help of the following interfaces:

1) RADOS Gateway – a RESTful interface compatible with Swift and Amazon S3.

2) librados and the related C/C++ bindings.

3) rbd and QEMU-RBD – block device access from the Linux kernel and from QEMU.


Here you can see how data placement is implemented in the CEPH cluster with the replication x2:

CEPH cluster with replication x2

And here you can see how data is restored inside the cluster in case of the loss of a CEPH cluster node:

how data is restored inside cluster in case of loss of the node in CEPH cluster
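To make the two pictures concrete, here is a toy model of replicated placement and recovery. It is only an illustration: it uses rendezvous (highest-random-weight) hashing as a stand-in and is not CEPH's actual CRUSH placement algorithm. The node and object names are hypothetical.

```python
import hashlib


def place(obj: str, nodes: list[str], replicas: int = 2) -> list[str]:
    """Pick `replicas` distinct nodes for an object.

    Toy rendezvous hashing, NOT CEPH's CRUSH: each node gets a
    pseudo-random score per object and the top scorers hold the copies.
    """
    def score(node: str) -> int:
        digest = hashlib.sha256(f"{obj}:{node}".encode()).hexdigest()
        return int(digest, 16)

    return sorted(nodes, key=score, reverse=True)[:replicas]


nodes = ["node-a", "node-b", "node-c"]          # hypothetical cluster nodes
objects = [f"object-{i}" for i in range(6)]     # hypothetical object names

# Replication x2: every object lives on two different nodes.
before = {o: place(o, nodes) for o in objects}

# Simulate losing node-b. Re-ranking the survivors restores two copies of
# every object, while copies that were already on a survivor stay in place.
survivors = [n for n in nodes if n != "node-b"]
after = {o: place(o, survivors) for o in objects}

for o in objects:
    print(o, before[o], "->", after[o])
```

In this model only the copies that lived on the lost node have to be recreated elsewhere, which mirrors the idea in the recovery picture.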

Ceph Use Cases – Some of The Companies That Use CephFS As A Data Storage Solution In A Hybrid Cloud DevOps Architecture

CEPH has gained a wide audience, and some well-known companies make use of CephFS:

CEPH corporate users

How CEPH Works and Infrastructure Requirements

CEPH’s primary infrastructure requirement is a stable network connection between the cluster’s servers. The minimum is a 1 Gb/s link between servers, but network interfaces with a bandwidth of 10 Gb/s are recommended.

From our experience of building CEPH clusters, it’s worth mentioning that the network infrastructure can become a bottleneck in Docker clusters. Any problem in the network can delay the delivery of data to clients, slow the cluster down, and trigger a rebalancing of data within the cluster. We recommend placing the CEPH cluster’s servers in one server rack and connecting them to each other through additional internal network interfaces.

Our experience at K&C also includes clusters built on 1 Gb/s network links, with servers that are not connected through internal interfaces and are placed in different server racks, which in turn are located in different data centers. Even in such a scheme, the cluster’s operation can be regarded as satisfactory, as it meets an SLA of 99.9% data availability.
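As a rough guide to what a 99.9% figure allows, the short calculation below converts an availability percentage into permitted unavailability. It assumes a 365-day year and a 30-day month; the function name is ours, not part of any tool.

```python
def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = 365 * 24) -> float:
    """Hours of unavailability permitted within the period at the given availability."""
    return period_hours * (1 - availability_percent / 100)


per_year = allowed_downtime_hours(99.9)
per_month_minutes = allowed_downtime_hours(99.9, period_hours=30 * 24) * 60

print(f"{per_year:.2f} hours per year")          # prints 8.76 hours per year
print(f"{per_month_minutes:.1f} minutes per 30-day month")  # prints 43.2 minutes per 30-day month
```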

Building A Minimum Requirements Docker Cluster Using CephFS

Let’s consider building a minimal cluster. In this example, we’ll use a 1 Gb/s network interface between the servers of the CEPH cluster, and clients are connected through the same network interface. The primary requirement, in this case, is to resolve the problems mentioned above that occur with the NFS server storage scheme. For clients, the data will be provided as a file system.

CEPH data store diagram

In a scheme like the above, we have three physical servers and three hard drives allotted to the CEPH cluster’s data. The drives are HDDs (not SSDs) of 6 TB each, and the replication factor is x3. As a result, the raw capacity totals 18 TB (each piece of data is stored three times, so the usable capacity is lower). Each of the CEPH cluster’s servers, in turn, is an entry point to the cluster for end clients. This allows us to “lose” (server down / server maintenance /…) one of the CEPH cluster’s servers at a time without harming the end clients’ data or its availability.
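The capacity arithmetic for this layout can be sketched as follows. It is a simplification that ignores file system and metadata overhead as well as the free-space headroom a CEPH cluster needs in practice, so real usable capacity will be somewhat lower; the function name is illustrative.

```python
def capacity_tb(drives: int, drive_tb: float, replication: int) -> tuple[float, float]:
    """Return (raw, usable) capacity in TB for a replicated pool.

    Simplified: usable = raw / replication, with no allowance for
    overhead or free-space headroom.
    """
    raw = drives * drive_tb
    usable = raw / replication
    return raw, usable


raw, usable = capacity_tb(drives=3, drive_tb=6, replication=3)
print(f"raw: {raw:g} TB, usable: {usable:g} TB")  # prints raw: 18 TB, usable: 6 TB
```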

With this scheme, we solve the problem of NFS being a single entry point to our data storage, and we also speed up data operations.

Let’s test the throughput of our cluster by writing a 500 GB file to the CEPH cluster from a client server.

Graph: loading the file into CEPH cluster

The graph shows that loading the file into the CEPH cluster takes a little over five hours. Note also the load on the network interface: it runs at about 30% (300 Mbps), not at 100% as you might have assumed. The reason is the limited write speed of the HDDs. You can achieve faster reads and writes by building the CEPH cluster with SSD drives, but the total cost of the cluster then increases significantly.
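A back-of-the-envelope check of these figures, assuming decimal units (1 GB = 1000 MB) and an upload of exactly five hours. Both are our assumptions, and the function names are illustrative.

```python
def transfer_hours(size_gb: float, rate_mbit_s: float) -> float:
    """Hours needed to move `size_gb` gigabytes at a steady rate in Mbit/s."""
    size_megabits = size_gb * 1000 * 8
    return size_megabits / rate_mbit_s / 3600


def average_rate_mbit_s(size_gb: float, hours: float) -> float:
    """Average rate in Mbit/s implied by moving `size_gb` gigabytes in `hours`."""
    return size_gb * 1000 * 8 / (hours * 3600)


print(f"{transfer_hours(500, 1000):.1f} h at the full 1 Gbit/s")          # prints 1.1 h ...
print(f"{transfer_hours(500, 300):.1f} h at a steady 300 Mbit/s")         # prints 3.7 h ...
print(f"{average_rate_mbit_s(500, 5):.0f} Mbit/s average over 5 hours")   # prints 222 Mbit/s ...
```

Under these assumptions, a five-hour upload corresponds to an average of roughly 222 Mbit/s, which suggests the 300 Mbps reading is closer to a typical or peak value than to the average over the whole transfer.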

In Conclusion – CEPH or NFS As Our Data Storage Solution For A Docker Cluster?

The choice between NFS and CEPH depends on a project’s requirements and scale, and should also take into account future developments such as scalability requirements. We’ve worked on projects for which CEPH was the optimal choice, and on others where it was NFS. Broadly speaking, for small clusters with modest data loads, NFS can be a cheap, easy and perfectly suitable choice. For larger projects where heavier data loads will be processed and stored, the more sophisticated CEPH solution will most likely be recommended.
