The cloud native tech stack has blossomed over the past few years. From containerisation and orchestration to monitoring, logging, security and app-specific needs such as streaming and messaging, there is now almost too much choice of technologies and tools.
The choice of what is included in a cloud native tech stack will often vary from project to project. It is based on specific needs, priorities and characteristics of your cloud native applications.
Every modern laptop offers the same core functionalities but has its own unique specifications, qualities and features. Every public cloud platform likewise offers ‘must have’ services and features but has its own unique offerings or qualities, which make it the optimal or preferred choice for a given workload. In the same way, a choice within the cloud native technology stack will be best suited to one project but will make way for a more precisely suited alternative on another.
But all things being equal, at K&C we do have our preferences when it comes to our cloud native tech stack: the tools and technologies we rely on most often, unless the unique requirements of a project mean another choice is better suited.
This is our ultimate cloud native tech stack for 2020!
Docker’s role in the cloud native tech stack is simple – it makes containerisation easy. The beauty of containers is that a well-tuned container system can allow for 400%-600% more server application instances than VMs running on the same hardware. Containers also lend themselves to the Continuous Integration/Continuous Deployment (CI/CD) DevOps methodology, which we’ll get to shortly.
Docker allows our devs to easily pack, ship, and run any application as a lightweight, portable, self-sufficient container, to run virtually (pun intended!) anywhere.
Docker’s partnering with the other container big beasts, like Canonical, Google, Red Hat, and Parallels, on its key open-source component libcontainer, has also brought about a very convenient standardisation to containers.
There aren’t any real alternatives to what Docker brings to the cloud native tech stack. Other container implementations, such as rkt from CoreOS (now part of Red Hat) or LXD by Canonical, are less competitors and more refinements of the same Linux container approach.
A vendor-agnostic cluster and container management tool that provides a “platform for automating deployment, scaling, and operations of application containers across clusters of hosts”, Kubernetes is the result of a 2014 Google-run open-source initiative. The internet giant had been running massive systems on Borg, its internal predecessor to Kubernetes, for around 10 years – more than enough proof of the approach’s efficacy.
An integral, arguably the most integral, component of our cloud native technology stack, Kubernetes simplifies operations and architecture with the net result of reducing the costs incurred by a cloud native strategy – especially in the context of multi or hybrid cloud infrastructures.
Like the virtual machines they are replacing, the inherent issue with containers is that they need to be kept track of. Orphaned containers not contributing to an application can inflate cloud CPU and storage bills. Containers might also need more memory, CPU or storage, or need to be shut down when the load lightens. All of that requires container orchestration – which is what Kubernetes brings to the cloud native party.
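To make the orchestration idea concrete, here is a minimal Python sketch of a reconciliation pass: compare what should be running with what is running, then start, stop or clean up accordingly. The `ToyCluster` class, the app names and the replica counts are all hypothetical; this is a toy model of the concept, not Kubernetes’s actual API or implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical in-memory stand-in for a cluster (not a Kubernetes API)."""
    desired: dict = field(default_factory=dict)  # app name -> wanted replicas
    running: dict = field(default_factory=dict)  # app name -> actual replicas


def reconcile(cluster: ToyCluster) -> list:
    """Bring the running state in line with the desired state; return the actions."""
    actions = []
    for app in sorted(set(cluster.desired) | set(cluster.running)):
        want = cluster.desired.get(app, 0)
        have = cluster.running.get(app, 0)
        if want > have:
            actions.append(f"start {want - have} x {app}")
        elif want < have:
            # Covers scale-down as well as orphans that nothing asks for any more.
            actions.append(f"stop {have - want} x {app}")
        if want:
            cluster.running[app] = want
        else:
            cluster.running.pop(app, None)
    return actions


cluster = ToyCluster(
    desired={"web": 3, "worker": 1},
    running={"web": 1, "worker": 2, "old-batch-job": 4},  # last entry is orphaned
)
actions = reconcile(cluster)
print(actions)
```

Running the pass a second time returns no actions, because the running state already matches the desired state. That repeated compare-and-correct loop is the essence of what an orchestrator automates.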
Kubernetes’s role is to add a further layer of abstraction between machines, containers, storage and networks and their physical implementation by offering a homogeneous interface for the deployment of containers to all kinds of clouds, virtual machines and physical machines.
Amazon ECS can offer an alternative to Kubernetes for a cloud native architecture that only uses AWS as a vendor.
A vendor-agnostic alternative is Docker Swarm. Like Kubernetes, Swarm is production ready and has a similar role and features, though it takes a different approach. Docker Swarm shares the same API and structure as Docker itself, while Kubernetes is more of a generalist tool and can work with other container engines, allowing for more flexibility.
Developers who use Kubernetes need to learn a new set of commands, but these then allow for more precise control over domain and cluster management. It’s easier to create and deploy test clusters using Swarm, but Kubernetes, thanks to its namespace separation, offers a unified way to use a single cluster for production, staging and development. In the majority of cases that means it’s not necessary to create a cluster specifically for testing.
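The namespace idea can be sketched in a few lines of Python. The `NamespacedCluster` class, the `shop-api` deployment and its image tags are hypothetical and only illustrate the principle that a resource name has to be unique within a namespace rather than across the whole cluster; real namespaces also carry quotas and access rules, which this sketch leaves out.

```python
class NamespacedCluster:
    """Hypothetical model of one cluster whose resources are keyed by namespace."""

    def __init__(self):
        self._deployments = {}  # (namespace, name) -> image

    def apply(self, namespace: str, name: str, image: str) -> None:
        """Create or update a deployment inside one namespace only."""
        self._deployments[(namespace, name)] = image

    def in_namespace(self, namespace: str) -> dict:
        """Return the deployments visible from inside a single namespace."""
        return {
            name: image
            for (ns, name), image in self._deployments.items()
            if ns == namespace
        }


cluster = NamespacedCluster()
cluster.apply("production", "shop-api", "shop-api:1.4.0")
cluster.apply("staging", "shop-api", "shop-api:1.5.0-rc1")
cluster.apply("development", "shop-api", "shop-api:dev")

# Updating staging leaves production untouched, even though the name is shared.
cluster.apply("staging", "shop-api", "shop-api:1.5.0-rc2")
print(cluster.in_namespace("production"))
```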
Swarm has quickly improved its features over the years but we still feel that Kubernetes offers greater flexibility and precision for developers experienced in using it.
Prometheus is the time-series database and monitoring solution of choice in our cloud native tech stack. Monitoring applications and their servers has become an integral part of the DevOps process. Application exceptions, server CPU and memory usage, storage spikes and unresponsive services within an application all need to be watched carefully to keep performance and cloud costs under control.
The main features and benefits of Prometheus are:
• Dimensional data modelling – time series are identified by a metric name and a set of key-value pairs.
• Queries – PromQL lets you slice collected time series data to generate ad-hoc graphs, tables and alerts.
• Visualisation – multiple data visualisation modes are a major plus of Prometheus: an integrated expression browser, Grafana integration and a console template language.
• Storage Efficiency – Prometheus’s custom format stores time series data efficiently both in memory and on local disk. Functional sharding and federation allow for scaling.
• Operational Simplicity – each server is independent and relies only on local storage, which improves reliability. Go binaries are statically linked, which makes deployment a simple process.
• Alerting – alerts are precise, maintain dimensional data and are based on Prometheus’s flexible PromQL.
• Multiple Client Libraries – client libraries facilitate easy instrumentation of services. More than 10 languages are currently supported and custom libraries are also simple to implement.
• Integration – third-party data such as Docker, StatsD, HAProxy and JMX metrics and system statistics is bridged into Prometheus through existing exporters.
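The dimensional data model in the list above is easier to picture with a small example. The Python sketch below keeps one counter value per unique combination of labels and renders the result in the style of the Prometheus text exposition format. `ToyCounter` is a hypothetical stand-in written for illustration, not the official Prometheus client library, and the metric and label names are assumptions.

```python
from collections import defaultdict


class ToyCounter:
    """Hypothetical counter keyed by label sets (not the official client library)."""

    def __init__(self, name: str):
        self.name = name
        self._values = defaultdict(float)  # sorted label tuple -> running total

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        # Sorting makes the series identity independent of keyword order.
        key = tuple(sorted(labels.items()))
        self._values[key] += amount

    def expose(self) -> str:
        """Render one line per time series: name{label="value",...} total."""
        lines = []
        for key in sorted(self._values):
            label_text = ",".join(f'{k}="{v}"' for k, v in key)
            series = f"{self.name}{{{label_text}}}" if key else self.name
            lines.append(f"{series} {self._values[key]:g}")
        return "\n".join(lines)


requests = ToyCounter("http_requests_total")
requests.inc(method="GET", status="200")
requests.inc(status="200", method="GET")   # same series, different keyword order
requests.inc(method="POST", status="500")
print(requests.expose())
```

Each distinct label combination is its own time series, which is what lets a query language such as PromQL filter or aggregate along any of those dimensions later.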
Prometheus Operator allows monitoring instances to be defined and managed as Kubernetes resources. If you already know how to manage Kubernetes, the barrier to effectively defining the monitoring of your applications through Operator is low.
Graphite and InfluxDB are alternatives to Prometheus in a cloud native or dynamic environment, though no single alternative offers the same range of functionalities, features and applications as Prometheus. They are also generally harder to integrate. Each, however, has its strengths.
Graphite, for example, can be a better choice for a clustered solution that can hold historical data long term. InfluxDB’s strengths are in event logging, long-term data storage and a consistent view of data between replicas.
But, with the exception of specific use cases where these needs should be prioritised, Prometheus is the monitoring tool of choice in K&C’s cloud native technology stack.
Originally developed by Google together with IBM and Lyft, and built to work with Kubernetes, the open-source Istio service mesh tames the complexities of managing the networks used to connect microservices. While a microservices architecture solves many of the problems that stem from cloud and particularly multi-cloud deployment, it also introduces new challenges.
Development, updates and scaling are all made easier by breaking an app down into microservices. But these microservices are moving parts of a larger entity and they need to be connected and secured. The management of network services like load balancing, traffic management, authentication and authorisation becomes complex. The networked space between microservices within a Kubernetes cluster is the service mesh, and its role is to take care of exactly these network services.
Istio’s data plane looks after network traffic between the services in the mesh, while the control plane Istio is built around manages and secures the data plane.
Abstraction is Istio’s core benefit. Abstraction means programmatic changes to the mesh can be actioned through Istio commands, which means services themselves don’t need to be redeveloped when network policies or quotas change. The networking spaces between services also don’t need to be directly updated.
A further benefit of Istio is that non-destructive changes to the cluster’s network configuration can be rolled out top to bottom and easily rolled back if they create problems. It also provides detailed statistics and reporting about what’s going on between containers and cluster nodes, which means any counterproductive updates are quickly spotted.
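The split between data plane and control plane, and the ability to roll a network change out and back without touching the services, can be illustrated with a toy Python model. `SidecarProxy`, `ControlPlane` and the `v1`/`v2` version names are hypothetical; the sketch assumes simple weighted routing and is not how Istio itself is configured or implemented.

```python
import random


class SidecarProxy:
    """Hypothetical data-plane proxy sitting next to one service instance."""

    def __init__(self):
        self.weights = {}

    def route(self, rng: random.Random) -> str:
        """Pick a destination version according to the current weights."""
        versions = list(self.weights)
        return rng.choices(versions, weights=[self.weights[v] for v in versions])[0]


class ControlPlane:
    """Hypothetical control plane: pushes config to every proxy, keeps history."""

    def __init__(self, proxies, initial_weights):
        self.proxies = proxies
        self.current = dict(initial_weights)
        self.history = []
        self._push()

    def _push(self) -> None:
        for proxy in self.proxies:
            proxy.weights = dict(self.current)

    def apply(self, weights) -> None:
        """Roll a new routing policy out to the whole mesh."""
        self.history.append(self.current)
        self.current = dict(weights)
        self._push()

    def rollback(self) -> None:
        """Restore the policy that was active before the last apply()."""
        self.current = self.history.pop()
        self._push()


proxies = [SidecarProxy() for _ in range(3)]
mesh = ControlPlane(proxies, {"v1": 100})
mesh.apply({"v1": 90, "v2": 10})   # canary: send roughly 10% of traffic to v2
canary_weights = dict(proxies[0].weights)
mesh.rollback()                    # v2 misbehaves, so restore the old policy
```

The services behind the proxies never appear in this code, and that is the point: the policy changes in one place and the proxies pick it up.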
Finally, while Istio integrates most directly and deeply with Kubernetes, its open standards mean it is platform independent and can be used with other orchestration systems.
The main alternatives to Istio as a service mesh are Linkerd, Envoy and HashiCorp’s Consul. Consul Connect is simple, flexible and enjoys strong support from HashiCorp. We often use it as an alternative to Istio in projects where the client has a preference for HashiCorp products or our developers prefer it for the particular use case.
Linkerd is also simple and well supported by its creators, Buoyant. While there’s nothing wrong with it, per se, we tend to favour Istio’s inherent compatibility with Kubernetes. That, and well-implemented features (plus the fact we believe it’s best positioned to lead the market and is, therefore, future-proofed to a greater extent), mean Istio secures its position in our dream team cloud native tech stack for 2020.
For CI/CD Automation, the choice is harder. We use both Jenkins and GitLab, with the preference boiling down to the specific project. As a general rule of thumb, we most often opt for Jenkins in the context of larger projects and GitLab for medium-sized and smaller projects.
Jenkins provides Continuous Integration and Continuous Delivery (CI/CD) automation across combinations of languages and source code repositories. It’s still necessary to write scripts for the different steps in the process, but Jenkins accelerates and secures build, test and deployment pipelines.
Jenkins neutralises the DevOps issue of individual developers having to build and test on a local machine before committing their code. This is a major problem when a whole team of devs is committing new code, because their changes are tested in isolation rather than in combination. That can often mean problems once multiple updates, each built and tested in isolation, are committed.
Jenkins is an open-source automation server written in Java that allows DevOps teams to test the combined impact of code changes before they are committed to the live repository. It comes with 1,600 plug-ins across platforms, UI, administration, source code management and build management.
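A toy example shows why testing changes in combination matters. In the Python sketch below, the configuration keys, the two commits and the 120-second limit are all invented for illustration: each change passes the stand-in test suite on its own, yet the two together fail it.

```python
def passes_tests(config: dict) -> bool:
    """Stand-in test suite: the worst-case wait must stay within 120 seconds."""
    return config["timeout_s"] * config["retries"] <= 120


def apply_changes(base: dict, *changes: dict) -> dict:
    """Return a copy of the base config with every change applied in order."""
    merged = dict(base)
    for change in changes:
        merged.update(change)
    return merged


base = {"timeout_s": 30, "retries": 3}
change_a = {"timeout_s": 40}  # developer A's commit (hypothetical)
change_b = {"retries": 4}     # developer B's commit (hypothetical)

# Each developer tests only their own change against the base.
isolated = [passes_tests(apply_changes(base, c)) for c in (change_a, change_b)]

# An integration server tests what the shared branch will actually contain.
combined = passes_tests(apply_changes(base, change_a, change_b))

print(isolated, combined)
```

Both isolated checks pass while the combined check fails, which is the class of problem a shared CI server is there to catch before the code goes live.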
Jenkins also supports version control tools like Subversion, Git and Mercurial, as well as build tools like Maven.
When Jenkins is integrated with Docker, the application is run inside a Docker container. Jenkins builds the Docker image with your application and pushes it to either a public or private Docker registry.
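As a rough sketch of the build-and-push step, the Python snippet below only assembles the two Docker CLI commands a pipeline stage would run and does not execute them. The registry host `registry.example.com`, the app name `shop-api` and the choice to tag the image with the build number are assumptions made for illustration; a real Jenkins job would typically express this in its own pipeline script.

```python
import shlex


def docker_pipeline_commands(registry: str, app: str, build_number: int) -> list:
    """Return the build and push commands for one pipeline run (dry run only)."""
    image = f"{registry}/{app}:{build_number}"
    return [
        ["docker", "build", "-t", image, "."],  # build from the Dockerfile in "."
        ["docker", "push", image],               # publish the image to the registry
    ]


# Hypothetical values; a CI server would supply its own build number.
commands = docker_pipeline_commands("registry.example.com", "shop-api", 42)
rendered = [shlex.join(cmd) for cmd in commands]
for line in rendered:
    print(line)
```

Tagging each image with a unique build identifier is one common convention because it makes it clear which pipeline run produced the image that ends up deployed.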
Jenkins excels when used for large projects involving high levels of customisation. Its biggest drawback is it must be run on a server requiring attention, updates and maintenance.
The main strengths of Jenkins are:
• Self-hosted, which means full control over workspaces
• Workspace control aided by easy debugging of runners
• Strong credential management
• Extensive plugin library
The main drawbacks of Jenkins are:
• Set-up and server maintenance can be a prohibitive overhead for small projects
• Complex plugin integration
• A new pipeline is needed for each environment e.g. Production/Testing
GitLab is now a serious alternative to Jenkins and is our preferred tool for certain, usually smaller, projects. It comes into its own for small to medium-scale projects thanks to the short set-up time, simple integration of new jobs and flexibility. In agile development, the granular nature of GitLab’s graphical interface and its flexible adjustment are strengths.
However, for big projects, the granularity and the central configuration in .gitlab-ci.yml mean the structure can quickly become problematically complex.
The main strengths of GitLab are:
• Strong Docker integration
• Parallel job execution within stages
• Directed acyclic graph pipeline
• Adding jobs is simple
• Merge requests are integrated
• Concurrent runners allow for scalability
The main drawbacks of GitLab are:
• Artifacts have to be defined, uploaded and downloaded for each individual job
• Merged branch testing can’t be done ahead of the actual merge.
• No support for stages within stages
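The directed acyclic graph pipeline and parallel job execution listed among GitLab’s strengths can be sketched with Python’s standard `graphlib` module. The job names and their dependencies below are hypothetical, and grouping jobs into waves is a simplification: in a true DAG pipeline a job can start as soon as its own dependencies finish, without waiting for the rest of a wave.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each job maps to the set of jobs it depends on.
needs = {
    "build-api": set(),
    "build-web": set(),
    "test-api": {"build-api"},
    "test-web": {"build-web"},
    "deploy": {"test-api", "test-web"},
}


def execution_waves(graph: dict) -> list:
    """Group jobs into waves; jobs in the same wave can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose dependencies are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(needs)
for number, jobs in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(jobs)}")
```

With these dependencies the two build jobs form the first wave, the two test jobs the second and the deploy job the third, so the API and web tracks never block each other until deployment.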