DevOps is a software development culture, approach and set of tools designed to promote integration and collaboration between traditionally distinct development and operations work and teams. Improving efficiency through the automation of repetitive manual tasks in the integration and deployment (CI/CD pipelines) of software iterations is central to a DevOps approach.
A DevOps architecture incorporates the tools that enable that automation into the design of a software application and the infrastructure that sits between its non-production (development and testing) and production (operations) environments.
In this blog, we’ll seek to provide a better understanding of the specifics of a DevOps architecture by introducing the DevOps approach and detailing a particular example of an application’s architecture designed by K&C’s senior DevOps engineer. You will see how it meets a defined set of requirements for a cloud-native web application, taking into consideration:
If AWS EKS is used instead of OpenShift, we should consider a simpler, more intuitive UI for Kubernetes management to optimize maintenance efforts.
But before presenting and breaking down an example of a DevOps architecture designed to meet these project requirements, let’s briefly address why DevOps consultants, architects and teams are currently our most in-demand service as an IT services provider.
DevOps itself is a set of best practices established to unify traditionally separated development and operations teams responsible for a single application into one holistic team. As a modus operandi of organisational culture, DevOps breaks down the barriers between the development and testing environments and the production environment.
This removes the traditional source of friction between development and operations: a ‘production-ready’ iteration of a new feature is handed over from the development team to the operations team, then runs into issues in production. It negates the potential for a ‘blame game’, and the wasted time and resources that inevitably follow when a new feature does not work smoothly in production despite no issues appearing in the development and testing environments.
In a DevOps culture, there is no dev and ops, just DevOps. Key to this holistic approach is the automation of testing, deployment and then review, or monitoring, in production. This automated process is the CI/CD pipeline, which stands for continuous integration and either continuous delivery or continuous deployment.
Well executed, a DevOps approach results in better quality software delivered in less time. That feeds through to reducing overall software development costs, improving business cases.
A DevOps CI/CD pipeline is achieved through the use of various tools, technologies and processes.
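As a rough illustration of what "automated pipeline" means in practice, the sketch below chains a few stages and stops at the first failure, so a broken build never reaches a later environment. The stage names and the `run_pipeline` helper are hypothetical and are not taken from any particular CI/CD tool.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, step in stages:
        if not step():
            print(f"stage '{name}' failed, stopping pipeline")
            break
        completed.append(name)
    return completed


# Hypothetical stages: real ones would call a build tool, a test runner, etc.
stages = [
    ("build", lambda: True),
    ("unit-tests", lambda: True),
    ("deploy-to-staging", lambda: False),  # simulate a failed deployment
    ("deploy-to-production", lambda: True),
]

completed = run_pipeline(stages)
print(completed)
```

Because the simulated staging deployment fails, the production stage is never attempted.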
Building a cloud-native web application on DevOps principles involves integrating the DevOps tools and technologies that give us a CI/CD pipeline. The requirement that the DevOps architecture run on AWS also influenced the choice of tools and technologies.
Our two DevOps architecture proposals, one based on AWS EKS alone and our recommended alternative using both AWS EKS and OpenShift, blended AWS-native and Open Source DevOps tools and technologies.
The two DevOps architecture proposals were:
The difference between the two boiled down to the need to use Helm to manage the Kubernetes cluster in the proposal that used only Amazon EKS, without OpenShift.
Let’s run through the role performed by each component or tool that forms part of the DevOps architecture prototypes.
Amazon Elastic Kubernetes Service (Amazon EKS) is AWS’s native managed Kubernetes containers-as-a-service (CaaS). It simplifies running Kubernetes clusters on AWS by removing the need to install, operate and maintain a stand-alone Kubernetes control plane, which is instead managed by EKS.
Amazon EKS detects and replaces unhealthy control plane instances (nodes) before they cause problems, automatically restarting them as required across Availability Zones within the Region. EKS maintains high availability of Kubernetes clusters by taking advantage of the AWS Regions architecture to eliminate any single point of failure.
No longer vulnerable to the loss of one or more availability zones, the resulting AWS-managed Kubernetes cluster becomes far more resilient.
Unlike the OpenShift/EKS alternative, our EKS-based reference DevOps architecture relies on the integration of IAM authentication with Kubernetes RBAC that AWS built into EKS in collaboration with Heptio.
OpenShift is a ‘family of containerization software’ built by Red Hat, and the OpenShift container platform is best described as a private platform-as-a-service (PaaS). OpenShift can be deployed and managed on an on-premise private cloud or bare metal server, or overlaid onto the infrastructure of a cloud provider, in this case AWS.
The fully managed OpenShift service is deployed and operated on AWS, with the two companies collaborating to offer a combined service that is mutually supported, billed directly from AWS along with other AWS-native resources and tools a client makes use of.
Our recommended DevOps architecture uses OpenShift rather than a pure AWS-native EKS solution for a couple of core reasons:
AWS EKS, advantages:
● Native support by Amazon
● Native KMS support
AWS EKS, disadvantages:
● Additional management UI needed (like Rancher)
OpenShift, advantages:
● Simple out-of-the-box settings for security and networking allow for a quick start
OpenShift, disadvantages:
● No native implementation by AWS
● Additional vendor (Red Hat) increases complexity
● OpenShift Templates are less flexible than Helm charts
OpenShift allows for more flexibility on deployments, with K8s clusters deployable either on-premises or in AWS. That can be a major plus for organisations that still have data and/or integrated applications hosted on-premise in a multi-cloud or hybrid-cloud environment.
Both AWS EKS and OpenShift require Kubernetes expertise, but the OpenShift web interface flattens the learning curve and makes it easier for DevOps teams to get up to speed.
OpenShift is more expensive than using EKS alone, but in the context of the app and DevOps architecture in question, we felt the efficiency and simplicity gains compensate for that. If the client organisation’s in-house team, which lacks experience in Kubernetes set-up, orchestration and maintenance, were to encounter more problems with the pure EKS approach, the cost of resolving them would almost certainly more than wipe out any surface-level savings.
● Adoption of IaC (Infrastructure as Code) for Prod and Pre-Prod
● Testing of infrastructural changes in the development environment without affecting production
● Less maintenance effort
● Any IaC change affects all Kubernetes tenants (i.e. namespaces)
We recommended a hard environment based on a Kubernetes cluster because it would allow for highly automated maintenance, and because separated clusters offer greater flexibility and a more robust infrastructure.
Our DevOps architecture uses AWS KMS for key management. KMS allows for the easy creation and management of cryptographic keys, controlling their use across AWS services and within the application. KMS protects keys with hardware security modules that have been validated under FIPS 140-2, or are in the process of being validated, which offers a high level of security and resilience. Key usage logs are a further plus, especially if an application must meet regulatory or compliance requirements.
To meet the requirements for monitoring and logging, we use a combination of Prometheus and Grafana for Kubernetes monitoring, Jaeger for distributed tracing, Kibana for logging and Grafana for broader application monitoring. All the tools are transferred from OpenShift into the management cluster.
Prometheus is a free and open-source event monitoring tool for containers and microservices that collects numerical time-series data. The Prometheus server works on the principle of scraping: it invokes the metrics endpoints of the nodes it has been configured to monitor. Each monitored node exposes the endpoint that is scraped. The metrics are collected at regular intervals, timestamped and stored locally.
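The scraping principle can be sketched in a few lines. The example below parses a small payload in the style of the Prometheus text format and appends one timestamped sample per series to a local store. It is a simplified illustration, not Prometheus's implementation: the payload, the `parse_metrics` and `scrape` helpers and the in-memory `store` are all hypothetical, and the parser assumes there are no optional timestamps on the lines.

```python
import time
from typing import Dict, List, Tuple

# Hypothetical payload, as a monitored node might expose it on its metrics endpoint.
SAMPLE_PAYLOAD = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
process_cpu_seconds_total 12.5
"""

Sample = Tuple[float, float]  # (scrape timestamp, value)


def parse_metrics(payload: str) -> Dict[str, float]:
    """Parse 'series value' lines, skipping comments and blank lines."""
    metrics = {}
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Simplification: the value is whatever follows the last space.
        series, value = line.rsplit(" ", 1)
        metrics[series] = float(value)
    return metrics


def scrape(payload: str, store: Dict[str, List[Sample]], now: float) -> None:
    """Append one timestamped sample per series to the local store."""
    for series, value in parse_metrics(payload).items():
        store.setdefault(series, []).append((now, value))


store: Dict[str, List[Sample]] = {}
scrape(SAMPLE_PAYLOAD, store, now=time.time())
for series, samples in store.items():
    print(series, samples)
```

Calling `scrape` repeatedly on a schedule would grow each series into a time series, which is the data a dashboard then queries.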
Grafana is a multi-platform data visualisation tool that charts or graphs the data available from a data source and adds value above and beyond Prometheus’s expression browser. Grafana also offers out-of-the-box integration with Prometheus. It can visualise metrics and logs in many different ways and is a highly efficient way to search or live stream logs.
Elasticsearch stores data in indices and acts as a distributed, scalable search engine for both full-text and structured search, with further applications in analytics. In the context of a Kubernetes cluster, Elasticsearch is used to ingest logs.
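Full-text search engines of this kind are built around an inverted index, which maps each term to the documents that contain it. The toy sketch below shows the idea on a handful of log lines. It does not use the Elasticsearch API; the `tokenize`, `build_index` and `search` helpers and the sample logs are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map every term to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical log lines keyed by document id.
logs = {
    1: "payment service timeout while calling database",
    2: "user login succeeded",
    3: "database connection timeout in order service",
}

index = build_index(logs)
print(sorted(search(index, "database timeout")))
```

The query matches log lines 1 and 3, the only two that contain both terms.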
Kibana’s role is to display the logs ingested into Elasticsearch. Kibana is part of the Elastic Stack (ELK) and best described as a “user interface that lets you visualise your Elasticsearch data and navigate the Elastic Stack. Do anything from tracking query load to understanding the way requests flow through your apps”.
Jaeger is an Open Source distributed tracing system that includes components to store, visualise and filter traces. It implements the OpenTracing specification. Distributed tracing captures requests to build a picture of the full chain of calls from user requests to interactions between microservices. Jaeger also tracks how long requests took, the lifecycle of network calls such as HTTP and RPC and helps locate bottlenecks that affect performance.
In a Kubernetes environment, Jaeger enables distributed tracing for gRPC services.
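To make the idea of locating bottlenecks in a trace more concrete, the sketch below models a trace as a list of spans with parent links and computes each span's "self time", meaning its duration minus the time spent in its direct children. This is an illustration of the concept only, not Jaeger's data model or client API. The `Span` class, the helper functions and the sample trace are hypothetical, and the sketch assumes that sibling spans do not overlap and that every `parent_id` refers to a span in the list.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    operation: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def self_times(spans: List[Span]) -> Dict[str, int]:
    """Time spent in each span, excluding its direct children."""
    result = {s.span_id: s.duration_ms for s in spans}
    for s in spans:
        if s.parent_id is not None:
            result[s.parent_id] -= s.duration_ms
    return result


def bottleneck(spans: List[Span]) -> Span:
    """Return the span with the largest self time."""
    times = self_times(spans)
    return max(spans, key=lambda s: times[s.span_id])


# Hypothetical trace of one user request crossing three services.
trace = [
    Span("a", None, "GET /checkout", 0, 120),
    Span("b", "a", "cart-service.get_cart", 5, 25),
    Span("c", "a", "payment-service.charge", 30, 115),
    Span("d", "c", "db.insert_payment", 40, 50),
]

slowest = bottleneck(trace)
print(slowest.operation, self_times(trace)[slowest.span_id])
```

Here the root request takes 120 ms in total, but most of that time is spent inside the payment call itself rather than in its database insert, which is the kind of insight a tracing UI surfaces visually.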
A breakdown of the Dev tools used in our DevOps architecture:
AWS-native ECR is a fully-managed Docker container registry that hosts Docker container images in a scalable, highly available architecture. Its role in a DevOps architecture is the reliable deployment of containers, with resource-level control of individual repositories achieved through integration with AWS Identity and Access Management (IAM).
Infrastructure-as-code (IaC) tool Terraform is used to manage the AWS infrastructure. Using Terraform, we can write the code for the infrastructure and maintain it in Git. We can also maintain infrastructure states and roll back to the previous state if required. Terraform figures out how to achieve the desired infrastructure end-state specified by the code. It also supports immutable infrastructure.
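The declarative idea, describing the end state and letting the tool work out the steps, can be sketched as a comparison between the recorded current state and the desired state. The example below is a conceptual illustration in Python, not Terraform's algorithm or configuration language. The `plan` function and the resource names are hypothetical.

```python
from typing import Dict, List, Tuple

Resources = Dict[str, dict]  # resource name -> attributes


def plan(current: Resources, desired: Resources) -> List[Tuple[str, str]]:
    """Compute the actions needed to move from current to desired state."""
    actions = []
    for name in sorted(desired.keys() - current.keys()):
        actions.append(("create", name))
    for name in sorted(desired.keys() & current.keys()):
        if desired[name] != current[name]:
            actions.append(("update", name))
    for name in sorted(current.keys() - desired.keys()):
        actions.append(("delete", name))
    return actions


# Hypothetical recorded state and desired state as described in code.
current = {
    "network.main": {"cidr": "10.0.0.0/16"},
    "cluster.app": {"version": "1.27"},
    "bucket.old_logs": {"versioning": False},
}
desired = {
    "network.main": {"cidr": "10.0.0.0/16"},
    "cluster.app": {"version": "1.28"},
    "registry.app_images": {"scan_on_push": True},
}

for action, name in plan(current, desired):
    print(action, name)
```

Rolling back then amounts to restoring the earlier code from Git and planning again: in this sketch, `plan(desired, current)` yields the actions that undo the change.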
Helm is the application package manager that runs on top of Kubernetes to describe and manage the application’s structure. It helps simplify microservices management through the provision of Helm charts and simple management commands.
Argo CD follows the GitOps practice of using Git repositories as the single source of truth for the application’s state. Argo CD ensures that application definitions, configurations and environments are declarative and version controlled, and that lifecycle management is automated, auditable and easy to understand.
Within a Kubernetes cluster, Argo CD is implemented as a controller that continuously monitors running applications, comparing the production state against the desired target state as defined by the Git repo. If the production state is out of sync with the Git repo’s single source of truth, Argo CD reports and visualises the discrepancies and facilitates a manual or automatic sync back to the target state.
Changes to the target state held in the Git repo can also be automatically pushed through to production.
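The comparison of live state against the target state in Git can be pictured as a field-by-field diff. The sketch below is a simplified illustration of that idea, not Argo CD's implementation: the `diff` and `sync_status` helpers and the sample manifests are hypothetical, and only the status names "Synced" and "OutOfSync" are borrowed from Argo CD's terminology.

```python
from typing import Any, List


def diff(desired: Any, live: Any, path: str = "") -> List[str]:
    """List the field paths where the live state differs from the desired state."""
    if isinstance(desired, dict) and isinstance(live, dict):
        changes = []
        for key in sorted(desired.keys() | live.keys()):
            child = f"{path}.{key}" if path else key
            if key not in live:
                changes.append(f"{child}: missing in live state")
            elif key not in desired:
                changes.append(f"{child}: not defined in Git")
            else:
                changes.extend(diff(desired[key], live[key], child))
        return changes
    if desired != live:
        return [f"{path}: expected {desired!r}, found {live!r}"]
    return []


def sync_status(desired: dict, live: dict) -> str:
    """Report whether the live state matches the desired state."""
    return "Synced" if not diff(desired, live) else "OutOfSync"


# Hypothetical target state from Git and state observed in the cluster.
desired = {"image": "shop-api:1.4.2", "replicas": 3, "env": {"LOG_LEVEL": "info"}}
live = {"image": "shop-api:1.4.2", "replicas": 2, "env": {"LOG_LEVEL": "debug"}}

print(sync_status(desired, live))
for change in diff(desired, live):
    print(" ", change)
```

A sync, whether triggered manually or automatically, would then apply the desired values for exactly the fields the diff reports.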
CodeCommit is a key element of a CI/CD pipeline in an AWS environment. A fully-managed source control service, CodeCommit securely hosts Git repos, eliminating the need for the DevOps team to operate its own source control system, which can be a bottleneck when it comes to scaling infrastructure. It securely stores anything from source code to binaries and is compatible with existing Git tools.
Code quality and security in our DevOps architecture rely on SonarQube, a tool with the motto of “Continuous Inspection must become mainstream as Continuous Integration”.
SonarQube is an Open Source tool that provides automated code review, detecting errors, bugs, vulnerabilities and untidily implemented segments in source code across 27 different programming languages through Static Application Security Testing (SAST).
The tool can also be used to check compliance with organisational coding guidelines, as well as just general quality issues. Its contribution to security is to flag potential issues such as insecure coding approaches, outdated cryptographic libraries, consistency of debug output etc.
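As a small illustration of what static analysis means, the sketch below inspects source code without running it and flags calls that a security rule set might consider risky. It uses Python's standard `ast` module and is in no way SonarQube's engine or rule set. The `RULES` table, the helper functions and the snippet under review are hypothetical.

```python
import ast
from typing import List, Tuple

# Hypothetical rule set: call names mapped to the reason they are flagged.
RULES = {
    "eval": "dynamic code execution",
    "exec": "dynamic code execution",
    "hashlib.md5": "weak hash function",
    "hashlib.sha1": "weak hash function",
}


def call_name(node: ast.Call) -> str:
    """Return 'name' or 'module.attr' for simple calls, else an empty string."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def scan(source: str) -> List[Tuple[int, str, str]]:
    """Return (line, call, reason) for every flagged call in the source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = call_name(node)
            if name in RULES:
                findings.append((node.lineno, name, RULES[name]))
    return sorted(findings)


# Hypothetical snippet under review; it is parsed, never executed.
SNIPPET = """\
import hashlib

def token(user_input):
    digest = hashlib.md5(user_input.encode()).hexdigest()
    return eval(digest)
"""

for finding in scan(SNIPPET):
    print(finding)
```

Run as a pipeline step, a check like this could fail the build whenever the list of findings is not empty.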
All of this is key to integrating security into DevOps, or DevSecOps, as the approach is sometimes termed. At K&C, our position is that all DevOps should be by default DevSecOps, negating the need for a unique term.
In a traditional Waterfall approach to software development, security is integrated into software as a final stage of development. The problem with retrospective security integration is that by that stage, there is a danger of fundamental software architectural issues not meeting rigorous security standards. At that point, it can often be too late to make the changes that would offer a robust level of security without redoing much of the development work.
If security is an integral part of the CI/CD DevOps pipeline, that weakness of post-facto integration is eliminated. SonarQube makes sure it is, continuously checking new code for general quality and for security issues.
In our DevOps architecture Rancher’s role is:
Rancher, which can deploy and manage K8s clusters on any infrastructure from datacentres to edge, is one of our favourite and integral DevOps tools. Operational and security challenges are managed by Rancher through cluster management and provisioning. Rancher also offers a number of integrated tools that help DevOps teams to run containerised workloads.
Rancher also integrates with a GitHub repository to automate CI pipeline execution. As part of the CI/CD pipeline, Rancher:
Identity and access management (IAM) is another core security pillar. Since we’re using an AWS environment, the native AWS IAM is the obvious solution here. AWS IAM controls who is able to sign in to the cloud environment and which permissions to use AWS resources each signed-in user has.
In a DevOps approach, IAM roles should make the team responsible for its own resource usage, with permissions set so that the team can react flexibly to incidents, reducing response times. A thorough audit of IAM roles by an external third party is also recommended.
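The core of permission evaluation can be sketched as follows: a request is denied by default, allowed if a matching Allow statement exists, and always denied if any matching Deny statement exists. This is a deliberately simplified model of how AWS IAM evaluates policies; it ignores conditions, permission boundaries and other policy types. The `Statement` class, the `is_allowed` helper, the policy contents and the account ID and resource names are hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List


@dataclass
class Statement:
    effect: str  # "Allow" or "Deny"
    actions: List[str]  # patterns such as "ecr:*"
    resources: List[str]  # patterns such as "*"


def matches(patterns: List[str], value: str) -> bool:
    """True if the value matches any wildcard pattern."""
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def is_allowed(statements: List[Statement], action: str, resource: str) -> bool:
    """Default deny; an explicit Deny overrides any Allow."""
    allowed = False
    for st in statements:
        if matches(st.actions, action) and matches(st.resources, resource):
            if st.effect == "Deny":
                return False
            allowed = True
    return allowed


# Hypothetical policy for an on-call DevOps role.
policy = [
    Statement("Allow", ["eks:Describe*", "eks:List*"], ["*"]),
    Statement("Allow", ["ecr:*"], ["arn:aws:ecr:*:*:repository/shop-*"]),
    Statement("Deny", ["ecr:DeleteRepository"], ["*"]),
]

repo = "arn:aws:ecr:eu-central-1:111122223333:repository/shop-api"
print(is_allowed(policy, "ecr:PutImage", repo))
print(is_allowed(policy, "ecr:DeleteRepository", repo))
print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::some-bucket/key"))
```

In this sketch the role can push images during an incident but can never delete a repository, and anything not explicitly allowed stays denied.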
If you could benefit from experienced and knowledgeable DevOps consultancy for your in-house projects, or need a flexible, scalable DevOps team to build or maintain your apps or Kubernetes clusters, K&C would love to help. Just drop us a line!