In this post in our Kubernetes consulting series we’re going to cover Kubernetes backup with Velero, previously known as Heptio Ark, within an AWS environment. Velero is also supported by the Microsoft Azure and GCP cloud platforms.
This step-by-step tutorial on how to install Velero as your backup facility is specific to an AWS environment that we’re running Kubernetes clusters on. But the installation process is almost exactly the same in other, non-AWS cloud environments. Simply switch out the AWS-specific commands for those of whatever cloud platform you are running your K8s on.
Before we get into the step-by-step tutorial, let’s first take a quick look at Kubernetes backups more generally, and at Velero as a disaster recovery management tool. Feel free to skip the next two short sections and move straight to the tutorial.
The headline is something of a rhetorical question, since it is fairly obvious why any application needs to be backed up: so that the application, and especially the data it holds, can be recovered if anything goes wrong.
We’ve probably all had a laptop or smartphone lost, stolen or broken before cloud storage became a relative standard. If the data hadn’t recently been backed up externally, it was gone. And even if it had, setting up a repaired or replacement machine with all the same programmes etc. was a monumental pain in the rear.
And that was just in the case of a hardware device holding one person’s data. When a software application, especially an enterprise-level application, fails, the problem is far bigger. Not only can large volumes of data be lost (even if most of it is presumably backed up in the cloud or on physical servers in multiple copies not exposed to any single point of failure), but it can also be difficult and time-consuming to fix the software itself.
That means if an application were to fail, especially a large, complex application, it would be down for some time. That can cost companies a huge amount of money in lost sales and reputational damage. But if an application has backup and restore tools built into it, it can simply be restored to a point before it failed, and the cause of the failure isolated and resolved. No major harm done.
Which is why any software should have backup and restore tools built into it. However, traditional backup and restore tools weren’t designed for the specifics of containerised applications that employ microservices and have been built to run on Kubernetes clusters.
Which means applications that employ architectures based on Kubernetes need to integrate a backup and restore utility that is specifically designed for a container-based architecture orchestrated by Kubernetes.
That’s exactly what Velero (Ark) is.
Velero is an open source disaster recovery, data migration and data protection tool designed to work in tandem with Kubernetes. It allows for relatively easy backup and restore through a series of checkpoints, and supports AWS, GCP and Azure. As of version 0.6.0, Velero also offers plugins for compatibility with other backup and volume storage platforms. Velero is also used to migrate Kubernetes resources from one cluster to another and to replicate a production environment for testing and troubleshooting.
Despite being around for a few years now, first as Heptio Ark and then the renamed Velero, there are still a lot of questions around how to set the tool up for backups.
As we use Velero with Kubernetes on AWS, we have plenty of experience with its set-up and hope that our step-by-step how-to will make it a bit easier for you, saving some time as you get started.
So, without further ado…
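Velero requires an S3 bucket to store its backup archives: the state of pods and other cluster resources, which Kubernetes keeps in etcd and Velero reads through the Kubernetes API.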
aws s3api create-bucket \
    --bucket <YOUR_BUCKET> \
    --region <YOUR_REGION> \
    --create-bucket-configuration LocationConstraint=<YOUR_REGION>
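One detail worth knowing when picking the region: S3 treats us-east-1 as its default location, and create-bucket rejects an explicit LocationConstraint for it. The sketch below only assembles and prints the command for a given region so the difference is visible; it does not call AWS. The helper name build_create_bucket_cmd and the bucket name my-velero-backups are made up for illustration.

```shell
#!/bin/sh
# Print the create-bucket command for a given bucket and region.
# build_create_bucket_cmd is a hypothetical helper, not part of the AWS CLI.
build_create_bucket_cmd() {
  bucket="$1"
  region="$2"
  cmd="aws s3api create-bucket --bucket ${bucket} --region ${region}"
  # us-east-1 is the S3 default location and must not carry a LocationConstraint.
  if [ "${region}" != "us-east-1" ]; then
    cmd="${cmd} --create-bucket-configuration LocationConstraint=${region}"
  fi
  printf '%s\n' "${cmd}"
}

# Example bucket name is a placeholder.
build_create_bucket_cmd my-velero-backups eu-central-1
build_create_bucket_cmd my-velero-backups us-east-1
```

Review the printed command before running it against your own account.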
Creating the IAM user:
aws iam create-user --user-name heptio-ark
Attaching a policy that gives the user the permissions it needs on EC2 snapshots and on the bucket:
BUCKET=<YOUR_BUCKET>
cat > heptio-ark-policy.json <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVolumes",
                "ec2:DescribeSnapshots",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:CreateSnapshot",
                "ec2:DeleteSnapshot"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:AbortMultipartUpload",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                "arn:aws:s3:::${BUCKET}/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::${BUCKET}"
            ]
        }
    ]
}
EOF

aws iam put-user-policy \
    --user-name heptio-ark \
    --policy-name heptio-ark \
    --policy-document file://heptio-ark-policy.json
Creating an access key for the user:
aws iam create-access-key --user-name heptio-ark
And we should get this:
{
    "AccessKey": {
        "UserName": "heptio-ark",
        "Status": "Active",
        "CreateDate": "2017-07-31T22:24:41.576Z",
        "SecretAccessKey": <AWS_SECRET_ACCESS_KEY>,
        "AccessKeyId": <AWS_ACCESS_KEY_ID>
    }
}
Saving the returned keys in a credentials file named credentials-ark in the local directory:
[default]
aws_access_key_id=<AWS_ACCESS_KEY_ID>
aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
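If you prefer not to paste the keys by hand, the credentials file can be written from two shell variables. This is a minimal sketch: the variable names ACCESS_KEY_ID and SECRET_ACCESS_KEY and their default values are placeholders of ours, and the file name credentials-ark is the one the secret is later created from.

```shell
#!/bin/sh
# Placeholder values; substitute the AccessKeyId and SecretAccessKey
# returned by "aws iam create-access-key".
ACCESS_KEY_ID="${ACCESS_KEY_ID:-AKIAEXAMPLEKEYID}"
SECRET_ACCESS_KEY="${SECRET_ACCESS_KEY:-exampleSecretAccessKey}"

# Files created from here on are readable by the current user only.
umask 077

cat > credentials-ark <<EOF
[default]
aws_access_key_id=${ACCESS_KEY_ID}
aws_secret_access_key=${SECRET_ACCESS_KEY}
EOF

# Sanity check: count the profile header and the two key lines.
grep -c -E '^(\[default\]|aws_access_key_id=|aws_secret_access_key=)' credentials-ark
```

The final grep should report three matching lines; anything else means the file is incomplete.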
Creating file prereqs.yaml:
# Copyright 2017 the Heptio Ark contributors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: backups.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: backups
    kind: Backup

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: schedules.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: schedules
    kind: Schedule

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: restores.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: restores
    kind: Restore

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: configs.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: configs
    kind: Config

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: downloadrequests.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: downloadrequests
    kind: DownloadRequest

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: deletebackuprequests.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: deletebackuprequests
    kind: DeleteBackupRequest

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: podvolumebackups.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: podvolumebackups
    kind: PodVolumeBackup

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: podvolumerestores.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: podvolumerestores
    kind: PodVolumeRestore

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: resticrepositories.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: resticrepositories
    kind: ResticRepository

---
apiVersion: v1
kind: Namespace
metadata:
  name: heptio-ark

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ark
  namespace: heptio-ark
  labels:
    component: ark

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: ark
  labels:
    component: ark
subjects:
  - kind: ServiceAccount
    namespace: heptio-ark
    name: ark
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
Deploying to Kubernetes:
kubectl apply -f prereqs.yaml
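The nine CustomResourceDefinition stanzas in prereqs.yaml differ only in their plural and kind names, which makes them easy to mistype when copying. As a hedged sketch, a small loop can generate them from a list of plural:Kind pairs taken from the file above. The output file name crds-generated.yaml is our own choice, and the namespace, service account and role binding still have to be added separately.

```shell
#!/bin/sh
# Generate the nine CRD stanzas from "plural:Kind" pairs copied from prereqs.yaml.
# crds-generated.yaml is a hypothetical output name.
out=crds-generated.yaml
: > "$out"

for pair in \
  backups:Backup schedules:Schedule restores:Restore configs:Config \
  downloadrequests:DownloadRequest deletebackuprequests:DeleteBackupRequest \
  podvolumebackups:PodVolumeBackup podvolumerestores:PodVolumeRestore \
  resticrepositories:ResticRepository
do
  plural="${pair%%:*}"   # text before the colon
  kind="${pair##*:}"     # text after the colon
  cat >> "$out" <<EOF
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ${plural}.ark.heptio.com
  labels:
    component: ark
spec:
  group: ark.heptio.com
  version: v1
  scope: Namespaced
  names:
    plural: ${plural}
    kind: ${kind}
EOF
done

# Count the generated definitions.
grep -c '^kind: CustomResourceDefinition' "$out"
```

Comparing the generated file against prereqs.yaml with diff is a quick way to spot a stanza that was damaged in copying.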
Creating the secret from the credentials file, in the namespace Ark runs in (heptio-ark in this setup):
kubectl create secret generic cloud-credentials \
    --namespace <ARK_NAMESPACE> \
    --from-file cloud=credentials-ark
Creating files config.yaml and deployment.yaml:
config.yaml:
# Copyright 2017 the Heptio Ark contributors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

---
apiVersion: ark.heptio.com/v1
kind: Config
metadata:
  namespace: heptio-ark
  name: default
persistentVolumeProvider:
  name: aws
  config:
    region: <YOUR_REGION>
backupStorageProvider:
  name: aws
  bucket: <YOUR_BUCKET>
  # Uncomment the below line to enable restic integration.
  # The format for resticLocation is <bucket>[/<prefix>],
  # e.g. "my-restic-bucket" or "my-restic-bucket/repos".
  # This MUST be a different bucket than the main Ark bucket
  # specified just above.
  # resticLocation: <YOUR_RESTIC_LOCATION>
  config:
    region: <YOUR_REGION>
backupSyncPeriod: 30m
gcSyncPeriod: 30m
scheduleSyncPeriod: 1m
restoreOnlyMode: false
deployment.yaml:
# Copyright 2017 the Heptio Ark contributors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  namespace: heptio-ark
  name: ark
spec:
  replicas: 1
  template:
    metadata:
      labels:
        component: ark
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8085"
        prometheus.io/path: "/metrics"
    spec:
      restartPolicy: Always
      serviceAccountName: ark
      containers:
        - name: ark
          image: gcr.io/heptio-images/ark:latest
          command:
            - /ark
          args:
            - server
          volumeMounts:
            - name: cloud-credentials
              mountPath: /credentials
            - name: plugins
              mountPath: /plugins
            - name: scratch
              mountPath: /scratch
          env:
            - name: AWS_SHARED_CREDENTIALS_FILE
              value: /credentials/cloud
            - name: ARK_SCRATCH_DIR
              value: /scratch
      volumes:
        - name: cloud-credentials
          secret:
            secretName: cloud-credentials
        - name: plugins
          emptyDir: {}
        - name: scratch
          emptyDir: {}
Deploying to Kubernetes:
kubectl apply -f config.yaml
kubectl apply -f deployment.yaml
Configuring Your K8s Backups
Creating the daily backup schedule file daily.yaml:
apiVersion: ark.heptio.com/v1
kind: Schedule
metadata:
  name: daily
  namespace: heptio-ark
spec:
  schedule: 5 0 * * *
  template:
    excludedNamespaces: null
    excludedResources: null
    hooks:
      resources: null
    includeClusterResources: null
    includedNamespaces:
    - '*'
    includedResources: null
    labelSelector: null
    snapshotVolumes: true
    ttl: 168h0m0s
Deploying:
kubectl apply -f daily.yaml
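The two values in daily.yaml that people most often want to change are schedule and ttl. The schedule is a standard five-field cron expression (minute, hour, day of month, month, day of week), so 5 0 * * * means five minutes past midnight every day, and ttl is how long each backup is kept before it expires, here 168 hours. The sketch below just takes both values apart in plain shell to make that explicit; the variable names are ours.

```shell
#!/bin/sh
schedule="5 0 * * *"   # spec.schedule from daily.yaml
ttl="168h0m0s"         # spec.template.ttl from daily.yaml

# Split the cron expression into its five fields.
# Globbing is switched off so the asterisks are not expanded to file names.
set -f
set -- $schedule
set +f
minute="$1"; hour="$2"; day_of_month="$3"; month="$4"; day_of_week="$5"

# Keep only the hours part of the ttl and convert it to whole days.
ttl_hours="${ttl%%h*}"
ttl_days=$(( ttl_hours / 24 ))

printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
  "$minute" "$hour" "$day_of_month" "$month" "$day_of_week"
printf 'backups are kept for %s days\n' "$ttl_days"
```

With the values above, the retention works out to 7 days, so roughly a week of daily backups is available at any time.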
Checking the schedule and the backups it has created:
kubectl -n heptio-ark get schedules.ark.heptio.com
NAME      AGE
daily     13h

kubectl -n heptio-ark get backups.ark.heptio.com
NAME                   AGE
daily-20180814220346   13h
daily-20180815000534   11h
Checking S3 bucket in AWS:
aws s3 ls <BUCKET_NAME>
    PRE daily-20180814220346/
    PRE daily-20180815000534/
Checking snapshots:
aws ec2 describe-snapshots
{
    "Description": "",
    "Tags": [
        {
            "Value": "owned",
            "Key": "kubernetes.io/cluster/aws.cluster.com"
        },
        {
            "Value": "internal-services",
            "Key": "kubernetes.io/created-for/pvc/namespace"
        },
        {
            "Value": "aws.cluster.com",
            "Key": "KubernetesCluster"
        },
        {
            "Value": "daily-20180814220346",
            "Key": "ark.heptio.com/backup"
        },
        {
            "Value": "pvc-37895bcc-67ec-11e8-b075-0689fdf74ef8",
            "Key": "ark.heptio.com/pv"
        },
        {
            "Value": "pvc-37895bcc-67ec-11e8-b075-0689fdf74ef8",
            "Key": "kubernetes.io/created-for/pv/name"
        },
        {
            "Value": "prometheus-claim0",
            "Key": "kubernetes.io/created-for/pvc/name"
        },
        {
            "Value": "aws.cluster.com-dynamic-pvc-37895bcc-67ec-11e8-b075-0689fdf74ef8",
            "Key": "Name"
        }
    ],
    "Encrypted": false,
    "VolumeId": "vol-0733b80fe4057339a",
    "State": "completed",
    "VolumeSize": 489,
    "StartTime": "2018-08-14T22:03:49.000Z",
    "Progress": "100%",
    "OwnerId": "412224592913",
    "SnapshotId": "snap-04173b3945360a237"
},
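A backup is only useful if it can be restored, so here is a hedged sketch of the other half. It writes a Restore manifest that points at one of the backups listed above, in the same style as daily.yaml. The file name restore.yaml and the restore name are our own choices, and the backupName and restorePVs fields are written from our understanding of the ark.heptio.com/v1 Restore resource, so check them against the version you run before applying. The script only writes the file; it does not touch the cluster.

```shell
#!/bin/sh
# Backup to restore from; taken from the backups listing.
backup_name="daily-20180814220346"

# restore.yaml and the "-restore" suffix are hypothetical choices for this sketch.
# Field names are an assumption about the ark.heptio.com/v1 Restore spec.
cat > restore.yaml <<EOF
apiVersion: ark.heptio.com/v1
kind: Restore
metadata:
  name: ${backup_name}-restore
  namespace: heptio-ark
spec:
  backupName: ${backup_name}
  restorePVs: true
EOF

echo "Review restore.yaml, then run: kubectl apply -f restore.yaml"
```

Trying this first against a test cluster is the safest way to confirm that the backups restore the way you expect.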
Download the Velero client for convenience: https://github.com/vmware-tanzu/velero/releases
Great, now we know our way around Velero backups, from the creation of the S3 bucket and the server in Kubernetes to the configuration process. Read more on secret management and Consul with HashiCorp, and on Kubernetes security, in our K8s consulting series blog posts.
We hope you’ve found the above Velero installation tutorial useful and that your Kubernetes clusters are now safely backed up and bullet-proof!