Kops, self-described as 'kubectl for clusters', is an official Kubernetes tool that allows you to build and manage your Kubernetes infrastructure in the cloud. In particular, it is a strong tool for deploying clusters to Amazon Web Services (AWS). We favour aligning with the industry where we can, and contributing to a tool that is part of the Kubernetes ecosystem. We have been running kops happily, and this article is intended to help others in a similar position leverage an additional Availability Zone (AZ) once they have a running cluster.

In June 2018, AWS announced the launch of a third Availability Zone in the London region. Below is the process we took to leverage it, with links to PRs we've created to help make the process easier.

To utilise this guide, you'll need to have:

  • A working Kubernetes Cluster built with kops 

  • VPC, Route Tables & Subnets managed by kops

  • A subnet range available within your VPC (you would not be able to do this if the VPC CIDR is fully allocated)

  • 5 Master instances distributed across 2 Availability Zones

  • ETCDv3 with TLS

  • This version of kops (or a more recent release)

Step-by-step guide

Let's run a validate on the cluster to make sure everything is healthy before we start:

kops validate cluster

Using cluster from kubectl context: test.k8s.appvia.io
Validating cluster test.k8s.appvia.io

INSTANCE GROUPS
NAME ROLE MACHINETYPE MIN MAX SUBNETS
eu-west-2a-master1 Master t2.medium 1 1 eu-west-2a
eu-west-2a-master2 Master t2.medium 1 1 eu-west-2a
eu-west-2a-master3 Master t2.medium 1 1 eu-west-2a
eu-west-2b-master1 Master t2.medium 1 1 eu-west-2b
eu-west-2b-master2 Master t2.medium 1 1 eu-west-2b
nodes Node t2.medium 2 2 eu-west-2a,eu-west-2b

NODE STATUS
NAME ROLE READY
ip-10-100-0-13.eu-west-2.compute.internal master True
ip-10-100-0-170.eu-west-2.compute.internal node True
ip-10-100-0-21.eu-west-2.compute.internal master True
ip-10-100-0-56.eu-west-2.compute.internal master True
ip-10-100-1-170.eu-west-2.compute.internal master True
ip-10-100-1-23.eu-west-2.compute.internal node True
ip-10-100-1-238.eu-west-2.compute.internal master True

Your cluster test.k8s.appvia.io is ready

Now we need to edit the kops ClusterSpec to define a new subnet located in the third Availability Zone: kops edit cluster test.k8s.appvia.io

Jump down to the ‘subnets’ section, adding the new subnet into the list.
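As an illustrative sketch, the updated list might look like the following. The CIDRs shown are examples only; the new range must be unallocated within your own VPC:

```yaml
# Example layout only: CIDRs, names and zones must match your VPC.
subnets:
- cidr: 10.100.0.0/24
  name: eu-west-2a
  type: Public
  zone: eu-west-2a
- cidr: 10.100.1.0/24
  name: eu-west-2b
  type: Public
  zone: eu-west-2b
# New subnet for the third Availability Zone:
- cidr: 10.100.2.0/24
  name: eu-west-2c
  type: Public
  zone: eu-west-2c
```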

At the same time, you can define new Master instances to be provisioned in the AZ-C subnet that will be created. To maintain quorum for etcd, you need to provision and persist an odd number of Master instances (3, 5, 7, etc.) at all times. The kops utility will prevent you from creating an even number of Master instances.

For this example, I created an additional 2 instances to reside within AZ-C, bringing the total to 7. I've chosen a CoreOS AMI to provision all the nodes, but feel free to change this to whatever is most suitable for your environment.

cat euw2c-master1.yaml
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: test.k8s.appvia.io
  name: eu-west-2c-master1
spec:
  image: coreos.com/CoreOS-stable-1576.5.0-hvm
  machineType: t2.medium
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - eu-west-2c

 kops create -f euw2c-master1.yaml
Created instancegroup/eu-west-2c-master1
To deploy these resources, run: kops update cluster test.k8s.appvia.io --yes

cat euw2c-master2.yaml
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: test.k8s.appvia.io
  name: eu-west-2c-master2
spec:
  image: coreos.com/CoreOS-stable-1576.5.0-hvm
  machineType: t2.medium
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - eu-west-2c

 kops create -f euw2c-master2.yaml
Created instancegroup/eu-west-2c-master2
To deploy these resources, run: kops update cluster test.k8s.appvia.io --yes

Before deploying these new Instance Groups (IGs), one more update needs to take place within the ClusterSpec to define new associated etcd members: kops edit cluster test.k8s.appvia.io

Jump down to the etcdClusters section, adding etcd-main and etcd-events members for the two new IGs (kops will create the backing EBS volumes for them).
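As a sketch, the etcdClusters section would gain entries along these lines. The member names shown are assumptions chosen to match the etcd-c1/etcd-c2 hostnames used later, and must reference the instance groups created above:

```yaml
# Sketch: new members for the eu-west-2c instance groups, mirroring the
# existing members (elided here for brevity).
etcdClusters:
- etcdMembers:
  # ... existing members ...
  - instanceGroup: eu-west-2c-master1
    name: c1
  - instanceGroup: eu-west-2c-master2
    name: c2
  name: main
- etcdMembers:
  # ... existing members ...
  - instanceGroup: eu-west-2c-master1
    name: c1
  - instanceGroup: eu-west-2c-master2
    name: c2
  name: events
```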

Finally, you can run a kops update to interact with the AWS API, creating the new AWS resources that have been defined in the Cluster Spec:

 kops update cluster --yes

Using cluster from kubectl context: test.k8s.appvia.io

I0123 15:27:15.413444   11637 executor.go:91] Tasks: 0 done / 118 total; 46 can run
I0123 15:27:15.863181   11637 executor.go:91] Tasks: 46 done / 118 total; 29 can run
I0123 15:27:16.374682   11637 executor.go:91] Tasks: 75 done / 118 total; 27 can run
I0123 15:27:17.905691   11637 executor.go:91] Tasks: 102 done / 118 total; 9 can run
I0123 15:27:18.035842   11637 dnsname.go:111] AliasTarget for "api.test.k8s.appvia.io." is "api-test-k8s-fvii3v-1272530458.eu-west-2.elb.amazonaws.com."
I0123 15:27:18.153972   11637 executor.go:91] Tasks: 111 done / 118 total; 7 can run
I0123 15:27:18.856931   11637 executor.go:91] Tasks: 118 done / 118 total; 0 can run
I0123 15:27:18.856995   11637 dns.go:153] Pre-creating DNS records
I0123 15:27:19.232648   11637 update_cluster.go:253] Exporting kubecfg for cluster

kops has set your kubectl context to test.k8s.appvia.io
Cluster changes have been applied to the cloud.
Changes may require instances to restart: kops rolling-update cluster

At this point, you will have:

  • Created 1 new Subnet within your VPC

  • Created 2 new Master Instance Groups, residing within the subnet named ‘eu-west-2c’

  • Created 4 new EBS volumes ('etcd-main' and 'etcd-events' for each of the two new Instance Groups)

Unfortunately the new Master instances will not automatically join the etcd cluster without some manual intervention. The etcd member list will need updating to include hosts for the two new instances, and then these new instances will require some reconfiguration to join the existing cluster.

Firstly, SSH into an existing Master instance and run the following commands.

If etcdctl is not already present on the host OS, you can download it (grab the same version as the cluster is running).

Validate that you can communicate with both endpoints.
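The ${ETCD_MAIN} and ${ETCD_EVENTS} variables used below can be set up as wrappers around etcdctl. As a hedged sketch (the certificate paths are placeholders, and the client ports of 4001 for etcd-main and 4002 for etcd-events are assumptions for a typical kops deployment):

```shell
# Hypothetical helper variables; adjust certificate paths and ports
# to match your own hosts.
export ETCDCTL_API=3

TLS_FLAGS="--cacert=/path/to/etcd-ca.crt --cert=/path/to/etcd-client.crt --key=/path/to/etcd-client.key"

ETCD_MAIN="etcdctl ${TLS_FLAGS} --endpoints=https://127.0.0.1:4001"
ETCD_EVENTS="etcdctl ${TLS_FLAGS} --endpoints=https://127.0.0.1:4002"

# Validate that you can reach both endpoints before making changes:
#   ${ETCD_MAIN} endpoint health
#   ${ETCD_EVENTS} endpoint health
```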

Add the two new members for both ‘etcd-main’ & ‘etcd-events’:

➜ ${ETCD_MAIN} member add etcd-c1 --peer-urls="https://etcd-c1.internal.test.k8s.appvia.io:2380"
Member 356cb0ee3ba9d1f6 added to cluster 5024708754869ab3

➜ ${ETCD_MAIN} member add etcd-c2 --peer-urls="https://etcd-c2.internal.test.k8s.appvia.io:2380"
Member 4e65c79f2332a2ae added to cluster 5024708754869ab3

➜ ${ETCD_EVENTS} member add etcd-events-c1 --peer-urls="https://etcd-events-c1.internal.test.k8s.appvia.io:2381"
Member fa29d33b3e164931 added to cluster df1655d3f512ef29

➜ ${ETCD_EVENTS} member add etcd-events-c2 --peer-urls="https://etcd-events-c2.internal.test.k8s.appvia.io:2381"
Member de67c45c59dc7476 added to cluster df1655d3f512ef29

Now that the new members have been added, they should initially be listed as 'unstarted'.

At this point, the etcd Clusters have been configured to register the new clients once they become available on their respective addresses. The newly provisioned Master instances now require amendments to their etcd configuration to be able to join the existing Clusters.

Next, log in via SSH to the new Master instances and perform the following commands.
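As a hedged outline of what this reconfiguration involves (the paths and settings below are assumptions based on a typical pre-etcd-manager kops setup, and vary by kops version):

```shell
# Outline only; verify against your own kops version before running anything.
#
# 1. Stop protokube so it does not rewrite the etcd manifests while you edit:
#      systemctl stop protokube
#
# 2. In the etcd pod manifests (e.g. under /etc/kubernetes/manifests/),
#    configure each new member to join the existing cluster rather than
#    bootstrap a new one:
#      ETCD_INITIAL_CLUSTER_STATE=existing
#      ETCD_INITIAL_CLUSTER=<all members, old and new>
#
# 3. Let the kubelet restart the etcd containers with the new configuration,
#    then watch the members come up via a 'member list' command.
```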

Validate that the members have started successfully and joined the cluster by running a ‘member list’ command on any of the master nodes that you are connected to. 

Once the etcd clusters have recovered as above, you can start the protokube service back up on the Master instances within the new Availability Zone: systemctl start protokube

Your cluster should eventually validate successfully via kops after a few minutes once the new Master instances have caught up:

kops validate cluster
Using cluster from kubectl context: test.k8s.appvia.io

Validating cluster test.k8s.appvia.io

INSTANCE GROUPS
NAME   ROLE MACHINETYPE MIN MAX SUBNETS
eu-west-2a-master1 Master t2.medium 1 1 eu-west-2a
eu-west-2a-master2 Master t2.medium 1 1 eu-west-2a
eu-west-2a-master3 Master t2.medium 1 1 eu-west-2a
eu-west-2b-master1 Master t2.medium 1 1 eu-west-2b
eu-west-2b-master2 Master t2.medium 1 1 eu-west-2b
eu-west-2c-master1 Master t2.medium 1 1 eu-west-2c
eu-west-2c-master2 Master t2.medium 1 1 eu-west-2c
nodes   Node t2.medium 2 2 eu-west-2a,eu-west-2b

NODE STATUS
NAME      ROLE READY
ip-10-100-0-13.eu-west-2.compute.internal master True
ip-10-100-0-170.eu-west-2.compute.internal node True
ip-10-100-0-21.eu-west-2.compute.internal master True
ip-10-100-0-56.eu-west-2.compute.internal master True
ip-10-100-1-170.eu-west-2.compute.internal master True
ip-10-100-1-23.eu-west-2.compute.internal node True
ip-10-100-1-238.eu-west-2.compute.internal master True
ip-10-100-2-118.eu-west-2.compute.internal master True
ip-10-100-2-140.eu-west-2.compute.internal master True

Your cluster test.k8s.appvia.io is ready

Finally, you can now configure your node pools to make use of the new subnet. Run kops edit ig <nodes-ig> and add the new Availability Zone
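As an illustrative sketch, the nodes instance group would end up with all three zones in its subnet list (your group name and sizes may differ):

```yaml
# Sketch: add eu-west-2c to the nodes instance group's subnets.
spec:
  subnets:
  - eu-west-2a
  - eu-west-2b
  - eu-west-2c
```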

One more kops update cluster test.k8s.appvia.io --yes and you’re good to go!

ABOUT AUTHOR

Kashif Saadat


Kash is a Co-Founder and SRE Lead at Appvia.