
January 29, 2018 | By: Kashif Saadat

Extending your Kubernetes Cluster into a new AWS Availability Zone with kops

    If you are not familiar with kops, it is a tool that allows you to build and manage your Kubernetes infrastructure in the cloud. It is something we aligned to, having used Kubernetes since 0.6. We have been running Kubernetes in production for 2.5 years and, like most people, had created our own tool to manage the build-out of Kubernetes clusters.

    We have since favoured aligning with the industry where we can and contributing to a tool that is part of the Kubernetes ecosystem. Since then we have been running kops happily, and this article aims to help others in a similar position leverage an additional AZ once they have a running cluster.

    AWS recently announced the availability of a 3rd Availability Zone in the London region. Below is the process we took to leverage this, with links to PRs we have made to help make the process easier.

    This guide assumes you have a running kops-managed cluster, with kops and kubectl installed and configured to talk to it.
    First, let's run a validate on the cluster to make sure everything is healthy before we start:

    kops validate cluster
    
    Using cluster from kubectl context: test.k8s.appvia.io
    Validating cluster test.k8s.appvia.io
    
    INSTANCE GROUPS
    NAME ROLE MACHINETYPE MIN MAX SUBNETS
    eu-west-2a-master1 Master t2.medium 1 1 eu-west-2a
    eu-west-2a-master2 Master t2.medium 1 1 eu-west-2a
    eu-west-2a-master3 Master t2.medium 1 1 eu-west-2a
    eu-west-2b-master1 Master t2.medium 1 1 eu-west-2b
    eu-west-2b-master2 Master t2.medium 1 1 eu-west-2b
    nodes Node t2.medium 2 2 eu-west-2a,eu-west-2b
    
    NODE STATUS
    NAME ROLE READY
    ip-10-100-0-13.eu-west-2.compute.internal master True
    ip-10-100-0-170.eu-west-2.compute.internal node True
    ip-10-100-0-21.eu-west-2.compute.internal master True
    ip-10-100-0-56.eu-west-2.compute.internal master True
    ip-10-100-1-170.eu-west-2.compute.internal master True
    ip-10-100-1-23.eu-west-2.compute.internal node True
    ip-10-100-1-238.eu-west-2.compute.internal master True
    
    Your cluster test.k8s.appvia.io is ready

    Now we need to edit the kops ClusterSpec, defining a new subnet that will be located within the 3rd Availability Zone: kops edit cluster test.k8s.appvia.io

    Jump down to the ‘subnets’ section, adding the new subnet into the list as below:

    https://gist.github.com/KashifSaadat/6fbced9cba585290bd5e3341108e8b15#file-subnetspec-diff
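    The gist isn't inlined here, but the shape of the change is a new entry in the subnets list. A hedged sketch, with the CIDR inferred from the 10.100.x.x node IPs in the validate output above (check it against your own VPC's free ranges, and match type to your existing subnets):

```yaml
subnets:
# ... existing eu-west-2a and eu-west-2b subnet entries ...
- cidr: 10.100.2.0/24   # assumed: next free /24 in the 10.100.0.0/16 range
  name: eu-west-2c
  type: Public          # match the type of your existing subnets
  zone: eu-west-2c
```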

    At the same time you can define new Master instances to be provisioned in the AZ-C subnet that will be created. To maintain quorum for etcd, you must provision and persist an odd number of Master instances (3, 5, 7, etc.) at all times; the kops utility will prevent you from creating an even number of Master instances.

    Note: There is a current bug in kops around the way masters and volumes work, which causes problems if you want to replace the master nodes with ones in the new AZ as opposed to extending. This bug has been raised here. For simplicity, we will focus on extending rather than replacing.

    For this example, I will create an additional 2 instances to reside within AZ-C, bringing the total to 7. I have chosen a CoreOS AMI to provision all the nodes; feel free to change this to whatever is more suitable for your environment.

     cat euw2c-master1.yaml
    apiVersion: kops/v1alpha2
    kind: InstanceGroup
    metadata:
      labels:
        kops.k8s.io/cluster: test.k8s.appvia.io
      name: eu-west-2c-master1
    spec:
      image: coreos.com/CoreOS-stable-1576.5.0-hvm
      machineType: t2.medium
      maxSize: 1
      minSize: 1
      role: Master
      subnets:
      - eu-west-2c
    
     kops create -f euw2c-master1.yaml
    Created instancegroup/eu-west-2c-master1
    To deploy these resources, run: kops update cluster test.k8s.appvia.io --yes
    
     cat euw2c-master2.yaml
    apiVersion: kops/v1alpha2
    kind: InstanceGroup
    metadata:
      labels:
        kops.k8s.io/cluster: test.k8s.appvia.io
      name: eu-west-2c-master2
    spec:
      image: coreos.com/CoreOS-stable-1576.5.0-hvm
      machineType: t2.medium
      maxSize: 1
      minSize: 1
      role: Master
      subnets:
      - eu-west-2c
    
     kops create -f euw2c-master2.yaml
    Created instancegroup/eu-west-2c-master2
    To deploy these resources, run: kops update cluster test.k8s.appvia.io --yes

    Before deploying these new Instance Groups (IGs), one more update must take place within the ClusterSpec to define new associated etcd members: kops edit cluster test.k8s.appvia.io

    Jump down to the etcdClusters section, adding etcd-main and etcd-events EBS volumes for the two new IGs:

    https://gist.github.com/KashifSaadat/c44b8d7e5a27e8fdf1ee8323c5ccdfa6#file-etcdclusterspec-diff
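    Again the gist isn't inlined; a hedged sketch of the change follows. The member names (c1, c2) are assumptions inferred from the DNS entries used later in this article (etcd-c1, etcd-events-c1, etc.), since kops derives the member DNS names from them:

```yaml
etcdClusters:
- etcdMembers:
  # ... existing members for the eu-west-2a and eu-west-2b masters ...
  - instanceGroup: eu-west-2c-master1
    name: c1
  - instanceGroup: eu-west-2c-master2
    name: c2
  name: main
- etcdMembers:
  # ... existing members ...
  - instanceGroup: eu-west-2c-master1
    name: c1
  - instanceGroup: eu-west-2c-master2
    name: c2
  name: events
```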

    Now you can run a kops update to interact with the AWS API, creating the new AWS resources that have been defined in the Cluster Spec:

     kops update cluster --yes
    
    Using cluster from kubectl context: test.k8s.appvia.io
    
    I0123 15:27:15.413444   11637 executor.go:91] Tasks: 0 done / 118 total; 46 can run
    I0123 15:27:15.863181   11637 executor.go:91] Tasks: 46 done / 118 total; 29 can run
    I0123 15:27:16.374682   11637 executor.go:91] Tasks: 75 done / 118 total; 27 can run
    I0123 15:27:17.905691   11637 executor.go:91] Tasks: 102 done / 118 total; 9 can run
    I0123 15:27:18.035842   11637 dnsname.go:111] AliasTarget for "api.test.k8s.appvia.io." is "api-test-k8s-fvii3v-1272530458.eu-west-2.elb.amazonaws.com."
    I0123 15:27:18.153972   11637 executor.go:91] Tasks: 111 done / 118 total; 7 can run
    I0123 15:27:18.856931   11637 executor.go:91] Tasks: 118 done / 118 total; 0 can run
    I0123 15:27:18.856995   11637 dns.go:153] Pre-creating DNS records
    I0123 15:27:19.232648   11637 update_cluster.go:253] Exporting kubecfg for cluster
    
    kops has set your kubectl context to test.k8s.appvia.io
    Cluster changes have been applied to the cloud.
    Changes may require instances to restart: kops rolling-update cluster

    At this point, you will have:

    • Created 1 new Subnet within your VPC

    • Created 2 new Master Instance Groups, residing within the subnet named ‘eu-west-2c’

    • Created 4 new EBS volumes (2x ‘etcd-main’ and 2x ‘etcd-events’) for the new Instance Groups

    Unfortunately the 2 new Master instances will not automatically join the etcd cluster without some manual intervention. The etcd member list will need updating to include hosts for the two new instances, and then these new instances will require some reconfiguration to join the existing cluster.

    Firstly, SSH into an existing Master instance. Once in the instance, run the following commands:

    https://gist.github.com/KashifSaadat/9ea936e1f7ab769f0c8049d78db54119#file-etcd_cmd-sh
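    The gist isn't inlined here, but the essence is setting up two etcdctl invocations, one per cluster. A rough sketch; the ports (kops conventionally serves the main cluster on 4001 and the events cluster on 4002), scheme, and any TLS flags are assumptions you should check against your own setup:

```shell
# Hypothetical helper variables pointing etcdctl at the two clusters kops
# runs on each master. Adjust endpoints, and add cert/key/ca flags if your
# etcd clusters are TLS-enabled. Flags also differ between etcdctl v2 and v3.
ETCD_MAIN='etcdctl --endpoints=https://127.0.0.1:4001'
ETCD_EVENTS='etcdctl --endpoints=https://127.0.0.1:4002'

# Example usage on an existing master:
#   ${ETCD_MAIN} member list
#   ${ETCD_EVENTS} member list
echo "${ETCD_MAIN}"
echo "${ETCD_EVENTS}"
```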

    If etcdctl is not already present on the host OS, you can download it from here (grab the same version as the one you're running): https://github.com/coreos/etcd/releases
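    For example (the version below is a placeholder; check your running server and substitute its version):

```shell
# Hypothetical version number; check the etcd server version on your masters
# and substitute it here before downloading.
ETCD_VER=v3.2.18
DOWNLOAD_URL="https://github.com/coreos/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz"
echo "${DOWNLOAD_URL}"
# Then fetch and extract (network access required):
#   curl -L "${DOWNLOAD_URL}" -o /tmp/etcd.tar.gz
#   tar xzf /tmp/etcd.tar.gz -C /tmp --strip-components=1
```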

    Validate that you can communicate with both endpoints:

    https://gist.github.com/KashifSaadat/18c5d255d75c227ab5990bf3a2ab490f#file-etcd-status-txt

    Add the two new members for both ‘etcd-main’ and ‘etcd-events’:

    ➜ ${ETCD_MAIN} member add etcd-c1 --peer-urls="https://etcd-c1.internal.test.k8s.appvia.io:2380"
    Member 356cb0ee3ba9d1f6 added to cluster 5024708754869ab3
    
    ➜ ${ETCD_MAIN} member add etcd-c2 --peer-urls="https://etcd-c2.internal.test.k8s.appvia.io:2380"
    Member 4e65c79f2332a2ae added to cluster 5024708754869ab3
    
    ➜ ${ETCD_EVENTS} member add etcd-events-c1 --peer-urls="https://etcd-events-c1.internal.test.k8s.appvia.io:2381"
    Member fa29d33b3e164931 added to cluster df1655d3f512ef29
    
    ➜ ${ETCD_EVENTS} member add etcd-events-c2 --peer-urls="https://etcd-events-c2.internal.test.k8s.appvia.io:2381"
    Member de67c45c59dc7476 added to cluster df1655d3f512ef29

    Now that the new members have been added, they should initially be listed as unstarted:

    https://gist.github.com/KashifSaadat/65dcf2e3184412327bf71f5daf10f831#file-etcd-member-list-unstarted-txt

    At this point, the etcd Clusters have been configured to register the new clients once they become available on their respective addresses. The newly provisioned Master instances now require amendments to their etcd configuration to be able to join the existing Clusters.

    Next, login via SSH to the new Master instances and perform the following commands:

    https://gist.github.com/KashifSaadat/b7dcb1a284caaa6f61b6d5a118aaa08b#file-etcd-config-update-sh
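    The gist isn't inlined here. Based on standard etcd bootstrapping behaviour (an assumption, not the gist's contents), the essential change on each new master is telling etcd to join the existing cluster rather than bootstrap a new one, by flipping the initial cluster state and supplying the full member list before restarting etcd. A stand-in demonstration of the key edit, run against a temporary file rather than a live master:

```shell
# Stand-in env fragment in /tmp; on a real master you would edit the etcd
# configuration/manifest that protokube lays down, not this file.
cat > /tmp/etcd-env-example <<'EOF'
ETCD_INITIAL_CLUSTER_STATE=new
EOF

# Join the existing cluster instead of bootstrapping a new one:
sed -i 's/^ETCD_INITIAL_CLUSTER_STATE=.*/ETCD_INITIAL_CLUSTER_STATE=existing/' /tmp/etcd-env-example
cat /tmp/etcd-env-example
```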

    Validate that the members have started successfully and joined the cluster by running a ‘member list’ command on any of the master nodes that you are connected to:

    https://gist.github.com/KashifSaadat/b905795361b6b889b4d4b2f164b2e446#file-etcd-member-list-started-txt

    Once the etcd clusters have recovered as above, you can start the protokube service back up on the Master instances within the new Availability Zone: systemctl start protokube

    Your cluster should eventually validate successfully via kops after a few minutes once the new Master instances have caught up:

    kops validate cluster
    Using cluster from kubectl context: test.k8s.appvia.io
    
    Validating cluster test.k8s.appvia.io
    
    INSTANCE GROUPS
    NAME   ROLE MACHINETYPE MIN MAX SUBNETS
    eu-west-2a-master1 Master t2.medium 1 1 eu-west-2a
    eu-west-2a-master2 Master t2.medium 1 1 eu-west-2a
    eu-west-2a-master3 Master t2.medium 1 1 eu-west-2a
    eu-west-2b-master1 Master t2.medium 1 1 eu-west-2b
    eu-west-2b-master2 Master t2.medium 1 1 eu-west-2b
    eu-west-2c-master1 Master t2.medium 1 1 eu-west-2c
    eu-west-2c-master2 Master t2.medium 1 1 eu-west-2c
    nodes   Node t2.medium 2 2 eu-west-2a,eu-west-2b
    
    NODE STATUS
    NAME      ROLE READY
    ip-10-100-0-13.eu-west-2.compute.internal master True
    ip-10-100-0-170.eu-west-2.compute.internal node True
    ip-10-100-0-21.eu-west-2.compute.internal master True
    ip-10-100-0-56.eu-west-2.compute.internal master True
    ip-10-100-1-170.eu-west-2.compute.internal master True
    ip-10-100-1-23.eu-west-2.compute.internal node True
    ip-10-100-1-238.eu-west-2.compute.internal master True
    ip-10-100-2-118.eu-west-2.compute.internal master True
    ip-10-100-2-140.eu-west-2.compute.internal master True
    
    Your cluster test.k8s.appvia.io is ready

    Finally, you can now configure your node pools to make use of the new subnet. Run kops edit ig <nodes-ig> and add the new Availability Zone in:

    https://gist.github.com/KashifSaadat/3c7b4e6caf7a094c9d45a669cf3e0531#file-nodespec-diff
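    The gist isn't inlined here, but the change is simply appending the new zone's subnet to the instance group's subnets list, matching the IG names from the validate output above:

```yaml
spec:
  subnets:
  - eu-west-2a
  - eu-west-2b
  - eu-west-2c
```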

    One more kops update cluster test.k8s.appvia.io --yes and you’re good to go!
