This is the story of how three Appvia engineers contributed so much to the KOPS project that they became authorising contributors, more commonly known as maintainers. Read about why they needed KOPS and what challenges they faced, as well as the particular changes they made for their own needs and for the needs of the community.
Kubernetes Operations, or KOPS, is an open source project used to set up Kubernetes clusters easily and swiftly. [Kubernetes Operations with KOPS]
We like to think of it as kubectl for clusters. KOPS will not only help you create, destroy, upgrade and maintain production-grade, highly available Kubernetes clusters, but it will also provision the necessary cloud infrastructure.
Seven years ago, three Appvia engineers were contracting at the Home Office: Kash Saadat, Appvia SRE Lead; Rohith Jayawardene, Appvia Product Engineering Lead; and Lewis Marshall, SRE Lead & Tech Evangelist. Rohith started a project called Commons, which aimed to consolidate a range of default services such as asset management and CLI.
During this time, the Appvia contracting team had their own homegrown solution at the Home Office, which was a digital services platform. This was before they had looked at using KOPS, so they were essentially doing Kubernetes the Hard Way. It was a collection of bash scripts that formed a platform and service to enable developers to X-ray and deliver their applications early.
Eventually, the platform Rohith and his team were running started to become popular and gained a lot of traction inside the Home Office itself. Because the Home Office itself dealt with different portfolios and different clients, and because the project was consumed in a sort of SaaS format from a central engineering team, the team sold their product into those different portfolios and to those different clients. As it became more popular, it became a strategic platform, which created an opportunity for users, and the team said to themselves, “Okay, well, here, we’ve got this platform, it’s great. It’s wonderful.”
What they started to experience, however, was a huge amount of work being thrown into the backlog all the time. This meant the team couldn’t keep the platform exactly as it was. The platform would have to migrate, meaning all of the customers using its capabilities would have to migrate as well. The team had other concerns too, such as the way users were managed and ETCD. This prompted the team to release version three, which enabled a simple migration path for their customers.
This need to change the platform presented an opportunity for the team to question whether they wanted to stick with what they had. They knew Kubernetes was growing in popularity, which raised the question: should they stick with what they had, and the giant backlog that accompanied it, or go with something else? And if they were going to go with something else, what would it be? On the technology side, Kubernetes 1.6 was arriving and becoming a viable option.
But the internally developed solution quickly surfaced some unanticipated problems. There was an increasing burden of effort to introduce or re-engineer things in order to stay in line with the new features and requirements emerging in the industry. That was expensive and time-consuming. It was certainly professionally developed. But the trouble is, even professionally developed internal software can turn into an untenable long-term proposition if there is a need to constantly refactor it.
Would it be worth putting any more engineering time into it? Was there an alternative?
After much research, it appeared that KOPS was the closest fit – but by no means the perfect fit – for their needs. It had the following virtues:
On the other hand, it did not offer all the features the team needed. For example, there was no support for encrypted Docker volumes or migration to newer versions of ETCD.
So, while it was a compromise, it was definitely a step in the right direction. And since it was open source, the team had the opportunity to add their own embellishments to the existing project. This would be the answer to their security needs.
The team began by implementing YAML configuration files as a way to pass information to the KOPS process. That allowed KOPS to be used in a CI/CD process and the configuration to be kept (in YAML form) in a git repository.
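As a rough illustration of what a non-interactive, YAML-driven run can look like, here is a minimal Python sketch that only builds the command lines a CI job might execute. It is not the team's actual pipeline: the cluster name, state store URL, file name and the `kops_ci_commands` helper are all hypothetical, and it assumes a current `kops` CLI where `kops replace -f` loads a spec file and `kops update cluster --yes` applies it without prompting.

```python
from typing import List


def kops_ci_commands(cluster_name: str, state_store: str, spec_path: str) -> List[List[str]]:
    """Build the argv lists a CI job could run. Nothing is executed here.

    Hypothetical helper for illustration; all argument values are assumptions.
    """
    common = ["--state", state_store]
    return [
        # Load the YAML spec kept in git into the state store
        # (--force creates the resources if they do not exist yet).
        ["kops", "replace", "--force", "-f", spec_path, *common],
        # Apply the stored spec; --yes skips the interactive preview step.
        ["kops", "update", "cluster", "--name", cluster_name, "--yes", *common],
    ]


if __name__ == "__main__":
    # Placeholder values, not taken from the original project.
    for argv in kops_ci_commands(
        "example.k8s.local", "s3://example-kops-state", "cluster.yaml"
    ):
        print(" ".join(argv))
```

Keeping the commands as plain lists makes them easy to review in a pull request alongside the YAML file, and a pipeline could hand each list to its own process runner.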
With the help of a very accommodating user community, the team was able to incorporate numerous changes into KOPS. Some of the changes included encrypted Docker volume support, CI/CD (i.e. non-interactive) execution support, ETCDv3 support, and RBAC. For a more detailed list, look here and here.
KOPS is an excellent way to configure your Kubernetes environment. It brings to light all the various configuration options necessary to create a full-blown cluster. But ultimately one thing is missing: it does not have any sense of common best practices or optimum choices for your cluster. It assumes that you are a thinking professional who understands all of the implications of everything you're doing. It will not stop you from making bad choices. It has, in other words, no opinions about how you should do things.
The same team that worked on making KOPS more robust and usable is now turning its attention to a new product called Wayfinder. Unlike KOPS, Wayfinder is built from the ground up to have opinions about Kubernetes configuration. In addition, Wayfinder is built so that users do not need to be Kubernetes experts in order to bring up a cluster. In particular, security best practices are “pre-baked” into the product.