Self-Service Of Cloud Resources - Terraform Controller

Self-service of cloud resources

Kubernetes has been brilliant at delivering an ecosystem for developers: it has improved the velocity of shipping and brought components under a common framework and DSL, with the flexibility to expand and extend the offering. So it beggars belief that, when we speak to customers, consuming application dependencies is still a major bottleneck to progress, with teams blocked waiting on that database, queue, object store and so forth.

Thing is ...

Most applications don't even make it to production - a large chunk of the software delivery cycle is prototyping and experimentation. It's driven by the need to show value quickly: try it, show it and see if it fails.
That often comes into conflict with platform engineering, and naturally so. Their goals of a productised setup and ownership of reliability, cost and security reflect a very different world view.
And while Kubernetes has been very successful in bridging the barrier of application delivery, enabling development teams and DevOps to experience platform as a service, application dependencies in a large chunk of organisations remain a ticketed system: click, open a support ticket, wait for a response, and so on.

While the terraform-controller isn’t trying to solve all those issues, it's a step in the right direction.

Reuse the terraform modules and code you already have; no pivots or tech choices.
Allow teams to consume it while maintaining control over the assets (Terraform modules) and the security profile (Checkov).
Let teams be aware of their own costs, so they can work on reducing them.

The why and the what for?

For Developers

Workflows are run outside of the Developer’s namespace, so credentials can be centrally managed and shared without being exposed.
Changes can be approved beforehand, following a plan and apply workflow.
Developers can view and debug the terraform workflows from their namespaces.
Delivers the output as environment variables ready to be consumed directly from a Kubernetes secret without further manipulation of the values.
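
To illustrate the last point, here is a minimal sketch of how a workload might pick up those values. It assumes the module outputs were written to a secret named test (the name used in the quick start further down); the pod name, image and the pod_manifest helper are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Print a Pod manifest that loads every key of a Secret as environment variables.
# Assumptions: pod name, container name and image are hypothetical placeholders.
pod_manifest() {
  local secret_name="$1"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "env; sleep 3600"]
      envFrom:
        - secretRef:
            name: ${secret_name}
EOF
}

# Print the manifest for the secret created in the quick start.
pod_manifest test
```

Piping the output into kubectl -n apps apply -f - would create the pod. Each key in the secret then shows up as an environment variable in the container, with no parsing needed in the application.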

For Platform Engineers

It's not a free-for-all: Platform Engineers can implement policy around which modules can be consumed by the app teams.
Configuration can be environment specific, enabling engineers to inject environment-specific data into the module configuration. Use cases like environment tags, filters, project labels, cost codes and so forth can be injected.
Enable developers to see the costs associated with their configurations within Kubernetes.
Supports pod identity (IRSA on AWS) and shuffles credentials management over to the cloud vendor.
Integrates with Infracost, providing the ability to view expected costs and potentially enforce policy (budget control).
Reuse the Terraform you've probably already written and the experience you've almost certainly got.
Place guardrails around the modules that your teams can use, rather than referencing or pulling any Terraform module from the internet.
Ability to orphan resources, i.e. delete the custom resource without deleting the cloud resource backing it.

Give it a try!

Prerequisites

The quickest way to get up and running is via the Helm chart.

a. Deploy the controller

^{$ git clone git@github.com:appvia/terraform-controller.git
$ cd terraform-controller
# Optional: if you don't already have a cluster, create a local one first with: kind create cluster
$ helm install -n terraform-system terraform-controller charts/ --create-namespace
$ kubectl -n terraform-system get po}

b. Configure credentials for developers

^{# The following assumes you are using static credentials. For managed pod identity see the docs: https://github.com/appvia/terraform-controller/blob/master/docs/providers.md

$ kubectl -n terraform-system create secret generic aws \
    --from-literal=AWS_ACCESS_KEY_ID=<ID> \
    --from-literal=AWS_SECRET_ACCESS_KEY=<SECRET> \
    --from-literal=AWS_REGION=<REGION>
$ kubectl -n terraform-system apply -f examples/provider.yaml}

c. Create your first configuration

^{$ kubectl create namespace apps
# NOTE: Make sure to change the bucket name in examples/configuration.yaml (spec.variables.bucket)
$ vim examples/configuration.yaml
$ kubectl -n apps apply -f examples/configuration.yaml
# Check the module output
$ kubectl -n apps get secret test -o yaml}
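
The values under data in that secret are base64 encoded, as with any Kubernetes secret. The sketch below decodes one by hand; decode_secret_value is a hypothetical helper and the sample value is made up rather than taken from a real module output.

```shell
#!/usr/bin/env bash
# Decode a single base64 value as found under .data in a Kubernetes Secret.
# decode_secret_value is a hypothetical helper name used for illustration.
decode_secret_value() {
  printf '%s' "$1" | base64 --decode
}

# Made-up sample: the base64 encoding of the string "my-bucket".
decode_secret_value "bXktYnVja2V0"
echo

# Against a live cluster, kubectl can decode every key in one go:
#   kubectl -n apps get secret test \
#     -o go-template='{{range $k, $v := .data}}{{$k}}={{$v | base64decode}}{{"\n"}}{{end}}'
```

Workloads that mount the secret or load it as environment variables receive the decoded values directly, so this is only needed when inspecting the secret from the command line.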

What's on the roadmap?

Budget Constraints

With Infracost already integrated, one idea is to introduce control over budgets. It wouldn't directly enforce costs, since some resources are usage-based (e.g. an S3 bucket is free to create, but dump 10TB inside and it costs a lot). It could, though, be a lightweight means of capturing costs, allowing developers to play with and tune their dependencies and fostering a better understanding of cost.

^{constraints:
  budgets:
    # Allow monthly spend of up to 100 dollars within each namespace for cloud resources
    - namespaces:
        matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: Exists
      budget: 100
    # Allow monthly spend of up to 500 dollars for namespaces with project cost center code PK-101
    - namespaces:
        matchExpressions:
          - key: company.com/costcode
            operator: In
            values: [PK-101]
      budget: 500}
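
Since this is only a roadmap idea, nothing here reflects how the controller would actually implement it. As a rough sketch of the underlying check, assume you have an estimated monthly cost (for example, the figure Infracost reports) and the budget value from a matching rule; within_budget is a hypothetical helper and the figures are made up.

```shell
#!/usr/bin/env bash
# Succeed (exit 0) when the estimated monthly cost fits within the budget.
# awk does the comparison because plain shell arithmetic cannot handle decimals.
within_budget() {
  local cost="$1" budget="$2"
  awk -v cost="$cost" -v budget="$budget" 'BEGIN { exit !(cost + 0 <= budget + 0) }'
}

# Made-up estimate of 42.50 against the per-namespace budget of 100.
if within_budget 42.50 100; then
  echo "within budget"
else
  echo "over budget"
fi
```

As noted above, an estimate like this cannot account for usage-based charges, so a check of this kind is a guide rather than a hard limit.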

Policy enforcement

Integrate Checkov into the pipeline and give Platform Engineers the ability to drive policy from above.

_{constraints:
  checkov:
    source: https://github.com/<ORG>/<POLICY-REPO>.git?ref=v1.2.0
    secretRef:
      name: policy-sshkey}

Note: while this acts as a barrier, it's very late in the game if it is first applied against a production workload, and that isn't really a shift-left approach. It is definitely worth reading our Policy As (Versioned) Code (PaC) blog for a coupled approach (i.e. using the same PaC repository in your Terraform module CI workflows prior to publishing versions, as well as enforcing it within the cluster at deployment time).

Update: This has been completed and is available from v0.1.1 onwards: https://github.com/appvia/terraform-controller/releases/tag/v0.1.1

So what are the alternatives?

This is by no means meant to be an exhaustive list or comparison (there are plenty of blogs a Google search away for that), but it's worth highlighting a few notable projects out there.

Crossplane

Now an incubating project in the CNCF, Crossplane is an interesting project and plays well into the bric-a-brac approach loved by us DevOps. In a nutshell, it's composed of managed resources (think Terraform resources) which are packaged up into "Compositions" (think an opinionated collection of Terraform modules) and presented back to the Application Developer as a consumable CRD. Having initially tried to replicate the breadth of Terraform's cloud support, it recently joined the club with its Terrajet project, which code-generates controllers from Terraform providers.

It has many pros, but does come with a pivot on tech and a learning curve for the platform teams.
Can't reuse previous investment or experience. Chances are you have a collection of tried and tested Terraform modules which are about to be scrapped.
There is no dry-run or plan support. Any changes made to resources will attempt to apply immediately, which holds risk where modification of particular resource attributes may invoke a destructive change.

Terraform Operator

Probably the first Google hit when typing "terraform controller". The project takes a similar approach, coordinating a series of workflows via Kubernetes jobs and mapping those to "terraform init" and "terraform apply".

The custom resource definition is highly flexible, allowing for tweaks to images, versions, pre and post run scripts and numerous other settings (though arguably you would have to block some of this functionality in some way, as it increases the surface area for abuse).
Currently, the operator has no means of sharing credentials. This could be seen as a pro, however, it does make for a more complicated deployment and consumption by developers, as you now need to manage credentials for teams or perhaps integrate with a product like Vault.
It has a feature of spec.resourceDownloads which is quite useful and could perhaps be used to provide environment-specific configuration.
Doesn't support approving or reviewing changes - everything is "auto-approve".
Policy would have to be superimposed afterwards via another component, e.g. Gatekeeper or a Kyverno admission controller.
Terraform outputs are written as a JSON object inside a Kubernetes Secret, and so your application may require changes to parse and consume these values.

Watch this space for a "hello world" example using the Terraform controller, and check out the following for more info: