How to Use the Security Profiles Operator

07 February 2022 by Chris Nesbitt-Smith

Introduction

Following on from "PodSecurityPolicy is Dead, Long Live...?", this tutorial covers the practical use of a new tool from the Kubernetes Node Special Interest Group.

The Linux kernel (the same marvel that brings us containers) provides a few capabilities for bridging the last mile in security management by limiting what running processes can actually do. Applying them is one of the most impactful changes you can make to disrupt the Cyber Kill Chain in your organisation.

The three technologies (seccomp, AppArmor and SELinux) are best used in a microservice architecture where each service handles only a small discrete task and can be effectively limited to do only that.

They would likely be largely ineffective when applied to a monolithic application with broad capabilities, where limiting it to ‘business-as-usual’ wouldn’t rule out much, if anything.

What these technologies won’t do:

  • Stop activity that is normal for your app, e.g. if it makes SQL queries against a database, an attacker who exploits a vulnerability could still execute arbitrary SQL queries.

What these technologies will do:

  • Potentially break your applications if you don’t fully capture their behaviour, e.g. if your e2e tests miss a scenario such as someone uploading a file while you record a profile, that scenario will be blocked once the profile is enforced.
  • Help you better understand what your application does and highlight when its behaviour changes as the developers evolve it.
  • Expose services that are good candidates for splitting up, since these tend to end up with broad or overly permissive profiles.
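
The point about spotting behaviour changes can be made concrete once profiles have been exported to files. As a rough sketch, assuming two exported SeccompProfile YAML files with one syscall per "- name" line under a "names:" key, the hypothetical helpers below list syscalls that appear in a newer recording but not in an older one. The function names are illustrative and not part of the operator.

```shell
# Hypothetical helper: print the syscall names listed in an exported
# SeccompProfile YAML file ($1), one per line, sorted and de-duplicated.
# Assumes each syscall sits on its own "- name" line beneath a "names:" key.
syscalls_in() {
  awk '
    /^[ \t]*names:[ \t]*$/ { in_names = 1; next }
    in_names && /^[ \t]*-[ \t]+[a-z0-9_]+[ \t]*$/ { print $2; next }
    { in_names = 0 }
  ' "$1" | LC_ALL=C sort -u
}

# Hypothetical helper: print syscalls present in the new profile ($2)
# that are absent from the old one ($1).
new_syscalls() {
  old_list=$(mktemp) || return 1
  new_list=$(mktemp) || return 1
  syscalls_in "$1" > "$old_list"
  syscalls_in "$2" > "$new_list"
  # comm -13 keeps only the lines unique to the second file
  LC_ALL=C comm -13 "$old_list" "$new_list"
  rm -f "$old_list" "$new_list"
}
```

Running new_syscalls old-profile.yaml new-profile.yaml (both file names are placeholders) and seeing, say, socket in the output would suggest the newer build has started doing something the older one did not.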

Let’s look at these technologies

seccomp

seccomp (short for secure computing mode) is a computer security facility in the Linux kernel. seccomp allows a process to make a one-way transition into a "secure" state where it can only make a limited set of system calls. Should it attempt any other system call, the kernel will, depending on how the profile is configured, log the call, return an error, or terminate the process.

AppArmor

AppArmor ("Application Armor") is a Linux kernel security module that allows the system administrator to restrict programs' capabilities with per-program profiles. Profiles can allow capabilities like network access, raw socket access, and the permission to read, write, or execute files on matching paths. AppArmor supplements the traditional Unix discretionary access control (DAC) model by providing mandatory access control (MAC).

SELinux

SELinux is a set of kernel modifications and user-space tools that have been added to various Linux distributions. Its architecture strives to separate enforcement of security decisions from the security policy, and streamlines the amount of software involved with security policy enforcement.

And Kubernetes exposes these!

However, managing them is not easy, so unsurprisingly lots of commercial products have entered the space with all sorts of buzzwords like ‘artificial intelligence’ and ‘machine learning’.

These commercial offerings are great and can simplify the implementation, but it’s worth understanding how things work under the hood and deciding how much control you are willing to relinquish to an algorithm.

Relatively recently a Kubernetes special interest group has developed the Kubernetes Security Profiles Operator which works to expose the power of seccomp, SELinux and AppArmor to end users.

The technologies are not mutually exclusive, and I would encourage combining them. For the sake of this article I’ll focus on seccomp, since at the time of writing it is the best supported by the Security Profiles Operator and it has been cited as mitigating some recent high-profile vulnerabilities, e.g. Polkit Pwnkit (CVE-2021-4034).

In short to inform your technology choices:

  • Seccomp can reduce the chance that a kernel vulnerability will be successfully exploited.
  • AppArmor and SELinux can prevent an application from accessing files it should not.

Show me the code

This is a tutorial on how to use the Security Profiles Operator to record and enforce seccomp profiles.

I'm going to demo this using Docker Desktop and also Podman machine on a Mac. You can follow the same steps on a Linux or Windows machine. Things might be a lot easier if you're using a Linux machine with auditd/syslog enabled, but since the VM behind Docker Desktop (linuxkit) or podman-machine (Fedora CoreOS) doesn't ship with that running, we'll have to run our own.

Bootstrap

This assumes you're using Docker (including Docker Desktop) or Podman. Podman machine requires a few tweaks; I've added these as comments, suffixed with PODMAN ONLY or PODMAN MACHINE ONLY. Where they apply to you, just uncomment those lines.

Start a KiND cluster

We’re going to use KiND to run a local Kubernetes cluster.

You need to force /proc to be mounted through to the nodes. If you have multiple nodes, you'll need to add the extraMounts section to each node.

# export KIND_EXPERIMENTAL_PROVIDER=podman                           # PODMAN ONLY
# podman machine init --cpus=4 --memory=8096                         # PODMAN MACHINE ONLY
# podman machine start                                               # PODMAN MACHINE ONLY
# podman system connection default podman-machine-default-root       # PODMAN MACHINE ONLY

kind create cluster --config - << EOF
apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
name: kind
networking:
#  apiServerAddress: "0.0.0.0"                                       # PODMAN ONLY
nodes:
  - role: control-plane
    image: kindest/node:v1.23.3
    extraMounts:
    - hostPath: /proc
      containerPath: /hostproc
EOF

# sed -i '' 's/https:\/\/:/https:\/\/localhost:/g' ~/.kube/config    # PODMAN ONLY

Mounting /proc is important since it allows us to match the process IDs from the kernel-level audit logs through to the namespaced process IDs within the KiND namespaced cgroup.

Deploy syslog (and wait for it to be ready)

Podman machine and Docker Desktop use a VM that doesn't ship with syslog or auditd, which are needed to write the logs that the log enricher then collects. This needs to be deployed as a DaemonSet across the cluster. You may be able to skip this step if you're using a Linux workstation or podman-machine, which can use eBPF instead of log enrichment.

kubectl apply -k github.com/chrisns/syslog-auditd
kubectl --namespace kube-system wait --for condition=ready pods -l name=syslog

Deploy cert manager (and wait for it to be ready)

kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.6.1/cert-manager.yaml
kubectl --namespace cert-manager wait --for condition=ready pod -l app.kubernetes.io/instance=cert-manager

Deploy Security Profiles Operator (and wait for it to be ready)

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/security-profiles-operator/main/deploy/operator.yaml
kubectl --namespace security-profiles-operator wait --for condition=available deployment --all

Use our custom proc mount and enable the log enricher in the SecurityProfilesOperatorDaemon (SPOD) and wait for it to be ready

kubectl --namespace security-profiles-operator patch spod spod --type=merge -p '{"spec":{"hostProcVolumePath":"/hostproc"}}'
kubectl --namespace security-profiles-operator patch spod spod --type=merge -p '{"spec":{"enableLogEnricher":true}}' # DOCKER DESKTOP ONLY
# kubectl --namespace security-profiles-operator patch spod spod --type=merge -p '{"spec":{"enableBpfRecorder":true}}' # PODMAN / LINUX HOST ONLY
kubectl --namespace security-profiles-operator wait --for condition=ready pod -l name=spod

Record Syscalls

$ kubectl apply -f https://raw.githubusercontent.com/appvia/security-profiles-operator-demo/main/demo-recorder.yaml

$ kubectl run my-pod --image=nginx --labels app=demo && kubectl wait --for condition=ready --timeout=-1s pod my-pod && kubectl delete pod my-pod
pod/my-pod created
pod/my-pod condition met
pod "my-pod" deleted

$ kubectl run --rm -it my-pod --image=alpine --labels app=demo -- sh
If you don't see a command prompt, try pressing enter.
/ # ls
bin    dev    etc    home   lib    media  mnt    opt    proc   root   run    sbin   srv    sys    tmp    usr    var
/ # exit
Session ended, resume using 'kubectl attach my-pod -c my-pod -i -t' command when the pod is running
pod "my-pod" deleted
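
The recorder applied at the start of this section is fetched from a URL, so its contents are not shown here. As a sketch of what a recording of this kind might contain, assuming the operator's ProfileRecording resource with a pod selector matching the app=demo label used above, the hypothetical helper below prints such a manifest. The actual file in the demo repository may differ, and the helper name and arguments are illustrative only.

```shell
# Hypothetical helper: print a ProfileRecording manifest.
# $1 = recording name, $2 = recorder ("logs" or "bpf"), $3 = value of the
# "app" label to watch. All three values are illustrative assumptions.
profile_recording() {
  cat << EOF
apiVersion: security-profiles-operator.x-k8s.io/v1alpha1
kind: ProfileRecording
metadata:
  name: $1
spec:
  kind: SeccompProfile
  recorder: $2
  podSelector:
    matchLabels:
      app: $3
EOF
}
```

Something like profile_recording demo-recorder logs demo | kubectl apply -f - would then start recording any pod labelled app=demo. A recording named demo-recorder would be consistent with the demo-recorder-my-pod profile name used in this tutorial.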

Collect a seccomp profile

You'll now have a profile that's ready to use (note it is only aggregated and created when the pod exits).

We can check what that looks like, and export it to keep in version control, with:

kubectl neat get sp demo-recorder-my-pod -o yaml

I'm using kubectl-neat to make the output less verbose. This should give you YAML that looks like:

apiVersion: security-profiles-operator.x-k8s.io/v1beta1
kind: SeccompProfile
metadata:
  labels:
    spo.x-k8s.io/profile-id: SeccompProfile-demo-recorder-my-pod
  name: demo-recorder-my-pod
  namespace: default
spec:
  architectures:
  - SCMP_ARCH_AARCH64
  defaultAction: SCMP_ACT_ERRNO
  syscalls:
  - action: SCMP_ACT_ALLOW
    names:
    - brk
    - capget
    - capset
    - chdir
    - clone
    - close
    - epoll_ctl
    - execve
    - exit_group
    - fchown
    - fcntl
    - fstat
    - fstatfs
    - futex
    - getcwd
    - getdents64
    - geteuid
    - getpgid
    - getpid
    - getppid
    - getuid
    - ioctl
    - lseek
    - madvise
    - mmap
    - mprotect
    - munmap
    - nanosleep
    - newfstatat
    - openat
    - ppoll
    - prctl
    - read
    - rt_sigaction
    - rt_sigprocmask
    - rt_sigreturn
    - set_tid_address
    - setgid
    - setgroups
    - setpgid
    - setuid
    - wait4
    - write
    - writev

Start a workload with that Seccomp Profile

For shorthand we're gonna use --overrides to force in some extra things to the podspec

$ kubectl run --rm -ti my-pod --image=alpine  --overrides='{ "spec": {"securityContext": {"seccompProfile": {"type": "Localhost", "localhostProfile": "operator/default/demo-recorder-my-pod.json"}}}}' -- sh
/ # ls
bin    dev    etc    home   lib    media  mnt    opt    proc   root   run    sbin   srv    sys    tmp    usr    var
/ # exit
Session ended, resume using 'kubectl attach my-pod -c my-pod -i -t' command when the pod is running
pod "my-pod" deleted

Ok, so we've not broken anything
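
The localhostProfile path in that override appears to follow the pattern operator/<namespace>/<profile name>.json. Assuming that pattern holds for your profiles, a small hypothetical helper can build the JSON so you don't have to hand-edit it each time. The function name is made up for this sketch.

```shell
# Hypothetical helper: print a --overrides JSON document that applies a
# recorded profile to a pod.
# $1 = namespace of the SeccompProfile, $2 = SeccompProfile name.
# The operator/<namespace>/<name>.json layout is inferred from this article.
seccomp_overrides() {
  printf '{"spec":{"securityContext":{"seccompProfile":{"type":"Localhost","localhostProfile":"operator/%s/%s.json"}}}}\n' "$1" "$2"
}
```

It could then be used as kubectl run --rm -ti my-pod --image=alpine --overrides="$(seccomp_overrides default demo-recorder-my-pod)" -- sh.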

Prove that the seccomp profile is enforcing

Without the seccomp profile

$ kubectl run --rm -ti my-pod --image=alpine -- sh
If you don't see a command prompt, try pressing enter.

/ # mkdir foo
/ # touch bar
/ # rm /etc/alpine-release
/ # ping -c 1 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 56 data bytes
64 bytes from 1.1.1.1: seq=0 ttl=37 time=20.657 ms

--- 1.1.1.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 20.657/20.657/20.657 ms
/ # nslookup google.com
Server:		10.96.0.10
Address:	10.96.0.10:53

Non-authoritative answer:
Name:	google.com
Address: 142.250.187.238

Non-authoritative answer:
Name:	google.com
Address: 2a00:1450:4009:81f::200e
/ # wget -q 1.1.1.1
/ # exit
Session ended, resume using 'kubectl attach my-pod -c my-pod -i -t' command when the pod is running
pod "my-pod" deleted

All looks normal and permissive. Now let's try the same thing with our profile:

$ kubectl run --rm -ti my-pod --image=alpine  --overrides='{ "spec": {"securityContext": {"seccompProfile": {"type": "Localhost", "localhostProfile": "operator/default/demo-recorder-my-pod.json"}}}}' -- sh
/ # mkdir foo
mkdir: can't create directory 'foo': Operation not permitted
/ # touch bar
touch: bar: Operation not permitted
/ # rm /etc/alpine-release
rm: remove '/etc/alpine-release'? y
rm: can't remove '/etc/alpine-release': Operation not permitted
/ # ping -c 1 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 56 data bytes
ping: permission denied (are you root?)
/ # nslookup google.com
nslookup: socket(AF_INET,2,0): Operation not permitted
/ # wget -q 1.1.1.1
wget: socket(AF_INET,1,0): Operation not permitted
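
The nslookup and wget errors both name socket(), and socket is not among the syscalls in the recorded profile. To check an exported profile for a given syscall without reading the whole list, a sketch like the following could help. It assumes the profile has been saved to a file with one "- name" entry per line, and the helper name is hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) if syscall $2 is listed in the
# exported SeccompProfile YAML file $1, fail otherwise.
# Matches whole entries only, so "write" does not match "writev".
profile_lists() {
  grep -Eq "^[[:space:]]*-[[:space:]]+$2[[:space:]]*\$" "$1"
}
```

For example, profile_lists demo-profile.yaml socket || echo "socket is not allowed", where demo-profile.yaml is a placeholder for wherever you exported the profile.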

Cool, so we're pretty trapped, but this is quite a contrived example. Let's try something a bit more real.

Putting it all together in something less contrived

For this exercise we'll deploy WordPress, which needs MySQL/MariaDB, and we'll also throw in phpMyAdmin for 'fun'.

First let's deploy our recorder.

kubectl apply -f https://raw.githubusercontent.com/appvia/security-profiles-operator-demo/main/wordpress-recorder.yaml

Now let’s deploy our apps.

kubectl apply -k github.com/appvia/wordpress-kustomization-demo
kubectl wait --for condition=ready pod mysql-0

Now let's go to the WordPress GUI and check it works.

kubectl port-forward svc/wordpress 8080:http

Open a browser to http://localhost:8080

It doesn't really matter what config you give it, since you're not going to keep this installation. If this were your app, you might run your end-to-end tests right now.

Do some other things now, like creating a blog post, uploading images, etc.

Now let's try phpMyAdmin

kubectl port-forward svc/phpmyadmin 8081:http

And go to it in the browser at http://localhost:8081/?db=mydb&table=wp_posts, which proves it’s all talking to one another. You can click around and do some other things, like uploading or downloading a file, if you like.

Now let's delete our pods, collect our profiles and stop recording

kubectl delete -k github.com/appvia/wordpress-kustomization-demo
kubectl delete -f https://raw.githubusercontent.com/appvia/security-profiles-operator-demo/main/wordpress-recorder.yaml

kubectl neat get sp wordpress-mysql -o yaml > mysql-seccomp-profile.yaml
kubectl neat get sp wordpress-phpmyadmin-0 -o yaml > phpmyadmin-seccomp-profile.yaml
kubectl neat get sp wordpress-wordpress-0 -o yaml > wordpress-seccomp-profile.yaml

Now we've got our profiles, we can either update our deployment code and include the seccomp profile with our infra code or, if that isn't in your control (perhaps it's a public Helm chart you've got no influence over), use a binding provided by the Security Profiles Operator instead:

kubectl apply -f - << EOF
apiVersion: security-profiles-operator.x-k8s.io/v1alpha1
kind: ProfileBinding
metadata:
  name: wordpress-wordpress
spec:
  profileRef:
    kind: SeccompProfile
    name: wordpress-wordpress-0
  image: wordpress:5.8.2-php7.4-apache
---
apiVersion: security-profiles-operator.x-k8s.io/v1alpha1
kind: ProfileBinding
metadata:
  name: wordpress-phpmyadmin
spec:
  profileRef:
    kind: SeccompProfile
    name: wordpress-phpmyadmin-0
  image: phpmyadmin:5.1.1-apache
---
apiVersion: security-profiles-operator.x-k8s.io/v1alpha1
kind: ProfileBinding
metadata:
  name: wordpress-mysql
spec:
  profileRef:
    kind: SeccompProfile
    name: wordpress-mysql
  image: mariadb:10.6.5-focal
EOF

If we look at the pods you'll find that the Security Profiles Operator has mutated the pod specs and injected something like:

securityContext:
  seccompProfile:
    localhostProfile: operator/default/wordpress-mysql.json
    type: Localhost
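
That stanza is what the binding injects for you. If you do control the deployment code, you could set the same fields yourself. A minimal sketch, assuming the workload is a Deployment managed with kustomize and that the profile path follows the operator/<namespace>/<name>.json pattern seen in this article. The helper name and its arguments are hypothetical.

```shell
# Hypothetical helper: print a patch that adds a Localhost seccomp profile
# to a Deployment's pod template.
# $1 = Deployment name, $2 = namespace of the profile, $3 = SeccompProfile name
seccomp_deployment_patch() {
  cat << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $1
spec:
  template:
    spec:
      securityContext:
        seccompProfile:
          type: Localhost
          localhostProfile: operator/$2/$3.json
EOF
}
```

For instance, seccomp_deployment_patch phpmyadmin default wordpress-phpmyadmin-0 > phpmyadmin-seccomp-patch.yaml writes a file (the file name is a placeholder) that can be referenced as a patch from your kustomization and kept in version control next to the exported profile.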

Now let's check it's all enforcing as we expect:

$ kubectl exec -ti deploy/phpmyadmin -- sh
# touch f
touch: setting times of 'f': Operation not permitted
# su
su: write error: Operation not permitted

You may find to your disappointment, as I did, that many community (and commercial) products often dance like no one is watching and require quite liberal access to kernel syscalls, but we can at least now monitor what they can do and be aware when the required permissions change.

It's worth noting that some recent CVEs may have been mitigated with even the default seccomp profile.
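
The default profile here is the one built into the container runtime, which a pod can request by setting its seccomp profile type to RuntimeDefault. Assuming your cluster does not already apply it to every pod, a sketch of opting a single pod in might look like the following. The helper name is hypothetical, and the pod name and image are whatever you pass in.

```shell
# Hypothetical helper: print a Pod manifest that asks for the container
# runtime's default seccomp profile.
# $1 = pod (and container) name, $2 = container image
runtime_default_pod() {
  cat << EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: $1
    image: $2
EOF
}
```

For example, runtime_default_pod my-pod nginx | kubectl apply -f - would create an nginx pod confined by the runtime's default profile, which is a reasonable baseline for workloads you haven't recorded a tailored profile for yet.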

Summary

If you’ve followed along with this tutorial you should now be all set to start capturing seccomp profiles for your workload, and have all the tools you need to work that into your continuous integration + deployment pipelines to be able to enforce them in your clusters.

About the author

Chris Nesbitt-Smith

Solution Architect

A developer at heart that's now more focused on people and less on the technical implementation detail. Although, when I'm not building stuff with the kids, I spend time on open source work and co-maintain a few high-profile repositories in the home automation space.
