Handling Serverless on Kubernetes

AJ McCaw, August 9, 2023

Serverless and Kubernetes are hot topics in the software industry. Serverless is a software architecture that allows you to run your applications without managing the underlying infrastructure. In other words, you don’t have to provision, maintain, or scale the host servers in order to run your applications.

Kubernetes is a container orchestration tool for automating the deployment, scaling, and management of containerized applications. Containerizing an application means you can run it on any operating system and expect it to behave the same way. However, deploying, managing, and scaling many containerized applications turned out to be tedious, which is the problem Kubernetes was created to solve.

The combination of serverless and Kubernetes opens a new world of possibilities for containerized applications, including using compute resources only when the containerized application is triggered, among many other benefits.

In this article, you’ll learn what serverless capabilities Kubernetes offers, the use cases where serverless is essential, and how to handle serverless workloads on Kubernetes.

What Is Serverless Architecture

In serverless architecture, the underlying servers are only invoked when an event triggers the application, and then the application stops running once it completes its process. This makes it possible to only pay for the computing resources your application uses.

With traditional computing, the hosting provider charges you whether your application is actively receiving requests or is idle. With serverless architecture, you don’t have to worry about purchasing or provisioning servers because the serverless vendor takes over that responsibility.

Importance of Serverless Architecture

Serverless architecture provides many benefits for your applications, including the following:

Highly Scalable

Serverless architecture makes your applications highly scalable. As your application usage increases, the serverless vendor spins up more servers to accommodate the increase. In comparison, with traditional Kubernetes hosting, the application can only handle a fixed amount of traffic unless you manually scale the application replicas or resources, or configure a cluster or pod autoscaler in your cluster.

In summary, serverless Kubernetes handles scaling in an easy and transparent way when compared to traditional Kubernetes hosting.
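For reference, configuring a pod autoscaler in traditional Kubernetes hosting involves a manifest like the following sketch, which assumes a hypothetical Deployment named `web` and scales it on CPU utilization:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # hypothetical Deployment to scale
  minReplicas: 1           # note: a standard HPA cannot scale to zero
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Even with this in place, at least one replica keeps running at zero traffic, which is exactly the gap that serverless tooling closes.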

Easy to Update and Deploy

Serverless architecture makes it easy to update and deploy a new application version with less configuration than traditional Kubernetes hosting. You can easily push recent changes to your application and release them to production. In addition, since the application is composed of different functions rather than being monolithic, deploying parts of your application without redeploying the whole thing becomes effortless.

No Server Management

When you use serverless infrastructure for your applications, there are still underlying servers running them. However, you don’t have to concern yourself with how your application interacts with those servers; that management is handled by the vendor offering the serverless infrastructure. Since you’re not in charge of the underlying infrastructure, you have more time to focus on other important parts of your application.

Pay as You Go

With a serverless architecture, you’re only charged for the computing resources you use, which reduces costs. Your code only runs when it’s triggered, and it scales up when there is a spike in traffic and automatically scales down when the traffic returns to normal.

In a traditional server architecture, you have to predict your application usage so that you can properly plan the compute resources needed in order to save costs. If your application later doesn’t receive the predicted traffic, you’ll be wasting resources and be charged for them. While autoscalers address scaling in traditional Kubernetes hosting, they can’t scale your deployments to zero, which means that even at zero traffic, you’ll still be charged for your cluster resources. With serverless architecture, scaling to zero is possible, so you only pay for the resources you consume.

How to Handle Serverless with Kubernetes

Kubernetes doesn’t provide out-of-the-box functionality for handling serverless workloads on your Kubernetes infrastructure. However, several tools allow you to run serverless workloads on Kubernetes, including Knative, OpenFaas, and Apache OpenWhisk.


Knative

Knative, originally created by Google, is an open source platform for deploying and managing serverless applications in the cloud and on premises. Knative was designed to let developers focus on developing their application while removing the burden of deploying, managing, and scaling it on Kubernetes.

Knative has many notable features including scaling applications to zero, application triggers based on events, and handling events from multiple sources. Other features include caching artifacts for faster builds, progressive application rollouts, and connecting your own monitoring tools. 
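As a sketch of what this looks like in practice, a Knative Service manifest declares only the container to run, and the scale-to-zero behavior comes from an autoscaling annotation (the service name and image below are hypothetical):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello               # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        # allow this service to scale all the way down to zero when idle
        autoscaling.knative.dev/min-scale: "0"
    spec:
      containers:
        - image: ghcr.io/example/hello:latest   # hypothetical image
```

Knative fills in the underlying Deployment, routing, and autoscaling configuration from this single resource.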


OpenFaas

Similar to Knative, OpenFaas lets developers deploy serverless functions and applications on Kubernetes without repetitive, boilerplate coding. One of its main goals is to ensure that developers can deploy serverless applications with ease.

Some of the OpenFaas features that make deploying serverless applications easy include a template store, which contains templates for different programming languages for building Docker container images of your application. These images can then run as serverless applications on Kubernetes. In addition, OpenFaas has a function store that lets you share and reuse other developers’ functions.
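With the Python template from the template store, the function body boils down to a `handle` function in a `handler.py` file; a minimal sketch might look like this (the greeting text is illustrative, not part of the official template):

```python
# handler.py - the entry point shape used by the OpenFaas python templates.
# OpenFaas passes the request body in as a string, and whatever the
# function returns is sent back as the HTTP response body.
def handle(req):
    """Handle a request to the function.

    Args:
        req (str): request body
    """
    return "Hello from OpenFaas, you said: " + req
```

Building the function with `faas-cli up` would package this handler into a container image and deploy it to your cluster.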

Other notable features and benefits of OpenFaas include scaling idle functions to zero replicas; triggering functions and applications via HTTP requests, Apache Kafka messages, and AWS SQS queues; and authentication via OpenID Connect and OAuth2.


Apache OpenWhisk

Developed by Apache, OpenWhisk is another serverless platform that takes care of provisioning and managing servers for your application while you focus on developing and maintaining it. You can write your application in any programming language, and your functions can be triggered via HTTP requests or other triggers using hooks and polling.
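For example, an OpenWhisk action written in Python is just a `main` function that receives its parameters as a dictionary and returns a JSON-serializable dictionary (the greeting logic here is purely illustrative):

```python
# OpenWhisk invokes main(args) for each trigger or HTTP request,
# passing parameters in as a dict; the returned dict becomes the
# action's JSON result.
def main(args):
    name = args.get("name", "stranger")
    return {"greeting": "Hello " + name + "!"}
```

You could then deploy it with `wsk action create greet greet.py` and invoke it with `wsk action invoke greet --param name World --result`.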

OpenWhisk provides a feature called `packages` that allows you to integrate and reuse other developers’ functions with your own. Some packages can also be used to trigger your functions, such as the Alarms package, which you can use to trigger your functions at intervals.

Other notable features include system limits on the memory a function can use and the number of function invocations per minute, as well as the ability to compose multiple functions written in different languages into a sequence.

What Are the Benefits of These Kubernetes Tools

Knative, OpenFaas, and OpenWhisk provide many benefits for deploying serverless applications on Kubernetes, including the following:

Autoscaling: Application workloads can be scaled up from zero when traffic to your application increases and scaled back down to zero when no requests are hitting your application.

Progressive rollouts: You can configure your application deployment strategy when there are new releases or versions, such as configuring blue-green deployments and canary deployments.

Events: You can handle events from many sources and trigger workloads from specific events.

Cloud deployment: You have the ability to deploy your application on any cloud infrastructure.

How Does Serverless Differ from Traditional Workloads in Kubernetes

To deploy traditional workloads in Kubernetes, you need to create a YAML file that describes the deployment. This deployment file contains information such as the container image, the number of pod replicas, a service that matches the labels of the pods in order to expose the application on the Kubernetes network, and so on. Following is an example of a deployment file:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
```

However, serverless workloads take a different approach to deployment. With serverless, you don’t have to worry about the underlying infrastructure: you run your container image on your cluster using any of the Kubernetes serverless tools mentioned previously, and everything is taken care of for you. This approach comes with a lot of benefits. For instance, once you deploy the application, the serverless tool automatically creates the required resources to run it, including the deployments and services.
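For instance, with Knative installed, a single CLI command can replace the deployment file shown above; the image name here is hypothetical:

```shell
# Deploy a container image as a serverless service; Knative creates
# the underlying Deployment, autoscaler, and routing for you.
kn service create hello --image ghcr.io/example/hello:latest

# List the Knative Services and the URLs they are exposed on
kn service list
```

This is a sketch rather than a full walkthrough: it assumes a cluster with Knative Serving installed and the `kn` CLI configured against it.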

When to Use Serverless Workloads in Kubernetes

Deciding when to use serverless workloads in Kubernetes depends on your needs. If you want to reduce your application’s time to market, then a serverless workload is a good option compared to managing multiple deployments as traditional workloads.

If your application isn’t going to experience constant usage, then a serverless workload may be right for you because of the cost savings. With traditional workloads, you’re consistently using compute resources, which your service provider will charge for at the end of the month whether your application received requests or not.

Additionally, serverless architecture is platform agnostic, which means there is no vendor lock-in to any of the Kubernetes platform providers, including Amazon EKS and Azure Kubernetes Service (AKS).

Before switching to a serverless workload, it’s also important to consider some of the disadvantages of running serverless workloads on Kubernetes. One of the major limitations is that the response time of the first request is higher than subsequent requests because of the time it takes to create the initial deployment and for the pod to become ready. This phenomenon is known as a cold start. Cold starts can affect the performance of your app and reduce its adoption rate.
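One common mitigation, if you’re using Knative, is to keep a minimum number of warm replicas via the `min-scale` annotation, trading some idle cost for faster first responses; a fragment of a Service spec might look like this:

```yaml
# Fragment of a Knative Service spec: keep at least one replica warm
# so the first request never hits a cold start.
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "1"
```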

Additionally, if you have a large application with a predictable workload, traditional servers might be a better fit than serverless computing because, in the long run, you save costs by sizing and optimizing your servers for the expected workload.

In the previous Appvia article on Kubernetes serverless frameworks, several different Kubernetes frameworks were reviewed, including OpenFaas and OpenWhisk. Since that article was published, these frameworks have gained new features that are worth mentioning.

Following are some notable improvements:


OpenFaas

In the last three years, OpenFaas has added the following features:

– You can now add HTTP status codes to its histogram metrics

– Cache expiry for replica readiness has been reduced from 5s to 250ms

– The basic-auth plugin and gateway with faas-provider have been updated

– You can now publish async requests to multiple topics, and the NATS dependency has been updated


OpenWhisk

No notable changes have been made to OpenWhisk since November 2020. However, between July 2019 and November 2020, OpenWhisk added the following:

– Support for the Node.js 14 runtime

– Support for the PHP 7.4 runtime

– Users can now implement `ArtifactStore` for CosmosDB

– You can use separate DB users for deployed components


Kubeless

Kubeless was previously maintained by VMware. However, it’s no longer actively maintained (its repository now lives under the vmware-archive organization), and the project is looking for a new individual or group willing to continue maintaining it.

This means that if you’re using this serverless framework, you may have difficulty getting the support you need. The project currently has over 6,000 stars on GitHub, and the last release of the project was in 2021.


Knative

Since July 2019, Knative has made the following updates:

– An `autoscaling.knative.dev/activation-scale` annotation that allows the user to set a minimum number of replicas when not scaled to zero

– Allows `dnsConfig` and `dnsPolicy` to be specified on pod specs when the feature is enabled in the `config-features` config map.

– Custom pod metrics (other than “CPU” or “memory”) are now allowed 

– Users can set `container[*].securityContext.runAsGroup` in order to improve the security of containers

– Users can set `spec.template.spec.automountServiceAccountToken` to false in a PodSpec in order to opt out of Kubernetes’ default behavior (i.e., mounting a ServiceAccount token in the pod’s containers)

– You can set `ReadOnlyRootFilesystem` on the container’s `SecurityContext`

– Revisions are retained 48 hours (as opposed to 24 hours previously), and the latest 20 revisions are kept for a longer period of time before they’re considered for Garbage Collection (previously only 1 revision was retained)

– A fix for name collisions when there are two routes in the network

– The probe path changed from `/_internal/knative/activator/probe` to `/healthz`, making it consistent across all probe receivers in Knative Serving



Fission

In the last three years, Fission has updated the following:

– The Fission CLI is now more compatible with the Kubernetes API

– Enabled recommended `securityContext` settings during Fission installation

– Upgrades to the Storage Service

– Custom metrics support for horizontal pod autoscaling (HPA)

– Improvements in monitoring

– Migration from fission-core chart to fission-all chart

– S3 can now be used as a backend for the Storage Service


Conclusion

Serverless architecture reduces developers’ time to market because it lets them focus on application development rather than configuring, provisioning, and managing servers. Developers whose applications aren’t active around the clock benefit tremendously from serverless architecture because they only pay for what they use; when their application is inactive, they’re not charged until it becomes active again.

Serverless architecture isn’t restricted to function-based applications; you can also use it with containerized applications. With the help of tools like Knative built on top of Kubernetes, you can run serverless container workloads and enjoy the benefits of serverless on Kubernetes.

Once again, it’s essential to note that serverless architecture has its flaws, including the cold start latency on the first request to an application.

If your application doesn’t have a predictable workload and you don’t have the workforce to provision and manage computing resources, then serverless architecture is a great option for you.
