<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ankur Sinha</title>
    <description>The latest articles on Forem by Ankur Sinha (@ankrsinha).</description>
    <link>https://forem.com/ankrsinha</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F905661%2Fe0046082-6813-48f9-a95a-14e82e18b976.png</url>
      <title>Forem: Ankur Sinha</title>
      <link>https://forem.com/ankrsinha</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ankrsinha"/>
    <language>en</language>
    <item>
      <title>From client-go to controller-runtime: Rebuilding a Kubernetes Controller</title>
      <dc:creator>Ankur Sinha</dc:creator>
      <pubDate>Fri, 13 Mar 2026 04:37:59 +0000</pubDate>
      <link>https://forem.com/ankrsinha/from-client-go-to-controller-runtime-rebuilding-a-kubernetes-controller-5c20</link>
      <guid>https://forem.com/ankrsinha/from-client-go-to-controller-runtime-rebuilding-a-kubernetes-controller-5c20</guid>
      <description>&lt;p&gt;In my previous article, I built a Kubernetes controller from scratch using &lt;strong&gt;client-go, informers, and workqueues&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you haven't read it yet, you can check it here:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://dev.to/ankrsinha/from-crds-to-controllers-building-a-kubernetes-custom-controller-from-scratch-3ibk"&gt;https://dev.to/ankrsinha/from-crds-to-controllers-building-a-kubernetes-custom-controller-from-scratch-3ibk&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In that project, I built &lt;strong&gt;Mini Task Runner&lt;/strong&gt;, a simplified Tekton-like system where a &lt;code&gt;Task&lt;/code&gt; defines container steps and a &lt;code&gt;TaskRun&lt;/code&gt; triggers their execution. The controller watches &lt;code&gt;TaskRun&lt;/code&gt; resources and creates a Pod that runs those steps.&lt;/p&gt;

&lt;p&gt;While building the controller with raw client-go primitives helped me understand how Kubernetes controllers work internally, most real-world controllers are built on a higher-level framework called controller-runtime, typically scaffolded with tools such as Kubebuilder or the Operator SDK. Other systems like Tekton use similar abstractions built on top of client-go.&lt;/p&gt;

&lt;p&gt;In this post, I rebuild the same Mini Task Runner controller using &lt;strong&gt;controller-runtime&lt;/strong&gt; and explore how the architecture changes compared to the manual client-go implementation.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Migration Motivation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why move from client-go → controller-runtime
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;client-go&lt;/code&gt; library provides the fundamental building blocks required to interact with the Kubernetes API:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Informers&lt;/li&gt;
&lt;li&gt;Listers&lt;/li&gt;
&lt;li&gt;Workqueues&lt;/li&gt;
&lt;li&gt;Typed clients&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These primitives are powerful, but they are also &lt;strong&gt;low-level&lt;/strong&gt;. When writing controllers directly with client-go, developers must assemble all of these components manually.&lt;/p&gt;

&lt;p&gt;In my first controller implementation, I had to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create informer factories&lt;/li&gt;
&lt;li&gt;Attach event handlers&lt;/li&gt;
&lt;li&gt;Maintain a rate-limited workqueue&lt;/li&gt;
&lt;li&gt;Implement worker goroutines&lt;/li&gt;
&lt;li&gt;Handle retry logic&lt;/li&gt;
&lt;li&gt;Manage relationships between resources&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most of this code exists &lt;strong&gt;before the actual reconciliation logic even begins&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;controller-runtime was introduced to simplify this process by providing higher-level abstractions. Instead of wiring controller infrastructure manually, developers can focus primarily on &lt;strong&gt;reconciliation logic&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is why many production Kubernetes controllers use controller-runtime as their foundation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problems in Manual Controllers
&lt;/h3&gt;

&lt;p&gt;Writing the controller manually exposed several pain points.&lt;/p&gt;

&lt;p&gt;A large portion of the code was dedicated to &lt;strong&gt;controller infrastructure&lt;/strong&gt; rather than business logic. Informers had to be initialized, caches needed to be synced, event handlers registered, and worker routines had to continuously process items from a workqueue.&lt;/p&gt;

&lt;p&gt;Handling relationships between resources also required additional logic. For example, when a Pod changed state, the controller had to explicitly map that update back to the corresponding &lt;code&gt;TaskRun&lt;/code&gt;. This often required adding labels or writing custom mapping logic.&lt;/p&gt;

&lt;p&gt;Workqueue management was another responsibility. If reconciliation failed, the key had to be requeued using rate limiting to avoid overwhelming the API server.&lt;/p&gt;

&lt;p&gt;None of this logic was directly related to the &lt;strong&gt;actual goal of the controller&lt;/strong&gt;, which was simply:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Observe a TaskRun and ensure a Pod exists that executes the Task.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This made the controller more complex than necessary.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Architecture Changes
&lt;/h2&gt;

&lt;p&gt;Migrating to controller-runtime significantly simplified the architecture. The controller now revolves around four core ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manager&lt;/li&gt;
&lt;li&gt;Reconciler&lt;/li&gt;
&lt;li&gt;Cached Client&lt;/li&gt;
&lt;li&gt;Automatic Workqueue&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Manager
&lt;/h3&gt;

&lt;p&gt;In controller-runtime, everything begins with the &lt;strong&gt;controller manager&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The manager acts as the central runtime for controllers. It is responsible for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;starting controllers&lt;/li&gt;
&lt;li&gt;maintaining shared caches&lt;/li&gt;
&lt;li&gt;providing Kubernetes clients&lt;/li&gt;
&lt;li&gt;coordinating controller lifecycle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The controller registers itself with the manager, which ensures it runs continuously.&lt;/p&gt;

&lt;p&gt;A typical initialization looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;mgr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ctrl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewManager&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctrl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetConfigOrDie&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;ctrl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Options&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Scheme&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;scheme&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once started, the manager runs all registered controllers.&lt;/p&gt;

&lt;h4&gt;
  
  
  Controller Builder Pattern
&lt;/h4&gt;

&lt;p&gt;In controller-runtime, controllers are typically registered using the &lt;strong&gt;Controller Builder Pattern&lt;/strong&gt;. This pattern connects the controller with the manager and defines which resources should trigger reconciliation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;ctrl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewControllerManagedBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mgr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;For&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;miniv1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TaskRun&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Owns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;corev1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pod&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;TaskRunReconciler&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, &lt;code&gt;TaskRun&lt;/code&gt; is the &lt;strong&gt;primary resource&lt;/strong&gt; being watched by the controller. The &lt;code&gt;Owns(&amp;amp;corev1.Pod{})&lt;/code&gt; declaration tells controller-runtime to also watch Pods created by the controller. Whenever a Pod owned by a &lt;code&gt;TaskRun&lt;/code&gt; changes state, the corresponding &lt;code&gt;TaskRun&lt;/code&gt; is automatically enqueued for reconciliation.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;Complete()&lt;/code&gt; call registers the &lt;code&gt;TaskRunReconciler&lt;/code&gt;, which contains the reconciliation logic executed for each event.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reconciler
&lt;/h3&gt;

&lt;p&gt;Instead of manually implementing worker loops, controller-runtime uses the &lt;strong&gt;Reconciler pattern&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The framework automatically calls a &lt;code&gt;Reconcile()&lt;/code&gt; function whenever a relevant resource event occurs.&lt;/p&gt;

&lt;p&gt;The reconciler receives a request containing the resource's namespace and name. Its responsibility is straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fetch the current state of the resource&lt;/li&gt;
&lt;li&gt;Compare it with the desired state&lt;/li&gt;
&lt;li&gt;Apply changes to move the system toward that desired state&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In the Mini Task Runner controller, reconciliation follows a simple state machine based on the &lt;code&gt;TaskRun&lt;/code&gt; phase:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the TaskRun is new → create a Pod&lt;/li&gt;
&lt;li&gt;If the Pod is running → update status&lt;/li&gt;
&lt;li&gt;If the Pod finishes → mark success or failure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The reconciler focuses purely on this logic.&lt;/p&gt;
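<p></p>

&lt;p&gt;The state machine above can be sketched as a plain Go function. Phase names mirror the article, but the helper itself is illustrative, not the project's actual code:&lt;/p&gt;

```go
package main

import "fmt"

// Simplified sketch of the TaskRun phase state machine the reconciler
// implements. PodState and nextPhase are illustrative stand-ins, not
// the real API types.
type PodState string

const (
	PodPending   PodState = "Pending"
	PodRunning   PodState = "Running"
	PodSucceeded PodState = "Succeeded"
	PodFailed    PodState = "Failed"
)

// nextPhase maps the observed Pod state to the TaskRun phase.
func nextPhase(current string, pod PodState) string {
	switch {
	case current == "Succeeded" || current == "Failed":
		return current // terminal: nothing left to do
	case pod == PodRunning:
		return "Running"
	case pod == PodSucceeded:
		return "Succeeded"
	case pod == PodFailed:
		return "Failed"
	default:
		return "Pending" // new TaskRun: a Pod still needs to be created
	}
}

func main() {
	fmt.Println(nextPhase("", PodPending))
	fmt.Println(nextPhase("Pending", PodRunning))
	fmt.Println(nextPhase("Running", PodSucceeded))
}
```

&lt;p&gt;Terminal phases short-circuit first, which is what makes repeated reconciles of a finished TaskRun cheap no-ops.&lt;/p&gt;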

&lt;h3&gt;
  
  
  Cached Client
&lt;/h3&gt;

&lt;p&gt;controller-runtime provides a &lt;strong&gt;cached Kubernetes client&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Reads from the cluster are served from a local cache maintained by shared informers rather than directly hitting the API server.&lt;/p&gt;

&lt;p&gt;This provides two advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduced API server load&lt;/li&gt;
&lt;li&gt;Faster read operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Fetching a resource therefore looks very simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;tr&lt;/span&gt; &lt;span class="n"&gt;miniv1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TaskRun&lt;/span&gt;
&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NamespacedName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;tr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The framework manages cache synchronization internally.&lt;/p&gt;
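<p></p>

&lt;p&gt;One pattern worth noting around this call: the object may have been deleted between the triggering event and the reconcile, so a not-found result is usually tolerated rather than treated as an error. A simplified sketch, with a map standing in for the cached client and a sentinel error standing in for the real &lt;code&gt;apierrors.IsNotFound&lt;/code&gt; check:&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the API machinery's NotFound error.
var errNotFound = errors.New("not found")

// cache is a toy stand-in for the informer-backed cache the client reads from.
var cache = map[string]string{"default/taskrun-1": "Pending"}

func get(key string) (string, error) {
	v, ok := cache[key]
	if !ok {
		return "", errNotFound
	}
	return v, nil
}

func reconcile(key string) error {
	phase, err := get(key)
	if errors.Is(err, errNotFound) {
		// Object deleted between the event and this reconcile: nothing to do.
		return nil
	}
	if err != nil {
		return err // transient error: returning it requeues the key
	}
	fmt.Println(key, "phase:", phase)
	return nil
}

func main() {
	_ = reconcile("default/taskrun-1")
	_ = reconcile("default/gone") // tolerated: NotFound is not an error here
}
```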

&lt;h3&gt;
  
  
  Automatic Workqueue
&lt;/h3&gt;

&lt;p&gt;Another major difference from the client-go implementation is that the &lt;strong&gt;workqueue is no longer explicitly managed in the code&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;controller-runtime automatically creates and manages the workqueue.&lt;/p&gt;

&lt;p&gt;Events such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;TaskRun creation&lt;/li&gt;
&lt;li&gt;TaskRun updates&lt;/li&gt;
&lt;li&gt;Pod updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;automatically enqueue reconciliation requests.&lt;/p&gt;

&lt;p&gt;The framework also handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;retry behavior&lt;/li&gt;
&lt;li&gt;rate limiting&lt;/li&gt;
&lt;li&gt;worker execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This removes a significant amount of boilerplate code.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Benefits Observed
&lt;/h2&gt;

&lt;p&gt;After migrating the controller, several improvements became clear.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reduced Boilerplate
&lt;/h3&gt;

&lt;p&gt;Most of the setup code required for informers, queues, and workers disappeared.&lt;/p&gt;

&lt;p&gt;In the previous controller, a large portion of the code was dedicated to wiring infrastructure components. With controller-runtime, the controller setup became much shorter and easier to read.&lt;/p&gt;

&lt;h3&gt;
  
  
  Built-in Retries
&lt;/h3&gt;

&lt;p&gt;controller-runtime automatically retries reconciliation when errors occur.&lt;/p&gt;

&lt;p&gt;If the &lt;code&gt;Reconcile&lt;/code&gt; function returns an error, the request is requeued with exponential backoff.&lt;/p&gt;

&lt;p&gt;This ensures the controller remains resilient without explicitly writing retry logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cleaner Design
&lt;/h3&gt;

&lt;p&gt;The controller structure becomes easier to reason about.&lt;/p&gt;

&lt;p&gt;Instead of thinking about informers, worker threads, and workqueues, the developer focuses on &lt;strong&gt;reconciliation — comparing desired state with actual state and applying the necessary changes.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Easier Ownership Handling
&lt;/h3&gt;

&lt;p&gt;controller-runtime provides utilities for managing ownership relationships between resources.&lt;/p&gt;

&lt;p&gt;For example, when creating a Pod for a TaskRun, the Pod can be set as a child resource using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;ctrl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetControllerReference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pod&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Scheme&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This automatically enables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;garbage collection of Pods when the TaskRun is deleted&lt;/li&gt;
&lt;li&gt;reconciliation when the Pod status changes&lt;/li&gt;
&lt;/ul&gt;
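<p></p>

&lt;p&gt;Conceptually, &lt;code&gt;SetControllerReference&lt;/code&gt; records an owner reference in the Pod's metadata, and cascading deletion follows those references. A toy sketch of that mechanism, using plain structs rather than the real API types:&lt;/p&gt;

```go
package main

import "fmt"

// ownerRef mimics the ownerReference entry SetControllerReference
// records in the child object's metadata.
type ownerRef struct{ kind, name string }

type object struct {
	name   string
	owners []ownerRef
}

func setControllerReference(owner, child *object, kind string) {
	child.owners = append(child.owners, ownerRef{kind: kind, name: owner.name})
}

// collectGarbage drops objects whose owner no longer exists,
// mimicking Kubernetes cascading deletion.
func collectGarbage(objs []*object, existing map[string]bool) []*object {
	var kept []*object
	for _, o := range objs {
		orphaned := false
		for _, ref := range o.owners {
			if !existing[ref.name] {
				orphaned = true
			}
		}
		if !orphaned {
			kept = append(kept, o)
		}
	}
	return kept
}

func main() {
	tr := &object{name: "taskrun-1"}
	pod := &object{name: "taskrun-1-pod"}
	setControllerReference(tr, pod, "TaskRun")
	// Delete the TaskRun: its Pod is garbage-collected.
	left := collectGarbage([]*object{pod}, map[string]bool{})
	fmt.Println("pods left:", len(left))
}
```

&lt;p&gt;The same owner reference is what &lt;code&gt;Owns()&lt;/code&gt; uses to map a Pod event back to the TaskRun that should be reconciled.&lt;/p&gt;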

&lt;h3&gt;
  
  
  Maintainability
&lt;/h3&gt;

&lt;p&gt;With less infrastructure code and clearer separation of responsibilities, the controller becomes easier to extend and maintain.&lt;/p&gt;

&lt;p&gt;Future improvements can be implemented by modifying the reconciliation logic rather than changing controller wiring.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Challenges Faced
&lt;/h2&gt;

&lt;p&gt;Although controller-runtime simplifies controller development, there were still a few areas that required careful attention.&lt;/p&gt;

&lt;h3&gt;
  
  
  Status Updates
&lt;/h3&gt;

&lt;p&gt;Updating the &lt;code&gt;status&lt;/code&gt; field of a custom resource must be done using the &lt;strong&gt;status subresource API&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of using the normal client update, the controller must call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This distinction is important because Kubernetes treats &lt;code&gt;spec&lt;/code&gt; and &lt;code&gt;status&lt;/code&gt; updates differently.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cache Behavior
&lt;/h3&gt;

&lt;p&gt;Because the client reads from a local cache, the result of a write operation might not immediately appear in subsequent reads.&lt;/p&gt;

&lt;p&gt;For example, right after creating a Pod, the cache might not yet contain that object.&lt;/p&gt;

&lt;p&gt;To handle this safely, reconciliation logic must be &lt;strong&gt;idempotent&lt;/strong&gt;, meaning the controller should behave correctly even if the same operation is attempted multiple times.&lt;/p&gt;
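<p></p>

&lt;p&gt;A minimal sketch of an idempotent create, with a map standing in for the API server and a sentinel error standing in for the real &lt;code&gt;apierrors.IsAlreadyExists&lt;/code&gt; check:&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists stands in for the API server's AlreadyExists error.
var errAlreadyExists = errors.New("already exists")

// store is a toy API server; the local cache may lag behind it.
var store = map[string]bool{}

func createPod(name string) error {
	if store[name] {
		return errAlreadyExists
	}
	store[name] = true
	return nil
}

// ensurePod is idempotent: calling it twice, for example because the
// cache did not yet show the Pod created by the previous reconcile,
// is harmless.
func ensurePod(name string) error {
	err := createPod(name)
	if errors.Is(err, errAlreadyExists) {
		return nil // already created, possibly by ourselves a moment ago
	}
	return err
}

func main() {
	fmt.Println(ensurePod("taskrun-1-pod"))
	fmt.Println(ensurePod("taskrun-1-pod")) // second call is a no-op
}
```

&lt;p&gt;Treating AlreadyExists as success is what makes a stale cache read safe: the worst case is a redundant create attempt, not a duplicate Pod.&lt;/p&gt;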

&lt;h3&gt;
  
  
  Requeue Logic
&lt;/h3&gt;

&lt;p&gt;Some situations require the controller to revisit an object after a short delay.&lt;/p&gt;

&lt;p&gt;In the Mini Task Runner, while a Pod is running, the controller periodically rechecks its status.&lt;/p&gt;

&lt;p&gt;controller-runtime supports this using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ctrl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;RequeueAfter&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Choosing when to rely on events versus explicit requeueing required some experimentation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Debugging Reconciliation
&lt;/h3&gt;

&lt;p&gt;Because reconciliation is event-driven and asynchronous, debugging sometimes requires adding detailed logs to understand when and why reconciliation is triggered.&lt;/p&gt;

&lt;p&gt;Structured logging helped trace the lifecycle of a &lt;code&gt;TaskRun&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Final Outcome
&lt;/h2&gt;

&lt;p&gt;After completing the migration, the controller showed several improvements compared to the original implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Performance
&lt;/h3&gt;

&lt;p&gt;Using cached clients reduces the number of direct API server calls. This improves scalability and ensures the controller behaves efficiently as the number of resources grows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Simplicity
&lt;/h3&gt;

&lt;p&gt;The controller code became significantly shorter and easier to understand.&lt;/p&gt;

&lt;p&gt;Most of the complexity related to informers and workqueues is now handled by controller-runtime, allowing the code to focus primarily on &lt;strong&gt;business logic&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Production Readiness
&lt;/h3&gt;

&lt;p&gt;controller-runtime follows patterns widely used in modern Kubernetes controllers.&lt;/p&gt;

&lt;p&gt;After rebuilding the Mini Task Runner controller using this framework, the system aligns more closely with how real-world Kubernetes operators are implemented.&lt;/p&gt;

&lt;p&gt;The controller was also containerized and pushed to GitHub Container Registry (GHCR). The image was then deployed inside the cluster using a standard Kubernetes Deployment.&lt;/p&gt;

&lt;p&gt;When running inside the cluster, the controller uses Kubernetes &lt;strong&gt;in-cluster configuration&lt;/strong&gt; instead of a local kubeconfig file. This allows the controller to communicate with the API server using the service account mounted inside the Pod.&lt;/p&gt;

&lt;p&gt;Because the controller needs to create Pods and update TaskRun resources, appropriate &lt;strong&gt;RBAC permissions&lt;/strong&gt; were defined using a ServiceAccount, ClusterRole, and ClusterRoleBinding. This ensures the controller has only the permissions required to reconcile resources inside the cluster.&lt;/p&gt;




&lt;h2&gt;
  
  
  Want to Try the controller-runtime Version?
&lt;/h2&gt;

&lt;p&gt;This article focuses on the architectural changes introduced by &lt;strong&gt;controller-runtime&lt;/strong&gt;. If you want to explore the implementation or run it yourself, you can find the code in the same repository.&lt;/p&gt;

&lt;p&gt;The controller-runtime version of &lt;strong&gt;Mini Task Runner&lt;/strong&gt; is available in the &lt;strong&gt;&lt;code&gt;fork2&lt;/code&gt; branch&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/ankrsinha/mini-task/tree/fork2" rel="noopener noreferrer"&gt;https://github.com/ankrsinha/mini-task/tree/fork2&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The repository contains instructions for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;building the controller&lt;/li&gt;
&lt;li&gt;containerizing it with Docker&lt;/li&gt;
&lt;li&gt;pushing the image to &lt;strong&gt;GitHub Container Registry (GHCR)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;deploying the controller in a Kubernetes cluster&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;Building the first controller with raw client-go helped me understand the mechanics behind Kubernetes controllers: informers, caches, workqueues, and worker loops.&lt;/p&gt;

&lt;p&gt;Migrating the same controller to controller-runtime showed how those mechanics can be abstracted into a cleaner and more maintainable framework.&lt;/p&gt;

&lt;p&gt;Both approaches are valuable learning experiences. But if the goal is to build controllers that resemble those used in production Kubernetes projects, controller-runtime provides a much more practical starting point.&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Authors&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ankur Sinha&lt;/li&gt;
&lt;li&gt;Aditya Shinde&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

</description>
      <category>kubernetes</category>
      <category>go</category>
      <category>beginners</category>
      <category>opensource</category>
    </item>
    <item>
      <title>From CRDs to Controllers: Building a Kubernetes Custom Controller from Scratch</title>
      <dc:creator>Ankur Sinha</dc:creator>
      <pubDate>Tue, 24 Feb 2026 05:47:10 +0000</pubDate>
      <link>https://forem.com/ankrsinha/from-crds-to-controllers-building-a-kubernetes-custom-controller-from-scratch-3ibk</link>
      <guid>https://forem.com/ankrsinha/from-crds-to-controllers-building-a-kubernetes-custom-controller-from-scratch-3ibk</guid>
      <description>&lt;p&gt;If you’ve worked with Kubernetes, you’ve probably seen how it can orchestrate complex workflows using custom resources.&lt;/p&gt;

&lt;p&gt;But how does this actually work behind the scenes?&lt;/p&gt;

&lt;p&gt;To understand this better, I built &lt;strong&gt;Mini Task Runner&lt;/strong&gt;, a simplified Tekton-like task execution system from scratch. This project explores how Kubernetes can be extended using &lt;strong&gt;Custom Resource Definitions (CRDs)&lt;/strong&gt; and &lt;strong&gt;custom controllers&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this post, we will look at how custom resources interact with the Kubernetes API server and &lt;code&gt;etcd&lt;/code&gt;, why code generation is required when building controllers in Go, and how an event-driven controller architecture works.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Kubernetes Extensibility with CRDs
&lt;/h2&gt;

&lt;p&gt;Out of the box, Kubernetes understands a fixed set of standard resources such as &lt;strong&gt;Pods&lt;/strong&gt;, &lt;strong&gt;Deployments&lt;/strong&gt;, &lt;strong&gt;Services&lt;/strong&gt;, and &lt;strong&gt;ReplicaSets&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When you submit a Deployment to the API server, the configuration is stored in &lt;strong&gt;etcd&lt;/strong&gt;, which acts as the cluster’s key-value datastore. Built-in controllers monitor these resources and continuously reconcile the desired state with the actual state of the cluster.&lt;/p&gt;

&lt;p&gt;But what if you want Kubernetes to understand concepts like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a CI/CD pipeline&lt;/li&gt;
&lt;li&gt;a database cluster&lt;/li&gt;
&lt;li&gt;a workflow execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where &lt;strong&gt;Custom Resource Definitions (CRDs)&lt;/strong&gt; come in.&lt;/p&gt;

&lt;p&gt;CRDs allow developers to define their own API objects and extend the Kubernetes API.&lt;/p&gt;

&lt;p&gt;For the &lt;strong&gt;Mini Task Runner&lt;/strong&gt;, I defined two custom resources:&lt;/p&gt;

&lt;h3&gt;
  
  
  Task
&lt;/h3&gt;

&lt;p&gt;A reusable template describing the steps required to execute a task.&lt;/p&gt;

&lt;p&gt;Each step specifies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a container image&lt;/li&gt;
&lt;li&gt;a script to run&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  TaskRun
&lt;/h3&gt;

&lt;p&gt;An execution instance that triggers a Task.&lt;/p&gt;

&lt;p&gt;A TaskRun references a Task and stores execution status such as the current phase, start time, and completion time.&lt;/p&gt;

&lt;p&gt;However, defining CRDs alone is not enough. Kubernetes will store these resources in &lt;code&gt;etcd&lt;/code&gt;, but nothing will act on them.&lt;/p&gt;

&lt;p&gt;To make these resources functional, we need a &lt;strong&gt;Custom Controller&lt;/strong&gt;. The controller watches the API server for changes to these resources and performs actions to reconcile the desired state with the current state.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Code Generation for Custom Resources
&lt;/h2&gt;

&lt;p&gt;Before writing the controller logic, our Go program needs to understand the custom API types (&lt;code&gt;Task&lt;/code&gt; and &lt;code&gt;TaskRun&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Unlike built-in Kubernetes objects, these types do not exist in standard Kubernetes client libraries. As a result, we must generate supporting code.&lt;/p&gt;

&lt;p&gt;Every Kubernetes API object must implement the &lt;code&gt;runtime.Object&lt;/code&gt; interface, which includes a &lt;code&gt;DeepCopyObject&lt;/code&gt; method for safely creating deep copies of objects in memory. Writing these functions manually would be error-prone and repetitive.&lt;/p&gt;

&lt;p&gt;Kubernetes provides &lt;strong&gt;code-generation tools&lt;/strong&gt; that automatically generate the required plumbing.&lt;/p&gt;

&lt;p&gt;After defining the Go structs for our custom resources and adding the appropriate annotations, the code generator produces:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;DeepCopy methods&lt;/strong&gt; – required for Kubernetes object handling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Typed clientsets&lt;/strong&gt; – strongly typed clients for interacting with the API server&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Informers and listers&lt;/strong&gt; – used to build event-driven controllers&lt;/li&gt;
&lt;/ol&gt;
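<p></p>

&lt;p&gt;For a feel of what the generator consumes and produces, here is an illustrative type with a hand-written &lt;code&gt;DeepCopy&lt;/code&gt; of the kind &lt;code&gt;deepcopy-gen&lt;/code&gt; would emit. Field names here are assumptions for illustration, not the project's actual types:&lt;/p&gt;

```go
package main

import "fmt"

// In the real project these types live under the API package and carry
// generator markers (e.g. +k8s:deepcopy-gen) above each type.

type Step struct {
	Image  string
	Script string
}

type TaskSpec struct {
	Steps []Step
}

// DeepCopy is the kind of method deepcopy-gen emits: a recursive copy
// that shares no slices or maps with the original, written here by
// hand for illustration.
func (in *TaskSpec) DeepCopy() *TaskSpec {
	out := &TaskSpec{Steps: make([]Step, len(in.Steps))}
	copy(out.Steps, in.Steps)
	return out
}

func main() {
	a := TaskSpec{Steps: []Step{{Image: "alpine", Script: "echo hi"}}}
	b := a.DeepCopy()
	b.Steps[0].Script = "echo changed"
	fmt.Println(a.Steps[0].Script) // the original is untouched
}
```

&lt;p&gt;Deep copies matter because informer caches hand out shared objects: mutating one in place without copying would corrupt the cache for every other reader.&lt;/p&gt;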

&lt;p&gt;After code generation, the controller has two clients available:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;core Kubernetes client&lt;/strong&gt; to create and manage resources such as Pods&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;custom client&lt;/strong&gt; to interact with the Task and TaskRun resources&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This generated code provides the foundation needed to implement the controller.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Controller Architecture
&lt;/h2&gt;

&lt;p&gt;A controller is responsible for observing changes in the cluster and taking actions to move the system toward the desired state.&lt;/p&gt;

&lt;p&gt;There are two main approaches for implementing a controller.&lt;/p&gt;




&lt;h3&gt;
  
  
  Polling-Based Controller
&lt;/h3&gt;

&lt;p&gt;A naive approach would be to continuously poll the Kubernetes API server.&lt;/p&gt;

&lt;p&gt;For example, a loop might periodically ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;list all TaskRuns&lt;/li&gt;
&lt;li&gt;list all Pods&lt;/li&gt;
&lt;li&gt;compare the state&lt;/li&gt;
&lt;li&gt;take action if needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm8i5j6hje05hfnu93fdr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm8i5j6hje05hfnu93fdr.png" alt="Polling.png" width="776" height="511"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Polling is inefficient because it repeatedly queries the API server, increasing load and introducing unnecessary latency.&lt;/p&gt;

&lt;p&gt;Kubernetes controllers instead rely on an &lt;strong&gt;event-driven architecture&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  Event-Driven Controller Architecture
&lt;/h3&gt;

&lt;p&gt;Production-grade Kubernetes controllers rely on four key components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Informers&lt;/li&gt;
&lt;li&gt;Listers&lt;/li&gt;
&lt;li&gt;Workqueues&lt;/li&gt;
&lt;li&gt;Workers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp2d03fx4crq5ro3qm1bq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp2d03fx4crq5ro3qm1bq.png" alt="Workflow" width="800" height="319"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Informers
&lt;/h4&gt;

&lt;p&gt;Informers maintain a long-lived &lt;strong&gt;watch connection&lt;/strong&gt; with the Kubernetes API server. Whenever a resource is created, updated, or deleted, the API server sends an event to the informer.&lt;/p&gt;

&lt;p&gt;This avoids repeated polling and allows the controller to react immediately to changes.&lt;/p&gt;

&lt;h4&gt;
  
  
  Listers (Local Cache)
&lt;/h4&gt;

&lt;p&gt;Informers maintain a local cache of resources. Instead of querying the API server for every read operation, the controller reads data from this in-memory cache.&lt;/p&gt;

&lt;p&gt;This significantly reduces network calls and improves performance.&lt;/p&gt;

&lt;h4&gt;
  
  
  Workqueue
&lt;/h4&gt;

&lt;p&gt;When an event occurs, the informer pushes the resource key into a &lt;strong&gt;rate-limited workqueue&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The queue acts as a buffer that ensures resources are processed safely and avoids overwhelming the controller during bursts of updates.&lt;/p&gt;
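<p></p>

&lt;p&gt;A toy workqueue can illustrate its most important property: duplicate keys are coalesced while queued, so a burst of events for the same TaskRun triggers a single reconcile. The real rate-limited workqueue does considerably more, such as per-item backoff, but the coalescing idea looks like this:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sync"
)

// workqueue is a toy queue that coalesces duplicate keys: adding a key
// that is already queued is a no-op.
type workqueue struct {
	mu    sync.Mutex
	set   map[string]bool
	items []string
}

func newWorkqueue() *workqueue { return &workqueue{set: map[string]bool{}} }

func (q *workqueue) Add(key string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.set[key] {
		return // already queued: coalesce
	}
	q.set[key] = true
	q.items = append(q.items, key)
}

func (q *workqueue) Get() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) == 0 {
		return "", false
	}
	key := q.items[0]
	q.items = q.items[1:]
	delete(q.set, key)
	return key, true
}

func main() {
	q := newWorkqueue()
	q.Add("default/taskrun-1")
	q.Add("default/taskrun-1") // burst of events for the same object
	q.Add("default/taskrun-2")
	for key, ok := q.Get(); ok; key, ok = q.Get() {
		fmt.Println("reconcile", key)
	}
}
```

&lt;p&gt;The real queue additionally tracks in-flight items, so a key re-added while it is being processed is queued again once processing finishes.&lt;/p&gt;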

&lt;h4&gt;
  
  
  Workers
&lt;/h4&gt;

&lt;p&gt;Workers are background goroutines that continuously read items from the workqueue and process them using the controller’s reconciliation logic.&lt;/p&gt;

&lt;p&gt;Multiple workers can run in parallel, allowing the controller to handle many events concurrently.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. The Reconciliation Loop
&lt;/h2&gt;

&lt;p&gt;The core logic of the controller is implemented in the &lt;strong&gt;reconciliation loop&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A worker retrieves a resource key from the workqueue and performs the following steps:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Fetch the Current State
&lt;/h3&gt;

&lt;p&gt;The controller retrieves the corresponding TaskRun from the local cache.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Determine the Current Phase
&lt;/h3&gt;

&lt;p&gt;The controller checks the TaskRun status.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the phase is &lt;strong&gt;Succeeded&lt;/strong&gt; or &lt;strong&gt;Failed&lt;/strong&gt;, no further action is required.&lt;/li&gt;
&lt;li&gt;If the phase is &lt;strong&gt;empty&lt;/strong&gt;, it indicates a new execution request.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Start Execution
&lt;/h3&gt;

&lt;p&gt;For a new TaskRun:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The referenced Task is retrieved.&lt;/li&gt;
&lt;li&gt;Task steps are converted into container definitions.&lt;/li&gt;
&lt;li&gt;A Pod is created with &lt;code&gt;restartPolicy: Never&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The TaskRun status is updated to &lt;strong&gt;Pending&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Track Execution
&lt;/h3&gt;

&lt;p&gt;If the TaskRun is in the &lt;strong&gt;Pending&lt;/strong&gt; or &lt;strong&gt;Running&lt;/strong&gt; phase, the controller inspects the associated Pod.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the Pod is running → update phase to &lt;strong&gt;Running&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;If the Pod succeeds → update phase to &lt;strong&gt;Succeeded&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;If the Pod fails → update phase to &lt;strong&gt;Failed&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. Update Status
&lt;/h3&gt;

&lt;p&gt;The TaskRun status is updated through the Kubernetes API server, ensuring the cluster state remains consistent.&lt;/p&gt;

&lt;p&gt;If reconciliation fails due to transient issues, the resource is automatically requeued with exponential backoff.&lt;/p&gt;

&lt;p&gt;This ensures reliability without overwhelming the API server.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Adding a kubectl Plugin
&lt;/h2&gt;

&lt;p&gt;Creating TaskRun resources manually using YAML can be inconvenient. Tools like Tekton provide CLI utilities to simplify this process.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;kubectl&lt;/code&gt; supports plugins, which make it easy to add custom commands.&lt;/p&gt;

&lt;p&gt;If an executable named &lt;code&gt;kubectl-&amp;lt;command&amp;gt;&lt;/code&gt; is placed in the system PATH, &lt;code&gt;kubectl&lt;/code&gt; treats it as a native command.&lt;/p&gt;

&lt;p&gt;To improve usability, I implemented a small CLI tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl task start &amp;lt;task-name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command creates a TaskRun resource with a unique name and submits it to the cluster.&lt;/p&gt;

&lt;p&gt;Once the TaskRun is created, the controller detects it through the informer, enqueues the event, and begins executing the corresponding Task.&lt;/p&gt;




&lt;h2&gt;
  
  
  Want to Try It Yourself?
&lt;/h2&gt;

&lt;p&gt;This article focuses on the architecture and design of the controller.&lt;/p&gt;

&lt;p&gt;If you want to run the project yourself, the implementation described in this article is available in the &lt;strong&gt;&lt;code&gt;fork1&lt;/code&gt; branch&lt;/strong&gt; of the repository:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/ankrsinha/mini-task/tree/fork1" rel="noopener noreferrer"&gt;https://github.com/ankrsinha/mini-task/tree/fork1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The repository includes instructions for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;installing the CRDs&lt;/li&gt;
&lt;li&gt;running the code generator&lt;/li&gt;
&lt;li&gt;starting the controller locally&lt;/li&gt;
&lt;li&gt;using the &lt;code&gt;kubectl task start&lt;/code&gt; plugin&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Building &lt;strong&gt;Mini Task Runner&lt;/strong&gt; provided a deeper understanding of how Kubernetes controllers operate internally.&lt;/p&gt;

&lt;p&gt;Some key takeaways from this project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CRDs define the data model&lt;/strong&gt;, but controllers provide the logic that makes them useful.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code generation simplifies controller development&lt;/strong&gt; by generating clients, informers, and deep-copy methods.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event-driven architectures using informers and workqueues&lt;/strong&gt; allow controllers to scale efficiently without overloading the API server.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For anyone interested in Kubernetes internals, implementing a custom controller is one of the best ways to understand how the control plane operates.&lt;/p&gt;




&lt;h3&gt;
  
  
  Next Step
&lt;/h3&gt;

&lt;p&gt;In the next article, I rebuild the same controller using &lt;strong&gt;controller-runtime&lt;/strong&gt;, a framework widely used for implementing Kubernetes controllers.&lt;/p&gt;

&lt;p&gt;→ &lt;strong&gt;From client-go to controller-runtime: Rebuilding a Kubernetes Controller&lt;/strong&gt; &lt;a href="https://dev.to/ankrsinha/from-client-go-to-controller-runtime-rebuilding-a-kubernetes-controller-5c20"&gt;https://dev.to/ankrsinha/from-client-go-to-controller-runtime-rebuilding-a-kubernetes-controller-5c20&lt;/a&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Authors&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ankur Sinha&lt;/li&gt;
&lt;li&gt;Aditya Shinde&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

</description>
      <category>kubernetes</category>
      <category>go</category>
      <category>devops</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
