You're deploying Kafka to Kubernetes. You write a Deployment for the brokers, a StatefulSet for ZooKeeper, ConfigMaps for configuration, Services to expose everything, and a handful of PersistentVolumeClaims for storage. The cluster comes up. Then a broker dies and Kubernetes restarts it, but it comes back with the wrong broker ID and can't rejoin the cluster. Kubernetes did what it was told: restart the container. But it had no idea what a Kafka broker actually is, or what a healthy restart looks like for stateful distributed systems.
This is the gap that Kubernetes Operators were built to fill. They encode the operational knowledge that a human expert would apply, the kind that can't be captured in a Deployment manifest, and turn it into automation that runs inside the cluster.
The problem with generic primitives
Kubernetes ships with a set of built-in resource types: Pod, Deployment, StatefulSet, Service, ConfigMap, and so on. These cover a lot of ground for stateless workloads. You describe the desired state, Kubernetes makes it real, and if something drifts, it corrects it. The reconciliation loop is the core idea, and it works beautifully for applications that don't care which pod they run on or what happened to the previous one.
Stateful, distributed systems are different. A Kafka cluster, a PostgreSQL replica set, an Elasticsearch cluster, a Cassandra ring: each has its own operational playbook. Scaling Cassandra isn't just adding pods; it's adding nodes to the ring and triggering a rebalance. Upgrading Elasticsearch isn't a rolling restart of Deployments; it's a shard-aware sequence that avoids data loss. Backing up a database isn't a cron job that runs pg_dump; it's a coordinated snapshot at a consistent point in time.
None of this operational knowledge fits into a StatefulSet. A human operator (the person who manages the system) carries this knowledge in their head and applies it when needed. An Operator, in the Kubernetes sense, encodes that same knowledge as code and runs it as a controller inside the cluster.
Two building blocks: CRDs and controllers
Operators are built from two standard Kubernetes features. Neither is Operator-specific; both exist independently. Together they give you everything you need.
The first is a Custom Resource Definition, or CRD. A CRD teaches the Kubernetes API server about a new kind of object. Once you install a CRD, you can create, update, and delete instances of that object through the same kubectl commands and REST API you use for any other Kubernetes resource. The API server stores them, validates them against a schema, and makes them available to controllers. A CRD for Kafka might let you define a KafkaCluster resource with fields like brokers, replicationFactor, and storageClass.
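The schema validation a CRD provides can be sketched in a few lines. This is a simplified stand-in, not how the API server is implemented: real CRDs declare an OpenAPI v3 schema, and the hypothetical KafkaCluster fields (brokers, replicationFactor, storageClass) are the ones from the example above.

```python
# Simplified sketch of CRD-style validation. The real API server
# enforces an OpenAPI v3 schema on every write; this mimics the idea
# with a plain field-to-type map.
SCHEMA = {
    "brokers": int,
    "replicationFactor": int,
    "storageClass": str,
}

def validate(resource: dict) -> list[str]:
    """Return a list of validation errors; empty means the resource is valid."""
    errors = []
    spec = resource.get("spec", {})
    for field, expected_type in SCHEMA.items():
        if field not in spec:
            errors.append(f"spec.{field}: required field missing")
        elif not isinstance(spec[field], expected_type):
            errors.append(f"spec.{field}: expected {expected_type.__name__}")
    return errors

kafka = {
    "apiVersion": "kafka.example.com/v1",  # hypothetical API group
    "kind": "KafkaCluster",
    "metadata": {"name": "prod"},
    "spec": {"brokers": 3, "replicationFactor": 3, "storageClass": "fast-ssd"},
}
print(validate(kafka))  # []
```

Once the CRD is installed, every create and update of a KafkaCluster goes through this kind of check before the object is stored.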
The second is a controller. A controller is a loop: it watches the state of some Kubernetes objects, compares that to the desired state, and takes action to close the gap. Kubernetes itself is full of controllers: the Deployment controller, the ReplicaSet controller, the StatefulSet controller. They all follow the same pattern. An Operator is simply a controller that watches a custom resource type instead of, or in addition to, the built-in types.
Put these together and you have an Operator. Someone defines a CRD such as KafkaCluster, letting users express what they want in domain terms. The Operator controller watches for instances of that resource and continuously reconciles the actual cluster state toward the desired state, using whatever logic is appropriate for Kafka specifically.
The reconciliation loop in detail
The word reconcile is used deliberately. The controller doesn't issue imperative commands ("start broker 3"). It observes what exists, compares it to what should exist, and figures out what actions to take. This is a fundamentally different model from scripting.
A typical reconciliation loop looks like this. Something triggers the reconciler: either a change to the custom resource (someone updated brokers: 3 to brokers: 5), or a change to a resource the Operator is watching (a pod went down), or a periodic re-sync. The reconciler reads the current desired state from the custom resource. It reads the current actual state from the cluster, checking what StatefulSets, Services, and ConfigMaps currently exist. It computes the difference and decides what to do.
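The compare-and-act step can be sketched as a pure function, under heavily simplified assumptions: "actual state" is just a set of broker names, and the actions are strings. A real Operator would read StatefulSets and pods through the API server and create or delete real resources.

```python
# Sketch of one reconciliation pass: compare desired state (from the
# custom resource) against actual state (what exists in the cluster)
# and compute the actions that close the gap.
def reconcile(desired_brokers: int, existing: set[str]) -> list[str]:
    actions = []
    wanted = {f"broker-{i}" for i in range(desired_brokers)}
    for name in sorted(wanted - existing):
        actions.append(f"create {name}")
    for name in sorted(existing - wanted):
        actions.append(f"delete {name}")
    return actions  # an empty list means nothing to do

# Someone scaled brokers: 3 -> 5 while broker-1 happened to be down:
print(reconcile(5, {"broker-0", "broker-2"}))
# ['create broker-1', 'create broker-3', 'create broker-4']
```

Note that the function never issues imperative commands from memory; it derives every action from the observed difference, so running it again after the gap is closed produces an empty action list.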
Critically, the reconciler must be idempotent. If it runs twice with nothing changed, the result should be identical. If it's interrupted halfway through, the next run should safely pick up and finish. This is not just good practice; it's a requirement, because the controller framework can and will call your reconciler multiple times for the same event, and will retry on failure.
The reconciler writes its assessment back to the custom resource as status conditions: machine-readable fields like Ready: true, BrokersReady: 3/3, or Phase: Upgrading. These status fields are how the Operator communicates what it's doing, and they're what kubectl shows you when you describe a custom resource.
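The status write-back can be sketched the same way. The field names here (Ready, BrokersReady, Phase) follow the examples in the text; a real Operator writes these through the Kubernetes status subresource rather than mutating a local dict.

```python
# Sketch of status write-back: after a reconciliation pass, the
# Operator records machine-readable conditions on the custom resource.
def update_status(resource: dict, ready_brokers: int) -> dict:
    desired = resource["spec"]["brokers"]
    all_ready = ready_brokers == desired
    resource["status"] = {
        "brokersReady": f"{ready_brokers}/{desired}",
        "phase": "Running" if all_ready else "Reconciling",
        "conditions": [{"type": "Ready", "status": str(all_ready)}],
    }
    return resource

kc = {"spec": {"brokers": 3}}
print(update_status(kc, 3)["status"])
# {'brokersReady': '3/3', 'phase': 'Running', 'conditions': [{'type': 'Ready', 'status': 'True'}]}
```

These are the fields that surface in `kubectl describe`, which is what makes a custom resource legible to humans and to other automation alike.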
What an Operator actually knows how to do
The power of an Operator is in what it encodes. Generic Kubernetes controllers handle generic cases. An Operator handles the specific operational knowledge for one system, and that knowledge can be arbitrarily sophisticated.
A production-grade database Operator, for example, might handle: provisioning a new cluster with the right number of nodes and correct replication topology; performing a rolling upgrade in a sequence that avoids quorum loss; detecting that a replica has fallen behind and triggering a re-sync rather than a restart; creating a consistent backup snapshot and storing it off-cluster; responding to a failover by promoting a replica and updating DNS. None of this is Kubernetes knowledge. All of it is database knowledge, translated into code that speaks the Kubernetes API.
The Operator maturity model, originally defined by Red Hat, describes five levels of increasing capability: basic install, seamless upgrades, full lifecycle management, deep insights and metrics, and finally auto-pilot, where the Operator can tune and self-heal without any human input. Most real-world Operators land somewhere in the middle of this scale.
Where Operators live and how they run
An Operator is just a pod running in the cluster. Usually it runs in its own namespace, something like kafka-operator-system, with a ServiceAccount that has RBAC permissions to read and write the resources it manages. It connects to the Kubernetes API server and uses the standard watch mechanism to receive events. Everything else is application code.
Most Operators are written in Go using the controller-runtime library, which handles the event queue, the informer cache, leader election, and retry logic. This lets the Operator author focus on the reconciliation logic itself rather than the plumbing. Operator SDK and Kubebuilder are the two most common scaffolding tools: they generate the project boilerplate and produce the CRD manifests from your Go type definitions. Operators can also be written in Python, Java, or any language with a Kubernetes client library, though Go remains the dominant choice.
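A toy version of that plumbing makes clear what the framework is doing for you: a work queue of resource keys, with failed reconciles requeued for retry. Real frameworks add rate limiting, deduplication, and exponential backoff on top of this idea.

```python
# Toy controller work queue: pop a resource key, run reconcile, and
# requeue on failure up to a retry limit. controller-runtime provides
# a production version of this loop (plus backoff and deduplication).
from collections import deque

def run_queue(queue: deque, reconcile, max_retries: int = 3) -> list[str]:
    log = []
    retries: dict[str, int] = {}
    while queue:
        key = queue.popleft()  # e.g. "default/my-kafka"
        try:
            reconcile(key)
            log.append(f"ok {key}")
        except Exception:
            retries[key] = retries.get(key, 0) + 1
            if retries[key] <= max_retries:
                queue.append(key)  # requeue; real code waits with backoff
                log.append(f"retry {key}")
            else:
                log.append(f"gave up {key}")
    return log

calls = {"n": 0}
def flaky(key):
    calls["n"] += 1
    if calls["n"] == 1:  # fail only the first attempt
        raise RuntimeError("API conflict")

print(run_queue(deque(["default/my-kafka"]), flaky))
# ['retry default/my-kafka', 'ok default/my-kafka']
```

This is also why idempotency is non-negotiable: the queue makes no promise that your reconciler runs exactly once per event.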
Leader election is worth highlighting. Most Operators run a single active controller instance, even if multiple replicas are deployed for availability. The replicas compete for a lock, implemented as a Lease object in Kubernetes, and only the leader processes events. If the leader dies, a replica acquires the lock and takes over within seconds. This prevents split-brain scenarios where two controllers are both making conflicting changes to the same resources.
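The lease mechanics can be sketched with a single object standing in for the Kubernetes Lease. This is a deliberately simplified model: times are plain integers, and the real Lease API uses fields like renewTime and leaseDurationSeconds with atomic updates through the API server.

```python
# Sketch of lease-based leader election. The holder must keep renewing
# before the lease expires; a rival can only take over once it lapses.
class Lease:
    def __init__(self, duration: int):
        self.holder = None
        self.expires = 0
        self.duration = duration

    def try_acquire(self, candidate: str, now: int) -> bool:
        """Acquire or renew the lease; returns True if candidate is leader."""
        if self.holder is None or self.holder == candidate or now >= self.expires:
            self.holder = candidate
            self.expires = now + self.duration
            return True
        return False  # someone else holds a live lease

lease = Lease(duration=15)
print(lease.try_acquire("operator-a", now=0))   # True  — a becomes leader
print(lease.try_acquire("operator-b", now=5))   # False — a's lease is still live
print(lease.try_acquire("operator-b", now=20))  # True  — a stopped renewing; b takes over
```

The failover window is bounded by the lease duration: if the leader dies silently, a standby waits at most that long before it can win the lock.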
Where you'll encounter Operators in practice
Operators have become the standard packaging format for any non-trivial stateful workload on Kubernetes. If you've used Strimzi to run Kafka, CloudNativePG for PostgreSQL, Prometheus Operator for monitoring, cert-manager for TLS certificates, or Argo CD for GitOps, you've been using Operators. Each of them installs a CRD (or several), deploys a controller, and lets you interact with complex systems through simple, declarative custom resources.
OpenShift makes particularly heavy use of Operators. The platform manages its own components (the API server, the ingress controller, the image registry, the monitoring stack) through dedicated cluster Operators. This is why OpenShift upgrades feel more reliable than manually applying manifests: the cluster Operators know the correct upgrade sequence for each component and enforce it. Operator Lifecycle Manager (OLM) is the layer that handles installing, updating, and managing the dependency graph between Operators. It ships with OpenShift out of the box and can be installed on plain Kubernetes clusters.
OperatorHub.io is the public catalog of community and certified Operators. Before writing your own, it's worth checking whether one already exists for your system. Production-grade Operators for the major databases, message brokers, and observability platforms are mature and well-maintained.
Summary
Kubernetes Operators extend the platform's reconciliation model to cover systems that built-in primitives can't handle. A CRD teaches the API server about a new kind of resource. A controller watches that resource and continuously reconciles actual state toward desired state, using domain-specific knowledge that generic Kubernetes knows nothing about. The two together are an Operator.
The key insight is that an Operator is not a different kind of automation: it's the same reconciliation pattern Kubernetes uses internally, applied to your specific system. The Deployment controller and the Kafka Operator are structurally the same thing. One ships with Kubernetes and handles generic containers; the other is written by someone who knows how Kafka works and handles Kafka specifically.
Once you understand this, a lot of things click into place: why complex stateful systems on Kubernetes are distributed as Operators rather than Helm charts alone, why OpenShift upgrades are managed by Operators, and why "just use a StatefulSet" is never the complete answer for a distributed database.
Part of the Explained series — concepts in tech, clearly.