What makes a good Operator?

March 9, 2020 | Alex Handy

In 2016, CoreOS coined the term "Operator." They started a movement around a whole new type of managed application that achieves automated Day-2 operations with a user experience that feels native to Kubernetes.

Since then, the extension mechanisms that underpin the Operator pattern have evolved significantly. Custom Resource Definitions, an integral part of any Operator, became stable and gained validation and a versioning feature that includes conversion. The experience the Kubernetes community has gained from writing and running Operators has also reached critical mass. If you’ve attended any KubeCon in the past two years, you will have noticed the increased coverage and the countless sessions focusing on Operators.

The popularity that Operators enjoy is based on the possibility of achieving a cloud-like service experience for almost any workload, wherever your cluster runs. Thus, Operators strive to be the world's best provider of their workload as a service.

But what actually makes for a good Operator? Certainly the user experience is an important pillar, but it is mostly defined through the interaction between the cluster user running kubectl and the Custom Resources that are defined by the Operator.

This is possible because Operators are extensions of the Kubernetes control plane. As such, they are global entities that run on your cluster for a potentially very long time, often with wide privileges. This has some implications that require forethought.

For this kind of application, best practices have evolved to mitigate potential issues, security risks, or simply to make the Operator more maintainable in the future. The Operator Framework Community has published a collection of these practices: https://github.com/operator-framework/community-operators/blob/master/docs/best-practices.md

They cover recommendations concerning the design of an Operator as well as behavioral best practices that come into play at runtime. They reflect the culmination of experience from the Kubernetes community writing Operators for a broad range of use cases, in particular the observations the Operator Framework community made when developing tooling for writing and lifecycling Operators.

Some highlights include the following development practices:

One Operator per managed application
Multiple Operators should be used for complex, multi-tier application stacks
A CRD can only be owned by a single Operator; shared CRDs should be owned by a separate Operator
One controller per custom resource definition

As well as many others.
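
As a rough illustration of the "one controller per custom resource definition" guideline, the sketch below models a registry that refuses a second controller for the same kind. All names here (ControllerRegistry, the EtcdCluster kind, the reconcile function and its signature) are hypothetical and not part of any Operator SDK; a real Operator would rely on its framework's manager rather than code like this.

```python
from typing import Callable, Dict

# A reconcile function receives the custom resource as a plain dict
# and returns a short status string. This signature is an assumption
# made for this sketch, not an Operator SDK API.
Reconciler = Callable[[dict], str]


class ControllerRegistry:
    """Hypothetical registry enforcing one controller per CRD kind."""

    def __init__(self) -> None:
        self._controllers: Dict[str, Reconciler] = {}

    def register(self, kind: str, reconciler: Reconciler) -> None:
        # Refuse a second controller for a kind that already has one.
        if kind in self._controllers:
            raise ValueError(f"kind {kind!r} already has a controller")
        self._controllers[kind] = reconciler

    def reconcile(self, resource: dict) -> str:
        # Dispatch to the single controller that owns this kind.
        kind = resource["kind"]
        if kind not in self._controllers:
            raise KeyError(f"no controller registered for kind {kind!r}")
        return self._controllers[kind](resource)


def reconcile_cluster(resource: dict) -> str:
    # Illustrative only: report the desired size from the spec.
    return f"ensuring {resource['spec']['size']} members"


registry = ControllerRegistry()
registry.register("EtcdCluster", reconcile_cluster)
result = registry.reconcile({"kind": "EtcdCluster", "spec": {"size": 3}})
```

Registering a second reconciler for "EtcdCluster" raises a ValueError, which mirrors the idea that each custom resource has exactly one place where its reconciliation logic lives.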

With regard to best practices around runtime behavior, the following are noteworthy:

Do not self-register CRDs
Be capable of updating from a previous version of the Operator
Be capable of managing an Operand from an older Operator version
Use CRD conversion (webhooks) if you change API/CRDs

There are additional runtime practices in the document that are worth reading (please, don’t run as root).
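
To make the conversion recommendation more concrete, here is a minimal sketch of the kind of field mapping a CRD conversion webhook might perform when an API changes between versions. The group example.com, the versions v1alpha1 and v1, the EtcdCluster kind, and the fields size and replicas are assumptions invented for illustration; the webhook's HTTP handling and the request and response envelope are omitted.

```python
import copy

# Hypothetical API group and versions, invented for this sketch.
OLD_VERSION = "example.com/v1alpha1"
NEW_VERSION = "example.com/v1"


def convert_to_v1(resource: dict) -> dict:
    """Convert a hypothetical v1alpha1 object to v1.

    Assumed schema change: spec.size (v1alpha1) was renamed to
    spec.replicas (v1). Everything else is carried over unchanged.
    """
    api_version = resource.get("apiVersion")
    if api_version == NEW_VERSION:
        # Already at the target version: return an unchanged copy.
        return copy.deepcopy(resource)
    if api_version != OLD_VERSION:
        raise ValueError(f"unsupported apiVersion: {api_version!r}")

    # Work on a copy so the caller's object is never mutated.
    converted = copy.deepcopy(resource)
    converted["apiVersion"] = NEW_VERSION
    spec = converted.setdefault("spec", {})
    if "size" in spec:
        spec["replicas"] = spec.pop("size")
    return converted


old = {
    "apiVersion": OLD_VERSION,
    "kind": "EtcdCluster",
    "metadata": {"name": "example"},
    "spec": {"size": 3},
}
new = convert_to_v1(old)
```

The input is deep-copied so the original object is left untouched, and converting an object that is already at v1 returns it unchanged, which keeps the function safe to call repeatedly.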

This list, being a community effort, is of course open to contributions and suggestions. Maybe you are planning to write an Operator in the near future and are wondering how a certain problem would be best solved using this pattern? Or you recently wrote an Operator and want to share what you learned as your users started to adopt it? Let us know via GitHub issues or file a PR with your suggestions and improvements. Finally, if you want to publish your Operator or use an existing one, check out OperatorHub.io.

About the author

Alex Handy

Principal Product Marketing Manager

Red Hatter since 2018, technology historian and founder of The Museum of Art and Digital Entertainment. Two decades of journalism mixed with technology expertise, storytelling and oodles of computing experience from inception to ewaste recycling. I have taught or had my work used in classes at USF, SFSU, AAU, UC Law Hastings and Harvard Law.

I have worked with the EFF, Stanford, MIT, and Archive.org to brief the US Copyright Office and change US copyright law. We won multiple exemptions to the DMCA, accepted and implemented by the Librarian of Congress. My writings have appeared in Wired, Bloomberg, Make Magazine, SD Times, The Austin American Statesman, The Atlanta Journal Constitution and many other outlets.

I have been written about by the Wall Street Journal, The Washington Post, Wired and The Atlantic. I have been called "The Gertrude Stein of Video Games," an honor I accept, as I live less than a mile from her childhood home in Oakland, CA. I was project lead on the first successful institutional preservation and rebooting of the first massively multiplayer game, Habitat, for the C64, from 1986: https://neohabitat.org . I've consulted and collaborated with the NY MOMA, the Oakland Museum of California, Cisco, Semtech, Twilio, Game Developers Conference, NGNX, the Anti-Defamation League, the Library of Congress and the Oakland Public Library System on projects, contracts, and exhibitions.
