
In 2016, CoreOS coined the term Operator. They started a movement around a whole new type of managed application that achieves automated Day-2 operations with a user experience that feels native to Kubernetes.

Since then, the extension mechanisms that underpin the Operator pattern have evolved significantly. Custom Resource Definitions, an integral part of any Operator, became stable and gained validation and a versioning feature that includes conversion. The experience the Kubernetes community has gained from writing and running Operators has also reached critical mass. If you’ve attended any KubeCon in the past 2 years, you will have noticed the increased coverage and countless sessions focusing on Operators.

The popularity that Operators enjoy is based on their ability to deliver a cloud-like service experience for almost any workload, wherever your cluster runs. Thus, Operators strive to be the world's best provider of their workload as a service.

But what actually makes for a good Operator? Certainly the user experience is an important pillar, but it is mostly defined through the interaction between the cluster user running kubectl and the Custom Resources that are defined by the Operator.

This is possible because Operators are extensions of the Kubernetes control plane. As such, they are global entities that run on your cluster for a potentially very long time, often with wide privileges. This has some implications that require forethought.

For this kind of application, best practices have evolved to mitigate potential issues and security risks, or simply to make the Operator more maintainable in the future. The Operator Framework Community has published a collection of these practices: https://github.com/operator-framework/community-operators/blob/master/docs/best-practices.md

They cover recommendations concerning the design of an Operator as well as behavioral best practices that come into play at runtime. They reflect a culmination of experience from the Kubernetes community writing Operators for a broad range of use cases, in particular the observations the Operator Framework community made when developing tooling for writing Operators and managing their lifecycle.

Some highlights include the following development practices:

  • One Operator per managed application
  • Multiple Operators should be used for complex, multi-tier application stacks
  • A CRD can only be owned by a single Operator; shared CRDs should be owned by a separate Operator
  • One controller per custom resource definition

As well as many others.
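
To make the "one controller per custom resource definition" practice more concrete, here is a minimal sketch in plain Python. It does not use any real Operator SDK; the `ControllerRegistry` class, the `Database` and `DatabaseBackup` kinds, and the reconcile functions are all hypothetical names chosen for illustration. The idea is simply that each kind maps to exactly one controller, and a second registration for the same kind is rejected.

```python
from typing import Callable, Dict

# Hypothetical type: a reconciler receives a custom resource (as a dict)
# and returns a short status string.
Reconciler = Callable[[dict], str]


class ControllerRegistry:
    """Illustrative registry that allows exactly one controller per kind."""

    def __init__(self) -> None:
        self._controllers: Dict[str, Reconciler] = {}

    def register(self, kind: str, reconciler: Reconciler) -> None:
        # Refuse a second controller for a kind that is already owned.
        if kind in self._controllers:
            raise ValueError(f"kind {kind!r} already has a controller")
        self._controllers[kind] = reconciler

    def dispatch(self, resource: dict) -> str:
        # Route the resource to the single controller that owns its kind.
        return self._controllers[resource["kind"]](resource)


def reconcile_database(resource: dict) -> str:
    # Hypothetical controller for the "Database" kind.
    return f"reconciled Database/{resource['metadata']['name']}"


def reconcile_backup(resource: dict) -> str:
    # Hypothetical controller for the "DatabaseBackup" kind.
    return f"reconciled DatabaseBackup/{resource['metadata']['name']}"


registry = ControllerRegistry()
registry.register("Database", reconcile_database)
registry.register("DatabaseBackup", reconcile_backup)

print(registry.dispatch({"kind": "Database", "metadata": {"name": "orders"}}))
```

Keeping the mapping one-to-one means each reconcile function only has to reason about a single resource type, which is the maintainability benefit the practice is after.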

With regard to best practices around runtime behavior, these are worth pointing out:

  • Do not self-register CRDs
  • Be capable of updating from a previous version of the Operator
  • Be capable of managing an Operand from an older Operator version
  • Use CRD conversion (webhooks) if you change API/CRDs

There are additional runtime practices (please, don’t run as root) in the document worth reading.
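
As an illustration of the CRD conversion practice, the following is a rough sketch of the logic a conversion webhook might run, written as plain Python functions without an HTTP server or TLS setup. The request and response fields (`uid`, `desiredAPIVersion`, `objects`, `convertedObjects`, `result`) follow the `apiextensions.k8s.io/v1` `ConversionReview` shape. Everything else is an assumption made for the example: the `example.com` group, the `Database` kind, and the rename of `spec.size` (v1alpha1) to `spec.replicas` (v1).

```python
import copy

GROUP = "example.com"  # hypothetical API group, for illustration only


def convert_object(obj: dict, desired_api_version: str) -> dict:
    """Convert one custom resource between the two hypothetical versions."""
    converted = copy.deepcopy(obj)  # never mutate the incoming object
    current = obj["apiVersion"]
    if current == desired_api_version:
        return converted
    spec = converted.setdefault("spec", {})
    if current == f"{GROUP}/v1alpha1" and desired_api_version == f"{GROUP}/v1":
        # Assumed schema change: v1alpha1 "size" became "replicas" in v1.
        if "size" in spec:
            spec["replicas"] = spec.pop("size")
    elif current == f"{GROUP}/v1" and desired_api_version == f"{GROUP}/v1alpha1":
        # Convert back so clients of the older version keep working.
        if "replicas" in spec:
            spec["size"] = spec.pop("replicas")
    else:
        raise ValueError(f"cannot convert {current} to {desired_api_version}")
    converted["apiVersion"] = desired_api_version
    return converted


def handle_conversion_review(review: dict) -> dict:
    """Build a ConversionReview response for a ConversionReview request."""
    request = review["request"]
    response = {"uid": request["uid"]}  # the response must echo the request uid
    try:
        response["convertedObjects"] = [
            convert_object(obj, request["desiredAPIVersion"])
            for obj in request["objects"]
        ]
        response["result"] = {"status": "Success"}
    except ValueError as err:
        response["result"] = {"status": "Failed", "message": str(err)}
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "ConversionReview",
        "response": response,
    }
```

Converting in both directions matters here: it is what lets a newer Operator keep serving resources that were created by, or are still read through, an older API version.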

This list, being a community effort, is of course open to contributions and suggestions. Maybe you are planning to write an Operator in the near future and are wondering how a certain problem would be best solved using this pattern? Or perhaps you recently wrote an Operator and want to share what you learned as your users started to adopt it? Let us know via GitHub issues or file a PR with your suggestions and improvements. Finally, if you want to publish your Operator or use an existing one, check out OperatorHub.io.


About the author

Red Hatter since 2018, technology historian and founder of The Museum of Art and Digital Entertainment. Two decades of journalism mixed with technology expertise, storytelling and oodles of computing experience from inception to ewaste recycling. I have taught or had my work used in classes at USF, SFSU, AAU, UC Law Hastings and Harvard Law. 

I have worked with the EFF, Stanford, MIT, and Archive.org to brief the US Copyright Office and change US copyright law. We won multiple exemptions to the DMCA, accepted and implemented by the Librarian of Congress. My writings have appeared in Wired, Bloomberg, Make Magazine, SD Times, The Austin American Statesman, The Atlanta Journal Constitution and many other outlets.

I have been written about by the Wall Street Journal, The Washington Post, Wired and The Atlantic. I have been called "The Gertrude Stein of Video Games," an honor I accept, as I live less than a mile from her childhood home in Oakland, CA. I was project lead on the first successful institutional preservation and rebooting of the first massively multiplayer game, Habitat, for the C64, from 1986: https://neohabitat.org . I've consulted and collaborated with the NY MOMA, the Oakland Museum of California, Cisco, Semtech, Twilio, Game Developers Conference, NGNX, the Anti-Defamation League, the Library of Congress and the Oakland Public Library System on projects, contracts, and exhibitions.

 