Kubernetes 1.16: A big step for CRDs, kubectl and Container Storage Interface

This week Kubernetes 1.16 is expected and we want to highlight the technical features that enterprise Kubernetes users should know about. With Custom Resource Definitions (CRDs) moving into official general availability, storage improvements, and more, this release hardens the project and celebrates the main extension points for building cloud native applications on Kubernetes.

CRDs to GA

Custom Resource Definitions (CRDs) were introduced into upstream Kubernetes by Red Hat engineers in version 1.7. From the beginning, they were designed as a future-proof implementation of what was previously prototyped as ThirdPartyResources. The road of CRDs has focused on the original goal of making custom resources production ready, bringing it to be a generally available feature in Kubernetes, highlighted with the promotion of the API to v1 in 1.16.

CRDs have become a cornerstone of API extensions in the Kubernetes ecosystem, and is the basis of innovation and a core building block of OpenShift 4. Red Hat has continued pushing CRDs forward ever since, as one of the main drivers in the community behind the features and stability improvements, which finally lead to the v1 API. This progress made OpenShift 4 possible.

Let’s take a deeper look at what will change in the v1 API of Custom Resource Definitions (in the apiextensions.k8s.io/v1 API group). The main theme is around consistency of data stored in CustomResources:

The goal is that consumers of data stored in CustomResources can rely on that it has been validated on creation and on every update such that the data: 

  • follows a well-known schema
  • is strictly typed
  • and only contains values that were intended by the developers to be stored in the CRD.

 

We know all of these properties from native resources like, for example, pods.

 Pods have a well-known structure for metadata, spec, spec.containers, spec.volumes, etc.

  1. Every field in a pod is strictly typed, e.g. every field is either a string, a number, an array, an object or a map. Wrong types are rejected by the kube-apiserver: one cannot put a string where a number is expected.
  2. Unknown fields are stripped when creating a pod: a user or a controller cannot store arbitrary custom fields, e.g. pod.spec.myCustomField. For API compatibility reasons the Kubernetes API conventions say to drop and not reject those fields.

In order to fulfill these 3 properties, CRDs in the v1 version of apiextensions.k8s.io require:

  1. That a schema is defined (in `CRD.spec.versions[n].schema`) – example:
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: crontabs.example.openshift.io
    spec:
      group: example.openshift.io
      versions:
      - name: v1
        served: true
        storage: true
        schema:
          openAPIV3Schema:
            type: object
            properties:
              spec:
                type: object
                properties:
                  cronSpec:
                    type: string
                  image:
                    type: string
                  replicas:
                    type: integer
      scope: Namespaced
      names:
        plural: crontabs
        singular: crontab
        kind: CronTab

    2. That the schema is structural (https://kubernetes.io/blog/2019/06/20/crd-structural-schema/) — KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/20190425-structural-openapi.md – the example above is structural.
    3. That pruning of unknown fields (those which are not specified in the schema of (1)) is enabled (pruning used be opt-in in v1beta1 via `CRD.spec.preserveUnknownFields: false`) — KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/20180731-crd-pruning.md – pruning is enabled for the example as it is a v1 manifest where this is the default.

These are all formal requirements about CRDs and their CustomResources, checked by the kube-apiserver automatically. But there is an additional dimension of high quality APIs in the Kubernetes ecosystem: API review and approval.

API Review and Approval

Getting APIs right does not only mean to be a good fit for the described business logic. APIs must be

  • compatible with Kubernetes API Machinery of today and tomorrow,
  • future-proof in their own domain, i.e. certain API patterns are good and some are knowingly bad for later extensions.

The core Kubernetes developers had to learn painful lessons in the first releases of the Kubernetes platform, and eventually introduced a process called “API Review”. There is a set of people in the community who very carefully and with a lot of experience review every change against the APIs that are considered part of Kubernetes. Concretely, these are the APIs under the *.k8s.io domain.

To make it clear to the API consumer that APIs in *.k8s.io are following all quality standards of core Kubernetes, CRDs under this domain must also go through the API Review process (this is not a new requirement, but has been in place for a long time) and – and this is new – must link the API review approval PR in an annotation::

metadata:
  annotations:
    "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/78458"

Without this annotation, a CRD under the *.k8s.io domain is rejected by the API server.

There are discussions about introducing other reserved domains for the wider Kubernetes community, e.g. *.x-k8s.io, with different, lower requirements than for core resources under *.k8s.io.

CRD Defaulting to Beta

Next to the presented theme of CRD data consistency, another important feature in 1.16 is the promotion of defaulting to beta. Defaulting is known to everybody for native resources, i.e. unspecified fields in a manifest are automatically set to default values on creation by the kube-apiserver.

For example, pod.spec.restartPolicy defaults to Always. Hence, if the user does not set that field, the API server will set and persist Always as the value.

Also old objects already persisted in etcd can get new fields when read from etcd using the defaulting mechanism. This is an important difference to mutating admission webhooks, which are not called on read from etcd, and hence cannot simulate real defaulting.

Defaults are an important API feature which heavily drives an API design. Defaults are now definable in CRD OpenAPI schemas. Here is an example from an OpenShift 4 CRD:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
 name: kubeapiservers.operator.openshift.io
spec:
 scope: Cluster
 group: operator.openshift.io
 names:
   kind: KubeAPIServer
   plural: kubeapiservers
   singular: kubeapiserver
 subresources:
   status: {}
 versions:
 - name: v1
   served: true
   storage: true
   schema:
     openAPIV3Schema:
       spec:
           type: object
           properties:
             logLevel:
               type: string
               default: Normal
             managementState:
               pattern: ^(Managed|Force)$
               default: Managed
               type: string

When such an object is created with explicitly setting logLevel and managementState, the log level will be Normal and the managementState will be Managed.

Kubectl independence

Kubectl came to life almost five years ago as a replacement for the initial CLI for Kubernetes: kubecfg. Its main goals were:

  • improved user experience 
  • and modularity. 

Initially, these goals were met, but over time it flourished in some places, but not in others. Red Hat engineers worked on allowing extensibility and stability of kubectl since the beginning because this was required to make pieces of OpenShift as an enterprise distribution of Kubernetes possible.

The initial discussions about the possibility of splitting kubectl out of the main Kubernetes repository to allow faster iteration and shorter release cycles were started almost two years ago. Unfortunately, the years the kubectl code lived in the main Kubernetes repository caused it to have a tight coupling with some of the internals of Kubernetes. 

Several Red Hat engineers were involved in this effort from the start, refactoring the existing code to make it less coupled with internals, exposing libraries such as k8s.io/api (https://github.com/kubernetes/api/) and k8s.io/client-go (https://github.com/kubernetes/client-go/), to name a few, which are the foundation for many of the existing integrations. 

One of the biggest offenders in that internals fight was the fact that entire kubectl code relied on the internal API versions (iow. internal representation of all the resources exposed in kube-apiserver). Changing this required a lot of manual and mundane work to rewrite every piece of code to properly work with external, official API (iow. the ones you work on a regular basis when interacting with a cluster).

Many long hours of sometimes hard, other times dull work was put into this effort, which resulted in the recent initial brave step which moved (almost all) kubectl code to staging directory. In short, a staging repository is one that is treated as an external one, having its own distinct import path (in this case k8s.io/kubectl).

Reaching this first visible goal brings us several important implications. Kubectl is currently being published (through the publishing-bot) into its own repository that can be easily consumed by the external actors as k8s.io/kubectl. Even though there are a few commands left in the main kubernetes tree, we are working hard on closing this gap, while trying to figure out how the final extraction piece will work, mostly from the testing and release point of view.

Storage improvements

For this release, SIG-storage focused on bringing feature parity between Container Storage Interface (CSI) and in-tree drivers, as well as improving the stability of CSI sidecars and filling in functionality gaps.

We are working on migrating in-tree drivers and replacing them with their CSI equivalent. This is an effort across releases, with more work to follow, but we made steady progress. 

Some of the features the Red Hat storage team designed and implemented include:

  • Volume cloning (beta) to allow users to create volumes from existing sources.
  • CSI volume expansion as default, which brings feature parity between in-tree and CSI drivers. 
  • Raw block support improvements and bug fixes, especially when using raw block on iSCSI volumes.

Learn more

Kubernetes 1.16 brings enhancements to CRDs and storage. Check out the Kubernetes 1.16 release notes to learn more.

Categories
Kubernetes
Tags
, , ,