Crit

Crit is a command-line tool for bootstrapping Kubernetes clusters. It handles the initial configuration of the Kubernetes control plane components and the addition of workers to the cluster.

It is designed to be used within automated scripting (i.e. non-interactively). Many providers of virtual infrastructure allow user-defined customization via shell script, so Crit composes well with provider provisioning tools (e.g. AWS CloudFormation).

Installation

The easiest way to install:

curl -sSfL https://get.crit.sh | sh

Pre-built binaries are also available in Releases. Crit is written in Go, so it can also be installed via go:

go get -u github.com/criticalstack/crit/cmd/crit

RPM/Debian packages are also available via packagecloud.io.

Requirements

Crit is a standalone binary; however, there are implied requirements that aren't as straightforward. Be sure to check out Getting Started.

Design

Decoupled from Etcd Management

The Kubernetes control plane requires etcd for storage, however, bootstrapping and managing etcd is not a responsibility of Crit. This decreases code complexity and results in more maintainable code. Rather than handle all aspects of installing and managing Kubernetes, Crit is designed to be one tool in the toolbox, specific to bootstrapping Kubernetes components.

Safely handling etcd in a cloud environment is not as easy as it may seem, so we have a separate project, e2d, designed to bootstrap etcd and manage cluster membership.

Lazy Cluster Initialization

Crit leverages the unique features of etcd to handle how brand new clusters are bootstrapped. Other tooling often accomplishes this by handling cluster initialization separately from all subsequent nodes joining the cluster (even if only implicitly), and this initial special case can be difficult to automate in distributed systems. Instead, the distributed locking capabilities of etcd are used to synchronize nodes and initialize a cluster automatically. All nodes race to acquire the distributed lock, and should the cluster not yet exist (signified by the absence of shared cluster files), a new cluster is initialized by the node that first acquired the lock; otherwise the node joins the existing cluster.
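
The idea can be illustrated roughly with etcdctl (Crit implements this internally against the etcd client API; the lock and key names below are made up for illustration):

# every control plane node runs this concurrently; etcdctl lock serializes them
etcdctl lock crit-init -- sh -c '
  if etcdctl get --print-value-only /crit/cluster-initialized | grep -q .; then
    echo "shared cluster files found, joining existing cluster"
  else
    echo "no cluster yet: this node initializes it"
    etcdctl put /crit/cluster-initialized true
  fi
'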

This is especially useful when working with projects like cluster-api, since all control plane nodes can be initialized simultaneously, greatly reducing the time to create an HA cluster (especially a 5-node control plane).

Node Roles

Nodes are distinguished as having only one of two roles, either control plane or worker. All the same configurations for clusters are possible, such as colocating etcd on the control plane, but Crit is only concerned with how it needs to bootstrap the two basic node roles.

Cluster Upgrades

There are several important considerations for upgrading a cluster. Crit itself is only a bootstrapper: it takes on the daunting task of ensuring that the cluster components are all configured, but afterwards there is not much left for it to do. However, the most important aspects of the philosophy behind Crit and e2d are to ensure that colocated control planes can:

  1. Have all nodes deployed simultaneously, with crit/e2d ensuring that they are bootstrapped regardless of the order in which they come up.
  2. Be safely upgraded via a rolling upgrade.

Built for Automation

Getting Started

Quick Start

A local Critical Stack cluster can be set up using cinder with one easy command:

$ cinder create cluster

This quickly creates a ready-to-use Kubernetes cluster running completely within a single Docker container.

Cinder, or Crit-in-Docker, can be useful for developing on Critical Stack clusters locally, or simply to learn more about Crit. You can read more about requirements, configuration, etc over in the Cinder Guide.

Running in Production

Setting up a production Kubernetes cluster requires quite a bit of planning and configuration, and many considerations influence the way a cluster should be configured. When starting a new cluster or settling on a standard cluster configuration, one should consider the following:

  • Where will it be running? (e.g. AWS, GCP, bare-metal, etc)
  • What level of resiliency is required?
    • This concerns how the cluster deals with faults; depending upon factors like colocation of etcd, failure modes can become more complicated.
  • What will provide out-of-band storage for cluster secrets?
    • This applies mostly to the initial cluster secrets, the Kubernetes and Etcd CA cert/key pairs.
  • What kind of applications will run on the cluster?
  • What cost-based factors are there?
  • What discovery mechanisms are available for new nodes?
  • Are there specific performance requirements that affect the infrastructure being used?

The Crit Guide and the accompanying Security Guide exist to help answer these questions and provide general guidance for setting up a typical Kubernetes cluster to meet various use cases.

In particular, a few good places to start planning your Kubernetes cluster:

Crit Guide

This guide will take you through all of the typical configuration use cases that may come up when creating a new Kubernetes cluster.

System Requirements

Exact system requirements depend upon many factors; however, for the most part, any relatively modern Linux operating system will fit the bill.

  • Linux kernel >= 4.9.17
  • systemd
  • iptables (optional)

Newer kernel versions enable Cilium's kube-proxy replacement feature, which removes the need to deploy kube-proxy (and therefore the need for iptables).
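
For example, installing Cilium with kube-proxy replacement enabled might look like the following (the Helm values shown are assumptions; consult the Cilium documentation for the chart version you use):

helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --namespace kube-system \
    --version 1.8.2 \
    --set kubeProxyReplacement=strict \
    --set k8sServiceHost=<control plane endpoint host> \
    --set k8sServicePort=6443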

Dependencies

  • kubelet >= 1.14.x
  • containerd >= 1.2.6
  • CNI >= 0.7.5

References

Installation

Install from helper script

Run the following in your terminal to download the latest version of crit:

curl -sSfL https://get.crit.sh | sh

By default, the latest release version is installed. Set environment variables to install a different version, or to install to a different destination:

curl -sSfL https://get.crit.sh | VERSION=1.0.8 INSTALL_DIR=$HOME/bin sh

Install From Packagecloud.io

Debian/Ubuntu:

curl -sL https://packagecloud.io/criticalstack/public/gpgkey | apt-key add -
apt-add-repository https://packagecloud.io/criticalstack/public/ubuntu
apt-get install -y criticalstack e2d

Fedora:

dnf config-manager --add-repo https://packagecloud.io/criticalstack/public/fedora
dnf install -y criticalstack e2d

Install from GH releases

Download a binary release from https://github.com/criticalstack/crit/releases/latest suitable for your system and then install, for example:

curl -sLO https://github.com/criticalstack/crit/releases/download/v1.0.1/crit_1.0.1_Linux_x86_64.tar.gz
tar xzf crit_1.0.1_Linux_x86_64.tar.gz
mv crit /usr/local/bin/

Please note, installing from a GH release will not automatically install the systemd kubelet drop-in:

mkdir -p /etc/systemd/system/kubelet.service.d
curl -sLO https://raw.githubusercontent.com/criticalstack/crit/master/build/package/20-crit.conf
mv 20-crit.conf /etc/systemd/system/kubelet.service.d/
systemctl daemon-reload

Configuration

Configuration is passed to Crit via YAML and is separated into two different types: ControlPlaneConfiguration and WorkerConfiguration. Concerns are split between these two configs for any given node, and each contains a NodeConfiguration that specifies node-specific settings such as the kubelet, networking, etc.
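
A minimal sketch of the two types, combining fields shown elsewhere in this guide (each file is used on its respective node type):

# control-plane.yaml
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
controlPlaneEndpoint: "mycluster.domain:6443"
node:
  kubernetesVersion: 1.17.3

# worker.yaml
apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
controlPlaneEndpoint: "mycluster.domain:6443"
bootstrapToken: abcdef.0123456789abcdef
caCert: /etc/kubernetes/pki/ca.crt
node:
  kubernetesVersion: 1.17.3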

Embedded ComponentConfigs

The ComponentConfigs are part of an ongoing effort to make configuration of Kubernetes components (API server, kubelet, etc) more dynamic by expressing configuration directly as Kubernetes API types. As components gain support for being configured from file via these ComponentConfig types, Crit embeds them, since they greatly simplify transforming user configuration into Kubernetes component configuration.

Currently, only the kube-proxy and kubelet ComponentConfigs are ready to be used, but more are in progress and will be adopted by Crit as other components begin supporting configuration from file.

Runtime Defaults

Some configuration defaults are set at the time of running crit up. These mostly include settings that are based upon the host that is running the command, such as the hostname.

If left unset, the controlPlaneEndpoint value will be set to the IPv4 address of the host. If there are multiple network interfaces, the first non-loopback interface is used.

The default directory for Kubernetes files is /etc/kubernetes and any paths to manifests, certificates, etc are derived from this.

Etcd is also configured presuming that mTLS is used and that the etcd nodes are colocated with the Kubernetes control plane components, effectively making this the default configuration:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
etcd:
  endpoints:
  - "https://${controlPlaneEndpoint.Host}:2379"
  caFile: /etc/kubernetes/pki/etcd/ca.crt
  caKey: /etc/kubernetes/pki/etcd/ca.key
  certFile: /etc/kubernetes/pki/etcd/client.crt
  keyFile: /etc/kubernetes/pki/etcd/client.key

The CA certificate required for the worker to validate the cluster it's joining is also derived from the default Kubernetes configuration directory:

apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
caCert: /etc/kubernetes/pki/ca.crt

Container Runtimes

If interested in a comprehensive deep-dive into all things container runtime, Capital One has a great blog post going into the history and current state of container runtimes: A Comprehensive Container Runtime Comparison.

Containerd

Containerd is a robust and easy-to-use container runtime. It has a proven track record of reliability, and is the container runtime we use for many Critical Stack installations.

Docker

Docker is more than just a container runtime, and actually uses containerd internally.

CRI-O

Running Etcd

Crit requires a connection to etcd to coordinate the bootstrapping process. The etcd cluster does not have to be colocated on the node. For bootstrapping and managing etcd, we prefer using our own e2d tool. It embeds etcd and combines it with the hashicorp/memberlist gossip network to manage etcd membership.

Control Plane Sizing

External Etcd

When dealing with etcd running external to the Kubernetes control plane components, there are not a lot of restrictions on how many control plane nodes one can have. There can be any number of nodes that meet demand and availability needs, and can even be auto-scaled. With that said, however, the performance of Kubernetes is tied heavily to the performance of etcd, so more nodes does not mean more performance.

Colocated Etcd

Colocation of etcd, or "stacked etcd" (as it's referred to in the Kubernetes documentation), is the practice of installing etcd alongside the Kubernetes control plane components (kube-apiserver, kube-controller-manager, and kube-scheduler). This has some obvious benefits like reducing cost by reducing the virtual machines needed, but introduces a lot of complexity and restrictions.

Etcd's performance goes down as more nodes are added, because more members are required to vote to commit entries to the raft log, so there should never be more than 5 voting members in a cluster (unless performing a rolling upgrade). The number of members should also always be odd to help protect against the split-brain problem. This means that the control plane can only safely be made up of 1, 3, or 5 nodes.

Etcd also should not be scaled up or down (at least, at this time), because the cluster is put at risk each time there is a membership change. This means that the control plane size needs to be selected ahead of time and not altered afterwards.

General Recommendations

In cloud environments, 3 is a good size to balance resiliency and performance. Cloud environments provide ways to quickly automate replacing failed members, so losing a node does not leave etcd at risk of losing quorum for long before a new node replaces it. As etcd adds more functionality around the learner member type, this will also open up the option of having a "hot spare" ready to take the place of a failed member immediately.

For bare-metal, 5 is a good size to ensure that failed nodes have more time to be replaced since a new node might need to be physically allocated.

Kubernetes Certificates

Generating Certificates

For an overview of the certificates Kubernetes requires and how they are used, see here.

Cluster CA

To generate the cluster CA and private key:

crit certs init --cert-dir /etc/kubernetes/pki

Certificates for Etcd

Etcd certificates can be generated using our e2d tool. See e2d pki.

Certificates and Kubeconfigs for Kubernetes Components

The following certificates and kubeconfigs can be created with crit. See the crit up command.

/etc/kubernetes/
├── admin.conf
├── controller-manager.conf
├── kubelet.conf
├── pki
│   ├── apiserver-healthcheck-client.crt
│   ├── apiserver-healthcheck-client.key
│   ├── apiserver-kubelet-client.crt
│   ├── apiserver-kubelet-client.key
│   ├── apiserver.crt
│   ├── apiserver.key
│   ├── auth-proxy-ca.crt
│   ├── auth-proxy-ca.key
│   ├── ca.crt
│   ├── ca.key
│   ├── front-proxy-ca.crt
│   ├── front-proxy-ca.key
│   ├── front-proxy-client.crt
│   ├── front-proxy-client.key
│   ├── sa.key
│   └── sa.pub
└── scheduler.conf

Managing Certificates

Check Certificate Expiration

You can use the crit certs list command to check when certificates expire:

$ crit certs list
Certificate Authorities:
========================
Name		CN		Expires	NotAfter
ca		kubernetes	9y	2030-09-27T01:45:12Z
front-proxy-ca	front-proxy-ca	9y	2030-09-27T16:36:08Z

Certificates:
=============
Name				CN				Expires	NotAfter
apiserver			kube-apiserver			364d	2021-09-29T23:54:16Z
apiserver-kubelet-client	kube-apiserver-kubelet-client	364d	2021-09-29T23:54:16Z
apiserver-healthcheck-client	system:basic-info-viewer	364d	2021-09-29T23:54:16Z
front-proxy-client		front-proxy-client		364d	2021-09-29T23:54:17Z

Rotating Certificates

There are several different solutions pertaining to certificate rotation. The appropriate solution greatly depends on an organization's use case. Some things to consider:

  • Does certificate rotation need to integrate with an organization's existing certificate infrastructure?
  • Can certificate approval and signing be automated, or does it require a cluster administrator?
  • How often do certificates need to be rotated?
  • How many clusters need to be supported?

Rotating with Crit

Certificates can be renewed with crit certs renew. Note that this does not renew the CA.
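
For example, using the flags listed in the command reference below:

# preview which certificates would be renewed
crit certs renew --cert-dir /etc/kubernetes/pki --dry-run

# renew them in place
crit certs renew --cert-dir /etc/kubernetes/pki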

Rotating with the Kubernetes certificates API

Kubernetes provides a Certificate API that can be used to provision certificates using certificate signing requests.
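
For illustration only (names are placeholders, and on older clusters the API group may be certificates.k8s.io/v1beta1), a certificate signing request and its approval look roughly like:

apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: example-csr
spec:
  # base64-encoded PEM CSR generated out-of-band (e.g. with openssl)
  request: <base64-encoded PKCS#10 CSR>
  signerName: kubernetes.io/kube-apiserver-client
  usages:
  - client auth

# approve and retrieve the signed certificate
kubectl certificate approve example-csr
kubectl get csr example-csr -o jsonpath='{.status.certificate}' | base64 -d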

Kubelet Certificate

The kubelet certificate can be automatically renewed using the Kubernetes API.
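
Since Crit embeds the kubelet ComponentConfig, the standard KubeletConfiguration rotation fields can likely be set under node.kubelet; a sketch (field placement is an assumption based on the other kubelet examples in this guide):

apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
node:
  kubelet:
    # rotate the kubelet client certificate automatically as it nears expiration
    rotateCertificates: true
    # request a serving certificate from the cluster (requires CSR approval)
    serverTLSBootstrap: true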

Advanced Certificate Rotation

Organizations that require an automated certificate rotation solution that integrates with existing certificate infrastructure should consider projects like cert-manager.

Bootstrapping a Worker

There are two available options for bootstrapping new worker nodes:

Bootstrap Token

Crit supports a worker bootstrap flow using bootstrap tokens and the cluster CA certificate (e.g. /etc/kubernetes/pki/ca.crt):

apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
bootstrapToken: abcdef.0123456789abcdef
caCert: /etc/kubernetes/pki/ca.crt
controlPlaneEndpoint: mycluster.domain
node:
  cloudProvider: aws
  kubernetesVersion: 1.17.3

This method is adapted from the kubeadm join workflow, but uses the full CA certificate instead of using CA pinning. It also does not depend upon clients getting a signed configmap, and therefore does not require anonymous auth to be turned on.
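
The token itself can be generated and registered with the cluster using crit (see the command reference); a sketch, assuming crit generate token prints the token to stdout:

# on a control plane node (with admin access to the cluster)
TOKEN=$(crit generate token)
crit create token $TOKEN

# place the token in the worker's WorkerConfiguration as bootstrapToken
echo $TOKEN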

Bootstrap Server

Experimental

The bootstrap protocol used by Kubernetes/kubeadm relies on operations that imply manual steps, in particular creating the bootstrap token and distributing it to new worker nodes. Crit introduces a new bootstrap protocol designed to work better in environments that are completely automated.

A bootstrap-server static pod is created alongside the Kubernetes components that run on each control plane node. It provides a service that allows new nodes to be authorized and issued a bootstrap token before they have joined the cluster. This also makes it possible to keep the bootstrap token expiration very short, greatly limiting the window in which it can be used.

Configuration

Here is an example of using Amazon Instance Identity Document w/ signature verification while also limiting the accounts bootstrap tokens will be issued for:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
critBootstrapServer:
  cloudProvider: aws
  extraArgs:
    filters: account-id=${account_id}

Override bootstrap-server default port:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
critBootstrapServer:
  extraArgs:
    port: 8080

Authorizers

AWS

The AWS authorizer uses Instance Identity Documents and RSA SHA 256 signature verification to confirm the identity of new nodes requesting bootstrap tokens.

Configuring Control Plane Components

See here for a complete list of available configuration options.

Control plane endpoint

The control plane endpoint is the address (IP or DNS), along with an optional port, that represents the control plane. It is effectively the API server address; however, it is used internally for a few other purposes, such as:

  • Discovering other services using the host (e.g. bootstrap-server)
  • Adding it to the SANs of generated cluster certificates

It is specified in the config file like so:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
controlPlaneEndpoint: "example.com:6443"

Disable/Enable Kubernetes Feature Gates

Setting feature gates is important if you need specific features that are not available, or not enabled, by default for a particular version of Kubernetes.

For example, CSI-related features were only enabled by default starting with version 1.17, so for older versions of Kubernetes you will need to turn them on manually for the control plane:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
...
kubeAPIServer:
  featureGates:
    CSINodeInfo: true
    CSIDriverRegistry: true
    CSIBlockVolume: true
    VolumeSnapshotDataSource: true
node:
  kubelet:
    featureGates:
      CSINodeInfo: true
      CSIDriverRegistry: true
      CSIBlockVolume: true

and for the workers:

apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
...
node:
  kubelet:
    featureGates:
      CSINodeInfo: true
      CSIDriverRegistry: true
      CSIBlockVolume: true

The kubeAPIServer, kubeControllerManager, kubeScheduler, and kubelet all have feature gates that can be configured. More info is available in the Kubernetes docs.

Configuring Pod/Service Subnets

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
podSubnet: "10.153.0.0/16"
serviceSubnet: "10.154.0.0/16"

Configuring a Cloud Provider

A cloud provider can be specified to integrate with the underlying infrastructure provider. Note, the specified cloud will most likely require authentication/authorization to access their APIs.

Crit supports both In-tree and out-of-tree cloud providers.

In-tree Cloud Provider

In-tree cloud providers can be specified with the following:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
...
node:
  cloudProvider: aws

and for the workers:

apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
...
node:
  cloudProvider: aws

Out-of-tree Cloud Provider

Out-of-tree cloud providers can be specified with the following:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
...
node:
  kubeletExtraArgs: 
    cloud-provider: external

and for the workers:

apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
...
node:
  kubeletExtraArgs: 
    cloud-provider: external

A manifest specific to cloud environment must then be applied to run the external cloud controller manager.

Running crit up

Depending on the provided config, crit up will either provision a Control Plane Node or a Worker Node:

Running crit up with a control plane configuration performs the following steps:

Step                                     Description
ControlPlanePreCheck                     Validate configuration
CreateOrDownloadCerts                    Generate CAs; if already present, don't overwrite
CreateNodeCerts                          Generate certificates for Kubernetes components; if already present, don't overwrite
StopKubelet                              Stop the kubelet using systemd
WriteKubeConfigs                         Generate control plane kubeconfigs and the admin kubeconfig
WriteKubeletConfigs                      Write kubelet settings
StartKubelet                             Start the kubelet using systemd
WriteKubeManifests                       Write static pod manifests for the control plane
WaitClusterAvailable                     Wait for the control plane to be available
WriteBootstrapServerManifest [optional]  Write the crit bootstrap-server pod manifest
DeployCoreDNS                            Deploy CoreDNS after the cluster is available
DeployKubeProxy                          Deploy kube-proxy
EnableCSRApprover                        Add RBAC to allow the CSR approver to bootstrap nodes
MarkControlPlane                         Add taint to the control plane node
UploadInfo                               Upload the crit ConfigMap that holds info regarding the cluster

Running crit up with a worker configuration performs the following steps:

Step                          Description
WorkerPreCheck                Validate configuration
StopKubelet                   Stop the kubelet using systemd
WriteBootstrapKubeletConfig   Write the kubelet bootstrap kubeconfig
WriteKubeletConfigs           Write kubelet settings
StartKubelet                  Start the kubelet using systemd

Installing a CNI

The CNI can be installed at any point after a node has been bootstrapped (i.e. after crit up finishes successfully). For example, installing cilium via helm looks something like this:

helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --namespace kube-system \
    --version 1.8.2

Installing a Storage Driver

helm repo add criticalstack https://charts.cscr.io/criticalstack
kubectl create namespace local-path-storage
helm install local-path-storage criticalstack/local-path-provisioner \
	--namespace local-path-storage \
	--set nameOverride=local-path-storage \
	--set storageClass.defaultClass=true

Install the AWS CSI driver via helm

https://github.com/kubernetes-sigs/aws-ebs-csi-driver

helm repo add criticalstack https://charts.cscr.io/criticalstack
helm install aws-ebs-csi-driver criticalstack/aws-ebs-csi-driver \
	--set enableVolumeScheduling=true \
	--set enableVolumeResizing=true \
	--set enableVolumeSnapshot=true \
	--version 0.3.0

Setting a Default StorageClass

kubectl apply -f - <<EOT
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-sc
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  csi.storage.k8s.io/fstype: xfs
  type: io1
  iopsPerGB: "50"
  encrypted: "true"
EOT

Configuring Authentication

Configure the Kubernetes API Server

The Kubernetes API server can be configured with OpenID Connect to use an existing OpenID identity provider. It can only trust a single issuer, and until the API server can be configured with ComponentConfigs, this must be specified in the Crit config as command-line arguments:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraArgs:
    oidc-issuer-url: "https://accounts.google.com"
    oidc-client-id: critical-stack
    oidc-username-claim: email
    oidc-groups-claim: groups

The above configuration will allow the API server to use Google as its identity provider, but with some major limitations:

  • Kubernetes does not act as a client for the issuer
  • It does not provide a way to manage the lifecycle of OpenID Connect tokens

This is best understood by reading the Kubernetes authentication documentation for OpenID Connect Tokens. The process of getting a token happens completely outside the context of the Kubernetes cluster, and the token is passed as an argument to kubectl commands.

Using an In-cluster Identity Provider

Given the limitations mentioned above, many run their own identity providers inside of the cluster to provide additional auth features to the cluster. This complicates configuration, however, since the API server will either have to be reconfigured and restarted, or will need to be configured with an issuer that is not yet running.

So what if you want to provide a web interface that leverages this authentication? Given the limitations mentioned above, you would have to write authentication logic for the specific upstream identity provider into your application, and should the upstream identity provider change, so does the authentication logic AND the API server configuration. This is where identity providers such as Dex come in. Dex uses OpenID Connect to provide authentication for other applications by acting as a shim between the client app and the upstream provider. When using Dex, the oidc-issuer-url argument needs to target the expected address of Dex running in the cluster, for example:

oidc-issuer-url: "https://dex.kube-system.svc.cluster.local:5556"

It is OK that Dex isn't running yet; the API server will function as normal until the issuer is available.

The auth-proxy CA

The API server uses the host's root CAs by default, but for cases where an application might not be using a CA-signed certificate, such as during development or automated testing, Crit generates an additional CA that is already available in the API server certs volume. This helps with the chicken-and-egg problem of needing to specify a CA file when bootstrapping a new cluster before the application has been deployed. To use this auth-proxy CA, add this to the API server configuration:

oidc-ca-file: /etc/kubernetes/pki/auth-proxy-ca.crt

Please note that this assumes the default Kubernetes directory (/etc/kubernetes) is being used. From here there are many options for making use of the auth-proxy CA. For example, cert-manager can be installed and the auth-proxy CA set up as a ClusterIssuer:

# install cert-manager
kubectl create namespace cert-manager
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.14.0/cert-manager.yaml

# add auth-proxy-ca secret to be used as ClusterIssuer
kubectl -n cert-manager create secret generic auth-proxy-ca --from-file=tls.crt=/etc/kubernetes/pki/auth-proxy-ca.crt --from-file=tls.key=/etc/kubernetes/pki/auth-proxy-ca.key

# wait for cert-manager-webhook readiness
while [[ $(kubectl -n cert-manager get pods -l app=webhook -o 'jsonpath={..status.conditions[?(@.type=="Ready")].status}') != "True" ]]; do echo "waiting for pod" && sleep 1; done

kubectl apply -f - <<EOT
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: auth-proxy-ca
  namespace: cert-manager
spec:
  ca:
    secretName: auth-proxy-ca
EOT

Then applications can create cert-manager certificates for their application to use:

apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: myapp-example
spec:
  secretName: myapp-certs
  duration: 8760h # 365d
  renewBefore: 360h # 15d
  organization:
  -  Internet Widgits Pty Ltd
  isCA: false
  keySize: 2048
  keyAlgorithm: rsa
  keyEncoding: pkcs1
  usages:
    - server auth
    - client auth
  dnsNames:
  - myapp.example.com
  issuerRef:
    name: auth-proxy-ca
    kind: ClusterIssuer

Of course, this is just one possible way to approach authentication, and configuration will vary greatly depending upon the needs of the application(s) running on the cluster.

Kubelet Settings

Disable swap for Linux-based Operating Systems

Swap must be disabled for the kubelet to work (see here). The following systemd unit ensures that swap is disabled on a system:

[Unit]
After=local-fs.target

[Service]
ExecStart=/sbin/swapoff -a

[Install]
WantedBy=multi-user.target
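
For example, assuming the unit above is saved as /etc/systemd/system/disable-swap.service (the name is arbitrary), enable it with:

systemctl daemon-reload
systemctl enable --now disable-swap.service

# also remove or comment out any swap entries in /etc/fstab so swap
# does not come back on the next boot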

Reserving Resources

Reserving some resources for the system is often helpful to ensure that resource-hungry pods don't take down the node by causing it to run out of memory.

...
node:
  kubelet:
    kubeReserved:
      cpu: 128m
      memory: 64Mi
    kubeReservedCgroup: /podruntime.slice
    kubeletCgroups: /podruntime.slice
    systemReserved:
      cpu: 128m
      memory: 192Mi
    systemReservedCgroup: /system.slice

# /etc/systemd/system/kubelet.service.d/10-cgroup.conf
# Sets the cgroup for the kubelet service
[Service]
CPUAccounting=true
MemoryAccounting=true
Slice=podruntime.slice

# /etc/systemd/system/containers.slice
# Creates a cgroup for containers
[Unit]
Description=Grouping resources slice for containers
Documentation=man:systemd.special(7)
DefaultDependencies=no
Before=slices.target
Requires=-.slice
After=-.slice

# /etc/systemd/system/podruntime.slice
# Creates a cgroup for kubelet
[Unit]
Description=Limited resources slice for Kubelet service
Documentation=man:systemd.special(7)
DefaultDependencies=no
Before=slices.target
Requires=-.slice
After=-.slice

Exposing Cluster DNS

Replace Systemd-resolved With Dnsmasq

Sometimes systemd-resolved, the default stub resolver on many Linux systems, needs to be replaced with dnsmasq. This dnsmasq systemd drop-in is useful to ensure that systemd-resolved is not running when the dnsmasq service is started:

# /etc/systemd/system/dnsmasq.service.d/10-resolved-fix.conf
[Unit]
After=systemd-resolved.service

[Service]
ExecStartPre=/bin/systemctl stop systemd-resolved.service
ExecStartPost=/bin/systemctl start systemd-resolved.service

It works by allowing systemd-resolved to start, but stopping it once the dnsmasq service is started. This is helpful because it doesn't require changing any of the systemd-resolved specific settings but allows the dnsmasq service to be enabled/disabled when desired.

Forwarding Cluster-bound DNS on the Host

A reason why one might want to use something like dnsmasq, instead of systemd-resolved, is to expose the cluster DNS to the host. This would allow resolution of DNS for service and pod subnets from the host that is running the Kubernetes components. It only requires adding this dnsmasq configuration drop-in:

# /etc/dnsmasq.d/kube.conf
server=/cluster.local/10.254.0.10

This tells dnsmasq to forward any DNS queries it receives that end in the cluster domain to the Kubernetes cluster DNS, CoreDNS. This presumes that the default cluster domain (cluster.local) and service subnet have been configured. The address of CoreDNS is chosen automatically based upon the service subnet, so if the service subnet is 10.254.0.0/16 (the default), CoreDNS will be listening at 10.254.0.10.
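
After restarting dnsmasq, resolution of cluster names from the host can be checked by querying it directly (assuming the default cluster domain and service subnet):

systemctl restart dnsmasq
dig +short kubernetes.default.svc.cluster.local @127.0.0.1
# should print the kubernetes service ClusterIP, e.g. 10.254.0.1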

Specifying the resolv.conf

By default, the host's /etc/resolv.conf is used; a different file can be set in the kubelet configuration:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
node:
  kubelet:
    resolvConf: /other/resolv.conf

Crit attempts to determine if systemd-resolved is running and, if so, dynamically sets resolvConf to /run/systemd/resolve/resolv.conf.

Security Guide

This guide will take you through configuring security features of Kubernetes, as well as features specific to Crit. It also includes general helpful information and gotchas to look out for when creating a new cluster.

Encrypting Kubernetes Secrets

EncryptionProviderConfig

To encrypt secrets within the cluster you must create an EncryptionConfiguration manifest and pass it to the API server.

touch /etc/kubernetes/encryption-config.yaml
chmod 600 /etc/kubernetes/encryption-config.yaml
cat <<-EOT > /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
    - secrets
    providers:
    - aescbc:
        keys:
        - name: key1
          secret: $(cat /etc/kubernetes/pki/etcd/ca.key | md5sum | cut -f 1 -d ' ' | head -c -1 | base64)
    - identity: {}
EOT

This EncryptionConfiguration uses the aescbc provider for encrypting secrets. Details on other providers, including third-party key management systems, can be found in the Kubernetes official documentation.

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraVolumes:
  - name: encryption-config
    hostPath: /etc/kubernetes/encryption-config.yaml
    mountPath: /etc/kubernetes/encryption-config.yaml
    readOnly: true
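
Mounting the file alone does not enable encryption; the API server must also be pointed at it with the --encryption-provider-config flag. Assuming Crit passes extraArgs straight through to kube-apiserver (as in the other examples in this guide), that looks something like:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraArgs:
    encryption-provider-config: /etc/kubernetes/encryption-config.yaml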

Once the API server is available, verify that new secrets are encrypted.
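
One way to check (a sketch; the secret name is arbitrary) is to create a secret and read it back directly from etcd, where the stored value should be prefixed with k8s:enc:aescbc:v1: rather than appearing in plaintext:

kubectl create secret generic test-encryption --from-literal=foo=bar
ETCDCTL_API=3 etcdctl \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/client.crt \
  --key /etc/kubernetes/pki/etcd/client.key \
  get /registry/secrets/default/test-encryption | hexdump -C | head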

Enabling Pod Security Policies

What is a Pod Security Policy

Pod Security Policies are in-cluster Kubernetes resources that provide ways of securing pods. The Pod Security Policy page of the official Kubernetes docs provides a great deal of helpful information and a walkthrough of how to use them, and is highly recommended reading. For the purposes of this documentation, we just want to focus on getting them running on your Crit cluster.

Configuration

The API server has quite a few admission plugins enabled by default; however, the PodSecurityPolicy plugin must be enabled explicitly via the enable-admission-plugins option:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraArgs:
    enable-admission-plugins: PodSecurityPolicy

enable-admission-plugins can be provided a comma-delimited list of admission plugins to enable. While the order in which admission plugins run does matter, it does not matter for this particular option, which simply enables the plugins.
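
For example, to enable PodSecurityPolicy alongside another admission plugin:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraArgs:
    enable-admission-plugins: NodeRestriction,PodSecurityPolicy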

The admission plugin SecurityContextDeny must NOT be enabled along with PodSecurityPolicy. In the case that PodSecurityPolicy is enabled, the usage completely supplants the functionality provided by SecurityContextDeny.

Pod Security Policy Examples

Crit embeds two Pod Security Policies that provide a good starting place for configuring PSPs in your cluster. They were adapted from the examples provided in the Kubernetes docs and can be found on GitHub here, or they can be printed to the console using crit template on the desired file:

$ crit template psp-privileged.yaml

Privileged Pod Security Policy

# psp-privileged.yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: privileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
spec:
  privileged: true
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - '*'
  volumes:
  - '*'
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  hostIPC: true
  hostPID: true
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: psp:privileged
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs:     ['use']
  resourceNames:
  - privileged
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: psp:privileged
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: psp:privileged
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: system:serviceaccounts:kube-system
- kind: Group
  name: system:serviceaccounts:kube-node-lease
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: system:serviceaccounts:kube-public
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: system:serviceaccounts:default
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
- kind: User
  apiGroup: rbac.authorization.k8s.io
  # Legacy node ID
  name: kubelet

Restricted Pod Security Policy

# psp-restricted.yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: default-cluster-restricted
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default,runtime/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName:  'runtime/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName:  'runtime/default'
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  # This is redundant with non-root + disallow privilege escalation,
  # but we can provide it for defense in depth.
  requiredDropCapabilities:
    - ALL
  # Allow core volume types.
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    # Assume that persistentVolumes set up by the cluster admin are safe to use.
    - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    # Require the container to run without root privileges.
    rule: 'MustRunAsNonRoot'
  seLinux:
    # This policy assumes the nodes are using AppArmor rather than SELinux.
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  readOnlyRootFilesystem: false
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: psp:restricted
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs:     ['use']
  resourceNames:
  - default-cluster-restricted
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: psp:restricted
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: psp:restricted
subjects:
# Authorize all service accounts in a namespace:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: system:serviceaccounts
# Or equivalently, all authenticated users in a namespace:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: system:authenticated

Audit Policy Logging

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraArgs:
    audit-policy-file: "/etc/kubernetes/audit-policy.yaml"
    audit-log-path: "/var/log/kubernetes/kube-apiserver-audit.log"
    audit-log-maxage: "30"
    audit-log-maxbackup: "10"
    audit-log-maxsize: "100"
  extraVolumes:
  - name: apiserver-logs
    hostPath: /var/log/kubernetes
    mountPath: /var/log/kubernetes
    readOnly: false
    hostPathType: Directory
  - name: apiserver-audit-config
    hostPath: /etc/kubernetes/audit-policy.yaml
    mountPath: /etc/kubernetes/audit-policy.yaml
    readOnly: true
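
Crit does not generate the audit-policy.yaml referenced above. A minimal example policy that logs request metadata for everything might look like the following (see the Kubernetes auditing documentation for richer rules):

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # log metadata (user, verb, resource, timestamp) for all requests
  - level: Metadata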

Disabling Anonymous Authentication

The API server defaults to allowing anonymous auth, meaning that incoming requests that are not authenticated will be implicitly given the username system:anonymous and made part of the system:unauthenticated group. While this user may not have permission to do anything, problems related to allowing anonymous authentication are still possible, such as vulnerabilities like the "Billion Laughs" attack.

Disabling anonymous authentication only requires passing an argument to the API server:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraArgs:
    anonymous-auth: false

API Server Healthchecks

Liveness probes for the API server static pod will fail should anonymous-auth be set to false. Crit addresses this by detecting when --anonymous-auth has been disabled and adding a special healthcheck-proxy sidecar to the apiserver static pod. It acts as a reverse proxy, with the frontend effectively accepting anonymous traffic and the backend using an authenticated user. The backend connection is established with the built-in system:basic-info-viewer user to limit the auth to only viewing health and version information.

Kubelet Server Certificate

Encrypting Shared Cluster Files

The PKI shared by all control plane nodes is distributed via etcd/e2d using e2db, an ORM-like abstraction over etcd. These files should be protected using strong encryption, and e2db provides a feature for encrypting entire tables. The one requirement is that the etcd CA key is provided in the crit configuration:

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
etcd:
  endpoints:
  - "https://${controlPlaneEndpoint.Host}:2379"
  caFile: /etc/kubernetes/pki/etcd/ca.crt
  caKey: /etc/kubernetes/pki/etcd/ca.key
  certFile: /etc/kubernetes/pki/etcd/client.crt
  keyFile: /etc/kubernetes/pki/etcd/client.key

where the important file here is ca.key, since it is the only one suitable to use as a data encryption key.

Cinder Guide

This guide will take you through installing and using Cinder.

What is Cinder

Cinder, or Crit-in-Docker, is very similar to kind. In fact, it uses many packages from kind under the hood, along with the base container image that makes it all work. Think of cinder as a flavor of kind (which is quite good, to say the least). Just like kind, cinder won't work on all platforms; right now it only supports amd64 architectures running macOS and Linux, and requires running Docker.

Cinder bootstraps each node with Crit and installs several helpful additional components, such as the machine-api and machine-api-provider-docker.

Using Cinder to Develop Crit

# dev.yaml
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
files:
  - path: "/usr/bin/crit"
    owner: "root:root"
    permissions: "0755"
    encoding: hostpath
    content: bin/crit

❯ make crit
❯ cinder create cluster -c dev.yaml
Creating cluster "cinder" ...
 🔥  Generating certificates
 🔥  Creating control-plane node
 🔥  Installing CNI
 🔥  Installing StorageClass
 🔥  Running post-up commands
Set kubectl context to "kubernetes-admin@cinder". Prithee, be careful.

Installation

Cinder is installed alongside crit, so the same helper script can be used for installation:

curl -sSfL https://get.crit.sh | sh

Configuration

Adding Files

apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
files:
  - path: "/etc/kubernetes/auth-proxy-ca.yaml"
    owner: "root:root"
    permissions: "0644"
    content: |
      apiVersion: cert-manager.io/v1alpha2
      kind: ClusterIssuer
      metadata:
        name: auth-proxy-ca
        namespace: cert-manager
      spec:
        ca:
          secretName: auth-proxy-ca

HostPath

apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
controlPlaneConfiguration:
  kubeAPIServer:
    extraArgs:
      audit-policy-file: "/etc/kubernetes/audit-policy.yaml"
      audit-log-path: "/var/log/kubernetes/kube-apiserver-audit.log"
      audit-log-maxage: "30"
      audit-log-maxbackup: "10"
      audit-log-maxsize: "100"
    extraVolumes:
    - name: apiserver-logs
      hostPath: /var/log/kubernetes
      mountPath: /var/log/kubernetes
      readOnly: false
      hostPathType: Directory
    - name: apiserver-audit-config
      hostPath: /etc/kubernetes/audit-policy.yaml
      mountPath: /etc/kubernetes/audit-policy.yaml
      readOnly: true
files:
  - path: "/etc/kubernetes/audit-policy.yaml"
    owner: "root:root"
    permissions: "0644"
    encoding: hostpath
    content: audit-policy.yaml

Running Additional Commands

apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
preCritCommands:
  - crit version
postCritCommands:
  - |
    helm repo add jetstack https://charts.jetstack.io
    helm install cert-manager jetstack/cert-manager \
      --namespace cert-manager \
      --version v0.15.1 \
      --set tolerations[0].effect=NoSchedule \
      --set tolerations[0].key="node.kubernetes.io/not-ready" \
      --set tolerations[0].operator=Exists \
      --set installCRDs=true
    kubectl rollout status -n cert-manager deployment/cert-manager-webhook -w

Updating the Containerd Configuration

apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
files:
  - path: "/etc/containerd/config.toml"
    owner: "root:root"
    permissions: "0644"
    content: |
      # explicitly use v2 config format
      version = 2

      # set default runtime handler to v2, which has a per-pod shim
      [plugins."io.containerd.grpc.v1.cri".containerd]
        default_runtime_name = "runc"
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
        runtime_type = "io.containerd.runc.v2"

      # Setup a runtime with the magic name ("test-handler") used for Kubernetes
      # runtime class tests ...
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.test-handler]
        runtime_type = "io.containerd.runc.v2"

      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
          endpoint = ["https://docker.io"]

Adding Volume Mounts

apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
extraMounts:
  - hostPath: templates
    containerPath: /cinder/templates
    readOnly: true

Forwarding Ports to the Host

apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
extraPortMappings:
  - containerPort: 2379
    hostPort: 2379

Features

Side-loading Images

Kind allows you to side-load images in your local clusters. Cinder exposes the same functionality via cinder load:

cinder load criticalstack/quake-kube:v1.0.5

This will make the criticalstack/quake-kube:v1.0.5 image from the host available in the Cinder node. Any image that is available on the host can be loaded, and Cinder lazily pulls images that are not found on the host.

Registry Mirrors

Mirrors for container image registries can be set up to effectively "alias" them. The key is the alias, and the value is the full endpoint of the registry:

apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
registryMirrors:
  docker.io: "https://docker.io"

It can be used to alias registries with different names, or it can be used to specify plain HTTP registries:

...
registryMirrors:
  myregistry.dev: "http://myregistry.dev"

Local Registry

An instance of Distribution (aka Docker Registry v2) can be set up for a Cinder cluster by specifying a config file with the LocalRegistry feature gate:

apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
featureGates:
  LocalRegistry: true

This will start a Docker container on the host with the running registry (if not already running). The registry is shared for all Cinder clusters on a host and is available at localhost:5000 (i.e. this is what you docker push to). This registry is then available inside the cluster at cinderegg:5000.
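
For example (the image name is hypothetical), push an image to the host-side registry and reference it from inside the cluster by its in-cluster name:

docker tag myapp:dev localhost:5000/myapp:dev
docker push localhost:5000/myapp:dev

# inside the cluster, the same image is pulled from cinderegg:5000
kubectl run myapp --image=cinderegg:5000/myapp:dev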

Cinder also creates the local-registry-hosting ConfigMap so that any tooling that supports Local Registry Hosting, such as Tilt, will be able to automatically discover and use the local registry.

apiVersion: v1
kind: ConfigMap
metadata:
  name: local-registry-hosting
  namespace: kube-public
data:
  localRegistryHosting.v1: |
    host: "localhost:{{ .LocalRegistryPort }}"
    hostFromContainerRuntime: "{{ .LocalRegistryName }}:{{ .LocalRegistryPort }}"
    hostFromClusterNetwork: "{{ .LocalRegistryName }}:{{ .LocalRegistryPort }}"
    help: "https://docs.crit.sh/cinder-guide/local-registry.html"`

More information about this Kubernetes standard can be found here.

Krustlet

Krustlet is a tool to run WebAssembly workloads natively on Kubernetes by acting like a node in your Kubernetes cluster. It can be enabled for a Cinder cluster using the following configuration:

# config.yaml
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
featureGates:
  LocalRegistry: true
  Krustlet: true
controlPlaneConfiguration:
  kubeProxy:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
              - key: "kubernetes.io/arch"
                operator: NotIn
                values: ["wasm32-wasi", "wasm32-wascc"]

Then create a new Cinder cluster:

❯ cinder create cluster -c config.yaml
Creating cluster "cinder" ...
 🔥  Generating certificates
 🔥  Creating control-plane node
 🔥  Installing CNI
 🔥  Installing StorageClass
 🔥  Running post-up commands
Set kubectl context to "kubernetes-admin@cinder". Prithee, be careful.

Note that node affinity is set for kube-proxy to ensure it does not try to schedule a pod on either the WASI or waSCC nodes.

This will start two instances of Krustlet for both WASI and waSCC runtimes:

❯ kubectl get no

NAME           STATUS   ROLES    AGE   VERSION
cinder         Ready    master   2m    v1.18.5
cinder-wascc   Ready    <none>   1m    0.5.0
cinder-wasi    Ready    <none>   1m    0.5.0

With these nodes ready, we can build and push images to our local registry and run them on our Cinder cluster. For example, the Hello World Rust for WASI can be built using cargo and pushed to our local registry using wasm-to-oci:

❯ cargo build --target wasm32-wasi --release
❯ wasm-to-oci push --use-http \
    target/wasm32-wasi/release/hello-world-rust.wasm \
    localhost:5000/hello-world-rust:v0.2.0

The line in k8s.yaml specifying the image to use will need to be modified:

...
spec:
  containers:
    - name: hello-world-wasi-rust
      #image: webassembly.azurecr.io/hello-world-wasi-rust:v0.2.0
      image: cinderegg:5000/hello-world-rust:v0.2.0
...

Finally, the manifest can be applied:

❯ kubectl apply -f k8s.yaml

Which will result in the pod being scheduled on the WASI Krustlet:

❯ kubectl get po -A

NAMESPACE            NAME                                  READY   STATUS                          RESTARTS   AGE
kube-system          cilium-operator-657978fb5b-frrxj      1/1     Running                         0          8m4s
kube-system          cilium-pqmsc                          1/1     Running                         0          8m4s
kube-system          coredns-pqljz                         1/1     Running                         0          7m57s
kube-system          hello-world-wasi-rust                 0/1     ExitCode:0                      0          1s
kube-system          kube-apiserver-cinder                 1/1     Running                         0          8m18s
kube-system          kube-controller-manager-cinder        1/1     Running                         0          8m18s
kube-system          kube-proxy-85lwd                      1/1     Running                         0          8m4s
kube-system          kube-scheduler-cinder                 1/1     Running                         0          8m18s
local-path-storage   local-path-storage-74cd8967f5-vv2mb   1/1     Running                         0          8m4s

And should produce the following log output:

❯ kubectl logs hello-world-wasi-rust

hello from stdout!
hello from stderr!
POD_NAME=hello-world-wasi-rust
FOO=bar
CONFIG_MAP_VAL=cool stuff
Args are: []

Bacon ipsum dolor amet chuck turducken porchetta, tri-tip spare ribs t-bone ham hock. Meatloaf
pork belly leberkas, ham beef pig corned beef boudin ground round meatball alcatra jerky.
Pancetta brisket pastrami, flank pork chop ball tip short loin burgdoggen. Tri-tip kevin
shoulder cow andouille. Prosciutto chislic cupim, short ribs venison jerky beef ribs ham hock
short loin fatback. Bresaola meatloaf capicola pancetta, prosciutto chicken landjaeger andouille
swine kielbasa drumstick cupim tenderloin chuck shank. Flank jowl leberkas turducken ham tongue
beef ribs shankle meatloaf drumstick pork t-bone frankfurter tri-tip.

FAQ

Can e2d scale up (or down) after cluster initialization?

The short answer is No, because it is unsafe to scale etcd and any solution that scales etcd is increasing the chance of cluster failure. This is a feature that will be supported in the future, but it relies on new features and fixes to etcd. Some context will be necessary to explain why:

A common misconception about etcd is that it is scalable. While etcd is a distributed key/value store, the reason it is distributed is to provide distributed consensus, NOT to scale in/out for performance (or flexibility). In fact, an etcd cluster performs best when it has only 1 member, and performance goes down as more members are added. In etcd v3.4, a new type of member called a learner was introduced. Learners can receive raft log updates, but are not part of the quorum voting process. This will be an important feature for many reasons, like stability/safety and faster recovery from faults, but will also potentially[1] enable etcd clusters of arbitrary sizes.

So why not scale within the recommended cluster sizes if the only concern is performance? Previously, etcd clusters have been vulnerable to corruption during membership changes due to the way etcd implemented raft. This has only recently been addressed by incredible work from CockroachDB, and it is worth reading about the issue and the solution in this blog post: Availability and Region Failure: Joint Consensus in CockroachDB.

The last couple features needed to safely scale have been roadmapped for v3.5 and are highlighted in the etcd learner design doc:

Make learner state only and default: Defaulting a new member state to learner will greatly improve membership reconfiguration safety, because learner does not change the size of quorum. Misconfiguration will always be reversible without losing the quorum.

Make voting-member promotion fully automatic: Once a learner catches up to leader’s logs, a cluster can automatically promote the learner. etcd requires certain thresholds to be defined by the user, and once the requirements are satisfied, learner promotes itself to a voting member. From a user’s perspective, “member add” command would work the same way as today but with greater safety provided by learner feature.

Since we want to implement this feature as safely and reliably as possible, we are waiting for this confluence of features to become stable before finally implementing scaling into e2d.

[1] Only potentially, because the maximum is currently set to allow only 1 learner. There is a concern that too many learners could have a negative impact on the leader which is discussed briefly here. It is also worth noting that other features may also fulfill the same need like some kind of follower replication: etcd#11357.

Command Reference

crit

bootstrap Critical Stack clusters

Synopsis

bootstrap Critical Stack clusters

Options

  -h, --help            help for crit
  -v, --verbose count   log output verbosity

SEE ALSO

General Commands

crit template

Render embedded assets

Synopsis

Render embedded assets

crit template [path] [flags]

Options

  -c, --config string   config file (default "config.yaml")
  -h, --help            help for template

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

  • crit - bootstrap Critical Stack clusters

crit up

Bootstraps a new node

Synopsis

Bootstraps a new node

crit up [flags]

Options

  -c, --config string              config file (default "config.yaml")
  -h, --help                       help for up
      --kubelet-timeout duration   timeout for Kubelet to become healthy (default 15s)
      --timeout duration            (default 20m0s)

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

  • Running crit up - useful information about the crit up command
  • crit - bootstrap Critical Stack clusters

crit version

Print the version info

Synopsis

Print the version info

crit version [flags]

Options

  -h, --help   help for version

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

  • crit - bootstrap Critical Stack clusters

crit certs

Handle Kubernetes certificates

Synopsis

Handle Kubernetes certificates

Options

  -h, --help   help for certs

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

crit certs init

initialize a new CA

Synopsis

initialize a new CA

crit certs init [flags]

Options

      --cert-dir string   
  -h, --help              help for init

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

crit certs list

list cluster certificates

Synopsis

list cluster certificates

crit certs list [flags]

Options

      --cert-dir string    (default "/etc/kubernetes/pki")
  -h, --help              help for list

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

crit certs renew

renew cluster certificates

Synopsis

renew cluster certificates

crit certs renew [flags]

Options

      --cert-dir string    (default "/etc/kubernetes/pki")
      --dry-run           
  -h, --help              help for renew

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

crit config

Handle Kubernetes config files

Synopsis

Handle Kubernetes config files

Options

  -h, --help   help for config

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

crit config import

import a kubeconfig

Synopsis

import a kubeconfig

crit config import [kubeconfig] [flags]

Options

  -h, --help   help for import

Options inherited from parent commands

  -v, --verbose count   log output verbosity
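For example, importing an existing admin kubeconfig (the path below is illustrative):

$ crit config import /etc/kubernetes/admin.conf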

SEE ALSO

crit create

Create Kubernetes resources

Synopsis

Create Kubernetes resources

Options

  -h, --help   help for create

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

crit create token

creates a bootstrap token resource

Synopsis

creates a bootstrap token resource

crit create token [token] [flags]

Options

  -h, --help   help for token

Options inherited from parent commands

  -v, --verbose count   log output verbosity
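For example, creating a bootstrap token resource from a known token value (the token shown is illustrative):

$ crit create token abcdef.0123456789abcdef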

SEE ALSO

crit generate

Utilities for generating values

Synopsis

Utilities for generating values

Options

  -h, --help   help for generate

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

crit generate hash

generates a CA certificate hash

Synopsis

generates a CA certificate hash

crit generate hash [ca-cert-path] [flags]

Options

  -h, --help   help for hash

Options inherited from parent commands

  -v, --verbose count   log output verbosity
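For example, hashing the cluster CA certificate (the path below is illustrative):

$ crit generate hash /etc/kubernetes/pki/ca.crt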

SEE ALSO

crit generate kubeconfig

generates a kubeconfig

Synopsis

generates a kubeconfig

crit generate kubeconfig [filename] [flags]

Options

      --CN string           (default "kubernetes-admin")
      --O string            (default "system:masters")
      --cert-dir string     (default ".")
      --cert-name string    (default "ca")
  -h, --help               help for kubeconfig
      --name string         (default "crit")
      --server string      

Options inherited from parent commands

  -v, --verbose count   log output verbosity
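For example, generating an admin kubeconfig against a specific API server endpoint (the server URL and certificate directory are illustrative):

$ crit generate kubeconfig admin.conf --cert-dir /etc/kubernetes/pki --server https://10.0.0.10:6443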

SEE ALSO

crit generate token

generates a bootstrap token

Synopsis

generates a bootstrap token

crit generate token [token] [flags]

Options

  -h, --help   help for token

Options inherited from parent commands

  -v, --verbose count   log output verbosity
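For example, generating a new random bootstrap token:

$ crit generate token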

SEE ALSO

cinder

Create local Kubernetes clusters

Synopsis

Cinder is a tool for creating and managing local Kubernetes clusters using containers as nodes. It builds upon kind, but uses Crit and Cilium to configure a Critical Stack cluster locally.

Options

  -h, --help            help for cinder
  -v, --verbose count   log output verbosity

SEE ALSO

General Commands

cinder load

Load container images from host

Synopsis

Load container images from host

cinder load [flags]

Options

  -h, --help          help for load
      --name string   cluster name (default "cinder")

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

  • cinder - Create local Kubernetes clusters

cinder version

Print the version info

Synopsis

Print the version info

cinder version [flags]

Options

  -h, --help   help for version

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

  • cinder - Create local Kubernetes clusters

cinder create

Create cinder resources

Synopsis

Create cinder resources, such as new cinder clusters, or add nodes to existing clusters.

cinder create [flags]

Options

  -h, --help   help for create

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

cinder create cluster

Creates a new cinder cluster

Synopsis

Creates a new cinder cluster

cinder create cluster [flags]

Options

  -c, --config string       cinder configuration file
  -h, --help                help for cluster
      --image string        node image (default "criticalstack/cinder:v1.0.0-beta.1")
      --kubeconfig string   sets kubeconfig path instead of $KUBECONFIG or $HOME/.kube/config
      --name string         cluster name (default "cinder")

Options inherited from parent commands

  -v, --verbose count   log output verbosity
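For example, creating a cluster named dev from a cinder configuration file (the file name is illustrative):

$ cinder create cluster --name dev -c cinder.yaml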

SEE ALSO

cinder create node

Creates a new cinder worker

Synopsis

Creates a new cinder worker

cinder create node [flags]

Options

  -c, --config string       cinder configuration file
  -h, --help                help for node
      --image string        node image (default "criticalstack/cinder:v1.0.0-beta.1")
      --kubeconfig string   sets kubeconfig path instead of $KUBECONFIG or $HOME/.kube/config
      --name string         cluster name (default "cinder")

Options inherited from parent commands

  -v, --verbose count   log output verbosity
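For example, adding a worker node to the dev cluster created above:

$ cinder create node --name dev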

SEE ALSO

cinder delete

Delete cinder resources

Synopsis

Delete cinder resources

cinder delete [flags]

Options

  -h, --help   help for delete

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

cinder delete cluster

Deletes a cinder cluster

Synopsis

Deletes a cinder cluster

cinder delete cluster [flags]

Options

  -h, --help          help for cluster
      --name string   cluster name (default "cinder")

Options inherited from parent commands

  -v, --verbose count   log output verbosity
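For example, tearing down the dev cluster:

$ cinder delete cluster --name dev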

SEE ALSO

cinder delete node

Deletes a cinder node

Synopsis

Deletes a cinder node

cinder delete node [flags]

Options

  -h, --help          help for node
      --name string   cluster name (default "cinder")

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

cinder export

Export from local cluster

Synopsis

Export from local cluster

cinder export [flags]

Options

  -h, --help   help for export

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

cinder export kubeconfig

Export kubeconfig from cinder cluster and merge with $HOME/.kube/config

Synopsis

Export kubeconfig from cinder cluster and merge with $HOME/.kube/config

cinder export kubeconfig [flags]

Options

  -h, --help                help for kubeconfig
      --kubeconfig string   sets kubeconfig path instead of $KUBECONFIG or $HOME/.kube/config
      --name string         cluster name (default "cinder")

Options inherited from parent commands

  -v, --verbose count   log output verbosity
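For example, merging the dev cluster's kubeconfig into the default kubeconfig location:

$ cinder export kubeconfig --name dev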

SEE ALSO

cinder get

Get cinder resources

Synopsis

Get cinder resources

cinder get [flags]

Options

  -h, --help   help for get

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

cinder get clusters

Get running clusters

Synopsis

Get running clusters

cinder get clusters [flags]

Options

  -h, --help                help for clusters
      --kubeconfig string   sets kubeconfig path instead of $KUBECONFIG or $HOME/.kube/config
      --name string         cluster name (default "cinder")

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

cinder get images

List all container images used by cinder

Synopsis

List all container images used by cinder

cinder get images [flags]

Options

  -h, --help   help for images

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

cinder get ip

Get node IP address

Synopsis

Get node IP address

cinder get ip [flags]

Options

  -h, --help          help for ip
      --name string   cluster name (default "cinder")

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

cinder get kubeconfigs

Get kubeconfig from cinder cluster

Synopsis

Get kubeconfig from cinder cluster

cinder get kubeconfigs [flags]

Options

  -h, --help                help for kubeconfigs
      --kubeconfig string   sets kubeconfig path instead of $KUBECONFIG or $HOME/.kube/config
      --name string         cluster name (default "cinder")

Options inherited from parent commands

  -v, --verbose count   log output verbosity

SEE ALSO

cinder get nodes

List cinder cluster nodes

Synopsis

List cinder cluster nodes

cinder get nodes [flags]

Options

  -h, --help          help for nodes
      --name string   cluster name (default "cinder")

Options inherited from parent commands

  -v, --verbose count   log output verbosity
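For example, listing the nodes of the dev cluster:

$ cinder get nodes --name dev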

SEE ALSO