
Crit
Crit is a command-line tool for bootstrapping Kubernetes clusters. It handles the initial configuration of the Kubernetes control plane components, as well as adding workers to the cluster.
It is designed to be used within automated scripting (i.e. non-interactively). Many providers of virtual infrastructure allow user-defined customization via shell script, so Crit composes well with provider provisioning tools (e.g. AWS CloudFormation).
Installation
The easiest way to install:
curl -sSfL https://get.crit.sh | sh
Pre-built binaries are also available in Releases. Crit is written in Go, so it is also simple to install via go get:
go get -u github.com/criticalstack/crit/cmd/crit
RPM/Debian packages are also available via packagecloud.io.
Requirements
Crit is a standalone binary; however, there are implied requirements that aren't as straightforward. Be sure to check out the Getting Started guide.
Design
Decoupled from Etcd Management
The Kubernetes control plane requires etcd for storage, however, bootstrapping and managing etcd is not a responsibility of Crit. This decreases code complexity and results in more maintainable code. Rather than handle all aspects of installing and managing Kubernetes, Crit is designed to be one tool in the toolbox, specific to bootstrapping Kubernetes components.
Safely handling etcd in a cloud environment is not as easy as it may seem, so we have a separate project, e2d, designed to bootstrap etcd and manage cluster membership.
Lazy Cluster Initialization
Crit leverages the unique features of etcd to handle how brand new clusters are bootstrapped. With other tooling, this is often accomplished by handling cluster initialization separately from all subsequent nodes joining the cluster (even if done so implicitly). Handling this initial case can be difficult to automate in distributed systems. Instead, Crit uses the distributed locking capabilities of etcd to synchronize nodes and initialize a cluster automatically. All nodes race to acquire the distributed lock; should the cluster not yet exist (signified by the absence of shared cluster files), the node that first acquires the lock initializes a new cluster, otherwise the node joins the existing cluster.
This ends up being really cool when working with projects like cluster-api, since all control plane nodes can be initialized simultaneously, greatly reducing the time to create an HA cluster (especially a 5-node control plane).
Node Roles
Nodes are distinguished as having only one of two roles, either control plane or worker. All the same configurations for clusters are possible, such as colocating etcd on the control plane, but Crit is only concerned with how it needs to bootstrap the two basic node roles.
Cluster Upgrades
There are several important considerations for upgrading a cluster. Crit itself is only a bootstrapper, in that it takes on the daunting task of ensuring that the cluster components are all configured, but afterwards there is not much left for it to do. However, one of the most important aspects of the philosophy behind Crit and e2d is to ensure that colocated control planes can:
- Have all nodes deployed simultaneously, with crit/e2d ensuring that they are bootstrapped regardless of the order in which they come up.
- Safely perform a rolling upgrade.
Built for Automation
Getting Started
Quick Start
A local Critical Stack cluster can be set up using cinder with one easy command:
$ cinder create cluster
This quickly creates a ready-to-use Kubernetes cluster running completely within a single Docker container.
Cinder, or Crit-in-Docker, can be useful for developing on Critical Stack clusters locally, or simply to learn more about Crit. You can read more about requirements, configuration, etc over in the Cinder Guide.
Running in Production
Setting up a production Kubernetes cluster requires quite a bit of planning and configuration, and many considerations influence the way a cluster should be configured. When starting a new cluster or settling on a standard cluster configuration, one should consider the following:
- Where will it be running? (e.g. AWS, GCP, bare-metal, etc)
- What level of resiliency is required?
- This concerns how the cluster deals with faults; depending on factors like the colocation of etcd, failure modes can become more complicated.
- What will provide out-of-band storage for cluster secrets?
- This applies mostly to the initial cluster secrets, the Kubernetes and Etcd CA cert/key pairs.
- What kind of applications will run on the cluster?
- What cost-based factors are there?
- What discovery mechanisms are available for new nodes?
- Are there specific performance requirements that affect the infrastructure being used?
The Crit Guide and the accompanying Security Guide exist to help answer these questions and provide general guidance for setting up a typical Kubernetes cluster to meet various use cases.
In particular, a few good places to start planning your Kubernetes cluster:
Crit Guide
This guide will take you through all of the typical configuration use cases that may come up when creating a new Kubernetes cluster.
- System Requirements
- Installation
- Configuration
- Container Runtimes
- Running Etcd
- Control Plane Sizing
- Kubernetes Certificates
- Bootstrapping a Worker
- Configuring Control Plane Components
- Running crit up
- Installing a CNI
- Installing a Storage Driver
- Configuring Authentication
- Kubelet Settings
- Exposing Cluster DNS
System Requirements
Exact system requirements depend upon many factors; however, for the most part, any relatively modern Linux operating system will fit the bill.
- Linux kernel >= 4.9.17
- systemd
- iptables (optional)
Newer versions of the kernel enable cilium's kube-proxy replacement feature, which removes the need to deploy kube-proxy (and therefore the need for iptables).
Dependencies
- kubelet >= 1.14.x
- containerd >= 1.2.6
- CNI >= 0.7.5
References
- https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#cni
- https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#support-hostport
- https://docs.cilium.io/en/v1.6/gettingstarted/cni-chaining-portmap/#portmap-hostport
Installation
Install from helper script
Run the following in your terminal to download the latest version of crit:
curl -sSfL https://get.crit.sh | sh
By default, the latest release version is installed. Set environment variables to install a different version, or to install to a different destination:
curl -sSfL https://get.crit.sh | VERSION=1.0.8 INSTALL_DIR=$HOME/bin sh
Install From Packagecloud.io
Debian/Ubuntu:
curl -sL https://packagecloud.io/criticalstack/public/gpgkey | apt-key add -
apt-add-repository https://packagecloud.io/criticalstack/public/ubuntu
apt-get install -y criticalstack e2d
Fedora:
dnf config-manager --add-repo https://packagecloud.io/criticalstack/public/fedora
dnf install -y criticalstack e2d
Install from GH releases
Download a binary release from https://github.com/criticalstack/crit/releases/latest suitable for your system and then install, for example:
curl -sLO https://github.com/criticalstack/crit/releases/download/v1.0.1/crit_1.0.1_Linux_x86_64.tar.gz
tar xzf crit_1.0.1_Linux_x86_64.tar.gz
mv crit /usr/local/bin/
Please note, installing from a GH release will not automatically install the systemd kubelet drop-in:
mkdir -p /etc/systemd/system/kubelet.service.d
curl -sLO https://raw.githubusercontent.com/criticalstack/crit/master/build/package/20-crit.conf
mv 20-crit.conf /etc/systemd/system/kubelet.service.d/
systemctl daemon-reload
Configuration
Configuration is passed to Crit via YAML and is separated into two different types: ControlPlaneConfiguration and WorkerConfiguration. The concerns are split between these two configs for any given node, and each contains a NodeConfiguration that specifies node-specific settings such as the kubelet, networking, etc.
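For example, a control plane config and a corresponding worker config might look like the following (the values are placeholders; the individual fields are covered in the sections below):

apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
controlPlaneEndpoint: "example.com:6443"
node:
  kubernetesVersion: 1.17.3

and for the worker:

apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
controlPlaneEndpoint: "example.com:6443"
bootstrapToken: abcdef.0123456789abcdef
caCert: /etc/kubernetes/pki/ca.crt
node:
  kubernetesVersion: 1.17.3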
Embedded ComponentConfigs
The ComponentConfigs are part of an ongoing effort to make configuration of Kubernetes components (API server, kubelet, etc.) more dynamic by expressing it directly through Kubernetes API types. Crit embeds these ComponentConfigs when available, since they greatly simplify taking user configuration and transforming it into Kubernetes component configuration.
Currently, only the kube-proxy and kubelet ComponentConfigs are ready to be used, but more are being worked on and will be adopted by Crit as other components begin supporting configuration from file.
Runtime Defaults
Some configuration defaults are set at the time of running crit up. These mostly include settings that are based upon the host that is running the command, such as the hostname.
If left unset, the controlPlaneEndpoint value will be set to the IPv4 address of the host. In the case that there are multiple network interfaces, the first non-loopback interface is used.
The default directory for Kubernetes files is /etc/kubernetes, and any paths to manifests, certificates, etc. are derived from it.
Etcd is also configured presuming that mTLS is used and that the etcd nodes are colocated with the Kubernetes control plane components, effectively making this the default configuration:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
etcd:
  endpoints:
    - "https://${controlPlaneEndpoint.Host}:2379"
  caFile: /etc/kubernetes/pki/etcd/ca.crt
  caKey: /etc/kubernetes/pki/etcd/ca.key
  certFile: /etc/kubernetes/pki/etcd/client.crt
  keyFile: /etc/kubernetes/pki/etcd/client.key
The CA certificate required for the worker to validate the cluster it's joining is also derived from the default Kubernetes configuration directory:
apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
caCert: /etc/kubernetes/pki/ca.crt
Container Runtimes
If interested in a comprehensive deep-dive into all things container runtime, Capital One has a great blog post going into the history and current state of container runtimes: A Comprehensive Container Runtime Comparison.
Containerd
Containerd is a robust and easy-to-use container runtime. It has a proven track record of reliability, and is the container runtime we use for many Critical Stack installations.
Docker
Docker is more than just a container runtime, and actually utilizes containerd internally.
CRI-O
Running Etcd
Crit requires a connection to etcd to coordinate the bootstrapping process. The etcd cluster does not have to be colocated on the node. For bootstrapping and managing etcd, we prefer using our own e2d tool. It embeds etcd and combines it with the hashicorp/memberlist gossip network to manage etcd membership.
Control Plane Sizing
External Etcd
When dealing with etcd running external to the Kubernetes control plane components, there are not a lot of restrictions on how many control plane nodes one can have. There can be any number of nodes that meet demand and availability needs, and can even be auto-scaled. With that said, however, the performance of Kubernetes is tied heavily to the performance of etcd, so more nodes does not mean more performance.
Colocated Etcd
Colocation of etcd, or "stacked etcd" (as it's referred to in the Kubernetes documentation), is the practice of installing etcd alongside the Kubernetes control plane components (kube-apiserver, kube-controller-manager, and kube-scheduler). This has some obvious benefits like reducing cost by reducing the virtual machines needed, but introduces a lot of complexity and restrictions.
Etcd's performance goes down the more nodes that are added, because more members are required to vote to commit to the raft log, so there should never be more than 5 voting members in a cluster (unless performing a rolling upgrade). Also, the number of members should always be odd to help protect against the split-brain problem. This means that the control plane can only safely be made up of 1, 3, or 5 nodes.
Etcd also should not be scaled up or down (at least, at this time). The reason is that the etcd cluster is put at risk each time there is a membership change, which also means that the control plane size needs to be selected ahead of time and not altered.
General Recommendations
In cloud environments, 3 is a good size to balance resiliency and performance. The reasoning here is that cloud environments provide ways to quickly automate replacing failed members, so losing a node does not put etcd in danger of losing quorum for long before a new node replaces it. As etcd moves towards adding more functionality around the learner member type, this will also open up the possibility of having a "hot spare" ready to take the place of a failed member immediately.
For bare-metal, 5 is a good size to ensure that failed nodes have more time to be replaced since a new node might need to be physically allocated.
Kubernetes Certificates
Generating Certificates
For an overview of the certificates Kubernetes requires and how they are used, see here.
Cluster CA
To generate the cluster CA and private key:
crit certs init --cert-dir /etc/kubernetes/pki
Certificates for Etcd
Etcd certificates can be generated using our e2d tool. See e2d pki.
Certificates and Kubeconfigs for Kubernetes Components
The following certificates and kubeconfigs can be created with crit. See the crit up command.
/etc/kubernetes/
├── admin.conf
├── controller-manager.conf
├── kubelet.conf
├── pki
│   ├── apiserver-healthcheck-client.crt
│   ├── apiserver-healthcheck-client.key
│   ├── apiserver-kubelet-client.crt
│   ├── apiserver-kubelet-client.key
│   ├── apiserver.crt
│   ├── apiserver.key
│   ├── auth-proxy-ca.crt
│   ├── auth-proxy-ca.key
│   ├── ca.crt
│   ├── ca.key
│   ├── front-proxy-ca.crt
│   ├── front-proxy-ca.key
│   ├── front-proxy-client.crt
│   ├── front-proxy-client.key
│   ├── sa.key
│   └── sa.pub
└── scheduler.conf
Managing Certificates
Check Certificate Expiration
You can use the crit certs list command to check when certificates expire:
$ crit certs list
Certificate Authorities:
========================
Name             CN              Expires   NotAfter
ca               kubernetes      9y        2030-09-27T01:45:12Z
front-proxy-ca   front-proxy-ca  9y        2030-09-27T16:36:08Z
Certificates:
=============
Name                          CN                              Expires   NotAfter
apiserver                     kube-apiserver                  364d      2021-09-29T23:54:16Z
apiserver-kubelet-client      kube-apiserver-kubelet-client   364d      2021-09-29T23:54:16Z
apiserver-healthcheck-client  system:basic-info-viewer        364d      2021-09-29T23:54:16Z
front-proxy-client            front-proxy-client              364d      2021-09-29T23:54:17Z
Rotating Certificates
There are several different solutions pertaining to certificate rotation. The appropriate solution greatly depends on an organization's use case. Some things to consider:
- Does certificate rotation need to integrate with an organization's existing certificate infrastructure?
- Can certificate approval and signing be automated, or does it require a cluster administrator?
- How often do certificates need to be rotated?
- How many clusters need to be supported?
Rotating with Crit
Certificates can be renewed with crit certs renew. Note, this does not renew the CA.
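For example, using the flags shown in the command reference, a dry run can be performed before actually renewing:

# show which certificates would be renewed without writing anything
crit certs renew --cert-dir /etc/kubernetes/pki --dry-run
# renew the certificates in place
crit certs renew --cert-dir /etc/kubernetes/pki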
Rotating with the Kubernetes certificates API
Kubernetes provides a Certificate API that can be used to provision certificates using certificate signing requests.
Kubelet Certificate
The kubelet certificate can be automatically renewed using the Kubernetes certificates API.
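A sketch of what this can look like through the embedded kubelet ComponentConfig, assuming the standard rotateCertificates and serverTLSBootstrap fields are passed through node.kubelet as in the other examples (serving certificates additionally require the kubelet CSRs to be approved):

apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
node:
  kubelet:
    # rotate the kubelet client certificate as it approaches expiration
    rotateCertificates: true
    # request the kubelet serving certificate via the certificates API
    serverTLSBootstrap: true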
Advanced Certificate Rotation
Organizations that require an automated certificate rotation solution that integrates with existing certificate infrastructure should consider projects like cert-manager.
Bootstrapping a Worker
There are two available options for bootstrapping new worker nodes:
Bootstrap Token
Crit supports a worker bootstrap flow using bootstrap tokens and the cluster CA certificate (e.g. /etc/kubernetes/pki/ca.crt):
apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
bootstrapToken: abcdef.0123456789abcdef
caCert: /etc/kubernetes/pki/ca.crt
controlPlaneEndpoint: mycluster.domain
node:
  cloudProvider: aws
  kubernetesVersion: 1.17.3
This method is adapted from the kubeadm join workflow, but uses the full CA certificate instead of using CA pinning. It also does not depend upon clients getting a signed configmap, and therefore does not require anonymous auth to be turned on.
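For example, a token can be generated and registered on an existing control plane node with the crit subcommands from the command reference (a sketch; it assumes the generated token is printed to stdout):

# generate a random bootstrap token
TOKEN=$(crit generate token)
# create the bootstrap token resource in the cluster
crit create token "$TOKEN"
# the token is then placed in the WorkerConfiguration of the joining node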
Bootstrap Server
Experimental
The bootstrap protocol used by Kubernetes/kubeadm relies on operations that imply manual work to be performed, in particular, the bootstrap token creation and how that is distributed to new worker nodes. Crit introduces a new bootstrap protocol that tries to work better in environments that are completely automated.
A bootstrap-server static pod is created alongside the Kubernetes components that run on each control plane node. This provides a service to new nodes, before they have joined the cluster, that allows them to be authorized and given a bootstrap token. This also has the benefit of making the bootstrap token expiration very small, greatly limiting the window in which it can be used.
Configuration
Here is an example of using the Amazon Instance Identity Document with signature verification, while also limiting the accounts for which bootstrap tokens will be issued:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
critBootstrapServer:
  cloudProvider: aws
  extraArgs:
    filters: account-id=${account_id}
Override bootstrap-server default port:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
critBootstrapServer:
  extraArgs:
    port: 8080
Authorizers
AWS
The AWS authorizer uses Instance Identity Documents and RSA SHA 256 signature verification to confirm the identity of new nodes requesting bootstrap tokens.
Configuring Control Plane Components
See here for a complete list of available configuration options.
Control plane endpoint
The control plane endpoint is the address (IP or DNS), along with an optional port, that represents the control plane. It is effectively the API server address; however, it is internally used for a few other purposes, such as:
- Discovering other services using the host (e.g. bootstrap-server)
- Adding to the SAN for generated cluster certificates
It is specified in the config file like so:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
controlPlaneEndpoint: "example.com:6443"
Disable/Enable Kubernetes Feature Gates
Setting feature gates is important if you need specific features that are not available by default, or to enable a feature that wasn't enabled by default for a particular version of Kubernetes.
For example, CSI-related features were only enabled by default starting with version 1.17, so for older versions of Kubernetes you will need to turn them on manually for the control plane:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
...
kubeAPIServer:
  featureGates:
    CSINodeInfo: true
    CSIDriverRegistry: true
    CSIBlockVolume: true
    VolumeSnapshotDataSource: true
node:
  kubelet:
    featureGates:
      CSINodeInfo: true
      CSIDriverRegistry: true
      CSIBlockVolume: true
and for the workers:
apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
...
node:
  kubelet:
    featureGates:
      CSINodeInfo: true
      CSIDriverRegistry: true
      CSIBlockVolume: true
The kubeAPIServer, kubeControllerManager, kubeScheduler, and kubelet all have feature gates that can be configured. More info is available in the Kubernetes docs.
Configuring Pod/Service Subnets
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
podSubnet: "10.153.0.0/16"
serviceSubnet: "10.154.0.0/16"
Configuring a Cloud Provider
A cloud provider can be specified to integrate with the underlying infrastructure provider. Note, the specified cloud will most likely require authentication/authorization to access its APIs.
Crit supports both in-tree and out-of-tree cloud providers.
In-tree Cloud Provider
In-tree cloud providers can be specified with the following:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
...
node:
  cloudProvider: aws
and for the workers:
apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
...
node:
  cloudProvider: aws
Out-of-tree Cloud Provider
Out-of-tree cloud providers can be specified with the following:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
...
node:
  kubeletExtraArgs:
    cloud-provider: external
and for the workers:
apiVersion: crit.sh/v1alpha2
kind: WorkerConfiguration
...
node:
  kubeletExtraArgs:
    cloud-provider: external
A manifest specific to cloud environment must then be applied to run the external cloud controller manager.
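For example (a sketch only; the actual manifest comes from the provider's cloud-controller-manager project, such as kubernetes/cloud-provider-aws for AWS, and varies by provider and release):

kubectl apply -f cloud-controller-manager.yaml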
Running crit up
Depending on the provided config, crit up will either provision a Control Plane Node or a Worker Node.
Running crit up with a control plane configuration performs the following steps:
Step | Description |
---|---|
ControlPlanePreCheck | Validate configuration |
CreateOrDownloadCerts | Generate CAs; if already present, don't overwrite |
CreateNodeCerts | Generate certificates for Kubernetes components; if already present, don't overwrite |
StopKubelet | Stop the kubelet using systemd |
WriteKubeConfigs | Generate control plane kubeconfigs and the admin kubeconfig |
WriteKubeletConfigs | Write kubelet settings |
StartKubelet | Start Kubelet using systemd |
WriteKubeManifests | Write static pod manifests for control plane |
WaitClusterAvailable | Wait for the control plane to be available |
WriteBootstrapServerManifest [optional] | Write the crit bootstrap-server pod manifest |
DeployCoreDNS | Deploy CoreDNS after cluster is available |
DeployKubeProxy | Deploy KubeProxy |
EnableCSRApprover | Add RBAC to allow the csrapprover to bootstrap nodes |
MarkControlPlane | Add taint to control plane node |
UploadInfo | Upload crit config map that holds info regarding the cluster |
Running crit up with a worker configuration performs the following steps:
Step | Description |
---|---|
WorkerPreCheck | Validate configuration |
StopKubelet | Stop the kubelet using systemd |
WriteBootstrapKubeletConfig | Write kubelet bootstrap kubeconfig |
WriteKubeletConfigs | Write kubelet settings |
StartKubelet | Start Kubelet using systemd |
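In both cases the command itself is the same; only the kind of the provided configuration differs. For example (the flags are from the command reference; the config path is just an example):

# bootstrap the node described by the given configuration (defaults to ./config.yaml)
crit up -c config.yaml --timeout 20m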
Installing a CNI
The CNI can be installed at any point after a node bootstraps (i.e. after crit up finishes successfully). For example, installing cilium via helm looks something like this:
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --namespace kube-system \
--version 1.8.2
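If the kernel supports Cilium's kube-proxy replacement (see System Requirements), it can be enabled with additional chart values. A hedged sketch (value names vary between Cilium chart versions, e.g. the 1.8 chart prefixes them with global., so consult the Cilium docs for the release being installed):

helm install cilium cilium/cilium --namespace kube-system \
  --version 1.8.2 \
  --set global.kubeProxyReplacement=strict \
  --set global.k8sServiceHost=<control-plane-endpoint-host> \
  --set global.k8sServicePort=6443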
Installing a Storage Driver
helm repo add criticalstack https://charts.cscr.io/criticalstack
kubectl create namespace local-path-storage
helm install local-path-storage criticalstack/local-path-provisioner \
--namespace local-path-storage \
--set nameOverride=local-path-storage \
--set storageClass.defaultClass=true
Install the AWS EBS CSI driver (https://github.com/kubernetes-sigs/aws-ebs-csi-driver) via helm:
helm repo add criticalstack https://charts.cscr.io/criticalstack
helm install aws-ebs-csi-driver criticalstack/aws-ebs-csi-driver \
--set enableVolumeScheduling=true \
--set enableVolumeResizing=true \
--set enableVolumeSnapshot=true \
--version 0.3.0
Setting a Default StorageClass
kubectl apply -f - <<EOT
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-sc
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  csi.storage.k8s.io/fstype: xfs
  type: io1
  iopsPerGB: "50"
  encrypted: "true"
EOT
Configuring Authentication
Configure the Kubernetes API Server
The Kubernetes API server can be configured with OpenID Connect to use an existing OpenID identity provider. It can only trust a single issuer, and until the API server can be configured with ComponentConfigs, the settings must be specified in the Crit config as command-line arguments:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraArgs:
    oidc-issuer-url: "https://accounts.google.com"
    oidc-client-id: critical-stack
    oidc-username-claim: email
    oidc-groups-claim: groups
The above configuration will allow the API server to use Google as its identity provider, but with some major limitations:
- Kubernetes does not act as a client for the issuer
- Does not provide a way to manage the lifecycle of OpenID Connect tokens
This can be best understood by looking at the Kubernetes authentication documentation for OpenID Connect Tokens. The process of getting a token happens completely outside the context of the Kubernetes cluster, and the token is passed in as an argument to kubectl commands.
Using an In-cluster Identity Provider
Given the limitations mentioned above, many run their own identity providers inside of the cluster to provide additional auth features to the cluster. This complicates configuration, however, since the API server will either have to be reconfigured and restarted, or will need to be configured with an issuer that is not yet running.
So what if you want to provide a web interface that leverages this authentication? Given the limitations mentioned above, you would have to write authentication logic for the specific upstream identity provider into your application, and should the upstream identity provider change, so does the authentication logic AND the API server configuration. This is where identity providers, such as Dex, come in. Dex uses OpenID Connect to provide authentication for other applications by acting as a shim between the client app and the upstream provider. When using Dex, the oidc-issuer-url argument needs to target the expected address of Dex running in the cluster, so something like:
oidc-issuer-url: "https://dex.kube-system.svc.cluster.local:5556"
It is OK that Dex isn't running yet; the API server will function as normal until the issuer is available.
The auth-proxy CA
The API server uses the host's root CAs by default, but in the case where an application might not be using a CA-signed certificate, like during development or automated testing, Crit generates an additional CA that is already available in the API server certs volume. This helps with the chicken/egg problem of needing to specify a CA file when bootstrapping a new cluster before the application has been deployed. To use this auth-proxy CA, just add this to the API server configuration:
oidc-ca-file: /etc/kubernetes/pki/auth-proxy-ca.crt
Please note that this assumes the default Kubernetes directory (/etc/kubernetes) is being used. From here there are many options to make use of the auth-proxy CA. For example, cert-manager can be installed and the auth-proxy CA can be set up as a ClusterIssuer:
# install cert-manager
kubectl create namespace cert-manager
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.14.0/cert-manager.yaml
# add auth-proxy-ca secret to be used as ClusterIssuer
kubectl -n cert-manager create secret generic auth-proxy-ca --from-file=tls.crt=/etc/kubernetes/pki/auth-proxy-ca.crt --from-file=tls.key=/etc/kubernetes/pki/auth-proxy-ca.key
# wait for cert-manager-webhook readiness
while [[ $(kubectl -n cert-manager get pods -l app=webhook -o 'jsonpath={..status.conditions[?(@.type=="Ready")].status}') != "True" ]]; do echo "waiting for pod" && sleep 1; done
kubectl apply -f - <<EOT
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: auth-proxy-ca
  namespace: cert-manager
spec:
  ca:
    secretName: auth-proxy-ca
EOT
Then applications can create cert-manager certificates for their application to use:
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: myapp-example
spec:
  secretName: myapp-certs
  duration: 8760h # 365d
  renewBefore: 360h # 15d
  organization:
    - Internet Widgits Pty Ltd
  isCA: false
  keySize: 2048
  keyAlgorithm: rsa
  keyEncoding: pkcs1
  usages:
    - server auth
    - client auth
  dnsNames:
    - myapp.example.com
  issuerRef:
    name: auth-proxy-ca
    kind: ClusterIssuer
Of course, this is just one possible way to approach authentication, and configuration will vary greatly depending upon the needs of the application(s) running on the cluster.
Kubelet Settings
Disable swap for Linux-based Operating Systems
Swap must be disabled for the kubelet to work (see here). This is a helpful systemd unit to ensure that swap is disabled on a system:
[Unit]
After=local-fs.target
[Service]
ExecStart=/sbin/swapoff -a
[Install]
WantedBy=multi-user.target
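Saved as, for example, /etc/systemd/system/disable-swap.service (the unit name is arbitrary), it can be enabled with:

systemctl daemon-reload
systemctl enable --now disable-swap.service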
Reserving Resources
Reserving some resources for the system to use is often very helpful to ensure that resource-hungry pods don't kill the system by causing it to run out of memory.
...
node:
  kubelet:
    kubeReserved:
      cpu: 128m
      memory: 64Mi
    kubeReservedCgroup: /podruntime.slice
    kubeletCgroups: /podruntime.slice
    systemReserved:
      cpu: 128m
      memory: 192Mi
    systemReservedCgroup: /system.slice
# /etc/systemd/system/kubelet.service.d/10-cgroup.conf
# Sets the cgroup for the kubelet service
[Service]
CPUAccounting=true
MemoryAccounting=true
Slice=podruntime.slice
# /etc/systemd/system/containers.slice
# Creates a cgroup for containers
[Unit]
Description=Grouping resources slice for containers
Documentation=man:systemd.special(7)
DefaultDependencies=no
Before=slices.target
Requires=-.slice
After=-.slice
# /etc/systemd/system/podruntime.slice
# Creates a cgroup for kubelet
[Unit]
Description=Limited resources slice for Kubelet service
Documentation=man:systemd.special(7)
DefaultDependencies=no
Before=slices.target
Requires=-.slice
After=-.slice
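After writing these units and drop-ins, reload systemd and restart the kubelet so the new slices and cgroup settings take effect:

systemctl daemon-reload
systemctl restart kubelet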
Exposing Cluster DNS
Replace Systemd-resolved With Dnsmasq
Sometimes systemd-resolved, the default stub resolver for many Linux systems, needs to be replaced with dnsmasq. This dnsmasq systemd drop-in is useful to ensure that systemd-resolved is not running when the dnsmasq service is started:
# /etc/systemd/system/dnsmasq.service.d/10-resolved-fix.conf
[Unit]
After=systemd-resolved.service
[Service]
ExecStartPre=/bin/systemctl stop systemd-resolved.service
ExecStartPost=/bin/systemctl start systemd-resolved.service
It works by allowing systemd-resolved to start, but stopping it once the dnsmasq service is started. This is helpful because it doesn't require changing any of the systemd-resolved specific settings but allows the dnsmasq service to be enabled/disabled when desired.
Forwarding Cluster-bound DNS on the Host
A reason why one might want to use something like dnsmasq, instead of systemd-resolved, is to expose the cluster DNS to the host. This would allow resolution of DNS for service and pod subnets from the host that is running the Kubernetes components. It only requires adding this dnsmasq configuration drop-in:
# /etc/dnsmasq.d/kube.conf
server=/cluster.local/10.254.0.10
This tells dnsmasq to forward any DNS queries it receives that end in the cluster domain to the Kubernetes cluster DNS, CoreDNS. In this case, it presumes that the default cluster domain (cluster.local) and services subnet have been configured. The address of CoreDNS is chosen automatically based upon the services subnet, so if the services subnet is 10.254.0.0/16 (the default), CoreDNS will be listening at 10.254.0.10.
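Once dnsmasq has been restarted with this configuration, cluster DNS resolution can be checked from the host, for example (assuming the defaults above):

# resolve an in-cluster service through dnsmasq on the host
dig +short kubernetes.default.svc.cluster.local @127.0.0.1
# or query CoreDNS directly
dig +short kubernetes.default.svc.cluster.local @10.254.0.10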
Specifying the resolv.conf
By default, the host's /etc/resolv.conf is used, but a different resolv.conf file can be set in the kubelet:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
node:
  kubelet:
    resolvConf: /other/resolv.conf
Crit attempts to determine if systemd-resolved is running and, if so, dynamically sets resolvConf to /run/systemd/resolve/resolv.conf.
Security Guide
This guide will take you through configuring the security features of Kubernetes, as well as features specific to Crit. It also includes generally helpful information and gotchas to look out for when creating a new cluster.
- Encrypting Kubernetes Secrets
- Enabling Pod Security Policies
- Audit Policy Logging
- Disabling Anonymous Authentication
- Kubelet Server Certificate
- Encrypting Shared Cluster Files
Encrypting Kubernetes Secrets
EncryptionProviderConfig
To encrypt secrets within the cluster you must create an EncryptionConfiguration manifest and pass it to the API server.
touch /etc/kubernetes/encryption-config.yaml
chmod 600 /etc/kubernetes/encryption-config.yaml
cat <<-EOT > /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: $(cat /etc/kubernetes/pki/etcd/ca.key | md5sum | cut -f 1 -d ' ' | head -c -1 | base64)
      - identity: {}
EOT
This EncryptionConfiguration uses the aescbc provider for encrypting secrets. Details on other providers, including third-party key management systems, can be found in the official Kubernetes documentation. The API server must then be configured to mount this file and point to it via the encryption-provider-config flag:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraArgs:
    encryption-provider-config: /etc/kubernetes/encryption-config.yaml
  extraVolumes:
    - name: encryption-config
      hostPath: /etc/kubernetes/encryption-config.yaml
      mountPath: /etc/kubernetes/encryption-config.yaml
      readOnly: true
Once the API server is available, verify that new secrets are encrypted.
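One way to check, assuming etcdctl is available on a control plane node and using the etcd client certificates shown earlier, is to create a secret and read it back directly from etcd; the stored value should be prefixed with k8s:enc:aescbc:v1: rather than appearing as plaintext:

kubectl create secret generic test-secret --from-literal=foo=bar
ETCDCTL_API=3 etcdctl \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/client.crt \
  --key /etc/kubernetes/pki/etcd/client.key \
  get /registry/secrets/default/test-secret | hexdump -C | head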
Enabling Pod Security Policies
What is a Pod Security Policy
Pod Security Policies are in-cluster Kubernetes resources that provide ways of securing pods. The Pod Security Policy page of the official Kubernetes docs provides a great deal of helpful information and a walkthrough of how to use them, and is highly recommended reading. For the purposes of this documentation, we really just want to focus on getting them running on your Crit cluster.
Configuration
The API server has quite a few admission plugins enabled by default; however, the PodSecurityPolicy plugin must be enabled when configuring the API server with the enable-admission-plugins option:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraArgs:
    enable-admission-plugins: PodSecurityPolicy
enable-admission-plugins can be provided a comma-delimited list of admission plugins to enable. While the order in which admission plugins run does matter, it does not matter for this particular option, as it simply enables the plugin.
The admission plugin SecurityContextDeny must NOT be enabled along with PodSecurityPolicy. In the case that PodSecurityPolicy is enabled, it completely supplants the functionality provided by SecurityContextDeny.
Pod Security Policy Examples
Crit embeds two Pod Security Policies that provide a good starting place for configuring PSPs in your cluster. They were adapted from the examples provided in the Kubernetes docs and can be found on GitHub here, or can be printed to the console using crit template with the desired file:
$ crit template psp-privileged.yaml
Privileged Pod Security Policy
# psp-privileged.yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: privileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
spec:
  privileged: true
  allowPrivilegeEscalation: true
  allowedCapabilities:
    - '*'
  volumes:
    - '*'
  hostNetwork: true
  hostPorts:
    - min: 0
      max: 65535
  hostIPC: true
  hostPID: true
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: psp:privileged
rules:
  - apiGroups: ['policy']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames:
      - privileged
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: psp:privileged
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: psp:privileged
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: system:serviceaccounts:kube-system
  - kind: Group
    name: system:serviceaccounts:kube-node-lease
    apiGroup: rbac.authorization.k8s.io
  - kind: Group
    name: system:serviceaccounts:kube-public
    apiGroup: rbac.authorization.k8s.io
  - kind: Group
    name: system:serviceaccounts:default
    apiGroup: rbac.authorization.k8s.io
  - kind: Group
    name: system:nodes
    apiGroup: rbac.authorization.k8s.io
  - kind: User
    apiGroup: rbac.authorization.k8s.io
    # Legacy node ID
    name: kubelet
Restricted Pod Security Policy
# psp-restricted.yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: default-cluster-restricted
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default,runtime/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'runtime/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  # This is redundant with non-root + disallow privilege escalation,
  # but we can provide it for defense in depth.
  requiredDropCapabilities:
    - ALL
  # Allow core volume types.
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    # Assume that persistentVolumes set up by the cluster admin are safe to use.
    - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    # Require the container to run without root privileges.
    rule: 'MustRunAsNonRoot'
  seLinux:
    # This policy assumes the nodes are using AppArmor rather than SELinux.
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  readOnlyRootFilesystem: false
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: psp:restricted
rules:
  - apiGroups: ['policy']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames:
      - default-cluster-restricted
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: psp:restricted
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: psp:restricted
subjects:
  # Authorize all service accounts in a namespace:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: system:serviceaccounts
  # Or equivalently, all authenticated users in a namespace:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: system:authenticated
Audit Policy Logging
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraArgs:
    audit-policy-file: "/etc/kubernetes/audit-policy.yaml"
    audit-log-path: "/var/log/kubernetes/kube-apiserver-audit.log"
    audit-log-maxage: "30"
    audit-log-maxbackup: "10"
    audit-log-maxsize: "100"
  extraVolumes:
    - name: apiserver-logs
      hostPath: /var/log/kubernetes
      mountPath: /var/log/kubernetes
      readOnly: false
      hostPathType: Directory
    - name: apiserver-audit-config
      hostPath: /etc/kubernetes/audit-policy.yaml
      mountPath: /etc/kubernetes/audit-policy.yaml
      readOnly: true
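The configuration above expects an audit policy at /etc/kubernetes/audit-policy.yaml. A minimal example policy that logs request metadata for everything might look like:

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata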
Disabling Anonymous Authentication
The API server defaults to allowing anonymous auth, meaning that incoming requests that are not authenticated will be implicitly given the username system:anonymous and be part of the system:unauthenticated group. While this user may not have permission to do anything, problems related to allowing anonymous authentication are still possible, such as vulnerabilities like the "Billion Laughs" attack.
Disabling anonymous authentication only requires passing an argument to the API server:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
kubeAPIServer:
  extraArgs:
    anonymous-auth: "false"
API Server Healthchecks
Liveness probes will fail for the API server static pod should anonymous-auth be set to false. Crit addresses this by detecting when --anonymous-auth has been disabled and adding a special healthcheck-proxy sidecar to the apiserver static pod. It acts as a reverse proxy, with the frontend effectively accepting anonymous traffic and the backend using an authenticated user. The backend connection is established with the built-in system:basic-info-viewer user to limit the auth to only being able to look at health and version information.
Kubelet Server Certificate
Encrypting Shared Cluster Files
The PKI shared by all control plane nodes is distributed via etcd/e2d using e2db, an ORM-like abstraction over etcd. These files should be protected using strong encryption, and e2db provides a feature for encrypting entire tables. The one requirement is that the etcd CA key is provided in the crit configuration:
apiVersion: crit.sh/v1alpha2
kind: ControlPlaneConfiguration
etcd:
  endpoints:
    - "https://${controlPlaneEndpoint.Host}:2379"
  caFile: /etc/kubernetes/pki/etcd/ca.crt
  caKey: /etc/kubernetes/pki/etcd/ca.key
  certFile: /etc/kubernetes/pki/etcd/client.crt
  keyFile: /etc/kubernetes/pki/etcd/client.key
where the important file here is ca.key, since it is the only one suitable to use as a data encryption key.
Cinder Guide
This guide will take you through installing and using Cinder.
What is Cinder
Cinder, or Crit-in-Docker, is very similar to kind. In fact, it uses many packages from kind under the hood, along with the base container image that makes it all work. Think of cinder as a flavor of kind (kind is quite good, to say the least). Just like kind, cinder won't work on all platforms; right now it only supports amd64 architectures running macOS and Linux, and requires a running Docker daemon.
Cinder bootstraps each node with Crit and installs several helpful additional components, such as the machine-api and machine-api-provider-docker.
Using Cinder to Develop Crit
# dev.yaml
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
files:
  - path: "/usr/bin/crit"
    owner: "root:root"
    permissions: "0755"
    encoding: hostpath
    content: bin/crit
❯ make crit
❯ cinder create cluster -c dev.yaml
Creating cluster "cinder" ...
🔥 Generating certificates
🔥 Creating control-plane node
🔥 Installing CNI
🔥 Installing StorageClass
🔥 Running post-up commands
Set kubectl context to "kubernetes-admin@cinder". Prithee, be careful.
Installation
Cinder is installed alongside crit, so the same helper script can be used for installation:
curl -sSfL https://get.crit.sh | sh
Configuration
Adding Files
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
files:
  - path: "/etc/kubernetes/auth-proxy-ca.yaml"
    owner: "root:root"
    permissions: "0644"
    content: |
      apiVersion: cert-manager.io/v1alpha2
      kind: ClusterIssuer
      metadata:
        name: auth-proxy-ca
        namespace: cert-manager
      spec:
        ca:
          secretName: auth-proxy-ca
HostPath
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
kubeAPIServer:
  extraArgs:
    audit-policy-file: "/etc/kubernetes/audit-policy.yaml"
    audit-log-path: "/var/log/kubernetes/kube-apiserver-audit.log"
    audit-log-maxage: "30"
    audit-log-maxbackup: "10"
    audit-log-maxsize: "100"
  extraVolumes:
    - name: apiserver-logs
      hostPath: /var/log/kubernetes
      mountPath: /var/log/kubernetes
      readOnly: false
      hostPathType: Directory
    - name: apiserver-audit-config
      hostPath: /etc/kubernetes/audit-policy.yaml
      mountPath: /etc/kubernetes/audit-policy.yaml
      readOnly: true
files:
  - path: "/etc/kubernetes/audit-policy.yaml"
    owner: "root:root"
    permissions: "0644"
    encoding: hostpath
    content: audit-policy.yaml
Running Additional Commands
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
preCritCommands:
  - crit version
postCritCommands:
  - |
    helm repo add jetstack https://charts.jetstack.io
    helm install cert-manager jetstack/cert-manager \
      --namespace cert-manager \
      --version v0.15.1 \
      --set tolerations[0].effect=NoSchedule \
      --set tolerations[0].key="node.kubernetes.io/not-ready" \
      --set tolerations[0].operator=Exists \
      --set installCRDs=true
    kubectl rollout status -n cert-manager deployment/cert-manager-webhook -w
Updating the Containerd Configuration
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
files:
  - path: "/etc/containerd/config.toml"
    owner: "root:root"
    permissions: "0644"
    content: |
      # explicitly use v2 config format
      version = 2
      # set default runtime handler to v2, which has a per-pod shim
      [plugins."io.containerd.grpc.v1.cri".containerd]
        default_runtime_name = "runc"
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
        runtime_type = "io.containerd.runc.v2"
      # Setup a runtime with the magic name ("test-handler") used for Kubernetes
      # runtime class tests ...
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.test-handler]
        runtime_type = "io.containerd.runc.v2"
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
          endpoint = ["https://docker.io"]
Adding Volume Mounts
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
extraMounts:
  - hostPath: templates
    containerPath: /cinder/templates
    readOnly: true
Forwarding Ports to the Host
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
extraPortMappings:
  - containerPort: 2379
    hostPort: 2379
Features
Side-loading Images
Kind allows you to side-load images into your local clusters. Cinder exposes the same functionality via cinder load:
cinder load criticalstack/quake-kube:v1.0.5
This will make the criticalstack/quake-kube:v1.0.5 image from the host available in the Cinder node. Any image that is available on the host can be loaded, and Cinder lazily pulls images that are not found on the host.
Registry Mirrors
Mirrors for container image registries can be set up to effectively "alias" them. The key is the alias, and the value is the full endpoint for the registry:
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
registryMirrors:
  docker.io: "https://docker.io"
It can be used to alias registries with different names OR it can be used to specify plain http registries:
...
registryMirrors:
  myregistry.dev: "http://myregistry.dev"
Local Registry
An instance of Distribution (aka Docker Registry v2) can be set up for a Cinder cluster by specifying a config file with the LocalRegistry feature gate:
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
featureGates:
  LocalRegistry: true
This will start a Docker container on the host with the running registry (if not already running). The registry is shared by all Cinder clusters on a host and is available at localhost:5000 (i.e. this is what you docker push to). This registry is then available inside the cluster at cinderegg:5000.
Cinder also creates the local-registry-hosting ConfigMap so that any tooling that supports Local Registry Hosting, such as Tilt, will be able to automatically discover and use the local registry.
apiVersion: v1
kind: ConfigMap
metadata:
  name: local-registry-hosting
  namespace: kube-public
data:
  localRegistryHosting.v1: |
    host: "localhost:{{ .LocalRegistryPort }}"
    hostFromContainerRuntime: "{{ .LocalRegistryName }}:{{ .LocalRegistryPort }}"
    hostFromClusterNetwork: "{{ .LocalRegistryName }}:{{ .LocalRegistryPort }}"
    help: "https://docs.crit.sh/cinder-guide/local-registry.html"
More information about this Kubernetes standard can be found here.
Krustlet
Krustlet is a tool to run WebAssembly workloads natively on Kubernetes by acting like a node in your Kubernetes cluster. It can be enabled for a Cinder cluster using the following configuration:
# config.yaml
apiVersion: cinder.crit.sh/v1alpha1
kind: ClusterConfiguration
featureGates:
  LocalRegistry: true
  Krustlet: true
controlPlaneConfiguration:
  kubeProxy:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: "kubernetes.io/arch"
                  operator: NotIn
                  values: ["wasm32-wasi", "wasm32-wascc"]
Then use this config to create a new Cinder cluster:
❯ cinder create cluster -c config.yaml
Creating cluster "cinder" ...
🔥 Generating certificates
🔥 Creating control-plane node
🔥 Installing CNI
🔥 Installing StorageClass
🔥 Running post-up commands
Set kubectl context to "kubernetes-admin@cinder". Prithee, be careful.
Note that node affinity is being set for kube-proxy to ensure it does not try to schedule a pod on either the WASI or waSCC nodes.
This will start two instances of Krustlet for both WASI and waSCC runtimes:
❯ kubectl get no
NAME           STATUS   ROLES    AGE   VERSION
cinder         Ready    master   2m    v1.18.5
cinder-wascc   Ready    <none>   1m    0.5.0
cinder-wasi    Ready    <none>   1m    0.5.0
With these nodes ready, we can build and push images to our local registry and run them on our Cinder cluster. For example, the Hello World Rust for WASI can be built using cargo and pushed to our local registry using wasm-to-oci:
❯ cargo build --target wasm32-wasi --release
❯ wasm-to-oci push --use-http \
target/wasm32-wasi/release/hello-world-rust.wasm \
localhost:5000/hello-world-rust:v0.2.0
The line in k8s.yaml specifying the image to use will need to be modified:
...
spec:
  containers:
    - name: hello-world-wasi-rust
      #image: webassembly.azurecr.io/hello-world-wasi-rust:v0.2.0
      image: cinderegg:5000/hello-world-rust:v0.2.0
...
Finally, the manifest can be applied:
❯ kubectl apply -f k8s.yaml
This will result in the pod being scheduled on the WASI Krustlet node:
❯ kubectl get po -A
NAMESPACE            NAME                                  READY   STATUS       RESTARTS   AGE
kube-system          cilium-operator-657978fb5b-frrxj      1/1     Running      0          8m4s
kube-system          cilium-pqmsc                          1/1     Running      0          8m4s
kube-system          coredns-pqljz                         1/1     Running      0          7m57s
kube-system          hello-world-wasi-rust                 0/1     ExitCode:0   0          1s
kube-system          kube-apiserver-cinder                 1/1     Running      0          8m18s
kube-system          kube-controller-manager-cinder        1/1     Running      0          8m18s
kube-system          kube-proxy-85lwd                      1/1     Running      0          8m4s
kube-system          kube-scheduler-cinder                 1/1     Running      0          8m18s
local-path-storage   local-path-storage-74cd8967f5-vv2mb   1/1     Running      0          8m4s
And should produce the following log output:
❯ kubectl logs hello-world-wasi-rust
hello from stdout!
hello from stderr!
POD_NAME=hello-world-wasi-rust
FOO=bar
CONFIG_MAP_VAL=cool stuff
Args are: []
Bacon ipsum dolor amet chuck turducken porchetta, tri-tip spare ribs t-bone ham hock. Meatloaf
pork belly leberkas, ham beef pig corned beef boudin ground round meatball alcatra jerky.
Pancetta brisket pastrami, flank pork chop ball tip short loin burgdoggen. Tri-tip kevin
shoulder cow andouille. Prosciutto chislic cupim, short ribs venison jerky beef ribs ham hock
short loin fatback. Bresaola meatloaf capicola pancetta, prosciutto chicken landjaeger andouille
swine kielbasa drumstick cupim tenderloin chuck shank. Flank jowl leberkas turducken ham tongue
beef ribs shankle meatloaf drumstick pork t-bone frankfurter tri-tip.
FAQ
Can e2d scale up (or down) after cluster initialization?
The short answer is No, because it is unsafe to scale etcd and any solution that scales etcd is increasing the chance of cluster failure. This is a feature that will be supported in the future, but it relies on new features and fixes to etcd. Some context will be necessary to explain why:
A common misconception about etcd is that it is scalable. While etcd is a distributed key/value store, the reason it is distributed is to provide distributed consensus, NOT to scale in/out for performance (or flexibility). In fact, the best-performing etcd cluster has only 1 member, and performance goes down as more members are added. In etcd v3.4, a new type of member called a learner was introduced. These are members that can receive raft log updates, but are not part of the quorum voting process. This will be an important feature for many reasons, like stability/safety and faster recovery from faults, but will also potentially[1] enable etcd clusters of arbitrary sizes.
So why not scale within the recommended cluster sizes if the only concern is performance? Previously, etcd clusters have been vulnerable to corruption during membership changes due to the way etcd implemented raft. This has only recently been addressed by incredible work from CockroachDB, and it is worth reading about the issue and the solution in this blog post: Availability and Region Failure: Joint Consensus in CockroachDB.
The last couple features needed to safely scale have been roadmapped for v3.5 and are highlighted in the etcd learner design doc:
Make learner state only and default: Defaulting a new member state to learner will greatly improve membership reconfiguration safety, because learner does not change the size of quorum. Misconfiguration will always be reversible without losing the quorum.
Make voting-member promotion fully automatic: Once a learner catches up to leader’s logs, a cluster can automatically promote the learner. etcd requires certain thresholds to be defined by the user, and once the requirements are satisfied, learner promotes itself to a voting member. From a user’s perspective, “member add” command would work the same way as today but with greater safety provided by learner feature.
Since we want to implement this feature as safely and reliably as possible, we are waiting for this confluence of features to become stable before finally implementing scaling into e2d.
[1] Only potentially, because the maximum is currently set to allow only 1 learner. There is a concern that too many learners could have a negative impact on the leader which is discussed briefly here. It is also worth noting that other features may also fulfill the same need like some kind of follower replication: etcd#11357.
Command Reference
crit
bootstrap Critical Stack clusters
Synopsis
bootstrap Critical Stack clusters
Options
-h, --help help for crit
-v, --verbose count log output verbosity
SEE ALSO
- crit certs - Handle Kubernetes certificates
- crit config - Handle Kubernetes config files
- crit create - Create Kubernetes resources
- crit generate - Utilities for generating values
- crit template - Render embedded assets
- crit up - Bootstraps a new node
- crit version - Print the version info
General Commands
crit template
Render embedded assets
Synopsis
Render embedded assets
crit template [path] [flags]
Options
-c, --config string config file (default "config.yaml")
-h, --help help for template
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit - bootstrap Critical Stack clusters
crit up
Bootstraps a new node
Synopsis
Bootstraps a new node
crit up [flags]
Options
-c, --config string config file (default "config.yaml")
-h, --help help for up
--kubelet-timeout duration timeout for Kubelet to become healthy (default 15s)
--timeout duration (default 20m0s)
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- Running crit up - useful information about the
crit up
command - crit - bootstrap Critical Stack clusters
crit version
Print the version info
Synopsis
Print the version info
crit version [flags]
Options
-h, --help help for version
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit - bootstrap Critical Stack clusters
crit certs
Handle Kubernetes certificates
Synopsis
Handle Kubernetes certificates
Options
-h, --help help for certs
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit - bootstrap Critical Stack clusters
- crit certs init - initialize a new CA
- crit certs list - list cluster certificates
- crit certs renew - renew cluster certificates
crit certs init
initialize a new CA
Synopsis
initialize a new CA
crit certs init [flags]
Options
--cert-dir string
-h, --help help for init
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit certs - Handle Kubernetes certificates
crit certs list
list cluster certificates
Synopsis
list cluster certificates
crit certs list [flags]
Options
--cert-dir string (default "/etc/kubernetes/pki")
-h, --help help for list
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit certs - Handle Kubernetes certificates
crit certs renew
renew cluster certificates
Synopsis
renew cluster certificates
crit certs renew [flags]
Options
--cert-dir string (default "/etc/kubernetes/pki")
--dry-run
-h, --help help for renew
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit certs - Handle Kubernetes certificates
crit config
Handle Kubernetes config files
Synopsis
Handle Kubernetes config files
Options
-h, --help help for config
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit - bootstrap Critical Stack clusters
- crit config import - import a kubeconfig
crit config import
import a kubeconfig
Synopsis
import a kubeconfig
crit config import [kubeconfig] [flags]
Options
-h, --help help for import
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit config - Handle Kubernetes config files
crit create
Create Kubernetes resources
Synopsis
Create Kubernetes resources
Options
-h, --help help for create
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit - bootstrap Critical Stack clusters
- crit create token - creates a bootstrap token resource
crit create token
creates a bootstrap token resource
Synopsis
creates a bootstrap token resource
crit create token [token] [flags]
Options
-h, --help help for token
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit create - Create Kubernetes resources
crit generate
Utilities for generating values
Synopsis
Utilities for generating values
Options
-h, --help help for generate
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit - bootstrap Critical Stack clusters
- crit generate hash - generates a hash of a CA certificate
- crit generate kubeconfig - generates a kubeconfig
- crit generate token - generates a bootstrap token
crit generate hash
generates a hash of a CA certificate
Synopsis
generates a hash of a CA certificate
crit generate hash [ca-cert-path] [flags]
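A minimal sketch, assuming the cluster CA certificate lives at the conventional path:
$ crit generate hash /etc/kubernetes/pki/ca.crt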
Options
-h, --help help for hash
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit generate - Utilities for generating values
crit generate kubeconfig
generates a kubeconfig
Synopsis
generates a kubeconfig
crit generate kubeconfig [filename] [flags]
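For example, writing an admin kubeconfig against a specific API server address (the address and cert directory below are assumptions; the defaults for --CN and --O shown in the options are left unchanged):
$ crit generate kubeconfig admin.conf --server https://172.31.0.10:6443 --cert-dir /etc/kubernetes/pki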
Options
--CN string (default "kubernetes-admin")
--O string (default "system:masters")
--cert-dir string (default ".")
--cert-name string (default "ca")
-h, --help help for kubeconfig
--name string (default "crit")
--server string
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit generate - Utilities for generating values
crit generate token
generates a bootstrap token
Synopsis
generates a bootstrap token
crit generate token [token] [flags]
Options
-h, --help help for token
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- crit generate - Utilities for generating values
cinder
Create local Kubernetes clusters
Synopsis
Cinder is a tool for creating and managing local Kubernetes clusters that use containers as nodes. It builds upon kind, using Crit and Cilium to configure a Critical Stack cluster locally.
Options
-h, --help help for cinder
-v, --verbose count log output verbosity
SEE ALSO
- cinder create - Create cinder resources
- cinder delete - Delete cinder resources
- cinder export - Export from local cluster
- cinder get - Get cinder resources
- cinder load - Load container images from host
- cinder version - Print the version info
General Commands
cinder load
Load container images from host
Synopsis
Load container images from host
cinder load [flags]
Options
-h, --help help for load
--name string cluster name (default "cinder")
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder - Create local Kubernetes clusters
cinder version
Print the version info
Synopsis
Print the version info
cinder version [flags]
Options
-h, --help help for version
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder - Create local Kubernetes clusters
cinder create
Create cinder resources
Synopsis
Create cinder resources, such as new cinder clusters, or add nodes to existing clusters.
cinder create [flags]
Options
-h, --help help for create
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder - Create local Kubernetes clusters
- cinder create cluster - Creates a new cinder cluster
- cinder create node - Creates a new cinder worker
cinder create cluster
Creates a new cinder cluster
Synopsis
Creates a new cinder cluster
cinder create cluster [flags]
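For example, a second cluster can be created alongside the default one using a custom configuration (the cluster name and config file below are illustrative):
$ cinder create cluster --name dev -c cinder.yaml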
Options
-c, --config string cinder configuration file
-h, --help help for cluster
--image string node image (default "criticalstack/cinder:v1.0.0-beta.1")
--kubeconfig string sets kubeconfig path instead of $KUBECONFIG or $HOME/.kube/config
--name string cluster name (default "cinder")
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder create - Create cinder resources
cinder create node
Creates a new cinder worker
Synopsis
Creates a new cinder worker
cinder create node [flags]
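For example, to add a worker to an existing cluster named dev (illustrative name):
$ cinder create node --name dev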
Options
-c, --config string cinder configuration file
-h, --help help for node
--image string node image (default "criticalstack/cinder:v1.0.0-beta.1")
--kubeconfig string sets kubeconfig path instead of $KUBECONFIG or $HOME/.kube/config
--name string cluster name (default "cinder")
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder create - Create cinder resources
cinder delete
Delete cinder resources
Synopsis
Delete cinder resources
cinder delete [flags]
Options
-h, --help help for delete
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder - Create local Kubernetes clusters
- cinder delete cluster - Deletes a cinder cluster
- cinder delete node - Deletes a cinder node
cinder delete cluster
Deletes a cinder cluster
Synopsis
Deletes a cinder cluster
cinder delete cluster [flags]
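For example, tearing down a cluster by name (illustrative name):
$ cinder delete cluster --name dev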
Options
-h, --help help for cluster
--name string cluster name (default "cinder")
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder delete - Delete cinder resources
cinder delete node
Deletes a cinder node
Synopsis
Deletes a cinder node
cinder delete node [flags]
Options
-h, --help help for node
--name string cluster name (default "cinder")
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder delete - Delete cinder resources
cinder export
Export from local cluster
Synopsis
Export from local cluster
cinder export [flags]
Options
-h, --help help for export
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder - Create local Kubernetes clusters
- cinder export kubeconfig - Export kubeconfig from cinder cluster and merge with $HOME/.kube/config
cinder export kubeconfig
Export kubeconfig from cinder cluster and merge with $HOME/.kube/config
Synopsis
Export kubeconfig from cinder cluster and merge with $HOME/.kube/config
cinder export kubeconfig [flags]
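For example, merging a named cluster's credentials into $HOME/.kube/config (illustrative name):
$ cinder export kubeconfig --name dev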
Options
-h, --help help for kubeconfig
--kubeconfig string sets kubeconfig path instead of $KUBECONFIG or $HOME/.kube/config
--name string cluster name (default "cinder")
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder export - Export from local cluster
cinder get
Get cinder resources
Synopsis
Get cinder resources
cinder get [flags]
Options
-h, --help help for get
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder - Create local Kubernetes clusters
- cinder get clusters - Get running clusters
- cinder get images - List all container images used by cinder
- cinder get ip - Get node IP address
- cinder get kubeconfigs - Get kubeconfig from cinder cluster
- cinder get nodes - List cinder cluster nodes
cinder get clusters
Get running clusters
Synopsis
Get running clusters
cinder get clusters [flags]
Options
-h, --help help for clusters
--kubeconfig string sets kubeconfig path instead of $KUBECONFIG or $HOME/.kube/config
--name string cluster name (default "cinder")
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder get - Get cinder resources
cinder get images
List all container images used by cinder
Synopsis
List all container images used by cinder
cinder get images [flags]
Options
-h, --help help for images
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder get - Get cinder resources
cinder get ip
Get node IP address
Synopsis
Get node IP address
cinder get ip [flags]
Options
-h, --help help for ip
--name string cluster name (default "cinder")
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder get - Get cinder resources
cinder get kubeconfigs
Get kubeconfig from cinder cluster
Synopsis
Get kubeconfig from cinder cluster
cinder get kubeconfigs [flags]
Options
-h, --help help for kubeconfigs
--kubeconfig string sets kubeconfig path instead of $KUBECONFIG or $HOME/.kube/config
--name string cluster name (default "cinder")
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder get - Get cinder resources
cinder get nodes
List cinder cluster nodes
Synopsis
List cinder cluster nodes
cinder get nodes [flags]
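For example, listing the nodes of the default cluster or of a named cluster (the name dev is illustrative):
$ cinder get nodes
$ cinder get nodes --name dev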
Options
-h, --help help for nodes
--name string cluster name (default "cinder")
Options inherited from parent commands
-v, --verbose count log output verbosity
SEE ALSO
- cinder get - Get cinder resources