It is very common to associate Kubernetes with terms like stateless and ephemeral. This is because, when a container is terminated, any data stored in it during its lifetime is lost. This behaviour might be acceptable in some scenarios; however, very often there is a need to persistently store data and some static configuration. As you can imagine, data persistence is an essential feature for running stateful applications such as databases and file servers.
Persistence enables data to be stored outside of the container, and this is where persistent volumes come into play. This post will explain later what a PV is but, in a nutshell, a PV is a Kubernetes resource that allows data to be retained even after the container is terminated, and also allows the data to be accessed by multiple containers or pods if needed.
On the other hand, it is important to note that, generally speaking, a Kubernetes cluster does not exist on its own but depends on some sort of underlying infrastructure. That means it would be really useful to have some kind of connector between the Kubernetes control plane and the infrastructure control plane to get the most out of both. For example, this dialogue may help the Kubernetes scheduler place pods taking into account the failure domains of the workers to achieve better application availability or, even better, when it comes to storage, the Kubernetes cluster can ask the infrastructure to use existing datastores to meet any persistent storage requirement. This concept of communication between a Kubernetes cluster and the underlying infrastructure is referred to as a Cloud Provider or Cloud Provider Interface.
In this particular scenario we are running a Kubernetes cluster on top of vSphere, and we will walk through the process of setting up the vSphere Cloud Provider. Once the vSphere Cloud Provider Interface is set up, you can take advantage of Cloud Native Storage (CNS), a vSphere built-in feature. CNS allows developers to consume storage from vSphere on demand in a fully automated fashion, while giving the storage administrator visibility into and management of container volumes from the vCenter UI. The following picture depicts a high-level diagram of the integration.
It is important to note that this article is not based on any specific Kubernetes distribution. If you are using Tanzu on vSphere, some of the installation procedures below are not necessary, as they come out of the box when enabling vSphere with Tanzu as part of an integrated solution.
Installing vSphere Container Storage Plugin
The Cloud Native Storage feature is realized by means of a Container Storage Plugin, also called a CSI driver. This driver runs in a Kubernetes cluster deployed on vSphere and is responsible for provisioning persistent volumes on vSphere datastores by interacting with the vSphere control plane (i.e. vCenter). The plugin supports both file and block volumes. Block volumes are typically used in more specialized cases where low-level access to the storage is required, such as databases or other storage-intensive workloads, whereas file volumes are more commonly used in Kubernetes because they are more flexible and easier to manage. This guide will focus on block volumes, but feel free to explore the other volume types and supported functionality as documented here.
Before proceeding: if you want to interact with vCenter via the CLI instead of the GUI, a good helper is govc, a tool designed to be a user-friendly CLI alternative to the vCenter GUI that is well suited for any related automation tasks. The easiest way to install it is to use the govc prebuilt binaries from the releases page. The following command downloads it and places the binary in /usr/local/bin.
curl -L -o - "https://github.com/vmware/govmomi/releases/latest/download/govc_$(uname -s)_$(uname -m).tar.gz" | sudo tar -C /usr/local/bin -xvzf - govc
To facilitate the use of govc, we can create a file that sets some environment variables so we don't have to enter the URL and credentials each time. A good practice is to obfuscate the credentials using basic Base64 encoding. The following command shows how to encode a string (here the placeholder password passw0rd) using this mechanism.
echo "passw0rd" | base64
cGFzc3cwcmQK
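Before wiring the encoded values into the environment file, it can be worth sanity-checking that an encoded string decodes back to the original. A quick round-trip sketch (the passw0rd value is just a placeholder, not a real credential):

```shell
# Encode a placeholder credential, then decode it again to verify the round trip.
# printf is used instead of echo so no trailing newline sneaks into the encoding.
encoded=$(printf 'passw0rd' | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"   # prints passw0rd
```

Note that keeping the trailing newline out of the encoded string avoids surprises when the decoded value is later used in authentication.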
Get the Base64-encoded strings of your username and password as shown above, then create a file named govc.env and set the following environment variables, replacing the values with your own data.
export GOVC_URL=vcsa.cpod-vcn.az-mad.cloud-garage.net
export GOVC_USERNAME=$(echo <yourbase64encodedusername> | base64 -d)
export GOVC_PASSWORD=$(echo <yourbase64encodedpassword> | base64 -d)
export GOVC_INSECURE=1
Once the file is created, you can set the variables using the source command.
source govc.env
If everything is ok, you should be able to use the govc command without any further parameters. As an example, try a simple task such as browsing your inventory to check that you can reach your vCenter and that authentication has succeeded.
govc ls
/cPod-VCN/vm
/cPod-VCN/network
/cPod-VCN/host
/cPod-VCN/datastore
Step 1: Prepare vSphere Environment
According to the deployment documentation mentioned earlier, one of the requirements is enabling the disk.enableUUID advanced property in all the Virtual Machines that make up the cluster that is going to consume vSphere storage through the CSI driver.
Since we already have the govc tool installed and working, we can take advantage of it to do this programmatically instead of using the vSphere graphical interface, which is more laborious and time-consuming, especially if the number of nodes in our cluster is high. The syntax to enable the mentioned advanced property is shown below.
govc vm.change -vm 'vm_inventory_path' -e="disk.enableUUID=1"
Using the govc ls command and pointing to the right folder, we can see the names of the VMs placed in the folder of interest. In my setup the VMs live under the /cPod-VCN/vm/k8s folder, as you can see in the following output.
govc ls /cPod-VCN/vm/k8s
/cPod-VCN/vm/k8s/k8s-worker-06
/cPod-VCN/vm/k8s/k8s-worker-05
/cPod-VCN/vm/k8s/k8s-worker-03
/cPod-VCN/vm/k8s/k8s-control-plane-01
/cPod-VCN/vm/k8s/k8s-worker-04
/cPod-VCN/vm/k8s/k8s-worker-02
/cPod-VCN/vm/k8s/k8s-worker-01
Now that we know the VMs that make up our k8s cluster, you could issue the command above to set the disk.enableUUID property one VM at a time. A smarter approach (especially if the number of worker nodes is high or if you need to automate this task) is to take advantage of some Linux helpers to build a single-line command. See below how you can chain the govc output with the powerful xargs command to easily issue the same command for every occurrence.
govc ls /cPod-VCN/vm/k8s | xargs -n1 -I{arg} govc vm.change -vm {arg} -e="disk.enableUUID=1"
This should enable the disk.enableUUID advanced parameter in all the listed VMs, and we are ready to take the next step.
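If the xargs idiom is unfamiliar, here is the same fan-out pattern with echo standing in for govc, so you can see how each input line becomes one command invocation (the VM names here are made up):

```shell
# Each line piped into xargs becomes one invocation of the command,
# with {arg} replaced by that line; with govc this runs vm.change once per VM.
printf '%s\n' vm-01 vm-02 vm-03 \
  | xargs -n1 -I{arg} echo govc vm.change -vm {arg} -e="disk.enableUUID=1"
```

Swapping echo for the real govc binary turns this dry run into the actual bulk change.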
Step 2: Install Cloud Provider Interface
Once these vSphere-related tasks have been completed, we can move to Kubernetes to install the Cloud Provider Interface. First of all, it is worth mentioning that the vSphere cloud-controller-manager (the element in charge of installing the required components that make up the Cloud Provider) relies on the well-known Kubernetes taint node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule to mark the kubelet as not initialized before proceeding with the cloud provider installation. Generally speaking, a taint is a node property, similar to a label, that is typically used to ensure that nodes are properly configured before workloads are scheduled on them, and to prevent issues caused by nodes that are not yet ready to operate normally. Once a node is fully initialized, the taint can be removed to restore normal operation. All the nodes of your cluster can be tainted in a row with a single command such as the one below.
kubectl taint nodes --all node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule
node/k8s-contol-plane-01 tainted
node/k8s-worker-01 tainted
node/k8s-worker-02 tainted
node/k8s-worker-03 tainted
node/k8s-worker-04 tainted
node/k8s-worker-05 tainted
node/k8s-worker-06 tainted
Once the cloud-controller-manager initializes a node, the kubelet removes this taint. Verify the taints are configured using regular kubectl commands together with some of the parsing and filtering capabilities that jq provides, with a filter along these lines:
kubectl get nodes -o json | jq '[.items[] | {name: .metadata.name, taints: .spec.taints}]'
{
"name": "k8s-worker-01",
"taints": [
{
"effect": "NoSchedule",
"key": "node.cloudprovider.kubernetes.io/uninitialized",
"value": "true"
}
]
},
<skipped...>
Once the nodes are properly tainted, we can install the vSphere cloud-controller-manager. Note that the CPI is tied to the specific Kubernetes version we are running; in this particular case I am running k8s version 1.24. Get the corresponding manifest from the official cloud-provider-vsphere GitHub repository using the commands below.
VERSION=1.24
wget https://raw.githubusercontent.com/kubernetes/cloud-provider-vsphere/master/releases/v$VERSION/vsphere-cloud-controller-manager.yaml
Now edit the downloaded YAML file and locate the section where a Secret object named vsphere-cloud-secret is declared. Change the highlighted lines to match your environment settings. Given that this is intended to be a lab environment, and for the sake of simplicity, I am using a full-rights administrator account for this purpose. Make sure you follow best practices and create minimum-privilege service accounts if you plan to use this in a production environment. Find here the full procedure to set up specific roles and permissions.
apiVersion: v1
kind: Secret
metadata:
  name: vsphere-cloud-secret
  labels:
    vsphere-cpi-infra: secret
    component: cloud-controller-manager
  namespace: kube-system
# NOTE: this is just an example configuration, update with real values based on your environment
stringData:
  172.25.3.3.username: "[email protected]"
  172.25.3.3.password: "<useyourpasswordhere>"
In the same way, locate the ConfigMap object called vsphere-cloud-config and change the relevant settings to match your environment as shown below:
apiVersion: v1
kind: ConfigMap
metadata:
  name: vsphere-cloud-config
  labels:
    vsphere-cpi-infra: config
    component: cloud-controller-manager
  namespace: kube-system
data:
  # NOTE: this is just an example configuration, update with real values based on your environment
  vsphere.conf: |
    # Global properties in this section will be used for all specified vCenters unless overridden in VirtualCenter section.
    global:
      port: 443
      # set insecureFlag to true if the vCenter uses a self-signed cert
      insecureFlag: true
      # settings for using k8s secret
      secretName: vsphere-cloud-secret
      secretNamespace: kube-system
    # vcenter section
    vcenter:
      vcsa:
        server: 172.25.3.3
        user: "[email protected]"
        password: "<useyourpasswordhere>"
        datacenters:
          - cPod-VCN
Now that our configuration is complete, we are ready to install the controller that will be in charge of establishing communication between our vSphere infrastructure and our Kubernetes cluster.
kubectl apply -f vsphere-cloud-controller-manager.yaml
serviceaccount/cloud-controller-manager created
secret/vsphere-cloud-secret created
configmap/vsphere-cloud-config created
rolebinding.rbac.authorization.k8s.io/servicecatalog.k8s.io:apiserver-authentication-reader created
clusterrolebinding.rbac.authorization.k8s.io/system:cloud-controller-manager created
clusterrole.rbac.authorization.k8s.io/system:cloud-controller-manager created
daemonset.apps/vsphere-cloud-controller-manager created
If everything goes as expected, we should now see a new pod running in the kube-system namespace. Verify its state by listing the pod with kubectl as shown below.
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
vsphere-cloud-controller-manager-wtrjn 1/1 Running 1 (5s ago) 5s 10.113.2.10 k8s-contol-plane-01 <none> <none>
Step 3: Installing Container Storage Interface (CSI Driver)
Before moving further, it is important to establish the basic kubernetes terms related to storage. The following list summarizes the main resources kubernetes uses for this specific purpose.
- Persistent Volume: A PV is a Kubernetes object used to provision persistent storage for a pod in the form of volumes. A PV can be provisioned manually by an administrator and backed by physical storage in a variety of forms, such as local storage on the host running the pod or external storage such as NFS, or it can be dynamically provisioned by interacting with a storage provider through a CSI (Container Storage Interface) driver.
- Persistent Volume Claim: A PVC is the developer’s way of defining a storage request. Just as the definition of a pod involves a compute request in terms of CPU and memory, a PVC specifies storage parameters such as the size of the volume, the access mode or the storage technology to be used.
- Storage Class: a StorageClass is another Kubernetes storage resource that lets you consume a storage backend whose configuration has been defined beforehand in a class created by the storage administrator. Each class can be related to a particular CSI driver and have a configuration profile associated with it, such as class of service, deletion policy or retention.
To sum up: in general, to have persistence in Kubernetes you create a PVC, which is later consumed by a pod. The PVC is just a request for storage that gets bound to a particular PV; however, if you define a StorageClass, you don't have to worry about PV provisioning, because the StorageClass will create the PV on the fly on your behalf, interacting via API with the storage infrastructure.
In the particular case of the vSphere CSI driver, when a PVC requests storage, the driver translates the instructions declared in the Kubernetes object into an API request that vCenter can understand. vCenter then instructs the creation of vSphere Cloud Native Storage (i.e. a PV in the form of a native vSphere VMDK), which is attached to the VM running the Kubernetes node and then to the pod itself. One extra benefit is that vCenter reports information about the container volumes in the vSphere Client, giving the administrator an integrated storage management view.
Let’s deploy the CSI driver then. The first step is to create a new namespace to place the CSI-related objects in. To do this, use kubectl as shown below:
kubectl create ns vmware-system-csi
Now create a config file that will be used to authenticate the cluster against vCenter. As mentioned, we are using a full-rights administrator account here, but it is recommended to use a service account with specific associated roles and permissions. Also, for the sake of simplicity, I am not verifying the SSL certificate presented by vCenter, but it is strongly recommended to import the vCenter certificates to enhance communications security. Replace the highlighted lines to match your own environment as shown below.
[Global]
cluster-id = "cluster01"
cluster-distribution = "Vanilla"
# ca-file = <ca file path> # optional, use with insecure-flag set to false
# thumbprint = "<cert thumbprint>" # optional, use with insecure-flag set to false without providing ca-file
[VirtualCenter "vcsa.cpod-vcn.az-mad.cloud-garage.net"]
insecure-flag = "true"
user = "[email protected]"
password = "<useyourpasswordhere>"
port = "443"
datacenters = "<useyourvsphereDChere>"
In order to inject the configuration and credential information into Kubernetes, we will use a Secret object created from the config file. Use the following kubectl command to proceed.
kubectl create secret generic vsphere-config-secret --from-file=csi-vsphere.conf -n vmware-system-csi
And now it is time to install the driver itself. As usual we will use a manifest; the latest version available at the time of writing this post is 2.7.
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v2.7.0/manifests/vanilla/vsphere-csi-driver.yaml
If you inspect the state of the installed driver, you will see that two replicas of the vsphere-csi-controller deployment remain in Pending state. This is because the deployment is set to spin up 3 replicas by default, but it also has a policy to be scheduled only on control plane nodes, along with an anti-affinity rule to avoid two pods running on the same node. That means that with a single control plane node the maximum number of replicas in Running state is one. In addition, a DaemonSet spins up a vsphere-csi-node pod on every single node.
kubectl get pods -n vmware-system-csi
NAME READY STATUS RESTARTS AGE
vsphere-csi-controller-7589ccbcf8-4k55c 0/7 Pending 0 2m25s
vsphere-csi-controller-7589ccbcf8-kbc27 0/7 Pending 0 2m27s
vsphere-csi-controller-7589ccbcf8-vc5d5 7/7 Running 0 4m13s
vsphere-csi-node-42d8j 3/3 Running 2 (3m25s ago) 4m13s
vsphere-csi-node-9npz4 3/3 Running 2 (3m28s ago) 4m13s
vsphere-csi-node-kwnzs 3/3 Running 2 (3m24s ago) 4m13s
vsphere-csi-node-mb4ss 3/3 Running 2 (3m26s ago) 4m13s
vsphere-csi-node-qptpc 3/3 Running 2 (3m24s ago) 4m13s
vsphere-csi-node-sclts 3/3 Running 2 (3m22s ago) 4m13s
vsphere-csi-node-xzglp 3/3 Running 2 (3m27s ago) 4m13s
You can easily adjust the number of replicas in the vsphere-csi-controller deployment by editing the Kubernetes resource and setting the number of replicas to one. The easiest way to do it is shown below.
kubectl scale deployment -n vmware-system-csi vsphere-csi-controller --replicas=1
Step 4: Creating StorageClass and testing persistent storage
Now that our CSI driver is up and running, let's create a StorageClass that will point to the infrastructure provisioner to create PVs for us on demand. Before proceeding with the StorageClass definition, let's take a look at the current datastore-related information in our particular scenario. We could use the vSphere GUI for this but, again, a smarter way is to use govc to obtain some relevant information about the datastores that we will use afterwards.
govc datastore.info vsanDatastore nfsDatastore
Name: vsanDatastore
Path: /cPod-VCN/datastore/vsanDatastore
Type: vsan
URL: ds:///vmfs/volumes/vsan:529c9fd4d68b174b-1af2d7a4b1b22457
Capacity: 4095.9 GB
Free: 3328.0 GB
Name: nfsDatastore
Path: /cPod-VCN/datastore/nfsDatastore
Type: NFS
URL: ds:///vmfs/volumes/f153c0aa-c96d23c2/
Capacity: 1505.0 GB
Free: 1494.6 GB
Remote: 172.25.3.1:/data/Datastore
We want our volumes to use vSAN as persistent storage. To do so, use the URL associated with the vsanDatastore to instruct the CSI driver to create the persistent volumes in the desired datastore. You can create as many StorageClasses as required, each with particular parameters such as the backing datastore, the storage policy or the filesystem type. Additionally, as part of the definition of our StorageClass, we are adding an annotation to declare this class as the default; that means any PVC without an explicit storageClassName specification will use this one.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsphere-sc
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "vSAN Default Storage Policy"
  datastoreurl: "ds:///vmfs/volumes/vsan:529c9fd4d68b174b-1af2d7a4b1b22457/"
  # csi.storage.k8s.io/fstype: "ext4" # Optional Parameter
Once the yaml manifest is created simply apply it using kubectl.
kubectl apply -f vsphere-sc.yaml
As a best practice, always verify the status of any created object to confirm everything is correct. Ensure the StorageClass name is followed by “(default)”, which means the annotation has been correctly applied and this StorageClass will be used by default.
kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
vsphere-sc (default) csi.vsphere.vmware.com Delete Immediate false 10d
As mentioned above, the StorageClass allows us to abstract away the storage provider so that a developer can dynamically request a volume without the intervention of a storage administrator. The following manifest creates a volume claim using the newly created vsphere-sc class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vsphere-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: vsphere-sc
Apply the created YAML file using the kubectl command as shown below.
kubectl apply -f vsphere-pvc.yaml
And verify the creation of the PVC and its current status using the kubectl get pvc command.
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
vsphere-pvc Bound pvc-8da817f0-8231-4d27-9452-188a8ef4f144 5Gi RWO vsphere-sc 13s
Note how the new PVC is bound to a volume that was automatically created by the CSI driver without any user intervention. If you explore the PVs using kubectl get pv, you would see the following.
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-8da817f0-8231-4d27-9452-188a8ef4f144 5Gi RWO Delete Bound default/vsphere-pvc vsphere-sc 37s
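To actually consume the claim from a workload, a pod references it by name in its volumes section. A minimal sketch (the pod name, image and mount path are illustrative, not part of the original setup):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-consumer            # illustrative name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data          # anything written here lands on the vSphere volume
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: vsphere-pvc    # the PVC created above
```

Once the pod is scheduled, the CSI driver attaches the backing VMDK to the node running the pod and mounts it at the declared path.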
Another cool benefit of the vSphere CSI driver is the integrated management: you can access the vSphere console to verify the creation of the volume according to the configured capacity parameters and access policy, as shown in the figure below.
You can drill down further under Cluster > Monitor > vSAN Virtual Objects. Filter by the volume name assigned to the PVC to get a cleaner view of the objects of interest.
Now click on VIEW PLACEMENT DETAILS to see the Physical Placement for the persistent volume we have just created.
The Physical Placement window shows how the data is actually placed in VMDKs residing on different hosts (esx03 and esx04) of the vSAN-enabled cluster, forming a RAID 1 layout for data redundancy with a third component acting as a witness on esx01. This layout is dictated by the “vSAN Default Storage Policy” that we attached to our newly created StorageClass. Feel free to try different StorageClasses bound to different vSAN storage policies to fit your requirements in terms of availability, space efficiency, encryption and so on.
This concludes the article. In the next post I will explain how to enable vSAN File Services in Kubernetes to cover a particular case in which many different pods running on different nodes need to access the same volume simultaneously. Stay tuned!