It is very common to associate Kubernetes with terms like stateless and ephemeral. This is because, when a container is terminated, any data stored in it during its lifetime is lost. This behaviour might be acceptable in some scenarios; very often, however, there is a need to persist data and some static configuration. As you can imagine, data persistence is an essential feature for running stateful applications such as databases and file servers.

Persistence means storing data outside of the container, and this is where persistent volumes (PVs) come into play. This post will explain later what a PV is but, in a nutshell, a PV is a Kubernetes resource that allows data to be retained even if the container is terminated, and also allows the data to be accessed by multiple containers or pods if needed.

It is also important to note that, generally speaking, a Kubernetes cluster does not exist on its own but depends on some sort of underlying infrastructure. It is therefore very useful to have a connector between the Kubernetes control plane and the infrastructure control plane to get the most out of both. For example, this dialogue can help the Kubernetes scheduler place pods taking into account the failure domain of the workers to achieve better application availability. Even better, when it comes to storage, the Kubernetes cluster can ask the infrastructure to use its existing datastores to meet any persistent storage requirement. This communication between a Kubernetes cluster and the underlying infrastructure is referred to as a Cloud Provider or Cloud Provider Interface (CPI).

In this particular scenario we are running a Kubernetes cluster on top of vSphere, and we will walk through the process of setting up the vSphere Cloud Provider. Once the vSphere Cloud Provider Interface is set up you can take advantage of Cloud Native Storage (CNS), a feature built into vSphere. CNS lets developers consume storage from vSphere on demand in a fully automated fashion, while giving the storage administrator visibility and management of the volumes from the vCenter UI. The following picture depicts a high-level diagram of the integration.

Kubernetes and vSphere Cloud Provider

It is important to note that this article is not based on any specific Kubernetes distribution. If you are using Tanzu on vSphere, some of the installation procedures are not necessary because they come out of the box when enabling vSphere with Tanzu as part of an integrated solution.

Installing vSphere Container Storage Plugin

The Cloud Native Storage feature is realized by means of a Container Storage Plugin, also called a CSI driver. The driver runs in a Kubernetes cluster deployed on vSphere and is responsible for provisioning persistent volumes on vSphere datastores by interacting with the vSphere control plane (i.e. vCenter). The plugin supports both block and file volumes. Block volumes are backed by virtual disks (vmdk) on a vSphere datastore and are attached to a single node at a time, which suits databases and other storage-intensive workloads. File volumes are backed by vSAN File Services shares and can be mounted by many pods at once. This guide focuses on block volumes, since file volumes are the subject of the next post, but feel free to explore other volume types and supported functionality as documented here.

Before proceeding, if you prefer to interact with vCenter from the command line rather than the GUI, a good helper is govc, a user-friendly CLI alternative to the vCenter GUI that is well suited to automation tasks. The easiest way to install it is to use the prebuilt govc binaries from the releases page. The following command downloads the latest release and places the binary in /usr/local/bin.

curl -L -o - "https://github.com/vmware/govmomi/releases/latest/download/govc_$(uname -s)_$(uname -m).tar.gz" | sudo tar -C /usr/local/bin -xvzf - govc

To make govc easier to use, we can create a file that sets some environment variables so we do not have to enter the URL and credentials each time. To avoid leaving the credentials in plain sight you can obfuscate them with Base64. Bear in mind that this is only an encoding, not encryption, so the file should still be protected. The following command shows how to encode any string this way.

echo "passw0rd" | base64
cGFzc3cwcmQK
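
One detail about the encoding above: echo appends a newline, and that newline gets encoded too (it is the trailing K in the output). Command substitution strips it again when decoding, so the approach used in this post works either way, but the following minimal sketch uses printf to avoid the stray byte and then decodes the result to confirm the round trip. The sample password is the same illustrative string used above.

```shell
# Encode without the trailing newline that echo would add.
encoded=$(printf '%s' 'passw0rd' | base64)
echo "$encoded"

# Decode again to confirm the round trip returns the original string.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```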

Get the Base64 encoded string of your username and password as shown above and now edit a file named govc.env and set the following environment variables replacing with your particular data.

vi govc.env
export GOVC_URL=vcsa.cpod-vcn.az-mad.cloud-garage.net
export GOVC_USERNAME=$(echo <yourbase64encodedusername> | base64 -d)
export GOVC_PASSWORD=$(echo <yourbase64encodedpassword> | base64 -d)
export GOVC_INSECURE=1

Once the file is created you can load the variables into your current shell using the source command.

source govc.env
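
If a placeholder was left untouched or a Base64 string was mistyped, one of the variables may end up empty without any visible error. As a small sanity check, the sketch below reports which of the three variables from govc.env are still unset. The function name check_govc_env is just an illustrative helper, not part of govc.

```shell
# Report any govc variable that is still empty after sourcing govc.env.
# Returns 0 when all of them are set, 1 otherwise.
check_govc_env() {
  missing=0
  for name in GOVC_URL GOVC_USERNAME GOVC_PASSWORD; do
    # eval gives a portable indirect lookup of the variable named in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      missing=1
    fi
  done
  unset value   # do not leave a copy of the last value (the password) around
  return "$missing"
}
```

Run check_govc_env right after sourcing the file; it prints nothing when everything is in place.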

If everything is OK you should be able to use the govc command without any further parameters. As an example, try a simple task such as browsing your inventory to check that you can reach your vCenter and that authentication has succeeded.

govc ls
/cPod-VCN/vm
/cPod-VCN/network
/cPod-VCN/host
/cPod-VCN/datastore

Step 1: Prepare vSphere Environment

According to the deployment document mentioned earlier, one of the requirements is to enable the disk.enableUUID advanced property on all the virtual machines that make up the cluster that is going to consume vSphere storage through the CSI driver.

Since we already have the govc tool installed and operational, we can take advantage of it to do this programmatically instead of using the vSphere graphical interface, which is always more laborious and time-consuming, especially if the number of nodes in our cluster is very high. The syntax to enable the mentioned advanced property is shown below.

govc vm.change -vm 'vm_inventory_path' -e="disk.enableUUID=1"

Using the ls command and pointing to the right folder, we can see the names of the VMs that have been placed in the folder of interest. In my setup the VMs are placed under the cPod-VCN/vm/k8s folder, as you can see in the following output.

govc ls /cPod-VCN/vm/k8s
/cPod-VCN/vm/k8s/k8s-worker-06
/cPod-VCN/vm/k8s/k8s-worker-05
/cPod-VCN/vm/k8s/k8s-worker-03
/cPod-VCN/vm/k8s/k8s-control-plane-01
/cPod-VCN/vm/k8s/k8s-worker-04
/cPod-VCN/vm/k8s/k8s-worker-02
/cPod-VCN/vm/k8s/k8s-worker-01

Now that we know the VMs that make up our k8s cluster, you can issue the previous command to set the disk.enableUUID VM property one by one. A smarter approach (especially if the number of worker nodes is high or if you need to automate this task) is to take advantage of some Linux helpers to create "single line commands". See below how you can chain the govc output with the powerful xargs command to issue the same command once for each VM in the list.

govc ls /cPod-VCN/vm/k8s | xargs -I{arg} govc vm.change -vm {arg} -e="disk.enableUUID=1"

This should enable the UUID advanced parameter on all the listed VMs, and we are ready for the next step.
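
Before moving on, you may want to confirm the property was actually applied. The following is a minimal sketch that loops over the VMs in a folder and inspects their extra configuration with govc vm.info -e. It assumes that the ExtraConfig entries are printed as "key: value" lines, so adjust the pattern if your govc version formats them differently. The function name check_disk_uuid is a hypothetical helper.

```shell
# Print each VM in the given inventory folder with the state of disk.enableUUID.
# Assumption: `govc vm.info -e` prints ExtraConfig entries as "key: value" lines.
check_disk_uuid() {
  folder="$1"
  govc ls "$folder" | while read -r vm; do
    # </dev/null keeps the inner command from consuming the VM list on stdin
    if govc vm.info -e "$vm" </dev/null |
        grep -Eiq 'disk\.enableUUID:[[:space:]]*(1|true)'; then
      echo "OK      $vm"
    else
      echo "MISSING $vm"
    fi
  done
}
```

In this setup it would be called as check_disk_uuid /cPod-VCN/vm/k8s, and every VM should be reported as OK.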

Step 2: Install Cloud Provider Interface

Once these vSphere-related tasks have been completed, we can move to Kubernetes to install the Cloud Provider Interface. First of all, it is worth mentioning that the vSphere cloud-controller-manager (the component that implements the Cloud Provider) relies on the well-known Kubernetes taint node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule to identify the nodes it has not initialized yet. Generally speaking, a taint is a node property (a key, a value and an effect) that keeps pods from being scheduled on the node unless they explicitly tolerate it. Here it is used to ensure that nodes are properly configured before they take on workloads, and to prevent issues caused by nodes that are not yet ready to operate normally. Once the node is fully initialized, the taint is removed to restore normal operation. The procedure to taint all the nodes of your cluster with a single command is shown below.

kubectl get nodes | grep Ready | awk '{print $1}' | xargs -I{arg} kubectl taint node {arg} node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule
node/k8s-contol-plane-01 tainted
node/k8s-worker-01 tainted
node/k8s-worker-02 tainted
node/k8s-worker-03 tainted
node/k8s-worker-04 tainted
node/k8s-worker-05 tainted
node/k8s-worker-06 tainted

Once the cloud-controller-manager initializes a node, it removes this taint. For now, verify that the taints are in place using regular kubectl commands and some of the parsing and filtering capabilities that jq provides, as shown below.

kubectl get nodes -o json | jq '[.items[] | {name: .metadata.name, taints: .spec.taints}]'
{
    "name": "k8s-worker-01",
    "taints": [
      {
        "effect": "NoSchedule",
        "key": "node.cloudprovider.kubernetes.io/uninitialized",
        "value": "true"
      }
    ]
  },
<skipped...>
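
With many nodes the full JSON dump becomes hard to scan. As an alternative, here is a minimal sketch that prints only the names of the nodes that carry a given taint key. The function name nodes_with_taint is a hypothetical helper, and it assumes jq 1.5 or later is installed.

```shell
# List the nodes that currently carry a given taint key (one name per line).
nodes_with_taint() {
  key="$1"
  kubectl get nodes -o json |
    jq -r --arg key "$key" \
      '.items[] | select(any(.spec.taints[]?; .key == $key)) | .metadata.name'
}
```

Calling nodes_with_taint node.cloudprovider.kubernetes.io/uninitialized at this point should list every node in the cluster. Once the cloud-controller-manager is running, the same call should eventually print nothing.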

Once the nodes are properly tainted we can install the vSphere cloud-controller-manager. Note that the CPI release is tied to the specific Kubernetes version we are running. In this particular case I am running k8s version 1.24. Get the corresponding manifest from the official cloud-provider-vsphere GitHub repository using the commands below.

VERSION=1.24
wget https://raw.githubusercontent.com/kubernetes/cloud-provider-vsphere/master/releases/v$VERSION/vsphere-cloud-controller-manager.yaml

Now edit the downloaded YAML file and locate the section where a Secret object named vsphere-cloud-secret is declared. Change the username and password entries to match your environment settings. Since this is intended to be a lab environment, and for the sake of simplicity, I am using a full-rights administrator account for this purpose. Make sure you follow best practices and create service accounts with minimum privileges if you plan to use it in a production environment. Find here the full procedure to set up specific roles and permissions.

vi vsphere-cloud-controller-manager.yaml (Secret)
apiVersion: v1
kind: Secret
metadata:
  name: vsphere-cloud-secret
  labels:
    vsphere-cpi-infra: secret
    component: cloud-controller-manager
  namespace: kube-system
  # NOTE: this is just an example configuration, update with real values based on your environment
stringData:
  172.25.3.3.username: "<useyourusernamehere>"
  172.25.3.3.password: "<useyourpasswordhere>"

In the same way, locate a ConfigMap object called vsphere-cloud-config and change the relevant settings to match your environment as shown below:

vi vsphere-cloud-controller-manager.yaml (ConfigMap)
apiVersion: v1
kind: ConfigMap
metadata:
  name: vsphere-cloud-config
  labels:
    vsphere-cpi-infra: config
    component: cloud-controller-manager
  namespace: kube-system
data:
  # NOTE: this is just an example configuration, update with real values based on your environment
  vsphere.conf: |
    # Global properties in this section will be used for all specified vCenters unless overriden in VirtualCenter section.
    global:
      port: 443
      # set insecureFlag to true if the vCenter uses a self-signed cert
      insecureFlag: true
      # settings for using k8s secret
      secretName: vsphere-cloud-secret
      secretNamespace: kube-system

    # vcenter section
    vcenter:
      vcsa:
        server: 172.25.3.3
        user: "<useyourusernamehere>"
        password: "<useyourpasswordhere>"
        datacenters:
          - cPod-VCN

Now that our configuration is completed we are ready to install the controller that will be in charge of establishing the communication between our vSphere based infrastructure and our kubernetes cluster.

kubectl apply -f vsphere-cloud-controller-manager.yaml
serviceaccount/cloud-controller-manager created 
secret/vsphere-cloud-secret created 
configmap/vsphere-cloud-config created 
rolebinding.rbac.authorization.k8s.io/servicecatalog.k8s.io:apiserver-authentication-reader created 
clusterrolebinding.rbac.authorization.k8s.io/system:cloud-controller-manager created
clusterrole.rbac.authorization.k8s.io/system:cloud-controller-manager created 
daemonset.apps/vsphere-cloud-controller-manager created

If everything goes as expected, we should now see a new pod running in the kube-system namespace. Verify the running state just by showing the created pod using kubectl as shown below.

kubectl get pod -n kube-system vsphere-cloud-controller-manager-wtrjn -o wide
NAME                                     READY   STATUS    RESTARTS        AGE     IP            NODE                  NOMINATED NODE   READINESS GATES
vsphere-cloud-controller-manager-wtrjn   1/1     Running   1 (5s ago)   5s   10.113.2.10   k8s-contol-plane-01   <none>           <none>
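
A running pod does not by itself prove that the nodes were initialized. When the vSphere cloud provider initializes a node it also populates spec.providerID (typically a value starting with vsphere://), so an empty value is a useful hint that something is off, for example wrong credentials in the Secret. The sketch below lists nodes that still lack a provider ID. The function name nodes_without_provider_id is a hypothetical helper.

```shell
# Print the names of nodes whose spec.providerID is still empty.
nodes_without_provider_id() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.providerID}{"\n"}{end}' |
    awk -F '\t' '$2 == "" { print $1 }'
}
```

If the function prints nothing, every node has been picked up by the cloud provider. Otherwise the logs of the vsphere-cloud-controller-manager pod are the first place to look.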

Step 3: Installing Container Storage Interface (CSI Driver)

Before moving further, it is important to establish the basic kubernetes terms related to storage. The following list summarizes the main resources kubernetes uses for this specific purpose.

  • Persistent Volume: A PV is a Kubernetes object that represents a piece of persistent storage that pods can use in the form of a volume. The PV can be provisioned manually by an administrator and backed by physical storage in a variety of formats, such as local storage on the host running the pod or external storage such as NFS, or it can be dynamically provisioned by interacting with a storage provider through a CSI (Container Storage Interface) driver.
  • Persistent Volume Claim: A PVC is the developer’s way of defining a storage request. Just as the definition of a pod involves a computational request in terms of CPU and memory, a PVC describes storage parameters such as the size of the volume, the access mode or the storage class to use.
  • Storage Class: A StorageClass is another Kubernetes resource related to storage. It lets you request storage by referring to a class whose configuration has been defined beforehand by the storage administrator. Each class is tied to a particular provisioner (such as a CSI driver) and has a configuration profile associated with it, such as class of service, reclaim policy or retention.

To sum up, in general, to have persistence in Kubernetes you need to create a PVC which will later be consumed by a pod. The PVC is just a request for storage that gets bound to a particular PV. However, if you define a StorageClass, you do not have to worry about PV provisioning: the provisioner referenced by the StorageClass will create the PV on the fly on your behalf, interacting via API with the storage infrastructure.

In the particular case of the vSphere CSI Driver, when a PVC requests storage, the driver translates the instructions declared in the Kubernetes object into an API request that vCenter can understand. vCenter then creates the vSphere Cloud Native Storage volume (i.e. a PV in the form of a native vSphere vmdk), which is attached to the VM running the Kubernetes node and then mounted into the pod itself. One extra benefit is that vCenter reports information about the container volumes in the vSphere client, giving the administrator an integrated storage management view.

Let’s deploy the CSI driver then. The first step is to create a new namespace that we will use to place the CSI-related objects. To do this we use kubectl as shown below:

kubectl create ns vmware-system-csi

Now create a config file that will be used to authenticate the cluster against vCenter. As mentioned, we are using a full-rights administrator account here, but it is recommended to use a service account with specific roles and permissions. Also, for the sake of simplicity, I am not verifying the SSL certificate presented by vCenter, but it is strongly recommended to import the vCenter certificates to secure the communication. Replace the server, credentials and datacenter values to match your own environment as shown below.

vi csi-vsphere.conf
[Global]
cluster-id = "cluster01"
cluster-distribution = "Vanilla"
# ca-file = <ca file path> # optional, use with insecure-flag set to false
# thumbprint = "<cert thumbprint>" # optional, use with insecure-flag set to false without providing ca-file
[VirtualCenter "vcsa.cpod-vcn.az-mad.cloud-garage.net"]
insecure-flag = "true"
user = "<useyourusernamehere>"
password = "<useyourpasswordhere>"
port = "443"
datacenters = "<useyourvsphereDChere>"

To inject the configuration and credential information into Kubernetes we will use a Secret object that takes the config file as its source. Use the following kubectl command to create it.

kubectl create secret generic vsphere-config-secret --from-file=csi-vsphere.conf -n vmware-system-csi
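
To double-check what ended up in the cluster, you can read the secret back and decode it. The following sketch does that while masking the password line, so the output is safe to paste into a ticket or chat. The function name show_csi_secret is a hypothetical helper; the secret name, namespace and key are the ones used in the command above.

```shell
# Show the config stored in the secret, with the password line masked.
show_csi_secret() {
  kubectl get secret vsphere-config-secret -n vmware-system-csi \
    -o jsonpath='{.data.csi-vsphere\.conf}' |
    base64 -d |
    sed -E 's/^([[:space:]]*password[[:space:]]*=).*/\1 "<redacted>"/'
}
```

Once the secret is verified, consider deleting the local csi-vsphere.conf file, since it holds the vCenter password in plain text.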

And now it is time to install the driver itself. As usual we will use a manifest, in this case the one for version 2.7, which is the latest available at the time of writing.

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v2.7.0/manifests/vanilla/vsphere-csi-driver.yaml

If you inspect the state of the installed driver you will see that two replicas of the vsphere-csi-controller deployment remain in Pending state. This is because the deployment is set by default to spin up 3 replicas, but it is also restricted to control plane nodes and has an anti-affinity rule that prevents two of its pods from running on the same node. That means that with a single control plane node, at most one replica can be running. In addition, a DaemonSet spins up a vsphere-csi-node pod on every node.

kubectl get pod -n vmware-system-csi
NAME                                      READY   STATUS    RESTARTS        AGE
vsphere-csi-controller-7589ccbcf8-4k55c   0/7     Pending   0               2m25s
vsphere-csi-controller-7589ccbcf8-kbc27   0/7     Pending   0               2m27s
vsphere-csi-controller-7589ccbcf8-vc5d5   7/7     Running   0               4m13s
vsphere-csi-node-42d8j                    3/3     Running   2 (3m25s ago)   4m13s
vsphere-csi-node-9npz4                    3/3     Running   2 (3m28s ago)   4m13s
vsphere-csi-node-kwnzs                    3/3     Running   2 (3m24s ago)   4m13s
vsphere-csi-node-mb4ss                    3/3     Running   2 (3m26s ago)   4m13s
vsphere-csi-node-qptpc                    3/3     Running   2 (3m24s ago)   4m13s
vsphere-csi-node-sclts                    3/3     Running   2 (3m22s ago)   4m13s
vsphere-csi-node-xzglp                    3/3     Running   2 (3m27s ago)   4m13s

You can easily adjust the number of replicas in the vsphere-csi-controller deployment by scaling it down to one. The easiest way to do it is shown below.

kubectl scale deployment -n vmware-system-csi vsphere-csi-controller --replicas=1

Step 4: Creating StorageClass and testing persistent storage

Now that our CSI driver is up and running, let’s create a StorageClass that will point to the infrastructure provisioner to create the PVs for us on demand. Before proceeding with the StorageClass definition, let’s take a look at the current datastore information in our particular scenario. We can use the vSphere GUI for this but, again, a smarter way is to use govc to obtain some relevant information about our datastores that we will use afterwards.

govc datastore.info
Name:        vsanDatastore
  Path:      /cPod-VCN/datastore/vsanDatastore
  Type:      vsan
  URL:       ds:///vmfs/volumes/vsan:529c9fd4d68b174b-1af2d7a4b1b22457
  Capacity:  4095.9 GB
  Free:      3328.0 GB
Name:        nfsDatastore
  Path:      /cPod-VCN/datastore/nfsDatastore
  Type:      NFS
  URL:       ds:///vmfs/volumes/f153c0aa-c96d23c2/
  Capacity:  1505.0 GB
  Free:      1494.6 GB
  Remote:    172.25.3.1:/data/Datastore

We want our volumes to use the vSAN datastore as our persistent storage. To do so, use the URL associated with vsanDatastore to instruct the CSI driver to create the persistent volumes in the desired datastore. You can create as many StorageClasses as required, each with its own parameters such as the datastore backend, the storage policy or the filesystem type. Additionally, as part of the definition of our StorageClass, we are adding an annotation to declare this class as the default. That means any PVC without an explicit storageClassName will use this one.

vi vsphere-sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsphere-sc
  annotations:
    # annotation values must be strings, hence the quotes
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "vSAN Default Storage Policy"
  datastoreurl: "ds:///vmfs/volumes/vsan:529c9fd4d68b174b-1af2d7a4b1b22457/"
  # csi.storage.k8s.io/fstype: "ext4"  # optional parameter

Once the yaml manifest is created simply apply it using kubectl.

kubectl apply -f vsphere-sc.yaml 

As a best practice, always verify the status of any created object to see if everything is correct. Ensure the StorageClass name is followed by “(default)”, which means the annotation has been correctly applied and this StorageClass will be used by default.

kubectl get storageclass
NAME                        PROVISIONER              RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
vsphere-sc (default)        csi.vsphere.vmware.com   Delete          Immediate              false                  10d
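
If the “(default)” marker does not show up, there is no need to recreate the object, because the annotation can be added to an existing StorageClass with kubectl patch. The sketch below wraps that command in a small function. The function name set_default_storageclass is a hypothetical helper, while the annotation key is the standard Kubernetes one.

```shell
# Mark an existing StorageClass as the cluster default by patching its annotation.
set_default_storageclass() {
  kubectl patch storageclass "$1" \
    -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
}
```

For this setup the call would be set_default_storageclass vsphere-sc. If another class is already marked as default, remember to set its annotation to "false" so only one default remains.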

As mentioned above, the StorageClass allows us to abstract from the storage provider so that the developer can dynamically request a volume without the intervention of a storage administrator. The following manifest would allow you to create a volume using the newly created vsphere-sc class.

vi vsphere-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vsphere-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: vsphere-sc

Apply the created YAML file using the kubectl command as shown below.

kubectl apply -f vsphere-pvc.yaml 

And verify the creation of the PVC and the current status using kubectl get pvc command line.

kubectl get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
vsphere-pvc   Bound    pvc-8da817f0-8231-4d27-9452-188a8ef4f144   5Gi        RWO            vsphere-sc     13s

Note how the new PVC is bound to a volume that has been automatically created by the CSI Driver without any user intervention. If you list the PVs using kubectl you will see the following.

kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS   REASON   AGE
pvc-8da817f0-8231-4d27-9452-188a8ef4f144   5Gi        RWO            Delete           Bound    default/vsphere-pvc   vsphere-sc              37s
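
A bound PVC is only half of the story, since the volume is there to be mounted by a pod. As a minimal sketch, the function below prints a manifest for a throwaway pod that mounts the claim at /data and appends a timestamp to a file on it. The pod name pvc-test-pod, the busybox image, the mount path and the function name are all illustrative choices, not something required by the CSI driver.

```shell
# Emit a manifest for a throwaway pod that mounts the PVC at /data.
# The pod name, image and mount path are illustrative choices.
pvc_test_pod_manifest() {
  claim="${1:-vsphere-pvc}"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: pvc-test-pod
spec:
  containers:
    - name: writer
      image: busybox:1.36
      command: ["sh", "-c", "date >> /data/timestamps.txt && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: ${claim}
EOF
}
```

Pipe the output into kubectl apply -f - to create the pod, then read the file with kubectl exec pvc-test-pod -- cat /data/timestamps.txt. If you delete the pod and create it again, the file should contain one more line, which shows that the data outlives the pod.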

Another nice benefit of the vSphere CSI Driver is integrated management: you can open the vSphere client and verify that the volume has been created with the configured capacity and access policy, as shown in the figure below.

Inspecting PVC from vSphere GUI

You can drill down further by going to Cluster > Monitor > vSAN Virtual Objects. Filter by the volume name assigned to the PVC to get a cleaner view of the objects of interest.

vSAN Virtual Objects

Now click on VIEW PLACEMENT DETAILS to see the Physical Placement for the persistent volume we have just created.

Persistent Volume Physical Placement

The Physical Placement window shows how the data is actually placed in vmdks that reside on different hosts (esx03 and esx04) of the vSAN-enabled cluster, forming a RAID 1 layout for data redundancy that uses a third element as a witness, placed on esx01. This layout is dictated by the “vSAN Default Storage Policy” that we attached to the StorageClass we just created. Feel free to try different StorageClasses bound to different vSAN storage policies to fit your requirements in terms of availability, space optimization, encryption and so on.

This concludes the article. In the next post I will explain how to enable vSAN File Services in kubernetes to cover a particular case in which many different pods running in different nodes need to access the same volume simultaneously. Stay tuned!