In the previous article we walked through the preparation of an upstream k8s cluster to take advantage of the converged storage infrastructure that vSphere provides by using a CSI driver that allows the pod to consume the vSAN storage in the form of Persistent Volumes created automatically through an special purpose StorageClass.
With the CSI driver we would have most of the persistent storage needs for our pods covered, however, in some particular cases it is necessary that multiple pods can mount and read/write the same volume simultaneously. This is basically defined by the Access Mode specification that is part of the PV/PVC definition. The typical Access Modes available in kubernetes are:
ReadWriteOnce – Mount a volume as read-write by a single node
ReadOnlyMany – Mount the volume as read-only by many nodes
ReadWriteMany – Mount the volume as read-write by many nodes
In this article we will focus on the Access Mode ReadWriteMany (RWX) that allow a volume to be mounted simultaneously in read-write mode for multiple pods running in different kubernetes nodes. This access mode is tipically supported by a network file sharing technology such as NFS. The good news are this is not a big deal if we have vSAN because, again, we can take advantage of this wonderful technology to enable the built-in file services and create shared network shares in a very easy and integrated way.
Enabling vSAN File Services
The procedure for doing this is described below. Let’s move into vSphere GUI for a while. Access to your cluster and go to vSAN>Services. Now click on ENABLE blue button at the bottom of the File Service Tile.
The first step will be to select the network on which the service will be deployed. In my case it will select a specific PortGroup in the subnet 10.113.4.0/24 and the VLAN 1304 with name VLAN-1304-NFS.
This action will trigger the creation of the necessary agents in each one of the hosts that will be in charge of serving the shared network resources via NFS. After a while we should be able to see a new Resource Group named ESX Agents with four new vSAN File Service Node VMs.
Once the agents have been deployed we can access to the services configuration and we will see that the configuration is incomplete because we haven’t defined some important service parameters yet. Click on Configure Domain button at the botton of the File Service tile.
The first step is to define a domain that will host the names of the shared services. In my particular case I will use vsanfs.sdefinitive.net as domain name. Any shared resource will be reached using this domain.
Before continuing, it is important that our DNS is configured with the names that we will give to the four file services agents needed. In my case I am using fs01, fs02, fs03 and fs04 as names in the domain vsanfs.sdefinitive.net and the IP addresses 10.113.4.100-103. Additionally we will have to indicate the IP of the DNS server used in the resolution, the netmask and the default gateway as shown below.
In the next screen we will see the option to integrate with AD, at the moment we can skip it because we will consume the network services from pods.
Next you will see the summary of all the settings made for a last review before proceeding.
Once we press the FINISH green button the network file services will be completed and ready to use.
Creating File Shares from vSphere
Once the vSAN File Services have been configured we should be able to create network shares that will be eventually consumed in the form of NFS type volumes from our applications. To do this we must first provision the file shares according our preferences. Go to File Services and click on ADD to create a new file share.
The file share creation wizard allow us to specify some important parameters such as the name of our shared service, the protocol (NFS) used to export the file share, the NFS version (4.1 and 3), the Storage Policy that the volume will use and, finally other quota related settings such as the size and warning threshold for our file share.
Additionally we can set a add security by means of a network access control policy. In our case we will allow any IP to access the shared service so we select the option “Allow access from any IP” but feel free to restrict access to certain IP ranges in case you need it.
Once all the parameters have been set we can complete the task by pressing the green FINISH button at the bottom right side of the window.
Let’s inspect the created file share that will be seen as another vSAN Virtual Object from the vSphere administrator perspective.
If we click on the VIEW FILE SHARE we could see the specific configuration of our file share. Write down the export path (fs01.vsanfs.sdefinitive.net:/vsanfs/my-nfs-share) since it will be used later as an specification of the yaml manifest that will declare the corresponding persistent volume kubernetes object.
From an storage administrator perspective we are done. Now we will see how to consume it from the developer perspective through native kubernetes resources using yaml manifest.
Consuming vSAN Network File Shares from Pods
A important requirement to be able to mount nfs shares is to have the necesary software installed in the worker OS, otherwise the mounting process will fail. If you are using a Debian’s familly Linux distro such as Ubuntu, the installation package that contains the necessary binaries to allow nfs mounts is nfs-common. Ensure this package is installed before proceeding. Issue below command to meet the requirement.
sudo apt-get install nfs-common
Before proceeding with creation of PV/PVCs, it is recommended to test connectivity from the workers as a sanity check. The first basic test would be pinging to the fqdn of the host in charge of the file share as defined in the export path of our file share captured earlier. Ensure you can also ping to the rest of the nfs agents defined (fs01-fs04).
ping fs01.vsanfs.sdefinitive.net
PING fs01.vsanfs.sdefinitive.net (10.113.4.100) 56(84) bytes of data.64 bytes from 10.113.4.100 (10.113.4.100):icmp_seq=1 ttl=63 time=0.688 ms
If DNS resolution and connectivity is working as expected we are safe to mount the file share in any folder in your filesystem. Following commands show how to mount the file share using NFS 4.1 by using the export part associated to our file share. Ensure the mount point (/mnt/my-nfs-share in this example) exists before proceeding. If not so create in advance using mkdir as usual.
mount -t nfs4 -o minorversion=1,sec=sys fs01.vsanfs.sdefinitive.net:/vsanfs/my-nfs-share /mnt/my-nfs-share -v
mount.nfs4:timeout set for Fri Dec 23 21:30:09 2022mount.nfs4:trying text-based options 'minorversion=1,sec=sys,vers=4,addr=10.113.4.100,clientaddr=10.113.2.15'
If the mounting is sucessfull you should be able to access the share at the mount point folder and even create a file like shown below.
/# cd /mnt/my-nfs-share/# touch hola.txt/# lshola.txt
Now we are safe to jump into the manifest world to define our persistent volumes and attach them to the desired pod. First declare the PV object using ReadWriteMany as accessMode and specify the server and export path of our network file share.
Note we will use here a storageClassName specification using an arbitrary name vsan-nfs. Using a “fake” or undefined storageClass is supported by kubernetes and is tipically used for binding purposes between the PV and the PVC which is exactly our case. This is a requirement to avoid that our PV resource ends up using the default storage class which in this particular scenario would not be compatible with the ReadWriteMany access-mode required for NFS volumes.
Apply the yaml and verify the creation of the PV. Note we are using rwx mode that allow access to the same volume from different pods running in different nodes simultaneously.
kubectl get pv nfs-pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGEnfs-pv 500Mi RWX Retain Bound default/nfs-pvc vsan-nfs 60s
Now do the same for the PVC pointing to the PV created. Note we are specifiying the same storageClassName to bind the PVC with the PV. The accessMode must be also consistent with PV definition and finally, for this example we are claiming 500 Mbytes of storage.
As usual verify the status of the pvc resource. As you can see the pvc is bound state as expected.
kubectl get pvc nfs-pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGEnfs-pvc Bound nfs-pv 500Mi RWX vsan-fs 119s
Then attach the volume to a regular pod using following yaml manifest as shown below. This will create a basic pod that will run an alpine image that will mount the nfs pvc in the /my-nfs-share container’s local path . Ensure the highlighted claimName specification of the volume matches with the PVC name defined earlier.
Apply the yaml using kubectl apply and try to open a shell session to the container using kubectl exec as shown below.
kubectl exec nfs-pod1 -it -- sh
We should be able to access the network share, list any existing files to check if you are able to write new files as shown below.
/# touch /my-nfs-share/hola.pod1/# ls /my-nfs-sharehola.pod1 hola.txt
The last test to check if actually multiple pods running in different nodes can read and write the same volume simultaneously would be creating a new pod2 that mounts the same volume. Ensure that both pods are scheduled in different nodes for a full verification of the RWX access-mode.
In the same manner apply the manifest file abouve to spin up the new pod2 and try to open a shell.
kubectl exec nfs-pod2 -it -- sh
Again, we should be able to access the network share, list existing files and also to create new files.
/# touch /my-nfs-share/hola.pod2/# ls /my-nfs-sharehola.pod1 hola.pod2 hola.txt
In this article we have learnt how to enable vSAN File Services and how to consume PV in RWX. In the next post I will explain how to leverage MinIO technology to provide an S3 like object based storage on the top of vSphere for our workloads. Stay tuned!
It is very common to relate Kubernetes with terms like stateless and ephemeral. This is because, when a container is terminated, any data stored in it during its lifetime is lost. This behaviour might be acceptable in some scenarios, however, very often, there is a need to store persistently data and some static configurations. As you can imagine data persistence is an essential feature for running stateful applications such as databases and fileservers.
Persistence enables data to be stored outside of the container and here is when persitent volumes come into play. This post will explain later what a PV is but, in a nutshell, a PV is a Kubernetes resource that allows data to be retained even if the container is terminated, and also allows the data to be accessed by multiple containers or pods if needed.
On the other hand, is important to note that, generally speaking, a Kubernetes cluster does not exist on its own but depends on some sort of underlying infrastructure, that means that it would be really nice to have some kind of connector between the kubernetes control plane and the infrastructure control plane to get the most of it. For example, this dialogue may help the kubernetes scheduler to place the pods taking into account the failure domain of the workers to achieve better availability of an application, or even better, when it comes to storage, the kubernetes cluster can ask the infrastructure to use the existing datastores to meet any persistent storage requirement. This concept of communication between a kubernetes cluster and the underlying infrastructure is referred as a Cloud Provider or Cloud Provider Interface.
In this particular scenario we are running a kubernetes cluster over the top of vSphere and we will walk through the process of setting up the vSphere Cloud Provider. Once the vSphere Cloud Provider Interface is set up you can take advantage of the Container Native Storage (CNS) vSphere built-in feature. The CNS allows the developer to consume storage from vSphere on-demand on a fully automated fashion while providing to the storage administrator visibility and management of volumes from vCenter UI. Following picture depicts a high level diagram of the integration.
It is important to note that this article is not based on any specific kubernetes distributions in particular. In the case of using Tanzu on vSphere, some of the installation procedures are not necessary as they come out of the box when enabling the vSphere with Tanzu as part of an integrated solution.
Installing vSphere Container Storage Plugin
The Container Native Storage feature is realized by means of a Container Storage Plugin, also called a CSI driver. This CSI runs in a Kubernetes cluster deployed in vSphere and will be responsilbe for provisioning persistent volumes on vSphere datastores by interacting with vSphere control plane (i.e. vCenter). The plugin supports both file and block volumes. Block volumes are typically used in more specialized cases where low-level access to the storage is required, such as for databases or other storage-intensive workloads whereas file volumes are more commonly used in kubernetes because they are more flexible and easy to manage. This guide will focus on the file volumes but feel free to explore extra volume types and supported functionality as documented here.
Before proceeding, if you want to interact with vCenter via CLI instead of using the GUI a good helper would be govc that is a tool designed to be a user friendly CLI alternative to the vCenter GUI and it is well suited for any related automation tasks. The easiest way to install it is using the govc prebuilt binaries on the releases page. The following command will install it automatically and place the binary in the /usr/local/bin path.
To facilitate the use of govc, we can create a file to set some environment variables to avoid having to enter the URL and credentials each time. A good practice is to obfuscate the credentials using a basic Base64 encoding algorithm. Following command show how to code any string using this mechanism.
echo “passw0rd” | base64
cGFzc3cwcmQK
Get the Base64 encoded string of your username and password as shown above and now edit a file named govc.env and set the following environment variables replacing with your particular data.
Once the file is created you can actually set the variables using source command.
sourcegovc.env
If everything is ok you can should be able to use govc command without any further parameters. As an example, try a simple task such as browsing your inventory to check if you can access to your vCenter and authentication has succeded.
According to the deployment document mentioned earlier, one of the requirements is enabling the UUID advanced property in all Virtual Machines that conform the cluster that is going to consume the vSphere storage through the CSI.
Since we already have the govc tool installed and operational we can take advantage of it to do it programmatically instead of using the vsphere graphical interface which is always more laborious and costly in time, especially if the number of nodes in our cluster is very high. The syntax to enable the mentioned advanced property is shown below.
Using ls command and pointing to the right folder, we can see the name of the VMs that have been placed in the folder of interest. In my setup the VMs are placed under cPod-VCN/vm/k8s folder as you can see in the following output.
Now that we know the VMs that conform our k8s cluster you can issue the following command to set the disk-enableUUID VM property one by one. Another smarter approach (specially if the number of worker nodes is high of if you need to automate this task) is taking advantage of some linux helpers to create “single line commands”. See below how you can do it chaining the govc output along with the powerful xargs command to easily issue the same command recursively for all ocurrences.
This should enable the UUID advanced parameter in all the listed vms and we should be ready to take next step.
Step 2: Install Cloud Provider Interface
Once this vSphere related tasks has been completed, we can move to Kubernetes to install the Cloud Provider Interface. First of all, is worth to mention that the vSphere cloud-controller-manager (the element in charge of installing the required components that conforms the Cloud Provider) relies the well-known kubernetes taintnode.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule to mark the kubelet as not initialized before proceeding with cloud provider installation. Generally speaking a taint is just a node’s property in a form of a label that is typically used to ensure that nodes are properly configured before they are added to the cluster, and to prevent issues caused by nodes that are not yet ready to operate normally. Once the node is fully initialized, the label can be removed to restoring normal operation. The procedure to taint all the nodes of your cluster in a row, using a single command is shown below.
Once the cloud-controller-manager initializes this node, the kubelet removes this taint. Verify the taints are configured by using regular kubectl commads and some of the parsing and filtering capabilities that jq provides as showed below.
Once the nodes are properly tainted we can install the vSphere cloud-controller-manager. Note CPI is tied to the specific kubernetes version we are running. In this particular case I am running k8s version 1.24. Get the corresponding manifest from the official cloud-provider-vsphere github repository using below commands.
Now edit the downloaded yaml file and locate the section where a Secret object named vsphere-cloud-secret is declared. Change the highlighted lines to match your environment settings. Given the fact this intents to be a lab environment and for the sake of simplicity, I am using a full-rights administrator account for this purpose. Make sure you should follow best practiques and create minimum privileged service accounts if you plan to use it in a production environment. Find here the full procedure to set up specific roles and permissions.
vi vsphere-cloud-controller-manager.yaml (Secret)
apiVersion:v1kind:Secretmetadata:name:vsphere-cloud-secretlabels:vsphere-cpi-infra:secretcomponent:cloud-controller-managernamespace:kube-system# NOTE: this is just an example configuration, update with real values based on your environmentstringData:172.25.3.3.username:"Administrator@vsphere.local"172.25.3.3.password:"<useyourpasswordhere>"
In the same way, locate a ConfigMap object called vsphere-cloud-config and change relevant settings to match your environment as showed below:
vi vsphere-cloud-controller-manager.yaml (ConfigMap)
apiVersion:v1kind:ConfigMapmetadata:name:vsphere-cloud-configlabels:vsphere-cpi-infra:configcomponent:cloud-controller-managernamespace:kube-systemdata:# NOTE: this is just an example configuration, update with real values based on your environmentvsphere.conf:| # Global properties in this section will be used for all specified vCenters unless overriden in VirtualCenter section. global: port: 443 # set insecureFlag to true if the vCenter uses a self-signed cert insecureFlag: true # settings for using k8s secret secretName: vsphere-cloud-secret secretNamespace: kube-system # vcenter section vcenter: vcsa: server: 172.25.3.3 user: "Administrator@vsphere.local" password: "<useyourpasswordhere>" datacenters: - cPod-VCN
Now that our configuration is completed we are ready to install the controller that will be in charge of establishing the communication between our vSphere based infrastructure and our kubernetes cluster.
If everything goes as expected, we should now see a new pod running in the kube-system namespace. Verify the running state just by showing the created pod using kubectl as shown below.
kubectl get pod -n kube-system vsphere-cloud-controller-manager-wtrjn -o wide
Before moving further, it is important to establish the basic kubernetes terms related to storage. The following list summarizes the main resources kubernetes uses for this specific purpose.
Persistent Volume: A PV is a kubernetes object used to provision persistent storage for a pod in the form of volumes. The PV can be provisioned manually by an administrator and backed by physical storage in a variety of formats such as local storage on the host running the pod or external storage such as NFS, or it can also be dinamically provisioned interacting with an storage provider through the use of a CSI (Compute Storage Interface) Driver.
Persistent Volume Claim: A PVC is the developer’s way of defining a storage request. Just as the definition of a pod involves a computational request in terms of cpu and memory, a pvc will be related to storage parameters such as the size of the volume, the type of data access or the storage technology used.
Storage Class: a StorageClass is another Kubernetes resource related to storage that allows you to point a storage resource whose configuration has been defined previously using a class created by the storage administrator. Each class can be related to a particular CSI driver and have a configuration profile associated with it such as class of service, deletion policy or retention.
To sum up, in general, to have persistence in kubernetes you need to create a PVC which later will be consumed by a pod. The PVC is just a request for storage bound to a particular PV, however, if your define a Storage Class, you don’t have to worry about PV provisioning, the StorageClass will create the PV on the fly on your behalf interacting via API with the storage infrastructure.
In the particular case of the vSphere CSI Driver, when a PVC requests storage, the driver will translate the instructions declared in the Kubernetes object into a API request that vCenter will be able to understand. vCenter will then instruct the creation of vSphere cloud native storage (i.e a PV in a form of a native vsphere vmdk) that will be attached to the VM running the Kubernetes node and then attached to the pod itself. One extra benefit is that vCenter will report information about the container volumes in the vSphere client to allow the administrator to have an integrated storage management view.
Let’s deploy the CSI driver then. The first step is to create a new namespace that we will use to place the CSI related objects. To do this we use kubectl as showed below:
kubectl create ns vmware-system-csi
Now create a config file that will be used to authenticate the cluster against vCenter. As mentioned we are using here a full-rights administrator account but it is recommended to use a service account with specific associated roles and permissions. Also, for the sake of simplicity, I am not verifying vCenter SSL presented certificate but it is strongly recommended to import vcenter certificates to enhance communications security. Replace the highligthed lines to match with your own environment as shown below.
vi csi-vsphere.conf
[Global]cluster-id = "cluster01"cluster-distribution = "Vanilla"# ca-file = <ca file path> # optional, use with insecure-flag set to false# thumbprint = "<cert thumbprint>" # optional, use with insecure-flag set to false without providing ca-file[VirtualCenter "vcsa.cpod-vcn.az-mad.cloud-garage.net"]insecure-flag = "true"user = "Administrator@cpod-vcn.az-mad.cloud-garage.net"password = "<useyourpasswordhere>"port = "443"datacenters = "<useyourvsphereDChere>"
In order to inject the configuration and credential information into kubernetes we will use a secret object that will use the config file as source. Use following kubectl command to proceed.
And now it is time to install the driver itself. As usual we will use a manifest that will install the latest version available that at the moment of writing this post is 2.7.
If you inspect the state of the installed driver you will see that two replicas of the vsphere-csi-controller deployment remain in a pending state. This is because the deployment by default is set to spin up 3 replicas but also has a policy to be scheduled only in control plane nodes along with an antiaffinity rule to avoid two pods running on the same node. That means that with a single control plane node the maximum number of replicas in running state would be one. On the other side a daemonSet will also spin a vsphere-csi-node in every single node.
You can easily adjust the number of replicas in the vsphere-csi-controller deployment that just by editing the kubernetes resource and set the number of replicas to one. The easiest way to do it is shown below.
Step 4: Creating StorageClass and testing persistent storage
Now that our CSI driver is up and running let’s create a storageClass that will point to the infrastructure provisioner to create the PVs for us on-demand. Before proceeding with storageClass definition lets take a look at the current datastore related information in our particular scenario. We can use the vSphere GUI for this but again, an smarter way is using govc to obtain some relevant information of our datastores that we will use afterwards.
We want our volumes to use the vSAN storage as our persistent storage. To do so, use the vsanDataStore associated URL to instruct the CSI to create the persistent volumes in the desired datastore. You can create as many storageClasses as required, each of them with particular parametrization such as the datastore backend, the storage policy or the filesystem type. Additionally, as part of the definition of our Storage class, we are adding an annotation to declare this class as default. That means any PVC without an explicit storageClass specification will use this one as default.
Once the yaml manifest is created simply apply it using kubectl.
kubectl apply -f vsphere-sc.yaml
As a best practique, always verify the status of any created object to see if everything is correct. Ensure the StorageClass is followed by “(default)” which means the annotation has been correctly applied and this storageClass will be used by default.
As mentioned above the StorageClass allows us to abstract from the storage provider so that the developer can dynamically request a volume without the intervention of an storage administrator. The following manifest would allow you to create a volume using the newly created vsphere-cs class.
Apply the created yaml file using kubectl command as shown below
kubectl apply -f vsphere-pvc.yaml
And verify the creation of the PVC and the current status using kubectl get pvc command line.
kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGEvsphere-pvc Bound pvc-8da817f0-8231-4d27-9452-188a8ef4f144 5Gi RWO vsphere-sc 13s
Note how the new PVC is bound to a volume that has been automatically created by means of the CSI Driver without any user intervention. If you explore the PV using kubectl you would see.
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGEpvc-8da817f0-8231-4d27-9452-188a8ef4f1445Gi RWO Delete Bound default/vsphere-pvc vsphere-sc 37s
When using vSphere CSI Driver another cool benefit is that you have integrated management so you can access the vsphere console to verify the creation of the volume according to the capacity parameters and access policy configured as shown in the figure below
You can drill if you go to Cluster > Monitor > vSAN Virtual Objects. Filter out using the volume name assigned to the PVC to get a cleaner view of your interesting objects.
Now click on VIEW PLACEMENT DETAILS to see the Physical Placement for the persistent volume we have just created.
The Physical Placement window shows how the data is actually placed in vmdks that resides in different hosts (esx03 and esx04) of the vSAN enabled cluster creating a RAID 1 strategy for data redundancy that uses a third element as a witness placed in esx01. This policy is dictated by the “VSAN Default Storage Policy” that we have attached to our just created StorageClass. Feel free to try different StorageClasses bound to different vSAN Network policies to fit your requirements in terms of availability, space optimization, encryption and so on.
This concludes the article. In the next post I will explain how to enable vSAN File Services in kubernetes to cover a particular case in which many different pods running in different nodes need to access the same volume simultaneously. Stay tuned!