Kubernetes and Helm: The Beauty of Orchestration

Previous Section: Terraform: What is it, how do we use it, our scripts to provision
Welcome to the next instalment in our multi-part series on setting up a Kubernetes cluster with integrated CVMFS (CernVM File System) support! In this guide, we’ll dive into configuring Kubernetes and Helm on MicroK8s, covering the essential setup steps for deploying a cluster that integrates seamlessly with the CVMFS drivers. By leveraging Helm’s configuration capabilities alongside Kubernetes essentials, we aim to simplify deployment and management while ensuring a robust setup tailored for high-performance data access and storage.
This section builds on our previous steps, taking you further into advanced Kubernetes concepts, from setting up persistent storage solutions to managing complex configurations with Helm’s value overrides. As we progress, you’ll not only gain insight into practical configurations but also understand how CVMFS can be harnessed for distributed data systems, enhancing data accessibility and scalability within your Kubernetes environment.
Prerequisites
To make the most of this guide, ensure you have a working knowledge of Kubernetes and Helm fundamentals, especially in these areas:
- Storage Classes: Essential for defining dynamic storage options, enabling Kubernetes to handle different types of storage backends seamlessly.
- Persistent Volume Claims (PVCs): Key to managing pod storage needs, allowing applications to request and retain storage across restarts.
- Persistent Volumes (PVs): Understand PVs to control the storage resources used in the cluster, particularly relevant in environments using CVMFS and NFS-backed storage.
- Pod Deployments: Familiarity with deploying pods, particularly using NGINX or other common container images, will help when testing volume configurations.
- Values YAML Overrides: Proficiency with Helm’s values file structure is crucial for customizing deployments and managing complex configurations for each environment.
By following each step in this guide, you’ll not only be able to set up Kubernetes on MicroK8s with Helm and CVMFS but also gain insights into best practices for managing data-intensive workloads on Kubernetes clusters.
MicroK8s: A Lightweight Kubernetes Solution
MicroK8s, a minimal and production-ready Kubernetes distribution, is designed to provide developers with a quick, easy way to deploy and manage containerized applications without the overhead of a traditional Kubernetes setup. Developed by Canonical, MicroK8s is lightweight, efficient, and deployable across Linux, macOS, and Windows, making it perfect for local development, edge computing, or small-scale production environments.
Why Choose MicroK8s?
MicroK8s simplifies Kubernetes by packaging essential services—like DNS, storage, and ingress controllers—into a small, integrated package. Unlike traditional Kubernetes, which often requires extensive configuration, MicroK8s provides a fully encapsulated environment with a single-command installation.
Key Benefits of MicroK8s
- Ease of Use: MicroK8s allows for rapid setup, enabling developers to avoid complex configurations and get right to testing and deploying applications.
- Optimized Resource Usage: Its lightweight design requires fewer resources, making it ideal for development and testing environments on local machines.
- Modular Add-Ons: With built-in add-ons (e.g., DNS, ingress, and storage), MicroK8s provides the flexibility to enable or disable services as needed.
Streamlined Kubernetes with MicroK8s
MicroK8s empowers developers and teams by offering a rapid, low-overhead approach to Kubernetes, enabling fast experimentation and easy transitions from development to production for smaller-scale applications.
Helm: Simplifying Kubernetes Deployments
Helm is often called the “Kubernetes package manager” because it drastically simplifies deploying, updating, and managing applications within Kubernetes clusters. By bundling configurations into pre-configured templates known as “charts,” Helm provides consistent and streamlined application deployment across environments.
What is Helm?
Helm packages Kubernetes resources, such as deployments and services, into Helm charts. This allows for consistent and simplified deployment management and reduces configuration errors, making Helm invaluable for Kubernetes environments.
Key Benefits of Helm
- Efficient Deployment: Helm streamlines Kubernetes setup by packaging configurations into reusable charts.
- Version Control and Rollbacks: Helm maintains versioned charts, allowing for easy updates and reliable rollback if issues arise.
- Flexibility and Reusability: Helm charts are customizable for different environments, reducing redundancy and saving time in configuring Kubernetes resources.
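To make the versioning and rollback benefit concrete, here is a minimal sketch of a typical Helm session (the release name is a placeholder and bitnami/nginx is just an example public chart; later in this guide we alias helm to microk8s helm3):
helm repo add bitnami https://charts.bitnami.com/bitnami    # Register a public chart repository
helm install my-release bitnami/nginx                       # Deploy the chart as revision 1 of a named release
helm upgrade my-release bitnami/nginx --set replicaCount=2  # Roll out a config change as revision 2
helm history my-release                                     # Inspect the revision history of the release
helm rollback my-release 1                                  # Revert to revision 1 if the upgrade misbehaves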
Setting Up the Environment
To get MicroK8s running smoothly, we’ll handle several key setup tasks. In this section, we’ll install MicroK8s, configure essential add-ons, and set up Helm for Kubernetes.
Essential Add-Ons to Enable
- Helm: A package manager to streamline Kubernetes app deployment.
- Hostpath-Storage: Enables local, host-path-backed storage management.
- DNS: Supports service discovery within the cluster.
- Ingress: Manages external access to services.
- Registry: Provides a local container registry for storing images.
Installing and Configuring MicroK8s with setup.sh
The setup.sh script initializes MicroK8s, enables add-ons, and configures the environment. Canonical also provides a guide on how to install everything you need here.
# Function to install MicroK8s on the local machine.
# This function:
# - Updates the system package list and upgrades existing packages.
# - Installs MicroK8s via snap from the 1.30/stable channel.
# - Configures UFW (Uncomplicated Firewall) to allow network traffic for MicroK8s.
# - Adds the current user to the microk8s group to manage permissions.
# - Sets up the .kube directory for Kubernetes configurations.
function install_microk8s(){
    echo "Installing MicroK8s..."
    sudo apt update && sudo apt upgrade -y                       # Update and upgrade system packages
    sudo snap install microk8s --classic --channel=1.30/stable   # Install MicroK8s from snap
    sudo microk8s status --wait-ready                            # Wait until MicroK8s is ready
    # Allow traffic on the MicroK8s CNI (Container Network Interface) bridge
    sudo ufw allow in on cni0 && sudo ufw allow out on cni0
    sudo ufw default allow routed                                # Allow routing by default
    # Add the user to the microk8s group and set permissions on the .kube directory
    sudo usermod -a -G microk8s ubuntu
    sudo mkdir -p /home/ubuntu/.kube
    sudo chown -R ubuntu:ubuntu /home/ubuntu/.kube
    echo "MicroK8s installed successfully"
}
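# Optional sanity check (a minimal sketch): confirm the snap installed and the
# node registered before continuing.
sudo microk8s status              # Should report "microk8s is running"
sudo microk8s kubectl get nodes   # Should list this machine in the Ready state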
# Function to create aliases for MicroK8s commands to simplify usage.
# This function:
# - Detects the user’s shell (either bash or zsh).
# - Adds aliases for `kubectl` and `helm` commands that point to MicroK8s' bundled versions.
# - Reloads the shell configuration file to apply the new aliases.
function alias_microk8s() {
    # Check shell type and set the appropriate configuration file
    if [[ $SHELL == *"bash"* ]]; then
        config_file="/home/ubuntu/.bashrc"
    elif [[ $SHELL == *"zsh"* ]]; then
        config_file="/home/ubuntu/.zshrc"
    else
        echo "Unsupported shell. Please use bash or zsh."
        return 1
    fi
    # Add aliases only if they do not already exist in the configuration file
    if grep -q "alias kubectl=" "$config_file" && grep -q "alias helm=" "$config_file"; then
        echo "Aliases already exist in $config_file"
    else
        echo "alias kubectl='microk8s kubectl'" >> "$config_file"
        echo "alias helm='microk8s helm3'" >> "$config_file"
        echo "Aliases added to $config_file"
        # Reload the configuration file. Note: this only affects the shell running
        # this script; open a new terminal (or re-source the file) to use the
        # aliases interactively.
        source "$config_file"
        echo "Configuration reloaded. You can now use 'kubectl' and 'helm'."
    fi
}
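# Sanity check (run in a new terminal after the script completes):
#   type kubectl   # -> "kubectl is aliased to 'microk8s kubectl'"
#   type helm      # -> "helm is aliased to 'microk8s helm3'"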
# Function to enable essential add-ons for MicroK8s.
# This function:
# - Enables various MicroK8s add-ons: Helm, hostpath storage, DNS, ingress, and the registry.
# - Links the kubelet directory to the standard location for compatibility.
# - Calls `alias_microk8s` to set up aliases for ease of use.
function setup_microK8s(){
    echo "Setting up MicroK8s..."
    # Enable add-ons required for Kubernetes operations
    microk8s enable helm
    microk8s enable hostpath-storage
    microk8s enable dns
    microk8s enable ingress
    microk8s enable registry
    # Set up aliases for MicroK8s commands
    alias_microk8s
    # Create a symbolic link so tools expecting the standard kubelet path still work
    sudo ln -s /var/snap/microk8s/common/var/lib/kubelet /var/lib/kubelet
    echo "MicroK8s setup completed"
}
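Before moving on, it is worth confirming that the add-ons came up cleanly. A minimal check might look like this:
microk8s status                          # Lists add-ons; all five should show as enabled
microk8s kubectl get pods -n kube-system # Add-on pods (DNS, hostpath provisioner, ...) should reach Running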
Deploying NFS with Helm for CVMFS Drivers
The CVMFS driver relies on NFS (Network File System) as a storage backend that supports ReadWriteMany access. This setup allows multiple Kubernetes pods to access a shared data volume.
# Function to install and configure NFS (Network File System) on the local machine.
# This function:
# - Installs the NFS server package.
# - Sets up a shared directory with the correct permissions.
# - Configures the export settings to share the directory across specified subnets.
# - Restarts the NFS server to apply changes.
function install_nfs(){
    # Install NFS kernel server package
    sudo apt-get install nfs-kernel-server -y
    # Create the NFS directory and set permissions
    sudo mkdir -p /srv/nfs
    sudo chown nobody:nogroup /srv/nfs
    sudo chmod 0777 /srv/nfs
    # Obtain local and Kubernetes IPs and derive /16 subnets for NFS sharing
    local_ip=$(hostname -I | awk '{print $1}')
    kube_ip=$(hostname -I | awk '{print $2}')
    subnet_local_ip=$(echo $local_ip | awk -F. '{print $1"."$2".0.0/16"}')
    subnet_kube_ip=$(echo $kube_ip | awk -F. '{print $1"."$2".0.0/16"}')
    # Backup existing exports file and configure new export rules
    sudo mv /etc/exports /etc/exports.bak
    echo "/srv/nfs $subnet_local_ip(rw,sync,no_subtree_check,no_root_squash,insecure) $subnet_kube_ip(rw,sync,no_subtree_check,no_root_squash,insecure)" | sudo tee /etc/exports
    # Restart NFS server to apply export settings
    sudo systemctl restart nfs-kernel-server
}
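# Optional sanity check: confirm the export is visible before wiring it into
# Kubernetes (showmount is provided by the nfs-common package).
sudo apt-get install nfs-common -y
showmount -e localhost   # Expected to list /srv/nfs exported to the two subnets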
# Function to install the NFS CSI (Container Storage Interface) driver in Kubernetes.
# This function:
# - Adds the Helm repository for the NFS CSI driver.
# - Installs the CSI driver in the kube-system namespace.
# - Configures an NFS storage class and persistent volume claim for shared storage in Kubernetes.
function install_csi_nfs_driver(){
    # Add and update Helm repository for the NFS CSI driver
    microk8s helm3 repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
    microk8s helm3 repo update
    # Install NFS CSI driver with Helm, setting the correct kubelet directory for MicroK8s
    microk8s helm3 install csi-driver-nfs csi-driver-nfs/csi-driver-nfs \
        --namespace kube-system \
        --set kubeletDir=/var/snap/microk8s/common/var/lib/kubelet
    # Wait until the CSI driver pods are ready
    microk8s kubectl wait pod --selector app.kubernetes.io/name=csi-driver-nfs --for condition=ready --namespace kube-system
    # Define the local server IP for the NFS storage class configuration
    local_ip=$(hostname -I | awk '{print $1}')
    # Create a YAML file for the NFS storage class
    echo "Creating NFS storage class"
    cat <<EOF > nfs-storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs
provisioner: nfs.csi.k8s.io
parameters:
  server: $local_ip
  share: /srv/nfs
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - hard
  - nfsvers=4.1
EOF
    # Create a YAML file for a persistent volume claim that uses the NFS storage class
    cat <<EOF > nfs-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cvmfs-alien-cache
  namespace: kube-system
spec:
  storageClassName: nfs
  accessModes: [ReadWriteMany]
  resources:
    requests:
      storage: 25Gi
EOF
    # Apply the storage class and persistent volume claim configurations in Kubernetes
    microk8s kubectl apply -f nfs-storage-class.yaml
    microk8s kubectl apply -f nfs-pvc.yaml -n kube-system   # Only used for the CVMFS alien cache
    echo "NFS storage class created successfully"
    echo "NFS driver installed successfully"
}
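At this point the storage class should exist and the alien-cache claim should bind almost immediately, since the nfs class uses Immediate volume binding. A quick check:
microk8s kubectl get storageclass nfs                       # Should list the new 'nfs' class
microk8s kubectl get pvc cvmfs-alien-cache -n kube-system   # STATUS should read 'Bound'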
Installing the CVMFS Drivers
We’ve set up our Kubernetes cluster and configured our environment, and now it’s time to integrate CVMFS (CernVM File System)—a key component for managing large-scale data efficiently. This step is crucial for projects involving scientific computing or environments where sharing massive datasets is essential. If you need a refresher on CVMFS, please go here (TODO: Put link).
What is CVMFS? (A Quick Recap)
CVMFS, or CernVM File System, is a distributed file system that simplifies access to software and data repositories. It was designed to handle the unique challenges of large-scale scientific experiments, such as those conducted at CERN, where software and data need to be shared efficiently across thousands of nodes. Here’s a breakdown of how CVMFS operates:
- On-Demand Access: CVMFS doesn’t require data to be downloaded in full. Instead, files are accessed on demand, only downloading what is needed.
- Caching: It utilizes a caching mechanism that keeps frequently accessed files locally, reducing download times and bandwidth usage.
- Versioning: CVMFS maintains a versioned file system, allowing users to access previous versions of software or data.
- Content-Addressed Storage: Files are identified by their content, ensuring consistency across multiple nodes.
In our Kubernetes setup, CVMFS will be accessible as a mounted file system, allowing containerized applications to pull data seamlessly without manual downloads.
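For example, on any host with a configured CVMFS client, repositories appear under /cvmfs and behave like ordinary directories (sft.cern.ch below is a real public CERN repository, used here only as an illustration):
ls /cvmfs/sft.cern.ch   # First access triggers an autofs mount; only catalogs (metadata) are fetched
                        # File contents are downloaded lazily on first read and kept in the local cache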
# Function to install CVMFS (CernVM File System) support on Kubernetes.
# This function:
# - Clones the CVMFS CSI (Container Storage Interface) repository to the local machine.
# - Uses Helm to install the CVMFS CSI driver in the `kube-system` namespace.
# - Applies a custom values file (`cvmfs-values.yaml`) to configure the deployment.
function install_cvmfs(){
    echo "Installing CVMFS on K8s..."
    # Clone the CVMFS CSI repository to the specified directory
    git clone https://github.com/lablytics/cvmfs-csi /home/ubuntu/cvmfs-csi
    # Install CVMFS using Helm with custom values, deploying it in the kube-system namespace
    microk8s helm3 install cvmfs /home/ubuntu/cvmfs-csi/deployments/helm/cvmfs-csi -n kube-system -f cvmfs-values.yaml
    echo "CVMFS installed successfully"
}
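The driver pods should come up shortly after the Helm release is installed. A quick check (the exact pod names vary by chart version):
microk8s kubectl get pods -n kube-system | grep cvmfs   # Controller and per-node plugin pods should be Running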
The repository cloned here is a direct fork of https://github.com/cvmfs-contrib/cvmfs-csi; we had to make a small change to the config files to make the drivers compatible with non-standard Kubernetes distributions such as MicroK8s. By “non-standard”, we simply mean that the default kubelet directory differs from /var/lib/kubelet.
Understanding the Configuration in cvmfs-values.yaml
The cvmfs-values.yaml file is a crucial part of the setup, as it contains configurations specific to our CVMFS deployment. Let’s go through the key settings:
# Configuration for additional ConfigMaps in the CVMFS CSI deployment.
# This ConfigMap, `cvmfs-csi-default-local`, defines settings for CVMFS behavior.
extraConfigMaps:
  cvmfs-csi-default-local:
    default.local: |
      # Talk to HTTP servers directly, without any intermediate proxies.
      CVMFS_HTTP_PROXY="DIRECT"
      # Set the cache quota limit to 4000 MB.
      CVMFS_QUOTA_LIMIT="4000"
      # Enable the GeoAPI for location-aware repository server selection.
      CVMFS_USE_GEOAPI="yes"
      # Set the autofs timeout to 1 hour for auto-unmounting idle CVMFS mounts.
      CVMFS_AUTOFS_TIMEOUT=3600
      # Enable debug logging, directing output to /tmp/cvmfs.log.
      CVMFS_DEBUGLOG=/tmp/cvmfs.log
      # Conditional settings for the alien cache (an additional external cache).
      {{- if .Values.cache.alien.enabled }}
      # Use an alien cache at the specified location if enabled.
      CVMFS_ALIEN_CACHE={{ .Values.cache.alien.location }}
      # When using an alien cache, disable quota limit control by CVMFS.
      CVMFS_QUOTA_LIMIT=-1
      # Do not let repositories share a common cache directory.
      CVMFS_SHARED_CACHE=no
      {{- end -}}

# Define the host path for automounting CVMFS.
automountHostPath: /cvmfs

# Specify the kubelet directory for the MicroK8s environment.
kubeletDirectory: /var/snap/microk8s/common/var/lib/kubelet

# Configuration for automatically creating a storage class for CVMFS.
automountStorageClass:
  create: true # Enable automatic creation of a storage class.
  name: cvmfs  # Name the storage class 'cvmfs'; the test PVC below references it by this name.
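The conditional block above reads from a cache.alien section of the values file. A minimal sketch of what that section might look like (the key names are taken from the template references above; the location path is hypothetical, so check your chart version for the exact structure):
cache:
  alien:
    enabled: true                 # Turns on the alien-cache branch of default.local above
    location: /cvmfs-alien-cache  # Hypothetical mount path backed by the NFS PVC created earlier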
Why This Setup is Important
This setup integrates CVMFS with Kubernetes using a CSI driver, allowing applications running in the cluster to seamlessly access CVMFS repositories as if they were local file systems. The caching mechanism in CVMFS reduces network load, speeds up data access, and makes it easy to share large datasets across distributed systems—all without manually managing downloads.
By the end of this configuration, CVMFS is installed and ready to serve data in a Kubernetes-native way, leveraging the flexibility of Helm and the power of the Kubernetes API.
Deploying a Test PVC and Pod
To set up a testing environment for CVMFS in Kubernetes, this script and the YAML configuration files below deploy a PersistentVolumeClaim (PVC) and a test pod that mounts it. This deployment lets you validate the CVMFS configuration and test the integration of the storage volume in a Kubernetes cluster. Below is a breakdown of each piece of code and what it accomplishes.
# Function to deploy a test PVC (PersistentVolumeClaim) and a test pod in Kubernetes.
# This function:
# - Applies cvmfs-pvc.yaml to create a ReadOnlyMany volume claim for CVMFS.
# - Applies cvmfs-demo-pod.yaml to create a pod with the PVC mounted for testing.
function deploy_test_pvc_and_pod(){
    echo "Deploying CVMFS PVC"
    # Deploy the PersistentVolumeClaim for CVMFS; --validate=false skips strict validation.
    microk8s kubectl apply -f cvmfs-pvc.yaml --validate=false
    echo "CVMFS PVC deployed successfully"
    echo "Deploying CVMFS test pod"
    # Deploy the test pod, mounting the CVMFS PVC as a volume.
    microk8s kubectl apply -f cvmfs-demo-pod.yaml --validate=false
    echo "CVMFS test pod deployed successfully"
}
PersistentVolumeClaim YAML File (cvmfs-pvc.yaml)
This PVC configuration reserves 10Gi of storage from the cvmfs storage class and allows ReadOnlyMany access, meaning multiple pods can read the data simultaneously. The PVC enables the storage to be dynamically mounted by pods in the Kubernetes cluster.
# YAML file to define a PersistentVolumeClaim (PVC) for CVMFS.
# This PVC:
# - Uses the 'cvmfs' storage class to provision storage.
# - Sets access mode to ReadOnlyMany, allowing multiple pods to read the volume concurrently.
# - Requests 10Gi of storage.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cvmfs # Name of the PVC, referenced in the pod spec.
spec:
  accessModes:
    - ReadOnlyMany # Allow multiple pods to read from this volume.
  storageClassName: cvmfs # Use the 'cvmfs' storage class defined earlier.
  resources:
    requests:
      storage: 10Gi # Request 10Gi of storage capacity.
Pod YAML File (cvmfs-demo-pod.yaml)
This YAML file defines a simple pod named cvmfs-demo running an NGINX container. The pod uses the previously created PVC to mount the CVMFS volume at /cvmfs. This setup allows the pod to access files within the CVMFS file system, verifying that the storage configuration is functioning as expected.
# YAML file to define a test pod that mounts the CVMFS PVC.
# This pod:
# - Runs an NGINX container.
# - Mounts the CVMFS volume into the container at the /cvmfs path.
# - Uses HostToContainer mount propagation so automounted repositories appear in the container.
apiVersion: v1
kind: Pod
metadata:
  name: cvmfs-demo # Name of the test pod.
spec:
  containers:
    - name: nginx # Container name.
      image: nginx # Image to pull.
      imagePullPolicy: IfNotPresent # Pull the image only if not already available.
      volumeMounts:
        - name: cvmfs # Mount the CVMFS PVC as a volume.
          mountPath: /cvmfs # Directory in the container where the volume is mounted.
          mountPropagation: HostToContainer # Ensures host-side mounts are visible in the container.
  volumes:
    - name: cvmfs # Reference to the PVC.
      persistentVolumeClaim:
        claimName: cvmfs # Name of the PVC to mount in the pod.
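Once the pod is scheduled, you can verify the mount from inside the container. With autofs-style automounting, an empty /cvmfs is normal until a repository is first accessed; sft.cern.ch below is just an example repository:
microk8s kubectl wait pod/cvmfs-demo --for condition=ready --timeout=120s
microk8s kubectl exec cvmfs-demo -- ls /cvmfs/sft.cern.ch   # First access triggers the automount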
Setting up Connections to our Cluster (Optional)
Setting up external access to our Kubernetes cluster allows us to manage and monitor it with tools like Lens, a Kubernetes dashboard that provides a graphical interface for managing clusters, viewing logs, and debugging. To enable this connectivity, we modify the kubectl configuration files and MicroK8s settings so the cluster accepts external connections on the server’s public IP address. Below is a breakdown of the script used to update these configurations.
Bash Script for Updating Cluster Configuration with Public IP
This script consists of two functions, update_ip_in_kubeconfig and update_ip_in_microk8s_config, which modify specific configuration files to include the server’s public IP address. These adjustments allow external access to the Kubernetes API server, making it possible to manage the cluster from outside the local network.
# Function to update the kubeconfig file with the public IP address.
# This function:
# - Retrieves the public IP address.
# - Modifies a copy of the kubeconfig file to use the public IP, allowing external access.
function update_ip_in_kubeconfig() {
    echo "Updating public IP in kubeconfig file"
    # Retrieve the public IP address of the server
    public_ip=$(curl -s ifconfig.me)
    kubeconfig_file="/var/snap/microk8s/current/credentials/client.config" # Original kubeconfig file
    new_kubeconfig_path="/home/ubuntu/.kube/config-public"                 # Path to save the updated kubeconfig
    # Exit if the public IP couldn't be retrieved
    if [ -z "$public_ip" ]; then
        echo "Failed to retrieve public IP."
        exit 1
    fi
    # Check if the kubeconfig file exists, then create a copy and update it with the public IP
    if [ -f "$kubeconfig_file" ]; then
        sudo cp "$kubeconfig_file" "$new_kubeconfig_path"
        sudo sed -i "s/server: https:\/\/[0-9.]\+/server: https:\/\/$public_ip/" "$new_kubeconfig_path"
        sudo chown ubuntu:ubuntu "$new_kubeconfig_path"
        echo "Public IP ($public_ip) added to the kubeconfig file"
    else
        echo "kubeconfig file not found"
        exit 1
    fi
}
# Function to update the MicroK8s configuration to include the public IP address.
# This function:
# - Retrieves the server’s public IP address.
# - Updates the MicroK8s CSR (Certificate Signing Request) template to add the public IP,
#   allowing the API server to recognize it as a valid address for connections.
function update_ip_in_microk8s_config(){
    echo "Updating public IP in MicroK8s configuration"
    # Retrieve the public IP address of the server
    public_ip=$(curl -s ifconfig.me)
    config_file="/var/snap/microk8s/current/certs/csr.conf.template" # MicroK8s CSR template file
    # Exit if the public IP couldn't be retrieved
    if [ -z "$public_ip" ]; then
        echo "Failed to retrieve public IP."
        exit 1
    fi
    # Add the public IP to the CSR template as IP.3 to support external connections;
    # MicroK8s detects changes to this template and regenerates its certificates.
    sudo sed -i "/#MOREIPS/a IP.3 = $public_ip" "$config_file"
    echo "Public IP ($public_ip) added to the configuration as IP.3"
}
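With both functions applied, you can copy the rewritten kubeconfig to your workstation and point kubectl (or Lens) at it. A minimal sketch, where the hostname is a placeholder for your EC2 instance; note that the MicroK8s API port (16443 by default) must also be reachable, e.g. opened in your security group:
scp ubuntu@YOUR_EC2_PUBLIC_IP:/home/ubuntu/.kube/config-public ~/.kube/config-public
kubectl --kubeconfig ~/.kube/config-public get nodes   # Should list the MicroK8s node as Ready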
Putting it all Together
Once you understand each of the functions, you can run the script to perform all of the required actions:
# Main function to execute the complete setup process for Kubernetes with CVMFS and NFS.
# This function:
# - Installs MicroK8s and enables essential add-ons.
# - Configures NFS for shared storage.
# - Installs the CSI NFS driver to manage NFS storage in Kubernetes.
# - Updates MicroK8s and kubeconfig settings to allow external access.
# - Installs CVMFS, setting up the CernVM File System for data sharing.
# - Deploys a test PersistentVolumeClaim and a test pod to validate CVMFS functionality.
function main() {
    install_microk8s             # Install MicroK8s on the local machine.
    install_nfs                  # Install and configure the NFS server.
    setup_microK8s               # Enable add-ons and prepare MicroK8s for use.
    install_csi_nfs_driver       # Deploy the CSI driver for managing NFS storage in Kubernetes.
    update_ip_in_microk8s_config # Update MicroK8s config with the public IP for external access.
    update_ip_in_kubeconfig      # Update kubeconfig with the public IP for external connectivity.
    install_cvmfs                # Install CVMFS to enable CERN data access in Kubernetes.
    deploy_test_pvc_and_pod      # Deploy a test PVC and pod to verify the CVMFS setup.
}

# Execute the main function to initialize the full Kubernetes setup.
main
At this point in the guide, here’s what our project structure should look like:
├── devops/
│   ├── .env
│   ├── configure-aws.sh
│   ├── YOUR_KEY_PAIR.pem
│   ├── Dockerfile
│   ├── docker-compose.yml
│   └── provision.sh
└── cvmfs-full-setup-basic/
    ├── cvmfs-values.yaml  # Values for our CVMFS driver
    ├── main.tf            # Terraform configuration
    ├── setup.sh           # All the setup functionality from this guide
    ├── terraform.tfvars   # Specific variables for Terraform
    └── variables.tf       # Generic variables for Terraform
In the next part of this guide, we will create automated scripts to connect to our EC2 instance and tear it down, along with other helpful tools that will make our DevOps experience finally mean something. Because what is DevOps without configurations and automations!
Next Section: Putting it All Together: CVMFS In Action