Live CNI migration: Canal to Cilium

Introduction

I recently performed a live CNI migration from Canal to Cilium (with kube-proxy replacement) on existing Kubernetes clusters managed with kops, and decided to document the process in this post.

The clusters involved had approximately 750 nodes and were hosted on AWS. Your cluster might be running in a different environment or at a different scale, so keep in mind that you may need to adjust some steps if you’re using this post as a guide for your own migration.

The migration process is divided into multiple stages, with each stage serving as a breakpoint. This ensures the cluster remains fully operational throughout the migration, with uninterrupted pod-to-pod communication.

For context, Canal is essentially a CNI plugin that combines Calico (for policy management) and Flannel (for network management). If you’re migrating from Flannel, you can use the same approach with small adjustments.

CNI transition considerations

Network policy handling

Before proceeding with the migration, there are a few limitations associated with network policies that must be considered.

The most straightforward path is to temporarily disable network policies from this point until the migration is complete. This approach is also suggested in the Cilium migration documentation.

Disabling network policies was not an option in the clusters I was working on, which made the process a little more complicated.

If you find yourself in the same situation, you will likely need to create temporary policies that allow traffic between both networks. This is particularly important for Cilium, as it blocks traffic by default when any policies are defined in the cluster.

You can use a CiliumClusterwideNetworkPolicy if temporary broad access is acceptable for your cluster. It must be created after Cilium is installed.
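As an illustration, a broad temporary policy could look like the following sketch (the policy name is mine; adjust the scope to whatever your cluster can tolerate):

```yaml
# Temporary, intentionally broad policy for the migration window.
# Remove it once the migration is complete.
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: allow-all-during-migration
spec:
  endpointSelector: {}
  ingress:
    - fromEntities:
        - all
  egress:
    - toEntities:
        - all
```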

Ingress allow lists

If your cluster uses ingress allow lists based on client IP, note that you must enable Cilium’s kube-proxy replacement to keep them working.

Without the eBPF-based kube-proxy replacement, the client IP presented to your pods will default to each node’s Cilium interface IP, causing every request to be rejected as unauthorised.

Preparation

The preparation phase is stage 0 of our migration. In this stage, we label existing nodes, update cluster settings, and set up the components required throughout the migration.

Note: All assets referenced in this article are available at https://git.sr.ht/~jovem/cni-migration-assets/tree

Label existing nodes

As we will be controlling the CNIs available on each host using label selectors, let’s label the existing nodes and instance groups with:

node-role.kubernetes.io/canal=true
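If you prefer to label the existing nodes directly, a one-liner like this (assuming every current node should run Canal) does the job:

```shell
kubectl label nodes --all node-role.kubernetes.io/canal=true --overwrite
```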

Once your instance groups are updated, apply the cluster changes:

kops update cluster <cluster-name> --state=<state-store> --yes

Note: If you have node auto scaling configured in your cluster, remember to configure the new labels on your node pools or instance groups so new nodes are created with the correct labels.

If you don’t want to label existing nodes individually, you can run a kops rolling-update to ensure that all nodes have the new label:

kops rolling-update cluster <cluster-name> --state=<state-store> --yes

Expose Flannel runtime data

When Canal is deployed via kops, Flannel runtime data is not exposed to the host. Access to this runtime data is needed later, during the Multus CNI setup.

We will patch the canal DaemonSet to expose /run/flannel on the host and to add a node selector matching the node-role.kubernetes.io/canal=true label (which should now be present on all cluster nodes).

{
  "spec": {
    "template": {
      "spec": {
        "nodeSelector": {
          "node-role.kubernetes.io/canal": "true"
        },
        "volumes": [
          {
            "name": "run-flannel",
            "hostPath": {
              "path": "/run/flannel",
              "type": "DirectoryOrCreate"
            }
          }
        ],
        "containers": [
          {
            "name": "kube-flannel",
            "volumeMounts": [
              {
                "mountPath": "/run/flannel",
                "name": "run-flannel"
              }
            ]
          }
        ]
      }
    }
  }
}

Important: Make sure that all nodes are labelled at this point, otherwise they will lose connectivity once you apply the patch.

The patch can be applied with:

kubectl -n kube-system patch daemonset canal --patch-file ./canal.patch.json
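After patching, you can verify that the rollout finished and that the node selector landed (both are standard kubectl commands; output shape may vary by version):

```shell
# Wait for the patched pods to roll out
kubectl -n kube-system rollout status daemonset/canal

# Confirm the node selector is in place
kubectl -n kube-system get daemonset canal \
  -o jsonpath='{.spec.template.spec.nodeSelector}'
```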

Install Cilium

At this stage, Cilium is installed in the cluster and associated with the node-role.kubernetes.io/cilium: "true" node selector. For installation, the official Cilium documentation provides instructions for different environments. This article assumes you’ll be using the generic Helm chart installation method.

Before proceeding with the installation, you’ll have to understand and decide which Cilium features you will be using. The clusters I was working on were already using Istio to handle ingress and L7 traffic, so Cilium L7 features were not enabled. If you’re migrating from Canal, you’re likely using VXLAN, so we will be using the Geneve tunnel protocol in Cilium (using VXLAN with a different port also works).

Use the following Helm values for the initial setup:

tunnelPort: 6081 # default geneve port
tunnelProtocol: geneve

operator:
  unmanagedPodWatcher:
    restart: false # Cilium will not restart pods for us during the migration

ipam:
  mode: "cluster-pool"
  operator:
    clusterPoolIPv4PodCIDRList: ["100.72.0.0/13"] # Make sure to use an exclusive CIDR for Cilium

cni:
  # Disable exclusive mode so that Cilium doesn't back up and remove other config files in /etc/cni/net.d/
  exclusive: false

nodeSelector:
  node-role.kubernetes.io/cilium: "true"

# Enable l7proxy if you require L7 features
l7Proxy: false

Install Cilium using:

helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --version 1.18.6 --namespace kube-system -f values-stage-0.yaml

Important: Make sure the subnet used for Cilium does not overlap with Canal’s subnet. Overlapping subnets will prevent proper traffic routing.

Once installed, Cilium’s control plane will be deployed, but its DaemonSet will initially have zero associated nodes.
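You can confirm this with a quick check; desired and current pod counts should both be zero until nodes carry the cilium label:

```shell
kubectl -n kube-system get daemonset cilium
```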

Install Multus

Multus is an open-source project that allows Kubernetes pods to attach to multiple networks. It functions as a meta-plugin (a plugin that calls other plugins), enabling pods to use both the Canal and Cilium networks simultaneously.

I won’t cover the details of the Multus configuration in this article; instead, I’ve published a preconfigured Multus deployment in the cni-migration-assets repository. Review the manifests in the multus-cni folder, then apply them to your cluster:

kubectl apply -f ./multus-cni

Once the manifests are applied, you’ll have two Multus DaemonSets with distinct configurations and both will initially be associated with zero nodes.
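For orientation, the heart of such a Multus configuration is the choice of cluster network and default (additional) networks. A rough sketch of what the canal-primary variant could look like (field values here are illustrative; the authoritative files live in the multus-cni folder):

```json
{
  "cniVersion": "0.3.1",
  "name": "multus-cni-network",
  "type": "multus",
  "kubeconfig": "/etc/cni/net.d/multus.d/multus.kubeconfig",
  "clusterNetwork": "canal",
  "defaultNetworks": ["cilium"]
}
```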

Update kops networking

When using a CNI that is not managed through kops, we need to update the cluster’s spec.networking to reflect it:

spec:
  networking:
    cni: {}

Be sure to review the advanced networking section in the kops documentation for further details before proceeding.

This should be your cluster state at this point:

  • All nodes labelled with node-role.kubernetes.io/canal=true.
  • Cluster autoscaling configured to apply node-role.kubernetes.io/canal=true to new nodes.
  • Patched Canal DaemonSet to:
    • Expose /run/flannel volume.
    • Add nodeSelector associated with node-role.kubernetes.io/canal=true.
  • Cilium installed with the node-role.kubernetes.io/cilium: "true" selector and associated with zero nodes.
  • Multus installed with two distinct DaemonSets and associated with zero nodes.
  • kops configuration updated to replace canal: {} with cni: {} under the networking key.

Stage 1: canal primary, cilium secondary

Now that we have the DaemonSets necessary for the migration deployed, we will update the node labels present in kops instance group definitions. These labels control how CNIs will be configured on new nodes.

Add the following labels to all instance group definitions:

  • node-role.kubernetes.io/cilium=true
  • node-role.kubernetes.io/cni-priority=canal
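In a kops InstanceGroup definition, these labels go under spec.nodeLabels; a sketch (the instance group name and surrounding fields are illustrative):

```yaml
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes-us-east-1a
spec:
  nodeLabels:
    node-role.kubernetes.io/canal: "true"
    node-role.kubernetes.io/cilium: "true"
    node-role.kubernetes.io/cni-priority: canal
```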

Important: Do not remove the node-role.kubernetes.io/canal=true label from instance groups.

Once your instance groups are updated, apply the cluster changes:

kops update cluster <cluster-name> --state=<state-store> --yes

Instead of manually applying labels to existing nodes, we will perform a kops rolling update. It not only ensures that all nodes will have the correct labels applied but also allows CNIs to be properly configured as each node is recreated.

Start by rolling control-plane instance groups and then continue with the remaining ones:

kops rolling-update cluster <cluster-name> --state=<state-store> \
    --instance-group=<control-plane-ig> --yes

kops rolling-update cluster <cluster-name> --state=<state-store> --yes

The following is a brief explanation of each label and its role:

node-role.kubernetes.io/canal=true
The Canal DaemonSet will be scheduled on the node.

node-role.kubernetes.io/cilium=true
The Cilium DaemonSet will be scheduled on the node.

node-role.kubernetes.io/cni-priority=canal
Associates the node with the Multus DaemonSet configured to use Canal as its primary network.

Migration stage 1 establishes a dual-CNI architecture in your cluster. After the rolling update:

Multus operates as the primary CNI on all nodes, managing both the Canal and Cilium networks. The configuration keeps Canal as the primary network for all existing workloads, while Cilium runs as a secondary network.

Each pod is created with two network interfaces with distinct IP and subnet allocations:

  • Primary interface associated with Canal.
  • Secondary interface associated with Cilium.

IP addresses can be obtained by inspecting the k8s.v1.cni.cncf.io/network-status pod annotation.
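The annotation holds a JSON list with one entry per attachment. A small Python sketch that extracts each network’s interface and IPs (the annotation payload below is made up for illustration; fetch the real one from the pod’s metadata):

```python
import json

# Illustrative payload of the k8s.v1.cni.cncf.io/network-status annotation.
# Fetch the real value with:
#   kubectl get pod <pod> -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}'
annotation = """
[
  {"name": "multus-cni-network", "interface": "eth0",
   "ips": ["100.96.3.14"], "default": true},
  {"name": "kube-system/cilium", "interface": "net1",
   "ips": ["100.72.1.25"], "default": false}
]
"""

def pod_networks(raw: str) -> dict:
    """Map network name -> (interface, ips, is_default)."""
    return {
        entry["name"]: (
            entry.get("interface"),
            entry.get("ips", []),
            entry.get("default", False),
        )
        for entry in json.loads(raw)
    }

for name, (iface, ips, default) in pod_networks(annotation).items():
    role = "primary" if default else "secondary"
    print(f"{name}: {iface} {ips} ({role})")
```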

Migration stage 1 results in the following cluster state:

  • Multus pods running on all cluster nodes.
  • Canal pods running on all cluster nodes.
  • Cilium pods running on all cluster nodes.
  • Multus using Canal as the primary network.
  • Multus using Cilium as the secondary network.

Stage 2: cilium primary, canal secondary

In this stage, cilium becomes the primary network, while canal serves as the secondary network.

The SBR plugin is chained into our CNI configs to route traffic correctly based on the pod’s primary IP address. Source-based routing allows, for example, traffic from a canal-primary pod targeting a cilium-primary pod to be routed exclusively through the canal network.
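For reference, chaining happens inside the CNI conflist on each node. A simplified, illustrative sketch of a canal-side chain with the sbr plugin inserted (plugin options trimmed for brevity; your actual conflist will differ):

```json
{
  "cniVersion": "0.3.1",
  "name": "canal",
  "plugins": [
    { "type": "flannel", "delegate": { "isDefaultGateway": true } },
    { "type": "sbr" },
    { "type": "portmap", "capabilities": { "portMappings": true } }
  ]
}
```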

Important: If your cluster relies on ingress allow-lists based on client-ip, dedicate specific nodes for ingress workloads and keep using canal as the primary network on those nodes. This is necessary because kube-proxy is still active on all cluster nodes and will NAT traffic destined for the Cilium network.

Update the node-role.kubernetes.io/cni-priority label to cilium in your instance group definitions to make Cilium the primary network:

  • node-role.kubernetes.io/cni-priority=cilium

Once your instance groups are updated, apply the cluster changes:

kops update cluster <cluster-name> --state=<state-store> --yes

Perform the rolling update, starting with the control-plane instance groups:

kops rolling-update cluster <cluster-name> --state=<state-store> \
    --instance-group=<control-plane-ig> --yes

kops rolling-update cluster <cluster-name> --state=<state-store> --yes

Once the rolling update is complete, migration stage 2 results in the following cluster state:

  • Multus, Canal, and Cilium pods running on all nodes.
  • Multus configured to use:
    • Cilium as the primary network (except for ingress nodes).
    • Canal as the secondary network (except for ingress nodes).
  • Ingress nodes continue to operate with canal-primary pod IPs.

Stage 3: cilium with kube-proxy replacement

Now that pods are using Cilium as their primary network, we can update Cilium to enable kube-proxy replacement, remove Canal, and stop deploying kube-proxy on new nodes.

Start by creating a CiliumNodeConfig that disables kube-proxy replacement on nodes still running both Canal and Cilium:

---
apiVersion: cilium.io/v2
kind: CiliumNodeConfig
metadata:
  namespace: kube-system
  name: kube-proxy
spec:
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/canal: "true"
      node-role.kubernetes.io/cilium: "true"
  defaults:
    kube-proxy-replacement: "false"

And then update Cilium with the helm values from values-stage-3.yaml:

helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --version 1.18.6 \
  -f ./cilium/values-stage-3.yaml

Remove the following labels from your instance group definitions:

  • node-role.kubernetes.io/canal
  • node-role.kubernetes.io/cni-priority

And update your kops configuration to disable kube-proxy:

spec:
  kubeProxy:
    enabled: false

Apply the cluster changes:

kops update cluster <cluster-name> --state=<state-store> --yes

Perform a rolling update, starting with ingress instance groups, followed by control-plane and remaining nodes:

kops rolling-update cluster <cluster-name> --state=<state-store> \
    --instance-group=<ingress-ig> --yes

kops rolling-update cluster <cluster-name> --state=<state-store> \
    --instance-group=<control-plane-ig> --yes

kops rolling-update cluster <cluster-name> --state=<state-store> --yes

With kube-proxy removed and kube-proxy replacement enabled on all nodes, Cilium now operates as the sole cluster CNI.
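You can spot-check an agent to confirm the replacement is active; the cilium CLI inside the agent pod reports this in its status output:

```shell
kubectl -n kube-system exec ds/cilium -c cilium-agent -- \
  cilium status | grep KubeProxyReplacement
```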

Completing the migration

With Cilium now fully operational and kube-proxy replaced, you can clean up migration resources and apply production settings to Cilium.

Update your Cilium release with the helm values from values-stage-4.yaml:

helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --version 1.18.6 \
  -f ./cilium/values-stage-4.yaml

Note that I added two new entries in this file that might be useful for those using Gossip DNS: k8sServiceHost and k8sServicePort.

Gossip-based clusters use a peer-to-peer network instead of externally hosted DNS to propagate the Kubernetes API address. If you perform a control-plane rolling update without specifying a load balancer address in k8sServiceHost, Cilium pods will lose contact with the API server as soon as the last control-plane node is replaced, because the running pods are not aware of the new IP addresses. The issue resolves itself once the Cilium pods restart and discover the new API server addresses.
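For reference, the two entries are plain Helm values; point them at an address that survives control-plane replacement (the host below is a placeholder):

```yaml
k8sServiceHost: <api-load-balancer-address>
k8sServicePort: 443
```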

Once Cilium settings are updated, you can remove the node-role.kubernetes.io/cilium label from instance group definitions.

Remember to remove any temporary network policies you might have created, including the CiliumClusterwideNetworkPolicy mentioned earlier.

Validate your cluster to make sure everything is working correctly.

The following resources will now be unused and can also be removed from your cluster:

  • kube-multus daemonset.
  • kube-multus-cilium daemonset.
  • canal daemonset.
  • canal-config configmap.
  • calico-kube-controllers deployment.
  • kube-proxy CiliumNodeConfig.
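The cleanup can be scripted with the resource names from the list above (all of them live in kube-system in this setup):

```shell
kubectl -n kube-system delete daemonset kube-multus kube-multus-cilium canal
kubectl -n kube-system delete configmap canal-config
kubectl -n kube-system delete deployment calico-kube-controllers
kubectl -n kube-system delete ciliumnodeconfig kube-proxy
```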

And that’s it, the migration is complete. Cilium is now your cluster’s CNI, with kube-proxy fully replaced by Cilium’s eBPF implementation.