Equinix Switches from Kubespray to Talos Linux, Cutting Deployment Time While Maintaining Security

Equinix is the world’s digital infrastructure company, with over 250 data centers in 27 countries across 5 continents and $7 billion in revenue.

Challenge

  • Time-consuming upgrades and deployments
  • Thin SRE resources

Environment

  • 250 Data Centers
Impact

  • Fully retired Kubespray
  • Faster deployments and upgrades
  • Improved operability

Why Sidero and Omni

  • Declarative configuration
  • Minimal attack surface
  • API-based management
Challenge

Growth Outpaces the Team, and Kubespray Can’t Keep Up

Jorik Jonker’s DevOps team at Equinix offers managed Kubernetes and other managed services to enterprise customers who want to focus on security and compliance. Their initial managed Kubernetes offering was built on Kubespray and Flatcar, but as adoption scaled up, the team did not. Running Kubespray alongside a convoluted set of Ansible scripts caused complications, and upgrades and deployments were slow, tying up the SREs for too long. Equinix knew they had to become more efficient.

In 2019, Equinix found Talos Linux and liked its declarative configuration, reduced attack surface, and API-based management. However, they were concerned by how different Talos Linux was from what they were used to. They also needed to prove security compliance to their enterprise customers and were unsure how to do this with Talos Linux, so they decided to continue using Kubespray for as long as they could.

Solution

A Clean Break from Kubespray and Ansible

Eventually, Equinix chose to do a proof of concept with Talos Linux. They found that, because Talos Linux only runs Kubernetes and is API-managed, it is architecturally simple and fast.
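
To illustrate why a declarative, API-managed model can be simpler to operate than a long chain of imperative scripts, here is a minimal Python sketch. It is not Talos Linux's API: the `plan_changes` function and the configuration keys are hypothetical, chosen only to show the idea of stating a desired configuration and letting the system work out the difference.

```python
def plan_changes(desired: dict, actual: dict) -> dict:
    """Compare desired and actual settings and return what must change.

    Hypothetical helper for illustration; not part of any Talos Linux API.
    """
    changes = {}
    for key, want in desired.items():
        if actual.get(key) != want:
            changes[key] = want
    return changes


# Hypothetical settings; the keys and values are illustrative only.
desired = {"kubernetes_version": "1.28", "node_role": "worker"}
actual = {"kubernetes_version": "1.27", "node_role": "worker"}

changes = plan_changes(desired, actual)
print(changes)
```

Planning again against a node that already matches the desired configuration yields no changes, which is what makes this style of management repeatable.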

It was settled. Equinix knew Talos Linux would work for them, and they wanted to give the team time to practice and get acquainted with it. Equinix scheduled sessions in which team members deployed Talos Linux while someone with experience deliberately broke the deployment, giving each individual real-world experience of working with Talos Linux. Within hours, the team was comfortable with the new operating system.

Equinix chose to build their first product, a new generation of their managed Kubernetes service, on Talos Linux.

The next step was setting up proof of compliance for their clients. They use kube-bench to assess their platform against CIS hardening guidelines. Initially, kube-bench reported some tests as failed because it could not determine that certain files and packages were set with limited permissions or disabled on Talos Linux. These failures were false positives: such files do not even exist within Talos Linux, so kube-bench could not evaluate them properly. Equinix submitted patches to kube-bench that resolved all the false positives, and also submitted PRs so that the Dutch government security compliance standards would correctly recognize Talos Linux as secure.
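
The false positives described above can be pictured with a small Python sketch. This is not the benchmark tool's real implementation and not the patch Equinix submitted; `check_file_mode`, its return values, and the example path are hypothetical. It shows how a permission check that treats a missing file as a failure reports a problem on a system where the file does not exist, and how treating absence as a pass avoids that.

```python
import os
import stat


def check_file_mode(path: str, max_mode: int, missing_is_pass: bool) -> str:
    """Return "PASS" or "FAIL" for a file-permission hardening check.

    Hypothetical check for illustration only.
    """
    if not os.path.exists(path):
        # A file that does not exist cannot be over-permissive.
        return "PASS" if missing_is_pass else "FAIL"
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Fail if the file grants any permission bit beyond max_mode.
    return "PASS" if mode & ~max_mode == 0 else "FAIL"


# Hypothetical path that is assumed not to exist on the system under test.
absent = "/nonexistent/example/kubelet.service"
print(check_file_mode(absent, 0o644, missing_is_pass=False))
print(check_file_mode(absent, 0o644, missing_is_pass=True))
```

With the naive setting the absent file is reported as a failure; with the adjusted setting it passes, while files that do exist are still checked against the permitted mode.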

Equinix has now retired their Kubespray-based Kubernetes offering and solely supports their Talos Linux-based Kubernetes product.

Results

Deployments Drop from 45 Minutes to 10

With Talos Linux, Equinix has reduced the time to deploy Kubernetes on virtual machines from 45 minutes to less than 10. They have also significantly reduced the time required for upgrades, allowing them to iterate on releases faster. Now, when an issue needs troubleshooting, they can simply replace the node to get things working again. You can’t do that with Kubespray. Talos Linux encourages you to treat infrastructure as cattle, not pets, which has systematic advantages in all parts of operations.
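
As a quick check on those figures, the short Python calculation below works out the improvement implied by going from 45 minutes to 10. Because the deployment time is reported as less than 10 minutes, both results are lower bounds. The function name is only for illustration.

```python
def improvement(before_min: float, after_min: float) -> tuple[float, float]:
    """Return (speedup factor, percentage of time saved)."""
    speedup = before_min / after_min
    saved_pct = (before_min - after_min) / before_min * 100
    return speedup, saved_pct


speedup, saved_pct = improvement(45, 10)
print(f"{speedup:.1f}x faster, {saved_pct:.0f}% less time")
# prints "4.5x faster, 78% less time"
```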

This article is based on the talk Jorik gave at TalosCon 2023.
