Digital Infrastructure
Global
Data Center
Time-consuming upgrades and deployments
Thin SRE resources
250 Data Centers
Declarative configuration
Minimal attack surface
API-based management
Fully retired Kubespray
Faster deployments and upgrades
Improved operability
Equinix is the world’s digital infrastructure company, with over 250 data centers in 27 countries across 5 continents and $7 billion in revenue.
Jorik Jonker’s DevOps team at Equinix offers managed Kubernetes and other managed services to enterprise customers who want to focus on security and compliance. Their initial managed Kubernetes offering was built on Kubespray and Flatcar, but as adoption scaled up, the team did not. Running Kubespray alongside a convoluted set of Ansible scripts caused complications, and upgrades and deployments took so long that they tied up the SREs. Equinix knew they had to become more efficient.
In 2019, Equinix found Talos Linux and liked its declarative configuration, reduced attack surface, and API-based management. However, they were concerned by how different Talos Linux was from the systems they had worked with before. They also needed to prove security compliance for their enterprise customers and were unsure how to do this with Talos Linux, so they decided to continue using Kubespray for as long as they could.
Eventually, Equinix chose to do a proof of concept with Talos Linux. They found that, because Talos Linux only does Kubernetes and is API managed, it is architecturally simple and fast.
It was settled. Equinix knew Talos Linux would work for them, and they wanted to give the team time to practice and get acquainted with it. They scheduled sessions in which team members deployed Talos Linux while someone with experience deliberately broke the deployment, giving each person hands-on experience of troubleshooting Talos Linux. Within hours, the team was comfortable with the new operating system.
Equinix chose to build their first product, a new generation of their managed Kubernetes service, on Talos Linux.
The next step was setting up proof of compliance for their clients. They use kube-bench to assess their platform against the CIS hardening guidelines. Initially, kube-bench reported some tests as failed because it could not confirm that certain files had restricted permissions or that certain packages were disabled on Talos Linux. These failures were spurious: the files and packages in question do not exist on Talos Linux at all, so kube-bench had nothing to inspect. Equinix submitted patches to kube-bench that resolved all the false positives, and also submitted PRs so that the Dutch government security compliance standards would correctly recognize Talos Linux as secure.
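To illustrate the kind of false positive described above, here is a minimal sketch of a file-permission check that distinguishes "file is too permissive" from "file does not exist". It is not kube-bench's actual implementation, nor the patches Equinix submitted; the function name, the `Result` enum, and the example path are hypothetical and exist only for illustration.

```python
import os
import stat
from enum import Enum


class Result(Enum):
    PASS = "pass"
    FAIL = "fail"
    NOT_APPLICABLE = "not_applicable"


def check_max_permissions(path: str, max_mode: int) -> Result:
    """Check that `path` grants no permission bits beyond `max_mode`.

    A naive check would report FAIL whenever the file cannot be inspected.
    Here a missing file is reported as NOT_APPLICABLE instead, since a file
    that does not exist cannot be too permissive.
    """
    try:
        mode = stat.S_IMODE(os.stat(path).st_mode)
    except FileNotFoundError:
        return Result.NOT_APPLICABLE
    # Any bit set in `mode` but not in `max_mode` is an excess permission.
    return Result.PASS if mode & ~max_mode == 0 else Result.FAIL


if __name__ == "__main__":
    # Hypothetical path, assumed absent on the system being checked.
    print(check_max_permissions("/nonexistent/example.conf", 0o600))
```

The design choice is simply to give "nothing to check" its own outcome, so that a minimal operating system is not penalized for lacking files that a general-purpose distribution would have.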
Equinix has now retired their Kubespray-based Kubernetes offering and solely supports their Talos Linux-based Kubernetes product.
With Talos Linux, Equinix has reduced the time to deploy Kubernetes on virtual machines from 45 minutes to less than 10. Upgrades are also significantly faster, which lets the team iterate on releases more quickly. Now, when an issue needs troubleshooting, they can simply replace the affected node instead of repairing it in place. You can’t do that with Kubespray. Talos Linux encourages you to treat infrastructure as cattle rather than pets, which has systematic advantages in all parts of operations.
This article is based on the talk Jorik gave at TalosCon 2023.