Real-world lessons for building better infrastructure

One of the last calls I joined was to learn how Talos Linux supports Costco-sized fridges. Before that? Manufacturing robots. And before that? Cows. (Yes, really.)

Every now and then, I hear a story that sticks. Not because of the scale or clever architecture, but because it reminds me why we built Talos Linux and Omni in the first place.

Here are five of my favorite real-world examples where our users were able to take back control and start trusting their infrastructure again.

Hathora: Gaming at internet scale, without scaling costs

  • Clusters: 100+
  • Regions: 14
  • Engineers: 6

Hathora set out to build a platform-as-a-service for multiplayer game studios, where players have very high expectations and network performance is critical. But Hathora knew they couldn’t keep bleeding money on EKS. Cloud was too expensive. They needed to try something different, so they decided to move to bare metal and create a fully hybrid environment.

Talos Linux gave them a reliable foundation across every node, no matter where it ran. With only six engineers available, Hathora also needed to make management easy. Omni did just that.

That’s 100+ clusters, 14 regions, and 6 engineers. On top of that, Hathora’s 80/20 split between bare metal and cloud enabled them to drastically cut costs while still delivering low-latency performance to a global player base. They have a future-proof platform so impressive that it’s been written about more than once.

Mynewsdesk: Massive savings without the learning curve

  • Infra costs: ↓ 90%
  • Latency: ↓ 29%
  • K8s experience required: Minimal

Mynewsdesk had a similarly tough challenge. They needed to stay GDPR compliant while also tackling out-of-control infrastructure costs, so they decided to build an in-house platform running on Kubernetes. This would give them more control and certainty over the security of their data.

The problem was, they didn’t have much in-house Kubernetes expertise. They needed a solution that ticked all the boxes without giving them an impossible learning curve. And this is Kubernetes we’re talking about, so that is rarely a given.

They found Talos Linux made Kubernetes easy to work with. It’s declarative and extraordinarily simple, meaning the team didn’t need to learn about Kubernetes, YAML, and other complexities. Talos Linux successfully abstracted and hid Kubernetes from the developers, so they could just get started.

Since then, Mynewsdesk has reduced infrastructure costs by 90%, cutting $200,000 in costs down to $20,000. They also reduced latency by 29%, which I’m sure the end-users appreciate.

JYSK: From bottlenecked to smooth ops

  • Locations: 3,400
  • First transformation attempt: K3s (failed at Day 2)
  • Today: Consistent, zero-touch updates

When I say colorful and organized JYSK storefronts, what do you think? In this story, you should think nodes. JYSK chose to implement Kubernetes as part of the Unified Commerce initiative across 3,400 edge locations, and at first, they didn’t choose Talos Linux. They chose K3s. But that solution never reached Day 2 operations. The sheer volume of updates and patches required for in-store clusters was unmanageable. They would have to find something else.

When you have 3,400 stores and a global brand, consistency means a lot. And when they found Talos Linux, they liked its immutability. They liked the idea of no SSH, no last-minute fixes, knowing this would reduce configuration drift and ensure simplicity throughout the deployment.

Now that they don’t have to worry about random things going wrong, they can focus on being systematic at scale. Plus, Talos Linux enables them to easily perform hands-free upgrades, keeping all those colorful and organized storefronts consistent.

SNCF: Cloud native transformation, tough processes

  • Org: France’s national railway
  • Security Manifesto: 200+ pages
  • Time to cloud native: 4 months

SNCF is a similar story, but this time with trains and public-facing screens in place of shops. SNCF runs France’s national railway system, and with so many people relying on them all day, every day, they don’t have time for broken updates or flaky infrastructure.

The Cloud Native Team had long considered moving from the public cloud to an open-source Kubernetes platform, but some very big blockers stood in their way: namely, a 200+ page security manifesto and teams who were accustomed to doing things a certain way.

Talos Linux provided SNCF with a number of benefits: it massively reduced production incidents between IaaS and CaaS and enabled SNCF to modernize critical applications. It also enabled them to go cloud native in 4 months. The immutability and built-in security kept them compliant with the security manifesto and, as they say, Talos Linux gave them “out of the box Kubernetes.” That meant the Cloud Native Team didn’t require rounds and rounds of help from other teams to get their project off the ground. They could just run with it.

CrossnoKaye: From 10s to 100s of devices manufactured

  • Industry: Industrial refrigeration
  • Devices shipped per year: 10 → 100s
  • TeamViewer logins: Hopefully 0

CrossnoKaye builds control systems for industrial refrigeration. And their edge device manufacturing process used to be… a lot. Technicians sometimes had to get on TeamViewer just to get devices online. It had long been functional, but for a fast-growing scaleup, they knew that wouldn’t cut it. They were shipping 10s of devices a year, and they wanted to increase that number to 100s. It’s a classic growth story.

With Talos Linux, they’ve automated the entire process. They don’t have to touch anything. They get API-everything and pure repeatability. CrossnoKaye’s manufacturing partners can provision entire enclosures themselves. This has transformed CrossnoKaye’s manufacturing process into something that will support them well into the future as they grow.

What the ability to build better infrastructure really means

All of these stories started with a team hitting the wall of traditional infrastructure. It was too expensive, too fragile, or just too difficult to use. For whatever reason, these teams needed to do something transformational.

They chose to shift their way of thinking. Rather than chasing problems and fighting fires, they were able to design them out entirely with Omni and Talos Linux. And in the end, all of these companies were able to build better.

If you’re ready to take back control and start building better infrastructure, Talos Linux and Omni are right there waiting. Go take Omni for a spin.