Transportation
Europe
Data Center Hybrid Edge
Internal Silos and legacy systems
Massive amounts of real-time data
200 clusters across Azure and AWS public clouds and on-prem data center
One team brokering services for 400 internal projects
Data from 5000 trains per day
Immutable OS with small attack surface
Easy to install new versions
Release management
90% fewer production incidents between IaaS and CaaS
66% less maintenance effort
Zero configuration drift
La Société Nationale des Chemins de fer Français (SNCF) is France’s state-owned national railway company, responsible for the country’s entire rail network, including high-speed intercity TGV trains.
SNCF relies on its Cloud Native Team to broker services for all the main IT divisions, covering 400 different internal projects across train management, tracks, train stations, rolling stock maintenance, finance, real estate, and more. The team is also active in open source, contributing to CNCF projects, including Harbor, and participating in the Platform Engineering Working Group.
The SNCF team processes real-time data from 4,000-5,000 trains daily to support critical passenger information systems across the entire Paris railway network, including information that needs to be shared with the public. They needed to modernize their applications in order to keep up with the endless flow of train data.
Given the complexity and size of the data that must be managed on a daily basis, SNCF also wanted to provide seamless operations at the node level, systematizing the way they create, destroy, roll out, and autoscale nodes. They wanted an immutable OS to install at the edge, both in the trains and in the train stations, that would send information immediately back to their on-prem data center for analysis.
Transitioning from public cloud to an open source Kubernetes platform in SNCF’s data center presented technical and operational challenges and required a broader organizational transformation. Silos, years of established processes, and a 200-page security manifesto caused friction as the team tried to move forward. Teams that had long worked on traditional infrastructure struggled to align with cloud native approaches.
The Cloud Native Team began experimenting with Kubernetes in 2018 but failed to efficiently leverage and implement it at scale. They built a Kubernetes solution using Ubuntu with RKE2, but the year-long project was difficult and ultimately unsuccessful. To move forward, SNCF decided on four key principles. They would need to:
“We don’t need to ask the legacy teams to provide us with a modern solution to run Kubernetes. We’ve got our own out-of-the-box solution for Kubernetes, which is the Talos operating system and Kubernetes. We can just run it through OpenStack and then the magic happens.”
Thomas Comtet, Senior Staff Engineer, SNCF
These principles led SNCF to Talos Linux. Talos’s immutability and built-in security ensured compliance with that 200-page security manifesto. Talos provides release management, makes it easy to install new versions, and erases configuration drift, enabling the team to effectively manage infrastructure at scale.
SNCF uses Talos on-prem in the data center to manage live data, the real-time positioning of trains, real-time localization, and the communication of current train statuses to the public (e.g., ETA and track). The team now manages approximately 200 Kubernetes clusters across their environment.
The SNCF team developed their own tool (https://github.com/mstrohl/talos-cockpit) to replicate AKS auto-upgrade functionality for its on-prem environment, making data center operation possible in locations where direct cloud provider tools aren’t available. By making this open source, they are able to share expertise with the community and ensure others benefit from their learnings.
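To make the idea of an auto-upgrade policy concrete, here is a minimal sketch of the kind of decision such a tool has to make. It is not taken from talos-cockpit and does not describe how that project works. It assumes a "patch"-style policy (stay on the current minor release, move to the newest patch), similar in spirit to the AKS patch auto-upgrade channel. The function names and the version tags in the usage note are hypothetical.

```python
from typing import Iterable, Optional, Tuple

Version = Tuple[int, int, int]


def parse_version(tag: str) -> Version:
    """Turn a tag such as 'v1.9.3' into a comparable (1, 9, 3) tuple."""
    major, minor, patch = tag.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def pick_patch_upgrade(current: str, available: Iterable[str]) -> Optional[str]:
    """Return the newest tag on the same major.minor line as `current`.

    Hypothetical helper: returns None when the node is already on the
    latest patch release of its minor line.
    """
    cur = parse_version(current)
    candidates = [
        tag
        for tag in available
        # Same major.minor, strictly newer patch level.
        if parse_version(tag)[:2] == cur[:2] and parse_version(tag) > cur
    ]
    return max(candidates, key=parse_version, default=None)
```

For example, with the illustrative tags `["v1.9.0", "v1.9.3", "v1.10.0", "v1.9.2"]`, a node on `v1.9.1` would be offered `v1.9.3`, while a node already on `v1.9.3` would be left alone, because `v1.10.0` is a new minor release and falls outside this policy.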
“SNCF’s experience demonstrates that simplifying complex systems, rather than adding layers of complexity, leads to more effective outcomes. By embracing cloud native principles and tools like Talos, we created a consistent, efficient infrastructure that supports both its cloud and on-prem operations.”
Thomas Comtet, Senior Staff Engineer, SNCF
The SNCF Cloud Native Team has driven significant results in its operations, including modernizing critical applications and bringing the benefits of a cloud native architecture to applications that cannot run in the public cloud.
SNCF has improved its technical stability through Talos, reducing production incidents between IaaS and CaaS by 90%, with only minor issues remaining. The team has also increased its efficiency, achieving a 66% reduction in maintenance effort and eliminating configuration drift.
Talos made it easy for the team to work quickly, reducing development time and enabling them to have a production-ready, cloud-native solution in 4 months. SNCF’s success shows that by standardizing on a secure, minimal operating system and applying cloud native principles across environments, even a large national infrastructure can modernize quickly and operate with greater stability and efficiency.