Building a Multi-Kubernetes Management Platform on Clusterpedia
Challenge
Following its cloud native transformation in 2016, China Mobile Cloud (ECloud) was running more than 100 public cloud resource pools and hundreds of super-large Kubernetes clusters. They needed an easier way to manage multiple super-scale clusters at the same time. Li Li, leader of the Cloud Native Team at China Mobile Cloud, explained: “Kubernetes enables more efficient use of physical resources, easier management and orchestration of containerized applications, and greatly improves the availability and stability of applications. At the same time, we need a tool that can manage these clusters.”
Solution
In 2022, the team discovered Clusterpedia, a CNCF project that makes it easy to manage and efficiently retrieve resources running on multiple Kubernetes clusters.
Impact
Since introducing Clusterpedia, China Mobile Cloud has improved the efficiency of multi-cluster resource retrieval by 60%, realized unified access to multi-cluster workload statuses, and made multi-cluster resource management possible. Engineers have also saved 50% of their operation and maintenance time.
By the numbers
50% Savings
On operational and maintenance time
60% Improvement
In efficiency of multi-cluster resource retrieval
100% retrieval
Of multi-cluster resource status
China Mobile Cloud (ECloud), which belongs to China Mobile Communications Group Co., Ltd., caters to government enterprises, institutions, developers, and other customers – providing basic cloud native resources, platform capabilities, software applications, and other services. Currently, China Mobile Cloud’s public cloud IaaS+PaaS market share ranks seventh in China, and it is the fastest growing public cloud service in the world. Its private cloud share ranks fifth, and its government cloud has entered the top three in the industry.
China Mobile Cloud serves millions of enterprise customers and has more than 1,200 eco-partners. The company pursues its values of “5G era, the intelligent cloud around you”, based on cloud native, open source technology and standards. It ensures that its own product and technology innovation reflects this – providing customers with a full-stack, cloud native product family, helping users to be agile, secure, and trustworthy, and to continue to evolve for the future.
In 2022, China Mobile Cloud planned to build a multi-cluster management platform to provide an operations and management portal for the underlying multiple hyperscale clusters. Specifically, two main issues arose:
- Multi-cluster retrieval
- Compatibility with different versions of Kubernetes
After considering a number of different solutions, China Mobile Cloud chose Clusterpedia.
“Clusterpedia is easy to use and stable. Because it is a relatively new project, we started with an attitude of experimentation, but through practice, Clusterpedia has proven to be fully capable of carrying the weight of production, and it has worked steadily so far.”
Li Li, Leader, Cloud Native Team at China Mobile Cloud
In addition, the team reported that because Clusterpedia handles version compatibility strictly on its own side, clients do not have to change their code or upgrade client-go, which helps reduce upgrade costs.
Improved experience of searching in multiple clusters
Before the introduction of Clusterpedia, China Mobile Cloud had no tool for retrieving resources across multiple clusters. As the business grew and the data center scaled, quickly retrieving a specific resource across multiple clusters became an urgent problem to solve.
“We needed to be able to retrieve resources on all Kubernetes clusters in a single panel. Previously, we had to keep detailed records of what applications were running on each K8s cluster and then go to the corresponding K8s cluster based on that record, but now we can do that based on Clusterpedia, which enables fast multi-cluster resource retrieval that greatly improves the efficiency of multi-cluster resource searches.”
Li Li, Leader, Cloud Native Team at China Mobile Cloud
Clusterpedia’s multi-cluster resource retrieval, together with its Watch feature for multi-cluster resources, allowed China Mobile Cloud to aggregate multi-cluster resource status events in one place, enabling monitoring and processing of multi-cluster resource status.
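To make the idea of single-panel retrieval concrete, here is a minimal sketch of how a request path for a multi-cluster query might be built. It is not taken from China Mobile Cloud's platform. The API prefix `/apis/clusterpedia.io/v1beta1/resources`, the `/clusters/<name>` segment, and the `search.clusterpedia.io/clusters` label selector follow Clusterpedia's published conventions as best understood here and should be verified against the installed version; the function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: prefix of Clusterpedia's aggregated API; verify for your version.
CLUSTERPEDIA_PREFIX = "/apis/clusterpedia.io/v1beta1/resources"
# Assumption: search label used to restrict a query to named clusters.
CLUSTERS_LABEL = "search.clusterpedia.io/clusters"


def native_path(group, version, resource, namespace=None):
    """Build the standard Kubernetes REST path for a resource collection."""
    # The core group ("") lives under /api, every named group under /apis.
    root = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    scope = f"/namespaces/{namespace}" if namespace else ""
    return f"{root}{scope}/{resource}"


def clusterpedia_path(group, version, resource, namespace=None, clusters=()):
    """Wrap a native path so the query goes through Clusterpedia.

    No clusters: search everything Clusterpedia knows about.
    One cluster: use the per-cluster path segment.
    Several clusters: filter with a label selector in the query string.
    """
    clusters = list(clusters)
    path = CLUSTERPEDIA_PREFIX
    query = ""
    if len(clusters) == 1:
        path += f"/clusters/{clusters[0]}"
    elif len(clusters) > 1:
        selector = f"{CLUSTERS_LABEL} in ({','.join(clusters)})"
        query = "?" + urlencode({"labelSelector": selector})
    return path + native_path(group, version, resource, namespace) + query


if __name__ == "__main__":
    # Deployments across all clusters, then pods in one namespace of one cluster.
    print(clusterpedia_path("apps", "v1", "deployments"))
    print(clusterpedia_path("", "v1", "pods", namespace="default", clusters=["cluster-1"]))
```

Because the native Kubernetes path is kept intact after the prefix, a path like this could in principle be passed to a tool such as `kubectl get --raw`, which is one way to read the "single panel" idea: one endpoint, with the cluster scope expressed in the path or selector rather than by switching kubeconfig contexts.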
In addition, the China Mobile Cloud team has successfully reworked some product components to gain multi-cluster capabilities on top of Clusterpedia. These components use client-go’s informer to watch Clusterpedia’s APIServer, sense status changes in workloads on any cluster, and perform the corresponding event processing.
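The event-processing pattern described above can be sketched without client-go. The example below is a simplified stand-in, not an informer and not the team's implementation: it decodes a Kubernetes-style watch stream (newline-delimited JSON objects with `type` and `object` fields) and hands each event to a callback together with the cluster it came from. The annotation key `shadow.clusterpedia.io/cluster-name` is an assumption about how Clusterpedia labels the source cluster and should be checked against the deployed version; the function names and sample data are hypothetical.

```python
import json
from collections import Counter

# Assumption: annotation Clusterpedia adds to identify the source cluster.
CLUSTER_ANNOTATION = "shadow.clusterpedia.io/cluster-name"

# Hypothetical sample of a watch stream: one JSON event per line.
SAMPLE_STREAM = [
    json.dumps({"type": "ADDED", "object": {"kind": "Deployment", "metadata": {
        "name": "web", "annotations": {CLUSTER_ANNOTATION: "cluster-1"}}}}),
    json.dumps({"type": "MODIFIED", "object": {"kind": "Deployment", "metadata": {
        "name": "api", "annotations": {CLUSTER_ANNOTATION: "cluster-2"}}}}),
    json.dumps({"type": "DELETED", "object": {"kind": "Deployment", "metadata": {
        "name": "old"}}}),
]


def handle_watch_stream(lines, on_event):
    """Decode watch events line by line and call on_event(type, cluster, obj)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        event = json.loads(line)
        obj = event["object"]
        annotations = obj.get("metadata", {}).get("annotations") or {}
        # Objects without the annotation are reported under "unknown".
        cluster = annotations.get(CLUSTER_ANNOTATION, "unknown")
        on_event(event["type"], cluster, obj)


def count_by_cluster(lines):
    """Tally events per (cluster, event type) as a trivial example handler."""
    counts = Counter()

    def tally(event_type, cluster, _obj):
        counts[(cluster, event_type)] += 1

    handle_watch_stream(lines, tally)
    return counts


if __name__ == "__main__":
    for key, n in sorted(count_by_cluster(SAMPLE_STREAM).items()):
        print(key, n)
```

A real informer adds a local cache, resync, and reconnection on top of this loop. The sketch only shows the part the text emphasizes: a single stream carries events from every cluster, and the handler decides what to do based on the event type and the originating cluster.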
Clusterpedia also liberated the operations engineers, providing a unified panel to troubleshoot problems without the hassle of switching between multiple clusters.
“Users want to be able to use multiple clusters without changing any code, as if they were in a single cluster, and they use client-go, kubectl, to interact with Kubernetes. Clusterpedia is fully compatible with the native Kubernetes API, and we made some minor modifications to Clusterpedia to allow it to take on the role of a multi-cluster APIServer.”
Li Li, Leader, Cloud Native Team at China Mobile Cloud
Community involvement
The team now plans to work on a multi-cluster solution that will allow applications to run on multiple clusters as if they were on a single cluster. In addition to read operations on multi-cluster resources, the work will cover create, delete, and update operations, as well as areas such as multi-cluster resource scheduling, multi-cluster networking, and multi-cluster monitoring. Having enjoyed the convenience of Clusterpedia, China Mobile Cloud is also looking for ways to give back to the open source community.
“Clusterpedia is a treasure we found in CNCF. We have benefited from this project and we are also actively involved in contributing back. For example, the Watch feature for multi-cluster resources is based on our business requirements, but we have open sourced it for the community. We look forward to working with the Clusterpedia community in more depth in the future, and to growing with the community.”
Li Li, Leader, Cloud Native Team at China Mobile Cloud