How PNC Bank automated software supply chain compliance with TriggerMesh.
Challenge
As one of the largest banks in the United States, with $367 billion of assets under administration, PNC has a massive IT footprint and a dev team that must not only deliver innovative code but also consistently meet regulatory compliance requirements. PNC sought a way to ensure that new code would automatically meet security standards and audit compliance requirements, replacing the cumbersome 30-day manual process it had in place.
Solution
Using Knative, the cloud native serverless and eventing framework, PNC developed internal tools that automatically check new code and changes to existing code. Developers know immediately whether their code meets company-wide standards. Knative’s eventing and serverless features allow PNC to bridge processes between Apache Kafka and CI/CD toolchain events and achieve this automated state. PNC also used the TriggerMesh declarative API to address the specifics of the event-driven workflow. The process allows PNC to stop code from going into production if any of the outlined requirements is not met.
Impact
Deployment became easier, clearer, and immeasurably faster. An automated, near-instantaneous process replaced one that meant 37 or more days of preparing presentations and holding meetings. The internally developed Policy-as-Code service checks code in near real time. Developers are freed up, and code reviews are no longer subject to the errors inherent in human reviews. Developers use a highly developed CI/CD process for over 6,000 applications maintained at PNC. Tests are created and implemented by compliance owners and automatically integrated into the workflow.
By the numbers
Time saving
30 days off the development cycle, saving large amounts of time and money.
Scale
20k code repositories kept compliant to company standards
Consistency
5,000 developers can finalize the custom compliance process in real time
Automating compliance to realize continuous delivery
Every CI/CD improvement that speeds code to deployment multiplies the effectiveness of a software team. At finance organizations, the quest to develop an efficient, reliable CI/CD pipeline inevitably runs into compliance requirements. Making certain that applications abide by the rules of organizational and regulatory bodies is complex, and a lot is on the line. Mistakes create further delays, fines, and potentially embarrassment. PNC, with 9,500 developers and 20,000 code repositories, is always looking for efficiencies to improve productivity at scale. The Director of DevOps for PNC and his small team of developers were tasked with a major challenge: reducing the time and cost required for compliance review of code. With hundreds of teams delivering code, PNC’s existing 37-day manual process was a giant barrier to software deployment.
Before the DevOps transformation that PNC initiated six years ago, new code took over 300 days to get from code completion to production. Using automation and DevOps best practices, PNC reduced the code-to-production window to 37 days.
The manual compliance process was the last-mile problem of CI/CD for PNC. For a dev team, compliance amounted to 120 hours of work after code was complete. That effort went into producing slide decks with screenshots, holding meetings, and communicating with multiple business units to ensure regulatory compliance.
Eliminating manual compliance
The PNC Portfolio Management Team, which oversees DevOps, had to find a way to eliminate the 30-day manual compliance process while also keeping regulators and internal clients happy and productive.
The solution involved building cloud native infrastructure using Kubernetes, Knative, and TriggerMesh. PNC crafted a sophisticated internal service, named Policy-as-Code, that harnesses Knative’s automated eventing and serverless capabilities. Using TriggerMesh’s declarative API, the team was able to take advantage of event logging to craft a bridge from Apache Kafka and Jenkins to the bank’s Policy-as-Code application. Helm handled implementation of rules, providing a way for the compliance teams to use the system without needing to interact with the backend.
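The idea of declaratively bridging toolchain events to the right policy component can be illustrated with a minimal sketch. It is modeled loosely on attribute-based event filtering of the kind Knative Triggers apply to CloudEvents; it is not the TriggerMesh API, and the event types, sources, and handlers are hypothetical names invented for this illustration.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]
Handler = Callable[[Event], str]


def matches(filter_attrs: Dict[str, str], event: Event) -> bool:
    """Exact-match filter: every listed attribute must equal the event's value."""
    return all(event.get(key) == value for key, value in filter_attrs.items())


def route(event: Event, triggers: List[Tuple[Dict[str, str], Handler]]) -> List[str]:
    """Deliver the event to every handler whose filter matches it."""
    return [handler(event) for attrs, handler in triggers if matches(attrs, event)]


# Hypothetical routing table: (filter, handler) pairs declared as data.
triggers: List[Tuple[Dict[str, str], Handler]] = [
    ({"type": "ci.build.finished"}, lambda e: "record-build:" + e["subject"]),
    ({"type": "ci.scan.finished", "source": "jenkins"},
     lambda e: "check-scan:" + e["subject"]),
]

event = {"type": "ci.scan.finished", "source": "jenkins", "subject": "payments-api"}
results = route(event, triggers)
```

Because the routing table is plain data, adding a tool to the toolchain means adding an entry rather than changing the routing code.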
Policy-as-Code sits outside the deployment pipeline while maintaining the ability to work with each team’s toolchain. That toolchain varies from team to team, but normally consists of over 15 tools. Because not all teams within PNC use the same tools, flexibility was an important requirement for the project.
“For Policy-as-Code to work, you need to have solid event capture,” said the Director. “That is because the application needs to understand when and how information is being exchanged between different elements of the toolchain. Not only do I need to know that an event happened but I need to know what that event impacted and what the results of the event were. For example, if we are talking about a static code quality scan, it would be very easy to set up a web hook to tell you a scan is finished. But we need to, on enrichment, make a call back and attribute the results of a scan to the actual binary that we are planning to deploy. That is harder, and that is just one tool. If you multiply that by all the different toolchain integrations that we have, plus the different procedural variations, you get a sense of the complexity.”
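The enrichment step the Director describes, attributing a scan result to the binary that will actually be deployed, can be sketched as a join between two event streams. This is a simplified illustration under stated assumptions: the record types, field names, and the commit-to-digest lookup are hypothetical, and a dictionary stands in for the call back to the build system.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ScanFinished:
    """Hypothetical 'scan finished' notification, as a webhook might deliver it."""
    commit: str
    tool: str
    passed: bool


@dataclass(frozen=True)
class EnrichedScan:
    """The same scan result, now tied to a deployable artifact."""
    artifact_digest: str
    commit: str
    tool: str
    passed: bool


def enrich(scan: ScanFinished, builds_by_commit: Dict[str, str]) -> Optional[EnrichedScan]:
    """Attach the artifact built from the scanned commit, if one is known.

    Returns None when no build is recorded for the commit, so the caller
    can treat the scan as unattributed rather than guess.
    """
    digest = builds_by_commit.get(scan.commit)
    if digest is None:
        return None
    return EnrichedScan(digest, scan.commit, scan.tool, scan.passed)


# Assumed lookup from commit to artifact digest; stands in for the call back.
builds = {"abc123": "sha256:111"}
enriched = enrich(ScanFinished("abc123", "static-scan", True), builds)
```

The webhook alone only carries the first record; the second is what a policy check can actually reason about.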
They turned to Kubernetes, Knative, and TriggerMesh to achieve that level of automation.
“We wanted to build this cloud native, but our MVP quickly ballooned into a Golang monolith that was running thousands of different threads and was begging for failure. We looked at what we had, and we looked at our roadmap, and we said, ‘This ain’t going to work’.”
Director, PNC Bank
Luckily, the team looked for another method early in the process. Implementing TriggerMesh in the Knative environment fundamentally changed the architecture by separating out the policies from the process.
As a result, the Policy-as-Code team would not have to maintain sprawling code bases. With such a complicated tool, it was also important that the team could fix things without it becoming a quagmire. Knative helped solve these issues by giving them the ability to independently test each component of Policy-as-Code without end-to-end testing.
In addition, the connection to Kafka using Knative was much easier. Switching to a Knative architecture shrank their base images to around 10 MB from 1.3 GB.
“Another thing that is really cool when using TriggerMesh and Knative to route the system is that it is so modular. At all the interaction points we can read off the payloads without writing much code. We get more visibility over our entire flow,” said the Lead Software Engineer for the team.
Operating on the Knative architecture, the open source solution from TriggerMesh (a CNCF member) provided efficient, easily managed event capture. The ability to orchestrate through TriggerMesh meant others in the company would not need to understand the whole backend system. “We abstracted a lot of it away,” said the Lead Software Engineer. “The team would only have to worry about building that small serverless function.”
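To give a sense of how small such a function can be, here is a hedged sketch of a handler that receives a CloudEvent over HTTP in binary content mode (attributes carried in `ce-` prefixed headers) and replies with a verdict event. The event types, the `critical_findings` field, and the pass rule are assumptions made for this example, not PNC's actual checks.

```python
import json
from typing import Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]


def handle(headers: Dict[str, str], body: bytes) -> Response:
    """Evaluate one hypothetical policy for one hypothetical event type."""
    # HTTP header names are case-insensitive, so normalise them first.
    attrs = {name.lower(): value for name, value in headers.items()}

    # Ignore event types this function is not responsible for.
    if attrs.get("ce-type") != "ci.scan.finished":
        return 202, {}, b""

    data = json.loads(body)
    # Assumed rule: pass only when the scan explicitly reports zero critical
    # findings; a missing field counts as a failure.
    verdict = "pass" if data.get("critical_findings", 1) == 0 else "fail"

    reply_headers = {
        "ce-specversion": "1.0",
        "ce-type": "policy.verdict",        # hypothetical reply type
        "ce-source": "policy/static-scan",  # hypothetical source name
        "ce-id": attrs.get("ce-id", "") + "-verdict",
        "content-type": "application/json",
    }
    reply = {"artifact": data.get("artifact"), "verdict": verdict}
    return 200, reply_headers, json.dumps(reply).encode()


status, reply_headers, reply_body = handle(
    {"Ce-Type": "ci.scan.finished", "Ce-Id": "42"},
    json.dumps({"artifact": "sha256:111", "critical_findings": 0}).encode(),
)
```

Everything outside this function, such as delivery, retries, and routing, is left to the platform, which is the abstraction the quote refers to.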
PNC’s technical infrastructure is always under load, handling a query load of around 20 billion per day. At peak times, queries can come in at 500,000 a second. When developers push code, all pieces need to be in place to keep the system up and running because events must be processed in real time.
Faster code deployment and automatic audit trail
As a custom serverless application in its own right, Policy-as-Code provides a pass/fail status for code submitted by internal clients. Developers know immediately whether their code meets standards, all within an easy-to-use interface. Code that does not pass compliance checks does not go into production.
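The pass/fail gate can be sketched as a check that every required piece of evidence is present and positive, so that anything missing blocks deployment. The check names below are hypothetical placeholders; the source does not list PNC's actual requirements.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical required checks, invented for this illustration.
REQUIRED_CHECKS: Tuple[str, ...] = ("static-scan", "unit-tests", "peer-review")


def gate(
    evidence: Dict[str, bool],
    required: Sequence[str] = REQUIRED_CHECKS,
) -> Tuple[bool, List[str]]:
    """Return (passed, blocking_checks).

    A check blocks deployment when it failed or when no evidence for it
    was recorded at all.
    """
    blocking = [name for name in required if evidence.get(name) is not True]
    return (not blocking, blocking)


ok, blocking = gate({"static-scan": True, "unit-tests": False})
```

Returning the list of blocking checks alongside the verdict lets a developer see immediately what is missing.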
Already, 6,000 application components at PNC utilize Policy-as-Code. A process that took a team an average of 30 days to complete can now take place in near real time. Teams are no longer burdened with long wait times or code review meetings and communications before their code can go to production.
“Our real success was our ability to say if your code change is fully compliant and does not affect complex authentication or funds transfer, you can go from new code to production in just the time it takes to run your pipelines. There is no longer 120 hours over 37 days of non-code compliance work standing in the way of production. Thousands of PNC software developers have seen their deployment window shrink to near real time. Our teams achieve true continuous delivery. That is our big win.”
Director, PNC Bank