Upside-Down Pyramid: Why IT Patching Needs Reform
Get your checklist ready, send out communications, make sure the call tree is ready and make sure application teams are on standby to verify their systems. Does this sound familiar to anyone out there? It is the makings of the monthly patch cycle that most of us know well; if you don't, you're one of the lucky ones. Interestingly enough, patching does not make IT simple. Patching adds complexity to IT that involves millions of dollars of investment and resource overhead. I was raised at the dawn of the “Information Technology Golden Age” and was indoctrinated to believe that IT should make work easier, not harder.
That's the point: we should be working smarter, not harder. When I think about the patching process, many of you out there would agree that it is excessive but necessary. Why does our cost have to be so high for something that should already be factored into our business? The one word that comes to mind is “Security”. Many of you have seen the growth of the security sector within the IT field. According to Bloomberg, it costs $57,000 to support security for 50 people in an organization. That means the average organization must pay $1,140 per person to ensure that its environment is secure, which raises the question: what exactly are we paying for?
For data centers this question is very one-sided: organizations are now in the business of securing their data. Data has become the currency of the world; if my data gets compromised, I lose business. So how does this all fit into the cost? Simply put, support teams must ensure that hardware and software are consistently maintained and updated, which is commonly referred to as being “patched”. Patching lets the software we use address security vulnerabilities by pushing updates that close potential holes. Sounds really good, right? Well, that all depends on who you ask. Developers will tell you “No”, Engineers will tell you “Yes”, Customers will tell you “No” and Security will tell you, “you have to”. Now you start to see the problem: different stakeholders have different needs.
These needs differ based on the priorities of the Agency, Security Organization and Business Line. More importantly, these needs are not aligned strategically to drive business effectively. So the question becomes how to align business strategy across the different stakeholders, especially when those stakeholders have different strategic missions. Recently I attended GSA IT C2E, an event created by a collaboration of vendors who perform work at the General Services Administration (GSA) to discuss ways to improve efficiency within their customer space and broker better lines of communication amongst vendors. This same topic was one of the main focal points of the evening, and many vendors voiced ideas and strategies that addressed their particular areas of focus and provided amazing alternatives to this problem. The major point we all agreed upon is that many of the maintenance activities we perform are driven solely by the pressure to secure systems. Hacks from China, internal threats, new technologies, Shellshock attacks, terrorism and many more cyber threats put the government in a state of alarm. This is understandable, especially when the government maintains a steady pool of personal data that, if exposed, could compromise individuals' lives.
So then what’s next? It sounds like I am justifying the exorbitant cost that we pay for securing systems and avoiding the communication gap. It’s time for an evolution in IT security and Business Automation that will streamline the way we do business today. Currently, patching schedules are driven to ensure that all systems are patched within a 30-day window, which includes developing changes, testing the changes and then promoting the changes. For massive systems this is just impossible. Developers need time to ensure that all dependencies in their systems scale, and they try to minimize the number of changes made to an application in order to reduce the overall impact on business line users. Under such an aggressive schedule, teams will either take shortcuts or risk the security of the system to buy themselves more time to develop workarounds.
So where does this new evolution start? DevOps. It has been around for many years but proved expensive early on, with all the up-front costs of building out virtualized systems. With Cloud systems, DevOps becomes inexpensive in comparison to prior years: you can now build systems programmatically without needing multiple full-time engineers to support your environment. Several factors play into successfully establishing a sustainable model as part of this DevOps evolution. The key here is to understand that not every application is DevOps ready. DevOps is an adoption of standards that allow your organization to automate business process by using code (my definition). As we all know, code tends to execute faster than people, and in order to create that foundation you need to have certain underlying components in place:
Platform tool standardization
Standardized code practices
Automated business process
Application security categorization
Obviously DevOps isn’t for everyone, but you have to weigh the risks and rewards to determine whether it is worth establishing this model. Consider a very basic formula for calculating cost: if the cost exceeds 8% of your annual IT budget, then consider switching to an automated model.
(Hours x (Rate x Number of Techs) x Systems) + (Patch Failure % x (Hours x (Rate x Number of Techs) x Systems)) = Cost to Patch
** Note: count everyone who participates in the patching cycle, including business line developers, managers, techs and Tier 1 support services.
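The formula above can be sketched in a few lines of Python. This is only an illustration: the function and parameter names are my own, the failure rate is expressed as a fraction (0.10 for 10%), and multiplying by 12 monthly cycles to compare against the annual budget is an assumption, since the formula itself does not say whether it covers one cycle or a full year.

```python
def cost_to_patch(hours, rate, techs, systems, failure_rate):
    """Cost of one patch cycle, following the formula in the text.

    failure_rate is the patch failure percentage as a fraction (0.10 = 10%).
    """
    base = hours * (rate * techs) * systems   # labor cost of the planned cycle
    rework = failure_rate * base              # extra cost of redoing failed patches
    return base + rework


def should_automate(cycle_cost, annual_it_budget, cycles_per_year=12, threshold=0.08):
    """True when yearly patching cost exceeds the threshold share of the budget.

    cycles_per_year=12 assumes a monthly patch cycle (an assumption, not from the formula).
    """
    return cycle_cost * cycles_per_year > threshold * annual_it_budget


if __name__ == "__main__":
    # Hypothetical numbers: 4 hours, $75/hour, 6 participants, 50 systems, 10% failures.
    cycle = cost_to_patch(hours=4, rate=75, techs=6, systems=50, failure_rate=0.10)
    print(f"Cost per cycle: ${cycle:,.0f}")
    print("Consider automation:", should_automate(cycle, annual_it_budget=10_000_000))
```

Remember that "techs" here should count everyone pulled into the cycle, not only the engineers applying the patches.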
So how does DevOps help my patching cycle? DevOps will allow you to turn your patches into code-based deployments that install automatically across all your platform components. Since systems are built on common platform components, engineers can quickly test the impact that patches will have on components based on the stack they reside on. Additionally, deploying updated components this way gives developers and business users more time to test and verify packages, since deployments are simply a push of a button away. From a security standpoint, you have the rapid deployment needed when critical security vulnerabilities are published and need to be addressed. From a business point of view, you lower costs by setting standards that developers and business line users can adhere to. The benefit you gain from this code-based package is the ability for your organization to recycle and reuse code, which reduces redundancy in many areas, including contracts, resources, hardware, oversight and many more. In an organization with a DevOps framework, you introduce automated business process into operations for Configuration Management. This increases decision makers' ability to communicate and execute decisions through executable business logic that can lead to on-demand changes. That is not to say Configuration Management didn't already exist in your organization, but over-communicating has a tendency to drown out important changes, whereas an automated process will target changes to the individuals who need to be involved. The simplicity of a standardized platform also makes educating employees, leaders and managers more organic, which ultimately leads to better decisions and strategic alignment. So ask yourself: “Who can patch faster, you or the code you wrote?”
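To make the idea of testing by stack concrete, here is a minimal sketch. It assumes each system is tagged with the platform stack it runs on, and that `test_patch` and `apply_patch` are hypothetical hooks standing in for whatever tooling your organization actually uses; none of these names come from a real product.

```python
from collections import defaultdict


def group_by_stack(systems):
    """Map each stack name to the list of system names built on it."""
    groups = defaultdict(list)
    for name, stack in systems:
        groups[stack].append(name)
    return dict(groups)


def rollout(systems, patch, test_patch, apply_patch):
    """Test a patch once per stack, then push it to every system on stacks that pass.

    test_patch(stack, patch) -> bool and apply_patch(system, patch) are
    hypothetical hooks supplied by the caller.
    """
    patched, held_back = [], []
    for stack, members in group_by_stack(systems).items():
        if test_patch(stack, patch):
            for system in members:
                apply_patch(system, patch)
                patched.append(system)
        else:
            # Systems on a failing stack wait for a fix instead of being patched blind.
            held_back.extend(members)
    return patched, held_back
```

The point of the design is that the test runs once per stack rather than once per system, which is where the time savings for developers and business users would come from.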
My final takeaway is the “Priority” of patching. The industry says patch everything as fast as we can. I can fully understand why we would, but does everything really need to be patched right away? If security has taught me anything, it is that security is about measuring your risk. If we have a DevOps framework in place, would it be too much to ask to align patching cycles with system categorization? For example, if I have an application that is classified as “Low”, why does it have to be patched as quickly as one deemed “High”? Many experienced Operations Managers do accommodate based on classification, but it is done more through a process of waivers than through natural operations. This is not to say we shouldn’t patch. To the point: couldn’t we drive the industry to focus on “Highs” and “Moderates” in the first 30 to 45 days and on “Mediums” and “Lows” within 45 to 60 days as a standard? Risk is substantially lower on the lower systems, and in many cases there are follow-up patches to previous patches that need to be applied as well. It is my opinion that the industry needs to consider these types of evolutions, especially when it comes to new technologies that substantially change how we do business.
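As a rough sketch of what categorization-driven patching could look like in code, the snippet below maps each category to the outer bound of the window proposed above and orders systems by deadline. The category names are taken as written in this article, the day counts are my suggested standard rather than an established policy, and the system names are made up.

```python
from datetime import date, timedelta

# Outer bound of each proposed window, in days after the patch is released.
# These values reflect the proposal in this article, not an existing standard.
PATCH_WINDOW_DAYS = {
    "High": 45,
    "Moderate": 45,
    "Medium": 60,
    "Low": 60,
}


def patch_deadline(category, released):
    """Latest date by which a system of this category should be patched."""
    return released + timedelta(days=PATCH_WINDOW_DAYS[category])


def patch_order(systems, released):
    """Sort (name, category) pairs by deadline, earliest first."""
    return sorted(systems, key=lambda s: patch_deadline(s[1], released))


if __name__ == "__main__":
    released = date(2024, 1, 1)
    # Hypothetical inventory.
    inventory = [("wiki", "Low"), ("payments", "High"), ("intranet", "Medium")]
    for name, category in patch_order(inventory, released):
        print(name, category, patch_deadline(category, released))
```

With a table like this in the deployment pipeline, the schedule follows from the classification itself instead of from a waiver filed after the fact.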