The Upgrade Dilemma

Does a microservices or container architecture actually help with upgrades? We are often required to upgrade our networking, storage, operating systems, and application components. Some of those upgrades are security related, while others are not. Almost all of them can impact our application in some fashion. Containers and microservices are supposed to save us from this nightmare. I do not see it: all containers and microservices do is change how our application is packaged, and therefore how it is patched. This is the upgrade dilemma.

I am working with a customer that does 44 billion queries a day. I have written about them in the past. This application has made use of microservices since its inception. Everything about this application is about maintaining a high-performance environment. Microservices were used to allow data to flow to where it needed to be or to allow lookups to happen out of band. In essence, expensive disk-related tasks were moved into microservices, where the disk reads and writes could be strictly controlled and, if necessary, removed outright. However, now we are in a major upgrade cycle, a cycle that has gone on for quite some time.

Most of the problems with the upgrade cycle are due to moving between major versions of an operating system. This changes the underlying kernel, which changes how networking is implemented. These changes impact everything. A slip here could majorly impact performance. Since we are talking about an operating system change, all the base libraries we depend upon have also changed. Some of those changes were to drop insecure protocols. These changes add up over time. We have to think outside our comfort zones to decide how best to do these upgrades.
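One way to catch a dropped protocol before it bites is to ask the new operating system's TLS stack what it will still negotiate. What follows is only a minimal sketch using Python's standard ssl module; the helper names version_allowed and dropped_versions are mine, not part of any library, and the check only reflects how a given SSLContext is configured, which may not capture every system-wide crypto policy.

```python
import ssl


def version_allowed(ctx: ssl.SSLContext, version: ssl.TLSVersion) -> bool:
    """Return True if `version` falls inside the context's configured range.

    Hypothetical helper: it only inspects the context's min/max settings.
    """
    lo, hi = ctx.minimum_version, ctx.maximum_version
    # MINIMUM_SUPPORTED / MAXIMUM_SUPPORTED are sentinels meaning "no bound set".
    above_floor = lo == ssl.TLSVersion.MINIMUM_SUPPORTED or version >= lo
    below_ceiling = hi == ssl.TLSVersion.MAXIMUM_SUPPORTED or version <= hi
    return above_floor and below_ceiling


def dropped_versions(ctx: ssl.SSLContext, required: list) -> list:
    """Names of the protocol versions the application needs but the context refuses."""
    return [v.name for v in required if not version_allowed(ctx, v)]
```

Run against a default context on the old and the new operating system, passing the versions the application is known to use, a non-empty result from dropped_versions is an early warning that a re-key or client update may be coming.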

Now, if this were a container host, any changes to the most fundamental libraries would impact every container. Networking specifically would be impacted: the networking between containers and into the container host. On the other hand, we could package all the older libraries and bits into the container and run it on a newer container host operating system. However, our container would still be running older code: code with known security issues, for example.

Containers are supposed to be lightweight, but that is not what I have seen. Everything is a lift and shift. This leads to having to maintain not only the container host, but also the versions of code inside each container. For some, this is required by their business (such as accounting offices, which constantly have to keep older versions of accounting software that their customers still use). For others, the headache of upgrades just gets worse. We have far more to think about. The usual answer is, "Hey, I use containers, so upgrades are the responsibility of development." They sure are, but if the application requires a library or feature within the container OS, and that was modified, then the container may or may not work as expected.

When it comes to operating system upgrades, containers and microservices make testing your application easier. They make development easier. They do not necessarily make upgrades easier. We are all fans of making small changes to large environments and testing, fixing, and rolling those changes out as rapidly as possible. Microservices and containers make that possible. However, they do not remove the need for continual testing. They do not provide relief from upgrade issues. They make deployment easier, and they can make it easier to pinpoint areas that need more testing or even fixing, but they do not make doing upgrades much easier.

Small changes can make large systems perform far worse. When you are dealing with scale, upgrades become an issue. This is why scale testing is a requirement. It is why we need continual testing during our agile development. This is also why companies like Ixia and Spirent exist: to provide scale for such testing. As we enter the world of IoT, we need to be cognizant of upgrades. Fundamental upgrades to sensors and back-end clusters become major issues. I have seen seemingly innocuous code changes cause performance to drop from 44 billion queries a day to fewer than 1 billion. This drop would impact any company’s bottom line. Operating system upgrades, though, can be especially sneaky. A simple change to remove a security protocol currently used by an application could require a re-key of the entire application. A re-key could include creating brand-new certificates.
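To put those daily numbers into terms a scale test can check, it helps to convert them to a per-second rate and compare each run against a baseline. This is only a rough sketch: the function names and the 5 percent tolerance are illustrative assumptions, not values from the customer environment.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(queries_per_day: float) -> float:
    """Average query rate per second for a given daily volume."""
    return queries_per_day / SECONDS_PER_DAY


def is_regression(baseline_per_day: float, measured_per_day: float,
                  tolerance: float = 0.05) -> bool:
    """True if measured throughput fell more than `tolerance` below baseline.

    The 5% default is an arbitrary illustrative threshold.
    """
    return measured_per_day < baseline_per_day * (1 - tolerance)


if __name__ == "__main__":
    baseline, measured = 44e9, 1e9
    print(f"baseline: {per_second(baseline):,.0f} queries/s")
    print(f"measured: {per_second(measured):,.0f} queries/s")
    print("regression" if is_regression(baseline, measured) else "ok")
```

At 44 billion queries a day the average works out to roughly 509,000 queries per second; at 1 billion it is under 12,000. A check like this belongs in the continual testing loop, so the drop shows up after the small change that caused it rather than after the whole upgrade.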

Upgrades and certificate and secret management become major issues for IoT and many other implementations. How do you manage these today? Do you rely on your developers to get it right? Use tools? Have scripts? Or do you still do things manually? How do you manage major operating system upgrades? How do you handle the upgrade dilemma?
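If the answer today is "scripts," even a small one helps. Below is a minimal sketch, again with Python's standard library only, that reports how many days a certificate has left. The helper names and the 30-day rotation window are assumptions made for illustration; fetch_not_after needs network access to the host you point it at.

```python
import socket
import ssl
import time


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter field of the certificate presented by host:port."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: float = None) -> float:
    """Days between `now` (epoch seconds, default: current time) and expiry."""
    if now is None:
        now = time.time()
    # ssl.cert_time_to_seconds parses strings like "Jan  5 09:34:43 2018 GMT".
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def needs_rotation(not_after: str, window_days: float = 30, now: float = None) -> bool:
    """True if the certificate expires within the (assumed) rotation window."""
    return days_until_expiry(not_after, now) <= window_days
```

Looping this over every endpoint in an inventory gives a crude but honest picture of how much re-keying work a protocol or operating system change could trigger.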
