I have been building solutions on AWS since 2008, and even though that sounds like a long time, I have still only scratched the surface of what is possible in the cloud. Every few weeks I get another “Aha” moment when I see problems solved with cloud architectures that would be either too hard, not feasible, or too time-consuming to accomplish in a non-cloud environment. Here is my latest “Aha” moment.
Distributing Workloads Across Multiple Database Technologies
With traditional on-premises architectures, keeping the number of unique database technologies to a minimum is often the norm. This is because each database technology added to the application architecture requires its own skill set to develop, manage, and maintain it. For example, take an application requiring an SQL solution like Oracle, a NoSQL solution like MongoDB, a caching solution like Redis, and a disk storage solution on a WAN. From an administration and operations standpoint, the company would most likely have to assign different staff members with expertise in each one of these technologies. In addition, each one of these technologies most likely requires a unique collection of hardware components, which may also require specific skills on the infrastructure teams. From a development standpoint, developers now have to work with multiple specialists to procure, install, manage, and operate all of these unique technologies. For many companies this may not be feasible, from both a cost and a time-to-market standpoint.
In the cloud, and specifically with AWS, all of these various types of database technologies are available as services. Developers no longer need to go through long procurement cycles, and companies no longer need to ramp up staff to install, manage, and operate the underlying technologies and infrastructure, since these technologies are on-demand resources accessible via APIs. Developers can quickly experiment with multiple types of database technologies to match each workload with the best tool for the job. It is not uncommon for a cloud architecture to mix and match a relational database managed as a service (e.g. RDS) with a NoSQL service (e.g. DynamoDB), a caching tier (e.g. ElastiCache), a cloud disk storage service (e.g. S3), and a content delivery network (e.g. CloudFront) without the complexity, cost, and operational burden that come with running all of these technologies manually.
In fact, a developer with a decent amount of AWS experience could have a prototype of an application using all of the above technologies up and running in a few short hours. In an on-premises world, not one of these database technologies could be installed, functional, and available for the developer in a short timeframe. First, the hardware and software would need to be procured. Once it was procured, an infrastructure person would need to install the hardware. Then someone (possibly the same person, but maybe not) would have to install the operating system and other critical software components. After that, a DBA-type resource would need to implement the database technology. If the company has no in-house expertise, it might need to hire specialists for this. All of these tasks and more would need to be completed before the developer could even attempt to prototype the application for the target database technology. This process could take weeks or, in some companies, months.
In most cases, the application developer does not have the luxury of time to wait for the various technologies to be enabled, so they tend to stick with the technologies that are already implemented and managed by the company. The end result is that many applications force workloads into relational databases even if a relational database is not the best tool for the job. The following picture shows a common design pattern on AWS for a 3-tier web application. Notice all of the different APIs that are used and the different database technology choices. This type of architecture would take a lot of physical hardware and staffing to support in an on-premises world.
The point here is not to bash relational databases or criticize anything about on-premises and self-managed solutions. The point is that in the cloud, it is more feasible to create architectures composed of many different technologies because the burdens of implementing and managing all of the underlying technologies are far lighter than in an on-premises world. In the cloud it makes sense to classify data, identify unique workload patterns, and then use the best database technology available for each workload to maximize performance. Instead of throwing everything in a big honking database and spending countless hours tweaking it for performance, we can now distribute unique workloads across unique technology solutions and scale each independently.
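To make the idea of classifying workloads a little more concrete, here is a minimal sketch in Python. Everything in it is my own illustration: the Workload fields, the Store categories, and the order of the rules in choose_store are assumptions made for the example, not an AWS API or a recommended decision procedure. A real classification would weigh many more factors, such as access patterns, data volume, and cost.

```python
from dataclasses import dataclass
from enum import Enum


class Store(Enum):
    # Hypothetical categories, loosely mirroring the services named above.
    RELATIONAL = "relational database (e.g. RDS)"
    NOSQL = "NoSQL service (e.g. DynamoDB)"
    CACHE = "caching tier (e.g. ElastiCache)"
    OBJECT = "storage service (e.g. S3)"


@dataclass(frozen=True)
class Workload:
    # Illustrative traits only; real workloads have many more dimensions.
    name: str
    needs_transactions: bool = False  # joins, multi-row consistency
    hot_reads: bool = False           # small values that are re-read constantly
    large_binary: bool = False        # images, video, backups


def choose_store(workload: Workload) -> Store:
    """Pick a store category for one workload using simple, assumed rules."""
    if workload.large_binary:
        return Store.OBJECT
    if workload.needs_transactions:
        return Store.RELATIONAL
    if workload.hot_reads:
        return Store.CACHE
    # Default for high-volume key/value style access.
    return Store.NOSQL


workloads = [
    Workload("orders", needs_transactions=True),
    Workload("session tokens", hot_reads=True),
    Workload("product images", large_binary=True),
    Workload("clickstream events"),
]
placement = {w.name: choose_store(w) for w in workloads}
```

The point of the sketch is that each workload ends up on its own kind of store, so each can be tuned and scaled without touching the others.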
Making Technology More Feasible
Cloud APIs make it faster and more feasible for developers to experiment with many different types of technologies without requiring their company to make an initial investment in software, hardware, and human resources. Introducing a technology like Hadoop in an on-premises world could require a large upfront investment. Leveraging a DBaaS like DynamoDB greatly reduces that investment because the database is managed as a service and requires no procurement or installation and minimal administration. In my previous post, I discussed how the cloud increases agility and highlighted how I was able to set up a highly available, fault-tolerant PHP application within a virtual private cloud in under 2 hours. Cloud APIs are making it faster and cheaper to build complex architectures in short timeframes with minimal resources. The fact that virtual servers can be spun up and deployed in minutes creates all kinds of new design approaches that were off the table in the days of procuring hardware. Vertically scaling architectures are giving way to horizontally scaling architectures. Architectures are being broken down into smaller components that run as individual services and scale independently on their own infrastructure with their own SLAs. Developers really need to think differently when designing in the cloud; no more big monolithic applications. Like a human body made up of many cells, each performing autonomously, cloud architectures can be built from many independent components. When one component fails, the rest of the application should continue to function.
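Here is a small sketch of what "the rest of the application should continue to function" can look like in code. It is only an illustration under my own assumptions: render_page and the component names are hypothetical, and the components are plain Python functions standing in for calls to independent services.

```python
from typing import Callable, Dict, List, Tuple


def render_page(
    components: Dict[str, Callable[[], str]],
    fallback: str = "temporarily unavailable",
) -> Tuple[Dict[str, str], List[str]]:
    """Call each component independently so one failure cannot sink the page."""
    results: Dict[str, str] = {}
    failed: List[str] = []
    for name, fetch in components.items():
        try:
            results[name] = fetch()
        except Exception:
            # Isolate the failure: record it and degrade this one section.
            results[name] = fallback
            failed.append(name)
    return results, failed


def catalog() -> str:
    return "42 products"


def recommendations() -> str:
    # Simulates a dependent service that is down.
    raise TimeoutError("recommendation service did not respond")


page, failed = render_page({"catalog": catalog, "recommendations": recommendations})
```

In this example the catalog section still renders while the recommendations section falls back to a placeholder. In a real system each call would also need its own timeout, and probably retries or a circuit breaker, which I have left out to keep the sketch short.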
Cloud providers like AWS abstract the underlying infrastructure and provide many PaaS-like services so that developers can quickly deploy applications. Providing developers with APIs to access numerous types of services in the categories of compute, networking, storage, analytics, utilities, security, operations, and more enables companies to spend more time focusing on business requirements and less time focusing on IT plumbing. The on-demand nature of these services opens up new opportunities for architects. Now that systems can be built to automatically provision compute resources and take advantage of third-party solutions simply by calling an API, the architect’s toolbox is loaded with tools that may not have been available to them before. Since compute resources are now immutable (aka disposable), fresh new architectural approaches that were just never feasible before can be applied in the cloud. To take advantage of what the cloud offers us, we need to change the way we approach design patterns so that we don’t deploy on-premises architectures in a cloudy world.
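As one example of an approach that disposable compute makes practical, the sketch below replaces servers instead of repairing them. It is a simplified model under my own assumptions: Instance, reconcile, and the image names are hypothetical, and the function only computes a plan. Actually terminating and launching servers would go through the provider's API, which is not shown here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instance:
    # Hypothetical record describing one running server.
    instance_id: str
    image: str      # the machine image the server was launched from
    healthy: bool


def reconcile(
    fleet: List[Instance], desired_image: str, desired_count: int
) -> Tuple[List[str], int]:
    """Plan how to reach the desired fleet without patching any server in place.

    Returns the ids to terminate and the number of fresh instances to launch.
    """
    # Only healthy servers already on the desired image are worth keeping.
    keep = [i for i in fleet if i.healthy and i.image == desired_image]
    keep_ids = {i.instance_id for i in keep[:desired_count]}
    # Everything else is disposed of rather than fixed.
    terminate = [i.instance_id for i in fleet if i.instance_id not in keep_ids]
    launch = max(desired_count - len(keep_ids), 0)
    return terminate, launch


fleet = [
    Instance("a", "v2", True),
    Instance("b", "v1", True),   # healthy, but running an outdated image
    Instance("c", "v2", False),  # current image, but unhealthy
]
terminate, launch = reconcile(fleet, desired_image="v2", desired_count=3)
```

With this fleet the plan is to terminate "b" and "c" and launch two fresh instances from the "v2" image. Nobody logs in to upgrade "b" or debug "c", which is the kind of design choice that was hard to justify when every server was a purchased piece of hardware.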