Articles Tagged with availability

Whenever AWS has an outage, it makes the news. In fact, AWS said the recent issue wasn’t even an outage, and it still made the news. S3 returning elevated error rates in the US-East-1 region caused application problems for a few hours. Personally, it affected my morning routine: I start the day reading blog posts in NewsBlur, and NewsBlur wouldn’t show me any blogs. Instead, it reported server errors caused by the S3 issue, so my usual source of news couldn’t tell me that there was news about an AWS S3 issue. Before we start talking about how unreliable the cloud is, let us ask: who among us has private infrastructure that is without fault? While cloud service outages make the headlines, on-premises outages happen all the time, too. And who cares if your application is unavailable for a few hours every couple of years? Not every application needs 100% uptime. It may be the right business decision to accept an application outage when there is an infrastructure outage.
The recent Amazon Web Services Simple Storage Service (S3) outage has taught us quite a bit about fragile cloud architectures. While many cloud providers will make hay over it during the next few weeks, the problem is broader: current cloud architectures are fragile, and modern hybrid cloud architectures are fragile, too. We need to learn from this outage to design better systems: ones that are not fragile, ones that can recover from an outage. Calling the cloud fragile is not naysaying; it is a chance to do better. What can we do better?
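One way to be less fragile is to avoid making any critical read path depend on a single region or provider. As a minimal sketch (my illustration, not something from the article; the replica setup implied by the comments is an assumption), a client can walk an ordered list of fetchers, each wrapping a different region or replica, and fall back on failure:

```python
def fetch_with_fallback(fetchers):
    """Try each fetcher in order and return the first successful result.

    Each fetcher is a zero-argument callable that might read from a
    different replica, e.g. S3 in us-east-1 first, then a copy kept
    in another region or with another provider.
    """
    last_error = None
    for fetch in fetchers:
        try:
            return fetch()
        except Exception as err:  # in real code, catch the SDK's error type
            last_error = err
    raise RuntimeError("all replicas failed") from last_error
```

The same shape works for DNS failover or locally cached copies; the point is that the application itself decides what "recover from an outage" means instead of waiting for the provider.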
As we move through the year, there are often monthly and quarterly upgrade cycles for our virtual and cloud environments, driven by security issues and by routine upgrades to hardware, software, or the applications themselves. Application updates are now continuous, using continuous integration and deployment strategies, while hardware and other upgrades come more slowly. Cloud upgrades can be incredibly impactful, as all subsystems need to be restarted. Yet there is a cycle to this. There is a need to control what is happening, and a need not to break compliance, security, data protection, or other policies.
When you read blogs and articles on cloud security, writers such as myself often mention jurisdictional issues as a big problem. Nor is jurisdiction the only problem: the ability to audit clouds is another. Both of these are huge issues for clouds today. But fundamentally, is the cloud flawed from a security point of view, or are there plenty of security mechanisms available?
The release of vSphere 4.1 brings some great enhancements. In one of my earlier posts I took a look at the vSphere 4.1 release of ESXi; in this post I am going to look at the vSphere 4.1 availability options and enhancements. So what has changed with this release? A maximum of 320 virtual machines per host has been firmly set, where vSphere 4.0 had different VM-per-host limits for DRS and different rules again for VMware HA. VMware has also raised the number of virtual machines that can run in a single cluster from 1,280 in 4.0 to 3,000 in vSphere 4.1. How do these improvements affect your upgrade planning?
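Those limits feed directly into cluster sizing. As a rough back-of-the-envelope helper (my sketch, not a VMware tool; the N+1 spare-host term is a common rule of thumb, not VMware HA's actual admission-control algorithm):

```python
import math

VMS_PER_HOST = 320      # vSphere 4.1 per-host maximum
VMS_PER_CLUSTER = 3000  # vSphere 4.1 per-cluster maximum (up from 1280 in 4.0)

def hosts_needed(vm_count, ha_spare_hosts=1):
    """Minimum hosts for vm_count VMs in one cluster, keeping
    ha_spare_hosts of failover capacity so HA can restart the VMs
    of a failed host elsewhere."""
    if vm_count > VMS_PER_CLUSTER:
        raise ValueError("exceeds the 3000-VM cluster maximum; split clusters")
    return math.ceil(vm_count / VMS_PER_HOST) + ha_spare_hosts
```

For example, a 1,280-VM workload that filled a 4.0 cluster fits comfortably in one 4.1 cluster of five hosts under this heuristic.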
I was upgrading my nodes from VMware VI3 to VMware vSphere and used VMware Update Manager to perform the update. My existing file systems were laid out to meet the requirements of the DISA STIG for ESX, as well as availability requirements. I was surprised to find that when the upgrade of the first node of my cluster completed, the installer did NOT take my existing file system structure into account, but instead imposed the default file system layout used by a standard VMware vSphere ESX 4 installation.
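After an upgrade like this, it is worth verifying that the separate partitions a STIG-style layout calls for still exist. A quick post-upgrade check (my sketch, not a VMware or DISA tool; the list of required mount points is an assumption — substitute your own layout) can diff the contents of /proc/mounts against the expected set:

```python
# Example STIG-style layout; adjust to your own partitioning scheme.
REQUIRED_MOUNTS = {"/", "/boot", "/home", "/tmp", "/var", "/var/log"}

def missing_mounts(proc_mounts_text, required=REQUIRED_MOUNTS):
    """Return the required mount points absent from /proc/mounts content.

    Each /proc/mounts line is "device mountpoint fstype options ...";
    field two is the mount point.
    """
    mounted = {
        fields[1]
        for line in proc_mounts_text.splitlines()
        if len(fields := line.split()) >= 2
    }
    return required - mounted
```

Running it against /proc/mounts right after the upgrade immediately shows which mandated partitions the installer collapsed back into the default layout.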
Why is this an availability and a security issue?