In “Do Users Have a Negative Perception of Desktop Virtualization?”, James Rankin raised a set of issues that arise whenever a new platform is deployed in an organization. Users tend to blame every user experience problem on the new platform, even problems that existed before it was deployed. In a Citrix or VMware VDI deployment, this takes the form of “Citrix is slow” or “View is slow.”
Managing VDI Performance Overview
The key to managing both expectations and actual results is to measure what is happening before you go to VDI (establish a baseline) and then measure performance (response time and throughput) during and after the deployment, so that you have real data to use in defense of yourself and the VDI project when the inevitable user complaints arise. This means that you need to monitor both the existing environment and the new VDI environment for user experience. The only reasonable metrics with which we can measure user experience are application response time (how long does it take for something to happen?) and throughput (how much work is being done per unit of time?). However, VDI and its predecessor, server-based computing (Citrix XenApp), introduce a unique set of difficulties into the equation, meaning that not just any monitoring solution is going to give you the right data. Those unique challenges are:
- VDI interjects a network in between the application execution environment and the user. Therefore, all other things being equal (and they are not), at the minimum there is one more thing that has to get done in that “send email” transaction than is the case with a local PC. In other words, even if you put a dedicated physical PC in a data center with its own hard disk, there would still be this extra step.
- That network obscures application transactions. If someone presses the “send” button on a VDI client, there is no way to detect that send in the protocol used between the VDI client and the back end data center. This is true for RDP, ICA, HDX, PCoIP, and every other remoting protocol. So it is frankly impossible to get visibility into production transactions from the actual end user device to the VDI back end. Synthetic transactions can help you with load testing, but they are relatively useless for ongoing monitoring of real-time performance.
- VDI also replaces what used to be several dedicated resources with shared ones, because most VDI implementations put the user’s environment into a VM and run many of these VMs on a physical host via a hypervisor. Dedicated CPU becomes shared CPU. Dedicated memory becomes shared memory. Most importantly, dedicated local hard disk becomes shared storage, and that storage generally was not designed to have several thousand Windows PCs boot at the same time.
- VDI does nothing to reduce the end-to-end complexity of the entire application system. A VDI instance with thirty installed applications could easily be communicating with and reliant upon hundreds of back end servers. A thousand VDI instances could, in turn, easily be reliant upon thousands of back end servers.
- As mentioned at the start of this post, the fact that the VDI system is the method by which the application is delivered to end users causes the end users to blame the delivery mechanism any time the application is slow.
- This means that the curse of everyone who supports a production VDI instance (or, for that matter, a Citrix XenApp instance) is that they are guilty until proven innocent when it comes to application performance issues.
The above situation means that if being guilty until proven innocent is a revenue issue or a reputation issue (for you or the entire IT department), then you had better monitor for application performance before, during, and after your VDI project.
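To make the baseline idea concrete, here is a minimal Python sketch of comparing response time and throughput before and after a VDI rollout. It assumes you already have per-transaction timings from some monitoring tool; the `Transaction` record, the `summarize` and `compare` helpers, and the choice of median and 95th percentile are illustrative assumptions, not the output format or method of any product discussed here.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Transaction:
    """One measured application transaction (hypothetical record layout)."""
    start: float     # seconds since the start of the measurement window
    duration: float  # response time in seconds


def summarize(transactions, window_seconds):
    """Return median and 95th-percentile response time plus throughput."""
    durations = sorted(t.duration for t in transactions)
    # Nearest-rank percentile: rank = ceil(0.95 * n), using integer arithmetic.
    rank = max(1, -(-95 * len(durations) // 100))
    return {
        "median_s": median(durations),
        "p95_s": durations[rank - 1],
        "throughput_per_min": len(durations) / (window_seconds / 60),
    }


def compare(baseline, current):
    """Relative change of each metric against the baseline (0.10 means +10%)."""
    return {name: (current[name] - baseline[name]) / baseline[name]
            for name in baseline}


if __name__ == "__main__":
    # Made-up sample data purely for illustration.
    before = [Transaction(i * 30.0, d) for i, d in enumerate([1.0, 2.0, 3.0, 4.0])]
    after = [Transaction(i * 60.0, d) for i, d in enumerate([2.0, 4.0, 6.0, 8.0])]
    print(compare(summarize(before, 120), summarize(after, 240)))
```

A positive change in the response time metrics together with a negative change in throughput is the pattern worth investigating. Having the same summary from before the deployment is what lets you say whether the slowdown is new.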
Approaches to Monitoring VDI Performance
When you set out to understand the performance of your VDI environment, you should start with one key assumption. That assumption is that the data that you want and need (response time and throughput) is not commonly available via a management API or standard protocol like WMI, SNMP, SMI-S, or NetFlow. This means that the data you want must be collected by the monitoring tool that you select, which in turn means that if the monitoring tool does not collect response time and throughput, it is not going to meet your needs. The diagram below demonstrates how four different vendors collect the slices of the data that are critical:
- AppEnsure puts an agent into every Windows and Linux server (physical or virtual) that is part of the entire application system. This gives AppEnsure the unique ability to identify each application by name, map its topology, and measure response time and throughput across the application system.
- Aternity puts an agent into the desktop OS. That means that in a VDI environment, its agent is running in the VDI instance in the data center. This uniquely gives Aternity the ability to see end user application transaction data.
- ExtraHop either puts a virtual appliance on the virtual mirror port of the vSwitch or puts a physical appliance on a physical span or mirror port on the physical switch. Extensive layer 7 protocol decoding and deep packet inspection give ExtraHop as much visibility as you can get from the network data (without requiring agents).
- VMTurbo is unique in its ability to allow you to set the priority of the VMs in your environment and then automatically ensure that the highest priority workloads get the resources that they need to perform well. So if you have 100 very important users, 1,000 somewhat important users, and 5,000 other users, VMTurbo can sort this out for you automatically.
Below we go into a bit of detail on the approaches of each of these vendors.
AppEnsure is focused on measuring the response time and throughput of every application in your environment. It starts by identifying the application by name. It then maps the end-to-end topology of that application and measures the end-to-end and hop-by-hop response time and throughput. When either response time or throughput degrades, AppEnsure provides automated diagnostics that tell you where the problem is. If you are in a situation where you are constantly being blamed for slow application performance because “Citrix is slow,” and you know that the problem is not in Citrix (or VMware View), then AppEnsure is an indispensable tool for finding out where that application slowdown is actually occurring. Since AppEnsure is instrumenting the application system, it is also ideal for establishing the baseline for what the performance of the application system is before a VDI project starts.
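The value of hop-by-hop measurement can be illustrated with a small Python sketch. It is a generic illustration only, not AppEnsure's diagnostics or API: the hop names, the millisecond figures, and the `slowest_regression` helper are all hypothetical.

```python
def slowest_regression(baseline_ms, current_ms):
    """Return (hop, increase_ms) for the hop whose response time grew the most."""
    increases = {hop: current_ms[hop] - baseline_ms[hop] for hop in baseline_ms}
    hop = max(increases, key=increases.get)
    return hop, increases[hop]


if __name__ == "__main__":
    # Made-up per-hop response times in milliseconds.
    baseline = {"vdi_session": 40, "web_tier": 120, "database": 80}
    current = {"vdi_session": 45, "web_tier": 130, "database": 400}
    print(slowest_regression(baseline, current))
```

In this invented example the VDI session hop barely moved while the database hop grew by 320 ms. That is the kind of evidence that shifts the conversation away from “Citrix is slow.”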
Aternity installs an agent on the desktop or laptop OS (in the case of a dedicated PC) or into that same OS when running in a VDI instance. The key feature of this agent is the ability to measure actual transactions within end user applications via a profile for each application (someone has to create this profile for each application for this to work). So if you want to know how long it is taking when a user clicks “Send” in Outlook for the Outlook client to complete the send, Aternity is in the unique position to tell you this for both dedicated Windows desktops/laptops and instances of desktop applications running in VDI environments. Of course, if you install Aternity on a desktop before you virtualize that desktop, then you will have detailed before and after data about transaction performance for the applications for which you have profiles.
ExtraHop is able to see everything that is going on in the network that supports your VDI environment and in the network that supports the applications that are delivered by your VDI environment. ExtraHop has also done some extra work in the area of ICA/HDX layer 7 decoding that allows you to see things that are otherwise quite hard to see, such as how long it is taking for applications to launch and how long it is taking for people to log on. This is the greatest and most granular level of visibility that you can get into a VDI environment without installing agents on VDI instances and back end application servers.
VMTurbo is not a monitor of application performance. It does not collect application response time or throughput data. Rather, it takes the entirely different approach of trying to completely prevent the occurrence of application performance issues due to resource constraints. It does this by allowing you to prioritize your workloads (including all of your VDI instances with respect to each other), figuring out what the constrained resources are in the virtual and physical infrastructure supporting your VDI instances (including your Cisco UCS and NetApp storage resources), and then automatically allocating those scarce resources to their highest and best use. Note that VMTurbo can do this across your VDI instances and your instances of virtualized servers, so that if one application delivered through VDI to one set of users is of critical importance, VMTurbo is in the unique position of being able to ensure service quality through the entire delivery chain for that service.
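The idea of steering scarce resources to the most important workloads first can be sketched in a few lines of Python. This is a deliberately naive greedy allocation written for illustration. It is an assumption about how priority ordering could work in principle, not VMTurbo's actual algorithm, and the `Workload` fields and `allocate` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    priority: int  # lower number = more important (hypothetical convention)
    demand: int    # requested units of one constrained resource, e.g. CPU MHz


def allocate(workloads, capacity):
    """Grant demand in priority order until the shared capacity is exhausted."""
    grants = {}
    remaining = capacity
    for w in sorted(workloads, key=lambda w: w.priority):
        granted = min(w.demand, remaining)
        grants[w.name] = granted
        remaining -= granted
    return grants


if __name__ == "__main__":
    pool = [
        Workload("other_users", priority=3, demand=500),
        Workload("very_important_users", priority=1, demand=300),
        Workload("somewhat_important_users", priority=2, demand=400),
    ]
    print(allocate(pool, capacity=800))
```

With 800 units available, the two higher-priority groups receive their full demand and the lowest-priority group gets whatever is left. A real system would have to balance many resources at once and react continuously, which this sketch does not attempt.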
VDI performance management is a distinct problem that calls for tools built to address it. AppEnsure, Aternity, ExtraHop, and VMTurbo each take a different and valuable angle towards helping enterprises with business-critical VDI deployments ensure an acceptable end user experience.