Storage Networking – Time to TAP the SAN

Ever since enterprise-class Ethernet switches became available, it has been easy to define one of the ports on a switch as a “mirror port”. A mirror port is tremendously useful because it receives a read-only copy of all of the traffic that passes through the switch. This lets a wide variety of management tools, from low-level network tools to packet analyzers to application performance management tools, collect their data without sitting directly in the data path. The approach has become even more prominent in the virtualized world: because VMware discourages agents in guests, the preferred method is to collect performance data through a virtual appliance attached to the virtual mirror port on the vSwitch (or the Cisco Nexus 1000V).

Until now, it has been difficult and expensive to do the same thing for your Fibre Channel SAN. Before delving into why this has been hard, let’s discuss why it is important. Just as the TCP/IP network sits between your users and your servers, the SAN sits between your servers and your storage arrays. SAN port congestion and delays caused by LUN contention show up as additional latency in the Fibre Channel frame data. Access to the low-level Fibre Channel frame data is the best way to find port congestion on the SAN switch, and it is also the most convenient way to find LUN contention in the storage array (more convenient than instrumenting the arrays themselves). Real-time, comprehensive and deterministic information about what is going on in the SAN therefore shows you how congestion in the SAN and the storage array is affecting physical server, virtual server and application performance. In addition, because most SANs are massively over-provisioned, knowing how busy all of the ports really are can lead to decisions that save a lot of money on future SAN port purchases.
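
To make the over-provisioning point concrete, here is a minimal sketch of how per-port utilization could be derived once byte counts per port are available. The port names, byte counts, the 800 MB/s link capacity and the 10% threshold are all illustrative assumptions; nothing here is taken from a Virtual Instruments product or API.

```python
from dataclasses import dataclass


@dataclass
class PortSample:
    """One hypothetical measurement for a single SAN port."""
    port: str
    bytes_transferred: int         # bytes observed during the interval
    interval_seconds: float        # length of the measurement interval
    link_capacity_mb_per_s: float  # assumed usable link throughput in MB/s


def utilization(sample: PortSample) -> float:
    """Return average utilization as a fraction of the assumed link capacity."""
    observed_mb_per_s = sample.bytes_transferred / sample.interval_seconds / 1_000_000
    return observed_mb_per_s / sample.link_capacity_mb_per_s


def underused_ports(samples, threshold=0.10):
    """List ports whose average utilization is below the (assumed) threshold."""
    return [s.port for s in samples if utilization(s) < threshold]


# Hypothetical one-minute samples on links assumed to carry 800 MB/s.
samples = [
    PortSample("switch1/port3", 24_000_000_000, 60.0, 800.0),  # 400 MB/s observed
    PortSample("switch1/port4", 1_200_000_000, 60.0, 800.0),   # 20 MB/s observed
]

for s in samples:
    print(f"{s.port}: {utilization(s):.1%}")
print("Candidates for consolidation:", underused_ports(samples))
```

An average like this only supports capacity planning. It says nothing about short bursts, which is why the frame-level, real-time data discussed here matters for troubleshooting.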

Creating mirror or spanned ports for the SAN has been difficult because it is most often done after the fact. That means either waiting for a maintenance window (which is not practical when there is a performance fire) or unplugging SAN cables while the system is up and running (which redundancy should handle, but which does not always work the way it is supposed to).

To make it as easy to collect data on the SAN as it is to collect TCP/IP data from IP switches, Virtual Instruments has announced its SANInsight 10G Fibre Traffic Access Point (TAP) Patch Panel System. This TAP ensures that performance monitoring and management tools (like Virtual Instruments VirtualWisdom) can always get the data they need to provide the best possible visibility into the SAN and into how the SAN is affecting the performance of the rest of the infrastructure and the applications. The idea behind the new product is to make it easy to TAP every port on the SAN ahead of time, so that when issues arise the mechanism already exists to plug in the right diagnostic tool immediately. Having a TAP on every port also greatly facilitates the operation of the Virtual Instruments NetWisdom product, as it allows that product to see all of the data on every port all of the time.

One Cassette in the SANInsight Front Panel

The collapsed, shared, and dynamic nature of virtual infrastructures places a premium on collecting performance data about the infrastructure, and the applications running on it, in new ways. Specifically, it is critical that performance data be collected in the following manner:

  • Real Time – virtual infrastructures are subject to rapid changes in load patterns because they are shared by many applications and because the virtualization platform itself performs dynamic operations to balance load. This places a premium on collecting information in real time. Sampling at 15- or 30-minute intervals will simply miss too many problems.
  • Deterministic – many performance metrics are approximations of what actually occurred. This is particularly true in virtualized environments, where time shifting can warp metrics collected within guests that are rates over time (including CPU utilization, network I/O rates and disk I/O rates). Useful data must be collected deterministically and must not be a statistical approximation of the truth.
  • Comprehensive – collecting one bit of data in real time is useful. Collecting all of the data all of the time is more useful, and is in fact required for virtual infrastructures running business-critical and performance-critical applications.
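
The real-time point can be illustrated with a small sketch. The latency values below are invented for illustration (a steady 2 ms baseline with one 30-second spike to 50 ms) and the helper name is hypothetical. The sketch simply compares what per-second data shows with what a single 15-minute average would report.

```python
def window_averages(values, window):
    """Average consecutive, non-overlapping windows of `window` samples."""
    averages = []
    for start in range(0, len(values), window):
        chunk = values[start:start + window]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Hypothetical per-second latency in milliseconds over 15 minutes (900 s):
# a steady 2 ms baseline with a 30-second spike to 50 ms.
latency_ms = [2.0] * 900
for second in range(400, 430):
    latency_ms[second] = 50.0

per_second_peak = max(latency_ms)
fifteen_minute_average = window_averages(latency_ms, 900)[0]

print("Peak seen with per-second data:", per_second_peak)  # 50.0
print("Single 15-minute average:", round(fifteen_minute_average, 2))  # 3.6
```

Under these assumed numbers, a 25-fold latency spike is flattened into an average of 3.6 ms that looks only slightly elevated, which is the kind of problem coarse sampling misses.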

With this new element in the physical infrastructure, Virtual Instruments has made it possible for enterprises to plan ahead and ensure that real-time, deterministic and comprehensive data about their SAN and its impacts is always available to them.
