As I’ve thought about how to implement high-performance, very large-scale networks within a secure hybrid cloud, I have come to the conclusion that the cloud works best with disaggregated network functions. This is the goal of network function virtualization, or NFV, but the real problem is knowing which functions to virtualize and how to do so at very large scale. We need to consider the multiple paths our data will take and the rates at which data can pass through the various virtual components that make up the hybrid cloud. When we think hybrid cloud, we need to think scale out, not up. Scaling up can cost lots of money, while scaling out may save dollars. This means we need to rethink networking and security as well as protection. With containers in mind, we have a path for our journey.
Where do we start? Let’s list some of the seemingly necessary components of our network stack:
- web application firewalls
- intrusion detection systems
- behavioral detection systems
Wait, let us stop here. The list is endless, but we are now talking about active and passive control systems. Each of these works differently. On the one hand, we have items that must be in the data path. On the other hand, we have items that should not be in the data path. At high scale, anything in the data path slows down transactions and processing, so having detection systems in the data path just does not make sense. We only want prevention systems, the elements we truly need, in the data path. Yet both have to work on the same set of data: network packets, flows, etc. As such, we need a way to split our traffic and then treat each stream differently, as we see in the following graphic:
In the above graphic, we need a splitter to mirror traffic between two network function paths. The active path needs to be as fast as possible and is in the data path, while the passive path feeds back into the application delivery and perhaps into the active protections. However, to handle hundreds of millions of packets, we need a splitter that also scales up or out, such as a Gigamon device, port mirroring, or some form of SPAN port for the entire switch or router. I also wonder if network anycast would help here at all. I am unsure.
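To make the split concrete, here is a minimal in-process sketch in Python. It is only an illustration of the idea, not how a Gigamon device or SPAN port works internally: the names `make_splitter`, `passive_worker`, and `blocklist` are hypothetical, the first four bytes of a packet stand in for a source address, and a substring check stands in for real behavioral analysis. The active path runs inline and returns a verdict; the passive path receives a copy through a bounded queue and feeds its findings back into the active protections.

```python
import queue
import threading

def make_splitter(active_chain, passive_queue):
    """Return a function that sends each packet down both paths."""
    def split(packet):
        # Active path: in the data path, so it runs inline and yields the verdict.
        verdict = active_chain(packet)
        # Passive path: hand over a copy without waiting. If the backlog is
        # full, the copy is dropped so the data path never stalls.
        try:
            passive_queue.put_nowait(bytes(packet))
        except queue.Full:
            pass
        return verdict
    return split

def passive_worker(passive_queue, blocklist, stop):
    """Analysis off the data path; feeds findings back through the blocklist."""
    while not stop.is_set():
        try:
            packet = passive_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        if b"attack" in packet:        # placeholder for real detection logic
            blocklist.add(packet[:4])  # first 4 bytes stand in for the source
        passive_queue.task_done()

# Hypothetical wiring: the active chain consults what the passive chain learned.
blocklist = set()

def active_chain(packet):
    return "drop" if packet[:4] in blocklist else "forward"

passive_queue = queue.Queue(maxsize=1024)
stop = threading.Event()
worker = threading.Thread(
    target=passive_worker, args=(passive_queue, blocklist, stop), daemon=True
)
worker.start()

split = make_splitter(active_chain, passive_queue)
first = split(b"\x0a\x00\x00\x01 attack payload")   # forwarded: nothing known yet
passive_queue.join()                                # wait for the passive analysis
second = split(b"\x0a\x00\x00\x01 normal payload")  # dropped: source now blocked
stop.set()
```

The design choice worth noting is the bounded queue with `put_nowait`: when the passive side falls behind, it loses samples rather than slowing the active side, which matches the requirement that only prevention sits in the data path.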
What can be in each of these stages, and how does the feedback from detection work? I’ve made a short list of elements and requirements for each path:
Active Path or Service Chain
- must contain reply from application (bidirectional traffic)
- detect user, device, source application
- apply device, user, source application rules
- cloud-aware service brokers
- management-aware service brokers
- encryption (VPN) termination
- SDN terminator
- traditional LB, FW, IPS, WAF
Passive Path or Service Chain
- reply not necessary (unidirectional traffic)
- device, user, source application behavioral analysis
- deep packet inspection
- operational data gathering (ingress)
- incident-response data gathering
- deception-based analysis
- other long-running analyses
- traditional IDS
The goal of each service chain is distinctly different. The active chain is about performance. As such, it is best to scale out each action using something like anycast or load balancers to handle the load. We could even break down each function further, such as by having one set of services that just detects the user, another for the device, etc., in effect building up a set of reusable services that together form a traditional network device. This could be the goal of a future-generation firewall: to be built out of myriad microservices.
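As a rough sketch of what a firewall built from single-purpose services might look like, the Python below composes three small stages into one active chain. Everything here is an assumption for illustration: the `Request` type, the header names `x-user` and `user-agent`, and the rule that anonymous users are dropped are all hypothetical. In a real deployment each stage would be its own network service, scaled out behind anycast or a load balancer, rather than a function call in one process.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Request:
    headers: dict
    context: dict = field(default_factory=dict)  # facts gathered by earlier stages
    verdict: str = "forward"

Stage = Callable[[Request], Request]

def detect_user(req: Request) -> Request:
    # One service, one job: identify the user.
    req.context["user"] = req.headers.get("x-user", "anonymous")
    return req

def detect_device(req: Request) -> Request:
    # A separate service identifies the device.
    req.context["device"] = req.headers.get("user-agent", "unknown")
    return req

def apply_rules(req: Request) -> Request:
    # Hypothetical rule: unidentified users are not allowed through.
    if req.context.get("user") == "anonymous":
        req.verdict = "drop"
    return req

def build_chain(*stages: Stage) -> Stage:
    """Compose reusable stages into one service chain."""
    def chain(req: Request) -> Request:
        for stage in stages:
            req = stage(req)
            if req.verdict == "drop":
                break  # stop early once a stage has decided to drop
        return req
    return chain

firewall = build_chain(detect_user, detect_device, apply_rules)
allowed = firewall(Request(headers={"x-user": "alice", "user-agent": "laptop"}))
denied = firewall(Request(headers={"user-agent": "laptop"}))
```

Because each stage only reads and writes the shared `context`, the same `detect_user` or `detect_device` service could be reused in other chains, which is the reusability the paragraph above describes.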
Performance versus long-running analysis is the reason for this split, as we discussed in the USENIX LISA16 workshop I moderated. We just cannot scale with current constraints without using hardware that scales up; we need to change this when we are within cloud environments, where scaling up is impossible (because the hardware will not be bought or is not cost-effective). Scaling out becomes possible, but only if each service does one thing and does it well. The whole is formed by the many.
Each microservice could then be part of the application instead of a separate network service. They become application services. This is one way in which security could be embedded in an application with little fuss. We need to start somewhere, and disaggregated networking is a starting point.
However we achieve our goal within a hybrid cloud, scale is going to be an issue. Think of these numbers: if you do 44 billion queries a day, you have 30,000 new sessions per second or so. Most software-based (non-ASIC) firewalls can only handle 1,000 to 2,000 new sessions per second of encrypted traffic. This implies you need fifteen to thirty times the capability of a single such firewall. You can either go the hardware route or start to disaggregate your network functions. This is the goal of NFV within the cloud.
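The figures above can be checked with a few lines of Python. This is back-of-the-envelope arithmetic only: it assumes load is spread evenly across the day with no peaks, and it takes the 30,000 new sessions per second as given. The queries-per-session ratio is derived here from the two stated numbers; it is not a figure from the discussion itself.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60

queries_per_day = 44_000_000_000
new_sessions_per_second = 30_000   # figure taken as given
firewall_capacity = (1_000, 2_000) # new encrypted sessions/s per software firewall

# Average query rate, assuming perfectly even load over the day.
queries_per_second = queries_per_day / SECONDS_PER_DAY         # about 509,259

# Derived, not stated in the text: how many queries each new session carries.
queries_per_session = queries_per_second / new_sessions_per_second  # about 17

# Software firewall instances needed at the low and high capacity figures.
instances_needed = [
    math.ceil(new_sessions_per_second / capacity) for capacity in firewall_capacity
]
print(instances_needed)  # [30, 15]
```

Real traffic is bursty, so a scaled-out deployment would need headroom beyond these averages.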