Subnet sizing and heterogeneous subnets

Across my different jobs, one of the things that led to L2 extension, and with it brought risk and slowed down or even blocked projects, was big heterogeneous subnets.

Often the initial rationale for big heterogeneous subnets was to avoid having too many VLANs on the physical infrastructure. Historically, physical infrastructure had VLAN and spanning-tree limits. The problem is that when migration time comes, nobody wants to change IP addresses. Even after arguing, it is the network that ends up bearing the risk of breaking the entire company by stretching L2. Needless to say, the layer 2 extension stays forever, because it takes ages to migrate everything. After two years, there is no budget left for the dozen or so remaining devices, and everybody moves on, leaving the network as a battlefield. Some people even argue that with microsegmentation you only need one VLAN…

In this post I will list the pros and cons of small, big and heterogeneous subnets.

Small subnet

I like to think of using small subnets as a “scale out” approach. When the subnet is full you just assign a new subnet in the same reserved space.
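As a rough sketch of what "assign a new subnet in the same reserved space" can look like, the snippet below uses Python's standard `ipaddress` module. The reserved /22, the /27 subnet size, the existing allocations and the helper name `next_free_subnet` are all assumptions made up for the illustration.

```python
import ipaddress

# Assumed values: a /22 reserved for one application domain, handed out as /27s.
reserved = ipaddress.ip_network("10.20.0.0/22")
allocated = [
    ipaddress.ip_network("10.20.0.0/27"),
    ipaddress.ip_network("10.20.0.32/27"),
]

def next_free_subnet(reserved, allocated, new_prefix=27):
    """Return the first subnet of the requested size inside the reserved
    block that does not overlap an existing allocation, or None if full."""
    for candidate in reserved.subnets(new_prefix=new_prefix):
        if not any(candidate.overlaps(used) for used in allocated):
            return candidate
    return None

print(next_free_subnet(reserved, allocated))  # 10.20.0.64/27
```

Because every new subnet comes from the same reserved block, the block can still be advertised as a single summary route.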

Pros

  • Agility. You can do swimlane application design. When migration time comes, you can migrate application by application.
  • For application migration, if you can’t change the IP addresses of the servers, you can reroute the traffic without jeopardizing the entire company with L2 extension.

Cons

  • More entries in the routing table, but that can be mitigated with summarization.
  • Loss of IP addresses. Does it really matter? How full are the big subnets?
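To put rough numbers on both cons, here is a minimal sketch using Python's `ipaddress` module. It assumes a /22 split into /27 subnets and one gateway address per subnet; these values and the helper `usable_hosts` are illustrative, not taken from a real addressing plan.

```python
import ipaddress

# Assumed example: a /22 reserved for one zone, handed out as /27 subnets.
reserved = ipaddress.ip_network("10.20.0.0/22")
small_subnets = list(reserved.subnets(new_prefix=27))

# Routing table: contiguous subnets collapse back into one summary route.
summary = list(ipaddress.collapse_addresses(small_subnets))
print(len(small_subnets), "subnets ->", [str(n) for n in summary])

def usable_hosts(net, gateways=1):
    """Addresses left for hosts after network, broadcast and gateway(s)."""
    return net.num_addresses - 2 - gateways

small_total = sum(usable_hosts(n) for n in small_subnets)
big_total = usable_hosts(reserved)
print("usable with /27s:   ", small_total)
print("usable with one /22:", big_total)
print("addresses lost:     ", big_total - small_total)
```

With these assumed numbers, the 32 small subnets still summarize to a single /22, and the split costs 93 of the 1,024 addresses, about 9%. That only matters if the big subnet would ever have been filled that far.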

Big subnet

For me, this is more of a “scale up” approach. You oversize the subnets; then, when you need a new server, you pick an IP in the big subnet without asking any questions.

Pros

  • IP address savings. Is it worth it? Most of the time the subnets are big only to anticipate growth, and those IPs are lost anyway.
  • Simple address allocation. There are only a couple of big subnets.

Cons

  • If the subnet is not full, it’s very difficult to carve out the unused space and use it somewhere else.
  • The L2 broadcast domain is large. ARP suppression on modern fabrics mitigates this, but broadcasts still reach every VM, so you can run into trouble if storm control is not set properly.
  • For all the firewalls in the enterprise that do not rely on tags, you might have to open firewall rules per host instead of per subnet.
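The first point, about carving out unused space, can be illustrated with a small sketch. It assumes a hypothetical /22 with only four hosts whose addresses happen to be scattered across it; the addresses and the helper `free_blocks` are invented for the example.

```python
import ipaddress

# Assumed example: a /22 that is almost empty, with hosts spread across it.
big = ipaddress.ip_network("10.30.0.0/22")
used = [
    ipaddress.ip_address(a)
    for a in ("10.30.0.10", "10.30.1.200", "10.30.2.17", "10.30.3.250")
]

def free_blocks(big, used, new_prefix):
    """List the aligned sub-blocks of the given size containing no used IP."""
    return [
        block
        for block in big.subnets(new_prefix=new_prefix)
        if not any(ip in block for ip in used)
    ]

for prefix in (24, 25, 26):
    print(f"/{prefix}: {len(free_blocks(big, used, prefix))} free blocks")
```

In this made-up case more than a thousand addresses are unused, yet not a single /24 is free. And even a free block can only be reused elsewhere once the hosts and the gateway of the big subnet stop treating it as part of their own subnet, which usually means touching their configuration.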

Heterogeneous subnet

It’s not rare to see big subnets that are also heterogeneous, with a mix of applications and workload types (physical/virtual).

Pros

  • It allows physical devices like a NAS or a load balancer to be in the same subnet as your application workloads (if you want to avoid putting a firewall in between and can’t use a VRF mechanism).

Cons

  • Application mutualization (several applications sharing one subnet). It can block application migrations because not all applications have the same requirements. If one of the applications has to migrate somewhere else and nobody wants to change IP addresses, you might be forced to stretch L2.
  • Mix of workload types (physical/virtual). It can block application migrations because the physical devices can’t go where the VMs are going. If nobody wants to change IP addresses, you might be forced to stretch L2.

Other considerations

  • Pets vs cattle: VM vs container
    In the use cases above, I’m essentially talking about workloads that are not containers. For containers the paradigm is often different, because most of the time containers don’t share a subnet with physical devices and application developers don’t rely on container IP addresses. What matters is the FQDN of the service. VMs are still considered pets most of the time, and nobody wants to change their IP addresses, mostly to avoid changing firewall rules or IP addresses hard-coded in applications.

  • VLAN scale
    The number of supported VLANs is less relevant in virtualized environments, because most of the time software scales better and you are not limited to 4K VLANs (or even 2K in some cases, for ACI bridge domains and EVPN Layer 2 VNIs).

  • Suboptimal IP addressing plan
    When you move an entire subnet somewhere else, you can break the company’s summarization, but you can always summarize again after all the applications have moved, or even re-IP the entire subnet afterwards. It’s less risky and less complex than having L2 stretched “forever”.

  • Distributed routing
    Now, with distributed routing, you no longer have to go to the network core, where the default gateway lived in the old days, to route between subnets. Traffic is better distributed across the fabric, and you should have less risk of congesting uplinks with east/west traffic.
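
To make the “Suboptimal IP addressing plan” point concrete, here is a small sketch of what moving one subnet does to summarization. The site prefixes are assumptions chosen for the example: site A holds four /24s that summarize to a /22, and one of them follows its application to site B.

```python
import ipaddress

# Assumed example: site A originally holds four contiguous /24 subnets.
site_a = [ipaddress.ip_network(f"10.20.{i}.0/24") for i in range(4)]
print("before:", [str(n) for n in ipaddress.collapse_addresses(site_a)])

# One subnet moves with its application to site B, keeping its addresses.
moved = ipaddress.ip_network("10.20.2.0/24")
remaining = [n for n in site_a if n != moved]

after = list(ipaddress.collapse_addresses(remaining))
print("after, site A:", [str(n) for n in after])
print("after, site B:", [str(moved)])
```

Under these assumptions the single /22 becomes three routes (a /23 and a /24 at site A, plus the moved /24 at site B). That is a few extra routing entries, which can be cleaned up later, rather than an L2 domain stretched between sites.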