
Terraform Design Considerations for Cisco ACI - Part 2

Estimated time to read: 17 minutes

  • Originally Written: August 2023

This is the second post of a four-part series.

Topics Covered

  • Recap
  • Design Consideration: ACI Network Connectivity Options
    • Notes on the test setup
    • ACI Network Connectivity Option: VMM Integration
    • ACI Network Connectivity Option: Individual Static Port Bindings
    • ACI Network Connectivity Option: Bulk Static Port Bindings
    • ACI Network Connectivity Option: Bind to a Group/Subset of Ports Across One or More Switches
    • ACI Network Connectivity Option: Bind to All Ports in a Single Switch
    • ACI Network Connectivity Options: Summary
  • Additional References

Recap

From the last post:

  • Many customers are starting to use Infrastructure as Code (IaC) tools such as Terraform for various reasons such as scalability, consistency, version control, collaboration, and efficiency.
  • A key component of how Terraform works is the state file
  • There is no right or wrong design, only tradeoffs and considerations
  • Terraform configuration and state can be separated into different folders or Git repositories
  • Some of the reasons for organizing configuration and state files across different folders are:
    • Fault domain size and deployment time: As the state file and configuration grow, so too does the time to run the plan and the potential impact if something goes wrong
    • Readability and troubleshooting: If the configuration is too large it can be troublesome to read and understand (which is also why this content is split into multiple posts)
    • Static vs dynamic environments: It can be challenging when multiple teams or people are working on a single configuration
    • Roles and responsibilities i.e. RBAC: Different people may have different levels of access to the fabric

Design Consideration: ACI Network Connectivity Options

ACI provides various network connectivity options, and it's helpful to understand how the connectivity choice may influence a Terraform design; in particular, the Terraform plan and apply times for each option.

This post gives you an overview of each option along with the number of required Terraform resources, example calculations, and run times. This information is a useful data point when determining a folder structure such as those shown in the example designs in part 3.

Factors Contributing to Performance Issues with Large Terraform Plans

There are a couple of factors that contribute to long run times with large Terraform plans.

  • If you are making individual API calls to provision resources, the more calls, the longer the run time. If you see this issue with large amounts of static port bindings, have a look at the aci_bulk_epg_to_static_path Terraform resource, which can help reduce the number of API calls.


  • As outlined in the GitHub issues below, certain uses of for_each in Terraform can result in longer run times as "Terraform needs to lookup resource as complete objects containing all instances in order to be able to index them in expressions." A minimal sketch of this pattern follows the links below.

Performance issues when referencing high cardinality resources

Terraform with large set of resources take very long time to run
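
To illustrate, here is a minimal, hypothetical sketch of that pattern (the tenant and object names are made up). Once aci_rest_managed.epg has thousands of instances, any expression that indexes into it, such as the for_each and dn references below, requires Terraform to evaluate the complete object containing all instances.

```hcl
# Hypothetical illustration only: 1900 EPG instances created via for_each.
resource "aci_rest_managed" "epg" {
  for_each   = toset([for v in range(100, 2000) : tostring(v)])
  dn         = "uni/tn-demo/ap-demo/epg-vlan${each.key}"
  class_name = "fvAEPg"
  content    = { name = "vlan${each.key}" }
}

# Referencing the high-cardinality resource: Terraform must load all
# instances of aci_rest_managed.epg to resolve these expressions.
resource "aci_rest_managed" "epg_to_bd" {
  for_each   = aci_rest_managed.epg
  dn         = "${each.value.dn}/rsbd"
  class_name = "fvRsBd"
  content    = { tnFvBDName = "vlan${each.key}" }
}
```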

Notes On The Test Setup

  • Tested against APIC Simulator
  • Terraform commands ran on a Linux CentOS VM with the following specs:
    • Intel(R) Xeon(R) CPU E7-2830 @ 2.13GHz
    • 1 x vCPU, 1 x core
    • 2GB memory
  • Using default Terraform configuration i.e. -parallelism=10
  • Minimal configuration is used in each test simply to provide sample run times. These times may differ from a real world environment.
  • The aci_rest_managed resource (i.e. raw API call) was used to perform all tests with the exception of the Bulk Static Port test which used Terraform ACI resources including the aci_bulk_epg_to_static_path resource.
  • The exact quantity of resources may differ from your environment depending on the configuration, e.g. using ACI resources such as aci_bridge_domain vs the generic aci_rest_managed resource to implement a bridge domain and BD to VRF mapping
  • For each test, after the initial configuration is applied, a new BD and EPG are added to show the time taken for a refresh and a small change

VMM Integration

When using VMM integration the ACI fabric can dynamically discover where hosts and applicable VMs are connected. The benefit to the network admin is that ACI will automatically create the port groups/VLANs/configuration on the DVS to which the VMs connect. VMM integration can also help to reduce the amount of Terraform configuration required, as the example below shows.

In a design with 1 EPG = 1 BD = 1 VLAN, the following Terraform resources are required to configure a VMM integration when using the generic aci_rest_managed resource:

  • 1 x bridge domain per VLAN
  • 1 x BD to VRF mapping
  • 1 x subnet (if using ACI as default GW)
  • 1 x EPG per VLAN
  • 1 x EPG to BD mapping
  • 1 x EPG to VMM domain mapping
  • 1 x EPG VMM security policy

Info

This is not including the VLAN pools or interface/switch policies and profiles resources which would also need to be created.

The general calculation is as follows:

total_resources = (7 * num_of_vlans)

For example, 4 VLANs (with BD default GW) would require 28 resources and be calculated as follows.

  • 4 VLANs x 7 resources from the list above

A larger configuration such as 20 VLANs would require 140 resources.

  • 20 VLANs x 7 resources from the list above
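
As a rough sketch (not the exact configuration used in the tests), the per-VLAN resources above could be expressed with aci_rest_managed as follows. The tenant, VRF, application profile, and VMM domain names are all hypothetical.

```hcl
variable "vlans" {
  type    = set(string)
  default = ["100", "101", "102", "103"]
}

# 1 x bridge domain per VLAN (class fvBD)
resource "aci_rest_managed" "bd" {
  for_each   = var.vlans
  dn         = "uni/tn-demo/BD-vlan${each.key}"
  class_name = "fvBD"
  content    = { name = "vlan${each.key}" }
}

# 1 x BD to VRF mapping per VLAN (class fvRsCtx)
resource "aci_rest_managed" "bd_to_vrf" {
  for_each   = var.vlans
  dn         = "${aci_rest_managed.bd[each.key].dn}/rsctx"
  class_name = "fvRsCtx"
  content    = { tnFvCtxName = "demo-vrf" }
}

# 1 x EPG per VLAN (class fvAEPg)
resource "aci_rest_managed" "epg" {
  for_each   = var.vlans
  dn         = "uni/tn-demo/ap-demo/epg-vlan${each.key}"
  class_name = "fvAEPg"
  content    = { name = "vlan${each.key}" }
}

# 1 x EPG to BD mapping per VLAN (class fvRsBd)
resource "aci_rest_managed" "epg_to_bd" {
  for_each   = var.vlans
  dn         = "${aci_rest_managed.epg[each.key].dn}/rsbd"
  class_name = "fvRsBd"
  content    = { tnFvBDName = "vlan${each.key}" }
}

# 1 x EPG to VMM domain mapping per VLAN (class fvRsDomAtt)
resource "aci_rest_managed" "epg_to_vmm" {
  for_each   = var.vlans
  dn         = "${aci_rest_managed.epg[each.key].dn}/rsdomAtt-[uni/vmmp-VMware/dom-demo-vmm]"
  class_name = "fvRsDomAtt"
  content    = { resImedcy = "immediate" }
}

# The subnet (fvSubnet) and VMM security policy (vmmSecP) objects follow
# the same per-VLAN pattern, giving the 7 resources per VLAN listed above.
```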

Info

Compared to the static binding option below, this is a significant decrease in the required Terraform configuration, but it may not be suitable for every environment (e.g. bare-metal environments).


Individual Static Port Bindings

Static port bindings may still be required and within ACI there are a few configuration options. The first is individually binding a static port to an EPG.

Within the Tenant tab -> App Profile -> EPG -> Static ports

This configures the EPG/VLAN on the specified port and switch. It provides flexibility and granular control but can become difficult to scale. See further down in this post for example refresh/apply times when working with large numbers of VLANs and ports.

In a design with 1 EPG = 1 BD = 1 VLAN, the following Terraform resources are required to configure a static port binding for a VLAN when using the generic aci_rest_managed resource:

  • 1 x bridge domain per VLAN
  • 1 x BD to VRF mapping
  • 1 x subnet (if using ACI as default GW)
  • 1 x EPG per VLAN
  • 1 x EPG to BD mapping
  • 1 x EPG physical domain mapping
  • 1 x EPG to static port mapping (per port, per VLAN)

Info

This is not including the VLAN pools or interface/switch policies and profiles resources which would also need to be created.

The general calculation is as follows:

total_resources = (6 * num_of_vlans) + (num_of_vlans * num_of_ports * num_of_switches)

For example, 4 VLANs (with BD default GW) on 2 ports of all 3 switches would require 48 resources.

  • 24 BD and EPG resources (4 BDs + 4 BD to VRF mappings + 4 subnets + 4 EPGs + 4 EPG to BD mappings + 4 EPG to domain mappings)
  • 24 static ports (4 VLANs x 3 switches x 2 ports each)
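
A hypothetical sketch of how those 24 static port bindings might be generated is shown below (node IDs, ports, and names are made up). The setproduct call produces one fvRsPathAtt object per VLAN/switch/port combination, matching the (num_of_vlans * num_of_ports * num_of_switches) term in the calculation.

```hcl
locals {
  vlans    = ["100", "101", "102", "103"]
  switches = ["101", "102", "103"]
  ports    = ["eth1/1", "eth1/2"]

  # 4 VLANs x 3 switches x 2 ports = 24 bindings
  bindings = {
    for c in setproduct(local.vlans, local.switches, local.ports) :
    "${c[0]}-${c[1]}-${c[2]}" => { vlan = c[0], node = c[1], port = c[2] }
  }
}

# 1 x EPG to static port mapping per port, per VLAN (class fvRsPathAtt)
resource "aci_rest_managed" "static_path" {
  for_each   = local.bindings
  dn         = "uni/tn-demo/ap-demo/epg-vlan${each.value.vlan}/rspathAtt-[topology/pod-1/paths-${each.value.node}/pathep-[${each.value.port}]]"
  class_name = "fvRsPathAtt"
  content    = { encap = "vlan-${each.value.vlan}" }
}
```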

Info

The static binding configuration can grow quite large as the number of VLANs and associated switches/ports increases. This also affects the overall Terraform apply and refresh times.

A larger configuration such as 20 VLANs on 48 ports on all 6 switches would require 5880 resources.

  • 120 BD and EPG resources (20 BDs + 20 BD to VRF mappings + 20 subnets + 20 EPGs + 20 EPG to BD mappings + 20 EPG to domain mappings)
  • 5760 static ports (20 VLANs x 48 ports x 6 switches)



Bulk Static Port Bindings

Within the Tenant tab -> App Profile -> EPG -> Static ports

This applies the same configuration as the previous example but uses the aci_bulk_epg_to_static_path resource from the ACI Terraform provider. This allows you to configure multiple static paths in a single EPG resource and reduces the number of API calls to the ACI fabric. In the calculation using individual static port bindings (previous example), one resource (EPG to static port mapping) was required per port, per VLAN. The Terraform plan therefore grew larger as more ports and VLANs were added.

The aci_bulk_epg_to_static_path implementation in the Terraform provider bundles all related EPG to static port mappings into an EPG resource and makes one API call. This can greatly reduce the Terraform apply and refresh times.
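
A minimal, hypothetical sketch of the bulk resource is shown below (the EPG DN, pod, node, and port values are made up). All static paths for one EPG/VLAN are declared inside a single resource, resulting in one API call rather than one per path.

```hcl
locals {
  # 3 switches x 2 ports = 6 paths for this EPG/VLAN
  vlan100_paths = flatten([
    for node in ["101", "102", "103"] : [
      for port in ["eth1/1", "eth1/2"] :
      "topology/pod-1/paths-${node}/pathep-[${port}]"
    ]
  ])
}

resource "aci_bulk_epg_to_static_path" "vlan100" {
  application_epg_dn = "uni/tn-demo/ap-demo/epg-vlan100"

  # One static_path block per interface, all bundled into a single API call
  dynamic "static_path" {
    for_each = toset(local.vlan100_paths)
    content {
      interface_dn = static_path.value
      encap        = "vlan-100"
    }
  }
}
```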

In a design with 1 EPG = 1 BD = 1 VLAN, the following Terraform resources are required to configure a bulk static port binding for a VLAN when using the aci_bulk_epg_to_static_path resource. Note that this test used native ACI resources (see the test setup notes), so the BD to VRF and EPG to BD mappings are attributes of the aci_bridge_domain and aci_application_epg resources rather than separate resources:

  • 1 x bridge domain per VLAN
  • 1 x subnet (if using ACI as default GW)
  • 1 x EPG per VLAN
  • 1 x EPG physical domain mapping
  • 1 x EPG bulk static port mapping (covering all ports and switches associated with this EPG/VLAN)

Info

This is not including the VLAN pools or interface/switch policies and profiles resources which would also need to be created.

The general calculation is as follows:

total_resources = (5 * num_of_vlans)

For example, 4 VLANs (with BD default GW) on 2 ports of all 3 switches would require 20 resources.

  • 4 VLANs x 5 resources from the list above

A larger configuration such as 20 VLANs on 48 ports on all 6 switches would require 100 resources.

  • 20 VLANs x 5 resources from the list above

Note

This option allows large quantities of static port bindings to be created while keeping the Terraform apply time to a minimum.

Bind To A Group/Subset Of Ports Across One Or More Switches

Within the Fabric tab -> Access Policies -> Policies -> Global -> AAEP -> Application EPGs (a table on the AAEP page)

With this method you do not specify a switch or a port. This configures the EPG/VLAN on all ports and switches to which the AAEP is attached. It provides an easier way to configure many EPGs/VLANs on a group of ports that are part of the same AAEP.

One disadvantage of using this method (when configured in the APIC UI) is that it's hard to tell what has been configured. A physical domain is required on the tenant EPG page, but the user must then navigate to a different page (Fabric -> Access Policies) to confirm the EPG is associated to the AAEP. Even then, you need to trace the AAEP to the interface/switch profiles to determine which ports receive the configuration.

However, when using Terraform or Nexus as Code you can construct the folder and file layout to best suit your needs. A folder can therefore hold configuration that is found across different pages in the APIC UI (e.g. tenant and access policies in a single Terraform plan), although anyone viewing the objects through the UI would still need to check multiple locations.

A very minimal sample Nexus as Code configuration (the structure would be similar for pure Terraform code) illustrates this well: a single file can contain the bridge domains and endpoint groups as well as the AAEP to EPG mappings. Since the EPG/VLAN will be applied to all ports associated with the AAEP, there is no need to include port configuration, which keeps the number of lines to a minimum.

A point to consider when defining EPGs on the AAEP is the security/RBAC policy. This applies not just for this AAEP to EPG mapping example but for all "cross-page" configuration. As previously discussed, when segmenting via pages (e.g. tenant or access policies), each Terraform plan can be restricted to a security domain or a specific user account that has access to only the resources that plan should create.

When the Terraform configuration starts to expand across areas, e.g. contains multiple tenants or tenant and access policy configuration in one plan, the permissions may need to be expanded. A user who may have been given write access to their tenant and read to fabric/access policies might now also need write privileges for access policies to configure the AAEP to EPG mapping.

In a design with 1 EPG = 1 BD = 1 VLAN, the following Terraform resources are required to bind an EPG to an AAEP when using the generic aci_rest_managed resource:

  • 1 x AAEP
  • 1 x AAEP to physical domain mapping
  • 1 x AAEP to EPG mapping per VLAN (the VLAN encap is set on this mapping)
  • 1 x bridge domain per VLAN
  • 1 x BD to VRF mapping
  • 1 x subnet (if using ACI as default GW)
  • 1 x EPG per VLAN
  • 1 x EPG to BD mapping
  • 1 x EPG physical domain mapping

Info

This is not including the VLAN pools or interface/switch policies and profiles resources which would also need to be created.

The general calculation is as follows:

total_resources = 2 + (7 * num_of_vlans)

For example, 4 VLANs (with BD default GW) would require 30 resources.

  • 1 x AAEP
  • 1 x AAEP to physical domain mapping
  • 4 x AAEP to EPG mappings
  • 24 x BD and EPG resources (4 VLANs x 6 resources from the list above)

A larger configuration such as 20 VLANs (with BD default GW) would require 142 resources.

  • 1 x AAEP
  • 1 x AAEP to physical domain mapping
  • 20 x AAEP to EPG mappings
  • 120 x BD and EPG resources (20 VLANs x 6 resources from the list above)
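
As a hypothetical sketch, the per-VLAN AAEP to EPG mapping can be expressed with aci_rest_managed as below. The class infraRsFuncToEpg carries the VLAN encap as an attribute; the AAEP, tenant, and DN layout (including the gen-default container) are assumptions about how the APIC stores these mappings.

```hcl
# 1 x AAEP to EPG mapping per VLAN; the encap is an attribute of the
# mapping object itself rather than a separate resource.
resource "aci_rest_managed" "aaep_to_epg" {
  for_each   = toset(["100", "101", "102", "103"])
  dn         = "uni/infra/attentp-demo-aaep/gen-default/rsfuncToEpg-[uni/tn-demo/ap-demo/epg-vlan${each.key}]"
  class_name = "infraRsFuncToEpg"
  content = {
    encap = "vlan-${each.key}"
    mode  = "regular"
  }
}
```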

Info

Compared to the static binding option this is a significant decrease in the required Terraform configuration. Keep in mind that this configures the EPG/VLAN on all ports and switches to which the AAEP is attached.


Bind to All Ports In A Single Switch

Within the Tenant tab -> App Profile -> EPG -> Static leafs

This configures the EPG/VLAN on all ports of a switch and provides an easier way to configure many static ports on a switch that may have a dedicated function. For example a switch where all ports are used for OOB management and require the same EPG on all ports.

In a design with 1 EPG = 1 BD = 1 VLAN, the following Terraform resources are required to configure a static leaf binding for a VLAN when using the generic aci_rest_managed resource:

  • 1 x bridge domain per VLAN
  • 1 x BD to VRF mapping
  • 1 x subnet (if using ACI as default GW)
  • 1 x EPG per VLAN
  • 1 x EPG to BD mapping
  • 1 x EPG physical domain mapping
  • 1 x EPG to static leaf mapping per VLAN per switch

Info

This is not including the VLAN pools or interface/switch policies and profiles resources which would also need to be created.

The general calculation is as follows:

total_resources = (6 * num_of_vlans) + (num_of_vlans * num_of_switches)

For example, 4 VLANs (with BD default GW) on all ports of all 3 switches would require 36 resources.

  • 4 x BDs
  • 4 x BD to VRF mappings
  • 4 x subnets
  • 4 x EPGs
  • 4 x EPG to BD mappings
  • 4 x EPG physical domain mappings
  • 12 x EPG to static leaf mappings (4 EPGs/VLANs x 3 static leafs)

A larger configuration such as 20 VLANs (with BD default GW) on all ports of all 3 switches would require 180 resources.

  • 20 x BDs
  • 20 x BD to VRF mappings
  • 20 x subnets
  • 20 x EPGs
  • 20 x EPG to BD mappings
  • 20 x EPG physical domain mappings
  • 60 x EPG to static leaf mappings (20 EPGs/VLANs x 3 static leafs)
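
A hypothetical sketch of the static leaf bindings follows (tenant, pod, and node IDs are made up). One fvRsNodeAtt object is created per VLAN, per switch, matching the (num_of_vlans * num_of_switches) term in the calculation.

```hcl
locals {
  vlans = ["100", "101", "102", "103"]
  nodes = ["101", "102", "103"]

  # 4 VLANs x 3 switches = 12 static leaf bindings
  leaf_bindings = {
    for c in setproduct(local.vlans, local.nodes) :
    "${c[0]}-${c[1]}" => { vlan = c[0], node = c[1] }
  }
}

# 1 x EPG to static leaf mapping per VLAN, per switch (class fvRsNodeAtt)
resource "aci_rest_managed" "static_leaf" {
  for_each   = local.leaf_bindings
  dn         = "uni/tn-demo/ap-demo/epg-vlan${each.value.vlan}/rsnodeAtt-[topology/pod-1/node-${each.value.node}]"
  class_name = "fvRsNodeAtt"
  content    = { encap = "vlan-${each.value.vlan}" }
}
```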

Info

Keep in mind that this option configures the EPG/VLAN on all ports of a switch, not just a single port nor a subset of ports.


ACI Network Connectivity Options: Summary

Consider the following points if the length of time to apply configuration is too long or the amount of configuration is overwhelming:

  • Is VMM integration an option for some of the workloads? This can reduce the static port configuration which improves run time and readability
  • If static ports are the only option:
    • Does every VLAN need to be associated to every port on every switch? e.g. in the example of 20 VLANs on 6 switches, each with 48 ports, could this be reduced?
    • Can the bulk static port configuration be used?
    • If the static port configuration can't be reduced, can it be split into different folders to reduce run time?
    • Would one of the other options be suitable? e.g. associate the EPGs/VLANs to the AAEP, configure a static leaf, or set and forget the static ports and use ESGs

On to the next chapter.

Part 3 - Example Designs

Additional References
