https://en.wikipedia.org/wiki/Autonomous_system_(Internet) https://www.rfc-editor.org/rfc/rfc1997.html https://datatracker.ietf.org/doc/html/rfc4271 https://datatracker.ietf.org/doc/html/rfc7911
- example realistic traffic engineering - prefix lists to shift some subnets to a different ISP if people complain about packet loss - rather than global local preference
- communities
- neighbour group
- templates
- usecase/purpose - internet peering, control plane for VXLAN, MPLS etc
- administrative distances - 200 for iBGP and 20 for eBGP
Intro to BGP¶
Estimated time to read: 78 minutes
- Originally Written: September, 2024
Note
This post is an introduction with some notes I've taken for future reference. It doesn't cover everything but I've tried to include additional resources where it makes sense
Overview¶
- Operates between Autonomous Systems (AS)
- A set of routers under a single technical administration
- Assigned a 2-byte or 4-byte AS Number (ASN)
- Private ASN ranges are 64512 to 65534 for 2-byte and 4200000000 to 4294967294 for 4-byte
- Built on TCP which provides reliable transport
- TCP Port 179
- TCP meets BGP's transport requirements and is present in virtually all commercial routers and hosts
- This eliminates the need to implement explicit update fragmentation, retransmission, acknowledgement, and sequencing
- BGP uses TCP mechanisms for authentication (See TCP Authentication section)
- BGP is a path vector protocol (vs link-state like OSPF)
- Sends updates advertising routes or other Network Layer Reachability Information (NLRI) and path attributes used to determine how to reach those destinations, i.e. which AS path to take
- Built-in loop prevention
- BGP routers will only advertise the single best path to their neighbors, although additional paths can be configured for advertisement (see Advertisement of Multiple Paths in BGP and Cisco IOS BGP Additional Paths)
- Multi-protocol BGP (MP-BGP) is an extension to BGP which allows you to send more than just IPv4 routes
- For example:
- IPv6 addresses
- VPNv4 addresses used in MPLS VPN networks
- Endpoint/host reachability information such as IP/MAC addresses
- Address family is L2VPN and Subsequent Address Family is EVPN
- BGP neighbours are statically defined although it's also possible to use dynamic peers or peer groups which are defined by a range of IP addresses
- Since it was designed to be used between different Autonomous Systems (which may have been different organizations), care should be taken when configuring both sides of the neighbour connection. Compare this to OSPF (`224.0.0.5`) or EIGRP (`224.0.0.10`) which use multicast and hello packets for automatically discovering neighbours
- BGP sessions can be either internal BGP (iBGP) or external BGP (eBGP)
- iBGP: between routers in the same AS
- NEXT_HOP does not change by default
- Use the `next-hop-self` command to change the next-hop IP address for a prefix to the router itself
- Updates received from iBGP peers are not advertised to other iBGP peers (multiple hops away) unless route reflection or confederations are used
- For example imagine a full mesh triangle topology with R1 -> R2, R2 -> R3, and R3 -> R1. In this example (without route reflectors) R1 learns a route from an eBGP peer and advertises it to R2 and R3. R2 will not advertise this route to R3, and vice versa. To ensure R2 and R3 receive the route, R1 must have direct iBGP sessions with both R2 and R3.
- iBGP typically uses loopback interfaces when peering with another iBGP speaker
- eBGP: between routers in different ASes
- A BGP speaker sending an NLRI update sets the NEXT_HOP address as the local IPv4/IPv6 address being used for that peering session
- Updates received from an eBGP peer can be advertised to both iBGP and other eBGP peers
- eBGP typically uses physical interfaces when peering with another eBGP speaker
- BGP favours stability over sharing the latest updates straight away
- eBGP: suggested default value for the advertisement interval is 30 seconds
- iBGP: suggested default value for the advertisement interval is 5 seconds
- https://datatracker.ietf.org/doc/html/rfc4271
iBGP and eBGP interface peering
For simplicity the iBGP configuration examples below use the physical interfaces for peering. The iBGP configuration using loopbacks is shown from the iBGP vs eBGP loop prevention, peering interfaces, and next-hop section
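As a quick aid for the private ASN ranges listed in the Overview, here is a minimal Python sketch that checks whether an ASN is private and whether it needs the 4-byte encoding. The function names are my own and purely illustrative; the ranges are the ones given above.

```python
# Private ASN ranges as listed in the Overview (inclusive bounds).
PRIVATE_2_BYTE = range(64512, 65534 + 1)
PRIVATE_4_BYTE = range(4200000000, 4294967294 + 1)


def is_private_asn(asn: int) -> bool:
    """Return True if asn falls inside either private range."""
    return asn in PRIVATE_2_BYTE or asn in PRIVATE_4_BYTE


def needs_4_byte_encoding(asn: int) -> bool:
    """Return True if asn does not fit in the original 2-byte field."""
    return asn > 0xFFFF


for asn in (65001, 64511, 4200000000):
    print(asn, is_private_asn(asn), needs_4_byte_encoding(asn))
```

All of the lab configurations in this post use ASNs from the 2-byte private range (65001 to 65004).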
BGP Packet Header¶
- Marker
- 16 bytes
- Used to delimit BGP messages
- Expected value is all ones
- Length
- 2 bytes
- Total length of the message including the header
- Total length >= 19 bytes and <= 4096 bytes
- Type
- 1 byte
- Type of message
- 1: OPEN
- 2: UPDATE
- 3: NOTIFICATION
- 4: KEEPALIVE
- 5: ROUTE REFRESH
Formatting issue when using dark theme
As per this GitHub discussion (https://github.com/squidfunk/mkdocs-material/discussions/4582): "Unfortunately, you cannot override Mermaid's CSS with additional CSS, because Mermaid.js diagrams must be encapsulated in shadow DOMs due to duplicate identifiers which lead to undefined behavior". Therefore it's best to view this packet diagram using the light theme. You can change the theme by clicking the button at the top of the page, on the left hand side next to the search box.
packet-beta
0-127: "Marker"
128-143: "Length"
144-151: "Type"
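To make the header layout concrete, here is a small Python sketch that unpacks the 19-byte header and applies the checks described above (all-ones marker, length between 19 and 4096, known type). The function name `parse_bgp_header` is hypothetical; this is a learning aid, not how any particular router implements it.

```python
import struct

HEADER_LEN = 19
MAX_LEN = 4096
MESSAGE_TYPES = {
    1: "OPEN",
    2: "UPDATE",
    3: "NOTIFICATION",
    4: "KEEPALIVE",
    5: "ROUTE REFRESH",
}


def parse_bgp_header(data: bytes) -> tuple[int, str]:
    """Return (total length, type name) for the header at the start of data."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least 19 bytes for a BGP header")
    # Network byte order: 16-byte marker, 2-byte length, 1-byte type.
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != b"\xff" * 16:
        raise ValueError("marker is not all ones")
    if not HEADER_LEN <= length <= MAX_LEN:
        raise ValueError(f"invalid length {length}")
    if msg_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type {msg_type}")
    return length, MESSAGE_TYPES[msg_type]


# A KEEPALIVE is just the header: 16 marker bytes, length 19, type 4.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
print(parse_bgp_header(keepalive))  # (19, 'KEEPALIVE')
```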
IGP vs EGP¶
Feature | IGP (e.g. OSPF) | EGP (e.g. BGP) |
---|---|---|
Protocol Type | Link-state, hierarchical | Path vector |
Usage | Within an autonomous system | eBGP is used between autonomous systems (iBGP is used within an AS) |
Convergence | Fast | Slower compared to IGPs |
Metric | Cost based on bandwidth (or other metric depending on protocol) | Path attributes |
Scalability | Suitable for large enterprise networks | Suitable for large-scale internet routing |
Path Selection | Shortest path first | Best path based on multiple attributes |
Topology Information | Complete topology map | Path information between ASes |
Configuration | Per area configuration | Per neighbor configuration |
Policy Control | Limited | Extensive policy control and traffic engineering |
BGP Message Types¶
- There are 5 message types which BGP uses for communication
- Unicast is used for sending messages
- Each message has a fixed-size header (19 bytes)
- The maximum message size is 4096 bytes but there is also extended message size support with RFC 8654
OPEN¶
- This is the first message that is sent after the TCP session is established
- If the peer is incompatible the peering/session is closed
- Message contains:
- Version number
- AS of the sender
- Hold time
- Router ID of the sender
- Optional parameters
- Capabilities advertisement - lists the capabilities (e.g. MP-BGP) supported by the speaker
- The following screenshots show the TCP setup (SYN, SYN/ACK, ACK) and the OPEN messages between 10.1.1.1 and 10.1.1.2
- Some optional capabilities are also present
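The fields listed above can be sketched in Python as follows. The field widths (1-byte version, 2-byte AS, 2-byte hold time, 4-byte router ID, 1-byte optional parameter length) come from RFC 4271 rather than from this post, and `parse_open_body` is a made-up helper name. The sample body is hypothetical and only modelled on R1's values; it carries no optional parameters.

```python
import ipaddress
import struct


def parse_open_body(body: bytes) -> dict:
    """Parse the OPEN fields that follow the 19-byte message header."""
    version, my_as, hold_time, router_id, opt_len = struct.unpack(
        "!BHH4sB", body[:10]
    )
    return {
        "version": version,
        "as": my_as,
        "hold_time": hold_time,
        "router_id": str(ipaddress.IPv4Address(router_id)),
        # Capabilities such as MP-BGP would be carried in these raw bytes.
        "optional_parameters": body[10:10 + opt_len],
    }


# Hypothetical OPEN body: version 4, AS 65001, hold time 180, router ID 1.1.1.1.
body = struct.pack(
    "!BHH4sB", 4, 65001, 180, ipaddress.IPv4Address("1.1.1.1").packed, 0
)
print(parse_open_body(body))
```

The sketch leaves the optional parameters as raw bytes; decoding the individual capabilities is out of scope here.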
Router configurations
- R1
!
interface GigabitEthernet1
ip address 10.1.1.1 255.255.255.0
!
router bgp 65001
bgp router-id 1.1.1.1
network 192.168.1.0
neighbor 10.1.1.2 remote-as 65002
- R2
NOTIFICATION¶
- You'll see notifications when there are problems with the communication
- It is sent by a peer when it detects an unrecoverable condition and needs to terminate the peering
- After sending the NOTIFICATION, the sender closes the session and the event is logged to syslog
- You'll also see the session status move into Closing and then Idle
- The following configuration results in notifications being sent due to an ASN mismatch and you can also see the TCP session closing (FIN, ACK)
Router configurations
- R1
!
interface GigabitEthernet1
ip address 10.1.1.1 255.255.255.0
!
router bgp 65001
bgp router-id 1.1.1.1
neighbor 10.1.1.2 remote-as 65001
network 192.168.1.0 mask 255.255.255.0
- R2
KEEPALIVE¶
- BGP uses its own keepalives instead of the TCP keepalive
- It's sent immediately after receiving a compatible OPEN message from a peer
- Keepalives are sent periodically (one third of the holdtimer)
- IOS-XE holdtimer is 180 seconds by default
- This message only contains the message header (19 bytes)
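The timer relationship above is simple arithmetic, sketched here in Python. One assumption beyond this post: per RFC 4271 the session uses the smaller of the two hold times exchanged in the OPEN messages, and a hold time of 0 means keepalives are not sent. The function name is illustrative only.

```python
def negotiated_timers(local_hold: int, peer_hold: int) -> tuple[int, int]:
    """Return (hold time, keepalive interval) in seconds for a session."""
    # Assumption from RFC 4271: both sides use the smaller hold time.
    hold = min(local_hold, peer_hold)
    # Keepalives go out at one third of the hold time (0 means disabled).
    keepalive = hold // 3
    return hold, keepalive


print(negotiated_timers(180, 180))  # (180, 60)
print(negotiated_timers(180, 90))   # (90, 30)
```

With the IOS-XE default hold time of 180 seconds this gives a keepalive every 60 seconds.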
ROUTE REFRESH¶
- This is necessary when an inbound route policy changes
- It allows BGP to dynamically request a re-advertisement of routes of any address family from a BGP peer, rather than using `soft-reconfiguration`, which keeps a separate unmodified copy of all routes received from a neighbour but results in high memory usage
- From RFC 2918: "When the inbound routing policy for a peer changes, all prefixes from that peer must be somehow made available and then re-examined against the new policy. To accomplish this, a commonly used approach, known as 'soft-reconfiguration', is to store an unmodified copy of all routes from that peer at all times, even though routing policies do not change frequently (typically no more than a couple times a day). Additional memory and CPU are required to maintain these routes."
- The Route Refresh capability is advertised within the Optional Parameters section of the OPEN message
Router configurations
- R1
!
interface GigabitEthernet1
ip address 10.1.1.1 255.255.255.0
!
interface Loopback0
ip address 192.168.1.1 255.255.255.255
!
router bgp 65001
bgp router-id 1.1.1.1
neighbor 10.1.1.2 remote-as 65002
network 192.168.1.1 mask 255.255.255.255
- R2
- I've added a route-map to set the metric of routes received from R1
- To trigger the route refresh I ran a soft inbound clear using `clear ip bgp 10.1.1.1 in`
!
interface GigabitEthernet1
ip address 10.1.1.2 255.255.255.0
!
interface Loopback0
ip address 192.168.2.1 255.255.255.255
!
router bgp 65002
bgp router-id 2.2.2.2
network 192.168.2.1 mask 255.255.255.255
neighbor 10.1.1.1 remote-as 65001
neighbor 10.1.1.1 route-map SET-BGP-METRIC in
!
route-map SET-BGP-METRIC permit 10
set metric 10
!
Verification
- This is a screenshot showing the BGP routing table after the inbound policy has been set
- As you can see, the metric is still the default value of 0 and therefore we need to request a refresh to update the metric using the new route-map
- To trigger the route refresh I ran `clear ip bgp 10.1.1.1 in` and as you can see the metric has been updated
UPDATE¶
- This is the main message used by BGP to advertise or withdraw routing information between peers
- The UPDATE message includes withdrawn routes, path attributes, and Network Layer Reachability Information (e.g. routes)
- Path Attributes are bits of information associated with a BGP route that help in determining the best path for routing decisions. For example AS path, next-hop, local preference, and metric
- iBGP: Updates received from iBGP peers are not advertised to other iBGP peers unless route reflection or confederations are used
- eBGP: Updates received from an eBGP peer can be advertised to both iBGP and other eBGP peers
Router configurations
- R1
!
interface GigabitEthernet1
ip address 10.1.1.1 255.255.255.0
!
interface Loopback0
ip address 192.168.1.1 255.255.255.255
!
router bgp 65001
bgp router-id 1.1.1.1
neighbor 10.1.1.2 remote-as 65002
network 192.168.1.1 mask 255.255.255.255
- R2
- To trigger the update I added the `Loopback1` interface and `network 192.168.3.1 mask 255.255.255.255` to R2 and waited for the update message
!
interface GigabitEthernet1
ip address 10.1.1.2 255.255.255.0
!
interface Loopback0
ip address 192.168.2.1 255.255.255.255
!
interface Loopback1
ip address 192.168.3.1 255.255.255.255
!
router bgp 65002
bgp router-id 2.2.2.2
network 192.168.2.1 mask 255.255.255.255
network 192.168.3.1 mask 255.255.255.255
neighbor 10.1.1.1 remote-as 65001
!
BGP States¶
- There are a number of states a BGP session goes through as it establishes a connection with its peer
- Idle
- The initial state where BGP waits for a start event to initiate a connection
- Connect
- BGP attempts to establish a TCP connection with the peer
- Active
- Despite the name, a peer stuck in this state is not a good sign: it means BGP is still actively trying to establish a session
- If the TCP connection fails, BGP transitions to the Active state and retries the connection
- If you see Active status then ensure the configuration is correct
- OpenSent
- BGP has sent an OPEN message and is waiting to receive an OPEN message from the peer
- OpenConfirm
- BGP has received an OPEN message and is waiting for a KEEPALIVE or NOTIFICATION message to confirm the connection
- Established
- The BGP session is fully established, and peers can exchange routing information
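The happy path through these states can be sketched as a small transition table in Python. This is a heavily simplified view of the state machine in RFC 4271: the event names are my own labels, timers are ignored, and anything unexpected simply drops the session back to Idle.

```python
# Simplified transition table: (current state, event) -> next state.
# Event names are invented for this sketch and are not protocol terms.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_established"): "OpenSent",
    ("OpenSent", "open_received"): "OpenConfirm",
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
    ("Established", "update_received"): "Established",
}


def next_state(state: str, event: str) -> str:
    """Follow the table; any other event drops the session to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


state = "Idle"
for event in ("start", "tcp_established", "open_received", "keepalive_received"):
    state = next_state(state, event)
    print(f"{event} -> {state}")
```

The sequence printed by the loop (Connect, OpenSent, OpenConfirm, Established) is the same order of transitions that appears in the log output below.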
Example log output when enabling BGP
This output has been modified to only show the relevant states
*Aug 29 08:37:19.513: BGP: nopeerup-delay cold-boot, set to default, 300s
*Aug 29 08:37:19.513: BGP: scope global Creating.
*Aug 29 08:37:19.537: BGP(base): will wait 10s for peer to be configured
*Aug 29 08:37:20.251: BGP: nbr global 10.1.1.2 Received RIB notification for neighbor route: reachable
*Aug 29 08:37:20.251: BGP: nbr global 10.1.1.2 Open active delayed 11264ms (35000ms max, 60% jitter)
*Aug 29 08:37:20.483: BGP: 10.1.1.2 passive open to 10.1.1.1
*Aug 29 08:37:20.484: BGP: 10.1.1.2 passive went from Idle to Connect
*Aug 29 08:37:20.484: BGP: ses global 10.1.1.2 (0x7F2A0DF6F218:0) pas Setting open delay timer to 60 seconds.
*Aug 29 08:37:20.484: BGP: 10.1.1.2 passive rcv message type 1, length (excl. header) 38
*Aug 29 08:37:20.484: BGP: ses global 10.1.1.2 (0x7F2A0DF6F218:0) pas Receive OPEN
*Aug 29 08:37:20.484: BGP: 10.1.1.2 passive rcv OPEN, version 4, holdtime 180 seconds
*Aug 29 08:37:20.484: BGP: 10.1.1.2 passive rcv OPEN w/ OPTION parameter len: 28
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive rcvd OPEN w/ optional parameter type 2 (Capability) len 6
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive OPEN has CAPABILITY code: 1, length 4
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive OPEN has MP_EXT CAP for afi/safi: 1/1
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive rcvd OPEN w/ optional parameter type 2 (Capability) len 2
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive OPEN has CAPABILITY code: 128, length 0
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive OPEN has ROUTE-REFRESH capability(old) for all address-families
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive rcvd OPEN w/ optional parameter type 2 (Capability) len 2
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive OPEN has CAPABILITY code: 2, length 0
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive OPEN has ROUTE-REFRESH capability(new) for all address-families
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive rcvd OPEN w/ optional parameter type 2 (Capability) len 2
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive OPEN has CAPABILITY code: 70, length 0
*Aug 29 08:37:20.485: BGP: ses global 10.1.1.2 (0x7F2A0DF6F218:0) pas Enhanced Refresh cap received in open message
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive rcvd OPEN w/ optional parameter type 2 (Capability) len 6
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive OPEN has CAPABILITY code: 65, length 4
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive OPEN has 4-byte ASN CAP for: 65002
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive rcvd OPEN w/ remote AS 65002, 4-byte remote AS 65002
*Aug 29 08:37:20.485: BGP: ses global 10.1.1.2 (0x7F2A0DF6F218:0) pas Send OPEN
*Aug 29 08:37:20.485: BGP: ses global 10.1.1.2 (0x7F2A0DF6F218:0) pas Building Enhanced Refresh capability
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive went from Connect to OpenSent
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive sending OPEN, version 4, my as: 65001, holdtime 180 seconds, ID 1010101
*Aug 29 08:37:20.485: BGP: 10.1.1.2 passive went from OpenSent to OpenConfirm
*Aug 29 08:37:20.487: BGP: 10.1.1.2 passive went from OpenConfirm to Established
*Aug 29 08:37:20.487: BGP: nbr global 10.1.1.2 Stop Active Open timer as all topologies are allocated
*Aug 29 08:37:20.487: BGP: ses global 10.1.1.2 (0x7F2A0DF6F218:1) Up
*Aug 29 08:37:20.487: BGP(0): 10.1.1.2 was the first peer to be established for IPv4 Unicast
*Aug 29 08:37:20.488: %BGP-5-ADJCHANGE: neighbor 10.1.1.2 Up
Example log output if there's a problem with the connection
To trigger this event I blocked the keepalives between the peers. This output has been modified to only show the relevant states
*Aug 29 09:14:34.373: BGP: 10.1.1.2 connection timed out 180856ms (last update) 180000ms (hold time)
*Aug 29 09:14:34.373: BGP: 10.1.1.2 went from Established to Closing
*Aug 29 09:14:34.373: BGP: 10.1.1.2 reset due to BGP Notification sent
*Aug 29 09:14:34.376: %BGP-3-NOTIFICATION: sent to neighbor 10.1.1.2 4/0 (hold time expired) 0 bytes
*Aug 29 09:14:34.376: BGP: ses global 10.1.1.2 (0x7F2A0DF50338:1) Send NOTIFICATION 4/0 (hold time expired) 0 bytes
*Aug 29 09:14:34.376: BGP: 10.1.1.2 local error close after sending NOTIFICATION
*Aug 29 09:14:34.376: %BGP-5-NBR_RESET: Neighbor 10.1.1.2 reset (BGP Notification sent)
*Aug 29 09:14:34.377: BGP: nbr_topo global 10.1.1.2 IPv4 Unicast:base (0x7F2A0DF50338:1) Resetting ALL counters.
*Aug 29 09:14:34.377: BGP: 10.1.1.2(0x7F2A0DF50338) closing
*Aug 29 09:14:34.377: BGP: ses global 10.1.1.2 (0x7F2A0DF50338:1) Session close and reset neighbor 10.1.1.2 topostate
*Aug 29 09:14:34.377: BGP: 10.1.1.2 went from Closing to Idle
*Aug 29 09:14:34.377: %BGP-5-ADJCHANGE: neighbor 10.1.1.2 Down BGP Notification sent
*Aug 29 09:14:34.377: BGP: nbr global 10.1.1.2 Open active delayed 11264ms (35000ms max, 60% jitter)
*Aug 29 09:14:34.377: BGP: nbr global 10.1.1.2 Active open failed - open timer running
*Aug 29 09:14:45.641: BGP: 10.1.1.2 active went from Idle to Active
*Aug 29 09:14:45.641: BGP: 10.1.1.2 open active, local address 10.1.1.1
*Aug 29 09:15:15.641: BGP: 10.1.1.2 open failed: Connection timed out; remote host not responding
*Aug 29 09:15:15.641: BGP: 10.1.1.2 Active open failed - tcb is not available, open active delayed 10240ms (35000ms max, 60% jitter)
*Aug 29 09:15:15.641: BGP: 10.1.1.2 active went from Active to Idle
*Aug 29 09:15:15.641: BGP: nbr global 10.1.1.2 Active open failed - open timer running
*Aug 29 09:15:15.641: BGP: nbr global 10.1.1.2 Active open failed - open timer running
*Aug 29 09:15:25.580: BGP: 10.1.1.2 active went from Idle to Active
*Aug 29 09:15:25.580: BGP: 10.1.1.2 open active, local address 10.1.1.1
BGP Path Attributes¶
- BGP uses path attributes to determine which are the best paths to take to a destination
- They are carried within the UPDATE message except when an UPDATE message contains only withdrawn routes
- Path attributes can also be used to filter or sort routes and to prevent routing loops
- There are two types of attributes, well known and optional, with each type having two sub-types
Well known¶
- Every BGP implementation must support these attributes
- Mandatory
- These must always be included with the NLRI
- AS_PATH
- Lists the ASes that the route has traversed
- NEXT_HOP
- Specifies the next hop IP address to reach the destination
- ORIGIN
- Indicates the origin of the route (IGP, EGP, or Incomplete)
- IGP
- Route was originated within the same AS using an IGP such as OSPF
- If you configure a BGP process to advertise a route using the `network x.x.x.x` command it will show up as Origin IGP (i) in the `show ip bgp x.x.x.x` output
- EGP
- Route was learned via the Exterior Gateway Protocol (EGP), the obsolete predecessor to BGP
- Incomplete
- Indicates the origin of the route is unknown or learned through some other means, such as redistribution from another routing protocol
- Redistributed routes from an IGP (static, connected, OSPF, EIGRP) into BGP will show up as Origin incomplete (?)
- Discretionary
- These can be included as needed
- LOCAL_PREF
- Indicates the preferred path for outbound traffic within an AS. Used in iBGP, not eBGP
- ATOMIC_AGGREGATE
- Indicates that the route has been aggregated and some path attributes may have been lost
Optional¶
- BGP implementations don't need to, but can, support these attributes
- Transitive
- When advertising an NLRI, keep the attribute with the NLRI even if it's not recognized
- AGGREGATOR
- Identifies the AS and router that performed route aggregation
- COMMUNITIES
- Used to group routes that should be treated the same and tag them with specific information that can be used by routers to make routing decisions
- There are several well-known standard communities:
- NO_EXPORT: Routes with this community should not be advertised outside the local AS or confederation
- NO_ADVERTISE: Routes with this community should not be advertised to any BGP peers
- NO_EXPORT_SUBCONFED: Routes with this community should not be advertised to any eBGP peer, including peers in other sub-ASes within a confederation
- LOCAL_AS: Cisco's keyword (local-AS) for NO_EXPORT_SUBCONFED; routes with this community should not be advertised outside the local AS (or the local sub-AS in a confederation)
- EXTENDED_COMMUNITIES
- Extension of the standard COMMUNITY attribute which provides more flexibility and additional functionality
- Non-transitive
- If not recognized, these are quietly ignored and not passed along to other BGP peers
- MULTI_EXIT_DISCRIMINATOR(MED) or METRIC
- Suggests the preferred path into an AS when multiple entry points exist
- If this attribute is received from a neighboring AS it must not be propagated to other neighboring AS'
- ORIGINATOR_ID
- Identifies the originator of the route in a route reflector environment
- CLUSTER_LIST
- Used to prevent routing loops in a route reflector environment
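On the wire, the well-known/optional and transitive/non-transitive split shows up in the attribute flags byte of each path attribute. The Python sketch below decodes the two high-order flag bits and maps the type codes of the attributes covered in this section. The bit positions and type codes are taken from RFC 4271, RFC 1997 and RFC 4456 rather than from this post, and `classify` is a hypothetical helper name.

```python
# Type codes for the attributes covered in this section.
ATTRIBUTE_NAMES = {
    1: "ORIGIN", 2: "AS_PATH", 3: "NEXT_HOP", 4: "MULTI_EXIT_DISC",
    5: "LOCAL_PREF", 6: "ATOMIC_AGGREGATE", 7: "AGGREGATOR",
    8: "COMMUNITIES", 9: "ORIGINATOR_ID", 10: "CLUSTER_LIST",
}
MANDATORY = {"ORIGIN", "AS_PATH", "NEXT_HOP"}


def classify(flags: int, type_code: int) -> str:
    """Describe an attribute from its flags byte and type code."""
    name = ATTRIBUTE_NAMES.get(type_code, f"type {type_code}")
    optional = bool(flags & 0x80)    # high-order bit: optional vs well-known
    transitive = bool(flags & 0x40)  # second bit: transitive vs non-transitive
    if optional:
        kind = "optional " + ("transitive" if transitive else "non-transitive")
    else:
        kind = "well-known " + ("mandatory" if name in MANDATORY else "discretionary")
    return f"{name}: {kind}"


print(classify(0x40, 2))  # AS_PATH: well-known mandatory
print(classify(0x40, 5))  # LOCAL_PREF: well-known discretionary
print(classify(0xC0, 8))  # COMMUNITIES: optional transitive
print(classify(0x80, 4))  # MULTI_EXIT_DISC: optional non-transitive
```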
BGP Path Selection¶
- BGP uses path attributes (carried in UPDATE message) to select the best path for a prefix
- Although it may seem complex, there is logic behind the best path selection process
- Always have the possibility for an administrator to override the path selection process
- Prefer local routes over routes learned from someone else
- Try to have a loop-free topology
- Use the shortest path in terms of number of ASes
- Use the most likely path to hit the true destination
- Leave the local AS as quickly as possible without advertising insignificant routes
- Aim to select exactly one best path per NLRI which is used locally and advertised to peers
- For every NLRI (route) learned:
- Select the first variant in the database as the best one
- Go line by line of the routing table looking for the same route with other attributes
- Compare it with the first
- Pick the new best path
- This could be the current one (no change) or could be the entry at hand (this is known as an implicit withdraw)
- Finish the process when all routes have been inspected
- The following order is used to compare the variants of each NLRI and determine the best path selection
- The comparison stops at the first step where one path is better than the other
- e.g. Consider a scenario where a router has received multiple routes to the same destination. If the weights and local preferences are equal the path with the shortest AS path will be preferred and the comparison process stops
- NEXT_HOP REACHABILITY
- Ensure that the next hop is reachable
- If the next hop is not reachable, the path is not considered
- WEIGHT
- Highest weight wins
- Default: 32768 for locally originated prefixes and 0 for routes learned from peers
- Local to a router, never advertised
- Cisco specific parameter
- Allows the administrator to override the best outbound path selection for a single router by setting a higher weight
- If a decision is made at this step we do not need to visit the other steps in the algorithm
- LOCAL_PREF
- Higher local preference wins
- Default: 100
- Advertised to all iBGP peers in an AS
- Allows the administrator to override the best outbound path selection for the entire AS
- LOCALLY ORIGINATED PATH
- Advertised
- Prefer routes that this router originated itself (for example with the network or aggregate-address commands, or through redistribution) over routes learned from a peer
- AS_PATH
- Shortest AS path wins
- Traverse the fewest ASes
- ORIGIN TYPE
- Lower wins: IGP is lower than EGP and EGP is lower than incomplete
- Take the most trustworthy path
- A route injected with the `network` command (origin IGP) is considered more trustworthy than one redistributed into BGP (origin incomplete)
- MULTI_EXIT_DISC (MED)
- Lower metric value wins
- Default: 0
- One AS can tell another AS which entrypoint it should use if there are multiple entries into the AS
- Respect the preferred path hint given by the neighbour AS
- This is between two ASes only and will not be advertised further
- If this attribute is received from a neighboring AS it must not be propagated to other neighboring ASes
- PREFER AN EBGP LEARNED PATH OVER IBGP
- If you need to leave the local AS then choose the most direct path and don't traverse via iBGP peers
- PREFER A PATH WITH LOWER IGP METRIC TO NEXT HOP
- If you need to traverse the local AS but need to traverse iBGP peers then take the shortest path towards the exit
- PREFER AN OLDER EBGP PATH
- If both paths are learned via eBGP, prefer the older one
- At this stage of the process the paths are effectively equal so don't bother updating everyone and just use the existing older path
- TECHNICAL TIE-BREAKERS
- If the two paths are still equal at this point then use the following tie-breakers (in order) to find a single best path
- Prefer the path learned from peer with:
- Lower ROUTER_ID
- Shorter CLUSTER_LIST
- Lower PEERING_ADDRESS
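The ordered comparison above maps naturally onto a sort key, sketched below in Python. This is a simplification and not Cisco's implementation: the `Path` fields and defaults are my own, MED is compared between all paths (real routers normally only compare MED between paths from the same neighbouring AS), and the sample paths are hypothetical values loosely based on the WEIGHT lab in the next section.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    peer: str                      # peering address the path was learned from
    next_hop_reachable: bool = True
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0
    age_seconds: int = 0
    router_id: str = "0.0.0.0"
    cluster_list_len: int = 0


def sort_key(p: Path) -> tuple:
    """Smaller tuples are better, so 'highest wins' fields are negated."""
    return (
        -p.weight,                            # highest weight
        -p.local_pref,                        # highest local preference
        not p.locally_originated,             # locally originated first
        len(p.as_path),                       # shortest AS path
        ORIGIN_RANK[p.origin],                # IGP < EGP < incomplete
        p.med,                                # lowest MED (simplified)
        not p.is_ebgp,                        # eBGP over iBGP
        p.igp_metric,                         # lowest IGP metric to next hop
        -p.age_seconds if p.is_ebgp else 0,   # older eBGP path
        int(IPv4Address(p.router_id)),        # lower router ID
        p.cluster_list_len,                   # shorter cluster list
        int(IPv4Address(p.peer)),             # lower peering address
    )


def best_path(paths):
    """Drop paths with an unreachable next hop, then pick the best one."""
    candidates = [p for p in paths if p.next_hop_reachable]
    return min(candidates, key=sort_key) if candidates else None


# Hypothetical paths to the same prefix as seen from one router.
via_r2 = Path(peer="10.1.1.2", weight=500, as_path=(65002, 65004), router_id="2.2.2.2")
via_r3 = Path(peer="10.1.2.2", as_path=(65003, 65004), router_id="3.3.3.3", age_seconds=600)
print(best_path([via_r2, via_r3]).peer)  # 10.1.1.2
```

Because tuples compare element by element, the first field that differs decides the winner, which mirrors the "stop at the first step that separates the paths" behaviour. In the sample, the higher weight wins at the first step; with equal weights the older eBGP path via 10.1.2.2 would be chosen instead.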
Traffic engineering examples using path selection criteria¶
Router configurations
There are minimal configs and topologies provided to focus on the example scenarios. These are not production ready configs.
Influencing which local interface to use for outbound traffic with WEIGHT¶
- Download Cisco Modeling Labs Topology
- For this scenario I am using a higher weight to override the path selection process and prefer the path via `r2`
Router configurations
- R1
!
interface GigabitEthernet1
ip address 10.1.1.1 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.2.1 255.255.255.0
!
router bgp 65001
bgp router-id 1.1.1.1
bgp log-neighbor-changes
neighbor 10.1.1.2 remote-as 65002
neighbor 10.1.1.2 weight 500
neighbor 10.1.2.2 remote-as 65003
!
- R2
!
interface GigabitEthernet1
ip address 10.1.1.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.3.1 255.255.255.0
!
router bgp 65002
bgp router-id 2.2.2.2
bgp log-neighbor-changes
neighbor 10.1.1.1 remote-as 65001
neighbor 10.1.3.2 remote-as 65004
!
- R3
!
interface GigabitEthernet1
ip address 10.1.2.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.4.1 255.255.255.0
!
router bgp 65003
bgp router-id 3.3.3.3
bgp log-neighbor-changes
neighbor 10.1.2.1 remote-as 65001
neighbor 10.1.4.2 remote-as 65004
!
- R4
!
interface GigabitEthernet1
ip address 10.1.3.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.4.2 255.255.255.0
!
interface Loopback0
ip address 192.168.1.1 255.255.255.255
!
router bgp 65004
bgp router-id 4.4.4.4
bgp log-neighbor-changes
neighbor 10.1.3.1 remote-as 65002
neighbor 10.1.4.1 remote-as 65003
network 192.168.1.1 mask 255.255.255.255
!
LOCAL PREFERENCE to influence which router in an AS to use for outbound traffic¶
- Download Cisco Modeling Labs Topology
- In this scenario `r3` has learned how to reach `192.168.1.1` and `192.168.2.1` from `r5`
- `r3` advertises these prefixes to `r2` and `r1` but sets a higher LOCAL PREFERENCE of 200 on the `192.168.1.1` prefix, which means `r1` and `r2` will use the path through `r3` to reach `192.168.1.1`
- Note in the verification that the preferred path for `192.168.2.1` is still through `r2` (AS 65002)
- Also, while `r3` will advertise this route to `r2`, because of the iBGP loop prevention mechanism, `r2` will not "forward" this advertisement to `r1` and vice versa
Router configurations
- R1
!
hostname r1
!
interface GigabitEthernet1
ip address 10.1.1.1 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.2.1 255.255.255.0
!
interface Loopback0
ip address 172.16.1.1 255.255.255.255
!
router bgp 65001
bgp router-id 1.1.1.1
bgp log-neighbor-changes
neighbor 10.1.1.2 remote-as 65001
neighbor 10.1.2.2 remote-as 65001
network 172.16.1.1 mask 255.255.255.255
!
- R2
!
hostname r2
!
interface GigabitEthernet1
ip address 10.1.1.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.3.1 255.255.255.0
!
interface GigabitEthernet3
ip address 10.1.4.1 255.255.255.0
!
router bgp 65001
bgp router-id 2.2.2.2
bgp log-neighbor-changes
neighbor 10.1.1.1 remote-as 65001
neighbor 10.1.1.1 next-hop-self
neighbor 10.1.3.2 remote-as 65001
neighbor 10.1.3.2 next-hop-self
neighbor 10.1.4.2 remote-as 65002
!
- R3
!
hostname r3
!
interface GigabitEthernet1
ip address 10.1.2.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.3.2 255.255.255.0
!
interface GigabitEthernet3
ip address 10.1.5.1 255.255.255.0
!
router bgp 65001
bgp router-id 3.3.3.3
bgp log-neighbor-changes
neighbor 10.1.2.1 remote-as 65001
neighbor 10.1.2.1 next-hop-self
neighbor 10.1.3.1 remote-as 65001
neighbor 10.1.3.1 next-hop-self
neighbor 10.1.5.2 remote-as 65003
neighbor 10.1.5.2 route-map SET-LOCAL-PREF-HERE in
!
ip prefix-list 192_168_1_1 seq 1 permit 192.168.1.1/32
!
route-map SET-LOCAL-PREF-HERE permit 10
match ip address prefix-list 192_168_1_1
set local-preference 200
!
- R4
!
hostname r4
!
interface GigabitEthernet1
ip address 10.1.4.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.6.1 255.255.255.0
!
router bgp 65002
bgp router-id 4.4.4.4
bgp log-neighbor-changes
neighbor 10.1.4.1 remote-as 65001
neighbor 10.1.6.2 remote-as 65004
!
- R5
!
hostname r5
!
interface GigabitEthernet1
ip address 10.1.5.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.7.1 255.255.255.0
!
router bgp 65003
bgp router-id 5.5.5.5
bgp log-neighbor-changes
neighbor 10.1.5.1 remote-as 65001
neighbor 10.1.7.2 remote-as 65004
!
- R6
!
hostname r6
!
interface GigabitEthernet1
ip address 10.1.6.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.7.2 255.255.255.0
!
interface Loopback0
ip address 192.168.1.1 255.255.255.255
!
interface Loopback1
ip address 192.168.2.1 255.255.255.255
!
router bgp 65004
bgp router-id 6.6.6.6
bgp log-neighbor-changes
neighbor 10.1.6.1 remote-as 65002
neighbor 10.1.7.1 remote-as 65003
network 192.168.1.1 mask 255.255.255.255
network 192.168.2.1 mask 255.255.255.255
!
Suggest the preferred path into an AS when multiple entry points exist with MED/METRIC¶
- Download Cisco Modeling Labs Topology
- This scenario shows `AS 65003` has multiple entry points, one through `r6` and one through `r7`
- We want to tell the neighbour AS, `AS 65002`, to use `r7` and so a higher MED/Metric value is advertised from `r6`
- A lower Metric wins and the default is 0, therefore traffic from `AS 65002` should prefer the path through `r7`
Router configurations
- R1
!
hostname r1
!
interface GigabitEthernet1
ip address 10.1.1.1 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.2.1 255.255.255.0
!
interface Loopback0
ip address 172.16.1.1 255.255.255.255
!
router bgp 65001
bgp router-id 1.1.1.1
bgp log-neighbor-changes
neighbor 10.1.1.2 remote-as 65001
neighbor 10.1.2.2 remote-as 65001
network 172.16.1.1 mask 255.255.255.255
!
- R2
!
hostname r2
!
interface GigabitEthernet1
ip address 10.1.1.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.3.1 255.255.255.0
!
interface GigabitEthernet3
ip address 10.1.4.1 255.255.255.0
!
router bgp 65001
bgp router-id 2.2.2.2
bgp log-neighbor-changes
neighbor 10.1.1.1 remote-as 65001
neighbor 10.1.1.1 next-hop-self
neighbor 10.1.3.2 remote-as 65001
neighbor 10.1.3.2 next-hop-self
neighbor 10.1.4.2 remote-as 65002
!
- R3
!
hostname r3
!
interface GigabitEthernet1
ip address 10.1.2.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.3.2 255.255.255.0
!
interface GigabitEthernet3
ip address 10.1.5.1 255.255.255.0
!
router bgp 65001
bgp router-id 3.3.3.3
bgp log-neighbor-changes
neighbor 10.1.2.1 remote-as 65001
neighbor 10.1.2.1 next-hop-self
neighbor 10.1.3.1 remote-as 65001
neighbor 10.1.3.1 next-hop-self
neighbor 10.1.5.2 remote-as 65002
!
- R4
!
hostname r4
!
interface GigabitEthernet1
ip address 10.1.4.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.7.1 255.255.255.0
!
interface GigabitEthernet3
ip address 10.1.6.1 255.255.255.0
!
router bgp 65002
bgp router-id 4.4.4.4
bgp log-neighbor-changes
neighbor 10.1.4.1 remote-as 65001
neighbor 10.1.6.2 remote-as 65002
neighbor 10.1.6.2 next-hop-self
neighbor 10.1.7.2 remote-as 65003
!
- R5
!
hostname r5
!
interface GigabitEthernet1
ip address 10.1.5.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.8.1 255.255.255.0
!
interface GigabitEthernet3
ip address 10.1.6.2 255.255.255.0
!
router bgp 65002
bgp router-id 5.5.5.5
bgp log-neighbor-changes
neighbor 10.1.5.1 remote-as 65001
neighbor 10.1.6.1 remote-as 65002
neighbor 10.1.6.1 next-hop-self
neighbor 10.1.8.2 remote-as 65003
!
- R6
!
hostname r6
!
interface GigabitEthernet1
ip address 10.1.7.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.9.1 255.255.255.0
!
router bgp 65003
bgp router-id 6.6.6.6
bgp log-neighbor-changes
neighbor 10.1.7.1 remote-as 65002
neighbor 10.1.7.1 route-map SET-MED out
neighbor 10.1.9.2 remote-as 65003
!
route-map SET-MED permit 10
set metric 50
- R7
!
hostname r7
!
interface GigabitEthernet1
ip address 10.1.8.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.10.1 255.255.255.0
!
router bgp 65003
bgp router-id 7.7.7.7
bgp log-neighbor-changes
neighbor 10.1.8.1 remote-as 65002
neighbor 10.1.10.2 remote-as 65003
!
- R8
!
hostname r8
!
interface GigabitEthernet1
ip address 10.1.9.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.10.2 255.255.255.0
!
interface Loopback0
ip add 192.168.1.1 255.255.255.255
!
interface Loopback1
ip add 192.168.2.1 255.255.255.255
!
router bgp 65003
bgp router-id 8.8.8.8
bgp log-neighbor-changes
neighbor 10.1.9.1 remote-as 65003
neighbor 10.1.9.1 next-hop-self
neighbor 10.1.10.1 remote-as 65003
neighbor 10.1.10.1 next-hop-self
network 192.168.1.1 mask 255.255.255.255
network 192.168.2.1 mask 255.255.255.255
Verification
- Here is the output of the BGP table on r4 before the metric is advertised. Note that the best path selected is via G2 connected to r6
- After the metric value has been advertised from r6, note that the best path is now via the link connected to r5
- MED is only exchanged between two neighbouring ASes and is not passed on to a further AS. Therefore it does not appear in the BGP table on r2
- Finally this is a good point to note that the screenshots so far have shown the BGP table. The best path is then selected and added to the IP routing table
Influencing distant incoming traffic with the SHORTEST AS_PATH¶
- Download Cisco Modeling Labs Topology
- Since MED works between two neighbouring ASes and is not propagated further, how do we influence which path is used across many ASes?
- The path selection process prefers a shorter AS_PATH, so we can make the non-preferred path seem longer by prepending our ASN multiple times when we advertise a prefix
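As a rough illustration of why prepending works, the sketch below builds the AS_PATH that a distant AS would see over each of the two paths and picks the shorter one. The `advertise` helper is hypothetical, and the comparison assumes that everything evaluated before AS_PATH length (for example weight and local preference) is equal.

```python
def advertise(as_path, local_asn, prepend_count=0):
    """Return the AS_PATH a neighbour receives when local_asn advertises a route."""
    # The advertising AS always adds itself once; prepending adds extra copies
    return [local_asn] * (1 + prepend_count) + list(as_path)


# AS 65004 originates the prefix. Two extra copies towards AS 65002 mirror
# "set as-path prepend 65004 65004" in the PREPEND-AS-PATH route-map.
via_65002 = advertise(advertise([], 65004, prepend_count=2), 65002)
via_65003 = advertise(advertise([], 65004), 65003)

# A receiver in AS 65001 prefers the shorter AS_PATH
best = min((via_65002, via_65003), key=len)
print(via_65002, via_65003, best)
```

The path through AS 65002 now carries four ASNs while the path through AS 65003 carries two, so the latter is preferred.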
Router configurations
- R1
!
hostname r1
!
interface GigabitEthernet1
ip address 10.1.1.1 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.2.1 255.255.255.0
!
interface Loopback0
ip add 172.16.1.1 255.255.255.255
!
router bgp 65001
bgp router-id 1.1.1.1
bgp log-neighbor-changes
neighbor 10.1.1.2 remote-as 65001
neighbor 10.1.2.2 remote-as 65001
network 172.16.1.1 mask 255.255.255.255
!
- R2
!
hostname r2
!
interface GigabitEthernet1
ip address 10.1.1.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.3.1 255.255.255.0
!
interface GigabitEthernet3
ip address 10.1.4.1 255.255.255.0
!
router bgp 65001
bgp router-id 2.2.2.2
bgp log-neighbor-changes
neighbor 10.1.1.1 remote-as 65001
neighbor 10.1.1.1 next-hop-self
neighbor 10.1.3.2 remote-as 65001
neighbor 10.1.3.2 next-hop-self
neighbor 10.1.4.2 remote-as 65002
!
- R3
!
hostname r3
!
interface GigabitEthernet1
ip address 10.1.2.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.3.2 255.255.255.0
!
interface GigabitEthernet3
ip address 10.1.5.1 255.255.255.0
!
router bgp 65001
bgp router-id 3.3.3.3
bgp log-neighbor-changes
neighbor 10.1.2.1 remote-as 65001
neighbor 10.1.2.1 next-hop-self
neighbor 10.1.3.1 remote-as 65001
neighbor 10.1.3.1 next-hop-self
neighbor 10.1.5.2 remote-as 65003
neighbor 10.1.5.2 route-map SET-LOCAL-PREF-HERE in
!
ip prefix-list 192_168_1_1 seq 1 permit 192.168.1.1/32
!
route-map SET-LOCAL-PREF-HERE
match ip address prefix-list 192_168_1_1
set local-preference 200
!
- R4
!
hostname r4
!
interface GigabitEthernet1
ip address 10.1.4.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.6.1 255.255.255.0
!
router bgp 65002
bgp router-id 4.4.4.4
bgp log-neighbor-changes
neighbor 10.1.4.1 remote-as 65001
neighbor 10.1.6.2 remote-as 65004
!
- R5
!
hostname r5
!
interface GigabitEthernet1
ip address 10.1.5.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.7.1 255.255.255.0
!
router bgp 65003
bgp router-id 5.5.5.5
bgp log-neighbor-changes
neighbor 10.1.5.1 remote-as 65001
neighbor 10.1.7.2 remote-as 65004
!
- R6
!
hostname r6
!
interface GigabitEthernet1
ip address 10.1.6.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.7.2 255.255.255.0
!
interface Loopback0
ip add 192.168.1.1 255.255.255.255
!
interface Loopback1
ip add 192.168.2.1 255.255.255.255
!
router bgp 65004
bgp router-id 6.6.6.6
bgp log-neighbor-changes
neighbor 10.1.6.1 remote-as 65002
neighbor 10.1.6.1 route-map PREPEND-AS-PATH out
neighbor 10.1.7.1 remote-as 65003
network 192.168.1.1 mask 255.255.255.255
network 192.168.2.1 mask 255.255.255.255
!
route-map PREPEND-AS-PATH permit 10
set as-path prepend 65004 65004
iBGP vs eBGP - Loop Prevention, Peering Interfaces, and Next Hop¶
Loop Prevention¶
-
eBGP
- Uses the AS path attribute to prevent routing loops
- When a BGP router receives a route advertisement, if its own ASN is present in the AS path, the route is discarded to prevent a loop
- The split-horizon rule (not re-advertising routes between peers of the same type) applies to iBGP, described next. eBGP relies on the AS path check instead
-
iBGP
- Updates received from iBGP peers are not advertised to other iBGP peers (multiple hops away) unless route reflection or confederations are used. Therefore all iBGP routers within an AS must be fully meshed, meaning each iBGP router has a direct BGP session with every other iBGP router
- The following example shows a square iBGP topology with one eBGP connection to AS 65002
- Notice in the two router outputs that while there is a BGP session between ibgp-1 (10.1.2.1) and ibgp-3 (10.1.2.2), since ibgp-3 does not have a direct BGP session with the ibgp/ebgp peer it will not receive the 192.168.1.1 advertisement
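The two loop prevention rules above can be written as a pair of small checks. This is a simplified sketch with made-up function names. It models plain iBGP only, without route reflection or confederations.

```python
def accept_route(local_asn, as_path):
    """eBGP rule: discard a route whose AS_PATH already contains our own ASN."""
    return local_asn not in as_path


def may_readvertise(learned_from, send_to):
    """iBGP rule: a route learned from an iBGP peer is not sent to another iBGP peer.

    learned_from and send_to are either "ibgp" or "ebgp".
    """
    return not (learned_from == "ibgp" and send_to == "ibgp")


# A router in AS 65001 receiving a path that has already passed through AS 65001
print(accept_route(65001, [65002, 65001, 65004]))   # loop, so not accepted
print(may_readvertise("ibgp", "ibgp"))              # why a full mesh is needed
print(may_readvertise("ebgp", "ibgp"))              # eBGP-learned routes are passed on
```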
Loopback vs Physical Peering Interfaces¶
iBGP and eBGP interface peering
- For simplicity, all the scenarios up until now have used physical interfaces for both eBGP and iBGP peering, with a full-mesh between all iBGP peers
- In a real-world deployment you would commonly see eBGP peering on the physical interfaces and iBGP peering on loopback interfaces
-
In many cases eBGP would be used for external or ISP connectivity and you may only have one physical interface connected to the peer
- With iBGP you may have many iBGP peers and many links between those routers so it can be tedious to configure each individual peering. There may also be multiple paths between peers for redundancy
- Loopback interfaces do not go down like physical interfaces. The BGP session will remain up as long as there is at least one path to get to the loopback interface of the peer router
- Since iBGP is within the same administrative domain there would typically be an IGP such as OSPF already running (you may be using BGP in an overlay scenario such as VXLAN MP-BGP EVPN)
- You can use the IGP (e.g. OSPF) to advertise the reachability of each iBGP speaker's loopback
- The IGP may provide multiple routes to reach the loopback interface on the BGP peer
- Here is an example topology with two "spine" switches and three "leaf" switches with an iBGP configuration
- In this scenario the BGP peerings are configured using the loopback0 interface, rather than the multiple physical interfaces connecting each peer
- As you'll see in the router configuration below, OSPF is configured as point-to-point links on each of the physical interfaces and a loopback interface is advertised from each router
- The BGP session (neighbor) is configured using this loopback address
- R5 has another network, 192.168.1.1/32, which is advertised in BGP
Router configurations
- R1
!
hostname r1
!
interface GigabitEthernet1
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet2
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet3
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface Loopback0
ip add 1.1.1.1 255.255.255.255
ip ospf 1 area 0
!
router ospf 1
router-id 1.1.1.1
!
router bgp 65001
bgp router-id 1.1.1.1
bgp log-neighbor-changes
neighbor 2.2.2.2 remote-as 65001
neighbor 2.2.2.2 update-source loopback0
neighbor 3.3.3.3 remote-as 65001
neighbor 3.3.3.3 update-source loopback0
neighbor 4.4.4.4 remote-as 65001
neighbor 4.4.4.4 update-source loopback0
neighbor 5.5.5.5 remote-as 65001
neighbor 5.5.5.5 update-source loopback0
!
- R2
!
hostname r2
!
interface GigabitEthernet1
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet2
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet3
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface Loopback0
ip add 2.2.2.2 255.255.255.255
ip ospf 1 area 0
!
router ospf 1
router-id 2.2.2.2
!
router bgp 65001
bgp router-id 2.2.2.2
bgp log-neighbor-changes
neighbor 1.1.1.1 remote-as 65001
neighbor 1.1.1.1 update-source loopback0
neighbor 3.3.3.3 remote-as 65001
neighbor 3.3.3.3 update-source loopback0
neighbor 4.4.4.4 remote-as 65001
neighbor 4.4.4.4 update-source loopback0
neighbor 5.5.5.5 remote-as 65001
neighbor 5.5.5.5 update-source loopback0
!
- R3
!
hostname r3
!
interface GigabitEthernet1
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet2
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface Loopback0
ip add 3.3.3.3 255.255.255.255
ip ospf 1 area 0
!
router ospf 1
router-id 3.3.3.3
!
router bgp 65001
bgp router-id 3.3.3.3
bgp log-neighbor-changes
neighbor 1.1.1.1 remote-as 65001
neighbor 1.1.1.1 update-source loopback0
neighbor 2.2.2.2 remote-as 65001
neighbor 2.2.2.2 update-source loopback0
neighbor 4.4.4.4 remote-as 65001
neighbor 4.4.4.4 update-source loopback0
neighbor 5.5.5.5 remote-as 65001
neighbor 5.5.5.5 update-source loopback0
!
- R4
!
hostname r4
!
interface GigabitEthernet1
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet2
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface Loopback0
ip add 4.4.4.4 255.255.255.255
ip ospf 1 area 0
!
router ospf 1
router-id 4.4.4.4
!
router bgp 65001
bgp router-id 4.4.4.4
bgp log-neighbor-changes
neighbor 1.1.1.1 remote-as 65001
neighbor 1.1.1.1 update-source loopback0
neighbor 2.2.2.2 remote-as 65001
neighbor 2.2.2.2 update-source loopback0
neighbor 3.3.3.3 remote-as 65001
neighbor 3.3.3.3 update-source loopback0
neighbor 5.5.5.5 remote-as 65001
neighbor 5.5.5.5 update-source loopback0
!
- R5
!
hostname r5
!
interface GigabitEthernet1
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet2
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface Loopback0
ip add 5.5.5.5 255.255.255.255
ip ospf 1 area 0
!
interface Loopback192
ip add 192.168.1.1 255.255.255.255
!
router ospf 1
router-id 5.5.5.5
!
router bgp 65001
bgp router-id 5.5.5.5
bgp log-neighbor-changes
neighbor 1.1.1.1 remote-as 65001
neighbor 1.1.1.1 update-source loopback0
neighbor 2.2.2.2 remote-as 65001
neighbor 2.2.2.2 update-source loopback0
neighbor 3.3.3.3 remote-as 65001
neighbor 3.3.3.3 update-source loopback0
neighbor 4.4.4.4 remote-as 65001
neighbor 4.4.4.4 update-source loopback0
network 192.168.1.1 mask 255.255.255.255
!
Verification
- Here you can see the BGP peering between each speaker
- The 192.168.1.1/32 network is advertised through BGP and reachable via 5.5.5.5
- You can see in r5's routing table that the loopback0 interfaces are advertised correctly through OSPF
- 192.168.1.1/32 is reachable via 5.5.5.5 which can be reached via the spines (1.1.1.1 and 2.2.2.2)
- If 1.1.1.1 were to fail the BGP session would remain active
IP Unnumbered¶
- The ip unnumbered loopback 0 command was used on all the physical interfaces in the previous example
- Normally, every interface connecting to a segment must belong to a unique subnet
- IP unnumbered allows us to provide connectivity between the routers without assigning each interface a unique IP address
- The main benefits being simplification of the configuration and conserving IP address space
- Works by borrowing the IP address of another interface (typically a loopback)
-
When IP unnumbered is configured, routes learned through the IP unnumbered interface have the interface as the next hop instead of the source address of the routing update.
- Used on point-to-point links (not ones connected to end hosts)
- The main disadvantage is that the interface is unavailable for testing, troubleshooting, and management
-
It also won't work if the unnumbered interface is pointing to an interface that is not functional
- This shouldn't be a problem with a loopback interface unless you happen to delete the loopback interface itself
-
As an example I've configured the connection between r1 and r3 to use the 10.1.1.0/24 subnet while the connection between r2 and r3 uses IP unnumbered.
BGP update-source¶
- By default BGP uses the IP address configured on the physical interface directly connected to the BGP peer as the source address when it establishes the BGP peering session
- When configuring iBGP sessions with Loopback interfaces you can use the neighbor x.x.x.x update-source command to use the Loopback as the source address to establish the session
- Only one session between the loopback interfaces needs to be configured even if multiple paths between the BGP peers exist
Next-hop¶
eBGP¶
- The next-hop attribute is typically updated to the IP address of the eBGP peer that is advertising the route which ensures that it's reachable from within the receiving AS
iBGP¶
- The next-hop attribute does not change by default so it remains the same as what's advertised by the eBGP peer or the original source
- This may cause an issue if the next-hop is not reachable from an iBGP peer as shown below
- In this example r2 is learning about 192.168.1.1 from an eBGP peer and advertising it to r1 and r3
- The next-hop is not changed from 10.1.4.2 which is the IP address of the G1 interface on the router in AS 65002
- There are a couple of options to overcome this issue
- You could advertise the point-to-point eBGP links into your IGP so that your iBGP peers know how to reach the next-hop
- Alternatively you could use the next-hop-self command on the edge routers to force the next-hop to be updated when advertised to other iBGP peers
- People have different opinions on each method
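The default next-hop behaviour and the effect of next-hop-self can be summarised with a small function. This is a sketch only. The function name and parameters are invented for illustration, and it ignores special cases such as shared segments and route reflection.

```python
def advertised_next_hop(received_next_hop, session, next_hop_self, local_address):
    """Return the next-hop a router places in an UPDATE it sends to a peer.

    session is "ebgp" or "ibgp" and describes the session to the peer being updated.
    """
    # Towards an eBGP peer the router sets itself as the next-hop. Towards an
    # iBGP peer it only does so when next-hop-self is configured for that peer.
    if session == "ebgp" or next_hop_self:
        return local_address
    return received_next_hop


# r2 learned 192.168.1.1 from its eBGP peer with next-hop 10.1.4.2 and now
# advertises it to r1 over the 10.1.1.0/24 link (r2 is 10.1.1.2).
unchanged = advertised_next_hop("10.1.4.2", "ibgp", False, "10.1.1.2")
rewritten = advertised_next_hop("10.1.4.2", "ibgp", True, "10.1.1.2")
print(unchanged, rewritten)
```

In the first case r1 needs a route to 10.1.4.2 from somewhere else (for example the IGP). In the second case r1 only needs to reach r2.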
Router configurations
-
Have a look above at the earlier examples which had the full next-hop-self configuration for all routers
-
R2
!
hostname r2
!
interface GigabitEthernet1
ip address 10.1.1.2 255.255.255.0
!
interface GigabitEthernet2
ip address 10.1.3.1 255.255.255.0
!
interface GigabitEthernet3
ip address 10.1.4.1 255.255.255.0
!
router bgp 65001
bgp router-id 2.2.2.2
bgp log-neighbor-changes
neighbor 10.1.1.1 remote-as 65001
neighbor 10.1.1.1 next-hop-self
neighbor 10.1.3.2 remote-as 65001
neighbor 10.1.3.2 next-hop-self
neighbor 10.1.4.2 remote-as 65002
Scaling BGP with Route Reflectors¶
- As seen in the loop prevention section, updates received from iBGP peers are not advertised to other iBGP peers (multiple hops away) unless route reflection or confederations are used
- Therefore all iBGP routers within an AS must be fully meshed, meaning each iBGP router has a direct BGP session with every other iBGP router
- Session count is n*(n-1)/2
- e.g. for 10 iBGP routers = 10 * 9 / 2 = 45 sessions
- This does not scale when there is a large number of iBGP speakers in an AS
Note
The session count refers to the number of logical BGP sessions between peers, not the physical links (which may not be fully meshed). In this example I've used the physical connections to help visualize the scale challenge
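To get a feel for how quickly the full mesh grows, here is a small calculation of the session count, alongside a comparison with a route reflector design. The second function rests on an assumption that is not from any standard: every client peers with every route reflector, and the route reflectors are fully meshed with each other.

```python
def full_mesh_sessions(n):
    """Number of iBGP sessions needed to fully mesh n routers: n*(n-1)/2."""
    return n * (n - 1) // 2


def route_reflector_sessions(clients, reflectors):
    """Sessions needed when each client peers with each RR (assumed design)."""
    # client-to-RR sessions plus a full mesh between the RRs themselves
    return clients * reflectors + full_mesh_sessions(reflectors)


for routers in (5, 10, 50, 100):
    print(routers, full_mesh_sessions(routers))

# 10 routers in total, two of them acting as route reflectors
print(route_reflector_sessions(clients=8, reflectors=2))
```

For 10 routers the full mesh needs 45 sessions, while 8 clients with 2 route reflectors need 17 under the assumption above.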
-
BGP Route Reflection helps to overcome this full-mesh requirement by designating one or more iBGP speakers as route reflectors (RR)
-
The following topology will be used to demonstrate some examples
-
iBGP clients peer with one or more route reflectors
-
The RRs advertise iBGP learned routes to their iBGP clients peers (which don't have to be fully meshed), thereby simplifying the topology and configuration
-
Route reflectors advertise routes to the following types of peers
- Clients
- iBGP routers that peer with the route reflector and are configured on it as route reflector clients
- Non-Client Peers
- Other route reflectors or iBGP routers that are not clients of the route reflector but are part of the same AS
- Non-client peers operate as normal iBGP peers and need to be fully meshed
- eBGP peers
- eBGP sessions with routers in other ASes
- Route reflection is defined in RFC 4456
BGP Route Reflector Advertisement Rules¶
-
RR advertisement rules
- Routes learned from a client are advertised to all other clients and non-client peers (including eBGP peers)
- Routes learned from a non-client peer are advertised to clients and eBGP peers, but not to other non-client peers
- Routes learned from an eBGP peer are advertised to all clients and non-client peers
-
Route reflectors treat other RRs like any other iBGP speakers
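The three advertisement rules can be expressed as a lookup from "where the route was learned" to "which peer types receive it". The function below is an illustrative sketch with invented names. It describes peer types only and does not model individual sessions.

```python
def advertise_to(learned_from):
    """Return the peer types a route reflector advertises a route to.

    learned_from is one of "client", "non-client" or "ebgp".
    """
    if learned_from == "non-client":
        # Normal iBGP behaviour towards other non-clients: do not pass it on
        return {"client", "ebgp"}
    if learned_from in ("client", "ebgp"):
        return {"client", "non-client", "ebgp"}
    raise ValueError("unknown peer type: " + learned_from)


for source in ("client", "non-client", "ebgp"):
    print(source, sorted(advertise_to(source)))
```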
BGP Route Reflector Loop Prevention¶
-
Each route reflector is assigned a cluster-id, which is used to identify the route reflector cluster
- If the cluster-id is not configured it will use the BGP router-id of the RR
-
Route reflector loop prevention uses the CLUSTER_LIST and ORIGINATOR_ID attributes
- Cluster List
- eBGP learned routes
- The CLUSTER_LIST attribute is specifically used to prevent loops within iBGP route reflection
- The CLUSTER_LIST attribute is only added when reflecting routes learned from iBGP peers to other iBGP peers or clients
- Routes learned from eBGP peers are considered external and do not carry the CLUSTER_LIST attribute when they are reflected to iBGP peers
- iBGP learned routes
- Each RR adds its cluster-id to the CLUSTER_LIST attribute of the route
- If an RR receives a route with its own cluster-id in the CLUSTER_LIST, it discards the route to prevent loops
- Originator ID
- The ORIGINATOR_ID attribute identifies the router that originated the route within the local AS. The RR sets it when it reflects the route
- If a router receives a route with its own router-id as the ORIGINATOR_ID, it discards the route to prevent loops
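A minimal model of the CLUSTER_LIST and ORIGINATOR_ID checks is shown below. The `Route` class and the `reflect` and `accept` functions are hypothetical names for this sketch, and the router-ids are taken from the topology used in this section.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Route:
    prefix: str
    originator_id: Optional[str] = None
    cluster_list: List[str] = field(default_factory=list)


def reflect(route, learned_from_router_id, cluster_id):
    """Return the route as a route reflector would reflect it."""
    return Route(
        prefix=route.prefix,
        # ORIGINATOR_ID is only set if it is not already present
        originator_id=route.originator_id or learned_from_router_id,
        # The RR adds its own cluster-id to the front of the CLUSTER_LIST
        cluster_list=[cluster_id] + route.cluster_list,
    )


def accept(route, router_id, cluster_id=None):
    """Apply both loop checks. cluster_id is only passed for route reflectors."""
    if route.originator_id == router_id:
        return False
    if cluster_id is not None and cluster_id in route.cluster_list:
        return False
    return True


from_client3 = Route("192.168.1.1/32")

# RR-2 uses its router-id (2.2.2.2) as the cluster-id
reflected = reflect(from_client3, "5.5.5.5", "2.2.2.2")
print(accept(reflected, "5.5.5.5"))                        # Client-3 sees its own route
print(accept(reflected, "1.1.1.1", cluster_id="1.1.1.1"))  # RR-1, different cluster-id

# Both RRs configured with the same cluster-id 1.1.1.1
shared = reflect(from_client3, "5.5.5.5", "1.1.1.1")
print(accept(shared, "1.1.1.1", cluster_id="1.1.1.1"))     # RR-1 finds its own cluster-id
```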
Router configurations
- RR-1
!
hostname rr-1
!
interface GigabitEthernet1
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet2
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet6
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface Loopback0
ip add 1.1.1.1 255.255.255.255
ip ospf 1 area 0
!
router ospf 1
router-id 1.1.1.1
!
router bgp 65001
bgp router-id 1.1.1.1
bgp log-neighbor-changes
neighbor 2.2.2.2 remote-as 65001
neighbor 2.2.2.2 update-source Loopback0
neighbor 3.3.3.3 remote-as 65001
neighbor 3.3.3.3 update-source Loopback0
neighbor 3.3.3.3 route-reflector-client
neighbor 4.4.4.4 remote-as 65001
neighbor 4.4.4.4 update-source Loopback0
neighbor 4.4.4.4 route-reflector-client
neighbor 5.5.5.5 remote-as 65001
neighbor 5.5.5.5 update-source Loopback0
neighbor 5.5.5.5 route-reflector-client
!
- RR-2
!
hostname RR-2
!
interface GigabitEthernet2
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet3
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet4
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet6
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet7
ip add 10.1.1.1 255.255.255.0
!
interface Loopback0
ip add 2.2.2.2 255.255.255.255
ip ospf 1 area 0
!
router ospf 1
router-id 2.2.2.2
!
router bgp 65001
bgp router-id 2.2.2.2
bgp log-neighbor-changes
neighbor 1.1.1.1 remote-as 65001
neighbor 1.1.1.1 update-source Loopback0
neighbor 1.1.1.1 next-hop-self
neighbor 3.3.3.3 remote-as 65001
neighbor 3.3.3.3 update-source Loopback0
neighbor 3.3.3.3 next-hop-self
neighbor 3.3.3.3 route-reflector-client
neighbor 4.4.4.4 remote-as 65001
neighbor 4.4.4.4 update-source Loopback0
neighbor 4.4.4.4 next-hop-self
neighbor 4.4.4.4 route-reflector-client
neighbor 5.5.5.5 remote-as 65001
neighbor 5.5.5.5 update-source Loopback0
neighbor 5.5.5.5 next-hop-self
neighbor 5.5.5.5 route-reflector-client
neighbor 7.7.7.7 remote-as 65001
neighbor 7.7.7.7 update-source Loopback0
neighbor 7.7.7.7 next-hop-self
neighbor 10.1.1.2 remote-as 65002
!
- Client-1
!
hostname client-1
!
interface GigabitEthernet1
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface Loopback0
ip add 3.3.3.3 255.255.255.255
ip ospf 1 area 0
!
router ospf 1
router-id 3.3.3.3
!
router bgp 65001
bgp router-id 3.3.3.3
bgp log-neighbor-changes
neighbor 1.1.1.1 remote-as 65001
neighbor 1.1.1.1 update-source loopback0
neighbor 2.2.2.2 remote-as 65001
neighbor 2.2.2.2 update-source loopback0
- Client-2
!
hostname client-2
!
interface GigabitEthernet1
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface GigabitEthernet2
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface Loopback0
ip add 4.4.4.4 255.255.255.255
ip ospf 1 area 0
!
router ospf 1
router-id 4.4.4.4
!
router bgp 65001
bgp router-id 4.4.4.4
bgp log-neighbor-changes
neighbor 1.1.1.1 remote-as 65001
neighbor 1.1.1.1 update-source loopback0
neighbor 2.2.2.2 remote-as 65001
neighbor 2.2.2.2 update-source loopback0
- Client-3
!
hostname client-3
!
interface GigabitEthernet2
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface Loopback0
ip add 5.5.5.5 255.255.255.255
ip ospf 1 area 0
!
interface Loopback1
ip add 192.168.1.1 255.255.255.255
!
router ospf 1
router-id 5.5.5.5
!
router bgp 65001
bgp router-id 5.5.5.5
bgp log-neighbor-changes
neighbor 1.1.1.1 remote-as 65001
neighbor 1.1.1.1 update-source loopback0
neighbor 2.2.2.2 remote-as 65001
neighbor 2.2.2.2 update-source loopback0
network 192.168.1.1 mask 255.255.255.255
- Non-client-1
!
hostname non-client-1
!
interface GigabitEthernet2
ip unnumbered loopback 0
ip ospf network point-to-point
ip ospf 1 area 0
!
interface Loopback0
ip add 7.7.7.7 255.255.255.255
ip ospf 1 area 0
!
interface Loopback1
ip add 192.168.2.1 255.255.255.255
!
router ospf 1
router-id 7.7.7.7
!
router bgp 65001
bgp router-id 7.7.7.7
bgp log-neighbor-changes
neighbor 2.2.2.2 remote-as 65001
neighbor 2.2.2.2 update-source loopback0
network 192.168.2.1 mask 255.255.255.255
- External-Router
!
hostname external-router
!
interface GigabitEthernet1
ip add 10.1.1.2 255.255.255.0
!
interface Loopback0
ip add 8.8.8.8 255.255.255.255
!
interface Loopback1
ip add 192.168.3.1 255.255.255.255
!
router bgp 65002
bgp router-id 8.8.8.8
bgp log-neighbor-changes
neighbor 10.1.1.1 remote-as 65001
network 192.168.3.1 mask 255.255.255.255
Verification
-
It's important to understand the BGP sessions that have been configured. They will be shown in three diagrams
-
Firstly the peering from RR-1 to Client-1 and Client-2
- RR-2 peers to RR-1 in a non-client role
- RR-2 has a session with Client-2 and Client-3
- Just like above, RR-2 also peers with RR-1 in a non-client role
- Finally, although they are not physically connected, Client-1 and Client-3 still have BGP sessions to RR-1 and RR-2
- When you capture a packet on the link between RR-1 and RR-2 you can see the source/destination as RR-2 (2.2.2.2) to Client-1 (3.3.3.3)
- These three screenshots show the update-groups i.e. the peers of this router to which BGP updates will be sent
- Notice there are multiple groups on the route reflectors
- One containing the route reflector clients (3.3.3.3, 4.4.4.4, 5.5.5.5)
- One with the iBGP non-client peer (7.7.7.7)
- Each RR also has the other as a non-client peer
- One containing the eBGP peer on RR-2
- The clients and non-clients just see the other routers as iBGP peers
- These two screenshots show which advertisements will be sent from the route reflectors and to which peers
- 192.168.3.1 received from the eBGP neighbour
- Sent by RR-2 to RR-1 (non-client peer) who only sends it to clients (Routes learned from a non-client peer are advertised only to clients)
- Sent by RR-2 to clients and other non-clients
- 192.168.2.1 received from a non-client peer
- Sent by non-client-1 to RR-2 who sends it to clients and eBGP peers
- Note that it doesn't exist in the routing table for RR-1
- Routes learned from a non-client peer are advertised only to clients and since RR-1 is a non-client peer of RR-2 it will not receive it
- 192.168.1.1 received from client peer and advertised by both RR-1 and RR-2 to clients, non-clients, and eBGP peers
- It will also be advertised back to Client-3 by both RRs but will be dropped due to the ORIGINATOR_ID loop prevention mechanism
- From the client perspective notice that Client-3 advertises to the update group containing the route reflectors since 192.168.1.1 is local
- Client-1 does not have any local routes to advertise
- Using the configuration above, routes are advertised to both RR-1 and RR-2, and each RR reflects them to its clients
- This includes Client-3, from which the advertisement came. The UPDATE will be denied due to the ORIGINATOR_ID loop prevention mechanism
- If you look at the routing table of RR-1 you should see two possible paths
- One received from RR-2 (2.2.2.2)
- One received directly from the RR Client-3 (5.5.5.5). This is still physically via RR-2
Do we use the same cluster-id on each RR?¶
- Each route reflector is assigned a cluster-id, which is used to identify the route reflector cluster
- If the cluster-id is not configured it will use the BGP router-id of the RR
-
The example configuration above was not using a cluster-id and therefore it used the two different router-ids
-
Clients peer with one or more route reflectors
-
If a route reflector receives a route with its own cluster-id in the CLUSTER_LIST attribute, it will drop the route to avoid reflecting it back into the same cluster
- See the verification debugging output below which shows the DENIED updates due to using the same cluster-id
-
Using a different cluster-id provides more redundancy even if the two RRs have the same set of route reflector clients
- When using multiple RRs with different cluster-ids, clients can receive multiple paths to the same destination
- If one RR fails, the clients can still receive routing information from the other RR
-
If two RRs are not serving the same set of clients you must use a different cluster-id on them
- You also need a regular non-client peer iBGP session between the two RRs
Router configuration
-
RR-1 - Note the bgp cluster-id 1.1.1.1 command
!
router bgp 65001
bgp router-id 1.1.1.1
bgp cluster-id 1.1.1.1
bgp log-neighbor-changes
neighbor 2.2.2.2 remote-as 65001
neighbor 2.2.2.2 update-source Loopback0
neighbor 3.3.3.3 remote-as 65001
neighbor 3.3.3.3 update-source Loopback0
neighbor 3.3.3.3 route-reflector-client
neighbor 4.4.4.4 remote-as 65001
neighbor 4.4.4.4 update-source Loopback0
neighbor 4.4.4.4 route-reflector-client
neighbor 5.5.5.5 remote-as 65001
neighbor 5.5.5.5 update-source Loopback0
neighbor 5.5.5.5 route-reflector-client
!
-
RR-2 - Note the bgp cluster-id 1.1.1.1 command
router bgp 65001
bgp router-id 2.2.2.2
bgp cluster-id 1.1.1.1
bgp log-neighbor-changes
neighbor 1.1.1.1 remote-as 65001
neighbor 1.1.1.1 update-source Loopback0
neighbor 1.1.1.1 next-hop-self
neighbor 3.3.3.3 remote-as 65001
neighbor 3.3.3.3 update-source Loopback0
neighbor 3.3.3.3 next-hop-self
neighbor 3.3.3.3 route-reflector-client
neighbor 4.4.4.4 remote-as 65001
neighbor 4.4.4.4 update-source Loopback0
neighbor 4.4.4.4 next-hop-self
neighbor 4.4.4.4 route-reflector-client
neighbor 5.5.5.5 remote-as 65001
neighbor 5.5.5.5 update-source Loopback0
neighbor 5.5.5.5 next-hop-self
neighbor 5.5.5.5 route-reflector-client
neighbor 7.7.7.7 remote-as 65001
neighbor 7.7.7.7 update-source Loopback0
neighbor 7.7.7.7 next-hop-self
neighbor 10.1.1.2 remote-as 65002
!
Verification
- When the cluster-id is the same, as is the case in the configuration above, the CLUSTER_LIST loop prevention mechanism on the route reflectors will deny the update
- Note the DENIED message in the RR-1 logging output which refers to the UPDATE received from RR-2 and is due to "reflected from the same cluster"
- You can also confirm by looking at the routing table for RR-1
- Notice there is only one entry for 192.168.1.1 which is via the RR Client-3
BGP Route Server¶
- There is also the concept of a BGP route server which is not covered in this post
- They are used in Internet Exchange Points (IXPs)
- It is like a route reflector but for eBGP
- With BGP route servers you can avoid the need to configure tens or hundreds of individual eBGP sessions with each member in an IXP
Extensibility and address families with Multi-protocol BGP¶
- All examples up until this point have been using BGP to advertise IPv4 prefixes (similar to what you might do with other routing protocols)
- Key strength of BGP is its ability to advertise other types of network layer reachability information
- The different types of information are grouped into address families
- These include IPv4 (default), IPv6, and L3VPN/L2VPN, among others
- Achieved through a BGP extension called Multi-protocol BGP (MP-BGP) defined in RFC 4760
- MP-BGP carries this information in optional non-transitive path attributes
- Each attribute includes an Address Family Identifier (AFI) and Subsequent Address Family Identifier (SAFI)
- Address Family Identifier (AFI)
- Identifies the network layer protocol
- e.g. AFI 1 is used for IPv4, AFI 2 is used for IPv6
- Subsequent Address Family Identifier (SAFI)
- Provides additional info about the type of network layer reachability information
- e.g. SAFI 1 is used for unicast, SAFI 2 is used for multicast
- e.g. To transport EVPN information the AFI is 25 (L2VPN) and SAFI is 70 (EVPN)
MP-BGP CAPABILITIES Advertisement¶
- Optional parameter
- Sent in the BGP OPEN message
- Lists the AFI/SAFI capabilities supported by the speaker
- To exchange routes for a particular AFI/SAFI, each speaker MUST advertise to the other the capability to support that AFI/SAFI
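As a rough illustration of the points above, the sketch below encodes the multiprotocol capability as laid out in RFC 4760 (capability code 1, length 4, then AFI, a reserved byte, and SAFI) and treats the usable address families as the intersection of what both speakers advertised. The function names and the R1/R2 capability sets are made up for the example.

```python
import struct

CAP_MULTIPROTOCOL = 1  # capability code for Multiprotocol Extensions (RFC 4760)


def encode_mp_capability(afi: int, safi: int) -> bytes:
    """Capability code (1 byte), length (1 byte, always 4), AFI (2), reserved (1), SAFI (1)."""
    return struct.pack("!BBHBB", CAP_MULTIPROTOCOL, 4, afi, 0, safi)


def negotiated(local: set, remote: set) -> set:
    """Only AFI/SAFI pairs advertised by both speakers can be used on the session."""
    return local & remote


r1 = {(1, 1), (2, 1), (25, 70)}  # IPv4 unicast, IPv6 unicast, L2VPN EVPN
r2 = {(1, 1), (25, 70)}          # IPv4 unicast, L2VPN EVPN

print(encode_mp_capability(25, 70).hex())  # 010400190046
print(sorted(negotiated(r1, r2)))          # [(1, 1), (25, 70)]
```

In a real OPEN message each capability is wrapped in a Capabilities optional parameter; that outer wrapper is left out to keep the example short.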
MP-BGP Updates¶
- Two new path attributes in MP-BGP
- MP_REACH_NLRI
- Used to advertise reachable destinations for a specific address family
- MP_UNREACH_NLRI
- Used to withdraw previously advertised routes for a specific address family
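To make the two attributes more concrete, here is a minimal sketch that builds them for IPv6 unicast, following the layout in RFC 4760 (type code 14 for MP_REACH_NLRI, 15 for MP_UNREACH_NLRI). The helper names, the prefix, and the next hop are illustrative assumptions, and only short attributes (1-byte length) are handled.

```python
import ipaddress
import struct

ATTR_FLAG_OPTIONAL = 0x80  # optional bit set, transitive bit (0x40) left clear
MP_REACH_NLRI = 14
MP_UNREACH_NLRI = 15
AFI_IPV6, SAFI_UNICAST = 2, 1


def encode_prefix(prefix: str) -> bytes:
    """NLRI entry: prefix length in bits, then only as many bytes as that length needs."""
    net = ipaddress.ip_network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def attribute(type_code: int, value: bytes) -> bytes:
    """Path attribute header (flags, type, 1-byte length) followed by the value."""
    if len(value) > 255:
        raise ValueError("extended-length attributes are not handled in this sketch")
    return struct.pack("!BBB", ATTR_FLAG_OPTIONAL, type_code, len(value)) + value


def mp_reach(prefixes, next_hop: str) -> bytes:
    """AFI, SAFI, next-hop length, next hop, one reserved byte, then the NLRI."""
    nh = ipaddress.ip_address(next_hop).packed
    body = struct.pack("!HBB", AFI_IPV6, SAFI_UNICAST, len(nh)) + nh + b"\x00"
    return attribute(MP_REACH_NLRI, body + b"".join(encode_prefix(p) for p in prefixes))


def mp_unreach(prefixes) -> bytes:
    """AFI, SAFI, then the withdrawn routes; no next hop is needed for a withdrawal."""
    body = struct.pack("!HB", AFI_IPV6, SAFI_UNICAST)
    return attribute(MP_UNREACH_NLRI, body + b"".join(encode_prefix(p) for p in prefixes))


print(mp_reach(["2001:db8:1::/48"], "2001:db8::1").hex())
print(mp_unreach(["2001:db8:1::/48"]).hex())  # 800f0a0002013020010db80001
```

The same two attributes carry every other address family; only the AFI/SAFI values and the NLRI encoding change.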
BGP VPNv4¶
- Don't confuse this with encrypted VPN technologies such as IPsec or TLS
- BGP VPNv4 is an address family used to support MPLS-based Layer 3 VPNs
- Allows the exchange of routing information for VPNs that use IPv4 addresses
- VPNv4 address family includes information such as Route Distinguishers (RDs) and Route Targets (RTs)
- Route Distinguisher (RD)
- 64-bit unique identifier
- Prepended to the IPv4 route
- Distinguishes the VPN to which the route belongs
- Route Target (RT)
- 64-bit extended BGP community attribute
- Used to control the import and export of routes between different VPNs
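A small sketch of how the RD and RT fit together, assuming the common "2-byte ASN : assigned number" formats (Type 0 RD, and a route target extended community with type 0x00, sub-type 0x02). The ASN, the assigned numbers, and the function names are made up for illustration, and the real VPNv4 NLRI also carries an MPLS label and a prefix length, which are omitted here.

```python
import ipaddress
import struct


def encode_rd(asn: int, number: int) -> bytes:
    """Type 0 RD: 2-byte type (0), 2-byte ASN, 4-byte assigned number = 8 bytes."""
    return struct.pack("!HHI", 0, asn, number)


def encode_rt(asn: int, number: int) -> bytes:
    """Route target as an extended community: type 0x00, sub-type 0x02, ASN, number."""
    return struct.pack("!BBHI", 0x00, 0x02, asn, number)


def vpnv4_address(rd: bytes, ipv4: str) -> bytes:
    """RD prepended to the IPv4 address: 8 + 4 = 12 bytes (96 bits)."""
    return rd + ipaddress.ip_address(ipv4).packed


def should_import(route_rts: set, vrf_import_rts: set) -> bool:
    """A VRF imports a route if the route carries at least one RT the VRF imports."""
    return bool(route_rts & vrf_import_rts)


# Two customers using the same private prefix, kept apart by different RDs
cust_a = vpnv4_address(encode_rd(65001, 100), "10.0.0.0")
cust_b = vpnv4_address(encode_rd(65001, 200), "10.0.0.0")
print(cust_a.hex())      # 0000fde9000000640a000000
print(cust_a != cust_b)  # True

print(should_import({encode_rt(65001, 100)}, {encode_rt(65001, 100), encode_rt(65001, 999)}))  # True
```

The RD only makes the route unique; which VRFs actually receive it is decided by the RTs.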
Ethernet VPN (EVPN)¶
- Similar to BGP VPNv4, Ethernet VPN does not mean an encrypted tunnel like you have with IPsec or TLS
- The EVPN NLRI is carried in a new address family called Layer-2 VPN (L2VPN) EVPN
- Uses route distinguishers (RDs) to maintain uniqueness among identical routes in different VRF instances
- Uses route targets (RTs) to define policies which determine how routes are advertised and shared by different VRF instances
- Commonly seen with VXLAN fabrics where VXLAN provides the data plane and EVPN provides the control plane (e.g. where are MAC/IPs located)
- RFCs
- RFC 7209: "Requirements for Ethernet VPN (EVPN)"
- Outlines the requirements for the development and deployment of EVPN
- RFC 7432: "BGP MPLS-Based Ethernet VPN"
- Defines the EVPN architecture and the basic route types (1-4)
- RFC 8365: "A Network Virtualization Overlay Solution Using Ethernet VPN (EVPN)"
- Extends EVPN for network virtualization overlays such as VXLAN
- RFC 9136: "IP Prefix Advertisement in Ethernet VPN (EVPN)"
- Defines the IP Prefix Route (Route Type 5) for advertising IP prefixes in EVPN
- EVPN defines a new NLRI used to hold different route types
- Route Type 1: Ethernet Auto-Discovery (A-D) Route
- Advertised per Ethernet segment and per EVPN instance; used with multihomed segments for fast convergence (mass withdrawal), aliasing, and split horizon
- Route Type 2: MAC/IP Advertisement Route
- Used to advertise MAC addresses and optionally the IP addresses associated with those MACs
- Often seen in VXLAN where MAC/IP addresses need to be learned and distributed within the same VNI
- Route Type 3: Inclusive Multicast Ethernet Tag Route
- Used to signal how broadcast, unknown unicast, and multicast (BUM) traffic should be delivered to each PE for a given Ethernet tag, e.g. to build ingress replication lists
- Route Type 4: Ethernet Segment Route
- Used by PEs attached to the same Ethernet segment to discover each other and elect a designated forwarder
- Route Type 5: IP Prefix Route
- Used to advertise IP prefixes for integrated Layer 3 VPN services
- Often seen in VXLAN where it's used to advertise Layer 3 VNI reachability information
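The route types above share a common NLRI header (1-byte route type, 1-byte length) followed by type-specific fields. The sketch below parses that header and, for Route Type 2 only, the fields laid out in RFC 7432: RD, Ethernet Segment Identifier, Ethernet tag, MAC, optional IP, and the first label. The sample values (RD, MAC, IP, VNI 10010) and the function name are invented for illustration.

```python
import ipaddress
import struct

EVPN_ROUTE_TYPES = {
    1: "Ethernet Auto-Discovery",
    2: "MAC/IP Advertisement",
    3: "Inclusive Multicast Ethernet Tag",
    4: "Ethernet Segment",
    5: "IP Prefix",
}


def parse_evpn_nlri(data: bytes) -> dict:
    """Parse the common EVPN NLRI header and, for Route Type 2, the MAC/IP fields."""
    route_type, length = data[0], data[1]
    body = data[2:2 + length]
    result = {"type": route_type, "name": EVPN_ROUTE_TYPES.get(route_type, "unknown")}
    if route_type != 2:
        return result
    # Fixed part: RD (8), ESI (10), Ethernet tag (4), MAC length in bits (1), MAC (6)
    result["rd"] = body[0:8].hex()
    result["esi"] = body[8:18].hex()
    result["ethernet_tag"] = struct.unpack("!I", body[18:22])[0]
    result["mac"] = ":".join(f"{b:02x}" for b in body[23:29])
    ip_bits = body[29]  # 0 (no IP), 32 (IPv4) or 128 (IPv6)
    ip_end = 30 + ip_bits // 8
    result["ip"] = str(ipaddress.ip_address(body[30:ip_end])) if ip_bits else None
    # With VXLAN (RFC 8365) this 3-byte label field carries the VNI
    result["label1"] = int.from_bytes(body[ip_end:ip_end + 3], "big")
    return result


# Hand-built sample Type 2 route
body = (
    bytes.fromhex("0000fde900000064")          # RD 65001:100
    + bytes(10)                                # ESI 0 (single-homed)
    + struct.pack("!I", 0)                     # Ethernet tag
    + bytes([48]) + bytes.fromhex("001122334455")
    + bytes([32]) + ipaddress.ip_address("10.0.10.5").packed
    + (10010).to_bytes(3, "big")
)
sample = bytes([2, len(body)]) + body
print(parse_evpn_nlri(sample))
```

Decoding the other route types would follow the same pattern with their own field layouts.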
BGP Communities¶
- RFC 1997 - BGP Communities Attribute
- Optional Transitive path attribute
- Standard community tags are 32-bit (the extended communities used in VPNv4 and EVPN above are 64-bit)
- The standard format is `ASN:TAG`, e.g. `65001:200`
- The ASN used depends on the scenario
- Private: When you want to tag routes within your own network, e.g. `65001:100`
- Public: When you are coordinating with an external AS or tagging routes to be recognized by an external network, e.g. Cogent uses the `174:21000` tag for any route learned from an NA (North America) non-customer
- Used to group routes that should be treated the same (e.g. should have the same local preference, should be filtered)
- Tags them with specific information that can be used to make decisions (routing, filtering)
- Some use cases:
- Traffic Engineering
- Control how traffic is distributed across multiple paths
- Routing Policies
- Apply specific routing policies based on community tags
- Route Filtering
- Filter routes based on community values
- There are several well-known standard communities:
- INTERNET: Advertise these routes to all BGP neighbours (iBGP and eBGP)
- Learn these routes and share them with all neighbours
- NO_ADVERTISE: Do not advertise these routes to any BGP peers (iBGP or eBGP)
- Just install the learned route but don't share it
- NO_EXPORT: Do not advertise these routes outside the local AS (i.e. to eBGP peers); with confederations, the boundary is the confederation as a whole
- Just install the route and share it with iBGP peers
- LOCAL_AS: Do not advertise these routes outside the local sub-AS, not even to other sub-ASes in the same confederation (RFC 1997 calls this NO_EXPORT_SUBCONFED)
- Used with BGP confederations
- The communities attribute is set using route maps, and `send-community` must be enabled per neighbour for the attribute to be sent
router bgp 65001
bgp log-neighbor-changes
network 10.0.10.0 mask 255.255.255.0
network 10.1.0.0 mask 255.255.255.0
neighbor 1.1.1.1 remote-as 65002
neighbor 1.1.1.1 send-community
neighbor 2.2.2.2 remote-as 65002
neighbor 2.2.2.2 send-community
!
address-family ipv4
neighbor 1.1.1.1 activate
neighbor 2.2.2.2 activate
neighbor 1.1.1.1 route-map R1_COMM out
neighbor 2.2.2.2 route-map R2_COMM out
!
route-map R1_COMM permit 10
match ip address prefix-list NET1
set community 65001:300
!
route-map R1_COMM permit 20
match ip address prefix-list NET2
set community 65001:250
!
route-map R2_COMM permit 10
match ip address prefix-list NET1
set community 65001:250
!
route-map R2_COMM permit 20
match ip address prefix-list NET2
set community 65001:300
!
ip prefix-list NET1 seq 5 permit 10.0.10.0/24
ip prefix-list NET2 seq 5 permit 10.1.0.0/24
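To tie the community format, the well-known values, and the route maps above together, here is a minimal sketch. The encoding (ASN in the high 16 bits, tag in the low 16 bits) and the well-known values come from RFC 1997. The mapping from `65001:300` and `65001:250` to local preference is a hypothetical inbound policy that AS 65002 might apply; it is not part of the configuration above, and confederations are ignored for simplicity.

```python
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02
NO_EXPORT_SUBCONFED = 0xFFFFFF03  # called LOCAL_AS / local-AS on some platforms


def encode_community(text: str) -> int:
    """'ASN:TAG' -> 32-bit value: ASN in the high 16 bits, tag in the low 16 bits."""
    asn, tag = (int(part) for part in text.split(":"))
    return (asn << 16) | tag


def decode_community(value: int) -> str:
    return f"{value >> 16}:{value & 0xFFFF}"


def may_advertise(communities: set, peer_is_ebgp: bool) -> bool:
    """Simplified: ignores confederations, so NO_EXPORT_SUBCONFED behaves like NO_EXPORT."""
    if NO_ADVERTISE in communities:
        return False
    if peer_is_ebgp and communities & {NO_EXPORT, NO_EXPORT_SUBCONFED}:
        return False
    return True


# Hypothetical inbound policy in AS 65002: community -> local preference
LOCAL_PREF_BY_COMMUNITY = {
    encode_community("65001:300"): 300,
    encode_community("65001:250"): 250,
}


def local_pref(communities: set, default: int = 100) -> int:
    """Highest local preference requested by any recognised community, else the default."""
    matches = (LOCAL_PREF_BY_COMMUNITY[c] for c in communities if c in LOCAL_PREF_BY_COMMUNITY)
    return max(matches, default=default)


print(hex(encode_community("65001:300")))               # 0xfde9012c
print(decode_community(encode_community("174:21000")))  # 174:21000
print(may_advertise({NO_EXPORT}, peer_is_ebgp=True))    # False
print(local_pref({encode_community("65001:300")}))      # 300
```

With a policy like this on the receiving side, the route maps above would steer inbound traffic for `10.0.10.0/24` via 1.1.1.1 and for `10.1.0.0/24` via 2.2.2.2, with the other neighbour acting as backup.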
BGP Path Diversity¶
- Normally BGP only advertises the best path to a neighbor
- If a better path is found, it replaces the current path
- This impacts the speed of reconvergence in case one path fails
- Repeated advertisement of the same NLRI is processed as an update of the previous advertisement, i.e. the latter replaces the former
- BGP Additional Paths
- RFC 7911: Advertisement of Multiple Paths in BGP
- Defines a BGP extension that allows the advertisement of multiple paths for the same address prefix without the new paths implicitly replacing any previous ones
- Within the NLRI, each path is identified by a Path Identifier in addition to the address prefix
- BGP speaker announcing multiple paths to a neighbor assigns a locally unique Path ID to every path
- BGP Diverse Paths
- RFC 6774: Distribution of Diverse BGP Paths
- Have an additional (shadow) route reflector alongside the regular one, and have it advertise a different route than the best one
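The difference between implicit replacement and Additional Paths can be sketched with two small tables. In the sketch below the NLRI encoding follows RFC 7911 (a 4-byte Path Identifier placed in front of the usual length and prefix); the prefix, next hops, and Path IDs are invented for illustration.

```python
import ipaddress
import struct


def encode_nlri(prefix: str, path_id=None) -> bytes:
    """Plain NLRI is (length, prefix); with ADD-PATH a 4-byte Path Identifier comes first."""
    net = ipaddress.ip_network(prefix)
    body = bytes([net.prefixlen]) + net.network_address.packed[: (net.prefixlen + 7) // 8]
    return body if path_id is None else struct.pack("!I", path_id) + body


def receive(rib: dict, prefix: str, next_hop: str, path_id=None) -> None:
    """Without a Path ID the prefix alone is the key, so a new advertisement replaces the old."""
    rib[(prefix, path_id)] = next_hop


classic, add_path = {}, {}
receive(classic, "192.168.1.0/24", "10.1.1.2")
receive(classic, "192.168.1.0/24", "10.1.1.3")              # implicit replace
receive(add_path, "192.168.1.0/24", "10.1.1.2", path_id=1)
receive(add_path, "192.168.1.0/24", "10.1.1.3", path_id=2)  # kept alongside path 1

print(len(classic), len(add_path))             # 1 2
print(encode_nlri("192.168.1.0/24", 1).hex())  # 0000000118c0a801
```

Because the Path ID is only locally unique to the sender, the receiver treats it as an opaque key and does not compare it across neighbours.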
https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/xe-16/irg-xe-16-book/bgp-additional-paths.html
R1
bgp additional-paths select all
bgp additional-paths send receive
R2
bgp additional-paths select all
bgp additional-paths send receive
neighbor 10.1.1.1 advertise additional-paths all
BGP Security¶
TCP Authentication¶
https://datatracker.ietf.org/doc/html/rfc5925
Secure eBGP Default Policy¶
Works on IOS XR by default; check IOS XE and NX-OS
BGP Multi-session¶
BGP Resource Public Key Infrastructure¶
Configuration Explanation¶
show how to set different path attributes
route reflector
route maps etc
import/export policy
Verification and Troubleshooting¶
Q&A¶
update-source https://forum.networklessons.com/t/the-use-of-the-update-source-command/7418/2
R1 fa0/1-2 =========== fa0/1-2 R2. R1 is connected to R2 via two interfaces for redundancy. Without update-source, R1 uses one interface to form the neighborship, let’s say fa0/1 in this case. If fa0/1 goes down for some reason, the BGP session goes down immediately and a new session then has to form over the second interface. On a production network, this introduces an outage while the new neighborship forms. Sourcing the session from a loopback with update-source avoids this, because the loopback stays reachable as long as either link is up.
next-hop-self
Table types
- Adj-RIB-In: `show bgp neighbors 10.1.1.1 received-routes`
- Loc-RIB: `show ip bgp`
- Adj-RIB-Out: `show bgp neighbors 10.1.1.1 advertised-routes`
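A toy model of how the three tables relate, as described in RFC 4271: routes learned from neighbours land in Adj-RIB-In, inbound policy and best-path selection produce the Loc-RIB, and outbound policy produces what is advertised (Adj-RIB-Out). The best-path logic here is cut down to local preference and AS path length, a single Adj-RIB-Out is used instead of one per neighbour, and all prefixes, addresses, and policies are invented for illustration.

```python
def process(adj_rib_in: dict, import_policy, export_policy):
    """adj_rib_in maps neighbour -> {prefix: attributes}; returns (loc_rib, adj_rib_out)."""
    candidates = {}
    for neighbour, routes in adj_rib_in.items():
        for prefix, attrs in routes.items():
            if import_policy(prefix, attrs):
                candidates.setdefault(prefix, []).append({**attrs, "from": neighbour})
    # Toy best-path selection: highest local preference, then shortest AS path
    loc_rib = {
        prefix: max(paths, key=lambda p: (p["local_pref"], -len(p["as_path"])))
        for prefix, paths in candidates.items()
    }
    adj_rib_out = {p: r for p, r in loc_rib.items() if export_policy(p, r)}
    return loc_rib, adj_rib_out


adj_rib_in = {
    "10.1.1.1": {"192.168.1.0/24": {"local_pref": 100, "as_path": [65002, 65003]}},
    "10.1.1.2": {
        "192.168.1.0/24": {"local_pref": 100, "as_path": [65004]},
        "10.99.0.0/16": {"local_pref": 100, "as_path": [65004]},
    },
}

loc_rib, adj_rib_out = process(
    adj_rib_in,
    import_policy=lambda prefix, attrs: True,                  # accept everything
    export_policy=lambda prefix, route: prefix != "10.99.0.0/16",
)
print(loc_rib["192.168.1.0/24"]["from"])  # 10.1.1.2
print(sorted(adj_rib_out))                # ['192.168.1.0/24']
```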
https://datatracker.ietf.org/doc/html/rfc3392
https://www.rfc-editor.org/rfc/rfc7432.html