Play with BGP in Azure
What is BGP?
If you are not familiar with network protocols, the 4th of October 2021 was probably the first time you heard about BGP. What is really interesting is that most people don't realize how important BGP is for the Internet.
BGP stands for Border Gateway Protocol. It is a mechanism for exchanging routing information inside or between autonomous systems (AS).
The routers that make the Internet work share routes with each other to make sure a network packet can reach its destination. Without BGP, the Internet routers wouldn't know what to do, and the Internet wouldn't work.
How to play with BGP?
BGP is really important for the Internet, but it can also be used at a smaller scale to provide high availability and a simple way to manage routing.
In Azure, a Virtual Network Gateway can be configured to use BGP, and we will use it to see how BGP works.
Let's assume the following network design:
All this infrastructure can be provided easily using this Terraform code: https://github.com/vmisson/terraform-azure-fullmesh
It takes around 30 minutes to provision everything. This infrastructure has a fairly significant cost, so I recommend removing it as soon as you have completed your tests. Destroying everything can take more than 15 minutes.
Let's have a look in detail
We will focus on the West Europe virtual network gateways, but it is exactly the same for the other regions.
First of all, we can check the BGP peers:
As we have provisioned a full mesh BGP configuration, all gateways are connected to each other: 1 peer for the other gateway in the same region + 2 peers for each of the 2 external regions = 5 peers per gateway.
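The peer count can be checked with a small model of the mesh. This is only a sketch: it assumes three regions with two gateways each, as described above, and the region labels are placeholders chosen for the example rather than names taken from the Terraform code.

```python
from itertools import product

# Assumed layout, inferred from the text: 3 regions, 2 gateways in each.
# The region labels are hypothetical identifiers for this sketch.
REGIONS = ["westeurope", "northeurope", "eastus"]
GATEWAYS_PER_REGION = 2


def all_gateways():
    """Return every (region, index) gateway in the mesh."""
    return list(product(REGIONS, range(1, GATEWAYS_PER_REGION + 1)))


def peers_of(gateway):
    """In a full mesh, a gateway peers with every gateway except itself."""
    return [g for g in all_gateways() if g != gateway]


peers = peers_of(("westeurope", 1))
same_region = [p for p in peers if p[0] == "westeurope"]
other_regions = [p for p in peers if p[0] != "westeurope"]

print(len(same_region), len(other_regions), len(peers))  # 1 4 5
```

Adding a fourth region to `REGIONS` would raise the count to 7 peers per gateway, which shows how quickly a full mesh grows.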
Let's have a look at the BGP learned routes:
For each external subnet, we get 10 different routes. How can we explain this?
Let's take as an example 10.1.0.0/23 (North Europe subnet):
We learn 5 routes per gateway (local address), corresponding to the 5 peers of each gateway, which gives 10 routes across the two West Europe gateways. If a route is learned from a gateway with the same ASN, the origin is marked as IBgp (Internal BGP); if it is learned from a gateway with a different ASN, the origin is marked as EBgp (External BGP).
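The 10 routes and their origins can be reproduced with a short sketch. It assumes that each peer advertises the prefix once to each local gateway. ASNs 65001 (North Europe) and 65002 (East US) come from the AS paths shown in this article; 65000 for West Europe is a placeholder picked for the example.

```python
from collections import Counter

# 65001 and 65002 appear in the article; 65000 is an assumed placeholder.
REGION_ASN = {"westeurope": 65000, "northeurope": 65001, "eastus": 65002}

# Two gateways per region, identified as (region, index).
GATEWAYS = [(region, i) for region in REGION_ASN for i in (1, 2)]


def origin(local, peer):
    """Same ASN on both sides means IBgp, different ASNs mean EBgp."""
    return "IBgp" if REGION_ASN[local[0]] == REGION_ASN[peer[0]] else "EBgp"


def learned_routes(local_region):
    """One route per (local gateway, peer) pair for a given prefix."""
    routes = []
    for local in GATEWAYS:
        if local[0] != local_region:
            continue
        for peer in GATEWAYS:
            if peer != local:
                routes.append((local, peer, origin(local, peer)))
    return routes


routes = learned_routes("westeurope")
counts = Counter(o for _, _, o in routes)

print(len(routes))     # 10
print(counts["IBgp"])  # 2
print(counts["EBgp"])  # 8
```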
All the routing was generated automatically by BGP, but how is a new VNet handled? As soon as you have created a new VNet and set up the peering between this VNet (using the remote virtual network gateway) and the hub VNet (using its virtual network gateway), the new subnet is automatically advertised by BGP.
As you can see, when it is correctly configured, BGP can manage routing automatically: a new VNet created in any region is advertised everywhere in only a few seconds.
BGP and redundancy
Let's have a look at the AS path. Some routes reach the destination ASN (65001 in our case) directly, and some routes have multiple ASes in the path (65002-65001). In that case, it means we can reach North Europe by passing through East US. It is probably not the best solution in terms of performance, but if we completely lose connectivity between West Europe and North Europe, we still have a degraded solution through another path (West Europe => East US => North Europe).
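The fallback behaviour can be illustrated with a simplified route selection. This is a sketch under a strong assumption: it compares AS path length only, whereas real BGP best-path selection evaluates other attributes (such as local preference) before that. The `Route` class and `best_route` function are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple  # ASNs traversed, nearest first


def best_route(routes):
    """Prefer the shortest AS path; return None when no route is left."""
    return min(routes, key=lambda r: len(r.as_path), default=None)


# The two kinds of routes seen for the North Europe subnet.
routes = [
    Route("10.1.0.0/23", (65001,)),        # direct to North Europe
    Route("10.1.0.0/23", (65002, 65001)),  # through East US
]

print(best_route(routes).as_path)  # (65001,)

# Simulate the loss of the direct West Europe / North Europe connections.
remaining = [r for r in routes if r.as_path != (65001,)]

print(best_route(remaining).as_path)  # (65002, 65001)
```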
We can simulate this kind of issue by deleting both connections between West Europe and North Europe:
Let's run a ping between West Europe VM and North Europe VM to check the impact:
It takes around 5 seconds to recover connectivity, but as the traffic now passes through East US to reach North Europe, the latency is not really good (+150 ms).
With Terraform, it is easy to simulate the connectivity restoration:
No packets lost, and we are back to normal latency. Easy peasy!
As you can see, BGP is a nice way to manage routing and automatic failover. Remember, if BGP can manage the Internet, it can probably manage your private routing.
This article is only the first in a series about routing and BGP inside Azure. Next time we will look at Route Server and how it can be useful in some specific designs and use cases.
Don't hesitate to contact me if you have any questions or remarks.