Since we published our whitepaper, I have been working with many enterprise organizations and service providers to deliver successful NVGRE implementations.
The journey has been great and interesting, and I know for sure that the whitepaper will be updated very soon with additional tips and tricks.
Recently, we had a scenario where the gateway VMs required default gateways on both the management and the front-end networks.
This caused some issues at first, and the result is as follows:
If you have a default gateway configured on the routable management network, VMM and the gateway VM can communicate and are quite happy with that.
If you have a default gateway configured on the front-end network (the one that faces the internet), then the tenants should get internet access and are quite happy with that.
The problem is that, without metrics, the gateway VM cannot reliably determine the correct default route, so your tenants will most likely not get a successful connection to the internet.
A quick command to check the routes on your gateway VM (route print) will show you which gateway is used for the route to 0.0.0.0. If it is the one on the management network, you will have problems.
The solution is to add metrics so that the gateway VM can ensure connectivity to the right networks. The route with the lowest metric is preferred, so the front-end gateway gets the lower value:
Metric for management gateway: 300
Metric for front-end gateway: 200
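To make the effect of the two metrics concrete, here is a minimal Python sketch of the selection rule. It is a simplified model, not how Windows implements routing: it assumes that between two default routes the one with the lowest metric is chosen, and it ignores the interface metric that Windows adds to the route metric. The names DefaultRoute and preferred_default are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefaultRoute:
    """A default route (0.0.0.0/0) reduced to a label and its metric."""
    network: str  # label only, e.g. "management" or "front-end"
    metric: int


def preferred_default(routes):
    """Return the default route with the lowest metric (simplified model)."""
    return min(routes, key=lambda r: r.metric)


# Metrics taken from the values listed above.
routes = [
    DefaultRoute("management", 300),
    DefaultRoute("front-end", 200),
]

print(preferred_default(routes).network)  # front-end
```

With these values the front-end gateway always wins for internet-bound traffic, which is the behavior the tenants need.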
In addition, we ended up with static routes on the gateway VM for the different networks.
This led to a stable NVGRE environment where the gateway VM could continue to be managed by VMM, and the tenants could have a stable internet connection.
route add 10.0.0.0 mask 255.255.255.0 10.0.0.1 METRIC 300
route add 0.0.0.0 mask 0.0.0.0 18.104.22.168 METRIC 200
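The resulting routing table can be sketched in Python to show why both kinds of traffic end up on the right network. This is a simplified model under stated assumptions: the most specific matching prefix wins, the metric only breaks ties, and interface metrics are ignored. Next hops are represented by labels instead of addresses, the test destinations are arbitrary, and ROUTES and lookup are hypothetical names for illustration.

```python
import ipaddress

# (destination prefix, next-hop label, metric), mirroring the two static routes.
ROUTES = [
    (ipaddress.ip_network("10.0.0.0/24"), "management", 300),
    (ipaddress.ip_network("0.0.0.0/0"), "front-end", 200),
]


def lookup(destination):
    """Return the next-hop label for a destination address.

    Simplified model: longest matching prefix first, then lowest metric.
    """
    addr = ipaddress.ip_address(destination)
    candidates = [route for route in ROUTES if addr in route[0]]
    best = min(candidates, key=lambda route: (-route[0].prefixlen, route[2]))
    return best[1]


print(lookup("10.0.0.50"))     # management
print(lookup("198.51.100.7"))  # front-end
```

Management traffic matches the more specific 10.0.0.0/24 route despite its higher metric, while everything else falls through to the default route on the front-end network.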