AutoDeploy and VXLAN

UPDATE!!: Another awesome engineer that I work with has found a solution to this.

“as long as you let VSM create the vmknic the first time, and then preserve that MAC address (in the answer file of the Host-Profile), you’re good. (Assuming you’ve added the vxlan vib to your image)” He also mentions “I think one of the keys is to make sure VSM is showing the correct IP in Datacenters->Network Virtualization->Preparation->Connectivity BEFORE attempting host reboots.”

Thank you, Jason.

Original Post:

The same technician who found the UCS and PXE bug has found another one, this time relating to VXLAN and Auto-Deploy. (Thanks Zach & Eric)

First, a little background.

The customer is doing a full-scale vCloud Enterprise Suite deployment.  They wanted to utilize VXLAN and fully stateless hosts.

Normally, when the cluster gets prepared, vShield Manager (now called vCloud Networking and Security; I’m sure the name will change tomorrow) creates a vmknic on each host that is used for the VXLAN transport.

Now add Auto-Deploy, which muddies everything up. The process that should occur is this:

First, boot a new host and configure it as needed. Then prepare the cluster through vSM/VCNS, which adds a vmknic to the host for the VXLAN transport. Then create a host profile from that host. As you add more hosts, update the answer file for each of them. Then reboot and all is happy…

Well, here is the rub: upon reboot, the vSM/VCNS prep happens before the host profile is applied. Sadly, there is no way to just add the IP address to the existing vmknic through the host profile; it’s an all-or-nothing affair. What sucks is that during host profile remediation the just-created vmknic isn’t modified in place, it’s actually deleted and re-created with the appropriate settings. I’m sure this is to simplify the code. Anyway, the new vmknic has the correct settings, but vSM/VCNS doesn’t know about it because some identifier has changed… doh!!
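To make the ordering problem easier to picture, here is a toy Python model of it. This is only a sketch and it does not talk to any VMware API: the `Host` and `Manager` classes, `apply_host_profile`, and the IP and MAC values are all made up for illustration. It also assumes the identifier vSM/VCNS tracks is the vmknic's MAC address, which is what the update at the top of this post suggests but which I have not confirmed.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Optional

_serial = itertools.count(1)


def new_mac() -> str:
    """Return a fresh, made-up MAC address for a newly created vmknic."""
    return "00:50:56:00:00:%02x" % next(_serial)


@dataclass
class Host:
    # Maps a vmknic name to its settings, e.g. {"mac": ..., "ip": ...}.
    vmknics: Dict[str, dict] = field(default_factory=dict)


class Manager:
    """Toy stand-in for vSM/VCNS. ASSUMPTION: it tracks the vmknic by MAC."""

    def __init__(self) -> None:
        self.known_mac: Optional[str] = None

    def prepare(self, host: Host) -> str:
        # Cluster prep creates the VXLAN vmknic and remembers its MAC.
        mac = new_mac()
        host.vmknics["vxlan"] = {"mac": mac, "ip": None}
        self.known_mac = mac
        return mac

    def recognizes(self, host: Host) -> bool:
        nic = host.vmknics.get("vxlan")
        return nic is not None and nic["mac"] == self.known_mac


def apply_host_profile(host: Host, ip: str, pinned_mac: Optional[str] = None) -> None:
    """Remediation deletes the vmknic and re-creates it; it never edits in place."""
    host.vmknics.pop("vxlan", None)
    host.vmknics["vxlan"] = {"mac": pinned_mac or new_mac(), "ip": ip}


def simulate(preserve_mac: bool) -> bool:
    """Run prep first, then the host profile; report whether the manager still knows the vmknic."""
    host, manager = Host(), Manager()
    created_mac = manager.prepare(host)
    apply_host_profile(host, "192.0.2.10", created_mac if preserve_mac else None)
    return manager.recognizes(host)


if __name__ == "__main__":
    print("plain host profile:", simulate(preserve_mac=False))
    print("MAC preserved in answer file:", simulate(preserve_mac=True))
```

In this model the plain host profile run leaves the manager with a vmknic it no longer recognizes, while pinning the MAC that the manager originally created keeps the two in sync. That is the idea behind preserving the MAC in the answer file.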

There is currently no fix for this order-of-operations issue. It will be fixed when VCNS gets updated to 5.5, though. So for now it’s VXLAN or Auto-Deploy.