Versions, for posterity

vCenter 8.0.1 – build 22088981
ESXi – 22088125
NSX – 4.1.0.2

Backstory

I recently came across a rather interesting issue. Long story short, I needed to rebuild one of my NSX-prepared hosts. I put the host in maintenance mode, pulled it out of my NSX cluster, disconnected the host, and removed it from vCenter.

Issue at hand

All is well, right? Well, not exactly.

After I rebuilt my host, I came across an interesting problem. I could add the host back to vCenter and back to my cluster correctly. (Note: I have a vSAN cluster, so I first needed to match the ESXi build on my re-added host before it would join my cluster properly. A simple single-host LCM image worked to upgrade this host.)

However, NSX would not prepare my host!

Validation Errors
26210: Node node-name-uuidhost-number with same ip 10.0.0.1 already exists.

Well, this is not good. I removed that host, so there should be no duplicates around! A stale entry exists somewhere. How do I remove it? How do I identify the correct UUID to remove?

Googling this, I came across a community thread where others had hit the same problem. Where I struggled was finding the correct UUID to remove, because the UUID in the error is *NOT* the correct one!

Solution

After digging around in the API fruitlessly for a while, I finally opened a VMware ticket. The engineer on the call said to simply search in the NSX GUI for the UUID or the hostname! I never would have thought to search the GUI to find the stale entry.

Sure enough, the entry I needed was there.
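The GUI search is what actually worked for me. For anyone who would rather script the lookup, here is a minimal sketch of the same idea: match on the IP or hostname instead of trusting the UUID in the error. It assumes you already have a list of transport-node records as plain dictionaries, and the field names (`id`, `display_name`, `node_deployment_info`, `ip_addresses`, `fqdn`) are my assumptions about the record shape, not something I verified against this NSX version. The sample IDs and hostnames are made up. Since my own API digging came up empty, treat this as a sketch only.

```python
def find_stale_nodes(nodes, ip=None, hostname=None):
    """Return the IDs of records whose IP or name matches the rebuilt host.

    `nodes` is a list of dicts. The field names used here are assumptions
    about the record shape, not confirmed for any specific NSX release.
    """
    matches = []
    for node in nodes:
        info = node.get("node_deployment_info") or {}
        ips = info.get("ip_addresses") or []
        names = {node.get("display_name"), info.get("fqdn")}
        if (ip and ip in ips) or (hostname and hostname in names):
            matches.append(node.get("id"))
    return matches


# Hypothetical sample data, for illustration only.
sample = [
    {
        "id": "hypothetical-id-1",
        "display_name": "esx01.lab.invalid",
        "node_deployment_info": {"ip_addresses": ["10.0.0.1"]},
    },
    {
        "id": "hypothetical-id-2",
        "display_name": "esx02.lab.invalid",
        "node_deployment_info": {"ip_addresses": ["10.0.0.2"]},
    },
]

print(find_stale_nodes(sample, ip="10.0.0.1"))
```

Matching on the IP mirrors the validation error itself, which complains about a duplicate IP rather than a duplicate UUID.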

Once I knew the proper UUID, I could run the API call from within Postman to clean this up!

First, I made sure my host was *not* in my NSX-prepared cluster.

DELETE {{baseUrl}}/transport-nodes/:transport-node-id?force=true&unprepare_host=false

Where {{baseUrl}} is https://fqdn.of.my.nsx.manager/api/v1 and :transport-node-id is the UUID found through the GUI search (not the one in the error).
Headers:
Key: X-Allow-Overwrite
Value: true
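If you don't use Postman, the same request can be sketched in Python with only the standard library. This is a rough equivalent, not a tested tool: the function name, the example hostname, the node ID, and the use of HTTP basic authentication are all my assumptions here. The function only builds the request and does not send anything.

```python
import base64
import urllib.parse
import urllib.request


def build_stale_node_delete(base_url, transport_node_id, username, password):
    """Build (but do not send) the DELETE request for a stale transport node.

    base_url is the NSX Manager API root, e.g. https://<manager>/api/v1.
    Basic authentication is an assumption; use whatever your setup requires.
    """
    # Same query parameters as the Postman call above.
    query = urllib.parse.urlencode({"force": "true", "unprepare_host": "false"})
    node = urllib.parse.quote(transport_node_id, safe="")
    url = f"{base_url.rstrip('/')}/transport-nodes/{node}?{query}"

    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    headers = {
        "X-Allow-Overwrite": "true",
        "Authorization": f"Basic {token}",
    }
    return urllib.request.Request(url, method="DELETE", headers=headers)


# Hypothetical values, for illustration only.
req = build_stale_node_delete(
    "https://nsx.example.invalid/api/v1", "hypothetical-node-id", "admin", "secret"
)
print(req.get_method(), req.full_url)
```

Sending it is then a matter of passing `req` to `urllib.request.urlopen`, with an SSL context that trusts your NSX Manager certificate. Double-check the printed URL and the node ID before you do, since this is a forced delete.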

Conclusion

After running that API call, I waited a minute and searched for the hostname again in the GUI, and I couldn’t find it anymore. I then re-added my host to my NSX cluster, and voila! LCM started preparing my host as it should. I no longer got the NSX error saying the same IP already exists!

I’ll update this post when I discover how to properly remove the host from the cluster without leaving any stale entries behind.

Edit: After you put your host in maintenance mode through vSphere, also put it in maintenance mode in NSX. Then, once you remove it, the stale entries are cleaned up.