VMware Cluster upgrade vSphere 7 to vSphere 8 with a single image is blocked by not supported VIBs

Recently, I wanted to upgrade a vSphere 7 non vSAN cluster to version 8 with a single image. During the compliance check it turned out that all hosts in the cluster were not compliant and therefore the upgrade could not be started. The following error was displayed.

There appeared to be two VIBs installed that were not supported. On the host’s commandline, I looked up the names of VIBs and noted them. The VIBs involved in this case were the following:

  • nenic
  • iavmd

These will be needed later in the Powershell script.

Since I don’t want to log into every server and then have to delete the VIBs via the CLI, I created the following script. I would like to thank my colleague Kabir very much for his time, explanation and mentoring. He has great scripting and automation skills and has written great articles that you can find here, whatkabirwrites.nl

Back to the script. The script removes the two VIBs on all hosts in a cluster. Before using the script be sure that the VIBs are not used. The following steps are performed by the script:

  • Host is put into maintenance mode
  • Check if VIBs are installed
  • If present, they are removed
  • Host is rebooted
  • Host goes out of maintenance mode
  • Next host

If a host is already in maintenance mode, it will remain in maintenance mode after the reboot.

At the top of the script, the Remove VIBs function is defined. Adjusting the setting $settings.dryrun = $true to $false really removes the VIBs. Without adjusting this, the VIBs are not removed and only verified to be present or not. Regardless of the value of $settings.dryrun = $true or $false the hosts are always put into maintenance mode and rebooted.

After executing the host reboot, the script waits 4 minutes before continuing. I built in this pause because nested ESXi hosts reboot so quickly that otherwise the script won’t enter or exit the wait loop. I have used the script in a test lab and it works very well.

Please be aware that using this script is at your own risk!

#Version 1.0
#2024-10-20
#Pre-Upgrade script ESXi7 to ESXi8
#This script remove vibs on all hosts in a cluster that blocks the upgrade

# Function Remove VIBs
function RemoveVIB {
    Param (
        $ESXi  # The EsxCli object
      
    )


[array]$arrayvibs = @("nenic", "iavmd")

$esxcli = Get-EsxCli -V2 -VMHost $ESXi

    foreach ($vib in $arrayvibs) {
        
        if ($esxcli.software.vib.list.Invoke() | where {$_.Name -eq "$VIB"}) {
            $settings = $esxcli.software.vib.remove.CreateArgs()
            $settings.dryrun = $true
            $settings.vibname = "$VIB"
            echo "$VIB VIB found, remove VIB $esx"
            $esxcli.software.vib.remove.Invoke($settings)        
        } 
        else {
            echo "No $VIB VIB found $esx"             
        } 
    }

}

# vCenter & Cluster Parameters
$vCenter = "FQDN vCENTER"
$cluster = "Cluster Name"


# Connect vCenter
Try {Disconnect-VIServer * -Confirm:$false -ErrorAction SilentlyContinue | out-null}
Catch {}
    
Connect-VIServer $vCenter 

$ESXis= Get-Cluster -Name $cluster| Get-VMHost | sort Name | where {$_.ConnectionState -eq 'Connected' -or $_.ConnectionState -eq 'Maintenance'}

foreach ($ESXi in $ESXis) {
    write-host "Working on host $($esxi)"
    
    # Host status is Maintenance Mode
    if ($ESXi.ConnectionState -eq 'Maintenance') {
        Write-host "Host is already in Maintenance Mode..."
        RemoveVIB -ESXi $ESXi
        Write-host "Host Reboot in Maintenance Mode..."
        write-host "Herstarten host $($esxi)"
        Restart-VMHost -VMHost $ESXi -Confirm:$False | Out-Null
        write-host "Waiting 4 minutes to make sure the host is disconnected before proceeding..."
        start-sleep 240
        $hoststat = (Get-VMHost -Name $ESXi.Name)
        While ($hoststat.ConnectionState -eq "NotResponding") {
        Write-host "Host is still rebooting... waiting 10sec..."
        Start-Sleep 10
        $hoststat = (Get-VMHost -Name $ESXi.Name)
       }
    }
    #Host status is not Maintenance Mode
    else {
        Write-Host "$($ESXi) is not in Maintenance Mode. Put host in Maintenance Mode..."
        write-host ""

        # Host in Maintenance Mode
        Set-VMHost -VMHost $ESXi -State Maintenance -Confirm:$False | Out-Null

        # Vib Remove
        RemoveVIB -ESXi $ESXi
        # Host Reboot
        Write-host "Host is in Maintenance Mode"
        write-host "Reboot host $($esxi)"
        Restart-VMHost -VMHost $ESXi -Confirm:$False | Out-Null
        write-host "Waiting 4 minutes to make sure the host is disconnected before proceeding..."
        start-sleep 240
        $hoststat = (Get-VMHost -Name $ESXi.Name)
        While ($hoststat.ConnectionState -eq "NotResponding") {
        Write-host "Host is still rebooting... waiting 10sec..."
        Start-Sleep 10
        $hoststat = (Get-VMHost -Name $ESXi.Name)
       }
        
    # Host uit MM
        write-host "Reboot on host $($esxi) is completed..."
        write-host ""
        Start-Sleep 20
        Set-VMHost -VMHost $ESXi -State Connected -Confirm:$False | Out-Null
   }  
    
    write-host "Done on host $($esxi)"
    write-host ""
}

#Disconnect vCenter
    write-host "Disconnecting vCenter $vCenter"
    Disconnect-VIServer  -Confirm:$False  | Out-Null

Cannot install the vCenter Server agent(HA) service. Unknown installer error (UPDATE)

Almost two years ago I wrote a blogpost about a failure during installing the vCenter Server agent(HA) service. This post is one of most read articles on my blog. You can find the orginal post here. Recently, I ran into this problem again. This time I could not solve the problem using my earlier post.

I have found a workaround to this problem that is easy to implement and works well. I have been able to use it successfully several times recently.

  • Put the ESXi host in Maintenance Mode
  • SSH to the ESXi host
  • esxcli software vib remove -n vmware-fdm (no reboot needed)
  • Wait a few minutes and the result should like this:

Removal Result
  
Message: Operation finished successfully.
   Reboot Required: false
   VIBs Installed:
   VIBs Removed: VMware_bootbank_vmware-fdm_7.0.3-19193900
   VIBs Skipped:

  • Check if the vCenter Server agent is removed
  • esxcli software vib list | grep fdm (result should be empty)
  • Exit Maintenance Mode, HA agent will be installed
  • Check if the vCenter Server agent is installed
  • esxcli software vib list | grep fdm (result should be like this:
    vmware-fdm 7.0.3-19193900 VMware VMwareCertified 2022-10-12
  • Done

Set ProductLocker Location with PowerCLI

Something I struggled with in the past is setting the productlocker, hassle with scripts and ssh and avoiding host reboots. Today we had to configure another 20 hosts with this and I was pointed to a solution doing it with the MOB. So my next thing was, how to automate this. And with 2 simple foreach loops you can set it and check it and be done!

Setting all hosts from a cluster with a specific path for the productlocker

$allhosts = get-cluster "clustername" | get-vmhost

foreach ($hostname in $allhosts){
Get-VMhost -Name $hostname | %{$_.ExtensionData.UpdateProductLockerLocation_Task("/vmfs/volumes/pathname")}
}

Query all hosts in a cluster to check what location is set

$allhosts = get-cluster "clustername" | get-vmhost

foreach ($hostname in $allhosts){
Get-VMhost -Name $hostname | %{$_.ExtensionData.QueryProductLockerLocation()}
Write-Host $hostname -ForegroundColor Green
}

Source: Hollebollevan

Replacing the VxRail Manager SSL Certificate failed

Cause:
Replacing the VxRail Manager SSL Certificate through the Certificate Management Gui ended up with an error.

I was used to replacing the VxRail certificate using the cli. This was the preferred way somewhere up to version VxRail 4.7.x. All preparations for using the gui are the same as when replacing the certificate through the cli:

  1. Config OpenSSL.conf with the required information
  2. Create RSA key
  3. Creare CSR file
  4. Request a Certificate signed by your CA
  5. Download Certificate CA chain

The certificate and CA certificate chain must still be in PEM format.

So I start with a snapshot of the VxRail manager and copied the VxRail certificate, RSA key, certificate chain and entered a password in the gui. After pressing the Update button I received the following error.

Solution:
I will briefly tell you what went wrong. As told earlier, I used the same way to create the certifcates as I did when using the cli.

The error was caused by the certificate CA chain. After downloading the certificate CA chain file, the certificates are listed from top to bottom. The Root CA was at the bottom. This is order has always worked when replacing the VxRail manager certificate using the cli.

I was led to the solution after reading the certificate chain content information twice, see the screenshot above here. I assumed that the certificate format was causing the issue. There is also information about the certificate format next to the information about the order of the CA certificates in the chain file. I changed the order of the CA certificates in chain file with the Root CA on top.

At the next try I copied again the VxRail certificate, RSA key and certificate chain and entered a password in the gui. After pressing the Update button I received a notification that the VxRail Manager certificate is successful replaced. It takes about 5 minutes, before the VxRail manager is back online.

During this 5 minutes the VxRail SSL Thumbprint in the cluster in vCenter will be updated and services will be restarted. You can see the VxRail SSL Thumbrint on the cluster summary page in vCenter.

This auto replacement of the VxRail SSL Thumbprint is an enhancement compared to the old way when the certificate was replaced using the cli. The VxRail SSL Thumbprint then had to be manually read and copied from the VxRail manager to the cluster custom attribute in vCenter.

Return to the VxRail System info page after 5 minutes. You should now see the VxRail information.

Done.

Get notificated when vSAN Force Provisioning is enabled and applied to a vm

Recently I was asked if it is possible to receive an email notification if a vSAN storage policy with Force Provisioning enabled is applied to a vm. In this blogpost I want to show that this is possible.

Use Case – An administrator wants apply an vSAN storage policy with Force Provisioning enabled to virtual machines beacause of a possible shortage of vSAN storage capacity. In my opinion not a very good idea in a production environment!

Goal – Get an email notification when a vSAN storage policy with Force Provisioning enabled is applied to an vm. There is also the wish to make this visable in a dashboard.

Solution – With a bit of reverse engineering and vRealize Log Insight (vRLI) it’s possible to achieve this.

Setup lab – VMware vSphere 7.0 Update 3a vSAN cluster and vRLI 8.6.

The first step is create a storage policy with Force Provisioning enabled. We name this policy “FP VM Storage Policy”

We need to apply the new storage policy “FP VM Storage Policy” to our test vm “sbpm01”

The policy is successful applied to the vm “sbpm01”.

Now we need some reverse engineering because it’s not possible to grep the name of the storage policy in vRLI. We move to the sbpm01 events in vCenter.

Note the following two details:

  1. Event Type ID: com.vmware.pbm.profile.associate
  2. Associated storage policy: 98df0443-5244-49af-9069-ad9fdbfedb52

This is the information we need to created a new filter in vRLI Interactive Analytics.

Add the associated storage policy id “98df0443-5244-49af-9069-ad9fdbfedb52″ to the text field and add an extra filter (+ ADD FILTER). Choose from the pull down menu “vc_event_type” contains com.vmware.pbm.profile.associate. Choose a time window. In our example I choose “Latest 24 hours of data”. You can also choose here the last hour or last 5 minutes of data. It depends on the time you applied the policy and when you search in vRLI.

In de results above you don’t see the name of the vm with the applied storage policy. In the last image above here you see on the right of Events the Field Table section. Select Field Table. Search for the row with name vc_vm_name. Below here is the vm friendly name displayed of the vm with new applied storage policy.

Finally you want a email notification and a dashboard. I am not going to explain here how to create an email notification and a dashboard. This can be done in vRLI at the same way you normally create notifications and dashboards. Press the icon (1) to create a email notification and press the icon (2) to create a dashboard.

If you want receive an email notification if the vm get another storage policy applied. Then you should create another filter including the following two details.

  1. Event Type ID: com.vmware.pbm.profile.dissociated
  2. Associated storage policy: 98df0443-5244-49af-9069-ad9fdbfedb52

Conclusion:

In this blog post I wanted to demonstrate that it is possible with vRLI to receive an email notification if a vm has a storage policy applied where Force Provisioning is enabled. A disadvantage is that if the vm gets a different storage policy with different settings, this email notification is no longer valid, because the notifications are based on this specific storage policy id.

I have shown that it is works but as far as I believe it is not a solution for a production environment.

VMware vCLS datastore selection part 2

Last year I wrote an blog post about the VMware vCLS datastore selection. This blog post is one of the most read articles on my website. This does indicate that there is a need to be able to choose a datastore on which the vCLS vms are placed.

Today VMware announced vSphere 7.0 update 3. In this update there is also an improvement on the vCLS datastore selection. It’s now possible to choose the datastore on which the vCLS vms should be located.

In the following video on the VMware vSphere YouTube channel move on to 20 minutes to learn more about the vCLS vms datastore selection improvement.

Another improvement is that the vCLS vms now have a unique identifier. This is useful when you have multiple clusters managed by the same vCenter.

It’s always good to see that a vendor is listening to the customers’ needs to further improve a product.

Update vCenter vCSA 7.0 Update 2 failed

Last night I spend several hours to update the vCenter in my lab from vCSA 7.0. Update 1d to vCSA 7.0 Update 2. The update kept going wrong. I’ve staged the update package first. After staging the update was completed I started the installation. This result in the following error: Exception occured in postInstallHook.

So I tried Resume.

This seems so far so good. Continue.

Continue the Installation.

Same error again! Let’s retry Resume and Cancel in the next step.

Cancel.

Now I get stuck. So I quit to get some sleep :-).

This morning I woke up and received an e-mail from WIlliam Lam that he has written a new blog post about an error during upgrade to vCenter vCSA 7.0 Update 2, “Exception occurred in install precheck phase“. This is a different error than the error that I experienced yesterday but I have seen this error also during one of the attemps.

Here an overview of the errors during my attempts:

  • Exception occured in postInstallHook
    This error appears after staging the update and install it later
  • Exception occurred in install precheck phase
    This error appears after stage and install at the same time.

Now let’s try the workaround from William Lam that should result in a working vCenter vCSA 7.0 Update 2.

  • Create a snapshot of the vCSA
  • Stage the update file
  • SSH to the vCSA
  • Move to folder /etc/applmgmt/appliance/
  • Remove the file software_update_state.conf
  • Move to folder /usr/lib/applmgmt/support/scripts
  • Run script ./software-packages.py install –url –acceptEulas

The update ended with an PostgreSQL error and vCenter is not working after the update. I rebooted the appliance one more time without any result.

Conclusion:

vCenter vCSA 7.0 Update 2 is in my opinion not ready for deployment at this moment. I rollback the snapshot and decide to wait for an updated version of vCenter vCSA 7.0 Update 2.

I will update this blog post later.

VM Summary Customize View

Recently I updated the vCenter appliance in my lab to version vCSA 7.0 update 1d. After updating I was clicking a bit through the environment. By coincidence I saw the following button when I opened the summary page of a VM.

Curious as I am, I clicked the button. But first the regular view below. This view, which everyone knows, is now called the classic view.

After clicking the “Swith To New View” button an customize view will appear.

What immediately stands out is the fresh widget view. It’s a small change, but I’m a fan of it right away. I been wondering ever since when this view was introduced. I searched the VMware documentation but I cannot find it. It is certainly not available in versions prior to vCenter vCSA 7. Maybe it has been available for a while but I haven’t noticed it before.

If you still prefer the classic view. You can just as easily switch back to your old trusted view.

You can easily adjust what you want to see and what not. If you know when this customize view was introduced, please leave a comment.