The following script may be useful if you are migrating vRealize Log Insight to a new appliance or cluster. You can use it before, during, and after the migration to check the Syslog.global.logHost setting on all ESXi hosts in vCenter.
# Connect to vCenter Server
Connect-VIServer <vCenterServer>
# Get all ESXi hosts in vCenter
$hosts = Get-VMHost
# Loop through each ESXi host and print the Syslog.global.logHost advanced setting
foreach ($esxi in $hosts) {
    $setting = Get-AdvancedSetting -Entity $esxi -Name 'Syslog.global.logHost'
    Write-Host "$($esxi.Name): $($setting.Value)"
}
# Disconnect from vCenter Server
Disconnect-VIServer <vCenterServer> -Confirm:$false
For example, you can use the script during a vRealize Log Insight migration in the following way:
Before migration: check the currently configured syslog endpoint
During migration: check that both the current and the new syslog endpoints are configured
After migration: check the newly configured syslog endpoint
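During the migration window you may also want to set the value, not just read it. Here is a minimal PowerCLI sketch; the endpoint values, protocol, and port are placeholders for your environment, and in my experience changing this setting takes effect without a host reboot:

```powershell
# Sketch: point every host at both the old and the new vRLI endpoint
# during the migration. The endpoint values below are placeholders.
Connect-VIServer <vCenterServer>

foreach ($esxi in Get-VMHost) {
    Get-AdvancedSetting -Entity $esxi -Name 'Syslog.global.logHost' |
        Set-AdvancedSetting -Value 'ssl://old-vrli.example.local:1514,ssl://new-vrli.example.local:1514' -Confirm:$false
}

Disconnect-VIServer <vCenterServer> -Confirm:$false
```

After the migration completes, run the same loop again with only the new endpoint as the value.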
This time a short post about a vRealize Log Insight (vRLI) configuration issue that took too long to solve. In the end the solution was simple once I found the documentation; finding the right documentation was the hardest part.
Briefly, the reason for this setup: I want ESXi hosts to use syslog over SSL, so logging is sent to vRLI encrypted.
While adding the vCenter I configured the hosts to use SSL.
After configuring, everything seemed to work fine, until I got a vRLI Admin mail with the following alert:
This alert is about your Log Insight installation on https://vrli.vrmware.nl/
SSL Certificate Error (Host = vrli.vrmware.nl) triggered at 2023-04-16T09:23:53.412Z
This notification was generated from Log Insight node (Host = vrli.vrmware.nl, Node Identifier = de568ad3-d4e3-7f8a-b543-cef17632af11).
Syslog client esx01.vrmware.nl disconnected due to a SSL handshake problem. This may be a problem with the SSL Certificate or with the Network Time Service. In order for Log Insight to accept syslog messages over SSL, a certificate that is validated by the client is required and the clocks of the systems must be in sync.
Log messages from esx01.vrmware.nl are not being accepted, reconfigure that system to not use SSL or see Online Help for instructions on how to install a new SSL certificate .
This message was generated by your Log Insight installation, visit the Documentation Center for more information.
Time couldn’t be the issue in my case, so it had to be a certificate issue. The problem was that the vRLI certificate wasn’t in the ESXi host trust store.
Per ESXi host, the following steps should be taken to solve the issue. Step 3 is only a verification step.
If the ESXi hosts have the vRLI certificate in their trust store, the vRLI Admin mail (sent once per day per vRLI node) should no longer occur.
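Since the screenshots with the individual steps may not be readable here, the per-host procedure can be sketched roughly as follows. Treat this as a sketch under assumptions: I am assuming /etc/vmware/ssl/castore.pem as the ESXi trust store location and port 443 for fetching the certificate; verify both against the documentation linked below.

```shell
# On a workstation with openssl installed: fetch the vRLI certificate in
# PEM format (port 443 is an assumption; adjust to your environment)
openssl s_client -connect vrli.vrmware.nl:443 -showcerts </dev/null \
  | openssl x509 -outform PEM > vrli.pem

# Copy the certificate to the ESXi host (SSH enabled) and append it to the
# host trust store (path is an assumption; check the VMware documentation)
scp vrli.pem root@esx01.vrmware.nl:/tmp/
ssh root@esx01.vrmware.nl 'cat /tmp/vrli.pem >> /etc/vmware/ssl/castore.pem'

# Verification step: reload syslog and watch whether new SSL alerts appear
ssh root@esx01.vrmware.nl 'esxcli system syslog reload'
```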
Here is the link to the VMware documentation. This documentation is actually for vRLI Cloud, which is a different product than standard vRLI, although they overlap in some areas. According to VMware GSS, the documentation for vRLI will be updated.
So this is probably why the vRLI documentation on this topic was so hard to find. Hopefully this blog post will save you a lot of time.
Recently I wanted to test whether it is possible to configure vRealize Log Insight (vRLI) log forwarding over a second network interface, to reach a log target in another network segment that could not be reached from the default vRLI appliance IP address.
The first step is adding a second network interface to the appliance. In this example we use the following network configuration.
VMnic1: VLAN 10, IP 10.1.1.10, subnet mask 255.255.255.0, gateway 10.1.1.1
VMnic2: VLAN 20, IP 20.2.2.20, subnet mask 255.255.255.0, gateway 20.2.2.1
In this example, the log forwarding target IP address is 30.3.3.233.
To configure the second network interface, open an SSH session to the vRLI appliance. Change to /opt/vmware/share/vami/ and run the network configuration script vami_config_net. Eth1 is now also available for configuration. First select ‘0’ for a configuration overview. The results show an error on eth1, and this error prevents us from configuring eth1.
After some trial-and-error research I noticed the following error while reconfiguring eth1: “can’t open /etc/systemd/network/10-eth1.network”
The file “10-eth1.network” is not present in the directory /etc/systemd/network. The name of the file could differ from this example; it depends on the number of network interfaces. I fixed this issue by creating the file manually:
touch /etc/systemd/network/10-eth1.network
chmod 644 /etc/systemd/network/10-eth1.network
Configure the second network interface: go to the directory /opt/vmware/share/vami/ and run the network configuration script vami_config_net again. Eth1 is now available for configuration.
Check the new configuration by selecting option 0. If everything is OK, press 1 to exit.
Restart the network, systemctl restart systemd-networkd.service
Now that this issue is fixed, we can move on to configuring the persistent static route for vRLI log forwarding.
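As a follow-up sketch of what such a persistent route could look like with systemd-networkd, using the example addresses above; the target network 30.3.3.0/24 and the file name are my assumptions, so adjust them to your environment:

```ini
# /etc/systemd/network/10-eth1.network (example sketch)
[Match]
Name=eth1

[Network]
Address=20.2.2.20/24

[Route]
# Reach the log-forwarding target network via the VLAN 20 gateway
Destination=30.3.3.0/24
Gateway=20.2.2.1
```

After saving, restart the network (systemctl restart systemd-networkd.service) and verify the result with ip route.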
Recently, I started an SSH session on a vRealize Log Insight (vRLI) appliance. After entering my username and password, I was prompted to change the password immediately. After I did this, the SSH session was dropped and I had to log in again; the password appeared to be unchanged. I repeated this several times and then decided to restart the vRLI appliance, hoping this would fix the problem.
After waiting for some time, the appliance did not come online and appeared to be stuck in a loop during startup. The screenshot below shows that the Journal Service does not start.
I have seen similar issues before, and I suspected a partition had filled up. I couldn’t log in because the appliance wouldn’t start. In the following action plan, I explain step by step how I solved the problem.
Action Plan:
1. Take a snapshot
2. Edit the VM options in the next step before restarting the VM
3. Change Boot Options: set Boot Delay to 1000 milliseconds and enable Force BIOS Setup
4. Restart the VM
5. In the BIOS, choose Exit Discarding Changes
6. Press ‘E’ immediately during the start
7. Type “rw init=/bin/bash” at the end of the line and press “F10”
8. Type “df -h”. In our case we see that the root partition is full
9. Let’s clean up some old audit and authentication logs
10. Let’s check the root partition again. Type “df -h”.
11. Reboot the appliance and we’re done.
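The cleanup in steps 8 through 10 could look something like this. A sketch: the paths below are examples of rotated audit and authentication logs, so check what is actually filling the partition before deleting anything.

```shell
# Show usage of the root partition
df -h /

# Remove rotated audit and authentication logs (example paths)
rm -f /var/log/audit/audit.log.*
rm -f /var/log/auth.log.*

# Check the root partition again
df -h /
```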
After cleaning up the root partition, the appliance started normally. I was also able to change the root password without any problems. Don’t forget to change the Boot Options back (set Boot Delay to 0 milliseconds) and to remove the snapshot once everything is working and you’re happy with this solution.
You can always open a VMware SR if you need help. 🙂
Almost two years ago I wrote a blog post about a failure while installing the vCenter Server agent (HA) service. This post is one of the most-read articles on my blog; you can find the original post here. Recently, I ran into this problem again, and this time I could not solve it using my earlier post.
I have found a workaround to this problem that is easy to implement and works well. I have been able to use it successfully several times recently.
Put the ESXi host in Maintenance Mode
SSH to the ESXi host
esxcli software vib remove -n vmware-fdm (no reboot needed)
Wait a few minutes and the result should look like this:
Last week was VMware Explore Europe in Barcelona, Spain. There, vExperts were able to pick up a mini PC, the Maxtang EHL30. The mini PC was offered by the vExpert community and Cohesity as a gift for all the work vExperts do for the vCommunity.
The mini PC still had to be fitted with DDR4 memory and an M.2 SATA SSD. Since I wanted to keep it low budget I bought memory and an SSD on Amazon for about €50.
After inserting the memory and SSD, the Maxtang booted up. I installed Ubuntu 22.04 LTS, using Rufus to create bootable Ubuntu install USB media. After the setup completed, the system hung during the reboot. After a cold boot I noticed that startup also took several minutes, much too slow in my opinion. After searching for a while I found the solution for the slow startup and the hang during reboot/shutdown.
Enter the command “sudo vi /etc/modprobe.d/blacklist.conf” in a terminal
Add a new line “blacklist pinctrl_elkhartlake”, then save and exit the editor
Enter the command “sudo update-initramfs -u” in the terminal
Reboot the system (cold boot) to apply the change
Now the Maxtang EHL30 with Ubuntu 22.04 LTS reboots and starts up in a few seconds.
For a customer, I deployed a new vRealize Log Insight 8.10 cluster as a test. After the initial installation and configuration of the first vRLI node, I experienced particularly bad performance. It wasn’t always this bad, but it occurred frequently.
The bad performance initially revealed itself when choosing one of the options in the menu on the left side of the screen. It seemed like the whole screen was frozen; only after 10 to 30 seconds did the requested information appear.
I decided to first complete the cluster setup. After adding some vCenters and importing Content Packs, technically everything worked as designed. The bad performance was still there even after temporarily adding additional memory to the nodes.
During one of the moments that performance degraded, I saw the following messages displayed in the lower left-hand corner:
Read <vRLI FQDN>
Transferring data from <vRLI FQDN>
Connecting to cdn.pendo.io
The last message was the key to resolving this issue. I searched for information about “cdn.pendo.io” and found this article about “Join or Leave the VMware Customer Experience Improvement Program“. At the bottom of the article is a section “What to do next”:
After CEIP is enabled, when a user logs in to vRealize Log Insight, they see a banner at the top of their window that asks whether they want Pendo to collect data based on their interaction with the user interface.
If the user clicks Accept, Pendo collects their data and sends it to VMware
If the user clicks Decline, Pendo does not collect their data
In the General Configuration section of the vRLI configuration, I deselected Usage Reporting (Join the VMware Customer Experience Improvement Program). This resolved the issue for me.
Finally, in this case there was no need to join the VMware CEIP, so this is an acceptable solution for me.
Recently I ran into a problem when I needed to generate a VxRail V7.x support log bundle. The log collection failed after I went to the VxRail log section.
Recently, we missed an alert notification that had been generated in VMware vRealize Log Insight (vRLI) outside of office hours. This caused a disruption that we could have avoided if we had been informed in time. The alert notification had been sent via email, but email is not always checked outside of business hours, which I can well understand. In this blog, I will explain what I have come up with to notice these types of alerts earlier.
Use Case – Increase the ability to notice priority 1 alerts outside of office hours with the available technical resources.
Goal – In addition to the standard vRLI alerts, we also want to have the option available to receive alerts through Microsoft (MS) Teams.
Solution – Use vRLI Webhook to send alerts to MS Teams
Setup – In order to have vRLI alerts sent to MS Teams, we need to set up two things.
Setup a MS Teams Connector to receive alerts
Setup the vRLI Webhook configuration to push alerts
Setup a MS Teams Connector to receive alerts
First, decide in which Teams Channel you want to receive the vRLI Alerts or add a new Teams Channel. I have created a new Channel called VRMware VMware Alerts.
Click on the 3 dots on the right side and select Connectors.
Select Configure Incoming Webhook.
Provide a friendly name, upload an image and create the connector.
After creation, copy the URL to the clipboard; we need this URL later to configure the vRLI Webhook.
Before we move on to vRLI we need to enable the channel notifications. Click once again on the 3 dots on the right side and select Channel notifications > All activity.
Setup the vRLI Webhook configuration to push alerts
Go to the Administration section and open Configuration > Webhook > New Webhook. Choose a name. From the Endpoint drop-down menu select Custom. Paste the Webhook URL that was copied from the MS Teams connector. From the Content Type drop-down menu select JSON, and from the Action drop-down menu select POST. The Webhook Payload is described under the picture.
Webhook Payload
The Webhook Payload was the hardest part to configure. Thanks to my colleague Roger, who figured out what the Webhook Payload layout should look like.
As far as I know, the vRLI Webhook can only send clear-text notifications to MS Teams. It’s possible to use one or more parameters in the payload; for an overview of the parameters, see the picture above. Because the notifications are sent in clear text, it’s not possible to use all parameters. In our case that is not a problem, because MS Teams is not used to replace monitoring software; it is just an additional option to be informed in a timely manner.
I won’t go in depth on how we worked out the layout of the Webhook Payload code. That’s why I’m only sharing the code with you, so you can start testing for yourself.
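Since the payload itself is shown in a screenshot, here is a minimal sketch of what such a payload can look like: a plain MS Teams incoming-webhook text message. The parameter names used here (${AlertName}, ${TriggeredAt}, ${SourceInfo}) are examples from vRLI’s webhook parameter list as I remember it; verify them in your vRLI version before relying on this.

```json
{
  "text": "vRLI Alert: ${AlertName} triggered at ${TriggeredAt} on ${SourceInfo}"
}
```

MS Teams renders the value of the text field as the channel message.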
After completing the Webhook configuration, you may want to test it: press the Send Test button.
Finally Save the Webhook configuration.
Open the MS Teams Channel where the connector was created earlier. You should see here the Test Alert.
The last part is sending a notification to MS Teams when an ESXi host has entered Maintenance Mode.
I have created a vRLI alert with the name “TEST VRMware VRLI Alert: vSphere Host entered Maintenance in vCenter“.
I have decided that I would like to be notified by both email and MS Teams. This can be set under the Trigger Conditions.
If everything is configured correctly we should receive the Send Test Alert Results after sending a test alert.
Save the Alert. Now we are ready for the final test. I put an ESXi host in maintenance mode, and within 5 minutes we received an MS Teams notification. It works!
I hope this blog post will help you configure vRLI to send notifications to MS Teams. Please remember that MS Teams is not a monitoring tool. So be selective with the alerts you forward. I have chosen to only forward alerts that I know need to be acted on as soon as possible.
Something I struggled with in the past is setting the product locker: hassle with scripts and SSH, and trying to avoid host reboots. Today we had to configure another 20 hosts, and I was pointed to a solution using the MOB. So my next question was: how do I automate this? With two simple foreach loops you can set it, check it, and be done!
Setting all hosts in a cluster to a specific path for the product locker:
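The two loops could look roughly like this. A sketch with assumptions: the cluster name and datastore path are placeholders, ExtensionData.UpdateProductLockerLocation is the PowerCLI way to invoke the same method the MOB exposes (available on vSphere 6.7 U1 and later), and UserVars.ProductLockerLocation is the advanced setting used to read the value back:

```powershell
$plPath = '/vmfs/volumes/<datastore>/productLocker'

# Set: call the MOB method UpdateProductLockerLocation on every host in
# the cluster (takes effect without a host reboot)
foreach ($esxi in Get-Cluster '<ClusterName>' | Get-VMHost) {
    $esxi.ExtensionData.UpdateProductLockerLocation($plPath)
}

# Check: read the resulting product locker path back from each host
foreach ($esxi in Get-Cluster '<ClusterName>' | Get-VMHost) {
    $s = Get-AdvancedSetting -Entity $esxi -Name 'UserVars.ProductLockerLocation'
    Write-Host "$($esxi.Name): $($s.Value)"
}
```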