Vincent Jansen – Page 2

December 17, 2023December 27, 2023

Forward Aria Operations alerts as SNMPv3 traps to a Linux collector

Recently a question came up whether it is possible to forward Aria Operations alerts to a third-party monitoring tool using SNMPv3. I had no experience with this until that point. During this exploration it turns out that there is a lot of information to be found about configuring a SNMPv3 target. It turned out to be quite a search to find the right information to be able to decode the SNMPv3 traps and write them readable away in a log file. I tested with several types of Linux, including Raspberry Pi and Rocky Linux 9, but in the end I succeeded with Alpine Linux. An added advantage of Alpine is that it is lightweight.

In this blog post I will describe the steps taken to send an SNMPv3 trap from Aria Operations to a Linux collector where the traps are stored readably in a log file. For the test, I used the following configuration:

VMware vCenter 8.0 update 2a
VMware ESXi 8.0 update 2
VMware Aria Operations version: 8.14.1
Linux Alpine Standard 3.19.0 x86_64
NET-SNMP 5.9.4

First step: Configure Aria Operations SNMP Trap Instance (Configure > Alerts > Outbound Settings > Add)

The name of the SNMP instance in my lab is “SNMP-Alpine-Linux”

The data highlighted in green later is needed later when configuring the Linux SMMP trap collector. For both the Authentication Password and Privacy Password, we use the same password, which is “T3st123!”.

Second step: Install and configure the Linux SNMP trap collector.

Install Alpine Linux. Click here for the Alpine Wiki.

Once Alpine is installed, NET-SNMP can be installed. Note that from version NET-SNMP 5.9.x onwards, AES256 can be used, which is needed to decode SNMPv3 traps.

Install NET-SNMP

apk update && apk add net-snmp net-snmp-tools

Modify /etc/snmp/snmp.conf

In this step we will need the green marked higlighted data from the Aria Operations SNMP instance.

#######################################################################
# Example configuration file for snmptrapd
#
# No traps are handled by default, you must edit this file!
#
# authCommunity   log,execute,net public
# traphandle SNMPv2-MIB::coldStart    /usr/bin/bin/my_great_script cold
########################################################################

authCommunity log,execute,net public

#AriaOps

#createUser -e <ENGINE-ID> <USERNAME> <AUTHENTICATION PROTOCOL> <AUTHENTICATION PASSWORD> <PRIVACY PROTOCOL> <PRIVACY PASSWORD>

createUser -e 687517EA587080304ED28C73A0FB0B676570 ariaops SHA-256 T3st123! AES-256 T3st123!

######################################
authUser log,execute,net ariaops

format2 %.4y-%.2m-%.2l %.2h:%.2j:%.2k %B [%b]:\n%v\n

outputOption s

Modify /etc/conf.d/snmptrapd

Here we unmark #OPTS=”${OPTS} -Lf /var/log/snmptrapd.log” so snmp traps will be logged in /var/log/snmptrapd.log.

# extra flags to pass to snmptrapd
OPTS=""

# ignore authentication failure traps
#OPTS="${OPTS} -a"

# log messages to specified file
OPTS="${OPTS} -Lf /var/log/snmptrapd.log"

# log messages to syslog with the specified facility
# where facility is: 'd' = LOG_DAEMON, 'u' = LOG_USER, [0-7] = LOG_LOCAL[0-7]
#OPTS="${OPTS} -Ls d"

Restart snmptrapd

/etc/init.d/snmptrapd restart of systemctl restart snmptrapd

Create snmptrapd.log

touch /var/log/snmptrapd.log

Download the Aria Operations MIB files from the Aria Operations appliance. See VMware KB2118780. This is needed to convert the Aria Operations in readable alerts in the snmptrapd.log. See the next two Aria Operations test traps why the Aria Operations MIB files are needed.

First screenshot, the Aria Operations MIB files are not imported in Alpine. It cannot be seen that these alerts are VMware alerts

Next screenshot, the Aria Operations MIB files are imported in Alpine. Now it is clear that these are VMware events.

#Download the Aria Operations MIB files from this directory 
/usr/lib/vmware-vcops/user/plugins/outbound/vcops-snmptrap-plugin/mib

Upload the Aria Operations MIB files from the Aria Operations appliance to Alpine

#Upload the Aria Operations MIB files into this directory
/usr/share/snmp/mibs

Create /etc/snmp/snmp.conf

touch /etc/snmp/snmp.conf

Add the custom Aria Operations MIB files to snmp.conf

#Add custom Aria Operations MIB files
mibs +VMWARE-ROOT-MIB
mibs +VMWARE-VROPS-AGENTCAP-MIB
mibs +VMWARE-VROPS-MIB
mibs +VMWARE-PRODUCTS-MIB

Restart snmpd and snmptrapd

/etc/init.d/snmpd restart
/etc/init.d/snmptrapd restart

Auto start snmpd and snmptrapd at boot

rc-update add snmpd default
rc-update add snmptrap default

Third step: Test if Aria Operations SNMPv3 traps are received on the Linux SNMP collector

Test 1: Sending a test trap from the Aria Operations SNMP instance.

Note that before testing both password have been entered. Otherwise the test will fail.

Hit the “TEST” button. The following message will appear.

Check the snmptrapd.log (tail -f /var/log/snmptrapd.log) and search for “Test Notification Rule”

We see “Test Notification Rule” in the snmptrad.log. Test is succeeded.

Test2: Disconnect a ESXi host from vCenter

I have created a notification that wil send a SNMP trap to the LinuxSNMP collector.

Disconnect ESXi host from vCenter:

Alert is raised in Aria Operations

Check the snmptrapd.log (tail -f /var/log/snmptrapd.log) and search for “Host disconnected from vCenter”

We see “Host disconnected from vCenter” in the snmptrad.log. Test is succeeded.

From here, the alerts can move on to the third-party monitoring tool. That is beyond the scope of this test. Hopefully this blog post is useful to you.

November 8, 2023November 22, 2023

Upgrade Aria Operations 8.12 to 8.14 fails

There is a known issue occurring with some Aria Ops upgrades from version 8.12 to 8.14. During the upgrade Aria Ops is hanging for some time and lose access to the Admin UI. The Admin UI does not come back. The issue is related due the Photon OS upgrade.

VMware Support has confirmed that Aria Ops 8.14 will be replaced shortly by Aria Ops 8.14 hotfix 1. Unlike the normal upgrade process, it will be possible to upgrade directly from Aria Ops 8.12 to 8.14 hotfix 1.

VMware Support’s advice is to wait a while before upgrading until Aria Ops 8.14 hotfix 1 is available.

Update November 22, 2023

Last night VMware released Aria Operations Upgrade PAK | 2023 | Build 22611353 . Read the complete release note here.

October 25, 2023October 25, 2023

UPDATE: Aria Operations for Logs persistent static route on second network interface on behalf of Log Forwarding

There is an update on the blog post “vRealize Log Insight persistent static route on second network interface on behalf of Log Forwarding”. Beginning with version Aria Operations for Logs 8.14, the VAME_CONF_NET is no longer available, it’s removed in this version.

The information in the orignal blog post can still be used. Only now you need to manually create and predefine the 10-eth1.network configuration. See example at the bottom of the original post.

There are two important things to consider:

Keep in mind that multic-nic configurations in Aria Ops for Logs is not officially supported
After every update the 2nd interface must be recreated and configured again

Orginal post:

Recently I wanted to test whether it is possible to configure vRealize Log Insight (vRLI) log forwarding to a second network interface to reach a log target in another network segment that could not be reached from the default vRLI appliance ip address.

The first step is adding a second network interface to the appliance. In this example we use the following network configuration.

VMnic1
Vlan10, IP 10.1.1.10, Subnetmask 255.255.255.0, Gateway 10.1.1.1
VMnic2
Vlan20, IP 20.2.2.20, Subnetmask 255.255.255.0, Gateway 20.2.2.1
In this example the log forwarding target ip address is 30.3.3.233

To configure the second network interface open a SSH session to vRLI appliance. Move to /opt/vmware/share/vami/ and run the network configuration script. vami_config_net. Eth1 is now also available for configuration. First select ‘0’ for a configuration overview. In the results is an error on eth1 displayed. This error keeps us from being able to configure eth1.

After some ‘Trial & Error’ research I noticed the following error during reconfiguring eth1 “can’t open /etc/systemd/network/10-eth1.network”

In the directory /etc/systemd/network is the file “10-eth1.network” not present. The name of the file could be different then here in this example. It depends on the number of network interfaces. I fixed this issue by creating this file manual.

touch /etc/systemd/network/10-eth1.network
chmod 644 /etc/systemd/network/10-eth1.network
Config the second network interface. Go to the directory /opt/vmware/share/vami/ and run the network configution script. vami_config_net. Eth1 is now also available for configuation.
Check the new configuration by selecting option 0. If Ok press 1 to exit
Restart the network, systemctl restart systemd-networkd.service

Now this issue is fixed we can move on to configure the persistant static route for vRLI log forwarding.

Edit /etc/systemd/network/10-eth1.network

The file shlould look like this before editing:

[Match]
Name=eth1
[Network]

Gateway=10.1.1.1
Address=20.2.2.20/24
DHCP=no

[DHCP]
UseDNS=false

Now add route information at the end of the file:

[Match]
Name=eth1
[Network]

Gateway=10.1.1.1
Address=20.2.2.20/24
DHCP=no

[DHCP]
UseDNS=false
[Route]
Gateway=20.2.2.1
Destination=30.3.3.233/24

Save the file and restart the network service.

systemctl restart systemd-networkd.service

Check if the new route is present.

route -n

Test if you reach the destionation from CLI. I used Syslog over UDP port 514.

nc -v 30.3.3.233 514

Answer if configuration is working:

[30.3.3.233 514] open

The last step is to configure the vRLI Log Forwarding Destination.

Send a test event and check if the event is received by the target.

Keep in mind that multic-nic configurations in vRLI are not officially supported.

Also credits to this blog post that pushed me in the right direction.

April 17, 2023April 17, 2023

vRealize Log Insight Admin Alert: SSL Certificate Error (Host =vrli.vrmware.nl)

This time a short post about a vRealize Log Insight (vrLI) configuration issue that took too long to solve. In the end, the solution is simple after I found the documentation. Finding the right documentation was the hardest part.

Just briefly the reason of this setup. I want ESXi hosts to use Syslog over SSL to send logging encrypted to vRLI.

While adding the vCenter I configured the hosts to use SSL.

After configuring, everything seemed to work fine, until I got a vRLI Admin mail with the following alert:

This alert is about your Log Insight installation on https://vrli.vrmware.nl/

SSL Certificate Error (Host = vrli.vrmware.nl) triggered at 2023-04-16T09:23:53.412Z

This notification was generated from Log Insight node (Host = vrli.vrmware.nl, Node Identifier = de568ad3-d4e3-7f8a-b543-cef17632af11).

Syslog client esx01.vrmware.nl disconnected due to a SSL handshake problem. This may be a problem with the SSL Certificate or with the Network Time Service. In order for Log Insight to accept syslog messages over SSL, a certificate that is validated by the client is required and the clocks of the systems must be in sync.

Log messages from esx01.vrmware.nl are not being accepted, reconfigure that system to not use SSL or see Online Help for instructions on how to install a new SSL certificate .

This message was generated by your Log Insight installation, visit the Documentation Center for more information.

Time couldn’t be the issue in my case. So it had to be a certificate issue. The issue was caused by the vRLI certificate that wasn’t in the ESXi host truststore.

Per ESXi host, the following steps should be taken to solve the issue. Step 3 is only a verification step.

openssl s_client -connect [FDQN or IP vRLI]:1514 < /dev/null | openssl x509 -outform PEM >> /etc/vmware/ssl/castore.pem
Example: openssl s_client -connect vrli.vmware.nl:1514 < /dev/null | openssl x509 -outform PEM >> /etc/vmware/ssl/castore.pem
esxcli system syslog reload
esxcli network ip connection list | grep 1514

If ESXi hosts have the vRLI certificate in their truststore, the vRLI Admin mail (1x per day per vRLI node) should no longer occur.

Here is the link to the VMware documentation. This documentation is actually for vRLI Cloud which is a different product than standard vRLI although they overlap in some areas. The documentation for vRLI will be updated according to VMware GSS.

So this is probably why the vRLI documentation on this topic was so hard to find. Hopefully this blog post will save you a lot of time.

March 10, 2023May 24, 2023

vRealize Log Insight persistent static route on second network interface on behalf of Log Forwarding

The first step is adding a second network interface to the appliance. In this example we use the following network configuration.

VMnic1
Vlan10, IP 10.1.1.10, Subnetmask 255.255.255.0, Gateway 10.1.1.1
VMnic2
Vlan20, IP 20.2.2.20, Subnetmask 255.255.255.0, Gateway 20.2.2.1
In this example the log forwarding target ip address is 30.3.3.233

After some ‘Trial & Error’ research I noticed the following error during reconfiguring eth1 “can’t open /etc/systemd/network/10-eth1.network”

touch /etc/systemd/network/10-eth1.network
chmod 644 /etc/systemd/network/10-eth1.network
Config the second network interface. Go to the directory /opt/vmware/share/vami/ and run the network configution script. vami_config_net. Eth1 is now also available for configuation.
Check the new configuration by selecting option 0. If Ok press 1 to exit
Restart the network, systemctl restart systemd-networkd.service

Now this issue is fixed we can move on to configure the persistant static route for vRLI log forwarding.

Edit /etc/systemd/network/10-eth1.network

The file shlould look like this before editing:

[Match]
Name=eth1
[Network]

Gateway=10.1.1.1
Address=20.2.2.20/24
DHCP=no

[DHCP]
UseDNS=false

Now add route information at the end of the file:

[Match]
Name=eth1
[Network]

Gateway=10.1.1.1
Address=20.2.2.20/24
DHCP=no

[DHCP]
UseDNS=false
[Route]
Gateway=20.2.2.1
Destination=30.3.3.233/24

Save the file and restart the network service.

systemctl restart systemd-networkd.service

Check if the new route is present.

route -n

Test if you reach the destionation from CLI. I used Syslog over UDP port 514.

nc -v 30.3.3.233 514

Answer if configuration is working:

[30.3.3.233 514] open

The last step is to configure the vRLI Log Forwarding Destination.

Send a test event and check if the event is received by the target.

Keep in mind that multic-nic configurations in vRLI are not officially supported.

Also credits to this blog post that pushed me in the right direction.

February 9, 2023February 9, 2023

vRealize Log Insight failed to start Journal Service

Recently, I started an ssh session on a vRealize Log Insight (vRLI) appliance. After entering name and password a message was prompted to change the password immediately. After I did this, the ssh session was dropped and I had to log in again. The password appeared to be unchanged. I repeated this step several times and then decided to restart the vRLI appliance. Hoping this would fix the problem.

After waiting for some time, the appliance does not appear to come online and appears to be in a loop during startup. The screenshot below shows that the Journal Service does not want to start.

I have seen similar issue more often and I suspect a partition has filled up. I couldn’t log in because the appliance won’t start. In the following action plan, I explain step by step how I solved this problem.

Action Plan:

1, Take a snapshot
2. Edit the VM options in the next step before restarting the vm
3. Change Boot Options, Change Boot Delay in 1000 milliseconds and Force Bios Setup

4. Restart the VM
5. Bios Exit Discarding Changes

6. Press ‘E’ direct during the start
7. Type “rw init=/bin/bash” at the end of the line and press “F10”

8. Type “df -h”. In our case we see that the root partition is full

9. Let’s clean up some old audit and authentication logs

10. Let’s check the root partition again. Type “df -h”.

11. Reboot the appliance and we’re done.

After cleaning up the root partition, the appliance started normally. I was also able to change the root password without any problems. Don’t forget change the Boot Options, Change Boot Delay back in 0 milliseconds and remove the snapshot if everything is working fine and you’re happy with this solution.

You can always open a VMware SR if you need help.

November 15, 2022November 15, 2022

Cannot install the vCenter Server agent(HA) service. Unknown installer error (UPDATE)

Almost two years ago I wrote a blogpost about a failure during installing the vCenter Server agent(HA) service. This post is one of most read articles on my blog. You can find the orginal post here. Recently, I ran into this problem again. This time I could not solve the problem using my earlier post.

I have found a workaround to this problem that is easy to implement and works well. I have been able to use it successfully several times recently.

Put the ESXi host in Maintenance Mode
SSH to the ESXi host
esxcli software vib remove -n vmware-fdm (no reboot needed)
Wait a few minutes and the result should like this:

Removal Result
   Message: Operation finished successfully.
   Reboot Required: false
   VIBs Installed:
   VIBs Removed: VMware_bootbank_vmware-fdm_7.0.3-19193900
   VIBs Skipped:

Check if the vCenter Server agent is removed
esxcli software vib list | grep fdm (result should be empty)
Exit Maintenance Mode, HA agent will be installed
Check if the vCenter Server agent is installed
esxcli software vib list | grep fdm (result should be like this:
vmware-fdm 7.0.3-19193900 VMware VMwareCertified 2022-10-12
Done

November 15, 2022November 15, 2022

Slow startup and hung during reboot Ubuntu 22.04 LTS on Maxtang EHL30

Last week was VMware Explore Europe in Barcelona, Spain. Here vExperts were able to pick up a mini PC, the Maxtang EHL30. The mini PC was offered by the vExpert community and Cohesity. This as a gift for all the work vExperts do for the vCommunity.

The mini PC still had to be fitted with DDR4 memory and an M.2 SATA SSD. Since I wanted to keep it low budget I bought memory and an SSD on Amazon for about €50.

After inserting the memory and SSD, the Maxtang boots up. I have installed Ubuntu 22.04 LTS and used Rufus to create a bootable Ubuntu install USB media. After the setup was completed the system hung during the reboot. After a coldboot I observed that the startup was also take some minutes, much to slow in my opinion. After searching for a while I found the solution for the slow startup and hung during reboot/shutdown.

enter command “sudo vi /etc/modprobe.d/blacklist.conf” in terminal
add a new line “blacklist pinctrl_elkhartlake” , save and exit editor mode
enter command “update-initramfs –u” in terminal
reboot system, coldboot to apply change

Now the Maxtang EHL30 with Ubuntu 22.04 LTS Reboot and startup in a few seconds.

October 26, 2022October 26, 2022

vRealize Log Insight 8.10 bad performance

For a customer, I deployed a new vRealize Log Insight 8.10 cluster as a test. After initial installation and configuration of the first vRLI node I experienced a performance that was particularly bad. It wasn’t always this bad, but it frequently occurred.

The bad performance initially revealed itself when choosing one of the options in the menu on the left side of the screen. It seemed like the whole screen was frozen. After 10 till 30 seconds the requested information appeared.

I decided to first complete the cluster setup. After adding some vCenters and importing Content Packs, technically everything worked as designed. The bad performance was still there even after temporarily adding additional memory to the nodes.

During one of the moments that the performance was decreased I saw something in the lower left-hand corner. There were the following messages displayed:

Read <vRLI FQDN>
Transferring data from <vRLI FQDN>
Connecting to cdn.pendo.io

The last bullet was the key to resolve this issue. I searched some information about “cdn.pendo.io” and found this article about “Join or Leave the VMware Customer Experience Improvement Program“. At the bottom of the article is a section “What to do next”

After CEIP is enabled, when a user logs in to vRealize Log Insight, they see a banner at the top of their window that asks whether they want Pendo to collect data based on their interaction with the user interface.

If the user clicks Accept, Pendo collects their data and sends it to VMware
If the user clicks Decline, Pendo does not collect their data

In the General Configuration section of the vRLI configuration I deselected the Usage Reporting (Join the VMware Customer Experience Improvement Program). This resolved the issue for me.

Finally, in this case there was no need to join VMware CEIP. So, therefore it is an acceptable solution for me.

July 1, 2022July 1, 2022

VxRail V7.x log collection failed

Recently I ran into a problem when I needed to generate a VxRail V7.x support log bundle. The log collection failed after I went to the VxRail log section.

Cluster > Configure > VxRail > Troubleshoot > Create

After hitting the Create button I selected the required data. In my case the VxRail Manager data was enough.

When start to Generate the log bundle nothing happened. The Create button was greyed out for a few minutes and no log bundle was generated.

Solution:

Take a snapshot of the VxRail Manager vm
Open a ssh session to the VxRail manager
Login with mystic, switch to root
Cd /tmp/mystic/dc
Look for old log bundles for example VxRail_Support_Bundle_12d4235c-1458-d47f-e68b-a3221bca3bc3_2022-01-01_12_00_00.zip
Delete all old log bundles in /tmp/mystic/dc
Systemctl restart runjars
Systemctl restart vmware-marvin
Wait about 5-10 min.
Try again collecting VxRail manager logs

Now the VxRail Support log bundle was successfully generated.

If you are not familiar enough you can always open a Dell Support request. They will then help you further.