VI / vCenter cannot connect to Host, Service mgmt-vmware restart may not restart hostd

Yesterday I ran into a problem after shutting down a legacy SAN: when I rescanned my HBAs (which timed out), I lost the connection to the host.

I tried restarting the usual services from the console:

# service vmware-vpxa restart

# service mgmt-vmware restart

but I would get:

Stopping VMware ESX Server Management services:
   VMware ESX Server Host Agent Watchdog                   [FAILED]

 

I found the following post, http://communities.vmware.com/message/1157237 and this KB article, http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1005566

 

Both have you kill the hostd process and then rename or delete its PID files. Here are the contents of the KB article:

"

Service mgmt-vmware restart may not restart hostd

Symptoms
  • You are using the command service mgmt-vmware restart but it does not finish restarting hostd.
  • The script gets stuck when stopping the service.
  • The SSH session to the ESX host becomes unresponsive.
  • hostd does not restart.

Resolution

You must manually stop the stuck service and restart it.

To stop the service and restart it:

  1. Log in as root to the ESX host command-line via the physical console or via KVM connection.

  2. Navigate to the /var/run/vmware directory:
    # cd /var/run/vmware
  3. Run the following command to list the files vmware-hostd.PID and watchdog-hostd.PID:
    # ls -l vmware-hostd.PID watchdog-hostd.PID

  4. Determine the Process ID (PID) of the management service. View the contents of the vmware-hostd.PID file:
    # cat vmware-hostd.PID
    For example:
    [[email protected]]# cat vmware-hostd.PID
    1191[[email protected]]#
  5. Use the resulting PID to kill the process.
    Caution: Use the kill -9 command with care. It kills the process of the supplied PID without exception or confirmation.
    # kill -9 <PID>
    In this example you run kill -9 1191.

  6. Delete the vmware-hostd.PID and watchdog-hostd.PID files:
    # rm vmware-hostd.PID watchdog-hostd.PID
  7. Restart the management service.
    # service mgmt-vmware restart

"

This seemed to fix it, and without any downtime.
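
For anyone who wants to script it, the KB steps can be sketched roughly as below. This is only a sketch under a few assumptions: the names run, read_pid, unstick_hostd, RUN_DIR and DRY_RUN are my own and not part of the KB, and it defaults to a dry run that only prints the commands instead of running them. It also checks that the PID file holds a plain number before anything is passed to kill -9.

```shell
#!/bin/bash
# Sketch of the KB steps: read the hostd PID, kill it, remove the stale
# PID files, then restart the management service.
# All function and variable names here are hypothetical, not from the KB.

RUN_DIR="${RUN_DIR:-/var/run/vmware}"   # directory named in the KB
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print commands, 0 = run them

# Run a command, or only print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Print the PID stored in a PID file.
# Fails if the file is missing or does not contain a plain number.
read_pid() {
    pid_file="$1"
    [ -f "$pid_file" ] || return 1
    pid_value="$(tr -d '[:space:]' < "$pid_file")"
    case "$pid_value" in
        ''|*[!0-9]*) return 1 ;;
    esac
    echo "$pid_value"
}

# Steps 4 to 7 of the KB, in order.
unstick_hostd() {
    hostd_pid="$(read_pid "$RUN_DIR/vmware-hostd.PID")" || {
        echo "no usable PID in $RUN_DIR/vmware-hostd.PID" >&2
        return 1
    }
    run kill -9 "$hostd_pid"
    run rm -f "$RUN_DIR/vmware-hostd.PID" "$RUN_DIR/watchdog-hostd.PID"
    run service mgmt-vmware restart
}
```

To try it, source the file as root on the service console and call unstick_hostd. It prints what it would do; once the output looks right, set DRY_RUN=0 and call it again to actually run the commands.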

 

Roger L


Written by Roger Lund

VMware and Storage crazy man, vExpert, MN VMUG leader