How to restrict the number of Physical CPUs presented to ESXi

Issue:
There are times when you might want or need to limit the number of Physical CPUs (PCPU) presented to ESXi.  It can be a licensing issue, or perhaps you just need to run some testing.  Whatever the reason, physically removing the CPUs is not an option, so you need a way to tell ESXi to ignore the excess Physical CPUs.

Resolution:

  1. Modify the VMkernel.Boot.maxPCPUS setting.
  2. This is listed under Configuration -> Software -> Advanced Settings -> VMkernel -> Boot.
  3. The number you put here depends on your configuration: it is the number of physical processors you want to keep multiplied by the cores per processor.  For instance, if you have a 4 proc by 4 core system and you want to reduce the number of physical processors to 2 and maximize the logical processors, you would put 8 as the number (2×4).
  4. If you have a 4 proc by 2 core system and you want to reduce the number of physical processors to 2 and maximize the logical processors, you would put 4 as the number (2×2).
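
The arithmetic in steps 3 and 4 can be sketched as a small helper.  This is only an illustration: the function name max_pcpus and its parameters are made up for this example and are not part of any VMware tool.  It assumes every socket has the same number of cores and does not account for hyperthreading.

```python
def max_pcpus(sockets_to_keep: int, cores_per_socket: int) -> int:
    """Value to enter for VMkernel.Boot.maxPCPUS (hypothetical helper).

    Assumes identical sockets and ignores hyperthreading.
    """
    if sockets_to_keep < 1 or cores_per_socket < 1:
        raise ValueError("both arguments must be positive")
    return sockets_to_keep * cores_per_socket


print(max_pcpus(2, 4))  # 4 proc by 4 core system limited to 2 processors: 8
print(max_pcpus(2, 2))  # 4 proc by 2 core system limited to 2 processors: 4
```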

 

Search terms followed:

  • make server use less processors
  • vmware utilize fewer sockets
  • only want vmware to use two sockets
  • only want server to use 2 processor sockets
  • how to turn off a processor socket
  • esx server has too many sockets
  • esx server mask cpu socket
  • limit cpu cores
  • vmware limit cpu
  • Multiple cpu limit in VMware
  • vmware limit pcpu
  • vmware limit physical cpu

ESXi host becomes disconnected after offlining shared LUN

Issue:
A backup is run on a VM to an attached drive on shared storage which is thin provisioned.  That storage has not been monitored closely and runs out of space which forces the LUN offline.  Suddenly the ESXi hosts start randomly disconnecting from Virtual Center.  You cannot connect directly to the host through the client, but you can access the host through SSH.  You attempt to connect the host to Virtual Center, but it also fails.

Resolution:

  1. Connect to host through SSH and run /sbin/services.sh restart.
  2. Now run /sbin/services.sh restart again
  3. Once again, run /sbin/services.sh restart
  4. You will now be able to connect the host to Virtual Center.

Dell servers crash when running memtest

Issue:
Maintenance is finally complete on your Dell PowerEdge server and it is ready to be placed back into production.  You decide to run through a day of memory testing, so you download the latest version of memtest, 4.0a.  You mount it in the machine either physically or through DRAC and let it run.

First you try it on your 2950.  The 2950 completely freaks out and throws CPU errors.
CPU 1 has an internal error (IERR).
CPU 2 has an internal error (IERR).
CPU 1 is operating correctly.
CPU 2 is operating correctly.
CPU 1 machine check detected.
A fatal IO error detected on a component at
CPU 2 machine check detected.
A fatal IO error detected on a component at
CPU 2 machine check detected.

Strange.  This machine has never had any issues before.  Next you try it on your R900.  You boot over DRAC first and then from physical media.  This machine just restarts when booted off of memtest.

So you decide to run memtest 3.5b instead of 4.0a on both machines.  Wow, it works.

Solution:
When running memtest on your Dell PowerEdge machines, use 3.5b.

The website wants to install the following add-on: ‘DRAC5 Virtual Media Active-X plugin’

Issue:
Internet Explorer keeps prompting you with the following error when you try to load the virtual media plugin:

The website wants to install the following add-on: ‘DRAC5 Virtual Media Active-X plugin’ from ‘Dell Inc’.  If you trust the website and the add-on and want to install it, click here…

You choose Install, but it logs you off.  You are prompted again and it logs you off.  It's a vicious loop.

Solution:
Add the DRAC IP to Local Intranet Sites.  If that doesn’t work, try enabling “Download unsigned ActiveX controls”.  Of course turn that option off once you get DRAC working.

DRAC5 is stuck trying to boot from virtual media

Issue:
Sometimes if a machine crashes while virtual media is attached, it will continue to attempt to boot from virtual media even if no media is present.  This creates long boot times at the Dell splash screen and after the Remote Access Controller is detected.  If you have a Linux variant installed, you will also see it hang when USB is being initialized.  In this instance, the DRAC needs to be reset.

Solution:

  1. Install the DRAC Windows tools from ftp://ftp.dell.com/sysman/
  2. The file will start with “OMI-DRAC”.  As of this article, the current version is OMI-DRAC-Dell-Web-WIN-620-32-677.exe
  3. Run the install and extract it to a temp directory.  Default is openmanage.
  4. Browse to the temp directory where the file was extracted and browse to windows\drac
  5. Run the DRAC msi
  6. After the install completes, open up the command prompt and go to c:\program files\dell\sysmgt\rac5.  If it is installed on a 64 bit machine, the application will be in Program Files (x86)
  7. Type in “racadm -r <IP> -i racreset”.
  8. You will be prompted for username and password and the DRAC will reset.  It typically takes 1-2 minutes.
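
If you have several DRACs to reset, the command from step 7 can be wrapped in a short script.  This is a rough sketch, not a Dell-provided tool: the function names are made up for this example, the IP address is a placeholder, and it assumes racadm can be found on the PATH (or that you run the script from the rac5 directory).

```python
import subprocess


def racreset_command(drac_ip: str) -> list[str]:
    """Build the argument list for the reset command shown in step 7."""
    return ["racadm", "-r", drac_ip, "-i", "racreset"]


def reset_drac(drac_ip: str) -> int:
    """Run racadm; the -i flag makes it prompt for username and password."""
    return subprocess.run(racreset_command(drac_ip)).returncode


# Example (192.0.2.10 is a placeholder address, not a real DRAC):
# reset_drac("192.0.2.10")
```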

More information on DRAC Commands:
http://support.dell.com/support/edocs/software/smdrac3/drac5/OM53/en/ug/racugaa.htm

How to update DRAC5 firmware using Web GUI

  1. Go to the Dell FTP site – ftp://ftp.dell.com/sysman/
  2. Search for “f_drac5”
  3. Find the latest edition.  As of this article, it is f_drac5v160_A00.exe
  4. Download, run and extract the file, which will contain a d5 file
  5. Open up your DRAC5 interface and go to Remote Access – Update (or DRAC Update)
  6. Click the Browse button and choose the d5 file extracted in step 4.
  7. Click Update and don’t interrupt the process.

Dell Firmware Updates – How to create a boot disk

There is nothing more miserable than a Dell firmware upgrade.  This includes BIOS, DRAC, PERC controllers and any other hardware that needs to be upgraded.  For those of you that are spoiled using HP technologies, you will soon find out what I mean…

HP has the SmartStart DVD, which is a well compiled single piece of bootable media.  Dell has the Systems Build and Update Utility (SBU) and Server Update Utility (SUU), which depend on each other to make this work.  Yes, that means you need two DVDs in order to flash your BIOS.  Now let's start this miserable task…

The Dell Drivers and Downloads site is confusing and convoluted.  You never really know what utility you need so you download everything and try to sort out the mess later.  The goal of this guide is to remove the confusion.

  1. Download the SBU.  You can typically find this under Systems Management on your system's download page.  The iso name starts with “cdu”.  I would highly recommend going to the Dell FTP site ftp://ftp.dell.com/sysman/ and searching for cdu.  The latest version, as of this writing, is cdu_1.6_core_173_A01.
  2. Download the SUU.  This is listed under the Systems Management page as “Server Update Utility ISO”.  If it is not listed, remove the OS refinement.  Read the readme and verify the SUU applies to your machine.  Download all available parts and combine them together using “copy /b”.  More information here.
  3. Burn both DVDs and boot from the SBU.  This disk can take a while to run so be patient.  Go through the steps and when it asks you to select the repository location, choose CD/DVD and enter the SUU.
  4. For a more detailed run through of the actual firmware update process using DRAC, go to http://www.getyournerdon.com/2010/07/dell-firmware-updates-with-sbu-and-suu/.  A video is also provided which is extremely helpful.
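
The “copy /b” step in step 2 is just a byte-for-byte concatenation of the downloaded parts in order.  If you are not on Windows, something like the following sketch does the same job.  The function name and the file names in the comment are made up for illustration; use the actual part names from your download.

```python
import shutil
from pathlib import Path


def combine_parts(parts, output):
    """Concatenate split download parts, in the order given, into one file.

    Returns the size of the combined file in bytes.
    """
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # streams, so large ISOs are fine
    return Path(output).stat().st_size


# Example with hypothetical file names:
# combine_parts(["suu.iso.001", "suu.iso.002"], "suu.iso")
```

The order of the parts matters, so list them explicitly rather than relying on however the directory happens to sort.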

How to force detach DRAC5 virtual media

Issue:
Sometimes if DRAC5 virtual media is acting up (which is more often than not), you need to force terminate the session.  Unfortunately, there is no way to detach using the System – Media – Configuration tab since you cannot apply changes.

Solution:
You must terminate all sessions from the Remote Access – Session Management page.  Click the trash can next to the hung session and your virtual media will be detached once again.

High CPU utilization after moving Exchange database

Issue:
After moving an Exchange 2010/2007 database to a new location, the CPU starts spiking with high utilization. msftefd.exe is using most of the CPU cycles and Event IDs 109 and 108 are logged in the System Log by the Search Indexer.

Solution:
After initial troubleshooting, you likely made it to the Microsoft link entitled “The new Search in Exchange Server 2007“.  This article says that the Exchange Service Indexer triggers a crawl in three instances:

  1. Adding a new Mailbox Database
  2. Adding a new Mailbox
  3. Receiving a new email.

What it doesn’t say specifically is that moving a database to a new location has the same content indexing consequences as adding a new mailbox database.  This will trigger a full crawl of the moved database, resulting in high CPU utilization.  The process will not end until an MSExchange Search Indexer Event 110 is logged in the system log.

In a real world environment (~1000 users), it will take approximately 1.6 minutes to process 1 gigabyte.

Example: If you move a 100GB database, it will take approximately 2.7 hours to index (100 × 1.6 = 160 minutes).  During this time, msftefd.exe will be utilizing most of the CPU cycles.
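
To estimate the crawl time for your own database size, the same arithmetic can be sketched in a few lines.  The function name is made up for this example, and the 1.6 minutes per gigabyte rate is simply the figure quoted above; your hardware and load will likely give a different rate.

```python
MINUTES_PER_GB = 1.6  # rate quoted above for a ~1000 user environment


def index_hours(db_size_gb: float, minutes_per_gb: float = MINUTES_PER_GB) -> float:
    """Rough full-crawl time in hours for a moved database (hypothetical helper)."""
    return db_size_gb * minutes_per_gb / 60


for size_gb in (50, 100, 250):
    print(f"{size_gb}GB: about {index_hours(size_gb):.1f} hours")
```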

How to stop Exchange message tracking logs from truncating prematurely

Issue:
The Exchange server keeps deleting message tracking logs before they are scheduled to be deleted.  They should stay on the server for 30 days, but they often delete at the 10-15 day mark.  The Log directory currently has 1GB of logs.

Solution:
Exchange 2007 and 2010 place retention policies on message tracking logs in three ways: Age, Directory Size and File Size.  In this situation the problem is being caused by the Directory Size limit.

  1. First determine what your message tracking directory size settings are currently at by running ” Get-TransportServer | select name,*message* ” from the management shell.
  2. Look at the number next to MessageTrackingLogMaxDirectorySize.  In 2007, the default is 250MB.  In 2010, the default is 1GB (1024MB).
  3. Since the log directory is currently maxed out at 1GB, this directory size limitation needs to be raised to 2GB.
  4. From the management shell, run ” Get-TransportServer "transportservername" | Set-TransportServer -MessageTrackingLogMaxDirectorySize 2048MB ”
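
If you want to sanity check the new limit, a back-of-the-envelope estimate is to scale the current directory size by how much longer the logs need to survive.  The sketch below is an assumption-laden illustration, not an Exchange tool: the function name is made up, and it assumes log volume is steady from day to day and that the directory size limit is the only thing trimming the logs.

```python
import math


def needed_dir_size_mb(current_mb: float, days_retained: float,
                       target_days: float = 30) -> int:
    """Directory size (MB) needed to keep target_days of logs (hypothetical helper)."""
    return math.ceil(current_mb * target_days / days_retained)


print(needed_dir_size_mb(1024, 15))  # logs currently last 15 days: 2048
print(needed_dir_size_mb(1024, 10))  # logs currently last 10 days: 3072
```

With logs lasting about 15 days at 1GB, 2048MB lines up with the 30 day target.  At the 10 day end of the range the estimate comes out closer to 3GB, so it may be worth checking the directory again after the change.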