Dell servers crash when running memtest

Issue:
Maintenance is finally complete on your Dell PowerEdge server and it is ready to be placed back into production. You decide to run a day of memory testing first, so you download the latest version of memtest, 4.0a. You mount it in the machine, either physically or through DRAC, and let it run.

First you try it on your 2950. The 2950 completely freaks out and throws CPU errors:
CPU 1 has an internal error (IERR).
CPU 2 has an internal error (IERR).
CPU 1 is operating correctly.
CPU 2 is operating correctly.
CPU 1 machine check detected.
A fatal IO error detected on a component at
CPU 2 machine check detected.
A fatal IO error detected on a component at
CPU 2 machine check detected.

Strange. This machine has never had any issues before. Next you try it on your R900, booting first over DRAC and then from physical media. Either way, the machine just restarts when it boots from memtest.

So you decide to run memtest 3.5b instead of 4.0a on both machines. Wow, it works.

Solution:
When running memtest on your Dell PowerEdge machines (at least the 2950 and R900), use 3.5b rather than 4.0a.