[lug] sad hardware announcement :(

George Sexton gsexton at mhsoftware.com
Wed Jul 19 10:15:38 MDT 2000


The other truth is that SMP under 2.2.x just plain sucks. It doesn't work. I
have several machines with an Intel 440BX chipset, which Alan Cox describes as
"as stable as it gets", that don't work. Here are some typical uptimes:

2.2.14        5-70 days (70 was observed only once; the mean is around 15)
2.2.16        < 2 hours
2.2.17pre1    15 days
2.2.17pre9    4-6 days
2.2.17pre13   < 24 hours

These boards work correctly under NT 4.0. It's not a board problem, it's a
kernel problem. Andrea Arcangeli has at least 4 unapplied patches for 2.2.17
that correct SMP-related issues. I think there are a lot more left.

The real killer is that when an SMP machine locks up it doesn't generate an
oops under 2.2.x. I tried using a serial console with Ingo Molnar's NMI
oopser and it just doesn't work at all on current kernel versions. I guess
that I could try buying a hardware watchdog card...

Andrea A. has said that he would backport the 2.4.x oopser to 2.2.17 but it
hasn't happened yet.

Your report documents high I/O as a trigger condition. In my case, the
machines just flake when they are doing nothing. They run great all day long
under very high CPU and I/O load. At 12:00 AM when they are doing nothing
<boink>.

My advice to you is that if you want to run Linux, do not buy an SMP board.
Spend your money on the fastest processor you can buy.

George Sexton
MH Software, Inc.
Voice: 303 438 9585
http://www.mhsoftware.com


> -----Original Message-----
> From: lug-admin at lug.boulder.co.us [mailto:lug-admin at lug.boulder.co.us]On
> Behalf Of D. Stimits
> Sent: Tuesday, July 18, 2000 9:09 PM
> To: BLUG
> Subject: [lug] sad hardware announcement :(
>
>
> I have used SuperMicro motherboards for years now, and recently picked
> up a PIIIDM3, which at first seemed stable. I'm not sure how many of you
> have noted sporadic reports of some server boxes locking up under high
> i/o, but Redhat and others have made notes on this. I've found out that
> this is a problem with all of the SuperMicro i840 chipset boards as
> well.
>
> The problem isn't entirely high i/o, but this tends to generate the
> conditions that trigger it. The problem is an unknown IO-APIC, which is
> a device responsible for reprogrammable IRQ steering between multiple
> cpu's. When i/o doesn't lock up the system prior to logging failure, a
> note is found as the last entry of /var/log/messages, "kernel:
> unexpected IRQ vector 217 on CPU#0!" (or on CPU#1). In other locations
> of the log, you'll likely see the entry "WARNING: unexpected IO-APIC,
> please mail".
>
> After speaking with SuperMicro, they simply state "it runs fine on NT",
> and they won't help. In the past they were interested in Linux, but
> SuperMicro has apparently changed its mind and is not interested
> anymore. I've contacted Alan Cox to see what else can be done, but for
> now, you should consider all i840 SuperMicro boards incompatible with
> Linux (I also saw very similar reports on FreeBSD and other open source
> o/s's).
>
> The temporary workaround is to boot with the kernel option "noapic".
> This removes irq redirection to the 2nd cpu, meaning all device i/o is
> entirely on the first cpu. Additionally, some PCI devices may end up on
> a different irq than before, or on an unreachable irq.
> There is some explanation of this sort of problem in the kernel source
> Documentation directory: "IO-APIC.txt".
>
> At this point, I am looking for a new motherboard, dual cpu, with 4x
> AGP-pro (I'm looking at high end OpenGL graphics cards) and 64 bit, 66
> MHz PCI slots (required for ultra 160, which I plan to continue using).
> Iwill has a dual slot 2 board, the DCA200, which unfortunately requires
> rdram (expensive and increased latency, with no ability to reuse my
> current pc133 ram), which might be the route to go if nothing else
> appears. Anyone know if this board really is stable under linux? The
> Intel OR840 would be a candidate, but it lacks 64 bit PCI. Does anyone
> know if the Via Apollo Pro 133A chipset is a solution? Do any of the
> 133A boards have 64 bit PCI slots?
>
> And is there anyone who is interested in buying a good non-linux
> motherboard, a PIIIDM3 SuperMicro?
>
> Thanks,
> D. Stimits, stimits at idcomm.com
>
> _______________________________________________
> Web Page:  http://lug.boulder.co.us
> Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
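
For anyone who wants to check their own logs for the two messages quoted
above, here is a minimal sketch in Python. It is only an illustration, not
anything from the kernel developers: the function names scan_log and
has_noapic are made up for this example, and the timestamp and hostname in
the sample lines are invented. Only the message text itself ("unexpected IRQ
vector ... on CPU#...", "unexpected IO-APIC") and the "noapic" boot option
come from the post.

```python
import re

# Patterns built from the two log messages quoted in the original post.
IRQ_VECTOR_RE = re.compile(r"unexpected IRQ vector (\d+) on CPU#(\d+)")
IOAPIC_WARNING = "unexpected IO-APIC"


def scan_log(lines):
    """Scan an iterable of log lines for the two symptoms.

    Returns (irq_hits, ioapic_warnings), where irq_hits is a list of
    (vector, cpu) integer pairs and ioapic_warnings is the number of
    lines containing the IO-APIC warning text.
    """
    irq_hits = []
    ioapic_warnings = 0
    for line in lines:
        match = IRQ_VECTOR_RE.search(line)
        if match:
            irq_hits.append((int(match.group(1)), int(match.group(2))))
        if IOAPIC_WARNING in line:
            ioapic_warnings += 1
    return irq_hits, ioapic_warnings


def has_noapic(cmdline):
    """Return True if a kernel command line string contains 'noapic'."""
    return "noapic" in cmdline.split()


if __name__ == "__main__":
    # Hypothetical sample lines: the timestamp and hostname are invented,
    # only the message text is taken from the post.
    sample = [
        "Jul 18 21:00:01 box kernel: WARNING: unexpected IO-APIC, please mail",
        "Jul 18 21:05:42 box kernel: unexpected IRQ vector 217 on CPU#0!",
    ]
    print(scan_log(sample))
    print(has_noapic("ro root=/dev/sda1 noapic"))
```

To run it against a real machine, pass an open file such as
/var/log/messages to scan_log, and pass the contents of /proc/cmdline (on
kernels that provide it) to has_noapic to confirm the workaround is actually
in effect.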




