[lug] sad hardware announcement :(
D. Stimits
stimits at idcomm.com
Thu Jul 20 00:19:56 MDT 2000
George Sexton wrote:
>
> I have been running SMP under NT for about 4 years. My old machine (a Tyan
> Dual Pentium 133) rarely crashed. The same BX boards that don't work with
> Linux are running NT 4.0 great. I have had maybe one blue screen in 3
> months.
>
> With your SMP machine based on a BX chipset, what do your up-times look
> like?
About 15 minutes to a few hours.
>
> > -----Original Message-----
> > From: lug-admin at lug.boulder.co.us [mailto:lug-admin at lug.boulder.co.us]On
> > Behalf Of D. Stimits
> > Sent: Wednesday, July 19, 2000 5:36 PM
> > To: lug at lug.boulder.co.us
> > Subject: Re: [lug] sad hardware announcement :(
> >
> >
> > George Sexton wrote:
> > >
> > > The other truth is that SMP under 2.2.x just plain sucks. It doesn't
> > > work. I have several machines with an Intel 440BX chipset, which Alan
> > > Cox describes as "as stable as it gets", that don't work. Here are some
> > > typical uptimes:
> >
> > I have another SMP machine with the BX chipset, and it *never* crashes
> > under linux. This same machine dies several times a day under NT 4,
> > Win2K, and 98. Every imaginable driver, the sound card, and multiple
> > video cards have been used. It simply won't die under linux SMP 2.2.x,
> > nor will it stay up in any windows environment (I'm lucky if it stays up
> > long enough to shutdown...which tends to blue screen). So I guess I'd
> > say I'm a fan of linux SMP relative to anything MS runs. Oh well, c'est
> > la vie.
> >
> > >
> > > 2.2.14       5-70 days (70 was observed only once; the mean is around 15)
> > > 2.2.16       < 2 hours
> > > 2.2.17pre1   15 days
> > > 2.2.17pre9   4-6 days
> > > 2.2.17pre13  < 24 hours
> > >
> > > These boards work correctly under NT 4.0. It's not a board problem, it's
> > > a kernel problem. Andrea Arcangeli has at least 4 unapplied patches for
> > > 2.2.17 that correct SMP-related issues. There are a lot more left, I think.
> > >
> > > The real killer is that when an SMP machine locks up it doesn't generate
> > > an oops under 2.2.x. I tried using a serial console with Ingo Molnar's NMI
> > > oopser and it just doesn't work at all on current kernel versions. I guess
> > > that I could try buying a hardware watchdog card...
> > >
> > > Andrea A. has said that he would backport the 2.4.x oopser to 2.2.17, but
> > > it hasn't happened yet.
> > >
> > > Your report documents high I/O as a trigger condition. In my case, the
> > > machines just flake when they are doing nothing. They run great all day
> > > long under very high CPU and I/O load. At 12:00 AM, when they are doing
> > > nothing, <boink>.
> > >
> > > My advice to you is that if you want to run Linux, do not buy an SMP
> > > board. Spend your money on the fastest processor you can buy.
> > >
> > > George Sexton
> > > MH Software, Inc.
> > > Voice: 303 438 9585
> > > http://www.mhsoftware.com
> > >
> > > > -----Original Message-----
> > > > From: lug-admin at lug.boulder.co.us [mailto:lug-admin at lug.boulder.co.us]On
> > > > Behalf Of D. Stimits
> > > > Sent: Tuesday, July 18, 2000 9:09 PM
> > > > To: BLUG
> > > > Subject: [lug] sad hardware announcement :(
> > > >
> > > >
> > > > I have used SuperMicro motherboards for years now, and recently picked
> > > > up a PIIIDM3, which at first seemed stable. I'm not sure how many of you
> > > > have noted sporadic reports of some server boxes locking up under high
> > > > i/o, but Red Hat and others have published notes on this. I've found out
> > > > that this is a problem with all of the SuperMicro i840 chipset boards as
> > > > well.
> > > >
> > > > The problem isn't entirely high i/o, but this tends to generate the
> > > > conditions that trigger it. The problem is an IO-APIC unknown to the
> > > > kernel; the IO-APIC is the device responsible for reprogrammable IRQ
> > > > steering between multiple cpus. When the i/o lockup doesn't hit before
> > > > the failure can be logged, the last entry of /var/log/messages reads
> > > > "kernel: unexpected IRQ vector 217 on CPU#0!" (or on CPU#1). Elsewhere
> > > > in the log, you'll likely see the entry "WARNING: unexpected IO-APIC,
> > > > please mail".
> > > >
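For anyone wanting to check their own box for the two messages quoted above, here is a minimal sketch in Python. It only looks for the message text as quoted; the function name scan_log is just something picked for illustration, and the surrounding log line layout (timestamps, hostname) is not assumed at all.

```python
import re

# Patterns built from the two log messages quoted above.
IRQ_RE = re.compile(r"unexpected IRQ vector (\d+) on CPU#(\d+)")
APIC_WARNING = "unexpected IO-APIC"


def scan_log(lines):
    """Return (irq_hits, apic_warnings) for an iterable of log lines.

    irq_hits is a list of (vector, cpu) tuples, one per matching line;
    apic_warnings is the number of lines containing the IO-APIC warning.
    """
    irq_hits = []
    apic_warnings = 0
    for line in lines:
        match = IRQ_RE.search(line)
        if match:
            irq_hits.append((int(match.group(1)), int(match.group(2))))
        if APIC_WARNING in line:
            apic_warnings += 1
    return irq_hits, apic_warnings


# Sample input made of the exact messages quoted above plus one
# unrelated line; a real log would carry extra prefixes on each line.
sample = [
    "kernel: unexpected IRQ vector 217 on CPU#0!",
    "WARNING: unexpected IO-APIC, please mail",
    "some unrelated line",
]
hits, warnings = scan_log(sample)
print("unexpected IRQ vectors (vector, cpu):", hits)
print("IO-APIC warning lines:", warnings)
```

To run it against the real log, pass an open file instead of the sample list, e.g. scan_log(open("/var/log/messages", errors="replace")), which normally needs root.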
> > > > When I spoke with SuperMicro, they simply stated "it runs fine on NT"
> > > > and would not help. In the past they were interested in Linux, but
> > > > SuperMicro has apparently changed its mind and is not interested
> > > > anymore. I've contacted Alan Cox to see what else can be done, but for
> > > > now, you should consider all i840 SuperMicro boards incompatible with
> > > > Linux (I also saw very similar reports on FreeBSD and other open source
> > > > o/s's).
> > > >
> > > > The temporary workaround is to boot with the kernel option "noapic".
> > > > This removes irq redirection to the 2nd cpu, meaning all device i/o is
> > > > entirely on the first cpu. Additionally, some PCI devices that had been
> > > > assigned a given irq may be moved to a different one, or end up at an
> > > > unreachable irq. There is some explanation of this sort of problem in
> > > > the kernel source Documentation directory: "IO-APIC.txt".
> > > >
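One way to confirm that "noapic" really leaves all device interrupts on the first cpu is to total the per-cpu columns of /proc/interrupts. The sketch below is only that, a sketch: it assumes the usual layout (a header row of CPU names, then rows of "irq: count count ... controller device"), and the sample text and its counts are made up for illustration, not taken from a real machine.

```python
def per_cpu_totals(text):
    """Sum interrupt counts per CPU from /proc/interrupts-style text."""
    lines = text.strip("\n").splitlines()
    cpus = lines[0].split()              # header row, e.g. ["CPU0", "CPU1"]
    totals = dict.fromkeys(cpus, 0)
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if not label.strip().isdigit():
            continue                     # skip non-numeric rows such as NMI
        # The first len(cpus) fields after the colon are the per-CPU counts.
        for cpu, count in zip(cpus, rest.split()):
            totals[cpu] += int(count)
    return totals


# Hypothetical sample with invented counts, shaped like a dual-cpu box
# booted with "noapic": every numbered irq shows 0 under CPU1.
SAMPLE = """\
           CPU0       CPU1
  0:       1000          0    XT-PIC  timer
  1:        200          0    XT-PIC  keyboard
 14:        300          0    XT-PIC  ide0
NMI:          0
"""

print(per_cpu_totals(SAMPLE))
```

On a live system the same function can be fed open("/proc/interrupts").read(). A total of zero for every cpu but the first would match the behaviour described above.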
> > > > At this point, I am looking for a new motherboard, dual cpu, with 4x
> > > > AGP-pro (I'm looking at high end OpenGL graphics cards) and 64 bit, 66
> > > > MHz PCI slots (required for ultra 160, which I plan to continue using).
> > > > Iwill has a dual slot 2 board, the DCA200, which unfortunately requires
> > > > rdram (expensive and increased latency, with no ability to reuse my
> > > > current pc133 ram), which might be the route to go if nothing else
> > > > appears. Anyone know if this board really is stable under linux? The
> > > > Intel OR840 would be a candidate, but it lacks 64 bit PCI. Does anyone
> > > > know if the Via Apollo Pro 133A chipset is a solution? Do any of the
> > > > 133A boards have 64 bit PCI slots?
> > > >
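The reasoning behind wanting 64 bit, 66 MHz slots for ultra 160 comes down to simple bus arithmetic, roughly sketched below. It assumes theoretical peak rates only (one transfer per clock at nominal 33 and 66 MHz, no protocol overhead) and takes 160 MB/s as the Ultra160 peak; sustained numbers will be lower on both sides.

```python
ULTRA160_MB_S = 160  # peak rate of an Ultra160 SCSI bus, in MB/s


def pci_peak_mb_s(width_bits, clock_mhz):
    """Theoretical peak PCI throughput in MB/s (one transfer per clock)."""
    return width_bits // 8 * clock_mhz


# Compare plain PCI against the 64 bit, 66 MHz slots mentioned above.
for width, clock in [(32, 33), (64, 66)]:
    peak = pci_peak_mb_s(width, clock)
    verdict = "enough" if peak >= ULTRA160_MB_S else "not enough"
    print(f"{width} bit / {clock} MHz PCI: {peak} MB/s peak, "
          f"{verdict} for Ultra160")
```

Plain 32 bit, 33 MHz PCI tops out below what a single Ultra160 channel can move, which is why the wider, faster slots matter for that controller.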
> > > > And is there anyone who is interested in buying a good non-linux
> > > > motherboard, a PIIIDM3 SuperMicro?
> > > >
> > > > Thanks,
> > > > D. Stimits, stimits at idcomm.com
> > > >
> > > > _______________________________________________
> > > > Web Page: http://lug.boulder.co.us
> > > > Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug