Booting YDL 2.1 on B&W G3 Problems

Bill Fink yellowdog-general@lists.terrasoftsolutions.com
Sat Jun 29 22:32:01 2002


On Sat, 29 Jun 2002, "Timothy A. Seufert" wrote:

> At 11:38 AM -0400 6/29/02, Bill Fink wrote:
> 
> >Here are some of my experiences with installing Linux on a B&W G3 at work.
> >First, it just would not install on the slave drive, so I had to switch
> >master and slave.  It is my understanding that early B&W G3s cannot boot
> >from a slave drive because of an OpenFirmware bug.
> 
> Hmmm... I'm fairly sure I've booted a B&W from a slave drive at one 
> time or another.  I do remember not being able to do a plug & chug 
> install of Linux to a slave, due to bugs in the installer.  (I forget 
> which distribution and which version of it.)
> 
> BTW, Apple has a firmware update for B&W G3s, so if you have apparent 
> OF bugs, apply it.  It applies both to rev 1 and rev 2 B&W G3 systems.
> 
> http://docs.info.apple.com/article.html?artnum=58374

Thanks for the tip.  I'll have to check it out.

> >But even after installing Linux on the now master drive, I was still
> >having all kinds of problems actually running Linux on that system
> >including weird system errors and file system corruption.  These finally
> >went away when I installed a custom-built 2.4.19-pre8-ben0 kernel.
> >
> >However, I noticed that Linux wasn't detecting the original (now slave)
> >internal disk drive.  I tracked this down to not having the CMD64X driver
> >configured in my kernel and added it in.  Linux then detected the slave
> >drive, but unfortunately the weird system errors and file system
> >corruption also returned.  I also checked and determined that the
> >default YDL 2.1 2.4.10-12a kernel has the CMD64X driver enabled,
> >which explained the earlier problem behavior.  It is my understanding
> >that some of the early disks in the B&W G3s had buggy firmware, so I
> >got bit by two early B&W G3 bugs.
> 
> I haven't heard of any disk firmware bugs before.  I think what you 
> got bitten by is the known IDE chip bug.  The B&W originally shipped 
> with rev 5 of the CMD646U2 controller.  Rev 5 has nasty corruption 
> problems in UDMA mode in master/slave configurations.  Even with only 
> a single drive attached, it can have problems (depends on the drive 
> model -- some work fine, others don't, the only ones you can trust 
> for absolute sure are the factory original drives which Apple did 
> qualification testing on).
> 
> The reason you aren't having problems when you eliminate the CMD64X 
> driver is presumably that the generic IDE driver can't enable UDMA 
> mode.  It might not even be able to enable DMA.

Actually, being somewhat paranoid after all the severe problems I had
run into, I finally removed the original internal drive to avoid any
further problems.  I do have the CMD64X driver disabled in the current
kernel:

astro% dmesg | grep CMD
CMD646: IDE controller on PCI bus 01 dev 08
CMD646: detected chipset, but driver not compiled in!
CMD646: chipset revision 5
CMD646: 100% native mode on irq 26

Is using_dma as reported by hdparm the same thing as the UDMA mode
you were talking about?  If so, it still seems to be set on the new
drive:

astro# hdparm /dev/hda

/dev/hda:
 multcount    =  0 (off)
 I/O support  =  0 (default 16-bit)
 unmaskirq    =  0 (off)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 nowerr       =  0 (off)
 readonly     =  0 (off)
 readahead    =  8 (on)
 geometry     = 12009/16/63, sectors = 78165360, start = 0
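
One way to narrow this down might be "hdparm -i", which as far as I know
lists the drive's DMA/UDMA modes and marks the active one with an
asterisk.  A rough sketch that picks out the starred mode (the helper
name active_dma_mode is made up, and I'm assuming the "*udma2" style of
marking):

```shell
# Read "hdparm -i" output on stdin and print the DMA mode marked active.
# Assumption: the active mode is a token like "*mdma2" or "*udma2".
# active_dma_mode is a hypothetical helper name, not part of hdparm.
active_dma_mode() {
    # Put each whitespace-separated token on its own line, then keep only
    # the starred sdma/mdma/udma token, with the star stripped.
    tr -s ' \t' '\n\n' | sed -n 's/^\*\([smu]dma[0-9]*\)$/\1/p'
}

# Usage (as root): hdparm -i /dev/hda | active_dma_mode
```

If that prints a udma mode, the drive really is in UDMA and not just
plain DMA.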

Thus far I haven't had any problems with the new drive.  I guess that's
why I thought it was a disk problem rather than a controller problem,
although perhaps the problem only manifests itself in a master/slave
setup.  IIRC even when I still had the original drive connected, it
wasn't even detected by Linux at all without the CMD64X driver, so it
was as if it wasn't connected, and this may be why it worked OK in that
configuration.  However, this also confused me: if the IDE
driver/controller couldn't detect the original drive, how was it able to
detect the new drive?  Apparently, the generic IDE driver has a problem
detecting the original drive when it's the slave drive.

> Anyways, CMD fixed the problems in rev 7 of the chip, which Apple 
> incorporated into Rev 2 of the B&W motherboard.  As far as I have 
> ever been able to tell, that was the lone motherboard change during 
> the product life of the B&W G3, which is pretty unusual and means 
> that it was a solid design aside from the use of a buggy IDE chip.
> 
> How you can tell what you've got: Rev 7 chips are marked 
> "CMD646U2-402", while Rev 5's lack the "-402".  The chip is located 
> in a far corner of the motherboard, behind the PCI slots (between the 
> slots and the slot covers).  Fortunately, it's on the top of the 
> board, so you don't have to take the board out.
> 
> If you're in Linux, you can find the chip revision without cracking 
> the box.  Just issue the "lspci" command and look at the revision 
> given for the 646.

It looks like I've got the buggy Rev 5 IDE chip:

astro% lspci
00:00.0 Host bridge: Motorola MPC106 [Grackle] (rev 40)
00:0d.0 PCI bridge: Digital Equipment Corporation DECchip 21154 (rev 02)
00:10.0 VGA compatible controller: ATI Technologies Inc Rage 128 RE
01:00.0 FireWire (IEEE 1394): Texas Instruments PCILynx/PCILynx2 IEEE 1394 Link Layer Controller (rev 02)
01:01.0 IDE interface: CMD Technology Inc PCI0646 (rev 05)
01:05.0 Class ff00: Apple Computer Inc. Paddington Mac I/O
01:06.0 USB Controller: OPTi Inc. 82C861 (rev 10)
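
For anyone who wants to check this without reading through the whole
listing, here is a small sketch that pulls the 646's revision out of
lspci output.  It assumes the "PCI0646 (rev NN)" format shown above; the
helper names cmd646_rev and cmd646_check are made up, and the rev 5 /
rev 7 meanings are just the ones described in the quoted text.

```shell
# Print the revision of the CMD 646 IDE chip from lspci output on stdin.
# Assumes lines of the form "... PCI0646 (rev 05)".
cmd646_rev() {
    sed -n 's/.*PCI0646.*(rev \([0-9a-fA-F]*\)).*/\1/p'
}

# Turn that revision into a one-line verdict.
cmd646_check() {
    r=$(cmd646_rev)
    case "$r" in
        05) echo "rev 5: the buggy one" ;;
        07) echo "rev 7: the fixed one" ;;
        "") echo "no CMD 646 found" ;;
        *)  echo "rev $r: not covered above" ;;
    esac
}

# Usage: lspci | cmd646_check
```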

According to your theory, it would seem that all that's necessary to
work around the problem is to issue an "hdparm -d 0" on the drive(s).
If I get some spare time, I may test that theory, but given how much time
I've already wasted on that system, it's not very high on my priority
list, especially since it finally seems to be working pretty well.
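
If I do get around to it, the test would probably look something like
the sketch below: turn DMA off on each drive and then read the flag back.
This is untested on the B&W.  The helper name dma_off and the HDPARM
override (for a dry run) are my own inventions; the only real commands
are "hdparm -d0" to clear using_dma and "hdparm -d" to display it.

```shell
# Turn DMA off on each drive named on the command line, then show the
# resulting using_dma setting.  Set HDPARM=echo for a dry run that only
# prints the arguments hdparm would be given.
dma_off() {
    for dev in "$@"; do
        "${HDPARM:-hdparm}" -d0 "$dev" || return 1   # clear using_dma
        "${HDPARM:-hdparm}" -d "$dev"                # read it back
    done
}

# Usage (as root): dma_off /dev/hda /dev/hdb
```

The setting won't survive a reboot, so it would have to go into a boot
script if it turns out to help.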

Thanks for all the info.

						-Regards

						-Bill