[lug] NFS woes -- request for opinion

Jeffrey Veiss blug2 at sirveiss.com
Thu Dec 25 20:19:29 MST 2003


Hi all,

I have a strange problem.

First, my hardware configuration for reference:

Tyan MB w/500MHz AMD CPU and 256M RAM
hda - IDE HD attached to onboard IDE bus (system disk running Mandrake 8.1)
sda - Seagate SCA-2 HD attached to Adaptec wide SCSI PCI card (ext3: /data1)
sdb - Seagate SCA-2 HD attached to Adaptec wide SCSI PCI card (ext3: /data2)
sdc - Seagate SCA-2 HD attached to Adaptec wide SCSI PCI card (ext3: /data3)
hdc - ATA66 HD attached to Promise PCI card (ext3: /data4)
hde - ATA100 HD attached to Promise PCI card (ext3: /data5)
Serves files using Samba and NFS
fstab entries for all /data? partitions are set to defaults

The system has been running fine for a few years with no changes, except
that periodically I would get NFS I/O errors when copying files to the
server from two other Mandrake client systems, one running 9.1 (now
running 9.2) and the other running 8.2 (now running 9.2).  When I got an
I/O error, I had to keep retrying the copy until the files copied
successfully (sometimes that took 2-3 tries).

I figured that upgrading the NFS package on the file server might
eliminate the I/O errors that were occurring.  I downloaded, compiled
and installed the latest NFS source code.  Everything seemed to be
working fine.

A few weeks later, I noticed an extra 10GB of free space on sda.  Further
investigation revealed that all the files in certain directories were
gone, though the directories themselves were intact.  The timestamps of
all the directories with the missing files were the same, and matched
about the time I believe I was copying files to the server via NFS.
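
The symptom above (directories left intact, files gone, identical
timestamps) could be searched for with something like the following.
This is only a minimal sketch, assuming Python is available on the
server; the function name find_emptied_dirs is hypothetical and this is
not the procedure used above.  It lists directories that contain no
files and groups them by modification time, rounded to the minute.

```python
import os
from collections import defaultdict
from datetime import datetime


def find_emptied_dirs(root):
    """Group directories under root that hold no files by their mtime.

    Returns {mtime rounded down to the minute: [directory paths]}.
    Subdirectories may still be present; only plain files are counted.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        if not filenames:
            mtime = datetime.fromtimestamp(os.stat(dirpath).st_mtime)
            minute = mtime.replace(second=0, microsecond=0)
            groups[minute].append(dirpath)
    return dict(groups)
```

Calling find_emptied_dirs("/data1") and looking for one timestamp with
an unusually long list of directories would point at the moment the
files went away.  Directories that never held files (for example, ones
that only contain subdirectories) also show up, so the output needs a
manual look.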

I tried running e2fsck on the partition, but no errors were found and the
files were not restored.  I started restoring the files from tape to another
system (the one the tape drive was attached to), then copying them to the
server via NFS.  A few minutes later, the files I had just restored were gone.

The only way I was able to restore the files to the server with NFS was
to copy them to hde (which had enough space for the transfer), then copy
them locally from hde to sda.  After I restored all the files, I closely
monitored the server over the next few days, but no further file
disappearances occurred.
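
Monitoring of this kind could be automated by taking periodic snapshots
of per-directory file counts and comparing them.  The following is a
rough sketch only, again assuming Python on the server; count_files and
emptied_since are hypothetical names, not tools that were used here.

```python
import os


def count_files(root):
    """Map each directory under root to the number of files directly in it."""
    return {dirpath: len(filenames)
            for dirpath, _dirnames, filenames in os.walk(root)}


def emptied_since(before, after):
    """List directories that had files in `before` but none in `after`.

    A directory that has disappeared entirely is also reported.
    """
    return sorted(d for d, n in before.items()
                  if n > 0 and after.get(d, 0) == 0)
```

Taking one snapshot with count_files("/data1") before an NFS copy and
another afterwards, then passing both to emptied_since, would list any
directory that lost all of its files in between.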

A few weeks later (this morning) I noticed about 10GB of missing files on
hdc.  The timestamps on all the empty directories were from yesterday,
which was about the time I was again copying files via NFS to the server.
I logged into the server and shut down the NFS daemon.  The next thing I
knew, I had lost another 20GB of files from hde!

I'm currently upgrading the OS on the server to Mandrake 9.2 (after
backing up all the 8.1 OS partitions), but I'm really reluctant to enable
NFS on the server at this point.  It's going to take me a few days to
restore all the files now missing from hdc and hde.

I'm really confused as to why random directories were emptied.  Other
than the installation of the newer version of NFS, I don't see any
pattern.  Any thoughts, recommendations, things to look at, etc. would
be very much appreciated.

Please contact me if there are any further questions via internet mail at
blug2 at sirveiss.com.  Thank you very much!

Jeffrey Veiss, CCNA, CISSP, TICSA             Sir Veiss, Inc.
Network Engineer/System Administrator         blug2 at sirveiss.com

NOTE: Any personal or contact information is not to be sold or used for
      solicitation.


