[lug] NFS woes -- request for opinion

Chip Atkinson chip at rmpg.org
Fri Dec 26 18:51:31 MST 2003


I'd start with the alarmist/easy thing to do first -- see if you've been
cracked.  I'd look at some of the programs like netstat and ps.  While
it's not terribly likely that's the problem, I'd check first.  There is
a package called chkrootkit which is kind of reassuring though I don't
know for sure how effective it is.

That will help you just to be sure, but you are probably looking at some
sort of hardware problem.  Look at /var/log/messages and see if there are
problems with either your disk drive or card.  You may see things like bus
resets or timeouts.  My personal experience has been that things like NFS
often exercise the system in different ways than "normal" file access, and
because of that you may see behavior that looks like it is caused by NFS
but is just revealed by NFS.
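
If paging through the whole log by hand is tedious, here is a rough sketch
of that check in Python.  The keyword list is my own guess (the exact wording
of kernel messages depends on the driver), and find_disk_errors is just a
name I made up, so treat it as a starting point rather than a real tool.

```python
import re

# Assumed keywords: the exact wording of kernel messages varies by driver,
# so this list is a guess to adjust, not a definitive set.
PATTERNS = re.compile(r"bus reset|timeout|timed out|I/O error", re.IGNORECASE)


def find_disk_errors(lines):
    """Return the log lines that match any of the assumed error keywords."""
    return [line.rstrip("\n") for line in lines if PATTERNS.search(line)]


if __name__ == "__main__":
    # Reading /var/log/messages usually requires root.
    with open("/var/log/messages", errors="replace") as log:
        for hit in find_disk_errors(log):
            print(hit)
```

Anything it prints is only a candidate; you still have to read the lines and
decide whether they point at a drive, a cable or a controller.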

IIRC I once had a disk with files on it that exceeded the size of the
disk.  Needless to say there was something wrong.  I eventually replaced
the drive and the problems went away.  Not very scientific, I know, but it
did fix the problem.

Are the files disappearing on one drive only?  Also, run ifconfig -a and
look at the error count.  That can often reveal network problems too.
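
If you want to watch those error counts over time rather than eyeball
ifconfig, something like the sketch below might do.  It assumes a Linux box
with /proc/net/dev, which as far as I know is where ifconfig gets these
counters, and parse_net_dev is a made-up name of mine.

```python
def parse_net_dev(text):
    """Map interface name -> (rx_errs, tx_errs) from /proc/net/dev contents."""
    counts = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        # Split on the colon, since the name can run straight into the
        # first counter with no space in between.
        name, stats = line.split(":", 1)
        fields = stats.split()
        if len(fields) < 11:
            continue
        # Counting from 0: field 2 is receive errors, field 10 is
        # transmit errors.
        counts[name.strip()] = (int(fields[2]), int(fields[10]))
    return counts


if __name__ == "__main__":
    with open("/proc/net/dev") as f:
        for iface, (rx, tx) in parse_net_dev(f.read()).items():
            print(f"{iface}: rx_errs={rx} tx_errs={tx}")
```

Run it before and after an NFS copy; counters that climb during the copy
would point at the network rather than the disks.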

Chip




On Thu, 25 Dec 2003, Jeffrey Veiss wrote:

> Hi all,
>
> I have a strange problem.
>
> First, my hardware configuration for reference:
>
> Tyan MB w/500MHz AMD CPU and 256M RAM
> hda - IDE HD attached to onboard IDE bus (system disk running Mandrake 8.1)
> sda - Seagate SCA-2 HD attached to Adaptec wide SCSI PCI card (ext3: /data1)
> sdb - Seagate SCA-2 HD attached to Adaptec wide SCSI PCI card (ext3: /data2)
> sdc - Seagate SCA-2 HD attached to Adaptec wide SCSI PCI card (ext3: /data3)
> hdc - ATA66 HD attached to Promise PCI card (ext3: /data4)
> hde - ATA100 HD attached to Promise PCI card (ext3: /data5)
> serves files using samba and NFS
> fstab entries for all /data? partitions are set to defaults
>
> The system has been running fine for a few years with no changes, except
> that periodically I would get NFS I/O errors copying files to the server from
> two other client Mandrake systems, one running 9.1 (now running 9.2) and the
> other running 8.2 (now running 9.2).  When I got an I/O error, I would
> have to keep reattempting the copy until the files copied successfully
> (sometimes that took 2-3 tries).
>
> I figured that upgrading the NFS package on the file server might
> eliminate the I/O errors that were occurring.  I downloaded, compiled
> and installed the latest NFS source code.  Everything seemed to be
> working fine.
>
> A few weeks later, I noticed an extra 10GB of space on sda.  Further
> investigation revealed that all the files in certain directories were
> gone though the directories were intact.  The timestamps of all the
> directories with the missing files were the same, which was about the
> time I believe I was copying files to the server via NFS.
>
> I tried running e2fsck on the partition but no errors were found and the
> files were not restored.  I started restoring the files from tape to another
> system (to which the tape drive was attached), then copying them to the server
> via NFS.  A few minutes later, the files I just restored were gone.
>
> The only way I was able to restore the files to the server with NFS was
> to copy them to hde (which had enough space for the xfer), then copy
> them locally from hde to sda.  After I restored all the files, I closely
> monitored the server over the next few days but no further file
> disappearances occurred.
>
> A few weeks later (this morning) I noticed about 10GB of missing files on
> hdc.  The timestamps on all the empty directories were from yesterday, which
> was about the time I was again copying files via NFS to the server.  I
> logged into the server and shut down the NFS daemon.  The next thing I
> know, I lost another 20GB of files from hde!
>
> I'm currently upgrading the OS on the server to Mandrake 9.2 (after
> backing up all the 8.1 OS partitions), but I'm really reluctant to enable
> NFS on the server at this point.  It's going to take me a few days to restore
> all the files now missing from hdc and hde.
>
> I'm really confused as to why random directories were emptied.  Other
> than the installation of the newer version of NFS, I don't see any other
> patterns.  Any thoughts, recommendations, things to look at, etc. would
> be very much appreciated.
>
> Please contact me if there are any further questions via internet mail at
> blug2 at sirveiss.com.  Thank you very much!
>
> Jeffrey Veiss, CCNA, CISSP, TICSA             Sir Veiss, Inc.
> Network Engineer/System Administrator         blug2 at sirveiss.com
>
> NOTE: Any personal or contact information is not to be sold or used for
>       solicitation.
> _______________________________________________
> Web Page:  http://lug.boulder.co.us
> Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
> Join us on IRC: lug.boulder.co.us port=6667 channel=#colug
>



