Disk moves

I’ve been experiencing a lot of random filesystem grumpiness on dustpuppy lately.  Once every few days, /var/spool decides it has too many errors to continue operating read/write and the kernel locks the block (pseudo-)device against writes.  Then I have to stop all the spooling daemons (crond, atd, sendmail, dovecot, innd), unmount /var/spool, fsck -f the pseudo-device (/dev/mapper/VolGroup00-LogVol08), wait for it to clear the errors (which are usually many and involve cloning thousands of multiply-claimed blocks), remount it, and start all the daemons again.  I even went so far as to write a script that automates everything but the fsck -f (since that’s dangerous to automate).  I don’t know why this is happening.  The storage underneath is a Linux MD pseudo-device, on which I have an LVM2 physical volume split into 14 logical volumes.  The problems only started in the last few weeks.
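A minimal sketch of what such a script could look like, not the actual one.  It assumes the daemons are controlled with Red Hat's service command under the same names as listed above, and that /var/spool has an /etc/fstab entry so a bare mount works.  The DRY_RUN switch and the function names are made up for illustration.  The fsck -f step is deliberately left to a human.

```shell
#!/bin/sh
# Sketch of the /var/spool recovery steps described above.
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (hypothetical switch, not from the post).

SPOOL_DEV=/dev/mapper/VolGroup00-LogVol08        # device named in the post
SPOOL_MNT=/var/spool
SPOOL_DAEMONS="crond atd sendmail dovecot innd"  # assumed to match the init script names

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop everything that writes to the spool, then unmount it.
stop_spool() {
    for d in $SPOOL_DAEMONS; do
        run service "$d" stop
    done
    run umount "$SPOOL_MNT" || return 1
    # The check itself stays manual on purpose.
    echo "now run by hand: fsck -f $SPOOL_DEV"
}

# Remount the spool (relies on an fstab entry) and restart the daemons.
start_spool() {
    run mount "$SPOOL_MNT" || return 1
    for d in $SPOOL_DAEMONS; do
        run service "$d" start
    done
}
```

Calling stop_spool, running the fsck by hand, and then calling start_spool reproduces the sequence above.  Leaving DRY_RUN at its default only prints what would happen.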

Just yesterday, I had to reboot the machine because / and /usr started experiencing the same issues.  It took me over two hours to get it back up since I had to clone somewhere around 6 million multiply-claimed blocks in /usr/libexec/usermin.

If it matters, all filesystems are ext3 (the only FS Red Hat supports) except /boot and /tmp, which are ext2: /boot for GRUB1 backwards compatibility, and /tmp for performance reasons and because nothing written to /tmp is critical to recover on boot.

The disks are 2x PATA-133 and 2x SATA-150.  All are 160GB.  Two are Western Digital (one PATA and one SATA), the other PATA is Hitachi, and the other SATA is Toshiba.  S.M.A.R.T. is reporting a massive number of corrected read errors for the Hitachi Deskstar PATA drive, but the internal diagnostics pass every time I run them from the S.M.A.R.T. console.
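For comparing the four drives side by side, a small filter over the attribute table can help.  This is only a sketch: it assumes smartmontools is installed and that smartctl -A prints its usual ten-column table with the raw value in the last column.  The attribute IDs picked here (1, 5, 197, 198, 199) are a common shortlist, not anything specific to this machine, raw values are vendor-specific, and smart_summary is a made-up name.

```shell
# Reduce `smartctl -A` output (read on stdin) to a few attributes that
# are commonly watched when a disk is suspected of going bad:
#   1   Raw_Read_Error_Rate
#   5   Reallocated_Sector_Ct
#   197 Current_Pending_Sector
#   198 Offline_Uncorrectable
#   199 UDMA_CRC_Error_Count
# Prints: ID, attribute name, and the first token of the raw value.
smart_summary() {
    awk '$1 ~ /^(1|5|197|198|199)$/ { printf "%s %s raw=%s\n", $1, $2, $10 }'
}

# Example (as root); DISK is a placeholder, not a device from this post:
#   smartctl -A "$DISK" | smart_summary
```

A rising reallocated or pending sector count on one drive while the others stay flat would point at that drive even when the short self-test passes.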

I doubt it’s a hardware error, as mdadm would have reported the array as out of sync if that were the issue.  Is it a bug in the Linux mdraid driver?  Or possibly LVM?
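One way to test that assumption: as far as I know, md only drops a member when the drive returns an I/O error, so a disk that quietly hands back wrong data would not by itself mark the array out of sync.  On kernels that expose md state under /sys, a consistency check can be requested through sync_action and the result read from mismatch_cnt.  The sketch below only reads the counters; md_mismatch_report and its optional sysfs-root argument are made up for illustration, and md0 in the comments is a placeholder since the array name isn't given here.

```shell
# Report mismatch_cnt for every md array found under a sysfs root.
# The root defaults to /sys; the argument exists only so the function
# can be pointed at a test directory.
md_mismatch_report() {
    sysroot=${1:-/sys}
    for f in "$sysroot"/block/md*/md/mismatch_cnt; do
        [ -r "$f" ] || continue          # no arrays, or glob did not match
        dev=${f#"$sysroot"/block/}       # strip the leading path
        dev=${dev%%/*}                   # keep just the mdN component
        echo "$dev mismatch_cnt=$(cat "$f")"
    done
}

# To start a check first (as root; md0 is a placeholder):
#   echo check > /sys/block/md0/md/sync_action
#   cat /proc/mdstat        # shows progress while the check runs
#   md_mismatch_report      # read the counters once it has finished
```

A non-zero count after a check is not proof of a bad disk (RAID1 under swap can show harmless mismatches), but a large one would make the hardware theory harder to dismiss.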

At any rate, I succeeded in my quest to re-use the 512M partition formerly mounted at /var/www as my new /usr/mailman filesystem:

(dustpuppy) $ df -h
Filesystem            Size  Used Avail Use% Mounted on
                      2.0G  1.1G  809M  58% /
                      2.0G  149M  1.8G   8% /tmp
                      301G   99G  188G  35% /export/home
                      7.8G  4.7G  2.8G  63% /opt
                      7.8G  3.1G  4.4G  41% /usr
                      992M  817M  124M  87% /usr/local
                      3.9G  2.4G  1.4G  64% /var
                      7.8G  495M  6.9G   7% /var/lib/mysql
                       93G  9.5G   78G  11% /var/ftp
                      2.0G  222M  1.7G  12% /var/cache
                      2.0G  216M  1.7G  12% /var/spool
                      2.0G  1.1G  774M  59% /var/log
/dev/hda1             122M   21M   96M  18% /boot
tmpfs                 506M     0  506M   0% /dev/shm
                      496M   51M  420M  11% /usr/mailman
