Well, I got a ton of very helpful information. Thanks to everyone who responded. I've posted the answers/comments I received below; the original message is at the bottom.

In summary, it seems that software RAID5 (especially on a slower box) probably isn't a great idea, and it's apparently an even worse idea to have your RAID filesystem be /var on a mailserver. What helped me was to create a spool directory on a non-RAID filesystem and then symlink /var/spool to it. The box still seems a tad pokey, but it's able to keep up with incoming messages. I was seeing upwards of 3000 sleeping sendmail processes; now I see only 20-50 (which is more like it used to be). Many people suggested ditching SW RAID entirely, or at least switching to mirroring/striping.

Here are the individual comments.

-------------------------------------------------------------------

Jay Lessert:

SW RAID5 is very, very slow on writes, no matter what platform it's on. sendmail in particular does a *lot* of create-small-file-then-discard operations; even on a fast filesystem this bottlenecks sendmail. SW RAID5 for a sendmail /var/spool would not be a good thing. Striped and mirrored would be OK, just not RAID5. No free lunch.

If it is an option, you might consider changing MTAs to postfix (can be very sendmail-like) or qmail (not very sendmail-like). Either will be much less of a resource hog than sendmail.

Joe Fletcher:

Software RAID5 is always going to put quite a load on the system, especially in write-intensive environments. The parity calculations are what's killing you: for every logical write you have to compute the parity and then perform multiple physical I/Os across the disks in the RAID set. An E250 is not a particularly quick box to start with. I know from experience that something like an old Proliant 3000 with 2x PIII-550s and a hardware RAID card running SuSE can run rings around a 420R with 2x 450MHz UltraSPARC-IIs talking to A1000s.
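Joe's parity point can be made concrete with a rough back-of-the-envelope model (my sketch, not from the thread): a small RAID5 write is a read-modify-write cycle of four physical I/Os, while a mirror only writes the block twice.

```shell
# Rough model of the RAID5 small-write penalty (read-modify-write):
# each small logical write costs read old data + read old parity +
# write new data + write new parity = 4 physical I/Os.
# A mirror just writes the block twice.
WRITES=1000                       # e.g. 1000 small spool-file writes
echo "RAID5 physical I/Os:  $((WRITES * 4))"
echo "mirror physical I/Os: $((WRITES * 2))"
```

This is why a queue-heavy workload like sendmail's spool suffers so much more on RAID5 than on a stripe or mirror, even though sequential throughput can look fine.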
I'd suggest either hardware RAID or moving to a simple striped config for performance, though obviously you lose your resilience.

Martin Hepworth:

How many disks are in the RAID5 set, and over how many controllers? Using RAID5 on fewer than 5 disks can result in this problem. Increasing the number of controllers can also help, as you spread the load.

Glen:

With a mail server you should have bumped the stripe size down! A mail server does small-block I/O, which is faster with a smaller stripe size. Also, a DNS server will not affect I/O. Are you using Veritas? If so, convert to a striped pro volume.

John Timon:

RAID5 in software is always bad; you will get very poor performance from a software RAID5 setup. If you must do RAID5, do it in hardware. Failing that, do mirror+stripe for good performance and fault tolerance.

Kevin Buterbaugh:

I/O wait has meaning only on single-CPU boxes. On MP servers it's irrelevant, since the CPU issuing the I/O request is free to do other work while "waiting" for the I/O to complete. In fact, it's quite possible that when the I/O completes the process will resume on a different CPU.

I would also discourage the use of top. It's not a Sun tool, they don't support it, and I've personally seen it give incorrect information. If you're running Solaris 8 or later, you can use prstat instead.

I'd recommend continuing to monitor your box. Use sar, or vmstat / mpstat / iostat. With iostat, the most useful options are "-xMn". Look for disks with service times > 30 ms or %busy > 25%. Ignore the statistics for the metadevices and concentrate on the statistics for the disks themselves. HTH...

-----------------------------------------------------------------------

Thanks to everyone for your help!

Shane

-----Original Message-----
From: Shane Hickey [mailto:shane@howsyournetwork.com]
Sent: Thursday, January 30, 2003 12:09 PM
To: sunmanagers@sunmanagers.org
Subject: high iowait with raid5

Howdy all,

I apologize if this has been covered.
I did some searching and found similar questions, but not answers, and I'm in a bit of a bind. I have an E250 with a gig of RAM and two 296MHz processors; I can give more specifics if needed (I don't have it in front of me). I'm trying to migrate the services of another E250 onto this box (mainly sendmail, DNS, POP3). The difference is that I've set up RAID5 on the new box using the instructions I found at http://www.pennasoft.com/articles/SolarisRAID.shtml. Sadly, I didn't document the process as well as I should have, and I'm not very familiar with RAID on Solaris. I do recall that I bumped the stripe size up a bit.

Anyway, what I'm seeing is tons of sleeping sendmail processes and a fairly high load. It seems like a bunch of CPU is getting tied up in I/O. Also, I'm watching my tmpfs dwindle away to nothing. Here are some particulars; please let me know if I can provide more information. I'm more of a Linux/FreeBSD person than a Solaris person, so I don't think I know all the diagnostic commands that I should.

df -k
--------------
Filesystem            kbytes    used    avail capacity  Mounted on
/dev/dsk/c0t0d0s0   16468538 4207068 12096785    26%    /
/proc                      0       0        0     0%    /proc
fd                         0       0        0     0%    /dev/fd
mnttab                     0       0        0     0%    /etc/mnttab
/dev/md/dsk/d5      69585470 4677568 59341065     8%    /var
swap                 2033024      32  2032992     1%    /var/run
swap                 2033008      16  2032992     1%    /tmp

iostat
---------------
   tty        md5           sd0           sd6           sd7          cpu
 tin tout  kps tps serv  kps tps serv  kps tps serv  kps tps serv  us sy wt id
   0   74  414  37   65  497  15   55    0   0    9  760  87   28  11 12 57 20

top
---------------
load averages: 0.65, 0.55, 0.46                                    10:07:27
2606 processes: 2603 sleeping, 2 zombie, 1 on cpu
CPU states: 27.3% idle, 6.3% user, 13.3% kernel, 53.0% iowait, 0.0% swap

Thanks in advance for any assistance,

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Thu Jan 30 14:20:53 2003
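As a footnote to Kevin's iostat suggestion above, the thresholds he gives (service time > 30 ms, %busy > 25%) can be turned into a quick one-liner. The awk filter and the column positions are my assumption for Solaris "iostat -xn" output ($8 = asvc_t, $10 = %b, $11 = device); check the header on your box before trusting it.

```shell
# Flag disks that cross the thresholds from the reply above.
# Assumed "iostat -xMn" column layout: $8 = asvc_t (active service
# time, ms), $10 = %b (percent busy), $11 = device name.
FILTER='$11 ~ /^c[0-9]/ { if ($8 > 30 || $10 > 25)
            print "hot disk:", $11, "svc_t=" $8, "%b=" $10 }'
iostat -xMn 30 5 | awk "$FILTER"   # five 30-second samples
```

Matching device names that start with "c" also skips the md metadevices, per the advice above to ignore those and watch the underlying disks.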
This archive was generated by hypermail 2.1.8 : Thu Mar 03 2016 - 06:43:02 EST