Summary: How to figure out what my Solaris kernel is doing

Usual Suspects
--------------

* It is serving NFS ... this can use a lot of CPU. Make sure you
  are running NFS version 3.

* A fast (Gigabit) interface can almost fill a CPU if it is busy.

* It is swapping. If the kernel runs out of memory it will spend
  most of its time moving pages back and forth between disk and RAM.

  - Run "vmstat 5". The sr (scan rate) column should be very low
    (<100); this means the system is not scanning for free memory
    pages.

  - It may make sense to have a lot of swap space configured, as
    Solaris does conservative memory allocation. When a process
    forks, it immediately allocates all the memory necessary even
    though it does not use it. Solaris does "copy on write", so why
    not have this extra memory allocated in swap instead of real
    RAM, assuming it is never going to be used anyway?
    (Correct me if I am wrong here.)

* It is forking ... this does not have to be a real fork bomb, just
  some process quitting and being restarted immediately. Pidentd
  running non-multi-threaded may be one such program. Some CGI
  process could also be it. This is detectable by looking at the
  'last process id' with a tool like top.

* It is running Veritas Volume Manager and a disk has failed.

Useful Tools
------------

* lockstat

    lockstat -gkIW sleep 60

  gives a 60 second profile of the kernel.

* iftop    http://www.ex-parrot.com/~pdw/iftop

  shows which box is sending how much traffic through your
  interface.

* se toolkit    www.setoolkit.com

  virtual adrian may be able to give some hints as to where the
  performance issues lie.

* prstat

    prstat -m

  shows user vs. system time for each process, so if a process is
  causing the problem it should show up here.

* truss

    truss -c -p PID

  can help to identify which system calls a problematic process is
  spending its time on.
  A summary is printed on Ctrl-C.

* iostat

    iostat -xnP 30 30

  shows where the system is writing and reading data, and how much.

* vmstat

    vmstat 5

  shows paging activity (check the sr column).

* kstat

  Displays kernel statistics. I did not get any useful hints on what
  could be discovered here ... but it sure gives a lot of numbers.

* prex

    prex -k

  Part of the Solaris tracing architecture. Note that this just
  opens a shell where you are expected to enter commands to activate
  the tracing. I got the following example ... (reading the output
  is another issue)

    # prex -k                                  1)
    Type "help" for help ...
    prex> buffer alloc 10m                     2)
    Buffer of size 10485760 bytes allocated
    prex> enable $all                          3)
    prex> trace $all                           4)
    prex> ktrace on                            5)
    ... wait a bit ...
    prex> ktrace off
    prex> untrace $all
    prex> disable $all
    prex> quit
    # tnfxtract ./tnf.result                   6)
    # prex -k
    Type "help" for help ...
    prex> buffer dealloc                       7)
    prex> quit
    # tnfdump ./tnf.result                     8)

  1) Start prex in kernel trace mode.
  2) Allocate an in-core kernel buffer to hold the trace data.
  3) Enable the probe set named $all. You can specify your own
     set of trace facilities (tnf_name), e.g. all I/O operations.
     Refer to the prex man page.
  4) Trace the $all set.
  5) Start kernel tracing. The kernel immediately starts collecting
     tnf_probe records and storing them in the in-core buffer.
  6) Extract the contents of the kernel buffer to a file.
  7) Deallocate the in-core kernel buffer. Extract the contents of
     the buffer before you deallocate it; they are erased as soon as
     you issue "buffer dealloc".
  8) Convert the raw TNF data to readable ASCII format.

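As a footnote to the vmstat entry above: the "sr should stay below 100" check can be left running unattended. The following is only a minimal sketch, not something taken from the replies. The function name flag_high_sr and the message text are made up for illustration, and it assumes the two-line vmstat header in which the second line starts with "r b w" and contains a column labelled "sr". On Solaris the stock awk may not accept -v, in which case nawk should be substituted.

```shell
# flag_high_sr [LIMIT]: read "vmstat N" output on stdin and print one
# line for every sample whose sr (scan rate) value exceeds LIMIT.
# LIMIT defaults to 100, the rule of thumb quoted in the summary.
flag_high_sr() {
    awk -v limit="${1:-100}" '
        # Header line: remember which column is labelled "sr".
        $1 == "r" {
            for (i = 1; i <= NF; i++) if ($i == "sr") col = i
            next
        }
        # Data lines start with a number; the "kthr memory ..." banner does not.
        col && $1 ~ /^[0-9]+$/ && $col + 0 > limit + 0 {
            print "high scan rate: sr=" $col
        }
    '
}
```

Typical use would be "vmstat 5 | flag_high_sr 100". The column is looked up by name rather than by position because the number of disk columns in the vmstat output can differ between machines.
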
Reading List
------------

* Sun Performance and Tuning: Java and Internet, 2nd Edition
  (Adrian Cockcroft)
  http://www.booksmatter.com/b0130952494.htm

* Unlocking the kernel
  http://www.sun.com/sun-on-net/itworld/UIR980801perf.html

* Performance and Tuning on the Solaris 2.6, 7, and 8
  http://developers.sun.com/solaris/articles/tuning_solaris.html

Contributors
------------

Markus Kluge, Ramiro Santos, Allen Wooden, przemol, Casper Dik,
Jon Andrews, Thomas 'Mike' Michlmayr, Amiel Lee Yee, William Hathaway,
Jeff Vaneek, Frank Smith, Darren Dunham, Luc I. Suryo, Joe Fletcher,
Mark Pfeiffer, Joohyun Cha, Karl Vogel, Todd M. Wilkinson.

Yesterday Tobias Oetiker wrote:

> Folks,
>
> We have this 4 Way Sun Enterprise 420R server. With 4GB Ram and
> about 10GB swap. It runs a ton of services (Apache, Postfix,
> Amavis, Spamassassin) and it also acts as a NFS server.
>
> Lately we are experiencing performance issues ... the box goes to
> load 17 and responds rather sluggishly.
> When looking at the load we often see the following picture:
>
> 50% User
> 50% Kernel
> 0% Idle
>
> The 50% User is easy to attribute by looking at the processes. But
> what is the system doing in the 50% kernel time?
>
> Is there something like kernel-top? I played around with lockstat
> a bit, but it did not really answer my questions ...
>
> We are running Solaris 8.
>
> cheers
> tobi
>

--
 ______    __   _
/_  __/_  / /  (_) Oetiker @ ISG.EE, ETZ J97, ETH, CH-8092 Zurich
 / // _ \/ _ \/ /  System Manager, Time Lord, Coder, Designer, Coach
/_/ \.__/_.__/_/   http://people.ee.ethz.ch/~oetiker +41(0)1-632-5286
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Tue Jan 13 02:43:39 2004