SUMMARY: Multi-processor stats and control

From: Robert J. Cronin (rjcronin@uop.com)
Date: Thu Dec 21 1995 - 09:32:23 CST


Summary time...

I had asked about the intricacies of thread migration on
multi-processor Solaris 2.X. (original post at bottom)

Well, I got 3 answers, all with good info. The best TOOL advice was
"proctool", which really provided me with all the information I
required. "top" version 3.3 MIGHT also provide the required data, but
proctool is GUI-based and includes active graphing capability, whereas
top is character-based. (I never actually built top on the Solaris MP
box, but I tried it out on a single-processor 4.1.3 box. It claims to
support MP, but I don't know what that actually "means".) top comes in
source form, but proctool only comes in binary.

Kevin Sheehan also gave me some insight into the migration algorithm
employed, as well as many other useful tips. I include his reply in
full below. His information seems to be accurate for my situation: my
CPU-intensive tasks migrate quite frequently from processor to
processor (at least once a second), even when there is only one
CPU-intensive task running.

Thanks go to:
greg.harrison@analog.com (greg harrison)
Glenn.Satchell@Uniq.com.au (Glenn Satchell - Uniq Professional Services)
Kevin.Sheehan@uniq.com.au (Kevin Sheehan {Consulting Poster Child})

Regards,

Bob Cronin
(RJCronin@uop.com)

Addenda:
========

for proctool, archie sez:
==========================================================================

Host qiclab.scn.rain.com (204.188.34.97)
Last updated 02:50 23 Nov 1995

    Location: /pub
      DIRECTORY drwxr-xr-x 1024 bytes 02:11 23 Nov 1995 proctool

Host ftp.iij.ad.jp (192.244.176.50)
Last updated 15:53 26 Nov 1995

    Location: /pub/systems/sun-info/sun-us/mde
      DIRECTORY drwxr-xr-x 512 bytes 19:10 10 Nov 1995 proctool

Host leica.ccu.edu.tw (140.123.1.3)
Last updated 15:33 9 Dec 1995

    Location: /pub1/unix/sun/mde
      DIRECTORY drwxr-xr-x 512 bytes 03:58 31 Oct 1995 proctool

Host opcom.sun.ca (205.189.200.5)
Last updated 03:42 19 Nov 1995

    Location: /pub/binaries
      DIRECTORY drwxr-xr-x 512 bytes 07:12 1 Aug 1995 proctool

==========================================================================
NOTE: The first two digits of the proctool "version" must match the
Solaris version. Also, as of version 2.X.5, you need the Motif
library, libXm.so.x. Not having it, I ended up with version 2.3.3.

top 3.3 is available from sun-managers' own "eecs.nwu.edu" in /pub/top.

----- Begin Included Message -----

From kevin@uniq.com.au Thu Dec 7 17:39:01 1995
.
.
.

[ Regarding "Multi-processor stats and control", rjcronin@uop.com writes on Dec 6: ]

> We are trying to get a better handle on how thread migration works on
> multi-processor SPARCs running Solaris2.X. I am looking for pointers
> to DOCUMENTS/WHITE PAPERS/BOOKS to assist us in understanding the
> scheduling algorithms, as well as any TOOLS available for monitoring
> LWPs/thread scheduling/thread migration from the perspective of
> individual processes.

The Solaris 2.x kernel internals course goes into this in some detail.
>
> We have read:
> AnswerBook: Guide to Multithread Programming
> Cockcroft: Sun Performance and Tuning
> man pages: mpstat(1M), sar(1), pbind(1M), etc.
>
> Specifically, we have multiple concurrent users running a compute
> intensive simulation application on a multiprocessor SPARC20 with
> Solaris2.3. The simulation application is commercial software that is

I would recommend moving to 2.4 at least. Scheduling and memory use
improved a great deal between the two.

> single-threaded. We leave the threads unbound. We would like to be
> able to tell what is happening with respect to a single instance of a
> simulator process -- is it migrating from processor to processor, or
> does it stay on one the whole time? If it migrates, what secondary

top will tell you this - so will processor_bind() with PBIND_QUERY,
once you know the LWP ID.

> cache efficiency is lost? (mpstat shows the machine doing about 20
> migrations/second, but which threads are migrating?)

The basic idea of the loose cache affinity is that if a process has
been runnable for a while (TS_RUN) but not on a processor (TS_ONPROC)
then it probably doesn't have much left in a particular cache, and
can be migrated to the processor running the lowest priority thread.

As I recall, the threshold is 30ms. This means that if you are about
to run on the same processor and likely still have things in that
cache, you'll wait. If you wait long enough, it presumes you have
little left in the cache and tries to get you a processor that you
are more likely to run on.
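
A toy model of the rule as described above, sketched in Python. It
assumes the 30ms figure recalled here and is not the actual Solaris
dispatcher code; `Cpu`, `choose_cpu` and `AFFINITY_WINDOW_MS` are
hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Assumption: the ~30 ms figure recalled in the text, not a documented constant.
AFFINITY_WINDOW_MS = 30


@dataclass
class Cpu:
    cpu_id: int
    # Priority of the thread currently on this CPU.
    # Assumption: a larger number means a higher priority.
    running_priority: int


def choose_cpu(last_cpu_id, waited_ms, cpus, window_ms=AFFINITY_WINDOW_MS):
    """Pick a CPU for a runnable thread (hypothetical helper).

    While the thread has waited less than window_ms, its cache contents are
    presumed still warm, so it keeps waiting for the CPU it last ran on.
    After that, it is sent to the CPU running the lowest-priority thread.
    """
    if waited_ms < window_ms:
        return last_cpu_id
    return min(cpus, key=lambda c: c.running_priority).cpu_id
```

For a thread that last ran on CPU 0, `choose_cpu(0, 5, cpus)` keeps it
waiting for CPU 0, while `choose_cpu(0, 45, cpus)` moves it to whichever
CPU in `cpus` is running the lowest-priority thread.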

If you want to know where your threads are, then I recommend using
proctool. If all else fails, you can get the LWP ID from this, and
use the PBIND_QUERY form of processor_bind() to see where they
all are at any given moment. Remember you are looking at a dynamic
system, and that the act of measuring may cause differences as well.
>
> We are planning to invest in bigger hardware to support many more
> users. We would like to make an intelligent decision as to how to
> allocate money spent on the number and speed of processors and size of
> secondary cache.

One study I'd make - if you have N processors, I'd investigate using
a batching system to run N instances of the compute-intensive application
at a time through them, to see whether cache effects and migration make
any difference at all to the run time of an instance.
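
One rough way to run such a study without a real batching system is
sketched below in Python. It is only an illustration: `timed_run` and
`run_batch` are hypothetical helpers, the job command is a placeholder
for the real simulator command line, and wall-clock time is assumed to
be the figure of interest.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(cmd):
    """Run one command to completion and return its wall-clock time in seconds."""
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    return time.monotonic() - start


def run_batch(cmds, n_slots):
    """Run cmds with at most n_slots running at once; return per-run times."""
    with ThreadPoolExecutor(max_workers=n_slots) as pool:
        return list(pool.map(timed_run, cmds))


if __name__ == "__main__":
    # Placeholder job: replace with the real simulator command line.
    job = [sys.executable, "-c", "sum(range(10**6))"]
    for n in (1, 2, 4):
        times = run_batch([job] * n, n_slots=n)
        print(n, "concurrent:", [round(t, 2) for t in times])
```

If the per-instance times with N running together are close to the time
for one instance running alone, cache effects and migration are probably
not costing much.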

You should also take a look at the paging rate - if they are waiting for
I/O all the time (page faults for data) then they may not be truly CPU
bound, and may be becoming migratable while they wait for the page faults.

Hope this helps a bit anyway - answering these kinds of questions has
become harder in the MP world, but I would definitely recommend moving
away from 2.3.

                l & h,
                kev

----- End Included Message -----

ORIGINAL POST
=============

Dear Sun-Managers:

We are trying to get a better handle on how thread migration works on
multi-processor SPARCs running Solaris2.X. I am looking for pointers
to DOCUMENTS/WHITE PAPERS/BOOKS to assist us in understanding the
scheduling algorithms, as well as any TOOLS available for monitoring
LWPs/thread scheduling/thread migration from the perspective of
individual processes.

We have read:
        AnswerBook: Guide to Multithread Programming
        Cockcroft: Sun Performance and Tuning
        man pages: mpstat(1M), sar(1), pbind(1M), etc.

Specifically, we have multiple concurrent users running a compute
intensive simulation application on a multiprocessor SPARC20 with
Solaris2.3. The simulation application is commercial software that is
single-threaded. We leave the threads unbound. We would like to be
able to tell what is happening with respect to a single instance of a
simulator process -- is it migrating from processor to processor, or
does it stay on one the whole time? If it migrates, what secondary
cache efficiency is lost? (mpstat shows the machine doing about 20
migrations/second, but which threads are migrating?)

We are planning to invest in bigger hardware to support many more
users. We would like to make an intelligent decision as to how to
allocate money spent on the number and speed of processors and size of
secondary cache.

Thanks for any info...
Summary will follow.

Bob Cronin
(RJCronin@uop.com)