SUMMARY: rpc.ttdbserverd

From: Johnie Stafford (js@cctechnol.com)
Date: Mon Jun 02 1997 - 17:54:44 CDT


Hi all, thanks for the responses.

My original post:
>>> On 27 May 1997 14:36:15 -0500, js@cctechnol.com (Johnie Stafford) said:

 js> I'm having troubles with rpc.ttdbserverd on several machines. After a
 js> user logs in (using CDE) this process will take up all of the
 js> available cpu time on the system (generally 60-90%). It only happens
 js> with some users on specific machines.

The cause appears to be either corrupt TT_DB files or a lack of file
descriptors. The responses I received are below. Clearing the TT_DB
directories solved the problem for me, at least for now.
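
In practice the cleanup boils down to something like the following. This is
a sketch rather than a transcript of what I typed; the find options and the
paths are guesses for a stock Solaris box, so check what it finds before
feeding anything to rm:

   # ps -ef | grep rpc.ttdb
   # kill <pid-of-rpc.ttdbserverd>
   # find / -local -type d -name TT_DB -prune -print
   # rm -rf <each TT_DB directory that find printed>

inetd restarts rpc.ttdbserverd the next time a ToolTalk client needs it, and
the daemon rebuilds the TT_DB directories on its own.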

        Johnie

Davin Milun <milun@cs.Buffalo.EDU> writes:

Here is a question that I wrote up and submitted to Casper for the Solaris
FAQ. I don't know if it has been added yet though:

Q) Why do I get a CPU-bound rpc.ttdbserverd process?

A) rpc.ttdbserverd is the RPC-based ToolTalk database server. It creates and
   manages database files kept in TT_DB directories. See ttdbserverd(1M).

   The problem is usually caused by corrupted entries in some TT_DB
   directory. The solution is therefore to kill the running
   rpc.ttdbserverd, and to completely remove all local TT_DB directories.

   rpc.ttdbserverd will be restarted from inetd when it is needed again, and
   it will rebuild the TT_DB directories automatically.

   By default these TT_DB directories are created in the top directory of
   every filesystem; however, one can use /etc/tt/partition_map to tell
   ttdbserverd where to put them. See partition_map(4) for more details.

Hans Schaechl <hans@mpim-bonn.mpg.de> writes:

I had the same problem; it turned out that the ToolTalk database
was corrupted. The only way to solve it was to manually delete all
the file objects in the TT_DB directory.

1) Disable the rpc.ttdbserverd line in /etc/inetd.conf and
   kill -HUP the inetd process.
2) Kill the rpc.ttdbserverd process.
3) Find the TT_DB directories on the ufs filesystems and delete
   all files in them.
4) Re-enable the rpc.ttdbserverd line in /etc/inetd.conf and
   kill -HUP the inetd process again.

The rpc.ttdbserverd process should now show normal CPU consumption
again.
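
At the command line, those four steps come out roughly like this (again only
a sketch; the pids are placeholders, and I simply remove the TT_DB
directories outright since rpc.ttdbserverd recreates them):

   # vi /etc/inetd.conf              <comment out the rpc.ttdbserverd line>
   # ps -ef | grep inetd
   # kill -HUP <pid-of-inetd>
   # ps -ef | grep rpc.ttdb
   # kill <pid-of-rpc.ttdbserverd>
   # find / -local -type d -name TT_DB -prune -exec rm -rf {} \;
   # vi /etc/inetd.conf              <put the rpc.ttdbserverd line back>
   # kill -HUP <pid-of-inetd>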

Shawn Hooton <shooton@cuc.com> writes:

I too had this problem. Here is a description of the problem and the fix
for it.

rpc.ttdbserverd is the RPC-based ToolTalk database server. It creates
and manages database files kept in TT_DB directories. See
ttdbserverd(1M).

The problem is usually caused by corrupted entries in some TT_DB
directory. The solution is therefore to kill the running
rpc.ttdbserverd, and to completely remove all local TT_DB directories.

rpc.ttdbserverd will be restarted from inetd when it is needed again, and
it will rebuild the TT_DB directories automatically.

By default these TT_DB directories are created in the top directory of
every filesystem; however, one can use /etc/tt/partition_map to tell
ttdbserverd where to put them. See partition_map(4) for more details.

A second possible cause is running out of file descriptors, which can be
fixed by raising the soft limit on the number of file descriptors
rpc.ttdbserverd starts with.
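
One way to do that (a sketch only; the wrapper path is made up, and I have
not tested this on every release) is to point the rpc.ttdbserverd entry in
/etc/inetd.conf at a small wrapper script that raises the soft limit before
exec'ing the real daemon, then kill -HUP inetd:

   #!/bin/ksh
   #
   # /usr/local/sbin/ttdbserverd.wrapper -- hypothetical path, pick your own.
   # Raise the soft file descriptor limit (the default is 64), then hand
   # control to the real ToolTalk database server.
   ulimit -S -n 128
   exec /usr/dt/bin/rpc.ttdbserverd "$@"

The bug report below went from 64 to 128 descriptors, and that was enough to
make the problem stay away.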

And I found this bug report in Sun's archives:

                        Bug Reports document 4017415


----------------------------------------------------------------------------

 Bug Id: 4017415
 Category: tooltalk
 Subcategory: dbserver
 State: evaluated
 Synopsis: rpc.ttdbserverd spinning, consuming nearly all cpu time
 Description:

Customer reports an error situation with rpc.ttdbserverd consuming nearly
all CPU time on an Ultra-2 running 2.5.1. CDE is not installed, only SUNWdtcor.

Unfortunately, I could not reproduce the error, but I at least found a
workaround that may help in understanding the error situation after the
fact and taking measures against it.

# uname -a
SunOS kora2 5.5.1 Generic_103640-02 sun4u sparc SUNW,Ultra-2

Local Tooltalk databases:

/usr/TT_DB
/var/TT_DB
/export/root/TT_DB
/export/home/kora2/ac-home/TT_DB
/export/home/kora2/bv-home/TT_DB
/export/home/kora2/km-home/TT_DB
/export/home/kora2/inf-home/TT_DB
/export/home/kora2/nv-home/TT_DB
/TT_DB

# more /etc/inetd.conf | grep ttdb
100083/1 stream rpc/tcp wait root /usr/dt/bin/rpc.ttdbserverd rpc.ttdbserverd

A while (5 minutes to 48 hours) after the TT_DB databases had been cleaned
out and the machine rebooted, rpc.ttdbserverd started spinning:

# w
  8:34am up 58 min(s), 3 users, load average: 1.20, 1.07, 1.02
User tty login@ idle JCPU PCPU what
root pts/0 7:36am 57 -sh
root pts/1 7:43am 7 4 truss -p 619
root pts/3 8:29am 1 w
# ps -ef|grep rpc.ttdb
    root 2006 1967 0 08:34:41 pts/3 0:00 grep rpc.ttdb
    root 619 216 91 07:42:20 ? 41:05 rpc.ttdbserverd

# kill -ABRT <ttdbserverd-pid> yielded the following stacktrace:

Reading symbolic information for /usr/dt/bin/rpc.ttdbserverd
warning: core object name "rpc.ttdbserver" matches
object name "rpc.ttdbserverd" within the limit of 14. assuming they
match
core file header read successfully
core file read error: address 0x5050c not in data space
core file read error: address 0x5050c not in data space
core file read error: address 0x5050c not in data space
Reading symbolic information for rtld /usr/lib/ld.so.1
core file read error: address 0x5050c not in data space
warning: cannot get address of PLT for "/usr/dt/bin/rpc.ttdbserverd"
detected a multi-LWP program
(l@1) terminated by signal ABRT (Abort)
(debugger) where
=>[1] 0xef70cdcc(0x5bdb9, 0xefffda84, 0x9, 0xfff898c3, 0x29de18, 0xefffdb44), at 0xef70cdcb
  [2] 0xef7039f0(0x14f4e0, 0xefffdb24, 0xefffdb40, 0xefffdb3c, 0xd24ff, 0x14f4e0), at 0xef7039ef
  [3] 0xef7037f0(0xefffdbd7, 0xefffdbcc, 0x0, 0xef716710, 0xef71670c, 0xefffdb24), at 0xef7037ef
  [4] isamfatalerror(0xefffdc60, 0xefffdc70, 0xefffdc78, 0xefffdc68, 0x7cda0, 0x7cda0), at 0x23c5c
  [5] _tt_create_obj_1(0xefffdcec, 0xcfe80, 0x1, 0x454, 0xef5fec08, 0x0), at 0x1e46c
  [6] db_server_svc_C:__sti(0x77658, 0xcfe80, 0x77658, 0x547a8, 0x548a4, 0x1e438), at 0x25690
  [7] 0xef5be1e4(0xcd0e8, 0x77658, 0xcff28, 0xcfe88, 0xef5ff210, 0xcfe80), at 0xef5be1e3
  [8] 0xef5be104(0xefffdee0, 0x0, 0xef5fec60, 0xef5ff210, 0xef773e90, 0x16), at 0xef5be103
  [9] 0xef5bffac(0x0, 0xffffe000, 0xef5f46ec, 0xef5fec60, 0xef5ff210, 0x17), at 0xef5bffab
  [10] _tt_process_transaction(0x71a30, 0x71a20, 0x796b8, 0x71a28, 0x71a2c, 0x71a18), at 0x246cc
(debugger)

 Workaround:

Clearing out the databases did not help. At least ttdbck did not find
any problems.

Creating a partition map and mapping all the ToolTalk databases to one
single TT_DB did not avoid the error situation either, but it did help in
that there was only one TT_DB to clear out.

Starting rpc.ttdbserverd from a shell with an increased number of file
descriptors (128 instead of 64) avoided the problem permanently.

 Integrated in releases:
 Duplicate of:
 Patch id:
 See also:
 Summary:
The dbserver can run out of file descriptors between it and the various
libtt instances in the clients that connect to it. The dbserver should raise
the number of file descriptors from 64 to some larger number (probably 1024).

----------------------------------------------------------------------------
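
If you want to see what limit a running rpc.ttdbserverd actually ended up
with, the proc tools can show it (on 2.5.x they live in /usr/proc/bin; this
is a general pointer of mine, not something from the bug report):

   # /usr/proc/bin/pfiles <pid-of-rpc.ttdbserverd>

The output includes a "Current rlimit: N file descriptors" line near the top.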


-- 
==============================================================================
 Johnie Stafford, System Administrator     *  Phone: (318) 261-0660
 C & C Technologies, Inc.                  *    Fax: (318) 261-0192
 730 East Kaliste Saloom Road              * E-mail: js@cctechnol.com
 Lafayette, LA  70508                      *    URL: http://www.cctechnol.com
==============================================================================


