SUMMARY:<exiting process>

From: Amid Ansari (amid@nippon.com)
Date: Sun Mar 20 1994 - 16:19:35 CST


Dear Managers,

Thanks to people who sent me various responses. I will try to summarize my findings as best
as I can.

Problem:

If I sent a file to a printer connected to a parallel port while the printer was not ready
to receive, printing would stop and would not resume, even after the printer was made ready.
I would then try to restart the lpd daemon through lpc, but I would be left with a process
in the "<exiting>" state, like the following:

 PID TT STAT TIME COMMAND
2042 ? I 0:00 <exiting>

The only way I could end the process, and thus restart printing, was to reboot.

Solution:

From what I've heard so far, the <exiting> process is a dead process trying to clean up. It
seems that a parent process spawns the child and, after the child dies, the child can't
finish cleaning up, so it is left with the <exiting> status. Luckily, this is a bug and
there should be a patch for it (check with your Sun rep, or whoever can help you get the
patch). I was also advised to use gcore to kill the process. I haven't tried it yet, but it
seems like it may work.

I will also include messages I received regarding this problem, and you can reach your own
conclusion.

**********************************************************************************************

I think you have a "zombie". I don't think there is any way to get rid
of them until re-boot time. Please let us all know if somebody has
invented the magic rite of exorcism for these little devils.

                                        - Stefan
Stefan W. Mochnacki INTERNET - stefan@centaur.astro.utoronto.ca
Astronomy, U. of Toronto UUCP - {uunet,pyramid}!utai!helios.physics!stefan
Ph. (905) 884-9562 LOCATION - David Dunlap Observatory
FAX (905) 884-2672 Ph. (Mon,Wed) (416)-978-4165 (St.George Campus)
Cell(416) 460-0539 S/V "SOMEWHERE"

**********************************************************************************************

try using the gcore function, you will get a core dump, but the process should stop.

thanks bruce

**********************************************************************************************

We have the same problem here with processes attached to remote printers.
The only way I have found to get rid of this is to first abort the print
queue and then do a "gcore <pid>". Sometimes it takes a few "gcore"s to get
rid of the process. In some cases this doesn't even work and only a reboot
will fix it. Also, turn off the printer after aborting the queue (if a
printer is what your problem is).

Bret Robinson

| Bret Robinson, Unix System Admin \ brobinso@bchspd.wimsey.bc.ca |
| B.C. Hydro, Burnaby, BC, Canada \ Voice: (604) 528-2137 |
| \ Fax: (604) 528-1828 |
| "Keep your stick on the ice" \__ o |

**********************************************************************************************

I often have the same problem. You have to run "ps" with an option that
shows the parent process ID (check the man page for ps). You may have to trace it
back several generations until you find the original parent process. Kill the
parent, and it might kill the child processes....
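A minimal sketch of tracing a process back through its parents, as described above. It assumes "ps" output has already been reduced to PID and PPID columns (on POSIX systems "ps -e -o pid,ppid" produces this; the option differs on other systems, so check the man page). The function names and the sample listing are hypothetical; only PID 2042 comes from the original report.

```python
def parse_ps(text):
    """Map each PID to its PPID, given lines of the form 'PID PPID'."""
    parents = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any line that is not two leading numbers.
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            parents[int(fields[0])] = int(fields[1])
    return parents


def ancestors(pid, parents):
    """Return the chain of parent PIDs, nearest parent first."""
    chain = []
    seen = {pid}
    while pid in parents:
        pid = parents[pid]
        if pid == 0 or pid in seen:  # stop above init, or on a loop
            break
        chain.append(pid)
        seen.add(pid)
    return chain


# Hypothetical listing: 150 and 2001 are made-up PIDs for illustration.
SAMPLE = """\
  PID  PPID
    1     0
  150     1
 2001   150
 2042  2001
"""

if __name__ == "__main__":
    print(ancestors(2042, parse_ps(SAMPLE)))
```

The first entry in the returned chain is the process that ought to be calling wait(); the later entries are the further generations to check if killing the immediate parent is not an option.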

Can you e-mail a summary of replies that you might get to me at:

   nishan@alldata.com

-nishan

**********************************************************************************************

You generally can't kill these. They're stuck in the kernel, waiting for a
device to clear. This can happen if it was using a tape drive that locked
up, or it exited while writing to a tty that was in XOFF state.

-- 
Barry Margolin
System Manager, Thinking Machines Corp.

barmar@think.com {uunet,harvard}!think!barmar

**********************************************************************************************

Things to try:

1) Convince the process that spawned it to wait*() for it
2) Kill the process that spawned it
3) Send a HUP to init (kill -1 1)
4) Reboot
5) Track down the code that's forking those processes, and -change- it to do the
   appropriate wait*().
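As an illustration of item 5, here is a minimal sketch of a parent that does the appropriate wait*() for its children without blocking. It is not the lpd code; the helper names (reap_children, spawn, run_demo) are hypothetical, and the children simply exit at once with a known code.

```python
import os
import time


def reap_children():
    """Collect every child that has already exited, without blocking."""
    reaped = {}
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return reaped  # this process has no children left
        if pid == 0:
            return reaped  # children exist, but none has exited yet
        reaped[pid] = os.WEXITSTATUS(status)


def spawn(exit_code):
    """Fork a child that exits immediately with exit_code."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)
    return pid


def run_demo():
    pids = [spawn(code) for code in (0, 1, 2)]
    done = {}
    deadline = time.monotonic() + 5
    # Poll until every child has been reaped (or we give up after 5 seconds).
    while len(done) < len(pids) and time.monotonic() < deadline:
        done.update(reap_children())
        time.sleep(0.05)
    return [done.get(pid) for pid in pids]


if __name__ == "__main__":
    print(run_demo())
```

A long-running daemon could call something like reap_children() from its main loop, or from a SIGCHLD handler, so that dead children never accumulate.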

You didn't mention the version of SunOS - that would be helpful in the future.

**********************************************************************************************

In message <9403151858.AA02425@nippon.com> you write:
>How do I get rid of a process that has "<exiting>" listed as its command.
>
>The process looks like this when I do a "ps" on it :
>
> PID TT STAT TIME COMMAND
>2042 ? I 0:00 <exiting>
>
>I have tried all kill commands, even with a -9 switch. no sire!
>
>please help me at amid@nippon.com
>
>thank you.
>

In article <9403151858.AA02425@nippon.com>, you write:
> How do I get rid of a process that has "<exiting>" listed as its command.

you can't - it's up to another process to remove it. here's the way it works:

to create a new process in unix, you use fork(), which creates a practically identical child process. Since there's a parent-child relationship, all processes are in a tree formation. When a child process dies, it is marked as exiting, and is no longer runnable. It is not removed from the process table until the parent process issues a wait() system call. wait() gives the exit status to the parent, and the child is removed from the table. So your problem is that some parent process either isn't waiting for children, or just hasn't got around to it yet.

When a parent dies before the child, the child is changed to be a child process of init (process 1). init spends most of its time waiting for children, so it tends to clear up such dangling children quickly. Thus, if you've got an exiting child left over from a broken parent, kill the parent, and init will clear up the child.

Note: you should check which process is spawning the children and then not waiting for them (PPID column on ps), then fix it.

steve -- <Steve_Kilbane@gec-epl.co.uk>
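A minimal sketch of the fork()/wait() behaviour described above, assuming a Unix system where Python's os.waitid() is available (it is on Linux). The function names are hypothetical. The child exits at once; the parent uses WNOWAIT to learn that the child is dead without reaping it, shows that the dead child still occupies a process table entry and that "kill -9" changes nothing, and only then calls waitpid() to collect the status.

```python
import os
import signal


def entry_exists(pid):
    """Return True if pid still has a process table entry."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    return True


def demo_zombie():
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit immediately with status 7

    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    before_kill = entry_exists(pid)  # dead, yet still in the table
    os.kill(pid, signal.SIGKILL)     # accepted, but there is nothing left to kill
    after_kill = entry_exists(pid)

    _, status = os.waitpid(pid, 0)   # the parent finally collects the status
    after_wait = entry_exists(pid)   # the entry is gone once reaped
    return before_kill, after_kill, os.WEXITSTATUS(status), after_wait


if __name__ == "__main__":
    print(demo_zombie())
```

Between the waitid() and waitpid() calls the child is in exactly the state discussed in this thread: it cannot run or catch signals, and only its parent (or init, if the parent dies) can remove it.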

**********************************************************************************************

Amid,

at best you've got to kill what it's depending on - parent process, etc..

If it's waiting for a disk device this may be an indefinite wait.

Henry

_____________________________________________________________________________
|Henry Katz                  |                                               |
|ISCS, Inc                   | email:                                        |
|1.212.685.3057              |   work: hkatz@lehman.com                      |
| Currently on contract at:  |   play: hkatz@nucmed.med.nyu.edu              |
|Lehman Brothers             | voice: 1.212.464.7319                         |
|388 Greenwich St            | fax:   1.212.464.3118                         |
|NY NY 10013                 | pager: 1.917.899.1420                         |
------------------------------------------------------------------------------

**********************************************************************************************

Find the PARENT process; it has to do a wait() to reap the dead child. Apparently that parent process was not correctly written to handle its children.

**********************************************************************************************

a process that is <exiting> is already dead and is trying to clean up. you can't send it a signal because there's no code there to catch the signal. if this is 4.1.x, you might be hung up on some STREAMS/tty code cleanup (this is a bug; there's a patch). or you could be hung up on a parent process that hasn't picked up the exit status from the terminated process.

a "kill -1 1" will clean up some of these. this looks more like a tty line problem since the stat is "I" and not "Z" (zombie, which would be a child that is exiting and is waiting for the parent process to notice)

--hal

**********************************************************************************************

I've had that problem before as well. The process eventually disappeared by itself. It did tie up my modem port for a long time, though.

If you don't do a summary of answers for the Sun-man list, could you forward any answers to me?

Thank you,

Michael Lewis
-------------
michael@kraus.com
Bernan/UNIPUB

**********************************************************************************************

Reboot, or ignore it.

It won't hurt anything, generally.

If it's a talk daemon or something like that, it might confuse the "legitimate" process, but if there's no side effects, ya might as well leave it until the next shutdown.

-John

(I've managed to get processes stuck like this by sending them a "kill -QUIT" shortly followed by a "kill -9", before the process had finished its cleanup)

**********************************************************************************************

Once again, thank you all for your help. Should I find something more, I will send another summary, called SUMMARY:<exiting process 2>

take care,

have fun,

amid@nippon.com



This archive was generated by hypermail 2.1.2 : Fri Sep 28 2001 - 23:08:58 CDT