SUMMARY: MAIL Disappearing problem (Long)

From: Ad S. Talwar (ast@thorn.geoquest.com)
Date: Tue Apr 21 1992 - 06:12:23 CDT


A couple of weeks back, while still running SunOS 4.1.1 (soon afterwards I
catapulted into SunOS 4.1.2 and thought mail had decided to change its
characteristics on me), I posted a request for help in tracking down a
disappearing-mail problem. My question was:

*******************************************************************************
>After scratching my head, chasing emails, mounting and unmounting the
>/var/spool/mail directory, listing /var/spool/mail files, tweaking sendmail,
>and applying patches, I cannot come to grips with the problem. Lately, although
>everything else has been running smoothly on our network, email to some
>individuals DISAPPEARS randomly. It happens to mail sent from within my company
>via YP aliases and, in most cases, to mail coming from outside to a particular
>group of users listed individually on the cc line.
>The incidents are rare. Whenever I test with /usr/lib/sendmail -bv alias_name,
>all individuals are listed correctly, but when mail is actually sent to the
>alias name, some get it and some don't. In addition, as I mentioned above,
>in one case mail is being sent to 4 or 5 users from outside the company. All
>the users are addressed individually on the To line of the mail header,
>e.g.
>To: ast@geoquest.com, bby@geoquest.com, rxy@geoquest.com, lby@xyz.nobody.com,
> zzz@geoquest.com, tta@xyz.nobody.com
> (I cannot list the specific example in this case)
> In this case users ast and rxy will get the mail, but mail to bby and zzz will
> be flaky. I do not know the status of lby and tta. I have tried some additional
> things, e.g. putting specific entries such as bby@machinename in the YP
> aliases file; this did not work. I have installed the following patch, but
> that did not help
********************************************************************************

To this day I have not been able to determine exactly what happened, but it
taught me to track mail and uucp extensively. First, I must thank the
SUN MANAGERS who took the time to reply to me. Thanks once again for your suggestions.
The honor roll is:
***********************************
<rickert@cs.niu.edu>
uunet!sunsci.cern.ch!Roberto
mdl@cypress.com
uunet!systems.caltech.edu!danielle
derek@nexus.ca
wwtz@ciba-geigy.ch
**********************************

When the problem started, we did not have extensive mail
logging. But, as email has become important, I have implemented
most of the suggestions I received.
___________________________________________________________________________________
1. Install patch 100224-02, which is also required for 4.1.2.
   The patch replaces /bin/mail and /bin/rmail and
   solves some bit-bucket disappearing-mail problems.
   I THINK PATCH 100224-03, the 4.1.2 VERSION, is dangerous: it caused
   problems for me in 4.1.2, and I had to switch back to the 4.1.1 programs.
   Check the permissions on the installed programs.
2. I have started syslogging extensively on the mailhost. The logs
   keep track of all sendmail/mail transactions, which I think has
   benefited me most. On the mailhost, in /etc/syslog.conf
   (the selector and the file name must be separated by tabs):
   mail.debug /var/log/syslog
3. To check for mail stuck on clients that could not send email, someone
   suggested listing the /var/spool/mqueue directory with the command:
   /usr/lib/sendmail -bp
   You can then force delivery of the queued mail
   (/usr/lib/sendmail -q processes the queue).
4. By default, sendmail on the mailhost comes up at debugging Level 9.
   If required, in times of trouble, sendmail can be started
   with Level 12 or 16. The errors are recorded in the /var/adm/messages file
   as SYSERR.
5. An interesting workaround is detailed here:
  ____________________________________________________________________________
   Here is (are) the problem(s) I have
   had with mail. Maybe it applies to your case somehow.
   Problem 2: looks kind of like yours.
   =========================================
   Since 4.x I have had a problem with the "metoo" mail variable.

   Example:

   section of NIS aliases from master "al" (4/280 running 4.1.1)

   ...
   cad: user1,user2,...,mdl,....
   ...

   my login is mdl, my machine tx (4/65, 4.1.1) and all our other
   NIS clients, have al's /var/spool/mail mounted so I can get mail
   on any machine.

   Problem 1:

   If I "mail cad" (from the shell or mailtool (sunview)) on tx (or any
   machine other than the NIS master/mailhost) the mail is not sent to me. It
   doesn't matter if I have metoo or nometoo in my .mailrc file. If I
   send mail to cad when I am logged in to al (from shelltool or
   mailtool, ow or sunview), metoo functions correctly. If I set debug,
   it looks like everything is ok (e.g. sendmail uses -m if metoo is
   set). To work around this problem I put the following in my .mailrc:

   alias cad mdl cad

   I (and the rest of our users) have to do this for all the NIS aliases
   to work around this problem. This has been ok for a year until....

   (scary music)

   I moved to OW version 3 and started experiencing...
    
    Problem 2:

    If I send mail to an alias using the ow mailtool (compose, or reply),
    the mail goes into the bitbucket. It doesn't go to other users on the
    alias. It's not returned... nothing.

    I tried removing everything from my .mailrc and added only metoo.
    (naturally I got out of ow every time a change was made, even though
    I'm sure I just needed to restart mailtool. By the way, what happened
    to the "source .mailrc" which used to be in the misc section of the
    sunview mailtool?)

    Also, I set up a dummy user, using defaults from /usr/lib and /usr/openwin/lib
    and duplicated the problem. I have not modified /etc/aliases on the clients.

    I have patch 100224-02 installed on the mailhost in order to get
    around a problem of sending mail to a list of people when one of them
    has an "f" for the second letter of the user name. Clients running
    4.1.2 have the problem as well.

    **Workaround** (start paying attention here)

    openwin mailtool does not use /usr/ucb/mail, so you should change
    all the workaround aliases in your .mailrc file that look like

    alias cad cad mdl

    to

    alias cad cad@al mdl

    Sun closed the service order. They didn't give a [expletive deleted]
    that the problem really was not solved and that I was not happy
    with the workaround. I suppose I should have pushed it.

    Here is my own workaround (I guess every site has found one by now).
    It is for a person who belongs to an alias group, sends mail to that
    alias group, wants to receive a copy, and also wants to get biffed.
    We put an entry like: username:username:machinename in /etc/aliases
    in YP. Also, in /etc/sendmail.main.cf, you can turn on the send-me-too
    option by having the entry Om in the options list.
    __________________________________________________________________________
6. Here are some more suggestions I received.
>> The "To:" and "Cc:" lines don't tell everything. If you are syslogging,
>> the syslog records will do a better job of helping you trace
>> what has happened.
>>
>I started to syslog on the mailhost and watch for SYSERR messages.

It is not just the SYSERR messages that matter.

When a message disappears into a black hole, look at all syslog records for
that message. There should be one identifying the message and its
"Message-ID", one with 'received from=', and one or more 'sent' messages.

If the 'sent' messages do not appear, you have a sendmail problem. If
they do show up, your problem is in the delivery agent - probably /bin/mail.
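
The check described above could be scripted roughly as follows. This is only
a sketch for a POSIX shell: the function name trace_msg is made up, and it
assumes that every syslog record for a message carries the same identifying
string (for example the queue ID) and that delivery records contain the word
"sent" in some letter case. Compare it against what your own syslog actually
records before trusting the verdict.

```shell
# trace_msg LOGFILE ID
# Hypothetical helper: prints every syslog line that mentions ID
# (a queue ID or Message-ID string), then says where to look next.
trace_msg() {
    logfile=$1
    id=$2
    # -F: treat ID as a fixed string, not a regular expression.
    if ! grep -F -e "$id" "$logfile" >/dev/null; then
        echo "no syslog records for $id"
        return 1
    fi
    grep -F -e "$id" "$logfile"
    # Assumption: a delivery record contains "sent" (any letter case).
    if grep -F -e "$id" "$logfile" | grep -i "sent" >/dev/null; then
        echo "verdict: 'sent' recorded; suspect the delivery agent (e.g. /bin/mail)"
    else
        echo "verdict: no 'sent' record; suspect sendmail"
    fi
}

# Example:
#   trace_msg /var/log/syslog AA01234     # AA01234 is a made-up queue ID
```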

>> If the mail is being sent from a client, there is always the chance that
>>it is sitting in a queue file on your client, and the queue is never being
>>processed.
  I have not looked at this option on machines other than the mailhost.

 I should have mentioned how to check. Use: /usr/lib/sendmail -bp

> I will check this option on other slave servers, and machines. My
> headache is, most of the time the disappearing act is affecting
> two users out of 4, from an email coming from outside the company,

If it is always the same two users, check the permissions and ownership of
their mailboxes in /var/spool/mail. If that is wrong you will have problems.
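
One way to scan the spool directory for ownership problems is sketched below.
It is an illustration only: the function name check_mailboxes is made up, and
it assumes that each mailbox file is named after the login that should own
it. It prints the mode string as well, so odd permissions can be spotted by
eye.

```shell
# check_mailboxes [SPOOLDIR]
# Hypothetical helper: lists mailboxes whose owner differs from the file name.
# Assumes each mailbox is named after the login that should own it.
check_mailboxes() {
    spool=${1:-/var/spool/mail}
    for box in "$spool"/*; do
        [ -f "$box" ] || continue
        user=`basename "$box"`
        # Skip lock files left behind by mail programs.
        case "$user" in
            *.lock) continue ;;
        esac
        # In "ls -ld" output, field 1 is the mode string and field 3 the owner.
        mode=`ls -ld "$box" | awk '{print $1}'`
        owner=`ls -ld "$box" | awk '{print $3}'`
        if [ "$owner" != "$user" ]; then
            echo "$box: owned by $owner, expected $user (mode $mode)"
        fi
    done
}

# Example:
#   check_mailboxes /var/spool/mail
```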

> I am using uucp to pickup mail from uunet, is it possible that at
> the remote site their mail system is not sending mail to these users?

  If you are receiving via UUCP, your UUCP log records should indicate which
  recipients were on the incoming message envelope. Look for the arguments to
  the 'rmail' command. [I hope they are being logged - I'm not familiar with
  the logging of HDB versions of UUCP].

  The usual way to check this is with /usr/lib/uucp/uulog, which
  details all the transactions and their times. There are scripts available
  to format the information, I think somewhere in comp.sources or on UUNET.
7. Are you by any chance running smail?
   You do not mention running newaliases.
   At one site I deal with, we had a problem with new entries in the
   aliases file not taking effect (even after running newaliases).
   I am not sure why, but the smail at that site was not seeing newaliases.
   To make it work I manually ran the real sendmail program with -bi.
   I say 'real' sendmail because smail sits on the system as sendmail when
   it is installed.
   After doing that I was able to send to the new alias entries.

8. A cron job to clean up the syslogs:
   Yes, the file is there. The system I use to clean it up is the same as for
   /var/adm/messages, e.g.:

 mv -f /var/log/syslog.6 /var/log/syslog.7
 mv -f /var/log/syslog.5 /var/log/syslog.6
 mv -f /var/log/syslog.4 /var/log/syslog.5
 mv -f /var/log/syslog.3 /var/log/syslog.4
 mv -f /var/log/syslog.2 /var/log/syslog.3
 mv -f /var/log/syslog.1 /var/log/syslog.2
 mv -f /var/log/syslog.0 /var/log/syslog.1
 mv -f /var/log/syslog /var/log/syslog.0
 cp /dev/null /var/log/syslog
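
The same rotation can be written as a loop. The sketch below is not the
respondent's script: the function name rotate_log and its arguments are made
up for illustration, and it assumes a POSIX shell (it uses $(( )) arithmetic).
Note that syslogd keeps writing to the renamed file until it is told to
reopen its logs, so a cron job would normally send it a HUP afterwards; the
pid file location shown in the comment is an assumption.

```shell
# rotate_log FILE [KEEP]
# Hypothetical helper: shifts FILE.0 .. FILE.(KEEP-1) up by one generation,
# moves FILE to FILE.0, and leaves a fresh empty FILE.
# KEEP defaults to 7, which keeps FILE.0 through FILE.7 as in the list above.
rotate_log() {
    log=$1
    keep=${2:-7}
    i=$keep
    while [ "$i" -gt 0 ]; do
        prev=$((i - 1))
        # Only move generations that exist, so a first run is not noisy.
        if [ -f "$log.$prev" ]; then
            mv -f "$log.$prev" "$log.$i"
        fi
        i=$prev
    done
    if [ -f "$log" ]; then
        mv -f "$log" "$log.0"
    fi
    # Recreate the live log as an empty file.
    cp /dev/null "$log"
}

# Example (as root, from cron):
#   rotate_log /var/log/syslog 7
#   kill -HUP `cat /etc/syslog.pid`    # assumed pid file location
```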

9. Here is a neat suggestion for people NFS-mounting /var/spool/mail
   on clients:
   Well, I had some problems in this area some months ago. My problem then
   was a timing problem with the NFS-mounted /var/spool/mail on the NFS client
   side. It happened to users who had their mailbox open: when mail was
   delivered and the mailbox was then closed, some of the mail got lost because
   the accessed mbox was written back over the newly added mail. The solution
   was to mount /var/spool/mail with the option "actimeo=1" on the NFS clients.
   I'm not sure whether this is your problem; anyway, I'd strongly suggest you
   apply this option if you haven't done so already.
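
To see whether a client already mounts the mail spool with this option, a
check along the following lines may help. It is a sketch under assumptions:
the function name check_actimeo is made up, the host name "mailhost" in the
example is a placeholder, and the file is assumed to use the usual
six-field layout (filesystem, directory, type, options, freq, pass).

```shell
# check_actimeo [FSTAB]
# Hypothetical helper: prints NFS entries for /var/spool/mail whose
# option field does not mention actimeo=. Prints nothing if all is well.
check_actimeo() {
    fstab=${1:-/etc/fstab}
    awk '$1 !~ /^#/ && $2 == "/var/spool/mail" && $3 == "nfs" && $4 !~ /actimeo=/ {
        print "missing actimeo: " $0
    }' "$fstab"
}

# An entry that passes the check might look like this
# ("mailhost" is a placeholder for your mail server):
#   mailhost:/var/spool/mail /var/spool/mail nfs rw,hard,actimeo=1 0 0
```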
______________________________________________________________________________

Hope this helps someone. If anyone has different opinions, thoughts, or
suggestions, PLEASE EMAIL them to me; I am putting together a README for
email problems.

Thanks


