SUMMARY: Openwindows failure

From: Roxane du Berger (roxane@linda.RI.MGH.McGill.CA)
Date: Fri Feb 28 1997 - 09:20:24 CST


Dear sun-managers:

I apologize for not having posted a summary earlier. Unfortunately I have
been unsuccessful in bringing openwindows back to life. Below, I summarize
my original post, followed by the suggestions I received. I don't
understand every aspect of the problem, but after closer examination of
the boot messages I have discovered two "ifconfig" messages, one related
to a bad address and the other to a nonexistent interface. Not being able
to work with openwindows is very difficult. I am lucky to have temporary
access to another Sun, and use rlogin and xhost to work with my day-to-day
programs. I discuss these "ifconfig" messages further in the last
portion of this summary.

Summary of Original Question (February 17, 1997):
-------------------------------------------------

Two Sun ELCs (SunOS 4.1.2 2 sun4c) lost access to openwindows within
hours of each other (both run version 1).
At the command openwin, the following message appears:

" unable to resolve host localhost "
When rebooting, we read
" syslogd: line 33: unknown host localhost "

=================
My thanks go to the following for their suggestions:

Matthew.Stier@MCI.Com
Claus Assmann <ca@informatik.uni-kiel.de>
sunman@oak.london.waii.com
Frank Pardo <fpardo@tisny.com>
Bismark Espinoza <bismark@alta.Jpl.Nasa.Gov>
Jacques Rall <jacques.rall@za.eds.com>
White Gary SrA USAFE CSS/SCOE <Gary.White@ramstein.af.mil>
Jens Fischer <jefi@kat.ina.de>
Brett Lymn <blymn@awadi.com.au>
Rasana Atreya <atreya@library.ucsf.edu>
Jochen Bern <bern@penthesilea.uni-trier.de>
=================

The suggestions tried are summarized as follows:
------------------------------------------------

- Ensure the following line is in /etc/hosts: 127.0.0.1 localhost

These are the first few lines of /etc/hosts:

127.0.0.1 localhost
#
198.168.146.1 mtlgeneral.cc.mcgill.ca gateway
# McGill mailhost
132.206.27.10 mailhost.mcgill.ca mailhost sifon sifon.cc.mcgill.ca
#
198.168.146.23 linda.RI.MGH.McGill.CA linda loghost
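
As a cross-check of this suggestion, the lookup that a files-only resolver
performs can be approximated by scanning the hosts file for a name. This is
only a sketch, assuming a POSIX shell and awk (the original Bourne shell on
4.1.x may need adjustments); the function name hosts_lookup is made up for
illustration, and the sketch says nothing about whether NIS or DNS is
consulted before /etc/hosts.

```shell
# hosts_lookup NAME [FILE]   (hypothetical helper, not a system command)
# Print the first address whose line in FILE (default /etc/hosts) lists NAME
# as the official name or as an alias. Comments are ignored and the
# comparison is case-insensitive. Exit status is 1 when NAME is not found.
hosts_lookup() {
    awk -v name="$1" '
        BEGIN { want = tolower(name); status = 1 }
        {
            sub(/#.*/, "")                  # strip comments
            for (i = 2; i <= NF; i++) {     # field 1 is the address
                if (tolower($i) == want) {
                    print $1
                    status = 0
                    exit
                }
            }
        }
        END { exit status }
    ' "${2:-/etc/hosts}"
}
```

For example, "hosts_lookup localhost" should print 127.0.0.1 for the file
shown above, and "hosts_lookup loghost" should print 198.168.146.23.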

--------------

- Make sure entry in /etc/hostname.le0 matches entry for loghost in /etc/hosts

I modified /etc/hostname.le0 so that its entry is identical to the
loghost entry in /etc/hosts:

% cat /etc/hostname.le0
linda.RI.MGH.McGill.CA

% grep -i linda /etc/hosts
198.168.146.23 linda.RI.MGH.McGill.CA linda loghost
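
The same comparison can be repeated for every /etc/hostname.* file at once,
which would also reveal a leftover file that names a host or interface no
longer in use. A minimal sketch, assuming a POSIX shell and awk, and
assuming (this has not been checked against the 4.1.2 rc scripts) that the
boot scripts run ifconfig once per /etc/hostname.* file. The helper name
check_hostname_files is hypothetical.

```shell
# check_hostname_files [ETCDIR]   (hypothetical helper)
# For each ETCDIR/hostname.* file, print the interface suffix, the name
# stored in the file, and whether that name appears in ETCDIR/hosts.
check_hostname_files() {
    etc="${1:-/etc}"
    for f in "$etc"/hostname.*; do
        [ -f "$f" ] || continue             # unmatched pattern stays literal
        ifname="${f##*/hostname.}"          # e.g. le0
        name=$(head -n 1 "$f" | tr -d ' \t\r')
        if awk -v n="$name" '
               BEGIN { n = tolower(n); rc = 1 }
               {
                   sub(/#.*/, "")
                   for (i = 2; i <= NF; i++) if (tolower($i) == n) rc = 0
               }
               END { exit rc }' "$etc/hosts" 2>/dev/null
        then
            echo "$ifname: $name found in hosts"
        else
            echo "$ifname: $name NOT in hosts"
        fi
    done
}
```

Any line reported as "NOT in hosts", or any unexpected interface suffix,
would be worth comparing with the ifconfig messages printed at boot time.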

---------------

- Check that the file /etc/resolv.conf is correct.
  
The contents of /etc/resolv.conf are correct in the sense that they
are identical to those I examined on other Sun machines
running SunOS 4.1.x:

% cat /etc/resolv.conf
domain RI.MGH.McGill.CA
nameserver 132.206.44.21
nameserver 132.206.1.11
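
To compare resolver settings between machines, or to test each server on
its own, it can help to list the nameserver entries in the order in which
they appear. A small sketch, assuming a POSIX shell and awk; the function
name list_nameservers is made up for illustration.

```shell
# list_nameservers [FILE]   (hypothetical helper)
# Print the nameserver addresses from FILE (default /etc/resolv.conf),
# one per line, in file order.
list_nameservers() {
    awk '$1 == "nameserver" && NF >= 2 { print $2 }' "${1:-/etc/resolv.conf}"
}
```

Each address printed can then be queried individually, for example with
"nslookup linda 132.206.1.11", to see which server actually answers.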
---------------

- ping -s localhost

This is what I obtain (I pinged linda rather than localhost):

% ping -s linda
64 bytes from linda.RI.MGH.McGill.CA (198.168.146.23): icmp_seq=0. time=2. ms
64 bytes from linda.RI.MGH.McGill.CA (198.168.146.23): icmp_seq=1. time=1. ms

----linda.RI.MGH.McGill.CA PING Statistics----
2 packets transmitted, 2 packets received, 0% packet loss
round-trip (ms) min/avg/max = 1/1/2

---------------

- If the name "localhost" is correctly defined in the /etc/hosts file,
  then is the machine
  A) configured to ignore /etc/hosts?
  B) trying to look up that name in DNS and/or NIS instead?
  C) possibly addressing the wrong DNS nameserver?

To answer A), I don't think the machines are configured to
ignore /etc/hosts: the system recognizes the aliases at
boot time and with rlogin. I'm not sure how to answer B) and C).
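
One possible way to approach question B) is to ask the NIS client tools
directly whether a domain is set, whether the machine is bound to a server,
and what the NIS hosts map returns for the name. The sketch below uses the
standard domainname, ypwhich and ypmatch commands; it assumes a POSIX shell
(the "command -v" test may not exist in the 4.1.x Bourne shell), and the
wrapper name nis_hosts_check is hypothetical.

```shell
# nis_hosts_check [NAME]   (hypothetical wrapper; NAME defaults to localhost)
# Report the NIS domain, the bound NIS server (if any), and the entry that
# the NIS hosts map holds for NAME.
nis_hosts_check() {
    name="${1:-localhost}"
    for cmd in domainname ypwhich ypmatch; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd not found; NIS client tools do not appear to be installed"
            return 0
        fi
    done
    echo "NIS domain: $(domainname 2>/dev/null)"
    if server=$(ypwhich 2>/dev/null); then
        echo "bound to NIS server: $server"
        ypmatch "$name" hosts 2>/dev/null ||
            echo "$name is not in the NIS hosts map"
    else
        echo "not bound to an NIS server"
    fi
}
```

If the machine turns out to be bound to a server whose hosts map lacks
localhost, that would point toward B); if it is not bound at all, NIS is
unlikely to be involved.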

---------------

- Put 127.0.0.1 into every possible registry: /etc/hosts,
  the NIS hosts map, DNS, etc.

I added the line
nameserver 127.0.0.1
to /etc/resolv.conf. I rebooted and received many unfamiliar, ugly and
cryptic messages. In any case, none of the other Suns in our unit
operating under OS 4.1.x have this line in their /etc/resolv.conf or
other files. I therefore removed the entry.
-----------------
- examine /tmp/truss.out with the command
  # truss -o /tmp/truss.out -f /usr/openwin/bin/openwin

I could not find the "truss" command on my system.
---------------
- Change:
  hosts: nis [NOTFOUND=return] files
  to
  hosts: files nis [NOTFOUND=return]

I don't know how or where to make this change on my OS.
=====================

New Revelations:
----------------

Unfortunately, none of the above suggestions has brought back openwin.
What I don't understand is that only openwindows fails. Everything else
that uses the network works: ping, rlogin, netstat, xhost, e-mail,
and so on. This is what I get with

% ifconfig -a
le0: flags=63<UP,BROADCAST,NOTRAILERS,RUNNING>
        inet 198.168.146.23 netmask ffffff00 broadcast 198.168.146.255
lo0: flags=49<UP,LOOPBACK,RUNNING>
        inet 127.0.0.1 netmask ff000000
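
The values in this output can be cross-checked by hand. The broadcast
address is the inet address with every host bit (every bit not covered by
the netmask) set to one, and the hexadecimal flags value is a sum of
single-bit flags. The sketch below assumes a POSIX shell with arithmetic
expansion and the classic BSD flag bit values, which are consistent with
both lines of output above; the function names are made up for
illustration.

```shell
# broadcast_addr IP HEXMASK   (hypothetical helper)
# Compute IP OR (NOT netmask); the netmask is given in the hexadecimal form
# that ifconfig prints, e.g. ffffff00.
broadcast_addr() {
    ip=$1
    mask=$((0x$2))
    o1=${ip%%.*}; rest=${ip#*.}
    o2=${rest%%.*}; rest=${rest#*.}
    o3=${rest%%.*}; o4=${rest#*.}
    addr=$(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
    bcast=$(( (addr | ~mask) & 0xffffffff ))
    echo "$(( (bcast >> 24) & 255 )).$(( (bcast >> 16) & 255 )).$(( (bcast >> 8) & 255 )).$(( bcast & 255 ))"
}

# decode_if_flags HEXFLAGS   (hypothetical helper)
# Translate the hexadecimal flags value into flag names, assuming the
# classic BSD bit assignments listed in the loop below.
decode_if_flags() {
    flags=$((0x$1))
    out=
    for pair in 1:UP 2:BROADCAST 4:DEBUG 8:LOOPBACK 16:POINTOPOINT \
                32:NOTRAILERS 64:RUNNING 128:NOARP; do
        bit=${pair%%:*}
        if [ $(( flags & bit )) -ne 0 ]; then
            out="${out:+$out,}${pair#*:}"
        fi
    done
    echo "$out"
}
```

With the values above, "broadcast_addr 198.168.146.23 ffffff00" gives
198.168.146.255, and "decode_if_flags 63" gives
UP,BROADCAST,NOTRAILERS,RUNNING. A broadcast address ends in .255 only when
the last octet of the netmask is 00; with a netmask of ffffffc0, for
instance, the same inet address would give 198.168.146.63.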

I don't know if the output is reasonable. Inspection of other
configurations on the local network shows that the broadcast address
does not always end in *.255. The inet number does show my
correct internet address. The netstat command under various options
produces

% netstat -r
Routing tables
Destination Gateway Flags Refcnt Use Interface
localhost.CC.McGill. localhost.CC.McGill. UH 0 298 lo0
default 198.168.146.1 UG 2 2919 le0
198.168.146.0 linda U 3 1142 le0

% netstat -r -n
Routing tables
Destination Gateway Flags Refcnt Use Interface
127.0.0.1 127.0.0.1 UH 0 298 lo0
default 198.168.146.1 UG 1 2845 le0
198.168.146.0 198.168.146.23 U 3 1103 le0

% netstat -i
Name Mtu Net/Dest Address Ipkts Ierrs Opkts Oerrs Collis Queue
le0 1500 198.168.146.0 linda 6597 0 594 0 0 0
lo0 1536 loopback localhost.CC.McGill.CA 702 0 702 0 0 0

The command nslookup recognizes my name and address (ignoring the first
line message):

% nslookup linda
*** Can't find server name for address 132.206.44.21: Query refused
Server: [132.206.1.11]
Address: 132.206.1.11

Name: linda.RI.MGH.McGill.CA
Address: 198.168.146.23

Given the above observations, I can't explain why, at boot time, I read
the messages below, between the lines describing swap and dump and the
line announcing the filesystem check:

"...
swap on...
dump on...
ifconfig: linda.epi.mcgill.ca: bad address
ifconfig: ioctl (SIOCGIFFLAGS): no such interface
check filesystems...
...."

The first ifconfig message is strange indeed: linda.epi.mcgill.ca is my
old internet name, which changed in July 1996. My old name and address were
linda.epi.mcgill.ca (132.206.248.23) and, in July 1996, became
linda.ri.mgh.mcgill.ca (198.168.146.23). I have grep'ed everywhere I
can think of on my machine, but found nothing that even remotely resembles
"linda.epi.mcgill.ca". I have also reviewed all the /etc/rc* files and
believe, with my limited network knowledge, that there are no errors.
I added the alias linda.epi.mcgill.ca to the /etc/hosts file and the
offending message disappeared at reboot. However, this does not answer
the question of why the system is looking for my old name.
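
A search for the old name is easy to get subtly wrong: the name may be
stored in a different case, and an unescaped dot in a grep pattern matches
any character. The sketch below shows one way to search directories for a
fixed string regardless of case. It assumes a POSIX shell and a grep that
supports -F, -i and -l; the function name find_old_name is hypothetical. It
searches only regular files directly inside the directories given, so a
copy of the name held elsewhere (for instance in an NIS map served by
another machine) would not be found.

```shell
# find_old_name PATTERN DIR...   (hypothetical helper)
# Print the names of regular files directly under each DIR, hidden files
# included, that contain PATTERN as a fixed string, ignoring case.
find_old_name() {
    pattern=$1
    shift
    for dir in "$@"; do
        for f in "$dir"/* "$dir"/.[!.]*; do
            [ -f "$f" ] || continue
            grep -i -l -F -e "$pattern" "$f" 2>/dev/null
        done
    done
    return 0
}
```

For example, "find_old_name linda.epi /etc" would list every file in /etc
that still mentions the old name in any capitalization, including backup
copies such as a renamed hostname.* file.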

I do not understand the second ifconfig message, in which the ioctl
call reports a missing interface. Why is it missing, and could this
affect the operation of openwindows?

My questions are now the following:

1) How do commands such as ping, ifconfig and netstat provide
   correct internet names and addresses, in spite of the
   boot time ifconfig messages searching for my old address (prior
   to my adding the alias to /etc/hosts)?

2) Can the interface that ifconfig reports as nonexistent affect openwin?

3) If all attempts at recovering openwin fail,
   are there more "radical" solutions, short of abandoning
   my computer? The machine and the OS have given me years
   and years of stability and satisfaction and I feel awfully
   attached to it...

I thank you in advance for any advice you can provide me with.

Regards,
o===========================================================================o
| Roxane du Berger |,/| ,. - _ ,<roxane@linda.ri.mgh.mcgill.ca> |
|tel:(514) 937-6011 (4726) /, \'. "\ Hopital general de Montreal |
| { \ ` : Montreal, Quebec, CAN H3G 1A4|
| fax:(514) 934-8293 `;;_' { ; |
| (,(, _.,>-(__,/ Service d'epidemiologie clinique|
o===========================================================================o



This archive was generated by hypermail 2.1.2 : Fri Sep 28 2001 - 23:11:47 CDT