Hi All,

As ever people responded with a great number of suggestions and experiences. Thanks go to the following for their suggestions and comments (in the order replies were received):

Joe Fletcher
Gene Beaird
Gary Bacon
Reginald Beavers
Mike Hong
Gavin McDonald
Elizabeth Lee
Paul Richards
Adam Bisbe
Steve Bagdon

This summary contains their suggestions further down. I will post a further summary some time in the future when we've done some testing.

If you don't want to read the responses and just want the products mentioned:

Big Brother (www.bb4.com) 4
  Also see www.deadcat.net for extensions to BB
BMC Patrol (www.bmc.com) 2
SNIPS (http://www.netplex-tech.com/software/snips/) 1
Spectrum (http://www.aprisma.com/) 1
Remedy (http://www.syscomworld.com/solutions/servicedesk/remedy_ar_systems.htm - I think) 1
XACCT (usage) (www.xacct.com) 1
Mon (www.kernel.org/software/mon/) 1
Site Scope (http://www.freshwater.com/SiteScope.htm) 1

Thanks again to everyone for taking time to respond.

James

My original posting:

I've been asked to put forward some suggestions, for a client, on monitoring a set of applications and where necessary alerting operators via pager/SMS - a broad brief. I've been searching the Internet for products and going through this list's archive in search of suggestions and wisdom.

Internet searches have yielded advertisements for various products, which is good but doesn't give me any idea of the general experiences, limitations or problems real-life users have had. The list archive mainly has summaries covering resource monitoring like CPU, I/O, network etc., but we're also looking for 'bigger-picture'/'high-level' reporting/alerting. Where there are requirements similar to mine there are no summaries. So I'm turning to the list at large for any input you have to pass on.

The (Sun part of the) setup that needs to be monitored has:

- DB server running Informix & some custom applications
- Application server, which talks to the DB server, running mainly custom applications and MQ Series (a middleware/guaranteed message delivery system) which talks to an external server

The applications mentioned above and the custom applications have their own logs, i.e. syslogd isn't used, so we can't monitor a centralised event log. Obviously we can monitor the processes to see if they are up; however, we know of events which can occur where the process remains up but an application event prevents 'normal operation'. Assume that these events are logged in the application-specific way, i.e. to a file somewhere.

So we will be monitoring server resources, monitoring the application processes and want to extract event information from the scattered logs. It is the latter two that I'm seeking your input on, as the first one is well documented in summaries. Where possible we would like to combine as much of this as possible in a single application to simplify the solution itself.

Where an event is detected and we have defined it as significant enough to warrant alerting an operator, we would like to do this via pager/SMS(/GSM), as sendmail is disabled on the servers and no alternative will be implemented.

This is almost certainly going to require a combination of products to achieve the solution, so I'm especially interested to hear from people in a similar situation.

-----------------------------------------------------------------

The responses (in the order I received them):

1) I've used Big Brother to great effect (www.bb4.com). It's free(ish), simple to configure and use, extensible, handles SMS alerts and suchlike. Purely on a cost basis it's definitely worth considering when compared with BMC/Tivoli et al.

-----------------------------------------------------------------

2) It depends on whether your client wants to buy something or go open source. There is a nice open source app called Big Brother that we run in our datacenter. It can monitor systems, message logs, processes, etc. Using qpage and a modem, you can get it to dial out and notify a page list of problems. It takes a bit of setting up to work, and some tweaking to fine-tune, but it has served us well for free. We use a dedicated system for our monitoring, but it is a U5, and is not stressed too much. We monitor >35 Solaris systems and >10 NT systems with this product. Check out their web site at: http://bb4.com

-----------------------------------------------------------------

3) Depending on how much money you've got to spend, I suggest you look at BMC Patrol. This is typical system monitoring software: it monitors system resources etc., and you can set thresholds so that it sends out warnings and alarms when resources reach certain levels, e.g. disk space is 99% full. It also has the capability to send out SMS messages. This product is highly configurable and should be ideal for what you're looking for. The link is www.bmc.com; the product that you need to look at is Patrol.

-----------------------------------------------------------------

4) Although BigBrother (bb4.com) monitors system resources by default, it can be easily configured to monitor applications as well. It's fully extendable, plus you can turn off the system monitoring if you prefer. www.deadcat.net offers extensions developed for databases and more. These provide good examples for developing extensions yourself. I don't know what the license fees are (I used BB at a government site) but I'd guess that they're much less than for Tivoli or BMC. In my experience, BB did the job just as well as either.
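
For the case raised in the original posting, where a process stays up but logs a problem to its own file, a custom check might look roughly like the sketch below. It is generic Python, not Big Brother's extension interface or anything taken from deadcat.net; the function name, the offset file and the example patterns are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Hypothetical patterns; real ones depend on what each application writes.
EVENT_PATTERNS = [
    re.compile(r"FATAL"),
    re.compile(r"queue .* unavailable", re.IGNORECASE),
]


def scan_new_lines(log_path, state_path, patterns=EVENT_PATTERNS):
    """Return complete lines added to log_path since the last call
    that match any of the patterns.

    The byte offset reached is kept in state_path, so repeated runs
    (for example from cron) only look at new entries.
    """
    log_path, state_path = Path(log_path), Path(state_path)
    offset = int(state_path.read_text()) if state_path.exists() else 0
    if log_path.stat().st_size < offset:
        # The log was truncated or rotated, so start again from the top.
        offset = 0
    with log_path.open("rb") as fh:
        fh.seek(offset)
        data = fh.read()
    # Leave a partially written last line for the next run.
    end = data.rfind(b"\n") + 1
    state_path.write_text(str(offset + end))
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return [line for line in lines if any(p.search(line) for p in patterns)]
```

Any lines returned could then be handed to whatever paging mechanism is chosen, such as the qpage and modem arrangement described in response 2.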

-----------------------------------------------------------------

5) If you haven't already, you can take a look at http://www.netplex-tech.com/software/snips/. SNIPS is a freely distributed system and network monitoring app. It works pretty well as it monitors basically all system functions and supports SMS, paging, and email. SNIPS is highly customizable, so if it doesn't do something you want, you can develop it yourself. I believe it can also monitor specific logfiles... I haven't actually tried it, but I see the config option in the conf file. The only disadvantage of SNIPS is that since it is freeware, there is really no support for it besides an email list and forum.

-----------------------------------------------------------------

6) As far as application monitoring, I cannot offer much advice, but for hardware monitoring, have you looked at Spectrum by Aprisma? It is a network monitoring tool, and it can page on alerts. Another product which you may find useful is Remedy, a trouble-ticket application, which can monitor systems for failure (apps too?) and then alert (via pagers/SMS/IP-clients/etc) the responsible individuals. These may be overkill, but then again, you didn't specify how large your networks are.

One last thought: XACCT makes an interesting product for parsing log files, combining miscellaneous data, and then converting it to a custom output format which you specify. Though again, this is an enterprise-class solution.

-----------------------------------------------------------------

7) BMC Patrol will do all this & more. I don't work for 'em -- just a satisfied customer.

-----------------------------------------------------------------

8) BMC Patrol will do all this & more. I don't work for 'em -- just a satisfied customer.

-----------------------------------------------------------------

9) Have you checked: http://www.kernel.org/software/mon/

-----------------------------------------------------------------

10) We use SiteScope, by Freshwater.
Does more than we've ever been able to tap, and that's with 12,000 monitors running.

-----------------------------------------------------------------

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Thu Jul 25 22:44:04 2002