OPEN SOURCE - Job Scheduler - Solution Stacks for Network Monitors

Job Scheduler
 
    Integration with Network Monitors

  This solution stack is used to check the results of jobs that have been executed by the Job Scheduler and to report the respective status messages to a Network Monitor.

This stack includes jobs for the detection of errors and warnings in log files and for message passing. It can be used out of the box.

Why would you integrate the Job Scheduler with a Network Monitor?
  • You could use your Network Monitor console as a single interface to monitor jobs. You would not have to check your e-mail for warnings and errors but could monitor all jobs at a glance in the status console.
  • You could use the Job Scheduler to report the execution result of cron jobs, shell scripts etc. to your Network Monitor.
  • You could configure the Job Scheduler to report warnings and errors continuously to the Network Monitor.
  • Warnings and errors are reported independently of the mail settings of specific jobs, which might specify different recipients.

Currently supported Network Monitors are Nagios and Hobbit (Big Brother).

 
[Figures: Benefits of this stack; Monitoring with Nagios; Monitoring with Hobbit]


Nagios Network Monitor

  Nagios is an open source network monitor that is available at http://www.nagios.org.

Download

Job implementations for this stack are not included in the Job Scheduler distribution for legal reasons; you can download them in the sos.stacks.jar archive. The source code of these jobs is given below.

Copy the check_scheduler.pl plugin to the plugin directory of your Nagios installation.

The Nagios Plugin requires Perl 5.8 or later and the Perl packages Net::HTTP and XML::XPath, which can be installed from http://www.cpan.org.

Configuration

Nagios-compliant jobs are defined in the scheduler_monitor.xml configuration file, which is included in the scheduler.xml configuration file with the <base file="scheduler_monitor.xml"/> element.

Add the above sos.stacks.jar to the class_path setting of your configuration file factory.ini in order to make these classes available at run time.

Configure the plugin based on the sample in scheduler.cfg. Consider the lines:

define host{
        use                     generic-host            ; Name of host template to use
        host_name               localhost
        ...
        }
define command{
       command_name    check_scheduler
       command_line    /home/sos/nagios/check_scheduler.pl -H $HOSTADDRESS$ -p $ARG1$ -e $ARG2$ -v 1
       }
define service{
       ...
       check_command   check_scheduler!4444!/monitor_service
       }
                  
The command_name assigns an alias to the Plugin script and the command_line specifies the Plugin filename on disk. The check_command configures the script addressed by the check_scheduler alias to be executed with the 4444 and /monitor_service parameter values. The first parameter is the port on which the Job Scheduler is operated, the second parameter specifies the URL that has been configured in scheduler_monitor.xml for the Web Service in the Job Scheduler.

Please note the verbosity level specified by the command line argument -v 1: a value of 1 reports every message on a separate line of the Nagios console; a value of 0, which is the default, shows a single line that summarizes the state of the jobs.
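
The way Nagios turns the check_command entry into the executed command line can be sketched as follows. This is an illustrative Python sketch, not part of the stack: the function name expand_check_command and the address 127.0.0.1 are assumptions made for the example, and only the $HOSTADDRESS$ and $ARGn$ macros used in the sample above are handled.

```python
def expand_check_command(check_command, command_line, host_address):
    """Return the command alias and the command line with macros filled in."""
    # Nagios separates the command alias and its arguments with "!".
    name, *args = check_command.split("!")
    expanded = command_line.replace("$HOSTADDRESS$", host_address)
    # $ARG1$, $ARG2$, ... are replaced by the arguments in order.
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{index}$", value)
    return name, expanded


if __name__ == "__main__":
    name, line = expand_check_command(
        "check_scheduler!4444!/monitor_service",
        "/home/sos/nagios/check_scheduler.pl -H $HOSTADDRESS$ -p $ARG1$ -e $ARG2$ -v 1",
        "127.0.0.1",  # assumed address of the host defined as localhost
    )
    print(name)
    print(line)
```

With the sample values, the port 4444 takes the place of $ARG1$ and the path /monitor_service takes the place of $ARG2$.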

Implementation

  • Nagios Plugin: check Job Scheduler status (check_scheduler.pl)

    The Plugin is called by Nagios or at the command line with the parameters:

    check_scheduler.pl -H localhost -p 4444 -e /monitor_service -v 1

    Additionally, you can use the -r parameter to reset the messages for error states that are otherwise sent repeatedly; see the job documentation for details.

    This plugin uses the following SOAP request that is sent to the Job Scheduler:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
      <soapenv:Header>
        <wsa:To xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">
          http://www.sos-berlin.com/monitor_service
        </wsa:To>
        <wsa:ReplyTo xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">
          <wsa:Address>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</wsa:Address>
        </wsa:ReplyTo>
      </soapenv:Header>
      <soapenv:Body>
        <addOrder xmlns="http://www.sos-berlin.com/scheduler"></addOrder>
      </soapenv:Body>
    </soapenv:Envelope>
                          
    The request is sent to the host and port at which the Job Scheduler is operated. The URL has to include the path that is specified by the url_path attribute of the <web_service url_path="/monitor_service"/> element in the above scheduler_monitor.xml configuration file, e.g. http://www.sos-berlin.com:4444/monitor_service.
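
For illustration, the request above could be assembled and posted with a few lines of Python. This is a sketch under assumptions, not the code of check_scheduler.pl: the function names are hypothetical, the Content-Type header is the usual one for SOAP 1.1 but is not confirmed for the Job Scheduler, and the wsa:To value is assumed to be built from the host name and the URL path as in the sample.

```python
import urllib.request

# Envelope as shown above; {to} is filled in per request.
SOAP_ENVELOPE = """<?xml version="1.0" encoding="ISO-8859-1"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Header>
    <wsa:To xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">{to}</wsa:To>
    <wsa:ReplyTo xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">
      <wsa:Address>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</wsa:Address>
    </wsa:ReplyTo>
  </soapenv:Header>
  <soapenv:Body>
    <addOrder xmlns="http://www.sos-berlin.com/scheduler"></addOrder>
  </soapenv:Body>
</soapenv:Envelope>
"""


def build_url(host, port, url_path):
    """Combine host, port and the configured url_path into the request URL."""
    return f"http://{host}:{port}{url_path}"


def build_request(host, port, url_path):
    """Prepare the HTTP POST that carries the SOAP envelope."""
    body = SOAP_ENVELOPE.format(to=f"http://{host}{url_path}").encode("iso-8859-1")
    return urllib.request.Request(
        build_url(host, port, url_path),
        data=body,
        # Assumption: SOAP 1.1 style content type.
        headers={"Content-Type": "text/xml; charset=ISO-8859-1"},
        method="POST",
    )


def send_request(host, port, url_path, timeout=10):
    """Send the request and return the raw SOAP response as bytes."""
    request = build_request(host, port, url_path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    print(build_url("www.sos-berlin.com", 4444, "/monitor_service"))
```

Only send_request needs a running Job Scheduler; build_url and build_request can be used to inspect the request beforehand.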

    The following example SOAP response could be returned as a result:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
      <soapenv:Body>
        <sos:order xmlns:sos="http://www.sos-berlin.com/scheduler">
          <sos:id>657</sos:id>
          <sos:state>start</sos:state>
          <sos:initialState>start</sos:initialState>
          <sos:jobChain>monitor_service</sos:jobChain>
          <sos:job>monitor_check_status</sos:job>
          <sos:task>2126</sos:task>
          <sos:inProcessSince>2006-05-11 11:32:50.953</sos:inProcessSince>
          <sos:created>2006-05-11 11:32:50.812</sos:created>
          <sos:priority>0</sos:priority>
          <sos:webService>monitor_service</sos:webService>
          <xml_payload>
            <messages errors="1" warnings="2">
              <message id="1" severity="WARN">
                2006-05-11 11:32:42.546 [WARN]   (Task monitor_sample_warning:2123) 
                sample warning for monitor launched
              </message>
              <message id="2" severity="WARN">
                2006-05-11 11:32:44.453 [WARN]   (Task monitor_sample_warning:2124) 
                sample warning for monitor launched
              </message>
              <message id="3" severity="ERROR">
                2006-05-11 11:32:48.703 [ERROR]  (Task monitor_sample_error:2125) 
                sample error for monitor launched
              </message>
            </messages>
          </xml_payload>
        </sos:order>
      </soapenv:Body>
    </soapenv:Envelope>
                          
    Besides the standard job information, the response includes a <messages> element whose attributes state the number of errors and warnings. The <message> elements that follow are retrieved from the Job Scheduler log file. The Nagios Plugin interprets the severity of these messages and, by default, displays a one-line summary in the Network Monitor console:

    Job Scheduler reports 1 errors, 2 warnings

    With plugin verbosity level 1 (-v 1), which allows multiline output, the result in the console could be:

    Job Scheduler reports 1 errors, 2 warnings
    2006-05-11 11:32:42.546 [WARN] (Task monitor_sample_warning:2123) sample warning for monitor launched
    2006-05-11 11:32:44.453 [WARN] (Task monitor_sample_warning:2124) sample warning for monitor launched
    2006-05-11 11:32:48.703 [ERROR] (Task monitor_sample_error:2125) sample error for monitor launched
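
How a response like the one above could be turned into this console output is sketched below in Python. It is an assumption-laden illustration rather than the logic of check_scheduler.pl: the function name summarize is hypothetical, and the mapping of errors to CRITICAL and warnings to WARNING follows the standard Nagios plugin exit codes but is not confirmed by the plugin source.

```python
import xml.etree.ElementTree as ET

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

# Shortened sample response; only the parts read by summarize() are kept.
SAMPLE = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <sos:order xmlns:sos="http://www.sos-berlin.com/scheduler">
      <xml_payload>
        <messages errors="1" warnings="2">
          <message id="1" severity="WARN">
            2006-05-11 11:32:42.546 [WARN]   (Task monitor_sample_warning:2123)
            sample warning for monitor launched
          </message>
          <message id="3" severity="ERROR">
            2006-05-11 11:32:48.703 [ERROR]  (Task monitor_sample_error:2125)
            sample error for monitor launched
          </message>
        </messages>
      </xml_payload>
    </sos:order>
  </soapenv:Body>
</soapenv:Envelope>"""


def summarize(response_xml, verbosity=0):
    """Return (exit_code, console_text) for a SOAP response."""
    root = ET.fromstring(response_xml)
    # <messages> carries no namespace prefix in the sample response.
    messages = root.find(".//messages")
    if messages is None:
        # Assumption: a response without <messages> means nothing to report.
        return OK, "Job Scheduler reports 0 errors, 0 warnings"
    errors = int(messages.get("errors", "0"))
    warnings = int(messages.get("warnings", "0"))
    lines = [f"Job Scheduler reports {errors} errors, {warnings} warnings"]
    if verbosity >= 1:
        for message in messages.findall("message"):
            # Collapse line breaks and repeated blanks into single spaces.
            lines.append(" ".join((message.text or "").split()))
    if errors > 0:
        status = CRITICAL
    elif warnings > 0:
        status = WARNING
    else:
        status = OK
    return status, "\n".join(lines)


if __name__ == "__main__":
    status, text = summarize(SAMPLE, verbosity=1)
    print(text)
    raise SystemExit(status)
```

The counts are read from the attributes of the <messages> element rather than recounted from the <message> elements, which is why the shortened sample still reports two warnings.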



  • Job: Check status and report to Network Monitor (monitor_check_status)

    This job is used to check the results of all jobs that have been executed by the Job Scheduler and to report their respective status messages.

    Installation of the Nagios client is not required as all status messages are sent as responses to SOAP requests. These requests are in turn initiated by the Nagios Plugin via HTTP.

    Please use the above configuration file to expose this job as a Web Service.


  • Job: Reset messages for Network Monitor (monitor_reset_status)

    This job uses the same implementation as the monitor_check_status above and resets messages on error states. Normally these messages are repeatedly sent in order to retain the messages in the Network Monitor console. This job could be started by an administrator after error states have been fixed.




 
Hobbit Network Monitor (Big Brother)

  Hobbit is an open source network monitor that is maintained by Henrik Storner at http://hobbitmon.sourceforge.net.

Download

Job implementations for this stack are not included in the Job Scheduler distribution for legal reasons; you can download them in the sos.stacks.jar archive. The source code of these jobs is given below.

Configuration

Hobbit-compliant jobs are defined in the scheduler_hobbit.xml configuration file, which is included in the scheduler.xml configuration file with the <base file="scheduler_hobbit.xml"/> element.

Add the above sos.stacks.jar to the class_path setting of your factory.ini configuration file in order to make these classes available at run time.

Implementation

  • Job: Check status and report to Hobbit (hobbit_check_status)

    This job is used to check the results of all jobs that have been executed by the Job Scheduler and to report their respective status messages to the Hobbit web console.

    No Hobbit client installation is required as all status messages are sent by HTTP posts to the Hobbit server.


  • Job: Send message to Hobbit (hobbit_send_message)

    This job uses the same implementation as the above hobbit_check_status but works for orders in a job chain, i.e. the job does not check log files but is activated by an order that contains the message in its payload.

    This is useful if you implement your own jobs with the Job Scheduler API and want errors to be reported more quickly with individual message content.


  • Job: Reset messages for Hobbit (hobbit_reset_status)

    This job uses the same implementation as the above hobbit_check_status and resets messages on error states. Normally these messages are repeatedly sent in order to retain the messages in the Network Monitor console. This job could be started by an administrator after error states have been fixed.




 
 
 