Using Events to Debug Problems
If you are unfamiliar with the new Event Handling features of ZXTM 5.1, it is recommended that you read this article before continuing.

Introduction
The reference documentation Events in ZXTM describes how event information is formatted when it is passed to an action, and lists the events that ZXTM can report. This article provides an example to accompany that documentation, showing how an action can process the information it is supplied with when triggered by an event.

Overview
In this example we will create a 'program' action. This type of action executes a specified program with the arguments you supply. We will configure the action to execute a program that logs some debugging information when a particular problem occurs. The program will capture traffic to a node that has failed, create a technical support report when the ZXTM software encounters a problem, gather system information when free file descriptors are running low, and collect process information from the nodes involved when SLM fails.
Configuring ZXTM

Creating an action
The first step in configuring ZXTM is to create an action. To do this, go to the System -> Alerting -> Manage Actions page. Enter the name 'Debug Problem' into the 'name' box at the bottom of the page and select Program from the list of action types. Click 'Add Action' to create the action.

A new page appears with some extra configuration options for the action. In the additional settings section, select 'Custom...' from the drop-down box of programs and enter '/bin/echo' into the box that appears alongside it. This will be updated to execute our own program later, but for now just click 'Update' at the bottom of the page. A new box should appear underneath where you typed in the program name, showing the exact command that will be executed when the action is triggered.
As you can see, the program /bin/echo is passed two parameters by default: the name of the event type that triggered the action, and information about the specific event reported within that event type, in the format specified in the reference documentation mentioned in the introduction. This will suffice for now; we will add more arguments later when we have finished writing the program.

Creating an event type
Next we can choose some events that we want to trigger the action. Go to System -> Alerting -> Manage Event Types and create a new event type called 'Problems to Debug'. You will be presented with a list of all the events that ZXTM can report in a tree structure. Select the events for the problems covered in this article: a node failing, a problem in the ZXTM software, the number of free file descriptors running low, and SLM failing.
Save the event type by clicking 'Update'.

Linking the event type to the action
The next step is to configure ZXTM to trigger the action when one of the events in our event type occurs. Go to the System -> Alerting page and select the 'Problems to Debug' event type from the drop-down box at the bottom of the page. The event type will appear in the list of mappings alongside a drop-down box containing a list of all the actions that have been configured. Select the 'Debug Problem' action from the list. It would also be useful to receive a notification that some debug output has been produced, so select 'E-Mail' from the list of actions as well. Click 'Update' to save the changes and then, if you haven't already done so, configure the E-Mail action to use your mail server and e-mail address.
Writing the Program
Currently the 'Debug Problem' action will not do anything useful when it is triggered, so we need to write a program for it to run.

Source code
Provided below is the source code of a basic Perl program that will do some simple debugging for the errors we selected in the 'Problems to Debug' event type.

Overview
The program examines the event information it receives and, for certain events, performs some debugging actions. The program determines which event it is handling by matching the primary tag. The primary tag for each event can be found in the list at the end of the Events in ZXTM documentation. For example, looking up the event 'Node has failed' shows that its primary tag is 'nodefail'. The list also shows the section of the configuration that the event reports information about. In this case, it is the 'nodes' section, so we can expect the event information to contain the name of the node that caused the event to be reported. This is presented in the form 'nodes/<node name>'.

When a node fails...
The Perl program looks for the 'nodefail' tag, then extracts the name of the node and its port from the message.
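The tag-matching step can be illustrated with a short sketch. It is written in Python purely for illustration and is not the article's Perl program. It assumes that the event information contains the primary tag as a whitespace-delimited token and the node as 'nodes/<host>:<port>'; the exact field layout is defined in the Events in ZXTM documentation, and the sample string in the usage note is hypothetical.

```python
import re

# Assumption: the failed node appears as 'nodes/<host>:<port>' in the event text.
NODE_RE = re.compile(r"\bnodes/(?P<host>[^\s:/]+):(?P<port>\d+)")


def parse_node_failure(event_info):
    """Return (host, port) for a 'nodefail' event, or None for anything else."""
    # Assumption: the primary tag is a whitespace-delimited token.
    if "nodefail" not in event_info.split():
        return None
    match = NODE_RE.search(event_info)
    if match is None:
        return None
    return match.group("host"), int(match.group("port"))
```

For a hypothetical event string such as "SERIOUS<tab>nodes/10.0.0.5:80<tab>nodefail<tab>Node 10.0.0.5:80 has failed", the function would return the host '10.0.0.5' and the port 80. Events with a different primary tag return None so the caller can move on to the next check.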
It then starts capturing traffic going between ZXTM and that node to see if there are any clues as to what is causing the failure. The node might, for example, be ignoring invalid requests from a particular client, thus causing the passive monitoring feature of ZXTM to mark it as failed.
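One way to picture the capture step is as a tcpdump invocation filtered on the failed node. The sketch below, in Python for illustration only, builds such a command line; the function name, the packet limit and the use of the Linux-only 'any' interface are assumptions, not details taken from the original program.

```python
def build_capture_command(host, port, outfile, packet_count=1000):
    """Build a tcpdump argument list that captures traffic to and from one node.

    Assumptions: tcpdump is installed, the platform is Linux (for '-i any'),
    and stopping after packet_count packets is acceptable.
    """
    return [
        "tcpdump",
        "-i", "any",               # listen on all interfaces (Linux only)
        "-s", "0",                 # capture whole packets, not just headers
        "-c", str(packet_count),   # exit after this many packets
        "-w", outfile,             # write raw packets to a file for later analysis
        "host", host, "and", "port", str(port),
    ]
```

The list can be handed to subprocess.run() by a program with sufficient privileges to capture packets. Limiting the packet count keeps the action from running indefinitely if the node stays down.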
The captured traffic is then sent to a different machine so it can be analysed.
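As a rough sketch of the copy step (Python, for illustration only), the function below builds an scp command for a capture file. The parameter names mirror the scpuser and scpdest arguments used later in this article, and the 'host:/path' form of the destination is an assumption.

```python
def build_scp_command(local_file, scp_user, scp_dest):
    """Build an scp argument list that copies local_file to scp_user@scp_dest.

    Assumption: scp_dest has the form 'host:/remote/directory'.
    """
    return [
        "scp",
        "-q",                    # suppress the progress meter
        "-o", "BatchMode=yes",   # fail instead of prompting for a password
        local_file,
        f"{scp_user}@{scp_dest}",
    ]
```

BatchMode is worth setting when scp is run by an unattended program: if key-based access is not configured, the command exits with an error rather than waiting for input that will never arrive.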
The program uses scp to send the information, which usually requires a password to be entered to access the remote machine. Because scp is being invoked by the program, there is no opportunity to enter a password. To get around this problem, you can configure scp to contact a particular remote machine without requiring a password¹. Alternatively, if no location is passed to the program, it will just write the files to a specific location on the ZXTM machine so you can access them manually.

When the ZXTM software encounters a problem...
If there is a problem with the ZXTM software, the program will create a technical support report that you can send to the Zeus support team should you need further assistance with the problem. Information about the specific problem that occurred in the software will be sent in the notification e-mail that we configured earlier.
When the number of free file descriptors is running low...
If ZXTM detects that it is running low on free file descriptors, the program will obtain information about current memory usage, disk usage, active connections and file descriptor settings.
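The kind of information gathered here can be sketched as follows. This is an illustrative Python version, not the original program: it assumes a Unix system for the resource module and a Linux /proc filesystem for memory and socket figures, and it returns empty values where those files are missing. The function and key names are hypothetical.

```python
import resource
import shutil


def read_meminfo(path="/proc/meminfo"):
    """Return MemTotal and MemFree in kB, or an empty dict if unreadable."""
    values = {}
    try:
        with open(path) as handle:
            for line in handle:
                name, _, rest = line.partition(":")
                if name in ("MemTotal", "MemFree"):
                    values[name] = int(rest.split()[0])
    except OSError:
        pass  # Not Linux, or /proc is unavailable.
    return values


def count_tcp_sockets(path="/proc/net/tcp"):
    """Count IPv4 TCP sockets in any state, or return None if unreadable."""
    try:
        with open(path) as handle:
            return max(sum(1 for _ in handle) - 1, 0)  # Skip the header row.
    except OSError:
        return None


def collect_fd_diagnostics(disk_path="/"):
    """Gather file descriptor limits, disk usage, memory and socket counts."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    disk = shutil.disk_usage(disk_path)
    return {
        "fd_soft_limit": soft,   # the value 'ulimit -n' reports for this process
        "fd_hard_limit": hard,
        "disk_total_bytes": disk.total,
        "disk_free_bytes": disk.free,
        "meminfo_kb": read_meminfo(),
        "tcp_sockets": count_tcp_sockets(),
    }
```

Note that the limits reported are those of the process running the sketch, which may differ from the limits applied to the ZXTM processes themselves.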
By examining this information, you should be able to determine why the system is running low on file descriptors. Often it is because the maximum number of file descriptors (as reported by ulimit) is too low, though it could also be caused by the system running out of memory or disk space, or there simply being an abnormally high number of active connections.

When SLM fails...
Finally, if SLM fails, the program is triggered with the 'slmnodeinfo' event that identifies which nodes contributed to the SLM failure. In this case, the program will log on to the nodes in question and obtain information about the running processes to see what is going wrong. To do this it uses rsh, which means that you need to have the appropriate permissions configured in the '.rhosts' files on each node to allow the machine running ZXTM to access them without a password².
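A minimal sketch of this step, in Python for illustration only, is shown below. It assumes that each contributing node appears in the event information as 'nodes/<host>:<port>' (the real layout of the 'slmnodeinfo' event is given in the Events in ZXTM documentation) and that 'ps aux' is a suitable process listing command on the nodes.

```python
import re

# Assumption: nodes are named as 'nodes/<host>:<port>' in the event text.
NODE_RE = re.compile(r"\bnodes/([^\s:/]+):\d+")


def hosts_from_event(event_info):
    """Return the distinct node hosts named in the event, in order of appearance."""
    hosts = []
    for host in NODE_RE.findall(event_info):
        if host not in hosts:
            hosts.append(host)
    return hosts


def build_rsh_command(host, rsh_user, remote_command=("ps", "aux")):
    """Build an rsh argument list that runs remote_command on host as rsh_user."""
    return ["rsh", "-l", rsh_user, host, *remote_command]
```

Each command built this way relies on the '.rhosts' configuration described above; without it, rsh will refuse the connection or prompt for input.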
Testing the program
The program also looks out for a 'testaction' event, which is reported when you use the 'Update and Test' button on the action page. We will use this later to make sure the program is working correctly and copies the debug output to the correct location.

Adding the Program to ZXTM
We can now configure the 'Debug Problem' action to use the correct program. First of all, you need to upload the program to ZXTM. Save the source code from above as debug-events.pl and go to Catalogs -> Extra Files -> Action Programs. Find the Perl program you just saved and upload it.
Use the link at the bottom of this page, or go to System -> Alerting -> Manage Actions, and edit the 'Debug Problem' action. Change the program from 'Custom...' to the 'debug-events.pl' program you just uploaded.

You might have noticed that the program takes several arguments beyond just the event information. These arguments include the location to which files should be sent and the scp and rsh usernames to use when connecting to remote machines. You can use the 'Argument Descriptions' section of the page to configure the action to supply these arguments. After expanding the Argument Descriptions section, enter 'rshuser' into the name box and 'Username used to log on to failing nodes' in the description box. Click 'Update' and then add the remaining arguments - scpuser and scpdest - in the same way.

The arguments will appear in the 'Additional Settings' section, where you can configure them with the appropriate values for your system. Click 'Update' to save the configuration and scroll down to the Additional Settings section again. The command that will be executed when the action is triggered is shown at the bottom of this section.
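How a program might read these arguments can be sketched as follows, in Python for illustration only. The sketch assumes that the configured arguments arrive as '--name=value' options and that the event type and event information arrive as plain positional arguments; check the command displayed in the Additional Settings section for the form ZXTM actually uses.

```python
import argparse


def parse_action_arguments(argv):
    """Parse the arguments passed to the action program.

    Assumptions: rshuser, scpuser and scpdest arrive as '--name=value'
    options, and the event type and event information are positional.
    """
    parser = argparse.ArgumentParser(description="Collect debug data for an event")
    parser.add_argument("--rshuser", default=None,
                        help="Username used to log on to failing nodes")
    parser.add_argument("--scpuser", default=None,
                        help="Username used when copying files with scp")
    parser.add_argument("--scpdest", default=None,
                        help="Location to which debug files are sent")
    parser.add_argument("event", nargs="*",
                        help="Event type followed by the event information")
    return parser.parse_args(argv)
```

Leaving every option optional means a missing scpdest can be treated as "keep the files on the ZXTM machine", which matches the fallback behaviour described earlier.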
It would also be helpful to enable 'Verbose' mode on the action at this point so that any problems that occur are reported in the Event Log.

If you want to test the program out, click 'Update and Test' from the Debug Problem action's page and you should find a file called 'test-event.txt' in the location you put in the 'scpdest' parameter. If not, double check that you can use scp to copy files from the ZXTM machine to that location without requiring any user interaction. If you did get the file, then whenever any of the events in the 'Problems to Debug' event type occurs you will receive some additional debugging information!

¹ This article offers some information about how to configure SSH and SCP to operate without requiring a password.
² The rsh man page has more information about configuring rsh to operate without requiring a password.

Comments:
Comment from:
kpfoote [Visitor]
This is great. Any chance you can use the zxtm's ability to send email to send the debug report to someone or some group? This would be instead of the scp.
Comment from:
andy knox [Zeus Dev Team]
The best way to achieve this is probably to use a Perl SMTP library, such as Net::SMTP, and add some extra logic to the end of the script to e-mail you the debug data. The Net::SMTP library is usually included with a standard Perl installation and is also available on our appliances.
You can find documentation for it here: http://perldoc.perl.org/Net/SMTP.html It's very simple to use and will hopefully solve your problem!

If you really want to use the E-Mail Action inside the traffic manager though, you could create a new HTTP Virtual Server that listens on a local IP. If you add a rule to the Virtual Server with the line:

event.emit( "debug-data", http.getBody() );

then any body data contained in an HTTP request to that Virtual Server will be raised in the debug-data event. If you map this event to the E-Mail Action then you will be e-mailed the data. You can then add some extra logic to the Perl script to use the HTTP client provided with the traffic manager (found in $ZEUSHOME/admin/bin/httpclient) to send the debug output to the new Virtual Server, and it will then be e-mailed to you! Hope this helps!
A simple example illustrating how to process events in ZXTM 5.1. This article shows you how to configure ZXTM to invoke a program that will collect debugging information and send it to you when a problem occurs.








