BusyBee is a free tool you can use to monitor your IDS servers. This article explains the
architecture and tells you how to use it to solve many of your monitoring problems.
Introduction
One of the most important activities in the daily work of a DBA is monitoring the server to prevent
problems. IBM' Informix Dynamic Server' (IDS). is one of the most scalable database servers
available today and one which requires little administrative overhead compared with other
database servers playing in this high-end league. However, as the number of database server
installations increases and the daily business of your company depends on the smooth operation
of these servers, monitoring and observing their critical functions is absolutely essential.
This article discusses the monitoring of critical areas of IDS. Besides the tools delivered with the
database server (ISA and AGS ServerStudio), there are several vendor tools that you can use for
this purpose. However many tools do not have database monitoring as their primary focus, and for
those that are database monitoring tools, the cost may be an issue. In this article we'll talk about a
tool that's available for free and can help you solve many of your monitoring problems.
What is BusyBee?
BusyBee is a database monitoring tool you can download for free, written completely in the
versatile and powerful Python programming language. It's sole purpose is the observation and
monitoring of Informix Dynamic Server.
BusyBee architecture
BusyBee is based on a multiprocess architecture where each subprocess (domain inspector) is
responsible for observing a dedicated set of functionality which I will refer to as a "domain" of your
database server. These domain inspectors act as daemon processes in the background and are
configured thru an XML-based configuration file. Different intervals and individual thresholds can
be established, allowing a flexible and easy customization of BusyBee serving your dedicated
environment.
The BbWatch master daemon parses the XML configuration file and forks a child process (domain
inspector) for each enabled domain. Each domain inspector observes a dedicated area of the
database server, examining any violations of the configured thresholds and generating appropriate
alerts. Depending on the individual domain type, different tools or interfaces will be called to gather
the required information:
The interval of each domain inspector can be individually configured; thus the more critical
domains (like 'DbmStatusCheck') can be observed more often than the less critical ones (like
'SessionLockCheck').
Each IBM Informix Dynamic Server instance must have an individual XML configuration file named
as follows:
<InstallDir>/BusyBee/etc/BbWatchIfxCfg.<instance>.xml
If you wish to share the XML configuration file between different instances, you can create the
appropriate symbolic links.
Next we'll talk about the individual XML tags and their respective meanings.
<Domain Name='xxx'>
This is the name of an individual domain. Domains are the single areas of the database manager
that will be observed by an individual BbWatch daemon (domain inspector).
<Enabled>
This XML tag controls whether or not a domain is activated (enabled). If you want to deactivate
a certain domain, you can change this setting to 0 (disabled). If a domain is disabled, no domain
inspector will be started, thus this database manager resource will not be observed.
<Interval>
This XML tag controls the interval for this domain. The specific area of the database manager will
be checked at the given interval.
<Severity>
This XML tag is only available for domains which are not based on any thresholds. For example
the domain 'DbmStatusCheck' is not based on thresholds because the database server is either
down or up. You can configure the appropriate severity for this event here. Possible values for the
severity in the order of relevance are:
• Info
• Warning
• Minor
• Critical
• Fatal
The configured severity will also be delivered to the external script (see 'ExecuteScript' below).
<ExecuteScript>
This XML tag may contain the full path of an external script which will be automatically triggered by
BbWatch if any of the configured thresholds has been exceeded. The external script will be called
by BbWatch with the following arguments:
Instance The first argument is the name of database manger instance where the event occurred.
Domain The second argument is the name of the domain inspector which triggered this event.
Severity The third argument is the severity code (i.e. WARNING, MINOR, CRITICAL, FATAL).
Message The fourth argument is the full message text.
Examples:
iif930 DbspaceFullCheck FATAL <DWHIFX006 Dbspace [rootdbs] is [83] percent full>
iif930 VirtualSegmentCheck MINOR <DWHIFX008 Number of allocated Virtual SHM Segments is [4]>
This mechanism allows you to write your own scripts in your preferred programming language so
you can receive those events and divert them to other interfaces like SMS, pager, email or existing
system management platforms.
<Warning><Minor><Critical><Fatal>
These XML tags allow the configuration of individual thresholds. BbWatch will trigger an alarm with
the given severity as soon as any one of these thresholds is exceeded.
<Group>
Certain domains contain an additional XML tag called 'Group' which allows a more granular
configuration for this domain.
For example the domain 'DbspaceFullCheck' allows the specification of dbspace groups to enable
the configuration of different threshold values for individual groups of dbspaces. Another example
using the 'Group' tag is the domain 'OnlineLogCheck'. Individual search patterns can be configured
in different groups, each one with its own severity.
Domain inspectors
The domain inspectors are the workhorses of BusyBee. They observe an individual domain of
the database server at the arranged interval and generate appropriate alarms if a configured
threshold is exceeded. Figure 3 shows one of the most important domain inspectors called
'DbspaceFullCheck' which is responsible for checking how full individual dbspaces are.
As soon as any one of the configured thresholds is reached, BbWatch will write an appropriate
alarm to the logfile ( <InstallDir>/BusyBee/log/BbWatch.<instance>.log ):
• 60% (Warning)
• 70% (Minor)
• 80% (Critical)
• 90% (Fatal)
In addition to the generated event, BbWatch will also trigger the defined external script. You might
for example setup a script which automatically adds a new diskspace to the affected dbspace in
order to prevent any outages.
Here is a listing of all twenty domain inspectors currently available for the IBM Informix Dynamic
Server. Beside the purpose of the individual domain inspector you will also see the reason why
observing this resource is essential.
DbmStatusCheck
Reason: If your database server is down, you could not access your data. This
might have a negative impact on your business.
LongTxCheck
ConnectCheck
DbspaceFullCheck
Reason: Managing disk space is one of the primary tasks of a DBA. If one
of your dbspaces runs out of space, some applications could stop
working. You can configure discrete thresholds for individual dbspaces,
thus more critical dbspaces (like the 'rootdbs') could be set up with
lower threshold values to allow an earlier detection of such situations.
Static dbspaces (like 'physdbs' or 'logdbs') could be excluded from any
observation.
CheckpointCheck
OnlineLogCheck
LogFullCheck
Reason: The logical logs are one of the most important resources of the
database server. Your database server will be blocked if they become
full because they could not be backed up. You might also lose
transactions in the case of a failure if your logical logs have not been
backed up. Thus observing the behaviour of the logical logs is also one
of the primary activities of a DBA.
Yes, we have the dynamic log allocation feature since IDS 9.30. But this
is not always a good thing. Poorly written applications performing large
transactions without any commit could trigger the allocation of several
additional logs scattered throughout your dbspaces if you have set
DYNAMIC_LOGS to 2 (which is also the default). All these dbspaces
will then be considered 'critical' (because they contain a logical log), so
if one of them is marked down for some reason, your database server
will stop.
MaxExtentCheck
Reason: The number of extents per table is limited. The upper limit depends
on the page size of your system (2K or 4K) and the number of special
data types in your table (like datetime or interval). For a 2K system the
limit will be approximately 200-220 extents. If this limit is reached, no
more inserts into this table will be possible. You have reorganize the
relevant table which might be time consuming if the table is large (which
is obvious because it ran out of extents).
MaxPageCheck
SessionLockCheck
Reason: Poorly written applications can consume a huge number of locks. This
domain inspector checks the number of locks held by each database
session and triggers appropriate alarms if one of the configured
thresholds is exceeded.
Yes, we have had the dynamic lock table since IBM Informix Dynamic
Server 9.21. However the disadvantage of this feature is that you might
no longer detect those poorly-written applications consuming many
locks and slowing down the performance of your database server. This
domain inspector helps you identify those applications.
VirtualSegmentCheck
DbspaceDownCheck
Reason: You cannot access the data in 'down' dbspaces. You may need to
restore the data from an archive.
ChunkDownCheck
Reason: You cannot access the data in 'down' chunks. You may need to restore
the data from an archive.
SessionMemoryCheck
ForegroundWriteCheck
Reason: Foreground writes should be avoided. They slow down the performance
of your database server. Depending on the nature of your application
(batch or more interactive) you should prefer chunk writes or lru writes,
so that foreground writes almost never happen.
DeadLockCheck
ReadCacheCheck
Purpose: Checks the current read cache rate of your database server.
Reason: The read cache rate is an indicator of the overall performance of your
database server. The goal for OLTP systems is too keep the read cache
rate over 95 percent. This domain inspector allows the specification of a
'WarmUpTime' that has to be elapsed before the read cache check will
be measured. This 'WarmUpTime' refers to the time since the database
server has been started or since the last 'onstat -z' (zero profile) has
been executed. This ensures that the computed read cache rate is
representative.
WriteCacheCheck
Purpose: Checks the current write cache rate of your database server.
Reason: Almost the same as the above described 'ReadCacheRate'. The goal
for OLTP systems is to keep the write cache rate over 85 percent.
However depending on the nature of the application (for example write-
intensive batch applications) this might not always be possible.
SequentialCacheCheck
ReadyQueueCheck
Reason: Threads will be put on the ready queue because all available CPU
virtual processors (VPs) are currently busy. They wait for a free virtual
processor, hence the database session is blocked. A constantly large
number of threads on the ready queue ('onstat -g rea'), leads to the
conclusion that you have to add additional CPU VPs.
However if you have already configured the same number of CPU VPs
as physical processors available in the machine, you might have to
add more physical processors or investigate other performance tuning
activities. This domain inspector allows you to specify the number of
samples that should be taken before computing the average number of
threads on the ready queue. This ensures that the computed value is
more representative.
Through this mechanism it should be fairly easy to connect BusyBee with existing system
management tools like BigBrother, CA Unicenter, IBM Tivoli® or HP/OpenView. If you don't
have such a system management framework in use, you could also setup a simple script which
sends you an e-mail or an SMS alert if a threshold has been exceeded. There are two sample
shell scripts delivered with BusyBee ( BbWatchMail.ksh and BbWatchSyslog.ksh ) which will
show you how to deliver events to external interfaces like email or thru the UNIX® syslog
mechanism.
These external scripts/programs need not to be written as a shell script. They could also be
programmed in C, Java, Perl or Python or any other language which is capable of receiving
command line arguments and delivering return codes.
Future plans
Currently I'm working on a port of BusyBee for the IBM DB2 UDB database server (on UNIX/
LINUX platforms). The next step will probably be a port for the open source database system
MySQL .
http://www.herber-consulting.de/BusyBee
There is also some other Informix stuff on my website which might be interesting for you. If you
have any questions about BusyBee feel free to contact me via e-mail.