Anda di halaman 1dari 11

Database Monitoring with BusyBee

Eric Herber 25 September 2003

BusyBee is a free tool you can use to monitor your IDS servers. This article explains the
architecture and tells you how to use it to solve many of your monitoring problems.

Introduction
One of the most important activities in the daily work of a DBA is monitoring the server to prevent
problems. IBM' Informix Dynamic Server' (IDS). is one of the most scalable database servers
available today and one which requires little administrative overhead compared with other
database servers playing in this high-end league. However, as the number of database server
installations increases and the daily business of your company depends on the smooth operation
of these servers, monitoring and observing their critical functions is absolutely essential.

This article discusses the monitoring of critical areas of IDS. Besides the tools delivered with the
database server (ISA and AGS ServerStudio), there are several vendor tools that you can use for
this purpose. However many tools do not have database monitoring as their primary focus, and for
those that are database monitoring tools, the cost may be an issue. In this article we'll talk about a
tool that's available for free and can help you solve many of your monitoring problems.

What is BusyBee?
BusyBee is a database monitoring tool you can download for free, written completely in the
versatile and powerful Python programming language. It's sole purpose is the observation and
monitoring of Informix Dynamic Server.

Let's start by looking at BusyBee's architecture, shown in Figure 1.

© Copyright IBM Corporation 2003 Trademarks


Database Monitoring with BusyBee Page 1 of 11
developerWorks® ibm.com/developerWorks/

Figure 1. BusyBee architecture

BusyBee architecture
BusyBee is based on a multiprocess architecture where each subprocess (domain inspector) is
responsible for observing a dedicated set of functionality which I will refer to as a "domain" of your
database server. These domain inspectors act as daemon processes in the background and are
configured thru an XML-based configuration file. Different intervals and individual thresholds can
be established, allowing a flexible and easy customization of BusyBee serving your dedicated
environment.

The BbWatch master daemon parses the XML configuration file and forks a child process (domain
inspector) for each enabled domain. Each domain inspector observes a dedicated area of the
database server, examining any violations of the configured thresholds and generating appropriate
alerts. Depending on the individual domain type, different tools or interfaces will be called to gather
the required information:

• Calling the ONSTSAT utility


• Performing sysmaster queries
• Parsing the online.log

The interval of each domain inspector can be individually configured; thus the more critical
domains (like 'DbmStatusCheck') can be observed more often than the less critical ones (like
'SessionLockCheck').

Database Monitoring with BusyBee Page 2 of 11


ibm.com/developerWorks/ developerWorks®

XML configuration file


The XML configuration file allows the customization of the individual domain inspectors to meet the
requirements of your dedicated environment. Figure 2 shows an excerpt of the XML configuration
file describing the two domain inspectors 'DbspaceFullCheck' and 'CheckPointCheck'.

Figure 2. XML configuration file

Each IBM Informix Dynamic Server instance must have an individual XML configuration file named
as follows:
<InstallDir>/BusyBee/etc/BbWatchIfxCfg.<instance>.xml

If you wish to share the XML configuration file between different instances, you can create the
appropriate symbolic links.

Next we'll talk about the individual XML tags and their respective meanings.

<Domain Name='xxx'>

This is the name of an individual domain. Domains are the single areas of the database manager
that will be observed by an individual BbWatch daemon (domain inspector).

<Enabled>

This XML tag controls whether or not a domain is activated (enabled). If you want to deactivate
a certain domain, you can change this setting to 0 (disabled). If a domain is disabled, no domain
inspector will be started, thus this database manager resource will not be observed.

Database Monitoring with BusyBee Page 3 of 11


developerWorks® ibm.com/developerWorks/

<Interval>

This XML tag controls the interval for this domain. The specific area of the database manager will
be checked at the given interval.

<Severity>

This XML tag is only available for domains which are not based on any thresholds. For example
the domain 'DbmStatusCheck' is not based on thresholds because the database server is either
down or up. You can configure the appropriate severity for this event here. Possible values for the
severity in the order of relevance are:

• Info
• Warning
• Minor
• Critical
• Fatal

The configured severity will also be delivered to the external script (see 'ExecuteScript' below).

<ExecuteScript>

This XML tag may contain the full path of an external script which will be automatically triggered by
BbWatch if any of the configured thresholds has been exceeded. The external script will be called
by BbWatch with the following arguments:

Instance The first argument is the name of database manger instance where the event occurred.
Domain The second argument is the name of the domain inspector which triggered this event.
Severity The third argument is the severity code (i.e. WARNING, MINOR, CRITICAL, FATAL).
Message The fourth argument is the full message text.

Examples:
iif930 DbspaceFullCheck FATAL &lt;DWHIFX006 Dbspace [rootdbs] is [83] percent full>
iif930 VirtualSegmentCheck MINOR &lt;DWHIFX008 Number of allocated Virtual SHM Segments is [4]>

This mechanism allows you to write your own scripts in your preferred programming language so
you can receive those events and divert them to other interfaces like SMS, pager, email or existing
system management platforms.

<Warning><Minor><Critical><Fatal>

These XML tags allow the configuration of individual thresholds. BbWatch will trigger an alarm with
the given severity as soon as any one of these thresholds is exceeded.

<Group>

Database Monitoring with BusyBee Page 4 of 11


ibm.com/developerWorks/ developerWorks®

Certain domains contain an additional XML tag called 'Group' which allows a more granular
configuration for this domain.

For example the domain 'DbspaceFullCheck' allows the specification of dbspace groups to enable
the configuration of different threshold values for individual groups of dbspaces. Another example
using the 'Group' tag is the domain 'OnlineLogCheck'. Individual search patterns can be configured
in different groups, each one with its own severity.

Domain inspectors
The domain inspectors are the workhorses of BusyBee. They observe an individual domain of
the database server at the arranged interval and generate appropriate alarms if a configured
threshold is exceeded. Figure 3 shows one of the most important domain inspectors called
'DbspaceFullCheck' which is responsible for checking how full individual dbspaces are.

Figure 3. Domain inspector

As soon as any one of the configured thresholds is reached, BbWatch will write an appropriate
alarm to the logfile ( <InstallDir>/BusyBee/log/BbWatch.<instance>.log ):

• 60% (Warning)
• 70% (Minor)
• 80% (Critical)
• 90% (Fatal)
In addition to the generated event, BbWatch will also trigger the defined external script. You might
for example setup a script which automatically adds a new diskspace to the affected dbspace in
order to prevent any outages.

Database Monitoring with BusyBee Page 5 of 11


developerWorks® ibm.com/developerWorks/

Here is a listing of all twenty domain inspectors currently available for the IBM Informix Dynamic
Server. Beside the purpose of the individual domain inspector you will also see the reason why
observing this resource is essential.

DbmStatusCheck

Purpose: Check the availibility of the database


server using the 'onstat' utility.

Reason: If your database server is down, you could not access your data. This
might have a negative impact on your business.

LongTxCheck

Purpose: Check to see if there are any ongoing long transactions.

Reason: Long transactions should be avoided. They consume major resources


of your database server. If a long transaction exclusive high water mark
(LTXEHWM) is reached, all other write activities will be blocked.

ConnectCheck

Purpose: Make sure that a connect thru 'dbaccess' is possible


to any server names defined in your $ONCONFIG
file (i.e. DBSERVERNAME, DBSERVERALIASES ).

Reason: The previous domain inspector 'DbmStatusCheck' checks the


availability of your database server. However the fact that your
database server is 'online' does not always ensure that you could
connect to your defined servernames. So this additional test is helpful in
some special situations.

DbspaceFullCheck

Purpose: Check the filling degree of your dbspaces according


the thresholds defined in the XML config file.

Reason: Managing disk space is one of the primary tasks of a DBA. If one
of your dbspaces runs out of space, some applications could stop
working. You can configure discrete thresholds for individual dbspaces,
thus more critical dbspaces (like the 'rootdbs') could be set up with
lower threshold values to allow an earlier detection of such situations.
Static dbspaces (like 'physdbs' or 'logdbs') could be excluded from any
observation.

CheckpointCheck

Purpose: Parse the online.log for any violations of the


configured maximum checkpoint time allowed.

Reason: Long checkpoints should be avoided. During a checkpoint all write


activities will be blocked. So the goal for every DBA is to keep the
checkpoint time as small as possible. This domain inspector also allows
the configuration of distinct timeframes, so you could configure a bigger
threshold value for your batch processing time (for example from hour
20-7) and a smaller threshold value for your interactive user time (for
example from hour 8-19).

Database Monitoring with BusyBee Page 6 of 11


ibm.com/developerWorks/ developerWorks®

OnlineLogCheck

Purpose: Parse the online.log and compare each row to the


configured search patterns from the XML configuration file
and perform an alert for any matches that have been found.

Reason: The online.log is an important information resource for the DBA.


It contains useful information about any problems that have occurred.
With this domain inspector you are able to configure individual search
patterns which will be automatically checked for you by the BbWatch
daemon.

LogFullCheck

Purpose: Checks to see how full the logical logs are


and generates appropriate alarms if one
of the configured thresholds is exceeded.

Reason: The logical logs are one of the most important resources of the
database server. Your database server will be blocked if they become
full because they could not be backed up. You might also lose
transactions in the case of a failure if your logical logs have not been
backed up. Thus observing the behaviour of the logical logs is also one
of the primary activities of a DBA.
Yes, we have the dynamic log allocation feature since IDS 9.30. But this
is not always a good thing. Poorly written applications performing large
transactions without any commit could trigger the allocation of several
additional logs scattered throughout your dbspaces if you have set
DYNAMIC_LOGS to 2 (which is also the default). All these dbspaces
will then be considered 'critical' (because they contain a logical log), so
if one of them is marked down for some reason, your database server
will stop.

MaxExtentCheck

Purpose: Checks the number of extents for a table or a table fragment.

Reason: The number of extents per table is limited. The upper limit depends
on the page size of your system (2K or 4K) and the number of special
data types in your table (like datetime or interval). For a 2K system the
limit will be approximately 200-220 extents. If this limit is reached, no
more inserts into this table will be possible. You have reorganize the
relevant table which might be time consuming if the table is large (which
is obvious because it ran out of extents).

MaxPageCheck

Purpose: Counts the number of pages allocated


for a table or a table fragment.

Reason: There is an upper limit of 16,000,000 pages per table/fragment/index


in Informix Dynamic Server. You cannot insert any additional rows
when the limit is reached. You must reorganize the table (for example
splitting a single table into several fragments) before work can continue.
16,000,000 pages is not that much. On a system using a 2K page size it
is around 32 GB. I've seen some customers here in Germany which ran
into that limit on large tables which have not been fragmented.

SessionLockCheck

Database Monitoring with BusyBee Page 7 of 11


developerWorks® ibm.com/developerWorks/

Purpose: Inspects the number of locks held


by individual database sessions.

Reason: Poorly written applications can consume a huge number of locks. This
domain inspector checks the number of locks held by each database
session and triggers appropriate alarms if one of the configured
thresholds is exceeded.
Yes, we have had the dynamic lock table since IBM Informix Dynamic
Server 9.21. However the disadvantage of this feature is that you might
no longer detect those poorly-written applications consuming many
locks and slowing down the performance of your database server. This
domain inspector helps you identify those applications.

VirtualSegmentCheck

Purpose: Counts the number of allocated virtual shared memory segments.

Reason: Allocating additional shared memory segments is resource intensive


and on some platforms (like HP/UX) degrades the overall performance
of your database server. There are also some limits on the amount
of shared memory that can be allocated, for example on 32-Bit AIX
systems.

DbspaceDownCheck

Purpose: Checks if any dbspaces have been marked down.

Reason: You cannot access the data in 'down' dbspaces. You may need to
restore the data from an archive.

ChunkDownCheck

Purpose: Checks if any chunks are marked down.

Reason: You cannot access the data in 'down' chunks. You may need to restore
the data from an archive.

SessionMemoryCheck

Purpose: Inspects the allocated memory of each


database session currently connected.

Reason: Depending on the number of prepared statements, recursive stored


procedures and some other factors, the allocated memory per session
might be high, especially if these sessions fail to free their prepared
statements. With a large number of users, you might suddenly hit the
known memory limitations on 32-bit systems. This domain inspector lets
you identify those memory-intensive sessions.

ForegroundWriteCheck

Purpose: Checks the number of foreground writes that have occurred.

Reason: Foreground writes should be avoided. They slow down the performance
of your database server. Depending on the nature of your application
(batch or more interactive) you should prefer chunk writes or lru writes,
so that foreground writes almost never happen.

Database Monitoring with BusyBee Page 8 of 11


ibm.com/developerWorks/ developerWorks®

DeadLockCheck

Purpose: Counts the number of deadlocks that have occurred.

Reason: Deadlocks should also be avoided. They normally result from an


inappropriate application design. However I've seen also a lot of
deadlocks because people forgot to create their tables in 'row' lock
mode (the default is 'page').

ReadCacheCheck

Purpose: Checks the current read cache rate of your database server.

Reason: The read cache rate is an indicator of the overall performance of your
database server. The goal for OLTP systems is too keep the read cache
rate over 95 percent. This domain inspector allows the specification of a
'WarmUpTime' that has to be elapsed before the read cache check will
be measured. This 'WarmUpTime' refers to the time since the database
server has been started or since the last 'onstat -z' (zero profile) has
been executed. This ensures that the computed read cache rate is
representative.

WriteCacheCheck

Purpose: Checks the current write cache rate of your database server.

Reason: Almost the same as the above described 'ReadCacheRate'. The goal
for OLTP systems is to keep the write cache rate over 85 percent.
However depending on the nature of the application (for example write-
intensive batch applications) this might not always be possible.

SequentialCacheCheck

Purpose: Checks the number of sequential scans


that occurred on individual tables.

Reason: Sequential scans on large tables should normally be avoided in OLTP


systems. In data warehouse environments they are quite common.
This domain inspector allows you to specify the minimum size that a
table must have in order to be considered. This helps you exclude small
tables in which sequential scans are not a problem. Also the explicit
exclusion of tables through their names can be done, because you
might also have large tables which are processed sequentially, for
example by batch jobs.

ReadyQueueCheck

Database Monitoring with BusyBee Page 9 of 11


developerWorks® ibm.com/developerWorks/

Purpose: Counts the number of threads


currently waiting on the ready queue.

Reason: Threads will be put on the ready queue because all available CPU
virtual processors (VPs) are currently busy. They wait for a free virtual
processor, hence the database session is blocked. A constantly large
number of threads on the ready queue ('onstat -g rea'), leads to the
conclusion that you have to add additional CPU VPs.
However if you have already configured the same number of CPU VPs
as physical processors available in the machine, you might have to
add more physical processors or investigate other performance tuning
activities. This domain inspector allows you to specify the number of
samples that should be taken before computing the average number of
threads on the ready queue. This ensures that the computed value is
more representative.

Integration with existing system management tools


BusyBee offers the ability to call an external program which can be configured thru the XML tag
'ExecuteScript' if anyone of the thresholds has been exceeded. This is comparable to the Informix
'alarmprogram' or the DB2&reg; UDB 'userexit' mechanism.

• Domain name (like 'DbspaceFullCheck' or 'CheckpointCheck')


• Severity (Warning, Minor, Critical, Fatal)
• Instance name (the name of the Informix instance)
• Message text (The full text of the BusyBee message)

Through this mechanism it should be fairly easy to connect BusyBee with existing system
management tools like BigBrother, CA Unicenter, IBM Tivoli&reg; or HP/OpenView. If you don't
have such a system management framework in use, you could also setup a simple script which
sends you an e-mail or an SMS alert if a threshold has been exceeded. There are two sample
shell scripts delivered with BusyBee ( BbWatchMail.ksh and BbWatchSyslog.ksh ) which will
show you how to deliver events to external interfaces like email or thru the UNIX&reg; syslog
mechanism.

These external scripts/programs need not to be written as a shell script. They could also be
programmed in C, Java, Perl or Python or any other language which is capable of receiving
command line arguments and delivering return codes.

Database platform support


BusyBee is available on all UNIX based platforms where Python can be ported. For popular
platforms like AIX, HP/UX, Linux and Solaris there are binary distributions of Python. BusyBee
supports the following database servers on these platforms:

• IBM Informix IDS 7.x


• IBM Informix IDS 9.x
• IBM DB2 Universal Database™ (UDB) 7.x (available in Q3/2003)
• IBM DB2 UDB 8.x (available in Q3/2003)

Database Monitoring with BusyBee Page 10 of 11


ibm.com/developerWorks/ developerWorks®

Future plans
Currently I'm working on a port of BusyBee for the IBM DB2 UDB database server (on UNIX/
LINUX platforms). The next step will probably be a port for the open source database system
MySQL .

How to get BusyBee


The following URL provides you with additional information about BusyBee and the capability to
download it:

http://www.herber-consulting.de/BusyBee

There is also some other Informix stuff on my website which might be interesting for you. If you
have any questions about BusyBee feel free to contact me via e-mail.

© Copyright IBM Corporation 2003


(www.ibm.com/legal/copytrade.shtml)
Trademarks
(www.ibm.com/developerworks/ibm/trademarks/)

Database Monitoring with BusyBee Page 11 of 11

Anda mungkin juga menyukai