Postfix monitoring with Cacti, my way – part 1

In the next few posts I’ll show how I collect data from Postfix, publish it through net-snmp, and graph it with Cacti.

I am interested in Postfix queue sizes, possibly on multi-instance configurations, and in Postfix events.
I decided to use SNMP instead of having Cacti launch scripts directly, for a number of reasons, some of which are:

  • SNMP is standard, so I am able to access the collected data from a series of monitoring and diagnostic tools, not only Cacti;
  • I monitor different servers, so if I chose to run scripts directly from Cacti, the scripts would have to run remotely. This could easily be done over ssh, for example, but it would imply some management cost: maintaining a dedicated user and a key pair for every server, for instance. That user would also need access to the Postfix queues or to the services it collects statistics about, and so on.

In this first post, I’ll focus on collecting Postfix queue data and publishing it through snmpd.

My first, straightforward solution was to write a helper script that snmpd would spawn whenever a request arrives. The helper script scans the Postfix queue directories involved in the request and returns the result to snmpd. snmpd has supported this for a long time via the extend config file keyword. This works reasonably well for the data Cacti needs (one request every 5 minutes), but I wanted something a little more optimized. snmpd can be extended in other ways, and for my second attempt I chose pass_persist.
With pass_persist, only one helper process is spawned, and it communicates with snmpd via a simple line-oriented protocol. Having a single persistent process also makes it easy to cache the collected data, which avoids continuously scanning the file system only to find the same results. There are other important differences between the two extension types I mentioned; for those, please have a look at man snmpd.conf.
I wrote the extension script in Perl, which has a handy module called SNMP::Extension::PassPersist (available on CPAN) that conveniently handles the communication protocol.
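
For readers curious about what the module hides, here is a minimal sketch of that line-oriented protocol as I understand it from man snmpd.conf: snmpd sends PING and expects PONG, then sends get or getnext followed by an OID, and expects three lines back (OID, type, value) or NONE. The %tree contents and the helper names oid_cmp, answer and serve are made up for illustration; this is not how SNMP::Extension::PassPersist is implemented, and the demo talks to an in-memory buffer instead of snmpd.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical static data, OID => [type, value]; a real helper would fill
# this from the queue directories.
my %tree = (
    '.1.3.6.1.4.1.2021.53.0'    => [ 'integer', 2 ],
    '.1.3.6.1.4.1.2021.53.10.1' => [ 'integer', 2 ],
    '.1.3.6.1.4.1.2021.53.11.1' => [ 'integer', 72 ],
);

# Compare two dotted OIDs numerically, component by component.
sub oid_cmp {
    my @x = grep { length } split /\./, $_[0];
    my @y = grep { length } split /\./, $_[1];
    while ( @x && @y ) {
        my $c = shift(@x) <=> shift(@y);
        return $c if $c;
    }
    return @x <=> @y;    # the shorter OID sorts first
}

# Build the reply to a single get or getnext request.
sub answer {
    my ($cmd, $oid) = @_;
    my $hit;
    if ( $cmd eq 'get' ) {
        $hit = $oid if exists $tree{$oid};
    } elsif ( $cmd eq 'getnext' ) {
        # First OID that sorts strictly after the requested one.
        ($hit) = grep { oid_cmp($_, $oid) > 0 }
                 sort { oid_cmp($a, $b) } keys %tree;
    }
    return "NONE\n" unless defined $hit;
    return join("\n", $hit, @{ $tree{$hit} }) . "\n";
}

# Request loop: read commands from $in, write replies to $out.
sub serve {
    my ($in, $out) = @_;
    while ( defined( my $cmd = <$in> ) ) {
        chomp $cmd;
        last if $cmd eq '';                       # empty line: stop
        if ( $cmd eq 'PING' ) { print {$out} "PONG\n"; next }
        my $oid = <$in>;
        last unless defined $oid;
        chomp $oid;
        if ( $cmd eq 'set' ) {
            my $typed_value = <$in>;              # "type value" line, ignored
            print {$out} "not-writable\n";        # this sketch is read-only
            next;
        }
        print {$out} answer($cmd, $oid);
    }
}

# Demo: replay a short conversation from memory instead of talking to snmpd.
my $conversation = "PING\n"
                 . "get\n.1.3.6.1.4.1.2021.53.0\n"
                 . "getnext\n.1.3.6.1.4.1.2021.53.0\n";
my $reply = '';
open( my $in,  '<', \$conversation ) or die "cannot open input buffer: $!";
open( my $out, '>', \$reply )        or die "cannot open output buffer: $!";
serve($in, $out);
close $out;
print $reply;
```

Under snmpd the same loop would read STDIN and write STDOUT, with output buffering turned off ($| = 1) so that every reply reaches snmpd immediately.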

Here is the main script. I put it in /usr/local/sbin/snmpPostfixQueues:

#!/usr/bin/perl
use strict;
use File::Find;
use SNMP::Extension::PassPersist;
use Getopt::Std;

my @postfixdirs = glob('/var/spool/postfix*');            # I could also use postmulti -l, if my spools were not all in /var/spool/postfix-something
my $baseoid = '.1.3.6.1.4.1.2021.53';                # may be overridden with option -b
my @queues = qw/active deferred hold incoming maildrop/;    # Order is important. If changed, change the Cacti data query XML accordingly.

my $extsnmp;
my %opts;
getopts('db:', \%opts);

$baseoid = $opts{b} || $baseoid;

if( $opts{d} ) {
    $extsnmp = undef;
    update_tree();
} else {
    $extsnmp = SNMP::Extension::PassPersist->new(
            backend_collect => \&update_tree,
            idle_count      => 60,      # no more than 60 idle cycles
            refresh         => 50,
        );
    $extsnmp->run;
}

sub addOidEntry($$$) {
    my ($dove, $tipo, $cosa) = @_;    # OID ("where"), type, value ("what")
    if( $opts{d} ) {
        print "$dove = $tipo: $cosa\n";
    } else {
        $extsnmp->add_oid_entry($dove, $tipo, $cosa);
    }
}

sub update_tree {
    my $idx = 0;

    foreach my $pfxspool ( @postfixdirs ) {
        next unless -d $pfxspool;
        $idx++;
    
        addOidEntry("${baseoid}.1.${idx}", 'integer', $idx);

        my $istanza = $pfxspool;
        $istanza =~ s|/var/spool/||;
        if( $istanza eq 'postfix' ) {
            addOidEntry("${baseoid}.2.${idx}", 'string', '');
            addOidEntry("${baseoid}.3.${idx}", 'string', '-');    # name coherent with postmulti -l
        } else {
            my $i = $istanza;
            $i =~ s|^postfix-?||;
            addOidEntry("${baseoid}.2.${idx}", 'string', $i);
            addOidEntry("${baseoid}.3.${idx}", 'string', $istanza);
        }

        my $n = 10;
        foreach my $coda ( @queues ) {
            my $dir = "$pfxspool/$coda";
            my $sum = 0;
    
            # Count the regular files below the queue directory, if it exists.
            -d $dir && find( sub{ -f && $sum++ }, $dir );
            addOidEntry("${baseoid}.$n.${idx}", 'integer', $sum);
            $n++;
        }
    }

    addOidEntry("${baseoid}.0", 'integer', $idx);
}

You can run the script on the command line with the -d option to test it. It gives you a dump of what is going into the MIB tree:

# snmpPostfixQueues -d
 .1.3.6.1.4.1.2021.53.1.1 = integer: 1
 .1.3.6.1.4.1.2021.53.2.1 = string:
 .1.3.6.1.4.1.2021.53.3.1 = string: -
 .1.3.6.1.4.1.2021.53.10.1 = integer: 2
 .1.3.6.1.4.1.2021.53.11.1 = integer: 72
 .1.3.6.1.4.1.2021.53.12.1 = integer: 0
 .1.3.6.1.4.1.2021.53.13.1 = integer: 0
 .1.3.6.1.4.1.2021.53.14.1 = integer: 0
 .1.3.6.1.4.1.2021.53.1.2 = integer: 2
 .1.3.6.1.4.1.2021.53.2.2 = string: slow
 .1.3.6.1.4.1.2021.53.3.2 = string: postfix-slow
 .1.3.6.1.4.1.2021.53.10.2 = integer: 0
 .1.3.6.1.4.1.2021.53.11.2 = integer: 101
 .1.3.6.1.4.1.2021.53.12.2 = integer: 0
 .1.3.6.1.4.1.2021.53.13.2 = integer: 0
 .1.3.6.1.4.1.2021.53.14.2 = integer: 0
 .1.3.6.1.4.1.2021.53.0 = integer: 2
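
To make the dump easier to read, here is a small sketch of how the OID columns map back to the script: .1 is the row index, .2 the instance suffix, .3 the instance name, and .10 onward follow the order of @queues. The %column table, its labels and the describe_oid helper are my own illustration and are not part of the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $baseoid = '.1.3.6.1.4.1.2021.53';
my @queues  = qw/active deferred hold incoming maildrop/;

# Column number => meaning. The labels are illustrative, chosen for this sketch.
my %column = (
    1 => 'index',
    2 => 'instance suffix',
    3 => 'instance name',
);
my $n = 10;                                  # queue columns start at .10, as in update_tree
$column{ $n++ } = "$_ queue size" for @queues;

# Turn an OID from the dump into a human-readable description.
sub describe_oid {
    my ($oid) = @_;
    return 'number of instances' if $oid eq "$baseoid.0";
    my ($col, $idx) = $oid =~ /^\Q$baseoid\E\.(\d+)\.(\d+)$/
        or return;                           # not a <column>.<index> OID under the base
    return unless exists $column{$col};
    return "$column{$col}, instance $idx";
}

print describe_oid('.1.3.6.1.4.1.2021.53.11.2'), "\n";
```

With the dump above, .11.2 is therefore the deferred queue of the second instance (postfix-slow).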

The script is then added to snmpd.conf:

pass_persist .1.3.6.1.4.1.2021.53 /usr/bin/perl /usr/local/sbin/snmpPostfixQueues

Reload snmpd, and test it:

# snmpwalk -v1 -cpublic -On localhost .1.3.6.1.4.1.2021.53
 .1.3.6.1.4.1.2021.53.0 = INTEGER: 2
 .1.3.6.1.4.1.2021.53.1.1 = INTEGER: 1
 .1.3.6.1.4.1.2021.53.1.2 = INTEGER: 2
 .1.3.6.1.4.1.2021.53.2.1 = ""
 .1.3.6.1.4.1.2021.53.2.2 = STRING: "slow"
 .1.3.6.1.4.1.2021.53.3.1 = STRING: "-"
 .1.3.6.1.4.1.2021.53.3.2 = STRING: "postfix-slow"
 .1.3.6.1.4.1.2021.53.10.1 = INTEGER: 2
 .1.3.6.1.4.1.2021.53.10.2 = INTEGER: 0
 .1.3.6.1.4.1.2021.53.11.1 = INTEGER: 72
 .1.3.6.1.4.1.2021.53.11.2 = INTEGER: 102
 .1.3.6.1.4.1.2021.53.12.1 = INTEGER: 0
 .1.3.6.1.4.1.2021.53.12.2 = INTEGER: 0
 .1.3.6.1.4.1.2021.53.13.1 = INTEGER: 2
 .1.3.6.1.4.1.2021.53.13.2 = INTEGER: 0
 .1.3.6.1.4.1.2021.53.14.1 = INTEGER: 0
 .1.3.6.1.4.1.2021.53.14.2 = INTEGER: 0

The data is now in the SNMP MIB tree, and for this part we’re done.
Next time I’ll show how I collect event statistics for smtpd (and other Postfix daemons).
