mailgraph and logwatch reporting curiosities with postmulti and some regexes for fun.

The zoo’s mailgraph charts are not working, and I have mentioned it before (my blog).

So after changing our /etc/postfix instance (we have more, better instances) for a new feature that allows outbound internet mail to be sent to an address, the charts began to show only that traffic. Bounces also appear to work (not shown).

Spam and viruses as defined by amavis do work, but the received email from those other postfix instances is still not being recognised, even with explicit syslog statements in the main.cf file.

So something is off.

Reading the charts could give you the impression that we receive no email (the chart does not graph it) but that we appear to send out spam and viruses that get blocked. The bounces were something I induced, and could have been DMARC related too, as many DMARC reporters have problems clearing their Gmail inbox.

It is a good reminder that badly made statistics may look interesting but do not reflect reality.

The logwatch config files such as /usr/share/logwatch/default.conf/services/postfix.conf are written as Perl and at this point are beyond my comprehension.

*OnlyService = "(?:post(?:fix|grey|fwd|fix-1|fix2|fix-0|fix-3|policyd-spf)(?:/[-\w]*)?"
$postfix_Syslog_Name = "(?:post(?:fix|grey|fwd)|policyd-spf)"
# POSTMULTI NOT WORK *OnlyService = "postfix\d?/[-a-zA-Z\d]*"
#$postfix_Syslog_Name = "postfix\d?"

My changes are in bold. That does not work. /etc/postfix-1 and so on is how postmulti expects its managed instances to be located (my blog).

A few days pass, and with the help of a PCRE debugger [https://regex101.com/] I find that

$postfix_Syslog_Name = "postfix/[\w]*"
*OnlyService = "(:postfix-1/|postfix-2/|postfix-3/|postfix-4/|policyd-spf|postfix/|post-grey|post-fwd)(?:[-\w]*)?"

This provides output from the postmulti instances as well as the /etc/postfix daemon. I might not need that last postfix/ entry in *OnlyService, but the completist in me thought it worth specifying.
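A quick way to sanity-check a pattern like this outside logwatch is to try it against a few service names. The sketch below uses Python's `re` rather than Perl, purely for illustration. The sample names are made up, and the full-match test is an assumption, since how logwatch anchors *OnlyService is not checked here. It also compares the pattern as written, where the leading `(:` is a capturing group whose first alternative starts with a literal colon, with a hypothetical variant that uses the non-capturing `(?:`.

```python
import re

# Pattern copied from the *OnlyService line above.
AS_WRITTEN = (
    r"(:postfix-1/|postfix-2/|postfix-3/|postfix-4/|policyd-spf|postfix/"
    r"|post-grey|post-fwd)(?:[-\w]*)?"
)
# Hypothetical variant: leading "(:" swapped for the non-capturing "(?:".
NON_CAPTURING = AS_WRITTEN.replace("(:", "(?:", 1)

# Made-up syslog service names for the test.
SAMPLES = [
    "postfix/smtpd",
    "postfix-1/smtpd",
    "postfix-2/lmtp",
    "policyd-spf",
    "dovecot",
]


def matches(pattern, names):
    """Return the names that the pattern matches in full (an assumption)."""
    return [name for name in names if re.fullmatch(pattern, name)]


print("as written:   ", matches(AS_WRITTEN, SAMPLES))
print("non-capturing:", matches(NON_CAPTURING, SAMPLES))
```

Under these assumptions the as-written pattern only accepts postfix-1 when a colon precedes it, so if the first instance looks thin in the report, that leading `(:` is the first thing worth checking.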

post-fwd and post-grey are not used here in the zoo; we use postscreen. The SPF log part of the section is a little unwieldy, but it always was, and I could turn it off.

I find with postmulti reporting that “postfix/lmtp” is best stated as “lmtp” if grepping unless you want to add extra grep lines to your cron jobs.
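To illustrate why the shorter string catches more, here is a small sketch in Python (standing in for grep) over a few invented log lines. The host name, PIDs and queue IDs are hypothetical, and the instance names assume the postfix-N naming used above.

```python
# Invented log lines; only the service-name part matters here.
LOG_LINES = [
    "Jan  1 00:00:01 mx postfix/lmtp[101]: ABC123: status=sent",
    "Jan  1 00:00:02 mx postfix-1/lmtp[102]: DEF456: status=sent",
    "Jan  1 00:00:03 mx postfix-2/smtpd[103]: connect from unknown",
]


def grep(needle, lines):
    """Return the lines containing the literal needle, like grep -F."""
    return [line for line in lines if needle in line]


# "postfix/lmtp" misses the postfix-1 instance; plain "lmtp" catches both.
print(len(grep("postfix/lmtp", LOG_LINES)))
print(len(grep("lmtp", LOG_LINES)))
```

The trade-off is that a bare "lmtp" can also match unrelated text in a log line, so a slightly tighter needle such as "/lmtp[" may be a safer middle ground.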

So the charts are still a bit messed up. Not the end of the world, although I have cron jobs that grep for connections and SASL abusers, so between the broken things and our existing zoo cron jobs we keep on top of what postfix is having to deal with.

A work in progress: mailgraph requires that the /usr/sbin/mailgraph file be changed for postmulti.

I seemed to have some luck, and you can see the switch-on in the chart: the data before was sent from a non-internet postfix host, and then green and red suddenly appear.

I changed the line for postfix (a regex again) from

if($prog =~ /^postfix\/(.*)/) {

to

if($prog =~ /:postfix|postfix-1\/(.*)|postfix-2\/(.*)|postfix-3\/(.*)|postfix-4\/(.*)/) {

Which is not very maintainable and a bit of a bodge job, but it gets the regex working for more than one instance. Whether that reflects reality or not I will have to check against the logwatch reporting, although with postfix dropping more bad connections earlier (my blog) it feels right, so the charts now ignore a large quantity of data from bad SMTP clients, say:
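One thing worth checking with an alternation like this is which capture group ends up holding the daemon name. The sketch below transcribes the edited pattern into Python's `re` (not Perl) and compares it with a hypothetical compact pattern that accepts an optional "-N" suffix and always uses a single group. It assumes, as the stock mailgraph script appears to do, that the code after this test reads the daemon name from the first capture group. That assumption has not been verified against the installed file.

```python
import re

# The edited mailgraph pattern from above, transcribed to Python.
ALTERNATION = re.compile(
    r":postfix|postfix-1/(.*)|postfix-2/(.*)|postfix-3/(.*)|postfix-4/(.*)"
)
# Hypothetical compact form: optional "-<digits>" suffix, one capture group.
COMPACT = re.compile(r"^postfix(?:-\d+)?/(.*)")


def first_group(pattern, prog):
    """Return capture group 1 if the pattern matches, else None."""
    m = pattern.search(prog)
    return m.group(1) if m else None


for prog in ["postfix/smtpd", "postfix-1/smtpd", "postfix-2/smtp"]:
    print(prog, "|", first_group(ALTERNATION, prog), "|", first_group(COMPACT, prog))
```

With the alternation, only postfix-1 fills group 1. postfix-2 to postfix-4 land in groups 2 to 4, and a plain postfix/ program name does not match at all without a leading colon. If mailgraph really does read the first group, that alone could explain counts going missing.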

106 Reject by IP --------
 3 49.213.57.100 unknown
 3 103.241.75.75 unknown

So mailgraph and postfix now seem not to count certain items compared to before the upgrade. So that regex might see an edit.

Mailgraph was working and then was not, and I was unsure of my efforts: another regex to adjust.

I eventually found

/postfix-1\/(.*)|postfix-2\/(.*)|postfix-3\/(.*)|postfix-4\/(.*)|postfix/

This appears to show green, blue and red postfix lines.

Fail2ban also seems to need some help, although in my experience it will not trip with the rate throttling controls in place. The odd prober does still try; an extract from logwatch:

smtp
10 AUTH command rate
10 110.175.112.118 110-175-112-118.tpgi.com.au
1 Connection rate
1 110.175.112.118 110-175-112-118.tpgi.com.au

Perhaps fail2ban’s postfix jails are redundant with the rate limiting feature in newer postfix. Not that fail2ban tripped that often with our non postmulti config.

As most of our email traffic is using TLS (DANE, my blog), or trying to, I somehow think mailgraph's out-of-the-box use does not reflect reality, what with the rate controls, bad clients getting ignored and TLS traffic not shown, so I suppose this graph shows genuine email traffic rather than all port 25 attempts.