
Sendmail/Postfix Milters in Python

by Jim Niemira and Stuart D. Gathman
This web page is written by Stuart D. Gathman and originally sponsored by Business Management Systems, Inc.
(see LICENSE for copying permissions for this documentation)
Last updated Jul 11, 2015

Maxwell's Daemon: pymilter mascot. Mascot by students of Christian Hafner.

Sendmail introduced a new API beginning with version 8.10 - libmilter. Sendmail 8.12 officially released libmilter. Version 8.12 seems to be more robust, and includes new privilege separation features to enhance security. Even better, sendmail 8.13 supports socket maps, which makes pysrs much more efficient and secure. Sendmail 8.14 finally supports modifying MAIL FROM via the milter API, and a data callback allowing spam to be rejected before beginning the DATA phase (even after accepting some recipients).
Pymilter provides a milter module for Python that implements a Python interface to libmilter, exploiting all its features.
Postfix now also implements the milter protocol, so you can program SMTP-time filters for Postfix in Python.

What's New

  • pymilter 1.0 removes the start.sh glue script. EL6 RPMs for packages using pymilter (milter, pysrs, pygossip) now use daemonize as a replacement. ACCEPT is supported as an untrapped exception policy. An optional dir for getaddrset and getaddrdict in Milter.config supports moving some clutter. Untrapped exceptions now report the registered milter name. An selinux subpackage is included for EL6. This release provides sqlite support for greylisting, and Milter.greylist export and Milter.greysql import to migrate data.
  • pyspf 2.0.9 adds a new test suite and support for RFC 7208, the official (non-experimental) RFC for SPF.
  • pyspf 2.0.8 adds much improved python3 support. All test suites now pass with python3 and py3dns. SPF records are restricted to 7-bit ascii. But some people try to use an extended set anyway, crashing pyspf. We now return PermError for non-ascii SPF records. IP address parsing and arithmetic is now handled by the ipaddr (ipaddress in python3) module. I fixed a bug caused by a null CNAME in cache.
  • milter 0.8.18 adds test cases and SMTP AUTH policies in sendmail access for spf-milter. You can now also configure an untrapped exception policy for spf-milter, and it rejects numeric HELO. For the bms milter, from words can be in a file, and you can use the BAN feature for configured public email providers like gmail and yahoo - it bans the mailbox rather than the entire domain.
  • pymilter 0.9.8 adds a test module for unit testing milters. It fixes a typo that prevented setsymlist from actually working all these years (misspelled as setsmlist). The untrapped exception message is changed to "pymilter: untrapped exception in milter app".
  • milter 0.8.17 reports keysize of DKIM signatures, adds a simple DKIM milter, and DKIM policies in the sendmail access file. It also broke spf-milter for people using SMTP AUTH - sorry guys!
  • milter 0.8.16 has dkim signing, and Authentication-Results header. pymilter-0.9.7 has several improved diagnostics for milter programming errors.
  • milter has dkim checking and logging in CVS. It will use DKIM Pass for reputation tracking, and as an additional acceptable identity along with HELO, PTR, or SPF.
  • pymilter-0.9.4 supports python-2.6
  • pymilter-0.9.2 supports the negotiate, data, and unknown callbacks. Protocol steps are automatically negotiated by the high-level Milter package by annotating callback methods with @nocallback or @noreply.
  • pymilter-0.9.1 supports CHGFROM, introduced with sendmail-8.14, and also supported by postfix-2.3.

Support

  • pymilter mailing list
  • SPF forums and chat for SPF questions
  • IRC channel: #dkim on irc.perl.org for DKIM questions
  • IRC channel: #pymilter on irc.freenode.net for pymilter questions

    You may be required to register your user nickname (nick) and identify with that nick. Otherwise, you may not be able to join or be heard on the IRC channel. There is a page describing how to register your nick at freenode.net.

Overview

To accommodate other open source projects using pymilter, this package has been shedding modules which can be used by other packages.
  • The pymilter package provides a robust toolkit for Python milters that wraps the C libmilter library. There are also several pure Python milter libraries that implement the milter protocol in Python.
  • The milter package provides the beginnings of a general purpose mail filtering system written in Python. It also includes a simple spfmilter that supports policy by domain and spf result via the sendmail access file.
  • The pysrs package provides an SRS library, SES library, a sendmail socketmap daemon implementing SRS, and (Real Soon Now) an srsmilter daemon implementing SRS, now that sendmail-8.14 supports CHGFROM and this is supported in pymilter-0.9.
  • The pyspf package provides the spf module, a well tested implementation of the SPF protocol, which is useful for detecting email forgery.
  • The pygossip package provides the gossip library and server daemon for the GOSSiP protocol, which exchanges reputation of qualified domains. (Qualified in the milter package means that example.com:PASS tracks a different reputation than example.com:NEUTRAL.)
  • The pydns package provides the low level DNS library for python DNS lookups. It is much smaller and lighter than the more capable (and bigger) dnspython library. Low level lookups are needed to find SPF and MX records for instance.
  • The pydspam package wraps libdspam for python.

At the lowest level, the milter module provides a thin wrapper around the sendmail libmilter API. This API lets you register callbacks for a number of events in the process of sendmail receiving a message via SMTP. These events include the initial connection from a MTA, the envelope sender and recipients, the top level mail headers, and the message body. There are options to mangle all of these components of the message as it passes through the milter.

At the next level, the Milter module (note the case difference) provides a Python friendly object oriented wrapper for the low level API. To use the Milter module, an application registers a 'factory' to create an object for each connection from a MTA to sendmail. These connection objects must provide methods corresponding to the libmilter callback events.

Each event method returns a code to tell sendmail whether to proceed with processing the message. This is a big advantage of milters over other mail filtering systems. Unwanted mail can be stopped in its tracks at the earliest possible point.

The Milter.Milter class provides default implementations for event methods that do nothing, and also provides wrappers for the libmilter methods to mutate the message.
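A minimal sketch of the factory and callback pattern described above. The class name BlockListMilter, the blocked domain, the header name, the milter name, and the socket address are all made up for illustration. The try/except fallback is only there so the snippet can be loaded on a machine without pymilter installed; its constant values are placeholders.

```python
try:
    import Milter                      # high-level module from the pymilter package
except ImportError:
    # Stand-in so the sketch can be loaded without pymilter; values are placeholders.
    from types import SimpleNamespace
    Milter = SimpleNamespace(Milter=object, CONTINUE=0, REJECT=1, ACCEPT=3)


class BlockListMilter(Milter.Milter):
    """One instance is created per MTA connection (hypothetical example policy)."""

    blocked_domains = {"spam.example"}          # assumption: illustrative data only

    def envfrom(self, mailfrom, *params):
        # MAIL FROM usually arrives wrapped in angle brackets.
        domain = mailfrom.strip("<>").rpartition("@")[2].lower()
        if domain in self.blocked_domains:
            return Milter.REJECT                # refuse the sender at MAIL FROM
        return Milter.CONTINUE

    def eom(self):
        # Message changes are requested at end of message.
        self.addheader("X-Example-Milter", "checked")
        return Milter.ACCEPT


def run(socketname="inet:8800@localhost", timeout=600):
    """Register the factory and hand control to libmilter (requires pymilter)."""
    Milter.factory = BlockListMilter
    Milter.set_flags(Milter.ADDHDRS)            # ask the MTA for permission to add headers
    Milter.runmilter("examplefilter", socketname, timeout)
```

Calling run() starts the milter and blocks until it is shut down, so it is deliberately not invoked here. The socket address must match whatever the MTA configuration names for this milter.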

The mime module provides a wrapper for the Python email package that fixes some bugs, and simplifies modifying selected parts of a MIME message.

Finally, the bms.py application is both a sample of how to use the Milter and spf modules, and the beginnings of a general purpose SPAM filtering, wiretapping, SPF checking, and Win32 virus protecting milter. It can make use of the pysrs package when available for SRS/SES checking and the pydspam package for Bayesian content filtering. SPF checking requires pydns. Configuration documentation is currently included as comments in the sample config file for the bms.py milter. See also the HOWTO and Milter Log Message Tags.

Python milter is under GPL. The authors can probably be convinced to change this to LGPL if needed.

What is a milter?

Milters can run on the same machine as sendmail, or another machine. The milter can even run with a different operating system or processor than sendmail. Sendmail talks to the milter via a local or internet socket. Sendmail keeps the milter informed of events as it processes a mail connection. At any point, the milter can cut the conversation short by telling sendmail to ACCEPT, REJECT, or DISCARD the message. After receiving a complete message from sendmail, the milter can again REJECT or DISCARD it, but it can also ACCEPT it with changes to the headers or body.
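The way a verdict cuts the conversation short can be modelled without any mail software at all. The following is a toy simulation, not the milter protocol or the pymilter API: the verdict values, ExampleFilter, and run_session are hypothetical names chosen for this sketch.

```python
# Verdict names follow the text; the values are placeholders, not libmilter constants.
CONTINUE, ACCEPT, REJECT, DISCARD = "continue", "accept", "reject", "discard"


class ExampleFilter:
    """Hypothetical filter with one method per SMTP event."""

    def hello(self, heloname):
        # Reject a bare numeric HELO such as "192.0.2.7".
        if heloname.replace(".", "").isdigit():
            return REJECT
        return CONTINUE

    def envfrom(self, sender):
        return CONTINUE

    def envrcpt(self, recipient):
        # Silently drop mail addressed to MAILER-DAEMON.
        if recipient.strip("<>").lower().startswith("mailer-daemon@"):
            return DISCARD
        return CONTINUE

    def eom(self):
        # Only here, after the whole message, could changes be requested.
        return ACCEPT


def run_session(filt, events):
    """Deliver events in order; stop at the first verdict other than CONTINUE."""
    for name, args in events:
        verdict = getattr(filt, name)(*args)
        if verdict != CONTINUE:
            return name, verdict
    return None, CONTINUE


session = [
    ("hello", ("192.0.2.7",)),
    ("envfrom", ("<alice@example.com>",)),
    ("envrcpt", ("<bob@example.org>",)),
    ("eom", ()),
]
print(run_session(ExampleFilter(), session))    # ('hello', 'reject')
```

Because the numeric HELO is rejected at the first event, the later callbacks are never reached, which is the early-exit behaviour the paragraph describes.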

What can you do with a milter?

  • A milter can DISCARD or REJECT spam based on algorithms scripted in python rather than sendmail's cryptic "cf" language.
  • A milter can alter or remove attachments from mail that are poisonous to Windows.
  • A milter can scan for viruses and clean them when detected.
  • A milter can scan outgoing as well as incoming mail.
  • A milter can add and delete recipients to forward or secretly copy mail.
  • For more ideas, look at some of the milters linked at the PyMilter Main Page.

    Documentation for the C API is provided with sendmail. Documentation for pymilter is provided via Doxygen. Miltermodule provides a thin python wrapper for the C API. Milter.py provides a simple OO wrapper on top of that.

    The Python milter package includes a sample milter that replaces dangerous attachments with a warning message, discards mail addressed to MAILER-DAEMON, and demonstrates several SPAM abatement strategies. The MimeMessage class to do this used to be based on the mimetools and multifile standard python packages. As of milter version 0.6.0, it is based on the email standard python packages, which were derived from the mimelib project. The MimeMessage class patches several bugs in the email package, and provides some backward compatibility.

    The "defang" function of the sample milter was inspired by MIMEDefang, a Perl milter with flexible attachment processing options. The latest version of MIMEDefang uses an apache style process pool to avoid reloading the Perl interpreter for each message. This makes it fast enough for production without using Perl threading.
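The idea of replacing a dangerous attachment with a warning can be sketched with the standard email package alone. This is not the code from the sample milter or its MimeMessage class; the function name defang is borrowed from the text, while the extension list and the warning wording are assumptions. It does not look inside archives or decode RFC 2047 encoded filenames.

```python
from email import message_from_string

BANNED_EXTENSIONS = (".exe", ".scr", ".pif", ".vbs")   # assumption: example policy

WARNING = ("An attachment named %s was removed from this message\n"
           "because files of that type can be dangerous.\n")


def defang(msg):
    """Replace banned attachments in msg with a text warning; return True if changed."""
    changed = False
    for part in msg.walk():
        filename = part.get_filename()
        if not filename or not filename.lower().endswith(BANNED_EXTENSIONS):
            continue
        # Keep the warning pure ASCII so no transfer encoding is needed.
        shown = filename.encode("ascii", "replace").decode("ascii")
        for header in ("Content-Type", "Content-Disposition",
                       "Content-Transfer-Encoding"):
            del part[header]
        part["Content-Type"] = 'text/plain; charset="us-ascii"'
        part.set_payload(WARNING % shown)
        changed = True
    return changed


raw = "\n".join([
    "From: a@example.com",
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="XX"',
    "",
    "--XX",
    "Content-Type: text/plain",
    "",
    "hello",
    "--XX",
    'Content-Type: application/octet-stream; name="run.EXE"',
    'Content-Disposition: attachment; filename="run.EXE"',
    "Content-Transfer-Encoding: base64",
    "",
    "TVqQAAMAAAAEAAAA",
    "--XX--",
    "",
])
msg = message_from_string(raw)
defang(msg)          # the second part is now a text/plain warning
```

In a milter, a True result would be the signal to replace the message body at end of message.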

    mailchecker is a Python project to provide flexible attachment processing for mail. I will be looking at plugging mailchecker into a milter.

    TMDA is a Python project to require confirmation the first time someone tries to send to your mailbox. This would be a nice feature to have in a milter.

    Is a milter written in python efficient?

    The python milter process is multi-threaded and startup cost is incurred only once. This is much more efficient than some implementations that start a new interpreter for each connection. Testing in a production environment did not use a significant percentage of the CPU. Furthermore, python is easily extended in C for any step requiring expensive CPU processing.

    For example, the HTML parsing feature to remove scripts from HTML attachments is rather CPU intensive in pure python. Using the C replacement for sgmllib greatly speeds things up.
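As a rough illustration of what script removal involves, here is a pure Python sketch built on the standard html.parser module. It is not the sgmllib-based code used by the milter, and the names ScriptStripper and strip_scripts are invented for this example. It drops script elements only; comments and declarations are also discarded, and script-bearing attributes such as onclick are left untouched.

```python
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Re-emit tags, text and character references; drop <script> elements."""

    def __init__(self):
        super().__init__(convert_charrefs=False)   # keep &amp; etc. as written
        self.out = []
        self.in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
        elif not self.in_script:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>; a self-closed <script/> is dropped too.
        if tag != "script" and not self.in_script:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
        elif not self.in_script:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.in_script:
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def strip_scripts(html):
    """Return html with every <script>...</script> element removed."""
    parser = ScriptStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Every tag and text run passes through a Python method call, which gives a feel for why this step is CPU intensive on large attachments.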

    Goals

  • Implement RRS - a backdoor for non-SRS forwarders. User lists non-SRS forwarder accounts (perhaps in ~/.forwarders), and a util provides a special local alias for the user to give to the forwarder. Alias only works for mail from that forwarder. Milter gets forwarder domain from alias and uses it to SPF check forwarder. Requires milter to have read access to ~/.forwarders or else a way for user to submit entries to milter database.
  • The bms.py milter has too many features. Create a framework where numerous small feature modules can be plugged together in the configuration.
  • Find or write a faster implementation of sgmllib. The sgmlop package is not very compatible with Python-2.1 sgmllib, but it is a start, and is supported in milter-0.4.5 or later.
  • Implement all or most of the features of MIMEDefang.
  • Follow the official Python coding standards more closely.
  • Make unit test code more like other python modules.

    Confirmed Installations

    Please email me if you do not successfully install milter. The confirmed installations are too numerous to list at this point.

    Enough Already!

    Nearly a dozen people have emailed me begging for a feature to copy outgoing and/or incoming mail to a backup directory by user. Ok, it looks like this is a most requested feature. In the meantime, here are some things to consider:
    • The milter package (bms.py) supports the mail_archive option in the [wiretap] section. This is not by user, however.
    • If you want the equivalent of a Bcc added to each message, this is very easy to do in the python code for bms.py. See below.
    • If you want to copy to a file in a directory (thus avoiding having to set up aliases), this is slightly more involved. The bms.py milter already copies the message to a temporary file for use in replacing the message body when banned attachments are found. You have to open a file, and copy the Message object to it in eom().
    • Finally, you are probably aware that most email clients already keep a copy of outgoing mail? Presumably there is a good reason for keeping another copy on the server.
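For the copy-to-a-directory case, the file handling itself might look like the sketch below. The function name archive_copy, the one-directory-per-user layout, and the .eml suffix are assumptions made for illustration and are not part of bms.py. It expects a standard email.message.Message and would be called from eom().

```python
import os
import re
import tempfile


def archive_copy(msg, user, archive_dir):
    """Write msg (an email.message.Message) to a new file under archive_dir/<user>/."""
    # Never trust the mailbox name as a path component.
    safe_user = re.sub(r"[^A-Za-z0-9_-]", "_", user) or "_"
    user_dir = os.path.join(archive_dir, safe_user)
    os.makedirs(user_dir, exist_ok=True)
    # mkstemp picks a unique name, so concurrent milter threads cannot collide.
    fd, path = tempfile.mkstemp(suffix=".eml", dir=user_dir)
    with os.fdopen(fd, "wb") as fp:
        fp.write(msg.as_bytes())
    return path
```

The function returns the path it wrote, which is convenient for logging. Deciding which users and which direction of mail to archive is the configuration question raised at the end of this section and is left open here.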

    To Bcc a message, call self.add_recipient(rcpt) in envfrom after determining whether you want to copy (e.g. whether the sender is local). For example,

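      def envfrom(...
        ...
        # t[0] is the sender's mailbox name, t[1] the sender's domain
        if len(t) == 2:
          self.rejectvirus = t[1] in reject_virus_from
          # copy mail from wiretapped users to the wiretap destination
          if t[0] in wiretap_users.get(t[1],()):
            self.add_recipient(wiretap_dest)
          # Bcc mail from local senders to a per-user copy alias
          if t[1] == 'mydomain.com':
            self.add_recipient('<copy-%s>' % t[0])
          ...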
    

    To make this a generic feature requires thinking about how the configuration would look. Feel free to make specific suggestions about config file entries. Be sure to handle both Bcc and file copies, and designating what mail should be copied. How should "outgoing" be defined? Implementing it is easy once the configuration is designed.

