Jari's Procmail Tips Page

Submitted by jari_aalto on Thu, 2005-04-14 09:54.

Author: Jari Aalto.

Table of contents

1.0 Document id
2.0 Procmail pointers
3.0 Dry run testing
4.0 Things to remember
5.0 Procmail flags
6.0 Matching and regexps (regular expressions)
7.0 Variables
8.0 Suggestions and miscellaneous
9.0 Scoring
10.0 Formail usage
11.0 Saving mailing list messages
12.0 Procmail, MIME and HTML
13.0 Simple recipe examples
14.0 Miscellaneous recipes
15.0 Procmail and PGP
16.0 Includerc usage
17.0 Mailing list server
18.0 Common troubles
19.0 Technical matters
20.0 Procmail software for Emacs
21.0 RFC, Request for comments
22.0 Introduction to E-mail Headers
23.0 Message headers


1.0 Document id

1.1 General

$Id: pm-tips.txt,v 2.28 2004/10/06 13:55:39 jaalto Exp $
$URL: http://pm-doc.sourceforge.net/ $
$UrlLinksLastChecked: 2002-07-11 $

This is a Procmail Tips page: a collection of procmail recipes, instructions and howtos. The document also contains URL pointers to the procmail mailing list and to sites that fight Internet UBE. Procmail is a powerful mail handling tool, and a lot of space here has been devoted to discussing UBE (aka spam) and its essence. You will also find many other interesting subjects concerning Internet mail in general: mail headers, MIME and RFCs. Another part of this document is dedicated to Emacs and the Emacs plug-in package Gnus.el, simply because Emacs is the best tool you can use to deal with your mail and news reading. Nowadays Emacs is also available on the Windows platform. This is not to say that the existing Unix mail and news programs elm(1), mutt(1), pine(1) and slrn(1) are bad; they are just limited in power compared to Emacs and usually tied to the Unix platform. Finally, to your blessing or curse (smile), the author happens to know Emacs quite well. The tips are compiled from the procmail discussion list, from comp.mail.misc and from the author's own experiences with procmail.

This document does not intend to teach you the basics of procmail; you should already be familiar with the procmail man pages. The procmail manual pages exist primarily on Unix/Linux platforms. If you are using the Windows operating system, see Cygwin at http://www.cygwin.com/

You may want to read Nancy's and Era's procmail FAQ pages before this page. There is a wealth of useful procmail links and pointers to Unix programs that deal with mail. If you find errors or things to improve in this document, please send mail to this document's maintainer.

If any mentioned URL is no longer alive, you may still be able to find it with a web search engine such as http://www.google.com/

1.2 What is Procmail?

[FAQ] Procmail is a mail processing utility, which can help you filter your mail, sort incoming mail according to sender, Subject line, length of message, keywords in the message, etc, implement an ftp-by-mail server, and much more. Procmail is also a complete drop-in replacement for your MDA. (If this doesn't mean anything to you, you may not want to know.) Procmail runs under Unix. See Infinite Ink's Mail Filtering and Robots page for information about related utilities for various other platforms, and competing Unix programs, too (there aren't that many of either).

1.3 Abbreviations and thanks

People and documents, abbreviations referred to, tokens used, are in no particular order.

[stephen] Stephen R. van den Berg, author of Procmail. Last heard from on the procmail mailing list in 1997-08, using the address srb@cuci.nl. Later, in 1998, due to his regular work and lack of time, he nominated Philip Guenther to head Procmail development.

[aaron] Aaron Schrab aaron+procmail A T schrab com
[alan] Alan K. Stebbens alan.stebbens A T openwave com
[dan] Daniel Smith J.Daniel.Smith A T WriteMe dt com
[david] David W. Tamkin dattier A T panix com
[ed] Edward J. Sabol sabol A T alderaan gsfc nasa gov
[elijah] Eli the Bearded process A T qz little-neck ny us
[hal] Hal Wine hal A T dtor com
[jari] Jari Aalto jari aalto A T poboxes dt com
[philip] Philip Guenther guenther A T gac edu
[richard] Richard Kabel rkabel A T sequent com
[sean] Sean B. Straw PSE-L A T mail professional org
[timothy] Timothy J Luoma luomat+procmail A T luomat peak org
[walter] Walter Dnes waltdnes A T interlog com

[FAQ] Procmail FAQ era A T iki.fi
[manual] Quote from some procmail manual page
[maintainer] As of 2000-09 the maintainer is [jari]
#broken-link Link does not exist any more. A replacement is needed

A big thank you goes to all these people:

  • 1999-06-16 Mark Seiden mis@seiden.com did an enormous amount of work proofreading v1.74. He sent a massive 105k patch with many editorial corrections. My wholehearted thanks to you, Mark.
  • 1999-01-08 Steven Alexander stevena@teleport.com thought that a small Perl script would help me fix spelling mistakes more easily. The script has been a much better correction program than I am myself. Thank you. (Being a Perl programmer myself, I should have thought of this already, smile)
  • 1999 Guido.Van.Hoecke@se.bel.alcatel.be took 1.48 and sent a huge 55k patch to correct many English language typos. Thank you very much, Guido.
  • 1998-10-28 Richard Kabel rkabel@sequent.com sent a massive patch to correct the language and provided excellent improvement comments. Thank you, Richard, for spending the time on it.
  • 1998 Era Eriksson proofread v1.12 and sent numerous corrections.
  • Karl E. Vogel vogelke@c17mis.region2.wpafb.af.mil sent numerous new anti-spam links to be added to the document.
  • 1998 John Gianni jjg@cadence.com sent some nice recipes: one is now in the procmail module list, and the other ideas I have added to this tips file.
  • 1998 Tim Potter tpot@zip.com.au had a spare moment with v1.27 and sent a lot of spelling corrections. Thank you.

1.4 Version information

Here is the version and file size log of the text file, which gives some idea of how the document has evolved.

      Version  Date        Size (k)  Changes
      v2.27    2004-10-10  516       Spam related things removed.
      v2.16    2002-08-31  596       Removed old UBE pointers.
      v2.13    2002-08-13  596       Removed old UBE pointers.
      v2.5     2002-02-01  608       Spelling checked with Emacs ispell
      v2.2     2002-01-28  608       URL links checked and updated
      v2.0     2001-08-09  608       http://pm-doc.sourceforge.net opened.
      v1.77    1999-12-27  603       Netscape spam filters added
      v1.76    1999-10-01  602       Mark Seiden's patch applied. Now under CVS.
      v1.74    1999-04-26  599       document moved to www.procmail.org
      v1.72    1999-04-21  597       Links corrected
      v1.71    1999-03-29  597       Ricochet -- Perl script to fight UBE
      v1.70    1999-02-26  592       procmail's Y2K compliance
      v1.69    1999-02-23  590       RFC and using MIME in Usenet postings
      v1.68    1998-01-29  587       Added "Lua" language pointer
      v1.67    1998-01-07  579       Eli's procmail recipes in module section
      v1.66    1998-12-14  578       Philip took care of bugs/patches listing
      v1.64    1998-11-26  602       More Richard's comments integrated
      v1.63    1998-10-30  595       Richard's english correction patch
      v1.60    1998-10-21  591       UMASK, .forward if procmail already is LDA
      v1.58    1998-10-12  583       SmartList and other MLM software discussed
      v1.57    1998-10-06  575       PLUS addr. Convert HTML body to text
      v1.55    1998-08-29  565       Fetching fields with formail -x
      v1.53    1998-08-24  554       Procmail doesn't pass 8bit characters
      v1.52    1998-08-24  553       Flag c forking study, procmail wish list
      v1.51    1998-08-18  541       Small changes. MIME notes
      v1.49    1998-08-10  529       Guido.Van.Hoeck's 55k patch applied
      v1.46    1998-06-24  526       Added live urls to procmail archive
      v1.45    1998-06-23  521       All recipes checked by eye. Many fixes.
      v1.44    1998-06-19  516       Detecting mailing lists with pm-jalist.rc
      v1.41    1998-06-17  510       How to disable recipe quickly with
      v1.36    1998-04-03  493       Includerc rewritten, plus addressing
      v1.34    1998-04-02  488       ORing and supreme scoring added
      v1.32    1998-03-23  471       All recipes checked (by eye)
      v1.31    1998-03-10  469       Better ordering: ORing rules discussed
      v1.29    1998-01-30  429       "regexp" section rewrite.
      v1.24    1997-12-30  415       up till 1996-12 is now included
      v1.17    1997-12-09  343       up till archive 1996-07 now included
      v1.14    1997-11-25  260
      v1.13    1997-11-08  218       Era's correction suggestions.
      v1.10    1997-10-13  181       archive file 1995-10's tips included
      v1.9     1997-10-11  142
      v1.8     1997-10-01  127
      v1.6     1997-09-18   94
      v1.5     1997-09-16   76
      v1.05    1997-09-14   53
      v1.01    1997-09-13   46

1.5 Document layout and maintenance

In order to be able to maintain this documentation on every possible platform, the base version of this document is kept in text format, which is easily accessible and requires no special editors and no learning of a markup language like LaTeX, Texinfo, or Linux DocBook SGML. Granted, some other base format may be more suitable for multiple output formats (like PostScript or Emacs info), but in today's world simple text and generated HTML hopefully suffice for all needs. Perl and Emacs are also cross-platform tools (Windows, Unix ...) and easily installed, so getting to work is hopefully no obstacle. The tools that help in maintaining this document include (not required!):

The text version of this file was converted into HTML with the following command. You need Perl interpreter 5.4 or newer to call the t2html.pl script. The --Out option generates the file pm-tips.html in the current directory. Please also familiarize yourself with GNU RCS ident(1), if you have it available. It is important that you mark interesting text for these tools so that someone can get an overview of your supplied files.

      % perl -S t2html.pl \
          --html-frame \
          --title "Procmail tips page" \
          --author "Jari Aalto" \
          --meta-keywords "procmail, sendmail, mail, filter, FAQ, ube" \
          --meta-description "Procmail tips page" \
          --base http://pm-doc.sourceforge.net \
          --document http://pm-doc.sourceforge.net \
          --url http://pm-doc.sourceforge.net \
          --html-body "LANG=en" \
          --Out \
          pm-tips.txt

1.5.1 Sending improvements

Because I am not a native English speaker, I regret any typos in the document. If you have 5-10 minutes to spare and find a spelling mistake or misuse of English verbs, please go ahead and send a patch to the maintainer of this page. The preferred way to send corrections to this document is as diff(1) output. Here is how to make corrections and send them forward. The diff option -u comes from GNU diff and may be missing from older diff implementations; please try to send a -u diff if possible. If you don't have the -u option, use the -c option:

      % cp pm-tips.txt pm-tips.txt.orig

      ... load pm-tips.txt into your text editor
      ... edit the file and save
      ... generate the difference (a patch(1) compatible file)

      % diff -bwu pm-tips.txt.orig pm-tips.txt > pm-tips.txt.patch

      ... send the content of pm-tips.txt.patch by mail to the document maintainer.

1.6 About presented recipes

The recipes presented here are collected from the net and procmail archives. The recipes have been kept as close to the original as possible, but the ideas have been generalized when necessary. If some recipe doesn't work as announced, please a) send a note to [maintainer] b) send mail to the procmail mailing list and ask how to correct it. Sometimes a simple dot (.) has been used in regular expressions where the right, pedantic way would have been to use an escaped dot. If you want to be very strict, you should use the escaped dot where applicable.

      # free hand version     # pedantic version
      :0                      :0
      * match.this.site       * match\.this\.site
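
To see what the unescaped dot costs, here is a small shell sketch. It uses grep -E only as a stand-in for procmail's own egrep-style matcher, and the two sample header lines are made up for illustration; they are not taken from any real message.

```shell
# Hypothetical sample data: one genuine host name and one look-alike.
sample='Received: from match.this.site
Received: from matchXthisYsite'

# Unescaped dots match any character, so both lines are counted.
loose=$(printf '%s\n' "$sample" | grep -c -E 'match.this.site')

# Escaped dots match only a literal ".", so one line is counted.
strict=$(printf '%s\n' "$sample" | grep -c -E 'match\.this\.site')

echo "loose=$loose strict=$strict"
```

The free hand pattern counts both lines, because the dot also accepts the X and the Y, while the pedantic pattern counts only the genuine host name.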

Procmail also accepts assignments without quotes, like this:

      var = value
      num = 1
      dir = /var/mail

But in this document a strict style has been adopted, where literal strings are assigned with double quotes:

      var = "value"

That's because the procmail code checker (Emacs package tinyprocmail.el) then won't warn about a missing dollar sign, which might very well have been forgotten. Emacs package font-lock.el, a syntax highlighting assistant, also displays double quoted strings in color.

      #   If you do this...

      var = value

      #   then you might have made a typo. It is in fact not clear
      #   what was intended:

      var = "value"       # Did you mean: literal assignment?
      var = $value        # Did you mean: variable assignment?

Recipe flags are also not stuck together, because the visual distinction between :0 and the flags is a valuable one. The reasoning for which flags are kept together, and in which order, is explained later in detail.

      # Erm, all stuck        # This may be visually more clear
      :0ABDc:                 :0 A BD c:

1.7 Variables used in recipes

These are part of the procmail module pm-javar.rc and are used in recipes.

      #   Pure newline; typical usage if you want to write
# Something directly to procmail's active logfile:
#
# LOG = "$NL message $NL"

NL = "
"

Refer to the "improving Space-Tab syndrome" section for more details.

      WSPC    = "     "               # whitespace: space + tab

      SPC     = "[$WSPC]"             # Regexp: space + tab
      SPCL    = "($SPC|$)"            # whitespace + linefeed: spc/tab/nl
      NSPC    = "[^$WSPC]"            # negation: anything but space/tab

      s       = $SPC                  # shortname: like Perl \s
      d       = "[0-9]"               # A digit -- Perl \d
      w       = "[0-9a-z_A-Z]"        # A word character -- Perl \w
      W       = "[^0-9a-z_A-Z]"       # A non-word character -- Perl \W
      a       = "[a-zA-Z]"            # An alphabetic character only

Writing recipes is now a little easier, and they may look clearer, at least to people who are accustomed to reading Perl regular expression short names:

      #   Matches "Header-Name: 11 12"
      :0
      *$ Header-Name:$s+$d+$s+$d
      {
          #   Matched "whitespace" + "digit" + "whitespace" + "digit"
          #   Do something
      }

SUPREME = 9876543210 is a score value high enough to make procmail bail out of scoring. [david] Actually the maximum is 2147483647, but 9876543210 is easier to remember and type, and will function just as well.

PMSRC = Procmail module source code directory. The location where *.rc files reside. Anywhere you want it to be. Usually $HOME/pm or $HOME/procmail/lib. Here you can keep the procmail files, log files and includerc scripts. Another commonly used synonym is PMDIR.

SPOOL = Directory where your procmail delivers the categorized messages. Like mailing lists:

      list.procmail, list.lynx-users, list.emacs, list.elm

and work mail:

      work.announcements, work.lab, work.doc, work.customer

and your private messages:

      mail.Usenet, mail.private, mail.default, mail.perl

and unimportant messages:

      junk.daemon, junk.cron, junk.ube

If you read the procmail-delivered files directly, this directory is usually $HOME/Mail or $HOME/mail. If you use some other software that reads these files as mail spool files (like Emacs Gnus), then this directory is typically ~/Mail/spool or similar.

MYXLOOP = Used to prevent re-sending messages that have already been handled. Typically $LOGNAME@$HOST, but this can be any user-chosen string. Make it unique to your address. In this document the definition is:

      MYXLOOP = "X-Loop: $LOGNAME@$HOST"
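
The idea behind the loop check can be sketched in plain shell. This is only an illustration under assumptions: the user name, host and message header below are hypothetical, and grep stands in for the negated header condition that a real recipe would use.

```shell
# Hypothetical identity; procmail sets LOGNAME and HOST itself.
LOGNAME="jdoe"
HOST="example.org"
MYXLOOP="X-Loop: $LOGNAME@$HOST"

# Made-up message header that has already passed through our filter.
header='From: someone@example.com
Subject: test
X-Loop: jdoe@example.org'

# -F: fixed string, -x: match the whole line, -q: no output.
if printf '%s\n' "$header" | grep -Fxq "$MYXLOOP"; then
    action="skip"       # already handled, do not re-send
else
    action="forward"    # safe to add $MYXLOOP and send
fi

echo "$action"
```

Because the sample header already carries the X-Loop line, the sketch decides to skip the message instead of sending it again.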

SENDMAIL = Program to deliver composed mail. Usually the standard Unix sendmail(1), but it needs some switches. See the man page for more. We use the following definition in scripts:

      SENDMAIL = "sendmail -oi -t"

NICE = In a Unix environment you can lower the scheduling priority with nice(1). If you are conscious of how many external processes you launch for each piece of mail it would be polite to lower the priority of such processes. You may see in this document that external processes are called with NICE enabled:

      :0 w                # Same as "nice -10 script.pl"
| $NICE script.pl
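
The definition of NICE itself is not shown above. A minimal shell sketch of the idea follows, assuming the POSIX spelling "nice -n 10" for the same increment that the comment writes as "nice -10". The sh -c command is only a hypothetical stand-in for script.pl. In a procmailrc the assignment would be written with spaces around the equals sign, as elsewhere in this document.

```shell
# Assumed definition: lower the scheduling priority by 10.
NICE="nice -n 10"

# Hypothetical stand-in for "script.pl": any external command works.
# $NICE is left unquoted so that it splits into command and options.
result=$($NICE sh -c 'echo "filter ran"')

echo "$result"
```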

IS functions = Functions to test file or directory attributes. E.g. IS_EXIST is defined as "test -e" and so on. The definitions of the IS functions are system-dependent. E.g. on Irix the "-e" option is not recognized and the nearest equivalent is "test -r". All IS functions are defined in the pm-javar.rc module.
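
The actual definitions live in pm-javar.rc and are not reproduced here. The following shell sketch only illustrates the portability problem: it probes whether "test -e" works and falls back to "test -r" otherwise. The probe against / and the path /no/such/file are assumptions chosen for the example.

```shell
# Probe: does this system's test(1) understand -e?
# "/" always exists, so a failure means -e is unsupported.
if test -e / 2>/dev/null; then
    IS_EXIST="test -e"
else
    IS_EXIST="test -r"      # nearest equivalent, e.g. on Irix
fi

# Use the chosen test; the variable is left unquoted on purpose
# so that it splits into the command and its option.
if $IS_EXIST /; then
    root_found="yes"
else
    root_found="no"
fi

# Assumed not to exist on the machine running the sketch.
if $IS_EXIST /no/such/file; then
    missing_found="yes"
else
    missing_found="no"
fi

echo "root=$root_found missing=$missing_found"
```

Note that "test -r" is only an approximation: it checks readability, so an existing but unreadable file would be reported as missing.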

1.8 About "useless use of cat award"

FIXME: Replace wc -l and use other example.

Randal Schwartz, a well-known Perl programmer and Perl book author, started giving awards for the "useless use of cat" whenever someone wrote examples without the "<" token. Like this:

      % cat file.name.this | wc -l

Instead he writes that the call should have been written like this, which saves the pipe (never mind that wc can read the file directly; this is an example).

      % wc -l < file.name.this

[Paul David Fardy pdf@morgan.ucs.mun.ca] There is weight in the pipeline, but the true cost is in process startup. Try running wc 100 times on /etc/motd or on this message. My tests show that the useless use of cat doubles the running time (real, user, and system time are each roughly doubled):

      $ cat > /tmp/randall <<'EOF'
      [[ -n $COUNT ]] || COUNT=100
      typeset -i i=1
      while (( i < $COUNT )); do
          < /etc/motd wc;
          (( i = i + 1 ))
      done > /dev/null
      EOF

      $ cat > /tmp/useless <<'EOF'
      [[ -n $COUNT ]] || COUNT=100
      typeset -i i=1
      while (( i < $COUNT )); do
          cat /etc/motd | wc;
          (( i = i + 1 ))
      done > /dev/null
      EOF

      $ set -x
      $ export COUNT=100
      $ time ksh /tmp/randall
      $ time ksh /tmp/useless

This becomes important, for example, when you decide to filter all your mail with procmail, say to look for virus signatures. I might well decide to look only at the first 3 or 4 kilobytes. It's not the size of messages (most are small anyway) but the number of messages that causes a problem. Do you want to double the processing cost of all our mail? I'm looking at a system-wide filter for all my users' mail. I'm considering Sendmail's mail filter versus procmail filtering. I'll likely be using a bit of both. And given that all of the filtering is really just getting in the way of legitimate traffic, it'd really piss me off if I naively doubled the cost.


2.0 Procmail pointers

2.1 Where is procmail developed

Philip Guenther guenther@gac.edu is currently taking care of and coordinating procmail bug fixes. Please send any procmail bugs to the mailing list or to bug@procmail.org. The development mailing list is running SmartList at procmail-dev@procmail.org. Newest Procmail code:

      http://www.procmail.org/
ftp://ftp.procmail.org/

Manual pages

      http://www.voicenet.com/~dfma/intro.html

2.2 Procmail resources

Procmail is discussed in the Usenet newsgroup comp.mail.misc.

Procmail archive
ftp://ftp.informatik.rwth-aachen.de:/pub/packages/procmail/ Articles from the procmail mailing list, covering 1994-08 to 1995-05 (a .gz file, ~2 Meg when uncompressed). Later articles can be found at <http://mailman.rwth-aachen.de/pipermail/procmail/>. A search page is at http://www.rosat.mpe-garching.mpg.de/mailing-lists/procmail/

Nancy McGough nm@noadsplease.ii.com - Procmail Quick Start
http://www.ii.com/internet/robots/procmail/qs/
http://www.ii.com/internet/faqs/launchers/mail/filtering-faq/

Era's Procmail FAQ and link collection
http://www.iki.fi/~era/procmail

Professor Timo Salmi's Procmail page
http://www.uwasa.fi/~ts/info/proctips.html See Timo's "Foiling Spam with an Email Password System" at http://www.uwasa.fi/~ts/info/spamfoil.html

Joe G