Author: Jari Aalto.
1.1 General
$Id: pm-tips.txt,v 2.28 2004/10/06 13:55:39 jaalto Exp $
$URL: http://pm-doc.sourceforge.net/ $
$UrlLinksLastChecked: 2002-07-11 $
This is a Procmail Tips page: a collection of procmail recipes,
instructions and howtos. The document also contains URL pointers to
the procmail mailing list and to sites that fight Internet UBE.
Procmail is a powerful mail handling tool, and a lot of space here
is devoted to discussing UBE (aka Spam) and its essence. You will
also find many other interesting subjects that concern Internet
mail in general: mail headers, MIME and RFCs. Another part of this
document is dedicated to Emacs and the Emacs plug-in package
Gnus.el, simply because Emacs is the best tool you can use to deal
with your mail and news reading. Nowadays Emacs is also available
on the Windows platform. This is not to say that the existing Unix
mail/news programs elm(1), mutt(1), pine(1) or slrn(1) are bad;
they are just limited in power compared to Emacs and usually tied
to the Unix platform. Finally, to your blessing or curse (smile),
the author happens to know Emacs quite well. The tips are compiled
from the procmail discussion list, from comp.mail.misc and from the
author's own experiences with procmail.
This document does not intend to teach you the basics of procmail;
you should already be familiar with the procmail man pages. The
procmail manual pages exist primarily on Unix/Linux platforms. If
you are using the Windows operating system, see Cygwin at
http://www.cygwin.com/
You may want to read Nancy's and Era's procmail FAQ pages
before this page. There is a wealth of useful procmail links and
pointers to Unix programs that deal with mail. If you find
errors or things to improve in this document, please send mail
to this document's Maintainer.
If any mentioned URL is no longer alive, you may still be able to
find it with a WWW search engine such as
http://www.google.com/
1.2 What is Procmail?
[FAQ] Procmail is a mail processing utility, which can help you
filter your mail, sort incoming mail according to sender, Subject
line, length of message, keywords in the message, etc, implement an
ftp-by-mail server, and much more. Procmail is also a complete
drop-in replacement for your MDA. (If this doesn't mean anything to
you, you may not want to know.) Procmail runs under Unix. See
Infinite Ink's Mail Filtering and Robots page for information about
related utilities for various other platforms, and competing Unix
programs, too (there aren't that many of either).
1.3 Abbreviations and thanks
People and documents, abbreviations referred to, tokens used, are
in no particular order.
[stephen] Stephen R. van den Berg, author of Procmail. Last heard
from on the procmail mailing list in 1997-08, using the address
srb@cuci.nl. Later, in 1998, due to his regular work activities and
lack of time, he nominated Philip Guenther to head Procmail
development.
[aaron] Aaron Schrab aaron+procmail A T schrab com
[alan] Alan K. Stebbens alan.stebbens A T openwave com
[dan] Daniel Smith J.Daniel.Smith A T WriteMe dt com
[david] David W. Tamkin dattier A T panix com
[ed] Edward J. Sabol sabol A T alderaan gsfc nasa gov
[elijah] Eli the Bearded process A T qz little-neck ny us
[hal] Hal Wine hal A T dtor com
[jari] Jari Aalto jari aalto A T poboxes dt com
[philip] Philip Guenther guenther A T gac edu
[richard] Richard Kabel rkabel A T sequent com
[sean] Sean B. Straw PSE-L A T mail professional org
[timothy] Timothy J Luoma luomat+procmail A T luomat peak org
[walter] Walter Dnes waltdnes A T interlog com
[FAQ] Procmail FAQ era A T iki.fi
[manual] Quote from some procmail manual page
[maintainer] As of 2000-09 the maintainer is [jari]
#broken-link Link does not exist any more. A replacement is needed
A big thank you goes to all these people:
- 1999-06-16 Mark Seiden mis@seiden.com did an enormous amount of
work proofreading v1.74. He sent a massive 105k patch with many
editorial corrections. My wholehearted thanks to you, Mark.
- 1999-01-08 Steven Alexander stevena@teleport.com thought that
a small perl script would help me to fix spelling mistakes more
easily. The script has been a much better correction program than
I am myself. Thank you. (Being a perl programmer myself, I
should have thought of this already, smile)
- 1999 Guido.Van.Hoecke@se.bel.alcatel.be took 1.48 and sent a huge
55k patch to correct many English language typos. Thank you
very much Guido.
- 1998-10-28 Richard Kabel rkabel@sequent.com sent a massive patch
to correct language and provided excellent improvement comments.
Thank you, Richard, for spending the time with it.
- 1998 Era Eriksson proofread v1.12 and sent numerous
corrections.
- Karl E. Vogel vogelke@c17mis.region2.wpafb.af.mil sent
numerous new anti-spam links to be added to the document.
- 1998 John Gianni jjg@cadence.com sent some nice recipes:
one is now in the procmail module list and the other ideas
I have added to this tips file.
- 1998 Tim Potter tpot@zip.com.au had a spare moment with v1.27 and
sent a lot of spelling corrections. Thank you.
1.4 Version information
Here is the version and file size log of the text file, which gives
you some idea of how the document has evolved.
    Version Date       Size(k) Changes
    v2.27   2004-10-10 516     Spam related things removed.
    v2.16   2002-08-31 596     Removed old UBE pointers.
    v2.13   2002-08-13 596     Removed old UBE pointers.
    v2.5    2002-02-01 608     Spelling checked with Emacs ispell
    v2.2    2002-01-28 608     URL links checked and updated
    v2.0    2001-08-09 608     http://pm-doc.sourceforge.net opened.
    v1.77   1999-12-27 603     Netscape spam filters added
    v1.76   1999-10-01 602     Mark Seiden's patch applied. Now under CVS.
    v1.74   1999-04-26 599     document moved to www.procmail.org
    v1.72   1999-04-21 597     Links corrected
    v1.71   1999-03-29 597     Ricochet -- Perl script to fight UBE
    v1.70   1999-02-26 592     procmail's Y2K compliance
    v1.69   1999-02-23 590     RFC and using MIME in Usenet postings
    v1.68   1998-01-29 587     Added "Lua" language pointer
    v1.67   1998-01-07 579     Eli's procmail recipes in module section
    v1.66   1998-12-14 578     Philip took care of bugs/patches listing
    v1.64   1998-11-26 602     More Richard's comments integrated
    v1.63   1998-10-30 595     Richard's english correction patch
    v1.60   1998-10-21 591     UMASK, .forward if procmail already is LDA
    v1.58   1998-10-12 583     SmartList and other MLM software discussed
    v1.57   1998-10-06 575     PLUS addr. Convert HTML body to text
    v1.55   1998-08-29 565     Fetching fields with formail -x
    v1.53   1998-08-24 554     Procmail doesn't pass 8bit characters
    v1.52   1998-08-24 553     Flag c forking study, procmail wish list
    v1.51   1998-08-18 541     Small changes. MIME notes
    v1.49   1998-08-10 529     Guido.Van.Hoeck's 55k patch applied
    v1.46   1998-06-24 526     Added live urls to procmail archive
    v1.45   1998-06-23 521     All recipes checked by eye. Many fixes.
    v1.44   1998-06-19 516     Detecting mailing lists with pm-jalist.rc
    v1.41   1998-06-17 510     How to disable recipe quickly with
    v1.36   1998-04-03 493     Includerc rewritten, plus addressing
    v1.34   1998-04-02 488     ORing and supreme scoring added
    v1.32   1998-03-23 471     All recipes checked (by eye)
    v1.31   1998-03-10 469     Better ordering: ORing rules discussed
    v1.29   1998-01-30 429     "regexp" section rewrite.
    v1.24   1997-12-30 415     up till 1996-12 is now included
    v1.17   1997-12-09 343     up till archive 1996-07 now included
    v1.14   1997-11-25 260
    v1.13   1997-11-08 218     Era's correction suggestions.
    v1.10   1997-10-13 181     archive file 1995-10's tips included
    v1.9    1997-10-11 142
    v1.8    1997-10-01 127
    v1.6    1997-09-18 94
    v1.5    1997-09-16 76
    v1.05   1997-09-14 53
    v1.01   1997-09-13 46
1.5 Document layout and maintenance
In order to be able to maintain this documentation on every
possible platform, the base version of this document is kept in
text format, which is easily accessible and requires no special
editors or learning a markup language like LaTeX, Texinfo, or Linux
DocBook SGML. Granted, some other base format might be more
suitable for multiple presentation output formats (like PostScript
or Emacs info), but in today's world simple TEXT and generated HTML
hopefully suffice for all needs. Also, Perl and Emacs are
cross-platform tools (Windows, Unix ...) and easily installed, so
getting to work is hopefully no obstacle. The tools to help
maintaining this document include (not required!):
The text version of this file was converted into HTML with the
following command. You need Perl interpreter 5.4 or newer to call
the t2html.pl script. The --Out option generates the file
pm-tips.html in the current directory. Please also familiarize
yourself with GNU RCS ident(1), if you have it available. It is
important that you mark interesting text for these tools so that
someone can get an overview of your supplied files.
1.5.1 Sending improvements
Because I'm not a native English speaker, I regret any typos in the
document. If you have 5-10 minutes to spare and find a spelling
mistake or a misuse of English verbs, please go ahead and send a
patch to the maintainer of this page. The preferred way to send
corrections to this document is as diff(1) output. The diff option
-u is not available in every diff implementation (GNU diff has it);
please send the -u diff if possible. If you don't have the -u
option, use the -c option. Here's how to make corrections and send
them on:
    % cp pm-tips.txt pm-tips.txt.orig
    ... load the pm-tips.txt to your text editor
    ... edit the file and save
    ... Generate the difference (a patch(1) compatible file)
    % diff -bwu pm-tips.txt.orig pm-tips.txt > pm-tips.txt.patch
    ... Send content of pm-tips.txt.patch by mail to document maintainer.
1.6 About presented recipes
The recipes presented here are collected from the net and procmail
archives. The recipes have been kept as close to the originals as
possible, but the ideas have been generalized when necessary. If
some recipe doesn't work as announced, please a) send a note to
[maintainer] and b) send mail to the procmail mailing list and ask
how to correct it. Sometimes a simple dot (.) has been used in
regular expressions where the right, pedantic way would have been
to use an escaped dot. If you want to be very strict, you should
use the escaped dot where applicable.
    # free hand version      # pedantic version
    :0                       :0
    * match.this.site        * match\.this\.site
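The difference between the two styles can be tried outside procmail.
The following is a minimal shell sketch, assuming a grep that
supports -E and -i. The helper name "matches" and the sample strings
are made up for illustration, and grep only approximates procmail's
egrep-style, case-insensitive matching.

```shell
# matches PATTERN TEXT
# Exit status 0 if TEXT matches the extended regexp PATTERN, ignoring
# case. The function name is hypothetical, not part of procmail.
matches() {
    printf '%s\n' "$2" | grep -i -E -e "$1" > /dev/null
}

# Free hand version: the unescaped dot matches any character.
matches 'match.this.site' 'match.this.site' && echo "loose:  real name matches"
matches 'match.this.site' 'matchXthisXsite' && echo "loose:  look-alike matches too"

# Pedantic version: the escaped dot matches only a literal dot.
matches 'match\.this\.site' 'match.this.site' && echo "strict: real name matches"
matches 'match\.this\.site' 'matchXthisXsite' || echo "strict: look-alike rejected"
```

In practice the free hand version rarely causes trouble, which is why
it is tolerated in this document; the escaped form simply rules out
accidental matches.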
Procmail also accepts assignments without quotes, like this:
    var = value
    num = 1
    dir = /var/mail
But in this document a strict style has been adopted, where literal
strings are assigned with double quotes, as in var = "value".
That's because the procmail code checker (Emacs package
tinyprocmail.el) then won't warn about a missing dollar sign, which
might very well have been forgotten. The Emacs package font-lock.el,
a syntax highlighting assistant, also displays double-quoted strings
in color.
    # If you do this...
    var = value
    # then you might have made a typo. It is in fact not clear
    # what was intended:
    var = "value"       # Did you mean: literal assignment?
    var = $value        # Did you mean: variable assignment?
Recipe flags are also not stuck together, because the visual
distinction between :0 and the flags is a valuable one. The
reasoning for which flags are kept together, and in which order, is
explained later in detail.
    # Erm, all stuck         # This may be visually more clear
    :0ABDc:                  :0 A BD c:
1.7 Variables used in recipes
These are part of the procmail module pm-javar.rc and are used in
recipes.
    # Pure newline; typical usage if you want to write
    # Something directly to procmail's active logfile:
    #
    #   LOG = "$NL message $NL"
    NL = "
"
Refer to the "improving Space-Tab syndrome" section for more details.
    WSPC = " 	"             # whitespace: space + tab
    SPC  = "[$WSPC]"        # Regexp: space + tab
    SPCL = "($SPC|$)"       # whitespace + linefeed: spc/tab/nl
    NSPC = "[^$WSPC]"       # negation
    s = $SPC                # shortname: like perl -- \s
    d = "[0-9]"             # A digit -- Perl \d
    w = "[0-9a-z_A-Z]"      # A word character -- Perl \w
    W = "[^0-9a-z_A-Z]"     # A non-word character -- Perl \W
    a = "[a-zA-Z]"          # An alphabetic character
Writing recipes is now a little easier and may look clearer, at
least to people who are accustomed to reading Perl regular
expression short names:
    # Matches "Header-Name: 11 12"
    :0
    *$ Header-Name:$s+$d+$s+$d
    {
        # Matched "whitespace" + "digit" + "whitespace" + "digit"
        # Do something
    }
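To see how the short names expand, the same pattern can be built
from shell variables and tried with grep. This is only an
illustrative sketch: the shell variables s and d mirror the procmail
variables above, the helper name "header_matches" is made up, and
grep -E stands in for procmail's own regexp engine.

```shell
# Shell stand-ins for the procmail variables (illustration only).
tab=$(printf '\t')
s="[ ${tab}]"      # like $s: one space or one tab
d='[0-9]'          # like $d: one digit

# Same shape as the condition "Header-Name:$s+$d+$s+$d".
pattern="Header-Name:${s}+${d}+${s}+${d}"

# header_matches LINE -> exit status 0 if LINE matches the pattern.
header_matches() {
    printf '%s\n' "$1" | grep -E -e "$pattern" > /dev/null
}

header_matches 'Header-Name: 11 12' && echo "matched"
header_matches 'Header-Name: 11'    || echo "no match: second number missing"
```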
SUPREME = 9876543210. The highest score value, which causes
procmail to bail out. [david] Actually the maximum is 2147483647,
but 9876543210 is easier to remember and type, and will function
just as well.
PMSRC = Procmail module source code directory. The location where
the *.rc files reside. Anywhere you want it to be. Usually $HOME/pm
or $HOME/procmail/lib. Here you can keep the procmail files, log
files and includerc scripts. Another commonly used synonym is PMDIR.
SPOOL = Directory where procmail delivers the categorized
messages. For example mailing lists:
    list.procmail, list.lynx-users, list.emacs, list.elm
and work mail:
    work.announcements, work.lab, work.doc, work.customer
and your private messages:
    mail.Usenet, mail.private, mail.default, mail.perl
and unimportant messages:
    junk.daemon, junk.cron, junk.ube
If you read the procmail-delivered files directly, this directory
is usually $HOME/Mail or $HOME/mail. If you use some other software
that reads these files as mail spool files (like Emacs Gnus), then
this directory is typically ~/Mail/spool or similar.
MYXLOOP = Used to prevent re-sending messages that have already
been handled. Typically $LOGNAME@$HOST, but this can be any string
the user chooses. Make it unique to your address. In this document
the definition is:
    MYXLOOP = "X-Loop: $LOGNAME@$HOST"
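The idea behind the loop check can be sketched in plain shell. This
is an analogy, not a procmail recipe: the function name "has_xloop"
is hypothetical, and because LOGNAME and HOST may be unset in an
ordinary shell, placeholder defaults are assumed.

```shell
# Assumption: fall back to placeholder values when LOGNAME or HOST is
# not set in this shell.
MYXLOOP="X-Loop: ${LOGNAME:-user}@${HOST:-localhost}"

# has_xloop: read a message on stdin; exit status 0 if its header
# (everything up to the first empty line) contains our X-Loop line.
has_xloop() {
    sed '/^$/q' | grep -F -x -e "$MYXLOOP" > /dev/null
}

# A message that has already been through the filter once.
if printf 'From: someone@example.com\n%s\n\nbody text\n' "$MYXLOOP" | has_xloop
then
    echo "already handled, do not re-send"
fi
```

Only the header is inspected, so the same string appearing in a
message body does not count as a loop.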
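SENDMAIL = Program to deliver composed mail. Usually the standard
Unix sendmail(1), but it needs some switches: -oi stops a line
containing a single dot from ending the message, and -t reads the
recipients from the message headers. See the man page for more. We
use the following definition in scripts:
    SENDMAIL = "sendmail -oi -t"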
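Because -t takes the recipients from the headers, a message handed
to $SENDMAIL must carry its own To: line. The sketch below shows one
way to build such a message in shell; the function name "compose"
and the example address are made up, and the example prints the
message rather than sending it.

```shell
# Assumption: sendmail(1) is on PATH when the message is really sent.
SENDMAIL="sendmail -oi -t"

# compose TO SUBJECT
# Write a minimal message to stdout: headers, a blank line, then the
# body read from stdin. The function name is hypothetical.
compose() {
    printf 'To: %s\nSubject: %s\n\n' "$1" "$2"
    cat
}

# Print the message instead of sending it:
echo "Hello from a script" | compose "user@example.com" "test message"

# To send it for real, pipe the same output to $SENDMAIL (unquoted, so
# that the switches are split into separate arguments):
#   echo "Hello" | compose "user@example.com" "test" | $SENDMAIL
```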
NICE = In a Unix environment you can lower the scheduling
priority with nice(1). If you are conscious of how many external
processes you launch for each piece of mail, it would be polite to
lower the priority of such processes. You may see in this document
that external processes are called with NICE enabled:
    :0 w        # Same as "nice -10 script.pl"
    | $NICE script.pl
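How such a variable expands into a command prefix can be tried in
plain shell. This is a sketch under assumptions: it uses the POSIX
spelling "nice -n 10" in place of the older "nice -10" form, and a
trivial sh -c command stands in for script.pl.

```shell
# Assumed definition; the older spelling "nice -10" means the same.
NICE="nice -n 10"

# $NICE is left unquoted on purpose, so that it splits into the
# command "nice" and its arguments before the real command follows.
$NICE sh -c 'echo "ran at lower priority"'
```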
IS functions = Functions to test file or directory attributes.
E.g. IS_EXIST is defined as "test -e" and so on. The definitions of
the IS functions are system-dependent. E.g. on Irix the "-e" option
is not recognized and the nearest equivalent is "test -r". All IS
functions are defined in the pm-javar.rc module.
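A system-dependent definition of this kind can be chosen with a
small probe. The following shell sketch is only an illustration of
the idea and is not taken from pm-javar.rc: it tries "test -e" on a
path that always exists and falls back to "test -r" if that fails.

```shell
# Probe once whether this system's test(1) understands -e.
# "/" always exists, so a failure here means -e is not supported.
if test -e / 2>/dev/null; then
    IS_EXIST="test -e"
else
    IS_EXIST="test -r"      # nearest equivalent where -e is missing
fi

# Usage: $IS_EXIST is left unquoted so that it splits into words.
if $IS_EXIST /etc; then
    echo "/etc exists"
fi
```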
1.8 About "useless use of cat award"
FIXME: Replace wc -l and use other example.
Randal Schwartz, a well-known Perl programmer and Perl book writer,
started giving awards for the "useless use of cat command"
whenever someone wrote examples without the token "<". Like this:
    % cat file.name.this | wc -l
Instead, he writes, the call should have been written like this,
which saves the pipe (never mind that wc can read the file
directly; this is an example):
    % wc -l < file.name.this
[Paul David Fardy pdf@morgan.ucs.mun.ca] There is weight
in the pipeline, but the true cost is in process startup. Try
running wc 100 times on /etc/motd or on this message. My tests show
the useless use of cat doubles the real and processing time (real,
user, and system time are each roughly doubled):
    $ cat > /tmp/randall <<'EOF'
    [[ -n $COUNT ]] || COUNT=100
    typeset -i i=1
    while (( i < $COUNT )); do
        < /etc/motd wc; (( i = i + 1 ))
    done > /dev/null
    EOF
    $ cat > /tmp/useless <<'EOF'
    [[ -n $COUNT ]] || COUNT=100
    typeset -i i=1
    while (( i < $COUNT )); do
        cat /etc/motd | wc; (( i = i + 1 ))
    done > /dev/null
    EOF
    $ set -x
    $ export COUNT=100
    $ time ksh /tmp/randall
    $ time ksh /tmp/useless
This becomes important, for example, when you decide to filter all
your mail with procmail--looking for virus signatures for example.
I might well decide to look only at the first 3 or 4 kilobytes.
It's not the size of messages--most are small anyway--but the
number of messages that cause a problem. Do you want to double the
processing cost of all our mail? I'm looking at a system-wide
filter for all my users' mail. I'm considering Sendmail's mail
filter versus procmail filtering. I'll likely be using a bit of
both. And given that all of the filtering is really just getting in
the way of legitimate traffic, it'd really piss me off if I naively
doubled the cost.