E-Mail and Spam Processing
This document describes the processing of incoming e-mail messages, ``virus'' detection, server side routing of messages to user-defined IMAP folders, and spam processing.
Anti-virus and Phishing
Incoming messages are first checked for ``viruses'' and ``phishing'' attempts. Messages containing either are normally dropped without further processing. It is possible to bypass this check for certain users, but doing so is probably not a good idea.
Mail Folders and Spam Processing
By default, all incoming messages that don't contain viruses or worms are placed in the user's INBOX, unless they are identified as spam, in which case they are put in the ``spam'' folder.
Spam Folders
The screen shot below is my ``Thunderbird'' mailer showing several IMAP accounts on the left side, and you can see the spam folder with three subfolders under it, falsepositive, missed, and whitelist.
Any messages identified by the system as spam will go into the top spam folder. Messages may be copied or dropped into one of the other three spam subfolders where the system will process them within 15 minutes to update your spam filters or to ``whitelist'' the message so that mail from that sender will not be scored as spam in the future. The most useful of these are the ``missed'' and ``whitelist'' folders.
You should check the spam folder periodically for false positives, that is, messages that were scored as spam but aren't. False positives may be copied or dropped into either the ``falsepositive'' or the ``whitelist'' folder.
falsepositive
Messages in this folder will be used to lower the scores for similar messages which will decrease the probability of their being scored as spam. It's usually better to use the ``whitelist'' folder as that is effective almost immediately while scoring takes more time to accumulate data.
missed
When spam messages are found in the INBOX or other folders, you can drop them in this folder to have the system learn that similar messages are spam. It may require some time for the system to learn your preferences.
whitelist
Messages in this folder are scanned, and future messages from the same sender are assigned a high negative spam score to minimize the possibility that their messages will be scored as spam.
Mail Routing to Folders
Users may have messages routed to selected IMAP folders based on patterns in the message headers.
Incoming messages may be sent to folders other than the default INBOX. This is controlled by a file in the user's $HOME directory, ~/Maildir.rules. The format of this file is quite simple: it is a plain text file with one rule per line, each specifying the header type, a pattern, and the folder in which to put the message. Here's a sample of mine.
    # This file is very similar to the tcp_wrappers /etc/hosts.allow
    # file, and is used to map mail headers to Maildir mailboxes.
    # Each line consists of three parts, a header list, pattern, and
    # mailbox name in the format:
    # Header[,header] : pattern : DROP | mailboxname | emailaddr.
    #
    # Any whitespace around the first and last ``:'' characters will
    # be dropped.
    #
    # The DROP mailbox name is used to indicate mail that will be
    # dropped, and won't go into any mailbox.  The mailbox names are
    # used to specify Maildir/ style mailboxes.  Any ``/'' characters
    # in the mailbox name will be replaced by ``.'' for compatibility
    # with courier-imap.
    #
    # custom header from PloneFormGen, usually from Contact Us
    X-FormGen,X-PloneFormGen: . : vendors
    # extraneous message from our accounting software
    Subject: AP.POSTING.PERIOD : DROP
    # Security update for the SNORT package
    Sender: noreply@snort.org : security
    Subject: Logwatch : security
    # many mailing list programs identify lists with ``List-Id''
    List-Id: lists.apple.com : bulk
    From: reply.myfamilyinc.com : vendors
    # This has two header types separated by ``,'' in the first
    # field.
    From,Sender: vrmmail@vrm.ca.ibm.com : vendors
    From,Sender: vrmmail@us.ibm.com : vendors
    From,Sender: billing@ebay.com : vendors
    Subject: Postfix.log.summ : postmaster
    Subject: cssecscan.errs : security
    Subject: BlueSecurity : general
    To: checkservices : alert
    Subject: missing.checkins : alert
    Subject: checkServices|chkhosts.pl : alert
    # Final rule to send things to the general folder
    Received: . : general
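To make the rule format concrete, here is a minimal Python sketch of how a file like this could be parsed and applied. It is not the delivery program's actual code. It assumes that patterns are regular expressions searched case-insensitively within the header value and that the first matching rule wins; neither is stated above. The names parse_rules and route, and the bulk/apple mailbox, are hypothetical.

```python
import re
from email import message_from_string


def parse_rules(text):
    """Parse rule lines into (header names, compiled pattern, target) tuples."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first and last ':' so the pattern itself may contain ':'.
        headers, rest = line.split(":", 1)
        pattern, target = rest.rsplit(":", 1)
        target = target.strip()
        if target != "DROP":
            # '/' in a mailbox name becomes '.' (courier-imap convention).
            target = target.replace("/", ".")
        names = [h.strip() for h in headers.split(",")]
        # Assumption: patterns are regular expressions, matched ignoring case.
        rules.append((names, re.compile(pattern.strip(), re.IGNORECASE), target))
    return rules


def route(msg, rules, default="INBOX"):
    """Return the target of the first rule matching any listed header."""
    for names, pattern, target in rules:
        for name in names:
            for value in msg.get_all(name, []):
                if pattern.search(value):
                    return target
    return default


SAMPLE_RULES = """\
# comment lines are ignored
Subject: AP.POSTING.PERIOD : DROP
From,Sender: billing@ebay.com : vendors
List-Id: lists.apple.com : bulk/apple
"""

rules = parse_rules(SAMPLE_RULES)
invoice = message_from_string(
    "From: billing@ebay.com\nSubject: Your invoice\n\nbody\n")
print(route(invoice, rules))  # vendors
```

A message that matches no rule falls through to the default folder, which is why the sample file ends with a catch-all rule on the Received header.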
Per-Folder Options
After selecting a folder, the system checks another file in the user's $HOME directory, ~/Maildir.conf, for delivery options that apply to that folder. These options control things like whether to check messages for spam (we don't for mail to the postmaster, security, or abuse folders, as those messages may well be spam reports). The following sample uses most of the options, with comments. I will go into more detail on the more generally useful options below.
    # Maildir.conf, Mailbox configuration file.
    #
    # The options here in the DEFAULT section apply to all folders,
    # and may be overridden in the folder sections.

    [DEFAULT]
    # set True for debugging output to /tmp/deliver.username
    debug: False
    # Set dupsok to allow duplicate messages
    dupsok: False
    # if dupsok isn't True, you can set this to specify another
    # mailbox for delivery (e.g. spam)
    # send duplicate messages to the spam box (INBOX.spam)
    # dupmailbox: spam
    # Don't accept messages with duplicate bodies.  This can be
    # useful to eliminate multiple copies of spam that isn't detected
    # by anti-spam software.
    nodupbody: True
    # set lastdupbody True to keep only the last copy of messages
    # with duplicate bodies.  This is useful for things like periodic
    # maintenance reports where it's useful to know when the last
    # message came in.  This overrides the nodupbody setting.
    lastdupbody: True
    # Set for shared folders.  If this is True, it needs to include a
    # path line with the *FULL PATH* to the directory, and must be
    # included for all sub-folders of that directory
    shared: False
    # This is really a system level parameter that determines the
    # hostname part of the file name.  Set it to False to use full
    # host names.
    shorthostname: True
    # The default path to the main (INBOX) Maildir
    # mailbox = ~/Maildir
    # This is the prefix for all subfolders.  Generally this is used
    # to allow shortcuts (e.g. spam instead of INBOX.spam)
    folderprefix = INBOX
    # Set ``archive'' to ``year'' or ``month'' in a folder to
    # automatically put messages in subfolders by year or year.month.
    # archive = ''
    # cronarchive is used to allow periodic cron jobs to archive
    # folders, and is the same as archive, (e.g. year or month)
    cronarchive =
    # Set ``taggedboxes'' to True to automatically put tagged mail messages
    # into separate boxes.  That is mail to user+tag@example.com will
    # go into the tag folder in their main mailbox.  This is for
    # processing by the deliver.post process.
    taggedboxes = True
    #
    # Check incoming mail using spamassassin's spamd daemon.
    # the sa_host and sa_port parameters may be used to change the
    # default host and port respectively.
    #
    sa_check = True
    sa_host = localhost
    sa_port = 783
    #
    # The sa_levels option sets cutoff levels for spam scores.  Mail
    # with Spamassassin scores greater than or equal to the score is
    # put in the folder.  Valid entries are score, folder pairs with no
    # commas or other special characters, and may be put one per line
    # with leading white space.
    # sa_levels: 7.00 DROP
    sa_levels: 14.00 DROP
    # sa_levels: 200.00 DROP
    # Set this to point to a file containing headers, patterns, and
    # folders for automatic mail sorting.  This should be overridden in any
    # mailboxes that don't want further looking at mailbox rules (e.g. things
    # like support, spam, security which may be hit with tagged addresses).
    mailrules: ~/Maildir.rules
    # These variables are used when forwarding messages via the mail
    # rules.
    # smtphost = localhost
    # smtpport = 25
    myorigin = celestial.com
    # Default retry time for mailbox reading in seconds
    xelmretry: 60
    # The next two options are the names of folders used to drop
    # messages that are either spam that was missed by the filters or
    # non-spam messages that were identified as spam.
    junkfolder: spam.missed
    falsepositive: spam.falsepositive
    # spamassassin no ***SPAM*** prefix
    sa_subject_prefix:
    # These map folder names which is useful for things like mapping tagged
    # addresses to real folders (e.g. user+pp@example.com -> vendor.paypal
    # folder).
    #

    [foldermaps]
    inbox = general
    # clamav tags
    virus = postmaster
    banned = postmaster
    header = postmaster
    # This maps folders to shared mailboxes.  The shared Maildir
    # folders will have the prefix set to the key (e.g. the folder
    # name).
    #

    [maildirs]
    public: ~/MaildirShared
    spamtrap: ~/MaildirSpamtrap
    archivebill: ~/MaildirArchive
    incoming: ~/MaildirIncoming

    [INBOX]
    # This is the default INBOX
    dupsok = False
    # save only the most recent copy of duplicate message bodies
    lastdupbody = True

    [alert]
    sa_check: False
    cronarchive = month
    dupsok = True
    mboxpatterns: ''  # suppress additional rules checking
    nodupbody: False

    [bulk]
    # keep a year's traffic in the top bulk folder
    cronarchive: year
    xelmretry: 1800

    [junkfax]
    # keep a month's traffic in the top junkfax folder
    cronarchive: month
    # xelmretry: 1800

    [customer]
    cronarchive: year
    lastdupbody = True

    [fax]
    cronarchive: month
    # Fax notifications

    [general]
    cronarchive: year
    # This is really the default INBOX
    dupsok = False
    # save only the most recent copy of duplicate message bodies
    lastdupbody = True
    xelmretry: 600

    [postmaster]
    sa_check = False
    dupsok = True
    mboxpatterns: ''  # suppress additional rules checking
    nodupbody: False
    lastdupbody: False

    [support]
    #
    # We want to keep the most recent copy of messages with duplicate
    # bodies
    #
    lastdupbody: True
    # Don't check against mailbox rules
    mboxpatterns: ''
    nodupbody: False
    # Don't check support e-mail for spam as it may well be a spam
    # report.
    sa_check = False

    [security]
    cronarchive: month
    sa_check: False
    # archive = month
    dupsok = True
    mboxpatterns: ''  # suppress additional rules checking
    nodupbody: False

    [spam]
    sa_check: False
    # take all duplicates
    dupsok: True
    mboxpatterns: False
    nodupbody: False

    [sugm]
    # News and Seattle Unix Group
    xelmretry: 600

    [spam.missed]
    sa_check: False
    vacation_allow: False
    # take all duplicates
    dupsok: True
    mailrules: ''
    nodupbody: False

    [spam.falsepositive]
    sa_check: False
    vacation_allow: False
    # take all duplicates
    dupsok: True
    mailrules: ''
    nodupbody: False

    [vendors]
    sa_check: False
    vacation_allow: False
    # take all duplicates
    dupsok: True
    mailrules: ''
    nodupbody: False

    [Trash]
    purgetime = 1d
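The way folder sections inherit from [DEFAULT] can be illustrated with Python's standard configparser module, whose file format this sample resembles (a [DEFAULT] section, and both ``:'' and ``='' as separators). Whether the delivery program actually uses that module is an assumption, and the helper name folder_options is hypothetical. The sketch reads a cut-down version of the file and shows which values a folder ends up with.

```python
import configparser

# A cut-down version of the sample above.
SAMPLE_CONF = """\
[DEFAULT]
sa_check = True
dupsok: False
folderprefix = INBOX

[security]
sa_check: False
cronarchive: month

[bulk]
cronarchive: year
"""


def folder_options(text, folder):
    """Return the effective options for a folder as a plain dict.

    Values from [DEFAULT] apply unless the folder's own section
    overrides them; unknown folders get the defaults only.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = folder if parser.has_section(folder) else "DEFAULT"
    return dict(parser[section])


security = folder_options(SAMPLE_CONF, "security")
bulk = folder_options(SAMPLE_CONF, "bulk")
print(security["sa_check"], bulk["sa_check"])  # False True
```

Here the security folder overrides sa_check, while the bulk folder keeps the default and only adds its own cronarchive setting.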
cronarchive
This option controls automatic archiving of messages into subfolders. The usual options are year and month which create subfolders by year and month respectively. I use year for top level folders of general use such as my ``bulk'' folder which contains mailing list traffic, and my ``general'' folder which is miscellaneous e-mail that doesn't fit in other categories. I use ``month'' for things like my ``security'' folder which is fairly high volume (over 15,000 messages per month).
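As a rough illustration of what the two settings imply for folder names, the following Python sketch builds an archive subfolder name from a message date. The function name archive_folder is hypothetical. The ``.'' separator follows the courier-imap convention mentioned in the rules file comments, and the year.month layout follows the comment in the sample configuration, but the exact spelling (for example the zero-padded month) is an assumption.

```python
from datetime import date


def archive_folder(folder, mode, when):
    """Return the archive subfolder for a message dated `when`.

    mode is "year", "month", or anything else for no archiving.
    """
    if mode == "year":
        return f"{folder}.{when:%Y}"
    if mode == "month":
        # Assumption: months are nested under the year, zero-padded.
        return f"{folder}.{when:%Y}.{when:%m}"
    return folder


print(archive_folder("bulk", "year", date(2024, 5, 17)))       # bulk.2024
print(archive_folder("security", "month", date(2024, 5, 17)))  # security.2024.05
```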
taggedboxes
Setting this to ``True'' causes the system to recognize ``tagged'' e-mail addresses such as user+vendor@example.com, where ``user'' is the user name and ``vendor'' is the tag. A message sent to that address would be routed to the vendor folder.
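The tag lookup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name tag_folder is hypothetical, and the optional mapping stands in for the [foldermaps] section of the sample configuration, which maps a tag such as pp to a real folder such as vendor.paypal.

```python
def tag_folder(address, foldermaps=None):
    """Return the folder implied by a tagged address, or None if untagged."""
    # Everything before the last '@' is the local part.
    local = address.rsplit("@", 1)[0]
    user, sep, tag = local.partition("+")
    if not sep or not tag:
        return None  # no '+tag' present
    # Assumption: a [foldermaps]-style mapping may rename the tag.
    return (foldermaps or {}).get(tag, tag)


print(tag_folder("user+vendor@example.com"))                      # vendor
print(tag_folder("user+pp@example.com", {"pp": "vendor.paypal"}))  # vendor.paypal
print(tag_folder("user@example.com"))                              # None
```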
Spam Options
These options turn spam checking on and off for individual folders, and may be used to sort spam into different folders based on the spam score.
sa_check
This is either ``True'' or ``False''. Setting it to False turns off spam checking for the folder.
sa_levels
This sets one or more pairs of values, each a score and a folder. In the example above, ``sa_levels: 14.00 DROP'' causes all spam with a score of 14.00 or higher to be silently dropped. Something like the following would create additional spam folders. Spam with scores below 10.00 would go in the spam folder, from 10.00 up to 15.00 in the spam.high folder, from 15.00 up to 20.00 in spam.horrible, and anything scoring 20.00 or higher would be dropped.
    sa_levels: 10.00 spam.high
        15.00 spam.horrible
        20.00 DROP
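The cutoff logic described here can be sketched in Python as follows. This is not the system's own code: the function names parse_levels and spam_folder are hypothetical, and the sketch assumes the message has already been identified as spam, so that scores below the lowest cutoff land in the plain spam folder.

```python
def parse_levels(value):
    """Turn 'score folder score folder ...' into sorted (cutoff, folder) pairs."""
    tokens = value.split()  # also handles one pair per line
    if len(tokens) % 2:
        raise ValueError("sa_levels needs score/folder pairs")
    pairs = [(float(tokens[i]), tokens[i + 1]) for i in range(0, len(tokens), 2)]
    return sorted(pairs)


def spam_folder(score, levels, default="spam"):
    """Pick the folder for the highest cutoff that the score reaches."""
    folder = default
    for cutoff, name in levels:
        if score >= cutoff:  # "greater than or equal to the score"
            folder = name
    return folder


levels = parse_levels("10.00 spam.high\n    15.00 spam.horrible\n    20.00 DROP")
for score in (8.0, 12.3, 15.0, 25.0):
    print(score, spam_folder(score, levels))
# 8.0 spam
# 12.3 spam.high
# 15.0 spam.horrible
# 25.0 DROP
```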
sa_subject_prefix
This may be used to set a prefix that will be added to the Subject of messages identified as spam. Generally it's best to leave this empty as anything in the spam folder will be spam.