The originally published version can be found on the Usenix site at: http://www.usenix.org/publications/login/1999-10/features/archiving.html

The original version had a few minor typos that have been corrected below.

Email Archiving.


A few weeks ago a situation arose at work where I was asked to log a copy of ALL incoming/outgoing email. Management thought it would be good if we could provide an email archive that users could search through if the documents and memos in our electronic library were inadequate.

After searching the newsgroups I found a lot of people asking the same question, but no one had a satisfactory answer until I came across a posting from Robert Harker that listed a sendmail feature he called "copyuser"[1]. This appeared to be what I needed. After installing it and doing some testing, I found that it was missing a few things. Harker's version logged all external messages, but it did not log any messages that were local. So I made some modifications to log a copy of ALL local and external mail to an account called "copyuser."

I'm not claiming to be an expert on writing sendmail rulesets. But with the recent thread on "Email Monitoring" going across the "sage-members" mailing list, I thought I'd share my solution. There may be a more efficient way to do this, but this seems to work for me. Listing 1 shows my modified version of Harker's configuration.

Listing 1.
1 VERSIONID(`msgidruleset.m4')
2 VERSIONID(`Copyright 1998, Robert Harker, Harker Systems')
3 VERSIONID(`www.harker.com, [email protected], 408-295-6239')
4 VERSIONID(`Permission granted to user this as long as this')
5 VERSIONID(`VERSIONID information is preserved in the M4 macro file')
6 VERSIONID(`and any sendmail.cf files created using thie M4 macro file')
7 VERSIONID(`Modified to handle logging local mail as well as external mail')
8 VERSIONID(`Shane B. Milburn, [email protected], 04/19/1999')
9 ifdef(`_MAILER_smtp_',,
10 `errprint(`*** MAILER(smtp) must appear before copymail mailer')')dnl
11
12 LOCAL_CONFIG
13 CPNOCOPY
14
15 LOCAL_NET_CONFIG
16
17 LOCAL_RULE_0
18 R$+.NOCOPY. $#local @ $: $1
19 R$+<@$j.NOCOPY.> $#local @ $: $1
20 R$+<@$+.NOCOPY.> $#esmtp $@$2 $:$1<@$2.>
21 R$+<@$+.> $#copymail $@nohostneeded $:$1<@$2.NOCOPY>
22 R$+<@$+> $#copymail $@nohostneeded $:$1<@$2.NOCOPY>
23 R$+ $#copymail $@nohostneeded $:$1<@$j.NOCOPY>
24
25 MAILER_DEFINITIONS
26 # Copy a message by sending it back to sendmail with an
27 # additional address: copyuser
28 Mcopymail, P=/usr/lib/sendmail, F=mDFMuX, S=11/31, R=21, E=\r\n, L=990,
29 T=DNS/RFC822/X-Unix,
30 A=/usr/lib/sendmail -oi copyuser@$j.NOCOPY $u

Here's a description of what msgidruleset.m4 does. The first eight lines insert the text between the quotes into your sendmail.cf file as comments. Lines 25 through 30 define a new mailer named "copymail". This definition re-invokes sendmail with the original recipients ($u) and an additional recipient named 'copyuser@$j.NOCOPY'. While this does cause additional load on the server, in my case it was not enough of a load to cause concern.

Now let's look at the rest of the m4 file. Lines 9-10 print an error message if your site-config.mc file tries to declare "copymail" before "smtp". The next two lines (12-13) add the token NOCOPY to sendmail class P. LOCAL_NET_CONFIG (line 15) is the hook for rules that forward non-local network mail to SMART_HOST; Listing 1 adds no rules under it. The LOCAL_RULE_0 statement is used to introduce new parsing rules. This is where you place any custom delivery agents and parsing rules you have defined. In Listing 1, the custom parsing rules are lines 18 through 23. Lines 18-20 handle addresses that already carry the .NOCOPY tag: they strip the tag and deliver the message, either locally or via esmtp. Lines 21-23 catch every other address, append the tag, and hand the message to the copymail mailer, so each message passes through copymail only once.
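As a rough illustration of how the .NOCOPY tag keeps a message from being copied twice, the routing decision in lines 18 through 23 can be modeled in a few lines of shell. This is only a toy sketch, not sendmail's actual tokenizing or rewriting: it works on plain user@host strings, ignores focus brackets and trailing dots, and uses mail.example.com as a made-up stand-in for $j. The function name route is also just for illustration.

```shell
#!/bin/sh
# Toy model of the routing decision made by lines 18-23 of Listing 1.
# Assumptions: plain "user@host" strings, no focus brackets (<@...>),
# no trailing dots, and a hypothetical local host name.

LOCAL_HOST="mail.example.com"   # hypothetical stand-in for $j

route() {
    addr=$1
    case $addr in
        *@"$LOCAL_HOST".NOCOPY)
            # Like line 19: tagged local address, deliver to the user.
            printf 'local %s\n' "${addr%%@*}" ;;
        *@*.NOCOPY)
            # Like line 20: tagged remote address, strip the tag, use esmtp.
            hostpart=${addr#*@}
            printf 'esmtp %s %s\n' "${hostpart%.NOCOPY}" "${addr%.NOCOPY}" ;;
        *.NOCOPY)
            # Like line 18: tagged bare user name, deliver locally.
            printf 'local %s\n' "${addr%.NOCOPY}" ;;
        *@*)
            # Like lines 21-22: untagged address, tag it and send to copymail.
            printf 'copymail %s.NOCOPY\n' "$addr" ;;
        *)
            # Like line 23: untagged bare user, add local host and tag.
            printf 'copymail %s@%s.NOCOPY\n' "$addr" "$LOCAL_HOST" ;;
    esac
}
```

Calling "route mkephart" prints a copymail result with the tag appended, and feeding that tagged address back in prints a local delivery, which mirrors the two passes a message makes through sendmail.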

Installation:

To install this, place Listing 1 into a file called msgidruleset.m4 in the sendmail-8.9.3/cf/feature/ directory. Don't forget that there are tabs between the first and second entry on lines 18-23 (if you do, sendmail will remind you). Now add the following line to your site-config.mc file.

FEATURE(msgidruleset)
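
Because a missing tab on a rule line is easy to overlook, a small check along these lines can flag rule lines without one before m4 is run. It is only a sketch: the function name rules_missing_tabs is made up for illustration, and it assumes the file holds the rules as they would be installed, without the listing's reference line numbers, so that each rule line begins with R.

```shell
#!/bin/sh
# Print every rule line (one starting with "R") that contains no tab,
# prefixed with its line number in the file.
# Exit status: 0 if every rule line has a tab, 1 if any offender was found.
# rules_missing_tabs is a hypothetical helper name, not part of sendmail.
rules_missing_tabs() {
    awk '/^R/ && !/\t/ { printf "%d: %s\n", NR, $0; bad = 1 }
         END { exit (bad ? 1 : 0) }' "$1"
}
```

It could be run as "rules_missing_tabs msgidruleset.m4" from the cf/feature directory; no output means every rule line contains at least one tab.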

Here is what my site-config.mc file looks like.

VERSIONID(`@(#)mcst-config.mc Shane B. Milburn 04/21/1999')
VERSIONID(`@(#)This configuration logs ALL email to copyuser.')
OSTYPE(solaris2)
FEATURE(use_cw_file)dnl
FEATURE(relay_entire_domain)dnl
FEATURE(always_add_domain)
FEATURE(rbl)
MAILER(smtp)
MAILER(local)
FEATURE(msgidruleset)

After you create your site-config.mc file, use the m4 program to generate your sendmail.cf file. In /usr/local/src/sendmail-8.9.3/cf/cf you would use "m4 ../m4/cf.m4 site-config.mc > sendmail.cf". This creates a sendmail.cf in the cf directory. You can either move this file into /etc/mail/ or invoke sendmail with the "-C" option to test. (Note: If you use the -C option for testing, make sure you change line 30 in Listing 1 to "A=/usr/lib/sendmail -C/path/to/sendmail.cf copyuser@$j.NOCOPY $u". Otherwise, when you re-invoke sendmail it uses /etc/mail/sendmail.cf, which does not have the .NOCOPY parsing rules and will bounce your message.)

Testing:

To test this new configuration, invoke sendmail in address-test mode (-bt) from the command line. The first test takes the username "mkephart" and passes it to ruleset 3 and then 0. This causes the username to be rewritten to "[email protected]". Although it's not shown, line 30 in Listing 1 also causes a second address, "[email protected]", to be passed back to sendmail.

# /usr/lib/sendmail -C/usr/local/src/sendmail-8.9.3/cf/cf/sendmail.cf -bt
> 3,0 mkephart
rewrite: ruleset 3 input: mkephart
rewrite: ruleset 96 input: mkephart
rewrite: ruleset 96 returns: mkephart
rewrite: ruleset 3 returns: mkephart
rewrite: ruleset 0 input: mkephart
rewrite: ruleset 199 input: mkephart
rewrite: ruleset 199 returns: mkephart
rewrite: ruleset 98 input: mkephart
rewrite: ruleset 98 returns: $# copymail $@ nohostneeded $: mkephart < @ mcst . gsfc . nasa . gov . NOCOPY >
rewrite: ruleset 0 returns: $# copymail $@ nohostneeded $: mkephart < @ mcst . gsfc . nasa . gov . NOCOPY >

Continuing with the testing, let's pass "[email protected]" to rulesets 3,0. You can see that this time the address is eventually passed to ruleset 98, which contains lines 18-23 of Listing 1. Notice that the ".NOCOPY" tag gets stripped off and the email is delivered to mkephart. As before, the same thing happens for "[email protected]", which is delivered via mail.local to copyuser.

> 3,0 [email protected]
rewrite: ruleset 3 input: mkephart @ mcst . gsfc . nasa . gov . NOCOPY
rewrite: ruleset 96 input: mkephart < @ mcst . gsfc . nasa . gov . NOCOPY >
rewrite: ruleset 96 returns: mkephart < @ mcst . gsfc . nasa . gov . NOCOPY . >
rewrite: ruleset 3 returns: mkephart < @ mcst . gsfc . nasa . gov . NOCOPY . >
rewrite: ruleset 0 input: mkephart < @ mcst . gsfc . nasa . gov . NOCOPY . >
rewrite: ruleset 199 input: mkephart < @ mcst . gsfc . nasa . gov . NOCOPY . >
rewrite: ruleset 199 returns: mkephart < @ mcst . gsfc . nasa . gov . NOCOPY . >
rewrite: ruleset 98 input: mkephart < @ mcst . gsfc . nasa . gov . NOCOPY . >
rewrite: ruleset 98 returns: $# local @ $: mkephart
rewrite: ruleset 0 returns: $# local @ $: mkephart
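
When many addresses need checking, reading each transcript by eye gets tedious. The sketch below pulls the selected mailer out of a -bt transcript. It assumes the output format shown above, where ruleset 0 is printed by number and the mailer name follows the "$#" token; other sendmail versions may print something different. The name final_mailer is a made-up helper, not part of sendmail.

```shell
#!/bin/sh
# Read a "sendmail -bt" transcript on stdin and print the mailer chosen
# by the last "ruleset 0 returns:" line (the token right after "$#").
# Exits non-zero if no such line was seen.
# final_mailer is a hypothetical helper name.
final_mailer() {
    awk '/ruleset +0 +returns:/ {
             for (i = 1; i < NF; i++)
                 if ($i == "$#") mailer = $(i + 1)
         }
         END { if (mailer == "") exit 1; print mailer }'
}
```

Assuming sendmail reads its test commands from standard input, it could be used as: echo '3,0 mkephart' | /usr/lib/sendmail -C/path/to/sendmail.cf -bt | final_mailer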

Everything looks like it works, but let's try sending an email and make sure it actually works before we put this sendmail.cf file into production.

# /usr/lib/sendmail -C/path/to/sendmail.cf -t
To: [email protected]
Subject: testing logging ALL email.
This is a test message.
Did you receive a copy and is there a copy of this message in the "copyuser" mail folder?

-shane
.
#

Checking my netcom account, I see that I did receive the email. If I cat /var/mail/copyuser, it contains a copy of the message. Now that you're convinced it works, you can install your new sendmail.cf into /etc/mail/sendmail.cf and restart sendmail.

There is a catch to all of this logging. Depending on the amount of mail that goes through the system, /var/mail/copyuser can get quite large. Since I needed to make an archive that users would have to search through, it made sense to rotate the file daily. This allowed a user to grep a particular day's email rather than a week's or a month's worth of email. At the end of the week I tarred the files into a weekending.MMDDYYYY.tar file and wrote it to 8mm. Before you implement an email archive, make sure your company has a policy about privacy issues and who actually owns the email sent to/from your server.
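
The daily rotation and weekly bundling described above could look something like the following. This is a minimal sketch rather than the script I ran: the function names, the copyuser.MMDDYYYY daily file name, and the archive directory are all made up for illustration. It also does no mailbox locking, and it assumes the local delivery agent recreates the mailbox on the next delivery, so treat it as a starting point only.

```shell
#!/bin/sh
# Sketch of a daily rotation and a weekly tar of the copyuser mailbox.
# Assumptions: no locking is done, and the local delivery agent is
# expected to recreate the mailbox after it has been removed.

# rotate_copyuser MAILBOX ARCHIVE_DIR [MMDDYYYY]
# Append the mailbox to ARCHIVE_DIR/copyuser.MMDDYYYY, then remove it.
rotate_copyuser() {
    mailbox=$1
    archive_dir=$2
    stamp=${3:-$(date +%m%d%Y)}
    [ -s "$mailbox" ] || return 0            # missing or empty: nothing to do
    mkdir -p "$archive_dir" || return 1
    # Appending keeps earlier mail if the rotation runs twice in one day.
    cat "$mailbox" >> "$archive_dir/copyuser.$stamp" && rm -f "$mailbox"
}

# bundle_week ARCHIVE_DIR [MMDDYYYY]
# Tar the daily files into weekending.MMDDYYYY.tar and remove them.
bundle_week() {
    archive_dir=$1
    stamp=${2:-$(date +%m%d%Y)}
    (
        cd "$archive_dir" || exit 1
        set -- copyuser.*
        [ -e "$1" ] || exit 0                # no daily files to bundle
        tar cf "weekending.$stamp.tar" "$@" && rm -f "$@"
    )
}
```

From cron, the first function might be called nightly as "rotate_copyuser /var/mail/copyuser /var/mail/archive" and the second once a week, where /var/mail/archive is a hypothetical directory. A user could then search one day with something like: grep -i 'subject:' /var/mail/archive/copyuser.04191999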

Reference:

[1] http://www.harker.com/sendmail/copyuser.html

Other Resources:


Sendmail 2nd Edition, by Bryan Costales with Eric Allman, January
Newsgroup: comp.mail.sendmail

About me:

Shane is the Lead Systems Administrator for the Modis Characterization Support Team (MCST), a subtask on NASA's MODIS project. Shane can be reached at [email protected].
