SpamBurglar

SpamBurglar was written by Tom Ashley, Matt Hansen, Marie Joiner, Jen Knutson, and Josh Ourisman as our senior project for Carleton College's "Integrative Exercise" (a.k.a. Comps). The advisor for this project was Dave Musicant, Professor of Computer Science at Carleton College. The Bayesian filtering algorithm used in this system was derived from a paper written by Paul Graham, which can be found at http://www.paulgraham.com/spam.html.

How to use the program:

When you run SpamBurglar, a window pops up with three options: Proxy startup, Configuration, and About SpamBurglar. The contents of each window are as follows:

Proxy startup: Starts and stops the local proxy server. The proxy server must be started for the filter to operate. The proxy server must be stopped before the program is exited.

Configuration: Sets the user preferences the program needs in order to operate. The "Host" field indicates the mail server whose mail you wish to filter. The "Port" field indicates the port number for this server. The "Proxy Port" field indicates the port number for the local proxy server, which is responsible for the actual spam filtering.

The user must select the correct protocol used by their mail server, either POP or IMAP. The user must also determine the action the filter should take on messages identified as spam. The filter will either mark the message as spam by appending a notification to the subject line, or it will delete the message entirely.
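As a rough illustration of the two actions described above, the following Python sketch either tags or drops a message once it has been classified. The function name apply_spam_action, the action names "mark" and "delete", and the "[SPAM]" notification text are all assumptions made for this example; SpamBurglar's actual notification text and internals are not documented here.

```python
from email.message import EmailMessage
from typing import Optional

# Hypothetical marker; the real notification text is not specified in this document.
SPAM_TAG = "[SPAM]"


def apply_spam_action(msg: EmailMessage, is_spam: bool, action: str) -> Optional[EmailMessage]:
    """Return the message to deliver, or None if it should be deleted."""
    if not is_spam:
        return msg  # legitimate mail passes through untouched
    if action == "delete":
        return None  # caller drops the message entirely
    if action == "mark":
        subject = msg.get("Subject", "")
        del msg["Subject"]  # remove the old header so only one Subject remains
        msg["Subject"] = f"{subject} {SPAM_TAG}".strip()  # append the notification
        return msg
    raise ValueError(f"unknown action: {action!r}")
```

Marking is the safer choice while the filter is still being trained, since a misclassified message can still be recovered from the inbox.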

The user may specify the "threshold" probability for determining if a message is spam. The default is 90%.

The window also has an option to edit a "whitelist". The whitelist is a list of email addresses that will always be considered to be senders of nonspam. The user must enter the entire address: username@host. If a user wishes to add all users from a particular host, the syntax is *@hostname.

Finally, the user may choose to enable or disable "Training mode". When training mode is enabled, a window will appear at the end of the SpamBurglar session with a list of all new email messages and their spam classifications. The user has the opportunity to correct the results if necessary, and the program will use this new data to update the filter's table of spam probabilities. Training mode should always be enabled unless you are sure that the filter has been adequately trained on your email.
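The whitelist syntax and the threshold setting can be sketched in Python as follows. This is a minimal illustration, not SpamBurglar's own code: the function names are hypothetical, the case-insensitive address comparison and the strict "greater than" threshold test are assumptions, and the combining formula is the one given in Paul Graham's essay, which SpamBurglar's algorithm was derived from and may differ from in detail. Token probabilities are assumed to lie strictly between 0 and 1.

```python
from math import prod
from typing import Sequence

DEFAULT_THRESHOLD = 0.9  # the 90% default mentioned above


def is_whitelisted(sender: str, whitelist: Sequence[str]) -> bool:
    """Match a full address, or any user at a host via '*@hostname'."""
    sender = sender.strip().lower()  # assumption: matching ignores case
    host = sender.rpartition("@")[2]
    for entry in whitelist:
        entry = entry.strip().lower()
        if entry.startswith("*@"):
            if host == entry[2:]:
                return True
        elif sender == entry:
            return True
    return False


def combined_probability(token_probs: Sequence[float]) -> float:
    """Combine per-token spam probabilities as in Graham's essay."""
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


def is_spam(sender: str, token_probs: Sequence[float],
            whitelist: Sequence[str],
            threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Whitelisted senders are never spam; otherwise compare to the threshold."""
    if is_whitelisted(sender, whitelist):
        return False
    return combined_probability(token_probs) > threshold
```

For example, is_spam("bob@carleton.edu", [0.99, 0.99], ["*@carleton.edu"]) returns False because the sender's host is whitelisted, even though the token probabilities alone would put the message well above the 90% threshold.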

Note: SpamBurglar is licensed under the GNU Public License.

Files:

Comps Presentation in PDF format
Source Code
GNU Public License (Text File)
Joshua Ourisman
Last modified: Thu May 13 11:34:10 CDT 2004