Testing Spam Filtering Software at the University of Cincinnati
by Amin Shafie
Like most enterprises connected to the Internet, UC finds spam an increasing problem. Spam consumes network resources and hurts both performance and productivity. Gartner Research recently estimated that up to 80 percent of incoming email is spam. The volume alone frustrates users and management alike, and our email administrators are under tremendous pressure to reduce it. For the vast majority of people, the most appropriate response to spam is to delete it. Increasingly, organizations worldwide are deploying spam-filtering technologies that can reduce, though not eliminate, the problem.
During April and May, UCit conducted a pilot project to evaluate the effectiveness of one kind of spam-filtering software that could be deployed on the University's email systems. The software uses heuristic analysis, a statistical process that scans each message for common spam characteristics, each of which is assigned a probability. If the cumulative spam probability of the message exceeds a threshold, the message can be tagged as spam. The technique learns as it is used.
We deployed the software at the boundary, on the two Mirapoint gateway servers of the Bearcat Online email system. We invited a random sample of 475 students and 225 faculty members on Bearcat Online, plus 225 faculty and staff members on the UCMail (Exchange) system, to participate in the pilot, and added to the sample a list of 140 users who had sent us complaints about spam. Some invitees declined the offer to participate, but a sufficient number agreed.
We gave each participant instructions on how to set up a filter on his or her desktop. We asked everyone to record and report: a) the number of spam messages received; b) the number of spam messages caught by the spam filter; c) the number of spam messages not caught by the spam filter; and d) the number of false positives, that is, legitimate messages tagged as spam. Finally, based on these numbers, we asked each participant to select one of the following responses:
- My percent caught is very low. This software is very ineffective in catching my spam messages. Do not waste University resources on this product.
- My percent caught is relatively low, and I consider this software unsatisfactory.
- My percent caught is good, and I recommend further consideration of this software.
- My percent caught is excellent. This software is very effective in catching my spam. I strongly recommend making it available on the University's email system.
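The heuristic scoring described earlier (a probability per spam characteristic, a cumulative score, and a threshold) can be sketched roughly as follows. This is only an illustration, not the vendor's method: the article does not say how the software combines probabilities, so the sketch assumes a naive Bayes style combination, and the characteristic names, probability values, and the 0.9 threshold are all hypothetical.

```python
import math

# Hypothetical per-characteristic spam probabilities (illustrative values only,
# not taken from the pilot or from the vendor's software).
CHARACTERISTIC_PROBABILITIES = {
    "subject_all_caps": 0.80,
    "mentions_free_offer": 0.90,
    "known_sender": 0.10,
}


def combined_spam_probability(probabilities):
    """Combine per-characteristic probabilities into one score.

    Assumed rule (naive Bayes style, not confirmed by the source):
        p = prod(p_i) / (prod(p_i) + prod(1 - p_i))
    """
    if not probabilities:
        return 0.5  # no evidence either way
    # Clamp to avoid a zero denominator when values of exactly 0 and 1 are mixed.
    clamped = [min(max(p, 0.01), 0.99) for p in probabilities]
    spam = math.prod(clamped)
    ham = math.prod(1.0 - p for p in clamped)
    return spam / (spam + ham)


def tag_message(characteristics, threshold=0.9):
    """Return (label, score); label is "spam" when the score exceeds the threshold."""
    probabilities = [
        CHARACTERISTIC_PROBABILITIES[name]
        for name in characteristics
        if name in CHARACTERISTIC_PROBABILITIES
    ]
    score = combined_spam_probability(probabilities)
    return ("spam" if score > threshold else "ok", score)


print(tag_message(["subject_all_caps", "mentions_free_offer"]))
print(tag_message(["subject_all_caps", "mentions_free_offer", "known_sender"]))
```

In this sketch a single characteristic that points toward legitimate mail (here, a known sender) can pull a message back under the threshold. Lowering the threshold catches more spam at the cost of more false positives.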
Results of the pilot were as follows:
- 73 participants kept records and reported results.
- The overall catch rate was 71%.
- 41 participants (or 56%) gave the software a good-to-excellent rating.
- 15 participants (or 21%) gave the software an unsatisfactory rating.
- 10 participants (or 14%) said they could not judge the software.
- 6 participants (or 8%) had no response.
- For some participants, false positives were a problem: the filter was too aggressive and erroneously classified legitimate messages as spam.
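For readers who want to reproduce this kind of tally, the arithmetic can be sketched as below. The article does not say whether the overall catch rate was pooled across all reported messages or averaged per participant; this sketch assumes pooling. The function names and the sample counts are hypothetical, while the counts passed to `share_percent` come from the results above.

```python
def catch_rate(caught, missed):
    """Fraction of spam caught: caught / (caught + missed).

    Returns None when a participant received no spam at all.
    """
    received = caught + missed
    return caught / received if received else None


def pooled_catch_rate(reports):
    """Catch rate over all participants' messages combined (assumed method)."""
    total_caught = sum(caught for caught, _ in reports)
    total_missed = sum(missed for _, missed in reports)
    return catch_rate(total_caught, total_missed)


def share_percent(count, total):
    """Share of participants as a whole-number percentage."""
    return round(100 * count / total)


# Hypothetical (caught, missed) counts for three participants; not pilot data.
sample_reports = [(40, 10), (15, 15), (71, 29)]
print(pooled_catch_rate(sample_reports))

# Participant shares from the reported results (73 participants reporting).
for count in (41, 15, 10, 6):
    print(count, share_percent(count, 73))
```

Pooling weights heavy spam recipients more than light ones; averaging per-participant rates instead would give each participant equal weight and can produce a different figure.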
Since we completed the pilot, the Mirapoint vendor has announced an upgrade of the software. UCit is considering deploying it on Bearcat Online as soon as we can determine a funding source. In addition, Microsoft has announced that a similar spam-filtering technology, called Exchange Intelligent Message Filter, will be bundled at no cost with its Exchange 2003 Server software. UCit plans to migrate its UCMail (Exchange) email services to this server in late summer. For more information on this software, see:
http://www.microsoft.com/exchange/downloads/2003/imf/overview.asp
You may send email to the author at Amin.Shafie@UC.Edu.