What do Viagra, low interest rates, and the Abacha family of Nigeria have in common? Not much, except that news of them is as predictable as your morning caffeine hit. The difference is that your beverage consumption stays about the same from year to year, while the spam problem has reached crisis proportions--spam now makes up more than 50 percent of all e-mail, and its share keeps growing.

Current methods, such as rejecting mail from known spammers (blacklists) and accepting mail only from friends and colleagues (whitelists), help, but not enough. Merely filtering known spam messages always leaves the filter one step behind clever spammers. And until recently, more aggressive filtering posed an unacceptable risk of killing legitimate messages.

New filtering methods analyze e-mail messages in their entirety, instead of just a handful of keywords. The filters then build sophisticated models, based on probability theory and statistics going back to the ideas of the 18th-century mathematician and cleric Thomas Bayes, that determine whether new messages are spam or not.
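To make the idea concrete, here is a minimal sketch of one common form of Bayesian filtering, a naive Bayes classifier with add-one smoothing. It illustrates the general technique and is not the code of any particular filter mentioned in this article; the class name, the tokenizer, and the sample messages are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a message into lowercase word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class NaiveBayesFilter:
    """Hypothetical toy filter: learns word counts from labeled mail."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record every word of a message labeled 'spam' or 'ham' (legitimate)."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | message) using every word in the message.

        Assumes at least one spam and one ham message have been trained.
        """
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Start from the prior: how common this class of mail is overall.
            score = math.log(self.message_counts[label] / total_messages)
            for word in tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the product.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocabulary)))
            log_scores[label] = score
        # Bayes' rule, computed in log space to avoid numeric underflow.
        difference = log_scores["ham"] - log_scores["spam"]
        if difference > 700:  # math.exp would overflow; probability is ~0
            return 0.0
        return 1.0 / (1.0 + math.exp(difference))


# Usage with made-up training messages
spam_filter = NaiveBayesFilter()
spam_filter.train("Buy cheap Viagra now", "spam")
spam_filter.train("Low interest rates, act now", "spam")
spam_filter.train("Meeting agenda for Monday", "ham")
spam_filter.train("Lunch with the project team on Monday", "ham")
print(spam_filter.spam_probability("cheap viagra rates"))
print(spam_filter.spam_probability("project meeting on Monday"))
```

Because the score combines evidence from every word rather than matching a fixed list, a real filter of this kind would be trained on thousands of messages from the user's own mailbox, and the threshold for discarding mail would be set high to limit the risk of losing legitimate messages.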

The new Bayesian filters are already available as open-source code and will show up in commercial products later in the year. Similar Bayesian-style machine-learning software has been included in several e-mail products from Microsoft Corp. (Redmond, Wash.).

In this article by longtime Internet journalist Steven J. Vaughan-Nichols, IEEE Spectrum takes a fresh look at the spam problem and asks whether it can be contained by the new methods of Bayesian filtering.