We need to talk about e-mail. E-mail isn't the topic we planned for this month, and its administration isn't what we generally regard as a "Regular Expressions" subject.
Essentially all the applications we develop include an e-mail component, though, and there are vital aspects of e-mail management no one else seems to cover; we finally realized this means it's a conversation we need to start.
Bad, and Getting Worse--But Ahead of Everything Else
E-mail is old-fashioned, it's stodgy, it's probably lost value in the last decade, and there are few innovations that seem poised to improve it significantly--yet e-mail started with such an enormous lead as the original "killer app" that it remains, in 2009, indispensable. Essentially all of us count on it. Even with the competition that SMS, IM, "Web 2.0 messaging," and more esoteric channels provide, e-mail does the heavy hauling for all sorts of "mission-critical" situations.
This remains true despite severe liabilities. It's well known, for example, that at least nine out of every ten messages are spam.
As annoying and disgusting as the noise of spam is, though, there are at least two other technical aspects of e-mail that we find even thornier. The noise of spam admits palliatives, if not cures: scores of vendors offer products and services that reduce spam. It's the countermeasures to spam, and their consequences, that have grabbed our attention lately.
Belligerence in Administration
We now work as hard fighting blacklists as we do fighting the spam they're supposed to combat. In principle, central experts identify and report on--"blacklist"--spam sources, so that all of us can cheaply learn which mail servers to avoid. Everyone recognizes there'll be occasional "false positives," with innocent servers mistakenly characterized as spam sources. All blacklists have procedures for reporting and resolving such mistakes.
As plausible as this arrangement sounds, we're finding that its reality is nearly unlivable. Essentially all our servers are on some blacklist, always. Fresh servers we bring on-line for the first time are occasionally blacklisted, because they're re-using an IP address that belonged to a spammer in the past, sometimes many years ago. Others are born blacklisted, because they're co-located at a data center where other customers have let hosts be taken over.
Even when we escape these hazards, we find ourselves blacklisted shortly after we first send out e-mail. No one thinks our e-mail is particularly commercial, or unsolicited, and certainly not both at the same time; however, as best we can tell, once one of our return addresses shows up in the address book of a correspondent whose machine was hijacked in the past, our server is flagged for exclusion.
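For readers who want to check whether one of their own servers is listed, here is a minimal sketch in Python. It assumes the blacklist is published through DNS (a "DNSBL"), which is common but not universal: the IPv4 octets of the server are reversed, the list's zone is appended, and an answer in 127.0.0.0/8 conventionally means "listed." The zone dnsbl.example.net is a placeholder, not a real service, and the function names are ours, chosen for illustration.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets, then the zone."""
    # IPv4Address() raises ValueError on anything that is not a valid IPv4 address.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the DNSBL at `zone` appears to list `ip`.

    `resolve` is injectable so the logic can be exercised without a network.
    """
    try:
        answer = resolve(dnsbl_query_name(ip, zone))
    except OSError:
        # No such record, or the lookup failed. This sketch treats both
        # as "not listed"; a production check should tell them apart.
        return False
    return answer.startswith("127.")


if __name__ == "__main__":
    # Placeholder address and zone; substitute your own server and list.
    print(dnsbl_query_name("192.0.2.10", "dnsbl.example.net"))
```

A script like this, run from cron against each list you care about, at least tells you that you have been listed before your correspondents do. It says nothing about why, which is the harder question.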
Confusion and Ignorance
We explain this with an excess of conditionals--"might be," "we think"--because it's so hard to be certain. Quite a bit of the security implemented around spam involves obscurity: vendors don't and won't document their precise algorithms for judgment, because they reasonably fear the bad guys will abuse the information. Even when we manage to make contact with a human at the blacklist maintainers, they often seem sincerely unable to explain how we got on, or how to stay off, their lists.
It's surely not because we're abusing e-mail. We manage a few mass mailings every year for customers, but our mass mailings have almost nothing in common with the spammers: we send out information on specific computing languages, certain scientific research opportunities, or similarly obscure and "unmonetized" subjects, to subscribers who've asked for the information. In the abstract, it's hard to confuse any of our activities with spamming.
We certainly show up on blacklists, though. And, from what we can tell, the situation is getting worse. Major ISPs, including Comcast, refuse e-mail from some of our servers without explanation or recourse. As best we can tell, they seem to have decided that it's simply not worth their bother to blacklist accurately, and they appear to have made permanent exclusion a policy for small operators contaminated even once by spamming suspicion.
The rules aren't the same for the major players. Industry heavyweights like Gmail, AT&T, Comcast, MSN, and so on, appear to have worked things out so that they accept e-mail from each other, even though we can document spam originating within each of their services. It's easy to anticipate a near future in which e-mail must pass through one of the major players to have a chance of successful reception.
We suspect that part of the attraction of Facebook and similar services is this preferential treatment of their transmissions. Internet service providers make a point of "clearing the tracks" for mass-market messages to and from Facebook, while they throttle e-mail traffic from small domains like the ones we manage.
The result: it's simply impossible to manage e-mail for a single organization or interest group as we would have a decade ago. It's not enough to configure a server correctly, and leave it running for years at a time; now, it takes weekly intervention to minimize blacklisting.
The market shift described above has a second consequence: other small domains have become harder to trust. We've seen that e-mail gating now is often on the basis of market share rather than any traditional technical criterion: e-mail from AT&T gets through, and ours doesn't, even though we follow the pertinent standards at least as well as AT&T.
With this pattern, we're finding that other small administrators are giving up on standards conformance. Forward-confirmed reverse DNS (FCrDNS) look-ups provide an example: we typically configure our mail transfer agents (MTAs) to confirm FCrDNS as part of spam detection for incoming e-mail traffic. The check takes the connecting server's IP address, looks up the host name registered for it in reverse DNS, and then confirms that a forward look-up of that name returns the same address. This amounts to a well-documented and reasonable requirement that a server appear to be located within the organization it claims to represent. If an MTA in Upper Spamola tries to send one of our customers e-mail claiming to be from Bank of America, our default action is to block the e-mail. While this is a weak form of authentication, it significantly lightens the load on our MTAs.
It's not as safe a check as we'd like, though. FCrDNS and related techniques depend on correct configuration of peer MTAs. When we find cases where DNS for a university or small business is misconfigured, our policy is to be a good 'Net citizen and let them know how to correct the error. What we've observed in recent years is administrators indifferent to Internet standards. Their position: as long as e-mail gets through to SBC, AOL, and so on, there's nothing to change. Published standards don't matter apart from their enforcement by the market leaders.
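The FCrDNS check discussed above is simple enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not the configuration of any particular MTA: the function names are ours, addresses are compared as plain strings (adequate for IPv4, not robust for IPv6), and the two resolver functions are passed in as parameters so the logic can be tried out without touching real DNS.

```python
import socket


def reverse_names(ip):
    """Host names that reverse DNS (PTR) reports for `ip`; empty on failure."""
    try:
        name, aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:
        return []
    return [name] + aliases


def forward_addresses(name):
    """Set of addresses that `name` resolves to; empty on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except OSError:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    return {info[4][0] for info in infos}


def fcrdns_ok(ip, reverse=reverse_names, forward=forward_addresses):
    """True if some name that `ip` maps to resolves back to that same `ip`."""
    return any(ip in forward(name) for name in reverse(ip))


if __name__ == "__main__":
    # Documentation-range address; substitute the peer you want to test.
    print(fcrdns_ok("192.0.2.10"))
```

Note that the check fails for a server with no reverse record at all, and for one whose reverse and forward records simply disagree. Those are exactly the misconfigurations described above, and the sketch cannot distinguish a careless administrator from a forger.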
Summary
The e-mail landscape is discouraging. Spam is a drag on all our productivity, and, apart from the distraction of spam, we see e-mail practices increasingly determined by market dominance rather than engineering standards and virtues. E-mail remains indispensable to us, and we'll continue to manage at least some of our own servers for the benefits of internal automation, and integration with the vertical-market applications we develop. If you have ideas about how to deal with the problems this column outlines, though, or just your own problems and surprises in managing e-mail, let us know.
The next installment of "Regular Expressions" will return to our usual focus on working code in high-level languages that you can use in your own programs.
Kathryn and Cameron run their own consultancy, Phaseit, Inc., specializing in high-reliability and high-performance applications managed by high-level languages. They write about scripting languages and related topics in their "Regular Expressions" columns. Their own experience with e-mail goes back thirty years.
