Great Circle Associates List-Managers
(October 2001)

Subject: Automated bounce recognition
From: "Ronald F. Guilmette" <rfg @ monkeys . com>
Date: Sat, 20 Oct 2001 12:06:30 -0700
To: list-managers @ greatcircle . com

For purposes unrelated to mailing list administration, I have developed
a C language program whose purpose is to automatically differentiate
automated response e-mail messages (e.g. bounce messages) from other
types of e-mail messages.

This program has been quite well tested and debugged now, and I've been
using it for production purposes for some time.  Its accuracy in making
the differentiation between automated and non-automated replies is
very high, certainly in excess of 99%.

I am mentioning this program here for two reasons.

The first reason is that I want to further `tune' the differentiator
using more input test data.  It has already been tuned using a very
large sample of both automated replies and non-automated replies,
but the program can always benefit from yet more tuning/testing on yet
more sample inputs.  Thus, I'm asking anyone and everyone here who
might be in possession of any large archives of e-mail and who might
be willing and able to supply me with copies of those archives (in
compressed form via FTP) to contact me and let me know where/how I
can obtain copies of those archives.

The second reason that I mention this differentiator here is that it
occurs to me that it might possibly have some application to mailing
list administration, either on its own, or perhaps when coupled with
some other software.  I'd like to solicit opinions from List-Managers
about this possibility.  Would anyone here have a use for such a
program?  If so, then I'd like to know what the interest level is, and
whether it would be worth my time to ``productize'' and document the
code with an eye towards making some sort of a public release, either
as freeware or perhaps commercially.

P.S.  If there is any interest in this sort of a program for use in
conjunction with mailing list administration, then a number of other
questions arise.  First on the list is the question of the handling
of non-bounce automatically-generated replies, e.g. from autoresponders.
In the context of mailing list administration, would it be best to
consider autoresponder messages as being functionally equivalent to
non-deliverable bounce messages, or as functionally equivalent to
ordinary non-bounce messages, or as neither of the above (i.e. a
third and separate category, all by themselves)?
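
As a rough illustration of the three options above, the choice could be
modeled as a configurable policy.  This is only a sketch and is not taken
from the program described in this message; every name in it
(msg_class, autoreply_policy, effective_class) is hypothetical.

```c
/* Hypothetical message categories; not the original program's types. */
enum msg_class { MSG_ORDINARY, MSG_BOUNCE, MSG_AUTOREPLY };

/* The three ways a list manager might treat autoresponder messages. */
enum autoreply_policy {
    AUTOREPLY_AS_BOUNCE,    /* same as a non-deliverable bounce */
    AUTOREPLY_AS_ORDINARY,  /* same as an ordinary non-bounce message */
    AUTOREPLY_SEPARATE      /* a third category of its own */
};

/* Map the differentiator's verdict to the category the list software
   acts on.  Bounces and ordinary messages pass through unchanged. */
enum msg_class effective_class(enum msg_class detected,
                               enum autoreply_policy policy)
{
    if (detected != MSG_AUTOREPLY)
        return detected;
    switch (policy) {
    case AUTOREPLY_AS_BOUNCE:   return MSG_BOUNCE;
    case AUTOREPLY_AS_ORDINARY: return MSG_ORDINARY;
    default:                    return MSG_AUTOREPLY;
    }
}
```

Keeping the policy as a separate setting would let each list
administrator pick whichever of the three answers suits a given list,
without changing the differentiator itself.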

