More on the Email -> RSS Gateway
Posted by Bill Rini @ 7:10 am
As Ville pointed out, there seems to be an existing project (Mailbucket.org) doing something very similar to what I’m attempting to accomplish. I found it the other day while working on my gateway, so I thought I would take a look and compare the two. Here are the major differences, as far as I can tell:
My version is based on Qmail, PHP, and MySQL; Mailbucket.org is Exim, Python, Perl (XML::RSS), SpeedyCGI and sqlite.
Mailbucket.org allows anyone to subscribe to a mailing list and have the email forwarded to Mailbucket.org. As some people on the Mailbucket.org site seem to have noticed, this may allow people to spam the list by simply sending an email to somemailinglist@mailbucket.org. They seem to solve this by running everything through SpamAssassin (an excellent program which I use on all my personal accounts).
I approached it in a different (but not necessarily better) way. First, you can’t just subscribe to a list. You must enter the list into the database, and when the program executes it verifies that the receiving address has been set up to receive traffic. If no match is found in the database, it dumps the email to a log file and exits (this can be turned off with a config option so that it just exits without logging, but since I’m still in test mode I want to see the errors as well as the successes).
Next, you’re required to enter some information from the email headers into the database to further authenticate the email. Specifically, you must enter some part of the return path, which I match with a regular expression. I know, for instance, that ezmlm mailing lists add extra information to the return path that is unique to each email for list management purposes, so I don’t try to match the exact return path, just the parts that I know are common to every email from that list.
Now, the big question is: is this foolproof? It’s not. Someone could still forge the return path and guess (or learn) the email address and spam the gateway. The bigger question is: is it worth it? If they went to all that trouble, they could spam the entire mailing list directly instead of just the email->rss gateway. I’ve included banned IP/domain list functionality, so if it’s an isolated case one can just block that specific IP or mail relay, and if it becomes more of a general problem one can always send everything through SpamAssassin as a last resort.
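To make the checks above concrete, here is a minimal sketch of the flow: look up the receiving address, check the ban list, then match the return path against a per-list regular expression. It is written in Python purely for illustration (the gateway itself is PHP and MySQL), the dictionary stands in for the database table, and every name, address, and pattern in it is a made-up assumption rather than the gateway’s actual code.

```python
import re
from email import message_from_string

# Hypothetical stand-in for the database: gateway address -> pattern that
# the Return-Path of every message from that list is expected to match.
REGISTERED_LISTS = {
    "somelist@gateway.example": re.compile(
        r"^somelist-return-\d+-.+@lists\.example\.org$"
    ),
}

# Hypothetical ban lists for isolated abuse cases.
BANNED_DOMAINS = {"spammer.example"}
BANNED_IPS = {"192.0.2.66"}


def accept_message(raw_message, recipient, log, sender_ip=None):
    """Return True if the message should become a feed item.

    Rejections are appended to `log`, mirroring the "dump to a log file
    and exit" behaviour described in the post.
    """
    pattern = REGISTERED_LISTS.get(recipient.lower())
    if pattern is None:
        log.append(f"no list registered for {recipient}")
        return False

    if sender_ip in BANNED_IPS:
        log.append(f"banned IP {sender_ip}")
        return False

    msg = message_from_string(raw_message)
    # The header value usually looks like "<address>"; strip the brackets.
    return_path = msg.get("Return-Path", "").strip().strip("<>").lower()
    domain = return_path.rpartition("@")[2]
    if domain in BANNED_DOMAINS:
        log.append(f"banned domain {domain}")
        return False

    # Match only the stable part of the return path, not the per-message part.
    if not pattern.search(return_path):
        log.append(f"return path {return_path!r} does not match {recipient}")
        return False
    return True
```

The pattern only pins down the parts of the return path that stay the same from message to message, which is the point made above about ezmlm-style addresses.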
I wasn’t so hot on the SpamAssassin solution because, in my own use of the program, I usually had to whitelist the mailing lists I subscribe to or it rejected some of their messages as spam. Anyone relying only on SpamAssassin could probably be tricked using the same methods one would use to trick my solution. So, in the end, IMHO, it was six of one, half a dozen of the other (i.e. no real difference).
Although I probably won’t delay releasing the software for this, I’ve also been looking into ways to password-protect the feeds so that you would need to append a login and password to the end of the URL in order to access them. At first I was thinking of just using Apache’s .htaccess, but I’m not sure how all news readers would deal with being prompted for a login/password.
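As a rough illustration of the “login and password on the end of the URL” idea, the sketch below pulls the credentials out of the query string and compares them against a stored pair. This is only one possible shape for it, again in Python rather than the gateway’s PHP; the parameter names (list, login, password), the URL, and the credential store are all hypothetical.

```python
import hmac
from urllib.parse import urlsplit, parse_qs

# Hypothetical credential store: feed name -> (login, password).
# A real deployment would store password hashes, not plain text.
FEED_CREDENTIALS = {"somelist": ("reader", "s3cret")}


def may_read_feed(url):
    """Return True if the URL carries valid credentials for its feed."""
    query = parse_qs(urlsplit(url).query)
    name = query.get("list", [""])[0]
    login = query.get("login", [""])[0]
    password = query.get("password", [""])[0]

    expected = FEED_CREDENTIALS.get(name)
    if expected is None:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    ok_login = hmac.compare_digest(login.encode(), expected[0].encode())
    ok_password = hmac.compare_digest(password.encode(), expected[1].encode())
    return ok_login and ok_password
```

The appeal is that a news reader only has to fetch an ordinary URL and never sees a login prompt. The trade-off is that the credentials travel in the URL, so they will show up in web server logs and anywhere the feed address is shared.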
So where are things at right now? Well, I’m still getting a message here or there that sneaks by with characters that cause my RSS newsreader to choke. Funny thing is, it will fail at FeedValidator.org but the same message will pass validation via the W3C Validator. There seem to be two major problems with validation. First, FeedValidator doesn’t seem to like it when the lastBuildDate uses single digits in the time or date (e.g. 3:30 vs. 03:30). I’ll test that out to see if fixing the format makes it happy. Second, I think FeedValidator is more sensitive to special characters, even though they’re properly enclosed in CDATA blocks.
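Both suspected problems can be sketched in a few lines. RSS 2.0 dates are meant to follow RFC 822, and Python’s standard library formats them with zero-padded fields. On the character side, one possible culprit (an assumption, not something I have confirmed against the failing messages) is that XML 1.0 forbids most control characters even inside CDATA, and a literal "]]>" in the body ends the CDATA block early. The helper names below are made up for the example.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime

# Control characters and non-characters that XML 1.0 does not allow anywhere,
# CDATA included. Tab, newline, and carriage return are legal and are kept.
_ILLEGAL_XML = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def rss_date(dt):
    """Format a timezone-aware datetime as an RFC 822 style date."""
    return format_datetime(dt)


def cdata(text):
    """Wrap text in CDATA, dropping illegal characters and splitting ']]>'."""
    text = _ILLEGAL_XML.sub("", text)
    # Close the block after ']]' and reopen it before '>' so the terminator
    # never appears inside a single CDATA section.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


print(rss_date(datetime(2005, 3, 7, 3, 30, tzinfo=timezone.utc)))
# prints: Mon, 07 Mar 2005 03:30:00 +0000
print(cdata("a]]>b"))
# prints: <![CDATA[a]]]]><![CDATA[>b]]>
```

Whether this is what FeedValidator actually objects to still needs testing against the messages that fail.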
If all goes well over the next couple of days I might have something pretty solid by the beginning of next week.
License
This work is published under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.

