
cleanse RSS as you collect it

May 10th, 2002

Tim’s been doing some programming with RSS and learning a thing or two.

Not to be left behind, I’m working with Scott Johnson on some RSS stuff and wrote a test harness for a routine to clean up the descriptions in a feed: removing or escaping HTML, truncating description size, etc.

I’ve been following the debate about whether Jenny should truncate her feed, and it occurs to me that it doesn’t matter any more whether she does or not. As long as I use my new handy-dandy RSS cleansing proxy, I can control it myself.

I’ve saved a piece of Scott’s feed as an example since he’s got some HTML in it and his entries are long. Here’s the raw feed.

If you use my cleanser, you get this result.

The feed parameter of course is required. I’ve added optional parameters too:

limit sets the byte limit to truncate the feed at, or 0 for all of it. Defaults to 500 bytes.

method allows you to specify whether you want all tags removed from the feed (remove) or just have ampersands escaped so it doesn’t mess up your aggregator output or cause script security concerns (amp). Defaults to remove.
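
For anyone curious what the two options amount to, here is a minimal Python sketch. It is not the actual proxy code. The function names `cleanse` and `truncate_bytes` are made up for illustration, and it assumes the description arrives entity-escaped as it sits in the feed XML (for example `&lt;b&gt;`), that the byte limit is measured in UTF-8 and applied before re-escaping, and that a simple regular expression is good enough for tag stripping.

```python
import html
import re

# Naive tag matcher: good enough for a sketch, not a full HTML parser.
TAG_RE = re.compile(r"<[^>]*>")


def truncate_bytes(text, limit):
    """Cut text to at most `limit` UTF-8 bytes; 0 (or less) means no limit."""
    if limit <= 0:
        return text
    # errors="ignore" drops a multi-byte character cut in half by the limit.
    return text.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


def cleanse(description, limit=500, method="remove"):
    """Clean one item description (hypothetical helper).

    `description` is assumed to be the escaped text as found in the feed XML.
    """
    if method == "remove":
        # Decode entities, strip the tags, truncate, then re-escape for XML.
        text = TAG_RE.sub("", html.unescape(description))
        return html.escape(truncate_bytes(text, limit), quote=False)
    if method == "amp":
        # Leave the markup in place but escape every ampersand, so an
        # aggregator shows the tags as literal text instead of rendering them.
        return truncate_bytes(description, limit).replace("&", "&amp;")
    raise ValueError("method must be 'remove' or 'amp'")
```

Calling `cleanse(text)` with no other arguments mirrors the defaults described above (500 bytes, remove); `cleanse(text, limit=50, method="amp")` corresponds to the 50-byte amp example.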

More examples: limit to 50 bytes, amp method

I’ll leave this available for a day or two, but after that, you’ll have to do it yourself – I don’t need the entire world’s RSS calls proxied through my server.

One comment to “cleanse RSS as you collect it”

  1. Brent, are you cleansing just the descriptions or other items as well? Looks as though we’ll see other item attributes that will need cleansing as well if you’re following stuff Jon Udell is doing. See my blog for pointers to that stuff.