Comments on: Aggregators for the New Age

By: mote

mote — Thu, 08 Jun 2006 15:50:07 +0000

(Hmmm… looks like there’s a bug in wordpress’s “\” and “‘” escaping)

By: mote

mote — Thu, 08 Jun 2006 15:47:18 +0000

Les, thanks for stopping by. I’ll be emailing you once I’m ready to start writing the intelligent part of my aggregator; I’m curious what sort of features you’ve tried for machine learning.

A few thoughts on what you wrote:
1. I’m quite sure a straight binary classifier is not the answer. Interestingness is different from the spam/ham problem because the answer is fuzzy rather than black & white. My naive guess is that the best user interface will be one that subtly signals to the user whether an article is worth reading (say, a brighter red for the entry header) or not worth reading (a duller grey for the background). Ordering (pushing the higher noise-to-signal stuff down near the bottom) is also a possibility. When dealing with AI judgements that are not too accurate, I find it helpful to let the user interface be as vague as possible.

2. I definitely agree with you about the advances since 2003; the mechanical turk has been great! I’ve been thinking of different ways to use digg or del.icio.us data, but I’m not sure Joshua or Kevin would be too happy with me pounding their servers every time a new feed item comes in.
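
To make point 1 concrete, here is a minimal sketch of the "vague" display idea. It assumes each entry already carries an interestingness score between 0 and 1 (how that score gets computed is left open), and the names Entry, header_color and arrange, as well as the two colour values, are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Entry:
    title: str
    score: float  # assumed interestingness in [0, 1]; its source is out of scope


def header_color(score: float) -> str:
    """Blend from a dull grey (score 0) to a bright red (score 1)."""
    s = min(max(score, 0.0), 1.0)  # clamp scores that fall outside [0, 1]
    grey = (136, 136, 136)  # illustrative colour choices, not from any real UI
    red = (221, 0, 0)
    r, g, b = (round(lo + (hi - lo) * s) for lo, hi in zip(grey, red))
    return f"#{r:02x}{g:02x}{b:02x}"


def arrange(entries):
    """Order entries so the likely-noise ones sink toward the bottom."""
    return sorted(entries, key=lambda e: e.score, reverse=True)


if __name__ == "__main__":
    feed = [Entry("Celebrity gossip", 0.1), Entry("New aggregator idea", 0.9)]
    for entry in arrange(feed):
        print(header_color(entry.score), entry.title)
```

Nothing is ever hidden here: a wrong guess only costs the reader a slightly duller header, which is the point of keeping the interface vague.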

By: l.m.orchard

l.m.orchard — Wed, 07 Jun 2006 14:22:09 +0000

One thing, about Bayes and aggregators: I don’t think anyone’s found a way to apply Bayesian filtering to RSS aggregation that produces a satisfying result – at least not one worth cheering to the web about.

I know I haven’t – and I’m up to my 6th private attempt or so at different arrangements with Bayes in particular. I think a different form of filtering is what’s needed. Maybe LSI, maybe some other form of valued scoring that doesn’t result in a flat spam/ham answer. (i.e. “interestingness”, as you say.)
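
As a rough illustration of what "valued scoring" could mean (this is neither LSI nor any of the arrangements tried above, just a toy under stated assumptions): keep a per-word rate of how often the reader marked items containing that word as interesting, and average those rates into a number between 0 and 1. The class name InterestScorer and its methods are hypothetical.

```python
import re
from collections import defaultdict


class InterestScorer:
    """Toy graded scorer: returns a value in (0, 1), never a yes/no verdict."""

    def __init__(self):
        self.liked = defaultdict(int)  # word -> items marked interesting
        self.seen = defaultdict(int)   # word -> items with any feedback

    @staticmethod
    def words(text):
        # Lowercased set of word tokens; each word counts once per item.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def feedback(self, text, interesting):
        """Record one reader judgement for an item's text."""
        for w in self.words(text):
            self.seen[w] += 1
            if interesting:
                self.liked[w] += 1

    def score(self, text):
        """Average the smoothed per-word interest rates of known words."""
        rates = [
            (self.liked[w] + 1) / (self.seen[w] + 2)  # add-one smoothing
            for w in self.words(text)
            if w in self.seen
        ]
        # With no known words there is no evidence, so stay neutral.
        return sum(rates) / len(rates) if rates else 0.5


if __name__ == "__main__":
    scorer = InterestScorer()
    scorer.feedback("python aggregator", True)
    scorer.feedback("celebrity gossip", False)
    print(scorer.score("python"), scorer.score("gossip"), scorer.score("unknown"))
```

The output is meant to feed a ranking or a visual hint rather than a filter, so a mediocre score never throws an item away.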

But, in the time since 2003, what’s really shown promise are more and more varied ways of soliciting and exploiting human intelligence in the course of finding and filtering news and feed items. See: del.icio.us, digg, et al. The best news lately is pre-scanned and filtered by human domain experts.

By: mote

mote — Tue, 06 Jun 2006 20:42:06 +0000

Also, I’m noticing that a lot of things (e.g. the rise of folksonomy and “web 2.0”, whatever that means) have changed since these ideas were first being thrown around 3 years back. Untapped potential there, too.