There has been a bit of discussion recently about how JavaBlogs is changing as it grows, including problems with what some people see as low-quality posts.
Gerard has outlined the four main methods of making a community scale, but I would like to suggest a fifth. I believe that automated text categorisation can increase the size a community can scale to without requiring non-software intervention.
I’ve done some experimentation with using text analysis algorithms for simple match/non-match categorisation. I believe something as simple as Bayesian classification for blog posts can go some way to improving the quality of links on the “Hot List”.
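To make the idea concrete, here is a minimal sketch of a match/non-match naive Bayes classifier in plain Java. It is not the code from my experiments and it is not the Classifier4J API; the class name `HotListFilter`, the method names and the tokenisation rule are all made up for illustration. It assumes someone has labelled a handful of posts as "belongs on the Hot List" or not, and it uses add-one smoothing so that unseen words don't zero out the score.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a two-class (match / non-match) naive Bayes filter.
final class HotListFilter {
    private final Map<String, Integer> matchCounts = new HashMap<>();
    private final Map<String, Integer> nonMatchCounts = new HashMap<>();
    private int matchWords = 0;     // total word occurrences seen in matching posts
    private int nonMatchWords = 0;  // total word occurrences seen in non-matching posts
    private int matchPosts = 0;     // number of matching posts trained on
    private int nonMatchPosts = 0;  // number of non-matching posts trained on

    // Assumption: lower-case and split on anything that is not a letter or digit.
    private static String[] tokenize(String text) {
        return text.toLowerCase().split("[^a-z0-9]+");
    }

    // Record the words of one labelled post.
    void train(String text, boolean isMatch) {
        Map<String, Integer> counts = isMatch ? matchCounts : nonMatchCounts;
        for (String word : tokenize(text)) {
            if (word.isEmpty()) continue;
            counts.merge(word, 1, Integer::sum);
            if (isMatch) matchWords++; else nonMatchWords++;
        }
        if (isMatch) matchPosts++; else nonMatchPosts++;
    }

    // Estimated probability (0..1) that the post belongs to the "match" class.
    double matchProbability(String text) {
        Set<String> vocabulary = new HashSet<>(matchCounts.keySet());
        vocabulary.addAll(nonMatchCounts.keySet());
        // One extra slot stands in for words never seen during training.
        double slots = vocabulary.size() + 1.0;

        // Smoothed class priors, accumulated in log space to avoid underflow.
        double totalPosts = matchPosts + nonMatchPosts + 2.0;
        double logMatch = Math.log((matchPosts + 1.0) / totalPosts);
        double logNonMatch = Math.log((nonMatchPosts + 1.0) / totalPosts);

        for (String word : tokenize(text)) {
            if (word.isEmpty()) continue;
            logMatch += Math.log(
                (matchCounts.getOrDefault(word, 0) + 1.0) / (matchWords + slots));
            logNonMatch += Math.log(
                (nonMatchCounts.getOrDefault(word, 0) + 1.0) / (nonMatchWords + slots));
        }
        // Normalise the two log scores back into a probability.
        return 1.0 / (1.0 + Math.exp(logNonMatch - logMatch));
    }
}
```

In use, you would call `train(postText, true)` for posts you want to see and `train(postText, false)` for the rest, then only let posts through whose `matchProbability` is above some threshold (0.5 is the obvious starting point, but that is a guess that would need tuning against real posts).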
Ultimately, I think that some of the more advanced text categorisation algorithms might be even more useful. For instance, Google News manages to categorise its stories fairly well, and I believe they do most of that automatically. NewsInEssence categorises news into “clusters” automatically. A quick look on CiteSeer shows plenty of algorithms around, and I’m pretty sure the author of Classifier4J might be interested in implementing at least one.
