Saturday, February 28, 2009

Wikipedia corpus

While there has been a lot of speculation about Web 2.0 and Semantic Web projects of a Wikipedia-like nature, Wikipedia itself is a great corpus for information retrieval and semantic experiments (just stating the obvious).

Nevertheless, this is well understood by startups such as

http://www.powerset.com/, a tool with a fancy UI for natural-language (NL) "search"

and

http://www.freebase.com/, one of the promising social networks for "I like" information, though not really another Facebook or Digg.

Both companies retrieve a lot of Wikipedia content for further compilation, much as DJs like to remix Beethoven's pieces.
