Tomaž Šolc at Wikimania 2008, Alexandria, Egypt
A common use of Wikipedia in web publishing is to explain terms in a published text that the reader may not be familiar with. This is usually done in the form of in-text hyperlinks to the relevant Wikipedia pages. Building on existing research, we have created a system that automatically adds such explanatory links to a plain-text article. Combined with structured data extracted from the linked Wikipedia articles, the system can also provide links to other websites about the subject, as well as semantic tags that can be used in further processing.
This talk is about the research that resulted in Wikitag, a system that is currently running as part of the Zemanta (www.zemanta.com) service. It gives an overview of the algorithm, describes its basic building blocks, and discusses the primary problems we encountered: how to get link candidates, automatically disambiguate terms, estimate link desirability, and select only the most appropriate links for the final result.
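The four problems listed above can be pictured as stages of a small pipeline. The sketch below is only an illustration and does not describe how Wikitag itself works: the abstract does not say how each stage is implemented. It assumes a toy table of anchor-text counts (`ANCHOR_COUNTS`, `PHRASE_COUNTS`, both made up), picks the most frequently linked target as a stand-in for disambiguation, and uses the fraction of occurrences that are linked as a stand-in for link desirability. All function names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical toy data: phrase -> {Wikipedia article: times the phrase links to it}
ANCHOR_COUNTS = {
    "python": {"Python (programming language)": 80, "Pythonidae": 20},
    "alexandria": {"Alexandria": 95, "Alexandria, Virginia": 5},
    "the": {"The": 1},
}
# Hypothetical: how often each phrase occurs at all, linked or not
PHRASE_COUNTS = {"python": 200, "alexandria": 110, "the": 100000}


@dataclass
class Link:
    phrase: str
    target: str
    score: float


def find_candidates(text):
    """Stage 1: phrases in the text that are known anchor texts."""
    words = [w.strip(".,;:()").lower() for w in text.split()]
    candidates = []
    for word in words:
        if word in ANCHOR_COUNTS and word not in candidates:
            candidates.append(word)
    return candidates


def disambiguate(phrase):
    """Stage 2 (assumed stand-in): choose the most frequently linked target."""
    targets = ANCHOR_COUNTS[phrase]
    return max(targets, key=targets.get)


def desirability(phrase):
    """Stage 3 (assumed stand-in): fraction of occurrences that are links."""
    linked = sum(ANCHOR_COUNTS[phrase].values())
    return linked / PHRASE_COUNTS[phrase]


def select_links(text, max_links=2, min_score=0.1):
    """Stage 4: keep only the highest-scoring links above a threshold."""
    links = [
        Link(p, disambiguate(p), desirability(p)) for p in find_candidates(text)
    ]
    links = [link for link in links if link.score >= min_score]
    links.sort(key=lambda link: link.score, reverse=True)
    return links[:max_links]


if __name__ == "__main__":
    for link in select_links("The Python talk was held in Alexandria."):
        print(link.phrase, "->", link.target, round(link.score, 2))
```

With this toy data, the very common word "the" is found as a candidate but dropped at the selection stage because almost none of its occurrences are links. A real system would need context-aware disambiguation, which this sketch leaves out.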