Michal Rosen-Zvi
IBM Research Division, Haifa Research Lab
On the topic of:
Bayesian models of text generation - from the basic assumptions to the most practical solutions
ELSC-ICNC lecture hall (Silverman Bldg., Wing 3, 6th floor - Edmond J. Safra Campus)
March 24, 2011, at 17:00
Abstract:
Text-mining algorithms based on statistical (Bayesian) topic models, such as Latent Dirichlet Allocation (LDA), introduced by Blei et al. in 2003, have achieved significant progress in modeling word-document relationships. These algorithms, which are at the center of this talk, assume that each word in a document was generated by a hidden topic. They explicitly model the word distribution of each topic as well as the prior distribution over topics in the document. Griffiths, Steyvers and Tenenbaum showed that this assumption is related to semantic memory in cognitive science. In particular, it provides a powerful framework for analyzing the abstract computational problem underlying the extraction and use of the gist of verbal discourse. This talk is devoted to a number of variations on this basic assumption, including exploiting attributes of the text documents, links from words, and the order of words, and to the approximate algorithms derived from these models. Finally, potential applications of the models will be presented.
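As a rough illustration of the basic assumption described in the abstract, the following minimal Python sketch generates one toy document: topic proportions are drawn from a Dirichlet prior, each word gets a hidden topic, and the word itself is drawn from that topic's word distribution. The vocabulary, the two topic-word distributions, the prior values, and the function names are all hypothetical and chosen only for illustration. They are not taken from the talk, and the sketch covers the generative side only, not the approximate algorithms used to fit such models.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, alpha, vocab, n_words, rng):
    """Generate one document under the basic topic-model assumption.

    topic_word[k] is topic k's distribution over vocab (toy values, assumed).
    alpha is the Dirichlet prior over the document's topic proportions.
    """
    theta = sample_dirichlet(alpha, rng)  # topic proportions for this document
    topics = list(range(len(topic_word)))
    assignments, words = [], []
    for _ in range(n_words):
        # Hidden topic for this word, drawn from the document's proportions.
        z = rng.choices(topics, weights=theta)[0]
        # The observed word, drawn from that topic's word distribution.
        w = rng.choices(vocab, weights=topic_word[z])[0]
        assignments.append(z)
        words.append(w)
    return theta, assignments, words


# Hypothetical toy setup: four words, two non-overlapping topics.
vocab = ["neuron", "memory", "word", "topic"]
topic_word = [
    [0.5, 0.5, 0.0, 0.0],  # topic 0 only emits "neuron" and "memory"
    [0.0, 0.0, 0.5, 0.5],  # topic 1 only emits "word" and "topic"
]
rng = random.Random(0)
theta, assignments, words = generate_document(topic_word, [0.5, 0.5], vocab, 8, rng)
```

Fitting a topic model runs this process in reverse: given only the words, it estimates the hidden topic assignments, the per-document proportions, and the per-topic word distributions.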