

ML Lunch (Oct 7, 2013): Extracting Knowledge from Informal Text








Published on Oct 7, 2013

Speaker: Alan Ritter

The internet has revolutionized the way we communicate, leading to a constant flood of informal text available in electronic format, including email, Twitter, SMS, and the clinical text found in electronic medical records. This presents a big opportunity for Natural Language Processing (NLP) and Information Extraction (IE) technology to enable new large-scale data-analysis applications by extracting machine-processable information from unstructured text.

In this talk I will discuss several challenges and opportunities that arise when applying NLP and IE to informal text, focusing specifically on Twitter, which has recently risen to prominence, challenging the mainstream news media as the dominant source of real-time information on current events. I will describe several NLP tools we have adapted to handle Twitter’s noisy style, and present a system that leverages these tools to automatically extract a calendar of popular events occurring in the near future.

I will further discuss fundamental challenges that arise when extracting meaning from such massive open-domain text corpora. I will present several probabilistic latent variable models that infer the semantics of large numbers of words and phrases and enable a principled, modular approach to extracting knowledge from large open-domain text corpora.

For more ML Lunch talks:

