Alert icon
We're changing our privacy policy. This stuff matters.  Learn more  Dismiss

Forest-based Search Algorithms in Parsing and Machine Translation

Loading...

Sign in or sign up now!
15,074
Loading...
Alert icon
Sign in or sign up now!
Alert icon

Uploaded by on Mar 18, 2008

Google Tech Talks
March, 14 2008

ABSTRACT

Many problems in Natural Language Processing (NLP) involves an
efficient search for the best derivation over (exponentially) many
candidates, especially in parsing and machine translation. In these
cases, the
concept of "packed forest" provides a compact representation of the
huge search spaces, where efficient inference algorithms based on
Dynamic Programming (DP) are possible.

In this talk we address two important open problems within this
framework: exact k-best inference which is often used in NLP
pipelines such as parse reranking and MT rescoring, and approximate
inference when the search space is too big for exact search.

We first present a series of fast and exact k-best algorithms on
forests, which are orders of magnitudes faster than previously used
methods on state-of-the-art parsers such as Collins (1999). We then
extend these algorithms for approximate search when the forests are
too big for exact inference. We discuss two particular instances of
this new method, forest rescoring for MT decoding with integrated
language models, and forest reranking for discriminative parsing. In
the former, our methods perform orders of magnitudes faster than
conventional beam search on both state-of-the-art phrase-based and
syntax-based systems, with the same level of search error or
translation quality. In the latter, faster search also leads to
better learning, where our approximate decoding makes whole-Treebank
discriminative training practical and results in the best accuracy to
date for parsers trained on the Treebank.

This talk includes joint work with David Chiang (USC Information
Sciences Institute).

Liang Huang (2008). Forest Reranking: Discriminative Parsing with Non-
Local Features.
Proceedings of ACL 2008 (to appear).
http://www.cis.upenn.edu/~lhuang3/forest-rerank.pdf

Liang Huang and David Chiang (2007). Forest Rescoring: Faster
Decoding with Integrated Language Models.
Proceedings of ACL 2007.
http://www.cis.upenn.edu/~lhuang3/acl-cube.pdf

Liang Huang and David Chiang (2005). Better k-best Parsing.
Proceedings of IWPT 2005.
http://www.cis.upenn.edu/~lhuang3/huang-iwpt-correct.pdf

Speaker: Liang Huang
Liang Huang is a final-year PhD student at the University of
Pennsylvania, co-supervised by Aravind Joshi and Kevin Knight (USC/
ISI). He is mainly interested in the theoretical aspects of
computational linguistics, in particular, efficient algorithms in
parsing and machine translation, generic dynamic programming, and
formal properties of synchronous grammars. He also works on applying
computational linguistics to structural biology.

Category:

People & Blogs

Tags:

License:

Standard YouTube License

  • likes, 3 dislikes

Link to this comment:

Share to:
see all

All Comments (5)

Sign In or Sign Up now to post a comment!
  • What I deciphered was: post original, interesting, fast, and secure content that is relevant to your keywords on your blog or web site, I think?!

  • hmm..Its more academic oriented..this guy is moving too fast & running through slides..I am not able to understand..may be its too advanced level

    sorry i didnt like

  • ugh, I need the 'Forrest Based Search Algorithms yada yada...' for dummies version.

  • Excelent, thanks!

  • this is awesome...

Loading...

0 / 00Unsaved Playlist Return to active list
    1. Your queue is empty. Add videos to your queue using this button:
      or sign in to load a different list.
    Loading...Loading...Saving...
    • Clear all videos from this list
    • Learn more