Google Developers Day US - Theorizing from Data

22,614
Uploaded on Jun 5, 2007

Theorizing from Data: Avoiding the Capital Mistake
Peter Norvig
"It is a capital mistake to theorize before one has data." Sir Arthur Conan Doyle's words from 1891 remain true today. Researchers in computational linguistics and information retrieval now have a million times more data than was available 30 years ago. This talk explores what this data can do for problems in language understanding, translation, information extraction, and inference, and extrapolates to what more data may bring in the future.

Category:

News & Politics

License:

Standard YouTube License

Top Comments

  • Thanks, Peter Norvig. Your shirt gave me a seizure... ;)

  • The basic method of combining a probabilistic translation model with a language model is relatively old news (Brown et al., 1990), and the same criticisms that applied 17 years ago have not been answered here: what do you do with language pairs that differ?

    Now, if they manage to translate English-Klingon, that'd be impressive.

All Comments (15)

  • The one thing that bothers me about Google is that, over time, it trusts me less to type in what I really want. Sometimes the correction of my misspellings actually helps me, but most of the time all it does is prevent me (often with some difficulty) from searching for what I actually want to search for.

  • Pattern recognition pattern recognizer.

    Godel is laughing somewhere.

  • The most accurate translation is not always the best.

    Idiom, by definition, has no translation.

  • Quite interesting...

  • It still amazes me over and over again how smart some people can be. I'm getting my Professional Bachelor of Informatics in 2 months and I feel really dumb compared to these people. But then again, they have their years of experience and I only have my 3 years at college. I find this topic very interesting, though a little hard to understand at times.

  • He mentioned a DVD that Google sold which had their collection of English words. Anyone know how to obtain it?

    Any help will be VERY appreciated^^

  • You would put that into the search criteria, and have it search words within words (synthetic) or context (analytic). English is a mix of synthetic and analytic already, so you can see it already has those capabilities.

  • But what happens when you try aligning, e.g., polysynthetic languages such as Greenlandic (where a single word may express what in English would be a ten letter sentence) and analytic languages such as Chinese (where the average word length is, what, 2.5 letters?). There are a lot of challenges to be met, and it'd be very interesting to see how Norvig and the Google MT team are dealing with them.

  • Very interesting overview, but the question session at the end revealed a rather low competence among the audience, which is too bad -- there are some much more interesting theoretical questions to be asked. For one, this type of machine translation seems to be founded on having some sort of parallel aligned texts; this is relatively easy for German and English, as shown in the examples, since they're very similar languages both syntactically and lexically.
