Natural Language processing, short introduction.

Loading...

Sign in or sign up now!
Alert icon
Upgrade to the latest Flash Player for improved playback performance. Upgrade now or more info.
1,866
Loading...
Alert icon
Sign in or sign up now!
Alert icon
There is no Interactive Transcript.

Uploaded by on Oct 22, 2010

Natural Language processing (NLP) is a field of computer science and linguistics concerned with the interactions between computers and human languages.
Natural-language understanding is sometimes referred to as an Artifial Inelligence-complete problem, because natural-language recognition seems to require extensive knowledge about the outside world and the ability to manipulate it, and is often considered a sub-field of artificial intelligence.

Modern NLP algorithms are grounded especially in statistical machine learning. Research into modern statistical NLP algorithms requires an understanding of a number of disparate fields, including linguistics, computer science, statistics (particularly Bayesian statistics), linear algebra and optimization theory.

Prior implementations of language-processing tasks typically involved the direct hand coding of large sets of rules. The machine-learning paradigm often uses statistical inference to automatically learn such rules through the analysis of a set of documents that have been hand-annotated with the correct values to be learned. This documents are named corpus.

As an example, consider the task of determining the correct part of speech of each word in a given sentence, typically one that has never been seen before. A typical machine-learning-based implementation of a part of speech tagger proceeds in two steps, a training step and an evaluation step.

The training step : makes use of a corpus of training data, which consists of a large number of sentences, each of which has the correct part of speech attached to each word.

This corpus is analyzed and a learning model is generated from it, consisting of automatically-created rules for determining the part of speech for a word in a sentence, typically based on the nature of the word in question, the nature of surrounding words, and the most likely part of speech for those surrounding words.

The model that is generated is typically the best model that can be found that simultaneously meets two conflicting objectives: To perform as well as possible on the training data, and to be as simple as possible (so that the model avoids overfitting the training data.

In The evaluation step, the model that has been learned is used to process new previously unseen data. It is critical that the data used for testing is not the same as the data used for training; otherwise, the testing accuracy will be unrealistically high.

Many different classes of machine learning algorithms have been applied to NLP tasks.

Research has focused on statistical models, which make soft, probabilistic decisions based on attaching real-valued weights to each input feature.

Such models have the advantage that they can express the relative certainty of many different possible answers rather than only one, producing more reliable results when such a model is included as a component of a larger system.

In addition, models that make soft decisions are generally more robust when given unfamiliar input, especially input that contains errors (as is very common for real-world data).

This was only an introduction, If you want to learn more watch next video. Thanks for your time.

Category:

Education

Tags:

License:

Standard YouTube License

  • likes, 1 dislikes

Link to this comment:

Share to:
see all

All Comments (0)

Sign In or Sign Up now to post a comment!
Loading...

Alert icon
0 / 00Unsaved Playlist Return to active list
    1. Your queue is empty. Add videos to your queue using this button:
      or sign in to load a different list.
    Loading...Loading...Saving...
    • Clear all videos from this list
    • Learn more