Matthew Honnibal - Designing spaCy: Industrial-strength NLP

Published on May 31, 2016

PyData Berlin 2016

The spaCy natural language processing (NLP) library features state-of-the-art performance and a high-level Python API. Efficiency is crucial for NLP because job sizes are constantly increasing. The key algorithms are also relatively complicated and frequently subject to change as new research is published. This talk describes how we’ve met these challenges in spaCy by implementing the library in Cython. Unlike many Cython users, we did not write the library in Python first and then optimize it. Instead, we designed the library as a C extension from the start and added the Python API on top. This allows us to build the library on efficient, memory-managed data structures without having to maintain a separate C or C++ codebase. The result is the fastest NLP library in the world, with support for GIL-free multithreading, a concise, readable codebase, and no compromise on user friendliness.
