Prodigy is a new, active learning-powered annotation tool from the makers of spaCy. In this video, we'll show you how to use Prodigy to train a classifier to detect disparaging or insulting comments. Prodigy makes text classification particularly powerful, because you can try out new ideas very quickly. The same approach can be used to solve problems such as sentiment analysis or chatbot intent detection.
IMPORTANT NOTES: Since this video was recorded, the `textcat.teach` command has changed in one detail: instead of a `--seeds` argument, you can now pass in `--patterns`, which lets you describe single words as well as more complex combinations of tokens based on their attributes. To convert a seed dataset to patterns, you can use the `terms.to-patterns` recipe. For more details, see here: https://support.prodi.gy/t/523
The `textcat.batch-train` command now also reads the labels from the data automatically, so you won't have to use the `--label` argument anymore.
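To make the difference between seed terms and patterns more concrete, here is a minimal sketch of how a list of seed words could be turned into token-based patterns, written out one JSON object per line. It is an illustration only, not the implementation of `terms.to-patterns`: the function names, the example seed terms and the `INSULT` label are all made up for this example, and splitting on whitespace is a simplifying assumption that stands in for real tokenization.

```python
import json


def seeds_to_patterns(seeds, label):
    """Turn plain seed terms into token-based patterns (illustrative helper)."""
    patterns = []
    for term in seeds:
        # One token description per whitespace-separated word.
        # Assumption: whitespace splitting approximates real tokenization.
        tokens = [{"lower": word.lower()} for word in term.split()]
        patterns.append({"label": label, "pattern": tokens})
    return patterns


def to_jsonl(patterns):
    """Serialize the patterns as newline-delimited JSON, one pattern per line."""
    return "\n".join(json.dumps(p) for p in patterns)


# Hypothetical seed terms and label, chosen only for illustration
seeds = ["idiot", "shut up"]
patterns = seeds_to_patterns(seeds, "INSULT")
print(to_jsonl(patterns))
```

Each pattern describes a sequence of tokens by attribute, so a multi-word term like "shut up" becomes two token descriptions, which is something a flat list of seed words could not express.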
We were pleased to invite the spaCy community and other folks working on Natural Language Processing to Berlin this summer for a small and intimate event on July 6, 2019. We booked a beautiful venue in one of Berlin's coolest neighborhoods, hand-picked an awesome lineup of speakers and scheduled plenty of social time to get to know each other and exchange ideas. https://irl.spacy.io/2019
In this new video series, data science instructor Vincent Warmerdam gets started with spaCy, an open-source library for Natural Language Processing in Python. His mission: building a system to automatically detect programming languages in large volumes of text. Follow his process from the first idea to a prototype, all the way to data collection and training a statistical named entity recognition model from scratch.