Mask R-CNN addresses the problem of instance segmentation: detecting and delineating each distinct object of interest in an image. Instance segmentation combines two sub-problems. The first is object detection, which is the problem of finding and classifying a variable number of objects in an image. The number is variable because how many objects are present can differ from image to image. The second is semantic segmentation, which is the understanding of an image at the pixel level. That is, we want to assign an object class to each pixel in the image. In this figure, with the motorcyclist, apart from recognizing the bike and the person riding it, we also have to delineate the boundaries of each object. Using object detection and semantic segmentation together, we get instance segmentation.
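To make the combination concrete, here is a minimal toy sketch in plain Python. It is not the Mask R-CNN architecture. The class ids, the `semantic_map` grid, the `detections` boxes, and the `instance_masks` helper are all hypothetical and made up for illustration. The idea it shows: a per-pixel class map alone cannot tell two people apart, but restricting that map to each detected box gives one mask per object.

```python
# Toy illustration only; all names and data here are assumptions for the example.
PERSON, BIKE = 1, 2

# Hypothetical semantic segmentation output: one class id per pixel (0 = background).
# Note that both people share the same id, so they are not separated yet.
semantic_map = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [2, 2, 2, 0, 1, 1],
    [2, 2, 2, 0, 0, 0],
]

# Hypothetical object detection output:
# (class_id, row_min, col_min, row_max, col_max), bounds inclusive.
detections = [
    (PERSON, 0, 1, 1, 2),
    (PERSON, 1, 4, 2, 5),
    (BIKE, 2, 0, 3, 2),
]


def instance_masks(semantic_map, detections):
    """Return one (class_id, binary mask) pair per detected object.

    A pixel belongs to an instance if it lies inside that detection's box
    and the semantic map assigns it the detection's class.
    """
    masks = []
    for class_id, r0, c0, r1, c1 in detections:
        mask = [
            [
                1 if (r0 <= r <= r1 and c0 <= c <= c1 and value == class_id) else 0
                for c, value in enumerate(row)
            ]
            for r, row in enumerate(semantic_map)
        ]
        masks.append((class_id, mask))
    return masks


masks = instance_masks(semantic_map, detections)
for class_id, mask in masks:
    pixel_count = sum(sum(row) for row in mask)
    print(class_id, pixel_count)
```

Each detection yields its own mask, so the two people end up as separate instances even though they share a class. A real model predicts masks with a learned network rather than by intersecting two outputs like this.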