The video shows RGB camera images which are pixel-wisely segmented by a CNN (FCN8s). The FCN8s was trained on the Cityscapes Dataset. The segmented output is further converted into a binary images, labeling static and potential dynamic objects. Only pixels labeled as static are retained for the continuous stereo camera calibration algorithm.