This study investigates the parallelization of the mean shift image segmentation algorithm on a single GPU, together with a task scheduling method that uses the MPI plus OpenCL programming model on a GPU cluster platform. It presents test results on Shilob, a GPU cluster platform at Louisiana State University, which show good speedups across different configurations on the authors' data. This article was authored by Fang Huang, Imjie Chen, Li Li, and others.