The VISEM-Tracking dataset is a resource for researchers developing computer vision algorithms to automate the assessment of sperm motility and kinematics. It contains 20 video recordings of 30 seconds each, totaling 29,196 frames of wet semen preparations, along with manually annotated bounding box coordinates and a set of sperm characteristics analyzed by domain experts. The dataset also includes unlabeled video clips for easy access and further analysis of the data. The dataset was used to train a YOLOv5 deep learning (DL) model, which achieved promising sperm detection performance. This demonstrates the dataset's potential for training complex DL models to analyze spermatozoa. The article was authored by Vajira Thambawita, Steven A. Hicks, Andrea M. Storås, and others.
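To make the bounding box annotations concrete, here is a minimal sketch of how one annotation line could be converted to pixel coordinates. It assumes the labels follow the common YOLO text format (`class x_center y_center width height`, all normalized to the range 0 to 1), which the summary above does not confirm. The function name `yolo_to_pixels` and the 640x480 frame size in the usage note are illustrative assumptions, not details taken from the dataset.

```python
from typing import Tuple


def yolo_to_pixels(line: str, img_w: int, img_h: int) -> Tuple[int, float, float, float, float]:
    """Convert one YOLO-style label line to (class, x_min, y_min, x_max, y_max) in pixels.

    Assumes the line holds: class x_center y_center width height,
    with the four box values normalized by the frame dimensions.
    """
    parts = line.split()
    cls = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])

    # Scale the normalized centre/size to pixel-space corners.
    x_min = (xc - w / 2) * img_w
    y_min = (yc - h / 2) * img_h
    x_max = (xc + w / 2) * img_w
    y_max = (yc + h / 2) * img_h
    return cls, x_min, y_min, x_max, y_max
```

For example, `yolo_to_pixels("0 0.5 0.5 0.25 0.5", 640, 480)` yields a box centred in a hypothetical 640x480 frame. The actual label layout and frame resolution should be checked against the dataset's own documentation before relying on this conversion.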