This paper examines the use of active learning models to reduce the screening workload in systematic reviews. It compares six models, using two feature extraction methods, on six datasets from different research areas. The models are evaluated on how much they reduce the number of publications that need to be screened while still finding 95% of all relevant records. The results show that the naive Bayes plus TF-IDF model yields the best performance overall. The paper also introduces a new metric, the Average Time to Discovery (ATD), which measures the performance of active learning models across the entire screening process without requiring an arbitrary recall cut-off; this makes the ATD a promising metric for comparing models across different datasets. The paper was authored by Gerbrich Ferdinands, Raoul Schram, Jonathan de Bruin, and others.
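To make the screening setup concrete, below is a minimal Python sketch of an active learning loop that pairs a naive Bayes classifier with TF-IDF features, the combination the paper found to work best. It uses scikit-learn and a toy corpus; the seed choice, the certainty-based query strategy, and all variable names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy corpus standing in for titles/abstracts; 1 = relevant record.
texts = [
    "deep learning for text screening",
    "active learning reduces screening workload",
    "a history of medieval agriculture",
    "bayesian classifiers for document triage",
    "recipes for sourdough bread",
    "screening prioritization with machine learning",
]
labels = np.array([1, 1, 0, 1, 0, 1])

X = TfidfVectorizer().fit_transform(texts)

# Seed the model with one relevant and one irrelevant screened record
# (an assumed starting point; the actual seeding may differ).
screened = [1, 2]

while len(screened) < len(texts):
    # Retrain on everything the human has screened so far.
    clf = MultinomialNB()
    clf.fit(X[screened], labels[screened])
    pool = [i for i in range(len(texts)) if i not in screened]
    # Certainty-based sampling: present the record most likely relevant.
    relevant_col = list(clf.classes_).index(1)
    scores = clf.predict_proba(X[pool])[:, relevant_col]
    nxt = pool[int(np.argmax(scores))]
    screened.append(nxt)  # the human screens it, revealing labels[nxt]

print(screened)  # relevant records should surface early in this order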
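The two evaluation ideas can likewise be sketched from a screening order. The exact definitions here are assumptions for illustration: recall@95% is taken as the number of records screened until 95% of relevant records are found, and ATD as the mean fraction of the dataset screened before each relevant record turns up.

```python
def records_to_screen_for_recall(screening_order, relevant, target=0.95):
    """Records screened until `target` recall is reached."""
    total_relevant = sum(relevant[i] for i in screening_order)
    found = 0
    for n, idx in enumerate(screening_order, start=1):
        found += relevant[idx]
        if found >= target * total_relevant:
            return n
    return len(screening_order)

def average_time_to_discovery(screening_order, relevant):
    """Mean fraction of the dataset screened before each relevant record."""
    n_total = len(screening_order)
    ranks = [n for n, idx in enumerate(screening_order, start=1) if relevant[idx]]
    return sum(ranks) / (len(ranks) * n_total)

# Toy example: 10 records, relevant ones at ranks 2, 5, and 9.
order = list(range(10))
flags = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
print(records_to_screen_for_recall(order, flags))  # -> 9
print(average_time_to_discovery(order, flags))     # -> (2 + 5 + 9) / 30 = 0.533...
```

Unlike the 95%-recall count, the ATD-style average uses the rank of every relevant record, which is why no cut-off point has to be chosen.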