This paper reviews metrics commonly used to evaluate polyphonic sound event detection systems, that is, systems operating in realistic conditions where multiple sound sources are active simultaneously. The metrics are presented in segment-based and event-based definitions, with instance-based and class-based averaging. A toolbox containing implementations of these metrics is provided. This article was authored by Annamaria Mesaros, Toni Heittola, and Tuomas Virtanen.
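
To make the distinction between the two averaging schemes concrete, the following is a minimal sketch of a segment-based F-score computed with both instance-based (micro) and class-based (macro) averaging. It does not use the toolbox mentioned above; the function name `segment_f1_scores`, the set-per-segment input layout, and the example labels are assumptions introduced here for illustration only.

```python
from typing import Dict, List, Set


def segment_f1_scores(
    reference: List[Set[str]], estimated: List[Set[str]]
) -> Dict[str, float]:
    """Segment-based F-score with two averaging schemes (illustrative sketch).

    reference[i] and estimated[i] are the sets of class labels marked active
    in segment i. This input layout is an assumption of this example.
    """
    classes = sorted(set().union(*reference, *estimated))
    counts = {c: {"tp": 0, "fp": 0, "fn": 0} for c in classes}

    # Compare reference and estimated activity per segment and per class.
    for ref, est in zip(reference, estimated):
        for c in classes:
            if c in ref and c in est:
                counts[c]["tp"] += 1
            elif c in est:
                counts[c]["fp"] += 1
            elif c in ref:
                counts[c]["fn"] += 1

    def f1(tp: int, fp: int, fn: int) -> float:
        denom = 2 * tp + fp + fn
        return 2 * tp / denom if denom else 0.0

    # Instance-based: pool the counts over all classes, then compute F1 once.
    total = {k: sum(counts[c][k] for c in classes) for k in ("tp", "fp", "fn")}
    instance_based = f1(total["tp"], total["fp"], total["fn"])

    # Class-based: compute F1 per class, then take the unweighted mean.
    per_class = [f1(**counts[c]) for c in classes]
    class_based = sum(per_class) / len(per_class) if per_class else 0.0

    return {"instance_based": instance_based, "class_based": class_based}


# Four segments, two hypothetical classes.
reference = [{"speech", "car"}, {"speech"}, {"speech"}, {"car"}]
estimated = [{"speech"}, {"speech"}, {"speech", "car"}, set()]
scores = segment_f1_scores(reference, estimated)
print(scores)
```

In this example the system detects every "speech" segment but never detects "car" correctly, so the pooled (instance-based) score stays higher than the class-based score, which weights the rarely detected class equally with the frequent one.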