This study proposes combining Mel-frequency cepstral coefficients (MFCCs) with time-domain features to improve the performance of speech emotion recognition (SER). A convolutional neural network (CNN) was trained on the resulting hybrid features, achieving higher accuracy than competing models. The approach outperforms existing methods on three different datasets, demonstrating its effectiveness in identifying emotions from audio signals. The article was authored by Ala Saleh El Hayden, Ameema Sedani, Rashi Jahangir, and others.
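To make the hybrid-feature idea concrete, the sketch below builds a per-frame feature matrix by concatenating MFCC-style coefficients with two common time-domain features (RMS energy and zero-crossing rate). This is an illustrative NumPy-only sketch, not the authors' pipeline: the frame sizes, filterbank size, choice of time-domain features, and the synthetic test tone are all assumptions; in practice a library such as librosa is typically used for MFCC extraction, and the resulting matrix would be fed to the CNN.

```python
import numpy as np

def frame_signal(x, frame_len=512, hop=256):
    # Slice the signal into overlapping frames (n_frames, frame_len).
    n = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n)[:, None]
    return x[idx]

def time_domain_features(frames):
    # Two illustrative time-domain features per frame:
    # RMS energy and zero-crossing rate.
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    zcr = np.mean(np.abs(np.diff(np.signbit(frames).astype(int), axis=1)), axis=1)
    return np.stack([rms, zcr], axis=1)          # (n_frames, 2)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the Mel scale.
    hz2mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel2hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mels = np.linspace(0.0, hz2mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel2hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        if c > l:
            fb[i - 1, l:c] = (np.arange(l, c) - l) / (c - l)
        if r > c:
            fb[i - 1, c:r] = (r - np.arange(c, r)) / (r - c)
    return fb

def mfcc(frames, sr, n_mels=20, n_coeffs=13):
    # Windowed power spectrum -> Mel filterbank -> log -> DCT-II.
    n_fft = frames.shape[1]
    power = np.abs(np.fft.rfft(frames * np.hamming(n_fft), axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(sr, n_fft, n_mels).T + 1e-10)
    k = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs), (2 * k + 1) / (2.0 * n_mels)))
    return log_mel @ dct.T                       # (n_frames, n_coeffs)

sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 220.0 * t)           # 1 s synthetic tone stands in for speech
frames = frame_signal(signal)
# Hybrid feature matrix: 13 MFCCs + 2 time-domain features per frame.
hybrid = np.hstack([mfcc(frames, sr), time_domain_features(frames)])
print(hybrid.shape)
```

Each row of `hybrid` describes one frame with 15 values; stacking such rows over time yields the 2-D input a CNN can convolve over.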