This paper proposes a novel facial expression recognition system called the Distract Your Attention Network (DAN). It combines three components: a feature clustering network, a multi-head attention network, and an attention fusion network. The authors report state-of-the-art results on three public datasets. The paper was authored by Zhengyao Wen, Wenzhong Lin, Tao Wang, and others.
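To make the idea of several attention heads feeding a fusion step more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`attention_head`, `fuse_heads`, `multi_head_pool`), the per-head weight vectors, and the choice to fuse by simple summation are all assumptions made for illustration, and the feature clustering component is left out entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(features, weights):
    """One hypothetical attention head.

    features: list of N position vectors, each with C channels.
    weights:  list of C floats (an assumed linear scoring projection).
    Returns the attention-weighted feature vector and the attention map.
    """
    # One scalar score per position, then normalize across positions.
    scores = [sum(w * x for w, x in zip(weights, vec)) for vec in features]
    attn = softmax(scores)
    channels = len(features[0])
    # Attention-weighted average of the position vectors.
    pooled = [
        sum(a * vec[c] for a, vec in zip(attn, features))
        for c in range(channels)
    ]
    return pooled, attn


def fuse_heads(head_outputs):
    """Assumed fusion rule: element-wise sum of the head outputs."""
    return [sum(vals) for vals in zip(*head_outputs)]


def multi_head_pool(features, head_weights):
    """Run every head on the same features and fuse the results."""
    outputs = [attention_head(features, w)[0] for w in head_weights]
    return fuse_heads(outputs)
```

Each head scores every position with its own weights, so different heads can attend to different regions of the same feature map; the fusion step then merges their outputs into a single vector that a classifier could consume.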