This review surveys recent articles on remote sensing image scene classification with deep learning (DL) and groups them into three main categories: convolutional neural network (CNN)-based, vision transformer (ViT)-based, and generative adversarial network (GAN)-based architectures. A meta-analysis of 50 peer-reviewed journal articles provides insights into this domain and shows that the most widely adopted remote sensing scene datasets are AID and NWPU-RESISC45. The review also identifies challenges and future opportunities for improvement, making it a valuable resource for researchers seeking to contribute to this growing area of research. This article was authored by Aakash Thapa, Teerayut Horanont, Bipul Neupane, and others.