 This paper presents a new approach to automatic water-body extraction based on a hybrid-mix-former architecture. It combines the strengths of convolutional neural networks (CNNs) and transformers to capture both local and global contextual information from remote-sensing images. The resulting Immunet architecture outperforms existing methods in segmentation accuracy and robustness. This article was authored by Yonghang Zhang, Huang Yulu, Guanyi Ma, and others.
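
The summary does not describe the network's internals, so the following is only a toy illustration of the general idea of pairing a local (convolution-style) branch with a global (attention-style) branch. It is not the paper's architecture. It works on a 1D list of numbers instead of an image, uses no learned weights, and all function names (`local_conv1d`, `global_self_attention`, `hybrid_block`) are hypothetical.

```python
import math


def local_conv1d(x, kernel):
    """Local context: each output mixes only a small neighbourhood (zero-padded)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def global_self_attention(x):
    """Global context: each output is a softmax-weighted mix of ALL positions.

    Assumption: scores are plain products x[i] * x[j]; a real transformer
    would use learned query/key/value projections.
    """
    out = []
    for xi in x:
        scores = [xi * xj for xj in x]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * xj for w, xj in zip(weights, x)) / total)
    return out


def hybrid_block(x, kernel=(0.25, 0.5, 0.25)):
    """Add the local and global branches to the input (residual connection)."""
    local = local_conv1d(x, kernel)
    glob = global_self_attention(x)
    return [xi + l + g for xi, l, g in zip(x, local, glob)]


# A single "bright pixel" in the middle of an otherwise empty signal.
signal = [0.0, 0.0, 1.0, 0.0, 0.0]
features = hybrid_block(signal)
```

In this sketch the local branch leaves the two end positions at zero because the bright value lies outside their three-sample window, while the attention branch gives every position a nonzero response because it sees the whole signal. That difference is the intuition behind combining the two kinds of layers.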