Abstract:
Smart agriculture aims to improve crop monitoring through automated, accurate analysis of plant health. A critical task in this domain is disease severity estimation, which identifies the progression stage of a plant infection. In this work, we propose a deep learning-based solution built on two transformer architectures: Vision Transformer (ViT) and Swin Transformer. We implement and evaluate each model, then combine them into a novel architecture that leverages ViT's global attention and Swin's hierarchical local attention for fine-grained severity classification. The models are trained on the Wheat Yellow Rust dataset, which includes six severity stages. Results show that the combined model outperforms the individual baselines, providing an effective solution for automated severity estimation.