Accurate Medicinal Plant Identification in Natural Environments by Embedding Mutual Information in a Convolution Neural Network Model

Shahmiri, Lida; Wong, Patrick and Dooley, Laurence (2022). Accurate Medicinal Plant Identification in Natural Environments by Embedding Mutual Information in a Convolution Neural Network Model. In: Fifth IEEE International Conference on Image Processing, Applications and Systems (IPAS 2022), 5-7 Dec 2022, Genova.

Abstract

Medicinal plants are a primary source of disease treatment in many countries. As most are edible however, consumption of the wrong herbal plants can have serious consequences and even lead to death. Automatic accurate recognition of plant species to help users who do not have specialist knowledge of herbal plants is thus a desirable aim. Several automatic medicinal plant identification systems have been proposed, though most are significantly constrained either in the small number of species or in requiring manual image segmentation of plant leaves. This means they are captured on a plain background rather than being readily identified in their natural surroundings, which often involve complex and noisy backgrounds. While deep learning (DL) based methods have made considerable strides in recent times, their potential has not always been maximised because they are trained with samples which are not always fully representative of the intra-class and inter-class differences between the plant species concerned. This paper addresses this challenge by incorporating mutual information into a Convolutional Neural Network (CNN) model to select samples for the training, validation, and testing sets based on a similarity measure. A critical comparative evaluation of this new CNN medicinal plant classification model incorporating a mutual information guided training (MIGT) algorithm for sample selection, corroborates the superior classification performance achieved for the VNPlant-200 dataset, with an average accuracy of more than 97%, while the precision and recall values are also consistently above 97%. This is significantly better than existing CNN classification methods for this dataset as it crucially means false positive rates are substantially lower thus affording improved identification reliability.

Viewing alternatives

Download history

Item Actions

Export

About

Recommendations