Abstract Details
Title
Cellphone picture-based automated identification of Chagas disease vectors: effects of bug genus and image orientation on the accuracy of five machine-learning algorithms
Introduction
Automated identification of triatomine bugs can help strengthen Chagas disease surveillance. To be broadly useful, automated systems must be capable of accurately identifying bugs pictured with digital cameras at varying angles or positions.
Objective(s)
We assess the accuracy of 5 machine-learning algorithms at identifying Chagas disease vector genera (Triatoma, Panstrongylus, and Rhodnius) based on bugs pictured at different angles/positions with an ordinary cellphone camera.
Materials and Methods
We studied 730 bugs of 13 species. Each bug was pictured with a Moto G6 Play cellphone camera at 9 angles representing 3 positions: dorsal-flat, dorsal-oblique, and front/back-oblique. We randomly split the 6570-picture database into a training set (80%) and a testing set (20%), and then trained and tested 5 algorithms: a pre-trained convolutional neural network (AlexNet, AN), 3 boosting-based classifiers (Multi-Class AdaBoost, AB; Gradient Boosting, GB; Histogram-Based Gradient Boosting, HB), and a linear discriminant model (LD). To gauge performance consistency, we tested each algorithm in 10 pseudo-replicate runs. We assessed accuracy using logit-binomial generalized linear mixed models (GLMMs) with fixed (algorithm; bug genus; photograph angle or position) and random effects (specimen; pseudo-replicate run). Models were fit in a Bayesian framework and evaluated using Akaike’s information criterion.
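The split arithmetic described above can be sketched as follows. This is a minimal illustration, not the authors' code: the picture identifiers, the function names, and the choice to redraw the split with a new seed in each pseudo-replicate run are assumptions, since the abstract does not say how runs were generated. The split is done at picture level, as the text describes, so pictures of the same specimen can fall in both sets.

```python
import random

N_BUGS = 730          # specimens reported in the abstract
ANGLES_PER_BUG = 9    # pictures per specimen reported in the abstract
TRAIN_FRACTION = 0.8  # 80% training, 20% testing
N_RUNS = 10           # pseudo-replicate runs

def make_picture_ids(n_bugs=N_BUGS, n_angles=ANGLES_PER_BUG):
    """Hypothetical picture identifiers: one (bug, angle) pair per picture."""
    return [(bug, angle) for bug in range(n_bugs) for angle in range(n_angles)]

def split_pictures(pictures, train_fraction=TRAIN_FRACTION, seed=0):
    """Shuffle a copy of the picture list and cut it into train and test sets."""
    rng = random.Random(seed)
    shuffled = pictures[:]
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

pictures = make_picture_ids()
# Assumption: each pseudo-replicate run uses its own random split.
runs = [split_pictures(pictures, seed=run) for run in range(N_RUNS)]
train, test = runs[0]
print(len(pictures), len(train), len(test))  # prints: 6570 5256 1314
```

With 730 specimens and 9 pictures each, the database holds 6570 pictures, of which 5256 go to training and 1314 to testing.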
Results and Conclusion
Overall accuracy was variable, with correct identification of individual pictures ranging from 69.1% (AB) to 100% (AN). The top-ranking GLMM revealed that differences across algorithms were mainly driven by the poorer performance of AB with Rhodnius (predicted accuracy 30.8%; 95% CI 25–37.4) and Panstrongylus (65.8%; 56.9–73.7) specimens. For Triatoma, predicted accuracies ranged from 97.5% (96.8–98.2) for AB to 100% (99.99–100) for AN. For GB (95.2%; 93–96.7) and LD (94%; 91.3–95.9), accuracies were somewhat lower for Panstrongylus identifications. Dorsal-flat photographs appeared to improve accuracy slightly for all algorithms, but variation across angles and positions was small overall. Accuracy varied widely across specimens, but very little across algorithm-test pseudo-replicate runs. When machine-learning algorithms as highly accurate as AN are used, neither genus-level taxonomy nor the angle/position at which bugs are photographed seems to pose any problems for cellphone picture-based automated identification of Chagas disease vectors.
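Because the GLMMs use a logit link, predicted accuracies are estimated on the log-odds scale and then back-transformed to percentages. The sketch below illustrates that transformation using the figures reported for AB with Rhodnius. It assumes the interval was built on the logit scale and back-transformed, which the abstract does not state; the function and variable names are illustrative only.

```python
import math

def logit(p):
    """Map a probability in (0, 1) to the log-odds scale."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Reported for AB with Rhodnius: 30.8% (95% CI 25-37.4)
estimate, lower, upper = 0.308, 0.25, 0.374

mid, lo, hi = logit(estimate), logit(lower), logit(upper)
lower_width = mid - lo   # distance from lower bound to estimate, in log-odds
upper_width = hi - mid   # distance from estimate to upper bound, in log-odds

print(round(lower_width, 2), round(upper_width, 2))
print(round(inv_logit(mid), 3))  # back-transforming recovers the estimate
```

The interval is asymmetric on the percentage scale (5.8 points below the estimate, 6.6 above) but close to symmetric in log-odds, which is the pattern expected from a back-transformed logit-scale interval.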
Keywords
Triatoma; Panstrongylus; Rhodnius
Acknowledgments
CAPES, CNPq, US National Science Foundation
Area
Track 04 | Entomology / Vector Control
Category
Competing for the Young Researcher Award - Doctorate
Authors
Vinícius Lima de Miranda, Ewerton Pacheco de Souza, Déborah Bambil, Ali Khalighifar, A Townsend Peterson, Francisco Assis de Oliveira Nascimento, Fernando Abad-Franch, Rodrigo Gurgel-Gonçalves