Precision seed certification through machine learning

Akram Ghaffari; Akram Ghaffari

doi:10.48130/tia-0024-0013

2024 Volume 4

Article Contents

Next Previous

REVIEW Open Access

Precision seed certification through machine learning

Akram Ghaffari^,

1.
Molecular Markers Lab, Registration and Identification of Plant Varieties Department, Seed and Plant Certification and Registration Institute (SPCRI), Agricultural Research, Education, and Extension Organization (AREEO), 31535-1516 Karaj, Iran

More Information

Corresponding author: a.ghaffari@areeo.ac.ir

Received: 02 December 2023
Revised: 05 May 2024
Accepted: 07 May 2024
Published online: 23 July 2024
Technology in Agronomy 4, Article number: e019 (2024) | Cite this article

Abstract

Original and pure seeds are the most important factors for sustainable agricultural production, development, and food security. Conventionally, seed protection and certification programs are carried out on several classes from breeders to certify seed based on a physical, biochemical, and genetic evaluation to approve seed as a cultivar. In seed industries, quality assurance programs depend on different methods for certifying seed quality characteristics such as seed viability and varietal purity. Those methods are mostly conducted in a less cost-effective and timely manner. Combining machine learning (ML) algorithms and optical sensors can provide reliable, accurate, non-destructive, and quick pipelines for seed quality assessments. ML employs various classifiers to authenticate and recognize varieties through K-means, Support Vector Machines (SVM), Discriminant Analysis (DA), Naive Bayes (NB), Random Forest (RF) and Artificial Neural Networks (ANNs). In recent years, progress in ANN algorithms as deep learning simplifies big data analytics procedures by categorizing learning and extracting distinct levels of multiplex data. Deep learning opened a new door for developing a smartphone as a fast and robust substitute for the online seed variety discrimination stage through developing a Convolutional neural networks (CNNs) model-based mobile app. This review presents machine learning and seed quality assessment areas to recognize and classify seeds through long-standing and novel ML algorithms.
- High-throughput phenotyping,
- Machine learning,
- Seeds certification,
- ML algorithms,
- Artificial Neural Networks (ANNs),
- Convolutional Neural Networks (CNNs).
Rights and permissions
Copyright: © 2024 by the author(s). Published by Maximum Academic Press, Fayetteville, GA. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.

References

[1]	Naghashzadeh MR, Azadbakht N. 2018. Principles of seed production and certification. Iran: Agricultural Institute of Applied Scientific and Skilled Higher Education. 81 pp.
[2]	Osroush S. 2010. Cereal registration and certification. Iran: Seed and plant certification and registration institute, Agricultural Research, Education, and Extension Organization. 29 pp.
[3]	Akbarzai DK. 2019. Methods of genetic purity testing. Meerut, India: Chaudhary Charan Singh University.
[4]	Zhang S, Li B, Chen Y, Shaibu AS, Zheng H, et al. 2020. Molecular-assisted distinctness and uniformity testing using SLAF-sequencing approach in soybean. Genes 11:175 doi: 10.3390/genes11020175 CrossRef Google Scholar
[5]	Parimala K, Subramanian K, Mahalinga Kannan S, Vijayalakshmi K. 2013. A manual on seed production and certification. India: Centre for Indian Knowledge Systems, Chennai, Revitalising Rainfed Agriculture Network.
[6]	Singh A, Ganapathysubramanian B, Singh AK, Sarkar S. 2016. Machine learning for high-throughput stress phenotyping in plants. Trends in Plant Science 21:110−24 doi: 10.1016/j.tplants.2015.10.015 CrossRef Google Scholar
[7]	Khoshroo A, Arefi A, Masoumiasl A, Jowkar GH. 2014. Classification of wheat cultivars using image processing and artificial neural networks. Agricultural Communications 2:17−22 Google Scholar
[8]	Guevara-Hernandez F, Gomez-Gil J. 1970. A machine vision system for classification of wheat and barley grain kernels. Spanish Journal of Agricultural Research 9:672−80 Google Scholar
[9]	Pazoki A, Farokhi F, Pazoki Z. 2014. Classification of rice grain varieties using two Artificial Neural Networks (MLP and Neuro-Fuzzy). The Journal of Animal & Plant Sciences 24:336−43 Google Scholar
[10]	Ali A, Qadri S, Mashwani WK, Brahim Belhaouari S, Naeem S, et al. 2020. Machine learning approach for the classification of corn seed using hybrid features. International Journal of Food Properties 23:1110−24 doi: 10.1080/10942912.2020.1778724 CrossRef Google Scholar
[11]	Zheng C, Abd-Elrahman A, Whitaker V. 2021. Remote sensing and machine learning in crop phenotyping and management, with an emphasis on applications in strawberry farming. Remote Sensing 13:531 doi: 10.3390/rs13030531 CrossRef Google Scholar
[12]	Qadri S, Furqan Qadri S, Razzaq A, Ul Rehman M, Ahmad N, et al. 2021. Classification of canola seed varieties based on multi-feature analysis using computer vision approach. International Journal of Food Properties 24:493−504 doi: 10.1080/10942912.2021.1900235 CrossRef Google Scholar
[13]	Qian Y, Xu Q, Yang Y, Lu H, Li H, et al. 2021. Classification of rice seed variety using point cloud data combined with deep learning. International Journal of Agricultural and Biological Engineering 14:206−12 doi: 10.25165/j.ijabe.20211405.5902 CrossRef Google Scholar
[14]	Taheri-Garavand A, Nasiri A, Fanourakis D, Fatahi S, Omid M, et al. 2021. Automated In Situ Seed Variety Identification via Deep Learning: A Case Study in Chickpea. Plants 10:1406 doi: 10.3390/plants10071406 CrossRef Google Scholar
[15]	Chowdhury SH, Sany MR, Ahamed MH, Das SK, Badal FR, et al. 2023. A state-of-the-art computer vision adopting non-euclidean deep-learning models. International Journal of Intelligent Systems 2023:1−33 doi: 10.1155/2023/8674641 CrossRef Google Scholar
[16]	Macuácua JC, Centeno JAS, Amisse C. 2023. Data mining approach for dry bean seeds classification. Smart Agricultural Technology 5:100240 doi: 10.1016/j.atech.2023.100240 CrossRef Google Scholar
[17]	Liakos K, Busato P, Moshou D, Pearson S, Bochtis D. 2018. Machine Learning in Agriculture: A Review. Sensors 18:2674 doi: 10.3390/s18082674 CrossRef Google Scholar
[18]	Liaghat S, Balasundram SK. 2010. A review: The role of remote sensing in precision agriculture. American journal of agricultural and biological sciences 5:50−55 doi: 10.3844/ajabssp.2010.50.55 CrossRef Google Scholar
[19]	Jha K, Doshi A, Patel P, Shah M. 2019. A comprehensive review on automation in agriculture using artificial intelligence. Artificial Intelligence in Agriculture 2:1−12 doi: 10.1016/j.aiia.2019.05.004 CrossRef Google Scholar
[20]	van Dijk ADJ, Kootstra G, Kruijer W, de Ridder D. 2021. Machine learning in plant science and plant breeding. iScience 24:101890 doi: 10.1016/j.isci.2020.101890 CrossRef Google Scholar
[21]	Chaugule A. 2021. Survey of Seed Classification techniques. Turkish Journal of Computer and Mathematics Education 12:1236−60 doi: 10.17762/turcomat.v12i13.8678 CrossRef Google Scholar
[22]	Basheer IA, Hajmeer M. 2000. Artificial neural networks: fundamentals, computing, design, and application. Journal of Microbiological Methods 43:3−31 doi: 10.1016/S0167-7012(00)00201-3 CrossRef Google Scholar
[23]	Giordani DS, Siqueira AF, Silva MLCP, Oliveira PC, de Castro HF. 2008. Identification of the biodiesel source using an electronic nose. Energy & Fuels 22:2743−47 doi: 10.1021/ef700760b CrossRef Google Scholar
[24]	Elsheikh AH, Sharshir SW, Abd Elaziz M, Kabeel AE, Wang G, et al. 2019. Modeling of solar energy systems using artificial neural network: A comprehensive review. Solar Energy 180:622−39 doi: 10.1016/j.solener.2019.01.037 CrossRef Google Scholar
[25]	Almasi F, Soltanian S, Hosseinpour S, Aghbashlo M, Tabatabaei M. 2018. Advanced soft computing techniques in biogas production technology. In Biogas. Biofuel and Biorefinery Technologies, eds. Tabatabaei M, Ghanavati H. Cham: Springer. pp. 387−417. https://doi.org/10.1007/978-3-319-77335-3_15
[26]	Aghbashlo M, Peng W, Tabatabaei M, Kalogirou SA, Soltanian S, et al. 2021. Machine learning technology in biodiesel research: A review. Progress in Energy and Combustion Science 85:100904 doi: 10.1016/j.pecs.2021.100904 CrossRef Google Scholar
[27]	Kohonen T. 1989. Self-organizing feature maps. In Self-organization and associative memory. Berlin, Heidelberg: Springer. pp. 119−57. https://doi.org/10.1007/978-3-642-88163-3_5
[28]	Zupan J, Gasteiger J. 1991. Neural networks: A new method for solving chemical problems or just a passing phase? Analytica Chimica Acta 248:1−30 doi: 10.1016/s0003-2670(00)80865-x CrossRef Google Scholar
[29]	Ghamari S. 2012. Classification of chickpea seeds using supervised and unsupervised artificial neural networks. African Journal of Agricultural Reseearch 7:3193−201 doi: 10.5897/ajar11.2071 CrossRef Google Scholar
[30]	Jiang H. 2020. The analysis of plants image recognition based on deep learning and artificial neural network. IEEE Access 8:68828−41 doi: 10.1109/ACCESS.2020.2986946 CrossRef Google Scholar
[31]	Pal SK, Mitra S. 1992. Multilayer perceptron, fuzzy sets, classifiaction. IEEE Transactions on Neural Networks 3(5):683−97 doi: 10.1109/72.159058 CrossRef Google Scholar
[32]	Hecht-Nielsen R. 1989. Neurocomputing. Boston, MA, United States: Addison-Wesley Longman Publishing Co., Inc. 433 pp.
[33]	Geetha M. 2020. Forecasting the crop yield production in trichy district using fuzzy C-means algorithm and multilayer perceptron (MLP). International Journal of Knowledge and Systems Science (IJKSS) 11:83−98 doi: 10.4018/IJKSS.2020070105 CrossRef Google Scholar
[34]	Granitto PM, Verdes PF, Ceccatto HA. 2005. Large-scale investigation of weed seed identification by machine vision. Computers and Electronics in Agriculture 47:15−24 doi: 10.1016/j.compag.2004.10.003 CrossRef Google Scholar
[35]	Paliwal J, Visen NS, Jayas DS, White NDG. 2003. Cereal grain and dockage identification using machine vision. Biosystems Engineering 85:51−57 doi: 10.1016/S1537-5110(03)00034-5 CrossRef Google Scholar
[36]	Ghorbani MA, Zadeh HA, Isazadeh M, Terzi O. 2016. A comparative study of artificial neural network (MLP, RBF) and support vector machine models for river flow prediction. Environmental Earth Sciences 75:476 doi: 10.1007/s12665-015-5096-x CrossRef Google Scholar
[37]	Pourreza A, Pourreza H, Abbaspour-Fard MH, Sadrnia H. 2012. Identification of nine Iranian wheat seed varieties by textural analysis with image processing. Computers and Electronics in Agriculture 83:102−8 doi: 10.1016/j.compag.2012.02.005 CrossRef Google Scholar
[38]	Vilar WTS, Aranha RM, Medeiros EP, Pontes MJC. 2014. Classification of individual castor seeds using digital imaging and multivariate analysis. Journal of the Brazilian Chemical Society 26(1):102−9 doi: 10.5935/0103-5053.20140221 CrossRef Google Scholar
[39]	Chen X, Xun Y, Li W, Zhang J. 2010. Combining discriminant analysis and neural networks for corn variety identification. Computers and Electronics in Agriculture 71:S48−S53 doi: 10.1016/j.compag.2009.09.003 CrossRef Google Scholar
[40]	Venora G, Grillo O, Ravalli C, Cremonini R. 2009. Identification of Italian landraces of bean (Phaseolus vulgaris L.) using an image analysis system. Scientia Horticulturae 121:410−18 doi: 10.1016/j.scienta.2009.03.014 CrossRef Google Scholar
[41]	Chtioui Y, Bertrand D, Dattée Y, Devaux MF. 1996. Identification of seeds by colour imaging: Comparison of discriminant analysis and artificial neural network. Journal of the Science of Food and Agriculture 71:433−41 doi: 10.1002/(SICI)1097-0010(199608)71:4<433::AID-JSFA596>3.0.CO;2-B CrossRef Google Scholar
[42]	Vapnik VN. 2000. The nature of statistical learning theory. New York, NY: Springer Science & Business Media. 314 pp. https://doi.org/10.1007/978-1-4757-3264-1
[43]	Auria L, Moro RA. 2008. Support vector machines (SVM) as a technique for solvency analysis. Berlin: Deutsches Institut für Wirtschaftsforschung (DIW). DIW Discussion Papers 811.
[44]	Brereton RG, Lloyd GR. 2010. Support vector machines for classification and regression. Analyst 135:230−67 doi: 10.1039/B918972F CrossRef Google Scholar
[45]	Feng L, Zhang Z, Ma Y, Du Q, Williams P, et al. 2020. Alfalfa yield prediction using UAV-based hyperspectral imagery and ensemble learning. Remote Sensing 12:2028 doi: 10.3390/rs12122028 CrossRef Google Scholar
[46]	Cervantes J, Garcia-Lamont F, Rodríguez-Mazahua L, Lopez A. 2020. A comprehensive survey on support vector machine classification: Applications, challenges and trends. Neurocomputing 408:189−215 doi: 10.1016/j.neucom.2019.10.118 CrossRef Google Scholar
[47]	Namias R, Gallo C, Craviotto RM, Arango MR, Granitto PM. 2012. Automatic grading of green intensity in soybean seeds. Proc. XIII Argentine Symposium on Artificial Intelligence (ASAI 2012), Argentine, 2012. Argentine: Argentine Society of Computing and Operational Research. pp. 96−104. https://41jaiio.sadio.org.ar/sites/default/files/9_ASAI_2012.pdf
[48]	Yang X, Hong H, You Z, Cheng F. 2015. Spectral and image integrated analysis of hyperspectral data for waxy corn seed variety classification. Sensors 15:15578−94 doi: 10.3390/s150715578 CrossRef Google Scholar
[49]	Kiratiratanapruk K, Sinthupinyo W. 2011. Color and texture for corn seed classification by machine vision. Proc. 2011 International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS), Chiang Mai, Thailand, 7-9 December 2011. USA: IEEE. pp. 1−5. https://doi.org/10.1109/ISPACS.2011.6146100
[50]	Zhao M, Wu W, Zhang YQ, Li X. 2011. Combining genetic algorithm and SVM for corn variety identification. Proc. 2011 International Conference on Mechatronic Science, Electric Engineering and Computer (MEC), Jilin, China, 19-22 August 2011. USA: IEEE. pp. 990−93. https://doi.org/10.1109/MEC.2011.6025631
[51]	Breiman L. 2001. Random forests. Machine Learning 45:5−32 doi: 10.1023/A:1010933404324 CrossRef Google Scholar
[52]	Lepetit V, Lagger P, Fua P. 2005. Randomized trees for real-time keypoint recognition. Proc. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, 20-25 June 2005. USA: IEEE. pp. 775-81. https://doi.org/10.1109/CVPR.2005.288
[53]	Mokry FB, Higa RH, de Alvarenga Mudadu M, Oliveira de Lima A, Meirelles SLC, et al. 2013. Genome-wide association study for backfat thickness in Canchim beef cattle using Random Forest approach. BMC Genetics 14:47 doi: 10.1186/1471-2156-14-47 CrossRef Google Scholar
[54]	Khilari N, Hadawale P, Shaikh H, Kolase S. 2022. Analysis of Machine Learning Algorithm to predict Wine Quality. International Journal of Scientific Research in Science, Engineering and Technology 9:231−36 doi: 10.32628/ijsrset229235 CrossRef Google Scholar
[55]	Langsetmo L, Schousboe JT, Taylor BC, Cauley JA, Fink HA, et al. 2023. Advantages and Disadvantages of Random Forest Models for Prediction of Hip Fracture Risk Versus Mortality Risk in the Oldest Old. JBMR Plus 7(8):e10757 doi: 10.1002/jbm4.10757 CrossRef Google Scholar
[56]	Hong PTT, Hai TTT, Le Thi Lan VTH, Thuy NT. 2015. Identification of seeds of different rice varieties using image processing and computer vision techniques. Science and Technology Development Journal 13:1036−42 Google Scholar
[57]	Chaugule A, Mali S. 2013. Seed technological development—a survey. Proceedings of the International Conference on Information Technology in Signal and Image Processing, Mumbai, India, 2013. India: Association of Computer Electronics and Electrical Engineers (ACEEE). pp. 71−78.
[58]	Ali A, Mashwani WK. 2023. A supervised machine learning algorithms: applications, challenges, and recommendations. Proceedings of the Pakistan Academy of Sciences: A. Physical and Computational Sciences 60(4):1−12 doi: 10.53560/PPASA(60-4)831 CrossRef Google Scholar
[59]	Granitto PM, Navone HD, Verdes PF, Ceccatto HA. 2002. Weed seeds identification by machine vision. Computers and Electronics in Agriculture 33:91−103 doi: 10.1016/S0168-1699(02)00004-2 CrossRef Google Scholar
[60]	Frigau L, Antoch J, Bacchetta G, Sarigu M, Ucchesu M, et al. 2020. A statistical approach to the morphological classification of Prunus sp. seeds. Plant Biosystems - An International Journal Dealing with all Aspects of Plant Biology 154:877−86 doi: 10.1080/11263504.2019.1701126 CrossRef Google Scholar
[61]	Ajaz RH, Hussain L. 2015. Seed Classification using Machine Learning Techniques. Journal of Multidisciplinary Engineering Science and Technology 2(5):1098−102 Google Scholar
[62]	LeCun Y, Bengio Y, Hinton G. 2015. Deep learning. Nature 521:436−44 doi: 10.1038/nature14539 CrossRef Google Scholar
[63]	Taghavi Namin S, Esmaeilzadeh M, Najafi M, Brown TB, Borevitz JO. 2018. Deep phenotyping: deep learning for temporal phenotype/genotype classification. Plant Methods 14:66 doi: 10.1186/s13007-018-0333-4 CrossRef Google Scholar
[64]	Goodfellow I, Bengio Y, Courville A. 2016. Deep learning. Cambridge, MA, USA: MIT press. 802 pp.
[65]	Patrício DI, Rieder R. 2018. Computer vision and artificial intelligence in precision agriculture for grain crops: A systematic review. Computers and Electronics in Agriculture 153:69−81 doi: 10.1016/j.compag.2018.08.001 CrossRef Google Scholar
[66]	Zhao G, Quan L, Li H, Feng H, Li S, et al. 2021. Real-time recognition system of soybean seed full-surface defects based on deep learning. Computers and Electronics in Agriculture 187:106230 doi: 10.1016/j.compag.2021.106230 CrossRef Google Scholar
[67]	Loddo A, Loddo M, Di Ruberto C. 2021. A novel deep learning based approach for seed image classification and retrieval. Computers and Electronics in Agriculture 187:106269 doi: 10.1016/j.compag.2021.106269 CrossRef Google Scholar
[68]	Nasiri A, Taheri-Garavand A, Fanourakis D, Zhang YD, Nikoloudakis N. 2021. Automated grapevine cultivar identification via leaf imaging and deep convolutional neural networks: a proof-of-concept study employing primary Iranian varieties. Plants 10:1628 doi: 10.3390/plants10081628 CrossRef Google Scholar
[69]	Gulzar Y, Hamid Y, Soomro AB, Alwan AA, Journaux L. 2020. A Convolution Neural Network-Based Seed Classification System. Symmetry 12:2018 doi: 10.3390/sym12122018 CrossRef Google Scholar
[70]	Uzal LC, Grinblat GL, Namías R, Larese MG, Bianchi JS, et al. 2018. Seed-per-pod estimation for plant breeding using deep learning. Computers and Electronics in Agriculture 150:196−204 doi: 10.1016/j.compag.2018.04.024 CrossRef Google Scholar
[71]	Başol Y, Toklu S. 2021. A Deep Learning-Based Seed Classification with Mobile Application. Turkish Journal of Mathematics and Computer Science 13:192−203 doi: 10.47000/tjmcs.897631 CrossRef Google Scholar
[72]	Ebrahimi E, Mollazade K, Babaei S. 2014. Toward an automatic wheat purity measuring device: A machine vision-based neural networks-assisted imperialist competitive algorithm approach. Measurement 55:196−205 doi: 10.1016/j.measurement.2014.05.003 CrossRef Google Scholar
[73]	Shahid M, Naweed M, Qadri S. 2014. Varietal discrimination of wheat seeds by machine vision approach. Life Science Journal 11:245−52 doi: 10.7537/marslsj1106s14.46 CrossRef Google Scholar
[74]	Medeiros AD, Silva LJD, Ribeiro JPO, Ferreira KC, Rosas JTF, et al. 2020. Machine learning for seed quality classification: an advanced approach using merger data from FT-NIR spectroscopy and X-ray imaging. Sensors 20:4319 doi: 10.3390/s20154319 CrossRef Google Scholar
[75]	Ambrose A, Lohumi S, Lee WH, Cho BK. 2016. Comparative nondestructive measurement of corn seed viability using Fourier transform near-infrared (FT-NIR) and Raman spectroscopy. Sensors and Actuators B: Chemical 224:500−6 doi: 10.1016/j.snb.2015.10.082 CrossRef Google Scholar
[76]	Seo YW, Ahn CK, Lee H, Park E, Mo C, et al. 2016. Non-destructive sorting techniques for viable pepper (Capsicum annuum L.) seeds using Fourier transform near-infrared and raman spectroscopy. Journal of Biosystems Engineering 41:51−59 doi: 10.5307/JBE.2016.41.1.051 CrossRef Google Scholar
[77]	AgaAzizi S, Rasekh M, Abbaspour-Gilandeh Y, Kianmehr MH. 2021. Identification of impurity in wheat mass based on video processing using artificial neural network and PSO algorithm. Journal of Food Processing and Preservation 45:e15067 doi: 10.1111/jfpp.15067 CrossRef Google Scholar
[78]	Nikhade Pratibha, More Hemlata, Manekar Krunali, Khot PST. 2017. Analysis and identification of rice granules using image processing and neural network. International Journal of Electronics and Communication Engineering 10:25−33 Google Scholar
[79]	Aznan A, Rukunudin I, Shakaff A, Ruslan R, Zakaria A, et al. 2016. The use of machine vision technique to classify cultivated rice seed variety and weedy rice seed variants for the seed industry. International Food Research Journal 23:S31−S35 Google Scholar
[80]	Veeranampalayam Sivakumar AN, Li J, Scott S, Psota E, Jhala AJ, et al. 2020. Comparison of Object Detection and Patch-Based Classification Deep Learning Models on Mid- to Late-Season Weed Detection in UAV Imagery. Remote Sensing 12:2136 doi: 10.3390/rs12132136 CrossRef Google Scholar
[81]	Cao W, Zhang C, Wang J, Liu S, Xu X. 2012. Purity identification of maize seed based on discrete wavelet transform and BP neural network. Transactions of the Chinese society of Agricultural Engineering 28:253−58 doi: 10.3969/j.issn.1002-6819.2012.z2.044 CrossRef Google Scholar
[82]	Altuntaş Y, Cömert Z, Kocamaz AF. 2019. Identification of haploid and diploid maize seeds using convolutional neural networks and a transfer learning approach. Computers and Electronics in Agriculture 163:104874 doi: 10.1016/j.compag.2019.104874 CrossRef Google Scholar
[83]	Zhou G, Wang J, Zhang X, Guo M, Yu G. 2020. Predicting functions of maize proteins using graph convolutional network. BMC Bioinformatics 21:420 doi: 10.1186/s12859-020-03745-6 CrossRef Google Scholar
[84]	Kozłowski M, Górecki P, Szczypiński PM. 2019. Varietal classification of barley by convolutional neural networks. Biosystems Engineering 184:155−65 doi: 10.1016/j.biosystemseng.2019.06.012 CrossRef Google Scholar
[85]	Ahmed MR, Yasmin J, Collins W, Cho BK. 2018. X-ray CT image analysis for morphology of muskmelon seed in relation to germination. Biosystems Engineering 175:183−93 doi: 10.1016/j.biosystemseng.2018.09.015 CrossRef Google Scholar
[86]	Andrade GC, Medeiros Coelho CM, Uarrota VG. 2020. Modelling the vigour of maize seeds submitted to artificial accelerated ageing based on ATR-FTIR data and chemometric tools (PCA, HCA and PLS-DA). Heliyon 6:e03477 doi: 10.1016/j.heliyon.2020.e03477 CrossRef Google Scholar
[87]	Xia Y, Xu Y, Li J, Zhang C, Fan S. 2019. Recent advances in emerging techniques for non-destructive detection of seed viability: A review. Artificial Intelligence in Agriculture 1:35−47 doi: 10.1016/j.aiia.2019.05.001 CrossRef Google Scholar
[88]	Yang G, Wang Q, Liu C, Wang X, Fan S, et al. 2018. Rapid and visual detection of the main chemical compositions in maize seeds based on Raman hyperspectral imaging. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 200:186−94 doi: 10.1016/j.saa.2018.04.026 CrossRef Google Scholar
[89]	Zhang L, Liu Z, Ren T, Liu D, Ma Z, et al. 2020. Identification of Seed Maize Fields With High Spatial Resolution and Multiple Spectral Remote Sensing Using Random Forest Classifier. Remote Sensing 12:362 doi: 10.3390/rs12030362 CrossRef Google Scholar
[90]	de Medeiros AD, Pinheiro DT, Xavier WA, da Silva LJ, dos Santos Dias DCF. 2020. Quality classification of Jatropha curcas seeds using radiographic images and machine learning. Industrial Crops and Products 146:112162 doi: 10.1016/j.indcrop.2020.112162 CrossRef Google Scholar
[91]	Gadotti GI, Ascoli CA, Bernardy R, Monteiro RdCM, Pinheiro RdM. 2022. Machine learning for soybean seeds lots classification. Engenharia Agrícola 42:e20210101 doi: 10.1590/1809-4430-eng.agric.v42nepe20210101/2022 CrossRef Google Scholar
[92]	Ma T, Tsuchikawa S, Inagaki T. 2020. Rapid and non-destructive seed viability prediction using near-infrared hyperspectral imaging coupled with a deep learning approach. Computers and Electronics in Agriculture 177:105683 doi: 10.1016/j.compag.2020.105683 CrossRef Google Scholar
[93]	Andaur Navarro CL, Damen JAA, Takada T, Nijman SWJ, Dhiman P, et al. 2021. Risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review. BMJ 375:n2281 doi: 10.1136/bmj.n2281 CrossRef Google Scholar
[94]	Ramezan CA, Warner TA, Maxwell AE. 2019. Evaluation of sampling and cross-validation tuning strategies for regional-scale machine learning classification. Remote Sensing 11:185 doi: 10.3390/rs11020185 CrossRef Google Scholar
[95]	Rocha LD, Gadotti GI, Bernardy R, Pinheiro RdM, Monteiro RdCM. 2023. Data mining for ranking sorghum seed lots. Revista Caatinga 36:471−78 doi: 10.1590/1983-21252023v36n224rc CrossRef Google Scholar
[96]	Bernardy R, Gadotti GI, Monteiro RdCM, Pinto KVA, Pinheiro RdM. 2023. FITTING data mining settings for ranking seed lots. Engenharia Agrícola 43:e20220193 doi: 10.1590/1809-4430-eng.agric.v43n2e20220193/2023 CrossRef Google Scholar
[97]	Gadotti GI, Moraes NAB, Silva JGd, Pinheiro RdM, Monteiro RdCM. 2022. Prediction of Ranking of Lots of Corn Seeds by Artificial Intelligence. Engenharia Agrícola 42:e20210005 doi: 10.1590/1809-4430-eng.agric.v42n4e20210005/2022 CrossRef Google Scholar
[98]	Liu Y, Su J, Shen L, Lu N, Fang Y, et al. 2021. Development of a mobile application for identification of grapevine (Vitis vinifera L.) cultivars via deep learning. International Journal of Agricultural and Biological Engineering 14:172−79 doi: 10.25165/j.ijabe.20211405.6593 CrossRef Google Scholar
[99]	Kamilaris A, Prenafeta-Boldú FX. 2018. Deep learning in agriculture: a survey. Computers and Electronics in Agriculture 147:70−90 doi: 10.1016/j.compag.2018.02.016 CrossRef Google Scholar

About this article

Cite this article

Ghaffari A. 2024. Precision seed certification through machine learning. Technology in Agronomy 4: e019 doi: 10.48130/tia-0024-0013

Ghaffari A. 2024. Precision seed certification through machine learning. Technology in Agronomy 4: e019 doi: 10.48130/tia-0024-0013

Figures(7) / Tables(2)

Download PDF

Article Metrics

Article views(1671) PDF downloads(571)

Other Articles By Authors

on this site
- Akram Ghaffari
on Google Scholar
- Akram Ghaffari

HTML

ML in determining seed germination and vigor quality

Data bias

Most prediction models produced by machine learning do not display proper methodological quality, leading to a high risk of bias. There are several reasons for a risk of bias: small study size, inadequate classification and handling of missing data, and improper control on overfitting. Improvement of the design, conduct, reporting, and validation of such studies will enhance the confidence, speed and precision application of prediction models in several field of studies^[93]. For example, high lot quality caused unbalanced data in soybean seed analyses. Unbalanced learning results in surpassing one class on another class is a classification process. To overcome this problem, a resample filter was applied to prevent bias of the algorithm. The accurate selection of resources reduces the dimensionality of the data and benefit to faster function of the classifier, resulting in higher accuracy. When classes are classified incorrectly by classifiers, the classes with more data had higher accuracy and precision values than the other classes in unbalanced data. A smaller sample number of classes resulted in lower values of the performance compared with larger data classes. Employing classification via regression (CVR) showed a lower amount of false positives in some classes, led to better accuracy in the rejected and intermediate classes of soybean seed lots^[91].

Cross-validation and percentage-split

The main principle to estimate accuracy is that the evaluation samples should not be the same as the training samples. Training samples are excluded from the evaluation samples of certain parameter values that decreases the probability of overtraining and led to an increase of the generalization of the classifier^[94]. Several partitions are created for each sample to be applied several times for several aims. The percentage split divides data into training and testing by percentage value (70%, 30%−90%, 10% for training and testing data). It is essential to choose the best percentage due to raising uncertainties^[95]. To increase the reliability of the machine learning results, cross-validation is developed to train and asses accuracy of samples several times. The k-fold cross-validation is a popular validation method that randomly divided the sample set into a series of identical sized folds. The dataset is divided into several partitions, or folds that is showed by k. For example, if a k-value of ten is applied, the dataset is divided into ten partitions. Thus, nine of the partitions are allocated for training data, while the remaining one partition is allocated for test data. The training is iterated ten times where nine partitions are specified for training data and a different partition is applied for the test set each time^[94]. A study evaluated several machine learning models to assess soybean seed lot quality. Soybean lots evaluated through two methods of test sets including cross-validation (with 8, 10, and 12 folds), and percentage split (with 66% and 70%). Results revealed that the 10-fold cross-validation achieved better figures than 8 and 12 that was 90.22% classification accuracy. However, the method applying 66% of data for training reached 93.55% accuracy. Therefore, precise classification through appropriate algorithm training lead to reliable information for seed quality testing^[96]. However, in other research, k = 10 cross-validation was applied to training and testing data of maize seed lots led to the high accuracy and precision of the classification for corn seed lots^[10,97].

Concluding remarks and future perspectives

Rapid and robust seed variety and purity recognition are the main challenges in the seed registration and certification process. Thus, developing high-throughput analyzing procedures is required to accelerate the seed variety discrimination accurately and affordably. ML algorithms were exploited in several studies to identify seed varieties, recognize weeds and evaluate seed viability through image analysis. In terms of discussing studies in the literature and providing insights into particular subjects in seed certification procedures, this review was dedicated to two important parts, including: 1) conventional machine learning methods and algorithms and their application in seed recognition, vigor, and purity; 2) the deep learning algorithms and architectures for seed identification and classification.

Several conventional machine learning and feature engineering algorithms comprising ANN, DA, RF, SVM, PLS-DA, and NB were employed to classify and detect seeds. These algorithms showed their great efficiency in the seed recognition and certification process (Fig. 5 & Table 2). However, it still needs human involvement for feature extraction. Therefore, deep learning was introduced to automatically extract complex data features from huge volumes of unsupervised and un-categorized data. Deep learning simplifies big data analytics procedures by categorizing learning and extracting distinct levels of multiplex data, particularly for discriminative purposes such as classification and prediction. Deep learning created great opportunities to develop in situ seed identification and sorting systems. For instance, imaging data can be easily accessed through mobile phones, tablets, or action cameras (GoPro) for in situ information processing. Thus, it can open a new window for utilizing a smartphone as a fast and robust substitute for the online seed variety discrimination stage by developing a CNN model-based mobile app^[14]. Some scholars presented that the CNN algorithm is trustworthy and highly effective for variety recognition^[98] in farms. Developing mobile apps would be a great approach for seed producers, processors, and distributors. Besides, a real-time seed recognition system is a fast and automatic recognition system through deep learning to detect seeds deficiencies in the selection process. The mechanism evaluates the whole surface features of seeds precisely via deep learning^[66]. Precise seed sorting methods will lead to increasing yield in the breeding industry. These developments can be extended to invent more mobile and online apps to detect and discriminate several features of seeds with minimum human involvement, such as seeds protein content, hardness, moisture content, and so on. These developments have evolved in the agriculture sector to accelerate food production for the projected global population in the near future. In summary, deep learning is a promising means for smarter, sustainable agriculture and secure food production.

However, the relatively low maturity of the deep-learning models demands further study. Specifically, further developments are crucial to adjust deep learning procedures for big data problems, containing high dimensionality, streaming data analysis, scalability of deep learning models upgraded formulation of data abstractions, distributed computing, semantic indexing, data tagging, information retrieval, criteria for extracting good data representations, and domain adaptation. Upcoming research should address one or more of these difficulties^[99].

Author contributions

The author confirms sole responsibility for the following: study conception and design, data collection, analysis and interpretation of results, and manuscript preparation.

ML algorithms	SOM	MLP	DA	SVMs	RF	NB	CNNs
Accuracy	+	+++	++	++++	+++++	+++	++++++
Flexibility	+	++	+	+++	++++	+++	++++++
Advantages	Fast	Non-linear classification	Quick, inexpensive	Analyzing complex networks, diminishing the generalization error, using large number of hidden units	Tolerant of highly correlated predictors	Simplicity, fast training, decreases the number of parameters	Analysis of massive amounts of unsupervised data, better classification and prediction
Disadvantages	Unsupervised	Time-consuming, few hidden neurons	Unsupervised	Algorithmic complexity, development of ideal classifiers for multi-class problems and unbalanced data sets	Evaluation of pairwise interactions is difficult, future predictions require the original data	Oversensitive to redundant or irrelevant attributes, classification bias	It needs further developments for big data analysis

Plant species	Type of machine learning	Classifier	Accuracy	Features	Purpose	Ref.
Corn	Digital image	MLP	98.83%	Texture - spectrum hybrid	Seed varietal purity	[10]
Wheat	Digital image	ANNs	85.72%	Morphology	Seed varietal purity	[7]
Wheat	Digital image	ICA-ANN hybrid	96.25%	Color, morphology, and texture	Seed varietal purity	[72]
Wheat	Digital bulk image	LDA	98.15%	Texture	Seed varietal purity	[37]
Wheat	Digital bulk image	ANNs	97.62%	Texture	Seed varietal purity	[73]
Forage grass (Urochloabrizantha)	FT-NIR spectroscopy & X-ray imaging	RF	85%	Spectrum-composition hybrid	Seed germination & vigor	[74]
Corn	FT-NIR spectroscopy	PLS-DA	100%	Chemical composition	Seed germination & vigor	[75]
Pepper	FT-NIR & Raman spectroscopy	PLS-DA	99%	Chemical composition	Seed germination & vigor	[76]
57 weed species	Digital image	NB & ANNs	99.5%	Color, morphology, and texture	Weed identification	[59]
Wheat	Video processing	ANN - PSO hybrid	97.77%	Shape, texture & color	Physical purity & weed identification	[77]
Rice	Digital image	MLP	99.46%	Morphology, texture & color	Seed varieties classification	[9]
Rice	Digital image	ANNs	−	Morphology	Seed grading	[78]
Rice	Digital image	DFA	96%	Morphology	Physical purity	[79]
Soybean	Aerial imagery	CNNs	65%	Object detection	Weed identification	[80]
Soybean	Digital image	CNNs	97%	Color, texture and shape	Seed deficiency	[66]
Soybean	Digital image	CNNs	86.2%	Color, texture and shape	Seed counting	[70]
Corn	Digital image	MLP	94.5%	Color	Physical purity	[81]
Soybean	Flatbed scanner	SVMs & RF	78%	Color	Seed grading	[47]
Bean	Digital image	RF	95.5%	Color, texture and shape	Seed varieties classification	[16]
Corn	Hyperspectral image	SVMs	98.2%	Spectrum-texture – morphology hybrid	Seed varieties classification	[48]
Corn	Digital image	SVMs	95.6%	Color and texture	Seed varieties classification	[49]
Corn	Digital image	GA–SVM hybrid	94.4%	Color, texture and shape	Seed varieties identification	[50]
Corn	Digital image	CNNs	95%	Color, texture and shape	Haploid and dioploid discrimination	[82]
Corn	Digital image	CNNs	95%	Color, texture and shape	Seed varieties identification	[83]
Barley	Digital image	DA & K-NN	99%	Color, morphology & texture	Seed varieties classification	[8]
Barley	Digital image	CNNs	93%	Color, morphology & texture	Seed varieties classification	[84]
MLP: Multilayer Perceptron, ICA: Imperialist Competitive Algorithm, ANNs: Artificial Neural Networks, LDA: linear discriminate analysis, FT-NIR: Fourier transform near-infrared, PLS-DA: partial least squares discriminant analysis, NB: naïve Bayes, PSO: partial swarm optimization, DFA: stepwise discriminant function analysis, CNNs: Convolutional neural networks, SVM: support vector machine, GA: genetic algorithm, DA: discriminant analysis, K-NN: K-nearest neighbors, SVMs: support vector machines.

{{lists.name}}

Precision seed certification through machine learning

Abstract