Applications of Artificial Intelligence and Machine Learning in Plant Breeding
Abstract
Plant breeding plays a vital role in meeting ever-increasing global food demand, adapting crops to climate change, and supporting sustainable agricultural practices. Artificial intelligence (AI) and machine learning (ML) algorithms are used in plant breeding for several activities, including genotype-phenotype prediction, genomic selection, trait discovery, and the optimization of breeding methods. By analyzing large data sets of genomic and phenotypic information, these methods help locate genetic markers associated with particular traits, which in turn allows breeders to select plants with the desired traits more effectively. AI technologies can also enhance the breeding process by simulating breeding outcomes, reducing the time and resources required by conventional trial-and-error methods. Concerns about data quality, model interpretability, and ethics need to be addressed so that the application of AI in breeding is reliable and ethically sound. In addition, the lack of advanced computing infrastructure and skilled personnel is a challenge for many breeders, especially in developing countries. Nevertheless, the prospects for AI and ML in plant breeding are promising. Continuing advances in computational biology, genomics, and data analytics are expected to substantially enhance the capabilities of AI-driven breeding systems. Integrating AI and ML into plant breeding methodologies has the potential to transform crop improvement efforts, laying the foundation for sustainable agriculture and food security in a changing climate.
Introduction
1.1 Significance and Context of Plant Breeding: Plant breeding is fundamental to agricultural development and reflects centuries of human effort increasingly guided by science (Acquaah, 2012; Tanksley & McCouch, 1997). It began when early farmers selected and saved seeds from plants with desirable characteristics, marking the beginning of crop domestication. Over the centuries, farmers and breeders gradually improved and crossbred crop plants to meet their needs, laying the groundwork for current plant breeding techniques. Scientific breeding began with the discovery of the principles of genetics in the 19th century through Mendel's work, and it entered the modern era in the twentieth century with advances in molecular biology, genomics, and biotechnology.
1.2 The Advancement of Artificial Intelligence (AI) and Machine Learning (ML) in Plant Breeding: In their early implementations, AI and ML served mostly as marginal aids within conventional breeding workflows, used to improve data management and handling. However, with the emergence of high-throughput sequencing technology and the continuous accumulation of genetic resources, AI and ML have gradually reshaped modern breeding practice.
Conclusion
AI and ML have significant capacity to transform plant breeding and address global agricultural concerns. These technologies provide opportunities to expedite breeding cycles, improve prediction accuracy, and create crop varieties that are hardy, productive, and suited to evolving environmental conditions. AI and ML algorithms allow breeders to examine extensive genomic and phenotypic data with greater efficiency and precision than conventional approaches. AI-driven methodologies enhance genotype-phenotype prediction, trait discovery, and breeding value estimation by detecting intricate patterns and genetic associations and by building more accurate predictive models (Miotto et al., 2018). Furthermore, AI and ML methodologies enable breeders to improve breeding methods, experimental designs, and resource allocation, resulting in more efficient and economical crop improvement programs (Bhatia et al., 2020).
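As a rough illustration of what genotype-phenotype prediction and breeding value calculation can look like in practice, the following is a minimal sketch of marker-based prediction with ridge regression, a penalized linear model commonly used as a baseline in genomic prediction. It is not drawn from any of the cited studies: the data are simulated, and the function names (`simulate_population`, `fit_ridge`, `predict`, `run_demo`), the 0/1/2 marker coding, the population sizes, and the penalty value are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_population(n_lines=200, n_markers=500, n_causal=20,
                        noise_sd=1.0, seed=0):
    """Simulate marker genotypes and phenotypes (purely hypothetical data)."""
    rng = np.random.default_rng(seed)
    # Markers coded 0/1/2: copies of the alternate allele per breeding line.
    genotypes = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)
    # Only a few markers have a real effect on the simulated trait.
    effects = np.zeros(n_markers)
    causal = rng.choice(n_markers, size=n_causal, replace=False)
    effects[causal] = rng.normal(0.0, 1.0, size=n_causal)
    phenotypes = genotypes @ effects + rng.normal(0.0, noise_sd, size=n_lines)
    return genotypes, phenotypes


def fit_ridge(genotypes, phenotypes, penalty=50.0):
    """Estimate marker effects by solving (X'X + penalty * I) b = X'(y - mean)."""
    marker_means = genotypes.mean(axis=0)
    centered = genotypes - marker_means
    intercept = phenotypes.mean()
    lhs = centered.T @ centered + penalty * np.eye(centered.shape[1])
    rhs = centered.T @ (phenotypes - intercept)
    marker_effects = np.linalg.solve(lhs, rhs)
    return marker_means, intercept, marker_effects


def predict(model, genotypes):
    """Predict trait values for (possibly unphenotyped) lines from markers."""
    marker_means, intercept, marker_effects = model
    return intercept + (genotypes - marker_means) @ marker_effects


def run_demo():
    """Train on 150 simulated lines and rank the 50 held-out lines."""
    genotypes, phenotypes = simulate_population()
    train, test = slice(0, 150), slice(150, 200)
    model = fit_ridge(genotypes[train], phenotypes[train])
    predicted = predict(model, genotypes[test])
    # Predictive ability: correlation of predicted and observed phenotypes.
    accuracy = np.corrcoef(predicted, phenotypes[test])[0, 1]
    # Indices (in the full population) of the five best-predicted lines.
    top_lines = np.argsort(predicted)[::-1][:5] + 150
    return accuracy, top_lines


if __name__ == "__main__":
    accuracy, top_lines = run_demo()
    print(f"Predictive ability on held-out lines: {accuracy:.2f}")
    print("Candidate lines to advance:", top_lines.tolist())
```

In a real program the marker matrix would come from genotyping data and the penalty would be tuned (for example by cross-validation) instead of being fixed by hand. The ranking step at the end is the part that corresponds to selection: lines can be prioritized from their markers before they are phenotyped in the field.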
These technologies also provide potential solutions for urgent agricultural issues, including climate change, food insecurity, and resource depletion. AI-driven breeding methodologies can create crop varieties with enhanced stress tolerance, disease resistance, and nutritional quality, thereby supporting food security and sustainability amid shifting climatic conditions and increasing pest pressures (Tallis et al., 2018). Moreover, AI and ML methodologies help breeders optimize resource use, mitigate environmental impacts, and improve agricultural output, facilitating the efficient and sustainable management of land, water, and inputs (Van Evert et al., 2020).
AI and ML also have the capacity to democratize access to breeding tools and resources, enabling farmers, especially in developing countries, to engage in crop improvement initiatives and benefit from technological advances. AI-driven breeding technologies can enhance capacity building, technology transfer, and inclusive innovation by promoting open science principles, collaborative platforms, and knowledge-sharing initiatives, thereby advancing agricultural development and economic growth in rural communities (Chen et al., 2018).
In summary, AI and ML have the capacity to reshape plant breeding and address global agricultural concerns by increasing breeding efficiency, enhancing crop performance, and fostering sustainability. By using these tools, researchers, breeders, and governments can expedite innovation, enhance resilience, and help secure food supplies for future generations.
References
[1] Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., ... Zheng, X. (2016). TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv. https://arxiv.org/abs/1603.04467
[2] Acquaah, G. (2012). Principles of plant genetics and breeding (2nd ed.). John Wiley & Sons.
[3] Alipanahi, B., Delong, A., Weirauch, M. T., & Frey, B. J. (2015). Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology, 33(8), 831–838.
[4] Angermueller, C., Pärnamaa, T., Parts, L., & Stegle, O. (2016). Deep learning for computational biology. Molecular Systems Biology, 12(7), Article 878.
[5] Araus, J. L., & Cairns, J. E. (2014). Field high-throughput phenotyping: The new crop breeding frontier. Trends in Plant Science, 19(1), 52–61.
[6] Barchi, L., Pietrella, M., Venturini, L., Minio, A., Toppino, L., Acquadro, A., & Lanteri, S. (2019). A genome-wide association analysis of the anthocyanin and brix content in wild and cultivated strawberry highlights the diversification of their metabolic pathways. BMC Plant Biology, 19(1), 1–17.
[7] Bassil, N. V., Davis, T. M., Zhang, H., Ficklin, S., Mittmann, M., Webster, T., & Main, D. (2015). Development and preliminary evaluation of a 90 K Axiom® SNP array for the allo-octoploid cultivated strawberry Fragaria × ananassa. BMC Genomics, 16(1), 1–14.
[8] Bhatia, A., Hasan, M. M., Palchowdhury, S., Paul, S., & Asif, M. (2020). A review on machine learning approaches for breeding trait prediction and recommendation in crop plants. Computers and Electronics in Agriculture, 178, Article 105743.
[9] Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
[10] Chen, Y., Wang, Z., Li, Y., Truong, E., & Banerjee, A. (2018). Meta-transfer learning for few-shot learning. Proceedings of the 35th International Conference on Machine Learning, 80, 4343–4352.
[11] Chum, P., Park, S., Ko, K., & Sim, K. (2011). Optimal EEG feature extraction using DWT for classification of imagination of hands movement. Journal of Korean Institute of Intelligent Systems, 21(6), 786–791.
[12] Collard, B. C., & Mackill, D. J. (2008). Marker-assisted selection: An approach for precision plant breeding in the twenty-first century. Philosophical Transactions of the Royal Society B: Biological Sciences, 363(1491), 557–572.
[13] Cooper, M., Gho, C., Leafgren, R., Tang, T., & Messina, C. (2014). Breeding drought-tolerant maize hybrids for the US corn-belt: Discovery to product. Journal of Experimental Botany, 65(21), 6191–6204.
[14] Crossa, J., Pérez, P., Cuevas, J., López, M. O., Jarquín, D., de los Campos, G., & Burgueño, J. (2017). Genomic selection in plant breeding: Methods, models, and perspectives. Trends in Plant Science, 22(11), 961–975.
[15] Deb, D., Garg, T., Mahajan, A., & Sharma, D. (2018). Quality attributes and quality assessment techniques of open data: A comprehensive review. Data Technologies and Applications, 52(1), 54–79.
[16] Ding, W., Li, Y., & Gao, Y. (2019). Using machine learning models to select CRISPR-Cas9 target sites based on chromatin accessibility. Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, 96–103.
[17] Ducrocq, V., Besbes, B., & Casellas, J. (2015). Genetic algorithms for optimization in livestock management and breeding. Genetics Selection Evolution, 47(1), Article 54.
[18] Fahlgren, N., Gehan, M. A., & Baxter, I. (2015). Lights, camera, action: High-throughput plant phenotyping is ready for a closeup. Current Opinion in Plant Biology, 24, 93–99.
[19] Feldman, M. W., Khoury, C. K., & van de Wouw, M. (2019). Agricultural biotechnology and intellectual property: Seeds of change. Nature Plants, 5(11), 1142–1146.
[20] Gianola, D., Okut, H., Weigel, K. A., Rosa, G. J., & Bacheller, L. R. (2009). Predicting complex quantitative traits with Bayesian neural networks: A case study with Jersey cows and wheat. BMC Genetics, 10(1), Article 37.
[21] Goddard, M. E., & Hayes, B. J. (2007). Genomic selection. Journal of Animal Breeding and Genetics, 124(6), 323–330.
[22] Golzarian, M. R., Frick, R. A., Rajendran, K., Berger, B., Roy, S., Tester, M., & Lun, D. S. (2011). Accurate inference of shoot biomass from high-throughput images of cereal plants. Plant Methods, 7(1), 1–11.
[23] Gong, L., Sun, X., Sui, L., & Zhao, H. (2019). Ethical considerations in AI-driven breeding. In O. B. Smith (Ed.), Ethics in agriculture – An African perspective (pp. 103–121). Springer.
[24] González-Camacho, J. M., de los Campos, G., Pérez, P., Gianola, D., Cairns, J. E., Mahuku, G., & Crossa, J. (2012). Genome-enabled prediction of genetic values using radial basis function neural networks. Theoretical and Applied Genetics, 125(4), 759–771.
[25] Gupta, R., Bhatt, S., & Patel, K. (2020). Digital divide in India: A systematic literature review. Telematics and Informatics, 56, Article 101456.
[26] Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.
[27] Habier, D., Fernando, R. L., Kizilkaya, K., & Garrick, D. J. (2007). Extension of the Bayesian alphabet for genomic selection. BMC Bioinformatics, 8(1), 1–15.
[28] Haghighattalab, A., González Pérez, L., Mondal, S., Singh, D., Schinstock, D., Rutkoski, J., & Poland, J. (2016). Application of unmanned aerial systems for high throughput phenotyping of large wheat breeding nurseries. Plant Methods, 12(1), Article 35.
[29] Heffner, E. L., Jannink, J. L., & Sorrells, M. E. (2009). Genomic selection accuracy using multifamily prediction models in a wheat breeding program. The Plant Genome, 2(2), 191–197.
[30] Heslot, N., Yang, H. P., Sorrells, M. E., & Jannink, J. L. (2015). Genomic selection in plant breeding: A comparison of models. Crop Science, 55(1), 36–45.
[31] Hickey, L. T., Hafeez, A. N., Robinson, H., Jackson, S. A., Leal-Bertioli, S. C. M., Tester, M., & Dieters, M. J. (2017). Breeding crops to feed 10 billion. Nature Biotechnology, 35(10), 927–937.
[32] Hirsch, C. N., Foerster, J. M., Johnson, J. M., Sekhon, R. S., Muttoni, G., Vaillancourt, B., & de Leon, N. (2014). Insights into the maize pan-genome and pan-transcriptome. The Plant Cell, 26(1), 121–135.
[33] Hirschhorn, J. N., & Daly, M. J. (2005). Genome-wide association studies for common diseases and complex traits. Nature Reviews Genetics, 6(2), 95–108.
[34] Jarquín, D., Crossa, J., Lacaze, X., Du Cheyron, P., Daucourt, J., Lorgeou, J., & Gay, L. (2014). A reaction norm model for genomic selection using high-dimensional genomic and environmental data. Theoretical and Applied Genetics, 127(3), 595–607.
[35] Jiang, Y., & Reif, J. C. (2015). Modeling epistasis in genomic selection. Genetics, 201(2), 759–768.
[36] Kadam, S., & Yin, X. (2020). Applications of artificial intelligence in plant breeding. Trends in Plant Science, 25(7), 672–680.
[37] Kamilaris, A., Prenafeta-Boldú, F. X., & Fountas, S. (2018). Deep learning in agriculture: A survey. Computers and Electronics in Agriculture, 147, 70–90.
[38] Kandavelou, K., Manivannan, K., Balasubramanian, S., & Lakshmanan, G. (2019). Artificial intelligence in plant genome editing. In P. K. Gupta (Ed.), Artificial intelligence applications in genetic improvement (pp. 83–97). Elsevier.
[39] Khan, M. H. U., Wang, S., Wang, J., Ahmar, S., Saeed, S., Khan, S. U., Xu, X., Chen, H., Bhat, J. A., & Feng, X. (2022). Applications of artificial intelligence in climate-resilient smart-crop breeding. International Journal of Molecular Sciences, 23(19), Article 11156.
[40] Rai, K. K. (2022). Integrating speed breeding with artificial intelligence for developing climate-smart crops. Molecular Biology Reports, 49(12), 11385–11402.
[41] Kuang, Z., Ping, Y., Hao, Y., Fang, Z., Li, J., & Yin, H. (2019). Highly efficient RNA-guided base editing in rabbit. Nature Communications, 10(1), 1–10.
[42] Kumar, S., Garrick, D. J., Bink, M. C., Whitworth, C., & Chagné, D. (2012). Genomic selection for fruit quality traits in apple (Malus × domestica Borkh.). PLoS ONE, 7(5), Article e36674.
[43] Law, J. A., & Jacobsen, S. E. (2010). Establishing, maintaining, and modifying DNA methylation patterns in plants and animals. Nature Reviews Genetics, 11(3), 204–220.
[44] Lee, K., Zhang, Y., & Kleinstiver, B. P. (2019). Activities and specificities of CRISPR/Cas9 and Cas12a nucleases for targeted mutagenesis in maize. Plant Biotechnology Journal, 17(2), 362–372.
[45] Liu, M., Liu, X., Li, J., Ding, C., & Jiang, J. (2014). Evaluating total inorganic nitrogen in coastal waters through fusion of multitemporal RADARSAT-2 and optical imagery using random forest algorithm. International Journal of Applied Earth Observation and Geoinformation, 33, 192–202.
[46] López-Cruz, M. A., Crossa, J., de los Campos, G., Alvarado, G., Mondal, S., & Rutkoski, J. (2018). A review of computational tools for prediction and improvement of crop traits. Frontiers in Plant Science, 10, Article 422.
[47] Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30, 4765–4774.
[48] Meuwissen, T. H. E., Hayes, B. J., & Goddard, M. E. (2001). Prediction of total genetic value using genome-wide dense marker maps. Genetics, 157(4), 1819–1829.
[49] Miotto, R., Wang, F., Wang, S., Jiang, X., & Dudley, J. T. (2018). Deep learning for healthcare: Review, opportunities and challenges. Briefings in Bioinformatics, 19(6), 1236–1246.
[51] Mulla, D. J., Joshi, N., Chang, A., Abel, K., & Venugopal, S. (2020). Automation and robotics in plant breeding: Current status and future prospects. Plant Phenomics, 2020, Article 8849382.
[52] Nicholls, H. L., John, C. R., Watson, D. S., Munroe, P. B., Barnes, M. R., & Cabrera, C. P. (2020). Reaching the end-game for GWAS: Machine learning approaches for the prioritization of complex disease loci. Frontiers in Genetics, 11, Article 350.
[53] Paulus, S., Dupuis, J., Mahlein, A. K., Kuhlmann, H., & Kersting, K. (2014). Automatic early plant disease detection using machine learning-based image analysis. Plant Pathology, 63(6), 1302–1312.
[54] Pérez-Enciso, M., & Misztal, I. (2011). Qxpak5: Old mixed model solutions for new genomics problems. BMC Bioinformatics, 12(1), Article 202.
[55] Poldrack, R. A., Gorgolewski, K. J., & Varoquaux, G. (2017). Computational and informatic advances for reproducible data analysis in neuroimaging. Annual Review of Biomedical Engineering, 19, 259–287.
[56] Rincent, R., Charpentier, J. P., Faivre-Rampant, P., Paux, E., Le Gouis, J., Bastien, C., & Moreau, L. (2014). Phenomic selection is a low-cost and high-throughput method based on indirect predictions: Proof of concept on wheat and poplar. G3: Genes, Genomes, Genetics, 4(8), 1603–1610.
[57] Ringnér, M. (2008). What is principal component analysis? Nature Biotechnology, 26(3), 303–304.
[58] Rosenberg, N. A., Sella, G., & Blum, M. G. (2019). Interpreting polygenic scores, polygenic adaptation, and human phenotypic differences. In M. A. Goldman (Ed.), Evolutionary medicine (pp. 115–125). Oxford University Press.
[59] Roychowdhury, R., Das, S. P., Gupta, A., Parihar, P., Chandrasekhar, K., Sarker, U., Kumar, A., Ramrao, D. P., & Sudhakar, C. (2023). Multi-omics pipeline and omics-integration approach to decipher plant's abiotic stress tolerance responses. Genes, 14(6), Article 1281.
[60] Rutkoski, J., Benson, J., Jia, Y., Brown-Guedira, G., Jannink, J. L., & Sorrells, M. (2012). Evaluation of genomic prediction methods for Fusarium head blight resistance in wheat. The Plant Genome, 5(2), 51–61.
[61] Rutkoski, J., Poland, J., Mondal, S., Autrique, E., Pérez, L. G., & Crossa, J. (2019). Canopy temperature and vegetation indices from high-throughput phenotyping improve accuracy of pedigree and genomic selection for grain yield in wheat. G3: Genes, Genomes, Genetics, 9(7), 2495–2508.
[62] Saito, K., & Matsuda, F. (2010). Metabolomics for functional genomics, systems biology, and biotechnology. Annual Review of Plant Biology, 61, 463–489.
[63] Shi, T., Shi, L., Fang, H., Weng, Z., & Schadt, E. E. (2020). Investigating and suppressing batch effects in single-cell RNA-Seq data. Genome Biology, 21(1), 1–19.
[64] Singh, A., Ganapathysubramanian, B., Singh, A. K., & Sarkar, S. (2016). Machine learning for high-throughput stress phenotyping in plants. Trends in Plant Science, 21(2), 110–124.
[65] Spindel, J., Begum, H., Akdemir, D., Collard, B., Redoña, E., Jannink, J. L., & McCouch, S. R. (2015). Genome-wide prediction models that incorporate de novo GWAS are a powerful new tool for tropical rice improvement. Heredity, 116(4), 395–408.
[66] Tallis, H., Kreis, K., O'Hare, M., O'Connell, D., Hawkins, E., Folarin, A., & Rahman, M. (2018). The role of digital agriculture in food security. The Nature Conservancy.
[67] Tanksley, S. D., & McCouch, S. R. (1997). Seed banks and molecular maps: Unlocking genetic potential from the wild. Science, 277(5329), 1063–1066.
[68] van Evert, F. K., Franke, A. C., de Rooij-van der Goes, P. C., van Laar, H. H., & Schans, D. A. (2020). Applications of AI and machine learning in agricultural systems: A review. Engineering Applications of Artificial Intelligence, 91, Article 103529.
[69] VanRaden, P. M. (2008). Efficient methods to compute genomic predictions. Journal of Dairy Science, 91(11), 4414–4423.
[70] Vanschoren, J., van Rijn, J. N., Bischl, B., & Torgo, L. (2020). OpenML: Networked science in machine learning. ACM Transactions on Intelligent Systems and Technology, 11(3), 1–36.
[71] Voss-Fels, K. P., Cooper, M., Hayes, B. J., & Wu, H. (2019). Genomic selection in plant breeding programs: A comparison of simulation models and field experiments in wheat. Frontiers in Plant Science, 10, Article 179.
[72] Watson, A., Ghosh, S., Williams, M. J., Cuddy, W. S., Simmonds, J., Rey, M. D., & Hickey, L. T. (2018). Speed breeding is a powerful tool to accelerate crop research and breeding. Nature Plants, 4(1), 23–29.
[73] Whetzel, P. L., Noy, N. F., Shah, N. H., Alexander, P. R., Nyulas, C., Tudorache, T., & Musen, M. A. (2019). BioPortal: Ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research, 37(Suppl. 1), W170–W173.
[74] Whishart, J., Arief, V. N., & Smith, A. B. (2019). Operations research in agriculture: A review. Computers and Electronics in Agriculture, 157, 8–27.
[75] Yoosefzadeh-Najafabadi, M., Earl, H., Tulpan, D., Sulik, J., & Eskandari, M. (2021). Application of machine learning algorithms in plant breeding: Predicting yield from hyperspectral reflectance in soybean. Frontiers in Plant Science, 11, Article 2169.
[76] Zhang, X., Pérez-Rodríguez, P., Burgueño, J., Olsen, M., Buckler, E., Atlin, G., & Crossa, J. (2019). Rapid cycling genomic selection in a multiparental tropical maize population. G3: Genes, Genomes, Genetics, 9(7), 2299–2312.
[77] Zhou, H., Cheng, N., & Yu, X. (2018). Utilizing machine learning approaches for precise and fast CRISPR editing. Trends in Biotechnology, 36(10), 1017–1027.
[78] Zou, J., Huss, M., Abid, A., Mohammadi, P., Torkamani, A., & Telenti, A. (2020). A primer on deep learning in genomics. Nature Genetics, 52(1), 12–18.