Author: Paulino Pérez-Rodríguez

Deep kernel and deep learning for genomic-based prediction

Jose Crossa, Paulino Pérez-Rodríguez, Juan Burgueño, Ravi Singh, Philomin Juliana, Osval Antonio Montesinos-Lopez, Jaime Cuevas (2019)

Deep learning (DL) is a promising method for genomic prediction, which allows individuals to be selected early without measuring their phenotypes. In this paper we compare the genome-based prediction performance of the DL method, the deep kernel (arc-cosine kernel, AK) method, the Gaussian kernel (GK) method and the conventional kernel method (Genomic Best Linear Unbiased Predictor, GBLUP, GB). We used two real wheat data sets to benchmark these methods. We found that the GK and deep kernel AK methods outperformed the DL and conventional GB methods; the gain in prediction performance of AK and GK was not very large, but they have the advantage that no tuning parameters are required. Furthermore, although AK and GK had similar genomic-based prediction performance, deep kernel AK is easier to implement than GK. For this reason, our results suggest that AK is an alternative to DL models with the advantage that no tuning process is required.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Replication Data for: Sparse kernel models provide optimization of training set design for genome-based prediction in multi-year wheat breeding data

Marco Lopez-Cruz, Susanne Dreisigacker, Leonardo Abdiel Crespo Herrera, Alison Bentley, Ravi Singh, Suchismita Mondal, Paulino Pérez-Rodríguez, Jose Crossa (2021)

When genomic selection (GS) is used in breeding schemes, data from multiple generations can provide opportunities to increase sample size and thus the likelihood of extracting useful information from the training data. The Sparse Selection Index (SSI) is a method for optimizing training data selection. The data files provided with this study include a large multigeneration wheat dataset of grain yield for 68,836 lines generated across eight cycles (years), as well as the genotypic data that were analyzed to test this method. The results of the analysis are published in the corresponding journal article.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Replication Data for: Multimodal Deep Learning Methods Enhance Genomic Prediction of Wheat Breeding

Carolina Rivera-Amado, Francisco Pinto, Francisco Javier Pinera-Chavez, David González-Diéguez, Paulino Pérez-Rodríguez, Huihui Li, Osval Antonio Montesinos-Lopez, Jose Crossa (2023)

In plant breeding research, several statistical machine learning methods have been developed and studied for assessing the genomic prediction (GP) accuracy of unobserved phenotypes. To increase the GP accuracy of unobserved phenotypes while simultaneously accounting for the complexity of genotype × environment interaction (GE), deep learning (DL) neural networks have been developed. These analyses can potentially include phenomics data obtained through imaging. The two datasets included in this study contain phenomic, phenotypic, and genotypic data for a set of wheat materials. They have been used to compare a novel DL method with conventional GP models. The results of these analyses are reported in the accompanying journal article.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Replication Data for: Genomic Prediction of Gene Bank Wheat Landraces

Jose Crossa, Diego Jarquin, Jorge Franco, Paulino Pérez-Rodríguez, Juan Burgueño, Carolina Saint Pierre, Prashant Vikram, Carolina Sansaloni, Cesar Petroli, Deniz Akdemir, Clay Sneller, Matthew Paul Reynolds, Thomas Payne, Carlos Guzman, Roberto Peña, Peter Wenzl, Sukhwinder Singh (2023)

Genomic prediction methods may be used to enhance efforts to rapidly introgress traits of interest from exotic germplasm into elite materials. This study examined the performance of different genomic prediction models using genotypic and phenotypic data for 8416 Mexican landrace accessions and 2403 Iranian landrace accessions stored in germplasm banks. The Mexican and Iranian collections were evaluated under optimal, drought, and heat conditions for several traits, including the highly heritable traits days to heading (DTH) and days to maturity (DTM). The results of the different analyses are reported in the accompanying journal article.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY