Author: Osval Antonio Montesinos-Lopez

Supplemental data for multi-trait, multi-environment deep learning modeling for genomic-enabled prediction of plant traits

Osval Antonio Montesinos-Lopez, Jose Crossa, Francisco Javier Martin Vallejo (2018)

This study provides supplemental data to support an investigation of the power of multi-trait deep learning (MTDL) models in terms of genomic-enabled prediction accuracy.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Replication Data for: Multi-trait genome prediction of new environments with partial least squares

Osval Antonio Montesinos-Lopez, Brandon Alejandro Mosqueda González, Marco Alberto Valenzo-Jimenez, Jose Crossa (2022)

The genomic selection (GS) methodology has revolutionized plant breeding. It predicts the performance of genotyped candidate lines using statistical machine learning algorithms trained on the phenotypic and genotypic data of a reference population, and it can save significant resources in the selection of candidate individuals. However, plant breeders can face challenges when implementing it in practice to make predictions for future seasons or for new locations and/or environments. To help address this challenge, this study explores the multi-trait partial least squares (MT-PLS) regression methodology and compares its performance with the Bayesian Multi-trait Genomic Best Linear Unbiased Predictor (MT-GBLUP) method. A benchmarking process was performed with the five real data sets contained in this study. The results of the analysis are reported in the accompanying article.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Prediction of multiple-trait and multiple-environment genomic data using recommender systems

Osval Antonio Montesinos-Lopez, Jose Crossa, Ravi Singh, Suchismita Mondal, Philomin Juliana (2017)

In genomic-enabled prediction, improving the accuracy with which lines are predicted in environments is difficult because the available information is generally sparse and the correlations between traits are usually low. In current genomic selection, researchers have a large amount of information and appropriate statistical models to process it, but computing efficiency remains limited. Although statistical models are usually mathematically elegant, they are also computationally inefficient, and they are impractical for many traits, lines, environments, and years because they need to sample from very large multivariate normal distributions. For these reasons, this study explores two recommender systems: a) item-based collaborative filtering (IBCF; method M1) and b) the matrix factorization algorithm (method M2) in the context of multiple traits and multiple environments. The IBCF and matrix factorization methods were compared with two conventional methods on simulated and real data. Results on both the simulated and the real data sets show that the IBCF technique (method M1) was slightly better in terms of prediction accuracy than the two conventional methods and the matrix factorization method when the correlation was moderately high. The IBCF technique is attractive because it produces good predictions when there is high correlation between items (environment-trait combinations) and because its implementation is computationally feasible, which can be useful for plant breeders who deal with very large data sets.
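
The idea behind item-based collaborative filtering can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a small matrix whose rows are lines and whose columns are environment-trait combinations (the "items"), with phenotypes already rescaled to a common positive scale and missing cells stored as `None`. The function names `item_similarity` and `predict`, the cosine similarity measure, and the toy values are all illustrative assumptions.

```python
from math import sqrt


def item_similarity(phenotypes, i, j):
    """Cosine similarity between item columns i and j.

    Only rows (lines) where both items are observed are used.
    """
    pairs = [(row[i], row[j]) for row in phenotypes
             if row[i] is not None and row[j] is not None]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    norm_i = sqrt(sum(a * a for a, _ in pairs))
    norm_j = sqrt(sum(b * b for _, b in pairs))
    if norm_i == 0 or norm_j == 0:
        return 0.0
    return dot / (norm_i * norm_j)


def predict(phenotypes, line, item):
    """Predict a missing cell as a similarity-weighted average of the
    other items observed for the same line. Returns None when the line
    has no usable observations."""
    numerator = 0.0
    denominator = 0.0
    for j, value in enumerate(phenotypes[line]):
        if j == item or value is None:
            continue
        s = item_similarity(phenotypes, item, j)
        numerator += s * value
        denominator += abs(s)
    return numerator / denominator if denominator > 0 else None


# Toy data (made up): 4 lines x 3 environment-trait combinations.
phenotypes = [
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 2.0],
    [1.0, 2.0, 5.0],
    [5.0, None, 1.0],  # item 1 is unobserved for the last line
]
estimate = predict(phenotypes, line=3, item=1)
```

In this toy matrix, item 1 is more similar to item 0 than to item 2, so the estimate is pulled toward the value the line shows for item 0. This mirrors the abstract's remark that the approach works well when items are highly correlated.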

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Replication Data for: Sparse multi-trait genomic prediction under incomplete block designs

Osval Antonio Montesinos-Lopez, Brandon Alejandro Mosqueda González, Josafhat Salinas Ruiz, Abelardo Montesinos, Jose Crossa (2022)

The efficiency of genomic selection methodologies can be increased by sparse testing, in which subsets of the materials are evaluated in different environments. Seven multi-environment plant breeding datasets were used to evaluate four methods for allocating lines to environments in a multi-trait genomic prediction problem. The results of the analysis are presented in the accompanying article.
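
To make the notion of allocating lines to environments concrete, the following is a minimal sketch of one simple balanced allocation. It is not one of the four methods evaluated in the study: the function name `allocate_lines`, the shuffled cyclic scheme, and the line and environment labels are assumptions made for illustration only.

```python
import random


def allocate_lines(lines, environments, per_line, seed=0):
    """Assign each line to `per_line` distinct environments.

    Lines are shuffled, then given consecutive environments in a cyclic
    pattern, which keeps the number of lines per environment balanced.
    """
    n_env = len(environments)
    if not 1 <= per_line <= n_env:
        raise ValueError("per_line must be between 1 and the number of environments")
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    shuffled = list(lines)
    rng.shuffle(shuffled)
    allocation = {}
    for index, line in enumerate(shuffled):
        start = index % n_env
        allocation[line] = [environments[(start + k) % n_env]
                            for k in range(per_line)]
    return allocation


# Hypothetical example: 12 lines, 4 environments, each line tested in 2.
lines = [f"line_{i}" for i in range(12)]
environments = ["E1", "E2", "E3", "E4"]
allocation = allocate_lines(lines, environments, per_line=2)
lines_per_environment = {
    env: sum(env in envs for envs in allocation.values())
    for env in environments
}
```

With these numbers each environment receives half of the lines, so only 24 of the 48 possible line-environment plots are evaluated. The untested combinations are the ones a multi-trait genomic prediction model would then have to predict.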

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Multi-trait multi-environment genomic prediction of durum wheat

Osval Antonio Montesinos-Lopez, Roberto Tuberosa, Marco Maccaferri, Giuseppe Sciara, Karim Ammar, Jose Crossa (2019)

This paper covers multi-trait prediction of grain yield (GY), days to heading (DH) and plant height (PH) for 270 durum wheat lines that were evaluated in 43 environments (location-year combinations) in Bologna, Italy. The results of the multi-trait deep learning method were compared with univariate predictions from the genomic best linear unbiased predictor (GBLUP) method and from the univariate counterpart of the multi-trait deep learning method. All models were implemented with and without the genotype×environment interaction term. For the univariate and multivariate deep learning methods, the best predictions were observed without the genotype×environment interaction term, whereas for the GBLUP method the best predictions were observed when the interaction term was included. In general, the best predictions were observed under the GBLUP model, but the predictions of the multi-trait deep learning model were very similar to those of the GBLUP model.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

New deep learning genomic prediction model for multi-traits with mixed binary, ordinal, and continuous phenotypes

Osval Antonio Montesinos-Lopez, Francisco Javier Martin Vallejo, Jose Crossa, Philomin Juliana, Ravi Singh (2018)

The seven data sets contain wheat data from the CIMMYT Global Wheat Breeding Program. They comprise different traits, such as days to heading, days to maturity, grain yield, grain color, and different types of leaf and stripe rust in wheat. The trials were also run in different environments.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY