Author: Paulino Pérez-Rodríguez

Replication Data for: Multi-trait Bayesian decision for parental selection

Jose Crossa, Fernando Henrique Toledo, Paulino Pérez-Rodríguez (2020)

The files included in this study contain the data used with three promising multivariate loss functions, Kullback-Leibler (KL), the Energy Score, and the Multivariate Asymmetric Loss (MALF), to select the best-performing parents for the next breeding cycle in two extensive real wheat data sets.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Replication Data for: Genome-based genotype × environment prediction enhances potato (Solanum tuberosum L.) improvement using pseudo-diploid and polysomic tetraploid modeling

Rodomiro Ortiz, Jose Crossa, Paulino Pérez-Rodríguez, Jaime Cuevas (2021)

Potato breeding efficiency can be improved by increasing the reliability of selection and identifying promising germplasm for crossing. The data provided in these datasets were used to compare the prediction accuracy of genomic-estimated breeding values for several potato (Solanum tuberosum L.) breeding clones and released cultivars evaluated in three locations in northern and southern Sweden. The analysis included several traits such as tuber starch percentage and total tuber weight. Results of the analyses are reported in an accompanying journal article.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Supplemental data for hybrid wheat prediction using genomic, pedigree and environmental covariables interaction models

Bhoja Basnet, Jose Crossa, Paulino Pérez-Rodríguez, Ravi Singh, Fatima Camarillo-Castillo (2018)

Genomic prediction of hybrids unobserved in field evaluations is crucial. In this study, we used genomic G×E models for hybrid prediction, where similarity between lines was assessed by pedigree and molecular markers, and similarity between environments was accounted for by environmental covariables.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Replication Data for: Genomic Prediction with Genotype by Environment Interaction Analysis for Kernel Zinc Concentration in Tropical Maize Germplasm

Edna Mageto, Jose Crossa, Paulino Pérez-Rodríguez, Thanda Dhliwayo, Natalia Palacios Rojas, Xuecai Zhang (2020)

The Zinc association mapping (ZAM) panel is a set of 923 elite inbred lines from the International Maize and Wheat Improvement Center (CIMMYT) biofortification breeding program. The panel represents wide genetic diversity for kernel Zn and comprises several lines with tolerance or resistance to an array of abiotic and biotic stresses commonly affecting maize production in the tropics, improved nitrogen use efficiency, and grain nutritional quality. The ZAM panel_923_LINES_GENO and Zinc association mapping (ZAM) panel_phenotypic data are two files with GBS and phenotypic data for zinc (Zn) from this population. From the ZAM panel, four inbred lines (two with high Zn and two with low Zn) were selected and used to form two bi-parental populations, namely DH population1 and DH population2. The genotypic and phenotypic data corresponding to these populations are in the files DH populations1&2_255_LINES_GENO, DH population1_phenotypic data, and DH population2_phenotypic data.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Replication Data for: Multi-generation genomic prediction of maize yield using parametric and non-parametric sparse selection indices

Marco Lopez-Cruz, Yoseph Beyene, Manje Gowda, Jose Crossa, Paulino Pérez-Rodríguez, Gustavo de los Campos (2021)

Genomic prediction models may be used in plant breeding pipelines. They are often calibrated using multi-generation data and there is an open question of whether all available data or a subset of it should be used to calibrate genomic prediction models. Therefore, a study was undertaken to determine whether combining sparse selection indexes (SSIs) and kernel methods could further improve prediction accuracy when training genomic models using multi-generation data. This dataset contains the genotypic and phenotypic data from CIMMYT maize doubled haploid lines that were used to perform the analyses. The results of the analyses are presented in the accompanying article.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Replication Data for: Approximate kernels for large data sets in genome-based prediction

Osval Antonio Montesinos-Lopez, Johannes Martini, Paulino Pérez-Rodríguez, Jose Crossa (2020)

The rapid development of molecular markers and sequencing technologies has made it possible to use genomic selection (GS) and genomic prediction (GP) in animal and plant breeding. However, computational difficulties arise when the number of observations is large. The five datasets provided here were used to support a comparative analysis of two genomic-enabled prediction models: the full genomic method for a single environment (FGSE) and the approximate kernel method for a single environment (APSE). The data were also used to compare the full genomic method with genotype × environment interaction (FGGE) to the approximate kernel method with genotype × environment interaction (APGE). The results of the analyses are described in the related publication.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Replication Data for: Joint use of genome, pedigree and their interaction with environment for predicting the performance of wheat lines in new environments

Osval Antonio Montesinos-Lopez, Philomin Juliana, Ravi Singh, Jesse Poland, Paulino Pérez-Rodríguez, Jose Crossa, Diego Jarquin (2019)

In this study, we evaluated genome-based prediction using 35,403 wheat lines from the Global Wheat Breeding Program of the International Maize and Wheat Improvement Center (CIMMYT). We implemented eight statistical models that included genome-wide molecular marker and pedigree information in two different validation schemes. All models included main effects, and some also considered interactions between the different types of covariates via Hadamard products of similarity structures. When only main effects were fitted, the pedigree models always gave better results than the genome-based models for predicting new lines in observed environments. However, for all traits, the highest predictive abilities were obtained when interactions between pedigree, markers and environments were included. When new lines were predicted in unobserved environments, the marker main-effects model was the best in almost all trait/year combinations. These results provide strong evidence that the different sources of genetic information (molecular markers and pedigree) are not equally useful at different stages of the breeding pipeline, and can be employed differentially to improve the design of future breeding programs.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY