Author: Jose Crossa

Results from rapid-cycle recurrent genomic selection in spring bread wheat

Susanne Dreisigacker, Leonardo Abdiel Crespo Herrera, Alison Bentley, Jose Crossa (2022)

Empirical studies of early-generation genomic selection (GS) strategies for parental selection or population improvement are still lacking in wheat and other major crops. We show the potential of rapid-cycle recurrent GS to increase genetic gain for grain yield in wheat: realized genetic gain for grain yield was consistent after three cycles of recombination of bi-parental F1s, when summarized across two years of phenotyping.

Dataset

CIENCIAS AGROPECUARIAS Y BIOTECNOLOGÍA

Replication Data for: A comparison between three machine learning methods for multivariate genomic prediction using the Sparse Kernels Methods (SKM) library

Osval Antonio Montesinos-Lopez, Pedro César Santana Mancilla, Jose Crossa (2022)

Genomic selection (GS) provides a new way for plant breeders to select the best genotypes. It draws on historical phenotypic and genotypic information to train a statistical machine learning model, which is then used to predict phenotypic (or breeding) values of new lines for which only genotypic information is available. Many statistical machine learning methods have been proposed for this task, but multi-trait (MT) genomic prediction models are preferred because they take advantage of correlated traits to improve prediction accuracy. This study contains six datasets that were used to compare the prediction performance of three MT methods: the MT genomic best linear unbiased predictor (GBLUP), MT partial least squares (PLS), and MT Random Forest (RF). The data come from groundnuts, rice, and wheat. The accompanying article describes the results of the analysis.

Dataset

CIENCIAS AGROPECUARIAS Y BIOTECNOLOGÍA

Replication Data for: Bayesian linear regression near infrared spectroscopy (NIR) to predict provitamin A carotenoids content in maize breeding programs

Jose Crossa, Thanda Dhliwayo, Thokozile Ndhlela, Natalia Palacios Rojas (2021)

Vitamin A deficiency (VAD) is a public health problem worldwide. For countries with a high per capita consumption of maize, breeding varieties with higher provitamin A carotenoid content than normal yellow maize — biofortification — can be a viable strategy to reduce VAD. Selection for provitamin A carotenoid content uses molecular markers and phenotypic data generated using expensive and laborious wet lab analyses. Near-infrared spectroscopy (NIRS) could be a fast and cheap method to measure carotenoids. This dataset contains carotenoid and NIRS data from 1857 tropical maize samples used as a training set to predict provitamin A carotenoid content of an independent set of 650 tropical maize samples using Bayesian linear regression models. The datasets contain information about specific carotenoids measured and the NIRS values measured at different wavelengths. The results of the analysis are described in the accompanying article.
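As a rough illustration of the kind of model named above, the following sketch fits a Bayesian ridge-type linear regression to spectra and predicts an independent set. It is not the authors' implementation: the data are simulated, the sample and wavelength counts are hypothetical, and the two variance components (`sigma2_e`, `sigma2_b`) are assumed known, so the posterior mean has a closed form rather than being sampled by MCMC.

```python
import numpy as np

def bayes_ridge_posterior_mean(X, y, sigma2_e=1.0, sigma2_b=1.0):
    """Posterior mean of b for y = X b + e with b ~ N(0, sigma2_b I)
    and e ~ N(0, sigma2_e I); both variances are treated as known."""
    lam = sigma2_e / sigma2_b                      # shrinkage ratio
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
# Hypothetical sizes; the real dataset has 1857 training and 650 test samples.
n_train, n_test, n_wavelengths = 200, 50, 30
X_train = rng.normal(size=(n_train, n_wavelengths))   # simulated "spectra"
X_test = rng.normal(size=(n_test, n_wavelengths))
true_b = rng.normal(size=n_wavelengths)
y_train = X_train @ true_b + rng.normal(scale=0.5, size=n_train)
y_test = X_test @ true_b + rng.normal(scale=0.5, size=n_test)

# Centre with training-set means so no intercept column is needed.
x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
b_hat = bayes_ridge_posterior_mean(X_train - x_mean, y_train - y_mean,
                                   sigma2_e=0.25, sigma2_b=1.0)
y_pred = y_mean + (X_test - x_mean) @ b_hat

# Prediction accuracy as the correlation between predicted and observed values.
accuracy = np.corrcoef(y_pred, y_test)[0, 1]
print(f"correlation on the independent set: {accuracy:.3f}")
```

In practice the variance components would be estimated from the data rather than fixed, and real NIRS spectra are strongly collinear across wavelengths, which is one reason shrinkage priors are used.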

Dataset

CIENCIAS AGROPECUARIAS Y BIOTECNOLOGÍA

Aerial High-Throughput Phenotyping Enabling Indirect Selection for Grain Yield at the Early-generation Seed-limited Stages in Breeding Program - data for publication

Suchismita Mondal, Jose Crossa, Ravi Singh, Jesse Poland (2020)

The files contain pedigree information on the lines used in the study; trait data for grain yield, spectral traits, and other agronomic traits; and genotypic data.

Dataset

CIENCIAS AGROPECUARIAS Y BIOTECNOLOGÍA

Evaluation of maize pre-breeding materials under the Seeds of Discovery initiative in 2017

Terence Molnar, Marcela Carvalho, Juan Burgueño, Jose Crossa, Cesar Petroli, Monica Mezzalama, Sarah Hearne (2019)

These data describe the evaluation of landrace-derived pre-breeding materials for biotic and abiotic stress resistance as well as for general yield potential in 2017. Populations of interest for drought stress during flowering time, heat stress during flowering time, and Tar Spot tolerance were evaluated for yield potential and response to the stresses under the MasAgro Biodiversidad project. Populations of interest for MCMV tolerance were evaluated for response to stress under the MAIZE CRP project.

Dataset

CIENCIAS AGROPECUARIAS Y BIOTECNOLOGÍA

Supplemental data for hybrid wheat prediction using genomic, pedigree and environmental covariables interaction models

Bhoja Basnet, Jose Crossa, Paulino Pérez-Rodríguez, Ravi Singh, Fatima Camarillo-Castillo (2018)

Genomic prediction of hybrids unobserved in field evaluations is crucial. In this study, we used genomic G×E models for hybrid prediction, where similarity between lines was assessed by pedigree and molecular markers, and similarity between environments was accounted for by environmental covariables.
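One common way to encode these similarities is through covariance (kernel) matrices: one among lines built from markers, one among environments built from environmental covariables, and their element-wise product for the interaction. The sketch below is an assumption about how such matrices can be formed, not the study's code; the marker and covariable values are simulated, all sizes and names are hypothetical, and a pedigree-derived relationship matrix could be substituted for the marker-based one in the same way.

```python
import numpy as np

def linear_kernel(M):
    """Standardize columns, drop constant ones, return M M' / n_columns."""
    M = M - M.mean(axis=0)
    sd = M.std(axis=0)
    keep = sd > 0                      # constant columns carry no information
    M = M[:, keep] / sd[keep]
    return M @ M.T / M.shape[1]

rng = np.random.default_rng(1)
n_lines, n_envs, n_markers, n_covs = 6, 3, 40, 5     # hypothetical sizes
markers = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)
env_covs = rng.normal(size=(n_envs, n_covs))

G = linear_kernel(markers)     # similarity between lines (markers)
O = linear_kernel(env_covs)    # similarity between environments (covariables)

# One record per line-environment combination.
line_idx = np.repeat(np.arange(n_lines), n_envs)
env_idx = np.tile(np.arange(n_envs), n_lines)

# Expand both kernels to record level, then take the element-wise product
# to obtain the covariance structure of the G x E interaction term.
K_line = G[np.ix_(line_idx, line_idx)]
K_env = O[np.ix_(env_idx, env_idx)]
K_gxe = K_line * K_env
print(K_gxe.shape)
```

The record-level matrices `K_line`, `K_env`, and `K_gxe` would then enter a mixed or Bayesian model as the covariance structures of the corresponding random effects.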

Dataset

CIENCIAS AGROPECUARIAS Y BIOTECNOLOGÍA

Replication Data for: Multi-trait genome prediction of new environments with partial least squares

Osval Antonio Montesinos-Lopez, Brandon Alejandro Mosqueda González, Marco Alberto Valenzo-Jimenez, Jose Crossa (2022)

The genomic selection (GS) methodology has revolutionized plant breeding. It makes predictions for genotyped candidate lines using statistical machine learning algorithms trained on phenotypic and genotypic data from a reference population. GS can save significant resources in the selection of candidate individuals. However, plant breeders can face challenges when implementing it in practice to make predictions for future seasons or for new locations and/or environments. To help address this challenge, this study explores the multi-trait partial least squares (MT-PLS) regression methodology and compares its performance with the Bayesian multi-trait genomic best linear unbiased predictor (MT-GBLUP) method. Benchmarking was performed with the five real datasets included in this study. The results of the analysis are reported in the accompanying article.

Dataset

CIENCIAS AGROPECUARIAS Y BIOTECNOLOGÍA

Genomic and pedigree prediction with genotype × environment interaction in spring wheat grown in South and Western Asia, North Africa, and Mexico

Sivakumar Sukumaran, Jose Crossa, Carlos Jara, Marta Lopes, Matthew Paul Reynolds (2016)

Increases in genetic gains in grain yield can be accelerated through genomic selection (GS). In the present study, seven genomic prediction models were evaluated under two cross-validation scenarios on the Wheat Association Mapping Initiative population of 287 advanced elite lines phenotyped for grain yield (GY), thousand grain weight (GW), grain number (GN), and thermal time for flowering (TTF) in 18 environments (year-location combinations) in major wheat-producing countries in 2010 and 2011. Of the seven models, four (model 1 (L+E), model 2 (L+E+G), model 3 (L+E+A), and model 4 (L+E+A+G)) included only main effects (lines (L), environments (E), genetic relationship matrix (G), and pedigree-derived matrix (A)), and three (model 5 (L+E+A+AE), model 6 (L+E+G+GE), and model 7 (L+E+G+A+AE+GE)) added interaction effects (A×E, G×E, or both) to the main effects. Two cross-validation (CV) schemes were applied: (1) predicting lines' performance at untested sites (CV1) and (2) predicting lines' performance at some sites from their performance at other sites (CV2). Under CV1, the models with interaction terms had the highest average prediction accuracy: models 6 and 7 for GY (0.31) and GN (0.30), and model 5 for TTF (0.26). Models 3 and 7 2, were the best models for GW (0.45 each) under the CV1 scenario. Under CV2, prediction accuracy was generally high for the models with interaction terms: models 5, 6, and 7 for GY (0.39) and models 5 and 7 for GN (0.43). For GW and TTF, prediction accuracies were similar across models. The results indicate that genomic selection can be used to predict genotype by environment (G×E) interaction in multi-environment trials, both to select varieties for release and for accelerated breeding.
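The two cross-validation schemes differ in which records are hidden from the model. The sketch below shows one common reading, stated here as an assumption rather than the study's exact partitioning: under CV1 every record of a held-out line is masked, so the line is predicted with no phenotypes of its own; under CV2 only some line-by-environment records are masked, so each line remains observed in other environments. The function names and sizes are hypothetical.

```python
import numpy as np

def cv1_mask(line_idx, n_lines, test_frac, rng):
    """True marks hidden records: all records of the sampled lines."""
    n_test = max(1, int(round(test_frac * n_lines)))
    test_lines = rng.choice(n_lines, size=n_test, replace=False)
    return np.isin(line_idx, test_lines)

def cv2_mask(line_idx, n_lines, rng):
    """True marks hidden records: exactly one randomly chosen record per
    line, so every line is still observed in its remaining environments."""
    mask = np.zeros(line_idx.shape[0], dtype=bool)
    for line in range(n_lines):
        records = np.flatnonzero(line_idx == line)
        if records.size > 1:           # keep at least one visible record
            mask[rng.choice(records)] = True
    return mask

rng = np.random.default_rng(2)
n_lines, n_envs = 10, 4                # hypothetical; the study has 287 and 18
line_idx = np.repeat(np.arange(n_lines), n_envs)   # one record per line x env

hidden_cv1 = cv1_mask(line_idx, n_lines, test_frac=0.2, rng=rng)
hidden_cv2 = cv2_mask(line_idx, n_lines, rng=rng)
print(hidden_cv1.sum(), hidden_cv2.sum())
```

A model is fitted on the visible records and its predictions for the hidden ones are correlated with the observed values, typically within each environment, to give the prediction accuracies reported above.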

Dataset

CIENCIAS AGROPECUARIAS Y BIOTECNOLOGÍA

Replication Data for: Genomic Prediction with Genotype by Environment Interaction Analysis for Kernel Zinc Concentration in Tropical Maize Germplasm

Edna Mageto, Jose Crossa, Paulino Pérez-Rodríguez, Thanda Dhliwayo, Natalia Palacios Rojas, Xuecai Zhang (2020)

The Zinc association mapping (ZAM) panel is a set of 923 elite inbred lines from the International Maize and Wheat Improvement Center (CIMMYT) biofortification breeding program. The panel represents wide genetic diversity for kernel zinc (Zn) and includes several lines with tolerance or resistance to an array of abiotic and biotic stresses commonly affecting maize production in the tropics, improved nitrogen use efficiency, and improved grain nutritional quality. The files ZAM panel_923_LINES_GENO and Zinc association mapping (ZAM) panel_phenotypic data contain the GBS and Zn phenotypic data, respectively, for this population. From the ZAM panel, four inbred lines (two high-Zn and two low-Zn) were selected and used to form two bi-parental populations, DH population1 and DH population2. The genotypic and phenotypic data for these populations are in the files DH populations1&2_255_LINES_GENO, DH population1_phenotypic data, and DH population2_phenotypic data.

Dataset

CIENCIAS AGROPECUARIAS Y BIOTECNOLOGÍA