Maize Genomes to Fields: 2014 and 2015 field season genotype, phenotype, environment, and inbred ear image datasets

Authors

Naser AlKhalifah, Darwin A. Campbell, Celeste M. Falcon, Jack M. Gardiner, Nathan D. Miller, Maria Cinta Romay, Ramona Walls, Renee Walton, Cheng-Ting Yeh, Martin Bohn, Jessica Bubert, Edward S. Buckler, Ignacio Ciampitti, Sherry Flint-Garcia, Michael A. Gore, Christopher Graham, Candice Hirsch, James B. Holland, David Hooker, Shawn Kaeppler, Joseph Knoll, Nick Lauter, Elizabeth C. Lee, Aaron Lorenz, Jonathan P. Lynch, Stephen P. Moose, Seth C. Murray, Rebecca Nelson, Torbert Rocheford, Oscar Rodriguez, James C. Schnable, Brian Scully, Margaret Smith, Nathan Springer, Peter Thomison, Mitchell Tuinstra, Randall J. Wisser, Wenwei Xu, David Ertl, Patrick S. Schnable, Natalia De Leon, Edgar P. Spalding, Jode Edwards and Carolyn J. Lawrence-Dill

Source

BMC Research Notes 2018(11):452

Abstract

Objectives: Crop improvement relies on analysis of phenotypic, genotypic, and environmental data. Given large, well-integrated, multi-year datasets, diverse queries can be made: Which lines perform best in hot, dry environments? Which alleles of specific genes are required for optimal performance in each environment? Such datasets also can be leveraged to predict cultivar performance, even in uncharacterized environments. The maize Genomes to Fields (G2F) Initiative is a multi-institutional organization of scientists working to generate and analyze such datasets from existing, publicly available inbred lines and hybrids. G2F's genotype by environment project has released 2014 and 2015 datasets to the public, with 2016 and 2017 collected and soon to be made available.

Data description: Datasets include DNA sequences; traditional phenotype descriptions, as well as detailed ear, cob, and kernel phenotypes quantified by image analysis; weather station measurements; and soil characterizations by site. Data are released as comma-separated value (CSV) spreadsheets accompanied by extensive README text descriptions. For genotypic and phenotypic data, both raw data and a version with outliers removed are reported. For weather data, two versions are reported: a full dataset calibrated against nearby National Weather Service sites and a second calibrated set with outliers and apparent artifacts removed.