Mid-career scientists get new tools for big data analysis

By Raman Babu & BS Vivek/CIMMYT

“It is the mark of a truly intelligent person to be moved by statistics,” George Bernard Shaw once said, and the 56 maize researchers who attended a mid-career refresher course on statistical and genomic analysis likely would agree.

Participants of the international refresher course on Statistical and Genomic Analysis

Five agriculture universities, seven national agriculture research systems, five seed companies from South and Southeast Asia, CIMMYT and ICRISAT were represented at the course, held 12-21 May at CIMMYT’s Hyderabad office.

Big data is now a reality and the volume, variety and velocity of data coming into the breeding programs are reaching unprecedented levels. The ability to swiftly sift through multi-location phenotypes and high-density genotypes enables breeders to continuously drive innovation and make the best selection decisions. The course was intended to strengthen the statistical underpinnings of modern crop improvement approaches, particularly for mid-career scientists and students involved in maize research.

Presenting certificates of completion to the participants. Photo: Dzung Do Van

A significant share of the training was devoted to hands-on practical assignments using data analysis platforms such as the open source R and the commercial package Genstat, with real datasets obtained from CIMMYT breeding programs. A range of analyses was demonstrated, including generation of BLUPs for large and unbalanced data, factorial regressions, QTL mapping, genome-wide association analysis, genomic selection, fine mapping, and genotype imputation.

“Getting to know an amazing variety of powerful statistical and molecular breeding tools will definitely help advance my breeding program,” said Mahendra Tripathi, a maize breeder with the National Maize Research Program, Nepal, who is pursuing a Ph.D. with CIMMYT as part of the Heat Tolerant Maize for Asia project. Brad Thada, a student from Purdue University in the U.S. who researches heat tolerance, said he particularly liked the big picture of maize improvement that he could capture, while Ryan Gibson, also from Purdue, admired the fine mapping part of the course, which gave him an opportunity to understand the entire process of marker discovery and how to fine-tune it to breeder-ready applications. Willy Bayuardi from Indonesia’s Bogor Agricultural University said he found the course intensely educational, especially the “Meta-R” suite of programs that summarize R Script-based statistical analyses in a user-friendly interface.

Mateo Vargas and Gregorio Alvarado from the Biometrical and Statistical Unit of CIMMYT-Mexico facilitated the statistics part of the training as key resource persons. The molecular breeding team of CIMMYT-India (Raman Babu, Sudha Nair, Girish Krishna and S. Gajanan), along with Willy Bayuardi, Jefferson Paril (Institute of Plant Breeding, University of Philippines) and ICRISAT staff, led the genomic analysis part. The course was coordinated by B.S. Vivek, Maize Breeder, and Raman Babu, Molecular Breeder, both of CIMMYT-India, Hyderabad.