This is the first entry in a series of articles telling the personal stories of Iowa State University faculty and scientists whose big ideas are changing the world for the better.
AMES, Iowa – Carolyn Lawrence doesn’t really like the phrase “big data.”
The buzzword has risen to prominence at the same time new digital technology has given us the ability to track and store data on seemingly everything with unprecedented ease and precision.
But the term “big data” gives the impression that it’s all about volume, which doesn’t do the topic justice, said Lawrence, an associate professor of genetics, development and cell biology. There are indeed huge amounts of data out there, but volume is only part of the picture. The complexity of the relationships between datasets and the rate at which new data can be generated also make data science a challenging and revolutionary field of study.
Big data has applications from business to public policy to science, but Lawrence has a better term for it: extreme information management.
Lawrence joined the faculty at Iowa State in January to continue her work in maize bioinformatics. One project involves the deployment of online tools to help plant breeders align the vast and growing sea of genetics data with more traditional field measurements.
If the motorcycle-riding Texas native is successful, the new information management system her lab is developing could usher in unprecedented opportunities for collaboration among plant breeders and geneticists. It would allow for faster advances in the development of better yielding, more robust crop varieties.
The data already exist. But that doesn’t matter if no one can understand them.
“There’s a big difference between having a huge amount of data and understanding the relationships within that data,” Lawrence said. “That’s where we come in.”
Big data and agriculture
In the digital age, everyone is creating data pretty much all the time. Every time you engage in commerce online or post something on Facebook, you’re adding a new data point to a set that someone can track. These data can be analyzed to allow businesses to spot new consumer trends or help public officials devise new policies – as long as they have the tools and know-how to understand huge amounts of complex data that are constantly changing.
In the past, that sort of data management was unthinkable. But new technology is making it possible.
In agriculture, harnessing data can lead to new levels of accuracy when it comes to planting the right seed variety in the right soil at the right depth. With equipment on the market today, a farmer can track all that information with a tablet or smartphone from the driver’s seat of a tractor.
It’s called precision farming, and it’s helping producers run their operations as efficiently as possible.
Lawrence said plant breeders would benefit greatly from similar tools to help them track the massive amounts of data they generate.
“Plant breeders need something, a resource they can use to store genomic, phenotypic and environmental data to help them plan their breeding strategies,” she said. “If everyone used a standardized tool, you could share information and get further, faster.”
Further, in this case, means producing crop varieties with higher yields and improved tolerance to stresses like drought and flood conditions.
Plant breeders in government, academia and private industry use a scattershot array of systems to manage data right now, an arrangement that limits collaboration. An open-source repository for all those data would be a big step forward, Lawrence said, especially if it also provided the analytical tools to help scientists predict how changes in a plant’s genetics will affect its performance in the field.
That sort of analytical information management system is exactly what her lab intends to give plant breeders. It’s a project that requires developers to have a rare combination of expertise in both biology and computer programming. As it happens, that’s the sort of background Lawrence brings to the table.
A touch of West Texas, a touch of Silicon Valley
Lawrence grew up on a West Texas wheat and cattle farm, and she has the drawl to prove it. She learned the value of agriculture from an early age and planned to pursue a career as a veterinarian, but an undergraduate research internship at Texas Tech led her to the world of plant biology. Turns out, it was a good fit.
During her Ph.D. work at the University of Georgia, Lawrence’s research involved selecting genes that could be responsible for chromosome movements during cell division, a process important for pollen development. It was data-intensive work, and she began to suspect that automating the data analysis process would allow her to work much faster.
So she took a class in computer programming and wrote a software program that would make sense of her data more rapidly. Turns out, that was a good fit too.
It was also at Georgia that Lawrence discovered another lasting interest: motorcycles. Today, she owns a 1983 Kawasaki Spectre and keeps a glossy motorcycle helmet on a shelf in her Iowa State office.
After a 10-year stint working in Ames for the USDA Agricultural Research Service on a community database of genetic information on corn, Lawrence was hired in December to put together a lab at Iowa State to develop the new information management system for plant breeders.
The data management tool currently under development will help breeders with any number of plant species, but corn has long enjoyed a special place in Lawrence’s heart, as evidenced by the corn-cob-shaped coffee mug she sips from in her office.
She leads a laboratory with five students who have a mix of experience in biology and computer programming.
“The most fun I have is problem-solving with the students,” Lawrence said. “I’m eternally impressed with their creativity. They come up with ideas I never would have.”
That creative spark sometimes lends the lab an atmosphere similar to a Silicon Valley tech company, she said.
A concept known as “agile development,” an idea important in the tech world, informs much of their work in the lab. It centers on the notion that developers can’t wed themselves to any single process or method. If a better way of doing things comes down the pike, the new method must be integrated into the project – even if that means scrapping what came before.
She said she expects to have the new data management system ready for plant breeders within a year, but that won’t be the end of the project. In accordance with agile development principles, the technology will require regular updates and improvements.
Where agriculture meets extreme data management, the potential for positive change is seismic.
“As technology evolves, we’re going to adopt the fastest, best and most user-friendly way forward, even if that forces us to rethink or challenge our older ideas,” Lawrence said. “It’s like trying to stand where the ground is constantly shifting, and that’s something we have to be comfortable with.”