Can simple math help solve science’s big questions?

As scientists struggle to publish, collaborate and keep up with current research, mathematicians and computer scientists are finding new ways to crunch data and answer big questions. Published on The Huffington Post, June 21, 2013

Following is an email interview with Max Little, Parkinson’s Voice Initiative founder, TEDMED speaker and TEDGlobal Fellow, who seeks to answer big questions using relatively simple algorithms.

You’ve been working to discover the practical value of abstract patterns in various fields, with surprising results in areas as varied as diagnosing Parkinson’s disease over the phone and predicting the weather. Can you explain your approach?

As an applied mathematician, my training shows me patterns everywhere. Electricity flows like water in pipes, and flocks of birds behave like turbulent fluids. In my projects, I collate mathematical models from across disciplines, ignoring the assumptions of any one discipline. To a large extent, I put in overly simple models, and I use artificial intelligence to throw out the inaccurate ones. This approach of exploiting abstract patterns has been surprisingly successful.

For example, during my PhD I stumbled across the rather niche discipline of biomedical voice analysis, originating in 1940s clinical work. By combining some new mathematical methods with recent mathematics in artificial intelligence, I was able to make accurate medical predictions about voice problems. The clinicians’ methods were not accurate. This sparked off research in detecting Parkinson’s disease from voice recordings – the basis of the Parkinson’s Voice Initiative.

But success like this raises suspicions. So, with collaborators, I tried to make this approach fail. We assembled 30,000 data sets across a wide range of disciplines: exploration geophysics, finance, seismology, hydrology, astrophysics, space science, acoustics, biomedicine, molecular biology, meteorology and others. We wrote software for 9,000 mathematical models from a deep dive into the literature. We exhaustively applied each model to each data set.
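To make the idea of applying every model to every data set concrete, here is a minimal sketch in Python. It is an illustration only, not the software described above. The three toy forecasting models, the two made-up series and the one-step-ahead error measure are all assumptions chosen to keep the example short.

```python
from statistics import mean

# Hypothetical toy models: each maps a history (list of floats)
# to a forecast of the next value.
def last_value(history):
    return history[-1]

def historical_mean(history):
    return mean(history)

def linear_trend(history):
    # Extrapolate the most recent step.
    return history[-1] + (history[-1] - history[-2])

MODELS = {
    "last_value": last_value,
    "historical_mean": historical_mean,
    "linear_trend": linear_trend,
}

# Made-up series standing in for data sets from different disciplines.
DATASETS = {
    "steady_ramp": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "noisy_flat": [5.0, 4.0, 6.0, 5.0, 4.0, 6.0, 5.0, 4.0],
}

def one_step_error(model, series, warmup=3):
    """Mean absolute error when each point is forecast from the points before it."""
    errors = [abs(model(series[:t]) - series[t]) for t in range(warmup, len(series))]
    return mean(errors)

def score_all(models, datasets):
    """Apply every model to every data set and record its error."""
    return {
        (model_name, data_name): one_step_error(model, series)
        for data_name, series in datasets.items()
        for model_name, model in models.items()
    }

def best_model_per_dataset(scores):
    """Keep only the lowest-error model for each data set."""
    best = {}
    for (model_name, data_name), err in scores.items():
        if data_name not in best or err < best[data_name][1]:
            best[data_name] = (model_name, err)
    return best

if __name__ == "__main__":
    for data_name, (model_name, err) in best_model_per_dataset(score_all(MODELS, DATASETS)).items():
        print(f"{data_name}: best model is {model_name} (error {err:.2f})")
```

Even in this toy setting the winner changes from one series to the next, which is the kind of result an exhaustive comparison is meant to surface.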

When finished, a very revealing, big picture emerged. We found that many problems across the sciences could be accurately solved in this way. In many cases, the best models were not the ones that would be suggested by prevailing, disciplinary wisdom.

Are you doing other research that might have implications for clinical diagnosis?

Here is another example. There is a decades-old problem in biomedical engineering: automatically identifying epileptic seizures from EEG recordings. We found over 150 models, some exceedingly simple, each of which, alone, could detect seizures with high accuracy.

Max Little at TEDMED 2013. Photo: Jerod Harris/TEDMED

This challenges quite a few assumptions, but we are not the first to find this. It often happens when new approaches are applied to old problems: for example, in obesity research, a new, simple mathematical model revealed some surprising relationships between weight and diet.

You’ve also used fairly simple algorithms to successfully predict weather.

After my PhD, I teamed up with a hydrologist and an economist. We wanted to try weather forecasting using some fairly simple mathematics applied to rainfall data. Now, weather forecasting throws $10m supercomputers and ranks of atmospheric scientists together, and they crunch the equations of the atmosphere to make predictions. So, competing against this Goliath with only historical data and a laptop would seem foolhardy.

But after two years of hard work, I came up with mathematics that, when fed with rainfall data, could make predictions often as accurate as weather supercomputers. We even discovered that models as simple as calculating the historical average rainfall, and using this as a forecast, were sometimes more accurate than supercomputers. We were all surprised. But this finding seems to line up with results that others have found in climate science: it is actually possible to make forecasts of future global temperatures using simple statistical models that are as accurate as the far more complex general circulation models relied upon by the Intergovernmental Panel on Climate Change.
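The historical-average forecast mentioned above is simple enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not the method from the research. The rainfall figures are invented, `other_forecast` is a hypothetical stand-in for a more complex model's output, and the skill score shown is one common way to compare two forecasts.

```python
from statistics import mean

def climatology_forecast(past_years):
    """Forecast each period as the average of that period over all past years."""
    return [mean(period_values) for period_values in zip(*past_years)]

def mean_absolute_error(forecast, observed):
    return mean(abs(f - o) for f, o in zip(forecast, observed))

def skill_score(forecast, reference, observed):
    """1 - MAE(forecast) / MAE(reference); positive means forecast beats reference."""
    return 1.0 - mean_absolute_error(forecast, observed) / mean_absolute_error(reference, observed)

# Invented quarterly rainfall totals in millimetres, one row per past year.
past_years = [
    [100.0, 60.0, 20.0, 80.0],
    [120.0, 40.0, 30.0, 90.0],
    [110.0, 50.0, 10.0, 70.0],
]
observed = [105.0, 55.0, 25.0, 75.0]        # what actually fell (invented)
other_forecast = [90.0, 70.0, 40.0, 95.0]   # hypothetical complex-model output

baseline = climatology_forecast(past_years)

if __name__ == "__main__":
    print("historical-average forecast:", baseline)
    print("baseline error:", mean_absolute_error(baseline, observed))
    print("other forecast error:", mean_absolute_error(other_forecast, observed))
    print("skill of baseline vs other:", round(skill_score(baseline, other_forecast, observed), 3))
```

With these invented numbers the baseline happens to win, but that says nothing about real forecasts. The point is only that the comparison itself needs nothing more than historical data and a laptop.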

Is this a new way of doing science?

If we divide science into three branches: experiment, theory and computer simulation, then what I am describing here doesn’t quite fit. These are not just simulations: the results are entirely reproducible with just the data and the mathematics. This approach mixes and matches models and data across disciplines, using recent advances in artificial intelligence.

I don’t know what to call this approach, but I’m not the only one doing it. The most enthusiastic proponents are computer scientists, who do something like this regularly in mass-scale video analysis competitions or one-off prizes financed by big pharma for molecular drug discovery, as do statisticians working in forecasting.

In your TEDMED 2013 talk, you expressed concern that advances in science have stagnated. Can you explain?

Like many scientists, I’m concerned that science is becoming too fragmented. So many scientific papers are published each year that it is impossible to keep track of most new findings. Since most articles are never read, much new research has never been independently tested.

And, unfortunately, scientists are encouraged to ‘hyper-specialize,’ working only in their narrow disciplines. It is alien to us applied mathematicians that a scientist who studies animal behavior might never read a scientific paper on fluid mechanics! In isolation from each other, could they just be duplicating each other’s mistakes?

What can we do to create a more unified approach?

First of all, open up the data. There is far too much politics, bureaucracy and lack of vision in sharing data among researchers and the public. Sharing data is the key to eliminating the lack of reproducibility that is becoming a serious issue. Second, don’t pre-judge. We need to have a renewed commitment to radical impartiality. Too often, favoured theories, models, or data persist (sometimes for decades), putting whole disciplines at risk of missing the forest for the trees.

Collaboration would greatly speed advances. Is first-to-publish attribution of scientific findings really that productive? I think of science as a collaborative journey of discovery, not a competitive sport of lone geniuses and their teams.

Scientific theories that can withstand this “challenge” from other disciplines will have passed a very rigorous test. Not only will they be good explanatory theories, they will have practical, predictive power. And this is important because without this mixing of disciplinary knowledge, we will never know if science is really making progress, or merely rediscovering the same findings, time and again.