If you want to know what climate change will look like, you need to know what Earth's climate looked like in the past: what air temperatures were like, for example, and what ocean currents and sea levels were doing. You need to know what polar ice caps and glaciers were up to and — crucially — how hot the oceans were.

"Most of the earth is water," explains Peter Huybers, a climate scientist at Harvard University. "If you want to understand what global temperatures have been doing, you better understand, in detail, the rates that different parts of the ocean are warming."

Easier said than done.

To know how ocean temperatures are changing today, scientists rely on more than a century's worth of temperature data collected by sailors who used buckets to gather samples of water.

It's the best information available about how hot the oceans were before the middle of the 20th century, but it's full of errors and biases. Making the historical data more reliable led researchers on a wild investigation that involved advanced statistics and big data, along with early 20th century shipbuilding norms and Asian maritime history.

"I sometimes joke with my friends, I'm not only a climate scientist, I'm a detective!" says Duo Chan, a graduate student who led much of the analysis for an influential study published early this year.

The underlying problem Chan and Huybers were dealing with is that different countries used buckets made of different materials, in different sizes, on different lengths of rope — all things that could change a temperature reading.

For example, the water in a mid-size canvas bucket can lose up to 0.5 degree Celsius over the course of just a couple minutes, says Chan.

"Half a degree doesn't sound like a big deal, right? However if you look at the whole global warming, it's only, like, 1 degree," Chan explains. "Every 0.1 degree matters a lot."

He and Huybers set out to find and correct those tiny errors and biases within a massive database of historical sea surface temperature measurements maintained by researchers at the National Oceanography Centre in the United Kingdom.

The database has millions of entries from more than 100 sources, including Japanese whaling ships, Dutch naval vessels and a Norwegian Antarctic fleet. It's difficult to figure out how reliable any given measurement is.

"This is like if someone left you all their receipts that they had ever spent during their lives, and you were trying to piece together what they had been doing," says Huybers.

"It's a big data problem," says Chan, a "statistical nightmare."

They approached the nightmare from a novel angle: What if they could compare the measurements made by sailors from different countries, to see if some countries were systematically warmer or cooler in their temperature readings?

To do that, they paired up measurements taken when ships were close to each other, so sailors were measuring the same part of the ocean at around the same time, and then looked for patterns. The starkest pattern had to do with temperatures taken by Japanese ships in the 1930s: the measurements were too cold.
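For readers curious what that kind of pairing might look like in practice, here is a minimal sketch in Python. It is not the researchers' code: the `Reading` record, the distance and time thresholds, and the sample values are all hypothetical, chosen only to illustrate the idea of comparing nearby readings from different countries and averaging the differences.

```python
import math
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Reading(NamedTuple):
    nation: str    # hypothetical country code for the ship
    lat: float     # degrees north
    lon: float     # degrees east
    hours: float   # time of the reading, in hours from an arbitrary start
    temp_c: float  # measured sea surface temperature


def distance_km(a: Reading, b: Reading) -> float:
    """Great-circle (haversine) distance between two readings."""
    earth_radius_km = 6371.0
    phi_a, phi_b = math.radians(a.lat), math.radians(b.lat)
    d_phi = phi_b - phi_a
    d_lambda = math.radians(b.lon - a.lon)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi_a) * math.cos(phi_b) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(h))


def mean_offsets(readings, max_km=100.0, max_hours=24.0):
    """Average temperature difference for each pair of nations.

    The thresholds are illustrative assumptions, not the study's values.
    """
    diffs = defaultdict(list)
    for a, b in combinations(readings, 2):
        if a.nation == b.nation:
            continue  # only compare ships from different countries
        if abs(a.hours - b.hours) > max_hours:
            continue  # too far apart in time
        if distance_km(a, b) > max_km:
            continue  # too far apart in space
        # Order each pair alphabetically so the sign is consistent.
        first, second = sorted((a, b), key=lambda r: r.nation)
        diffs[(first.nation, second.nation)].append(first.temp_c - second.temp_c)
    return {pair: sum(d) / len(d) for pair, d in diffs.items()}


# Made-up readings: two "JP" and two "NL" ships close together, one "NO" ship far away.
sample = [
    Reading("JP", 30.0, 140.0, 0.0, 19.5),
    Reading("NL", 30.1, 140.1, 5.0, 20.0),
    Reading("JP", 30.2, 139.9, 10.0, 19.6),
    Reading("NL", 30.0, 140.2, 12.0, 20.1),
    Reading("NO", -60.0, 0.0, 0.0, 1.0),
]
offsets = mean_offsets(sample)
print(offsets)
```

In this made-up sample, the "JP" readings come out about half a degree cooler than the nearby "NL" readings, and the distant "NO" ship is never paired with anyone. Comparing every reading with every other one would be far too slow for a database with millions of entries, so a real analysis would need a smarter way to find neighbors.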

But why? Chan hypothesized that bigger ships might be to blame — perhaps buckets full of water were swaying in the wind for longer as they made their way up to higher decks, losing heat in the process. He analyzed Japanese ship data, and even learned to read Japanese to do it, and found that, indeed, Japanese ships had gotten taller.

But that all ended up being a red herring. Bigger ships weren't to blame for the erroneously low temperature measurements. The answer was even more mundane, and couldn't be solved using complex statistics or fancy computer models. Instead, the answer was written on an old U.S. Air Force document.

While Chan and Huybers were analyzing their millions of data pairs, researchers Elizabeth Kent and David Berry at the National Oceanography Centre in the United Kingdom had been going through thousands of pages of old documents about the data as part of their ongoing work to maintain the full sea surface temperature database.

The Japanese data in question had been digitized by the U.S. military after World War II, and one day Kent emailed Chan and Huybers with a clue.

"[She] sent us an email with just one PDF attached and it was a scan of a U.S. Air Force data sheet and she had circled part of the data sheet and she said 'hey, look at what we found here!' "

She had circled the word "truncation."

When the data was digitized, the U.S. military had dropped everything after the decimal point. A measurement of 15.1 degrees and a measurement of 15.9 degrees were both recorded as simply 15 degrees. Repeated over and over, those missing tenths of a degree added up to artificially cold measurements.
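The arithmetic is easy to check. The short Python sketch below is an illustration, not the researchers' method: it assumes a hypothetical set of readings whose tenths digits are spread evenly from .0 to .9, and it uses made-up function names. Under that assumption, dropping the decimals makes the average reading about 0.45 degrees too cold.

```python
import math


def truncate(reading_c: float) -> int:
    """Drop everything after the decimal point, as in the 15.1 and 15.9 example."""
    return math.trunc(reading_c)


def mean_truncation_bias(readings) -> float:
    """Average of (truncated value minus true value) across the readings."""
    return sum(truncate(r) - r for r in readings) / len(readings)


# Hypothetical readings: 15.0, 15.1, ..., 15.9 (tenths evenly spread, an assumption).
readings = [15 + tenth / 10 for tenth in range(10)]
bias = mean_truncation_bias(readings)

print(truncate(15.1), truncate(15.9))  # both become 15
print(round(bias, 2))                  # about -0.45 under the even-spread assumption
```

Ordinary rounding to the nearest degree would push some readings up and others down, so the errors would roughly cancel on average. Truncation only ever pushes positive readings down, which is why the errors pile up in one direction.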

Kent says she had mixed feelings when she realized the data had been rounded down to the nearest degree.

"Of course it was great that we knew how the data had been handled, so we can treat it appropriately in our analyses," she wrote in an email. "But it's also very frustrating that we don't have the full precision of each observation as it was recorded."

Still, knowing what happened allows scientists to understand what they're working with. The newly corrected data has far-reaching implications for climate science. Sea surface temperature is a big part of every major climate model, and for decades scientists have struggled to understand how the Pacific Ocean's relatively cool temperatures in the early 20th century fit into the overall trend of a warming planet.

"Often when you find errors in data it makes your life more complicated," says Huybers. "In this case it's actually the opposite."

Of course, if you zoom out, a warmer Pacific isn't particularly great news for humanity. "If you correct for the Japanese measurements, then basically you would warm up the global trend," says Chan. "That actually implies or suggests that maybe the human contribution is greater than what we used to think."

But, he says, having a more accurate understanding of the past climate is important if scientists want to understand what the future holds, which will be crucial if humans hope to avoid the most catastrophic effects of climate change.

Copyright 2019 NPR. To see more, visit https://www.npr.org.