
On Bias

Some of the best science stories seem to emerge in a three-step process. Step 1: Someone points out an error in your thinking, or you can’t let go of a nagging feeling that you are somehow wrong about what you think your data are telling you. Step 2: In a deep, and sometimes painful, exploration of what went wrong with your own analysis, you discover a systematic error in the way whole groups of researchers are thinking about the issue. Step 3: The new story you are able to tell by looking at the data afresh turns out to be more interesting than the standard account.
Nenana. Photo by Julie Coghill

The wonderful invertebrate biologist and cnidarian expert Vicki Pearse just related to me several stories from her own career of having to face her own biases and emerging with a delightful discovery on the other side. One story concerned the common belief, still repeated in books and on aquarium placards, that the fabulous green color of West Coast Anthopleura anemones is due to an algal symbiont. As Vicki reports: “After trying for months to bleach anemones in the dark, and seeing the green color remain, I finally looked into the tissues: no symbionts. They were long gone. The green color, a GFP produced by the animal, does fade in the dark, but slowly. Zooxanthellae aren't green, they are brown, duh.”

I’ve had my own experiences of challenging both my own conclusions and a larger scientific consensus with a deeper observational approach, and emerging with something entirely new. It once happened when Fiorenza Micheli and I were publishing the story of the Nenana Ice Melt contest record in Science. The overall story was that a then-85-year-old gambling contest, in which people guess the exact minute in spring when a wooden tripod placed on the frozen Tanana River will fall into the river, turned out to be a really good record of climate warming in a place that had few reliable climatic data. This is an example of what Aníbal and I call “Accidental Ecological Knowledge” in Observation and Ecology.

In the original manuscript, Fio and I reported in the text that 1948 was the earliest date of melt in the 85-year record, but an extremely observant reviewer noted that in our figure, which plotted day of year of ice melt (incorrectly called "Julian Day" in our original manuscript) against year, 1998 appeared to be the earliest melt.
Indeed, as I looked over the figure again and again, I could not deny that the symbol for 1998 was lower on the y-axis than the one for 1948. Yet in the data table I had for dates and times of ice melt (provided to me by the contest organizers), the date of ice melt in 1948 was the same as the date in 1998 and the time was earlier in the day, so 1948 “should” have been plotted lower. I was about to just chalk it up to bad Microsoft products and jigger the offending point manually when I told myself to look again. I eventually realized that 1948 was a leap year and 1998 was not, and a few plotting experiments showed me that if you just report the numerical day of year for the same calendar date (say March 20) in a leap year vs. a non-leap year, you get a different value, because the leap year has squeezed an extra day into your count: March 20 is day 80 in a leap year, but only day 79 in a non-leap year.

That, in turn, left me thinking, "Why would an artificial correction of the calendar have a significant effect on how we report natural events?" This led me into an intense study of the history of our calendar (Duncan Steel’s book Marking Time was very helpful in this regard) and of the reform of the calendar by Pope Gregory XIII in 1582 to more accurately (but not perfectly) reflect the true average year. Ironically, this correction introduces a bias in reporting phenology (the timing of natural events such as spring blooms, migrations, and loss of autumn foliage) over the long term on an artificial calendar. Spring events appear to arrive earlier than they actually do, because the actual advent of spring comes earlier and earlier on the calendar throughout a given century. I ultimately published a provocative paper in Nature about how all phenological studies published so far are biased by reporting calendar date rather than something real like "days since the spring equinox that year".
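The leap-year offset behind the misplaced point is easy to reproduce. What follows is only a minimal Python sketch, not the analysis from the paper: the helper name `day_of_year` is made up for illustration, and March 20 is used simply because it is the example date mentioned above.

```python
import calendar
from datetime import date


def day_of_year(d: date) -> int:
    """Ordinal day within the year, counting January 1 as day 1."""
    return d.timetuple().tm_yday


# The same calendar date in the two years from the story.
for year in (1948, 1998):
    d = date(year, 3, 20)
    kind = "leap" if calendar.isleap(year) else "non-leap"
    print(f"{d.isoformat()} ({kind} year): day {day_of_year(d)}")

# Dates before February 29 are unaffected by the leap day.
print(day_of_year(date(1948, 2, 20)), day_of_year(date(1998, 2, 20)))
```

Run as written, this prints day 80 for 1948 and day 79 for 1998, while the two February dates both come out as day 51. Any event after the end of February is therefore shifted by one day on a day-of-year axis purely by the calendar, which is enough to reorder two points that share the same calendar date.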
Reporting dates relative to that year's spring equinox is what Fiorenza and I ultimately did in our paper on the Nenana Ice Classic, but the experience humbled me with respect to reporting natural events on artificial scales. I’m pretty sure there are even better ways to report nature’s timing than the way we did. Friends at the National Phenology Network have suggested it might be better to report "time since the solar angle was x at the study location y”. Since more and more citizens are taking to recording natural events, it is good to know that this data bias only affects how scientists deal with the data when looking for long-term trends. If you record the first arrivals of robins in your backyard, or the first break in the ice on a neighborhood pond, you can still just record the calendar day.

Decades ago, the biologist and philosopher Ed Ricketts cautioned observers of the world to be careful of their predetermined notions, which he felt emerged when we try to immediately look for explanations of “why” a certain pattern exists. As he wrote in Sea of Cortez, “When a person asks, ‘Why?’ in a given situation, he usually deeply expects, and in any case receives, only a relational answer in place of the definitive ‘because’ which he thinks he wants.” He espoused instead what he called “non-teleological” thinking, which “concerns itself primarily not with what should be, or could be, or might be, but rather with what actually ‘is’.” This non-teleological view aligns nicely with the recursive, ever-expanding process of achieving ecological understanding, or, as Ricketts wrote, “In the non-teleological sense there can be no ‘answer.’ There can be only pictures which become larger and more significant as one’s horizon increases.”