The Open Science Movement

June 14, 2010

As the internet becomes more diverse, there continues to be intense debate about how to best leverage it for both productivity and fairness. This debate extends to many areas - politics, journalism, design, and also science.

The move to post scientific research data online is part of a movement called "Open Science" (although it has other names as well - see our article on Open Science Terminology). A (relatively) recent article in Science focuses on this trend, including the most extreme version - Open Notebook Science. In this variant, all of the raw data is posted online as soon as it becomes available.

While the idea is noble, it of course comes with a load of problems that have yet to be ironed out. In addition to the legal and ethical issues (which can be partly alleviated by clear statements of the author's intentions, as suggested by the Panton Principles), there are also some serious logistical and practical considerations.

Let's take one example that I, as an online author, know well: writing something online can be like shouting into a tornado. You have no real way of guaranteeing you'll ever be heard over the roar that's out there. If the majority of scientists were publishing their data directly online, without the benefit of a formal peer-review process to filter out the truly unworthy information, the scientific community could suffer from information overload. Mechanisms would need to be put in place to help the truly worthy information rise to the top ... and almost any system that was put in place would certainly draw complaints from those who did not rise to the top.

And, since this is not just the written report but also the raw data, a whole new challenge enters the picture: What is the best format in which to distribute this data? When it comes to scientific data, Marshall McLuhan's dictum that "the medium is the message" is extremely relevant. If you provide data in a format that another scientist cannot use, then that data is useless to them. The data only becomes meaningful when the format is meaningful. A number of scientists and programmers are already working on this sort of accessibility problem, and as it gets solved, it will hopefully contribute to greater open access to data.

The trend toward more straightforward, easy communication among scientists is not limited to just the posting of raw science data, of course. Consider LabSpaces.net, a website intended to leverage the power of new social networking structures in a way that is more conducive to productive scientific discourse. Also, many scientists regularly check arXiv.org for the newest science "preprints," where scientific papers are posted before they get published in the official peer-reviewed journals ... meaning that anyone who wants to read a paper can do so, without needing to subscribe to a costly scientific journal. (Unfortunately for many of us, some of the most popular journals, and therefore some of the most interesting papers, do not publish papers that are distributed on arXiv.org or other open access sites online.)

It remains to be seen what all of these new ways of communicating science will ultimately do to the field itself, but they appear to be pointing the way toward even more transparency in the sciences, and possibly more points of contact between scientists and the non-scientific community. Both would be, I think, steps in the right direction.

For those who want to think more about the amazing potential of the internet to mobilize massive numbers of people toward a goal, I recommend this TED talk by Clay Shirky.

