Four Big Ideas from the Open Science Summit 2010

Updated on August 04, 2010 | By DJ Strouse

Microfinance & science, scientific reproducibility, and how to attract more researchers to open science.

Last weekend, half of my RSS, FriendFeed, and Twitter feeds assembled in Berkeley for the first major conference ever devoted to open science[¹] - the Open Science Summit 2010. The talks ranged from invigorating to completely inappropriate, but the real action was not on stage; it was in the hallways. Put a couple hundred hackers, scientists, and open science fanboys in a conference hall in Berkeley, add after-hours pub crawls, and simmer for three days, and you’ve got a recipe for disruptive ideas. I’ll outline my four favorites below.

1. The Synergy Between Microfinance and Open Science
At least in the US, funding for science typically flows like this: taxpayer -> government -> scientists. FundScience, SciFlies, and EurekaFund ask, “Why not cut out the middleman?” Their idea is to enable citizens to fund scientific projects directly. While any one citizen probably can’t afford to fund anything but mathematics (coffee is cheap), the collective donations of many science groupies can easily add up to support more resource-intensive projects.

I really like this idea because it beefs up the incentive for scientists to adopt open science practices. Why? Consider which projects are most likely to be funded by microfinance. If I’m a citizen about to throw several hundred dollars into a scientific project, I want to be able to see the science. A published paper every few months (or year) is not enough. I want to see the process, I want live updates, and I want to feel like my donation is moving science forward. In other words, citizens will be more likely to fund open science projects than traditional proprietary projects.

Microfinance needs open science because it needs a way to attract citizens and get them excited about the ongoing science of a particular lab. Open science needs microfinance in order to create clearer incentives for scientists to share their science.

2. Reproducibility as the Standard for Open Science
Science is supposed to provide a systematic way for us bumbling fools to avoid deceiving ourselves. One way it does so is by enforcing that our theories be based on results that are reproducible. Yet consider the last paper you read. Where was the raw data from which plots were produced? Where was the simulation code? Where were the exact experimental protocols? Could you really reproduce the results of that paper without this information?

Science should not require trust in another’s scientific infallibility. If you publish an interesting new discovery, I should have the opportunity to convince myself of your discovery by reproducing it. Science that is not reproducible is not science; it’s marketing.[²]

The standard of reproducibility provides an answer to the question: “Just how open should science be?” If we truly mean to do good science and avoid deceiving ourselves, we need to release every bit of data, code, protocol, and communication that would allow a colleague to reliably reproduce our results.

If you’re interested, you can read, listen to, or watch more on this idea from computational scientist and policy wonk Victoria Stodden.

3. Come for the Closed, Stay for the Open
There’s a problem with websites whose main benefits come from a large community of users: they’re only useful once plenty of people sign up, and early adopters will be bored in the meantime. Successful websites should be useful to single users or small groups, even if all their friends & colleagues haven’t signed up yet.

For web apps promoting open science, this means that the successful sites will be those that prove useful to individual researchers or research groups, regardless of whether or not their colleagues also use the site. For CoLab (a website enabling online scientific collaboration that Casey Stark and I built and demoed at OSS 2010), this means creating a rich set of tools that is useful for managing the workflow of individual scientists or groups.

Doing so is essential to convincing those who are on the fence about open science to give it a try. The goal is to draw scientists in with slick project management tools for their closed group activities, expose them to the lively discussion and new collaborations being formed over the open projects on the site, and gradually convince them that openness makes science more efficient and fun.

(Thanks to Jason Hoyt at Mendeley for pointing this out.)

4. New Vision for CoLab - Enable Scientific Debate Around Any Piece of Scientific Content
CoLab was inspired by PolyMath, Quantiki, and a few other experiments in open science from the theoretical physics & mathematics communities and was built by a pair of physics and math majors. Not surprisingly, the site is currently optimized for collaborating over projects that focus on discussion and equations. But Casey and I are aiming to make it stupid easy for all scientists to collaborate openly online, not just physicists and mathematicians. After a series of long discussions with Jean-Claude Bradley, Lee Worden, and other experimentalists who want to share more than equations, I think we’ve got a better idea of how to do so.

Our new vision for CoLab is to enable scientific debate around any piece of scientific content. We want to make it stupid easy to center a discussion around protocols, data, plots, published papers, papers in progress, simulations, code, or any other component of scientific research. As an experimentalist, I should be able to import a lab protocol, raw data, or manipulable plots based on a live feed from that raw data and discuss it online with collaborators across the globe. As a computational scientist, I should be able to import code or live simulations and troubleshoot online with anyone in the world who might be able to help. As a member of a journal club, I should be able to import a published paper and collaboratively highlight and annotate in-line with colleagues, from those in the lab next door to those in another country. As a researcher ready to publish, I should be able to host a working version of my paper online, collaboratively edit with any of my colleagues, and submit a link directly to a journal, without being forced to download the paper and make finishing touches offline. In short, as a scientist, I should be able to easily and openly discuss any piece of my science with my entire scientific community.

That’s no small task, but it’s what science needs and what we will continue to build.

[¹] Update (August 7, 2010): As pointed out by Greg Wilson in the comments below and Lisa Green of Creative Commons over lunch today, there have been plenty of open science conferences over the past decade. This sentence should really read: “…first major conference devoted to open science that this baby scientist & web dev noob had ever seen.”
[²] Update (August 4, 2010): After a fruitful discussion with Michael Nielsen (@michael_nielsen) and Seb Paquet (@sebpaquet) on Twitter, I should clarify that certain fields, such as astronomy, have fundamental barriers to reproducibility. As much as they might love to, physicists cannot summon supernovas on command. Thus, in observation-based fields, we should stress that data analysis be reproducible but not necessarily data collection. The key point is that information exchange between researchers should not be a barrier to reproducibility.