The Computational Problem in Science
Scientific discovery increasingly relies on software and computation. Database interfaces and content management systems have replaced the lab notebook, simulations are replacing pencil-and-paper analytics, and nights once spent leaning over an experimental benchtop are now spent glued to a computer screen. But we’re doing it wrong.
The dream, of course, is that computers should empower scientists. Computers should help us produce more robust analyses, arrive at more accurate results, and make more discoveries. The reality, however, is that research science is far from that dream [1, 2, 3]. Scientists are often frustrated by data management, are unaware of code testing best practices, and struggle to collaborate in the absence of modern software development workflows [4, 5, 6, 7]. Accordingly, software bugs that went undetected in the review process have recently caused many high-profile journal article retractions [8, 9, 10, 11].
What with its ‘big data,’ intricate numerics, and fundamental philosophies of accuracy and reproducibility, scientific research should be replete with modern software development techniques and best practices. However, reproducibility of scientific software suffers for many reasons. For example, there is no room in university curricula to adequately train scientists in the computational and numerical skills that their research will rely on. Furthermore, the incentive structure behind academic science fails to reward effort spent on software robustness, testing, and provenance-driven workflows. Unsurprisingly, scientists who do somehow acquire computational skill often leave academic research for industry, where rewards for those skills are handsome.
Wonderwomen, To the Rescue
Working to bring reality closer to the dream, a non-profit organization, Software Carpentry, hosts bootcamps that spread awareness of and teach skills for scientific computation. Two weeks ago, at Lawrence Berkeley National Laboratory, the future of science began to look a little brighter as approximately 140 scientists and volunteer experts gathered for a two-day Software Carpentry bootcamp that was particularly special.
The first person to write an algorithm, the first people to hold the job title “computer,” the first people to program the ENIAC, and all of the attendees of this bootcamp have something in common. They were all women.
Despite a decades-long, hopeful trend toward gender equality across STEM disciplines (a modest increase in the number of women in most STEM fields), computer science has seen a steady decline in female participation.
The result is that science is increasingly reliant on a discipline dominated by men: computation. Almost every domain now boasts a speciality involving computing (Computational Physics, Computational Neuroscience, Bioinformatics, Quantitative Ecology – the list goes on). Unfortunately, in the same way that the sciences and computing have struggled to attract, include, and retain women, these specialized, computational subfields of the sciences sometimes struggle twofold.
The Women in Science and Engineering (WiSE) Software Carpentry bootcamp at LBNL was taught and attended entirely by women in the sciences in an effort to help counteract that struggle. Most Software Carpentry bootcamps are co-ed, but this bootcamp was the second in a series of female-focused bootcamps. The women who registered came from a mix of scientific and engineering disciplines and a range of skill levels. They included Bay Area students, faculty, laboratory staff, and industry scientists.
The all-female cast of seven volunteer instructors taught basic lab skills for computing, such as program design, version control, data management, and task automation. As the lead instructor, it was my great pleasure to work with local expert instructors Cindee Madison, Suzanne Kiihne, and Professor Rachel Slaybaugh, as well as experts from across the US: Azalee Bostroem, Jessica Kerr, and Molly Gibson. More than a dozen volunteer helpers of both genders also came to support the learning process as students were immersed in topics such as the Bash shell, Python, and Git.
The bootcamp seems to have been a resounding success. On the scientific quality front, attendees reported that they felt empowered to conduct their science more reproducibly after the bootcamp. And, on the gender equality front, many commented that the female-only learning environment contributed powerfully to an enjoyable learning experience. Wonderwomen – to the rescue!
[1] L. Hatton, “The T experiments: errors in scientific software,” Computing in Science and Engineering, vol. 4, no. 2, pp. 27–38, 1997.
[2] Z. Merali, “Computational science: Error, why scientific programming does not compute,” Nature, vol. 467, no. 7317, pp. 775–777, 2010.
[3] L. N. Joppa, G. McInerny, R. Harper, L. Salido, K. Takeda, K. O’Hara, D. Gavaghan, and S. Emmott, “Troubling trends in scientific software use,” Science, vol. 340, pp. 814–815, May 2013. PMID: 23687031.
[4] K. S. Ackroyd, S. H. Kinder, G. R. Mant, M. C. Miller, C. A. Ramsdale, and P. C. Stephenson, “Scientific software development at a research facility,” IEEE Software, vol. 25, no. 4, pp. 44–51, 2008.
[5] J. Segal, “When software engineers met research scientists: A case study,” Empirical Software Engineering, vol. 10, no. 4, pp. 517–536, 2005.
[6] J. E. Hannay, C. MacLeod, J. Singer, H. P. Langtangen, D. Pfahl, and G. Wilson, “How do scientists develop and use scientific software?,” in Proceedings of the 2009 ICSE Workshop on Software Engineering for Computational Science and Engineering, pp. 1–8, IEEE Computer Society, 2009.
[7] P. F. Dubois, T. Epperly, and G. Kumfert, “Why Johnny can’t build [portable scientific software],” Computing in Science & Engineering, vol. 5, no. 5, pp. 83–88, 2003.
[8] M. L. Bertoia, M. E. Waring, P. S. Gupta, M. B. Roberts, and C. B. Eaton, “Notice of Retraction,” Hypertension, vol. 60, Aug. 2012.
[9] G. Chang, “Retraction of ‘Structure of MsbA from Vibrio cholera: A Multidrug Resistance ABC Transporter Homolog in a Closed Conformation’ [J. Mol. Biol. (2003) 330, 419–430],” Journal of Molecular Biology, vol. 369, no. 2, 2007.
[10] Anon, “Retraction Notice to ‘Plasma PCSK9 Levels and Clinical Outcomes in the TNT (Treating to New Targets) Trial’ [J Am Coll Cardiol 2012;59:1778–1784],” Journal of the American College of Cardiology, vol. 61, no. 16, p. 1751, 2013.
[11] G. Miller, “A Scientist’s Nightmare: Software Problem Leads to Five Retractions,” Science, vol. 314, pp. 1856–1857, Dec. 2006.
[12] J. Vanderplas, “The big data brain drain: Why science is in trouble,” Oct. 2013.
[13] G. V. Wilson, “Software carpentry,” 2014.
[14] K. Huff, A. Brown, and Software-Carpentry, “Software carpentry, women in science and engineering,” Apr. 2014.
[15] B. A. Toole, Ada, the enchantress of numbers: a selection from the letters of Lord Byron’s daughter and her description of the first computer. Strawberry Press, 1992.
[16] D. A. Grier, When computers were human. Princeton University Press, 2013.
[17] National Science Foundation, National Center for Science and Engineering Statistics, “Women, minorities, and persons with disabilities in science and engineering: 2013,” Tech. Rep. Special Report NSF 13-304, Arlington, VA, United States, 2013.