How to organize successful Scientific Software Projects?

Wolfgang Bangerth (Texas A&M), Timo Heister∗ (Clemson University)
∗ [email protected]

Abstract

Software today provides the foundation for research in essentially all disciplines of the sciences and engineering, and in many cases the software packages are developed by loose collaborations of people who release their contributions under open source licenses. Here, we collect the necessary ingredients for successful scientific software projects to answer these questions:

1. What makes some projects successful (user base, longevity, funding) and others not?
2. How can we learn from successful projects? What are some best practices?

This poster is a summary of the paper [BH13]. Many of the lessons come from our experience as two of the main developers of the open source finite element library deal.II [BHH+15] and as the developers of the open source mantle convection community code ASPECT [BH+15, KHB12].

Overarching goals

Promote community growth
Save developer time
Engineer for maintainability

1. Quality

Best practices:
1. Aim for “it just works”
2. Promised functionality is more important than features
3. Avoid bugs using infrastructure (see 4.)

2. Documentation

Good documentation is crucial
Forms: manuals, reference documentation (modules/classes/functions), tutorials, Wikis, FAQs, readme
But also: mailing lists (and archives), code comments, training videos, private emails, lectures, conversations

Best practices:
1. Document on all levels:
   - high-level overview of the library (what? why? how?)
   - complete examples in tutorial form
   - high: module level documentation, how classes interact
   - medium: class level
   - low: functions, their parameters, pre/post conditions
   - internal: algorithmic choices inside functions
   - installation instructions

2. Start early:
   - very time intensive
   - writing after the fact is unrealistic
   - write documentation first or during development

3. Consider scalability:
   - can only help/train a limited number of individuals
   - developer time doesn’t scale (cannot help each user individually)
   - consider documentation forms that scale

4. Avoid out-of-date documentation:
   - it is worse than no documentation
   - consider forms that are easy to update (not books or email archives)
   - extract documentation from code to keep code and doc close (doxygen)

5. Consider all forms:
   - weigh pros/cons, choose appropriately
   - combine as needed
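Best practices 1 and 4 can be illustrated with a small sketch. The poster names doxygen for extracting documentation from code; the example below swaps in Python docstrings and the standard `doctest` module as a stand-in for the same idea, which is an assumption of this sketch and not something the poster prescribes. The function `trapezoid_rule`, its parameters, and the helper `count_doc_failures` are hypothetical names chosen for illustration.

```python
import doctest
import inspect


def trapezoid_rule(f, a, b, n):
    """Approximate the integral of f over [a, b] with n trapezoids.

    Parameters:
        f: callable taking one float and returning a float.
        a, b: end points of the interval.
        n: number of subintervals.

    Preconditions: a < b and n >= 1.
    Postcondition: the result is exact for linear f (up to round-off).

    Algorithmic choice: a uniform grid keeps the code short; an
    adaptive grid would be more accurate for non-smooth integrands.

    Example (checked automatically, so it cannot silently go stale):

    >>> trapezoid_rule(lambda x: 2.0 * x, 0.0, 1.0, 4)
    1.0
    """
    if not a < b:
        raise ValueError("precondition violated: need a < b")
    if n < 1:
        raise ValueError("precondition violated: need n >= 1")
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total


def count_doc_failures(func):
    """Run the examples embedded in func's docstring; return how many fail."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        inspect.getdoc(func),      # documentation text extracted from the code
        {func.__name__: func},     # names visible to the docstring examples
        func.__name__,
        "<docstring>",
        0,
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test).failed


if __name__ == "__main__":
    # The reference text comes straight from the source, so the two stay close.
    print(inspect.getdoc(trapezoid_rule))
    print("failing documentation examples:", count_doc_failures(trapezoid_rule))
```

Because the usage example lives in the docstring and is executed as a test, a change to the function that invalidates the documentation shows up as a failure instead of as silently outdated text.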

3. Building a community

Goals: grow community to survive
Need to attract external, unpaid developers for sustainability
Dynamic community: user → contributor → maintainer

Best practices:
1. Be welcoming, friendly, open, social and interact with humility, respect, and trust
2. Goal: maximize user base
3. Make transition to contributor as easy as possible (“lowering the bar”)
4. Provide and highlight incentives (give credit, show appreciation, workshop invitations, ...)

5. Misc

Other reasons for success:
Timing vs. competitors
Superior features
Backward compatibility
Licensing

Quality goals (section 1):
Provide promised functionality
Provide reasonably bug free code
Easy setup/installation
Goal: acquire/retain users

4. Technical aspects

Goals: save developer time, make contributions easier
Invest in supporting infrastructure, automation
Technical software design: modularity, extensibility, re-usability

Best practices:
1. Modular design by providing modules that can be used/recombined/amended
2. Have unit and integration tests
3. Establish a good contribution/review workflow (e.g. GitHub pull requests)
4. Continuous integration with automated testing (https://travis-ci.org, https://www.appveyor.com/, https://jenkins-ci.org/, https://www.docker.com/, ...)
5. Support many different platforms/compilers (code quality, reach)
6. Implement performance testing
7. Use scripting for automation: indentation, configuration, releases
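As a rough sketch of technical best practices 1 and 2 (modular design, unit and integration tests), the example below splits a tiny computation into two independent pieces, tests each piece on its own, and then tests them in combination. It uses Python's standard `unittest` module; the functions `make_grid` and `integrate_samples` and the helper `run_all_tests` are hypothetical and are not taken from deal.II or ASPECT.

```python
import io
import unittest


def make_grid(a, b, n):
    """Module 1: return n + 1 equally spaced points covering [a, b]."""
    h = (b - a) / n
    return [a + i * h for i in range(n + 1)]


def integrate_samples(values, h):
    """Module 2: trapezoidal sum of equally spaced samples with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


class GridUnitTest(unittest.TestCase):
    """Unit test: exercises make_grid in isolation."""

    def test_end_points_and_count(self):
        grid = make_grid(0.0, 1.0, 4)
        self.assertEqual(len(grid), 5)
        self.assertEqual(grid[0], 0.0)
        self.assertEqual(grid[-1], 1.0)


class IntegrateUnitTest(unittest.TestCase):
    """Unit test: exercises integrate_samples in isolation."""

    def test_constant_samples(self):
        # Three samples with spacing 0.5 cover an interval of length 1.
        self.assertAlmostEqual(integrate_samples([3.0, 3.0, 3.0], 0.5), 3.0)


class PipelineIntegrationTest(unittest.TestCase):
    """Integration test: checks that the two modules work together."""

    def test_linear_function_is_integrated_exactly(self):
        grid = make_grid(0.0, 2.0, 8)
        values = [2.0 * x for x in grid]
        h = grid[1] - grid[0]
        # The integral of 2x over [0, 2] is 4.
        self.assertAlmostEqual(integrate_samples(values, h), 4.0)


def run_all_tests():
    """Run every test case above and report whether all of them passed."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    for case in (GridUnitTest, IntegrateUnitTest, PipelineIntegrationTest):
        suite.addTests(loader.loadTestsFromTestCase(case))
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("all tests passed:", run_all_tests())
```

In a continuous integration setup, tests like these would typically run automatically on every proposed change, for example with `python -m unittest`, so that a contribution that breaks one module or the interaction between modules is caught before it is merged.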

Poster online: http://goo.gl/I3A1bZ

Bibliography

[BH13] W. Bangerth and T. Heister. What makes computational open source software libraries successful? Computational Science & Discovery, 6:015010/1–18, 2013.
[BH+15] W. Bangerth, T. Heister, et al. ASPECT: Advanced Solver for Problems in Earth’s ConvecTion, 2015. http://aspect.dealii.org/.
[BHH+15] W. Bangerth, T. Heister, L. Heltai, G. Kanschat, M. Kronbichler, M. Maier, and B. Turcksin. The deal.II library, version 8.3. preprint, 2015.
[KHB12] M. Kronbichler, T. Heister, and W. Bangerth. High accuracy mantle convection simulation through modern numerical methods. Geophysical Journal International, 191:12–29, 2012.

Documentation best practices, continued:
6. Consider simple ways to report/fix documentation issues

Made for the Computational Science and Engineering Software Sustainability and Productivity Challenges (CSESSP) Workshop 2015.

W. Bangerth was partially supported by the National Science Foundation under award OCI-1148116 as part of the Software Infrastructure for Sustained Innovation (SI2) program, and by the Computational Infrastructure in Geodynamics initiative (CIG), through the National Science Foundation under Award No. EAR-0949446 and The University of California – Davis. T. Heister was partially supported by the National Science Foundation under award DMS-1522191 and by the Computational Infrastructure in Geodynamics initiative (CIG), through the National Science Foundation under Award No. EAR-0949446 and The University of California – Davis.