bosc2015

Content from BOSC2015 unconference sessions on building successful open source bioinformatics communities

License: Creative Commons Zero v1.0 Universal (CC0-1.0)

Aim of the repo

To take the first steps in collecting contributions for an article that shares ideas and experience, from a wide range of people and open source bioinformatics communities, about what makes a great open source bioinformatics community.

If we get to the stage of publishing such an article, we hope it will be useful for others who want to start, or contribute to an existing, open source bioinformatics community, and who want to make it a great one.

We start by using fork and pull requests to edit the list of features of great open source bioinformatics communities that we put together during the unconference, in the file listOfFeaturesOfGreatOSBCs.md.

Communication

We've thought hard about how to communicate with each other around this project.

Ideally we'd like to:

  • be open in terms of allowing anyone interested to join us
  • be open in terms of keeping an open record of our discussions for all to see
  • have a very low technical skill threshold for getting involved, i.e. so that someone with relatively limited IT experience can also contribute their input/ideas
  • make it possible to contribute without sharing your email address
  • use a way of jointly editing documents, and of discussing our plans via messages, that is easy (i.e. requires little IT experience), preserves privacy, and remains open

We thought of doing it via Google Docs - but some people (I understand their concern!) don't want Google tracking their activity.

Etherpads are also an option for collaboration, but we're concerned they're too open to vandalism if their links get shared via social media.

Given that BOSCers and open source people in general are typically version-control savvy (part of the whole 'reproducible research' paradigm), we thought we'd go with the perhaps more fiddly, but more 'open', GitHub option.

We're not sure how well GitHub will allow us to have the discussions we need to have to make this publishable - but we decided that we'd at least give it a try.

To start off, we'll try building together a list of features of great open source bioinformatics communities. We'll do this by using fork and pull requests to jointly edit the list of features we put together during the unconference session, found in the file: listOfFeaturesOfGreatOSBCs.md

The big/small print

If you want to contribute to the article, then please be aware that we have decided to organise this project in the following way; if you get involved, we consider that this means you are OK with this plan:

  • all contributors to the repo will be invited as authors; we'll trust them to decide whether their contribution has been enough to warrant authorship
  • we'll assign authorship in the order in which people make their first commit to the repo
  • we will structure the article around a list of 'features' of great open source bioinformatics communities
  • we'll aim to publish somewhere that gives us open peer review

As mentioned elsewhere in this document, we are aware that we could organise things differently, and that some alternatives may well be better than the ones we've chosen here. But experience suggests that if we go into a project of this kind without having agreed on and intending to stick to some basic principles like this, then the project tends to stall in discussions about how to do it best.

Vision for article content

The provisional plan is to write it as a set of sections, each describing one feature that is common in great open source bioinformatics communities.

For each 'feature' we plan to describe:

  • reasons why this is important for a great community of this kind
  • challenges in both (a) establishing and (b) maintaining this feature in such a community
  • tips for both (a) establishing and (b) maintaining this feature

and, as far as possible, provide examples/evidence/anecdotes from communities we're a part of, that illustrate and support these points.

We'll bracket this with an introduction and conclusion, and aim to publish it with this relatively simple structure. Part of the introduction and conclusion may be a description of the process used to write the article.

We are well aware that alternative approaches and structures may deliver a more useful article. However, experience of making such decisions in large groups suggests that if we also open the structure up for discussion, we are likely to get drawn into a long discussion that makes the whole process much longer. We feel the above structure is at least 'good enough' and do not want to spend time identifying alternative structures. In our experience, such discussions have in the past meant that momentum for the project evaporates and it never gets finished. Thus, if you do contribute, please accept this chosen structure and work with us to make it as good as possible.

Aidan and Bjoern (briefly) discussed also including educational communities such as Data Carpentry or Software Carpentry, but we decided to stick with a focus on code-producing, software-development-specific ones.

Vision for process

What motivates our choice of process

We're keen to crowdsource the content so that it includes evidence from diverse communities and individuals.

We're also keen to make the process by which we develop the article open and transparent - firstly, given our commitment to openness, and secondly to potentially provide a clear example of how such a publication can be crowdsourced, which might be useful for others interested in doing something similar.

Initial process plan

We will aim to document the process carried out, with explanations for why decisions are made as they are, in a file in this repo: bosc2015UnconfWriteupDescribingProcess.md

The initial plan is to build on the results of the unconference sessions by collaborating via GitHub using fork and pull requests.

Vision for publication

Where to publish

In the interests of openness, we are keen to publish the article in a venue that provides open peer review.

Authorship

We will invite everyone who contributes to the manuscript via GitHub to be an author on anything we submit. We leave it to those who contribute to the repo to decide whether or not their contribution to the manuscript was enough to constitute authorship.

This could lead to people becoming authors on the basis of something that others would not consider enough of a contribution. However, we believe we can trust our community to behave appropriately here, not least because it will be clear to anyone who wants to find out how much each person has contributed to the text. Note, however, that it is also possible to contribute in ways other than providing text, i.e. 'number of committed characters' is not necessarily an accurate reflection of contribution.

We are aware that we could have chosen to do this differently. However, we consider this a reasonable compromise, and in the second unconference session it felt like the participants agreed with it. Thus, to make the process of developing the manuscript simpler, we've decided to do it this way, and we ask all authors to accept this and to contribute with it in mind.

Authorship order

Another issue is how to assign order of authorship. There is no easy way to estimate the relative importance of every author's contribution to such a project, especially if .

Thus we have decided to assign authorship in the order in which people make their first commit to the GitHub repo, and we ask all authors to accept this and to contribute with it in mind. This is a poor estimate of the size of a contribution, but all estimators of this are poor, and because we are working with many authors it becomes particularly difficult to settle the order through discussion.
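As a rough illustration of the "order of first commit" rule, here is a minimal sketch in Python. It is not a script from this repo: the function name author_order, the (author, timestamp) input format and the example names and dates are all hypothetical, and in practice the pairs would have to be extracted from the repo's commit history first.

```python
from datetime import datetime


def author_order(commits):
    """Return authors ordered by the time of their first commit.

    `commits` is an iterable of (author, timestamp) pairs in any order.
    This input format is an assumption made for the sketch.
    """
    first_commit = {}
    for author, when in commits:
        # Keep only the earliest commit seen for each author.
        if author not in first_commit or when < first_commit[author]:
            first_commit[author] = when
    # Sort authors by their earliest timestamp; ties fall back to the name.
    return sorted(first_commit, key=lambda a: (first_commit[a], a))


# Hypothetical commit log, for illustration only.
example_commits = [
    ("carol", datetime(2015, 7, 14, 9, 30)),
    ("alice", datetime(2015, 7, 12, 16, 0)),
    ("bob", datetime(2015, 7, 13, 11, 15)),
    ("alice", datetime(2015, 7, 15, 8, 45)),
]
print(author_order(example_commits))
```

Later commits by the same person do not change their position; only the first one counts, which is what makes the rule simple to apply to a large group of authors.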

How can I contribute?

Currently we're aiming to build consensus on a set of 'features' shared by most great open source bioinformatics software communities. We'll use these features as the backbone for the article, as described above in the section on "Vision for article content".

The justification for including a 'feature' is that we all perceive it as a good, concise description of a common feature of many/most/all great open source bioinformatics software communities.

The basis for changing the wording of a feature, adding a new feature, or merging features is that the change gives a more accurate, or more concise, description of these key features. The ideal list would be as short as possible (avoiding redundancy where possible) and as comprehensive as possible, i.e. ideally a randomly selected 'great' community of this kind would match most/all of the list of features, and most/all of the reasons for its greatness would be included in the list.

To build this list, please fork the repo, edit the file containing the list of proposed features, i.e. "listOfFeaturesOfGreatOSBCs.md", and open a pull request.

The edits we're looking for are:

  • adding additional points you think are missing
  • editing the wording of the current set of points to improve them
  • proposing mergers of existing points

When you make such edits, experience suggests it helps if you sign them (having signatures in the text body seems to help us keep track of the contributions/discussions we're involved in) and provide a short explanation, with reference to the rationale described above, of why you prefer the new wording, why you think the topics should be merged, or why you think it's essential for a great community of this kind to have that feature.

If you're planning to contribute to the article as an author on publication, we'll eventually need to know your name and affiliation. For now, so that we can have an overview of who and which projects are choosing to get involved, we'd appreciate it if you could add this information to the file: listOfContributors.md

Prior art

Plenty of articles and books on related topics exist; see the list of some of them below (feel free to add to the list if you know of important ones we are missing).

Despite these resources already existing, we believe there is utility in writing an article, in this way, on this topic. We see that utility as:

  • providing evidence for the features that are asserted as important for successful communities of this kind, drawn from many diverse communities; more than if it were written about just one or a few communities
  • the process of writing it promotes community among ourselves

Previous relevant articles/books include:

The Art of Community: Building the New Age of Participation, 2nd ed., Jono Bacon, O'Reilly, 2012, 539 pp., ISBN: 978-1-449-31206-0

Anatomy of BioJS, an open source community for the life sciences, Yachdav et al. 2015 eLife;4:e07009 DOI: http://dx.doi.org/10.7554/eLife.07009

A Quick Guide for Building a Successful Bioinformatics Community, Budd et al., PLOS Comp Bio 2015 DOI: 10.1371/journal.pcbi.1003972

Ten Simple Rules for Organizing an Unconference, Budd et al. PLOS Comp Bio 2015 DOI: 10.1371/journal.pcbi.1003905