SE4CS: What Do We (Really) Know About Scientific Software?

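| Category | Characteristic | Conclusion |
| --- | --- | --- |
| Limitations of computer hardware | Development is Driven and Limited by Hardware | No-evidence |
| Limitations of computer hardware | Use of Old Programming Languages and Technologies | Doubt |
| Limitations of computer hardware | Intermingling of Domain Logic and Implementation Details | Endorse |
| Limitations of computer hardware | Conflicting Software Quality Requirements | No-evidence |
| Nature of scientific challenge | Requirements are Not Known up Front | No-evidence |
| Nature of scientific challenge | Verification and Validation are Difficult and Strictly Scientific | No-evidence/Endorse |
| Nature of scientific challenge | Overly Formal Software Processes Restrict Research | No-evidence |
| Limitations of cultural differences | Different Terminology | Endorse |
| Limitations of cultural differences | Creating a Shared Understanding of a Code is Difficult | Doubt |
| Limitations of cultural differences | Little Code Reuse | Doubt |
| Limitations of cultural differences | Scientific Software in Itself has No Value But Still It is Long-Lived | Doubt |
| Limitations of cultural differences | Few Scientists are Trained in Software Engineering | Doubt |
| Limitations of cultural differences | Disregard of Most Modern Software Engineering Methods | Doubt |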

a) Labelled commits:

  • Retrieve the commit data by running python commits/commits_data.py. The file can be adjusted to draw a subset of commits as a sample for manual labelling.
  • The guideline for the labelling is included here.
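Drawing a sample for manual labelling can be sketched as below. This is a minimal illustration, not the logic of commits/commits_data.py: the function name sample_commits, the column names, and the inline CSV rows are all hypothetical placeholders. It assumes the retrieved commits are available as rows and picks k of them reproducibly from a fixed seed.

```python
import csv
import io
import random


def sample_commits(rows, k, seed=0):
    """Return up to k rows drawn without replacement, reproducibly for a seed."""
    rows = list(rows)
    k = min(k, len(rows))  # never ask for more rows than exist
    return random.Random(seed).sample(rows, k)


# Placeholder data standing in for the retrieved commits.
# Column names and values are assumptions, not the repository's schema.
EXAMPLE_CSV = """project,sha,message
example-a,0001,placeholder message one
example-b,0002,placeholder message two
example-c,0003,placeholder message three
"""

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(EXAMPLE_CSV))
    for row in sample_commits(reader, k=2, seed=42):
        print(row["project"], row["sha"], row["message"])
```

Fixing the seed means that several labellers can regenerate the same sample independently.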

b) GitHub attributes of the projects:

  • Update the access_token column in data/project_list_[cs/se] with your GitHub token.
  • Retrieve the attributes of the cs or se projects by running python attributes/mine_attributes.py [cs/se].
  • Plot the box-plot comparison of SE and CS by running python attributes/plot_attributes.py.
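The numbers behind a box-plot comparison can be sketched with the standard library alone. This is an illustrative sketch, not the code in attributes/plot_attributes.py: the helper names box_stats and compare are hypothetical, and the values are made up rather than taken from the study. It computes the minimum, quartiles, and maximum for each group of projects.

```python
import statistics


def box_stats(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    return ordered[0], q1, median, q3, ordered[-1]


def compare(groups):
    """Map each group label to its box-plot summary."""
    return {label: box_stats(vals) for label, vals in groups.items()}


if __name__ == "__main__":
    # Made-up numbers for illustration only; not results from the study.
    groups = {
        "CS": [1, 2, 3, 4, 5, 6, 7],
        "SE": [2, 4, 6, 8, 10, 12, 14],
    }
    for label, stats in compare(groups).items():
        print(label, stats)
```

Note that statistics.quantiles uses the "exclusive" method by default, so the quartiles may differ slightly from those drawn by a plotting library that interpolates differently.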

Approaches to reproduce the results for each belief in Section III:

A.1) Manually check the language usage and the Travis CI usage of all projects.

A.2) Figure 3 can be seen in this spreadsheet, and Figure 2 is the result of the manual labels of these commits. The Table IV results are produced by running this folder.

B.1) Table 6 can be seen in this spreadsheet; it also draws on the manual labels of the commits from (A.2).

C.2) Based on the heroes paper; the code and social graph interactions can be reproduced from RQ1 and RQ2 in this repo.

C.3) Code reuse metrics are generated from this folder.

C.4) Taken from the GitHub attributes (see b).

C.5) Combination of A.1 and C.2-4

Data (CS Projects Selected for this study):

Project Name Link
abaco https://github.com/TACC/abaco
abinit https://github.com/abinit/abinit
apbs-pdb2pqr https://github.com/Electrostatics/apbs-pdb2pqr
blis https://github.com/flame/blis
cctools https://github.com/cooperative-computing-lab/cctools
changa https://github.com/N-BodyShop/changa
clowder https://github.com/ncsa/clowder
cpptraj https://github.com/Amber-MD/cpptraj
cyclus https://github.com/cyclus/cyclus
elasticsearch https://github.com/elastic/elasticsearch
forcebalance https://github.com/leeping/forcebalance
galaxy https://github.com/galaxyproject/galaxy
GooFit https://github.com/GooFit/GooFit
hoomd-blue https://github.com/glotzerlab/hoomd-blue
htmd https://github.com/Acellera/htmd
hubzero-cms https://github.com/hubzero/hubzero-cms
hydroshare https://github.com/hydroshare/hydroshare
irods https://github.com/irods/irods
lammps https://github.com/lammps/lammps
luafilesystem https://github.com/keplerproject/luafilesystem
luigi https://github.com/spotify/luigi
madness https://github.com/m-a-d-n-e-s-s/madness
MAST https://github.com/uw-cmg/MAST
mdtraj https://github.com/mdtraj/mdtraj
MetPy https://github.com/Unidata/MetPy
mpqc https://github.com/ValeevGroup/mpqc
ndslabs https://github.com/nds-org/ndslabs
nwchem https://github.com/nwchemgit/nwchem
ompi https://github.com/open-mpi/ompi
openforcefield https://github.com/openforcefield/openforcefield
openmm https://github.com/pandegroup/openmm/
openmmtools https://github.com/choderalab/openmmtools
OpenMx https://github.com/OpenMx/OpenMx
orca5 https://github.com/RENCI-NRIG/orca5
parsl https://github.com/Parsl/parsl
pcmsolver https://github.com/PCMSolver/pcmsolver
plumed2 https://github.com/plumed/plumed2
psi4 https://github.com/psi4/psi4
pymatgen https://github.com/materialsproject/pymatgen
pyscf https://github.com/pyscf/pyscf
QCFractal https://github.com/MolSSI/QCFractal
quantum_package https://github.com/LCPQ/quantum_package
radical.pilot https://github.com/radical-cybertools/radical.pilot
RMG-Py https://github.com/ReactionMechanismGenerator/RMG-Py
signac https://github.com/glotzerlab/signac/
signac-flow https://github.com/glotzerlab/signac-flow
TauDEM https://github.com/dtarb/TauDEM
trellis https://github.com/trellis-ldp/trellis
Trilinos https://github.com/trilinos/Trilinos
tripal https://github.com/tripal/tripal
WorkflowComponents https://github.com/LearnSphere/WorkflowComponents
Xenon https://github.com/NLeSC/Xenon
yank https://github.com/choderalab/yank
yt https://github.com/yt-project/yt
dealii https://github.com/dealii/dealii
foyer https://github.com/mosdef-hub/foyer