This package provides R functions for running the SpeakEasy 2 community detection algorithm using the SpeakEasy2 C library. See the Genome Biology article.
SpeakEasy 2 (SE2) is a graph community detection algorithm that aims to be performant on large graphs and robust, returning consistent results across runs. SE2 does not require prior knowledge of the number of communities in the network. Additionally, while the user can provide parameters to alter how the algorithm is run, the default options work well on a wide range of graphs and tweaking options generally has little effect on the results, reducing the risk of the user's choices biasing the outcome.
The core algorithm is written in C, providing speed and keeping the memory requirements low. This implementation can take advantage of multiple computing cores without increasing memory usage. SE2 can detect community structure across scales, making it a good choice for biological data, which is often organized in a hierarchical structure.
Graphs can be passed to the algorithm as adjacency matrices using the Matrix library, igraph graphs, or any data that can be coerced into a matrix.
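As a minimal sketch of the matrix input form, the example below builds a small adjacency matrix in base R and, if available, a sparse copy using the Matrix library. The graph itself (two triangles joined by one edge) is made up for illustration. The clustering call assumes the package exports a `cluster()` function that accepts a `seed` argument; check the package documentation for the exact interface before relying on it.

```r
# Toy undirected graph: two triangles (nodes 1-3 and 4-6) joined by edge 3-4.
# This graph is illustrative only.
edges <- rbind(
  c(1, 2), c(1, 3), c(2, 3),
  c(4, 5), c(4, 6), c(5, 6),
  c(3, 4)
)
n_nodes <- 6

# Dense adjacency matrix; indexing with a two-column matrix sets (row, col) entries.
adj <- matrix(0, nrow = n_nodes, ncol = n_nodes)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1 # mirror the edges so the matrix is symmetric

# Sparse version of the same graph, only if the Matrix library is installed.
if (requireNamespace("Matrix", quietly = TRUE)) {
  adj_sparse <- Matrix::Matrix(adj, sparse = TRUE)
}

# Assumed interface: speakeasyR::cluster(graph, seed = ...).
# Skipped when the package is not installed.
if (requireNamespace("speakeasyR", quietly = TRUE)) {
  membership <- speakeasyR::cluster(adj, seed = 1)
  print(membership)
}
```

The guards with `requireNamespace` keep the sketch runnable on a machine where only base R is present.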
For most users, this package should be installed from CRAN using:
install.packages("speakeasyR")
It can also be installed using devtools:
devtools::install_github("speakeasy-2/speakeasyR")
Additionally, it's possible to download the latest release from the release page (the speakeasyR_${release}.tar.gz asset) and install it using install.packages:
install.packages("speakeasyR_${release}.tar.gz")
Here, ${release} must be replaced with the value in the tarball's name.
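The substitution can also be done in R rather than by hand. The sketch below uses a hypothetical helper, `release_tarball()`, which is not part of speakeasyR; it only formats the file name from a release string that you supply.

```r
# Hypothetical helper (not part of speakeasyR): build the tarball file name
# for a given release string.
release_tarball <- function(release) {
  stopifnot(is.character(release), length(release) == 1, nzchar(release))
  sprintf("speakeasyR_%s.tar.gz", release)
}

# Usage (not run): replace the placeholder with the value taken from the
# name of the asset you downloaded.
# install.packages(release_tarball("<release>"), repos = NULL, type = "source")
```

Passing `repos = NULL` and `type = "source"` makes it explicit that a local source tarball is being installed rather than a package from a repository.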
Installation with devtools::install_github has been tested in clean VMs running Ubuntu and Fedora.
To set up the development environment on Windows, install the version of Rtools that matches your R installation. Using Rtools' MSYS2, install the required build tools. This has been tested with the ucrt64 environment but likely works in other environments.
pacman -S mingw-w64-ucrt-x86_64-toolchain git
For development, clone this repository and set up the vendored dependencies using:
git submodule update --init --recursive
For development, astyle is recommended for formatting C code, while texlive/latex, qpdf, and checkbashisms are expected by R for building the documentation and checking shell scripts during the R CMD build process.
It should now be possible to run devtools::load_all() in R.
GNU autotools is used to generate the configuration script and the files needed to run it. R's build commands do not run autoconf; instead, if changes are made to the configuration.ac file, autoconf (and possibly autoreconf -i) needs to be run manually and the resulting files should be committed along with the source configuration.ac file.
The Makefile can determine when the autoconf programs need to be run. To trigger this, either call the configure target directly (i.e. make configure) or run a build target (e.g. make build, make check, or similar).
The Makefile in the top-level directory is intended for development. It automates recreating committed generated files when needed. These generated files must be committed together with changes to the source files that created them, as they are not created by the R CMD build command. It should always be possible to run R CMD build to build the project in a clean state without needing to run make to generate other files. The Makefile also sets some flags to provide stricter checks than those run during the normal build process.
As clang and gcc can behave differently, changes should be tested against both. To explicitly set the compiler used, run make ${target} CC=${CC}, where target is likely build or check and CC is either clang or gcc.