BeFAIR

BeFAIR (Be Findable, Accessible, Interoperable, Reusable) Open Science Framework.

BeFAIR is a Common Distributed Research Infrastructure in which users can add and run tools and components themselves, following Debian's convention for managing services. Each enabled service is made available on its own subdomain and can be integrated with Dataverse, the BeFAIR data repository.

The BeFAIR distribution is designed as an out-of-the-box Distributed Networked Infrastructure that any research community can install with a single command, much like an ordinary operating system. The roadmap includes releases bundling Open Data for different sciences; the COVID-19 Data Hub is the current priority.

Acknowledgements

The BeFAIR infrastructure stands on the shoulders of giants. The table below acknowledges resources and contributions from finished and ongoing projects.

| Region | Project | Funding information | Component |
|---|---|---|---|
| European Union | CESSDA SaW | H2020-INFRADEV-1-2015-1, grant agreement #674939 | Dataverse as a service |
| European Union | SSHOC | H2020-INFRAEOSC-04-2018, grant agreement #823782 | Cloud Dataverse |
| European Union | EOSC Synergy | INFRAEOSC-05-2018-2019, grant agreement No 857647 | SQAaaS service |
| United States | INDRA | Defense Advanced Research Projects Agency under award W911NF-14-1-0397 | INDRA service |
| European Union | FAIRsFAIR | H2020-INFRAEOSC-2018-2020 Grant agreement 831558 | F-UJI and FAIR Data Points |
| Netherlands | CLARIAH | NWO grant number: 184.033.101 | CLARIAH as a service |
| Finland | SKOSMOS | National Library of Finland | SKOSMOS as a service |

Available and planned services

Available basic infrastructure components:

  • traefik
  • postgresql
  • SOLR

The list of services integrated in BeFAIR:

To do (we're accepting pull requests; please join the project if you want to contribute!):

  • CoronaWhy API (FastAPI with OpenAPI spec)
  • Elasticsearch
  • SPARQL endpoint (Virtuoso as a service)
  • grlc (converts SPARQL queries into RESTful APIs)
  • Doccano
  • Jupyter
  • OCR Tesseract (OCR as a service)
  • Kibana

BeFAIR uses Traefik as its load balancer and proxy service. Define traefikhost in the configuration of your deployment (see the deploys folder) before starting the enabled services.

To enable a service, for example INDRA, run the following from ./deploys/your_domain_name, where your_domain_name corresponds to your domain (the default is localhost):

ln -s ../../services-available/indra.yaml indra.yaml
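Because enabling a service is just a symlink, the same convention can be wrapped in small helpers. The sketch below is not part of BeFAIR: the function names enable_service and disable_service and the SERVICES_DIR variable are hypothetical, and it assumes you run it from ./deploys/your_domain_name with the service definitions in ../../services-available, as in the command above.

```shell
# Hypothetical helpers; run from ./deploys/your_domain_name.
# SERVICES_DIR is an assumption mirroring the relative path used above.
SERVICES_DIR="${SERVICES_DIR:-../../services-available}"

# Enable a service by symlinking its definition into the current deployment.
enable_service() {
    if [ ! -f "$SERVICES_DIR/$1.yaml" ]; then
        echo "unknown service: $1" >&2
        return 1
    fi
    ln -sf "$SERVICES_DIR/$1.yaml" "$1.yaml"
}

# Disable a service by removing its symlink (the definition itself is kept).
disable_service() {
    if [ -L "$1.yaml" ]; then
        rm "$1.yaml"
    fi
}
```

With these defined, enable_service indra creates the same link as the ln command above, and disable_service indra removes it again.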

For example, if you set the following subdomain (labs.coronawhy.org) in the .env file

traefikhost=labs.coronawhy.org

the services will be available on airflow.labs.coronawhy.org, superset.labs.coronawhy.org and so on.
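To preview which hostnames a deployment would expose, the subdomain pattern above can be sketched in shell. This goes beyond what the text states: it assumes that every .yaml file in the deployment folder is an enabled service, that each service's subdomain equals the base name of its .yaml file, and that .env contains a plain traefikhost=... line. The helper name list_service_hosts is hypothetical.

```shell
# Print the expected hostname of every enabled service in a deployment folder.
# Assumptions: each *.yaml file is an enabled service, its subdomain is the
# file's base name, and .env holds a plain "traefikhost=<domain>" line.
list_service_hosts() {
    deploy_dir="${1:-.}"
    # Keep only the value of the first traefikhost line.
    host=$(sed -n 's/^traefikhost=//p' "$deploy_dir/.env" | head -n 1)
    if [ -z "$host" ]; then
        echo "traefikhost is not set in $deploy_dir/.env" >&2
        return 1
    fi
    for f in "$deploy_dir"/*.yaml; do
        [ -e "$f" ] || continue   # skip an unmatched glob or a broken link
        echo "$(basename "$f" .yaml).$host"
    done
}
```

Run from a deployment folder as list_service_hosts, or pass the folder as the first argument. The actual routing is decided by the Traefik rules in each service definition, so treat the output as a guess to verify rather than a guarantee.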

Installation and deployment

You need Docker and Docker Compose before you'll be able to run BeFAIR:

sudo apt install make unzip docker-compose

Add the current user to the 'docker' group:

sudo adduser $USER docker

Start a new shell with the 'docker' group applied:

newgrp docker
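Before going further, it can help to confirm that the required tools are actually on your PATH. The following is a minimal sketch, not a BeFAIR command; check_prerequisites is a hypothetical helper that takes the tool names as arguments.

```shell
# Report any required command that is missing from PATH.
check_prerequisites() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing" >&2
        return 1
    fi
}
```

For the packages installed above, a call might look like check_prerequisites docker docker-compose make unzip, which returns a non-zero status and names the missing tools if any are absent.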

If starting BeFAIR fails with the message "ERROR: Network traefik declared as external, but could not be found", create the network manually using docker network create traefik and try again.
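The manual step can also be made safe to repeat, so that it does not fail when the network already exists. A minimal sketch, assuming only the standard docker network inspect and docker network create subcommands; ensure_network is a hypothetical helper name.

```shell
# Create the external Docker network only if it is missing.
ensure_network() {
    if docker network inspect "$1" >/dev/null 2>&1; then
        echo "network $1 already exists"
    else
        docker network create "$1"
    fi
}
```

Calling ensure_network traefik before starting BeFAIR creates the network on the first run and does nothing on later runs.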

After Docker is installed you can run BeFAIR:

git clone https://github.com/CoronaWhy/befair
cd befair
make up

Warning: Apache Airflow and Apache Superset must be initialized with their own init commands:

make airflow
make superset

Citation for the academic use

Please cite this work as follows:

Tykhonov V., Polishko A., Kiulian A., Komar M. (2020). CoronaWhy: Building a Distributed, Credible and Scalable Research and Data Infrastructure for Open Science. Zenodo. http://doi.org/10.5281/zenodo.3922257

License

The content of this project itself is licensed under the Creative Commons Attribution 3.0 Unported license, and the underlying source code used to format and display that content is licensed under the MIT license.