/distributed-data-processing-pipeline-on-biodiversity-data-aggregators

An API service for a distributed data processing pipeline on CloudLab that analyzes data streamed from three biodiversity data aggregators: iDigBio (https://idigbio.org/), GBIF (https://gbif.org/), and OBIS (https://obis.org/). [ Apache Kafka, Apache Spark, CloudLab, Flask, Python ]

Primary Language: Python
