backyard-worlds-scripts

Helper scripts for the backyard-worlds-planet-9 project


Setting up the environment

  1. Install dependencies
    • Python 2.7
    • pip
    • astropy
  2. Clone repo

git clone https://github.com/dancaselden/backyard-worlds-scripts.git

  3. cd backyard-worlds-scripts

Steps to reproduce the JOIN BD & Subjects results

  1. Extract annotations from Zooniverse Classifications CSV

python -m byw.formats.annotationextract data/backyard-worlds-planet-9-classifications.csv data/test_annotations.csv

  2. Extract subjects from Zooniverse Subjects CSV

python -m byw.formats.subjectextract data/backyard-worlds-planet-9-subjects.csv data/test_subjects.csv

  3. [Optional] De-dupe subjects. Subjects are listed once for each workflow, so you can shrink the set by dropping the repeated rows

sort -r data/test_subjects.csv | uniq > data/test_subjects.csv.bak; mv data/test_subjects.csv.bak data/test_subjects.csv

(-r sorts in reverse order, which keeps the header line at the top of the file)
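
The sort-based one-liner keeps the header on top only because the header happens to sort after every data row, and it reorders the data as a side effect. As an alternative, here is a minimal Python sketch that drops exact duplicate lines while keeping the first line as the header and leaving the remaining rows in their original order. The function name `dedupe_lines` is hypothetical (it is not part of the `byw` package), and the sketch assumes the CSV is UTF-8 text with one record per line.

```python
import io


def dedupe_lines(in_path, out_path):
    """Copy in_path to out_path, dropping exact duplicate lines.

    The first line is treated as the header and is always written first.
    Returns the number of unique data lines written.
    Assumption: the file is UTF-8 text with one record per line.
    """
    seen = set()
    kept = 0
    with io.open(in_path, "r", encoding="utf-8") as src, \
            io.open(out_path, "w", encoding="utf-8") as dst:
        header = src.readline()
        if not header:
            return 0  # empty input: nothing to write
        dst.write(header.rstrip(u"\r\n") + u"\n")
        for line in src:
            row = line.rstrip(u"\r\n")
            if row not in seen:
                seen.add(row)
                dst.write(row + u"\n")
                kept += 1
    return kept

# Example call (write to a separate file, then move it into place):
# dedupe_lines("data/test_subjects.csv", "data/test_subjects.csv.bak")
```

As with the shell version, write to a separate output path and move it over the original afterwards; passing the same path for both arguments would truncate the input before it is read.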

  4. [Optional] Download your own set of BDs (brown dwarfs) from SIMBAD. Otherwise, use data/l_to_t_bds.txt
  5. Extract SIMBAD entries from the SIMBAD ASCII format

Unfortunately, even though there are only ~2000 entries, this step is slow: it uses astropy, which has a few bugs that cause extraordinary performance issues

python -m byw.formats.bdextract data/l_to_t_bds.txt data/test_bds.csv

  6. Join BDs and Subjects CSVs

python -m byw.formats.bdsubjectjoin data/test_bds.csv data/test_subjects.csv data/test_bdsubjects.csv

Line count of the joined output:

655 data/test_bdsubjects.csv

TODO: investigate why it outputs 4 duplicate items:

uniq data/test_bdsubjects.csv | wc -l
651
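
Note that `uniq` only collapses identical lines that are adjacent, so the 655 vs. 651 comparison counts adjacent repeats only. To help track down the duplicates, the following is a small sketch that lists every line occurring more than once, wherever it appears in the file. The function name `duplicate_lines` is hypothetical (not part of the `byw` package), and the file is assumed to be UTF-8 text.

```python
import io
from collections import Counter


def duplicate_lines(path):
    """Return {line: count} for every line that occurs more than once.

    Unlike `uniq`, this also catches duplicates that are not adjacent.
    Assumption: the file is UTF-8 text with one record per line.
    """
    with io.open(path, "r", encoding="utf-8") as f:
        counts = Counter(line.rstrip(u"\r\n") for line in f)
    return {line: n for line, n in counts.items() if n > 1}

# Example call:
# for line, n in sorted(duplicate_lines("data/test_bdsubjects.csv").items()):
#     print("%d x %s" % (n, line))
```

Looking at which rows repeat (for example, whether they share a BD or a subject) may indicate whether the duplication comes from the input files or from the join itself.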

Steps to reproduce user click stats summary

  1. Execute step 1 (extract annotations) of the JOIN BD & Subjects reproduction steps above
  2. Run stats.usercount script

python -m byw.stats.usercount data/test_annotations.csv