SkyModelling

This git repo contains code to:

  1. Create a dataset of sky flux from BOSS data. This dataset includes each sky fiber for each plate/image in the BOSS data on NERSC.
  2. Create a file of metadata for all sky spectra.
  3. Fit sky spectra and separate the airglow lines from the continuum.

Included in this repo:

  • spframe_flux.py: Pulls all sky fibers from BOSS observations and converts the flat-fielded spFrame electrons to flux. It outputs one .npy file per plate and a corresponding metadata file.
  • get_meta_rich_data.py: Takes the raw metadata files generated by spframe_flux.py and calculates additional metadata. It outputs one file per raw_meta file (that is, per plate).
  • get_line_sum_file.py: Takes spframe files and rich metadata and calculates the strength of each line listed in util/line_list.pkl (including mean continuum values). It adds these data to the rich meta files and resaves them as meta_with_lines.npy.
  • get_mean_mega_file.py: Calculates the mean and variance of the lines in util/line_list.pkl over the sky fibers of a given observation, then resaves the meta file.
  • get_mega_file.py: Takes all meta files in a directory (like mean_rich) and compiles them into one file.
  • get_mean_spectra.py: Makes mean spectra based on all sky fibers from a given observation (image/camera). Saved by image number and camera.
  • fit_blue_mean_cont.py: Models the airglow lines (saved as util/blue_airglow_lines.npy) and removes them from the mean spectra. The output is a .npy file with the original spframe flux and the continuum.
  • fit_red_mean_cont.py: Same as for blue, but uses the longer list of lines in util/red_airglow_lines.npy.
  • MetaDataEval.ipynb: Looks at the distribution of the metadata from meta_rich.npy and serves as a cross-check of the data.
  • util/bitmask.py: This code is used by spframe_flux.py to decode the pixel mask.
  • util/line_list.pkl: A list of lines to take sum of in get_line_sum_file.py. This was created (and can be changed) with util/make_line_file.py
  • util/phot_rec.npy: Contains the cloud data used in get_rich_meta_data.py
  • util/CalibVector.pkl: Contains the calibration vector used in spframe_flux.py. It was created using util/make_calib_file.py
  • util/s10_zodi.pkl: Map for zodiacal contribution used for rich meta data
  • util/isl_map.pkl: Map for ISL contribution used for rich meta data
  • util/solar_flux.pkl: Daily solar flux contribution used for rich meta data
  • util/*_airglow_lines.pkl: Lists of all airglow (and artificial) lines needed for the fit_*_mean_cont.py scripts. These lists were created (and can be changed) with util/make_airglow_line_list.py.
  • util/artificial_lines.csv.pkl: Used by util/make_airglow_line_list.py to include lines from light pollution.

Code Dependencies

The BOSS flux files are saved on NERSC. This code is meant to run on NERSC (on edison or cori). Alternatively, if you have downloaded some of the spFrame files from NERSC onto another machine, you can point the code to that directory.

All code is in Python. You will need the following packages:

  • numpy, scipy, pandas, matplotlib
  • astropy
  • ephem (pip install ephem)
  • multiprocessing
  • statsmodels (conda install statsmodels)
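
Before running the pipeline it can help to confirm that these packages are importable in the current environment. The following is a minimal sketch and is not part of the repo: the helper name missing_packages is hypothetical, and the import names are simply taken from the list above.

```python
import importlib.util

# Import names taken from the dependency list above.
# multiprocessing ships with the Python standard library.
REQUIRED = ["numpy", "scipy", "pandas", "matplotlib",
            "astropy", "ephem", "multiprocessing", "statsmodels"]


def missing_packages(names=REQUIRED):
    """Return the names in `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only looks each package up on the import path without importing it, so it stays fast even for heavy packages such as astropy.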

How to get your own sky flux dataset

  • Clone this repo to NERSC.
  • Identify a location to save your data and modify spframe_flux.py to match. Export the location as SKY_FLUX_DIR: export SKY_FLUX_DIR=$SCRATCH/sky_flux
  • Run spframe_flux.py on one node (interactive or debug): python spframe_flux.py. If you run out of time, just run it again; it will work out how many files remain. This outputs two .npy files for each plate: SKY_FLUX_DIR/PLATE_calibrated_sky.npy and SKY_FLUX_DIR/raw_meta/PLATE_raw_meta.npy.
  • python get_rich_meta_data.py: This is best run with multiprocessing, using the maximum number of processes. One section takes longer and can be commented out in the code if you want less meta information. You can restart this script, since it counts which files have already been converted from raw_meta to rich_meta. The output is a new set of files under SKY_FLUX_DIR/rich_meta.
  • python get_line_sum_file.py: This gives you a new set of files under SKY_FLUX_DIR/rich_plus with all the line strength information. It pulls the flux for all lines and continuum bands in line_file.pkl.
  • python make_mega_file.py --full: Run this over all of the data. It saves two astropy FITS tables (.fits): SKY_FLUX_DIR/all_meta_data_YYMMDD.fits, which contains all sky fiber data, and SKY_FLUX_DIR/good_meta_data_YYMMDD.fits, which throws out "bad" observations and keeps all good spectra in one meta file.
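
The restart behaviour described above (counting which raw_meta files have already been converted) can be sketched as a set difference over file names. This is an illustration only, not the repo's actual logic: the function names are hypothetical, and the rich file naming pattern PLATE_rich_meta.npy is an assumption modelled on the documented PLATE_raw_meta.npy.

```python
from pathlib import Path


def plates_with_suffix(directory, suffix):
    """Return the plate IDs of files named PLATE<suffix> in `directory`."""
    directory = Path(directory)
    if not directory.is_dir():
        return set()
    # Strip the suffix from each matching file name to recover the plate ID.
    return {p.name[: -len(suffix)] for p in directory.glob("*" + suffix)}


def plates_to_process(sky_flux_dir):
    """Return plates that have a raw_meta file but no rich_meta file yet."""
    root = Path(sky_flux_dir)
    raw = plates_with_suffix(root / "raw_meta", "_raw_meta.npy")
    # ASSUMPTION: rich files are named PLATE_rich_meta.npy; the README
    # only states that they live under SKY_FLUX_DIR/rich_meta.
    rich = plates_with_suffix(root / "rich_meta", "_rich_meta.npy")
    return sorted(raw - rich)
```

Calling plates_to_process with the value of SKY_FLUX_DIR would list the plates still waiting for conversion, which is also a quick way to check progress after an interrupted job.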

Get mean spectra data set:

  • Make sure SKY_FLUX_DIR is still set and exported.
  • python get_mean_spectra.py: This gives you a mean spectrum for each observation (image/camera). The spectra are saved as SKY_FLUX_DIR/PLATE/IMAGE_CAM_mean_spectrum.npy. A mean variance file is also saved.
  • python get_mean_meta_file.py This is the same as get_line_sum_file.py but measures the line strength only for these mean spectra. Starts with a rich meta file and the mean spectra and resaves the meta file.
  • python make_mega_file.py --mean for the mean spectra. This will save files SKY_FLUX_DIR/mean_meta_data_YYMMDD.fits and SKY_FLUX_DIR/good_mean_meta_data_YYMMDD.fits.
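
A short sketch of the first two steps above: checking that SKY_FLUX_DIR is exported, and building the documented output path SKY_FLUX_DIR/PLATE/IMAGE_CAM_mean_spectrum.npy. The function names are hypothetical, and how the repo formats the image number (for example, zero padding) is not stated here, so the values are passed through unchanged.

```python
import os
from pathlib import Path


def get_sky_flux_dir(environ=os.environ):
    """Return SKY_FLUX_DIR as a Path, or raise if it is not exported."""
    value = environ.get("SKY_FLUX_DIR")
    if not value:
        raise RuntimeError("SKY_FLUX_DIR is not set; export it before running.")
    return Path(value)


def mean_spectrum_path(sky_flux_dir, plate, image, cam):
    """Build SKY_FLUX_DIR/PLATE/IMAGE_CAM_mean_spectrum.npy."""
    # ASSUMPTION: plate, image and cam are used exactly as given.
    return Path(sky_flux_dir) / str(plate) / f"{image}_{cam}_mean_spectrum.npy"
```

Failing early with a clear message when the variable is missing avoids writing output into an unintended directory partway through a long job.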

How to fit your sky spectra

  • Identify a location for your fit/split data: export CONT_FLUX_DIR=$SCRATCH/sky_cont_flux