The goal of the appc package is to provide daily, high resolution, near
real-time, model-based ambient air pollution exposure assessments. This
is achieved by training a generalized random forest on several
geomarkers to predict daily average EPA AQS concentrations from 2017
until the present at exact locations across the contiguous United States
(see vignette("cv-model-performance")
for more details). The appc
package contains functions for generating geomarker predictors and the
ambient air pollution concentrations. Predictor geomarkers include
weather and atmospheric information, wildfire smoke plumes, elevation,
and satellite-based aerosol diagnostics products. Source files included
with the package train and evaluate models that can be updated with any
release to use more recent AQS measurements and/or geomarker predictors.
Install the development version of appc from GitHub with:
# install.packages("remotes")
remotes::install_github("geomarker-io/appc")
In R, create model-based predictions of ambient air pollution
concentrations at exact locations on specific dates using the
predict_pm25()
function:
appc::predict_pm25(
x = s2::as_s2_cell(c("8841b39a7c46e25f", "8841a45555555555")),
dates = list(as.Date(c("2023-05-18", "2023-11-06")), as.Date(c("2023-06-22", "2023-08-15")))
)
#> ℹ (down)loading random forest model
#> ✔ (down)loading random forest model [8.2s]
#>
#> ℹ checking that s2 locations are within the contiguous united states
#> ✔ checking that s2 locations are within the contiguous united states [55ms]
#>
#> ℹ adding coordinates
#> ✔ adding coordinates [1.3s]
#>
#> ℹ adding elevation
#> ✔ adding elevation [1.3s]
#>
#> ℹ adding HMS smoke data
#> ✔ adding HMS smoke data [967ms]
#>
#> ℹ adding NARR
#> ✔ adding NARR [3.1s]
#>
#> ℹ adding MERRA
#> ✔ adding MERRA [569ms]
#>
#> ℹ adding time components
#> ✔ adding time components [24ms]
#>
#> [[1]]
#> # A tibble: 2 × 2
#> pm25 pm25_se
#> <dbl> <dbl>
#> 1 7.95 0.917
#> 2 9.32 0.814
#>
#> [[2]]
#> # A tibble: 2 × 2
#> pm25 pm25_se
#> <dbl> <dbl>
#> 1 5.82 0.685
#> 2 7.68 0.765
Installed geomarker data sources and the grf model are hosted as release
assets on GitHub and are downloaded locally to the package-specific R
user data directory (i.e., tools::R_user_dir("appc", "data")
). These
files are cached across all of an R user’s sessions and projects.
(Specify an alternative download location by setting the
R_USER_DATA_DIR
environment variable; see ?tools::R_user_dir
.)
See more examples in vignette("timeline-example")
.
The s2 geohash is a hierarchical geospatial index that uses spherical geometry. The appc package uses s2 cells via the s2 package to specify geospatial locations. In R, s2 cells can be created using their character string representation, or by specifying latitude and longitude coordinates; e.g.:
s2::s2_lnglat(c(-84.4126, -84.5036), c(39.1582, 39.2875)) |> s2::as_s2_cell()
#> <s2_cell[2]>
#> [1] 8841ad122d9774a7 88404ebdac3ea7d1
Spatiotemporal geomarkers are used for predicting air pollution concentrations, but also serve as exposures or confounding exposures themselves. View information and options about each geomarker:
geomarker | appc function |
---|---|
🌦 weather & atmospheric conditions | get_narr_data() |
🛰 satellite-based aerosol diagnostics | get_merra_data() |
🔥 wildfire smoke | get_hms_smoke_data() |
🗻 elevation | get_elevation_summary() |
Currently, get_urban_imperviousness()
, get_traffic()
, and
get_nei_point_summary()
are stashed in the /inst
folder and not
integrated into this package.
Please note that the appc project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
To create and release geomarker data for release assets, as well as to
create the AQS training data, train, and evaluate a generalized random
forest model, use just
to execute
recipes in the justfile
.
> just --list
Available recipes:
build_site # build readme and webpage
check # CRAN check package
dl_geomarker_data # download all geomarker ahead of time, if not already cached
docker_test # run tests without cached release files
docker_tool # build docker image preloaded with {appc} and data
make_training_data # make training data for GRF
release_hms_smoke_data # install smoke data from source and upload to github release
release_merra_data # upload merra data to github release
release_model # upload grf model and training data to current github release
train_model # train grf model and render report