Improve query definitions in R
hms1 opened this issue · 1 comments
Defining Cohort Browser (CB) queries in R via nested lists is not very user-friendly, especially when there are multiple conditions. Even a relatively simple query with three conditions and two AND operators looks like this:
adv_query <- list(
  "operator" = "AND",
  "queries" = list(
    list("id" = 13, "value" = list("from" = "2016-01-21", "to" = "2017-02-13")),
    list(
      "operator" = "AND",
      "queries" = list(
        list("id" = 4, "value" = "Cancer"),
        list("id" = 21, "value" = "Consenting")
      )
    )
  )
)
To simplify, we could include functions to define and combine individual phenotype definitions in a more modular fashion. As a starting point for discussion, the testing-new_query_syntax branch adds two new functions: new_phenotype_cont to define continuous variable phenotypes and new_phenotype_cat to define categorical variable phenotypes. They can be combined using overloaded &, | and ! operators. For example:
Get the package from the PR branch:
> git clone 'https://github.com/lifebit-ai/cloudos.git'
> cd cloudos
> git checkout testing-new_query_syntax
In the cloudos directory, start an R session (or do so in RStudio), then install and load the package and set the config:
> devtools::install(".")
> library(cloudos)
> cloudos_configure(base_url = "http://cohort-browser-dev-110043291.eu-west-1.elb.amazonaws.com/cohort-browser/",
                    token = "...api token...",
                    team_id = "5f7c8696d6ea46288645a89f")
Try building new queries:
A <- new_phenotype_cont(13, "2016-01-21", "2017-02-13")
B <- new_phenotype_cat(4, "Cancer")
C <- new_phenotype_cont(13, "2016-01-21", "2017-02-13")
D <- new_phenotype_cat(4, "Cancer")
########## test 1
AB <- A & B
# load the cohort first; cohort was previously undefined at this point
cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")
cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)
########## test 2
AB <- A & !B
cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")
cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)
########## test 3
AB <- A | !B
cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")
cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)
########## test 4
AB <- (A | B) & (D | C)
cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")
cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)
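For discussion, here is a minimal sketch of how overloaded operators on a cb.phenotype S3 class could build the nested query list. It is not the code on the branch: the helper combine_phenotypes is hypothetical, the constructor bodies are guesses that only mirror the names above, and the "NOT" node shape is an assumption about what the Cohort Browser API accepts.

```r
# Sketch only: constructor names follow the issue, their internals are assumed.
new_phenotype_cont <- function(id, from, to) {
  structure(list(id = id, value = list(from = from, to = to)),
            class = "cb.phenotype")
}

new_phenotype_cat <- function(id, value) {
  structure(list(id = id, value = value), class = "cb.phenotype")
}

# Hypothetical helper: wrap two phenotype objects in one operator node.
combine_phenotypes <- function(op, e1, e2) {
  structure(
    list(operator = op, queries = list(unclass(e1), unclass(e2))),
    class = "cb.phenotype"
  )
}

# S3 methods for the logical operators, dispatched on class "cb.phenotype".
`&.cb.phenotype` <- function(e1, e2) combine_phenotypes("AND", e1, e2)
`|.cb.phenotype` <- function(e1, e2) combine_phenotypes("OR", e1, e2)

# Assumption: negation is a "NOT" node wrapping a single query.
`!.cb.phenotype` <- function(e1) {
  structure(list(operator = "NOT", queries = list(unclass(e1))),
            class = "cb.phenotype")
}

# Rebuild the hand-written adv_query from the top of the issue.
A <- new_phenotype_cont(13, "2016-01-21", "2017-02-13")
B <- new_phenotype_cat(4, "Cancer")
C <- new_phenotype_cat(21, "Consenting")
adv_query <- unclass(A & (B & C))
str(adv_query)
```

Because every combination unclasses its operands, the result of unclass() on the top-level object is a plain nested list with no leftover class attributes, which is the shape cb_apply_filter is given in the tests above.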
Instead of new_phenotype_cont and new_phenotype_cat with specified arguments, it would be better to just have new_phenotype <- function(id, ...) where all additional arguments are passed to the cb.phenotype object. Something like:
> new_phenotype(13, value = list(from = "2015-01-01", to = "2017-01-01"))
$id
[1] 13
$value
$value$from
[1] "2015-01-01"
$value$to
[1] "2017-01-01"
attr(,"class")
[1] "cb.phenotype"
> new_phenotype(4, value = "Cancer")
$id
[1] 4
$value
[1] "Cancer"
attr(,"class")
[1] "cb.phenotype"
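One way such a constructor might look, as a rough sketch rather than a settled design: the id and every extra named argument are stored in a list tagged with the cb.phenotype class. No validation of the arguments is attempted here, which is an assumption about what is wanted.

```r
# Sketch of the proposed generic constructor: all arguments after id are
# stored as-is in the cb.phenotype object.
new_phenotype <- function(id, ...) {
  structure(list(id = id, ...), class = "cb.phenotype")
}

cont <- new_phenotype(13, value = list(from = "2015-01-01", to = "2017-01-01"))
cat_pheno <- new_phenotype(4, value = "Cancer")
```

A single constructor keeps the interface small, at the cost of leaving it to the caller to know which value shape each phenotype id expects.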