
Improve query definitions in R

hms1 opened this issue · 1 comments

hms1 commented

Defining CB queries in R via lists is not very user-friendly, especially when there are multiple conditions. For example, even a relatively simple query with three conditions and two AND operators requires several levels of nested lists:

adv_query <- list(
  "operator" = "AND",
  "queries" = list(
    list("id" = 13, "value" = list("from" = "2016-01-21", "to" = "2017-02-13")),
    list(
      "operator" = "AND",
      "queries" = list(
        list("id" = 4, "value" = "Cancer"),
        list("id" = 21, "value" = "Consenting")
      )
    )
  )
)

To simplify this, we could include functions to define and combine individual phenotype definitions in a more modular fashion. As a starting point for discussion, the testing-new_query_syntax branch adds two new functions: new_phenotype_cont to define continuous variable phenotypes and new_phenotype_cat to define categorical variable phenotypes. They can be combined using overloaded &, | and ! operators. To try it out:

Get the package from the PR branch:

> git clone ''
> cd cloudos
> git checkout testing-new_query_syntax

In the cloudos directory, start an R session (or open the project in RStudio), then install and load the package and set the configuration:

> devtools::install(".")
> library(cloudos)
> cloudos_configure(base_url = "",
                    token = "...api token...",
                    team_id = "5f7c8696d6ea46288645a89f")

Try building new queries:

A <- new_phenotype_cont(13, "2016-01-21", "2017-02-13")
B <- new_phenotype_cat(4, "Cancer")
C <- new_phenotype_cont(13, "2016-01-21", "2017-02-13")
D <- new_phenotype_cat(4, "Cancer")

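########## test 1
AB <- A & B

# Load the cohort first so that `cohort` is defined before the filter is applied
cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")

cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)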

########## test 2
AB <- A & !B

cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")

cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)

########## test 3
AB <- A | !B

cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")

cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)

########## test 4
AB <- (A | B) & (D | C)

cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")

cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)
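
For discussion, here is a minimal sketch of how the overloaded operators could build the nested list structure shown at the top of this issue. It is not the code on the testing-new_query_syntax branch: the helper combine_phenotypes is hypothetical, the constructors are simplified stand-ins, and representing negation as a "NOT" operator wrapping a single query is an assumption about what the CB API accepts.

```r
# Hypothetical stand-ins for the constructors on the branch (assumed signatures).
new_phenotype_cont <- function(id, from, to) {
  structure(list(id = id, value = list(from = from, to = to)),
            class = "cb.phenotype")
}

new_phenotype_cat <- function(id, value) {
  structure(list(id = id, value = value), class = "cb.phenotype")
}

# Hypothetical helper: wrap two phenotypes (or sub-queries) under one operator.
# Inner objects are unclassed so the result is plain nested lists plus one class.
combine_phenotypes <- function(op, e1, e2) {
  structure(list(operator = op, queries = list(unclass(e1), unclass(e2))),
            class = "cb.phenotype")
}

# S3 methods for the logical operators on class "cb.phenotype".
`&.cb.phenotype` <- function(e1, e2) combine_phenotypes("AND", e1, e2)
`|.cb.phenotype` <- function(e1, e2) combine_phenotypes("OR", e1, e2)

# Assumption: negation is expressed as a "NOT" operator around a single query.
`!.cb.phenotype` <- function(x) {
  structure(list(operator = "NOT", queries = list(unclass(x))),
            class = "cb.phenotype")
}

A <- new_phenotype_cont(13, "2016-01-21", "2017-02-13")
B <- new_phenotype_cat(4, "Cancer")
AB <- A & !B

# Inspect the plain nested list that would be passed as adv_query
str(unclass(AB))
```

Because each operator returns another cb.phenotype object, expressions such as (A | B) & (D | C) compose without extra code, and unclass() on the result gives a plain list in the same shape as the hand-written adv_query.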

hms1 commented

Instead of new_phenotype_cont and new_phenotype_cat with specified arguments, it would be better to just have new_phenotype <- function(id, ...) where all additional arguments are passed to the cb.phenotype object. Something like:

> new_phenotype(13, value = list(from = "2015-01-01", to = "2017-01-01"))
$id
[1] 13

$value
$value$from
[1] "2015-01-01"

$value$to
[1] "2017-01-01"


attr(,"class")
[1] "cb.phenotype"

> new_phenotype(4, value = "Cancer")
$id
[1] 4

$value
[1] "Cancer"

attr(,"class")
[1] "cb.phenotype"
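
A minimal sketch of what such a constructor might look like, assuming the cb.phenotype object is simply a classed list holding the id plus whatever named arguments are passed through the dots. This is an illustration of the proposal, not the package's implementation.

```r
# Hypothetical sketch: a single constructor that forwards all extra
# named arguments into the cb.phenotype object.
new_phenotype <- function(id, ...) {
  structure(c(list(id = id), list(...)), class = "cb.phenotype")
}

# Continuous phenotype: the range is passed as a nested list
p_cont <- new_phenotype(13, value = list(from = "2015-01-01", to = "2017-01-01"))

# Categorical phenotype: the value is a single string
p_cat <- new_phenotype(4, value = "Cancer")

# Inspect the underlying list structure
str(unclass(p_cont))
str(unclass(p_cat))
```

With this design, new field types would not need their own constructor; the caller decides the shape of value.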