probcomp/bayeslite

Crash in ESTIMATE * FROM VARIABLES OF population WHERE <bql> with ignored variables

Closed · 1 comment

fsaad commented

Consider the following BQL query:

ESTIMATE *
FROM VARIABLES OF gapminder
WHERE DEPENDENCE PROBABILITY WITH "life expectancy at birth" > 0.5

This BQL compiles to the following SQL:

SELECT c.name AS name
FROM bayesdb_population AS p,
bayesdb_variable AS v, bayesdb_column AS c
WHERE
p.id = 1 AND v.population_id = p.id
AND c.tabname = p.tabname
AND c.colno = v.colno
AND v.generator_id IS NULL
AND (bql_column_dependence_probability(1, NULL, 136, c.colno) > 0.5)

Because SQLite does not guarantee the order in which the expressions in a WHERE clause are evaluated, it is possible that bql_column_dependence_probability(1, NULL, 136, c.colno) > 0.5 is evaluated before the join conditions have filtered the row. That causes a crash if c.colno is IGNORED in the population.

I believe a simple fix is to compile the query as:

SELECT v.name AS name
FROM bayesdb_variable AS v
WHERE v.population_id = 1
AND v.generator_id IS NULL
AND (bql_column_dependence_probability(1, NULL, 136, v.colno) > 0.5)
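The difference between the two query shapes can be illustrated with a minimal sketch using Python's sqlite3 module. Everything in it is a toy assumption rather than bayeslite's actual schema or code: the tables `col` and `var` stand in for bayesdb_column and bayesdb_variable, `depprob` is a hypothetical stand-in for bql_column_dependence_probability that raises on unmodeled columns, and the data are made up. It does not reproduce the planner-dependent ordering itself; it only shows that a scan over all columns can reach the ignored one, while a scan restricted to modeled variables cannot.

```python
import sqlite3

# Toy data (hypothetical): column 2 is IGNORED, so it has no row in `var`.
MODELED = {1: 0.9, 3: 0.2}   # colno -> dependence probability with the target
calls = []                   # records every colno the function is asked about

def depprob(colno):
    """Stand-in for bql_column_dependence_probability; fails on unmodeled columns."""
    calls.append(colno)
    if colno not in MODELED:
        raise ValueError("column %d is not modeled" % (colno,))
    return MODELED[colno]

db = sqlite3.connect(":memory:")
db.create_function("depprob", 1, depprob)
db.executescript("""
    CREATE TABLE col (colno INTEGER PRIMARY KEY, name TEXT);  -- all table columns
    CREATE TABLE var (colno INTEGER PRIMARY KEY, name TEXT);  -- modeled variables only
    INSERT INTO col VALUES (1, 'gdp'), (2, 'notes'), (3, 'area');
    INSERT INTO var VALUES (1, 'gdp'), (3, 'area');
""")

# Scanning every column reaches the ignored one, and the function raises.
try:
    db.execute("SELECT name FROM col WHERE depprob(colno) > 0.5").fetchall()
    crashed = False
except sqlite3.OperationalError:
    crashed = True

# Scanning only modeled variables never passes the ignored column.
calls.clear()
names = [row[0] for row in db.execute(
    "SELECT name FROM var WHERE depprob(colno) > 0.5 ORDER BY colno")]
```

In this sketch the second query is safe regardless of evaluation order, because no row it can visit refers to an ignored column.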

fsaad commented

Also, the current implementation, which selects names from bayesdb_column, skips latent variables.