`No need to rerun` although query changed (`codeql database analyze`)
RasmusWL opened this issue · 5 comments
If you alter a query after you have run codeql database analyze with that query, any subsequent calls to codeql database analyze will not run the new query, but will reuse the results from the old and outdated version.
This seems to have been a problem for some users: github/codeql#5084
Steps to reproduce
- Create a small database:
mkdir src && echo 'print(42)' > src/wat.py
codeql database create db --language python --source-root src/
- Add dummy query and qlpack:
qlpack.yml:
name: codeql-python-no-rerun-example
version: 0.0.0
libraryPathDependencies: codeql-python
wat.ql:
/**
* @kind problem
* @id py/example-of-no-rerun
* @name Calls
* @description Finds any calls
* @problem.severity error
* @tags call
*/
import python
from CallNode call
// where none()
select call, "a call"
- Run the query:
codeql database analyze db wat.ql --format=csv --output=out.csv && cat out.csv
and notice that the output contains the one call.
- Alter the query by uncommenting the where none() part.
- Run the query:
codeql database analyze db wat.ql --format=csv --output=out.csv && cat out.csv
and notice that the output still contains the one call 😱
Running queries.
[1/1] No need to rerun /home/rasmus/tmp/wat/wat.ql.
Shutting down query evaluator.
Interpreting results.
"Calls","Finds any calls","error","a call","/wat.py","1","1","1","9"
- Run the query, forcing a rerun:
codeql database analyze --rerun db wat.ql --format=csv --output=out.csv && cat out.csv
and notice that the output is now empty (as it should be)
Running queries.
Compiling query plan for /home/rasmus/tmp/wat/wat.ql.
[1/1] Found in cache: /home/rasmus/tmp/wat/wat.ql.
Starting evaluation of codeql-python-no-rerun-example/wat.ql.
[1/1 eval 43ms] Evaluation done; writing results to codeql-python-no-rerun-example/wat.bqrs.
Shutting down query evaluator.
Interpreting results.
Additional info
I'm using codeql cli version 2.5.5+202105241554plus. (locally built)
Supplementary info:
The same can be reproduced with codeql cli version 2.5.0.
This is by design, though perhaps not the most user-friendly design in hindsight.
The result of codeql database run-queries (which is the first half of codeql database analyze) stores the output of queries in a format that doesn't remember what the actual text of the QL source of each query was -- it just identifies the result by filename within the database's results directory.
At the time we designed the CLI, we imagined that the primary reason anyone would run codeql database run-queries multiple times on a single database was that the first run had timed out, run out of RAM or otherwise died. Then you'd want the next run to be an attempt to see if leaving out the queries that did succeed the first time around would allow the rest to squeeze through too. Therefore the default behavior of run-queries is to skip queries that already appear to have results ready.
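The skip rule described above can be modeled with a small shell sketch. This is a toy model, not the CLI's implementation: run_or_skip is a hypothetical helper, and copying the query file stands in for real evaluation. It only illustrates why a result keyed by filename cannot notice that the query text changed.

```shell
#!/bin/sh
# Toy model only, NOT the CodeQL implementation.
# Results are keyed by the query's file name, so editing the query
# text does not invalidate an existing result.

# run_or_skip RESULTS_DIR QUERY_FILE [--rerun]   (hypothetical helper)
run_or_skip() {
    results_dir=$1
    query=$2
    rerun=${3:-}
    result="$results_dir/$(basename "$query" .ql).bqrs"

    # A result file with the matching name already exists: skip,
    # unless the caller explicitly asked for a rerun.
    if [ -f "$result" ] && [ "$rerun" != "--rerun" ]; then
        echo "No need to rerun $query."
        return 0
    fi

    mkdir -p "$results_dir"
    # Stand-in for real evaluation: store the query text as the "result".
    cp "$query" "$result"
    echo "Evaluated $query."
}
```

Calling run_or_skip twice on the same file, with an edit in between, reports "No need to rerun" the second time and leaves the stale result in place, which mirrors the behavior reported in this issue.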
This behavior can be changed by giving the --rerun option, in which case all queries will be evaluated afresh even if they already have results.
You can make this behavior the default by adding a line reading
database analyze --rerun
to your ~/.config/codeql/config.
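For anyone who prefers to script that step, here is a minimal sketch. The config path and the line are the ones given above; the add_rerun_default helper name and its optional path argument are assumptions made for illustration. It appends the line only if it is not already present.

```shell
#!/bin/sh
# add_rerun_default [CONFIG_FILE]   (hypothetical helper)
# Appends "database analyze --rerun" to the CodeQL user config once.
add_rerun_default() {
    config=${1:-"$HOME/.config/codeql/config"}
    line='database analyze --rerun'

    mkdir -p "$(dirname "$config")"
    touch "$config"

    # -q quiet, -x whole-line match, -F fixed string (no regex)
    if ! grep -qxF -- "$line" "$config"; then
        printf '%s\n' "$line" >> "$config"
    fi
}
```

Running add_rerun_default with no argument targets ~/.config/codeql/config. Note that with this default every codeql database analyze call re-evaluates all queries, so the skip-on-retry behavior described earlier is lost.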
Thanks for providing that extra insight; avoiding rerunning all queries does seem very reasonable.
I guess there is no easy way to determine whether a query has changed, since that is not just the query text, but also all transitive imports it depends on. So I'm very understanding about the fact that there is no easy solution for "fixing" this. (so if you want to close this as "wont-fix", that's totally fine by me)
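As a rough local workaround, one could over-approximate "the query or something it imports has changed" by fingerprinting every .ql and .qll file under the directories one is editing, and passing --rerun only when that fingerprint differs from the last run. This is a sketch under assumptions, not a CLI feature: sources_fingerprint, rerun_flag_if_changed, and the stamp file are hypothetical, and library packs are only covered if their directories are passed in too.

```shell
#!/bin/sh
# sources_fingerprint DIR...   (hypothetical helper)
# Checksums every .ql/.qll file under the given directories. cksum is a
# weak CRC, chosen here only because it is available in POSIX.
sources_fingerprint() {
    find "$@" -type f \( -name '*.ql' -o -name '*.qll' \) -exec cksum {} + \
        | LC_ALL=C sort | cksum
}

# rerun_flag_if_changed STAMP_FILE DIR...   (hypothetical helper)
# Prints "--rerun" when the sources differ from the stored fingerprint
# (or no fingerprint exists yet); prints nothing otherwise.
rerun_flag_if_changed() {
    stamp=$1
    shift
    current=$(sources_fingerprint "$@")

    if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$current" ]; then
        return 0
    fi

    # The stamp is updated before the analysis runs; a failed run would
    # need the stamp file deleted by hand.
    printf '%s\n' "$current" > "$stamp"
    echo "--rerun"
}

# Intended use (not executed here):
#   codeql database analyze $(rerun_flag_if_changed .query-stamp .) \
#       db wat.ql --format=csv --output=out.csv
```

Since --rerun re-evaluates every query, this trades precision for simplicity: any change to any tracked file reruns everything.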
@carlspring this was a problem with local development. Please open a new issue instead.