This is a more advanced version of https://github.com/advanced-security/codeql-workshops-staging/tree/master/java/codeql-dataflow-sql-injection
Initial results should show a false positive, and false negative (missing path). To that end:
- false negative: db write using struct => dataflow won’t work, taint-tracking should. Debug using partial paths. Add some extra library function so that additional taint steps are needed to propagate.
- false positive: second write using safe sql command (pre-compiled), initially shown as false positive (will be sanitizer). Add a checking-function in this path also (integer from string?). This one will become a barrier function.
The jdbc connector at https://github.com/xerial/sqlite-jdbc, from here is included in the git repository.
# Use a simple headline prompt
PS1='
\033[32m---- SQL injection demo ----\[\033[33m\033[0m\]
$?:$ '
# Build
./build.sh
# Prepare db
./admin -r
./admin -c
./admin -s
# Add regular user interactively
./add-user 2>> users.log
First User
# Check
./admin -s
# Add Johnny Droptable
./add-user 2>> users.log
Johnny'); DROP TABLE users; --
# And the problem:
./admin -s
# Check the log
tail users.log
./add-user
is reading from STDIN
, and writing to a database; looking at the code in
./AddUser.java leads to
System.console().readLine();
for the read and
conn.createStatement().executeUpdate(query);
for the write.
This problem is thus a dataflow problem; in codeql terminology we have
- a source at the
System.console().readLine();
- a sink at the
conn.createStatement().executeUpdate(query);
We write codeql to identify these two, and then connect them via
- a dataflow configuration – for this problem, the more general taintflow configuration.
To get started, build the codeql database (adjust paths to your setup):
# Build the db with source commit id.
cd ~/local/codeql-dataflow-ii-java
SRCDIR=$(pwd)
DB=$SRCDIR/java-sqli-$(cd $SRCDIR && git rev-parse --short HEAD)
echo $DB
test -d "$DB" && rm -fR "$DB"
mkdir -p "$DB"
cd $SRCDIR && codeql database create --language=java -s . -j 8 -v $DB --command='./build.sh'
# Check for AddUser in the db
unzip -v $DB/src.zip | grep AddUser
Then add this database directory to your VS Code DATABASES
tab.
For larger projects, using a single command to build everything is costly when any part of the build fails.
To build a database in steps, use the following sequence, adjusting paths to your setup:
# Build the db with source commit id.
export PATH=$HOME/local/vmsync/codeql250:"$PATH"
SRCDIR=$HOME/local/codeql-training-material.java-sqli/java/codeql-dataflow-sql-injection
DB=$SRCDIR/java-sqli-$(cd $SRCDIR && git rev-parse --short HEAD)
# Check paths
echo $DB
echo $SRCDIR
# Prepare db directory
test -d "$DB" && rm -fR "$DB"
mkdir -p "$DB"
# Run the build
cd $SRCDIR
codeql database init --language=java -s . -v $DB
# Repeat trace-command as needed to cover all targets
codeql database trace-command -v $DB -- make
codeql database finalize -j4 $DB
Then add this database directory to your VS Code DATABASES
tab.
- Identify the source part of the
System.console().readLine();
expression, the
buf
argument. Start from afrom..where..select
, then convert to a predicate. - Identify the sink part of the
conn.createStatement().executeUpdate(query);
expression, the
query
argument. Again start fromfrom..where..select
, then convert to a predicate. - Fill in the taintflow configuration boilerplate
class SqliFlowConfig extends TaintTracking::Configuration { SqliFlowConfig() { this = "SqliFlow" } override predicate isSource(DataFlow::Node node) { none() } override predicate isSink(DataFlow::Node node) { none() } }
The final query (without isAdditionalTaintStep
) is
/**
* @name SQLI Vulnerability
* @description Using untrusted strings in a sql query allows sql injection attacks.
* @kind path-problem
* @id java/SQLIVulnerable
* @problem.severity warning
*/
import java
import semmle.code.java.dataflow.TaintTracking
import DataFlow::PathGraph
class SqliFlowConfig extends TaintTracking::Configuration {
SqliFlowConfig() { this = "SqliFlow" }
override predicate isSource(DataFlow::Node source) {
// System.console().readLine();
exists(Call read |
read.getCallee().getName() = "readLine" and
read = source.asExpr()
)
}
override predicate isSink(DataFlow::Node sink) {
// conn.createStatement().executeUpdate(query);
exists(Call exec |
exec.getCallee().getName() = "executeUpdate" and
exec.getArgument(0) = sink.asExpr()
)
}
}
from SqliFlowConfig conf, DataFlow::PathNode source, DataFlow::PathNode sink
where conf.hasFlowPath(source, sink)
select sink, source, sink, "Possible SQL injection"
codeql resolve library-path --query=SqlInjection.ql
The data flow and taint tracking libraries have been extended with versions of `isBarrierIn`, `isBarrierOut`, and `isBarrierGuard`, respectively `isSanitizerIn`, `isSanitizerOut`, and `isSanitizerGuard`, that support flow states.
module BarrierGuard<guardChecksSig/3 guardChecks> {
/** Gets a node that is safely guarded by the given guard check. */
Node getABarrierNode() {
exists(Guard g, SsaVariable v, boolean branch, RValue use |
guardChecks(g, v.getAUse(), branch) and
use = v.getAUse() and
g.controls(use.getBasicBlock(), branch) and
result.asExpr() = use
)
}
}
/**
* DEPRECATED: Use `BarrierGuard` module instead.
*
* A guard that validates some expression.
*
* To use this in a configuration, extend the class and provide a
* characteristic predicate precisely specifying the guard, and override
* `checks` to specify what is being validated and in which branch.
*
* It is important that all extending classes in scope are disjoint.
*/
deprecated class BarrierGuard extends Guard {
/** Holds if this guard validates `e` upon evaluating to `branch`. */
abstract predicate checks(Expr e, boolean branch);
/** Gets a node guarded by this guard. */
final Node getAGuardedNode() {
exists(SsaVariable v, boolean branch, RValue use |
this.checks(v.getAUse(), branch) and
use = v.getAUse() and
this.controls(use.getBasicBlock(), branch) and
result.asExpr() = use
)
}
}
override predicate isSanitizer(DataFlow::Node sanitizer) {
exists(Call noSemi |
noSemi.getCallee().getName() = "no_semi"
and sanitizer.asExpr() = noSemi
)
}
sanitizer.asExpr() = noSemi does not find no_semi or writeNicknames or write_info. It finds writeNicknames1 though.
The partial flow no longer exists?
from EntityNameMatchCfg cfg, DataFlow::PartialPathNode src, DataFlow::PartialPathNode sink
where cfg.hasPartialFlow(src, sink, 25)
select src, sink
String line = System.console().readLine();
if (line.isEmpty()) break;
nicknames.addNickname(line);
line as source only flows to nicknames;
nicknames.addNickname() has to propagate to nicknames.getNicknames()