I'm trying to detect sanitizing/barrier guards in more complex control flow. In another BarrierGuard issue I opened (#10011), we established how to address complex dataflow into the barrier guard, but the remaining issue I have is that the underlying mechanics of these guards (whether using barrier guards directly or the mechanism in #10011) rely on the 'controls' predicate.
Here is a null dereference example that barrier guards, or the mechanism discussed in #10011, handle fine (the query correctly reports the bad case and does not report the good case):
char* d = (char*)my_malloc(10);
// BARRIER GUARD: CHECKS IF THE VALUE IS NULL
if (d == NULL)
    goto end;
// GOOD: NULL-CHECKED DEREF (CORRECTLY NOT IDENTIFIED BY THE QUERY)
use(d);
end:
// BAD: POSSIBLE NULL DEREF (CORRECTLY IDENTIFIED BY THE QUERY)
use(d);
This style of code (using gotos, exits, or returns) is common, and unfortunately the control flow is not always that simple. Often the null check (barrier guard) is buried within other conditionals. Here is an example in which, due to the nesting of the barrier guard, both the good and the bad case are reported as potential null dereferences.
char* d;
if (boolgen()) {
    d = malloc(10);
    if (d == NULL)
    {
        goto end;
    }
}
// GOOD: NULL-CHECKED DEREF (INCORRECTLY IDENTIFIED BY THE QUERY)
use(d);
end:
// BAD: POSSIBLE NULL DEREF (CORRECTLY IDENTIFIED BY THE QUERY)
use(d);
Note that in this case the value may be used without initialization, which is its own bug, but that's not what CodeQL is reporting for my query. It is identifying the malloc initialization as the source.
The issue, as far as I can tell, is that the underlying 'controls' logic correctly says the guard does not control the use (since it is nested and not a dominating guard).
Examples like this yield hundreds of false positives in real-world tests, and it's unclear to me what CodeQL paradigm can be used to filter out these cases. Any suggestions?
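To make the dominance point concrete, here is a minimal C sketch that instruments the nested example and records which program points a single execution passes through. The names `struct trace`, `run_nested_example`, `take_branch`, and `alloc_fails` are hypothetical and exist only for this illustration; `take_branch` stands in for `boolgen()`, and `d` is initialized to a static buffer (unlike the original) so the sketch has defined behavior.

```c
#include <stdbool.h>
#include <stddef.h>

/* Records which program points one execution passed through. */
struct trace {
    bool reached_guard;    /* the `d == NULL` check was evaluated */
    bool reached_good_use; /* the first use(d) was executed */
    bool reached_bad_use;  /* the use(d) after `end:` was executed */
};

/*
 * Hypothetical stand-in for the nested example above.
 * `take_branch` plays the role of boolgen(); `alloc_fails` decides
 * whether the simulated malloc returns NULL.
 */
struct trace run_nested_example(bool take_branch, bool alloc_fails)
{
    struct trace t = { false, false, false };
    static char buffer[10];
    char *d = buffer; /* assumption: initialized only to keep the sketch well defined */

    if (take_branch) {
        d = alloc_fails ? NULL : buffer;
        t.reached_guard = true;
        if (d == NULL)
            goto end;
    }
    t.reached_good_use = true; /* corresponds to the GOOD use(d) */
end:
    t.reached_bad_use = true;  /* corresponds to the BAD use(d) */
    return t;
}
```

Calling `run_nested_example(false, false)` reaches the good use without ever evaluating the guard, which is why the guard does not dominate (and so does not "control") that use. Calling `run_nested_example(true, true)` skips the good use, so the only definition that could be NULL never actually reaches it.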
I might have a solution here. It's not super pretty, and I haven't done any benchmarking on it yet. Consider this snippet:
bool bad(int);
int source();
void sink(int);

void test(bool b) {
    int x;
    if (b) {
        x = source();
        if (bad(x)) {
            return;
        }
    }
    sink(x);
}
where we don't want to report a flow from source() to sink(x). I hope that's a valid interpretation of your problem.
The following, somewhat ugly, barrier achieves this:
/**
 * @kind path-problem
 */

import cpp
import semmle.code.cpp.dataflow.DataFlow
import semmle.code.cpp.controlflow.Guards
import semmle.code.cpp.controlflow.DefinitionsAndUses
import DataFlow::PathGraph

class BadCall extends FunctionCall {
  BadCall() { this.getTarget().hasGlobalName("bad") }
}

class Conf extends DataFlow::Configuration {
  Conf() { this = "Conf" }

  override predicate isSource(DataFlow::Node source) {
    source.asExpr().(Call).getTarget().hasName("source")
  }

  override predicate isSink(DataFlow::Node sink) {
    sink.asExpr() = any(Call call | call.getTarget().hasName("sink")).getAnArgument()
  }

  override predicate isBarrier(DataFlow::Node node) {
    exists(VariableAccess use, Variable v |
      use = node.asExpr() and // The node is a use of some variable
      v = use.getTarget() and // and the variable is `v`.
      forex(SsaDefinition def |
        definitionUsePair(v, def, use) // For all possible definitions that might reach the node.
      |
        exists(BadCall badCall, BasicBlock controlled |
          v.getAnAccess() = badCall.getAnArgument() and // There needs to be a call to `bad(v)`
          badCall.(GuardCondition).controls(controlled, true) and // such that the call controls whether we reach `controlled`
          not controlled.getASuccessor+() = use.getBasicBlock() // and controlled does not allow flow to the node
        )
      )
    )
  }
}

from Conf conf, DataFlow::PathNode source, DataFlow::PathNode sink
where conf.hasFlowPath(source, sink)
select sink.getNode(), source, sink, ""
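The key step in that barrier is the reachability test: the block controlled by `bad(x)` must not be able to reach the block containing the use. The C sketch below models only that step on a hand-written control-flow graph of `test`. It is not CodeQL and does not model the `forex` over definitions. All names (`struct cfg`, `reaches`, `build_test_cfg`, `barrier_applies`, and the block labels) are hypothetical and introduced purely for illustration.

```c
#include <stdbool.h>

#define MAX_BLOCKS 8

/* Tiny hand-written CFG: edge[i][j] means block i can jump to block j. */
struct cfg {
    int n;
    bool edge[MAX_BLOCKS][MAX_BLOCKS];
};

/* True if `to` is reachable from `from` via one or more edges,
 * mirroring the transitive closure in `getASuccessor+()`. */
bool reaches(const struct cfg *g, int from, int to)
{
    bool seen[MAX_BLOCKS] = { false };
    int stack[MAX_BLOCKS];
    int top = 0;

    stack[top++] = from;
    while (top > 0) {
        int cur = stack[--top];
        for (int next = 0; next < g->n; next++) {
            if (!g->edge[cur][next] || seen[next])
                continue;
            if (next == to)
                return true;
            seen[next] = true;
            stack[top++] = next;
        }
    }
    return false;
}

/* Assumed block layout for `test`: the `if (b)` entry, the block with
 * `x = source(); if (bad(x))`, the `return;` block, and the `sink(x);` block. */
enum { ENTRY, GUARD, RETURN_BLOCK, SINK };

/* Builds the CFG of `test`. With early_return == false it builds a variant
 * in which the `bad(x)` branch falls through to the sink instead of returning. */
struct cfg build_test_cfg(bool early_return)
{
    struct cfg g = { .n = 4 };
    g.edge[ENTRY][GUARD] = true;
    g.edge[ENTRY][SINK] = true;
    g.edge[GUARD][RETURN_BLOCK] = true;
    g.edge[GUARD][SINK] = true;
    if (!early_return)
        g.edge[RETURN_BLOCK][SINK] = true;
    return g;
}

/* The barrier condition holds when the controlled block cannot reach the use. */
bool barrier_applies(const struct cfg *g, int controlled, int use_block)
{
    return !reaches(g, controlled, use_block);
}
```

With `build_test_cfg(true)`, `barrier_applies(&g, RETURN_BLOCK, SINK)` holds because the `return` block has no path to the sink. In the fall-through variant from `build_test_cfg(false)` it does not hold, so a flow to `sink(x)` would still be reported there.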