Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CPP: Nested Conditionals and BarrierGuards #10101

Open
bdrodes opened this issue Aug 18, 2022 · 2 comments
Open

CPP: Nested Conditionals and BarrierGuards #10101

bdrodes opened this issue Aug 18, 2022 · 2 comments
Labels
C++ question Further information is requested

Comments

@bdrodes
Copy link
Contributor

bdrodes commented Aug 18, 2022

I'm trying to detect sanitizing/barrier guards in more complex control flow. In another BarrierGarud issue I opened (#10011), we established how to address complex dataflow into the barrier guard, but the remaining issue I have is that the underlying mechanics of these guards (whether using barrier guards directly or the mechanic in #10011) relies on the 'controls' predicate.

Here is a null dereference example that use of barrier guards or the mechanism discussed in #10011 works fine with (the query correctly identifies the bad case, but not the good case):

	       char* d = (char*)my_malloc(10);
               // BARRIER GUARD: CHECKS IF THE VALUE IS NULL
		if(d == NULL)
			goto end;
		// GOOD: NULL-CHECKED DEREF (CORRECTLY NOT IDENTIFIED BY THE QUERY)
		use(d);

		end:
		// BAD: POSSIBLE NULL DEREF (CORRECTSLY IDENTIFIED BY THE QUERY)
		use(d);

This style of code (using gotos, exits, or returns) is common, and unfortunately, it's not always as that simple in terms of the control flow. Often the null check (barrier guard) is buried within other conditionals. Here is an example, and due to the nesting of the barrier guard, both the good and bad case are detected as potential null dereference.

		char* d; 
 		if (boolgen()) {
			d = malloc(10);
			if (d == NULL)
			{
				goto end;
			}
		}

		// GOOD: NULL-CHECKED DEREF (INCORRECTLY IDENTIFIED BY THE QUERY)
		use(d);
		
		end:
		// BAD: POSSIBLE NULL DEREF (CORRECTLY IDENTIFIED BY THE QUERY)
		use(d);

Note that in this case it is possible the value is used without initialization, which is its own bug, but that's not what CodeQL is reporting for me query. It is identifying the malloc initialization as the source.

The issue, as far as I can tell, is that the underlying 'controls' logic correctly says the guard does not control the use (since it is nested and not a dominating guard).

Examples like this yield hundreds of false positives in real-world tests, and it's unclear to me what CodeQL paradigm can be used to filter out these cases. Any suggestions?

@bdrodes bdrodes added the question Further information is requested label Aug 18, 2022
@bdrodes
Copy link
Contributor Author

bdrodes commented Aug 23, 2022

Anyone have a chance to consider this scenario yet?

@tausbn tausbn added the C++ label Aug 29, 2022
@MathiasVP
Copy link
Contributor

I might have a solution here. It's not super pretty, and I haven't done any benchmarking on it yet. Consider this snippet:

bool bad(int);
int source();
void sink(int);

void test(bool b) {
  int x;
  if (b) {
    x = source();
    if (bad(x)) {
      return;
    }
  }
  sink(x);
}

where we don't want to report a flow from source() to sink(x). I hope that's a valid interpretation of your problem.

The following, somewhat ugly, barrier achieves this:

/**
 * @kind path-problem
 */

import cpp
import semmle.code.cpp.dataflow.DataFlow
import semmle.code.cpp.controlflow.Guards
import semmle.code.cpp.controlflow.DefinitionsAndUses
import DataFlow::PathGraph

class BadCall extends FunctionCall {
  BadCall() { this.getTarget().hasGlobalName("bad") }
}

class Conf extends DataFlow::Configuration {
  Conf() { this = "Conf" }

  override predicate isSource(DataFlow::Node source) {
    source.asExpr().(Call).getTarget().hasName("source")
  }

  override predicate isSink(DataFlow::Node sink) {
    sink.asExpr() = any(Call call | call.getTarget().hasName("sink")).getAnArgument()
  }

  override predicate isBarrier(DataFlow::Node node) {
    exists(VariableAccess use, Variable v |
      use = node.asExpr() and // The node is a use of some variable
      v = use.getTarget() and // and the variable is `v`.
      forex(SsaDefinition def |
        definitionUsePair(v, def, use) // For all possible definitions that might reach the node.
      |
        exists(BadCall badCall, BasicBlock controlled |
          v.getAnAccess() = badCall.getAnArgument() and // There needs to be a call to `bad(v)`
          badCall.(GuardCondition).controls(controlled, true) and // such that the call controls whether we reach `controlled`
          not controlled.getASuccessor+() = use.getBasicBlock() // and controlled does not allow flow to the node
        )
      )
    )
  }
}

from Conf conf, DataFlow::PathNode source, DataFlow::PathNode sink
where conf.hasFlowPath(source, sink)
select sink.getNode(), source, sink, ""

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants