Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java: Improve NullGuards.clearlyNotNullExpr() #5762

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 25 additions & 5 deletions java/ql/src/semmle/code/java/dataflow/NullGuards.qll
Original file line number Diff line number Diff line change
Expand Up @@ -44,21 +44,32 @@ Expr clearlyNotNullExpr(Expr reason) {
or
result instanceof ArrayCreationExpr and reason = result
or
result instanceof FunctionalExpr and reason = result
or
result instanceof TypeLiteral and reason = result
or
result instanceof ThisAccess and reason = result
or
result instanceof StringLiteral and reason = result
or
// Add and AssignAdd performing String concatenation never have null result
result instanceof AddExpr and result.getType() instanceof TypeString and reason = result
or
result instanceof AssignAddExpr and result.getType() instanceof TypeString and reason = result
or
exists(Field f |
result = f.getAnAccess() and
(f.hasName("TRUE") or f.hasName("FALSE")) and
f.hasName(["TRUE", "FALSE"]) and
f.getDeclaringType().hasQualifiedName("java.lang", "Boolean") and
reason = result
)
or
result.getType() instanceof PrimitiveType and not result instanceof TypeAccess and reason = result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These cannot be null simply due to their type, so possibly out of scope for this predicate. But I could imagine places where this disjunct might contribute. Please explain where you think that's the case and consider whether those cases are handled better without including all primitive-type expressions in this predicate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These cannot be null simply due to their type

Can't they? At least for expressions which actually have a value a primitive value can never be null (any dereferencing would have thrown a NullPointerException before).
For non-value expressions this of course does not apply, which is why I excluded TypeAccess; maybe there are more which need to be excluded. Though I assume since this predicate is mostly used with control flow analysis, such Expr types would not be part of the result set in the first place, but maybe it would be good to restrict it more nonetheless?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this addition is technically correct, hence "These cannot be null simply due to their type". What I'm questioning is the value that this adds, since the cost is a great many tuples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, misread your comment. Yes, performance considerations might be a fair point (you can probably estimate or meassure that better than me).

However, this change does increase the number of findings, see this query.

or
// Ignore RValue because that is covered by SsaVariable check below and correctly
// reports the assigned value as 'reason'
result instanceof CompileTimeConstantExpr and not result instanceof RValue and reason = result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null is a compile-time constant expression.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, are there any compile-time constant operations that aren't already included directly in this predicate? If so, should they? (and leading to making this disjunct superfluous?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null is a compile-time constant expression.

At least according to the language specification, it is not. See JLS 16 §15.29:

Literals of primitive type (§3.10.1, §3.10.2, §3.10.3, §3.10.4), string literals (§3.10.5), and text blocks (§3.10.6)

All of the other definitions rely on this point, so a NullLiteral cannot be a compile time constant.

Though you are right, with the check for primitive type below, the check for a CompileTimeConstantExpr here would probably be redundant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least according to the language specification, it is not. See JLS 16 §15.29:

Ah, right, I missed that (and the fact that CompileTimeConstantExpr indeed doesn't include null). But I'd expect all (or most of) the operations that make up compile-time constants to be included here, thus making it redundant.

or
result.(CastExpr).getExpr() = clearlyNotNullExpr(reason)
or
result.(AssignExpr).getSource() = clearlyNotNullExpr(reason)
Expand All @@ -70,6 +81,12 @@ Expr clearlyNotNullExpr(Expr reason) {
(reason = r1 or reason = r2)
)
or
exists(SwitchExpr s |
s = result and
s.getAResult() = clearlyNotNullExpr(reason) and
forall(Expr resultExpr | resultExpr = s.getAResult() | resultExpr = clearlyNotNullExpr())
)
or
exists(SsaVariable v, boolean branch, RValue rval, Guard guard |
guard = directNullGuard(v, branch, false) and
guard.controls(rval.getBasicBlock(), branch) and
Expand Down Expand Up @@ -104,10 +121,13 @@ predicate clearlyNotNull(SsaVariable v, Expr reason) {
clearlyNotNull(captured, reason)
)
or
exists(Field f |
v.getSourceVariable().getVariable() = f and
f.isFinal() and
f.getInitializer() = clearlyNotNullExpr(reason)
exists(Variable var |
v.getSourceVariable().getVariable() = var and
var.isFinal() and
var.getAnAssignedValue() = clearlyNotNullExpr(reason) and
forall(Expr assignedValue | assignedValue = var.getAnAssignedValue() |
assignedValue = clearlyNotNullExpr()
)
Comment on lines +127 to +130
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this could be a forex statement:

Suggested change
var.getAnAssignedValue() = clearlyNotNullExpr(reason) and
forall(Expr assignedValue | assignedValue = var.getAnAssignedValue() |
assignedValue = clearlyNotNullExpr()
)
forex(Expr assignedValue | assignedValue = var.getAnAssignedValue() |
assignedValue = clearlyNotNullExpr()
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this entire disjunct rewrite looks a bit curious. Why the change from Field to Variable? We ought to be able to handle all non-fields more precisely through SSA. What's the idea behind this change?

Copy link
Contributor Author

@Marcono1234 Marcono1234 Apr 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this could be a forex statement:

I used forall here to report the reason; with your proposal reason would not be bound. Note that for SwitchExpr I tried binding the reason in the forall but that did not work, therefore I wrote it as separate formula here as well.


I think this entire disjunct rewrite looks a bit curious

Do you mean usage of getAnAssignedValue()? The reason is that getInitializer() is too restrictive and does not cover a lot of cases, e.g. consider:

class Test
  final String s;

  public Test(boolean b) {
    if (b) {
      s = "a";
    } else {
      s = "b";
    }
  }
}

Here s cannot be null because all assigned values are non-null, even though they do not appear as part of the initializer. But that does not matter since you cannot access an uninitialized final variable, see JLS 16 §16.

Actually when a LocalVariableDecl is used, the final requirement could be dropped, but this would require special handling for variables being assined a value implicitly, for example the variable of an enhanced for loop (for (String s : strings)).


We ought to be able to handle all non-fields more precisely through SSA.

I think here it does not matter due to not being able to access an unassigned final variable, see above. However, at least at the current state reducing this back to Field fails finding n3 of the included test.
I am not familiar enough with SSA, so if you think it should be solved using that, feel free to improve the pull request in that regard.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean usage of getAnAssignedValue()

Yes. It would be cleaner to add a case for phi-nodes. But this predicate isn't intended to cover everything that we think isn't null - it's intended to cover everything that clearly isn't null.
Are these changes motivated by concrete query FPs in real code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this predicate isn't intended to cover everything that we think isn't null - it's intended to cover everything that clearly isn't null.

When all getAnAssignedValue() expressions are clearly not null, then any read access of the variable has a clearly non-null result, see JLS 16 §16 linked above.
Checking only getInitializer() misses a lot cases where a variable or field is not directly assigned a value in the initializer, see Java source code example above.

(Note that this ignores reflection here since the previous code did not consider it either.)

)
}

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
| Test.java:12:13:12:24 | new Object(...) | Test.java:12:13:12:24 | new Object(...) |
| Test.java:13:21:13:21 | 0 | Test.java:13:21:13:21 | 0 |
| Test.java:14:13:14:16 | this | Test.java:14:13:14:16 | this |
| Test.java:15:13:15:14 | "" | Test.java:15:13:15:14 | "" |
| Test.java:17:13:17:24 | Boolean.TRUE | Test.java:17:13:17:24 | Boolean.TRUE |
| Test.java:18:13:18:25 | Boolean.FALSE | Test.java:18:13:18:25 | Boolean.FALSE |
| Test.java:20:13:20:13 | 1 | Test.java:20:13:20:13 | 1 |
| Test.java:21:13:21:21 | Float.NaN | Test.java:21:13:21:21 | Float.NaN |
| Test.java:22:13:22:20 | constant | Test.java:3:29:3:30 | "" |
| Test.java:23:13:23:23 | (...)... | Test.java:23:22:23:23 | "" |
| Test.java:24:13:24:21 | ...=... | Test.java:24:20:24:21 | "" |
| Test.java:26:13:26:54 | ...?...:... | Test.java:26:38:26:39 | "" |
| Test.java:26:13:26:54 | ...?...:... | Test.java:26:43:26:54 | new Object(...) |
| Test.java:27:13:27:21 | switch (...) | Test.java:3:29:3:30 | "" |
| Test.java:27:13:27:21 | switch (...) | Test.java:28:27:28:28 | "" |
| Test.java:34:13:34:25 | ...::... | Test.java:34:13:34:25 | ...::... |
| Test.java:35:13:35:27 | ...->... | Test.java:35:13:35:27 | ...->... |
| Test.java:41:13:41:18 | ... + ... | Test.java:41:13:41:18 | ... + ... |
| Test.java:42:13:42:20 | ... + ... | Test.java:42:13:42:20 | ... + ... |
| Test.java:43:13:43:19 | ...+=... | Test.java:43:13:43:19 | ...+=... |
| Test.java:44:13:44:32 | ... + ... | Test.java:44:13:44:32 | ... + ... |
| Test.java:51:17:51:18 | n1 | Test.java:49:13:49:22 | ... != ... |
| Test.java:59:13:59:14 | n2 | Test.java:57:14:57:15 | "" |
| Test.java:70:13:70:14 | n3 | Test.java:65:18:65:19 | "" |
| Test.java:70:13:70:14 | n3 | Test.java:67:18:67:18 | 1 |
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
import java
import semmle.code.java.dataflow.NullGuards

from Expr notNull, Expr reason
where
notNull = clearlyNotNullExpr(reason) and
// Restrict to ArrayInit to make results easier to read
notNull.getParent() instanceof ArrayInit
select notNull, reason
88 changes: 88 additions & 0 deletions java/ql/test/library-tests/null-guards/Test.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,88 @@
class NonNullTest {
final String constantNull = null;
final String constant = "";

Object getMaybeNull() {
return null;
}

void notNull() {
Object temp;
Object[] r = {
new Object(),
new int[0],
this,
"",

Boolean.TRUE,
Boolean.FALSE,

1,
Float.NaN,
constant,
(Object) "",
temp = "",

getMaybeNull() == null ? "" : new Object(),
switch(1) {
case 1 -> "";
default -> constant;
}
};

Runnable[] functional = {
this::notNull,
() -> notNull()
};

String s = null;
String s2 = null;
r = new Object[] {
s + s2,
null + s,
s += s2,
(String) null + null
};

Object n1 = getMaybeNull();
// Guarded by null check
if (n1 != null) {
r = new Object[] {
n1
};
}

Object n2 = getMaybeNull();
// Assigned a non-null value
n2 = "";
r = new Object[] {
n2
};

// final variable for which all assigned values are non-null
final Object n3;
if (getMaybeNull() == null) {
n3 = "";
} else {
n3 = 1;
}
r = new Object[] {
n3
};
}

void maybeNull(boolean b, int i) {
Object temp;
Object[] r = {
constantNull,
(Object) null,
temp = (String) null,
b ? "" : null,
switch(i) {
case 1 -> "";
default -> null;
},
null
};
}
}
1 change: 1 addition & 0 deletions java/ql/test/library-tests/null-guards/options
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
//semmle-extractor-options: --javac-args -source 15 -target 15