Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java: Improve NullGuards.clearlyNotNullExpr() #5762

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Marcono1234
Copy link
Contributor

@Marcono1234 Marcono1234 commented Apr 25, 2021

Improve NullGuards.clearlyNotNullExpr() by considering more cases where an expression cannot be null.

@github-actions github-actions bot added the Java label Apr 25, 2021
@Marcono1234 Marcono1234 force-pushed the marcono1234/clearlyNotNullExpr branch from 3812b3e to 3ce88e1 Compare April 25, 2021 15:30
@Marcono1234 Marcono1234 marked this pull request as ready for review April 25, 2021 15:31
@Marcono1234 Marcono1234 requested a review from a team as a code owner April 25, 2021 15:31
@Marcono1234 Marcono1234 force-pushed the marcono1234/clearlyNotNullExpr branch from 3ce88e1 to 9918f1b Compare April 25, 2021 15:32
or
// Ignore RValue because that is covered by SsaVariable check below and correctly
// reports the assigned value as 'reason'
result instanceof CompileTimeConstantExpr and not result instanceof RValue and reason = result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null is a compile-time constant expression.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, are there any compile-time constant operations that aren't already included directly in this predicate? If so, should they? (and leading to making this disjunct superfluous?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null is a compile-time constant expression.

At least according to the language specification, it is not. See JLS 16 §15.29:

Literals of primitive type (§3.10.1, §3.10.2, §3.10.3, §3.10.4), string literals (§3.10.5), and text blocks (§3.10.6)

All of the other definitions rely on this point, so a NullLiteral cannot be a compile time constant.

Though you are right, with the check for primitive type below, the check for a CompileTimeConstantExpr here would probably be redundant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least according to the language specification, it is not. See JLS 16 §15.29:

Ah, right, I missed that (and the fact that CompileTimeConstantExpr indeed doesn't include null). But I'd expect all (or most of) the operations that make up compile-time constants to be included here, thus making it redundant.

Comment on lines +127 to +130
var.getAnAssignedValue() = clearlyNotNullExpr(reason) and
forall(Expr assignedValue | assignedValue = var.getAnAssignedValue() |
assignedValue = clearlyNotNullExpr()
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this could be a forex statement:

Suggested change
var.getAnAssignedValue() = clearlyNotNullExpr(reason) and
forall(Expr assignedValue | assignedValue = var.getAnAssignedValue() |
assignedValue = clearlyNotNullExpr()
)
forex(Expr assignedValue | assignedValue = var.getAnAssignedValue() |
assignedValue = clearlyNotNullExpr()
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this entire disjunct rewrite looks a bit curious. Why the change from Field to Variable? We ought to be able to handle all non-fields more precisely through SSA. What's the idea behind this change?

Copy link
Contributor Author

@Marcono1234 Marcono1234 Apr 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this could be a forex statement:

I used forall here to report the reason; with your proposal reason would not be bound. Note that for SwitchExpr I tried binding the reason in the forall but that did not work, therefore I wrote it as separate formula here as well.


I think this entire disjunct rewrite looks a bit curious

Do you mean usage of getAnAssignedValue()? The reason is that getInitializer() is too restrictive and does not cover a lot of cases, e.g. consider:

class Test
  final String s;

  public Test(boolean b) {
    if (b) {
      s = "a";
    } else {
      s = "b";
    }
  }
}

Here s cannot be null because all assigned values are non-null, even though they do not appear as part of the initializer. But that does not matter since you cannot access an uninitialized final variable, see JLS 16 §16.

Actually when a LocalVariableDecl is used, the final requirement could be dropped, but this would require special handling for variables being assined a value implicitly, for example the variable of an enhanced for loop (for (String s : strings)).


We ought to be able to handle all non-fields more precisely through SSA.

I think here it does not matter due to not being able to access an unassigned final variable, see above. However, at least at the current state reducing this back to Field fails finding n3 of the included test.
I am not familiar enough with SSA, so if you think it should be solved using that, feel free to improve the pull request in that regard.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean usage of getAnAssignedValue()

Yes. It would be cleaner to add a case for phi-nodes. But this predicate isn't intended to cover everything that we think isn't null - it's intended to cover everything that clearly isn't null.
Are these changes motivated by concrete query FPs in real code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this predicate isn't intended to cover everything that we think isn't null - it's intended to cover everything that clearly isn't null.

When all getAnAssignedValue() expressions are clearly not null, then any read access of the variable has a clearly non-null result, see JLS 16 §16 linked above.
Checking only getInitializer() misses a lot cases where a variable or field is not directly assigned a value in the initializer, see Java source code example above.

(Note that this ignores reflection here since the previous code did not consider it either.)

f.getDeclaringType().hasQualifiedName("java.lang", "Boolean") and
reason = result
)
or
result.getType() instanceof PrimitiveType and not result instanceof TypeAccess and reason = result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These cannot be null simply due to their type, so possibly out of scope for this predicate. But I could imagine places where this disjunct might contribute. Please explain where you think that's the case and consider whether those cases are handled better without including all primitive-type expressions in this predicate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These cannot be null simply due to their type

Can't they? At least for expressions which actually have a value a primitive value can never be null (any dereferencing would have thrown a NullPointerException before).
For non-value expressions this of course does not apply, which is why I excluded TypeAccess; maybe there are more which need to be excluded. Though I assume since this predicate is mostly used with control flow analysis, such Expr types would not be part of the result set in the first place, but maybe it would be good to restrict it more nonetheless?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this addition is technically correct, hence "These cannot be null simply due to their type". What I'm questioning is the value that this adds, since the cost is a great many tuples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, misread your comment. Yes, performance considerations might be a fair point (you can probably estimate or meassure that better than me).

However, this change does increase the number of findings, see this query.

@Marcono1234
Copy link
Contributor Author

The predicate clearlyNotNull(SsaVariable v, Expr reason) is probably also missing an AssignAddExpr of type String (performing String concatenation). In this case regardless of what value the variable or the RHS had, the variable will afterwards have a non-null result.
How would you model that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants