Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infinite loop when executing DataFlow queries #7481

Open
moshec2 opened this issue Dec 26, 2021 · 15 comments
Open

Infinite loop when executing DataFlow queries #7481

moshec2 opened this issue Dec 26, 2021 · 15 comments
Assignees

Comments

@moshec2
Copy link

moshec2 commented Dec 26, 2021

I have built a CodeQL database for my Java project. I am able to execute simple queries on it, but when I attempt to execute a DataFlow query, The execution gets stuck in an infinite loop (probably in the ModulusAnalysis::phiModulusRankStep method) and never finishes. The log for such a query looks like this:

[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::phiModulusRankStep#bffff#reorder_0_4_1_2_3/5@i315#46ab60 (iteration 315)
[2021-12-26 17:14:19] (114s) 			 - ModulusAnalysis::phiModulusRankStep#bffff#reorder_0_4_1_2_3_delta has 6 rows (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::ssaModulus#fffff/5@i315#46ab61 (iteration 315)
[2021-12-26 17:14:19] (114s) 			 - ModulusAnalysis::ssaModulus#fffff_delta has 2022 rows (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::condExprBranchModulus#fbfff#reorder_1_0_2_3_4/5@i315#46ab62 (iteration 315)
[2021-12-26 17:14:19] (114s) Empty delta for ModulusAnalysis::condExprBranchModulus#fbfff#reorder_1_0_2_3_4_delta (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::addModulus#fffff#reorder_1_0_2_3_4/5@i315#46ab63 (iteration 315)
[2021-12-26 17:14:19] (114s) Empty delta for ModulusAnalysis::addModulus#fffff#reorder_1_0_2_3_4_delta (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::exprModulus#ffff/4@i316#46ab6w (iteration 316)
[2021-12-26 17:14:19] (114s) 			 - ModulusAnalysis::exprModulus#ffff_delta has 1394 rows (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::phiSelfModulus#bfff/4@i316#46ab6x (iteration 316)
[2021-12-26 17:14:19] (114s) 			 - ModulusAnalysis::phiSelfModulus#bfff_delta has 6 rows (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::subModulus#fffff/5@i316#46ab6z (iteration 316)
[2021-12-26 17:14:19] (114s) Empty delta for ModulusAnalysis::subModulus#fffff_delta (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Empty delta for ModulusAnalysis::subModulus#fffff#reorder_1_2_0_3_4_delta (order for disjuncts: delta=<standard>).


[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::phiModulusRankStep#bffff#reorder_0_4_1_2_3/5@i316#46ab60 (iteration 316)
[2021-12-26 17:14:19] (114s) 			 - ModulusAnalysis::phiModulusRankStep#bffff#reorder_0_4_1_2_3_delta has 630 rows (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::ssaModulus#fffff/5@i316#46ab61 (iteration 316)
[2021-12-26 17:14:19] (114s) 			 - ModulusAnalysis::ssaModulus#fffff_delta has 8 rows (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::condExprBranchModulus#fbfff#reorder_1_0_2_3_4/5@i316#46ab62 (iteration 316)
[2021-12-26 17:14:19] (114s) Empty delta for ModulusAnalysis::condExprBranchModulus#fbfff#reorder_1_0_2_3_4_delta (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::addModulus#fffff#reorder_1_0_2_3_4/5@i316#46ab63 (iteration 316)
[2021-12-26 17:14:19] (114s) Empty delta for ModulusAnalysis::addModulus#fffff#reorder_1_0_2_3_4_delta (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::exprModulus#ffff/4@i317#46ab6w (iteration 317)
[2021-12-26 17:14:19] (114s) 			 - ModulusAnalysis::exprModulus#ffff_delta has 890 rows (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::phiSelfModulus#bfff/4@i317#46ab6x (iteration 317)
[2021-12-26 17:14:19] (114s) Empty delta for ModulusAnalysis::phiSelfModulus#bfff_delta (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::subModulus#fffff/5@i317#46ab6z (iteration 317)
[2021-12-26 17:14:19] (114s) Empty delta for ModulusAnalysis::subModulus#fffff_delta (order for disjuncts: delta=500000).
[2021-12-26 17:14:19] (114s) Empty delta for ModulusAnalysis::subModulus#fffff#reorder_1_2_0_3_4_delta (order for disjuncts: delta=<standard>).


[2021-12-26 17:14:19] (114s) Starting to evaluate predicate ModulusAnalysis::phiModulusRankStep#bffff#reorder_0_4_1_2_3/5@i317#46ab60 (iteration 317)
[2021-12-26 17:14:20] (115s) 			 - ModulusAnalysis::phiModulusRankStep#bffff#reorder_0_4_1_2_3_delta has 644 rows (order for disjuncts: delta=500000).
[2021-12-26 17:14:20] (115s) Starting to evaluate predicate ModulusAnalysis::ssaModulus#fffff/5@i317#46ab61 (iteration 317)
[2021-12-26 17:14:20] (115s) 			 - ModulusAnalysis::ssaModulus#fffff_delta has 254 rows (order for disjuncts: delta=500000).
[2021-12-26 17:14:20] (115s) Starting to evaluate predicate ModulusAnalysis::condExprBranchModulus#fbfff#reorder_1_0_2_3_4/5@i317#46ab62 (iteration 317)
[2021-12-26 17:14:20] (115s) Empty delta for ModulusAnalysis::condExprBranchModulus#fbfff#reorder_1_0_2_3_4_delta (order for disjuncts: delta=500000).
[2021-12-26 17:14:20] (115s) Starting to evaluate predicate ModulusAnalysis::addModulus#fffff#reorder_1_0_2_3_4/5@i317#46ab63 (iteration 317)
[2021-12-26 17:14:20] (115s) Empty delta for ModulusAnalysis::addModulus#fffff#reorder_1_0_2_3_4_delta (order for disjuncts: delta=500000).
[2021-12-26 17:14:20] (115s) Starting to evaluate predicate ModulusAnalysis::exprModulus#ffff/4@i318#46ab6w (iteration 318)
[2021-12-26 17:14:20] (115s) 			 - ModulusAnalysis::exprModulus#ffff_delta has 6 rows (order for disjuncts: delta=500000).
[2021-12-26 17:14:20] (115s) Starting to evaluate predicate ModulusAnalysis::phiSelfModulus#bfff/4@i318#46ab6x (iteration 318)
[2021-12-26 17:14:20] (115s) 			 - ModulusAnalysis::phiSelfModulus#bfff_delta has 2 rows (order for disjuncts: delta=500000).
[2021-12-26 17:14:20] (115s) Starting to evaluate predicate ModulusAnalysis::subModulus#fffff/5@i318#46ab6z (iteration 318)
[2021-12-26 17:14:20] (115s) Empty delta for ModulusAnalysis::subModulus#fffff_delta (order for disjuncts: delta=500000).
[2021-12-26 17:14:20] (115s) Empty delta for ModulusAnalysis::subModulus#fffff#reorder_1_2_0_3_4_delta (order for disjuncts: delta=<standard>).
.............

It continues like this forever.

@moshec2 moshec2 added the question Further information is requested label Dec 26, 2021
@intrigus-lgtm
Copy link
Contributor

How do you define "forever"? 10 minutes? 1 hour? 10 hours? 1 day?

I've been running codeql queries for over 24 hours on huge databases (Mozilla Firefox).
It also looks like there is progress, you first have "iteration 315" and later "iteration 318".
(Of course it might still be that there is a bug, but I'm not familiar with the ModulusAnalysis class)

@moshec2
Copy link
Author

moshec2 commented Dec 27, 2021

The iteration number keeps increasing forever. I executed one (relatively simple) query and it did not finish for more than 3 days. Does it make sense?

@edoardopirovano
Copy link
Contributor

Greetings, thanks for reaching out to us with this question. It certainly is possible for queries to take a very long time on large databases - perhaps even days if the database is sufficiently large. Seeing predicates take hundreds of iterations is also not entirely uncommon, especially in some of the more complex parts of the data flow library.

Some common ways of speeding up our evaluation times include increasing the --ram option passed to CodeQL (more is always better, though definitely leave at least a few GB of your RAM for your operating system and other programs that may be running) and the --threads that CodeQL will use (here, you'll only see improvements up to around 8 threads and even then only on certain queries/DBs since not all evaluations can be parallelised).

It should not be possible for any CodeQL query to take forever - the CodeQL engine should refuse to evaluate things that won't provably terminate. However, it's possible (although unlikely) that there may be a bug and your query will genuinely never terminate. If the above suggestions don't help your query get a result, I'd be happy to look into this further for you if you could provide us with the database you are using and what query you are trying to evaluate.

@edoardopirovano edoardopirovano added awaiting-response The CodeQL team is awaiting further input or clarification from the original reporter of this issue. and removed question Further information is requested labels Dec 27, 2021
@github-actions
Copy link
Contributor

This issue is stale because it has been open 14 days with no activity. Comment or remove the Stale label in order to avoid having this issue closed in 7 days.

@github-actions github-actions bot added the Stale label Jan 11, 2022
@github-actions
Copy link
Contributor

This issue was closed because it has been inactive for 7 days.

@moshec2
Copy link
Author

moshec2 commented Feb 6, 2022

Hi,

I have noticed that even if I execute the following simple query:

import java
import semmle.code.java.dataflow.DataFlow

from DataFlow::Node n
select n

I get the same behavior.
(I did let it run for more than 3 weeks and it did not return)

I am pretty sure it has to be some bug in the DataFlow standard library.

@aschackmull
Copy link
Contributor

The log snippet at the top does not indicate looping (the varying row counts is an indication that progress is being made). However, certain types of bugs in the QL libraries related to ModulusAnalysis could conceivably result in an attempt to enumerate all of int (which would certainly look like an infinite loop). I'd be curious to have a look at the log output in more detail (in particular the parts of the log that presumably repeat, which would be after the above quoted snippet), and to see whether I can reproduce locally. Is the project open source, such that I could try to reproduce locally? If it something inside ModulusAnalysis that appears to never finish, then the most likely cause is some particularly unfortunate code pattern in a single method that somehow makes the library calculate a huge amount of expression-equals-value-modulo-constant facts. Determining what that code pattern is would be useful such that we potentially could fix the QL to avoid degeneracy.

@aschackmull aschackmull reopened this Feb 7, 2022
@moshec2
Copy link
Author

moshec2 commented Feb 7, 2022

Unfortunately, this is a closed source project, so I cannot share the source code / database.

I tinkered a bit with the codeql standard library, and I have noticed some interesting behaviors which may help to identify the root cause:

In this qll file: https://github.com/github/codeql/blob/main/java/ql/lib/semmle/code/java/dataflow/internal/DataFlowNodes.qll
When I comment out lines 16 and 466 - 472:

    // DataFlowImplCommon::forceCachingInSameStage() and
...
/*
private class SummaryPostUpdateNode extends SummaryNode, PostUpdateNode {
  private Node pre;

  SummaryPostUpdateNode() { FlowSummaryImpl::Private::summaryPostUpdateNode(this, pre) }

  override Node getPreUpdateNode() { result = pre }
}
*/

Suddenly the above query (from DataFlow::Node n select n) returns results!
After that change, When I query for RemoteFlowSource (which inherits DataFlow::Node), it gets stuck again. And again, when I comment out some of the classes which extend RemoteFlowSource in https://github.com/github/codeql/blob/main/java/ql/lib/semmle/code/java/dataflow/FlowSources.qll , it suddenly returns results.

Another interesting fact I have noticed is: Without commenting out those parts (when the query does get stuck), it always has 23+ execution stages, and it always gets stuck 1 stage before the last one (22/23, 23/24, ...). When there are less stages (usually 13 after I comment out those lines), it always works.

Is there something else I can test to help you identify the issue?
Can I somehow debug what happens (e.g. to make codeql print which code segments are currently getting processed)?

Thanks for your help!

@aschackmull
Copy link
Contributor

Those facts are unfortunately somewhat arbitrary and don't reveal much - it's only saying that when you don't evaluate the part that's broken then it doesn't appear broken. The log snippet you originally posted is more useful, except the snippet above was only after 115s of eval. We'd need to look at the end of the log from the point where it really starts to "loop". The log can be augmented with additional debug info through the option "Code QL > Running Queries: Debug" in VSCode.

@github-actions github-actions bot removed the Stale label Feb 8, 2022
@moshec2
Copy link
Author

moshec2 commented Feb 20, 2022

Hi @aschackmull

I have found the root cause of the issue. You can reproduce the issue by building a database containing a single class with the following code:

class Class {
  void func() {
    while(true) {
      int i = i - 1;
    }
  }
}

A class in my project contains this code pattern.
This class does not actually compile (variable value might not have been initialized). Despite that, the project successfully builds because this is a legacy class that is not actually in use anymore.
Although the class has failed to compile, CodeQL still indexed it (which I personally think is a good thing), and this specific code pattern caused some loop / recursion (probably in the SSA module, but I am not sure) which just keeps running forever. That's why I was not able to execute any queries which (directly or indirectly) uses the SSA module.

I am not sure if it's considered a bug or not (because the code does not actually compile), but I do think it should be fixed.

@smowton
Copy link
Contributor

smowton commented Feb 21, 2022

Thanks for the minimal test case -- candidate fix: #8155

@moshec2
Copy link
Author

moshec2 commented Feb 22, 2022

@smowton Wouldn't ignoring those files cause queries to miss out some results sometimes?

@smowton
Copy link
Contributor

smowton commented Feb 22, 2022

The files have compile errors, so I don't think it would be important

@github-actions
Copy link
Contributor

This issue is stale because it has been open 14 days with no activity. Comment or remove the Stale label in order to avoid having this issue closed in 7 days.

@github-actions github-actions bot added the Stale label Mar 12, 2022
@smowton smowton removed the Stale label Mar 14, 2022
@github-actions
Copy link
Contributor

This issue is stale because it has been open 14 days with no activity. Comment or remove the Stale label in order to avoid having this issue closed in 7 days.

@github-actions github-actions bot added the Stale label Mar 29, 2022
@smowton smowton removed Stale awaiting-response The CodeQL team is awaiting further input or clarification from the original reporter of this issue. labels Mar 29, 2022
@smowton smowton self-assigned this Mar 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants
@smowton @edoardopirovano @aschackmull @intrigus-lgtm @moshec2 and others