Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSA: Add data flow integration layer #16884

Merged
merged 3 commits into from
Jul 10, 2024

Conversation

hvitved
Copy link
Contributor

@hvitved hvitved commented Jul 1, 2024

This PR adds a new module, DataFlowIntegration, to the shared SSA library, which makes it much easier to integrate SSA into the shared data flow library. The integration layer gives you all the relevant flow steps (such as use-use), but also gives you phi-reads and phi-input barriers without having to worry about how they are implemented.

The interface is inspired by the variable capture library, meaning that we construct a Node type where some branches should map to existing branches in the DataFlow::Node type, while the SsaNode sub class should be added as a new branch to DataFlow::Node.

This PR only adds the new integration layer; follow-up PRs will adopt it for C#, Ruby, and Java.

Examples

The examples below show which local flow steps are generated for a particular code snippet.

Example 1

    void M1()
    {
        string s = "taint";
        Use(s);
        s = "safe";
        Use(s);
    }
flowchart TD
929["SSA def(s)"]
931[""taint""]
934["access to local variable s"]
937["SSA def(s)"]
938[""safe""]
941["access to local variable s"]

929 --> 934
931 --> 929
937 --> 941
938 --> 937
Loading

Example 2

    void M2()
    {
        SsaFlow s = new();
        s.Field = "taint";
        Use(s.Field);
    }
flowchart TD
944["SSA def(s)"]
946["(...) ..."]
949["[post] access to local variable s"]
950["access to local variable s"]
957["access to local variable s"]

944 --> 950
946 --> 944
949 --> 957
950 --> 957
Loading

Example 3

    void M3(bool b1, bool b2)
    {
        string s = "a";

        if (b1)
        {
            s = "b"; // s1
        }
        else
        {
            s = "c"; // s2
        }

        // s3 = phi(s1, s2)

        if (b2)
        {
            Use(s); // s3
        }
        else
        {
            Use(s); // s3
        }
    }
flowchart TD
971["SSA def(s)"]
972["[input] SSA phi(s)"]
973[""b""]
976["SSA def(s)"]
977["[input] SSA phi(s)"]
978[""c""]
979["SSA phi(s)"]
983["access to local variable s"]
986["access to local variable s"]

971 --> 972
972 --> 979
973 --> 971
976 --> 977
977 --> 979
978 --> 976
979 --> 983
979 --> 986
Loading

Example 4

    void M4(bool b1, bool b2)
    {
        string s = "a";

        if (b1)
        {
            s = "b"; // s1
            Use(s); // s1
        }
        else
        {
            s = "c"; // s2
        }

        // s3 = phi(s1, s2)

        if (b2)
        {
            Use(s); // s3
        }
        else
        {
            s = "d"; // s4
            Use(s); // s4
        }

        // s5 = phi(s3, s4)

        Use(s); // s5
    }
flowchart TD
1000["[input] SSA phi(s)"]
1002["[post] access to local variable s"]
1003["access to local variable s"]
1006["SSA def(s)"]
1007["[input] SSA phi(s)"]
1008[""c""]
1009["SSA phi(s)"]
1011["[input] SSA phi(s)"]
1013["[post] access to local variable s"]
1014["access to local variable s"]
1017["SSA def(s)"]
1018[""d""]
1019["[input] SSA phi(s)"]
1021["[post] access to local variable s"]
1022["access to local variable s"]
1024["SSA phi(s)"]
1026["access to local variable s"]
998["SSA def(s)"]
999[""b""]

1000 --> 1009
1002 --> 1000
1003 --> 1000
1006 --> 1007
1007 --> 1009
1008 --> 1006
1009 --> 1014
1011 --> 1024
1013 --> 1011
1014 --> 1011
1017 --> 1022
1018 --> 1017
1019 --> 1024
1021 --> 1019
1022 --> 1019
1024 --> 1026
998 --> 1003
999 --> 998
Loading

Example 5

    void M5(bool b1, bool b2)
    {
        string s = "taint"; // s1

        if (b1)
        {
            Use(s); // s1
        }
        else
        {
            Use(s); // s1
        }

        // s2 = phi_read(s1)

        if (b2)
        {
            Use(s); // s2
        }
        else
        {
            Use(s); // s2
        }
    }
flowchart TD
1033["SSA def(s)"]
1035[""taint""]
1037["[input] SSA phi read(s)"]
1039["[post] access to local variable s"]
1040["access to local variable s"]
1041["[input] SSA phi read(s)"]
1043["[post] access to local variable s"]
1044["access to local variable s"]
1045["SSA phi read(s)"]
1049["access to local variable s"]
1052["access to local variable s"]

1033 --> 1040
1033 --> 1044
1035 --> 1033
1037 --> 1045
1039 --> 1037
1040 --> 1037
1041 --> 1045
1043 --> 1041
1044 --> 1041
1045 --> 1049
1045 --> 1052
Loading

@@ -1809,7 +1809,7 @@
* Holds if `(bb, i)` contains a write to an iterator that may have been obtained
* by calling `begin` (or related functions) on the variable `v`.
*/
predicate variableWrite(IRBlock bb, int i, SourceVariable v, boolean certain) {
predicate variableWrite(BasicBlock bb, int i, SourceVariable v, boolean certain) {

Check warning

Code scanning / CodeQL

Missing QLDoc for parameter Warning

The QLDoc has no documentation for certain, but the QLDoc mentions begin
@hvitved hvitved force-pushed the ssa/dataflow-integration branch 2 times, most recently from 3e93c31 to 59a5b34 Compare July 2, 2024 08:59
@hvitved hvitved force-pushed the ssa/dataflow-integration branch from 59a5b34 to 4d0d06d Compare July 2, 2024 11:23

bindingset[l1, l2]
pragma[inline_late]
private predicate inSameFile0(Location l1, Location l2) { l1.getFile() = l2.getFile() }

Check warning

Code scanning / CodeQL

Candidate predicate not marked as `nomagic` Warning

Candidate predicate to
inSameFile
is not marked as nomagic.
@hvitved hvitved force-pushed the ssa/dataflow-integration branch from 4d0d06d to e04847a Compare July 2, 2024 12:50
@hvitved hvitved force-pushed the ssa/dataflow-integration branch 3 times, most recently from 69608af to f8fd8a8 Compare July 3, 2024 08:44
@hvitved hvitved force-pushed the ssa/dataflow-integration branch 2 times, most recently from b1794cc to 35c9233 Compare July 4, 2024 08:53
@hvitved hvitved force-pushed the ssa/dataflow-integration branch from 35c9233 to bd303e8 Compare July 4, 2024 08:58
@hvitved hvitved force-pushed the ssa/dataflow-integration branch 2 times, most recently from a93296e to 0319e00 Compare July 5, 2024 08:37
@hvitved hvitved force-pushed the ssa/dataflow-integration branch 2 times, most recently from 8a8e8fc to 76751a5 Compare July 8, 2024 18:17
@hvitved hvitved force-pushed the ssa/dataflow-integration branch from 76751a5 to aeec768 Compare July 9, 2024 09:54
@hvitved hvitved force-pushed the ssa/dataflow-integration branch from aeec768 to d41eae6 Compare July 9, 2024 10:51
@hvitved hvitved added the no-change-note-required This PR does not need a change note label Jul 9, 2024
@hvitved hvitved marked this pull request as ready for review July 9, 2024 11:30
@hvitved hvitved requested a review from MathiasVP July 9, 2024 12:11
MathiasVP
MathiasVP previously approved these changes Jul 9, 2024
Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a few small comments, but otherwise this LGTM!

)
}

/** Holds if there is a local flow step from `nodeFrom` to `nodeTo`. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably also mention isUseStep in the QLDoc.

shared/ssa/codeql/ssa/Ssa.qll Show resolved Hide resolved
/** Gets the pre-update expression node. */
ExprNode getPreUpdateNode() { result = TExprNode(e, false) }

override string toString() { result = e.toString() + " [postupdate]" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the missing space intentional?

Suggested change
override string toString() { result = e.toString() + " [postupdate]" }
override string toString() { result = e.toString() + " [post update]" }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simply copied it from the variable capture library. However, these nodes should never actually be rendered, because they will be mapped to existing DataFlow::Nodes with appropriate toStrings.

Comment on lines +1329 to +1332
e = DfInput::getARead(_)
or
DfInput::ssaDefAssigns(_, e) and
isPost = false

Check warning

Code scanning / CodeQL

Var only used in one side of disjunct. Warning

The
variable isPost
is only used in one side of disjunct.
shared/ssa/codeql/ssa/Ssa.qll Fixed Show fixed Hide fixed
MathiasVP
MathiasVP previously approved these changes Jul 10, 2024
Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

shared/ssa/codeql/ssa/Ssa.qll Outdated Show resolved Hide resolved
Co-authored-by: Mathias Vorreiter Pedersen <[email protected]>
@hvitved
Copy link
Contributor Author

hvitved commented Jul 10, 2024

I'm going to ignore the failing C++ test, as it cannot possibly be caused by this PR (which adds, for now, dead code).

@hvitved hvitved merged commit f183382 into github:main Jul 10, 2024
28 of 29 checks passed
@hvitved hvitved deleted the ssa/dataflow-integration branch July 10, 2024 10:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants