Help write a CodeQL query for the Gradio framework #870
Hi @NgocKhanhC311 👋🏻 You say that you're getting results when you evaluate If I look at the code you linked to, it seems that the call to
Hey @mbg! The question was also asked on our public Slack and we got the solution there: sources and sinks were found correctly, but it turned out we needed an additional taint step 👍

Pasting my answer from Slack so others can make use of it:

In short, it's a corner case. You can make the flow work and report an alert if you add a taint step to the taint tracking configuration:

predicate isAdditionalFlowStep(DataFlow::Node nodefrom, DataFlow::Node nodeto) {
  exists(DataFlow::AttrRead attr |
    // nodefrom.getLocation().getFile().getBaseName() = "toolbox.py" and
    attr.accesses(nodefrom, "orig_name") and
    nodeto = attr
  )
}

The above specifically works for cases when

predicate isAdditionalFlowStep(DataFlow::Node nodefrom, DataFlow::Node nodeto) {
  exists(DataFlow::AttrRead attr |
    // nodefrom.getLocation().getFile().getBaseName() = "toolbox.py" and
    attr.accesses(nodefrom, _) and
    nodeto = attr
  )
}

To explain: there are cases that can stop data flow from propagating. In this case, CodeQL flowed to

/**
 * @id codeql-t
 * @problem.severity error
 * @kind path-problem
 */
import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.ApiGraphs
import semmle.python.dataflow.new.RemoteFlowSources
import MyFlow::PathGraph
private module MyConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    source =
      API::moduleImport("gradio")
          .getMember([
              "Button", "Textbox", "UploadButton", "Slider", "JSON", "HTML", "Markdown", "Files",
              "AnnotatedImage", "Audio", "BarPlot", "Chatbot", "Checkbox", "CheckboxGroup",
              "ClearButton", "Code", "ColorPicker", "Dataframe", "Dataset", "DownloadButton",
              "Dropdown", "DuplicateButton", "FileExplorer", "Gallery", "HighlightedText",
              "Image", "ImageEditor", "Label", "LinePlot", "LoginButton", "LogoutButton",
              "Model3D", "Number", "ParamViewer", "Plot", "Radio", "ScatterPlot", "SimpleImage",
              "State", "Video"
            ])
          .getReturn()
          .getMember([
              "change", "input", "click", "submit", "edit", "clear", "play", "pause", "stop",
              "end", "start_recording", "pause_recording", "stop_recording", "focus", "blur",
              "upload", "release", "select", "stream", "like", "load", "key_up",
            ])
          .getACall()
          .getParameter(0, "fn")
          .getParameter(_)
          .asSource()
  }

  predicate isSink(DataFlow::Node sink) {
    sink = API::moduleImport("os").getMember("path").getMember("basename").getACall().getArg(0)
  }

  predicate isAdditionalFlowStep(DataFlow::Node nodefrom, DataFlow::Node nodeto) {
    exists(DataFlow::AttrRead attr |
      // nodefrom.getLocation().getFile().getBaseName() = "toolbox.py" and
      attr.accesses(nodefrom, "orig_name") and
      nodeto = attr
    )
  }
}

module MyFlow = TaintTracking::Global<MyConfig>;

from MyFlow::PathNode source, MyFlow::PathNode sink
where MyFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "execute sink called with untrusted data"

When running this query, I get the result you are looking for. You can also try it with the wildcard by replacing
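
For readers who want to see the shape of Python code that the `orig_name` taint step is meant to cover, here is a minimal sketch. It does not use Gradio: `UploadedFile` and `handle_upload` are hypothetical names made up for illustration, and the only thing taken from the thread is the pattern of reading an `orig_name` attribute and passing it to `os.path.basename`.

```python
import os


class UploadedFile:
    """Hypothetical stand-in for an uploaded-file object (not Gradio's real class)."""

    def __init__(self, orig_name: str) -> None:
        self.orig_name = orig_name


def handle_upload(file: UploadedFile) -> str:
    # `file` plays the role of the event-handler parameter that isSource selects.
    # The read of `file.orig_name` is the attribute access that the additional
    # flow step links: nodefrom is `file`, nodeto is the read `file.orig_name`.
    name = file.orig_name
    # os.path.basename(...) is the call whose first argument isSink selects.
    return os.path.basename(name)
```

Calling `handle_upload(UploadedFile("/tmp/uploads/report.txt"))` follows the same source, attribute read, sink path that the query reports once the extra step is in place.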
Hi team,
I'm new to CodeQL. I read this article and want to write a practice query. I used os.path.basename as the sink, and the parameters of the functions passed to Gradio's interface as the source. If I evaluate isSource and isSink separately, each one returns results, but when I combine them to track the flow from source to sink, it does not work.
Please help me.