docs/language/learn-ql/csharp/dataflow.rst
57 additions & 55 deletions
@@ -1,11 +1,13 @@
1
1
Analyzing data flow in C#
2
2
=========================
3
3
4
-
Overview
5
-
--------
4
+
You can use CodeQL to track the flow of data through a C# program to its use.
6
5
7
-
This topic describes how data flow analysis is implemented in the CodeQL libraries for C# and includes examples to help you write your own data flow queries.
8
-
The following sections describe how to utilize the libraries for local data flow, global data flow, and taint tracking.
6
+
About this article
7
+
------------------
8
+
9
+
This article describes how data flow analysis is implemented in the CodeQL libraries for C# and includes examples to help you write your own data flow queries.
10
+
The following sections describe how to use the libraries for local data flow, global data flow, and taint tracking.
9
11
10
12
For a more general introduction to modeling data flow, see :doc:`Introduction to data flow analysis with CodeQL <../intro-to-data-flow>`.
11
13
@@ -17,7 +19,7 @@ Local data flow is data flow within a single method or callable. Local data flow
17
19
Using local data flow
18
20
~~~~~~~~~~~~~~~~~~~~~
19
21
20
-
The local data flow library is in the module ``DataFlow``, which defines the class ``Node`` denoting any element that data can flow through. ``Node``\ s are divided into expression nodes (``ExprNode``) and parameter nodes (``ParameterNode``). It is possible to map between data flow nodes and expressions/parameters using the member predicates ``asExpr`` and ``asParameter``:
22
+
The local data flow library is in the module ``DataFlow``, which defines the class ``Node`` denoting any element that data can flow through. ``Node``\ s are divided into expression nodes (``ExprNode``) and parameter nodes (``ParameterNode``). You can map between data flow nodes and expressions/parameters using the member predicates ``asExpr`` and ``asParameter``:
21
23
22
24
.. code-block:: ql
23
25
@@ -45,9 +47,9 @@ or using the predicates ``exprNode`` and ``parameterNode``:
45
47
*/
46
48
ParameterNode parameterNode(Parameter p) { ... }
47
49
48
-
The predicate ``localFlowStep(Node nodeFrom, Node nodeTo)`` holds if there is an immediate data flow edge from the node ``nodeFrom`` to the node ``nodeTo``. The predicate can be applied recursively (using the ``+`` and ``*`` operators), or it is possible to use the predefined recursive predicate ``localFlow``.
50
+
The predicate ``localFlowStep(Node nodeFrom, Node nodeTo)`` holds if there is an immediate data flow edge from the node ``nodeFrom`` to the node ``nodeTo``. You can apply the predicate recursively, by using the ``+`` and ``*`` operators, or you can use the predefined recursive predicate ``localFlow``.
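
As a minimal sketch of the recursive form, the query below applies ``localFlowStep`` with the ``+`` operator, so at least one local step must lie between the two nodes. The choice of string literals as starting points and call arguments as end points is an illustrative assumption, not something taken from the library:

.. code-block:: ql

   import csharp

   // `+` requires one or more steps; `*` would also match when both nodes are the same.
   from StringLiteral lit, MethodCall call
   where
     DataFlow::localFlowStep+(DataFlow::exprNode(lit),
       DataFlow::exprNode(call.getAnArgument()))
   select lit, call

Replacing ``localFlowStep+`` with ``localFlowStep*`` should give the same relation as ``localFlow``, because both allow zero or more steps.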
49
51
50
-
For example, finding flow from a parameter ``source`` to an expression ``sink`` in zero or more local steps can be achieved as follows:
52
+
For example, you can find flow from a parameter ``source`` to an expression ``sink`` in zero or more local steps:
51
53
52
54
.. code-block:: ql
53
55
@@ -65,9 +67,9 @@ Local taint tracking extends local data flow by including non-value-preserving f
65
67
66
68
If ``x`` is a tainted string then ``y`` is also tainted.
67
69
68
-
The local taint tracking library is in the module ``TaintTracking``. Like local data flow, a predicate ``localTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo)`` holds if there is an immediate taint propagation edge from the node ``nodeFrom`` to the node ``nodeTo``. The predicate can be applied recursively (using the ``+`` and ``*`` operators), or it is possible to use the predefined recursive predicate ``localTaint``.
70
+
The local taint tracking library is in the module ``TaintTracking``. Like local data flow, a predicate ``localTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo)`` holds if there is an immediate taint propagation edge from the node ``nodeFrom`` to the node ``nodeTo``. You can apply the predicate recursively, by using the ``+`` and ``*`` operators, or you can use the predefined recursive predicate ``localTaint``.
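
A minimal sketch of the predefined recursive predicate, assuming that ``import csharp`` brings ``TaintTracking`` into scope in the library version described here. Matching the method by the name ``Format`` alone is a deliberately loose, illustrative choice:

.. code-block:: ql

   import csharp

   // Finds parameters whose value may taint the first argument of a
   // call to any method named "Format" within the same callable.
   from Parameter p, MethodCall call
   where
     call.getTarget().hasName("Format") and
     TaintTracking::localTaint(DataFlow::parameterNode(p),
       DataFlow::exprNode(call.getArgument(0)))
   select p, call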
69
71
70
-
For example, finding taint propagation from a parameter ``source`` to an expression ``sink`` in zero or more local steps can be achieved as follows:
72
+
For example, you can find taint propagation from a parameter ``source`` to an expression ``sink`` in zero or more local steps:
71
73
72
74
.. code-block:: ql
73
75
@@ -76,7 +78,7 @@ For example, finding taint propagation from a parameter ``source`` to an express
76
78
Examples
77
79
~~~~~~~~
78
80
79
-
The following query finds the filename passed to ``System.IO.File.Open``:
81
+
This query finds the filename passed to ``System.IO.File.Open``:
80
82
81
83
.. code-block:: ql
82
84
@@ -99,7 +101,7 @@ Unfortunately this will only give the expression in the argument, not the values
99
101
and DataFlow::localFlow(DataFlow::exprNode(src), DataFlow::exprNode(call.getArgument(0)))
100
102
select src
101
103
102
-
Then we can make the source more specific, for example an access to a public parameter. The following query finds where a public parameter is used to open a file:
104
+
Then we can make the source more specific, for example an access to a public parameter. This query finds instances where a public parameter is used to open a file:
103
105
104
106
.. code-block:: ql
105
107
@@ -112,7 +114,7 @@ Then we can make the source more specific, for example an access to a public par
112
114
and call.getEnclosingCallable().(Member).isPublic()
113
115
select p, "Opening a file from a public method."
114
116
115
-
The following example finds calls to ``String.Format`` where the format string isn't hard-coded:
117
+
This query finds calls to ``String.Format`` where the format string isn't hard-coded:
116
118
117
119
.. code-block:: ql
118
120
@@ -139,7 +141,7 @@ Global data flow tracks data flow throughout the entire program, and is therefor
139
141
Using global data flow
140
142
~~~~~~~~~~~~~~~~~~~~~~
141
143
142
-
The global data flow library is used by extending the class ``DataFlow::Configuration`` as follows:
144
+
The global data flow library is used by extending the class ``DataFlow::Configuration``:
143
145
144
146
.. code-block:: ql
145
147
@@ -157,12 +159,12 @@ The global data flow library is used by extending the class ``DataFlow::Configur
157
159
}
158
160
}
159
161
160
-
The following predicates are defined in the configuration:
162
+
These predicates are defined in the configuration:
161
163
162
-
- ``isSource`` - defines where data may flow from
163
-
- ``isSink`` - defines where data may flow to
164
-
- ``isBarrier`` - optionally, restricts the data flow
The characteristic predicate (``MyDataFlowConfiguration()``) defines the name of the configuration, so ``"..."`` must be replaced with a unique name.
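
The pieces above can be put together as in the following sketch. The class name ``LiteralToOpenConfiguration`` and the choice of sources and sinks are hypothetical and only serve to illustrate the shape of a configuration; the final clause uses the ``hasFlow`` predicate to run the analysis:

.. code-block:: ql

   import csharp

   // Hypothetical configuration: the name and the source/sink choices are assumptions.
   class LiteralToOpenConfiguration extends DataFlow::Configuration {
     // The string is the unique name of this configuration.
     LiteralToOpenConfiguration() { this = "LiteralToOpenConfiguration" }

     // Sources: any string literal.
     override predicate isSource(DataFlow::Node source) {
       source.asExpr() instanceof StringLiteral
     }

     // Sinks: the first argument of a call to any method named "Open".
     override predicate isSink(DataFlow::Node sink) {
       exists(MethodCall call |
         call.getTarget().hasName("Open") and
         sink.asExpr() = call.getArgument(0)
       )
     }
   }

   from LiteralToOpenConfiguration config, DataFlow::Node source, DataFlow::Node sink
   where config.hasFlow(source, sink)
   select sink, "This argument may receive the value of $@.", source, "a string literal"

Matching on the method name alone keeps the sketch short. A real query would normally also constrain the declaring type.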
168
170
@@ -177,7 +179,7 @@ The data flow analysis is performed using the predicate ``hasFlow(DataFlow::Node
177
179
Using global taint tracking
178
180
~~~~~~~~~~~~~~~~~~~~~~~~~~~
179
181
180
-
Global taint tracking is to global data flow what local taint tracking is to local data flow. That is, global taint tracking extends global data flow with additional non-value-preserving steps. The global taint tracking library is used by extending the class ``TaintTracking::Configuration`` as follows:
182
+
Global taint tracking is to global data flow what local taint tracking is to local data flow. That is, global taint tracking extends global data flow with additional non-value-preserving steps. The global taint tracking library is used by extending the class ``TaintTracking::Configuration``:
181
183
182
184
.. code-block:: ql
183
185
@@ -195,12 +197,12 @@ Global taint tracking is to global data flow what local taint tracking is to loc
195
197
}
196
198
}
197
199
198
-
The following predicates are defined in the configuration:
200
+
These predicates are defined in the configuration:
199
201
200
-
- ``isSource`` - defines where taint may flow from
201
-
- ``isSink`` - defines where taint may flow to
202
-
- ``isSanitizer`` - optionally, restricts the taint flow
Similar to global data flow, the characteristic predicate (``MyTaintTrackingConfiguration()``) defines the unique name of the configuration and the taint analysis is performed using the predicate ``hasFlow(DataFlow::Node source, DataFlow::Node sink)``.
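
A taint tracking configuration has the same shape. The sketch below is an assumption-laden illustration rather than a library example: the class name is hypothetical, sources are parameters of public callables, sinks are the first argument of any method named ``Format``, and values of integral type are treated as sanitized:

.. code-block:: ql

   import csharp

   // Hypothetical configuration: the name and all three predicates are illustrative.
   class PublicParameterToFormatConfiguration extends TaintTracking::Configuration {
     PublicParameterToFormatConfiguration() { this = "PublicParameterToFormatConfiguration" }

     // Sources: parameters of public callables.
     override predicate isSource(DataFlow::Node source) {
       source.asParameter().getCallable().(Member).isPublic()
     }

     // Sinks: the first argument of a call to any method named "Format".
     override predicate isSink(DataFlow::Node sink) {
       exists(MethodCall call |
         call.getTarget().hasName("Format") and
         sink.asExpr() = call.getArgument(0)
       )
     }

     // Sanitizers: taint stops at expressions of integral type.
     override predicate isSanitizer(DataFlow::Node node) {
       node.asExpr().getType() instanceof IntegralType
     }
   }

   from PublicParameterToFormatConfiguration config, DataFlow::Node source, DataFlow::Node sink
   where config.hasFlow(source, sink)
   select sink, "Format string depends on $@.", source, "a parameter of a public callable"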
206
208
@@ -214,7 +216,7 @@ The class ``RemoteSourceFlow`` (defined in module ``semmle.code.csharp.dataflow.
214
216
Example
215
217
~~~~~~~
216
218
217
-
The following example shows a data flow configuration that uses all public API parameters as data sources.
219
+
This query shows a data flow configuration that uses all public API parameters as data sources:
218
220
219
221
.. code-block:: ql
220
222
@@ -236,30 +238,30 @@ The following example shows a data flow configuration that uses all public API p
236
238
Class hierarchy
237
239
~~~~~~~~~~~~~~~
238
240
239
-
- ``DataFlow::Configuration`` - base class for custom global data flow analysis
240
-
- ``DataFlow::Node`` - an element behaving as a data flow node
241
+
- ``DataFlow::Configuration`` - base class for custom global data flow analysis.
242
+
- ``DataFlow::Node`` - an element behaving as a data flow node.
241
243
242
-
- ``DataFlow::ExprNode`` - an expression behaving as a data flow node
243
-
- ``DataFlow::ParameterNode`` - a parameter data flow node representing the value of a parameter at function entry
244
+
- ``DataFlow::ExprNode`` - an expression behaving as a data flow node.
245
+
- ``DataFlow::ParameterNode`` - a parameter data flow node representing the value of a parameter at function entry.
244
246
245
-
- ``PublicCallableParameter`` - a parameter to a public method/callable in a public class
247
+
- ``PublicCallableParameter`` - a parameter to a public method/callable in a public class.
246
248
247
-
- ``RemoteSourceFlow`` - data flow from network/remote input
249
+
- ``RemoteSourceFlow`` - data flow from network/remote input.
248
250
249
-
- ``AspNetRemoteFlowSource`` - data flow from remote ASP.NET user input
251
+
- ``AspNetRemoteFlowSource`` - data flow from remote ASP.NET user input.
250
252
251
-
- ``AspNetQueryStringRemoteFlowSource`` - data flow from ``System.Web.HttpRequest``
252
-
- ``AspNetUserInputRemoteFlowSource`` - data flow from ``System.Web.UI.WebControls.TextBox``
253
+
- ``AspNetQueryStringRemoteFlowSource`` - data flow from ``System.Web.HttpRequest``.
254
+
- ``AspNetUserInputRemoteFlowSource`` - data flow from ``System.Web.UI.WebControls.TextBox``.
253
255
254
-
- ``WcfRemoteFlowSource`` - data flow from a WCF web service
255
-
- ``AspNetServiceRemoteFlowSource`` - data flow from an ASP.NET web service
256
+
- ``WcfRemoteFlowSource`` - data flow from a WCF web service.
257
+
- ``AspNetServiceRemoteFlowSource`` - data flow from an ASP.NET web service.
256
258
257
-
- ``TaintTracking::Configuration`` - base class for custom global taint tracking analysis
259
+
- ``TaintTracking::Configuration`` - base class for custom global taint tracking analysis.
258
260
259
261
Examples
260
262
~~~~~~~~
261
263
262
-
The following data flow configuration tracks data flow from environment variables to opening files:
264
+
This data flow configuration tracks data flow from environment variables to opening files:
263
265
264
266
.. code-block:: ql
265
267
@@ -300,42 +302,42 @@ Exercise 4: Using the answers from 2 and 3, write a query to find all global dat
300
302
Extending library data flow
301
303
---------------------------
302
304
303
-
*Library* data flow defines how data flows through libraries where the source code is not available, such as the .NET Framework, third-party libraries or proprietary libraries.
305
+
Library data flow defines how data flows through libraries where the source code is not available, such as the .NET Framework, third-party libraries or proprietary libraries.
304
306
305
307
To define new library data flow, extend the class ``LibraryTypeDataFlow`` from the module ``semmle.code.csharp.dataflow.LibraryTypeDataFlow``. Override the predicate ``callableFlow`` to define how data flows through the methods in the class. ``callableFlow`` has the signature
- ``callable`` - the ``Callable`` (such as a method, constructor, property getter or setter) performing the data flow
312
-
- ``source`` - the data flow input
313
-
- ``sink`` - the data flow output
313
+
- ``callable`` - the ``Callable`` (such as a method, constructor, property getter or setter) performing the data flow.
314
+
- ``source`` - the data flow input.
315
+
- ``sink`` - the data flow output.
314
316
- ``preservesValue`` - whether the flow step preserves the value. For example, if ``x`` is a string, then ``x.ToString()`` preserves the value whereas ``x.ToLower()`` does not.
315
317
316
318
Class hierarchy
317
319
~~~~~~~~~~~~~~~
318
320
319
321
- ``Callable`` - a callable (methods, accessors, constructors etc.)
320
322
321
-
- ``SourceDeclarationCallable`` - an unconstructed callable
323
+
- ``SourceDeclarationCallable`` - an unconstructed callable.
322
324
323
-
- ``CallableFlowSource`` - the input of data flow into the callable
325
+
- ``CallableFlowSource`` - the input of data flow into the callable.
324
326
325
-
- ``CallableFlowSourceQualifier`` - the data flow comes from the object itself
326
-
- ``CallableFlowSourceArg`` - the data flow comes from an argument to the call
327
+
- ``CallableFlowSourceQualifier`` - the data flow comes from the object itself.
328
+
- ``CallableFlowSourceArg`` - the data flow comes from an argument to the call.
327
329
328
-
- ``CallableFlowSink`` - the output of data flow from the callable
330
+
- ``CallableFlowSink`` - the output of data flow from the callable.
329
331
330
-
- ``CallableFlowSinkQualifier`` - the output is to the object itself
331
-
- ``CallableFlowSinkReturn`` - the output is returned from the call
332
-
- ``CallableFlowSinkArg`` - the output is an argument
333
-
- ``CallableFlowSinkDelegateArg`` - the output flows through a delegate argument (for example, LINQ)
332
+
- ``CallableFlowSinkQualifier`` - the output is to the object itself.
333
+
- ``CallableFlowSinkReturn`` - the output is returned from the call.
334
+
- ``CallableFlowSinkArg`` - the output is an argument.
335
+
- ``CallableFlowSinkDelegateArg`` - the output flows through a delegate argument (for example, LINQ).
334
336
335
337
Example
336
338
~~~~~~~
337
339
338
-
The following example is adapted from ``LibraryTypeDataFlow.qll``. It declares data flow through the class ``System.Uri``, including the constructor, the ``ToString`` method, and the properties ``Query``, ``OriginalString``, and ``PathAndQuery``.
340
+
This example is adapted from ``LibraryTypeDataFlow.qll``. It declares data flow through the class ``System.Uri``, including the constructor, the ``ToString`` method, and the properties ``Query``, ``OriginalString``, and ``PathAndQuery``.
339
341
340
342
.. code-block:: ql
341
343
@@ -489,7 +491,7 @@ Exercise 4
489
491
Exercise 5
490
492
~~~~~~~~~~
491
493
492
-
All properties can flow data. We can declare this as follows:
494
+
All properties can flow data:
493
495
494
496
.. code-block:: ql
495
497
@@ -545,8 +547,8 @@ This can be adapted from the ``SystemUriFlow`` class:
545
547
}
546
548
}
547
549
548
-
What next?
549
-
----------
550
+
Further reading
551
+
---------------
550
552
551
553
- Learn about the standard libraries used to write queries for C# in :doc:`Introducing the C# libraries <introduce-libraries-csharp>`.
552
554
- Find out more about QL in the `QL language handbook <https://help.semmle.com/QL/ql-handbook/index.html>`__ and `QL language specification <https://help.semmle.com/QL/ql-spec/language.html>`__.