Commit 56f2a1c
docs: add mixed-mode explanations (#690)
Parent: b56d062

4 files changed: 68 additions & 18 deletions

DifferentiationInterface/docs/Project.toml
1 addition & 0 deletions

@@ -9,6 +9,7 @@ FiniteDiff = "6a86dc24-6348-571c-b903-95158fe2bd41"
 ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
 Markdown = "d6f4376e-aef5-505a-96c1-9c027394607a"
 PrettyTables = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
+Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 SparseConnectivityTracer = "9f842d2f-2579-4b1d-911e-f412cf18a3f5"
 SparseMatrixColorings = "0a514795-09f3-496d-8182-132a7b665d35"
 Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"

DifferentiationInterface/docs/src/api.md
2 additions & 2 deletions

@@ -68,7 +68,6 @@ jacobian
 jacobian!
 value_and_jacobian
 value_and_jacobian!
-MixedMode
 ```
 
 ## Second order
@@ -125,9 +124,10 @@ DifferentiationInterface.inner
 DifferentiateWith
 ```
 
-### Sparsity detection
+### Sparsity tools
 
 ```@docs
+MixedMode
 DenseSparsityDetector
 ```

DifferentiationInterface/docs/src/explanation/advanced.md
17 additions & 1 deletion

@@ -71,10 +71,26 @@ But after preparation, the more zeros are present in the matrix, the greater the
 ### Tuning the coloring algorithm
 
 The complexity of sparse Jacobians or Hessians grows with the number of distinct colors in a coloring of the sparsity pattern.
-To reduce this number of colors, [`GreedyColoringAlgorithm`](@ref) has two main settings: the order used for vertices and the decompression method.
+To reduce this number of colors, [`GreedyColoringAlgorithm`](@extref SparseMatrixColorings.GreedyColoringAlgorithm) has two main settings: the order used for vertices and the decompression method.
 Depending on your use case, you may want to modify either of these options to increase performance.
 See the documentation of [SparseMatrixColorings.jl](https://github.com/gdalle/SparseMatrixColorings.jl) for details.
 
+### Mixed mode
+
+When a Jacobian matrix has both dense rows and dense columns, it can be more efficient to use "mixed-mode" differentiation, a mixture of forward and reverse.
+The associated bidirectional coloring algorithm automatically decides how to cover the Jacobian using a set of columns (computed in forward mode) plus a set of rows (computed in reverse mode).
+This behavior is triggered as soon as you put a [`MixedMode`](@ref) object inside `AutoSparse`, like so:
+
+```julia
+AutoSparse(
+    MixedMode(forward_backend, reverse_backend);
+    sparsity_detector,
+    coloring_algorithm
+)
+```
+
+At the moment, mixed mode tends to work best when the [`GreedyColoringAlgorithm`](@extref SparseMatrixColorings.GreedyColoringAlgorithm) is provided with a [`RandomOrder`](@extref SparseMatrixColorings.RandomOrder) instead of the usual [`NaturalOrder`](@extref SparseMatrixColorings.NaturalOrder).
+
 ## Batch mode
 
 ### Multiple tangents
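The bidirectional covering added in this explanation file can be sketched in a few lines. The snippet below is a hand-rolled Python illustration (not DifferentiationInterface's implementation, and the helper names are made up): on an arrowhead pattern, plain greedy column coloring needs one color per column because the dense first row makes every pair of columns conflict, while one reverse (row) sweep plus two forward (column) sweeps recover every entry.

```python
def arrowhead_pattern(n):
    """Set of (row, col) nonzeros: dense first row, dense first column,
    and a full diagonal (the shape discussed in the commit's docs)."""
    return ({(0, j) for j in range(n)}
            | {(i, 0) for i in range(n)}
            | {(i, i) for i in range(n)})

def column_coloring_size(S, n):
    """Greedy distance-1 coloring of columns: two columns conflict
    if they share a nonzero row."""
    rows = [{i for (i, j) in S if j == c} for c in range(n)]
    colors = [-1] * n
    for j in range(n):
        taken = {colors[k] for k in range(j) if rows[j] & rows[k]}
        c = 0
        while c in taken:
            c += 1
        colors[j] = c
    return max(colors) + 1

n = 8
S = arrowhead_pattern(n)

# Unidirectional: the dense first row makes every pair of columns conflict,
# so greedy column coloring needs one color per column.
assert column_coloring_size(S, n) == n

# Bidirectional: one reverse (row) sweep recovers the dense first row; the
# remaining entries of columns 1..n-1 sit on distinct rows, so they fit in
# a single forward (column) sweep, plus one more sweep for column 0.
rest = {(i, j) for (i, j) in S if i != 0}   # entries left after the row sweep
col_groups = [[0], list(range(1, n))]
for group in col_groups:
    remaining_rows = [i for (i, j) in rest if j in group]
    # no two columns in a group may collide on a remaining row,
    # otherwise their compressed sum could not be decompressed
    assert len(remaining_rows) == len(set(remaining_rows))

total_sweeps = 1 + len(col_groups)
print(total_sweeps, "products instead of", n)  # -> 3 products instead of 8
```

The same row-plus-column covering is what the bidirectional coloring in SparseMatrixColorings.jl computes automatically when `MixedMode` is used.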

DifferentiationInterface/docs/src/tutorials/advanced.md
48 additions & 15 deletions

@@ -3,10 +3,12 @@
 We present contexts and sparsity handling with DifferentiationInterface.jl.
 
 ```@example tuto_advanced
+using ADTypes
 using BenchmarkTools
 using DifferentiationInterface
 import ForwardDiff, Zygote
-using SparseConnectivityTracer: TracerSparsityDetector
+using Random
+using SparseConnectivityTracer
 using SparseMatrixColorings
 ```
 
@@ -71,8 +73,8 @@ x = float.(1:8);
 ```
 
 ```@example tuto_advanced
-dense_first_order_backend = AutoForwardDiff()
-J_dense = jacobian(f_sparse_vector, dense_first_order_backend, x)
+dense_forward_backend = AutoForwardDiff()
+J_dense = jacobian(f_sparse_vector, dense_forward_backend, x)
 ```
 
 ```@example tuto_advanced
@@ -89,14 +91,14 @@ Recipe to create a sparse backend: combine a dense backend, a sparsity detector
 The following are reasonable defaults:
 
 ```@example tuto_advanced
-sparse_first_order_backend = AutoSparse(
-    dense_first_order_backend;
+sparse_forward_backend = AutoSparse(
+    dense_forward_backend; # any object from ADTypes
     sparsity_detector=TracerSparsityDetector(),
     coloring_algorithm=GreedyColoringAlgorithm(),
 )
 
 sparse_second_order_backend = AutoSparse(
-    dense_second_order_backend;
+    dense_second_order_backend; # any object from ADTypes or a SecondOrder from DI
     sparsity_detector=TracerSparsityDetector(),
     coloring_algorithm=GreedyColoringAlgorithm(),
 )
@@ -106,7 +108,7 @@ nothing # hide
 Now the resulting matrices are sparse:
 
 ```@example tuto_advanced
-jacobian(f_sparse_vector, sparse_first_order_backend, x)
+jacobian(f_sparse_vector, sparse_forward_backend, x)
 ```
 
 ```@example tuto_advanced
@@ -123,7 +125,7 @@ Some result analysis functions from [SparseMatrixColorings.jl](https://github.co
 First, it records the sparsity pattern itself (the one returned by the detector).
 
 ```@example tuto_advanced
-jac_prep = prepare_jacobian(f_sparse_vector, sparse_first_order_backend, x)
+jac_prep = prepare_jacobian(f_sparse_vector, sparse_forward_backend, x)
 sparsity_pattern(jac_prep)
 ```
 
@@ -149,20 +151,20 @@ nothing # hide
 ```
 
 ```@example tuto_advanced
-jac_prep_dense = prepare_jacobian(f_sparse_vector, dense_first_order_backend, zero(xbig))
-@benchmark jacobian($f_sparse_vector, $jac_prep_dense, $dense_first_order_backend, $xbig)
+jac_prep_dense = prepare_jacobian(f_sparse_vector, dense_forward_backend, zero(xbig))
+@benchmark jacobian($f_sparse_vector, $jac_prep_dense, $dense_forward_backend, $xbig)
 ```
 
 ```@example tuto_advanced
-jac_prep_sparse = prepare_jacobian(f_sparse_vector, sparse_first_order_backend, zero(xbig))
-@benchmark jacobian($f_sparse_vector, $jac_prep_sparse, $sparse_first_order_backend, $xbig)
+jac_prep_sparse = prepare_jacobian(f_sparse_vector, sparse_forward_backend, zero(xbig))
+@benchmark jacobian($f_sparse_vector, $jac_prep_sparse, $sparse_forward_backend, $xbig)
 ```
 
 Better memory use can be achieved by pre-allocating the matrix from the preparation result (so that it has the correct structure).
 
 ```@example tuto_advanced
 jac_buffer = similar(sparsity_pattern(jac_prep_sparse), eltype(xbig))
-@benchmark jacobian!($f_sparse_vector, $jac_buffer, $jac_prep_sparse, $sparse_first_order_backend, $xbig)
+@benchmark jacobian!($f_sparse_vector, $jac_buffer, $jac_prep_sparse, $sparse_forward_backend, $xbig)
 ```
 
 And for optimal speed, one should write non-allocating and type-stable functions.
@@ -184,7 +186,38 @@ ybig ≈ f_sparse_vector(xbig)
 In this case, the sparse Jacobian should also become non-allocating (for our specific choice of backend).
 
 ```@example tuto_advanced
-jac_prep_sparse_nonallocating = prepare_jacobian(f_sparse_vector!, zero(ybig), sparse_first_order_backend, zero(xbig))
+jac_prep_sparse_nonallocating = prepare_jacobian(f_sparse_vector!, zero(ybig), sparse_forward_backend, zero(xbig))
 jac_buffer = similar(sparsity_pattern(jac_prep_sparse_nonallocating), eltype(xbig))
-@benchmark jacobian!($f_sparse_vector!, $ybig, $jac_buffer, $jac_prep_sparse_nonallocating, $sparse_first_order_backend, $xbig)
+@benchmark jacobian!($f_sparse_vector!, $ybig, $jac_buffer, $jac_prep_sparse_nonallocating, $sparse_forward_backend, $xbig)
+```
+
+### Mixed mode
+
+Some Jacobians have a structure which includes dense rows and dense columns, like this one:
+
+```@example tuto_advanced
+arrowhead(x) = x .+ x[1] .+ vcat(sum(x), zeros(eltype(x), length(x)-1))
+
+jacobian_sparsity(arrowhead, x, TracerSparsityDetector())
+```
+
+In such cases, sparse AD is only beneficial in "mixed mode", where we combine a forward and a reverse backend.
+This is achieved using the [`MixedMode`](@ref) wrapper, for which we recommend a random coloring order (see [`RandomOrder`](@extref SparseMatrixColorings.RandomOrder)):
+
+```@example tuto_advanced
+sparse_mixed_backend = AutoSparse(
+    MixedMode(AutoForwardDiff(), AutoZygote()),
+    sparsity_detector=TracerSparsityDetector(),
+    coloring_algorithm=GreedyColoringAlgorithm(RandomOrder(MersenneTwister(), 0)),
+)
+```
+
+It unlocks a large speedup compared to pure forward mode, and the same would be true compared to reverse mode:
+
+```@example tuto_advanced
+@benchmark jacobian($arrowhead, prep, $sparse_forward_backend, $xbig) setup=(prep=prepare_jacobian(arrowhead, sparse_forward_backend, xbig))
+```
+
+```@example tuto_advanced
+@benchmark jacobian($arrowhead, prep, $sparse_mixed_backend, $xbig) setup=(prep=prepare_jacobian(arrowhead, sparse_mixed_backend, xbig))
 ```
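As a language-agnostic sanity check of the arrowhead structure benchmarked in this tutorial, the sketch below ports `arrowhead` to plain Python and recovers its sparsity pattern with central finite differences. The tutorial itself uses operator-overloading detection via `TracerSparsityDetector`; the helper `fd_jacobian_pattern` here is made up for illustration only.

```python
def arrowhead(x):
    # plain-Python port of the tutorial's Julia function:
    # arrowhead(x) = x .+ x[1] .+ vcat(sum(x), zeros(eltype(x), length(x)-1))
    s = sum(x)
    return [xi + x[0] + (s if i == 0 else 0.0) for i, xi in enumerate(x)]

def fd_jacobian_pattern(f, x, h=1e-6, tol=1e-4):
    """Dense central-difference Jacobian, thresholded into a boolean pattern.
    A crude stand-in for sparsity detection, fine for this smooth example."""
    n = len(x)
    pattern = []
    for i in range(n):
        row = []
        for j in range(n):
            xp = list(x); xp[j] += h
            xm = list(x); xm[j] -= h
            d = (f(xp)[i] - f(xm)[i]) / (2 * h)
            row.append(abs(d) > tol)
        pattern.append(row)
    return pattern

x = [float(v) for v in range(1, 9)]
P = fd_jacobian_pattern(arrowhead, x)

# Expected: dense first row (from the sum term), dense first column
# (from the x[1] term), and a full diagonal (from the identity term).
n = len(x)
expected = [[i == j or i == 0 or j == 0 for j in range(n)] for i in range(n)]
assert P == expected
```

This is exactly the pattern for which neither pure column compression (blocked by the dense row) nor pure row compression (blocked by the dense column) helps, which is why the mixed-mode benchmark above wins.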
