Commit f4e8553

docs: remove Zygote-remaining notes, file as issues instead
Remove all `!!! note` blocks explaining why specific tutorials still use Zygote. The information is now tracked as GitHub issues:

- #1424: EnsembleProblem tutorials (optimization_sde, SDE_control, data_parallel)
- #1425: SimpleChains + StaticArrays tutorial
- #1426: Nested ComponentArray partial-cover SubArray (feedback_control)
- #1427: Second-order adjoints forward-over-reverse Hessian

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
1 parent f8deaaf commit f4e8553

6 files changed

Lines changed: 0 additions & 98 deletions

docs/src/examples/neural_ode/simplechains.md

Lines changed: 0 additions & 30 deletions
@@ -1,35 +1,5 @@
 # Faster Neural Ordinary Differential Equations with SimpleChains

-!!! note
-
-    This example still uses Zygote because **SimpleChains + `StaticArrays`
-    have no working Mooncake path right now**, regardless of which
-    sensealg is selected:
-
-    - **Default sensealg** picks `GaussAdjoint`, which hits an
-      `@assert sensealg isa QuadratureAdjoint` in
-      `SciMLSensitivity.adjoint_common.jl` because `u::SVector` is
-      immutable and only `QuadratureAdjoint` is wired up for the
-      immutable-state path.
-    - `QuadratureAdjoint(autojacvec = ZygoteVJP())` (the explicit choice
-      below) emits a `ChainRulesCore.Tangent` cotangent that
-      `SciMLSensitivity`'s `df_iip`/`df_oop` adjoint backpass can't
-      unwrap (`BoundsError` accessing the inner `Tangent` fields) when
-      the AD originator is Mooncake — Zygote produces a different
-      cotangent shape that flows through cleanly.
-    - `QuadratureAdjoint(autojacvec = MooncakeVJP())` and
-      `GaussAdjoint(autojacvec = MooncakeVJP())` both fail with
-      `setindex!(::SVector, ...)` because `MooncakeVJP` mutates the
-      cotangent buffer in place, which is a no-op API for `StaticArrays`.
-    - `InterpolatingAdjoint(autojacvec = ReverseDiffVJP(true))` fails
-      with `conversion to pointer not defined for ReverseDiff.TrackedArray`
-      `SimpleChains` reaches into raw pointer storage which is
-      incompatible with `ReverseDiff`-tracked types.
-
-    Once one of these layers grows the missing dispatch (a `df_iip`/`df_oop`
-    Tangent unwrap, or `MooncakeVJP` support for immutable `setindex!`),
-    the recommended frontend will switch to
-    `OPT.AutoMooncake(; config = Mooncake.Config(; friendly_tangents = true))`.

 [SimpleChains](https://github.com/PumasAI/SimpleChains.jl) has demonstrated performance boosts of ~5x and ~30x when compared to other mainstream deep learning frameworks like Pytorch for the training and evaluation in the specific case of small neural networks. For the nitty-gritty details, as well as, some SciML related videos around the need and applications of such a library, we can refer to this [blogpost](https://julialang.org/blog/2022/04/simple-chains/). As for doing Scientific Machine Learning, how do we even begin with training neural ODEs with any generic deep learning library?

docs/src/examples/ode/second_order_adjoints.md

Lines changed: 0 additions & 20 deletions
@@ -13,20 +13,6 @@ optimization, while `KrylovTrustRegion` will utilize a Krylov-based method
 with Hessian-vector products (never forming the Hessian) for large parameter
 optimizations.

-!!! note
-
-    The Adam (first-order) phase below uses Mooncake. The
-    `NewtonTrustRegion` (second-order) phase still uses Zygote because
-    Mooncake currently has no working forward-over-reverse path through
-    `SciMLSensitivity` + `OrdinaryDiffEq`: `SecondOrder(AutoMooncake(),
-    AutoMooncake())` raises a "reverse-over-reverse not supported" error
-    and `SecondOrder(AutoForwardDiff(), AutoMooncake())` is blocked on
-    Mooncake's `IEEEFloat`-only gradient interface (it rejects
-    `ForwardDiff.Dual` as the primal type). Tracking issue:
-    [chalk-lab/Mooncake.jl#1142](https://github.com/chalk-lab/Mooncake.jl/pull/1142)
-    is the first step in unblocking this. Once forward-over-Mooncake is
-    available end-to-end, this tutorial can be switched to Mooncake for
-    both phases.

 ```@example secondorderadjoints
 import SciMLSensitivity as SMS
@@ -99,17 +85,11 @@ callback = function (state, l; doplot = false)
     return l < 0.01
 end

-# First-order training: Mooncake gives the gradient through the
-# `SciMLSensitivity` adjoint chain.
 adtype1 = OPT.AutoMooncake(; config = Mooncake.Config(; friendly_tangents = true))
 optf1 = OPT.OptimizationFunction((x, p) -> loss_neuralode(x), adtype1)
 optprob1 = OPT.OptimizationProblem(optf1, ps)
 pstart = OPT.solve(optprob1, OPO.Adam(0.01); callback, maxiters = 100).u

-# Second-order training: NewtonTrustRegion needs a true Hessian, which
-# `OptimizationBase` assembles via `SecondOrder(AutoForwardDiff(),
-# AutoZygote())`. Mooncake cannot fill that role yet (see the note above),
-# so the Hessian phase keeps the Zygote VJP.
 adtype2 = OPT.AutoZygote()
 optf2 = OPT.OptimizationFunction((x, p) -> loss_neuralode(x), adtype2)
 optprob2 = OPT.OptimizationProblem(optf2, pstart)

docs/src/examples/optimal_control/feedback_control.md

Lines changed: 0 additions & 23 deletions
@@ -4,29 +4,6 @@ You can also mix a known differential equation and a neural differential
 equation, so that the parameters and the neural network are estimated
 simultaneously!

-!!! note
-
-    This example still uses Zygote because the optimization variable here
-    nests an `u0` `ComponentVector` inside a larger `ComponentVector`
-    (`θ = CA.ComponentArray(; u0, p_all)`), and the SubArray-backed sub-CV
-    `θ.p_all` does **not** fully cover its parent (it spans positions
-    2:340 of the length-340 parent, skipping position 1 = `u0`). As of
-    ComponentArrays v0.15.35 (SciML/ComponentArrays.jl#352) the Mooncake
-    extension explicitly errors on this case rather than silently
-    corrupting gradients:
-
-    ```
-    ArgumentError: ComponentArraysMooncakeExt: cannot aggregate a cotangent
-    of length 339 into a SubArray-backed ComponentVector tangent whose
-    parent has length 340. This happens when a cotangent flows into a view
-    that does not fully cover its parent; there is no way to recover the
-    view indices from Mooncake fdata alone.
-    ```
-
-    Fixing it requires teaching either the SciMLSensitivity adjoint rrule
-    to emit cotangents in the parent layout, or the ComponentArrays
-    Mooncake extension to track view indices through the SubArray. Until
-    one of those lands, this tutorial keeps `OPT.AutoZygote()`.

 We will assume that we know the dynamics of the second equation
 (linear dynamics), and our goal is to find a neural network that is dependent

docs/src/examples/sde/SDE_control.md

Lines changed: 0 additions & 9 deletions
@@ -18,15 +18,6 @@ to ultimately prepare and stabilize the qubit in the excited state.
 Before getting to the explanation, here's some code to start with. We will
 follow a full explanation of the definition and training process:

-!!! note
-
-    This tutorial still uses Zygote because Mooncake's rule compiler
-    currently fails on `EnsembleProblem.__solve`
-    (`Mooncake.MooncakeRuleCompilationError` / stack overflow).
-    Once Mooncake supports `EnsembleProblem`, switch the AD frontend
-    to `OPT.AutoMooncake(; config = Mooncake.Config(; friendly_tangents = true))` and replace
-    `Zygote.@nograd CreateGrid` with
-    `Mooncake.@zero_adjoint Mooncake.DefaultCtx Tuple{typeof(CreateGrid), Any, Any}`.

 ```@example
 # load packages

docs/src/examples/sde/optimization_sde.md

Lines changed: 0 additions & 8 deletions
@@ -96,14 +96,6 @@ end

 We can then use `Optimization.solve` to fit the SDE.

-!!! note
-
-    This example still uses Zygote because Mooncake's rule compiler
-    currently fails on `EnsembleProblem.__solve` (used inside `loss` to
-    estimate the SDE expectation), raising a
-    `Mooncake.MooncakeRuleCompilationError`. Once Mooncake supports
-    `EnsembleProblem`, switch the AD frontend to
-    `OPT.AutoMooncake(; config = Mooncake.Config(; friendly_tangents = true))`.

 ```@example sde
 import Optimization as OPT, Zygote, OptimizationOptimisers as OPO

docs/src/tutorials/data_parallel.md

Lines changed: 0 additions & 8 deletions
@@ -89,14 +89,6 @@ interface.
 The following is a full copy-paste example for the multithreading.
 Distributed and GPU minibatching are described below.

-!!! note
-
-    This tutorial still uses `AutoZygote` because Mooncake's rule
-    compiler currently fails on `EnsembleProblem`'s `__solve`
-    (`Mooncake.MooncakeRuleCompilationError` / stack overflow).
-    Once Mooncake supports `EnsembleProblem`, the recommended
-    AD frontend will switch to `OPT.AutoMooncake(; config = Mooncake.Config(; friendly_tangents = true))`
-    to match the rest of the tutorials.

 ```@example dataparallel
 import OrdinaryDiffEq as ODE
