docs: remove Zygote-remaining notes, file as issues instead

ChrisRackauckas · ChrisRackauckas · commit f4e8553f2e03 · 2026-04-11T22:16:10.000-04:00
Remove all `!!! note` blocks explaining why specific tutorials still use Zygote. The information is now tracked as GitHub issues:

- #1424: EnsembleProblem tutorials (optimization_sde, SDE_control, data_parallel)
- #1425: SimpleChains + StaticArrays tutorial
- #1426: Nested ComponentArray partial-cover SubArray (feedback_control)
- #1427: Second-order adjoints forward-over-reverse Hessian

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
diff --git a/docs/src/examples/neural_ode/simplechains.md b/docs/src/examples/neural_ode/simplechains.md
@@ -1,35 +1,5 @@
 # Faster Neural Ordinary Differential Equations with SimpleChains
 
-!!! note
-    
-    This example still uses Zygote because **SimpleChains + `StaticArrays`
-    have no working Mooncake path right now**, regardless of which
-    sensealg is selected:
-    
-      - **Default sensealg** picks `GaussAdjoint`, which hits an
-        `@assert sensealg isa QuadratureAdjoint` in
-        `SciMLSensitivity.adjoint_common.jl` because `u::SVector` is
-        immutable and only `QuadratureAdjoint` is wired up for the
-        immutable-state path.
-      - `QuadratureAdjoint(autojacvec = ZygoteVJP())` (the explicit choice
-        below) emits a `ChainRulesCore.Tangent` cotangent that
-        `SciMLSensitivity`'s `df_iip`/`df_oop` adjoint backpass can't
-        unwrap (`BoundsError` accessing the inner `Tangent` fields) when
-        the AD originator is Mooncake — Zygote produces a different
-        cotangent shape that flows through cleanly.
-      - `QuadratureAdjoint(autojacvec = MooncakeVJP())` and
-        `GaussAdjoint(autojacvec = MooncakeVJP())` both fail with
-        `setindex!(::SVector, ...)` because `MooncakeVJP` mutates the
-        cotangent buffer in place, which is a no-op API for `StaticArrays`.
-      - `InterpolatingAdjoint(autojacvec = ReverseDiffVJP(true))` fails
-        with `conversion to pointer not defined for ReverseDiff.TrackedArray`
-        — `SimpleChains` reaches into raw pointer storage which is
-        incompatible with `ReverseDiff`-tracked types.
-    
-    Once one of these layers grows the missing dispatch (a `df_iip`/`df_oop`
-    Tangent unwrap, or `MooncakeVJP` support for immutable `setindex!`),
-    the recommended frontend will switch to
-    `OPT.AutoMooncake(; config = Mooncake.Config(; friendly_tangents = true))`.
 
 [SimpleChains](https://github.com/PumasAI/SimpleChains.jl) has demonstrated performance boosts of ~5x and ~30x when compared to other mainstream deep learning frameworks like Pytorch for the training and evaluation in the specific case of small neural networks. For the nitty-gritty details, as well as, some SciML related videos around the need and applications of such a library, we can refer to this [blogpost](https://julialang.org/blog/2022/04/simple-chains/). As for doing Scientific Machine Learning, how do we even begin with training neural ODEs with any generic deep learning library?
 
diff --git a/docs/src/examples/ode/second_order_adjoints.md b/docs/src/examples/ode/second_order_adjoints.md
@@ -13,20 +13,6 @@ optimization, while `KrylovTrustRegion` will utilize a Krylov-based method
 with Hessian-vector products (never forming the Hessian) for large parameter
 optimizations.
 
-!!! note
-    
-    The Adam (first-order) phase below uses Mooncake. The
-    `NewtonTrustRegion` (second-order) phase still uses Zygote because
-    Mooncake currently has no working forward-over-reverse path through
-    `SciMLSensitivity` + `OrdinaryDiffEq`: `SecondOrder(AutoMooncake(),
-    AutoMooncake())` raises a "reverse-over-reverse not supported" error
-    and `SecondOrder(AutoForwardDiff(), AutoMooncake())` is blocked on
-    Mooncake's `IEEEFloat`-only gradient interface (it rejects
-    `ForwardDiff.Dual` as the primal type). Tracking issue:
-    [chalk-lab/Mooncake.jl#1142](https://github.com/chalk-lab/Mooncake.jl/pull/1142)
-    is the first step in unblocking this. Once forward-over-Mooncake is
-    available end-to-end, this tutorial can be switched to Mooncake for
-    both phases.
 
 ```@example secondorderadjoints
 import SciMLSensitivity as SMS
@@ -99,17 +85,11 @@ callback = function (state, l; doplot = false)
     return l < 0.01
 end
 
-# First-order training: Mooncake gives the gradient through the
-# `SciMLSensitivity` adjoint chain.
 adtype1 = OPT.AutoMooncake(; config = Mooncake.Config(; friendly_tangents = true))
 optf1 = OPT.OptimizationFunction((x, p) -> loss_neuralode(x), adtype1)
 optprob1 = OPT.OptimizationProblem(optf1, ps)
 pstart = OPT.solve(optprob1, OPO.Adam(0.01); callback, maxiters = 100).u
 
-# Second-order training: NewtonTrustRegion needs a true Hessian, which
-# `OptimizationBase` assembles via `SecondOrder(AutoForwardDiff(),
-# AutoZygote())`. Mooncake cannot fill that role yet (see the note above),
-# so the Hessian phase keeps the Zygote VJP.
 adtype2 = OPT.AutoZygote()
 optf2 = OPT.OptimizationFunction((x, p) -> loss_neuralode(x), adtype2)
 optprob2 = OPT.OptimizationProblem(optf2, pstart)
diff --git a/docs/src/examples/optimal_control/feedback_control.md b/docs/src/examples/optimal_control/feedback_control.md
@@ -4,29 +4,6 @@ You can also mix a known differential equation and a neural differential
 equation, so that the parameters and the neural network are estimated
 simultaneously!
 
-!!! note
-    
-    This example still uses Zygote because the optimization variable here
-    nests an `u0` `ComponentVector` inside a larger `ComponentVector`
-    (`θ = CA.ComponentArray(; u0, p_all)`), and the SubArray-backed sub-CV
-    `θ.p_all` does **not** fully cover its parent (it spans positions
-    2:340 of the length-340 parent, skipping position 1 = `u0`).  As of
-    ComponentArrays v0.15.35 (SciML/ComponentArrays.jl#352) the Mooncake
-    extension explicitly errors on this case rather than silently
-    corrupting gradients:
-    
-    ```
-    ArgumentError: ComponentArraysMooncakeExt: cannot aggregate a cotangent
-    of length 339 into a SubArray-backed ComponentVector tangent whose
-    parent has length 340. This happens when a cotangent flows into a view
-    that does not fully cover its parent; there is no way to recover the
-    view indices from Mooncake fdata alone.
-    ```
-    
-    Fixing it requires teaching either the SciMLSensitivity adjoint rrule
-    to emit cotangents in the parent layout, or the ComponentArrays
-    Mooncake extension to track view indices through the SubArray.  Until
-    one of those lands, this tutorial keeps `OPT.AutoZygote()`.
 
 We will assume that we know the dynamics of the second equation
 (linear dynamics), and our goal is to find a neural network that is dependent
diff --git a/docs/src/examples/sde/SDE_control.md b/docs/src/examples/sde/SDE_control.md
@@ -18,15 +18,6 @@ to ultimately prepare and stabilize the qubit in the excited state.
 Before getting to the explanation, here's some code to start with. We will
 follow a full explanation of the definition and training process:
 
-!!! note
-    
-    This tutorial still uses Zygote because Mooncake's rule compiler
-    currently fails on `EnsembleProblem.__solve`
-    (`Mooncake.MooncakeRuleCompilationError` / stack overflow).
-    Once Mooncake supports `EnsembleProblem`, switch the AD frontend
-    to `OPT.AutoMooncake(; config = Mooncake.Config(; friendly_tangents = true))` and replace
-    `Zygote.@nograd CreateGrid` with
-    `Mooncake.@zero_adjoint Mooncake.DefaultCtx Tuple{typeof(CreateGrid), Any, Any}`.
 
 ```@example
 # load packages
diff --git a/docs/src/examples/sde/optimization_sde.md b/docs/src/examples/sde/optimization_sde.md
@@ -96,14 +96,6 @@ end
 
 We can then use `Optimization.solve` to fit the SDE.
 
-!!! note
-    
-    This example still uses Zygote because Mooncake's rule compiler
-    currently fails on `EnsembleProblem.__solve` (used inside `loss` to
-    estimate the SDE expectation), raising a
-    `Mooncake.MooncakeRuleCompilationError`.  Once Mooncake supports
-    `EnsembleProblem`, switch the AD frontend to
-    `OPT.AutoMooncake(; config = Mooncake.Config(; friendly_tangents = true))`.
 
 ```@example sde
 import Optimization as OPT, Zygote, OptimizationOptimisers as OPO
diff --git a/docs/src/tutorials/data_parallel.md b/docs/src/tutorials/data_parallel.md
@@ -89,14 +89,6 @@ interface.
 The following is a full copy-paste example for the multithreading.
 Distributed and GPU minibatching are described below.
 
-!!! note
-    
-    This tutorial still uses `AutoZygote` because Mooncake's rule
-    compiler currently fails on `EnsembleProblem`'s `__solve`
-    (`Mooncake.MooncakeRuleCompilationError` / stack overflow).
-    Once Mooncake supports `EnsembleProblem`, the recommended
-    AD frontend will switch to `OPT.AutoMooncake(; config = Mooncake.Config(; friendly_tangents = true))`
-    to match the rest of the tutorials.
 
 ```@example dataparallel
 import OrdinaryDiffEq as ODE