Commit 472c027

Rename extras to prep (#491)

* Rename extras to prep
* Typos
* Remove map fix

1 parent f518a6b · commit 472c027
64 files changed · 1665 additions & 1863 deletions

Large commit: some files are hidden by default, so only a subset of the 64 changed files is shown below.

DifferentiationInterface/docs/src/dev_guide.md

Lines changed: 4 additions & 4 deletions
@@ -23,10 +23,10 @@ Most operators have 4 variants, which look like this in the first order: `operat
 To implement a new operator for an existing backend, you need to write 5 methods: 1 for [preparation](@ref Preparation) and 4 corresponding to the variants of the operator (see above).
 For first-order operators, you may also want to support [in-place functions](@ref "Mutation and signatures"), which requires another 5 methods (defined on `f!` instead of `f`).

-The method `prepare_operator` must output an `extras` object of the correct type.
-For instance, `prepare_gradient(f, backend, x)` must return a [`DifferentiationInterface.GradientExtras`](@ref).
-Assuming you don't need any preparation for said operator, you can use the trivial extras that are already defined, like `DifferentiationInterface.NoGradientExtras`.
-Otherwise, define a custom struct like `MyGradientExtras <: DifferentiationInterface.GradientExtras` and put the necessary storage in there.
+The method `prepare_operator` must output a `prep` object of the correct type.
+For instance, `prepare_gradient(f, backend, x)` must return a [`DifferentiationInterface.GradientPrep`](@ref).
+Assuming you don't need any preparation for said operator, you can use the trivial prep that are already defined, like `DifferentiationInterface.NoGradientPrep`.
+Otherwise, define a custom struct like `MyGradientPrep <: DifferentiationInterface.GradientPrep` and put the necessary storage in there.

 ## New backend

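To make the dev guide's recipe above concrete, here is a minimal sketch of the preparation pattern for a hypothetical backend. The names `AutoMyBackend` and `MyGradientPrep`, and the crude finite-difference rule inside, are illustrative placeholders and not part of this commit or of any real backend extension.

```julia
# Minimal sketch, assuming a hypothetical backend `AutoMyBackend`.
import DifferentiationInterface as DI
using ADTypes: AbstractADType

struct AutoMyBackend <: AbstractADType end

struct MyGradientPrep{C} <: DI.GradientPrep
    cache::C  # storage reused across gradient calls
end

# 1 preparation method
DI.prepare_gradient(f, ::AutoMyBackend, x) = MyGradientPrep(similar(x))

# 4 operator variants, all reusing the prep object
function DI.gradient!(f, grad, prep::MyGradientPrep, ::AutoMyBackend, x)
    h = sqrt(eps(eltype(x)))
    fx = f(x)
    for i in eachindex(x)
        prep.cache .= x
        prep.cache[i] += h
        grad[i] = (f(prep.cache) - fx) / h  # forward finite difference (placeholder)
    end
    return grad
end

function DI.gradient(f, prep::MyGradientPrep, backend::AutoMyBackend, x)
    return DI.gradient!(f, similar(x), prep, backend, x)
end

function DI.value_and_gradient!(f, grad, prep::MyGradientPrep, backend::AutoMyBackend, x)
    return f(x), DI.gradient!(f, grad, prep, backend, x)
end

function DI.value_and_gradient(f, prep::MyGradientPrep, backend::AutoMyBackend, x)
    return f(x), DI.gradient(f, prep, backend, x)
end
```

With these five methods in place, the unprepared call `gradient(f, AutoMyBackend(), x)` should also work, since the generic entry point runs preparation internally before dispatching to the prepared variant.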
DifferentiationInterface/docs/src/explanation/operators.md

Lines changed: 6 additions & 6 deletions
@@ -107,28 +107,28 @@ In addition, the preparation syntax depends on the number of arguments accepted
 | out-of-place function | `prepare_op(f, backend, x, [t])` |
 | in-place function | `prepare_op(f!, y, backend, x, [t])` |

-Preparation creates an object called `extras` which contains the the necessary information to speed up an operator and its variants.
-The idea is that you prepare only once, which can be costly, but then call the operator several times while reusing the same `extras`.
+Preparation creates an object called `prep` which contains the the necessary information to speed up an operator and its variants.
+The idea is that you prepare only once, which can be costly, but then call the operator several times while reusing the same `prep`.

 ```julia
 op(f, backend, x, [t]) # slow because it includes preparation
-op(f, extras, backend, x, [t]) # fast because it skips preparation
+op(f, prep, backend, x, [t]) # fast because it skips preparation
 ```

 !!! warning
-    The `extras` object is the last argument before `backend` and it is always mutated, regardless of the bang `!` in the operator name.
+    The `prep` object is the last argument before `backend` and it is always mutated, regardless of the bang `!` in the operator name.

 ### Reusing preparation

 Deciding whether it is safe to reuse the results of preparation is not easy.
 Here are the general rules that we strive to implement:

-For different-point preparation, the output `extras` of `prepare_op(f, b, x, [t])` can be reused in `op(f, extras, b, other_x, [other_t])`, provided that:
+For different-point preparation, the output `prep` of `prepare_op(f, b, x, [t])` can be reused in `op(f, prep, b, other_x, [other_t])`, provided that:

 - the inputs `x` and `other_x` have similar types and equal shapes
 - the tangents in `t` and `other_t` have similar types and equal shapes

-For same-point preparation, the output `extras` of `prepare_op_same_point(f, b, x, [t])` can be reused in `op(f, extras, b, x, other_t)`, provided that:
+For same-point preparation, the output `prep` of `prepare_op_same_point(f, b, x, [t])` can be reused in `op(f, prep, b, x, other_t)`, provided that:

 - the input `x` remains the same
 - the tangents in `t` and `other_t` have similar types and equal shapes

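As a concrete illustration of the different-point rule above (not part of this diff), the following sketch prepares a gradient once and reuses the same `prep` on fresh inputs of identical type and shape. The backend choice `AutoForwardDiff` and the toy function are arbitrary, and ForwardDiff.jl is assumed to be installed.

```julia
# Sketch of the prepare-once / call-many pattern with different-point reuse.
using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff  # loads the ForwardDiff backend extension

f(x) = sum(abs2, x)
backend = AutoForwardDiff()

x = rand(10)
prep = prepare_gradient(f, backend, x)      # potentially costly, done once

other_x = rand(10)                          # same type and shape as x
g1 = gradient(f, prep, backend, other_x)    # fast: reuses the preparation

yet_another_x = rand(10)
g2 = gradient(f, prep, backend, yet_another_x)
```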
DifferentiationInterface/docs/src/tutorials/advanced.md

Lines changed: 6 additions & 6 deletions
@@ -45,8 +45,8 @@ gradient(f_multiarg, backend, x, Constant(10))
 Preparation also works in this case, even if the constant changes before execution:

 ```@example tuto_advanced
-extras_other_constant = prepare_gradient(f_multiarg, backend, x, Constant(-1))
-gradient(f_multiarg, extras_other_constant, backend, x, Constant(10))
+prep_other_constant = prepare_gradient(f_multiarg, backend, x, Constant(-1))
+gradient(f_multiarg, prep_other_constant, backend, x, Constant(10))
 ```

 ## Sparsity
@@ -120,15 +120,15 @@ The speedup becomes very visible in large dimensions.

 ```@example tuto_advanced
 n = 1000
-jac_extras_dense = prepare_jacobian(f_sparse_vector, dense_first_order_backend, zeros(n))
-jac_extras_sparse = prepare_jacobian(f_sparse_vector, sparse_first_order_backend, zeros(n))
+jac_prep_dense = prepare_jacobian(f_sparse_vector, dense_first_order_backend, zeros(n))
+jac_prep_sparse = prepare_jacobian(f_sparse_vector, sparse_first_order_backend, zeros(n))
 nothing # hide
 ```

 ```@example tuto_advanced
-@benchmark jacobian($f_sparse_vector, $jac_extras_dense, $dense_first_order_backend, $(randn(n)))
+@benchmark jacobian($f_sparse_vector, $jac_prep_dense, $dense_first_order_backend, $(randn(n)))
 ```

 ```@example tuto_advanced
-@benchmark jacobian($f_sparse_vector, $jac_extras_sparse, $sparse_first_order_backend, $(randn(n)))
+@benchmark jacobian($f_sparse_vector, $jac_prep_sparse, $sparse_first_order_backend, $(randn(n)))
 ```

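For readers following along outside the tutorial, here is a self-contained sketch of the constant-swapping pattern from the first hunk above. The definition of `f_multiarg` is a stand-in (the actual tutorial defines its own), and `AutoForwardDiff` is an arbitrary backend choice.

```julia
# Self-contained sketch: prep built with one Constant can be reused with another.
using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff  # loads the ForwardDiff backend extension

f_multiarg(x, c) = c * sum(abs2, x)  # stand-in for the tutorial's function
backend = AutoForwardDiff()
x = rand(5)

prep_other_constant = prepare_gradient(f_multiarg, backend, x, Constant(-1))
gradient(f_multiarg, prep_other_constant, backend, x, Constant(10))  # ≈ 20 .* x
```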
DifferentiationInterface/docs/src/tutorials/basic.md

Lines changed: 7 additions & 7 deletions
@@ -80,26 +80,26 @@ These objects can be reused between gradient computations, even on different inp
 We abstract away the preparation step behind a backend-agnostic syntax:

 ```@example tuto_basic
-extras = prepare_gradient(f, backend, zero(x))
+prep = prepare_gradient(f, backend, zero(x))
 ```

 You don't need to know what this object is, you just need to pass it to the gradient operator.
 Note that preparation does not depend on the actual components of the vector `x`, just on its type and size.
-You can thus reuse the `extras` for different values of the input.
+You can thus reuse the `prep` for different values of the input.

 ```@example tuto_basic
 grad = similar(x)
-gradient!(f, grad, extras, backend, x)
+gradient!(f, grad, prep, backend, x)
 grad # has been mutated
 ```

 Preparation makes the gradient computation much faster, and (in this case) allocation-free.

 ```@example tuto_basic
-@benchmark gradient!($f, $grad, $extras, $backend, $x)
+@benchmark gradient!($f, $grad, $prep, $backend, $x)
 ```

-Beware that the `extras` object is nearly always mutated by differentiation operators, even though it is given as the last positional argument.
+Beware that the `prep` object is nearly always mutated by differentiation operators, even though it is given as the last positional argument.

 ## Switching backends

@@ -121,9 +121,9 @@ gradient(f, backend2, x)
 And you can run the same benchmarks to see what you gained (although such a small input may not be realistic):

 ```@example tuto_basic
-extras2 = prepare_gradient(f, backend2, zero(x))
+prep2 = prepare_gradient(f, backend2, zero(x))

-@benchmark gradient!($f, $grad, $extras2, $backend2, $x)
+@benchmark gradient!($f, $grad, $prep2, $backend2, $x)
 ```

 In short, DifferentiationInterface.jl allows for easy testing and comparison of AD backends.

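To round off the tutorial changes, here is a sketch of the backend-comparison workflow end to end, with preparation reused for each backend. The function, input size, and the two backends are arbitrary choices; ForwardDiff.jl and ReverseDiff.jl are assumed to be installed.

```julia
# Sketch: comparing two backends on the same function, each with its own prep.
using DifferentiationInterface
using ADTypes: AutoForwardDiff, AutoReverseDiff
using BenchmarkTools
import ForwardDiff, ReverseDiff  # load the corresponding backend extensions

f(x) = sum(abs2, x)
x = rand(32)
grad = similar(x)

backend1 = AutoForwardDiff()
prep1 = prepare_gradient(f, backend1, zero(x))
@benchmark gradient!($f, $grad, $prep1, $backend1, $x)

backend2 = AutoReverseDiff()
prep2 = prepare_gradient(f, backend2, zero(x))
@benchmark gradient!($f, $grad, $prep2, $backend2, $x)
```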
DifferentiationInterface/ext/DifferentiationInterfaceChainRulesCoreExt/DifferentiationInterfaceChainRulesCoreExt.jl

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ using ChainRulesCore:
 using Compat
 import DifferentiationInterface as DI
 using DifferentiationInterface:
-    DifferentiateWith, NoPullbackExtras, NoPushforwardExtras, PullbackExtras, Tangents
+    DifferentiateWith, NoPullbackPrep, NoPushforwardPrep, PullbackPrep, Tangents

 ruleconfig(backend::AutoChainRules) = backend.ruleconfig

DifferentiationInterface/ext/DifferentiationInterfaceChainRulesCoreExt/differentiate_with.jl

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 function ChainRulesCore.rrule(dw::DifferentiateWith, x)
     @compat (; f, backend) = dw
     y = f(x)
-    extras_same = DI.prepare_pullback_same_point(f, backend, x, DI.Tangents(y))
+    prep_same = DI.prepare_pullback_same_point(f, backend, x, DI.Tangents(y))
     function pullbackfunc(dy)
-        tx = DI.pullback(f, extras_same, backend, x, DI.Tangents(dy))
+        tx = DI.pullback(f, prep_same, backend, x, DI.Tangents(dy))
         return (NoTangent(), only(tx))
     end
     return y, pullbackfunc

DifferentiationInterface/ext/DifferentiationInterfaceChainRulesCoreExt/reverse_onearg.jl

Lines changed: 9 additions & 9 deletions
@@ -1,22 +1,22 @@
 ## Pullback

-struct ChainRulesPullbackExtrasSamePoint{Y,PB} <: PullbackExtras
+struct ChainRulesPullbackPrepSamePoint{Y,PB} <: PullbackPrep
     y::Y
     pb::PB
 end

-DI.prepare_pullback(f, ::AutoReverseChainRules, x, ty::Tangents) = NoPullbackExtras()
+DI.prepare_pullback(f, ::AutoReverseChainRules, x, ty::Tangents) = NoPullbackPrep()

 function DI.prepare_pullback_same_point(
-    f, ::NoPullbackExtras, backend::AutoReverseChainRules, x, ty::Tangents
+    f, ::NoPullbackPrep, backend::AutoReverseChainRules, x, ty::Tangents
 )
     rc = ruleconfig(backend)
     y, pb = rrule_via_ad(rc, f, x)
-    return ChainRulesPullbackExtrasSamePoint(y, pb)
+    return ChainRulesPullbackPrepSamePoint(y, pb)
 end

 function DI.value_and_pullback(
-    f, ::NoPullbackExtras, backend::AutoReverseChainRules, x, ty::Tangents
+    f, ::NoPullbackPrep, backend::AutoReverseChainRules, x, ty::Tangents
 )
     rc = ruleconfig(backend)
     y, pb = rrule_via_ad(rc, f, x)
@@ -27,19 +27,19 @@ function DI.value_and_pullback(
 end

 function DI.value_and_pullback(
-    f, extras::ChainRulesPullbackExtrasSamePoint, ::AutoReverseChainRules, x, ty::Tangents
+    f, prep::ChainRulesPullbackPrepSamePoint, ::AutoReverseChainRules, x, ty::Tangents
 )
-    @compat (; y, pb) = extras
+    @compat (; y, pb) = prep
     tx = map(ty) do dy
         last(pb(dy))
     end
     return copy(y), tx
 end

 function DI.pullback(
-    f, extras::ChainRulesPullbackExtrasSamePoint, ::AutoReverseChainRules, x, ty::Tangents
+    f, prep::ChainRulesPullbackPrepSamePoint, ::AutoReverseChainRules, x, ty::Tangents
 )
-    @compat (; pb) = extras
+    @compat (; pb) = prep
     tx = map(ty) do dy
         last(pb(dy))
     end

DifferentiationInterface/ext/DifferentiationInterfaceDiffractorExt/DifferentiationInterfaceDiffractorExt.jl

Lines changed: 5 additions & 5 deletions
@@ -2,7 +2,7 @@ module DifferentiationInterfaceDiffractorExt

 using ADTypes: ADTypes, AutoDiffractor
 import DifferentiationInterface as DI
-using DifferentiationInterface: NoPushforwardExtras, Tangents
+using DifferentiationInterface: NoPushforwardPrep, Tangents
 using Diffractor: DiffractorRuleConfig, TaylorTangentIndex, ZeroBundle, bundle, ∂☆

 DI.check_available(::AutoDiffractor) = true
@@ -11,9 +11,9 @@ DI.pullback_performance(::AutoDiffractor) = DI.PullbackSlow()

 ## Pushforward

-DI.prepare_pushforward(f, ::AutoDiffractor, x, tx::Tangents) = NoPushforwardExtras()
+DI.prepare_pushforward(f, ::AutoDiffractor, x, tx::Tangents) = NoPushforwardPrep()

-function DI.pushforward(f, ::NoPushforwardExtras, ::AutoDiffractor, x, tx::Tangents)
+function DI.pushforward(f, ::NoPushforwardPrep, ::AutoDiffractor, x, tx::Tangents)
     ty = map(tx) do dx
         # code copied from Diffractor.jl
         z = ∂☆{1}()(ZeroBundle{1}(f), bundle(x, dx))
@@ -23,9 +23,9 @@ function DI.pushforward(f, ::NoPushforwardExtras, ::AutoDiffractor, x, tx::Tange
 end

 function DI.value_and_pushforward(
-    f, extras::NoPushforwardExtras, backend::AutoDiffractor, x, tx::Tangents
+    f, prep::NoPushforwardPrep, backend::AutoDiffractor, x, tx::Tangents
 )
-    return f(x), DI.pushforward(f, extras, backend, x, tx)
+    return f(x), DI.pushforward(f, prep, backend, x, tx)
 end

 end

DifferentiationInterface/ext/DifferentiationInterfaceEnzymeExt/DifferentiationInterfaceEnzymeExt.jl

Lines changed: 12 additions & 12 deletions
@@ -5,18 +5,18 @@ using Base: Fix1
 import DifferentiationInterface as DI
 using DifferentiationInterface:
     Context,
-    DerivativeExtras,
-    GradientExtras,
-    JacobianExtras,
-    HVPExtras,
-    PullbackExtras,
-    PushforwardExtras,
-    NoDerivativeExtras,
-    NoGradientExtras,
-    NoHVPExtras,
-    NoJacobianExtras,
-    NoPullbackExtras,
-    NoPushforwardExtras,
+    DerivativePrep,
+    GradientPrep,
+    JacobianPrep,
+    HVPPrep,
+    PullbackPrep,
+    PushforwardPrep,
+    NoDerivativePrep,
+    NoGradientPrep,
+    NoHVPPrep,
+    NoJacobianPrep,
+    NoPullbackPrep,
+    NoPushforwardPrep,
     Tangents,
     pick_batchsize
 using Enzyme:
